Chapter3 PDF

This document discusses various techniques for visualizing quantitative data using Matplotlib, a Python data visualization library. It begins by showing how to create basic bar charts of Olympic medal counts by country. It then demonstrates how to customize bar charts by rotating labels, stacking bars, and adding legends. Next, it covers histograms and how to customize them by setting bin numbers, boundaries, and transparency. The document also explores error bars, boxplots, and scatter plots, demonstrating how to encode additional variables through color and customize scatter plots. The overall content is an introduction to quantitative data visualization concepts and best practices using Matplotlib.

Uploaded by

Hana Banana

Quantitative comparisons: bar-charts
INTRODUCTION TO DATA VISUALIZATION WITH MATPLOTLIB

Ariel Rokem
Data Scientist
Olympic medals
,Gold, Silver, Bronze
United States, 137, 52, 67
Germany, 47, 43, 67
Great Britain, 64, 55, 26
Russia, 50, 28, 35
China, 44, 30, 35
France, 20, 55, 21
Australia, 23, 34, 25
Italy, 8, 38, 24
Canada, 4, 4, 61
Japan, 17, 13, 34

Olympic medals: visualizing the data
import pandas as pd
import matplotlib.pyplot as plt

medals = pd.read_csv('medals_by_country_2016.csv', index_col=0)

fig, ax = plt.subplots()

ax.bar(medals.index, medals["Gold"])
plt.show()
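
The code on this slide reads `medals_by_country_2016.csv`, which is not reproduced here. As a minimal sketch, assuming the table on the "Olympic medals" slide is the whole file, the same DataFrame can be built from an inline string. The use of `io.StringIO` and the headless `Agg` backend are assumptions made so the example runs anywhere; the spaces after the commas in the slide table are dropped so the column names parse as `Silver` and `Bronze` without leading whitespace.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, so no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Values copied from the "Olympic medals" slide; the string stands in for the CSV file.
csv_text = """,Gold,Silver,Bronze
United States,137,52,67
Germany,47,43,67
Great Britain,64,55,26
Russia,50,28,35
China,44,30,35
France,20,55,21
Australia,23,34,25
Italy,8,38,24
Canada,4,4,61
Japan,17,13,34
"""
# index_col=0 turns the unnamed first column (country names) into the index.
medals = pd.read_csv(io.StringIO(csv_text), index_col=0)

fig, ax = plt.subplots()
bars = ax.bar(medals.index, medals["Gold"])  # one bar per country
ax.set_ylabel("Number of gold medals")
```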

Interlude: rotate the tick labels
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")

plt.show()
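
An alternative worth knowing, offered here as a sketch rather than as the course's method: `ax.tick_params` can rotate the tick labels that already exist, so the label text does not have to be passed a second time. The three countries and their gold counts are taken from the medal table; the `Agg` backend is an assumption for headless runs.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt

countries = ["United States", "Germany", "Great Britain"]
gold = [137, 47, 64]  # gold medal counts from the slide's table

fig, ax = plt.subplots()
ax.bar(countries, gold)

# Rotate the x tick labels in place; the label text is left untouched.
ax.tick_params(axis="x", labelrotation=90)
ax.set_ylabel("Number of medals")

fig.canvas.draw()  # force a render so the tick labels are fully populated
```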

Olympic medals: visualizing the other medals
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])

ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"])

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")
plt.show()

Olympic medals: visualizing all three
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])

ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"])

ax.bar(medals.index, medals["Bronze"],
       bottom=medals["Gold"] + medals["Silver"])

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")
plt.show()
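
Each layer in the stack starts where the previous layers end, so `bottom` is simply a running total. A minimal sketch of that idea as a loop, using the first three rows of the medal table; the loop, the `bottom` accumulator, and the `Agg` backend are illustrative choices and do not appear on the slides.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd

# First three rows of the slide's medal table, enough to show the stacking.
medals = pd.DataFrame(
    {"Gold": [137, 47, 64], "Silver": [52, 43, 55], "Bronze": [67, 67, 26]},
    index=["United States", "Germany", "Great Britain"],
)

fig, ax = plt.subplots()
bottom = pd.Series(0, index=medals.index)  # running height of each country's stack
for column in ["Gold", "Silver", "Bronze"]:
    ax.bar(medals.index, medals[column], bottom=bottom)
    bottom = bottom + medals[column]  # the next layer starts where this one ends

ax.set_ylabel("Number of medals")
```

After the loop, `bottom` holds each country's total medal count, which is also the height of the finished stack.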

Stacked bar chart

Adding a legend
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"])
ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"])
ax.bar(medals.index, medals["Bronze"],
       bottom=medals["Gold"] + medals["Silver"])

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")

Adding a legend
fig, ax = plt.subplots()
ax.bar(medals.index, medals["Gold"], label="Gold")
ax.bar(medals.index, medals["Silver"], bottom=medals["Gold"],
       label="Silver")
ax.bar(medals.index, medals["Bronze"],
       bottom=medals["Gold"] + medals["Silver"],
       label="Bronze")

ax.set_xticklabels(medals.index, rotation=90)
ax.set_ylabel("Number of medals")

ax.legend()
plt.show()

Stacked bar chart with legend

Create a bar chart!

Quantitative comparisons: histograms

Ariel Rokem
Data Scientist
Histograms

A bar chart again
fig, ax = plt.subplots()

ax.bar("Rowing", mens_rowing["Height"].mean())

ax.bar("Gymnastics", mens_gymnastics["Height"].mean())

ax.set_ylabel("Height (cm)")
plt.show()

Introducing histograms
fig, ax = plt.subplots()

ax.hist(mens_rowing["Height"])

ax.hist(mens_gymnastics["Height"])

ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
plt.show()

Labels are needed
ax.hist(mens_rowing["Height"], label="Rowing")
ax.hist(mens_gymnastics["Height"], label="Gymnastics")
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")

ax.legend()
plt.show()

Customizing histograms: setting the number of bins
ax.hist(mens_rowing["Height"], label="Rowing", bins=5)
ax.hist(mens_gymnastics["Height"], label="Gymnastics", bins=5)
ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()

Customizing histograms: setting bin boundaries
ax.hist(mens_rowing["Height"], label="Rowing",
        bins=[150, 160, 170, 180, 190, 200, 210])

ax.hist(mens_gymnastics["Height"], label="Gymnastics",
        bins=[150, 160, 170, 180, 190, 200, 210])

ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()
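
The `mens_rowing` and `mens_gymnastics` tables are not included in these slides. The following sketch uses randomly generated heights as hypothetical stand-ins (not real athlete data) to show what passing a list of bin edges does: both histograms are counted over the same intervals, and `ax.hist` returns the per-bin counts along with the edges it used. The `alpha` argument is one way to make overlapping bars see-through and is an addition here, as is the `Agg` backend.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical heights in cm, generated only for illustration.
rng = np.random.default_rng(0)
rowing_heights = rng.normal(190, 7, size=80)
gymnastics_heights = rng.normal(168, 6, size=40)

bin_edges = [150, 160, 170, 180, 190, 200, 210]  # same edges as on the slide

fig, ax = plt.subplots()
# ax.hist returns (counts per bin, bin edges, drawn patches).
rowing_counts, edges, _ = ax.hist(rowing_heights, bins=bin_edges, label="Rowing")
gym_counts, _, _ = ax.hist(gymnastics_heights, bins=bin_edges,
                           label="Gymnastics", alpha=0.5)

ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
```

Seven edges define six bins. Values outside the 150 to 210 range are not counted, so the counts may sum to less than the sample size.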

Customizing histograms: transparency
ax.hist(mens_rowing["Height"], label="Rowing",
        bins=[150, 160, 170, 180, 190, 200, 210],
        histtype="step")

ax.hist(mens_gymnastics["Height"], label="Gymnastics",
        bins=[150, 160, 170, 180, 190, 200, 210],
        histtype="step")

ax.set_xlabel("Height (cm)")
ax.set_ylabel("# of observations")
ax.legend()
plt.show()

Histogram with a histtype of step

Create your own histogram!

Statistical plotting

Ariel Rokem
Data Scientist
Adding error bars to bar charts
fig, ax = plt.subplots()

ax.bar("Rowing",
       mens_rowing["Height"].mean(),
       yerr=mens_rowing["Height"].std())

ax.bar("Gymnastics",
       mens_gymnastics["Height"].mean(),
       yerr=mens_gymnastics["Height"].std())

ax.set_ylabel("Height (cm)")

plt.show()
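
A runnable sketch of the same idea with hypothetical heights (randomly generated, not the course data): the bar height is the sample mean and `yerr` draws a line one standard deviation above and below it. One detail to watch when switching from pandas to NumPy: a pandas `.std()` defaults to the sample standard deviation (`ddof=1`), whereas NumPy's `.std()` defaults to `ddof=0`, so the sketch sets `ddof=1` explicitly to match the behaviour of the slide's code.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical heights in cm, generated only for illustration.
rng = np.random.default_rng(1)
heights = {
    "Rowing": rng.normal(190, 7, size=80),
    "Gymnastics": rng.normal(168, 6, size=40),
}

fig, ax = plt.subplots()
for sport, values in heights.items():
    # Bar height: sample mean. Error bar: one sample standard deviation each way.
    ax.bar(sport, values.mean(), yerr=values.std(ddof=1))

ax.set_ylabel("Height (cm)")
```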

Error bars in a bar chart

Adding error bars to plots
fig, ax = plt.subplots()

ax.errorbar(seattle_weather["MONTH"],
            seattle_weather["MLY-TAVG-NORMAL"],
            yerr=seattle_weather["MLY-TAVG-STDDEV"])

ax.errorbar(austin_weather["MONTH"],
            austin_weather["MLY-TAVG-NORMAL"],
            yerr=austin_weather["MLY-TAVG-STDDEV"])

ax.set_ylabel("Temperature (Fahrenheit)")

plt.show()
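
The weather tables are not part of these slides, so this sketch substitutes made-up monthly values (a sine curve and a constant spread, not real climate normals) to show the shape of the call: `ax.errorbar` takes x values, y values, and a `yerr` sequence of the same length, and draws a line with a vertical bar at every point. The variable names and the `Agg` backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
# Hypothetical monthly averages and standard deviations, in Fahrenheit.
avg_temp = 50 + 15 * np.sin((months - 4) / 12 * 2 * np.pi)
temp_std = np.full(12, 2.5)

fig, ax = plt.subplots()
# The returned container holds the data line, the caps, and the bar segments.
container = ax.errorbar(months, avg_temp, yerr=temp_std, label="Hypothetical city")

ax.set_xlabel("Month")
ax.set_ylabel("Temperature (Fahrenheit)")
ax.legend()
```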

Error bars in plots

Adding boxplots
fig, ax = plt.subplots()

ax.boxplot([mens_rowing["Height"],
            mens_gymnastics["Height"]])

ax.set_xticklabels(["Rowing", "Gymnastics"])
ax.set_ylabel("Height (cm)")

plt.show()
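
To connect the drawing to the numbers behind it, the sketch below builds a boxplot from hypothetical heights (randomly generated, not the course data) and compares two of its elements with values computed directly in NumPy. With Matplotlib's defaults, the line inside each box is the median, the box spans the 25th to 75th percentiles, and the whiskers reach to the most extreme points within 1.5 times the interquartile range of the box edges. The variable names and the `Agg` backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical heights in cm, generated only for illustration.
rng = np.random.default_rng(2)
rowing = rng.normal(190, 7, size=80)
gymnastics = rng.normal(168, 6, size=40)

fig, ax = plt.subplots()
# boxplot returns a dict of the drawn lines: "boxes", "medians", "whiskers", ...
result = ax.boxplot([rowing, gymnastics])
ax.set_xticklabels(["Rowing", "Gymnastics"])
ax.set_ylabel("Height (cm)")

# Read values back from the first box and compute the same statistics by hand.
median_line_y = result["medians"][0].get_ydata()[0]
box_bottom = min(result["boxes"][0].get_ydata())
q1, q3 = np.percentile(rowing, [25, 75])
```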

Interpreting boxplots

Try it yourself!

Quantitative comparisons: scatter plots

Ariel Rokem
Data Scientist
Introducing scatter plots
fig, ax = plt.subplots()

ax.scatter(climate_change["co2"], climate_change["relative_temp"])

ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
plt.show()

Customizing scatter plots
eighties = climate_change["1980-01-01":"1989-12-31"]
nineties = climate_change["1990-01-01":"1999-12-31"]

fig, ax = plt.subplots()

ax.scatter(eighties["co2"], eighties["relative_temp"],
           color="red", label="eighties")

ax.scatter(nineties["co2"], nineties["relative_temp"],
           color="blue", label="nineties")

ax.legend()

ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")

plt.show()
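
Slicing with date strings, as in the first two lines of this slide, only works when the DataFrame's index holds dates (a `DatetimeIndex`). The slides do not show how `climate_change` was loaded, so this sketch constructs a hypothetical table with a monthly date index and made-up values, then selects the two decades. It uses `.loc`, the explicit form of label-based row slicing; the `Agg` backend is also an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical monthly data; only the DatetimeIndex matters for the slicing.
dates = pd.date_range("1980-01-01", "1999-12-01", freq="MS")
climate_change = pd.DataFrame(
    {
        "co2": np.linspace(338, 368, len(dates)),
        "relative_temp": np.linspace(0.2, 0.6, len(dates)),
    },
    index=dates,
)

# Date strings are matched against the index; both endpoints are included.
eighties = climate_change.loc["1980-01-01":"1989-12-31"]
nineties = climate_change.loc["1990-01-01":"1999-12-31"]

fig, ax = plt.subplots()
ax.scatter(eighties["co2"], eighties["relative_temp"], color="red", label="eighties")
ax.scatter(nineties["co2"], nineties["relative_temp"], color="blue", label="nineties")
ax.legend()
```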

Encoding a comparison by color

Encoding a third variable by color
fig, ax = plt.subplots()

ax.scatter(climate_change["co2"], climate_change["relative_temp"],
           c=climate_change.index)

ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
plt.show()
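
A variant of the same encoding, sketched under assumptions: the dates are first converted to a fractional year so that the values passed to `c` are plain numbers, and a colorbar is added so the reader can tell which color corresponds to which year. The data are hypothetical, and the `year` variable, the colorbar, and the `Agg` backend do not appear on the slide.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical monthly data with a DatetimeIndex.
dates = pd.date_range("1980-01-01", periods=240, freq="MS")
climate_change = pd.DataFrame(
    {
        "co2": np.linspace(338, 368, len(dates)),
        "relative_temp": np.linspace(0.2, 0.6, len(dates)),
    },
    index=dates,
)

# Fractional year (January 1980 -> 1980.0) gives one number per point.
year = (climate_change.index.year + (climate_change.index.month - 1) / 12).to_numpy()

fig, ax = plt.subplots()
points = ax.scatter(climate_change["co2"], climate_change["relative_temp"], c=year)
fig.colorbar(points, ax=ax, label="Year")  # legend for the color scale

ax.set_xlabel("CO2 (ppm)")
ax.set_ylabel("Relative temperature (Celsius)")
```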

Encoding time in color

Practice making your own scatter plots!