Video Notes Unit 2

The document provides an overview of categorical and quantitative data in AP Statistics, detailing definitions, examples, and methods for tallying and graphing data. It discusses the importance of measures of center and spread, including mean, median, mode, range, IQR, and standard deviation, as well as how to interpret the shape of data distributions. Additionally, it covers various graphical representations such as pie charts, bar graphs, dot plots, stemplots, and histograms, emphasizing the need for appropriate labeling and context.

Uploaded by

jasmine04262007
Copyright
© All Rights Reserved
AP Statistics

Unit 2
Exploring Data
Video Notes

Name: _________________________________

Chapter 1: [Video #1] – Categorical Data

Two types of data: Categorical and Quantitative.

Def: Categorical Data: A variable that places an individual into one of several groups or categories.

• Examples: Gender, eye color, Likert scale, car model, zip code, area code, favorite music

Two ways of tallying categorical data: by Frequency (Counts: actual numbers)

or by Percents (convert counts to %'s)

Graphing ONE categorical variable:


- Pie Chart:
1. Calculate what percent of the whole each category is (if it’s not given to you!).
2. Multiply the percent by 360° to determine what portion of a circle the category should be.
3. Draw the whole “pie” & separate the “slices” into the appropriate sizes.
4. Don’t forget appropriate labels to identify the context of the graph!
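
The pie chart arithmetic in steps 1 and 2 can be sketched in a few lines of Python. This is only an illustration: the function name `pie_slice_angles` and the eye color counts are made up for the example and do not come from the notes.

```python
def pie_slice_angles(counts):
    """Return {category: (percent, degrees)} for a dict of category counts."""
    total = sum(counts.values())
    slices = {}
    for category, count in counts.items():
        fraction = count / total              # step 1: share of the whole
        slices[category] = (
            round(100 * fraction, 1),         # percent of the whole
            round(360 * fraction, 1),         # step 2: portion of the circle in degrees
        )
    return slices


# Hypothetical class data (assumption, not from the notes)
eye_colors = {"brown": 10, "blue": 6, "green": 4}
for category, (percent, degrees) in pie_slice_angles(eye_colors).items():
    print(f"{category}: {percent}% -> {degrees} degree slice")
```

The slice angles should always add up to 360 (apart from rounding), which is a quick check before drawing.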

- Bar Graph:
1. Label your axes and title your graph.
2. Scale your axes. Use the counts in each category to help you scale your vertical axis.
Write the category names at equally spaced intervals beneath the horizontal axis.
3. Draw a vertical bar above each category name to the height that corresponds to the
count in that category. Optional: put each bar on [illegible]

4. Leave s p a c e s between the bars!

Basic info about TWO categorical variables:

Typically organized using a two-way table. One variable is broken down vertically in the table
while the second variable is broken down horizontally in the table.

All of the sub-totals for one variable broken down into all of its sub-parts are called the
Marginal distribution. The individual sub-totals for one variable are called
Conditional distributions.
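
As a rough illustration, the sketch below computes both kinds of distribution from a small two-way table. It uses the usual textbook reading (marginal = the totals for one variable as percents of the grand total, conditional = the percents within one category of that variable). The table values and function names are hypothetical.

```python
# Hypothetical two-way table: rows = grade level, columns = preferred pet
table = {
    "Junior": {"Dog": 12, "Cat": 8},
    "Senior": {"Dog": 9, "Cat": 11},
}


def marginal_distribution(table):
    """Row totals expressed as percents of the grand total."""
    grand_total = sum(sum(row.values()) for row in table.values())
    return {
        name: round(100 * sum(row.values()) / grand_total, 1)
        for name, row in table.items()
    }


def conditional_distribution(table, row_name):
    """Percents of the column variable within a single row."""
    row = table[row_name]
    row_total = sum(row.values())
    return {col: round(100 * count / row_total, 1) for col, count in row.items()}


print(marginal_distribution(table))
print(conditional_distribution(table, "Junior"))
print(conditional_distribution(table, "Senior"))
```

Comparing the conditional distributions row by row is what a segmented bar graph with relative frequencies shows visually.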

Graphing TWO categorical variables:

• Side-by-Side Bar Graphs


1. A main variable is displayed with the other variable broken
down with its own bars within the main variable’s options
(medals by country or countries by medal).
2. It can be displayed with frequencies or relative frequencies.
3. Appropriately label both vertical and horizontal axes in context!!!

• Segmented Bar Graphs

1. A main variable is displayed with the other variable broken
down within the singular bars of the main variable’s options.
2. It can be displayed with frequencies or relative frequencies.
3. Appropriately label both vertical and horizontal axes in context!

************************************************************************************************************

Workspace for your conversions:


27/67 = 40.3%     26/70 = 37.1%

23/67 = 34.3%     18/70 = 25.7%

17/67 = 25.4%     26/70 = 37.1%
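
The count-to-percent conversions in the workspace can be checked with a short Python sketch. It assumes the handwritten counts are 27, 23, 17 (out of 67) and 26, 18, 26 (out of 70). The helper name `to_percents` is made up for this example.

```python
def to_percents(counts):
    """Convert a list of counts to percents of their total, rounded to 0.1%."""
    total = sum(counts)
    return [round(100 * count / total, 1) for count in counts]


print(to_percents([27, 23, 17]))  # [40.3, 34.3, 25.4]
print(to_percents([26, 18, 26]))  # [37.1, 25.7, 37.1]
```

Because each percent is rounded separately, a column can add up to 99.9% or 100.1% instead of exactly 100%.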

Chapter 1: [Video #2] – Describing Quantitative Data: Part #1

Two types of data: Categorical Data and Quantitative Data.

Def: Quantitative Data: A variable that takes numerical values for which it makes sense to calculate an average.

• Examples: Calories, Age, Likert Scale, IQ, Height, Weight, Hours Slept

Def: Distribution of a variable: The values a variable takes and how often it takes those values.

We will describe the distribution of a set of quantitative data in four special ways. We will discuss the “SOCS” of the distribution.

S
O
Center
S

The Center is what we have called the average or typical our whole lives.

There are three common measures of Center for any distribution of data:

• Mean: Sum of all values / # of values in set

• Median: the middle value when the data is arranged in order

• Mode: the most frequent value in the data set
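
A minimal sketch of the three measures of center, using Python's standard `statistics` module. The quiz scores are hypothetical and were chosen so that each answer is easy to verify by hand.

```python
from statistics import mean, median, mode

scores = [70, 75, 75, 80, 100]  # hypothetical quiz scores, already in order

print("mean:", mean(scores))      # (70 + 75 + 75 + 80 + 100) / 5
print("median:", median(scores))  # middle value of the ordered list
print("mode:", mode(scores))      # most frequent value
```

Here the mean (80) sits above the median (75) because the single score of 100 pulls it upward.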

S
O
C
Spread

The spread is a way to describe how much variation (or variability) exists in our data.

There are three common measures of spread for any distribution of data:

• Range: maximum value - minimum value (covers all 100% of the data set)

• IQR: Interquartile range: Third Quartile - First Quartile, Q3 - Q1, covers the middle 50% of the data set

• Standard Deviation: Average amount each value differs from the mean

$s_x = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
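
The sample standard deviation (the version that divides by n - 1) can be written out step by step and compared with the library function. This is a sketch on hypothetical data. `sample_standard_deviation` is an illustrative name, while `statistics.stdev` is the standard library's sample standard deviation.

```python
from math import sqrt
from statistics import stdev


def sample_standard_deviation(values):
    """Square root of the sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]
    return sqrt(sum(squared_deviations) / (n - 1))


data = [2, 4, 4, 6, 9]                # hypothetical data, mean = 5
data_range = max(data) - min(data)    # maximum - minimum
s_x = sample_standard_deviation(data)

print("range:", data_range)
print("s_x by hand:", round(s_x, 4))
print("s_x from statistics.stdev:", round(stdev(data), 4))
```

For this data the squared deviations are 9, 1, 1, 1 and 16, so the result is the square root of 28 / 4 = 7.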

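[Worked example; data display not reproduced. Handwritten work: x̄ = 79.8, Sx = 13.3, plus the values 46, 79.5, 100, 54, and 20 (labels not legible).]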

*************************************************************************************************************

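[Worked example; data display not reproduced. Handwritten work: x̄ = 1227.5, Sx = 116.3, plus the values 480, 1220, and 135 (labels not legible).]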

Chapter 1: [Video #3] – Describing Quantitative Data: Part #2

Shape
O
Center
Spread

There are seven common ways to interpret the Shape of a set of data:

1. Approximately normal: unimodal, bell-shaped

2. Skewed right: tail = right side

3. Skewed left: tail = left side

4. (Roughly) uniform: relatively flat

5. Bimodal: two peaks

6. Multimodal: 2+ peaks

7. Symmetric: Left and right sides are the same when folded in half

Draw and label examples of each shape from above in the box below:

[Handwritten sketches not reproduced.]

Shape
Outliers
Center
Spread

Def: Outlier: A value that "lies outside" (is much smaller or larger than) most of the other values in a set of data.

The “1.5IQR Rule” for determining outliers: A value is an outlier if it falls outside the following range of values:

Write out the mathematical formula in the box below:

$[\,Q_1 - 1.5 \cdot \mathrm{IQR},\; Q_3 + 1.5 \cdot \mathrm{IQR}\,]$, where $\mathrm{IQR} = Q_3 - Q_1$

Outlier check for Group 1:

[0.483 - 1.5(3.073 - 0.483), 3.073 + 1.5(3.073 - 0.483)]

[-3.402, 6.958]

None

*************************************************************************************************************

Outlier check for Group 2:

[1.105 - 1.5(2.540 - 1.105), 2.540 + 1.5(2.540 - 1.105)]

[-1.0475, 4.6925]

Upper outlier
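
The two outlier checks can be reproduced with a short Python sketch. The quartiles are the ones written in the checks for Group 1 and Group 2 (taking Q1 = 1.105 for Group 2). The function names and the sample values passed to `find_outliers` are hypothetical, since the raw data for each group is not shown in the notes.

```python
def outlier_fences(q1, q3):
    """Return (lower fence, upper fence) under the 1.5 * IQR rule."""
    iqr = q3 - q1
    return (q1 - 1.5 * iqr, q3 + 1.5 * iqr)


def find_outliers(values, q1, q3):
    """Return the values that fall outside the fences."""
    low, high = outlier_fences(q1, q3)
    return [x for x in values if x < low or x > high]


low, high = outlier_fences(0.483, 3.073)      # Group 1 quartiles
print(round(low, 4), round(high, 4))          # -3.402 6.958

low, high = outlier_fences(1.105, 2.540)      # Group 2 quartiles
print(round(low, 4), round(high, 4))          # -1.0475 4.6925

# Hypothetical values, only to show how a point gets flagged
print(find_outliers([1.2, 2.0, 5.0], 1.105, 2.540))
```

Any value above the upper fence is an upper outlier and any value below the lower fence is a lower outlier.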

Chapter 1: [Video #4] – Describing Quantitative Data: Part #3

We have discussed the individual SOCS, but now we see how they interact with each other.

If the shape is skewed left or right, then use median as the measure of center.

If the shape is approx. normal, then use mean as the measure of center.

If the shape is symmetric (not approx. normal), then use mean or median.

Why does shape affect center?


The mean is less resistant to the effects of extreme values and/or outliers than the median.

If the shape is skewed (in either direction), then the mean will ALWAYS be pulled towards the tail of the distribution and become less reliable as the measure of center.

The mean will start to UNDERESTIMATE or OVERESTIMATE the true center of the distribution!
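
A small Python sketch makes the resistance idea concrete. Both data sets are hypothetical: the second is the first with its largest value replaced by an extreme one.

```python
from statistics import mean, median

symmetric = [2, 3, 4, 5, 6]       # hypothetical data
skewed_right = [2, 3, 4, 5, 36]   # same data with one extreme high value

for name, data in [("symmetric", symmetric), ("skewed right", skewed_right)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

The median stays at 4 in both cases, while the mean jumps from 4 to 10 and overestimates the typical value in the skewed set.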

Which measures of center go with which measure of spread?

Mean will typically be paired with standard deviation.

Median will be paired with range IF no outliers are present.

Median will be paired with IQR IF outliers are present.

To “describe a distribution”, we will discuss its SOCS but the data must be what kind of data???

quantitative data

Describe the distribution of the number of books in backpacks.

The distribution of the number of books in backpacks is skewed left with a median of 3.5 and an IQR of 1.5. We have one outlier at zero.

*************************************************************************************************************
Describe the distribution of the number of flights of stairs climbed per day.

The distribution of the number of flights of stairs climbed per day has an approximately normal shape, the mean is 3.96, no outliers appear to be present, and the range is 8.

Chapter 1: [Video #5] – Displaying Quantitative Data Graphically: Part 1

• Dotplot: graphs one quantitative variable by using… wait for it… DOTS!!!

o Be sure to label your dotplot appropriately and add context!

• Stemplot: similar to a dotplot, but the actual data values are used instead of dots.

o Leaves must contain the same number of digits.

o Do NOT skip stems where no data exist!!!

o NEVER forget to include the key to know how to read the graph’s numbers!

o You could make a stemplot rotated 90° like the dotplot, but the way shown is more common.
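
The stemplot rules above (one digit per leaf, no skipped stems, always a key) can be sketched in Python. This is a simplified illustration that assumes non-negative whole numbers with the tens as the stem and the ones as the leaf. The function name and the scores are hypothetical.

```python
from collections import defaultdict


def stemplot(values):
    """Return the lines of a stemplot (stem = tens, leaf = ones digit)."""
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // 10].append(value % 10)

    lines = []
    # Walk every stem from smallest to largest so empty stems are not skipped
    for stem in range(min(leaves), max(leaves) + 1):
        leaf_text = "".join(str(leaf) for leaf in leaves.get(stem, []))
        lines.append(f"{stem} | {leaf_text}".rstrip())

    smallest = min(values)
    lines.append(f"Key: {smallest // 10}|{smallest % 10} = {smallest}")
    return lines


scores = [39, 45, 52, 52, 71, 78]  # hypothetical scores
print("\n".join(stemplot(scores)))
```

The stem for the 60s still gets its own (empty) row, which keeps any gap in the distribution visible.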

• Split Stemplot: a variation of the stemplot.


o Stems must be split the same way throughout the entire
graph (notice how there are two 5’s, two 6’s, …, two 10’s?
the first 5 covers 50-54 while the second 5 covers 55-59).

o Allows condensed data sets to spread out a little more.

o As always with ANY kind of stemplot, do not forget the…

• Back-to-Back Stemplot: a more in-depth variation of the stemplot.
o Breaks the data down by a categorical value.

o Don’t forget to include TWO keys for how to read both sides individually!

Pros of Dotplots and Stemplots

1) Both work well with smaller data sets.

2) Dotplots work well with very condensed data sets.

3) Dotplots work well with whole numbers.

4) Stemplots work well with 2+ digit numbers (whole or a few decimal places).

*************************************************************************************************************
Create a stemplot and then describe the distribution of the scores.

[Handwritten stemplot not reproduced.]

Key: 3|9 = 39

The distribution of the scores is bimodal. There do not appear to be any outliers. The median is 66 and the mean is 67.1. The range is 56.

Chapter 1: [Video #6] – Displaying Quantitative Data Graphically: Part 2

• Histograms: …are NOT bar graphs!!!

o Histograms have NO gaps between each bin.

o Histogram bins have equal width.

o The values on the horizontal axis go on the border/edge of each bin to signify the starting/ending value of each bin.

How to make a histogram by hand:

1) Determine how many bins of what width you need first. You want ideally 5-12 bins depending on how much and/or how spread out your data is.

2) Organize your data into a table.

3) Set up your axes & label appropriately.

4) Draw in all of the bins into their appropriate places to finish the histogram!
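
The table-building part of the by-hand method can be sketched in Python. This is only an illustration: the temperatures are hypothetical, `bin_counts` is a made-up helper, and a value that lands exactly on a bin border is counted in the bin that starts there (one common convention, stated here as an assumption).

```python
def bin_counts(values, start, width, num_bins):
    """Count how many values fall in each equal-width bin [low, low + width)."""
    counts = [0] * num_bins
    for value in values:
        index = int((value - start) // width)
        if 0 <= index < num_bins:
            counts[index] += 1
    return counts


temps = [84, 88, 91, 93, 94, 95, 95, 97, 99, 104]  # hypothetical high temperatures
counts = bin_counts(temps, start=80, width=5, num_bins=5)

# Text version of the table; each '#' is one data value in that bin
for i, count in enumerate(counts):
    low = 80 + 5 * i
    print(f"{low}-{low + 5}: {'#' * count}")
```

With a bin width of 5 starting at 80, the data fits in five bins, which is inside the suggested 5-12 range.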

Now, describe the distribution of high temperatures in the box below!


The distribution of high temperatures is roughly symmetric with a mean of 94.6°F. There don't appear to be any outliers. The standard deviation is 5.9.

How to make a histogram with a TI-83/84 calculator:

1) Enter your data into a list.

a. [STAT], select “Edit”. Use L1 or whatever list you like!

2) Prepare the histogram.

a. [2ND] [Y=] to get to STAT PLOT.

b. Turn ON, select the histogram TYPE, Xlist is where your data is, leave Freq at 1.

3) Let’s see the histogram!

a. Press [ZOOM], select 9: ZoomStat to fit the histogram perfectly on the screen.

4) Adjust the histogram as needed.

a. Go to [WINDOW]. This is where you can set your own minimum (Xmin),
Maximum (Xmax), and scale/bin width (Xscl). Leave all else alone. Then
[GRAPH]. Do not do [ZOOM] 9 again or else your new settings will be lost!

Use [TRACE] to help quickly identify each bin height to make a quick histogram by hand!

*************************************************************************************************************
Create a histogram with the following: min 1000, max 1600, bin width 100

Then, describe the distribution of SAT scores.

[Handwritten histogram not reproduced; horizontal axis: SAT Scores.]

Bin counts from the handwritten table:
1000-1099: 3
1100-1199: [illegible]
1200-1299: 7
1300-1399: 3
1400-1499: [illegible]
1500-1599: [illegible]

Chapter 1: [Video #7] – Displaying Quantitative Data Graphically: Part 3

• Boxplot (Box and Whisker Plot): shows one measure of center (median), two
measures of spread (IQR and range), outliers (if present), and a rough idea of shape.

To make a boxplot:

The “five number summary”

1) minimum

2) Q1 (first quartile)

3) median

4) Q3 (third quartile)

5) maximum

If not given, use your calculator to get these!
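
For checking calculator output, here is a minimal Python sketch of the five number summary. It finds Q1 and Q3 as the medians of the lower and upper halves of the ordered data, leaving the overall median out of both halves when the count is odd. That is a common classroom convention and, as far as I know, the one the TI-83/84 uses, but treat that as an assumption. The SAT scores are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]
    # Skip the middle value when n is odd so it is in neither half
    upper = data[half + 1:] if n % 2 else data[half:]
    return (data[0], median(lower), median(data), median(upper), data[-1])


sat_scores = [1000, 1100, 1150, 1200, 1250, 1300, 1400, 1600]  # hypothetical
minimum, q1, med, q3, maximum = five_number_summary(sat_scores)
print(minimum, q1, med, q3, maximum)
print("IQR:", q3 - q1)
```

The box of the boxplot runs from Q1 to Q3, with a line at the median and whiskers out to the minimum and maximum when there are no outliers.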

Barry Bonds’ home run boxplot:

____________________________________________________________________________

Jimmy Juicer’s home run boxplot:

____________________________________________________________________________

How to make a boxplot with a TI-83/84 calculator:

1) Enter your data into a list.

a. [STAT], select “Edit”. Use L1 or whatever list you like!

2) Prepare the boxplot.

a. [2ND] [Y=] to get to STAT PLOT.

b. Turn ON the plot, select the boxplot TYPE that shows outliers (if present).
Then, “Xlist” is where your data is stored and leave “Freq” at 1.

3) Let’s see the boxplot!

a. Press [ZOOM], select 9: ZoomStat to fit the boxplot perfectly on the screen.
If you only press [GRAPH], you might not see your boxplot!

Use [TRACE] to help quickly identify the “five number summary” along the boxplot

*************************************************************************************************************

SAT Scores

Create a boxplot in your calculator and copy it down in your notes. Use the [TRACE] feature
to get the “five number summary” to label the axis.

**Then, describe the distribution of SAT scores.**

[Handwritten boxplot not reproduced; horizontal axis: SAT Scores.]
