Statistics For Business and Economics: Describing Data: Numerical
Chapter 2
Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall Ch. 2-1
Describing Data Numerically
Central tendency:
Arithmetic mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ (arithmetic average)
Median: midpoint of ranked values
Mode: most frequently observed value
Variation: variance
Arithmetic Mean
(aritmetik ortalama)
The arithmetic mean (mean) is the most
common measure of central tendency
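For a population of N values:
$\mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N}$
where the $x_i$ are the population values and N is the population size
For a sample of size n:
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
where the $x_i$ are the observed values and n is the sample size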
Arithmetic Mean
(continued)
[Figure: two dot plots on a 0 to 10 axis]
Data 1, 2, 3, 4, 5: Mean = $\frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3$
Data 1, 2, 3, 4, 10: Mean = $\frac{1 + 2 + 3 + 4 + 10}{5} = \frac{20}{5} = 4$
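The two means above can be checked with a short Python sketch. The function name `arithmetic_mean` is only an illustrative choice, not something from the slides.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by how many values there are."""
    return sum(values) / len(values)


balanced = [1, 2, 3, 4, 5]
with_large_value = [1, 2, 3, 4, 10]

mean_balanced = arithmetic_mean(balanced)                  # 15 / 5
mean_with_large_value = arithmetic_mean(with_large_value)  # 20 / 5

print(mean_balanced, mean_with_large_value)  # prints 3.0 4.0
```

Replacing the single value 5 with 10 moves the mean from 3 to 4, which shows how strongly one large observation pulls the mean.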
Median (medyan, ortanca)
In an ordered list, the median is the “middle”
number (50% above, 50% below)
[Figure: two dot plots on a 0 to 10 axis; Median = 3 in both]
Finding the Median
Median position = $\frac{n+1}{2}$ position in the ordered data
If the number of values is odd, the median is the middle number
If the number of values is even, the median is the average of
the two middle numbers
Note that $\frac{n+1}{2}$ is not the value of the median, only the
position of the median in the ranked data
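A minimal Python sketch of the position rule follows. The function name `median_by_position` is an illustrative assumption; the ten-value list is the data from the book example quoted in this chapter.

```python
def median_by_position(values):
    """Find the median via the (n + 1) / 2 position in the ranked data."""
    ranked = sorted(values)
    n = len(ranked)
    position = (n + 1) / 2  # 1-based position, not the median itself
    if n % 2 == 1:
        # Odd n: the position is a whole number, so take that value.
        return ranked[int(position) - 1]
    # Even n: the position ends in .5, so average the two middle values.
    lower = ranked[n // 2 - 1]
    upper = ranked[n // 2]
    return (lower + upper) / 2


odd_median = median_by_position([1, 2, 3, 4, 10])  # position 3
even_median = median_by_position([88, 51, 63, 85, 79, 65, 79, 70, 73, 77])  # position 5.5

print(odd_median, even_median)  # prints 3 75.0
```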
Mode (mod)
A measure of central tendency
Value that occurs most often
Not affected by extreme values
Used for either numerical or categorical data
There may be no mode
There may be several modes
[Figure: two dot plots; on the 0 to 14 axis, Mode = 9; on the 0 to 6 axis, No Mode]
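The three cases (one mode, no mode, several modes) can be sketched in Python as below. The data lists are hypothetical, and treating "no value repeats" as "no mode" is one common convention rather than a rule stated on the slide.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs most often.

    Returns an empty list when no value repeats (treated here as "no mode").
    Assumes the input is not empty.
    """
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []
    return [value for value, count in counts.items() if count == highest]


one_mode = modes([1, 3, 5, 9, 9, 9, 12])         # hypothetical numerical data
no_mode = modes([1, 2, 3, 4, 5, 6])              # every value occurs once
two_modes = modes(["S", "M", "M", "L", "L"])     # hypothetical categorical data

print(one_mode, no_mode, two_modes)  # prints [9] [] ['M', 'L']
```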
Example (from your book, pp.66-67)
88 51 63 85 79 65 79 70 73 77
[Figure: five house prices: $2,000,000, $500,000, $300,000, $100,000, $100,000]
Review Example:
Summary Statistics
House Prices:
$2,000,000
500,000
300,000
100,000
100,000
Sum 3,000,000
Mean: ($3,000,000/5) = $600,000
Median: middle value of ranked data = $300,000
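As a cross-check, the same summary statistics can be computed with Python's standard `statistics` module. The variable names are illustrative.

```python
import statistics

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(house_prices)      # 3,000,000 / 5
median_price = statistics.median(house_prices)  # middle value of the ranked data
mode_price = statistics.mode(house_prices)      # most frequent value

print(mean_price, median_price, mode_price)  # prints 600000 300000 100000
```

The single $2,000,000 house lifts the mean to twice the median, so the median is the more typical price here.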
Which measure of location
is the “best”?
Categorical data are best described by the median
and mode.
Example: clothing retailers track the size of the items sold most often
(the mode), since it is the one in heaviest demand
Numerical data are best described by the mean.
Mean is generally used, unless extreme values
(outliers) exist . . .
Then median is often used, since the median is not
sensitive to extreme values.
Example: Median home prices may be reported for a region
– less sensitive to outliers
Shape of a Distribution
Measures of shape
Symmetric (simetrik)
Skewed (çarpık)
Symmetry:
The shape of a distribution is said to be symmetric if the observations
are balanced, or approximately evenly distributed, about its middle.
Shape of a Distribution
Symmetric: Mean = Median
Left-Skewed: Mean < Median
Right-Skewed: Median < Mean
Geometric Mean
(geometrik ortalama)
Geometric mean
Used to measure the rate of change of a variable
over time
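$\bar{x}_g = \sqrt[n]{x_1 \times x_2 \times \cdots \times x_n} = (x_1 \times x_2 \times \cdots \times x_n)^{1/n}$
Geometric mean rate of return:
$\bar{r}_g = (x_1 \times x_2 \times \cdots \times x_n)^{1/n} - 1$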
Example
(continued)
Arithmetic mean rate of return:
$\bar{X} = \frac{(50\%) + (20\%)}{2} = 35\%$ (misleading result)
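The slide that sets up this example did not survive, so the following Python sketch rests on two assumptions: that 50% and 20% are the returns in two successive periods, and that the $x_i$ in the geometric mean formula are growth factors (1 + rate). Under those assumptions it shows why 35% is misleading.

```python
import math


def geometric_mean_rate(rates):
    """Geometric mean rate of return for per-period rates given as decimals.

    Assumption: each rate r is turned into a growth factor 1 + r before
    the n-th root is taken.
    """
    growth_factors = [1 + r for r in rates]
    product = math.prod(growth_factors)
    return product ** (1 / len(rates)) - 1


rates = [0.50, 0.20]  # assumed: 50% in period 1, then 20% in period 2

arithmetic_rate = sum(rates) / len(rates)    # 0.35
geometric_rate = geometric_mean_rate(rates)  # sqrt(1.5 * 1.2) - 1, about 0.3416

# Total growth over the two periods is 1.5 * 1.2 = 1.8.
growth_at_geometric = (1 + geometric_rate) ** 2    # reproduces 1.8
growth_at_arithmetic = (1 + arithmetic_rate) ** 2  # 1.8225, overstates growth
```

Compounding at the geometric rate recovers the actual total growth, while compounding at the arithmetic 35% does not.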
3.6 3.1 3.9 3.7 3.5 3.7 3.4 3.0 3.7 3.4
20 73 75 80 82
40 55 62 43 50 60 65
Variation
Measures of variation
give information on the
spread (yayıklık) or
variability (değişkenlik)
of the data values.
Same center,
different variation
Range (aralık)
Simplest measure of variation
Difference between the largest and the
smallest observations:
Range = $x_{largest} - x_{smallest}$
Example:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
Range = 14 - 1 = 13
Disadvantages of the Range
Ignores the way in which data are distributed
[Figure: two data sets distributed differently over 7 to 12; for both, Range = 12 - 7 = 5]
Sensitive to outliers
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,5
Range = 5 - 1 = 4
1,1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,4,120
Range = 120 - 1 = 119
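A short Python sketch reproduces the two ranges above and makes the outlier effect concrete. The function name `value_range` is illustrative.

```python
def value_range(values):
    """Largest observation minus smallest observation."""
    return max(values) - min(values)


# Eleven 1s, eight 2s, four 3s, then the last two values from the slide.
common_part = [1] * 11 + [2] * 8 + [3] * 4
without_outlier = common_part + [4, 5]
with_outlier = common_part + [4, 120]

range_without = value_range(without_outlier)
range_with = value_range(with_outlier)

print(range_without, range_with)  # prints 4 119
```

Only one of the 25 observations changed, yet the range grew from 4 to 119.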
Quartiles
(dörde bölenler veya kartiller)
Quartiles split the ranked data into 4 segments
with an equal number of values per segment
Q1 Q2 Q3
The first quartile, Q1, is the value for which 25% of
the observations are smaller and 75% are larger
Q2 is the same as the median (50% are smaller, 50%
are larger)
Only 25% of the observations are greater than the
third quartile
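Quartile Formulas
Position of each quartile in the ranked data:
Q1 position = 0.25(n+1)
Q2 position = 0.50(n+1) (the median position)
Q3 position = 0.75(n+1)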
Quartiles
(n = 9)
Q1 is in the 0.25(9+1) = 2.5 position of the ranked data,
so use the value halfway between the 2nd and 3rd values:
Q1 = 12.5
Q2 = 16 Q3 = 19.5
Example (from your book, pp.74)
51 63 65 70 73 77 79 79 85 88
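A Python sketch of the position rule applied to these ten ranked values follows. The helper name `quartile` is illustrative, and using linear interpolation when the position falls between two ranks is an assumption that generalizes the "halfway between" step shown for a .5 position.

```python
def quartile(ranked, fraction):
    """Value at the fraction * (n + 1) position (1-based) of ranked data.

    Assumption: fractional positions are resolved by linear interpolation
    between the two neighbouring ranked values.
    """
    n = len(ranked)
    position = fraction * (n + 1)
    lower_rank = int(position)           # whole part of the position
    remainder = position - lower_rank    # fractional part of the position
    if lower_rank < 1:
        return ranked[0]
    if lower_rank >= n:
        return ranked[-1]
    lower = ranked[lower_rank - 1]
    upper = ranked[lower_rank]
    return lower + remainder * (upper - lower)


scores = sorted([51, 63, 65, 70, 73, 77, 79, 79, 85, 88])

q1 = quartile(scores, 0.25)  # position 2.75, between 63 and 65
q2 = quartile(scores, 0.50)  # position 5.5, between 73 and 77
q3 = quartile(scores, 0.75)  # position 8.25, between 79 and 85
iqr = q3 - q1

print(q1, q2, q3, iqr)  # prints 64.5 75.0 80.5 16.0
```

Other software may use a different quartile convention and return slightly different values for the same data.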
Interquartile Range
Example:
X minimum = 12, Q1 = 30, Median (Q2) = 45, Q3 = 57, X maximum = 70
(25% of the observations fall in each of the four segments)
Interquartile range = Q3 - Q1 = 57 - 30 = 27
Population Variance
(ana kütle varyansı)
$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Where μ = population mean
N = population size
xi = ith value of the variable x
Sample Variance
(örneklem varyansı)
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
Where $\bar{x}$ = arithmetic mean
n = sample size
xi = ith value of the variable x
Population Standard Deviation
(ana kütle standart sapması)
$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$
Sample Standard Deviation
(örneklem standart sapması)
$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$
Calculation Example:
Sample Standard Deviation
Sample
Data (xi): 10 12 14 15 17 18 18 24
n = 8, Mean = $\bar{x}$ = 16
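The calculation for this sample can be traced step by step in Python. This is a sketch of the sample standard deviation formula with n - 1 in the denominator; the variable names are illustrative.

```python
import math

data = [10, 12, 14, 15, 17, 18, 18, 24]
n = len(data)                 # 8
mean = sum(data) / n          # 128 / 8 = 16.0

# Squared deviation of each observation from the sample mean.
squared_deviations = [(x - mean) ** 2 for x in data]
sum_of_squares = sum(squared_deviations)      # 130.0

sample_variance = sum_of_squares / (n - 1)    # 130 / 7
sample_sd = math.sqrt(sample_variance)

print(mean, sum_of_squares, round(sample_sd, 4))  # prints 16.0 130.0 4.3095
```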
20 35 28 22 10 40 23 32 28 30
of different assets.
If two assets have the same mean rates of
return,
then the asset with the smaller standard
deviation has less risk than
the asset with the larger standard deviation.
Example (from your book, pp.78-79)
6 8 7 10 3 5 9 8
Example (from your book, pp.83)
3 0 -2 -1 5 10
Measuring variation
Comparing Standard Deviations
[Figure: three data sets plotted on an 11 to 21 axis]
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.926
Data C: Mean = 15.5, s = 4.570
Advantages of Variance and
Standard Deviation
Coefficient of Variation
(değişim katsayısı)
Measures relative variation
Always in percentage (%)
Shows variation relative to mean (expresses
the standard deviation as a percentage of the
mean)
Can be used to compare two or more sets of
data measured in different units
$CV = \frac{s}{\bar{x}} \times 100\%$ if $\bar{x} > 0$
Comparing Coefficient
of Variation
Stock A:
Average price last year = $50
Standard deviation = $5
$CV_A = \frac{s}{\bar{x}} \times 100\% = \frac{\$5}{\$50} \times 100\% = 10\%$
Stock B:
Average price last year = $100
Standard deviation = $5
$CV_B = \frac{s}{\bar{x}} \times 100\% = \frac{\$5}{\$100} \times 100\% = 5\%$
Both stocks have the same standard deviation, but stock B is less
variable relative to its price
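The stock comparison can be reproduced with a small Python helper. The function name `coefficient_of_variation` is illustrative, and rejecting a non-positive mean is one way to reflect the condition attached to the formula.

```python
def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    if mean <= 0:
        raise ValueError("the coefficient of variation needs a positive mean")
    return 100 * std_dev / mean


cv_a = coefficient_of_variation(5, 50)    # Stock A: s = $5, mean = $50
cv_b = coefficient_of_variation(5, 100)   # Stock B: s = $5, mean = $100

print(cv_a, cv_b)  # prints 10.0 5.0
```

Because the dollar units cancel, the two percentages can be compared directly even though the price levels differ.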
Comparing Coefficient
of Variation (pp.80)
Stock A:
Average closing price = $4
Standard deviation = $2
$CV_A = \frac{s}{\bar{x}} \times 100\% = \frac{\$2}{\$4} \times 100\% = 50\%$
Stock B:
Average closing price = $80
Standard deviation = $8
$CV_B = \frac{s}{\bar{x}} \times 100\% = \frac{\$8}{\$80} \times 100\% = 10\%$
The market value of stock A fluctuates more from period to period
than does that of stock B.
Weighted Mean
2.3
(ağırlıklı ortalama)
$\bar{x} = \frac{\sum w_i x_i}{n} = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{n}$
Where $w_i$ is the weight of the ith observation
and $n = \sum w_i$
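A minimal Python sketch of the weighted mean is given below. The grades and credit hours are hypothetical values chosen only for illustration; they do not come from the slides or the book.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    weighted_sum = sum(w * x for w, x in zip(weights, values))
    return weighted_sum / total_weight


# Hypothetical example: course grades weighted by credit hours.
grades = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 3]

grade_point_average = weighted_mean(grades, credit_hours)  # (12 + 12 + 6) / 10

print(grade_point_average)  # prints 3.0
```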
Example (from your book, pp.85)
$\bar{x} = \frac{\sum_{i=1}^{K} f_i m_i}{n}$ where $n = \sum_{i=1}^{K} f_i$
Approximations for Grouped Data
Suppose data are grouped into K classes, with
frequencies f1, f2, . . . fK, and the midpoints of the
classes are m1, m2, . . ., mK
$s^2 = \frac{\sum_{i=1}^{K} f_i (m_i - \bar{x})^2}{n-1}$
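The grouped-data approximations for the mean and variance can be sketched in Python as follows. The class midpoints and frequencies are hypothetical, since no table of values accompanies the formulas here.

```python
def grouped_mean(midpoints, frequencies):
    """Approximate mean: sum of f_i * m_i divided by n, where n = sum of f_i."""
    n = sum(frequencies)
    return sum(f * m for f, m in zip(frequencies, midpoints)) / n


def grouped_variance(midpoints, frequencies):
    """Approximate sample variance: sum of f_i * (m_i - mean)^2 over n - 1."""
    n = sum(frequencies)
    mean = grouped_mean(midpoints, frequencies)
    squares = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, midpoints))
    return squares / (n - 1)


# Hypothetical classes 0-10, 10-20, 20-30 with their midpoints and counts.
midpoints = [5, 15, 25]
frequencies = [2, 5, 3]

approx_mean = grouped_mean(midpoints, frequencies)          # 160 / 10 = 16.0
approx_variance = grouped_variance(midpoints, frequencies)  # 490 / 9
```

These are approximations because every observation in a class is treated as if it sat at the class midpoint.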
Example (from your book,pp.87)
Coffee shop customers were randomly surveyed and
asked to select a category that described the cost
of their recent purchase. The results were as
follows:
The population covariance:
$Cov(x, y) = \sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$
The sample covariance:
$Cov(x, y) = s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n-1}$
Only concerned with the strength of the relationship
No causal effect is implied
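A short Python sketch of the sample covariance follows. The paired data are hypothetical and the function name `sample_covariance` is illustrative.

```python
def sample_covariance(xs, ys):
    """Sum of (x_i - mean_x) * (y_i - mean_y), divided by n - 1."""
    if len(xs) != len(ys):
        raise ValueError("x and y must be paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = sample_covariance(x, y)  # cross products sum to 6, so 6 / 4

print(cov_xy)  # prints 1.5
```

A positive result means x and y tend to move in the same direction, but its size depends on the units of both variables.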
Interpreting Covariance
Coefficient of Correlation
(korelasyon katsayısı)
Measures the relative strength of the linear
relationship between two variables
Population correlation coefficient:
$\rho = \frac{Cov(x, y)}{\sigma_X \sigma_Y}$
Sample correlation coefficient:
$r = \frac{Cov(x, y)}{s_X s_Y}$
Features of
Correlation Coefficient, r
Unit free
Ranges between –1 and 1
The closer to –1, the stronger the negative linear
relationship
The closer to 1, the stronger the positive linear
relationship
The closer to 0, the weaker any linear
relationship
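These features can be checked with a Python sketch of the sample correlation coefficient, computed as the sample covariance divided by the product of the two sample standard deviations. All data below are hypothetical, and the function names are illustrative.

```python
import math


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of sample standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (sd_x * sd_y)


x = [1, 2, 3, 4, 5]

r_positive = sample_correlation(x, [2, 4, 5, 4, 5])     # about 0.77
r_perfect = sample_correlation(x, [3, 5, 7, 9, 11])     # y = 2x + 1, so about +1
r_negative = sample_correlation(x, [10, 8, 6, 4, 2])    # y = 12 - 2x, so about -1

# Unit free: rescaling x (for example metres to centimetres) leaves r unchanged.
r_rescaled = sample_correlation([100 * v for v in x], [2, 4, 5, 4, 5])
```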
Scatter Plots of Data with Various
Correlation Coefficients
[Figure: six scatter plots of Y against X; top row r = -1, r = -.6, r = 0; bottom row r = +1, r = +.3, r = 0]
Interpreting the Result
r = .733
There is a relatively strong positive linear relationship between
test score #1 and test score #2
[Figure: Scatter Plot of Test Scores, with Test #2 Score on the vertical axis; both axes run from 70 to 100]
Example (from your book,pp.95)