Statistics for Business
and Economics
Module 1: Probability Theory and
Statistical Inference
Spring 2010
Lecture 3: Continuous probability distributions
Priyantha Wijayatunga, Department of Statistics, Umeå University
These materials are adapted from copyrighted lecture slides (© 2009 W.H.
Freeman and Company) from the homepage of the book:
The Practice of Business Statistics: Using Data for Decisions, Second Edition
by Moore, McCabe, Duckworth and Alwan.
Continuous probability distributions
Probability density
Uniform probability density
Normal distributions, standard normal distribution
Law of large numbers
Sampling distributions
The mean and standard deviation of sample mean
The central limit theorem
Recall: Discrete Probability Distributions
Let X denote the number of days a student comes to class (in a week).
The probability distribution is

$$
P(X = x) = p(x) =
\begin{cases}
0.1 & \text{if } x = 1 \\
0.2 & \text{if } x = 2 \\
0.2 & \text{if } x = 3 \\
0.3 & \text{if } x = 4 \\
0.2 & \text{if } x = 5
\end{cases}
$$

Then:
1) What is the probability that a student comes to class more than 3 days?
2) What is the probability that a student comes to class 2 or 3 days?
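Both questions can be answered by adding the probabilities of the outcomes that make up each event. The following is a minimal Python sketch of that calculation; the names `pmf` and `prob` are illustrative and not part of the original slides.

```python
# Probability distribution of X = number of days a student comes to class
pmf = {1: 0.1, 2: 0.2, 3: 0.2, 4: 0.3, 5: 0.2}

def prob(event):
    """Sum p(x) over all outcomes x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

p_more_than_3 = prob(lambda x: x > 3)     # P(X = 4) + P(X = 5)
p_2_or_3 = prob(lambda x: x in (2, 3))    # P(X = 2) + P(X = 3)

print(round(p_more_than_3, 2), round(p_2_or_3, 2))  # prints: 0.5 0.4
```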
Continuous Probability Distributions
A continuous random variable X takes all values in an interval.
Example: There is an infinity of numbers between 0 and 1 (e.g., 0.001, 0.4, 0.0063876).
The probability distribution of a continuous random variable is described
by a density curve (also called density function or probability
density).
The probability of any event is the area under the density curve for the
values of X that make up the event.
This is a uniform density curve for the variable X.
The probability that X falls between 0.3 and 0.7 is
the area under the density curve for that interval:
P(0.3 ≤ X ≤ 0.7) = (0.7 − 0.3) × 1 = 0.4
Density function:
f(x) = 1 for 0 ≤ x ≤ 1
f(x) = 0 for x < 0 or x > 1
Intervals
All continuous probability distributions assign probability 0 to every
individual outcome. Only intervals can have a positive probability, represented
by the area under the density curve for that interval.
The probability of a single value is zero:
P(X = 1) = (1 − 1) × 1 = 0
[Figure: uniform density curve on [0, 1] with height = 1]
The probability of an interval is the same whether
boundary values are included or excluded:
P(0 ≤ X ≤ 0.5) = (0.5 − 0) × 1 = 0.5
P(0 < X < 0.5) = (0.5 − 0) × 1 = 0.5
P(0 ≤ X < 0.5) = (0.5 − 0) × 1 = 0.5
P(X < 0.5 or X > 0.8) = P(X < 0.5) + P(X > 0.8) = 1 − P(0.5 < X < 0.8) = 0.7
Assigning Probabilities: intervals of
outcomes
A sample space may contain all numbers within a range.
For continuous outcomes, the probability model is a density curve.
Area under the entire density curve is equal to 1.
Probability model assigns probabilities as areas under the density
curve.
Assigning Probabilities: intervals of outcomes
Suppose all possible outcomes are equally likely: for example, every
value from 0 to 1 is equally likely.
The model is then the uniform density curve (uniform probability distribution) on [0,1].
Probabilities are computed as areas:
P(0.3 ≤ X ≤ 0.7) = 0.4
Similarly, P(X < 0.5 or X > 0.8) = 0.5 + 0.2 = 0.7
General uniform probability distribution
If the outcomes are equally likely for any value in between two numbers a and b
(random variable X can take any value in between a and b), where a < b,
then the probability density of X is

$$
f(x) =
\begin{cases}
\dfrac{1}{b - a} & \text{if } a \le x \le b \\
0 & \text{otherwise}
\end{cases}
$$

Ex: The number of minutes that a student takes to solve a math problem is
known to be any number between 10 and 20, with equal chances.
Find the probability that a student takes more than 6 but less than 12
minutes to solve a given math problem.
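For a uniform density, the probability of an interval is the length of the part of that interval lying inside [a, b], divided by (b − a). The sketch below applies this to the exercise; `uniform_prob` is a hypothetical helper written for illustration. Since the density is zero below 10 minutes, only the stretch from 10 to 12 contributes, which gives 2/10 = 0.2.

```python
def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on [a, b]: overlap length / (b - a)."""
    if not a < b:
        raise ValueError("need a < b")
    # Length of the part of (lo, hi) that lies inside [a, b]; 0 if disjoint
    overlap = max(0.0, min(hi, b) - max(lo, a))
    return overlap / (b - a)

p_unit = uniform_prob(0.3, 0.7, 0, 1)       # the [0, 1] example, about 0.4
p_minutes = uniform_prob(6, 12, 10, 20)     # the math-problem exercise

print(round(p_unit, 4), round(p_minutes, 4))
```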
Continuous random variable and population
distribution
The shaded area under a density
curve shows the proportion, or %,
of individuals in a population with
values of X between x1 and x2.
Because the probability of drawing
one individual at random
depends on the frequency of this
type of individual in the population,
the probability is also the shaded
area under the curve.
% individuals with X
such that x1 < X < x2
Normal probability models
Normal probability models look like:
The scores of students on the ACT college entrance examination
in a recent year had the normal distribution with mean μ = 18.6 and
standard deviation σ = 5.9.
What is the probability that a randomly chosen student scores 21 or
higher?
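As a preview of the calculation developed in the following slides, this probability can be computed directly from the normal model. The sketch below uses `statistics.NormalDist` from the Python standard library in place of Table A; the variable names are illustrative. The result is roughly 0.34.

```python
from statistics import NormalDist

act = NormalDist(mu=18.6, sigma=5.9)   # ACT score model from the slide

z = (21 - 18.6) / 5.9                  # standardized value of x = 21
p_21_or_higher = 1 - act.cdf(21)       # area to the right of 21

print(f"z = {z:.2f}, P(X >= 21) = {p_21_or_higher:.4f}")
```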
Normal probability distributions
The probability distribution of many random variables is a normal
distribution. It shows what values the random variable can take and is
used to assign probabilities to those values.
Example: Probability distribution of women's heights.
Here, since we chose a woman randomly, her height, X, is a
random variable.
To calculate probabilities with the normal distribution, we will
standardize the random variable (z-score) and use Table A.
Normal distributions
Normal or Gaussian distributions are a family of symmetrical, bell-shaped
density curves defined by a mean μ (mu) and a standard
deviation σ (sigma): N(μ, σ).

$$
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^{2}}
$$

e = 2.71828… The base of the natural logarithm
π = pi = 3.14159…
A family of density curves
Here means are the same (μ = 15)
while standard deviations are
different (σ = 2, 4, and 6).
Here means are different
(μ = 10, 15, and 20) while
standard deviations are the same
(σ = 3).
The 68-95-99.7 rule
About 68% of all observations
are within 1 standard deviation
(σ) of the mean (μ).
About 95% of all observations
are within 2σ of the mean μ.
Almost all (99.7%) observations
are within 3σ of the mean.
[Figure: Normal curve N(64.5, 2.5) with the inflection point marked]
mean μ = 64.5
standard deviation σ = 2.5
N(μ, σ) = N(64.5, 2.5)
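The three percentages in the rule can be checked numerically for the N(64.5, 2.5) example. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the dictionary `within` is an illustrative name. The areas come out close to 0.6827, 0.9545 and 0.9973.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)   # women's heights, in inches

within = {}
for k in (1, 2, 3):
    lo = heights.mean - k * heights.stdev
    hi = heights.mean + k * heights.stdev
    # Area under the density curve between mu - k*sigma and mu + k*sigma
    within[k] = heights.cdf(hi) - heights.cdf(lo)
    print(f"within {k} sd ({lo} to {hi} inches): {within[k]:.4f}")
```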
The standard Normal distribution
Because all Normal distributions share the same properties, we can
standardize our data to transform any Normal curve N(μ, σ) into the
standard Normal curve N(0,1).
[Figure: N(64.5, 2.5) => N(0,1); horizontal axis: standardized height (no units)]
For each x we calculate a new value, z (called a z-score).
Standardizing: calculating z-scores
A z-score measures the number of standard deviations that a data
value x is from the mean μ.

$$
z = \frac{x - \mu}{\sigma}
$$

When x is 1 standard deviation larger
than the mean, then z = 1:

$$
\text{for } x = \mu + \sigma, \quad z = \frac{\mu + \sigma - \mu}{\sigma} = \frac{\sigma}{\sigma} = 1
$$

When x is 2 standard deviations larger
than the mean, then z = 2:

$$
\text{for } x = \mu + 2\sigma, \quad z = \frac{\mu + 2\sigma - \mu}{\sigma} = \frac{2\sigma}{\sigma} = 2
$$
When x is larger than the mean, z is positive.
When x is smaller than the mean, z is negative.
Ex. Women's heights
Women's heights follow the N(64.5, 2.5)
distribution. What percent of women are
shorter than 67 inches tall (that's 5'7")?
mean μ = 64.5"
standard deviation σ = 2.5"
x (height) = 67"
[Figure: N(μ, σ) = N(64.5, 2.5) curve; the area to the left of x = 67 (z = 1) is to be found; μ = 64.5 corresponds to z = 0]
We calculate z, the standardized value of x:

$$
z = \frac{x - \mu}{\sigma}, \qquad z = \frac{67 - 64.5}{2.5} = \frac{2.5}{2.5} = 1
$$

So x = 67" is 1 standard deviation from the mean.
Because of the 68-95-99.7 rule, we can conclude that the percent of women
shorter than 67" should be, approximately, 0.68 + half of (1 − 0.68) = 0.84, or 84%.
What is the probability, if we pick one woman at random, that her height will be
some value X? For instance, between 68 and 70 inches, P(68 < X < 70)?
Because the woman is selected at random, X is a random variable.

$$
z = \frac{x - \mu}{\sigma}, \qquad N(\mu, \sigma) = N(64.5, 2.5)
$$

As before, we calculate the z-scores for 68 and 70.
For x = 68":

$$
z = \frac{68 - 64.5}{2.5} = 1.4
$$

For x = 70":

$$
z = \frac{70 - 64.5}{2.5} = 2.2
$$

The areas to the left of z = 1.4 and z = 2.2 are 0.9192 and 0.9861.
The area under the curve for the interval [68" to 70"] is 0.9861 − 0.9192 = 0.0669.
Thus, the probability that a randomly chosen woman falls into this range is 6.69%.
P(68 < X < 70) = 6.69%
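The same interval probability can be obtained without a printed table. The sketch below, which assumes Python's standard-library `statistics.NormalDist` as a stand-in for Table A, subtracts the area to the left of 68 from the area to the left of 70. The answer agrees with 0.0669 up to the rounding of the table.

```python
from statistics import NormalDist

heights = NormalDist(mu=64.5, sigma=2.5)   # women's heights, in inches

z_68 = (68 - 64.5) / 2.5    # about 1.4
z_70 = (70 - 64.5) / 2.5    # about 2.2

# Area left of 70 minus area left of 68
p_68_to_70 = heights.cdf(70) - heights.cdf(68)

print(f"z-scores: {z_68:.1f}, {z_70:.1f}; P(68 < X < 70) = {p_68_to_70:.4f}")
```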
Using Table A
Table A gives the area under the standard Normal curve to the left of any z value.
.0082 is the
area under
N(0,1) left
of z = -2.40
.0080 is the area
under N(0,1) left
of z = -2.41
0.0069 is the area
under N(0,1) left
of z = -2.46
Percent of women shorter than 67"
For z = 1.00, the area under
the standard Normal curve
to the left of z is 0.8413.
Conclusion:
84.13% of women are shorter than 67".
By subtraction, 1 − 0.8413, or 15.87%, of
women are taller than 67".
[Figure: N(μ, σ) = N(64.5, 2.5) curve with area ≈ 0.84 to the left and area ≈ 0.16 to the right of x = 67 (z = 1); μ = 64.5]
Tips on using Table A
Because the Normal distribution
is symmetrical, there are 2 ways
that you can calculate the area
under the standard Normal curve
to the right of a z value:
area right of z = area left of −z
area right of z = 1 − area left of z
[Figure: standard Normal curve with z = −2.33; area = 0.0099 to the left and area = 0.9901 to the right]
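Both routes to a right-tail area can be compared numerically. This is a small sketch for z = −2.33, using `statistics.NormalDist()` (the standard Normal curve) from the Python standard library instead of Table A; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()   # N(0, 1)
z = -2.33

right_by_symmetry = std_normal.cdf(-z)        # area left of -z
right_by_complement = 1 - std_normal.cdf(z)   # 1 - area left of z

print(f"{right_by_symmetry:.4f} {right_by_complement:.4f}")
```

The two values agree, and both match the 0.9901 read from the table.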
Tips on using Table A
To calculate the area between 2 z-values, first get the area under N(0,1)
to the left for each z-value from Table A.
Then subtract the
smaller area from the
larger area.
A common mistake made by
students is to subtract both z-values, but the Normal curve is
not uniform.
area between z1 and z2 =
area left of z1 − area left of z2 (for z1 > z2)
The area under N(0,1) for a single value of z is zero.
(Try calculating the area to the left of z minus that same area!)
The National Collegiate Athletic Association (NCAA) requires Division I athletes to
score at least 820 on the combined math and verbal SAT exam to compete in their
first college year. The SAT scores of 2003 were approximately normal with mean
1026 and standard deviation 209.
What proportion of all students would be NCAA qualifiers (SAT ≥ 820)?
x = 820, μ = 1026, σ = 209

$$
z = \frac{x - \mu}{\sigma} = \frac{820 - 1026}{209} = \frac{-206}{209} \approx -0.99
$$

Table A: the area under
N(0,1) to the left of
z = −0.99 is 0.1611,
or approx. 16%.
area right of 820 = total area − area left of 820 = 1 − 0.1611 = 0.8389 ≈ 84%
Note: The actual data may contain students who scored
exactly 820 on the SAT. However, the proportion of scores
exactly equal to 820 is 0 for a normal distribution; this is a
consequence of the idealized smoothing of density curves.
The NCAA defines a "partial qualifier", eligible to practice and receive an athletic
scholarship but not to compete, as a student whose combined SAT score is at least 720.
What proportion of all students who take the SAT would be partial
qualifiers? That is, what proportion have scores between 720 and 820?
x = 720, μ = 1026, σ = 209

$$
z = \frac{x - \mu}{\sigma} = \frac{720 - 1026}{209} = \frac{-306}{209} \approx -1.46
$$

Table A: the area under
N(0,1) to the left of
z = −1.46 is 0.0721,
or approx. 7%.
area between 720 and 820 = area left of 820 − area left of 720 = 0.1611 − 0.0721 = 0.0890 ≈ 9%
About 9% of all students who take the SAT have scores
between 720 and 820.
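Both NCAA proportions can be reproduced in a few lines. The sketch below uses `statistics.NormalDist` from the Python standard library rather than Table A, so the z-scores are not rounded to two decimals and the results differ from the table values in the third decimal place. The variable names are illustrative.

```python
from statistics import NormalDist

sat = NormalDist(mu=1026, sigma=209)   # SAT score model from the slide

# Qualifiers: scores of at least 820 (area to the right of 820)
p_qualifier = 1 - sat.cdf(820)

# Partial qualifiers: scores between 720 and 820
p_partial = sat.cdf(820) - sat.cdf(720)

print(f"qualifiers: {p_qualifier:.3f}, partial qualifiers: {p_partial:.3f}")
```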
The cool thing about working with
normally distributed data is that
we can manipulate it and then find
answers to questions that involve
comparing seemingly non-comparable distributions.
We do this by standardizing the
data. All this involves is changing
the scale so that the mean is now 0
and the standard deviation is 1. If
you do this to different distributions,
it makes them comparable.

$$
z = \frac{x - \mu}{\sigma}, \qquad z \text{ follows } N(0,1)
$$
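As an illustration of such a comparison, the sketch below standardizes one ACT score and one SAT score using the two models that appear in these slides, N(18.6, 5.9) and N(1026, 209). The two individual scores (27 and 1300) are hypothetical values chosen only for the example, and `z_score` is an illustrative helper.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical scores: one student's ACT result and another's SAT result
z_act = z_score(27, mu=18.6, sigma=5.9)
z_sat = z_score(1300, mu=1026, sigma=209)

print(f"ACT z = {z_act:.2f}, SAT z = {z_sat:.2f}")
better = "ACT" if z_act > z_sat else "SAT"
print(f"The {better} score is further above its own mean.")
```

On the raw scales 27 and 1300 cannot be compared, but on the standardized scale the larger z-score identifies the relatively stronger result.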
Finding a value when given a proportion
Backward normal calculations: We may also want to find
the observed range of values that correspond to a given proportion under the
curve.
For that, we use Table A backward:
we first find the desired area/proportion in the body of the table;
we then read the corresponding z-value from the left column and top row.
For an area to the left of 1.25% (0.0125),
the z-value is −2.24.
Backward Normal Calculations
Miles per gallon ratings of compact cars (2001 models) follow
approximately the N(25.7, 5.88) distribution. How many miles per gallon
must a vehicle get to place in the top 10% of all 2001 model compact cars?
1. z = 1.28 is the standardized
value with area 0.9 to its left and
0.1 to its right.
2. Unstandardize:

$$
\frac{x - 25.7}{5.88} = 1.28
$$

Solving for x gives x = 25.7 + 1.28 × 5.88 ≈ 33.2
miles per gallon.
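Reading Table A backward corresponds to evaluating the inverse of the cumulative distribution function. A minimal sketch, assuming `NormalDist.inv_cdf` from the Python standard library as the substitute for the table lookup:

```python
from statistics import NormalDist

# Step 1: z-value with area 0.9 to its left (top 10% lies to its right)
z_top10 = NormalDist().inv_cdf(0.90)

# Step 2: unstandardize, x = mu + z * sigma
mpg_cutoff = 25.7 + z_top10 * 5.88

# The same cutoff in one step, from the N(25.7, 5.88) model directly
mpg_direct = NormalDist(mu=25.7, sigma=5.88).inv_cdf(0.90)

print(f"z = {z_top10:.2f}, cutoff = {mpg_cutoff:.1f} mpg")
```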
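Other Standard Normal probability tables
[Figure: density curve of the standard normal distribution, horizontal axis Z, with the right-tail area P(Z > 1.87) = 0.03 marked]
If X ~ N(10, 0.3), then what is P(X > 11.025)? Here the second parameter, 0.3, is the variance, so σ = √0.3.

$$
P(X > 11.025) = P\left(\frac{X - 10}{\sqrt{0.3}} > \frac{11.025 - 10}{\sqrt{0.3}}\right) = P(Z > 1.87) = 1 - P(Z \le 1.87) = 1 - 0.9693 = 0.0307
$$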
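This tail probability can be checked numerically. The sketch below assumes, as the arithmetic 1.025/√0.3 ≈ 1.87 suggests, that 0.3 is the variance of X rather than its standard deviation; `NormalDist` takes a standard deviation, so the square root is applied explicitly.

```python
from math import sqrt
from statistics import NormalDist

sigma = sqrt(0.3)                       # assumption: 0.3 is the variance
x_dist = NormalDist(mu=10, sigma=sigma)

z = (11.025 - 10) / sigma               # standardized value, about 1.87
p_tail = 1 - x_dist.cdf(11.025)         # P(X > 11.025)

print(f"z = {z:.2f}, P(X > 11.025) = {p_tail:.4f}")
```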
Assessing the Normality of data
One way to assess if a distribution is indeed approximately normal is to
plot the data on a normal quantile plot.
The data points are ranked and the percentile ranks are converted to z-scores with Table A. The z-scores are then used for the x axis, against
which the data are plotted on the y axis of the normal quantile plot.
If the distribution is indeed normal, the plot will show a straight line,
indicating a good match between the data and a normal distribution.
Systematic deviations from a straight line indicate a non-normal
distribution. Outliers appear as points that are far away from the overall
pattern of the plot.
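The construction of the plot coordinates can be sketched as follows. The data values are hypothetical, and the percentile rank (i − 0.5)/n is one common convention assumed here for illustration; statistical software may use a slightly different one. `NormalDist.inv_cdf` from the Python standard library plays the role of Table A.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (z, value) pairs: the coordinates of a normal quantile plot."""
    n = len(data)
    std_normal = NormalDist()
    points = []
    for i, value in enumerate(sorted(data), start=1):
        percentile = (i - 0.5) / n            # assumed percentile rank
        points.append((std_normal.inv_cdf(percentile), value))
    return points

# Hypothetical sample of heights, in inches
sample = [61.0, 62.5, 63.0, 64.0, 64.5, 65.0, 66.5, 68.0]
points = normal_quantile_points(sample)

for z, value in points:
    print(f"z = {z:6.2f}   value = {value}")
```

Plotting the values against the z-scores gives the normal quantile plot; points lying close to a straight line suggest approximate normality.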
Normal quantile plot of
the earnings of 15 black
female hourly workers at
National Bank. This
distribution is roughly
Normal except for one
low outlier.
Normal quantile plot of
the salaries of Cincinnati
Reds players on opening
day of the 2000 season.
This distribution is
skewed to the right.
Law of large numbers
As the number of randomly drawn
observations in a sample increases,
the mean of the sample, x̄, gets
closer and closer to the population
mean μ.
This is the law of large numbers. It
is valid for any population.
Note: We often intuitively expect predictability over a few random observations,
but this is wrong. The law of large numbers only applies to really large numbers of observations.
Reminder: What is a sampling
distribution?
The sampling distribution of a statistic is the distribution of all
possible values taken by the statistic when all possible samples of a
fixed size n are taken from the population. It is a theoretical idea; we
do not actually build it.
The sampling distribution of a statistic is the probability distribution
of that statistic.
Sampling distribution of sample mean
We take many random samples of a given size n from a population
with mean μ and standard deviation σ.
Some sample means will be above the population mean μ and some
will be below, making up the sampling distribution.
[Figure: histogram of some sample averages, approximating the sampling distribution of x̄]
For any population with mean μ and standard deviation σ:
The mean of the sampling distribution of x̄ is equal to the population
mean μ.
The standard deviation of the sampling distribution is σ/√n, where n
is the sample size.
Mean and standard deviation of sample mean
Mean of a sampling distribution of x̄
There is no tendency for a sample mean to fall systematically above or
below μ, even if the distribution of the raw data is skewed. Thus, the
sample mean is an unbiased estimate of the population
mean μ: it will be "correct on average" in many samples.
Standard deviation of a sampling distribution of x̄
The standard deviation of the sampling distribution is smaller than the
standard deviation of the population by a factor of √n. Averages are
less variable than individual observations. Also, the results of large
samples are less variable than the results of small samples.
For normally distributed populations
When a variable in a population is normally distributed, the sampling
distribution of the sample mean for all possible samples of size n is
also normally distributed.
If the population is N(μ, σ),
then the sample mean's
distribution is N(μ, σ/√n).
[Figure: population distribution and the narrower sampling distribution of x̄]
The central limit theorem
Central Limit Theorem: When randomly sampling from any population
with mean μ and standard deviation σ, when n is large enough, the
sampling distribution of x̄ is approximately normal: x̄ ~ N(μ, σ/√n).
[Figure, four panels: population with strongly skewed distribution; sampling distribution of x̄ for n = 2 observations; sampling distribution of x̄ for n = 10 observations; sampling distribution of x̄ for n = 25 observations]
The central limit theorem
Histogram of 1000 sample means of 50-sized samples
[Figure, two panels: density of the Bin(5, 0.7) population; density histogram of the sample means (horizontal axis: sample mean)]
From a highly skewed distribution (mean = 3.5, sd = 1.024695) get
random samples with n = 50 and get their sample means.
The relative frequency distribution is approximately normal (bell shaped):
mean = 3.50164 and sd = 0.1471508
1.024695/√50 = 0.1449138
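The experiment on this slide can be repeated with a short simulation. The sketch below draws 1000 samples of size 50 from Bin(5, 0.7) and compares the mean and standard deviation of the sample means with μ = 3.5 and σ/√n. The seed and the helper `binomial_draw` are illustrative choices, so the simulated figures will not match the slide's 3.50164 and 0.1471508 digit for digit.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(3)   # fixed seed so the run is repeatable

def binomial_draw(trials=5, p=0.7):
    """One observation from Bin(trials, p): count successes in the trials."""
    return sum(random.random() < p for _ in range(trials))

n = 50
sample_means = [
    sum(binomial_draw() for _ in range(n)) / n   # mean of one sample of size n
    for _ in range(1000)
]

population_sd = sqrt(5 * 0.7 * 0.3)              # 1.024695...
theoretical_sd = population_sd / sqrt(n)         # sigma / sqrt(n)

print(f"mean of sample means: {mean(sample_means):.4f} (theory 3.5)")
print(f"sd of sample means:   {stdev(sample_means):.4f} (theory {theoretical_sd:.4f})")
```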
IQ scores: population vs. sample
In a large population of adults, the mean IQ is 112 with standard deviation 20.
Suppose 200 adults are randomly selected for a market research campaign.
The distribution of the sample mean IQ is:
A) Exactly normal, mean 112, standard deviation 20
B) Approximately normal, mean 112, standard deviation 20
C) Approximately normal, mean 112, standard deviation 1.414
D) Approximately normal, mean 112, standard deviation 0.1
Answer: C) Approximately normal, mean 112, standard deviation 1.414 (since 20/√200 ≈ 1.414)
Application
Hypokalemia is diagnosed when blood potassium levels are low, below
3.5 mEq/dl. Let's assume that we know a patient whose measured potassium
levels vary daily according to a normal distribution N(μ = 3.8, σ = 0.2).
If only one measurement is made, what is the probability that this patient will be
misdiagnosed hypokalemic?

$$
z = \frac{x - \mu}{\sigma} = \frac{3.5 - 3.8}{0.2} = -1.5
$$

P(z < −1.5) = 0.0668 ≈ 7%
If instead measurements are taken on 4 separate days and averaged, what is the probability
of such a misdiagnosis?

$$
z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{3.5 - 3.8}{0.2 / \sqrt{4}} = -3
$$

P(z < −3) = 0.0013 ≈ 0.1%
Note: Make sure to standardize (z) using the standard deviation for the sampling
distribution.
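The two misdiagnosis probabilities differ only in the standard deviation used for standardizing. A minimal sketch, using `statistics.NormalDist` from the Python standard library in place of Table A (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, threshold = 3.8, 0.2, 3.5

# One measurement: individual readings follow N(mu, sigma)
single = NormalDist(mu, sigma)
p_single = single.cdf(threshold)

# Mean of n = 4 measurements: the sample mean follows N(mu, sigma / sqrt(n))
mean_of_4 = NormalDist(mu, sigma / sqrt(4))
p_mean4 = mean_of_4.cdf(threshold)

print(f"one measurement:     {p_single:.4f}")
print(f"mean of 4 measures:  {p_mean4:.4f}")
```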
Income distribution
Let's consider the very large database of individual incomes from the Bureau of
Labor Statistics as our population. It is strongly right skewed.
We take 1000 SRSs of 100 incomes, calculate the sample mean for
each, and make a histogram of these 1000 means.
We also take 1000 SRSs of 25 incomes, calculate the sample mean for
each, and make a histogram of these 1000 means.
Which histogram
corresponds to the
samples of size
100? 25?
How large a sample size?
It depends on the population distribution. More observations are
required if the population distribution is far from normal.
A sample size of 25 is generally enough to obtain an approximately normal sampling
distribution despite strong skewness or even mild outliers.
A sample size of 40 will typically be good enough to overcome extreme
skewness and outliers.
In many cases, n = 25 isn't a huge sample. Thus,
even for strange population distributions we can
assume a normal sampling distribution of the mean
and work with it to solve problems.