STAT 22000 Lecture Slides

Variability in Estimates &

Central Limit Theorem

Yibi Huang
Department of Statistics
University of Chicago
Outline

This set of slides covers Sections 4.1 and 4.4 of the text, which
include

• Central Limit Theorem (CLT)

• Sampling distribution

Example — Rating of a Movie

Suppose a certain movie has a polarized distribution of ratings: on a
1-to-10 scale, of those who watched the movie, 1/3 gave it 9
points, 1/3 gave it 2 points, and the remaining 1/3 gave it 1 point.

So the population distribution is

X       1     2     9
P(X)   1/3   1/3   1/3

[Figure: Population Distribution (horizontal axis: ratings 1 to 10)]

Histogram of the Sample

In practice, since the population is difficult (or impossible) to
examine completely, we take a sample to learn about the
population. Will the makeup of the sample mimic the makeup of
the population?

First, the sampling method must be appropriate. A biased sample
won’t give us correct information about the population.

Suppose we take a simple random sample of size n (say
n = 400) from the population. What will the histogram of the
ratings of the movie given by subjects in the sample look like?

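# Draw 400 ratings from the population (each value has probability 1/3)
popratings = c(1,2,9)
s400 = sample(popratings, size = 400, replace=TRUE, prob=c(1/3,1/3,1/3))
# One bar per integer rating from 1 to 10
hist(s400, breaks=0:10+.5, xlab="Ratings", main="Sample Size = 400")
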
[Figure: Histograms of four samples (horizontal axis: Ratings, 1 to 10; vertical axis: Frequency).
Panels: sample size = 10, sample mean = 3.6; sample size = 25, sample mean = 4.72;
sample size = 100, sample mean = 4.01; sample size = 400, sample mean = 3.86.]

The histogram of the sample looks somewhat like the histogram of
the population. The larger the sample size, the closer the
resemblance.

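The resemblance can be quantified with a small sketch. It draws samples of the four sizes shown in the figure and reports the largest gap between a sample proportion and the corresponding population probability. The helper name `max_gap` and the seed are illustrative assumptions, not part of the original slides.

```r
# Sketch: compare sample proportions with the population probabilities.
set.seed(1)                      # arbitrary seed, only for reproducibility
popratings <- c(1, 2, 9)
popprob    <- c(1/3, 1/3, 1/3)

# Largest gap between a sample proportion and its population probability
# (max_gap is a hypothetical helper introduced for this illustration)
max_gap <- function(n) {
  s <- sample(popratings, size = n, replace = TRUE, prob = popprob)
  props <- table(factor(s, levels = popratings)) / n
  max(abs(props - popprob))
}

sizes <- c(10, 25, 100, 400)
gaps  <- sapply(sizes, max_gap)
names(gaps) <- sizes
print(round(gaps, 3))
```

The gap tends to shrink as n grows, although any single run can deviate from that trend because each sample is random.
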
Estimation of the Population Mean

In practice, the population distribution is usually unknown. We are
often interested in population parameters, like the population
mean.

• As all we know about the population is the sample, we can
only use the sample to estimate the population parameter of
interest. A quantity computed from the sample is called a statistic.
• A commonly used estimate of the population mean is the
sample mean. Thus the sample mean is one such statistic.
• Sample statistics vary from sample to sample.
• How close is the sample mean to the population mean?

Variability of the Sample Means

To understand the variability of the sample mean of a sample of size
n = 25, we pretend that we know the population

X       1     2     9
P(X)   1/3   1/3   1/3

and then do the following simulation.

1. We take a random sample of size n = 25 from the population,
compute and record the sample mean, and then put the
sample back.
2. We repeat the previous step 10000 times, and thereby obtain
10000 sample means.

What will the histogram of the 10000 sample means look like?

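# Simulate 10000 sample means, each from a sample of size 25
samplemean25 = vector("numeric", 10000)
for(i in 1:10000){
  samplemean25[i] = mean(sample(popratings, size = 25, replace=TRUE,
                                prob=c(1/3,1/3,1/3)))
}
# Breaks span every possible mean (1 to 9), so hist() cannot fail on a
# rare extreme sample; xlim zooms in on the central region
hist(samplemean25, breaks=seq(0.98,9.02,by=0.04), xlim=c(1.5,7),
     xlab="sample mean",
     main="Histogram of the Means of 10000 Samples of Size 25")
abline(v=4, col=2)  # population mean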

[Figure: Histogram of the Means of 10000 Samples of Size 25 (horizontal axis: sample mean, 2 to 7; vertical axis: Frequency).
The red vertical line marks the position of the population mean = 4.]

When we take a sample of size 25, the distribution of the sample
means is not very normal, with a number of hills and valleys.

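The simulation loop can also be written as a reusable helper, which makes it easy to repeat for other sample sizes. This is only a sketch: the function name `sim_means` and the seed are assumptions, and `sample()` is called without `prob`, which gives each of the three ratings equal probability, the same as the 1/3 weights used in the slides.

```r
# Sketch: the same simulation as a reusable helper.
# `sim_means` is a hypothetical name, not part of the original slides.
popratings <- c(1, 2, 9)

sim_means <- function(n, reps = 10000) {
  # replicate() evaluates the expression `reps` times and collects the results
  replicate(reps, mean(sample(popratings, size = n, replace = TRUE)))
}

set.seed(22000)                  # arbitrary seed, only for reproducibility
means25 <- sim_means(25)
length(means25)                  # one sample mean per repetition
range(means25)                   # every mean lies between 1 and 9
```

Calling `sim_means(100)` or `sim_means(400)` would give the analogues of the other two simulations.
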
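# Simulate 10000 sample means, each from a sample of size 100
samplemean100 = vector("numeric", 10000)
for(i in 1:10000){
  samplemean100[i] = mean(sample(popratings, size = 100, replace=TRUE,
                                 prob=c(1/3,1/3,1/3)))
}
# Breaks span every possible mean (1 to 9), so hist() cannot fail on a
# rare extreme sample; xlim zooms in on the central region
hist(samplemean100, breaks=seq(0.99,9.01,by=0.02), xlim=c(2.5,5.5),
     xlab="sample mean",
     main="Histogram of the Means of 10000 Samples of Size 100")
abline(v=4, col=2)  # population mean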

[Figure: Histogram of the Means of 10000 Samples of Size 100 (horizontal axis: sample mean, 2.5 to 5.5; vertical axis: Frequency).
The red vertical line marks the position of the population mean = 4.]

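# Simulate 10000 sample means, each from a sample of size 400
samplemean400 = vector("numeric", 10000)
for(i in 1:10000){
  samplemean400[i] = mean(sample(popratings, size = 400, replace=TRUE,
                                 prob=c(1/3,1/3,1/3)))
}
# Breaks span every possible mean (1 to 9), so hist() cannot fail on a
# rare extreme sample; xlim zooms in on the central region
hist(samplemean400, breaks=seq(1,9,by=0.01), xlim=c(3.3,4.7),
     xlab="sample mean",
     main="Histogram of the Means of 10000 Samples of Size 400")
abline(v=4, col=2) # population mean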

[Figure: Histogram of the Means of 10000 Samples of Size 400 (horizontal axis: sample mean, 3.4 to 4.6; vertical axis: Frequency).
The red vertical line marks the position of the population mean = 4.]

When the sample size increases to 400, the distribution of the
sample means looks very normal.

Sampling Distribution

• The probability distribution of a statistic is called the sampling
distribution of the statistic.
• What we just constructed by simulation approximates the sampling
distribution of the sample mean.

Observations for the Simulations Above

• The sampling distribution of the sample mean may not be
normal when the sample size is small, but it gets more normal
as the sample size gets larger.
• The sample mean may not be equal to the population mean,
but its distribution centers at the population mean.
• With a larger sample, the variability of the sample mean around the
population mean gets smaller.
• What are the SDs of the sample means?

> mean(samplemean25)
[1] 3.99808
> sd(samplemean25)
[1] 0.7073244
> mean(samplemean100)
[1] 4.001438
> sd(samplemean100)
[1] 0.3577802
> mean(samplemean400)
[1] 3.99929
> sd(samplemean400)
[1] 0.1770972

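One way to read these numbers: each time the sample size is quadrupled, the SD of the sample means is roughly halved. The sketch below checks this using only the three SDs reported above; the variable names are illustrative.

```r
# SDs of the simulated sample means, copied from the console output above
sds <- c(n25 = 0.7073244, n100 = 0.3577802, n400 = 0.1770972)

# Ratio of each SD to the next one (sample size grows by a factor of 4)
ratios <- sds[-length(sds)] / sds[-1]
round(unname(ratios), 2)   # 1.98 2.02
```

Both ratios are close to 2, which is what one would expect if the SD shrinks in proportion to 1/sqrt(n).
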
Expected Value and SD of the Sample Mean

For i.i.d. random variables $X_1, X_2, \ldots, X_n$ from a population with
mean $\mu$ and SD $\sigma$, the expected value and SD of the sample mean
$\bar{X}_n = (X_1 + X_2 + \cdots + X_n)/n$ are respectively

$$E(\bar{X}_n) = \mu, \qquad \mathrm{SD}(\bar{X}_n) = \sigma/\sqrt{n}.$$

• Here, “i.i.d.” = “independent and identically distributed”,
which means $X_1, \ldots, X_n$ are independent and have identical
probability distributions.
• Observations in a simple random sample are nearly i.i.d. if the
sample size is less than 10% of the population size.
• The SD of the sample mean is specifically called the standard error.

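For readers who want to see where the two formulas come from, here is a short derivation. It uses only linearity of expectation and the fact that the variance of a sum of independent random variables is the sum of their variances; the slides themselves state the result without proof.

```latex
\begin{align*}
E(\bar{X}_n) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\cdot n\mu = \mu,\\
\mathrm{Var}(\bar{X}_n) &= \frac{1}{n^2}\sum_{i=1}^{n} \mathrm{Var}(X_i)
  = \frac{1}{n^2}\cdot n\sigma^2 = \frac{\sigma^2}{n},\\
\mathrm{SD}(\bar{X}_n) &= \sqrt{\mathrm{Var}(\bar{X}_n)} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Independence is needed only for the variance step; the expected value formula holds without it.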
For the movie rating example, recall the population distribution is

X 1 2 9
P (X ) 1/3 1/3 1/3

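The mean, variance and SD of the population distribution are
respectively

$$\mu = 1\cdot\tfrac{1}{3} + 2\cdot\tfrac{1}{3} + 9\cdot\tfrac{1}{3} = 4,$$

$$\sigma^2 = (1-4)^2\cdot\tfrac{1}{3} + (2-4)^2\cdot\tfrac{1}{3} + (9-4)^2\cdot\tfrac{1}{3} = \tfrac{38}{3}, \qquad \sigma = \sqrt{\tfrac{38}{3}} \approx 3.56.$$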

sample size n   expected value of $\bar{X}_n$   SD of $\bar{X}_n$
25              4                               3.56/√25 ≈ 0.712
100             4                               3.56/√100 ≈ 0.356
400             4                               3.56/√400 ≈ 0.178

> sd(samplemean25)
[1] 0.7073244
> sd(samplemean100)
[1] 0.3577802
> sd(samplemean400)
[1] 0.1770972

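The theoretical values in the table can be reproduced directly from the population distribution. The following is a minimal sketch; the variable names `x`, `p`, `mu`, `sigma` and `se` are chosen here for illustration.

```r
# Population distribution of the movie ratings
x <- c(1, 2, 9)
p <- c(1/3, 1/3, 1/3)

mu    <- sum(x * p)                    # population mean
sigma <- sqrt(sum((x - mu)^2 * p))     # population SD, sqrt(38/3)

# Standard error of the sample mean for each sample size
n  <- c(25, 100, 400)
se <- sigma / sqrt(n)
print(data.frame(n = n, expected = mu, se = round(se, 3)))
```

The `se` column should match the 0.712, 0.356 and 0.178 in the table, and can be compared with the simulated SDs printed above.
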
Central Limit Theorem (CLT)

Let $X_1, X_2, \ldots$ be a sequence of i.i.d. random variables (discrete or
continuous) with mean $\mu$ and variance $\sigma^2$. Then, when n is large,

• the distribution of the sample mean
$$\bar{X}_n = \frac{1}{n}(X_1 + X_2 + \cdots + X_n)$$
is approximately
$$N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right).$$
• the distribution of the sum $S_n = X_1 + X_2 + \cdots + X_n$ is
approximately
$$N(n\mu, \sqrt{n}\,\sigma).$$

Here $N(m, s)$ denotes the normal distribution with mean m and SD s.

Example

The $X_i$’s are i.i.d., with the distribution

X_i       1     2     9
P(X_i)   1/3   1/3   1/3

Recall that $\mu = 4$, $\sigma \approx 3.56$. So the sampling distribution of $\bar{X}_{100}$
is approximately
$$N(\mu, \sigma/\sqrt{100}) = N(4, 0.356).$$
So
$$P(\bar{X}_{100} > 4.5) \approx P\!\left(Z > \frac{4.5 - 4}{0.356}\right) \approx P(Z > 1.40) \approx 0.08.$$

In the simulation, 804 of the 10000 simulated $\bar{X}_{100}$ values exceed 4.5,
which agrees with the CLT approximation that $\bar{X}_{100}$ exceeds 4.5
about 8% of the time.

> sum(samplemean100 > 4.5)
[1] 804

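The normal-table lookup can be replaced by a call to `pnorm()`. This sketch recomputes the CLT approximation and sets it next to the simulated proportion reported above; the variable names are illustrative.

```r
mu    <- 4
sigma <- sqrt(38 / 3)            # population SD, about 3.56
se    <- sigma / sqrt(100)       # standard error for n = 100

# Upper-tail probability under the normal approximation N(mu, se)
p_clt <- pnorm(4.5, mean = mu, sd = se, lower.tail = FALSE)
round(p_clt, 3)

# Simulated proportion reported in the slides, for comparison
p_sim <- 804 / 10000
round(p_sim, 3)
```

Using `lower.tail = FALSE` returns P(Z > z) directly, which avoids computing `1 - pnorm(...)` by hand.
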
Sample Size Required to Use CLT?

• Provided the sample size is large enough, the sampling
distribution of the sample mean will be approximately
normal, even when the population distribution is not normal.
• If the population distribution is normal, then so is the
sampling distribution of the sample mean, regardless of the
sample size.
• If the population distribution is symmetric, then n should be at
least 30 or so.
• If the population distribution is skewed or has outliers, then
the sample size n should be moderately large (at least 100 or so), or
even larger depending on how skewed or irregular the
population distribution is.

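One possible way to see why skewed populations need larger samples is to track the skewness coefficient (third central moment divided by the SD cubed) of the sample mean. For i.i.d. observations this coefficient equals the population skewness divided by sqrt(n), so it fades only slowly. The sketch below applies this to the movie-rating population; the choice of skewness as the measure is an assumption made here, not something the slides use.

```r
# Movie-rating population from the earlier slides
x <- c(1, 2, 9)
p <- rep(1/3, 3)

mu    <- sum(x * p)
sigma <- sqrt(sum((x - mu)^2 * p))

# Skewness coefficient of the population (0 for a symmetric distribution)
skew_pop <- sum((x - mu)^3 * p) / sigma^3

# Skewness of the sample mean of n i.i.d. observations: skew_pop / sqrt(n)
n <- c(1, 25, 100, 400)
skew_mean <- skew_pop / sqrt(n)
print(data.frame(n = n, skewness = round(skew_mean, 3)))
```

The skewness of the sample mean is still noticeable at n = 25 and much smaller at n = 400, in line with the simulated histograms.
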
Exercise 4.35 – Housing Prices (p.214)

A housing survey was conducted to determine the price of a typical
home in Topanga, CA. The mean price of a house was roughly $1.3
million with an SD of $0.3 million. There were no houses listed
below $0.3 million but a few houses above $3 million.

Can we find an approximate probability that a randomly chosen
house in Topanga costs more than $1.4 million using the normal
distribution?

No, because the population does not follow a normal distribution (it is
right skewed), and a sample of size 1 is too small to use the CLT.

Exercise 4.35 – Housing Prices (p.214)

Can we find an approximate probability that the mean of 60 randomly
chosen houses in Topanga is more than $1.4 million using
the normal distribution? If yes, compute the approximate probability.

Yes, if the population distribution is not too skewed, the sampling
distribution of the sample mean of a sample of size 60 will be
approximately normal by the CLT:

$$\bar{X}_{60} \sim N\!\left(\mu = 1.3,\ \mathrm{SE} = \frac{\sigma}{\sqrt{60}} = \frac{0.3}{\sqrt{60}}\right) = N(1.3, 0.0387).$$

So,

$$P(\bar{X}_{60} > 1.4) \approx P\!\left(Z > \frac{1.4 - 1.3}{0.0387}\right) \approx P(Z > 2.58) \approx 0.0049.$$

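The same calculation can be done in R. This is a minimal sketch with prices in millions of dollars; the variable names are illustrative.

```r
mu    <- 1.3                     # mean price, in millions of dollars
sigma <- 0.3                     # SD of prices, in millions of dollars
n     <- 60

se <- sigma / sqrt(n)            # standard error of the sample mean
z  <- (1.4 - mu) / se            # z-score of a sample mean of 1.4

# Upper-tail probability under the normal approximation
p_upper <- pnorm(1.4, mean = mu, sd = se, lower.tail = FALSE)
round(c(se = se, z = z, p = p_upper), 4)
```

The result is only as good as the normal approximation, so the caveat about the population not being too skewed still applies.
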
What Does the CLT Say?

True or False, and explain: The central limit theorem says that as you
take larger and larger samples from a population, the histogram of
the sample values looks more and more normal.

False. As you take larger and larger samples, the histogram of the
sample values looks more and more like the histogram of the
population.

What is the thing that becomes more and more normal as the sample
size gets larger and larger?

It is the distribution of the sample mean that gets more and more
normal.
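The distinction can be checked with the movie-rating population. In the sketch below, a single large sample still contains only the three population values, whereas the means of many samples take many different values clustered around 4. The seed, the number of repetitions (2000) and the variable names are arbitrary choices for illustration.

```r
set.seed(4)                      # arbitrary seed, only for reproducibility
popratings <- c(1, 2, 9)

# One large sample: its values mirror the population, not a normal curve
big_sample <- sample(popratings, size = 10000, replace = TRUE)
sort(unique(big_sample))         # only values from the population appear

# Many sample means: these are what the CLT describes
means400 <- replicate(2000, mean(sample(popratings, size = 400, replace = TRUE)))
length(unique(means400))         # many distinct values
c(mean = mean(means400), sd = sd(means400))
```

A histogram of `big_sample` would show three spikes, while a histogram of `means400` would show the bell shape seen in the size-400 simulation.
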

