Module 6 Sampling Theory
LEARNING OUTCOMES
After successfully completing this module, the student should be able to:
1. Construct the sampling distribution of the sample means;
2. Understand the Central Limit Theorem;
3. Know the types, advantages, and steps of probability and non-probability
sampling;
4. Differentiate scientific and non-scientific sampling designs; and
5. Determine the sample size using Slovin’s Formula and G*Power
analysis.
Pre-test
Content
LESSON 1 – SAMPLING DISTRIBUTION OF MEANS
In this lesson, we shall learn how to construct the sampling distribution of sample
means and understand the Central Limit Theorem. This will eventually help us
understand the process of making statistical inferences about a population using a
sample drawn from it.
Definition: The Sampling Distribution of the Mean is the probability distribution of
the means of all possible samples of a given size drawn from a population. Its mean is
the mean of the population from where the items are sampled. If the population
distribution is normal, then the sampling distribution of the mean is also normal for
samples of all sizes.
Suppose we have a population of size N with a mean μ, and we draw all possible
samples of size n from the population. Naturally, we expect to get different values of
the mean for each sample. The sample means may be less than, greater than, or equal
to the population mean μ. The sample means obtained will form a frequency distribution,
and the corresponding probability distribution can be constructed. This distribution is
called the sampling distribution of the sample means.
Example 1. A population consists of five values (Php 2, Php 3, Php 4, Php 5, Php 6).
A sample of size 2 is to be taken from this population.
a. How many samples are possible? List them and compute the mean of each
sample
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.
Solution
Table showing the list of all possible samples with their corresponding means.
Observe that the means of the samples vary from sample to sample. The population
mean is μ = 4, while the sample means may be less than, greater than, or equal to 4.
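The listing asked for in Example 1 can be reproduced with a short script. This is a minimal sketch, assuming the samples are drawn without replacement and that order does not matter; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [2, 3, 4, 5, 6]  # values in Php, from Example 1
sample_size = 2

# All possible samples of size 2, drawn without replacement (order ignored).
samples = list(combinations(population, sample_size))
sample_means = [Fraction(sum(s), sample_size) for s in samples]

print("Number of possible samples:", len(samples))
for s, m in zip(samples, sample_means):
    print(s, float(m))

# Sampling distribution: each distinct mean with its frequency and probability.
frequency = Counter(sample_means)
distribution = {m: Fraction(f, len(samples)) for m, f in sorted(frequency.items())}
for m, p in distribution.items():
    print(f"mean = {float(m):.1f}  frequency = {frequency[m]}  probability = {p}")

# The mean of all the sample means equals the population mean.
population_mean = Fraction(sum(population), len(population))
mean_of_sample_means = sum(sample_means) / len(sample_means)
print(float(population_mean), float(mean_of_sample_means))
```

Exact fractions are used so that the probabilities add up to exactly 1 without rounding error.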
Example 2. The following table gives the monthly salaries (in thousands of pesos) of
six officers in a government office. Suppose that random samples of size 4 are taken
from this population of six officers.
OFFICER SALARY
A 8
B 12
C 16
D 20
E 24
F 28
a. How many samples are possible? List them and compute the mean of each sample.
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.
Solution
STEP 2
Construct the frequency distribution of the sample means.
STEP 3
Construct the probability distribution of the sample means.
SAMPLE MEAN FREQUENCY PROBABILITY
14 1 1/15
15 1 1/15
16 2 2/15
17 2 2/15
18 3 1/5
19 2 2/15
20 2 2/15
21 1 1/15
22 1 1/15
TOTAL 15 1
STEP 4
Construct a histogram of the sampling distribution of the sample means.
[Histogram: sample means (SALARIES, 14 to 22) on the horizontal axis, frequency (0 to 3) on the vertical axis]
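Steps 2 to 4 can be checked with the sketch below. It assumes sampling without replacement, and the helper name `sampling_distribution` is made up for this illustration. The row of `#` characters is only a rough text stand-in for the histogram.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

def sampling_distribution(population, n):
    """Return {sample mean: frequency} over all samples of size n (no replacement)."""
    means = [Fraction(sum(s), n) for s in combinations(population, n)]
    return dict(sorted(Counter(means).items()))

# Salaries in thousands of pesos, from Example 2.
salaries = {"A": 8, "B": 12, "C": 16, "D": 20, "E": 24, "F": 28}
freq = sampling_distribution(list(salaries.values()), 4)
total = sum(freq.values())

print("mean  freq  prob   histogram")
for sample_mean, f in freq.items():
    prob = Fraction(f, total)
    print(f"{float(sample_mean):>4.0f}  {f:>4}  {str(prob):>5}  {'#' * f}")
print("total samples:", total)
```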
$\sigma_M^2 = \sigma^2 / n$
That is, the variance of the sampling distribution of the mean is the population
variance divided by n, the sample size (the number of scores used to compute a
mean). Thus, the larger the sample size, the smaller the variance of the
sampling distribution of the mean.
Given a population with a mean of μ and a standard deviation of σ, the sampling
distribution of the mean has a mean of μ and a standard deviation of
$\sigma_M = \sigma / \sqrt{n}$,
where n is the sample size. The standard deviation of the sampling distribution of
the mean is called the standard error of the mean. It is designated by the symbol
$\sigma_M$. Note that the spread of the sampling distribution of the mean decreases as
the sample size increases.
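The relationship between the population variance and the variance of the sample means can be verified exactly on a small case. The sketch below reuses the five values from Example 1 and assumes the draws are independent, that is, sampling with replacement. When samples are drawn without replacement from a small population, as in Examples 1 and 2, the variance of the sample means is smaller than σ²/n. The function name `standard_error` is an illustrative choice.

```python
from itertools import product
from math import sqrt
from statistics import pvariance

def standard_error(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

population = [2, 3, 4, 5, 6]  # same values as Example 1
n = 2

# Every ordered sample of size n drawn WITH replacement (independent draws).
sample_means = [sum(s) / n for s in product(population, repeat=n)]

pop_var = pvariance(population)         # population variance, sigma^2
var_of_means = pvariance(sample_means)  # variance of the sampling distribution

print("population variance:", pop_var)
print("variance of sample means:", var_of_means, "vs sigma^2 / n:", pop_var / n)
print("standard error:", standard_error(sqrt(pop_var), n), "vs", sqrt(var_of_means))
```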
What if the POPULATION is not NORMALLY DISTRIBUTED?
What Is the Central Limit Theorem (CLT)?
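The Central Limit Theorem says that, for a population with finite variance, the distribution of the sample means approaches a normal distribution as the sample size grows, even when the population itself is not normal. The simulation below is one way to see this. The exponential population, the seed, and the number of repetitions are all assumptions chosen for illustration. Skewness near zero is used here as a rough indicator of a symmetric, bell-like shape.

```python
import random
from statistics import mean, pstdev

def skewness(values):
    """Population skewness: the average cubed z-score."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

def simulate_sample_means(n, repetitions, rng):
    """Means of `repetitions` samples of size n from a skewed (exponential) population."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(repetitions)]

rng = random.Random(42)  # fixed seed so the run is repeatable
results = {}
for n in (2, 30):
    means = simulate_sample_means(n, 5000, rng)
    results[n] = (mean(means), skewness(means))
    print(f"n = {n:>2}: mean of sample means = {results[n][0]:.3f}, "
          f"skewness = {results[n][1]:.3f}")
```

The population mean here is 1, so both runs should center near 1, while the skewness of the sample means should be much closer to zero at n = 30 than at n = 2.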
A. PROBABILITY SAMPLING
Types:
Use Case:
1. When you want to reduce sampling bias: Use this sampling method when bias has
to be kept to a minimum. How researchers select their sample largely determines
the quality of their findings. Probability sampling leads to higher-quality findings
because it provides an unbiased representation of the population.
2. When the population is diverse: Researchers use this method extensively
because it helps them create samples that fully represent the population. Say
we want to find out how many people prefer medical tourism over getting treated
in their own country. This sampling method helps pick samples from various
socio-economic strata, backgrounds, etc. to represent the broader population.
3. To create an accurate sample: Probability sampling helps researchers create
accurate samples of their population. Researchers use proven statistical methods
to draw a precise sample size and obtain well-defined data.
Advantages:
Examples:
You could put their names in a hat. If you sample with replacement, you would
choose one person’s name, put that person’s name back in the hat, and then
choose another name. The possibilities for your two-name sample are:
• John, John
• John, Jack
• John, Qui
• Jack, Qui
• Jack, Tina
• …and so on.
When you sample with replacement, your two items are independent. In other
words, one does not affect the outcome of the other. You have a 1 out of 7 (1/7)
chance of choosing the first name and a 1/7 chance of choosing the second name.
• P(John, John) = (1/7) * (1/7) = 1/49 ≈ .02.
• P(John, Jack) = (1/7) * (1/7) = 1/49 ≈ .02.
• P(John, Qui) = (1/7) * (1/7) = 1/49 ≈ .02.
• P(Jack, Qui) = (1/7) * (1/7) = 1/49 ≈ .02.
• P(Jack, Tina) = (1/7) * (1/7) = 1/49 ≈ .02.
Note that P(John, John) just means “the probability of choosing John’s name, and
then John’s name again.” You can figure out these probabilities using the
multiplication rule.
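The multiplication rule for two draws with replacement can be sketched as follows. Only John, Jack, Qui, and Tina are named in the text. The remaining three names in the list are hypothetical placeholders added so that the hat holds seven names.

```python
from fractions import Fraction
from itertools import product

# The last three names are hypothetical placeholders, not from the text.
names = ["John", "Jack", "Qui", "Tina", "Person5", "Person6", "Person7"]

# With replacement, every name has the same 1/7 chance on every draw.
prob = {name: Fraction(1, len(names)) for name in names}

def p_with_replacement(first, second):
    """P(first, then second): independent draws, so the probabilities multiply."""
    return prob[first] * prob[second]

pairs = list(product(names, repeat=2))  # all ordered two-name samples
p = p_with_replacement("John", "John")
print("ordered two-name samples:", len(pairs))
print("P(John, John) =", p, "which is about", round(float(p), 2))
```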
As you can probably figure out, I’ve only used a few items here, so the odds only
change a little. But larger samples taken from small populations can have more
dramatic results. You can tell how dramatic these results are by calculating the
covariance. That’s a measure of how much two items’ probabilities are linked
together; the higher the covariance, the more dramatic the results. A covariance
of zero would mean there’s no difference between sampling with replacement and
sampling without.
B. NON-PROBABILITY SAMPLING
Types:
1. Convenience sampling: A non-probability sampling technique where samples
are selected from the population only because they are conveniently available to
the researcher. Researchers choose these samples just because they are easy to
recruit, and the researcher did not consider selecting a sample that represents the
entire population.
2. Consecutive sampling: This non-probability sampling method is very similar to
convenience sampling, with a slight variation. Here, the researcher picks a single
person or a group as the sample, conducts research over a period, analyzes the
results, and then moves on to another subject or group if needed. The consecutive
sampling technique gives the researcher a chance to work with many topics and
fine-tune his/her research by collecting results that have vital insights.
3. Quota sampling: Suppose a researcher wants to study the career
goals of male and female employees in an organization. There are 500 employees
in the organization, and they make up the population. To better understand the
population, the researcher needs only a sample, not the entire population.
Further, the researcher is interested in particular strata within the population. This
is where quota sampling helps, by dividing the population into strata or groups.
4. Judgmental or Purposive sampling: Researchers select the samples based
purely on the researcher’s knowledge and credibility. In other words, researchers
choose only those people who they deem fit to participate in the research study.
Judgmental or purposive sampling is not a scientific method of sampling, and the
Use Case:
1. Use this type of sampling to indicate if a particular trait or characteristic exists in a
population.
2. Researchers widely use the non-probability sampling method when they aim at
conducting qualitative research, pilot studies, or exploratory research.
3. Researchers use it when they have limited time to conduct research or have
budget constraints.
4. Researchers apply this method when they need to observe whether a particular
issue needs in-depth analysis.
5. Use it when you do not intend to generate results that will generalize to the entire
population.
Advantages:
1. Non-probability sampling techniques are a more conducive and practical method
for researchers deploying surveys in the real world. Although statisticians prefer
probability sampling because it yields data in the form of numbers, non-probability
sampling, if done correctly, can produce results of similar if not the same quality.
2. Getting responses using non-probability sampling is faster and more cost-effective
than probability sampling because the sample is known to the researcher. The
respondents respond more quickly than randomly selected people because they
have a high motivation level to participate.
Examples:
1. An example of convenience sampling would be using student volunteers known to
the researcher. Researchers can send the survey to students belonging to a
particular school, college, or university, who then act as the sample.
2. In an organization, for studying the career goals of 500 employees, technically, the
sample selected should have proportionate numbers of males and females, which
means there should be 250 males and 250 females. Since this is unlikely, the
researcher selects the groups or strata using quota sampling.
3. Researchers also use this type of sampling to conduct research involving a
particular illness in patients or a rare disease. Researchers can ask subjects
to refer other subjects suffering from the same ailment to form a
subjective sample to carry out the study.
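The quota sampling idea in the employee example can be sketched in code. Everything concrete below is hypothetical: the 300/200 split of the 500 employees, the quota of 25 per group, and the function name `quota_sample`. Respondents are taken in whatever order they happen to be available until each group's quota is filled, which is why this is not a probability sample.

```python
import random

def quota_sample(people, quotas, key):
    """Take people in the order encountered until each group's quota is met."""
    remaining = dict(quotas)
    chosen = []
    for person in people:
        group = person[key]
        if remaining.get(group, 0) > 0:
            chosen.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # every quota is filled
    return chosen

# Hypothetical organization of 500 employees (300 female, 200 male).
employees = [{"id": i, "sex": "F" if i < 300 else "M"} for i in range(500)]
rng = random.Random(0)
rng.shuffle(employees)  # stand-in for the order in which employees are available

sample = quota_sample(employees, {"F": 25, "M": 25}, key="sex")
counts = {g: sum(1 for p in sample if p["sex"] == g) for g in ("F", "M")}
print(counts)
```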
A. SCIENTIFIC SAMPLING
B. NON-SCIENTIFIC SAMPLING
Here, not all of the individuals in a population are given an equal chance of being
included in the sample; hence, subjectivity occurs.
$n_0 = \frac{Z^2 p q}{e^2}$
NOTE:
- n0 is the sample size;
- Z is the abscissa of the normal curve that cuts off an area α at the tails
(1 - α equals the desired confidence level, e.g., 95%);
- e is the desired level of precision;
- p is the estimated proportion of an attribute that is present in the population,
and q is 1 - p.
- The value for Z is found in statistical tables which contain the area under the
normal curve, e.g., Z = 1.96 for a 95% level of confidence.
Example:
Suppose we wish to evaluate a statewide Extension program in which farmers
were encouraged to adopt a new practice. Assume there is a large population but
that we do not know the variability in the proportion that will adopt the practice;
therefore, assume p=.5 (maximum variability). Furthermore, suppose we desire a
95% confidence level and ±5% precision.
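The example can be worked through in code. This sketch assumes that the formula the notes describe is n0 = Z² p q / e², which matches the symbols defined in the NOTE. The function name is illustrative, and Z is computed from the normal distribution instead of being read from a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(confidence, precision, p=0.5):
    """n0 = Z^2 * p * q / e^2 for estimating a proportion in a large population."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    q = 1 - p
    return z ** 2 * p * q / precision ** 2

n0 = sample_size_proportion(confidence=0.95, precision=0.05, p=0.5)
print("n0 =", round(n0, 2), "so round up to", ceil(n0), "farmers")
```

The result is rounded up because a fractional respondent is not possible and rounding down would fall short of the desired precision.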
Example:
Find the sample size needed from a population of 1000 people for a survey
on their soda preference. Use a confidence level of 95%, which gives an alpha level of
0.05.
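Assuming this example is meant for Slovin's Formula named in the learning outcomes, n = N / (1 + N e²), with the 0.05 alpha level used as the margin of error e, the computation can be sketched as follows. The function name is an illustrative choice.

```python
from math import ceil

def slovin(population_size, margin_of_error):
    """Slovin's formula: n = N / (1 + N * e^2)."""
    return population_size / (1 + population_size * margin_of_error ** 2)

n = slovin(population_size=1000, margin_of_error=0.05)
print("n =", round(n, 2), "so survey", ceil(n), "people")
```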
G*POWER
G*POWER is a free program that can make the calculations a lot easier:
http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible
statistical power analysis program for the social, behavioral, and biomedical
sciences. Behavior Research Methods, 39, 175-191.
G*Power computes:
• power values for given sample sizes, effect sizes, and alpha levels (post hoc
power analyses),
• sample sizes for given effect sizes, alpha levels, and power values (a priori
power analyses)
• suitable for most fundamental statistical methods
Note: some tests assume equal variances across groups and assume the use of the
population SD (which is likely to be estimated from the sample).
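To give a feel for what an a priori and a post hoc power analysis compute, here is a rough sketch that uses the normal approximation for a two-sided comparison of two independent group means with equal group sizes. It is not G*Power's own procedure, which works with exact noncentral distributions, so its answers may differ slightly. The effect size of 0.5 and the function names are assumptions for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """A priori: approximate sample size per group (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return 2 * ((z_alpha + z_power) / effect_size) ** 2

def approx_power(effect_size, n, alpha=0.05):
    """Post hoc: approximate power for n per group (ignores the far tail)."""
    z = NormalDist()
    return z.cdf(effect_size * sqrt(n / 2) - z.inv_cdf(1 - alpha / 2))

needed = ceil(n_per_group(effect_size=0.5))
print("approximate n per group:", needed)
print("approximate power at that n:", round(approx_power(0.5, needed), 3))
```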
Learning Activities
A. The following are the heights of five students in centimeters. Suppose samples of size
3 are taken from this population of five students.
Student Height (cm)
Bert 120
Tony 130
Danny 110
Henry 125
Peter 115
a. How many samples are possible? List them and compute the mean of each sample.
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means. Describe the
shape of the histogram.
B. TRUE OR FALSE
1. A normally distributed population has a mean of 14 and a standard deviation
of 3, and the sample size of your sampling distribution is n = 10.
Therefore, the mean of the sampling distribution of the mean is also 14.
2. The population has a mean of 30 and a standard deviation of 6. The sample size
of your sampling distribution is n = 9. Therefore, the variance of the sampling
distribution is 4.
3. The population has a mean of 120 and a standard deviation of 12. The sample
size of your sampling distribution is n = 16. Therefore, the standard error of the
mean is ¾.
4. The sampling distribution of the mean, with n = 30, of a moderately negatively skewed
distribution is approximately normal.
5. The entire student body of 225 students took a test. These test scores have a
mean of 75, a standard deviation of 10, and are slightly positively skewed. If you
randomly chose 35 of these test scores and calculated the mean over and over
again, the sample mean is also 75 and the SD is 2, from [10/sqrt(25)].
Assessments
Fill in the blanks
Directions: Read the following statements and write the word/s which best complete/s the
sentences on your paper/notebook. Then check your own work. (BE HONEST even if
NOBODY SEES YOU)
16. The ___________ obtained will form a frequency distribution and the
corresponding probability distribution can be constructed.
17. The ____________ is the mean of the population from where the items are
sampled.
18. If the population is normally distributed, the population mean μ is _____ to
the sample mean.
19. The central limit theorem (CLT) states that the distribution of sample __________
a normal distribution (also known as a “bell curve”) as the sample size becomes
larger.
20. ____________ is concerned with the selection of a subset of individuals from within a
statistical population to estimate characteristics of the whole population.