Module 6 Sampling Theory

1. The document discusses sampling theory and introduces key concepts like sampling distribution of means, central limit theorem, and sampling methods. 2. It provides learning outcomes related to constructing sampling distributions, understanding the central limit theorem, differentiating sampling designs, and determining sample size. 3. The content covers sampling distribution of means, examples of constructing sampling distributions and histograms, and the central limit theorem.

Uploaded by

ana mejico

Module 6

LEARNING OUTCOMES

After successfully completing this module, the student should be able to:
1. Construct sampling distribution of the sample means;
2. Understand the Central Limit Theorem;
3. Know types, advantages, and steps on probability and non-probability
sampling;
4. Differentiate scientific and non-scientific sampling designs; and
5. Determine the sample size using Slovin’s Formula and G-Power
Analysis.

Pre-test

Fill in the blanks


Directions: Read the following statements and write the word/s which best complete/s the
sentences on your paper/notebook. Then check your own work later. (BE HONEST even if
NOBODY SEES YOU.)
1. The ___________ obtained will form a frequency distribution and the
corresponding probability distribution can be constructed.
2. The ____________ is the mean of the population from where the items are
sampled.
3. If the population mean is normally distributed, the population mean μ is _____ to
the sample mean.
4. The central limit theorem (CLT) states that the distribution of sample __________
a normal distribution (also known as a “bell curve”) as the sample size becomes
larger.
5. ____________ is concerned with selection of a subset of individuals from within a
statistical population to estimate characteristics of the whole population.
6. ___________________ method assumes that each site has an equal chance of
being part of the sample selected.
7. ____________ is one of the most common forms of non-probability sampling.
8. ____________ is a sampling technique used when “natural” but relatively
homogeneous groupings are evident in statistical population.
9. ____________ is defined as a sampling technique in which the researcher
chooses samples from a larger population using a method based on the theory of
probability.
10. ____________________ involves a method where the researcher divides a more
extensive population into smaller groups that usually don’t overlap but represent
the entire population.
11. ___________________ is a sampling method in which not all members of the
population have an equal chance of participating in the study, unlike probability
sampling.
12. _______________________ is not a scientific method of sampling, and the
downside to this sampling technique is that the preconceived notions of a
researcher can influence the results.
13. ____________________ is the mathematical estimation of number of subjects/
units included in the study.
14. The more heterogeneous a population, the ______ the sample size required to
obtain a given level of precision.
15. Sample sizes equal or greater than ___ are considered sufficient for the CLT to
hold.

1 | Page   Module in Statistics and Evaluation in Education   MCBriñosa


Introduction
This module deals with different sampling theories, types, methods, and
classifications. In conducting research, the use of samples to acquire information is more
convenient and more feasible than using the entire population. The use of a sample is
generally less costly and more practical. However, we cannot expect the sample to yield
perfectly accurate information about the population. We should expect a certain amount
of error from using the sample. The purpose of this module is for you to be aware of different
sampling techniques, types, and methods that you can use in doing research. As you
go over the discussion and activity, you will appreciate the importance of sampling theory
in research. Enjoy learning this module.
Activity. Arrange the jumbled letters below.
1. ASLPME
2. STRDIIUTBOIN
3. RPBAOYTLIB
4. CTRNAEL
5. EISCNFCITI

Content
LESSON 1 – SAMPLING DISTRIBUTION OF MEANS
In this lesson, we shall learn how to construct the sampling distribution of sample
means and understand the Central Limit Theorem. This will eventually help us to
understand the process of making statistical inferences about the population, using a
sample drawn from it.
Definition: The mean of the Sampling Distribution of the Mean is the mean of the
population from where the items are sampled. If the population distribution is normal,
then the sampling distribution of the mean is also normal for samples of all sizes.
Suppose we have a population of size N with a mean μ, and we draw or select all
possible samples of size n from the population. Naturally, we expect to get different values
of the means for each sample. The sample means may be less than, greater than, or equal
to the population mean μ. The sample means obtained will form a frequency distribution
and the corresponding probability distribution can be constructed. This distribution is
called the sampling distribution of the sample means.

Example 1. A population consists of five values (Php 2, Php 3, Php 4, Php 5, Php 6).
A sample of size 2 is to be taken from this population.

a. How many samples are possible? List them and compute the mean of each
sample
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.

Solution

STEP 1
Since the size of the population is 5, we have N = 5. We shall draw a sample of size
2 from this population, so n = 2. Thus, the number of possible samples of size 2 that
can be drawn from this population is C(N,n) = C(5,2) = 10.

POSSIBLE SAMPLE OF SIZE 2     SAMPLE MEAN
2, 3                          2.5
2, 4                          3
2, 5                          3.5
2, 6                          4
3, 4                          3.5
3, 5                          4
3, 6                          4.5
4, 5                          4.5
4, 6                          5
5, 6                          5.5

Table showing the list of all possible samples with their corresponding means.
Observe that the means of the samples vary from sample to sample. The population
mean μ = 4, while the sample means may be less than, greater than, or equal to 4.

STEP 2
Construct the frequency distribution of the sample means.
STEP 3
Construct the probability distribution of the sample means.

SAMPLE MEAN     F (Step 2)     PROBABILITY (Step 3)
2.5             1              1/10
3.0             1              1/10
3.5             2              1/5
4.0             2              1/5
4.5             2              1/5
5.0             1              1/10
5.5             1              1/10
TOTAL           10             1

STEP 4
Construct a histogram of the sampling distribution of the sample means.

[Histogram: SAMPLING DISTRIBUTION OF THE SAMPLE MEAN. Horizontal axis: PHP (sample
means 2.5 to 5.5); vertical axis: PROBABILITY.]

Example 2. The following table gives the monthly salaries (in thousands of pesos) of
six officers in a government office. Suppose that random samples of size 4 are taken
from this population of six officers.

OFFICER     SALARY
A           8
B           12
C           16
D           20
E           24
F           28

a. How many samples are possible? List them and compute the mean of each sample.
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.

Solution

Since the size of the population is 6, we have N = 6. We shall draw a sample of size 4
from this population, so n = 4. Thus, the number of possible samples of size 4 that can
be drawn from this population is C(N,n) = C(6,4) = 15.

SAMPLE        SALARIES           MEAN
A,B,C,D       8, 12, 16, 20      14
A,B,C,E       8, 12, 16, 24      15
A,B,C,F       8, 12, 16, 28      16
A,B,D,E       8, 12, 20, 24      16
A,B,D,F       8, 12, 20, 28      17
A,B,E,F       8, 12, 24, 28      18
A,C,D,E       8, 16, 20, 24      17
A,C,D,F       8, 16, 20, 28      18
A,C,E,F       8, 16, 24, 28      19
A,D,E,F       8, 20, 24, 28      20
B,C,D,E       12, 16, 20, 24     18
B,C,D,F       12, 16, 20, 28     19
B,C,E,F       12, 16, 24, 28     20
B,D,E,F       12, 20, 24, 28     21
C,D,E,F       16, 20, 24, 28     22

Table showing the list of all possible samples with their corresponding means.
Observe that the means of the samples vary from sample to sample. The population
mean μ = 18, while the sample means may be less than, greater than, or equal to 18.

STEP 2
Construct the frequency distribution of the sample means.
STEP 3
Construct the probability distribution of the sample means.

SAMPLE MEAN     F      PROBABILITY
14              1      1/15
15              1      1/15
16              2      2/15
17              2      2/15
18              3      1/5
19              2      2/15
20              2      2/15
21              1      1/15
22              1      1/15
TOTAL           15     1

STEP 4
Construct a histogram of the sampling distribution of the sample means.

[Histogram: SAMPLING DISTRIBUTION OF THE SAMPLE MEAN. Horizontal axis: SALARIES
(sample means 14 to 22); vertical axis: PROBABILITY.]
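
The four steps used in Examples 1 and 2 can also be carried out by a short program. The sketch below is only an illustration, not part of the module's required procedure: the function name sampling_distribution is made up for this example, and it assumes samples are drawn without replacement, as in the two examples.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def sampling_distribution(population, n):
    """Return {sample mean: probability} over all samples of size n
    drawn without replacement (order ignored)."""
    samples = list(combinations(population, n))               # Step 1: list all samples
    counts = Counter(Fraction(sum(s), n) for s in samples)    # Step 2: frequencies
    total = len(samples)
    # Step 3: turn each frequency into a probability
    return {mean: Fraction(freq, total) for mean, freq in sorted(counts.items())}


example_1 = sampling_distribution([2, 3, 4, 5, 6], 2)
example_2 = sampling_distribution([8, 12, 16, 20, 24, 28], 4)

for mean, prob in example_2.items():
    print(float(mean), prob)
```

Exact fractions are used so that the probabilities match the tables (for instance 3/15 is reported as 1/5) instead of rounded decimals.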

MAIN PROPERTIES OF THE SAMPLING DISTRIBUTION OF THE MEAN:


If the population is normally distributed, then:
*The mean of the sampling distribution of the mean is the mean of the population
from which the scores were sampled. Therefore, if a population has a mean μ,
then the mean of the sampling distribution of the mean is also μ. The symbol
μM is used to refer to the mean of the sampling distribution of the mean.
μM = μ
*The variance of the sampling distribution of the mean is computed as follows:
$\sigma_M^2 = \dfrac{\sigma^2}{n}$
That is, the variance of the sampling distribution of the mean is the population
variance divided by n, the sample size (the number of scores used to compute a
mean). Thus, the larger the sample size, the smaller the variance of the
sampling distribution of the mean.
*Given a population with a mean of μ and a standard deviation of σ, the sampling
distribution of the mean has a mean of μ and a standard deviation of
$\sigma_M = \dfrac{\sigma}{\sqrt{n}}$
where n is the sample size. The standard deviation of the sampling distribution of
the mean is called the standard error of the mean. It is designated by the symbol
σM. Note that the spread of the sampling distribution of the mean decreases as
the sample size increases.
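
These properties can be checked numerically on a small population. The sketch below reuses the five values from Example 1 and, as an assumption made only for this illustration, draws ordered samples with replacement so that the draws are independent and the variance of the sample means equals σ²/n exactly. When samples are drawn without replacement from a small population, as in Examples 1 and 2, the mean of the sample means is still μ but the variance is smaller than σ²/n.

```python
from fractions import Fraction
from itertools import product


def mean(values):
    values = list(values)
    return sum(values, Fraction(0)) / len(values)


def variance(values):
    """Population variance (divide by the number of values)."""
    values = list(values)
    m = mean(values)
    return sum(((v - m) ** 2 for v in values), Fraction(0)) / len(values)


population = [2, 3, 4, 5, 6]   # values from Example 1
n = 2                          # sample size

# Assumption: ordered samples drawn WITH replacement (independent draws).
sample_means = [mean(s) for s in product(population, repeat=n)]

mu, sigma_sq = mean(population), variance(population)
mu_M, sigma_sq_M = mean(sample_means), variance(sample_means)

print(mu, mu_M)                  # population mean vs. mean of the sample means
print(sigma_sq / n, sigma_sq_M)  # sigma^2 / n vs. variance of the sample means
```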
What if the POPULATION is not NORMALLY DISTRIBUTED?
What Is the Central Limit Theorem (CLT)?

In the study of probability theory, the central limit theorem (CLT) states that
the distribution of sample means approximates a normal distribution (also known as a
“bell curve”) as the sample size becomes larger, assuming that all samples are
identical in size, and regardless of the population distribution shape.
Here’s an illustration of “the distribution of the sample mean approximates a normal
distribution as the sample size becomes larger”, as stated in the central limit
theorem.
An exponential distribution is not a normal distribution. Notice how the distribution
of the mean approximates a normal curve/distribution as the sample size n becomes
larger. As a general rule, n must be greater than or equal to 30.

[Figure: distribution of the sample mean for samples drawn from an exponential
population. Panels: Sample size n = 2; Sample size n = 4; Sample size n = 10;
Sample size n = 50.]
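
The same behaviour can be reproduced with a small simulation. This is a sketch under stated assumptions: the population is exponential with mean 1, the number of trials and the seed are arbitrary choices, and skewness is used here only as a rough indicator of how far the distribution of sample means is from a symmetric bell shape.

```python
import random
import statistics


def skewness(values):
    """Skewness: roughly 0 for a symmetric (e.g. normal) distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)


def simulate_sample_means(n, trials=5000, seed=0):
    """Means of `trials` samples of size n from an exponential population (mean 1)."""
    rng = random.Random(seed)
    return [statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
            for _ in range(trials)]


results = {}
for n in (2, 4, 10, 50):
    means = simulate_sample_means(n)
    # (mean of sample means, standard error, skewness)
    results[n] = (statistics.fmean(means), statistics.pstdev(means), skewness(means))
    print(n, [round(x, 3) for x in results[n]])
```

As n grows, the mean of the sample means should stay near 1, the standard error should shrink roughly like 1/√n, and the skewness should move toward 0.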

LESSON 2 – PROBABILITY and NON-PROBABILITY SAMPLING


A. PROBABILITY SAMPLING
Definition: Probability sampling is defined as a sampling technique in which the
researcher chooses samples from a larger population using a method based on the
theory of probability. For a participant to be considered as a probability sample, he/she
must be selected using a random selection.

Types:

1. Simple random sampling, as the name suggests, is an entirely random method of
selecting the sample. This sampling method is as easy as assigning numbers to the
individuals (sample) and then randomly choosing from those numbers through an
automated process. Finally, the numbers that are chosen are the members that are
included in the sample. There are two ways in which researchers choose the samples
in this method of sampling: the lottery system and using number-generating
software or a random number table. This sampling technique usually works around a
large population and has its fair share of advantages and disadvantages.

2. Stratified random sampling involves a method where the researcher divides a
more extensive population into smaller groups that usually don’t overlap but
represent the entire population. While sampling, organize these groups and then
draw a sample from each group separately. Members of these groups should be
distinct so that every member of all groups gets an equal opportunity to be selected
using simple probability. This sampling method is also called “random quota
sampling.”

3. Random cluster sampling is a way to randomly select participants who are spread
out geographically. Cluster sampling usually analyzes a particular population in
which the sample consists of more than a few elements. Researchers then select the
clusters by dividing the population into various smaller sections.

4. Systematic sampling is when you choose every “nth” individual to be a part of the
sample. For example, you can select every 5th person to be in the sample. Systematic
sampling is an extended implementation of the same old probability technique in
which each member of the group is selected at regular periods to form a sample.
There’s an equal opportunity for every member of a population to be selected using
this sampling technique.
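
The four types above can be sketched in a few lines of code. Everything in this example is hypothetical: the population of 100 numbered people, the "region" tag used for strata and clusters, and the function names are invented for illustration, and the seed is fixed only so the sketch is reproducible.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 100 people, each tagged with one of 4 regions.
population = [{"id": i, "region": "NSEW"[i % 4]} for i in range(100)]


def simple_random(pop, k):
    """Every member has the same chance of selection."""
    return rng.sample(pop, k)


def stratified(pop, key, k_per_stratum):
    """Split into non-overlapping strata, then sample each stratum separately."""
    strata = {}
    for person in pop:
        strata.setdefault(person[key], []).append(person)
    return [p for group in strata.values() for p in rng.sample(group, k_per_stratum)]


def cluster(pop, key, n_clusters):
    """Pick whole clusters at random and keep everyone inside them."""
    chosen = set(rng.sample(sorted({p[key] for p in pop}), n_clusters))
    return [p for p in pop if p[key] in chosen]


def systematic(pop, step):
    """Random start, then every `step`-th individual."""
    start = rng.randrange(step)
    return pop[start::step]


print(len(simple_random(population, 10)), len(stratified(population, "region", 5)))
print(len(cluster(population, "region", 2)), len(systematic(population, 5)))
```

Note the difference between the stratified and cluster functions: the first takes a few members from every group, while the second takes every member from a few groups.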
Steps:
1. Choose your population of interest carefully: Think carefully and choose, from
the population, the people whose opinions you believe should be collected, and then
include them in the sample.
2. Determine a suitable sample frame: Your frame should consist of members of
your population of interest and no one from outside it, so that the data collected
are accurate.
3. Select your sample and start your survey: It can sometimes be challenging to
find the right sample and determine a suitable sample frame. Even if all factors are
in your favor, there still might be unforeseen issues like cost factor, quality of
respondents, and quickness to respond. Getting a sample to respond to a
probability survey accurately might be difficult but not impossible.

Use Case:

1. When you want to reduce the sampling bias: This sampling method is used
when the bias has to be kept to a minimum. How researchers select their sample
largely determines the quality of their findings. Probability sampling
leads to higher quality findings because it provides an unbiased representation of
the population.
2. When the population is diverse: Researchers use this method
extensively as it helps them create samples that fully represent the population. Say
we want to find out how many people prefer medical tourism over getting treated
in their own country. This sampling method will help pick samples from various
socio-economic strata, backgrounds, etc. to represent the broader population.
3. To create an accurate sample: Probability sampling helps researchers create
accurate samples of their population. Researchers use proven statistical methods
to draw a precise sample size to obtain well-defined data.

Advantages:

1. It’s cost-effective: This process is both cost- and time-effective. A larger
sample can also be chosen by assigning numbers to the members and then
choosing random numbers from that larger group.
2. It’s simple and straightforward: Probability sampling is an easy way of sampling
as it does not involve a complicated process. It’s quick and saves time. The time
saved can thus be used to analyze the data and draw conclusions.
3. It is non-technical: This method of sampling doesn’t require any technical
knowledge because of its simplicity. It doesn’t require intricate expertise and is not
at all lengthy.

Examples:

1. The population of the US alone is 330 million. It is practically impossible to send
a survey to every individual to gather information. Use probability sampling to
collect data, even if you collect it from a smaller population.

2. An organization has 500,000 employees sitting at different geographic locations.
The organization wishes to make certain amendments in its human resource
policy, but before they roll out the change, they want to know if the employees will
be happy with the change or not. However, it’s a tedious task to reach out to all
500,000 employees. This is where probability sampling comes in handy. A sample
from the larger population, i.e., from the 500,000 employees, is chosen. This sample
will represent the population. Deploy a survey now to the sample. From the
responses received, management will now be able to know whether employees in
that organization are happy or not about the amendment.

A.1 SAMPLING WITH REPLACEMENT


Sampling with replacement is used to find probability with replacement. In other
words, you want to find the probability of some event where there’s a number of
balls, cards or other objects, and you replace the item each time you choose one.
Let’s say you had a population of 7 people, and you wanted to sample 2. Their
names are:

• John, Jack, Qiu, Tina, Hatty, Jacques, Des

You could put their names in a hat. If you sample with replacement, you would
choose one person’s name, put that person’s name back in the hat, and then
choose another name. The possibilities for your two-name sample are:
• John, John
• John, Jack
• John, Qiu
• Jack, Qiu
• Jack, Tina
• …and so on.

When you sample with replacement, your two items are independent. In other
words, one does not affect the outcome of the other. You have a 1 out of 7 (1/7)
chance of choosing the first name and a 1/7 chance of choosing the second name.
• P(John, John) = (1/7) * (1/7) = .02.
• P(John, Jack) = (1/7) * (1/7) = .02.
• P(John, Qiu) = (1/7) * (1/7) = .02.
• P(Jack, Qiu) = (1/7) * (1/7) = .02.
• P(Jack, Tina) = (1/7) * (1/7) = .02.

Note that P(John, John) just means “the probability of choosing John’s name, and
then John’s name again.” You can figure out these probabilities using the
multiplication rule.
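
As a quick check of the multiplication rule, the sketch below enumerates every ordered two-name sample that is possible with replacement. It treats the order of the two draws as meaningful (first name, then second name), which is an assumption that matches the way P(John, John) is described above.

```python
from fractions import Fraction
from itertools import product

names = ["John", "Jack", "Qiu", "Tina", "Hatty", "Jacques", "Des"]

# With replacement: the same name may be drawn twice, so there are 7 * 7 ordered pairs.
pairs = list(product(names, repeat=2))

# Multiplication rule for independent draws: P(first) * P(second).
p_pair = Fraction(1, len(names)) * Fraction(1, len(names))

print(len(pairs))               # number of ordered two-name samples
print(p_pair, float(p_pair))    # probability of any one specific ordered pair
```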

A.2 SAMPLING WITHOUT REPLACEMENT


Sampling without Replacement is a way to figure out probability without
replacement. In other words, you don’t replace the first item you choose before
you choose a second. This dramatically changes the odds of choosing sample
items.

Taking the above example, you would have the same list of names to choose two
people from. And your list of results would be similar, except you couldn’t choose the
same person twice:
• John, Jack
• John, Qiu
• Jack, Qiu
• Jack, Tina…
But now, your two items are dependent, or linked to each other. When you choose
the first item, you have a 1/7 probability of picking a name. But then, assuming you
don’t replace the name, you only have six names to pick from. That gives you a
1/6 chance of choosing a second name. The odds become:
• P(John, Jack) = (1/7) * (1/6) = .024.
• P(John, Qiu) = (1/7) * (1/6) = .024.
• P(Jack, Qiu) = (1/7) * (1/6) = .024.
• P(Jack, Tina) = (1/7) * (1/6) = .024…

As you can probably figure out, I’ve only used a few items here, so the odds only
change a little. But larger samples taken from small populations can have more
dramatic results. You can tell how dramatic these results are by calculating the
covariance. That’s a measure of how much two items’ probabilities are linked
together; the higher the covariance, the more dramatic the results. A covariance
of zero would mean there’s no difference between sampling with replacement or
sampling without.
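
The without-replacement case can be checked the same way. In this sketch the pairs are again treated as ordered (first draw, then second draw), which is an assumption consistent with the (1/7) * (1/6) calculation above.

```python
from fractions import Fraction
from itertools import permutations

names = ["John", "Jack", "Qiu", "Tina", "Hatty", "Jacques", "Des"]

# Without replacement: a name cannot repeat, so there are 7 * 6 ordered pairs.
pairs = list(permutations(names, 2))

# The second draw depends on the first: only 6 names remain in the hat.
p_pair = Fraction(1, len(names)) * Fraction(1, len(names) - 1)

print(len(pairs))               # number of ordered two-name samples
print(p_pair, float(p_pair))    # probability of any one specific ordered pair
```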

B. NON-PROBABILITY SAMPLING

Definition: Non-probability sampling is defined as a sampling technique in which the
researcher selects samples based on the subjective judgment of the researcher rather
than random selection. It is a less stringent method. This sampling method depends
heavily on the expertise of the researchers. It is carried out by observation, and
researchers use it widely for qualitative research.

Non-probability sampling is a sampling method in which not all members of the
population have an equal chance of participating in the study. This is unlike probability
sampling, where each member of the population has a known chance of being selected.
Non-probability sampling is most useful for exploratory studies like a pilot survey
(deploying a survey to a smaller sample compared to the pre-determined sample size).
Researchers use this method in studies where it is impossible to draw a random
probability sample due to time or cost considerations.

Types:
1. Convenience sampling: A non-probability sampling technique where samples
are selected from the population only because they are conveniently available to
the researcher. Researchers choose these samples just because they are easy to
recruit, and the researcher did not consider selecting a sample that represents the
entire population.
2. Consecutive sampling: This non-probability sampling method is very similar to
convenience sampling, with a slight variation. Here, the researcher picks a single
person or a group as a sample, conducts research over a period, analyzes the
results, and then moves on to another subject or group if needed. The consecutive
sampling technique gives the researcher a chance to work with many topics and
fine-tune his/her research by collecting results that have vital insights.
3. Quota sampling: Consider, hypothetically, a researcher who wants to study the
career goals of male and female employees in an organization. There are 500
employees in the organization, also known as the population. To understand the
population better, the researcher will need only a sample, not the entire population.
Further, the researcher is interested in particular strata within the population. Here
is where quota sampling helps in dividing the population into strata or groups.
4. Judgmental or Purposive sampling: Researchers select the samples based
purely on the researcher’s knowledge and credibility. In other words, researchers
choose only those people who they deem fit to participate in the research study.
Judgmental or purposive sampling is not a scientific method of sampling, and the
downside to this sampling technique is that the preconceived notions of a
researcher can influence the results. Thus, this research technique involves a high
amount of ambiguity.
5. Snowball sampling: This helps researchers find a sample when subjects are
difficult to locate. Researchers use this technique when the sample size is small and
not easily available. This sampling system works like a referral program. Once the
researchers find suitable subjects, they ask them for assistance in seeking similar
subjects to form a sample of considerably good size.

Use Case:
1. Use this type of sampling to indicate if a particular trait or characteristic exists in a
population.
2. Researchers widely use the non-probability sampling method when they aim at
conducting qualitative research, pilot studies, or exploratory research.
3. Researchers use it when they have limited time to conduct research or have
budget constraints.
4. When the researcher needs to observe whether a particular issue needs in-depth
analysis, he applies this method.
5. Use it when you do not intend to generate results that will generalize to the entire
population.

Advantages:
1. Non-probability sampling techniques are a more conducive and practical method
for researchers deploying surveys in the real world. Statisticians prefer
probability sampling because it yields data in the form of numbers. However, if
done correctly, non-probability sampling can produce similar if not the same
quality of results.
2. Getting responses using non-probability sampling is faster and more cost-effective
than probability sampling because the sample is known to the researcher. The
respondents respond quickly compared to people randomly selected, as they
have a high motivation level to participate.

Examples:
1. An example of convenience sampling would be using student volunteers known to
the researcher. Researchers can send the survey to students belonging to a
particular school, college, or university, who then act as a sample.
2. In an organization, for studying the career goals of 500 employees, technically, the
sample selected should have proportionate numbers of males and females, which
means there should be 250 males and 250 females. Since this is unlikely, the
researcher selects the groups or strata using quota sampling.
3. Researchers also use this type of sampling to conduct research involving a
particular illness in patients or a rare disease. Researchers can seek help from
subjects to refer to other subjects suffering from the same ailment to form a
subjective sample to carry out the study.

LESSON 3 – SAMPLING DESIGN (Scientific & Non-Scientific Design)

SAMPLING is concerned with selection of a subset of individuals from within a
statistical population to estimate characteristics of the whole population.
SAMPLE is a small amount or part of something that shows you what the rest is or
should be like.

4 Principles of Sampling Design


o Standardize samples
o Replicate (for each combination of time, location, and any controlled factor)
o Establish equal number of suitable Controls.
o Locate all samples Randomly.
Advantages of Sampling
o Very accurate
o Economical in nature.
o Very reliable.
o High suitability ratio towards the different surveys.
o Takes less time.
o When the universe is very large, sampling is the only practical method for
collecting the data.
Disadvantages of Sampling
o Inadequacy of the samples.
o Chances for bias.
o Problems of accuracy.
o Difficulty of getting the representative sample.
o Untrained manpower.
Sampling Design
• Specifies, for every possible sample, its probability of being drawn.

TYPES OF SAMPLING DESIGN

A. SCIENTIFIC SAMPLING

1. Restricted Random Sampling. A method of sampling that is a
compromise between systematic sampling and stratified random sampling. It has
less potential for bias than systematic sampling and also avoids the practical
problems associated with stratified random sampling.
2. Unrestricted Random Sampling. This method assumes that each site has an
equal chance of being part of the sample selected. Make a list of all project sites,
perhaps by alphabetical order. Every project site is given a number. Random
sampling isn’t always the most convenient method of choosing a sample.

DIFFERENCE BETWEEN RESTRICTED AND UNRESTRICTED SAMPLING


Unrestricted sampling occurs when elements are selected individually and
directly from the population, whereas, restricted sampling occurs when elements
are chosen using a specific methodology as in probability sampling or complex
probability sampling.

3. Stratified Random Sampling. This method of sampling is sometimes used if there
are wide variations in site performance within a certain geographic location or type
of distribution site (e.g., health centers or schools). All the sites are grouped into
segments, each having some uniform, easily identifiable characteristics. Each
segment is sampled separately using unrestricted random sampling methods.
4. Systematic Sampling. A statistical method involving the selection of elements
from an ordered sampling frame. The most common form of systematic sampling
is an equal-probability method. In this approach, progression through the list is
treated circularly, with a return to the top once the end of the list is passed.
5. Multistage Sampling. A complex form of cluster sampling. Cluster sampling is a
type of sampling which involves dividing the population into groups (clusters). Then,
one or more clusters are chosen at random and everyone within the chosen cluster is
sampled.
6. Cluster Sampling. It is a sampling technique used when “natural” but relatively
homogeneous groupings are evident in statistical population.
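
The circular treatment of the list described under Systematic Sampling can be sketched as follows. The function name, the list of 23 numbered sites, and the rule for the sampling interval (the whole-number part of N/n) are assumptions made for this illustration only.

```python
import random


def circular_systematic_sample(frame, n, seed=None):
    """Select n elements from an ordered frame, wrapping around at the end."""
    rng = random.Random(seed)
    N = len(frame)
    k = max(1, N // n)          # sampling interval (assumed rule: floor of N / n)
    start = rng.randrange(N)    # random start anywhere in the list
    # The modulo returns to the top of the list once the end is passed.
    return [frame[(start + i * k) % N] for i in range(n)]


sites = [f"site-{i:02d}" for i in range(1, 24)]   # hypothetical frame of 23 sites
sample = circular_systematic_sample(sites, 5, seed=1)
print(sample)
```

Because the start can fall anywhere in the list and the count wraps around, every site has the same chance of selection even when N is not a multiple of n.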

B. NON-SCIENTIFIC SAMPLING

Here, not all of the individuals in a population are given an equal chance of being
included in the sample; hence, subjectivity occurs.

THREE TYPES OF NONSCIENTIFIC SAMPLING:

1. Purposive Sampling. This type of non-scientific sampling is based on
selecting the individuals as samples according to the purposes of the
researcher as his controls.
2. Convenience Sampling. Also referred to as haphazard or accidental
sampling. It is the process of selecting some people to be part of a sample
because they are readily available, not because they are most representative
of the population being studied.
Examples of Convenience Sampling
• Female moviegoers sitting in the first row of a movie theater.
• The first 100 customers to enter a department store.
• The first three callers in a radio contest.
3. Quota Sampling. This is one of the most common forms of non-probability
sampling. Sampling is done until a specific number of units (quotas) for various
sub-populations has been selected.
To choose a Quota Sample:
• Divide the population into strata or groups of individuals that are
similar in some way that is important to the response.
• Choose a separate sample from each stratum. This does not have
to be a random sample.
• Combine these samples to form a quota sample.
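
The three steps for choosing a quota sample can be sketched as follows. The respondent names, the use of sex as the stratum, and the rule of taking the first people encountered are all hypothetical choices for illustration; the point is only that selection within each stratum does not have to be random.

```python
def quota_sample(population, stratum_of, quotas):
    """Fill each stratum's quota with the first units encountered (non-random)."""
    selected = {stratum: [] for stratum in quotas}
    for unit in population:
        stratum = stratum_of(unit)
        # Accept the unit only while its stratum's quota is not yet full.
        if stratum in selected and len(selected[stratum]) < quotas[stratum]:
            selected[stratum].append(unit)
    # Combine the per-stratum samples into one quota sample.
    return [unit for group in selected.values() for unit in group]


# Hypothetical walk-in respondents as (name, sex) pairs, in order of arrival.
respondents = [("Ana", "F"), ("Ben", "M"), ("Cara", "F"), ("Dina", "F"),
               ("Eli", "M"), ("Fe", "F"), ("Gil", "M")]
sample = quota_sample(respondents, lambda r: r[1], {"F": 2, "M": 2})
print(sample)
```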

LESSON 4 – SAMPLE SIZE DETERMINATION


SAMPLE SIZE DETERMINATION
• Sample size determination is the mathematical estimation of number of
subjects/ units included in the study
• When a representative sample is taken from a population, the findings can be
generalized to the population.
• Optimum sample size determination is required for the following reasons:
a. To allow for appropriate analysis
b. To provide the desired level of accuracy
c. To allow validity of significance test.

Sample size criteria


The level of precision is also called sampling error. It is the range in which the true
value of the population is estimated to be. This range is often expressed in percentage
points (e.g., ±5 percent), in the same way that results for political campaign polls are
reported by the media.

The level of confidence or risk


It is based on ideas encompassed under the central limit theorem. For example, if a
95% confidence level is selected, 95 out of 100 samples will have the true population
value within the range of precision.

The degree of variability


It refers to the distribution of attributes in the population. The more heterogeneous
a population, the larger the sample size required to obtain a given level of precision.
The less variable (more homogeneous) a population, the smaller the sample size. A
proportion of 50% indicates a greater level of variability than either 20% or 80%. This
is because 20% and 80% indicate that a large majority do not or do, respectively, have
the attribute of interest. Because a proportion of 0.5 indicates the maximum variability
in a population, it is often used in determining a more conservative sample size, that is,
the sample size may be larger than if the true variability of the population attribute were
used.
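
One way to see why 50% is the most variable case is to compute the variance of a yes/no attribute, p(1 − p), for several proportions. The function name below is made up for this sketch, and treating the attribute as a simple yes/no variable is an assumption of the illustration.

```python
def attribute_variance(p):
    """Variance p * (1 - p) of a yes/no attribute held by a proportion p."""
    return p * (1 - p)


for p in (0.2, 0.5, 0.8):
    print(p, round(attribute_variance(p), 2))

# Search a grid of proportions for the one with the largest variance.
most_variable = max(range(0, 101), key=lambda i: attribute_variance(i / 100)) / 100
print(most_variable)
```

The values for 20% and 80% are the same and smaller than the value for 50%, which is why p = 0.5 gives the most conservative (largest) sample size.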

STRATEGIES FOR DETERMINING SAMPLE SIZE


Census for small populations
• One approach is to use the entire population as the sample.
• Cost considerations make this impossible for large populations.
• It is attractive for small populations (e.g., 200 or less).
• It eliminates sampling error and provides data on all the individuals in the
population.
• Some costs such as questionnaire design and developing the sampling frame
are “fixed,” that is, they will be the same for samples of 50 or 200.
• Finally, virtually the entire population would have to be sampled in small
populations to achieve a desirable level of precision.

Imitating a sample size of similar studies

11 | P a g e M o d u l e i n Statistics and Evaluation in Education MCBriñosa


• Use the same sample size as those of studies similar to the one you plan (Cite
reference).
• Without reviewing the procedures employed in these studies you may run the
risk of repeating errors that were made in determining the sample size for
another study.
• However, a review of the literature in your discipline can provide guidance
about “typical” sample sizes that are used.

Using published tables


• Published tables provide the sample size necessary for given combinations of
precision, confidence level, and variability.
• The sample sizes presume that the attributes being measured are distributed
normally or nearly so.
• Although tables can provide a useful guide for determining the sample size,
you may need to calculate the necessary sample size for a different
combination of levels of precision, confidence, and variability.

Applying formulas to calculate a sample size


• Sample size can be determined by applying one of several mathematical
formulae.
• The formulas most often used calculate a sample size for proportions.
Example:
• For populations that are large, the Cochran (1963:75) equation yields a
representative sample for proportions:

n0 = (Z² × p × q) / e²

• Other formulas include the Fisher equation, Mugenda, etc.

NOTE:
- n0 is the sample size,
- Z is the abscissa of the normal curve that cuts off an area α at the tails
(1 – α equals the desired confidence level, e.g., 95%),
- e is the desired level of precision,
- p is the estimated proportion of an attribute that is present in the population,
and q is 1 – p.
- The value for Z is found in statistical tables which contain the area under the
normal curve, e.g., Z = 1.96 for a 95% level of confidence.

Example:
Suppose we wish to evaluate a statewide Extension program in which farmers
were encouraged to adopt a new practice. Assume there is a large population but
that we do not know the variability in the proportion that will adopt the practice;
therefore, assume p = .5 (maximum variability). Furthermore, suppose we desire a
95% confidence level and ±5% precision. Substituting into the Cochran equation gives
n0 = (1.96)²(.5)(.5) / (.05)² = 384.16, or about 385 farmers.
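The farmer example can be worked through with a few lines of Python. This is a sketch that assumes the standard form of the Cochran equation, n0 = Z²pq / e²; the function name and parameter names are illustrative only.

```python
import math
from statistics import NormalDist

def cochran_sample_size(e, p=0.5, confidence=0.95):
    """Cochran sample size for a proportion in a large population."""
    # Z cuts off an area of alpha/2 in each tail of the normal curve.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    q = 1 - p
    n0 = z ** 2 * p * q / e ** 2
    # Round up so the desired precision is still met.
    return math.ceil(n0)

# 95% confidence, ±5% precision, p = .5 (maximum variability)
print(cochran_sample_size(e=0.05))
```

Rounding up rather than to the nearest whole number is a common convention, since rounding down would give slightly less precision than requested.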

SLOVIN’S (Simplified Formula for Proportions)

n = N / (1 + N e²)

- n is the sample size
- N is the population size
- e is the level of precision

Example:
Find the sample size you need to take from a population of 1,000 people for a survey
on their soda preference. Use a confidence level of 95%, which gives an alpha level
(margin of error) of 0.05.

n = 1000 / (1 + 1000(0.05)²)
= 1000 / 3.5
= 285.714286
n ≈ 286
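The soda-preference calculation can be reproduced in Python. The sketch assumes Slovin's formula in its usual form, n = N / (1 + N e²), where the level of precision e is squared; the function name is illustrative.

```python
import math

def slovin(N, e):
    """Slovin's formula: sample size for population size N and precision e."""
    return N / (1 + N * e ** 2)

n = slovin(1000, 0.05)
print(round(n, 6))   # unrounded result of the formula
print(math.ceil(n))  # whole number of respondents, rounded up
```

Note that the formula takes only the population size and the level of precision as inputs; the confidence level enters only through the choice of e.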

Use of software in sample size determination

Depending on the type of study and the specific software, some information will be
required:

• Population size, population standard deviation, sampling error, confidence
level, z-value, power of the study, etc.
• 80% power in a clinical trial means that the study has an 80% chance of ending
up with a p value of less than 5% in a statistical test (i.e., a statistically significant
treatment effect) if there really was an important difference (e.g., 10% versus 5%
mortality) between treatments.

G*POWER
G*Power is a free program that can make the calculations a lot easier:
http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/
Faul, F., Erdfelder, E., Lang, A.-G., & Buchner, A. (2007). G*Power 3: A flexible
statistical power analysis program for the social, behavioral, and biomedical
sciences. Behavior Research Methods, 39, 175-191.
G*Power computes:
• power values for given sample sizes, effect sizes, and alpha levels (post hoc
power analyses),
• sample sizes for given effect sizes, alpha levels, and power values (a priori
power analyses).
It is suitable for most fundamental statistical methods.
Note: some tests assume equal variances across groups and assume that the population
SD is used (in practice, it is likely to be estimated from the sample).

LET’S DO IT: BS (BETWEEN-SUBJECTS) T-TEST

Power calculations: example

➢ two random samples of n = 25
➢ expect difference between means of 5
➢ two-tailed test, α = .05
– μ1 = 5
– μ2 = 10
– σ = 10
– effect size: d = |μ1 – μ2| / σ = |5 – 10| / 10 = .500

EXAMPLES

1. A researcher plans to conduct a survey. If the population of a certain city is
1,000,000, find the sample size if the margin of error is 2.5%.
n = 1,000,000 / (1 + 1,000,000(0.025)²)
= 1,000,000 / (1 + 1,000,000(0.000625))
= 1,000,000 / 626
= 1,597.44
n ≈ 1,597
2. Suppose our evaluation of farmers’ adoption of the new practice only affected
2,000 farmers. Find the sample size (n).
A 95% confidence level, p = .5, and e = .05 are assumed.

Learning Activities

LEARNING ACTIVITY NO.1

A. The following are heights of five students in centimeters. Suppose samples of size
3 are taken from this population of five students.

Student Height (cm)
Bert 120
Tony 130
Danny 110
Henry 125
Peter 115

a. How many samples are possible? List them and compute the mean of each sample.
b. Construct the sampling distribution of the sample means.
c. Construct the histogram of the sampling distribution of the sample means.
Describe the shape of the histogram.

B. TRUE OR FALSE
1. Given the population is normally distributed, the population has a mean of 14 and
a standard deviation of 3 and the sample size of your sampling distribution is N=10.
Therefore, the mean of the sampling distribution of the mean is also 14.
2. The population has a mean of 30 and a standard deviation of 6. The sample size
of your sampling distribution is N = 9. Therefore, the variance of the sampling
distribution is 4.
3. The population has a mean of 120 and a standard deviation of 12. The sample
size of your sampling distribution is N = 16. Therefore, the standard error of the
mean is ¾.
4. Sampling distribution of the mean, with n=30, of a moderately negatively skewed
distribution is about normal.
5. The entire student body of 225 students took a test. These test scores have a
mean of 75, a standard deviation of 10, and are slightly positively skewed. If you
randomly chose 35 of these test scores and calculated the mean over and over
again, the sample mean is also 75 and SD is 2, from [10/sqrt(25)].

LEARNING ACTIVITY NO. 2

A. Directions: Identify if the statement is referring to PROBABILITY SAMPLING or
NON-PROBABILITY SAMPLING.
__________ 1. The samples are randomly selected.
__________ 2. Samples are selected on the basis of the researcher’s subjective judgment.
__________ 3. Everyone in the population has an equal chance of getting selected.
__________ 4. Not everyone has an equal chance to participate.
__________ 5. Researchers use this technique when they want to keep a tab on sampling
bias.
__________ 6. Sampling bias is not a concern for the researcher.
__________ 7. Useful in an environment having a diverse population.

___________ 8. Useful in an environment that shares similar traits.
___________ 9. Used when the researcher wants to create accurate samples.
___________ 10. Finding an audience is very simple.

LEARNING ACTIVITY NO. 3

A. Multiple Choice. Choose the letter of the correct answer.

1. It is concerned with the selection of a subset of individuals from within a statistical
population to estimate characteristics of the whole population.
a. Sample b. Scientific Sampling c. Non-Scientific Sampling d. Sampling
2. A small amount or part of something that shows you what the rest is or should be.
a. Sample b. Scientific Sampling c. Non-Scientific Sampling d. Sampling
3. This method assumes that each site has an equal chance of being part of the sample
selected.
a. Restricted Random Sampling c. Systematic Sampling
b. Unrestricted Random Sampling d. Stratified Random Sampling
4. It is a sampling technique used when “natural” but relatively homogeneous groupings
are evident in a statistical population.
a. Cluster Sampling b. Systematic Sampling
c. Multistage Sampling d. Purposive Sampling
5. A statistical method involving the selection of elements from an ordered sampling
frame.
a. Cluster Sampling b. Systematic Sampling
c. Multistage Sampling d. Purposive Sampling
6. This type of non-scientific sampling is based on selecting the individuals as samples
according to the purposes of the researcher as his controls.
a. Systematic Sampling b. Purposive Sampling
c. Quota Sampling d. Convenience Sampling
7. The process of selecting some people to be part of a sample because they are readily
available, not because they are most representative of the population being studied.
a. Systematic Sampling b. Purposive Sampling
c. Quota Sampling d. Convenience Sampling
8. This is one of the most common forms of non-probability sampling.
a. Systematic Sampling b. Purposive Sampling
c. Quota Sampling d. Convenience Sampling
9. Which of the following is not an advantage of sampling?
a. Very accurate b. Economical in nature
c. Chances for bias d. Very reliable
10. Which of the following is not a disadvantage of sampling?
a. Inadequacy of the samples. b. Chances for bias.
c. Problems of accuracy. d. High suitability ratio towards the different surveys.

Assessments
Fill in the blanks
Directions: Read the following statements and write the word/s which best complete/s the
sentences on your paper/notebook. Then later, check your own work. (BE HONEST even
if NOBODY SEES YOU)
16. The ___________ obtained will form a frequency distribution and the
corresponding probability distribution can be constructed.
17. The ____________ is the mean of the population from where the items are
sampled.
18. If the population mean is normally distributed, the population mean μ is _____ to
the sample mean.
19. The central limit theorem (CLT) states that the distribution of sample __________
a normal distribution (also known as a “bell curve”) as the sample size becomes
larger.
20. ____________ is concerned with the selection of a subset of individuals from within a
statistical population to estimate characteristics of the whole population.

21. ___________________ method assumes that each site has an equal chance of
being part of the sample selected.
22. ____________ is one of the most common forms of non-probability sampling.
23. ____________ is a sampling technique used when “natural” but relatively
homogeneous groupings are evident in a statistical population.
24. ____________ is defined as a sampling technique in which the researcher
chooses samples from a larger population using a method based on the theory of
probability.
25. ____________________ involves a method where the researcher divides a more
extensive population into smaller groups that usually don’t overlap but represent
the entire population.
26. ___________________ is a sampling method in which not all members of the
population have an equal chance of participating in the study, unlike probability
sampling.
27. _______________________ is not a scientific method of sampling, and the
downside to this sampling technique is that the preconceived notions of a
researcher can influence the results.
28. ____________________ is the mathematical estimation of the number of subjects/
units included in the study.
29. The more heterogeneous a population, the ______ the sample size required to
obtain a given level of precision.
30. Sample sizes equal to or greater than ___ are considered sufficient for the CLT to
hold.

