Introducing Statistics Part 2

The document outlines various data collection and sampling methods, emphasizing the importance of choosing the right strategy based on the type of data needed, available resources, and analysis goals. It discusses different tools for data collection, such as surveys, interviews, and observations, as well as the advantages and disadvantages of non-probability and probability sampling techniques. Key concepts like target population, study population, and sampling methods are also defined to guide effective data collection practices.

Data Collection & Sampling Methods Sampling Methods

Introduction To Statistics: Part 2.

[Link]
sbanya@[Link]

University of Malawi, The Polytechnic

Monday 11th January, 2021

Data Collection

Data Collection Strategy: there is no one best way; the decision depends on:
What you need to know: numbers or stories
Where the data reside: environment, files, people
Resources and time available
Complexity of the data to be collected
Frequency of data collection
Intended forms of data analysis

Rules for Collecting Data.

Use multiple data collection methods
Use available data, but you need to know:
how the measures were defined
how the data were collected and cleaned
the extent of missing data
how accuracy of the data was ensured

Rules for Collecting Data.

If you must collect original data:
be sensitive to burden on others
pre-test, pre-test, pre-test
establish procedures and follow them (protocol)
maintain accurate records of definitions and coding
verify accuracy of coding, data input

Data Collection Tools

Participatory Methods
Records and Secondary Data
Observation
Surveys and Interviews
Focus Groups
Diaries, Journals, Self-reported Checklists
Other Tools

Participatory Methods

Involve groups or communities heavily in data collection
Examples:
community meetings
mapping
transect walks

Community Meetings

One of the most common participatory methods
Must be well organized:
agree on purpose
establish ground rules
who will speak
time allotted for speakers
format for questions and answers

Records and Secondary data

Examples of sources:
files/records
computer databases
industry or government reports
other reports or prior evaluations
census data and household survey data
electronic mailing lists and discussion groups
documents (budgets, organizational charts, policies and
procedures, maps, monitoring reports)
newspapers and television reports

Using Existing Data Set

Key issues to consider: validity, reliability, accuracy, response rates, data dictionaries, and missing data rates

Advantages/Disadvantages

Advantages: Often less expensive and faster than collecting the original data again.
Disadvantages: There may be coding errors or other problems. The data may not be exactly what is needed. You may have difficulty getting access. You have to verify the validity and reliability of the data.

Observation

One sees what is happening:
traffic patterns
land use patterns
layout of city and rural areas
quality of housing
condition of roads
condition of buildings
who goes to a health clinic

Observation is helpful when:

you need direct information
you are trying to understand ongoing behavior
there is physical evidence, products, or outputs that can be observed
you need to provide an alternative when other data collection is unfeasible or inappropriate

Ways to Record Information from Observations:

Observation guide: printed form with space to record observations
Recording sheet or checklist: yes/no options, tallies, rating scales
Field notes: least structured, recorded in a narrative, descriptive style

Guidelines for Planning Observations

Have more than one observer, if feasible
Train observers so they observe the same things
Pilot test the observation data collection instrument
For less structured approach, have a few key questions in
mind

Advantages/Disadvantages

Advantage: Collects data on actual rather than self-reported behavior or perceptions. It is real-time rather than retrospective.
Disadvantage: Observer bias, potentially unreliable; interpretation and coding challenges; sampling can be a problem; can be labor intensive; low response rates.

Surveys and Interviews

Excellent for asking people about: perceptions, opinions, ideas
Less accurate for measuring behavior
Sample should be representative of the whole
Big problem with response rates

Modes of Survey

Telephone surveys
Self-administered questionnaires distributed by mail,
e-mail, or websites
Administered questionnaires, common in the development
context
In development context, often issues of language and
translation

Advantage/Disadvantage

Advantage: Best when you want to know what people think, believe, or perceive; only they can tell you that.
Disadvantage: People may not accurately recall their behavior or may be reluctant to reveal their behavior if it is illegal or stigmatized. What people think they do or say they do is not always the same as what they actually do.

Interviews.

Often semi-structured
Used to explore complex issues in depth
Forgiving of mistakes: unclear questions can be clarified
during the interview and changed for subsequent interviews
Can provide evaluators with an intuitive sense of the
situation

Challenges of Interviews.

Can be expensive, labor intensive, and time consuming
Selective hearing on the part of the interviewer may miss
information that does not conform to pre-existing beliefs
Cultural sensitivity: e.g., gender issues

Focus Group

Type of qualitative research where small homogeneous groups of people are brought together to informally discuss specific topics under the guidance of a moderator
Purpose: to identify issues and themes, not just interesting information, and not "counts"

Focus Groups are Inappropriate when:

language barriers are insurmountable
evaluator has little control over the situation
trust cannot be established
free expression cannot be ensured
confidentiality cannot be assured

Advantage/Disadvantage

Advantage: Can be conducted relatively quickly and easily; may take less staff time than in-depth, in-person interviews; allows flexibility to make changes in process and questions; can explore different perspectives; can be fun.
Disadvantage: Analysis is time consuming; participants may not be representative of the population, possibly biasing the data; the group may be influenced by the moderator or by dominant group members.

The Population

There are two different types of population:
Target Population: Consists of the group of population units from whom we would like to collect data (e.g. all students in UNIMA)
Study or Survey Population: Consists of the group of population units from whom we can collect data (e.g. all students in UNIMA with laptops)

The Population

NOTE: Ideally, a sample survey would collect data from the Target Population, but in practice we collect data from the Study Population because of practical constraints.

The Sample

A sample must be:
Unbiased: The chosen sample should be representative of
the entire population of interest. E.g. if we are interested
in the weight of primary school children, we should select a
sample that includes children from a range of primary
school classes and year groups.
Taken from the correct population: The sample should
only contain members of the population of interest. E.g. if
we are interested in the characteristics of primary school
children, the sample should not contain children from
secondary school.

Sampling Methods

Grouped into two categories:
Non-Probability Sampling: Involves non-random
selection based on convenience or other criteria, allowing
you to easily collect initial data.
Probability Sampling: Involves random selection,
allowing you to make statistical inferences about the whole
group.

Non-Probability Sampling

Has the following characteristics:
No sampling frame is used, therefore the chance of someone
being included in the sample cannot be calculated.
Results from the survey can be produced cheaply and
quickly.
Population coverage is poor since it only captures those who are available to contribute at the time and/or are interested enough in the subject under investigation.
It is difficult to make estimates of the population from the
sample results and any generalizations that are made must
be treated with caution.
Performing non-probability sampling is considerably less
expensive than probability sampling methods.

Types of Non-probability Sampling

Convenience Sampling: Data is collected from any willing and available respondent. Examples include:
Street corner interviews;
Magazine and newspaper questionnaires; and
Phone-in polls.
The sample is likely to be unrepresentative of the
population, because only those who feel strongly about the
topic are likely to respond and interviewers may only
approach one particular type of respondent, usually those
that they feel comfortable with. Therefore, the results of
the survey may be biased.

Types of Non-probability Sampling

Purposive Sampling:
Read about Purposive Sampling and write down what it is, when to use it, and its advantages and disadvantages.

Types of Non-probability Sampling

Quota Sampling: The population is divided into different groups or classes according to different characteristics of the population, and the percentage (proportion) of each group to be included in the sample is fixed in advance.
In Quota sampling, researchers create a sample involving
individuals that represent a population.
Researchers choose these individuals according to specific
traits or qualities.
Quotas are devised to reflect the characteristics of the
population, hence quota sampling attempts to obtain a
more representative sample than convenience sampling, and
therefore more representative sample results should be
obtained.

Quota Sampling Example & Steps

A study to investigate the proportion of those who eat pizza and cake at home.
Steps
Divide the group into subgroups according to some characteristic
Identify the proportion of these subgroups in the population, e.g. N = 10: 4 cake and 6 pizza
Lastly, select subjects to form the sample group, e.g. 50% of each subgroup: 2 of the 4 cake and 3 of the 6 pizza, hence total sample n = 5
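
The quota arithmetic in the example can be sketched in Python. This is only an illustration under stated assumptions: the function name `quota_sizes` and the dictionary layout are hypothetical, while the counts (4 cake, 6 pizza) and the 50% quota come from the example above.

```python
def quota_sizes(subgroup_counts, fraction):
    """Return how many subjects to recruit from each subgroup.

    subgroup_counts: mapping of subgroup name -> number of units in the population
    fraction: share of each subgroup to recruit (the quota), between 0 and 1
    """
    # round() rounds exact halves to the nearest even integer,
    # so totals should be checked when counts are odd.
    return {name: round(count * fraction) for name, count in subgroup_counts.items()}


# Counts taken from the example: N = 10, 4 cake and 6 pizza.
population = {"cake": 4, "pizza": 6}
quotas = quota_sizes(population, 0.5)
total = sum(quotas.values())
print(quotas, total)
```

The code only fixes how many subjects each subgroup contributes. Who fills each quota is still chosen non-randomly by the researcher, which is what makes this a non-probability method.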

Advantages & Disadvantages of Non-probability Sampling

Advantages:
Non-probability sampling techniques are a more conducive
and practical method for researchers deploying surveys in
the real world.
Getting responses using non-probability sampling is faster (more time-effective) and more cost-effective than probability sampling because the sample is known to the researcher. Respondents tend to respond more quickly than randomly selected people because they are highly motivated to participate.
Effective when it is unfeasible or impractical to conduct
probability sampling.

Advantages & Disadvantages of Non-probability Sampling

Disadvantages:
Lower level of generalization of research findings compared
to probability sampling
Difficulties in estimating sampling variability and
identifying possible bias

Probability Sampling

All members of the study population have a known probability of being included in the sample
Has the following characteristics:
Use a sampling frame from which to select a sample
Select samples at random from the sampling frame.
Therefore every item on the sampling frame has a chance
of being selected and the probability of selection can be
calculated
Select a sample that is more representative of the
population (than non-probability methods) and
Researchers can calculate the accuracy of the survey
estimates

Example Questions

1 What is the distribution of household sizes in Mulanje district?
2 What proportion of children aged 6 and attending standard
1 in Mangochi sleep under a mosquito net?
3 What is the distribution of ages of University students in
Malawi?

Some Important terms

Target population: Total population about which information is required, e.g. all University students at the time of the study
Study population: The set of individuals from which individuals to be studied will be selected, e.g. all those attending classes during the study period (when data collection takes place)
Often these are identical or very similar, but not always

Some Important terms Cont...

Population characteristic: The aspect(s) of the population to be studied, e.g. mean age, proportion of babies who sleep under a net
Sampling units: The persons or groupings used to select sample members, e.g. households
Sampling frame: Set of sampling units, e.g. schools in a village
List: A real list of units in the sampling frame

Some notation

Population size: N
Sample size: n
Sampling fraction: f = n/N

Probability Sampling Methods

1 Simple Random sample
2 Systematic sample
3 Stratified sample
4 Cluster sample
5 Multi-stage sample

Simple Random sample (SRS)

Each and every member of the study population has the same chance of being selected into the sample.
The chance is equal to the sampling fraction f, where f = n/N.
Requirements:
A list of all members of the sampling frame
Possible methods:
Pieces of paper in a hat / drum
Random digit tables
Use random digit methods in a software package

Replacement

Sample without replacement: once selected, a sampling unit cannot be drawn again
Sample with replacement: after being selected, a sampling unit can still be drawn again (same chance each time)
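
The difference between the two schemes can be illustrated with Python's standard `random` module. This is a minimal sketch: the 10-unit frame and the seed value are arbitrary assumptions chosen only for illustration.

```python
import random

rng = random.Random(2021)   # arbitrary fixed seed so the run is repeatable
frame = list(range(1, 11))  # hypothetical sampling frame with N = 10 units

# Without replacement: each unit can appear at most once in the sample.
without_replacement = rng.sample(frame, k=5)

# With replacement: every draw is made from the full frame,
# so the same unit may be drawn more than once.
with_replacement = rng.choices(frame, k=5)

print(without_replacement)
print(with_replacement)
```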

Simple Random Sample (WITHOUT Replacement)

Step 1: List the N subjects in the study population. This is the list of the sampling frame.
Step 2: Number entries in the listing from 1 to N
Step 3: Select n random numbers between 1 and N
Step 4: Use the list of the sampling frame to identify each
individual corresponding to the ID numbers selected
Step 5: Locate each and seek their consent to participate
in the survey
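
Steps 1 to 4 can be sketched in Python as follows. The function name `simple_random_sample`, the made-up subject labels, and the sizes N = 500 and n = 30 are assumptions for illustration only. Step 5 (locating people and seeking consent) is fieldwork and is not represented.

```python
import random


def simple_random_sample(frame_list, n, seed=None):
    """Select n units from frame_list without replacement (Steps 1 to 4)."""
    N = len(frame_list)                      # Step 1: the listed study population
    rng = random.Random(seed)
    # Steps 2 and 3: number the entries 1..N and pick n distinct ID numbers.
    ids = sorted(rng.sample(range(1, N + 1), n))
    # Step 4: map each selected ID number back to the listed individual.
    return [(i, frame_list[i - 1]) for i in ids]


# Hypothetical frame of 500 subjects, from which 30 are selected.
frame_list = [f"subject_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame_list, n=30, seed=1)
print(sample[:3])
```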

Selecting n random numbers using Excel

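Use the function =RANDBETWEEN(1, N)
Repeat at least n times; the function can return the same number more than once, so discard repeats and keep drawing until n distinct numbers are obtained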
Example
Select a SRS of 30 subjects from a population of 500
N = 500
n = 30
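
For readers without Excel, the same draw-and-repeat idea can be sketched in Python. `random.randint(1, N)` plays the role of RANDBETWEEN(1, N) here, since both include the endpoints. The function name `draw_distinct_ids` and the seed are assumptions. Because single draws can repeat, the loop keeps drawing until n distinct ID numbers are held, which is why more than n draws may be needed.

```python
import random


def draw_distinct_ids(N, n, seed=None):
    """Draw integers in 1..N one at a time, discarding repeats,
    until n distinct ID numbers have been collected."""
    rng = random.Random(seed)
    selected = set()
    draws = 0
    while len(selected) < n:
        selected.add(rng.randint(1, N))  # randint includes both endpoints
        draws += 1
    return sorted(selected), draws


# The example from the slide: N = 500, n = 30.
ids, draws = draw_distinct_ids(N=500, n=30, seed=7)
print(len(ids), draws)
```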

Stratified random sampling

Stratification is the process of grouping the units within a population of interest into homogeneous sub-groups called strata
All strata should be mutually exclusive, that is, every unit within the population of interest can only be assigned to one stratum.
Collectively the strata should also be exhaustive, so that all units are covered by one of the strata

Stratified random sampling cont...

A stratified random sample can be chosen by following the steps below:
Divide the population into groups called strata: The population should be split into groups according to some characteristic that is related to the subject of the survey
A sample is selected from within each stratum using the SRS method. We determine the number of units to be selected from each stratum using an allocation method. The methods of allocation include equal, proportional, and optimal allocation.
The samples for each stratum are collated to form the total
sample of the population. This ensures that each stratum
is represented in the sample.
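
The three steps can be sketched in Python as below. Everything named here is hypothetical: the function `stratified_sample`, the two strata, and the per-stratum sample sizes are assumptions chosen only to show the mechanics of drawing an SRS within each stratum and collating the results.

```python
import random


def stratified_sample(strata, sizes, seed=None):
    """Draw an SRS (without replacement) of sizes[name] units from each
    stratum and collate the per-stratum samples into one sample."""
    rng = random.Random(seed)
    sample = []
    for name, units in strata.items():
        chosen = rng.sample(units, sizes[name])
        sample.extend((name, unit) for unit in chosen)
    return sample


# Hypothetical population split into two mutually exclusive strata.
strata = {
    "year_1": [f"y1_{i}" for i in range(60)],
    "year_2": [f"y2_{i}" for i in range(40)],
}
sizes = {"year_1": 6, "year_2": 4}  # assumed allocation of n = 10
sample = stratified_sample(strata, sizes, seed=3)
print(sample)
```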

Allocating the Sample among the Strata

Once we have split our population into strata, we need to work out how many units to sample from each stratum.
There are three methods of allocating a sample of size n among the different strata: equal allocation, proportional allocation, and optimal allocation
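
The slides name the allocation methods without giving formulas, so the sketch below uses the usual textbook definitions as an assumption: equal allocation gives every stratum the same number of units, and proportional allocation gives stratum h a share of n equal to N_h / N. Optimal allocation is left out because it needs extra inputs (such as within-stratum variability) that are not given here. The function names and stratum sizes are hypothetical.

```python
def equal_allocation(stratum_sizes, n):
    """Give every stratum the same number of sample units."""
    per_stratum = n // len(stratum_sizes)  # any remainder is dropped
    return {name: per_stratum for name in stratum_sizes}


def proportional_allocation(stratum_sizes, n):
    """Give each stratum a share of n proportional to N_h / N."""
    N = sum(stratum_sizes.values())
    # Rounding can make the total differ slightly from n in general.
    return {name: round(n * N_h / N) for name, N_h in stratum_sizes.items()}


stratum_sizes = {"A": 500, "B": 300, "C": 200}  # hypothetical stratum sizes N_h
equal = equal_allocation(stratum_sizes, n=90)
proportional = proportional_allocation(stratum_sizes, n=100)
print(equal)
print(proportional)
```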

Advantages

1 The results of stratified random samples tend to be more accurate (have lower variance) since the grouping together of similar units controls for the variation within strata.
2 The sample obtained through stratification is more
representative of the population
3 Stratification also permits separate analyses on each group,
which researchers may find useful

Disadvantages

1 This method is more costly and difficult to organize, since it involves splitting the population into different strata and taking a sample from each stratum
2 There is a danger of splitting the population into too many
small strata. This may mean that some of the strata may
not contain any sample members or the sample may not be
large enough to be spread across all of the strata
3 Sometimes there may be more than one variable that the
survey needs to be stratified by

Systematic Random Sampling

Systematic random sampling:
Use the anticipated population size and planned sample size to determine the sampling fraction f to be used
Determine a sequence in which sampling units are added to the list, e.g. entry in a register, order on a route
Determine the sampling interval k = 1/f
Randomly select a number between 1 and k
Select this sampling unit
Then select every k-th sampling unit thereafter

Example

Target population: Patients attending the Out Patient Department (OPD) at QECH
Number of patients expected in study period = 20,000
Sample size = 200
Sampling fraction f = 200/20,000 = 1/100; k = 1/f = 100
Select a random number between 1 and 100, say 42
Approach the 42nd patient, then the 142nd, 242nd, etc.
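
The OPD example can be reproduced with a short Python sketch. The function name `systematic_sample` and its parameters are assumptions; the figures (20,000 patients, sample of 200, random start of 42) are taken from the example.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the positions selected by systematic sampling."""
    k = N // n  # sampling interval, k = 1/f = N/n
    if start is None:
        # Random start between 1 and k (both included).
        start = random.Random(seed).randint(1, k)
    # Select the start unit, then every k-th unit thereafter.
    return list(range(start, N + 1, k))


positions = systematic_sample(N=20_000, n=200, start=42)
print(positions[:3], len(positions))  # [42, 142, 242] 200
```

Passing `start=42` fixes the random start so the output matches the example; leaving it out draws the start at random.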

Cluster Random Sampling

Cluster sampling:
Used when members of the study population are naturally in groups, called clusters, e.g. villages for residence, schools for education, health center catchment areas for health care, etc.
Obtain a simple random sample of clusters
Sample members from the selected clusters only
May select only a subset of the members of the selected clusters

Cluster Sampling Example

What proportion of standard 1 students sleep under a mosquito net in Mangochi district?
Study population: Standard 1 students aged 6 in Mangochi district
Population size: approximately 3,000
Number of schools = 54
Randomly select 7 schools and obtain data for every standard 1 student in the chosen schools
Final sample size is approximately 3,000 × 7/54 ≈ 389
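
The cluster selection and the expected sample size can be checked with a few lines of Python. The school labels 1 to 54 and the seed are assumptions; the counts (54 schools, 7 selected, about 3,000 students) come from the example. The calculation treats the approximate sample size as the population size multiplied by the fraction of schools selected.

```python
import random

rng = random.Random(54)       # arbitrary fixed seed so the run is repeatable
schools = list(range(1, 55))  # 54 schools, labelled 1..54 (labels are assumptions)

# Simple random sample of 7 clusters (schools).
selected_schools = sorted(rng.sample(schools, 7))

population_size = 3000
p_school = 7 / len(schools)   # probability that any one school is selected
expected_sample_size = population_size * p_school

print(selected_schools)
print(round(p_school, 2), round(expected_sample_size))  # 0.13 389
```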

Do all members of the study population have a known probability of being included in the sample?
Yes:
probability a school is selected = 7/54 ≈ 0.13
since all students in selected schools are selected, this is also the probability a student is selected
Sometimes sampling of clusters uses sampling in proportion to size

What are the sampling units?
In cluster sampling the primary sampling units are the clusters
Individuals that make up the clusters are secondary sampling units
For the standard 1 students example:
primary sampling units: schools
secondary sampling units: students

Multistage cluster sampling

The End.
