Chapter 4
DATA COLLECTION
Lecturer: Le Hoai Kieu Giang
Email: [email protected]
Contents:
• Secondary Data
• Primary Data
• Experimentation
[Figure: research planning flow. Research objectives and the literature / research model determine what kind of data need to be collected; the next step is to identify the data source: secondary data sources or primary data sources, internal and external.]
Research data
• Secondary data
• Primary data: observational data, survey data, experimental data, others
1. Secondary data
Secondary data has already been collected and processed to serve a certain goal, which may differ from the goal of the research at hand.
Pros and cons
Secondary data can be used to:
• Provide information to formulate the research
• Propose methods and types of primary data to collect
• Provide the basis for comparing and evaluating/interpreting
primary information
➢ Secondary data sources: internal sources
• Marketing documents and databases
• HR documents and databases
• Financial documents and databases
• Operations documents and databases
• Other internal documents
➢ Secondary data sources: external sources
• Associations, research articles, conferences, newspapers
• Governmental organizations, non-governmental organizations, tax departments, general statistics offices, ...
• Other external documents and databases
Summary of primary and secondary data
2. Primary data
• Primary data is collected specifically for the research topic at hand.
• It is used when secondary data is insufficient or unsatisfactory.
• Data collected with one specific objective may not be appropriate in another situation.
• Obtaining primary data can be expensive and time-consuming.
• Research problems that require primary data collection can raise ethical concerns.
➢ How is primary data collected?
Communication
• Participants actively provide ideas and information related to the research problem through direct or indirect communication with the researcher (e.g., interviews, telephone conversations, surveys).
Observation
• Participants are completely passive in the process of providing data.
Communication vs. observation
Utility and flexibility
• Communication: high. Survey participants can be asked about their feelings, intentions, and opinions.
• Observation: low. Only suitable for variables that have external, observable manifestations.
Time and cost
• Communication: usually faster and less expensive.
• Observation: longer and more expensive.
Accuracy and reliability
• Communication: depends on the research problem, the collection method, the nature of the data, and the honesty of respondents.
• Observation: depends on the method and the tools.
• All else being equal, the observation method will often give more reliable results.
Convenience for respondents
• Communication: less convenient.
• Observation: more convenient.
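Choosing between the observational method and the communication method:
• Can the object to be studied be accurately observed? If no, use the communication method; if yes, go to the next question.
• Can observations be completed within the duration of the research project? If no, use the communication method; if yes, go to the next question.
• Is the budget enough for observation? If no, use the communication method; if yes, use the observational method.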
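The three yes/no questions above can be written as a small function. This is only a sketch of the decision flow on the slide; the function and parameter names are illustrative assumptions, not part of the course material.

```python
def choose_collection_method(accurately_observable: bool,
                             fits_project_duration: bool,
                             budget_sufficient: bool) -> str:
    """Walk the three questions in order; the first 'no' selects communication."""
    if not accurately_observable:
        return "communication"
    if not fits_project_duration:
        return "communication"
    if not budget_sufficient:
        return "communication"
    # All three answers are 'yes', so observation is feasible.
    return "observation"


# Example: the object is observable and time allows, but the budget does not.
print(choose_collection_method(True, True, False))
```

Observation is chosen only when every answer is yes, so the communication method acts as the fallback whenever observation is infeasible.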
Observational methods are classified by setting and by the use of instruments:
• Natural setting, without instruments
• Natural setting, with instruments
• Laboratory, without instruments
• Laboratory, with instruments
RFID systems are used to track product movement and collect data on customer behavior.
The UPC system (also called bar code),
together with optical scanners, allows for
mechanized information collection regarding
consumer purchases by product category,
brand, store type, price, and quantity.
Eye-tracking equipment, such as oculometers, eye cameras, or eye view
minuters, records the gaze movements of the eye.
These devices can be used, for example, to determine how a respondent reads an advertisement or views a TV commercial, and for how long the respondent looks at various parts of the advertisement or of a product's packaging.
The psychogalvanometer measures galvanic skin response (GSR), or changes in the electrical resistance of the skin that relate to a respondent’s affective state.
Real research
In the department store project, license plate observations could be used to establish the primary trading area of a
shopping mall. These observations help marketers determine where their customers live. In a license plate study,
observers record the license plate numbers of the automobiles in a parking lot. These numbers are fed into a
computer and paired with automobile registration data. This results in a map of customers located by census tract
or zip codes. Such a map, along with other demographic data, can help a department store chain determine new
locations, decide on billboard space, and target direct marketing efforts. License plate observation studies cost
less ($5,000 to $25,000) and are believed to be quicker and more reliable than direct communication methods
such as interviews with shoppers.
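The pairing step described above (observed plate numbers matched against registration data, then grouped by area) might look like the following sketch. The plate numbers, ZIP codes, and the `customers_by_zip` helper are all hypothetical and are used only for illustration.

```python
from collections import Counter


def customers_by_zip(observed_plates, registrations):
    """Count distinct observed vehicles per ZIP code.

    registrations maps plate number -> ZIP code (hypothetical lookup table).
    Plates missing from the registration data are returned separately.
    """
    counts = Counter()
    unmatched = []
    for plate in set(observed_plates):  # count each vehicle once
        zip_code = registrations.get(plate)
        if zip_code is None:
            unmatched.append(plate)
        else:
            counts[zip_code] += 1
    return counts, sorted(unmatched)


# Hypothetical data, for illustration only.
registrations = {"ABC-1001": "30301", "ABC-1002": "30301", "XYZ-2001": "30305"}
observed_plates = ["ABC-1001", "XYZ-2001", "ABC-1002", "ABC-1001", "QRS-9999"]

counts, unmatched = customers_by_zip(observed_plates, registrations)
print(counts.most_common())  # ZIP codes ordered by number of customers
print(unmatched)             # plates with no registration match
```

The per-ZIP counts are the raw material for the trading-area map; the unmatched plates show how much of the sample could not be located.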
➢ Communication methods:
• Based on the "question - answer" process.
• Instrument: the questionnaire, which is used in many different formats and with different administration methods.
Questionnaire
• Format: structure and disguise
• Administration method: personal interview and mail survey
Structure
• Questions and number of questions: the same or different for each respondent.
• Order of questions: the same or different for each respondent.
• Answers: open-ended (respondents answer in their own words) or prespecified (a fixed set of response alternatives and a fixed response format).
Structured vs. unstructured questions
Flexibility
• Structured: different research populations can be studied.
• Structured: the demands on respondents' literacy and communication skills are not too high.
• Structured: multiple research topics may be included in one interview or questionnaire.
• Unstructured: provides many new ideas.
• Unstructured: allows for detailed responses.
Convenience for the respondent
• Structured: more convenient, because answering takes less time and is easier.
• Unstructured: less convenient for respondents.
Time
• Structured: it takes little time to respond.
• Structured: the collected data can be transferred quickly to a computer for analysis.
• Unstructured: it takes less time to design the questionnaire but more time to collect the answers.
Cost
• Structured: lower cost.
• Unstructured: higher cost.
Accuracy
• Structured: fewer interviewer errors and respondent errors.
• Unstructured: respondents' answers must be interpreted comprehensively and accurately.
Disguise (degree of directness)
The extent to which the respondent knows or does not know the purpose of the question.
DIRECT QUESTIONS – INDIRECT QUESTIONS
Indirect questioning is a widely used approach for securing opinions on
sensitive topics. The participants are asked how “other people” or
“people you know” feel about a topic.
Example:
• Direct question: Have you downloaded copyrighted films from the Internet without paying for them?
• Indirect question: Do you know people who have downloaded copyrighted films from the Internet without paying for them?
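Questionnaire format, by the conclusiveness of the findings (degree of certainty of the results) and respondents' willingness/ability to answer direct questions:
• High conclusiveness (confirmatory), high willingness/ability: structured and undisguised (direct)
• High conclusiveness (confirmatory), low willingness/ability: structured and disguised (indirect)
• Low conclusiveness (exploratory), high willingness/ability: nonstructured and undisguised
• Low conclusiveness (exploratory), low willingness/ability: nonstructured and disguised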
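The two-by-two matrix above can be read as two independent choices: conclusiveness determines structure, and willingness/ability determines disguise. A minimal sketch follows, with hypothetical function and argument names.

```python
def questionnaire_format(conclusiveness: str, willingness: str) -> str:
    """Map the two criteria ('high' or 'low') to a questionnaire format.

    conclusiveness: required conclusiveness of the findings.
    willingness: respondents' willingness/ability to answer direct questions.
    """
    levels = ("high", "low")
    if conclusiveness not in levels or willingness not in levels:
        raise ValueError("both arguments must be 'high' or 'low'")
    # High conclusiveness (confirmatory research) calls for structured questions.
    structure = "structured" if conclusiveness == "high" else "nonstructured"
    # High willingness/ability allows direct (undisguised) questions.
    disguise = "undisguised" if willingness == "high" else "disguised"
    return f"{structure} and {disguise}"


print(questionnaire_format("high", "low"))
```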
Administration method
Personal interview
• Direct Q&A between interviewer and interviewee (face-to-face)
Mail survey
• No direct communication; all contact is through the questionnaire
Ranking of administration methods by criterion (1st, 2nd, 3rd):
• Number of questions: personal, mail, telephone
• Variety of data: personal, telephone, mail
• Duration to collect data: telephone (fastest), personal, mail
• Cost: mail (cheapest), telephone, personal
Ranking of administration methods by criterion (1st, 2nd, 3rd):
• Ability to control the respondent: personal, telephone, mail
• Opportunity to explain: personal, telephone, mail
• Convenience for the informant: mail, telephone, personal
Response rate: The percentage of the total attempted interviews that
are completed.
Nonresponse error: occurs when the researcher (1) cannot locate the
person (the predesignated sample element) to be studied or (2) is
unsuccessful in encouraging that person to participate.
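As a quick arithmetic illustration of the response rate definition above, the sketch below divides completed interviews by attempted interviews. The function name and the figures are hypothetical.

```python
def response_rate(completed: int, attempted: int) -> float:
    """Percentage of attempted interviews that were completed."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    if not 0 <= completed <= attempted:
        raise ValueError("completed must be between 0 and attempted")
    return 100.0 * completed / attempted


# Hypothetical figures: 400 interviews attempted, 120 completed.
print(response_rate(120, 400))  # 30.0
```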
How can the response rate be improved and the nonresponse error reduced?
3. Experimentation
An experiment is formed when the researcher
manipulates one or more independent
variables and measures their effect on one
or more dependent variables, while
controlling for the effect of extraneous
variables.
Experimentation is commonly used to infer
causal relationships.
• Independent variables: variables or alternatives that are manipulated (i.e., the levels of these variables are changed by the researcher) and whose effects are measured and compared.
• Dependent variables: the variables that measure the effect of the independent variables on the test units.
• Extraneous variables: all variables other than the independent variables that affect the response of the test units.
• Test units: individuals, organizations, or other entities whose response to the independent variables or treatments is being studied.
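To make the four terms above concrete, here is a minimal sketch of a hypothetical experiment: stores are the test units, an in-store display is the independent variable (two levels), and weekly sales are the dependent variable. The data and names are invented for illustration. The difference in group means can be read as the effect of the display only under the assumption that extraneous variables are controlled, for example by assigning stores to the two levels at random (an assumption added here, not stated on the slide).

```python
from statistics import mean

# Hypothetical observations: (test unit, level of independent variable, dependent variable).
observations = [
    ("store_1", "display", 120.0),
    ("store_2", "display", 130.0),
    ("store_3", "no_display", 100.0),
    ("store_4", "no_display", 110.0),
]


def mean_difference(observations, treated_level, control_level):
    """Difference in the mean dependent variable between two treatment levels."""
    treated = [y for _, level, y in observations if level == treated_level]
    control = [y for _, level, y in observations if level == control_level]
    if not treated or not control:
        raise ValueError("each level needs at least one test unit")
    return mean(treated) - mean(control)


effect = mean_difference(observations, "display", "no_display")
print(effect)  # 20.0
```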
What is the causal effect?
Experiment validity
Internal validity: a measure of the accuracy of an experiment. It measures whether the manipulation of the independent variables actually caused the effects on the dependent variable(s).
External validity: a determination of whether the cause-and-effect relationships found in the experiment can be generalized.
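Laboratory experiment (artificial setting) vs. field experiment (real setting)
• Internal validity: high in a laboratory experiment, low in a field experiment
• External validity: low in a laboratory experiment, high in a field experiment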