Lecture Notes 4
Unit II (Unit 2: Meaning and Definition of a Research Problem)
Meaning and Definition of a Research Problem: Selection, sampling, steps, types,
sample size, testing hypothesis-I (parametric or standard test of hypothesis),
testing hypothesis-II (non-parametric or distribution-free tests).
Unit III (Unit 3: Sampling Design)
Sampling Design: Census and sample survey, implications of sample design, steps in
sampling survey, types of universe, sampling unit, source list, size of sample,
parameters of interest, budgetary constraint, sampling procedure, criteria of
selecting a sampling procedure, characteristics of a good sample design, different
types of sample design, how to select a random sample, random sample from an
infinite universe, complex random sample design.
Sampling Fundamentals: Need for sampling, some fundamental definitions, important
sampling distributions, central limit theorem, sampling theory, Sandler's A-test,
concept of standard error, estimation, estimating the population mean, estimating
population proportion, sample size through the approach based on precision rate and
confidence level, determining sample size through the approach based on Bayesian
statistics.
Unit IV (Unit 4: Research Design)
Research Design: Meaning of research design, need for research design, features of
a good design, important concepts relating to research design, different research
designs, basic principles of experimental design, conclusions, developing research
plan.
Experimental Designs: Between group designs, within group designs, mixed designs,
latin square designs.
Non-experimental Design and Correlational Methods: Non- and quasi-experimental
designs, correlational designs, newer social methods, advanced correlational
methods, discriminant function analysis.
Qualitative Methods: Definition and aim of qualitative research, construction of
reality, subjectivity, dynamic research, process of documentation of qualitative
research.
Unit V (Unit 5: Role of Computers in Research, Interpretation and Report Writing)
Computer and Its Role in Research: Computer applications, the computer system,
important characteristics, the binary number system, computers and researchers.
Interpretation and Report Writing: Meaning, reasons, techniques, precautions, steps
in report writing, layout, types, oral presentation and precautions in report
writing.
CONTENTS
INTRODUCTION
UNIT 1 MEANING OF RESEARCH
1.0 Introduction
1.1 Unit Objectives
1.2 Meaning of Research
1.3 Objectives of Research
1.4 Types of Research
1.5 Approaches to Research
1.6 Significance of Research
1.7 Method vs. Methodology
1.8 Research Process
1.9 Flow Chart
1.10 Criteria of Good Research
1.11 Problems Encountered by Indian Researchers
1.12 Measurement Scales
1.13 Sources of Error in Measurement
1.14 Test of Sound Measurement
1.14.1 Test of Validity
1.14.2 Test of Reliability
1.14.3 Test of Practicality
1.15 Summary
1.16 Key Terms
1.17 Answers to 'Check Your Progress'
1.18 Questions and Exercises
1.19 Further Reading
UNIT 3 SAMPLING DESIGN
3.0 Introduction
3.1 Unit Objectives
3.2 Sampling Design
3.2.1 Steps in Sampling Designing
3.2.2 Principles for Selecting a Sampling Procedure
3.2.3 Systemic Bias and Sampling Errors
3.2.4 Criteria of Good Sample Design
3.2.5 Types of Sampling Designs
3.2.6 Types of Sampling
3.3 Statistical Distributions
3.4 Summary
3.5 Key Terms
3.6 Answers to 'Check Your Progress'
3.7 Questions and Exercises
3.8 Further Reading
Unit 2 throws light on how to define a research problem and formulate a
hypothesis. The unit further explains parametric and non-parametric tests and
guides how to choose different kinds of tests to test a hypothesis.
Unit 3 explains the concept of sampling design and different types of sampling
procedures. The unit also explains probability and non-probability sampling
techniques, different types of sampling errors and how to minimize them.
Unit 4 explores the concept of research design and different types of research.
It also discusses how to differentiate between qualitative and quantitative
research.
Unit 5 examines how computers have revolutionized research work. The unit also
explains how to write a research report, interpret the results and draw
conclusions.
Each unit in this book is supplemented with Summary, Key Terms, Answers to
'Check Your Progress', Questions and Exercises and Further Reading sections
to aid the reader.
1.0 INTRODUCTION
In layman's terms research means the search for knowledge. Scientific research is
a systematic and objective way of seeking answers to certain questions that require
inquiry and insight or that have been raised on a particular topic. The purpose of
research, therefore, is to discover and develop an organized body of knowledge
in any discipline.
Self-Instructional Material
The ultimate aim of social science research is the control and prediction of
behaviour.
Knowing how to do research: Every researcher should have the necessary
training in gathering data, organizing materials suitably and engaging in field or
laboratory work, as required. He should also have the competence in using statistics
for treating the data and the ability to interpret the data collected meaningfully.
Training of the mind: Research needs discipline, the right mental makeup, the
ability to manage time effectively, objectivity, logical thinking, the capacity to
evaluate the results of the research and the ability to carefully assess the findings
that are found by the research.
Research then allows people to make informed decisions by extrapolating
the findings from the field or laboratory on to real-life situations. This is the
practical application of the findings generated by research.
Research is also a way of preparing the mind to look at things in a fresh or
different way. Out of such an orientation would come new and innovative
observations about everyday events and happenings. This is how originality comes
about in research. Some of the most outstanding discoveries have been made in
the most serendipitous manner. Some outstanding results have been obtained by
researchers who had kept their minds open and free of clutter. This enabled them
to see startlingly new connections.
Research in common parlance refers to search for knowledge. One can also define
research as a scientific and systematic search for pertinent information on a specific
topic. In fact, research is an art of scientific investigation. According to the Advanced
Learner's Dictionary of Current English, 'research is a careful investigation or
enquiry especially a thorough search for new facts in any branch of knowledge.'
Redman and Mory (1923) defined research as a 'systematized effort to gain new
knowledge.' Some people consider research as a voyage of discovery that involves
movement from the known to the unknown.
Research in a technical sense is an academic activity. Clifford Woody defined
research as an activity that comprises defining and redefining problems; formulating
a hypothesis; collecting, organizing and evaluating data; making deductions and
reaching conclusions; and carefully testing the conclusions to determine if they
support the formulated hypothesis. D. Slesinger and M. Stephenson, in the
Encyclopaedia of Social Sciences, defined research as 'the manipulation of things,
concepts or symbols for the purpose of generalizing, extending, correcting or
verifying the knowledge, whether that knowledge aids in the construction of theory
or in the practice of an art.' Research is thus an original contribution to the existing
stock of knowledge making for its advancement.
Principles of a typical research:
o It is based on empirical data
o It involves precise observations and measurements
o It is aimed at developing theories, principles and generalizations
o There are systematic, logical procedures involved
o It is replicable
o The findings of the research need to be reported
A flow chart in terms of how research proceeds is presented in Section 1.9
for better understanding, where the criteria of good research have been identified
in easy steps.
Finally, one is exposed to some of the specific problems that researchers in
India have to face. Measurement scales are introduced for the benefit of the
researcher in terms of their mathematical properties. The major scales that have
been widely used have been presented and discussed in some detail. The scales
are:
o Nominal scale
o Ordinal scale
o Interval scale and
o Ratio scale
Their differences and significance in terms of use, and the sources of error in
measurement, have been dealt with in detail in the following sections so as to
facilitate precise measurements in research work.
1.3 OBJECTIVES OF RESEARCH
conclusion (for a particular problem). Doing research on a current social or
1.5 APPROACHES TO RESEARCH
Quantitative approach and qualitative approach are the two basic approaches to
research. These two paradigms are based on two different and competing ways
of understanding the world. These competing ways of comprehending the world
are reflected in the way the research data is collected (e.g., words vs numbers),
and the perspective of the researcher (perspectival vs objective). The perspectives
of the participants are very critical.
(i) Quantitative approach: If there has been one overwhelming consensus
among academic psychologists on a single point over the past few decades,
it is that the best empirical research in the field is firmly grounded in quantitative
methods. In this approach, data is generated in quantitative form and then
subjected to rigorous quantitative analysis in a formal and rigid
fashion. Inferential, experimental and simulation approaches are the
sub-classifications of quantitative approach. Inferential approach to research
focuses on survey research where databases are built studying samples of
population and then these databases are used to infer characteristics or
relationships in populations. In experimental approach, greater control is
exercised over the research environment and often, some variables
(independent variables) are controlled or manipulated to record their effects
CHECK YOUR PROGRESS
1. What is scientific research?
2. Differentiate between fundamental and applied research.
3. Differentiate between conceptual and empirical research.
4. Differentiate between quantitative and qualitative approach.
Table 1.1 Types of Research
Laboratory
Research
Small group study of random
behaviour, play and role
character.
Use of audio-visual recording devices,
use of observers, etc.
analysis
(viii) Analysis of data: After collecting the data, the next step is analyzing the
data. The data analysis includes a number of closely related operations like
specifying different categories of data, differentiating and tabulating the data
into different categories, applying the statistical techniques and formulae to
the data, doing the right calculations and then drawing statistical inferences.
Various tests, such as the chi-square test, t-test, F-test, etc., help in data analysis.
(ix) Hypothesis-testing: After analyzing the data, the researcher should test
the working hypothesis against the statistical inferences obtained after
analyzing the data. The question that should now be answered is: Do the
findings support the working hypothesis or do they contradict it?
(x) Generalizations and Interpretation: If a hypothesis is tested and upheld
a sufficient number of times, the researcher can arrive at a generalization.
The degree of success of a research is judged on the basis of how
close the generalizations arrived at are to acceptability. If the
researcher starts with no hypothesis, the researcher will interpret his
findings on the basis of some existing theory and this is known as
interpretation. The process of interpretation often triggers new questions
which lead to further research.
(xi) Preparation of the report or the thesis: Finally, the researcher has to
prepare the report of what has been studied. The report must be written with
great care keeping the following layout in mind:
a. The preliminary pages: The preliminary pages of the report should
contain the title, the date, acknowledgments, foreword, table of
contents, list of tables, list of graphs and charts (if any).
b. The main text: The main text of the report should have an introduction,
summary of findings, main report, conclusion and suggestions for future
research.
c. The closure: At the end of the report, appendices should be listed in
respect of all technical data, followed by the bibliography. Index terms
should also be given, especially in a published research report. All
references should be cited as per the research writing formats.
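Steps (viii) and (ix) above can be illustrated with a minimal sketch of a t-test
calculation. The text names the t-test but does not give a formula, so the version
below (Welch's t statistic for two independent samples) is one possible choice, and
the scores, variable names and function name are hypothetical. The resulting value
would still have to be compared with a tabled critical value before deciding whether
the working hypothesis is supported.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    var_a = variance(sample_a)  # sample variance (n - 1 in the denominator)
    var_b = variance(sample_b)
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical scores for two groups (illustrative data, not from the text)
group_a = [12, 15, 14, 10, 13]
group_b = [9, 11, 10, 8, 12]

t_value = welch_t(group_a, group_b)
print(f"t = {t_value:.3f}")
```

A larger absolute value of t indicates a larger difference between the group means
relative to the variability within the groups.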
1.9 FLOW CHART
RESEARCH PROCESS IN FLOW CHART
In Figure 1.1 the flow chart indicates the sequential steps to be followed in
the research process. One must start with defining the research problem along
with reviewing the relevant literature in the field to become familiar with the concepts
and theories relevant to the issue to be investigated. The next step is the formulation
of the hypothesis, which is followed by the research design and sample selection.
Then the collection of data and its analysis is to be attempted. After that the
interpretation and the report writing stages complete the research report. These
have to be written step by step and then edited and refined several times before
preparing the final report.
Good research also implies obtaining reliable data which provides sound
validity to the research findings.
The following principles underlie the criteria of good research:
o The aim and objective of the research being conducted should be clearly
specified;
o The research procedure should be replicable so that if the research needs
to be continued or repeated, it can be done easily;
o The research design should be so chosen that the results are as objective as
possible.
Interpretation of any research should be done keeping in mind the flaws in
the procedural design and the extent to which it has an effect on the results.
easily accessible
o Poor encouragement to do research
These problems need to be surmounted effectively in order to promote
research as a professional activity.
There are three important properties that make one scale of measurement different
from another: (i) Magnitude, (ii) Equal intervals, and (iii) Absolute zero.
(i) Magnitude refers to the property of 'moreness'. Any attribute that is being
measured can be more, less or equal at one instance as compared to another
instance. For example, Vijay is taller than Karan.
(ii) Equal intervals: This means the difference between any two points at any
place on the scale has the same meaning as the difference between two
other points that differ by the same number of scale units. For example, the
difference between 20 kg and 30 kg on a weight measurement is the same
as between 80 kg and 90 kg, that is, 10 kg.
(iii) Absolute zero: It is a condition when nothing of the property being measured
exists. For example, if there is no pulse measured, the situation is alarming
for the body.
For many psychological tests such a condition does not exist. For example,
in the case of anxiety, there cannot be a 'zero' for there can be no situation where
there is absolutely no anxiety.
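As a compact summary of how the three properties above separate the four
measurement scales listed in this unit, the following sketch records which scale
has which property. The mapping is the standard textbook one; the dictionary and
function names are illustrative only.

```python
# Which of the three properties each level of measurement possesses.
SCALE_PROPERTIES = {
    "nominal":  {"magnitude": False, "equal_intervals": False, "absolute_zero": False},
    "ordinal":  {"magnitude": True,  "equal_intervals": False, "absolute_zero": False},
    "interval": {"magnitude": True,  "equal_intervals": True,  "absolute_zero": False},
    "ratio":    {"magnitude": True,  "equal_intervals": True,  "absolute_zero": True},
}


def scales_with(prop):
    """Return the names of the scales that possess the given property."""
    return [name for name, props in SCALE_PROPERTIES.items() if props[prop]]


print(scales_with("magnitude"))
print(scales_with("absolute_zero"))
```

Weight in kilograms, for instance, has all three properties, whereas an anxiety
score of the kind described above lacks an absolute zero.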
Any measurement is a yardstick for evaluation. It involves some form of
judgment. Measurement can be made of physical objects, people, situations, etc.
It can be quantitative like height, area, etc. or qualitative like personality traits or
abstract (like patriotism). Measurement is a process of assigning some value to
the observed phenomena. There are clearly accepted rules under which these
evaluations are to be made. Some tasks are easy while others are complex and
difficult. Some are direct, like weight, while others like motivation, leadership, etc.,
have to be inferred. Some measurements are very precise whereas others are
abstract and tentative. The attempt of all measurement is to achieve confidence in
the evaluations that are made.
Measurements are generally presented in the form of a scale in a range. For
example, if a movie is watched by men, women and children, we assign 0 to
men, 1 to women and 2 to children in order to tabulate our observations of the
audience. Similarly, the people who attend alone, as a couple or in a group can
be allotted letters like A, C or G for classification purposes. This is the process
of artificially determining categories. These can further be divided qualitatively or
descriptively. There are four levels or types of scales for measurement: nominal
scale, ordinal scale, interval scale and ratio scale.
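The audience example above can be sketched in a few lines. The codes 0, 1 and 2
follow the text; the list of observations is hypothetical. Because the codes are
only labels, the sketch counts them rather than averaging them.

```python
from collections import Counter

# Codes from the example: the numbers are labels only and carry no order.
AUDIENCE_CODES = {0: "men", 1: "women", 2: "children"}

# Hypothetical observations recorded as codes (illustrative data)
observed = [0, 1, 1, 2, 0, 1, 2, 2, 1]

# Tabulate how many observations fall into each category.
counts = Counter(AUDIENCE_CODES[code] for code in observed)
print(counts)
```

Taking the mean of such codes would be meaningless, since the choice of 0, 1 and 2
is arbitrary.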
Nominal scales: This is a system of measurement where a number is given
to label an event. These are merely convenient labels and do not have any special
meaning or significance. The numbers assigned do not have any quantitative value.
They do not indicate any order or distance. For example, one player of a baseball
team is given jersey no. 3 and another 5. This is neither a rank, nor a distance
between them. It is only a way of categorizing them as team members. Nominal
scale measures offer a count of data. This is a useful way of classification.
o Situation: This involves the context in which any measurement is made. If
the context places any type of stress on the individual, then the response
would reflect it. Such a response, therefore, would not be a true index of
the topic of measurement. Even the mere presence of another person can
affect the responses obtained. It is important to keep the situation balanced
and free of distorting influences.
o The interviewer/measurer effect: The way questions are put forth can
reflect the bias of the experimenter/researcher. The style, mannerisms,
appearance, language, expression, etc., all could distort the presentation.
Any casualness in carrying out the measurement could have large implications
in terms of the data obtained.
o Instrument/tools: Malfunctioning of the equipment could instantly lead to
errors in the measures obtained. If language is an important component of
the measurement, then ambiguous words, inadequate expressions, incomplete
questions, etc., could all lead to measurement errors. All these factors
must be borne in mind, and carefully neutralized or completely eliminated as
far as possible to get accurate information.
Reliability implies the consistency with which a test measures what it seeks
to measure.
Practicality refers to the costs involved in administering a test, the time
needed, the convenience and the ease of carrying out the measurements as well
as the usefulness of the obtained data, besides the interpretation.
A simple way to determine validity is to ask the question: Is the test measuring what
it is thought to measure? Another way to determine the validity of
a test is the accuracy with which specific predictions can be made from
the test scores. This is determined by comparing the scores obtained with some
external test scores as standard. This means one can assess the individual's present
status or predict his/her future status with respect to some type of functioning. This
comparison with other relevant evidence is one of the better ways of establishing
the validity of an instrument.
Types of validity:
(i) Content validity: This implies the extent to which the measuring instrument
has covered adequately all the aspects of the topic that is to be measured.
If the measure includes a representative sample of the population, then the
content validity would be assumed to be good. This can be determined by
a panel of judges who evaluate the contents of an instrument for yielding the
measures the tool is intended to measure. Another way of establishing the
content validity is intuitive judgments involving the theme of the measuring
tool.
(ii) Criterion-based validity: Here the success of the measuring instrument is
determined by the ability of the scores obtained to predict some outcomes
of a current condition. For example, those who train hard can be predicted
to be winners. The extent of training can thus be reflected in the number of
winners.
A good research criterion must possess the following characteristics:
a. Relevance: The criterion is judged to be the proper measure.
b. Freedom from bias: This is reached if every test taker has an opportunity
to score well. (It is not biased in favour of any group.)
c. Reliability: The measure is stable across several administrations.
If these criteria are met by a tool used for measurement, then the instrument
is thought to be valid. The measures obtained from such a tool can be viewed as a
correct estimate of the feature under assessment.
1.14.2 Test of Reliability
The test of reliability of a measuring device is its ability to yield consistent results
from one set of measures to another.
If an instrument is reliable, then temporary and extraneous factors would
not affect the measures obtained.
There are two aspects to reliability.
(i) Stability: The extent to which consistent results are obtained with
repeated measurements with the same instrument on the same individual.
A measure of stability is obtained by comparing the results of repeated
measurements.
(ii) Equivalence: This is estimated by comparing the measures obtained
by two assessors on the same aspect, situation or individual.
Reliability can be enhanced by three procedures:
(i) By standardizing the conditions under which measurements are made.
Here, all extraneous factors can be kept under control.
(ii) By systematizing the directions for measurement.
(iii) By training personnel suitably, e.g., technicians who are trained for
measuring blood pressure. Also, having larger samples from the person
on whom the measurement is done.
Types of reliability
There are three common methods of estimating reliability:
(i) Retest reliability
(ii) Internal consistency reliability
(iii) Parallel forms, alternate forms or equivalent forms
(i) Retest reliability: Here, a single form of the test is administered twice on
the same sample with a reasonable time gap. The two administrations yield
independent sets of scores. The two sets of scores, when correlated, give
the value of the reliability coefficient. Such a coefficient is also known as the
temporal stability coefficient. It indicates how far the examinees retain their
relative position, as measured in terms of test scores, over a given period of
time. The ideal time gap between the two administrations is about 15 days.
(ii) Internal consistency reliability: This indicates the homogeneity of the
test. If all the items of a test measure the same function or trait, then the test
is seen to be homogeneous and the internal consistency would be high. The
most common way of determining internal consistency reliability is by the
split-half method. Here the items to be tested are divided into two equal or
nearly equal halves. Another way to split a test is by using the odd and even
numbered items. This method is preferred to the regular split-half method
because in a power test, the first half would normally be made up of the
easier items while the second half would have the tougher items. The odd
numbers 1, 3, 5, 7, 9 and so on and the even numbers 2, 4, 6, 8 and so on
would balance the items. Each examinee would receive two scores, i.e., the
scores on the odd and those on the even numbered items of a given test. Thus, from
a single administration of the same test, two sets of scores are generated. A
'product moment correlation' is computed to obtain the reliability of the half
test. On the basis of the reliability of this half test, the reliability of the whole
test is estimated. The Spearman-Brown formula is used for estimating the
reliability of the whole test.
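The odd-even procedure described above can be sketched as follows. The item
scores are hypothetical, the function names are illustrative, and the
Spearman-Brown step uses the usual form for doubling test length,
r_full = 2 * r_half / (1 + r_half), which the text names but does not spell out.

```python
from math import sqrt


def pearson_r(x, y):
    """Product moment correlation between two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def split_half_reliability(item_scores):
    """item_scores holds one list of item scores per examinee."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    r_half = pearson_r(odd_totals, even_totals)
    r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown correction
    return r_half, r_full


# Hypothetical data: 4 examinees, 4 items scored 0 (wrong) or 1 (right)
scores = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

r_half, r_full = split_half_reliability(scores)
print(f"half-test r = {r_half:.3f}, whole-test estimate = {r_full:.3f}")
```

The whole-test estimate is higher than the half-test correlation because a longer
test, other things being equal, gives more consistent scores.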
This is a useful method as it eliminates two administrations of the same test.
A quick estimate of reliability is possible. This is a kind of on-the-spot reliability
measure. The demerit of this method is that it cannot be used for speed tests.
The Kuder-Richardson formula is used for determining the reliability
coefficient in a test where the items are scored as 0 or 1 (wrong or right). This
estimates the coefficient alpha, which yields a measure of internal consistency.
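A minimal sketch of the Kuder-Richardson calculation is given below. The text does
not say which Kuder-Richardson formula is meant; the sketch assumes formula 20
(KR-20), uses the population variance of the total scores, and works on
hypothetical 0/1 item scores.

```python
from statistics import pvariance


def kr20(item_scores):
    """Kuder-Richardson formula 20 for items scored 0 (wrong) or 1 (right)."""
    n_examinees = len(item_scores)
    n_items = len(item_scores[0])
    # p is the proportion answering an item correctly and q = 1 - p.
    sum_pq = 0.0
    for item in range(n_items):
        p = sum(row[item] for row in item_scores) / n_examinees
        sum_pq += p * (1 - p)
    # Variance of the examinees' total scores (population form, an assumption).
    total_variance = pvariance([sum(row) for row in item_scores])
    return (n_items / (n_items - 1)) * (1 - sum_pq / total_variance)


# Hypothetical data: 4 examinees, 4 items scored 0 or 1
scores = [
    [1, 1, 1, 1],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

reliability = kr20(scores)
print(f"KR-20 = {reliability:.4f}")
```

Unlike the split-half method, this calculation does not depend on how the items
are divided into halves.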
Alternate forms reliability: These are also known as parallel forms or
equivalent forms or comparable forms reliability. This requires that the test be
developed in two forms which are comparable or equivalent. Two forms of the
test are administered to the same sample either immediately the same day or in a
time interval of a fortnight. When the reliability is calculated on the basis of data
obtained from the two administrations of the test, it is the alternate form reliability.
The Pearson coefficient (r) between two sets of scores obtained from two equivalent
forms becomes the measure of reliability. Such a coefficient is known as the
coefficient of equivalence. Alternate forms reliability measures the consistency of
the examinee's scores between two administrations of parallel forms of a single
test.
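For reference, the Pearson coefficient mentioned above is usually written as
follows, where x_i and y_i are assumed here to denote the scores of examinee i on
the two forms and the barred symbols their means (the notation is not from the
text):

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

The same formula applies when the two sets of scores come from a test and its
retest.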
The biggest problem with this procedure is making the two forms of a test
truly equivalent.
Criteria for judging whether the forms are parallel:
o The number of items in both the forms should be the same
o The item-difficulty levels in both the forms should be similar
o The mode of administration of both forms should be the same
Scorer reliability: This kind of reliability is important in tests of creativity,
projective tests, etc. It can be estimated by having a sample of tests
independently scored by two or more examiners. The two sets of scores
obtained from the examiners are correlated and the resulting correlation
coefficient is known as scorer reliability.
Test-retest reliability, internal consistency reliability, and parallel forms
reliability all express reliability in terms of the correlation coefficient.
1.14.3 Test of Practicality
The practicality characteristic of a measuring instrument can be estimated in
terms of convenience, cost effectiveness, ease of administration, scoring,
interpretation, etc. These factors have an important bearing on the development
of the most suitable measuring instruments needed for research.
1.15 SUMMARY
o Research is done to find the solution to a problem, or to know more about
something, or to know new things. Scientific research involves systematic,
controlled, empirical and critical examination of a hypothesis or proposition
about the relations in a phenomenon.
o The important steps needed for conducting scientific research are as follows:
(i) Identifying the problem;
(ii) Formulating a hypothesis: once the problem has been identified and
the literature review has been completed, the hypothesis has to be
formulated;
(iii) Identifying the variables to be manipulated:
(a) The independent variable, (b) The dependent variable, and (c) The
extraneous variables;
(iv) Formulating a research design;
(v) Carrying out the observations and measurements: this involves
utilizing the tools of the study for obtaining data;
(vi) Summarizing the results;
(vii) Statistical treatment; and
(viii) Drawing conclusions on the basis of the study and deciding upon
applications, further research, etc.
o There are four different ways of assigning numerals to the attributes of any
event:
(i) Nominal scale,
(ii) Ordinal scale,
(iii) Interval or equal interval scale, which includes the characteristics of
the nominal and ordinal scales of measurement, and
(iv) Ratio scale.
3. Conceptual research is concerned with some abstract theory or idea(s).
Empirical research, on the other hand, relies only on real experiences and
observations. It is data-based research and its conclusions can be verified
by observations or experiments.
4. In the quantitative approach, the data is in the form of quantities, which are then
subjected to mathematical and statistical approaches. The qualitative approach
deals with data that cannot be strictly quantified, for example, opinions,
tastes, attitudes, etc.
Short-Answer Questions
1. List the different approaches to scientific research.
2. State the objectives of research.
3. What are the criteria of good research?
4. State the problems encountered by researchers in India.
5. State the major measurement scales.
Long-Answer Questions
1. Describe different types of research.
2. Explain the research process in detail.
3. Discuss the objectives, types and approaches to scientific research.
4. Examine the concepts of validity and reliability along with the various tests
that are available for ensuring them.
1.19 FURTHER READING
UNIT 2 MEANING AND DEFINITION OF A RESEARCH PROBLEM
Structure
2.0 Introduction
2.1 Unit Objectives
2.2 Research Problem and Working Hypothesis
2.3 Sample Selection, Types and Size
2.4 Parametric and Non-Parametric Tests
2.5 Testing Hypothesis I: Parametric or Standard Testing
2.6 Testing Hypothesis II: Non-Parametric Testing
2.7 Correlation and Regression
2.8 Summary
2.9 Key Terms
2.10 Answers to 'Check Your Progress'
2.11 Questions and Exercises
2.12 Further Reading
2.0 INTRODUCTION
In this unit, you will learn how to define a research problem. You will also learn
how to select the problem and define it clearly and specifically. The next step is
formulating a working hypothesis. A working hypothesis is a tentative proposition
that provides the solution, in the researcher's view, to the defined problem. The
characteristics of a good hypothesis are presented in the unit.
The next step is testing the hypothesis. For testing the hypothesis there are
two kinds of tests, namely, parametric and non-parametric tests. This unit will
teach you how to choose the right type of test to test a hypothesis, depending
upon the data. This unit presents a detailed examination of parametric and
non-parametric statistical tests. The different types of tests under each category
have been presented and discussed.
2.2 RESEARCH PROBLEM AND WORKING HYPOTHESIS
Clearly identifying and defining a research problem is the first step before beginning
any investigation.
Example 2.2: What are the factors responsible for Chinese goods being
cheaper than similar Indian products? Here the focus must be on cost factors and
pricing methods adopted by the manufacturers in the two countries.
definition needs to be fully operationalized, the techniques and procedures identified,
the sample selected and the timeframe, costs, etc., have to be estimated. Then the
research has to begin.
A few conditions about a research question:
NOTES
o It is clear and unambiguous
o [t is fairly specific
o lfidentifies the type ofdatato be collected
o The questions asked are related to each other
o It is aworthwhile attempt
The problem statement of any research is written in the form of a question.
Examples 2.3:
(i) What is the relationship between stress and performance?
(ii) Does learning proceed better in a democratic environment or in an authoritarian
structure?
A problem also expresses the relationship between two or more variables.
So, one of the variables could be manipulated and its effects on the other studied.
Example 2.4: Is higher anxiety related to lowered performance?
A problem should be testable by empirical methods. Data obtained should
be testable.
It is wise to avoid ethical and moral questions, for example, 'Should gay
people be allowed to join the defence services?' The problem chosen should not
be very trivial, nor so broad or general that it cannot be studied effectively. Similarly,
excessive specificity can also be a limiting factor.
Considerations for selecting a problem: Before a problem for
investigation is selected, several questions have to be asked.
(i) Is the problem significant in terms of the variables to be studied and its
contribution to research, social theories, practices, etc.?
(ii) Can useful data be gathered for the study?
(iii) Is there any duplication that could happen?
(iv) Is the problem operationally workable?
(v) Are there good instruments to measure?
A problem is worth studying if the answers to most of these questions are
positive.
How does a problem show up for study?
Does a gap in the information exist?
Contradictory results occur when certain facts are unwarranted from the
existing knowledge.
Research problems are of two types:
(i) Solvable: For a solvable problem, an answer is possible. A hypothesis
can be formulated.
(ii) Unsolvable: For an unsolvable problem, no answers are possible.
Example 2.5: Is there life after death?
CHECK YOUR PROGRESS
1. What is the meaning of research and how does one define a research
problem?
2. How does a researcher select a research problem?
3. How is a research problem stated?
4. What are the different types of research problems?
5. What is a hypothesis?
6. What is a null hypothesis?
7. What is a statistical hypothesis?
2.3 SAMPLE SELECTION, TYPES AND SIZE
Sample selection
All research in the field of behavioural sciences involves drawing inferences about a
specified, identifiable group on the basis of a selected sample. The clearly identifiable
and specified group is known as the population or universe. The selected group of
persons or objects is called the sample. The conclusions are drawn from the sample
and are deemed to be valid for the entire population. Such conclusions are known
as statistical inferences.
Steps in sampling
Sampling is the plan for obtaining a sample from a given population. There are
several steps that a researcher must keep in mind while selecting a sample:
The type of universe or population to be studied can be finite or infinite.
Sampling unit: This stands for a geographical area like a district, province,
taluk, etc., or it could be a social unit like a family, school, club, etc. The researcher
has to decide in advance the factors that would be studied. A list of all the sampling
units is called a sampling frame. This is the source list and contains all the items of
a universe or population.
Types of sample
Most samples can be categorized into two types:
(i) Probability sampling
(ii) Non-probability sampling.
(i) Probability sampling: It is based on the concept of random selection
or chance sampling. Here, every item of the universe has an equal chance of
inclusion in the sample. It is a form of lottery method where the units are
chosen from the whole group by a mechanical method. This is almost a
blind selection. Random sampling ensures that the law of statistical regularity
is followed. This implies that if the sample chosen is a random one, the
chances are that the sample will have the same composition and
characteristics as the universe. This is why random selection is considered
the most useful method for obtaining a representative sample. Probability
sampling must follow the conditions given below:
a. The size of the parent population must be known to the investigator;
b. Each element of the parent population must have an equal chance of
being included in the sample;
c. The sample size needed must be clearly specified.
The major probability sampling methods are:
a. Simple random sampling;
b. Stratified random sampling - (a) Proportionate stratified random
sampling, and (b) Disproportionate stratified random sampling;
c. Area or cluster sampling.
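The first two methods listed above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed procedure: the sampling frame of 1,000 numbered students, the stratum names "urban" and "rural", and the function names are all hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement (lottery method)."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)

def proportionate_stratified_sample(strata, n, seed=None):
    """Draw from each stratum in proportion to its share of the population.

    strata maps a stratum name to the list of units in it. Allocations are
    rounded, so the total drawn may differ from n by a unit or two.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        k = round(n * len(units) / total)
        sample[name] = rng.sample(units, k)
    return sample

# Hypothetical sampling frame: 1,000 numbered students in two strata.
frame = list(range(1, 1001))
strata = {"urban": frame[:600], "rural": frame[600:]}

srs = simple_random_sample(frame, 50, seed=1)
stratified = proportionate_stratified_sample(strata, 50, seed=1)
print(len(srs), {name: len(units) for name, units in stratified.items()})
```

In the simple random sample every unit in the frame has the same chance of selection; in the proportionate stratified sample each stratum contributes in proportion to its size (here 30 and 20 units).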
(ii) Non-probability sampling: This is a method in which there is no way of
determining the probability that each item in the population gets included in
the sample. This means that there is no basis for estimating how closely the
characteristics of a sample approximate the parameters of the population
from which the sample has been drawn. This is due to the absence of random
selection procedures. The major techniques used in non-probability sampling
are:
a. Quota sampling
b. Accidental sampling
c. Judgmental or purposive sampling
d. Systematic sampling
e. Snowball sampling
f. Saturation sampling
g. Dense sampling
Size of the sample
Size refers to the number of items selected from the universe to constitute a
sample.
The size of the sample should be neither too large nor too small. An optimum
sample size should be:
a. Efficient
b. Representative
c. Reliable
d. Flexible
The sample size should be decided by the level of precision needed and the
confidence level desired. The size of the population variance is an
important determiner. If the population variance is large, then a larger sample size
is indicated. The size of the population is another factor to be kept in mind. This
limits the size of the sample.
• The variable studied should be continuous. Examples of parametric tests
are the z test, t test and F test.
A non-parametric test is one which does not specify any conditions about
the parameters of the population from which the sample is drawn. These are also
called distribution-free statistics because no assumptions are made about the
distribution of the population. For these tests also, the variables must be continuous
and the observations should be independent. But these conditions do not apply
rigidly.
Examples of non-parametric tests are the chi-square test, the Mann-Whitney
U test, Kendall's τ, Kendall's coefficient of concordance W, etc.
Conditions that suggest the use of the non-parametric test:
• When the shape of the distribution of the population is not a normal one.
2.5 TESTING HYPOTHESIS I: PARAMETRIC OR
STANDARD TESTING
Before discussing the small sample t test in detail, four important concepts,
namely, degree of freedom, null hypothesis, level of significance and one-tailed
test should be understood.
Degree of freedom
The degree of freedom means freedom to vary and its abbreviated expression is
d.f.
In statistical language, the degree of freedom means the number of
observations that are independent of each other and that cannot be deduced from
each other. Suppose we have five scores and the mean of the five scores is 10. The
fifth score immediately makes an adjustment with the remaining four scores in a way
which assures that the mean of all five scores must be 10. For example, suppose
we have four scores 12, 18, 5, 12, and for the mean to become 10, the fifth score
must be 3. In another distribution, if the four scores are 2, 10, 8, 5, the fifth score
must be 25 in order to derive a mean of 10.
The meaning of this is that four scores in the distribution are independent,
they may have any values and they cannot be deduced from each other. The size
of the fifth score, however, is fixed because the mean in each case is 10. Hence
d.f. = N - 1 = 5 - 1 = 4.
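The two examples above can be checked with a short Python sketch. The function name is a hypothetical choice for illustration; the point is simply that once the mean and N - 1 scores are known, the last score has no freedom to vary.

```python
def forced_last_score(free_scores, mean, n):
    """Return the one score that is fixed once n - 1 scores and the mean are known."""
    if len(free_scores) != n - 1:
        raise ValueError("exactly n - 1 scores may vary freely")
    # The total must equal mean * n, so the last score is whatever is left over.
    return mean * n - sum(free_scores)

first = forced_last_score([12, 18, 5, 12], mean=10, n=5)
second = forced_last_score([2, 10, 8, 5], mean=10, n=5)
print(first, second)  # 3 25
```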
In larger cases, computing the mean goes something like this: Suppose
we have a set of 101 scores. We compute the mean and in computing the mean,
we lose 1 d.f. We had initially 101 d.f. (because there were 101 scores) but now
after computing the mean, we have N - 1 = 101 - 1 = 100 degrees of freedom.
Sometimes, we have paired data and in such cases, the number of degrees of
freedom is equal to one less than the number of pairs.
Null hypothesis
The starting point in all statistical tests is the statement of the null hypothesis (H0),
which is a 'no difference hypothesis.' A null hypothesis states that there is no
significant difference between the samples under study. It makes a judgment about
whether the obtained difference between the samples is due to some true differences
or due to some chance error. The null hypothesis is formulated for the express
purpose of being rejected because if it is rejected, the alternative hypothesis (H1),
which is an operational statement of the investigator's research hypothesis, is
accepted. A research hypothesis is nothing but a prediction or deduction drawn
from a theory. The tests of the null hypothesis are generally called tests of
significance, the outcome of which is stated in terms of probability figures or levels
of significance.
If the difference between the experimental group and the control group is
very small, the experimenter is likely to accept the null hypothesis, indicating the
fact that the small difference between these two groups is due to sampling errors
or some other chance fluctuations. On the other hand, if the difference between
the experimental group and the control group is too large, the experimenter is
likely to reject the null hypothesis, indicating the fact that the obtained
differences are real differences between or among the samples under study.
Level of significance
The null hypothesis has been developed for the express purpose of rejection. The
rejection or acceptance of the null hypothesis is based upon the level of significance,
which is used as a criterion. The levels of significance are also known as alpha
levels.
From the above discussion, it can be said that a Type I error can be
reduced by putting the alpha level at the 0.01 or 0.001 level. But as one reduces
the chance of making a Type I error, one increases the chance of making a Type
II error, where one does not reject the null hypothesis when it should be rejected.
Therefore, while decreasing the possibility of making one type of error, the
investigator also increases the probability of making another type of error. The
research workers must be cautious with this situation and should try, as far as
possible, to limit the probability of making a Type I error.
where M1 = mean of the experimental group and
M2 = mean of the control group.
When it is said that the mean of the experimental group will be higher than
the mean of the control group, one is concerned with only one end of the distribution.
Putting it in terms of a normal curve, one is concerned with only one end of the
curve (see figure below). When the alpha level is set at the 0.05 level, 5 per cent
of the area of the normal curve is obtained, all in one tail rather than being distributed
equally into the two tails of the curve. Therefore, the test of a directional hypothesis is
called a one-tailed test. A simple inspection of the table of areas of the normal
curve given at the end reveals that a z score of 1.64 cuts off 5 per cent of the area
under the normal curve in the smaller part, and similarly a z score of 2.33 cuts off
1 per cent of the area in the smaller part. If the null hypothesis is rejected, that is,
hypothesis I is not tenable, we automatically accept the alternative hypothesis. If
the experimenter has some reason to believe that the experimental group would
have a lower mean score than the control group (alternative hypothesis), he can
set up a directional hypothesis that the mean of the experimental group is lower
than the mean of the control group (one-tailed test). Rejection of this hypothesis
would automatically lead to the acceptance of the null hypothesis. This time the
part of the normal curve in which the experimenter is interested is 5 per cent or 1 per cent of
the area in the left-hand tail of the normal curve. When the null hypothesis is
rejected by using a one-tailed test, one must say that one is rejecting the hypothesis
at 1 per cent or 5 per cent points, not levels (see Fig. 2.1).
Source: Singh, A.K., Tests, Measurements and Research Methods in behavioural sciences,
2008.
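The one-tailed cut-off points quoted above (z = 1.64 for 5 per cent and z = 2.33 for 1 per cent) can be reproduced without a printed table. The sketch below uses `NormalDist` from Python's standard `statistics` module; the function name is a hypothetical choice for illustration.

```python
from statistics import NormalDist

def one_tailed_critical_z(alpha):
    """z score that leaves a proportion alpha of the area in the upper tail."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.05, 0.01):
    z = one_tailed_critical_z(alpha)
    # For a left-hand tail the cut-off is the same value with a negative sign.
    print(f"alpha = {alpha}: z = {z:.2f} (left-hand tail: {-z:.2f})")
```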
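$t = \frac{(M_1 - M_2) - 0}{SE_D}$ ...(2.1)
$SE_M = \frac{SD}{\sqrt{N - 1}}$ ...(2.3)
$SE_{M_1} = \frac{5.68}{\sqrt{N_1 - 1}} = 1.302$; $SE_{M_2} = \frac{SD_2}{\sqrt{N_2 - 1}} = 1.524$
$SE_D = \sqrt{(1.302)^2 + (1.524)^2} = 2.004$
$t = \frac{(34.56 - 30.56) - 0}{2.004} = \frac{4.00}{2.004} = 1.996$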
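The arithmetic of this worked example can be retraced in Python. This is a sketch only: it takes the two standard errors of the means (1.302 and 1.524) and the two means (34.56 and 30.56) as given in the text, and the function names are hypothetical.

```python
from math import sqrt

def standard_error_of_difference(se_m1, se_m2):
    """Standard error of the difference between two independent means."""
    return sqrt(se_m1 ** 2 + se_m2 ** 2)

def t_ratio(m1, m2, se_d):
    """t ratio under the null hypothesis that the true difference is zero."""
    return ((m1 - m2) - 0) / se_d

se_d = standard_error_of_difference(1.302, 1.524)
t = t_ratio(34.56, 30.56, se_d)
print(f"SE_D = {se_d:.3f}, t = {t:.3f}")
```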
$SE_D = \sqrt{SE_{M_1}^2 + SE_{M_2}^2 - 2 r_{12} SE_{M_1} SE_{M_2}}$ ...(2.4)
where r12 = coefficient of correlation between the initial set of scores and the final
set of scores. The rest of the subscripts are defined like those in the equation
above.
$SE_{M_1} = 0.982$; $SE_{M_2} = 0.932$
$SE_D = 0.641$
$t = \frac{(M_1 - M_2) - 0}{SE_D} = \frac{(36.38 - 40.33) - 0}{0.641} = \frac{4.05}{0.641} = 6.31$
d.f. = N - 1 = 20 - 1 = 19
Entering the probability table of t ratios at d.f. = 19, we find that our obtained
value of t exceeds the value of t at even the 0.001 level of significance. Hence, the
null hypothesis is rejected and it is concluded that the training has produced a
significant difference between the mean of the initial set of scores and the mean of
the final set of scores.
t ratio from matched groups: Sometimes it becomes necessary for the
researchers to match the groups under study. Matching can be done on the basis
of numbers or it can be done in terms of mean and standard deviation. When the
matching is done on the basis of the number of subjects, each person has his
corresponding match in the other group and therefore, the number of persons in
the two matched groups is always equal. When matching is done in terms of mean
and standard deviation, the number in the two groups may or may not be equal.
Suppose two groups from two different classes are considered and each
group is compared on the basis of a numerical reasoning test. Both groups have
been matched in terms of mean and standard deviations on the basis of their
scores on a General Intelligence Test. Do the groups differ in terms of mean
numerical ability? The following is the data obtained:
                               Class VIII    Class X
N                              100           120
Means of intelligence tests    70.26         70.25
$SE_{M_1} = \frac{8.67}{\sqrt{100}} = 0.867$; $SE_{M_2} = 0.691$
$SE_D = 0.991$
$t = \frac{(M_1 - M_2) - 0}{SE_D} = \frac{(55.62 - 60.34) - 0}{0.991} = \frac{-4.72}{0.991} = -4.762$
d.f. = (N1 - 1) + (N2 - 1) - 1 = (100 - 1) + (120 - 1) - 1 = 217
(ii) F ratio
The t ratio or z ratio is one of the powerful parametric tests through which we can
test the significance of the difference between two means. There are two general
limitations of the t ratio. First, when there are several groups and we want to test
the significance of the mean difference among them, several t ratios are required to
be computed. For example, suppose there are five groups. Then there is a need to
compute
$\frac{N(N - 1)}{2} = \frac{5 \times 4}{2} = 10$ t ratios ...(2.6)
Then it becomes a cumbersome task. Secondly, the t ratio does not account
for interaction effect in its statistical analysis. The variations in the scores may be
due to the interactions taking place among groups. Such variations are not accounted
for by t ratios. In order to remove these two limitations we turn to analysis of
variance, originally developed by R.A. Fisher. Analysis of variance is a class of
statistical techniques through which we test the overall difference among two
or more (normally more than two) sample means.
Analysis of variance is of two common types: simple analysis of variance
or one-way analysis of variance, and complex analysis of variance or two-way
analysis of variance. Analysis of variance (of whatever type) is often referred to
by its acronym, ANOVA.
In simple analysis of variance there is only one independent variable and the
samples are classified into several groups on the basis of this variable. Since the
basis of classification is only one independent variable, the simple analysis of variance
is also known as one-way ANOVA. Such ANOVA is suited to the completely
randomized design. In complex ANOVA, there are two or more
independent variables, which form the basis of classification of groups. Such
ANOVA is suited to factorial design.
Statistically, the F ratio is calculated as follows.
$F = \frac{\text{Larger variance}}{\text{Smaller variance}} = \frac{\text{Between-groups variance}}{\text{Within-groups variance}}$
        A      B      C      A²      B²      C²
        88     12     8      7744    144     64
        87     18     10     7569    324     100
        80     14     16     6400    196     256
        25     20     14     625     400     196
Sums    561    154    312    38611   2610    16070
Between-groups sum of squares $= \frac{(561)^2}{10} + \frac{(154)^2}{10} + \frac{(312)^2}{10} - 35157.63$ ...(2.8)
number of degrees of freedom, the sum of squares for each of the three sources of
variation, we compute mean squares or variances, which are obtained by dividing
each of the sums of squares by its respective number of degrees of
freedom. These two types of variances are the estimates of the population variance.
We obtain the F ratio by dividing the between-groups variance by the within-groups
variance.
The F ratio is interpreted by the use of the F table (Guilford & Fruchter, 1978).
In the F table the number of degrees of freedom for the greater mean square (d.f.1) is
written at the top and the number of degrees of freedom for the smaller mean square
(d.f.2) is written on the left-hand side. For this problem d.f.1 = 2 and d.f.2 = 27.
Locating at these d.f.s, one finds that the required F ratio at the 0.05 level is 3.35
and at the 0.01 level is 5.49. Since the obtained value of the F ratio is 8.29, which
exceeds 5.49, we reject the null hypothesis and conclude that there is an overall
difference between the three groups of subjects on the educational achievement
test.
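The F ratio of 8.29 can be retraced from the column sums alone. The sketch below assumes three groups of 10 subjects each, with the sums (561, 154, 312) and sums of squares (38611, 2610, 16070) read from the table above; the function name and the dictionary keys are hypothetical.

```python
def one_way_anova_from_sums(sums, sums_of_squares, n_per_group):
    """One-way ANOVA for equal-sized groups, from column sums and sums of squares."""
    k = len(sums)
    n_total = k * n_per_group
    correction = sum(sums) ** 2 / n_total            # (grand total)^2 / N
    ss_total = sum(sums_of_squares) - correction
    ss_between = sum(s ** 2 / n_per_group for s in sums) - correction
    ss_within = ss_total - ss_between
    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "correction": correction,
        "df": (df_between, df_within),
        "ms_between": ms_between,
        "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

result = one_way_anova_from_sums([561, 154, 312], [38611, 2610, 16070], 10)
print(result["df"], round(result["ms_within"], 3), round(result["F"], 2))
```

The correction term evaluates to 35157.63, the value subtracted in Equation (2.8), and the within-groups variance comes to about 507.885, the figure used in the pairwise comparisons that follow.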
$F = \frac{(M_1 - M_2)^2}{SD_w^2 (N_1 + N_2) / N_1 N_2}$ ...(2.9)
F ratio for distributions A and B:
$F = \frac{(56.10 - 15.40)^2}{507.885 \frac{(10 + 10)}{(10 \times 10)}} = 16.31$
F ratio for distributions A and C:
$F = \frac{(56.10 - 31.20)^2}{507.885 \frac{(10 + 10)}{(10 \times 10)}} = 6.10$
F ratio for distributions B and C:
$F = \frac{(15.40 - 31.20)^2}{507.885 \frac{(10 + 10)}{(10 \times 10)}} = 2.46$
As seen earlier, F at the 0.05 level of significance for d.f.1 = 2 and d.f.2 = 27
is 3.35. This value, if multiplied by K - 1, yields (3 - 1)(3.35) = 6.70. Only
the F ratio for distributions A and B is greater than 6.70. Hence, it is concluded
that there is a significant difference between the means of A and B only. The mean
differences between A and C and between B and C are not significant.
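The three pairwise comparisons can be sketched in Python, using the group means (56.10, 15.40, 31.20), the within-groups variance (507.885) and the tabled F of 3.35 quoted in the text. The function and variable names are hypothetical.

```python
def pairwise_f(mean_1, mean_2, within_variance, n_1, n_2):
    """F ratio for the difference between two group means, as in Equation (2.9)."""
    return (mean_1 - mean_2) ** 2 / (within_variance * (n_1 + n_2) / (n_1 * n_2))

means = {"A": 56.10, "B": 15.40, "C": 31.20}
within_variance = 507.885
k = len(means)
required = (k - 1) * 3.35  # tabled F at the 0.05 level, multiplied by K - 1

results = {}
for first, second in (("A", "B"), ("A", "C"), ("B", "C")):
    f = pairwise_f(means[first], means[second], within_variance, 10, 10)
    results[(first, second)] = f
    verdict = "significant" if f > required else "not significant"
    print(f"{first} vs {second}: F = {f:.2f} ({verdict})")
```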
(iii) Pearson r
Of all the measures of correlation, the Pearson r, developed by Prof. Karl Pearson,
is one of the most common methods of assessing the association between two
variables under study. It is also known as the Pearson product-moment correlation
and abbreviated to r. The size of Pearson r varies from +1 through 0 to -1. In fact,
all correlation coefficients have the limits of +1 and -1. A coefficient of +1 indicates
a perfect positive correlation, and a coefficient of -1 indicates a perfect negative
correlation. The coefficient of correlation indicates two things. First, it indicates
the magnitude of a relationship. A correlation coefficient of, say, +0.90 or -0.90
gives the same information about the magnitude or size of correlation. The sign
makes no variation in the size of the correlation. Second, it gives an indication
regarding the direction of the correlation. A positive correlation indicates
a similar trend of relationship between two variables, that is, as one increases the
other also increases or as one decreases, the other also decreases. Consider the
relationship between intelligence test scores and classroom achievement. Generally,
as intelligence test scores become higher, classroom achievement also becomes
better. Therefore, the direction of the correlation between these two variables is
positive. Similarly, as fatigue increases, output decreases. Here the relationship is
negative because as one increases, the other decreases. Sometimes, the relationship
is not consistent. And in this situation, the coefficient of correlation is likely to be zero.
Fig. 2.3 (a) and (b) Non-homoscedastic and (c) Homoscedastic and Linear
$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}}$ ...(2.10)
where r = Pearson product-moment correlation coefficient; N = number of
scores; X = scores in X variable; and Y = scores in Y variable.
The table below presents the scores of 10 students who were administered an
intelligence test (X) and an anxiety test (Y). Pearson r has also been calculated from
the ten pairs of scores. The significance of the obtained r is tested with the help of
a table (Downie & Heath, 1970:378). The obtained value of r is less than the
value required at even the 0.05 level of significance. Hence, the null hypothesis is
accepted and it is concluded that the scores on the intelligence test and the scores on
the anxiety test are not correlated. The sign makes no difference in the magnitude
of the correlation.
When the data are arranged in a bivariate distribution as is the case in the
scatter diagram or in the correlation table, Pearson r should be calculated by the
following formula:
$r = \frac{\frac{\sum x'y'}{N} - C_x C_y}{(\sigma_{x'})(\sigma_{y'})}$ ...(2.11)
where r = Pearson r; x' = deviation of scores from mean on x test; y' = deviation
of scores from mean on y test; N = sum of frequencies; Cx = correction in x-series
scores; and Cy = correction in y-series scores.
Pearson r by raw-score method
X      Y      X²     Y²     XY
10     8      100    64     80
5      20     25     400    100
6      15     36     225    90
3      13     9      169    39
8      16     64     256    128
12     20     144    400    240
13     13     169    169    169
20     11     400    121    220
15     10     225    100    150
10     12     100    144    120
ΣX = 102   ΣY = 138   ΣX² = 1272   ΣY² = 2048   ΣXY = 1336
$r = \frac{N \sum XY - \sum X \sum Y}{\sqrt{[N \sum X^2 - (\sum X)^2][N \sum Y^2 - (\sum Y)^2]}}$
$= \frac{(10 \times 1336) - (102 \times 138)}{\sqrt{[(10)(1272) - (102)^2][(10)(2048) - (138)^2]}}$
$= \frac{-716}{\sqrt{3325776}} = \frac{-716}{1823.671} = -0.393$
d.f. = N - 2 = 10 - 2 = 8
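The raw-score method can be written out step by step in Python. The sketch below follows Equation (2.10) directly and uses the ten pairs of scores from the table; the function name is a hypothetical choice.

```python
from math import sqrt

def pearson_r_raw(x, y):
    """Pearson r by the raw-score method of Equation (2.10)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_x2 = sum(v * v for v in x)
    sum_y2 = sum(v * v for v in y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

intelligence = [10, 5, 6, 3, 8, 12, 13, 20, 15, 10]   # X
anxiety = [8, 20, 15, 13, 16, 20, 13, 11, 10, 12]     # Y

r = pearson_r_raw(intelligence, anxiety)
df = len(intelligence) - 2
print(f"r = {r:.3f}, d.f. = {df}")
```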
CHECK YOUR PROGRESS
The third important use of chi-square is in testing a hypothesis regarding the
normal shape of a frequency distribution. When chi-square is used in this connection,
it is commonly referred to as a test of goodness-of-fit.
The fourth use of chi-square is in testing the significance of several statistics.
For example, for testing the significance of the phi-coefficient, coefficient of
concordance, and coefficient of contingency, one converts the respective values
into chi-square values.
If the chi-square value appears to be a significant one, one should also take
their original values as significant. To illustrate this, a 3x3 contingency table, which
shows data of 200 students who were classified into three classes on the basis of
their educational qualification, is used. The students' educational attainments are
measured in the course of their study by classifying them as superior, average or
inferior.
100 60 40 200
After calculating the expected frequency for each cell, the chi-square may be calculated
as:
fo     fe     fo - fe     (fo - fe)²     (fo - fe)²/fe
30     25     +5          25             1
15     15     0           0              0
5      10     -5          25             2.5
25     25     0           0              0
10     15     -5          25             1.67
15     10     +5          25             2.5
45     50     -5          25             0.5
35     30     +5          25             0.83
20     20     0           0              0
Σfo = 200   Σfe = 200   Σ(fo - fe) = 0   Σ(fo - fe)²/fe = 9.00
d.f. = (r - 1)(k - 1) = (3 - 1)(3 - 1) = 2 x 2 = 4
Entering the probability table of chi-square, one finds that the value of
chi-square for d.f. = 4 at the 0.05 level should be 9.488. As the obtained chi-square
is below it (p > 0.05), one concludes that the null hypothesis is retained. Hence,
the two variables, namely, educational qualification and educational attainment in
the present study are found to be independent. For calculating d.f. in a chi-square
test, the formula as noted above is (r - 1)(k - 1) where r is the number of rows
and k is the number of columns.
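The calculation can be sketched in Python. The 3x3 layout below is an assumption: it arranges the nine observed frequencies from the fo column row by row, which reproduces the expected frequencies listed in the fe column (row totals 50, 50, 100 and column totals 100, 60, 40). The function name is hypothetical.

```python
def chi_square_independence(observed):
    """Chi-square test of independence for an r x k table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n   # expected frequency
            chi_square += (f_o - f_e) ** 2 / f_e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi_square, df

# Assumed arrangement of the observed frequencies (three rows of three cells).
observed = [
    [30, 15, 5],
    [25, 10, 15],
    [45, 35, 20],
]

chi_square, df = chi_square_independence(observed)
print(f"chi-square = {chi_square:.2f}, d.f. = {df}")
```

Because 9.00 is below the tabled 9.488 for d.f. = 4, the sketch leads to the same decision as the text: the null hypothesis is retained.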
Chi-square: When the data has been arranged in a 2x2 contingency table
(where d.f. = 1), we need not calculate the expected frequency in the manner
described above. In such a situation, the chi-square can be directly calculated
with the help of the following equation:
$\chi^2 = \frac{N(|AD - BC|)^2}{(A + B)(C + D)(A + C)(B + D)}$ ...(2.13)
where A, B, C and D are symbols for the frequencies of the four cells in a 2x2 table; N
= total number of frequencies; bars (| |) indicate that in subtracting BC from AD
the sign is ignored.
Suppose the researcher wants to know whether or not the two given items
in the test are independent. Both items have been answered in 'Yes' or 'No' form.
The test was administered to a sample of 400 students and the obtained data were
as follows:
Chi-square in a 2x2 Table
Item No. 6 and Item No. 10 ('Yes' / 'No')
A = 90     B = 10     100
C          D
270        130
According to the formula:
d.f. = (r - 1)(k - 1) = (2 - 1)(2 - 1) = 1
Entering the probability table of chi-square, we find that for d.f. = 1, the value
of chi-square at the 0.001 level should be 10.827. As the obtained value of the
chi-square is much above it, we conclude that item nos. 6 and 10 are not
independent, that is, they are related.
Sometimes it happens that with 1 d.f., any one of the expected cell
frequencies becomes less than 5. In such a situation a correction called Yates'
correction for continuity is applied. Some writers have suggested that Yates'
correction for continuity should be applied when any of the expected frequencies
goes below 10. Where frequencies are large, this correction makes no difference
but where frequencies are small, Yates' correction is significant. Yates' correction
consists in reducing the absolute value of the difference between fo and fe by 0.5, that
is, each fo which is larger than fe is decreased by 0.5 and each fo which is smaller
than fe is increased by 0.5. The formula for chi-square in such a situation is as
given below:
where subscripts are defined as usual. Suppose 60 students (50 boys and 10
girls) were administered an attitude scale. The items were to be answered in 'Yes'
and 'No' form.
Their frequencies towards 10 items are presented in the table below. The
question is: Do the opinions of boys and girls differ significantly?
Boys      A = 20     B = 30     50
Girls     C = 3      D = 7      10
          23         37         60
According to Equation (2.14):
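Yates' correction, as described in words above (each absolute difference between fo and fe reduced by 0.5 before squaring), can be sketched in Python. The sketch is written from that verbal description and applies it to the boys and girls table, reading the cells as A = 20, B = 30, C = 3 and D = 7; the function name is hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Chi-square for a 2x2 table with Yates' correction for continuity."""
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    n = a + b + c + d
    chi_square = 0.0
    for i in range(2):
        for j in range(2):
            f_e = row_totals[i] * col_totals[j] / n
            # Reduce the absolute difference between fo and fe by 0.5.
            chi_square += (abs(observed[i][j] - f_e) - 0.5) ** 2 / f_e
    return chi_square

corrected = yates_chi_square(20, 30, 3, 7)
smallest_expected = 10 * 23 / 60   # girls in the first column
print(f"smallest expected frequency = {smallest_expected:.2f}")
print(f"corrected chi-square = {corrected:.3f}")
```

The smallest expected frequency is below 5, which is the situation in which the text recommends the correction.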
Here the calculation of the Mann-Whitney U test, which is concerned with a
larger sample size of more than 20 cases, is given:
The table below presents the scores of two groups on the Lie scale. Group I
has 10 subjects and Group II has 21 subjects. The first step is to rank all the
scores in one combined distribution in an increasing order of size. The lowest
score (taking both sets of scores together) is 7 (second column) and hence, it is
given a rank of 1. The next score is 8, which is again in the second column and it
has been given a rank of 2. The third score from below is 10 (in the first column),
which has been given a rank of 3. In this way ranking is continued until all scores
receive ranks. Subsequently, the two columns of ranks are summed. At this point,
a check on arithmetical calculation is imposed. The check is that the sum of these
two columns must be equal to N(N + 1)/2.
Check: $\sum R_1 + \sum R_2 = \frac{N(N + 1)}{2}$; 88.5 + 407.5 = 496; $\frac{(31)(32)}{2} = 496$
Hence, we can proceed:
(by Equation 2.11)
Gr. I        Gr. II       R1       R2
(N1 = 10)    (N2 = 21)
18           32           7        13
14           40           5        18
30           31           11       12
10           39           3        16.5
39           15           16.5     6
26           8            9        2
27           47           10       19
19           33           8        14
35           52           15       22
11           48           4        20
             7                     1
             50                    21
             61                    27
             58                    24
             53                    23
             59                    25
             60                    26
             65                    30
             63                    28
             67                    31
             64                    29
$z = \frac{U - \frac{N_1 N_2}{2}}{\sqrt{\frac{N_1 N_2 (N_1 + N_2 + 1)}{12}}}$ ...(2.17)
$z = \frac{176.5 - \frac{(10)(21)}{2}}{\sqrt{\frac{(10)(21)(10 + 21 + 1)}{12}}} = \frac{71.5}{23.664} = 3.02$
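The whole procedure (combined ranking, the check on the rank sums, U and z) can be sketched in Python. Two points are assumptions rather than statements from the text: U is computed with the common form U = N1N2 + N1(N1 + 1)/2 - ΣR1, which reproduces the value 176.5 used above, and the last Group II score is taken as 64, the only value consistent with its rank of 29. All function names are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """Rank values in increasing order; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1          # positions i..j hold ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def mann_whitney_z(group_1, group_2):
    """Return (sum R1, sum R2, U, z) for the large-sample Mann-Whitney U test."""
    n1, n2 = len(group_1), len(group_2)
    n = n1 + n2
    ranks = average_ranks(list(group_1) + list(group_2))
    r1, r2 = sum(ranks[:n1]), sum(ranks[n1:])
    if abs((r1 + r2) - n * (n + 1) / 2) > 1e-9:   # arithmetical check from the text
        raise ValueError("rank sums do not add up to N(N + 1)/2")
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1          # assumed form of the U formula
    z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return r1, r2, u, z

group_1 = [18, 14, 30, 10, 39, 26, 27, 19, 35, 11]
group_2 = [32, 40, 31, 39, 15, 8, 47, 33, 52, 48, 7,
           50, 61, 58, 53, 59, 60, 65, 63, 67, 64]

r1, r2, u, z = mann_whitney_z(group_1, group_2)
print(r1, r2, u, round(z, 2))
```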
$\rho = 1 - \frac{6 \sum D^2}{N(N^2 - 1)}$ ...(2.18)
Table 2.4 Illustration of the Spearman Rank-Difference Correlation
$\rho = 1 - \frac{6(339)}{12(12^2 - 1)} = 1 - \frac{2034}{1716} = 1 - 1.185 = -0.185$
$\tau = \frac{S}{\frac{1}{2} N(N - 1)}$ ...(2.19)
where τ = Kendall's τ; S = actual total; and N = number of objects or scores
which have been ranked.
Suppose 12 students have been administered two tests and their scores are
presented in the following table.
The first step is to rank both sets of scores giving the highest score a rank of
1, the next higher a rank of 2, and so on. The following table presents the ranks
based upon two sets of the scores given in that table above. Subsequently, the
ranks of the X test are rearranged in a way that they appear in a natural order like
1, 2, 3.
Ranks based upon Two Sets of Scores given in the above Table
     A    B    C    D    E    F    G    H    I    J    K    L
X    7    3    9    10   11   5    6    4    8    2    1    12
Y    5    1    9    8    10   6    2    3    4    7    11   12
Accordingly, ranks on the Y test are adjusted. The following table presents
the ranks in a rearranged order. Subsequently, the value of S is computed. For
this, we start with the rank on the Y test from the left side.
Identical procedures are repeated for other ranks on the Y test. Thus:
S = (1 - 10) + (4 - 6) + (9 - 0) + (7 - 1) + (4 - 3) + (6 - 0) + (4 - 1) + (4 - 0) +
(2 - 1) + (2 - 0) + (1 - 0) = (-9) + (-2) + (9) + (6) + (1) + (6) + (3) + (4) + (1)
+ (2) + (1) = 33 - 11 = 22.
Substituting this in the formula given in Equation (2.19):
$\tau = \frac{S}{\frac{1}{2} N(N - 1)} = \frac{22}{\frac{1}{2} (12)(12 - 1)} = \frac{22}{66} = 0.33$
The significance of τ is tested with the formula for z, which is as follows:
$z = \frac{\tau}{\sqrt{\frac{2(2N + 5)}{9N(N - 1)}}}$ ...(2.20)
Hence
$z = \frac{0.33}{\sqrt{\frac{2(2 \times 12 + 5)}{9(12)(12 - 1)}}} = \frac{0.33}{0.2209} = 1.4938$
Since the obtained z score is less than 1.96, one can say that this is not
significant even at the 0.05 level. Accepting the null hypothesis, one can say that
the given sets of scores are not correlated. According to Siegel (1956), τ has one
advantage over ρ, and that is that the former can be generalized to partial correlation.
If both τ and ρ are computed from the same data, the answer will not be the same
and hence, numerically, they are not equal.
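The computation of S, τ and z can be sketched in Python. The ranks used below are those of the 12 students A to L, with the Y rank of student J taken as 7, the only value that completes the set of ranks 1 to 12 and reproduces S = 22. The function names are hypothetical.

```python
from math import sqrt

def kendall_s(x_ranks, y_ranks):
    """S: for each Y rank (taken in natural X order), ranks to its right that are
    higher minus ranks to its right that are lower, summed over all positions."""
    y_ordered = [y for _, y in sorted(zip(x_ranks, y_ranks))]
    s = 0
    for i, y in enumerate(y_ordered):
        later = y_ordered[i + 1:]
        s += sum(1 for v in later if v > y) - sum(1 for v in later if v < y)
    return s

def kendall_tau_and_z(x_ranks, y_ranks):
    n = len(x_ranks)
    s = kendall_s(x_ranks, y_ranks)
    tau = s / (0.5 * n * (n - 1))                          # Equation (2.19)
    z = tau / sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))    # Equation (2.20)
    return s, tau, z

x_ranks = [7, 3, 9, 10, 11, 5, 6, 4, 8, 2, 1, 12]   # students A to L
y_ranks = [5, 1, 9, 8, 10, 6, 2, 3, 4, 7, 11, 12]

s, tau, z = kendall_tau_and_z(x_ranks, y_ranks)
print(s, round(tau, 2), round(z, 2))
```

The sketch carries τ at full precision (22/66), so its z is slightly larger than the 1.4938 obtained in the text from the rounded value 0.33; either way z stays below 1.96.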
(iv) Coefficient of concordance, W
The coefficient of concordance, symbolized by the letter W, has been developed by
Kendall and is a measure of correlation between more than two sets of ranks.
Thus, W is a measure of correlation between more than two sets of rankings of
events, objects and individuals. When the investigator is interested in knowing the
inter-test reliability, W is chosen as the most appropriate statistic. One characteristic
of W which distinguishes it from other methods of correlation is that it is either zero
or positive. It cannot be negative. W can be computed with the help of the formula
given below:
$W = \frac{S}{\frac{1}{12} K^2 (N^3 - N)}$ ...(2.21)
where W = coefficient of concordance; S = sum of squares of deviations from the
mean of Rj; K = number of judges or sets of rankings; and N = number of objects
or individuals which have been ranked. Suppose four teachers (A, B, C and D)
ranked 8 students on the basis of performance shown in the classroom. The ranks
given by the four teachers are presented in the following table. The details of the
calculations have also been shown.
Teachers
(i) (ii) (lIl) (iv) (v) (vi) (vii) (viii)
A 3 4 7 5 8 6 '2 I
B 2 3 6 4 8 7 I 5
C I 3 5 6 8 7 2 4
D J 4 2 5 7 6 I 8
RJ 9 t4 20 20 31 26 6 18
the chi-square exceeds this required value, one can take this value of W as a
significant one. Thus, rejecting the null hypothesis, one can say that there is an
overall significant relationship in the ranking done by the four teachers.
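The calculation of W for the four teachers can be sketched in Python, following Equation (2.21). The conversion to chi-square at the end uses the commonly quoted form K(N - 1)W with N - 1 degrees of freedom; that step is an assumption here, since the conversion formula itself does not appear in the text above. The function name is hypothetical.

```python
def kendalls_w(rankings):
    """Return (rank sums Rj, S, W) for K sets of rankings of the same N objects."""
    k = len(rankings)
    n = len(rankings[0])
    rank_sums = [sum(column) for column in zip(*rankings)]
    mean_rank_sum = sum(rank_sums) / n
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    w = s / (k ** 2 * (n ** 3 - n) / 12)     # Equation (2.21)
    return rank_sums, s, w

rankings = [
    [3, 4, 7, 5, 8, 6, 2, 1],   # teacher A
    [2, 3, 6, 4, 8, 7, 1, 5],   # teacher B
    [1, 3, 5, 6, 8, 7, 2, 4],   # teacher C
    [3, 4, 2, 5, 7, 6, 1, 8],   # teacher D
]

rank_sums, s, w = kendalls_w(rankings)
# Assumed conversion for testing significance: chi-square = K(N - 1)W, d.f. = N - 1.
chi_square = len(rankings) * (len(rankings[0]) - 1) * w
print(rank_sums, s, round(w, 3), round(chi_square, 2))
```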
(v) Median test
The median test is used to see if two groups (not necessarily of the same size) come
from the same population or from populations having the same median. In the
median test, the null hypothesis is that there is no difference between the two sets
of scores because they have been taken from the same population. If the null
hypothesis is true, half of the scores in both the groups should lie above the median
and the remaining half of the scores should lie below the median. The following
table presents the scores of two groups of students in an arithmetic test. The first
step in the computation of a median test is to compute a common median for both
distributions taken together.
Gr. A (N = 16): 16, 17, 8, 12, 14, 9, 7, 5, 20, 22, 4, 26, 27, 5, 10, lg
Gr. B (N = 14): 28, 30, 33, 40, 45, 47, 40, 38, 42, 50, 20, lg, lg, lg
For computing the common median, both the distributions are pooled together
as shown in the following table:
Scores    f
49-53     1
44-48     2
39-43     3
34-38     1
29-33     2
24-28     3
19-23     5
14-18     5
9-13      3
4-8       5
N = 30
$$\text{Median} = 18.5 + \frac{(30/2 - 13) \times 5}{5} = 20.5$$
         Above median   Below median   Total
Gr. A    A = 3          B = 13         16
Gr. B    C = 10         D = 4          14
Total    13             17             30
x':@ 3Oil 3(4) - ( l3Xl o) ll:
From the probability table for chi-square, we find that for d.f. = 1 the chi-square
value at the 0.01 level should be 6.635. Since the obtained value of the
chi-square exceeds this value (p < 0.01), we can reject the null hypothesis and
conclude that the two samples have not been drawn from the same population or
from populations having equal medians.
The primary difference between the F test on the one hand and the Kruskal-Wallis
H test and the Friedman test on the other hand is that the F test is a parametric
analysis of variance, whereas the H test and the Friedman test are non-parametric
analyses. The H test is a one-way non-parametric analysis of variance and the
Friedman test is a two-way non-parametric analysis of variance.
12 (11) 36 (24)
18 (16) 30 (23)
29 (22)
27 (21)
R_1 = 38   R_2 = 76   R_3 = 186
The H test is used when the investigator is interested in knowing whether or
not groups of independent samples have been drawn from the same population.
If the obtained data do not fulfil the two basic parametric assumptions, namely,
the assumption of normality and the assumption of homogeneity of variances, the
H test is the most appropriate statistic. The equation for the H test is as given
below:
$$H = \frac{12}{N(N+1)} \sum \frac{R_j^2}{n_j} - 3(N+1)$$ ...(2.22)
When each sample has six or more than six cases, the H test is interpreted
as chi-square. In such a situation, d.f. = number of groups or samples minus one.
So, here d.f. = 2. Entering the probability table for chi-square, we find that for d.f.
= 2, the value of chi-square at the 0.01 level of significance should be 9.210. Since the
obtained value of the H test exceeds this required value, it can be said that the H value
is significant. Rejecting the null hypothesis, one concludes that the samples are
independent and that they have not been drawn from the same population.
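Equation (2.22) can be sketched in Python as below. The data are hypothetical (the full score table of the worked example is not legible in the source), the helper names `average_ranks` and `kruskal_wallis_h` are illustrative, and no correction for ties is applied beyond assigning average ranks.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # average of positions i..j, 1-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (rank_sums, H) using H = 12/(N(N+1)) * sum(R_j^2 / n_j) - 3(N+1)."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    rank_sums, total, start = [], 0.0, 0
    for g in groups:
        r_j = sum(ranks[start:start + len(g)])   # rank sum of this group
        rank_sums.append(r_j)
        total += r_j ** 2 / len(g)
        start += len(g)
    h = 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)
    return rank_sums, h


# Hypothetical scores for three independent groups (not taken from the text)
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rank_sums, h = kruskal_wallis_h(groups)
print(rank_sums, round(h, 2))
```

For larger samples the resulting H would be compared with the chi-square table value for d.f. equal to the number of groups minus one, as described above.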
The first step in the calculation of the Friedman test is to rank each score in
each row separately, giving the lowest score in each row a rank of 1, the next
lowest score in each row a rank of 2, and so on. The ranking can also be done in
reversed order, that is, giving the highest score in each row a rank of 1, the next
highest score in each row a rank of 2, and so on. The rank assigned to each score in
each row is given in parentheses. The Friedman test is applied to determine whether
or not the rank totals, symbolized by $R_j$, differ significantly. Now, substituting the
values in the equation of chi-square, we get
$$\chi_r^2 = \frac{12}{(3)(5)(5+1)}\left[(7)^2 + (3)^2 + \cdots + (14)^2\right] - 3(3)(5+1) = 63.866 - 54 = 9.87$$
When the number of rows (N) and the number of columns (K) are too
small, the significance of the Friedman test can be ascertained with the help of
special tables (Siegel, 1956). For example, when K = 4, N = 2 to 4 or when
K = 3, N = 2 to 9, the significance of the Friedman test can be read from these
special tables. But when the number of rows and the number of columns are
greater than those said above, the Friedman test is interpreted as the chi-square
test. In the present example, the significance of the Friedman test would be
interpreted in terms of chi-square. The d.f. is always equal to K - 1 for chi-square
applied as a test of significance of the Friedman test. Hence d.f. in the present
example would be K - 1 = 5 - 1 = 4. Entering the table for d.f. = 4, we find that the
chi-square should be 9.488 at the 0.05 level of significance. Since the obtained
value of the Friedman test exceeds this value (p < 0.05), one rejects the null
hypothesis and concludes that the three matched groups differ significantly.
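The arithmetic of the Friedman example can be checked with a short Python sketch. Only the rank totals 7, 3 and 14 are legible in the source; the totals 9 and 12 used below are an assumption, chosen because they are the pair consistent with the stated intermediate value 63.866 and with a grand total of 45 ranks. The function name `friedman_chi_square` is hypothetical.

```python
def friedman_chi_square(rank_sums, n_rows):
    """chi_r^2 = 12 / (N K (K + 1)) * sum(R_j^2) - 3 N (K + 1)."""
    k = len(rank_sums)   # K: number of columns (conditions)
    return 12 / (n_rows * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n_rows * (k + 1)


# 7, 3 and 14 come from the text; 9 and 12 are ASSUMED (illegible in the source).
# The order of the totals does not affect the statistic.
rank_sums = [7, 3, 9, 12, 14]
chi_r = friedman_chi_square(rank_sums, n_rows=3)
critical_05 = 9.488      # chi-square table value for d.f. = 4 at the 0.05 level
print(round(chi_r, 2), chi_r > critical_05)
```

Under these assumptions the statistic comes to about 9.87, which exceeds 9.488, in line with the conclusion drawn in the text.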
Correlational methods are the most commonly used statistical techniques in the
testing field. Some important methods of correlation are Pearson r, the Spearman
rank difference method and Kendall's τ, discussed earlier. A correlation coefficient
is a mathematical index that describes the direction and magnitude of a relationship.
There is also a related technique called regression, which is used to make a prediction
about scores on one variable on the basis of a known score on another variable.
These predictions are made from the regression line, which is defined as the
best-fitting straight line through a set of points in a scatter diagram. It is estimated
by using the principle of least squares, which minimizes the squared
deviations around the regression line.
This can be explained through an example. The mean is the point of least
squares for any single variable. In other words, the sum of squared deviations
around the mean will be less than it is around any value other than the mean. For
example, the mean of the five scores, namely 2, 3, 4, 5 and 6, is ΣX/N = 20/5
= 4. The squared deviation of each score around the mean can now be easily
determined. For score 6, the squared deviation is (6 - 4)² = 4; for score 5, the
squared deviation is (5 - 4)² = 1. The score 4 is equal to the mean and
therefore, the squared deviation around the mean will be (4 - 4)² = 0. Thus, by
definition, the mean will always be the point of least squares. The regression line is
the line of least squares or running mean in two dimensions, that is, in the space created
by two variables.
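The least-squares property of the mean can be verified for the five scores in the example with a few lines of Python. This is only an illustrative check; the helper name `sum_sq_dev` and the alternative reference points are chosen for the sketch.

```python
scores = [2, 3, 4, 5, 6]


def sum_sq_dev(values, point):
    """Sum of squared deviations of the values around a reference point."""
    return sum((x - point) ** 2 for x in values)


mean = sum(scores) / len(scores)                 # 20 / 5 = 4
at_mean = sum_sq_dev(scores, mean)               # 4 + 1 + 0 + 1 + 4
# Sums of squares around a few other candidate points, for comparison
others = {c: sum_sq_dev(scores, c) for c in (3, 3.5, 4.5, 5)}

print(mean, at_mean, others)
```

Every alternative point gives a larger sum of squared deviations than the mean does, which is what "point of least squares" means.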
As described earlier, a regression line is the best-fitting straight line through
a set of several points in a scatter diagram, which is the picture of the relationship
between two variables. The regression line is described by a mathematical index
called the regression equation. The general linear regression equation for the straight
line is:
Y = bX + a (figure 2.4) ...(2.24)
where a is the intercept, that is, the value of Y when X is zero. In other words, it is
the point at which the regression line crosses the Y-axis. The intercept a is found by
using the following formula:
$$a = \bar{Y} - b\bar{X}$$ ...(2.25)
Fig. 2.4: Y = bX + a
b is the slope of the regression line, also called the regression coefficient. It is
expressed as the ratio of the sum of squares for the covariance to the sum of
squares for X. In the above figure, two regression lines have been drawn based
on the equation Y = a + bX. The regression equation gives a predicted value for
Y on the basis of X. This predicted value is called Y'. In fact, the actual score
and the predicted score on Y are rarely exactly the same. Some difference between
the observed and the predicted score, that is, Y - Y', occurs and this is technically
called the residual. Symbolically, the residual is thus defined as Y - Y'. The
best-fitting line keeps the residuals to a minimum. In other words, it minimizes
the deviation between the observed and the predicted Y score. Since residuals
can be either positive or negative and will cancel to zero if averaged, the best-fitting
line is estimated by squaring each residual. In this way, it can be
said that the best-fitting line is obtained by keeping these squared residuals
as small as possible.
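The slope, intercept and residuals described above can be illustrated with a small Python sketch. The paired scores are hypothetical, and the function name `fit_line` is an illustrative choice; the slope is computed as the sum of cross-products divided by the sum of squares for X, and the intercept as the mean of Y minus b times the mean of X.

```python
def fit_line(xs, ys):
    """Least-squares intercept a and slope b for Y = bX + a."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products (covariance term) and sum of squares for X
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical paired scores (not from the text)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
predicted = [b * x + a for x in xs]                 # Y' for each X
residuals = [y - p for y, p in zip(ys, predicted)]  # Y - Y'

print(round(a, 2), round(b, 2))
print([round(r, 2) for r in residuals], round(sum(residuals), 10))
```

The residuals sum to zero (up to rounding), which is why the fit is judged on the squared residuals rather than on their average.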
CHECK YOUR PROGRESS
2.8 SUMMARY
• While carrying out research, defining the problem is the most important
step. Once this is done, the next step is formulating a hypothesis, which is a
tentative solution to the problem, in the view of the researcher. Hypotheses
are of two types: solvable and unsolvable.
• There are different types of hypotheses, which include causal hypothesis,
descriptive hypothesis, simple hypothesis, complex hypothesis, research
hypothesis, null hypothesis and statistical hypothesis.
• To test the hypotheses, the following kinds of tests are used: parametric
tests and non-parametric tests.
• In a parametric statistical test, certain conditions about the population
parameter from which the sample is drawn are specified. Examples of
parametric tests are the z test, t test and F test.
• On the other hand, in a non-parametric test, no conditions are specified
about the parameters of the population from which the sample is drawn.
• The appropriate statistical tests are chosen on the basis of the nature of the
obtained data. If the data fulfil the requirement of parametric assumptions,
any of the parametric tests which suit the purpose can be selected. On the
other hand, any of the non-parametric statistical tests that suit the purpose
can be chosen, if the data do not fulfil the parametric requirements.
• Other factors, like the number of independent and dependent variables
and the nature of the variables (whether they are nominal, ordinal or
interval), should also be kept in mind while selecting the appropriate
statistical test.
4. Research problems are of two types: solvable and unsolvable.
5. A hypothesis is a suggestion about a possible solution. It is a patented
association between variables, a conjecture about a relationship, and a
testable proposition.
6. A null hypothesis is a hypothesis that negates the research hypothesis. It is a
no difference hypothesis.
7. Statistical hypothesis is also known as the alternate hypothesis. This
hypothesis states that the two groups that are being studied would differ.
8. In a parametric statistical test, certain conditions are specified about the
population parameter from where the sample is drawn.
9. A non-parametric test does not specify any conditions about the parameters
of the population from which the sample is drawn.
10. When the significance of difference between two means is to be tested, the
Student's t test or t ratio, or the z test or z ratio, is used.
11. Degrees of freedom is the number of observations that are independent of
each other and cannot be deduced from each other.
12. Experimental variance is the effect of manipulation of the independent variable
on the dependent variable.
2.11 QUESTIONS AND EXERCISES
Short-Answer Questions
1. List any two specifications for a research problem.
2. List two considerations while selecting a research problem.
3. Define parametric statistical test.
4. Define non-parametric statistical test.
Long-Answer Questions
1. Explain a research problem and hypothesis in detail.
2. Explain in detail the three most important parametric statistics: (i) Student's
t test and z test, (ii) F ratio, and (iii) Pearson r.
3. Explain sample selection, types and size, and the steps in sampling in detail.
Sampling Design
3.0 INTRODUCTION
In this unit, you will learn how to design the right sample for a chosen study. All
items in a field of inquiry are thought to constitute a universe or a population. A
sample is any number of persons selected to represent a population according to
some rule or plan. Hence, a sample is a smaller representation of the population. A
measure based on a sample is known as a statistic.
• Gain familiarity with the probability and non-probability sampling techniques
• Learn how to select the right size of the sample and how to minimize sampling
errors
3.2 SAMPLING DESIGN
A population is the aggregate of all the cases that conform to the researcher's
designated set of specifications. Therefore, the term 'people' may mean all the
residents of India, or those engaged in factory work, or women, boys under the
age of 20 and so on, as defined by the researcher. By specification, all the boys
under 20, who would be included in the population of India, can be referred to as a
sub-population or stratum with reference to the main population.
A stratum may be defined by one or more specifications that divide a
population into mutually exclusive segments, e.g., a given population may be
subdivided into strata of males under the age of 21 and females under the age of 21.
Similarly, one can have a stratum based on education, income, etc.
A single member of a population is known as an element. Often, one wants
to know how certain characteristics of the elements are distributed in a population,
e.g., one wants to know the age distribution of people who have a particular political
preference.
A census is a count of all the elements in a population and a determination of
the distribution of their characteristics, based on the information obtained for each
of the elements. It is more economical in terms of time, effort and money to get the
desired information for only some of the elements than for all of them.
When we select some of the elements with the intention of finding out
something about the population from which they are taken, we refer to that group
of elements as a sample. The expectation here is that what we find out about the
sample is true of the population as a whole. This depends on the way the sample
is selected.
Factors influencing decisions while drawing a sample are:
• Size of the population: When the population size is large, the selection of
a sample becomes necessary.
• Costs involved in obtaining the elements: If the cost is reasonable, the
sampling inquiry is facilitated.
• Convenience of availability of the elements.
Each of these factors is important in deciding to select a sample for study.
Implications of sample design: A sample is obtained according to a 'plan'.
A sample design is a technique for selecting the items for a sample. The size of the
sample means the number of items to be included in the sample. Sample design
should be determined before data collection and the sample should be designed to
suit the study.
Sampling is a process of selecting a few from a bigger group for estimating
or predicting the prevalence of some outcome or factor regarding the bigger group.
So, a sample is a sub-group of the population one is interested in. This is the
concept of sampling; see figure 3.1.
[Figure: a sample obtained from a population]
Fig. 3.1 Sampling
understate their incomes when the government asks for it, but overstate
when social status is involved. In psychological surveys there is a
tendency to give a 'right' answer, rather than a true one.
(ii) Sampling errors: These are random variations in the sample estimate
around the true population mean. Sampling errors are errors which arise
from inaccurate sampling and they generally happen to be random variations
(when sampling is random) in the sample estimates around the true population
values. See figure 3.2.
Source: Singh, A.K., Tests, Measurements and Research Methods in Behavioural Sciences,
2008.
[Figure: Types of sampling, including disproportionate stratified sampling]
Source: Kothari, C.R., Research Methodology: Methods and Techniques, 1995.
3.2.6 Types of Sampling
d. Keep moving through the table and fill until the selected sample has
50 elements.
64 59 28 12 85 55
75 20 97 71 03 60
58 17 74 01 74
38 69 36 79 66
85 61 34 84 59
84 03 43 95 16
12 02 03 51 00
22 52 81 30 12
Simple random sample
$$^4C_2 = \frac{4!}{(4-2)!\,2!} = \frac{4 \times 3 \times 2 \times 1}{2 \times 1 \times 2 \times 1} = \frac{24}{4} = 6$$
Similarly, where N = 5, we can have 10 samples of size 2 as under:
$$^5C_2 = \frac{5!}{(5-2)!\,2!} = \frac{5 \times 4 \times 3 \times 2 \times 1}{3 \times 2 \times 1 \times 2 \times 1} = \frac{120}{12} = 10$$
But from the same population, we can have 5 samples of size 4 as under:
But if the investigator has decided to proceed with the technique of sampling
with replacement, he can derive the likely number of samples from the given
population with the help of the following equation:
$$N^n$$
where N and n are again the population size and the sample size. Suppose the size of the
population is 4 and the size of the sample is 2. In such a situation the investigator,
following the technique of sampling with replacement, can maximally draw 16
samples, that is, 4² = 4 × 4 = 16. If the four members of the population are named A,
B, C and D, the sixteen samples of size 2 would be:
AA AB AC AD
BA BB BC BD
CA CB CC CD
DA DB DC DD
The cases of AA, BB, CC and DD reflect the fact that in
sampling with replacement, an element or individual once drawn can be drawn
again. In actual practice such cases are ignored.
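The counts quoted above can be reproduced with Python's standard library. This is a small sketch for checking the arithmetic; the variable names are illustrative, and `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations, product
from math import comb

population = ["A", "B", "C", "D"]

# Sampling without replacement: unordered samples of size 2 from N = 4
without_replacement = list(combinations(population, 2))

# Sampling with replacement: ordered samples of size 2, N^n = 4^2 of them
with_replacement = ["".join(pair) for pair in product(population, repeat=2)]

print(len(without_replacement), without_replacement)
print(len(with_replacement), with_replacement)
print(comb(5, 2), comb(5, 4))   # samples of size 2 and of size 4 when N = 5
```

The `product` call lists the sixteen samples in the same order as the table (AA, AB, ..., DD), including the four repeated-element cases.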
Some advantages and disadvantages of simple random sampling are given below:
• It is difficult to ensure that the smaller elements that exist in a population are
included in the sample. For example, in a population of 500 persons, only
12 people are diabetic. The sample size is only 50. The chance that they
Quota sampling
This is an important non-probability sampling method. The population is seen as
made up of strata, and from each stratum individuals are chosen
randomly, e.g., if the population of students in a school is 5000, made up of high
and low socio-economic classes, 500 students can be chosen with 250
from the higher and 250 from the lower class. This is the quota sample.
Advantages of the quota sampling method:
Purposive sample
A purposive sample is a handpicked sample that is considered typical of the population. It is
also called a judgmental sample, because the choice is determined by the judgments
of the researcher, e.g., attitudes towards corruption would be ascertained by
interviewing professionals, academicians, tainted people and politicians. The
investigator selects the persons from these select people.
Advantages of purposive sample:
(i) Purposive sample is cost effective and easily accessible.
(ii) It is very convenient.
(iii) Only relevant individuals get included.
Disadvantages of purposive sample:
(i) In purposive sampling, randomness and representativeness are not ensured.
(ii) Generalizability is poor.
(iii) Sampling is highly subjective.
(iv) Inferential statistics cannot be applied to acceptable levels.
Accidental sampling
Accidental sampling is also called incidental sampling. It is based on a
non-probability sampling plan. Here, the investigator chooses the sample according to
convenience. Convenience and economy guide this method as a useful option,
e.g., students of a particular school are chosen because of availability.
Systematic sampling
This is another method of the non-probability sampling plan. Here, every nth person
is drawn from a predetermined list for study, e.g., every 5th roll number of a class of
50 students or every 10th name from the telephone directory and so on. It is
systematic in view of the fact that the selection is made according to a
predetermined plan. The first element selected is random and has non-probability
characteristics.
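The "every 5th roll number of a class of 50" example can be sketched in Python as follows. The function name `systematic_sample`, the seeded generator and the choice of a random starting point within the first interval are assumptions made for the illustration.

```python
import random


def systematic_sample(items, interval, rng=random):
    """Pick a random start within the first interval, then every interval-th item."""
    start = rng.randrange(interval)   # index 0 .. interval - 1
    return items[start::interval]


roll_numbers = list(range(1, 51))     # a class of 50 students
# A seeded generator is used only so that the sketch is repeatable
sample = systematic_sample(roll_numbers, 5, random.Random(42))
print(sample)
```

Whatever the starting point, the sample contains 10 roll numbers spaced 5 apart, so only the first selection involves chance.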
Advantages of snowball sampling
Saturation sampling
Saturation sampling involves drawing all elements or individuals having
characteristics that are of interest to the researcher, e.g., all psychiatrists below the
age of 45 years. Dense sampling is a method that lies between simple random
sampling and saturation sampling. Here the researcher selects 50 per cent or more
from the population and takes a majority of individuals having specific characteristics
that are of significance, e.g., 500-600 students from a class of 1000 students.
These two methods are convenient, but they are not useful when N exceeds 1000.
Double sampling
Double sampling means drawing a sample of individuals from a sample that has
already been drawn earlier, e.g., from a population of 10,000 people, a sample
of 1000 is drawn. Again from this 1000, a further sample of 200 is drawn for the
study. As another example, a questionnaire is sent to 1000 people on the issue of
pollution. Say, 50 per cent (or 500) of them respond. From these 500 persons, a
sample of 100 is drawn for an in-depth interview. This is double sampling.
(ii) The sample size should be adequate: This implies that the size should be
sufficient. A larger sample is better for reducing the error, which is the difference
between the population value and the sample value. The larger the size of
the sample, the lesser is the error. However, too large a sample may not
yield better results, as a large sample creates other problems.
Advantages of sampling methods in general
(i) Using sampling methods increases accuracy. Examining a sample is
efficient and involves less work, so the purpose of a sample is to get
maximum accuracy with minimal effort, time, money, etc.
(ii) It reduces the cost as the data are from a smaller number of cases. Statistical
calculations for accidental errors are also reduced.
(iii) Since the sample and not the universe is studied, work proceeds fast. This
is a great advantage for research.
CHECK YOUR PROGRESS
Fundamentals of sampling
Sampling is the process of obtaining information about an entire population by
examining only a part of it. One draws inferences about the parameters of a
population from which the samples are obtained. The assumption behind this is
that the sample data would enable an estimate of the population. The items selected
from the population for observation are called a sample. The method of selecting a
sample is called sample design. Any survey that is carried out on the basis of the
sample is referred to as a sample survey. To draw valid and reliable conclusions, the
sample must be truly representative of the population.
Need for sampling
Sampling reduces time and money. It is less expensive and requires less time.
Sampling makes measurements more accurate, due to smaller size.
(iv) Statistic(s) and parameter(s): A statistic is a characteristic of the sample
whereas a parameter is a characteristic of the population. Sampling
analysis involves estimating the parameter from the statistic.
of items from a population, e.g., boys under the age of 20 in a town.
This would give the sampling distribution of the proportion of boys under
20 to the whole population. As the sample size increases, the distribution tends
to become normal.
c. Student's t distribution: When the population standard deviation is not
known and the sample size is smaller than 30, the t distribution is used.
$$t = \frac{\bar{X} - \mu}{\sigma_s / \sqrt{n}}$$
that the null hypothesis of the variances being equal cannot be accepted.
The F ratio can be used in the context of hypothesis testing and also in
the context of the ANOVA technique.
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
There are tables available that give the value of χ² for given d.f. and these
can be used for finding values of χ² for relevant d.f. at a desired level of
significance for testing hypotheses.
Sampling theory
Sampling theory deals with the relationships between a population and random
samples drawn from the population. A population or a universe is an aggregate of
items with common traits. A universe constitutes the totality of the items that the
researcher seeks to study. The universe may be finite or infinite. A finite universe
contains a definite number of items. In an infinite universe the number of items is
infinite.
The theory can also be applied in the context of statistics of variables (i.e.,
data relating to some characteristic concerning the population which can be estimated).
The objectives are:
• To compare the observed and expected values of the sample and to
determine if the difference can be ascribed to the fluctuations of
sampling;
• To estimate the population parameters from the sample; and
• To find out the degree of reliability of the estimate.
The tests of significance used in dealing with problems arising in studying
large samples are different from those used for small samples. This is because the
assumptions that one has to make in the case of large samples do not hold good
for small samples. It is assumed in the case of large samples that the sampling distribution
tends to be normal and the sample values are approximately close to the population
values. This helps in applying what is known as the z-test. When n is large, the
probability of a sample value of the statistic deviating from the parameter by more
than three times its standard error is very small (it is 0.0027 as per the table giving
the area under the normal curve). The z-test is thus applied to find out the degree of
reliability of a statistic in the case of large samples. One, of course, needs to work out
appropriate standard errors, as they will enable one to give the limits within which
the parameter values would lie or would enable one to judge whether the difference
happens to be significant or not. For example, $\bar{X} \pm 3\sigma_{\bar{X}}$ would give the range
within which the parameter mean value is expected to vary with 99.73 per cent
confidence level.
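The two figures quoted above (0.0027 and 99.73 per cent) follow from the area under the standard normal curve and can be checked with Python's `math.erf`. The helper name `prob_within` is illustrative.

```python
from math import erf, sqrt


def prob_within(z):
    """Area under the standard normal curve between -z and +z."""
    return erf(z / sqrt(2))


within_3se = prob_within(3)        # about 0.9973
outside_3se = 1 - within_3se       # about 0.0027
within_196 = prob_within(1.96)     # about 0.95, the 5 per cent level

print(round(within_3se, 4), round(outside_3se, 4), round(within_196, 4))
```

The same function gives the familiar 95 per cent area for 1.96 standard errors, which is the critical value used at the 5 per cent level of significance.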
The sampling theory that is applied for large samples is not applicable in the
case of small samples because in the case of small samples, one cannot assume that the
sampling distribution is approximately normal. A different technique is required for
handling small samples, in particular when the population parameters are unknown.
Sir William S. Gosset developed a significance test, known as Student's t test,
based on the t distribution. His was a significant contribution to the theory of sampling,
applicable in the case of small samples. Student's t test is used when two conditions
are fulfilled: the sample size is 30 or less and the population variance is not known.
While using the t test, one assumes that in the population from which the sample has
been drawn:
• The sample is randomly drawn
• Observations are independent
• There is no measurement error
• And that in the case of two samples where equality of the two population
means is to be tested, one assumes that the population variances are equal.
To apply the t test, one needs to work out the value of the test statistic (i.e., t)
and then compare it with the table value of t (based on the t distribution) at a certain
level of significance for the given degrees of freedom. If the calculated value of t is
either equal to or exceeds the table value, the inference is that the difference is
significant. But if the calculated value of t is less than the concerning table value of
t, the difference is not treated as significant. The following formulae are commonly
used to calculate the t value:
(i) To test the significance of the mean of a random sample:
$$t = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$$
where $\bar{X}_1$ = Mean of sample one
$\bar{X}_2$ = Mean of sample two
$\sigma_{\bar{X}_1 - \bar{X}_2}$ = Standard error of difference between two sample means
(v) To test the difference in case of paired or correlated samples data (in
which case the t test is often described as the difference test):
$$t = \frac{\bar{D} - 0}{\sigma_D / \sqrt{n}}$$
where
$$\sigma_D = \sqrt{\frac{\sum D_i^2 - n\bar{D}^2}{n - 1}}$$
$D_i$ = Differences {i.e., $D_i = (X_i - Y_i)$}
n = number of pairs in two samples and the d.f. = (n - 1)
Sandler's A test
Joseph Sandler has developed an alternate approach based on a simplified version
of the t-test. Sandler's A test serves the same purpose as is accomplished by the
t-test relating to paired data. Researchers can also use the A-test when correlated
samples are employed and the hypothesized mean difference is taken as zero, i.e.,
$H_0: \mu_D = 0$. Psychologists generally use this test in the case of two groups that are
matched with respect to some extraneous variable(s). While using the A-test, one has
to work out the A-statistic, which yields exactly the same results as Student's t-test.
The A-statistic is found as follows.
$$A = \frac{\sum D_i^2}{(\sum D_i)^2}$$
(i) A in terms of t can be expressed as
$$A = \frac{n - 1}{n \cdot t^2} + \frac{1}{n}$$
(ii) t in terms of A can be expressed as
$$t = \sqrt{\frac{n - 1}{A \cdot n - 1}}$$
Thus, computing the A-statistic is relatively simple. Using the A-statistic helps in
saving time and labour considerably, especially when the matched groups are to
be compared with respect to a large number of variables. Researchers would be
well advised to replace Student's t test by Sandler's A test whenever correlated
sets of scores are employed. Sandler's A-statistic can also be used 'in the one
sample case as a direct substitute for the Student t-ratio', because Sandler's A is
algebraically equivalent to Student's t. When one uses the A-test in the one-sample
case, the following steps are involved:
(i) Subtract the hypothesised mean of the population ($\mu_H$) from each
individual score ($X_i$) to obtain $D_i$ and then work out $\sum D_i$.
(ii) Square each $D_i$ and then obtain the sum of such squares, i.e., $\sum D_i^2$.
(iii) Find the A-statistic as under:
$$A = \frac{\sum D_i^2}{(\sum D_i)^2}$$
(iv) Look up the table of the A-statistic for (n - 1) degrees of freedom at a
given level of significance (using one-tailed or two-tailed values
depending upon $H_a$) to find the critical value of A.
(v) Finally, the inference can be drawn as indicated below:
When the calculated value of A is equal to or less than the table value,
reject $H_0$ (or accept $H_a$), but when the computed A is greater than its table
value, then accept $H_0$.
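The algebraic equivalence between Sandler's A and Student's t for paired data can be checked numerically. The sketch below uses hypothetical difference scores and illustrative function names; it computes A from the sums of the differences, computes t from the mean and standard deviation of the differences, and confirms that t equals the square root of (n - 1)/(A·n - 1).

```python
from math import sqrt


def sandler_a(diffs):
    """A = (sum of squared differences) / (square of the sum of differences)."""
    return sum(d * d for d in diffs) / sum(diffs) ** 2


def paired_t(diffs):
    """t = mean(D) / (sigma_D / sqrt(n)), with sigma_D using n - 1 in the denominator."""
    n = len(diffs)
    mean_d = sum(diffs) / n
    sigma_d = sqrt((sum(d * d for d in diffs) - n * mean_d ** 2) / (n - 1))
    return mean_d / (sigma_d / sqrt(n))


# Hypothetical differences D_i = X_i - Y_i for six matched pairs (not from the text)
diffs = [2, 4, 1, 3, 5, 3]
n = len(diffs)

a_stat = sandler_a(diffs)
t_stat = paired_t(diffs)
t_from_a = sqrt((n - 1) / (a_stat * n - 1))

print(round(a_stat, 4), round(t_stat, 3), round(t_from_a, 3))
```

Because the two statistics carry the same information, a small A corresponds to a large t, which is why the decision rule for A rejects the null hypothesis when A is at or below the table value.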
words, the difference is outside the limits, i.e., it lies in the 5 per cent area
(2.5 per cent on both sides) outside the 95 per cent area of the sampling
distribution. Hence one can conclude with 95 per cent confidence that the
said difference is not due to fluctuations of sampling. In such a case, the
hypothesis that there is no difference is rejected at the 5 per cent level of
significance.
But if the difference is less than 1.96 times the S.E., then it is considered not
significant at the 5 per cent level. It can then be said with 95 per cent confidence
that it is because of the fluctuations of sampling. In such a case, the null
hypothesis stands true. 1.96 is the critical value at the 5 per cent level. The
product of the critical value at a certain level of significance and the S.E. is
described as 'sampling error' at that particular level of significance. One
can test the difference at certain other levels of significance as well, depending
upon one's requirement.
Important formulae for computing the standard errors concerning various measures
based on samples are as under:
where
n = number of events in each sample,
p = probability of success in each event,
q = probability of failure in each event.
$$\sigma_{p_1 - p_2} = \sqrt{p \cdot q \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where
q = 1 - p
$n_1$ = number of events in sample one
$n_2$ = number of events in sample two
Note: Instead of the above formula, we use the following formula:
$$\sigma_{p_1 - p_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
when samples are drawn from two heterogeneous populations and where we cannot
have the best estimate of proportion in the universe on the basis of the given sample
data. Such a situation often arises in the study of association of attributes.
$$\sigma_{\bar{X}} = \frac{\sigma_p}{\sqrt{n}}$$
where
$\sigma_p$ = standard deviation of population
n = number of items in the sample
Note: This formula is used even when n is 30 or less.
II. Standard error of mean when population standard deviation is unknown:
$$\sigma_{\bar{X}} = \frac{\sigma_s}{\sqrt{n}}$$
where
$\sigma_s$ = standard deviation of the sample and is worked out as under:
$$\sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$
where
n = number of items in the sample.
v. Standard error of the coefficient of simple correlation:
$$\sigma_r = \frac{1 - r^2}{\sqrt{n}}$$
where
r = coefficient of simple correlation
n = number of items in the sample
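vi. Standard error of the difference between the means of two samples:
c. When two samples are drawn from the same population:
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\sigma_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
d. When two samples are drawn from different populations:
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_{p_1}^2}{n_1} + \frac{\sigma_{p_2}^2}{n_2}}$$
(If $\sigma_{p_1}$ and $\sigma_{p_2}$ are not known, then in their places $\sigma_{s_1}$ and $\sigma_{s_2}$ respectively
may be substituted.)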
e. In case of sampling of variables (small samples):
$$SE_{\bar{X}} = \frac{\sigma_p}{\sqrt{n}}$$
In cases in which the population is very large in relation to the size of the sample, the finite
population multiplier is close to one and has little effect on the calculation of the S.E. In such a
case, where the sampling fraction is less than 0.05, the finite population multiplier is not
generally used.
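$$\sigma_{\bar{X}} = \frac{\sigma_s}{\sqrt{n}} = \frac{1}{\sqrt{n}} \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$
ii. Standard error of the difference between two sample means when $\sigma_p$ is unknown:
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sum (X_{1i} - \bar{X}_1)^2 + \sum (X_{2i} - \bar{X}_2)^2}{n_1 + n_2 - 2}} \cdot \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$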
is referred to as a 'point estimate', and in case where there is a range of values
for the parameter, it is termed an 'interval estimate'. The researcher has to make
these two types of estimates through sampling analysis. While making estimates of
population parameters, the researcher can give only the best point estimate, as
otherwise he/she has to speak in terms of intervals and probabilities, for he/she
can never estimate with certainty the exact values of population parameters.
A good estimator possesses the following properties:
(i) An estimator should on the average be equal to the value of the
parameter being estimated. This is known as the property of
unbiasedness. An estimator is said to be unbiased if the expected
value of the estimator is equal to the parameter being estimated. The
sample mean ($\bar{X}$) is the most widely used estimator because it provides
an unbiased estimate of the population mean ($\mu$).
(ii) An estimator should have a relatively small variance. This property is
technically described as the property of efficiency because an estimator
is expected to have the smallest variance.
(iii) An estimator is expected to use as much information as possible as
is available from the sample. This property is known as the property
of sufficiency.
(iv) As the sample size becomes larger and larger, an estimator has to
approach the value of the population parameter. This property is referred
to as the property of consistency.
The researcher must select the appropriate estimator(s) for his/her study by
keeping in view the above stated properties.
In point estimation, the sample mean $\bar{X}$ is the best estimator of the population mean
$\mu$, and its sampling distribution, when the sample is sufficiently large, tends to
approximate the normal distribution. If one knows the sampling distribution
of $\bar{X}$, one can make statements about any estimate that one may make from the
sampling information. Suppose a sample of 36 students yields an arithmetic mean
of 6.2, i.e., $\bar{X}$ = 6.2. Replace these students' names on the population list, draw
another random sample of 36, and assume that one gets a mean of 7.5 this time.
Similarly, a third sample may yield a mean of 6.9; a fourth a mean of 6.7; and so on.
One can go on drawing such samples till one accumulates a large number of means of
samples of 36. Each such sample mean is a separate point estimate of the population
mean. When such means are presented in the form of a distribution, the distribution
happens to be quite close to normal. This is a characteristic of a distribution of
sample means (and also of other sample statistics). Even if the population is not
normal, the sample means drawn from that population are dispersed around the
parameter in a distribution that is generally close to normal; the mean of the
distribution of sample means is equal to the population mean. This is true in case
of large samples, as expected from the central limit theorem. The relationship
between a population distribution and a distribution of sample means is critical for
drawing inferences about parameters.
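The repeated-sampling argument can be imitated with a small simulation. This is a minimal sketch under stated assumptions: the population is a made-up, strongly skewed (exponential) set of 10,000 values, and 2,000 samples of 36 are drawn; none of these figures comes from the text.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical non-normal population.
population = [random.expovariate(1.0) for _ in range(10_000)]

# Draw many samples of 36 and keep each sample mean (a point estimate).
sample_means = [
    statistics.mean(random.sample(population, 36)) for _ in range(2_000)
]

pop_mean = statistics.mean(population)
mean_of_means = statistics.mean(sample_means)
se_expected = statistics.pstdev(population) / 36 ** 0.5  # sigma_p / sqrt(n)
se_observed = statistics.stdev(sample_means)

print(round(pop_mean, 3), round(mean_of_means, 3))
print(round(se_expected, 3), round(se_observed, 3))
```

The mean of the sample means should sit close to the population mean, and the spread of the sample means should be close to the population standard deviation divided by the square root of 36, even though the population itself is far from normal.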
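The standard error of the mean is
$$\sigma_{\bar{X}} = \frac{\sigma_p}{\sqrt{n}}$$
or, when the sample standard deviation is used,
$$\sigma_{\bar{X}} = \frac{\sigma_s}{\sqrt{n}}$$
where
$$\sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}}$$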
With the help of the above, one can arrive at interval estimates about the
parameter in probabilistic terms (utilising the fundamental characteristics of the
normal distribution). Suppose one takes a sample of 36 items and works out its
mean ($\bar{X}$) to be equal to 6.20 and its standard deviation ($\sigma_s$) to be equal to 3.8.
Then the best point estimate of the population mean ($\mu$) is 6.20. The S.E. of the mean
($\sigma_{\bar{X}}$) would be $3.8 / \sqrt{36} = 3.8 / 6 = 0.633$. And if one takes the interval estimate
of $\mu$ to be $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$, or $6.20 \pm 1.24$, or from 4.96 to 7.44, it means that there
is a 95 per cent chance that the population mean is within the 4.96 to 7.44 interval.
This means that if one were to take a complete census of all items in the population,
the chances are 95 to 5 that one would find the population mean between 4.96 and
7.44. In case one wants to have an estimate that will hold for a much smaller
range, then one must either accept a smaller degree of confidence in the results or
take a sample large enough to provide this smaller interval with adequate confidence
levels. Usually one thinks of increasing the sample size till one can secure the
desired interval estimate and the degree of confidence.
Illustration 3.1: From a random sample of 36 New Delhi civil service personnel, the
mean age and the sample standard deviation were found to be 40 years and 4.5
years respectively. Construct a 95 per cent confidence interval for the mean age of
civil servants in New Delhi.
$$\bar{X} \pm z \frac{\sigma_s}{\sqrt{n}}$$
or $40 \pm 1.96 \dfrac{4.5}{\sqrt{36}}$
or $40 \pm (1.96)(0.75)$
or $40 \pm 1.47$ years
Illustration 3.2: In a random selection of 64 of the 2400 intersections in a small
city, the mean number of scooter accidents per year was 3.2 and the sample
standard deviation was 0.8.
(1) Make an estimate of the standard deviation of the population from the
sample standard deviation.
(2) Work out the standard error of the mean for this finite population.
(3) If the desired confidence level is 0.90, what will be the upper and lower limits
of the confidence interval for the mean number of accidents per intersection
per year?
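(1) The best point estimate of the population standard deviation is the sample
standard deviation: $\hat{\sigma}_p = \sigma_s = 0.8$.
(2) The standard error of the mean for the given finite population is as follows:
$$\sigma_{\bar{X}} = \frac{\sigma_s}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}} = \frac{0.8}{\sqrt{64}} \times \sqrt{\frac{2400 - 64}{2400 - 1}} = \frac{0.8}{8} \times \sqrt{\frac{2336}{2399}}$$
$$= (0.1)(0.987) = 0.0987$$
(3) The 90 per cent confidence interval for the mean number of accidents per
intersection per year is as follows:
$$\bar{X} \pm z \left\{ \frac{\sigma_s}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}} \right\}$$
$= 3.2 \pm (1.645)(0.0987)$
$= 3.2 \pm 0.16$ accidents per intersection.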
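The arithmetic of Illustration 3.2 can be checked with a short sketch. The function name `finite_population_se` is an assumption made for the example; the figures are those of the illustration.

```python
import math


def finite_population_se(sd: float, n: int, N: int) -> float:
    """Standard error of the mean times the finite population multiplier."""
    multiplier = math.sqrt((N - n) / (N - 1))
    return sd / math.sqrt(n) * multiplier


se = finite_population_se(sd=0.8, n=64, N=2400)
margin = 1.645 * se  # z for a 90 per cent confidence level
lower, upper = 3.2 - margin, 3.2 + margin

print(round(se, 4))
print(round(margin, 2))
print(round(lower, 2), round(upper, 2))
```

Because 64 is a small fraction of 2400, the multiplier is close to one and the interval is only slightly narrower than the one obtained without it.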
When the sample size happens to be a large one or when the population
standard deviation is known, one uses the normal distribution for determining confidence
intervals for the population mean as stated above. But how should one handle the
estimation problem when the population standard deviation is not known and the sample
size is small (i.e., when n < 30)? In such situations, the normal distribution is not the
appropriate tool and one can use the t distribution for the purpose. While using the
t distribution, one assumes that the population is normal or approximately normal.
There is a different t distribution for each of the possible degrees of freedom. When
using the t distribution for estimating a population mean, one works out the degrees of
freedom as equal to n - 1, where n means the size of the sample, and then looks for the
critical value of 't' in the t distribution table for the appropriate degrees of freedom at a
given level of significance. This can be illustrated by taking an example:
Illustration 3.3: The foreman of ABC mining company has estimated the average
quantity of iron ore extracted to be 36.8 tonnes per shift and the sample standard
deviation to be 2.8 tonnes per shift, based upon a random selection of 4 shifts.
Construct a 90 per cent confidence interval around this estimate.
Solution: As the standard deviation of the population is not known and the size of
the sample is small, one can use the t distribution for finding the required confidence
interval about the population mean. The given information can be written as under:
$\bar{X}$ = 36.8 tonnes per shift
$\sigma_s$ = 2.8 tonnes per shift
n = 4
Degrees of freedom = n - 1 = 4 - 1 = 3, and the critical value of t for a 90 per cent
confidence interval, or at the 10 per cent level of significance, is 2.353 for 3 d.f. (as per
the table of the t distribution).
$$\bar{X} \pm t \frac{\sigma_s}{\sqrt{n}}$$
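The interval for Illustration 3.3 follows directly from the formula. The sketch below takes the critical value 2.353 from the text as a given input rather than computing it; the function name is an assumption.

```python
import math


def t_confidence_interval(mean: float, sample_sd: float, n: int,
                          t_critical: float) -> tuple[float, float]:
    """Interval mean +/- t * (sample_sd / sqrt(n)) for a small sample."""
    margin = t_critical * sample_sd / math.sqrt(n)
    return mean - margin, mean + margin


# Figures from Illustration 3.3; t = 2.353 for 3 d.f. as read from the table.
lower, upper = t_confidence_interval(36.8, 2.8, 4, t_critical=2.353)
print(round(lower, 2), round(upper, 2))
```

With only four shifts the interval is wide, roughly 36.8 plus or minus 3.29 tonnes per shift.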
The standard error of proportion is estimated as
$$\sigma_p = \sqrt{\frac{p \cdot q}{n}}$$
Using the above estimated standard error of proportion, one can work out
the confidence interval for the population proportion thus:
$$p \pm z \cdot \sqrt{\frac{p \cdot q}{n}}$$
where
p = sample proportion of successes;
q = 1 - p;
n = number of trials (size of the sample);
z = standard variate for the given confidence level (as per the normal curve
area table).
This formula can be explained as in Illustration 3.4.
Illustration 3.4: A market research survey in which 64 consumers were contacted
states that 64 per cent of all consumers of a certain product were motivated by the
product's advertising. Find the confidence limits for the proportion of consumers
motivated by advertising in the population, given a confidence level equal to 0.95.
Solution: The given information can be written as under:
n = 64
p = 64 per cent or 0.64
q = 1 - p = 1 - 0.64 = 0.36
and the standard variate (z) for 95 per cent confidence is 1.96 (as per the normal
curve area table).
Thus, the 95 per cent confidence interval for the proportion of consumers
motivated by advertising in the population is:
$$p \pm z \cdot \sqrt{\frac{p \cdot q}{n}}$$
$$= 0.64 \pm 1.96 \sqrt{\frac{(0.64)(0.36)}{64}}$$
$= 0.64 \pm (1.96)(0.06)$
$= 0.64 \pm 0.1176$
Thus, the lower confidence limit is 52.24 per cent and the upper confidence limit is
75.76 per cent.
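The limits in Illustration 3.4 can be reproduced as follows. This is a small sketch; the function name `proportion_confidence_interval` is an assumption.

```python
import math


def proportion_confidence_interval(p: float, n: int,
                                   z: float = 1.96) -> tuple[float, float]:
    """Interval p +/- z * sqrt(p * q / n) for a large sample."""
    q = 1 - p
    margin = z * math.sqrt(p * q / n)
    return p - margin, p + margin


lower, upper = proportion_confidence_interval(0.64, 64)
print(round(lower * 100, 2), round(upper * 100, 2))
```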
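For the sake of convenience, one can summarise the formulae which give
confidence intervals while estimating the population mean ($\mu$) and the population
proportion (p), as shown in Table 3.1.
Table 3.1 Important formulae concerning estimation
1. Estimating population mean ($\mu$) when we know $\sigma_p$
   Infinite population: $\bar{X} \pm z \cdot \dfrac{\sigma_p}{\sqrt{n}}$
   Finite population*:  $\bar{X} \pm z \cdot \dfrac{\sigma_p}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}$
2. Estimating population mean ($\mu$) when we do not know $\sigma_p$ and use $\sigma_s$ as
   the best estimate of $\sigma_p$ and the sample is large (i.e., n > 30)
   Infinite population: $\bar{X} \pm z \cdot \dfrac{\sigma_s}{\sqrt{n}}$
   Finite population*:  $\bar{X} \pm z \cdot \dfrac{\sigma_s}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}$
3. Estimating population mean ($\mu$) when we do not know $\sigma_p$ and use $\sigma_s$ as
   the best estimate of $\sigma_p$ and the sample is small (i.e., n ≤ 30)
   Infinite population: $\bar{X} \pm t \cdot \dfrac{\sigma_s}{\sqrt{n}}$
   Finite population*:  $\bar{X} \pm t \cdot \dfrac{\sigma_s}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}$
4. Estimating the population proportion (p) when p is not known but the sample
   is large
   Infinite population: $p \pm z \cdot \sqrt{\dfrac{p q}{n}}$
   Finite population*:  $p \pm z \cdot \sqrt{\dfrac{p q}{n}} \times \sqrt{\dfrac{N - n}{N - 1}}$
* In case of a finite population, the standard error has to be multiplied by the
finite population multiplier, $\sqrt{(N - n)/(N - 1)}$.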
$\bar{X}$ = sample mean;
z = the value of the standard variate at a given confidence level (to be
read from the table giving the areas under the normal curve as shown
in the appendix); it is 1.96 for a 95 per cent confidence level;
n = size of the sample;
$\sigma_p$ = standard deviation of the population (to be estimated from past
experience or on the basis of a trial sample). Suppose one has
$\sigma_p = 4.8$ for the purpose. Then
$$3 = 1.96 \frac{4.8}{\sqrt{n}}$$
$$n = \frac{(1.96)^2 (4.8)^2}{(3)^2} = 9.834 \approx 10.$$
In general, if one wants to estimate $\mu$ in a population with standard deviation
$\sigma_p$ with an error no greater than e by calculating a confidence interval with
confidence corresponding to z, the necessary sample size, n, equals as under:
$$n = \frac{z^2 \sigma_p^2}{e^2}$$
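This is true if the population is infinite, but if the population is finite, the
above stated formula for determining sample size will become
$$n = \frac{z^2 \cdot N \cdot \sigma_p^2}{(N - 1) e^2 + z^2 \sigma_p^2}$$
This follows from the confidence interval for $\mu$ in a finite population:
$$\bar{X} \pm z \frac{\sigma_p}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}}$$
where $\sqrt{(N - n)/(N - 1)}$ is the finite population multiplier and all other terms mean
the same thing as stated above. If the precision is taken as equal to 'e', then one
has
$$e = z \frac{\sigma_p}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}}$$
$$e^2 = z^2 \frac{\sigma_p^2}{n} \times \frac{N - n}{N - 1}$$
$$e^2 (N - 1) = \frac{z^2 \sigma_p^2 N}{n} - \frac{z^2 \sigma_p^2 n}{n}$$
$$e^2 (N - 1) + z^2 \sigma_p^2 = \frac{z^2 \sigma_p^2 N}{n}$$
$$n = \frac{z^2 \cdot \sigma_p^2 \cdot N}{e^2 (N - 1) + z^2 \sigma_p^2}$$
where
N = size of population
n = size of sample
e = acceptable error (the precision)
$\sigma_p$ = standard deviation of population
z = standard variate at a given confidence level.
This is how one obtains the above stated formula for determining n in the
case of a finite population, given the precision and confidence levels.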
Illustration 3.5: Determine the size of the sample for estimating the true weight of
the cereal containers for the universe with N = 5000 on the basis of the following
information:
(1) The variance of weight = 4 ounces on the basis of past records.
(2) The estimate should be within 0.8 ounces of the true average weight with 99 per
cent probability.
Will there be a change in the size of the sample if one assumes an infinite population in
the given case? If so, explain by how much.
The confidence interval for $\mu$ in the finite population is
$$\bar{X} \pm z \frac{\sigma_p}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}}$$
and accordingly the sample size can be worked out as under:
$$n = \frac{z^2 \cdot N \cdot \sigma_p^2}{(N - 1) e^2 + z^2 \sigma_p^2}$$
$$= \frac{(2.57)^2 (5000) (2)^2}{(5000 - 1)(0.8)^2 + (2.57)^2 (2)^2}$$
$$= \frac{132098}{3199.36 + 26.4196} = \frac{132098}{3225.7796} = 40.95 \approx 41$$
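The finite and infinite versions of the sample-size formula for a mean can be compared on the figures of Illustration 3.5. This is a minimal sketch; the function name `sample_size_for_mean` and the optional-argument design are assumptions.

```python
from typing import Optional


def sample_size_for_mean(z: float, sigma: float, e: float,
                         N: Optional[int] = None) -> float:
    """Sample size for estimating a mean within error e.

    With N omitted the population is treated as infinite;
    otherwise the finite population formula is used.
    """
    if N is None:
        return z ** 2 * sigma ** 2 / e ** 2
    return (z ** 2 * N * sigma ** 2) / ((N - 1) * e ** 2 + z ** 2 * sigma ** 2)


# Variance 4 gives sigma = 2; z = 2.57 for 99 per cent; e = 0.8; N = 5000.
n_finite = sample_size_for_mean(2.57, 2, 0.8, N=5000)
n_infinite = sample_size_for_mean(2.57, 2, 0.8)
print(round(n_finite, 2), round(n_infinite, 2))
```

The two results differ by less than one unit here, because the sample is a very small fraction of the 5000 containers.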
$$p \pm z \cdot \sqrt{\frac{p \cdot q}{n}}$$
where
p = sample proportion, q = 1 - p;
z = the value of the standard variate at a given confidence level, to
be worked out from the table showing the area under the normal curve;
n = size of sample.
Since p is what one is trying to estimate, what value should one assign to it?
One method may be to take the value of p = 0.5, in which case n will be the
maximum and the sample will yield at least the desired precision. This will be the
most conservative sample size. The other method may be to take an initial estimate
of p, which may either be based on personal judgment or may be the result of a
pilot study. In this case, it is suggested that a pilot study of something like 225 or
more items may result in a reasonable approximation of the p value. Then, with the
given precision rate, the acceptable error, e, can be expressed as under:
$$e = z \cdot \sqrt{\frac{p q}{n}}$$
or $$e^2 = z^2 \frac{p q}{n}$$
or $$n = \frac{z^2 \cdot p \cdot q}{e^2}$$
The formula gives the size of the sample in case of an infinite population when
one has to estimate the proportion in the universe. But in case of a finite population
the above formula will change to:
$$n = \frac{z^2 \cdot p \cdot q \cdot N}{e^2 (N - 1) + z^2 \cdot p \cdot q}$$
For example, with z = 2.005, p = 0.02, e = 0.02 and N = 4000:
$$n = \frac{(2.005)^2 (0.02)(1 - 0.02)(4000)}{(0.02)^2 (4000 - 1) + (2.005)^2 (0.02)(1 - 0.02)}$$
$$= \frac{315.1699}{1.5996 + 0.0788} = \frac{315.1699}{1.6784} = 187.78 \approx 188$$
$$n = \frac{z^2 \cdot p \cdot q}{e^2} = \frac{(1.96)^2 (0.5)(1 - 0.5)}{(0.03)^2} = \frac{0.9604}{0.0009} = 1067.11 \approx 1067$$
Then, the most conservative sample size needed for the problem is 1067.
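The sample-size formulae for a proportion can be checked numerically. The sketch below reproduces the two computations given above; the function name `sample_size_for_proportion` is an assumption.

```python
from typing import Optional


def sample_size_for_proportion(z: float, p: float, e: float,
                               N: Optional[int] = None) -> float:
    """Sample size for estimating a proportion within error e."""
    pq = p * (1 - p)
    if N is None:  # infinite population
        return z ** 2 * pq / e ** 2
    return (z ** 2 * pq * N) / (e ** 2 * (N - 1) + z ** 2 * pq)


# Most conservative case: p = 0.5, e = 0.03, 95 per cent confidence.
print(round(sample_size_for_proportion(1.96, 0.5, 0.03), 2))

# Finite population case: z = 2.005, p = 0.02, e = 0.02, N = 4000.
print(round(sample_size_for_proportion(2.005, 0.02, 0.02, N=4000), 2))
```

Since p(1 - p) is largest at p = 0.5, any other assumed value of p gives a smaller required sample for the same z and e.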
Determination of sample size through the approach based on Bayesian
Statistics:
Another approach to determine n is to use Bayesian statistics, known as
the Bayesian approach. The procedure for finding the optimal value of n, or the
size of the sample, under this approach is as under:
(i) Find the expected value of the sample information (EVSI) for every
possible n;
(ii) Work out reasonably the approximate cost of taking a sample of every
possible n;
(iii) Compare the EVSI and the cost of the sample for every possible n. In
other words, work out the expected net gain (ENG) for every possible
n as stated below:
For a given sample size (n): (EVSI) - (Cost of sample) = (ENG)
(iv) From (iii) above, the optimal sample size, that is, the value of n which
maximizes the difference between the EVSI and the cost of the sample,
can be determined.
One disadvantage of this approach is that the computation of EVSI for
every possible n and then comparing the same with the respective cost is often
a very cumbersome task and is generally feasible with the help of computers only.
Therefore, although theoretically optimal, this approach is rarely used in practice.
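Steps (i) to (iv) amount to a search over candidate sample sizes. The sketch below shows only that search; the EVSI and cost functions are entirely hypothetical placeholders (a diminishing-returns curve and a linear cost), since the text does not specify how EVSI is computed.

```python
from typing import Callable, Iterable


def optimal_sample_size(candidates: Iterable[int],
                        evsi: Callable[[int], float],
                        cost: Callable[[int], float]) -> int:
    """Return the n that maximizes ENG(n) = EVSI(n) - cost(n)."""
    return max(candidates, key=lambda n: evsi(n) - cost(n))


def hypothetical_evsi(n: int) -> float:
    # Assumed shape: value of information rises with n but levels off.
    return 1000 * (1 - 1 / (1 + 0.05 * n))


def hypothetical_cost(n: int) -> float:
    # Assumed shape: fixed cost of 50 plus 2 per sampled item.
    return 50 + 2 * n


best_n = optimal_sample_size(range(1, 501), hypothetical_evsi, hypothetical_cost)
print(best_n, round(hypothetical_evsi(best_n) - hypothetical_cost(best_n), 2))
```

In a real study the difficult part is the EVSI function itself, which is why the approach is described as cumbersome without a computer.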
3.4 SUMMARY
1. When all items of a universe are enumerated, it is called a census inquiry.
2. A simple random sample is also known as an unrestricted random sample.
Here every individual of a population has an equal chance of being
included in the sample.
3. Probability sampling is preferred because it ensures that every element of
the population has an equal chance of being included.
9. When sampling is not from a normal population and the sample size n is small,
the shape of the distribution will depend on the parent population. But as n gets
larger, the sampling distribution would resemble a normal distribution,
irrespective of the shape of the population distribution.
10. The Bayesian statistic suggests weighing the cost of additional information
obtained against the expected value of the additional information. The second
approach is difficult to determine, so the first is used often.
Short-Answer Questions
1. Write two points to be kept in mind while designing a sample.
2. Define bias and sampling errors.
3. What are probability sampling methods?
4. What is the sampling distribution of the mean?
4.0 INTRODUCTION
In this unit, you will learn the meaning and the purpose of research design. A
research design is a plan and a systematic procedure for collecting the data and
performing analysis on that data for the purpose of research. In other words, a
research design is a conceptual framework for conducting research. It is a blueprint
for collecting, measuring and analysing data. Research designs tell us what, where,
when and how any inquiry is to be made.
The unit explains the features and important concepts related to research
design. It throws light on different types of research designs. It explains the basic
principles and different types of experimental designs. It also discusses quasi-
experimental design.
The later part of the unit teaches you the principles and different types of
qualitative research. It will also make you familiar with different types of interviews.
4.1 UNIT OBJECTIVES
After going through this unit, you will be able to:
Type of Study
Research Design
CHECK YOUR PROGRESS
1. What is a design?
2. What is a variable?
3. What are the different research designs?
4. What is meant by randomization?
Experimental designs are the structures of an experiment, which are of two types:
informal and formal. Informal designs have less sophistication and less control,
while the formal designs offer more control and lend themselves to the use of
precise statistical procedures for analysis.
Source: Kothari, C.R., Research Methodology: Methods and Techniques, 1995.
The basic assumption of this design is that the groups are identical with
respect to their behaviour towards the phenomenon under consideration. If
this is not true, there are chances of extraneous factors entering into the
treatment. Problems with respect to time lapses do not enter into this design.
This design is superior to the before-and-after without control design.
(iii) Before-and-after with control design: In this design, two groups are
selected and the dependent variable is measured in both, for an identical
length of time prior to introducing the independent variable or treatment.
Then the independent variable or the treatment is introduced only in the
experimental condition (as shown in the table below). After an identical
time period lapses, the dependent variable is measured in both the conditions.
The treatment effect is determined by subtracting the change in the dependent
variable in the control condition from the change in the dependent variable
in the experimental condition.
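The subtraction described for the before-and-after with control design can be written out explicitly. The function name and the four scores below are made up for illustration.

```python
def treatment_effect(exp_before: float, exp_after: float,
                     ctrl_before: float, ctrl_after: float) -> float:
    """Change in the experimental condition minus change in the control condition."""
    change_experimental = exp_after - exp_before
    change_control = ctrl_after - ctrl_before
    return change_experimental - change_control


# Hypothetical mean scores on the dependent variable.
effect = treatment_effect(exp_before=50, exp_after=62,
                          ctrl_before=51, ctrl_after=55)
print(effect)  # 12 - 4 = 8
```

Subtracting the control group's change removes whatever part of the experimental group's change would have occurred without the treatment over the same period.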
Since the sample is drawn randomly from the population and then again
randomly assigned to the two conditions (experimental and control), and
each of these conditions then receives a different treatment (x and y), the
independent variables, the conclusions drawn from the samples are applicable
to the population. Such a design has the merit of randomizing the sample.
One could test each group before and after the treatment after ensuring the
equivalence of the two groups.
Individual differences that exist in the two conditions and the experimenter
influences (e.g., the teacher or trainer differences in applying the methods)
can be further controlled by the random replication design. Such differences
get minimized with such a design.
A  B  C  D  E
B  A  D  E  C
C  E  A  B  D
D  C  E  A  B
E  D  B  C  A

 1   2   3   4   5
 6   7   8   9  10
11  12  13  14  15
16  17  18  19  20
21  22  23  24  25
the timeslots on the first day. Now we use the five columns of the square to assign
the five temperatures, and the letters to assign the recipes A, B, C, D, E.
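The defining property of a Latin square, that every letter appears exactly once in each row and each column, can be verified mechanically. The sketch below checks the 5 × 5 square given above; the function name `is_latin_square` is an assumption.

```python
def is_latin_square(rows: list[str]) -> bool:
    """True when every symbol occurs exactly once in each row and column."""
    n = len(rows)
    symbols = set(rows[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in rows)
    cols_ok = all(set(col) == symbols for col in zip(*rows))
    return rows_ok and cols_ok


# The square from the text: rows are timeslot groups, columns are temperatures.
square = ["ABCDE", "BADEC", "CEABD", "DCEAB", "EDBCA"]
print(is_latin_square(square))
```

This balance is what allows row and column effects to be separated from the effect of the recipes.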
(iv) Factorial design: This is used to study the effects of varying more
than one factor. This is especially useful in social science research, where a
number of factors influence a particular phenomenon. Factorial designs
are of two types: 1. Simple factorial designs, and 2. Complex factorial
designs.
Experimental variable
Here there are two treatments or experimental conditions and two control
conditions. The sample is divided into four cells. Each of these has one
treatment condition. Subjects are assigned at random. The means for different
rows and columns can be obtained. The column means represent the main
effects of treatments. The row means are the main effects for level without
regard to treatment. So, the main effects of treatments as well as levels can
be studied by this design. Further, the interaction between treatments and
levels can also be studied. This enables one to see whether the treatments
and levels are independent of each other or not.
Complex factorial design: A complex factorial design has more than two
factors at a time. It is a design with three or more independent
variables simultaneously. In the following design, there are three factors.
The experimental variable has two treatments, and there are two control variables,
each with two levels. Such a design is a 2 × 2 × 2 complex factorial design. There
are a total of eight cells. The design is as given below:
                                    Experimental Variable
                          Treatment A                   Treatment B
                     Control        Control        Control        Control
                     Variable 2     Variable 2     Variable 2     Variable 2
                     Level I        Level II       Level I        Level II
Control    Level I   Cell 1         Cell 3         Cell 5         Cell 7
Variable 1
           Level II  Cell 2         Cell 4         Cell 6         Cell 8
Source: Kothari, C.R., Research Methodology: Methods and Techniques, 1995.
                           Experimental Variable
                       Treatment A      Treatment B
Control     Level I    Cells 1, 3       Cells 5, 7
Variable 1
            Level II   Cells 2, 4       Cells 6, 8
[Figure: classification of experimental designs into between-groups design and within-groups design, with two-randomized-groups design and more-than-two-randomized-groups design]
Based on the number of groups, the two designs are: 1. Between groups and 2.
Within groups. In psychological and educational studies the 'between groups design'
is used often. 'Between groups designs' are divided into the following three types:
a. Randomized groups design
b. Matched-groups design
c. Factorial design
Between groups designs
Randomized groups design: It is one in which subjects are assigned randomly
to different groups meant for the different conditions or values of the independent
variable. The assumption here is that the random assignment makes these groups
statistically equivalent. This means the variations in the dependent variable become
easily identifiable. When the subjects are randomly assigned to only two groups,
the design is called a 'two-randomized-groups design'. And when the subjects are
assigned to more than two groups, it is called a 'multi-group design'.
Matched-groups design: It is another between groups design in which
the subjects are matched on mean, standard deviation, pairs, etc.
Factorial design: In this design two or more independent variables are
studied in various possible combinations. Here their independent and interactive
effects on the dependent variable can also be studied.
There are two primary ways through which unbiased groups or random
groups of subjects can be formed: 1. Captive assignment, and 2. Sequential
assignment.
Captive assignment: Here all the subjects are individually known and for
the duration of the experiment they are made captive, so that they are randomly
assigned to the different conditions or groups; hence the term captive assignment.
The random procedure is followed. The only pre-requisite is that the N of the
groups be equal.
Sequential assignment: The technique of sequential assignment is one
where the experimenter does not know the subjects in advance. As the experiment
progresses, the experimenter may use three subjects on day one, six on day two
and so on. So the experiment follows a pre-arranged schedule or sequence.
The sequential assignment of subjects can be done by complete randomization or
block randomization. Any biases that could arise in the sequence can be overcome
by block randomization. The concept of block randomization is that each condition of
the independent variable occurs once in each successive block of trials, but the order
of conditions within a block is random and different from that of every other block.
All conditions must occur an equal number of times for the block randomization to be
effective.
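Block randomization can be sketched as follows. This is an illustrative implementation only: the function name and the three condition labels are assumptions, and a plain shuffle does not by itself guarantee that every block has an order different from every other block.

```python
import random
from typing import Optional, Sequence


def block_randomize(conditions: Sequence[str], n_blocks: int,
                    seed: Optional[int] = None) -> list[str]:
    """Build a schedule in which each block contains every condition once,
    in an independently shuffled order."""
    rng = random.Random(seed)
    schedule: list[str] = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)  # random order within this block
        schedule.extend(block)
    return schedule


# Hypothetical conditions A, B, C; subjects are assigned in order of arrival.
schedule = block_randomize(["A", "B", "C"], n_blocks=4, seed=1)
print(schedule)
```

Each arriving subject takes the next entry of the schedule, so after every complete block all conditions have been used equally often.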
Social science researchers have to establish which of the several independent
variables influences the dependent variable most, and also the type of relationship
that exists between the most influential of the independent variables and the dependent
variable. A two-group randomized design satisfies the first need, but not the second
one. Here two independent variables are involved, so it is difficult to ascertain the
most significant independent variable on the dependent variable. It is important to
study more than two values of the independent variable for this purpose. This is
the 'more-than-two-randomized-groups design'.
Two-group randomized design: The subjects are assigned randomly to
the two groups. The independent and dependent variables have to be defined
clearly. The two values of the independent variable have to be identified. The
values are conditions or treatments of the experiment. The objective of the study is
to see whether the two conditions affect the dependent variable.
The population has to be specified and the sample drawn. The subjects are
then randomly assigned to the two groups. (The method of random selection is to be
followed.)
Here, two groups of equal N are formed. A coin toss could decide which is the
experimental group and which is the control group. The experimental group
receives the experimental treatment, while the control group does not receive it.
After the experiment, the scores obtained by the subjects in both the groups on
the dependent variable are recorded. Usually the t test or the non-parametric
Mann-Whitney U test is used in a two-group randomized design. If the two groups
differ, then the independent variable is seen to affect the dependent variable. So,
the independent variable manipulation is seen to influence the resultant dependent
variable.
Central to the two-group randomized design is the random assignment of
subjects into the groups, so that no systematic relationship emerges between the
characteristics of the subjects and the particular group to which they are randomly
assigned. To achieve this, Underwood (1966) suggests two ways in
which random groups can be formed:
i. Captive assignment
ii. Sequential assignment
More-than-two-randomized-groups design: This is also called a multi-
groups design. Here, there are three or more values or conditions of the independent
variable. So, three or more groups of subjects participate. All the subjects are
randomly assigned to the three or more groups. The process of captive assignment
or sequential assignment using the random procedure can be used here. For example,
to study the effect of hunger drive levels on responses, three experimental groups with
three conditions of deprivation (mild, moderate and extreme deprivation of food)
can be chosen. If the two-groups design was chosen and the conditions were mild
and moderate deprivation and the number of correct responses (bar presses) was
used as the dependent variable, perhaps the two conditions, mild and moderate,
would not yield a significant difference. The use of the third group (extreme
deprivation) could show a clear effect on the dependent variable. This result would
have been missed in the two-group experiment.
Methods of matching: (a) Matching by pairs; and (b) Matching by mean and
standard deviation.
(a) Matching by pairs: On the basis of scores obtained, the subjects are
matched and paired. A subject who has a score of 80 is paired with another of a
similar score on a test of memory. In this way, pairs are formed. Each pair can be
seen to be a block. Another block would have pairs with scores of 90 and so on.
Here subjects with deviant scores on the matching variable would get eliminated.
(b) Matching by mean and standard deviation: Here the measures of
central tendency and variability in the distribution of scores on the matching
variable are used.
Three matching methods are employed:
(i) Random blocks method
(ii) Method of counterbalancing order and
(iii) Block-repetition method.
(i) Random blocks method: In this method, the blocks are first created. Then,
with the number of subjects in each block being kept the same, as determined
by the values of the independent variable, the subjects from each block are
randomly assigned to the different groups as per conditions or values of the
independent variable.
(ii) Method of counterbalancing order: The counterbalancing method is used to
avoid confounding among variables. Consider an experiment where the
subjects are tested on both an auditory reaction time task (in which
the subjects respond to an auditory stimulus) and a visual reaction time task
(in which subjects respond to a visual stimulus). If each subject is tested
first on the auditory reaction time task and second on the visual reaction
time task, then the type of task and the order of presentation would be
confounded. If visual reaction time were lower, it would be difficult to
know if the reaction time to a visual stimulus is 'really' faster than to an
auditory stimulus or if the subjects learned something while performing the
auditory task that improved their performance on the visual task. This is the
confounding effect on the reaction time of the subject. The experiment
could be designed better using the counterbalancing procedure. Here half
the subjects would be given the auditory task first and the other half the
visual task first. In this manner, the order effects would be neutralized. So the
design counterbalances the sequence of task presentation. This design enables one
to conclude that the effects on the reaction time obtained are pure and not
influenced by the order of presentation. That way, there would be no confounding
of order of presentation and task, as the order of presentation and task would be
'counterbalanced.'
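Counterbalancing two tasks can be sketched as a random split of the subjects into two halves that receive opposite task orders. The function name and subject labels are assumptions; the task names follow the example in the text.

```python
import random
from typing import Hashable, Iterable, Optional


def counterbalance(subjects: Iterable[Hashable],
                   seed: Optional[int] = None) -> dict:
    """Randomly give half the subjects auditory-then-visual
    and the other half visual-then-auditory."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    orders = {}
    for subject in shuffled[:half]:
        orders[subject] = ("auditory", "visual")
    for subject in shuffled[half:]:
        orders[subject] = ("visual", "auditory")
    return orders


# Ten hypothetical subjects, identified by number.
orders = counterbalance(range(10), seed=0)
auditory_first = sum(1 for order in orders.values() if order[0] == "auditory")
print(auditory_first, len(orders) - auditory_first)
```

With an even number of subjects each order is used equally often, so any practice effect from the first task is spread evenly over both tasks.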
(iii) Block-repetition method: In this method, the block is first created and
then successively repeated, with the same natural sequence of conditions
repeated for each block. For example, ABC is repeated for each block. This
means the first subject would be assigned to group A, the second to group B
and the third to group C.
The larger mean differences occur in the random blocks method, while the least
occur in the random blocks method.
Factorial design: When the researcher needs to manipulate two or
more independent variables simultaneously, the most suitable design is the factorial
design. The benefit of this is that their independent as well as their interactive
effects on the dependent variable can be studied. A factorial design has three main
features.
1. Two or more independent variables are manipulated in all possible
combinations.
2. Different sub-groups or subjects can serve every possible combination
of the independent variables; an equal number of subjects in all groups is
preferred, though this is not necessary.
3. Independent as well as interactive effects can be studied, e.g., noise and
illumination on the rate of learning. Noise and illumination are two
independent variables. Further, noise has two levels, high and low, and
          Group I (n1 = 10)   Group II (n2 = 10)   Group III (n3 = 10)   Group IV (n4 = 10)
          High noise and      Low noise and        High noise and        Low noise and
          high illumination   high illumination    low illumination      low illumination
               15                  10                   16                    10
               14                  12                   18                     9
               20                  10                   22                     8
               22                   9                   25                     7
               16                   8                   26                     6
               18                  12                   20                    10
               20                  11                   20                    11
               21                  10                   18                    12
               18                  10                   17                    10
               17                  10                   16                    10
ΣX   =        181                 102                  198                    93
ΣX²  =       3339                1054                 4034                   895
Mean =       18.1                10.2                 19.8                   9.3
Source: Singh, A.K., Tests, Measurement and Research Methods in Behavioural Sciences,
2008.
                             Noise (A)
                     High (A1)        Low (A2)         Row mean
Illumination (B)
High (B1)            ΣX = 181         ΣX = 102         14.15
                     Mean = 18.1      Mean = 10.2
                     n = 10           n = 10
Low (B2)             ΣX = 198         ΣX = 93          14.55
                     Mean = 19.8      Mean = 9.3
                     n = 10           n = 10
Column mean          18.95            9.75
ΣX = Total score on the dependent variable
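Illustration of the details of the calculation of ANOVA in a 2 × 2 factorial research
design (data from the table above)
Step 2: Total SS $= (\Sigma X_1^2 + \Sigma X_2^2 + \Sigma X_3^2 + \Sigma X_4^2) - C = (3339 + 1054 + 4034 + 895) - 8236.9 = 9322 - 8236.9 = 1085.1$
SS for A (noise) $= \dfrac{(\Sigma X_1 + \Sigma X_3)^2}{n_1 + n_3} + \dfrac{(\Sigma X_2 + \Sigma X_4)^2}{n_2 + n_4} - C = \dfrac{(181 + 198)^2}{10 + 10} + \dfrac{(102 + 93)^2}{10 + 10} - 8236.9 = 7182.05 + 1901.25 - 8236.9 = 846.4$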
The above table shows the number of trials taken by each subject in learning a
list of 15 consonant syllables to the criterion of one perfect recitation.
This data is rearranged in the next table so that the means of the four groups are
placed in their appropriate cells.
The first step is to compute the total sum of squares (SS) and then
partition it into the among-groups SS (Step 3 of the table) and the within-groups SS (Step 4 of the table).
Before this, it is important to compute the correction value (Step 1 of the table).
The among-groups SS indicates whether the groups differ or not. The purpose of
this study was to find out whether or not variation in each independent variable
affects the dependent variable score and whether there was any significant
interaction.
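$F$ for A $= \dfrac{MS \text{ for A}}{MS \text{ within}}$
$F$ for B $= \dfrac{MS \text{ for B}}{MS \text{ within}}$
$F$ for A × B $= \dfrac{MS \text{ for interaction}}{MS \text{ within}}$
Source: Singh, A.K., Tests, Measurement and Research Methods in Behavioural Sciences,
2008.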
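The sums of squares and F ratios for the 2 × 2 example can be checked with a short Python sketch. The scores are those given in the data table; the function name `factorial_anova_2x2`, the dictionary layout and the labelling of noise as factor A and illumination as factor B are assumptions made for this illustration. It uses the usual computational formulas (correction term, total, among-groups and within-groups SS), not any particular statistics package.

```python
# Scores (trials to criterion) for the four groups of the 2 x 2 design.
# Keys are (level of A = noise, level of B = illumination).
groups = {
    ("high", "high"): [15, 14, 20, 22, 16, 18, 20, 21, 18, 17],  # Group I
    ("low", "high"): [10, 12, 10, 9, 8, 12, 11, 10, 10, 10],     # Group II
    ("high", "low"): [16, 18, 22, 25, 26, 20, 20, 18, 17, 16],   # Group III
    ("low", "low"): [10, 9, 8, 7, 6, 10, 11, 12, 10, 10],        # Group IV
}


def factorial_anova_2x2(groups):
    """Return sums of squares and F ratios for a 2 x 2 between-groups design."""
    scores = [x for cell in groups.values() for x in cell]
    n_total = len(scores)
    correction = sum(scores) ** 2 / n_total          # C = (sum of X)^2 / N
    ss_total = sum(x * x for x in scores) - correction
    ss_among = sum(sum(c) ** 2 / len(c) for c in groups.values()) - correction
    ss_within = ss_total - ss_among

    def ss_factor(index):
        # Pool the cells that share a level of the chosen factor.
        totals, counts = {}, {}
        for key, cell in groups.items():
            level = key[index]
            totals[level] = totals.get(level, 0) + sum(cell)
            counts[level] = counts.get(level, 0) + len(cell)
        return sum(totals[k] ** 2 / counts[k] for k in totals) - correction

    ss_a = ss_factor(0)
    ss_b = ss_factor(1)
    ss_ab = ss_among - ss_a - ss_b                   # interaction SS
    df_within = n_total - len(groups)
    ms_within = ss_within / df_within
    # Each effect has 1 degree of freedom in a 2 x 2 design, so MS = SS.
    return {
        "correction": correction, "ss_total": ss_total,
        "ss_among": ss_among, "ss_within": ss_within,
        "ss_a": ss_a, "ss_b": ss_b, "ss_ab": ss_ab,
        "df_within": df_within,
        "f_a": ss_a / ms_within, "f_b": ss_b / ms_within,
        "f_ab": ss_ab / ms_within,
    }


result = factorial_anova_2x2(groups)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Whether an F ratio is significant still has to be judged against the critical value of F for 1 and 36 degrees of freedom at the chosen level of significance.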
O1   X   O2
O3       O4
This shows that random assignment of subjects to the experimental and control
conditions has not been possible, so equivalence is absent. The
intact groups are compared on the pre-tests O1 and O3. The statistical analysis
involves comparing the mean gain score of the treatment group (O2 − O1) to that
of the non-treatment group (O4 − O3). This non-equivalent group design cannot
be used when the intact groups are dissimilar.
- post-test
O1   X   O2
O3   X   O4
If O2 exceeds O1 and O4 exceeds O3, we can infer an effect of the
treatment X. Since the two groups are separate and the independent variable is
administered, the conclusion gains some strength.
Patched-up design: Here the experimenter starts with an inadequate design
and adds more features as the study proceeds.
Group A        X   O1
Group B   O2   X   O3
This is used to overcome the weakness of a given design. The comparisons
made are between O3 and O2 and between O2 and O1. The groups are tested in sequence
and then compared.
Age (years):   5    6    7    8    9    10
               x    x    x    x    x    x
               x    x    x    x    x    x
               x    x    x    x    x    x
               x    x    x    x    x    x
Mean:          X̄    X̄    X̄    X̄    X̄    X̄
x represents the scores and X̄ represents the mean of the different age groups.
[Figure: cohort design. Rows show subjects from three generations (1990, 1995 and 2000); columns show repeated measures at successive ages (5 to 10); the row margin gives the generation main effect and the column margin gives the age main effect.]
Source: Singh, A.K., Tests, Measurement and Research Methods in Behavioural Sciences,
2008.
The columns represent the repeated measures over age. The rows denote
three respective generations. Age effects and generation effects can both be
ascertained.
Ex post facto design: Here the experimenter does not introduce a
treatment, but evaluates a naturally present treatment, or one that has already
occurred. The dependent variable is related to the conditions that already exist.
Two common types of ex post facto designs are:
1. Correlational design
2. Criterion-group design
dependent variable that cannot be measured, but can be classified into two or
more groups on the basis of an attribute. The object of this analysis is to be able to
predict that a particular entity belongs to a particular group, based on several
predictor variables.
Multiple ANOVA: This is an extension of the two-way ANOVA.
Canonical analysis: This is a useful method when both measurable
and non-measurable variables are present. This method can help in simultaneously
predicting a set of dependent variables from their joint covariance with a set of
independent variables.
Inferential analysis: This is concerned with various tests of significance
for hypothesis testing. It can also help in the estimation of population values. Conclusions
are based on this type of analysis.
CHECK YOUR PROGRESS
5. What is matched groups design also known as?
6. Name the different types of matching.
7. When is a factorial design used?
8. Define true experimental designs.
9. What is a correlational design?
10. What is the cohort design?
Qualitative research is:
his or her viewpoint(s) and discourses
practices with a twofold function: first, to establish an order which will enable
individuals to orient themselves properly in their material and social world and to
master it; and secondly, to enable communication to take place among the members
of a community by providing them with a code for social exchange and a code for
naming and classifying unambiguously the various aspects of their world and their
individual and group history."
Fig. 4.8 Realities
Strictly speaking there are no such things as facts, pure and simple. All facts
are from the outset selected from a universal context by the activities of our mind.
They are, therefore, always interpreted facts, either facts looked at as detached
from their context by an artificial abstraction or facts considered in their particular
setting. In either case, they carry their interpretational inner and outer horizons.
For Goodman (1978), the world is socially constructed through different
forms of knowledge, from everyday knowledge to science and art, as ways of
making the world. Social research is an analysis of such ways of world making
and of the constructive efforts of the participants in everyday life. A central idea in
this context is the distinction between first degree and second degree
constructions: a first degree construction is the construct made by an actor, while
a construct of the constructs made by the actor in the social scene is a second degree
construction. The exploration of the general principles according to which man organizes his
experiences in daily life, and those of the social world, is the first task of the
methodology of social sciences.
So, there could be multiple social realities. Social science research encounters
the world it wishes to study only in the versions constructed by its subjects.
So, there are subjective constructions by the participants and subjective
constructions by the researchers. Knowledge of the world is not just found in the
world, rather it is built into it. Worlds are made from other worlds. A big part of
research involves reconstructing life stories or biographies in interviews.
4.4.5 Theories in Qualitative Research
To begin with, qualitative research is circular, not linear as in quantitative research.
In this type of research, theories undergo revision, evaluation, construction and
reconstruction. They are versions of perspectives through which the world is seen.
A researcher can enter the field of study as a stranger, a visitor or an insider.
The best role to adopt is that of an insider. The set of realities presented
would be most similar in the role of an insider, which is ideal for qualitative research.
The sample is also defined gradually. Selections are made on the basis of the groups
to be compared, or may focus on specific persons. The sample is chosen on the
basis of the new insights it offers for developing a theory. The selection of groups or persons for the
sample stops when saturation is reached, i.e., when nothing new is likely to emerge
thereafter.
Here knowledge and experiences are presented. This is called a generative narrative.
Its purpose is to obtain or elicit answers on a theme of study.
All the above methods are used for collecting verbal data. The method
should be chosen on the basis of its appropriateness.
4.4.9.1 Observation
Observations can be of the following types.
• Covert vs. overt observation
• Non-participant vs. participant observation
• Systematic vs. unsystematic observation
• Natural vs. artificial situation observation
• Self-observation vs. observing others
The selection of a setting, i.e. where and when the interesting processes and
persons can be observed;
The definition of what is to be documented in the observation and in every
case;
Focused observations which concentrate more and more on aspects that
are relevant to the research question;
4.4.10 Ethnography
the larger unit
Transcription
When language analysis is involved, the focus in transcription should be to obtain
the maximum exactness in classifying the statements, pauses, hesitations, etc.
Triangulation
This is a term used for combining methods in qualitative research. Four types
of triangulation are suggested:
• Data triangulation: It involves using different data sources in rating persons,
places and situations.
• Investigator triangulation: Here different interviewers or observers are used
with a view to minimizing errors and biases.
• Theory triangulation: It involves approaching data with multiple perspectives
and hypotheses in mind. Various types of orientations are placed side by
side to see their usefulness for producing knowledge.
• Methodological triangulation: It involves combining different methods, such
as combining a questionnaire with an interview or using different sub-scales for
measuring a phenomenon.
Triangulation is seen as a concept for validating results obtained with individual
methods. The combined methods are thought to enrich and complete knowledge and to lessen the
limitations of individual methods used singly. These are the ways in which social realities are
sought to be studied systematically. Triangulation is seen as a means to increase
the scope, depth and consistency of knowledge through methodological means.
Analytic induction: Here the attempt is to understand and explain the
exception that deviates from a hypothesis, in a systematic way, in order to interpret results. It
is a case of looking at negative data to be able to substantiate the general.
4. Writing the theory
This is a continuous growth process. Contrasting cases and ideal type analysis
are carried out so that pure cases can be tracked and the understanding of the
individual case be made more systematic.
[Figure labels: Construction; Communicative validation; Experience; Interpretation]
Computers are widely used in qualitative research for data analysis. Special
programs are available for analyzing data, combining qualitative and quantitative
research possibilities, and transforming one type of data into another (qualitative
to quantitative and vice versa). Triangulation of research can also be done with the
help of computers.
CHECK YOUR PROGRESS
4.5 SUMMARY
13. The theories of qualitative research are circular, not linear. Theories undergo
revisions with new versions of constructions and reconstructions.
16. The steps in documenting observational data in qualitative research are: (i)
recording the data, (ii) editing the data, and (iii) constructing a new reality
from the data.
Short-Answer Questions
1. Give any two basic principles of experimental designs.
2. Give one difference between group and within-group designs.
ROLE OF COMPUTERS IN RESEARCH,
INTERPRETATION AND
REPORT WRITING
Structure
5.0 Introduction
5.1 Unit Objectives
5.2 Computer Applications
5.3 The Computer System
5.3.1 Important Characteristics
5.4 The Binary Number System
5.4.1 Decimal to Binary Conversion
5.4.2 Binary to Decimal Conversion
5.4.3 Computations in Binary System
5.5 Computers and Researchers
5.6 Interpretation and Report Writing
5.6.1 Meaning of Interpretation
5.6.2 The Need and Importance of Interpreting the Findings
5.6.3 Techniques of Interpretation
5.6.4 Precautions during Interpretation
5.6.5 Steps in Report Writing
5.6.6 The Layout of the Report
5.6.7 Types of Reports
5.6.8 Oral Presentation of Reports
5.7 Summary
5.8 Key Terms
5.9 Answers to 'Check Your Progress'
5.10 Questions and Exercises
5.11 Further Reading
5.0 INTRODUCTION
In this unit you will learn how computers have revolutionized research work. The
high speed electronic digital computer has had a major impact on every phase of
behavioural research. Problem-solving and lengthy statistical and mathematical
calculations done manually are things of the past. What took days, weeks and
even months earlier is now done in a matter of minutes. Research studies and
calculations which looked impossible earlier can now be tackled with the aid of
computers, in minutes or hours. Computers now dominate almost every
walk of life, which makes them not only important but indispensable. Here, we will
look at the applications of the computer, some of its important characteristics, the binary
number system and the role of computers in research.
Today, computers are used in all possible fields and for various purposes. Every
sector, be it education, commerce, management, industry or communications, relies
on computers for its smooth functioning. Even if an individual is not directly involved
with the functioning of computers, his or her everyday life and work are affected
by them.
Computers are used not only in numeric applications, like carrying out
complex research and data analysis, but also for non-numeric uses like assisting in
teaching and learning processes, providing a large databank of information, handling
payrolls, record keeping, financial forecasting, making clinical diagnoses, and providing
entertainment like playing games, watching movies and listening to music, besides
viewing sports.
Computers are used in applications ranging from running a firm to monitoring
environmental effects. Computers have made the research and development
of various diagnosis and prevention methods easier and more cost-effective. For example,
computers can help forecast when and where an earthquake
or tsunami is likely to occur, model the effects of drugs on the human system, simulate dissections
for study purposes and create three-dimensional models of buildings or airplanes.
They also support a host of other applications in media that have brought about a revolution in
communications.
Fig. 5.1 Computer Architecture
A computer program is written into the internal storage and then transmitted
to the control unit. The data is then available for processing by the arithmetic logic
unit, which conveys the results back to the internal storage, and thus the
output is obtained from the internal storage of the CPU.
The four primary components of a computer system are:
Input: Input devices send data and instructions to the CPU. They translate
the data into binary language, which the CPU understands.
We all know that computers can calculate complex equations and perform
complex mathematics at lightning speed. Although a computer only
processes 1s and 0s, there comes a point when the 1s and 0s have to be converted
into the usual decimal numbers that we are all familiar with.
Let us consider the number 1234:
Thousands   Hundreds   Tens   Ones
    1           2        3      4
Which means,
1234 = 1 × 1000 + 2 × 100 + 3 × 10 + 4 × 1
The binary system operates with base 2, or radix 2 (bi is the Latin prefix for
two); that is, it uses 0s and 1s to represent numbers. A simple comparison between
decimal and binary is given in the following table:
Solution:
Number   Quotient (÷ 2)   Remainder
  22          11              0
  11           5              1
   5           2              1
   2           1              0
   1           0              1
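The repeated-division procedure tabulated above can be sketched in Python as follows. The function name `decimal_to_binary` is an assumption made for this illustration; the logic simply divides by 2 until the quotient is 0 and reads the remainders from the last to the first.

```python
def decimal_to_binary(n: int) -> str:
    """Convert a non-negative decimal integer to a binary string
    by repeated division by 2."""
    if n < 0:
        raise ValueError("only non-negative integers are handled")
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)   # quotient and remainder in one step
        remainders.append(str(remainder))
    # The first remainder is the least significant bit, so reverse the list.
    return "".join(reversed(remainders))


print(decimal_to_binary(22))  # prints 10110
```

Reading the remainder column of the table from bottom to top gives the same bits: 1, 0, 1, 1, 0.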
Solution:
1. Doubling the leftmost bit (1), we get 2
2. Adding the bit on its right (1), we get 2 + 1 = 3
3. Doubling the number obtained (3), we get 6
4. Adding to it the next bit (0), we get 6 + 0 = 6
5. Doubling again, we get 12
6. Finally, adding the last bit (1), we get 12 + 1 = 13
The decimal equivalent of binary 1101 is 13.
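The doubling procedure in the solution above can be written as a short Python sketch. The function name `binary_to_decimal` is an assumption made for this example; at each step the running total is doubled and the next bit is added.

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a binary string to a decimal integer by the
    double-and-add method, working from the leftmost bit."""
    if not bits or any(bit not in "01" for bit in bits):
        raise ValueError("input must be a non-empty string of 0s and 1s")
    total = 0
    for bit in bits:
        total = total * 2 + int(bit)  # double the total, then add the next bit
    return total


print(binary_to_decimal("1101"))  # prints 13
```

Starting the total at 0 means the first pass just takes the leftmost bit, after which the steps match the numbered solution exactly.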
5.4.3 Computations in Binary System
Binary addition: Binary addition is just like decimal addition. It follows these rules:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
1 + 1 = 10
The sum of 1 and 1 is written as '10' (0 as sum and 1 as carry), which is the
equivalent of the decimal digit '2'.
  Binary      Decimal equivalent
   1010            (10)
 +  101          + (5)
 = 1111             15
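The addition rules and the worked example above can be reproduced with a small Python sketch that adds two binary strings column by column, carrying 1 whenever a column sums to 2 or more. The function name `add_binary` is an assumption made for this illustration.

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary strings bit by bit, from right to left, with carry."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # pad the shorter number with 0s
    carry = 0
    result = []
    for bit_a, bit_b in zip(reversed(a), reversed(b)):
        column = int(bit_a) + int(bit_b) + carry
        result.append(str(column % 2))       # sum bit for this column
        carry = column // 2                  # carry into the next column
    if carry:
        result.append("1")
    return "".join(reversed(result))


print(add_binary("1010", "101"))  # prints 1111
print(add_binary("1", "1"))       # prints 10
```

The second call shows the rule 1 + 1 = 10: the column yields a sum bit of 0 and a carry of 1.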
3. The whole number part of the result gives the second 1 or 0 and so on.
The computer has emerged as one of the most useful research tools in modern
times. It does a great variety of jobs with tremendous speed and efficiency.
Computer usage has become a subject of study in schools, and today the
computer is an indispensable part of any profession.
Computers have become highly useful tools in the research process, particularly
when the data is large and complex, involving complicated mathematics and statistics.
Researchers often deal with huge amounts of data, which need timely storage,
quick retrieval when required and processing using various techniques. Apart
from speeding up the research process, computers have also added quality to
it. Computers assist researchers throughout the different phases
of the research process.
Phases of research process:
1. Conceptual phase
2. Design and planning phase
3. Empirical phase
4. Analytic phase
5. Dissemination phase
6. Concluding phase
1. Role of computers in conceptual phase: The conceptual phase is when the
researcher formulates the research problem, reviews literature and formulates
the hypothesis.
Computers help in searching for the literature through the Internet from
databases stored in servers all over the world. The desired information can
then be downloaded and stored in the computer for future use. This saves
the time spent on visiting libraries and collecting the data by writing down
the relevant material.
A note of caution
It is wise to remember that a computer is just a tool and a resource. Computers
can only compute but cannot think. Computer-based analysis is not usually
economical in the case of small research projects. Certain important details which are
saved in the computer may get lost.
The use of computers enables the researcher to use trial and error processes
which involve a lot of calculations and repetitive work. Not only does it produce
the results rapidly, but different options are also made available to researchers.
These would otherwise take days or months.
Researchers must be familiar with data organization and coding, storing the
data in the computer, selection of appropriate statistical techniques, selection of
appropriate software packages and execution of the computer program.
Once the study has been completed, the analysis carried out and the conclusions
drawn, the report of the entire study and its findings has to be written. It is only
through the process of interpretation that the researcher can tell the scientific
community about the relationships that were studied and the theoretical concepts
underlying the findings.
A popular report is often simple and attractive. It does not use too many
mathematical or technical terms; instead it uses charts and diagrams, large fonts,
many subheadings and visuals which make it easy to read. It emphasizes practical
findings and their implications.
Along with the oral presentation, there should also be a written document
for record keeping, which can also be circulated before the presentation to acquaint
the audience with the material. It should be supplemented with visuals, slides, wall charts and
blackboards. A brief outline should also be given.
All these factors highlight the need for interpretation of the findings of research
studies.
(i) Banks and other financial institutions present their reports in the form of a
balance sheet. This is largely a statement of accounts for their customers
and shareholders.
(ii) Chemists present their reports in terms of formulae and other symbols to
explain the preparation, etc.
f. Appendix: All types of tests, questionnaires and other details that are
of interest to the public are to be given here.
Make the report simple and easy to read and understand, with an appealing
presentation.
5.7 SUMMARY
• An oral presentation is made for policy makers who wish to have easy and
quick access to results of the policy.
Short-Answer Questions
1. Define computer hardware and software.
2. Define the binary number system.
3. What is the central processing unit?