
Module 3 Report

This document discusses data management and statistics. It covers topics such as data gathering, measures of central tendency, and levels of measurement. Some key points include: - Descriptive statistics is used to describe and summarize data, while inferential statistics makes conclusions about a dataset. - Probability sampling methods give all population members an equal chance of being selected, while non-probability methods do not. - Common measures of central tendency are the mean, median, and mode. The mean is the average, the median is the middle value, and the mode is the most frequent value. - Data can be qualitative, involving categories, or quantitative, involving numbers/measures. It can also be nominal, ordinal, interval, or
MODULE 3
DATA MANAGEMENT
Let Us Pray
Dear Lord and Father of all,
Thank you for today.
Thank you for the ways in which you provide for us all.
For your protection and love we thank you.
Help us to focus our hearts and minds now on what we are about
to learn.
Inspire us by Your Holy Spirit as we listen and write.
Guide us by your eternal light as we discover more about the
world around us.
We ask all this in the name of Jesus.
Amen.
3.1: Data Gathering
3.2: Measures of Central Tendency
3.3: Measures of Dispersion
Data Management and Data Gathering
Data Management
WHAT IS DATA MANAGEMENT OR STATISTICS?
The science of collecting, organizing, presenting, analyzing
and interpreting numerical data.
TYPES OF STATISTICS
1. Descriptive Statistics
- is concerned with collecting, organizing, presenting, and analyzing numerical data. The statistician tries to describe or summarize a situation.
2. Inferential Statistics
- draws conclusions such as decisions, predictions, or generalizations about a population, based on the data set (sample).
- It implies that before carrying out an inference, appropriate and correct descriptive measures or methods are employed to bring out good results.
Data Collection
• Population, as used in statistics, refers to a set of people, objects, measurements, or happenings that belong to a defined group.
• Sample is a portion of the population. By studying a sample instead of the whole population, you save effort, time, and resources in conducting your study.
SAMPLING TECHNIQUES
Probability Sampling
1. Random Sampling
2. Systematic Sampling
3. Stratified Sampling
4. Cluster Sampling
Non-Probability Sampling
1. Convenience Sampling
2. Purposive Sampling
3. Quota Sampling
4. Snowball Sampling
PROBABILITY SAMPLING
• All members of the population have equal chances of being chosen as a part of the sample.
Random Sampling
• Members of your sample are selected through lottery.
Systematic Sampling
• Members of your population are written in a list with corresponding numbers, and the sample is selected systematically from that list (for example, every k-th member).
Stratified Sampling
• Members of your population are grouped. You can choose an equal number of respondents in each group, or a number in proportion to the number of elements in each group.
Cluster Sampling
• Members of your population are grouped. Selection is done by group: you choose all the respondents in your selected groups.
NON-PROBABILITY SAMPLING
• Not all members of the population have equal chances of being chosen as a part of the sample.
Convenience Sampling
• Samples are selected because of their immediate availability.
Purposive Sampling
• Samples are determined by the researcher based on the purpose of the study.
Quota Sampling
• Samples are selected to achieve the needed number of participants in the study.
Snowball Sampling
• Samples are selected based on the recommendation of other members in the sample.
Let's Try!
I. Identify which item in each pair is the Population (P) and which is the Sample (S).
1. NCR (P) / Manila (S)
2. Tablespoon of sugar (S) / Jar of sugar (P)
3. STEM students (S) / Academic Track students (P)
4. Juice in a pitcher (P) / Juice in a glass (S)
5. All manufactured cellular phones (P) / Model units of cellular phones (S)
Let's Try!
II. Identify the sampling technique used in each item.
1. The online reseller writes the names of all her loyal customers on a sheet of paper. Answer: Random
2. The coordinator selects 3 students in each grade level. Answer: Stratified
3. A radio program staff member answers every 50th caller. Answer: Systematic
4. The Local Government Unit chooses respondents only from barangays that are placed under hard lockdown. Answer: Cluster
5. Asking 100 customers who are leaving the mall. Answer: Convenience
6. Accepting blood donations from persons with AB- blood type and asking them if they can also refer friends whom they know with the same blood type. Answer: Snowball
7. Selecting Covid-19 survivors as respondents in a survey because the study deals with the development of a new coronavirus vaccine. Answer: Purposive
8. Posting an online survey and accepting only 300 responses. Answer: Quota
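The four probability sampling techniques can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 20 numbered students, the grade-level groups, the function names, and the fixed seed are all hypothetical and are not part of the module.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Lottery-style selection: every member has the same chance."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, k, start=0):
    """Take every k-th member of the numbered list, beginning at index `start`."""
    return population[start::k]

def stratified_sample(strata, n_per_group, seed=0):
    """Draw the same number of members at random from every group."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_group) for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=0):
    """Pick whole groups at random and keep all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical population: 20 numbered students in 4 grade levels of 5 students each.
students = list(range(1, 21))
by_grade = {f"Grade {g}": students[(g - 7) * 5:(g - 6) * 5] for g in range(7, 11)}

print(simple_random_sample(students, 5))
print(systematic_sample(students, 5))   # every 5th student: [1, 6, 11, 16]
print(stratified_sample(by_grade, 3))
print(cluster_sample(by_grade, 2))
```

Note the difference between the last two calls: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.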
METHODS OF DATA GATHERING
Direct Method
• Includes observation and interview, where you get the information firsthand.
• Example: The researcher sets a particular time and date to talk with the respondents.
Indirect Method
• Includes ways in which you can obtain the needed data without your actual presence.
• Example: The researcher checks the school records for the average grade of his respondents.
CLASSIFICATIONS OF DATA
Qualitative Data
• Categories that show classifications or subtitles.
• Examples: Gender, Marital Status, Grade Level, Senior High Track/Strand
Quantitative Data
• Numbers or values that represent counts or measures.
• Examples: Weight, number of siblings, hours spent in studying
• Types: Discrete Data, Continuous Data
LEVELS OF MEASUREMENT
Nominal - data that are categorical.
Examples: Gender, Nationality, Civil Status
Ordinal - data that are in ordered or ranked categories.
Examples: rating (good, better, best), ranking (first, second, third)
Interval - numerical data with equal intervals between values but no real zero.
Examples: Temperature (in degrees Celsius), because 0 degrees does not mean no temperature
Ratio - numerical data that have a real zero.
Examples: Weight, because 0 kilograms means no weight at all
Measures of Central Tendency
Mean
The mean is the sum of the item values divided by the number of items.
Example:
The grades in Statistics of 10 students are 82, 85, 79, 78, 89, 87, 88, 89, 75, and 77. What is their average grade?
$\bar{x} = \frac{\sum x}{N} = \frac{82 + 85 + 79 + 78 + 89 + 87 + 88 + 89 + 75 + 77}{10} = \frac{829}{10} = 82.9$
Median
The median is the value of the middle term when data are arranged in either ascending or descending order.
Example:
The grades in Statistics of 10 students are 82, 85, 79, 78, 89, 87, 88, 89, 75, and 77. What is the median?
75, 77, 78, 79, 82, 85, 87, 88, 89, 89
With 10 values, the median is the average of the 5th and 6th values:
$Md = \frac{82 + 85}{2} = 83.5$
Mode
The mode is the most frequently occurring value in a given set of data.
Example:
The grades in Statistics of 10 students are 82, 85, 79, 78, 89, 87, 88, 89, 75, and 77. What is the mode?
75, 77, 78, 79, 82, 85, 87, 88, 89, 89
The value 89 occurs twice and every other value occurs once, so the mode is 89.
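The three measures for the ten Statistics grades can be checked with Python's standard `statistics` module. This is a small sketch for verification, not part of the original slides; the variable name `grades` is chosen here for illustration.

```python
from statistics import mean, median, mode

# The ten Statistics grades from the example.
grades = [82, 85, 79, 78, 89, 87, 88, 89, 75, 77]

print(mean(grades))    # 82.9  (829 / 10)
print(median(grades))  # 83.5  (average of the 5th and 6th sorted values, 82 and 85)
print(mode(grades))    # 89    (the only value that occurs twice)
```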
 
Mean, Median and Mode
(Grouped Data, Decreasing Order)
Mean of Grouped Data
The "mean" is the "average" you're used to, where you add up all the numbers and then divide by the number of numbers.
The data represent the ages of 40 women when they each had a boyfriend. Construct a grouped frequency distribution with 5 classes.

18 20 21 20 20 21 20 17 19 20
13 18 22 26 20 19 22 15 18 27
16 23 24 17 25 24 16 20 26 15
21 17 23 16 21 17 26 16 23 19

Grouped Frequency Distribution
Class Limits    Frequency
25-27           5
22-24           7
19-21           14
16-18           11
13-15           3
Total           40
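Tallying the ages into classes can be sketched in Python as follows. The 40 ages in the list are the values that tally to the class frequencies used in the computations that follow (3, 11, 14, 7, 5). The function name `grouped_frequency` and its parameters are hypothetical. The class width of 3 is an assumption taken from the class limits (13-15, 16-18, and so on).

```python
from collections import Counter

ages = [
    18, 20, 21, 20, 20, 21, 20, 17, 19, 20,
    13, 18, 22, 26, 20, 19, 22, 15, 18, 27,
    16, 23, 24, 17, 25, 24, 16, 20, 26, 15,
    21, 17, 23, 16, 21, 17, 26, 16, 23, 19,
]

def grouped_frequency(data, lowest, width, n_classes):
    """Tally integer data into n_classes consecutive classes of equal width."""
    limits = [(lowest + j * width, lowest + (j + 1) * width - 1) for j in range(n_classes)]
    # Class index of each value: 0 for the lowest class, 1 for the next, and so on.
    counts = Counter((x - lowest) // width for x in data)
    return [(lo, hi, counts[j]) for j, (lo, hi) in enumerate(limits)]

table = grouped_frequency(ages, lowest=13, width=3, n_classes=5)
for lo, hi, f in reversed(table):        # decreasing order, as in the slides
    print(f"{lo}-{hi}: {f}")
print("Total:", sum(f for _, _, f in table))
```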
Mean
$\bar{x} = \frac{\sum fX}{\sum f}$
Where
f - frequency
X - class mark (midpoint)
fX - product of the frequency and the class mark
ΣfX - sum of the products of the frequency and the class mark
Σf - total frequency
$\bar{x}$ - sample mean

The class mark is the midpoint of the class limits, for example (25 + 27)/2 = 26.

Class limits    f     X     fX
25-27           5     26    130
22-24           7     23    161
19-21           14    20    280
16-18           11    17    187
13-15           3     14    42
Total           40          800

$\bar{x} = \frac{\sum fX}{\sum f} = \frac{800}{40} = 20$
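The grouped mean can be reproduced with a short Python sketch. The tuple layout `(lower limit, upper limit, frequency)` and the function name `grouped_mean` are illustrative choices, not something the slides prescribe.

```python
# (lower limit, upper limit, frequency) for each class of the distribution
classes = [(25, 27, 5), (22, 24, 7), (19, 21, 14), (16, 18, 11), (13, 15, 3)]

def grouped_mean(classes):
    """Sum of f * X divided by the sum of f, where X is the class mark."""
    total_f = sum(f for _, _, f in classes)
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / total_f

print(grouped_mean(classes))  # 20.0  (800 / 40)
```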
 
Median
$Md = lb_{mc} + \left[ \frac{\frac{\sum f}{2} - {<}cf}{f_{mc}} \right] i$
Where
$lb_{mc}$ - the lower boundary of the median class
Σf/2 - total frequency divided by 2
<cf - cumulative frequency of the class just below the median class
$f_{mc}$ - frequency of the median class
i - class width

The median class is the class with the smallest cumulative frequency greater than or equal to Σf/2.

Class limits    f     <cf
25-27           5     40
22-24           7     35
19-21           14    28
16-18           11    14
13-15           3     3
Total           40

Σf/2 = 40/2 = 20, so the median class is 19-21.
lb = 19 - 0.5 = 18.5
i = 16 - 13 = 3
$Md = 18.5 + \left[ \frac{20 - 14}{14} \right] 3 = 18.5 + \left[ \frac{6}{14} \right] 3 = 18.5 + 1.29 = 19.79$
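The median of the grouped data can be sketched in Python as follows. The function name `grouped_median` is hypothetical. The sketch assumes integer class limits, so that each lower boundary is the lower limit minus 0.5, and equal class widths.

```python
# (lower limit, upper limit, frequency) for each class of the distribution
classes = [(25, 27, 5), (22, 24, 7), (19, 21, 14), (16, 18, 11), (13, 15, 3)]

def grouped_median(classes):
    """Md = lb + [(sum(f)/2 - <cf) / f_mc] * i, scanning from the lowest class up."""
    ordered = sorted(classes)                      # lowest class first
    half = sum(f for _, _, f in ordered) / 2       # sum(f) / 2
    width = ordered[1][0] - ordered[0][0]          # class width i
    cf_below = 0                                   # <cf: cumulative frequency below the current class
    for lo, _, f in ordered:
        if cf_below + f >= half:                   # first class whose cumulative frequency reaches sum(f)/2
            return (lo - 0.5) + (half - cf_below) / f * width
        cf_below += f

print(round(grouped_median(classes), 2))  # 19.79
```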
Mode
$Mo = lb_{mo} + \left[ \frac{D_1}{D_1 + D_2} \right] i$
Where
$lb_{mo}$ - lower boundary of the modal class
$D_1$ - highest frequency minus the frequency of the next lower class
$D_2$ - highest frequency minus the frequency of the next upper class
i - class width

The modal class is the class with the highest frequency.

Class limits    f
25-27           5
22-24           7
19-21           14
16-18           11
13-15           3

The modal class is 19-21.
lb = 19 - 0.5 = 18.5
i = 16 - 13 = 3
$D_1 = 14 - 11 = 3$
$D_2 = 14 - 7 = 7$
$Mo = 18.5 + \left[ \frac{3}{3 + 7} \right] 3 = 18.5 + 0.9 = 19.4$
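The mode of the grouped data can be sketched in the same style. The function name `grouped_mode` is hypothetical. The sketch assumes integer class limits and a single modal class. Treating a missing neighbouring class as having frequency 0 is an assumption made here, since the slides do not cover that case.

```python
# (lower limit, upper limit, frequency) for each class of the distribution
classes = [(25, 27, 5), (22, 24, 7), (19, 21, 14), (16, 18, 11), (13, 15, 3)]

def grouped_mode(classes):
    """Mo = lb + [D1 / (D1 + D2)] * i for the class with the highest frequency."""
    ordered = sorted(classes)                       # lowest class first
    freqs = [f for _, _, f in ordered]
    width = ordered[1][0] - ordered[0][0]           # class width i
    m = freqs.index(max(freqs))                     # position of the modal class
    f_lower = freqs[m - 1] if m > 0 else 0          # assumption: 0 if there is no lower class
    f_upper = freqs[m + 1] if m + 1 < len(freqs) else 0
    d1 = freqs[m] - f_lower
    d2 = freqs[m] - f_upper
    return (ordered[m][0] - 0.5) + d1 / (d1 + d2) * width

print(round(grouped_mode(classes), 2))  # 19.4
```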
Mean of Ungrouped Data
- The mean is the most commonly used measure of central tendency. When we speak of the average, we usually refer to the mean.
$\bar{x} = \frac{\sum x}{N}$
Example:
Six friends in a biology class of 20 students receive test grades of 92, 84, 65, 76, 88, and 90. Find the mean of these test scores.
First get the sum of their scores, then divide by the number of scores:
$\bar{x} = \frac{92 + 84 + 65 + 76 + 88 + 90}{6} = \frac{495}{6} = 82.5$
Example:
The ages of five contestants in a Statistics Quiz Bee are the following: 18, 17, 18, 19, and 18. Find their average age.
$\bar{x} = \frac{18 + 17 + 18 + 19 + 18}{5} = \frac{90}{5} = 18$
Median of Ungrouped Data
It is the midpoint of the data array. Before finding the value, the data must be arranged in order, from least to greatest or vice versa. The median will either be a specific value or will fall between two values.
For an odd number of values N, the median is the value in position $\frac{N + 1}{2}$ of the ordered data. For an even N, it is the mean of the values in positions $\frac{N}{2}$ and $\frac{N}{2} + 1$.
Example:
Seven mothers were selected and given a blood pressure check. Their blood pressures were recorded below.
135, 121, 119, 116, 130, 121, 131
Find their median.
Solution: arrange the data in order.
116, 119, 121, 121, 130, 131, 135
The middle (4th) value is the median: Md = 121
Example:
Eight novels were randomly selected and the numbers of pages were recorded as follows:
415, 398, 402, 400, 420, 415, 407, 425
Find their median.
Solution: arrange the data in order.
398, 400, 402, 407, 415, 415, 420, 425
$Md = \frac{407 + 415}{2} = 411$
Mode of Ungrouped Data
It is the value that occurs most often in the data set: the number/value/observation in a data set which appears the greatest number of times.
Example:
Find the mode of the given data set:
15, 28, 25, 48, 22, 43, 39, 44, 43, 49, 34, 22, 33, 27, 25, 22, 30
Arrange the data set:
15, 22, 22, 22, 25, 25, 27, 28, 30, 33, 34, 39, 43, 43, 44, 48, 49
The mode is 22, which occurs three times.
Another Example:
The typing speeds per minute of ten stenographers are as follows:
121, 110, 120, 119, 112, 121, 118, 115, 107, 115
Arrange the data set: 107, 110, 112, 115, 115, 118, 119, 120, 121, 121
The data set has two modes, 115 and 121. The data set is said to be bimodal.
Example:
Find the mode of the given data:
2, 5, 8, 9, 11, 4, 23
There is no mode.
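The three cases above (one mode, two modes, no mode) can be handled by one small Python function. This is a sketch: the name `modes` is hypothetical. Returning an empty list when every value occurs only once follows the convention used in these examples.

```python
from collections import Counter

def modes(data):
    """Return every value with the highest count, or [] if no value repeats."""
    counts = Counter(data)
    top = max(counts.values())
    if top == 1:
        return []                                  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([15, 28, 25, 48, 22, 43, 39, 44, 43, 49, 34, 22, 33, 27, 25, 22, 30]))  # [22]
print(modes([121, 110, 120, 119, 112, 121, 118, 115, 107, 115]))                    # [115, 121]
print(modes([2, 5, 8, 9, 11, 4, 23]))                                               # []
```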
Measures of Dispersion
Definition
• Measures of dispersion are descriptive statistics that describe how similar a set of scores are to each other.
• The more similar the scores are to each other, the lower the measure of dispersion will be.
• The less similar the scores are to each other, the higher the measure of dispersion will be.
• In general, the more spread out a distribution is, the larger the measure of dispersion will be.
Measures of Dispersion
• Which of the distributions of scores has the larger dispersion?
[Figure: two distributions of scores, one above the other]
• The upper distribution has more dispersion because the scores are more spread out.
• That is, they are less similar to each other.
Measures of Dispersion
These are the measures of dispersion:
• The range
• Interquartile range
• Variance/standard deviation
• Coefficient of variation
The Range
• The range is defined as the difference between the largest score in the set of data and the smallest score in the set of data, $X_L - X_S$.
• What is the range of the following data:
4 8 1 6 6 2 9 3 6 9
• The largest score ($X_L$) is 9; the smallest score ($X_S$) is 1.
• The range is $X_L - X_S = 9 - 1 = 8$.
When to Use the Range
The range is used when
• you have ordinal data.
The range is rarely used in scientific work as it is fairly insensitive:
• It depends on only two scores in the set of data, $X_L$ and $X_S$.
• Two very different sets of data can have the same range:
1 1 1 1 9 vs 1 3 5 7 9
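A brief Python sketch makes the insensitivity of the range concrete. The function name `value_range` is hypothetical. The population standard deviation from the standard `statistics` module is used only as a second measure for comparison.

```python
from statistics import pstdev

def value_range(scores):
    """Largest score minus smallest score."""
    return max(scores) - min(scores)

print(value_range([4, 8, 1, 6, 6, 2, 9, 3, 6, 9]))   # 8

a = [1, 1, 1, 1, 9]
b = [1, 3, 5, 7, 9]
print(value_range(a), value_range(b))   # 8 8: the range cannot tell the two sets apart
print(pstdev(a), pstdev(b))             # the standard deviations differ
```

Both sets have a range of 8 even though their scores are distributed very differently, which a measure that uses every score does detect.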
Variance
• Variance is defined as the average of the squared deviations:
$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$
where X is a score, µ is the mean, and N is the number of scores.
What does the Variance Formula mean?
• It says to subtract the mean from each of the scores.
• This difference is called a deviate or a deviation score.
• The deviate tells us how far a given score is from a typical, or average, score.
• Thus, the deviate is a measure of dispersion for a given score.
• One of the definitions of the mean was that it always made the sum of the scores minus the mean equal to 0.
• Thus, the average of the deviates must be 0 since the sum of the deviates must equal 0.
• To avoid this problem, statisticians square the deviate scores prior to averaging them.
• Squaring the deviate scores makes all the squared scores positive.
$\sigma^2 = \frac{\sum (X - \mu)^2}{N} \neq \frac{\sum (X - \mu)}{N} = 0$
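The reasoning above can be checked numerically. This Python sketch reuses the ten scores from the range example as an illustration. The variable names are chosen here and are not part of the slides.

```python
scores = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]

mu = sum(scores) / len(scores)                 # mean of the scores
deviates = [x - mu for x in scores]            # score minus mean

# The deviates cancel out, so their plain average says nothing about spread.
sum_of_deviates = sum(deviates)

# Squaring makes every term non-negative before averaging.
population_variance = sum(d ** 2 for d in deviates) / len(scores)

print(round(sum_of_deviates, 10))      # 0.0 (up to floating-point rounding)
print(round(population_variance, 2))   # 7.24
```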
Standard Deviation
• When the deviate scores are squared in variance, their unit of measure is squared as well.
• E.g. if people's weights are measured in pounds, then the variance of the weights would be expressed in pounds² (or squared pounds).
• Since squared units of measure are often difficult to deal with, the square root of variance is often used instead.
• The standard deviation is the square root of variance.
Variance of a Population
• When calculating variance, it is often easier to use a computation formula which is algebraically equivalent to the definitional formula:
$\sigma^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N} = \frac{\sum (X - \mu)^2}{N}$
$\sigma^2$ is the population variance, X is a score, µ is the population mean, and N is the number of scores.
Variance of a Sample
• Because the sample mean is not a perfect estimate of the population mean, the formula for the variance of a sample is slightly different from the formula for the variance of a population:
$s^2 = \frac{\sum (X - \bar{x})^2}{N - 1}$
$s^2$ is the sample variance, X is a score, $\bar{x}$ is the sample mean, and N is the number of scores.
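The difference between dividing by N and by N - 1 can be seen in a short Python sketch. The function `pvariance_computational` is a hypothetical helper that implements the standard computational form, (ΣX² - (ΣX)²/N) / N. `pvariance` and `variance` come from Python's standard `statistics` module. The ten scores are reused from the range example.

```python
from statistics import pvariance, variance

def pvariance_computational(xs):
    """Population variance via the computational form: (sum X^2 - (sum X)^2 / N) / N."""
    n = len(xs)
    return (sum(x * x for x in xs) - sum(xs) ** 2 / n) / n

scores = [4, 8, 1, 6, 6, 2, 9, 3, 6, 9]

print(round(pvariance(scores), 4))                 # 7.24    divides by N
print(round(pvariance_computational(scores), 4))   # 7.24    same value, different arithmetic
print(round(variance(scores), 4))                  # 8.0444  divides by N - 1
```

The sample variance is always slightly larger than the population variance computed from the same scores, and the gap shrinks as N grows.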
Coefficient of Variation
• It tells whether a standard deviation is large or small by comparing the standard deviation to the mean.
• It allows comparison of standard deviations that come from data sets with different means.
For the population: $cv = \frac{\sigma}{\mu} \times 100\%$
For the sample: $cv = \frac{s}{\bar{x}} \times 100\%$
Exercise:
Find the measures of dispersion in the given sample data.
6, 7, 7, 8, 9, 10

1. Range: 10 - 6 = 4

2. Variance
Get the mean:
$\bar{x} = \frac{\sum x}{N} = \frac{47}{6} = 7.83$

x      x - x̄     (x - x̄)²
6      -1.83     3.3489
7      -0.83     0.6889
7      -0.83     0.6889
8      0.17      0.0289
9      1.17      1.3689
10     2.17      4.7089
Total            10.8334

$s^2 = \frac{\sum (x - \bar{x})^2}{N - 1} = \frac{10.8334}{6 - 1} = \frac{10.8334}{5} = 2.1667$

3. Standard Deviation
$s = \sqrt{s^2} = \sqrt{2.1667} = 1.4720$

4. Coefficient of Variation
$cv = \frac{s}{\bar{x}} \times 100\% = \frac{1.4720}{7.83} \times 100 = 0.1880 \times 100 = 18.8\%$
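The whole exercise can be verified with Python's standard `statistics` module. This is a sketch for checking the arithmetic. The variable names are chosen here. `variance` and `stdev` use the sample (N - 1) formulas.

```python
from statistics import mean, stdev, variance

data = [6, 7, 7, 8, 9, 10]

data_range = max(data) - min(data)   # 1. range
x_bar = mean(data)                   # sample mean
s2 = variance(data)                  # 2. sample variance (divides by N - 1)
s = stdev(data)                      # 3. sample standard deviation
cv = s / x_bar * 100                 # 4. coefficient of variation, in percent

print(data_range, round(x_bar, 2), round(s2, 4), round(s, 4), round(cv, 1))
# 4 7.83 2.1667 1.472 18.8
```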
