Variability

The document discusses different measures of variability in statistics, including range, interquartile range, and standard deviation. It explains that standard deviation is the most widely used measure because it takes into account every value in the dataset rather than just the extremes. Standard deviation indicates how tightly the values are clustered around the mean and is usually reported alongside the mean. For datasets with a normal distribution, standard deviation can determine what proportion of values fall within a certain range of the mean.

Uploaded by

Jelica Vasquez

STATISTICS 101

MEASURES OF VARIABILITY
Measures of average such as the median and mean represent the typical value
for a dataset. Within the dataset the actual values usually differ from one another and
from the average value itself. The extent to which the median and mean are good
representatives of the values in the original dataset depends upon the variability or
dispersion in the original data. Datasets are said to have high dispersion when they
contain values considerably higher and lower than the mean value. In figure 1 the
numbers of different sized tutorial groups in semester 1 and semester 2 are presented.
In both semesters the mean and median tutorial group size is 5 students; however, the
groups in semester 2 show more dispersion (or variability in size) than those in
semester 1. Dispersion within a dataset can be measured or described in several ways
including the range, inter-quartile range and standard deviation.

The Range.
The range is the most obvious measure of dispersion and is the difference
between the lowest and highest values in a dataset. In figure 1, the size of the largest
semester 1 tutorial group is 6 students and the size of the smallest group is 4 students,
resulting in a range of 2 (6-4). In semester 2, the largest tutorial group size is 7 students
and the smallest tutorial group contains 3 students, therefore the range is 4 (7-3).
The range is simple to compute and is useful when you wish to evaluate
the whole of a dataset.
The range is useful for showing the spread within a dataset and for
comparing the spread between similar datasets.

1 rensonrobles@yahoo.com

An example of the use of the range to compare spread within datasets is
provided in table 1. The scores of individual students in the examination and
coursework component of a module are shown.

To find the range in marks, the highest and lowest values need to be found from
the table. The highest coursework mark was 48 and the lowest was 27, giving a range of
21. In the examination, the highest mark was 45 and the lowest 12, producing a range of
33. This indicates that there was wider variation in the students' performance in the
examination than in the coursework for this module. Since the range is based solely on
the two most extreme values within the dataset, if one of these is either exceptionally
high or low (sometimes referred to as an outlier) it will result in a range that is not typical
of the variability within the dataset. For example, imagine that
one student failed to hand in any coursework and was awarded a mark of zero,
but sat the exam and scored 40. The range for the coursework marks would
now become 48 (48-0) rather than 21; however, the new range is not typical of the
dataset as a whole and is distorted by the outlier in the coursework marks. In order to
reduce the problems caused by outliers in a dataset, the inter-quartile range is often
calculated instead of the range.
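
The effect of a single outlier on the range can be sketched in a few lines of Python. Table 1 is not reproduced here, so the list of marks below is hypothetical: only the highest mark (48) and the lowest mark (27) come from the text, and the helper name `value_range` is an assumption made for illustration.

```python
# Hypothetical coursework marks: only the highest (48) and lowest (27)
# values come from the text; the values in between are invented.
coursework = [27, 31, 34, 36, 38, 40, 41, 43, 45, 48]


def value_range(values):
    """Return the difference between the highest and lowest values."""
    return max(values) - min(values)


print(value_range(coursework))      # 21

# A single mark of zero changes the range, although the other marks
# are exactly as spread out as before.
with_outlier = coursework + [0]
print(value_range(with_outlier))    # 48
```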

The Inter-quartile Range

The inter-quartile range is a measure that indicates the extent to which the
central 50% of values within the dataset are dispersed. It is based upon, and related to,
the median. In the same way that the median divides a dataset into two halves, it can be
further divided into quarters by identifying the upper and lower quartiles. The lower
quartile is found one quarter of the way along a dataset when the values have been
arranged in order of magnitude; the upper quartile is found three quarters along the
dataset. Therefore, in terms of position in the ordered list, the upper quartile lies halfway
between the median and the highest value in the dataset whilst the lower quartile lies
halfway between the median and the lowest value in the dataset. The inter-quartile range
is found by subtracting the lower quartile from the upper quartile. For example, the
examination marks for 20 students following a particular module are arranged in order of magnitude.


The median lies at the mid-point between the two central values (10th and 11th)
= half-way between 60 and 62 = 61
The lower quartile lies at the mid-point between the 5th and 6th values
= half-way between 52 and 53 = 52.5
The upper quartile lies at the mid-point between the 15th and 16th values
= half-way between 70 and 71 = 70.5
The inter-quartile range for this dataset is therefore 70.5 - 52.5 = 18 whereas the
range is: 80 - 43 = 37.
The inter-quartile range provides a clearer picture of the overall dataset by
removing/ignoring the outlying values. Like the range however, the inter-quartile range
is a measure of dispersion that is based upon only two values from the dataset.
Statistically, the standard deviation is a more powerful measure of dispersion because it
takes into account every value in the dataset. The standard deviation is explored in the
next section of this guide.
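
The hand calculation above can be sketched in Python. The original list of 20 marks is not reproduced in this copy, so the dataset below is hypothetical: it was chosen only so that the 5th, 6th, 10th, 11th, 15th and 16th values and the extremes (43 and 80) match the figures quoted in the text. The function name `quartiles_by_halves` is likewise an assumption.

```python
from statistics import median

# Hypothetical marks for 20 students. Only the values at positions
# 1, 5, 6, 10, 11, 15, 16 and 20 are taken from the text.
marks = [43, 48, 50, 51, 52, 53, 55, 57, 59, 60,
         62, 64, 66, 68, 70, 71, 73, 75, 78, 80]


def quartiles_by_halves(values):
    """Return (lower quartile, median, upper quartile).

    The quartiles are taken as the medians of the lower and upper halves
    of the ordered data. For an odd number of values the middle value is
    left out of both halves, which is one of several conventions in use.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered), median(ordered[-half:])


q1, med, q3 = quartiles_by_halves(marks)
iqr = q3 - q1
print(q1, med, q3)                     # 52.5 61.0 70.5
print(iqr, max(marks) - min(marks))    # 18.0 37
```

Spreadsheet and library quartile functions often interpolate between values instead, so they may return slightly different quartiles for the same data.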
Calculating the Inter-quartile range using Excel.
Excel's quartile functions interpolate between values, which is a different method from the
one described above and can give different results, particularly when the dataset contains
only a few values. For this reason it may be best to calculate the inter-quartile range by hand.

The Standard Deviation

The standard deviation is a measure that summarises the amount by which
every value within a dataset varies from the mean. Effectively it indicates how tightly
the values in the dataset are bunched around the mean value. It is the most
widely used measure of dispersion since, unlike the range and inter-quartile range, it
takes into account every value in the dataset. When the values in a dataset are
tightly bunched together the standard deviation is small. When the values are spread
apart the standard deviation will be relatively large. The standard deviation is usually
presented in conjunction with the mean and is measured in the same units. In many
datasets the values deviate from the mean value due to chance and such datasets are
said to display a normal distribution. In a dataset with a normal distribution most of the
values are clustered around the mean while relatively few values tend to be extremely
high or extremely low. Many natural phenomena display a normal distribution. For
datasets that have a normal distribution the standard deviation can be used to
determine the proportion of values that lie within a particular range of the mean value.
For such distributions approximately 68% of values are less than one
standard deviation (1SD) away from the mean value, approximately 95% of values are less
than two standard deviations (2SD) away from the mean and approximately 99.7% (often
rounded to 99%) of values are less than three standard deviations (3SD) away from the
mean. Figure 3 shows this concept in diagrammatical form.

If the mean of a dataset is 25 and its standard deviation is 1.6, then

about 68% of the values in the dataset will lie between MEAN-1SD (25-1.6=23.4) and MEAN+1SD (25+1.6=26.6)
about 95% of the values will lie between MEAN-2SD (25-3.2=21.8) and MEAN+2SD (25+3.2=28.2)
about 99.7% of the values will lie between MEAN-3SD (25-4.8=20.2) and MEAN+3SD (25+4.8=29.8).
If the dataset had the same mean of 25 but a larger standard deviation (for
example, 2.3) it would indicate that the values were more dispersed. The frequency
distribution for a dispersed dataset would still show a normal distribution but when
plotted on a graph the shape of the curve will be flatter as in figure 4.
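
As a rough check of these proportions, the sketch below uses `NormalDist` from Python's standard `statistics` module to compute the share of a normal distribution lying within 1, 2 and 3 standard deviations of the mean, using the mean of 25 and standard deviation of 1.6 from the example. The helper name `interval_and_share` is an assumption made for illustration.

```python
from statistics import NormalDist


def interval_and_share(mean, sd, k):
    """Return (low, high, share) for the interval mean +/- k standard deviations.

    share is the proportion of a normal distribution with the given mean
    and standard deviation that falls between low and high.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    low, high = mean - k * sd, mean + k * sd
    return low, high, dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low, high, share = interval_and_share(25, 1.6, k)
    print(f"within {k} SD: {low:.1f} to {high:.1f} -> {share:.1%}")
# within 1 SD: 23.4 to 26.6 -> 68.3%
# within 2 SD: 21.8 to 28.2 -> 95.4%
# within 3 SD: 20.2 to 29.8 -> 99.7%
```

The shares depend only on k, not on the mean or the standard deviation, so a flatter curve with a standard deviation of 2.3 gives the same percentages over wider intervals.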


Population and sample standard deviations

There are two different calculations for the Standard Deviation. Which formula
you use depends upon whether the values in your dataset represent an entire
population or whether they form a sample of a larger population. For example, if all
student users of the library were asked how many books they had borrowed in the past
month then the entire population has been studied since all the students have been
asked. In such cases the population standard deviation should be used. Sometimes it is
not possible to find information about an entire population and it might be more
realistic to ask a sample of 150 students about their library borrowing and use these
results to estimate library borrowing habits for the entire population of students. In
such cases the sample standard deviation should be used.

Formulae for the standard deviation

Whilst it is not necessary to learn the formula for calculating the standard
deviation, there may be times when you wish to include it in a report or dissertation.
The standard deviation of an entire population is known as σ (sigma) and is
calculated using:

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}} \quad \text{or} \quad \sigma = \sqrt{\frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2}$$

Where x represents each value in the population, μ is the mean value of the
population, Σ is the summation (or total), and N is the number of values in the
population.
The standard deviation of a sample is known as s and is calculated using:

$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Where x represents each value in the sample, x̄ is the mean value of the
sample, Σ is the summation (or total), and n-1 is the number of values in the sample
minus 1.
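
The two calculations can be written out directly in Python as a minimal sketch. The function names and the small dataset are hypothetical, chosen so that the arithmetic is easy to follow: the mean is 5 and the squared deviations sum to 32.

```python
from math import sqrt


def population_sd(values):
    """Square root of the mean squared deviation from the mean (divide by N)."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)


def population_sd_shortcut(values):
    """Same quantity: mean of the squares minus the square of the mean."""
    n = len(values)
    return sqrt(sum(x * x for x in values) / n - (sum(values) / n) ** 2)


def sample_sd(values):
    """As population_sd, but dividing by n - 1 instead of N."""
    n = len(values)
    x_bar = sum(values) / n
    return sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


data = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical values, mean = 5

print(population_sd(data))             # 2.0  (sqrt of 32/8)
print(population_sd_shortcut(data))    # 2.0
print(round(sample_sd(data), 3))       # 2.138  (sqrt of 32/7)
```

Python's standard library offers `statistics.pstdev` and `statistics.stdev` for the same two calculations, so writing them by hand is only useful for seeing how the formulae work.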

Calculating the standard deviation using Excel

Excel has functions to calculate the population and sample standard deviations.
The appropriate commands are entered into the formula bar towards the top of the
spreadsheet and the corresponding cells in the spreadsheet are updated to show the
result. For an example of calculating the population standard deviation, imagine you
wish to know how fuel-efficient a new car that you have just purchased is. You calculate
how many kilometres you have done per litre on your first five trips. This information is
presented as column A of the spreadsheet (figure 5). As you have only made 5 trips you
do not have any further information and you are therefore measuring the whole
population at this point in time. The command to find the population standard deviation
in Excel is =STDEVP(VALUES) (named STDEV.P in Excel 2010 and later) and in this case
the command is =STDEVP(A2:A6) which gives an answer of 0.49. The mean of the five
values is 16.22. Basing your results on the population standard deviation and
assuming that your first 5 trips in your new car have been typical of your usual
journeys, you can be about 99% confident that your new car will do between 14.75
(MEAN-3SD) and 17.69 (MEAN+3SD) kilometres per litre.
The same data can be used to demonstrate how to calculate the sample standard
deviation in Excel. In this case, imagine that the data in column A represent the
kilometres per litre found for a sample of 5 new cars tested by the manufacturer. The
sample standard deviation is calculated using =STDEV(VALUES) (named STDEV.S in
Excel 2010 and later) and in this case the command is =STDEV(A2:A6) which produces
an answer of 0.55. The sample standard deviation will always be greater than the
population standard deviation when they are calculated for the same dataset (unless
every value is identical, in which case both are zero). This is because the formula for
the sample standard deviation divides by n-1 rather than N, to take into account the
possibility of there being more variation in the true population than has been measured
in the sample. Based on their sample of 5 cars, and therefore using the sample standard
deviation, the manufacturers could state with about 99% confidence that similar cars
will do between 14.57 (MEAN-3SD) and 17.87 (MEAN+3SD) kilometres per litre.
These examples show the quick method of
calculating standard deviations using a cell range. Each of the commands can also be
written out in a longer format with the individual kilometres/litre entered.
For example, entering =STDEV(16.13,16.40,15.81,17.07,15.69) produces an identical
result to =STDEV(A2:A6). However, if one of the values in column A was found to be
incorrect and adjusted, the cell range method would automatically update the
calculation of the standard deviation whereas the longer format will require manual
adjustment of the command.
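
For readers working outside Excel, the same figures can be reproduced with Python's standard `statistics` module. This is a sketch rather than part of the original guide: `pstdev` and `stdev` are used here as counterparts of the population and sample commands, and the helper `three_sd_interval` is a hypothetical name. The standard deviation is rounded to two decimal places before the interval is built, which appears to be how the figures in the text were obtained.

```python
from statistics import mean, pstdev, stdev

# The five kilometres-per-litre values quoted in the text (cells A2:A6).
km_per_litre = [16.13, 16.40, 15.81, 17.07, 15.69]

avg = mean(km_per_litre)
pop_sd = pstdev(km_per_litre)     # population standard deviation
samp_sd = stdev(km_per_litre)     # sample standard deviation (divides by n - 1)


def three_sd_interval(centre, sd):
    """Return (centre - 3*sd, centre + 3*sd), with sd rounded to 2 places first."""
    sd = round(sd, 2)
    return centre - 3 * sd, centre + 3 * sd


print(round(avg, 2), round(pop_sd, 2), round(samp_sd, 2))    # 16.22 0.49 0.55

low, high = three_sd_interval(avg, pop_sd)
print(round(low, 2), round(high, 2))      # 14.75 17.69

low_s, high_s = three_sd_interval(avg, samp_sd)
print(round(low_s, 2), round(high_s, 2))  # 14.57 17.87
```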

VARIANCE
The variance and the closely-related standard deviation are measures of
how spread out a distribution is. In other words, they are measures of variability.
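The variance is computed as the average squared deviation of each number from
its mean. For example, for the numbers 1, 2, and 3, the mean is 2 and the variance is:

$$\sigma^2 = \frac{(1-2)^2 + (2-2)^2 + (3-2)^2}{3} = \frac{2}{3} \approx 0.667$$

The formula (in summation notation) for the variance in a population is

$$\sigma^2 = \frac{\sum (x - \mu)^2}{N} \quad \text{or} \quad \sigma^2 = \frac{\sum x^2}{N} - \left(\frac{\sum x}{N}\right)^2$$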
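
The variance of the numbers 1, 2 and 3 can be checked with a short Python sketch. Exact fractions are used so that both forms of the population formula can be compared without rounding; the variable names are assumptions made for illustration.

```python
from fractions import Fraction
from statistics import pvariance

numbers = [1, 2, 3]
n = len(numbers)
mu = Fraction(sum(numbers), n)                     # mean = 2

# Definition: average squared deviation from the mean.
variance = sum((x - mu) ** 2 for x in numbers) / n

# Shortcut: mean of the squares minus the square of the mean.
shortcut = Fraction(sum(x * x for x in numbers), n) - mu ** 2

print(variance, shortcut)       # 2/3 2/3
print(pvariance(numbers))       # library result, approximately 0.667
```

Taking the square root of the variance gives the standard deviation, here about 0.816.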

STANDARD DEVIATION FOR GROUPED DATA

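The standard deviation for grouped data can be computed by the following
formula

$$\sigma = \sqrt{\frac{\sum f (M - \mu)^2}{N}} \quad \text{or} \quad \sigma = \sqrt{\frac{\sum f M^2}{N} - \left(\frac{\sum f M}{N}\right)^2}$$

Where σ - is the standard deviation
M - class mark
μ - mean
f - frequency
N - total frequency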
Example:
Below are the scores of 40 BS Architecture students in Building Technology.
Compute the variance and standard deviation.

Class Interval   Frequency (f)   Class Mark (M)   M - μ      (M - μ)²   f(M - μ)²
45 - 49          2               47               -16.875    284.766    569.5313
50 - 54          5               52               -11.875    141.016    705.0781
55 - 59          6               57               -6.875     47.2656    283.5938
60 - 64          10              62               -1.875     3.51563    35.15625
65 - 69          4               67               3.125      9.76563    39.0625
70 - 74          6               72               8.125      66.0156    396.0938
75 - 79          7               77               13.125     172.266    1205.859

Σf(M - μ)² = 3234.375
N = 40
Mean = 63.875

$$\sigma^2 = \frac{\sum f (M - \mu)^2}{N} = \frac{3234.375}{40} = 80.859375$$

$$\sigma = \sqrt{\frac{\sum f (M - \mu)^2}{N}} = \sqrt{\frac{3234.375}{40}} = \sqrt{80.859375} \approx 8.9922$$
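
The worked example can be verified with a short Python sketch. The class limits and frequencies are taken from the table above; the function name `grouped_stats` and the tuple layout are assumptions made for illustration.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for each class in the worked example.
classes = [
    (45, 49, 2), (50, 54, 5), (55, 59, 6), (60, 64, 10),
    (65, 69, 4), (70, 74, 6), (75, 79, 7),
]


def grouped_stats(classes):
    """Return (N, mean, variance) for grouped data using class marks."""
    # The class mark M is the midpoint of each class interval.
    marks = [((low + high) / 2, f) for low, high, f in classes]
    n = sum(f for _, f in marks)
    mean = sum(f * m for m, f in marks) / n
    variance = sum(f * (m - mean) ** 2 for m, f in marks) / n
    return n, mean, variance


n, mean, variance = grouped_stats(classes)
print(n, mean, variance)           # 40 63.875 80.859375
print(round(sqrt(variance), 4))    # 8.9922
```

The same function can be used to check answers to the grouped-data exercises once their class limits and frequencies are entered in the same form.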

Exercises:
Find the variance and standard deviation of the following set of data.
1. 44, 49, 52, 62, 53, 48, 54, 49, 46, 51
2. 12, 13, 10, 14, 14, 15, 17, 17, 10, 12, 11
3. 5.5, 4.3, 3.4, 5.6, 5.4, 7.8
4. 65, 75, 73, 50, 60, 64, 69, 62, 67, 85
5. 85, 79, 57, 39, 45, 71, 67, 87, 91, 49
6. 43, 51, 53, 110, 50, 48, 87, 69, 68, 91
7. Class      Frequency
   27 - 32    1
   33 - 38    0
   39 - 44    6
   45 - 49    4
   50 - 55    2

8. Class      Frequency
   5 - 9      1
   9 - 13     2
   13 - 17    5
   17 - 20    6
   20 - 24    3

9. Class      Frequency
   9 - 13     1
   14 - 19    6
   20 - 25    2
   26 - 28    5
   29 - 32    9

10. Class       Frequency
    123 - 127   3
    128 - 132   7
    138 - 142   2
    143 - 147   19
