Describing Data
Measures of Variability
Variability - refers to how spread out data points are, or how much they differ from each other and from
the center of a distribution. It indicates the degree to which individual data points are similar to or
different from one another within a dataset or distribution.
Figure 3-6 explanation:
● Distribution A (left graph):
○ Scores spread widely from 0 to 100.
○ This means students’ test scores are very spread out (high variability).
○ Even though the average (mean) score is 50, scores are all over the place.
● Distribution B (right graph):
○ Scores are tightly clustered between 40 and 60.
○ This means students’ scores are close together (low variability).
○ Again, the average (mean) is still 50, but most students scored near that
average.
Main Point:
Both distributions have the same mean (50), but their variability is very different.
● Distribution A shows a wide range of scores.
● Distribution B shows a narrow range of scores.
Range - the range of a distribution is equal to the difference between the highest and lowest
scores.
Example:
Suppose the test scores in a class are: 45, 50, 60, 70, 80
● Highest score = 80
● Lowest score = 45
● Range = 80 − 45 = 35
So, the range of the test scores is 35.
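As a quick check, here is a minimal Python sketch (the variable names are just for illustration) that reproduces the range calculation above:

# Test scores from the example above
scores = [45, 50, 60, 70, 80]

# Range = highest score minus lowest score
range_value = max(scores) - min(scores)
print(range_value)  # 35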
So, as a descriptive statistic of variation, the range provides a quick but gross description of the
spread of scores. Because its value is based entirely on the most extreme scores in a distribution, the
resulting description of variation may be understated or overstated. Better measures of variation include the
interquartile range and the semi-interquartile range.
Interquartile Range - The IQR shows the spread of the middle 50% of the data, so it’s less
affected by extreme values (outliers).
Quartile - refers to a specific point
● Quartiles help you see who’s in the lower group, middle group, and higher group.
● Quartile → splits data into 4 equal parts.
Quarter - refers to an interval
● “Quarter” just means one-fourth of the whole.
● Quarter → splits anything (time, money, objects, etc.) into 4 equal parts.
Semi-interquartile Range - half of the interquartile range (IQR ÷ 2), showing the typical deviation from the median.
Example:
Suppose we have this data set of test scores: 10, 20, 30, 40, 50, 60, 70, 80, 90
1. Median = 50
2. Q2 (50th percentile) = 50 (equal to the median)
3. Q1 (25th percentile) = 30
4. Q3 (75th percentile) = 70
● IQR = Q3 - Q1 = 70 - 30 = 40
● SIQR = IQR ÷ 2 = 40 ÷ 2 = 20
In a perfectly symmetrical distribution, Q1 and Q3 will be exactly the same distance from the
median. If these distances are unequal then there is a lack of symmetry. This lack of symmetry
is referred to as skewness.
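A short Python sketch of the same calculations, assuming NumPy is available. Note that np.percentile's default (linear) interpolation happens to reproduce the quartiles in this worked example, though other quartile rules can give slightly different values:

import numpy as np

# Test scores from the example above
scores = np.array([10, 20, 30, 40, 50, 60, 70, 80, 90])

# Quartiles: Q1 (25th), Q2 (50th / median), Q3 (75th percentile)
q1, q2, q3 = np.percentile(scores, [25, 50, 75])

iqr = q3 - q1        # 70 - 30 = 40
siqr = iqr / 2       # 40 / 2 = 20

# Symmetry check: equal distances of Q1 and Q3 from the median suggest no skew
print(q1, q2, q3)        # 30.0 50.0 70.0
print(iqr, siqr)         # 40.0 20.0
print(q2 - q1, q3 - q2)  # 20.0 20.0 -> symmetric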
The Mean Absolute Deviation (MAD) - another tool that can be used to describe the amount
of variability in a distribution.
The formula is:
MAD = ∑|X − X̄| / n
The bars on each side of X − X̄ indicate that it is the absolute value of the deviation score
(ignoring the positive or negative sign and treating all deviation scores as positive). All the
deviation scores are then summed and divided by the total number of scores (n) to arrive at the
average deviation.
● It is the “average” of the “positive distances” of each point from the mean.
● It tells us, on average, how far the numbers are from the mean.
● It is the “average distance” between each data value and the mean.
We make all the differences positive in Mean Absolute Deviation because:
● Some numbers will be above the mean (positive differences) and some will be below
the mean (negative differences).
● If we add them all up, the positives and negatives would cancel each other out, and it
might look like there's no variation when there actually is.
● By taking the absolute value (making the difference positive), we measure the actual
distance from the mean, no matter if it's above or below.
● That's why we make all the differences positive — so we can see the true distance from
the average.
Example data set: 2, 4, 6, 8, 10
1. Find the mean: xˉ = (2 + 4 + 6 + 8 + 10) / 5 = 30 / 5 = 6
2. Find deviations from the mean:
● |2 – 6| = 4
● |4 – 6| = 2
● |6 – 6| = 0
● |8 – 6| = 2
● |10 – 6| = 4
3. Find the average of these deviations:
MAD = (4 + 2 + 0 + 2 + 4) / 5 = 12 / 5 = 2.4
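The same MAD calculation as a small Python sketch (standard library only; the variable names are illustrative):

# Data set from the example above
data = [2, 4, 6, 8, 10]

mean = sum(data) / len(data)              # 6.0

# Absolute deviations: distance of each value from the mean, sign ignored
abs_devs = [abs(x - mean) for x in data]  # [4.0, 2.0, 0.0, 2.0, 4.0]

mad = sum(abs_devs) / len(data)           # 12 / 5 = 2.4
print(mad)  # 2.4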
Standard Deviation - refers to how spread out the numbers are from the average/mean. It
shows how consistent or how varied the data is.
● If the standard deviation is small, the numbers are close to the mean → the data is
consistent (less varied).
● Consistent results show the values are close to each other and close to the average.
● This means most people/things performed similarly.
● If the standard deviation is large, the numbers are spread out from the mean → the
data is varied (less consistent)
● Varied results show the values are spread out and far apart from the average.
● This means that there's a big difference in performance/measurement.
● The symbol for standard deviation has variously been represented as s, S, SD and the
lowercase Greek letter sigma (σ).
● One custom (the one we adhere to) has it that s refers to the sample standard deviation
and σ refers to the population standard deviation.
● Population standard deviation is used when you have data for the entire population
(all members).
● Sample standard deviation is used when you only have a subset/sample (a part of the
population).
● The “-1” is called Bessel's correction, and it helps make the estimate more accurate
since a sample usually underestimates the true spread.
Why is n-1 called degrees of freedom?
● When we calculate the sample SD, we first need the sample mean.
● Once the sample mean is known, one piece of data is no longer free to vary; it's
already determined by the others.
● The formula for the sample standard deviation is:
s = √( ∑(X − X̄)² / (n − 1) )
Example Data (Sample): 5, 7, 3, 7, 9
Here, n = 5
Step 1: Find the Sample Mean
xˉ = (5 + 7 + 3 + 7 + 9) / 5 = 31 / 5 = 6.2
Step 2: Find Deviations from the Mean and Square Them
(5 − 6.2)² = (−1.2)² = 1.44
(7 − 6.2)² = (0.8)² = 0.64
(3 − 6.2)² = (−3.2)² = 10.24
(7 − 6.2)² = (0.8)² = 0.64
(9 − 6.2)²= (2.8)² = 7.84
Sum of squared deviations = 1.44 + 0.64 + 10.24 + 0.64 + 7.84 = 20.8
Step 3: Divide by n − 1 (degrees of freedom)
s² = 20.8 / (5 − 1) = 20.8 / 4 = 5.2
Step 4: Take the Square Root
s = √5.2 ≈ 2.28
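A minimal Python sketch of the sample standard deviation steps above; statistics.stdev from the standard library also divides by n − 1, so it should agree with the hand calculation:

import statistics

# Sample data from the example above
sample = [5, 7, 3, 7, 9]

mean = sum(sample) / len(sample)           # 6.2
ss = sum((x - mean) ** 2 for x in sample)  # 20.8 (sum of squared deviations)
s = (ss / (len(sample) - 1)) ** 0.5        # divide by n - 1, then take the square root

print(round(s, 2))                          # 2.28
print(round(statistics.stdev(sample), 2))   # 2.28 (same rule: divides by n - 1)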
● The formula for the population standard deviation is:
σ = √( ∑(X − µ)² / N )
Example Data (Population): 2, 4, 6, 8, 10
Population size: N = 5
Step 1: Find the Mean (µ)
μ = (2 + 4 + 6 + 8 + 10) / 5 = 30 / 5 = 6
Step 2: Find the Squared Deviations
(2 − 6)² = (−4)² = 16
(4 − 6)² =(−2)² = 4
(6 − 6)² = (0)² = 0
(8 − 6)² = (2)² = 4
(10 − 6)² = (4)² = 16
Sum of squared deviations = 16 + 4 + 0 + 4 + 16 = 40
Step 3: Find the Variance
σ² = 40 / N = 40 / 5 = 8
Step 4: Find the Standard Deviation
σ = √8 ≈ 2.83
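A matching sketch for the population standard deviation, where we divide by N instead of n − 1; statistics.pstdev uses the same rule:

import statistics

# Population data from the example above
population = [2, 4, 6, 8, 10]

mu = sum(population) / len(population)       # 6.0
ss = sum((x - mu) ** 2 for x in population)  # 40.0
sigma = (ss / len(population)) ** 0.5        # divide by N, then take the square root

print(round(sigma, 2))                          # 2.83
print(round(statistics.pstdev(population), 2))  # 2.83 (divides by N)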
Variance - it is equal to the arithmetic mean of the squares of the differences between the
scores in a distribution and their mean. The formula used to calculate the variance (s2) using
deviation scores is:
s² = ∑(X − X¯ )² / n
Example Data Set: 4, 8, 6, 10, 12
Step 1: Find the Mean
xˉ = (4 + 8 + 6 + 10 + 12) / 5 = 40 / 5 = 8
Step 2: Find the Deviations from the Mean
● (4 – 8) = –4
● (8 – 8) = 0
● (6 – 8) = –2
● (10 – 8) = 2
● (12 – 8) = 4
Step 3: Square the Deviations
● (–4)² = 16
● 0² = 0
● (–2)² = 4
● 2² = 4
● 4² = 16
Step 4: Find the Average of Squared Deviations
Variance = (16 + 0 + 4 + 4 + 16) / 5 = 40 / 5 = 8
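A quick Python check of the variance formula above (dividing by n, as in s² = ∑(X − X̄)² / n); statistics.pvariance uses the same divisor:

import statistics

# Data set from the example above
data = [4, 8, 6, 10, 12]

mean = sum(data) / len(data)                                # 8.0
variance = sum((x - mean) ** 2 for x in data) / len(data)   # 40 / 5 = 8.0

print(variance)                    # 8.0
print(statistics.pvariance(data))  # also 8 (divides by n)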
Skewness - refers to the nature and extent to which symmetry is absent. It is an indication of
how the measurements in a distribution are distributed.
● Positive skew - when relatively few of the scores fall at the high end of the distribution.
● It shows that the tail goes to the right (higher values; 10, 15, 20).
● Most data are on the low end, but a few very high numbers pull the curve to the right.
● For example: the test scores of the students in an exam. Most students score low, but a
few students get very high scores.
● Its results may indicate that the test was too difficult.
● Negative skew - when relatively few of the scores fall at the low end of the distribution.
● It shows that the tail goes to the left (lower values; -5, 0, 2).
● Most data are on the high end, but a few very low numbers pull the curve to the left.
● For example: the test scores of the students in an exam. Most students score very high,
but a few fail badly.
● Its results may indicate that the test was too easy.
● Zero skew - the distribution is perfectly symmetrical around the mean.
● It shows the balanced bell curve and no leaning to the right or left.
● The data is spread evenly around the average.
● Its results show that the left and right sides of the distribution are mirror images.
● For example: IQ tests. By design, IQ tests are made to follow a normal (balanced)
distribution.
Why are the peaks around 5-7?
● Both the positive skew and negative skew graphs have their peak (the mode, where
most values occur) around 5-7.
● This happens because skewness doesn't move the peak much.
● Skewness only stretches one side of the distribution (the tail), not the whole graph.
● So, the majority of the data (the bulk, the peak) stays in the middle. What changes is
how far the extreme values go on one side.
● Both graphs peak at 5-7 because that's where the bulk of the data lies.
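These notes do not give a formula for skewness, but one common way to quantify it is the moment-based (Fisher-Pearson) coefficient: the average of the cubed z-scores. The sketch below uses that formula with a made-up data set, just to show that a long right tail produces a positive value:

import statistics

def skewness(data):
    # Moment-based skewness: mean of cubed z-scores (one common definition)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)   # population SD (divides by n)
    return sum(((x - mean) / sd) ** 3 for x in data) / len(data)

# Hypothetical exam scores: most are low, a few very high scores pull the tail to the right
scores = [3, 4, 4, 5, 5, 5, 6, 6, 7, 15, 20]
print(round(skewness(scores), 2))  # positive value -> positive (right) skew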
Kurtosis - refers to the steepness of a distribution in its center.
● Platykurtic - relatively flat (low kurtosis)
● It shows a flat peak + light tails.
● Values are spread out more evenly, with few extreme outliers because the tails are light (short).
● For example: exam scores. Where students’ scores spread out evenly from low to high
— no strong clustering and no big outliers.
● Leptokurtic - relatively peaked (high kurtosis)
● It shows a tall, skinny peak + heavy tails.
● But heavy tails → more extreme outliers on both sides.
● For example: stock market returns → most days have small changes, but sometimes
there are huge gains/losses.
● Mesokurtic - somewhere in the middle or between (normal curve)
● It's a “just right” curve. The distribution looks like the usual bell shape.
● Neither too peaked nor too flat.
● Tails are moderate → average number of outliers.
● For example: heights of adults → most people are near the average, a few are
taller/shorter.
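Similarly, one common way to quantify kurtosis is the average of the z-scores raised to the fourth power, minus 3 (so a normal curve comes out near 0). This is only an illustrative sketch with made-up data; the data sets and values are not from the notes:

import statistics

def excess_kurtosis(data):
    # Moment-based excess kurtosis: mean of z-scores to the 4th power, minus 3 (normal ≈ 0)
    mean = statistics.fmean(data)
    sd = statistics.pstdev(data)
    return sum(((x - mean) / sd) ** 4 for x in data) / len(data) - 3

flat_like = [10, 20, 30, 40, 50, 60, 70, 80, 90]  # evenly spread -> platykurtic-like
peaked = [50] * 20 + [5, 95]                      # tight cluster plus extreme tails -> leptokurtic-like

print(round(excess_kurtosis(flat_like), 2))  # negative -> flatter than normal
print(round(excess_kurtosis(peaked), 2))     # positive -> more peaked, heavier tails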
● Outliers are data points (values) that are much higher or much lower than most of
the other values in a dataset.
● They are the “odd ones out” — unusual values that don't fit the general pattern.
● For example: student exam scores. Most students score between 70-90 on a test.
However, one student scores 20 (very low) and another scores 100 (perfect).
● In short, outliers are extreme values that stand far away from the rest of the data.