Normal Distribution
A normal distribution is a symmetric, bell-shaped probability distribution that is completely
determined by two values: its mean and its standard deviation. It is centered at the mean of the
variable, and its dispersion depends on the value of its standard deviation. The smaller the
standard deviation, the steeper and less dispersed the distribution. In graph form, a normal
distribution appears as a bell curve.
Properties of the normal distribution
The curve has a single peak, implying that the distribution is unimodal.
It is bell-shaped.
It is symmetrical around the mean; the mean lies at the center of the distribution.
The tails of the curve extend indefinitely and are asymptotic to the horizontal
axis.
The shape of the curve depends on the values of the mean and standard deviation.
The total area under the curve is 1. Thus, the area on each side of the mean is 0.5.
The probability between two given points is equal to the area under the curve between
those points.
A random variable X is said to be normally distributed with mean μ and standard
deviation σ if its probability density function is
$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
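The density of a normal distribution, f(x) = 1/(σ√(2π)) · exp(−(x − μ)²/(2σ²)), can be evaluated directly to see how the standard deviation controls the shape. The sketch below is illustrative only; the function name `normal_pdf` and the two σ values are assumptions chosen for the demonstration.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    coefficient = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coefficient * math.exp(exponent)


# Height of the curve at the mean for two illustrative standard deviations.
peak_narrow = normal_pdf(0.0, mu=0.0, sigma=0.5)
peak_wide = normal_pdf(0.0, mu=0.0, sigma=2.0)

# A smaller sigma gives a taller, steeper curve.
print(peak_narrow > peak_wide)

# The curve is symmetric around the mean.
print(normal_pdf(-1.0) == normal_pdf(1.0))
```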
Standard Normal Distribution
The standard normal distribution is a normal distribution with a mean of zero and a standard
deviation of 1. Any raw score x from a normal distribution can be converted to a z-score, which
gives the number of standard deviations by which x deviates from the mean:
$z = \frac{x - \mu}{\sigma}$
Where:
x = the raw score
μ = the population mean
σ = the population standard deviation
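As a minimal sketch of the z-score formula z = (x − μ)/σ, the function below converts a raw score to a z-score. The name `z_score` and the sample values (raw score 85, mean 70, standard deviation 10) are hypothetical and serve only as an illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations by which x lies above (+) or below (-) mu."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical values: raw score 85, population mean 70, population SD 10.
z = z_score(85, 70, 10)
print(z)  # 1.5
```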
Finding Areas under the Normal Curve and Finding the z-scores
1. Sketch the standard normal curve and shade the appropriate area under the curve.
2. Find the area by following the directions for each case shown.
a. To find the area to the left of z, find the area that corresponds to z in the Standard
Normal Table.
b. To find the area to the right of z, use the Standard Normal Table to find the area that
corresponds to z. Then subtract that area from 1.
c. To find the area between two z-scores, find the area corresponding to each z-score
in the Standard Normal Table. Then subtract the smaller area from the larger area.
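The three cases above can also be checked numerically instead of with the Standard Normal Table. The sketch below assumes the table lists cumulative areas to the left of z, and computes that area from the error function `math.erf`; the helper names are assumptions made for this illustration.

```python
import math


def area_left(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_right(z):
    """Area to the right of z: subtract the left area from 1."""
    return 1.0 - area_left(z)


def area_between(z1, z2):
    """Area between two z-scores: larger left area minus smaller left area."""
    a1, a2 = area_left(z1), area_left(z2)
    return max(a1, a2) - min(a1, a2)


print(round(area_left(1.0), 4))           # 0.8413
print(round(area_right(1.0), 4))          # 0.1587
print(round(area_between(-1.0, 1.0), 4))  # 0.6827
```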
Measures of Shapes
Skewness
Skewness refers to a distortion or asymmetry that deviates from the symmetrical bell
curve, or normal distribution, in a set of data. If the bulk of the curve is shifted to the left
or to the right, the distribution is said to be skewed. Skewness quantifies the extent to which
a given distribution departs from the symmetry of a normal distribution.
Types of Skewness
1. Positive Skewness – If the bulk of the distribution is shifted to the left, with its tail on
the right side, it is a positively skewed distribution. It is also called a right-skewed
distribution. The tail is the side of the curve that tapers off more gradually than the
other side. A positively skewed distribution has a skewness value greater than zero.
2. Negative Skewness – If the bulk of the distribution is shifted to the right, with its tail
on the left side, it is a negatively skewed distribution. It is also called a left-skewed
distribution. The skewness value of a negatively skewed distribution is
always less than zero.
Kurtosis
Kurtosis is a statistical measure used to describe the degree to which scores cluster in
the tails or the peak of a frequency distribution. The peak is the tallest part of the
distribution, and the tails are the ends of the distribution.
Types of Kurtosis
The values below refer to excess kurtosis (kurtosis minus 3), for which a normal distribution
scores zero.
1. Mesokurtic - Distributions that are moderate in breadth, with a peak of medium
height. If the excess kurtosis is zero or close to zero, the distribution is
mesokurtic.
2. Leptokurtic - More values in the distribution tails and more values close to the mean
(i.e. sharply peaked with heavy tails). When the excess kurtosis is positive, that is,
greater than zero, the distribution is leptokurtic.
3. Platykurtic - Fewer values in the tails and fewer values close to the mean (i.e. the
curve has a flat peak and more dispersed scores with lighter tails). When the
excess kurtosis is negative, that is, less than zero, the distribution is platykurtic.
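The sign rules for skewness and kurtosis can be written as two small classifiers. This is a sketch under stated assumptions: the kurtosis value is taken to be excess kurtosis (kurtosis minus 3, so that a normal curve scores zero), and `tol` is a hypothetical parameter for how close to zero still counts as symmetric or mesokurtic. The function names and the two sample values are illustrative.

```python
def classify_skewness(skewness, tol=0.0):
    """Label a skewness value; tol is a hypothetical 'close to zero' band."""
    if skewness > tol:
        return "positively skewed"
    if skewness < -tol:
        return "negatively skewed"
    return "approximately symmetric"


def classify_kurtosis(excess_kurtosis, tol=0.0):
    """Label an excess kurtosis value (kurtosis minus 3)."""
    if excess_kurtosis > tol:
        return "leptokurtic"
    if excess_kurtosis < -tol:
        return "platykurtic"
    return "mesokurtic"


# Illustrative values only.
print(classify_skewness(0.61))   # positively skewed
print(classify_kurtosis(-0.16))  # platykurtic
```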
Formula to get the Skewness:
$\text{Skewness} = \frac{\sum (x - \bar{x})^3}{(n-1) \cdot S^3}$
Formula to get the Kurtosis:
$\text{Kurtosis} = \frac{\sum (x - \bar{x})^4}{(n-1) \cdot S^4}$
Example: Calculate the sample skewness and sample kurtosis from the following data
85, 96, 76, 108, 85, 80, 100, 85, 70, 95.
Solution:
$\text{Mean } \bar{x} = \frac{\sum x}{n} = \frac{85+96+76+108+85+80+100+85+70+95}{10} = \frac{880}{10} = 88$
x             (x − x̄)      (x − x̄)²     (x − x̄)³     (x − x̄)⁴
              = (x − 88)   = (x − 88)²  = (x − 88)³  = (x − 88)⁴
85            -3           9            -27          81
96            8            64           512          4,096
76            -12          144          -1,728       20,736
108           20           400          8,000        160,000
85            -3           9            -27          81
80            -8           64           -512         4,096
100           12           144          1,728        20,736
85            -3           9            -27          81
70            -18          324          -5,832       104,976
95            7            49           343          2,401
Total: 880    0            1,216        2,430        317,284
$\text{Standard Deviation } S = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\frac{1216}{9}} = \sqrt{135.1111} = 11.6237$
$\text{Skewness} = \frac{\sum (x - \bar{x})^3}{(n-1) \cdot S^3} = \frac{2430}{9 \cdot (11.6237)^3} = \frac{2430}{9 \cdot 1570.4951} = 0.1719$
$\text{Kurtosis} = \frac{\sum (x - \bar{x})^4}{(n-1) \cdot S^4} = \frac{317284}{9 \cdot (11.6237)^4} = \frac{317284}{9 \cdot 18255.0123} = 1.9312$
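The hand computation above can be cross-checked with a short script. This is a sketch that follows the same formulas, with (n − 1)·S³ and (n − 1)·S⁴ in the denominators; the function name `sample_shape` is an assumption introduced here.

```python
import math


def sample_shape(data):
    """Return mean, sample SD, skewness and kurtosis using (n - 1) denominators."""
    n = len(data)
    mean = sum(data) / n
    # Sample standard deviation: sqrt( sum of squared deviations / (n - 1) ).
    s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    # Sum of cubed deviations over (n - 1) * S^3.
    skewness = sum((x - mean) ** 3 for x in data) / ((n - 1) * s ** 3)
    # Sum of fourth-power deviations over (n - 1) * S^4.
    kurtosis = sum((x - mean) ** 4 for x in data) / ((n - 1) * s ** 4)
    return mean, s, skewness, kurtosis


scores = [85, 96, 76, 108, 85, 80, 100, 85, 70, 95]
mean, s, skewness, kurtosis = sample_shape(scores)
print(mean, round(s, 4), round(skewness, 4), round(kurtosis, 4))
```

The printed values should agree with the worked solution to about four decimal places; small differences can arise because the hand computation rounds S before raising it to a power.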
The measures of variability describe how far the data points tend to fall from the center.
If a data set has:
Low dispersion/variability
-the data points tend to be clustered tightly around the center.
-this is ideal because you can make better predictions about the population based on the given
sample data.
High dispersion/variability
-the data points tend to fall farther away from the center.
-the values are less consistent, so it is harder to make predictions.
Measures of variability
1. Range
-the range tells us the spread of the data from the lowest to the highest value in the distribution.
Formula:
Range = highest value – lowest value
2. Interquartile range
-it gives the spread of the middle of the distribution. When the data are arranged in ascending
order, the interquartile range covers the middle half of the values.
Formula:
Interquartile Range = Q3 – Q1
Position of Qk = (k/4)(n + 1)
To compute the quartiles Q3 and Q1 for ungrouped data:
Step 1 arrange the data in ascending order
Step 2 find the positions of Q3 and Q1
Step 3 use linear interpolation to find the exact values of the quartiles
Step 4 substitute the values of Q3 and Q1 to find the interquartile range
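The four steps can be sketched in code as follows. The helper names `quartile` and `interquartile_range` and the seven-value data set are hypothetical; the position rule is the (k/4)(n + 1) formula given above, with linear interpolation between the two neighbouring values.

```python
def quartile(sorted_values, k):
    """k-th quartile at 1-based position (k/4)(n + 1), linearly interpolated."""
    n = len(sorted_values)
    position = k * (n + 1) / 4        # Step 2: may be fractional, e.g. 2.25
    lower_index = int(position)       # whole part of the position
    fraction = position - lower_index
    # Keep the position inside the data for very small samples.
    if lower_index < 1:
        return sorted_values[0]
    if lower_index >= n:
        return sorted_values[-1]
    lower = sorted_values[lower_index - 1]
    upper = sorted_values[lower_index]
    # Step 3: move the fractional part of the way from lower to upper.
    return lower + fraction * (upper - lower)


def interquartile_range(values):
    ordered = sorted(values)          # Step 1: ascending order
    return quartile(ordered, 3) - quartile(ordered, 1)  # Step 4


# Hypothetical data set used only for illustration.
print(interquartile_range([14, 2, 10, 6, 12, 4, 8]))  # 8.0
```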
3. Standard deviation
-it is the average amount of variability in the dataset.
It tells you, on average, how far each score lies from the mean. The larger the standard
deviation, the more variable the dataset is.
Formula:
Standard deviation for populations
$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}}$
σ = population standard deviation
∑ = summation of
X = each value
μ = population mean
N = number of values in the population
Standard deviation for sample
$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n-1}}$
s = sample standard deviation
∑ = summation of
X = each value
x̄ = sample mean
n = number of values in the sample
To compute:
Step 1 list each score and find their mean
Step 2 subtract the mean from each score to get the deviation from the mean
Step 3 square each of these deviations
Step 4 add up all the squared deviations
Step 5 divide the sum of the squared deviations by n – 1 (for a sample) or N (for a population)
Step 6 find the square root of the number you have found.
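The six steps map directly onto a few lines of code. The sketch below is illustrative; the function name `standard_deviation`, its `sample` flag, and the eight example scores are assumptions made for this demonstration.

```python
import math


def standard_deviation(scores, sample=True):
    """Follow the six steps; divide by n - 1 for a sample or N for a population."""
    n = len(scores)
    mean = sum(scores) / n                         # Step 1: find the mean
    deviations = [x - mean for x in scores]        # Step 2: deviation from the mean
    squared = [d ** 2 for d in deviations]         # Step 3: square each deviation
    total = sum(squared)                           # Step 4: add the squared deviations
    variance = total / (n - 1 if sample else n)    # Step 5: divide by n - 1 or N
    return math.sqrt(variance)                     # Step 6: take the square root


# Hypothetical scores used only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(standard_deviation(scores, sample=False))  # 2.0
print(round(standard_deviation(scores), 4))
```

Dividing by n − 1 always gives a slightly larger result than dividing by N, which is why the sample value printed second exceeds 2.0.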
4. Variance
-it is the average of the squared deviations from the mean. A deviation from the mean is how far a
score lies from the mean.
-it reflects the degree of spread in the dataset. The more spread out the data, the larger the
variance is in relation to the mean.
Formula:
$V = s^2$
Problem 1
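The area to the right of z is 0.6664. What is the z value?
0.6664 – 0.5000 = 0.1664
An area of 0.1664 between the mean and z corresponds to 0.43 in the Standard Normal Table.
Because the area to the right of z is greater than 0.5, z lies to the left of the mean, so z = -0.43.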
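As a cross-check of the table lookup, the inverse of the standard normal cumulative distribution can be evaluated with `statistics.NormalDist` from the Python standard library. An area of 0.6664 to the right of z leaves 1 − 0.6664 to the left, so the z value falls below the mean. The variable names are illustrative.

```python
from statistics import NormalDist

area_right = 0.6664
area_left = 1.0 - area_right           # inv_cdf expects the area to the left

z = NormalDist().inv_cdf(area_left)    # standard normal: mean 0, SD 1
print(round(z, 2))  # -0.43
```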
Problem 2
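Find the probability that the petty cash is over 3,550 (mean = 3,500, standard deviation = 500).
$z = \frac{3550 - 3500}{500} = 0.1$
The area between the mean and z = 0.1 is 0.0398.
0.0398 + ? = 0.5000
? = 0.5000 – 0.0398 = 0.4602
The probability that the petty cash is over 3,550 is 0.4602.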
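The same probability can be checked without a table. The sketch below assumes the mean of 3,500 and the standard deviation of 500 that appear in the z computation, and uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumed parameters, read off the z computation: mean 3500, SD 500.
petty_cash = NormalDist(mu=3500, sigma=500)

z = (3550 - petty_cash.mean) / petty_cash.stdev
p_over = 1.0 - petty_cash.cdf(3550)    # area to the right of 3550

print(round(z, 2), round(p_over, 4))
```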
Problem 3
Calculate the Population Skewness and Population Kurtosis from the following grouped data.
CLASS 2-4 4-6 6-8 8-10
f 3 4 2 1
$\bar{x} = \frac{\sum fx}{n} = \frac{52}{10} = 5.2$
Class   f    x    fx   (x − x̄)   (x − x̄)²   f(x − x̄)²   f(x − x̄)³   f(x − x̄)⁴
2-4     3    3    9    -2.2      4.84       14.52       -31.944     70.2768
4-6     4    5    20   -0.2      0.04       0.16        -0.032      0.0064
6-8     2    7    14   1.8       3.24       6.48        11.664      20.9952
8-10    1    9    9    3.8       14.44      14.44       54.872      208.5136
Total   10   24   52   3.2       22.56      35.6        34.56       299.792
Here x is the midpoint of each class.
$S = \sqrt{\frac{\sum f(x - \bar{x})^2}{n-1}} = \sqrt{\frac{35.6}{9}} = \sqrt{3.9556} = 1.9889$
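$\text{Skewness} = \frac{n}{(n-1)(n-2)} \times \frac{\sum f(x - \bar{x})^3}{S^3}$
$= \frac{10}{(9)(8)} \times \frac{34.56}{7.8675}$
= (0.1389)(4.3928)
= 0.6101 (positively skewed, since the value is greater than zero)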
$\text{Kurtosis} = \frac{n(n+1)}{(n-1)(n-2)(n-3)} \times \frac{\sum f(x - \bar{x})^4}{S^4} - \frac{3(n-1)^2}{(n-2)(n-3)}$
$= \frac{10(11)}{(9)(8)(7)} \times \frac{299.792}{15.6478} - \frac{3(9)^2}{(8)(7)}$
$= \frac{110}{504} \times \frac{299.792}{15.6478} - \frac{243}{56}$
= (0.2183)(19.1587) – 4.3393
= -0.1570 (platykurtic)
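The grouped-data computation can be reproduced with a short script. This sketch uses the class midpoints 3, 5, 7, 9 as x and the same adjusted formulas as the solution; the function name `grouped_shape` is an assumption introduced here.

```python
import math


def grouped_shape(midpoints, frequencies):
    """Mean, SD, adjusted skewness and excess kurtosis for grouped data."""
    pairs = list(zip(midpoints, frequencies))
    n = sum(frequencies)
    mean = sum(f * x for x, f in pairs) / n
    # Frequency-weighted sums of powers of the deviations.
    sum2 = sum(f * (x - mean) ** 2 for x, f in pairs)
    sum3 = sum(f * (x - mean) ** 3 for x, f in pairs)
    sum4 = sum(f * (x - mean) ** 4 for x, f in pairs)
    s = math.sqrt(sum2 / (n - 1))
    skewness = n / ((n - 1) * (n - 2)) * sum3 / s ** 3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * sum4 / s ** 4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return mean, s, skewness, kurtosis


mean, s, skewness, kurtosis = grouped_shape([3, 5, 7, 9], [3, 4, 2, 1])
print(round(mean, 4), round(s, 4), round(skewness, 4), round(kurtosis, 4))
```

Carrying full precision gives a kurtosis of about -0.157; the hand computation differs in the fourth decimal place because it rounds the intermediate factors.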
Problem 4
Calculate the Population Skewness, and Population Kurtosis from the following
ungrouped data: 3, 13, 11, 11, 5, 4, 2.
$\text{Mean } \bar{x} = \frac{\sum x}{n} = \frac{3+13+11+11+5+4+2}{7} = \frac{49}{7} = 7$
x            (x − x̄)     (x − x̄)²    (x − x̄)³    (x − x̄)⁴
             = (x − 7)   = (x − 7)²  = (x − 7)³  = (x − 7)⁴
3            -4          16          -64         256
13           6           36          216         1,296
11           4           16          64          256
11           4           16          64          256
5            -2          4           -8          16
4            -3          9           -27         81
2            -5          25          -125        625
Total: 49    0           122         120         2,786
$\text{Standard Deviation } S = \sqrt{\frac{\sum (x - \bar{x})^2}{n-1}} = \sqrt{\frac{122}{6}} = \sqrt{20.3333} = 4.5092$
$\text{Skewness} = \frac{\sum (x - \bar{x})^3}{(n-1) \cdot S^3} = \frac{120}{6 \cdot (4.5092)^3} = \frac{120}{6 \cdot 91.6850}$
= 0.2181 (positively skewed, since the value is greater than zero)
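$\text{Kurtosis} = \frac{\sum (x - \bar{x})^4}{(n-1) \cdot S^4} = \frac{2786}{6 \cdot (4.5092)^4} = \frac{2786}{6 \cdot 413.4262}$
= 1.1231 (platykurtic: the value is below 3, which is roughly what this formula gives for a normal distribution)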
Problem 5
In a basketball game, a player monitors the minutes he was on the court for
the whole game: 7, 10, 8, 4, 11, 10, 6, 5.
Compute the range, interquartile range, variance, and standard deviation.
Data in ascending order: 4, 5, 6, 7, 8, 10, 10, 11
Range
R = 11 – 4
R=7
Interquartile range
Position of Q1 = (1/4)(8 + 1) = 2.25th value
Position of Q3 = (3/4)(8 + 1) = 6.75th value
Q1 = 5 + 0.25(6 – 5) = 5.25
Q3 = 10 + 0.75(10 – 10) = 10
IR = Q3 – Q1
IR = 10 – 5.25
IR = 4.75
Standard deviation
x̄ = 61/8 = 7.625
X       X − x̄      (X − x̄)²
4       -3.625     13.1406
5       -2.625     6.8906
6       -1.625     2.6406
7       -0.625     0.3906
8       0.375      0.1406
10      2.375      5.6406
10      2.375      5.6406
11      3.375      11.3906
Total   0          45.8748
$s = \sqrt{\frac{\sum (X - \bar{x})^2}{n-1}} = \sqrt{\frac{45.8748}{8-1}} = \sqrt{6.5535}$
s = 2.56
Variance
$V = s^2 = 2.56^2 = 6.55$
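The four answers can be verified with Python's standard `statistics` module. This is a cross-check rather than part of the original solution; `statistics.quantiles` with `method="exclusive"` places the quartiles at positions based on n + 1 and interpolates linearly, which matches the (k/4)(n + 1) rule used above. The variable names are illustrative.

```python
import statistics

minutes = [7, 10, 8, 4, 11, 10, 6, 5]

data_range = max(minutes) - min(minutes)

# Quartiles positioned with the (n + 1) rule and linear interpolation.
q1, q2, q3 = statistics.quantiles(minutes, n=4, method="exclusive")
iqr = q3 - q1

s = statistics.stdev(minutes)            # sample SD, divides by n - 1
variance = statistics.variance(minutes)  # sample variance, s squared

print(data_range, iqr, round(s, 2), round(variance, 2))
```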
Lyka Aubrey Dela Cruz (Normal Distribution)
Roselyn O. Requito (Measures of Shapes)
Angelika Denise Badua (Measures of Variability)