Module 2 Grouped Data
1. Midpoint Method
2. Unit Deviation Method
In the Midpoint Method, the midpoint of each class interval is taken as the representative
value of the class. Each midpoint is multiplied by its corresponding frequency. The products
are added and the sum is divided by the total frequency n. The value obtained is
taken as the mean of the grouped data. The formula is
$$\bar{x} = \frac{\sum fx}{n}$$
Example: Consider the frequency distribution of the examination scores of the sixty students in a
statistics class. Compute the value of the mean.
Solution:
Classes    f (frequency)    x (midpoint)    fx
11-22       3               16.5             49.5
23-34       5               28.5            142.5
35-46      11               40.5            445.5
47-58      19               52.5            997.5
59-70      14               64.5            903
71-82       6               76.5            459
83-94       2               88.5            177
           n = 60                           ∑fx = 3174
Step 4. Divide the result in step 3 by the sample size. The result is the mean of the distribution.
Hence
$$\bar{x} = \frac{\sum fx}{n} = \frac{3174}{60} = 52.9$$
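The midpoint computation can be checked with a short script. This is only a sketch: the function name `grouped_mean` and the representation of classes as (lower limit, upper limit) pairs are assumptions made for illustration, not part of the module.

```python
# Class limits and frequencies from the examination-score example.
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]


def grouped_mean(classes, freqs):
    """Midpoint method (hypothetical helper): sum of f*x divided by n."""
    # The midpoint of each class is the average of its lower and upper limits.
    midpoints = [(lo + hi) / 2 for lo, hi in classes]
    # Multiply each midpoint by its frequency and add the products.
    total_fx = sum(f * x for f, x in zip(freqs, midpoints))
    # Divide by the sample size n.
    n = sum(freqs)
    return total_fx / n


mean_scores = grouped_mean(classes, freqs)
print(round(mean_scores, 2))  # 52.9
```

The same function can be applied to any frequency distribution by passing its class limits and frequencies.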
Classes    f (frequency)    x (midpoint)    fx
25-30       3               27.5              82.5
31-36       6               33.5             201
37-42      11               39.5             434.5
43-48      27               45.5            1228.5
49-54      16               51.5             824
55-60       7               57.5             402.5
61-66       4               63.5             254
67-72       1               69.5              69.5
           n = 75                           ∑fx = 3496.5
The alternative method of computing the mean for grouped data is the Unit
Deviation Method. Instead of working with the midpoints directly, this method uses unit
deviations. It starts from an arbitrary point that serves as a first approximation of the
mean. This point may be the midpoint of any class interval. By convention,
however, the midpoint of the class interval with the highest frequency is chosen
and is called the assumed mean. The interval containing the assumed mean is
referred to as the mean class.
The next step is to construct the unit deviation column. This step involves
assigning a deviation of 0 to the class containing the assumed mean and successive integers
to the other classes. For example, if the distribution has nine classes and the fifth class
interval contains the assumed mean, then the entries in the unit deviation column are
-4, -3, -2, -1, 0, 1, 2, 3, 4. However, if the assumed mean lies in the 4th class interval,
then the entries in the unit deviation column are -3, -2, -1, 0, 1, 2, 3, 4 and 5,
respectively. The unit deviations are usually represented by d.
The third step is implemented by multiplying the frequencies by their corresponding unit
deviations. The products are added and the sum is divided by the sample size. The result is then
multiplied by the size of the class interval.
Finally, the value of the mean is determined by adding the product to the assumed
mean.
Classes    f     d     fd
11-22       3    -3     -9
23-34       5    -2    -10
35-46      11    -1    -11    (sum of negative products: -30)
47-58      19     0      0
59-70      14     1     14
71-82       6     2     12
83-94       2     3      6    (sum of positive products: 32)
           n = 60      ∑fd = 2
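Writing x_a for the assumed mean and c for the size of the class interval, the formula is
$$\bar{x} = x_a + \left(\frac{\sum fd}{n}\right)c$$
For the examination scores,
$$\bar{x} = 52.5 + \left(\frac{2}{60}\right)(12) = 52.5 + 0.4 = 52.9$$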
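The steps of the Unit Deviation Method can be sketched in code as follows. The function name `unit_deviation_mean` is hypothetical, and the sketch assumes that all classes have the same size, computed here as the distance between consecutive lower limits.

```python
# Class limits and frequencies from the examination-score example.
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]


def unit_deviation_mean(classes, freqs):
    """Unit deviation method (hypothetical helper, equal class sizes assumed)."""
    # Step 1: the assumed mean is the midpoint of the class with the highest frequency.
    a = freqs.index(max(freqs))
    lo, hi = classes[a]
    assumed_mean = (lo + hi) / 2
    # Step 2: unit deviations are 0 at that class and successive integers elsewhere.
    d = [i - a for i in range(len(classes))]
    # Step 3: add the products f*d, divide by n, then multiply by the class size c.
    n = sum(freqs)
    sum_fd = sum(f * di for f, di in zip(freqs, d))
    c = classes[1][0] - classes[0][0]  # assumption: every class has this size
    # Step 4: add the product to the assumed mean.
    return assumed_mean + (sum_fd / n) * c


mean_ud = unit_deviation_mean(classes, freqs)
print(round(mean_ud, 2))  # 52.9
```

The result agrees with the Midpoint Method, as expected, since both methods compute the same quantity.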
Seatwork: Using the unit deviation method, compute the mean age of the 75 mayors.
Classes    f     d     fd
25-30       3
31-36       6
37-42      11
43-48      27
49-54      16
55-60       7
61-66       4
67-72       1
MEDIAN
In computing the mean, we observed that all the values are taken into
consideration. Thus, if a distribution contains extreme values, the mean is usually
pulled either to the right or to the left, depending on the position of these extreme values.
We shall now consider a measure of central tendency that does not take into
consideration all the values in the distribution. This measure, called the median, is a positional
measure defined as the middlemost value in the distribution. Hence, this value divides a given
set of data into two equal parts.
As with the mean, the median of grouped data is an approximation, obtained here through
interpolation. The procedure requires the construction of the less than cumulative frequency
column (<cumf).
After identifying the median class (the first class whose <cumf is at least n/2), we
approximate the position of the median within that class. This is done by subtracting cumfb,
the cumulative frequency of the class before the median class, from n/2. The difference is
divided by the frequency of the median class, fm, and the quotient is multiplied by the size of
the class interval, c. The result is then added to the lower boundary of the median class, xlb,
to get the median of the distribution.
$$\tilde{x} = x_{lb} + \left(\frac{\frac{n}{2} - \mathit{cumf}_b}{f_m}\right)c$$
Example 3. Compute the value of the median of the examination scores of the students in
Statistics.
Solution: We shall first construct the less than cumulative frequency column. Using the steps
indicated, we have.
Classes    f     <cumf
11-22       3      3
23-34       5      8
35-46      11     19    (cumfb)
47-58      19     38    (median class; fm = 19)
59-70      14     52
71-82       6     58
83-94       2     60
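Steps:
1. n/2 = 60/2 = 30
2. cumfb = 19
3. median class: 47-58
4. xlb = 46.5; fm = 19; c = 12
5. Substituting these values into the formula:
$$\tilde{x} = x_{lb} + \left(\frac{\frac{n}{2} - \mathit{cumf}_b}{f_m}\right)c = 46.5 + \left(\frac{30 - 19}{19}\right)(12) \approx 46.5 + 6.95 = 53.45$$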
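The median computation for the examination scores can be sketched in code as follows. The function name `grouped_median` is hypothetical. The sketch assumes integer class limits, so that the lower boundary is the lower limit minus 0.5, and equal class sizes.

```python
from itertools import accumulate

# Class limits and frequencies from the examination-score example.
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]


def grouped_median(classes, freqs):
    """Median by interpolation (hypothetical helper, integer class limits assumed)."""
    half = sum(freqs) / 2
    # Less than cumulative frequency column.
    cumf = list(accumulate(freqs))
    # Median class: first class whose cumulative frequency reaches n/2.
    m = next(i for i, cf in enumerate(cumf) if cf >= half)
    # Cumulative frequency before the median class (0 if it is the first class).
    cumf_b = cumf[m - 1] if m > 0 else 0
    c = classes[1][0] - classes[0][0]  # assumption: every class has this size
    x_lb = classes[m][0] - 0.5         # assumption: boundaries lie 0.5 below the limits
    return x_lb + (half - cumf_b) / freqs[m] * c


median_scores = grouped_median(classes, freqs)
print(round(median_scores, 2))  # 53.45
```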
In computing the mode for grouped data, it is necessary to identify
the class interval that contains the mode. This interval, called the modal class, is the one
with the highest frequency in the distribution.
The next step after getting the modal class is to determine the mode within the class. This
value may be approximated from the differences between the frequency of the modal class and
the frequencies of the classes before and after it. If we let d1 be the difference between
the frequency of the modal class and the frequency of the interval preceding the modal
class, and d2 be the difference between the frequency of the modal class and the frequency of
the interval following the modal class, then the position of the mode within the class is
approximated by the expression
$$\left(\frac{d_1}{d_1 + d_2}\right)c$$
where c is the size of the class interval.
If this expression is added to the lower boundary of the modal class, xlb, we obtain
the computing formula for the mode of grouped data. The formula is
$$\hat{x} = x_{lb} + \left(\frac{d_1}{d_1 + d_2}\right)c$$
Solution: The frequency distribution of the data is reproduced below. To compute the mode, we
have
Classes f
11-22 3
23-34 5
35-46 11
47-58 19 modal class
59-70 14
71-82 6
83-94 2
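d1 = 19 - 11 = 8
d2 = 19 - 14 = 5
xlb = 46.5; c = 12
$$\hat{x} = x_{lb} + \left(\frac{d_1}{d_1 + d_2}\right)c = 46.5 + \left(\frac{8}{8 + 5}\right)(12) \approx 46.5 + 7.38 = 53.88$$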
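The mode computation can be sketched in the same way. The function name `grouped_mode` is hypothetical. The sketch assumes integer class limits (lower boundary equals the lower limit minus 0.5), equal class sizes, and a single modal class; a missing neighboring class is treated as having frequency 0.

```python
# Class limits and frequencies from the examination-score example.
classes = [(11, 22), (23, 34), (35, 46), (47, 58), (59, 70), (71, 82), (83, 94)]
freqs = [3, 5, 11, 19, 14, 6, 2]


def grouped_mode(classes, freqs):
    """Mode by interpolation (hypothetical helper, integer class limits assumed)."""
    # Modal class: the class with the highest frequency (first one if tied).
    m = freqs.index(max(freqs))
    # d1: modal frequency minus the frequency of the preceding class.
    d1 = freqs[m] - (freqs[m - 1] if m > 0 else 0)
    # d2: modal frequency minus the frequency of the following class.
    d2 = freqs[m] - (freqs[m + 1] if m < len(freqs) - 1 else 0)
    c = classes[1][0] - classes[0][0]  # assumption: every class has this size
    x_lb = classes[m][0] - 0.5         # assumption: boundaries lie 0.5 below the limits
    return x_lb + d1 / (d1 + d2) * c


mode_scores = grouped_mode(classes, freqs)
print(round(mode_scores, 2))  # 53.88
```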