Math Module 5
__________________________________________________________________
Syllabus:
Statistical methods: Correlation and regression – Karl Pearson’s coefficient of correlation and
rank correlation – problems, Regression analysis – lines of regression – problems.
Curve fitting: Curve fitting by the method of least squares – fitting curves of the form
y = ax + b, y = ax^b and y = ax² + bx + c.
Two variables are said to be correlated if the changes in the values of one variable are
associated with the changes in the values of the other variable.
Two variables are said to be positively correlated if an increase (decrease) in the values
of one variable corresponds to an increase (decrease) in the other variable.
Two variables are said to be negatively correlated if an increase (decrease) in the values
of one variable corresponds to a decrease (increase) in the other variable.
Two variables are said to be independent or uncorrelated if there is no relationship
indicated between the variables.
The measure of correlation is called the coefficient of correlation, denoted by
ρ or r. It always satisfies −1 ≤ ρ ≤ 1.
Karl Pearson’s coefficient of correlation is given by
ρ = (nΣxy − ΣxΣy) / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
x 10 14 18 22 26 30
y 18 12 24 6 30 36
x 92 89 87 86 83 77 71 63 53 50
y 86 88 91 77 68 85 52 82 37 57
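As a quick check of the formula, here is a minimal Python sketch that evaluates Karl Pearson's coefficient from the raw sums. The function name pearson_r is an illustrative choice, and the data are the first x, y table listed above.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Karl Pearson's coefficient computed from the raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# First data set listed above
x = [10, 14, 18, 22, 26, 30]
y = [18, 12, 24, 6, 30, 36]
r = pearson_r(x, y)
print(round(r, 4))  # 0.6
```

The same function can be applied to the second table by replacing the two lists.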
The wrong pairs are (6, 14) and (8, 6); the correct pairs are (8, 12) and (6, 8).
New total = old total − wrong values + correct values
The corrected totals are
Σx = 125 − 6 − 8 + 8 + 6 = 125
Σy = 100 − 14 − 6 + 12 + 8 = 100
Σx² = 650 − 6² − 8² + 8² + 6² = 650
Σy² = 460 − 14² − 6² + 12² + 8² = 436
Σxy = 508 − 6(14) − 8(6) + 8(12) + 6(8) = 520
Therefore, with n = 25, the coefficient of correlation is
r = (nΣxy − ΣxΣy) / √{[nΣx² − (Σx)²][nΣy² − (Σy)²]}
= [25(520) − (125)(100)] / √{[25(650) − 125²][25(436) − 100²]}
= 500 / √(625 × 900)
= 0.6667
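The rule "new total = old total − wrong values + correct values" can be applied mechanically to all five sums. The sketch below does that for the totals quoted in this problem (n = 25) and then evaluates r from the corrected sums. The dictionary keys and the helper name apply_pairs are illustrative assumptions, not part of the original notes.

```python
from math import sqrt

n = 25
# Totals as first computed from the miscopied data (values quoted in the text)
sums = {"x": 125, "y": 100, "xx": 650, "yy": 460, "xy": 508}
wrong_pairs = [(6, 14), (8, 6)]
correct_pairs = [(8, 12), (6, 8)]

def apply_pairs(sums, pairs, sign):
    """Add (sign=+1) or remove (sign=-1) each pair's contribution to every total."""
    for x, y in pairs:
        sums["x"] += sign * x
        sums["y"] += sign * y
        sums["xx"] += sign * x * x
        sums["yy"] += sign * y * y
        sums["xy"] += sign * x * y

apply_pairs(sums, wrong_pairs, -1)    # take out the wrong observations
apply_pairs(sums, correct_pairs, +1)  # put in the correct ones

r = (n * sums["xy"] - sums["x"] * sums["y"]) / sqrt(
    (n * sums["xx"] - sums["x"] ** 2) * (n * sums["yy"] - sums["y"] ** 2)
)
print(sums, round(r, 4))
```

Note that the corrected Σy², not the original one, must be used in the denominator.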
2. Calculate the rank correlation co-efficient from the following data showing the
ranks of 10 students in two subjects:
Mathematics 3 8 9 2 7 10 4 6 1 5
Physics 5 9 10 1 8 7 3 4 2 6
4. Three judges A, B and C give the following ranks. Find which pair of judges has
the nearest common approach.
A 1 6 5 10 3 2 4 9 7 8
B 3 5 8 4 7 10 2 1 6 9
C 6 4 9 8 1 2 3 10 5 7
With d_AB = A − B, d_BC = B − C and d_CA = C − A:
A    B    C    d_AB  d_BC  d_CA  d_AB²  d_BC²  d_CA²
1    3    6    −2    −3    5     4      9      25
6    5    4    1     1     −2    1      1      4
5    8    9    −3    −1    4     9      1      16
10   4    8    6     −4    −2    36     16     4
3    7    1    −4    6     −2    16     36     4
2    10   2    −8    8     0     64     64     0
4    2    3    2     −1    −1    4      1      1
9    1    10   8     −9    1     64     81     1
7    6    5    1     1     −2    1      1      4
8    9    7    −1    2     −1    1      4      1
Total                            200    214    60
The coefficient of rank correlation for each pair, with n = 10, is
r_AB = 1 − 6Σd_AB² / (n³ − n) = 1 − 6(200) / (10³ − 10) ≈ −0.21
r_BC = 1 − 6Σd_BC² / (n³ − n) = 1 − 6(214) / (10³ − 10) ≈ −0.30
r_CA = 1 − 6Σd_CA² / (n³ − n) = 1 − 6(60) / (10³ − 10) ≈ 0.64
r_CA is the maximum.
Therefore, judges C and A have the nearest common approach.
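The three pairwise comparisons follow the same recipe, so they can be checked with one small function. This is a minimal sketch using the ranks from this problem. The function name rank_correlation is an illustrative choice, and the formula used assumes there are no tied ranks.

```python
def rank_correlation(r1, r2):
    """Rank correlation 1 - 6*sum(d^2)/(n^3 - n), valid when no ranks are tied."""
    n = len(r1)
    d2 = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return 1 - 6 * d2 / (n ** 3 - n)

A = [1, 6, 5, 10, 3, 2, 4, 9, 7, 8]
B = [3, 5, 8, 4, 7, 10, 2, 1, 6, 9]
C = [6, 4, 9, 8, 1, 2, 3, 10, 5, 7]

pairs = {
    "AB": rank_correlation(A, B),
    "BC": rank_correlation(B, C),
    "CA": rank_correlation(C, A),
}
closest = max(pairs, key=pairs.get)  # pair with the largest coefficient
print({k: round(v, 4) for k, v in pairs.items()}, closest)
```

The pair with the largest coefficient is reported as having the nearest common approach.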
Introduction:
Problems:
x    y    xy    x²    y²
1    1    1     1     1
3    2    6     9     4
4    4    16    16    16
6    4    24    36    16
8    5    40    64    25
9    7    63    81    49
11   8    88    121   64
14   9    126   196   81
56   40   364   524   256
Σx   Σy   Σxy   Σx²   Σy²
x̄ = Σx / n = 56 / 8 = 7
ȳ = Σy / n = 40 / 8 = 5
b_yx = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²)
= (2912 − 2240) / (4192 − 3136)
= 0.6364
Regression line of y on x is
y − ȳ = b_yx (x − x̄)
y − 5 = 0.6364(x − 7)
When x = 10, y = 6.9092
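The calculation above (means, slope b_yx, then a prediction from the line of y on x) can be reproduced with a short Python sketch. The function names regression_y_on_x and predict are illustrative assumptions, and the data are the x, y columns of the table in this problem.

```python
def regression_y_on_x(xs, ys):
    """Return (x_bar, y_bar, b_yx) for the regression line of y on x."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    b_yx = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return sx / n, sy / n, b_yx

x = [1, 3, 4, 6, 8, 9, 11, 14]
y = [1, 2, 4, 4, 5, 7, 8, 9]
x_bar, y_bar, b_yx = regression_y_on_x(x, y)

def predict(x0):
    """Evaluate y - y_bar = b_yx * (x - x_bar) at x = x0."""
    return y_bar + b_yx * (x0 - x_bar)

y_at_10 = predict(10)
print(x_bar, y_bar, round(b_yx, 4), round(y_at_10, 4))
```

Because the slope is not rounded to four decimals before it is used, the prediction may differ from the hand calculation in the last digit.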
x    y    xy    x²    y²
1    8    8     1     64
3    6    18    9     36
4    10   40    16    100
2    8    16    4     64
5    12   60    25    144
8    16   128   64    256
9    16   144   81    256
10   10   100   100   100
13   32   416   169   1024
15   32   480   225   1024
70   150  1410  694   3068
Σx   Σy   Σxy   Σx²   Σy²
x̄ = Σx / n = 70 / 10 = 7
ȳ = Σy / n = 150 / 10 = 15
b_yx = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) = (14100 − 10500) / (6940 − 4900) = 1.7647
b_xy = (nΣxy − ΣxΣy) / (nΣy² − (Σy)²) = (14100 − 10500) / (30680 − 22500) = 0.44
Regression line of y on x is given by
y − ȳ = b_yx (x − x̄)
y − 15 = 1.7647(x − 7)
Regression line of x on y is given by
x − x̄ = b_xy (y − ȳ)
x − 7 = 0.44(y − 15)
Coefficient of correlation is given by
r = √(b_yx × b_xy) = √(1.7647 × 0.44) = 0.8812
3. Find the coefficient of correlation and two regression lines for the following:
x 1 2 3 4 5 6 7 8 9 10
y 10 12 16 28 25 36 41 49 40 50
x    y    xy    x²    y²
1    10   10    1     100
2    12   24    4     144
3    16   48    9     256
4    28   112   16    784
5    25   125   25    625
6    36   216   36    1296
7    41   287   49    1681
8    49   392   64    2401
9    40   360   81    1600
10   50   500   100   2500
55   307  2074  385   11387
Σx   Σy   Σxy   Σx²   Σy²
x̄ = Σx / n = 55 / 10 = 5.5
ȳ = Σy / n = 307 / 10 = 30.7
b_yx = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) = (20740 − (55)(307)) / (3850 − 3025) = 4.6727
b_xy = (nΣxy − ΣxΣy) / (nΣy² − (Σy)²) = (20740 − (55)(307)) / (113870 − (307)²) = 0.1965
Regression line of y on x is given by
y − ȳ = b_yx (x − x̄)
y − 30.7 = 4.6727(x − 5.5)
Regression line of x on y is given by
x − x̄ = b_xy (y − ȳ)
x − 5.5 = 0.1965(y − 30.7)
Coefficient of correlation is given by
r = √(b_yx × b_xy) = √(4.6727 × 0.1965) = 0.9582
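Both regression coefficients share the numerator nΣxy − ΣxΣy, so they and r can be obtained in one pass. The sketch below uses the data of problem 3. The function name regression_coefficients is an illustrative assumption. Since the square root alone loses the sign, the code gives r the sign of the shared numerator, which is also the common sign of b_yx and b_xy.

```python
from math import sqrt, copysign

def regression_coefficients(xs, ys):
    """Return (b_yx, b_xy, r), where r carries the common sign of the coefficients."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    num = n * sxy - sx * sy              # shared numerator
    b_yx = num / (n * sxx - sx ** 2)
    b_xy = num / (n * syy - sy ** 2)
    r = copysign(sqrt(b_yx * b_xy), num)
    return b_yx, b_xy, r

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [10, 12, 16, 28, 25, 36, 41, 49, 40, 50]
b_yx, b_xy, r = regression_coefficients(x, y)
print(round(b_yx, 4), round(b_xy, 4), round(r, 4))
```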
5. The equations of the regression lines of two variables x and y are y = 0.516x + 33.73
and x = 0.512y + 32.52. Find the correlation coefficient and the means of x and y.
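One way to approach problem 5 is sketched below. It relies on two standard facts: both regression lines pass through the point (x̄, ȳ), and r² = b_yx × b_xy. The sketch assumes the first equation is the line of y on x and the second is the line of x on y; with the opposite reading the product of the coefficients would exceed 1. The variable names are illustrative.

```python
from math import sqrt

# y = b_yx * x + c1 (line of y on x) and x = b_xy * y + c2 (line of x on y)
b_yx, c1 = 0.516, 33.73
b_xy, c2 = 0.512, 32.52

# Both lines pass through (x_bar, y_bar): substitute the first into the second
# and solve the resulting linear equation for x_bar.
x_bar = (b_xy * c1 + c2) / (1 - b_yx * b_xy)
y_bar = b_yx * x_bar + c1

# r takes the common sign of the two coefficients (both positive here).
r = sqrt(b_yx * b_xy)
print(round(x_bar, 2), round(y_bar, 2), round(r, 3))
```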
Introduction:
The curve obtained by predicting the most suitable values for the unknowns is
called the curve of best fit.
The process of determining a curve of best fit is called curve fitting.
The method employed for curve fitting is called the method of least squares.
To fit the straight line 𝑦 = 𝑎 + 𝑏𝑥, solve the normal equations
Σ𝑦 = 𝑛𝑎 + 𝑏Σ𝑥
Σ𝑥𝑦 = 𝑎Σ𝑥 + 𝑏Σ𝑥²
Problems:
1. Find the straight line that best fits the following data:
x 1 2 3 4 5
y 14 27 40 55 68
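The two normal equations for y = a + bx form a 2 × 2 linear system in a and b, which can be solved in closed form. Here is a minimal sketch applied to the data of problem 1. The function name fit_line is an illustrative assumption.

```python
def fit_line(xs, ys):
    """Solve sum(y) = n*a + b*sum(x) and sum(xy) = a*sum(x) + b*sum(x^2) for (a, b)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    det = n * sxx - sx * sx          # determinant of the 2x2 system
    a = (sy * sxx - sx * sxy) / det
    b = (n * sxy - sx * sy) / det
    return a, b

x = [1, 2, 3, 4, 5]
y = [14, 27, 40, 55, 68]
a, b = fit_line(x, y)
print(a, b)  # 0.0 13.6
```

For this data set the fitted line is y = 13.6x, since the intercept works out to zero.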
3. A simply supported beam carries a concentrated load P (lb) at its midpoint.
Corresponding to various values of P, the maximum deflection Y (in) is
measured. The data are given below:
P 100 120 140 160 180 200
Y 0.45 0.55 0.6 0.70 0.80 0.85
Find the law of the form 𝒀 = 𝒂 + 𝒃𝑷.
Introduction:
Problems:
x    y     xy    x²   x²y    x³   x⁴
0    1     0     0    0      0    0
1    1.8   1.8   1    1.8    1    1
2    1.3   2.6   4    5.2    8    16
3    2.5   7.5   9    22.5   27   81
4    6.3   25.2  16   100.8  64   256
10   12.9  37.1  30   130.3  100  354
Σx   Σy    Σxy   Σx²  Σx²y   Σx³  Σx⁴
x    y      xy      x²   x²y     x³   x⁴
−3   4.63   −13.89  9    41.67   −27  81
−2   2.11   −4.22   4    8.44    −8   16
−1   0.67   −0.67   1    0.67    −1   1
0    0.09   0       0    0       0    0
1    0.63   0.63    1    0.63    1    1
2    2.15   4.30    4    8.6     8    16
3    4.58   13.74   9    41.22   27   81
0    14.86  −0.11   28   101.23  0    196
Σx   Σy     Σxy     Σx²  Σx²y    Σx³  Σx⁴
Second degree parabola is of the form y = a + bx + cx² ---------- (1)
Solve the normal equations
Σy = na + bΣx + cΣx²
Σxy = aΣx + bΣx² + cΣx³
Σx²y = aΣx² + bΣx³ + cΣx⁴
On substituting the values from the table,
14.86 = 7a + 0b + 28c
−0.11 = 0a + 28b + 0c
101.23 = 28a + 0b + 196c
By solving these equations,
a = 0.1329, b = −0.0039, c = 0.4975
The best fitting second degree parabola is
y = 0.1329 − 0.0039x + 0.4975x²
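The three normal equations for the parabola form a 3 × 3 linear system. The sketch below builds that system from the data of this problem and solves it by Cramer's rule using only the standard library. The helper names det3 and fit_parabola are illustrative assumptions, and Cramer's rule is just one convenient way to solve a system this small.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_parabola(xs, ys):
    """Solve the normal equations of y = a + b*x + c*x^2 and return (a, b, c)."""
    n = len(xs)
    s1 = sum(xs)
    s2 = sum(x ** 2 for x in xs)
    s3 = sum(x ** 3 for x in xs)
    s4 = sum(x ** 4 for x in xs)
    rhs = [
        sum(ys),
        sum(x * y for x, y in zip(xs, ys)),
        sum(x * x * y for x, y in zip(xs, ys)),
    ]
    m = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]]
    d = det3(m)
    coeffs = []
    for col in range(3):
        mc = [row[:] for row in m]   # copy, then replace one column by the right side
        for row in range(3):
            mc[row][col] = rhs[row]
        coeffs.append(det3(mc) / d)
    return tuple(coeffs)

x = [-3, -2, -1, 0, 1, 2, 3]
y = [4.63, 2.11, 0.67, 0.09, 0.63, 2.15, 4.58]
a, b, c = fit_parabola(x, y)
print(round(a, 4), round(b, 4), round(c, 4))
```

The same function can be reused for any other table in this section by changing the two lists.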
x    y       xy      x²   x²y       x³    x⁴
2    3.07    6.14    4    12.28     8     16
4    12.85   51.4    16   205.6     64    256
6    31.47   188.82  36   1132.92   216   1296
8    57.38   459.04  64   3672.32   512   4096
10   91.29   912.9   100  9129      1000  10000
30   196.06  1618.3  220  14152.12  1800  15664
Σx   Σy      Σxy     Σx²  Σx²y      Σx³   Σx⁴
V    R      VR     V²     V²R      V³       V⁴
20   5.5    110    400    2200     8000     160000
40   9.1    364    1600   14560    64000    2560000
60   14.9   894    3600   53640    216000   12960000
80   22.8   1824   6400   145920   512000   40960000
100  33.3   3330   10000  333000   1000000  100000000
120  46.0   5520   14400  662400   1728000  207360000
420  131.6  12042  36400  1211720  3528000  364000000
ΣV   ΣR     ΣVR    ΣV²    ΣV²R     ΣV³      ΣV⁴
Second degree parabola is of the form R = a + bV + cV² ---------- (1)
Solve the normal equations
ΣR = na + bΣV + cΣV²
ΣVR = aΣV + bΣV² + cΣV³
ΣV²R = aΣV² + bΣV³ + cΣV⁴
On substituting the values from the table,
131.6 = 6a + 420b + 36400c
12042 = 420a + 36400b + 3528000c
1211720 = 36400a + 3528000b + 364000000c
By solving these equations,
a = 4.35, b = 0.0024, c = 0.0029
The best fitting second degree parabola is
R = 4.35 + 0.0024V + 0.0029V²
Introduction:
To fit a curve of the form y = ax^b, take logarithms on both sides:
log y = log a + b log x, or Y = A + bX, where Y = log y, A = log a and X = log x.
Solve the normal equations
ΣY = nA + bΣX
ΣXY = AΣX + bΣX²
and recover a = e^A. Use natural logarithms only.
1. Find the curve of the form y = ax^b that best fits the following data:
x 3 4 5 6 7
y 6 9 10 11 12
X = log x  Y = log y  XY       X²
1.0986     1.7918     1.9685   1.2069
1.3863     2.1972     3.0460   1.9218
1.6094     2.3026     3.7058   2.5902
1.7918     2.3979     4.2966   3.2105
1.9459     2.4849     4.8354   3.7865
7.8320     11.1744    17.8523  12.7159
ΣX         ΣY         ΣXY      ΣX²
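The remaining steps for problem 1 (solve the two normal equations for A and b, then take a = e^A) can be sketched in a few lines of Python. The function name fit_power_law is an illustrative assumption. Natural logarithms are used, as the notes require, and the data must be strictly positive for the logarithms to exist.

```python
from math import log, exp

def fit_power_law(xs, ys):
    """Fit y = a * x**b by least squares on (log x, log y); returns (a, b)."""
    X = [log(x) for x in xs]
    Y = [log(y) for y in ys]
    n = len(X)
    sX, sY = sum(X), sum(Y)
    sXY = sum(u * v for u, v in zip(X, Y))
    sXX = sum(u * u for u in X)
    b = (n * sXY - sX * sY) / (n * sXX - sX ** 2)
    A = (sY - b * sX) / n            # from sum(Y) = n*A + b*sum(X)
    return exp(A), b

x = [3, 4, 5, 6, 7]
y = [6, 9, 10, 11, 12]
a, b = fit_power_law(x, y)
print(round(a, 2), round(b, 2))
```

This minimises the squared error in log y rather than in y itself, which is what the linearised normal equations do.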
X = log x  Y = log y  XY       X²
0          1.0919     0        0
0.6931     1.4492     1.0044   0.4804
1.0986     1.6506     1.8133   1.2069
1.3863     1.8083     2.5068   1.9218
1.6094     1.9169     3.0851   2.5902
1.7918     2.0149     3.6103   3.2105
6.5792     9.9318     12.0199  9.4098
ΣX         ΣY         ΣXY      ΣX²
X = log t  Y = log v  XY       X²
4.1109     5.8579     24.0812  16.8995
3.2581     5.9915     19.5209  10.6152
1.9459     6.2146     12.0930  3.7865
0.9555     6.3969     6.1122   0.9130
10.2704    24.4609    61.8073  32.2142
ΣX         ΣY         ΣXY      ΣX²
X = log v  Y = log p  XY       X²
0.4824     −0.6931    −0.3344  0.4824²
0          0          0        0
−0.2877    0.4055     −0.1167  0.2877²
−0.4780    0.6931     −0.3313  0.4780²
−0.6539    0.9163     −0.5992  0.6539²
−0.7765    1.0985     −0.8530  0.7765²
−1.7137    2.4203     −2.2346  1.5746
ΣX         ΣY         ΣXY      ΣX²