Math 133 - Unit 7 Graphing Data-1

Math 133 – Engineering Mathematics 1
Unit 7 Graphing Data
7.1 Numerical Data and its various graphs
In all experiments that we perform, we collect data. For example, we may collect data on a particular
metal’s resistance to electrical current with respect to its temperature. Or in social science, we may collect
data on the average income vs the age of the person. Is it true that the average income increases as one
grows older?
If the data is numerical, often, we will want to graph the data. From the graph, we can predict what the
outcome is like at a particular instance. For example, we would like to know how much gasoline a
particular engine consumes at a particular speed.
From the graph that we generate from our experiments or observations, we can predict or estimate the
outcome given a particular situation. For example, suppose we perform an experiment that measures the
height and weight of people on Earth. From the data we collect, we can obtain the function that relates
the weight and height of people. Then, from this function, we can obtain the expected weight of a person
given his or her height.
To determine a useful relationship, we need to distinguish between the independent variable and the
dependent variable in the data we collect. In the previous example of the height and weight of people,
we want to know the expected weight of a person given his or her height, so we take
height as the independent variable and weight as the dependent variable, i.e. the weight depends on the
height of a person. Usually, we assign the symbol x to the independent variable and y to the dependent
variable.
Different situations produce different types of graphs. For example, we already know that when
we throw a rock in the air, its height above the ground vs time follows a parabolic graph. Thomas
Malthus, a renowned social scientist, argued that the world’s population vs time follows an
exponential graph, i.e. the world’s population increases exponentially. If that growth continued
unchecked, we would eventually run out of space to live in, or run out of food, on planet
Earth.
In this unit, we shall discuss three types of relationships:
7.1.1 The linear relationship, 𝑦 = 𝑚𝑥 + 𝑏
If our data points, when plotted on the rectilinear Cartesian coordinate plane, appear to follow a straight
line, then we can say that there is a linear correlation in our data. We can define the straight line that best
approximates the data points and calculate its slope, m, and its y-intercept, b.
7.1.2 The exponential relationship, $y = b \cdot 10^{mx}$
If we plot our data points on the rectilinear Cartesian coordinate plane and the data points do not follow
a straight line, it does not mean there is no correlation in the data. It just means there is no linear
correlation and the correlation could be of a non-linear type.
To find out if this non-linear correlation is exponential, we can plot the points (𝑥𝑖 , 𝑦𝑖 ) on semi-log
graph paper. The y-axis of this graph paper is scaled logarithmically, so
plotting the points (𝑥𝑖 , 𝑦𝑖 ) on semi-log graph paper is equivalent to plotting the points (𝑥𝑖 , log10 𝑦𝑖 )
on rectilinear graph paper. If the points plotted on the semi-log graph paper resemble a
straight line, this indicates that the data points follow an exponential correlation.
We can also check out if the correlation is exponential by transforming the exponential relationship to a
linear relation as follows,
$$y = b \cdot 10^{mx}$$ (this is a general exponential relationship)
$$\Rightarrow \log_{10} y = \log_{10}\left(b \cdot 10^{mx}\right)$$ (take the logarithm to the base 10 of both sides)
$$\Rightarrow \log_{10} y = \log_{10} b + \log_{10} 10^{mx}$$
$$\Rightarrow \log_{10} y = mx + \log_{10} b$$
Notice that this is a straight line equation, $Y = mx + B$, where
$$Y = \log_{10} y \quad \text{and} \quad B = \log_{10} b$$
We then find the logarithm to the base 10 of the y-coordinates and plot them versus the x-coordinates,
i.e. we plot the points (𝑥𝑖 , log10 𝑦𝑖 ) for all the data points on the rectilinear Cartesian coordinate system.
If this results in a set of points resembling a straight line, then the data follows an exponential correlation.
If not, then the data points do not follow an exponential correlation.
If the relationship is exponential, we can then find the straight line that best approximates the points
(𝑥𝑖 , log10 𝑦𝑖 ). We calculate the slope m and find the value of b by finding the y-intercept, 𝐵 = log10 𝑏,
and then calculating b from
$$b = 10^{B}$$
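The semi-log check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic, generated from the hypothetical values b = 3 and m = 0.5, and the helper name `segment_slopes` is invented for this example. If the data are exponential, the slope between every pair of consecutive points (𝑥𝑖 , log10 𝑦𝑖 ) should equal m.

```python
import math

# Synthetic data from y = b * 10**(m*x), with hypothetical b = 3 and m = 0.5.
b_true, m_true = 3.0, 0.5
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [b_true * 10 ** (m_true * x) for x in xs]

# Transform the y-coordinates only: this is what semi-log paper does for us.
Ys = [math.log10(y) for y in ys]


def segment_slopes(us, vs):
    """Slope of the segment joining each pair of consecutive points."""
    return [(v2 - v1) / (u2 - u1)
            for u1, u2, v1, v2 in zip(us, us[1:], vs, vs[1:])]


slopes = segment_slopes(xs, Ys)   # every entry should be close to m
B = Ys[0]                         # x = 0 for the first point, so Y there is B
b_recovered = 10 ** B             # undo the logarithm: b = 10**B

print(slopes)
print(b_recovered)
```

With real measurements the slopes will not be exactly equal, so in practice we judge how closely the transformed points follow a straight line rather than expecting a perfect match.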
7.1.3 The power relationship, $y = b x^{m}$
The power relationship is the other non-linear type of correlation we can check for if we are not able to
find a linear relationship in the data set. To check if there is a power relationship in the given data set, we
plot the points (𝑥𝑖 , 𝑦𝑖 ) on log-log graph paper. Both the x-axis and the y-axis on this type of graph
paper are scaled logarithmically, so plotting the points (𝑥𝑖 , 𝑦𝑖 ) on
log-log graph paper is equivalent to plotting the points (log10 𝑥𝑖 , log10 𝑦𝑖 ) on rectilinear graph paper.
If the points plotted on the log-log graph paper resemble a straight line, this indicates that
the data points follow a power correlation.
We can also check out if the correlation is power by transforming the power relationship to a linear
relation as follows,
$$y = b x^{m}$$ (this is a general power relationship)
$$\Rightarrow \log_{10} y = \log_{10}\left(b x^{m}\right)$$ (take the logarithm to the base 10 of both sides)
$$\Rightarrow \log_{10} y = \log_{10} b + \log_{10} x^{m}$$
$$\Rightarrow \log_{10} y = m \log_{10} x + \log_{10} b$$
Notice that this is also a straight line equation, $Y = mX + B$, where
$$Y = \log_{10} y, \quad X = \log_{10} x \quad \text{and} \quad B = \log_{10} b$$
We then find the logarithm to the base 10 of both the x-coordinates and the y-coordinates, and plot the
points (log10 𝑥𝑖 , log10 𝑦𝑖 ) for all the data points on the rectilinear Cartesian coordinate system. If this
results in a set of points resembling a straight line, then the data follows a power correlation. If not, then
the data points do not follow a power correlation.
If the relationship is power, we can then find the straight line that best approximates the points
(log10 𝑥𝑖 , log10 𝑦𝑖 ). We calculate the slope m and find the value of b by finding the y-intercept,
𝐵 = log10 𝑏, and then calculating b from
$$b = 10^{B}$$
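As a quick worked illustration of the log-log transformation, consider three points taken from the power law y = 2x³. These numbers are chosen purely for this example and are not part of the data used later in the unit. After taking logarithms of both coordinates, the points fall on a straight line whose slope is the exponent m and whose intercept gives back b.

```latex
\begin{aligned}
(x, y) &: \ (1,\ 2), \quad (10,\ 2000), \quad (100,\ 2\,000\,000) \\
(\log_{10} x,\ \log_{10} y) &: \ (0,\ 0.30103), \quad (1,\ 3.30103), \quad (2,\ 6.30103) \\
m &= \frac{3.30103 - 0.30103}{1 - 0} = \frac{6.30103 - 3.30103}{2 - 1} = 3 \\
B &= 0.30103 \ \Rightarrow \ b = 10^{0.30103} \approx 2
\end{aligned}
```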
7.2 Linear Correlation Coefficient, r
The linear correlation coefficient, r, gives a measure of the strength of the linear relationship between
two quantitative variables.
Suppose we are given n number of data points (𝑥1 , 𝑦1 ), (𝑥2 , 𝑦2 ), ⋯, (𝑥𝑛 , 𝑦𝑛 ). The formula to calculate
the linear correlation coefficient of these data points is
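$$r = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\ \sqrt{n \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2}}$$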
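where
$$\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n$$
$$\sum_{i=1}^{n} y_i = y_1 + y_2 + \cdots + y_n$$
$$\sum_{i=1}^{n} x_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2$$
$$\sum_{i=1}^{n} y_i^2 = y_1^2 + y_2^2 + \cdots + y_n^2$$
$$\left(\sum_{i=1}^{n} x_i\right)^2 = (x_1 + x_2 + \cdots + x_n)^2$$
$$\left(\sum_{i=1}^{n} y_i\right)^2 = (y_1 + y_2 + \cdots + y_n)^2$$
and
$$\sum_{i=1}^{n} x_i y_i = x_1 y_1 + x_2 y_2 + \cdots + x_n y_n$$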
Properties of the linear correlation coefficient:

• The linear correlation coefficient is always between −1 and +1, inclusive. That is, −1 ≤ 𝑟 ≤
1.
• If 𝑟 = +1 , there is a perfect positive linear relationship between the two variables.
• If 𝑟 = −1 , there is a perfect negative linear relationship between the two variables.
• The closer 𝑟 is to +1, the stronger the evidence of a positive linear relationship between the
two variables.
• The closer 𝑟 is to −1, the stronger the evidence of a negative linear relationship between
the two variables.
• If 𝑟 is close to 0, there is little or no evidence of a linear relationship between the two variables.
However, because the linear correlation coefficient measures only the strength of a linear
relationship, 𝑟 being close to 0 does not imply no relationship, just no linear relationship, i.e.
the relationship could still be non-linear.
• 𝑟 is a unit-less measure of association. So, the units of measurement for x and y play no role
in the value or interpretation of 𝑟 .
• If x and y are interchanged, the correlation between the two variables remains the same.
• If 𝑟 > 0, then the slope, m, of the best approximating straight line to the data points will be
positive.
• If 𝑟 < 0, then the slope, m, of the best approximating straight line to the data points will be
negative.
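The formula for r and several of the properties listed above can be checked with a short Python sketch. The function name `linear_correlation` and the small data sets are assumptions made for this illustration; they do not come from the notes.

```python
import math


def linear_correlation(xs, ys):
    """Linear correlation coefficient r, computed from the summation formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


# Made-up data sets, chosen only to illustrate the properties.
xs = [1, 2, 3, 4]
r_up = linear_correlation(xs, [3, 5, 7, 9])        # points on y = 2x + 1
r_down = linear_correlation(xs, [9, 7, 5, 3])      # points on y = -2x + 11
r_swapped = linear_correlation([3, 5, 7, 9], xs)   # x and y interchanged

# A symmetric parabola: a clear relationship, but not a linear one.
r_parabola = linear_correlation([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_up, r_down, r_swapped, r_parabola)
```

The first two values sit at the ends of the range −1 ≤ 𝑟 ≤ 1, interchanging x and y leaves r unchanged, and the parabola shows that r near 0 rules out only a linear relationship.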
Examples:
7.3 Least Squares Regression Line, 𝒚 = 𝒎𝒙 + 𝒃
Let us say we are given 𝑛 number of data points, i.e.
(𝑥1 , 𝑦1 ), (𝑥2 , 𝑦2 ), (𝑥3 , 𝑦3 ), ⋯, (𝑥𝑛 , 𝑦𝑛 )
The line that best approximates all the above data points is the least-squares regression line. It is the line
that minimizes the sum of the squares of the residuals or errors. It is a straight line
𝑦 = 𝑚𝑥 + 𝑏
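where its slope is
$$m = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$
and its y-intercept is
$$b = \bar{y} - m\bar{x}$$
where $\bar{y}$ is the average of all the y-values,
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$$
and $\bar{x}$ is the average of all the x-values,
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$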
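A minimal Python sketch of the slope and intercept formulas follows. The function names and the five data points are assumptions made for this illustration. The sketch also nudges the fitted slope and intercept to show that the sum of squared residuals only gets larger, which is what "least squares" means.

```python
def least_squares_line(xs, ys):
    """Return (m, b) of the least-squares regression line y = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = sum_y / n - m * (sum_x / n)      # b = y_bar - m * x_bar
    return m, b


def sum_squared_residuals(xs, ys, m, b):
    """Sum of the squared vertical distances from the points to the line."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))


# Made-up points scattered around a line, for illustration only.
xs = [0, 1, 2, 3, 4]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
m, b = least_squares_line(xs, ys)
best = sum_squared_residuals(xs, ys, m, b)

# Nudging the slope or the intercept away from the fit increases the error.
worse_slope = sum_squared_residuals(xs, ys, m + 0.1, b)
worse_intercept = sum_squared_residuals(xs, ys, m, b + 0.1)

print(m, b)
print(best, worse_slope, worse_intercept)
```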
Examples: Let us consider the following data:
Pump Power Consumption

Discharge, Q (L/s) Power, P (kW)
5 31
10 39
15 45
20 53
25 60
30 67
35 75
Source: GE111.3 Assignment #2 (2019)
(i) Find the relationship between Discharge and Power.

(ii) Estimate the required power when 𝑄 = 17.5 𝐿/𝑠.
Solution: First of all, let us plot the data points on a rectilinear graph.
[Figure: Pump Power Consumption. Rectilinear plot of Power, P (kW) against Discharge, Q (L/s).]
Now, we plot the data points on a semi-log graph.

[Figure: Pump Power Consumption. Semi-log plot of Power, P (kW) on a logarithmic axis against Discharge, Q (L/s).]
And we now plot the points on a log-log graph.

[Figure: Pump Power Consumption. Log-log plot of Power, P (kW) against Discharge, Q (L/s).]
In all three graphs, the points appear to lie close to a straight line. We can check mathematically whether
the data best follows a linear, an exponential or a power relationship by calculating the linear
correlation coefficient for each of the three relationships.
(i) Does the data follow a Linear Relationship?
We expand our table as follows:
Discharge, Q (L/s) Power, P (kW) 𝑄² 𝑃² 𝑄𝑃

5 31 25 961 155
10 39 100 1521 390
15 45 225 2025 675
20 53 400 2809 1060
25 60 625 3600 1500
30 67 900 4489 2010
35 75 1225 5625 2625
Summation 140 370 3500 21030 8415
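The formula for calculating the linear correlation coefficient is
$$r = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\ \sqrt{n \sum_{i=1}^{n} y_i^2 - \left(\sum_{i=1}^{n} y_i\right)^2}}$$
Letting the symbols x and y represent Q and P respectively, we have from the above table
$$\sum_{i=1}^{n} x_i = \sum_{i=1}^{n} Q_i = 140, \quad \sum_{i=1}^{n} y_i = \sum_{i=1}^{n} P_i = 370, \quad \sum_{i=1}^{n} x_i^2 = \sum_{i=1}^{n} Q_i^2 = 3500,$$
$$\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{n} P_i^2 = 21030 \quad \text{and} \quad \sum_{i=1}^{n} x_i y_i = \sum_{i=1}^{n} Q_i P_i = 8415,$$
not forgetting that we have $n = 7$ data points.
Plugging these values into the formula, we calculate the linear correlation coefficient as
$$r = \frac{(7)(8415) - (140)(370)}{\sqrt{(7)(3500) - 140^2}\ \sqrt{(7)(21030) - 370^2}} = \frac{7105}{\sqrt{4900}\ \sqrt{10310}} \approx 0.999624$$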
This value is very close to 1 which indicates a very strong linear correlation between the Discharge and
the Power variables.
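The column sums and the value of r for the linear case can be reproduced with a short Python sketch. The variable names are chosen for this illustration; the data are the pump measurements from the table.

```python
import math

Q = [5, 10, 15, 20, 25, 30, 35]      # discharge, L/s
P = [31, 39, 45, 53, 60, 67, 75]     # power, kW
n = len(Q)

# Column sums of the expanded table.
sum_Q = sum(Q)
sum_P = sum(P)
sum_Q2 = sum(q * q for q in Q)
sum_P2 = sum(p * p for p in P)
sum_QP = sum(q * p for q, p in zip(Q, P))

# Linear correlation coefficient from the summation formula.
numerator = n * sum_QP - sum_Q * sum_P
r = numerator / (math.sqrt(n * sum_Q2 - sum_Q ** 2)
                 * math.sqrt(n * sum_P2 - sum_P ** 2))

print(sum_Q, sum_P, sum_Q2, sum_P2, sum_QP)
print(numerator, round(r, 6))
```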
(ii) Does the data follow an Exponential Relationship?
Since we are calculating the power requirement for a particular discharge, we let Discharge be
the independent variable and Power be the dependent variable. Hence, we let x be Q and y be
P.
We have shown that the exponential relationship,
$$y = b \cdot 10^{mx}$$
can be expressed linearly as
$$\log_{10} y = mx + \log_{10} b$$
We further let $Y = \log_{10} y$ and $B = \log_{10} b$, and the above equation becomes
$$Y = mx + B$$
We need to calculate the linear correlation coefficient between x and log10 𝑦. Let us not forget that we
take x to represent Q and y to represent P. Thus, log10 𝑃 = log10 𝑦 = 𝑌.
And we expand the table with our data as follows:
              x = Q (L/s)   y = P (kW)   Y = log10 y   x²      Y²          xY
              5             31           1.491362      25      2.22416     7.456808
              10            39           1.591065      100     2.531487    15.91065
              15            45           1.653213      225     2.733112    24.79819
              20            53           1.724276      400     2.973127    34.48552
              25            60           1.778151      625     3.161822    44.45378
              30            67           1.826075      900     3.334549    54.78224
              35            75           1.875061      1225    3.515855    65.62714
Summation     140           -            11.9392       3500    20.47411    247.5143
Plugging into the linear correlation coefficient formula with $n = 7$ data points, we have the
linear correlation coefficient as
$$r = \frac{n \sum_{i=1}^{n} x_i Y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} Y_i}{\sqrt{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}\ \sqrt{n \sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}}$$
$$\Rightarrow r = \frac{(7)(247.5143) - (140)(11.9392)}{\sqrt{(7)(3500) - 140^2}\ \sqrt{(7)(20.47411) - 11.9392^2}} = \frac{61.11202}{\sqrt{4900}\ \sqrt{0.774232}} \approx 0.992186$$
This value is also quite close to 1 which indicates a strong exponential relationship. However, this value
is not as close to 1 as the value calculated when assuming the data is linearly related. So our money is still
on the linear relationship.
(iii) Does the data follow a Power Relationship?
We have shown that the power relationship,
$$y = b x^{m}$$
can be expressed linearly as
$$\log_{10} y = m \log_{10} x + \log_{10} b$$
We further let $X = \log_{10} x$, $Y = \log_{10} y$ and $B = \log_{10} b$, and the above equation becomes
$$Y = mX + B$$
We need to calculate the linear correlation coefficient between log10 𝑥 and log10 𝑦. Let us not forget that
we take x to represent Q and y to represent P. Thus, log10 𝑄 = log10 𝑥 = 𝑋 and log10 𝑃 = log10 𝑦 = 𝑌.
We expand the table with our data,
              x = Q (L/s)   y = P (kW)   X = log10 x   Y = log10 y   X²          Y²          XY
              5             31           0.69897       1.491362      0.488559    2.22416     1.042417
              10            39           1             1.591065      1           2.531487    1.591065
              15            45           1.176091      1.653213      1.383191    2.733112    1.944329
              20            53           1.30103       1.724276      1.692679    2.973127    2.243335
              25            60           1.39794       1.778151      1.954236    3.161822    2.485749
              30            67           1.477121      1.826075      2.181887    3.334549    2.697334
              35            75           1.544068      1.875061      2.384146    3.515855    2.895222
Summation     -             -            8.595221      11.9392       11.0847     20.47411    14.89945
As before, plugging into the linear correlation coefficient formula with $n = 7$ data points, we
have the linear correlation coefficient as
$$r = \frac{n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{\sqrt{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}\ \sqrt{n \sum_{i=1}^{n} Y_i^2 - \left(\sum_{i=1}^{n} Y_i\right)^2}}$$
$$\Rightarrow r = \frac{(7)(14.89945) - (8.595221)(11.9392)}{\sqrt{(7)(11.0847) - 8.595221^2}\ \sqrt{(7)(20.47411) - 11.9392^2}} = \frac{1.676075}{\sqrt{3.715072}\ \sqrt{0.774232}} \approx 0.988262$$
This value is the lowest among the three coefficients we obtained. So this set of calculations shows that
the data follows the linear relationship the best.
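The three-way comparison can be sketched in Python as below. The helper `linear_correlation` and the variable names are assumptions made for this illustration; the only inputs are the pump data. Each model is tested by applying the same formula for r to the appropriately transformed coordinates.

```python
import math


def linear_correlation(us, vs):
    """Linear correlation coefficient r between two equal-length lists."""
    n = len(us)
    sum_u, sum_v = sum(us), sum(vs)
    sum_u2 = sum(u * u for u in us)
    sum_v2 = sum(v * v for v in vs)
    sum_uv = sum(u * v for u, v in zip(us, vs))
    return (n * sum_uv - sum_u * sum_v) / (
        math.sqrt(n * sum_u2 - sum_u ** 2) * math.sqrt(n * sum_v2 - sum_v ** 2)
    )


Q = [5, 10, 15, 20, 25, 30, 35]      # discharge, L/s
P = [31, 39, 45, 53, 60, 67, 75]     # power, kW
log_Q = [math.log10(q) for q in Q]
log_P = [math.log10(p) for p in P]

r_linear = linear_correlation(Q, P)               # y against x
r_exponential = linear_correlation(Q, log_P)      # log10 y against x
r_power = linear_correlation(log_Q, log_P)        # log10 y against log10 x

# Pick the model whose transformed points lie closest to a straight line.
candidates = [("linear", r_linear),
              ("exponential", r_exponential),
              ("power", r_power)]
best_model = max(candidates, key=lambda pair: abs(pair[1]))[0]

print(round(r_linear, 6), round(r_exponential, 6), round(r_power, 6))
print(best_model)
```

Comparing the magnitude of r across transformed data is the selection rule used in this example; it is a convenient screening step rather than a formal statistical test.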
Finding the straight line that best approximates the data points
Now, we proceed to find the straight line that best approximates the data points. Let us pull out the table
we generated to find the linear correlation coefficient for the Linear Relationship. Once again, we let
𝑥 = 𝑄 and 𝑦 = 𝑃.
              x = Q (L/s)   y = P (kW)   x²      xy
              5             31           25      155
              10            39           100     390
              15            45           225     675
              20            53           400     1060
              25            60           625     1500
              30            67           900     2010
              35            75           1225    2625
Summation     140           370          3500    8415
The straight line that best approximates these data points is
𝑦 = 𝑚𝑥 + 𝑏
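where its slope is
$$m = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$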
Something interesting: This best approximating straight line is best known as the Least Squares
Regression Line because we use the least squares method to find the formula to generate its slope.
Plugging the values from the table into the formula, we have
$$m = \frac{(7)(8415) - (140)(370)}{(7)(3500) - 140^2} = \frac{7105}{4900} = \frac{29}{20} = 1\tfrac{9}{20}$$

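And its y-intercept is
$$b = \bar{y} - m\bar{x}$$
where $\bar{y}$ is the average of all the y-values,
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}$$
and $\bar{x}$ is the average of all the x-values,
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
So we have
$$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{370}{7} = 52\tfrac{6}{7}$$
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{140}{7} = 20$$
And the y-intercept is
$$b = \bar{y} - m\bar{x} = \frac{370}{7} - \left(\frac{29}{20}\right)(20) = \frac{167}{7} = 23\tfrac{6}{7}$$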
∴ The least squares regression line that best approximates the data points is
$$y = \left(1\tfrac{9}{20}\right) x + 23\tfrac{6}{7}$$
or, in decimal form,
$$y \approx 1.45x + 23.86$$
Or in terms of Q and P,
$$P \approx 1.45Q + 23.86$$
And when $Q = 17.5$ L/s, the estimated power required is obtained by plugging this value of Q into the
straight line that we obtained,
$$P = \left(1\tfrac{9}{20}\right)(17.5) + 23\tfrac{6}{7} \approx 49.23 \text{ kW}$$
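Because the pump data are whole numbers, the slope and intercept can be reproduced exactly as fractions. The sketch below uses Python's standard `fractions` module; the function name `estimate_power` is an assumption made for this illustration.

```python
from fractions import Fraction

Q = [5, 10, 15, 20, 25, 30, 35]      # discharge, L/s
P = [31, 39, 45, 53, 60, 67, 75]     # power, kW
n = len(Q)

sum_x, sum_y = sum(Q), sum(P)
sum_x2 = sum(q * q for q in Q)
sum_xy = sum(q * p for q, p in zip(Q, P))

# Exact slope and intercept of the least squares regression line.
m = Fraction(n * sum_xy - sum_x * sum_y, n * sum_x2 - sum_x ** 2)
b = Fraction(sum_y, n) - m * Fraction(sum_x, n)


def estimate_power(q):
    """Estimated power in kW at discharge q in L/s, from the fitted line."""
    return m * q + b


P_at_17_5 = estimate_power(Fraction(35, 2))   # Q = 17.5 L/s

print(m, b)                  # exact fractions for slope and intercept
print(float(m), float(b))    # decimal form
print(float(P_at_17_5))      # estimated power in kW
```

Working in fractions avoids rounding the slope and intercept before the final estimate; rounding to 1.45 and 23.86 first would give a slightly different last digit.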
Another example: Let us consider the following data:
Resistance of an Unknown Material

Area, A (mm²) Resistance, R (mΩ/m)
0.05 215
0.1 110
0.2 57
0.5 23
1 12
3 4
5 2.5
10 1.3
Source: GE111.3, Assignment #3 (2019)

We want to find the relationship between the cross-section area of a wire and its electrical resistance. Let
us plot the data points.
[1] On a rectilinear graph

[Figure: Resistance of an Unknown Material. Rectilinear plot of Resistance, R against Area, A.]
[2] On a semi-log graph

[Figure: Resistance of an Unknown Material. Semi-log plot of Resistance, R on a logarithmic axis against Area, A.]
[3] On a log-log graph

[Figure: Resistance of an Unknown Material. Log-log plot of Resistance, R against Area, A.]
A visual inspection of the three graphs tells us that the data follows a Power Relationship since only its
log-log graph shows the data points following a straight line quite closely.
We want the area to be the independent variable and the resistance to be the dependent variable, i.e. we
want to know what is the resistance when the area is of a particular value. So we let
𝑥 = 𝐴𝑟𝑒𝑎 and 𝑦 = 𝑅𝑒𝑠𝑖𝑠𝑡𝑎𝑛𝑐𝑒
Thus, the power relationship is written as
$$y = b x^{m}$$
That can be expressed linearly as
$$\log_{10} y = m \log_{10} x + \log_{10} b$$
By letting $X = \log_{10} x$, $Y = \log_{10} y$ and $B = \log_{10} b$, we can rewrite the above equation as
$$Y = mX + B$$
And once we find B, we can find b by taking the inverse,
$$B = \log_{10} b \ \Rightarrow \ b = 10^{B}$$
Now, we expand the given table rewriting x and y in their logarithmic values,
              x = A (mm²)   y = R (mΩ/m)   X = log10 x   Y = log10 y   X²          XY
              0.05          215            -1.30103      2.332438      1.692679    -3.03457
              0.1           110            -1            2.041393      1           -2.04139
              0.2           57             -0.69897      1.755875      0.488559    -1.2273
              0.5           23             -0.30103      1.361728      0.090619    -0.40992
              1             12             0             1.079181      0           0
              3             4              0.477121      0.60206       0.227645    0.287256
              5             2.5            0.69897       0.39794       0.488559    0.278148
              10            1.3            1             0.113943      1           0.113943
Summation     -             -              -1.12494      9.684558      4.988061    -6.03384
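We count a total of $n = 8$ data points.
The straight line that best approximates the points $(X_i, Y_i)$ is
$$Y = mX + B$$
where its slope is
$$m = \frac{n \sum_{i=1}^{n} X_i Y_i - \sum_{i=1}^{n} X_i \sum_{i=1}^{n} Y_i}{n \sum_{i=1}^{n} X_i^2 - \left(\sum_{i=1}^{n} X_i\right)^2}$$
$$m = \frac{(8)(-6.03384) - (-1.12494)(9.684558)}{(8)(4.988061) - (-1.12494)^2} = \frac{-37.3762}{38.639} \approx -0.96732$$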

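And its Y-intercept is
$$B = \bar{Y} - m\bar{X}$$
where $\bar{Y}$ is the average of all the Y-values,
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$$
and $\bar{X}$ is the average of all the X-values,
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
So we have
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} = \frac{9.684558}{8} = 1.21057$$
and
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{-1.12494}{8} = -0.14062$$
$$B = \bar{Y} - m\bar{X} = 1.21057 - (-0.96732)(-0.14062) = 1.074548$$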
From $\log_{10} b = B$, we have
$$b = 10^{B} = 10^{1.074548} \approx 11.87266$$
∴ The power relationship of the data is
$$y = b x^{m}$$
$$\Rightarrow y = 11.87266\, x^{-0.96732}$$
Or, in terms of A and R,
$$R = 11.87266\, A^{-0.96732}$$
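The power fit can be reproduced with a short Python sketch that follows the same steps: take logarithms of both coordinates, fit a straight line, then undo the logarithm on the intercept. The variable names and the helper `fitted_resistance` are assumptions made for this illustration.

```python
import math

A = [0.05, 0.1, 0.2, 0.5, 1, 3, 5, 10]       # area, mm^2
R = [215, 110, 57, 23, 12, 4, 2.5, 1.3]      # resistance, mOhm/m

# Log-log transformation: X = log10 x and Y = log10 y.
X = [math.log10(a) for a in A]
Y = [math.log10(r) for r in R]
n = len(X)

sum_X, sum_Y = sum(X), sum(Y)
sum_X2 = sum(x * x for x in X)
sum_XY = sum(x * y for x, y in zip(X, Y))

# Least squares line Y = m*X + B through the transformed points.
m = (n * sum_XY - sum_X * sum_Y) / (n * sum_X2 - sum_X ** 2)
B = sum_Y / n - m * (sum_X / n)
b = 10 ** B                                   # back from B = log10 b


def fitted_resistance(area):
    """Resistance predicted by the fitted power law R = b * A**m."""
    return b * area ** m


print(round(m, 5), round(B, 6), round(b, 5))
```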
Yet another example: Consider the following data,
Radiation Passing Through a Plate

Plate Thickness, t (in) Geiger Reading, G (rad/s)
0.1 5950
0.5 5780
1 5565
2.5 4975
5 4125
10 2835
20 1340
30 630
40 300
50 140
Test on 11 October 1982
We want to find the type of relationship between the radiation passing through a plate and the plate’s
thickness. So let us plot the data points.
[i] Data points plotted on a rectilinear graph paper

[Figure: Radiation Passing Through a Plate. Rectilinear plot; vertical axis Geiger Reading, G (rad/s).]
[ii] Data points plotted on a semi-log graph paper

[Figure: Radiation Passing Through a Plate. Semi-log plot with a logarithmic vertical axis.]
[iii] Data points plotted on a log-log graph paper

[Figure: Radiation Passing Through a Plate. Log-log plot.]
A visual inspection of the three graphs tells us that the data follows an Exponential Relationship since only
its semi-log graph shows the data points following a straight line closely.
We want the Plate Thickness to be the independent variable and the Geiger Reading to be the dependent
variable, i.e. we want to know what is the amount of radiation passing through the plate when the plate’s
thickness is of a particular value. So we let
𝑥 = 𝑃𝑙𝑎𝑡𝑒 𝑇ℎ𝑖𝑐𝑘𝑛𝑒𝑠𝑠 and 𝑦 = 𝐺𝑒𝑖𝑔𝑒𝑟 𝑅𝑒𝑎𝑑𝑖𝑛𝑔
The exponential relationship is written as
$$y = b \cdot 10^{mx}$$
That can be expressed linearly as
$$\log_{10} y = mx + \log_{10} b$$
By letting $Y = \log_{10} y$ and $B = \log_{10} b$, we can rewrite the above equation as
$$Y = mx + B$$
And once we find B, we can find b by taking the inverse,
$$B = \log_{10} b \ \Rightarrow \ b = 10^{B}$$
Now, we expand the table, rewriting y in its logarithmic values,
              x = t (in)   y = G (rad/s)   Y = log10 y   x²         xY
              0.1          5950            3.774517      0.01       0.377452
              0.5          5780            3.761928      0.25       1.880964
              1            5565            3.745465      1          3.745465
              2.5          4975            3.696793      6.25       9.241983
              5            4125            3.615424      25         18.07712
              10           2835            3.452553      100        34.52553
              20           1340            3.127105      400        62.5421
              30           630             2.799341      900        83.98022
              40           300             2.477121      1600       99.08485
              50           140             2.146128      2500       107.3064
Summation     159.1        -               32.59637      5532.51    420.7621
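Here, we count a total of $n = 10$ data points.
The straight line that best approximates the points $(x_i, Y_i)$ is
$$Y = mx + B$$
where its slope is
$$m = \frac{n \sum_{i=1}^{n} x_i Y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} Y_i}{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}$$
$$m = \frac{(10)(420.7621) - (159.1)(32.59637)}{(10)(5532.51) - (159.1)^2} = \frac{-978.462}{30012.29} \approx -0.0326$$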

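And its Y-intercept is
$$B = \bar{Y} - m\bar{x}$$
where $\bar{Y}$ is the average of all the Y-values,
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n}$$
and $\bar{x}$ is the average of all the x-values,
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
So we have
$$\bar{Y} = \frac{\sum_{i=1}^{n} Y_i}{n} = \frac{32.59637}{10} = 3.259637$$
and
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{159.1}{10} = 15.91$$
$$B = \bar{Y} - m\bar{x} = 3.259637 - (-0.0326)(15.91) \approx 3.778336$$
(The value 3.778336 is obtained by carrying the unrounded value of m through the calculation.)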
From $\log_{10} b = B$, we have
$$b = 10^{B} = 10^{3.778336} \approx 6002.556$$
∴ The exponential relationship of the data is
$$y = b \cdot 10^{mx}$$
$$\Rightarrow y = (6002.556) \cdot 10^{-0.0326x}$$
Or, in terms of t and G,
$$G = (6002.556) \cdot 10^{-0.0326t}$$
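The exponential fit can be reproduced in Python in the same way, this time transforming only the y-coordinates. The variable names and the helper `fitted_reading` are assumptions made for this illustration.

```python
import math

t = [0.1, 0.5, 1, 2.5, 5, 10, 20, 30, 40, 50]                    # thickness, in
G = [5950, 5780, 5565, 4975, 4125, 2835, 1340, 630, 300, 140]    # Geiger reading

# Semi-log transformation: keep x as it is, take Y = log10 y.
Y = [math.log10(g) for g in G]
n = len(t)

sum_x, sum_Y = sum(t), sum(Y)
sum_x2 = sum(x * x for x in t)
sum_xY = sum(x * y for x, y in zip(t, Y))

# Least squares line Y = m*x + B through the transformed points.
m = (n * sum_xY - sum_x * sum_Y) / (n * sum_x2 - sum_x ** 2)
B = sum_Y / n - m * (sum_x / n)
b = 10 ** B                                   # back from B = log10 b


def fitted_reading(thickness):
    """Geiger reading predicted by the fitted law G = b * 10**(m * t)."""
    return b * 10 ** (m * thickness)


print(round(m, 4), round(B, 6), round(b, 3))
```

Because b = 10^B, small changes in B are magnified in b, so it is safer to carry full precision for m and B and round only the final values.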
