
ERROR FUNCTIONS

Reference notes

Error Functions

We will look at a statistical method called mean squared error (MSE) and
describe its relationship to the regression line.
The example consists of points on the Cartesian plane. We will define a
mathematical function that gives us the straight line that passes best between
all the points.

In this way, we will see the connection between these two methods and how
they fit together.

General Explanation:
In statistics, the mean squared error (MSE) of an estimator (of a procedure for
estimating an unobserved quantity) measures the average of the squares of the
errors — that is, the average squared difference between the estimated values and
what is estimated. MSE is a risk function, corresponding to the expected value of
the squared error loss. The fact that MSE is almost always strictly positive (and
not zero) is because of randomness or because the estimator does not account for
information that could produce a more accurate estimate.

Let’s say we have seven points, and our goal is to find a line that minimizes the
squared distances to these different points.
Let’s try to understand that.
I will take an example and I will draw a line between the points. Of course, my
drawing isn’t the best, but it’s just for demonstration purposes.
You might be asking yourself, what is this graph?

• The purple dots are the points on the graph. Each point has an x-coordinate
and a y-coordinate.
• The blue line is our prediction line. This is the line that fits the points
as closely as possible; it contains the predicted points.
• The red lines between each purple point and the prediction line are the
errors. Each error is the distance from the point to its predicted point.

You should remember this equation from your school days, y = Mx + B, where M
is the slope of the line and B is the y-intercept of the line.
We want to find the M (slope) and B (y-intercept) that minimize the squared
error!
Let's define the mathematical equation that gives us the mean squared error
for all our points:

    MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ − y′ᵢ)²

Let’s analyze what this equation means.

In mathematics, the character that looks like a strange E is called a
summation sign (the Greek capital letter sigma, Σ). It denotes the sum of a
sequence of numbers, from i = 1 to n. Let's imagine this like an array of
points, where we go through all the points, from the first (i = 1) to the
last (i = n).
For each point, we take the y-coordinate of the point and the y′-coordinate.
The y-coordinate is our purple dot; the y′ point sits on the line we created.
We subtract the y′ value from the y value and square the result.
The third step is to take the sum of all the (y − y′)² values and divide it
by n, which gives the mean.

Our goal is to minimize this mean, which will give us the line that best
fits the points.
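The steps above can be sketched in a few lines of Python. The point values and the candidate line y = 0.5x + 1 below are made up purely for illustration:

```python
# Mean squared error: the average of the squared differences between
# each actual y value and the y' value predicted by the line y = m*x + b.
def mean_squared_error(points, m, b):
    n = len(points)
    return sum((y - (m * x + b)) ** 2 for x, y in points) / n

# Hypothetical points and an arbitrary candidate line (m = 0.5, b = 1).
points = [(1, 2), (2, 1), (4, 3)]
print(mean_squared_error(points, 0.5, 1))
```

Trying different values of m and b changes the result; the best-fitting line is the one for which this number is smallest.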

From concept to mathematical equations:


As you know, the line equation is y = mx + b, where m is the slope and b is
the y-intercept.
Let’s take each point on the graph, and we’ll do our calculation (y-y’)².
But what is y’, and how do we calculate it? We do not have it as part of the
data.
But we do know that, in order to calculate y′, we need to use our line
equation, y = mx + b, and substitute the x value into it: y′ᵢ = m·xᵢ + b.
From here we get the following equation:

    MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ − (m·xᵢ + b))²
Let's rewrite this expression to simplify it.

Let's begin by opening the outer brackets in the equation:

    MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ² − 2·yᵢ·(m·xᵢ + b) + (m·xᵢ + b)²)

Now, let's apply another manipulation: open the remaining brackets and put
the like terms side by side:

    MSE = (1/n) · Σᵢ₌₁ⁿ (yᵢ² − 2·m·xᵢ·yᵢ − 2·b·yᵢ + m²·xᵢ² + 2·m·b·xᵢ + b²)
At this point the expression is getting messy, so let's take the mean of each
group of terms: the y², xy, x and x² sums.
Let's define, for each one, a new symbol that represents its mean, written
with a bar ("headline") over the variable; in plain text below we write it
as mean(·).
For example, let's take all the y values, add them up and divide by n, since
that is the mean, and call it mean(y):

    mean(y) = (1/n) · Σᵢ₌₁ⁿ yᵢ

If we multiply both sides of the equation by n we get:

    n · mean(y) = Σᵢ₌₁ⁿ yᵢ

Which leads us to the following equation:

    MSE = mean(y²) − 2·m·mean(xy) − 2·b·mean(y) + m²·mean(x²) + 2·m·b·mean(x) + b²

If we look at what we got, we can see that MSE as a function of m and b is a
3D surface. It looks like a glass, which rises sharply upwards.
We want to find the m and b that minimize this function, so we take the
partial derivative with respect to m and the partial derivative with respect
to b.
Since we are looking for a minimum point, we set each partial derivative
equal to 0:

    ∂MSE/∂m = −2·mean(xy) + 2·m·mean(x²) + 2·b·mean(x) = 0
    ∂MSE/∂b = −2·mean(y) + 2·m·mean(x) + 2·b = 0
Let's take the two equations we received (dividing each by 2 first) and
isolate the variable b in each:

    from ∂MSE/∂m = 0:   b = (mean(xy) − m·mean(x²)) / mean(x)
    from ∂MSE/∂b = 0:   b = mean(y) − m·mean(x)

Let's subtract the first equation from the second:

    mean(y) − m·mean(x) − (mean(xy) − m·mean(x²)) / mean(x) = 0

Let's get rid of the denominator by multiplying through by mean(x):

    mean(x)·mean(y) − m·mean(x)² − mean(xy) + m·mean(x²) = 0

And there we go: solving this for m gives the equation for the slope, and
substituting that m back into b = mean(y) − m·mean(x) gives the b equation.
Equations for slope and y-intercept

Here are the mathematical equations that give us the required slope and
y-intercept:

    m = (mean(xy) − mean(x)·mean(y)) / (mean(x²) − mean(x)²)
    b = mean(y) − m·mean(x)
So, you probably think to yourself, what the heck are those weird equations?
They are actually simple to understand, so let’s talk about them a little bit.

Now that we understand our equations it’s time to get all things together and
show some examples.
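These two equations translate directly into a short Python sketch (the function name `best_fit_line` is my own, not from the notes):

```python
def best_fit_line(points):
    """Return (m, b) minimizing the mean squared error,
    using the mean-based equations derived above."""
    n = len(points)
    mean_x  = sum(x for x, _ in points) / n
    mean_y  = sum(y for _, y in points) / n
    mean_xy = sum(x * y for x, y in points) / n
    mean_x2 = sum(x * x for x, _ in points) / n
    m = (mean_xy - mean_x * mean_y) / (mean_x2 - mean_x ** 2)
    b = mean_y - m * mean_x
    return m, b
```

Note that the formula divides by mean(x²) − mean(x)², so it breaks down when all x values are equal (a vertical set of points has no finite-slope best fit).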
Example #1
Let's take 3 points: (1,2), (2,1), (4,3).

Let's find m and b for the equation y = mx + b. First we calculate the
relevant means:

    mean(x)  = (1 + 2 + 4) / 3 = 7/3
    mean(y)  = (2 + 1 + 3) / 3 = 2
    mean(xy) = (1·2 + 2·1 + 4·3) / 3 = 16/3
    mean(x²) = (1² + 2² + 4²) / 3 = 7

After we've calculated the relevant parts for our m equation and b equation,
let's put those values inside the equations and get the slope and
y-intercept:

    m = (16/3 − (7/3)·2) / (7 − (7/3)²) = (2/3) / (14/9) = 3/7
    b = 2 − (3/7)·(7/3) = 2 − 1 = 1

Let's take those results and set them inside the line equation y = mx + b:

    y = (3/7)·x + 1

Now if we draw this line, it passes among the points in such a way that it
minimizes the squared distances.
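We can double-check the hand calculation with a few lines of plain Python arithmetic:

```python
# The three example points, split into x and y lists.
xs, ys = [1, 2, 4], [2, 1, 3]
n = len(xs)

mean_x  = sum(xs) / n                              # 7/3
mean_y  = sum(ys) / n                              # 2.0
mean_xy = sum(x * y for x, y in zip(xs, ys)) / n   # 16/3
mean_x2 = sum(x * x for x in xs) / n               # 7.0

m = (mean_xy - mean_x * mean_y) / (mean_x2 - mean_x ** 2)
b = mean_y - m * mean_x
print(m, b)  # m ≈ 3/7 ≈ 0.4286, b ≈ 1.0
```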
Mean Absolute Error
Absolute error is the amount of error in your measurement: the difference
between the measured value and the "true" value. For example, if a scale
states 90 pounds but you know your true weight is 89 pounds, then the scale
has an absolute error of 90 lbs − 89 lbs = 1 lb.
This can be caused by your scale not measuring the exact amount you are
trying to measure. For example, your scale may be accurate to the nearest
pound. If you weigh 89.6 lbs, the scale may "round up" and give you 90 lbs.
In this case the absolute error is 90 lbs − 89.6 lbs = 0.4 lbs.

Formula
The formula for the absolute error (Δx) is:

    Δx = xᵢ − x

Where:
xᵢ is the measurement,
x is the true value.
Using the first weight example above, the absolute error formula gives the
same result:

    Δx = 90 lbs − 89 lbs = 1 lb

Sometimes you'll see the formula written with the absolute value symbol
(these bars: | |). This is often used when you're dealing with multiple
measurements:

    Δx = |xᵢ − x|
The absolute value symbol is needed because sometimes the measurement will be
smaller than the true value, giving a negative number. For example, if the
scale measured 89 lbs and the true value was 95 lbs, then you would have a
difference of 89 lbs − 95 lbs = −6 lbs. On its own, a negative value is fine
(−6 just means "six units below"), but the problem comes when you're trying
to add several values, some of which are positive and some negative. For
example, let's say you have:

• 89 lbs − 95 lbs = −6 lbs and
• 98 lbs − 92 lbs = 6 lbs

On their own, both measurements have absolute errors of 6 lbs. If you add
them together, you should get a total of 12 lbs of error, but because of the
signs you'll actually get −6 lbs + 6 lbs = 0 lbs, which makes no sense at
all: after all, there was a pretty big error (12 lbs) which has somehow
become 0 lbs of error. We solve this by taking the absolute value of each
result and then adding: |−6 lbs| + |6 lbs| = 12 lbs.

The Mean Absolute Error (MAE) is the average of all the absolute errors.

The formula is:

    MAE = (1/n) · Σᵢ₌₁ⁿ |xᵢ − x|

The formula may look a little daunting, but the steps are easy:
1. Find all of your absolute errors, |xᵢ − x|.
2. Add them all up.
3. Divide by the number of errors, n. For example, if you had 10
measurements, divide by 10.
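The three steps can be sketched in Python. The sample readings and the true weight of 89 lbs below are invented for illustration:

```python
def mean_absolute_error(measurements, true_value):
    # Step 1: take the absolute error |x_i - x| of each measurement;
    # step 2: add them all up; step 3: divide by the number of errors.
    return sum(abs(x - true_value) for x in measurements) / len(measurements)

# Hypothetical scale readings against a true weight of 89 lbs.
readings = [90, 88, 89.5]
print(mean_absolute_error(readings, 89))
```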
