
Week 9 Tutorial Solutions


Week 9 Tutorial

• Recap of Seminar Week 8: Regression Analysis


• Part 1: Simple Regression [Questions & discussion (Tutor led)]
• Quiz Part A
• Part 2: Multiple Regression [Recap, Questions & Discussion (Tutor led)]
• Quiz Part B
Simple Linear Regression Model
A Simple Linear Regression Model captures the straight line relationship between the
“outcome” variable 𝒀 and the “input” variable, 𝑿.

The true model is:


𝑌𝑖 = 𝛽0 + 𝛽1 𝑋𝑖 + 𝜀𝑖
• Y is the dependent variable, the thing we want to influence.
• X is the independent variable, the thing we can control.
• 𝛽0 and 𝛽1 are the population parameters which we want to estimate.
• 𝜀𝑖 is the error - how far the observed value of 𝑌𝑖 is from the regression line.
The Concept of the Simple Linear Regression Model
We estimate the true model using our sample data. (The textbook writes the
estimates as b0 and b1.)

• We estimate the true population values, the $\beta$'s, with sample estimates, the $\hat{\beta}$'s.
• Key issue: how to search for the values of $\hat{\beta}_0$ and $\hat{\beta}_1$ given the data?
• These estimates determine our predicted value of $Y_i$:

$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$

and the error (residual) $e_i = Y_i - \hat{Y}_i$
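One standard answer to the key issue above is ordinary least squares (OLS): pick the intercept and slope that minimise the sum of squared errors. The slide does not name the method, so treat that as an assumption. The sketch below uses made-up data and a hypothetical helper name, `fit_simple_ols`; it is an illustration, not the tutorial's own procedure.

```python
def fit_simple_ols(x, y):
    """Return (b0, b1) that minimise the sum of squared errors
    e_i = y_i - (b0 + b1 * x_i)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx            # estimated slope
    b0 = y_bar - b1 * x_bar     # estimated intercept
    return b0, b1


# Made-up illustration data (NOT from the tutorial data file).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.0]

b0, b1 = fit_simple_ols(x, y)
y_hat = [b0 + b1 * xi for xi in x]                  # predicted values
residuals = [yi - yh for yi, yh in zip(y, y_hat)]   # e_i = Y_i - Y_hat_i
print(f"intercept = {b0:.2f}, slope = {b1:.2f}")
```

With an intercept in the model, the OLS residuals sum to zero (up to rounding), which is a quick sanity check on any fit.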
Interpretation of Regression Model Coefficients
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_i$

• $\hat{Y}_i$: the predicted/estimated value of $Y$ (the dependent variable).
• $\hat{\beta}_0$ (intercept): the predicted value of $Y$ when $X = 0$. Not meaningful when the range of $X$ values does not include 0.
• $\hat{\beta}_1$ (slope): the change in predicted $Y$ when $X$ increases by one unit.
   $\hat{\beta}_1 > 0$ (positive): an increase in $X$ is accompanied by an increase in $Y$.
   $\hat{\beta}_1 < 0$ (negative): an increase in $X$ is accompanied by a decrease in $Y$.
• $X_i$: the independent variable.
Simple Linear Regression Pitfalls
• Looking at the effect of a single independent variable (X) on Y can be
misleading.
• Omitting relevant variables from the regression model may result in biased
estimates of the regression coefficient (omitted variable bias).
e.g. Larger properties (in land size) tend to have larger dwellings and
therefore more bedrooms, bathrooms and car spaces.

[Diagram: the variables Area, Beds, Baths, Cars and Price, with a path labelled $\beta_1$]
Simple Linear Regression Pitfalls

• If you estimate a separate simple linear regression for each feature:

          Intercept     Slope
Area        539.357     1.093
Beds        261.501   264.855
Baths       661.001   288.180
Cars        845.000   200.464

• These slopes give you apparent premiums for area, beds, baths and cars.
• But they are misleading because the features are correlated with one
another: each slope also picks up the effect of the features left out.
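The point about misleading slopes can be illustrated with a small made-up data set. In the sketch below, the "true" premiums (1.0 per square metre and 30 per bedroom), the intercept and all data values are assumptions chosen for illustration, and the helper names are hypothetical. Because Area and Beds are positively correlated, the simple regression on Beds alone overstates the bedroom premium, while the regression that includes both recovers it.

```python
def centred(values):
    """Return deviations from the mean."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def simple_slope(x, y):
    """Slope from a simple linear regression of y on x."""
    dx, dy = centred(x), centred(y)
    return dot(dx, dy) / dot(dx, dx)


def two_regressor_slopes(x1, x2, y):
    """Slopes from a regression of y on x1 and x2 (with intercept),
    solved from the 2x2 normal equations on centred data."""
    d1, d2, dy = centred(x1), centred(x2), centred(y)
    s11, s22, s12 = dot(d1, d1), dot(d2, d2), dot(d1, d2)
    s1y, s2y = dot(d1, dy), dot(d2, dy)
    det = s11 * s22 - s12 ** 2
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Made-up data: bigger blocks tend to have more bedrooms.
area = [300, 400, 450, 500, 600, 650, 700, 800]
beds = [2, 3, 2, 3, 4, 3, 5, 4]
# Assumed "true" relationship, noise-free so the result is exact.
price = [100 + 1.0 * a + 30.0 * b for a, b in zip(area, beds)]

beds_only = simple_slope(beds, price)
b_area, b_beds = two_regressor_slopes(area, beds, price)
print(f"Beds slope, simple regression:   {beds_only:.1f}")
print(f"Beds slope, with Area included:  {b_beds:.1f}")
```

The simple-regression slope is inflated by the Area effect that travels with extra bedrooms, which is the omitted variable bias described above.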
Part A : Tutor Led
In this tutorial we explore the relationship between property price and
property features for older homes. Open the data file Week 9
Tutorial Data.xlsx and look at the “Older Homes” tab. We will begin by
estimating the simple linear regressions and reporting the coefficients here.

T1. Estimate the following two simple linear regressions:


a) 𝑃𝑟𝑖𝑐𝑒𝑖 = 𝛽0 + 𝛽1 𝐴𝑟𝑒𝑎𝑖 + 𝜀𝑖
b) 𝑃𝑟𝑖𝑐𝑒𝑖 = 𝛽0 + 𝛽1 𝐵𝑒𝑑𝑠𝑖 + 𝜀𝑖
Report the slope coefficients from these two models.
Part A : Tutor Led
T2. Interpret the slope of Area from the simple linear regression.
The simple linear regression suggests that an additional square metre of
land area is associated with an average increase of $1,168 in property
price.
T3. If an investor is considering renovating an older home to include two
more bedrooms, calculate the expected price increase from this renovation
from the models above.
2 × 198.626 = 397.252 (in $000), so the expected increase is $397,252.
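The T3 arithmetic can be written out as a short calculation. This sketch assumes that prices in the data file are recorded in thousands of dollars, which is inferred from the way the answer converts 397.252 into $397,252; the variable names are illustrative.

```python
# Slope of Beds from the simple regression used in T3.
# Assumption: prices are recorded in $000.
BEDS_SLOPE_THOUSANDS = 198.626

extra_bedrooms = 2
increase_thousands = extra_bedrooms * BEDS_SLOPE_THOUSANDS
increase_dollars = round(increase_thousands * 1000)

print(f"${increase_dollars:,}")  # $397,252
```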
Part A : Tutor Led
T4. In the previous section, we estimated two simple regression
models using Area and Beds. In addition, you are provided with the
following simple regressions for older homes, which use the
independent variables Baths and Cars.
𝑃𝑟𝑖𝑐𝑒𝑖 = 𝛽0 + 𝛽1 𝐵𝑎𝑡ℎ𝑠𝑖 + 𝜀𝑖
𝑃𝑟𝑖𝑐𝑒𝑖 = 𝛽0 + 𝛽1 𝐶𝑎𝑟𝑠𝑖 + 𝜀𝑖
In the Week 8 seminar, we estimated the simple linear regression models of
price on the property features for the whole market. The results for both
older homes and the whole market are tabulated below.
Part A : Tutor Led
According to the simple linear regression analyses so far, which two of the
following statements are TRUE?
• In terms of area, older homes tend to generate a larger premium (larger
price increase) compared to the whole market.
• Older homes tend to generate a larger premium for car spaces when compared
to the whole market.
• Regardless of the age of the homes, the simple linear regressions
suggest that the feature that generates the largest premium is the
number of bathrooms.
• Older homes with no car spaces tend to be cheaper than an average
home with no car spaces in the whole market.
Quiz on Moodle
Go to Moodle, open Week 9 Tutorial Quiz and complete the Part A
questions.

Time allowed: 20 minutes


Multiple regression & Interpretation of Regression Model Coefficients
$\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \cdots + \hat{\beta}_k X_{ki}$

$\hat{\beta}_0$ = the predicted value of $Y$ when all $X$ variables are zero.
Not meaningful when the range of values of any $X$ variable does not include 0.

$\hat{\beta}_1$ = the change in predicted $Y$ when $X_1$ increases by one unit, holding the other
$X$ variables fixed.

$\hat{\beta}_k$ = the change in predicted $Y$ when $X_k$ increases by one unit, holding the other
$X$ variables fixed.
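The "holding other X variables fixed" reading can be checked directly: change one feature by one unit, leave the rest alone, and the prediction moves by exactly that feature's coefficient. In the sketch below every coefficient and feature value is hypothetical (they are not the tutorial's estimates), and `predict` is an illustrative helper.

```python
def predict(intercept, slopes, features):
    """Predicted Y for one observation from a fitted multiple regression."""
    return intercept + sum(slopes[name] * features[name] for name in slopes)


# Hypothetical coefficients, for illustration only.
intercept = 200.0
slopes = {"Area": 0.9, "Beds": 30.0, "Baths": 50.0, "Cars": 20.0}

home = {"Area": 600, "Beds": 3, "Baths": 2, "Cars": 1}
# Same home with one more bedroom; all other features unchanged.
one_more_bed = dict(home, Beds=home["Beds"] + 1)

change = predict(intercept, slopes, one_more_bed) - predict(intercept, slopes, home)
print(f"change in predicted price: {change:.1f}")
```

The intercept and the other features cancel in the difference, so the change equals the Beds coefficient regardless of which home is used as the starting point.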
Part B: Tutor Led

T5. For the “Older Homes” data, estimate the multiple linear regression:
𝑃𝑟𝑖𝑐𝑒𝑖 = 𝛽0 + 𝛽1 𝐴𝑟𝑒𝑎𝑖 + 𝛽2 𝐵𝑒𝑑𝑠𝑖 + 𝛽3 𝐵𝑎𝑡ℎ𝑠𝑖 + 𝛽4 𝐶𝑎𝑟𝑠𝑖 + 𝜀𝑖
Report the slope of Area to 4 decimal places.
Slope of Area = 0.9395
T6. If an investor is considering renovating an older home to include two
more bedrooms, calculate the expected increase in property price as a
result of this renovation from the multiple linear regression model.
$61,107
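The slides give only the final T6 figure, not the Beds coefficient from the multiple regression. Assuming the answer was obtained the same way as in T3 (two bedrooms times the Beds slope), the implied per-bedroom premium can be backed out and compared with the simple-regression result. Variable names are illustrative.

```python
t3_increase = 397_252   # dollars, simple regression answer (T3)
t6_increase = 61_107    # dollars, multiple regression answer (T6)
extra_bedrooms = 2

# Assumption: T6 answer = extra_bedrooms * Beds coefficient.
implied_per_bedroom = t6_increase / extra_bedrooms
ratio = t3_increase / t6_increase

print(f"implied premium per bedroom: ${implied_per_bedroom:,.2f}")
print(f"T3 estimate is about {ratio:.1f} times the T6 estimate")
```

The size of the gap shows how much of the simple-regression figure comes from features that tend to accompany extra bedrooms rather than from the bedrooms themselves.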
Part B: Tutor Led
T7. Compare the premiums calculated in T3 (simple linear regression) and
T6 (multiple linear regression). Which two of the following statements
are FALSE?
• The premium calculated in T6 is misleading because we included
additional features in the model.
• The premium calculated in T3 is misleading due to omitted variable bias in
the simple linear regression model.
• Because Area and Beds are positively correlated, the answer in T3 also
reflects the larger land area that tends to come with more bedrooms, not
just the bedroom feature itself.
• We should use the premium calculated in T3 because it gives a higher
estimate.
Quiz on Moodle
Complete the Part B questions.

Time allowed: 20 minutes
