
ORBS7290 Regression Analysis

The document discusses regression analysis, including expressions for Q(y), Q1(y), and Q2(y), as well as expectations of these quantities. It also covers covariance relationships in regression coefficients and provides prediction equations and ANOVA results for viscosity based on temperature and sebacic acid ratio. The analysis concludes with confidence intervals and the significance of relationships between variables.

Uploaded by

man yuan

Question 1
(a) Express Q(y), Q_1(y) and Q_2(y)
For Q(y):

$$Q(y) = \sum_j (y_j - \hat{y}_j)^2 + \sum_j (\hat{y}_j - \bar{y})^2$$

Expanding each square separately gives:

$$Q(y) = \sum_j \left(y_j^2 - 2 y_j \hat{y}_j + \hat{y}_j^2\right) + \sum_j \left(\hat{y}_j^2 - 2 \hat{y}_j \bar{y} + \bar{y}^2\right)$$

For Q_1(y):

$$Q_1(y) = \sum_j (y_j - \hat{y}_j)^2$$

For Q_2(y):

$$Q_2(y) = \sum_j (\hat{y}_j - \bar{y})^2$$
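The three quantities can be computed directly for a small data set. The sketch below assumes a least-squares line with an intercept; the data values are hypothetical and chosen only for illustration. For such a fit, Q(y) = Q_1(y) + Q_2(y) coincides with the total sum of squares about the mean, which the last line prints for comparison.

```python
# Hypothetical data, chosen only for illustration (not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept for the line y = b0 + b1 * x.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

q1 = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # Q_1: residual sum of squares
q2 = sum((fi - y_bar) ** 2 for fi in y_hat)           # Q_2: regression sum of squares
q = q1 + q2                                           # Q as defined in part (a)
total = sum((yi - y_bar) ** 2 for yi in y)            # sum of squares about the mean

print("Q1 =", q1)
print("Q2 =", q2)
print("Q  =", q, "| total sum of squares =", total)
```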

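(b) To find E(Q(y)) and E(Q_1(y))

Assume the simple linear model $y_j = \beta_0 + \beta_1 x_j + \varepsilon_j$ with fixed $x_j$, uncorrelated errors, $E(\varepsilon_j) = 0$ and $\mathrm{Var}(\varepsilon_j) = \sigma^2$, and least-squares fitted values $\hat{y}_j = \hat{\beta}_0 + \hat{\beta}_1 x_j$. Write $S_{XX} = \sum_j (x_j - \bar{x})^2$. By linearity of expectation,

$$E(Q(y)) = E(Q_1(y)) + E(Q_2(y)).$$

For the residual term, $E(y_j - \hat{y}_j) = 0$, so

$$E\big((y_j - \hat{y}_j)^2\big) = \mathrm{Var}(y_j - \hat{y}_j) = \sigma^2 (1 - h_{jj}), \qquad h_{jj} = \frac{1}{n} + \frac{(x_j - \bar{x})^2}{S_{XX}}, \qquad \sum_j h_{jj} = 2.$$

Each residual has variance smaller than $\sigma^2$ because the fitted line is estimated from the same data. Summing over $j$:

$$E(Q_1(y)) = \sigma^2 \sum_j (1 - h_{jj}) = (n - 2)\sigma^2.$$

For the regression term, $\hat{y}_j - \bar{y} = \hat{\beta}_1 (x_j - \bar{x})$, so $Q_2(y) = \hat{\beta}_1^2 S_{XX}$ and

$$E(Q_2(y)) = S_{XX}\left(\mathrm{Var}(\hat{\beta}_1) + \beta_1^2\right) = \sigma^2 + \beta_1^2 S_{XX}.$$

Therefore:

$$E(Q(y)) = (n - 1)\sigma^2 + \beta_1^2 S_{XX}, \qquad E(Q_1(y)) = (n - 2)\sigma^2.$$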
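Expectations of this kind can be checked numerically. The following is a minimal Monte Carlo sketch, assuming a simple linear model with normal errors; the design points, coefficients and error standard deviation are hypothetical. It averages Q_1 and Q_2 over many simulated samples so that the averages can be set against the closed-form expectations.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same averages

# Hypothetical simulation settings (not from the source).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
beta0, beta1, sigma = 1.0, 0.5, 2.0
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)


def q1_q2(y):
    """Return (Q1, Q2) for the least-squares line fitted to (x, y)."""
    y_bar = sum(y) / n
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    b0 = y_bar - b1 * x_bar
    fitted = [b0 + b1 * xi for xi in x]
    q1 = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    q2 = sum((fi - y_bar) ** 2 for fi in fitted)
    return q1, q2


reps = 20000
sum_q1 = 0.0
sum_q2 = 0.0
for _ in range(reps):
    # Draw one sample from the assumed model and accumulate Q1 and Q2.
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    q1, q2 = q1_q2(y)
    sum_q1 += q1
    sum_q2 += q2

mean_q1 = sum_q1 / reps
mean_q2 = sum_q2 / reps
print("average Q1:", mean_q1, "| sigma^2 =", sigma ** 2, "| n =", n)
print("average Q2:", mean_q2)
print("average Q :", mean_q1 + mean_q2)
```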
Question 2

For part a
Write both estimators as linear combinations of the responses, with the $x_j$ treated as fixed:

$$\bar{y} = \sum_j \frac{1}{n}\, y_j, \qquad \hat{\beta}_1 = \sum_j c_j y_j, \qquad c_j = \frac{x_j - \bar{x}}{S_{XX}}, \qquad S_{XX} = \sum_j (x_j - \bar{x})^2.$$

The $y_j$ are uncorrelated with common variance $\sigma^2$, so only the matching terms contribute to the covariance:

$$\mathrm{Cov}(\bar{y}, \hat{\beta}_1) = \sum_j \frac{1}{n}\, c_j\, \mathrm{Var}(y_j) = \frac{\sigma^2}{n S_{XX}} \sum_j (x_j - \bar{x}) = 0,$$

because the deviations $x_j - \bar{x}$ sum to zero. We can conclude that Cov(ȳ, β̂1) = 0.

For part b
Since $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$ and $\bar{x}$ is a fixed constant,

$$\mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) = \mathrm{Cov}(\bar{y} - \bar{x}\hat{\beta}_1,\ \hat{\beta}_1) = \mathrm{Cov}(\bar{y}, \hat{\beta}_1) - \bar{x}\,\mathrm{Cov}(\hat{\beta}_1, \hat{\beta}_1) = 0 - \bar{x}\,\mathrm{Var}(\hat{\beta}_1).$$

With $S_{XX} = \sum_j (x_j - \bar{x})^2$ and $\hat{\beta}_1 = \sum_j (x_j - \bar{x})\, y_j / S_{XX}$,

$$\mathrm{Var}(\hat{\beta}_1) = \frac{\sigma^2 \sum_j (x_j - \bar{x})^2}{S_{XX}^2} = \frac{\sigma^2}{S_{XX}}.$$

Therefore:

$$\mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) = -\bar{x}\,\frac{\sigma^2}{S_{XX}}$$
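Both covariances in Question 2 can be verified exactly by treating each estimator as a weighted sum of the responses. The sketch below assumes uncorrelated responses with common variance σ² (set to 1 here) and uses hypothetical design points; `cov_linear` is an illustrative helper, not something defined in the source.

```python
from fractions import Fraction

# Hypothetical design points (not from the source). sigma^2 is set to 1,
# so the printed covariances are in units of sigma^2.
x = [Fraction(v) for v in (1, 2, 4, 7, 11)]
n = len(x)
x_bar = sum(x) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)

# Each estimator is a linear combination sum(w_j * y_j) of the responses.
w_ybar = [Fraction(1, n)] * n                          # weights of y-bar
w_b1 = [(xi - x_bar) / s_xx for xi in x]               # weights of beta1-hat
w_b0 = [a - x_bar * c for a, c in zip(w_ybar, w_b1)]   # beta0-hat = y-bar - x-bar * beta1-hat


def cov_linear(u, v, sigma2=Fraction(1)):
    """Cov(sum u_j y_j, sum v_j y_j) for uncorrelated y_j with variance sigma2."""
    return sigma2 * sum(uj * vj for uj, vj in zip(u, v))


print("Cov(y-bar, beta1-hat)     =", cov_linear(w_ybar, w_b1))
print("Cov(beta0-hat, beta1-hat) =", cov_linear(w_b0, w_b1))
print("-x-bar / S_XX             =", -x_bar / s_xx)
```

Exact rational arithmetic is used so that the zero covariance appears as exactly 0 rather than as a small rounding error.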
Question 3
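We look for a matrix C such that

$$\mathrm{Cov}(x,\ y - Cx) = 0.$$

Using the linearity of covariance in its second argument, this expands to:

$$\mathrm{Cov}(x, y) - \mathrm{Cov}(x, Cx) = \mathrm{Cov}(x, y) - \mathrm{Cov}(x, x)\, C^T = 0.$$

Here, Cov(x, x) is the covariance matrix of x, denoted $\Sigma_{xx}$, and Cov(x, y) is the cross-covariance matrix, denoted $\Sigma_{xy}$, with $\Sigma_{yx} = \Sigma_{xy}^T$. So the equation becomes:

$$\Sigma_{xy} - \Sigma_{xx} C^T = 0, \qquad \text{that is,} \qquad \Sigma_{xx} C^T = \Sigma_{xy}.$$

Since $\Sigma_{xx}$ is positive definite (given Cov(x) > 0), it is invertible. Thus, we can solve for C, using the symmetry of $\Sigma_{xx}$:

$$C^T = \Sigma_{xx}^{-1} \Sigma_{xy}, \qquad C = \Sigma_{yx} \Sigma_{xx}^{-1}.$$

Substituting this into the original equation gives zero:

$$\mathrm{Cov}(x,\ y - Cx) = \Sigma_{xy} - \Sigma_{xx} C^T = \Sigma_{xy} - \Sigma_{xx} \Sigma_{xx}^{-1} \Sigma_{xy} = \Sigma_{xy} - \Sigma_{xy} = 0.$$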
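The result of Question 3 can be checked on a small numerical case. This is a sketch with hypothetical 2×2 covariance blocks, writing Σ_xy for Cov(x, y) and Σ_yx for its transpose, and taking C = Σ_yx Σ_xx⁻¹; the helper functions are illustrative and handle only the 2×2 case needed here.

```python
from fractions import Fraction as F

# Hypothetical covariance blocks (not from the source).
# sigma_xx = Cov(x, x) is symmetric positive definite.
sigma_xx = [[F(4), F(1)],
            [F(1), F(3)]]
sigma_xy = [[F(2), F(0)],
            [F(1), F(5)]]   # Cov(x, y); Cov(y, x) is its transpose


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


sigma_yx = transpose(sigma_xy)
c = matmul(sigma_yx, inverse_2x2(sigma_xx))   # C = Sigma_yx * Sigma_xx^{-1}

# Cov(x, y - Cx) = Sigma_xy - Sigma_xx * C^T, which should be the zero matrix.
sigma_xx_ct = matmul(sigma_xx, transpose(c))
residual_cov = [[sigma_xy[i][j] - sigma_xx_ct[i][j] for j in range(2)]
                for i in range(2)]
print(residual_cov)
```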

Question 4
(a) Prediction equation
Based on the regression analysis, the prediction equation is:
Viscosity = 1.28151 − 0.0087578 × Temperature (°C)

Multiple R           0.979874221
R Square             0.960153489
Adjusted R Square    0.953512404
Standard Error       0.047433597
Observations         8

ANOVA
              df    SS          MS          F           Significance F
Regression    1     0.325292    0.325292    144.5778    2.01E-05
Residual      6     0.0135      0.00225
Total         7     0.338792

                    Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept           1.281510655     0.046868          27.34278    1.58E-07    1.166828     1.396193
Temperature (°C)    -0.008757822    0.000728          -12.0241    2.01E-05    -0.01054     -0.00698
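The entries of the regression output are linked by a few identities, which gives a quick consistency check. The sketch below uses only values copied from the tables above; small differences from the printed F value come from the rounding of the displayed sums of squares.

```python
# Values copied from the regression output above.
ss_regression = 0.325292
ss_total = 0.338792
df_regression = 1
df_residual = 6
slope_t_stat = -12.0241

ss_residual = ss_total - ss_regression          # residual sum of squares
ms_regression = ss_regression / df_regression   # regression mean square
ms_residual = ss_residual / df_residual         # residual mean square

f_stat = ms_regression / ms_residual            # F = MSR / MSE
r_squared = ss_regression / ss_total            # R Square = SSR / SST
standard_error = ms_residual ** 0.5             # Standard Error = sqrt(MSE)

print("residual SS      :", round(ss_residual, 6))
print("F statistic      :", round(f_stat, 2))
print("R Square         :", round(r_squared, 6))
print("Standard Error   :", round(standard_error, 6))
# With a single predictor, the squared slope t statistic equals F.
print("slope t Stat ^ 2 :", round(slope_t_stat ** 2, 2))
```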

(b) Analysis using ANOVA


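The regression ANOVA table in part (a) gives F = 144.58 on (1, 6) degrees of freedom with a Significance F of 2.01×10⁻⁵, so the linear relationship between temperature and viscosity is significant at the 5% level: temperature does affect viscosity in this dataset. The single-factor ANOVA below gives an F-statistic of 46.4786, well above the critical value (4.6001), with a P-value of 8.36×10⁻⁶, but it compares the mean of the temperature column with the mean of the viscosity column, so on its own it does not test whether viscosity depends on temperature.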
Anova: Single Factor

SUMMARY
Groups               Count    Sum       Average     Variance
Temperature (°C)     8        480.7     60.0875     605.8755
Viscosity (mPa.s)    8        6.0422    0.755275    0.048399

ANOVA
Source of Variation    SS          df    MS          F           P-value     F crit
Between Groups         14081.25    1     14081.25    46.47861    8.36E-06    4.60011
Within Groups          4241.468    14    302.962
Total                  18322.72    15

Question 5
(a) Prediction equation
The estimated prediction equation for viscosity based on the ratio of sebacic acid is:
Viscosity = 0.6714 − 0.2964 × Ratio
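Where:
The estimated intercept is β̂0 = 0.6714,
The estimated slope is β̂1 = −0.2964.
The variance of the random error σ² is estimated as SSE/n = 0.134982/8 ≈ 0.01687 (the maximum-likelihood estimate), using the residual sum of squares from the regression output in part (c). The unbiased estimate is the residual mean square, SSE/(n − 2) = 0.134982/6 ≈ 0.0225.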
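The coefficients and both variance estimates can be reproduced from the summary statistics reported in the tables for this question (means 0.65 and 0.47875, variances 0.06 and 0.024555357, Pearson correlation −0.463364287, n = 8). The sketch below assumes those printed values are sample statistics with divisor n − 1.

```python
import math

# Summary statistics copied from the tables in this question.
n = 8
mean_x, mean_y = 0.65, 0.47875      # Ratio, Viscosity
var_x, var_y = 0.06, 0.024555357    # sample variances (divisor n - 1)
r = -0.463364287                    # Pearson correlation

# Least-squares slope and intercept expressed through r and the variances.
slope = r * math.sqrt(var_y / var_x)
intercept = mean_y - slope * mean_x

# Residual sum of squares, then the two estimates of the error variance.
ss_total = (n - 1) * var_y
ss_residual = (1 - r ** 2) * ss_total
sigma2_ml = ss_residual / n              # maximum-likelihood estimate
sigma2_unbiased = ss_residual / (n - 2)  # residual mean square

print("slope     :", round(slope, 4))
print("intercept :", round(intercept, 4))
print("SSE       :", round(ss_residual, 6))
print("SSE / n   :", round(sigma2_ml, 5))
print("SSE/(n-2) :", round(sigma2_unbiased, 6))
```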
(b) ANOVA Table
Anova: Single Factor

SUMMARY
Groups       Count    Sum     Average    Variance
Ratio        8        5.2     0.65       0.06
Viscosity    8        3.83    0.47875    0.024555

ANOVA
Source of Variation    SS          df    MS          F           P-value     F crit
Between Groups         0.117306    1     0.117306    2.774662    0.117976    4.60011
Within Groups          0.591888    14    0.042278
Total                  0.709194    15
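From the single-factor ANOVA table, the F statistic is 2.77 against an F crit of 4.60, with a p-value of 0.118, which is greater than 0.05. This test compares the mean of the ratio column with the mean of the viscosity column, so it shows only that the two means do not differ significantly at the 5% level. Whether viscosity depends on the molar ratio of sebacic acid is tested by the regression in part (c).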
t-Test: Paired Two Sample for Means

                                Ratio           Viscosity
Mean                            0.65            0.47875
Variance                        0.06            0.024555357
Observations                    8               8
Pearson Correlation             -0.463364287
Hypothesized Mean Difference    0
df                              7
t Stat                          1.397512321
P(T<=t) one-tail                0.102477856
t Critical one-tail             1.894578605
P(T<=t) two-tail                0.204955713
t Critical two-tail             2.364624252

Based on the t-test results for the paired sample:

• t Stat: 1.3975, which is less than the t Critical two-tail value of 2.3646, so the difference between the mean ratio and the mean viscosity is not statistically significant at the 5% significance level.
• P-value (two-tail): 0.2050, which is greater than the 0.05 threshold, consistent with the same conclusion.
These results indicate that there is no significant difference between the means of the ratio and viscosity. The paired t-test addresses the difference in means rather than the relationship between the two variables; the Pearson correlation of -0.4634 points to a weak negative association.
(c) Linear Regression
Multiple R           0.463364
R Square             0.214706
Adjusted R Square    0.083824
Standard Error       0.14999
Observations         8

ANOVA
              df    SS          MS          F           Significance F
Regression    1     0.036905    0.036905    1.640455    0.247540882
Residual      6     0.134982    0.022497
Total         7     0.171888

R Square: 0.2147 indicates that the model explains only about 21.5% of the variance in viscosity. The F-statistic is 1.6405 with a Significance F of 0.2475, which is greater than typical significance levels (e.g., 0.05), so the regression is not statistically significant: the data do not show a significant linear relationship between the ratio and viscosity. This is consistent with the weak correlation reported in part (b).
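(d) 95% Confidence

[Figure: fitted regression line with the 95% Confidence Band (orange) and the 95% Prediction Band (green).]
For a ratio of x_0 = 0.5, the fitted value is ŷ_0 = 0.6714 − 0.2964 × 0.5 = 0.5232, and the following intervals are calculated with s = 0.14999, n = 8, x̄ = 0.65, S_XX = 7 × 0.06 = 0.42 and t(0.975, 6) = 2.447:
• 95% Confidence Interval for the mean response, ŷ_0 ± t · s · √(1/n + (x_0 − x̄)²/S_XX): [0.3681, 0.6783]
• 95% Prediction Interval for a new observation, ŷ_0 ± t · s · √(1 + 1/n + (x_0 − x̄)²/S_XX): approximately [0.125, 0.922]
The confidence interval is the range expected to contain the true mean viscosity at x_0 = 0.5, and the prediction interval is the range expected to contain a single new observation at that ratio, each with 95% confidence. The prediction interval is wider because it also accounts for the random error of the new observation.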
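The interval calculation at x_0 = 0.5 can be sketched from the reported quantities. The critical value t(0.975, 6) ≈ 2.446912 is not given in the source and is taken here from standard t tables as an assumption; the other inputs are copied from the tables in this question.

```python
import math

# Inputs copied from the regression output and summary tables in this question.
n = 8
mean_x = 0.65                        # mean of Ratio
var_x = 0.06                         # sample variance of Ratio
intercept, slope = 0.6714, -0.2964   # fitted coefficients
s = 0.14999                          # regression Standard Error
x0 = 0.5

# Assumption: two-sided 95% critical value of Student's t with n - 2 = 6 df.
t_crit = 2.446912

s_xx = (n - 1) * var_x
y0_hat = intercept + slope * x0
leverage = 1 / n + (x0 - mean_x) ** 2 / s_xx

half_ci = t_crit * s * math.sqrt(leverage)       # half-width for the mean response
half_pi = t_crit * s * math.sqrt(1 + leverage)   # half-width for a new observation

ci = (y0_hat - half_ci, y0_hat + half_ci)
pi = (y0_hat - half_pi, y0_hat + half_pi)
print("fitted value            :", round(y0_hat, 4))
print("95% confidence interval :", tuple(round(v, 4) for v in ci))
print("95% prediction interval :", tuple(round(v, 4) for v in pi))
```

The prediction interval is always the wider of the two, since its standard error adds the variance of a single new observation to the variance of the estimated mean.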
Question 6
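We have a multiple regression model:

$$y = X\beta + \varepsilon \qquad (*)$$

The OLS estimate of β is β̂:

$$\begin{aligned}
\hat{\beta} &= (X^T X)^{-1} X^T y \\
&= (X^T X)^{-1} X^T (X\beta + \varepsilon) \qquad \text{(using } (*)\text{)} \\
&= (X^T X)^{-1} (X^T X)\beta + (X^T X)^{-1} X^T \varepsilon \\
&= \beta + (X^T X)^{-1} X^T \varepsilon \\
&= \beta + R\varepsilon, \qquad \text{where } R = (X^T X)^{-1} X^T
\end{aligned}$$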
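The identity β̂ = β + Rε can be confirmed on a small example. The sketch below uses a hypothetical design matrix with an intercept column and one regressor, hypothetical true coefficients and one fixed error vector; the helper functions are illustrative and the inverse handles only the 2×2 case needed here.

```python
from fractions import Fraction as F

# Hypothetical design matrix, true coefficients and errors (not from the source).
X = [[F(1), F(0)],
     [F(1), F(1)],
     [F(1), F(2)],
     [F(1), F(4)]]
beta = [F(3), F(-2)]
eps = [F(1, 2), F(-1), F(1, 4), F(1, 3)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]


Xt = transpose(X)
R = matmul(inverse_2x2(matmul(Xt, X)), Xt)   # R = (X^T X)^{-1} X^T

# y = X beta + eps, then beta_hat = R y.
y = [sum(xij * bj for xij, bj in zip(row, beta)) + e for row, e in zip(X, eps)]
beta_hat = [sum(rij * yj for rij, yj in zip(row, y)) for row in R]
r_eps = [sum(rij * ej for rij, ej in zip(row, eps)) for row in R]

estimation_error = [bh - b for bh, b in zip(beta_hat, beta)]
print("beta_hat - beta :", estimation_error)
print("R eps           :", r_eps)
```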
