Multiple Linear Regression
FIGURE 9.1 Strategy for Building a Regression Model. [Flowchart: data collection and preparation; reduction of explanatory variables (for exploratory observational studies); investigate curvature and interaction effects more fully; model refinement and selection; model validation.]
Model validation. The final step in the model-building process is to validate the selected regression model. Validation is concerned with assessing the validity of the fitted model and of the inferences drawn from the regression analysis. Several methods of assessing model validity are described in Section 9.6.
A hospital surgical unit was interested in predicting survival in patients undergoing a particular type of liver operation. A random selection of patients was available for analysis. From each patient record, the preoperative information listed in Table 9.1 was extracted.

TABLE 9.1 Potential Predictor Variables and Response Variable - Surgical Unit Example.

X1      blood clotting score
X2      prognostic index
X3      enzyme function test score
X4      liver function test score
X5      age, in years
X6      indicator variable for gender (0 = male; 1 = female)
X7, X8  indicator variables for history of alcohol use:

    Alcohol Use    X7    X8
    None            0     0
    Moderate        1     0
    Severe          0     1

The response variable is survival time.
To illustrate the model-building procedures discussed in this and the next section, we will use the Surgical Unit example. The investigator first examined scatter plots of the response against each of the explanatory variables; the scatter plot matrix and the correlation matrix were also obtained (not shown). A first-order regression model based on all predictor variables was fitted to serve as a starting point. A plot of residuals against predicted values for this fitted model is shown in Figure 9.2a. The plot suggests that both curvature and nonconstant error variance are apparent. In addition, some departure from normality is suggested by the normal probability plot of residuals in Figure 9.2b. To make the distribution of the error terms more nearly normal and to see if the same transformation would also reduce the apparent curvature, the investigator examined the logarithmic transformation Y' = ln Y.
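The effect of the logarithmic transformation can be sketched in a few lines of code. The data below are hypothetical (not the Surgical Unit data), generated so that ln Y is exactly linear in a single predictor; fitting the first-order model to the transformed response Y' = ln Y then recovers the linear structure that a raw-scale fit would distort.

```python
import math

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    XtX = [[sum(z[a] * z[b] for z in Z) for b in range(p)] for a in range(p)]
    Xty = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(p)]
    return solve(XtX, Xty)

# Hypothetical survival times: ln Y = 0.5 + 0.3 X exactly
x = [float(i) for i in range(10)]
y = [math.exp(0.5 + 0.3 * xi) for xi in x]

# First-order model fitted to the transformed response Y' = ln Y
b = ols([[xi] for xi in x], [math.log(yi) for yi in y])
print(b)  # recovers [0.5, 0.3] up to rounding
```

Fitting Y itself on x in this situation would leave curvature in the residual plot; fitting ln Y removes it entirely.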
FIGURE 9.2 Some Preliminary Residual Plots - Surgical Unit Example. (a) Residuals against predicted values, first-order model for Y. (b) Normal probability plot of residuals, first-order model for Y. (c) Residuals against predicted values, first-order model for Y' = ln Y. (d) Normal probability plot of residuals, first-order model for Y'.
Figures 9.2c and 9.2d present the corresponding plots for the first-order model when the transformed response Y' = ln Y is used. The curvature is much reduced, and the distribution of the error terms is more nearly normal, as shown by the normal probability plot. The investigator also obtained a scatter plot matrix and the correlation matrix with the transformed Y variable; these are presented in Figure 9.3. In addition, various scatter and
FIGURE 9.3 Scatter Plot Matrix and Correlation Matrix when Response Variable Is Y' - Surgical Unit Example. (Scatter plot matrix panels: LnSurvival, Bloodclot, Progindex, Enzyme, Liver.)

Multivariate Correlations

              Bloodclot   Progindex   Enzyme
Bloodclot        1.0000      0.2462   0.4699
Progindex        0.2462      1.0000   0.0901
Enzyme           0.4699      0.0901   1.0000
Liver            0.6539     -0.1496  -0.0236
LnSurvival       0.6493      0.5024   0.3690
residual plots were obtained (not shown here). All of these plots indicate that each of the predictor variables is linearly associated with Y', with X3 and X4 showing the highest degrees of association and X1 the lowest. The scatter plot matrix and the correlation matrix further show intercorrelations among the potential predictor variables. In particular, X4 has moderately high pairwise correlations with X1, X2, and X3. On the basis of these analyses, the investigator concluded to use, at this stage of the model-building process, Y' = ln Y as the response variable, to represent the predictor variables in linear terms, and not to include any interaction terms. The next stage in the model-building process is to examine whether all of the potential predictor variables are needed or whether a subset of them is adequate. A number of useful measures have been developed to assess the adequacy of the various subsets. We now turn to a discussion of these measures.
9.3 Criteria for Model Selection
TABLE 9.2 SSEp, Rp2, Ra,p2, Cp, AICp, SBCp, and PRESSp Values for All Possible Regression Models - Surgical Unit Example.

(1) Variables     (2)   (3)      (4)     (5)      (6)       (7)        (8)        (9)
in Model           p    SSEp     Rp2     Ra,p2    Cp        AICp       SBCp       PRESSp
None               1    12.808   0.000   0.000    151.498    -75.703    -73.714   13.296
X1                 2    12.031   0.061   0.043    141.164    -77.079    -73.101   13.512
X2                 2     9.979   0.221   0.206    108.556    -87.178    -83.200   10.744
X3                 2     7.332   0.428   0.417     66.489   -103.827    -99.849    8.327
X4                 2     7.409   0.422   0.410     67.715   -103.262    -99.284    8.025
X1, X2             3     9.443   0.263   0.234    102.031    -88.162    -82.195   11.062
X1, X3             3     5.781   0.549   0.531     43.852   -114.658   -108.691    6.988
X1, X4             3     7.299   0.430   0.408     67.972   -102.067    -96.100    8.472
X2, X3             3     4.312   0.663   0.650     20.520   -130.483   -124.516    5.065
X2, X4             3     6.622   0.483   0.463     57.215   -107.324   -101.357    7.476
X3, X4             3     5.130   0.599   0.584     33.504   -121.113   -115.146    6.121
X1, X2, X3         4     3.109   0.757   0.743      3.391   -146.161   -138.205    3.914
X1, X2, X4         4     6.570   0.487   0.456     58.388   -105.748    -97.792    7.903
X1, X3, X4         4     4.968   0.612   0.589     32.933   -120.844   -112.888    6.207
X2, X3, X4         4     3.614   0.718   0.701     11.421   -138.023   -130.067    4.597
X1, X2, X3, X4     5     3.084   0.759   0.740      5.000   -144.590   -134.645    4.069
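The criteria tabulated above are straightforward to compute for any candidate subset. The sketch below is a minimal pure-Python illustration on hypothetical data (not the Surgical Unit data), using Cp = SSEp/MSE(full) - (n - 2p), AICp = n ln SSEp - n ln n + 2p, SBCp = n ln SSEp - n ln n + p ln n, and PRESSp built from the deleted residuals ei/(1 - hii).

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols_fit(X, y):
    """Fit y on X (intercept added); return coefficients and residuals."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    XtX = [[sum(z[a] * z[b] for z in Z) for b in range(p)] for a in range(p)]
    Xty = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    resid = [yi - sum(bj * zj for bj, zj in zip(beta, z)) for z, yi in zip(Z, y)]
    return beta, resid

def hat_diagonal(X):
    """Diagonal of the hat matrix H = Z (Z'Z)^{-1} Z', with Z = [1 | X]."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    XtX = [[sum(z[a] * z[b] for z in Z) for b in range(p)] for a in range(p)]
    inv_cols = [solve(XtX, [1.0 if r == c else 0.0 for r in range(p)]) for c in range(p)]
    return [sum(z[a] * inv_cols[b][a] * z[b] for a in range(p) for b in range(p)) for z in Z]

def subset_criteria(X_full, y, cols):
    """SSEp, Cp, AICp, SBCp, PRESSp for the subset of predictor columns `cols`."""
    n = len(y)
    Xs = [[row[c] for c in cols] for row in X_full]
    _, e = ols_fit(Xs, y)
    sse = sum(ei * ei for ei in e)
    p = len(cols) + 1                      # parameters, including the intercept
    _, ef = ols_fit(X_full, y)
    mse_full = sum(v * v for v in ef) / (n - len(X_full[0]) - 1)
    cp = sse / mse_full - (n - 2 * p)
    aic = n * math.log(sse) - n * math.log(n) + 2 * p
    sbc = n * math.log(sse) - n * math.log(n) + p * math.log(n)
    h = hat_diagonal(Xs)
    press = sum((ei / (1.0 - hi)) ** 2 for ei, hi in zip(e, h))
    return sse, cp, aic, sbc, press

# Hypothetical data: y depends on the first two predictors only
n = 30
X = [[float(i), math.sin(i), math.cos(3 * i)] for i in range(n)]
y = [2.0 + 1.5 * i + 2.0 * math.sin(i) + 0.1 * math.sin(7 * i) for i in range(n)]

crit_12 = subset_criteria(X, y, [0, 1])       # the "right" subset
crit_full = subset_criteria(X, y, [0, 1, 2])  # full model: Cp equals p exactly
print(crit_12, crit_full)
```

Two built-in checks: for the full model, Cp always equals p exactly, and PRESSp can never be smaller than SSEp because each deleted residual magnifies the ordinary residual by 1/(1 - hii).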
considered sufficiently helpful to enter the regression model. Since the degrees of freedom associated with MSE vary depending on the number of X variables in the model, and since repeated tests on the same data are undertaken, fixed t* limits for adding or deleting a variable have no precise probabilistic meaning. For this reason, software programs often favor the use of predetermined α limits.
fits all regression models with two X variables, where X7 is one of the pair. For each such regression model, the t* test statistic corresponding to the newly added predictor Xk is obtained. This is the statistic for testing whether or not βk = 0 when X7 and Xk are the variables in the model. The X variable with the largest t* value, or equivalently the smallest P-value, is the candidate for addition at the second stage. If this t* value exceeds a predetermined level (i.e., the P-value falls below a predetermined level), the second X variable is added. Otherwise, the program terminates.
3. Suppose X3 is added at the second stage. Now the stepwise regression routine examines whether any of the other X variables already in the model should be dropped. For our illustration, there is at this stage only one other X variable in the model, X7, so that only one t* test statistic is obtained:

    t* = b7 / s{b7}     (9.19)
At later stages, there would be a number of these t* statistics, one for each of the variables in the model besides the one last added. The variable for which this t* value is smallest (or equivalently the variable for which the P-value is largest) is the candidate for deletion. If this t* value falls below, or the P-value exceeds, a predetermined limit, the variable is dropped from the model; otherwise, it is retained.
4. Suppose X7 is retained so that both X3 and X7 are now in the model. The stepwise regression routine now examines which X variable is the next candidate for addition, then examines whether any of the variables already in the model should now be dropped, and so on until no further X variables can either be added or deleted, at which point the search terminates. Note that the stepwise algorithm allows an X variable, brought into the model at an earlier stage, to be dropped subsequently if it is no longer helpful in conjunction with variables added at later stages.

Figure 9.7 shows the MINITAB computer printout for the forward stepwise regression procedure. The maximum acceptable α limit for adding a variable is 0.10 and the minimum acceptable α limit for removing a variable is 0.15, as shown at the top of Figure 9.7. We now follow through the steps.
1. At the start of the stepwise search, no X variable is in the model so that the model to be fitted is Yi = β0 + εi. In step 1, the t* statistics (9.18) and corresponding P-values are calculated for each potential X variable, and the predictor with the smallest P-value (largest t* value) is chosen to enter the equation. We see that Enzyme (X3) had the largest
FIGURE 9.7 MINITAB Forward Stepwise Regression Output - Surgical Unit Example. (Alpha-to-Enter: 0.1; Alpha-to-Remove: 0.15; N = 54. Predictors entered in order: Enzyme, Proglnde, Histheav, Bloodclo; for each step the printout gives the estimated coefficient, T-Value, P-Value, S, R-Sq, R-Sq(adj), and C-p.)
test statistic:

    t3* = b3 / s{b3} = 0.015124 / 0.002427 = 6.23
The P-value for this test statistic is 0.000, which falls below the maximum acceptable α-to-enter value of 0.10; hence Enzyme (X3) is added to the model.
2. At this stage, step 1 has been completed. The printout displays, near the top of the column labeled Step 1, the estimated regression coefficient for Enzyme together with its t* statistic and P-value; also given for the fitted model are s, R-Sq (41.66), R-Sq(adj), and C-p (117.4).
3. Next, all regression models containing X3 and one of the remaining potential X variables are fitted, and the t* statistics and corresponding P-values are obtained; these t* tests are equivalent to the partial F* tests based on MSR(Xk | X3) and MSE(X3, Xk). Progindex (X2) has the highest t* value, and its P-value (0.000) falls below 0.10, so that X2 now enters the model.
Step 2 in Figure 9.7 summarizes the situation at this point. Enzyme and Progindex (X3 and X2) are now in the model, and information about this model is provided. At this point, a test of whether Enzyme (X3) should be dropped is undertaken, but because the P-value (0.000) corresponding to X3 is not above 0.15, this variable is retained.
4. Next, all regression models containing X2, X3, and one of the remaining potential X variables are fitted. The appropriate t* statistics now are:

    tk* = bk / s{bk}
The predictor labeled Histheavy (X8) had the largest t* value (P-value = 0.000) and was next added to the model.
5. The column labeled Step 3 in Figure 9.7 summarizes the situation at this point. X2, X3, and X8 are now in the model. Next, a test is undertaken to determine whether X2 or X3 should be dropped. Since both of the corresponding P-values are less than 0.15, neither predictor is dropped from the model.
6. At step 4 Bloodclot (X1) is added, and no terms previously included were dropped. The right-most column of Figure 9.7 summarizes the addition of variable X1 into the model containing variables X2, X3, and X8. Next, a test is undertaken to determine whether X2, X3, or X8 should be dropped. Since all P-values are less than 0.15 (all are 0.000), all variables are retained.
7. Finally, the stepwise regression routine considers adding one of X4, X5, X6, or X7 to the model containing X1, X2, X3, and X8. In each case, the P-values are greater than 0.10 (not shown); therefore, no additional variables can be added to the model and the search process is terminated. Thus, the stepwise search algorithm identifies (X1, X2, X3, X8) as the "best" subset of X variables. This model also happens to be the model identified by both the SBCp and PRESSp criteria in our previous analyses based on an assessment of "best" subset selection.
Comments
1. The choice of α-to-enter and α-to-remove values essentially represents a balancing of opposing tendencies. Simulation studies have shown that for large pools of predictor variables that have been generated to be uncorrelated with the response variable, use of large or moderately large α-to-enter values as the entry criterion results in a procedure that is too liberal; that is, it allows too many predictor variables into the model. On the other hand, models produced by an automatic selection procedure with small α-to-enter values are often underspecified, resulting in σ² being badly overestimated and the procedure being too conservative (see, for example, References 9.2 and 9.3).
2. The maximum acceptable α-to-enter value should never be larger than the minimum acceptable α-to-remove value; otherwise, cycling is possible where a variable is continually entered and removed.
3. The order in which variables enter the regression model does not reflect their importance. At times, a variable may enter the model, only to be dropped at a later stage because it can be predicted well from the other predictors that have been subsequently added.

Other stepwise procedures are available to find a "best" subset of the predictor variables. We mention two of these.
Comments
1. Some computer regression programs use the reciprocal of the variance inflation factor to detect instances where an X variable should not be allowed into the fitted regression model because of excessively high interdependence between this variable and the other X variables in the model. Tolerance limits for 1/(VIF)k = 1 - Rk² frequently used are .01, .001, or .0001, below which the variable is not entered into the model.
2. A limitation of variance inflation factors is that they cannot distinguish between several simultaneous multicollinearities.
3. A number of other formal methods for detecting multicollinearity have been proposed. These are more complex than variance inflation factors and are discussed in specialized texts such as References 10.5 and 10.6.
10.6 Surgical Unit Example - Continued
Next, the six two-factor interaction terms among X1, X2, X3, and X8 were examined. Plots of the residuals against each interaction term (not shown) did not suggest that any strong two-variable interactions are present and need to be added to the model. The absence of any strong interaction effects was confirmed formally: the P-value of the test for dropping the interaction terms from the model containing both the first-order effects and the interaction effects is .35, indicating that interaction effects are not needed. Figure 10.9 contains some of the additional plots that were generated to check on the adequacy of the first-order model:
    Yi' = β0 + β1Xi1 + β2Xi2 + β3Xi3 + β8Xi8 + εi     (10.45)

where Yi' = ln Yi.
1. The residual plot against the fitted values in Figure 10.9a shows no evidence of serious departures from the model.
2. One of the three candidate models identified in the model-building studies of Section 9.6 contained X5 (patient age). The sign of b5 was negative in model (9.23) but became positive in other fitted models. We will now use a residual plot and an added-variable plot to study graphically
FIGURE 10.9 Residual and Added-Variable Plots - Surgical Unit Example. (a) Residuals against predicted values. (b) Residuals against X5 for the model containing X1, X2, X3, X8. (c) Added-variable plot: e(Y' | X1, X2, X3, X8) against e(X5 | X1, X2, X3, X8). (d) Normal probability plot of residuals against expected values.
the strength of the marginal relationship between X5 and the response, when X1, X2, X3, and X8 are already in the model. Figure 10.9b shows the plot of the residuals for the model containing X1, X2, X3, and X8 against X5, the predictor variable not in the model. This plot shows no need to include patient age (X5) in the model to predict logarithm of survival time. A better view of this marginal relationship is provided by the added-variable plot in Figure 10.9c. The slope coefficient b5 can be seen again to be slightly negative, as depicted by the solid line in the added-variable plot. Overall, however, the marginal relationship between X5 and Y' is weak. The P-value of the formal t test (9.18) for dropping X5 from the model containing X1, X2, X3, X8, and X5 is 0.194. In addition, the plot shows that the negative slope is driven largely by one or two outliers: one in the upper left region of the plot, and one in the lower right region. In this way the added-variable plot provides additional support for dropping X5.
3. The normal probability plot of the residuals in Figure 10.9d shows little departure from linearity. The coefficient of correlation between the ordered residuals and their expected values under normality is .982, which is larger than the critical value for significance level .05 in Table B.6.
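The quantities plotted in an added-variable plot are just two sets of residuals, and the plot's key property is easy to verify in code: the slope of the least-squares line through the plotted points equals the coefficient of the scrutinized variable in the full model (the Frisch-Waugh result). The sketch below uses small hypothetical data, with "x5" standing in for the variable under scrutiny.

```python
import math

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def ols(X, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    Z = [[1.0] + row for row in X]
    p = len(Z[0])
    XtX = [[sum(z[a] * z[b] for z in Z) for b in range(p)] for a in range(p)]
    Xty = [sum(z[a] * yi for z, yi in zip(Z, y)) for a in range(p)]
    return solve(XtX, Xty)

def residuals(X, y):
    """Residuals from regressing y on X (intercept included)."""
    beta = ols(X, y)
    return [yi - beta[0] - sum(bj * xj for bj, xj in zip(beta[1:], row))
            for row, yi in zip(X, y)]

# Hypothetical data: two predictors already in the model, plus x5 under scrutiny
n = 20
base = [[float(i), math.sin(i)] for i in range(n)]
x5 = [math.cos(2 * i) + 0.1 * i for i in range(n)]
y = [1.0 + 1.0 * i + 0.5 * math.sin(i) - 0.3 * x5[i] + 0.01 * math.sin(5 * i)
     for i in range(n)]

# Coefficient of x5 in the full regression
b_full = ols([row + [x5[i]] for i, row in enumerate(base)], y)[-1]

# Added-variable plot coordinates: e(Y | base) against e(x5 | base)
e_y = residuals(base, y)
e_x = residuals(base, x5)
slope = sum(a * b for a, b in zip(e_x, e_y)) / sum(a * a for a in e_x)
print(b_full, slope)  # equal, by the Frisch-Waugh result
```

Plotting e_y against e_x therefore shows exactly the marginal contribution of x5 given the variables already in the model, which is what Figure 10.9c displays for patient age.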
Multicollinearity was studied by calculating the variance inflation factors (VIF)k for the four predictor variables X1, X2, X3, and X8; all were close to 1. As may be seen from these results, multicollinearity among the four predictor variables is not a problem.
Figure 10.10 contains index plots of four key regression diagnostics: the residuals ei in Figure 10.10a, the leverage values hii in Figure 10.10b, Cook's distances Di in Figure 10.10c, and (DFFITS)i values in Figure 10.10d. These plots suggest further study of cases 17, 28, and 38. Table 10.6 lists numerical diagnostic values for these cases. The measures presented in these columns are the residuals ei, the studentized deleted residuals ti in (10.24), the leverage values hii in (10.18), the (DFFITS)i values in (10.30), and the Cook's distance measures Di in (10.33). The following points about the diagnostics in Table 10.6 are noteworthy:
1. Case 17 was identified as outlying with regard to its Y value because its studentized deleted residual exceeds three in absolute value. We test formally whether case 17 is outlying by means of the Bonferroni outlier test procedure. For a family significance level of .05 with n = 54, we require t(1 - α/2n; n - p - 1) = t(.99954; 48) = 3.528. Since |t17| = 3.3696 < 3.528, the formal outlier test indicates that case 17 is not an outlier. Still, |t17| is very close to the critical value, and although case 17 does not appear to be outlying to any substantial extent, we may wish to investigate its circumstances to remove any doubts.
2. Cases 23, 28, 32, 38, 42, and 52 were identified as outlying with regard to their X values. Here we see the value of the leverage measure for identifying multivariable outliers.
3. To determine the influence of cases 17, 23, 28, 32, 38, 42, and 52, we consider each of these measures. According to the (DFFITS)i and Cook's distance values, case 17 is the most influential. Referring its Cook's distance value to the F distribution with 5 and 49 degrees of freedom, we note that it corresponds only to about the 11th percentile, so even case 17 does not appear to be unduly influential; the other cases also do not appear to be influential.
An analysis of the effect of case 17 on the inferences of interest was also conducted. Since the model is intended to be used for making predictions, each fitted value Ŷi based on all 54 observations was compared with the fitted value Ŷi(17) obtained when case 17 is deleted in fitting the regression model, using the percent differences:

    100 [Ŷi(17) - Ŷi] / Ŷi
The largest absolute percent difference (which is for case 17) is only minor; case 17 therefore does not have such a disproportionate influence on the fitted values that remedial action would be required.
4. In summary, the diagnostic analyses identified a number of potential problems, but none of these was considered to be serious enough to require further remedial action.
Cited References
10.1. Atkinson, A. C. Plots, Transformations, and Regression. Oxford: Clarendon Press, 1987.
10.2. Mansfield, E. R., and M. D. Conerly. "Diagnostic Value of Residual and Partial Residual Plots," The American Statistician.
10.5. Belsley, D. A., E. Kuh, and R. E. Welsch. Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: John Wiley & Sons, 1980.
Problems
10.1. A student asked: "Why is it necessary to perform diagnostic checks of the fit when R² is large?" Comment.
10.2. A researcher stated: "One good thing about added-variable plots is that they are extremely useful for identifying model adequacy even when the predictor variables are not properly specified in the regression model." Comment.
a. Prepare an added-variable plot for each of the predictor variables.
b. Do your plots in part (a) suggest that the regression relationships in the fitted regression function in Problem 6.5b are inappropriate for any of the predictor variables? Explain.
c. Obtain the fitted regression function in Problem 6.5b by separately regressing both Y and X2 on X1, and then regressing the residuals in an appropriate fashion.
b. Prepare an added-variable plot for each of the predictor variables X1 and X2.
c. Do your plots in part (a) suggest that the regression relationships in the fitted regression function in part (a) are inappropriate for any of the predictor variables? Explain.
d. Obtain the fitted regression function in part (a) by separately regressing both Y and X2 on X1, and then regressing the residuals in an appropriate fashion.
10.7. Refer to Patient satisfaction Problem 6.15c.
a. Prepare an added-variable plot for each of the predictor variables.