Multiple Linear Regression in Data Mining
Contents
2.1. A Review of Multiple Linear Regression
2.2. Illustration of the Regression Process
2.3. Subset Selection in Linear Regression
fitted (predicted) values at the observed values in the data. The sum of squared
differences is given by
$$\sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{i1} - \beta_2 x_{i2} - \cdots - \beta_p x_{ip} \right)^2.$$
Let us denote the values of the coefficients that minimize this expression by
β̂0 , β̂1 , β̂2 , . . . , β̂p . These are our estimates for the unknown values and are called
OLS (ordinary least squares) estimates in the literature. Once we have com-
puted the estimates β̂0 , β̂1 , β̂2 , . . . , β̂p we can calculate an unbiased estimate σ̂²
for σ² using the formula:
$$\hat{\sigma}^2 = \frac{1}{n-p-1} \sum_{i=1}^{n} \left( y_i - \hat{\beta}_0 - \hat{\beta}_1 x_{i1} - \hat{\beta}_2 x_{i2} - \cdots - \hat{\beta}_p x_{ip} \right)^2.$$
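For concreteness, the OLS estimates and σ̂² can be computed with any least-squares routine; the sketch below is a minimal illustration using numpy, assuming the independent variables are held in an n × p array X and the dependent variable in a vector y (all names are illustrative).

```python
import numpy as np

def ols_fit(X, y):
    """OLS coefficient estimates and the unbiased estimate of sigma^2.

    X : (n, p) array of independent variables (without a constant column)
    y : (n,) array of the dependent variable
    """
    n, p = X.shape
    X1 = np.column_stack([np.ones(n), X])               # prepend the constant term beta_0
    beta_hat, *_ = np.linalg.lstsq(X1, y, rcond=None)   # minimizes the sum of squared differences
    residuals = y - X1 @ beta_hat
    sigma2_hat = residuals @ residuals / (n - p - 1)    # unbiased: divide by n - p - 1
    return beta_hat, sigma2_hat
```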
$$E(Y \mid x_1, x_2, \ldots, x_p) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p.$$
3. Unbiasedness: The noise random variable $\varepsilon_i$ has zero mean, i.e., $E(\varepsilon_i) = 0$
for $i = 1, 2, \ldots, n$.
will give the smallest value of squared error on the average. We elaborate on
this idea in the next section.
The Normal distribution assumption was required to derive confidence in-
tervals for predictions. In data mining applications we have two distinct sets of
data: the training data set and the validation data set that are both representa-
tive of the relationship between the dependent and independent variables. The
training data is used to estimate the regression coefficients β̂0 , β̂1 , β̂2 , . . . , β̂p .
The validation data set constitutes a “hold-out” sample and is not used in
computing the coefficient estimates. This enables us to estimate the error in
our predictions without having to assume that the noise variables follow the
Normal distribution. We use the training data to fit the model and to estimate
the coefficients. These coefficient estimates are used to make predictions for
each case in the validation data. The prediction for each case is then compared
to the value of the dependent variable that was actually observed in the validation
data. The average of the squared errors enables us to compare different
models and to assess the accuracy of the model in making predictions.
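A minimal sketch of this workflow, assuming the training and validation cases are held in numpy arrays with illustrative names (X_train, y_train, X_valid, y_valid); models with different sets of independent variables can be compared by the value this function returns.

```python
import numpy as np

def validation_mse(X_train, y_train, X_valid, y_valid):
    """Fit the regression on the training data and return the average
    squared prediction error on the hold-out validation data."""
    X1 = np.column_stack([np.ones(len(y_train)), X_train])
    beta_hat, *_ = np.linalg.lstsq(X1, y_train, rcond=None)
    X1v = np.column_stack([np.ones(len(y_valid)), X_valid])
    errors = X1v @ beta_hat - y_valid     # predicted minus actual, one error per validation case
    return np.mean(errors ** 2)           # average squared error
```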
each variable is 25 and the maximum value is 125. These ratings are answers
to survey questions given to a sample of 25 clerks in each of 30 departments.
The purpose of the analysis was to explore the feasibility of using a question-
naire for predicting the effectiveness of departments, thus saving the considerable
effort required to measure effectiveness directly. The variables are answers to
questions on the survey and are described below.
In Table 2.3 we use ten more cases as the validation data. Applying the previous
equation to the validation data gives the predictions and errors shown in Table
2.3. The last column, entitled error, is simply the difference between the predicted
and the actual rating. For example, for Case 21 the error is 44.46 − 50 = −5.54.
We note that the average error in the predictions is small (−0.52), and so
the predictions are unbiased. Further, the errors are roughly Normal, so this
model gives prediction errors that are within ±14.34 (two standard deviations)
of the true value approximately 95% of the time.
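The two summary figures quoted above come directly from the validation errors; a small sketch of the calculation (array names are illustrative):

```python
import numpy as np

def error_summary(predicted, actual):
    """Average validation error (a check on bias) and a rough 95% band
    of +/- two standard deviations of the errors."""
    errors = np.asarray(predicted) - np.asarray(actual)
    bias = errors.mean()                  # close to zero for unbiased predictions
    band = 2 * errors.std(ddof=1)         # roughly 95% of errors fall within +/- band
    return bias, band
```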
Case Y X1 X2 X3 X4 X5 X6
1 43 51 30 39 61 92 45
2 63 64 51 54 63 73 47
3 71 70 68 69 76 86 48
4 61 63 45 47 54 84 35
5 81 78 56 66 71 83 47
6 43 55 49 44 54 49 34
7 58 67 42 56 66 68 35
8 71 75 50 55 70 66 41
9 72 82 72 67 71 83 31
10 67 61 45 47 62 80 41
11 64 53 53 58 58 67 34
12 67 60 47 39 59 74 41
13 69 62 57 42 55 63 25
14 68 83 83 45 59 77 35
15 77 77 54 72 79 77 46
16 81 90 50 72 60 54 36
17 74 85 64 69 79 79 63
18 65 60 65 75 55 80 60
19 65 70 46 57 75 85 46
20 50 58 68 54 64 78 52
Let us illustrate the last two points using the simple case of two indepen-
dent variables. The reasoning remains valid in the general situation of more
than two independent variables.
$$Y = \beta_1 X_1 + \varepsilon \qquad (2.2)$$
$$Y = \beta_1 X_1 + \beta_2 X_2 + \varepsilon. \qquad (2.3)$$
$$E(\hat{\beta}_2) = 0, \qquad Var(\hat{\beta}_2) = \frac{\sigma^2}{(1 - R_{12}^2)\sum_{i=1}^{n} x_{i2}^2},$$
where $R_{12}$ is the correlation coefficient between $X_1$ and $X_2$.
We notice that β̂1 is an unbiased estimator of β1, and β̂2 is an unbiased
estimator of β2, since β2 = 0 (X2 is irrelevant) and β̂2 has an expected value of
zero. If we use Model (2) we obtain that
$$E(\hat{\beta}_1) = \beta_1, \qquad Var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} x_{i1}^2}.$$
The variance is the expected value of the squared error for an unbiased
estimator, so we are worse off using the model that includes the irrelevant
variable when making predictions. Even if X2 happens to be uncorrelated with
X1, so that $R_{12}^2 = 0$ and the variance of β̂1 is the same in both models, we can
show that a prediction based on Model (3) will have larger variance than a
prediction based on Model (2), due to the added variability introduced by the
estimation of β2.
Although our analysis has been based on one useful independent variable
and one irrelevant independent variable, the result holds true in general. It is
always better to make predictions with models that do not include
irrelevant variables.
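This effect is easy to see in a small Monte Carlo experiment. The sketch below is purely illustrative (the sample size, coefficients, and correlation are arbitrary choices, not part of the chapter's example): it generates data from Y = β1 X1 + ε, fits Model (2) and Model (3), and compares their squared prediction errors at a new point.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta1, sigma, rho = 30, 2.0, 1.0, 0.7      # arbitrary illustrative values
x_new = np.array([1.0, 0.5])                  # point (X1, X2) at which we predict
reps = 10_000
mse2 = mse3 = 0.0

for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.normal(size=n)   # correlated with X1 but irrelevant
    y = beta1 * x1 + sigma * rng.normal(size=n)
    y_new = beta1 * x_new[0] + sigma * rng.normal()

    b2, *_ = np.linalg.lstsq(x1[:, None], y, rcond=None)                 # Model (2): X1 only
    b3, *_ = np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)   # Model (3): X1 and X2

    mse2 += (x_new[0] * b2[0] - y_new) ** 2
    mse3 += (x_new @ b3 - y_new) ** 2

print(mse2 / reps, mse3 / reps)   # Model (2) typically shows the smaller average squared error
```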
If, on the other hand, X2 is a relevant variable (β2 ≠ 0) but we fit Model (2),
then β̂1 is a biased estimator of β1 with bias equal to R12 β2, and its mean
square error is given by:
$$\begin{aligned}
MSE(\hat{\beta}_1) &= E[(\hat{\beta}_1 - \beta_1)^2] \\
&= E[\{\hat{\beta}_1 - E(\hat{\beta}_1) + E(\hat{\beta}_1) - \beta_1\}^2] \\
&= [Bias(\hat{\beta}_1)]^2 + Var(\hat{\beta}_1) \\
&= (R_{12}\beta_2)^2 + \sigma^2.
\end{aligned}$$
If we use Model (3), the least squares estimates have the following expected
values and variances:
$$E(\hat{\beta}_1) = \beta_1, \qquad Var(\hat{\beta}_1) = \frac{\sigma^2}{(1 - R_{12}^2)},$$
$$E(\hat{\beta}_2) = \beta_2, \qquad Var(\hat{\beta}_2) = \frac{\sigma^2}{(1 - R_{12}^2)}.$$
Now let us compare the mean square errors for predicting $Y$ at $X_1 = u_1$, $X_2 = u_2$.
$$\begin{aligned}
MSE_2(\hat{Y}) &= E[(\hat{Y} - Y)^2] \\
&= E[(u_1\hat{\beta}_1 - u_1\beta_1 - \varepsilon)^2] \\
&= u_1^2\, MSE_2(\hat{\beta}_1) + \sigma^2 \\
&= u_1^2 (R_{12}\beta_2)^2 + u_1^2\sigma^2 + \sigma^2.
\end{aligned}$$
$$\begin{aligned}
MSE_3(\hat{Y}) &= E[(\hat{Y} - Y)^2] \\
&= E[(u_1\hat{\beta}_1 + u_2\hat{\beta}_2 - u_1\beta_1 - u_2\beta_2 - \varepsilon)^2] \\
&= Var(u_1\hat{\beta}_1 + u_2\hat{\beta}_2) + \sigma^2, \quad \text{because now } \hat{Y} \text{ is unbiased} \\
&= u_1^2\, Var(\hat{\beta}_1) + u_2^2\, Var(\hat{\beta}_2) + 2u_1 u_2\, Covar(\hat{\beta}_1, \hat{\beta}_2) + \sigma^2 \\
&= \frac{u_1^2 + u_2^2 - 2u_1 u_2 R_{12}}{1 - R_{12}^2}\,\sigma^2 + \sigma^2.
\end{aligned}$$
Model (2) can lead to lower mean squared error for many combinations
of values for u1 , u2 , R12 , and (β2 /σ)2 . For example, if u1 = 1, u2 = 0, then
M SE2(Ŷ ) < M SE3(Ŷ ), when
$$(R_{12}\beta_2)^2 + \sigma^2 < \frac{\sigma^2}{1 - R_{12}^2},$$
i.e., when
$$\frac{|\beta_2|}{\sigma} < \frac{1}{\sqrt{1 - R_{12}^2}}.$$
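As a quick numerical illustration (the value of R12 here is arbitrary), take R12 = 0.8:
$$\frac{1}{\sqrt{1 - 0.8^2}} = \frac{1}{\sqrt{0.36}} \approx 1.67,$$
so at u1 = 1, u2 = 0 the smaller Model (2) gives the lower mean square error whenever |β2| is less than about 1.67 σ, that is, whenever the effect of X2 is modest relative to the noise.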
Forward Selection
Here we keep adding variables one at a time to construct what we hope is a
reasonably good subset. The steps are as follows:
2. Compute the reduction in the sum of squares of the residuals (SSR) ob-
tained by including each variable that is not presently in S. We denote
by SSR(S) the sum of squared residuals when the model consists of the
set S of variables. Let σ̂²(S) be an unbiased estimate of σ² for the model
consisting of the set S of variables. For the variable, say i, that gives the
largest reduction in SSR, compute
Backward Elimination
1. Start with all variables in S.
2. Compute the increase in the sum of squares of the residuals (SSR) ob-
tained by excluding each variable that is presently in S. For the variable,
say, i, that gives the smallest increase in SSR compute
$$F_i = \min_{i \in S} \frac{SSR(S - \{i\}) - SSR(S)}{\hat{\sigma}^2(S)}$$
If Fi < Fout , where Fout is a threshold (typically between 2 and 4) then
drop i from S.
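A minimal sketch of one Backward Elimination pass, using numpy; the helper ssr and the list S of column indices are illustrative names, and the sketch assumes a constant term is always included in the model.

```python
import numpy as np

def ssr(X, y, cols):
    """Sum of squared residuals for the model with a constant and the given columns of X."""
    X1 = np.column_stack([np.ones(len(y)), X[:, cols]])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    r = y - X1 @ beta
    return r @ r

def backward_step(X, y, S, F_out=2.0):
    """Try to drop one variable from the list of column indices S using the F criterion."""
    n = len(y)
    sigma2_S = ssr(X, y, S) / (n - len(S) - 1)       # unbiased estimate of sigma^2 for model S
    F = {i: (ssr(X, y, [j for j in S if j != i]) - ssr(X, y, S)) / sigma2_S for i in S}
    i_min = min(F, key=F.get)                        # variable whose removal increases SSR least
    if F[i_min] < F_out:                             # drop it if F_i is below the threshold
        S = [j for j in S if j != i_min]
    return S
```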
Backward Elimination has the advantage that all variables are included in
S at some stage. This addresses a drawback of Forward Selection: it will never
select a variable that is better than a previously selected variable with which it
is strongly correlated. The disadvantage is that the full model with all variables
is required at the start, and this can be time-consuming and numerically unstable.
Step-wise Regression
This procedure is like Forward Selection except that at each step we consider
dropping variables as in Backward Elimination.
Convergence is guaranteed if the thresholds Fout and Fin satisfy: Fout <
Fin . It is possible, however, for a variable to enter S and then leave S at a
subsequent step and even rejoin S at a yet later step.
As stated above, these methods pick one best subset. There are straight-
forward variations of the methods that identify several close-to-best choices
for different sizes of independent variable subsets.
None of the above methods is guaranteed to yield the best subset for a
criterion such as adjusted R² (defined later in this note). They are reasonable
methods for situations with large numbers of independent variables, but for
moderate numbers of independent variables the method discussed next is
preferable.
$$R^2_{adj} = 1 - \frac{n-1}{n-k-1}\,(1 - R^2).$$
It can be shown that using $R^2_{adj}$ to choose a subset is equivalent to picking
the subset that minimizes σ̂².
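The computation itself is one line; the sketch below (function name illustrative) takes k to be the number of independent variables in the subset, not counting the constant, a convention under which the values in Table 2.4 are reproduced.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for a model with k independent variables
    (excluding the constant) fitted to n observations."""
    return 1 - (n - 1) / (n - k - 1) * (1 - r2)

# Size-2 model of Table 2.4 (constant and X1, so k = 1; n = 20 training cases):
# adjusted_r2(0.593, 20, 1) is approximately 0.570, matching the table.
```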
Table 2.4 gives the results of the subset selection procedures applied to
the training data in the Example on supervisor data in Section 2.2.
Notice that the step-wise method fails to find the best subset for sizes of 4,
5, and 6 variables. The Forward and Backward methods do find the best subsets
of all sizes and so give results identical to the All Subsets algorithm. The best
subset of size 3, consisting of {X1, X3}, maximizes $R^2_{adj}$ for all the algorithms.
This suggests that we may be better off in terms of MSE of predictions if we
use this subset rather than the full model of size 7 with all six variables in the
model. Using this model on the validation data gives a slightly higher standard
deviation of error (7.3) than the full model (7.1) but this may be a small price to
pay if the cost of the survey can be reduced substantially by having 2 questions
instead of 6. This example also underscores the fact that we are basing our
analysis on small (tiny by data mining standards!) training and validation data
sets. Small data sets make our estimates of R² unreliable.
A criterion that is often used for subset selection is known as Mallows'
Cp. This criterion assumes that the full model is unbiased, although it may have
variables that, if dropped, would improve the MSE. With this assumption
we can show that if a subset model is unbiased, E(Cp) equals k, the size of the
subset. Thus a reasonable approach to identifying subset models with small bias
is to examine those with values of Cp that are near k. Cp is also an estimate
of the sum of MSE (standardized by dividing by σ²) for predictions (the fitted
values) at the x-values observed in the training set. Thus good models are those
that have values of Cp near k and that have small k (i.e., are of small size). Cp
is computed from the formula:
is computed from the formula:
SSR
Cp = + 2k − n,
σ̂F2 ull
where $\hat{\sigma}^2_{Full}$ is the estimated value of σ² in the full model that includes all the
variables. It is important to remember that the usefulness of this approach
depends heavily on the reliability of the estimate of σ² for the full model. This
requires that the training set contain a large number of observations relative to
the number of variables. We note that for our example only the subsets of size
6 and 7 appear to be unbiased, since for the other models Cp differs substantially
from k. This is a consequence of having too few observations to estimate σ²
accurately in the full model.

Stepwise Selection

Size   SSR       RSq     RSq(adj)   Cp       Variables in model
2      874.467   0.593   0.570      -0.615   Constant, X1
3      786.601   0.634   0.591      -0.161   Constant, X1, X3
4      783.970   0.635   0.567       1.793   Constant, X1, X2, X3
5      781.089   0.637   0.540       3.742   Constant, X1, X2, X3, X4
6      775.094   0.639   0.511       5.637   Constant, X1, X2, X3, X4, X5
7      738.900   0.656   0.497       7.000   Constant, X1, X2, X3, X4, X5, X6
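Since the full model in the example has six variables plus a constant fitted to 20 training cases, σ̂²Full = 738.900/13, and the Cp column of the table can be reproduced directly; a small sketch (names illustrative):

```python
def mallows_cp(ssr_subset, sigma2_full, k, n):
    """Mallows' Cp for a subset model with k coefficients (including the constant)."""
    return ssr_subset / sigma2_full + 2 * k - n

sigma2_full = 738.900 / 13          # full-model SSR over its 20 - 6 - 1 degrees of freedom
for size, ssr in [(2, 874.467), (3, 786.601), (4, 783.970),
                  (5, 781.089), (6, 775.094), (7, 738.900)]:
    print(size, round(mallows_cp(ssr, sigma2_full, size, 20), 3))
# prints -0.615, -0.161, 1.793, 3.742, 5.637, 7.000 -- the Cp column of the table
```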