The document describes linear regression as a statistical technique for forecasting the performance of Oracle-based systems. It explains that linear regression finds the linear relationship between two variables, such as the number of user calls and CPU utilization. It discusses exploring the relationship through scatter plots and correlation coefficients. A strong positive correlation is found between the number of order lines and CPU utilization using sample data. The regression coefficients of the linear model are then estimated using the least squares method in Excel, resulting in the best-fit line and equation to forecast CPU utilization based on order lines.
We may encounter several situations in our daily work when we are asked to forecast the performance of Oracle-based systems. Many questions come to mind: How much load from a particular business activity can our existing system support before running out of gas? What is the optimal volume of a business activity that the system can support without any performance problems? If the workload grows by x percent every quarter, when will we need to add more capacity to the system?
The questions you are asked to answer may vary according to your business requirements, but the bottom line remains the same. As a capacity planner, you are asked to do the capacity planning and inform management of additional capacity requirements in a proactive fashion. This paper is all about forecasting the performance of Oracle-based systems. Throughout the paper, I will explain a statistical method called linear regression, an industry-proven, easy and time-saving technique to answer all such complex questions. I hope that after reading the paper, you will feel more confident while forecasting Oracle performance. The approach followed here is based on a statistical technique, so if you are not from a statistics background, at times the text and terminology may feel a bit hard to digest. Don't worry! I have tried to explain the statistical terms in detail, even diverting from the main theme at times to do so.
What is Linear Regression?
In very simple words, regression analysis is a method for investigating relationships among variables. In the context of Oracle, examples of such relations are: number of sessions vs. memory utilization, physical I/O vs. disk subsystem utilization, and so on. Regression relations can be classified as linear and nonlinear, simple and multiple. For the sake of applicability, here we are only concerned with simple linear regression (or simply, linear regression).
Linear regression tries to find a linear relationship between two variables. The general form of such a relation is y = mx + c, which is also the equation of a straight line (don't get confused by the mathematical terminology). Here c represents the y-intercept of the line and m represents the slope. Seems a bit complex? Let me explain it in simple words. As performance analysts, we need to find a relationship between two technical metrics, such as user calls and CPU utilization. Such a relationship can be expressed as an equation. Here the variable user calls is used to predict the value of CPU utilization and is known as the explanatory variable, or simply the predictor. On the other hand, the variable whose value is to be predicted is known as the response variable or dependent variable. We generally denote the response variable by Y and the predictor variable by X. If we apply these notations to the above example, we can write the relation as:
CPU utilization = m * user calls + c
Here m and c are known as regression coefficients or parameters, whose values we need to determine. Once we have these values, we can predict the system's CPU utilization based on the value of user calls.
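Once m and c are known, applying the model is a single multiply-and-add. Here is a minimal sketch (Python is used purely for illustration; the paper's own calculations are done in Excel, and the coefficient values are the ones estimated later in this paper):

```python
def predict(m, c, x):
    """Evaluate the linear model y = m*x + c for a predictor value x."""
    return m * x + c

# Slope and y-intercept of the best-fit line estimated later in this paper
# for order lines/day vs. CPU utilization.
m, c = 0.001944, 4.631658

# Forecast CPU utilization for 13141 order lines/day.
cpu = predict(m, c, 13141)
print(round(cpu, 2))  # 30.18 (% CPU utilization)
```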
Exploring relationships with Scatters and Correlations
Before we dig into linear regression to predict a value of the response variable, there are some conditions that should be met. First, there should exist a linear relationship between the response and predictor variables; in other words, if we plot a scatter diagram of X and Y, the points should follow a straight line. Second, there should be a strong correlation between the response and predictor variables. These two conditions together indicate a strong linear relation between the response and predictor variables. Let's discuss them one by one using an example.
Throughout this paper, we will discuss a data warehouse support system. The major workload is warehouse orders, so the key business metric is identified as the number of order lines. We wish to find a relation where we can predict CPU utilization based on the number of order lines entered into the system. Here the response variable (Y) is CPU utilization and the predictor variable (X) is the number of order lines. We collected 31 samples of CPU utilization and order line entries throughout the month (the full data set is tabulated, together with residuals, in the Outlier Removal Process section).
Exploring relationships with Scatter plot
A scatter plot is a two-dimensional graph that displays pairs of data, one pair per sample, in (x, y) format. We can use Excel's Chart Wizard to plot a scatter diagram for the above example.
Figure 1: Scatter plot of Order lines vs. CPU Utilization
Here you can see that the relationship appears to be a straight line, except for some points where CPU utilization does not follow the trend. These points are known as outliers, and we will discuss them later in the paper. Now that we have confirmed the linearity condition is met, we need to check how strong the relation is.
Exploring relationships with Correlation Coefficient
The correlation coefficient measures the strength and direction of the linear relationship between the response and predictor variables. It is a number, usually denoted r, between -1 and +1. The higher the absolute value of r, the stronger the linear relationship between the variables. A negative value of the correlation coefficient represents a negative relation between the response and predictor variables, meaning that if X increases then Y decreases, and vice versa. We can calculate the correlation coefficient of a set of x and y values using the following formula:
Corr coeff r = Σ(xi − x̄)(yi − ȳ) / SQRT( Σ(xi − x̄)² * Σ(yi − ȳ)² )

where x̄ and ȳ are the means of the x-values and y-values respectively.
Again, there is no need to solve this complex mathematical equation; Excel's predefined function CORREL() is available for us (good news!).
The square of the correlation coefficient (i.e., r²) denotes how much of the response variable can be explained by the predictor variable. In our example, the correlation coefficient is calculated as r = 0.818 and r² = 0.669, so we can say that 66.9% of the CPU utilization output can be explained by the order lines/day data. The remaining 33.1% cannot be explained by the predictor under observation, but only by some other variables (other technical metrics). In order to forecast precisely, we need to find the variable that best explains the response variable, or in other words, whose correlation strength is maximum.
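As a sanity check on the formula (and on what Excel's CORREL() returns), here is a small Python sketch; the data set is made up for illustration:

```python
from math import sqrt

def correl(xs, ys):
    """Pearson correlation coefficient, the same quantity Excel's CORREL() returns."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den

# A perfectly linear relation gives r = 1; flipping the sign gives r = -1.
xs = [0, 1000, 2000, 3000, 4000]
ys = [5.0, 7.0, 9.0, 11.0, 13.0]           # lies exactly on y = 0.002x + 5
print(correl(xs, ys))                       # 1.0
print(correl(xs, [-y for y in ys]))         # -1.0
```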
A correlation coefficient of ±1 indicates perfect correlation between the response and predictor variables (you will rarely find it in real environments), while zero (0) indicates an absence of correlation. Any other value between -1 and +1 indicates a limited degree of correlation. The following table* gives the practical meaning of various correlation coefficients:
Correlation Coefficient (r)    Practical Meaning
0.0 to 0.2                     Very weak
0.2 to 0.4                     Weak
0.4 to 0.7                     Moderate
0.7 to 0.9                     Strong
0.9 to 1.0                     Very strong

* Reference 1
Regression Coefficients Estimation
Now that we have established a strong linear relationship between the response variable and the predictor, we are ready to estimate the regression coefficients. This is equivalent to finding the equation of the straight line that best fits the points on the scatter diagram of the response variable versus the predictor variable. Various methods are available, one of which is known as the least squares method. It gives the equation of the straight line that minimizes the sum of squares of the vertical distances from each point to the line.
Using the least squares method, the slope m and y-intercept c are given as:

Slope m = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²

y-intercept c = ȳ − m * x̄

where x̄ = mean of the x-values and ȳ = mean of the y-values.
If we don't want to solve these mathematical formulas, Microsoft Excel is there to help. It has the predefined statistical functions SLOPE() and INTERCEPT(), which take the known y- and x-values and return the slope and y-intercept of the regression line respectively. Excel can also plot the best-fit line along with its equation: select the graph's points, right-click, and select Add Trendline; choose a linear trendline and, on the Options tab, check the "Display equation on chart" box. Here is the scatter plot for our previous example along with the best-fit line and its equation:
Figure 2: Best-fitted line for Order lines vs. CPU Utilization plot
In this case, slope m = 0.001944 and y-intercept c = 4.631658 (about 0.0019 and 4.6317).
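For readers without Excel at hand, the least squares formulas translate directly into a few lines; this Python sketch uses a tiny illustrative data set rather than the paper's samples:

```python
def least_squares(xs, ys):
    """Slope m and y-intercept c of the least squares line,
    equivalent to Excel's SLOPE() and INTERCEPT()."""
    n = len(xs)
    mx = sum(xs) / n                       # mean of the x-values
    my = sum(ys) / n                       # mean of the y-values
    m = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    c = my - m * mx                        # the line passes through (mx, my)
    return m, c

# Points lying exactly on y = 2x + 3 recover m = 2 and c = 3.
m, c = least_squares([1, 2, 3, 4], [5, 7, 9, 11])
print(m, c)  # 2.0 3.0
```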
Forecasting through Linear Regression
Now we have all the data we need for forecasting: the equation of the straight line that best fits the response and predictor variables. Suppose we want to predict CPU utilization when there are 13141 order lines/day. We can calculate this as follows:
Y-estimated = 0.001944 * 13141 + 4.631658 = 30.18%
From the sample data, we know that at 13141 order lines/day the observed CPU utilization is 28.08%, so there is a difference of about 2.1 percentage points between the actual and estimated CPU utilization. This is known as the residual. In formal language, the residual is the difference between the observed value of y and the estimated value of y. In general, a positive residual means you have underestimated y at that point, and a negative residual means you have overestimated y.

Residuals are expected in any data set, but some data points lie far away from the main data trend; such points are known as outliers. In a real production environment, outliers may be caused by various things: backups, one-time report generation processes, or problems in the data collection itself. Any process that is not part of the routine workload will produce outliers. While outliers distort the regression coefficients, they sometimes alert us to problems in the data collection process. The bottom line is that outlier removal is a most important part of forecasting through regression analysis. When we detect outliers, we need to decide whether they are part of the normal workload; if they are, we should include them in our analysis, otherwise remove them. The important thing to note is that we should have proper documentation and justification for every data point that has been removed.
Outlier Removal Process
The outlier removal process can sometimes be frustrating and time-consuming; further, we should know when to stop. Discussion with a business/system specialist may help us identify outliers: points that are not part of the normal workload. Statistically, outliers can be detected using the following method. First, standardize the residuals: from each calculated residual, subtract the mean (average) of the residuals and divide by their standard deviation. Residuals have the property that their mean is zero; in other words, the error above the linear regression line (positive residuals) and below the regression line (negative residuals) is equal. The following table contains the standardized residuals for the case study we are discussing:
Sample   Order Lines (X)   CPU Utilization (Y)   Estimated Y   Residual   Residual Square   Stnd Residual
27       11330             4.31                  26.66         -22.35     499.5241          -2.87
8        12259             46.38                 29.42          16.95     287.3514           2.53
16       10901             40.02                 25.83          14.19     201.4710           1.84
15        5971             29.41                 16.24          13.17     173.4282           1.70
1        16483             27.01                 36.68          -9.67      93.4851          -1.25
23        5311             23.58                 14.96           8.62      74.3461           1.12
29       10679             33.44                 25.39           8.05      64.7328           1.04
26        7340             26.76                 18.90           7.86      61.7408           1.02
4        11986             20.56                 27.94          -7.38      54.3975          -0.95
22       11450             34.22                 26.89           7.33      53.6799           0.95
3        12015             21.74                 27.99          -6.25      39.0856          -0.81
19       12938             34.82                 29.79           5.03      25.3372           0.65
9         6531             21.95                 17.33           4.62      21.3484           0.60
24       17073             33.60                 37.83          -4.23      17.8580          -0.55
5         1119              2.85                  6.81          -3.96      15.6600          -0.51
20        1158              3.22                  6.88          -3.66      13.4183          -0.47
25       11336             23.36                 26.67          -3.31      10.9674          -0.43
6            0              1.41                  4.63          -3.22      10.3791          -0.42
21           0              1.43                  4.63          -3.20      10.2506          -0.41
7            0              1.45                  4.63          -3.18      10.1229          -0.41
14           1              1.62                  4.63          -3.01       9.0818          -0.39
18       13728             28.34                 31.32          -2.98       8.8944          -0.39
17       14271             29.86                 32.38          -2.52       6.3407          -0.33
10       14086             29.55                 32.02          -2.47       6.0930          -0.32
13         454              3.26                  5.51          -2.25       5.0821          -0.29
2        13142             32.43                 30.18           2.25       5.0489           0.29
12       13141             28.08                 30.18          -2.10       4.4145          -0.27
28           0              2.62                  4.63          -2.01       4.0468          -0.26
31       12827             28.11                 29.57          -1.46       2.1333          -0.19
11       12797             30.04                 29.51           0.53       0.2785           0.07
30       12803             29.19                 29.52          -0.33       0.1115          -0.04
Sum                                                              0.0                         0.0

After calculating the standardized residuals, plot them against the estimated values of the response variable. This is known as a standardized residual plot. The motive behind plotting standardized residuals is to check the normality* of the residuals. Most of the standardized residuals should be close to zero, and as we move away from zero, the frequency of the residuals should decrease. Further, a standardized residual at or beyond -3 or +3 is not acceptable (due to the empirical rule of standard scores). Such points should be considered outliers and investigated further. Secondly, residuals should occur at random; they shouldn't follow a pattern. A pattern in a residual plot implies that the regression line may not be fitting right.
* Note: For normality and the empirical rule, I have another document titled "Statistics Basics for Forecasting".
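The standardization and flagging steps above can be sketched in a few lines of Python (used here only to illustrate the arithmetic; the made-up residuals are illustrative, and the 2.5 cutoff matches the practical threshold applied later in this paper, with ±3 being the hard limit from the empirical rule):

```python
from statistics import mean, stdev

def standardize(residuals):
    """Subtract the mean of the residuals and divide by their standard deviation."""
    m, s = mean(residuals), stdev(residuals)
    return [(r - m) / s for r in residuals]

# Nine well-behaved residuals and one suspiciously large one (index 9).
residuals = [0.5, -0.5, 0.4, -0.4, 0.3, -0.3, 0.2, -0.2, 0.1, 15.0]
z = standardize(residuals)
suspects = [i for i, v in enumerate(z) if abs(v) > 2.5]
print(suspects)  # [9]
```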
An alternative way of detecting outliers is to square the residuals and sort the data set in descending order. By doing so, all the data points with large residuals (positive or negative) come to the top. Further investigation of these data points will tell us whether we should exclude them from our analysis or not.
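This ranking amounts to a simple sort; here is a minimal Python sketch with hypothetical sample numbers and residuals:

```python
# (sample number, residual) pairs -- hypothetical values for illustration.
samples = [(1, -1.2), (2, 0.5), (3, 16.9), (4, -22.3), (5, 2.1)]

# Square the residuals and sort descending, so the biggest offenders surface first.
ranked = sorted(samples, key=lambda s: s[1] ** 2, reverse=True)
print([s[0] for s in ranked])  # [4, 3, 5, 1, 2]
```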
Figure 3: Standardized residual plot, (a) before outlier removal and (b) after outlier removal
From the above table and figure 3(a), we can see that sample numbers 27 and 8 can be treated as outliers, as their standardized residuals are greater than 2.5 in absolute value and their residual squares are very high compared to the other data points. Now it's time to investigate the reason for these observations.
In this particular case, I found that sample number 8 was collected on a Sunday, when some statistics gathering processes were running, which resulted in high CPU utilization. That's why we have a positive residual (we underestimated CPU utilization). Since this is not part of the standard business workload, we can safely remove this data point from our data set after proper documentation.

Digging further into the details of sample number 27, I found that there was no appreciable CPU utilization during the entire day. The overall workload was light, so even though 11000+ order lines were served, the resulting average CPU utilization was only 4%. We need to discuss the reason for the low workload with the system specialist. This may be due to a skewness effect in the CPU utilization data (CPU utilization is not evenly distributed throughout the day). We can reduce the skewness effect by increasing the data collection frequency (in this particular case, to once per hour, for example). It should be noted that while increasing the data collection frequency will reduce the skewness problem, it may result in higher resource consumption on the production servers; however, an optimized data collection process can save precious resources (like memory and CPU cycles).

Proper outlier detection and removal is a very important step in regression analysis, since outliers can affect the regression formula and make the overall forecasting less precise. In our example, after removing sample numbers 27 and 8 from the analysis, the correlation coefficient (r) increases to 0.884 and r² to 0.782. This means that 78.2% of CPU utilization can now be explained by order lines/day. Compare these values with the previous ones, where r = 0.818 and r² = 0.669.
Figure 4 shows the scatter plot of the data set without the outliers. Clearly, the regression line fits better without the outliers, with more data points close to the line.
Figure 4: Scatter plot of CPU Utilization vs. Order Lines/day after Outliers removal
Avoiding Extrapolation
In linear regression, we end up with the equation of a straight line that best fits our data; we substitute a value for x and get a predicted value for y. Plugging in x values that fall outside a reasonable bound is known as extrapolation, and we should avoid it with linear regression; in fact, if we do extrapolate, we will get wrong results. This can be considered the main drawback of linear regression: it can forecast only within the linear region of the relation, because system performance is not linear. For example, in our case, when the number of order lines increases, host CPU utilization increases; but at around 70-75% CPU utilization, queueing behavior comes into the picture, and CPU utilization does not remain linear after that. Thus we can forecast through linear regression only within the linear range of the x and y values. In our case, the minimum and maximum observed values of order lines per day are 0 and 17073 respectively. Choosing an order lines/day value between 0 and 17073 to forecast is reasonable, but going beyond these limits (a value greater than 17073) isn't a good idea: we cannot be sure that the same linear relationship between order lines/day and CPU utilization will hold beyond 17073 order lines/day. So the bottom line is: never forecast in nonlinear areas. For the CPU subsystem this limit is around 75% utilization, and for the I/O subsystem it is around 65%.
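A practical safeguard is to let the forecasting code itself refuse to extrapolate. This Python sketch (illustrative, not part of the paper's Excel workflow) hard-codes the observed range from our example:

```python
def forecast(m, c, x, x_min, x_max):
    """Evaluate y = m*x + c only inside the observed range of the predictor;
    outside it, the linear relationship is unverified, so refuse to extrapolate."""
    if not (x_min <= x <= x_max):
        raise ValueError(f"{x} is outside the observed range [{x_min}, {x_max}]; "
                         "refusing to extrapolate")
    return m * x + c

m, c = 0.001944, 4.631658      # best-fit coefficients from this paper's example
print(round(forecast(m, c, 13141, 0, 17073), 2))   # 30.18 -- inside the range, fine
# forecast(m, c, 25000, 0, 17073) would raise ValueError -- extrapolation
```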
Forecasting in the absence of a single correlated variable
In our example, we found that order lines/day is correlated with CPU utilization and, after outlier removal, explains 78.2% of its variation (the best-fit line after outlier removal is y = 0.001940x + 4.825653, with R² = 0.781843; a correlation coefficient of 0.884 is strong).

We need to collect various business workload metrics and check which correlates most strongly with the response variable. The identification process for such metrics is sometimes frustrating and may end in failure. A good approach is to discuss the system's behavior with all the concerned persons (business/system specialists, application engineers, etc.), as they know the business and its impact on the system workload best. It is always better to end the discussion with a good set of identified metrics; no one wants to re-arrange the meeting if the initial metrics turn out not to be correlated with the response variable.

There may be a case where no single metric correlated with the response variable can be identified. We can then select a set of variables that are moderately correlated with the response variable, even though none of them is highly correlated on its own. The statistical regression approach in which we predict the response variable on the basis of multiple predictor variables is known as multiple regression analysis. Discussion of that approach is beyond the scope of this paper; I have a separate document that explains the concept behind multiple regression.
Linear Regression Functions in Oracle
Oracle has built-in linear regression functions that support the least squares method for regression coefficient estimation.
The functions are as follows:
REGR_COUNT Function: It returns the number of non-null number pairs used to fit the regression line.
REGR_AVGY and REGR_AVGX Functions: REGR_AVGY and REGR_AVGX compute the averages of the dependent variable and the independent variable of the regression line, respectively. REGR_AVGY computes the average of its first argument (dependent variable Y) after eliminating number pairs where either of the variables is null. Similarly, REGR_AVGX computes the average of its second argument (independent variable X) after null elimination.
REGR_SLOPE and REGR_INTERCEPT Functions: The REGR_SLOPE function computes the slope of the regression line fitted to non-null number pairs. The REGR_INTERCEPT function computes the y-intercept of the regression line.
REGR_R2 Function: The REGR_R2 function computes the coefficient of determination (usually called "R-squared" or "goodness of fit") for the regression line.
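A detail worth noting across all these functions is the null handling: a pair is ignored if either value is null. The following Python fragment is not Oracle code, only an illustration of that pair-elimination semantics:

```python
def regr_pairs(ys, xs):
    """Keep only the (y, x) pairs where both values are present,
    mimicking how Oracle's REGR_* functions eliminate null pairs."""
    return [(y, x) for y, x in zip(ys, xs) if y is not None and x is not None]

ys = [10.0, None, 30.0, 40.0]
xs = [1.0, 2.0, None, 4.0]
pairs = regr_pairs(ys, xs)
print(len(pairs))                              # 2 -- what REGR_COUNT would report
print(sum(y for y, _ in pairs) / len(pairs))   # 25.0 -- what REGR_AVGY would report
```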
Sample code- Automate Linear Regression through Oracle functions
Following is sample code that can be used for linear regression analysis in Oracle performance forecasting; here I have tried to give an example of Oracle's linear regression functions. This sample code chooses the database workload metric logical reads per sec as the predictor variable. You can customize the code to include other metrics as well, like user calls per sec, executions per sec, etc.

Description:
Table regression_data: used for storing database technical metrics and regression data.
Table outlier_data: stores outlier data. Note that outliers are deleted from regression_data and stored in this table for further investigation and documentation purposes.
View dba_hist_sysmetric_summary: this database view is used to fetch historical AWR data.
insert into regression_data (snap_id, begin_time, end_time, logical_reads)
select snap_id, begin_time, end_time, average
  from dba_hist_sysmetric_summary v, gv$instance i
 where metric_name = 'Logical Reads Per Sec'
   and begin_time >= to_date('1-oct-08 00:00:00', 'dd-mon-yy hh24:mi:ss')
   and end_time   <= to_date('31-oct-08 00:00:00', 'dd-mon-yy hh24:mi:ss')
   and v.instance_number = i.instance_number;
declare
  cursor c2 is
    select snap_id, metric_name, begin_time, end_time, maxval, average
      from dba_hist_sysmetric_summary
     where metric_name = 'Host CPU Utilization (%)';
begin
  for table_scan in c2 loop
    update regression_data
       set host_cpu = table_scan.average
     where snap_id = table_scan.snap_id;
  end loop;
end;
/
declare
  -- The original listing omits the declarations; cursor c1 and the local
  -- variables below are reconstructed here to make the block self-contained.
  cursor c1 is
    select snap_id, begin_time, end_time, logical_reads, host_cpu, stnd_residual
      from regression_data;
  outlier_count number;
  intercept     number;
  slope         number;
  stnd_dev      number;
  avg_res       number;
begin
  -- Seed the standardized residuals so the loop body executes at least once.
  update regression_data set stnd_residual = 4;

  select count(*) into outlier_count
    from regression_data
   where abs(stnd_residual) > 3;

  while outlier_count > 0 loop
    -- Re-estimate the regression coefficients on the current data set.
    select round(regr_intercept(host_cpu, logical_reads), 8) into intercept from regression_data;
    select round(regr_slope(host_cpu, logical_reads), 8)     into slope     from regression_data;

    for table_scan in c1 loop
      update regression_data
         set proj_cpu = slope * table_scan.logical_reads + intercept
       where snap_id = table_scan.snap_id;
      -- Residual = observed minus estimated, as defined earlier in the paper
      -- (the original listing had the subtraction reversed, which flips the
      -- sign of the residual but not its magnitude).
      update regression_data
         set residual = host_cpu - proj_cpu
       where snap_id = table_scan.snap_id;
      update regression_data
         set residual_sqr = residual * residual
       where snap_id = table_scan.snap_id;
    end loop;

    select round(stddev(residual), 8) into stnd_dev from regression_data;
    select round(avg(residual), 8)    into avg_res  from regression_data;

    for table_scan2 in c1 loop
      update regression_data
         set stnd_residual = (residual - avg_res) / stnd_dev
       where snap_id = table_scan2.snap_id;
    end loop;

    select count(*) into outlier_count
      from regression_data
     where abs(stnd_residual) > 3;

    if outlier_count > 0 then
      for table_scan3 in c1 loop
        if abs(table_scan3.stnd_residual) > 3 then
          -- Move the outlier to outlier_data for documentation, then drop it.
          insert into outlier_data (snap_id, begin_time, end_time, logical_reads, host_cpu)
          values (table_scan3.snap_id, table_scan3.begin_time, table_scan3.end_time,
                  table_scan3.logical_reads, table_scan3.host_cpu);
          delete from regression_data where snap_id = table_scan3.snap_id;
        end if;
      end loop;
    end if;
  end loop;

  commit;
end;
/
References
1. Statistics Without Tears: A Primer for Non-Mathematicians by Derek Rowntree (Macmillan Publishing Company, 1981)
2. Forecasting Oracle Performance by Craig Shallahamer (Apress, 2007)
3. Intermediate Statistics For Dummies by Deborah Rumsey (Wiley Publishing, 2007)
4. Regression Analysis by Example, Fourth Edition by Samprit Chatterjee and Ali S. Hadi (John Wiley & Sons, 2006)
5. Oracle9i Data Warehousing Guide, Release 2 (9.2), Part Number A96520-01
About the Author
Neeraj Bhatia is a Senior Technical Analyst at Oracle India, based in Noida, India. He has been working with Oracle databases for five years. Currently he is responsible for capacity planning for Oracle-based systems, including Oracle Database, Oracle Application Server, PeopleSoft and Oracle Applications. Prior to this, he worked as a performance analyst for VLDBs. When not working with Oracle, he likes to listen to music, watch movies and spend time with family. He can be reached at: neeraj.dba@gmail.com