Business Intelligence
(Course Code: MIS 410, Prerequisite: BUS 173, MIS 210/MIS 310)
Dr. Atikur R. Khan
Associate Professor
Department of Management
North South University
Chapter 7: Classical Forecasting Algorithms and Machine Learning
Outline of Chapter Seven
Time Series Data and Its Characteristics
Classical Forecasting Algorithms
Machine Learning Methods
Time Series Data and Its Characteristics
Time Stamped Data
Any data that comes with the time of measurement/trading
(year, month, day, hour, etc.)
13:30 → y_{t-2},  14:00 → y_{t-1},  14:30 → y_t
Properties:
Behavior at current time point generally depends on
previous time point
Noisy (because of effects of unknown and unexplained
factors)
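The two properties above (dependence on the previous time point, plus noise) can be sketched in base R with a simulated series (illustrative values only, not the course data):

```r
# Illustrative sketch: each observation depends on the previous one,
# plus noise from unknown/unexplained factors.
set.seed(123)
n   <- 200
y   <- numeric(n)
eps <- rnorm(n)                     # noise term
for (t in 2:n) {
  y[t] <- 0.8 * y[t - 1] + eps[t]   # current value depends on previous value
}
# The dependence shows up as a large lag-1 autocorrelation:
cor(y[-1], y[-n])
```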
Time Stamped Data: Daily Bitcoin Closing Prices
Daily Bitcoin closing prices from 01-01-2018 to 30-06-2019. What would be the closing price on 01-07-2019?
Time Stamped Data
Note: R script is uploaded in Canvas.
Time-Lagged Relationship
Example of a half-hourly data set
13:30 → y_{t-2},  14:00 → y_{t-1},  14:30 → y_t
Let us say the relationship between the current time point and the previous time point is y_t = f(y_{t-1}) + ε_t; in the simplest linear case, y_t = φ1 y_{t-1} + ε_t.
Time-Lagged Relationship
Relationships between the current time point and previous time points:
AR(1), autoregressive model of order 1:  y_t = φ1 y_{t-1} + ε_t
AR(2), autoregressive model of order 2:  y_t = φ1 y_{t-1} + φ2 y_{t-2} + ε_t
AR(3), autoregressive model of order 3:  y_t = φ1 y_{t-1} + φ2 y_{t-2} + φ3 y_{t-3} + ε_t
Can we fit any of these models to the Bitcoin data? – We need to examine the
statistical properties of data to see whether we can use any of these models.
Statistical Properties of Data
Stationarity: To test the stationarity of the data, use the following methods:
Sample autocorrelation plot (correlogram)
Augmented Dickey-Fuller (ADF) test for a unit root
KPSS test for stationarity of a time series
Normality: To test normality, use the Jarque-Bera test
Nonlinearity: To test nonlinearity, use the TNN (Teräsvirta neural network) test
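A sketch of running the tests named above, assuming the tseries package is installed (the actual Canvas scripts may differ); the random walk here stands in for a nonstationary price series:

```r
# Sketch of the statistical property tests (assumes the tseries package).
library(tseries)

set.seed(1)
x <- cumsum(rnorm(300))       # a random walk: nonstationary by construction

adf.test(x)                   # ADF: Ho nonstationary vs H1 stationary
kpss.test(x)                  # KPSS: Ho stationary vs H1 nonstationary
jarque.bera.test(diff(x))     # Jarque-Bera normality test on the differences
terasvirta.test(ts(diff(x)))  # TNN test for nonlinearity in mean
```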
Stationarity of Data: Autocorrelation Function
First differencing of the log(closing price) series y_t gives the log return series r_t = y_t - y_{t-1}, where
9.614611 - 9.522022 = 0.092589259
9.629116 - 9.614611 = 0.014505086
and we proceed in the same way up to the end of the data frame.
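The differencing arithmetic above can be reproduced with base R's diff() (using the rounded log-price values shown on the slide, so the results match up to rounding):

```r
# First differencing of log(closing price) gives log returns.
log_price  <- c(9.522022, 9.614611, 9.629116)  # rounded slide values
log_return <- diff(log_price)                  # y_t - y_{t-1}
log_return
#> [1] 0.092589 0.014505
```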
Stationarity of Data: Autocorrelation Function
Log(closing price) is nonstationary, but the log return series is stationary. The log return series is the first-differenced series from log(closing price).
Stationarity of Data: ADF Test (log closing price)
Ho: The time series is non-stationary.
H1: The time series is stationary.
Here, the p-value is greater than 0.05, so we do not reject Ho. Thus the series is non-stationary.
Stationarity of Data: ADF Test (log return)
Ho: The time series is non-stationary.
H1: The time series is stationary.
Here, the p-value is even smaller than 0.01, so we reject Ho. Thus the series is stationary.
Stationarity of Data: KPSS Test
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test
Ho: The time series is stationary.
H1: The time series is not stationary.
Here, the p-value is even smaller than 0.01, so we reject Ho. Thus the series is not stationary.
Here, the p-value is even smaller than 0.05, so we reject Ho. Thus the series is not stationary.
Stationarity of Data: KPSS Test (Trend Stationarity)
Ho: The time series is trend stationary.
H1: The time series is not trend stationary.
Normality of Data: Jarque-Bera Test
Ho: The data is normally distributed.
H1: The data is not normally distributed.
Since the p-value is smaller than 0.05, we reject Ho: the data is not normally distributed.
Nonlinearity of Data: TNN Test
Ho: Linearity in mean.
H1: Nonlinearity in mean.
Since the p-value is greater than 0.05, we do not reject Ho: linearity in mean is supported.
What Type of Model for Data?
You can apply a classical forecasting model when your data is stationary.
The model can be based on a normal distribution assumption if the test supports normality of the data.
The model can be linear if the linearity test supports linearity.
Classical Forecasting Algorithms
Autoregressive Model (AR)
The value at time t depends on values observed at previous time points:
y_t depends on y_{t-1}, y_{t-2}, …, and we may form a time-lagged relationship.
The autoregressive model of order p, denoted AR(p), is
  y_t = φ1 y_{t-1} + φ2 y_{t-2} + … + φp y_{t-p} + ε_t,
where ε_t is a zero-mean noise process.
AR(1) model: y_t = φ1 y_{t-1} + ε_t
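A minimal sketch of the AR(1) model in base R: simulate a series with φ1 = 0.7 and recover the coefficient with arima() (AR(1) is ARIMA(1,0,0)); the simulated series and seed are illustrative assumptions:

```r
# Simulate an AR(1) process and estimate its coefficient.
set.seed(42)
y   <- arima.sim(model = list(ar = 0.7), n = 500)
fit <- arima(y, order = c(1, 0, 0))   # AR(1) = ARIMA(1,0,0)
coef(fit)["ar1"]                      # estimate should be close to 0.7
```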
Autoregressive Model (AR)
Year   t    Sales (y_t)   y_{t-1}   y_{t-2}   y_{t-3}
2001   1    23            ---       ---       ---
2002   2    40            23        ---       ---
2003   3    25            40        23        ---
2004   4    27            25        40        23
2005   5    32            27        25        40
2006   6    48            32        27        25
2007   7    33            48        32        27
2008   8    37            33        48        32
2009   9    37            37        33        48
2010   10   50            37        37        33
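The lagged-sales table above can be rebuilt in base R with embed(), which stacks a series and its lags column by column:

```r
# Rebuild the lagged table from the sales series (2001-2010).
sales  <- c(23, 40, 25, 27, 32, 48, 33, 37, 37, 50)
lagged <- embed(sales, dimension = 4)   # columns: y_t, y_{t-1}, y_{t-2}, y_{t-3}
colnames(lagged) <- c("y_t", "y_t-1", "y_t-2", "y_t-3")
lagged   # 7 complete rows; the first 3 years lack some lags
```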
Autoregressive Model (AR)
Use the complete data, excluding rows with missing values, to fit a model.
• Imagine regression-style estimation with this data for the AR(3) model: y_t = φ0 + φ1 y_{t-1} + φ2 y_{t-2} + φ3 y_{t-3} + ε_t.
• Other preferred estimation options will be discussed later.
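The regression-style estimation can be sketched with base R's lm() on the complete rows of the lagged table (OLS here is an illustration; the preferred estimators come later):

```r
# OLS fit of the AR(3) model using the 7 complete rows (2004-2010).
sales <- c(23, 40, 25, 27, 32, 48, 33, 37, 37, 50)
d <- as.data.frame(embed(sales, 4))
names(d) <- c("y", "lag1", "lag2", "lag3")
fit <- lm(y ~ lag1 + lag2 + lag3, data = d)
coef(fit)   # intercept and three lag coefficients
```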
Moving Average (MA) Model
y_t depends on previous noise terms.
The moving average model of order q, denoted MA(q), has the form
  y_t = ε_t + θ1 ε_{t-1} + … + θq ε_{t-q},
where ε_t is a zero-mean noise process.
MA(1) model: y_t = ε_t + θ1 ε_{t-1}
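As with the AR case, the MA(1) model can be sketched in base R (MA(1) is ARIMA(0,0,1); the coefficient and seed are illustrative):

```r
# Simulate an MA(1) process with theta = 0.6 and fit MA(1).
set.seed(7)
y   <- arima.sim(model = list(ma = 0.6), n = 500)
fit <- arima(y, order = c(0, 0, 1))   # MA(1) = ARIMA(0,0,1)
coef(fit)["ma1"]                      # estimate should be close to 0.6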
Autoregressive and Moving Average (ARMA) Model
Combine AR(p) and MA(q) together to form ARMA(p,q):
  y_t = φ1 y_{t-1} + … + φp y_{t-p} + ε_t + θ1 ε_{t-1} + … + θq ε_{t-q},
where ε_t is a zero-mean noise process.
Combine AR(1) and MA(1) together to form ARMA(1,1):
  y_t = φ1 y_{t-1} + ε_t + θ1 ε_{t-1}
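A sketch of ARMA(1,1) in base R (arima() fits it as ARIMA(1,0,1); simulated coefficients are illustrative):

```r
# Simulate ARMA(1,1) and recover both coefficients.
set.seed(11)
y   <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 1000)
fit <- arima(y, order = c(1, 0, 1))   # ARMA(1,1) = ARIMA(1,0,1)
coef(fit)[c("ar1", "ma1")]            # should be near 0.5 and 0.4
```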
Autoregressive and Moving Average (ARMA) Model
• Imagine a model fitted from this data (the colored columns in the table above).
• Other preferred estimation options will be discussed later.
• Suppose you have fitted such a model: what type of model is it?
Autoregressive Integrated Moving Average (ARIMA) Model
Fit an ARMA(p,q) model when the time series y_t is stationary.
If y_t is nonstationary, fit an ARIMA(p,d,q) model, where d is the differencing parameter (known as the order of integration). So, any stationary series is I(0), and any series that becomes stationary after first differencing is I(1).
First-differenced data: Δy_t = y_t - y_{t-1}. If y_t is nonstationary, calculate the first-differenced series and fit an ARMA(p,q) model to it. This is then an ARIMA(p,1,q) model for y_t.
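The "integrated" idea can be verified in base R: fitting ARIMA(1,1,1) to y is equivalent to fitting ARMA(1,1) to diff(y) (simulated data; seed and coefficients are illustrative):

```r
# ARIMA(p,1,q) on y equals ARMA(p,q) on the first-differenced series.
set.seed(5)
y <- cumsum(arima.sim(model = list(ar = 0.5, ma = 0.4), n = 500))  # an I(1) series
fit1 <- arima(y, order = c(1, 1, 1))                               # d = 1: differenced internally
fit2 <- arima(diff(y), order = c(1, 0, 1), include.mean = FALSE)   # ARMA(1,1) on diff(y)
cbind(ARIMA_1_1_1 = coef(fit1), ARMA_1_1_on_diff = coef(fit2))     # near-identical estimates
```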
How would you identify the model orders for ARIMA(p,d,q)?
Step 1: Model Identification
Use the correlogram (plot of the sample autocorrelation function, SACF) of the time series to check for stationarity. If the SACF values are consistently outside the 95% confidence interval and do not tail off quickly, consider differencing the series. This helps us determine the value of d.
Once the value of d is determined, use the SACF and partial SACF (PACF) plots to identify p and q.
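A sketch of this identification step in base R: for a simulated AR(2) series the SACF tails off while the PACF cuts off after lag 2, suggesting p = 2, q = 0 (coefficients and seed are illustrative):

```r
# SACF tails off, PACF cuts off at lag p for an AR(p) series.
set.seed(3)
y <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 1000)
a <- acf(y,  plot = FALSE)   # sample ACF (the correlogram)
p <- pacf(y, plot = FALSE)   # sample partial ACF
round(p$acf[1:4], 2)         # lags 1-2 clearly nonzero, lags 3+ near zero
```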
Step 1: Model Identification (data analysis)
• Log(US T-Bill) from 1950 to 1999: we may consider fitting ARIMA(6,1,0), ARIMA(0,1,6), and ARIMA(6,1,6) to this data.
Note: R script and data
files are in Canvas
Step 2: Estimation of Parameters
Two-Step Regression
Yule-Walker Estimation
Maximum Likelihood Estimation
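Two of the estimators listed above can be compared directly with base R's ar() (simulated AR(1) data; the two-step regression variant is not shown here):

```r
# Yule-Walker vs maximum likelihood estimation of an AR(1) coefficient.
set.seed(9)
y <- arima.sim(model = list(ar = 0.6), n = 1000)
ar(y, order.max = 1, aic = FALSE, method = "yule-walker")$ar  # Yule-Walker
ar(y, order.max = 1, aic = FALSE, method = "mle")$ar          # maximum likelihood
```

Both estimates should land near the true value 0.6; with large samples the two methods agree closely.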
Step 3: Diagnostic Checking
All three models have qualitatively similar ACFs for their residuals, but all ACF values in (c) are within the 95% CI, and the AIC is minimum for this model too.
Step 4: Forecasting
R for ARIMA Models: auto.arima()
Use the forecast library for fitting an ARIMA(p,d,q) model. Model selection is done automatically when auto.arima() is used.
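A sketch of automatic model selection, assuming the forecast package is installed (the Canvas scripts may differ; simulated data stands in for the course data):

```r
# auto.arima() searches over (p, d, q) and returns the AIC-best model.
library(forecast)
set.seed(2)
y   <- cumsum(arima.sim(model = list(ar = 0.5), n = 300))  # an I(1) series
fit <- auto.arima(y)
fit                     # chosen orders (p, d, q)
forecast(fit, h = 10)   # 10-step-ahead forecasts with prediction intervals
```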
R for ARIMA Models
Forecasting with Machine Learning Algorithms
Construct Lagged Features
Construct Data Frame with Lagged Features
Are All Lagged Features Important?
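One simple way to check which lagged features matter: regress y_t on the lags and inspect the coefficient p-values (the lag depth of 3 here is a hypothetical choice; the course data and lag depth may differ):

```r
# For AR(1)-style data, only lag 1 should show a small p-value.
set.seed(4)
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))
d <- as.data.frame(embed(y, 4))
names(d) <- c("y", "lag1", "lag2", "lag3")
fit <- lm(y ~ lag1 + lag2 + lag3, data = d)
summary(fit)$coefficients[, "Pr(>|t|)"]   # lag1 should dominate
```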
Training and Test Data for Back Testing
• Use a Decision Tree model to obtain the MSFE (mean squared forecast error).
• Compare with the MSFE obtained from a Random Forest.
• The best model provides the least MSFE.
• Select the best model for forecasting and business strategy development.
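A back-testing sketch using rpart (which ships with R) for the decision tree and a naive last-value forecast as the comparison baseline; the random forest comparison from the slide would be analogous via the randomForest package (simulated data, illustrative split point):

```r
# Back testing: fit on the training window, score MSFE on the test window.
library(rpart)
set.seed(8)
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))
d <- as.data.frame(embed(y, 4))
names(d) <- c("y", "lag1", "lag2", "lag3")
train <- d[1:250, ]
test  <- d[251:nrow(d), ]                 # keep time order: no shuffling
tree  <- rpart(y ~ lag1 + lag2 + lag3, data = train)
msfe_tree  <- mean((test$y - predict(tree, test))^2)
msfe_naive <- mean((test$y - test$lag1)^2)  # naive: forecast = last value
c(tree = msfe_tree, naive = msfe_naive)     # pick the model with least MSFE
```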
Homework: Model Building, Back Testing, and Forecasting
• Use the [Link] data and up to 7 lag periods to obtain a one-week-ahead forecast.
Concluding remarks
All models are wrong, but some
are useful.
- G. E. P. Box (Famous Statistician and Professor,
1991)
References
Cryer and Chan (2013). Time Series Analysis with Applications in
R, 2nd Edition, Springer.
James et al. (2013). An Introduction to Statistical Learning, 2nd
Edition, Springer.
Shumway and Stoffer (2017). Time Series Analysis and Its
Applications With R Examples, Springer.
Wickham and Grolemund (2016). R for Data Science: Import,
Tidy, Transform, Visualize, and Model Data, O’Reilly.