Business Intelligence
(Course Code: MIS 410, Prerequisite: BUS 173, MIS 210/MIS 310)
Dr. Atikur R. Khan
Associate Professor
Department of Management
North South University
Chapter 7: Classical Forecasting Algorithms and Machine Learning
Outline of Chapter Seven
Time Series Data and Its Characteristics
Classical Forecasting Algorithms
Machine Learning Methods
Time Series Data and Its Characteristics
Time Stamped Data
Any data that comes with the time of measurement/trading
(year, month, day, hour, etc.)
13:30 → y_{t-2},  14:00 → y_{t-1},  14:30 → y_t
Properties:
Behavior at current time point generally depends on
previous time point
Noisy (because of effects of unknown and unexplained
factors)
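The two properties above (dependence on the previous time point, plus noise) can be sketched in base R with a simulated series (illustrative values only, not the course data):

```r
# Illustrative sketch: each observation depends on the previous one,
# plus noise from unknown/unexplained factors.
set.seed(123)
n   <- 200
y   <- numeric(n)
eps <- rnorm(n)                     # noise term
for (t in 2:n) {
  y[t] <- 0.8 * y[t - 1] + eps[t]   # current value depends on previous value
}
# The dependence shows up as a large lag-1 autocorrelation:
cor(y[-1], y[-n])
```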
Time Stamped Data: Daily Bitcoin Closing Prices
Daily Bitcoin closing prices from 01-01-2018 to 30-06-2019. What would be the closing price on 01-07-2019?
Time Stamped Data
Note: R script is uploaded in Canvas.
Time-Lagged Relationship
Example of a half-hourly data set
13:30 → y_{t-2},  14:00 → y_{t-1},  14:30 → y_t
Let us say the relationship between the current time point and the previous time point is y_t = f(y_{t-1}) + ε_t; in the simplest linear case, y_t = φ1 y_{t-1} + ε_t.
Time-Lagged Relationship
Relationships between the current time point and previous time points:
AR(1), autoregressive model of order 1:  y_t = φ1 y_{t-1} + ε_t
AR(2), autoregressive model of order 2:  y_t = φ1 y_{t-1} + φ2 y_{t-2} + ε_t
AR(3), autoregressive model of order 3:  y_t = φ1 y_{t-1} + φ2 y_{t-2} + φ3 y_{t-3} + ε_t
Can we fit any of these models to the Bitcoin data? – We need to examine the
statistical properties of data to see whether we can use any of these models.
Statistical Properties of Data
Stationarity: To test the stationarity of the data, use the following methods:
Sample autocorrelation plot (correlogram)
Augmented Dickey-Fuller (ADF) test for a unit root
KPSS test for stationarity of a time series
Normality: To test normality, use the Jarque-Bera test
Nonlinearity: To test nonlinearity, use the TNN (Teräsvirta neural network) test
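A sketch of running the tests named above, assuming the tseries package is installed (the actual Canvas scripts may differ); the random walk here stands in for a nonstationary price series:

```r
# Sketch of the statistical property tests (assumes the tseries package).
library(tseries)

set.seed(1)
x <- cumsum(rnorm(300))       # a random walk: nonstationary by construction

adf.test(x)                   # ADF: Ho nonstationary vs H1 stationary
kpss.test(x)                  # KPSS: Ho stationary vs H1 nonstationary
jarque.bera.test(diff(x))     # Jarque-Bera normality test on the differences
terasvirta.test(ts(diff(x)))  # TNN test for nonlinearity in mean
```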
Stationarity of Data: Autocorrelation Function
First differencing of the log(closing price) series y_t gives the log return series r_t = y_t - y_{t-1}, where
9.614611 - 9.522022 = 0.092589259
9.629116 - 9.614611 = 0.014505086
and we proceed in the same way up to the end of the data frame.
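The differencing arithmetic above can be reproduced with base R's diff() (using the rounded log-price values shown on the slide, so the results match up to rounding):

```r
# First differencing of log(closing price) gives log returns.
log_price  <- c(9.522022, 9.614611, 9.629116)  # rounded slide values
log_return <- diff(log_price)                  # y_t - y_{t-1}
log_return
#> [1] 0.092589 0.014505
```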
Stationarity of Data: Autocorrelation Function
Log(closing price) is nonstationary, but the log return series is stationary. The log return series is the first-differenced series from log(closing price).
Stationarity of Data: ADF Test (log closing price)
Ho: The time series is non-stationary.
H1: The time series is stationary.
Here, the p-value is greater than 0.05, so we do not reject Ho. Thus the series is non-stationary.
Stationarity of Data: ADF Test (log return)
Ho: The time series is non-stationary.
H1: The time series is stationary.
Here, the p-value is even smaller than 0.01, so we reject Ho. Thus the series is stationary.
Stationarity of Data: KPSS Test
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) Test
Ho: The time series is stationary.
H1: The time series is not stationary.
Here, the p-value is even smaller than 0.01, so we reject Ho. Thus the series is not stationary.
Here, the p-value is even smaller than 0.05, so we reject Ho. Thus the series is not stationary.
Stationarity of Data: KPSS Test (Trend Stationarity)
Ho: The time series is trend stationary.
H1: The time series is not trend stationary.
Normality of Data: Jarque-Bera Test
Ho: The data is normally distributed.
H1: The data is not normally distributed.
Since the p-value is smaller than 0.05, we reject Ho: the data is not normally distributed.
Nonlinearity of Data: TNN Test
Ho: Linearity in mean.
H1: Nonlinearity in mean.
Since the p-value is greater than 0.05, we do not reject Ho: linearity in mean is supported.
What Type of Model for Data?
You can apply a classical forecasting model when your data is stationary.
The model can be based on a normal distribution assumption if the test supports normality of the data.
The model can be linear if the linearity test supports linearity.
Classical Forecasting Algorithms
Autoregressive Model (AR)
The value at time t depends on values observed at previous time points:
y_t depends on y_{t-1}, y_{t-2}, …, and we may form a time-lagged relationship.
The autoregressive model of order p, denoted AR(p), is
  y_t = φ1 y_{t-1} + φ2 y_{t-2} + … + φp y_{t-p} + ε_t,
where ε_t is a zero-mean noise process.
AR(1) model: y_t = φ1 y_{t-1} + ε_t
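A minimal sketch of the AR(1) model in base R: simulate a series with φ1 = 0.7 and recover the coefficient with arima() (AR(1) is ARIMA(1,0,0)); the simulated series and seed are illustrative assumptions:

```r
# Simulate an AR(1) process and estimate its coefficient.
set.seed(42)
y   <- arima.sim(model = list(ar = 0.7), n = 500)
fit <- arima(y, order = c(1, 0, 0))   # AR(1) = ARIMA(1,0,0)
coef(fit)["ar1"]                      # estimate should be close to 0.7
```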
Autoregressive Model (AR)
Year   t    Sales (y_t)   y_{t-1}   y_{t-2}   y_{t-3}
2001   1    23            ---       ---       ---
2002   2    40            23        ---       ---
2003   3    25            40        23        ---
2004   4    27            25        40        23
2005   5    32            27        25        40
2006   6    48            32        27        25
2007   7    33            48        32        27
2008   8    37            33        48        32
2009   9    37            37        33        48
2010   10   50            37        37        33
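The lagged-sales table above can be rebuilt in base R with embed(), which stacks a series and its lags column by column:

```r
# Rebuild the lagged table from the sales series (2001-2010).
sales  <- c(23, 40, 25, 27, 32, 48, 33, 37, 37, 50)
lagged <- embed(sales, dimension = 4)   # columns: y_t, y_{t-1}, y_{t-2}, y_{t-3}
colnames(lagged) <- c("y_t", "y_t-1", "y_t-2", "y_t-3")
lagged   # 7 complete rows; the first 3 years lack some lags
```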
Autoregressive Model (AR)
Use the complete data, excluding rows with missing values, to fit a model.
• Imagine regression-style estimation with this data for the AR(3) model: y_t = φ0 + φ1 y_{t-1} + φ2 y_{t-2} + φ3 y_{t-3} + ε_t.
• Other preferred estimation options will be discussed later.
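The regression-style estimation can be sketched with base R's lm() on the complete rows of the lagged table (OLS here is an illustration; the preferred estimators come later):

```r
# OLS fit of the AR(3) model using the 7 complete rows (2004-2010).
sales <- c(23, 40, 25, 27, 32, 48, 33, 37, 37, 50)
d <- as.data.frame(embed(sales, 4))
names(d) <- c("y", "lag1", "lag2", "lag3")
fit <- lm(y ~ lag1 + lag2 + lag3, data = d)
coef(fit)   # intercept and three lag coefficients
```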
Moving Average (MA) Model
y_t depends on previous noise terms.
The moving average model of order q, denoted MA(q), has the form
  y_t = ε_t + θ1 ε_{t-1} + … + θq ε_{t-q},
where ε_t is a zero-mean noise process.
MA(1) model: y_t = ε_t + θ1 ε_{t-1}
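As with the AR case, the MA(1) model can be sketched in base R (MA(1) is ARIMA(0,0,1); the coefficient and seed are illustrative):

```r
# Simulate an MA(1) process with theta = 0.6 and fit MA(1).
set.seed(7)
y   <- arima.sim(model = list(ma = 0.6), n = 500)
fit <- arima(y, order = c(0, 0, 1))   # MA(1) = ARIMA(0,0,1)
coef(fit)["ma1"]                      # estimate should be close to 0.6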
Autoregressive and Moving Average (ARMA) Model
Combine AR(p) and MA(q) together to form ARMA(p,q):
  y_t = φ1 y_{t-1} + … + φp y_{t-p} + ε_t + θ1 ε_{t-1} + … + θq ε_{t-q},
where ε_t is a zero-mean noise process.
Combine AR(1) and MA(1) together to form ARMA(1,1):
  y_t = φ1 y_{t-1} + ε_t + θ1 ε_{t-1}
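A sketch of ARMA(1,1) in base R (arima() fits it as ARIMA(1,0,1); simulated coefficients are illustrative):

```r
# Simulate ARMA(1,1) and recover both coefficients.
set.seed(11)
y   <- arima.sim(model = list(ar = 0.5, ma = 0.4), n = 1000)
fit <- arima(y, order = c(1, 0, 1))   # ARMA(1,1) = ARIMA(1,0,1)
coef(fit)[c("ar1", "ma1")]            # should be near 0.5 and 0.4
```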
Autoregressive and Moving Average (ARMA) Model
• Imagine a model fitted from this data (the colored columns in the table above).
• Other preferred estimation options will be discussed later.
• Suppose you have fitted such a model: what type of model is it?
Autoregressive Integrated Moving Average (ARIMA) Model
Fit an ARMA(p,q) model when the time series y_t is stationary.
If y_t is nonstationary, fit an ARIMA(p,d,q) model, where d is the differencing parameter (known as the order of integration). So, any stationary series is I(0), and any series that becomes stationary after first differencing is I(1).
First-differenced data: Δy_t = y_t - y_{t-1}. If y_t is nonstationary, calculate the first-differenced series and fit an ARMA(p,q) model to it. This is then an ARIMA(p,1,q) model for y_t.
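The "integrated" idea can be verified in base R: fitting ARIMA(1,1,1) to y is equivalent to fitting ARMA(1,1) to diff(y) (simulated data; seed and coefficients are illustrative):

```r
# ARIMA(p,1,q) on y equals ARMA(p,q) on the first-differenced series.
set.seed(5)
y <- cumsum(arima.sim(model = list(ar = 0.5, ma = 0.4), n = 500))  # an I(1) series
fit1 <- arima(y, order = c(1, 1, 1))                               # d = 1: differenced internally
fit2 <- arima(diff(y), order = c(1, 0, 1), include.mean = FALSE)   # ARMA(1,1) on diff(y)
cbind(ARIMA_1_1_1 = coef(fit1), ARMA_1_1_on_diff = coef(fit2))     # near-identical estimates
```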
How would you identify the model orders for ARIMA(p,d,q)?
Step 1: Model Identification
Use the correlogram (plot of the sample autocorrelation function, SACF) of the time series to check for stationarity. If the SACF values are consistently outside the 95% confidence interval and do not tail off quickly, consider differencing the series. This helps us determine the value of d.
Once the value of d is determined, use the SACF and partial SACF (PACF) plots to identify p and q.
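A sketch of this identification step in base R: for a simulated AR(2) series the SACF tails off while the PACF cuts off after lag 2, suggesting p = 2, q = 0 (coefficients and seed are illustrative):

```r
# SACF tails off, PACF cuts off at lag p for an AR(p) series.
set.seed(3)
y <- arima.sim(model = list(ar = c(0.5, 0.3)), n = 1000)
a <- acf(y,  plot = FALSE)   # sample ACF (the correlogram)
p <- pacf(y, plot = FALSE)   # sample partial ACF
round(p$acf[1:4], 2)         # lags 1-2 clearly nonzero, lags 3+ near zero
```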
Step 1: Model Identification (data analysis)
• Log(US T-Bill) from 1950 to 1999: we may consider fitting ARIMA(6,1,0), ARIMA(0,1,6), and ARIMA(6,1,6) to this data.
Note: R script and data
files are in Canvas
Step 2: Estimation of Parameters
Two-Step Regression
Yule-Walker Estimation
Maximum Likelihood Estimation
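Two of the estimators listed above can be compared directly with base R's ar() (simulated AR(1) data; the two-step regression variant is not shown here):

```r
# Yule-Walker vs maximum likelihood estimation of an AR(1) coefficient.
set.seed(9)
y <- arima.sim(model = list(ar = 0.6), n = 1000)
ar(y, order.max = 1, aic = FALSE, method = "yule-walker")$ar  # Yule-Walker
ar(y, order.max = 1, aic = FALSE, method = "mle")$ar          # maximum likelihood
```

Both estimates should land near the true value 0.6; with large samples the two methods agree closely.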
Step 3: Diagnostic Checking
All three models have qualitatively similar ACFs for their residuals, but all ACF values in (c) are within the 95% CI, and the AIC is minimum for this model too.
Step 4: Forecasting
R for ARIMA Models: auto.arima()
Use the forecast library for fitting an ARIMA(p,d,q) model. Model selection is done automatically when auto.arima() is used.
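A sketch of automatic model selection, assuming the forecast package is installed (the Canvas scripts may differ; simulated data stands in for the course data):

```r
# auto.arima() searches over (p, d, q) and returns the AIC-best model.
library(forecast)
set.seed(2)
y   <- cumsum(arima.sim(model = list(ar = 0.5), n = 300))  # an I(1) series
fit <- auto.arima(y)
fit                     # chosen orders (p, d, q)
forecast(fit, h = 10)   # 10-step-ahead forecasts with prediction intervals
```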
R for ARIMA Models
Forecasting with Machine Learning Algorithms
Construct Lagged Features
Construct Data Frame with Lagged Features
Are All Lagged Features Important?
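One simple way to check which lagged features matter: regress y_t on the lags and inspect the coefficient p-values (the lag depth of 3 here is a hypothetical choice; the course data and lag depth may differ):

```r
# For AR(1)-style data, only lag 1 should show a small p-value.
set.seed(4)
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))
d <- as.data.frame(embed(y, 4))
names(d) <- c("y", "lag1", "lag2", "lag3")
fit <- lm(y ~ lag1 + lag2 + lag3, data = d)
summary(fit)$coefficients[, "Pr(>|t|)"]   # lag1 should dominate
```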
Training and Test Data for Back Testing
• Use a Decision Tree model to obtain the MSFE (mean squared forecast error).
• Compare with the MSFE obtained from a Random Forest.
• The best model provides the least MSFE.
• Select the best model for forecasting and business strategy development.
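A back-testing sketch using rpart (which ships with R) for the decision tree and a naive last-value forecast as the comparison baseline; the random forest comparison from the slide would be analogous via the randomForest package (simulated data, illustrative split point):

```r
# Back testing: fit on the training window, score MSFE on the test window.
library(rpart)
set.seed(8)
y <- as.numeric(arima.sim(model = list(ar = 0.7), n = 300))
d <- as.data.frame(embed(y, 4))
names(d) <- c("y", "lag1", "lag2", "lag3")
train <- d[1:250, ]
test  <- d[251:nrow(d), ]                 # keep time order: no shuffling
tree  <- rpart(y ~ lag1 + lag2 + lag3, data = train)
msfe_tree  <- mean((test$y - predict(tree, test))^2)
msfe_naive <- mean((test$y - test$lag1)^2)  # naive: forecast = last value
c(tree = msfe_tree, naive = msfe_naive)     # pick the model with least MSFE
```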
Homework: Model Building, Back Testing, and Forecasting
• Use the [Link] data and up to 7 lag periods to obtain a one-week-ahead forecast.
Concluding remarks
All models are wrong, but some
are useful.
- G. E. P. Box (Famous Statistician and Professor,
1991)
References
Cryer and Chan (2013). Time Series Analysis with Applications in
R, 2nd Edition, Springer.
James et al. (2013). An Introduction to Statistical Learning, 2nd
Edition, Springer.
Shumway and Stoffer (2017). Time Series Analysis and Its
Applications With R Examples, Springer.
Wickham and Grolemund (2016). R for Data Science: Import,
Tidy, Transform, Visualize, and Model Data, O’Reilly.