11 Classical Time Series Forecasting Methods in Python (Cheat Sheet)
Machine learning methods can be used for classification and forecasting on time series
problems.
Before exploring machine learning methods for time series, it is a good idea to ensure you
have exhausted classical linear time series forecasting methods. Classical time series
forecasting methods may be focused on linear relationships; nevertheless, they are
sophisticated and perform well on a wide range of problems, assuming that your data is
suitably prepared and the method is well configured.
In this post, you will discover a suite of classical methods for time series forecasting that
you can test on your forecasting problem prior to exploring machine learning methods.
Discover how to prepare and visualize time series data and develop autoregressive
forecasting models in my new book, with 28 step-by-step tutorials, and full python code.
Let’s get started.
Updated Apr/2020: Changed AR to AutoReg due to API change.
Overview
This cheat sheet demonstrates 11 different classical time series forecasting methods; they
are:
1. Autoregression (AR)
2. Moving Average (MA)
3. Autoregressive Moving Average (ARMA)
4. Autoregressive Integrated Moving Average (ARIMA)
5. Seasonal Autoregressive Integrated Moving-Average (SARIMA)
6. Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors
(SARIMAX)
7. Vector Autoregression (VAR)
8. Vector Autoregression Moving-Average (VARMA)
9. Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX)
10. Simple Exponential Smoothing (SES)
11. Holt Winter’s Exponential Smoothing (HWES)
Each method is presented with a short description, Python code, and links to more information.
Each code example is demonstrated on a simple contrived dataset that may or may not be
appropriate for the method. Replace the contrived dataset with your data in order to test the
method.
Remember: each method will require tuning to your specific problem. In many cases, I have
examples of how to configure and even grid search parameters on the blog already, try the
search function.
If you find this cheat sheet useful, please let me know in the comments below.
Autoregression (AR)
The autoregression (AR) method models the next step in the sequence as a linear function of
the observations at prior time steps.
The notation for the model involves specifying the order of the model p as a parameter to the
AR function, e.g. AR(p). For example, AR(1) is a first-order autoregression model.
The method is suitable for univariate time series without trend and seasonal components.
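To make "a linear function of the observations at prior time steps" concrete, here is a minimal sketch that fits an AR(1) model, y[t] = c + phi * y[t-1], by ordinary least squares in plain Python. The helper name fit_ar1 is hypothetical and is not part of statsmodels; it only illustrates the idea on a noise-free series.

```python
# Minimal AR(1) sketch: y[t] = c + phi * y[t-1], fit by ordinary least squares.
# fit_ar1 is a hypothetical helper for illustration, not a statsmodels function.

def fit_ar1(series):
    """Return (c, phi) minimizing the squared error of y[t] ~ c + phi * y[t-1]."""
    x = series[:-1]   # lagged observations
    y = series[1:]    # next-step targets
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    phi = cov_xy / var_x
    c = mean_y - phi * mean_x
    return c, phi

# noise-free series generated by y[t] = 2 + 0.5 * y[t-1]
series = [10.0]
for _ in range(20):
    series.append(2.0 + 0.5 * series[-1])

c, phi = fit_ar1(series)
# one-step forecast from the last observation
yhat = c + phi * series[-1]
print(c, phi, yhat)
```

On real, noisy data the estimates will only approximate the underlying coefficients, and a library implementation also handles higher orders and diagnostics.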
Python Code
# AR example
from statsmodels.tsa.ar_model import AutoReg
from random import random
# contrived dataset
data = [x + random() for x in range(1, 100)]
# fit model
model = AutoReg(data, lags=1)
model_fit = model.fit()
# make prediction
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.ar_model.AutoReg API
statsmodels.tsa.ar_model.AutoRegResults API
Autoregressive model on Wikipedia
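Moving Average (MA)
The moving average (MA) method models the next step in the sequence as a linear function of
the residual errors from a mean process at prior time steps.
A moving average model is different from calculating the moving average of the time series.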
The notation for the model involves specifying the order of the model q as a parameter to the
MA function, e.g. MA(q). For example, MA(1) is a first-order moving average model.
The method is suitable for univariate time series without trend and seasonal components.
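The difference between an MA model and a moving average of the series can be sketched in plain Python. The coefficient theta, the mean mu, and the window size below are assumptions chosen only for illustration.

```python
# Sketch contrasting an MA(1) process with a rolling (moving) average.
# theta, mu and window are assumed values for illustration.
from random import gauss, seed

seed(1)
theta = 0.6   # assumed MA(1) coefficient
mu = 0.0      # assumed process mean

# MA(1) process: each value is the mean plus the current error
# plus a weighted copy of the previous error.
errors = [gauss(0, 1) for _ in range(101)]
ma_process = [mu + errors[t] + theta * errors[t - 1] for t in range(1, 101)]

# Rolling average: a smoothing of the observed values themselves.
window = 3
rolling_mean = [
    sum(ma_process[i - window + 1:i + 1]) / window
    for i in range(window - 1, len(ma_process))
]

print(len(ma_process), len(rolling_mean))
```

The first list is generated from unobserved errors, which is what an MA model estimates; the second is just a smoothed copy of the data.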
Python Code
We can use the ARIMA class to create an MA model by setting a zeroth-order AR model and no
differencing. We must specify the order of the MA model in the order argument.
# MA example
from statsmodels.tsa.arima.model import ARIMA
from random import random
# contrived dataset
data = [x + random() for x in range(1, 100)]
# fit model; order=(p, d, q), so (0, 0, 1) is a pure MA(1) model
model = ARIMA(data, order=(0, 0, 1))
model_fit = model.fit()
# make prediction
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.arima.model.ARIMA API
statsmodels.tsa.arima.model.ARIMAResults API
Moving-average model on Wikipedia
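Autoregressive Moving Average (ARMA)
The autoregressive moving average (ARMA) method models the next step in the sequence as a
linear function of the observations and residual errors at prior time steps.
The notation for the model involves specifying the order for the AR(p) and MA(q) models as
parameters to an ARMA function, e.g. ARMA(p, q). An ARIMA model can be used to develop
AR or MA models.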
The method is suitable for univariate time series without trend and seasonal components.
Python Code
# ARMA example
from statsmodels.tsa.arima.model import ARIMA
from random import random
# contrived dataset
data = [random() for x in range(1, 100)]
# fit model; order=(p, d, q), so (2, 0, 1) is an ARMA(2, 1) model
model = ARIMA(data, order=(2, 0, 1))
model_fit = model.fit()
# make prediction
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.arima.model.ARIMA API
statsmodels.tsa.arima.model.ARIMAResults API
Autoregressive–moving-average model on Wikipedia
Autoregressive Integrated Moving Average (ARIMA)
The ARIMA method combines both Autoregression (AR) and Moving Average (MA) models as well as a
differencing pre-processing step that makes the sequence stationary, called
integration (I).
The notation for the model involves specifying the order for the AR(p), I(d), and MA(q) models
as parameters to an ARIMA function, e.g. ARIMA(p, d, q). An ARIMA model can also be used
to develop AR, MA, and ARMA models.
The method is suitable for univariate time series with trend and without seasonal
components.
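The differencing step can be sketched in a few lines of plain Python. The helper names difference and invert_difference are hypothetical and only illustrate the idea; an ARIMA implementation applies the equivalent transform internally when d is greater than zero.

```python
# Differencing sketch: remove a trend by subtracting the previous value,
# then invert the transform to get back to the original scale.
# difference and invert_difference are hypothetical helper names.

def difference(series, interval=1):
    """Return series[t] - series[t - interval] for every valid t."""
    return [series[i] - series[i - interval] for i in range(interval, len(series))]

def invert_difference(history, diff_value, interval=1):
    """Add a differenced value back onto the matching historical observation."""
    return diff_value + history[-interval]

# contrived series with a linear trend: 5, 8, 11, ...
series = [3 * t + 5 for t in range(10)]
diffed = difference(series)
print(diffed)

# a forecast of 3 on the differenced scale maps back to the original scale
forecast = invert_difference(series, 3)
print(forecast)
```

Passing a larger interval, for example the seasonal period, gives seasonal differencing with the same two helpers.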
Python Code
# ARIMA example
from statsmodels.tsa.arima.model import ARIMA
from random import random
# contrived dataset
data = [x + random() for x in range(1, 100)]
# fit model
model = ARIMA(data, order=(1, 1, 1))
model_fit = model.fit()
# make prediction (returned on the original scale of the data)
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.arima.model.ARIMA API
statsmodels.tsa.arima.model.ARIMAResults API
Autoregressive integrated moving average on Wikipedia
Seasonal Autoregressive Integrated Moving-Average (SARIMA)
The SARIMA method combines the ARIMA model with the ability to perform the same autoregression,
differencing, and moving average modeling at the seasonal level.
The notation for the model involves specifying the order for the AR(p), I(d), and MA(q) models
as parameters to an ARIMA function and AR(P), I(D), MA(Q) and m parameters at the
seasonal level, e.g. SARIMA(p, d, q)(P, D, Q)m where “m” is the number of time steps in each
season (the seasonal period). A SARIMA model can be used to develop AR, MA, ARMA and
ARIMA models.
The method is suitable for univariate time series with trend and/or seasonal components.
Python Code
# SARIMA example
from statsmodels.tsa.statespace.sarimax import SARIMAX
from random import random
# contrived dataset
data = [x + random() for x in range(1, 100)]
# fit model; seasonal_order=(P, D, Q, m) is all zeros here because the
# contrived data has no seasonality (use m > 1 for seasonal data)
model = SARIMAX(data, order=(1, 1, 1), seasonal_order=(0, 0, 0, 0))
model_fit = model.fit(disp=False)
# make prediction
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.statespace.sarimax.SARIMAX API
statsmodels.tsa.statespace.sarimax.SARIMAXResults API
Autoregressive integrated moving average on Wikipedia
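Seasonal Autoregressive Integrated Moving-Average with Exogenous Regressors (SARIMAX)
The SARIMAX method is an extension of the SARIMA model that also includes the modeling of
exogenous variables.
Exogenous variables are also called covariates and can be thought of as parallel input
sequences that have observations at the same time steps as the original series. The primary
series may be referred to as endogenous data to contrast it with the exogenous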
sequence(s). The observations for exogenous variables are included in the model directly at
each time step and are not modeled in the same way as the primary endogenous sequence
(e.g. as an AR, MA, etc. process).
The SARIMAX method can also be used to model the subsumed models with exogenous
variables, such as ARX, MAX, ARMAX, and ARIMAX.
The method is suitable for univariate time series with trend and/or seasonal components and
exogenous variables.
Python Code
# SARIMAX example
from statsmodels.tsa.statespace.sarimax import SARIMAX
from random import random
# contrived dataset
data1 = [x + random() for x in range(1, 100)]
data2 = [x + random() for x in range(101, 200)]
# fit model
model = SARIMAX(data1, exog=data2, order=(1, 1, 1), seasonal_order=(0, 0, 0, 0))
model_fit = model.fit(disp=False)
# make prediction
exog2 = [200 + random()]
yhat = model_fit.predict(len(data1), len(data1), exog=[exog2])
print(yhat)
More Information
statsmodels.tsa.statespace.sarimax.SARIMAX API
statsmodels.tsa.statespace.sarimax.SARIMAXResults API
Autoregressive integrated moving average on Wikipedia
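Vector Autoregression (VAR)
The vector autoregression (VAR) method models the next step in each time series using an AR
model. It is the generalization of AR to multiple parallel time series.
The notation for the model involves specifying the order for the AR(p) model as a parameter
to a VAR function, e.g. VAR(p).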
The method is suitable for multivariate time series without trend and seasonal components.
Python Code
# VAR example
from statsmodels.tsa.vector_ar.var_model import VAR
from random import random
# contrived dataset with dependency
data = list()
for i in range(100):
    v1 = i + random()
    v2 = v1 + random()
    row = [v1, v2]
    data.append(row)
# fit model
model = VAR(data)
model_fit = model.fit()
# make prediction from the observations the model was fit on
yhat = model_fit.forecast(model_fit.endog, steps=1)
print(yhat)
More Information
statsmodels.tsa.vector_ar.var_model.VAR API
statsmodels.tsa.vector_ar.var_model.VARResults API
Vector autoregression on Wikipedia
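Vector Autoregression Moving-Average (VARMA)
The vector autoregression moving-average (VARMA) method models the next step in each time
series using an ARMA model. It is the generalization of ARMA to multiple parallel time series.
The notation for the model involves specifying the order for the AR(p) and MA(q) models as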
parameters to a VARMA function, e.g. VARMA(p, q). A VARMA model can also be used to
develop VAR or VMA models.
The method is suitable for multivariate time series without trend and seasonal components.
Python Code
# VARMA example
from statsmodels.tsa.statespace.varmax import VARMAX
from random import random
# contrived dataset with dependency
data = list()
for i in range(100):
    v1 = random()
    v2 = v1 + random()
    row = [v1, v2]
    data.append(row)
# fit model
model = VARMAX(data, order=(1, 1))
model_fit = model.fit(disp=False)
# make prediction
yhat = model_fit.forecast()
print(yhat)
More Information
statsmodels.tsa.statespace.varmax.VARMAX API
statsmodels.tsa.statespace.varmax.VARMAXResults
Vector autoregression on Wikipedia
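Vector Autoregression Moving-Average with Exogenous Regressors (VARMAX)
The VARMAX method is an extension of the VARMA model that also includes the modeling of
exogenous variables.
Exogenous variables are also called covariates and can be thought of as parallel input
sequences that have observations at the same time steps as the original series. The primary
series are referred to as endogenous data to contrast them with the exogenous sequence(s).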
The observations for exogenous variables are included in the model directly at each time
step and are not modeled in the same way as the primary endogenous sequence (e.g. as an
AR, MA, etc. process).
The VARMAX method can also be used to model the subsumed models with exogenous
variables, such as VARX and VMAX.
The method is suitable for multivariate time series without trend and seasonal components
with exogenous variables.
Python Code
# VARMAX example
from statsmodels.tsa.statespace.varmax import VARMAX
from random import random
# contrived dataset with dependency
data = list()
for i in range(100):
    v1 = random()
    v2 = v1 + random()
    row = [v1, v2]
    data.append(row)
data_exog = [x + random() for x in range(100)]
# fit model
model = VARMAX(data, exog=data_exog, order=(1, 1))
model_fit = model.fit(disp=False)
# make prediction
data_exog2 = [[100]]
yhat = model_fit.forecast(exog=data_exog2)
print(yhat)
More Information
statsmodels.tsa.statespace.varmax.VARMAX API
statsmodels.tsa.statespace.varmax.VARMAXResults
Vector autoregression on Wikipedia
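Simple Exponential Smoothing (SES)
The simple exponential smoothing (SES) method models the next time step as an exponentially
weighted linear function of observations at prior time steps.
The method is suitable for univariate time series without trend and seasonal components.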
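The smoothing recursion itself is short enough to sketch by hand. The helper name simple_exp_smoothing, the fixed alpha, and the choice to initialize the level with the first observation are assumptions for illustration; a library implementation typically estimates these values from the data.

```python
# Simple exponential smoothing sketch in plain Python.
# simple_exp_smoothing is a hypothetical helper; alpha is fixed by hand here.

def simple_exp_smoothing(series, alpha):
    """Return the one-step-ahead forecast (the final smoothed level)."""
    level = series[0]  # assumed initialization: start at the first observation
    for value in series[1:]:
        # new level = weighted blend of the latest value and the previous level
        level = alpha * value + (1 - alpha) * level
    return level

data = [10.0, 12.0, 11.0, 13.0]
forecast = simple_exp_smoothing(data, alpha=0.5)
print(forecast)
```

A larger alpha puts more weight on recent observations, while a smaller alpha produces a smoother, slower-moving forecast.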
Python Code
# SES example
from statsmodels.tsa.holtwinters import SimpleExpSmoothing
from random import random
# contrived dataset
data = [x + random() for x in range(1, 100)]
# fit model
model = SimpleExpSmoothing(data)
model_fit = model.fit()
# make prediction
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.holtwinters.SimpleExpSmoothing API
statsmodels.tsa.holtwinters.HoltWintersResults API
Exponential smoothing on Wikipedia
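Holt Winter’s Exponential Smoothing (HWES)
The Holt Winter’s Exponential Smoothing (HWES) method, also called triple exponential
smoothing, models the next time step as an exponentially weighted linear function of
observations at prior time steps, taking trends and seasonality into account.
The method is suitable for univariate time series with trend and/or seasonal components.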
Python Code
# HWES example
from statsmodels.tsa.holtwinters import ExponentialSmoothing
from random import random
# contrived dataset
data = [x + random() for x in range(1, 100)]
# fit model
model = ExponentialSmoothing(data)
model_fit = model.fit()
# make prediction
yhat = model_fit.predict(len(data), len(data))
print(yhat)
More Information
statsmodels.tsa.holtwinters.ExponentialSmoothing API
statsmodels.tsa.holtwinters.HoltWintersResults API
Exponential smoothing on Wikipedia
Further Reading
This section provides more resources on the topic if you are looking to go deeper.
Summary
In this post, you discovered a suite of classical time series forecasting methods that you can
test and tune on your time series dataset.
REPLY $
Adriena Welch August 6, 2018 at 3:20 pm #
Hi Jason, thanks for such an excellent and comprehensive post on time series. I
sincerely appreciate your effort. Since you asked for further topics, I wonder if I can
request a specific topic that I have been struggling to get output for. It’s about the Structural
Dynamic Factor Model (SDFM) by Barigozzi, M., Conti, A., and Luciani, M. (Do euro area
countries respond asymmetrically to the common monetary policy) and Mario Forni and Luca
Gambetti (The Dynamic Effects of Monetary Policy: A Structural Factor Model Approach).
Would it be possible for you to go over and estimate these two models using Python or R?
It’s just a request from me, and sorry if it doesn’t match your interests.
REPLY $
Jason Brownlee August 7, 2018 at 6:23 am #
Thanks for the suggestion. I’ve not heard of that method before.
REPLY $
Akhila March 10, 2020 at 7:17 pm #
I have 1000 time series data points and I want to predict the next 500 steps. How
can I do this?
REPLY $
Jason Brownlee March 11, 2020 at 5:22 am #
Hi Jason,
Great technical insights. I work in the oil and gas sector, and one of our day-to-day
jobs is rate vs. time oil rate forecasting. We typically use empirical
equations and use regression to fit those empirical input parameters. I was
wondering if we could use sophisticated ML techniques to accomplish this?
Thanks!
Thanks.
REPLY $
Dali July 26, 2020 at 12:06 am #
REPLY $
Kamal Singh August 6, 2018 at 6:19 pm #
I am working on time series prediction with neural networks and SVR. I want to do
this in MATLAB from scratch. Can you give me references for these materials?
Thank you in advance
REPLY $
Jason Brownlee August 7, 2018 at 6:26 am #
Sorry, I don’t have any materials for matlab, it is only really used in universities.
REPLY $
Catalin August 6, 2018 at 8:50 pm #
Hi Jason! From which editor do you import the Python code into the webpage of
your article? Or what kind of container is that windowed control used to display the Python
code?
REPLY $
Jason Brownlee August 7, 2018 at 6:26 am #
Great question, I explain the software I use for the website here:
https://machinelearningmastery.com/faq/single-faq/what-software-do-you-use-to-run-your-website
REPLY $
Mike August 7, 2018 at 2:28 am #
I recently stumbled over some tasks where the classic algorithms like linear regression or
decision trees outperformed even sophisticated NNs. Especially when boosted or averaged
out with each other.
Maybe it’s time to try the same with time series forecasting as I’m not getting good results for
some tasks with an LSTM.
REPLY $
Jason Brownlee August 7, 2018 at 6:30 am #
Always start with simple methods before trying more advanced methods.
REPLY $
Elie Kawerk August 7, 2018 at 2:36 am #
Hi Jason,
You’ve imported the sin function from math many times but have not used it.
I’d like to see more posts about GARCH, ARCH and co-integration models.
Best,
Elie
REPLY $
Jason Brownlee August 7, 2018 at 6:30 am #
Thanks, fixed.
REPLY $
Elie Kawerk August 7, 2018 at 2:38 am #
Will you consider writing a follow-up book on advanced time-series models soon?
REPLY $
Jason Brownlee August 7, 2018 at 6:32 am #
Yes, it is written. I am editing it now. The title will be “Deep Learning for Time
Series Forecasting”.
CNNs are amazing at time series, and CNNs + LSTMs together are really great.
REPLY $
Elie Kawerk August 7, 2018 at 6:40 am #
Will the new book cover classical time-series models like VAR, GARCH, ...?
REPLY $
Jason Brownlee August 7, 2018 at 2:29 pm #
The focus is deep learning (MLP, CNN and LSTM) with tutorials on how
to get the most from classical methods (Naive, SARIMA, ETS) before jumping
into deep learning methods. I hope to have it done by the end of the month.
This is great news! Don’t you think that R is better suited than
Python for classical time-series models?
Great to hear this news. May I ask if the book also covers the topics of
multivariate and multistep forecasting?
Yes, there are many chapters on multi-step and most chapters work
with multivariate data.
REPLY $
Søren August 7, 2018 at 10:27 pm #
Sounds amazing that you are finally getting the new book out on time-series
models. When will it be available to buy?
REPLY $
Jason Brownlee August 8, 2018 at 6:20 am #
REPLY $
Kalpesh Ghadigaonkar March 5, 2019 at 1:01 am #
REPLY $
Arun Mishra August 10, 2018 at 5:25 am #
I use Prophet.
https://facebook.github.io/prophet/docs/quick_start.html
REPLY $
Jason Brownlee August 10, 2018 at 6:21 am #
Thanks.
REPLY $
AJ Rader August 16, 2018 at 7:11 am #
I would second the use of prophet, especially in the context of shock events
— this is where this approach has a unique advantage.
REPLY $
Jason Brownlee August 16, 2018 at 1:56 pm #
REPLY $
Naresh May 4, 2019 at 4:28 pm #
Hi, can you please help me find a method for time series forecasting of 10000
products at the same time?
REPLY $
Jason Brownlee May 5, 2019 at 6:24 am #
I have some suggestions here that might help (replace “site” with
“product”):
https://machinelearningmastery.com/faq/single-faq/how-to-develop-forecast-models-for-multiple-sites
REPLY $
User104 May 17, 2019 at 2:24 pm #
Hi Arun,
Can you let me know how you worked with fbprophet? I am struggling with the
installation of the fbprophet module, since it asks for a C++ compiler. Can you please
share how you installed the C++ compiler? I tried every way I could find to resolve it.
Thanks
REPLY $
Jason Brownlee May 17, 2019 at 2:53 pm #
REPLY $
TJ Slezak June 5, 2019 at 1:45 am #
Nice!
REPLY $
Ravi Rokhade August 10, 2018 at 5:19 pm #
REPLY $
Jason Brownlee August 11, 2018 at 6:06 am #
REPLY $
Alberto Garcia Galindo August 11, 2018 at 12:14 am #
Hi Jason!
Firstly I congratulate you for your blog. It is helping me a lot in my final work on my
bachelor’s degree in Statistics!
What are the assumptions for forecasting time series using machine learning
algorithms? For example, must the series be stationary? Thanks!
REPLY $
Jason Brownlee August 11, 2018 at 6:11 am #
The methods like SARIMA/ETS try to make the series stationary as part of modeling (e.g.
differencing).
You may want to look at power transforms to make data more Gaussian.
REPLY $
sana June 23, 2020 at 6:30 am #
REPLY $
Jason Brownlee June 23, 2020 at 6:33 am #
It depends on the algorithm and the data, yes this is often the case.
REPLY $
Neeraj August 12, 2018 at 4:55 pm #
Hi Jason
I’m interested in forecasting the temperatures
I’m provided with the previous data of the temperature
Can you suggest the procedure I should follow in order to solve this problem?
REPLY $
Jason Brownlee August 13, 2018 at 6:15 am #
REPLY $
Den August 16, 2018 at 12:15 am #
Hey Jason,
Real quick:
How would you combine VARMAX with an SVR in python?
Elaboration.
Right now I am trying to predict a y-value, and have x1…xn variables.
The tricky part is, the rows are grouped.
So, for example.
If the goal is to predict the price of a certain car in the 8th year, and I have data for 1200 cars,
and for each car I have x11_xnm –> y1_xm data (meaning that let’s say car_X has data until
m=10 years and car_X2 has data until m=3 years, for example).
First I divide the data with the 80/20 split, trainset/testset, here the first challenge arises. How
to make the split?? I chose to split the data based on the car name, then for each car I
gathered the data for year 1 to m. (If this approach is wrong, please tell me) The motivation
behind this, is that the 80/20 could otherwise end up with data of all the cars of which some
would have all the years and others would have none of the years. aka a very skewed
distribution.
Final question(s).
How do you make a time series prediction if you have multiple groups [in this case 1200
cars, each of which have a variable number of years(rows)] to make the model from?
Am I doing right by using the VARMAX or could you tell me a better approach?
Sorry for the long question and thank you for your patience!
Best,
Den
REPLY $
Jason Brownlee August 16, 2018 at 6:09 am #
You can try model per group or across groups. Try both and see what works
best.
Compare a suite of ml methods to varmax and use what performs the best on your
dataset.
REPLY $
Petrônio Cândido August 16, 2018 at 6:36 am #
Hi Jason!
Excellent post! I would also like to invite you to get to know Fuzzy Time Series, which are
data-driven, scalable and interpretable methods to analyze and forecast time series data. I have
recently published a Python library for that at http://petroniocandido.github.io/pyFTS/ .
REPLY $
Jason Brownlee August 16, 2018 at 1:55 pm #
REPLY $
SuFian AhMad May 15, 2019 at 3:43 pm #
Hello Sir, can you please share example code using your Fuzzy Logic
time series library?
I want to implement Fuzzy Logic time series, and I am just a student, so it would
be a great help if you could assist me with this.
I just need some sample code written in Python.
REPLY $
Chris Phillips August 30, 2018 at 8:19 am #
Hi Jason,
Thank you so much for the many code examples on your site. I am wondering if you can help
an amateur like me with something.
When I pull data from our database, I generally do it for multiple SKU’s at the same time into
a large table. Considering that there are thousands of unique SKU’s in the table, is there a
methodology you would recommend for generating a forecast for each individual SKU? My
initial thought is to run a loop and say something to the effect of: For each in SKU run…Then
the VAR Code or the SARIMA code.
Ideally I’d love to use SARIMA, as I think this works the best for the data I am looking to
forecast, but if that is only available to one SKU at a time and VAR is not constrained by this,
it will work as well. If there is a better methodology that you know of for these, I would gladly
take this advice as well!
REPLY $
Jason Brownlee August 30, 2018 at 4:49 pm #
REPLY $
Eric September 6, 2018 at 6:32 am #
REPLY $
Jason Brownlee September 6, 2018 at 2:07 pm #
Sounds good, I hope to cover state space methods in the future. To be honest,
I’ve had limited success but also limited exposure with the methods.
Not familiar with the lib. Let me know how you go with it.
REPLY $
Alex Rodriguez April 18, 2019 at 7:30 am #
I use it almost everyday and it really improved the effectiveness of my forecasts over any
other method.
REPLY $
Roberto Tomás September 27, 2018 at 7:38 am #
REPLY $
Jason Brownlee September 27, 2018 at 2:43 pm #
Typically I would write a function to perform the transform and a sister function
to invert it.
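A transform and its matching inverse can be sketched as a pair of small functions. The original question is not shown here, so this example assumes a Box-Cox power transform with a fixed lambda purely for illustration; the function names are hypothetical.

```python
# Transform / inverse-transform pair, assuming a Box-Cox power transform.
# boxcox and invert_boxcox are hypothetical helper names.
from math import exp, log

def boxcox(value, lam):
    """Apply a Box-Cox transform with parameter lam to a positive value."""
    if lam == 0:
        return log(value)
    return (value ** lam - 1) / lam

def invert_boxcox(value, lam):
    """Undo the Box-Cox transform applied by boxcox with the same lam."""
    if lam == 0:
        return exp(value)
    return (lam * value + 1) ** (1 / lam)

data = [1.0, 4.0, 9.0]
lam = 0.5  # assumed value; in practice it is tuned to the data
transformed = [boxcox(v, lam) for v in data]
restored = [invert_boxcox(v, lam) for v in transformed]
print(transformed, restored)
```

Forecasts made on the transformed scale would be passed through the inverse function before being compared with the original observations.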
REPLY $
Sara October 2, 2018 at 7:36 am #
Thanks for your great tutorial posts. This one was very helpful. I am wondering if
there is any method that is suitable for multivariate time series with trend and/or seasonal
components?
REPLY $
Jason Brownlee October 2, 2018 at 11:03 am #
You can experiment with each with and without data prep to make the series stationary.
REPLY $
Sara October 3, 2018 at 1:48 am #
Thanks for your response. I also have another question that I would appreciate
your help with.
I have a dataset which includes multiple time series variables which are not
stationary, and it seems that these variables are not dependent on each other. I tried
ARIMA for each variable column, and also VAR for the pair of variables. I expected to get
better results with the ARIMA model (given the non-stationarity of the time series), but VAR
provides much better predictions. Do you have any thoughts on why?
REPLY $
Jason Brownlee October 3, 2018 at 6:20 am #
REPLY $
Eric October 17, 2018 at 9:52 am #
Hi Jason,
If I don’t, then I can’t tell if a change in X is related to a change in Y, or if they are both just
trending with time. The time trend dominates as 0 <= random() <= 1
https://robjhyndman.com/hyndsight/arimax/
Thanks
REPLY $
Jason Brownlee October 17, 2018 at 2:27 pm #
No, the library will not do this for you. Differencing is only performed on the
provided series, not the exogenous variables.
Perhaps try with and without and use the approach that results in the lowest forecast
error for your specific dataset.
REPLY $
Andrew K October 23, 2018 at 9:09 am #
Hi Jason,
I do have a question regarding data that isn’t continuous, for example, data that can only be
measured during daylight hours. How would you approach a time series analysis
(forecasting) with data that has this behavior? Fill non-daylight hour data with 0’s or nan’s?
Thanks.
REPLY $
Jason Brownlee October 23, 2018 at 2:25 pm #
I’d encourage you to test many different framings of the problem to see what
works.
REPLY $
Khalifa Ali October 23, 2018 at 4:48 pm #
Hey..
Kindly Help us in making hybrid forecasting techniques.
Using two forecasting technique and make a hybrid technique from them.
For example, you may use any two of the techniques mentioned above and make a hybrid technique from
them.
Thanks.
REPLY $
Jason Brownlee October 24, 2018 at 6:25 am #
Sure, what problem are you having with using multiple methods exactly?
REPLY $
Mohammad Alzyout October 31, 2018 at 6:25 pm #
I wondered which is the best way to forecast the next second Packet Error Rate in DSRC
network for safety messages exchange between vehicles to decide the best distribution over
Access Categories of EDCA.
Kindly note that I’m a beginner in both methods and want to decide on the best one to go deep
with, because I don’t have enough time to learn both methods, especially as I think they come
from different backgrounds.
Best regards,
Mohammad.
REPLY $
Jason Brownlee November 1, 2018 at 6:03 am #
I recommend testing a suite of methods in order to discover what works best for
your specific problem.
REPLY $
Jawad November 8, 2018 at 12:33 am #
Hi Jason,
Thanks for the great post. I have 2 questions. First, is there a way to calculate confidence
intervals in HWES? I could not find any way in the documentation. And second, do
we have something like ‘nnetar’, R’s neural network package for time series forecasting,
available in Python?
Regards
REPLY $
Jason Brownlee November 8, 2018 at 6:10 am #
I’m not sure if the library has a built in confidence interval, you could calculate it
yourself:
https://machinelearningmastery.com/confidence-intervals-for-machine-learning/
What is “nnetar”?
REPLY $
joao May 4, 2020 at 3:50 am #
https://www.rdocumentation.org/packages/forecast/versions/8.12/topics/nnetar
REPLY $
Jawad Iqbal November 22, 2018 at 8:46 am #
REPLY $
Jason Brownlee November 22, 2018 at 2:08 pm #
REPLY $
Rima December 4, 2018 at 9:59 pm #
Hi Jason,
Thank you for this great post!
In VARMAX section, at the end you wrote:
“The method is suitable for univariate time series without trend and seasonal components
and exogenous variables.”
I understand from the description of VARMAX that it takes as input, multivariate time series
and exogenous variables. No?
Another question: can we use the seasonal_decompose
(https://www.statsmodels.org/dev/generated/statsmodels.tsa.seasonal.seasonal_decompose.html)
function in Python to remove the seasonality and transform our time series into a
stationary time series? If so, is the residual (output of seasonal_decompose) what
we are looking for?
Thanks!
Rima
REPLY $
Jason Brownlee December 5, 2018 at 6:16 am #
Thanks, fixed.
REPLY $
Rima December 11, 2018 at 9:36 pm #
REPLY $
Jason Brownlee December 12, 2018 at 5:53 am #
REPLY $
Lucky December 5, 2018 at 12:49 am #
Hi Jason,
Could you please help me list down the names of all the models available to forecast a
univariate time series?
Thanks!
REPLY $
Jason Brownlee December 5, 2018 at 6:18 am #
REPLY $
Jane December 6, 2018 at 5:21 am #
Hi Jason,
For the AR code, is there any modification I can make so that model predicts multiple
periods as opposed to the next one? For example, if I am using a monthly time series, and
have data up until August 2018, the AR predicts September 2018. Can it predict September
2018, October, 2018, and November 2018 based on the same model and give me these
results?
REPLY $
Jason Brownlee December 6, 2018 at 6:03 am #
Yes, you can specify the interval for which you need a prediction.
REPLY $
Jane December 7, 2018 at 3:29 am #
How might I go about doing that? I have read through the statsmodels
methods and have not found a variable that allows this.
REPLY $
Jason Brownlee December 7, 2018 at 5:25 am #
REPLY $
Abzal June 28, 2020 at 6:04 am #
Hi Jason,
I have dynamic demand forecasting problem, i.e. simple time series but with
DaysBeforeDeparture complexity added.
Historical data looks like:
daysbeforedeparture – DepartureDate Time – Bookings
175. 21 09 2018. 16 00 00 – 10
Etc.
I am trying to use fbProphet model, aggregating time series by hour, but no idea how
to deal with DBDeparture feature.
REPLY $
Jason Brownlee June 29, 2020 at 6:20 am #
I think the date is redundant as you already have days before departure.
I think you may have 2 inputs, the days before departure, bookings, and one
output, the bookings. This would be a multivariate input and univariate output. Or
univariate forecasting with an exog variable.
In fact, if you have sequential data for days before departure, it is also redundant,
and you can model bookings directly as a univariate time series.
REPLY $
Esteban December 21, 2018 at 6:56 am #
REPLY $
Jason Brownlee December 21, 2018 at 3:15 pm #
Yes, I have many examples and a book on the topic. You can get started here:
https://machinelearningmastery.com/start-here/#deep_learning_time_series
REPLY $
mk December 22, 2018 at 10:34 pm #
All methods have common problems. In real life, we do not need to predict the
sample data. The sample data already contains the values of the next moment. The so-called
prediction is only based on a difference, time lag. That is to say, the best prediction is
performance delay. If we want to predict the future, we don’t know the value of the current
moment. How do we predict? Or maybe we have collected the present and past values,
trained for a long time, and actually the next moment has passed. What need do we have to
predict?
REPLY $
Jason Brownlee December 23, 2018 at 6:06 am #
You can frame the problem any way you wish, e.g. carefully define what inputs
you have and what output you want to predict, then fit a model to achieve that.
REPLY $
Dr. Omar January 2, 2019 at 1:40 am #
Dear Jason: your post and book look interesting. I am interested in forecasting a
daily close price for a stock market or any other symbol. The data collected is very large and
contains each price (let’s say one price for each second). Can you briefly tell how we can
predict this in general, and whether your book and example code, if applied, will yield future data?
After inputting our data and producing the plot for the past data, can we extend the
time series and get the predicted prices for the next day/month/year? Please explain.
REPLY $
Jason Brownlee January 2, 2019 at 6:41 am #
REPLY $
AD January 4, 2019 at 7:51 pm #
Hi Jason,
CUSTOMER_NUMBER
CUSTOMER_TRX_ID
INVOICE_NUMBER
INVOICE_DATE
RECEIPT_AMOUNT
BAL_AMOUNT
CUSTOMER_PROFILE
CITY_STATE
STATE
PAYMENT_TERM
DUE_DATE
PAYMENT_METHOD
RECEIPT_DATE
It would be a great help if you can guide me which algo be suitable for this requirement. I
think a multivariate method can satisfy this requirement
Thanks,
AD
REPLY $
Jason Brownlee January 5, 2019 at 6:53 am #
REPLY $
Marius January 8, 2019 at 7:17 am #
Hi Jason,
Kindest
Marius
REPLY $
Jason Brownlee January 8, 2019 at 11:12 am #
REPLY $
Kostyantyn Kravchenko June 27, 2019 at 8:53 pm #
REPLY $
Jason Brownlee June 28, 2019 at 6:01 am #
REPLY $
Heracles January 11, 2019 at 8:35 pm #
Hi Jason,
REPLY $
Jason Brownlee January 12, 2019 at 5:40 am #
Sounds like it might be easier to model the problem as time series classification.
I have some examples of activity recognition, which is time series classification that
might provide a good starting point:
https://machinelearningmastery.com/start-here/#deep_learning_time_series
REPLY $
Gary Morton January 13, 2019 at 5:00 am #
Good morning
A quality cheat sheet for time series, which I took time to re-create and decided to try to
augment by adding code snippets for ARCH and GARCH.
It did not take long to realize that Statsmodels does not have an ARCH function, leading to a
google search that took me directly to:
https://machinelearningmastery.com/develop-arch-and-garch-models-for-time-series-forecasting-in-python/
Great work =) Thought to include here as I did not see a direct link, sans your above
comment on thinking to do an ARCH and GARCH module.
REPLY $
Jason Brownlee January 13, 2019 at 5:45 am #
Thanks, and many more here, but neural nets are not really “classic” methods
and arch only forecasts volatility:
https://machinelearningmastery.com/start-here/#deep_learning_time_series
REPLY $
Mahmut COLAK January 14, 2019 at 10:08 am #
Hi Jason,
Thank you very much for this paper. I have a time series problem but I can’t find any technique
to apply. My dataset includes multiple inputs and one output, like multiple linear regression,
but it also has a timestamp. Which algorithm is the best solution for my problem?
Thanks.
REPLY $
Jason Brownlee January 14, 2019 at 11:15 am #
REPLY $
Mahmut COLAK January 14, 2019 at 10:16 am #
Hi Jason,
Thanks.
REPLY $
Jason Brownlee January 14, 2019 at 11:15 am #
I recommend testing a suite of methods and discovering what works best for your
specific dataset.
REPLY $
RedzCh January 18, 2019 at 11:12 pm #
REPLY $
Jason Brownlee January 19, 2019 at 5:41 am #
REPLY $
Rodrigo January 18, 2019 at 11:15 pm #
The input to a possible model could be the alerts (each alert suitably encoded, as they are
categorical variables) of the last x days, but this input must have a fixed size/format. Since
the time windows don’t have the same number of alerts, I don’t know the correct way
to deal with this problem.
Any data formatting suggestions to make the observations uniform over time?
REPLY $
Jason Brownlee January 19, 2019 at 5:44 am #
There are many ways to frame and model the problem and I would encourage you to
explore a number and discover what works best.
First, you need to confirm that you have data that can be used to predict the outcome,
e.g. is it temporally dependent, or whatever it is dependent upon, can the model get
access to that.
Then, perhaps explore modeling it as a time series classification problem, e.g. is the event
going to occur in this interval. Explore different interval sizes and different input history
sizes and see what works.
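One way to get fixed-size inputs from windows that hold varying numbers of alerts is to count alerts per type in each interval. The following is only a minimal sketch: the alert log, the `ALERT_TYPES` list and the `count_alerts_per_interval` helper are hypothetical names made up for illustration, not something from the post.

```python
from collections import Counter

# hypothetical alert log: (day index, alert type)
alerts = [(0, "A"), (0, "B"), (1, "A"), (3, "C"), (3, "A"), (3, "A"), (6, "B")]
ALERT_TYPES = ["A", "B", "C"]

def count_alerts_per_interval(alerts, n_days, interval=1):
    """Return one fixed-length count vector per interval of `interval` days."""
    n_intervals = (n_days + interval - 1) // interval
    counts = [Counter() for _ in range(n_intervals)]
    for day, kind in alerts:
        counts[day // interval][kind] += 1
    # every row has len(ALERT_TYPES) entries, however many alerts occurred
    return [[c[k] for k in ALERT_TYPES] for c in counts]

rows = count_alerts_per_interval(alerts, n_days=7, interval=1)
for day, row in enumerate(rows):
    print(day, row)
```

Changing `interval` is one way to explore the different interval sizes mentioned above; each row can then be paired with a label such as "did the event occur in the next interval".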
REPLY $
Mery January 28, 2019 at 2:33 am #
Hello Sir,
I have a question.
I wanna know if we can use the linear regression model for time series data ?
REPLY $
Jason Brownlee January 28, 2019 at 7:15 am #
You can, but it probably won’t perform as well as specialized linear methods like
SARIMA or ETS.
REPLY $
sunil February 5, 2019 at 5:40 pm #
I have time series data and I want to plot the seasonality graph from it. I am familiar with
Holt-Winters. Are there any other methods?
REPLY $
Jason Brownlee February 6, 2019 at 7:38 am #
You can plot the series directly to see any seasonality that may be present.
REPLY $
Sai Teja February 11, 2019 at 9:11 pm #
Hi Jason Brownlee
Like the auto_arima function present in R, do we have any functions like that in Python
for VAR, VARMAX, SES, HWES, etc.?
REPLY $
Jason Brownlee February 12, 2019 at 8:00 am #
REPLY $
Michal February 13, 2019 at 1:08 am #
REPLY $
Jason Brownlee February 13, 2019 at 8:00 am #
REPLY $
Bilal February 15, 2019 at 8:41 am #
Dear Jason,
Thanks for your valuable effort and explanations in such a simple way…
It would be very good to see the structural development of the code from simple to more
complex.
Best regards,
Bilal
REPLY $
Jason Brownlee February 15, 2019 at 2:17 pm #
REPLY $
Ayoub Benaissa March 2, 2019 at 8:44 am #
REPLY $
Jason Brownlee March 2, 2019 at 9:37 am #
Perhaps ask the person who gave you the assignment what they meant exactly?
REPLY $
Rafael March 12, 2019 at 12:46 am #
Thank you. I have a dataset in CSV format that I can open in Excel; it has data from
1984 until 2019. I want to train an artificial neural network in Python in order to make forecasts
or predictions from that dataset. I was thinking of an MLP. Could you help me,
Jason, with a guide please? Many thanks.
REPLY $
Jason Brownlee March 12, 2019 at 6:54 am #
REPLY $
Venkat March 24, 2019 at 5:00 pm #
Dear Jason,
Working on one of the banking use cases i.e. Current account and Saving account attrition
prediction.
We are using the last 6 months of data for training. We need to predict customers whose
balance will reduce by more than 70%, with one exception: as long as the money is invested in the same
bank, it is fine.
Great, if you could suggest, which models or time series models will be the best options to
try in this case?
REPLY $
Jason Brownlee March 25, 2019 at 6:42 am #
REPLY $
Augusto O. Rodero March 29, 2019 at 8:08 pm #
Hi Jason,
Thank you so much for this post I learned a lot. I am a fan of the ARMAX models in my work
as a hydrologist for streamflow forecasting.
I hope you can share something about Gamma autoregressive models or GARMA models
which work well even for non-Gaussian time series which the streamflow time series mostly
are. Can we do GARMA in python?
REPLY $
Jason Brownlee March 30, 2019 at 6:26 am #
REPLY $
Zack March 30, 2019 at 11:18 pm #
This blog is very helpful to a novice like me. I have been running the examples you
have provided with some changes to create seasonality for example (period of 10 as in 0 to
9, back to 0, again to 9 with randomness thrown in). Linear regression seems to be better at
it than the others which I find surprising. What am I missing?
REPLY $
Jason Brownlee March 31, 2019 at 9:30 am #
REPLY $
Andrew April 1, 2019 at 10:01 pm #
Hi Jason,
Thank you for all your posts, they are so helpful for people who are starting in this area. I am
trying to forecast some data and they recommended me to use NARX, but I haven’t found a
good implementation in python. Do you know other method implemented in python similar to
NARX?
REPLY $
Jason Brownlee April 2, 2019 at 8:11 am #
You can use a SARIMAX as a NARX, just turn off all the aspects you don’t need.
REPLY $
Berta April 19, 2019 at 10:19 pm #
Hi Jason,
Thank you for all you share with us. It’s very helpful.
REPLY $
Jason Brownlee April 20, 2019 at 7:38 am #
REPLY $
jessy April 20, 2019 at 11:30 am #
sir,
The 11 models above are time series forecasting models, but in a few sections you discuss
persistence models… what is the difference?
REPLY $
Jason Brownlee April 21, 2019 at 8:18 am #
REPLY $
Naomi May 4, 2019 at 1:55 am #
The line in VARMAX “The method is suitable for multivariate time series without trend and
seasonal components and exogenous variables.” is very confusing.
REPLY $
Jason Brownlee May 4, 2019 at 7:12 am #
REPLY $
User104 May 17, 2019 at 2:20 pm #
Hi Jason,
Thanks for the post. It was great and easy to understand for a beginner in time series.
Can you please suggest methods to increase the accuracy as well as RMSE.
Thanks
REPLY $
Jason Brownlee May 17, 2019 at 2:53 pm #
REPLY $
Menghok June 4, 2019 at 9:40 pm #
I have a question. What if my data is a time series with multiple variables, including categorical
data? Which model should be used for this? For example, I’m predicting the air pollution level
using the previous observed values of Temperature + Outlook (rain or not).
Thank you.
REPLY $
Jason Brownlee June 5, 2019 at 8:41 am #
It is a good idea to encode categorical data, e.g. with an integer, one hot
encoding or embedding.
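As a rough illustration of the one hot encoding option, here is a small pure-Python sketch. The `one_hot` helper and the temperature/outlook values are made up for this example; in practice a library encoder would likely be used instead.

```python
def one_hot(values):
    """Map each distinct category to a 0/1 indicator vector."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1
        encoded.append(row)
    return categories, encoded

# hypothetical observations: numeric temperature plus a categorical outlook
temperature = [21.5, 19.0, 23.1, 18.4]
outlook = ["dry", "rain", "dry", "rain"]

categories, outlook_encoded = one_hot(outlook)
# combine numeric and encoded categorical inputs into one row per time step
X = [[t] + enc for t, enc in zip(temperature, outlook_encoded)]
print(categories)
print(X)
```

Each row of `X` can then serve as the exogenous or input part of a model that accepts numeric inputs only.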
REPLY $
Ramesh July 31, 2019 at 2:34 pm #
Hi Menghok, did you have any luck implementing the forecasting problem with
an additional categorical variable in the dataset?
REPLY $
Prasanna June 17, 2019 at 3:45 pm #
Hi Jason,
Thanks for the great post again, wonderful learning experience.
Do you have R codes for the time series methods you described in your article?
or Can you suggest me a good source where can i get R codes to learn some of these
methods?
Thanks
REPLY $
Jason Brownlee June 18, 2019 at 6:32 am #
Sorry, I don’t have R code for time series, perhaps you can start here:
https://machinelearningmastery.com/books-on-time-series-forecasting-with-r/
REPLY $
Ibrahim Rashid June 18, 2019 at 8:39 pm #
Thanks for the post. Please do check out AnticiPy which is an open-source tool for
forecasting using Python and developed by Sky.
The goal of AnticiPy is to provide reliable forecasts for a variety of time series data, while
requiring minimal user effort.
AnticiPy can handle trend as well as multiple seasonality components, such as weekly or
yearly seasonality. There is built-in support for holiday calendars, and a framework for users
to define their own event calendars. The tool is tolerant to data with gaps and null values,
and there is an option to detect outliers and exclude them from the analysis.
Ease of use has been one of our design priorities. A user with no statistical background can
generate a working forecast with a single line of code, using the default settings. The tool
automatically selects the best fit from a list of candidate models, and detects seasonality
components from the data. Advanced users can tune this list of models or even add custom
model components, for scenarios that require it. There are also tools to automatically
generate interactive plots of the forecasts (again, with a single line of code), which can be run
on a Jupyter notebook, or exported as .html or .png files.
REPLY $
Jason Brownlee June 19, 2019 at 7:52 am #
REPLY $
Min June 19, 2019 at 7:46 am #
Hi Jason,
I have a question for time series forecasting. Have you heard about Dynamic Time Warping?
As far as I know, this is a method for time series classification/clustering, but I think it can
also be used for forecasting based on the similar time series. What do you think about this
method compared to ARIMA? Do you think it will be better if I combine the two methods?
For example, use DTW to group similar time series and then use ARIMA for each group?
Thanks
REPLY $
Jason Brownlee June 19, 2019 at 8:20 am #
I don’t have any posts on the topic, but I hope to cover it in the future.
REPLY $
lda4526 June 22, 2019 at 7:11 am #
Can you please explain why you use len(data) in your predict arguments? I was
using the .forecast feature for a while which is for out of sample forecasts but I keep getting
an error on my triple expo smoothing. Apparently, the .predict can be used for in-sample
prediction as well as out of sample. The arguments are start and end, and you use len(data)
for both which is confusing me. Will this really forecast or will it just produce a forecast for
months in the past?
REPLY $
Jason Brownlee June 22, 2019 at 7:43 am #
Great question.
REPLY $
lda4526 June 25, 2019 at 1:03 am #
Thanks for the reply! I was reading through the explanation in your linked
article and it was great. Can the .predict() do multiple periods in the future like
.forecast? I was using .forecast(12) for forecasting 12 months into the future.
REPLY $
lda4526 June 25, 2019 at 1:17 am #
EDIT: Dumb question, I did not read until the end. If you’ve got time, check
out my Stack Overflow post though:
https://stackoverflow.com/questions/56709745/statsmodels-operands-could-not-be-broadcast-together-in-pandas-series
I think I found an error in statsmodels, as the error is thrown in the .forecast function
because it wants me to specify a frequency. I found the potential misstep by reading
through the actual code for ExponentialSmoothing, and it was the only part that really
referenced the frequency. It’s either that, or my noob is really showing.
REPLY $
karim June 23, 2019 at 9:05 pm #
Hi Jason,
Can you please provide any link of your tutorial which has described forecasting of
multivariate time series with a statistical model like VAR?
REPLY $
Jason Brownlee June 24, 2019 at 6:27 am #
REPLY $
karim June 25, 2019 at 7:00 am #
OK, thanks for your reply. Hopefully, if you can find the time, you will come up with a tutorial
on VAR for us.
REPLY $
qwizak July 2, 2019 at 10:56 pm #
Hi Jason, do you cover all these models using a real dataset in your book?
REPLY $
Jason Brownlee July 3, 2019 at 8:33 am #
REPLY $
Shabnam July 3, 2019 at 11:26 pm #
Hi Jason,
Thanks for the helpful tutorial. I’ve been trying to solve a sales forecast problem but haven’t
had any success. The data is the monthly records of product purchases (counts) with
their respective prices for ten years. The records do not show either a significant auto-
correlation for a wide range of lags or seasonality. The records are stationary though.
Among the time series models, I have tried (S)ARIMA, exponential methods, the Prophet
model, and a simple LSTM. I have also tried regression models using a number of industrial
and financial indices and the product price. Unfortunately, no method has led to an
acceptable result. With regression models, the test R^2 is always negative.
My questions are:
* What category of problems is this problem more relevant to?
* Do you have any suggestions for possibly suitable approaches to follow for this kind of
problems?
Shabnam
REPLY $
Jason Brownlee July 4, 2019 at 7:49 am #
REPLY $
Shabnam July 8, 2019 at 11:04 pm #
I guess that might be the case. I’m also guessing that maybe I don’t have
sufficiently relevant explanatory variables to obtain a good regression model. Thanks
for your feedback though. And, thanks again for your very helpful tutorials.
Shabnam
REPLY $
Jason Brownlee July 9, 2019 at 8:11 am #
You’re welcome.
REPLY $
George July 9, 2019 at 1:05 am #
Hello Jason,
Do you know if there is any way to randomly sample from a fitted VARMA time series using the
statsmodels library, just like using sm.tsa.arma_generate_sample for ARMA?
I can’t seem to find how this is done anywhere.
REPLY $
Jason Brownlee July 9, 2019 at 8:12 am #
REPLY $
Manjinder Singh July 11, 2019 at 9:06 pm #
Hello Jason,
It is really nice and informative article.
I have network data, and in that data there are different parameters (network incidents)
which slow down the network. One of these parameters is network traffic. I have the
DATE and TIME when network traffic crosses a certain threshold at a certain location. I need to
build a predictive model so that I can spot when network traffic will cross the threshold value at
a particular location, so that I can take preventive measures before that happens
at that place.
So please can you suggest an appropriate model for this problem?
REPLY $
Jason Brownlee July 12, 2019 at 8:41 am #
REPLY $
Juan Flores July 11, 2019 at 9:56 pm #
Hi, Jason,
We’ve been having trouble with statsmodels’ ARIMA. It just doesn’t work and takes forever.
What can you tell us about these issues? Do you know of any alternatives to statsmodels?
ThX,
Juan
REPLY $
Jason Brownlee July 12, 2019 at 8:43 am #
REPLY $
Juan J. Flores July 13, 2019 at 10:47 pm #
Dear Jason,
Of course there are many regression models available in sklearn. The point is that
statsmodels seems to fail miserably, both in time and accuracy. That is, with respect
to their arima (family) set of functions.
Juan
REPLY $
Jason Brownlee July 14, 2019 at 8:11 am #
I have found the statsmodels implementation to be reliable, but only if the data is
sensible and only if the order is modest.
I cannot recommend another library at this time. R may be slightly more reliable,
but is not bulletproof.
Thanks, Jason.
Best,
Juan
Nice work!
I agree.
REPLY $
Manjinder Singh July 12, 2019 at 7:12 pm #
In time series classification, my plot shows day-wise data. For
instance, if I take data from the past month and apply autoregression time series
classification to it, I am not able to get detailed output from that chart. By detailed
output I mean that one incident occurs a number of times in a day, so I want visibility
such that every time an incident occurs it shows up on the plot.
Is there a precise way to do this? I would be really thankful if you could help me.
Thank you
REPLY $
Jason Brownlee July 13, 2019 at 6:54 am #
REPLY $
Berns B. August 8, 2019 at 9:40 am #
D:\Users\Berns\Anaconda3\envs\time_series_p27\lib\site-packages\statsmodels\tsa\vector_ar\var_model.pyc in fit(self, maxlags, method, ic, trend, verbose)
    644 self.data.xnames[k_trend:])
    645
--> 646 return self._estimate_var(lags, trend=trend)
    647
    648 def _estimate_var(self, lags, offset=0, trend='c'):
D:\Users\Berns\Anaconda3\envs\time_series_p27\lib\site-packages\statsmodels\tsa\vector_ar\var_model.pyc in _estimate_var(self, lags, offset, trend)
    666 exog = None if self.exog is None else self.exog[offset:]
    667 z = util.get_var_endog(endog, lags, trend=trend,
--> 668 has_constant='raise')
    669 if exog is not None:
    670 # TODO: currently only deterministic terms supported (exoglags==0)
D:\Users\Berns\Anaconda3\envs\time_series_p27\lib\site-packages\statsmodels\tsa\vector_ar\util.pyc in get_var_endog(y, lags, trend, has_constant)
     36 if trend != 'nc':
     37 Z = tsa.add_trend(Z, prepend=True, trend=trend,
---> 38 has_constant=has_constant)
     39
     40 return Z
D:\Users\Berns\Anaconda3\envs\time_series_p27\lib\site-packages\statsmodels\tsa\tsatools.pyc in add_trend(x, trend, prepend, has_constant)
     97 col_const = x.apply(safe_is_const, 0)
     98 else:
---> 99 ptp0 = np.ptp(np.asanyarray(x), axis=0)
    100 col_is_const = ptp0 == 0
    101 nz_const = col_is_const & (x[0] != 0)
D:\Users\Berns\Anaconda3\envs\time_series_p27\lib\site-packages\numpy\core\fromnumeric.pyc in ptp(a, axis, out, keepdims)
   2388 else:
   2389 return ptp(axis=axis, out=out, **kwargs)
-> 2390 return _methods._ptp(a, axis=axis, out=out, **kwargs)
   2391
   2392
D:\Users\Berns\Anaconda3\envs\time_series_p27\lib\site-packages\numpy\core\_methods.pyc in _ptp(a, axis, out, keepdims)
    151 def _ptp(a, axis=None, out=None, keepdims=False):
    152 return um.subtract(
--> 153 umr_maximum(a, axis, None, out, keepdims),
    154 umr_minimum(a, axis, None, None, keepdims),
    155 out
REPLY $
Jason Brownlee August 8, 2019 at 2:21 pm #
REPLY $
Berns B. August 8, 2019 at 12:25 pm #
By the way Doc Jason, I just bought your book today this morning. I have a huge
use case for a time series at work I can use them for.
REPLY $
Jason Brownlee August 8, 2019 at 2:21 pm #
REPLY $
Berns B. August 8, 2019 at 1:19 pm #
# VAR example
from statsmodels.tsa.vector_ar.var_model import VAR
from random import random
# contrived dataset with dependency
data = list()
for i in range(100):
    v1 = i + random()
    v2 = v1 + random()
    row = [v1, v2]
    data.append(row)
# fit model
model = VAR(data)
model_fit = model.fit()
# make prediction
yhat = model_fit.forecast(model_fit.y, steps=1)
print(yhat)
REPLY $
Jason Brownlee August 8, 2019 at 2:22 pm #
REPLY $
soumyajit August 20, 2019 at 6:27 pm #
REPLY $
Jason Brownlee August 21, 2019 at 6:38 am #
REPLY $
Kristen September 24, 2019 at 1:27 am #
Hi Jason,
GREAT article. I’ve been looking for an overview like this. I do have a side question about a
project I’m working on. I’ve got a small grocery distribution company that wants weekly
forecasts of its 4000+ different products/SKU numbers. I’m struggling with how to produce a
model that can forecast all these different types of time-series trends and seasonalities.
ARIMA with Python doesn’t seem like a good option because of the need to fine tune the
parameters. I would love your feedback on the following ideas:
1. Use Facebook’s Prophet for a “good enough” model for all different types of time series
trends and seasonalities.
2. Separate the grocery items into 11-20 broader category groups and then produce
individual models for all those different categories that can then be applied to each individual
sku within that category.
LIMITATIONS:
1. I need this process to be computationally efficient, and a for-loop for 4000 items seems
like a horrible way to go about this.
QUESTIONS:
1. Do i have to model sales and NOT item quantities and why is that?
2. Is there a way to cluster the items with unsupervised learning just based off their trend and
seasonality? Each item does not have good descriptor category so I don’t have much to
work with for unsupervised clustering techniques.
Would love to hear any thoughts you have about this type of problem!
Kristen
REPLY $
Jason Brownlee September 24, 2019 at 7:49 am #
This will give you some ideas of working with multiple SKUs:
https://machinelearningmastery.com/faq/single-faq/how-to-develop-forecast-models-for-multiple-sites
Explore groupings by simple aspects like product category, average sales, average gross
in an interval, etc.
REPLY $
Anne Bierhoff October 2, 2019 at 7:45 pm #
Hi Jason,
I have a simulation model of a system which receives a forecast of a time series as input.
In my scientific work I would like to examine how the performance of the simulation model
behaves in relation to the accuracy of the forecast. So I would like to have forecasts between
80% (bad prediction) and 95% (very good prediction) for my time series.
How would you recommend generating such pseudo-predictions so that they are meaningful
stand-ins for real predictions? Maybe by adjusting a mediocre prediction in both directions?
REPLY $
Jason Brownlee October 3, 2019 at 6:44 am #
REPLY $
Anne Bierhoff October 4, 2019 at 12:25 am #
REPLY $
Jason Brownlee October 4, 2019 at 5:44 am #
Yes, why not contrive forecasts as input and contrive expected outputs
and control the entire system – e.g. a controlled experiment.
If you’re new to computational experimental design, perhaps this book will help:
https://amzn.to/2AFTd7t
Or a controllable simplification of the data you will be working with for real,
e.g. trends/seasonality/level changes/etc.
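One possible way to contrive forecasts with a controlled error level is sketched below. It assumes that "accuracy" means 1 minus MAPE, which is a guess and not something stated in the thread; `contrive_forecast` and the sample values are hypothetical. Each actual value is shifted up or down by the same relative amount, so the MAPE of the contrived forecast equals the target by construction.

```python
from random import Random

def contrive_forecast(actual, target_mape, seed=1):
    """Shift each actual value by +/- target_mape (relative), with a random sign."""
    rng = Random(seed)
    return [a * (1 + rng.choice([-1, 1]) * target_mape) for a in actual]

def mape(actual, forecast):
    """Mean absolute percentage error as a fraction; assumes no actual value is zero."""
    return sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

# hypothetical "true" series that the simulation would otherwise receive
actual = [100.0, 120.0, 90.0, 110.0, 130.0]
for error in (0.05, 0.20):  # i.e. 95% and 80% "accuracy" under the assumption above
    forecast = contrive_forecast(actual, error)
    print(error, round(mape(actual, forecast), 4))
```

Real forecast errors are rarely this uniform, so a more realistic variant might draw the error sizes from a distribution and only match the target on average.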
REPLY $
Sadani October 4, 2019 at 8:38 pm #
I have to perform a regression on a data set that has the following data
Age of Person at time of data collection
Date
cholesterol Level
HDL
Event which can be death etc.
Need to find out if there are any patterns in the Cholesterol Level or the HDL Levels prior to
the death event.
I’m trying to do this using Jupyter Notebook. Really appreciate your help sincerely.
Do you have any examples that would be close ?
REPLY $
Jason Brownlee October 6, 2019 at 8:08 am #
Sounds like a great problem, perhaps start with some of the simpler tutorials
here:
https://machinelearningmastery.com/start-here/#timeseries
REPLY $
Jose October 15, 2019 at 3:45 am #
Thanks!
REPLY $
Jason Brownlee October 15, 2019 at 6:19 am #
If you can’t locate, perhaps you can implement it yourself? Or call a function from an
implementation in another language?
REPLY $
Burcu October 22, 2019 at 11:51 pm #
Hi Jason,
Actually, I am very impatient and did not read all the comments, but thought that you already
knew them.
I need to build a hierarchical and grouped time series model in Python.
I need to predict daily and hourly passenger units by continent or vessel name. I investigated but
did not find a solution; could you please direct me to what I can do?
Below are my inputs for the model.
Date & Time Vessel Name Continent Passenger Unit
Thank you,
REPLY $
Jason Brownlee October 23, 2019 at 6:49 am #
REPLY $
Praveen November 13, 2019 at 12:53 am #
Hi Jason,
When forecasting with VAR or VARMAX, I’m not getting the future timestamps associated with
the forecasts?
REPLY $
Jason Brownlee November 13, 2019 at 5:47 am #
REPLY $
Prem November 13, 2019 at 3:53 pm #
Hi Jason,
For the simple AR model yhat values is it possible to get the threshold?
data = [x + random() for x in range(1, 100)]
model = AR(data)
model_fit = model.fit()
yhat = model_fit.predict(len(data), len(data))
Thanks
REPLY $
Jason Brownlee November 14, 2019 at 7:57 am #
What threshold?
REPLY $
Prem November 14, 2019 at 3:56 pm #
REPLY $
Jason Brownlee November 15, 2019 at 7:43 am #
REPLY $
Markus December 2, 2019 at 12:08 am #
Hi
http://www.statsmodels.org/dev/generated/generated/statsmodels.tsa.ar_model.AR.predict.html#statsmodels.tsa.ar_model.AR.predict
With the given parameters as params, start, end and dynamic, I can’t correspond them to the
method call you have as:
For which exact documented parameters do you pass over len(data) twice?
Thanks.
REPLY $
Jason Brownlee December 2, 2019 at 6:04 am #
That makes a single step prediction for the next time step at the end of the data.
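To make the indexing concrete, here is a toy sketch in plain Python. It is not statsmodels code: `fit_ar1` and `predict` are hypothetical helpers that only imitate the idea of zero-based start and end indices, where index len(data) is the first step after the end of the data.

```python
def fit_ar1(data):
    """Least-squares fit of y[t] = c + phi * y[t-1]."""
    x, y = data[:-1], data[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    phi = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
           / sum((a - mean_x) ** 2 for a in x))
    c = mean_y - phi * mean_x
    return c, phi

def predict(data, c, phi, start, end):
    """Predictions for time indices start..end inclusive (requires start >= 1)."""
    n = len(data)
    extended = list(data)
    # extend past the data with recursive forecasts up to index `end`
    for t in range(n, end + 1):
        extended.append(c + phi * extended[t - 1])
    out = []
    for t in range(start, end + 1):
        if t < n:
            out.append(c + phi * data[t - 1])  # in-sample one-step prediction
        else:
            out.append(extended[t])  # out-of-sample forecast
    return out

data = [float(x) for x in range(1, 11)]  # observations at indices 0..9
c, phi = fit_ar1(data)
print(predict(data, c, phi, len(data), len(data)))      # one step past the data
print(predict(data, c, phi, len(data), len(data) + 2))  # three steps past the data
```

With the same value for start and end, exactly one prediction comes back; a larger end index gives a multi-step forecast.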
REPLY $
Nikhil Gupta December 4, 2019 at 6:46 pm #
I see that you are using random function to generate the data sets. Can you suggest any
pointers where I can get real data sets to try out these algorithms. If possible IoT data sets
would be preferable.
REPLY $
Jason Brownlee December 5, 2019 at 6:40 am #
REPLY $
Juni December 12, 2019 at 12:47 pm #
Hey
It is really great to see all this stuff.
Can you please explain me:
Which algorithm should I use when I want to predict the voltage of a battery? That is,
the battery voltage is recorded falling from 4.2 to 3.7 in 6 months; which algorithm will be the
best one to predict the next 6 months?
REPLY $
Jason Brownlee December 12, 2019 at 1:45 pm #
REPLY $
jhon December 18, 2019 at 3:55 pm #
Good afternoon, could walk-forward validation be used for multivariate time
series? That is, with one target variable and several input variables, can the walk-forward
validation technique be applied? That is my question. Do you know of any link or tutorial?
REPLY $
Jason Brownlee December 19, 2019 at 6:21 am #
Yes, the same walk-forward validation approach applies and you can evaluate
each variate separately or all together if they have the same units.
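A rough sketch of what this could look like for a multivariate series, scoring each variable separately, follows. The persistence forecast is only a stand-in for whatever model is being evaluated, and the function names and data are made up for illustration.

```python
from math import sqrt

def persistence_forecast(history):
    """Naive baseline: predict that each variable repeats its last value."""
    return list(history[-1])

def walk_forward_rmse(series, n_test, forecast_fn):
    """series: list of rows, one row per time step, one column per variable."""
    n_vars = len(series[0])
    sq_err = [0.0] * n_vars
    history = [list(r) for r in series[:-n_test]]
    for actual in series[-n_test:]:
        yhat = forecast_fn(history)  # forecast made using past data only
        for j in range(n_vars):
            sq_err[j] += (actual[j] - yhat[j]) ** 2
        history.append(list(actual))  # reveal the true observation before the next step
    return [sqrt(e / n_test) for e in sq_err]

# contrived two-variable series: the second variable is twice the first
series = [[float(t), 2.0 * t] for t in range(10)]
scores = walk_forward_rmse(series, n_test=3, forecast_fn=persistence_forecast)
print(scores)
```

If all variables share the same units, the per-variable scores could be averaged into a single number; otherwise keeping them separate is easier to interpret.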
REPLY $
A. January 13, 2020 at 12:57 am #
Hi,
Have you ever evaluated the performance of Gaussian processes for time series forecast?
Do you have any material on this (or plan to make a post about this)?
Thanks!
REPLY $
Jason Brownlee January 13, 2020 at 8:21 am #
REPLY $
Pallavi February 17, 2020 at 4:19 pm #
Hi Jason,
I’m currently working on a sales data set which has date, location, item and unit
sales. The data covers 4 years, and I need to forecast sales for the subsequent year -1 month (1-Jan to
15-Jan). Kindly suggest a model for this multivariate data. Another challenge I’m
currently facing is working with Google Colab with a restricted RAM size of 25 GB, and the data
set size is 1M rows.
Please suggest a good model which will handle the memory size as well.
Note: I have changed the datatypes to reduce the memory, but I am still facing the issue.
REPLY $
Jason Brownlee February 18, 2020 at 6:17 am #
REPLY $
Amit Patidar March 5, 2020 at 8:27 pm #
Can you briefly explain Fbprophet, its pros and cons, etc.?
REPLY $
Jason Brownlee March 6, 2020 at 5:31 am #
REPLY $
Ricardo Reis March 21, 2020 at 2:07 am #
Hi Jason,
Mine is sensor data. These sensors only detect the presence of a person in a room. Each
entry gives me the start time, duration and room. This is for monitoring one person with
health issues as they move through their flat.
None of the above methods — fantastically depicted, by the way! — is a glove to this hand. I
need to detect patterns, trigger alerts on anomalies and predict future anomalies.
Any suggestions?
REPLY $
Jason Brownlee March 21, 2020 at 8:27 am #
REPLY $
William March 22, 2020 at 1:38 am #
REPLY $
Jason Brownlee March 22, 2020 at 6:56 am #
Perhaps confirm that the inputs to your model have the same length.
REPLY $
Artur March 22, 2020 at 1:42 am #
Hello,
”
for i in range(100):
    v1 = random()
“
REPLY $
Jason Brownlee March 22, 2020 at 6:57 am #
Fixed, thanks!
REPLY $
LANET March 23, 2020 at 4:39 am #
Hi Dr.Jason,
I have a question about time-series forecasting. I have to predict the value of time series A in
the next few hours. At the same time, time series B has an influence on A. I have two
ideas:
1. X_train is the time T and Y_train is the values A_T and B_T.
2. X_train is the values (A_n, A_n-1, A_n-2, …, A_n-k, B_n, B_n-1, B_n-2, …, B_n-k) and Y_train is
(A_n+1, A_n+2, …, A_n+m).
Which is better?
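For reference, the second framing can be sketched as a sliding window over both series. This is only an illustration: `make_windows` is a hypothetical helper, the data is contrived, and the lags are stored oldest-first rather than newest-first as written above.

```python
def make_windows(a, b, k, m):
    """Build (X, Y) pairs: X holds the last k+1 values of A and B, Y the next m values of A."""
    X, Y = [], []
    for n in range(k, len(a) - m):
        X.append(a[n - k:n + 1] + b[n - k:n + 1])  # lags of A, then lags of B
        Y.append(a[n + 1:n + 1 + m])               # the m future values of A
    return X, Y

# contrived series A and B of equal length
a = [float(i) for i in range(10)]
b = [10.0 * i for i in range(10)]
X, Y = make_windows(a, b, k=2, m=2)
print(X[0], Y[0])
print(len(X))
```

Any model that maps a fixed-length input vector to m outputs could then be trained on X and Y.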
REPLY $
Jason Brownlee March 23, 2020 at 6:15 am #
REPLY $
LANET March 23, 2020 at 8:08 pm #
REPLY $
Jason Brownlee March 24, 2020 at 6:00 am #
REPLY $
LANET March 23, 2020 at 4:44 am #
REPLY $
Jason Brownlee March 23, 2020 at 6:15 am #
You’re welcome.
REPLY $
DEAN MCBRIDE April 7, 2020 at 12:54 am #
Hi Jason
I get numerous warnings for some of the models (e.g. robustness warnings, and model
convergence warnings, etc.) Do you get the same?
Thanks,
Dean
REPLY $
Jason Brownlee April 7, 2020 at 5:52 am #
Sometimes. You can ignore them for now, or explore model configs/data preps
that are more kind to the underlying solver.
REPLY $
DEAN MCBRIDE April 7, 2020 at 4:24 pm #
Thanks Jason.
One other thing – do you know where I could find an explanation on how the solvers
work (relating to the theory) ? Since I’ll need to know what config/parameter changes
will do.
I did access the links you posted (e.g. statsmodels.tsa.ar_model.AR API) , but I don’t
find these very descriptive, and they are without code examples.
REPLY $
Jason Brownlee April 8, 2020 at 7:45 am #
REPLY $
Stéfano April 7, 2020 at 1:00 pm #
Hi Jason Brownlee,
I was in doubt as to how to extract the predicted value and the current value of each
prediction. I would like to calculate the RMSE, MAPE, among others.
REPLY $
Jason Brownlee April 7, 2020 at 1:33 pm #
You’re welcome.
You can make a prediction by fitting the model on all available data and then calling
model.predict()
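Once predictions and the matching observed values are collected in two lists, the error measures are simple to compute. A minimal sketch follows; the `actual` and `predicted` values are invented for illustration and would normally come from a held-out part of the series and the model's predictions for it.

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# hypothetical held-out observations and the model's predictions for them
actual = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
print(rmse(actual, predicted))
print(mape(actual, predicted))
```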
REPLY $
rina April 9, 2020 at 8:09 pm #
Do you have an idea of how I can use one model to predict traffic for all regions, i.e. one model for
multiple time series?
REPLY $
Jason Brownlee April 10, 2020 at 8:27 am #
REPLY $
rina April 10, 2020 at 7:38 pm #
Thank you for your answer. For my project I want to have one model to
predict all the values for all cities.
I don’t know which types of algorithms accept multiple series. Can you help
me with this? Thank you.
REPLY $
Jason Brownlee April 11, 2020 at 6:16 am #
Follow this framework to discover the algorithm that works best for your
specific data:
https://machinelearningmastery.com/how-to-develop-a-skilful-time-series-forecasting-model/
Thank you for your response. In fact, do you have an example link
or article with Python that forecasts multiple independent time series?
Yes, many. Search the blog, perhaps start with the simple examples
in this tutorial:
https://machinelearningmastery.com/how-to-develop-lstm-models-for-time-series-forecasting/
Nirav April 9, 2020 at 10:33 pm #
I want to fit a curve to time series data. Will these methods work for that, or are they
useful for prediction only?
Jason Brownlee April 10, 2020 at 8:30 am #
Alaa May 9, 2020 at 11:01 am #
Dear Jason,
Do you have a blog post where you forecast a univariate time series using a combination of
exponential smoothing and a neural network?
I mean one where we give lags of the time series to an exponential smoothing model, and the
output of exponential smoothing becomes the input of a neural network.
Thank you
Jason Brownlee May 9, 2020 at 1:51 pm #
QuangAnh May 28, 2020 at 8:21 am #
Hi Jason,
Could you also please review the BATS and TBATS models? (De Livera, A.M., Hyndman,
R.J., & Snyder, R.D. (2011), Forecasting time series with complex seasonal patterns using
exponential smoothing, Journal of the American Statistical Association, 106(496),
1513-1527.) Like SARIMA, they are quite good at dealing with seasonal time series. I am
also wondering if you have any plans to review Granger causality, which is often used to
find dependencies between series in multivariate time series. Thank you.
Jason Brownlee May 28, 2020 at 1:23 pm #
adarsh singh June 11, 2020 at 1:44 am #
Hello,
I have a dataset of many vehicles and need to predict an output from 5 time series inputs.
I have to do this for all vehicles (almost 100) to improve the accuracy of the model. Can you
help me?
Jason Brownlee June 11, 2020 at 6:00 am #
Shekhar P June 11, 2020 at 4:03 pm #
Hello Sir,
The SARIMAX model you explained here makes a single-step forecast. Can you let me know how to
do multi-step forecasting (with steps = 96 into the future)?
Jason Brownlee June 12, 2020 at 6:08 am #
Specify the number of steps required in the call to the forecast() function.
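To make the idea of "steps" concrete without depending on statsmodels, here is a sketch of recursive multi-step forecasting with a hand-rolled AR(1) model. The helper names fit_ar1 and forecast are hypothetical, and this is not how SARIMAX works internally. On a fitted statsmodels results object the equivalent call would be model_fit.forecast(steps=96).

```python
def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    return mean_y - phi * mean_x, phi


def forecast(series, steps):
    """Forecast `steps` values ahead, feeding each prediction back in as input."""
    c, phi = fit_ar1(series)
    last, out = series[-1], []
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


data = [float(v) for v in range(1, 100)]  # contrived series 1, 2, ..., 99
yhat = forecast(data, steps=96)
print(len(yhat), yhat[0], yhat[-1])
```

Because each step reuses the previous prediction, errors can compound over a long horizon such as 96 steps.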
Puspendra June 18, 2020 at 12:48 am #
Hi Jason,
Great info. Can you suggest algorithms that handle multivariate series with trend and
seasonality?
Jason Brownlee June 18, 2020 at 6:28 am #
Neural nets:
https://machinelearningmastery.com/start-here/#deep_learning_time_series
Greg June 18, 2020 at 1:20 am #
Hey, Jason. There are some changes in the statsmodels library, e.g. ARIMA is now in
statsmodels.tsa.arima.model
Jason Brownlee June 18, 2020 at 6:28 am #
Thanks.
User1 June 23, 2020 at 7:13 am #
It is not very clear to me what the difference is between a multivariate and an exogenous
time series.
Jason Brownlee June 23, 2020 at 1:26 pm #
Thanks.
My understanding:
Multivariate refers to multiple parallel input time series that are modeled as such (lag
obs).
Exogenous variables are simply static regressors that are added to the model (e.g. linear
regression).
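One reading of this distinction can be sketched by how the training rows are framed. The function names and the sales/temperature series below are hypothetical, and this is only an illustration of the idea, not how any particular library builds its inputs.

```python
def multivariate_rows(series_a, series_b, lag=1):
    """VAR-style framing: lagged values of every series predict every series."""
    rows = []
    for t in range(lag, len(series_a)):
        inputs = (series_a[t - lag], series_b[t - lag])
        outputs = (series_a[t], series_b[t])
        rows.append((inputs, outputs))
    return rows


def exogenous_rows(target, exog, lag=1):
    """ARX-style framing: the target's own lag plus the regressor's current value."""
    rows = []
    for t in range(lag, len(target)):
        inputs = (target[t - lag], exog[t])
        rows.append((inputs, target[t]))
    return rows


# Made-up example series.
sales = [10, 12, 13, 15]
temperature = [20, 21, 19, 22]

print(multivariate_rows(sales, temperature)[0])
print(exogenous_rows(sales, temperature)[0])
```

In the first framing both series are forecast together; in the second only sales is forecast and temperature is supplied as a known input.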
t June 25, 2020 at 1:14 am #
statsmodels.tsa.arima_model.ARMA:
Deprecated since version 0.12: Use statsmodels.tsa.arima.model.ARIMA instead…
Jason Brownlee June 25, 2020 at 6:24 am #
Sha June 29, 2020 at 3:46 pm #
Jason Brownlee June 30, 2020 at 6:13 am #
Use ARX (ARIMAX); the “X” is for when you have a univariate time series with exogenous
variables:
https://en.wikipedia.org/wiki/Exogenous_and_endogenous_variables
If you’re unsure, try both; one will be a natural fit for your data and the other won’t. It
might be that clear.
durvesh July 3, 2020 at 6:15 pm #
1. For the SARIMAX seasonal component “m”, should it be different from 1? I get an error when
1 is used.
2. In the predict function, why do we pass the length as 99 if we want the next prediction?
Jason Brownlee July 4, 2020 at 5:53 am #
You specify the index or date you want to forecast in the predict() function.
Gaurav August 1, 2020 at 1:21 am #
I was curious to know when to use which model, and what conditions must be met before
using a model. Can I get some pointers on this?
Jason Brownlee August 1, 2020 at 6:12 am #
Determine if you have multiple or single inputs and/or multiple or single outputs
then select a model that supports your requirements.
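As a rough rule of thumb only, that selection step could be sketched as a lookup over the methods listed in this cheat sheet. The function suggest_methods and its grouping are assumptions made for illustration, not a rule stated in the post; candidates should still be compared on your own data.

```python
def suggest_methods(n_series, has_exog=False, seasonal=False):
    """Return candidate methods from the cheat sheet for a given problem shape."""
    if n_series > 1:
        # Several parallel series modeled together.
        return ["VARMAX"] if has_exog else ["VAR", "VARMA"]
    if has_exog:
        # One series plus external regressors.
        return ["SARIMAX"]
    if seasonal:
        # One series with a seasonal pattern.
        return ["SARIMA", "HWES"]
    # One series, no exogenous inputs, no seasonality.
    return ["AR", "MA", "ARMA", "ARIMA", "SES"]


print(suggest_methods(1))
print(suggest_methods(1, seasonal=True))
print(suggest_methods(3, has_exog=True))
```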
Gaurav August 2, 2020 at 12:14 am #
As far as I understand:
Single input: univariate model
Multiple inputs: multivariate model
But I guess which scenario each model fits is beyond the scope of this topic, i.e. a
cheat sheet.
However, I have an IoT data project at my company, where the monitored data is
timestamped. I am being asked to make Pv predictions. Could you suggest a model
as well? That would be great.
Jason Brownlee August 2, 2020 at 5:47 am #
Yes, the example for each model shows the type of problem where it is
appropriate.
© 2020 Machine Learning Mastery Pty. Ltd. All Rights Reserved.
Address: PO Box 206, Vermont Victoria 3133, Australia. | ACN: 626 223 336.