
ASSIGNMENT

ON
Autoregressive (AR) Model for Time
Series Forecasting

ABM 540
RESEARCH METHODOLOGY FOR
AGRIBUSINESS MANAGEMENT

Submitted to
Dr. C. R. Dudhagara,
Assistant Professor & Head,
Dept. of Communication and Information Technology
International Agribusiness Management Institute
Anand Agricultural University, Anand

Submitted by

Name: Pandya Maitrak


Reg No: 2040623019
MBA-ABM, 2nd Semester

INTERNATIONAL AGRIBUSINESS MANAGEMENT INSTITUTE


ANAND AGRICULTURAL UNIVERSITY
Anand 388110
Autoregressive Models
Autoregressive models belong to the family of time series models. These models
capture the relationship between an observation and several lagged observations
(previous time steps). The core idea is that the current value of a time series can be
expressed as a linear combination of its past values, with some random noise.
Mathematically, an autoregressive model of order p, denoted as AR(p), can be
expressed as:

x_t = c + φ_1·x_{t−1} + φ_2·x_{t−2} + … + φ_p·x_{t−p} + ε_t

Where:
 x_t is the value at time t.
 c is a constant.
 φ_1, …, φ_p are the model parameters.
 x_{t−1}, …, x_{t−p} are the lagged values.
 ε_t represents white noise (random error) at time t.
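As a concrete illustration of the AR(p) equation, the short sketch below simulates an AR(2) process in Python. The coefficient values c = 0.5, φ1 = 0.6, φ2 = 0.2 are illustrative assumptions, not taken from the assignment:

```python
import numpy as np

# Simulate an AR(2) process: x_t = c + phi1*x_{t-1} + phi2*x_{t-2} + e_t
# Coefficients below are illustrative assumptions (not from the assignment).
rng = np.random.default_rng(42)
c, phi1, phi2 = 0.5, 0.6, 0.2
n = 500
x = np.zeros(n)
for t in range(2, n):
    x[t] = c + phi1 * x[t - 1] + phi2 * x[t - 2] + rng.normal(scale=1.0)

# For a stationary AR(2), the long-run mean is c / (1 - phi1 - phi2) = 2.5,
# so the sample mean should settle near that value.
print(round(x.mean(), 2))
```

Because φ1 + φ2 = 0.8 < 1 (along with the other stationarity conditions), the simulated series is stationary and fluctuates around its long-run mean.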

Autocorrelation (ACF) in Autoregressive Models


Autocorrelation, often denoted as “ACF” (Autocorrelation Function), is a
fundamental concept in time series analysis and autoregressive models. It refers to
the correlation between a time series and a lagged version of itself. In the context of
autoregressive models, autocorrelation measures how closely the current value of a
time series is related to its past values, specifically those at different time lags.
Here’s a breakdown of the concept of autocorrelation in autoregressive models:
 Autocorrelation is calculated as the correlation between a time series
and a lagged version of itself. The “lag” is the number of time units by
which the series is shifted: a lag of 1 compares the series with its
previous time step, a lag of 2 with the time step before that, and so on.
The lag therefore tells us how far back the past observation is that each
current observation is compared with.
 The autocorrelation at a particular lag provides insights into the temporal
dependence of the data. If the autocorrelation is high at a certain lag, it
indicates a strong relationship between the current value and the value at
that lag. Conversely, if the autocorrelation is low or close to zero, it
suggests a weak or no relationship.
 To visualize autocorrelation, a common approach is to create an ACF plot.
This plot displays the autocorrelation coefficients at different lags. The
horizontal axis represents the lag, and the vertical axis represents the
autocorrelation values. Significant peaks or patterns in the ACF plot can
reveal the underlying temporal structure of the data. Autocorrelation plays
a pivotal role in autoregressive models.
 In an Autoregressive model of order p, the current value of the time series
is expressed as a linear combination of its past p values, with coefficients
determined through methods like least squares or maximum likelihood
estimation. The selection of the lag order (p) in the AR model often relies
on the analysis of the ACF plot.
 Autocorrelation can also be used to assess whether a time series is
stationary. In a stationary time series, autocorrelation should gradually
decrease as the lag increases. Deviations from this behavior might indicate
non-stationarity.

Types of Autoregressive Models


AR(1) Model:
In the AR(1) model, the current value depends only on the immediately previous
value. It is expressed as:

x_t = c + φ_1·x_{t−1} + ε_t

AR(p) Model:
The general autoregressive model of order p includes p lagged values.
It is expressed as shown in the introduction.

Implementing an AR Model for Predicting Temperature


Step 1: Importing Data
In the first step, we import the required libraries and the temperature dataset.

The data is visualized in this step.
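A minimal sketch of this step is shown below. Since the actual temperature file is not reproduced here, a synthetic seasonal daily series stands in for it; a real run would instead load the file, for example with pd.read_csv (the filename in the comment is a placeholder):

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for Step 1: a real run would load the temperature file, e.g.
#   data = pd.read_csv('temperature.csv', parse_dates=['Date'])
# Here a synthetic seasonal daily series plays that role.
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=365, freq="D")
temps = 20 + 10 * np.sin(2 * np.pi * dates.dayofyear / 365) + rng.normal(0, 2, 365)
data = pd.DataFrame({"Date": dates, "Temperature": temps})

# Visualize the series (plt.show() would display the figure)
plt.figure(figsize=(12, 6))
plt.plot(data["Date"], data["Temperature"])
plt.xlabel("Date")
plt.ylabel("Temperature")
plt.title("Daily Temperature")
```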


Output:
Step 2: Data Preprocessing
Now that we have the data, we need to preprocess it. We’ll create lag
features, split the data into training and testing sets, and format it for modeling.
 In the first step, the lag features are added to the data frame.
 Then the rows with null values are completely removed.
 The data is then split into training and testing datasets.
 The input features and target variable are defined.
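The preprocessing steps listed above can be sketched as follows. The synthetic Temperature column is a stand-in for the assignment's dataset, and the 80/20 chronological split and four lag features are assumed choices:

```python
import numpy as np
import pandas as pd

# Stand-in series for the assignment's temperature column
rng = np.random.default_rng(0)
data = pd.DataFrame({"Temperature": 20 + rng.normal(0, 2, 200)})

# 1. Add lag features to the data frame
for lag in range(1, 5):
    data[f"lag_{lag}"] = data["Temperature"].shift(lag)

# 2. Remove rows with null values created by shifting
data = data.dropna()

# 3. Chronological 80/20 train/test split (no shuffling for time series)
split = int(len(data) * 0.8)
train_data, test_data = data.iloc[:split], data.iloc[split:]

# 4. Define input features and the target variable
feature_cols = [f"lag_{lag}" for lag in range(1, 5)]
X_train, y_train = train_data[feature_cols], train_data["Temperature"]
X_test, y_test = test_data[feature_cols], test_data["Temperature"]
print(X_train.shape, X_test.shape)
```

Keeping the split chronological (rather than random) matters for time series: the model must be evaluated on data that comes strictly after its training period.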

ACF Plot

The Autocorrelation Function (ACF) plot is a graphical tool used to visualize and
assess the autocorrelation of time series data at different lags. The ACF plot helps
you understand how the current value of a time series is correlated with its past
values. You can create an ACF plot in Python using the plot_acf function from the
statsmodels library.

Output:
The graph shows the autocorrelation values for the first 20 lags, with the lag on
the x-axis and the autocorrelation value on the y-axis. It helps us identify the
significant lags, where the autocorrelation values fall outside the confidence
interval (indicated by the shaded region).
We can observe a significant correlation from lag=1 to lag=4. We check the
correlation of the lagged values using the approach mentioned below:

data['Temperature '].corr(data['Temperature '].shift(1))

Output:

0.7997281316018658

Lag=1 provides us with the highest correlation value of 0.799. Similarly, we have
checked with lag= 2, 3, 4. For the shift set to 4, we get the correlation as 0.31.
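The same lag-by-lag check can be written as a short loop. The autocorrelated series below is a stand-in for data['Temperature '], generated with an assumed AR(1) coefficient of 0.8:

```python
import numpy as np
import pandas as pd

# Stand-in autocorrelated series (assumed AR(1) coefficient 0.8) in place
# of the assignment's data['Temperature '] column
rng = np.random.default_rng(0)
x = np.zeros(300)
for t in range(1, 300):
    x[t] = 0.8 * x[t - 1] + rng.normal()
s = pd.Series(x)

# series.corr(series.shift(k)) gives the lag-k autocorrelation
for k in range(1, 5):
    print(f"lag={k}: {s.corr(s.shift(k)):.3f}")
```

As in the assignment's data, the correlation is highest at lag 1 and weakens as the shift grows.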
Step 3: Model Building
We’ll build an autoregressive model using the AutoReg class from statsmodels.
 We import the required libraries to create the autoregressive model.
 Then we train the autoregressive model on the training data.
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.graphics.tsaplots import plot_acf
from sklearn.metrics import mean_absolute_error, mean_squared_error

# Create and train the autoregressive model
lag_order = 1  # Adjust this based on the ACF plot
ar_model = AutoReg(y_train, lags=lag_order)
ar_results = ar_model.fit()
Step 4: Model Evaluation
Evaluate the model’s performance using Mean Absolute Error (MAE) and Root
Mean Squared Error (RMSE).
 We then make predictions using the AutoReg model and label it as y_pred.
 MAE and RMSE metrics are calculated to evaluate the performance of
AutoReg model.
# Make predictions on the test set
y_pred = ar_results.predict(start=len(train_data), end=len(train_data) + len(test_data) - 1,
dynamic=False)
#print(y_pred)

# Calculate MAE and RMSE


mae = mean_absolute_error(y_test, y_pred)
rmse = mean_squared_error(y_test, y_pred) ** 0.5  # square root of MSE
print(f'Mean Absolute Error: {mae:.2f}')
print(f'Root Mean Squared Error: {rmse:.2f}')

Output:
Mean Absolute Error: 1.59
Root Mean Squared Error: 2.30
In the code, ar_results is an AutoReg model fitted to our time series data. To make
predictions on the test set, we use the predict method of the fitted model. Here’s
how it works:
 start specifies the starting point for prediction. In this case, we start the
prediction right after the last data point in our training data, which is
equivalent to the first data point in our test set.
 end specifies the ending point for prediction. We set it to the last data
point in our test set.
 dynamic=False means that in-sample predictions use the actual observed
values of the lagged terms rather than the model’s own earlier forecasts.
This is typically what you want when evaluating the model on a test set,
since it avoids compounding forecast errors.
 The predictions are stored in y_pred, which contains the forecasted values
for the test set.
Step 5: Visualization
Visualize the model’s predictions against the actual temperature data. Finally, the
predictions made by the AutoReg model are visualized using Matplotlib library.
Actual Predictions Plot:
import matplotlib.pyplot as plt

# Visualize the results
plt.figure(figsize=(12, 6))
plt.plot(test_data["Date"], y_test, label='Actual Temperature')
plt.plot(test_data["Date"], y_pred, label='Predicted Temperature', linestyle='--')
plt.xlabel('Date')
plt.ylabel('Temperature')
plt.legend()
plt.title('Temperature Prediction with Autoregressive Model')
plt.show()

Output:

Forecast Plot:

# Define the number of future time steps you want to predict (1 week)
forecast_steps = 7

# Extend the predictions one week into the future
future_predictions = ar_results.predict(start=len(train_data),
                                        end=len(train_data) + len(test_data) + forecast_steps - 1,
                                        dynamic=False)

# Create date indices for the future predictions
future_dates = pd.date_range(start=test_data['Date'].iloc[-1], periods=forecast_steps, freq='D')

# Plot the actual data, existing predictions, and the future predictions
plt.figure(figsize=(12, 6))
plt.plot(test_data['Date'], y_test, label='Actual Temperature')
plt.plot(test_data['Date'], y_pred, label='Predicted Temperature', linestyle='--')
plt.plot(future_dates, future_predictions[-forecast_steps:], label='Future Predictions',
         linestyle='--', color='red')
plt.xlabel('Date')
plt.ylabel('Temperature')
plt.legend()
plt.title('Temperature Prediction with Autoregressive Model')
plt.show()

Output:

Benefits of Autoregressive Models:


 Simplicity: AR models are relatively simple to understand and implement.
They rely on past values of the time series to predict future values, making
them conceptually straightforward.
 Interpretability: The coefficients in an AR model have clear
interpretations. They represent the strength and direction of the
relationship between past and future values, making it easier to derive
insights from the model.
 Useful for Stationary Data: AR models work well with stationary time
series data. Stationary data have stable statistical properties over time,
which is an assumption that AR models are built upon.
 Efficiency: AR models can be computationally efficient, especially for
short time series or when you have a reasonable amount of data.
 Modeling Temporal Patterns: AR models are good at capturing short-
term temporal dependencies and patterns in the data, which makes them
valuable for short-term forecasting.
