ARCH Model in R Programming
What is ARCH Model?
The ARCH (Autoregressive Conditional
Heteroskedasticity) model is used to analyze time
series data, especially financial data (such as stock
returns) where the variance (volatility) changes over
time.
It was introduced by Robert Engle (1982), who later received the
Nobel Memorial Prize in Economic Sciences (2003) largely for this work.
🔹 Step 1: Install and Load Packages
Required Package: tseries
Before you can use the ARCH model, you need to install and load the tseries package in R.
install.packages("tseries") # Installs the package
library(tseries) # Loads the package
👉 Explanation:
The tseries package provides functions for time series analysis, especially useful for financial
data such as stock returns, exchange rates, etc.
🔹 Step 2: Import and Visualize Data
You need some financial or time series data (for example, daily stock returns).
data <- read.csv("stock_returns.csv") # Import data from a CSV file
returns <- data$Returns # Extract the column 'Returns'
plot(returns, type="l", main="Daily Stock Returns", ylab="Return")
👉 Explanation:
read.csv() loads the dataset.
plot() displays how returns vary over time.
o If you see periods of large swings followed by calmer periods, that is volatility
clustering: high-volatility and low-volatility periods tend to occur in bunches.
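💡 Note: the file name stock_returns.csv and the column name Returns are placeholders. If you do not have such a file, a minimal sketch like the one below simulates a return series with volatility clustering so the remaining steps can be run; all parameter values here are made up for illustration.
set.seed(42) # For reproducibility
n <- 1000 # Number of simulated days
eps <- rnorm(n) # Standard normal shocks
returns <- numeric(n)
h <- numeric(n) # Conditional variance
h[1] <- 0.0005
returns[1] <- sqrt(h[1]) * eps[1]
for (t in 2:n) {
  h[t] <- 0.0005 + 0.35 * returns[t-1]^2 # ARCH(1) variance equation
  returns[t] <- sqrt(h[t]) * eps[t] # Return = volatility times shock
}
plot(returns, type="l", main="Simulated Daily Returns", ylab="Return")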
🔹 Step 3: Test for ARCH Effect
Before fitting the model, we must check whether ARCH effects exist in the data, i.e., whether
the variance changes over time.
We use the Lagrange Multiplier (LM) test. Note that tseries itself does not provide this test;
one common implementation is ArchTest() from the FinTS package.
install.packages("FinTS") # Installs the package that provides ArchTest()
library(FinTS) # Loads the package
ArchTest(returns, lags = 12) # LM test for ARCH effects (12 lags is an arbitrary choice)
👉 Explanation:
ArchTest() regresses the squared returns on their own lagged values and tests
whether those lag coefficients are jointly zero (this is the LM test).
If p-value < 0.05, then ARCH effects are present, meaning the
volatility changes over time.
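As a quick cross-check that needs no extra package, a Ljung-Box test on the squared returns asks essentially the same question (again, the choice of 12 lags is an assumption; adjust it to your data):
Box.test(returns^2, lag = 12, type = "Ljung-Box") # Small p-value suggests ARCH effects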
🔹 Step 4: Fit ARCH(1) Model
Now that we know ARCH effects exist, we fit the model.
arch_model <- garch(returns, order = c(0,1)) # ARCH(1) model
summary(arch_model)
👉 Explanation:
garch() fits either ARCH or GARCH models.
order = c(0,1) means ARCH(1):
o The first value (0) is for GARCH terms (none here).
o The second (1) is for ARCH terms (one lag of squared residual).
summary() shows the results and coefficients.
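To work with the estimates directly rather than reading them off the summary, the coefficients can be extracted with coef(); a sketch, assuming the fit above succeeded (a0 and a1 are the labels tseries uses for the ARCH(1) parameters):
a <- coef(arch_model) # Named vector of estimates (a0 = baseline variance, a1 = ARCH term)
a
a["a0"] / (1 - a["a1"]) # Implied long-run (unconditional) variance, valid when a1 < 1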
🔹 Step 5: Analyze Output
Example output (in the tseries summary these coefficients are labelled a0 and a1):
a0 (alpha0) = 0.0005
a1 (alpha1) = 0.35
👉 Interpretation:
α₀ (alpha0) = baseline variance.
α₁ (alpha1) = effect of the previous day’s squared error on today’s
variance.
So:
If α₁ = 0.35 → about 35% of yesterday’s squared shock carries over into today’s variance.
If α₁ < 1 → the model is stable (volatility doesn’t explode).
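As a small worked example with the coefficients above (the shock value of 0.02 is invented for illustration): if yesterday’s error was 0.02, today’s variance and volatility would be
h_today <- 0.0005 + 0.35 * 0.02^2 # alpha0 + alpha1 * (yesterday's shock)^2
h_today # 0.00064 = today's variance
sqrt(h_today) # About 0.0253 = today's volatility (standard deviation)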
🔹 Step 6: Diagnostic Checking
After fitting, check whether the residuals (errors) are random and the model fits well.
res <- residuals(arch_model) # Extract model residuals (the first value is NA, since ARCH(1) needs one lag)
acf(na.omit(res)^2, main="ACF of Squared Residuals") # Drop the NA before computing the ACF
👉 Explanation:
residuals() extracts model errors.
acf() checks the autocorrelation of squared residuals.
If there’s no autocorrelation, it means the ARCH model adequately
captures the volatility.
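A numerical companion to the ACF plot is a Ljung-Box test on the squared residuals (the 12-lag choice is again an assumption):
Box.test(na.omit(res)^2, lag = 12, type = "Ljung-Box") # Large p-value means no remaining ARCH effects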
🔹 Step 7: Interpretation of Coefficients
Coefficient    Meaning              Explanation
α₀ (alpha0)    Baseline variance    Minimum level of volatility
α₁ (alpha1)    Lag effect           How much the past squared error affects current volatility
α₁ < 1         Stability condition  Ensures volatility doesn’t grow infinitely
🔹 Step 8: Formula Summary
r_t = \mu + \varepsilon_t
h_t = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2
Where:
h_t : current conditional variance (volatility)
\varepsilon_{t-1}^2 : last period’s squared error (shock)
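To connect the formula to the fitted model, here is a minimal sketch that rebuilds h_t from the estimated coefficients. Assumptions: the garch() fit from Step 4 is in memory, tseries labels the coefficients a0 and a1, and since garch() treats the series as zero-mean, the sample mean is subtracted as a stand-in for μ.
a <- coef(arch_model) # Estimated a0 and a1
eps <- returns - mean(returns) # Shocks epsilon_t (sample mean used in place of mu)
h <- a["a0"] + a["a1"] * c(NA, head(eps, -1))^2 # h_t = alpha0 + alpha1 * eps_{t-1}^2
plot(h, type = "l", main = "Conditional Variance h_t", ylab = "h_t")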
🔸 Homoskedasticity
Homo = same
Skedasticity = variance (spread)
Homoskedasticity = same variance throughout the data.
🔸 Opposite: Heteroskedasticity
Hetero = different
Skedasticity = variance
Heteroskedasticity means the variance changes:
for example, residuals (errors) are small in one region of the data and large in another.
📊 Example 1: Homoskedastic Data
Imagine you’re studying the relationship between hours studied (X) and exam score (Y).
Student Hours Studied (X) Exam Score (Y)
1 1 45
2 2 50
3 3 55
4 4 60
5 5 65
If you fit a regression line, the points stay close to it with roughly the same spread at every value of X.
📘 Interpretation:
The variance (distance from line) is roughly constant for all X values.
✅ That’s Homoskedasticity.
📊 Example 2: Heteroskedastic Data
Now, imagine another dataset:
Student Hours Studied (X) Exam Score (Y)
1 1 45
2 2 52
3 3 60
4 4 75
5 5 100
🔺 Here, as “Hours Studied” increases, the spread of scores gets larger.
For low X values, scores are close to the line.
For high X values, scores vary widely.
📘 Interpretation:
The variance increases with X.
❌ That’s Heteroskedasticity.
🎯 Visual Concept (Imagine two scatterplots)
✅ Homoskedastic:
Y|
| * * * *
| * * * *
|________________________ X (even spread all along)
❌ Heteroskedastic:
Y|
| * *
| * *
| * *
|________________________ X (spread increases with X)
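The same two patterns can be reproduced in R with simulated data (a sketch; the numbers are invented purely for illustration):
set.seed(1)
x <- 1:100
y_homo <- 5 * x + rnorm(100, sd = 10) # Constant error spread: homoskedastic
y_hetero <- 5 * x + rnorm(100, sd = 0.5 * x) # Spread grows with x: heteroskedastic
par(mfrow = c(1, 2)) # Two plots side by side
plot(x, y_homo, main = "Homoskedastic", ylab = "Y")
plot(x, y_hetero, main = "Heteroskedastic", ylab = "Y")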