Introduction To Copula in Finance With Python and R
January 2024
Abstract
This research article is aimed at students who have a basic background in
Probability and Statistics and are not yet familiar with the main topic. The
primary objective is to deliver a thorough introduction to Copulas that allows
the audience to delve deeper into the subject independently. The theory is
paired with intuition and examples in order to show how Copulas are applied to
real-world data. Throughout the article I explain what a Copula is, its main
features and the estimation methods. Finally, I illustrate how to employ
Copulas in Finance using R. The main purpose of this example is to estimate the
joint distribution of stock returns through a Survival Gumbel Copula and a
Student's t Copula. Additionally, I simulate synthetic returns to assess the
ability of the models to capture the behavior of historical data. The data are
Apple and Exxon returns from 1 January 2019 to 18 December 2023. All the
presented results stem from my own research using R as statistical software.
The main libraries I relied upon are VineCopula and copula. Detailed proofs are
intentionally omitted to prevent the article from becoming excessively
mathematical. Nevertheless, interested readers can find comprehensive proofs in
the provided references.
Contents
1 Introduction
1.1 Why are Copulas useful?
3 Copula functions
3.1 Copulas and their properties
3.2 Copula Families
3.2.1 Elliptical Copulas
3.2.2 Archimedean Copulas
3.3 Estimation of Copula parameters
3.3.1 Kendall Tau method of moments
3.3.2 Maximum Likelihood Estimation
6 APPENDIX
6.1 Appendix A - Cholesky Decomposition
6.2 Appendix B - Maximum Likelihood Estimation
1 Introduction
1.1 Why are Copulas useful?
In financial markets, it is well known that returns are neither normally distributed
nor symmetric, and it is not uncommon for dependent assets to exhibit tail dependence
and outliers. In the context of credit risk, it is essential to model the tail dependence
between defaults of different financial instruments, such as bonds or credit derivatives.
Consequently, the modeling of multivariate distributions and their dependence
structure is one of the most critical concerns in probability theory and financial
applications. In this paper we examine how Copula functions address these two issues,
namely how to gauge dependence and how to deal with multivariate distributions.
The main appeal of copulas is that they let you model the dependence structure and
the marginals separately. This separation allows for more flexible modeling, as
changes in the marginal distributions or in the dependence structure can be addressed
independently. In addition, different Copula families can capture various types of
dependence, including positive or negative correlation, tail dependence, and asymmetry.
Moreover, from a probability theory point of view, this is an advantage because
for many combinations of marginals there is no closed form for the desired
multivariate distribution, even though many parametric univariate distributions
are available.
For instance, it is feasible to generate random samples from a joint normal
distribution. In this case, each marginal is normal, and the multivariate distribution
is fully characterized by its mean vector and covariance matrix. Therefore, one may
carry out a Cholesky decomposition to generate random samples from a multivariate
normal. However, it is not straightforward to do the same when the marginals differ,
for instance Beta, Gamma, Weibull and Gumbel. In this case the joint distribution has
to take into account many parameters, and this usually leads to a complex dependence
structure which, for non-elliptical distributions, is difficult to describe with a
simple covariance matrix.
In the bivariate case, given $(X_1, X_2)$, the 2-dimensional CDF is defined as:
\[
F_X(x) = P(X_1 \le x_1, X_2 \le x_2) = \int_{-\infty}^{x_1} \int_{-\infty}^{x_2} f_X(t_1, t_2)\, dt_2\, dt_1
\]
In addition, there is a further significant way of looking at the CDF.
In the bivariate case, since the CDF is the probability that $X_1 \le x_1$ and
$X_2 \le x_2$ hold at the same time, $P(X_1 \le x_1, X_2 \le x_2)$, it depends on
the marginals of $X_1$ and $X_2$ and on their dependence structure.
Figure 1: From the left: we can use the marginal inverse CDFs to map from $(u, v)$ to
$(F_X^{-1}(u), F_Y^{-1}(v))$. Conversely, the second picture shows how we can use the
marginal CDFs to map from $(x, y)$ to a point $(u, v)$ on the unit square.
2.2 Measures of dependence
In quantitative finance, correlation is employed to construct well-diversified portfolios,
manage risk, optimize asset allocation, hedge and design trading algorithms. Moreover, in
financial econometrics and stochastic calculus, dependence is often introduced through
instantaneous Brownian shocks, which are Gaussian. Two different types of dependence
measures are covered in this section: linear correlation and rank correlation.
For a pair of random variables, each of these dependence measures produces a
scalar value, although each has its own characteristics.
\[
\rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}
\]
where $d_i$ is the difference between the ranks of corresponding observations, and $n$
is the number of observations. It gauges the strength and direction of the monotonic
relationship between two variables and ranges from -1 to 1. A monotonic relationship
between two variables means that as one variable increases, the other tends to
consistently increase (or consistently decrease), though not necessarily at a
constant rate.
Definition: The Kendall’s tau for two variables X and Y is given by:
\[
\tau = \frac{\text{Number of concordant pairs} - \text{Number of discordant pairs}}{\frac{1}{2} n(n - 1)}
\]
where $n$ is the number of observations. Thus, it measures the similarity in the ordering
of data points between two variables. It ranges from -1 to 1. If, for instance, we obtain
$\tau = -0.2$, it means that, on average, as $X$ increases, $Y$ tends to decrease slightly.
Let's recapitulate the essential point. Rank correlation gauges the strength and
direction of the monotonic association between two random variables. It is coarser
than linear correlation, since it uses only the ranks. Yet, Spearman's rho and
Kendall's tau are more robust and better suited to non-linear dependence. In addition,
they do not depend on the shape of the marginal distributions of the random variables.
The latter is a paramount connection with Copulas.
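The invariance claimed above can be checked numerically. The following is a minimal sketch in plain Python (standard library only); the simulated data, the sample size and the two monotone transforms are illustrative assumptions, not part of the empirical study. It computes Spearman's rho and Kendall's tau from the two formulas of this section and shows that they are unchanged when each variable is passed through a strictly increasing function, whereas the linear correlation changes.

```python
import math
import random

def ranks(values):
    """Ranks 1..n of the observations (assumes no ties, as for continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(x, y):
    """rho = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)), d_i = difference of ranks."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def kendall_tau(x, y):
    """tau = (concordant - discordant) / (n (n - 1) / 2)."""
    n, score = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (0.5 * n * (n - 1))

def pearson(x, y):
    """Linear correlation coefficient."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)
x = [random.gauss(0, 1) for _ in range(300)]
y = [0.6 * a + 0.8 * random.gauss(0, 1) for a in x]

# Strictly increasing transforms change the marginals but not the ranks.
x_t = [math.exp(a) for a in x]
y_t = [b ** 3 for b in y]

print("Kendall tau :", kendall_tau(x, y), kendall_tau(x_t, y_t))
print("Spearman rho:", spearman_rho(x, y), spearman_rho(x_t, y_t))
print("Pearson     :", pearson(x, y), pearson(x_t, y_t))
```

The pairwise loop in `kendall_tau` is quadratic in the sample size, which is acceptable for a few hundred observations but not for large data sets.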
3 Copula functions
To get a quick grasp of Copulas, imagine you are building a house. The margins
are akin to the foundational pillars of the house, whereas the Copula is the blueprint
that dictates how these pillars are connected. Regardless of the specific materials
(marginal distributions) you use for the pillars, the blueprint (Copula) remains
consistent in connecting them. Moreover, once you pin down a blueprint (Copula), you can
change the materials (marginals) without modifying the Copula structure (see Theorem 3).
Now we can start reasoning in financial terms. Think about the returns of
two stocks, like Apple and Exxon, and imagine plotting them on a scatter plot.
The result can be seen as the simultaneous interaction of the dependence structure
between the two stocks and their own marginal distributions. This is consistent with
reality, since a stock price basically depends on its core business (idiosyncratic risk)
and on market conditions (market risk). If we change the correlation, the data will
change; likewise, if we modify the marginals, the data set will take different values.
Therefore, the main appeal of Copulas is that they let you model the dependence
structure of the two stocks and their marginals separately. Moreover, recall the
previous decomposition of the multivariate distribution (Remark 1). Linking
these two ways of reasoning, we can infer that a multivariate distribution function
can be seen as a set of marginals joined by a Copula, where the Copula describes the
dependence structure. It follows that this separation allows for more flexible
modeling, as changes in the marginal distributions or in the dependence structure can
be addressed independently.
3.1 Copulas and their properties
Definition 3.1: A Copula is a multivariate cumulative distribution function for which
the marginal probability distribution of each variable is uniform on the interval [0, 1].
In probabilistic terms, a Copula, denoted as $C$, is a map $C: [0, 1]^d \to [0, 1]$.
The common notation for a copula is:
\[
C(u_1, u_2, \ldots, u_d) = P(U_1 \le u_1, \ldots, U_d \le u_d)
\]
such that $U_i \sim \text{Uniform}(0, 1)$.
Basically, it receives a vector of marginal probabilities, each in [0, 1], and returns
a value between zero and one. Hence, any multivariate joint distribution can be written
in terms of univariate marginal distribution functions and a copula, where the latter
describes the dependence structure among the variables.
\[
\begin{aligned}
F(x_1, \ldots, x_d) &= P(X_1 \le x_1, \ldots, X_d \le x_d) \\
&= P(F_1(X_1) \le F_1(x_1), \ldots, F_d(X_d) \le F_d(x_d)) \\
&= C(F_1(x_1), \ldots, F_d(x_d))
\end{aligned}
\]
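The middle line of this decomposition relies on the fact that $F_i(X_i)$ is uniform on [0, 1] when $F_i$ is continuous. A minimal sketch of that step follows, assuming a hypothetical normal marginal with mean 0.05 and standard deviation 0.2 (these numbers are illustrative only): applying the marginal CDF to the samples should give values whose mean is close to 1/2 and whose variance is close to 1/12, and the inverse CDF maps them back.

```python
import random
from statistics import NormalDist, mean, pvariance

random.seed(1)

# Hypothetical marginal distribution of a return (illustrative values).
marginal = NormalDist(mu=0.05, sigma=0.2)

# Draw from the marginal, then apply its own CDF: U = F(X).
x = [random.gauss(0.05, 0.2) for _ in range(20000)]
u = [marginal.cdf(v) for v in x]

u_mean = mean(u)        # a Uniform(0, 1) variable has mean 1/2
u_var = pvariance(u)    # ... and variance 1/12
print(u_mean, u_var)

# The inverse CDF undoes the transformation: F^{-1}(F(x)) = x.
x_back = [marginal.inv_cdf(p) for p in u[:5]]
print(x[:5], x_back)
```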
Problem: We do not know the joint distribution $F_{XY}(x, y)$ a priori. It is
challenging to estimate because it has to take into account many parameters,
which usually leads to a complex dependence structure.
Remark 2: For an arbitrary continuous multivariate distribution, we can determine
its Copula by applying Definition 2.1.4:
For the next definition, we recall that the density (pdf) is indeed the derivative of the
cumulative distribution function (CDF). Mathematically, the relationship is given by:
\[
f(x) = \frac{d}{dx} F(x)
\]
Definition 3.2: The Copula density and the density of the multivariate
distribution expressed through the copula are:
\[
c(u_1, \ldots, u_d) = \frac{\partial^d C(u_1, \ldots, u_d)}{\partial u_1 \cdots \partial u_d}, \quad u_1, \ldots, u_d \in [0, 1],
\]
\[
f(x_1, \ldots, x_d) = c\{F_1(x_1), \ldots, F_d(x_d)\} \prod_{i=1}^{d} f_i(x_i).
\]
The copula c models the joint dependence between the variables, while the marginal
densities fi describe the individual behavior of each variable.
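As a short worked illustration of Definition 3.2 (a sketch for the bivariate case only, assuming $C$ is twice differentiable), differentiating $F(x_1, x_2) = C(F_1(x_1), F_2(x_2))$ with the chain rule gives the factorization directly. The independence copula $C(u_1, u_2) = u_1 u_2$ is used as the simplest special case.

```latex
\[
\begin{aligned}
f(x_1, x_2)
  &= \frac{\partial^2}{\partial x_1 \partial x_2}\, C\bigl(F_1(x_1), F_2(x_2)\bigr) \\
  &= \frac{\partial^2 C(u_1, u_2)}{\partial u_1 \partial u_2}
     \bigg|_{u_1 = F_1(x_1),\, u_2 = F_2(x_2)}
     \cdot \frac{dF_1(x_1)}{dx_1} \cdot \frac{dF_2(x_2)}{dx_2} \\
  &= c\bigl(F_1(x_1), F_2(x_2)\bigr)\, f_1(x_1)\, f_2(x_2).
\end{aligned}
\]
% Special case: for the independence copula C(u_1, u_2) = u_1 u_2 we get
% c(u_1, u_2) = 1, so the joint density reduces to f_1(x_1) f_2(x_2).
```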
The upper bound represents a copula with stronger positive dependence, whereas the
lower bound represents a copula with stronger negative dependence. The value of d
depicts the dimension of the Copula.
3.2 Copula Families
Different Copula families can capture various types of dependence, including positive
or negative correlation, tail dependence, and asymmetry. In this section we present
the two main groups of Copulas, their features and properties.
Suppose we hold Apple and Exxon stocks, each with a t-distribution for returns:
Exxon: - Mean: 0.04 - Scale parameter ($\sigma^*$): 0.15 - Degrees of freedom ($df$): 8
Apple: - Mean: 0.07 - Scale parameter ($\sigma^*$): 0.17 - Degrees of freedom ($df$): 5
In addition, note that the scale parameter of the t distribution is not the standard
deviation, since the latter also depends on the degrees of freedom. The exact
relationship is
\[
s = \sigma \left( \frac{df - 2}{df} \right)^{0.5},
\]
where $\sigma$ is the standard deviation, $s$ is the scale, and $df$ is the
degrees of freedom (with $df > 2$).
Problem: Assume we want to employ a normal copula with correlation $\rho = 0.7$ between
the two stocks (the off-diagonal entry of the correlation matrix $\Sigma$). What is the
probability that both assets produce a loss (namely, that both returns are less than zero)?
Solution:
• Step 1: Compute the probability that Exxon and Apple’s returns are less than
zero according to the Student’s t-distribution.
\[
P_{\text{Exxon}}(0): \frac{0 - 0.04}{0.15} = -0.2667 \;\rightarrow\; F_{t_8}(-0.2667) = 0.3982
\]
\[
P_{\text{Apple}}(0): \frac{0 - 0.07}{0.17} = -0.4118 \;\rightarrow\; F_{t_5}(-0.4118) = 0.3488
\]
This result means that, under all the hypotheses and data we posited, the joint
probability that both assets produce a loss is around 25%.
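The whole calculation can be reproduced with a short script. The sketch below uses only the Python standard library, so the Student's t CDF and the bivariate normal CDF are obtained by numerical integration (composite Simpson rule); the helper names `simpson`, `t_cdf` and `gaussian_copula_cdf` are introduced here for illustration. In practice one would more likely rely on a statistics library, for example `scipy.stats.t.cdf` and `scipy.stats.multivariate_normal.cdf` in Python. The second step, evaluating the Gaussian copula at the two marginal probabilities with $\rho = 0.7$, is written as the standard one-dimensional integral of $\varphi(z)\,\Phi\bigl((b - \rho z)/\sqrt{1 - \rho^2}\bigr)$.

```python
import math
from statistics import NormalDist

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def t_pdf(x, df):
    """Density of the standard Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df):
    """CDF by symmetry: F(x) = 0.5 + sign(x) * integral of the density on [0, |x|]."""
    area = simpson(lambda t: t_pdf(t, df), 0.0, abs(x))
    return 0.5 + math.copysign(area, x)

def gaussian_copula_cdf(u, v, rho):
    """C(u, v) = Phi_2(Phi^{-1}(u), Phi^{-1}(v); rho) via a one-dimensional integral."""
    std = NormalDist()
    a, b = std.inv_cdf(u), std.inv_cdf(v)
    s = math.sqrt(1 - rho * rho)
    integrand = lambda z: std.pdf(z) * std.cdf((b - rho * z) / s)
    return simpson(integrand, -8.0, a)  # probability mass below -8 is negligible

# Step 1: marginal probabilities of a loss (return below zero).
u_exxon = t_cdf((0 - 0.04) / 0.15, df=8)
u_apple = t_cdf((0 - 0.07) / 0.17, df=5)

# Step 2: join them with the Gaussian copula, rho = 0.7.
p_joint_loss = gaussian_copula_cdf(u_exxon, u_apple, rho=0.7)

print(round(u_exxon, 4), round(u_apple, 4), round(p_joint_loss, 4))
```

The two marginal probabilities should agree with Step 1 above to about four decimals, and the joint probability should come out close to the 25% quoted in the text.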
Figure 2: Simulation of Gaussian (top) and Student's t with $\nu = 2$ (bottom) Copulas,
with correlation −0.5 (left), 0.3 (middle) and 0.9 (right). As we can see, the t Copula
has fatter (symmetric) tails than the Gaussian.
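Gumbel Copula:
\[
C_{\text{Gumbel}}(u_1, u_2) = \exp\left\{ -\left[ (-\log u_1)^{\theta} + (-\log u_2)^{\theta} \right]^{1/\theta} \right\}
\]
Clayton Copula:
\[
C_{\text{Clayton}}(u_1, u_2) = \left[ \max\left( u_1^{-\theta} + u_2^{-\theta} - 1,\; 0 \right) \right]^{-1/\theta}
\]
Frank Copula:
\[
C_{\text{Frank}}(u_1, u_2) = -\frac{1}{\theta} \log\left( 1 + \frac{(e^{-\theta u_1} - 1)(e^{-\theta u_2} - 1)}{e^{-\theta} - 1} \right)
\]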
In these formulas, $\theta$ is the parameter that controls the strength of dependence
and, for the Gumbel and Clayton families, the tail dependence of the Copula.
Different values of $\theta$ yield copulas with different tail behaviors.
In most cases, Gumbel Copulas can be used to model upper tail dependence or
the maximum behaviour of a phenomenon. Conversely, the Clayton Copula can be employed
to model lower (left) tail dependence.
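The three formulas translate directly into code. The sketch below (plain Python, with function names chosen here for illustration) evaluates the bivariate Gumbel, Clayton and Frank CDFs for arguments in (0, 1] and checks two properties that every copula must satisfy: the boundary condition $C(u, 1) = u$, and, for the Gumbel family, the reduction to the independence copula $u_1 u_2$ when $\theta = 1$.

```python
import math

def gumbel_cdf(u1, u2, theta):
    """Gumbel copula, theta >= 1 (theta = 1 gives independence)."""
    s = (-math.log(u1)) ** theta + (-math.log(u2)) ** theta
    return math.exp(-s ** (1.0 / theta))

def clayton_cdf(u1, u2, theta):
    """Clayton copula, written here for theta > 0."""
    base = max(u1 ** (-theta) + u2 ** (-theta) - 1.0, 0.0)
    return base ** (-1.0 / theta)

def frank_cdf(u1, u2, theta):
    """Frank copula, theta != 0."""
    num = (math.exp(-theta * u1) - 1.0) * (math.exp(-theta * u2) - 1.0)
    return -math.log(1.0 + num / (math.exp(-theta) - 1.0)) / theta

u, v, theta = 0.3, 0.6, 2.0
for name, cdf in [("Gumbel", gumbel_cdf), ("Clayton", clayton_cdf), ("Frank", frank_cdf)]:
    # C(u, v) is a joint probability; C(u, 1) must give back the marginal u.
    print(name, cdf(u, v, theta), cdf(u, 1.0, theta))
```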
Figure 3: From the left, a simulation of Clayton, Gumbel and Frank Copulas with a
dependence measure of 0.9 (top) and 0.3 (bottom).
3.3.1 Kendall Tau method of moments
In the bivariate case, a standard method of estimating the single Copula parameter is
based on Kendall's tau statistic (see Genest and Rivest, 1993). For most copula functions
with a single parameter $\theta$ there is a one-to-one relationship between $\theta$ and
Kendall's tau. This means that from an estimate of the latter we can infer an estimate
of the Copula parameter.
Theorem 4: Let $(X_1, X_2)$ be a vector of random variables with continuous marginal
cdfs $F_1, F_2$ and copula $C$. Then:
\[
\tau(X_1, X_2) = 4 \int_0^1 \int_0^1 C(u_1, u_2)\, dC(u_1, u_2) - 1
\]
This theorem provides a direct link between Kendall's tau and the Copula. Both
describe dependence and do not rely on the marginal distributions, since the integrand
is expressed only in terms of the Copula and the integral is taken over the unit square.
\[
\tau(X_1, X_2) = \frac{2}{\pi} \arcsin(\rho_{ij})
\]
This means that Kendall's tau can be used as an estimator for the matrix $\Sigma$.
In particular, the elements $\rho_{ij}$ of the correlation matrix are obtained as:
\[
\rho_{ij} = \sin\left( \frac{1}{2} \pi \tau_{ij} \right)
\]
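A minimal sketch of this method of moments follows. The elliptical relation is the one given above. The closed forms used for the Gumbel family ($\tau = 1 - 1/\theta$) and the Clayton family ($\tau = \theta/(\theta + 2)$) are standard results that are not derived in this article, so they should be read as assumptions of the sketch; the function names are hypothetical.

```python
import math

def elliptical_rho_from_tau(tau):
    """Invert tau = (2 / pi) * arcsin(rho) for Gaussian and t copulas."""
    return math.sin(0.5 * math.pi * tau)

def gumbel_theta_from_tau(tau):
    """Invert tau = 1 - 1/theta (assumed standard result; needs 0 <= tau < 1)."""
    return 1.0 / (1.0 - tau)

def clayton_theta_from_tau(tau):
    """Invert tau = theta / (theta + 2) (assumed standard result; needs 0 < tau < 1)."""
    return 2.0 * tau / (1.0 - tau)

# Example: an empirical Kendall's tau of 0.2 (illustrative value).
tau_hat = 0.2
print("rho   (elliptical):", elliptical_rho_from_tau(tau_hat))
print("theta (Gumbel)    :", gumbel_theta_from_tau(tau_hat))
print("theta (Clayton)   :", clayton_theta_from_tau(tau_hat))
```

Because each mapping is one-to-one on its domain, plugging in the sample Kendall's tau gives a parameter estimate without any numerical optimization.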
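The ML estimator $\hat{\eta} = (\alpha, \theta)$ solves the system:
\[
\frac{\partial L(\eta, X)}{\partial \eta^{\top}} = 0
\]
where:
\[
\begin{aligned}
L(\eta, X) &= \sum_{i=1}^{n} \log \left[ c\{F_1(x_{1i}, \alpha_1), \ldots, F_d(x_{di}, \alpha_d), \theta\} \prod_{j=1}^{d} f_j(x_{ji}, \alpha_j) \right] \\
&= \sum_{i=1}^{n} \left[ \log c\{F_1(x_{1i}, \alpha_1), \ldots, F_d(x_{di}, \alpha_d), \theta\} + \sum_{j=1}^{d} \log f_j(x_{ji}, \alpha_j) \right]
\end{aligned}
\]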
In line with the classical properties of ML estimation, the estimator is efficient and
asymptotically normal. Moreover, plugging the estimates back into $L(\eta, X)$ gives the
maximized value of the log-likelihood function, which can be used to compute AIC, BIC
and the likelihood ratio test. However, a drawback of this method is that it is often
computationally demanding, especially when we deal with several distributions.
The main issue is that the Copula and marginal parameters are estimated simultaneously.
For a comprehensive presentation of MLE for each class of Copulas see Cherubini,
Luciano and Vecchiato (2004, Chapter 7).
Alternatively, to cope with this drawback, we can employ Inference Functions for
Margins (IFM) estimation, which is essentially a two-stage optimization.
First, we estimate the parameters of the margins separately, and then we use them as
known quantities in the estimation of the Copula parameters. The above optimization
problem is then replaced by:
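\[
\left( \frac{\partial L_1}{\partial \alpha_1}, \ldots, \frac{\partial L_d}{\partial \alpha_d}, \frac{\partial L_{d+1}}{\partial \theta} \right) = 0
\]
where:
\[
\begin{aligned}
L_j &= \sum_{i=1}^{n} l_j(X_i), && \text{for } j = 1, \ldots, d + 1, \\
l_j(X_i) &= \log f_j(x_{ji}, \alpha_j), && \text{for } j = 1, \ldots, d, \; i = 1, \ldots, n, \\
l_{d+1}(X_i) &= \log \left[ c\{F_1(x_{1i}, \alpha_1), \ldots, F_d(x_{di}, \alpha_d), \theta\} \right], && \text{for } i = 1, \ldots, n.
\end{aligned}
\]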
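The two stages can be sketched on synthetic data. The example below is an illustration under strong simplifying assumptions that are not part of the empirical study: normal marginals (whose ML estimates are the sample mean and the population standard deviation), a bivariate Gaussian copula with a single parameter $\rho$, simulated returns with true $\rho = 0.6$, and a plain grid search in place of a numerical optimizer. The Gaussian copula log-density used in the second stage is the standard expression in terms of the normal scores $x = \Phi^{-1}(u_1)$, $y = \Phi^{-1}(u_2)$.

```python
import math
import random
from statistics import NormalDist, mean, pstdev

random.seed(42)

# Synthetic "returns" with Gaussian dependence (illustrative parameters).
n, true_rho = 1000, 0.6
data = []
for _ in range(n):
    z1 = random.gauss(0.0, 1.0)
    z2 = true_rho * z1 + math.sqrt(1.0 - true_rho ** 2) * random.gauss(0.0, 1.0)
    data.append((0.001 + 0.02 * z1, 0.0005 + 0.015 * z2))

# Stage 1: estimate each margin separately (ML for a normal distribution).
margins = []
for j in range(2):
    column = [row[j] for row in data]
    margins.append(NormalDist(mean(column), pstdev(column)))

# Stage 2: treat the margins as known, map the data to normal scores of the
# uniforms u_j = F_j(x_j), and maximize the copula log-likelihood over rho.
std = NormalDist()
scores = [(std.inv_cdf(margins[0].cdf(a)), std.inv_cdf(margins[1].cdf(b)))
          for a, b in data]

def copula_loglik(rho):
    """Sum over observations of log c(u1, u2; rho) for the Gaussian copula."""
    r2 = rho * rho
    return sum(-0.5 * math.log(1.0 - r2)
               - (r2 * (x * x + y * y) - 2.0 * rho * x * y) / (2.0 * (1.0 - r2))
               for x, y in scores)

grid = [k / 100 for k in range(-99, 100)]   # candidate values of rho
rho_hat = max(grid, key=copula_loglik)
print("estimated rho:", rho_hat)
```

The grid search keeps the sketch dependency-free; with more than one copula parameter a proper optimizer would be the natural replacement.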
4 Sampling from Gaussian and T Student Copulas
Throughout this article, we outlined that a Copula can be seen as a joint cumulative
distribution which ties together different univariate marginals. Therefore, to figure
out how a given Copula (with arbitrary marginals) behaves in practice, it is extremely
useful to sample from it, namely to simulate its values. This procedure is beneficial
for scenario analysis and model validation. Sampling from a Copula allows us to compare
synthetic (i.e. simulated) data generated using the Copula with observed data. If the
simulated data closely match the historical data, this suggests that the Copula captures
the dependence structure effectively.
Figure 4: Simulation of a three-dimensional multivariate normal; (0.5, 0.10, −0.6) are the
pairwise correlation values we set. In doing so one must specify a positive definite
correlation matrix.
2. Transform the vector $Z$ into $U = (\Phi(Z_1), \ldots, \Phi(Z_m))$, where $\Phi$ is the
distribution function of the univariate standard normal. By means of the Gaussian CDF, we
remove the Gaussian marginals and obtain a pure representation of the dependence
structure, as shown by the scatterplot. This is essentially a Gaussian Copula
representation.
Figure 5: Each marginal is uniformly distributed on [0, 1]. It is paramount to recognize
that the dependence remains the same: the applied transformation is monotone, so it does
not alter the rank correlation among the random variables. Essentially, what remains is
the pure dependence structure (scatter plots).
3. Transform the uniform vector into the marginals you want by applying the inverse CDFs:
$(F_1^{-1}(u_1), \ldots, F_m^{-1}(u_m))$.
Figure 6: Inverse CDF (Definition 2.1.4) applied to the Uniform(0, 1) data.
For this example we posit arbitrary distributions: Gamma(2, 1), Beta(2, 2) and Student's t (5).
To sum up, we define a correlation structure of our choice. Then we exploit the fact
that correlated normal random variables are easy to simulate with a Cholesky
decomposition. We apply the probability integral transform to strip out the Gaussian
marginals while preserving the same dependence structure. Finally, by applying the
univariate inverse CDFs, we obtain a simulation of the distributions we selected, linked
by the initial dependence structure. To fully appreciate this from a graphical point of
view, one can compare the scatter plots of Figures 4 and 6.
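The three steps summarized above can be sketched for the bivariate case in plain Python. Two choices here are assumptions made for the sake of a self-contained example: the correlation $\rho = 0.7$, and the exponential and logistic marginals, which are used instead of the Gamma, Beta and Student's t marginals of Figure 6 because their inverse CDFs have closed forms. As a sanity check, the Kendall's tau of the final sample is compared with $\frac{2}{\pi}\arcsin(\rho)$, which should hold approximately whatever the marginals are.

```python
import math
import random
from statistics import NormalDist

def sample_gaussian_copula(n, rho, rng):
    """Return n pairs (u1, u2) from a bivariate Gaussian copula."""
    std = NormalDist()
    # Cholesky factor of [[1, rho], [rho, 1]] is [[1, 0], [rho, sqrt(1 - rho^2)]].
    l21, l22 = rho, math.sqrt(1.0 - rho * rho)
    pairs = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = e1, l21 * e1 + l22 * e2            # step 1: correlated normals
        pairs.append((std.cdf(z1), std.cdf(z2)))    # step 2: uniform marginals
    return pairs

def exponential_inv_cdf(u, rate=1.0):
    """Inverse CDF of the exponential distribution."""
    return -math.log(1.0 - u) / rate

def logistic_inv_cdf(u):
    """Inverse CDF of the standard logistic distribution."""
    return math.log(u / (1.0 - u))

def kendall_tau(x, y):
    """(concordant - discordant) / (n (n - 1) / 2)."""
    n, score = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            score += (prod > 0) - (prod < 0)
    return score / (0.5 * n * (n - 1))

rng = random.Random(7)
rho = 0.7
uniforms = sample_gaussian_copula(800, rho, rng)

# Step 3: apply the inverse CDF of the marginals we want.
sample = [(exponential_inv_cdf(u1), logistic_inv_cdf(u2)) for u1, u2 in uniforms]

tau_hat = kendall_tau([a for a, _ in sample], [b for _, b in sample])
tau_theory = 2.0 / math.pi * math.asin(rho)
print("sample tau:", tau_hat, " theoretical tau:", tau_theory)
```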
Sampling from the Student's t Copula: the input parameters for the simulation are
$(v, \Sigma)$. The t copula can be simulated as follows:
2. Transform the vector $X$ into $U = (t_v(X_1), \ldots, t_v(X_m))^T$, where $t_v$ is the
distribution function of the univariate t distribution with $v$ degrees of freedom.
3. Transform the uniform vector into the marginals you want by applying the inverse CDFs:
$(F_1^{-1}(u_1), \ldots, F_m^{-1}(u_m))$.
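A bivariate sketch of the t copula sampler is given below. The first step is not spelled out in the text, so the code assumes the standard construction of a multivariate t vector, $X = Z / \sqrt{W/v}$ with $Z \sim N(0, \Sigma)$ and $W \sim \chi^2_v$ independent of $Z$. The Student's t CDF is computed by Simpson integration because the Python standard library does not provide it; the helper names and the parameter values ($\rho = 0.25$, $v = 5$) are illustrative. Step 3 then proceeds with any inverse CDF, exactly as for the Gaussian copula.

```python
import math
import random
from statistics import mean

def t_pdf(x, df):
    """Density of the standard Student's t distribution."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df):
    """CDF via composite Simpson integration of the density on [0, |x|]."""
    n = 2 * max(50, math.ceil(abs(x) / 0.02))   # even count, step of at most 0.01
    h = abs(x) / n
    total = t_pdf(0.0, df) + t_pdf(abs(x), df)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 0.5 + math.copysign(total * h / 3.0, x)

def sample_t_copula(n, rho, df, rng):
    """Return n pairs (u1, u2) from a bivariate t copula."""
    l21, l22 = rho, math.sqrt(1.0 - rho * rho)
    eps = 1e-12                                  # keep uniforms strictly in (0, 1)
    pairs = []
    for _ in range(n):
        e1, e2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = e1, l21 * e1 + l22 * e2         # correlated normals
        w = rng.gammavariate(df / 2.0, 2.0)      # chi-square with df degrees of freedom
        scale = math.sqrt(w / df)
        x1, x2 = z1 / scale, z2 / scale          # step 1 (assumed): bivariate t vector
        u1 = min(max(t_cdf(x1, df), eps), 1.0 - eps)
        u2 = min(max(t_cdf(x2, df), eps), 1.0 - eps)
        pairs.append((u1, u2))                   # step 2: uniform marginals
    return pairs

rng = random.Random(11)
u_pairs = sample_t_copula(500, rho=0.25, df=5, rng=rng)
print(mean(u for u, _ in u_pairs), mean(v for _, v in u_pairs))
```

The shared divisor `scale` is what distinguishes this sampler from the Gaussian one: a small draw of `w` inflates both coordinates at once, which produces joint extremes in both tails.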
5 Sampling Apple and Exxon returns from
Survival Gumbel and T student Copula
The main purpose of this example is to estimate the joint distribution of stock returns
through a Survival Gumbel Copula and a Student's t Copula, employing Maximum Likelihood
Estimation (see 3.3.2). In addition, we simulate synthetic returns to test whether the
Copulas are able to capture the behaviour of the historical data. Apple and Exxon returns
from 1 January 2019 to 18 December 2023 are considered. Since financial markets plunge
in periods of crisis, I have intentionally included the Covid pandemic in the data to
assess whether the models allow for such fat-tail events, sometimes called 'Black Swans'.
Figure 7: Apple and Exxon returns from January 1, 2019 to December 18, 2023.
In the previous illustration (4.2), the normal Copula model was selected without
exhaustive deliberation. In practical applications, however, we need an algorithm that
pins down the most appropriate Copula. The VineCopula package in R provides a valuable
tool for Copula selection: the BiCopSelect function supports an informed decision by
systematically evaluating different Copulas. First, the parameters of every candidate
Copula family are estimated via MLE. Then the function selects the most appropriate
model according to an information criterion such as AIC or BIC. This approach ensures
an efficient selection of the best model from a wide range of Copula families.
Given our data, the Copula selected by the aforementioned algorithm is the Survival
Gumbel with parameter $\hat{\theta} = 1.24$. In addition, estimation with Kendall's tau
is feasible since this model depends on one single parameter, and it leads to an almost
identical estimate for $\theta$. For the sake of completeness, we estimate an additional
model by means of BiCopSelect, to single out the second most suitable Copula. It turns
out to be a Student's t Copula with parameters $\hat{\rho} = 0.25$ and $df = 5$.
Figure 8: Survival Gumbel Copula density with parameter $\theta = 1.24$ and $\theta = 3$
Figure 9: Student's t Copula density with $df = 5$ and $\hat{\rho} = 0.25$ and $\hat{\rho} = -0.5$
The density sheds light on the characteristics of the Copula. The Survival Gumbel
Copula has strong left tail dependence, whereas the Student's t Copula has symmetric
tail dependence. Nonetheless, both copulas have fat tails. Furthermore, as $\hat{\theta}$
increases, the probability of experiencing joint extreme events is higher in the Survival
Gumbel (see Figure 8). For the t Copula, as $\hat{\rho}$ increases in absolute value, the
probability of experiencing joint extreme events also rises. When the parameter is
negative, we basically invert the direction of dependence (see Figure 9).
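The left tail behaviour of the Survival Gumbel can be made concrete with a small numerical sketch. It relies on two standard facts that are not derived in this article and are therefore stated as assumptions: the survival copula of $C$ is $\bar{C}(u, v) = u + v - 1 + C(1 - u, 1 - v)$, and the upper tail dependence coefficient of the Gumbel copula, $2 - 2^{1/\theta}$, becomes the lower tail coefficient of its survival version. The ratio $\bar{C}(q, q)/q$, the probability that one asset is in its worst $q$-fraction given that the other is, should approach that coefficient as $q \to 0$.

```python
import math

def gumbel_cdf(u, v, theta):
    """Gumbel copula CDF for u, v in (0, 1] and theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-s ** (1.0 / theta))

def survival_gumbel_cdf(u, v, theta):
    """Survival (180-degree rotated) Gumbel copula, for u, v in (0, 1)."""
    return u + v - 1.0 + gumbel_cdf(1.0 - u, 1.0 - v, theta)

theta_hat = 1.24                                # estimate reported in the text
lambda_lower = 2.0 - 2.0 ** (1.0 / theta_hat)   # assumed closed form

for q in (0.1, 0.01, 0.0001):
    ratio = survival_gumbel_cdf(q, q, theta_hat) / q
    print(f"q = {q}: P(V <= q | U <= q) = {ratio:.4f}")
print("limit 2 - 2^(1/theta) =", round(lambda_lower, 4))
```

Under independence the same ratio would equal $q$ and vanish in the limit, so a strictly positive limit is what the phrase "left tail dependence" refers to.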
Figure 10: Empirical and simulated returns from a t student distribution
At this point we have both the Copula and the t marginals. Let's combine them by
Sklar's Theorem to obtain a representation of the multivariate distribution. We can see
in practice that, by this theorem, each joint distribution $F_{XY\ldots}$ can be written
as a copula function $C(F_1(x_1), F_2(x_2), \ldots)$ taking the marginal distributions
as arguments.
Figure 11: Multivariate distribution representation of the Survival Gumbel ($\hat{\theta} = 1.24$)
and Student's t Copula ($df = 5$ and $\hat{\rho} = 0.25$) with the estimated Student's t marginals.
The last step is the simulation from our Copulas to verify whether the simulated
returns are in line with the empirical stock data. At the time of writing, the copula and
VineCopula libraries in R did not let me simulate from a Survival Gumbel Copula with
Student's t marginals, since the survival families had only recently been implemented
and this feature was not yet available. Although it was seemingly the most suitable
model, we therefore simulated synthetic returns from the other model presented in this
article, the t Copula with t marginals.
Figure 12: Apple and Exxon empirical and simulated returns from January 1, 2019 to
December 18, 2023.
Figure 13: Apple and Exxon empirical and simulated returns from January 1, 2019 to
December 18, 2023.
5.1 Conclusion
In conclusion, the estimated t Copula leads to results rather close to the actual
observations. Nonetheless, there are two extreme negative observations, where both stocks
sank simultaneously by about 10%, that were not captured by our model. On the other hand,
our model seems to work adequately when Apple and Exxon plummet or soar by about 5%.
These features of the model stem from the fact that the t Copula has symmetric
tail dependence (see Figure 9). Therefore, my conclusion is that our model seems to
behave appropriately when returns range from -5% to 5%. Furthermore, it also accounts
for extremely negative returns, but not as frequently as they seem to occur in financial
markets. Nevertheless, I recall that the main aim of this article is to apply Copulas
to financial data.
At this point, one can appreciate how flexible and powerful these functions are. We
have modeled the joint behavior of stock returns, which are rather irregular random
variables. A practitioner may refine and employ this procedure for model validation,
stress testing, risk management and the pricing of multi-asset derivatives. However,
Copulas are not a panacea; they are an important tool in the statistical toolkit, but
they are not a universal solution. Depending on the characteristics of the data and the
specific research question, other methods might be more appropriate.
6 APPENDIX
6.1 Appendix A - Cholesky Decomposition
A square matrix $A$ is said to have a Cholesky decomposition if it can be written as the
product of a lower triangular matrix and its transpose. The lower triangular matrix
is required to have strictly positive real entries on its main diagonal. For a given
symmetric positive definite matrix $A$, the Cholesky decomposition is expressed as:
\[
A = LL^T
\]
where:
• $L$ is a lower triangular matrix with strictly positive diagonal entries.
• $L^T$ is the transpose of $L$.
Under the positive-diagonal requirement, the decomposition of a symmetric positive
definite matrix is unique. Without that requirement, $A$ can be factorized into $LL^T$
in multiple ways, because the sign of any column of $L$ can be flipped.
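Consider the symmetric positive definite matrix:
\[
A = \begin{pmatrix} 25 & 15 & -5 \\ 15 & 18 & 0 \\ -5 & 0 & 11 \end{pmatrix}
\]
The Cholesky decomposition results in the lower triangular matrix $L$:
\[
L = \begin{pmatrix} 5 & 0 & 0 \\ 3 & 3 & 0 \\ -1 & 1 & 3 \end{pmatrix}
\]
To verify the decomposition, let's check that $LL^T = A$:
\[
L \cdot L^T = \begin{pmatrix} 5 & 0 & 0 \\ 3 & 3 & 0 \\ -1 & 1 & 3 \end{pmatrix}
\cdot \begin{pmatrix} 5 & 3 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{pmatrix}
= \begin{pmatrix} 25 & 15 & -5 \\ 15 & 18 & 0 \\ -5 & 0 & 11 \end{pmatrix}
\]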
The product LLT indeed equals the original matrix A.
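The same example can be reproduced with a few lines of code. The function below is a minimal sketch of the textbook row-by-row Cholesky algorithm in plain Python (no input validation, and it assumes the matrix is symmetric positive definite); numerical libraries offer equivalent routines, for instance `numpy.linalg.cholesky`.

```python
import math

def cholesky(a):
    """Lower triangular L with positive diagonal such that L L^T = a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)       # diagonal entry
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]    # below-diagonal entry
    return low

def multiply_by_transpose(low):
    """Compute L L^T to verify the factorization."""
    n = len(low)
    return [[sum(low[i][k] * low[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[25, 15, -5],
     [15, 18, 0],
     [-5, 0, 11]]

L = cholesky(A)
print(L)                          # rows of the lower triangular factor
print(multiply_by_transpose(L))   # should reproduce A
```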
6.2 Appendix B - Maximum Likelihood Estimation
Let θ be a vector of unknown parameters. When we have a set of jointly continuous
random variables X1 , X2 , X3 , . . . , Xn we utilize the joint probability density function
(PDF) to define the likelihood:
References
[1] Alexander J. McNeil, Rüdiger Frey, Paul Embrechts, 2005, Quantitative Risk
Management: Concepts, Techniques, and Tools, Princeton University Press
[2] Ruey S. Tsay, 2010, Analysis of Financial Time Series, Third Edition, John
Wiley & Sons
[4] Umberto Cherubini, Elisa Luciano, Walter Vecchiato, 2004, Copula Methods in
Finance, John Wiley & Sons
[8] Christian Genest, Louis-Paul Rivest, 1993, Statistical Inference Procedures for
Bivariate Archimedean Copulas, Journal of the American Statistical Association
[10] Albert W. Marshall, Ingram Olkin, 1988, Families of Multivariate Distributions,
Journal of the American Statistical Association