3. RANDOM VECTORS
Lecture 2 Review: Elementary Matrix Algebra
rank, trace, transpose, determinants, orthogonality,
linear independence, range (column) space, null space,
the spectral theorem/principal axis theorem,
idempotent matrices, projection matrices, positive definite and
positive semi-definite matrices, etc.
RANDOM VECTORS
Definitions:
1. A random vector is a vector of random variables
$$X = \begin{pmatrix} X_1 \\ \vdots \\ X_n \end{pmatrix}.$$
2. The mean or expectation of X is defined as
$$E[X] = \begin{pmatrix} E[X_1] \\ \vdots \\ E[X_n] \end{pmatrix}.$$
3. A random matrix is a matrix of random variables Z = (Zij ). Its
expectation is given by E[Z] = (E[Zij ]).
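(A minimal numerical illustration, not part of the notes: the numpy sketch below approximates E[X] componentwise by averaging simulated draws. The normal distribution, mean vector, and number of draws are arbitrary choices.)

    import numpy as np

    rng = np.random.default_rng(0)

    # Draws of a 3-dimensional random vector X with E[X] = (1, 2, 3)';
    # each row of `draws` is one realization of X.
    mu = np.array([1.0, 2.0, 3.0])
    draws = rng.normal(loc=mu, scale=1.0, size=(100_000, 3))

    # E[X] is the vector of componentwise expectations E[Xi];
    # the columnwise sample mean approximates it.
    print(draws.mean(axis=0))   # roughly [1. 2. 3.]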
Properties:
1. A constant vector a and a constant matrix A satisfy E[a] = a and
E[A] = A. (Constant means non-random in this context.)
2. E[X + Y] = E[X] + E[Y].
3. E[AX] = AE[X] for a constant matrix A.
4. More generally (Seber & Lee Theorem 1.1):
E[AZB + C] = AE[Z]B + C
if A, B, C are constant matrices.
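(A Monte Carlo sketch of Property 4, again not part of the notes: the matrices A, B, C and the mean E[Z] below are arbitrary 2 x 2 choices used only for illustration.)

    import numpy as np

    rng = np.random.default_rng(1)
    A = np.array([[1.0, 2.0], [0.0, 1.0]])
    B = np.array([[3.0, 0.0], [1.0, 1.0]])
    C = np.array([[5.0, 5.0], [5.0, 5.0]])

    # Random 2 x 2 matrix Z with known expectation E[Z].
    EZ = np.array([[1.0, -1.0], [2.0, 0.5]])
    Z = EZ + rng.normal(size=(200_000, 2, 2))   # 200,000 draws of Z

    lhs = (A @ Z @ B + C).mean(axis=0)          # approximates E[AZB + C]
    rhs = A @ EZ @ B + C
    print(np.round(lhs - rhs, 2))               # roughly the zero matrix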
Definition: If X is a random vector, the covariance matrix of X is
defined as
$$\operatorname{cov}(X) = [\operatorname{cov}(X_i, X_j)] =
\begin{pmatrix}
\operatorname{var}(X_1) & \operatorname{cov}(X_1, X_2) & \cdots & \operatorname{cov}(X_1, X_n) \\
\operatorname{cov}(X_2, X_1) & \operatorname{var}(X_2) & \cdots & \operatorname{cov}(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(X_n, X_1) & \operatorname{cov}(X_n, X_2) & \cdots & \operatorname{var}(X_n)
\end{pmatrix}.$$
Also called the variance matrix or the variance-covariance matrix.
Alternatively:
$$\operatorname{cov}(X) = E[(X - E[X])(X - E[X])'] =
E\left[\begin{pmatrix} X_1 - E[X_1] \\ \vdots \\ X_n - E[X_n] \end{pmatrix}
\bigl(X_1 - E[X_1], \ldots, X_n - E[X_n]\bigr)\right].$$
Example: (Independent random variables.) If X1, . . . , Xn are independent, then cov(X) = diag(σ1², . . . , σn²).
If, in addition, the Xi have common variance σ², then cov(X) = σ²In.
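(A quick numpy illustration: for independent components with standard deviations 1, 2, 3, chosen arbitrarily, the sample covariance matrix comes out approximately diagonal.)

    import numpy as np

    rng = np.random.default_rng(2)
    sigmas = np.array([1.0, 2.0, 3.0])
    X = rng.normal(0.0, sigmas, size=(300_000, 3))    # independent columns
    print(np.round(np.cov(X, rowvar=False), 2))       # roughly diag(1, 4, 9)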
Properties of Covariance Matrices:
1. Symmetric: cov(X) = [cov(X)]'.
Proof: cov(Xi, Xj ) = cov(Xj , Xi).
2. cov(X + a) = cov(X) if a is a constant vector.
3. cov(AX) = A cov(X) A' if A is a constant matrix.
Proof:
$$\begin{aligned}
\operatorname{cov}(AX) &= E[(AX - E[AX])(AX - E[AX])'] \\
&= E[A(X - E[X])(X - E[X])'A'] \\
&= A\,E[(X - E[X])(X - E[X])']\,A' \\
&= A\operatorname{cov}(X)A'.
\end{aligned}$$
4. cov(X) is positive semi-definite.
Proof: For any constant vector a, a'cov(X)a = cov(a'X) (Property 3 with A = a').
But this is just the variance of a (scalar) random variable:
cov(a'X) = var(a'X) ≥ 0.
(Variances are never negative.)
Therefore:
5. cov(X) is positive definite provided no linear combination of the Xi is a constant (Seber & Lee, Theorem 1.4).
6. cov(X) = E[XX'] − E[X](E[X])'.
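(A numpy sketch of Properties 3 and 6; the matrix A, the mean vector, and the covariance Σ below are arbitrary choices.)

    import numpy as np

    rng = np.random.default_rng(3)
    Sigma = np.array([[2.0, 0.5, 0.0],
                      [0.5, 1.0, 0.3],
                      [0.0, 0.3, 1.5]])
    mu = np.array([1.0, 0.0, -1.0])
    X = rng.multivariate_normal(mu, Sigma, size=500_000)   # rows are draws of X
    A = np.array([[1.0, 1.0, 0.0],
                  [0.0, 1.0, -1.0]])

    # Property 3: cov(AX) = A cov(X) A'
    print(np.round(np.cov(X @ A.T, rowvar=False) - A @ Sigma @ A.T, 2))

    # Property 6: cov(X) = E[XX'] - E[X](E[X])'
    EXX = np.einsum('ni,nj->ij', X, X) / len(X)             # approximates E[XX']
    EX = X.mean(axis=0)
    print(np.round(EXX - np.outer(EX, EX) - Sigma, 2))      # roughly zero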
Definition: The correlation matrix of X is defined as
$$\operatorname{corr}(X) = [\operatorname{corr}(X_i, X_j)] =
\begin{pmatrix}
1 & \operatorname{corr}(X_1, X_2) & \cdots & \operatorname{corr}(X_1, X_n) \\
\operatorname{corr}(X_2, X_1) & 1 & \cdots & \operatorname{corr}(X_2, X_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{corr}(X_n, X_1) & \operatorname{corr}(X_n, X_2) & \cdots & 1
\end{pmatrix}.$$
Denote cov(X) by Σ = (σij). Then the correlation matrix and covariance matrix are related by
cov(X) = diag(√σ11, . . . , √σnn) corr(X) diag(√σ11, . . . , √σnn).
This is easily seen using corr(Xi, Xj) = cov(Xi, Xj)/√(σii σjj).
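(A quick numpy check of this relation using the sample versions: np.cov and np.corrcoef with rowvar=False treat columns as variables; the covariance used to generate the data is arbitrary.)

    import numpy as np

    rng = np.random.default_rng(4)
    Sigma = np.array([[4.0, 1.0, 0.5],
                      [1.0, 2.0, 0.2],
                      [0.5, 0.2, 1.0]])
    X = rng.multivariate_normal(np.zeros(3), Sigma, size=100_000)

    S = np.cov(X, rowvar=False)              # sample covariance matrix
    R = np.corrcoef(X, rowvar=False)         # sample correlation matrix
    D = np.diag(np.sqrt(np.diag(S)))         # diag(sqrt(s11), ..., sqrt(snn))
    print(np.round(D @ R @ D - S, 10))       # zero up to rounding: cov = D corr D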
Example: (Exchangeable random variables.) If X1, . . . , Xn are exchangeable, they have a common variance σ² and a common correlation ρ between any pair of variables. Thus
$$\operatorname{cov}(X) = \sigma^2
\begin{pmatrix}
1 & \rho & \cdots & \rho \\
\rho & 1 & \cdots & \rho \\
\vdots & \vdots & \ddots & \vdots \\
\rho & \rho & \cdots & 1
\end{pmatrix}.$$
This is sometimes called an exchangeable covariance matrix.
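(For concreteness, a small numpy helper, introduced here only for illustration, that builds an exchangeable covariance matrix; the values σ² = 2 and ρ = 0.3 are arbitrary.)

    import numpy as np

    def exchangeable_cov(n, sigma2, rho):
        # sigma2 on the diagonal, sigma2 * rho off the diagonal
        return sigma2 * ((1.0 - rho) * np.eye(n) + rho * np.ones((n, n)))

    print(exchangeable_cov(4, sigma2=2.0, rho=0.3))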
Definition: If X (m × 1) and Y (n × 1) are random vectors,
$$\operatorname{cov}(X, Y) = [\operatorname{cov}(X_i, Y_j)] =
\begin{pmatrix}
\operatorname{cov}(X_1, Y_1) & \operatorname{cov}(X_1, Y_2) & \cdots & \operatorname{cov}(X_1, Y_n) \\
\operatorname{cov}(X_2, Y_1) & \operatorname{cov}(X_2, Y_2) & \cdots & \operatorname{cov}(X_2, Y_n) \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{cov}(X_m, Y_1) & \operatorname{cov}(X_m, Y_2) & \cdots & \operatorname{cov}(X_m, Y_n)
\end{pmatrix},$$
an m × n matrix.
Note: We have now defined the covariance matrix for a random
vector and a covariance matrix for a pair of random vectors.
Alternative form:
$$\operatorname{cov}(X, Y) = E[(X - E[X])(Y - E[Y])'] =
E\left[\begin{pmatrix} X_1 - E[X_1] \\ \vdots \\ X_m - E[X_m] \end{pmatrix}
\bigl(Y_1 - E[Y_1], \ldots, Y_n - E[Y_n]\bigr)\right].$$
Note: The covariance is defined regardless of the values of m
and n.
Theorem: If A and B are constant matrices,
cov(AX, BY) = A cov(X, Y) B'.
Proof: Similar to the proof of cov(AX) = A cov(X) A'.
Partitioned variance matrix: Let
$$Z = \begin{pmatrix} X \\ Y \end{pmatrix}.$$
Then
$$\operatorname{cov}(Z) = \begin{pmatrix} \operatorname{cov}(X) & \operatorname{cov}(X, Y) \\ \operatorname{cov}(Y, X) & \operatorname{cov}(Y) \end{pmatrix}.$$
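(A numpy sketch of the theorem and the partitioned form; the dimensions, mixing matrix, and A, B below are arbitrary. Sample covariances are bilinear too, so the identity holds exactly here, not just approximately.)

    import numpy as np

    rng = np.random.default_rng(5)
    m, n, N = 3, 2, 100_000

    # Jointly drawn Z = (X', Y')': correlate independent noise with a mixing matrix.
    Z = rng.normal(size=(N, m + n)) @ rng.normal(size=(m + n, m + n))
    X, Y = Z[:, :m], Z[:, m:]

    # Partitioned form: cov(X, Y) is the upper-right m x n block of cov(Z).
    cov_XY = np.cov(Z, rowvar=False)[:m, m:]

    A = rng.normal(size=(2, m))
    B = rng.normal(size=(2, n))

    # Theorem: cov(AX, BY) = A cov(X, Y) B'
    joint = np.cov(np.hstack([X @ A.T, Y @ B.T]), rowvar=False)
    print(np.round(joint[:2, 2:] - A @ cov_XY @ B.T, 10))   # zero up to rounding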
Expectation of a Quadratic Form:
Theorem: Let E[X] = μ, cov(X) = Σ, and let A be a constant matrix. Then
E[(X − μ)'A(X − μ)] = tr(AΣ).
First Proof (brute force):
$$\begin{aligned}
E[(X - \mu)'A(X - \mu)] &= E\Bigl[\sum_i \sum_j a_{ij}(X_i - \mu_i)(X_j - \mu_j)\Bigr] \\
&= \sum_i \sum_j a_{ij} E[(X_i - \mu_i)(X_j - \mu_j)] \\
&= \sum_i \sum_j a_{ij} \operatorname{cov}(X_i, X_j) \\
&= \sum_i \sum_j a_{ij} \sigma_{ji} = \operatorname{tr}(A\Sigma),
\end{aligned}$$
using the symmetry σij = σji in the last step.
Second Proof (more clever):
$$\begin{aligned}
E[(X - \mu)'A(X - \mu)] &= E[\operatorname{tr}\{(X - \mu)'A(X - \mu)\}] \\
&= E[\operatorname{tr}\{A(X - \mu)(X - \mu)'\}] \\
&= \operatorname{tr}\{E[A(X - \mu)(X - \mu)']\} \\
&= \operatorname{tr}\{A\,E[(X - \mu)(X - \mu)']\} \\
&= \operatorname{tr}\{A\Sigma\}.
\end{aligned}$$
(The first step uses the fact that a scalar equals its own trace; the second uses tr(CD) = tr(DC).)
Corollary: E[X'AX] = tr(AΣ) + μ'Aμ.
Proof:
X'AX = (X − μ)'A(X − μ) + μ'AX + X'Aμ − μ'Aμ.
Taking expectations, and noting that E[μ'AX] = E[X'Aμ] = μ'Aμ,
E[X'AX] = E[(X − μ)'A(X − μ)] + μ'Aμ = tr(AΣ) + μ'Aμ.
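(A Monte Carlo sketch of the corollary; μ, Σ, and A below are arbitrary, and A need not be symmetric.)

    import numpy as np

    rng = np.random.default_rng(6)
    mu = np.array([1.0, -2.0, 0.5])
    Sigma = np.array([[1.0, 0.2, 0.0],
                      [0.2, 2.0, 0.5],
                      [0.0, 0.5, 1.5]])
    A = np.array([[1.0, 0.0, 2.0],
                  [0.0, 3.0, 0.0],
                  [1.0, 1.0, 1.0]])

    X = rng.multivariate_normal(mu, Sigma, size=1_000_000)

    # Average of the quadratic form X'AX over the draws vs. tr(A Sigma) + mu'A mu.
    lhs = np.einsum('ni,ij,nj->n', X, A, X).mean()
    rhs = np.trace(A @ Sigma) + mu @ A @ mu
    print(round(lhs, 2), round(rhs, 2))    # both close to 22.75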
Example: Let X1, . . . , Xn be independent random variables with common mean μ and variance σ². Then the sample variance s² = Σi (Xi − X̄)²/(n − 1) is an unbiased estimate of σ².
Proof: Let X = (X1, . . . , Xn)'. Then E[X] = μ1n and cov(X) = σ²In. Let A = In − 1n1n'/n = In − J̄n (where J̄n = 1n1n'/n). Note that
(n − 1)s² = Σi (Xi − X̄)² = X'AX.
By the corollary,
E[(n − 1)s²] = E[X'AX] = tr(A(σ²In)) + (μ1n)'A(μ1n) = σ² tr(A) + μ² 1n'A1n = (n − 1)σ²,
because tr(A) = n − 1 and A1n = 0. Hence E[s²] = σ².
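(A numpy check of the identity (n − 1)s² = X'AX and of unbiasedness; μ = 3, σ = 2, and n = 5 are arbitrary choices.)

    import numpy as np

    rng = np.random.default_rng(7)
    n, mu, sigma = 5, 3.0, 2.0
    A = np.eye(n) - np.ones((n, n)) / n                  # A = I_n - J_bar_n

    # (n - 1) s^2 equals the quadratic form X'AX for any sample.
    X = rng.normal(mu, sigma, size=n)
    print(X @ A @ X, (n - 1) * X.var(ddof=1))            # identical up to rounding

    # Averaging s^2 over many samples: E[s^2] = sigma^2 = 4.
    samples = rng.normal(mu, sigma, size=(200_000, n))
    print(samples.var(axis=1, ddof=1).mean())            # roughly 4.0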
Independence of Normal Random Variables:
Theorem: For x ~ N(μ, Σ) and constant matrices A and B, x'Ax and Bx are independently distributed iff BΣA = 0.
Proof: Sufficiency: Searle (1971, Section 2.5); necessity: Driscoll and Gundberg (1986, The American Statistician).
Example: Let X1, . . . , Xn be independent normal random variables with common mean μ and variance σ². Show that the sample mean X̄ = Σi Xi/n and the sample variance S² are independently distributed.
Let x = (X1, . . . , Xn)' so that x ~ N(μ1n, σ²In). Then S² = x'Ax, where A = (In − J̄n)/(n − 1), and X̄ = Bx, where B = 1n'/n.
We now apply the theorem above:
BΣA = (1n'/n)(σ²In)(In − J̄n)/(n − 1) = σ²/(n(n − 1)) · (1n' − 1n'J̄n) = σ²/(n(n − 1)) · (1n' − 1n') = 0.
Therefore, S² and X̄ are independently distributed.
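(A numpy sketch: the condition BΣA = 0 can be checked numerically, and independence shows up in simulation as essentially zero sample correlation between X̄ and S²; n = 6, μ = 1, σ = 2 are arbitrary.)

    import numpy as np

    rng = np.random.default_rng(8)
    n, mu, sigma = 6, 1.0, 2.0

    A = (np.eye(n) - np.ones((n, n)) / n) / (n - 1)      # S^2 = x'Ax
    B = np.ones((1, n)) / n                              # X_bar = Bx
    Sigma = sigma**2 * np.eye(n)
    print(np.round(B @ Sigma @ A, 12))                   # the zero row vector

    # Simulate and look at the sample correlation between X_bar and S^2.
    x = rng.normal(mu, sigma, size=(200_000, n))
    xbar = x.mean(axis=1)
    s2 = x.var(axis=1, ddof=1)
    print(round(np.corrcoef(xbar, s2)[0, 1], 3))         # roughly 0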