Matrices & Linear Algebra
Maths Revision Notes, Daniel Guetta, 2007
Vector Spaces
Scalars are the elements of a number field (for
example, R and C), which
o Is a set of elements on which the operations of addition and multiplication are defined, and satisfy the usual laws of arithmetic (commutative, associative and distributive).
o Is closed under addition and multiplication
o Includes identity elements for addition and
multiplication (0 and 1).
o Includes inverses (negatives and reciprocals) for addition and multiplication, except that 0 has no reciprocal.
Vectors are elements of a linear vector space defined
over a number field F. A vector space V
o Is a set of elements on which the operations of
vector addition and scalar multiplication are
defined and satisfy certain axioms.
o Is closed under these operations.
o Includes an identity element (0) for vector
addition.
If the number field F over which the linear vector
space is defined is real, then the vector space is real.
Notes:
o Vector multiplication is not, in general, defined
for a vector space.
o The basic example of a vector space is the set of lists of n scalars, R^n. Vector addition and scalar multiplication are defined component-wise (see the sketch after these notes).
o R^2 is not exactly the same as C, because C has a rule for multiplication.
o Similarly, R^3 is not quite the same as physical space, because physical space has a rule (Pythagoras) for the distance between two points.
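A minimal numerical sketch of the R^n example (NumPy arrays stand in for the lists of scalars; the library and the particular numbers are illustrative choices):

```python
import numpy as np

# Vectors in R^3 represented as lists of 3 scalars
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
a = 2.5  # a scalar from the field R

print(x + y)      # vector addition, component-wise: [5. 7. 9.]
print(a * x)      # scalar multiplication, component-wise: [2.5 5. 7.5]
print(0 * x + x)  # the zero vector acts as the additive identity
```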
The Inner Product
The inner product is used to give a meaning to
lengths and angles in a vector space.
It is a scalar function ⟨x, y⟩ of two vectors x and y.
An inner product must
o Be linear in the second argument
⟨x, ay⟩ = a ⟨x, y⟩
⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩
o Have Hermitian symmetry
⟨y, x⟩ = ⟨x, y⟩*
o Be positive definite
⟨x, x⟩ ≥ 0
with equality if and only if x = 0.
Notes:
o The inner product has an existence without
reference to any basis.
o The Hermitian symmetry is required so that ⟨x, x⟩ is real. It is not required in a real vector space.
o It follows from the above that the inner
product is antilinear in the first argument:
⟨ax, y⟩ = a* ⟨x, y⟩
⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩
o In C^n, the standard (Euclidean) inner product, the dot product, is
⟨x, y⟩ = x_i* y_i
(summation over the repeated index is implied).
The complex conjugation is needed to maintain Hermitian symmetry, and to ensure that ⟨x, x⟩ = |x_i|² is real and non-negative (positive definiteness).
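A quick check of these properties using NumPy's complex inner product (np.vdot conjugates its first argument; the example vectors are arbitrary):

```python
import numpy as np

x = np.array([1 + 2j, 3 - 1j])
y = np.array([2 - 1j, 1 + 4j])

# np.vdot conjugates its first argument, matching <x, y> = x_i* y_i
ip = np.vdot(x, y)
print(ip)                 # the standard inner product in C^2
print(np.vdot(y, x))      # equals ip.conjugate(): Hermitian symmetry
print(np.vdot(x, x))      # real and non-negative: positive definiteness
```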
The Cauchy-Schwarz Inequality states that
|⟨x, y⟩|² ≤ ⟨x, x⟩ ⟨y, y⟩
Or, equivalently:
|⟨x, y⟩| ≤ ‖x‖ ‖y‖
with equality if and only if x = ay.
To prove it, assume that ⟨x, y⟩ ≠ 0 (if it is zero the inequality holds trivially). We first expand:
⟨x − ay | x − ay⟩ = ⟨x − ay | x⟩ − a ⟨x − ay | y⟩
                 = ⟨x | x⟩ − a* ⟨y | x⟩ − a ⟨x | y⟩ + a a* ⟨y | y⟩
                 = ⟨x | x⟩ − a* ⟨x | y⟩* − a ⟨x | y⟩ + |a|² ⟨y | y⟩
                 = ‖x‖² + |a|² ‖y‖² − 2 Re(a ⟨x | y⟩)
Now, this quantity must be non-negative, because of the positive-definite property of the inner product. If we choose the phase of a (which is arbitrary) such that a ⟨x | y⟩ is real and non-negative, so that a ⟨x | y⟩ = |a| |⟨x | y⟩|, we then have that:
‖x‖² + |a|² ‖y‖² − 2 |a| |⟨x | y⟩| ≥ 0
(‖x‖ − |a| ‖y‖)² + 2 |a| ‖x‖ ‖y‖ − 2 |a| |⟨x | y⟩| ≥ 0
And now, if we choose |a| = ‖x‖ / ‖y‖, we get:
‖x‖ ‖y‖ ≥ |⟨x | y⟩|
As required.
We can use the Cauchy-Schwarz inequality to prove the triangle inequality (‖x + y‖ ≤ ‖x‖ + ‖y‖), by writing ‖x + y‖² in terms of the inner product, expanding, using the inequality, and factorising.
The Cauchy-Schwarz Inequality allows us to define, in a real vector space, the angle θ between two vectors, through
⟨x, y⟩ = ‖x‖ ‖y‖ cos θ
[This is possible because, by Cauchy-Schwarz, |⟨x, y⟩| / (‖x‖ ‖y‖) ≤ 1.] If ⟨x, y⟩ = 0, in any vector space, x and y are said to be orthogonal.
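A small numerical sanity check of the inequality and the angle formula, assuming real vectors and the standard dot product (the numbers are arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 2.0])
y = np.array([3.0, 0.0, 4.0])

lhs = abs(np.dot(x, y))
rhs = np.linalg.norm(x) * np.linalg.norm(y)
print(lhs <= rhs)          # Cauchy-Schwarz: |<x, y>| <= ||x|| ||y||

# Angle between the two (real) vectors
cos_theta = np.dot(x, y) / rhs
print(np.degrees(np.arccos(cos_theta)))
```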
A knowledge of the inner products of the basis vectors is sufficient to determine the inner product of any two vectors x and y. Let:
⟨e_i | e_j⟩ = G_ij
Then
⟨x, y⟩ = G_ij x_i* y_j
where the G_ij are the metric coefficients of the basis.
The Hermitian symmetry of the inner product implies that
G_ij = G_ji*
i.e. the matrix G is Hermitian.
For an orthonormal basis, in which ⟨e_i | e_j⟩ = δ_ij, we have that
⟨x | y⟩ = x_i* y_i
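A sketch of computing an inner product from the metric coefficients, for an arbitrarily chosen (non-orthonormal) basis of C^2:

```python
import numpy as np

# A (non-orthonormal) basis of C^2, stored as the columns of E
E = np.array([[1, 1],
              [0, 1j]])

# Metric coefficients G_ij = <e_i | e_j>
G = E.conj().T @ E
print(np.allclose(G, G.conj().T))    # G is Hermitian

# Components of two vectors with respect to this basis
x = np.array([1 + 1j, 2])
y = np.array([3, -1j])

# <x, y> = G_ij x_i* y_j agrees with the standard inner product of E @ x and E @ y
ip_from_metric = x.conj() @ G @ y
ip_direct = np.vdot(E @ x, E @ y)
print(np.allclose(ip_from_metric, ip_direct))
```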
Bases
Let S = {e_1, e_2, …, e_m} be a subset of the vectors in V. A linear combination of S is any vector of the form x_1 e_1 + x_2 e_2 + … + x_m e_m, where the x_i are scalars.
The span of S is the set of all vectors that are linear
combinations of S. If the span of S is the entire vector
space V, then S is said to span V.
The vectors of S are said to be linearly independent if
x_1 e_1 + x_2 e_2 + … + x_m e_m = 0  ⟹  x_1 = x_2 = … = x_m = 0
If, on the other hand, such an equation holds for non-trivial values of the coefficients, one of the vectors is redundant, and can be written as a linear combination of the other vectors.
If an additional vector is added to a spanning set, it
remains a spanning set. If a vector is removed from a
linearly independent set, it remains a linearly independent set.
A basis for a vector space V is a subset of vectors S that spans V and is also linearly independent. The properties of basis sets are:
o All bases of V have the same number of
elements, n, which is called the dimension of V.
o Any n linearly independent vectors in V form a
basis of V.
o Any vector x ∈ V can be written [prove by considering S ∪ {x} as a linearly dependent set] in a unique way [prove by contradiction] as a linear combination of the vectors in a basis. The relevant scalars are called the components of x with respect to that particular basis.
The same vector (a geometrical entity) has different components with respect to different bases. To see how we can change from one to the other, consider two bases of V, S = {e_i} and S′ = {e_i′}.
o Because both are bases, the elements of one
basis can be written in terms of the other:
e_j = e_i′ R_ij
where R_ij is the transformation matrix between the two bases.
o Now, consider a vector x ∈ V. The representation of the vector in each basis is:
x = e_j x_j = e_i′ x_i′
However, using our result from above, we can write this as:
x = e_i′ x_i′ = e_j x_j = e_i′ R_ij x_j
o From this, we can deduce the transformation
law for vector components:
x_i′ = R_ij x_j
Note that:
  The law is the reverse of that for the basis-vector transformation. This ensures that, overall, the vector x stays unchanged by the change of basis.
  The first suffix of R corresponds to the same (primed) basis in both relations.
We defined R_ij, above, by
e_j = e_i′ R_ij
The condition for the basis {e_j} to be orthonormal is
⟨e_i | e_j⟩ = δ_ij
⟨e_k′ R_ki | e_l′ R_lj⟩ = δ_ij
R_ki* R_lj ⟨e_k′ | e_l′⟩ = δ_ij
If the second (primed) basis is also orthonormal, this becomes:
R_ki* R_kj = δ_ij
R†R = I
In other words, transformations between orthonormal bases are described by unitary matrices. In a real vector space, an orthogonal matrix does this; in R^2 and R^3, this corresponds to a rotation and/or reflection.
Given any m vectors u_1, …, u_m that span an n-dimensional vector space (m ≥ n), it is possible to construct an orthogonal basis e_1, …, e_n using the Gram-Schmidt procedure:
e_1 = u_1
e_r = u_r − Σ_{s=1}^{r−1} (e_s · u_r / e_s · e_s) e_s
What we are effectively doing is taking each vector u and removing from it any "bits" of the vectors we've already added to the basis, to leave us with a final vector that is orthogonal to all the others already added. We can prove, by induction, that this works:
Inductive step
Assume that vectors e_1, …, e_t have already been added to the orthogonal basis (such that e_i · e_j = 0 ∀ i ≠ j), and now consider the vector e_{t+1} that we're about to add:
e_{t+1} = u_{t+1} − Σ_{s=1}^{t} (e_s · u_{t+1} / e_s · e_s) e_s
And now, consider dotting it with e_v (v ≤ t), any of the vectors already in the basis:
e_{t+1} · e_v = u_{t+1} · e_v − Σ_{s=1}^{t} (e_s · u_{t+1} / e_s · e_s) (e_s · e_v)
Every term in the sum with s ≠ v vanishes (since e_s · e_v = 0), leaving
             = u_{t+1} · e_v − (e_v · u_{t+1} / e_v · e_v) (e_v · e_v)
             = u_{t+1} · e_v − e_v · u_{t+1}
             = 0
So the new vector is indeed orthogonal to all the vectors already in the set.
"Starting off" step
Consider e_1 · e_2 (recalling that e_1 = u_1):
e_1 · e_2 = u_1 · u_2 − (e_1 · u_2 / e_1 · e_1) (u_1 · e_1)
          = u_1 · u_2 − (u_1 · u_2 / e_1 · e_1) (u_1 · e_1)
          = u_1 · u_2 − (u_1 · u_2 / e_1 · e_1) (e_1 · e_1)
          = 0
So the first two vectors are, indeed, orthogonal.
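A minimal computational sketch of the procedure for real vectors (the function name and the test vectors are illustrative choices):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal set spanning the same space as `vectors`."""
    basis = []
    for u in vectors:
        e = u.astype(float)
        # Remove the components of u along every vector already in the basis
        for b in basis:
            e = e - (np.dot(b, u) / np.dot(b, b)) * b
        # Discard (numerically) zero vectors arising from redundant u's
        if np.linalg.norm(e) > 1e-12:
            basis.append(e)
    return basis

us = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]

es = gram_schmidt(us)
for i in range(len(es)):
    for j in range(i + 1, len(es)):
        print(i, j, np.dot(es[i], es[j]))   # all (numerically) zero
```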
Matrices
ARRAY VIEWPOINT
o Matrices can be regarded, simply, as an array of numbers, A_ij.
o The rule for multiplying a matrix by a vector is then
(Ax)_i = A_ij x_j
o The rules for matrix addition and multiplication are
(A + B)_ij = A_ij + B_ij
(AB)_ij = A_ik B_kj
LINEAR OPERATOR VIEWPOINT
o A linear operator A acts on a vector space V to
produce other elements of V.
o The property of linearity means that:
A(ax) = a A(x)
A(x + y) = A(x) + A(y)
o A linear operator can exist without reference to
any basis. It can be thought of as a linear
transformation or mapping of the space V.
[Some linear operators can even map between different vector spaces.]
o The components of A with respect to a basis {e_i} are defined by the action of A on the basis vectors:
A e_j = A_ij e_i
The components form a square matrix. [In other words, the jth column of A contains the components of the result of A acting on e_j.]
o We now know enough to determine the action of A on any x:
Ax = A(x_j e_j) = x_j A e_j = x_j A_ij e_i = A_ij x_j e_i
So:
(Ax)_i = A_ij x_j
This corresponds to the rule for multiplying a matrix by a vector.
o Furthermore, the sum of two linear operators is
defined by
(A + B)x = Ax + Bx = e_i (A_ij + B_ij) x_j
o And the product of two linear operators is
defined by
(AB)x = A(Bx) = A(e_k B_kj x_j) = (A e_k) B_kj x_j = e_i A_ik B_kj x_j
o Both these operations satisfy the rules of
matrix addition and multiplication and action
on a vector. As such, a linear operator can be
represented as a matrix.
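A small illustration of building the matrix of an operator column by column from its action on the basis vectors (the operator, a 90-degree rotation of R^2, is an arbitrary choice):

```python
import numpy as np

# A concrete linear operator on R^2: rotation by 90 degrees
def op(v):
    return np.array([-v[1], v[0]])

# Column j of the matrix holds the components of op(e_j) in the standard basis
basis = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
A = np.column_stack([op(e) for e in basis])
print(A)                             # [[0, -1], [1, 0]]

x = np.array([2.0, 3.0])
print(np.allclose(A @ x, op(x)))     # matrix action agrees with the operator
```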
BACK TO CHANGE OF BASIS
o Above, we wrote one set of basis vectors in terms of the other:
e_j = e_i′ R_ij
But we could also have written
e_j′ = e_i S_ij
Substituting one into the other, we have
e_j = e_k S_ki R_ij
e_j′ = e_k′ R_ki S_ij
But this can only be true if
S_ki R_ij = R_ki S_ij = δ_kj
which implies that
RS = SR = 1
R = S^{-1}
o We noted, above, that the transformation law for vector components could be written
x_i′ = R_ij x_j
We can write this in matrix form, as
x′ = R x
with the inverse relation
x = R^{-1} x′
LINEAR OPERATORS: CHANGE OF BASIS
o To find how the components of a linear operator A transform under a change of basis, we note that we require
Ax = e_i A_ij x_j = e_i′ A_ij′ x_j′
Using e_i = e_k′ R_ki, we have that:
e_k′ R_ki A_ij x_j = e_k′ A_kj′ x_j′
R_ki A_ij x_j = A_kj′ x_j′
R A x = A′ x′
R A (R^{-1} x′) = A′ x′
which means that
A′ = R A R^{-1}
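A numerical sketch of this transformation law, using an arbitrary rotation of R^2 as the change of basis (so x′ = Rx and A′ = RAR^{-1}):

```python
import numpy as np

theta = 0.3
R = np.array([[np.cos(theta), np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])   # orthogonal, so R^{-1} = R^T

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])          # operator components in the old basis
x = np.array([1.0, 2.0])            # vector components in the old basis

A_new = R @ A @ np.linalg.inv(R)    # A' = R A R^{-1}
x_new = R @ x                       # x' = R x

# The operator acting on the vector gives the same result in either basis
print(np.allclose(R @ (A @ x), A_new @ x_new))
```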
MATRIX MULTIPLICATION
o Matrix multiplication does not, in general, commute. But it does distribute, so, with a bit of care, the normal rules of algebra can be applied. For example:
(1 − W)(1 + W) = 1(1 + W) − W(1 + W)
              = 1 + W − W − W²
              = 1 − W²
              = (1 + W)(1 − W)
Hermitian Conjugate
We define the Hermitian conjugate of a matrix as follows:
A† = (A^T)*,  i.e.  (A†)_ij = A_ji*
Importantly:
(AB)† = B† A†
(Note the reversal of the order.)
We can also write the inner product as:
⟨x, y⟩ = x† G y
where G is the matrix of metric coefficients. This preserves the Hermitian symmetry of the inner product as long as the matrix is Hermitian, G = G†.
The adjoint of a linear operator A with respect to a given inner product is a linear operator A† satisfying
⟨A†x | y⟩ = ⟨x | Ay⟩
With the standard inner product, we find that the matrix representing A† is, indeed, the Hermitian conjugate of the matrix representing A.
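A quick numerical check of the reversal rule and of the adjoint relation under the standard inner product (random complex matrices, chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

dag = lambda M: M.conj().T           # Hermitian conjugate

print(np.allclose(dag(A @ B), dag(B) @ dag(A)))   # (AB)† = B† A†

# With the standard inner product <x, y> = x† y, the adjoint of A is A†
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
print(np.allclose(np.vdot(dag(A) @ x, y), np.vdot(x, A @ y)))  # <A†x | y> = <x | Ay>
```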
Special Matrices
SYMMETRY
o A symmetric matrix is equal to its transpose:
A = A^T
o A Hermitian matrix is equal to its Hermitian conjugate:
A = A†
o An antisymmetric (or skew-symmetric) matrix satisfies
A^T = −A
o An anti-Hermitian (or skew-Hermitian) matrix satisfies
A† = −A
ORTHOGONALITY
o An orthogonal matrix is one whose transpose is equal to its inverse:
A^T = A^{-1}
o A unitary matrix is one whose Hermitian conjugate is equal to its inverse:
A† = A^{-1}
We note that if A is a unitary matrix, then A†A = 1. This implies that the columns of A are orthonormal vectors.
A normal matrix is one that commutes with its Hermitian conjugate:
A†A = AA†
It is easy to verify that Hermitian, anti-Hermitian and unitary matrices are all normal.
RELATIONSHIPS
o If A is Hermitian, then iA is anti-Hermitian, and vice-versa.
o If A is Hermitian, then
exp(iA) = Σ_{n=0}^{∞} (iA)^n / n!
is unitary.
o [This can be remembered by bearing in mind that if z is a real number, then iz is imaginary and e^{iz} has unit modulus (see below, when talking about the eigenvalues of normal matrices).]
To prove that a matrix is a certain type of special matrix, find an expression for the determining property. For example, to prove that U is unitary, evaluate U†U.
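A quick numerical check that exp(iA) is unitary for a Hermitian A; here the exponential is built from the eigendecomposition rather than the series (an equivalent construction):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (M + M.conj().T) / 2                  # a Hermitian matrix

# exp(iA) built from the eigendecomposition A = V diag(lam) V†
lam, V = np.linalg.eigh(A)
U = V @ np.diag(np.exp(1j * lam)) @ V.conj().T

print(np.allclose(U.conj().T @ U, np.eye(3)))   # U†U = 1, so exp(iA) is unitary
```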
Eigenvalues and Eigenvectors
An eigenvector of a linear operator A is a non-zero vector x satisfying
Ax = λx,  i.e.  (A − λ1)x = 0
for some scalar λ, called the eigenvalue.
The equation (in its second form) effectively says that a non-trivial linear combination of the columns of the matrix (A − λI) is equal to 0, i.e. the columns are linearly dependent. This is equivalent to the statement
det(A − λI) = 0
which is called the characteristic equation of the matrix.
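A small example of finding eigenvalues and eigenvectors numerically and checking the characteristic equation (the matrix is an arbitrary choice):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, V = np.linalg.eig(A)
print(lam)                        # eigenvalues, here 3 and 1
print(V)                          # eigenvectors stored as columns

# Each eigenvalue is a root of the characteristic equation det(A - lam*I) = 0
for l in lam:
    print(np.isclose(np.linalg.det(A - l * np.eye(2)), 0.0))
```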
There are two possibilities in terms of roots:
o If there are n distinct solutions to the characteristic equation, then there are n linearly independent eigenvectors. We prove this as follows. Assume that
Σ_a α_a e_a = 0
We can multiply both sides by whatever we want, so apply (A − λ_1 I):
(A − λ_1 I) Σ_a α_a e_a = 0
y ≡ (λ_2 − λ_1)α_2 e_2 + (λ_3 − λ_1)α_3 e_3 + … + (λ_n − λ_1)α_n e_n = 0
We can do the same again with y:
(A − λ_2 I) y = 0
(λ_3 − λ_1)(λ_3 − λ_2)α_3 e_3 + … + (λ_n − λ_1)(λ_n − λ_2)α_n e_n = 0
We can then repeat this until we obtain:
(λ_n − λ_1)(λ_n − λ_2) ⋯ (λ_n − λ_{n−2})(λ_n − λ_{n−1}) α_n e_n = 0
Now, if all the λ are distinct, the product of factors is non-zero, and since e_n ≠ 0, α_n must be 0. Removing the last vector and repeatedly applying this method shows us that all the α_a must be 0. Therefore,
Σ_a α_a e_a = 0
is only true if all the α_a are 0. Therefore, the vectors are linearly independent.
o If the roots are not all distinct, then the repeated values are said to be degenerate. If a value λ occurs m times, there may be anywhere between 1 and m linearly independent eigenvectors with that eigenvalue. Any linear combination of these is also an eigenvector.
A defective matrix is one whose vector space is not spanned by its eigenvectors. Such a matrix cannot be diagonalised by a change of basis.
It can be shown that a normal matrix is never defective. In fact, an orthonormal basis can be constructed from the eigenvectors of a matrix if and only if the matrix is normal.
Some interesting properties can be derived regarding the eigenvectors and eigenvalues of normal matrices:
o The eigenvectors corresponding to distinct
eigenvalues are orthogonal.
o The eigenvalues are
  Real for Hermitian matrices.
  Imaginary for anti-Hermitian matrices.
  Of unit modulus for unitary matrices.
A good way to remember these properties is to consider that a 1 × 1 matrix is just a number λ, and to be Hermitian, anti-Hermitian or unitary, it must satisfy
λ = λ*
λ = −λ*
λ* = λ^{-1}
which are precisely the conditions for λ being real, imaginary or of unit modulus.
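A numerical spot-check of these eigenvalue properties on arbitrarily generated Hermitian, anti-Hermitian and unitary matrices:

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))

H = (M + M.conj().T) / 2            # Hermitian
S = (M - M.conj().T) / 2            # anti-Hermitian
U = np.linalg.qr(M)[0]              # unitary (Q factor of a QR decomposition)

print(np.allclose(np.linalg.eigvals(H).imag, 0))      # real eigenvalues
print(np.allclose(np.linalg.eigvals(S).real, 0))      # imaginary eigenvalues
print(np.allclose(np.abs(np.linalg.eigvals(U)), 1))   # unit-modulus eigenvalues
```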
The method to prove these results is, in general, as
follows:
o Choose two arbitrary eigenvectors and write the eigenvector equations:
Ax = λx
Ay = μy
o Take one of these equations, and find the Hermitian conjugate.
o Then
  For a Hermitian matrix, construct two expressions for y†Ax.
  For a unitary matrix, multiply both sides by the other eigenvector equation, the one that hadn't been used.
o Re-arrange into the form something = 0.
o Assume that x = y, and use the fact that ⟨x, x⟩ ≠ 0 to deduce something about the eigenvalues.
o Now, assume that x ≠ y and deduce that ⟨y | x⟩ = 0 as long as λ ≠ μ, proving that the vectors are orthogonal.
Matrices are given particular names:
o If all eigenvalues are < 0 (> 0), the matrix is
negative (positive) definite.
o If all eigenvalues are ≤ 0 (≥ 0), the matrix is negative (positive) semi-definite.
o A matrix is definite if it is either positive
definite or negative definite.
Diagonalization
Two square matrices A and B are said to be similar if they are related by
B = S^{-1} A S
in other words, if they are representations of the same linear transformation in different bases. S is called the similarity matrix.
A matrix is said to be diagonalisable if it is similar to a diagonal matrix, in other words, if
S^{-1} A S = Λ
where Λ is a diagonal matrix.
Consider the matrix S whose columns are the eigenvectors of the matrix A:
AS = A [e_1  e_2  …  e_n]
   = [λ_1 e_1  λ_2 e_2  …  λ_n e_n]
   = [e_1  e_2  …  e_n] diag(λ_1, λ_2, …, λ_n)
   = S Λ
We can therefore say that
S^{-1} A S = Λ
provided that S is invertible, i.e. provided that the columns of S are linearly independent, i.e. provided that the eigenvectors of A are linearly independent.
Notes:
o We notice that S is the transformation matrix to the eigenvector basis. Therefore, diagonalisation is the process of expressing a matrix in its simplest form by transforming to its eigenvector basis.
o An n × n matrix is diagonalisable if and only if it has n linearly independent eigenvectors; this is guaranteed if the matrix is normal (a non-normal matrix may or may not be diagonalisable). Furthermore, if the matrix is normal and the eigenvectors are chosen to be orthonormal, then the columns of S are orthonormal and S is therefore unitary (= a matrix whose columns are orthonormal vectors).
Diagonalisation is rather useful in carrying out certain operations on matrices:
A^m = (SΛS^{-1})(SΛS^{-1})⋯(SΛS^{-1}) = S Λ^m S^{-1}
det(A) = det(SΛS^{-1}) = det(S) det(Λ) det(S^{-1}) = det(Λ)
tr(A) = tr(SΛS^{-1}) = tr(ΛS^{-1}S) = tr(Λ)
tr(A^m) = tr(Λ^m)
where we have used the following properties of determinants and traces:
det(AB) = det(A) det(B),  det(S) det(S^{-1}) = 1
tr(AB) = (AB)_ii = A_ij B_ji = B_ji A_ij = (BA)_jj = tr(BA)
Note that, in general, for any matrix A
det(A) = Π_{i=1}^{n} λ_i
tr(A) = Σ_{i=1}^{n} λ_i
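A short numerical check of these identities (arbitrary symmetric matrix; eig is used to obtain S and Λ):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, S = np.linalg.eig(A)            # columns of S are eigenvectors
L = np.diag(lam)
S_inv = np.linalg.inv(S)

print(np.allclose(S_inv @ A @ S, L))                       # S^{-1} A S = Lambda
print(np.allclose(np.linalg.matrix_power(A, 5),
                  S @ np.linalg.matrix_power(L, 5) @ S_inv))
print(np.isclose(np.linalg.det(A), np.prod(lam)))          # det(A) = product of eigenvalues
print(np.isclose(np.trace(A), np.sum(lam)))                # tr(A) = sum of eigenvalues
```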
Quadratic & Hermitian Forms
The quadratic form associated with a real symmetric matrix A is
Q(x) = x^T A x = A_ij x_i x_j
It is a homogeneous quadratic function, i.e. Q(ax) = a²Q(x).
In fact, any homogeneous quadratic function is the quadratic form of a symmetric matrix:
Q = ax² + 2bxy + cy² = x^T A x,   where   A = [ a  b ]
                                              [ b  c ]
In fact, A can be diagonalised by a real orthogonal transformation (S^T = S^{-1}):
S^T A S = Λ
And the vector x transforms according to x = S x′, so
Q = x^T A x = (x′^T S^T)(S Λ S^T)(S x′) = x′^T Λ x′
The quadratic form can therefore be reduced to:
Q = Σ_{i=1}^{n} λ_i x_i′²
where the x_i′ are given by
x′ = S^{-1} x = S^T x
We have effectively rotated the coordinates to reduce the quadratic form to its simplest form.
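A sketch of reducing a quadratic form to its principal axes numerically (the form Q = 2x² + 2xy + 2y² is an arbitrary example):

```python
import numpy as np

# Q(x) = 2x^2 + 2xy + 2y^2, i.e. A = [[2, 1], [1, 2]]
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

lam, S = np.linalg.eigh(A)           # real symmetric: S is orthogonal, S^T A S = diag(lam)
print(np.allclose(S.T @ A @ S, np.diag(lam)))

x = np.array([1.0, 2.0])
x_new = S.T @ x                      # rotated coordinates, x' = S^T x

Q_original = x @ A @ x
Q_reduced = np.sum(lam * x_new**2)   # Q = sum_i lambda_i x_i'^2
print(np.isclose(Q_original, Q_reduced))
```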
The quadric surfaces (or quadrics) are the family of surfaces
Q(x) = k = constant
In the eigenvector basis, this simplifies to
λ_1 x_1′² + λ_2 x_2′² + λ_3 x_3′² = k
The conic and quadric surfaces that can result are depicted in a figure in the original notes (not reproduced here). The relevant semi-axes are given by 1/√λ_i (taking k = 1). If any λ_i ≤ 0, the shape comes apart.
A few special cases:
o If λ_1 = λ_2 = λ_3, we have a sphere.
o If (for example) λ_1 = λ_2, we have a surface of revolution about the third axis, whatever it might be.
o If (for example) λ_3 = 0, we have the translation of a conic section along the relevant axis (an elliptic or hyperbolic cylinder).
In a complex vector space, the Hermitian form
associated with an Hermitian matrix A is:
H(x) = x† A x = x_i* A_ij x_j
H is a real scalar, because
H*(x) = (x_i* A_ij x_j)* = x_j* A_ij* x_i = x_j* A_ji x_i = H(x)
We also know that A can be diagonalised by a unitary transformation (U† = U^{-1}):
U† A U = Λ
And therefore:
H(x) = x†(U Λ U†)x = (U†x)† Λ (U†x) = x′† Λ x′ = Σ_{i=1}^{n} λ_i |x_i′|²
Therefore, a hermitian form can be reduced to a real
quadratic form by transforming to the eigenvector
basis.
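A small numerical check that the Hermitian form is real and reduces to Σ λ_i |x_i′|² in the eigenvector basis (the matrix and vector are arbitrary choices):

```python
import numpy as np

A = np.array([[2.0, 1j],
              [-1j, 3.0]])            # a Hermitian matrix
x = np.array([1.0 + 1j, 2.0 - 1j])

H = np.real_if_close(np.vdot(x, A @ x))
print(H)                              # the Hermitian form x† A x is real

lam, U = np.linalg.eigh(A)            # U is unitary, U† A U = diag(lam)
x_new = U.conj().T @ x                # components in the eigenvector basis
print(np.isclose(H, np.sum(lam * np.abs(x_new)**2)))   # sum_i lambda_i |x_i'|^2
```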