Topic 5: Linear Transformations
[AR 8.1]
5.1 Linear transformations from R2 to R2
5.2 Linear transformations from Rn to Rm
5.3 Matrix representations in general
5.4 Image, kernel, rank and nullity
5.5 Change of basis
207
We now turn to thinking about functions from one vector space to
another. These are also known as maps or mappings or transformations
between vector spaces.
Recall that a function T : V → W is defined by:
(1) a set V (the domain of T ),
(2) a set W (the codomain or target of T ), and
(3) a rule assigning a value T (x) in W to every x in V .
The simplest such transformations T : V → W between vector spaces
V and W have the special algebraic property that
T(α1 v1 + α2 v2 + · · · + αn vn) = α1 T(v1) + α2 T(v2) + · · · + αn T(vn);
in other words, T preserves linear combinations of vectors.
These will be called linear transformations and will be important for
many of the applications of linear algebra.
208
Definition (Linear transformation)
Let V and W be vector spaces (over the same field of scalars).
A linear transformation from V to W is a map T : V → W such that
for each u, v ∈ V and for each scalar α:
1. T(u + v) = T(u) + T(v)
(T preserves addition)
2. T(αu) = αT(u)
(T preserves scalar multiplication)
Loosely speaking, linear transformations are those maps between vector
spaces that preserve the vector space structure.
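As a quick numerical sanity check (a sketch, not part of the notes), we can test the two defining properties for a sample map such as T(x, y) = (2x + y, x):

```python
# Hypothetical example map T : R^2 -> R^2, T(x, y) = (2x + y, x).
def T(v):
    x, y = v
    return (2 * x + y, x)

def add(u, v):
    return tuple(a + b for a, b in zip(u, v))

def scale(a, v):
    return tuple(a * x for x in v)

u, v, alpha = (1.0, 2.0), (-3.0, 5.0), 4.0

# Property 1: T preserves addition.
assert T(add(u, v)) == add(T(u), T(v))
# Property 2: T preserves scalar multiplication.
assert T(scale(alpha, u)) == scale(alpha, T(u))
```

Passing for a few sample vectors is of course not a proof, but it catches maps (like those involving a constant shift) that fail linearity.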
Examples
209
Note: Geometric interpretation.
Let T : V → W be a linear transformation. Then
1. T maps the zero vector to the zero vector.
2. T maps each line through the origin to a line through the origin
(or to just the origin).
3. T maps each parallelogram to a parallelogram (or a line segment
or point).
210
5.1 Linear transformations from R2 to R2
We will start by looking at some geometric transformations of R2 .
A vector in R2 is an ordered pair (x, y), with x, y ∈ R.
To describe the effect of a transformation we will use coordinate
matrices. With respect to the standard basis B = {e1, e2}, the vector
(x, y) has coordinate matrix
[ x ]
[ y ]
Example
Reflection across the y-axis.
T [ x ] = [ -x ]
  [ y ]   [  y ]
(figure: the point (x, y) and its mirror image (-x, y))
211
A common feature of all linear transformations is that they can be
represented by a matrix. In the example above
T [ x ] = [ -1 0 ] [ x ] = [ -x ]
  [ y ]   [  0 1 ] [ y ]   [  y ]
The matrix
AT = [ -1 0 ]
     [  0 1 ]
is called the standard matrix representation of the transformation T.
Why is T a linear transformation?
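As an aside (not from the slides), the matrix form itself answers this question numerically: applying AT is matrix-vector multiplication, which automatically preserves addition and scaling. A minimal sketch:

```python
# Reflection across the y-axis: standard matrix [[-1, 0], [0, 1]].
A_T = [[-1, 0], [0, 1]]

def apply(A, v):
    # matrix-vector product for a 2x2 matrix
    return tuple(sum(A[i][j] * v[j] for j in range(2)) for i in range(2))

assert apply(A_T, (3, 5)) == (-3, 5)   # (x, y) -> (-x, y)
# Linearity follows because matrix multiplication distributes over
# vector addition and commutes with scalar multiplication.
```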
212
Examples of (geometric) linear transformations from R2 to R2
1. Reflection across the x-axis has matrix
2. Reflection in the line y = 5x has matrix
[ -12/13   5/13 ]
[   5/13  12/13 ]
213
3. Rotation around the origin anticlockwise by an angle of π/2 has
matrix
[ 0 -1 ]
[ 1  0 ]
4. Rotation around the origin anticlockwise by an angle of θ has
matrix
We need to work out the coordinates of the point Q obtained
by rotating P (figure omitted).
214
Examples continued
5. Compression/expansion along the x-axis has matrix
6. Shear along the x-axis has matrix
[ 1 c ]
[ 0 1 ]
These are best thought of as mappings on a rectangle.
For example, a shear along the x-axis corresponds to the mapping
shown in the figure (omitted).
7. Shear along the y-axis has matrix
[ 1 0 ]
[ c 1 ]
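To see the "mapping on a rectangle" picture concretely, here is a small sketch (not from the slides) tracking the corners of the unit square under a shear along the x-axis:

```python
# Shear along the x-axis with matrix [[1, c], [0, 1]]: each point moves
# horizontally by c times its height, so the unit square tilts sideways.
c = 2

def shear_x(v):
    x, y = v
    return (x + c * y, y)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print([shear_x(p) for p in square])   # -> [(0, 0), (1, 0), (3, 1), (2, 1)]
```

The bottom edge (y = 0) is fixed, while the top edge slides sideways by c.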
215
Successive Transformations
Example
Find the image of (x, y) after a shear along the x-axis with c = 1
followed by a compression along the y-axis with c = 1/2.
Solution:
Let R : R2 → R2 be the compression and denote its standard matrix
representation by AR. Similarly let S : R2 → R2 be the shear and
denote its standard matrix representation by AS. Then the coordinate
matrix of R(S(x, y)) is given by
AR AS [ x ]
      [ y ]
It remains to recall AR and AS , and to compute the matrix products.
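The computation can be sketched in Python (a check of the composition order, not part of the notes; c = 1 for the shear and c = 1/2 for the compression along the y-axis as above):

```python
def matmul(A, B):
    # product of two matrices given as lists of rows
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A_S = [[1, 1], [0, 1]]     # shear along the x-axis with c = 1
A_R = [[1, 0], [0, 0.5]]   # compression along the y-axis with c = 1/2

A = matmul(A_R, A_S)       # R after S, so A_R goes on the LEFT
print(A)                   # -> [[1, 1], [0.0, 0.5]]
```

So the composite sends (x, y) to (x + y, y/2).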
216
Note
1. The composition of two linear transformations T(v) = R(S(v)) is
also written T(v) = (R ∘ S)(v).
2. The matrix for the linear transformation S followed by the linear
transformation R is the matrix product AR AS.
In other words
AR∘S = AR AS
(This is why matrix multiplication is defined the way it is!)
3. Notice that (reading left to right) the two matrices are in the
opposite order to the order in which the transformations are
applied.
217
5.2 Linear transformations from Rn to Rm
An example of a linear transformation from R3 to R2 is
T(x1, x2, x3) = (x2 - 2x3, 3x1 + x3)
To prove that this is a linear transformation, we must show that for any
u, v ∈ R3 and α ∈ R we have that
T(u + v) = T(u) + T(v)   and   T(αu) = αT(u)
218
Proof Let u = (u1, u2, u3) and v = (v1, v2, v3). First we note that
u + v = (u1 + v1, u2 + v2, u3 + v3)
Applying T to this gives
T(u + v) = ((u2 + v2) - 2(u3 + v3), 3(u1 + v1) + (u3 + v3))
Re-arranging the right-hand side gives
((u2 + v2) - 2(u3 + v3), 3(u1 + v1) + (u3 + v3))
= (u2 - 2u3, 3u1 + u3) + (v2 - 2v3, 3v1 + v3)
= T(u) + T(v)
For the second part,
T(αu) = T((αu1, αu2, αu3)) = (αu2 - 2αu3, 3αu1 + αu3)
= α(u2 - 2u3, 3u1 + u3) = αT(u)
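The algebraic proof above can be spot-checked numerically (a sketch, not from the notes) for a few sample vectors:

```python
# T : R^3 -> R^2, T(x1, x2, x3) = (x2 - 2*x3, 3*x1 + x3)
def T(v):
    x1, x2, x3 = v
    return (x2 - 2 * x3, 3 * x1 + x3)

u, v, a = (1, 2, 3), (-1, 0, 4), 5

added = tuple(p + q for p, q in zip(u, v))
# T preserves addition:
assert T(added) == tuple(p + q for p, q in zip(T(u), T(v)))
# T preserves scalar multiplication:
assert T(tuple(a * p for p in u)) == tuple(a * q for q in T(u))
```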
219
We can also write T in matrix form:
Using coordinate matrices we have
  [ x1 ]
T [ x2 ] = [ x2 - 2x3 ] =
  [ x3 ]   [ 3x1 + x3 ]
Thus [T(v)] = AT [v] where [v] is the coordinate matrix of v etc.
and AT =
AT is called the standard matrix representation for T .
220
In fact all linear transformations from Rn to Rm can be represented by
matrices.
Theorem
Every linear transformation T : Rn → Rm has a standard matrix
representation AT specified by
AT = [ [T(e1)] [T(e2)] · · · [T(en)] ]
where each [T(ei)] denotes the coordinate matrix of T(ei) and
S = {e1, . . . , en} is the standard basis for Rn.
Then
[T(v)] = AT [v]
for all v ∈ Rn.
Note
The matrix AT has size m × n.
Alternative notations for AT include: [T] or [T]S or [T]S,S
221
Proof
Let v ∈ Rn and write
v = α1 e1 + α2 e2 + · · · + αn en
The coordinate matrix of v with respect to the standard basis is then
      [ α1 ]
[v] = [ α2 ]
      [ ⋮  ]
      [ αn ]
We seek an m × n matrix AT such that
[T(v)] = AT [v]   (independently of v)   (∗)
In words, this equation says that AT times the column matrix [v] is
equal to the coordinate matrix of T(v).
Since T is linear, we have that
T(v) = α1 T(e1) + α2 T(e2) + · · · + αn T(en)
222
We read off from this that
[T(v)] = α1 [T(e1)] + α2 [T(e2)] + · · · + αn [T(en)]   (property of coordinate matrices)
                                             [ α1 ]
       = [ [T(e1)] [T(e2)] · · · [T(en)] ]   [ ⋮  ]     (just matrix multiplication)
                                             [ αn ]
Comparing with equation (∗) above, we see that we can take
AT = [ [T(e1)] [T(e2)] · · · [T(en)] ]
Summarizing: Given a linear transformation T from Rn to Rm there is a
corresponding m × n matrix AT satisfying
[T(v)] = AT [v]
That is, the image of any vector v ∈ Rn can be calculated using the
matrix AT.
223
We revisit the example from Slide 218, T : R3 → R2 defined by
T(x1, x2, x3) = (x2 - 2x3, 3x1 + x3).
Recall that we found the standard matrix of the transformation to be
AT = [ 0 1 -2 ]
     [ 3 0  1 ]
Now
T (e1 ) =
T (e2 ) =
T (e3 ) =
Note that
the number of columns in AT = dim(R3 ) = dim(domain of T ) and
the number of rows in AT = dim(R2 ) = dim(codomain of T ).
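The column-by-column recipe from the theorem can be sketched in Python (not part of the notes), rebuilding AT from the images of the standard basis vectors:

```python
# T : R^3 -> R^2, T(x1, x2, x3) = (x2 - 2*x3, 3*x1 + x3)
def T(v):
    x1, x2, x3 = v
    return (x2 - 2 * x3, 3 * x1 + x3)

e = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]        # standard basis of R^3
cols = [T(ei) for ei in e]                   # images become the columns
A_T = [[col[i] for col in cols] for i in range(2)]
print(A_T)   # -> [[0, 1, -2], [3, 0, 1]]
```

Note the shape: 3 columns (one per basis vector of the domain) and 2 rows (the dimension of the codomain), matching the dimension count above.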
224
Examples
1. Define T : R3 → R4 by T(x1, x2, x3) = (x1, x3, x2, x1 + x3).
Calculate AT.
2. Give a reason why the mapping T : R2 → R2 specified by
T(x1, x2) = (x1 - x2, x1 + 1) is not a linear transformation.
225
5.3 Matrix representations in general
[AR 8.4]
We've talked about the standard matrix representations for linear
transformations from Rn to Rm. We can now generalize the above
theory to study linear transformations from any n-dimensional vector
space V to any m-dimensional vector space W.
To do this we first introduce bases and represent vectors by their
coordinate matrices.
Let U and V be finite dimensional vector spaces.
Suppose that
T : U → V is a linear transformation,
B = {b1, . . . , bn} is an (ordered) basis for U, and
C = {c1, . . . , cm} is an (ordered) basis for V.
226
We want a matrix that can be used to calculate the effect of T .
Specifically, if we denote the matrix by AC,B, we want
[T(u)]C = AC,B [u]B   for all u ∈ U.   (∗∗)
Note: For this matrix equation to make sense, the size of AC,B must be
m × n.
Theorem
There exists a unique matrix satisfying the above condition (∗∗). It is
given by
AC,B = [ [T(b1)]C [T(b2)]C · · · [T(bn)]C ]
The proof is the same as for the case of the standard matrix AT .
The matrix AC,B is also denoted by [T ]C,B and is called the matrix of T
with respect to the bases B and C.
In the special case in which U = V and B = C, we often write [T ]B in
place of [T ]B,B .
227
Example
Find [T]C,B for the linear transformation T : P2 → P1 given by
T(a0 + a1 x + a2 x^2) = (a0 + a2) + a0 x
using the bases B = {1, x, x^2} for P2 and C = {1, x} for P1.
Solution
T (1) =
T (x) =
T(x^2) =
[T ]C,B =
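One way to check the answer to this fill-in exercise is to encode polynomials as coefficient tuples (our own convention, not from the slides) and build the matrix column by column:

```python
# Polynomials in P2 as coefficient tuples: a0 + a1*x + a2*x^2 -> (a0, a1, a2).
# T(a0 + a1*x + a2*x^2) = (a0 + a2) + a0*x, returned as coordinates
# with respect to C = {1, x}.
def T(p):
    a0, a1, a2 = p
    return (a0 + a2, a0)

B = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]   # the basis polynomials 1, x, x^2
cols = [T(b) for b in B]                # [T(b)]_C for each basis element
M = [[col[i] for col in cols] for i in range(2)]
print(M)   # -> [[1, 0, 1], [1, 0, 0]]
```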
228
Example
A linear transformation T : R3 → R2 has matrix
[ 5 1 0 ]
[ 1 5 2 ]
with respect to the standard bases of R3 and R2. What is its matrix with
respect to the basis B = {(1, 1, 0), (1, -1, 0), (1, 1, 2)} of R3 and
the basis C = {(1, 1), (1, -1)} of R2?
Solution
We apply T to the elements of B to get:
Then we write the result in terms of C:
We obtain the matrix AC,B =
229
Example
Consider the linear transformation T : V → V where V is the vector
space of real valued 2 × 2 matrices and T is defined by
T(Q) = Q^T = the transpose of Q.
Find the matrix representation of T with respect to the bases:
B = { [1 1; 0 0], [1 0; 0 1], [1 0; 1 0], [0 1; 1 0] }
and
C = { [1 0; 0 0], [0 1; 0 0], [0 0; 1 0], [0 0; 0 1] }
(writing [a b; c d] for the 2 × 2 matrix with rows (a, b) and (c, d)).
230
Solution
The task is to work out the coordinates with respect to C of the image
of each element in B.
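A sketch of that computation in Python (with B as read off the slide, so treat the specific matrices as an assumption): since C is the standard basis {E11, E12, E21, E22}, the coordinate vector of a 2 × 2 matrix is just its entries listed row by row.

```python
# 2x2 matrices flattened row-major: [a, b, c, d] <-> [a b; c d].
def transpose(m):
    a, b, c, d = m
    return [a, c, b, d]   # swap the off-diagonal entries

# The basis B from the example (as reconstructed from the slide).
B = [[1, 1, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [0, 1, 1, 0]]

cols = [transpose(b) for b in B]   # [T(b)]_C is just the flattened transpose
M = [[col[i] for col in cols] for i in range(4)]
for row in M:
    print(row)
```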
231
5.4 Image, kernel, rank and nullity
Let T : U → V be a linear transformation.
Definition (Kernel and Image)
The kernel of T is defined to be
ker(T) = {u ∈ U | T(u) = 0}
The image of T is defined to be
Im(T) = {v ∈ V | v = T(u) for some u ∈ U}
The kernel is also called the nullspace. It is a subspace of U. Its
dimension is denoted nullity(T ).
The image is also called the range. It is a subspace of V . Its dimension
is denoted rank(T ).
232
Example
Consider the linear transformation T : P2 → R2 defined by
T(a0 + a1 x + a2 x^2) = (a0 - a1 + a2, 2a0 - 2a1 + 2a2)
Find bases for ker(T ) and Im(T ).
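For this example the matrix is AT = [1 -1 1; 2 -2 2] (with respect to B = {1, x, x^2} and the standard basis of R2), and the rank-nullity count can be checked with a small row-reduction sketch (not from the notes):

```python
from fractions import Fraction

A = [[Fraction(1), Fraction(-1), Fraction(1)],
     [Fraction(2), Fraction(-2), Fraction(2)]]

def rank(M):
    # rank via Gauss-Jordan elimination with exact arithmetic
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        M[r] = [x / M[r][c] for x in M[r]]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                M[i] = [a - M[i][c] * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

r = rank(A)
print(r, 3 - r)   # rank(T) = 1, nullity(T) = 2, and 1 + 2 = dim(P2) = 3
```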
233
Note
When we calculate ker(T) we find that we are solving equations of the
form AT X = 0. So ker(T) corresponds to the solution space of AT X = 0.
To calculate Im(T) we can use the fact that, because B spans U,
T(B) must span T(U) = Im(T). But the elements of T(B) are
given by the columns of AT. Thus the column space of AT gives the
coordinate vectors of the elements of Im(T). It follows that
rank(T) = rank(AT).
We can now use the result of slide 166 to prove the following:
Theorem (Rank-nullity Theorem)
For a linear transformation T : U → V with dim(U) = n
nullity(T ) + rank(T ) = n = dim(domain of T ).
234
Definition
Recall that a function T : U → V is called injective or one-to-one if
T (x) = T (y) implies x = y. It is called surjective or onto if T (U) = V .
For linear transformations, these conditions simplify to the following:
Theorem
A linear transformation T : U → V is injective if and only if
ker(T) = {0}, and is surjective if and only if Im(T) = V.
Example
Is the linear transformation of the preceding example injective?
Is it surjective?
235
Definition (Invertibility)
A linear transformation T : U → V is invertible if there is a linear
transformation S : V → U such that
1. (S ∘ T)(u) = u for all u ∈ U
2. (T ∘ S)(v) = v for all v ∈ V
Lemma
1. If T is invertible, then the linear transformation S is unique. It is
called the inverse of T and is denoted T⁻¹.
2. A linear transformation T is invertible iff T is both injective and
surjective.
3. If we fix bases B for U and C for V then
[T⁻¹]B,C = ([T]C,B)⁻¹
236
5.5 Change of basis
Transition Matrices
We have seen that a matrix representation of a linear transformation
T : U → V depends on both the choices of bases for U and V.
In fact different matrix representations are related by a matrix which
depends on the bases but not on T . To understand this, we undertake a
study of converting coordinates with respect to one basis to coordinates
using another basis.
Let B = {b1, . . . , bn} and C = {c1, . . . , cn} be bases for the same
vector space V and let v ∈ V.
Q. How are [v]B and [v]C related?
A. By multiplication by a matrix!
237
Theorem
There exists a unique matrix P such that for any vector v ∈ V,
[v]C = P [v]B
The matrix P is given by
P = [ [b1]C · · · [bn]C ]
and is called the transition matrix from B to C.
In words, the columns of P are the coordinate matrices, with respect to
C, of the elements of B.
We will sometimes denote this transition matrix by PC,B.
It is also sometimes denoted by P_{B→C}.
238
Proof:
We want to find a matrix P such that for all vectors v in V,
[v]C = P [v]B
Recall that if T : V → V is any linear transformation, then
[T(v)]C = [T]C,B [v]B   (∗)
where
[T]C,B = [ [T(b1)]C [T(b2)]C · · · [T(bn)]C ]
is the matrix representation of T.
239
Applying this to the special case where T(v) = v for all v
(i.e., T is the identity linear transformation)
gives
[T]C,B = [ [b1]C [b2]C · · · [bn]C ]
and (∗) becomes
[v]C = [T]C,B [v]B
So we can take P = [T]C,B.
Exercise
Finish the proof by showing that P is unique. That is, if Q is a matrix
satisfying [v]C = Q[v]B for all v, then Q = P.
240
A simple case
The transition matrix is easy to calculate when one of B or C is the
standard basis.
Example
In R2 , write down the transition matrix from B to S, where
B = {(1, 1), (1, -1)} and S = {(1, 0), (0, 1)}.
Use this to compute [v]S, given that [v]B = [ 1 ].
                                            [ 1 ]
Solution
PS,B = [ [b1]S [b2]S ] =
[v]S = PS,B [v]B =
241
Going in the other direction
A useful fact is that transition matrices are always invertible.
Starting with the equation
[v]C = PC,B [v]B
and rearranging gives
[v]B = (PC,B)⁻¹ [v]C
But we know that
[v]B = PB,C [v]C
and so, by the uniqueness part of the above theorem, it must be the
case that
PB,C = (PC,B)⁻¹
242
Example
For B and S as in the previous example, compute PB,S , the transition
matrix from S to B.
Use it to compute [v]B, given [v]S = [ 2 ].
                                     [ 0 ]
Solution
We saw that in this case
PS,B = [ 1  1 ]
       [ 1 -1 ]
It follows that
PB,S =
[v]B =
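A Python sketch of this inversion (a check of the fill-in answer, not from the notes), using exact fractions so the inverse entries stay clean:

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    # inverse of a 2x2 matrix via the adjugate formula
    a, b, c, d = M[0][0], M[0][1], M[1][0], M[1][1]
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

P_SB = [[1, 1], [1, -1]]    # columns are b1, b2 in standard coordinates
P_BS = inv2(P_SB)           # the transition matrix in the other direction
v_S = [[2], [0]]
print(matmul(P_BS, v_S))    # [v]_B = [1, 1], printed as Fractions
```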
243
Calculating a general transition matrix
Keeping notation as before, we have
[v]S = PS,B [v]B   and   [v]C = PC,S [v]S
Combining these,
[v]C = PC,S [v]S = PC,S PS,B [v]B
Using the uniqueness of the transition matrix, we get
PC,B = PC,S PS,B = (PS,C)⁻¹ PS,B
Since it is usually easy to calculate a transition matrix from a
non-standard basis to the standard basis (PS,B ), this makes it
straightforward to calculate any transition matrix.
244
Example
With U = V = R2 and B = {(1, 2), (1, 1)} and C = {(3, 4), (1, 1)},
find PC,B .
PS,B =
PS,C =
So
PC,S =
and
PC,B =
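A sketch of the whole recipe in Python (a way to check the fill-ins, not from the notes): build PS,B and PS,C by listing the basis vectors as columns, then apply PC,B = (PS,C)⁻¹ PS,B.

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inv2(M):
    a, b, c, d = M[0][0], M[0][1], M[1][0], M[1][1]
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

P_SB = [[1, 1], [2, 1]]   # B = {(1, 2), (1, 1)} as columns
P_SC = [[3, 1], [4, 1]]   # C = {(3, 4), (1, 1)} as columns
P_CB = matmul(inv2(P_SC), P_SB)
print(P_CB)   # -> [[1, 0], [-2, 1]] (entries printed as Fractions)
```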
245
Relationship Between Different Matrix Representations
Example
Calculate the standard matrix representation of T : R2 → R2 where
T(x, y) = (3x - y, -x + 3y)
Solution
246
Example continued
Now find the matrix of T with respect to the basis
B = {(1, 1), (1, -1)}
Solution
Notice that [T ]B is diagonal. This makes it very convenient to use the
basis B in order to understand the effect of T .
Note: We'll later see a systematic way to find a basis B making [T]B
diagonal, by using eigenvectors and eigenvalues of a linear
transformation T .
247
How are [T ]C and [T ]B related?
Theorem
The matrix representations of T : V V with respect to two bases C
and B are related by the following equation:
[T ]B = PB,C [T ]C PC,B
Proof:
We need to show that for all v V
[T(v)]B = PB,C [T]C PC,B [v]B
Starting with the right-hand side we obtain:
PB,C [T]C PC,B [v]B = PB,C [T]C [v]C   (property of PC,B)
                    = PB,C [T(v)]C     (property of [T]C)
                    = [T(v)]B          (property of PB,C)
248
Example
For the above linear transformation T : R2 → R2 given by
T(x, y) = (3x - y, -x + 3y) we saw that
[T]C =
[T]B =
where C = {(1, 0), (0, 1)} is the standard basis and B = {(1, 1), (1, -1)}.
Since C is the standard basis, it is easy to write down
PC,B =
From which we calculate PB,C =
Calculation verifies that in this case we do indeed have:
[T ]B = PB,C [T ]C PC,B
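That verification can be sketched in Python (not from the notes; taking T(x, y) = (3x - y, -x + 3y) and B = {(1, 1), (1, -1)} as above):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

T_C = [[3, -1], [-1, 3]]                   # standard matrix of T
P_CB = [[1, 1], [1, -1]]                   # columns of B in standard coords
P_BC = [[Fraction(1, 2), Fraction(1, 2)],  # inverse of P_CB
        [Fraction(1, 2), Fraction(-1, 2)]]

T_B = matmul(P_BC, matmul(T_C, P_CB))
print(T_B)   # diagonal matrix diag(2, 4)
```

The diagonal entries 2 and 4 are exactly the scaling factors of T along the two basis directions (1, 1) and (1, -1), which is the eigenvalue story previewed above.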
249