Lecture Notes
University of Namibia
1 Vector spaces
1.1 Fields
We define an algebraic structure called a field. You are acquainted with several
fields already: the rational numbers Q, the real numbers R, and the complex
numbers C. In MAT3711 and MAT3712, examples will be almost exclusively
vector spaces (see the definition below) over one of those three fields. However,
there are many, in fact infinitely many, fields.
Definition 1.1. Let F be a set on which two binary operations F × F → F are
defined. The two operations are called addition and multiplication and denoted
by (a, b) ↦ a + b and (a, b) ↦ ab.¹ The set F is called a field if the following
requirements are met: For a, b, c ∈ F
1. (a + b) + c = a + (b + c).
2. a + b = b + a.
3. There is an element 0 satisfying a + 0 = a.
4. There is an element −a satisfying a + (−a) = 0.
5. (ab)c = a(bc).
6. ab = ba.
7. There is an element 1 satisfying 1a = a.
8. If a ≠ 0, then there is an element a⁻¹ satisfying aa⁻¹ = 1.
9. a(b + c) = ab + ac.
10. 1 ≠ 0.
Remark. According to the axioms, addition in F is associative, commutative,
there is a neutral element 0 and every element has a negative. Likewise, mul-
tiplication is associative and commutative, there is a multiplicative unity 1 and
every nonzero element has a multiplicative inverse. Finally, the distributive law
holds in F . The last axiom guarantees that F has nonzero elements.
The field axioms create a structure in which we can perform arithmetic as we
know it.
¹ So a + b is the sum of a and b, and ab the product of a and b.
Examples. a) Q, R, C are fields.
b) The set of integers, Z, with its usual multiplication and addition, is not a
field. The only axiom that fails to hold is the existence of a multiplicative
inverse for each nonzero element. Indeed, 5 ≠ 0, but 1/5 ∉ Z.
c) There are fields with finitely many elements. The smallest has only two
elements, 0 and 1, and the operations are succinctly displayed in the following
addition and multiplication tables:
\[ \begin{array}{c|cc} + & 0 & 1 \\ \hline 0 & 0 & 1 \\ 1 & 1 & 0 \end{array} \qquad\qquad \begin{array}{c|cc} \cdot & 0 & 1 \\ \hline 0 & 0 & 0 \\ 1 & 0 & 1 \end{array} \]
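To see how the axioms are read off these tables, note that the addition table gives 1 + 1 = 0, so −1 = 1 in this field, while the multiplication table gives 1 · 1 = 1, so 1⁻¹ = 1. As a sample check of the distributive law,
\[ 1 \cdot (1 + 1) = 1 \cdot 0 = 0 = 1 + 1 = 1 \cdot 1 + 1 \cdot 1. \]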
1.2 Families
Definition 1.2. Let S and I be sets. A family of elements of S, indexed by
I, is a map f : I → S. We usually use subscripts to name a family, writing
si instead of f (i) and denoting the family as (si )i∈I . If I is the set of natural
numbers 1, . . . , r for some r ∈ N, we write (s1, . . . , sr).
A subfamily of a given family f is a map g : I1 → S, where I1 ⊆ I and g(i) =
f (i) whenever i ∈ I1 .
Examples. 1. Given elements ai, i = 1, . . . , 5 of any set, (a1, a2, a3, a4, a5)
is a family - here the index set I is {1, 2, 3, 4, 5}. The family (a1, a3, a4)
is a subfamily (take I1 = {1, 3, 4}).
2. Sequences (an )n∈N of real numbers are families of real numbers with index
set N and subsequences are subfamilies.
The concept of a family generalises that of (finite or infinite) sequences.
Remark. It is also possible to join two given families. Say, (a1 , a2 , a3 , a4 , a5 )
and (b1 , b2 , b3 ) can be put together to form the family (a1 , a2 , a3 , a4 , a5 , a6 , a7 , a8 )
where a6 = b1 , a7 = b2 , a8 = b3 . The formal definition of this procedure is cum-
bersome and we will omit it.
Definition 1.3. Let F be a field. A vector space over F is a set V together with
two operations, an addition (u, v) ↦ u + v (u, v ∈ V ) and a scalar multiplication
(λ, v) ↦ λv (λ ∈ F , v ∈ V ), such that the following requirements are met: For
u, v, w ∈ V and λ, µ ∈ F
1. (a) (u + v) + w = u + (v + w).
   (b) u + v = v + u.
   (c) There is an element 0, called a null vector, satisfying v + 0 = v.
   (d) There is an element −v, called a negative of v, satisfying v + (−v) = 0.
2. (a) (λµ)v = λ(µv).
(b) 1v = v.
3. (a) λ(u + v) = λu + λv.
(b) (λ + µ)v = λv + µv.
Remark. The first group of axioms concerns vector addition, saying that it is
associative and commutative, that there is a neutral element and that every ele-
ment has a negative. The second group of axioms connects scalar multiplication
with the multiplication in F , and the third group are the two distributive laws.
In the context of a vector space over F , the elements of F will often be called
scalars.
Examples. 1. Let n ∈ N. The set F n consists of n-tuples of elements of F ,
displayed as column vectors:
\[ F^n = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \;\middle|\; x_1, \dots, x_n \in F \right\}. \]
Addition and scalar multiplication on Fⁿ are defined componentwise. (Note that sums and scalar multiples of elements of Fⁿ are in Fⁿ.) Next,
\[ \left( \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \right) + \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} = \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix} + \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} = \begin{pmatrix} (x_1 + y_1) + z_1 \\ \vdots \\ (x_n + y_n) + z_n \end{pmatrix} = \begin{pmatrix} x_1 + (y_1 + z_1) \\ \vdots \\ x_n + (y_n + z_n) \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \left( \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} + \begin{pmatrix} z_1 \\ \vdots \\ z_n \end{pmatrix} \right). \]
This is associativity of addition. For commutativity,
\[ \begin{pmatrix} x_1 + y_1 \\ \vdots \\ x_n + y_n \end{pmatrix} = \begin{pmatrix} y_1 + x_1 \\ \vdots \\ y_n + x_n \end{pmatrix}, \quad\text{i.e.}\quad \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} + \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]
The vector $\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}$ serves as null vector, and $-\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} -x_1 \\ \vdots \\ -x_n \end{pmatrix}$. Those are the four axioms in the first group.
Next,
\[ (\lambda\mu) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} (\lambda\mu)x_1 \\ \vdots \\ (\lambda\mu)x_n \end{pmatrix} = \begin{pmatrix} \lambda(\mu x_1) \\ \vdots \\ \lambda(\mu x_n) \end{pmatrix} = \lambda \begin{pmatrix} \mu x_1 \\ \vdots \\ \mu x_n \end{pmatrix} \quad\text{and}\quad 1 \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} 1x_1 \\ \vdots \\ 1x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]
This takes care of the second group. Finally,
\[ \lambda \left( \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \right) = \begin{pmatrix} \lambda(x_1 + y_1) \\ \vdots \\ \lambda(x_n + y_n) \end{pmatrix} = \begin{pmatrix} \lambda x_1 + \lambda y_1 \\ \vdots \\ \lambda x_n + \lambda y_n \end{pmatrix} = \lambda \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \lambda \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}, \]
and
\[ (\lambda + \mu) \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} (\lambda + \mu)x_1 \\ \vdots \\ (\lambda + \mu)x_n \end{pmatrix} = \begin{pmatrix} \lambda x_1 + \mu x_1 \\ \vdots \\ \lambda x_n + \mu x_n \end{pmatrix} = \lambda \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} + \mu \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}. \]
So Fⁿ, with these operations, is a vector space over F .
Lemma 1.4. Let V be a vector space over F . Let u, v, w ∈ V and λ ∈ F . The
following hold:
a) If u + v = u + w, then v = w.
b) There is only one null vector: if z + 0′ = z holds whenever z ∈ V , then 0′ = 0.
c) The negative of v is unique: if v + u = v + w = 0, then u = w.
d) −(−v) = v.
e) λ0 = 0 and 0v = 0.
f) −λv = λ(−v) = (−λ)v.
g) If v ≠ 0, then λv = 0 if and only if λ = 0.
h) In V , there is exactly one solution x to the equation u + x = v.
Proof. If u + v = u + w, then v = 0 + v = (−u + u) + v = (−u) + (u + v) =
(−u) + (u + w) = (−u + u) + w = 0 + w = w. This is a).
If z + 0′ = z = z + 0 holds whenever z ∈ V , then 0′ = 0′ + 0 = 0 + 0′ = 0. This
is b).
If v + u = v + w = 0, then u = w by a). This is c).
For d), v + (−v) = 0 = (−(−v)) + (−v). Now apply a).
For the first statement in e), λ0 = λ0 + 0 = λ(0 + 0) = λ0 + λ0. Now apply a).
For the second statement, 0v = 0v + 0 = (0 + 0)v = 0v + 0v, now apply a).
As to f), we use e) to obtain 0 = λ0 = λ(v + (−v)) = λv + λ(−v). By c), this
means that λ(−v) = −λv. Similarly, the second statement in e) implies that
0 = 0v = (λ + (−λ))v = λv + (−λ)v, so (−λ)v = −λv follows from c).
Moving on to g), assume that λv = 0 and λ ≠ 0. Every nonzero element of a field
has a multiplicative inverse, and by e), v = 1v = (λ⁻¹λ)v = λ⁻¹(λv) = λ⁻¹0 = 0,
contradicting v ≠ 0. Conversely, 0v = 0 by e).
Finally, u + (−u + v) = (u + (−u)) + v = 0 + v = v, so −u + v is a solution of
u + x = v. The uniqueness of the solution follows from a).
Definition 1.5. Let V be a vector space over F . For u, v ∈ V , we define
u − v = u + (−v).
1.4 Subspaces
Definition 1.6. Let V be a vector space over F . A subset U of V is called a
subspace if
a) U ̸= ∅.
b) If u ∈ U ∋ v, then u + v ∈ U .
c) If u ∈ U and λ ∈ F , then λu ∈ U .
We denote "U is a subspace of V " by "U ≤ V ".
Properties b) and c) mean, respectively, that U is closed with respect to addition
and scalar multiplication. A subspace should be thought of as a subset of a
vector space that is a vector space in its own right, with the operations inherited
from the surrounding space.
Examples. 1. V ≤ V and {0} ≤ V (for the latter, see the lemma below).
2. Let V = R² and $U = \left\{ \begin{pmatrix} x \\ -2x \end{pmatrix} \;\middle|\; x \in \mathbb{R} \right\}$. The set U is certainly not empty, say $\begin{pmatrix} 1 \\ -2 \end{pmatrix} \in U$. For x, y, λ ∈ R, we have
\[ \begin{pmatrix} x \\ -2x \end{pmatrix} + \begin{pmatrix} y \\ -2y \end{pmatrix} = \begin{pmatrix} x + y \\ -2(x + y) \end{pmatrix} \in U \quad\text{and}\quad \lambda \begin{pmatrix} x \\ -2x \end{pmatrix} = \begin{pmatrix} \lambda x \\ -2\lambda x \end{pmatrix} \in U. \]
3. Example 2 is the set of solutions in R² of the homogeneous linear equation
2x1 + x2 = 0. You will recall that a homogeneous linear system with
coefficients in some field F ,³ consisting of m equations in n unknowns
x1, . . . , xn, can be conveniently expressed as
\[ A \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}, \]
where A is an m × n matrix with coefficients in F and the right hand side
is the null vector of Fᵐ. The set of solutions of the system is the set
U = {v ∈ Fⁿ | Av = 0}. If x1 = . . . = xn = 0, then $A \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = 0$, so U ≠ ∅.
If u, v ∈ U and λ ∈ F , then A(u + v) = Au + Av = 0 and A(λu) = λAu = 0.
So the set of solutions of a homogeneous linear system in n unknowns is
a subspace of Fⁿ.
4. Let V be a vector space over F and let v ∈ V . The set W = {λv | λ ∈ F }
of scalar multiples of v is a subspace of V : First of all, v = 1v ∈ W ≠ ∅.
For λ, µ, α ∈ F , λv + µv = (λ + µ)v ∈ W and α(λv) = (αλ)v ∈ W .
Lemma 1.7. Let V be a vector space over F . The set {0} is a subspace of V
and 0 belongs to every subspace of V .
Proof. The set {0} is nonempty and 0 + 0 = 0 = λ0 (λ ∈ F ). So {0} is a
subspace. Now let U ≤ V . Since U ̸= ∅, there is u in U . It follows that
(−1)u = −1u = −u ∈ U . Thus U ∋ u + (−u) = 0.
Lemma 1.8. Let V be a vector space over F . Let U1 , U2 ≤ V .
Then U1 ∩ U2 ≤ V .
Proof. By Lemma 1.7, 0 ∈ U1 ∩U2 ̸= ∅. Let u, v ∈ U1 ∩U2 and λ ∈ F . Then u+v
and λv are in U1 and in U2 , both being subspaces. So u + v ∈ U1 ∩ U2 ∋ λv.
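To illustrate Lemma 1.8 with a concrete pair of subspaces, take V = R³ and the solution sets of single homogeneous equations U₁ = {x ∈ R³ | x₁ = 0} and U₂ = {x ∈ R³ | x₂ = 0}. Their intersection
\[ U_1 \cap U_2 = \left\{ \begin{pmatrix} 0 \\ 0 \\ x_3 \end{pmatrix} \;\middle|\; x_3 \in \mathbb{R} \right\} \]
is again a subspace of R³, as the lemma predicts.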
Remark. Unlike the intersection, the union of two subspaces usually is not a
subspace. Take V = R², $U_1 = \left\{ \lambda \begin{pmatrix} 1 \\ 0 \end{pmatrix} \;\middle|\; \lambda \in \mathbb{R} \right\}$ and $U_2 = \left\{ \lambda \begin{pmatrix} 0 \\ 1 \end{pmatrix} \;\middle|\; \lambda \in \mathbb{R} \right\}$. Both sets
are subspaces, yet U1 ∪ U2 does not contain the vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
So U1 ∪ U2 is not closed with respect to addition.
³ You know this theory for F = R, but the field really does not make a difference here.
Let V be a vector space and let S be a subset of V . Since V itself is a subspace
of V , S is contained in a subspace of V . By Lemma 1.8, the intersection over
all subspaces of V that contain S is a subspace of V .
Definition 1.9. Let V be a vector space and let S ⊆ V . The span of S, denoted
⟨S⟩, is defined by
\[ \langle S \rangle = \bigcap_{S \subseteq U \le V} U. \]
For a single vector v ∈ V , we have ⟨v⟩ = ⟨{v}⟩ = {λv | λ ∈ F } (see Example 4 above).
Definition 1.10. Let U1, U2 ≤ V . The sum of U1 and U2 is defined as
U1 + U2 = ⟨U1 ∪ U2⟩.
Lemma 1.11. In the notation of 1.10, U1 + U2 = {u + w | u ∈ U1 , w ∈ U2 }.
Proof. Let S = {u + w | u ∈ U1, w ∈ U2}. By Lemma 1.7, 0 ∈ U1 and 0 ∈ U2. So
U1 = {u + 0 | u ∈ U1} ⊆ S and U2 = {0 + w | w ∈ U2} ⊆ S. So U1 ∪ U2 ⊆ S; in
particular, S ≠ ∅. Next, let u, u′ ∈ U1 and w, w′ ∈ U2. Then u + u′ ∈ U1 and
w + w′ ∈ U2, so that (u + w) + (u′ + w′) = (u + u′) + (w + w′) ∈ S. Furthermore,
λu ∈ U1 and λw ∈ U2, so that λ(u + w) = λu + λw ∈ S. So S is a subspace of
V containing U1 ∪ U2, and it follows that U1 + U2 ⊆ S. To establish the reverse
inclusion, consider any subspace W with W ⊇ U1 ∪ U2. Since W is additively
closed, W contains every sum u + w with u ∈ U1 and w ∈ U2, so S ⊆ W .
Accordingly, $S \subseteq \bigcap_{U_1 \cup U_2 \subseteq W \le V} W = U_1 + U_2$. This completes the proof.
Examples. Let V = R², $v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $v_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. If $w = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in V$,
then w = x1v1 + x2v2. Thus V = ⟨v1⟩ + ⟨v2⟩.
Definition 2.1. Let (v1, . . . , vr) be a finite family in V . A vector w ∈ V is called
a linear combination of (v1, . . . , vr) if there are scalars λ1, . . . , λr ∈ F such that
w = λ1v1 + . . . + λrvr.
Examples. 1. The vector $w = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix}$ in R⁴ is a linear combination of the vectors
$\begin{pmatrix} 1 \\ 4 \\ 3 \\ 2 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 2 \\ 0 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 0 \\ 0 \\ 2 \end{pmatrix}$ - find out how.
2. Let v ∈ V . The linear combinations of the family (v) are the scalar
multiples of v. The linear combinations of the family (v, 2v, 3v) are
again the scalar multiples of v.
Lemma 2.2. Let U ≤ V . If (v1 , . . . , vr ) is a finite family of elements of U ,
then every linear combination of v1 , . . . , vr is in U .
Proof. We use induction on r. The linear combinations of (v1) are the scalar
multiples of v1, which belong to U . Now let r ≥ 2 and consider $w = \sum_{i=1}^{r} \lambda_i v_i$.
Via induction, the vector $w' = \sum_{i=1}^{r-1} \lambda_i v_i$ is in U . Now w = w′ + λrvr. Since
w′, vr ∈ U , λrvr and then w′ + λrvr are in U .
The "meaning" of Lemma 2.2 is that subspaces are closed with respect to linear
combinations.
Lemma 2.3. Let (v1 , . . . , vr ) be a nonempty finite family in V . Then ⟨v1 , . . . , vr ⟩
is the set of linear combinations of v1 , . . . , vr .
Proof. Let S be the set of linear combinations of (v1, . . . , vr). By Lemma 2.2,
S ⊆ ⟨v1, . . . , vr⟩. Now we show that S is a subspace. Certainly, S is not empty, as
v1 ∈ S. Let u, w ∈ S, $u = \sum_{i=1}^{r} \lambda_i v_i$, $w = \sum_{i=1}^{r} \mu_i v_i$. Then $u + w = \sum_{i=1}^{r} (\lambda_i + \mu_i) v_i \in S$.⁶
For α ∈ F , $\alpha u = \sum_{i=1}^{r} \alpha\lambda_i v_i \in S$.⁷ So S is a subspace of V containing v1, . . . , vr,
and therefore ⟨v1, . . . , vr⟩ ⊆ S.
Examples. In R³,
\[ \left\langle \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix} \right\rangle = \left\{ \lambda_1 \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix} \;\middle|\; \lambda_1, \lambda_2 \in \mathbb{R} \right\} = \left\{ \begin{pmatrix} \lambda_1 - \lambda_2 \\ 2\lambda_1 + 2\lambda_2 \\ -3\lambda_1 \end{pmatrix} \;\middle|\; \lambda_1, \lambda_2 \in \mathbb{R} \right\}. \]
The final lemma in this subsection contains the main argument in the proof of
the Steinitz exchange lemma in the next section (see Theorem 2.13). This in
turn is probably the most important result in this course.
⁶ λ1v1 + λ2v2 + . . . + λrvr + µ1v1 + µ2v2 + . . . + µrvr = (λ1 + µ1)v1 + (λ2 + µ2)v2 + . . . + (λr + µr)vr
⁷ α(λ1v1 + λ2v2 + . . . + λrvr) = αλ1v1 + αλ2v2 + . . . + αλrvr
Lemma 2.4. Let (v1, . . . , vr) be a finite family in V . Let w ∈ ⟨v1, . . . , vr⟩.
Write $w = \sum_{i=1}^{r} \lambda_i v_i$. Fix j in {1, . . . , r}. If λj ≠ 0, then ⟨v1, . . . , vr⟩ =
⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩.
Proof. Since subspaces are closed under linear combinations (Lemma 2.2),
⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩ ⊆ ⟨v1, . . . , vr⟩ regardless of whether λj = 0.
Now suppose that λj ≠ 0. Then $\frac{1}{\lambda_j}$ exists, and
\[ v_j = \frac{1}{\lambda_j}(\lambda_1 v_1 + \ldots + \lambda_j v_j + \ldots + \lambda_r v_r) - \frac{\lambda_1}{\lambda_j} v_1 - \ldots - \frac{\lambda_{j-1}}{\lambda_j} v_{j-1} - \frac{\lambda_{j+1}}{\lambda_j} v_{j+1} - \ldots - \frac{\lambda_r}{\lambda_j} v_r = \frac{1}{\lambda_j} w - \frac{1}{\lambda_j} \sum_{i \neq j} \lambda_i v_i. \]
So vj ∈ ⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩, and the reverse inclusion follows from
Lemma 2.2.
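A concrete instance of Lemma 2.4: in R², let $v_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, $v_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ and w = 2v1 + 3v2. Here λ1 = 2 ≠ 0, and solving for v1 gives
\[ v_1 = \tfrac{1}{2} w - \tfrac{3}{2} v_2, \]
so ⟨v1, v2⟩ = ⟨w, v2⟩, exactly as the lemma asserts.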
d) If there are i, j ∈ {1, . . . , r} with i ≠ j and vi = vj, then (v1, . . . , vr) is linearly
dependent: We have $0 = 1v_i - 1v_j + \sum_{k \neq i, j} 0 v_k$.
Lemma 2.7. Let (v1 , . . . , vr ) be a family in V . The following assertions are
equivalent:
a) The family (v1 , . . . , vr ) is linearly independent.
b) For every element w of ⟨v1, . . . , vr⟩ there is a uniquely determined r-tuple
(λ1, . . . , λr) of scalars satisfying $w = \sum_{i=1}^{r} \lambda_i v_i$.
Proof. "a) implies b)": Suppose that (v1, . . . , vr) is linearly independent. Let
w ∈ ⟨v1, . . . , vr⟩. We know (Lemma 2.3) that w is a linear combination
$w = \sum_{i=1}^{r} \lambda_i v_i$. If there are scalars µ1, . . . , µr such that $w = \sum_{i=1}^{r} \mu_i v_i$, then
$\sum_{i=1}^{r} (\lambda_i - \mu_i) v_i = 0$. Given that (v1, . . . , vr) is linearly independent, this means that
λi = µi holds for all i.
"b) implies a)": We know that 0 = 0v1 + 0v2 + . . . + 0vr. If that is the only
way to express the null vector as a linear combination of (v1, . . . , vr), the family
is linearly independent.
Definition 2.8. a) A family (vi , i ∈ I) in V is called a generating system
(or spanning system) if V = ⟨vi | i ∈ I⟩.
b) The space V is finitely generated if it has a spanning system with finitely
many members.
Examples. 1. The space R² is finitely generated, say by $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
2. More generally, the space Fⁿ is finitely generated:
\[ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \ldots + x_n \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{pmatrix}. \]
3. The space ℓ₀ of zero sequences of real numbers is not finitely generated. We will
be able to give a fairly easy proof further below, once we have developed a
little more of the theory.
4. The real numbers R form a vector space also over the rational numbers. In fact,
Q is a field and if we let the vectors be real numbers and the scalars rational
numbers, the vector space axioms work out (check). It can be shown that R
cannot be finitely generated over Q.
Lemma 2.10. Suppose that V = ⟨v1, . . . , vr⟩. Then some subfamily of
(v1, . . . , vr) is a basis of V .
Proof. If (v1, . . . , vr) is linearly independent, it is already a basis. Otherwise,
(v1, . . . , vr) is linearly dependent. Lemma 2.6 yields that there is i ∈ {1, . . . , r}
such that V = ⟨v1, . . . , vr⟩ = ⟨v1, . . . , vi−1, vi+1, . . . , vr⟩. If (v1, . . . , vi−1, vi+1, . . . , vr)
is linearly dependent, we may remove a second vector and obtain an even smaller
spanning system. We continue in this way. Since there are only r vectors to
begin with, the process must stop and deliver a linearly independent subfamily
which is still generating V .
Remark. b) and c) in 2.11 say, respectively, that a basis is a maximal linearly
independent family and a minimal generating system.
Definition 2.12. Let 1 ≤ i ≤ n ∈ N. The ith standard basis vector of Fⁿ,
called ei, is the vector
\[ e_i = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix}, \]
where the entry 1 is in the ith place.
Remark. From
\[ \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = x_1 e_1 + x_2 e_2 + \ldots + x_n e_n, \]
it is clear that (e1, . . . , en) is both linearly independent and a generating system
of Fⁿ. The family (e1, . . . , en) in Fⁿ will be referred to as the standard basis of Fⁿ.
Now write wm as a linear combination of (w1, . . . , wm−1, vm, . . . , vn), i.e.
\[ w_m = \sum_{i=1}^{m-1} \lambda_i w_i + \sum_{j=m}^{n} \lambda_j v_j. \]
Since wm ∉ ⟨w1, . . . , wm−1⟩, there is j ∈ {m, . . . , n} such that λj ≠ 0. Perhaps
upon renumbering the vectors vm, . . . , vn, we are free to assume that λm ≠ 0.
Lemma 2.4 and the inductive assumption yield that
\[ \sum_{i=1}^{m-1} \mu_i w_i + \mu_m \left( \sum_{i=1}^{m-1} \lambda_i w_i + \sum_{j=m}^{n} \lambda_j v_j \right) + \sum_{i=m+1}^{n} \mu_i v_i = 0 = \sum_{i=1}^{m-1} (\mu_i + \mu_m \lambda_i) w_i + \lambda_m \mu_m v_m + \sum_{i=m+1}^{n} (\mu_i + \mu_m \lambda_i) v_i. \]
2. Suppose we have (finitely many) vectors v1 , . . . , vr in V and we know that
(v1 , . . . , vr ) is linearly independent. Then certainly U = ⟨v1 , . . . , vr ⟩ is
generated by the linearly independent family (v1 , . . . , vr ) and so dim U =
r. If we drop the assumption that (v1 , . . . , vr ) is linearly independent,
then we still know that U is generated by v1 , . . . , vr . By Lemma 2.10, we
can select a basis of U from the vectors (v1 , . . . , vr ), so that dim U ≤ r.
3. Recall that {0} = ⟨∅⟩ and thus dim{0} = 0. The only vector space of
dimension 0 is the null space.
4. Let 0 ̸= v ∈ V . We have seen that (v) is linearly independent, so dim⟨v⟩ =
1.
5. Let V = R⁴, $v_1 = \begin{pmatrix} 1 \\ 2 \\ 0 \\ 4 \end{pmatrix}$, $v_2 = \begin{pmatrix} -1 \\ -2 \\ 3 \\ -4 \end{pmatrix}$.
If λ1v1 + λ2v2 = 0, then λ1 − λ2 = 0 and 3λ2 = 0, so λ1 = λ2 = 0. Hence (v1, v2)
is linearly independent. We know that (e1, e2, e3, e4) is a basis of V and
that v1 = e1 + 2e2 + 4e4. The vectors e1, e2, e4 have nonzero coefficients
in this linear combination, and we can exchange any of them (but NOT
e3) for v1 to form a new basis. We decide to replace e4 by v1 and continue with
the new basis (v1, e1, e2, e3). Now we write v2 as a linear combination
of this basis: v2 = −v1 + 3e3. The only member of the original basis
that we are allowed to exchange against v2 is e3, and we obtain the basis
(v1, v2, e1, e2).
Lemma 2.17. Let U ≤ V . If dim V = n, then dim U ≤ n.
Proof. A linearly independent family in U is linearly independent in V . Now
apply Theorem 2.13.
Remark. If the dimension of a vector space is n, then the Exchange Lemma
says that any n-element linearly independent family is a basis. In particular, V
is the only subspace of V of dimension n.
We are going to say "finite-dimensional" instead of "finitely generated" - we
have seen that the two things are the same.
Theorem 2.18. Let V be a vector space over F and let U1 and U2 be finite-dimensional
subspaces of V . Then
dim(U1 + U2) = dim U1 + dim U2 − dim(U1 ∩ U2).
Proof. Let ℓ = dim(U1 ∩ U2) and let (u1, . . . , uℓ) be a basis of U1 ∩ U2. Extend it
to a basis (u1, . . . , uℓ, vℓ+1, . . . , vk) of U1 and a basis (u1, . . . , uℓ, wℓ+1, . . . , wm) of U2. We aim to show that B =
(u1, . . . , uℓ, vℓ+1, . . . , vk, wℓ+1, . . . , wm) is a basis of U1 + U2. First of all, every
element of U1 and every element of U2 is a linear combination of the elements
of B. It follows directly from the definition U1 + U2 = ⟨U1 ∪ U2⟩ that B is a
generating system of U1 + U2.
It remains to show that B is linearly independent: Suppose that
\[ \sum_{i=1}^{\ell} \lambda_i u_i + \sum_{i=\ell+1}^{k} \mu_i v_i + \sum_{i=\ell+1}^{m} \alpha_i w_i = 0. \]
Then $\sum_{i=\ell+1}^{m} \alpha_i w_i = -\left( \sum_{i=1}^{\ell} \lambda_i u_i + \sum_{i=\ell+1}^{k} \mu_i v_i \right) \in U_1 \cap U_2$. This means that
$\sum_{i=\ell+1}^{m} \alpha_i w_i$ is a linear combination of (u1, . . . , uℓ), say $\sum_{j=1}^{\ell} \beta_j u_j$. Now
$\sum_{i=\ell+1}^{m} \alpha_i w_i + \sum_{j=1}^{\ell} (-\beta_j) u_j = 0$. Since (u1, . . . , uℓ, wℓ+1, . . . , wm) is
linearly independent, we obtain that αℓ+1 = . . . = αm = 0. Thus $\sum_{i=1}^{\ell} \lambda_i u_i + \sum_{i=\ell+1}^{k} \mu_i v_i = 0$.
However, (u1, . . . , uℓ, vℓ+1, . . . , vk) is linearly independent, so
that λ1 = . . . = λℓ = µℓ+1 = . . . = µk = 0. Since B consists of
ℓ + (k − ℓ) + (m − ℓ) = k + m − ℓ vectors, this completes the proof.
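As a quick check of Theorem 2.18, take V = R³, U1 = ⟨e1, e2⟩ and U2 = ⟨e2, e3⟩. Then U1 ∩ U2 = ⟨e2⟩ and U1 + U2 = R³, and indeed
\[ \dim(U_1 + U_2) = 3 = 2 + 2 - 1 = \dim U_1 + \dim U_2 - \dim(U_1 \cap U_2). \]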
Definition 2.19. Let U1 and U2 be subspaces of V . We say that the sum
U1 + U2 is direct, denoted U1 ⊕ U2, if U1 ∩ U2 = {0}.
Let U be a subspace of V . A subspace W is called a complement of U in
V if V = U ⊕ W .
Remark. If V is finite-dimensional, then, by Theorem 2.18, the sum of U1 and
U2 is direct if and only if dim(U1 + U2) = dim U1 + dim U2.
Lemma 2.20. Suppose that V is finite-dimensional. Let U ≤ V . There is a
complement of U in V .
Proof. Let dim V = n and dim U = k. We know that k ≤ n by 2.17. Let
(u1, . . . , uk) be a basis of U . Extend this basis to a basis (u1, . . . , uk, vk+1, . . . , vn)
of V . Let W = ⟨vk+1, . . . , vn⟩. We claim that V = U ⊕ W . If z ∈ V , then
z is a linear combination $z = \sum_{i=1}^{k} \lambda_i u_i + \sum_{i=k+1}^{n} \lambda_i v_i$. Since $\sum_{i=1}^{k} \lambda_i u_i \in U$ and
$\sum_{i=k+1}^{n} \lambda_i v_i \in W$, z ∈ U + W ; hence U + W = V . Now dim W = n − k, since W is generated by
n − k linearly independent vectors, and by Theorem 2.18, dim(U ∩ W ) =
dim U + dim W − dim(U + W ) = k + (n − k) − n = 0. So U ∩ W = {0}.
Examples. Let V = R⁴, $v_1 = \begin{pmatrix} 1 \\ 2 \\ 0 \\ 4 \end{pmatrix}$, $v_2 = \begin{pmatrix} -1 \\ -2 \\ 3 \\ -4 \end{pmatrix}$, U = ⟨v1, v2⟩. We
have seen above that dim U = 2 and that (v1, v2, e1, e2) is a basis of V . So
W = ⟨e1, e2⟩ is a complement of U in V .
Remark. Let V = U ⊕ W . Then every vector is a sum of an element of U
and one of W . If u, u′ ∈ U and w, w′ ∈ W , then u + w = u′ + w′ means that
u − u′ = w′ − w ∈ U ∩ W = {0}. So u = u′ and w = w′ . In other words, every
element of V is the sum of uniquely determined elements of U and of W .
2.4.1 Equivalence relations
Definition. A relation ∼ on a set S is called an equivalence relation if it is
reflexive - x ∼ x for every x ∈ S,
symmetric - if x ∼ y, then y ∼ x,
transitive - if x ∼ y ∼ z, then x ∼ z.
For x ∈ S, the equivalence class of x with respect to ∼ is the set
[x] = {y ∈ S | x ∼ y}.
Theorem 2.23. Let ∼ be an equivalence relation on S and let x, y ∈ S. If
[x] ∩ [y] ≠ ∅, then [x] = [y].
Proof. Suppose that [x] ∩ [y] ≠ ∅. This means that there is z ∈ S such that
x ∼ z and y ∼ z. Since ∼ is symmetric, z ∼ y, and by transitivity x ∼ z ∼ y,
so that x ∼ y. Now if w is any element of [x], we have x ∼ w and x ∼ y, so
y ∼ x ∼ w and y ∼ w. It follows that [x] ⊆ [y], and [y] ⊆ [x] is obtained in just
the same way.
Theorem 2.23 and the fact that x ∈ [x] combine to say that the equivalence
classes of an equivalence relation on S form a partition of S. This means that
S is the disjoint union of equivalence classes; equivalently, every element of S
lies in exactly one equivalence class.
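For instance, let S = Z and declare x ∼ y if x − y is even. This is an equivalence relation, and there are exactly two equivalence classes,
\[ [0] = \{\ldots, -2, 0, 2, 4, \ldots\} \quad\text{and}\quad [1] = \{\ldots, -1, 1, 3, \ldots\}; \]
they are disjoint and their union is Z, so they partition Z as described above.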
2.4.2 Cosets
Let V be a vector space over the field F and let U be a subspace of V .
Definition 2.24. We define the relation ∼U on V by
v ∼U w if v − w ∈ U.
Lemma 2.25. ∼U is an equivalence relation.
Proof. Let v, w, z ∈ V . Then v − v = 0 ∈ U , so ∼U is reflexive. If v − w ∈ U ,
then (−1)(v − w) = w − v ∈ U , so ∼U is symmetric. If v − w ∈ U ∋ w − z, then
v − z = (v − w) + (w − z) ∈ U , so ∼U is transitive.
Lemma 2.26. Let v, v ′ , w, w′ ∈ V and λ ∈ F . If v + U = v ′ + U and
w + U = w′ + U , then (v + w) + U = (v ′ + w′ ) + U and λv + U = λv ′ + U .
Proof. "v + U = v′ + U and w + U = w′ + U " means that there are elements u1
and u2 of U such that v′ = v + u1 and w′ = w + u2. Then (v + w) − (v′ + w′) =
(v − v′) + (w − w′) = −(u1 + u2) ∈ U and λv − λv′ = λ(v − v′) = −λu1 ∈ U .
Remark: Lemmas 2.25 and 2.26 combine to say that ∼U is a congruence relation
on V - an equivalence relation compatible with the vector space operations.
Definition 2.27. Let v ∈ V . The equivalence class of v with respect to ∼U is
called the coset of U containing v and denoted v + U .
Thus v + U = {w ∈ V | v − w ∈ U }.
Remark. v + U = {v + u | u ∈ U }.
Proof. If w = v + u with u ∈ U , then v − w = −u ∈ U and v ∼U w. If v − w ∈ U ,
then w = v + (w − v) and w − v = −(v − w) ∈ U , so w is of the form v + u with
u ∈ U .
Examples. 1. The coset containing 0 is {0 + u | u ∈ U } = U.
2. Let V = R², U = ⟨e1 + e2⟩, v = e1. We have $v + U = \{ e_1 + \lambda(e_1 + e_2) \mid \lambda \in \mathbb{R} \} = \left\{ \begin{pmatrix} 1 + \lambda \\ \lambda \end{pmatrix} \;\middle|\; \lambda \in \mathbb{R} \right\}$.
3. Let V = R3 and U = ⟨e1 − e2 , e1 + 2e3 ⟩.
Let v = e1 .
We have $U = \left\{ \begin{pmatrix} a+b \\ -a \\ 2b \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}$ and
\[ v + U = \left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} a+b \\ -a \\ 2b \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\} = \left\{ \begin{pmatrix} 1+a+b \\ -a \\ 2b \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}. \]
Let w = e2, x = e1 + e2 + e3. Then v − w = e1 − e2 ∈ U , so that v + U = w + U .
On the other hand, $v - x = -e_2 - e_3 = \begin{pmatrix} 0 \\ -1 \\ -1 \end{pmatrix}$. Assuming v − x ∈ U , there would have to
exist a, b ∈ R with $\begin{pmatrix} 0 \\ -1 \\ -1 \end{pmatrix} = \begin{pmatrix} a+b \\ -a \\ 2b \end{pmatrix}$; but this means a = 1 = −b
and 2b = −1, which is impossible. So x ∉ v + U . This means that
(v + U ) ∩ (x + U ) = ∅.
Definition 2.28. V /U is the set of cosets of U in V . In other words, V /U =
{v + U | v ∈ V }.
Definition 2.29. [The quotient space] Let V be a vector space over the field F
and let U be a subspace of V . The elements of the quotient space V /U are the
cosets of U in V . Vector space operations on V /U are defined as follows: For
v, w ∈ V and λ ∈ F
(v + U ) + (w + U ) = (v + w) + U
λ(v + U ) = λv + U .
Remark. Lemma 2.26 is responsible for the fact that this definition makes
sense - in other words, that the proposed operations on V /U are well-defined :
If v ′ ∈ v + U and w′ ∈ w + U , then (v + w) − (v ′ + w′ ) ∈ U ∋ λv − λv ′ .
Theorem 2.30. Definition 2.29 defines the structure of a vector space on V /U .
Proof. We shall only sketch this proof. Every vector space axiom in V /U follows
from the fact that the same axiom is valid in V . As an example, ((v + U ) + (w +
U )) + (z + U ) = ((v + w) + U ) + (z + U ) = ((v + w) + z) + U = (v + (w + z)) + U =
(v + U ) + ((w + z) + U ) = (v + U ) + ((w + U ) + (z + U )).
The null vector of V /U is the coset 0 + U = {0 + u | u ∈ U } = U . For v ∈ V ,
the negative −(v + U ) of v + U is −(v + U ) = (−v) + U .
Examples. We use the subspace U of R3 introduced in the previous round of
examples. Let v = e1 and w = e2 . Then (v + U ) + (w + U ) = e1 + e2 + U . Since
e1 − e2 ∈ U , e1 + e2 + U = e1 + e1 + U = 2e1 + U = 2(e1 + U ).
Theorem 2.31. Let V be a finite-dimensional vector space and let U ≤ V . Then
dim V /U = dim V − dim U .
Proof. Let dim V = n and dim U = k. Given a basis (u1, . . . , uk) of U , there
is a basis (u1, . . . , uk, vk+1, . . . , vn) of V . The proof will be complete once we
have shown that the cosets vk+1 + U, . . . , vn + U form a basis of V /U .
Let v ∈ V . Then v is a linear combination v = λ1u1 + . . . + λkuk + µk+1vk+1 +
. . . + µnvn. Since λ1u1 + . . . + λkuk ∈ U , v ∈ µk+1vk+1 + . . . + µnvn + U and
v + U = µk+1vk+1 + . . . + µnvn + U = µk+1(vk+1 + U ) + . . . + µn(vn + U ).
Hence the cosets vk+1 + U, . . . , vn + U span the space V /U . If αk+1(vk+1 +
U ) + . . . + αn(vn + U ) = 0 + U = U , then w = αk+1vk+1 + . . . + αnvn ∈
U . This means that w is a linear combination β1u1 + . . . + βkuk, such that
αk+1vk+1 + . . . + αnvn − β1u1 − . . . − βkuk = 0; since (u1, . . . , uk, vk+1, . . . , vn)
is linearly independent, in particular αk+1 = . . . = αn = 0.
Examples. Let V = R3 and U = ⟨e1 − e2 , e1 + 2e3 ⟩ as before.
If λ(e1 − e2) + µ(e1 + 2e3) = 0, then (λ + µ)e1 − λe2 + 2µe3 = 0, so that λ = µ = 0.
It follows that U is spanned by two linearly independent vectors. This means
dim U = 2. Hence dim V /U = dim V − dim U = 3 − 2 = 1.
Now let u1 = e1 − e2 , u2 = e1 + 2e3 . By the exchange lemma, (u1 , e1 , e3 ) is a
basis of V . Since u2 = 0u1 + e1 + 2e3 , (u1 , u2 , e1 ) is a basis of V , too. It follows
that (e1 + U ) is a basis of V /U .
3 Linear transformations
Definition 3.1. Let V1 and V2 be vector spaces over the field F and let T : V1 →
V2 be a function. Then T is called a linear transformation (or linear map or
homomorphism) if, for all u and v in V1 and λ ∈ F , we have
T (u + v) = T (u) + T (v) and T (λu) = λT (u).
Examples. 1. Let V1 = R³ and V2 = R². Define
\[ T \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 - x_2 + x_3 \\ x_2 + 3x_3 \end{pmatrix}. \]
T is a linear transformation. Indeed,
\[ T\left( \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = \begin{pmatrix} x_1 + y_1 - x_2 - y_2 + x_3 + y_3 \\ x_2 + y_2 + 3x_3 + 3y_3 \end{pmatrix} = \begin{pmatrix} x_1 - x_2 + x_3 \\ x_2 + 3x_3 \end{pmatrix} + \begin{pmatrix} y_1 - y_2 + y_3 \\ y_2 + 3y_3 \end{pmatrix} \]
and
\[ T\left( \lambda \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \right) = \begin{pmatrix} \lambda x_1 - \lambda x_2 + \lambda x_3 \\ \lambda x_2 + 3\lambda x_3 \end{pmatrix} = \lambda \begin{pmatrix} x_1 - x_2 + x_3 \\ x_2 + 3x_3 \end{pmatrix}. \]
2. Let V be a vector space over the field F . The zero map T given by
T (v) = 0 for all v in V is a linear map.
Let u, v ∈ V and λ ∈ F . Then T (u + v) = 0 = 0 + 0 = T (u) + T (v), while
T (λu) = 0 = λ0 = λT (u).
3. Let V be a vector space over the field F and let U ≤ V . The map T : V → V /U
given by T (v) = v + U (v ∈ V ) is a linear transformation.
If v, w ∈ V and λ ∈ F , then T (v + w) = (v + w) + U = (v + U ) + (w + U ) =
T (v) + T (w) and T (λv) = λv + U = λ(v + U ) = λT (v).
So T is a linear transformation. Also note that T is clearly a surjective map,
since V /U = {w + U | w ∈ V } = T (V ).
Definition 3.2. Let V be a vector space and U a subspace of V . The map
T : V → V /U given by T (v) = v + U is called the natural homomorphism from
V onto V /U .
Definition 3.3 (Images and preimages). Let S1 and S2 be sets and let f : S1 →
S2 be a function.
Given a subset A of S1 , f (A), the image of A under f , is the set {f (a) | a ∈ A}.
If B ⊆ S2 , the preimage f −1 (B) is the set {a ∈ S1 | f (a) ∈ B}. In other words,
an element of S1 is in f −1 (B) if and only if its f -image is in B.
The notation f −1 (B) might make it seem as if f has an inverse function. How-
ever: For every function, every subset of the codomain has a preimage. The
function may or may not be bijective.
Examples. Let f : R → R₊ be the function given by f (x) = x². Then
f ([−1, 1]) = [0, 1]. The preimage f ⁻¹([2, 4]) is the set {x ∈ R | 2 ≤ x² ≤ 4} =
[−2, −√2] ∪ [√2, 2].
Lemma 3.4. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2
be a linear transformation. Denote by 01 and 02 the respective null vectors of
V1 and V2 . Let k, n ∈ N. The following hold:
22
a) T (01 ) = 02 .
b) If U ≤ V1 , then T (U ) ≤ V2 .
c) If W ≤ V2 , then T −1 (W ) ≤ V1 .
d) If the family (v1 , . . . , vk ) is linearly dependent in V1 , then (T (v1 ), . . . , T (vk ))
is linearly dependent in V2 .
e) If (T (v1 ), . . . , T (vk )) is linearly independent in V2 , then (v1 , . . . , vk ) is
linearly independent in V1 .
f ) Let v1 , . . . , vk be vectors in V1 . Then T (⟨v1 , . . . , vk ⟩) = ⟨T (v1 ), . . . , T (vk )⟩.
g) If (v1 , . . . , vn ) is a generating system of V1 , then (T (v1 ), . . . , T (vn )) is a
generating system for T (V1 ). In particular, dim T (V1 ) ≤ dim V1 .
T (U ). For λ1, . . . , λk ∈ F , T (λ1v1 + . . . + λkvk) = λ1T (v1) + . . . + λkT (vk). So
T (U ) ≤ ⟨T (v1), . . . , T (vk)⟩.
”g)” If V1 = ⟨v1 , . . . , vn ⟩, then T (V1 ) = ⟨T (v1 ), . . . , T (vn )⟩ by f). Now assume
that dim V1 = n; this means that (v1 , . . . , vn ) is a basis of V1 . Since T (V1 ) is
spanned by the n vectors T (v1 ), . . . , T (vn ), dim T (V1 ) ≤ n. [Recall that every
finite generating system contains a basis].
Lemma 3.5. Let (v1, . . . , vn) be a basis of V1 and let w1, . . . , wn be vectors in V2.
Then there is exactly one linear transformation T : V1 → V2 satisfying T (vi) = wi
for i = 1, . . . , n, namely
T (λ1v1 + λ2v2 + . . . + λnvn) = λ1w1 + . . . + λnwn.
To see uniqueness, let S : V1 → V2 be a linear map with S(vi) = wi for i = 1, . . . , n;
we show that S(u) = T (u) for any vector u ∈ V1.
As before, a vector in V1 is a linear combination u = λ1 v1 + λ2 v2 + . . . + λn vn .
Since S is a linear map, we have
S(u) = S(λ1 v1 + λ2 v2 + . . . + λn vn ) =
= λ1 S(v1 ) + λ2 S(v2 ) + . . . + λn S(vn ) =
= λ1 w1 + . . . + λn wn = T (u).
b) Let V1 = R3 and V2 = R2 .
Let v1 = e1 , v2 = e2 − e1 , v3 = e1 + 2e3 . Show that (v1 , v2 , v3 ) is a basis
of V1 .
Let T : V1 → V2 be the uniquely determined linear map satisfying $T(v_1) = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$,
$T(v_2) = \begin{pmatrix} 0 \\ 3 \end{pmatrix}$ and $T(v_3) = \begin{pmatrix} 2 \\ -10 \end{pmatrix}$.
Compute T (v) for $v = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in V_1$.
Solution to a):
(i) Note that T exists and is unique according to Lemma 3.5 and because (e1, e2) is a
basis of V1. Let $V_1 \ni v = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. We have v = x1e1 + x2e2. It follows that
\[ T(v) = x_1 T(e_1) + x_2 T(e_2) = x_1 \begin{pmatrix} 1 \\ 2 \\ -10 \end{pmatrix} + x_2 \begin{pmatrix} 2 \\ -10 \\ 1 \end{pmatrix} = \begin{pmatrix} x_1 + 2x_2 \\ 2x_1 - 10x_2 \\ -10x_1 + x_2 \end{pmatrix}. \]
Solution to b):
We prove that (v1, v2, v3) is linearly independent. Since dim V1 = 3, any linearly
independent three-element family is a basis.
Suppose that λ1 e1 +λ2 (e2 −e1 )+λ3 (e1 +2e3 ) = 0. We rewrite this as linear combination
of e1 , e2 , e3 and get (λ1 − λ2 + λ3 )e1 + λ2 e2 + 2λ3 e3 = 0. Accordingly, λ1 − λ2 + λ3 =
0 = λ2 = 2λ3 , so λ1 = λ2 = λ3 = 0.
Our next task is to write any given vector in V1 as a linear combination of v1, v2, v3.
Let $v = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in V_1$. If v = µ1v1 + µ2v2 + µ3v3, then
\[ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} \mu_1 - \mu_2 + \mu_3 \\ \mu_2 \\ 2\mu_3 \end{pmatrix}. \]
So µ2 = x2, µ3 = (1/2)x3, and µ1 = x1 + x2 − (1/2)x3. This means that
v = (x1 + x2 − (1/2)x3)v1 + x2v2 + (1/2)x3v3. Thus
\[ T(v) = (x_1 + x_2 - \tfrac{1}{2}x_3) T(v_1) + x_2 T(v_2) + \tfrac{1}{2}x_3 T(v_3) = (x_1 + x_2 - \tfrac{1}{2}x_3) \begin{pmatrix} 1 \\ 2 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 3 \end{pmatrix} + \tfrac{1}{2}x_3 \begin{pmatrix} 2 \\ -10 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 + \tfrac{1}{2}x_3 \\ 2x_1 + 5x_2 - 6x_3 \end{pmatrix}. \]
Definition 3.6. Let T : V1 → V2 be a linear transformation. The kernel of T is
ker T = {v ∈ V1 | T (v) = 02} = T ⁻¹({02}), and the image of T is im T = T (V1).
Lemma 3.7. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2 be
a linear transformation. Then ker T is a subspace of V1 and im T a subspace of V2.
Proof. This follows straight from Lemma 3.4: V1 ≤ V1, so 3.4 b) says that im T =
T (V1) ≤ V2. The kernel of T is the preimage of the subspace {02} of V2, so
ker T ≤ V1 by 3.4 c).
Lemma 3.8. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2 be
a linear transformation. Then:
a) For u, v ∈ V1 , T (u) = T (v) if and only if u − v ∈ ker T .
b) T is injective if and only if ker T = {01 }.
Note that Statement a) of 3.8 is equivalent to "T (u) = T (v) if and only if u + ker T =
v + ker T ".
Theorem 3.9. [Rank-nullity formula] Let V1 and V2 be vector spaces over the field
F . Let T : V1 → V2 be a linear transformation. Suppose that dim V1 < ∞. Then
dim V1 = dim ker T + dim im T .
Proof. Let dim V1 = n. By Lemma 3.4 g), dim im T ≤ n. Since ker T is a subspace of
V1, dim ker T ≤ n, too. We let dim ker T = k and dim im T = m. Note that we want
to establish that
n = m + k.
Lemma 3.8 a) implies that v − (µ1 v1 + . . . + µm vm ) ∈ ker T . Since ker T = ⟨u1 , . . . , uk ⟩,
this implies that v − (µ1 v1 + . . . + µm vm ) is a linear combination λ1 u1 + . . . + λk uk .
Consequently, v = µ1 v1 + . . . + µm vm + λ1 u1 + . . . + λk uk .
So B is linearly independent and spans V1 ; since B has k + m members, this finishes
the proof.
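As a quick illustration of the rank-nullity formula, consider the projection T : R³ → R² given by $T \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. Then ker T = ⟨e3⟩ has dimension 1 and im T = R² has dimension 2, and indeed 3 = 1 + 2.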
If the dimensions of V1 and V2 are finite and the same, Lemma 3.8 has a particularly
useful application:
Lemma 3.10. Let V1 and V2 be finite-dimensional vector spaces over the field F with
dim V1 = dim V2 = n. Let T : V1 → V2 be a linear transformation. Then the following
statements are equivalent:
a) T is injective.
b) dim ker T = 0.
c) dim im T = n.
d) T is surjective.
Proof. From Lemma 3.8, we know that T is injective if and only if ker T = {01 };
since the null space is the only vector space of dimension 0, T is injective if and
only if dim ker T = 0 if and only if n − dim ker T = n. By the rank-nullity formula,
n − dim ker T = dim im T , so T is injective if and only if dim im T = n. Since
dim V2 = n and the only n-dimensional subspace of an n-dimensional vector space is the
space itself, T is injective if and only if im T = V2 ; and that means T is surjective.
Definition 3.11. Two vector spaces W1 and W2 over the field F are called isomorphic,
denoted W1 ≅ W2, if there is a bijective linear transformation W1 → W2.
Theorem 3.12. [First homomorphism theorem for vector spaces] Let V1 and V2 be
vector spaces over the field F and let T : V1 → V2 be a linear transformation. Then
V1/ ker T ≅ im T .
Proof. Let u, v ∈ V1. By Lemma 3.8 and the remark directly below it, T (u) = T (v) if
and only if u + ker T = v + ker T .
In other words, T is constant on each coset of ker T in V1 . This makes it possible to
define a map T̄ : V1 / ker T → im T by
T̄ (u + ker T ) = T (u).
The map T̄ assigns to the coset u + ker T the value T (u) which is the same for all
members of the coset.
For u, v ∈ V1 and λ ∈ F , we have
T̄ ((u + ker T ) + (v + ker T )) =
= T̄ (u + v + ker T ) = T (u + v) = T (u) + T (v) = T̄ (u + ker T ) + T̄ (v + ker T ) and
T̄ (λ(u + ker T )) = T̄ (λu + ker T ) =
= T (λu) = λT (u) = λT̄ (u + ker T ).
So T̄ is a linear transformation.
What is ker T̄ ? If T̄ (u + ker T ) = 02, then T (u) = 02, i.e. u ∈ ker T = 01 + ker T . So
the only element of the kernel of T̄ is the null vector of V1 / ker T . By Lemma 3.8 b),
T̄ is injective.
Since im T = {T (v) | v ∈ V1 } = {T̄ (v + ker T ) | v ∈ V1 } = im T̄ , T̄ is also surjective.
This completes the proof.
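As an illustration of Theorem 3.12, let T : R³ → R be given by $T \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = x_1$. Then im T = R and ker T = {x ∈ R³ | x1 = 0}, so R³/ ker T ≅ R. This is consistent with Theorem 2.31: dim(R³/ ker T ) = 3 − 2 = 1 = dim R.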
Lemma 3.13. Let V1 , V2 , V3 be vector spaces over the field F and let T : V1 → V2
and S : V2 → V3 be linear transformations. Then ST is a linear transformation.
Exercise. Let S : R² → R be the linear map determined by S(e1 + e2) = −1 and
S(e1 − e2) = 1, and let T : R³ → R² be given by $T \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 + 2x_3 \end{pmatrix}$.
Verify that T is a linear transformation, compute ST , and determine dim ker ST .
Solution:
Please verify that (e1 + e2, e1 − e2) is a basis of R².
We compute S(v) for $V_2 \ni v = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. If v = λ(e1 + e2) + µ(e1 − e2), we have
x1 = λ + µ and x2 = λ − µ, so that λ = (x1 + x2)/2 and µ = (x1 − x2)/2.
It follows that
\[ S(v) = S\left( \frac{x_1 + x_2}{2}(e_1 + e_2) + \frac{x_1 - x_2}{2}(e_1 - e_2) \right) = \frac{x_1 + x_2}{2} S(e_1 + e_2) + \frac{x_1 - x_2}{2} S(e_1 - e_2) = -\frac{x_1 + x_2}{2} + \frac{x_1 - x_2}{2} = -x_2. \]
We verify that T is a linear transformation. Indeed,
\[ T\left( \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = \begin{pmatrix} x_1 + y_1 + x_2 + y_2 + x_3 + y_3 \\ x_1 + y_1 + 2x_3 + 2y_3 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 + 2x_3 \end{pmatrix} + \begin{pmatrix} y_1 + y_2 + y_3 \\ y_1 + 2y_3 \end{pmatrix}, \]
while
\[ T\left( \lambda \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \right) = \begin{pmatrix} \lambda x_1 + \lambda x_2 + \lambda x_3 \\ \lambda x_1 + 2\lambda x_3 \end{pmatrix} = \lambda \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 + 2x_3 \end{pmatrix}. \]
Finally,
\[ ST \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = S \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 + 2x_3 \end{pmatrix} = -(x_1 + 2x_3) = -x_1 - 2x_3. \]
All that's left is to find dim ker ST . While it is possible and not difficult to compute
this using the above equation, we shall use Theorem 3.9: Indeed, ST (e1) = −1, so
im ST is a subspace of the one-dimensional space R, of dimension at least 1. It follows that
dim im ST = 1 and therefore dim ker ST = 3 − 1 = 2.
Lemma 3.15. Let V1 and V2 be vector spaces and let T : V1 → V2 be a bijective
linear transformation. Then the inverse map T −1 : V2 → V1 , defined by T T −1 =
idV2 and T −1 T = idV1 is a linear transformation V2 → V1 .
Proof. Let w and z be vectors in V2. Since T is a linear map,
T (T ⁻¹(w) + T ⁻¹(z)) = T (T ⁻¹(w)) + T (T ⁻¹(z)) = T T ⁻¹(w) + T T ⁻¹(z) = idV₂(w) + idV₂(z) = w + z.
Also T T ⁻¹(w + z) = w + z, i.e.
T (T ⁻¹(w) + T ⁻¹(z)) = T (T ⁻¹(w + z)).
Since T is injective, it follows that T ⁻¹(w) + T ⁻¹(z) = T ⁻¹(w + z). The argument
for T ⁻¹(λw) = λT ⁻¹(w) is similar.
Definition 3.16. Let T : V1 → V2 be a bijective linear transformation. The
inverse S : V2 → V1 of T , i.e. the linear map S : V2 → V1 uniquely determined
by ST = idV1 and T S = idV2 is denoted T −1 .
Lemma 3.17. Let V1 , V2 , and V3 be vector spaces over F . Let T : V1 → V2 and
S : V2 → V3 be linear bijections. Then:
a) ST is bijective and (ST )−1 = T −1 S −1 .
b) (T −1 )−1 = T .
Proof. These facts are probably well known to you from the general theory of
functions.
For a), we note that ST (T ⁻¹S ⁻¹) = S(T T ⁻¹)S ⁻¹ = S idV₂ S ⁻¹ = SS ⁻¹ = idV₃
and T ⁻¹S ⁻¹(ST ) = T ⁻¹(S ⁻¹S)T = T ⁻¹ idV₂ T = T ⁻¹T = idV₁.
b) The equalities T T −1 = idV2 and T −1 T = idV1 imply that T −1 is bijective
with inverse T .
Concluding the section, we present a lemma that lessens the work required to
compute the inverse of a given linear transformation.
Lemma 3.18. Let V1 and V2 be vector spaces over F of the same (finite)
dimension n. Let T : V1 → V2 be a bijective linear transformation and S : V2 →
V1 a linear transformation. Each of the following is equivalent to S = T −1 :
a) ST = idV1 .
b) T S = idV2 .
Proof. Recall that S = T −1 is equivalent to ST = idV1 and T S = idV2 ; so we
need to show that, in the presence of the conditions of the lemma, a) and b) are
equivalent.
Let (v1 , . . . , vn ) be a basis of V1 . By Lemma 3.4, V2 = im T = ⟨T (v1 ), . . . , T (vn )⟩,
so that (T (v1 ), . . . , T (vn )) must be a basis of V2 .
Assume that ST = idV₁; then S(T (vi)) = vi for i = 1, . . . , n. It follows that
(T S)T (vi) = T (ST (vi)) = T (vi) whenever i ∈ {1, . . . , n}. As mentioned above,
(T (v1), . . . , T (vn)) is a basis of V2, and, by Lemma 3.5, the only linear map
V2 → V2 that maps every element of a basis to itself is idV₂. Thus a) implies b).
Conversely, suppose that T S = idV2 . Then, for i = 1, . . . , n, T S(T (vi )) =
T (vi ) = T (ST (vi )). Since T is injective, it follows that vi = ST (vi ) for all i, so
that Lemma 3.5 says that ST = idV1 .
Exercise. Show that the map T : R³ → R³ given by $T \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 \\ x_2 \end{pmatrix}$
is a bijective linear transformation and compute T ⁻¹.
Solution:
\[ T\left( \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} + \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix} \right) = \begin{pmatrix} x_1 + y_1 + x_2 + y_2 + x_3 + y_3 \\ x_1 + y_1 \\ x_2 + y_2 \end{pmatrix} = \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} y_1 + y_2 + y_3 \\ y_1 \\ y_2 \end{pmatrix}; \]
\[ T \begin{pmatrix} \lambda x_1 \\ \lambda x_2 \\ \lambda x_3 \end{pmatrix} = \begin{pmatrix} \lambda x_1 + \lambda x_2 + \lambda x_3 \\ \lambda x_1 \\ \lambda x_2 \end{pmatrix} = \lambda \begin{pmatrix} x_1 + x_2 + x_3 \\ x_1 \\ x_2 \end{pmatrix}. \]
So T is a linear transformation. Let $\ker T \ni v = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$. Since x1 = x2 = 0 =
x1 + x2 + x3, x3 = x1 = x2 = 0, so that v = 0. So T is injective; by Theorem 3.9,
dim im T = 3, and T is surjective.
As before, we compute the images of the standard basis vectors:
\[ T(e_1) = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad T(e_2) = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \quad T(e_3) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}. \]
If $\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \lambda_1 T(e_1) + \lambda_2 T(e_2) + \lambda_3 T(e_3)$, then
x1 = λ1 + λ2 + λ3, x2 = λ1, x3 = λ2,
i.e. λ3 = x1 − x2 − x3, λ1 = x2, λ2 = x3. It follows that
\[ T^{-1} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = T^{-1}(x_2 T(e_1) + x_3 T(e_2) + (x_1 - x_2 - x_3) T(e_3)) = x_2 e_1 + x_3 e_2 + (x_1 - x_2 - x_3) e_3 = \begin{pmatrix} x_2 \\ x_3 \\ x_1 - x_2 - x_3 \end{pmatrix}. \]
Let V be an n-dimensional vector space over F and let B = (v1, . . . , vn) be a basis
of V . By Lemma 2.7, every vector v ∈ V determines a uniquely determined n-tuple
(λ1, . . . , λn) of scalars in F with $v = \sum_{i=1}^{n} \lambda_i v_i$. This gives rise to a map TB : V → Fⁿ
called the coordinate map with respect to B.
Definition 3.19. With the notation introduced just above,
\[ T_B\left( \sum_{i=1}^{n} \lambda_i v_i \right) = \begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}. \]
Remark. Since $\sum_{i=1}^{n} \lambda_i v_i + \sum_{i=1}^{n} \mu_i v_i = \sum_{i=1}^{n} (\lambda_i + \mu_i) v_i$ and $\lambda \sum_{i=1}^{n} \lambda_i v_i = \sum_{i=1}^{n} \lambda \lambda_i v_i$, TB
is a linear map. If $T_B(\sum_{i=1}^{n} \lambda_i v_i) = T_B(\sum_{i=1}^{n} \mu_i v_i)$, then λi = µi holds for 1 ≤ i ≤ n.
Thus TB is injective; it is also clearly surjective, since for any $\begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix}$ in Fⁿ
there exists a preimage $\sum_{i=1}^{n} \lambda_i v_i$ in V .
We immediately obtain:
Corollary 3.20. If dim V = n, then V ≅ Fⁿ.
Now let T : V1 → V2 be a linear transformation, let B = (v1, . . . , vn) be a basis of V1
and let B′ = (w1, . . . , wm) be a basis of V2. Each T (vj) is a linear combination of
(w1, . . . , wm). In other words, for j = 1, . . . , n,
\[ T(v_j) = \sum_{i=1}^{m} a_{ij} w_i, \]
and thus the coordinate vector in Fᵐ of T (vj) is $T_{B'}(T(v_j)) = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}$.
Definition 3.21. With the notation introduced above, the matrix of T with
respect to B and B′, denoted ${}_{B'}M_B(T)$, is the matrix
\[ {}_{B'}M_B(T) = (a_{ij}) = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix}. \]
Examples. Let T : R² → R³ be given by $T \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_2 \\ x_1 \\ x_2 - x_1 \end{pmatrix}$. Let
B = (v1, v2) = (e2, e1 − e2) and B′ = (w1, w2, w3) = (e3, e1 + e2, e2).¹⁰ We have
\[ T(v_1) = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = 1w_1 + 1w_2 - 1w_3 \quad\text{and}\quad T(v_2) = \begin{pmatrix} -1 \\ 1 \\ -2 \end{pmatrix} = -2w_1 - 1w_2 + 2w_3. \]
It follows that
\[ {}_{B'}M_B(T) = \begin{pmatrix} 1 & -2 \\ 1 & -1 \\ -1 & 2 \end{pmatrix}. \]
¹⁰ Verify that T is linear and that B and B′ are bases of R² and R³, respectively.
We continue to use the notation introduced above.
Theorem 3.22. Let v ∈ V1. The coordinate vector of T (v) with respect to the
basis B′ is the result of multiplying the matrix ${}_{B'}M_B(T)$ and the coordinate
vector of v with respect to the basis B. In other words,
\[ T_{B'}(T(v)) = {}_{B'}M_B(T) \cdot T_B(v). \]
Proof. Write $v = \sum_{j=1}^{n} \lambda_j v_j$. Then
\[ T(v) = \sum_{j=1}^{n} \lambda_j T(v_j) = \sum_{j=1}^{n} \lambda_j \sum_{i=1}^{m} a_{ij} w_i = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} \lambda_j \right) w_i. \]
Now
\[ {}_{B'}M_B(T) \cdot T_B(v) = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \vdots \\ \lambda_n \end{pmatrix} = \begin{pmatrix} \sum_{j=1}^{n} a_{1j} \lambda_j \\ \sum_{j=1}^{n} a_{2j} \lambda_j \\ \vdots \\ \sum_{j=1}^{n} a_{mj} \lambda_j \end{pmatrix}, \]
completing the proof.
Examples (continued). We continue with the map T introduced in the previous
example. We take v = e1 (in R²). Since v = e2 + (e1 − e2) = v1 + v2, $T_B(v) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$.
Moreover, $T(v) = \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} = -e_3 + e_2 = -w_1 + w_3$. So $T_{B'}(T(v)) = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$.
Now we compute ${}_{B'}M_B(T) \cdot T_B(v)$. This is
\[ \begin{pmatrix} 1 & -2 \\ 1 & -1 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \]
which is as it should be.
Theorem 3.23. Let V1, V2 and V3 be vector spaces over F with bases B = (v1, . . . , vn),
B′ = (w1, . . . , wm) and B″ = (z1, . . . , zk), respectively. Let T : V1 → V2 and S : V2 → V3
be linear transformations. Then
\[ {}_{B''}M_{B'}(S) \; {}_{B'}M_B(T) = {}_{B''}M_B(ST). \]
Proof. Write $T(v_j) = \sum_{i=1}^{m} a_{ij} w_i$ and $S(w_i) = \sum_{r=1}^{k} b_{ri} z_r$. Then
\[ ST(v_j) = S\left( \sum_{i=1}^{m} a_{ij} w_i \right) = \sum_{i=1}^{m} a_{ij} S(w_i) = \sum_{i=1}^{m} a_{ij} \sum_{r=1}^{k} b_{ri} z_r = \sum_{r=1}^{k} \left( \sum_{i=1}^{m} b_{ri} a_{ij} \right) z_r, \]
so the (r, j) entry of ${}_{B''}M_B(ST)$ is the (r, j) entry of the matrix product
${}_{B''}M_{B'}(S) \; {}_{B'}M_B(T)$.
We have $ST(v_1) = S \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = -z_2$ and $ST(v_2) = S \begin{pmatrix} -1 \\ 1 \\ -2 \end{pmatrix} = -z_1 + 3z_2$.
We have ${}_{B''}M_{B'}(S) = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 1 \end{pmatrix}$. Thus
\[ {}_{B''}M_{B'}(S) \; {}_{B'}M_B(T) = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 1 \end{pmatrix} \begin{pmatrix} 1 & -2 \\ 1 & -1 \\ -1 & 2 \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ -1 & 3 \end{pmatrix}, \]
which is as it should be.
Definition 3.24. Let r ∈ N. The r × r unit matrix Ir is defined by Ir = (dij)₁≤i,j≤r,
where $d_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}$. In other words,
\[ I_r = \begin{pmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & \dots & 0 & 1 \end{pmatrix}. \]
We note a corollary of Theorem 3.23:
Corollary 3.25. Suppose that T is bijective. Then n = m and
\[ {}_{B'}M_B(T) \; {}_B M_{B'}(T^{-1}) = I_n = {}_B M_{B'}(T^{-1}) \; {}_{B'}M_B(T). \]
Remark. It should be noted that Corollary 3.25 says that a linear transformation
is bijective if and only if it is represented by an invertible matrix.
Remark.
\[ {}_{B_1'}M_{B_1}(T) = {}_{B_1'}M_{B_1}(\mathrm{id}_{V_2}\, T\, \mathrm{id}_{V_1}) = {}_{B_1'}M_{B'}(\mathrm{id}_{V_2}) \; {}_{B'}M_B(T) \; {}_B M_{B_1}(\mathrm{id}_{V_1}). \]
By Corollary 3.25, ${}_B M_{B_1}(\mathrm{id}_{V_1})$ is an invertible matrix - indeed,
${}_B M_{B_1}(\mathrm{id}_{V_1}) \; {}_{B_1}M_B(\mathrm{id}_{V_1}) = {}_B M_B(\mathrm{id}_{V_1}) = I_n$. If X is any invertible r × r matrix with coefficients in F , then the
column vectors of X form a basis B2 of Fʳ; letting B2′ be the standard basis of Fʳ, we
have $X = {}_{B_2'}M_{B_2}(\mathrm{id}_{F^r})$.
Lemma 3.26. Let $A = {}_{B'}M_B(T)$ and let B be an m × n matrix with coefficients in
F . The following statements are equivalent:
a) There are bases B1 of V1 and B1′ of V2 such that $B = {}_{B_1'}M_{B_1}(T)$.
b) There are invertible matrices Y of size n × n and X of size m × m over F such
that B = XAY .
The dot product on Rⁿ is defined by u · v = uᵗv. In other words,
\[ \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \cdot \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = (x_1, x_2, \dots, x_n) \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \sum_{i=1}^{n} x_i y_i. \]
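For instance,
\[ \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \cdot \begin{pmatrix} 4 \\ -5 \\ 6 \end{pmatrix} = 1 \cdot 4 + 2 \cdot (-5) + 3 \cdot 6 = 12. \]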
We verify that the dot product is an inner product: Recall the rules (A + B)ᵗ =
Aᵗ + Bᵗ, (AB)ᵗ = BᵗAᵗ, (Aᵗ)ᵗ = A, A(B + C) = AB + AC of matrix algebra.
Let u, v ∈ V . Since ut v ∈ R, ut v = (ut v)t = v t u, so the dot product is
symmetric.
Given u, v, w ∈ V and λ ∈ R, we have
u · (v + w) = ut (v + w) = ut v + ut w
and
u · (λv) = ut (λv) = λut v.
Linearity in the first argument follows from symmetry.
Letting $v = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$,
\[ v^t v = \sum_{i=1}^{n} x_i^2, \]
so vᵗv = 0 if and only if x1 = x2 = . . . = xn = 0, i.e. v = 0.
Note: The dot product is also called the canonical inner product on Rn .
For bilinearity,
\[ f(\lambda u + \mu v, w) = (\lambda u + \mu v)^t A w = (\lambda u^t + \mu v^t) A w = \lambda u^t A w + \mu v^t A w = \lambda f(u, w) + \mu f(v, w). \]
4) Let V = R² and $A = \begin{pmatrix} 0 & 1 \\ 1 & -2 \end{pmatrix}$. Noting that A is symmetric, the map f
given by (u, v) ↦ uᵗAv is bilinear and symmetric.
Let $V \ni v = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$. We have
\[ f(v, v) = v^t A v = (x_2, x_1 - 2x_2) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 2x_1 x_2 - 2x_2^2. \]
4.0.1 Inequalities
The main point of this short subsection is the proof of the Cauchy-Schwarz inequality
and some of its corollaries. Those inequalities form the basis for the definitions of angles
and distances in analytic geometry. Throughout, (V, f ) is an inner product space.
Definition 4.2. The norm of a vector v in V is given by $\|v\| = \sqrt{f(v, v)}$.
Remark. Since f is positive definite, ∥v∥ ≥ 0 for any v in V , with equality if and
only if v = 0. If λ ∈ R, then $\|\lambda v\| = \sqrt{f(\lambda v, \lambda v)} = \sqrt{\lambda^2 f(v, v)} = |\lambda| \, \|v\|$.
In particular: If v ≠ 0, then $\left\| \frac{1}{\|v\|} v \right\| = 1$.
Corollary 4.4 (Triangle inequality). For u, v ∈ V ,
∥u + v∥ ≤ ∥u∥ + ∥v∥.
Proof. We simply apply 4.3: ∥u + v∥² = f (u + v, u + v) = f (u, u) + f (v, v) + 2f (u, v) ≤
f (u, u) + f (v, v) + 2|f (u, v)| ≤ f (u, u) + f (v, v) + 2∥u∥∥v∥ = (∥u∥ + ∥v∥)². When
the triangle inequality is not strict (so the two expressions are equal for a given pair
of vectors), then |f (u, v)| = ∥u∥∥v∥, so the Cauchy-Schwarz inequality is an equality
and the two vectors are linearly dependent.
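A small numerical example: in R² with the dot product, u = e1 and v = e2 give
\[ \|u + v\| = \sqrt{2} < 2 = \|u\| + \|v\|; \]
the inequality is strict here because u and v are linearly independent.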
Definition 4.5. We define the distance between two vectors u, v in V as d(u, v) =
∥u − v∥ .
Lemma. Let (v1, . . . , vm) be a family of pairwise orthogonal nonzero vectors in V .
Then (v1, . . . , vm) is linearly independent.
Proof. Suppose that λ1v1 + . . . + λmvm = 0. For each i,
\[ 0 = f(v_i, 0) = f(v_i, \lambda_1 v_1 + \ldots + \lambda_m v_m) = \sum_{j=1}^{m} \lambda_j f(v_i, v_j) = \lambda_i f(v_i, v_i). \]
Since f (vi, vi) ≠ 0, it follows that λi = 0.
Definition 4.9. Let (V, f ) be an inner product space of finite dimension n. An or-
thonormal basis of V is a basis (v1 , . . . , vn ) of V such that
1. ∥vi ∥ = 1, i = 1, . . . , n.
2. vi ⊥ vj whenever 1 ≤ i < j ≤ n.
Lemma 4.10 (Gram-Schmidt). Let (u1, . . . , un) be a basis of V . Then there is an
orthonormal basis (v1, . . . , vn) of V such that ⟨v1, . . . , vi⟩ = ⟨u1, . . . , ui⟩ for
i = 1, . . . , n.
Proof. We proceed by induction on n. Let $v_1 = \frac{1}{\|u_1\|} u_1$. Then ⟨v1⟩ = ⟨u1⟩ and
∥v1∥ = 1.
Let 1 ≤ i < n and suppose that the family (v1, . . . , vi) satisfies the requirements of
the Lemma:
∥vj∥ = 1, vj ⊥ vk for any j ≠ k ∈ {1, . . . , i}, ⟨v1, . . . , vi⟩ = ⟨u1, . . . , ui⟩.
Let
\[ \tilde{v}_{i+1} = u_{i+1} - \sum_{j=1}^{i} f(v_j, u_{i+1}) v_j. \]
For j ∈ {1, . . . , i},
\[ f(v_j, \tilde{v}_{i+1}) = f(v_j, u_{i+1}) - \sum_{k=1}^{i} f(v_k, u_{i+1}) f(v_j, v_k) = f(v_j, u_{i+1}) - f(v_j, u_{i+1}) = 0, \]
since f (vj, vk) = 0 for k ≠ j and f (vj, vj) = 1. As (u1, . . . , ui+1) is linearly independent,
ṽi+1 ≠ 0, and we set $v_{i+1} = \frac{1}{\|\tilde{v}_{i+1}\|} \tilde{v}_{i+1}$.
Apart from its theoretical meaning, the proof of Lemma 4.10 also contains a procedure
known as the Gram-Schmidt algorithm with which to modify an existing basis into an
orthonormal basis. We provide an example.
Noting that $\|\tilde{v}_2\| = \sqrt{\tfrac{1}{4} + 1 + \tfrac{1}{4}} = \sqrt{\tfrac{3}{2}}$, we get $v_2 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ -2 \\ -1 \end{pmatrix}$.
Finally, v1 · u3 = 0 and v2 · u3 = 2/√6, so
\[ \tilde{v}_3 = u_3 - (v_1 \cdot u_3) v_1 - (v_2 \cdot u_3) v_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} - \frac{1}{3} \begin{pmatrix} 1 \\ -2 \\ -1 \end{pmatrix} = \begin{pmatrix} \tfrac{2}{3} \\ \tfrac{2}{3} \\ -\tfrac{2}{3} \end{pmatrix}. \]
The norm of $\tilde{v}_3$ being $\sqrt{\tfrac{12}{9}} = \tfrac{2\sqrt{3}}{3}$, we have
\[ v_3 = \frac{1}{\|\tilde{v}_3\|} \tilde{v}_3 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}. \]
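The procedure is mechanical enough to be carried out by a computer. The following is a minimal sketch (not part of the original notes) of the Gram-Schmidt algorithm in Python, assuming the inner product is the dot product on Rⁿ and that the input family is a basis:

def gram_schmidt(basis):
    """Orthonormalise a basis of R^n (given as a list of lists) with
    respect to the dot product, following the proof of Lemma 4.10."""
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    ortho = []  # the orthonormal vectors v_1, v_2, ... built so far
    for u in basis:
        # subtract from u its components along the previous v_j
        w = list(u)
        for v in ortho:
            c = dot(v, u)
            w = [wi - c * vi for wi, vi in zip(w, v)]
        norm = dot(w, w) ** 0.5  # nonzero because the input is a basis
        ortho.append([wi / norm for wi in w])
    return ortho

# Example: an orthonormal basis of R^2 obtained from the basis ((1, 1), (0, 1)).
print(gram_schmidt([[1.0, 1.0], [0.0, 1.0]]))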
Lemma 4.13. Suppose that V has finite dimension. Let U be a subspace of V . Then
V = U ⊕ U ⊥.
Proof. Let dim V = n, dim U = k and (u1 , . . . , uk ) a basis of U . Extend to a basis
(u1 , . . . , uk , uk+1 , . . . , un ) of V .
Using the Gram-Schmidt algorithm 4.10, we construct an orthonormal basis (v1, . . . , vn)
of V satisfying ⟨v1, . . . , vk⟩ = ⟨u1, . . . , uk⟩ = U . In other words, (v1, . . . , vk) is an
orthonormal basis of U .
For ℓ > k ≥ i, we have vℓ ⊥ vi . This implies that vi ∈ {vℓ }⊥ for any i ∈ {1, . . . , k}.
By Lemma 3.8, {vℓ }⊥ is a subspace of V . Containing each vi for i = 1, . . . , k, it must
contain ⟨v1 , . . . , vk ⟩ = U . In other words, vℓ ∈ U ⊥ for any ℓ in {k + 1, . . . , n}; since
U ⊥ is a subspace of V , ⟨vk+1 , . . . , vn ⟩ ≤ U ⊥ . In particular, V = U + U ⊥ .
If u ∈ U ∩ U ⊥ , then u ⊥ u which means u = 0 because f is positive definite. So the
sum is direct.
Examples. Let V = R⁴, equipped with the dot product. Let U = ⟨u1, u2⟩, where
\[ u_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}, \qquad u_2 = \begin{pmatrix} 1 \\ 0 \\ -2 \\ 1 \end{pmatrix}. \]
Determine U ⊥.
Solution:
First of all, λ1u1 + λ2u2 = 0 implies that λ1 = 0 = λ2, so dim U = 2.
Since u1 = e1 + e2 + e3 + e4, (u1, e1, e2, e3) is a basis of V . Since u2 = u1 − 3e3 − e2,
(u1, u2, e1, e2) is a basis of V . Let e1 = u3, e2 = u4.
We apply the Gram-Schmidt algorithm to transform (u1, u2, e1, e2) into an orthonormal
basis (v1, v2, v3, v4) of V . We start the Gram-Schmidt process with u1 and u2,
which guarantees that ⟨u1, u2⟩ = ⟨v1, v2⟩ = U ; moreover, v3, v4 ∈ {v1, v2}⊥ = U ⊥.
Since dim⟨v3, v4⟩ = 4 − dim U = dim U ⊥ (compare Lemma 3.10), we know that
U ⊥ = ⟨v3, v4⟩. Finally, U = (U ⊥)⊥ = {v ∈ V | v3 · v = 0 = v4 · v} - this will yield a
linear system with solution space U .
Since $\|u_1\| = \sqrt{1^2 + 1^2 + 1^2 + 1^2} = 2$, $v_1 = \frac{1}{\|u_1\|} u_1 = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}$.
Next, ṽ2 = u2 − (v1 · u2)v1, while v1 · u2 = ½ − 2 · ½ + ½ = 0. Hence ṽ2 = u2 and
$v_2 = \frac{1}{\|u_2\|} u_2 = \frac{1}{\sqrt{6}} \begin{pmatrix} 1 \\ 0 \\ -2 \\ 1 \end{pmatrix}$. Next, ṽ3 = u3 − (u3 · v1)v1 − (u3 · v2)v2.
We have u3 = e1 and u3 · v1 = ½, while u3 · v2 = 1/√6. Hence:
\[ \tilde{v}_3 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix} - \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} - \frac{1}{6} \begin{pmatrix} 1 \\ 0 \\ -2 \\ 1 \end{pmatrix} = \frac{1}{12} \begin{pmatrix} 7 \\ -3 \\ 1 \\ -5 \end{pmatrix}. \]
Since $\|\tilde{v}_3\| = \sqrt{\frac{49 + 9 + 1 + 25}{144}} = \frac{1}{12}\sqrt{84}$,
\[ v_3 = \frac{1}{\|\tilde{v}_3\|} \tilde{v}_3 = \frac{1}{\sqrt{84}} \begin{pmatrix} 7 \\ -3 \\ 1 \\ -5 \end{pmatrix}. \]
We have v1 · u4 = ½, v2 · u4 = 0, v3 · u4 = −3/√84. This yields
\[ \tilde{v}_4 = \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix} - \frac{1}{4} \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + \frac{3}{84} \begin{pmatrix} 7 \\ -3 \\ 1 \\ -5 \end{pmatrix} = \frac{1}{84} \begin{pmatrix} 0 \\ 54 \\ -18 \\ -36 \end{pmatrix} = \frac{3}{14} \begin{pmatrix} 0 \\ 3 \\ -1 \\ -2 \end{pmatrix}. \]
Accordingly, $v_4 = \frac{1}{\|\tilde{v}_4\|} \tilde{v}_4 = \frac{1}{\sqrt{14}} \begin{pmatrix} 0 \\ 3 \\ -1 \\ -2 \end{pmatrix}$.
Thus (v3, v4) is a basis of U ⊥. Let $w_3 = \begin{pmatrix} 7 \\ -3 \\ 1 \\ -5 \end{pmatrix}$ and $w_4 = \begin{pmatrix} 0 \\ 3 \\ -1 \\ -2 \end{pmatrix}$. The vectors w3
and w4 are nonzero scalar multiples of v3 and v4, respectively, so that U ⊥ = ⟨w3, w4⟩.
By Lemma 3.10, U = U ⊥⊥ = {w3, w4}⊥ =
\[ = \left\{ \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix} \;\middle|\; 7x_1 - 3x_2 + x_3 - 5x_4 = 0 = 3x_2 - x_3 - 2x_4 \right\}. \]
4.2.1 Distances
Lemma 4.16. Suppose that V has finite dimension. Let U be a subspace of V , and
let v ∈ V . If v = u + w with u ∈ U and w ∈ U ⊥ , then d(u, v) = min{d(x, v) | x ∈ U }.
Proof. Let u and w be defined as in the statement of the lemma. Let x ∈ U .
Recalling that w ∈ U ⊥ and u − x ∈ U , we obtain
\[ d(v, x)^2 = \|v - x\|^2 = f(w + (u - x), w + (u - x)) = f(w, w) + f(u - x, u - x) \ge f(w, w) = d(u, v)^2, \]
with equality if and only if x = u. This proves the claim.
Examples. Let V = R3 with the dot product. Let U = ⟨y⟩, where y = 2e1 − e3 .
a) Show that U ⊥ = ⟨e2 , e1 + 2e3 ⟩.
b) Let x = e2 + e3 . Determine a vector in U at minimal distance to x.
Solution: For a), we could use the Gram-Schmidt algorithm, but there is a shorter
way: Since dim U = 1, we know that dim U ⊥ = 2. The two vectors e2 and e1 + 2e3
are linearly independent and perpendicular to y. This is sufficient for a).
For b), we write x as a sum of an element of U and an element of U ⊥ . This means
finding scalars α, β, γ such that x = e2 + e3 = α(2e1 − e3 ) + βe2 + γ(e1 + 2e3 ) =
(2α + γ)e1 + βe2 + (−α + 2γ)e3 . It follows that β = 1, γ = −2α, and −5α = 1, i.e.
α = − 51 and γ = 52 . Thus u = α(2e1 − e3 ) = − 15 y is the element of U at minimal
distance to x.