Linear Algebra 1 (MAT 3711)

Lecture Notes
University of Namibia

1 Vector spaces
1.1 Fields
We define an algebraic structure called a field. You are acquainted with several
fields already, the rational numbers Q, the real numbers R, or the complex
numbers C. In MAT3711 and MAT3712, examples will be almost exclusively
vector spaces (see the definition below) over one of those three fields. However,
there are many, in fact infinitely many, fields.
Definition 1.1. Let F be a set on which two binary operations F × F → F are
defined. The two operations are called addition and multiplication and denoted
by (a, b) ↦ a + b and (a, b) ↦ ab (so a + b is the sum of a and b, and ab the product). The set F is called a field if the following
requirements are met: For a, b, c ∈ F
1. (a + b) + c = a + (b + c).
2. a + b = b + a.
3. There is an element 0 satisfying a + 0 = a.
4. There is an element −a satisfying a + (−a) = 0.
5. (ab)c = a(bc).
6. ab = ba.
7. There is an element 1 satisfying 1a = a.
8. If a ≠ 0, then there is an element a⁻¹ satisfying aa⁻¹ = 1.
9. a(b + c) = ab + ac.
10. 1 ≠ 0.
Remark. According to the axioms, addition in F is associative, commutative,
there is a neutral element 0 and every element has a negative. Likewise, mul-
tiplication is associative and commutative, there is a multiplicative unity 1 and
every nonzero element has a multiplicative inverse. Finally, the distributive law
holds in F . The last axiom guarantees that F has nonzero elements.
The field axioms create a structure in which we can perform arithmetic as we
know it.

Examples. a) Q, R, C are fields.
b) The set of integers, Z, with its usual multiplication and addition, is not a
field. The only axiom that fails to hold is the existence of a multiplicative
inverse for each nonzero element. Indeed, 5 ≠ 0, but 1/5 ∉ Z.
c) There are fields with finitely many elements. The smallest has only two
elements, 0 and 1, and the operations are succinctly displayed in the following addition and multiplication tables:

+ | 0 1          · | 0 1
--+-----         --+-----
0 | 0 1          0 | 0 0
1 | 1 0          1 | 0 1
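These tables can be checked mechanically: in F2, addition and multiplication are simply arithmetic modulo 2. A minimal Python sketch (an illustration of ours; the helper names add and mul are arbitrary):

    # The field F2 = {0, 1}: addition and multiplication are arithmetic modulo 2.
    F2 = (0, 1)

    def add(a, b):
        return (a + b) % 2

    def mul(a, b):
        return (a * b) % 2

    # Reproduce the addition and multiplication tables.
    for a in F2:
        for b in F2:
            print(a, "+", b, "=", add(a, b), "   ", a, "*", b, "=", mul(a, b))

    # Spot-check two of the field axioms: commutativity and distributivity.
    assert all(add(a, b) == add(b, a) for a in F2 for b in F2)
    assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
               for a in F2 for b in F2 for c in F2)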

1.2 Families
Definition 1.2. Let S and I be sets. A family of elements of S, indexed by
I, is a map f : I → S. We usually use subscripts to name a family, writing
si instead of f (i) and denoting the family as (si )i∈I . If I is the set of natural
numbers 1, . . . , r (r ∈ N), we write (s1, . . . , sr).
A subfamily of a given family f is a map g : I1 → S, where I1 ⊆ I and g(i) =
f (i) whenever i ∈ I1 .
Examples. 1. Given elements ai , i = 1, . . . , 5 of any set, (a1 , a2 , a3 , a4 , a5 )
is a family; here the index set I is {1, 2, 3, 4, 5}. The family (a1, a3, a4)
is a subfamily.
2. Sequences (an )n∈N of real numbers are families of real numbers with index
set N and subsequences are subfamilies.
The concept of a family generalises that of (finite or infinite) sequences.
Remark. It is also possible to join two given families. Say, (a1 , a2 , a3 , a4 , a5 )
and (b1 , b2 , b3 ) can be put together to form the family (a1 , a2 , a3 , a4 , a5 , a6 , a7 , a8 )
where a6 = b1 , a7 = b2 , a8 = b3 . The formal definition of this procedure is cum-
bersome and we will omit it.

1.3 Vector spaces and subspaces


Let F be a field.
Definition 1.3. Let V be a nonempty set. Elements of V will be called vectors.
The structure V is a vector space over F if there are operations + : V × V → V (called vector addition) and · : F × V → V (called scalar multiplication)
satisfying the following requirements: For u, v, w ∈ V and λ, µ ∈ F ,

1. (a) (u + v) + w = u + (v + w).
(b) u + v = v + u.
(c) There is an element 0, called a null vector, satisfying v + 0 = v.
(d) There is an element −v, called a negative of v, satisfying v + (−v) = 0.
2. (a) (λµ)v = λ(µv).
(b) 1v = v.
3. (a) λ(u + v) = λu + λv.
(b) (λ + µ)v = λv + µv.
Remark. The first group of axioms concern vector addition, saying that it is
associative and commutative, that there is a neutral element and that every ele-
ment has a negative. The second group of axioms connects scalar multiplication
with the multiplication in F , and the third group are the two distributive laws.
In the context of a vector space over F , the elements of F will often be called
scalars.
Examples. 1. Let n ∈ N. The set F^n consists of n-tuples of elements of F, displayed as column vectors and written (x1, . . . , xn)^T in running text:

F^n = {(x1, x2, . . . , xn)^T | x1, . . . , xn ∈ F}.

Addition in F^n is defined componentwise, i.e.

(x1, . . . , xn)^T + (y1, . . . , yn)^T = (x1 + y1, . . . , xn + yn)^T,

and scalar multiplication by

λ(x1, . . . , xn)^T = (λx1, . . . , λxn)^T.

(In R², (1, 3)^T + (−7, 5)^T = (−6, 8)^T and 5(1, 3)^T = (5, 15)^T.)

We verify that F^n is a vector space: First of all, F^n ≠ ∅ and is closed under the intended addition and scalar multiplication (sums and scalar multiples of elements of F^n are in F^n). Next,

((x1, . . . , xn)^T + (y1, . . . , yn)^T) + (z1, . . . , zn)^T = ((x1 + y1) + z1, . . . , (xn + yn) + zn)^T = (x1 + (y1 + z1), . . . , xn + (yn + zn))^T = (x1, . . . , xn)^T + ((y1, . . . , yn)^T + (z1, . . . , zn)^T).

This is associativity of addition. For commutativity, (x1, . . . , xn)^T + (y1, . . . , yn)^T = (x1 + y1, . . . , xn + yn)^T = (y1 + x1, . . . , yn + xn)^T = (y1, . . . , yn)^T + (x1, . . . , xn)^T. The vector (0, . . . , 0)^T serves as null vector, and −(x1, . . . , xn)^T = (−x1, . . . , −xn)^T. Those are the four axioms in the first group.

Next, (λµ)(x1, . . . , xn)^T = ((λµ)x1, . . . , (λµ)xn)^T = (λ(µx1), . . . , λ(µxn))^T = λ(µ(x1, . . . , xn)^T) and 1(x1, . . . , xn)^T = (1x1, . . . , 1xn)^T = (x1, . . . , xn)^T. This takes care of the second group.

Finally, λ((x1, . . . , xn)^T + (y1, . . . , yn)^T) = (λ(x1 + y1), . . . , λ(xn + yn))^T = (λx1 + λy1, . . . , λxn + λyn)^T = λ(x1, . . . , xn)^T + λ(y1, . . . , yn)^T, and (λ + µ)(x1, . . . , xn)^T = ((λ + µ)x1, . . . , (λ + µ)xn)^T = (λx1 + µx1, . . . , λxn + µxn)^T = λ(x1, . . . , xn)^T + µ(x1, . . . , xn)^T.

2. The set ℓ = {(an)n∈N | an ∈ R for all n} of sequences of real numbers is a vector space with componentwise addition and scalar multiplication, i.e. (an) + (bn) = (an + bn) and λ(an) = (λan). Note that these operations are exactly addition of sequences and multiplication of a sequence by a constant as you know them from Calculus. The vector space axioms are proved as in Example 1. The null vector is the constant zero sequence.
3. The set ℓ0 of real number sequences converging to 0 is another vector
space. You will know from Calculus that sums and constant multiples of
zero sequences are zero sequences, and of course the constant zero sequence
converges to 0. The vector space axioms carry over from ℓ. Indeed, ℓ0 is
our first example of a subspace.
4. Let V = {0} with 0 + 0 = 0 = λ0 (λ ∈ F). The set {0} is a vector space called the null space. Indeed, there is a null vector, and all the vector space axioms are equalities that remain equalities when everything is equal to 0.
5. Let S be a set. Recall that V = P(S) is the set of subsets of S. This can be turned into a vector space over the field F2 with two elements discussed above. Scalar multiplication is given by 1A = A and 0A = ∅ whenever A ∈ P(S), and addition is given by symmetric difference, i.e. A + B = AΔB = (A \ B) ∪ (B \ A). Directly verifying the vector space axioms is not entirely easy; associativity of addition takes some work. You are welcome to try; if you are interested, discuss it with me. We shall return to this example later and see an easy proof.
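The associativity claim in example 5 can at least be tested exhaustively on a small set before we see the easy proof. A brute-force check in Python (standard library only; the set S = {1, 2, 3} is an arbitrary choice of ours):

    from itertools import combinations

    S = {1, 2, 3}
    # All subsets of S, i.e. the elements of P(S).
    subsets = [frozenset(c) for r in range(len(S) + 1)
               for c in combinations(S, r)]

    def sym_diff(A, B):
        # A + B = A Δ B = (A \ B) ∪ (B \ A)
        return (A - B) | (B - A)

    # Associativity of addition, checked on all triples of subsets.
    assert all(sym_diff(sym_diff(A, B), C) == sym_diff(A, sym_diff(B, C))
               for A in subsets for B in subsets for C in subsets)

    # The empty set is the null vector, and every A is its own negative.
    assert all(sym_diff(A, frozenset()) == A for A in subsets)
    assert all(sym_diff(A, A) == frozenset() for A in subsets)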
In the next lemma, we collect some fundamental consequences of the vector
space axioms.
Lemma 1.4. Let V be a vector space over the field F . Let u, v, w ∈ V and
λ ∈ F.
a) If u + v = u + w, then v = w.
b) There is only one null vector in V .
c) Every vector in V has a uniquely determined negative.
d) −(−v) = v.
e) λ0 = 0 = 0v.
f ) −(λv) = λ(−v) = (−λ)v.

g) If v ≠ 0, then λv = 0 if and only if λ = 0.
h) In V , there is exactly one solution to the equation u + x = v.
Proof. If u + v = u + w, then v = 0 + v = (−u + u) + v = (−u) + (u + v) = (−u) + (u + w) = (−u + u) + w = 0 + w = w. This is a).
If z + 0′ = z = z + 0 holds whenever z ∈ V, then 0′ = 0′ + 0 = 0 + 0′ = 0. This is b).
If v + u = v + w = 0, then u = w by a). This is c).
For d), v + (−v) = 0 = (−(−v)) + (−v). Now apply a).
For the first statement in e), λ0 + 0 = λ0 = λ(0 + 0) = λ0 + λ0. Now apply a).
For the second statement, 0v + 0 = 0v = (0 + 0)v = 0v + 0v; now apply a).
As to f), we use e) to obtain λv + λ(−v) = λ(v + (−v)) = λ0 = 0. By c), this means that λ(−v) = −(λv). Similarly, the second statement in e) implies that λv + (−λ)v = (λ + (−λ))v = 0v = 0, so (−λ)v = −(λv) follows from c).
Moving on to g), assume that λv = 0 and λ ≠ 0. Every nonzero element of a field has a multiplicative inverse. By e), v = 1v = (λ⁻¹λ)v = λ⁻¹(λv) = λ⁻¹0 = 0, contradicting v ≠ 0. Conversely, 0v = 0 by e).
Finally, u + (−u + v) = (u + (−u)) + v = v, so x = −u + v solves u + x = v. The uniqueness of the solution follows from a).
Definition 1.5. Let V be a vector space over F . For u, v ∈ V , we define

u − v = u + (−v).

Remark. • Let λ, µ ∈ F and u, v ∈ V. It follows from 1.4 f) that λ(u − v) = λu + λ(−v) = λu − λv and (λ − µ)v = λv − µv.
• Let n be some positive integer and v1, . . . , vn ∈ V. It follows from the associative law that all ways of placing parentheses in v1 + . . . + vn yield the same result. We simply write v1 + . . . + vn.

1.4 Subspaces
Definition 1.6. Let V be a vector space over F . A subset U of V is called a
subspace if
a) U ̸= ∅.
b) If u ∈ U ∋ v, then u + v ∈ U .
c) If u ∈ U and λ ∈ F , then λu ∈ U .
We denote "U is a subspace of V" by "U ≤ V".
Properties b) and c) mean, respectively, that U is closed with respect to addition
and scalar multiplication. A subspace should be thought of as a subset of a
vector space that is a vector space in its own right, with the operations inherited
from the surrounding space.
Examples. 1. V ≤ V and {0} ≤ V (for the latter, see the lemma below).

  
2. Let V = R² and U = {(x, −2x)^T | x ∈ R}. The set U is certainly not empty, say (1, −2)^T ∈ U. For x, y, λ ∈ R, we have (x, −2x)^T + (y, −2y)^T = (x + y, −2x − 2y)^T = (x + y, −2(x + y))^T ∈ U and λ(x, −2x)^T = (λx, −2λx)^T ∈ U.
3. The set U of Example 2 is the set of solutions in R² of the homogeneous linear equation 2x1 + x2 = 0. You will recall that a homogeneous linear system of m equations in n unknowns x1, . . . , xn, with coefficients in some field F (you know this theory for F = R, but the field really does not make a difference here), can be conveniently expressed as

Av = 0,

where A is an m × n matrix with coefficients in F, v = (x1, . . . , xn)^T, and the right-hand side is the null vector of F^m. The set of solutions of the system is the set U = {v ∈ F^n | Av = 0}. If x1 = . . . = xn = 0, then Av = 0, so U ≠ ∅. If u, v ∈ U and λ ∈ F, then A(u + v) = Au + Av = 0 and A(λu) = λAu = 0. So the set of solutions of a homogeneous linear system in n unknowns is a subspace of F^n.
4. Let V be a vector space over F and let v ∈ V. The set W = {λv | λ ∈ F} of scalar multiples of v is a subspace of V: First of all, v = 1v ∈ W ≠ ∅. For λ, µ, α ∈ F, λv + µv = (λ + µ)v ∈ W and α(λv) = (αλ)v ∈ W.
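The solution space in example 3 can be computed explicitly. A sketch using the third-party sympy library (nullspace returns a basis of {v | Av = 0}); the matrix below encodes the single equation 2x1 + x2 = 0:

    from sympy import Matrix

    # One equation, two unknowns: 2*x1 + x2 = 0.
    A = Matrix([[2, 1]])

    # A basis of the solution space U = {v | A v = 0}.
    basis = A.nullspace()
    print(basis)          # [Matrix([[-1/2], [1]])], i.e. U = <(1, -2)^T>

    # Closure checks: sums and scalar multiples of solutions are solutions.
    u = basis[0]
    assert A * (u + 3 * u) == Matrix([[0]])
    assert A * (7 * u) == Matrix([[0]])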
Lemma 1.7. Let V be a vector space over F . The set {0} is a subspace of V
and 0 belongs to every subspace of V .
Proof. The set {0} is nonempty and 0 + 0 = 0 = λ0 (λ ∈ F ). So {0} is a
subspace. Now let U ≤ V . Since U ̸= ∅, there is u in U . It follows that
(−1)u = −1u = −u ∈ U . Thus U ∋ u + (−u) = 0.
Lemma 1.8. Let V be a vector space over F . Let U1 , U2 ≤ V .
Then U1 ∩ U2 ≤ V .
Proof. By Lemma 1.7, 0 ∈ U1 ∩U2 ̸= ∅. Let u, v ∈ U1 ∩U2 and λ ∈ F . Then u+v
and λv are in U1 and in U2 , both being subspaces. So u + v ∈ U1 ∩ U2 ∋ λv.
Remark. Unlike the intersection, the union of two subspaces usually is not a subspace. Take V = R², U1 = {λ(1, 0)^T | λ ∈ R} and U2 = {λ(0, 1)^T | λ ∈ R}. Both sets are subspaces, yet U1 ∪ U2 does not contain the vector (1, 0)^T + (0, 1)^T = (1, 1)^T. So U1 ∪ U2 is not closed with respect to addition.
Let V be a vector space and let S be a subset of V . Since V itself is a subspace
of V , S is contained in a subspace of V . By Lemma 1.8, the intersection over
all subspaces of V that contain S is a subspace of V .
Definition 1.9. Let V be a vector space and let S ⊆ V. The span of S, denoted ⟨S⟩, is defined by

⟨S⟩ = ⋂_{S⊆U≤V} U.

Remark. Let S ⊆ V. By its definition, ⟨S⟩ is a subspace of V containing S, and if W is a subspace of V with S ⊆ W, then ⟨S⟩ ⊆ W.
If S = {v1, . . . , vr} is a finite set, we write ⟨S⟩ = ⟨v1, . . . , vr⟩, omitting the curly brackets.
Examples. Let V be a vector space over F .
1. Let v ∈ V. In the previous round of examples, we have seen that W = {λv | λ ∈ F} is a subspace of V. The subspace W contains v = 1v. This implies that ⟨v⟩ ⊆ W. If U is any subspace of V and v ∈ U, then W ⊆ U since U is closed with respect to scalar multiples. This means that W ⊆ ⋂_{v∈U≤V} U = ⟨v⟩. Accordingly,

⟨v⟩ = {λv | λ ∈ F}.

2. What is ⟨∅⟩? We know that the empty set is a subset of every other set. This means that ⟨∅⟩ = ⋂_{∅⊆U≤V} U = ⋂_{U≤V} U, i.e. the intersection of all subspaces of V. By Lemma 1.7, it follows that ⟨∅⟩ = {0}.
   
3. Let V = R². Let v1 = (1, 0)^T and v2 = (0, 1)^T. For a, b ∈ R, av1 + bv2 = (a, b)^T. In other words, every element of V is a sum of a scalar multiple of v1 and a scalar multiple of v2. So every subspace of V containing v1 and v2 contains all the elements of V. It follows that ⟨v1, v2⟩ = V.
Definition 1.10. Let V be a vector space over the field F and let U1 ≤ V ≥ U2. The sum of U1 and U2, denoted U1 + U2, is defined by

U1 + U2 = ⟨U1 ∪ U2 ⟩.

So U1 + U2 is a subspace containing U1 and U2, and every subspace containing both U1 and U2 contains U1 + U2. The reason why it is called a sum is the following lemma, which also allows us to compute U1 + U2 in examples.

Lemma 1.11. In the notation of 1.10, U1 + U2 = {u + w | u ∈ U1, w ∈ U2}.
Proof. Let S = {u + w | u ∈ U1, w ∈ U2}. By Lemma 1.7, 0 ∈ U1 and 0 ∈ U2. So U1 = {u + 0 | u ∈ U1} ⊆ S and U2 = {0 + w | w ∈ U2} ⊆ S. So U1 ∪ U2 ⊆ S; in particular, S ≠ ∅ (subspaces are nonempty). Next, let u, u′ ∈ U1, w, w′ ∈ U2 and λ ∈ F. Then u + u′ ∈ U1 and w + w′ ∈ U2, so that (u + w) + (u′ + w′) = (u + u′) + (w + w′) ∈ S. Furthermore, λu ∈ U1 and λw ∈ U2, so that λ(u + w) = λu + λw ∈ S. So S is a subspace of V containing U1 ∪ U2 and it follows that U1 + U2 ⊆ S. To establish the reverse inclusion, consider any subspace W with W ⊇ U1 ∪ U2. Since W is closed under addition, W contains every sum u + w with u ∈ U1 and w ∈ U2, so S ⊆ W. Accordingly, S ⊆ ⋂_{U1∪U2⊆W≤V} W = U1 + U2. This completes the proof.
     
Examples. Let V = R², v1 = (1, 0)^T and v2 = (0, 1)^T. If w = (x1, x2)^T ∈ V, then w = x1v1 + x2v2. Thus V = ⟨v1⟩ + ⟨v2⟩.

2 Linear dependence, basis, dimension


Throughout this chapter, V is a vector space over the field F .

2.1 Linear combinations


Definition 2.1. Let (v1, . . . , vr) be a finite family of vectors in V. A vector w is a linear combination of the vectors v1, . . . , vr if there are scalars λ1, . . . , λr such that

w = ∑_{i=1}^{r} λi vi = λ1v1 + λ2v2 + . . . + λrvr.

A vector z is a linear combination of an infinite family (vi | i ∈ I) if z is a linear combination of a finite subfamily.
Remark. A very useful thing to observe is that linear combinations involving
only some of the vectors in (v1 , . . . , vr ) are linear combinations of all vectors in
the family. We simply give the vectors we do not need a coefficient 0. Example:
v1 + 2v3 + v4 = v1 + 0v2 + 2v3 + v4 + 0v5 .
     
Examples. 1. (1, 2, 3)^T = 2(1, 1, 1)^T − (1, 0, −1)^T is a linear combination of (1, 1, 1)^T and (1, 0, −1)^T.
2. The vector w = (1, 2, 3, 4)^T in R⁴ is a linear combination of the vectors (1, 0, 3, 2)^T and (0, 2, 0, 2)^T – find out how.
3. Let v ∈ V. The linear combinations of the family (v) are the scalar multiples of v. The linear combinations of the family (v, 2v, 3v) are again the scalar multiples of v.
Lemma 2.2. Let U ≤ V . If (v1 , . . . , vr ) is a finite family of elements of U ,
then every linear combination of v1 , . . . , vr is in U .
Proof. We use induction on r. The linear combinations of (v1) are the scalar multiples of v1, which belong to U. Now let r ≥ 2 and consider w = ∑_{i=1}^{r} λi vi. Via induction, the vector w′ = ∑_{i=1}^{r−1} λi vi is in U. Now w = w′ + λr vr. Since w′, vr ∈ U, λr vr and then w′ + λr vr are in U.
The "meaning" of Lemma 2.2 is that subspaces are closed with respect to linear combinations.
Lemma 2.3. Let (v1, . . . , vr) be a nonempty finite family in V. Then ⟨v1, . . . , vr⟩ is the set of linear combinations of v1, . . . , vr.
Proof. Let S be the set of linear combinations of (v1, . . . , vr). By Lemma 2.2, S ⊆ ⟨v1, . . . , vr⟩. Now we show that S is a subspace. Certainly, S is not empty as v1 ∈ S. Let u, w ∈ S, u = ∑_{i=1}^{r} λi vi, w = ∑_{i=1}^{r} µi vi. Then u + w = ∑_{i=1}^{r} (λi + µi)vi ∈ S. For α ∈ F, αu = ∑_{i=1}^{r} (αλi)vi ∈ S. So S is a subspace of V containing each of v1, . . . , vr, and therefore ⟨v1, . . . , vr⟩ ⊆ S.
Examples. In R³, ⟨(1, 2, −3)^T, (−1, 2, 0)^T⟩ = {λ1(1, 2, −3)^T + λ2(−1, 2, 0)^T | λ1, λ2 ∈ R} = {(λ1 − λ2, 2λ1 + 2λ2, −3λ1)^T | λ1, λ2 ∈ R}.
 

The final lemma in this subsection contains the main argument in the proof of the Steinitz exchange lemma in the next section (see Theorem 2.13). This in turn is probably the most important result in this course.

Lemma 2.4. Let (v1, . . . , vr) be a finite family in V. Let w ∈ ⟨v1, . . . , vr⟩. Write w = ∑_{i=1}^{r} λi vi. Fix j in {1, . . . , r}. If λj ≠ 0, then ⟨v1, . . . , vr⟩ = ⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩.
Proof. Since subspaces are closed under linear combinations (Lemma 2.2), ⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩ ⊆ ⟨v1, . . . , vr⟩ regardless of whether λj = 0. Now suppose that λj ≠ 0. Then 1/λj exists, and

vj = (1/λj)(λ1v1 + . . . + λjvj + . . . + λrvr) − (λ1/λj)v1 − . . . − (λj−1/λj)vj−1 − (λj+1/λj)vj+1 − . . . − (λr/λj)vr = (1/λj)w − (1/λj) ∑_{i≠j} λi vi.

Thus vj ∈ ⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩. Certainly vi ∈ ⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩ whenever i ≠ j. By Lemmas 2.2 and 2.3, it follows that ⟨v1, . . . , vr⟩ ⊆ ⟨v1, . . . , vj−1, w, vj+1, . . . , vr⟩.

2.2 Linear dependence/independence


Throughout this section, V is a vector space over the field F and r ∈ N.
Definition 2.5. The family (v1, . . . , vr) in V is called linearly independent if the following holds:
If λ1, . . . , λr ∈ F and ∑_{i=1}^{r} λi vi = 0, then λ1 = . . . = λr = 0.
The family (v1, . . . , vr) is called linearly dependent if it is not linearly independent.
An infinite family (vi, i ∈ I) is called linearly independent if every finite subfamily is linearly independent.
   
Examples. a) In R², the family ((1, 0)^T, (1, 1)^T) is linearly independent.
Proof: Let λ1, λ2 ∈ R; then λ1(1, 0)^T + λ2(1, 1)^T = (λ1 + λ2, λ2)^T. If λ1 + λ2 = 0 = λ2, then λ1 = λ2 = 0.
b) Let V = R³ and let v1 = (1, −1, 1)^T, v2 = (2, −1, 0)^T, v3 = (0, 1, −2)^T.
Let λ1, λ2, λ3 be such that λ1v1 + λ2v2 + λ3v3 = 0. This means that 0 = λ1 + 2λ2 = −λ1 − λ2 + λ3 = λ1 − 2λ3. Any scalars with λ1 = −2λ2 = 2λ3 satisfy these equations: then λ2 = −λ3 and −λ1 − λ2 + λ3 = 2λ2 − 2λ2 = 0. In particular, 2v1 − v2 + v3 = 0, so the three vectors are linearly dependent.

c) If vi = 0 for some i in {1, . . . , r}, then (v1, . . . , vr) is linearly dependent:
Indeed, 1·0 = 0 and 1 ≠ 0 in F.

d) If there are i, j ∈ {1, . . . , r} with i ≠ j and vi = vj, then (v1, . . . , vr) is linearly dependent:
We have 0 = 1vi − 1vj + ∑_{k≠i,j} 0vk.

e) If V ∋ v ≠ 0, then the family (v) is linearly independent.


This is exactly the assertion of Lemma 1.4 g).

f) The "empty family" ∅ is linearly independent.


There are no linear combinations, so none that could be equal to 0.

g) If r ≥ 2, then the family (v1, . . . , vr) is linearly dependent if and only if there is i in {1, . . . , r} such that vi is a linear combination of the other vectors in the family, i.e. of (v1, . . . , vi−1, vi+1, . . . , vr).

Proof: Suppose that (v1, . . . , vr) is linearly dependent. Then there exist λ1, . . . , λr, not all equal to 0, such that ∑_{j=1}^{r} λj vj = 0. Pick i ∈ {1, . . . , r} with λi ≠ 0. We have

λi vi = − ∑_{j≠i} λj vj.

Since λi ≠ 0, we may divide by λi on both sides, to obtain

vi = −(1/λi) ∑_{j≠i} λj vj = − ∑_{j≠i} (λj/λi) vj.

Conversely, assume that vi is a linear combination vi = ∑_{j≠i} µj vj. Then

0 = µ1v1 + . . . + µi−1vi−1 + (−1)vi + µi+1vi+1 + . . . + µrvr.

The coefficient of vi in this is −1, in particular is nonzero, showing that the family (v1, . . . , vr) is linearly dependent.
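Linear (in)dependence of finitely many vectors in F^n can be tested by row reduction: the family is linearly independent exactly when the matrix having the vectors as columns has rank equal to the number of vectors. A sympy sketch of ours, applied to example b) above:

    from sympy import Matrix

    # v1, v2, v3 from example b), placed as the columns of a matrix.
    M = Matrix([[1, 2, 0],
                [-1, -1, 1],
                [1, 0, -2]])

    # Rank smaller than the number of columns means linear dependence.
    print(M.rank())       # 2, so (v1, v2, v3) is linearly dependent

    # The nullspace exhibits a dependence: a nonzero (l1, l2, l3) with
    # l1*v1 + l2*v2 + l3*v3 = 0.
    print(M.nullspace())  # [Matrix([[2], [-1], [1]])]: 2 v1 - v2 + v3 = 0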

Lemma 2.6. Let (v1, . . . , vr) be a family in V. The following assertions are equivalent:
a) The family (v1, . . . , vr) is linearly dependent.
b) Either r = 1 and v1 = 0, or r ≥ 2 and there is i ∈ {1, . . . , r} such that ⟨v1, . . . , vr⟩ = ⟨v1, . . . , vi−1, vi+1, . . . , vr⟩.
Proof. "a) implies b)": Suppose that (v1, . . . , vr) is linearly dependent. We refer to examples e) and g) just above, together with Lemma 2.4, from which all the statements in b) follow.
"b) implies a)": As seen above, 0 = 1·0, making the family (0) linearly dependent. Assume that r ≥ 2 and that there is i ∈ {1, . . . , r} with ⟨v1, . . . , vr⟩ = ⟨v1, . . . , vi−1, vi+1, . . . , vr⟩. By Lemma 2.3, vi is a linear combination vi = ∑_{j≠i} λj vj. Thus 1vi − ∑_{j≠i} λj vj = 0. This is a linear combination of (v1, . . . , vr) in which the coefficient of vi is equal to 1 and which is equal to 0. Hence (v1, . . . , vr) is linearly dependent.

Lemma 2.7. Let (v1, . . . , vr) be a family in V. The following assertions are equivalent:
a) The family (v1, . . . , vr) is linearly independent.
b) For every element w of ⟨v1, . . . , vr⟩ there is a uniquely determined r-tuple (λ1, . . . , λr) of scalars satisfying w = ∑_{i=1}^{r} λi vi.
Proof. "a) implies b)": Suppose that (v1, . . . , vr) is linearly independent. Let w ∈ ⟨v1, . . . , vr⟩. We know (Lemma 2.3) that w is a linear combination w = ∑_{i=1}^{r} λi vi. If there are scalars µ1, . . . , µr such that w = ∑_{i=1}^{r} µi vi, then ∑_{i=1}^{r} (λi − µi)vi = 0. Given that (v1, . . . , vr) is linearly independent, this means that λi = µi holds for all i.
"b) implies a)": We know that 0 = 0v1 + 0v2 + . . . + 0vr. If that is the only way to express the null vector as a linear combination of (v1, . . . , vr), the family is linearly independent.
Definition 2.8. a) A family (vi , i ∈ I) in V is called a generating system
(or spanning system) if V = ⟨vi | i ∈ I⟩.
b) The space V is finitely generated if it has a spanning system with finitely
many members.
   
Examples. 1. The space R² is finitely generated, say by ((1, 0)^T, (0, 1)^T).
2. More generally, the space F^n is finitely generated:

(x1, x2, . . . , xn)^T = x1(1, 0, . . . , 0)^T + x2(0, 1, . . . , 0)^T + . . . + xn(0, 0, . . . , 1)^T.
3. The space ℓ0 of zero sequences of real numbers is not finitely generated. We will
be able to give a fairly easy proof further below, when we will have developed a
little more of the theory.
4. The real numbers R form a vector space also over the rational numbers. In fact,
Q is a field and if we let the vectors be real numbers and the scalars rational
numbers, the vector space axioms work out (check). It can be shown that R
cannot be finitely generated over Q.

Definition 2.9. A family (vi | i ∈ I) of vectors in V is called a basis of V if it is a generating system and linearly independent.
Of course we do not know yet if this definition will be very useful as we do not
know if bases exist at all. The following lemma clears things up in the case of
finitely generated vector spaces.
Lemma 2.10. Suppose that V has a finite generating system (v1 , . . . , vr ). Then
there is a basis of V which is a subfamily of (v1 , . . . , vr ).

Proof. If (v1, . . . , vr) is linearly independent, it is already a basis. Otherwise, (v1, . . . , vr) is linearly dependent. Lemma 2.6 yields that there is i ∈ {1, . . . , r} such that V = ⟨v1, . . . , vr⟩ = ⟨v1, . . . , vi−1, vi+1, . . . , vr⟩ (this also holds if r = 1; why?). If (v1, . . . , vi−1, vi+1, . . . , vr) is linearly dependent, we may remove a second vector and obtain an even smaller spanning system. We continue in this way. Since there are only r vectors to begin with, the process must stop and deliver a linearly independent subfamily which still generates V.

Examples. Let U = ⟨(1, 2, 0)^T, (0, 2, −1)^T, (1, −2, 2)^T⟩ in R³. We have (1, 2, 0)^T − 2(0, 2, −1)^T = (1, −2, 2)^T, so U = ⟨(1, 2, 0)^T, (0, 2, −1)^T⟩. The two remaining vectors are linearly independent and therefore a basis.
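The pruning procedure in the proof of Lemma 2.10 corresponds to keeping the pivot columns under row reduction. A sympy sketch on the example just computed:

    from sympy import Matrix

    # The three spanning vectors of U, as columns.
    M = Matrix([[1, 0, 1],
                [2, 2, -2],
                [0, -1, 2]])

    # rref() also returns the pivot column indices; the corresponding
    # original vectors form a basis of the column space U.
    _, pivots = M.rref()
    print(pivots)                      # (0, 1): the first two vectors suffice
    print([M.col(j) for j in pivots])  # the selected basis of U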

Lemma 2.11. Let V be finitely generated and let (v1, . . . , vn) be a family in V. Then the following are equivalent:
a) The family (v1, . . . , vn) is a basis of V.
b) The family (v1, . . . , vn) is linearly independent, but (v1, . . . , vn, w) is linearly dependent whenever w ∈ V.
c) V = ⟨v1, . . . , vn⟩, but V ≠ ⟨v1, . . . , vi−1, vi+1, . . . , vn⟩ whenever i ∈ {1, . . . , n}.
Proof. "a) implies b)": Suppose that the family (v1, . . . , vn) is a basis. Then it is linearly independent by definition. Since it is also a generating system, any w in V is a linear combination w = ∑_{i=1}^{n} λi vi. It follows that w − ∑_{i=1}^{n} λi vi = 0 and the coefficient of the vector w in the linear combination on the left hand side is 1. So (v1, . . . , vn, w) is linearly dependent.
"b) implies c)": We show that a family (v1, . . . , vn) satisfying b) is a spanning system. Let w ∈ V. Given that (v1, . . . , vn, w) is linearly dependent, there are scalars λ1, . . . , λn and µ, not all equal to 0, such that ∑_{i=1}^{n} λi vi + µw = 0. If µ = 0, then ∑_{i=1}^{n} λi vi = 0. Since (v1, . . . , vn) is linearly independent, this would mean that λ1 = . . . = λn = µ = 0, a contradiction. Thus µ ≠ 0 and w = −(1/µ) ∑_{i=1}^{n} λi vi ∈ ⟨v1, . . . , vn⟩. Thus (v1, . . . , vn) is a spanning system, as claimed. Since (v1, . . . , vn) is linearly independent, Lemma 2.6 says that no proper subfamily can have the same span as the whole family.
"c) implies a)": Suppose that (v1, . . . , vn) satisfies c). Then certainly (v1, . . . , vn) spans V and, by Lemma 2.6, cannot be linearly dependent.


Remark. b) and c) in 2.11 say, respectively, that a basis is a maximal linearly independent family and a minimal generating system.
Definition 2.12. Let 1 ≤ i ≤ n ∈ N. The ith standard basis vector of F^n, called ei, is the vector (0, . . . , 0, 1, 0, . . . , 0)^T whose only nonzero entry, 1, is in the ith place. (Example: In F³, e1 = (1, 0, 0)^T, e2 = (0, 1, 0)^T, e3 = (0, 0, 1)^T.)
Remark. From (x1, x2, . . . , xn)^T = x1e1 + x2e2 + . . . + xnen, it is clear that (e1, . . . , en) is both linearly independent and a generating system of F^n. The family (e1, . . . , en) in F^n will be referred to as the standard basis of F^n.

Theorem 2.13. [Steinitz exchange lemma] Let V be finitely generated with basis (v1, . . . , vn). Let (w1, . . . , wm) be a linearly independent family in V. Then m ≤ n and the vectors v1, . . . , vn may be renumbered in such a way that (w1, . . . , wm, vm+1, . . . , vn) is a basis of V.
Proof. Since (v1, . . . , vn) is a basis, w1 is a linear combination w1 = ∑_{i=1}^{n} λi vi. Since (w1, . . . , wm) is linearly independent, w1 ≠ 0. This means that at least one of the scalars λ1, . . . , λn is nonzero. Possibly upon renumbering the vectors (v1, . . . , vn), we may assume that λ1 ≠ 0. By Lemma 2.4, ⟨w1, v2, . . . , vn⟩ = ⟨v1, . . . , vn⟩ = V. If µ1w1 + ∑_{j=2}^{n} µj vj = 0, then µ1 ∑_{i=1}^{n} λi vi + ∑_{j=2}^{n} µj vj = 0 = µ1λ1v1 + ∑_{j=2}^{n} (µ1λj + µj)vj. Since (v1, . . . , vn) is linearly independent, this implies that µ1λ1 = 0. As λ1 ≠ 0, µ1 = 0 and therefore ∑_{j=2}^{n} µj vj = 0. Since (v1, . . . , vn) is linearly independent, so is the subfamily (v2, . . . , vn), and µ2 = . . . = µn = 0. Accordingly, (w1, v2, . . . , vn) is a basis of V.
Now we apply induction on m. Indeed, we have already dealt with the case m = 1, i.e. the base of the induction. Now suppose that m − 1 ≤ n and that (w1, . . . , wm−1, vm, . . . , vn) is a basis of V. Since (w1, . . . , wm) is linearly independent, wm ∉ ⟨w1, . . . , wm−1⟩ by Lemma 2.6. In particular, (w1, . . . , wm−1) is not a basis of V, so that m − 1 < n and m ≤ n, as required.

Now write wm as a linear combination of (w1, . . . , wm−1, vm, . . . , vn), i.e.

wm = ∑_{i=1}^{m−1} λi wi + ∑_{j=m}^{n} λj vj.

Since wm ∉ ⟨w1, . . . , wm−1⟩, there is j ∈ {m, . . . , n} such that λj ≠ 0. Perhaps upon renumbering the vectors vm, . . . , vn, we are free to assume that λm ≠ 0. Lemma 2.4 and the inductive assumption yield that

V = ⟨w1, . . . , wm−1, vm, . . . , vn⟩ = ⟨w1, . . . , wm, vm+1, . . . , vn⟩.

Now suppose that ∑_{i=1}^{m−1} µi wi + µm wm + ∑_{i=m+1}^{n} µi vi = 0. This means that

∑_{i=1}^{m−1} µi wi + µm (∑_{i=1}^{m−1} λi wi + ∑_{j=m}^{n} λj vj) + ∑_{i=m+1}^{n} µi vi = 0 = ∑_{i=1}^{m−1} (µi + µm λi)wi + λm µm vm + ∑_{i=m+1}^{n} (µi + µm λi)vi.

Since (w1, . . . , wm−1, vm, . . . , vn) is linearly independent and λm ≠ 0, we get λm µm = 0, so µm = 0. For i ≠ m, this yields that µi + µm λi = µi = 0. Hence (w1, . . . , wm, vm+1, . . . , vn) is a basis of V, as claimed.
The procedure used in the proof of 2.13 can be used to determine a basis that contains a given linearly independent family. We will see how this works further below, first listing some theoretical consequences. The first is a much weaker version of Theorem 2.13 and does not require an extra proof.
Lemma 2.14. If V is finitely generated, then every linearly independent family
in V can be extended to a basis.
Lemma 2.15. Assume that V is finitely generated. Let B = (v1 , . . . , vn ) and
B ′ = (u1 , . . . , um ) be bases of V . Then n = m.
Proof. The family B ′ is linearly independent and B is a basis, so m ≤ n by
Theorem 2.13. The family B is linearly independent and B ′ is a basis, so n ≤
m.
Lemma 2.15 says that the number of elements in a basis is an invariant of a
finitely generated vector space. This means that it only depends on the space
itself, not on the basis whose elements we are counting.
Definition 2.16. Assume that V is finitely generated. The dimension of V ,
dim V , is the number of vectors in a basis - and hence in any basis - of V .
Examples. 1. For n ∈ N we have found a basis of V = F n , namely the
standard basis (e1 , . . . , en ). We now know that every basis of F n has n
elements and that dim F n = n.

2. Suppose we have (finitely many) vectors v1 , . . . , vr in V and we know that
(v1 , . . . , vr ) is linearly independent. Then certainly U = ⟨v1 , . . . , vr ⟩ is
generated by the linearly independent family (v1 , . . . , vr ) and so dim U =
r. If we drop the assumption that (v1 , . . . , vr ) is linearly independent,
then we still know that U is generated by v1 , . . . , vr . By Lemma 2.10, we
can select a basis of U from the vectors (v1 , . . . , vr ), so that dim U ≤ r.
3. Recall that {0} = ⟨∅⟩ and thus dim{0} = 0. The only vector space of
dimension 0 is the null space.
4. Let 0 ̸= v ∈ V . We have seen that (v) is linearly independent, so dim⟨v⟩ =
1.
   
5. Let V = R⁴, v1 = (1, 2, 0, 4)^T, v2 = (−1, −2, 3, −4)^T.
If λ1v1 + λ2v2 = 0, then λ1 − λ2 = 0 = 3λ2, so λ1 = λ2 = 0. Hence (v1, v2) is linearly independent. We know that (e1, e2, e3, e4) is a basis of V and that v1 = e1 + 2e2 + 4e4. The vectors e1, e2, e4 have nonzero coefficients in this linear combination, and we can exchange any of them (but NOT e3) against v1 to form a new basis. We decide to replace e4 by v1 and continue with the new basis (v1, e1, e2, e3). Now we write v2 as a linear combination of this basis: v2 = −v1 + 3e3. The only member of the original basis that we are allowed to exchange against v2 is e3, and we obtain the basis (v1, v2, e1, e2).
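The exchange procedure can be mimicked mechanically: list the linearly independent vectors first, append the standard basis, and keep the pivot columns. A sympy sketch for example 5:

    from sympy import Matrix, eye

    v1 = Matrix([1, 2, 0, 4])
    v2 = Matrix([-1, -2, 3, -4])

    # Columns: v1, v2, then e1, ..., e4. The pivot columns of this matrix
    # extend (v1, v2) to a basis of R^4.
    M = Matrix.hstack(v1, v2, eye(4))
    _, pivots = M.rref()
    print(pivots)   # (0, 1, 2, 3): v1, v2, e1, e2 -- the basis found in the text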
Lemma 2.17. Let U ≤ V . If dim V = n, then dim U ≤ n.
Proof. A linearly independent family in U is linearly independent in V . Now
apply Theorem 2.13.
Remark. If the dimension of a vector space is n, then the Exchange Lemma
says that any n-element linearly independent family is a basis. In particular, V
is the only subspace of V of dimension n.
We are going to say "finite-dimensional" instead of "finitely generated" - we have seen that the two things are the same.

2.3 Sums and direct sums


Theorem 2.18. Suppose that V is finite-dimensional. Let U1 and U2 be sub-
spaces of V . Then

dim(U1 + U2 ) = dim U1 + dim U2 − dim(U1 ∩ U2 ).

Proof. Let dim V = n, dim U1 = k, dim U2 = m, and dim(U1 ∩ U2) = ℓ. We aim to show that dim(U1 + U2) = m + k − ℓ.
Lemma 2.17 says that ℓ ≤ m, k ≤ n. Let (u1, . . . , uℓ) be a basis of U1 ∩ U2. By Lemma 2.14, this basis can be extended to a basis (u1, . . . , uℓ, vℓ+1, . . . , vk) of U1 and a basis (u1, . . . , uℓ, wℓ+1, . . . , wm) of U2. We aim to show that B = (u1, . . . , uℓ, vℓ+1, . . . , vk, wℓ+1, . . . , wm) is a basis of U1 + U2. First of all, every element of U1 and every element of U2 is a linear combination of the elements of B. It follows directly from the definition U1 + U2 = ⟨U1 ∪ U2⟩ that B is a generating system of U1 + U2.
It remains to show that B is linearly independent: Suppose that ∑_{i=1}^{ℓ} λi ui + ∑_{i=ℓ+1}^{k} µi vi + ∑_{i=ℓ+1}^{m} αi wi = 0. Then ∑_{i=ℓ+1}^{m} αi wi = −(∑_{i=1}^{ℓ} λi ui + ∑_{i=ℓ+1}^{k} µi vi) ∈ U1 ∩ U2. This means that ∑_{i=ℓ+1}^{m} αi wi is a linear combination of (u1, . . . , uℓ), say ∑_{j=1}^{ℓ} βj uj. Now ∑_{i=ℓ+1}^{m} αi wi + ∑_{j=1}^{ℓ} (−βj)uj = 0. Since (u1, . . . , uℓ, wℓ+1, . . . , wm) is linearly independent, we obtain that αℓ+1 = . . . = αm = 0. Thus ∑_{i=1}^{ℓ} λi ui + ∑_{i=ℓ+1}^{k} µi vi = 0. However, (u1, . . . , uℓ, vℓ+1, . . . , vk) is linearly independent, so that λ1 = . . . = λℓ = µℓ+1 = . . . = µk = 0, completing the proof.
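The dimension formula is easy to test on concrete subspaces: dim(U1 + U2) is the rank of the matrix whose columns span U1 and U2 together, and dim(U1 ∩ U2) then drops out of Theorem 2.18. A sympy sketch with two planes in R³ (an example of ours):

    from sympy import Matrix

    # U1 = <e1, e2> and U2 = <e2, e3>, two planes in R^3.
    U1 = Matrix([[1, 0], [0, 1], [0, 0]])
    U2 = Matrix([[0, 0], [1, 0], [0, 1]])

    dim_U1 = U1.rank()                       # 2
    dim_U2 = U2.rank()                       # 2
    dim_sum = Matrix.hstack(U1, U2).rank()   # dim(U1 + U2) = 3

    # Theorem 2.18 rearranged: dim(U1 ∩ U2) = dim U1 + dim U2 - dim(U1 + U2).
    print(dim_U1 + dim_U2 - dim_sum)         # 1: the planes meet in the line <e2>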
Definition 2.19. • Let U1 and U2 be subspaces of V. We say that the sum U1 + U2 is direct, denoted U1 ⊕ U2, if U1 ∩ U2 = {0}.
• Let U be a subspace of V. A subspace W is called a complement of U in V if V = U ⊕ W.
Remark. If V is finite-dimensional, then, by Theorem 2.18, the sum of U1 and U2 is direct if and only if dim(U1 + U2) = dim U1 + dim U2.
Lemma 2.20. Suppose that V is finite-dimensional. Let U ≤ V . There is a
complement of U in V .
Proof. Let dim V = n and dim U = k. We know that k ≤ n by 2.17. Let (u1, . . . , uk) be a basis of U. Extend this basis to a basis (u1, . . . , uk, vk+1, . . . , vn) of V. Let W = ⟨vk+1, . . . , vn⟩. We claim that V = U ⊕ W. If z ∈ V, then z is a linear combination z = ∑_{i=1}^{k} λi ui + ∑_{i=k+1}^{n} λi vi. Since ∑_{i=1}^{k} λi ui ∈ U and ∑_{i=k+1}^{n} λi vi ∈ W, z ∈ U + W; hence V = U + W. Now dim W = n − k since W is generated by n − k linearly independent vectors, so dim U + dim W = k + (n − k) = n = dim V. By Theorem 2.18, dim V = dim U + dim W − dim(U ∩ W), so dim(U ∩ W) = 0 and U ∩ W = {0}.
   
Examples. Let V = R⁴, v1 = (1, 2, 0, 4)^T, v2 = (−1, −2, 3, −4)^T, U = ⟨v1, v2⟩. We have seen above that dim U = 2 and that (v1, v2, e1, e2) is a basis of V. So W = ⟨e1, e2⟩ is a complement of U in V.

Remark. Let V = U ⊕ W . Then every vector is a sum of an element of U
and one of W . If u, u′ ∈ U and w, w′ ∈ W , then u + w = u′ + w′ means that
u − u′ = w′ − w ∈ U ∩ W = {0}. So u = u′ and w = w′ . In other words, every
element of V is the sum of uniquely determined elements of U and of W .

2.4 Cosets and quotient spaces


2.4.1 Equivalence relations and partitions
Definition 2.21. Let S be a set. You will recall from "Sets and Logic" that a binary relation ∼ on S is called an equivalence relation if ∼ is
• reflexive: x ∼ x whenever x ∈ S,
• symmetric: if x ∼ y, then y ∼ x,
• transitive: if x ∼ y ∼ z, then x ∼ z.

Examples. The best-known equivalence relation is equality.


Definition 2.22. Let ∼ be an equivalence relation on the set S and let x ∈ S.
The equivalence class of x with respect to ∼ is the set

[x] = {y ∈ S | x ∼ y}.

Note that since ∼ is reflexive, x ∈ [x].


Theorem 2.23. Let ∼ be an equivalence relation on the set S and let x, y ∈ S.
Either [x] = [y] or [x] ∩ [y] = ∅.

Proof. Suppose that [x] ∩ [y] ≠ ∅. This means that there is z ∈ S such that x ∼ z and y ∼ z. Since ∼ is symmetric, z ∼ y, and since ∼ is transitive, x ∼ z ∼ y gives x ∼ y. Now if w is any element of [x], we have x ∼ w and x ∼ y, so y ∼ x ∼ w and y ∼ w. It follows that [x] ⊆ [y], and [y] ⊆ [x] is obtained in just the same way.
Theorem 2.23 and the fact that x ∈ [x] combine to say that the equivalence
classes of an equivalence relation on S form a partition of S. This means that
S is the disjoint union of equivalence classes; equivalently, every element of S
lies in exactly one equivalence class.

2.4.2 Cosets
Let V be a vector space over the field F and let U be a subspace of V.
Definition 2.24. We define the relation ∼U on V by

v ∼U w if v − w ∈ U.

Examples. Let V = R² and U = ⟨e1 + e2⟩. Since e1 − e2 ∉ U, e1 and e2 are not equivalent. However, e1 − (−e2) = e1 + e2 ∈ U, so e1 ∼U −e2.

Lemma 2.25. ∼U is an equivalence relation.
Proof. Let v, w, z ∈ V . Then v − v = 0 ∈ U , so ∼U is reflexive. If v − w ∈ U ,
then (−1)(v − w) = w − v ∈ U, so ∼U is symmetric. If v − w ∈ U ∋ w − z, then U ∋ (v − w) + (w − z) = v − z, so ∼U is transitive.
Lemma 2.26. Let v, v ′ , w, w′ ∈ V and λ ∈ F . If v + U = v ′ + U and
w + U = w′ + U , then (v + w) + U = (v ′ + w′ ) + U and λv + U = λv ′ + U .
Proof. "v + U = v′ + U and w + U = w′ + U" means that there are elements u1 and u2 of U such that v′ = v + u1 and w′ = w + u2. Then (v + w) − (v′ + w′) = (v − v′) + (w − w′) = −(u1 + u2) ∈ U and λv − λv′ = λ(v − v′) = −λu1 ∈ U.
Remark: Lemmas 2.25 and 2.26 combine to say that ∼U is a congruence relation
on V - an equivalence relation compatible with the vector space operations.
Definition 2.27. Let v ∈ V . The equivalence class of v with respect to ∼U is
called the coset of U containing v and denoted v + U .
Thus v + U = {w ∈ V | v − w ∈ U }.
Remark. v + U = {v + u | u ∈ U }.
Proof. If w = v + u with u ∈ U, then v − w = −u ∈ U and v ∼U w. If v − w ∈ U, then w = v + (w − v) and w − v = −(v − w) ∈ U, so w is of the form v + u with u ∈ U.
Examples. 1. The coset containing 0 is {0 + u | u ∈ U } = U.
2. Let V = R², U = ⟨e1 + e2⟩, v = e1. We have v + U = {e1 + λ(e1 + e2) | λ ∈ R} = {(1 + λ, λ)^T | λ ∈ R}.
3. Let V = R³ and U = ⟨e1 − e2, e1 + 2e3⟩.
Let v = e1.
We have U = {(a + b, −a, 2b)^T | a, b ∈ R} and
v + U = {(1 + a + b, −a, 2b)^T | a, b ∈ R}.
Let w = e2, x = e1 + e2 + e3.
Then v − w = e1 − e2 ∈ U,
so that v + U = w + U. On the other hand, v − x = −e2 − e3 = (0, −1, −1)^T. Assuming v − x ∈ U, there would have to exist a, b ∈ R with (0, −1, −1)^T = (a + b, −a, 2b)^T; but this means a = 1 = −b and 2b = −1, which is impossible. So x ∉ v + U. This means that (v + U) ∩ (x + U) = ∅.

Definition 2.28. V /U is the set of cosets of U in V . In other words, V /U =
{v + U | v ∈ V }.
Definition 2.29. [The quotient space] Let V be a vector space over the field F
and let U be a subspace of V . The elements of the quotient space V /U are the
cosets of U in V . Vector space operations on V /U are defined as follows: For
v, w ∈ V and λ ∈ F:
(v + U) + (w + U) = (v + w) + U,
λ(v + U) = λv + U.

Remark. Lemma 2.26 is responsible for the fact that this definition makes
sense - in other words, that the proposed operations on V /U are well-defined :
If v ′ ∈ v + U and w′ ∈ w + U , then (v + w) − (v ′ + w′ ) ∈ U ∋ λv − λv ′ .
Theorem 2.30. Definition 2.29 defines the structure of a vector space on V /U .
Proof. We shall only sketch this proof. Every vector space axiom in V/U follows from the fact that the same axiom is valid in V. As an example, ((v + U) + (w + U)) + (z + U) = ((v + w) + U) + (z + U) = ((v + w) + z) + U = (v + (w + z)) + U = (v + U) + ((w + z) + U) = (v + U) + ((w + U) + (z + U)).
The null vector of V/U is the coset 0 + U = {0 + u | u ∈ U} = U. For v ∈ V, the negative −(v + U) of v + U is (−v) + U.
Examples. We use the subspace U of R3 introduced in the previous round of
examples. Let v = e1 and w = e2 . Then (v + U ) + (w + U ) = e1 + e2 + U . Since
e1 − e2 ∈ U , e1 + e2 + U = e1 + e1 + U = 2e1 + U = 2(e1 + U ).
Theorem 2.31. Let V be a finite-dimensional vector space and let U ≤ V. Then dim V/U = dim V − dim U.
Proof. Let dim V = n and dim U = k. Given a basis (u1, . . . , uk) of U, there is a basis (u1, . . . , uk, vk+1, . . . , vn) of V. The proof will be complete once we have shown that the cosets vk+1 + U, . . . , vn + U form a basis of V/U.
Let v ∈ V. Then v is a linear combination v = λ1u1 + . . . + λkuk + µk+1vk+1 + . . . + µnvn. Since λ1u1 + . . . + λkuk ∈ U, v ∈ µk+1vk+1 + . . . + µnvn + U and v + U = µk+1vk+1 + . . . + µnvn + U = µk+1(vk+1 + U) + . . . + µn(vn + U). Hence the cosets vk+1 + U, . . . , vn + U span the space V/U. If αk+1(vk+1 + U) + . . . + αn(vn + U) = 0 + U = U, then w = αk+1vk+1 + . . . + αnvn ∈ U. This means that w is a linear combination β1u1 + . . . + βkuk, so that αk+1vk+1 + . . . + αnvn − β1u1 − . . . − βkuk = 0; since (u1, . . . , uk, vk+1, . . . , vn) is linearly independent, in particular αk+1 = . . . = αn = 0. So the cosets vk+1 + U, . . . , vn + U are linearly independent.
Examples. Let V = R³ and U = ⟨e1 − e2, e1 + 2e3⟩ as before.
If λ(e1 − e2) + µ(e1 + 2e3) = 0, then (λ + µ)e1 − λe2 + 2µe3 = 0, so that λ = µ = 0. It follows that U is spanned by two linearly independent vectors. This means dim U = 2. Hence dim V/U = dim V − dim U = 3 − 2 = 1.
Now let u1 = e1 − e2, u2 = e1 + 2e3. By the exchange lemma, (u1, e1, e3) is a basis of V. Since u2 = 0u1 + e1 + 2e3, (u1, u2, e1) is a basis of V, too. It follows that (e1 + U) is a basis of V/U.
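The computation of dim V/U reduces to a rank computation: dim U is the rank of a matrix whose columns span U. A sympy sketch for this example:

    from sympy import Matrix

    # U = <e1 - e2, e1 + 2 e3> inside V = R^3, spanning vectors as columns.
    U = Matrix([[1, 1],
                [-1, 0],
                [0, 2]])

    dim_V = 3
    dim_U = U.rank()       # 2
    print(dim_V - dim_U)   # 1 = dim V/U, as computed via Theorem 2.31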

3 Linear transformations
Definition 3.1. Let V1 and V2 be vector spaces over the field F and let T : V1 →
V2 be a function. Then T is called a linear transformation (or linear map or
homomorphism) if, for all u and v in V1 and λ ∈ F , we have
T (u + v) = T (u) + T (v) and T (λu) = λT (u).
 
Examples. 1. Let V1 = R³ and V2 = R². Define T((x1, x2, x3)^T) = (x1 − x2 + x3, x2 + 3x3)^T. T is a linear transformation.
Indeed, T((x1 + y1, x2 + y2, x3 + y3)^T) = (x1 + y1 − x2 − y2 + x3 + y3, x2 + y2 + 3x3 + 3y3)^T = (x1 − x2 + x3, x2 + 3x3)^T + (y1 − y2 + y3, y2 + 3y3)^T and T(λ(x1, x2, x3)^T) = (λx1 − λx2 + λx3, λx2 + 3λx3)^T = λ(x1 − x2 + x3, x2 + 3x3)^T.

2. Let V be a vector space over the field F . The zero map T given by
T (v) = 0 for all v in V is a linear map.
Let u, v ∈ V and λ ∈ F . Then T (u + v) = 0 = 0 + 0 = T (u) + T (v), while
T (λu) = 0 = λ0 = λT (u).
3. Let V be a vector space over the field F and let U ≤ V . The map T : V → V /U
given by T (v) = v + U (v ∈ V ) is a linear transformation.
If v, w ∈ V and λ ∈ F , then T (v + w) = (v + w) + U = (v + U ) + (w + U ) =
T (v) + T (w) and T (λv) = λv + U = λ(v + U ) = λT (v).
So T is a linear transformation. Also note that T is clearly a surjective map,
since V /U = {w + U | w ∈ V } = T (V ).
Definition 3.2. Let V be a vector space and U a subspace of V . The map
T : V → V /U given by T (v) = v + U is called the natural homomorphism from
V onto V /U .
Definition 3.3 (Images and preimages). Let S1 and S2 be sets and let f : S1 →
S2 be a function.
Given a subset A of S1 , f (A), the image of A under f , is the set {f (a) | a ∈ A}.
If B ⊆ S2 , the preimage f −1 (B) is the set {a ∈ S1 | f (a) ∈ B}. In other words,
an element of S1 is in f −1 (B) if and only if its f -image is in B.
The notation f⁻¹(B) might make it seem as if f has an inverse function. However, for every function, every subset of the codomain has a preimage; the function may or may not be bijective.
Examples. Let f : R → R⁺ be the function given by f(x) = x². Then f([−1, 1]) = [0, 1]. The preimage f⁻¹([2, 4]) is the set {x ∈ R | 2 ≤ x² ≤ 4} = [−2, −√2] ∪ [√2, 2].
Lemma 3.4. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2
be a linear transformation. Denote by 01 and 02 the respective null vectors of
V1 and V2 . Let k, n ∈ N. The following hold:

a) T (01 ) = 02 .
b) If U ≤ V1 , then T (U ) ≤ V2 .
c) If W ≤ V2 , then T −1 (W ) ≤ V1 .
d) If the family (v1 , . . . , vk ) is linearly dependent in V1 , then (T (v1 ), . . . , T (vk ))
is linearly dependent in V2 .
e) If (T (v1 ), . . . , T (vk )) is linearly independent in V2 , then (v1 , . . . , vk ) is
linearly independent in V1 .
f ) Let v1 , . . . , vk be vectors in V1 . Then T (⟨v1 , . . . , vk ⟩) = ⟨T (v1 ), . . . , T (vk )⟩.
g) If (v1 , . . . , vn ) is a generating system of V1 , then (T (v1 ), . . . , T (vn )) is a
generating system for T (V1 ). In particular, dim T (V1 ) ≤ dim V1 .

Proof. "a)" T(01) = T(0·01) = 0·T(01) = 02, by Lemma 1.4 e).


"b)" There is an element u in U and T(u) ∈ T(U) ≠ ∅. Suppose that v, w ∈
T (U ). There are vectors t and u in U such that v = T (t) and w = T (u). Since
U is closed with respect to addition, t + u ∈ U , and T (t + u) = T (t) + T (u) =
v + w ∈ T (U ). So T (U ) is closed with respect to addition. Let λ ∈ F . Then
λv = λT (t) = T (λt). As U is closed with respect to scalar multiplication,
λt ∈ U , so λv ∈ T (U ) and T (U ) is closed with respect to scalar multiplication.
”c)” Let W ≤ V2 . Recall that T −1 (W ) = {v ∈ V1 | T (v) ∈ W }. Since W is a
subspace of V2 , 02 ∈ W . By a), T (01 ) = 02 ∈ W and therefore 01 ∈ T −1 (W ) ̸=
∅. Let u, v ∈ T −1 (W ) and λ ∈ F . Then T (u) ∈ W ∋ T (v). Since W ≤ V2 ,
this means T (u) + T (v) ∈ W ; but T (u) + T (v) = T (u + v), so u + v ∈ T −1 (W ).
Finally, λT (u) ∈ W , and λT (u) = T (λu), so λu ∈ T −1 (W ).
The proof of all of the remaining statements of the lemma rests upon the fact that

T(∑_{i=1}^{k} λi vi) = ∑_{i=1}^{k} λi T(vi)     (1)

holds whenever v1, . . . , vk ∈ V1 and λ1, . . . , λk ∈ F.


"d)" If (v1, . . . , vk) is linearly dependent, then there is a k-tuple (λ1, . . . , λk) with at least one nonzero element, say λi, such that λ1v1 + . . . + λkvk = 01. By (1) and a), T(λ1v1 + . . . + λkvk) = λ1T(v1) + . . . + λkT(vk) = 02. Since λi ≠ 0, (T(v1), . . . , T(vk)) is linearly dependent.
"e)" Suppose that the family (T(v1), . . . , T(vk)) is linearly independent. If µ1v1 + µ2v2 + . . . + µkvk = 01, then, by equation (1) and by a), 02 = T(µ1v1 + µ2v2 + . . . + µkvk) = µ1T(v1) + . . . + µkT(vk). Since (T(v1), . . . , T(vk)) is linearly independent, it follows that µ1 = µ2 = . . . = µk = 0.
"f)" Let U = ⟨v1, . . . , vk⟩. From b), T(U) is a subspace of V2. Since T(vi) ∈ T(U) for i = 1, . . . , k, T(U) is a subspace of V2 containing T(v1), . . . , T(vk). Since subspaces are closed with respect to linear combinations, ⟨T(v1), . . . , T(vk)⟩ ⊆ T(U). For λ1, . . . , λk ∈ F, T(λ1v1 + . . . + λkvk) = λ1T(v1) + . . . + λkT(vk). So T(U) ≤ ⟨T(v1), . . . , T(vk)⟩.
”g)” If V1 = ⟨v1 , . . . , vn ⟩, then T (V1 ) = ⟨T (v1 ), . . . , T (vn )⟩ by f). Now assume
that dim V1 = n; this means that (v1 , . . . , vn ) is a basis of V1 . Since T (V1 ) is
spanned by the n vectors T (v1 ), . . . , T (vn ), dim T (V1 ) ≤ n. [Recall that every
finite generating system contains a basis].

3.1 Linear transformations and bases


Lemma 3.5. Let V1 and V2 be vector spaces over the same field. Suppose that
V1 has finite dimension n. Let (v1 , . . . , vn ) be a basis of V1 and let w1 , . . . , wn
be vectors in V2 . Then there is a uniquely determined linear transformation
T : V1 → V2 satisfying
T (vi ) = wi
for i = 1, . . . , n.
Proof. By Lemma 2.7 we know: For every vector u in V1, there is a unique
n-tuple (λ1 , λ2 , . . . , λn ) of scalars satisfying u = λ1 v1 + . . . + λn vn .
We define the map T : V1 → V2 by

T (λ1 v1 + λ2 v2 + . . . + λn vn ) = λ1 w1 + . . . + λn wn .

Since every vector in V1 can be written as linear combination of v1 , . . . , vn in


only one way, this definition assigns one value to every element of V1 , so the
definition produces a well-defined function. Let i ∈ {1, . . . , n}. Then vi =
0v1 + . . . + 0vi−1 + 1vi + 0vi+1 + . . . + 0vn , so T (vi ) = 0w1 + . . . + 0wi−1 + 1wi +
0wi+1 + . . . + 0wn = wi . So T (vi ) = wi for all i.
Let u, v ∈ V1 , u = λ1 v1 + λ2 v2 + . . . + λn vn , v = µ1 v1 + µ2 v2 + . . . + µn vn . Then
u + v = (λ1 + µ1 )v1 + (λ2 + µ2 )v2 + . . . + (λn + µn )vn and

T (u + v) = (λ1 + µ1 )w1 + (λ2 + µ2 )w2 + . . . + (λn + µn )wn =


= λ1 w1 + . . . + λn wn + µ1 w1 + . . . + µn wn =
= T (u) + T (v).

Let α ∈ F and let u be as before. We have αu = αλ1 v1 + αλ2 v2 + . . . + αλn vn .


Accordingly,

T (αu) = αλ1 w1 + αλ2 w2 + . . . + αλn wn =


= α(λ1 w1 + . . . + λn wn )
= αT (u).

So T is a linear transformation. We still need to show that T is the only


linear transformation to map vi to wi for i = 1, . . . , n. So consider a linear
transformation S that satisfies S(v1 ) = w1 , . . . , S(vn ) = wn . We need to show
that S(u) = T (u) for any vector u ∈ V1 .
As before, a vector in V1 is a linear combination u = λ1 v1 + λ2 v2 + . . . + λn vn .
Since S is a linear map, we have

S(u) = S(λ1 v1 + λ2 v2 + . . . + λn vn ) =
= λ1 S(v1 ) + λ2 S(v2 ) + . . . + λn S(vn ) =
= λ1 w1 + . . . + λn wn = T (u).

This completes the proof.

Lemma 3.5 summarised:

A linear transformation is determined by its values on a basis of its domain.


Given vector spaces V1 and V2 , where V1 is finite-dimensional, every map from
a basis of V1 into V2 extends to a linear transformation V1 → V2 .

Examples. a) Let V1 = R² and V2 = R³. Let T : V1 → V2 be the uniquely determined linear map satisfying T(e1) = (1, 2, −10)^T and T(e2) = (2, −10, 1)^T.
Compute T(v) for v = (x1, x2)^T ∈ V1.
b) Let V1 = R³ and V2 = R².
Let v1 = e1, v2 = e2 − e1, v3 = e1 + 2e3. Show that (v1, v2, v3) is a basis of V1.
Let T : V1 → V2 be the uniquely determined linear map satisfying T(v1) = (1, 2)^T, T(v2) = (0, 3)^T and T(v3) = (2, −10)^T.
Compute T(v) for v = (x1, x2, x3)^T ∈ V1.
Solution to a):
Note that T exists and is unique according to Lemma 3.5 and because (e1, e2) is a basis of V1.
Let V1 ∋ v = (x1, x2)^T. We have v = x1e1 + x2e2. It follows that
T(v) = x1T(e1) + x2T(e2) = x1(1, 2, −10)^T + x2(2, −10, 1)^T = (x1 + 2x2, 2x1 − 10x2, −10x1 + x2)^T.
Solution to b):
We prove that (v1, v2, v3) is linearly independent. Since dim V1 = 3, any three-element linearly independent family is a basis.
Suppose that λ1e1 + λ2(e2 − e1) + λ3(e1 + 2e3) = 0. We rewrite this as a linear combination of e1, e2, e3 and get (λ1 − λ2 + λ3)e1 + λ2e2 + 2λ3e3 = 0. Accordingly, λ1 − λ2 + λ3 = 0 = λ2 = 2λ3, so λ1 = λ2 = λ3 = 0.
Our next task is to write any given vector in V1 as a linear combination of v1, v2, v3. Let v ∈ V1, v = (x1, x2, x3)^T. If v = µ1v1 + µ2v2 + µ3v3, then (x1, x2, x3)^T = (µ1 − µ2 + µ3, µ2, 2µ3)^T.
So µ2 = x2, µ3 = (1/2)x3, and µ1 = x1 + x2 − (1/2)x3. This means that
v = (x1 + x2 − (1/2)x3)v1 + x2v2 + (1/2)x3v3. Thus
T(v) = (x1 + x2 − (1/2)x3)T(v1) + x2T(v2) + (1/2)x3T(v3) = (x1 + x2 − (1/2)x3)(1, 2)^T + x2(0, 3)^T + (1/2)x3(2, −10)^T = (x1 + x2 − (1/2)x3 + x3, 2x1 + 2x2 − x3 + 3x2 − 5x3)^T = (x1 + x2 + (1/2)x3, 2x1 + 5x2 − 6x3)^T.
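The two steps of solution b), solving for the coordinates of v in the basis and then combining the prescribed images, amount to solving one linear system. A sympy sketch (symbolic, so the printed result matches the formula just derived):

    from sympy import Matrix, symbols

    x1, x2, x3 = symbols('x1 x2 x3')

    # The basis (v1, v2, v3) of R^3 as columns, and the prescribed images
    # T(v1), T(v2), T(v3) as columns.
    B = Matrix([[1, -1, 1], [0, 1, 0], [0, 0, 2]])
    W = Matrix([[1, 0, 2], [2, 3, -10]])

    v = Matrix([x1, x2, x3])
    mu = B.solve(v)   # coordinates of v with respect to (v1, v2, v3)
    print(W * mu)     # Matrix([[x1 + x2 + x3/2], [2*x1 + 5*x2 - 6*x3]])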

3.2 Kernel and image


Definition 3.6. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2 be a linear transformation. As before, we denote the null vectors by 01 and 02, respectively.
• The kernel of T, denoted ker T, is the set {v ∈ V1 | T(v) = 02}.
• The image of T, denoted im T, is the set T(V1) = {T(u) | u ∈ V1}.

Lemma 3.7. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2 be
a linear transformation. Then ker T is a subspace of V1 and im T a subspace of V2 .

Proof. This follows straight from Lemma 3.4: V1 ≤ V1, so 3.4 b) says that im T = T(V1) ≤ V2. The kernel of T is the preimage of the subspace {02} of V2, so ker T ≤ V1 by 3.4 c).

Examples. We revisit our first example of a linear transformation.
Let V1 = R³, V2 = R², T((x1, x2, x3)^T) = (x1 − x2 + x3, x2 + 3x3)^T.
If ker T ∋ v = (x1, x2, x3)^T, then x1 − x2 + x3 = 0 = x2 + 3x3. So ker T = {(−4a, −3a, a)^T | a ∈ R} = ⟨(−4, −3, 1)^T⟩. To determine the image of a linear transformation, it is helpful to remember the last two statements in Lemma 3.4: Given a basis (v1, . . . , vn) of the domain V1 of the linear transformation S, the image im S is the span of S(v1), . . . , S(vn).
Returning to our example, the domain of T is R³, so im T = ⟨T(e1), T(e2), T(e3)⟩. We compute:
T(e1) = (1, 0)^T, T(e2) = (−1, 1)^T, T(e3) = (1, 3)^T.
Since (T(e1), T(e2)) is linearly independent and since dim V2 = 2, im T = V2.
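For a linear map given by a matrix, kernel and image are one call each in sympy. For the T above (the matrix columns are T(e1), T(e2), T(e3)):

    from sympy import Matrix

    # The matrix of T with respect to the standard bases.
    A = Matrix([[1, -1, 1],
                [0, 1, 3]])

    print(A.nullspace())    # [Matrix([[-4], [-3], [1]])]: ker T = <(-4, -3, 1)^T>
    print(A.columnspace())  # two independent columns, so im T = R^2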

Lemma 3.8. Let V1 and V2 be vector spaces over the field F and let T : V1 → V2 be
a linear transformation. Then:
a) For u, v ∈ V1 , T (u) = T (v) if and only if u − v ∈ ker T .
b) T is injective if and only if ker T = {01 }.

Proof. Let u, v ∈ V1. If T(u) = T(v), then T(u − v) = T(u) − T(v) = 02, so u − v ∈ ker T. Conversely, "u − v ∈ ker T" means that 02 = T(u − v) = T(u) − T(v), so T(u) = T(v). This proves a).
As to b), suppose that ker T = {01}. Let u, v ∈ V1. If T(u) = T(v), then, by a), u − v ∈ ker T = {01}, i.e. u − v = 01 and u = v. So T is injective. Conversely, assume that T is injective. By 3.4 a), T(01) = 02. Since T is injective, 01 must be the only vector in V1 with image 02. So ker T = {01}.

Note that Statement a) of 3.8 is equivalent to ”T (u) = T (v) if and only if u + ker T =
v + ker T ”.

Theorem 3.9. [Rank-nullity formula] Let V1 and V2 be vector spaces over the field
F . Let T : V1 → V2 be a linear transformation. Suppose that dim V1 < ∞. Then

dim V1 = dim ker T + dim im T .

Proof. Let dim V1 = n. By Lemma 3.8 f), dim im T ≤ n. Since ker T is a subspace of
V1 , dim ker T ≤ n, too. We let dim ker T = k and dim Im T = m. Note that we want
to establish that
n = m + k.

Let (u1 , . . . , uk ) be a basis of ker T and let (w1 , . . . , wm ) be a basis of im T . Given


that im T = {T (v) | v ∈ V1 }, there are vectors v1 , . . . , vm in V1 satisfying
T (vi ) = wi (1 ≤ i ≤ m).
We prove the following:
Claim: (u1 , . . . , uk , v1 , . . . , vm ) is a basis of V .
Let B = (u1 , . . . , uk , v1 , . . . , vm ). We show that B is linearly independent.
Suppose that
λ1 u1 + . . . + λk uk + µ1 v1 + . . . + µm vm = 01 .
Applying T on both sides and using that u1 , . . . , uk ∈ ker T , we get

02 = T (λ1 u1 + . . . + λk uk ) + µ1 T (v1 ) + . . . + µm T (vm ) =
= µ1 T (v1 ) + . . . + µm T (vm ) = µ1 w1 + . . . + µm wm .
Since the family (w1 , . . . , wm ) is linearly independent, this yields µ1 = . . . = µm = 0.
It follows that 01 = λ1 u1 + . . . + λk uk + µ1 v1 + . . . + µm vm = λ1 u1 + . . . + λk uk . Since
the family (u1 , . . . , uk ) is linearly independent, λ1 = . . . = λk = 0, too.
We show that B spans V1 . Let v ∈ V1 . Since (w1 , . . . , wm ) spans im T , T (v) is a linear
combination
T (v) = µ1 w1 + . . . + µm wm .
It follows that
T (v) = µ1 T (v1 ) + . . . + µm T (vm ) = T (µ1 v1 + . . . + µm vm ).

Lemma 3.8 a) implies that v − (µ1 v1 + . . . + µm vm ) ∈ ker T . Since ker T = ⟨u1 , . . . , uk ⟩,
this implies that v − (µ1 v1 + . . . + µm vm ) is a linear combination λ1 u1 + . . . + λk uk .
Consequently, v = µ1 v1 + . . . + µm vm + λ1 u1 + . . . + λk uk .
So B is linearly independent and spans V1 ; since B has k + m members, this finishes
the proof.
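The formula is easy to observe numerically: for any matrix, the rank equals dim im T and the dimension of the null space equals dim ker T . A sketch (NumPy and SciPy assumed; the random map is our own choice):

    import numpy as np
    from scipy.linalg import null_space

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 5))      # matrix of some T : R^5 -> R^3

    rank = np.linalg.matrix_rank(A)      # dim im T
    nullity = null_space(A).shape[1]     # dim ker T, computed independently
    print(rank + nullity == A.shape[1])  # True: dim ker T + dim im T = dim V1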

If the dimensions of V1 and V2 are finite and equal, Lemma 3.8 and the rank-nullity
formula have a particularly useful application:

Lemma 3.10. Let V1 and V2 be finite-dimensional vector spaces over the field F with
dim V1 = dim V2 = n. Let T : V1 → V2 be a linear transformation. Then the following
statements are equivalent:
a) T is injective.
b) dim ker T = 0.
c) dim im T = n.
d) T is surjective.

Proof. From Lemma 3.8, we know that T is injective if and only if ker T = {01 };
since the null space is the only vector space of dimension 0, T is injective if and
only if dim ker T = 0 if and only if n − dim ker T = n. By the rank-nullity formula,
n − dim ker T = dim im T , so T is injective if and only if dim im T = n. Since
dim V2 = n and the only n-dimensional subspace of an n-dimensional vector space is the
space itself, T is injective if and only if im T = V2 ; and that means T is surjective.

3.2.1 The first homomorphism theorem for vector spaces


The following theorem can be thought of as ”the real meaning of” Theorem 3.9;
indeed, the rank-nullity formula follows from it. We require a definition first.

Definition 3.11. Two vector spaces W1 and W2 over the field F are called isomorphic,
denoted W1 ≅ W2 , if there is a bijective linear transformation W1 → W2 .

Theorem 3.12. [First homomorphism theorem for vector spaces] Let V1 and V2 be
vector spaces over the field F and let T : V1 → V2 be a linear transformation. Then

V1 / ker T ≅ im T .

Proof. Let u, v ∈ V1 . By Lemma 3.8 and the remark directly below it, T (u) = T (v) if
and only if u + ker T = v + ker T .
In other words, T is constant on each coset of ker T in V1 . This makes it possible to
define a map T̄ : V1 / ker T → im T by
T̄ (u + ker T ) = T (u).
The map T̄ assigns to the coset u + ker T the value T (u) which is the same for all
members of the coset.
For u, v ∈ V1 and λ ∈ F , we have
T̄ ((u + ker T ) + (v + ker T )) =
= T̄ (u + v + ker T ) = T (u + v) = T (u) + T (v) = T̄ (u + ker T ) + T̄ (v + ker T ) and
T̄ (λ(u + ker T )) = T̄ (λu + ker T ) =
= T (λu) = λT (u) = λT̄ (u + ker T ).

So T̄ is a linear transformation.
What is ker T̄ ? If T̄ (u + ker T ) = 02 , then T (u) = 02 , i.e. u ∈ ker T , so u + ker T =
01 + ker T . Hence the only element of the kernel of T̄ is the null vector of V1 / ker T .
By Lemma 3.8 b), T̄ is injective.
Since im T = {T (v) | v ∈ V1 } = {T̄ (v + ker T ) | v ∈ V1 } = im T̄ , T̄ is also surjective.
This completes the proof.

3.3 Composition of linear transformations


Let V1 , V2 , V3 be vector spaces over the field F and let T : V1 → V2 and S : V2 → V3
be maps. Recall that the composition ST is the map V1 → V3 given by

ST (v) = S(T (v)) (v ∈ V1 ).

Lemma 3.13. Let V1 , V2 , V3 be vector spaces over the field F and let T : V1 → V2
and S : V2 → V3 be linear transformations. Then ST is a linear transformation.

Proof. Let u, v ∈ V1 and λ ∈ F . Then


ST (u + v) = S(T (u + v)) = S(T (u) + T (v)) = S(T (u)) + S(T (v)) = ST (u) + ST (v)
and ST (λu) = S(T (λu)) = S(λT (u)) = λS(T (u)) = λST (u).

Examples. Let V1 = R3 , V2 = R2 , V3 = R. Let T : V1 → V2 and S : V2 → V3 be
defined as follows:
T ((x1 , x2 , x3 )^t ) = (x1 + x2 + x3 , x1 + 2x3 )^t ;
S is the unique linear transformation V2 → V3 satisfying
S(e1 + e2 ) = −1 and S(e1 − e2 ) = 1.
Compute ST ((x1 , x2 , x3 )^t ) and determine dim ker ST .

Solution:
Please verify that (e1 + e2 , e1 − e2 ) is a basis of R2 .
We compute S(v) for V2 ∋ v = (x1 , x2 )^t . If v = λ(e1 + e2 ) + µ(e1 − e2 ), we have
x1 = λ + µ and
x2 = λ − µ,
so that λ = (x1 + x2 )/2 and µ = (x1 − x2 )/2.
It follows that

S(v) = S( (x1 + x2 )/2 (e1 + e2 ) + (x1 − x2 )/2 (e1 − e2 ) ) =
= (x1 + x2 )/2 S(e1 + e2 ) + (x1 − x2 )/2 S(e1 − e2 ) = −(x1 + x2 )/2 + (x1 − x2 )/2 = −x2 .

We verify that T is a linear transformation. Indeed,

T ((x1 + y1 , x2 + y2 , x3 + y3 )^t ) = (x1 + y1 + x2 + y2 + x3 + y3 , x1 + y1 + 2x3 + 2y3 )^t =
= (x1 + x2 + x3 , x1 + 2x3 )^t + (y1 + y2 + y3 , y1 + 2y3 )^t ,

while

T (λ(x1 , x2 , x3 )^t ) = (λx1 + λx2 + λx3 , λx1 + 2λx3 )^t = λ(x1 + x2 + x3 , x1 + 2x3 )^t .
Finally,

ST ((x1 , x2 , x3 )^t ) = S((x1 + x2 + x3 , x1 + 2x3 )^t ) = −x1 − 2x3 .

All that’s left is to find dim ker ST . While it is possible and not difficult to compute
this using the above equation, we shall use Theorem 3.9: indeed, ST (e1 ) = −1, so
im ST is a subspace of the one-dimensional space V3 = R and has dimension at least 1.
It follows that dim im ST = 1 and therefore dim ker ST = 3 − 1 = 2.
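In matrix terms (anticipating Section 3.6), composition is matrix multiplication; the following sketch checks the result with the standard matrices of S and T (our own illustration, NumPy assumed):

    import numpy as np

    MT = np.array([[1., 1., 1.],    # standard matrix of T
                   [1., 0., 2.]])
    MS = np.array([[0., -1.]])      # standard matrix of S, since S((x1,x2)^t) = -x2

    MST = MS @ MT                   # standard matrix of ST
    print(MST)                      # [[-1.  0. -2.]], i.e. ST(x) = -x1 - 2x3
    print(3 - np.linalg.matrix_rank(MST))   # dim ker ST = 2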

3.3.1 Reminder on functions


Let A and B be sets and let f : A → B be a function.

• f is injective (one-to-one) if, for a, a′ ∈ A, f (a) = f (a′ ) implies that
a = a′ .
• f is surjective (onto) if {f (a) | a ∈ A} = B.
• f is bijective if f is both injective and surjective.
• The identity function on a set S, denoted idS , is the function given by
idS (s) = s (s ∈ S).
• f is bijective if and only if there is a function g : B → A such that f ◦ g =
idB and g ◦ f = idA . If f is bijective, the function g is uniquely determined
by the conditions named.
The function g is the inverse of f and usually denoted by f −1 . The
definition of f −1 may be summarised as:

For b ∈ B, f (a) = b if and only if f −1 (b) = a.

• The inverse of a bijective function is bijective; indeed, (f −1 )−1 = f .

3.4 The inverse of a linear bijection


Definition 3.14. Let V be a vector space over the field F . The identity map
on V is denoted by idV and given by idV (v) = v (v ∈ V ). Note that idV is a
linear transformation.

In the reminder on functions above it was mentioned that a function f : A →


B is bijective if and only if there exists an inverse function f −1 and that f −1 is
completely determined by the conditions f −1 f = idA and f f −1 = idB .
The next lemma shows that the inverse of a linear bijection is a linear map.

Lemma 3.15. Let V1 and V2 be vector spaces and let T : V1 → V2 be a bijective
linear transformation. Then the inverse map T −1 : V2 → V1 , defined by T T −1 =
idV2 and T −1 T = idV1 is a linear transformation V2 → V1 .
Proof. Let w and z be vectors in V2 . Since T is a linear map,
T (T −1 (w) + T −1 (z)) = T (T −1 (w)) + T (T −1 (z)) =
= T T −1 (w) + T T −1 (z) = idV2 (w) + idV2 (z) = w + z.

From the definition of an inverse function, T T −1 = idV2 , so that

T T −1 (w + z) = w + z, i.e.
T (T −1 (w) + T −1 (z)) = T (T −1 (w + z)).

Since T is injective, it follows that T −1 (w) + T −1 (z) = T −1 (w + z).


Let λ ∈ F . Since T is linear,
T (T −1 (λw)) = λw = λT (T −1 (w)) = T (λT −1 (w)).

Since T is injective, it follows that T −1 (λw) = λT −1 (w). This finishes the


proof.
   
Examples. Let V = R2 . Let T : V → V be given by T ((x1 , x2 )^t ) = (x1 + x2 , 2x1 )^t .
Show that T is a bijective linear transformation and determine T −1 .
Solution:
I shall leave it to you to show that T is a linear transformation (please check you
can do this).
If ker T ∋ v = (x1 , x2 )^t , then x1 + x2 = 0 = 2x1 , so x1 = 0 and hence x2 = 0. So
ker T = {0}, i.e. T is injective (Lemma 3.8). By Lemma 3.10, T is also surjective.
Let w1 = T (e1 ) = (1, 2)^t and w2 = T (e2 ) = (1, 0)^t . Since im T = ⟨w1 , w2 ⟩ = V ,
(w1 , w2 ) is a basis of V .
Let V ∋ v = (x1 , x2 )^t . We write v as a linear combination of w1 and w2 , i.e.

(x1 , x2 )^t = λ(1, 2)^t + µ(1, 0)^t .

This means x1 = λ + µ and x2 = 2λ, i.e. λ = (1/2)x2 and µ = x1 − (1/2)x2 . It follows that

T −1 (v) = T −1 ( (1/2)x2 T (e1 ) + (x1 − (1/2)x2 )T (e2 ) ) =
= (1/2)x2 T −1 T (e1 ) + (x1 − (1/2)x2 )T −1 T (e2 ) =
= (1/2)x2 e1 + (x1 − (1/2)x2 )e2 = ((1/2)x2 , x1 − (1/2)x2 )^t .
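Again one can let the machine confirm the result; a sketch with the standard matrices of T and of the map just computed (NumPy assumed):

    import numpy as np

    A = np.array([[1., 1.],          # standard matrix of T
                  [2., 0.]])
    A_inv = np.array([[0.,  0.5],    # matrix of ((1/2)x2, x1 - (1/2)x2)^t
                      [1., -0.5]])

    print(np.allclose(A @ A_inv, np.eye(2)))      # True
    print(np.allclose(A_inv @ A, np.eye(2)))      # True
    print(np.allclose(A_inv, np.linalg.inv(A)))   # True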

Definition 3.16. Let T : V1 → V2 be a bijective linear transformation. The
inverse S : V2 → V1 of T , i.e. the linear map S : V2 → V1 uniquely determined
by ST = idV1 and T S = idV2 is denoted T −1 .
Lemma 3.17. Let V1 , V2 , and V3 be vector spaces over F . Let T : V1 → V2 and
S : V2 → V3 be linear bijections. Then:
a) ST is bijective and (ST )−1 = T −1 S −1 .
b) (T −1 )−1 = T .

Proof. These facts are probably well known to you from the general theory of
functions.
For a), we note that ST (T −1 S −1 ) = S(T T −1 )S −1 = S idV2 S −1 = SS −1 = idV3
and T −1 S −1 (ST ) = T −1 (S −1 S)T = T −1 idV2 T = idV1 .
b) The equalities T T −1 = idV2 and T −1 T = idV1 imply that T −1 is bijective
with inverse T .
Concluding the section, we present a lemma that lessens the work required to
compute the inverse of a given linear transformation.
Lemma 3.18. Let V1 and V2 be vector spaces over F of the same (finite)
dimension n. Let T : V1 → V2 be a bijective linear transformation and S : V2 →
V1 a linear transformation. Each of the following is equivalent to S = T −1 :
a) ST = idV1 .
b) T S = idV2 .
Proof. Recall that S = T −1 is equivalent to ST = idV1 and T S = idV2 ; so we
need to show that, in the presence of the conditions of the lemma, a) and b) are
equivalent.
Let (v1 , . . . , vn ) be a basis of V1 . By Lemma 3.4, V2 = im T = ⟨T (v1 ), . . . , T (vn )⟩,
so that (T (v1 ), . . . , T (vn )) must be a basis of V2 .
Assume that ST = idV1 ; then S(T (vi )) = vi for i = 1, . . . , n. It follows that
(T S)T (vi ) = T (ST (vi )) = T (vi ) whenever i ∈ {1, . . . , n}. As mentioned above,
(T (v1 ), . . . , T (vn )) is a basis of V2 , and, by Lemma 3.4, the only linear map
V2 → V2 that maps every element of a basis to itself is idV2 . Thus a) implies b).
Conversely, suppose that T S = idV2 . Then, for i = 1, . . . , n, T S(T (vi )) =
T (vi ) = T (ST (vi )). Since T is injective, it follows that vi = ST (vi ) for all i, so
that Lemma 3.5 says that ST = idV1 .

Examples. Let V = R3 , T : V → V be defined by

T ((x1 , x2 , x3 )^t ) = (x1 + x2 + x3 , x1 , x2 )^t .

Show that T is a linear bijection and compute T −1 .

Solution:

T ((x1 + y1 , x2 + y2 , x3 + y3 )^t ) = (x1 + y1 + x2 + y2 + x3 + y3 , x1 + y1 , x2 + y2 )^t =
= (x1 + x2 + x3 , x1 , x2 )^t + (y1 + y2 + y3 , y1 , y2 )^t ;
T ((λx1 , λx2 , λx3 )^t ) = λ(x1 + x2 + x3 , x1 , x2 )^t .

So T is a linear transformation. Let ker T ∋ v = (x1 , x2 , x3 )^t . Since x1 = x2 = 0 =
x1 + x2 + x3 , we get x3 = x1 = x2 = 0, so that v = 0. So T is injective (Lemma 3.8);
by Theorem 3.9, dim im T = 3 and T is surjective.
As before, we compute the images of the standard basis vectors:

T (e1 ) = (1, 1, 0)^t , T (e2 ) = (1, 0, 1)^t , T (e3 ) = (1, 0, 0)^t .

The map T −1 is the unique linear map V → V with T −1 (T (ei )) = ei (i = 1, 2, 3).
Let V ∋ v = (x1 , x2 , x3 )^t . If v = λ1 T (e1 ) + λ2 T (e2 ) + λ3 T (e3 ), we have

x1 = λ1 + λ2 + λ3
x2 = λ1
x3 = λ2 ,

i.e. λ1 = x2 , λ2 = x3 , λ3 = x1 − x2 − x3 . It follows that

T −1 ((x1 , x2 , x3 )^t ) = T −1 (x2 T (e1 ) + x3 T (e2 ) + (x1 − x2 − x3 )T (e3 )) =
= x2 e1 + x3 e2 + (x1 − x2 − x3 )e3 = (x2 , x3 , x1 − x2 − x3 )^t .

3.5 The matrix of a linear transformation


Let V be a vector space of dimension n with basis B = (v1 , . . . , vn ). We know
that for every vector v in V there is a unique vector (λ1 , . . . , λn )^t in F n with
v = λ1 v1 + . . . + λn vn . This gives rise to a map TB : V → F n called the coordinate
map with respect to B.

Definition 3.19. With the notation introduced just above,

TB (λ1 v1 + . . . + λn vn ) = (λ1 , . . . , λn )^t .
Remark. Since (λ1 v1 + . . . + λn vn ) + (µ1 v1 + . . . + µn vn ) = (λ1 + µ1 )v1 + . . . + (λn + µn )vn
and λ(λ1 v1 + . . . + λn vn ) = (λλ1 )v1 + . . . + (λλn )vn , TB is a linear map. If
TB (λ1 v1 + . . . + λn vn ) = TB (µ1 v1 + . . . + µn vn ), then λi = µi holds for 1 ≤ i ≤ n.
Thus TB is injective; it is also clearly surjective, since for any (λ1 , . . . , λn )^t in F n ,
the vector λ1 v1 + . . . + λn vn in V is a preimage.

We immediately obtain:

Lemma 3.20 (The coordinate isomorphism). Let V be an n-dimensional vector


space over F . Then V ≅ F n .
Let V1 and V2 be finite-dimensional vector spaces over the field F . Let B =
(v1 , . . . , vn ) be a basis of V1 and B ′ = (w1 , . . . , wm ) a basis of V2 . Let T : V1 → V2
be a linear transformation. We write the images T (v1 ), . . . , T (vn ) as linear
combinations of the vectors of B ′ , as follows:

T (v1 ) = a11 w1 + a21 w2 + . . . + am1 wm
T (v2 ) = a12 w1 + a22 w2 + . . . + am2 wm
...............
T (vj ) = a1j w1 + a2j w2 + . . . + amj wm
...............
T (vn ) = a1n w1 + a2n w2 + . . . + amn wm

In other words,

for j = 1, . . . , n, T (vj ) = a1j w1 + a2j w2 + . . . + amj wm ,

and thus the coordinate vector TB′ (T (vj )) ∈ F m is (a1j , a2j , . . . , amj )^t .
Definition 3.21. With the notation introduced above, the matrix of T with
respect to B and B ′ , denoted B′ MB (T ), is the m × n-matrix

B′ MB (T ) = (aij ), 1 ≤ i ≤ m, 1 ≤ j ≤ n,

whose j-th column (a1j , . . . , amj )^t is the coordinate vector TB′ (T (vj )).
 
Examples. Let T : R2 → R3 be given by T ((x1 , x2 )^t ) = (x2 , x1 , x2 − x1 )^t . Let
B = (v1 , v2 ) = (e2 , e1 − e2 ) and B ′ = (w1 , w2 , w3 ) = (e3 , e1 + e2 , e2 ).10 We
have

T (v1 ) = (1, 0, 1)^t = 1w1 + 1w2 − 1w3 ,
T (v2 ) = (−1, 1, −2)^t = −2w1 − 1w2 + 2w3 .

It follows that

B′ MB (T ) = ( 1  −2 )
             ( 1  −1 ) .
             ( −1  2 )
We continue to use the notation introduced above.
Theorem 3.22. Let v ∈ V1 . The coordinate vector of T (v) with respect to the
basis B ′ is the result of multiplying the matrix B′ MB (T ) and the coordinate
vector of v with respect to the basis B. In other words,
TB′ (T (v)) = B′ MB (T ) · TB (v).
Proof. Write v = λ1 v1 + . . . + λn vn . Then

T (v) = Σ_{j=1}^{n} λj T (vj ) = Σ_{j=1}^{n} λj Σ_{i=1}^{m} aij wi = Σ_{i=1}^{m} ( Σ_{j=1}^{n} aij λj ) wi .

Now, by the rules of matrix multiplication,

B′ MB (T ) · TB (v) = (aij ) · (λ1 , . . . , λn )^t = ( Σ_{j=1}^{n} a1j λj , . . . , Σ_{j=1}^{n} amj λj )^t ,
10 Verify that T is linear and B and B′ are bases of R2 and R3 , respectively.

completing the proof.
Examples (continued). We continue with the map T introduced in the previous
example. We take v = e1 (in R2 ). Since v = e2 + (e1 − e2 ) = v1 + v2 , TB (v) = (1, 1)^t .
Moreover, T (v) = (0, 1, −1)^t = e2 − e3 = −w1 + w3 . So TB′ (T (v)) = (−1, 0, 1)^t .
Now we compute B′ MB (T ) · TB (v). This is

( 1  −2 )   ( 1 )   ( −1 )
( 1  −1 ) · ( 1 ) = (  0 ) ,
( −1  2 )           (  1 )

which is as it should be.
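The matrix B′ MB (T ) can also be computed mechanically: if P carries B-coordinates to standard coordinates of V1 and Q does the same for B ′ and V2 , then B′ MB (T ) = Q−1 AP , where A is the standard matrix of T . A sketch (the matrix names are ours; NumPy assumed):

    import numpy as np

    P = np.array([[0.,  1.],         # columns: v1 = e2, v2 = e1 - e2
                  [1., -1.]])
    Q = np.array([[0., 1., 0.],      # columns: w1 = e3, w2 = e1 + e2, w3 = e2
                  [0., 1., 1.],
                  [1., 0., 0.]])
    A = np.array([[ 0., 1.],         # standard matrix of
                  [ 1., 0.],         #   T((x1,x2)^t) = (x2, x1, x2 - x1)^t
                  [-1., 1.]])

    M = np.linalg.inv(Q) @ A @ P     # = B' M_B(T)
    print(M)                         # [[ 1. -2.] [ 1. -1.] [-1.  2.]]
    print(M @ np.array([1., 1.]))    # [-1.  0.  1.] = T_B'(T(v)) for v = e1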

3.6 Matrix multiplication and composition of linear maps


Let T , V1 , V2 , B, B ′ , and B′ MB (T ) = (aij ) be as before. We shall require
a third finite-dimensional vector space V3 with basis B” = (z1 , . . . , zk ) and a
linear transformation S : V2 → V3 . We show:
Theorem 3.23. B” MB (ST ) = B” MB′ (S) B′ MB (T ).
Proof. We recall that T (vj ) = Σ_{i=1}^{m} aij wi holds for 1 ≤ j ≤ n. Let B” MB′ (S) =
(brs ), i.e. for s = 1, . . . , m, S(ws ) = Σ_{r=1}^{k} brs zr . Now pick j ∈ {1, . . . , n}. Then

ST (vj ) = S( Σ_{i=1}^{m} aij wi ) = Σ_{i=1}^{m} aij S(wi ) = Σ_{i=1}^{m} Σ_{r=1}^{k} aij bri zr =
= Σ_{r=1}^{k} ( Σ_{i=1}^{m} bri aij ) zr .

Now, by the rules of matrix multiplication, B” MB′ (S) B′ MB (T ) = (brs )(aij ) is the
k × n-matrix (crj ) where, for j = 1, . . . , n and r = 1, . . . , k,

crj = Σ_{i=1}^{m} bri aij .

Comparing with the computation of ST (vj ) above, the j-th column of this product
is exactly TB” (ST (vj )). This completes the proof.

Examples. Let V1 , V2 , and T be as before. Let V3 = R2 , let S : V2 → V3 be given by
S((x1 , x2 , x3 )^t ) = (x2 − x1 , x2 − x3 )^t and let B” = (z1 , z2 ) = (e1 , e1 + e2 ). We
have ST (v1 ) = S((1, 0, 1)^t ) = (−1, −1)^t = −z2 and ST (v2 ) = S((−1, 1, −2)^t ) =
(2, 3)^t = −z1 + 3z2 . We have

B” MB′ (S) = ( 1  −1  0 )
             ( −1  1  1 ) . 11

Thus

B” MB′ (S) B′ MB (T ) = ( 1  −1  0 )  ( 1  −2 )   ( 0  −1 )
                        ( −1  1  1 )  ( 1  −1 ) = ( −1  3 ) ,
                                      ( −1  2 )

which is as it should be.
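The same check by machine; a one-line confirmation of Theorem 3.23 in this example (matrices as computed above, NumPy assumed):

    import numpy as np

    MS = np.array([[ 1., -1., 0.],   # B'' M_B'(S)
                   [-1.,  1., 1.]])
    MT = np.array([[ 1., -2.],       # B' M_B(T)
                   [ 1., -1.],
                   [-1.,  2.]])

    print(MS @ MT)   # [[ 0. -1.] [-1.  3.]] = B'' M_B(ST)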
Definition 3.24. Let r ∈ N. The r × r-unit matrix Ir is defined by Ir = (dij )1≤i,j≤r ,
where dij = 1 if i = j and dij = 0 if i ̸= j.
In other words, Ir has 1’s on the diagonal and 0’s everywhere else.
We note a corollary of Theorem 3.23:
Corollary 3.25. Suppose that T is bijective. Then n = m and

B′ MB (T ) B MB′ (T −1 ) = In = B MB′ (T −1 ) B′ MB (T ).

Proof. The rank-nullity theorem (Theorem 3.9) yields n = m. By Theorem 3.23,

B′ MB (T ) B MB′ (T −1 ) = B′ MB′ (T T −1 ) = B′ MB′ (idV2 ).

For 1 ≤ i ≤ m, idV2 (wi ) = wi = 0w1 + . . . + 0wi−1 + 1wi + 0wi+1 + . . . + 0wm .
It follows that B′ MB′ (idV2 ) = Im .
Analogously,

B MB′ (T −1 ) B′ MB (T ) = B MB (T −1 T ) = B MB (idV1 ) = In .

Remark. It should be noted that Corollary 3.25 says that a linear transformation
is bijective if and only if it is represented by an invertible matrix.

3.6.1 Change of base


The matrix B′ MB (T ) depends on B′ , B, and T . To forge a relation between different
matrix representations of the same linear transformation, we use Theorem 3.23. We
only need one more observation: the identity map is clearly linear and bijective.
Throughout this subsection, we retain all previous notation. Let B1 be a basis of V1
and B1′ be a basis of V2 . Since T = idV2 ◦ T ◦ idV1 , Theorem 3.23 yields that
11 The verification is a question on your tutorial worksheet.

Remark.

B1′ MB1 (T ) = B1′ MB1 (idV2 T idV1 ) = B1′ MB′ (idV2 ) B′ MB (T ) B MB1 (idV1 ).
By Corollary 3.25, B MB1 (idV1 ) is an invertible matrix; indeed, B MB1 (idV1 ) B1 MB (idV1 ) =
B MB (idV1 ) = In . If X is any invertible r × r-matrix with coefficients in F , then the
column vectors of X form a basis B2 of F r ; letting B2′ be the standard basis of F r , we
have X = B2′ MB2 (idF r ).
Lemma 3.26. Let A = B′ MB (T ) and let B be an m × n-matrix with coefficients in
F . The following statements are equivalent:
• There are bases B1 of V1 and B1′ of V2 such that B = B1′ MB1 (T ).
• There are invertible matrices Y of size n × n and X of size m × m over F such
that B = XAY .
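The second statement of Lemma 3.26 is easy to experiment with: multiplying A by invertible matrices on either side changes the bases but still represents the same map T , so, for instance, the rank is unchanged. A small sketch with random matrices (our own illustration, NumPy assumed):

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 2))          # some B' M_B(T)
    X = rng.standard_normal((3, 3))          # change of basis in V2
    Y = rng.standard_normal((2, 2))          # change of basis in V1
    assert abs(np.linalg.det(X)) > 1e-9 and abs(np.linalg.det(Y)) > 1e-9

    B = X @ A @ Y                            # = B1' M_B1(T) for suitable bases
    print(np.linalg.matrix_rank(B) == np.linalg.matrix_rank(A))   # True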

4 Inner product spaces


Throughout this chapter, V is a vector space over R.

Definition 4.1. • A map f : V × V → R is called an inner product on V
if the following three requirements are met:
a) f is bilinear: For u, v, w ∈ V and λ ∈ R,
f (u + v, w) = f (u, w) + f (v, w), f (λu, w) = λf (u, w),
f (u, v + w) = f (u, v) + f (u, w), f (u, λw) = λf (u, w).
b) f is symmetric: For u, v ∈ V , f (u, v) = f (v, u).
c) f is positive definite: f (u, u) > 0 whenever 0 ̸= u ∈ V .
• If f is an inner product on V , then the pair (V, f ) is called an inner
product space. We also say V is equipped with the inner product f .
Examples. 1) Let V = Rn . The dot product on V is defined by

u · v = u^t v.

In other words,

(x1 , . . . , xn )^t · (y1 , . . . , yn )^t = (x1 , x2 , . . . , xn ) (y1 , . . . , yn )^t = Σ_{i=1}^{n} xi yi .

We verify that the dot product is an inner product: Recall the rules (A + B)^t =
A^t + B^t , (AB)^t = B^t A^t , (A^t )^t = A, A(B + C) = AB + AC of matrix
multiplication.
Let u, v ∈ V . Since u^t v ∈ R, u^t v = (u^t v)^t = v^t u, so the dot product is
symmetric.
Given u, v, w ∈ V and λ ∈ R, we have

u · (v + w) = u^t (v + w) = u^t v + u^t w

and

u · (λv) = u^t (λv) = λu^t v.
Linearity in the first argument follows from symmetry.
Letting v = (x1 , . . . , xn )^t ,

v^t v = Σ_{i=1}^{n} xi^2 ,

so v^t v = 0 if and only if x1 = x2 = . . . = xn = 0, i.e. v = 0, and v^t v > 0 otherwise.
Note: The dot product is also called the canonical inner product on Rn .

Examples. We continue with more examples:


2) Let V = R2 and let f : V × V → R be defined by

f ((x1 , x2 )^t , (y1 , y2 )^t ) = 2x1 y1 − x2 y1 − x1 y2 + x2 y2 .

The map f is bilinear and symmetric (check), while, whenever x1 , x2 ∈ R, we have
f ((x1 , x2 )^t , (x1 , x2 )^t ) = 2x1^2 − 2x1 x2 + x2^2 = x1^2 + (x1 − x2 )^2 ≥ 0, with equality
only if x1^2 = 0 = (x1 − x2 )^2 , i.e. x1 = x2 = 0. So f is positive definite, hence an
inner product.
3) The general example: Let n ∈ N, V = Rn and A ∈ M (n × n, R). Suppose that
A is a symmetric matrix - this means A = A^t . The map f : V × V → R given by

f (u, v) = u^t Av

is bilinear and symmetric (yet not always positive definite).

Proof: Let u, v, w ∈ V , λ, µ ∈ R. Then

f (v, u) = v^t Au = (v^t Au)^t = u^t A^t v = u^t Av = f (u, v).

For bilinearity,

f (λu + µv, w) = (λu + µv)^t Aw = (λu^t + µv^t )Aw = λu^t Aw + µv^t Aw = λf (u, w) + µf (v, w);

linearity in the second argument then follows from symmetry.

Remark. The dot product on Rn is a particular case of the examples in 3) - take


A = In .
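A sketch of examples 1) to 3) in code; for a symmetric A, positive definiteness can be tested via the eigenvalues of A (a standard criterion not proved in these notes; NumPy assumed):

    import numpy as np

    def f(u, v, A):
        # The bilinear form f(u, v) = u^t A v.
        return u @ A @ v

    A = np.array([[ 2., -1.],     # the symmetric matrix behind example 2)
                  [-1.,  1.]])
    u, v = np.array([1., 2.]), np.array([3., -1.])

    print(np.isclose(f(u, v, A), f(v, u, A)))   # True: f is symmetric
    print(np.all(np.linalg.eigvalsh(A) > 0))    # True: f is positive definite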

Examples. A final example:

 
4) Let V = R2 and

A = ( 0   1 )
    ( 1  −2 ) .

Noting that A is symmetric, the map f given by (u, v) 7→ u^t Av is bilinear and
symmetric.
Let V ∋ v = (x1 , x2 )^t . We have

f (v, v) = v^t Av = (x2 , x1 − 2x2 ) (x1 , x2 )^t = 2x1 x2 − 2x2^2 .

In particular, f (e2 , e2 ) = −2, so f is not positive definite and therefore not an
inner product.

4.0.1 Inequalities
The main point of this short subsection is the proof of the Cauchy-Schwarz inequality
and some of its corollaries. These inequalities form the basis for the definitions of angles
and distances in analytic geometry. Throughout, (V, f ) is an inner product space.
Definition 4.2. The norm of a vector v in V is given by ∥v∥ = √(f (v, v)).

Remark. Since f is positive definite, ∥v∥ ≥ 0 for any v in V , with equality if and
only if v = 0.
If λ ∈ R, then ∥λv∥ = √(f (λv, λv)) = √(λ^2 f (v, v)) = |λ| ∥v∥.
In particular: if v ̸= 0, then ∥(1/∥v∥)v∥ = 1.

Lemma 4.3. [Cauchy-Schwarz-inequality] Let u, v ∈ V . Then

|f (u, v)| ≤ ∥u∥ ∥v∥

with equality only if u and v are linearly dependent.

Proof. Let λ ∈ R. We have 0 ≤ f (u − λv, u − λv) = f (u, u) − λf (u, v) − λf (v, u) +
λ^2 f (v, v) = f (u, u) − 2λf (u, v) + λ^2 f (v, v). Note that ”f (u − λv, u − λv) = 0” is
possible only when u − λv = 0; in particular, u and v are then linearly dependent.
Noting that f (0, w) = f (0 · 0, w) = 0f (0, w) = 0 whenever w ∈ V , equality holds if
v = 0. Now assume that v ̸= 0. Putting λ = f (u, v)/f (v, v), we get:

0 ≤ f (u, u) − 2λf (u, v) + λ^2 f (v, v) = f (u, u) − 2 f (u, v)^2 /f (v, v) + f (u, v)^2 /f (v, v) =
= f (u, u) − f (u, v)^2 /f (v, v).

Since f (v, v) > 0, we may multiply both sides by f (v, v) to obtain f (u, u)f (v, v) ≥ f (u, v)^2 ;
the left hand side is ∥u∥^2 ∥v∥^2 and we are done.

We immediately deduce the

Lemma 4.4. [Triangle inequality] Let u, v ∈ V . Then

∥u + v∥ ≤ ∥u∥ + ∥v∥

with equality only if u and v are linearly dependent.

Proof. We simply apply 4.3: ∥u + v∥^2 = f (u+v, u+v) = f (u, u)+f (v, v)+2f (u, v) ≤
f (u, u) + f (v, v) + 2|f (u, v)| ≤ f (u, u) + f (v, v) + 2 ∥u∥ ∥v∥ = (∥u∥ + ∥v∥)^2 . When
the triangle inequality is not strict (so the two expressions are equal for a given pair
of vectors), then |f (u, v)| = ∥u∥ ∥v∥, so the Cauchy-Schwarz inequality is an equality
and the two vectors are linearly dependent.
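Both inequalities are easy to observe numerically for the dot product; a sketch with random vectors (NumPy assumed):

    import numpy as np

    rng = np.random.default_rng(2)
    u, v = rng.standard_normal(4), rng.standard_normal(4)
    norm = np.linalg.norm                      # the norm of the dot product

    print(abs(u @ v) <= norm(u) * norm(v))     # True (Cauchy-Schwarz)
    print(norm(u + v) <= norm(u) + norm(v))    # True (triangle inequality)
    w = 3 * u                                  # a linearly dependent pair:
    print(np.isclose(abs(u @ w), norm(u) * norm(w)))   # True (equality)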

Definition 4.5. We define the distance between two vectors u, v in V as d(u, v) =
∥u − v∥ .

The following properties of the distance function d : V × V → R make d a metric on


V.

Lemma 4.6. Let u, v, w ∈ V . Then


a) d(u, v) ≥ 0 and d(u, v) = 0 if and only if u = v.
b) d(u, v) = d(v, u).
c) d(u, w) ≤ d(u, v) + d(v, w).

Proof. We have d(u, v) = ∥u − v∥ ≥ 0 with equality if and only if u − v = 0. This is


a). Statement b) follows from ∥v − u∥ = ∥−1 · (u − v)∥ = | − 1| ∥u − v∥. Statement c)
follows straight from the triangle inequality: d(u, w) = ∥u − w∥ = ∥u − v + v − w∥ ≤
∥u − v∥ + ∥v − w∥ = d(u, v) + d(v, w).

4.1 Orthonormal bases


Definition 4.7. The vectors u, v in V are called perpendicular, denoted u ⊥ v, if
f (u, v) = 0.

Remark. For u ∈ V ,

f (u, 0) = f (u, 0 + 0) = f (u, 0) + f (u, 0) = 0.

So the null vector is perpendicular to every vector in V .

On the other hand, if 0 ̸= v, then f (v, v) > 0, in particular v is not perpendicular


to every vector in V .

Lemma 4.8. Let (v1 , . . . , vm ) be a family of nonzero vectors in V . If vi ⊥ vj whenever


i ̸= j and i, j ∈ {1, . . . , m}, then the family (v1 , . . . , vm ) is linearly independent.

Proof. Suppose that λ1 v1 + . . . + λm vm = 0. Let i ∈ {1, . . . , m}. We have

0 = f (vi , 0) = f (vi , λ1 v1 + . . . + λm vm ) = Σ_{j=1}^{m} λj f (vi , vj ) = λi f (vi , vi ).

From vi ̸= 0 and the fact that f is positive definite, f (vi , vi ) ̸= 0. So λi = 0.

Definition 4.9. Let (V, f ) be an inner product space of finite dimension n. An or-
thonormal basis of V is a basis (v1 , . . . , vn ) of V such that
1. ∥vi ∥ = 1, i = 1, . . . , n.
2. vi ⊥ vj whenever 1 ≤ i < j ≤ n.

Theorem 4.10 (Gram-Schmidt orthonormalisation). Assume V has finite dimension


n. Let (u1 , . . . , un ) be a basis of V . There is an orthonormal basis (v1 , . . . , vn ) of V
with the property that, for i = 1, . . . , n, ⟨v1 , . . . , vi ⟩ = ⟨u1 , . . . , ui ⟩.

Proof. We proceed by induction. Let v1 = (1/∥u1 ∥)u1 .12 Then ⟨v1 ⟩ = ⟨u1 ⟩ and
∥v1 ∥ = 1.
Let 1 ≤ i < n and suppose that the family (v1 , . . . , vi ) satisfies the requirements of
the theorem:
∥vj ∥ = 1, vj ⊥ vk for any j ̸= k ∈ {1, . . . , i}, ⟨v1 , . . . , vi ⟩ = ⟨u1 , . . . , ui ⟩.
Let

ṽi+1 = ui+1 − Σ_{j=1}^{i} f (vj , ui+1 )vj .

Since (u1 , . . . , ui+1 ) is linearly independent, ui+1 ∉ ⟨u1 , . . . , ui ⟩ = ⟨v1 , . . . , vi ⟩. In
particular, ṽi+1 ̸= 0.
Let 1 ≤ j ≤ i. Then:

f (vj , ṽi+1 ) = f (vj , ui+1 ) − f (vj , Σ_{k=1}^{i} f (vk , ui+1 )vk ) =
= f (vj , ui+1 ) − Σ_{k=1}^{i} f (vk , ui+1 )f (vj , vk ) =
= f (vj , ui+1 ) − f (vj , ui+1 )f (vj , vj ) = 0.

In other words, ṽi+1 ⊥ vj . Let vi+1 = (1/∥ṽi+1 ∥)ṽi+1 . The definition of vi+1 implies
that vi+1 ∈ ⟨v1 , . . . , vi ⟩ + ⟨ui+1 ⟩ = ⟨u1 , . . . , ui ⟩ + ⟨ui+1 ⟩ = ⟨u1 , . . . , ui , ui+1 ⟩. Since
vi+1 ̸= 0 and vi+1 ⊥ vj whenever 1 ≤ j ≤ i, Lemma 4.8 says that (v1 , . . . , vi+1 ) is
linearly independent. So (v1 , . . . , vi+1 ) is a linearly independent family in the (i + 1)-
dimensional space ⟨u1 , . . . , ui+1 ⟩. This implies that ⟨v1 , . . . , vi+1 ⟩ = ⟨u1 , . . . , ui+1 ⟩.

Apart from its theoretical meaning, the proof of Theorem 4.10 also contains a procedure,
known as the Gram-Schmidt algorithm, with which to modify an existing basis into an
orthonormal basis. We provide an example.

Examples. Let V = R3 and f the dot product. Let

u1 = (1, 0, 1)^t , u2 = (1, −1, 0)^t , u3 = (1, 0, −1)^t .

The family (u1 , u2 , u3 ) is a basis of V .
We use the Gram-Schmidt algorithm to determine an orthonormal basis (v1 , v2 , v3 )
with ⟨v1 ⟩ = ⟨u1 ⟩ and ⟨v1 , v2 ⟩ = ⟨u1 , u2 ⟩.
I shall leave it to you to verify that (u1 , u2 , u3 ) is a basis of V .
As ∥u1 ∥ = √2, we set v1 = (1/√2)(1, 0, 1)^t .
Next, v1 · u2 = 1/√2 and ṽ2 = u2 − (v1 · u2 )v1 = (1, −1, 0)^t − (1/2)(1, 0, 1)^t = (1/2, −1, −1/2)^t .
12 Recall u1 ̸= 0 because a basis is linearly independent.

 
Noting that ∥ṽ2 ∥ = √(1/4 + 1 + 1/4) = √(3/2), we get v2 = (1/√6)(1, −2, −1)^t .
Finally, v1 · u3 = 0 and v2 · u3 = 2/√6, so ṽ3 = u3 − (v1 · u3 )v1 − (v2 · u3 )v2 =
(1, 0, −1)^t − (1/3)(1, −2, −1)^t = (2/3, 2/3, −2/3)^t . The norm of ṽ3 being √(12/9) = 2√3/3,
we have

v3 = (3/(2√3)) ṽ3 = (1/√3)(1, 1, −1)^t .
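The procedure in the proof of Theorem 4.10 translates directly into code. A sketch for the dot product on Rn (the function name is ours; NumPy assumed), reproducing the example above:

    import numpy as np

    def gram_schmidt(vectors):
        # Orthonormalise a linearly independent family (dot product).
        basis = []
        for u in vectors:
            w = u - sum((v @ u) * v for v in basis)   # subtract projections
            basis.append(w / np.linalg.norm(w))       # normalise
        return basis

    u1 = np.array([1., 0., 1.])
    u2 = np.array([1., -1., 0.])
    u3 = np.array([1., 0., -1.])
    for v in gram_schmidt([u1, u2, u3]):
        print(np.round(v, 3))
    # [0.707 0. 0.707], [0.408 -0.816 -0.408], [0.577 0.577 -0.577]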

4.2 Orthogonal complements


Definition 4.11. Let S ⊆ V . The set S ⊥ (read ”S perpendicular”) is defined by

S ⊥ = {u ∈ V | u ⊥ s for all s ∈ S}.

Note that V ⊥ = {0} and ∅⊥ = V .


Lemma 4.12. For S ⊆ V , S ⊥ is a subspace of V .
Proof. Since f (v, 0) = 0 whenever v ∈ V , 0 ∈ S ⊥ , so S ⊥ ̸= ∅. Let u, v ∈ S ⊥ , s ∈ S, λ ∈ R.
Then f (u + v, s) = f (u, s) + f (v, s) = 0 + 0 = 0 and f (λu, s) = λf (u, s) = 0.

Lemma 4.13. Suppose that V has finite dimension. Let U be a subspace of V . Then
V = U ⊕ U ⊥.
Proof. Let dim V = n, dim U = k and (u1 , . . . , uk ) a basis of U . Extend to a basis
(u1 , . . . , uk , uk+1 , . . . , un ) of V .
Using the Gram-Schmidt-algorithm 4.10, we construct an orthonormal basis (v1 , . . . , vn )
of V satisfying ⟨v1 , . . . , vk ⟩ = ⟨u1 , . . . , uk ⟩ = U. In other words, (v1 , . . . , vk ) is an or-
thonormal basis of U .
For ℓ > k ≥ i, we have vℓ ⊥ vi . This implies that vi ∈ {vℓ }⊥ for any i ∈ {1, . . . , k}.
By Lemma 4.12, {vℓ }⊥ is a subspace of V . Containing each vi for i = 1, . . . , k, it must
contain ⟨v1 , . . . , vk ⟩ = U . In other words, vℓ ∈ U ⊥ for any ℓ in {k + 1, . . . , n}; since
U ⊥ is a subspace of V , ⟨vk+1 , . . . , vn ⟩ ≤ U ⊥ . In particular, V = U + U ⊥ .
If u ∈ U ∩ U ⊥ , then u ⊥ u which means u = 0 because f is positive definite. So the
sum is direct.

Definition 4.14. Let U be a subspace of V . The space U ⊥ is called the orthogonal


complement of U .
Lemma 4.15. Suppose that V is a finite-dimensional vector space. Let U and W be
subspaces of V . Then U ⊥ = W ⊥ if and only if U = W .
Furthermore, U ⊥⊥ = U .
Proof. We prove ”U ⊥⊥ = U ” first - the first statement will follow very quickly from
this. Since f (u, w) = 0 whenever u ∈ U and w ∈ U ⊥ , U ⊆ U ⊥⊥ .
Suppose that v ∈ U ⊥⊥ . Because of Lemma 4.13, v = uv + wv with uv ∈ U and wv ∈ U ⊥ .
Since uv ∈ U ⊆ U ⊥⊥ , wv = v − uv ∈ U ⊥⊥ ∩ U ⊥ = {0}. It follows that v = uv ∈ U
and U ⊥⊥ = U .
As to the first statement, U = W certainly implies U ⊥ = W ⊥ . If U ⊥ = W ⊥ , then
U = U ⊥⊥ = W ⊥⊥ = W .

Examples. Let V = R4 , equipped with the dot product. Let U = ⟨u1 , u2 ⟩, where

u1 = (1, 1, 1, 1)^t , u2 = (1, 0, −2, 1)^t .

We aim to determine a basis of U ⊥ and a homogeneous linear system whose solution
space is U .

Solution:
First of all, λ1 u1 + λ2 u2 = 0 implies that λ1 = 0 = λ2 , so dim U = 2.
Since u1 = e1 + e2 + e3 + e4 , (u1 , e1 , e2 , e3 ) is a basis of V . Since u2 = u1 − 3e3 − e2 ,
(u1 , u2 , e1 , e2 ) is a basis of V . Let e1 = u3 , e2 = u4 .
We apply the Gram-Schmidt algorithm to transform (u1 , u2 , u3 , u4 ) into an orthonor-
mal basis (v1 , v2 , v3 , v4 ) of V . We start the Gram-Schmidt process with u1 and u2 ,
which guarantees that ⟨u1 , u2 ⟩ = ⟨v1 , v2 ⟩ = U ; moreover, v3 , v4 ∈ {v1 , v2 }⊥ = U ⊥ .
Since dim⟨v3 , v4 ⟩ = 4 − dim U = dim U ⊥ (compare Lemma 4.13), we know that
U ⊥ = ⟨v3 , v4 ⟩.
Finally, U = (U ⊥ )⊥ = {v ∈ V | v3 · v = 0 = v4 · v} - this will yield a linear system
with solution space U .
Since ∥u1 ∥ = √(1^2 + 1^2 + 1^2 + 1^2 ) = 2, v1 = (1/∥u1 ∥)u1 = (1/2)(1, 1, 1, 1)^t .
Next, ṽ2 = u2 − (v1 · u2 )v1 , while v1 · u2 = 1/2 − 2 · 1/2 + 1/2 = 0. Hence ṽ2 = u2 and
v2 = (1/∥u2 ∥)u2 = (1/√6)(1, 0, −2, 1)^t . Next, ṽ3 = u3 − (u3 · v1 )v1 − (u3 · v2 )v2 .
We have u3 = e1 and u3 · v1 = 1/2, while u3 · v2 = 1/√6. Hence:

ṽ3 = (1, 0, 0, 0)^t − (1/4)(1, 1, 1, 1)^t − (1/6)(1, 0, −2, 1)^t = (1/12)(7, −3, 1, −5)^t .

Since ∥ṽ3 ∥ = √((49 + 9 + 1 + 25)/144) = (1/12)√84,

v3 = (1/∥ṽ3 ∥)ṽ3 = (1/√84)(7, −3, 1, −5)^t .

Finally, ṽ4 = u4 − (v1 · u4 )v1 − (v2 · u4 )v2 − (v3 · u4 )v3 .

We have v1 · u4 = 1/2, v2 · u4 = 0, v3 · u4 = −3/√84. This yields

ṽ4 = (0, 1, 0, 0)^t − (1/4)(1, 1, 1, 1)^t + (3/84)(7, −3, 1, −5)^t =
= (1/84)(0, 54, −18, −36)^t = (3/14)(0, 3, −1, −2)^t .

Accordingly, v4 = (1/∥ṽ4 ∥)ṽ4 = (1/√14)(0, 3, −1, −2)^t .
   
7 0
 −3   3 
Thus (v3 , v4 ) is a basis of U ⊥ . Let w3 =  1  and w4 =  −1 . The vectors w3
  

−5 −2
and w4 are nonzero scalar multiples of v3 and v4 , respectively, so that U ⊥ = ⟨w3 , w4 ⟩.
By Lemma 3.10, U = U ⊥⊥ = {w3 , w4 }⊥ =
  
 x1
 

  x2  
=   | 7x1 − 3x2 + x3 − 5x4 = 0 = 3x2 − x3 − 2x4 .
  x3  
 
x4
 

4.2.1 Distances
Lemma 4.16. Suppose that V has finite dimension. Let U be a subspace of V , and
let v ∈ V . If v = u + w with u ∈ U and w ∈ U ⊥ , then d(u, v) = min{d(x, v) | x ∈ U }.

Proof. Let u and w be defined as in the statement of the lemma. Let x ∈ U . Then
v − x = (v − u) − (x − u) = w − (x − u). Recalling that w ∈ U ⊥ and x − u ∈ U , we obtain

d(v, x)^2 = ∥v − x∥^2 = f (w − (x − u), w − (x − u)) = f (w, w) + f (x − u, x − u).

Now f (w, w) ≥ 0 ≤ f (x − u, x − u), and f (x − u, x − u) = 0 if and only if x − u = 0,
i.e. u = x. So d(v, x) is minimal precisely when x = u.

We conclude this chapter with an example on how to use Lemma 4.16.

Examples. Let V = R3 with the dot product. Let U = ⟨y⟩, where y = 2e1 − e3 .
a) Show that U ⊥ = ⟨e2 , e1 + 2e3 ⟩.
b) Let x = e2 + e3 . Determine a vector in U at minimal distance to x.

Solution: For a), we could use the Gram-Schmidt algorithm, but there is a shorter
way: since dim U = 1, we know that dim U ⊥ = 2. The two vectors e2 and e1 + 2e3
are linearly independent and perpendicular to y. This is sufficient for a).
For b), we write x as a sum of an element of U and an element of U ⊥ . This means
finding scalars α, β, γ such that x = e2 + e3 = α(2e1 − e3 ) + βe2 + γ(e1 + 2e3 ) =
(2α + γ)e1 + βe2 + (−α + 2γ)e3 . It follows that β = 1, γ = −2α, and −5α = 1, i.e.
α = −1/5 and γ = 2/5. Thus u = α(2e1 − e3 ) = −(1/5)y is the element of U at minimal
distance to x.
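In coordinates this is the familiar orthogonal projection onto the line ⟨y⟩: for U = ⟨y⟩, the U -component of x is (x · y / y · y) y. A sketch of the computation (NumPy assumed):

    import numpy as np

    y = np.array([2., 0., -1.])     # U = <y>
    x = np.array([0., 1., 1.])      # x = e2 + e3

    u = (x @ y) / (y @ y) * y       # U-component of x: projection onto <y>
    print(u)                        # [-0.4  0.   0.2] = -(1/5) y
    print((x - u) @ y)              # 0.0: the remainder lies in U-perp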

