Algebra Redacted
Contents

1 The idea of a group
1.5 Exercises

2 The group of permutations
2.11 Exercises

3 Rotations and reflections in the plane
3.6 Exercises

4 Cyclic groups and dihedral groups
4.8 Exercises

5 Finite sets, counting and group theory
5.17 Exercises

6 More counting problems with groups
6.12 Exercises

7 Kernels and quotients
7.12 Exercises

8 Rings and modular arithmetic
8.17 Exercises

9 Z∗p is cyclic
9.9 Exercises

10 Matrices over Zp
10.7 Exercises

11 The sign of a permutation
11.9 Exercises

12 Determinants
12.8 Exercises

13 The 3 dimensional rotation group
13.9 Exercises

14 Finite subgroups of the rotation group
14.8 Exercises

15 Quaternions
15.5 Exercises

16 The Spin group
16.8 Exercises

Chapter 1

The idea of a group

One of our goals in this class is to make precise the idea of symmetry, which is
important in math, other parts of science, and art. Something like a square has
a lot of symmetry, but a circle has even more. But what does this mean? One
way of expressing this is to view a symmetry of a given shape as a motion
which takes the shape to itself. Let us start with the example of an equilateral
triangle with vertices labelled by 1, 2, 3.
[Figure: equilateral triangle with vertex 3 at the top and vertices 1, 2 at the bottom]

We want to describe all the symmetries, which are the motions (both rotations
and flips) which take the triangle to itself. First of all, we can do nothing.
We call this I, which stands for identity. In terms of the vertices, I sends 1 → 1,
2 → 2 and 3 → 3. We can rotate once counterclockwise.

R+ : 1 → 2 → 3 → 1.

We can rotate once clockwise

R− : 1 → 3 → 2 → 1.

We can also flip it in various ways

F12 : 1 → 2, 2 → 1, 3 fixed

F13 : 1 → 3, 3 → 1, 2 fixed

F23 : 2 → 3, 3 → 2, 1 fixed
We will say more about this example and generalizations for regular polygons
later. In the limit, as the number of vertices goes to infinity, we get the circle.
This has infinitely many symmetries. We can use any rotation about the center,
or a reflection about a line through the center.
Another example which occurs in classical art and design (mosaics, wallpaper. . . )
and two dimensional crystals is a repetitive pattern in the plane such
as the one drawn below.

We imagine this covering the entire plane; the grid lines are not part of the
pattern. Then there are infinitely many symmetries. We can translate or shift
all the “ducks” up or down by one square, or left or right by two squares. We
can also flip or reflect the pattern along vertical lines.
Here is another pattern below.

This has translational symmetries as before, but no flipping symmetries. Instead,
if the plane is rotated by 90◦ about any point where four ducks meet, the
pattern is preserved. One might ask whether we can replace four by five, or
some arbitrary number of, ducks and still get an infinitely repeating symmetric
pattern as above. The answer, surprisingly, is no. We will prove this later.

The study of symmetry leads to an algebraic structure. To simplify things,


let us ignore flips and consider only rotational symmetries of a circle C of radius
r. To simplify further, let us start with the limiting case where r → ∞. Then
C becomes a line L, and rotations correspond to translations. These can be
described precisely as follows. Given a real number x ∈ R, let Tx : L → L

Lemma 1.1. If θ ∈ C, then θ ⊕ 0 = θ .
Proof. Since θ < 2π, θ ⊕ 0 = θ + 0 = θ.
Lemma 1.2. If θ, φ ∈ C, then θ ⊕ φ = φ ⊕ θ.
Proof. If we compare
φ ⊕ θ = φ + θ          if φ + θ < 2π
φ ⊕ θ = φ + θ − 2π     if φ + θ ≥ 2π

we see that it is identical to θ ⊕ φ.


Lemma 1.3. Given θ ∈ C, we have φ ∈ C such that θ ⊕ φ = 0.
Proof. We can take φ = 2π − θ if θ ≠ 0, and φ = 0 if θ = 0.
We omit the proof for now, but the associative law

θ ⊕ (φ ⊕ ψ) = (θ ⊕ φ) ⊕ ψ

also holds.
So in summary, the set C with the operation ⊕ shares the same 4 laws
as R with usual addition: namely the associative and commutative laws, and
the existence of identity and inverse. We have a name for such a thing. It is
called an abelian group, and it will be one of the key concepts in this class. To
appreciate the power of this simple set of rules, let us extend a standard result
from high school algebra.
Theorem 1.4. Suppose that A is any abelian group with operation + and iden-
tity 0. For any a, b ∈ A, there is exactly one solution to x + a = b.
Proof. By the axioms, there exists an element that we denote by −a such that
a + (−a) = 0. Add b to both sides, and use the laws to obtain

(b + (−a)) + a = b + (−a + a) = b + 0 = b

Therefore x = b + (−a) gives a solution. Suppose that x is any solution to
x + a = b. Then adding −a to both sides and using the associative law,

x = x + (a + (−a)) = (x + a) + (−a) = b + (−a)

We are being a bit pedantic in our notation, since this was the first abstract
proof. In the future, we will just write b − a instead of b + (−a).
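The four group laws for (C, ⊕) are concrete enough to check by machine. Here is a small Python sketch (the helper names `circle_add` and `circle_inv` are ours, not from the text) modelling C = [0, 2π) with the operation ⊕ of the lemmas above:

```python
import math

TWO_PI = 2 * math.pi

def circle_add(theta, phi):
    # The operation ⊕ on C = [0, 2π): add the angles, wrapping around once
    # past 2π if necessary, exactly as in the piecewise definition.
    s = theta + phi
    return s - TWO_PI if s >= TWO_PI else s

def circle_inv(theta):
    # Inverse for ⊕ (Lemma 1.3): 2π − θ, except that 0 is its own inverse.
    return 0.0 if theta == 0 else TWO_PI - theta
```

Because the angles are floating-point numbers, the inverse law only holds up to rounding: θ ⊕ (2π − θ) comes out as a value within machine precision of 0 or 2π.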

We want to return to the first example of the triangle, but first we should
clarify what kind of mathematical objects we are dealing with. Given a set X,

6
a permutation of X is a one to one onto function f : X → X. Recall that a
function (or map, mapping, or transformation) f : X → Y is a rule taking each
element x of one set X to an element f (x) ∈ Y ; it is one to one and onto if
every element of Y equals f (x) for exactly one x ∈ X. The symmetries R+ etc.
are just permutations of {1, 2, 3}. Here are some abstractly given permutations
of the set {1, 2, 3, 4}.
f (1) = 2, f (2) = 3, f (3) = 1, f (4) = 4
g(1) = 2, g(2) = 1, g(3) = 4, g(4) = 3
The function h defined by
h(1) = h(2) = 1, h(3) = h(4) = 2
is not a permutation. It may be helpful to visualize these with arrows:

f : 1 → 2, 2 → 3, 3 → 1, 4 → 4

g : 1 → 2, 2 → 1, 3 → 4, 4 → 3
Since the above notations are a bit cumbersome, we often write this in permutation
notation as

f = | 1 2 3 4 |    g = | 1 2 3 4 |
    | 2 3 1 4 |        | 2 1 4 3 |
Note these are not matrices. There is yet another notation, which is a bit more
compact. A cycle of a permutation is a sequence of elements a → f (a) →
f (f (a)) → . . . → a. For f , the cycles are 1 → 2 → 3 → 1 and 4 → 4; for g,
1 → 2 → 1 and 3 → 4 → 3. To specify a permutation it is enough to list
the cycles as in
f = (123)(4), g = (12)(34)
Cycles consisting of just one element are usually omitted, so we would write
f = (123). Note that (312) would also represent f .
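Finding the cycles is mechanical, so it is easy to automate. A short Python sketch (our own helper, not part of the text) represents a permutation as a dict from each element to its image and lists the cycles:

```python
def cycles(perm):
    # perm is a dict sending each element x to perm[x].
    # Follow x -> perm[x] -> ... until we come back to the start.
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        result.append(tuple(cycle))
    return result

f = {1: 2, 2: 3, 3: 1, 4: 4}
g = {1: 2, 2: 1, 3: 4, 4: 3}
```

Here `cycles(f)` yields the cycles (1, 2, 3) and (4,), matching f = (123)(4) above.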
Given two permutations f : X → X and g : X → X, we can multiply them
by composing them as functions. In the examples above,

f ◦ g(1) = f (g(1)) = f (2) = 3, etc.
We usually omit the ◦ symbol. More visually,

f g :  3 ← 2 ← 1,   2 ← 1 ← 2,   4 ← 4 ← 3,   1 ← 3 ← 4

Note that we use backward arrows because this is consistent with function
composition. Some people (and software) use forward arrows, which is easier to
work with, but confusing in other ways.
With a bit of practice, this can be read off directly from the permutation
symbols:

| 1 2 3 4 | | 1 2 3 4 |   | 1 2 3 4 |
| 2 3 1 4 | | 2 1 4 3 | = | 3 2 4 1 |
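In the dict representation used earlier, composition in the convention f g(x) = f (g(x)) is one line of Python; this sketch recomputes the product displayed above:

```python
def compose(f, g):
    # (f g)(x) = f(g(x)): apply g first, then f, as in the text.
    return {x: f[g[x]] for x in g}

f = {1: 2, 2: 3, 3: 1, 4: 4}   # (123)
g = {1: 2, 2: 1, 3: 4, 4: 3}   # (12)(34)

fg = compose(f, g)             # bottom row 3 2 4 1
```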
We now return to our triangle example.

R+ R+ = | 1 2 3 | | 1 2 3 | = | 1 2 3 | = R−
        | 2 3 1 | | 2 3 1 |   | 3 1 2 |

Let’s do two flips, F12 followed by F13 :

F12 F13 = | 1 2 3 | | 1 2 3 | = | 1 2 3 | = R−
          | 2 1 3 | | 3 2 1 |   | 3 1 2 |

Doing this the other way gives

F13 F12 = R+

Therefore this multiplication is not commutative.


The full multiplication table (row times column) can be worked out with enough
patience as

◦     I     F12   F13   F23   R+    R−
I     I     F12   F13   F23   R+    R−
F12   F12   I     R−    R+    F23   F13
F13   F13   R+    I     R−    F12   F23
F23   F23   R−    R+    I     F13   F12
R+    R+    F13   F23   F12   R−    I
R−    R−    F23   F12   F13   I     R+
One thing that can be observed from the table is that every element has an
inverse, i.e. an element which multiplies with it to give the identity. It is not
obvious from the table that the associative law holds, but this is something we
will prove later. A group is a set with a multiplication which is associative,
has an identity, and such that every element has an inverse. We will clarify
the meaning of the axioms later. Suffice it to say that we now have two new
examples of groups: one which is abelian and one which isn’t.
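Since the six symmetries are just permutations of {1, 2, 3}, the whole table can be recomputed by brute force. A Python sketch, using the dict representation from before (the variable names are ours):

```python
def compose(a, b):
    # Row times column: (a b)(x) = a(b(x)).
    return {x: a[b[x]] for x in b}

I   = {1: 1, 2: 2, 3: 3}
Rp  = {1: 2, 2: 3, 3: 1}   # R+
Rm  = {1: 3, 2: 1, 3: 2}   # R-
F12 = {1: 2, 2: 1, 3: 3}
F13 = {1: 3, 2: 2, 3: 1}
F23 = {1: 1, 2: 3, 3: 2}
```

For example `compose(F12, F13)` equals `Rm` while `compose(F13, F12)` equals `Rp`, confirming that the multiplication is not commutative.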

1.5 Exercises
In the next few exercises, you will study the symmetries of a square with vertices
labelled by 1, 2, 3, 4 as shown

4 1

3 2

Let

I = id = | 1 2 3 4 |
         | 1 2 3 4 |

R be the clockwise rotation

| 1 2 3 4 |
| 2 3 4 1 |

and F be the flip

| 1 2 3 4 |
| 2 1 4 3 |
1. Show that all the rotations preserving the square are given by I, R, R2 =
RR and R3 . Write these out explicitly in cycle notation.
2. Show that all the flips (including diagonal flips) preserving the square are
given by F, F R, F R2 , F R3 . Write these out explicitly in cycle notation.
3. The above 8 rotations and flips form a complete list of all the symmetries
of the square. Describe RF in terms of this list. Give an example of a
permutation of {1, 2, 3, 4} which is not a symmetry of the square.
4. Determine the inverses of the rotations R, R2 = RR and R3 .
5. Determine the group of symmetries (rotations and flips) of a rectangle
which is not a square. Is this abelian?
6. Determine all the symmetries of a regular pentagon. Regular means that
all the sides have the same length and all the angles are equal.
7. (If you forgot what complex numbers are, now is the time to remind
yourself.)
(a) Given z = a + bi ∈ C, recall that z̄ = a − bi. Check that z z̄ = a2 + b2 ,
and also that z̄ w̄ is the conjugate of zw, for w = c + di.
(b) Let C be the set of complex numbers of the form a + bi, where
a2 + b2 = 1. With the help of the previous exercise, prove that if
z ∈ C, then z −1 ∈ C, and that the product of any two numbers in C
is also in C. Conclude that C is a group under multiplication.

(c) Given an angle θ, show that eiθ = cos θ + i sin θ ∈ C and conversely,
every element z ∈ C is of this form for a unique θ ∈ [0, 2π). This is
another way to turn C into a group which is the same as the previous
group in an appropriate sense.

Chapter 2

The group of permutations

Recall that a function f : X → Y is one to one if for any pair of distinct elements
x1 , x2 ∈ X, f (x1 ) ≠ f (x2 ). Equivalently, if f (x1 ) = f (x2 ) then x1 = x2 . f is
onto if for every y ∈ Y , we can find an x ∈ X such that f (x) = y. An
important example of a function is the identity function idX : X → X defined
by idX (x) = x. This is clearly one to one and onto. If X is understood, we
write this as id.
Lemma 2.1. Suppose that f : X → Y and g : Y → Z are functions.
1. If f and g are one to one, then so is g ◦ f .
2. If f and g are onto, then so is g ◦ f .
Proof. Suppose that f and g are one to one. If g ◦ f (x1 ) = g ◦ f (x2 ), then
g(f (x1 )) = g(f (x2 )). This implies f (x1 ) = f (x2 ) because g is one to one.
Therefore x1 = x2 because f is one to one. This proves 1.
Suppose that f and g are onto. Given z ∈ Z, we can find y ∈ Y such
that g(y) = z because g is onto. We can also find x ∈ X such that f (x) = y.
Therefore g ◦ f (x) = z. This proves 2.
Lemma 2.2. Suppose that f : X → Y , g : Y → Z and h : Z → W are
functions. Then h ◦ (g ◦ f ) = (h ◦ g) ◦ f .
Proof. To be clear, two functions are considered to be equal if they produce
equal outputs on every input. Now observe that

(h ◦ (g ◦ f ))(x) = h(g(f (x))) = ((h ◦ g) ◦ f )(x)

Lemma 2.3. If f : X → Y is one to one and onto, there exists a function


f −1 : Y → X called the inverse such that f ◦ f −1 = idY and f −1 ◦ f = idX .

Proof. For every y ∈ Y , there exists a unique x ∈ X such that f (x) = y. We
define f −1 (y) = x. Then f −1 ◦ f (x) = f −1 (y) = x and f ◦ f −1 (y) = f (x) = y.

Lemma 2.4. Given a function f : X → Y , f ◦ idX = f and idY ◦ f = f .


Proof. The first equation holds because f ◦ id(x) = f (id(x)) = f (x). The proof
of the second is similar.
Now we come to the key definition.
Definition 2.5. A group is a set G with an operation ∗ and a special element
e satisfying
1. The associative law: (x ∗ y) ∗ z = x ∗ (y ∗ z)
2. e is the identity: x ∗ e = e ∗ x = x
3. Existence of inverses: given x, there exists y such that x ∗ y = y ∗ x = e
We sometimes say that (G, ∗, e) is a group when we want to specify the operation
and identity. Occasionally, we will omit the operation, and simply write xy for
x∗y. We will see in the exercises that each x has exactly one inverse. We denote
this by x−1 , or sometimes −x, depending on the situation.
It is also worth repeating what we said in the first chapter in this context.
Definition 2.6. An abelian group is a group G for which the commutative law
x ∗ y = y ∗ x holds.
Given a set X, recall that a permutation of X is a one to one onto function f :
X → X. Let SX denote the set of permutations of X. When X = {1, 2, . . . , n},
which is the case we will mostly be interested in, we denote this by Sn . Putting
together the previous lemmas, we get
Theorem 2.7. SX becomes a group under composition, with identity given by
id.
Sn is called the symmetric group on n letters. Most of you have actually
encountered this before, although perhaps not by name, and in particular, you
probably already know that:
Theorem 2.8. The number of elements of Sn is n! = 1 · 2 · 3 · · · n.
We will in fact give a proof of this later on. For n = 3, we see that S3 has
6 elements, so it must coincide with the symmetry group of the triangle. For
n = 4, we have 24, which is much bigger than the number of symmetries of the
square. This is pretty typical. We are often interested not in the whole of Sn ,
but in some interesting piece of it.
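Theorem 2.8 is easy to sanity-check for small n with Python's standard library, since `itertools.permutations` enumerates every one to one onto map of a finite set exactly once:

```python
import math
from itertools import permutations

# For each small n, count the elements of Sn and compare with n!.
for n in range(1, 7):
    assert len(list(permutations(range(n)))) == math.factorial(n)
```

In particular |S3 | = 6 and |S4 | = 24, the two counts used above.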
Definition 2.9. Given a group (G, ∗, e), a subset S ⊆ G is called a subgroup
if e ∈ S, and x, y ∈ S implies x ∗ y, x−1 ∈ S (one says that S is closed under
these operations).

The definition ensures that when these operations are restricted to S, we
don’t leave S.
Proposition 2.10. A subgroup S ⊆ G of a group is also a group.
There is actually nothing to prove. The same laws of G hold for elements of
S.
Coming back to permutation notation, we see that the identity is simply
 
id = | 1 2 3 . . . |
     | 1 2 3 . . . |

To find the inverse, we simply turn it upside down and then rearrange columns.
For example,

f = | 1 2 3 4 |
    | 1 4 2 3 |

f −1 = | 1 4 2 3 | = | 1 2 3 4 |
       | 1 2 3 4 |   | 1 3 4 2 |
In cycle notation, we simply reverse the cycles

f = (243), f −1 = (342)
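“Turning the symbol upside down” amounts to swapping keys and values in the dict representation. A one-line Python sketch (our own helper):

```python
def inverse(f):
    # If f sends k to v, then the inverse must send v back to k.
    return {v: k for k, v in f.items()}

f = {1: 1, 2: 4, 3: 2, 4: 3}   # f = (243)
```

Here `inverse(f)` is {1: 1, 2: 3, 3: 4, 4: 2}, the bottom row 1 3 4 2 computed above, i.e. (342).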

2.11 Exercises
1. Let X be a nonempty set and let f : X → X be a function. Prove that
f is one to one if and only if there is a function g : X → X such that
gf = id; g is called a left inverse. (One direction is easy, and the other
will require you to be a bit creative.)
2. Let X be a nonempty set and let f : X → X be a function. Prove that f
is onto if and only if there is a function g : X → X such that f g = id; g
is called a right inverse. (People who know some set theory will need to
invoke the axiom of choice.)
3. A permutation f ∈ Sn is a transposition if it interchanges two numbers,
say i and j, and fixes everything else, i.e. f (i) = j, f (j) = i, f (x) = x for
i ≠ x ≠ j, or f = (ij) in cycle notation.

(a) Check that everything in S3 is a product of transpositions.


(b) Check (12)(34), (123), (1234) ∈ S4 are products of transpositions.
Generalizing from these examples, prove that every element of S4 is
a product of transpositions.

4. Given a group (G, ∗, e), prove that it has only one identity element. In
other words, if x ∗ e′ = e′ ∗ x = x holds for all x, prove e′ = e.
5. Given a group (G, ∗, e),

(a) Prove that every element x has exactly one inverse. We now denote it
by x−1 .
(b) Prove that (x ∗ y)−1 = y −1 ∗ x−1 .
6. Given a group (G, ∗, e),

(a) Given y, z ∈ G, prove that there is exactly one x1 ∈ G satisfying


x1 ∗ y = z and exactly one x2 ∈ G satisfying y ∗ x2 = z.
(b) Is it always true that x1 = x2 ? If yes, then prove it; if no, then find
a counterexample, i.e. a group G and elements x1 , x2 , y, z as above
with x1 ≠ x2 .

7. Let R = (123) and let F be a transposition in S3 .


(a) Check that {I, R, R2 } and {I, F } are subgroups of S3 .
(b) Prove that S3 does not have a subgroup with exactly 4 elements. (If
you happen to know Lagrange’s theorem, don’t use it. Give a direct
argument.)
8. Recall that the intersection (respectively union) of two sets H ∩ K (H ∪ K)
is the set of elements x such that x ∈ H and x ∈ K (respectively x ∈ H or
x ∈ K – x is allowed to be in both).
(a) Prove that if H and K are both subgroups of a group G, then H ∩ K
is a subgroup.
(b) What about H ∪ K?

Chapter 3

Rotations and reflections in


the plane

We now discuss another important source of nonabelian groups, one with which
most people should already be familiar. Let
 
A = | a11 a12 . . . |
    | a21 a22 . . . |
    | . . .         |

be an n × n matrix with entries in R. If B is another n × n matrix, we can form
their product C = AB, which is another n × n matrix with entries

cij = ai1 b1j + ai2 b2j + . . . + ain bnj = Σk aik bkj

The identity matrix

I = | 1 0 . . . |
    | 0 1 . . . |
    | . . .     |

has entries

δij = 1 if i = j, and δij = 0 otherwise.
Lemma 3.1. Matrix multiplication is associative and I is the identity for it,
i.e. AI = IA = A.
Proof. Given matrices A, B, C, the ij-th entries of A(BC) and (AB)C both work
out to

Σk Σℓ aik bkℓ cℓj

Also

aij = Σk aik δkj = Σk δik akj

An n × n matrix A is invertible if there exists an n × n matrix A−1 such that
AA−1 = A−1 A = I. It follows that:

Theorem 3.2. The set of invertible n × n matrices with entries in R forms a


group called the general linear group GLn (R).
For 2 × 2 matrices there is a simple test for invertibility. We recall that the
determinant

det | a b | = ad − bc
    | c d |

and

e | a b | = | ea eb |
  | c d |   | ec ed |
 
Theorem 3.3. Let A = | a b | be a matrix over R. Then A is invertible if and
                     | c d |
only if det(A) ≠ 0. In this case,

A−1 = (det(A))−1 |  d −b |
                 | −c  a |

Proof. Let ∆ = det(A), and let B = |  d −b |. Then an easy calculation gives
                                   | −c  a |

AB = BA = ∆I.

If ∆ ≠ 0, then ∆−1 B will give the inverse of A by the above equation.
Suppose that ∆ = 0 and A−1 exists. Then multiply both sides of the above
equation by A−1 to get B = ∆A−1 = 0. This implies that A = 0, and therefore
that 0 = AA−1 = I. This is impossible.
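The formula of Theorem 3.3 translates directly into code. A minimal Python sketch, with a 2 × 2 matrix represented as a nested list (a representation we choose for illustration):

```python
def det2(A):
    (a, b), (c, d) = A
    return a * d - b * c

def inv2(A):
    # Theorem 3.3: A^{-1} = det(A)^{-1} [[d, -b], [-c, a]]
    (a, b), (c, d) = A
    D = det2(A)
    if D == 0:
        raise ValueError("matrix is not invertible")
    return [[d / D, -b / D], [-c / D, a / D]]
```

For example `inv2([[2, 1], [1, 1]])` gives [[1.0, -1.0], [-1.0, 2.0]], and a determinant-zero matrix raises an error, matching the theorem.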
Let us study an important subgroup of this. A 2 × 2 rotation matrix is a
matrix of the form

R(θ) = | cos θ − sin θ |
       | sin θ   cos θ |
This sends a column vector v in the plane R2 to the vector R(θ)v obtained by
rotation through angle θ. We denote the set of these by SO(2) (SO stands for
special orthogonal).
Theorem 3.4. SO(2) forms a subgroup of GL2 (R).

Proof. It is easy to check that det R(θ) = cos2 θ + sin2 θ = 1 and of course,
R(0) = I ∈ SO(2). If we multiply two rotation matrices,

R(θ)R(φ) = | cos θ − sin θ | | cos φ − sin φ |
           | sin θ   cos θ | | sin φ   cos φ |

         = | cos θ cos φ − sin θ sin φ   − cos θ sin φ − sin θ cos φ |
           | sin θ cos φ + cos θ sin φ     cos θ cos φ − sin θ sin φ |

         = | cos(θ + φ) − sin(θ + φ) |
           | sin(θ + φ)   cos(θ + φ) |

         = R(θ + φ)

Therefore SO(2) is closed under multiplication. The last calculation also shows
that R(θ)−1 = R(−θ) ∈ SO(2).
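The closure computation can also be verified numerically, up to floating-point error. A Python sketch using plain nested lists (no external libraries; helper names are ours):

```python
import math

def R(t):
    # rotation through angle t
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t),  math.cos(t)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, eps=1e-12):
    # entrywise comparison up to a tolerance, since cos/sin are floats
    return all(abs(A[i][j] - B[i][j]) < eps for i in range(2) for j in range(2))
```

Then `close(matmul(R(0.7), R(0.5)), R(1.2))` holds, matching R(θ)R(φ) = R(θ + φ), and `matmul(R(0.9), R(-0.9))` is the identity up to rounding.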
A matrix

A = | a b |
    | c d |
is called orthogonal if the columns are unit vectors a2 + c2 = b2 + d2 = 1 which
are orthogonal in the sense that the dot product ab + cd = 0. Since the first
column is on the unit circle, it can be written as (cos θ, sin θ)T (the symbol
(−)T , read transpose, turns a row into a column). The second column is on the
intersection of the line perpendicular to the first column and the unit circle.
This implies that the second column is ±(− sin θ, cos θ)T . So either A = R(θ)
or

A = F (θ) = | cos θ   sin θ |
            | sin θ − cos θ |
In the exercises, you will find a pair of nonzero orthogonal vectors v1 , v2 with
F (θ)v1 = v1 and F (θ)v2 = −v2 . This means that F (θ) is a reflection about the
line spanned by v1 . In the exercises, you will also prove that
Theorem 3.5. The set of orthogonal matrices O(2) forms a subgroup of GL2 (R).
Given a unit vector v ∈ R2 and A ∈ O(2), Av is also a unit vector. So
we can interpret O(2) as the full symmetry group of the circle, including both
rotations and reflections.

3.6 Exercises
1. Let U T (2) be the set of upper triangular matrices

| 1 a |
| 0 1 |

Show this forms a subgroup of GL2 (R).

2. Let U T (3) be the set of upper triangular matrices

| 1 a b |
| 0 1 c |
| 0 0 1 |

Show this forms a subgroup of GL3 (R).


3. Find a pair of nonzero orthogonal vectors v1 , v2 with F (θ)v1 = v1 and
F (θ)v2 = −v2 . (Hint: if θ = 0 this is easy; when θ ≠ 0, try v1 = (sin θ, 1 − cos θ)T .)

4. Recall that the transpose of an n × n matrix A is the n × n matrix with


entries aji . A matrix is called orthogonal if AT A = I = AAT (the second
equation is redundant but included for convenience).
(a) Check that this definition of orthogonality agrees with the one we
gave for 2 × 2 matrices.
(b) Prove that the set of n × n orthogonal matrices O(n) is a subgroup
of GLn (R). You’ll need to know that (AB)T = B T AT .
5. Show that SO(2) is abelian, but that O(2) is not.
6. A 3 × 3 matrix is called a permutation matrix if it can be obtained from
the identity I by permuting the columns. Write P (σ) for the permutation
matrix corresponding to σ ∈ S3 . For example,

F = P ((12)) = | 0 1 0 |
               | 1 0 0 |
               | 0 0 1 |

Check that F 2 = I. What can you conclude about the set {I, F }?
7. Prove that the set of permutation matrices in GL3 (R) forms a subgroup.
Prove the same thing for GLn (R), where permutation matrices are defined
the same way. (The second part is not really harder than the first,
depending on how you approach it.)

Chapter 4

Cyclic groups and dihedral


groups

Consider the group Cn of rotational symmetries of a regular n-gon. If we label
the vertices consecutively by 1, 2, . . . , n, then we can view

Cn = {I, R, R2 , . . . Rn−1 } ⊂ Sn

where I = id and

R = (123 . . . n).

A bit of thought shows that Rn = I. We won’t need to multiply permutations
explicitly; we just use this rule: Rj Rk = Rj+k , and if j + k ≥ n, we “wrap
around” to Rj+k−n . We will encounter other groups with a similar structure.
Definition 4.1. A finite group G is called cyclic if there exists an element
g ∈ G, called a generator, such that every element of G is a power of g.
Cyclic groups are really the simplest kinds of groups. In particular:
Lemma 4.2. A cyclic group is abelian.
Proof. g j g k = g j+k = g k g j .
Let us give a second example. Let

Zn = {0, 1, 2 . . . n − 1}

We modify addition using the same wrap-around rule as before:

x ⊕ y = x + y       if x + y ∈ Zn
x ⊕ y = x + y − n   otherwise

This is usually called modular addition. It is not completely obvious that this
is a group but we will show this later. Here is the table for n = 2

⊕ 0 1
0 0 1
1 1 0

This is the simplest nontrivial abelian group. A somewhat more complicated
case is n = 4:
⊕ 0 1 2 3
0 0 1 2 3
1 1 2 3 0
2 2 3 0 1
3 3 0 1 2
Zn with this addition rule is also cyclic with generator 1.
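For any fixed n, the group axioms for (Zn , ⊕) can be checked exhaustively by machine. Here is a Python sketch for n = 4, mirroring the table above (the helper name `add` is ours):

```python
n = 4
Zn = range(n)

def add(x, y):
    # x ⊕ y: wrap around past n, the same as (x + y) mod n
    return x + y if x + y < n else x + y - n

# identity 0, inverse of x is (n - x) mod n, associativity by brute force
assert all(add(x, 0) == x for x in Zn)
assert all(add(x, (n - x) % n) == 0 for x in Zn)
assert all(add(add(x, y), z) == add(x, add(y, z))
           for x in Zn for y in Zn for z in Zn)
```

Row 2 of the table, for instance, comes out as 2, 3, 0, 1, exactly as printed.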
We can see that µ2 = {1, −1} is a cyclic group under multiplication. More
generally, we have the group of nth roots of unity

µn = { e2πik/n = cos(2πk/n) + i sin(2πk/n) | k = 0, 1, . . . , n − 1 }

This is a subgroup of the group of nonzero complex numbers C∗ under
multiplication. µn is generated by e2πi/n , so it is cyclic.
Although these examples are superficially different, they are the same in
some sense. If we associate k 7→ Rk or k 7→ e2πik/n and compare addi-
tion/multiplication tables, they will match. Here is the precise definition.
Definition 4.3. Let (G, ∗, e) and (H, ◦, e′ ) be groups. A function f : G → H is
called a homomorphism if f (e) = e′ and f (g1 ∗ g2 ) = f (g1 ) ◦ f (g2 ). A one to
one onto homomorphism is called an isomorphism. Two groups are isomorphic
if there is an isomorphism from one to the other. In symbols, we write G ∼= H.
The function f : Zn → Cn defined by f (k) = Rk is an isomorphism. The
function f : Z → µn defined by f (k) = e2πik/n is a homomorphism which is not
an isomorphism because it is not one to one. The order of a finite group is the
number of elements in it.
Theorem 4.4. A cyclic group of order n is isomorphic to Zn .
Proof. Let G be the cyclic group in question with generator g. Since G is finite,
the sequence g, g 2 , g 3 , . . . must repeat itself. That is, g n1 = g n2 for some n1 > n2 .
Taking n = n1 − n2 > 0 implies that g n = e. Let us assume that n is the smallest
such number (this is called the order of g). We claim that G = {e, g, . . . , g n−1 }
and that all the elements as written are distinct. By distinctness we mean that
if m1 > m2 lie in {0, 1, . . . , n − 1}, then g m1 ≠ g m2 . If not, then g m1 −m2 = e
would contradict the fact that n is the order of g.
So now the function f (i) = g i is easily seen to give an isomorphism from
Zn to G.

We need to come back and check that Zn is actually a group. We make
use of a result usually called the “division algorithm”. Although it’s not an
algorithm in the technical sense, it is the basis of the algorithm for long division
that one learns in school.
Theorem 4.5. Let x be an integer and n positive integer, then there exists a
unique pair of integers q, r satisfying

x = qn + r, 0 ≤ r < n

Proof. Let

R = {x − q ′ n | q ′ ∈ Z and q ′ n ≤ x}

Observe that R ⊆ N, so we can choose a smallest element r = x − qn ∈ R.
Suppose r ≥ n. Then x = qn + r = (q + 1)n + (r − n) means that r − n lies in
R. This is a contradiction; therefore r < n.
Suppose that x = q ′ n + r′ with 0 ≤ r′ < n. Then r′ ∈ R, so r′ ≥ r. Then
qn = q ′ n + (r′ − r) implies that n(q − q ′ ) = r′ − r. So r′ − r is divisible by n. On
the other hand, 0 ≤ r′ − r < n, and 0 is the only integer in this range divisible
by n. Therefore r = r′ and qn = q ′ n, which implies q = q ′ .

We denote the number r given above by x mod n; mod is read “modulo” or
simply “mod”. When x ≥ 0, this is just the remainder after long division by n.
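Python's built-in `divmod` already produces the q and r of Theorem 4.5 for n > 0, including for negative x, because Python uses floor division. A small sketch that double-checks the two defining conditions:

```python
def div_alg(x, n):
    # Returns (q, r) with x = q*n + r and 0 <= r < n, for n > 0.
    # Python's divmod uses floor division, which gives exactly this
    # normalization of the remainder.
    q, r = divmod(x, n)
    assert x == q * n + r and 0 <= r < n
    return q, r
```

For example `div_alg(17, 5)` is (3, 2), and `div_alg(-7, 3)` is (-3, 2), so (−7) mod 3 = 2.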
Lemma 4.6. If x1 , x2 , n are integers with n > 0, then

(x1 + x2 ) mod n = (x1 mod n) ⊕ (x2 mod n)

Proof. Set ri = xi mod n. Then xi = qi n + ri for appropriate qi . We have
x1 + x2 = (q1 + q2 )n + (r1 + r2 ). We see that

(x1 + x2 ) mod n = r1 + r2 = r1 ⊕ r2          if r1 + r2 < n
(x1 + x2 ) mod n = r1 + r2 − n = r1 ⊕ r2      otherwise

This would imply that f (x) = x mod n gives a homomorphism from Z → Zn
if we already knew that Zn were a group. Fortunately, this can be converted
into a proof that it is one.
Lemma 4.7. Suppose that (G, ∗, e) is a group and f : G → H is an onto map
to another set H with an operation ∗ such that f (x ∗ y) = f (x) ∗ f (y). Then H
is a group with identity f (e).
In the future, we usually just write + for modular addition.

The dihedral group Dn is the full symmetry group of a regular n-gon, which
includes both rotations and flips. There are 2n elements in total, consisting
of n rotations and n flips. Label the vertices consecutively by 1, 2, 3 . . .. Let
R = (123 . . . n) be the basic rotation. This generates a cyclic subgroup Cn ⊂ Dn .
The reflection around the line through the midpoint of the edge 1n and the
opposite side or vertex is

F = (1 n)(2 n − 1)(3 n − 2) . . .
One can calculate that

F R = | 1  2   . . . 1n | | 1 2 . . . n | = (1 n − 1)(2 n − 2) . . .
      | n n−1 . . .  1 | | 2 3 . . . 1 |

is another flip, and furthermore that

F RF = | 1   2   . . . n | | 1  2   . . . n |
       | n−1 n−2 . . . n | | n n−1 . . . 1 |

     = | 1 2 . . .  n  |
       | n 1 . . . n−1 |

     = R−1

Here’s the point. We will eventually see that the elements of Dn are given by
I, R, R2 , . . . , Rn−1 , F, F R, F R2 , . . . , F Rn−1 . So we say that the elements R and
F generate the group. (In general, to say that a set of elements generates a
group means that every element can be written as a product of the generators
in some way, such as F R2 F 3 .) We have three basic relations among the
generators

F 2 = I, Rn = I, F RF = R−1

Everything else about Dn follows from this. In particular, we won’t have to
multiply any more permutations. For instance, let us check that (F R)2 = I
using only these relations:

(F R)2 = (F RF )R = R−1 R = I
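The three relations can be checked for any concrete n by multiplying the permutations in Python. A sketch for n = 5, with the dict representation used earlier (variable names are ours):

```python
n = 5
R = {i: i % n + 1 for i in range(1, n + 1)}    # (1 2 ... n)
F = {i: n + 1 - i for i in range(1, n + 1)}    # (1 n)(2 n-1)...
I = {i: i for i in range(1, n + 1)}

def compose(f, g):
    # (f g)(x) = f(g(x))
    return {x: f[g[x]] for x in g}

Rinv = {v: k for k, v in R.items()}            # R^{-1}

FRF = compose(F, compose(R, F))
FR = compose(F, R)
```

Here `compose(F, F) == I`, `FRF == Rinv`, and composing R with itself n times gives the identity, confirming F² = I, Rⁿ = I and F RF = R⁻¹; also `compose(FR, FR) == I`, as derived above from the relations alone.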

4.8 Exercises
1. Determine all the generators of Z6 and Z8 . Is there an obvious pattern?
2. Let Z∗7 = {1, 2, 3, 4, 5, 6} with an operation defined by x ⊙ y = (x · y) mod 7.
Assume that it is associative, and check that Z∗7 is a cyclic group.
3. Given a finite group G and g ∈ G, prove that {e, g, g 2 , . . .} is a cyclic
subgroup. This is called the subgroup generated by g. The order of this
group is called the order of g. Prove that the order is the smallest positive
integer n such that g n = e.
4. Given a function f : H → G such that f (x ∗ y) = f (x) ∗ f (y), prove that
f takes the identity to the identity and is therefore a homomorphism.

5. Complete the proof of lemma 4.7.
6. Let us say that an infinite group is cyclic if it is isomorphic to Z. Prove that
the set of even integers is cyclic.
7. Let G ⊆ Z be a nonzero subgroup, where Z is a group under addition. Let
d ∈ G be the smallest positive element. Prove that if x ∈ G, then x = qd
for some integer q. Conclude that G is cyclic.
8. Let F, R ∈ Dn be as above.
(a) For any i > 0, show that F Ri F = R−i , where R−i is the inverse of
Ri .
(b) Show that for any i, j > 0, (F Ri )(F Rj ) is a rotation.
(c) Show every element of Dn is either Ri or F Ri with i = 0, 1, . . . , n − 1.
9. Assuming the previous exercise, show that f : Dn → Z2 given by f (Ri ) =
0 and f (F Ri ) = 1 is a homomorphism.
10. Let G ⊂ O(2) be the set of matrices

{ | cos θ ± sin θ |  with θ = 2πk/n , k = 0, 1, . . . , n − 1 }
  | sin θ ∓ cos θ |

Let

R = R(2π/n),   F = | 1  0 |
                   | 0 −1 |
Check that G is generated by these two elements, and that they satisfy
the same relations as the generators of the Dn . Use these facts to prove
that Dn is isomorphic to G.

Chapter 5

Finite sets, counting and


group theory

Let N = {0, 1, 2, . . .} be the set of natural numbers. Given n, let [n] = {x ∈ N |
x < n}, so that [0] = ∅ is the empty set, and [n] = {0, 1, . . . , n − 1} if n > 0.
A set X is called finite if there is a one to one onto function (also called a one
to one correspondence) f : [n] → X for some n ∈ N. The choice of n is unique
(which we will accept as a fact), and is called the cardinality of X, which we
denote by |X|.
Lemma 5.1. If X is finite and g : X → Y is a one to one correspondence,
then Y is finite and |Y | = |X|.
Proof. By definition, we have a one to one correspondence f : [n] → X, where
n = |X|. Therefore g ◦ f : [n] → Y is a one to one correspondence.
Proposition 5.2. If a finite set X can be written as a union of two disjoint
subsets Y ∪ Z, then |X| = |Y | + |Z|. (Recall that Y ∪ Z = {x | x ∈ Y or x ∈ Z},
and disjoint means their intersection is empty.)
Proof. Let f : [n] → Y and g : [m] → Z be one to one correspondences. Define
h : [n + m] → X by

h(i) = f (i)       if i < n
h(i) = g(i − n)    if i ≥ n
This is a one to one correspondence.
A partition of X is a decomposition of X as a union of subsets X = Y1 ∪
Y2 ∪ . . . Yn such that Yi and Yj are disjoint whenever i ≠ j.
Corollary 5.3. If X = Y1 ∪ Y2 ∪ . . . Yn is a partition, then |X| = |Y1 | + |Y2 | +
. . . |Yn |.

Proof. We have that

|X| = |Y1| + |Y2 ∪ · · · ∪ Yn| = |Y1| + |Y2| + |Y3 ∪ · · · ∪ Yn| = · · · = |Y1| + |Y2| + · · · + |Yn|

Given a function f : X → Y and an element y ∈ Y, the preimage of y is

    f^{-1}(y) = {x ∈ X | f(x) = y}

Proposition 5.4. If f : X → Y is a function, then

    |X| = Σ_{y∈Y} |f^{-1}(y)|

Proof. The collection {f −1 (y)} forms a partition of X.


The cartesian product of two sets is the set of ordered pairs

X × Y = {(x, y) | x ∈ X, y ∈ Y }

Theorem 5.5. If X and Y are finite sets, then |X × Y | = |X||Y |.


Proof. Let p : X × Y → Y be the projection map defined by p(x, y) = y. Then

    p^{-1}(y) = {(x, y) | x ∈ X}

and (x, y) ↦ x gives a one to one correspondence to X. Therefore, by the previous corollary,

    |X × Y| = Σ_{y∈Y} |p^{-1}(y)| = |Y||X|

Let us apply these ideas to group theory.


Given a subgroup H ⊂ G and g ∈ G, let gH = {gh | h ∈ H}. This is called
a (left) coset. For example, when G = S3 and H = {I, (123), (321)}, the cosets
are
IH = (123)H = (321)H = H
and
(12)H = (13)H = (23)H = {(12), (13), (23)}
Thus the collection of distinct cosets gives a partition of S3 into rotations and
flips, and there are the same number of each. Let G/H denote the set of distinct
cosets of H in G. We will prove a similar statement in general.
Lemma 5.6. If two cosets g1 H and g2 H have a nonempty intersection then
g1 H = g2 H.

Proof. If g ∈ g1H ∩ g2H, we can write g = g1h1 = g2h2 with h1, h2 ∈ H. Then
g2 = g1h1h2^{-1}. If h ∈ H, then h1h2^{-1}h ∈ H because H is a subgroup. Therefore
g2h = g1h1h2^{-1}h ∈ g1H. This proves that g2H ⊆ g1H. The same argument,
with g1 and g2 interchanged, shows that g1H ⊆ g2H. Therefore these sets are
equal.
Lemma 5.7. The set of cosets G/H is a partition of G.
Proof. Every element g ∈ G lies in the coset gH. Therefore G is the union of
cosets. By the previous lemma, the cosets are pairwise disjoint.
Lemma 5.8. If H is finite, |gH| = |H| for every g.
Proof. Let f : H → gH be defined by f(h) = gh. Then f is onto. Suppose that
f(h1) = f(h2). Then h1 = g^{-1}gh1 = g^{-1}gh2 = h2. Therefore f is also one to
one. Consequently |gH| = |H|.
Theorem 5.9 (Lagrange). If H ⊆ G is a subgroup of a finite group, then

|G| = |H| · |G/H|

In particular, the order of H divides the order of G.


Proof. By the previous results, G/H is a partition of G into |G/H| sets each of
cardinality |H|.
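The coset computation for S3 above, and Lagrange's theorem itself, can be checked by brute force. Here is a minimal Python sketch; encoding permutations of {0, 1, 2} as tuples is our own convention, not the text's notation.

```python
from itertools import permutations

def compose(f, g):
    # (f o g)(i) = f(g(i)); permutations are stored as tuples
    return tuple(f[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))        # the group S3, of order 6
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # the "rotation" subgroup

# Collect the distinct left cosets gH as frozensets, so repeats merge.
cosets = {frozenset(compose(g, h) for h in H) for g in G}
```

Here there are 2 cosets of size 3 (the rotations and the flips), and 6 = 3 · 2 as the theorem predicts.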
Given g ∈ G, the order of g is the smallest positive n such that g^n = e. This
was shown in a previous exercise to be the order of the subgroup generated by
g. Therefore:
Corollary 5.10. The order of any element g ∈ G divides the order of G.
Corollary 5.11. If the order G is a prime number, then G is cyclic.
Proof. Let p = |G|. By the previous corollary, the order of any g ∈ G divides p.
If g ≠ e, then the order must be p. Therefore G is generated by g.
One can ask whether the converse of the first corollary holds, that is, if |G|
is divisible by n, does G necessarily have an element of order n? The answer is no:
it would fail for n = |G| unless G is cyclic. Even if we require n < |G|, it
may still fail (exercise 9). However, if n is prime, then it is true.
Theorem 5.12 (Cauchy). If the order of a finite group G is divisible by a prime
number p, then G has an element of order p.
Proof when p = 2. Suppose that |G| is even. We can partition G into A = {g ∈
G | g^2 = e} and B = {g ∈ G | g^2 ≠ e}. Therefore |G| = |A| + |B|. Every element
g ∈ B satisfies g ≠ g^{-1}. Therefore |B| is even, because we can write B as a
disjoint union of pairs {g, g^{-1}}. Therefore |A| = |G| − |B| is even. Furthermore
|A| ≥ 1 because e ∈ A. It follows that A contains an element different from e,
and this must have order 2.
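The counting in the p = 2 proof can be watched in a small example. The sketch below partitions S3, which has even order, exactly as in the argument; the tuple encoding of permutations is an assumption of the sketch.

```python
from itertools import permutations

def compose(f, g):
    return tuple(f[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))   # |S3| = 6 is even
e = (0, 1, 2)                      # the identity permutation

A = [g for g in G if compose(g, g) == e]   # the g with g^2 = e
B = [g for g in G if compose(g, g) != e]   # everything else
```

Here A consists of e and the three transpositions (|A| = 4, even), B of the two 3-cycles (|B| = 2, even), and any element of A other than e has order 2.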

Next, we want to develop a method for computing the order of a subgroup
of Sn .
Definition 5.13. Given a subgroup G ⊆ Sn and i ∈ {1, . . . , n}, the orbit is
Orb(i) = {g(i) | g ∈ G}. The subgroup G is called transitive if for some i, Orb(i) = {1, . . . , n}.

Definition 5.14. Given a subgroup G ⊆ Sn and i ∈ {1, . . . , n}, the stabilizer of i
is Stab(i) = {f ∈ G | f(i) = i}.
Theorem 5.15 (Orbit-Stabilizer theorem). Given a subgroup G ⊆ Sn and
i ∈ {1, . . . , n}, we have

    |G| = |Orb(i)| · |Stab(i)|

In particular, |G| = n |Stab(i)| if G is transitive.
Proof. We define a function f : G → Orb(i) by f (g) = g(i). The preimage
T = f −1 (j) = {g ∈ G | g(i) = j}. By definition if j ∈ Orb(i), there exists
g0 ∈ T . We want to show that T = g0 Stab(i). In one direction, if h ∈ Stab(i)
then g0 h(i) = j. Therefore g0 h ∈ T . Suppose g ∈ T . Then g = g0 h where
h = g0−1 g. We see that h(i) = g0−1 g(i) = g0−1 (j) = i. Therefore, we have
established that T = g0 Stab(i). This shows that

    |G| = Σ_{j∈Orb(i)} |f^{-1}(j)| = Σ_{j∈Orb(i)} |Stab(i)| = |Orb(i)| · |Stab(i)|
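The theorem is easy to test on a concrete subgroup of S4. The sketch below closes up a rotation and a reflection of a square (vertices 0, 1, 2, 3) under composition and then compares |G| with |Orb(0)| · |Stab(0)|; the encoding is our own.

```python
from itertools import product

def compose(f, g):
    return tuple(f[g[i]] for i in range(len(g)))

r = (1, 2, 3, 0)   # rotate the square: 0 -> 1 -> 2 -> 3 -> 0
f = (0, 3, 2, 1)   # reflect across the diagonal through vertex 0

# Close {identity, r, f} under composition to get the whole subgroup.
G = {(0, 1, 2, 3), r, f}
while True:
    new = {compose(a, b) for a, b in product(G, repeat=2)} - G
    if not new:
        break
    G |= new

orb = {g[0] for g in G}              # Orb(0) = {g(0) | g in G}
stab = {g for g in G if g[0] == 0}   # Stab(0)
```

Here |G| = 8, |Orb(0)| = 4 and |Stab(0)| = 2, so |G| = |Orb(0)| · |Stab(0)| as the theorem asserts.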

Corollary 5.16. |Sn | = n!


Proof. We prove this by mathematical induction starting from n = 1. When
n = 1, Sn consists of the identity so |S1| = 1 = 1!. In general, assuming that the
corollary holds for n, we prove it for n + 1. The group Sn+1 acts transitively
on {1, . . . , n + 1}. We want to show that there is a one to one correspondence
between Stab(n + 1) and Sn. An element f ∈ Stab(n + 1) looks like

    (   1      2    . . .    n     n + 1 )
    ( f(1)   f(2)   . . .   f(n)   n + 1 )

Dropping the last column yields a permutation in Sn, and any permutation in Sn
extends uniquely to an element of Stab(n + 1) by adding that column. Therefore
we have established the correspondence. It follows that |Stab(n + 1)| = |Sn| =
n!. Therefore

    |Sn+1| = (n + 1)|Stab(n + 1)| = (n + 1)(n!) = (n + 1)!

5.17 Exercises
1. Given finite sets Y and Z, prove that |Y ∪ Z| = |Y| + |Z| − |Y ∩ Z|. Recall
that the intersection is Y ∩ Z = {x | x ∈ Y and x ∈ Z}.
2. If B ⊆ A, prove that |A − B| = |A| − |B|, where A − B = {a | a ∈
A and a ∉ B}. Use this to prove that the set of distinct pairs {(x1, x2) ∈
X × X | x1 ≠ x2} has |X|^2 − |X| elements.
3. We can use the above counting formulas to solve simple exercises in probability
theory. Suppose that a 6-sided die is rolled twice. There are
6 × 6 = 36 possible outcomes. Given a subset S of these outcomes, called
an event, the probability of S occurring is |S|/36.
(a) What is the probability that a five or six is obtained on the first roll?
(b) What is the probability that a five or six is obtained in either (or
both) roll(s)?
(c) What is the probability that the same number is rolled twice?
(d) What is the probability that different numbers are obtained for each roll?
Explain how you got your answers.
4. Let G ⊆ Sn be a subgroup.
(a) Prove that the stabilizer H of an element i is a subgroup of G.
(b) A subgroup H ⊂ G is a normal subgroup if ghg^{-1} ∈ H for all g ∈ G
and h ∈ H. Is the stabilizer a normal subgroup?
5. By the previous results, the order of an element g ∈ Sn must divide n!. We
can do much better. Find a better bound using the cycle decomposition.
6. What is the probability that an element of S5 has order 2?
7. Choose two elements g1 , g2 from a finite group G. What is the probability
that g1 g2 = e?
8. Determine all the transitive subgroups of S3 .
9. Let Zm1 × Zm2 × · · · × Zmn = {(a1, . . . , an) | ai ∈ Zmi} be the set of vectors.
(a) Show that this becomes a group using (a1 , . . . , an ) + (b1 , . . . , bn ) =
(a1 + b1 , . . . , an + bn ) with mod mi arithmetic in each slot.
(b) Show that the order of this group is m1 m2 . . . mn .
(c) Let m be the least common multiple of m1 , . . . , mn . Show that all
elements have order dividing m.
10. Prove that Cauchy’s theorem holds for the group defined in the previous
exercise.

(I) 1
(A) 3 (rotations) × 3 (pairs of faces) = 9
(B) 2 (rotations) × 4 (pairs of vertices) = 8
(C) 6 (opposite pairs of edges ) = 6
making 24. To see that this is a complete list, we use the orbit-stabilizer
theorem. The action of C is transitive, and Stab(1) consists of I, and two other
elements of type B. Therefore |C| = (8)(3) = 24. In principle, we have a
complete description of C. However, we can do better. There are 4 diagonal lines
such as 17. One can see that any non-identity element of C must permute the
diagonal lines nontrivially. A bit more formally, we have produced a one to one
homomorphism from C to S4. Since they both have order 24, we can conclude
that:
Lemma 6.2. The symmetry group C of a cube is isomorphic to S4 .

Let us now turn to counting problems with symmetry.


Question 6.3. How many dice are there?
Recall that a die (singular of dice) is gotten by labelling the faces of a cube by
the numbers 1 through 6. One attempt at a solution goes as follows. Choose
some initial labelling; then there are as many ways to relabel as there are
permutations, which is 6! = 720. This doesn't take into account that there are 24 ways
to rotate the cube, and each rotated die should be counted as the same. From
this, one may expect that there are 720/24 = 30 possibilities. This seems more
reasonable.
Question 6.4. How many cubes are there with 3 red faces and 3 blue?

Arguing as above, labelling the faces of the cube 1 through 6, there are
(6 choose 3) = 20 ways to pick 3 red faces. But this discounts symmetry. On the
other hand, dividing by the number of symmetries yields 20/24, which doesn't
make sense. Clearly something more sophisticated is required. Let X be a
finite set of things such as relabellings of the cube, or colorings of a labelled
cube, and suppose that G is a finite set of permutations of X. In fact, we
only need to assume that G comes with a homomorphism to SX. This means
that each g ∈ G determines a permutation of X such that g1g2(x) = g1(g2(x))
for all gi ∈ G, x ∈ X. We say that G acts on X. Given x ∈ X, its orbit is
Orb(x) = {g(x) | g ∈ G}, and let X/G be the set of orbits. Since we really want
x and g(x) to be counted as one thing, we should count the number of orbits.
Given g ∈ G, let Fix(g) = {x ∈ X | g(x) = x} be the set of fixed points.
Theorem 6.5 (Burnside's Formula). If G is a finite group acting on a finite
set X, then

    |X/G| = (1/|G|) Σ_{g∈G} |Fix(g)|

Before starting the proof, we define the stabilizer Stab(x) = {g ∈ G | g(x) = x}.
Theorem 5.15 generalizes, with the same proof, to

    |G| = |Orb(x)| · |Stab(x)|

Proof of Burnside. Let

S = {(x, g) ∈ X × G | g(x) = x}

Consider the map p : S → G given by p(x, g) = g. Then p^{-1}(g) = Fix(g).
Therefore proposition 5.4 applied to p yields

    |S| = Σ_{g∈G} |p^{-1}(g)| = Σ_{g∈G} |Fix(g)|     (6.1)

Next consider the map q : S → X given by q(x, g) = x. Then q^{-1}(x) =
Stab(x). Therefore proposition 5.4 applied to q yields

    |S| = Σ_{x∈X} |q^{-1}(x)| = Σ_{x∈X} |Stab(x)|

Let us write X as a disjoint union of orbits Orb(x1) ∪ Orb(x2) ∪ . . ., and group
terms of the last sum into these orbits:

    |S| = Σ_{x∈Orb(x1)} |Stab(x)| + Σ_{x∈Orb(x2)} |Stab(x)| + . . .

Each orbit Orb(xi) has |G|/|Stab(xi)| elements by the orbit-stabilizer theorem.
Furthermore, for any x ∈ Orb(xi), we have |Stab(x)| = |Stab(xi)|. Therefore

    Σ_{x∈Orb(xi)} |Stab(x)| = Σ_{x∈Orb(xi)} |Stab(xi)| = (|G|/|Stab(xi)|) · |Stab(xi)| = |G|

Consequently

    |S| = |G| + |G| + . . . = |G| · |X/G|

with one term |G| for each orbit.

Combining this with equation (6.1) yields

    |G| · |X/G| = Σ_{g∈G} |Fix(g)|

Dividing by |G| yields the desired formula.

Let us say that the action of G on X is fixed point free if Fix(g) = ∅ unless
g is the identity. In this case the naive formula works.

Corollary 6.6. If the action is fixed point free,

|X/G| = |X|/|G|

Coming back to question 6.3, let X be the set of relabellings of the cube,
and G = C the symmetry group of the cube. Then the action is fixed point
free, so that |X/G| = 720/24 = 30 gives the correct answer.
The solution to question 6.4 using Burnside’s formula is rather messy (the
answer is 2). So instead, let us consider the simpler question.
Question 6.7. How many ways can we color a regular tetrahedron with 2 red
and 2 blue faces?

Let X be the set of such colorings, and let T be the symmetry group. Then

    Fix(I) = X

has (4 choose 2) = 6 elements. We can see that

Fix(g) = ∅

for any 3-cycle such as g = (123) because we would need to have 3 faces the
same color for any fixed point. For a fixed point of g = (13)(24), the sides
adjacent to 13 and 24 would have to be the same color. Therefore

| Fix(g)| = 2

The same reasoning applies to g = (14)(23) or (12)(34). Thus

    |X/T| = (1/12)(6 + 2 + 2 + 2) = 1

Of course, this can be figured out directly.
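The computation just done by hand can also be automated. The following sketch represents the twelve rotations of the tetrahedron as the even permutations of the four faces, and applies Burnside's formula to the 2-red/2-blue colorings; the encoding of colorings as sets of red faces is our own.

```python
from itertools import permutations, combinations

def is_even(p):
    # parity of a permutation via its inversion count
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p))
              if p[i] > p[j])
    return inv % 2 == 0

T = [p for p in permutations(range(4)) if is_even(p)]   # A4, order 12

# A coloring is the set of its two red faces: (4 choose 2) = 6 of them.
X = [frozenset(c) for c in combinations(range(4), 2)]

def act(g, coloring):
    return frozenset(g[i] for i in coloring)

fix = [sum(1 for x in X if act(g, x) == x) for g in T]
n_orbits = sum(fix) // len(T)   # Burnside: (6 + 2 + 2 + 2 + eight 0s)/12
```

As in the text, the identity fixes all 6 colorings, the three double transpositions fix 2 each, the eight 3-cycles fix none, and n_orbits comes out to 1.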
In general, Burnside's formula can be a bit messy to use. In practice, however,
there are some tricks to simplify the sum. Given two elements g1, g2 of a
group, we say that g1 is conjugate to g2 if g1 = hg2h^{-1} for some h ∈ G. Since we
can rewrite this as g2 = h^{-1}g1h, we can see that the relationship is symmetric.
Here are a couple of examples.
Example 6.8. Every element g is conjugate to itself because g = ege^{-1}.
Example 6.9. In the dihedral group Dn, R is conjugate to R^{-1} because FRF =
FRF^{-1} = R^{-1}.
An important example is:
Example 6.10. In Sn , any cycle is conjugate to any other cycle of the same
length.
The relevance for counting problems is as follows.

Chapter 7

Kernels and quotients

Recall that a homomorphism between groups f : G → Q is a map which preserves
the operation and identity (which we denote by · and e). It need not be one to
one. The failure to be one to one is easy to measure.
Definition 7.1. Given a homomorphism between groups f : G → Q, the kernel
ker f = {g ∈ G | f (g) = e}.
Lemma 7.2. A homomorphism is one to one if and only if ker f = {e}.
The proof will be given as an exercise. The kernel is a special kind of
subgroup. It’s likely that you already encountered this notion in linear algebra
in the context of linear transformations. There it also called the kernel or
sometimes the null space.
Definition 7.3. A subgroup H ⊂ G is called normal if ghg^{-1} ∈ H for all
g ∈ G and h ∈ H. The operation h ↦ ghg^{-1} is called conjugation of h by g. So
normality of H means that it is closed under conjugation by elements of G.
Proposition 7.4. Suppose that f : G → Q is a homomorphism, then ker f is
a normal subgroup.
Proof. Let h1, h2 ∈ ker f and g ∈ G. Then f(h1h2) = f(h1)f(h2) = e and
f(h1^{-1}) = f(h1)^{-1} = e, so ker f is a subgroup. Moreover
f(gh1g^{-1}) = f(g)f(h1)f(g)^{-1} = f(g)f(g)^{-1} = e, so ker f is closed under conjugation.
Here are some examples.
Example 7.5. If G is abelian, then any subgroup is normal.
Example 7.6. In S3 , H = {I, (123), (321)} is a normal subgroup. The subgroup
{I, (12)} is not normal because (12) is conjugate to (13) and (23).
We want to prove that every normal subgroup arises as the kernel of a homo-
morphism. This involves the quotient construction. Given subsets H1 , H2 ⊂ G
of a group, define their product by

H1 H2 = {h1 h2 | h1 ∈ H1 , h2 ∈ H2 }

Lemma 7.7. If H ⊆ G is normal, then the product of cosets satisfies (g1 H)(g2 H) =
(g1 g2 )H.
Proof. By definition, (g1H)(g2H) = {g1h1g2h2 | h1, h2 ∈ H}. Since H is normal,
h3 = g2^{-1}h1g2 ∈ H. Therefore g1h1g2h2 = g1g2h3h2 ∈ (g1g2)H. This
proves (g1H)(g2H) ⊆ (g1g2)H.
For the reverse inclusion (g1g2)H ⊆ (g1H)(g2H), observe that if h ∈ H,
then g1g2h = (g1e)(g2h) ∈ (g1H)(g2H).

Theorem 7.8. If H ⊆ G is a normal subgroup, then G/H becomes a group with


respect to the product defined above. The map p(g) = gH is a homomorphism
with kernel H.
Proof. By the previous lemma, (gH)(eH) = gH = (eH)(gH), (gH)(g^{-1}H) =
H = (g^{-1}H)(gH), and (g1H)((g2H)(g3H)) = g1g2g3H = ((g1H)(g2H))(g3H). So
G/H is a group. Also p(g1g2) = g1g2H = (g1H)(g2H) = p(g1)p(g2), so p is a
homomorphism. Furthermore, ker p = {g ∈ G | gH = H} = H.
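For a finite example, one can verify directly that the elementwise product of two cosets is again a single coset. The sketch below does this for the normal subgroup of rotations in S3, with the same tuple encoding as before (our convention, not the text's).

```python
from itertools import permutations

def compose(f, g):
    return tuple(f[g[i]] for i in range(len(g)))

G = list(permutations(range(3)))        # S3
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # normal subgroup of rotations

def coset(g):
    return frozenset(compose(g, h) for h in H)

# The elementwise product (g1 H)(g2 H) should equal the coset (g1 g2)H.
ok = all(
    {compose(a, b) for a in coset(g1) for b in coset(g2)}
    == coset(compose(g1, g2))
    for g1 in G for g2 in G
)
```

Trying the same check with a non-normal subgroup such as {I, (12)} is a good way to see why normality is needed.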
When H is normal, we refer to G/H as the quotient group. Quotient groups
often show up indirectly as follows.
Lemma 7.9. Let f : G → H be a homomorphism with kernel K = ker f .
Then the image f (G) = {f (g) | g ∈ G} is a subgroup isomorphic to G/K. In
particular, H is isomorphic to G/K if f is onto.
The proof will be given as an exercise. The quotient construction can be
used to tie up some loose ends from earlier sections. Let n be a positive integer,
and let nZ = {nx | x ∈ Z}. This is a subgroup. So we can form the quotient
Zn^new = Z/nZ. The label “new” is temporary, and is there to distinguish it from
Zn = {0, 1, . . . , n − 1}. Given an integer x, let x̄ = x + nZ. In particular, x ↦ x̄
gives a map from Zn → Zn^new. We leave it as an exercise to show this is a one
to one correspondence, and that

    \overline{x ⊕ y} = x̄ + ȳ

where + on the right is addition in the quotient group. Thus, we can conclude
that the old and new versions of Zn are isomorphic, and we will conflate the
two. Recall, in fact, that we never fully completed the proof that the old Zn
was a group. Now we don't have to!

Normal subgroups can be used to break up complicated groups into simpler
pieces. For example, in the exercises, we will see that the dihedral group Dn
contains a cyclic subgroup Cn, which is normal, and the quotient Dn/Cn is also
cyclic. Here we look at the related example of the orthogonal group O(2). This
is the full symmetry group of the circle, which includes rotations and reflections.
The rotations form a subgroup SO(2).

Proposition 7.10. SO(2) is a normal subgroup of O(2).
We give two proofs. The first, which uses determinants, gets to the point
quickly. However, the second proof is also useful since it leads to the formula
(7.1).
First Proof. We start with a standard result.
Theorem 7.11. For any pair of 2 × 2 matrices A and B, det AB = det A det B.
Proof. A brute force calculation shows that

    (a11 a22 − a12 a21)(b11 b22 − b12 b21)

and

    (a11 b11 + a12 b21)(a21 b12 + a22 b22) − (a11 b12 + a12 b22)(a21 b11 + a22 b21)

both can be expanded to

    a11 a22 b11 b22 − a11 a22 b12 b21 − a12 a21 b11 b22 + a12 a21 b12 b21

Therefore det : O(2) → R∗ is a homomorphism, where R∗ denotes the group
of nonzero real numbers under multiplication. It follows that SO(2) is the
kernel, so it is normal.
Second Proof. We have to show that AR(θ)A^{-1} ∈ SO(2) for any A ∈ O(2).
This is true when A ∈ SO(2) because SO(2) is a subgroup.
It remains to show that conjugating a rotation by a reflection is a rotation.
In fact we will show that for any reflection A

    AR(θ)A^{-1} = R(−θ)     (7.1)

First let A be the reflection

    F = [ 1   0 ]
        [ 0  −1 ]

about the x-axis. Then an easy calculation shows that FR(θ)F^{-1} = FR(θ)F = R(−θ).
Now assume that A is a general reflection. Then

    A = [ cos φ   sin φ ]  = F R(−φ)
        [ sin φ  −cos φ ]

So

    AR(θ)A^{-1} = FR(−φ)R(θ)R(φ)F = R(−θ)

as claimed.
So now we have a normal subgroup SO(2) ⊂ O(2) which we understand
pretty well. What about the quotient O(2)/SO(2)? This can be identified with the
cyclic group {±1} ⊂ R∗ using the determinant.

7.12 Exercises
1. Prove lemma 7.2.
2. Determine the normal subgroups of S3 .
3. Prove lemma 7.9. (Hint: first prove that f(G) is a subgroup. Then show that
f̄(gH) = f(g) is a well defined function which gives an isomorphism
G/K ≅ f(G).)
4. (a) Given a group G and a normal subgroup H, let S ⊂ G be a subset
with the property that S ∩ gH has exactly one element for every g ∈
G. Show that the restriction of p gives a one to one correspondence
S → G/H.
(b) Show that these conditions hold for G = R, H = 2πZ and S = [0, 2π).
5. Prove that Zn is isomorphic to the quotient group Z/nZ as claimed earlier.
6. Check that SL2 (R) = {A ∈ GL2 (R) | det A = 1} is a normal subgroup of
GL2 (R).
7. In an earlier exercise in chapter , you showed that the set of upper triangular matrices

        [ 1  a ]
        [ 0  1 ]

    is a subgroup of GL2(R). Is it normal?
8. Let H ⊆ G be a normal subgroup and f : G → K an onto homomorphism.
Prove that f(H) = {f(h) | h ∈ H} is a normal subgroup. What if f is not
onto?
9. Given a group G, its center Z(G) is the set of elements c which satisfy
cg = gc for every g ∈ G.

(a) Prove that the center is an abelian normal subgroup.


(b) Does an abelian normal subgroup necessarily lie in the center? (Think
about the dihedral group.)

10. Check that the center of Sn , when n > 2, is trivial in the sense that it
consists of only the identity.

Chapter 8

Rings and modular


arithmetic

So far, we have been working with just one operation at a time. But standard
number systems, such as Z, have two operations + and · which interact. It is
useful to give a name to this sort of thing.
Definition 8.1. A ring consists of a set R with elements 0, 1 ∈ R, and binary
operations + and · such that: (R, +, 0) is an Abelian group, · is associative with
1 as the identity, and · distributes over + on the left and right:

x · (y + z) = x · y + x · z

(y + z) · x = y · x + z · x
Definition 8.2. A ring is commutative if in addition

x·y =y·x

Here are some basic examples that everyone should already know.
Example 8.3. Let Z (respectively Q, R , C) be the set of integers (respectively
rational numbers, real numbers, complex numbers) with the usual operations.
These are all commutative rings.
Example 8.4. The set Mnn (R) of n × n matrices over R with the usual matrix
operations forms a ring. It is not commutative when n > 1.
We now focus on a new example. Let n be a positive integer, and write
Zn = Z/nZ = {0̄, 1̄, . . . , \overline{n − 1}}, where x̄ = x + nZ. We already know that this
has an addition given by addition of cosets:

    ā + b̄ = \overline{a + b}

From here on, we'll stop writing ⊕. We will try to define multiplication the
same way by

    ā b̄ = \overline{ab}

However, we have to prove that this definition makes sense. In other words, we
have to show that the right side depends only on ā and b̄ rather than a and b.
Lemma 8.5. If ā = ā′ and b̄ = b̄′, then \overline{ab} = \overline{a′b′}.
Proof. The equality x̄ = x̄′ holds if and only if x − x′ is divisible by n. Therefore
a′ = a + nx and b′ = b + ny for some x, y ∈ Z. It follows that a′b′ = ab + n(ay + bx + nxy).

Theorem 8.6. Zn is a commutative ring.


Proof. The laws follow from the fact that Z is a commutative ring, the definition
of the operations in Zn, and the fact that the map Z → Zn is onto. For example,
here is a proof of the distributive law:

    (x̄ + ȳ)z̄ = \overline{(x + y)z} = \overline{xz + yz} = x̄z̄ + ȳz̄

When it's clear we're working in Zn, we usually just write x instead of x̄.
To get a feeling for modular multiplication, let's write down the table for Z6:

· 0 1 2 3 4 5
0 0 0 0 0 0 0
1 0 1 2 3 4 5
2 0 2 4 0 2 4
3 0 3 0 3 0 3
4 0 4 2 0 4 2
5 0 5 4 3 2 1

One curious fact is that some nonzero numbers, such as 2, can be multiplied by
other nonzero numbers to get 0. We say that such a number is a zero divisor.
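A short script reproduces the table above and picks out the zero divisors; only the choice n = 6 is hardwired.

```python
n = 6
table = [[(a * b) % n for b in range(n)] for a in range(n)]

# A nonzero a is a zero divisor if a*b = 0 mod n for some nonzero b.
zero_divisors = sorted(
    a for a in range(1, n)
    if any((a * b) % n == 0 for b in range(1, n))
)
```

For n = 6 this gives [2, 3, 4], while 5 is invertible since 5 · 5 = 25 = 1 mod 6.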
Lemma 8.7. An element m ∈ Zn is a zero divisor if m > 1 and m divides n.
Proof. We have that n = mm′ for some 0 < m′ < n, so that m̄ m̄′ = n̄ = 0̄.
Also notice that the number 5 has a reciprocal, namely 5.
Definition 8.8. An element x ∈ R of a ring is invertible if there exists an
element y such that xy = yx = 1. Let R∗ denote the set of invertible elements.
(When R is commutative, invertible elements are also called units.)
Lemma 8.9. If R is a ring, then R∗ is a group with respect to multiplication.

This will be proven in the exercises. The group of invertible elements is
easy to determine for the previous examples. For example, Mnn(R)∗ = GLn(R).
Given two integers a, b, a common divisor is an integer d such that d|a and
d|b. The greatest common divisor is exactly that, the common divisor greater
than or equal to all others (it exists since the set of common divisors is finite).
We denote this by gcd(a, b).
Lemma 8.10 (Euclid). If a, b are natural numbers then gcd(a, b) = gcd(b, a mod b)
Proof. Let r = a mod b. Then the division algorithm gives a = qb + r for some
integer q. Since gcd(b, r) divides b and r, it divides qb + r = a. Therefore
gcd(b, r) is a common divisor of a and b, so that gcd(b, r) ≤ gcd(a, b). On
the other hand, r = a − qb implies that gcd(a, b)|r. Therefore gcd(a, b) is a
common divisor of b and r, so gcd(a, b) ≤ gcd(b, r), which forces them to be
equal.
This lemma leads to a method for computing gcds. For example

gcd(100, 40) = gcd(40, 20) = gcd(20, 0) = 20.
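Euclid's lemma translates directly into a loop: each pass replaces the pair (a, b) by (b, a mod b), exactly as in the chain above.

```python
def gcd(a, b):
    # gcd(a, b) = gcd(b, a mod b); stop when the remainder reaches 0
    while b != 0:
        a, b = b, a % b
    return a
```

For example, gcd(100, 40) runs through (100, 40), (40, 20), (20, 0) and returns 20.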

For our purposes, a diophantine equation is an equation with integer co-


efficients where the solutions are also required to be integers. The simplest
examples are the linear ones: given integers a, b, c, find all integers m, n such
that am + bn = c.
Theorem 8.11. Given integers a, b, c, am+bn = c has a solution with m, n ∈ Z
if and only if gcd(a, b)|c.
Proof. If (m, n) is a solution of am + bn = c, then (±m, ±n) is a solution of
(±a)m + (±b)n = c, so we may as well assume that a, b ≥ 0. We now prove the
theorem for natural numbers a, b by induction on the minimum min(a, b).
If min(a, b) = 0, then one of them, say b, is 0. Since a = gcd(a, b) divides
c by assumption, (c/a, 0) gives a solution of am + bn = c. Now assume that
a′m + b′n = c′ has a solution whenever min(a′, b′) < min(a, b) and the other
conditions are fulfilled. Suppose b ≤ a, and let r = r(a, b) = a mod b and
q = q(a, b) be given as in theorem 4.5. Then bm′ + rn′ = c has a solution (m′, n′)
since min(b, r) = r < b = min(a, b) and gcd(b, r) = gcd(a, b) divides c. Let m = n′
and n = m′ − qn′; then

    am + bn = an′ + b(m′ − qn′) = bm′ + (a − qb)n′ = bm′ + rn′ = c.

From the last proof, we can deduce:

Corollary 8.12. Given a, b ∈ Z, there exist m, n ∈ Z such that am + bn = gcd(a, b).
We can now determine the invertible elements of Zn.

Theorem 8.13. m ∈ Zn is invertible if and only if gcd(m, n) = 1 (we also say
that m and n are relatively prime or coprime).
Proof. If gcd(m, n) = 1, then mm′ + nn′ = 1, i.e. mm′ = −n′n + 1, for some
integers m′, n′ by corollary 8.12. After replacing (m′, n′) by (m′ + m′′n, n′ − m′′m) for
a suitable m′′, we can assume that 0 ≤ m′ < n. Since mm′ mod n = 1, we get
m̄ m̄′ = 1̄, so m̄ is invertible.
The converse follows by reversing these steps.
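Running the Euclidean algorithm backwards produces the coefficients of corollary 8.12, and hence inverses in Zn as in theorem 8.13. A minimal sketch follows; the function names are our own.

```python
def extended_gcd(a, b):
    # returns (g, m, n) with a*m + b*n = g = gcd(a, b)
    if b == 0:
        return (a, 1, 0)
    g, m, n = extended_gcd(b, a % b)
    return (g, n, m - (a // b) * n)

def inverse_mod(m, n):
    # m is invertible in Z_n exactly when gcd(m, n) = 1
    g, x, _ = extended_gcd(m, n)
    if g != 1:
        raise ValueError("not invertible")
    return x % n
```

For instance inverse_mod(5, 6) returns 5, matching the Z6 table above.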
Definition 8.14. A ring is called a division ring if R∗ = R − {0}. A commutative
division ring is called a field.
For example Q, R and C are fields. We will see a noncommutative division
ring later on. The previous theorem implies the following:
Theorem 8.15. The ring Zn is a field if and only if n is prime.
Corollary 8.16 (Fermat's little theorem). When p is a prime and n an integer,
then p divides n^p − n.
Proof. If p divides n, then clearly it divides n^p − n. Now suppose that p does not
divide n; then n̄ ∈ Zp∗. This is a group of order p − 1. So by Lagrange's theorem,
n̄ has order dividing p − 1. This implies that n̄^{p−1} = 1̄, or that n̄^{p−1} − 1̄ = 0̄.
This implies that p divides n^{p−1} − 1 (which is usually taken as the statement of
Fermat's little theorem) and therefore n^p − n.
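The corollary is easy to spot-check by machine; this is only a numerical sanity check over small cases, not a proof.

```python
# p divides n^p - n for every integer n and prime p
ok = all((n ** p - n) % p == 0
         for p in (2, 3, 5, 7, 11, 13)
         for n in range(1, 50))

# equivalent form: n^(p-1) = 1 mod p when p does not divide n
residue = pow(10, 12, 13)   # Python's built-in modular exponentiation
```

Here ok is True and residue is 1.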

8.17 Exercises
1. Let R be a commutative ring. Prove that 0 · x = 0. (This might appear
to be a completely obvious statement, but it isn’t – the only things you
know about R are what follows from the axioms.)
2. Let R be a commutative ring. Prove that (−1) · x = −x, where −x is the
additive inverse of x, that is (−x) + x = 0.

3. The Gaussian integers are Z[i] = {a + bi | a, b ∈ Z}, where i = √−1.
(a) Check that Z[i] is closed under addition, additive inverses and multiplication,
and is therefore a ring.
(b) Determine the group Z[i]∗ of invertible elements.
4. Check that the Gaussian field Q[i] = {a + bi | a, b ∈ Q} is a field when
equipped with the usual operations.
5. Prove that there are no zero divisors in a field, i.e. if xy = 0 then x = 0
or y = 0.

6. If R1 and R2 are commutative rings, define R = R1 × R2 with operations
(a1 , a2 ) + (b1 , b2 ) = (a1 + b1 , a2 + b2 ) and (a1 , a2 ) · (b1 , b2 ) = (a1 b1 , a2 b2 ).
Check that this is a commutative ring with appropriate choice of constants.
Show that this has zero divisors.
7. An element x of a commutative ring is nilpotent if x^N = 0 for some integer
N ≥ 1. Determine the nilpotent elements of Zn.
8. Prove that the sum and product of nilpotent elements in a commutative
ring are also nilpotent.
9. Sequences of “random” numbers are often generated on a computer by
the following method: Choose integers n ≥ 2, a, b, x0 , and consider the
sequence
xi+1 = (axi + b) mod n.
This sequence will eventually repeat itself. The period is the smallest k
such that xi+k = xi for all i large enough. Obviously, short periods are
less useful, since the pattern shouldn’t be too predictable.

(a) Prove that the period is at most n.


(b) Explain why picking a nilpotent in Zn would be a really bad choice.

Chapter 9

Zp∗ is cyclic

Given a field K, a polynomial in x is a symbolic expression

    a_n x^n + a_{n−1} x^{n−1} + · · · + a_0

where n ∈ N is arbitrary and the coefficients a_n, . . . , a_0 ∈ K. Note that polynomials
are often viewed as functions, but it is important here to treat them as
expressions. First of all, the algebraic properties become clearer, and secondly,
when K is finite, there are only finitely many functions from K → K but infinitely
many polynomials. We denote the set of these polynomials by K[x]. We omit
terms if the coefficients are zero, so we can pad out a polynomial with extra
zeros whenever convenient, e.g. 1 = 0x^2 + 0x + 1. The highest power of x
occurring with a nonzero coefficient is called the degree. We can add polynomials
by adding the coefficients:

    f = a_n x^n + a_{n−1} x^{n−1} + · · · + a_0
    g = b_n x^n + b_{n−1} x^{n−1} + · · · + b_0
    f + g = (a_n + b_n)x^n + · · · + (a_0 + b_0)
Multiplication is defined using the rules one learns in school:

    fg = (a_0 b_0) + (a_1 b_0 + a_0 b_1)x + (a_2 b_0 + a_1 b_1 + a_0 b_2)x^2 + · · ·
       = Σ_k ( Σ_{i+j=k} a_i b_j ) x^k
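The convolution formula for the coefficients of fg is exactly what one would code. A sketch, with coefficients stored lowest degree first (our convention):

```python
def poly_mul(f, g):
    # f, g are coefficient lists [a0, a1, ..., an]; the product has
    # coefficient c_k = sum of a_i * b_j over i + j = k
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out
```

For example poly_mul([1, 1], [1, -1]) returns [1, 0, -1], i.e. (1 + x)(1 − x) = 1 − x^2.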

Theorem 9.1. K[x] is a commutative ring with the operations described above.
Proof. This is fairly routine, so we just list a few steps. Let f and g be as above
and

    h = c_n x^n + c_{n−1} x^{n−1} + · · · + c_0

Then

    f(gh) = Σ_ℓ ( Σ_{i+j+k=ℓ} a_i b_j c_k ) x^ℓ = (fg)h

and

    f(g + h) = Σ_k ( Σ_{i+j=k} a_i (b_j + c_j) ) x^k = fg + fh

Given a polynomial f ∈ K[x] and an element a ∈ K, we can substitute a for x
in f to get an element that we write as ev_a(f) = f(a). The following is easy
to check, and true by design.
Lemma 9.2. Given polynomials f and g, eva (1) = 1, eva (f + g) = eva (f ) +
eva (g) and eva (f g) = eva (f )eva (g).
The lemma says that eva : K[x] → K is a ring homomorphism. An element
r ∈ K is called a zero or root of a polynomial f if f (r) = 0.
Lemma 9.3. An element r ∈ K is a root of f ∈ K[x] if and only if x − r
divides f, i.e. there exists g ∈ K[x] with f = (x − r)g.
Proof. If f = (x − r)g, then clearly f (r) = (r − r)g(r) = 0.
Now suppose that f(r) = 0. Write f = a_n x^n + · · · + a_0. We want g =
b_{n−1} x^{n−1} + · · · + b_0 such that (x − r)g = f. This equation is equivalent to the
system

    b_{n−1} = a_n
    b_{n−2} − r b_{n−1} = a_{n−1}
    . . .
    b_0 − r b_1 = a_1
    −r b_0 = a_0
There are n+1 linear equations in n variables bn−1 , . . . , b0 , so there is no guaran-
tee that we have a solution. However, if we can eliminate one of the equations by
doing “row operations” (adding multiples of one equation to another etc.), then
the new system will have n equations, and this would be consistent. Adding r^n
times the first equation to the last equation, and then r^{n−1} times the second
equation to the last and so on leads to

    r^n b_{n−1} + r^{n−1}(b_{n−2} − r b_{n−1}) + · · · − r b_0 = a_n r^n + a_{n−1} r^{n−1} + · · · + a_0
or 0 = 0. So we have eliminated the last equation. The remaining n equations
can now be solved to obtain g as above.
Theorem 9.4. If f ∈ K[x] is a polynomial of degree n, then it has at most n distinct
roots. If f has exactly n roots r_1, . . . , r_n, then f = c(x − r_1) · · · (x − r_n), where
c is the leading coefficient of f.
Proof. Suppose that r1 is a root of f . Then f = (x − r1 )g by the previous
lemma. The roots of f are r1 together with the roots of g. By induction on the
degree, we can assume that g has at most n − 1 roots. By the same reasoning,
if r1 , . . . , rn are roots, f = c(x − r1 ) . . . (x − rn ) for some c ∈ K. But clearly c
must equal the leading coefficient.

We now apply these results to the field K = Zp, where p is a prime. Sometimes
this is denoted by Fp to emphasize that it's a field. When the need arises,
let us write ā to indicate we are working in Zp, but we won't bother when the
context is clear.
Proposition 9.5. We can factor x^p − x = x(x − 1)(x − 2) · · · (x − (p − 1)) in
Zp[x].
Proof. By Fermat's little theorem, 1̄, . . . , \overline{p − 1} are roots. Therefore x^p − x =
x(x − 1)(x − 2) · · · (x − (p − 1)) in Zp[x].

Corollary 9.6 (Wilson's theorem). (p − 1)! = −1 in Zp.

Proof. We have x^{p−1} − 1 = (x − 1)(x − 2) · · · (x − (p − 1)). Now evaluate both
sides at 0.
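Both Wilson's theorem and the factorization of proposition 9.5 can be checked by machine for small primes. The sketch below multiplies out x(x − 1) · · · (x − (p − 1)) with coefficients reduced mod p and compares with x^p − x; coefficient lists are stored lowest degree first (our convention).

```python
from math import factorial

# Wilson: (p-1)! = -1 = p-1 mod p
wilson_ok = all(factorial(p - 1) % p == p - 1
                for p in (2, 3, 5, 7, 11, 13))

def poly_mul_mod(f, g, p):
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] = (out[i + j] + a * b) % p
    return out

p = 7
prod = [1]
for k in range(p):
    prod = poly_mul_mod(prod, [(-k) % p, 1], p)   # the factor (x - k)

# x^p - x in Z_p[x]: coefficient p-1 (= -1) on x, and 1 on x^p
expected = [0, p - 1] + [0] * (p - 2) + [1]
```

Here prod equals expected, so the two polynomials agree coefficient by coefficient in Z7[x].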
Corollary 9.7. The binomial coefficients (p choose n) = p!/(n!(p − n)!) are
divisible by p when 0 < n < p.
Proof. Substituting 1 + x for x in the above identity permutes the factors on the
right (since x + 1 − k runs through the same residues mod p), so (1 + x)^p − (1 + x) =
x^p − x in Zp[x]. Now expand the left side using the binomial theorem, which is
valid in any field (see exercises), to obtain

    Σ_{n=1}^{p−1} (p choose n) x^n = 0

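A quick numerical check of the corollary (plain Python; the composite case shows that primality is genuinely needed):

```python
from math import comb

# each C(p, n) with 0 < n < p is divisible by p when p is prime
for p in [2, 3, 5, 7, 11, 13]:
    assert all(comb(p, n) % p == 0 for n in range(1, p))

# primality matters: C(4, 2) = 6 is not divisible by 4
assert comb(4, 2) % 4 != 0
```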
The last few results were fairly easy; the next result is not.
Theorem 9.8. If p is prime, then Z∗p is cyclic.
Proof in a special case. We won't prove this in general, but to get some sense of why this is true, let's prove it when p = 2q + 1, where q is another prime. This is not typical, but it can certainly happen (e.g. p = 7, 11, 23, . . .). Then Z_p^* has order 2q. The possible orders of its elements are 1, 2, q, or 2q. There is only one element of order 1, namely 1. An element of order 2 is a root of x² − 1 different from 1, so it must be −1. An element of order q satisfies x^q − 1 = 0 and must be different from 1, so there are at most q − 1 possibilities. To summarize, there are no more than q + 1 elements of orders 1, 2, or q. Therefore there are at least q − 1 elements of order 2q, and these are necessarily generators.
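The count in this argument can be made concrete by tabulating the orders of all elements of Z_p^* for p = 11 = 2 · 5 + 1; a small Python sketch:

```python
def order(a, p):
    # multiplicative order of a in Z_p^*
    k, x = 1, a % p
    while x != 1:
        x = x * a % p
        k += 1
    return k

p = 11                       # p = 2q + 1 with q = 5 prime, as in the proof
orders = [order(a, p) for a in range(1, p)]
# one element of order 1, one of order 2, q - 1 = 4 of order q = 5,
# and q - 1 = 4 generators of order 2q = 10
assert sorted(orders) == [1, 2, 5, 5, 5, 5, 10, 10, 10, 10]
```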
9.9 Exercises
1. Given a field K and a positive integer n, let n̄ = 1 + · · · + 1 (n times). K is said to have positive characteristic if n̄ = 0 for some positive n; otherwise K is said to have characteristic 0. In the positive characteristic case, the smallest n > 0 with n̄ = 0 is called the characteristic. Prove that the characteristic is a prime number.
2. For any field, prove the binomial theorem

(x + 1)^n = Σ_{m=0}^{n} C(n, m) x^m

(Recall C(n + 1, m) = C(n, m) + C(n, m − 1).)
3. Let K be a field and s ∈ K. Let K[√s] be the set of expressions a + b√s, with a, b ∈ K. Show that this becomes a commutative ring if we define addition and multiplication as the notation suggests:

(a + b√s) + (c + d√s) = (a + c) + (b + d)√s
(a + b√s)(c + d√s) = (ac + bds) + (ad + bc)√s

4. Show K[√s] has zero divisors if x² − s = 0 has a root. If this equation does not have a root, then prove that K[√s] is a field. (Hint: (a + b√s)(a − b√s) = ? and when is it zero?)
5. When p is an odd prime, show that the map x ↦ x² from Z_p^* → Z_p^* is not onto. Use this fact to construct a field with p² elements and characteristic p.
Chapter 10

Matrices over Zp
We can now combine everything we’ve learned to construct a new, and inter-
esting, collection of finite groups. Let p be a prime number. Then Zp is a field,
which means that we can perform all the usual operations in it, including divi-
sion. This allows us to do linear algebra over this field pretty much as usual.
For instance, we can consider vectors of length n with entries in Zp. We denote this by Z_p^n. This becomes an abelian group under vector addition:
(a1 , . . . , an ) + (b1 , . . . , bn ) = (a1 + b1 , . . . , an + bn )
where, of course, + on the right denotes addition in Zp . One can also consider
matrices with entries in Zp . The standard results from basic linear algebra
generalize to Zp or any field. For example,
Theorem 10.1 (Gaussian elimination). Let A be an n × n matrix with entries in a field K. A is invertible if and only if it can be taken to the identity matrix by a finite sequence of row operations (interchanges, addition of a multiple of one row to another, multiplication of a row by an element of K^*). A is not invertible if and only if it can be taken to a matrix with a zero row.
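The row-reduction procedure of the theorem works verbatim over Zp. A minimal Python sketch (the helper name invertible_mod_p is ours, and it only decides invertibility rather than producing the inverse):

```python
def invertible_mod_p(A, p):
    # row-reduce a copy of A over Z_p; A is invertible iff we reach the identity
    A = [row[:] for row in A]
    n = len(A)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] % p != 0), None)
        if pivot is None:
            return False                       # reduction produces a zero row
        A[col], A[pivot] = A[pivot], A[col]    # interchange rows
        inv = pow(A[col][col], -1, p)          # scale the pivot row
        A[col] = [x * inv % p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col]:         # clear the rest of the column
                A[r] = [(x - A[r][col] * y) % p for x, y in zip(A[r], A[col])]
    return True

assert invertible_mod_p([[1, 1], [0, 1]], 2)
assert not invertible_mod_p([[1, 1], [1, 1]], 2)
```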
Some details will be recalled in the exercises. Let us denote the set of
invertible n × n matrices with entries in K by GLn (K). This is a group under
matrix multiplication. When K = Zp , this is a finite group. So the first thing
we should do is calculate its order. Let us start with the 2 × 2 case. Let V = Z_p^2 as above, but now represented by 2 × 1 vectors. A ∈ GL2(Zp) will act on v ∈ Z_p^2 by matrix multiplication Av. Set v = [1, 0]^T ∈ V.
Lemma 10.2. Orb(v) = Z_p^2 − {0}.

Proof. Given u ∈ Z_p^2 − {0}, we can clearly find a matrix A ∈ GL2(Zp) with u as its first column. This will satisfy Av = u.
Lemma 10.3. Stab(v) is the set of matrices

    [ 1  x ]
    [ 0  y ]    with y ≠ 0
Proof. The condition Av = v means that the first column of A is v, and such a matrix is invertible precisely when y ≠ 0.
Theorem 10.4. The order of GL2(Zp) is (p² − 1)(p² − p).

Proof. From the last two lemmas and the orbit-stabilizer theorem, the order is (p² − 1) |Stab(v)| = (p² − 1)(p − 1)p.
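For small p the count can be confirmed by brute force; a short Python sketch (the helper gl2_order is ours):

```python
from itertools import product

def gl2_order(p):
    # count 2x2 matrices over Z_p with nonzero determinant
    return sum(1 for a, b, c, d in product(range(p), repeat=4)
               if (a * d - b * c) % p != 0)

# matches the formula of theorem 10.4
for p in [2, 3, 5]:
    assert gl2_order(p) == (p**2 - 1) * (p**2 - p)
```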
Corollary 10.5. GL2(Z2) is isomorphic to S3.

Proof. GL2(Z2) acts on Z_2^2 − {0}, which has 3 elements. Therefore we have a homomorphism f : G → S3, which is one to one because ker f consists of matrices satisfying A[1, 0]^T = [1, 0]^T and A[0, 1]^T = [0, 1]^T, and A = I is the only such matrix. Since the order of GL2(Z2) is 6, f has to be onto as well.
This isomorphism should be viewed as something of an accident. For p = 3,
the order is 48 which is not the factorial of anything.
Bolstered by this success, let’s try to compute the order of GLn (Zp ) which
is the group of invertible n × n matrices.
Theorem 10.6. The order of GLn(Zp) is (p^n − 1)(p^n − p) · · · (p^n − p^{n−1}).
Proof. We will apply the same strategy as before. This time GLn(Zp) acts on Z_p^n − {0}, and arguing as before, we can see that this is the orbit of v = [1, 0, 0, . . .]^T. So |GLn(Zp)| = (p^n − 1) |Stab(v)| by the orbit-stabilizer theorem.
The stabilizer of v is the set of matrices of the form

        [ 1  x2  x3  · · · ]
        [ 0                ]
    A = [ 0       B        ]
        [ ⋮                ]

One can see that for A to be invertible, the (n − 1) × (n − 1) block labelled B must be invertible as well, but there are no constraints on the elements labelled x_i. Therefore A ↦ (B, x2, x3, . . .) gives a one to one correspondence between Stab(v) and GL_{n−1}(Zp) × Z_p^{n−1}. It follows by induction on n that

|Stab(v)| = p^{n−1} |GL_{n−1}(Zp)|
          = p^{n−1} (p^{n−1} − 1) · · · (p^{n−1} − p^{n−2})
          = (p^n − p) · · · (p^n − p^{n−1})

and therefore |GLn(Zp)| = (p^n − 1)(p^n − p) · · · (p^n − p^{n−1}).

10.7 Exercises
1. A matrix over a field is called elementary if it can be obtained from I
by a single row operation. Check that if E is an elementary matrix, it is
invertible, and that EA is the matrix obtained from A by a row operation.
2. The one fact from linear algebra we will just accept is the rank-nullity theorem, which implies that a square matrix is invertible if its kernel contains only 0. If E1, . . . , EN are elementary matrices such that EN · · · E1 A = I, then prove that A is invertible and that EN · · · E1 = A^{−1}.
3. If E1 , . . . , EN are elementary matrices such that EN . . . E1 A has a row of
zeros, prove that A is not invertible. (Hint: show that ker A contains a
nonzero vector.)
4. Determine which of the following matrices over Z2 is invertible, and find the inverse when it exists.

            [ 1 1 1 ]
    (a) A = [ 1 0 1 ]
            [ 0 1 0 ]

            [ 1 1 1 ]
    (b) B = [ 1 0 1 ]
            [ 1 1 0 ]

5. Cauchy's theorem implies that GL2(Zp) has an element of order p. Show that

    [ 1 1 ]
    [ 0 1 ]

works.

6. The determinant det : GL2(Zp) → Z_p^* gives a homomorphism. Show that
this is onto, and use this to compute the order of the kernel (which is
usually denoted as SL2 (Zp )).
7. The order of SL2 (Z3 ) is 24, which might lead one to suspect it’s isomorphic
to S4 . Prove that it isn’t by comparing centers (see ex 7 of chap 7).
Chapter 11

The sign of a permutation
Theorem 11.1. Suppose n ≥ 2.

(a) Every permutation in Sn is a product of transpositions.

(b) If the identity I = τ1 · · · τr in Sn is expressed as a product of transpositions, then r must be even.
Before giving the proof, we need the following lemmas.
Lemma 11.2. Suppose a, b, c, d ∈ {1, . . . , n} are mutually distinct elements.
We have the following identities among transpositions
(ab) = (ba)
(ab)(ab) = I (11.1)
(ac)(ab) = (ab)(bc) (11.2)
(bc)(ab) = (ac)(cb) (11.3)
(cd)(ab) = (ab)(cd) (11.4)
Proof. The first couple are obvious; the rest will be left as an exercise.
Lemma 11.3. Any product of transpositions τ1 τ2 . . . τr , in Sn , is equal to an-
other product of transpositions τ10 . . . τr0 0 , such that r and r0 have the same parity
(in other words, they are either both even or both odd) and n occurs at most
once among the τi0 .
Proof. Rather than giving a formal proof, we explain the strategy. Use (11.3)
and (11.4) to move transpositions containing n next to each other. Then apply
(11.1) and (11.2) to eliminate one of the n’s. In each of these moves, either r
stays the same or drops by 2. Now repeat.
Here are a couple of examples when n = 4,
(43)(41)(24) = (41)(13)(24) = (41)(24)(13) = (24)(12)(13)
(34)(12)(34) = (34)(34)(12) = (12)
Proof of theorem 11.1. We prove both statements by induction on n. The base
case n = 2 of (a) is clear, the only permutations are (12) and (12)(12). Now
suppose that (a) holds for Sn . Let f ∈ Sn+1 . If f (n + 1) = n + 1, then
f ∈ Stab(n + 1) which can be identified with Sn . So by induction, f is a
product of transpositions. Now suppose that j = f(n + 1) ≠ n + 1. Then the
product g = (n + 1 j)f sends n + 1 to n + 1. This implies that g is a product
of transpositions τ1 τ2 . . . by the previous case. Therefore f = (n + 1 j)τ1 τ2 . . ..
Statement (b) holds when n = 2, because I = (12)r if and only if r is even.
Suppose that (b) holds for Sn . Let

I = τ1 τ2 . . . τr (11.5)

in Sn+1. By using lemma 11.3, we can get a new equation

I = τ10 . . . τr0 0 (11.6)

where at most one of the τi0 ’s contains n + 1, and r0 has the same parity as r. If
exactly one of the τi0 ’s contains n + 1, then τ10 . . . τr0 0 will send n + 1 to a number
other than n + 1. This can’t be the identity contradicting (11.6). Therefore
none of the τi0 ’s contains n + 1. This means that (11.6) can be viewed as an
equation in Sn . So by induction, we can conclude that r0 is even.

Corollary 11.4. If a permutation σ is expressible as a product of an even
(respectively odd) number of transpositions, then any decomposition of σ as a
product of transpositions has an even (respectively odd) number of transpositions.
Proof. Write

σ = τ1 · · · τr = τ′1 · · · τ′_{r′}

where the τi, τ′j are transpositions. Therefore

I = τr^{−1} · · · τ1^{−1} τ′1 · · · τ′_{r′} = τr · · · τ1 τ′1 · · · τ′_{r′}

which implies that r + r′ is even. This is possible only if r and r′ have the same parity.
Definition 11.5. A permutation is called even (respectively odd) if it is a product of an even (respectively odd) number of transpositions. Define

sign(σ) = 1 if σ is even, and sign(σ) = −1 if σ is odd.

Lemma 11.6. The map sign : Sn → {1, −1} is a homomorphism.
Proof. Clearly sign(I) = 1 and sign(στ ) = sign(σ) sign(τ ).
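In computations it is convenient to evaluate sign directly. Counting inversions gives the same parity as counting transpositions, and the homomorphism property can then be spot-checked exhaustively for small n; a Python sketch:

```python
from itertools import permutations

def sign(perm):
    # parity of a permutation of {0, ..., n-1}, via the number of inversions
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1

def compose(s, t):
    # (s o t)(i) = s(t(i))
    return tuple(s[t[i]] for i in range(len(t)))

# sign is a homomorphism S_4 -> {1, -1}
for s in permutations(range(4)):
    for t in permutations(range(4)):
        assert sign(compose(s, t)) == sign(s) * sign(t)
```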
Definition 11.7. The alternating group An ⊂ Sn is the subgroup of even per-
mutations.
Observe that An is a subgroup, and in fact a normal subgroup, because it
equals ker(sign). We can identify Sn /An with {1, −1}. Therefore
Lemma 11.8. |An| = n!/2.
Earlier as an exercise, we found that the symmetry group of the dodeca-
hedron had order 60, which is coincidentally the order of A5 . A more precise
analysis, which we omit, shows that these groups are in fact isomorphic.
Let us apply these ideas to study functions of several variables. A function f : X^n → R is called symmetric if

f(x1, . . . , xi, . . . , xj, . . . , xn) = f(x1, . . . , xj, . . . , xi, . . . , xn)

and antisymmetric if

f(x1, . . . , xi, . . . , xj, . . . , xn) = −f(x1, . . . , xj, . . . , xi, . . . , xn)

for all i ≠ j. For example, when X = R

x1 + x2 + x3

is symmetric, and
(x1 − x2 )(x1 − x3 )(x2 − x3 )
is antisymmetric. Clearly when f is antisymmetric,

f (x1 , . . . , xn ) = sign(σ)f (xσ(1) , . . . , xσ(n) )

holds for any permutation. A similar equation holds for symmetric functions,
with sign(σ) omitted. We define the symmetrization and antisymmetrization
operators by
1 X
Sym(f ) = f (xσ(1) , . . . , xσ(n) )
n!
σ∈Sn

1 X
Asym(f ) = sign(σ)f (xσ(1) , . . . , xσ(n) )
n!
σ∈Sn

We’ll see in the exercises that these operators produce (anti)symmetric func-
tions.
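The antisymmetrization operator is straightforward to implement; a small Python sketch (using exact rational arithmetic for the 1/n! factor; the helper name asym is ours):

```python
from itertools import permutations
from math import factorial
from fractions import Fraction

def sign(perm):
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1

def asym(f, n):
    # Asym(f)(x) = (1/n!) sum over sigma of sign(sigma) f(x_sigma(1), ..., x_sigma(n))
    def g(*x):
        total = sum(sign(s) * f(*(x[a] for a in s)) for s in permutations(range(n)))
        return Fraction(total, factorial(n))
    return g

# antisymmetrizing f(x1, x2) = x1^2 x2 gives (x1^2 x2 - x2^2 x1)/2
g = asym(lambda x1, x2: x1**2 * x2, 2)
assert g(2, 3) == Fraction(2**2 * 3 - 3**2 * 2, 2)
assert g(2, 3) == -g(3, 2)          # the result is antisymmetric
```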

11.9 Exercises
1. Check the identities in lemma 11.2.
2. Prove that if σ ∈ Sn is odd, then so is σ −1 .
3. Prove that a cycle of length r is even if and only if r is odd.
4. Prove that if G ⊆ Sn is a subgroup of odd order, then G ⊆ An .
5. Prove that Sym(f) (respectively Asym(f)) is symmetric (respectively antisymmetric), and furthermore that f = Sym(f) (f = Asym(f)) if and only if f is symmetric (antisymmetric).
6. If f is symmetric, prove that Asym(f ) = 0.
7. Prove that
f (x1 , . . . , xn ) = f (xσ(1) , . . . , xσ(n) )
holds for all σ ∈ An if and only if f is a sum of a symmetric and antisym-
metric function.
Chapter 12

Determinants
The ideas of the previous chapter can be applied to linear algebra. Given an n × n matrix A = [a_{ij}] over a field K, the determinant is defined by

det A = Σ_{σ∈Sn} sign(σ) a_{1σ(1)} · · · a_{nσ(n)}

This is a bit like the antisymmetrization considered earlier. There is also a symmetric version, without sign(σ), called the permanent; however, as far as I know, it is much less useful. The definition we gave for the determinant is not very practical, but it is theoretically quite useful.
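Impractical or not, the defining sum is easy to implement directly for small matrices; a Python sketch (also spot-checking the multiplicativity proved in theorem 12.5):

```python
from itertools import permutations

def sign(perm):
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return -1 if inv % 2 else 1

def det(A):
    # the defining sum over all n! permutations; fine for tiny matrices,
    # hopeless next to Gaussian elimination for large n
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        term = sign(s)
        for i in range(n):
            term *= A[i][s[i]]
        total += term
    return total

assert det([[1, 2], [3, 4]]) == -2
assert det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]) == 24
A, B = [[1, 2], [3, 4]], [[0, 1], [1, 1]]
AB = [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
assert det(AB) == det(A) * det(B)     # multiplicativity on one example
```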
Theorem 12.1. Given an n × n matrix A, the following properties hold.
(a) det I = 1
(b) If B is obtained by multiplying the ith row of A by b then det B = b det A
(c) Suppose that the ith row of C is the sum of the ith rows of A and B, and all other rows of A, B and C are identical. Then det C = det A + det B.
(d) det A = det AT .
(e) Let us write A = [v1, . . . , vn], where v1, v2, . . . are the columns. Then

det(v_{τ(1)}, . . . , v_{τ(n)}) = sign(τ) det(v1, . . . , vn)
Proof. Item (a) is clear because the term δ_{1σ(1)} · · · δ_{nσ(n)} is 0 unless σ = I.
(b)

det B = Σ_{σ∈Sn} sign(σ) a_{1σ(1)} · · · (b a_{iσ(i)}) · · · a_{nσ(n)}
      = b Σ_{σ∈Sn} sign(σ) a_{1σ(1)} · · · a_{iσ(i)} · · · a_{nσ(n)}
      = b det A

(c) Denote the ith rows of A, B, C by [α1, . . .], [β1, . . .] and [α1 + β1, . . .] respectively. Then

det C = Σ_{σ∈Sn} sign(σ) a_{1σ(1)} · · · (α_{σ(i)} + β_{σ(i)}) · · · a_{nσ(n)}
      = Σ_{σ∈Sn} sign(σ) a_{1σ(1)} · · · α_{σ(i)} · · · a_{nσ(n)} + Σ_{σ∈Sn} sign(σ) a_{1σ(1)} · · · β_{σ(i)} · · · a_{nσ(n)}
      = det A + det B

Before proving (d), observe that by the commutative law

a_{1σ(1)} · · · a_{nσ(n)} = a_{τ(1)τσ(1)} · · · a_{τ(n)τσ(n)}

for any permutation τ. In particular, setting τ = σ^{−1} gives

a_{1σ(1)} · · · a_{nσ(n)} = a_{σ^{−1}(1)1} · · · a_{σ^{−1}(n)n}

Therefore

det A^T = Σ_{τ∈Sn} sign(τ) a_{τ(1)1} · · · a_{τ(n)n}
        = Σ_{σ∈Sn} sign(σ^{−1}) a_{σ^{−1}(1)1} · · · a_{σ^{−1}(n)n}
        = Σ_{σ∈Sn} sign(σ) a_{σ^{−1}(1)1} · · · a_{σ^{−1}(n)n}
        = det A

For (e), using the fact that sign(στ) = sign(σ) sign(τ), we obtain sign(στ) sign(τ) = sign(σ). Therefore

det(v_{τ(1)}, . . . , v_{τ(n)}) = Σ_{σ∈Sn} sign(σ) a_{1στ(1)} · · · a_{nστ(n)}
                               = Σ_{σ∈Sn} sign(στ) sign(τ) a_{1στ(1)} · · · a_{nστ(n)}
                               = sign(τ) Σ_{σ∈Sn} sign(στ) a_{1στ(1)} · · · a_{nστ(n)}

In the last sum, we can set η = στ, and sum over η ∈ Sn. This allows us to simplify it to sign(τ) det(v1, . . . , vn).

Corollary 12.2. A matrix A with a row of zeros has det A = 0.

Proof. By (b), det A = 0 · det A.

In the exercises, we will use the above theorem to show that A behaves in
the expected way under elementary row operations. This can be summarized as
Lemma 12.3. If A, E are both n × n with E elementary, then det(E) ≠ 0 and det(EA) = det(E) det(A).
Much of the importance of determinants stems from the following facts.
Theorem 12.4. Given an n × n matrix A, the following statements are equiv-
alent
(a) A is invertible.
(b) det A ≠ 0.
(c) Av = 0 implies v = 0.
Proof. By theorem 10.1, a square matrix A is either a product of elementary matrices, when it is invertible, or a product of elementary matrices and a matrix B with a zero row otherwise. In the first case det A ≠ 0 by lemma 12.3. In the second case, det A is proportional to det B = 0. This proves the equivalence of (a) and (b).
If A is invertible and Av = 0, then v = A^{−1}Av = 0. Suppose A is not invertible. Then det A^T = det A = 0 by what was just proved. Therefore A^T = FB where F is a product of elementary matrices, and B has a row of zeros. For simplicity, suppose that the first row is zero. Set v = (F^T)^{−1}[1, 0, . . . , 0]^T. This vector is nonzero and Av = B^T F^T v = 0. This proves the equivalence of (a) and (c).
Theorem 12.5. The determinant gives a homomorphism det : GLn(K) → K^*.
Proof. If A and B are invertible n × n matrices, write A as a product of ele-
mentary matrices, then det AB = det A det B follows from lemma 12.3.
Let A be an n × n matrix. An element λ ∈ K is an eigenvalue of A if there
exists a nonzero vector v ∈ K n , called an eigenvector, such that
Av = λv
or equivalently
(λI − A)v = 0
Theorem 12.6. The expression p(x) = det(xI − A) is a polynomial of degree
n, called the characteristic polynomial of A. λ is an eigenvalue if and only if it
is a root of p(x).
Proof. Clearly

p(x) = Σ_{σ∈Sn} sign(σ)(xδ_{1σ(1)} − a_{1σ(1)}) · · · (xδ_{nσ(n)} − a_{nσ(n)})

is a polynomial of degree n. From the definition of p(x), p(λ) = 0 if and only if λI − A is not invertible. By theorem 12.4, this is equivalent to λI − A having a nonzero kernel, or in other words to λ being an eigenvalue.
Corollary 12.7. A has at most n distinct eigenvalues.

12.8 Exercises
1. Let E be obtained from I by interchanging two rows. Check that det E =
−1 and det(EA) = − det A.
2. Let E be obtained from I by multiplying a row by k ∈ K ∗ . Check that
det E = k and det(EA) = k det A.
3. Let E be obtained from I by adding a multiple of one row to another.
Check that det E = 1 and det(EA) = det A.
4. Suppose that a square matrix A can be subdivided into blocks

    [ B  C ]
    [ 0  D ]

as indicated with B and D square. Prove that det A = det B det D.
5. The permanent of an n × n matrix A is

Perm A = Σ_{σ∈Sn} a_{1σ(1)} · · · a_{nσ(n)}

Prove that Perm A = Perm A^T.
6. Calculate the cardinality of {A | det A = 0}, where A runs over all n × n matrices over Zp.
7. Suppose that A is an n × n matrix with n eigenvalues λ1, . . . , λn allowing repetitions, i.e. the characteristic polynomial factors as (x − λ1) · · · (x − λn). Show that det A = λ1 · · · λn and trace A = λ1 + · · · + λn, where trace A is defined as a11 + a22 + · · · + ann.
Proof. From A^T A = I we obtain det(A)² = det(A^T) det(A) = 1.
We already saw in the exercises to chapter 3 that the set of orthogonal matrices O(3) forms a subgroup of GL3(R). Let SO(3) = {A ∈ O(3) | det A = 1}.

Lemma 13.2. SO(3) is a subgroup of O(3).

Proof. If A, B ∈ SO(3), then det(AB) = det(A) det(B) = 1 and det(A^{−1}) = 1^{−1} = 1. Also det(I) = 1.
Proposition 13.3. Every rotation matrix lies in SO(3).

Proof. Given a unit vector v3 = r as above, fix R = R(θ, r). By Gram-Schmid


we can find two more vectors, so v1 , v2 , v3 is orthonormal. Therefore A =
[v1 v2 v3 ] is an orthogonal matrix. After possibly switching v1 , v2 , we can assume
that v1 , v2 , v3 is right handed or equivalently that det A = 1. Then

R(v1) = cos θ v1 + sin θ v2
R(v2) = − sin θ v1 + cos θ v2
R(v3) = v3

and therefore

RA = AM

where

    [ cos θ  − sin θ  0 ]
M = [ sin θ   cos θ   0 ]
    [   0       0     1 ]
Since M, A ∈ SO(3), it follows that R = AM A−1 ∈ SO(3).
In principle, the method of proof can be used to calculate R(θ, [a, b, c]T )
explicitly. In fact, I did find an expression with the help of a computer algebra
package:
[ a² + cos θ − a² cos θ      ab − ab cos θ − c sin θ     ac − ac cos θ + b sin θ       ]
[ ab − ab cos θ + c sin θ    b² + cos θ − b² cos θ       bc − bc cos θ − a sin θ       ]
[ ac − ac cos θ − b sin θ    bc − bc cos θ + a sin θ     1 − a² − b² + (a² + b²) cos θ ]

However, the formula is pretty horrendous and essentially useless. We will see a
better way to do calculations shortly (which is in fact what I used to calculate
the previous matrix).
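Horrendous or not, the formula can at least be checked numerically. The sketch below (plain Python) builds the displayed matrix for a unit axis (a, b, c) and verifies that the axis is fixed and that the matrix is orthogonal:

```python
from math import cos, sin, sqrt, isclose

def rot(theta, axis):
    # the explicit matrix for R(theta, [a, b, c]^T), with (a, b, c) a unit vector
    a, b, c = axis
    ct, st = cos(theta), sin(theta)
    return [
        [a*a + ct - a*a*ct,      a*b - a*b*ct - c*st,     a*c - a*c*ct + b*st],
        [a*b - a*b*ct + c*st,    b*b + ct - b*b*ct,       b*c - b*c*ct - a*st],
        [a*c - a*c*ct - b*st,    b*c - b*c*ct + a*st,     1 - a*a - b*b + (a*a + b*b)*ct],
    ]

axis = (1/sqrt(3), 1/sqrt(3), 1/sqrt(3))
R = rot(0.7, axis)
# the axis is fixed: R v = v
v = [sum(R[i][j] * axis[j] for j in range(3)) for i in range(3)]
assert all(isclose(v[i], axis[i]) for i in range(3))
# R^T R = I, so R is orthogonal
for i in range(3):
    for j in range(3):
        dot = sum(R[k][i] * R[k][j] for k in range(3))
        assert isclose(dot, 1.0 if i == j else 0.0, abs_tol=1e-12)
```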
We want to prove that every matrix in SO(3) is a rotation. We start by
studying their eigenvalues. In general, a real matrix need not have any real
eigenvalues. However, this will not be a problem in our case.
Lemma 13.4. A 3 × 3 real matrix has a real eigenvalue.
Proof. The characteristic polynomial p(λ) = λ³ + a₂λ² + · · · has real coefficients. Since λ³ grows faster than the other terms, p(λ) > 0 when λ ≫ 0, and p(λ) < 0 when λ ≪ 0. Therefore the graph of y = p(x) must cross the x-axis somewhere, and this gives a real root of p. (This intuitive argument is justified by the intermediate value theorem from analysis.)

Lemma 13.5. If A ∈ O(3), then 1 or −1 is an eigenvalue.
Proof. By the previous lemma, there exists a nonzero vector v = [x, y, z]^T ∈ R³ and a real number λ such that Av = λv. Since a multiple of v will satisfy the same conditions, we can assume that the square of the length v^T v = x² + y² + z² = 1. It follows that

λ² = (λv)^T(λv) = (Av)^T(Av) = v^T A^T A v = v^T v = 1

Theorem 13.6. A matrix in SO(3) is a rotation.
Proof. Let R ∈ SO(3). By the previous lemma, ±1 is an eigenvalue.
We divide the proof into two cases. First suppose that 1 is an eigenvalue. Let
v3 be an eigenvector with eigenvalue 1. We can assume that v3 is a unit vector.
We can complete this to an orthonormal set v1 , v2 , v3 . The vectors v1 and v2
form a basis for the plane v3⊥ perpendicular to v3 . The matrix A = [v1 , v2 , v3 ]
is orthogonal, and we can assume that it is in SO(3) by switching v1 and v2 if
necessary. It follows that

RA = [Rv1 , Rv2 , Rv3 ] = [Rv1 , Rv2 , v3 ]

remains orthogonal. Therefore Rv1 , Rv2 lie in v3⊥ . Thus we can write

R(v1 ) = av1 + bv2


R(v2 ) = cv1 + dv2
R(v3 ) = v3

The matrix

          [ a  b  0 ]
A⁻¹RA =   [ c  d  0 ]
          [ 0  0  1 ]

lies in SO(3). It follows that the block [a b; c d] lies in SO(2), which means that it is a plane rotation matrix R(θ). It follows that R = R(θ, v3).
Now suppose that −1 is an eigenvalue and let v3 be an eigenvector. Defining A as above, we can see that

          [ a  b  0 ]
A⁻¹RA =   [ c  d  0 ]
          [ 0  0  −1 ]

This time the upper 2 × 2 block lies in O(2) with determinant −1, which implies that it is a reflection. This means that there is a nonzero vector v in the plane v3⊥ such that Rv = v. Therefore R also has +1 as an eigenvalue, and we have already shown that such an R is a rotation.
From the proof, we extract the following useful fact.
Corollary 13.7. Every matrix in SO(3) has +1 as an eigenvalue. If the matrix
is not the identity then the corresponding eigenvector is the axis of rotation.
We excluded the identity above, because everything would be an axis of
rotation for it. Let us summarize everything we’ve proved in one statement.
Theorem 13.8. The set of rotations in R3 can be identified with SO(3), and
this forms a group.
13.9 Exercises
1. Check that unlike SO(2), SO(3) is not abelian. (This could get messy, so
choose the matrices with care.)
2. Given two rotations Ri = R(θi , vi ), show that the axis of R2 R1 R2−1 is
R2 v1 . Conclude that a normal subgroup of SO(3), different from {I}, is
infinite.
3. Check that

    [ cos θ  − sin θ  0 ]
    [ sin θ   cos θ   0 ]
    [   0       0     1 ]

has 1, e^{±iθ} as complex eigenvalues. With the help of the previous exercise show that this holds for any rotation R(θ, v).
4. Show the map f : O(2) → SO(3) defined by

           [ A     0    ]
    f(A) = [ 0  det(A)  ]

is a one to one homomorphism. Therefore we can view O(2) as a subgroup of SO(3). Show that this subgroup is {g ∈ SO(3) | gr = ±r}, where r = [0, 0, 1]^T.
5. Two subgroups H1, H2 ⊆ G of a group are conjugate if for some g ∈ G, H2 = gH1g⁻¹ := {ghg⁻¹ | h ∈ H1}. Prove that H1 ≅ H2 if they are conjugate. Is the converse true?
6. Prove that for any nonzero vector v ∈ R3 , the subgroup {g ∈ SO(3) |
gv = ±v} (respectively {g ∈ SO(3) | gv = v}) is conjugate, and there-
fore isomorphic, to O(2) (respectively SO(2)). (Hint: use the previous
exercises.)
Chapter 14

Finite subgroups of the rotation group

At this point, it should come as no surprise that finite subgroups of O(2) are groups of symmetries of a regular polygon. We prove a slightly more precise statement.
Theorem 14.1. A finite subgroup of SO(2) is cyclic, and a finite subgroup of
O(2) not contained in SO(2) is dihedral.
Recall that the dihedral group Dn is defined by generators and relations R^n = I, F² = I and FRF = R⁻¹. We include the "degenerate" cases D1 ≅ Z2 and D2 ≅ Z2 × Z2 (see exercises).
Proof. First suppose that G ⊂ SO(2) is finite. The elements of G are of course
rotations through some angle θ ∈ [0, 2π). Let R ∈ G − {I} be the rotation with
the smallest possible θ. Let S ∈ G − {I} be another element with angle φ. Since
φ ≥ θ, we can write φ = nθ + ψ, where n ≥ 0 is an integer and ψ ≥ 0. By choosing n as large as possible, we can assume that ψ < θ. Since ψ is the angle of SR⁻ⁿ, we must have ψ = 0. This proves that S = Rⁿ. So G is
generated by R, and therefore cyclic.
Now suppose that G ⊂ O(2) is finite but not contained in SO(2). Then there exists F ∈ G with det F = −1. This is necessarily a reflection, so that F² = I. G ∩ SO(2) is cyclic with generator R by the previous paragraph. Let us suppose that R has order n. We have det FR = −1, so it is also a reflection. This means that FRFR = I, or FRF = R⁻¹. Together with the relations F² = I and R^n = I, we see that G ≅ Dn.
Let us now turn to finite subgroups of SO(3). Since O(2) ⊂ SO(3), we have
the above examples. We also have symmetry groups of a regular tetrahedron,
cube or dodecahedron. Remarkably, the converse is also true. We will be content
to prove a weaker statement.
Theorem 14.2. Let G ⊂ SO(3) be a finite subgroup. Then either G is cyclic,
dihedral or else it has order 12, 24 or 60.
The proof will be broken down into a series of lemmas. Let us suppose that
G ⊂ SO(3) is a nontrivial finite subgroup. Then G acts on the sphere S of
radius one centered at the origin. We define a point of S to be a pole of G if it
is fixed by at least one g ∈ G with g ≠ I. Let P be the set of poles. For g ≠ I,
there are exactly two poles ±p, where the axis of g meets S. It follows that P
is a finite set with even cardinality. We will see in an exercise that G acts on P .
So, we can partition P into a finite number, say n, of orbits. Choose one point
pi , in each orbit.
Lemma 14.3.

2(1 − 1/|G|) = Σ_{i=1}^{n} (1 − 1/|Stab(p_i)|)        (14.1)
Proof. By Burnside's formula

n = (1/|G|) Σ_{g∈G} |Fix(g)|

As noted above, |Fix(g)| = 2 when g ≠ I. Therefore, with the help of the orbit-stabilizer theorem,

n = (1/|G|) (2(|G| − 1) + |P|)
  = (1/|G|) (2(|G| − 1) + Σ_{i=1}^{n} |Orb(p_i)|)
  = (1/|G|) (2(|G| − 1) + Σ_{i=1}^{n} |G|/|Stab(p_i)|)

This can be rearranged to get

2(1 − 1/|G|) = Σ_{i=1}^{n} (1 − 1/|Stab(p_i)|)

Lemma 14.4. With the above notation, if G ≠ {I} then either n = 2 or n = 3 in (14.1).
Proof. Since |G| ≥ 2 and |Stab(p_i)| ≥ 2, we must have

1 ≤ 2(1 − 1/|G|) < 2

and

n/2 ≤ Σ (1 − 1/|Stab(p_i)|) < n

The only way for (14.1) to hold is for n = 2, 3.
Lemma 14.5. If n = 2, G is cyclic.

Proof. Since Stab(p_i) ⊆ G, we have

1 − 1/|Stab(p_i)| ≤ 1 − 1/|G|        (14.2)

But (14.1) implies

2(1 − 1/|G|) = (1 − 1/|Stab(p1)|) + (1 − 1/|Stab(p2)|)

and this forces equality in (14.2) for both i = 1, 2. This implies that G = Stab(p1) = Stab(p2). This means that every g ∈ G is a rotation with axis the line L connecting p1 to 0 (or p2 to 0, which would have to be the same). It follows that every g is a rotation in the plane perpendicular to L, so that G can be viewed as a subgroup of SO(2). Therefore it is cyclic by theorem 14.1.
We now turn to the case n = 3. Let us set n_i = |Stab(p_i)| and arrange them in order 2 ≤ n1 ≤ n2 ≤ n3. (14.1) becomes

2(1 − 1/|G|) = (1 − 1/n1) + (1 − 1/n2) + (1 − 1/n3)

or

1 + 2/|G| = 1/n1 + 1/n2 + 1/n3
The left side is greater than one, so we have a natural constraint.
Lemma 14.6. The only integer solutions to the inequalities

2 ≤ n1 ≤ n2 ≤ n3
1/n1 + 1/n2 + 1/n3 > 1
are as listed together with the corresponding orders of G.
(a) (2, 2, n3 ) and |G| = 2n3 .
(b) (2, 3, 3) and |G| = 12.
(c) (2, 3, 4) and |G| = 24.
(d) (2, 3, 5) and |G| = 60.
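The list in lemma 14.6 is easy to confirm by a brute-force search; a Python sketch using exact fractions (the bound 50 on n3 is arbitrary but enough to exhibit the infinite (2, 2, n3) family alongside the three sporadic solutions):

```python
from fractions import Fraction

# search 2 <= n1 <= n2 <= n3 with 1/n1 + 1/n2 + 1/n3 > 1;
# any solution forces n1 = 2 and n2 <= 3, so small bounds suffice
solutions = []
for n1 in range(2, 5):
    for n2 in range(n1, 7):
        for n3 in range(n2, 50):
            if Fraction(1, n1) + Fraction(1, n2) + Fraction(1, n3) > 1:
                solutions.append((n1, n2, n3))

sporadic = [(2, 3, 3), (2, 3, 4), (2, 3, 5)]
assert all(s in solutions for s in sporadic)
# everything found is either (2, 2, n3) or one of the three sporadic triples
assert all(s[:2] == (2, 2) or s in sporadic for s in solutions)
```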
To complete the proof of theorem 14.2, we need the following
Lemma 14.7. A subgroup G ⊂ SO(3) corresponding to the triple (2, 2, n) is
isomorphic to Dn .
6. Show that the set of poles P of the symmetry group T of the tetrahedron
consists of the vertices, midpoints of edges and midpoints of faces extended
to S. Show that the T action on P has three orbits, where one of them
has a stabilizer of order 2 and the remaining two have stabilizers of order
3.
7. Determine the poles of the symmetry group of the cube, and determine
the orbits and stabilizers as in the previous exercise.
Chapter 15

Quaternions
The two dimensional rotation group can be naturally identified with the mul-
tiplicative group of complex numbers with |z| = 1. This idea can be extended
to handle rotations in R3 , and this will be explained in the next chapter. We
start by describing the ring of quaternions, which was discovered by Hamilton
in order to generalize complex numbers. The ring of quaternions is given by

H = {a + bi + cj + dk | a, b, c, d ∈ R}

where i, j, k are symbols. Alternatively, we can think of a + bi + cj + dk as a more suggestive way of writing the vector (a, b, c, d) ∈ R⁴. We define

0 = 0 + 0i + 0j + 0k
1 = 1 + 0i + 0j + 0k
(a + bi + cj + dk) + (a′ + b′i + c′j + d′k) = (a + a′) + (b + b′)i + (c + c′)j + (d + d′)k
(a + bi + cj + dk) · (a′ + b′i + c′j + d′k) = (aa′ − bb′ − cc′ − dd′) + (ab′ + ba′ + cd′ − dc′)i
                                            + (ac′ − bd′ + ca′ + db′)j + (ad′ + bc′ − cb′ + da′)k
To put it another way, multiplication is determined by the rules:

1 is the identity
i² = j² = k² = −1
ij = −ji = k
jk = −kj = i
ki = −ik = j.
Theorem 15.1. With the above rules, H becomes a noncommutative ring.
Proof. All the laws, except the associative law for multiplication, are not difficult to verify. In principle, associativity can be checked by a long and messy calculation. Instead, we will embed H into the ring M22(C) with the help of the Pauli spin matrices¹ used in physics:

     [ 0  i ]        [ 0  −1 ]        [ i   0 ]
σi = [ i  0 ],  σj = [ 1   0 ],  σk = [ 0  −i ]

The i within the matrices is the complex number √−1 of course. We define a
map f : H → M22 (C) by

f (a + bi + cj + dk) = aI + bσi + cσj + dσk


 
a + di bi − c
= (15.1)
bi + c a − di

which is clearly a homomorphism of additive groups. If

f (a + bi + cj + dk) = 0

then clearly a = b = c = d = 0 by (15.1). So f is one to one. A calculation shows that

σi² = σj² = σk² = −I
σiσj = −σjσi = σk        (15.2)
etc., so that f takes a product of quaternions uv to the product of matrices f(u)f(v). Associativity of products is now automatic: more explicitly, u(vw) = (uv)w because f(u(vw)) = f(u)f(v)f(w) = f((uv)w) and f is one to one.
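The embedding can also be carried out numerically; a Python sketch representing quaternions by the matrices of (15.1) and spot-checking the defining relations and associativity:

```python
# quaternion a+bi+cj+dk as the 2x2 complex matrix of (15.1)
def quat(a, b, c, d):
    return ((a + d*1j, b*1j - c), (b*1j + c, a - d*1j))

def mmul(X, Y):
    # 2x2 complex matrix product
    return tuple(tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

i, j, k = quat(0, 1, 0, 0), quat(0, 0, 1, 0), quat(0, 0, 0, 1)
minus_one = quat(-1, 0, 0, 0)

# the defining relations transfer to the matrices
assert mmul(i, i) == mmul(j, j) == mmul(k, k) == minus_one
assert mmul(i, j) == k and mmul(j, k) == i and mmul(k, i) == j
assert mmul(j, i) == mmul(minus_one, k)

# associativity comes for free from matrix multiplication
q1, q2, q3 = quat(1, 2, 3, 4), quat(0, 1, 1, 0), quat(2, 0, -1, 5)
assert mmul(mmul(q1, q2), q3) == mmul(q1, mmul(q2, q3))
```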
We could have simply defined the set of quaternions to be the set of matrices
of the form (15.1). But this would hide the fact that quaternions should be
viewed as a generalization of complex numbers. Many familiar constructions
from complex numbers carry over to H. For q = a + bi + cj + dk, we define the conjugate, norm, and real and imaginary parts by

q̄ = a − bi − cj − dk
|q| = √(a² + b² + c² + d²)
Re(q) = a
Im(q) = bi + cj + dk

Let us say that a quaternion is imaginary if its real part is zero.
Theorem 15.2. Let q, q1, q2 ∈ H. Then

(a) the conjugate of q̄ is q.
(b) the conjugate of q1 + q2 is q̄1 + q̄2.
(c) the conjugate of q1q2 is q̄2 q̄1 (note the reversal).
(d) qq̄ = |q|².
(e) |q1q2| = |q1||q2|.
(f) If q is imaginary, then q² = −|q|².

¹Actually, we are using i times the Pauli matrices, which is more convenient for our purposes.
The first two statements are easy. The remainder are left as exercises.
Corollary 15.3. H forms a division ring. If q ≠ 0, its inverse is

q⁻¹ = q̄/|q|²

In particular, H* = H − {0} is a group under multiplication.
Lagrange proved that every positive integer is a sum of four squares of integers. For example,

10 = 3² + 1² + 0² + 0²
20 = 4² + 2² + 0² + 0²
30 = 5² + 2² + 1² + 0²
Although we won’t prove the theorem, we will explain one step because it gives
a nice application of quaternions.
Proposition 15.4. If x and y are both expressible as a sum of four squares of
integers, then the same is true of xy.
Proof. By assumption, we can find two quaternions q1 and q2 with integer coef-
ficients such that x = |q1 |2 and y = |q2 |2 . The product q1 q2 is also a quaternion
with integer coefficients. By theorem 15.2, xy = |q1 q2 |2 .
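The proposition is easy to test; a Python sketch with hypothetical helpers qmul (transcribing the displayed product formula) and norm2:

```python
def qmul(q, r):
    # quaternion product on coefficient 4-tuples (a, b, c, d)
    a, b, c, d = q
    e, f, g, h = r
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def norm2(q):
    return sum(x * x for x in q)

q1 = (3, 1, 0, 0)             # 10 = 3^2 + 1^2 + 0^2 + 0^2
q2 = (4, 2, 0, 0)             # 20 = 4^2 + 2^2 + 0^2 + 0^2
assert norm2(q1) == 10 and norm2(q2) == 20
assert norm2(qmul(q1, q2)) == 200   # so 200 is again a sum of four squares
```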

15.5 Exercises
1. Check (15.2) and use this to show that in addition

σjσk = −σkσj = σi
σkσi = −σiσk = σj

hold.
2. Prove part (c) and (d) of theorem 15.2.
3. Prove part (e) and (f).
4. Check that the set Q = {1, −1, i, −i, j, −j, k, −k} is a subgroup of H∗
which is not abelian and not isomorphic to D4 . So it is a group of order
8 that we have not seen before, called the quaternion group.
5. Show that {±1} ⊂ Q is a normal subgroup, and that the quotient Q/{±1}
is isomorphic to D2 .
6. Let
T̃ = {±1, ±i, ±j, ±k, (1/2)(±1 ± i ± j ± k)}
where the signs on the terms in the last sum can be chosen independently
of each other. Check that T̃ ⊂ H∗ is a subgroup of order 24. This is called
the binary tetrahedral group.
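One way to check the order-24 claim is by brute force. A hedged sketch in exact rational arithmetic (the names qmul and Tt are ours): build the 8 + 16 = 24 elements and verify closure under multiplication, which for a finite subset of a group already gives a subgroup.

```python
from fractions import Fraction
from itertools import product

def qmul(p, q):
    # Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk.
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

half = Fraction(1, 2)
Tt = set()
for pos in range(4):                      # the 8 units ±1, ±i, ±j, ±k
    for s in (1, -1):
        v = [Fraction(0)] * 4
        v[pos] = Fraction(s)
        Tt.add(tuple(v))
for signs in product((half, -half), repeat=4):   # the 16 elements (±1 ± i ± j ± k)/2
    Tt.add(signs)

print(len(Tt))                                          # → 24
# Closure under multiplication; inverses follow since the set is finite.
print(all(qmul(p, q) in Tt for p in Tt for q in Tt))    # → True
```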
7. You have probably encountered the scalar (or dot, or inner) product ⟨ , ⟩ and the vector (or cross) product × on R³ before. Identifying vectors (a, b, c) ∈ R³ with imaginary quaternions ai + bj + ck, the scalar product is an R-valued operation given by

       ⟨bi + cj + dk, ei + fj + gk⟩ = be + cf + dg

   The vector product is an R³-valued distributive operation satisfying

       v × w = −w × v
       i × j = k,  j × k = i,  k × i = j.

   If v, w ∈ R³, show that these are related to the quaternionic product by

       vw = −⟨v, w⟩ + v × w

Chapter 16

The Spin group

We return to the study of rotations. We saw earlier that a rotation can be represented by a 3 × 3 matrix in SO(3). However, as we also saw, this description is cumbersome for doing calculations. We will give an alternative, based on the ring of quaternions, which makes them easy. Define the spin group
Spin = {q ∈ H | |q| = 1}
Using theorem 15.2, we can see that this is a subgroup of H∗ , so it really is
a group. The word “spin” comes from physics (as in electron spin); at least I
think it does. Usually this group is called Spin(3), but we won’t consider any
of the other groups in this series.
Lemma 16.1. If q ∈ Spin and v ∈ H is imaginary, then qvq̄ is imaginary.
Proof. Re(v) = 0 implies that v̄ = −v. By theorem 15.2, the conjugate of qvq̄ is qv̄q̄ = −qvq̄. A quaternion equal to the negative of its own conjugate has real part zero, so qvq̄ is imaginary.
We will identify R³ with the imaginary quaternions by sending [x, y, z] to xi + yj + zk. The previous lemma allows us to define a transformation Rot(q) : R³ → R³ by Rot(q)(v) = qvq̄ for q ∈ Spin. This is a linear transformation, and therefore it can be represented by a 3 × 3 matrix.
Lemma 16.2. Rot : Spin → GL3 (R) is a homomorphism.
Proof. We have Rot(q1 q2)(v) = q1 q2 v q̄2 q̄1 = Rot(q1)(Rot(q2)(v)) for all v, using theorem 15.2(c), so Rot(q1 q2) = Rot(q1) Rot(q2). The lemma follows.

Lemma 16.3. Rot(q) is an orthogonal matrix.


Proof. We use the standard characterization of orthogonal matrices: they are exactly the square matrices A for which |Av| = |v| for all vectors v. If v ∈ R³, then
| Rot(q)(v)|² = |qvq̄|² = |q|²|v|²|q̄|² = |v|²
since |q| = |q̄| = 1.

Lemma 16.4. Rot(q) ∈ SO(3).
Proof. There are a number of ways to see this. Geometrically, an orthogonal matrix lies in SO(3) if it takes a right handed orthonormal basis to another right handed orthonormal basis. In terms of the vector cross product, right handed means that the cross product of the first vector with the second is the third. In exercise 7 of the last chapter, we saw that the imaginary part of the product of two imaginary quaternions is the vector cross product of the corresponding vectors. The right handed basis i, j, k gets transformed to Rot(q)i, Rot(q)j, Rot(q)k. Since (qiq̄)(qjq̄) = q(ij)q̄ = qkq̄, we have Rot(q)i × Rot(q)j = Rot(q)k. So this is again right handed.
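The last two lemmas are easy to spot-check numerically. Below is a hedged Python sketch (the helpers qmul, qconj and rot_matrix are our own names, not from the text) that builds the matrix of Rot(q) column by column and confirms it lies in SO(3):

```python
import numpy as np

# Quaternion (a, b, c, d) = a + bi + cj + dk.
def qmul(p, q):
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def rot_matrix(q):
    # Columns of Rot(q) are q v qbar for v = i, j, k (imaginary quaternions).
    cols = []
    for v in [(0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]:
        w = qmul(qmul(q, v), qconj(q))
        cols.append(w[1:])               # imaginary part, as a vector in R^3
    return np.array(cols).T

theta = 0.7
r = np.array([1.0, 2.0, 2.0]) / 3.0          # a unit vector
q = (np.cos(theta), *(np.sin(theta) * r))    # an element of Spin
R = rot_matrix(q)
# Orthogonal with determinant 1, up to floating point tolerance:
print(np.allclose(R.T @ R, np.eye(3)), np.isclose(np.linalg.det(R), 1.0))
# → True True
```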
Lemma 16.5. If r is an imaginary quaternion with |r| = 1, and a, b ∈ R satisfy a² + b² = 1, then Rot(a + br) is a rotation about r.
Proof. Let q = a + br. It clearly satisfies |q| = 1, and q̄ = a − br. The lemma follows from

    Rot(q)(r) = (a + br)r(a − br) = (ar − b)(a − br) = (a² + b²)r = r

where we used r² = −|r|² = −1.
It remains to determine the angle.


Theorem 16.6. For any unit vector r viewed as an imaginary quaternion,
Rot(cos(θ) + sin(θ)r)
is R(2θ, r).
Proof. Pick a right handed system of orthonormal vectors v1, v2, v3 with v3 = r. Then, by exercise 7 of the last chapter, v1 v2 = v3, v2 v3 = v1, and v3 v1 = v2. Let
q = cos(θ) + sin(θ)r. We have already seen that Rot(q)v3 = v3 . We also find
    Rot(q)v1 = (cos θ + sin θ v3)v1(cos θ − sin θ v3)
             = (cos² θ − sin² θ)v1 + (2 sin θ cos θ)v2
             = cos(2θ)v1 + sin(2θ)v2
and
    Rot(q)v2 = (cos θ + sin θ v3)v2(cos θ − sin θ v3)
             = −sin(2θ)v1 + cos(2θ)v2
which means that Rot(q) agrees with R(2θ, r) on the basis v1, v2, v3, and therefore Rot(q) = R(2θ, r).
Corollary 16.7. The homomorphism Rot : Spin → SO(3) is onto, and SO(3) is isomorphic to Spin /{±1}.
Proof. Any rotation is given by R(2θ, r) for some θ and r, so Rot is onto. If q lies in the kernel, then qvq̄ = v, i.e. qv = vq, for every imaginary quaternion v; a quaternion commuting with i, j and k must be real, and since |q| = 1 this forces q = ±1. So the kernel of Rot is {1, −1}, and therefore SO(3) ≅ Spin /{±1}.
So, in other words, a rotation can be represented by an element of Spin uniquely up to a plus or minus sign. This representation of rotations by quaternions is very economical, and, unlike with R(θ, r), multiplication is straightforward.
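For instance, composing two rotations amounts to a single quaternion product. A minimal sketch (the names qmul and rot_quat are ours), using the correspondence Rot(cos(θ/2) + sin(θ/2)r) = R(θ, r) from theorem 16.6:

```python
import math

def qmul(p, q):
    # Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk.
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def rot_quat(theta, r):
    # Element of Spin representing R(theta, r): cos(theta/2) + sin(theta/2) r.
    h = theta / 2.0
    return (math.cos(h), math.sin(h)*r[0], math.sin(h)*r[1], math.sin(h)*r[2])

# Rotate by 0.5 about the z axis, then by 0.7 about the z axis.
# Composition of maps: the factor acting first goes on the right.
q = qmul(rot_quat(0.7, (0, 0, 1)), rot_quat(0.5, (0, 0, 1)))
angle = 2 * math.acos(q[0])     # read the total angle off the real part
print(round(angle, 6))          # → 1.2
```

The same one-line product composes rotations about different axes, which is the quick way to attack exercise 1 below.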

16.8 Exercises
1. Suppose we rotate R³ counterclockwise once around the z axis by 90°, and then around the x axis by 90°. This can be expressed as a single rotation. Determine it.
2. Given a matrix A ∈ Mnn(C), define the adjoint A∗ = ĀT; in other words, the ijth entry of A∗ is the complex conjugate of aji. (This should not be confused with the matrix built out of cofactors, which is also often called the adjoint.) A matrix A is called unitary if A∗A = I, and special unitary if in addition det A = 1. Prove that the subset U(n) (resp. SU(n)) of unitary (resp. special unitary) matrices in GLn(C) forms a subgroup.
3. Let a + bi + cj + dk ∈ Spin, and let A ∈ M22(C) be given by (15.1). Prove that A ∈ SU(2), and that this gives an isomorphism Spin ≅ SU(2).
4. Consider the quaternion group Q = {±1, ±i, ±j, ±k} studied in a previous exercise. Show that it lies in Spin and that its image in SO(3) is the subgroup of diagonal matrices

       { diag(±1, ±1, ±1) | there are 0 or 2 entries equal to −1 }

   Find the poles (see chapter 14) and calculate the orders of their stabilizers.
5. Let

       V = { (1/√3)[1, 1, 1]T, (1/√3)[−1, −1, 1]T, (1/√3)[−1, 1, −1]T, (1/√3)[1, −1, −1]T }

   and let

       T̃ = {±1, ±i, ±j, ±k, (1/2)(±1 ± i ± j ± k)}

   be the subgroup of Spin defined in an exercise in the previous chapter. Show that the image T of T̃ in SO(3) has order 12, and that it consists of the union of the set of matrices in exercise 4 and

       {R(θ, r) | θ ∈ {2π/3, 4π/3}, r ∈ V}

6. Continuing the last exercise, show that T acts as the rotational symmetry group of the regular tetrahedron with vertices in V.

Bibliography

[1] M. Artin, Algebra
[2] M. Armstrong, Groups and symmetry
[3] H. Weyl, Symmetry

