Introduction To Group Theory
Imagine a square of paper lying flat on your desk. I ask you to close your eyes. You hear the paper shift. When
you open your eyes the paper doesn’t appear to have changed. What could I have done to it while you weren’t
looking?
It’s obvious that I haven’t rotated the paper by 30 degrees, because then the paper would look different.
I also did not flip it across a line connecting, say, one of the corners to the midpoint of an edge. The paper would look different if I had.
What I could have done, however, was rotate the paper clockwise or counterclockwise by any multiple of 90 degrees, or flip it across either of the diagonal lines or across the horizontal or vertical line.
Flipping across any dashed line will not change the square.
A helpful way to visualize the transformations is to mark the corners of the square.
The last option is to do nothing. This is called the identity transformation. Together, these are all called the
symmetry transformations of the square.
I can combine symmetry transformations to make other symmetry transformations. For example, two flips
across the line segment BD produce the identity, as do four successive 90 degree counterclockwise rotations. A
flip about the vertical line followed by a flip about the horizontal line has the same effect as a 180 degree
rotation. In general, any combination of symmetry transformations will produce a symmetry transformation.
The following table gives the rules for composing symmetry transformations:
We use “e” for the identity transformation.
In this table, R with subscripts 90, 180, and 270 denote counterclockwise rotations by 90, 180, and 270 degrees,
H means a flip about the horizontal line, V is a flip about the vertical line, MD is a flip about the diagonal from
the top left to the bottom right, and OD means a flip over the other diagonal. To find the product of A and B, go
to the row of A and then over to the column of B. For example, H∘MD=R₉₀.
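To make the composition rule concrete, here is a small Python sketch of my own (not from the original article) that computes compositions by tracking where each corner of the square goes. The coordinate conventions for H, V, MD, and OD are my reading of the descriptions above.

# Symmetries of the square as maps on corner coordinates.
# Names follow the text: e, R90, R180, R270, H, V, MD, OD.

CORNERS = [(1, 1), (-1, 1), (-1, -1), (1, -1)]  # the four corners of the square

TRANSFORMS = {
    "e":    lambda x, y: (x, y),
    "R90":  lambda x, y: (-y, x),    # counterclockwise rotations
    "R180": lambda x, y: (-x, -y),
    "R270": lambda x, y: (y, -x),
    "H":    lambda x, y: (x, -y),    # flip about the horizontal line
    "V":    lambda x, y: (-x, y),    # flip about the vertical line
    "MD":   lambda x, y: (-y, -x),   # flip about the top-left/bottom-right diagonal
    "OD":   lambda x, y: (y, x),     # flip about the other diagonal
}

def signature(f):
    """Where the transformation sends each corner; this identifies it uniquely."""
    return tuple(f(x, y) for x, y in CORNERS)

def compose(a, b):
    """a ∘ b: apply b first, then a, and look up which named symmetry results."""
    fa, fb = TRANSFORMS[a], TRANSFORMS[b]
    target = tuple(fa(*fb(x, y)) for x, y in CORNERS)
    for name, f in TRANSFORMS.items():
        if signature(f) == target:
            return name
    raise ValueError("not a symmetry of the square")

print(compose("H", "MD"))   # R90, matching the example from the table
print(compose("MD", "H"))   # R270, so the order of composition matters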
There are a few things you may notice by looking at the table:
• The operation ∘ is associative, meaning that A∘(B∘C) = (A∘B)∘C for any transformations A, B, and C.
• For any pair of symmetry transformations A and B, the composition A∘B is also a symmetry transformation.
• There is exactly one element e such that A∘e=e∘A=A for every A.
• For every symmetry transformation A, there is a unique symmetry transformation A⁻¹ such that A∘A⁻¹=A⁻¹∘A=e.
We therefore say that the collection of symmetry transformations of a square, combined with composition,
forms a mathematical structure called a group. This group is called D₄, the dihedral group for the square. These
structures are the subject of this article.
Definition of a group
A group ⟨G,*⟩ is a set G with a rule * for combining any two elements in G that satisfies the group axioms:
• Associativity: (a*b)*c = a*(b*c) for all a,b,c∈G
• Closure: a*b∈G for all a,b∈G
• Unique identity: There is exactly one element e∈G such that a*e=e*a=a for all a∈G
• Unique inverses: For each a∈G there is exactly one a⁻¹∈G for which a*a⁻¹=a⁻¹*a=e.
In the abstract we often suppress * and write a*b as ab and refer to * as multiplication.
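To make the definition concrete, here is a small sketch of my own (not from the article) that checks the four axioms for a finite set whose multiplication table is given as a dictionary; it is demonstrated on the two-element group {0, 1} under binary addition, which appears again below.

def is_group(elements, op):
    """op[a][b] is a*b; return True if <elements, op> satisfies the group axioms."""
    # Closure: every product stays in the set.
    if any(op[a][b] not in elements for a in elements for b in elements):
        return False
    # Associativity: (a*b)*c = a*(b*c) for all triples.
    if any(op[op[a][b]][c] != op[a][op[b][c]]
           for a in elements for b in elements for c in elements):
        return False
    # Exactly one identity element.
    identities = [e for e in elements
                  if all(op[a][e] == a and op[e][a] == a for a in elements)]
    if len(identities) != 1:
        return False
    e = identities[0]
    # Every element has an inverse.
    return all(any(op[a][b] == e and op[b][a] == e for b in elements)
               for a in elements)

# Binary addition on {0, 1}: 0 is the identity and 1 is its own inverse.
table = {0: {0: 0, 1: 1}, 1: {0: 1, 1: 0}}
print(is_group({0, 1}, table))  # True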
An example of a group from everyday life is the set of “moves” that can be made on a Rubik’s cube under
composition.
It is not necessary for * to be commutative, meaning that a*b=b*a. You can see this by looking at the table of
D₄, where H∘MD=R₉₀ but MD∘H=R₂₇₀. Groups where * is commutative are called abelian groups after Niels
Abel.
Abelian groups are the exception rather than the rule. Another example of a non-abelian group is the symmetry
transformations of a cube. Consider just rotations about the axes:
Source: Chegg
If I first rotate 90 degrees counterclockwise about the y-axis and then 90 degrees counterclockwise about the z-axis, this will have a different result than if I were to rotate 90 degrees about the z-axis and then 90 degrees about the y-axis.
Top row: Rotation 90 degrees about y followed by 90 degrees about z. Bottom row: 90 degree rotation about z
followed by 90 degree rotation about y.
It is possible for an element to be its own inverse. Consider the group which consists of 0 and 1 with the operation of binary addition. Its table is:

+ | 0 1
0 | 0 1
1 | 1 0

Clearly 1 is its own inverse. This is also an abelian group. Don’t worry, most groups aren’t this boring.
Some more examples of groups include:
• The set of integers with addition.
• The set of rational numbers not including 0 with multiplication.
• The set of solutions to the polynomial equation xⁿ-1=0, called the nth roots of unity, with multiplication (see the sketch below).
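As a quick illustration of the last example, here is a small sketch of my own that computes the cube roots of unity numerically and checks that the product of any two of them is again a cube root of unity; the numerical tolerance is an implementation detail, not part of the mathematics.

import cmath

n = 3
roots = [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]  # the nth roots of unity

for a in roots:
    for b in roots:
        product = a * b
        # The product is (up to rounding) one of the original roots: closure.
        assert any(abs(product - r) < 1e-9 for r in roots)

print("Every product of two cube roots of unity is again a cube root of unity.")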
In our example code, m=3 and the sphere of radius ½(m-1)=1 about the codeword 011101 is {011101, 111101,
001101, 011001, 010101, 011111, 011100}. So if the receiver receives any of these words, it knows that it was
supposed to receive 011101.
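Here is a small sketch of my own that reproduces that sphere, taking d to be the Hamming distance, the number of positions in which two words differ.

def hamming_distance(x, y):
    """Number of positions in which the words x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def sphere(word, radius):
    """All words of the same length within the given Hamming distance of word."""
    n = len(word)
    return {format(i, f"0{n}b") for i in range(2 ** n)
            if hamming_distance(word, format(i, f"0{n}b")) <= radius}

print(sorted(sphere("011101", 1)))
# ['001101', '010101', '011001', '011100', '011101', '011111', '111101']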
So that’s all well and good, but what does any of this have to do with group theory? The answer is in how we
actually produce the codes that this method works with.
It is possible for a code in 𝔹ⁿ to form a subgroup of 𝔹ⁿ. In this case we say that the code is a group code. Group codes are finite groups, so they are finitely generated. We will see how to produce a code by finding a generating set.
A code can be specified so that the first few bits of each codeword are called the information bits and the bits at
the end are called the parity bits. In our example code C, the first three bits are information bits and the last
three are parity bits. The parity bits satisfy parity-check equations. For a codeword A₁A₂A₃A₄A₅A₆ the parity
equations are A₄=A₁+A₂, A₅=A₂+A₃, and A₆=A₁+A₃. Parity equations provide another layer of protection
against errors: if any of the parity equations aren’t satisfied then an error has occurred.
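As a sketch of how these equations are used (my own code; the function names encode and check are hypothetical), the information bits determine the parity bits, and a received word can be tested against the equations:

def encode(a1, a2, a3):
    """Append the parity bits A4 = A1 + A2, A5 = A2 + A3, A6 = A1 + A3 (mod 2)."""
    a4 = (a1 + a2) % 2
    a5 = (a2 + a3) % 2
    a6 = (a1 + a3) % 2
    return f"{a1}{a2}{a3}{a4}{a5}{a6}"

def check(word):
    """True if the six-bit word satisfies all three parity-check equations."""
    a1, a2, a3, a4, a5, a6 = (int(b) for b in word)
    return (a4 == (a1 + a2) % 2 and
            a5 == (a2 + a3) % 2 and
            a6 == (a1 + a3) % 2)

print(encode(0, 1, 1))   # 011101, the codeword from the earlier example
print(check("011100"))   # False: flipping the last bit breaks a parity equation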
Here is what we do. Suppose that we want codewords with m information bits and n parity bits. To generate a group code, write a matrix with m rows and m+n columns. In the block formed by the first m columns, write the m×m identity matrix. In column j for m+1≤j≤m+n, write a 1 in the kth row if Aₖ appears in the parity equation for parity bit Aⱼ and 0 otherwise. In our example code, the matrix is:

1 0 0 1 0 1
0 1 0 1 1 0
0 0 1 0 1 1
Such a matrix is called a generating matrix for the group code. You can verify directly that the rows generate C; one way to do that is sketched below.
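Here is that verification as a small sketch of my own: form every possible sum (bitwise addition mod 2) of the rows and list the results, which are exactly the eight codewords of C.

from itertools import combinations

rows = ["100101", "010110", "001011"]  # rows of the generating matrix above

def add(x, y):
    """Bitwise addition mod 2 of two binary words of the same length."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

code = set()
for r in range(len(rows) + 1):          # subsets of rows of every size
    for subset in combinations(rows, r):
        word = "000000"                 # the empty sum is the all-zeros word
        for row in subset:
            word = add(word, row)
        code.add(word)

print(sorted(code))
# ['000000', '001011', '010110', '011101', '100101', '101110', '110011', '111000']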
The rows of any generating matrix form a generating set for a group code C. Proof:
• Identity and inverses: Any word added to itself gives the identity, the string consisting of all zeros, so the identity is in the code and every codeword is its own inverse.
• Closure: If A,B∈C then A and B are sums of rows of the generating matrix so A+B is also a sum of rows
of the generating matrix. Therefore A+B∈C.
This lets us generate a code; now I will show how to generate a useful one.
Define the weight w(x) to mean the number of ones in x. For example, w(100101)=3. It is obvious that w(x)=d(x,0), where 0 is a word whose digits are all zeroes. The minimum weight W of a code is the smallest weight among its nonzero codewords. For a group code of minimum distance m, every nonzero codeword x satisfies d(x,0)≥m, so w(x)≥m and W≥m. Conversely, for any two distinct codewords x and y, the sum x+y is itself a nonzero codeword and d(x,y)=w(x+y), since x+y has a 1 exactly where x and y differ, so m≥W. Therefore W=m.
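A quick sketch of my own confirming this for the example code: compute the minimum weight and the minimum distance directly and check that they agree.

code = ["000000", "001011", "010110", "011101",
        "100101", "101110", "110011", "111000"]  # the codewords generated above

def weight(x):
    """Number of ones in the word x."""
    return x.count("1")

def distance(x, y):
    """Number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

min_weight = min(weight(w) for w in code if w != "000000")
min_distance = min(distance(x, y) for x in code for y in code if x != y)
print(min_weight, min_distance)  # 3 3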
Recall that for a word to be considered a codeword, it must satisfy a set of parity-check equations. For our example code, these are A₄=A₁+A₂, A₅=A₂+A₃, and A₆=A₁+A₃. We can also write these as a linear system:

A₁+A₂+A₄=0
A₂+A₃+A₅=0
A₁+A₃+A₆=0

which itself can be written in terms of dot products with the word a=(A₁,A₂,A₃,A₄,A₅,A₆):

(1,1,0,1,0,0)·a=0
(0,1,1,0,1,0)·a=0
(1,0,1,0,0,1)·a=0

or in more compact form as Ha=0, where H is the parity-check matrix for the code:

1 1 0 1 0 0
0 1 1 0 1 0
1 0 1 0 0 1
One can verify by direct computation that if w(a)≤2 then we cannot have Ha=0. In general, the minimum weight is t+1, where t is the largest number such that every collection of t columns of H is linearly independent (no collection of t or fewer columns sums to zero). Proving this would take us just a little bit too far into linear algebra.
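Here is that direct computation as a small sketch of my own, using the parity-check matrix H written out above: every codeword of C passes the parity checks, while no nonzero word of weight one or two does.

from itertools import product

H = [[1, 1, 0, 1, 0, 0],
     [0, 1, 1, 0, 1, 0],
     [1, 0, 1, 0, 0, 1]]

code = ["000000", "001011", "010110", "011101",
        "100101", "101110", "110011", "111000"]

def syndrome(word):
    """Ha over the binary field: one parity check per row of H."""
    bits = [int(b) for b in word]
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]

# Every codeword satisfies all three parity checks.
assert all(syndrome(w) == [0, 0, 0] for w in code)

# No word with one or two ones satisfies them all.
for bits in product("01", repeat=6):
    word = "".join(bits)
    if 1 <= word.count("1") <= 2:
        assert syndrome(word) != [0, 0, 0]

print("Codewords pass every parity check; words of weight 1 or 2 do not.")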
If that bit of linear algebra frightens you, don’t worry. We can produce some very good codes without it by taking advantage of the fact that the result I just mentioned implies that if every column of H is nonzero and no two columns are equal, then the minimum weight, and thus the minimum distance of the code, is at least three. This is very good. With minimum distance three, single bit errors are corrected, so a word goes wrong only when two or more errors land in the same word. If our communication system is expected to have one bit error for every hundred words transmitted, then two errors in one word occur roughly (1/100)² of the time and three errors roughly (1/100)³ of the time, so only about one in ten thousand transmitted words will have an uncorrected error and about one in one million transmitted words will have an undetected error.
So now we have a recipe for producing a useful code for the maximum-likelihood detection scheme for
codewords containing m information bits and n parity bits:
• Create a matrix with m+n columns and n rows. Fill in the matrix with ones and zeros so that no two
columns are the same and no column is just zeros.
• Each row of the resulting parity-check matrix corresponds to one of the parity bit equations. Write them
as a system of equations and solve so that each parity bit is written in terms of the information bits.
• Create a matrix with m+n columns and m rows. In the block formed by the first m columns, write the m×m identity matrix. In column j for m+1≤j≤m+n, write a 1 in the kth row if Aₖ appears in the parity equation for parity bit Aⱼ and 0 otherwise.
• The rows of this matrix are the generators of a group code with minimum distance at least three. This is the code we will use; the sketch after this list carries out these steps for our running example.
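Here is the whole recipe as a sketch of my own, applied to our running example of three information bits and three parity bits with the parity equations A₄=A₁+A₂, A₅=A₂+A₃, and A₆=A₁+A₃ (the variable names below are mine).

from itertools import combinations

m, n = 3, 3
# parity_deps[j] lists which information bits (0-indexed) feed parity bit j.
parity_deps = [(0, 1), (1, 2), (0, 2)]

# Generating matrix: an identity block for the information bits, then one
# column per parity bit with 1s in the rows of the information bits feeding it.
G = [[1 if k == i else 0 for k in range(m)] +
     [1 if i in dep else 0 for dep in parity_deps]
     for i in range(m)]

def add(x, y):
    """Componentwise addition mod 2."""
    return tuple((a + b) % 2 for a, b in zip(x, y))

# The code is the set of all sums of subsets of rows of G.
code = set()
for r in range(m + 1):
    for subset in combinations(G, r):
        word = tuple(0 for _ in range(m + n))
        for row in subset:
            word = add(word, tuple(row))
        code.add(word)

min_distance = min(sum(a != b for a, b in zip(x, y))
                   for x in code for y in code if x != y)
print(len(code), "codewords, minimum distance", min_distance)  # 8 codewords, minimum distance 3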
As an example, suppose that I need a simple, eight-word code and I only need some basic error detection, not correction, so I can get away with a minimum distance of two. I want three information bits and two parity bits. I write down the following parity-check matrix:

1 0 1 1 0
1 1 1 0 1

There are two pairs of columns that are equal, so the largest t for which every collection of t columns is linearly independent is one, and the minimum weight, and therefore the minimum distance I will have, is two. The rows represent the parity-check equations A₄=A₁+A₃ and A₅=A₁+A₂+A₃. My generating matrix is therefore:

1 0 0 1 1
0 1 0 0 1
0 0 1 1 1
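A quick sketch of my own using the matrices above (both written out here from the stated parity equations): generate the eight codewords from the rows of the generating matrix and confirm that the minimum distance is two.

from itertools import combinations

rows = ["10011", "01001", "00111"]  # rows of the generating matrix above

def add(x, y):
    """Bitwise addition mod 2 of two binary words of the same length."""
    return "".join(str((int(a) + int(b)) % 2) for a, b in zip(x, y))

code = set()
for r in range(len(rows) + 1):
    for subset in combinations(rows, r):
        word = "00000"
        for row in subset:
            word = add(word, row)
        code.add(word)

min_distance = min(sum(a != b for a, b in zip(x, y))
                   for x in code for y in code if x != y)
print(sorted(code))
print("minimum distance:", min_distance)  # 2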
Closing remarks
Abstract algebra is a deep subject with far-reaching implications, but it is also a very easy one to learn. Aside
from a few passing mentions of linear algebra, nearly everything that I’ve discussed here is accessible to
someone who’s only had high school algebra.
When I first had the idea to write this article I really wanted to talk about the Rubik’s cube, but in the end I
wanted to pick an example that could be covered only with the most basic ideas in group theory. Plus, there is
so much to say about the Rubik’s cube group that it deserves a standalone piece, so that will be coming soon.
My college courses in abstract algebra were based on the book A Book of Abstract Algebra by Charles Pinter,
which is an accessible treatment. The examples in this article were all borrowed, with some modification, from
Pinter.
As a final note, any images that are not cited are my own original work and may be used with attribution. As
always, I appreciate any corrections or requests for clarification.