Continued Fractions
Wieb Bosma
Cor Kraaikamp
Sandra Hommersom
Merlijn Keune
Chris Kooloos
Willem van Loon
Roy Loos
Ewelina Omiljan
Geert Popma
David Venhoek
Maaike Zwart
2012-2013
Contents
1 Introduction
1.1 What is a continued fraction?
1.2 Finite real continued fractions
1.3 Infinite real continued fractions
1.4 Basic properties and matrices
1.5 Periodic regular continued fractions
2 Planetaria
2.1 Huygens's Planetarium
2.2 Eisinga's Planetarium
2.3 Mathematical Issues
2.3.1 Take some Educated Guesses
2.3.2 Continued Fractions
2.3.3 Gear Trains
2.3.4 Some last Comments
3 The Stern-Brocot algorithm
3.1 Constructing the Rationals
3.2 Application: Approximating Fractions
3.2.1 Stern - Brocot
3.2.2 Brocot - Euclid
3.3 An Amusing Property of the Stern-Brocot Sequence
3.3.1 Some last Comments
4 Sums of squares
4.0.2 Theorem 1
5 Pell equation
6 Markov numbers
7 The nearest integer continued fraction
8 Continued fractions and the LLL algorithm
8.1 Lattices and bases
8.2 The geometry of continued fractions
8.3 The relation between LLL and NICF
9 Continued fractions and Ford circles
10 Decimals vs. continued fractions
10.1 Preliminaries
10.2 Results of Levy and Lochs
11 Entropy and the theorem of Lochs
11.1 Introduction to entropy
11.2 Calculation of entropy
11.3 The theorem of Lochs
11.3.1 Computation of h(T)
11.3.2 Proof of the theorem
12 Complex continued fractions
12.1 Greatest common divisor of two Gaussian integers
12.2 Generalized circles
12.3 Hurwitz mapping
12.4 Finite number of g-circles
12.5 Bounded partial quotients
13 Geodesics
14 Hall's theorem
14.1 Cantor Set
15 Bounded complex partial quotients
16 Binary quadratic forms
16.1 Positive definite forms
16.2 Indefinite forms
17 CFs in power series fields
17.1 Introduction
17.1.1 Lemma 1
17.2 Properties of convergents
17.2.1 Lemma 2
17.2.2 Lemma 3
17.3 Relations between continued fraction expansions
17.3.1 Lemma 4
17.3.2 Theorem 2
17.4 Möbius transformations and matrix notation
17.5 Pseudoperiodic continued fractions
17.5.1 Theorem 3
17.6 Calculating continued fraction
17.6.1 Step of type I
17.6.2 Lemma 5
17.6.3 Step of type II
17.6.4 Lemma 6
17.6.5 Lemma 7
17.6.6 Calculating a continued fraction from a relation with itself
18 Computing Möbius transformations
18.1 Introduction
18.2 Words
18.3 Finite automata
18.4 Transducers
18.4.1 Multi-symbol input
18.5 LR representation of the continued fraction expansion
18.6 Other 2 × 2 matrices over N
18.7 Enumerating matrices in 1B_n, (B_n and TB_n
18.8 Transformations on row balanced matrices
18.9 Transducers for Möbius transformations
18.10 Conclusion
Preface
These are the notes of a course on Continued Fractions that we organized
in Nijmegen in the fall semester of 2012. After a brief introduction by us,
the notes contain the contents of the 18 lectures that were given by the
nine student participants. Roughly speaking they correspond to topics we
proposed for self-study (with literature provided by us).
The reader may notice some strange formatting in these pages; this is
due to the rather hastily assembled character of this document: many of
the chapters were originally formatted in another LaTeX style, and we did
not invest much time in re-formatting. In particular, the text may
run into the margin occasionally. Also, we did not (yet) take the time to
translate brief sections of the Introduction (taken from elsewhere) that were
written in Dutch.
Nevertheless, this document shows very well the range of topics covered
in the course. We mainly put the whole thing together for the participants
to look back on their own and, more importantly, on each other's contributions.
Thanks are due to Sandra, Merlijn, Chris, Willem, Roy, Ewelina, Geert,
David and Maaike for their efforts in writing these notes, lecturing about
them, and making this course into an enjoyable experience.
Wieb Bosma, Cor Kraaikamp
Nijmegen, August 2013
Chapter 1
Introduction
Wieb Bosma, Cor Kraaikamp
1.1 What is a continued fraction?
A finite continued fraction is a representation
\[
\frac{p}{q} = a_0 + \cfrac{e_1}{a_1 + \cfrac{e_2}{a_2 + \cfrac{e_3}{\ddots + \cfrac{e_n}{a_n}}}}
\]
for an element $p/q$ from the field of fractions $Q(R)$ of a commutative ring $R$
(with unit element). Here $e_i, a_i$ are elements from the ring $R$, and it is clear
that $p, q \in R$ for which equality holds will always exist: we can just simplify
the continued fraction (from right to left). Usually, certain restrictions are
placed on the $e_i$ and $a_i$ depending on $R$ and the type of continued fraction;
we will see examples of this further on. The non-negative integer $n$ will be
called the length of the continued fraction.
Suppose now that $Q(R)$ is endowed with a metric, and that $\hat{Q}$ is a
completion of $Q(R)$ with respect to this metric. Then we say that
\[
a_0 + \cfrac{e_1}{a_1 + \cfrac{e_2}{a_2 + \cfrac{e_3}{\ddots}}}
\]
is an infinite continued fraction if for every $n \ge 0$ the finite part
\[
\frac{p_n}{q_n} = a_0 + \cfrac{e_1}{a_1 + \cfrac{e_2}{a_2 + \cfrac{e_3}{\ddots + \cfrac{e_n}{a_n}}}}
\]
is a finite continued fraction representation for $p_n/q_n$ of $Q(R)$ and the limit
\[
\lim_{n \to \infty} \frac{p_n}{q_n}
\]
exists as an element of $\hat{Q}$. If this limit is $x$, we say that the infinite continued
fraction represents $x$. The finite truncations represent elements $p_n/q_n$ that
are called convergents of $x$.
For typographical reasons we will usually denote the above continued
fraction by $[a_0, e_1/a_1, e_2/a_2, \dots]$, or, in the (common) case that all $e_i$ are
equal to 1, by $[a_0, a_1, a_2, \dots]$.
1.2 Finite real continued fractions
The most common type of continued fraction is that of continued fractions
for real numbers: this is the case where $R = \mathbb{Z}$, so $Q(R) = \mathbb{Q}$, with the usual
Euclidean metric $|\cdot|$, which yields the field of real numbers as completion.
Although we do not limit ourselves to this case in the course, it will be used
very often as a basic case for reference.
A regular continued fraction expansion for $x \in \mathbb{R}$ will be a finite or infinite
continued fraction of the form
\[
a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots}}}
\]
Euclid's algorithm is very closely related to continued fractions.
Example 1.2.1. Suppose we determine the greatest common divisor of 33
and 137 using Euclid's method:
\begin{align*}
137 &= 4 \cdot 33 + 5,\\
33 &= 6 \cdot 5 + 3,\\
5 &= 1 \cdot 3 + 2,\\
3 &= 1 \cdot 2 + 1,\\
2 &= 2 \cdot 1 + 0,
\end{align*}
which shows the g.c.d. is 1. Dividing each of the above lines of the form
$a = q \cdot b + r$ by $b$ we obtain
\begin{align*}
\frac{137}{33} &= 4 + \frac{5}{33}, &
\frac{33}{5} &= 6 + \frac{3}{5}, &
\frac{5}{3} &= 1 + \frac{2}{3}, &
\frac{3}{2} &= 1 + \frac{1}{2}, &
\frac{2}{1} &= 2 + \frac{0}{1}.
\end{align*}
In these, each fraction on the right is the reciprocal of the fraction on the
left of the next line, so we get (by substitution) that
\[
\frac{137}{33} = 4 + \frac{5}{33}
= 4 + \cfrac{1}{6 + \cfrac{3}{5}}
= 4 + \cfrac{1}{6 + \cfrac{1}{1 + \cfrac{2}{3}}}
= 4 + \cfrac{1}{6 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2}}}}.
\]
It will be clear why the expression is called a continued fraction, and
why we prefer the notation $[4; 6, 1, 1, 2]$ for it; taking reciprocals,
$33/137 = [0; 4, 6, 1, 1, 2]$.
Definition 1.2.2. A finite regular continued fraction $[a_0; a_1, a_2, \dots, a_n]$ is a
repeated quotient
\[
[a_0; a_1, a_2, \dots, a_n] = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}},
\]
with integers $a_i$ satisfying $a_i \ge 1$ for $i \ge 1$, and $a_n \ge 2$. The integers
$a_i$ are called partial quotients. The continued fraction $[a_0; a_1, a_2, \dots, a_n]$
determines a rational number $p/q$, called the value of the continued fraction.
The rational numbers
\[
\frac{p_0}{q_0} = [a_0;\,], \quad
\frac{p_1}{q_1} = [a_0; a_1], \quad
\frac{p_2}{q_2} = [a_0; a_1, a_2], \quad \dots, \quad
\frac{p_n}{q_n} = [a_0; a_1, \dots, a_n],
\]
are called the convergents of $p/q = [a_0; a_1, a_2, \dots, a_n]$, and $n$ is its length.
Remark 1.2.3. Note that the conditions $a_i \ge 1$ for $i \ge 1$ and $a_n \ge 2$ arise
naturally from Euclid's algorithm. The latter restriction prohibits the
alternative form of a continued fraction ending in $a_n = 1$, which can be
rewritten via
\[
a_{n-1} + \frac{1}{1} = a_{n-1} + 1
\]
to a regular form. We will sometimes make use of the existence of both of
these expansions, one of odd and one of even length for every rational
number, in what follows, but only one of them is the regular continued fraction.
Theorem 1.2.4. Every rational number $p/q$ determines a unique finite
regular continued fraction.
Proof Given $p/q$, Euclid's algorithm determines $[a_0; a_1, a_2, \dots, a_n]$.
Note that for $t = [0; a_1, a_2, \dots, a_n]$ holds: $0 \le t < 1$ (with a strict
inequality on the right because $a_n > 1$), and thus
$a_0 \le [a_0; a_1, a_2, \dots, a_n] < a_0 + 1$.
Now suppose that also $p/q = [b_0; b_1, \dots, b_k]$ for another continued fraction.
Then $a_0 = \lfloor p/q \rfloor = b_0$ since $\lfloor p/q \rfloor$ is the unique integer satisfying
$\lfloor p/q \rfloor \le p/q < \lfloor p/q \rfloor + 1$.
Consider
\[
\frac{1}{[a_1; a_2, \dots, a_n]} = [0; a_1, a_2, \dots, a_n]
= \frac{p}{q} - \lfloor p/q \rfloor
= [0; b_1, \dots, b_k] = \frac{1}{[b_1; b_2, \dots, b_k]},
\]
then $[a_1; a_2, \dots, a_n] = [b_1; b_2, \dots, b_k]$ so $a_1 = b_1$ as before; etc.
Without first considering Euclid's algorithm, we now find the regular
expansion of $p/q$ in general as follows: determine the integral part $a_0$ and subtract
it from the fraction; take the reciprocal of the result and repeat.
In other words, we iterate
\[
x_{k+1} = \frac{1}{x_k - \lfloor x_k \rfloor} \tag{1.1}
\]
with $x_0 = p/q$ until $x_k - \lfloor x_k \rfloor$ becomes 0. Put $a_k = \lfloor x_k \rfloor$.
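The iteration (1.1) can be sketched in a few lines of Python. This is only an illustration, not part of the original notes: the function name `regular_cf` is our own, and exact rational arithmetic from the standard `fractions` module is assumed so that the test for a vanishing fractional part is reliable.

```python
from fractions import Fraction
from math import floor


def regular_cf(x: Fraction) -> list[int]:
    """Partial quotients [a_0, a_1, ..., a_n] of a rational x via iteration (1.1)."""
    quotients = []
    while True:
        a = floor(x)          # a_k = floor(x_k)
        quotients.append(a)
        frac = x - a          # x_k - floor(x_k), a rational in [0, 1)
        if frac == 0:         # stop as soon as the fractional part vanishes
            return quotients
        x = 1 / frac          # x_{k+1} = 1 / (x_k - floor(x_k))


print(regular_cf(Fraction(137, 33)))   # [4, 6, 1, 1, 2]
print(regular_cf(Fraction(33, 137)))   # [0, 4, 6, 1, 1, 2]
```

Because the last reciprocal taken is always larger than 1, the final partial quotient is at least 2 whenever the expansion has more than one term, in agreement with Definition 1.2.2.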
The convergents are found as follows: by definition $p_0/q_0 = [a_0;\,] = a_0/1$.
Then
\[
\frac{p_1}{q_1} = a_0 + \frac{1}{a_1} = \frac{a_1 a_0 + 1}{a_1},
\]
and
\[
\frac{p_2}{q_2} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2}}
= a_0 + \frac{a_2}{a_2 a_1 + 1}
= \frac{a_2 a_1 a_0 + a_2 + a_0}{a_2 a_1 + 1}.
\]
Of course this follows from the previous by replacing $a_1$ by $a_1 + 1/a_2$.
Similarly
\[
\frac{p_3}{q_3}
= \frac{\left(a_2 + \frac{1}{a_3}\right) a_1 a_0 + a_2 + \frac{1}{a_3} + a_0}{\left(a_2 + \frac{1}{a_3}\right) a_1 + 1}
= \frac{a_3 (a_2 a_1 a_0 + a_2 + a_0) + a_1 a_0 + 1}{a_3 (a_2 a_1 + 1) + a_1}
\]
and by induction we obtain the following.
Theorem 1.2.5. For every convergent $p_k/q_k$ of a rational number $p/q$:
\[
\frac{p_k}{q_k} = \frac{a_k p_{k-1} + p_{k-2}}{a_k q_{k-1} + q_{k-2}} \tag{1.2}
\]
for $1 \le k \le n$, defining $p_{-1} = 1$, $q_{-1} = 0$.
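As a small illustration of recurrence (1.2), the following Python sketch generates the convergents from a list of partial quotients. The generator name `convergents` is hypothetical; the starting values $p_{-1} = 1$, $q_{-1} = 0$, $p_0 = a_0$, $q_0 = 1$ are the ones used in the text.

```python
def convergents(quotients):
    """Yield (p_k, q_k) for k = 0, ..., n using recurrence (1.2)."""
    p_prev, q_prev = 1, 0            # p_{-1}, q_{-1}
    p, q = quotients[0], 1           # p_0 = a_0, q_0 = 1
    yield p, q
    for a in quotients[1:]:
        # p_k = a_k p_{k-1} + p_{k-2}, and likewise for q_k
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
        yield p, q


print(list(convergents([0, 4, 6, 1, 1, 2])))
# [(0, 1), (1, 4), (6, 25), (7, 29), (13, 54), (33, 137)]
```

The numerators and denominators are built independently of each other, which is why the same loop body serves for both sequences.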
Example 1.2.6. The partial quotients and convergents of $33/137 = [0; 4, 6, 1, 1, 2]$,
the reciprocal of the fraction in our first example, are summarized in the
following table.
  n   :  -1    0    1    2    3    4    5
  a_n :        0    4    6    1    1    2
  p_n :   1    0    1    6    7   13   33
  q_n :   0    1    4   25   29   54  137
Lemma 1.2.7. The convergents $p_k/q_k$ of a rational number $p/q$ satisfy
$|p_k| \ge |p_{k-1}|$ and $q_k \ge q_{k-1}$ for $k \ge 1$, and even:
\[
|p_k| > |p_{k-1}| \ (k \ge 3), \quad \text{and} \quad q_k > q_{k-1} \ (k \ge 2),
\]
while
\[
p_{k-1} q_k - p_k q_{k-1} = (-1)^k.
\]
Proof Use equation (1.2):
\begin{align*}
\frac{p_{k-1} q_k - p_k q_{k-1}}{q_{k-1} q_k}
&= \frac{p_{k-1}}{q_{k-1}} - \frac{p_k}{q_k}
= \frac{p_{k-1}}{q_{k-1}} - \frac{a_k p_{k-1} + p_{k-2}}{a_k q_{k-1} + q_{k-2}} \\
&= \frac{(-1)(p_{k-2} q_{k-1} - p_{k-1} q_{k-2})}{q_{k-1} (a_k q_{k-1} + q_{k-2})},
\end{align*}
which by induction equals
\[
\frac{(-1)^k (p_{-1} q_0 - p_0 q_{-1})}{q_{k-1} (a_k q_{k-1} + q_{k-2})}
= \frac{(-1)^k}{q_{k-1} (a_k q_{k-1} + q_{k-2})}.
\]
So $p_{k-1} q_k - p_k q_{k-1} = (-1)^k$ and in particular $p_k$ and $q_k$ will be coprime.
Also, by induction $q_k = a_k q_{k-1} + q_{k-2} > q_{k-1}$ for $k \ge 2$, because $a_k \ge 1$, and
$q_{k-1} > 0$ for $k \ge 1$. Moreover $|p_k| = a_k |p_{k-1}| + |p_{k-2}| > |p_{k-1}|$ for $k \ge 3$,
since $|p_{k-2}| > 0$ for $k \ge 3$.
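Theorem 1.2.8. The convergents $p_k/q_k$ of a rational number $p/q$ satisfy
for $k \ge 1$:
\[
\frac{p_{k-1}}{q_{k-1}} - \frac{p_k}{q_k} = \frac{(-1)^k}{q_{k-1} q_k},
\]
and
\[
\frac{p_0}{q_0} < \frac{p_2}{q_2} < \frac{p_4}{q_4} < \cdots < \frac{p}{q} = \frac{p_n}{q_n} < \cdots < \frac{p_3}{q_3} < \frac{p_1}{q_1} < \frac{p_{-1}}{q_{-1}};
\]
also
\[
\left| \frac{p}{q} - \frac{p_k}{q_k} \right| < \left| \frac{p}{q} - \frac{p_{k-1}}{q_{k-1}} \right|
\]
for $0 \le k \le n$. Here $p_{-1}/q_{-1} = 1/0$ is to be read as $+\infty$.
Proof The first statement is immediate from the Lemma. That also implies
that the sequence of $q_i$'s is strictly increasing, and hence the difference
between two consecutive convergents is decreasing; this proves the second
part.
To prove the final statement we first note the following, for positive
real numbers $a, b$ and integers $0 \le k \le n$:
\begin{align*}
\frac{a p_{k-1} + p_{k-2}}{a q_{k-1} + q_{k-2}} \le \frac{b p_{k-1} + p_{k-2}}{b q_{k-1} + q_{k-2}}
&\iff 0 \le (a - b)(p_{k-2} q_{k-1} - p_{k-1} q_{k-2}) = (a - b)(-1)^{k-1} \\
&\iff k \text{ odd and } b \le a, \text{ or } k \text{ even and } a \le b.
\end{align*}
For odd $k < n$ it holds that
\[
\frac{p_{k-1}}{q_{k-1}} \le \frac{p_{k+1}}{q_{k+1}} = \frac{a_{k+1} p_k + p_{k-1}}{a_{k+1} q_k + q_{k-1}} \ge \frac{p_k + p_{k-1}}{q_k + q_{k-1}};
\]
but then
\begin{align*}
\left| \frac{p}{q} - \frac{p_{k-1}}{q_{k-1}} \right|
&\ge \frac{p_{k+1}}{q_{k+1}} - \frac{p_{k-1}}{q_{k-1}}
\ge \frac{p_k + p_{k-1}}{q_k + q_{k-1}} - \frac{p_{k-1}}{q_{k-1}}
= \frac{-(p_{k-1} q_k - p_k q_{k-1})}{q_{k-1} (q_k + q_{k-1})} \\
&= \frac{1}{q_{k-1} (q_k + q_{k-1})}
> \frac{1}{q_k (q_k + q_{k-1})}
= \frac{-(p_{k-1} q_k - p_k q_{k-1})}{q_k (q_k + q_{k-1})} \\
&= \frac{p_k}{q_k} - \frac{p_k + p_{k-1}}{q_k + q_{k-1}}
\ge \frac{p_k}{q_k} - \frac{a_{k+1} p_k + p_{k-1}}{a_{k+1} q_k + q_{k-1}}
= \frac{p_k}{q_k} - \frac{p_{k+1}}{q_{k+1}}
\ge \frac{p_k}{q_k} - \frac{p}{q}.
\end{align*}
The even case is similar.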
Application 1.2.9 (Gear ratios). Christiaan Huygens used continued
fraction convergents in his construction of a planetarium, a model of the solar
system as it was known at the time. Using a single drive shaft and gears with
numbers of teeth in carefully chosen ratios, all known planets should revolve
with reasonable accuracy around the sun in this model. The gear ratios would
correspond to the ratios between the length of the year on each planet and
that on earth. To be able to make a physical model with actual gears, the
number of teeth could be neither very big nor too small.
Huygens found, for example, for the innermost planet, Mercury, a ratio
of 25335/105190. Its continued fraction is $[0; 4, 6, 1, 1, 2, 1, 1, 1, 1, 7, 1, 2]$ and
initially Huygens used the fifth convergent $[0; 4, 6, 1, 1, 2] = 33/137$. Later, he
realized that the ninth convergent $[0; 4, 6, 1, 1, 2, 1, 1, 1, 1] = 204/847$, which
at first sight requires too many teeth, can be used after all: since
$204 = 12 \cdot 17$ and $847 = 7 \cdot 121$, this better approximation is obtained with 4 gears
having 12, 17, 7, and 121 teeth, two of these fixed to the same shaft.
Application 1.2.10 (Solving linear equations). The property that
consecutive regular convergents satisfy $p_{k-1} q_k - p_k q_{k-1} = \pm 1$ can be used to
solve
\[
ax - by = 1, \qquad a, b \in \mathbb{Z}_{\ge 1}
\]
in integers $x, y$; note that $\gcd(a, b) = 1$ should hold for solutions to exist at
all.
Expand the fraction $b/a$ as a regular continued fraction and consider
the penultimate and the ultimate convergents $p_{n-1}/q_{n-1}$ and $p_n/q_n = b/a$.
According to Lemma 1.2.7
\[
p_{n-1} q_n - p_n q_{n-1} = p_{n-1} a - q_{n-1} b = (-1)^n,
\]
which means that, depending on the parity of $n$, a solution is given by
$(x_0, y_0) = (p_{n-1}, q_{n-1})$ or by $(x_0, y_0) = (-p_{n-1}, -q_{n-1})$.
If we want to consider positive solutions only, we may insist on making
the length of the continued fraction even by replacing $a_n$ by $a_n - 1, 1$, if
necessary.
From the solution $(x_0, y_0)$ we find the general solution by simply taking
$(x, y) = (x_0 + zb, y_0 + za)$ with $z \in \mathbb{Z}$: clearly these form solutions, while on the
other hand a second solution $(x_1, y_1)$ satisfies
\[
a x_1 - b y_1 = 1 = a x_0 - b y_0
\]
so
\[
a (x_1 - x_0) = b (y_1 - y_0).
\]
Since $a$ and $b$ are coprime, $a$ must be a divisor of $y_1 - y_0$ and $b$ a divisor of
$x_1 - x_0$; the result follows.
To solve $ax + by = 1$ we apply the following: develop the continued
fraction for $b/a$ to find one solution $(x_0, y_0)$ for $ax - by = 1$. Then $(x_0, -y_0)$
will be a solution of $ax + by = 1$ and the general solution is given by
$(x_0 + bz, -y_0 - az)$.
To solve equations of the form $ax - by = c$ with $|c| > 1$, one multiplies
a particular solution for $ax - by = 1$ by $c$ and then adds the multiples $(zb, za)$ as before.
Example 1.2.11. Find all solutions for the equation
\[
34x + 49y = 13.
\]
The regular continued fraction for 49/34 is $[1; 2, 3, 1, 3]$. Modifying this to
get an expansion of odd length gives $[1; 2, 3, 1, 2, 1]$, and this provides a
solution to
\[
34x + 49y = 1
\]
by looking at the penultimate convergent from the sequence
\[
\frac{1}{1}, \ \frac{3}{2}, \ \frac{10}{7}, \ \frac{13}{9}, \ \frac{36}{25}, \ \frac{49}{34},
\]
since $36 \cdot 34 - 25 \cdot 49 = -1$, that is, $34 \cdot (-36) + 49 \cdot 25 = 1$. The general
solution will be $x = -13 \cdot 36 + 49z$, $y = 13 \cdot 25 - 34z$ with $z \in \mathbb{Z}$.
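The procedure of Application 1.2.10 can be sketched as follows. This is a hedged illustration rather than the authors' code: the name `solve_linear` is our own, and instead of adjusting the parity of the expansion it simply reads off the sign $(-1)^n$ from Lemma 1.2.7.

```python
from math import gcd


def solve_linear(a, b, c):
    """One integer solution (x, y) of a*x + b*y = c, for coprime positive a, b."""
    if gcd(a, b) != 1:
        raise ValueError("a and b must be coprime")
    # Convergents of b/a via recurrence (1.2), starting from p_{-1}/q_{-1} = 1/0.
    p_prev, q_prev = 0, 1          # formal values p_{-2}, q_{-2}
    p, q = 1, 0                    # p_{-1}, q_{-1}
    n = -1
    num, den = b, a
    while den:
        quot, rem = divmod(num, den)       # next partial quotient of b/a
        num, den = den, rem
        p_prev, q_prev, p, q = p, q, quot * p + p_prev, quot * q + q_prev
        n += 1
    # Now p/q = b/a and p_prev/q_prev is the penultimate convergent, with
    # p_prev * a - q_prev * b = (-1)**n by Lemma 1.2.7.
    sign = 1 if n % 2 == 0 else -1
    return sign * p_prev * c, -sign * q_prev * c


x0, y0 = solve_linear(34, 49, 13)
print(x0, y0)                      # 169 -117
```

Every other solution is then $(x_0 + 49z, y_0 - 34z)$; for instance $z = -13$ gives $(-468, 325)$, the particular solution $x = -13 \cdot 36$, $y = 13 \cdot 25$ of Example 1.2.11.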
Application 1.2.12 (Egyptian fractions). Egyptian fractions are fractions
with numerator 1. The Egyptians wrote every fraction (with the exception of
2/3) as a sum of these Egyptian fractions with distinct denominators. There
are several algorithms to write a fraction as a sum of Egyptian fractions,
and two questions arise: how large can the denominators become, and how
many terms are needed?
The following method uses continued fractions to find reasonably short
expansions with reasonably small denominators. More precisely: it will, for
a fraction $p/q$, produce a sum of no more than $p$ terms with denominators
at most $q(q - 1)$.
Let $0 < p/q < 1$ be given, with $\gcd(p, q) = 1$. Suppose that
$p/q = [0; a_1, a_2, \dots, a_n]$ as a regular continued fraction. With induction on the
length of the continued fraction, we define an expansion in terms of Egyptian
fractions as follows: if $n = 1$ then $p/q = 1/a_1$ and we are done. Next suppose
that we have dealt with continued fractions of length up to $n - 1$, then we
continue as follows: for odd $n$ we have $p_{n-1}/q_{n-1} < p_n/q_n = p/q$, and
\[
\frac{p}{q} - \frac{p_{n-1}}{q_{n-1}} = \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}}
= \frac{p_n q_{n-1} - p_{n-1} q_n}{q_{n-1} q_n} = \frac{1}{q_{n-1} q_n},
\]
so we are done by induction. For even $n$, we have $p_{n-2}/q_{n-2} < p/q$ and we use
intermediate approximations or mediants:
\[
\frac{p_{n-2}}{q_{n-2}} < \frac{p_{n-2} + p_{n-1}}{q_{n-2} + q_{n-1}} < \cdots < \frac{p_{n-2} + a_n p_{n-1}}{q_{n-2} + a_n q_{n-1}} = \frac{p_n}{q_n} = \frac{p}{q}.
\]
Since
\[
\frac{p_{n-2} + j p_{n-1}}{q_{n-2} + j q_{n-1}} - \frac{p_{n-2} + (j - 1) p_{n-1}}{q_{n-2} + (j - 1) q_{n-1}}
= \frac{1}{(q_{n-2} + (j - 1) q_{n-1})(q_{n-2} + j q_{n-1})}
\]
we can write
\[
\frac{p}{q} = \frac{p_{n-2}}{q_{n-2}} + \sum_{j=1}^{a_n} \frac{1}{(q_{n-2} + (j - 1) q_{n-1})(q_{n-2} + j q_{n-1})},
\]
and we are done again by induction. In all we did not use more than
$1 + a_2 + a_4 + \cdots + a_{n'}$ Egyptian fractions, where $n'$ is the largest even integer
less than or equal to $n$.
There is a modification of the algorithm that uses several mediants
simultaneously, but it takes some care to avoid equal terms.
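The inductive construction of Application 1.2.12 unrolls into a simple loop over the length of the expansion. The Python sketch below is an illustration under our own naming (`egyptian_denominators` is hypothetical); it returns only the denominators, starting with the ones produced for the full length $n$.

```python
from fractions import Fraction


def egyptian_denominators(p, q):
    """Denominators of distinct unit fractions summing to p/q, for 0 < p/q < 1."""
    # Partial quotients [0; a_1, ..., a_n] by Euclid's algorithm.
    a, num, den = [], p, q
    while den:
        quot, rem = divmod(num, den)
        a.append(quot)
        num, den = den, rem
    # Denominators q_0, ..., q_n of the convergents (recurrence (1.2)).
    qs = [1, a[1]]
    for k in range(2, len(a)):
        qs.append(a[k] * qs[-1] + qs[-2])
    denominators = []
    k = len(a) - 1                      # current length of the expansion
    while k >= 1:
        if k % 2 == 1:
            # odd length: p_k/q_k - p_{k-1}/q_{k-1} = 1 / (q_{k-1} q_k)
            denominators.append(qs[k - 1] * qs[k])
            k -= 1
        else:
            # even length: walk through the a_k mediants between
            # p_{k-2}/q_{k-2} and p_k/q_k
            for j in range(1, a[k] + 1):
                lower = qs[k - 2] + (j - 1) * qs[k - 1]
                upper = qs[k - 2] + j * qs[k - 1]
                denominators.append(lower * upper)
            k -= 2
    return denominators


dens = egyptian_denominators(33, 137)
print(dens)                                   # [7398, 1350, 5, 45, 117, 221, 357, 525]
print(sum(Fraction(1, d) for d in dens))      # 33/137
```

For $33/137 = [0; 4, 6, 1, 1, 2]$ this uses $1 + a_2 + a_4 = 8$ unit fractions, matching the bound stated above.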
Application 1.2.13 (Sums of squares). Let $p$ be a prime number; it is
well-known that the multiplicative group $\mathbb{F}_p^*$ of the finite field $\mathbb{F}_p$ is cyclic:
there exists an integer $g$ such that the powers of $g$ modulo $p$ produce all
$p - 1$ different nonzero residue classes. Of course $g^{p-1} \equiv 1 \bmod p$. The equation
$x^2 - 1 = 0$ will have the two solutions $1$ and $-1 \equiv g^{(p-1)/2} \bmod p$ in $\mathbb{F}_p$. The
equation $x^2 + 1 = 0$ will then have either no solution in $\mathbb{F}_p$ (if $p \equiv 3 \bmod 4$),
a single solution (if $p = 2$) or two different solutions $\pm g^{(p-1)/4} \bmod p$ (if
$p \equiv 1 \bmod 4$).
We will use this in an attempt to write $p$ as the sum of two squares;
since this is a trivial problem if $p = 2$, we assume that $p$ is odd. We are
looking for $a, b$ such that $p = a^2 + b^2$. If $p \equiv 3 \bmod 4$ such $a, b$ will not exist
(as otherwise $(a b^{-1})^2 \equiv -1 \bmod p$ contradicts the above).
The following finds a solution $p = a^2 + b^2$ for every prime $p \equiv 1 \bmod 4$.
Suppose we have found $w$ with $w^2 \equiv -1 \bmod p$, then we obtain $a, b$ with
continued fractions, as follows. First adapt $w \in \mathbb{Z}$ if necessary in such a
way that $w^2 \equiv -1 \bmod p$ and $0 < w < p/2$. Now develop $p/w$ as a regular
continued fraction; it turns out that
\[
\frac{p}{w} = [a_0; a_1, \dots, a_m, a_m, \dots, a_1, a_0];
\]
and the solution is obtained from the convergents $p_{m-1}/q_{m-1} = [a_0; a_1, \dots, a_{m-1}]$
and $p_m/q_m = [a_0; a_1, \dots, a_m]$: namely, $a = p_{m-1}$ and $b = p_m$ will do.
For example, when $p = 9973$, we have $2798^2 \equiv -1 \bmod p$ and the
continued fraction of $9973/2798$ is
\[
[3; 1, 1, 3, 2, 1, 1, 2, 3, 1, 1, 3].
\]
We find the convergents
\[
\frac{3}{1}, \ \frac{4}{1}, \ \frac{7}{2}, \ \frac{25}{7}, \ \frac{57}{16}, \ \frac{82}{23}, \ \frac{139}{39}, \ \frac{360}{101}, \ \frac{1219}{342}, \ \frac{1579}{443}, \ \frac{2798}{785}, \ \frac{9973}{2798}.
\]
The numerators of the convergents almost half-way this expansion yield
$57^2 + 82^2 = 9973$.
The reason this algorithm works can be seen when we look at the
extended algorithm of Euclid. In this example the steps are
\begin{align*}
1 \cdot 9973 + 0 \cdot 2798 &= 9973; \\
0 \cdot 9973 + 1 \cdot 2798 &= 2798; \\
1 \cdot 9973 - 3 \cdot 2798 &= 1579; \\
-1 \cdot 9973 + 4 \cdot 2798 &= 1219; \\
2 \cdot 9973 - 7 \cdot 2798 &= 360; \\
-7 \cdot 9973 + 25 \cdot 2798 &= 139; \\
16 \cdot 9973 - 57 \cdot 2798 &= 82; \\
-23 \cdot 9973 + 82 \cdot 2798 &= 57; \\
39 \cdot 9973 - 139 \cdot 2798 &= 25; \\
-101 \cdot 9973 + 360 \cdot 2798 &= 7; \\
342 \cdot 9973 - 1219 \cdot 2798 &= 4; \\
-443 \cdot 9973 + 1579 \cdot 2798 &= 3; \\
785 \cdot 9973 - 2798 \cdot 2798 &= 1.
\end{align*}
The $(n+1)$-st row is obtained from the rows $n$ and $n - 1$ by division with
remainder in the right hand column. The quotients are the partial quotients. The
symmetry between the second and the third columns is caused by the
property $2798^2 \equiv -1 \bmod 9973$: take every row modulo $p = 9973$ and multiply
by 2798. We simply find the lower half of the table from the upper half.
We can stop producing rows as soon as a number less than $\sqrt{p}$ appears in
the right hand column, at which point $a$ and $b$ appear as coefficients in the
second and third columns!
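The stopping rule just described (run the extended Euclidean algorithm on $p$ and $w$ until the remainder drops below $\sqrt{p}$) is easy to try out. The sketch below makes two assumptions that are not in the text: the names `sqrt_of_minus_one` and `two_squares` are hypothetical, and $w$ is found by brute-force search for a quadratic non-residue $g$ (any non-residue works in place of a generator, since $g^{(p-1)/2} \equiv -1$ is all that is used).

```python
def sqrt_of_minus_one(p):
    """Some w with w*w = -1 (mod p) and 0 < w < p/2, for a prime p = 1 (mod 4)."""
    for g in range(2, p):
        if pow(g, (p - 1) // 2, p) == p - 1:      # g is a quadratic non-residue
            w = pow(g, (p - 1) // 4, p)
            return min(w, p - w)
    raise ValueError("no non-residue found; is p an odd prime?")


def two_squares(p, w=None):
    """Return (a, b) with a*a + b*b = p, for a prime p = 1 (mod 4)."""
    if w is None:
        w = sqrt_of_minus_one(p)
    r_prev, r = p, w        # remainders (right hand column of the table)
    t_prev, t = 0, 1        # coefficients of w (third column of the table)
    while r * r >= p:       # stop at the first remainder below sqrt(p)
        quot = r_prev // r
        r_prev, r = r, r_prev - quot * r
        t_prev, t = t, t_prev - quot * t
    return abs(t), r


print(two_squares(9973, 2798))   # (57, 82)
```

At the stopping row the relation $t \cdot w \equiv r \bmod p$ holds, so $t^2 + r^2 \equiv 0 \bmod p$; both squares are smaller than $p$, which forces $t^2 + r^2 = p$.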
1.3 Infinite real continued fractions
Broadening our outlook, we now allow arbitrary real numbers $x$ for continued
fractions: we iterate (1.1) for $x_0 = x \in \mathbb{R}$ and obtain
\[
x_0 = a_0 + \frac{1}{x_1}, \qquad x_1 = a_1 + \frac{1}{x_2},
\]
etcetera. It is clear this will generate an infinite sequence $[a_0; a_1, a_2, \dots]$ of
partial quotients, unless $x_0$ is rational. As before, we get a (possibly infinite)
sequence of convergents
\[
\frac{p_{-1}}{q_{-1}} = \frac{1}{0}, \quad \frac{p_0}{q_0} = \frac{a_0}{1}, \quad \frac{p_1}{q_1}, \quad \frac{p_2}{q_2}, \quad \dots
\]
It is easy to see (using induction) that for $n \ge 0$
\[
x = \frac{x_{n+1} p_n + p_{n-1}}{x_{n+1} q_n + q_{n-1}}; \tag{1.3}
\]
since for $n = 0$ we have
\[
x = \frac{x_1 a_0 + 1}{x_1} = a_0 + \frac{1}{x_1} = x_0,
\]
and it holds that
\begin{align*}
\frac{x_{n+1} p_n + p_{n-1}}{x_{n+1} q_n + q_{n-1}}
&= \frac{x_{n+1} (a_n p_{n-1} + p_{n-2}) + p_{n-1}}{x_{n+1} (a_n q_{n-1} + q_{n-2}) + q_{n-1}}
= \frac{(a_n p_{n-1} + p_{n-2}) + \frac{p_{n-1}}{x_{n+1}}}{(a_n q_{n-1} + q_{n-2}) + \frac{q_{n-1}}{x_{n+1}}} \\
&= \frac{\left(a_n + \frac{1}{x_{n+1}}\right) p_{n-1} + p_{n-2}}{\left(a_n + \frac{1}{x_{n+1}}\right) q_{n-1} + q_{n-2}}
= \frac{x_n p_{n-1} + p_{n-2}}{x_n q_{n-1} + q_{n-2}},
\end{align*}
using the recursive relation for numerators and denominators of convergents
given in (1.2) (the inductive proof works for the infinite expansions just as
well).
As in the proof of Theorem 1.2.8 we see that the convergents provide increasingly
good approximations, alternately from above and below, for $x$, and also
again that consecutive convergents are at a distance $(q_{k-1} q_k)^{-1}$ from each
other. From Theorem 1.2.8 it follows immediately that
\[
x = \lim_{n \to \infty} \frac{p_n}{q_n}.
\]
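Theorem 1.3.1. The convergents $p_k/q_k$ of any irrational number $x$ satisfy:
\[
\frac{1}{2 q_k q_{k+1}} < \frac{1}{q_k (q_k + q_{k+1})} < \left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}} < \frac{1}{q_k^2},
\]
for $k \ge 1$.
Proof By (1.3)
\[
\left| x - \frac{p_k}{q_k} \right|
= \left| \frac{x_{k+1} p_k + p_{k-1}}{x_{k+1} q_k + q_{k-1}} - \frac{p_k}{q_k} \right|
= \left| \frac{(-1)^k}{q_k (q_k x_{k+1} + q_{k-1})} \right|.
\]
Since $a_{k+1} < x_{k+1} < a_{k+1} + 1$ we have $q_{k+1} < q_k x_{k+1} + q_{k-1} < q_{k+1} + q_k$, so
\[
\frac{1}{q_k (q_k + q_{k+1})} < \left| x - \frac{p_k}{q_k} \right| < \frac{1}{q_k q_{k+1}}.
\]
The other inequalities follow from $q_k < q_{k+1}$.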
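Theorem 1.3.2. For two consecutive convergents $p_{k-1}/q_{k-1}$, $p_k/q_k$
of an irrational number $x$ it holds that
\[
\left| x - \frac{p_{k-1}}{q_{k-1}} \right| < \frac{1}{2 q_{k-1}^2}
\quad \text{or} \quad
\left| x - \frac{p_k}{q_k} \right| < \frac{1}{2 q_k^2}.
\]
Proof Since $x$ lies between two consecutive convergents,
\[
\left| \frac{p_k}{q_k} - \frac{p_{k-1}}{q_{k-1}} \right| = \left| x - \frac{p_k}{q_k} \right| + \left| x - \frac{p_{k-1}}{q_{k-1}} \right|.
\]
It would follow from this and the assumption that the statement is false, that
\[
\frac{1}{q_{k-1} q_k} = \left| \frac{p_k}{q_k} - \frac{p_{k-1}}{q_{k-1}} \right| \ge \frac{1}{2 q_k^2} + \frac{1}{2 q_{k-1}^2},
\]
which is equivalent to
\[
(q_k - q_{k-1})^2 \le 0;
\]
this is a contradiction as $q_k > q_{k-1}$ for $k \ge 2$.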
Theorem 1.3.3. If a fraction $p/q$ satisfies $0 < q \le q_k$ for some convergent
$p_k/q_k$ of $x$ then
\[
\frac{p}{q} \ne \frac{p_k}{q_k} \implies \left| x - \frac{p}{q} \right| > \left| x - \frac{p_k}{q_k} \right|.
\]
Proof Without loss of generality we assume that $p$ and $q$ are coprime. If
$q = q_k$ then
\[
\left| \frac{p}{q} - \frac{p_k}{q_k} \right| \ge \frac{1}{q_k}
\quad \text{but} \quad
\left| x - \frac{p_k}{q_k} \right| < \frac{1}{2 q_k}
\quad \text{so} \quad
\left| x - \frac{p_k}{q_k} \right| < \left| x - \frac{p}{q} \right|.
\]
Suppose that $q_{k-1} < q < q_k$; let integers $e, f$ be defined by
\[
e = \pm (q p_{k-1} - p q_{k-1}), \qquad f = \pm (p q_k - q p_k),
\]
then $f \ne 0$ and
\begin{align*}
e p_k + f p_{k-1} &= \pm p (p_{k-1} q_k - p_k q_{k-1}) = \pm p, \\
e q_k + f q_{k-1} &= \pm q (p_{k-1} q_k - p_k q_{k-1}) = \pm q,
\end{align*}
so (changing the sign of $e, f$ if necessary) we may assume there are $+$-signs
on the right hand side. As $e q_k + f q_{k-1} = q < q_k$, the signs of $e$ and $f$
are opposite, like those of $p_k - q_k x$ and $p_{k-1} - q_{k-1} x$. But then the signs of
$e (p_k - q_k x)$ and $f (p_{k-1} - q_{k-1} x)$ will be equal again. Moreover,
\[
p - qx = e (p_k - q_k x) + f (p_{k-1} - q_{k-1} x)
\]
with $|f| \ge 1$, and $e \ne 0$ since $q > q_{k-1}$, and therefore
\[
|p - qx| > |p_{k-1} - q_{k-1} x|.
\]
Now Theorem 1.3.1 implies
\[
|p_{k-1} - q_{k-1} x| > q_{k-1} \cdot \frac{1}{q_{k-1} (q_{k-1} + q_k)} \ge \frac{1}{q_{k+1}} > q_k \left| \frac{p_k}{q_k} - x \right| = |p_k - q_k x|.
\]
So $|p - qx| > |p_k - q_k x|$ and the statement follows upon division by $q$ on the
left and by $q_k > q$ on the right.
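Theorem 1.3.4. If $p/q$ satisfies
\[
\left| x - \frac{p}{q} \right| < \frac{1}{2 q^2}
\]
then
\[
\frac{p}{q} = \frac{p_k}{q_k},
\]
for some convergent $p_k/q_k$ of $x$.
Proof Expand $p/q$ in a finite continued fraction of odd length $n$; then
$p/q = p_n/q_n$ and
\[
\frac{p_n}{q_n} - x = \frac{\theta}{q_n^2}, \qquad \theta < \frac{1}{2}.
\]
We treat the case $\theta > 0$; if $p/q < x$, take an expansion of even length
instead (Remark 1.2.3) and reverse the signs below.
There exists $y > 0$ such that
\[
x = \frac{y p_n + p_{n-1}}{y q_n + q_{n-1}},
\]
and then
\[
\frac{\theta}{q_n^2} = \frac{p_n}{q_n} - x = \frac{p_n q_{n-1} - p_{n-1} q_n}{q_n (y q_n + q_{n-1})} = \frac{(-1)^{n+1}}{q_n (y q_n + q_{n-1})},
\]
so
\[
\theta = \frac{q_n}{y q_n + q_{n-1}}
\]
implying
\[
y = \frac{1}{\theta} - \frac{q_{n-1}}{q_n} > 1.
\]
According to Lemma 1.3.5, below, $p_{n-1}/q_{n-1}$ and $p_n/q_n = p/q$ will then be
consecutive convergents of $x$.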
Lemma 1.3.5. If
\[
x = \frac{p y + r}{q y + s},
\]
with $y \in \mathbb{R}$ and $p, q, r, s \in \mathbb{Z}$ such that
\[
y > 1, \qquad q > s > 0, \qquad ps - qr = \pm 1,
\]
then there exists $n \ge 0$ with
\[
y = x_{n+1}, \qquad \frac{p}{q} = \frac{p_n}{q_n}, \qquad \frac{r}{s} = \frac{p_{n-1}}{q_{n-1}},
\]
where $x = [a_0; a_1, \dots]$, $x_i = [a_i; a_{i+1}, \dots]$ and $p_i/q_i = [a_0; a_1, \dots, a_i]$ for
$i \ge 0$.
Proof Expand $p/q$ in a continued fraction $p/q = [A_0; A_1, \dots, A_n] = v_n/w_n$,
and let $v_{n-1}/w_{n-1} = [A_0; A_1, \dots, A_{n-1}]$. The continued fraction is
chosen (using the two expansions of Remark 1.2.3) in such a way that
$(-1)^{n+1} = v_n w_{n-1} - v_{n-1} w_n = ps - qr = \pm 1$.
Then
\[
v_n w_{n-1} - v_{n-1} w_n = v_n s - w_n r,
\]
hence $v_n (w_{n-1} - s) = w_n (v_{n-1} - r)$ from which (because $v_n, w_n$ are coprime,
$0 < s < w_n$ and $0 < w_{n-1} \le w_n$) it follows that $s = w_{n-1}$ and $r = v_{n-1}$. But
then
\[
x = \frac{v_n y + v_{n-1}}{w_n y + w_{n-1}} = [A_0; A_1, \dots, A_n, y]
\]
(compare (1.3)) and so $[A_0; A_1, \dots, A_n]$ is the initial part of the continued
fraction of $x$: $[A_0; A_1, \dots, A_n] = [a_0; a_1, \dots, a_n]$ and
$y = [A_{n+1}; A_{n+2}, \dots] = [a_{n+1}; a_{n+2}, \dots]$ the tail.
Definition 1.3.6. The regular continued fraction operator $T : [0, 1) \to [0, 1)$
is defined by
\[
Tx := \frac{1}{x} - \left\lfloor \frac{1}{x} \right\rfloor, \quad x \ne 0; \qquad T0 := 0.
\]
[Figure 1.1: The continued fraction map T, drawn on the unit interval with ticks at 1/2, 1/3, 1/4, 1/5, 1/6.]
The map $T$ is illustrated in Figure 1.1.
Now let $x \in \mathbb{R} \setminus \mathbb{Q}$, put $a_0 = \lfloor x \rfloor$, and let $T_n$ denote the $n$-th iterate of $T$
applied to $x - a_0$. Setting
\[
a_n = a_n(x) := \left\lfloor \frac{1}{T_{n-1}} \right\rfloor, \quad n \ge 1,
\]
one has
\begin{align*}
x &= a_0 + \cfrac{1}{a_1 + T_1} = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + T_2}} = \cdots \\
&= a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots + \cfrac{1}{a_n + T_n}}}
= [a_0; a_1, a_2, \dots, a_{n-1}, a_n + T_n], \quad n \ge 1. \tag{1.4}
\end{align*}
1.4 Basic properties and matrices
In this section we will derive a number of basic properties of continued frac-
tions using 22 matrices. In fact, this matrix representation establishes the
1.4. BASIC PROPERTIES AND MATRICES 25
connection between continued fractions and (a part of) algebraic geometry,
a connection that has been beautifully explained by C. Series in [82] and
[83]. Let
A =
_
a b
c d
_
SL
2
(Z) ,
i.e., A has integer entries a, b, c and d, and det(A) 1, +1. The letters
SL in SL
2
(Z) stand for special linear. Now dene a map A : R
R by
A(x) :=
ax +b
cx +d
, x R .
Such a map is also known as a Mobius transformation. Notice that we
use the same notation both for the matrix A and for its associated Mobius
transformation.
Let x R be an irrational number with continued fraction expansion
x = [ a
0
; a
1
, . . . , a
n
, . . .]. Dene for n 1 matrices A
n
and M
n
by
A
0
:=
_
1 a
0
0 1
_
, A
n
:=
_
0 1
1 a
n
_
, M
n
:= A
0
A
1
A
n
, n 1.
(1.5)
Writing
M
n
=
_
r
n
p
n
s
n
q
n
_
, n 0,
it is easy to show that (p
n
, q
n
) = 1. Using M
n
= M
n1
A
n
one gets
r
n
= p
n1
, s
n
= q
n1
,
p
n1
q
n
p
n
q
n1
= (1)
n
, n 1, and
p
n
q
n
= [ a
0
; a
1
, . . . , a
n
], n 1.
and furthermore, that the sequences (p
n
)
n1
and (q
n
)
n1
satisfy the fol-
lowing recurrence relations
p
1
:= 1; p
0
:= a
0
; p
n
= a
n
p
n1
+p
n2
, n 1,
q
1
:= 0; q
0
:= 1; q
n
= a
n
q
n1
+q
n2
, n 1.
(1.6)
Finally, using (1.6) one sees that p
n
(x) = q
n1
(Tx) for all n 0, where
T
n
= T
n
x.
From the recurrence relation for the q
n
s it is possible to derive
q
n1
q
n
= [ 0; a
n
, . . . , a
1
] .
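Defining
\[
A_n^* := \begin{pmatrix} 0 & 1 \\ 1 & a_n + T_n \end{pmatrix}, \quad \text{for } n \ge 1,
\]
we see that $x = (M_{n-1} A_n^*)(0)$; using
\[
M_{n-1} = \begin{pmatrix} p_{n-2} & p_{n-1} \\ q_{n-2} & q_{n-1} \end{pmatrix}, \quad \text{for } n \ge 1,
\]
and the recurrence relations for $(p_n)_{n \ge -1}$ and $(q_n)_{n \ge -1}$ one obtains
\[
x = \frac{p_n + T_n p_{n-1}}{q_n + T_n q_{n-1}}, \quad \text{for } n \ge 1,
\]
i.e., $x = M_n(T^n x)$. Finally, use the fact that $p_{n-1} q_n - p_n q_{n-1} = (-1)^n$, so
\[
x - \frac{p_n}{q_n} = \frac{(-1)^n T_n}{q_n (q_n + T_n q_{n-1})}, \quad \text{for } n \ge 1. \tag{1.7}
\]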
Since $T_n \in [0, 1)$ we have that
\[
\left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n^2}, \quad \text{for } n \ge 1. \tag{1.8}
\]
The sequence $(q_n)_{n \ge 0}$ is a monotone increasing sequence of positive integers (which is the Fibonacci sequence $(\mathcal{F}_n)_{n \ge 1}$, given by
\[
1, 1, 2, 3, 5, 8, 13, 21, \ldots
\]
if all the $a_i$ are equal to 1). Now (1.8) yields that $\lim_{n \to \infty} p_n / q_n = x$.
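The recurrences (1.6) are easy to check numerically. The following Python sketch is not part of the original notes: the function name `convergents` and the test inputs are assumptions chosen for illustration. It builds $p_n$ and $q_n$ from a list of partial quotients, and checks the identity $p_{n-1} q_n - p_n q_{n-1} = (-1)^n$ and the bound (1.8) for $\sqrt{2} = [1; 2, 2, 2, \ldots]$.

```python
from math import sqrt


def convergents(a):
    """Return lists (p, q) with p[n]/q[n] = [a[0]; a[1], ..., a[n]], using (1.6)."""
    p_prev, p_cur = 1, a[0]   # p_{-1}, p_0
    q_prev, q_cur = 0, 1      # q_{-1}, q_0
    p, q = [p_cur], [q_cur]
    for a_n in a[1:]:
        p_prev, p_cur = p_cur, a_n * p_cur + p_prev
        q_prev, q_cur = q_cur, a_n * q_cur + q_prev
        p.append(p_cur)
        q.append(q_cur)
    return p, q


# sqrt(2) = [1; 2, 2, 2, ...]; only a few terms, so floating point is adequate.
p, q = convergents([1] + [2] * 8)
for n in range(1, len(p)):
    assert p[n - 1] * q[n] - p[n] * q[n - 1] == (-1) ** n
    assert abs(sqrt(2) - p[n] / q[n]) < 1 / q[n] ** 2   # bound (1.8)

# All partial quotients equal to 1: the denominators are Fibonacci numbers.
_, fib = convergents([1] * 8)
print(fib)  # [1, 1, 2, 3, 5, 8, 13, 21]
```

The check of (1.8) uses floating point arithmetic and is therefore only meaningful for small $n$; an exact check would need integer arithmetic.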
1.5 Periodic regular continued fractions
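We now turn to the simplest kind of infinite continued fractions, namely the periodic ones. It will turn out that these correspond exactly to the quadratic irrational numbers, but before we prove that, we first give an example.
Example 1.5.1 (square root). We determine the continued fraction of $\sqrt{77}$. It is important to note that for this we only need to know that $8^2 < 77 < 9^2$, and hence $8 < \sqrt{77} < 9$. We have
\[
x_0 = x = \sqrt{77}, \quad \text{so } a_0 = \lfloor x_0 \rfloor = 8.
\]
Then
\[
x_1 = \frac{1}{x_0 - a_0} = \frac{1}{\sqrt{77} - 8} = \frac{\sqrt{77} + 8}{77 - 64} = \frac{\sqrt{77} + 8}{13}, \quad \text{so } a_1 = \lfloor x_1 \rfloor = 1.
\]
Next
\[
x_2 = \frac{1}{x_1 - a_1} = \frac{1}{\frac{\sqrt{77} - 5}{13}} = \frac{\sqrt{77} + 5}{\frac{77 - 25}{13}} = \frac{\sqrt{77} + 5}{4}, \quad \text{so } a_2 = \lfloor x_2 \rfloor = 3.
\]
After that
\[
x_3 = \frac{1}{x_2 - a_2} = \frac{1}{\frac{\sqrt{77} - 7}{4}} = \frac{\sqrt{77} + 7}{\frac{77 - 49}{4}} = \frac{\sqrt{77} + 7}{7}, \quad \text{so } a_3 = \lfloor x_3 \rfloor = 2,
\]
and
\[
x_4 = \frac{1}{x_3 - a_3} = \frac{1}{\frac{\sqrt{77} - 7}{7}} = \frac{\sqrt{77} + 7}{\frac{77 - 49}{7}} = \frac{\sqrt{77} + 7}{4}, \quad \text{so } a_4 = \lfloor x_4 \rfloor = 3,
\]
from which it follows that
\[
x_5 = \frac{1}{x_4 - a_4} = \frac{1}{\frac{\sqrt{77} - 5}{4}} = \frac{\sqrt{77} + 5}{\frac{77 - 25}{4}} = \frac{\sqrt{77} + 5}{13}, \quad \text{so } a_5 = \lfloor x_5 \rfloor = 1.
\]
Finally,
\[
x_6 = \frac{1}{x_5 - a_5} = \frac{1}{\frac{\sqrt{77} - 8}{13}} = \frac{\sqrt{77} + 8}{\frac{77 - 64}{13}} = \frac{\sqrt{77} + 8}{1}, \quad \text{so } a_6 = \lfloor x_6 \rfloor = 16,
\]
so that
\[
x_7 = \frac{1}{x_6 - a_6} = \frac{1}{\sqrt{77} - 8} = x_1,
\]
and the continued fraction repeats from here on:
\[
x = [\, 8; \overline{1, 3, 2, 3, 1, 16} \,],
\]
where the overline indicates infinite repetition of that block of partial quotients.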
Definition 1.5.2. An infinite continued fraction $[\, a_0; a_1, a_2, \ldots ]$ is called periodic with period length $m$ if there exists an $N \ge 0$ such that $a_n = a_{n+m}$ for all $n \ge N$, and there is no smaller $m \ge 1$ with this property. The partial quotients $a_0, \ldots, a_{N-1}$ then form the preperiod, and the (endlessly repeating) $a_N, \ldots, a_{N+m-1}$ form the period. A continued fraction is called purely periodic if it is periodic and $N = 0$ can be taken, that is, there is no preperiod.
To make all identities in the proofs below also hold when $N = 0$, it is convenient to define the (numerator and denominator of the) convergent with index $-2$ by $p_{-2} = 0$ and $q_{-2} = 1$. The usual recursions (such as $p_k = a_k p_{k-1} + p_{k-2}$) then remain valid for $k = 0$.
Theorem 1.5.3 (Euler). An irrational number $x$ with a periodic continued fraction is an element of $\mathbb{Q}(\sqrt{d})$ for some $d \in \mathbb{Z}_{\ge 1}$, where $d$ is not a square.
Proof. Suppose that $x = [\, a_0; a_1, \ldots, a_{N-1}, \overline{a_N, \ldots, a_{N+m-1}} \,]$, and use that $x_N = x_{N+m}$ together with relation 1.3:
\[
x = \frac{x_N p_{N-1} + p_{N-2}}{x_N q_{N-1} + q_{N-2}} = \frac{x_{N+m} p_{N+m-1} + p_{N+m-2}}{x_{N+m} q_{N+m-1} + q_{N+m-2}},
\]
then
\[
\frac{-x q_{N-2} + p_{N-2}}{x q_{N-1} - p_{N-1}} = x_N = x_{N+m} = \frac{-x q_{N+m-2} + p_{N+m-2}}{x q_{N+m-1} - p_{N+m-1}},
\]
and therefore
\[
\begin{aligned}
0 = {} & (q_{N-2} q_{N+m-1} - q_{N-1} q_{N+m-2})\, x^2 \\
& + (p_{N-1} q_{N+m-2} - p_{N-2} q_{N+m-1} + p_{N+m-2} q_{N-1} - p_{N+m-1} q_{N-2})\, x \\
& + p_{N-2} p_{N+m-1} - p_{N-1} p_{N+m-2}.
\end{aligned} \tag{1.9}
\]
This is a quadratic equation that is not degenerate; indeed, the leading coefficient can only be zero when
\[
q_{N-2} q_{N+m-1} = q_{N-1} q_{N+m-2},
\]
but since $q_{N+m-2}$ and $q_{N+m-1}$ are coprime, this is only possible if $q_{N+m-2}$ is a divisor of $q_{N-2}$, which contradicts $q_{N+m-2} > q_{N-2}$.
Definition 1.5.4. If $x \in \mathbb{Q}(\sqrt{d})$ then there exist $a, b \in \mathbb{Q}$ such that $x = a + b\sqrt{d}$, and we call the element $\bar{x} = a - b\sqrt{d}$ the conjugate of $x$.
Since it is easy to see that the conjugate of the sum (resp. the product) of two elements of $\mathbb{Q}(\sqrt{d})$ is the sum (resp. the product) of the conjugates, and that the conjugate of an element of $\mathbb{Q}$ is the element itself, it follows that $\bar{x}$ satisfies the same quadratic equation over $\mathbb{Q}$ as $x$:
\[
a x^2 + b x + c = 0 \iff a \bar{x}^2 + b \bar{x} + c = 0,
\]
for $a, b, c \in \mathbb{Q}$.
When $x$ is a quadratic irrational, we will from now on want to choose $P, Q, d$ such that $x = (P + \sqrt{d})/Q$, with $P, Q, d \in \mathbb{Z}$, $d > 0$ not a square, and $Q$ a divisor of $P^2 - d$. This is always possible, because $a x^2 + b x + c = 0$ for certain integers $a, b, c$, and by the quadratic formula we can then take $P = -b$, $Q = 2a$ and $d = b^2 - 4ac$ (after multiplying the equation by $-1$ if necessary).
An element $x$ of $\mathbb{Q}(\sqrt{d})$ is called reduced if $x > 1$ and $-1 < \bar{x} < 0$.
Theorem 1.5.5 (Lagrange). If $x$ is irrational and $x \in \mathbb{Q}(\sqrt{d})$ with $d \in \mathbb{Z}_{\ge 1}$, then the continued fraction of $x$ is periodic.
Proof. Write $x = x_0 = (P_0 + \sqrt{d})/Q_0$, with $Q_0 \mid P_0^2 - d$. With $a_0 = \lfloor x_0 \rfloor$ we get
\[
x_1 = \frac{1}{x_0 - a_0}
    = \frac{1}{\frac{(P_0 - a_0 Q_0) + \sqrt{d}}{Q_0}}
    = \frac{(P_0 - a_0 Q_0) - \sqrt{d}}{\frac{(P_0 - a_0 Q_0)^2 - d}{Q_0}}
    = \frac{(a_0 Q_0 - P_0) + \sqrt{d}}{\frac{d - P_0^2}{Q_0} + 2 a_0 P_0 - a_0^2 Q_0},
\]
which is equal to
\[
\frac{P_1 + \sqrt{d}}{Q_1},
\]
if we write
\[
P_1 = a_0 Q_0 - P_0, \quad
Q_1 = \frac{d - P_0^2}{Q_0} + 2 a_0 P_0 - a_0^2 Q_0 = \frac{d - P_1^2}{Q_0};
\]
this is again of our standard form, since clearly $Q_1 \mid d - P_1^2$. Hence, by induction, for $k \ge 1$ we can write $x_k = (P_k + \sqrt{d})/Q_k$, where
\[
P_k = a_{k-1} Q_{k-1} - P_{k-1}, \quad
Q_k = \frac{d - P_{k-1}^2}{Q_{k-1}} + 2 a_{k-1} P_{k-1} - a_{k-1}^2 Q_{k-1} = \frac{d - P_k^2}{Q_{k-1}},
\]
with $Q_k \mid d - P_k^2$. We derive a number of inequalities, from which it follows first of all that there are only finitely many possibilities for $(P_k, Q_k)$, but which we will also use later on. Conjugating the equality
\[
x = x_0 = \frac{x_k p_{k-1} + p_{k-2}}{x_k q_{k-1} + q_{k-2}},
\]
we get for the conjugate
\[
\bar{x}_0 = \frac{\bar{x}_k p_{k-1} + p_{k-2}}{\bar{x}_k q_{k-1} + q_{k-2}},
\]
so that
\[
\bar{x}_k = -\frac{\bar{x}_0 q_{k-2} - p_{k-2}}{\bar{x}_0 q_{k-1} - p_{k-1}}
          = -\frac{q_{k-2}}{q_{k-1}} \left( \frac{\bar{x}_0 - \frac{p_{k-2}}{q_{k-2}}}{\bar{x}_0 - \frac{p_{k-1}}{q_{k-1}}} \right),
\]
but since $p_k / q_k$ converges to $x_0 \ne \bar{x}_0$, and $q_{k-2} < q_{k-1}$, we have $-1 < \bar{x}_k < 0$ for $k$ large enough, while $x_k > 1$ (so $x_k$ is reduced for $k$ large enough). But then
\[
\frac{2\sqrt{d}}{Q_k} = x_k - \bar{x}_k > 0, \quad \text{so } Q_k > 0,
\]
and
\[
\frac{2 P_k}{Q_k} = x_k + \bar{x}_k > 0, \quad \text{so } P_k > 0.
\]
Moreover,
\[
-1 < \bar{x}_k = \frac{P_k - \sqrt{d}}{Q_k} < 0,
\]
so that
\[
P_k < \sqrt{d} \quad \text{and} \quad \sqrt{d} - P_k < Q_k,
\]
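and
\[
\frac{P_k + \sqrt{d}}{Q_k} = x_k > 1, \quad \text{so } Q_k < P_k + \sqrt{d},
\]
while $x_k > \sqrt{d}$ is equivalent to $P_k > \sqrt{d}\,(Q_k - 1)$, so that (because $P_k < \sqrt{d}$) the inequality $x_k > \sqrt{d}$ occurs precisely when $Q_k = 1$. In particular,
\[
a_k < \sqrt{d} \text{ unless } Q_k = 1, \text{ and then } a_k < 2\sqrt{d}. \tag{1.11}
\]
Since $0 < P_k < \sqrt{d}$ and $0 < Q_k < 2\sqrt{d}$ for all sufficiently large $k$, only finitely many pairs $(P_k, Q_k)$ can occur, so some $x_k$ must repeat and the continued fraction of $x$ is periodic.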
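The recursion for $(P_k, Q_k)$ from the proof gives an exact, integer-only way to compute the continued fraction of $\sqrt{d}$. The sketch below is an illustration and not part of the original notes; the function name `sqrt_continued_fraction` is an assumption. It stops as soon as $Q_k = 1$, relying on the criterion $Q_k = 1 \iff m \mid k$ from Remark 1.5.8, and it uses $\lfloor (P + \sqrt{d})/Q \rfloor = \lfloor (P + \lfloor \sqrt{d} \rfloor)/Q \rfloor$ for positive integers $Q$.

```python
from math import isqrt


def sqrt_continued_fraction(d):
    """Return (a0, period) such that sqrt(d) = [a0; period, period, ...]."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    P, Q, a = 0, 1, a0            # x_0 = (P_0 + sqrt(d)) / Q_0 with P_0 = 0, Q_0 = 1
    period = []
    while True:
        P = a * Q - P             # P_{k+1} = a_k Q_k - P_k
        Q = (d - P * P) // Q      # Q_{k+1} = (d - P_{k+1}^2) / Q_k, an exact division
        a = (P + a0) // Q         # a_{k+1} = floor(x_{k+1})
        period.append(a)
        if Q == 1:                # end of the first period (Remark 1.5.8)
            return a0, period


print(sqrt_continued_fraction(77))  # (8, [1, 3, 2, 3, 1, 16])
```

The output for $d = 77$ agrees with Example 1.5.1, and the last partial quotient of the period equals $2 a_0$, as in Theorem 1.5.7.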
We mention two more theorems on periodic continued fractions without giving the proof.
Theorem 1.5.6 (Galois). A quadratic irrational number $x$ has a purely periodic continued fraction if and only if $x$ is reduced. Moreover, in that case $-1/\bar{x}$ has the reversed period:
\[
x = [\, \overline{a_0; a_1, \ldots, a_{m-1}} \,] \quad \text{and} \quad
-\frac{1}{\bar{x}} = [\, \overline{a_{m-1}; a_{m-2}, \ldots, a_0} \,].
\]
Theorem 1.5.7. The regular continued fraction of $\sqrt{d}$, for $d \in \mathbb{Z}_{\ge 1}$ not a square, is of the form
\[
[\, a_0; \overline{a_1, a_2, \ldots, a_2, a_1, 2a_0} \,].
\]
Remark 1.5.8. From what we have seen already about $P_k$ and $Q_k$ it follows immediately that the period length $m$ of the regular continued fraction of $\sqrt{d}$ (and hence of any quadratic irrational in $\mathbb{Q}(\sqrt{d})$) is bounded by $2d$; a sharp upper bound is of the order $\sqrt{d} \log d$.
It is also not hard to prove that
\[
Q_k = 1 \iff m \mid k.
\]
Application 1.5.9 (Pell). The Pell equation is the equation $x^2 - d y^2 = 1$, with $d \in \mathbb{Z}_{\ge 2}$ not a square; one looks for non-negative integer solutions $x, y$. We also include the equation $x^2 - d y^2 = -1$ directly in our considerations. Since
\[
x^2 - d y^2 = (x - y\sqrt{d})(x + y\sqrt{d}) = (x - y\sqrt{d})^2 + 2 y \sqrt{d}\,(x - y\sqrt{d}),
\]
we see that solutions $(x, y)$ of these equations satisfy
\[
0 < \left| \frac{x}{y} - \sqrt{d} \right| < \frac{1}{2 y^2 \sqrt{d}} < \frac{1}{2 y^2}.
\]
By Theorem 1.3.4, $x/y$ must then be a convergent of $\sqrt{d}$. So all solutions of the equations $x^2 - d y^2 = \pm 1$ can be found among the convergents of $\sqrt{d}$.
To determine all solutions we again use the relation
\[
x = \frac{x_{n+1} p_n + p_{n-1}}{x_{n+1} q_n + q_{n-1}},
\]
with $x = \sqrt{d}$ and $x_{n+1} = (P_{n+1} + \sqrt{d})/Q_{n+1}$. From this it follows that
\[
(q_{n-1} Q_{n+1} + q_n P_{n+1} - p_n) \sqrt{d} = p_n P_{n+1} + p_{n-1} Q_{n+1} - q_n d,
\]
and this is only possible if both sides are zero, so
\[
p_n = q_{n-1} Q_{n+1} + q_n P_{n+1}, \quad \text{and} \quad q_n d = p_{n-1} Q_{n+1} + p_n P_{n+1}.
\]
But then
\[
\begin{aligned}
p_n^2 - q_n^2 d &= p_n (q_{n-1} Q_{n+1} + q_n P_{n+1}) - q_n (p_{n-1} Q_{n+1} + p_n P_{n+1}) \\
&= (p_n q_{n-1} - p_{n-1} q_n)\, Q_{n+1} = (-1)^{n+1} Q_{n+1}.
\end{aligned}
\]
By the preceding remark, the latter can only be $\pm 1$ if $n + 1$ is a multiple of the period length $m$ of the continued fraction of $\sqrt{d}$. If the period length $m$ is even, then for $k = 0, 1, 2, 3, \ldots$ the numerators and denominators $(p_{km-1}, q_{km-1})$ of the convergents of $\sqrt{d}$ are precisely all solutions of the equation $x^2 - d y^2 = 1$, and there are no solutions of the equation with $-1$; if the period length is odd, then both equations are solvable, and $(p_{km-1}, q_{km-1})$ for $k = 1, 3, 5, \ldots$ gives all solutions of $x^2 - d y^2 = -1$, while $(p_{km-1}, q_{km-1})$ for $k = 0, 2, 4, \ldots$ gives all solutions of $x^2 - d y^2 = 1$.
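The description above translates directly into a procedure: expand $\sqrt{d}$, keep track of the convergents, and inspect $p_n^2 - d q_n^2$ whenever $Q_{n+1} = 1$. The following Python sketch is an illustration only; the function name `pell_solutions` and its return convention are assumptions, not taken from the notes. It returns the smallest solution with $y > 0$ of $x^2 - d y^2 = 1$, together with the smallest solution of $x^2 - d y^2 = -1$ if one exists.

```python
from math import isqrt


def pell_solutions(d):
    """Return ((x, y), minus) with x^2 - d y^2 = 1 minimal (y > 0), and
    minus = the smallest solution of x^2 - d y^2 = -1, or None if there is none."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    P, Q, a = 0, 1, a0
    p_prev, p = 1, a0             # p_{-1}, p_0
    q_prev, q = 0, 1              # q_{-1}, q_0
    minus_one = None
    while True:
        # Advance from x_n to x_{n+1}; (p, q) still holds (p_n, q_n).
        P = a * Q - P
        Q = (d - P * P) // Q
        a = (P + a0) // Q
        if Q == 1:                # n + 1 is a multiple of the period length
            if p * p - d * q * q == 1:
                return (p, q), minus_one
            minus_one = (p, q)    # here p^2 - d q^2 = -1
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev


print(pell_solutions(7))   # ((8, 3), None)
print(pell_solutions(13))  # ((649, 180), (18, 5))
```

For $d = 7$ the period length is 4 (even), so the equation with $-1$ has no solution; for $d = 13$ the period length is 5 (odd), and both equations are solvable.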
Application 1.5.10 (factorisation). A number of the best methods for factoring a given number $N$ are based on the idea that when you have two integers $x$ and $y$ with $0 < x < y < N$ such that $x^2 \equiv y^2$ modulo $N$, then $\gcd(N, x - y)$ yields a factor of $N$, because $N$ is then a divisor of $x^2 - y^2 = (x - y)(x + y)$. That factor may be trivial, when $x \equiv \pm y \bmod N$ (a case that must occur when $N$ is prime), but if $N$ has at least two different prime divisors, it may happen that some prime factors divide $x + y$ and others divide $x - y$, so that we detect a non-trivial factor.
The big problem, of course, is to find $x$ and $y$. Fermat tried to solve $x^2 - y^2 = N$ by searching systematically, starting at $x = \lceil \sqrt{N} \rceil$ and increasing $x$ by 1 each time, for a square of the form $x^2 - N$. This works nicely when $N$ is the product of two primes that are very close together, but it is hopeless when the two are far apart, for example $p \approx \sqrt[3]{N}$.
A successful adaptation (widely used until well into the 1980s) makes use of continued fractions, as follows. We use the identity $p_n^2 - N q_n^2 = (-1)^{n+1} Q_{n+1}$ found in the previous example, for the convergents $p_n / q_n$ of $\sqrt{N}$, with $x_n = (P_n + \sqrt{N})/Q_n$ for $n \ge 0$, and the inequality $Q_{n+1} < 2\sqrt{N}$. Its usefulness lies in the congruence $p_n^2 \equiv (-1)^{n+1} Q_{n+1} \bmod N$. Getting a square on the right-hand side as well requires rather too much luck, but we can try to combine congruences into a square, and for that we want $Q_{n+1}$ to be small so that it can be factored. The idea is as follows: make a list of small primes (starting with $-1, 2, 3, 5, \ldots$, but see below), carry out a step in the continued fraction expansion of $\sqrt{N}$, and check (via division with remainder by the primes) whether $Q_{n+1}$ can be factored using only the primes from the list. If so, determine the exponents $k_i$ in
\[
(-1)^{n+1} Q_{n+1} = (-1)^{k_0} p_1^{k_1} \cdots p_r^{k_r},
\]
and repeat this process. The goal is to build a matrix whose rows are the exponents found, taken modulo 2, and to find a dependency between the rows of this matrix: such a dependency modulo 2 means that there is a product of the corresponding $Q$'s for which the exponents in the factorisation at all $p_i$ and at $-1$ are even. In other words, this product is a square! Concerning the factor base (the list of primes up to a bound $B$ to be chosen) it is useful to remark that one of course first checks whether $N$ is divisible by one of these small $p_i$, but also that only about half of the primes are of use. Indeed, a prime $p$ that divides some $Q_{n+1}$ divides $p_n^2 - N q_n^2$, so $N \equiv (p_n / q_n)^2 \bmod p$. In other words, $N$ must be a quadratic residue modulo $p$, a property that is easy to verify. It may happen that the period length of the continued fraction of $\sqrt{N}$ is small, in which case only few different values $Q_{n+1}$ occur. One way to remedy this is to look at the continued fraction of $\sqrt{kN}$ for small multiples of $N$. Not only can the period become longer this way, but $k$ can also be selected such that $kN$ is a quadratic residue for many of the primes up to $B$.
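The congruence $p_n^2 \equiv (-1)^{n+1} Q_{n+1} \bmod N$ can be generated with integer arithmetic only. The Python sketch below is a deliberately simplified illustration and not the full method described above: it has no factor base and no linear algebra modulo 2, and only looks for the lucky case in which a single $Q_{n+1}$ with $n + 1$ even is already a perfect square. The names `cf_relations` and `lucky_factor`, the step limit, and the example $N = 8051$ are assumptions made for this illustration.

```python
from math import gcd, isqrt


def cf_relations(N, steps):
    """Yield (n, p_n mod N, Q_{n+1}) from the continued fraction of sqrt(N)."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        raise ValueError("N must not be a perfect square")
    P, Q, a = 0, 1, a0
    p_prev, p = 1, a0 % N         # p_{-1}, p_0 (reduced modulo N)
    for n in range(steps):
        P = a * Q - P             # P_{n+1}
        Q = (N - P * P) // Q      # Q_{n+1}
        a = (a0 + P) // Q         # a_{n+1}
        yield n, p, Q
        p_prev, p = p, (a * p + p_prev) % N


def lucky_factor(N, steps=1000):
    """Return a non-trivial factor of N if some Q_{n+1} (n + 1 even) is a square."""
    for n, p, Q in cf_relations(N, steps):
        # The congruence p_n^2 = (-1)^(n+1) Q_{n+1} (mod N) from the text.
        assert (p * p - (-1) ** (n + 1) * Q) % N == 0
        s = isqrt(Q)
        if (n + 1) % 2 == 0 and s * s == Q:
            g = gcd(p - s, N)     # p^2 = s^2 (mod N)
            if 1 < g < N:
                return g
    return None


print(lucky_factor(8051))  # 83
```

For $N = 8051 = 83 \cdot 97$ the second step already gives $Q_2 = 49$ and $p_1 = 90$, so $90^2 \equiv 7^2 \bmod 8051$ and $\gcd(90 - 7, 8051) = 83$. In general one cannot count on such luck, which is why the method in the text combines several relations.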
Chapter 2
Planetaria
Maaike Zwart
An application of continued fractions in engineering
Astronomy is possibly the oldest science around. Ever since the dawn of mankind people have wondered about the sun, moon and stars, about what they are, what they are made of. Various tales and myths are built upon this curiosity. It has encouraged many a man to make models of it reflecting their view. Some are built on myths, like the turtle carrying the earth, others purely on observations, and some on both, like Kepler's model of the Platonic Universe.
Figure 2.1: The world carried by elephants and a turtle (Hindu Myth) and
the Mysterium Cosmographicum by Kepler.
With the invention of the telescope in the 1600s, the accuracy of the scientific models rapidly increased. More planets were discovered, some of them carrying their own moons with them. In 1682, Huygens built a table top size planetarium for the Académie Royale des Sciences in Paris. His design uses state of the art engineering as well as some interesting mathematics. A century later, Eise Eisinga turned his house into a museum by converting the ceiling of his living room into a planetarium. Both designs are breathtaking. I will give a brief description of them, after which I will discuss the main mathematical issue in designing such a model of our solar system: approximating the periods of the planets around the sun.
2.1 Huygens's Planetarium
Christiaan Huygens, born in 1629, was one of the most brilliant scientists of his day. He contributed greatly to physics, mathematics and astronomy. In 1682 he designed a planetarium intended for the Académie Royale des Sciences in Paris, of which he had been head until the previous year. The planetarium was crafted by Johannes van Ceulen, a craftsman from The Hague. It contains the planets Mercury, Venus, Earth, Mars, Jupiter and Saturn. Apart from showing the motions of the planets, the design indicates the date and time, which constellations are in the sky, and many things more, far too many to describe here. There are various descriptions of this planetarium; some are translations of Huygens's own description, such as [33]. To fully appreciate the beauty of this planetarium, I think it's best to take one of those descriptions to the Boerhaave museum in Leiden, where the planetarium is on display, and watch the real thing while reading the description. As I obviously cannot do this for you in these notes, I'll settle for an ancient wisdom: a picture says more than a thousand words, see figures 2.2 and 2.3.
2.2 Eisinga's Planetarium
A century later, Eise Eisinga invested seven years of his spare time to construct a magnificent planetarium in the ceiling of his home. While Huygens was a professional scientist (as far as such a profession existed at that time), Eisinga was just an interested layman. His profession was woolcomber, like his father. To relax his mind, he spent his evening hours studying physics and mathematics.
Figure 2.2: Huygens's planetarium, front view.
Figure 2.3: Huygens's planetarium, inside view and schematic drawing.
On the 8th of May, 1774, the Moon and the planets Jupiter, Mars, Mercury and Venus all aligned in the constellation Aries. This extraordinary phenomenon gave rise to end-of-the-world myths, much like the expiring Maya calendar on 21-12-12. Eise Eisinga regretted the ignorance of his fellow villagers. It was then that he decided to build his own planetarium. His planetarium should accurately show the positions of the heavenly bodies at any given time. He did a magnificent job. Although his planetarium is a little less accurate than the one Huygens made [87], it is truly breathtaking. The planets move along the ceiling, following circular trails. In the attic above this ceiling, Eisinga built the whole machinery. Everything is handmade out of wood and steel. Apart from the planets (again, Mercury, Venus, Earth, Mars, Jupiter and Saturn) there are indescribably many displays and other clockworks giving information about constellations, date, lunar phases, time of sunrise and sunset, etc.
Again, the only way to fully appreciate this work is to visit it, which I haven't been able to do yet. But these pictures should give you a fair impression.
2.3 Mathematical Issues
Apart from technical difficulties, Huygens and Eisinga faced a mathematical challenge. The planetaria are driven by a single power source, for example a spring (remember, there was no electricity back then), driving one wheel. Interlocking gears are then used to give each planet its own period around the sun. The idea is simple: if the main wheel makes a full revolution in $x$ minutes, and for example Mercury should revolve around the sun in $y$ minutes, then we should mount a gearwheel with $x$ teeth on the axis of the main wheel, and a gearwheel with $y$ teeth on the axis connected to Mercury.
However, $x$ and $y$ are generally not whole numbers. This issue could easily be solved by taking the nearest integers to $x$ and $y$, but technical issues still get in the way. A gearwheel with a single tooth is not functional, and one with a million teeth cannot be manufactured. Say that a gearwheel must have a number of teeth between 7 and 200 (see footnote 1). The question is to find two such numbers $7 \le n, m \le 200$ such that $\frac{n}{m}$ approximates $\frac{x}{y}$ as closely as possible. Then the two gears, one with $n$ teeth and one with $m$ teeth, will approximate Mercury's period as well as possible. As Huygens formulates it:
    The whole question thus comes down to this: when two large numbers are given that stand in a certain ratio to each other, to find other, smaller ones for the teeth of the wheels, which are not unsuitable because of their size and which yield the same ratio with such great accuracy that no other smaller numbers give a better approximation. [33]
Footnote 1: [34] refers to a book from 1947 by Merrit, wherein 127 is given as upper limit. I thought 200 would be more reasonable.
Figure 2.4: Eisinga's planetarium, planet discs and constellations clocks.
That is (see footnote 2):
    Given an $r \in \mathbb{Q}$, find natural numbers $n$ and $m$ such that $7 \le n, m \le 200$ and $\frac{n}{m} \approx r$, and such that there are no other $7 \le n', m' \le 200$ with $|r - \frac{n'}{m'}| < |r - \frac{n}{m}|$.
Footnote 2: We may assume $x$ and $y$ are both rationals, so that the resulting fraction ($r$) is again a rational number, because both $x$ and $y$ are measured quantities, and measurements always yield rational quantities.
I will discuss two methods for finding the required $n$ and $m$. Then, I will discuss a method using more than two gears to make an even better approximation.
2.3.1 Take some Educated Guesses
This may seem too simple a solution, far from efficient and not usable at all. But a little common sense and a bit of trial and error can get you a long way. Nowadays, the use of a computer has made this method the most favourable of all. During my lecture, David actually wrote a program that tried every possible $n$ and $m$ and returned the best approximation. It returned the same value Huygens found using the second method, involving continued fractions.
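The program mentioned above is not reproduced in these notes. A minimal sketch of such an exhaustive search, assuming the bounds 7 and 200 from the problem statement and using exact rational arithmetic, could look as follows; the function name `best_gear_pair` is hypothetical, and the ratio 5067/21038 is the Mercury and Earth ratio used in the next subsection.

```python
from fractions import Fraction


def best_gear_pair(r, lo=7, hi=200):
    """Try every pair (n, m) with lo <= n, m <= hi and return the pair
    for which n/m is closest to the rational number r."""
    best = None
    for m in range(lo, hi + 1):
        for n in range(lo, hi + 1):
            err = abs(r - Fraction(n, m))
            if best is None or err < best[0]:
                best = (err, n, m)
    return best[1], best[2]


print(best_gear_pair(Fraction(5067, 21038)))  # (46, 191)
```

With fewer than 40,000 candidate pairs the search finishes quickly, which is why brute force is a perfectly reasonable approach today.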
2.3.2 Continued Fractions
As we know from Sandra's notes about decimals and continued fractions, the latter are very suitable for approximating numbers. In the construction of continued fractions, the convergents $\frac{p_n}{q_n}$ converge to $r$, while $p_n$ and $q_n$ are both increasing with each step. So all we have to do is make the continued fraction expansion of the number $r$ we want to approximate, keeping track of the convergents, and stop as soon as either $p_n > 200$ or $q_n > 200$.
Huygens invented this method for the design of his planetarium. The axis driving the Earth was his main axis. In getting the period of Mercury right, he took the following steps. The ratio between the periods of Earth and Mercury he found to be 21038 : 5067 [33]. The continued fraction of $\frac{5067}{21038}$ is $[0; 4, 6, 1, 1, 2, 1, 1, 1, 1, 7, 1, 2]$. The table with the subsequent convergents $\frac{p_n}{q_n}$ is given below. This table is the result of using the recursive relations for $p_n$ and $q_n$:
\[
p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2}.
\]
Table 2.1: The convergents of the continued fraction expansion of 5067/21038

n    -1  0  1   2   3   4    5    6    7    8    9    10    11     12
a_n      0  4   6   1   1    2    1    1    1    1     7     1      2
p_n   1  0  1   6   7  13   33   46   79  125  204  1553  1757   5067
q_n   0  1  4  25  29  54  137  191  328  519  847  6448  7295  21038
As we see from the table, the most useful convergent would be $\frac{46}{191}$. "Would be", because Huygens used an additional clever trick: the concept of gear trains. But before I discuss those, I want to turn your attention to the consequences of measuring errors. The periods of the various planets around the sun are all measured values. Measurements always come with errors. The ratio 21038 : 5067 comes from the measured values of 525950 minutes for the Earth's period and 126675 minutes for Mercury's. If instead Mercury's period had been measured two minutes longer, 126677 minutes, then the continued fraction of $\frac{126677}{525950}$ would be $[0; 4, 6, 1, 1, 2, 2, 266, 1, 5]$:
Table 2.2: The convergents of the continued fraction expansion of 126677/525950

n    -1  0  1   2   3   4    5    6      7      8       9
a_n      0  4   6   1   1    2    2    266      1       5
p_n   1  0  1   6   7  13   33   79  21047  21126  126677
q_n   0  1  4  25  29  54  137  328  87385  87713  525950

This small measuring error leads to the choice of $\frac{33}{137}$ instead of $\frac{46}{191}$. This illustrates that, due to measuring errors, the error of the model is always larger than the error suggested by the continued fraction approximation.
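Both tables can be reproduced mechanically. The sketch below is an illustration with hypothetical function names (`cf_expansion`, `last_convergent_within`): it expands a rational number with Euclid's algorithm and then follows the recursion for $p_n$ and $q_n$ until one of them exceeds the tooth limit, here taken to be 200 as in the text.

```python
def cf_expansion(num, den):
    """Partial quotients [a_0; a_1, ...] of num/den via Euclid's algorithm."""
    quotients = []
    while den:
        a, remainder = divmod(num, den)
        quotients.append(a)
        num, den = den, remainder
    return quotients


def last_convergent_within(quotients, limit=200):
    """Last convergent (p_n, q_n) with both p_n <= limit and q_n <= limit."""
    p_prev, p = 1, quotients[0]   # p_{-1}, p_0
    q_prev, q = 0, 1              # q_{-1}, q_0
    best = (p, q)
    for a_n in quotients[1:]:
        p_prev, p = p, a_n * p + p_prev
        q_prev, q = q, a_n * q + q_prev
        if p > limit or q > limit:
            break
        best = (p, q)
    return best


print(cf_expansion(5067, 21038))      # [0, 4, 6, 1, 1, 2, 1, 1, 1, 1, 7, 1, 2]
print(last_convergent_within(cf_expansion(5067, 21038)))     # (46, 191)
print(last_convergent_within(cf_expansion(126677, 525950)))  # (33, 137)
```

The two results show the effect described above: a change of two minutes in the measured period moves the selected convergent from 46/191 to 33/137.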
2.3.3 Gear Trains
The approximation Huygens picked for the ratio of the periods of Mercury and Earth is 204 : 847 [33]. These numbers are far too large to be used as numbers of teeth on a gearwheel. However, Huygens realised that $204 = 12 \cdot 17$ and $847 = 121 \cdot 7$. Using 4 gears instead of 2, he constructed a gear train, see figure 2.5 below.
Suppose gear A has 121 teeth, B has 12 teeth, C has 7 teeth and D has 17 teeth. Then, as A makes one full revolution, B makes $\frac{121}{12}$ revolutions. As C is mounted on the same axis as B, it too makes $\frac{121}{12}$ revolutions. The last one then, D, makes $\frac{7}{17}$ as many revolutions as C. So as A makes one full revolution, D makes
\[
\frac{121}{12} \cdot \frac{7}{17} = \frac{121 \cdot 7}{12 \cdot 17} = \frac{847}{204}
\]
revolutions. This is exactly the ratio Huygens wanted to use for the periods of Mercury and Earth.
So gear trains can get you a more precise approximation at the cost of using more material. However, this trick only works if this better approximation consists of numbers that factorise conveniently.
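The arithmetic of a gear train is just a product of fractions, one per pair of meshing gears. As a small illustration (the function name `gear_train_ratio` and the representation of a train as a list of tooth-count pairs are assumptions), the ratio for the four gears above can be checked exactly:

```python
from fractions import Fraction


def gear_train_ratio(meshes):
    """meshes: list of (driving_teeth, driven_teeth) for each pair of meshing
    gears. Returns the revolutions of the last gear per revolution of the first."""
    ratio = Fraction(1)
    for driving, driven in meshes:
        ratio *= Fraction(driving, driven)
    return ratio


# A (121 teeth) drives B (12 teeth); C (7 teeth, same axis as B) drives D (17 teeth).
print(gear_train_ratio([(121, 12), (7, 17)]))  # 847/204
```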
Figure 2.5: A four stage gear train
2.3.4 Some last Comments
Huygens clearly indicated how he used continued fractions to approximate the periods of the planets around the sun. Although Eisinga built his planetarium a century later, it appears to be less accurate [87], which is odd. I do not yet know which method Eisinga used to get his approximations. He might not have known about Huygens's work. The planetarium itself is much more impressive than Huygens's, as it has many more details.
In both planetaria, as much as possible is made to scale. However, the sizes of the planets are greatly exaggerated to make them visible. If you would like to experience the true emptiness of our solar system, I would recommend paying a visit to the Melkwegpad, near Dwingeloo/Westerbork. Here, all planets are to scale. Beware, it's quite a walk reaching Pluto.
Chapter 3
The Stern-Brocot algorithm
Maaike Zwart
In the 1860s, Moritz Stern and Achille Brocot independently developed an interesting algorithm. Number theorist Stern used it to construct all rational numbers, clockmaker Brocot as a method to approximate real numbers. The algorithm is closely related to Euclid's algorithm, and therefore also to continued fractions. In this chapter I will give Stern's definition and discuss Brocot's application in his clockwork, linking this chapter to my previous topic, Planetaria, and showing the similarity with Euclid's algorithm. I will finish with an unexpected link between Stern-Brocot and continued fractions (see footnote 1).
3.1 Constructing the Rationals
The rationals are produced in a "hanging tree". Starting with two boundaries, every rational number in between them appears in a tree that is constructed by taking mediants.
Definition 3.1.1. Mediants
Let $\frac{p}{q}$ and $\frac{p'}{q'}$ be two rational numbers (expressed as fractions in their lowest terms). Then their mediant is given by:
\[
\frac{p}{q} \oplus \frac{p'}{q'} = \frac{p + p'}{q + q'}
\]
The Stern-Brocot tree is then the result of the following iterative process:
1
The denition of both Stern and Brocot I learned from [34], a nice article I denitely
recommend reading for leisure time.
43
44 CHAPTER 3. THE STERN-BROCOT ALGORITHM
Definition 3.1.2. Stern-Brocot tree
Stage 0: Choose rational boundaries and write them as fractions in their lowest terms.
Stage 1: Add the mediant of the boundaries.
Stage n+1: Add the mediants of all consecutive fractions in the tree, including the boundaries.
Choosing the somewhat peculiar expressions $\frac{0}{1}$ and $\frac{1}{0}$ as boundaries, the latter expressing $\infty$, yields all the positive rational numbers:

0/1  1/0
0/1  1/1  1/0
0/1  1/2  1/1  2/1  1/0
0/1  1/3  1/2  2/3  1/1  3/2  2/1  3/1  1/0
0/1  1/4  1/3  2/5  1/2  3/5  2/3  3/4  1/1  4/3  3/2  5/3  2/1  5/2  3/1  4/1  1/0
...
The hanging tree structure becomes more apparent when only the newborns of each stage (i.e. the fractions added to the tree in that stage) are depicted:

0/1  1/0
1/1
1/2  2/1
1/3  2/3  3/2  3/1
1/4  2/5  3/5  3/4  4/3  5/3  5/2  4/1
...
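The stages of Definition 3.1.2 can be generated by repeatedly inserting mediants between neighbours. The following Python sketch is an illustration (the function name `stern_brocot_stages` and the representation of a fraction as a pair `(numerator, denominator)` are assumptions); pairs are used instead of a fraction type because the boundary $\frac{1}{0}$ is not a number.

```python
def stern_brocot_stages(n_stages):
    """Return [stage 0, stage 1, ..., stage n_stages]; each stage is a list
    of (numerator, denominator) pairs, starting from the boundaries 0/1 and 1/0."""
    stage = [(0, 1), (1, 0)]
    stages = [stage]
    for _ in range(n_stages):
        nxt = [stage[0]]
        for (p, q), (p2, q2) in zip(stage, stage[1:]):
            nxt.append((p + p2, q + q2))   # the mediant, a newborn of this stage
            nxt.append((p2, q2))
        stage = nxt
        stages.append(stage)
    return stages


stages = stern_brocot_stages(4)
print(stages[3])
# [(0, 1), (1, 3), (1, 2), (2, 3), (1, 1), (3, 2), (2, 1), (3, 1), (1, 0)]

# The newborns of stage 4 are the entries at the odd positions.
print(stages[4][1::2])
# [(1, 4), (2, 5), (3, 5), (3, 4), (4, 3), (5, 3), (5, 2), (4, 1)]
```

Since every stage roughly doubles in size, this direct construction is only practical for a small number of stages.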
I claimed every positive rational number appears in this Stern-Brocot tree (with boundaries $\frac{0}{1}$ and $\frac{1}{0}$, from now on referred to simply as the Stern-Brocot tree). The proof of this fact uses some properties of the Stern-Brocot tree. As these are interesting properties on their own, I will first discuss these, and then justify my claim.
The first property is one about consecutive fractions, meaning that in a certain stage in the tree, there is no fraction in between them, e.g. $\frac{0}{1}$ and $\frac{1}{1}$ are consecutive in stage 1.
Proposition 3.1.3. At any stage of the Stern-Brocot tree, two consecutive fractions $\frac{p}{q} < \frac{p'}{q'}$ have the property that:
\[
q p' - p q' = 1
\]
Proof. By induction on the stages of the Stern-Brocot tree.
Base case: At the first stage we only have the fractions $\frac{0}{1}$ and $\frac{1}{0}$. As $1 \cdot 1 - 0 \cdot 0 = 1$, this case is ok.
Induction step: Suppose at stage $n$ of the Stern-Brocot tree, all consecutive fractions have this property. At stage $n + 1$, all mediants of the consecutive fractions are added to the tree. Let $\frac{p}{q}$ and $\frac{p'}{q'}$ be two consecutive fractions of stage $n + 1$ of the Stern-Brocot tree. Then one of these two fractions already existed in stage $n$ (suppose $\frac{p}{q}$), and the other is a newborn. The newborn must then be the mediant of $\frac{p}{q}$ and $\frac{p''}{q''}$, that is: $\frac{p'}{q'} = \frac{p + p''}{q + q''}$. Note that $\frac{p}{q}$ and $\frac{p''}{q''}$ are consecutive fractions in stage $n$. Suppose wlog that $\frac{p}{q} < \frac{p'}{q'} < \frac{p''}{q''}$. Then:
\[
\begin{aligned}
q p' - p q' &= q (p + p'') - p (q + q'') \\
&= q p + q p'' - p q - p q'' \\
&= q p'' - p q'' \\
&= 1
\end{aligned}
\]
where in the last step the induction hypothesis is applied.
The next proposition concerns the newborns of stages.
Proposition 3.1.4. At stage $n$, the sum of the numerator and denominator of the newborns is at least $n + 1$.
Proof. By induction.
Base case: At the first stage ($n = 0$), we have the fractions $\frac{0}{1}$ and $\frac{1}{0}$; for both fractions, the sum of their numerator and denominator equals 1, which is $n + 1$.
Induction step: Suppose the sum of the numerator and denominator of the newborns of stage $n$ is at least $n + 1$. Let $\frac{a}{b}$ be an arbitrary newborn of stage $n + 1$. Then it is the mediant of a newborn of stage $n$ ($\frac{p}{q}$, $p + q \ge n + 1$) and an older fraction ($\frac{p'}{q'}$, $p' + q' \ge 1$): $\frac{a}{b} = \frac{p + p'}{q + q'}$. The sum of the numerator and denominator of $\frac{a}{b}$ is hence at least $(n + 1) + 1$.
These two propositions enable us to prove the claim that every positive rational number appears in the Stern-Brocot tree. Even better, it shows up only once. The proof is a slightly edited version of a proof by A. Bogomolny and W. McWorter [10].
Proposition 3.1.5. Every positive rational number appears exactly once in the Stern-Brocot tree with starting fractions $\frac{0}{1}$ and $\frac{1}{0}$.
Proof. Each mediant lies strictly in between its two parents. As all rationals in the tree are constructed by taking mediants, it is impossible for a rational to appear more than once in this tree. To prove each rational appears at least once takes some more work. Let $\frac{p}{q}$ be any fraction expressed in its lowest terms and suppose it does not appear in the tree. Then, at any stage of the tree, there are two consecutive fractions $\frac{n}{m}$ and $\frac{n'}{m'}$ such that
\[
\frac{n}{m} < \frac{p}{q} < \frac{n'}{m'}.
\]
Since all quantities involved are integers, this gives
\[
\begin{aligned}
m p - n q > 0 &\implies m p - n q \ge 1, \\
n' q - m' p > 0 &\implies n' q - m' p \ge 1,
\end{aligned}
\]
which can be rewritten as:
\[
\begin{aligned}
(n' + m')(m p - n q) \ge n' + m' \quad &\text{and} \quad (n + m)(n' q - m' p) \ge n + m, \\
n' m p + m' m p - n' n q - m' n q \ge n' + m' \quad &\text{and} \quad n n' q + m n' q - n m' p - m m' p \ge n + m.
\end{aligned}
\]
Adding these two expressions and reducing the result to a readable form:
\[
p (n' m - n m') + q (m n' - m' n) \ge n' + m' + n + m
\]
As $\frac{n}{m}$ and $\frac{n'}{m'}$ are consecutive fractions, we may use $m n' - n m' = 1$ (proposition 3.1.3), yielding:
\[
p + q \ge n' + m' + n + m
\]
Now take $\frac{n}{m}$ and $\frac{n'}{m'}$ to be the consecutive fractions in stage $p + q$ of the Stern-Brocot tree; one of them is a newborn. By proposition 3.1.4, the sum of its numerator and denominator is at least $p + q + 1$, leading to the result that:
\[
p + q > p + q + 1
\]
which is clearly a contradiction. Hence the assumption that $\frac{p}{q}$ does not appear in the first $p + q$ stages of the Stern-Brocot tree is false. Hence $\frac{p}{q}$ does appear in the Stern-Brocot tree. More specifically, it does so within the first $p + q$ stages.
Now that we have an interesting tree to work with, I will show how Brocot used this tree for making approximations needed in his clockwork.
3.2 Application: Approximating Fractions
As discussed in the chapter about planetaria, gears used to drive clockwork are limited in their number of teeth. I have already discussed some methods for approximating fractions that are inappropriate for gears. Here is how Brocot tackled the problem. I will first give the algorithm for an arbitrary fraction $\frac{p}{q}$, then I will give an example, clarifying the different steps taken. Then I will show this procedure is equivalent to following a specific path down the Stern-Brocot tree.
Definition 3.2.1. Brocot's algorithm for approximating a fraction
Start with $\lfloor \frac{p}{q} \rfloor$ and $\lceil \frac{p}{q} \rceil$ and calculate the errors of these two approximations as follows: $e(\lceil \frac{p}{q} \rceil) = \lceil \frac{p}{q} \rceil q - p$, and similarly $e(\lfloor \frac{p}{q} \rfloor) = \lfloor \frac{p}{q} \rfloor q - p$. Make a table with three columns as follows:

numerator (n)                     denominator (d)   error (e)
$\lceil \frac{p}{q} \rceil$       1                 $\lceil \frac{p}{q} \rceil q - p$
$\lfloor \frac{p}{q} \rfloor$     1                 $\lfloor \frac{p}{q} \rfloor q - p$

The error in the table divided by the denominator gives the total error of the approximation in terms of the denominator of the original fraction. That is, $\frac{n}{d} - \frac{p}{q} = \frac{e/d}{q}$.
Add the two rows column-wise to create a third row in between them:

$\lceil \frac{p}{q} \rceil$                                   1   $\lceil \frac{p}{q} \rceil q - p$
$\lceil \frac{p}{q} \rceil + \lfloor \frac{p}{q} \rfloor$     2   $\lceil \frac{p}{q} \rceil q - p + \lfloor \frac{p}{q} \rfloor q - p$
$\lfloor \frac{p}{q} \rfloor$                                 1   $\lfloor \frac{p}{q} \rfloor q - p$

If the new error $e$ is positive, add this row to the row below it (which has a negative error term) to create a row in between them. If the new error term is negative, add this row to the row above it (which has a positive error term) to create a row in between them.
Repeat the last step until you are satisfied with the approximation or until the error term is zero; then you have recovered the original fraction.
Example 3.2.2. Hayes uses the example of $\frac{191}{23}$ [34]. As $\lfloor \frac{191}{23} \rfloor = 8$ and $\lceil \frac{191}{23} \rceil = 9$, the first table looks like:

  9    1   16
  8    1   -7

Adding these rows results in:

  9    1   16
 17    2    9
  8    1   -7

Adding the middle row to the lower row:

  9    1   16
 17    2    9
 25    3    2
  8    1   -7

etc., each time adding the rows in such a way that the error decreases. The finished table looks like this:

  9    1   16
 17    2    9
 25    3    2
108   13    1
191   23    0
 83   10   -1
 58    7   -3
 33    4   -5
  8    1   -7

In this example, the best choice for approximating $\frac{191}{23}$ with the restriction that both the numerator and the denominator should be at least 7 and at most 100 is $\frac{83}{10}$, with an error of size $\frac{1/10}{23} = 1/230$.
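The steps of Definition 3.2.1 can be sketched in a few lines of Python. This is an illustration rather than Brocot's own formulation: the function name `brocot_table` and the bookkeeping with two lists (rows with positive error and rows with negative error) are assumptions. Each new row is the column-wise sum of the innermost positive row and the innermost negative row, exactly as in the example.

```python
def brocot_table(p, q):
    """Rows (numerator, denominator, error) of Brocot's table for p/q,
    listed from the upper bound ceil(p/q) down to the lower bound floor(p/q)."""
    floor = p // q
    lower = (floor, 1, floor * q - p)
    if lower[2] == 0:                 # p/q is an integer: nothing to approximate
        return [lower]
    upper = (floor + 1, 1, (floor + 1) * q - p)
    positive, negative = [upper], [lower]
    while True:
        hi, lo = positive[-1], negative[-1]
        row = (hi[0] + lo[0], hi[1] + lo[1], hi[2] + lo[2])
        if row[2] > 0:
            positive.append(row)      # next, this row is added to the row below it
        elif row[2] < 0:
            negative.append(row)      # next, this row is added to the row above it
        else:
            return positive + [row] + negative[::-1]


for row in brocot_table(191, 23):
    print(row)
```

For 191/23 this prints the nine rows of the finished table in Example 3.2.2, from (9, 1, 16) down to (8, 1, -7).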
Proposition 3.2.3. Brocot's algorithm is sound, that is, the error in the table gained by adding the errors of the parent rows is the true error for the fraction in that row compared to the fraction that is approximated. In mathematical terms:
\[
\frac{e/d}{q} = \frac{n}{d} - \frac{p}{q}
\]
Proof. Note: if $e = n q - d p$, then $\frac{e/d}{q} = \frac{n q - d p}{d q} = \frac{n}{d} - \frac{p}{q}$, so it is enough to prove that $e = n q - d p$. This is done by induction on the number of iterations of the algorithm. The base case is the start of the algorithm. As $e$ is here defined to be $n q - p$, which equals $n q - d p$ because $d = 1$ here, this case is ok. In the subsequent rows, the error is computed as the sum of the errors of the parents. That is, $e(\frac{n' + n''}{d' + d''}) = e(\frac{n'}{d'}) + e(\frac{n''}{d''})$. As induction hypothesis we may assume $e(\frac{n'}{d'}) = n' q - d' p$ and $e(\frac{n''}{d''}) = n'' q - d'' p$, so that:
\[
\begin{aligned}
e\left(\frac{n' + n''}{d' + d''}\right) &= e\left(\frac{n'}{d'}\right) + e\left(\frac{n''}{d''}\right) \\
&= n' q - d' p + n'' q - d'' p \\
&= (n' + n'') q - (d' + d'') p
\end{aligned}
\]
So indeed, $e = n q - d p$, and Brocot's algorithm gives the right error terms.
3.2.1 Stern - Brocot
Brocot's algorithm is very mechanical, while Stern's work is very theoretical.
Still, these two gentlemen actually hit upon the same idea. Brocot picks a
very specific path down Stern's tree. When approximating a fraction $\frac{p}{q}$, he
picks $\lfloor \frac{p}{q} \rfloor$ and $\lceil \frac{p}{q} \rceil$ as boundary values for the tree. He then constructs the
tree by taking mediants. However, he is not interested in the entire tree,
just the part that approximates the fraction $\frac{p}{q}$. So in step $n$, he takes
the left branch if the approximation is larger than the original fraction, and the
right branch if the approximation is smaller.
But Stern is not the only one whose work is closely related to Brocot's
approximation algorithm. Where the first two columns of Brocot's table
pointed to Stern's tree, the operations on the error column are reminiscent of Euclid.
3.2.2 Brocot - Euclid
Brocot's procedure is actually a primitive variant of Euclid's algorithm.
When only considering the error terms, starting with a positive error $e$ and a negative error $f$,
the procedure adds $f$ to $e$ once too many times, so that the outcome
is negative, while adding $f$ one time fewer still gives a positive outcome. In other words, it
writes $e$ as $e = b|f| - r$. After some more iterations:
\[
\begin{aligned}
e &= b_0 |f| + r_0 = (b_0 + 1)|f| - r'_0 \\
r'_0 &= b_1 r_0 + r_1 = (b_1 + 1) r_0 - r'_1 \\
r'_1 &= b_2 r_1 + r_2 = (b_2 + 1) r_1 - r'_2 \\
&\;\;\vdots
\end{aligned}
\]
This is clearly a variant of Euclid's algorithm. The difference is that in Brocot's algorithm,
subtracting $r_n$ a total of $b_{n+1} + 1$ times from $r'_n$ is done one step at a time, which can
result in cumbersome work if one error term is relatively small and the other
relatively large.
The connection between Stern-Brocot and continued fractions is not too
hard to find now. However, there is another, less obvious connection between
the two.
3.3 An Amusing Property of the Stern-Brocot Sequence
This section contains my interpretation of a lemma (Lemma 2.1) from [44].
It is the same proof as given there, but I have expanded it a bit to clarify
some details.
Previously, the boundaries were either $\frac{0}{1}$ and $\frac{1}{0}$ or two consecutive natural
numbers. In this part, I will take a closer look at the tree resulting from the
boundaries $\frac{0}{1}$ and $\frac{1}{1}$, which includes every positive rational smaller than 1.
The different stages of the tree are represented by sets in the Stern-Brocot
sequence. These sets $J_n$ contain all the rational numbers constructed by the
algorithm up to and including stage $n$. The first few elements of the
Stern-Brocot sequence are:
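\[
\begin{aligned}
J_0 &= \left\{ \tfrac{0}{1}, \tfrac{1}{1} \right\} \\
J_1 &= \left\{ \tfrac{0}{1}, \tfrac{1}{2}, \tfrac{1}{1} \right\} \\
J_2 &= \left\{ \tfrac{0}{1}, \tfrac{1}{3}, \tfrac{1}{2}, \tfrac{2}{3}, \tfrac{1}{1} \right\} \\
J_3 &= \left\{ \tfrac{0}{1}, \tfrac{1}{4}, \tfrac{1}{3}, \tfrac{2}{5}, \tfrac{1}{2}, \tfrac{3}{5}, \tfrac{2}{3}, \tfrac{3}{4}, \tfrac{1}{1} \right\}
\end{aligned}
\]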
Giving a precise definition of this sequence is somewhat less obvious, but
we will need it in proofs later on. Kesseböhmer and Stratmann [44] formulate
the definition as stated below; I haven't found a better way to express it.
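Definition 3.3.1. Stern-Brocot sequence
The Stern-Brocot sequence is a sequence of sets $J_n$,
\[ J_n = \left\{ \frac{s_{n,k}}{t_{n,k}} \;\middle|\; k = 1, \ldots, 2^n + 1 \right\} \]
where $s_{n,k}$ and $t_{n,k}$ are inductively defined as:
\[ s_{0,1} = 0, \qquad t_{0,1} = 1 \tag{3.1} \]
\[ s_{0,2} = 1, \qquad t_{0,2} = 1 \tag{3.2} \]
\[ s_{n+1,k} = \begin{cases} s_{n,l} & \text{if } k = 2l - 1 \\ s_{n,l} + s_{n,l+1} & \text{if } k = 2l \end{cases} \tag{3.4} \]
\[ t_{n+1,k} = \begin{cases} t_{n,l} & \text{if } k = 2l - 1 \\ t_{n,l} + t_{n,l+1} & \text{if } k = 2l \end{cases} \tag{3.5} \]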
Taking a close look at the sets in the Stern-Brocot sequence, we see
this definition makes sense. For odd $k$, the elements of the previous set are
included, and for even $k$, the algorithm of adding numerators and denominators is applied.
By taking differences between consecutive sets of the sequence, we gain
sets containing the newborns of each stage:
\[
\begin{aligned}
J_1 \setminus J_0 &= \left\{ \tfrac{1}{2} \right\} \\
J_2 \setminus J_1 &= \left\{ \tfrac{1}{3}, \tfrac{2}{3} \right\} \\
J_3 \setminus J_2 &= \left\{ \tfrac{1}{4}, \tfrac{2}{5}, \tfrac{3}{5}, \tfrac{3}{4} \right\}
\end{aligned}
\]
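Definition 3.3.1 translates almost literally into code. The following Python sketch (the names `stern_brocot_sets` and `newborns` are chosen here for illustration and do not come from [44]) stores each fraction as a pair `(s, t)` and builds the lists $J_0, \ldots, J_n$ by copying old elements and inserting mediants between neighbours.

```python
def stern_brocot_sets(n_max):
    """Return [J_0, ..., J_n_max], each a list of (s, t) pairs in increasing order."""
    level = [(0, 1), (1, 1)]                  # (3.1) and (3.2)
    levels = [level]
    for _ in range(n_max):
        nxt = []
        for (s1, t1), (s2, t2) in zip(level, level[1:]):
            nxt.append((s1, t1))              # odd k: copy the old element
            nxt.append((s1 + s2, t1 + t2))    # even k: add numerators and denominators
        nxt.append(level[-1])                 # the last old element, 1/1
        level = nxt
        levels.append(level)
    return levels


def newborns(levels, n):
    """Elements of J_n that are not in J_(n-1)."""
    old = set(levels[n - 1])
    return [x for x in levels[n] if x not in old]


levels = stern_brocot_sets(3)
print(levels[3])
print(newborns(levels, 3))
```

For `n = 3` this prints the nine fractions of $J_3$ and the four newborns listed above.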
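These difference sets have an amusing property. If we express all the
elements of $J_n \setminus J_{n-1}$ as continued fractions, then the sum of the quotients
$a_i$ is $n + 1$. Take for example $J_3 \setminus J_2$:
\[
\begin{aligned}
J_3 \setminus J_2 &= \left\{ \frac{1}{4}, \frac{2}{5}, \frac{3}{5}, \frac{3}{4} \right\} \\
&= \left\{ \frac{1}{4},\; \cfrac{1}{2 + \cfrac{1}{2}},\; \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{2}}},\; \cfrac{1}{1 + \cfrac{1}{3}} \right\} \\
&= \left\{ [0; 4],\, [0; 2, 2],\, [0; 1, 1, 2],\, [0; 1, 3] \right\}
\end{aligned}
\]
\[ \sum_i a_i = 4 = 3 + 1 \]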
To state this more precisely, consider the following sets:
\[ \mathcal{A}^n_k = \left\{ [0; a_1, \ldots, a_k] \;\middle|\; \sum_{i=1}^{k} a_i = n,\ a_k \neq 1 \right\} \]
The set $\mathcal{A}^n_k$ contains all continued fractions with $k$ quotients adding up to $n$.
The union $\bigcup_{k=1}^{n-1} \mathcal{A}^n_k$ contains all continued fractions of which the quotients
have sum $n$. Our theorem, expressed in the example above, then becomes:
Theorem 3.3.2.
\[ J_n \setminus J_{n-1} = \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k \tag{3.6} \]
As the sets $J_n$ are recursively defined, a proof by induction is the obvious
approach. However, to carry through the induction, we need a relation between
the elements of $J_n \setminus J_{n-1}$ and $J_{n-1} \setminus J_{n-2}$. The following Lemma provides us
with such a relation.
Lemma 3.3.3.
\[ J_n \setminus J_{n-1} = \left\{ \frac{1}{x+1},\ \frac{x}{x+1} \;\middle|\; x \in J_{n-1} \setminus J_{n-2} \right\} \]
To my regret, I did not succeed in proving this Lemma ([11] claims to prove it, but I do not understand the reasoning presented there). So for the time
being I will just assume it is true and use it in the proof of Theorem 3.3.2:
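Proof. By induction.
Base: $n = 1$.
\[ J_1 \setminus J_0 = \left\{ \tfrac{1}{2} \right\} = \{ [0; 2] \} = \mathcal{A}^2_1 = \bigcup_{k=1}^{1} \mathcal{A}^2_k \]
To prove the induction step, I will first show that $J_n \setminus J_{n-1} \subseteq \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k$.
Then, I will note the two sets have an equal cardinality. As both sets are
finite, and one is completely contained in the other, this proves they are the
same.
In proving the inclusion we need Lemma 3.3.3. I want to show that:
\[ J_n \setminus J_{n-1} \subseteq \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k \]
By Lemma 3.3.3 we have:
\[ J_n \setminus J_{n-1} = \left\{ \frac{1}{x+1},\ \frac{x}{x+1} \;\middle|\; x \in J_{n-1} \setminus J_{n-2} \right\} \]
So I need to show that, for each $x \in J_{n-1} \setminus J_{n-2}$, both $\frac{1}{x+1}$ and $\frac{x}{x+1}$ are
elements of $\bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k$.
So fix an arbitrary $x \in J_{n-1} \setminus J_{n-2}$. By the induction hypothesis, we
know that $J_{n-1} \setminus J_{n-2} = \bigcup_{k=1}^{n-1} \mathcal{A}^n_k$. So we know the sum of the quotients of
$x$ equals $n$.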
Suppose $x = [0; a_1, \ldots, a_k]$ (so that $\sum_{i=1}^{k} a_i = n$). Consider $\frac{1}{x+1}$:
\[ \frac{1}{x+1} = \frac{1}{[0; a_1, \ldots, a_k] + 1} = \cfrac{1}{1 + 0 + \cfrac{1}{a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_k}}}} = [0; 1, a_1, \ldots, a_k] \]
Hence the sum of the quotients of $\frac{1}{x+1}$ equals $n + 1$. That is, $\frac{1}{x+1} \in \mathcal{A}^{n+1}_{k+1}$.
A similar argument holds for $\frac{x}{x+1}$:
\[
\frac{x}{x+1} = \cfrac{1}{1 + \cfrac{1}{x}} = \cfrac{1}{1 + \cfrac{1}{[0; a_1, \ldots, a_k]}}
= \cfrac{1}{1 + a_1 + \cfrac{1}{\ddots + \cfrac{1}{a_k}}} = [0; a_1 + 1, a_2, \ldots, a_k]
\]
Hence the sum of the quotients of $\frac{x}{x+1}$ equals $n + 1$. That is, $\frac{x}{x+1} \in \mathcal{A}^{n+1}_{k}$.
Hence both $\frac{1}{x+1}$ and $\frac{x}{x+1}$ are elements of $\bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k$. We may conclude that
indeed
\[ J_n \setminus J_{n-1} \subseteq \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k \]
For equality of these sets, we will compute their cardinality:
$\# J_n = 2^n + 1$, which is easily proven by induction. So:
\[ \#(J_n \setminus J_{n-1}) = \# J_n - \# J_{n-1} = 2^n + 1 - (2^{n-1} + 1) = 2^{n-1} \]
Computing the cardinality of $\bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k$ will take some more work. First
note that:
\[ \# \mathcal{A}^{n+1}_k = \binom{n-1}{k-1} \]
which I will explain using balls and sticky walls. To get the number of elements in $\mathcal{A}^{n+1}_k$, I need to count the number of possible sequences $(a_1, \ldots, a_k)$
such that $[0; a_1, \ldots, a_k]$ is a continued fraction (that is, each $a_i \geq 1$ and
$a_k > 1$), and $\sum_{i=1}^{k} a_i = n + 1$.
Suppose I have $n + 1$ balls in a row.
I want to divide them into $k$ groups, the number of balls in group $i$ representing the value of $a_i$. This is done by picking $k - 1$ balls and sticking a wall
to their left.
The $k - 1$ walls divide the balls into $k$ groups. The stickiness ensures each
group contains at least one ball (you cannot stick two walls to the same
ball). The number of ways these walls can be placed is equal to the number
of ways I can pick $k - 1$ balls out of $n + 1$ balls, that is: $\binom{n+1}{k-1}$. However,
there are some more constraints. Using this construction, I could pick the
very first ball in the row to stick a wall to its left, resulting in $a_1 = 0$. To
prevent this, I take 1 ball apart before placing the walls, and add it to the
first group once the walls are in place, ensuring $a_1 \geq 1$. In the same way, I
reserve one ball beforehand for the last group, $a_k$. As the walls are placed on
the left side of the balls, the procedure gives me $a_k \geq 1$; adding my reserved
ball yields the desired $a_k > 1$. So I am left with $n + 1 - 2 = n - 1$ balls to
stick the $k - 1$ walls to, yielding $\binom{n-1}{k-1}$ ways to do so. Hence
\[ \# \mathcal{A}^{n+1}_k = \binom{n-1}{k-1} \]
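From this fact, it is only a small step to $\# \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k$. Newton's Binomial Theorem is used here:
\[
\begin{aligned}
\# \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k &= \sum_{k=1}^{n} \# \mathcal{A}^{n+1}_k = \sum_{k=1}^{n} \binom{n-1}{k-1} \\
&= \sum_{k=1}^{n} \binom{n-1}{k-1} 1^{k-1}\, 1^{n-1-(k-1)} \\
&= \sum_{k=0}^{n-1} \binom{n-1}{k} 1^{k}\, 1^{n-1-k} \\
&= (1 + 1)^{n-1} = 2^{n-1}
\end{aligned}
\]
So we have:
\[ \#(J_n \setminus J_{n-1}) = 2^{n-1} = \# \bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k \]
As each element of $J_n \setminus J_{n-1}$ is also an element of $\bigcup_{k=1}^{n} \mathcal{A}^{n+1}_k$, and both
sets have the same number of elements, they must be equal.
This proves that if the elements of $J_n \setminus J_{n-1}$ are written as continued fractions, then
the sum of the quotients $a_i$ is $n + 1$.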
3.3.1 Some last Comments
For a brilliant explanation of the Stern-Brocot tree and its properties, please
go to http://www.ams.org/samplings/feature-column/fcarc-stern-brocot
[3]. Here, David Austin explains the whole subject way better than I ever
could, including nice geometrical interpretations.
Chapter 4
Sums of squares
Chris Kooloos
Question 1. Let $n \in \mathbb{N}$. Can we find $a$ and $b$ in $\mathbb{N}$ such that $a^2 + b^2 = n$?
First of all we have the following lemma, which tells us that we only have
to consider the special case where $n$ is a prime.
Lemma 4.0.4. The product of two natural numbers that each are the sum
of two squares is again a sum of two squares.
Proof:
Let $a$, $b$, $x$ and $y$ be natural numbers. We have:
\[
\begin{aligned}
(a^2 + b^2)(x^2 + y^2) &= a^2x^2 + a^2y^2 + b^2x^2 + b^2y^2 \\
&= a^2x^2 - 2abxy + b^2y^2 + a^2y^2 + 2abxy + b^2x^2 \\
&= (ax - by)^2 + (ay + bx)^2.
\end{aligned}
\]
Suppose we have $a$ and $b$ such that $a^2 + b^2 = p$ for some prime $p$. All
squares equal 0 or 1 modulo 4, so $a^2 + b^2$ equals 0, 1 or 2 modulo 4. This
implies that we only have to consider primes $p$ such that $p = 2$ or $p \equiv_4 1$.
Of course, we have $2 = 1^2 + 1^2$.
4.0.2 Theorem 1
Let $p$ be a prime such that $p \equiv_4 1$. Then we can find $a, b \in \mathbb{N}$ such that
$a^2 + b^2 = p$.
Proof:
Find $w$ such that $w^2 \equiv_p -1$ and $0 < w < \frac{p}{2}$. This is always possible:
The multiplicative group $\mathbb{F}_p^*$ is cyclic. Therefore, there exists a generator $g$
for this group, with $g^{p-1} \equiv_p 1$ and $g^{\frac{p-1}{2}} \equiv_p -1$.
Now, both $g^{\frac{p-1}{4}}$ and $g^{\frac{3p-3}{4}}$ are roots of $x^2 + 1$ in $\mathbb{F}_p[x]$, and one of them
has a representative $w$ such that $w < \frac{p}{2}$.
Consider the regular continued fraction expansion of the rational number $\frac{w}{p}$.
The sequence $q_0, q_1, \ldots$ of denominators of the convergents is strictly increasing
and $q_0 = 1$. Therefore we can find an $m$ such that $q_m < \sqrt{p} < q_{m+1}$. We
know that:
\[ \left| \frac{w}{p} - \frac{p_m}{q_m} \right| < \frac{1}{q_{m+1} q_m}. \]
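If we set $a := w q_m - p\, p_m$, then:
\[ \left| \frac{a}{p\, q_m} \right| = \left| \frac{w}{p} - \frac{p_m}{q_m} \right| < \frac{1}{q_{m+1} q_m}. \]
So we have:
\[ |a| < \frac{p}{q_{m+1}} < \frac{p}{\sqrt{p}} = \sqrt{p}. \]
Together with $q_m < \sqrt{p}$, we have:
\[ a^2 + q_m^2 < p + p = 2p. \]
Furthermore, $a = w q_m - p\, p_m \equiv_p w q_m$. So, because we have $w^2 \equiv_p -1$:
\[ a^2 + q_m^2 \equiv_p w^2 q_m^2 + q_m^2 \equiv_p 0. \]
So $a^2 + q_m^2$ is a positive multiple of $p$ that is smaller than $2p$. Therefore, if we set $b = q_m$ (and replace $a$ by $|a|$ if necessary), we have:
\[ a^2 + b^2 = p. \]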
The proof gives us an algorithm to find $a$ and $b$ in which we calculate the
continued fraction expansion of a rational number until the denominators of
the convergents get bigger than $\sqrt{p}$. The problem is to find the number
$w$, or to find a generator for $\mathbb{F}_p^*$ so that we can calculate $w$.
We now consider a shorter algorithm using the Euclidean algorithm.
Unfortunately, this algorithm also starts with finding a $w$ as above. However,
in this algorithm we don't have to calculate partial quotients.
Carry out the Euclidean algorithm on $\frac{p}{w}$ (instead of $\frac{w}{p}$), producing the
sequence $r_1, r_2, \ldots$ of remainders. When we first encounter a remainder $r_k$
such that $r_k < \sqrt{p}$, we calculate the next remainder $r_{k+1}$ and stop after
doing so.
Claim 1.
\[
\begin{aligned}
p &= r_k^2 + r_{k+1}^2 && \text{if } r_1 > 1, \\
p &= w^2 + 1 && \text{if } r_1 = 1.
\end{aligned}
\]
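The Euclidean procedure of Claim 1 is short enough to try out directly. The following Python sketch makes one assumption that is not in the text: instead of starting from a generator $g$, it finds $w$ by trying bases $a = 2, 3, \ldots$ and testing whether $a^{(p-1)/4}$ squares to $-1$ modulo $p$ (this succeeds as soon as $a$ is a quadratic non-residue). The function names are illustrative only.

```python
def sqrt_minus_one(p):
    """Find w with w*w = -1 (mod p) and 0 < w < p/2, for a prime p = 1 (mod 4)."""
    for a in range(2, p):
        w = pow(a, (p - 1) // 4, p)
        if (w * w + 1) % p == 0:
            return min(w, p - w)
    raise ValueError("no square root of -1 found; is p a prime with p % 4 == 1?")


def two_squares(p):
    """Return (r_k, r_(k+1)) with r_k**2 + r_(k+1)**2 == p, as in Claim 1."""
    a, b = p, sqrt_minus_one(p)
    # Euclidean algorithm on p/w until the current remainder drops below sqrt(p)
    while b * b > p:
        a, b = b, a % b
    return b, a % b              # r_k and the next remainder r_(k+1)


for p in (5, 13, 17, 29, 41):
    x, y = two_squares(p)
    print(p, "=", x, "^2 +", y, "^2")
```

When $w$ itself is already smaller than $\sqrt{p}$, the loop body never runs and the function returns $(w, 1)$, which is the case $p = w^2 + 1$ of the claim.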
Before proving this claim, we prove a lemma about the shape of the
continued fraction expansion of $\frac{p}{w}$.
Lemma 4.0.5. Let $\frac{p}{w}$ be an irreducible fraction where $p > w > 1$ and $w^2 \equiv_p -1$. Then $\frac{p}{w}$ has a palindromic (symmetric) continued fraction expansion
with an even number of partial quotients.
Proof:
We have $w^2 + 1 = v \cdot p$ for some $v \in \mathbb{N}$.
Consider the continued fraction expansion of $\frac{p}{w}$:
\[ \frac{p}{w} = [a_0; a_1, a_2, \ldots, a_n] = \frac{p_n}{q_n} \]
where we can assume that $n$ is odd. The greatest common divisor of $p$ and
$w$ is 1 by assumption and the greatest common divisor of $p_n$ and $q_n$ is always
1, so we have $p = p_n$ and $w = q_n$. We know that:
\[ p_{n-1} q_n - p_n q_{n-1} = (-1)^n = -1 \]
So
\[ 1 = p_n q_{n-1} - p_{n-1} q_n \]
which we substitute in $w^2 + 1 = v \cdot p$ to see:
\[
\begin{aligned}
q_n^2 + p_n q_{n-1} - p_{n-1} q_n &= v \cdot p_n \\
q_n (q_n - p_{n-1}) &= p_n (v - q_{n-1})
\end{aligned}
\]
Since $\gcd(p_n, q_n) = 1$, this implies $p_n \mid (q_n - p_{n-1})$.
We have
1. $p_n > q_n > 0$
2. $p_n = a_n p_{n-1} + p_{n-2} > p_{n-1} > 0$
Together these imply $p_n > |q_n - p_{n-1}|$, so we can conclude $q_n - p_{n-1} = 0$,
and therefore $q_n = p_{n-1}$.
Now we have:
\[ \frac{p_n}{p_{n-1}} = \frac{p_n}{q_n} = [a_0; a_1, \ldots, a_n] \]
But we also have:
\[ \frac{p_n}{p_{n-1}} = [a_n; a_{n-1}, \ldots, a_1, a_0] \]
Because:
\[
\begin{aligned}
p_n &= a_n p_{n-1} + p_{n-2} \\
\frac{p_n}{p_{n-1}} &= a_n + \frac{p_{n-2}}{p_{n-1}} \\
\frac{p_{n-1}}{p_{n-2}} &= a_{n-1} + \frac{p_{n-3}}{p_{n-2}}
\end{aligned}
\]
and so on.
So now we have
\[ [a_0; \ldots, a_n] = [a_n; \ldots, a_0] \]
and $n$ is odd, say $n = 2k + 1$. We have:
\[ \frac{p}{w} = [a_0; a_1, \ldots, a_k, a_k, \ldots, a_0] = \frac{p_{2k+1}}{p_{2k}} \]
which proves the lemma.
Observe:
$\frac{p}{w}$ has continued fraction expansion $[a_0; \ldots, a_0]$ with convergents $\frac{p_0}{q_0}, \frac{p_1}{q_1}, \ldots$.
$\frac{w}{p}$ is equal to $0 + \frac{1}{p/w}$, so $\frac{w}{p}$ has continued fraction expansion $[0; a_0, a_1, \ldots, a_0]$.
So if $\frac{p'_0}{q'_0}, \frac{p'_1}{q'_1}, \ldots$ is the sequence of convergents of the continued fraction
expansion of $\frac{w}{p}$, which we used in the first algorithm in the theorem above,
we have, for all $m$:
\[ \frac{p'_{m+1}}{q'_{m+1}} = \frac{q_m}{p_m}. \]
Lemma 4.0.6. $p_k^2 + p_{k-1}^2 = p$
Proof:
This proof is given in Perron (p. 29).
Proof of Claim 1:
Suppose $r_1 > 1$. We want to prove $p = r_k^2 + r_{k+1}^2$.
We have $p_{2k+1} = p$ and $p_{2k} = w$.
The recursion formula for $p_n$ gives us:
\[
\begin{aligned}
p &= a_0 w + p_{2k-1} \\
w &= a_1 p_{2k-1} + p_{2k-2}
\end{aligned}
\]
and so on.
These equations are identical to those in the Euclidean algorithm for $\frac{p}{w}$.
Hence,
\[ p_{2k-1} = r_1,\quad p_{2k-2} = r_2,\quad \ldots,\quad p_{k+1} = r_{k-1},\quad p_k = r_k,\quad p_{k-1} = r_{k+1},\ \ldots \]
Together with Lemma 4.0.6 this gives us:
\[ p = r_k^2 + r_{k+1}^2. \]
From this we conclude $r_k < \sqrt{p}$. To prove that $k$ is indeed the smallest $k_0$
such that $r_{k_0} < \sqrt{p}$, first assume $k = 1$. Then surely 1 is the smallest $k_0$
such that $r_{k_0} < \sqrt{p}$.
Now assume $k > 1$. We have: $r_{k-1} = p_{k+1}$.
From the observation above, we have $p_{k+1} = q'_{k+2}$, where $q'_{k+2}$ is the denominator of the $(k+2)$nd convergent of the continued fraction expansion of $\frac{w}{p}$.
There is a natural number $z$ such that:
\[
\begin{aligned}
{q'_{k+2}}^2 &= (q'_{k+1} z + q'_k)^2 = {q'_{k+1}}^2 + 2z\, q'_{k+1} q'_k + {q'_k}^2 \\
&> {q'_{k+1}}^2 + {q'_k}^2 = p_k^2 + p_{k-1}^2 = p.
\end{aligned}
\]
So $r_{k-1} = p_{k+1} = q'_{k+2} > \sqrt{p}$, which proves that $r_k$ is the first remainder
smaller than $\sqrt{p}$.
Now assume $r_1 = 1$.
Then $p = w \cdot a_0 + 1$.
Also $\frac{p}{w} = [a_0; a_0]$, so $w = a_0$.
We conclude $p = w^2 + 1$.
Chapter 5
Pell equation
Merlijn Keune
A Pell equation is an equation of the form $x^2 - dy^2 = 1$, with $d \in \mathbb{Z}_{\geq 2}$
not a perfect square. The equation is named after the English mathematician John Pell (1610-1685), who had nothing to do with this equation. In
fact William Brouncker (1620-1684) came up with a solution method, but
Euler mistakenly credited Pell for this. Brouncker's method is in substance
identical to a method known to Indian mathematicians at least six centuries
earlier. The equation also occurred in Greek mathematics, but there is no
evidence that they were able to solve it.
In this chapter we will use the continued fraction expansion of $\sqrt{d}$ and
quadratic irrationals in general to solve the Pell equation. The first proposition makes clear why this is a sensible thing to do.
Proposition 5.0.7. Let $(p, q) \in \mathbb{Z}^2$ be a solution to $x^2 - dy^2 = 1$. Then
$p = p_k$ and $q = q_k$ for some convergent $\frac{p_k}{q_k}$ of $\sqrt{d}$.
Proof $\frac{p^2}{q^2} = d + \frac{1}{q^2}$, so $p \geq q$. Therefore
\[ |p - \sqrt{d}\, q| = \frac{1}{p + \sqrt{d}\, q} \leq \frac{1}{(1 + \sqrt{d})\, q} < \frac{1}{2q}. \]
So
\[ \left| \frac{p}{q} - \sqrt{d} \right| < \frac{1}{2q^2}. \]
Furthermore $\gcd(p, q) = 1$, so the proposition follows from what we already
know.
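Before we can have a good look at the continued fraction expansion of
quadratic irrationals we need some definitions and a few immediate results.
Definitions 5.0.8. Let $\alpha$ be a quadratic irrational. We can write $\alpha = a + b\sqrt{d}$ with $a, b \in \mathbb{Q}$ and $d$ some square-free natural number. Then $a - b\sqrt{d}$ is
its conjugate, denoted by $\bar{\alpha}$. There are unique $a \in \mathbb{N}^*$, $b, c \in \mathbb{Z}$, $\gcd(a, b, c) = 1$ such that $\alpha$ is a root of $ax^2 + bx + c$. Its discriminant is $\mathrm{disc}(\alpha) = b^2 - 4ac$.
$\alpha$ is called reduced if $\alpha > 1$ and $-1 < \bar{\alpha} < 0$. For the continued fraction
expansion we shall use the map
\[ \tau : \mathbb{R} \setminus \mathbb{Q} \to \mathbb{R} \setminus \mathbb{Q}, \qquad \alpha \mapsto \frac{1}{\alpha - \lfloor \alpha \rfloor}. \]
So we have
\[ \alpha = [\lfloor \alpha \rfloor; \tau(\alpha)] = [\lfloor \alpha \rfloor; \lfloor \tau(\alpha) \rfloor, \tau^2(\alpha)] = \ldots. \]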
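Proposition 5.0.9. Let $\alpha$ be a quadratic irrational. Then $\tau(\alpha)$ is a quadratic
irrational and $\mathrm{disc}(\alpha) = \mathrm{disc}(\tau(\alpha))$. If $\alpha$ is reduced, then $\tau(\alpha)$ is reduced.
Proof $\alpha \in \mathbb{Q}(\sqrt{d})$ for some $d \in \mathbb{N}^*$, so obviously $\tau(\alpha) \in \mathbb{Q}(\sqrt{d})$. Let $\alpha$ be
a root of $ax^2 + bx + c$, so $\mathrm{disc}(\alpha) = b^2 - 4ac$. We show that the discriminant
is invariant under the maps $\alpha \mapsto \alpha - 1$ and $\alpha \mapsto \frac{1}{\alpha}$. $\alpha - 1$ is a root of
$ax^2 + (2a + b)x + a + b + c$:
\[ a(\alpha - 1)^2 + (2a + b)(\alpha - 1) + a + b + c = a\alpha^2 - 2a\alpha + a + 2a\alpha - 2a + b\alpha - b + a + b + c = a\alpha^2 + b\alpha + c = 0. \]
Also $\gcd(a, 2a + b, a + b + c) = 1$ and a simple calculation shows that $\mathrm{disc}(\alpha - 1) = (2a + b)^2 - 4a(a + b + c) = b^2 - 4ac$. $\frac{1}{\alpha}$ is a root of $cx^2 + bx + a$ (divide everything by $\alpha^2$), which
obviously gives the same discriminant.
Because $\alpha$ is reduced, we have the following inequalities:
\[ \lfloor \alpha \rfloor < -\bar{\alpha} + \lfloor \alpha \rfloor < \lfloor \alpha \rfloor + 1, \]
\[ \frac{1}{\lfloor \alpha \rfloor + 1} < \frac{1}{-\bar{\alpha} + \lfloor \alpha \rfloor} < \frac{1}{\lfloor \alpha \rfloor}, \]
\[ -1 \leq -\frac{1}{\lfloor \alpha \rfloor} < \overline{\tau(\alpha)} < -\frac{1}{\lfloor \alpha \rfloor + 1} < 0. \]
So $\tau(\alpha)$ is also reduced. (Note that $\tau(\alpha) > 1$.)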
Lemma 5.0.10. There are only finitely many reduced quadratic irrationals
with a given discriminant $d$.
Proof Let $\alpha$ be a root of $ax^2 + bx + c$. Then
\[ \alpha \bar{\alpha} = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \cdot \frac{-b - \sqrt{b^2 - 4ac}}{2a} = \frac{b^2 - b^2 + 4ac}{4a^2} = \frac{c}{a}. \]
If $\alpha$ is reduced, then $\alpha \bar{\alpha} < 0$, so $\frac{c}{a} < 0$ and thus $c < 0$. But
\[ b^2 - 4ac = b^2 + 4a(-c) = d \]
with $b^2$ and $4a(-c)$ positive, so $a$, $b$, $c$ are all bounded.
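We are now ready to prove some classical results about reduced quadratic
irrationals.
Theorem 5.0.11. A reduced quadratic irrational $\alpha$ has a strictly periodic
continued fraction expansion. That is: $\alpha = [\overline{a_0; a_1, \ldots, a_{n-1}}]$
Proof The set of reduced quadratic irrationals with discriminant $d = \mathrm{disc}(\alpha)$ is
invariant under $\tau$. We show that the restriction of $\tau$ to this set is injective.
Suppose $\tau(\alpha_1) = \tau(\alpha_2)$, so
\[ \frac{1}{\alpha_1 - \lfloor \alpha_1 \rfloor} = \frac{1}{\alpha_2 - \lfloor \alpha_2 \rfloor}, \qquad \alpha_1 - \lfloor \alpha_1 \rfloor = \alpha_2 - \lfloor \alpha_2 \rfloor. \]
So also the conjugates are the same:
\[ \bar{\alpha}_1 - \lfloor \alpha_1 \rfloor = \bar{\alpha}_2 - \lfloor \alpha_2 \rfloor. \]
But since $-1 < \bar{\alpha}_1, \bar{\alpha}_2 < 0$ this implies $\lfloor \alpha_1 \rfloor = \lfloor \alpha_2 \rfloor$, and thus $\alpha_1 = \alpha_2$.
Because the set is finite there must occur repetition in the sequence
\[ \alpha,\ \tau(\alpha),\ \tau^2(\alpha),\ \ldots \]
Injectivity ensures this sequence is strictly periodic.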
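Proposition 5.0.12. If $\alpha$ is reduced with $\alpha = [\overline{a_0; a_1, \ldots, a_{n-1}}]$, then
$-\frac{1}{\bar{\alpha}} = [\overline{a_{n-1}; a_{n-2}, \ldots, a_0}]$.
Proof Let us write $\alpha_i = \tau^i(\alpha)$. Then
\[ \alpha = \alpha_0 = \lfloor \alpha_0 \rfloor + (\alpha_0 - \lfloor \alpha_0 \rfloor) = a_0 + \frac{1}{\alpha_1}. \]
Continuing like this gives
\[ \alpha_0 = a_0 + \frac{1}{\alpha_1},\quad \alpha_1 = a_1 + \frac{1}{\alpha_2},\quad \ldots,\quad \alpha_{n-1} = a_{n-1} + \frac{1}{\alpha_n} = a_{n-1} + \frac{1}{\alpha_0}, \]
so for the conjugates we have
\[ \bar{\alpha}_0 = a_0 + \frac{1}{\bar{\alpha}_1},\quad \bar{\alpha}_1 = a_1 + \frac{1}{\bar{\alpha}_2},\quad \ldots,\quad \bar{\alpha}_{n-1} = a_{n-1} + \frac{1}{\bar{\alpha}_n} = a_{n-1} + \frac{1}{\bar{\alpha}_0}. \]
Rewriting this gives
\[ -\frac{1}{\bar{\alpha}_1} = a_0 + (-\bar{\alpha}_0),\quad -\frac{1}{\bar{\alpha}_2} = a_1 + (-\bar{\alpha}_1),\quad \ldots,\quad -\frac{1}{\bar{\alpha}_n} = a_{n-1} + (-\bar{\alpha}_{n-1}) = -\frac{1}{\bar{\alpha}_0}. \]
So $a_{n-1}$ is the floor of $-\frac{1}{\bar{\alpha}}$ and
\[ \tau\left( -\frac{1}{\bar{\alpha}} \right) = -\frac{1}{\bar{\alpha}_{n-1}} \]
and so on.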
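We now turn our attention to a special quadratic irrational, being $\sqrt{d}$.
We need its continued fraction expansion to solve the Pell equation $x^2 - dy^2 = 1$.
Lemma 5.0.13. $\tau(\sqrt{d})$ is reduced.
Proof We have
\[ \tau(\sqrt{d}) = \frac{1}{\sqrt{d} - \lfloor \sqrt{d} \rfloor} > 1 \]
and
\[ \overline{\tau(\sqrt{d})} = \frac{1}{-\sqrt{d} - \lfloor \sqrt{d} \rfloor} = -\frac{1}{\sqrt{d} + \lfloor \sqrt{d} \rfloor}, \]
which lies between $-1$ and $0$.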
Theorem 5.0.14. The continued fraction expansion of $\sqrt{d}$ is of the form
\[ \sqrt{d} = [a_0; \overline{a_1, a_2, \ldots, a_2, a_1, 2a_0}]. \]
Proof $\tau(\sqrt{d}) = [\overline{a_1; a_2, \ldots, a_n}]$, so $\sqrt{d} = [a_0; \overline{a_1, a_2, \ldots, a_n}]$. Then
\[ \sqrt{d} + a_0 = [2a_0; \overline{a_1, \ldots, a_{n-1}, a_n}]. \]
But applying the previous proposition to $\tau(\sqrt{d})$ also gives
\[ \sqrt{d} + a_0 = [\overline{a_n; a_{n-1}, \ldots, a_1}]. \]
So $a_n = 2a_0$ and $a_i = a_{n-i}$ for $1 \leq i \leq n - 1$.
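The shape predicted by Theorem 5.0.14 is easy to observe experimentally. The sketch below uses the standard integer recurrence for the complete quotients $(\sqrt{d} + m)/c$ of $\sqrt{d}$; that recurrence is a well-known technique which is not derived in this chapter, and the function name `sqrt_cf` is chosen here for illustration.

```python
from math import isqrt


def sqrt_cf(d):
    """Return (a_0, period) with sqrt(d) = [a_0; period repeated]."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    period = []
    m, c, a = 0, 1, a0           # current complete quotient is (sqrt(d) + m) / c
    while True:
        m = c * a - m
        c = (d - m * m) // c
        a = (a0 + m) // c
        period.append(a)
        if c == 1:               # complete quotient is sqrt(d) + a_0 again
            return a0, period


a0, period = sqrt_cf(14)
print(a0, period)
# the period ends in 2*a_0 and is a palindrome apart from that last entry
assert period[-1] == 2 * a0
assert period[:-1] == period[:-1][::-1]
```

For $d = 14$ this gives $a_0 = 3$ and period $1, 2, 1, 6$.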
This brings us to the main theorem on this topic. We already know
that all solutions to the Pell equation can be found in the continued fraction
expansion of $\sqrt{d}$, but we can also specify where exactly they can be found.
We use some equalities we encountered earlier on.
Theorem 5.0.15. Let $\sqrt{d} = [a_0; \overline{a_1, \ldots, a_m}]$, where $m$ is the smallest period.
Then $(p_n, q_n)$ is a solution of $x^2 - dy^2 = \pm 1$ if and only if $m \mid n + 1$. Moreover,
it is a solution to $x^2 - dy^2 = (-1)^{n+1}$.
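Proof $\sqrt{d} = [a_0; a_1, \ldots, a_n, \alpha_{n+1}]$, so
\[ \sqrt{d} = \frac{p_n \alpha_{n+1} + p_{n-1}}{q_n \alpha_{n+1} + q_{n-1}}. \]
Substituting $\sqrt{d}$ gives
\[ p_n - q_n \sqrt{d} = \frac{p_n q_n \alpha_{n+1} + p_n q_{n-1} - p_n q_n \alpha_{n+1} - q_n p_{n-1}}{q_n \alpha_{n+1} + q_{n-1}} = \frac{(-1)^{n+1}}{q_n \alpha_{n+1} + q_{n-1}}. \]
Since $p_n^2 - d q_n^2 = (p_n - \sqrt{d}\, q_n)(p_n + \sqrt{d}\, q_n)$ and $p_n + \sqrt{d}\, q_n$ is positive, we now
know that if $(p_n, q_n)$ is a solution, it is a solution to $x^2 - dy^2 = (-1)^{n+1}$.
Also, by eliminating the denominator, we get
\[ \sqrt{d}\,(q_n \alpha_{n+1} + q_{n-1}) = p_n \alpha_{n+1} + p_{n-1}, \]
so
\[ (p_n - q_n \sqrt{d})\, \alpha_{n+1} = q_{n-1} \sqrt{d} - p_{n-1}. \]
Multiplying by $p_n + q_n \sqrt{d}$ gives
\[
\begin{aligned}
(p_n^2 - d q_n^2)\, \alpha_{n+1} &= (q_{n-1} \sqrt{d} - p_{n-1})(p_n + q_n \sqrt{d}) \\
&= (p_n q_{n-1} - p_{n-1} q_n) \sqrt{d} + k \\
&= (-1)^{n+1} \sqrt{d} + k
\end{aligned}
\]
for some $k \in \mathbb{Z}$.
$\Rightarrow$: If $(p_n, q_n)$ is a solution, then $(-1)^{n+1} \alpha_{n+1} = (-1)^{n+1} \sqrt{d} + k$, so
$\alpha_{n+1} = \sqrt{d} \pm k$. Then
\[ \alpha_{n+2} = \frac{1}{\alpha_{n+1} - \lfloor \alpha_{n+1} \rfloor} = \frac{1}{\sqrt{d} - \lfloor \sqrt{d} \rfloor} = \alpha_1. \]
So $m \mid n + 1$.
$\Leftarrow$: Suppose $m \mid n + 1$. Then $\sqrt{d} = [a_0; \overline{a_1, \ldots, a_{n+1}}]$ and $a_{n+1} = 2a_0$. So
\[
\begin{aligned}
\sqrt{d} &= [a_0; a_1, \ldots, a_{n+1}, \overline{a_1, \ldots, a_{n+1}}] \\
&= [a_0; a_1, \ldots, a_n, a_0 + \sqrt{d}] \\
&= \frac{p_n (a_0 + \sqrt{d}) + p_{n-1}}{q_n (a_0 + \sqrt{d}) + q_{n-1}}.
\end{aligned}
\]
Multiplying by the denominator and rearranging gives
\[ (q_n a_0 + q_{n-1} - p_n) \sqrt{d} = p_n a_0 + p_{n-1} - d q_n. \]
Since $\sqrt{d}$ is irrational, both $q_n a_0 + q_{n-1} - p_n$ and $p_n a_0 + p_{n-1} - d q_n$ must
be equal to 0. Multiplying by $p_n$ and $q_n$ respectively gives
\[ p_n q_n a_0 + p_n q_{n-1} - p_n^2 = 0, \qquad p_n q_n a_0 + q_n p_{n-1} - d q_n^2 = 0. \]
By subtracting these equalities we get
\[ p_n^2 - d q_n^2 + p_{n-1} q_n - p_n q_{n-1} = 0, \]
so
\[ p_n^2 - d q_n^2 = p_n q_{n-1} - p_{n-1} q_n = (-1)^{n+1}. \]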
Example 5.0.16. Let us solve the equation $x^2 - 14y^2 = 1$.
Calculating the continued fraction expansion of $\sqrt{14}$ gives us:
\[ \sqrt{14} = [3; \overline{1, 2, 1, 6}]. \]
Notice that this indeed is of the form we proved it must have. Since the
length of the period is 4, we already know that the equation $x^2 - 14y^2 = -1$
has no solutions. As seen earlier, we can determine $p_n$ and $q_n$ as follows:
n   : -2  -1   0   1   2   3    4    5    6    7
a_n :          3   1   2   1    6    1    2    1
p_n :  0   1   3   4  11  15  101  116  333  449
q_n :  1   0   1   1   3   4   27   31   89  120
The smallest solution to the equation can be found in the table under $n = 3$:
$15^2 - 14 \cdot 4^2 = 1$. $n = 7$ gives us the next solution: $449^2 - 14 \cdot 120^2 = 1$.
Continuing like this, we see that the solutions grow rather fast; the sixth
solution that will be found is $362074049^2 - 14 \cdot 96768360^2 = 1$.
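The table in this example can be generated mechanically. The following Python sketch walks through the convergents $p_n/q_n$ of $\sqrt{d}$ and collects those with $p_n^2 - d q_n^2 = 1$. The partial quotients come from the standard integer recurrence for $\sqrt{d}$, which is an assumption of this sketch rather than something derived above, and `pell_solutions` is an illustrative name.

```python
from math import isqrt


def pell_solutions(d, count):
    """First `count` positive solutions (x, y) of x^2 - d*y^2 = 1."""
    a0 = isqrt(d)
    if a0 * a0 == d:
        raise ValueError("d must not be a perfect square")
    m, c, a = 0, 1, a0           # state of the expansion of sqrt(d)
    p_prev, p = 1, a0            # p_(-1), p_0
    q_prev, q = 0, 1             # q_(-1), q_0
    found = []
    while len(found) < count:
        if p * p - d * q * q == 1:
            found.append((p, q))
        # next partial quotient a_(n+1)
        m = c * a - m
        c = (d - m * m) // c
        a = (a0 + m) // c
        # next convergent: p_(n+1) = a_(n+1) p_n + p_(n-1), same for q
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return found


for x, y in pell_solutions(14, 6):
    print(x, y)
```

For $d = 14$ the first two pairs printed are $(15, 4)$ and $(449, 120)$, matching the columns $n = 3$ and $n = 7$ of the table.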
Remark 5.0.17. Obviously this method works fine for finding small solutions, but if you are interested in, let's say, the 37th smallest solution, the
calculation gets quite nasty. However, if $(x, y)$ is a solution of $x^2 - dy^2 = \pm 1$,
then $x + \sqrt{d}\, y$ is a unit of the ring $\mathbb{Z}[\sqrt{d}]$. In algebraic number theory,
Dirichlet's unit theorem states that the group of units of $\mathbb{Z}[\sqrt{d}]$ is of the
form $\langle -1, \varepsilon \rangle$, with $\varepsilon$ of infinite order. For our use this means that once you
found the first solution (also called the fundamental solution) $(p_{m-1}, q_{m-1})$,
you get all other solutions by calculating the unit $(p_{m-1} + \sqrt{d}\, q_{m-1})^k$ for
$k \in \mathbb{N}^*$.
Chapter 6
Markov numbers
Chris Kooloos
Consider the following equation of integers:
\[ m^2 + m_1^2 + m_2^2 = 3 m m_1 m_2 \]
We will call this the Markov equation.
The solutions of this equation are triples of integers, and we consider only
triples of positive integers. The numbers that occur in such a solution are
called Markov numbers. The first two solutions are $(1, 1, 1)$ and $(1, 1, 2)$.
These two are called singular solutions.
Claim 2. The singular solutions are the only solutions where $m, m_1, m_2$ are
not distinct.
Proof:
Let $(m, m_1, m_2)$ be a solution and assume $m_1 = m_2$.
Then we have
\[ m^2 + 2m_1^2 = 3 m m_1^2, \qquad m^2 = (3m - 2) m_1^2, \qquad m_1 \mid m. \]
There is a $k \in \mathbb{N}$ such that $m = k m_1$.
We have:
\[ k^2 m_1^2 + m_1^2 + m_1^2 = 3 k m_1 \cdot m_1 \cdot m_1, \qquad \text{that is,} \qquad k^2 + 2 = 3 k m_1. \]
Hence $k \mid 2$. Thus, $k = 1$ or $k = 2$, and in both cases $m_1 = 1$, which leads us to conclude that we started with one
of the singular solutions.
From a given solution $(m, m_1, m_2)$ we can obtain three other solutions, which
we will call its neighbours, namely:
\[ (m', m_1, m_2), \quad (m, m'_1, m_2), \quad (m, m_1, m'_2) \]
defined by:
\[ m' = 3 m_1 m_2 - m, \qquad m'_1 = 3 m m_2 - m_1, \qquad m'_2 = 3 m m_1 - m_2. \]
Indeed, if $m^2 + m_1^2 + m_2^2 = 3 m m_1 m_2$, then:
\[
\begin{aligned}
m'^2 + m_1^2 + m_2^2 &= 9 m_1^2 m_2^2 - 6 m m_1 m_2 + m^2 + m_1^2 + m_2^2 \\
&= 9 m_1^2 m_2^2 - 3 m m_1 m_2 \\
&= 3 m_1 m_2 (3 m_1 m_2 - m) = 3 m_1 m_2 m'
\end{aligned}
\]
and similarly for $m'_1$ and $m'_2$.
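The neighbour construction lends itself to a small experiment. The Python sketch below (function names invented for this illustration) starts from $(1, 1, 1)$, repeatedly takes neighbours, and collects all Markov numbers up to a bound. It relies on the fact, proved in the theorem that follows, that moving away from $(1, 1, 1)$ only increases the largest element, so triples whose maximum exceeds the bound can be discarded.

```python
def neighbours(triple):
    """The three neighbours of a solution (m, m1, m2) of the Markov equation."""
    m, m1, m2 = triple
    return [(3 * m1 * m2 - m, m1, m2),
            (m, 3 * m * m2 - m1, m2),
            (m, m1, 3 * m * m1 - m2)]


def is_markov_triple(triple):
    m, m1, m2 = triple
    return m * m + m1 * m1 + m2 * m2 == 3 * m * m1 * m2


def markov_numbers(limit):
    """All Markov numbers <= limit, found by walking the tree from (1, 1, 1)."""
    seen = set()
    stack = [(1, 1, 1)]
    while stack:
        t = tuple(sorted(stack.pop()))    # order inside a triple does not matter
        if t in seen or t[2] > limit:
            continue
        seen.add(t)
        stack.extend(neighbours(t))
    return sorted({x for t in seen for x in t})


print(markov_numbers(100))
```

With the bound 100 the walk visits the triples $(1,1,1)$, $(1,1,2)$, $(1,2,5)$, $(1,5,13)$, $(2,5,29)$, $(1,13,34)$ and $(1,34,89)$.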
Theorem 6.0.18. 1. All of the positive integer solutions $(m, m_1, m_2)$ of
the Markov equation are obtained by successively taking neighbours,
starting with $(1, 1, 1)$.
2. The three integers in any solution are pairwise relatively prime.
Proof:
(i): Consider the quadratic polynomial $f$ given by:
\[ f(x) = x^2 - 3 m_1 m_2 x + m_1^2 + m_2^2. \]
If $m \in \mathbb{Z}_{>0}$ is a root of $f$ (i.e. $(m, m_1, m_2)$ is a solution of the Markov equation),
then so is $3 m_1 m_2 - m$, and these are the only two roots. If $(m, m_1, m_2)$ is
a nonsingular solution, with say $m_1 > m_2$, then:
\[ f(m_1) = 2 m_1^2 - 3 m_2 m_1^2 + m_2^2 < 3 m_1^2 - 3 m_2 m_1^2 \leq 0. \]
Also, $f(m_1) = (m_1 - m)(m_1 - m')$. So we have $(m_1 - m)(m_1 - m') < 0$.
Now we drop the assumption $m_1 > m_2$.
Suppose that $m$ is the largest of $m, m_1, m_2$. Then $\max(m_1, m_2) - m < 0$, so
$\max(m_1, m_2) > m'$, i.e. $m' < \max(m_1, m_2) < m$. We also have:
\[
\begin{aligned}
m'_1 &= 3 m m_2 - m_1 > 3m - m > m \\
m'_2 &= 3 m m_1 - m_2 > 3m - m > m.
\end{aligned}
\]
We can conclude that for any nonsingular solution of the Markov equation, there is one
neighbour with a smaller maximal element and there are two neighbours
with a larger maximal element. Given any nonsingular solution, we can
walk backwards from this solution through successive neighbours with smaller
maximal elements, a process which must terminate in a singular solution.
We observe that $(1, 1, 1)$ has only one neighbour, which in its turn has only
two neighbours: $(1, 1, 1)$ and $(1, 2, 5)$. Thus, the solutions can be arranged
in a tree (Figure 6.1).
(ii): Suppose that $(m, m_1, m_2)$ is a solution where two of the integers have
a common divisor $d$. Looking at the Markov equation, we see that $d$ must
also be a divisor of the third integer. Therefore, $d$ will also be a divisor of
the neighbours. Walking back in the tree we conclude that $d$ must equal 1.
Quadratic Forms
Quadratic forms are homogeneous polynomials of degree two.
We will consider real indefinite quadratic forms:
\[ f(x, y) = ax^2 + bxy + cy^2 \]
where the discriminant $d(f) = b^2 - 4ac$ is positive. Such an $f$ has two roots.
We define, for two quadratic forms $f$ and $g$: $f \sim g$ if and only if we can
obtain $f$ out of $g$ by applying a linear transformation $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ to $\begin{pmatrix} x \\ y \end{pmatrix}$
with integer coefficients and determinant $\pm 1$.
If $f \sim g$ then $f(x, y) = g(\alpha x + \beta y, \gamma x + \delta y)$ and $d(f) = d(g)$.
One may easily check that $\sim$ is an equivalence relation. We define:
\[ \mu(f) = \inf_{(x,y) \in \mathbb{Z}^2 \setminus (0,0)} |f(x, y)| \]
If $f \sim g$ then $\mu(f) = \mu(g)$. For all $\lambda \neq 0$, we have $\mu(\lambda f) = |\lambda| \mu(f)$ and
$d(\lambda f) = \lambda^2 d(f)$. Therefore, the value of $\frac{\mu(f)}{\sqrt{d(f)}}$ is the same for all equivalent
forms and non-zero multiples of them.
The Markov spectrum is the set of all possible values for $\frac{\mu(f)}{\sqrt{d(f)}}$.
An automorph of a quadratic form is a linear transformation with determinant $+1$ which leaves the form unaltered.
A quadratic form $f(x, y) = ax^2 + bxy + cy^2$ is primitive if its coefficients
have no common divisor; it is reduced if its roots $\xi$ and $\eta$ satisfy $1 > \xi > 0$
and $\eta < -1$.
Theorem 6.0.19. If a quadratic form $f(x, y) = ax^2 + bxy + cy^2$ has integer
coefficients and is primitive, then the following are equivalent:
1. $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ is an automorph of $f$.
2. $\alpha = \frac{r - bs}{2}$, $\beta = -cs$, $\gamma = as$, $\delta = \frac{r + bs}{2}$
where $r, s$ is an integer solution of $r^2 - d(f) s^2 = 4$.
Proof:
Both directions follow immediately from the given definitions.
Figure 6.1: A part of the Markov tree
Theorem 6.0.20. Let $f(x, y) = ax^2 + bxy + cy^2$ be a quadratic form with
integer coefficients which is primitive and reduced. Let the continued fraction expansion of the root $\xi$, which is purely periodic by Galois, be given by
\[ \xi = [0; \overline{a_1, a_2, \ldots, a_n}], \]
where $n$ is taken to be even and $a_1, a_2, \ldots, a_n$ is the shortest period. Let
$\frac{p_1}{q_1}, \frac{p_2}{q_2}, \ldots$ be the sequence of convergents of the continued fraction expansion
of $\xi$.
If we define
\[ \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} q_{n-1} & q_n \\ p_{n-1} & p_n \end{pmatrix}, \]
we have determinant: $p_n q_{n-1} - q_n p_{n-1} = (-1)^n = 1$.
Furthermore this is an automorph of $f$, which satisfies:
\[ (\ast) \qquad \alpha = \frac{r - bs}{2}, \quad \beta = -cs, \quad \gamma = as, \quad \delta = \frac{r + bs}{2} \]
where $r, s$ is the least positive integer solution of $r^2 - d(f) s^2 = 4$.
(sketch of) Proof:
$\xi = [0; a_1, \ldots, a_n, \xi^{-1}]$, so
\[ \xi = \frac{p_n \xi^{-1} + p_{n-1}}{q_n \xi^{-1} + q_{n-1}} = \frac{p_n + p_{n-1} \xi}{q_n + q_{n-1} \xi}. \]
This can be rewritten as $\gamma \xi^2 + (\delta - \alpha) \xi - \beta = 0$, so $\xi$ is also a root of
$g(x, y) = \gamma x^2 + (\delta - \alpha) xy - \beta y^2$ and therefore $g(x, y)$ has integer coefficients
which are multiples of the coefficients of $f$, say $\gamma = as$, $\delta - \alpha = bs$, $-\beta = cs$.
Because $f$ is primitive, $s$ must be an integer. If we define $r = \alpha + \delta$, one may
show that the equations in $(\ast)$ hold, and conclude that we have an automorph
of $f$. Since this integer solution $r, s$ is associated with the shortest period
of $\xi$, the theory of Pell equations tells us that it is indeed the least positive
integer solution.
Markov Forms
Let $(m, m_1, m_2)$ be a nonsingular solution with $m$ the largest of the integers
$m, m_1, m_2$. From Theorem 6.0.18 we know that $m$ and $m_1$ are relatively prime.
By Euclid's algorithm we can find integers $w, u$ such that $wm + u m_1 = m_2$
and therefore $u m_1 \equiv_m m_2$ with $u < m$. From the Markov equation we see
that $m_1^2 + m_2^2 \equiv_m 0$, from which we can show that $u m_2 \equiv_m -m_1$. Observe
that:
\[ m_1^2 (u^2 + 1) = m_1^2 u^2 + m_1^2 \equiv_m m_2^2 + m_1^2 \equiv_m 0 \]
So, since $m$ and $m_1$ are relatively prime, we can find $v < m$ such that $mv = u^2 + 1$.
We consider the quadratic form:
\[ f(x, y) = mx^2 + (3m - 2u)xy + (v - 3u)y^2. \]
If $u'$ is another integer such that $0 < u' < m$ and $u' m_2 \equiv_m m_1$,
then $(u + u') m_2 \equiv_m -m_1 + m_1 \equiv_m 0$, hence $u'$ must be equal to $m - u$.
Considering the quadratic form $f'$ which we obtain by using this $u'$ instead
of $u$, we see that
\[ f'(x, y) = mx^2 + (3m - 2u')xy + (v' - 3u')y^2 = f(x + 2y, -y) \]
so $f \sim f'$. Thus, we can define the quadratic form associated to a solution
$(m, m_1, m_2)$ as above. Also, this form only depends (up to equivalence)
upon the largest element $m$ and we can assume $0 \leq 2u \leq m$.
The Markov Form associated to a solution $(m, m_1, m_2)$ is defined by:
\[ f_m(x, y) = mx^2 + (3m - 2u)xy + (v - 3u)y^2 \]
where $u$ and $v$ satisfy $m_1 u \equiv_m m_2$, $0 \leq 2u \leq m$ and $mv = u^2 + 1$.
Theorem 6.0.21. Let $(m, m_1, m_2)$ be a solution and $f_m(x, y)$ its Markov
form.
1. The discriminant $d(f_m)$ is $9m^2 - 4$.
2. The form $f_m(x, y)$ is properly equivalent to $-f_m(x, y)$.
3. $\mu(f_m) = f_m(1, 0) = m$.
4. If $\xi$ and $\eta$ denote the roots of $f_m(x, 1)$ with $\xi > \eta$, then $1 > \xi > 0$, $\eta < -1$ and $\xi$ has a continued fraction expansion of the form $[0; \overline{a_1, \ldots, a_k}]$.
5. If $m$ is odd, then $f_m$ is primitive.
If $m$ is even, then $\frac{1}{2} f_m$ has integer coefficients and is primitive.
Proof of (iv):
$f_m(x, 1) = mx^2 + (3m - 2u)x + v - 3u$, so the roots are:
\[ \xi = \frac{-3m + 2u + \sqrt{9m^2 - 4}}{2m} \quad \text{and} \quad \eta = \frac{-3m + 2u - \sqrt{9m^2 - 4}}{2m} \]
We have:
\[ 0 < \sqrt{9m^2 - 4} - 3m + 2u < 2u < 2m < \sqrt{9m^2 - 4} + 3m - 2u \]
The first inequality: $(\sqrt{9m^2 - 4} + 2u)^2 = 9m^2 - 4 + 4u^2 + 4u\sqrt{9m^2 - 4} > (3m)^2$.
The second inequality: $\sqrt{9m^2 - 4} < \sqrt{9m^2} = 3m$.
The third inequality: definition of $f_m$.
The fourth inequality: $m > 2u$.
So we have: $1 > \xi > 0$ and $\eta < -1$.
$-\eta$ is a so-called reduced irrational and therefore has a purely periodic
continued fraction expansion; $-\eta = [\overline{a_0; a_1, \ldots, a_k}]$.
From this we see that $\frac{1}{\xi} = [\overline{a_k; a_{k-1}, \ldots, a_0}]$.
$\xi = 0 + \frac{1}{1/\xi}$, so $\xi = [0; \overline{a_k, \ldots, a_0}]$, which is of the stated form after renumbering.
Continuants
For a finite sequence of integers $a_1, a_2, \ldots, a_n$, the continuant $K(a_1, a_2, \ldots, a_n)$
is defined to be the denominator of the continued fraction $[0; a_1, a_2, \ldots, a_n]$.
So we have $K(a_1) = a_1$, because $[0; a_1] = 0 + \frac{1}{a_1} = \frac{1}{a_1}$.
$K(a_1, a_2) = a_1 a_2 + 1$, for $[0; a_1, a_2] = 0 + \cfrac{1}{a_1 + \cfrac{1}{a_2}} = \frac{a_2}{a_1 a_2 + 1}$.
$K(a_1, \ldots, a_n) = q_n = a_n K(a_1, \ldots, a_{n-1}) + K(a_1, \ldots, a_{n-2})$, for $n \geq 3$.
In some texts, these continuants are also called Euler polynomials. This
is because we can obtain $K(a_1, \ldots, a_n)$ by taking the sum of all possible products of $a_1, \ldots, a_n$ in which any number of disjoint pairs of consecutive terms are
deleted. From this we can conclude that reversing the order of a finite sequence does not change the value of its continuant. That is, $K(a_1, \ldots, a_n) = K(a_n, \ldots, a_1)$.
So we also have the recursion relation:
\[ K(a_1, \ldots, a_n) = a_1 K(a_2, \ldots, a_n) + K(a_3, \ldots, a_n) \quad \text{for } n \geq 3 \]
which can be extended to the more general formula
\[ K(a_1, \ldots, a_n) = K(a_1, \ldots, a_m) K(a_{m+1}, \ldots, a_n) + K(a_1, \ldots, a_{m-1}) K(a_{m+2}, \ldots, a_n) \quad \text{for } 1 \leq m < n. \]
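The recursion and the two identities above can be checked on concrete sequences. In the Python sketch below, the function name `K` mirrors the notation of the text; the convention that the empty continuant equals 1 is an assumption added here so that the splitting formula also makes sense for $m = 1$ and $m = n - 1$.

```python
def K(*a):
    """Continuant K(a_1, ..., a_n): denominator of [0; a_1, ..., a_n]; K() = 1."""
    prev, cur = 0, 1                     # values for the "length -1" and empty sequences
    for x in a:
        prev, cur = cur, x * cur + prev  # K(.., x) = x * K(..) + K(.. minus last term)
    return cur


seq = (2, 1, 1, 2)
print(K(*seq))                           # denominator of [0; 2, 1, 1, 2] = 5/13

# reversing the sequence does not change the continuant
assert K(*seq) == K(*reversed(seq))

# splitting formula for every 1 <= m < n
n = len(seq)
for m in range(1, n):
    left = K(*seq[:m]) * K(*seq[m:])
    right = K(*seq[:m - 1]) * K(*seq[m + 1:])
    assert K(*seq) == left + right
```

The sequence $(2, 1, 1, 2)$ is the symmetric expansion of $13/5$ from the chapter on sums of squares, so the continuant printed is 13.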
Lemma 6.0.22. Suppose that the positive integers $m, u, v$ satisfy
\[ m \geq 2, \quad m > v \quad \text{and} \quad mv - u^2 = 1. \]
If we expand $\frac{m}{u}$ and $\frac{m}{m-u}$ as continued fractions with an even number of
partial quotients, then these continued fractions are symmetric. Moreover,
there exist unique positive integers $a_1, \ldots, a_n$ such that the following continuant formulas hold:
\[
\begin{aligned}
u &= K(a_1, \ldots, a_n, a_n, \ldots, a_2) = K(a_2, \ldots, a_n, a_n, \ldots, a_1) \\
m &= K(a_1, \ldots, a_n, a_n, \ldots, a_1)
\end{aligned}
\]
Proof:
From my earlier lecture "Sums of squares" we know that the continued fraction expansion of $\frac{m}{u}$ is symmetric.
Let's say $\frac{m}{u} = [a_1; \ldots, a_n, a_n, \ldots, a_1]$.
Then by definition, $m = K(a_1, \ldots, a_n, a_n, \ldots, a_1)$.
$u$ is the denominator of $[a_1; \ldots, a_n, a_n, \ldots, a_1]$, which is also the denominator of $[0; a_2, \ldots, a_n, a_n, \ldots, a_1]$. So $u$ is the numerator of $[a_2; a_3, \ldots, a_1]$,
which is $K(a_2, \ldots, a_2, a_1) = K(a_1, a_2, \ldots, a_2)$. To prove that $\frac{m}{m-u}$ also has
a symmetric continued fraction expansion with an even number of partial
quotients, we observe that if we replace $v$ by $(m - 2u + v)$ in the prerequisites
and $u$ by $m - u$, we get:
\[ m > m - 2u + v \]
and
\[ m(m - 2u + v) - (m - u)^2 = m^2 - 2um + mv - m^2 + 2um - u^2 = mv - u^2 = 1. \]
Theorem 6.0.23. Let $m > 2$, $u$, and $v$ be the integers in the definition of a Markov form. Then the positive root $\alpha_m$ of $f_m(x, 1)$ has a continued fraction expansion whose period has an even number of digits, say $\alpha_m = [0; \overline{a_1, \dots, a_{2n}}]$.
Furthermore,
$$m = K(a_1, \dots, a_{2n-1}), \qquad u = K(a_2, \dots, a_{2n-1})$$
and we also have: $a_1 = a_{2n} = 2$, $a_{2n-2} = a_{2n-1} = 1$, and the sequence $a_2, \dots, a_{2n-3}$ is symmetric.
Proof:
We set $\alpha_m = [0; \overline{a_1, \dots, a_j}]$, $d = 9m^2 - 4$.
If $m$ is odd, then $f_m$ is primitive and the least solution of $r^2 - ds^2 = 4$ is $r = 3m$, $s = 1$. If $m$ is even then $\frac12 f_m(x, y)$ is primitive and the least solution of $r^2 - \frac14 ds^2 = 4$ is $r = 3m$, $s = 2$.
In both cases we use theorem 3 to get an automorphism of $f$ with:
$$\alpha = \frac{3m - (3m - 2u)}{2} = u, \qquad \beta = -(v - 3u) \cdot 1 = 3u - v,$$
$$\gamma = m, \qquad \delta = \frac{3m + (3m - 2u)}{2} = 3m - u,$$
and also:
$$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} q_{j-1} & q_j \\ p_{j-1} & p_j \end{pmatrix},$$
with $\frac{p_i}{q_i}$ the partial convergents of the continued fraction expansion of $\frac{1}{\alpha_m}$. So we have $(\ast)$:
$$u = q_{j-1} = K(a_2, \dots, a_{j-1}), \qquad 3u - v = q_j = K(a_2, \dots, a_j),$$
$$m = p_{j-1} = K(a_1, \dots, a_{j-1}), \qquad 3m - u = p_j = K(a_1, \dots, a_j).$$
Observe that $f_m(0, 1) = v - 3u = -(3u - v)$, so with theorem 4 part (iii) we see that $m < 3u$. We already knew that $2u < m$, so now we have $2u < m < 3u$. This fact together with $mv - u^2 = 1$ gives us $2v < u < 3v$.
From $(\ast)$ we see that $\frac{m}{u} = [a_1; a_2, \dots, a_{j-1}]$, so $a_1 = 2$. Moreover
$$\frac{3m - u}{m} = \frac{K(a_1, \dots, a_j)}{K(a_1, \dots, a_{j-1})} = a_j + \frac{K(a_1, \dots, a_{j-2})}{K(a_1, \dots, a_{j-1})} = \dots = [a_j; a_{j-1}, \dots, a_1].$$
Together with $\frac{3m - u}{m} = 3 - \frac{u}{m}$, this gives us $a_j = 2$.
$$(\ast\ast) \qquad \frac{3m - u}{m} = [2; a_{j-1}, \dots, a_1] = 2 + \frac{1}{[a_{j-1}; \dots, a_1]}$$
From $(\ast)$ we have $\frac{3u - v}{u} = [a_j; a_{j-1}, \dots, a_2] = [2; a_{j-1}, \dots, a_2]$, hence
$$\frac{3u - v - 2u}{u} = \frac{1}{[a_{j-1}; \dots, a_2]} \quad\text{so}\quad \frac{u}{u - v} = [a_{j-1}; \dots, a_2].$$
If $j$ were odd, then by lemma 1 and $(\ast\ast)$ we would have a symmetric continued fraction $\frac{m}{m-u} = [a_{j-1}; \dots, a_1]$, which would imply $a_{j-1} = a_1 = 2$. However, $\frac{u}{u - v} = [a_{j-1}; \dots, a_2]$ then leads to a contradiction, because $u > 2v$ gives $\frac{u}{u-v} < 2$. So $j$ must be even, let's say $j = 2n$. From $(\ast\ast)$ and the fact that $a_1 = 2$ we see that we can write
$$\frac{m}{m - u} = [a_{2n-1}; a_{2n-2}, \dots, a_2, 1, 1]$$
which has an even number of terms. Applying lemma 1, we conclude that $a_{2n-2} = a_{2n-1} = 1$ and $a_2, \dots, a_{2n-3}$ is symmetric.
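The statement of the theorem can be tested on small cases. The sketch below computes the period of the positive root of $mx^2 + (3m - 2u)x + (v - 3u)$ with the standard integer recurrence for quadratic irrationals written as $(P + \sqrt{D})/Q$; that recurrence, the function names and the sample triples $(m, u, v) = (5, 2, 1), (13, 5, 2), (29, 12, 5)$ are assumptions made for illustration and are not taken from the text above.

```python
from math import isqrt


def continuant(seq):
    """K(a_1, ..., a_n), with K() = 1."""
    prev, cur = 0, 1
    for a in seq:
        prev, cur = cur, a * cur + prev
    return cur


def markov_root_period(m, u, v):
    """Period (a_1, ..., a_j) of the positive root of m x^2 + (3m-2u) x + (v-3u)."""
    assert m * v - u * u == 1 and 2 * u < m
    b = 3 * m - 2 * u
    D = 9 * m * m - 4                  # discriminant; never a perfect square
    # 1/alpha_m = (b + sqrt(D)) / (2(3u - v)) is reduced, hence purely periodic.
    P, Q = b, 2 * (3 * u - v)
    start = (P, Q)
    period = []
    while True:
        a = (P + isqrt(D)) // Q        # floor of the current complete quotient
        period.append(a)
        P = a * Q - P
        Q = (D - P * P) // Q
        if (P, Q) == start:            # complete quotient repeats: period found
            return period


period_5 = markov_root_period(5, 2, 1)
period_13 = markov_root_period(13, 5, 2)
period_29 = markov_root_period(29, 12, 5)
```

For each of these triples one can then compare `continuant(period[:-1])` with $m$ and `continuant(period[1:-1])` with $u$, as in $(\ast)$.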
Chapter 7
The nearest integer
continued fraction
Willem van Loon
If we calculate the continued fraction expansion of $\pi$, we get:
$$\pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cdots}}}$$
In the third step, we arrive at 1/0.06... = 15.996..., so normally we would take 15 for our expansion. But this number lies much closer to 16, so maybe it would make more sense to choose 16 for our expansion instead of 15. If we do this consistently, that is, always taking the nearest integer of the remaining fraction, we get the nearest integer continued fraction expansion.
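Definition 7.0.24. The Nearest Integer Continued Fraction (NICF) operator $T_{1/2} : [-\frac12, \frac12) \to [-\frac12, \frac12)$ is defined by
$$T_{1/2}(x) := \left|\frac{1}{x}\right| - \left\lfloor \left|\frac{1}{x}\right| + \frac12 \right\rfloor, \quad x \ne 0,$$
with
$$x = b_0 + \cfrac{\varepsilon_1}{b_1 + \cfrac{\varepsilon_2}{b_2 + \cdots}}, \quad\text{and}\quad \varepsilon_n = \operatorname{sgn}\left(T_{1/2}^{n-1}(x - b_0)\right).$$
$b_0$ is hereby chosen such that $x - b_0 \in [-\frac12, \frac12)$.
Lemma 7.0.25. $[b_0; \varepsilon_1 b_1, \varepsilon_2 b_2, \dots]$ is the NICF-expansion of a number $a \in \mathbb{R}$ $\iff$ $b_n \ge 2$ and $b_n + \varepsilon_{n+1} \ge 2$ for all $n \ge 1$.
Proof. $-\frac12 \le T_{1/2}^{n-1}(x - b_0) < \frac12$, so
$$b_n = \left\lfloor \left|\frac{1}{T_{1/2}^{n-1}(x - b_0)}\right| + \frac12 \right\rfloor \ge \left\lfloor 2 + \frac12 \right\rfloor = 2.$$
Furthermore, if $\varepsilon_{n+1} = -1$, then
$$T_{1/2}^{n}(x - b_0) = \left|\frac{1}{T_{1/2}^{n-1}(x - b_0)}\right| - \left\lfloor \left|\frac{1}{T_{1/2}^{n-1}(x - b_0)}\right| + \frac12 \right\rfloor < 0.$$
Since $\left|\frac{1}{T_{1/2}^{n-1}(x - b_0)}\right| \ge 2$, it follows that $b_n \ge 3$, so $b_n + \varepsilon_{n+1} \ge 2$. The other direction is easy to prove.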
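To see the operator at work, here is a small Python sketch that expands a rational number with exact arithmetic. The function name `nicf` and the test value 103993/33102 (a rational approximation of $\pi$ with regular expansion $[3; 7, 15, 1, 292]$) are our own choices; for a rational input the expansion terminates as soon as the remainder is 0.

```python
from fractions import Fraction
from math import floor


def nicf(x, max_digits=50):
    """Return (b_0, [(eps_1, b_1), (eps_2, b_2), ...]) for a rational x."""
    x = Fraction(x)
    b0 = floor(x + Fraction(1, 2))        # so that x - b0 lies in [-1/2, 1/2)
    t = x - b0
    digits = []
    while t != 0 and len(digits) < max_digits:
        eps = 1 if t > 0 else -1          # eps_n = sgn(T^{n-1}(x - b0))
        y = abs(1 / t)
        b = floor(y + Fraction(1, 2))     # nearest integer to |1/t|
        digits.append((eps, b))
        t = y - b                         # T_{1/2} applied to t
    return b0, digits


b0, digits = nicf(Fraction(103993, 33102))
```

Here `digits` comes out as `[(1, 7), (1, 16), (-1, 293)]`, in agreement with the conditions $b_n \ge 2$ and $b_n + \varepsilon_{n+1} \ge 2$ of the lemma.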
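Definition 7.0.26. Let $x$ be an irrational number and let $[b_0; \varepsilon_1 b_1, \varepsilon_2 b_2, \dots]$ be the continued fraction expansion of $x$. Suppose $b_{l+1} = 1$ and $\varepsilon_{l+1} = \varepsilon_{l+2} = 1$. Then the transformation
$$[b_0; \varepsilon_1 b_1, \dots, \varepsilon_l b_l, \varepsilon_{l+1} b_{l+1}, \varepsilon_{l+2} b_{l+2}, \varepsilon_{l+3} b_{l+3}, \dots]$$
$$\to [b_0; \varepsilon_1 b_1, \dots, \varepsilon_l (b_l + 1), -(b_{l+2} + 1), \varepsilon_{l+3} b_{l+3}, \dots]$$
is called the singularization of $b_{l+1}$.
For example, if we singularize the expansion of $\pi$, we get
$$\pi = [3; 7, 15, 1, 292, 1, \dots] = [3; 7, 16, -293, 1, \dots]$$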
We haven't shown yet that the new expansion is the same number as the old one. To prove this, define $\frac{r_k}{s_k} = [b_0; \varepsilon_1 b_1, \dots, \varepsilon_k b_k]$, the $k$-th convergent of $x$ before the singularization. In the same way, $\frac{c_k}{d_k}$ is defined to be the $k$-th convergent of $x$ after singularizing. We now have to prove that $\frac{r_{l+1}}{s_{l+1}} = \frac{c_l}{d_l}$ and $\frac{r_{l+2}}{s_{l+2}} = \frac{c_{l+1}}{d_{l+1}}$. Then the rest would follow, because from here on the expansion stays the same. If you look at the definition of singularizing, it is very easy to see that $\frac{r_{l+1}}{s_{l+1}} = \frac{c_l}{d_l}$, because we have
$$b_{l-1} + \cfrac{\varepsilon_l}{b_l + \cfrac{1}{1}} = b_{l-1} + \frac{\varepsilon_l}{b_l + 1}.$$
For the proof of $\frac{r_{l+2}}{s_{l+2}} = \frac{c_{l+1}}{d_{l+1}}$, we have to show that
$$\cfrac{\varepsilon_l}{b_l + \cfrac{1}{1 + \cfrac{1}{b_{l+2}}}} = \cfrac{\varepsilon_l}{b_l + 1 + \cfrac{-1}{b_{l+2} + 1}},$$
which is easy to see because
$$\cfrac{1}{1 + \cfrac{1}{b_{l+2}}} = \frac{b_{l+2}}{b_{l+2} + 1} = 1 - \frac{1}{b_{l+2} + 1}.$$
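The two identities above say that singularizing does not change the value of the expansion. A short Python sketch with exact fractions makes this concrete; the helper names `evaluate` and `singularize` and the finite test expansion $[3; 7, 15, 1, 292]$ are our own assumptions for illustration.

```python
from fractions import Fraction


def evaluate(b0, digits):
    """Value of [b0; e1 b1, e2 b2, ...] = b0 + e1/(b1 + e2/(b2 + ...))."""
    tail = Fraction(0)
    for eps, b in reversed(digits):
        tail = Fraction(eps) / (b + tail)
    return b0 + tail


def singularize(b0, digits, l):
    """Singularize b_{l+1}, where digits[k-1] = (eps_k, b_k) and l >= 1."""
    eps_l, b_l = digits[l - 1]
    eps_next, b_next = digits[l]          # (eps_{l+1}, b_{l+1})
    eps_after, b_after = digits[l + 1]    # (eps_{l+2}, b_{l+2})
    assert b_next == 1 and eps_next == 1 and eps_after == 1
    new_digits = (digits[:l - 1]
                  + [(eps_l, b_l + 1), (-1, b_after + 1)]
                  + digits[l + 2:])
    return b0, new_digits


regular = [(1, 7), (1, 15), (1, 1), (1, 292)]      # [3; 7, 15, 1, 292]
b0, singularized = singularize(3, regular, 2)      # removes the partial quotient 1
before = evaluate(3, regular)
after = evaluate(b0, singularized)
```

Both `before` and `after` equal 103993/33102, and `singularized` is `[(1, 7), (1, 16), (-1, 293)]`.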
Lemma 7.0.27. Let $x \in [0, 1)$ be some irrational number (without loss of generality), so $x = [0; a_1, a_2, \dots]$. If you singularize in each block of $m$ consecutive ones in the RCF of $x$ the first, the third, the fifth, ... partial quotient, you get the NICF-expansion of $x$ (which is $[b_0; \varepsilon_1 b_1, \varepsilon_2 b_2, \dots]$).
Proof. By Lemma 7.0.25 it is sufficient to prove $b_n \ge 2$ and $b_n + \varepsilon_{n+1} \ge 2$ for all $n \ge 1$, and since it follows directly from the definition of singularization that $b_n \ge 2$ for all $n \ge 1$, we only have to show that $b_n + \varepsilon_{n+1} \ge 2$ for all $n \ge 1$.
Suppose $\varepsilon_{n+1} = -1$. Then you have singularized $a_{n+1}$. There are two cases:
- $a_{n+1}$ is the first 1 of a block of ones. Then $a_n \ge 2$, so $a_n + 1 = b_n \ge 3$.
- $a_{n+1}$ is not the first 1 of a block of ones, so $a_{n-1} = a_n = 1$, and thus you have singularized $a_{n-1}$. Therefore $a_n$ has changed to $-(a_n + 1)$. After that $a_{n+1}$ has been singularized, which changed $-(a_n + 1)$ to $-(a_n + 2) = -b_n$. Because $a_n \ge 1$, it follows that $b_n \ge 3$.
We're now going to introduce two new operators, the "future"-operator $T$ and the "past"-operator $V$.
$$T_n = T^n(x) = [0; a_{n+1}, a_{n+2}, \dots], \quad T_0 = x$$
$$V_n = V^n(x) = [0; a_n, a_{n-1}, \dots, a_1], \quad V_0 = 0$$
It will be of later importance to note that $V_n = [0; a_n, a_{n-1}, \dots, a_1] = \frac{q_{n-1}}{q_n}$, with $q_n$ the denominator of the $n$-th convergent $\frac{p_n}{q_n}$. Furthermore
$$V_{n+1} = \frac{1}{a_{n+1} + V_n} = \cfrac{1}{a_{n+1} + \cfrac{q_{n-1}}{q_n}} = \frac{q_n}{a_{n+1} q_n + q_{n-1}}$$
and so
$$\frac{q_n}{q_{n+1}} = \frac{q_n}{a_{n+1} q_n + q_{n-1}}.$$
Lemma 7.0.28. Let $x \in [0, 1)$ be an irrational number, so $x$ has RCF-expansion $[0; a_1, a_2, \dots]$. Let $S_{1/2} := [\frac12, g) \times [0, g] \,\cup\, [g, 1) \times [0, g)$. If you singularize $a_{n+1} = 1$ if and only if $(T_n, V_n) \in S_{1/2}$, then you get the NICF-expansion of $x$.
Proof. We want to prove that singularizing the first, third, fifth, ... one of each block of ones is the same as singularizing if $(T_n, V_n) \in S_{1/2}$. We assume $a_n \ne 1$, so $a_{n+1} = 1$ is the first one of a block. Then $V_n = [0; a_n, \dots, a_1] = \frac{1}{a_n + \cdots} < \frac12$, and $T_n = [0; a_{n+1}, a_{n+2}, \dots] = \frac{1}{1 + \cdots} > \frac12$, so $(T_n, V_n) \in S_{1/2}$. So we always singularize the first one.
There are now two cases to separate:
- $a_{n+1} = a_{n+2} = 1$ and $(T_n, V_n) \in S_{1/2}$. In this case $\frac12 < T_{n+1} < 1$. Furthermore, we have $0 \le V_n < g \implies 0 \le \frac{q_{n-1}}{q_n} < g \implies q_{n-1} < g q_n$. Therefore:
$$V_{n+1} = \frac{q_n}{q_{n+1}} = \frac{q_n}{a_{n+1} q_n + q_{n-1}} > \frac{q_n}{q_n + g q_n} = \frac{1}{1 + g} = g$$
So $(T_{n+1}, V_{n+1}) \notin S_{1/2}$, and thus we don't singularize the next one.
- $a_{n+1} = a_{n+2} = 1$ and $(T_n, V_n) \notin S_{1/2}$. Again, $\frac12 < T_{n+1} < 1$. Furthermore, we have $g < V_n \le 1 \implies g < \frac{q_{n-1}}{q_n} \le 1 \implies g q_n < q_{n-1}$. Therefore:
$$V_{n+1} = \frac{q_n}{q_{n+1}} = \frac{q_n}{a_{n+1} q_n + q_{n-1}} < \frac{q_n}{q_n + g q_n} = \frac{1}{1 + g} = g$$
So $(T_{n+1}, V_{n+1}) \in S_{1/2}$, and so now we do singularize the next one.
Chapter 8
Continued fractions and the
LLL algorithm
Geert Popma
Introduction
In 1982 Arjen Lenstra, Hendrik Lenstra and László Lovász published their article on the LLL algorithm. It is a polynomial time algorithm which reduces a basis for an integer lattice. The original application was to give a polynomial time algorithm for factorizing polynomials with rational coefficients into irreducible polynomials[52]. The algorithm can be used for finding simultaneous rational approximations to real numbers and for solving the integer linear programming problem in fixed dimensions.
We first give the basic definition of a lattice, next we give examples of different bases for a lattice. For practical reasons we want to have a reduced basis. The celebrated LLL-algorithm describes, given a basis, a way to produce a reduced basis. Next, in Section 8.2, we give a geometrical interpretation of the continued fraction expansion in the language of lattices. This interpretation paves the way to NICF, the nearest integer continued fraction. In Section 8.3 we establish a connection between LLL and NICF.
8.1 Lattices and bases
Definition 1 (Lattice). A lattice is a discrete subgroup of $\mathbb{R}^n$ spanning the entire vector space.
Let $(b_1, b_2, \dots, b_n)$ be a basis for $\mathbb{R}^n$, then $\mathbb{Z}b_1 + \mathbb{Z}b_2 + \dots + \mathbb{Z}b_n$ is a lattice in $\mathbb{R}^n$.
Just as for a vector space, the basis of a lattice is not unique. In figure 8.1 we provide two bases for the same lattice.
[Figure 8.1: The same lattice spanned by different bases.]
Since $\mathbb{R}^n$ is a finite-dimensional inner product space, we know it has an orthonormal basis. An arbitrary linearly independent set of vectors can be turned into an orthonormal basis of its span using the Gram-Schmidt orthogonalization process. However, this does not work for lattices!
In this report we're only concerned with lattices in $\mathbb{R}^2$. As we said above we in general cannot find an orthogonal basis. Neither can we find a normal basis, i.e. a basis with vectors of unit length. If we apply the Gram-Schmidt process we have to be careful: we can only work with integer multiples of the original basis vectors.
The Gram-Schmidt process takes a finite, linearly independent set $S = \{b_1, \dots, b_k\}$ for $k \le n$ and generates an orthogonal set $S^* = \{b_1^*, \dots, b_k^*\}$ that spans the same $k$-dimensional subspace of $\mathbb{R}^n$ as $S$. For $n = 2$ the process comes down to:
$$b_1^* = b_1, \qquad b_2^* = b_2 - \mu b_1, \qquad\text{where } \mu = \frac{\langle b_2, b_1 \rangle}{\langle b_1, b_1 \rangle}.$$
The problem with this process when applied to lattices is that $\mu$ has to be an integer, otherwise the new basis is no longer a basis of the lattice. By choosing a suitable integer we can use this process to get a "more orthogonal" basis. We attempt to apply G-S to the second basis in figure 8.1.
[Figure 8.2: Orthogonalizing the basis. The G-S process suggests we shift $b_2$ by a multiple of $b_1$. These shifts are on the dashed purple line. Our choice of $b_2^*$ should be a point on the line which is also an element of the lattice. Since $\mu = \frac{\langle b_2, b_1 \rangle}{\langle b_1, b_1 \rangle} \approx 1.44$, the best choice for $b_2^*$ according to G-S becomes $b_2^* = b_2 - b_1$.]
The new basis $(b_1, b_2^*)$ appears to be a better basis for the lattice. To make precise what constitutes a better basis we introduce the notion of a reduced basis.
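Definition 2 (Reduced basis). A basis $(b_1, \dots, b_n)$ for a lattice is called reduced if
(i) $|\mu_{ji}| \le \frac12$ for $1 \le i < j \le n$
(ii) $|b_j^* + \mu_{j\,j-1} b_{j-1}^*|^2 \ge \frac34 |b_{j-1}^*|^2$ for $1 < j \le n$
where $\mu_{ji} = \frac{\langle b_j, b_i^* \rangle}{\langle b_i^*, b_i^* \rangle}$ and $b_j^* = b_j - \sum_{i=1}^{j-1} \mu_{ji} b_i^*$.
For a two-dimensional lattice we write $\mu$ instead of $\mu_{21}$. Being reduced now amounts to
$$|\mu| \le \frac12, \qquad |b_2|^2 \ge \frac34 |b_1|^2.$$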
By the first condition a reduced basis has "more orthogonal" vectors. The basis vectors always span a parallelepiped of the same volume, hence more orthogonal vectors are also shorter. For instance, for a parallelogram spanned by vectors of lengths $b_1$ and $b_2$ with an angle $\theta$ in between, the area is $b_1 b_2 \sin\theta$. If the vectors become more orthogonal then $\sin\theta$ grows towards 1. The area remains the same, thus $b_1 b_2$ grows smaller. This explains why reducing a basis provides short basis vectors. LLL does not guarantee that you get the shortest vector in a lattice; however, we obtain the following, which provides short enough vectors in practice.
Proposition 3. A reduced basis of a lattice $L$ satisfies
(i) $\det(L) \le \prod_{i=1}^{n} |b_i| \le 2^{n(n-1)/4} \det(L)$
(ii) for all nonzero $x \in L$ we have $|b_1| \le 2^{(n-1)/2} |x|$
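Looking back at figure 8.1 we get that the first basis is reduced whereas the second one is not. Our orthogonalized basis in figure 8.2 is reduced, provided we swap $b_1$ and $b_2$ in order to satisfy the second condition. This can be readily checked since the bases are:
$$\begin{pmatrix} -0.63 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0.37 \end{pmatrix} \qquad \begin{pmatrix} 0.37 \\ 1.37 \end{pmatrix}, \begin{pmatrix} 1.37 \\ 1.74 \end{pmatrix} \qquad \begin{pmatrix} 1 \\ 0.37 \end{pmatrix}, \begin{pmatrix} 0.37 \\ 1.37 \end{pmatrix}$$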
The process of reducing the basis in figure 8.2 consists of two steps in order to satisfy the reducedness conditions. First we shift $b_2$ by a multiple of $b_1$, second we swap $b_1$ and $b_2$. This gives rise to the celebrated LLL-algorithm.
Algorithm (The LLL-algorithm for the case $n = 2$). Given a basis $b_1, b_2$ for a lattice do the following. We denote $\lfloor\mu\rceil$ for the nearest integer to $\mu$.
- If $|\mu| > \frac12$, replace $b_2$ by $b_2 - \lfloor\mu\rceil b_1$; this also replaces $\mu$ with $\mu - \lfloor\mu\rceil$.
- If $|b_2|^2 < \frac34 |b_1|^2$, then swap $b_1$ and $b_2$; this also adjusts $\mu$.
- Stop when both conditions are satisfied.
The full algorithm is described in [52].
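The two-dimensional algorithm is short enough to try out directly. Below is a minimal Python sketch using exact rational coordinates; the function names are our own, and the sample input is the second basis of figure 8.1 as listed above (entered as fractions with denominator 100, which is an assumption about the rounded values).

```python
from fractions import Fraction
from math import floor


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def nearest_int(q):
    """Nearest integer to q (ties rounded up)."""
    return floor(q + Fraction(1, 2))


def lll2(b1, b2):
    """Reduce a basis (b1, b2) of a lattice in the plane."""
    while True:
        mu = Fraction(dot(b2, b1), dot(b1, b1))
        if abs(mu) > Fraction(1, 2):
            k = nearest_int(mu)                       # shift b2 by a multiple of b1
            b2 = (b2[0] - k * b1[0], b2[1] - k * b1[1])
        if dot(b2, b2) < Fraction(3, 4) * dot(b1, b1):
            b1, b2 = b2, b1                           # swap and start over
        else:
            return b1, b2


start = ((Fraction(37, 100), Fraction(137, 100)),
         (Fraction(137, 100), Fraction(174, 100)))
reduced = lll2(*start)
```

On this input the sketch performs a shift, a swap and one more shift, and ends with the vectors $(1, 0.37)$ and $(-0.63, 1)$.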
8.2 The geometry of continued fractions
Given a continued fraction $\alpha = [a_0, a_1, a_2, \dots]$ define numbers $p_n, q_n$ by the recursive relations
$$p_{n+2} = a_{n+2} p_{n+1} + p_n$$
$$q_{n+2} = a_{n+2} q_{n+1} + q_n$$
The rationals $\frac{p_n}{q_n}$ are the convergents of $\alpha$. As $n$ grows larger the convergents become better approximations of $\alpha$. In general we say that a rational $\frac{a}{b}$ is a best approximation of a real number $\alpha$ if for all $\frac{c}{d}$ such that $d \le b$ we have $|b\alpha - a| \le |d\alpha - c|$. Lagrange's theorem states that for a rational $x$ which is not an integer, $x$ is a best approximation of $\alpha$ if and only if $x$ is a convergent of $\alpha$.
The geometrical interpretation of a rational $p/q$ being a best approximation to $\alpha$ is that, amongst all points $(q', p')$ in the integer lattice $\mathbb{Z}^2$ such that $0 < q' \le q$, $(q, p)$ is nearest to the line $l$ given by $y = \alpha x$.
[Figure: the $(b_1, b_2)$-halfcone with vertex $c$.]
This idea can also be expressed in terms of a cone: $(q, p)$ is nearest to $y = \alpha x$ amongst all points outside of the halfcone with vertex $(q, p)$.
Definition 4 (Halfcone). The $(b_1, b_2)$-halfcone with vertex $c$ consists of points $c + t_1 b_1 + t_2 b_2$ for all numbers $t_1 \ge 0$, $t_2 > 0$.
Given a real number $\alpha$ and lattice points $b_1, b_2$ we wish to construct the "outpoint" $b$. Let $l$ be the line $y = \alpha x$.
Assume that $l$ meets the parallelogram $O b_1 b' b_2$, where $b' = b_1 + b_2$, in the origin $O$ and in a point $p$ that is the intersection point between $l$ and the segment $b_2 b'$. Let $q$ be the intersection point between $l$ and the extension of $b_1 b'$. We then have $p = \lambda b_1 + b_2$, for $0 < \lambda < 1$, and it follows that $q = b_1 + \frac{1}{\lambda} b_2$. Note that $q$ lies on the lattice edge between the outpoint $b = b_1 + \lfloor 1/\lambda \rfloor b_2$ and $c = b + b_2$.
[Figure 8.3: The outpoint construction]
In the case that $l$ meets $O b_1 b' b_2$ in the segment $b_1 b'$ we follow a similar construction where the roles of $b_1, b_2$ are swapped.
The outpoint $b$ approximates $l$ in the sense that it is nearer to $l$ than any point outside of the $(b_1, b_2)$-halfcone with vertex $b$. This brings to mind the geometrical interpretation of a best approximation.
We can iterate the outpoint construction. For $n \ge 3$ define $b_n$ as the outpoint of $b_{n-2}$ and $b_{n-1}$. In fig. 8.3 we get for instance $b_3 = b$ and $b_4 = c$. When we start out with the basis for the integer lattice, i.e. $b_2 = (0, 1)$, $b_1 = (1, 0)$, the outpoints become $b_n = (q_n, p_n)$. It turns out the best approximations in $\mathbb{Z}^2$ are precisely the points $(q_n, p_n)$.[36]
The outpoint construction coincides with the continued fraction expansion. This geometrical interpretation also makes clear how the nearest integer continued fraction, abbreviated NICF, works. We can instead take as outpoint $b = b_1 + \lfloor 1/\lambda \rceil b_2$; in fig. 8.3 this yields $c$ instead of $b$.
Geometrically this is better since by adding a multiple of $b_2$ to $b_1$ the closest we can get to the line $l$ is the point $c$. We can express this via an open cone with vertex $c$, which consists of all points $c + t_1 b_1 + t_2 b_2$, $t_1, t_2 > 0$. Then $c$ is closest to the line of all points outside of its open cone.
Compare the NICF outpoint construction to the LLL algorithm. Both shift a base element by a nearest integer multiple of the other base element. The LLL tries to produce an orthogonal basis, whereas the outpoints form a basis with vectors approaching the line $y = \alpha x$, i.e. the iterated outpoints become as least orthogonal as possible. We set out to establish a connection between these two iterations.
8.3 The relation between LLL and NICF
We set out to show that we can obtain the NICF as a special case of the LLL-algorithm.
Consider the lattice spanned by the vectors
$$b_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad b_2 = \begin{pmatrix} \alpha \\ \varepsilon^2 \end{pmatrix}.$$
LLL yields $p, q$ such that $|p - q\alpha| \le \varepsilon$ and $1 \le q \le \sqrt{2}/\varepsilon$, see proposition 1.39 in [52]. The smaller we take $\varepsilon$ the better our approximation of $\alpha$ by $p/q$ gets. What happens if we set $\varepsilon = 0$? This would mean we apply LLL to $(1, 0)$ and $(\alpha, 0)$. Then this is simply 1-dimensional so we're not dealing with inner products. We get
$$b_1 = 1, \qquad b_2 = \alpha, \qquad \mu = \frac{b_2 b_1}{b_1 b_1} = \frac{b_2}{b_1}.$$
Write $\mu = q_0 + r_0$ where $q_0 \in \mathbb{Z}$ and $-\frac12 < r_0 \le \frac12$. Using the fact that $\frac{b_2}{b_1} = \mu = q_0 + r_0$ implies $b_2 = b_1 q_0 + b_1 r_0$, we get
$$b_2 - q_0 b_1 = b_1 r_0.$$
But this is exactly what the LLL algorithm tells us to replace $b_2$ with, since $q_0 = \lfloor\mu\rceil$. Now
$$|b_2|^2 = |r_0 b_1|^2 \le \frac14 |b_1|^2$$
so by LLL we have to swap $b_1$ and $b_2$. So our current situation becomes
$$b_1' = r_0 b_1, \qquad b_2' = b_1, \qquad \mu = \frac{1}{r_0}.$$
Next write $1/r_0 = q_1 + r_1$ where $q_1 \in \mathbb{Z}$ and $-\frac12 < r_1 \le \frac12$. In the same way as above we get
$$b_2' - q_1 b_1' = b_1' r_1$$
so we replace $b_2'$ with $b_1' r_1 = r_0 r_1 b_1$. Now
$$|b_2'|^2 = |r_0 r_1 b_1|^2 \le \frac14 |r_0 b_1|^2 \le \frac14 |b_1'|^2$$
so we interchange the vectors. This brings us to the situation
$$b_1'' = r_0 r_1 b_1, \qquad b_2'' = r_0 b_1, \qquad \mu = \frac{1}{r_1}.$$
Repeating this we get
$$\alpha = q_0 + r_0, \qquad \frac{1}{r_0} = q_1 + r_1, \qquad \frac{1}{r_1} = q_2 + r_2, \qquad \dots$$
or
$$\alpha = q_0 + \cfrac{1}{q_1 + \cfrac{1}{\ddots}}$$
where $q_i \in \mathbb{Z}$ and $-\frac12 < r_i \le \frac12$.
If we insert the vectors $(1, 0)$ and $(\alpha, 0)$ this yields the NICF expansion of $\alpha$. This special case is however not an application of the algorithm, as our input is not a basis of a lattice. In the reasoning above we set $\varepsilon = 0$; depending on a choice of $\varepsilon$ we find an $n$ such that the resulting approximation is
$$q_0 + \cfrac{1}{q_1 + \cfrac{1}{\ddots + \cfrac{1}{q_n}}}$$
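The shift-and-swap iteration above is easy to replay with exact arithmetic. In the Python sketch below the two "vectors" are scalars, as in the degenerate case $\varepsilon = 0$; the function name and the rational test value 103993/33102 are our own assumptions. The digits $q_i$ may be negative here, which plays the role of the signs $\varepsilon_i$ in the NICF notation.

```python
from fractions import Fraction
from math import ceil


def lll_1d_digits(alpha, max_steps=50):
    """Replay the degenerate LLL iteration on b1 = 1, b2 = alpha."""
    b1, b2 = Fraction(1), Fraction(alpha)
    digits = []
    for _ in range(max_steps):
        mu = b2 / b1
        q = ceil(mu - Fraction(1, 2))   # nearest integer, remainder in (-1/2, 1/2]
        digits.append(q)
        b2 = b2 - q * b1                # shift: b2 becomes r * b1
        if b2 == 0:                     # rational input: the expansion ends
            break
        b1, b2 = b2, b1                 # swap, since |b2|^2 <= |b1|^2 / 4
    return digits


def value(digits):
    """q_0 + 1/(q_1 + 1/(q_2 + ...)) for a finite list of digits."""
    x = Fraction(digits[-1])
    for q in reversed(digits[:-1]):
        x = q + 1 / x
    return x


alpha = Fraction(103993, 33102)
digits = lll_1d_digits(alpha)
```

For this input the digits are `[3, 7, 16, -293]`, the nearest integer expansion of 103993/33102, and `value(digits)` reproduces the input.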
Conclusions
We have seen that the NICF expansion arises when we formally run LLL on a degenerate input. It is not a genuine application of LLL, since we do not provide an appropriate input for the algorithm. This result is remarkable in the sense that LLL produces an orthogonal basis, whereas the geometrical interpretation of NICF is about producing a basis approaching the line $y = \alpha x$, the opposite of orthogonal.
Chapter 9
Continued fractions and Ford
circles
Geert Popma
Introduction
Recall the following notions from continued fractions.
Definition 5 (Convergents). Given a continued fraction $\alpha = [a_0, a_1, a_2, \dots]$ define numbers $p_n, q_n$ by the recursive relations
$$p_{n+2} = a_{n+2} p_{n+1} + p_n$$
$$q_{n+2} = a_{n+2} q_{n+1} + q_n$$
The rationals $\frac{p_n}{q_n}$ are the convergents of $\alpha$.
Definition 6 (Best approximation). Let $\alpha$ be a real number. A rational $\frac{a}{b}$ is a best approximation of $\alpha$ if for all $\frac{c}{d}$ such that $d \le b$: $|b\alpha - a| \le |d\alpha - c|$.
We have seen the following result:
Theorem 7 (Lagrange). Let $x$ be a rational which is not an integer. Then $x$ is a best approximation of $\alpha$ if and only if $x$ is a convergent of $\alpha$.
In this essay there will be an alternative proof of this theorem using Ford circles. We follow a proof by Short (2011)[86].
Ford Circles
Definition 8 (Ford Circle). The Ford circle $C[a, b]$ corresponding to a reduced fraction $\frac{a}{b}$ is a circle in $\mathbb{R}^2$ in the upper half-plane tangent to the x-axis. Its point of tangency to the x-axis is $\frac{a}{b}$ and its radius is $\frac{1}{2b^2}$.
[Figure 9.1: Some Ford circles]
Proposition 9. Different Ford circles $C[a, b]$ and $C[c, d]$ do not overlap each other. They are tangent if and only if $|ad - bc| = 1$.
Proof. Denote by $D$ the distance between the centers of the circles, $(\frac{a}{b}, \frac{1}{2b^2})$ and $(\frac{c}{d}, \frac{1}{2d^2})$. By the Pythagorean theorem we have
$$D^2 = \left(\frac{c}{d} - \frac{a}{b}\right)^2 + \left(\frac{1}{2d^2} - \frac{1}{2b^2}\right)^2.$$
Denote by $s$ the sum of the radii, $s = \frac{1}{2d^2} + \frac{1}{2b^2}$. A little calculation gives:
$$D^2 - s^2 = \left(\frac{c}{d} - \frac{a}{b}\right)^2 + \left(\frac{1}{2d^2} - \frac{1}{2b^2}\right)^2 - \left(\frac{1}{2d^2} + \frac{1}{2b^2}\right)^2 = \left(\frac{c}{d} - \frac{a}{b}\right)^2 - 4 \cdot \frac{1}{2d^2} \cdot \frac{1}{2b^2} = \frac{(ad - bc)^2 - 1}{b^2 d^2}$$
Here $(ad - bc)^2 \ge 1$ since they are different rationals. Hence $D^2 - s^2 \ge 0$ and $D \ge s$, and the circles do not overlap. The circles are tangent if $D^2 = s^2$; this is the case if $|ad - bc| = 1$.
What are the tangent circles to a given circle $C[a, b]$? We can find $c, d$ such that $|ad - bc| = 1$. Suppose we have another tangent circle $C[c', d']$. We then have $|ad - bc| = 1 = |ad' - bc'|$. If $ad - bc$ and $ad' - bc'$ have the same sign, then $a(d - d') - b(c - c') = 0$. So we get $a(d - d') = b(c - c')$, hence $a \mid c - c'$ because $a$ and $b$ are coprime. Set $c - c' = n \cdot a$, or $c' = c - na$. Then $a(d - d') = b(na)$, so $d' = d - nb$. If $ad - bc$ and $ad' - bc'$ have opposite signs, then $a(d + d') = b(c + c')$. Set $c' = -c + na$, then $a(d + d') = b(na)$, hence $d' = -d + nb$. In both cases $\frac{c'}{d'} = \frac{c - na}{d - nb}$. We see that all the adjacents are of the form $C[c - na, d - nb]$, $n \in \mathbb{Z}$.
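The computation in the proof of Proposition 9 and the description of the adjacent circles can be verified with exact fractions. The following Python sketch is only an illustration; the function names and the sample fractions are our own choices.

```python
from fractions import Fraction


def ford_circle(a, b):
    """Centre (x, y) and radius of the Ford circle C[a, b], b != 0."""
    r = Fraction(1, 2 * b * b)
    return Fraction(a, b), r, r


def gap(a, b, c, d):
    """D^2 - s^2 for the circles C[a, b] and C[c, d]."""
    x1, y1, r1 = ford_circle(a, b)
    x2, y2, r2 = ford_circle(c, d)
    return (x1 - x2) ** 2 + (y1 - y2) ** 2 - (r1 + r2) ** 2


def predicted_gap(a, b, c, d):
    """The closed form ((ad - bc)^2 - 1) / (b^2 d^2) from the proof."""
    return Fraction((a * d - b * c) ** 2 - 1, b * b * d * d)


# C[1, 2] and C[1, 1] are tangent, and so is every C[1 - n, 1 - 2n].
a, b, c, d = 1, 2, 1, 1
adjacent_gaps = [gap(a, b, c - n * a, d - n * b) for n in range(-5, 6)]
```

All entries of `adjacent_gaps` are 0, while for instance `gap(1, 3, 2, 3)` is positive because $|ad - bc| = 3$ there.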
Definition 10 (Continued fraction chain). Let $\alpha$ be a real number, and denote $C_n := C[p_n, q_n]$ for the Ford circle corresponding to the convergent $\frac{p_n}{q_n}$ of $\alpha$. The continued fraction chain of $\alpha$ is the sequence of Ford circles $C_0, C_1, C_2, \dots$.
We have the equality $|p_n q_{n-1} - p_{n-1} q_n| = 1$, so two consecutive circles in the chain are tangent. Since the numbers $q_n$ are increasing, the radii of the circles $C_n$ are decreasing.
Given a rational $x = \frac{a}{b}$ and a real number $\alpha$ define $R_x(\alpha) = \frac12 |b\alpha - a|^2 = \frac{b^2}{2} |\alpha - x|^2$. From the previous proposition we see that $C[a, b]$ is tangent to the circle at $\alpha$ with radius $R_x(\alpha)$.
Theorem 11. Let $\alpha$ be a real number and $x = \frac{a}{b}$ a rational which isn't an integer. Then the following statements are equivalent:
(i) $x$ is a convergent of $\alpha$
(ii) $C_x$ is a member of the continued fraction chain of $\alpha$
(iii) $x$ is a best approximation of $\alpha$
(iv) if $z$ is a rational such that $C_z$ has larger radius than $C_x$, then $R_z(\alpha) \ge R_x(\alpha)$.
Statement (ii) is merely a geometric reformulation of statement (i), just like (iv) is of (iii).
Proposition 12. Let $x = \frac{a}{b}$ be a rational.
(i) if $|\alpha - x| < |\beta - x|$ then $R_x(\alpha) < R_x(\beta)$
(ii) if $z$ is a rational distinct from $x$ then $\operatorname{rad}(C_z) \le R_x(z)$, with equality if and only if $C_x$ and $C_z$ are tangent.
Proof. (i) follows immediately from the definition.
For (ii) let $z = \frac{c}{d}$, then $\frac{1}{2d^2} \le \frac{1}{2d^2} |ad - bc|^2 = \frac{b^2}{2} \left|\frac{a}{b} - \frac{c}{d}\right|^2 = R_x(z)$. They are equal iff $|bc - ad|^2 = 1$, which is equivalent to $C_x$ and $C_z$ being tangent.
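Proposition 13. Let $C_x$ and $C_y$ be tangent Ford circles. If $z$ is a rational that lies strictly in between $x$ and $y$, then $C_z$ has smaller radius than both $C_x$ and $C_y$.
Proof. We have $|z - x| < |y - x|$ so $R_x(z) < R_x(y) = \operatorname{rad}(C_y)$. Furthermore $\operatorname{rad}(C_z) \le R_x(z)$, so $\operatorname{rad}(C_z) < \operatorname{rad}(C_y)$. By a similar argument we see $\operatorname{rad}(C_z) < \operatorname{rad}(C_x)$.
Proposition 14. Let $C_x$ and $C_y$ be tangent Ford circles such that $\operatorname{rad}(C_x) > \operatorname{rad}(C_y)$. Suppose $\alpha$ lies strictly between $x$ and $y$ and a rational $z$ lies strictly outside the interval bounded by $x$ and $y$. Then $R_x(\alpha) < R_z(\alpha)$.
Proof. We have $|x - \alpha| < |x - y|$ so $R_x(\alpha) < R_x(y) = \operatorname{rad}(C_y)$.
If $y$ lies between $z$ and $\alpha$ then $|z - y| < |z - \alpha|$ so $R_z(y) < R_z(\alpha)$. From $\operatorname{rad}(C_y) \le R_z(y)$ we conclude $R_x(\alpha) < R_z(\alpha)$.
If $x$ lies between $z$ and $\alpha$ then $|z - x| < |z - \alpha|$ so $R_z(x) < R_z(\alpha)$. From $\operatorname{rad}(C_x) \le R_z(x)$ we conclude $R_x(\alpha) < \operatorname{rad}(C_y) \le \operatorname{rad}(C_x) < R_z(\alpha)$.
Proposition 15. Let $C_x$ and $C_y$ be tangent Ford circles such that $\operatorname{rad}(C_x) > \operatorname{rad}(C_y)$. Suppose $\alpha$ lies strictly between $x$ and $y$. If $z$ is a rational such that $\operatorname{rad}(C_z) \ge \operatorname{rad}(C_x)$ then $R_x(\alpha) \le R_z(\alpha)$, with equality if and only if $z = x$.
Proof. If $z \ne x$ then, by Proposition 13, $z$ lies outside the interval bounded by $x$ and $y$, hence $R_x(\alpha) < R_z(\alpha)$ by Proposition 14.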
Proof of the theorem. (i) $\Rightarrow$ (iv): First the case that $\alpha$ is irrational. Suppose $x = \frac{p_n}{q_n}$, define $y = \frac{p_{n+1}}{q_{n+1}}$. Then $C_x$ and $C_y$ are tangent and $\alpha$ lies strictly between $x$ and $y$. Also $\operatorname{rad}(C_x) > \operatorname{rad}(C_y)$, since $q_n < q_{n+1}$.
Suppose $z$ is a rational such that $\operatorname{rad}(C_z) \ge \operatorname{rad}(C_x)$. Then by Prop 15 $R_x(\alpha) \le R_z(\alpha)$.
Now for the rational case, suppose $\alpha = \frac{p_N}{q_N}$. Then for $n < N - 1$ we can apply the same argument to $x = \frac{p_n}{q_n}$ as above. If $x = \frac{p_N}{q_N}$, then $R_x(\alpha) = 0$ and statement (iv) is trivially true. If $x = \frac{p_{N-1}}{q_{N-1}}$ define $y = \frac{u}{v} = \frac{p_N - p_{N-1}}{q_N - q_{N-1}}$. I claim that $y$ is a reduced fraction and $v > 0$, hence $C_y$ exists, and I further claim that $C_x$ is tangent to $C_y$ and $\operatorname{rad}(C_x) > \operatorname{rad}(C_y)$. Statement (iv) then follows by Prop 15 since $\alpha$ lies between $x$ and $y$. Proof of the claim: if $k \mid \gcd(u, v)$, then $k \mid |u q_{N-1} - v p_{N-1}| = 1$, hence $\gcd(u, v) = 1$ and $y$ is reduced; the $q_n$ are increasing hence $v > 0$; the circles are tangent since $|u q_{N-1} - v p_{N-1}| = |p_N q_{N-1} - p_{N-1} q_N| = 1$; the radius is larger since $q_{N-1} < q_N - q_{N-1} = (a_N - 1) q_{N-1} + q_{N-2}$.
(iv) $\Rightarrow$ (i) by contradiction. Suppose $x$ is not a convergent of $\alpha$. Denote $r_n = \frac{1}{2q_n^2}$ for the radius of $C_n$. The $r_n$ are strictly decreasing. If $\alpha$ is irrational then $r_n \to 0$; if $\alpha$ is rational then the sequence ends at $r_N = \operatorname{rad}(C_\alpha)$. If $\alpha$ is rational we may assume $\operatorname{rad}(C_x) > \operatorname{rad}(C_\alpha) = r_N$, otherwise $R_x(\alpha) > R_\alpha(\alpha) = 0$ for $x \ne \alpha$, and then statement (iv) fails with $z = \alpha$. So there is a unique integer $n$ such that $r_n \ge \operatorname{rad}(C_x) > r_{n+1}$.
$\alpha$ lies strictly between $\frac{p_n}{q_n}$ and $\frac{p_{n+1}}{q_{n+1}}$, unless $\alpha = \frac{p_{n+1}}{q_{n+1}}$. We have $\operatorname{rad}(C_x) > \operatorname{rad}(C_{n+1})$, so by Proposition 13 $x$ lies outside the interval bounded by $\frac{p_n}{q_n}$ and $\frac{p_{n+1}}{q_{n+1}}$, hence $R_{p_n/q_n}(\alpha) < R_x(\alpha)$. Now statement (iv) fails because for $z = \frac{p_n}{q_n}$ we have $\operatorname{rad}(C_z) \ge \operatorname{rad}(C_x)$ and $R_z(\alpha) < R_x(\alpha)$. If $\alpha = \frac{p_{n+1}}{q_{n+1}}$ then $\alpha$ lies strictly between $\frac{p_n}{q_n}$ and $\frac{p_{n+1} - p_n}{q_{n+1} - q_n}$, so again statement (iv) fails.
Hurwitz inequality
Ford circles were originally introduced to prove Hurwitz' inequality. This concerns inequalities of the form $|\alpha - y/x| < \lambda/x^2$. Given an irrational $\alpha$, for which $\lambda > 0$ is the above inequality satisfied for infinitely many fractions $y/x$? In 1891 Adolf Hurwitz found that $\lambda = 1/\sqrt{5}$ is the best value, i.e. the smallest. This can be derived using Ford circles.
Consider a vertical line $x = \alpha$, descending from the upper half-plane to the x-axis. If $\alpha$ is rational then the line passes through the x-axis directly from a Ford circle. If $\alpha$ is irrational then the line must leave every circle which it enters. The line then passes through infinitely many circles. To see this, recall that given a circle $C[a, b]$ and a circle $C[c, d]$ tangent to it, all the adjacents of $C[a, b]$ are of the form $C[c - na, d - nb]$, $n \in \mathbb{Z}$. As $n$ grows large, positively or negatively, we see that the base points of the adjacents converge to $a/b$. So any Ford circle is surrounded by a chain of adjacents (this is not the continued fraction chain). So whenever the line leaves a circle it must enter a new one. If the line passes through a circle $C[a, b]$ we have
$$\left|\alpha - \frac{a}{b}\right| < \frac{1}{2b^2};$$
the line passes through infinitely many circles, hence we have
Proposition 16. The inequality $|\alpha - y/x| < \lambda/x^2$ is satisfied infinitely often for $\lambda = \frac12$.
So far we have considered the circles, but they do not cover the entire plane. Three mutually tangent circles leave the "triangle" in between uncovered. This area is called the mesh. A mesh has three corners, which are the points of tangency among the circles. Let $A$ be the point of tangency of two circles $C[a, b]$, $C[c, d]$ which are both tangent to a third circle $C[e, f]$.
The point $A$ divides the line connecting the centers of the circles in the ratio $\frac{1}{2b^2} : \frac{1}{2d^2} = d^2 : b^2$. Also $x_A$, the first coordinate of $A$, divides the line between the base points in the same ratio. Hence
$$x_A = \frac{b^2 (a/b) + d^2 (c/d)}{b^2 + d^2} = \frac{ab + cd}{b^2 + d^2}.$$
[Figure: the Ford circles at $a/b$, $c/d$ and $e/f$, with the point of tangency $A$ and its abscissa $x_A$.]
Since $x_A$ is rational, for an irrational $\alpha$ the line $x = \alpha$ cannot pass through the corner of a mesh. To be specific we have drawn the figure for the case $a/b < c/d$ and $0 < d < b$. The argument will be entirely analogous no matter how the fractions are disposed in the inequalities.
Let $B$ be the point of tangency of $C[a, b]$ and $C[e, f]$, and $C$ the point of tangency of $C[c, d]$ and $C[e, f]$. These are the other corners of the mesh, and the corresponding abscissas are
$$x_B = \frac{ab + ef}{b^2 + f^2}, \qquad x_C = \frac{cd + ef}{d^2 + f^2}.$$
We know that the line $x = \alpha$ cannot pass through any of the corners, so it must pass through the interior of a mesh. Then it passes through the side of the mesh which has the longest projection onto the x-axis. Suppose $C[\mathcal{A}, \mathcal{B}]$ forms this side of the mesh and $y$ is the vertex of the mesh on this circle which lies furthest away from the line $x = \mathcal{A}/\mathcal{B}$. Then we can form the inequality:
$$\left|\alpha - \frac{\mathcal{A}}{\mathcal{B}}\right| < \left|y - \frac{\mathcal{A}}{\mathcal{B}}\right|$$
We would now like to estimate $|y - \mathcal{A}/\mathcal{B}|$ so that it will be less than $1/(2\mathcal{B}^2)$.
We first consider the relations between the vertices of the mesh triangle. What can we say about $x_B - x_A$? To simplify computation we find:
$$x_A - \frac{a}{b} = \frac{ab + cd}{b^2 + d^2} - \frac{a}{b} = \frac{bcd - ad^2}{b(b^2 + d^2)} = \frac{d}{b} \cdot \frac{bc - ad}{b^2 + d^2} = \frac{d}{b(b^2 + d^2)}.$$
Similarly we find
$$x_B - \frac{a}{b} = \frac{f}{b(b^2 + f^2)}.$$
Subtracting these and recalling $f = b + d$:
$$x_B - x_A = \frac{f(b^2 + d^2) - d(b^2 + f^2)}{b(b^2 + d^2)(b^2 + f^2)} = \frac{(b + d)(b^2 + d^2) - d(b^2 + (b + d)^2)}{b(b^2 + d^2)(b^2 + f^2)} = \frac{b^3 - db^2 - bd^2}{b(b^2 + d^2)(b^2 + f^2)} = \frac{b^2 - bd - d^2}{(b^2 + d^2)(b^2 + f^2)}$$
Substituting $s = b/d$ we get
$$x_B - x_A = \frac{d^2 (s^2 - s - 1)}{(b^2 + d^2)(b^2 + f^2)}.$$
If the difference were zero we would have $s = (1 \pm \sqrt{5})/2$, but $s$ is by definition rational, hence $x_B$, $x_A$ can't coincide. We can use the roots of $s^2 - s - 1$ to characterize the different cases.
$$x_B - x_A = \frac{d^2 \left(s - (1 + \sqrt{5})/2\right)\left(s - (1 - \sqrt{5})/2\right)}{(b^2 + d^2)(b^2 + f^2)}$$
so the sign of $x_B - x_A$ only depends on the sign of $s - (1 + \sqrt{5})/2$, i.e. we have $x_A < x_B$ if $s > (1 + \sqrt{5})/2$ and $x_A > x_B$ if $s < (1 + \sqrt{5})/2$.
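The formula for $x_B - x_A$ and the sign rule can be tested on concrete meshes. The Python sketch below takes two adjacent fractions $a/b < c/d$ (so $bc - ad = 1$) and their mediant $e/f$ with $f = b + d$; the function name and the two sample pairs are our own choices. Here $x_B$ is the abscissa of the tangency point of $C[a, b]$ and $C[e, f]$, as used in the computation above.

```python
from fractions import Fraction


def mesh_abscissas(a, b, c, d):
    """x_A, x_B, x_C for adjacent fractions a/b < c/d and their mediant e/f."""
    assert b * c - a * d == 1
    e, f = a + c, b + d
    x_A = Fraction(a * b + c * d, b * b + d * d)   # C[a, b] and C[c, d]
    x_B = Fraction(a * b + e * f, b * b + f * f)   # C[a, b] and C[e, f]
    x_C = Fraction(c * d + e * f, d * d + f * f)   # C[c, d] and C[e, f]
    return x_A, x_B, x_C


def predicted_difference(b, d):
    """(b^2 - bd - d^2) / ((b^2 + d^2)(b^2 + f^2)) with f = b + d."""
    f = b + d
    return Fraction(b * b - b * d - d * d, (b * b + d * d) * (b * b + f * f))


# s = b/d = 3/2 is below the golden ratio, so x_A > x_B.
xA1, xB1, xC1 = mesh_abscissas(1, 3, 1, 2)
# s = b/d = 5/2 is above the golden ratio, so x_A < x_B.
xA2, xB2, xC2 = mesh_abscissas(2, 5, 1, 2)
```

In the first example $x_B - x_A = -1/442$, in the second $x_B - x_A = 11/2146$, both matching the closed form.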
Proposition 17 (Hurwitz' Inequality). For any irrational $\alpha$ there are infinitely many fractions such that $|\alpha - a/b| < 1/(\sqrt{5}\, b^2)$.
Proof. We will distinguish two cases, $x_A < x_B$ and $x_A > x_B$.
In the case $x_A < x_B$ we know that the segment $x_A x_C$ is the longest projection of a side of the mesh, hence $x_A < \alpha < x_C$. Then the line $x = \alpha$ passes through $C[c, d]$. We have:
$$\left|\alpha - \frac{c}{d}\right| = \frac{c}{d} - \alpha < \frac{c}{d} - x_A = \frac{c}{d} - \frac{ab + cd}{b^2 + d^2} = \frac{cb^2 - abd}{d(b^2 + d^2)} = \frac{b}{d} \cdot \frac{1}{b^2 + d^2},$$
since $\frac{a}{b}$, $\frac{c}{d}$ are adjacent fractions. Introduce $s = b/d$; we have
$$\left|\alpha - \frac{c}{d}\right| < \frac{s}{s^2 + 1} \cdot \frac{1}{d^2} = \frac{\lambda(s)}{d^2}$$
This should be better than our previous result. If so we ought to have the following inequality:
$$0 < \lambda = \frac{s}{s^2 + 1} \le \frac12$$
This is equivalent to
$$0 \le s^2 - 2s + 1 = (s - 1)^2$$
which is true. We consider the behaviour of $\lambda(s)$ for $s > (1 + \sqrt{5})/2$. Can we find an upper bound for $\lambda$ less than $1/2$? Let us choose a second value $s'$, say $s' > s$, then
$$\frac{s}{s^2 + 1} - \frac{s'}{s'^2 + 1} = \frac{s s'^2 + s - s' s^2 - s'}{(s^2 + 1)(s'^2 + 1)} = \frac{(s - s')(1 - s s')}{(s^2 + 1)(s'^2 + 1)}.$$
Since $s' > s \ge 1$, $1 - ss' < 0$. Hence the difference between the values of $\lambda$ is positive, i.e. the function is decreasing. So the function attains its maximum for the smallest value of $s$. So in the case $x_A < x_B$ this leads to
$$\lambda = \frac{s}{s^2 + 1} < \frac{(\sqrt{5} + 1)/2}{\left((\sqrt{5} + 1)/2\right)^2 + 1} = \frac{1}{\sqrt{5}}$$
So in this case we find the inequality
$$\left|\alpha - \frac{c}{d}\right| < \frac{\lambda(s)}{d^2} < \frac{1}{\sqrt{5}\, d^2}.$$
Lets consider the case x
A
> x
B
, remeber that we are still in the case
a/b < c/d and 0 < d < b.
Now BC is the side of the mesh with the longest projection onto the
x-axis, i.e. x
B
< < x
C
. Hence the line x = passes through C[e, f] and
we consider [ e/f[. We can compare this to [x
B
e/f[ and [x
C
e/f[, of
which the rst one is the larger one, since C lies higher than B on the circle
by our assumption d > b. We can estimate the follwoing, using a/b and e/f
are adjacent.

e
f
<
x
B

e
f
=
e
f
x
B
=
e
f

ab +ef
b
2
+f
2
=
eb
2
+abf
f(b
2
+f
2
)
=
b(eb +af)
f(b
2
+f
2
)
=
b
f(b
2
+f
2
)
103
a
b
c
d
e
f
A
B
C
x
A
x
B
x
C
We are now in the case 1 s < (
5 + 1)/2, we would like to reduce the

estiamtion to something of the form (s)/f
2
. We can form the following,
recalling f = b +d:
b
f(b
2
+f
2
)
=
bf
f
2
(b
2
+f
2
)
=
1
f
2
_
b(b +d)
b
2
+ (b +d)
2
_
=
1
f
2
_
s(s + 1)
s
2
+ (s + 1)
2
_
= (s)
1
f
2
Thus we have $|\omega - e/f| < \mu/f^2$. What can we say about $\mu(s)$ in this case? Take $s'$ such that $s < s' < (\sqrt{5}+1)/2$ and consider the difference $\mu(s) - \mu(s')$:
$$\frac{s(s+1)}{s^2+(s+1)^2} - \frac{s'(s'+1)}{s'^2+(s'+1)^2} = \frac{s(s+1)}{2s(s+1)+1} - \frac{s'(s'+1)}{2s'(s'+1)+1}$$
$$= \frac{s(s+1) - s'(s'+1)}{(2s(s+1)+1)(2s'(s'+1)+1)} = \frac{(s-s')(s+s'+1)}{(2s(s+1)+1)(2s'(s'+1)+1)}.$$
Clearly $s + s' + 1 > 0$ and $s - s' < 0$, so this difference is negative. Thus the function is increasing, and on $1 \le s < (\sqrt{5}+1)/2$ it stays below its value at $s = (\sqrt{5}+1)/2$. Hence
$$\frac{s(s+1)}{2s(s+1)+1} < \frac{\frac{\sqrt{5}+1}{2}\cdot\frac{\sqrt{5}+3}{2}}{2\cdot\frac{\sqrt{5}+1}{2}\cdot\frac{\sqrt{5}+3}{2} + 1} = \frac{2+\sqrt{5}}{2(2+\sqrt{5})+1} = \frac{2+\sqrt{5}}{2\sqrt{5}+5} = \frac{1}{\sqrt{5}}.$$
We can again conclude
$$\left|\omega - \frac{e}{f}\right| < \frac{\mu}{f^2} < \frac{1}{\sqrt{5}\,f^2}.$$
We have now shown Hurwitz's inequality: for any irrational $\omega$ there are infinitely many fractions $a/b$ such that $|\omega - a/b| < 1/(\sqrt{5}\,b^2)$.
We now need to show that $1/\sqrt{5}$ is the best value, i.e. we must give an irrational number $\omega$ for which the inequality $|\omega - a/b| < \mu/b^2$ is satisfied for only finitely many fractions whenever $0 < \mu < 1/\sqrt{5}$. It turns out that the golden mean $(\sqrt{5}+1)/2$ will do the trick.[69]
Ford also considered complex continued fractions; this gives rise to Ford spheres. There is an analogue of Hurwitz's theorem in the complex case, proven by Ford himself.[27]
Chapter 10
Decimals vs. continued fractions
Sandra Hommersom
By now we know that we can represent a real number in different ways. We already knew about the decimal expansion, and during this course we learned how to represent a real number by its continued fraction expansion. In this chapter, we are going to compare these two expansions. Our main goal is to prove the theorem of the German mathematician Gustav Lochs. This theorem says that for almost every real number $x$, if we are given the first $n$ decimals of $x$, then on average we can obtain about the same number of partial quotients of $x$. To prove this theorem, we need a proposition of the French mathematician Paul Lévy. First we do some preliminaries.
Let us start with a few examples: we consider the number $\pi$. At first we assume that we are given the first 10 decimals of $\pi = 3.14159265358\ldots$. We define rational numbers $x = 3.1415926535$ and $y = 3.1415926536$; then we know $x < \pi < y$. Using this, we can get information about the partial quotients of $\pi$. We can work out the continued fraction expansions of $x$ and $y$ by hand. We then obtain:
$$x = [3; 7, 15, 1, 292, 1, 1, 6, 2, 13, 3, 1, 12, 3], \qquad y = [3; 7, 15, 1, 292, 1, 1, 1, 4, 1, 1, 1, 45, 1, 1, 8].$$
If we forget about the integer part 3, we see that the first six partial quotients of the continued fraction expansions of $x$ and $y$ coincide. Since $\pi$ lies in between these two numbers, its first six partial quotients must be the same. Thus: we obtain six partial quotients of the continued fraction expansion of $\pi$ from its first ten decimals.
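The computation in this example can be repeated mechanically. The following is a minimal sketch, not part of the original notes: the helper names `continued_fraction` and `common_prefix` are our own, and the expansion of a rational number is obtained with the Euclidean algorithm on exact fractions.

```python
from fractions import Fraction

def continued_fraction(r):
    """Partial quotients [a0; a1, a2, ...] of a rational number r."""
    r = Fraction(r)
    quotients = []
    while True:
        a = r.numerator // r.denominator  # integer part (floor) of r
        quotients.append(a)
        r -= a
        if r == 0:
            return quotients
        r = 1 / r

def common_prefix(u, v):
    """Longest common initial segment of two lists."""
    out = []
    for s, t in zip(u, v):
        if s != t:
            break
        out.append(s)
    return out

# The two decimal truncations that enclose pi in the example above.
x = Fraction(31415926535, 10**10)
y = Fraction(31415926536, 10**10)
shared = common_prefix(continued_fraction(x), continued_fraction(y))
print(shared)           # integer part followed by the shared partial quotients
print(len(shared) - 1)  # number of partial quotients obtained from 10 decimals
```

Exact rational arithmetic is used on purpose: with floating point numbers the later partial quotients would be corrupted by rounding.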
The second example is from Gustav Lochs himself. In 1963 he showed that if we are given the first 1000 decimals of $\pi$, then we obtain 968 partial quotients of its continued fraction expansion. After we have proved the theorem of Lochs, we will know that a ratio of this size is what happens in general (that is, for almost every real number).
10.1 Preliminaries
We will now derive some properties related to the continued fraction expansion of a real number $x$. Without loss of generality we assume that $x \in [0, 1)$. We denote the $n$-th partial quotient of $x$ by $a_n(x)$. Furthermore, writing $a_n(x) = a_n$, we denote the $n$-th convergent of $x$ by
$$\frac{p_n(x)}{q_n(x)} = [0; a_1, \ldots, a_n].$$
These convergents satisfy the following well-known relations:
$$p_n = a_n p_{n-1} + p_{n-2} \quad\text{and}\quad q_n = a_n q_{n-1} + q_{n-2},$$
$$p_{n-1} q_n - p_n q_{n-1} = (-1)^n,$$
and we have the conventions $p_{-1} = q_0 = 1$, $q_{-1} = 0$ and $p_0 = a_0 = 0$.
We will also use the continued fraction map
$$T : [0, 1) \to [0, 1), \qquad x \mapsto \frac{1}{x} - \left\lfloor \frac{1}{x} \right\rfloor.$$
For $x = [0; a_1, a_2, \ldots]$, we have $Tx = [0; a_2, a_3, \ldots]$, i.e. $a_n(Tx) = a_{n+1}(x)$ for every $x \in [0, 1)$.
Definition 10.1.1. For $a_1, \ldots, a_n \in \mathbb{Z}_{\ge 1}$, the cylinder of $a_1, \ldots, a_n$ is the set
$$\Delta(a_1, \ldots, a_n) = \{x \in [0, 1) \mid a_1(x) = a_1, \ldots, a_n(x) = a_n\},$$
or simply $\Delta_n$ if it is clear which integers we are taking the cylinder of.
For now, we consider cylinders $\Delta_n = \Delta(a_1, \ldots, a_n)$ containing real numbers $x = [0; a_1, a_2, \ldots]$. The following lemma gives more information about what these cylinders look like. Notice that by definition, the first $n$ convergents of two elements in $\Delta_n$ will always be the same.
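Lemma 10.1.2. $\Delta_n$ is an interval. If $\frac{p_1}{q_1}, \ldots, \frac{p_n}{q_n}$ are the first $n$ convergents belonging to elements of $\Delta_n$, then
$$\Delta_n = \begin{cases} \left[\dfrac{p_n}{q_n}, \dfrac{p_n + p_{n-1}}{q_n + q_{n-1}}\right) & \text{if } n \text{ even}, \\[2ex] \left(\dfrac{p_n + p_{n-1}}{q_n + q_{n-1}}, \dfrac{p_n}{q_n}\right] & \text{if } n \text{ odd}. \end{cases}$$
Furthermore, the length of this interval is $\dfrac{1}{q_n(q_n + q_{n-1})}$.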
Proof. One should give this proof by induction. Here we only consider the cases $n = 1, 2$ to give the intuition and leave the rest to the reader.
Case $n = 1$. We have $x \in \Delta_1$ if and only if $a_1(x) = a_1$. It holds that $a_1 = \left\lfloor \frac{1}{x - a_0} \right\rfloor = \left\lfloor \frac{1}{x} \right\rfloor$, so $a_1 \le \frac{1}{x} < a_1 + 1$ and therefore $\frac{1}{a_1 + 1} < x \le \frac{1}{a_1}$. On one hand, we have
$$\frac{1}{a_1} = \frac{a_1 \cdot 0 + 1}{a_1 \cdot 1 + 0} = \frac{a_1 p_0 + p_{-1}}{a_1 q_0 + q_{-1}} = \frac{p_1}{q_1}.$$
On the other hand, the natural number $a_1 + 1$ would have been the lower bound for $\frac{1}{x}$ if the integer part of $\frac{1}{x}$ had been $a_1 + 1$. Then the first convergent would have been $\frac{1}{a_1 + 1}$. If we write this convergent as $\frac{r_1}{s_1}$, we can again write down the relations:
$$\frac{1}{a_1 + 1} = \frac{r_1}{s_1} = \frac{(a_1 + 1) p_0 + p_{-1}}{(a_1 + 1) q_0 + q_{-1}} = \frac{p_1 + p_0}{q_1 + q_0}.$$
So we now have $x \in \Delta_1$ if and only if $\frac{p_1 + p_0}{q_1 + q_0} < x \le \frac{p_1}{q_1}$. Therefore, we can conclude
$$\Delta_1 = \left(\frac{p_1 + p_0}{q_1 + q_0}, \frac{p_1}{q_1}\right].$$
Case $n = 2$. We have $x \in \Delta_2$ if and only if $a_1(x) = a_1$ and $a_2(x) = a_2$. But we also know that $a_2 = a_1(Tx)$, where $Tx = \frac{1}{x} - \left\lfloor \frac{1}{x} \right\rfloor = \frac{1}{x} - a_1$. So, similarly to the case $n = 1$, we have the inequalities $\frac{1}{a_2 + 1} < Tx \le \frac{1}{a_2}$, which we can rewrite as
$$\frac{a_2}{a_1 a_2 + 1} \le x < \frac{a_2 + 1}{a_1 a_2 + a_1 + 1}.$$
On one hand, we have by definition
$$\frac{a_2}{a_1 a_2 + 1} = \frac{1}{a_1 + \frac{1}{a_2}} = \frac{p_2}{q_2}.$$
On the other hand, we have $\frac{a_2 + 1}{a_1 a_2 + a_1 + 1} = \frac{a_2 + 1}{a_1(a_2 + 1) + 1} = \frac{r_2}{s_2}$, where $\frac{r_2}{s_2}$ is the second convergent we would get if the second partial quotient had been $a_2 + 1$ instead of $a_2$. Therefore, we also have
$$\frac{a_2 + 1}{a_1 a_2 + a_1 + 1} = \frac{r_2}{s_2} = \frac{(a_2 + 1) p_1 + p_0}{(a_2 + 1) q_1 + q_0} = \frac{p_2 + p_1}{q_2 + q_1}.$$
So we have $x \in \Delta_2$ if and only if $\frac{p_2}{q_2} \le x < \frac{p_2 + p_1}{q_2 + q_1}$. Therefore we can conclude
$$\Delta_2 = \left[\frac{p_2}{q_2}, \frac{p_2 + p_1}{q_2 + q_1}\right).$$
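Finally, for the length of the interval $\Delta_n$, we have
$$\left|\frac{p_n}{q_n} - \frac{p_n + p_{n-1}}{q_n + q_{n-1}}\right| = \frac{|p_n q_n + p_n q_{n-1} - p_n q_n - p_{n-1} q_n|}{q_n(q_n + q_{n-1})} = \frac{|p_n q_{n-1} - p_{n-1} q_n|}{q_n(q_n + q_{n-1})} = \frac{1}{q_n(q_n + q_{n-1})}.$$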
Now we will prove two more lemmas, both of which we need later on to prove Lévy's proposition. The first lemma gives a relation between a real number and its convergents, while the second lemma gives an interesting relation between convergents when we apply the continued fraction map $T$.
Lemma 10.1.3. Let $x \in [0, 1)$. Then for $n \ge 1$ we have
$$0 \le \log x - \log \frac{p_n(x)}{q_n(x)} \le \frac{1}{q_n(x)} \quad \text{if } n \text{ even},$$
$$0 \le \log \frac{p_n(x)}{q_n(x)} - \log x \le \frac{1}{q_n(x)} \quad \text{if } n \text{ odd}.$$
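Proof. First suppose that $n$ is even. Then by the previous lemma we have $x \ge \frac{p_n(x)}{q_n(x)}$. Therefore we have $\log x \ge \log \frac{p_n(x)}{q_n(x)}$. We can write down the following (in)equalities:
$$0 \le \log x - \log \frac{p_n(x)}{q_n(x)} = \left(x - \frac{p_n(x)}{q_n(x)}\right) \frac{1}{\xi} \le \frac{1}{q_n(q_n + q_{n-1})} \cdot \frac{q_n}{p_n} = \frac{1}{p_n(q_n + q_{n-1})} \le \frac{1}{q_n}.$$
Here we applied the Mean Value Theorem to the map $\log$ to find $\xi$ with $\frac{p_n}{q_n} \le \xi \le x$, so that $\frac{1}{\xi} \le \frac{q_n}{p_n}$. Furthermore, we used the previous lemma to bound $x - \frac{p_n}{q_n}$ from above by the length of $\Delta_n$, and we used that $(q_i)_{i \ge 0}$ is an increasing sequence of natural numbers.
Now suppose that $n$ is odd. Then by the previous lemma we have $\frac{p_n + p_{n-1}}{q_n + q_{n-1}} < x \le \frac{p_n}{q_n}$, so $\log \frac{p_n}{q_n} \ge \log x$. We again apply the Mean Value Theorem to the map $\log$ and now we find $\xi$ with $x \le \xi \le \frac{p_n}{q_n}$. Then in particular we have $\xi > \frac{p_n + p_{n-1}}{q_n + q_{n-1}}$. Then we can write down the following (in)equalities:
$$0 \le \log \frac{p_n(x)}{q_n(x)} - \log x = \left(\frac{p_n(x)}{q_n(x)} - x\right) \frac{1}{\xi} \le \frac{1}{q_n(q_n + q_{n-1})} \cdot \frac{q_n + q_{n-1}}{p_n + p_{n-1}} = \frac{1}{q_n(p_n + p_{n-1})} \le \frac{1}{q_n},$$
which completes the proof in case $n$ is odd.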
Remark 10.1.4. We can actually see that the inequalities are strict if we do not allow $x$ to be a rational number. But for the proof of Lévy's proposition this slightly weaker result is enough, which is why we stated it like this here.
Lemma 10.1.5. Let $x \in [0, 1)$; then $p_n(x) = q_{n-1}(Tx)$.
Proof. For $x = [0; a_1, a_2, \ldots]$, write $Tx = [0; b_1, b_2, \ldots]$. Then the relation $b_n = a_{n+1}$ holds. We prove this lemma by induction. For $n = 0$, we have $p_0(x) = a_0 = 0$ and $q_{-1}(Tx) = 0$ by convention. For $n = 1$, we have $p_1(x) = a_1 p_0(x) + p_{-1}(x) = 1$ and $q_0(Tx) = 1$ by convention.
Now suppose we know the relation holds up to some natural number $n \ge 1$. Then we have
$$\begin{aligned} p_{n+1}(x) &= a_{n+1} p_n(x) + p_{n-1}(x) \\ &= a_{n+1} q_{n-1}(Tx) + q_{n-2}(Tx) \\ &= b_n q_{n-1}(Tx) + q_{n-2}(Tx) \\ &= q_n(Tx). \end{aligned}$$
So the relation holds for $n + 1$. This completes the proof by induction.
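As a sanity check, the lemma can be verified on an arbitrary finite digit string. The sketch below is ours (the helper `pq` and the chosen digits are assumptions for illustration). It also checks the telescoping product $\frac{1}{q_n(x)} = \prod_{k=0}^{n-1} \frac{p_{n-k}(T^k x)}{q_{n-k}(T^k x)}$, which follows from the lemma and is the starting point of the proof of Lévy's proposition.

```python
from fractions import Fraction

def pq(digits):
    """Numerator and denominator (p_n, q_n) of [0; a_1, ..., a_n]."""
    p_prev, q_prev, p, q = 1, 0, 0, 1  # p_{-1}, q_{-1}, p_0, q_0
    for a in digits:
        p, p_prev = a * p + p_prev, p
        q, q_prev = a * q + q_prev, q
    return p, q

digits = [3, 1, 4, 1, 5, 9, 2, 6]  # a_1, ..., a_8 of some x in (0, 1)

# Lemma 10.1.5: dropping the first digit corresponds to applying T.
for n in range(1, len(digits) + 1):
    p_n, _ = pq(digits[:n])
    _, q_shifted = pq(digits[1:n])  # q_{n-1}(Tx)
    assert p_n == q_shifted

# Telescoping product of the convergents of x, Tx, ..., T^{n-1}x.
n = len(digits)
product = Fraction(1)
for k in range(n):
    p, q = pq(digits[k:n])  # p_{n-k}(T^k x), q_{n-k}(T^k x)
    product *= Fraction(p, q)
print(product, Fraction(1, pq(digits)[1]))
```

Both printed values agree, since every numerator cancels against the denominator of the next factor.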
10.2 Results of Lévy and Lochs
The aim of this section is to prove the theorem of Lochs. We can do this partly by basic mathematics, but on our way we will need another result. This result is due to the mathematician Paul Lévy. Therefore we first prove the following proposition, which Paul Lévy proved in 1929.
Proposition 10.2.1 (Paul Lévy). For almost every $x \in [0, 1)$, we have
$$\lim_{n \to \infty} \frac{\log q_n(x)}{n} = \frac{\pi^2}{12 \log 2}.$$
Proof. Let $x = [0; a_1, a_2, \ldots] \in [0, 1)$ and let $n$ be a natural number. Using Lemma 10.1.5, we have
$$\frac{1}{q_n(x)} = \frac{1}{q_n(x)} \cdot \frac{p_n(x)}{q_{n-1}(Tx)} \cdot \frac{p_{n-1}(Tx)}{q_{n-2}(T^2 x)} \cdots \frac{p_2(T^{n-2} x)}{q_1(T^{n-1} x)} = \frac{p_n(x)}{q_n(x)} \cdot \frac{p_{n-1}(Tx)}{q_{n-1}(Tx)} \cdots \frac{p_1(T^{n-1} x)}{q_1(T^{n-1} x)}.$$
Here we are allowed to write down the second equality, since $T^{n-1} x = [0; a_n, a_{n+1}, \ldots]$, so therefore $p_1(T^{n-1} x) = a_1(T^{n-1} x)\, p_0(T^{n-1} x) + p_{-1}(T^{n-1} x) = a_n \cdot 0 + 1 = 1$. Taking $\log$ on both sides of the equation, one sees
$$-\log q_n(x) = \log \frac{p_n(x)}{q_n(x)} + \log \frac{p_{n-1}(Tx)}{q_{n-1}(Tx)} + \ldots + \log \frac{p_1(T^{n-1} x)}{q_1(T^{n-1} x)}.$$
We know that for every $m$, the convergents $\frac{p_k(T^m x)}{q_k(T^m x)}$ approximate $T^m x$. Therefore, it makes sense to compare the right-hand side of the equation to $\log x + \log Tx + \ldots + \log T^{n-1} x$. Up to some error term $R(n, x)$ we then obtain another equation for $-\log q_n(x)$, namely
$$-\log q_n(x) = \log x + \log Tx + \ldots + \log T^{n-1} x + R(n, x).$$
Combining the two equations we have for $-\log q_n(x)$, we get
$$\begin{aligned} R(n, x) &= \log \frac{p_n(x)}{q_n(x)} + \log \frac{p_{n-1}(Tx)}{q_{n-1}(Tx)} + \ldots + \log \frac{p_1(T^{n-1} x)}{q_1(T^{n-1} x)} - \log x - \log Tx - \ldots - \log T^{n-1} x \\ &= \left(\log \frac{p_n(x)}{q_n(x)} - \log x\right) + \left(\log \frac{p_{n-1}(Tx)}{q_{n-1}(Tx)} - \log Tx\right) + \ldots + \left(\log \frac{p_1(T^{n-1} x)}{q_1(T^{n-1} x)} - \log T^{n-1} x\right). \end{aligned}$$
Now we first observe that from Lemma 10.1.3, for every $y \in [0, 1)$ and for every $k \ge 1$ we have $\left|\log \frac{p_k(y)}{q_k(y)} - \log y\right| \le \frac{1}{q_k(y)}$. Let further $F_1, F_2, \ldots$ be the Fibonacci sequence $1, 1, 2, 3, 5, \ldots$; then by our recursion relation for the $q_k$'s we know that for every $k \ge 1$ the inequality $q_k \ge F_k$ holds. Using this, we are able to bound the error term from above:
$$\begin{aligned} |R(n, x)| &\le \left|\log \frac{p_n(x)}{q_n(x)} - \log x\right| + \ldots + \left|\log \frac{p_1(T^{n-1} x)}{q_1(T^{n-1} x)} - \log T^{n-1} x\right| \\ &\le \frac{1}{q_n(x)} + \ldots + \frac{1}{q_1(T^{n-1} x)} \\ &\le \frac{1}{F_n} + \ldots + \frac{1}{F_1}. \end{aligned}$$
So we have $|R(n, x)| \le \sum_{i=1}^{n} \frac{1}{F_i}$, which we want to bound from above by a constant. For that aim, let $G := \frac{1 + \sqrt{5}}{2} > 1$ and $g := -\frac{1}{G} = \frac{1 - \sqrt{5}}{2}$, with $|g| < 1$, be the golden means. Then one has the equality $F_k = \frac{G^k - g^k}{\sqrt{5}}$ and therefore $F_k$ grows like $\frac{G^k}{\sqrt{5}}$ at infinity. Since $\sum_{i=0}^{\infty} \sqrt{5}\, G^{-i}$ is a convergent geometric series, we have that $\sum_{i=1}^{n} \frac{1}{F_i}$ is the $n$-th partial sum of the convergent series $\sum_{i=1}^{\infty} \frac{1}{F_i} =: C$. Thus, we have found an upper bound for the error term, namely $|R(n, x)| \le C$ for every $n$, $x$.
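To get a feeling for the size of the constant $C$, one can simply add up reciprocals of Fibonacci numbers. The sketch below is an illustration of ours, not part of the proof; the function name `fib_reciprocal_sum` is hypothetical. It also checks the closed formula for $F_k$ numerically for small $k$.

```python
import math

G = (1 + math.sqrt(5)) / 2  # golden mean
g = -1 / G                  # its conjugate (1 - sqrt(5)) / 2

def fib_reciprocal_sum(n):
    """Partial sum 1/F_1 + ... + 1/F_n with F_1 = F_2 = 1."""
    a, b = 1, 1
    total = 0.0
    for _ in range(n):
        total += 1.0 / a
        a, b = b, a + b
    return total

# Closed formula F_k = (G^k - g^k) / sqrt(5), checked for small k.
a, b = 1, 1
for k in range(1, 31):
    assert round((G**k - g**k) / math.sqrt(5)) == a
    a, b = b, a + b

# The partial sums stabilise quickly because F_k grows geometrically.
for n in (5, 10, 20, 80):
    print(n, fib_reciprocal_sum(n))
```

The partial sums settle at roughly 3.36, so the error term $R(n, x)$ is bounded by a small absolute constant and disappears after division by $n$.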
We have now obtained that $\lim_{n \to \infty} \frac{\log q_n(x)}{n}$ exists if and only if $\lim_{n \to \infty} \frac{1}{n}(\log x + \log Tx + \ldots + \log T^{n-1} x)$ exists, and that in this case the first limit is minus the second, because the bounded error term satisfies $R(n, x)/n \to 0$. But from earlier lectures we know that $([0, 1), \mathcal{L}, \mu, T)$ is an ergodic system, so we can apply the Ergodic Theorem to see that the second limit exists for almost every $x$. So by the Ergodic Theorem we have for almost every $x$:
$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \log T^i x = \int_{[0,1)} \log \, d\mu = \frac{1}{\log 2} \int_0^1 \frac{\log x}{1 + x} \, dx.$$
Now the integral on the right is not an easy one and we will not compute it here, because that is part of another course. But one can check that
$$\int_0^1 \frac{\log x}{1 + x} \, dx = -\frac{1}{2} \sum_{i=1}^{\infty} \frac{1}{i^2} = -\frac{1}{2} \cdot \frac{\pi^2}{6} = -\frac{\pi^2}{12}$$
and from that we obtain $\int_{[0,1)} \log \, d\mu = -\frac{\pi^2}{12 \log 2}$. We know that this is also the limit of $-\frac{\log q_n(x)}{n}$ for $n \to \infty$ for almost every $x$, so that we can finally conclude:
$$\text{for almost every } x \in [0, 1): \quad \lim_{n \to \infty} \frac{\log q_n(x)}{n} = \frac{\pi^2}{12 \log 2}.$$
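For the reader who wants to check the value of the integral, here is one possible route; it is a sketch of a standard argument and not necessarily the computation the author had in mind. We expand $\frac{1}{1+x}$ as a geometric series on $(0, 1)$ and integrate term by term, assuming that the interchange of sum and integral is justified, which can be done by integrating over $[0, r]$ with $r < 1$ and letting $r \to 1$.

```latex
\begin{align*}
\int_0^1 x^k \log x \, dx &= -\frac{1}{(k+1)^2} \qquad (k \ge 0,\ \text{by partial integration}), \\
\int_0^1 \frac{\log x}{1+x} \, dx
  &= \sum_{k=0}^{\infty} (-1)^k \int_0^1 x^k \log x \, dx
   = -\sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j^2}, \\
\sum_{j=1}^{\infty} \frac{(-1)^{j-1}}{j^2}
  &= \sum_{j=1}^{\infty} \frac{1}{j^2} - 2 \sum_{j=1}^{\infty} \frac{1}{(2j)^2}
   = \Bigl(1 - \tfrac{1}{2}\Bigr) \sum_{j=1}^{\infty} \frac{1}{j^2}
   = \frac{1}{2} \cdot \frac{\pi^2}{6}.
\end{align*}
```

Together these lines give $\int_0^1 \frac{\log x}{1+x}\,dx = -\frac{\pi^2}{12}$, in agreement with the value used in the proof.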
We are now able to prove the theorem of Lochs. The proof we are following here is actually the one given by Lochs himself in 1964. Our task is to understand his proof and fill in the gaps that Lochs left to the reader.
Theorem 10.2.2 (Gustav Lochs). For $x \in [0, 1)$, let $m$ denote the number of partial quotients $b_k(x)$ of the continued fraction expansion of $x$ that can be obtained from the first $n$ decimals in the decimal expansion of $x$. Then for almost every $x \in [0, 1)$ we have
$$\lim_{n \to \infty} \frac{m}{n} = \frac{6 \log 2 \log 10}{\pi^2} \approx 0.9703.$$
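The two constants are easy to evaluate, and the theorem can be tried out on a number with randomly chosen decimals. The following sketch is only an experiment under our own assumptions: the helper names are hypothetical, a single pseudo-random number with 500 decimals stands in for a "typical" $x$, and for finite $n$ the ratio $m/n$ will only be near the limit, not equal to it.

```python
import math
import random
from fractions import Fraction

LEVY = math.pi**2 / (12 * math.log(2))                # limit of log(q_n) / n
LOCHS = 6 * math.log(2) * math.log(10) / math.pi**2   # limit of m / n

def continued_fraction(r):
    """Partial quotients [a0; a1, a2, ...] of a rational number r."""
    quotients = []
    while True:
        a = r.numerator // r.denominator
        quotients.append(a)
        r -= a
        if r == 0:
            return quotients
        r = 1 / r

def quotients_from_decimals(k, n):
    """Number m of partial quotients shared by y = k / 10^n and z = y + 10^-n."""
    cy = continued_fraction(Fraction(k, 10**n))
    cz = continued_fraction(Fraction(k + 1, 10**n))
    m = 0
    # Index 0 holds the integer part; count agreeing quotients after it.
    while m + 1 < min(len(cy), len(cz)) and cy[m + 1] == cz[m + 1]:
        m += 1
    return m

random.seed(0)
n = 500
k = random.randrange(10**(n - 1), 10**n)  # the first n decimals, as one integer
m = quotients_from_decimals(k, n)
ratio = m / n
print(LEVY, LOCHS, m, ratio)
```

Repeating the experiment with other seeds or larger $n$ gives ratios scattered around the constant of Lochs.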
Before giving the proof we recall the definition of the small-o notation, that is: $f(n) = o(g(n))$ if and only if $\lim_{n \to \infty} \frac{f(n)}{g(n)} = 0$. Intuitively it says that the growth of $f$ is nothing compared to that of $g$.
The first part of the proof describes exactly the method used in the first example to obtain partial quotients of $\pi$.
Proof. Write $x = 0.a_1 a_2 \ldots$ for the decimal expansion of $x$. We define two rational numbers $y := 10^{-n} \lfloor 10^n x \rfloor = 0.a_1 a_2 \ldots a_n$ and $z := y + 10^{-n}$, such that $y \le x < z$. Let now $m$ be the natural number such that the first $m$ partial quotients of $y$ and $z$ coincide and the $(m+1)$-st partial quotients of $y$ and $z$ are different. Then we can write $y = [0; b_1, \ldots, b_m, y_{m+1}]$ and $z = [0; b_1, \ldots, b_m, z_{m+1}]$, and also we know that the first $m$ partial quotients of $x$ are $b_1, \ldots, b_m$ (so that this natural number $m$ is actually the number $m$ mentioned in the theorem).
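Now notice that we know $y = \frac{p_m y_{m+1} + p_{m-1}}{q_m y_{m+1} + q_{m-1}}$, since this relation holds for every $k \ge 0$. But then we can write down the following equations:
$$\begin{aligned} y &= \frac{p_m q_m y_{m+1} + p_{m-1} q_m + p_m q_{m-1} - p_m q_{m-1}}{q_m(q_m y_{m+1} + q_{m-1})} \\ &= \frac{p_m(q_m y_{m+1} + q_{m-1}) + (-1)^m}{q_m(q_m y_{m+1} + q_{m-1})} \\ &= \frac{p_m}{q_m} + \frac{(-1)^m}{q_m(q_m y_{m+1} + q_{m-1})}. \end{aligned}$$
Similarly, this holds for $z$: $z = \frac{p_m}{q_m} + \frac{(-1)^m}{q_m(q_m z_{m+1} + q_{m-1})}$, so then we have
$$10^{-n} = z - y = \frac{|y_{m+1} - z_{m+1}|}{(q_m y_{m+1} + q_{m-1})(q_m z_{m+1} + q_{m-1})}.$$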
We are now going to study this last expression. Therefore we define a few numbers: $u := \min(y_{m+1}, z_{m+1})$, $v := |y_{m+1} - z_{m+1}|$ and $t := \frac{q_{m-1}}{q_m}$. Note that then $\{u, u + v\} = \{y_{m+1}, z_{m+1}\}$. The denominator of the expression for $z - y$ is symmetric in $y_{m+1}$ and $z_{m+1}$, so we can replace them by $u$ and $u + v$ without any problem. Then we have the following equations:
$$\begin{aligned} 10^n &= \frac{(q_m y_{m+1} + q_{m-1})(q_m z_{m+1} + q_{m-1})}{v} \\ &= \frac{(q_m u + q_{m-1})(q_m(u + v) + q_{m-1})}{v} \\ &= \frac{(q_m u + q_m t)(q_m(u + v) + q_m t)}{v} \\ &= \frac{q_m(u + t)\, q_m(u + v + t)}{v} \\ &= q_m^2 (u + t) \frac{u + v + t}{v}. \end{aligned}$$
Now we are going to say something about the size of the factors that appear in the last equation. First notice that $0 < t < 1 < u$, so that $u + t < 2u$. For the right-hand factor, we distinguish two cases:
If $v < 1$, then $\frac{u + v + t}{v} < \frac{u + 1 + t}{v} < \frac{3u}{v} = \frac{3u}{\min(1, v)}$.
If $v \ge 1$, then $\frac{u + v + t}{v} = 1 + \frac{u + t}{v} \le 1 + u + t < 3u = \frac{3u}{\min(1, v)}$.
On the other hand, we also have $u + t > 1$ and $\frac{u + v + t}{v} \ge 1$, so that
$$1 < (u + t) \frac{u + v + t}{v} \le 2u \cdot \frac{3u}{\min(1, v)} = \frac{6u^2}{\min(1, v)}.$$
Multiplying this by $q_m^2$ and using the above deduced equation, we obtain
$$q_m^2 < 10^n \le q_m^2 \, \frac{6u^2}{\min(1, v)}.$$
Since $\log$ is an increasing function, we have $\log(\min(1, v)) = \min(\log 1, \log v) = \min(0, \log v)$. Applying this to the inequality, we obtain the following:
$$\begin{aligned} 2 \log q_m < n \log 10 &\le 2 \log q_m + \log 6u^2 - \min(\log v, 0) \\ &= 2 \log q_m + \log 6 + 2 \log u - \min(\log v, 0). \end{aligned}$$
Here the last term equals $-\log v$ if $v < 1$ and vanishes otherwise. Clearly we have $\log 6 = o(m)$. To complete the proof of the theorem, we assume for the moment that for almost every $x$ both $\log u$ and $\log v$ are equal to $o(m)$ as well. Then the whole term $\log 6 + 2 \log u - \min(\log v, 0)$ reduces to a term $o(m)$. Then we find the useful inequality
$$2 \, \frac{\log q_m}{m} < \frac{n}{m} \log 10 \le 2 \, \frac{\log q_m}{m} + \frac{o(m)}{m}.$$
Now dividing by $\log 10$, taking limits and applying the proposition of Lévy, we get the result
$$\lim_{n \to \infty} \frac{n}{m} = \frac{2}{\log 10} \cdot \frac{\pi^2}{12 \log 2} = \frac{\pi^2}{6 \log 2 \log 10}.$$
But then also $\lim_{n \to \infty} \frac{m}{n}$ exists and is equal to $\frac{6 \log 2 \log 10}{\pi^2}$, which completes the proof of the theorem.
We still have to prove $\log u = o(m)$ and $\log v = o(m)$. Let us start with $\log u$. Suppose that $\log u = o(m)$ does not hold for some $x \in [0, 1)$, that is, $\lim_{m \to \infty} \frac{\log u}{m} = 0$ does not hold. Then there exists $\varepsilon > 0$ such that for all $N \in \mathbb{N}$ there exists $m > N$ such that $\left|\frac{\log u}{m}\right| > \varepsilon$. Since $u > 1$, we thus find infinitely many $m$ such that $\log u > \varepsilon m$. For these $m$, we have $b_{m+1} \ge \lfloor u \rfloor \ge \lfloor e^{\varepsilon m} \rfloor$, so therefore we can also find $\delta > 0$ such that $b_{m+1} > e^{\delta m}$. Since $q_{m+1} = b_{m+1} q_m + q_{m-1}$, we obtain for these $m$
$$\log q_{m+1} > \log b_{m+1} q_m = \log b_{m+1} + \log q_m > \delta m + \log q_m. \qquad (10.1)$$
On the other hand, from the proposition of Lévy we have $\frac{\log q_m}{m} = \frac{\pi^2}{12 \log 2} + \varepsilon_m$ where $\varepsilon_m \to 0$ if $m \to \infty$. The latter implies that $m \varepsilon_m = o(m)$, so that we obtain for all $m$
$$\log q_m = \frac{\pi^2}{12 \log 2}\, m + o(m). \qquad (10.2)$$
By (10.2) the difference $\log q_{m+1} - \log q_m$ is $o(m)$, whereas by (10.1) it exceeds $\delta m$ for infinitely many $m$, so the equations 10.1 and 10.2 contradict each other. Thus we can conclude that if $\log u \ne o(m)$ for some $x \in [0, 1)$, then that $x$ does not satisfy the proposition of Lévy. That is exactly: for almost every $x \in [0, 1)$ we have $\log u = o(m)$.
Now we consider the case of $\log v$. Notice that $\log v$ only appears in the equation if $\min(\log v, 0) = \log v$, that is if $v < 1$. Suppose that $\log v = o(m)$ does not hold for some $x \in [0, 1)$; then there exists $\varepsilon > 0$ such that for all $N \in \mathbb{N}$ there exists $m > N$ such that $\left|\frac{\log v}{m}\right| > \varepsilon$. Since we may assume that $\log v < 0$, we find infinitely many $m$ such that $\log v < -\varepsilon m$. For these $m$, we thus have $|y_{m+1} - z_{m+1}| < e^{-\varepsilon m} < 1$, which means that we can find a natural number $K$ that is in between $y_{m+1}$ and $z_{m+1}$. Otherwise we have $a_{m+1}(y) = a_{m+1}(z)$, which contradicts the definition of $m$. Since $x_{m+1}$ lies in between $y_{m+1}$ and $z_{m+1}$, we have $a_{m+1}(x) \in \{K, K - 1\}$ and certainly $|x_{m+1} - K| < e^{-\varepsilon m}$.
Write $x_{m+1} = a_{m+1}(x) + \frac{1}{x_{m+2}}$. If $a_{m+1}(x) = K$, then we obtain $x_{m+2} > e^{\varepsilon m}$, which means $a_{m+2}(x) \ge \lfloor e^{\varepsilon m} \rfloor$. If $a_{m+1}(x) = K - 1$, then we obtain $\left|\frac{1}{x_{m+2}} - 1\right| < e^{-\varepsilon m}$, so for $m$ large enough $a_{m+2} = 1$ holds. Writing $x_{m+2} = 1 + \frac{1}{x_{m+3}}$ and $\left|\frac{1}{x_{m+2}} - 1\right| = \frac{x_{m+2} - 1}{x_{m+2}} = \frac{1}{x_{m+3} + 1}$, we then obtain $x_{m+3} + 1 > e^{\varepsilon m}$. Taking both cases together, we can always find $\delta > 0$ such that infinitely many partial quotients $b_m$ of $x$ satisfy $b_m > e^{\delta m}$. This brings us back to what we obtained in the case of $\log u$, and we will again have the contradiction between the equations 10.1 and 10.2. So therefore we also have $\log v = o(m)$ for almost every $x \in [0, 1)$.
Remark 10.2.3. In the above proof, we used the following unproven fact: for almost every $x \in [0, 1)$: if $n \to \infty$, then $m \to \infty$. It is not the intention to give a proof of this here, but one can understand it intuitively. Also, one can think of examples for which the result does not hold. The easiest examples are rational numbers.
References
During my research on the theorem of Lochs I used the book [21] and articles [54] and [55] in particular.
Chapter 11
Entropy and the theorem of Lochs
Sandra Hommersom
In my previous lecture I gave a proof of Lochs's theorem, which gives us information about the relation between the decimal expansion and the continued fraction expansion of a real number. In this lecture we again consider this theorem, but now our goal is to prove it from a completely different area of mathematics, namely the theory of entropy. Since entropy is an unknown subject to many mathematicians, we will first give an introduction to it. The proof of Lochs's theorem we give here uses a famous theorem of Shannon, McMillan and Breiman. We will of course state this theorem, but unfortunately not give a proof.
11.1 Introduction to entropy
Entropy is a notion of uncertainty or randomness. It was first developed by the American mathematician Claude Shannon to study the amount of information one can get from a transmitted message. This idea can be extended to the amount of information one can get from an occurring event. We can intuitively understand what entropy is by looking at the example of rolling a die.
First suppose we have a fair die, so every side occurs with probability $\frac{1}{6}$ after rolling. Then the best way to predict the outcome of a roll is just to guess a value at random. Since we have no clue about the outcome, actually seeing the outcome gives us much information about this event. This is just another way of saying that the entropy of rolling a fair die is large.
Now suppose we have a die which rolls a 6 with probability $\frac{9}{10}$ and each of the other values with probability $\frac{1}{50}$. Then the outcome of rolling this die does not give us much information, because we could already predict with pretty large probability that the outcome will be 6. This is just saying that the entropy of this die should be small.
Now it must be clear that entropy has something to do with probability theory. We can now define what entropy is in this area of mathematics.
Definition 11.1.1. Let $e$ be an event occurring with probability $p$. Then we define the amount of information $I(e)$ of $e$ as $I(e) := -\log p$.
We can see why this is a suitable definition: if $p = 1$, then $I(e) = 0$. That is: the event $e$ gives us no useful information. On the other hand, when we are considering a discrete random variable with $n$ possible values, uniformly distributed, then $I(e)$ is maximal, namely $I(e) = -\log \frac{1}{n} = \log n$.
Definition 11.1.2. Let $X = (x_1, \ldots, x_n)$ be a discrete random variable with probability distribution $P = (p_1, \ldots, p_n)$. Then we define the entropy of $X$ as
$$H(X) := E(I(X)) = -\sum_{i=1}^{n} p_i \log p_i,$$
together with the convention $0 \log 0 = 0$.
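The two dice from the beginning of this section make a convenient first example. The following lines are a small illustration of ours (the function name `entropy` is an assumption); natural logarithms are used, as everywhere in these notes.

```python
import math

def entropy(probs):
    """H = -sum p_i log p_i, with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

fair = [1 / 6] * 6              # every side equally likely
loaded = [1 / 50] * 5 + [9 / 10]  # a 6 comes up with probability 9/10

print(entropy(fair), math.log(6))  # the uniform case gives log n
print(entropy(loaded))             # much smaller: the outcome is predictable
```

The fair die attains the maximal value $\log 6$, while the loaded die has an entropy of roughly a quarter of that.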
Later we would like to consider the ergodic system $([0, 1), \mathcal{L}, \mu, T)$, where $T$ is the continued fraction map. This means that now we must think of a way to define entropy on dynamical systems. Let us therefore go back to the information of a transmitted message. We view a message as a string of symbols $\ldots x_{-1} x_0 x_1 \ldots$ from an alphabet $\{a_1, \ldots, a_n\}$, where every $a_i$ has probability $p_i$ to be received and symbols are sent independently from each other. Of course the $p_i$'s satisfy $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$.
In ergodic theory we see this situation as a dynamical system $(X, \mathcal{F}, \mu, T)$, where:
$X = \{a_1, \ldots, a_n\}^{\mathbb{Z}}$,
$\mathcal{F}$ is the $\sigma$-algebra generated by sets $\Delta(a_{i_0}, \ldots, a_{i_{m-1}}) := \{x \in X \mid x_0 = a_{i_0}, \ldots, x_{m-1} = a_{i_{m-1}}\}$,
$\mu$ is the product measure, that is $\mu(\Delta(a_{i_0}, \ldots, a_{i_{m-1}})) = p_{i_0} \cdots p_{i_{m-1}}$,
$T$ is the left shift.
Then we define the entropy of this system as $H(X, \mathcal{F}, \mu, T) = -\sum_{i=1}^{n} p_i \log p_i$. We can think of $H$ as the average amount of information per symbol.
We are going to extend this definition to define entropy on an arbitrary measure preserving system $(X, \mathcal{F}, \mu, T)$. To the alphabet $\{a_1, \ldots, a_n\}$ we now relate a partition $\alpha = \{A_1, \ldots, A_n\}$ of $X$. Then with $x \in X$ we associate an infinite sequence $\ldots x_{-1} x_0 x_1 \ldots$ where $x_i = a_j$ if and only if $T^i x \in A_j$.
Definition 11.1.3. Given a partition $\alpha$ of $X$, we define the entropy of the partition by
$$H(\alpha) := -\sum_{i=1}^{n} \mu(A_i) \log \mu(A_i).$$
In this definition $T$ does not appear yet. As we will see later, the entropy of the system is actually defined by the entropy of the transformation $T$. Therefore we need a definition that is independent of the choice of the partition $\alpha$.
First we enumerate some facts about partitions and properties of entropy of partitions.
Lemma 11.1.4. Let $\alpha = \{A_1, \ldots, A_n\}$ and $\beta = \{B_1, \ldots, B_m\}$ be partitions of $X$.
(i) $T^{-1}\alpha := \{T^{-1}(A_1), \ldots, T^{-1}(A_n)\}$ is a partition of $X$,
(ii) $\alpha \vee \beta := \{A_i \cap B_j \mid i = 1, \ldots, n,\ j = 1, \ldots, m\}$ is a partition of $X$.
The proof of this lemma is very straightforward and is therefore omitted. The set $\alpha \vee \beta$ is called the common refinement of $\alpha$ and $\beta$. More generally, the partition $\beta$ is called a refinement of $\alpha$, written $\alpha \le \beta$, if for all $j \le m$ there exists $i \le n$ such that $B_j \subseteq A_i$.
Lemma 11.1.5. Let $(X, \mathcal{F}, \mu, T)$ be a measure preserving system and let $\alpha$, $\beta$ be partitions of $X$.
(i) $H(T^{-1}\alpha) = H(\alpha)$,
(ii) If $\alpha \le \beta$, then $H(\alpha) \le H(\beta)$,
(iii) $H(\alpha \vee \beta) \le H(\alpha) + H(\beta)$,
(iv) If $\alpha$, $\beta$ are independent, then $H(\alpha \vee \beta) = H(\alpha) + H(\beta)$.
This proof is also omitted, but intuitively it is easily seen why the statements could be true. The first part follows directly from the fact that $T$ is measure preserving. This lemma needs one more definition:
Definition 11.1.6. Two partitions $\alpha$, $\beta$ are called independent if $\mu(A \cap B) = \mu(A)\mu(B)$ for all $A \in \alpha$, $B \in \beta$.
Now given a partition $\alpha$, we consider the partition $\bigvee_{i=0}^{n-1} T^{-i}\alpha$. The elements of this partition look like $A_{i_0} \cap T^{-1}(A_{i_1}) \cap \ldots \cap T^{-n+1}(A_{i_{n-1}})$, which contains the elements $x \in X$ satisfying $x \in A_{i_0}$, $Tx \in A_{i_1}$, $\ldots$, $T^{n-1} x \in A_{i_{n-1}}$.
Definition 11.1.7. The entropy of $T$ w.r.t. $\alpha$ is given by
$$h(\alpha, T) := \lim_{n \to \infty} \frac{1}{n} H\left(\bigvee_{i=0}^{n-1} T^{-i}\alpha\right).$$
By definition, we have $H\left(\bigvee_{i=0}^{n-1} T^{-i}\alpha\right) = -\sum_{A \in \bigvee_{i=0}^{n-1} T^{-i}\alpha} \mu(A) \log \mu(A)$.
Remark 11.1.8. We do not say anything about the existence of the limit in the above definition. One could prove that the limit exists using that $\left(H\left(\bigvee_{i=0}^{n-1} T^{-i}\alpha\right)\right)_{n \in \mathbb{N}}$ is a subadditive sequence.
We end this section by defining entropy on the measure preserving system, which was our goal. This is done by defining entropy on the transformation $T$. We choose an easy way to get rid of the dependence on the chosen partition, namely we just take the supremum over all partitions.
Definition 11.1.9. Define $P_n := \{\alpha \mid \alpha \text{ is a partition of } X \text{ and } H(\alpha) < \infty\}$. Then the entropy of $T$ is given by
$$h(T) := \sup\{h(\alpha, T) \mid \alpha \in P_n\}.$$
Note that we use $H$ to denote the entropy of a partition and $h$ to denote the entropy of a transformation.
Remark 11.1.10. In this section we only considered finite partitions. It turns out that exactly the same definitions can be given if we allow countable partitions. We will need this generalization in the other sections.
11.2 Calculation of entropy
In practice, calculating the entropy from the definition is impossible, because we have to take a supremum over possibly infinitely many partitions. If we were given a single partition, then we could use the properties of that partition to actually do a calculation, which seems much easier. Therefore, the question arises whether we could find a partition $\alpha$ of $X$ satisfying $h(T) = h(\alpha, T)$. In some cases this can be done, but first we need another definition: let $\alpha$ be a finite or countable partition of $X$ and let $m, k \in \mathbb{Z}$ such that $k < m$. Then we define:
$\alpha_k^m := \bigvee_{i=k}^{m} T^{-i}\alpha$,
$\sigma(\alpha, T)$ is the smallest $\sigma$-algebra containing all elements of $\alpha_k^m$ for all $m, k \in \mathbb{Z}$ with $k < m$.
Definition 11.2.1. A partition $\alpha$ of $X$ is called a generator of $T$ if $\sigma(\alpha, T) = \mathcal{F}$ (up to some sets of measure 0).
As an example we will compute the entropy of the first dynamical system we considered, namely the one where $T$ is the left shift. In this way we see that the generalized definition of entropy coincides with the one for that particular dynamical system. We will use the following theorem.
Theorem 11.2.2 (Kolmogorov-Sinai, 1958). If $\alpha$ is a finite or countable generator for $T$ satisfying $H(\alpha) < \infty$, then $h(T) = h(\alpha, T)$.
For the alphabet we take the set $\{1, 2, \ldots, n\}$ together with a given probability distribution $(p_1, \ldots, p_n)$, and we may assume $-\sum_{i=1}^{n} p_i \log p_i < \infty$. We can do this because we could just leave out the symbols which occur with probability 0. As earlier, we thus consider:
$X = \{1, \ldots, n\}^{\mathbb{Z}}$,
$\mathcal{F}$ is the $\sigma$-algebra generated by cylinder sets $\Delta(i_0, \ldots, i_{m-1}) := \{x \in X \mid x_0 = i_0, \ldots, x_{m-1} = i_{m-1}\}$,
$\mu$ is the product measure, that is $\mu(\Delta(i_0, \ldots, i_{m-1})) = p_{i_0} \cdots p_{i_{m-1}}$,
$T$ is the left shift.
For $\alpha$ we choose the time-zero partition, which means $\alpha = \{A_1, \ldots, A_n\}$, where $A_i := \Delta(i)$, that is $A_i = \{x \in X \mid x_0 = i\}$. Since $\mu(\Delta(i)) = p_i$, we then have $H(\alpha) = -\sum_{i=1}^{n} \mu(A_i) \log \mu(A_i) = -\sum_{i=1}^{n} p_i \log p_i < \infty$. Further we see that for every $m \in \mathbb{N}$: $T^{-m} A_i = \{x \in X \mid x_m = i\}$, so that
$$A_{i_0} \cap T^{-1}(A_{i_1}) \cap \ldots \cap T^{-m+1}(A_{i_{m-1}}) = \{x \in X \mid x_0 = i_0, \ldots, x_{m-1} = i_{m-1}\}.$$
From this we can conclude that $\bigvee_{i=0}^{m-1} T^{-i}\alpha$ is precisely the set of all cylinders of length $m$. In other words: taking the $\bigvee_{i=0}^{m-1} T^{-i}\alpha$ for all $m$ together, we obtain exactly the generators for $\mathcal{F}$. Therefore $\alpha$ is a (finite) generator of $T$. Then by the Kolmogorov-Sinai theorem we know $h(T) = h(\alpha, T) = \lim_{m \to \infty} \frac{1}{m} H\left(\bigvee_{i=0}^{m-1} T^{-i}\alpha\right)$. So we must compute $h(\alpha, T)$.
Now since each of $\alpha, T^{-1}\alpha, \ldots, T^{-m+1}\alpha$ determines a different coordinate and since we are using the product measure, we know that all these partitions are independent. Then by Lemma 11.1.5(i), (iv) we have:
$$\begin{aligned} H(\alpha \vee T^{-1}\alpha \vee \ldots \vee T^{-m+1}\alpha) &= H(\alpha) + H(T^{-1}\alpha) + \ldots + H(T^{-m+1}\alpha) \\ &= m \cdot H(\alpha) \\ &= -m \sum_{i=1}^{n} p_i \log p_i. \end{aligned}$$
So now we can conclude: $h(T) = \lim_{m \to \infty} \frac{1}{m} \left(-m \sum_{i=1}^{n} p_i \log p_i\right) = -\sum_{i=1}^{n} p_i \log p_i$. This indeed coincides with our earlier definition.
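The additivity used in this computation can be observed directly by listing all cylinders of a given length. The sketch below is our own illustration with an arbitrarily chosen distribution on three symbols; the function names are hypothetical.

```python
import itertools
import math

def entropy(masses):
    """H = -sum m log m over the given masses (0 log 0 = 0)."""
    return -sum(w * math.log(w) for w in masses if w > 0)

def cylinder_partition_entropy(p, m):
    """Entropy of the partition into cylinders of length m (product measure)."""
    masses = [
        math.prod(p[i] for i in word)
        for word in itertools.product(range(len(p)), repeat=m)
    ]
    return entropy(masses)

p = [0.5, 0.3, 0.2]  # assumed probabilities of the symbols 1, 2, 3
H1 = entropy(p)
for m in range(1, 7):
    Hm = cylinder_partition_entropy(p, m)
    print(m, Hm / m, H1)  # the entropy per symbol does not depend on m
```

Since $\frac{1}{m} H\left(\bigvee_{i=0}^{m-1} T^{-i}\alpha\right)$ is constant in $m$ here, the limit defining $h(\alpha, T)$ is reached immediately.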
11.3 The theorem of Lochs
Until now we have seen two very technical sections containing a lot of definitions. In this section we will see how we could use these definitions to give an alternative proof of Lochs's theorem. First we mention the following famous theorem, which we will use to prove the theorem of Lochs.
Theorem 11.3.1 (Shannon-McMillan-Breiman). Let $(X, \mathcal{F}, \mu, T)$ be an ergodic measure preserving system and let $\alpha$ be a finite or countable partition of $X$ satisfying $H(\alpha) < \infty$. Let $A_n(x)$ denote the unique element $A_n \in \bigvee_{i=0}^{n-1} T^{-i}\alpha$ such that $x \in A_n$. Then for almost every $x \in X$ we have:
$$\lim_{n \to \infty} -\frac{1}{n} \log \mu(A_n(x)) = h(\alpha, T).$$
11.3.1 Computation of h(T)
First of all, our goal is to compute $h(T)$, where $T$ is the continued fraction map. We will need this calculation to prove the theorem of Lochs.
Lemma 11.3.2. For almost every $x \in [0, 1)$:
$$\lim_{n \to \infty} \frac{\log \lambda(\Delta_n(x))}{\log \mu(\Delta_n(x))} = 1.$$
Proof. We use a well-known correspondence between the Lebesgue measure $\lambda$ and the Gauss measure $\mu$: for all $A \in \mathcal{L}$:
$$\frac{1}{2 \log 2} \lambda(A) \le \mu(A) \le \frac{1}{\log 2} \lambda(A).$$
For $A$ we take the cylinders $\Delta_n(x) \in \mathcal{L}$. Then the correspondence is equivalent to:
$$\log 2 \le \frac{\lambda(\Delta_n(x))}{\mu(\Delta_n(x))} \le 2 \log 2.$$
By multiplying by $\mu(\Delta_n(x))$ and taking logarithms, we obtain:
$$\log(\log 2) + \log \mu(\Delta_n(x)) \le \log \lambda(\Delta_n(x)) \le \log(2 \log 2) + \log \mu(\Delta_n(x)).$$
For $n$ large enough we have $\mu(\Delta_n(x)) < 1$, so that $\log \mu(\Delta_n(x)) < 0$. So dividing by $\log \mu(\Delta_n(x))$ flips the inequality symbols for these $n$:
$$\frac{\log(2 \log 2) + \log \mu(\Delta_n(x))}{\log \mu(\Delta_n(x))} \le \frac{\log \lambda(\Delta_n(x))}{\log \mu(\Delta_n(x))} \le \frac{\log(\log 2) + \log \mu(\Delta_n(x))}{\log \mu(\Delta_n(x))}.$$
Now we take a look at the denominator of the leftmost and rightmost side of the inequality. Observe that for almost every $x$ we have $\lim_{n \to \infty} \mu(\Delta_n(x)) = 0$, so that $\lim_{n \to \infty} \log \mu(\Delta_n(x)) = -\infty$. Using the Squeeze Test, this gives us the wanted result:
$$\lim_{n \to \infty} \frac{\log \lambda(\Delta_n(x))}{\log \mu(\Delta_n(x))} \text{ exists and is equal to } 1.$$
Notice that we could also prove this lemma for cylinders $D_n$ belonging to the decimal map $S$, since we only used that the measure of the cylinders tends to 0 as $n \to \infty$. A direct consequence of the lemma is the following equality for almost every $x$:
$$\lim_{n \to \infty} -\frac{1}{n} \log \lambda(\Delta_n(x)) = \lim_{n \to \infty} -\frac{1}{n} \log \mu(\Delta_n(x)),$$
that is, if one of both limits exists, then so does the other and these limits are equal. Soon we will see that the left-hand limit exists.
Now we will compute $h(T)$ using the proposition of Lévy, which we proved in the previous lecture. Recall that this proposition says the following: for almost every $x \in [0, 1)$ we have $\lim_{n \to \infty} \frac{1}{n} \log q_n = \frac{\pi^2}{12 \log 2}$. The following lemma is a consequence of Lévy's result.
Lemma 11.3.3. For almost every $x \in [0, 1)$:
$$\lim_{n \to \infty} -\frac{1}{n} \log \lambda(\Delta_n(x)) = \frac{\pi^2}{6 \log 2}.$$
Proof. From the previous lecture we know $\lambda(\Delta_n(x)) = \frac{1}{q_n(q_n + q_{n-1})}$, so therefore
$$\log \lambda(\Delta_n(x)) = \log 1 - \log q_n(q_n + q_{n-1}) = -\log q_n - \log(q_n + q_{n-1}).$$
On one hand, we have $q_n + q_{n-1} > q_n$, so that $\log q_n + \log(q_n + q_{n-1}) > 2 \log q_n$. From this we thus obtain $\log \lambda(\Delta_n(x)) < -2 \log q_n$. On the other hand, we have $q_n + q_{n-1} \le 2 q_n$, so that $\log q_n + \log(q_n + q_{n-1}) \le \log q_n + \log 2 q_n = \log 2 + 2 \log q_n$. From this we thus obtain $\log \lambda(\Delta_n(x)) \ge -\log 2 - 2 \log q_n$. Taking these two observations together, we have the following result:
$$-\log 2 - 2 \log q_n \le \log \lambda(\Delta_n(x)) \le -2 \log q_n.$$
Using the proposition of Lévy and again the squeeze test, we can now conclude that for almost every $x$:
$$\lim_{n \to \infty} -\frac{1}{n} \log \lambda(\Delta_n(x)) \text{ exists and is equal to } 2 \lim_{n \to \infty} \frac{1}{n} \log q_n = \frac{\pi^2}{6 \log 2}.$$
With all these results it is easy to compute h(T). For the partition
we take a time-zero partition, which was introduced in the previous section.
This time we choose this partition with respect to the cylinders
n
belonging
to the continued fraction map T. We do not discuss the details here that
is a generator of T and that it has nite entropy. Using the theorems 11.2.2
and 11.3.1 and the previous two lemmas, we have for almost every x:
h(T) = h(, T) = lim
n
1
n
log (
n
(x)) = lim
n
1
n
log (
n
(x)) =

2
6 log 2
.
Remark 11.3.4. One could also compute entropy by the Rohlin Entropy Formula, which for the continued fraction map has the form
\[ h(T) = \int_{[0,1)} \log |T'(x)| \, d\mu(x). \]
This formula holds for a larger class of functions, including the $n$-ary transformations $T_n : x \mapsto nx \bmod 1$. For example, for the decimal map $S : x \mapsto 10x \bmod 1$ one obtains $h(S) = \log 10$.
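As a numerical sanity check of the remark, the sketch below approximates the Rohlin integral for the continued fraction map with a plain midpoint rule. It assumes that $\mu$ is the Gauss measure with density $\frac{1}{\log 2}\frac{1}{1+x}$ and uses $|T'(x)| = 1/x^2$; the function name and the number of cells are illustrative choices, not part of the lecture.

```python
import math

def rohlin_entropy_gauss(cells=200_000):
    """Midpoint-rule estimate of the integral of log|T'(x)| against the Gauss measure."""
    h = 1.0 / cells
    total = 0.0
    for k in range(cells):
        x = (k + 0.5) * h
        log_derivative = -2.0 * math.log(x)            # log|T'(x)|, since |T'(x)| = 1/x^2
        gauss_density = 1.0 / (math.log(2) * (1 + x))  # assumed density of mu
        total += log_derivative * gauss_density * h
    return total

estimate = rohlin_entropy_gauss()
exact = math.pi ** 2 / (6 * math.log(2))
print(estimate, exact)
```

The two printed values should agree to several decimals, which is consistent with $h(T) = \pi^2 / (6 \log 2)$.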
11.3.2 Proof of the theorem
Let us now discuss just a few more results before we get to the theorem of
Lochs.
Lemma 11.3.5. For a cylinder $\Delta_n = \Delta(a_1, \ldots, a_n)$, let us define $\Delta_n^+ := \Delta(a_1, \ldots, a_{n-1}, a_n + 1)$. Then $\lambda(\Delta_n) \le 3 \lambda(\Delta_n^+)$. The cylinders $\Delta_n$ and $\Delta_n^+$ are called adjacent.
Proof. Notice that for $i = 1, \ldots, n-1$ we have $q_i = q_i^+$. Furthermore
\[ q_n^+ = (a_n + 1) q_{n-1}^+ + q_{n-2}^+ = (a_n + 1) q_{n-1} + q_{n-2} = q_n + q_{n-1}. \]
Therefore we obtain
\[ \lambda(\Delta_n^+) = \frac{1}{q_n^+ (q_n^+ + q_{n-1}^+)} = \frac{1}{(q_n + q_{n-1})(q_n + 2 q_{n-1})}. \]
The claim $\frac{1}{3} \lambda(\Delta_n) \le \lambda(\Delta_n^+)$ is equivalent to $3 q_n (q_n + q_{n-1}) \ge (q_n + q_{n-1})(q_n + 2 q_{n-1})$. Here the left-hand side is equal to $3 q_n^2 + 3 q_n q_{n-1}$, while the right-hand side equals $q_n^2 + 3 q_n q_{n-1} + 2 q_{n-1}^2$. Since $q_{n-1} \le q_n$, the desired inequality holds. This proves the lemma.
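The lemma can be checked on concrete cylinders with exact arithmetic. The following sketch uses the recurrence $q_n = a_n q_{n-1} + q_{n-2}$ with $q_{-1} = 0$, $q_0 = 1$ and the length formula $\lambda(\Delta_n) = 1/(q_n(q_n + q_{n-1}))$ from the proof; the helper names are my own.

```python
from fractions import Fraction

def denominators(quotients):
    """Return (q_{n-1}, q_n) for the partial quotients a_1, ..., a_n."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    for a in quotients:
        q_prev, q = q, a * q + q_prev
    return q_prev, q

def cylinder_length(quotients):
    """Lebesgue measure 1 / (q_n (q_n + q_{n-1})) of the cylinder."""
    q_prev, q = denominators(quotients)
    return Fraction(1, q * (q + q_prev))

def lemma_holds(quotients):
    """Check lambda(Delta_n) <= 3 * lambda(Delta_n^+) for one cylinder."""
    adjacent = quotients[:-1] + [quotients[-1] + 1]
    return cylinder_length(quotients) <= 3 * cylinder_length(adjacent)

print(lemma_holds([1]), lemma_holds([3, 1, 4, 1, 5]))
```

For the cylinder $\Delta(1)$ the inequality is an equality ($\frac12 = 3 \cdot \frac16$), so the constant 3 cannot be improved.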
Now we are going to take a closer look at how $\Delta_{n+1}$ is obtained from $\Delta_n$. From my previous lecture, we know what the $\Delta_n$ look like: they are half-open intervals and the precise form depends on whether $n$ is even or odd. We see that if $n$ is odd, then $\Delta_n(1), \Delta_n(2), \ldots$ is a sequence of disjoint cylinders of length $n+1$ in $\Delta_n$ and they are ordered from left to right. Here $\Delta_n(j)$ is the set of real numbers in $\Delta_n$ with the $(n+1)$st partial quotient equal to $j$. Similarly, if $n$ is even, then we have this sequence in $\Delta_n$ ordered from right to left (for $n = 0$, for instance, $\Delta(1) = (\frac12, 1)$ lies to the right of $\Delta(2) = (\frac13, \frac12]$). In other words: to obtain $\Delta_{n+1}$ from $\Delta_n$, one refines $\Delta_n$ from left to right if $n$ is odd, and from right to left if $n$ is even.
Suppose we are given such a $\Delta_n$. When we repeat the described process for cylinders of length $n+1, n+2, \ldots$, we alternately refine from left to right and from right to left. Every time we consider a next level, the intervals shrink. Now if we consider a sufficiently deep level (which means cylinders of length $n+j$ for $j$ large enough), there exists an interval which is small enough in the following sense: suppose $I$ is any interval in $[0,1)$ and $\Delta_n$ is the smallest cylinder containing $I$, then for almost every $x \in I$ we have either $\Delta_{n+j}(x) \subset I$, or its adjacent cylinder $(\Delta_{n+j}(x))^+ \subset I$. The important thing here is that $j$ is bounded. Actually, it turns out that $j$ is at most 3.
Using this knowledge, we are now able to give a proof of Lochs's theorem. The above discussion appears in the proof to compare the number of partial quotients to the number of decimals.
Theorem 11.3.6 (Gustav Lochs). For $x \in [0,1)$, let $m$ denote the number of partial quotients $b_k(x)$ of the continued fraction expansion of $x$ that can be obtained from the first $n$ decimals in the decimal expansion of $x$. Then for almost every $x \in [0,1)$ we have
\[ \lim_{n\to\infty} \frac{m}{n} = \frac{6 \log 2 \log 10}{\pi^2} \approx 0.9703. \]
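Proof. Let $S$ denote the decimal map $x \mapsto 10x \bmod 1$ and $D_n$ the cylinders belonging to $S$. Suppose $x \in [0,1)$ and we have fixed some $n$. Given the interval $D_n(x) \subset [0,1)$, then by definition of $m$, the smallest cylinder containing $D_n(x)$ is $\Delta_m(x)$. By the above discussion, we find $j \le 3$ such that either $\Delta_{m+j}(x) \subset D_n(x)$ or $(\Delta_{m+j}(x))^+ \subset D_n(x)$. If we denote the cylinder we find by $\tilde\Delta_{m+j}$, we thus have
\[ \tilde\Delta_{m+j} \subset D_n(x) \subset \Delta_m(x). \]
Using lemma 11.3.5, we see that regardless of the precise form of $\tilde\Delta_{m+j}$, we always have $\frac{1}{3} \lambda(\Delta_{m+j}(x)) \le \lambda(\tilde\Delta_{m+j})$. So we have
\[ \frac{1}{3} \lambda(\Delta_{m+j}(x)) \le \lambda(D_n(x)) \le \lambda(\Delta_m(x)), \]
which implies
\[ -\frac{1}{n} \log 3 + \frac{1}{n} \log \lambda(\Delta_{m+j}(x)) \le \frac{1}{n} \log \lambda(D_n(x)) \le \frac{1}{n} \log \lambda(\Delta_m(x)) \tag{11.1} \]
by taking logarithms and dividing by $n$.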
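We write the right-hand inequality as
\[ \frac{1}{n} \log \lambda(D_n(x)) \le \frac{m}{n} \cdot \frac{1}{m} \log \lambda(\Delta_m(x)), \]
which we can rewrite as
\[ \frac{m}{n} \le \frac{\frac{1}{n} \log \lambda(D_n(x))}{\frac{1}{m} \log \lambda(\Delta_m(x))} = \frac{\frac{1}{n} \log \lambda(D_n(x))}{\frac{1}{m} \log \mu(\Delta_m(x))} \cdot \frac{\log \mu(\Delta_m(x))}{\log \lambda(\Delta_m(x))}. \]
Here one must notice that $\log \lambda(\Delta_m(x)) < 0$, so that dividing by it reverses the inequality and we indeed have a $\le$-symbol. Taking limits, we have from lemma 11.3.2 for almost every $x \in [0,1)$:
\[ \lim_{n\to\infty} \frac{\log \mu(\Delta_m(x))}{\log \lambda(\Delta_m(x))} = 1. \]
Furthermore, using our computation of $h(T)$, and similarly one for $h(S)$, we obtain
\[ \lim_{n\to\infty} -\frac{1}{n} \log \lambda(D_n(x)) = h(S) \quad\text{and}\quad \lim_{n\to\infty} -\frac{1}{m} \log \mu(\Delta_m(x)) = h(T). \]
Therefore we have
\[ \limsup_{n\to\infty} \frac{m}{n} \le \frac{-h(S)}{-h(T)} = \frac{h(S)}{h(T)}. \]
So far, we have only used the right-hand inequality of equation 11.1. Considering the left-hand inequality, we obtain a similar result, namely
\[ \liminf_{n\to\infty} \frac{m}{n} \ge \frac{h(S)}{h(T)}, \]
so that
\[ \lim_{n\to\infty} \frac{m}{n} = \frac{h(S)}{h(T)}. \]
As we already mentioned, a calculation with the Rohlin Entropy Formula for $S$ results in $h(S) = \log 10$. Therefore we can finally conclude
\[ \lim_{n\to\infty} \frac{m}{n} = \frac{\log 10}{\frac{\pi^2}{6 \log 2}} = \frac{6 \log 2 \log 10}{\pi^2}. \]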
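The theorem can be illustrated by a small experiment. The sketch below takes a string of $n$ decimals, forms the decimal cylinder $[d/10^n, (d+1)/10^n]$ and counts the partial quotients shared by the expansions of its two endpoints. This is only an approximation of $m$: rational endpoints have two expansions, so the count may be off by a small bounded amount, which does not matter for the ratio $m/n$. The digit string is pseudo-random, and all names are illustrative assumptions.

```python
import math
import random

def cf_quotients(p, q):
    """Partial quotients a_1, a_2, ... of p/q in [0, 1], via Euclid's algorithm."""
    out = []
    while p:
        a, r = divmod(q, p)
        out.append(a)
        p, q = r, p
    return out

def lochs_count(digits):
    """Number of leading partial quotients shared by both endpoints of the decimal cylinder."""
    n = len(digits)
    d = int(digits)
    lo = cf_quotients(d, 10 ** n)
    hi = cf_quotients(d + 1, 10 ** n)
    m = 0
    for a, b in zip(lo, hi):
        if a != b:
            break
        m += 1
    return m

rng = random.Random(2013)
digits = "".join(rng.choice("0123456789") for _ in range(400))
ratio = lochs_count(digits) / len(digits)
lochs_constant = 6 * math.log(2) * math.log(10) / math.pi ** 2
print(ratio, lochs_constant)
```

For a single $x$ and moderate $n$ the ratio fluctuates around the limit, so one should only expect rough agreement with $0.9703$.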
References
During my research on entropy and its relation to the theorem of Lochs, I
used the book [21].
One could have observed that these lecture notes contain a lot of gaps, which are left to the reader. For example the discussion following the lemma about adjacent cylinders is not very precise, and the proofs of two famous theorems are omitted. Therefore, this project provides an opportunity for further research. Completing this research will lead to a full understanding of what is happening here.
For now I chose not to go through these gaps, because I think I would pass the limit of this course "Continued Fractions". But I certainly think these are interesting points to have a look at in the future.
Chapter 12
Complex continued fractions
Ewelina Omiljan
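12.1 Greatest common divisor of two Gaussian integers
Take
\[ u, v \in \{a + bi \mid a, b \in \mathbb{Z}\} \]
such that
\[ |u| \ge |v|. \]
Starting from $(u_0, v_0) = (u, v)$ we put
\[ (u_{k+1}, v_{k+1}) = (v_k,\ u_k - g_{k+1} v_k), \]
where $g_{k+1}$ is a Gaussian integer nearest to $\frac{u_k}{v_k}$ (rounding the real and imaginary part down in case of a tie). We halt when some $v_r = 0$; then $u_r$ is a greatest common divisor of $u$ and $v$.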
Example 18.
$(u, v) = (u_0, v_0) = (77 + 190i,\ 20 + 204i)$, with $|u_0| \ge |v_0|$
$g_1 = 1 + 0i$
$(u_1, v_1) = (20 + 204i,\ 77 + 190i - 1 \cdot (20 + 204i)) = (20 + 204i,\ 57 - 14i)$
$g_2 = 0 + 3i$
$(u_2, v_2) = (57 - 14i,\ -22 + 33i)$
$g_3 = -1 - i$
$(u_3, v_3) = (-22 + 33i,\ 2 - 3i)$
$g_4 = -11$
$(u_4, v_4) = (2 - 3i,\ 0)$
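The steps of Example 18 can be reproduced with exact integer arithmetic. In the sketch below a Gaussian integer is a pair `(re, im)`; the quotient $u_k/v_k$ is computed as $u_k \overline{v_k} / |v_k|^2$ and rounded to a nearest Gaussian integer, with ties rounded down (my reading of the rounding rule). The function names are assumptions made for this illustration.

```python
def round_half_down(n, d):
    """Nearest integer to n/d for d > 0, with ties rounded down."""
    return -((d - 2 * n) // (2 * d))

def gauss_gcd(u, v):
    """Euclidean algorithm on Gaussian integers given as (re, im) pairs.

    Returns the last non-zero remainder and the list of quotients g_1, g_2, ...
    """
    quotients = []
    while v != (0, 0):
        (a, b), (c, d) = u, v
        norm = c * c + d * d
        # u / v = u * conj(v) / |v|^2
        re_num = a * c + b * d
        im_num = b * c - a * d
        g = (round_half_down(re_num, norm), round_half_down(im_num, norm))
        # remainder u - g * v
        r = (a - (g[0] * c - g[1] * d), b - (g[0] * d + g[1] * c))
        quotients.append(g)
        u, v = v, r
    return u, quotients

print(gauss_gcd((77, 190), (20, 204)))
```

Running it on $(77 + 190i, 20 + 204i)$ gives the quotients $1,\ 3i,\ -1 - i,\ -11$ and the greatest common divisor $2 - 3i$ of the example.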
12.2 Generalized circles
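Definition 19. A generalized circle, or g-circle, is the set of complex solutions to an equation of the form
\[ A w \overline{w} + B w + \overline{B}\,\overline{w} + D = 0, \]
where $\overline{w}$ denotes the complex conjugate of $w$, $A$ and $D$ are real coefficients and $B$ is a complex coefficient. They satisfy $B\overline{B} - AD \ge 0$.
We can denote a g-circle by the matrix $\begin{pmatrix} A & \overline{B} \\ B & D \end{pmatrix}$. Then the equation has the form
\[ A w \overline{w} + B w + \overline{B}\,\overline{w} + D = \begin{pmatrix} \overline{w} & 1 \end{pmatrix} \begin{pmatrix} A & \overline{B} \\ B & D \end{pmatrix} \begin{pmatrix} w \\ 1 \end{pmatrix}. \]
We use this definition because the set of solutions in the complex $w = x + yi$-plane forms an ordinary circle with centre $-\frac{\overline{B}}{A}$ and radius $\frac{\sqrt{|B|^2 - AD}}{|A|}$ when $A \ne 0$. When $A = 0$ they form a line $ax - by = -\frac{D}{2}$. This follows from
\[ Bw + \overline{B}\,\overline{w} + D = 0 \quad\text{with } B = a + bi,\ w = x + yi, \]
\[ (a + bi)(x + yi) + (a - bi)(x - yi) = -D, \]
\[ ax - by = -\frac{D}{2}. \]
If $A = 0$ then $ax - by = -\frac{D}{2}$, and when $D = 0$ it is of the form $y = \frac{a}{b} x$, so it passes through the origin. If $A \ne 0$ and $D = 0$, then the g-circle has centre $-\frac{\overline{B}}{A}$ and radius $\left|\frac{B}{A}\right|$.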
12.3 Hurwitz mapping
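Now we will show that the map
\[ w \mapsto \frac{1}{w} \]
maps g-circles to g-circles. To show that we take a g-circle
\[ C = \begin{pmatrix} A & \overline{B} \\ B & D \end{pmatrix}. \]
Under the map we get:
\[ A \frac{1}{w} \cdot \frac{1}{\overline{w}} + B \frac{1}{w} + \overline{B} \frac{1}{\overline{w}} + D = 0 \]
\[ A + B\overline{w} + \overline{B} w + D w \overline{w} = 0 \]
So our new circle has the form
\[ C' = \begin{pmatrix} D & B \\ \overline{B} & A \end{pmatrix}. \]
And it is a g-circle as well.
Any translation of the complex plane also maps g-circles to g-circles. Let's take
\[ Hw = \frac{1}{w} - \alpha \]
and the g-circle
\[ C = \begin{pmatrix} A & \overline{B} \\ B & D \end{pmatrix}. \]
We get:
\[ A\left(\frac{1}{w} - \alpha\right)\left(\frac{1}{\overline{w}} - \overline{\alpha}\right) + B\left(\frac{1}{w} - \alpha\right) + \overline{B}\left(\frac{1}{\overline{w}} - \overline{\alpha}\right) + D = 0 \]
\[ w\overline{w}\,(A\alpha\overline{\alpha} - B\alpha - \overline{B}\overline{\alpha} + D) + w(-A\alpha + \overline{B}) + \overline{w}(-A\overline{\alpha} + B) + A = 0 \]
\[ C' = \begin{pmatrix} A\alpha\overline{\alpha} - B\alpha - \overline{B}\overline{\alpha} + D & -A\overline{\alpha} + B \\ -A\alpha + \overline{B} & A \end{pmatrix}. \]
And after a lot of calculations we can write $C'$ as:
\[ C' = \begin{pmatrix} -\overline{\alpha} & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} A & \overline{B} \\ B & D \end{pmatrix} \begin{pmatrix} -\alpha & 1 \\ 1 & 0 \end{pmatrix}. \]
We should notice that $H$ leaves the determinant of the matrix $C$ invariant. Indeed:
\[ A^2\alpha\overline{\alpha} - BA\alpha - \overline{B}A\overline{\alpha} + DA - (A^2\alpha\overline{\alpha} - A\overline{B}\overline{\alpha} - AB\alpha + B\overline{B}) = DA - B\overline{B}, \]
as for $C$ before applying the map $H$.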
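We take a complex number $z$ and put
\[ \alpha_0 = \lfloor z \rceil, \qquad z_0 = z - \alpha_0, \]
where $\lfloor z \rceil$ denotes the Gaussian integer nearest to $z$. Let the circle $C_0$ have centre $-\alpha_0$ and radius $|z|$:
\[ |w + \alpha_0|^2 = (w + \alpha_0)(\overline{w} + \overline{\alpha_0}) = |z|^2 \]
\[ (w + \alpha_0)(\overline{w} + \overline{\alpha_0}) - |z|^2 = 0 \]
\[ w\overline{w} + w\overline{\alpha_0} + \overline{w}\alpha_0 + |\alpha_0|^2 - |z|^2 = 0 \]
From there we have:
\[ C_0 = \begin{pmatrix} 1 & \alpha_0 \\ \overline{\alpha_0} & |\alpha_0|^2 - |z|^2 \end{pmatrix}. \]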
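For $n \ge 1$ we define $\alpha_n = \left\lfloor \frac{1}{z_{n-1}} \right\rceil$ and $z_n = \frac{1}{z_{n-1}} - \alpha_n$. Then $[\alpha_0, \alpha_1, \ldots]$ is the Hurwitz continued fraction expansion of $z$. By the definition, $z_0$ lies on the g-circle $C_0$ and also in the unit box $B$, where
\[ B = \{ w \in \mathbb{C} \mid -\tfrac{1}{2} \le \Re w,\ \Im w \le \tfrac{1}{2} \}. \]
If we apply $H_1 : w \mapsto \frac{1}{w} - \alpha_1$ to $z_0$ we obtain $z_1 \in B$. Repeating this we find g-circles $C_0, C_1, C_2, \ldots$ with corresponding matrices $C_j = \begin{pmatrix} A_j & \overline{B_j} \\ B_j & D_j \end{pmatrix}$ for $j \ge 0$ and complex numbers $z_j \in C_j \cap B$.
Moreover, $A_j D_j - B_j \overline{B_j} = A_0 D_0 - B_0 \overline{B_0} = -|z|^2$ for $j \ge 1$. We call $C_j$ the sequence of g-circles corresponding to the Hurwitz expansion $z = [\alpha_0, \alpha_1, \ldots]$.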
Lemma 1. If $|z|^2 = n \in \mathbb{Z}$ then for the g-circles $C_j = \begin{pmatrix} A_j & \overline{B_j} \\ B_j & D_j \end{pmatrix}$ corresponding to the Hurwitz continued fraction expansion of $z$ it holds that $A_j, D_j \in \mathbb{Z}$ and $B_j \in \mathbb{Z}[i]$ and $B_j \overline{B_j} - A_j D_j = n$.
Proof. It is true for $j = 0$, since then $C_0 = \begin{pmatrix} 1 & \alpha_0 \\ \overline{\alpha_0} & |\alpha_0|^2 - |z|^2 \end{pmatrix}$ where $\alpha_0 \in \mathbb{Z}[i]$ is the nearest Gaussian integer to $z$. For $j > 0$ we use induction and the Hurwitz map $H_j : w \mapsto \frac{1}{w} - \alpha_j$.
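To see the lemma at work, the sketch below computes the first Hurwitz partial quotients of $z = \sqrt{2} + i\sqrt{5}$ (so $n = 7$) in floating point and tracks the coefficients $(A_j, B_j, D_j)$ of the circle through $z_j$. The update rule used here is my own derivation, obtained by substituting $w = 1/(w' + \alpha)$ into $Aw\overline{w} + Bw + \overline{B}\,\overline{w} + D = 0$, namely $A' = D$, $B' = D\overline{\alpha} + \overline{B}$, $D' = A + B\overline{\alpha} + \overline{B}\alpha + D|\alpha|^2$; it is an assumption of this sketch rather than a formula quoted from the text. Floating point limits this to a handful of steps.

```python
import math

def nearest_gaussian(w):
    """Nearest Gaussian integer (ties rounded up here; irrelevant for generic w)."""
    return complex(math.floor(w.real + 0.5), math.floor(w.imag + 0.5))

def hurwitz_with_circles(z, steps):
    """Return partial quotients, remainders z_j and circle coefficients (A_j, B_j, D_j)."""
    alpha = nearest_gaussian(z)
    A, B, D = 1.0, alpha.conjugate(), abs(alpha) ** 2 - abs(z) ** 2
    w = z - alpha
    quotients, remainders, circles = [alpha], [w], [(A, B, D)]
    for _ in range(steps):
        inv = 1 / w
        alpha = nearest_gaussian(inv)
        # coefficients of the image circle under w -> 1/w - alpha (derived, see lead-in)
        A, B, D = (D,
                   D * alpha.conjugate() + B.conjugate(),
                   A + 2 * (B * alpha.conjugate()).real + D * abs(alpha) ** 2)
        w = inv - alpha
        quotients.append(alpha)
        remainders.append(w)
        circles.append((A, B, D))
    return quotients, remainders, circles

z = complex(math.sqrt(2), math.sqrt(5))
quotients, remainders, circles = hurwitz_with_circles(z, 8)
print(quotients[:6])
print([round(abs(Bj) ** 2 - Aj * Dj, 6) for Aj, Bj, Dj in circles])
```

Up to rounding, every entry of the second list equals $n = 7$, and each remainder $z_j$ satisfies the equation of its circle $C_j$.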
12.4 Finite number of g-circles
Theorem 12.4.1. Let $z$ be a complex number. If $n = |z|^2 \in \mathbb{Z}_{>0}$ then the sequence $C_0, C_1, \ldots$ of g-circles corresponding to the Hurwitz expansion of $z$ consists of finitely many different g-circles.
Proof. If $A_j = 0$ the g-circle is a line $r_j x - i_j y = -\frac{D_j}{2}$, where $r_j = \Re B_j$ and $i_j = \Im B_j$ are rational integers satisfying $r_j^2 + i_j^2 = n$, because
\[ 0 - B_j \overline{B_j} = -n \iff -(r_j^2 + i_j^2) = -n. \]
From that we have that there are only finitely many solutions for $B_j$. For the line to intersect the unit box one needs $|D_j| \le |r_j| + |i_j|$. For the case $A_j \ne 0$ we use induction on $j$. We want to show that the radius $R_j$ satisfies $R_j^2 > \frac{1}{8}$ for all $j$.
For $j = 0$: $R_0^2 = n$, because $R_0 = \frac{\sqrt{|\alpha_0|^2 - |\alpha_0|^2 + |z|^2}}{1} = |z|$ and $n \in \mathbb{Z}_{>0}$.
The induction hypothesis is that if $C_{j-1}$ is a proper circle, then its radius satisfies $R_{j-1}^2 > \frac{1}{8}$.
Suppose that the g-circle $C_j$ passes through the origin for some $j \ge 1$. This means that $D_j = 0$. It has to intersect the unit box $B$, so the point $P$ on it has to be at distance less than $\frac{1}{\sqrt{2}}$ from the origin.
Under $H$ the point $P$ of $C_{j-1}$ gets mapped to the point opposite the origin on $C_j$ and will be at distance at least $\sqrt{2}$. From there $R_j^2$ of $C_j$ will be at least $\frac{1}{2}$.
In the remaining cases $A_j$ and $D_j$ are non-zero integers, and $A_{j-1}$, $D_{j-1}$ as well.
Suppose that $A_{j-1}$ and $D_{j-1}$ have opposite signs. This means that the origin is an interior point of the g-circle $C_{j-1}$. The point $z_{j-1} \in C_{j-1} \cap B$ is at distance at most $\frac{1}{\sqrt{2}}$ from the origin. The image of $C_{j-1}$ under $H_0 : w \mapsto \frac{1}{w}$ is a g-circle that also has the origin as an interior point. It contains $\frac{1}{z_{j-1}}$, which is at distance at least $\sqrt{2}$ from the origin. Hence its diameter exceeds $\sqrt{2}$, and since the translation by $-\alpha_j$ does not change the radius, $R_j^2 > \frac{1}{2}$.
Now if $A_{j-1}$ and $D_{j-1}$ have the same sign, then the origin is an exterior point of $C_{j-1}$ and $C_j$. The point $P$ on $C_{j-1}$ nearest to the origin is at distance $p < \frac{1}{\sqrt{2}}$. The diametrically opposed point $Q$ on the same g-circle is at distance $p + d$ from the origin, where $d$ is the diameter of $C_{j-1}$. Using the induction hypothesis: $d = 2R_{j-1} > \frac{1}{\sqrt{2}}$. The diameter of the image of $C_{j-1}$ under $H_0$, and hence the diameter of $C_j$, is
\[ \frac{1}{p} - \frac{1}{d + p} = \frac{d}{(p + d)p} > \frac{\frac{1}{\sqrt{2}}}{p\left(p + \frac{1}{\sqrt{2}}\right)} > \frac{p}{p\left(p + \frac{1}{\sqrt{2}}\right)} > \frac{1}{\frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}}} = \frac{1}{\sqrt{2}}, \]
so $R_j^2 > \frac{1}{8}$.
In any case we see that $R_j^2 > \frac{1}{8}$. From that and the lemma we have:
\[ R_j^2 = \frac{B_j \overline{B_j} - A_j D_j}{A_j^2} = \frac{n}{A_j^2}. \]
This leaves only finitely many possibilities for the integer $A_j$. For $C_j$ to intersect the unit box, its centre $-\frac{\overline{B_j}}{A_j}$ cannot be too far from the origin:
\[ \left|\frac{B_j}{A_j}\right| \le \frac{1}{2}\sqrt{2} + \frac{\sqrt{n}}{|A_j|}. \]
And this leaves only finitely many possibilities for $B_j$ for each $A_j$. $D_j$ is determined by $A_j$ and $B_j$, so we have the same situation for $D_j$.
12.5 Bounded partial quotients
Corollary 12.5.1. Let $z \in \mathbb{C}$ be such that its norm $n = |z|^2 \in \mathbb{Z}_{>0}$ is not the sum of two squares of integers. Then the partial quotients in the Hurwitz continued fraction of $z$ are bounded.
Proof. According to Theorem 12.4.1 the remainders $z_i$ of the Hurwitz continued fraction operator all lie on a finite number of different g-circles. If a g-circle $C_j$ passes through the origin then we know that $D_j = 0$ and our g-circle is
\[ \begin{cases} \text{if } A = 0: & \text{the line } y = \frac{a}{b} x, \text{ where } B = a + bi,\ w = x + yi, \\ \text{if } A \ne 0: & \text{the circle with centre } -\frac{\overline{B}}{A} \text{ and radius } \left|\frac{B}{A}\right|, \end{cases} \]
and $B_j$ is a Gaussian integer by the previous lemma, with
\[ B_j \overline{B_j} - A_j D_j = n \implies B_j \overline{B_j} = n, \]
\[ B_j \overline{B_j} = (a + bi)(a - bi) = a^2 + b^2. \]
This contradicts the assumption that $n$ is not a sum of two integer squares. So none of the g-circles passes through the origin. It means that there exists a constant $c > 0$, the shortest distance from any of the g-circles to the origin, such that $|z_j| \ge c$. Since a nearest Gaussian integer differs from $\frac{1}{z_j}$ by at most $\frac{1}{\sqrt{2}}$, we then have
\[ |\alpha_{j+1}| = \left| \left\lfloor \frac{1}{z_j} \right\rceil \right| \le \frac{1}{c} + \frac{1}{\sqrt{2}}. \]
So we have our bound.
Theorem 12.5.2. For every even integer $d$ there exists an algebraic element $\alpha \in \mathbb{C} \setminus \mathbb{R}$ of degree $d$ over $\mathbb{Q}$ for which the Hurwitz continued fraction expansion has bounded partial quotients.
Examples 12.5.3. Let's start with $z = \sqrt{2} + i\sqrt{5}$, whose norm is $|z|^2 = n = 7$.
The minimal polynomial for $z$ is $z^4 + 6z^2 + 49$.
The Hurwitz continued fraction expansion of $z$ reads:
\[ z = [2i + 1,\ -i + 2,\ i - 5,\ -i - 2,\ -4,\ i - 2,\ -4,\ -2,\ i - 1,\ -2i,\ \ldots] \]
Doug Hensley calculated that there are probably 72 g-circles for this $z$.
[Figure: the intersection of the g-circles of $z = \sqrt{2} + i\sqrt{5}$ with the unit box $B$.]
[Figure: the partial quotients of $z = \sqrt{2} + i\sqrt{5}$.]
Now let's take an example of a transcendental number, $z = \sqrt{\pi} + i\sqrt{7 - \pi}$. The picture of g-circles and partial quotients is similar to the previous one, so I will not include the picture again. It is hard to find out whether a given number is transcendental or not. To prove it for this $z$ we will probably need that $\pi$ is transcendental.
An example where the situation is different is $z = \sqrt{2} + i\sqrt{3}$, where $|z|^2 = n = 5 = 1^2 + 2^2$, so $n$ is the sum of two squares of integers.
[Figure: the g-circles of this $z$; some of them pass through the origin.]
[Figure: the partial quotients of $z = \sqrt{2} + i\sqrt{3}$. They are unbounded.]
Conjecture 12.5.4. Let $z \in \mathbb{C}$ be such that its norm $n = |z|^2 \in \mathbb{Z}_{>0}$ is the sum of two squares of integers. Then the partial quotients in the Hurwitz continued fraction of $z$ are unbounded, unless $z$ is in $\mathbb{Q}(i)$ or quadratic over $\mathbb{Q}(i)$.
In my presentation I used the work [13].
Chapter 13
Geodesics
Willem van Loon
This chapter will be about geodesics. There is a very interesting connection between geodesics (on the modular surface $M$, the quotient of the hyperbolic plane by the modular group $SL(2, \mathbb{Z})$) and continued fractions.
Let $\mathbb{H} = \{ z \in \mathbb{C} \mid \Im(z) > 0 \}$ be the upper half-plane with the Poincaré metric
\[ ds^2 = \frac{dx^2 + dy^2}{y^2}. \]
Furthermore, let
\[ SL(2, \mathbb{Z}) = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \,\middle|\, ad - bc = 1,\ a, b, c, d \in \mathbb{Z} \right\}. \]
This modular group acts on the upper half-plane as a group of fractional linear transformations via the correspondence
\[ \begin{pmatrix} a & b \\ c & d \end{pmatrix} z = \frac{az + b}{cz + d} \in \mathbb{H}. \]
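Proposition 13.0.5. Let $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2, \mathbb{Z})$. Then $\gamma$ maps $\mathbb{H}$ to itself bijectively.
Proof.
\[ \Im\left(\frac{az + b}{cz + d}\right) = \Im\left(\frac{(az + b)(c\overline{z} + d)}{|cz + d|^2}\right) = \frac{\Im(ac\,z\overline{z} + adz + bc\overline{z} + bd)}{|cz + d|^2} = \frac{\Im((ad - bc)z)}{|cz + d|^2} = \frac{\Im(z)}{|cz + d|^2} > 0. \]
The inverse function of $\gamma$ is
\[ \gamma^{-1} = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]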
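The identity $\Im(\gamma z) = \Im(z)/|cz + d|^2$ used in the proof, and the formula for $\gamma^{-1}$, are easy to test numerically. In the sketch below a matrix is a pair of rows of integers; the function names are illustrative.

```python
def mobius(m, z):
    """Apply the fractional linear transformation given by m = ((a, b), (c, d)) to z."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def inverse(m):
    """Inverse of a matrix of determinant 1."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

gamma = ((2, 1), (1, 1))  # determinant 2*1 - 1*1 = 1
z = 0.3 + 0.7j
w = mobius(gamma, z)
print(w.imag, z.imag / abs(1 * z + 1) ** 2)  # both sides of the identity
print(mobius(inverse(gamma), w))              # recovers z up to rounding
```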
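It will be convenient to add infinity into the definition. Suppose $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, then we define:
\[ M(\infty) = \begin{cases} \frac{a}{c} & \text{if } c \ne 0, \\ \infty & \text{if } c = 0, \end{cases} \qquad M\left(-\frac{d}{c}\right) = \infty. \]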
Circles and lines
Now we're going to look at circles and lines in the complex plane. We know that the equation of a line $L$ in $\mathbb{R}^2$ has the form
\[ ax + by + c = 0 \]
for some $a, b, c \in \mathbb{R}$. In the complex plane we can write $x$ and $y$ in terms of $z$ as $x = \frac{1}{2}(z + \overline{z})$, $y = \frac{1}{2i}(z - \overline{z})$. If we substitute these expressions in our equation for the line, we get
\[ \frac{1}{2}(a - ib) z + \frac{1}{2}(a + ib) \overline{z} + c = 0. \]
So if we let $\beta = (a - ib)/2$, then the equation of $L$ is
\[ \beta z + \overline{\beta}\,\overline{z} + c = 0. \]
Now we do the same thing for the circle. The equation for a circle in $\mathbb{R}^2$ is
\[ (x - x_0)^2 + (y - y_0)^2 = r^2. \]
Let $z = x + iy$ and $z_0 = x_0 + iy_0$. Then we get $|z - z_0|^2 = r^2$, which we can write as
\[ (z - z_0)(\overline{z} - \overline{z_0}) = r^2. \]
If we write this out further, and let $\beta = -\overline{z_0}$ and $\gamma = z_0 \overline{z_0} - r^2$, the equation for the circle becomes
\[ z\overline{z} + \beta z + \overline{\beta}\,\overline{z} + \gamma = 0. \]
We can combine these two results to get the following proposition:
Proposition 13.0.6. If $A$ is either a circle or a line in the complex plane, then $A$ has the equation
\[ \alpha z\overline{z} + \beta z + \overline{\beta}\,\overline{z} + \gamma = 0 \]
where $\alpha, \gamma \in \mathbb{R}$ and $\beta \in \mathbb{C}$ ($\alpha = 0$ stands for a line).
It is not hard to see that if $\beta \in \mathbb{R}$, then we either have a circle with its centre on the real axis, or a vertical straight line. If we only look at $\mathbb{H}$, the upper half-plane, then these circles become semi-circles that meet the real axis orthogonally. We denote this set of semi-circles orthogonal to $\mathbb{R}$ and vertical lines in the upper half-plane $\mathbb{H}$ by $\mathcal{H}$.
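Proposition 13.0.7. Let $H$ be either (i) a semi-circle orthogonal to the real axis, or (ii) a vertical straight line. Let $\gamma \in SL(2, \mathbb{Z})$. Then $\gamma(H)$ is either a semi-circle orthogonal to the real axis or a vertical straight line.
Proof. By Proposition 13.0.5 we know that $\gamma$ maps the upper half-plane to itself bijectively. Hence it is sufficient to show that $\gamma$ maps vertical straight lines in $\mathbb{C}$ and circles in $\mathbb{C}$ with real centres to vertical straight lines and circles with real centres.
Let $L$ be a vertical line or circle with real centre in $\mathbb{C}$. Then $L$ is given by an equation of the form
\[ \alpha z\overline{z} + \beta z + \beta \overline{z} + \gamma = 0 \]
for some $\alpha, \beta, \gamma \in \mathbb{R}$ (here $\gamma$ denotes the constant term of the equation, not the transformation).
Let $w = \gamma(z) = \frac{az + b}{cz + d}$. Then $z = \frac{dw - b}{-cw + a}$, and if we substitute this in our equation for $L$, we get:
\[ \alpha \left(\frac{dw - b}{-cw + a}\right)\left(\frac{d\overline{w} - b}{-c\overline{w} + a}\right) + \beta \left(\frac{dw - b}{-cw + a}\right) + \beta \left(\frac{d\overline{w} - b}{-c\overline{w} + a}\right) + \gamma = 0. \]
Therefore
\[ \alpha (dw - b)(d\overline{w} - b) + \beta (dw - b)(-c\overline{w} + a) + \beta (d\overline{w} - b)(-cw + a) + \gamma (-cw + a)(-c\overline{w} + a) = 0 \]
and simplifying this further, we get:
\[ (\alpha d^2 - 2\beta cd + \gamma c^2)\, w\overline{w} + (-\alpha bd + \beta ad + \beta bc - \gamma ac)\, w + (-\alpha bd + \beta ad + \beta bc - \gamma ac)\, \overline{w} + (\alpha b^2 - 2\beta ab + \gamma a^2) = 0, \]
which is exactly the equation for a circle with real centre or a vertical line.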
Farey Tesselation
With this in mind we can now discuss the Farey tesselation. This is a tesselation (a covering by triangles) of $\mathbb{H}$ with the Poincaré metric by ideal triangles, which means triangles whose vertices all lie on $\mathbb{R} \cup \{\infty\}$. One way to achieve this Farey tesselation is by looking at the standard fundamental region, which is the region $F = \{ z ;\ |\Re(z)| \le \frac{1}{2},\ |z| \ge 1 \}$. If we move the left half of this region one unit to the right and glue the pieces together, we get a new fundamental region, a quadrilateral with vertices $i$, $i + 1$, $\rho = \frac{1}{2} + \sqrt{\frac{3}{4}}\, i$ and $\infty$. From here on $F$ denotes this quadrilateral. If we look at the images of this quadrilateral under the maps $I$, $S = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}$ and $S^2 = \begin{pmatrix} -1 & 1 \\ -1 & 0 \end{pmatrix}$, we get three quadrilaterals, one of which is $F$. The next is $S(F)$ with vertices $\frac{1}{2} + \frac{1}{2} i$, $i$, $\rho$ and $0$. Because of Proposition 13.0.7, we know that $S$ sends geodesics to geodesics, so the sides of $S(F)$ must look like in the figure below.
[Figure: the quadrilateral $S(F)$.]
We see that $S^2(F)$ is also a quadrilateral, with vertices $i + 1$, $\frac{1}{2} + \frac{1}{2} i$, $\rho$ and $1$. Now if we take $D$ to be the union of these images, then $D = F \cup S(F) \cup S^2(F)$ is a triangle with vertices $0$, $1$, and $\infty$. Now the images of $D$ under $SL(2, \mathbb{Z})$ make up a tesselation of $\mathbb{H}$ in ideal triangles (triangles whose vertices all lie either on the real line or at $\infty$). This I won't prove here.
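The vertex bookkeeping above can be verified mechanically. The sketch below applies $S = \begin{pmatrix} 0 & -1 \\ 1 & -1 \end{pmatrix}$ (the sign choice I assume for the matrix) to the vertices of the quadrilateral, treating $\infty$ as a separate value; the helper `apply` is a name introduced only for this illustration.

```python
import math

INF = float("inf")

def apply(m, z):
    """Fractional linear action of m = ((a, b), (c, d)), extended to infinity."""
    (a, b), (c, d) = m
    if z == INF:
        return a / c if c else INF
    den = c * z + d
    return INF if den == 0 else (a * z + b) / den

S = ((0, -1), (1, -1))
rho = complex(0.5, math.sqrt(3) / 2)

# vertices of F are i, i + 1, rho and infinity
print([apply(S, v) for v in (1j, 1 + 1j, rho, INF)])
# S permutes the ideal vertices of the triangle D: infinity -> 0 -> 1 -> infinity
print(apply(S, INF), apply(S, 0), apply(S, 1))
```

The first line of output lists the vertices $\frac12 + \frac12 i$, $i$, $\rho$, $0$ of $S(F)$; the point $\rho$ is fixed by $S$.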
Chapter 14
Hall's theorem
Roy Loos
We will look at a theorem of the mathematician Marshall Hall. He published his result in 1947. In his paper he asks himself which real numbers can be constructed if we put restrictions on the partial quotients. His theorem is about continued fractions with partial quotients not exceeding four. It tells us, more or less, that every real number can be written as a sum of two continued fractions whose partial quotients do not exceed four.
14.1 Cantor Set
In order to formulate Hall's Theorem we will first have a look at the construction of the Cantor Set. Cantor constructed the Cantor Set as an example of a perfect set. He introduced the notion of a perfect set in order to prove a weaker form of the Continuum Hypothesis. One can define the Cantor Set by means of a step by step construction. We will define an infinite sequence of closed sets $\mathcal{C}_0, \mathcal{C}_1, \mathcal{C}_2, \ldots$ as follows:
Stage 0: we start with the unit interval, so $\mathcal{C}_0 = [0, 1]$.
Stage 1: we remove the open middle-third interval, that is $(\frac{1}{3}, \frac{2}{3})$. Define $\mathcal{C}_1 = [0, \frac{1}{3}] \cup [\frac{2}{3}, 1]$.
Stage 2: remove the middle thirds of the two closed intervals $[0, \frac{1}{3}]$ and $[\frac{2}{3}, 1]$. Define $\mathcal{C}_2 = [0, \frac{1}{9}] \cup [\frac{2}{9}, \frac{1}{3}] \cup [\frac{2}{3}, \frac{7}{9}] \cup [\frac{8}{9}, 1]$.
Stage 3: now remove the middle thirds of the closed intervals which appear in $\mathcal{C}_2$.
And so on.
After infinitely many stages we may collect the leftovers of the above construction. That is, we define the Cantor Set $\mathcal{C}$ as follows:
\[ \mathcal{C} = \bigcap \{ \mathcal{C}_n \mid n \in \mathbb{N} \}. \]
n
closed intervals.
To speak about one of those 2
n
intervals, we may speak simply about a
closed interval which appears on the nth stage. For example, the interval
[0,
1
9
] appears on the second stage. Furthermore we introduce the notion of
a successor. An interval appearing on the nth stage, contains precisely
two intervals which appear on the n+1th stage, we call those two intervals
the successors. We immediately take this terminology into practice in order
to show that the Cantor Set is non-empty.
Proposition 14.1.1. For every $n$ in $\mathbb{N}$, for every $q$ in $\mathbb{Q}$, if there exists a closed interval $A$ which appears on the $n$th stage such that $q$ is an endpoint of $A$, then $q$ is in $\mathcal{C}$.
Proof. Have a close look at the construction of $\mathcal{C}$: we cannot get rid of endpoints.
So the Cantor Set is non-empty. We also see that the Cantor Set is a closed set, because it is an intersection of closed sets. At this point, one may think that the endpoints of the closed intervals which appear on the stages are the only points of the Cantor Set. Actually, those endpoints form just a very small part of the Cantor Set. There are many more elements of the Cantor Set, as we may deduce from the next proposition.
Proposition 14.1.2. Let $\{B_n \mid n \in \mathbb{N}\}$ be a sequence such that for every $n$, $B_n$ is a closed interval which appears on the $n$th stage of the Cantor Set. If for every $n$, $B_{n+1} \subseteq B_n$, then there exists a real number $x$ such that $\bigcap \{B_n \mid n \in \mathbb{N}\} = \{x\}$ and $x$ is in $\mathcal{C}$.
Proof. At first we define a sequence $\{x_n\}_{n \in \mathbb{N}}$ of real numbers, such that, for all $n$, $x_n$ is the left endpoint of the closed interval $B_n$. Observe that the sequence $x_n$ is an increasing sequence which is bounded. This means that the sequence converges, so let us say that $\lim x_n = x$, for some real number $x$. It is clear that for every $n$, for every $k \ge n$, we have $x_k$ in $B_n$. Since $B_n$ is closed, we conclude: for all $n$, $\lim_{k \ge n} x_k \in B_n$. Thus, for all $n$, $x$ is in $B_n$, and hence $x$ is in $\mathcal{C}$. To finish the proof we need to show that $x$ is the only element of $\bigcap B_n$. But this is obvious, since for all $n$, there exists $N$ such that for all $i \ge N$, $|B_i| \le \frac{1}{n}$.
The following proposition tells us that the Cantor Set has the cardinality of the continuum.
Proposition 14.1.3. There exists an injection from $2^{\mathbb{N}}$ to $\mathcal{C}$.
Proof. We will construct the injection $\varphi$ as follows. Let $\alpha$ be a function from $\mathbb{N}$ to $2 = \{0, 1\}$. We will define a sequence $B_n$ of closed intervals. At first we define $B_0 = [0, 1]$. Now, we let $n$ in $\mathbb{N}$ and suppose that $n \ge 1$; we proceed as follows:
if $\alpha(n - 1) = 0$, then we define $B_n$ to be the left successor of $B_{n-1}$;
if $\alpha(n - 1) = 1$, then we define $B_n$ to be the right successor of $B_{n-1}$.
Thanks to the previous proposition we can calculate $x$ such that $\{x\} = \bigcap \{B_n \mid n \in \mathbb{N}\}$, so we define $\varphi(\alpha) = x$.
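The construction in the proof can be followed on a finite prefix of $\alpha$. The sketch below returns the interval $B_n$ selected by the first $n$ bits; two sequences that differ somewhere end up in disjoint intervals, which is why the map is injective. The function name `cantor_interval` is an illustrative choice.

```python
from fractions import Fraction
from itertools import product

def cantor_interval(bits):
    """Interval B_n chosen by bits alpha(0), ..., alpha(n-1): 0 = left, 1 = right successor."""
    lo, hi = Fraction(0), Fraction(1)
    for bit in bits:
        third = (hi - lo) / 3
        if bit == 0:
            hi = lo + third
        else:
            lo = hi - third
    return lo, hi

# the eight intervals belonging to the eight prefixes of length 3, sorted by left endpoint
intervals = sorted(cantor_interval(bits) for bits in product((0, 1), repeat=3))
for lo, hi in intervals:
    print(f"[{lo}, {hi}]")
```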
We can also show that $\mathcal{C}$ is a perfect set. A perfect set is a closed set without isolated points. One could also say that a perfect set is a neat set.
Proposition 14.1.4. The Cantor Set is a perfect set.
Proof. We are done if we show that $\mathcal{C}$ has no isolated points, so we have to prove:
\[ \forall x \in \mathcal{C}\ \forall n\ \exists y \in \mathcal{C}\ \left[ 0 < |x - y| < \tfrac{1}{n} \right]. \]
Let $x$ in $\mathcal{C}$. Let $n$ be a natural number. We start an infinite search for the real number $x$. Let $\{B_i\}_{i \in \mathbb{N}}$ be a sequence such that for every $i$, $B_i$ appears on the $i$th stage of the Cantor Set, $x$ is in $B_i$ and $B_{i+1}$ is a successor of $B_i$. Calculate $N$ such that $|B_N| < \frac{1}{n}$. Now define a new sequence $\{C_i\}_{i \in \mathbb{N}}$ such that for all $i$,
if $i \le N$, then $C_i = B_i$;
if $i = N + 1$, then we let $C_i$ be the unique successor of $B_N$ such that $x$ is not in $C_i$;
if $i > N + 1$, then we let $C_i$ be the left successor of $C_{i-1}$.
To finish the proof we let $y$ be such that $\{y\} = \bigcap \{C_i \mid i \in \mathbb{N}\}$. Since $x$ and $y$ both lie in $B_N$ and $y \ne x$, we conclude: $0 < |x - y| < \frac{1}{n}$.
The idea of the Cantor Set was to remove the middle thirds consecutively. It is possible to generalize this idea slightly. In a Generalized Cantor Set you start, just as in the normal Cantor Set, with an interval $A$; on the first stage we are allowed to choose which piece of $A$ we delete, so for example we are allowed to get rid of the middle fifth of $A$. On each stage one has more freedom to choose the successors, but there are restrictions. The following definition is slightly sloppy, but hopefully the idea is clear.
Definition 14.1.5. Let $A$ be a closed and bounded interval. $L(A)$ is a Generalized Cantor Set in case there exist stages $\mathcal{A}_0, \mathcal{A}_1, \mathcal{A}_2, \ldots$ such that $L(A) = \bigcap \{\mathcal{A}_n \mid n \in \mathbb{N}\}$ and for every $n$, for every $C$ which appears on stage $n$, we have the following:
$C$ has two successors $X$ and $Y$.
$\min(X \cup Y) = \min(C)$
$\max(X \cup Y) = \max(C)$
$X \cap Y = \emptyset$
Definition 14.1.6. For every $z$ in $\mathbb{Z}$, for every $n$ in $\mathbb{N}$ we define
\[ F(z, n) = \{ [a_0, a_1, \ldots] \mid a_0 = z \wedge \forall i\, (i > 0 \rightarrow 1 \le a_i \le n) \}. \]
Definition 14.1.7. For every set $X$, for every set $Y$, we define:
\[ X + Y = \{ x + y \mid x \in X \wedge y \in Y \}. \]
We are now ready to formulate Hall's Theorem.
Theorem 14.1.8 (Hall's Theorem). $\mathbb{R} \subseteq \bigcup \{ F(z, 4) \mid z \in \mathbb{Z} \} + F(0, 4)$.
Before we can actually prove this, we need two theorems. At first we want to find a closed and bounded interval $A$. The first theorem is about turning $A$ into some kind of Generalized Cantor Set $L(A)$ such that $L(A) = F(0, 4)$. Secondly we will prove a theorem which ensures that $L(A) + L(A) = A + A$. If we have these two theorems then the proof is as follows.
Proof Sketch. Let $x$ be a real number. The particular $A$ of above will have the property that $|A + A| > 1$. This means that we are able to find $z$ in $\mathbb{Z}$ and $a$ in $A + A$ such that $x = z + a$. Because $A + A = F(0, 4) + F(0, 4)$, we determine $[a_0, a_1, \ldots]$ and $[b_0, b_1, \ldots]$ in $F(0, 4)$ such that $a = [0, a_1, \ldots] + [0, b_1, \ldots]$. Conclude: $x = [z, a_1, a_2, \ldots] + [0, b_1, b_2, \ldots]$.
A suitable choice for $A$ seems to be the interval $[\min F(0, 4), \max F(0, 4)]$. So in particular $F(0, 4) \subseteq A$. We will start by thinking about the minimum and maximum of $F(0, 4)$.
Lemma 14.1.9. $\max F(0, 4) = [0, 1, 4, 1, 4, \ldots]$ and $\min F(0, 4) = [0, 4, 1, 4, 1, \ldots]$.
Proof. We will construct the maximum $[a_0, a_1, \ldots]$ step by step. Of course we want $a_0$ to be as big as possible; unfortunately we are only allowed to let $a_0$ be zero. In the second step we want $0 + \frac{1}{[a_1, \ldots]}$ to be as big as possible. But this means we want to minimize $[a_1, a_2, \ldots]$, thus $a_1 = 1$. We continue with this procedure, and we find that $a_2 = 4$, $a_3 = 1$, etcetera. The proof for the minimum goes mutatis mutandis.
Lemma 14.1.10. For every $a$ in $\mathbb{N}$, for every $b$ in $\mathbb{N}$, if $a \ge 1$ and $b \ge 1$, then
\[ [0, a, b, a, b, \ldots] = -\frac{b}{2} + \sqrt{\frac{b^2}{4} + \frac{b}{a}}. \]
Proof. Observe: $[0, a, b, \ldots] = \frac{1}{[a, b, \ldots]} = \frac{1}{a + \frac{1}{b + [0, a, b, \ldots]}}$. Define $x = [0, a, b, \ldots]$. Then the latter equation becomes: $x = \frac{1}{a + \frac{1}{b + x}}$. Some algebra transforms this equation into: $ax^2 + abx - b = 0$. You can use the quadratic formula to obtain that: $x = -\frac{b}{2} \pm \sqrt{\frac{b^2}{4} + \frac{b}{a}}$. One of the two solutions is negative, which is impossible, hence we have proven the lemma.
The lemma tells us a little bit more about $F(0, 4)$. We are now able to calculate the minimum and maximum of $F(0, 4)$, that is $A = \left[ \frac{\sqrt{2} - 1}{2},\ 2\sqrt{2} - 2 \right]$. Furthermore notice that we have indeed $|A + A| > 1$.
every closed interval C = [x, x + c] such that C appears on some stage of
L(A), the lengths of the two successors C
1
= [x, x +c
1
] and C
2
= [x +c
1
+
c
12
, x +c] of C in L(A) satisfy c
12
c
1
and c
12
c
2
.
As notices in the proof of Halls Theorem, we need to show that L(A)+
L(A) = A + A. To show this we will proof a more or less related theorem.
But before so, we need one more denition.
Denition 14.1.12. Let C = [x, x+c] and D = [y, y+d] be closed intervals.
Calculate e = minc, d. We dene the left associative (C, D)
= [x+y, x+
y +2e] and the right associative (C, D)
+
= [x+y +c +d 2e, x+y +c +d].
Furthermore we dene: As(C, D) = (C, D)
(C, D)
+
.
Theorem 14.1.13. Let A and B be closed and bounded intervals. Let L(A)
and L(B) be Generalized Cantor Sets. If L(A) and L(B) both satisfy , then
As(A, B) L(A) +L(B).
Proof. We will start the proof with a claim which is, as we will see later, very useful in order to prove the theorem.
Claim: Let $C = [x, x + c]$ and $D = [y, y + d]$ be such that $C$ appears on some stage of $L(A)$ and $D$ appears on some stage of $L(B)$. Furthermore, we let $C_1 = [x, x + c_1]$ and $C_2 = [x + c_1 + c_{12}, x + c]$ be the two successors of $C$ and in the same way we let $D_1 = [y, y + d_1]$ and $D_2 = [y + d_1 + d_{12}, y + d]$ be the successors of $D$. For every real number $\gamma$ in $As(C, D)$,
if $c \le d$, then $\gamma$ is in $As(C, D_1)$ or in $As(C, D_2)$;
if $d \le c$, then $\gamma$ is in $As(C_1, D)$ or in $As(C_2, D)$.
Proof of the claim:
In order to prove the claim, we let $\gamma$ in $As(C, D)$. Observe that by symmetry it is enough to prove the claim under the assumption $c \le d$. So, we assume: $c \le d$. We have to treat four cases.
Case 1: $c \le d_1$ and $c \le d_2$. We observe the following: $(C, D)^- = [x + y, x + y + 2c] = (C, D_1)^-$ and $(C, D)^+ = (C, D_2)^+$.
Case 2: $c \le d_1$ and $c > d_2$. Just as in case 1 one can calculate that $As(C, D) \subseteq As(C, D_1) \cup As(C, D_2)$.
Case 3: $c > d_1$ and $c \le d_2$. The same trick as above.
Case 4: $c > d_1$ and $c > d_2$. Let us write down the associatives:
\[ (C, D_1)^- = [x + y,\ x + y + 2d_1], \qquad (C, D_2)^- = [x + y + d_1 + d_{12},\ x + y + d_1 + d_{12} + 2d_2], \]
\[ (C, D_1)^+ = [x + y + c - d_1,\ x + y + c + d_1], \qquad (C, D_2)^+ = [x + y + c + d - 2d_2,\ x + y + c + d]. \]
We will show that these four intervals together contain the set $C + D$; then we are done, because $As(C, D) \subseteq C + D$. Observe that $\min(C + D) = \min((C, D_1)^-)$ and $\max(C + D) = \max((C, D_2)^+)$. Remember that $(\ast)$ tells us $d_{12} \le d_1$, thus we conclude: $\min((C, D_2)^-) \le \max((C, D_1)^-)$. Furthermore we have $\min((C, D_1)^+) \le \max((C, D_2)^-)$, because $x + y + c \le x + y + d = x + y + d_1 + d_{12} + d_2$. We prove that $\min((C, D_2)^+) \le \max((C, D_1)^+)$ as follows: $(\ast)$ says that $d_{12} \le d_2$, thus $d_{12} - d_2 \le 0$, thus $d_1 + d_{12} - d_2 \le d_1$, but that is just $d - 2d_2 \le d_1$, which proves the statement.
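Case 4 of the claim can be tried out on a concrete pair of intervals. In the sketch below an interval $[x, x + c]$ is stored as the pair `(x, c)`, and $D$ is split into thirds, so that the gap equals both successor lengths and the condition $(\ast)$ holds. The particular intervals and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction

def associatives(C, D):
    """Left and right associative of C = (x, c) and D = (y, d), as (left end, right end)."""
    (x, c), (y, d) = C, D
    e = min(c, d)
    return [(x + y, x + y + 2 * e), (x + y + c + d - 2 * e, x + y + c + d)]

def in_as(C, D, g):
    """True if g lies in As(C, D)."""
    return any(lo <= g <= hi for lo, hi in associatives(C, D))

def thirds(D):
    """Successors of D obtained by deleting the open middle third."""
    y, d = D
    return (y, d / 3), (y + 2 * d / 3, d / 3)

C = (Fraction(0), Fraction(1, 2))   # c = 1/2
D = (Fraction(0), Fraction(1))      # d = 1, so c <= d
D1, D2 = thirds(D)                  # d_1 = d_2 = 1/3 < c: Case 4

samples = [Fraction(k, 60) for k in range(91)]  # grid on [0, 3/2] = C + D
covered = all(in_as(C, D1, g) or in_as(C, D2, g) for g in samples if in_as(C, D, g))
print(covered)
```

Here $As(C, D_1) = [0, \frac23] \cup [\frac16, \frac56]$ and $As(C, D_2) = [\frac23, \frac43] \cup [\frac56, \frac32]$, which together cover $C + D = [0, \frac32]$.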
Let $\gamma$ be in $As(A, B)$. We will show that $\gamma$ is in $L(A) + L(B)$. By using the above claim we can construct, step by step, an infinite sequence $(A_n, B_n)_{n \in \mathbb{N}}$ such that $A_0 = A$, $B_0 = B$, for every $n$, $A_{n+1}$ is $A_n$ or a successor of $A_n$ in $L(A)$ and $B_{n+1}$ is $B_n$ or a successor of $B_n$ in $L(B)$, and for every $n$, $\gamma$ is in $As(A_n, B_n)$. So, at each step we compare the length $a_n$ of $A_n$ and the length $b_n$ of $B_n$ and then apply the claim, which replaces the longer of the two intervals by one of its successors. Observe that the sequences $a_n$ and $b_n$ are decreasing and bounded below, so the limits of these sequences exist. Once again, we have to distinguish cases about the $a_n$ and $b_n$.
Case 1: $\lim_{n \to \infty} \max\{a_n, b_n\} = 0$. If this is the case, we have in particular $\lim a_n = 0$ and $\lim b_n = 0$, thus we are able to define real numbers $\alpha$ and $\beta$ such that $\bigcap A_n = \{\alpha\}$ and $\bigcap B_n = \{\beta\}$. Furthermore observe that for all $n$, we have $\gamma$ in $A_n + B_n$. In order to prove $\alpha + \beta = \gamma$, we show: for all $n$, $|(\alpha + \beta) - \gamma| < \frac{1}{n}$. Let $n$ in $\mathbb{N}$. Calculate $N$ such that $|A_N| < \frac{1}{2n}$ and $|B_N| < \frac{1}{2n}$. Conclude: $|A_N + B_N| < \frac{1}{n}$. Since both $\alpha + \beta$ and $\gamma$ lie in $A_N + B_N$, this finishes the proof.
Case 2: $\lim_{n \to \infty} \max\{a_n, b_n\} = t > 0$. In this case, there are three possibilities, which are treated in case 2a and case 2b below.
Case 2a: $\lim b_n = t$ and $t > \lim a_n$ (this case also deals with the case $\lim a_n = t$ and $\lim b_n < t$). Define $s = \lim a_n$. Because $s < t$, we calculate $N$ such that $a_N < t$. Conclude: $\forall n \ge N\,(a_n < b_n)$. Have a look at the claim: this means that the interval $A_N$ is unharmed, that is, for all $n \ge N$, $A_n = A_N$. But this means also that for all $n \ge N$, $B_{n+1} \subsetneq B_n$. So, if we write $L(B)$ as an intersection of stages, $L(B) = \bigcap \mathcal{B}_n$, just as with the normal Cantor Set, then we see: $\forall k\,(\bigcap \{B_n \mid n \in \mathbb{N}\} \subseteq \mathcal{B}_k)$. Conclude: $\bigcap B_n \subseteq \bigcap \mathcal{B}_n = L(B)$. Let us find suitable real numbers $u$, $s$, $v$ and $t$ such that $A_N = [u, u + s]$ and $\bigcap B_n = [v, v + t]$. We have for all $n$, $\gamma$ in $A_n + B_n$, but also for all $n$, $\gamma$ in $A_N + B_n$. I claim the following: $\forall n\,[u + v - \frac{1}{n} \le \gamma \le u + v + s + t + \frac{1}{n}]$. To show this, we let $n$ in $\mathbb{N}$. Calculate $M \ge N$ such that $|B_M| < t + \frac{1}{n}$. Because $\bigcap B_n \subseteq B_M$, it follows that $B_M \subseteq [v - \frac{1}{n}, v + t + \frac{1}{n}]$. Since $\gamma \in A_N + B_M$, we conclude $\gamma \in [u, u + s] + [v - \frac{1}{n}, v + t + \frac{1}{n}]$. Hence we have proven the claim. If we let $n$ go to infinity, we immediately have: $\gamma \in A_N + \bigcap B_n = [u + v, u + v + s + t]$. If $u + v \le \gamma \le u + v + t$, then we define $\alpha = u$ and $\beta = \gamma - u$; we see that $\alpha$ is an endpoint of an interval appearing in $L(A)$, so we have $\alpha$ in $L(A)$, and on the other hand we have that $\beta$ is in $[v, v + t]$, so also $\beta$ in $L(B)$. If it happens that $u + v + t \le \gamma \le u + v + s + t$, then we define $\alpha = u + s$ and $\beta = \gamma - u - s$; with the same arguments we see: $\alpha$ in $L(A)$ and $\beta$ in $L(B)$. So in both cases we can find suitable $\alpha$ and $\beta$. Thus $\gamma$ is in $L(A) + L(B)$.
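Case 2b: $\lim a_n = t$ and $\lim b_n = t$. If it is the case that $\exists N\, \forall n \ge N\,(A_n = A_N)$, then we can easily adapt the proof of case 2a. So let us assume that for all $n$, there exists $m > n$ such that $A_m \ne A_n$. We assume this also for the other sequence: for all $n$, there exists $m > n$ such that $B_m \ne B_n$. Let us write once again $L(A)$ and $L(B)$ as an intersection of stages: $L(A) = \bigcap \mathcal{A}_n$ and $L(B) = \bigcap \mathcal{B}_n$. Now observe that: $\forall n\, \exists m\,(A_m \subseteq \mathcal{A}_n)$ and $\forall n\, \exists m\,(B_m \subseteq \mathcal{B}_n)$. Thus we have $\bigcap A_n \subseteq L(A)$ and $\bigcap B_n \subseteq L(B)$. We can show $\gamma$ in $\bigcap A_n + \bigcap B_n$. Now we may find real numbers $u$, $v$ and $t$ such that $\bigcap A_n = [u, u + t]$ and $\bigcap B_n = [v, v + t]$. If $u + v \le \gamma \le u + v + t$, then we choose $\alpha = u$ and $\beta = \gamma - u$. If $u + v + t \le \gamma \le u + v + 2t$, then we choose $\alpha = u + t$ and $\beta = \gamma - u - t$. In both cases we have that $\alpha$ is in $L(A)$, $\beta$ is in $L(B)$ and $\alpha + \beta = \gamma$. This finishes the proof.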
We will now turn $F(0,4)$ into a Generalized Cantor Set $L(A)$. As mentioned earlier, $L(A)$ can be seen as an intersection of stages, just as the normal Cantor Set. On the first stage we will only have the interval $A$. But how do we proceed? The next paragraph will answer that question. Somehow we have to make sure that $F(0,4)$ will be a subset of $L(A)$; we don't want to lose any real number which is in $F(0,4)$. We have to keep that in mind when constructing the stages of $L(A)$. So on stage zero, we are still doing fine. In order to keep track of an element $[a_0, a_1, \ldots]$ in $F(0,4)$, we will look at an initial segment of $[a_0, a_1, \ldots]$. We will try to formalize this idea.
Denition 14.1.14. We dene a very special set of nite sequences as
follows:
a
0
, a
1
, ..., a
n1
, A
n
a
0
= 0 0 < i < n(1 a
i
4)
(m[A
n
= m 1 m 4]
A
n
= 2, 3, 4 A
n
= 3, 4)
Denition 14.1.15. Dene the function T : (F(0, 4)) as follows, for
every a
0
, a
1
, ..., A
n
in :
T(a
0
, ..., A
n
) = [
0
,
1
, ...] [ i < n[
i
= a
i
]
n
A
n
i > n[1 a
i
4]
We also dene a function F : X [ X A [ X is a closed interval
by for every a in , F( a) = [min T( a), max T( a)].
We would like to see $L(A)$ as an intersection of stages: $L(A) = \bigcap \{\mathcal{A}_n \mid n \in \mathbb{N}\}$. The elements in $\mathrm{Im}(F)$ are supposed to live in some stage of $L(A)$. To assign a stage to each of the elements in $\mathrm{Im}(F)$ we just describe for every $C$ in $\mathrm{Im}(F)$ the two successors of $C$. One could formalize this process as creating a binary tree, although we will not do that.
Denition 14.1.16. Let n be in N. Let a
0
, ..., A
n
be in . In order to
dene the two successors of F(a
0
, ..., A
n
) we distinguish three cases.
If A
n
= m, then the two successors are: F(a
0
, ..., m, 1) and
F(a
0
, ..., m, 2, 3, 4).
If A
n
= 2, 3, 4, then the two successors are: F(a
0
, ..., a
n1
, 2) and
F(a
0
, ..., a
n1
, 3, 4).
If A
n
= 3, 4, then the two successors are: F(a
0
, ..., a
n1
, 3) and
F(a
0
, ..., a
n1
, 4).
Proposition 14.1.17. L(A) is a Generalized Cantor Set, that is to say, for
every a in , which appears on some stage in L(A), we let X and Y be the
successors of F( a), then: min(XY ) = min F( a), max(XY ) = max F( a)
and X Y = .
Proof. Let a = a
0
, ..., a
N
be in . We will distinguish three cases, A
N
= m
for some m, A
N
= 2, 3, 4 and A
N
= 3, 4. We will look at the rst case,
A
N
= m. However in each case we also need to distinguish whether n is even
or odd, so let us assume that n is even. The two successors of F( a) are F( a, 1)
and F( a, 2, 3, 4). Let us move on to the calculations of the maximums
and minimums of the intervals, we use lemma 3.1.9 for this. max F( a) =
[ a, 1, 4, 1, 4, ...], min F( a) = [ a, 4, 1, 4, ...], min F( a, 1) = [ a, 1, 1, 4, 1, 4, ...],
max F( a, 2, 3, 4) = [ a, 2, 4, 1, 4, ...]. From the above calculations we con-
clude: max F( a, 1) = max F( a), min F( a, 2, 3, 4) = min F( a) and
max F( a, 2, 3, 4) < min F( a, 1). This handles this specic case. The other
cases are similar.
It is easy to see that we indeed have L(A) = F(0, 4). We now want to
prove that L(A)+L(A) = A+A, because then we have [F(0, 4)+F(0, 4)[ > 1.
Our task is to prove that L(A) satises , then we shall apply a previous
theorem and conclude As(A, A) = A+A L(A) +L(A). But also L(A) +
L(A) A+A, thus L(A) +L(A) = A+A.
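Lemma 14.1.18. Let $n$ be a natural number. Let $[a_0, \ldots, a_n]$ be a finite continued fraction. For every real number $\alpha$ and every real number $\beta$, we have
\[ \left| [a_0, \ldots, a_n, \alpha] - [a_0, \ldots, a_n, \beta] \right| = \left| \frac{\alpha - \beta}{q_n^2(\alpha + \rho)(\beta + \rho)} \right|, \quad \text{where } \rho = \frac{q_{n-1}}{q_n}. \]
Proof. Use Theorem 2.1.5 to conclude that $[a_0, \ldots, a_n, \alpha] = \frac{p_n \alpha + p_{n-1}}{q_n \alpha + q_{n-1}}$. Now observe:
\[ \left| [a_0, \ldots, a_n, \alpha] - [a_0, \ldots, a_n, \beta] \right| = \left| \frac{p_n \alpha + p_{n-1}}{q_n \alpha + q_{n-1}} - \frac{p_n \beta + p_{n-1}}{q_n \beta + q_{n-1}} \right| \]
\[ = \left| \frac{\alpha (p_n q_{n-1} - p_{n-1} q_n) + \beta (p_{n-1} q_n - p_n q_{n-1})}{q_n^2 (\alpha + \rho)(\beta + \rho)} \right| = \left| \frac{\alpha - \beta}{q_n^2 (\alpha + \rho)(\beta + \rho)} \right|. \]
Here the denominator is $(q_n \alpha + q_{n-1})(q_n \beta + q_{n-1}) = q_n^2(\alpha + \rho)(\beta + \rho)$. In the last step we use Lemma 2.1.7, which states that $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n+1}$.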
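The identity of Lemma 14.1.18 can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original notes: the helper names `convergents`, `extend` and `lemma_sides` are hypothetical, the tails $\alpha$, $\beta$ are taken rational so that `Fraction` can be used, and the denominator is written as $(q_n\alpha + q_{n-1})(q_n\beta + q_{n-1}) = q_n^2(\alpha+\rho)(\beta+\rho)$.

```python
from fractions import Fraction

def convergents(a):
    """Return (p_n, q_n, p_{n-1}, q_{n-1}) for the finite continued fraction [a_0, ..., a_n]."""
    p_prev, q_prev = 1, 0      # p_{-1}, q_{-1}
    p, q = a[0], 1             # p_0, q_0
    for a_i in a[1:]:
        p, p_prev = a_i * p + p_prev, p
        q, q_prev = a_i * q + q_prev, q
    return p, q, p_prev, q_prev

def extend(a, x):
    """Value of [a_0, ..., a_n, x] = (p_n x + p_{n-1}) / (q_n x + q_{n-1})."""
    p, q, p_prev, q_prev = convergents(a)
    return (p * x + p_prev) / (q * x + q_prev)

def lemma_sides(a, alpha, beta):
    """Both sides of the identity, as exact fractions."""
    p, q, p_prev, q_prev = convergents(a)
    rho = Fraction(q_prev, q)
    lhs = abs(extend(a, alpha) - extend(a, beta))
    rhs = abs((alpha - beta) / (q * q * (alpha + rho) * (beta + rho)))
    return lhs, rhs

lhs, rhs = lemma_sides([0, 1, 4, 2], Fraction(7, 3), Fraction(5, 4))
print(lhs == rhs)
```

Since only the ratio of two such lengths is used later, the common factor in front of $(\alpha+\rho)(\beta+\rho)$ cancels in the quotients $c_{12}/c_1$ and $c_{12}/c_2$.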
Theorem 14.1.19. The Generalized Cantor Set L(A) satises .
Proof. At rst we want have a better understanding of the length of the
intervals which appear in L(A). Some agreements:
pn
qn
is the nth convergent
of a certain continued fraction and =
q
n1
qn
. Furthermore we need to
dene a special irrational: = [1, 4, 1, 4, ...] =
1
2
(
2 + 1). Observe that:

1
= [0, 1, 4, 1, 4, ...] = 4 4.
We now let a = [a
0
, ..., A
n
] be in . We have to distinguish the cases whether
n is odd or even and whether A
n
= m or A
n
= 2, 3, 4, or A
n
= 3, 4.
Case 1: n is even and A
n
= m.
We let C
1
= F( a, 2, 3, 4), C
2
= F( a, 1) and C
12
= F( a) (C
1
C
2
) be as
usual. We are done if we prove:
c
12
c
1
1 and
c
12
c
2
1, where the small cs
are the lengths of the intervals just as before.
Have a look at the two successors of F( a) and observe that min F( a, 1) =
[ a, 1, 1, 4, 1, ...],max F( a, 1) = [ a, 1, 4, 1, 4, ...], min F( a, 2, 3, 4) = [ a, 4, 1, 4, ...]
and max F( a, 2, 3, 4) = [ a, 2, 4, 1, 4, ...]. We will continue by calculating the
lengths of the intervals C
1
, C
12
and C
2
.
The length of C
12
. We dene: = [1, 1, 4, 1, 4, ...] = 2
21 =
1
+1. De-
ne = [2, 4, 1, 4, ...] = 1+. Thus: c
12
= min F( a, 1)max F( a, 2, 3, 4) =
[ a, ] [ a, ] =

qn(+)(+)
=
1+
1
qn(
1
+1+)(1++
).
The length of C
1
. The minimum and maximum of C
1
are [ a, 4, 1, 4, ...]
respectively [ a, 2, 4, 1, 4, ...]. So we have: c
1
= [ a, 2, 4, ...] [ a, 4, 1, 4, ...] =
13
qn(1++)(4+)
, because [4, 1, 4, ...] = 4 and [2, 4, 1, 4, ...] = 1 +.
The length of C
2
. The minimum and maximum of C
2
are [ a, 1, 1, 4, 1, 4, ...]
respectively [ a, 1, 4, 1, 4, ...]. So we have: c
2
= [ a, 1, 4, ...] [ a, 1, 1, 4, ...] =] =
1
qn(+)(
1
+1+)
, because [1, 1, 4, 1, 4, ...] =
1
+ 1 and [1, 4, 1, 4, ...] = .

We now calculate the fraction
c
12
c
1
=
1
(4+)
(
1
+1+)(13)
. We can see this as
a function f : Q R in the variable . Recall lemma 2.1.7, which states
that q
n
> q
n1
. So we deduce that Q [0, 1]. You can use calculus (or
a computer) to convince yourself that f() [ Q [0, 1] (0, 1].
If you look at the fraction
c
12
c
2
=
(1)(+)
(1++)(
1
1)
, then we can do more or
less the same trick. So this fraction gives rise to a function g : Q R and
we can conclude that g() [ Q [0, 1] (0, 1].
The above discussion treats one of the six cases. All the other cases are, like this one, just a lot of calculations; nothing really new happens there. So I will finish the proof of the theorem here.
Now that we have convinced ourselves that L(A) satises , we are able
to perform the Proof Sketch, which I wrote down earlier. So this means we
have proven Hall's Theorem. Let us define for every natural number $N$ the set $F(N) = \{[a_0, a_1, \ldots] \mid a_0 \in \mathbb{Z} \wedge \forall n > 0\,(1 \le a_n \le N)\}$. In the literature one formulates Hall's Theorem simply as $F(4) + F(4) = \mathbb{R}$, although Hall's proof is about a slightly different statement. Hall also proved, in the same paper (1947), that every real number $\ge 1$ is the product of two real numbers in $F(4)$. After this, related results were discovered. Divis and Cusick showed that $F(3) + F(3) \ne \mathbb{R}$ and $3F(2) \ne \mathbb{R}$, while $3F(3) = \mathbb{R}$ and $4F(2) = \mathbb{R}$. Hlavka showed that $F(4) + F(3) = \mathbb{R}$, $F(4) + F(2) + F(2) = \mathbb{R}$, $F(3) + F(3) + F(2) = \mathbb{R}$, $F(7) + F(2) = \mathbb{R}$, but $F(4) + F(2) \ne \mathbb{R}$ and
F(3) +F(2) +F(2) ,= R. More recently Astels showed that F(5) F(2) =
R, F(3) F(4) = R, F(3) F(3) = R, F(3) F(2) F(2) = R and
F(n) [ nis odd +
F(n) [ nis odd = R. So, the theorem of Hall has

inspired a lot of other interesting results.
Chapter 15
Bounded complex partial
quotients
Ewelina Omiljan
We take a complex number $z \in \mathbb{C}$, $z = z_0$, and a sequence of approximations to $z$: $\left(\frac{p_n}{q_n}\right)$, where $p_n, q_n \in G = \{a + bi \mid a, b \in \mathbb{Z}\}$ ($G$ is the set of Gaussian integers) and $p_n$, $q_n$ are relatively prime.
We denote by $\lfloor z \rceil$ the Gaussian integer nearest $z$, rounding down in both the real and the imaginary part.
We take the domain $B = \{x + iy \mid -\frac{1}{2} \le x, y < \frac{1}{2}\}$.
The Hurwitz complex continued fraction algorithm proceeds by steps of the form:
\[ z_{n+1} = \frac{1}{z_n} - \left\lfloor \frac{1}{z_n} \right\rceil. \]
If $z = z_0 \in \mathbb{Q}(i)$ then the algorithm terminates when $z_n = 0$ and the final finite-depth continued fraction gives a reduced fraction $\frac{p_n}{q_n}$ equal to $z$.
If $z \notin \mathbb{Q}(i)$ then the algorithm continues indefinitely.
As in the classical algorithm we have:
\[ \begin{vmatrix} p_{n-1} & p_n \\ q_{n-1} & q_n \end{vmatrix} = p_{n-1} q_n - p_n q_{n-1} = (-1)^n. \]
We introduce some notation:
$z = z_0$ is given,
$z_0 \in B$,
$p_{-1} = q_0 = 1$, $p_0 = q_{-1} = 0$.
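For $n \ge 1$ let $a_n = \left\lfloor \frac{1}{z_{n-1}} \right\rceil$,
\[ p_n = a_n p_{n-1} + p_{n-2}, \qquad q_n = a_n q_{n-1} + q_{n-2}. \]
Let $x_n = \frac{1}{z_{n-1}}$ and $w_n = \frac{q_{n-1}}{q_n}$. Then:
\[ z_{n+1} = \frac{1}{z_n} - \left\lfloor \frac{1}{z_n} \right\rceil = \frac{1}{z_n} - a_{n+1} = H z_n, \]
and $H$ is called the Hurwitz continued fraction operator. Furthermore,
\[ w_{n+1} = \frac{q_n}{q_{n+1}} = \frac{q_n}{a_{n+1} q_n + q_{n-1}} = \frac{1}{a_{n+1} + \frac{q_{n-1}}{q_n}} = \frac{1}{a_{n+1} + w_n}. \]
Let $\frac{1}{B}$ denote the set of reciprocals of the nonzero elements of $B$. $\frac{1}{B}$ is bounded by arcs of circles of radius 1 about $\pm 1$ and $\pm i$, and these arcs pass through $\pm 1 \pm i$ and $\pm 2$, $\pm 2i$.
Let $G'$ denote $G \setminus \{0, \pm 1, \pm i\}$.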
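The recurrences above can be traced with exact arithmetic for a Gaussian rational $z_0 \in B$. The following is a minimal sketch under stated assumptions: complex numbers are represented as hypothetical `(re, im)` pairs, all helper names are illustrative, and the nearest Gaussian integer is taken as `floor(x + 1/2)` in each coordinate, a tie-breaking choice made here so that the remainder lies in $B$.

```python
from fractions import Fraction
from math import floor

def gmul(a, b):
    """Multiply two complex numbers given as (re, im) pairs."""
    return (a[0] * b[0] - a[1] * b[1], a[0] * b[1] + a[1] * b[0])

def gadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def ginv(z):
    """Reciprocal 1/z = conj(z) / |z|^2 of a nonzero (re, im) pair of Fractions."""
    norm = z[0] * z[0] + z[1] * z[1]
    return (z[0] / norm, -z[1] / norm)

def nearest(z):
    """Nearest Gaussian integer; ties are broken so that z - nearest(z) lies in B."""
    return (floor(z[0] + Fraction(1, 2)), floor(z[1] + Fraction(1, 2)))

def hurwitz(z, max_depth=50):
    """Return the partial quotients a_1, a_2, ... and the pairs (p_n, q_n) for n = 0, 1, ..."""
    p_prev, p = (1, 0), (0, 0)   # p_{-1} = 1, p_0 = 0
    q_prev, q = (0, 0), (1, 0)   # q_{-1} = 0, q_0 = 1
    quotients, convergents = [], [(p, q)]
    for _ in range(max_depth):
        if z == (0, 0):
            break                          # z_n = 0: the expansion terminates
        x = ginv(z)                        # x_{n+1} = 1 / z_n
        a = nearest(x)                     # a_{n+1}
        z = (x[0] - a[0], x[1] - a[1])     # z_{n+1} = 1/z_n - a_{n+1}
        p, p_prev = gadd(gmul(a, p), p_prev), p
        q, q_prev = gadd(gmul(a, q), q_prev), q
        quotients.append(a)
        convergents.append((p, q))
    return quotients, convergents

z0 = (Fraction(3, 10), Fraction(1, 5))     # z_0 = (3 + 2i)/10, an element of B
quotients, convergents = hurwitz(z0)
print(quotients)
print(convergents[-1])
```

For this $z_0$ the expansion stops after three steps, and the last pair $(p_3, q_3)$ represents $3 + 2i$ and $10$, so $p_3/q_3 = z_0$.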
Lemma 15.0.20. Suppose $z \in B$, $n \ge 1$ and $z$ has a Hurwitz continued fraction to depth $n + 2$. If $|w_n| \ge \frac{2}{3}$ then $|w_{n+1}| < \frac{2}{3}$. Furthermore, either $|w_n| < \frac{2}{3}$, or $\frac{2}{3} \le |w_n| < 1$ and one of the following, or its negative or complex-conjugate counterpart, holds: $\left|w_n - \frac{9}{14}(1 - i)\right| < \frac{3}{7}$ and $a_n = 1 + i$; $\left|w_n - \frac{9}{16}\right| < \frac{3}{16}$ and $a_n = 2$.
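Proof. First we start with the observation that:
\[ w_0 = \frac{q_{-1}}{q_0} = \frac{0}{1} = 0. \]
We define $D(s, r) := \{z \in \mathbb{C} : |z - s| < r\}$ as the disc in $\mathbb{C}$ with radius $r$ about $s$. If $|s| > r$ then the reciprocals of the disc have the form:
\[ \frac{1}{D(s, r)} = D\left( \frac{\bar{s}}{|s|^2 - r^2}, \frac{r}{|s|^2 - r^2} \right) \]
(the reciprocal of $z \in \mathbb{C}$ is $\frac{1}{z} = \frac{\bar{z}}{|z|^2}$).
Now let $D_0$ be the disc about 0 with radius $\frac{2}{3}$. And let $D_1 = \frac{1}{1 + i + D_0}$. Using the formula for $\frac{1}{D(s, r)}$ we calculate $D_1$:
\[ D_1 = D\left( \frac{\overline{1 + i}}{(\sqrt{2})^2 - \left(\frac{2}{3}\right)^2}; \frac{\frac{2}{3}}{2 - \frac{4}{9}} \right) = D\left( \frac{9}{14}(1 - i); \frac{3}{7} \right). \]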
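The two discs named in Lemma 15.0.20 can be recomputed from the reciprocal-disc formula. This is a small sketch with a hypothetical helper `reciprocal_disc`; centres are stored as `(re, im)` pairs of exact fractions.

```python
from fractions import Fraction

def reciprocal_disc(s, r):
    """Centre and radius of 1/D(s, r) for a centre s = (re, im) with |s| > r."""
    denom = s[0] * s[0] + s[1] * s[1] - r * r      # |s|^2 - r^2
    centre = (s[0] / denom, -s[1] / denom)          # conj(s) / (|s|^2 - r^2)
    return centre, r / denom

r0 = Fraction(2, 3)                                 # radius of D_0
for a in [(1, 1), (2, 0)]:                          # a = 1 + i and a = 2
    s = (Fraction(a[0]), Fraction(a[1]))            # centre of a + D_0
    print(a, reciprocal_disc(s, r0))
```

For $a = 1 + i$ this gives centre $\frac{9}{14}(1 - i)$ and radius $\frac{3}{7}$, and for $a = 2$ centre $\frac{9}{16}$ and radius $\frac{3}{16}$, matching the lemma.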
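In the same way we calculate the other discs:
\[ D_2 = \frac{1}{-1 + i + D_0}, \quad D_3 = \frac{1}{-1 - i + D_0}, \quad D_4 = \frac{1}{1 - i + D_0}, \]
\[ D_5 = \frac{1}{2 + D_0}, \quad D_6 = \frac{1}{2i + D_0}, \quad D_7 = \frac{1}{-2 + D_0}, \quad D_8 = \frac{1}{-2i + D_0}. \]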
Assuming the claim to be true for $k \le n$ we have either that $w_n \in D_0$ or that $w_n$ lies in the intersection of one of the 8 other discs with the open unit disc and $a_n$ takes one of the values $\pm 1 \pm i$, $\pm 2$ or $\pm 2i$. Now if $|a| \ge \sqrt{5}$ then $\frac{1}{a + D_0} \subset D_0$. If $a$ is one of the 8 Gaussian integers from $G'$ nearest the origin then $\frac{1}{a + D_0}$ is one of the $D_1, D_2, \ldots, D_8$.
For $1 \le k \le 8$ and for any successor $a$ to $a_n$: $\frac{1}{a + D_k} \subset D_0$. So for example when $w_n \in D_1 \setminus D_0$, we have $a_n = 1 + i$ so that $a_{n+1} \in G_{1+i} = \{u + iv : u \ge 0,\ v \le 0,\ u - v \ge 2\}$, where $G_k$ denotes the set of possible values in $G'$.
For $a \in G_{1+i}$, if $|a| \ge 2\sqrt{2}$ and $|w| < 1$ then $\left|\frac{1}{a + w}\right| < \frac{2}{3}$. This follows from:
\[ \frac{1}{|a + w|} \le \frac{1}{|a| - |w|} < \frac{1}{2\sqrt{2} - 1} \cdot \frac{2\sqrt{2} + 1}{2\sqrt{2} + 1} = \frac{2\sqrt{2} + 1}{7} < \frac{2}{3}. \]
While if $a \in \{2,\ 2 - i,\ 1 - i,\ 1 - 2i,\ -2i\}$ then $\frac{1}{a + D_1} \subset D_0$.
So $|w_{n+1}| < \frac{2}{3}$. This shows that if $|w_n| \ge \frac{2}{3}$ then $|w_{n+1}| < \frac{2}{3}$.
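Theorem 15.0.21. If $z \in B$ has a Hurwitz continued fraction expansion to depth $n + 2$, then $\left|\frac{q_{n+2}}{q_n}\right| \ge \frac{3}{2}$.
Proof.
\[ \frac{1}{|w_n||w_{n+1}|} = \left| \frac{q_n}{q_{n-1}} \cdot \frac{q_{n+1}}{q_n} \right| = \left| \frac{q_{n+1}}{q_{n-1}} \right|. \]
By the lemma, at least one of $|w_n|$, $|w_{n+1}|$ is smaller than $\frac{2}{3}$ and both are smaller than 1, so we know that $|w_n||w_{n+1}| < 1 \cdot \frac{2}{3} = \frac{2}{3}$. So
\[ \frac{1}{|w_n||w_{n+1}|} > \frac{3}{2}. \]
Applying this with $n + 1$ in place of $n$ gives the statement for $\frac{q_{n+2}}{q_n}$.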
Theorem 15.0.22. Suppose $z \in \mathbb{C}$ has a Hurwitz algorithm sequence of convergents to depth at least $n$, and suppose $(p_{n-1}, q_{n-1})$ and $(p_n, q_n)$ are the numerators and denominators of the $(n-1)$th and $n$th convergents. Suppose $q \in G$ with $|q_{n-1}| < |q| \le |q_n|$, $p \in G$ and $\frac{p}{q} \ne \frac{p_n}{q_n}$. Then
\[ \left| z - \frac{p}{q} \right| \ge \frac{1}{5} \left| z - \frac{p_n}{q_n} \right| \left| \frac{q_n}{q} \right|. \]
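Proof. We first write:
\[ (p, q) = s(p_{n-1}, q_{n-1}) + t(p_n, q_n). \]
From that:
\[ p = s p_{n-1} + t p_n, \qquad q = s q_{n-1} + t q_n. \]
If $s = 0$ then $(p, q) = t(p_n, q_n)$, so $p = t p_n$ and $q = t q_n$, and
\[ \left| z - \frac{p}{q} \right| = \left| z - \frac{t p_n}{t q_n} \right| = \left| z - \frac{p_n}{q_n} \right|. \]
The estimate is then equivalent to:
\[ 1 \ge \frac{1}{5} \left| \frac{q_n}{q} \right| = \frac{1}{5} \left| \frac{q_n}{t q_n} \right| = \frac{1}{5} \left| \frac{1}{t} \right| \iff |t| \ge \frac{1}{5}. \]
This is true because $t \ne 0$ and $t \in \mathbb{Z}$. So the estimate is true.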
If [s[ = 1, we can consider only s = 1 because we can multiply s and t
by the same unit.
[
p
n
q
n
[ = [
p
n
+z
n
p
n1
q
n
+z
n
q
n1
p
n
q
n
[ = [
z
n
(1)
n
q
n
2
(1 +z
n
w
n
)
[ =
[z
n
[
[q
n
[[q
n
+z
n
q
n1
[
s + 1 p = p
n1
+tp
n
; q = q
n1
+tq
n
[
p
q
[ = [
p
n
+z
n
p
n1
q
n
+z
n
q
n1
tp
n
+p
n1
tq
n
+q
n1
[ = [
(1 tz)(p
n
q
n1
p
n1
q
n
)
(q
n
+z
n
q
n1
)(tq
n
+q
n1
)
[ =
= [
tz
n
1
q(q
n
+z
n
q
n1
)
[ =
[z
n
[[t
1
zn
[
[q[[q
n
+z
n
q
n1
[

1
5
[z
n
[
[q[[q
n
+z
n
q
n1
[
[q
n
[
[q
n
[
=
1
5
[q
n
[
[q[

[z
n
[
[q
n
+z
n
q
n1
[
=
=
1
5
[
p
n
q
n
[
[q
n
[
[q[
q = q
n+1
t =
1
zn
|, but [q[ [q
n
[ < [q
n+1
[, so t ,=
1
zn
| and [t
1
zn
[
1
2

1
5
.
The case $|s| > 1$ can be broken down into subcases depending on the value of $a_n$. Because of rotations, symmetry and reflections these cases reduce to the following:
$|a_n| \ge 3$, $a_n = 2 + 2i$, $a_n = 2 + i$.
The value of a
n
constrains both w
n
and z
n
.
w
n
because w
n
=
1
w
n1
+an
, [w
n1
[ < 1 so w
n
D(
an
[an[
2
1
;
1
[an[
2
1
).
z
n
because z
n
D(a
n
, 1) B.
In my presentation I used [35].
Chapter 16
Binary quadratic forms
Merlijn Keune
A binary quadratic form is a map $f(x, y) = ax^2 + bxy + cy^2$ with $a, b, c \in \mathbb{Z}$. In this chapter the word form will often be used, always meaning a binary quadratic form. We first need some basic terminology.
Definitions 16.0.23. Given a form $f$ and an integer $n$, we say $n$ is represented by $f$ if there are $x, y \in \mathbb{Z}$ such that $f(x, y) = n$. One may wonder which integers are represented by a given form. For example, we already saw which integers can be represented by the form $f(x, y) = x^2 + y^2$: the integers that are a sum of two squares.
A form is called positive/negative definite if it only represents positive/negative integers apart from $f(0, 0) = 0$, and indefinite if it represents both negative and positive integers. Note that not every form has to be in one of these three categories.
A form is called degenerate if it can be factorized into linear parts. We are not interested in these forms, since they are easy to solve. For example, $4x^2 + 12xy + 9y^2 = (2x + 3y)^2$ represents exactly the perfect squares, since $\gcd(2, 3) = 1$.
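Definition 16.0.24. Let $f(x, y) = ax^2 + bxy + cy^2$ be a form. The matrix of $f$ is
\[ M = \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}. \]
The discriminant of $f$ is $d = b^2 - 4ac = -\det(M)$. Note that we can now write
\[ f(x, y) = \frac{1}{2} \begin{pmatrix} x & y \end{pmatrix} M \begin{pmatrix} x \\ y \end{pmatrix}. \]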
From now on we will assume $|d|$ not to be a square, since otherwise the form would be degenerate. The following lemma then becomes an easy calculation.
Lemma 16.0.25. Let $f$ be a form. If $d > 0$ then $f$ is indefinite, if $d < 0$ then $f$ is definite.
Lemma 16.0.26. An integer $d$ is the discriminant of a form if and only if $d \equiv 0, 1 \pmod 4$.
Proof Obviously $d = b^2 - 4ac \equiv b^2 \equiv 0, 1 \pmod 4$. Conversely, if $d \equiv 0 \pmod 4$, then $x^2 - \frac{d}{4} y^2$ has discriminant $d$, and if $d \equiv 1 \pmod 4$, then $x^2 + xy - \frac{d-1}{4} y^2$ has discriminant $d$.
The question we will be considering is: given an integer $d$, which forms have discriminant $d$? Since there are infinitely many such forms, we need an equivalence relation to classify them.
Definition 16.0.27. Two forms $f$ and $f'$ with matrices $M$ and $M'$ are said to be equivalent if there is a matrix $P \in SL_2(\mathbb{Z})$ such that $M' = P^{\top} M P$.
This is a proper equivalence relation, which follows easily by looking at the identity matrix, $P^{-1}$ and matrix multiplication. It can also be seen as the group $SL_2(\mathbb{Z})$ acting on the set of forms, with the orbits as equivalence classes.
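Proposition 16.0.28. Equivalent forms have the same discriminant and represent the same integers.
Proof Since $P \in SL_2(\mathbb{Z})$ we have $\det(P) = 1$, so $\det(M') = \det(M)$. Thus $d = d'$. Now let $n \in \mathbb{Z}$ be represented by $f'$, then there are $x, y \in \mathbb{Z}$ such that
\[ \frac{1}{2} \begin{pmatrix} x & y \end{pmatrix} P^{\top} M P \begin{pmatrix} x \\ y \end{pmatrix} = n. \]
Putting $\begin{pmatrix} x' \\ y' \end{pmatrix} = P \begin{pmatrix} x \\ y \end{pmatrix}$ gives
\[ \frac{1}{2} \begin{pmatrix} x' & y' \end{pmatrix} M \begin{pmatrix} x' \\ y' \end{pmatrix} = n, \]
so $n$ is also represented by $f$. The other direction follows from symmetry.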
Remark 16.0.29. If $M$ is the matrix of $f(x, y) = ax^2 + bxy + cy^2$ and $M' = P^{\top} M P$ with $P = \begin{pmatrix} p & q \\ r & s \end{pmatrix} \in SL_2(\mathbb{Z})$, then $M'$ is the matrix of $f'(x, y) = a'x^2 + b'xy + c'y^2$ with
\[ a' = f(p, r), \qquad b' = \begin{pmatrix} p & r \end{pmatrix} M \begin{pmatrix} q \\ s \end{pmatrix}, \qquad c' = f(q, s). \]
This can be seen by simply filling in the equations.
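The formulas of the remark are easy to try out on a concrete form. The sketch below is illustrative only: forms are stored as hypothetical triples `(a, b, c)`, and the matrix `P` is an arbitrary example with determinant 1.

```python
def evaluate(form, x, y):
    """Value f(x, y) of the form (a, b, c)."""
    a, b, c = form
    return a * x * x + b * x * y + c * y * y

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

def transform(form, P):
    """Coefficients (a', b', c') of the form with matrix P^T M P, P = ((p, q), (r, s))."""
    a, b, c = form
    (p, q), (r, s) = P
    a_new = evaluate(form, p, r)
    # b' = (p r) M (q s)^T with M = ((2a, b), (b, 2c))
    b_new = 2 * a * p * q + b * (p * s + q * r) + 2 * c * r * s
    c_new = evaluate(form, q, s)
    return (a_new, b_new, c_new)

f = (2, 1, -2)
P = ((2, 1), (3, 2))          # determinant 2*2 - 1*3 = 1
g = transform(f, P)
print(g, discriminant(f), discriminant(g))
```

The new form satisfies $g(x, y) = f(px + qy, rx + sy)$, which is why both forms represent the same integers.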
16.1 Positive definite forms
The next step in this theory would normally be to look at positive definite forms. There are finitely many equivalence classes of forms for a given discriminant $d$, which can be canonically represented by a special sort of forms. This however has nothing to do with continued fractions, so it won't be treated here. The steps that should be taken are as follows.
• Define when a form is reduced.
• Show that the number of reduced forms for a given $d$ is finite.
• Give an algorithm to determine a reduced form equivalent to a given form.
• Prove that no distinct reduced forms are equivalent.
Having done this, we know there is a unique reduced form in every equivalence class.
16.2 Indefinite forms
We could try to solve the problem in the indefinite case similarly, but there will be a complication. This is where continued fractions provide a solution. In this section all forms are assumed to be indefinite, thus with discriminant $d > 0$.
Definition 16.2.1. Let $f(x, y) = ax^2 + bxy + cy^2$ be a form. We define $t = \frac{-b + \sqrt{d}}{2a}$ to be the first root of $f$. Note that this is a solution of $f(x, 1) = 0$.
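Lemma 16.2.2. We have $\frac{1}{|t|} = \frac{b + \sqrt{d}}{2|c|}$.
Proof
\[ \frac{1}{t} = \frac{2a}{\sqrt{d} - b} = \frac{2a(\sqrt{d} + b)}{d - b^2} = \frac{2a(\sqrt{d} + b)}{-4ac} = -\frac{b + \sqrt{d}}{2c}. \]
Taking absolute value on both sides proves the lemma.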
Definition 16.2.3. A form $f$ is said to be reduced if $\frac{1}{|t|}$ is a reduced quadratic irrational, so if $1 < \frac{1}{|t|}$ and $-1 < \left(\frac{1}{|t|}\right)' < 0$, where $\left(\frac{1}{|t|}\right)'$ is the conjugate of $\frac{1}{|t|}$.
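Lemma 16.2.4. If $f(x, y) = ax^2 + bxy + cy^2$ with discriminant $d$ is reduced, then we have $0 < b < \sqrt{d}$ and $0 < |c| < \sqrt{d}$.
Proof Since $\frac{1}{|t|} = \frac{b + \sqrt{d}}{2|c|}$ is reduced and its conjugate is $\frac{b - \sqrt{d}}{2|c|}$, we have
\[ \left. \begin{array}{l} b + \sqrt{d} > 2|c| \\ b - \sqrt{d} > -2|c| \end{array} \right\} \Rightarrow b > 0, \qquad \left. \begin{array}{l} b + \sqrt{d} > 2|c| \\ 0 > b - \sqrt{d} \end{array} \right\} \Rightarrow \sqrt{d} > |c|. \]
The inequality $0 > b - \sqrt{d}$ also gives $b < \sqrt{d}$.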
Corollary 16.2.5. There are only finitely many reduced indefinite forms of given discriminant $d$: $b$ and $c$ are bounded and $a$ is determined by $b$, $c$ and $d$.
Definition 16.2.6. Let $f$ be a form with matrix $M$. Let
\[ P = \begin{pmatrix} 0 & 1 \\ -1 & k \end{pmatrix}, \quad k \in \mathbb{Z}, \]
then the form with matrix $P^{\top} M P$ is called the right neighbour of $f$ over $k$. In particular:
\[ f'(x, y) = f(y, ky - x) = cx^2 - (b + 2ck)xy + (a + bk + ck^2)y^2. \]
Similarly we obtain the left neighbour of $f$ over $k$ from the matrix $P^{-1}$.
Proposition 16.2.7. Let $f$ be an indefinite form with discriminant $d$ and first root $t$. If $g$ is the right neighbour of $f$ over $k$, then $g$ has first root $k - \frac{1}{t}$.
Proof By our previous result for $\frac{1}{t}$:
\[ \frac{b + 2ck + \sqrt{d}}{2c} = k + \frac{b + \sqrt{d}}{2c} = k - \frac{1}{t}. \]
Note that $d$ remains invariant since the forms are equivalent.
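A quick numerical check of the proposition: the sketch below computes the right neighbour from the coefficient formula $f'(x, y) = cx^2 - (b + 2ck)xy + (a + bk + ck^2)y^2$ and compares its first root with $k - \frac{1}{t}$ in floating point. The function names and the example form are chosen here for illustration.

```python
from math import sqrt, isclose

def right_neighbour(form, k):
    """Right neighbour of (a, b, c) over k: f'(x, y) = f(y, k*y - x)."""
    a, b, c = form
    return (c, -(b + 2 * c * k), a + b * k + c * k * k)

def first_root(form):
    """First root t = (-b + sqrt(d)) / (2a), evaluated in floating point."""
    a, b, c = form
    d = b * b - 4 * a * c
    return (-b + sqrt(d)) / (2 * a)

f = (2, 1, -2)                 # discriminant 17
k = 1
g = right_neighbour(f, k)
print(g, first_root(g), k - 1 / first_root(f))
```

Both printed roots agree up to rounding, and the discriminant of `g` is again 17.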
At this point we should introduce an algorithm to determine an equivalent reduced form given any indefinite form. This however is rather technical and requires some long calculations, so we will omit this and just suppose this can be done.
Corollary 16.2.8. Every equivalence class contains at least one reduced form.
In the definite case we would have proved that the reduced form in an equivalence class is unique. In the indefinite case this is not true.
Proposition 16.2.9. Let $f(x, y) = ax^2 + bxy + cy^2$ be a reduced indefinite form with first root $t$, $|t| = \varepsilon t$ (so $\varepsilon = \pm 1$ is the sign of $t$) and let $\frac{1}{|t|} = [a_0; a_1, a_2, \ldots]$. Then, if $g$ is the right neighbour of $f$ over $\varepsilon a_0$ and $g$ has first root $T$, $g$ is reduced, $|T| = -\varepsilon T$ and $\frac{1}{|T|} = [a_1; a_2, \ldots]$.
Proof
\[ T = \varepsilon a_0 - \frac{1}{t} = \varepsilon \left( a_0 - \frac{1}{|t|} \right) = \varepsilon \left( a_0 - [a_0; a_1, a_2, \ldots] \right) = \frac{-\varepsilon}{[a_1; a_2, \ldots]}. \]
So $|T| = -\varepsilon T$ and $\frac{1}{|T|} = [a_1; a_2, \ldots]$.
Since $f$ is reduced, $\frac{1}{|t|}$ has a purely periodic continued fraction expansion, so $\frac{1}{|T|}$ also has a purely periodic continued fraction expansion. Therefore $g$ is reduced.
Lemma 16.2.10. The form $g$ from the previous proposition is the only reduced right neighbour of $f$.
Proof For reduced forms we have
\[ \sqrt{d} - b < 2|c| < \sqrt{d} + b \]
and
\[ (\sqrt{d} - b)(\sqrt{d} + b) = d - b^2 = -4ac = 2|a| \cdot 2|c|, \]
because both $\sqrt{d} - b$ and $\sqrt{d} + b$ are positive. From this it follows that
\[ \sqrt{d} - b < 2|a| < \sqrt{d} + b. \]
If $h(x, y) = a'x^2 + b'xy + c'y^2$ is the right neighbour of $f$ over $k$, then $b + b' = b - b - 2ck$, so $b + b' \equiv 0 \pmod{2|c|}$. Together with the inequality $0 < \sqrt{d} - b' < 2|a'| = 2|c|$ this ensures that $k = \varepsilon a_0$ is unique. (No multiples of $2|c|$ can be added to or subtracted from $b'$.)
Remark 16.2.11. Also the left reduced neighbour is unique. This follows from a similar calculation.
Now for every reduced form, we have an associated sign $\varepsilon$ and a purely periodic continued fraction $\frac{1}{|t|} = [\overline{a_0; a_1, \ldots, a_{m-1}}]$ of period $m$. We construct a chain of equivalent forms, beginning at some reduced form $f$ and taking right neighbours over $\varepsilon a_0$. As we've seen, this has the following effect:
\[ \varepsilon,\ [\overline{a_0; a_1, \ldots, a_{m-1}}] \ \mapsto\ -\varepsilon,\ [\overline{a_1; a_2, \ldots, a_{m-1}, a_0}]. \]
If $m$ is odd, after $m$ steps we return to the same continued fraction, but instead of $\varepsilon$ we have $-\varepsilon$. So the cycle is back at the beginning after $\operatorname{lcm}(2, m)$ steps. So now we have proven most of the main theorem:
Theorem 16.2.12. Let $f(x, y)$ be a reduced indefinite form with discriminant $d$ and first root $t$, where $t = \varepsilon |t|$. Suppose that $\frac{1}{|t|} = [\overline{a_0; a_1, \ldots, a_{m-1}}]$ with $m$ even. (Take twice the smallest period if needed.) Let $f_0 = f$ and let $f_i$ be the right neighbour of $f_{i-1}$ by $(-1)^{i-1} \varepsilon a_{i-1}$. Then $f_0, f_1, \ldots, f_{m-1}$ are all the reduced forms equivalent to $f$, and $f_m = f$.
The only thing left to prove is that there are no reduced forms equivalent to $f$ outside this chain. Unfortunately that proof is rather hard and requires more knowledge about continued fractions than we have at this point, so this won't be proven. For instance, a nice proof can be given using so-called minus-continued fractions, expressions of the form
\[ a_0 - \cfrac{1}{a_1 - \cfrac{1}{a_2 - \cfrac{1}{\ddots}}} \]
with $a_0, a_1, a_2, \ldots \ge 2$.
The number of equivalence classes is called the class number of $d$. This corresponds with the class number $h_d$ we know from number theory, where $d = D_m$, the discriminant of the quadratic number field $K_m$.
Example 16.2.13. To calculate the class number $h_{17}$, we first need to determine all reduced forms $f(x, y) = ax^2 + bxy + cy^2$ with discriminant 17. We have $0 < b < \sqrt{17}$ and $0 < |c| < \sqrt{17}$. Also $b$ is odd, since $b^2 - 4ac = 17$. So $b = 1$ or $b = 3$.
If $b = 1$, then $-4ac = 16$, so $ac = -4$. So $c \in \{\pm 1, \pm 2, \pm 4\}$. Because
\[ \frac{1}{|t|} = \frac{b + \sqrt{d}}{2|c|} = \frac{1 + \sqrt{17}}{2|c|} \]
is a reduced quadratic irrational, we have $1 + \sqrt{17} > 2|c|$ and $1 - \sqrt{17} > -2|c|$, so $c = \pm 2$. This gives us two forms:
\[ 2x^2 + xy - 2y^2, \qquad -2x^2 + xy + 2y^2. \]
Similar reasoning gives another four reduced forms if $b = 3$:
\[ x^2 + 3xy - 2y^2, \qquad -x^2 + 3xy + 2y^2, \qquad 2x^2 + 3xy - y^2, \qquad -2x^2 + 3xy + y^2. \]
Taking $f_0 = 2x^2 + xy - 2y^2$ with first root $t = \frac{-1 + \sqrt{17}}{4}$ we get
\[ \frac{1}{t} = [\overline{1; 3, 1}] = [\overline{1; 3, 1, 1, 3, 1}]. \]
So all reduced forms with discriminant 17 are equivalent, which tells us $h_{17} = 1$. Knowing a bit about number theory, we have now proved that $\mathbb{Z}\left[\frac{1 + \sqrt{17}}{2}\right]$ is a principal ideal domain.
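The example can be reproduced mechanically. The following sketch enumerates the reduced forms of discriminant 17 and follows the chain of reduced right neighbours starting at $2x^2 + xy - 2y^2$. It assumes the reducedness test $1 < \frac{b + \sqrt{d}}{2|c|}$, $-1 < \frac{b - \sqrt{d}}{2|c|} < 0$ and the fact that for a reduced form the sign of $t$ equals the sign of $a$; all function names are hypothetical, and comparisons with $\sqrt{d}$ are done in integers to avoid rounding.

```python
from math import isqrt

def sqrt_gt(d, x):
    """True if sqrt(d) > x for a non-square positive integer d and an integer x."""
    return x < 0 or x * x < d

def is_reduced(form):
    """Check 1 < (b + sqrt d)/(2|c|) and -1 < (b - sqrt d)/(2|c|) < 0 exactly."""
    a, b, c = form
    d = b * b - 4 * a * c
    return (sqrt_gt(d, 2 * abs(c) - b)            # b + sqrt d > 2|c|
            and sqrt_gt(d, b)                     # b - sqrt d < 0
            and not sqrt_gt(d, b + 2 * abs(c)))   # b - sqrt d > -2|c|

def reduced_forms(d):
    """All reduced forms (a, b, c) with discriminant d."""
    forms = []
    bound = isqrt(d)
    for b in range(1, bound + 1):
        for c in range(-bound, bound + 1):
            if c != 0 and (b * b - d) % (4 * c) == 0:
                form = ((b * b - d) // (4 * c), b, c)
                if is_reduced(form):
                    forms.append(form)
    return forms

def reduced_right_neighbour(form):
    """Right neighbour over eps * a_0, where a_0 = floor(1/|t|) and eps = sign(t)."""
    a, b, c = form
    d = b * b - 4 * a * c
    a0 = (b + isqrt(d)) // (2 * abs(c))
    k = a0 if a > 0 else -a0        # for a reduced form, sign(t) = sign(a)
    return (c, -(b + 2 * c * k), a + b * k + c * k * k)

forms = reduced_forms(17)
cycle, f = [], (2, 1, -2)
while f not in cycle:
    cycle.append(f)
    f = reduced_right_neighbour(f)
print(sorted(forms))
print(cycle)
```

The chain visits six forms before returning to its start, and these are exactly the six reduced forms found by the enumeration, in line with $h_{17} = 1$.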
Chapter 17
CFs in power series elds
David Venhoek
17.1 Introduction
Instead of working with the reals, we can also make continued fraction expansions of elements in a power series field. Let $k$ be a field, then the role of $\mathbb{Z}$ is played by $k[X]$, that of $\mathbb{Q}$ by $k(X)$ and that of $\mathbb{R}$ by $k((X^{-1}))$. [80]
Throughout this piece I will use the following notational conventions. Lower case symbols such as $a$, $x$ are elements of $k$. Upper case symbols (with the obvious exception of $X$) such as $A$, $D$ are elements of $k[X]$, and Greek letters such as $\alpha$, $\beta$ are elements of $k((X^{-1}))$.
We can define a norm on the elements of $k((X^{-1}))$. Let $\alpha = \sum_{i=t}^{-\infty} a_i X^i$ with $a_t \ne 0$. Then we define the norm of $\alpha$ to be $|\alpha| = 2^t$. This norm is special in the sense that it is a non-archimedean norm. This means that instead of the normal triangle inequality $|\alpha + \beta| \le |\alpha| + |\beta|$ we have $|\alpha + \beta| \le \max(|\alpha|, |\beta|)$. In this particular case we have the additional property that $|\alpha + \beta| = \max(|\alpha|, |\beta|)$ when $|\alpha| \ne |\beta|$.
The regular continued fraction expansion for elements of $k((X^{-1}))$ is defined as the expansion
\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots}}} \]
where $\forall i > 0 : |a_i| > 1$. We will take the same shorthand notation used for
the continued fractions over the reals: [80]
\[ [A_0] = A_0 \]
\[ [A_0; A_1] = A_0 + \frac{1}{A_1} \]
\[ [A_0; A_1, \ldots, A_n] = \left[ A_0; A_1, \ldots, A_{n-2}, A_{n-1} + \frac{1}{A_n} \right] \]
We can now introduce again the notion of convergents and complete quotients. Let $\alpha = [A_0; A_1, A_2, \ldots]$. Then the convergents are:
\[ \frac{P_m}{Q_m} = [A_0; A_1, \ldots, A_m] \]
and the complete quotients are $\alpha_m = [A_m; A_{m+1}, \ldots]$. Furthermore we have $\alpha = [A_0; A_1, \ldots, A_{m-1}, \alpha_m]$. From this it follows immediately that $|A_i| = |\alpha_i|$. [80]
We also have a recursion relation for the convergents.
17.1.1 Lemma 1
Let $\frac{P_i}{Q_i}$ be the convergents of $\alpha = [A_0; A_1, \ldots]$. Then with $P_{-2} = 0$, $Q_{-2} = 1$, $P_{-1} = 1$ and $Q_{-1} = 0$ we have $P_i = A_i P_{i-1} + P_{i-2}$ and $Q_i = A_i Q_{i-1} + Q_{i-2}$.
Proof: Verifying for $i = 0$ is trivial. Suppose the lemma is true for all $i < n$. Then
\[ \frac{P_n}{Q_n} = [A_0; A_1, \ldots, A_n] = \left[ A_0; A_1, \ldots, A_{n-1} + \frac{1}{A_n} \right] = \frac{\left( A_{n-1} + \frac{1}{A_n} \right) P_{n-2} + P_{n-3}}{\left( A_{n-1} + \frac{1}{A_n} \right) Q_{n-2} + Q_{n-3}} \]
\[ = \frac{P_{n-1} + \frac{P_{n-2}}{A_n}}{Q_{n-1} + \frac{Q_{n-2}}{A_n}} = \frac{A_n P_{n-1} + P_{n-2}}{A_n Q_{n-1} + Q_{n-2}} \]
and the lemma is thus also true for $n$. Then by induction the lemma is true for all $i \in \mathbb{N}$. [80]
By induction it is now also easy to prove that the relationship $Q_n P_{n-1} - P_n Q_{n-1} = (-1)^n$ holds for $n \ge -1$.
Finally we need to define some useful subsets of $k$. $k^*$ is the set of all units (invertible elements) in $k$. We define $(k^*)^2$ to be the set of all squares in $k^*$, and for $a \in k^*$, $a(k^*)^2$ is the coset of $(k^*)^2$ containing $a$. [80]
17.2 Properties of convergents
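We now have all the definitions in place to prove some facts about the convergents of the continued fraction of an element $\alpha$.
17.2.1 Lemma 2
If $\frac{P_n}{Q_n}$ is a convergent of $\alpha$ then
\[ \left| \alpha - \frac{P_n}{Q_n} \right| = \frac{1}{|Q_{n+1}||Q_n|} < \frac{1}{|Q_n|^2}. \]
Proof: Using the definitions and lemma 1 we get:
\[ \left| \alpha - \frac{P_n}{Q_n} \right| = \left| \frac{\alpha_{n+1} P_n + P_{n-1}}{\alpha_{n+1} Q_n + Q_{n-1}} - \frac{P_n}{Q_n} \right| = \left| \frac{Q_n(\alpha_{n+1} P_n + P_{n-1}) - P_n(\alpha_{n+1} Q_n + Q_{n-1})}{Q_n(\alpha_{n+1} Q_n + Q_{n-1})} \right| \]
\[ = \left| \frac{(-1)^n}{Q_n(\alpha_{n+1} Q_n + Q_{n-1})} \right| = \frac{1}{|Q_{n+1}||Q_n|}. \]
In the last step we use $|\alpha_{n+1}| = |A_{n+1}|$ and $|Q_{n-1}| < |Q_n|$, so that $|\alpha_{n+1} Q_n + Q_{n-1}| = |A_{n+1} Q_n| = |Q_{n+1}|$. Because the degrees of the $Q_n$ are strictly increasing, the inequality also holds. [80]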
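17.2.2 Lemma 3
If $\alpha \in k((X^{-1})) \setminus k(X)$ and $\left| \alpha - \frac{P}{Q} \right| < \frac{1}{|Q|^2}$, then there exists an $n$ such that $\frac{P}{Q} = \frac{P_n}{Q_n}$.
Proof: There exists an $n$ such that $|Q_n| \le |Q| < |Q_{n+1}|$. We now have
\[ \left| \alpha - \frac{P}{Q} \right| < \frac{1}{|Q|^2} \le \frac{1}{|Q||Q_n|}, \qquad \left| \alpha - \frac{P_n}{Q_n} \right| = \frac{1}{|Q_n||Q_{n+1}|} < \frac{1}{|Q||Q_n|}. \]
From this we can now estimate $\left| \frac{P}{Q} - \frac{P_n}{Q_n} \right|$:
\[ \left| \frac{P}{Q} - \frac{P_n}{Q_n} \right| = \left| \left( \alpha - \frac{P_n}{Q_n} \right) - \left( \alpha - \frac{P}{Q} \right) \right| \le \max\left( \left| \alpha - \frac{P_n}{Q_n} \right|, \left| \alpha - \frac{P}{Q} \right| \right) < \frac{1}{|Q||Q_n|}. \]
If the two fractions were different, their difference $\frac{P Q_n - P_n Q}{Q Q_n}$ would have a nonzero polynomial as numerator and hence norm at least $\frac{1}{|Q||Q_n|}$. Thus $\frac{P}{Q} = \frac{P_n}{Q_n}$. [80]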
17.3 Relations between continued fraction expansions
We will now look at an interesting relation between elements of $k((X^{-1}))$ that indicates a relationship between the continued fraction expansions of the elements.
We define the relation as follows: [80]
\[ \alpha \sim \beta \iff \exists R, S, T, U \in k[X] : RU - ST \in k^* \wedge \beta = \frac{R\alpha + S}{T\alpha + U} \]
We will not prove that this is indeed an equivalence relation; this is very easy to verify for yourself, though we will elaborate on some of the basic facts about it in the next section.
We have two simple consequences of this definition. First, $\alpha_n \sim \alpha_m$ for all $n, m \in \mathbb{N}$. From this it follows that $\alpha \sim \beta$ for all $\alpha, \beta \in k(X)$.
To analyze the consequences of this relationship further it is easier to first look at a slightly different relationship.
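17.3.1 Lemma 4
Let $\alpha = \frac{A\beta + B}{C\beta + D}$ where $|D| < |C|$, $AD - BC = a \in k^*$, $\alpha, \beta \notin k(X)$ and $|\beta| > 1$. Let $\frac{P_n}{Q_n}$ be the convergents of $\alpha$. Then for some $n$ we have
\[ \frac{A}{C} = \frac{P_n}{Q_n}, \qquad \frac{B}{D} = \frac{P_{n-1}}{Q_{n-1}} \]
and $\beta = b\alpha_{n+1}$ for some $b \in (-1)^{n+1} a (k^*)^2$.
Proof: We can write $\frac{A}{C}$ as a continued fraction: $\frac{A}{C} = [A_0; A_1, \ldots, A_n] = \frac{P_n^*}{Q_n^*}$, where the star indicates it is a convergent of $\frac{A}{C}$. $A$ and $C$ are coprime and thus we have $A = c^{-1} P_n^*$, $C = c^{-1} Q_n^*$ with $c \in k^*$. Thus:
\[ P_n^* D - Q_n^* B = c(AD - BC) = ac = ac(-1)^n (Q_n^* P_{n-1}^* - P_n^* Q_{n-1}^*) \]
\[ P_n^* (D + (-1)^n ac\, Q_{n-1}^*) = Q_n^* (B + (-1)^n ac\, P_{n-1}^*) \]
$P_n^*$ and $Q_n^*$ are coprime, and thus $Q_n^* \mid (D + (-1)^n ac\, Q_{n-1}^*)$. Since $|D| < |C| = |Q_n^*|$, we have $D = (-1)^{n+1} ac\, Q_{n-1}^*$ and $B = (-1)^{n+1} ac\, P_{n-1}^*$. Thus
\[ \alpha = \frac{\left( (-1)^{n+1} c^{-2} a^{-1} \beta \right) P_n^* + P_{n-1}^*}{\left( (-1)^{n+1} c^{-2} a^{-1} \beta \right) Q_n^* + Q_{n-1}^*} \]
Because $|\beta| > 1$, this shows that $\alpha = [A_0; A_1, \ldots, A_n, (-1)^{n+1} c^{-2} a^{-1} \beta]$, so the starred convergents are convergents of $\alpha$, and we have $\frac{A}{C} = \frac{P_n}{Q_n}$, $\frac{B}{D} = \frac{P_{n-1}}{Q_{n-1}}$ and $\beta = (-1)^{n+1} a c^2 \alpha_{n+1}$. [80]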
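17.3.2 Theorem 2
Let $\alpha, \beta \in k((X^{-1})) \setminus k(X)$. Then $\alpha \sim \beta$ if and only if we have $n, m \in \mathbb{N}$, $a \in k^*$ such that $\beta_m = a\alpha_n$.
Proof: $\beta_m = a\alpha_n = \frac{a\alpha_n + 0}{0\alpha_n + 1}$ gives $\alpha_n \sim \beta_m$, and thus $\alpha \sim \beta$. The other direction requires a bit more work. $\alpha \sim \beta$ gives $\frac{R\alpha + S}{T\alpha + U} = \beta$. Let $\frac{P_i}{Q_i}$ be the convergents of $\alpha$. We can write:
\[ \beta = \frac{R(P_{n-1}\alpha_n + P_{n-2}) + S(Q_{n-1}\alpha_n + Q_{n-2})}{T(P_{n-1}\alpha_n + P_{n-2}) + U(Q_{n-1}\alpha_n + Q_{n-2})} \quad (17.1) \]
\[ = \frac{(RP_{n-1} + SQ_{n-1})\alpha_n + (RP_{n-2} + SQ_{n-2})}{(TP_{n-1} + UQ_{n-1})\alpha_n + (TP_{n-2} + UQ_{n-2})} \quad (17.2) \]
\[ = \frac{A\alpha_n + B}{C\alpha_n + D} \quad (17.3) \]
with
\[ A = RP_{n-1} + SQ_{n-1}, \quad B = RP_{n-2} + SQ_{n-2}, \quad C = TP_{n-1} + UQ_{n-1}, \quad D = TP_{n-2} + UQ_{n-2}. \]
We have $\left| \alpha - \frac{P_{n-1}}{Q_{n-1}} \right| < \frac{1}{|Q_{n-1}|^2}$. And thus we can write $P_{n-1} = \alpha Q_{n-1} + \delta$ with $|\delta| < \frac{1}{|Q_{n-1}|}$. We can thus write:
\[ C = (T\alpha + U) Q_{n-1} + T\delta, \qquad D = (T\alpha + U) Q_{n-2} + T\delta', \]
where $\delta'$ is the corresponding quantity for $n - 2$. If we take $n$ big enough we have $|C| = |T\alpha + U||Q_{n-1}|$ and $|D| = |T\alpha + U||Q_{n-2}|$. Thus for large enough $n$ we have $|C| > |D|$. Now we can apply lemma 4 to get an $a \in k^*$ and $m \in \mathbb{N}$ such that $\beta_m = a\alpha_n$, proving the theorem. [80]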
17.4 Möbius transformations and matrix notation
The relation in the previous section can be seen as a requirement on the existence of a certain kind of Möbius transformation. A Möbius transformation is a function of the form
\[ f(x) = \frac{Ax + B}{Cx + D}. \]
We limit ourselves here to those transformations for which we have $AD - BC \in k^*$. These form a group with a left action on $k((X^{-1}))$. We claim that this group behaves like the group of $2 \times 2$ matrices over $k[X]$ with determinant in $k^*$; from now on we will call this group $SL(2, k[X])$. First we need to show that it is indeed a group. For this we need to verify only two things; the rest follows from the fact that we are working with matrices. First, let $f, g \in SL(2, k[X])$. Then $\det(fg) = (\det f)(\det g)$. Since $k^*$ is a group, we thus have $\det(fg) \in k^*$ and thus $fg \in SL(2, k[X])$. Now let $f \in SL(2, k[X])$. We have
\[ f^{-1} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} \frac{D}{AD - BC} & \frac{-B}{AD - BC} \\ \frac{-C}{AD - BC} & \frac{A}{AD - BC} \end{pmatrix}. \]
And since we have $\det f = AD - BC \in k^*$ and $k^*$ is a group, we have $f^{-1} \in SL(2, k[X])$. And thus $SL(2, k[X])$ is a group.
We now define the left action of $SL(2, k[X])$ on $k((X^{-1}))$. Let $f \in SL(2, k[X])$ and $\alpha \in k((X^{-1}))$. Then if
\[ f = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \]
we define $f\alpha$ to be
\[ f\alpha = \frac{A\alpha + B}{C\alpha + D}. \]
This is indeed a left action because we have:
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} A' & B' \\ C' & D' \end{pmatrix} = \begin{pmatrix} AA' + BC' & AB' + BD' \\ CA' + DC' & CB' + DD' \end{pmatrix} \]
and
\[ f(f'\alpha) = \frac{A \frac{A'\alpha + B'}{C'\alpha + D'} + B}{C \frac{A'\alpha + B'}{C'\alpha + D'} + D} = \frac{A(A'\alpha + B') + B(C'\alpha + D')}{C(A'\alpha + B') + D(C'\alpha + D')} = \frac{(AA' + BC')\alpha + (AB' + BD')}{(CA' + DC')\alpha + (CB' + DD')}. \]
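The compatibility between composing Möbius transformations and multiplying matrices is a purely formal identity, so it can be illustrated with numbers instead of power series. The sketch below is such an illustration only: the entries are integers and the argument is a rational number (stand-ins chosen here for elements of $k[X]$ and $k((X^{-1}))$), and `act` and `matmul` are hypothetical helper names.

```python
from fractions import Fraction

def act(m, x):
    """Apply the Moebius transformation of m = ((A, B), (C, D)) to x."""
    (A, B), (C, D) = m
    return (A * x + B) / (C * x + D)

def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    (A, B), (C, D) = m
    (A2, B2), (C2, D2) = n
    return ((A * A2 + B * C2, A * B2 + B * D2),
            (C * A2 + D * C2, C * B2 + D * D2))

f = ((2, 1), (1, 1))          # determinant 1
g = ((3, 2), (1, 1))          # determinant 1
x = Fraction(5, 7)
print(act(f, act(g, x)) == act(matmul(f, g), x))
```

Applying `g` and then `f` gives the same value as applying the single matrix product, which is the left action property $f(f'\alpha) = (ff')\alpha$.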
17.5 Pseudoperiodic continued fractions
We can extend the notion of a periodic continued fraction in the real case to the power series field case. We use the notation $[A_0, A_1, A_2, A_3, \overline{A_4, A_5, A_6}] = [A_0, A_1, A_2, A_3, A_4, A_5, A_6, A_4, A_5, A_6, \ldots]$. However, it turns out to be useful to slightly extend the notion of periodicity, by allowing the period to be
repeated with a factor $a \in k^*$. We can write:
\[ [A_0, A_1, A_2, A_3, \overline{A_4, A_5}^{\,a}] = [A_0, A_1, A_2, A_3, A_4, A_5, \overline{aA_4, a^{-1}A_5}^{\,a}] \]
in which we require the period to always be even. [80]
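We can now formulate and prove the following theorem:
17.5.1 Theorem 3
Let $\alpha \in k((X^{-1})) \setminus k(X)$. Then $\alpha$ has a pseudoperiodic continued fraction expansion if and only if we have
\[ \alpha = \frac{R\alpha + S}{T\alpha + U} \]
where $\begin{pmatrix} R & S \\ T & U \end{pmatrix}$ has determinant in $k^*$ and is not a multiple of the identity matrix.
Proof: Let $\alpha$ have a pseudoperiodic continued fraction. Then we can write $\alpha = [A_0, \ldots, A_n, \overline{A_{n+1}, \ldots, A_k}^{\,a}]$ with $n \ne k$. And thus we have $\alpha_{n+1} = [\overline{A_{n+1}, \ldots, A_k}^{\,a}]$, and $\alpha_{k+1} = a\alpha_{n+1}$. Letting $\frac{P_n}{Q_n}$ be the convergents of $\alpha$ we now define
\[ M_l = \begin{pmatrix} P_{l-1} & P_{l-2} \\ Q_{l-1} & Q_{l-2} \end{pmatrix}. \]
We then have $\alpha = M_l \alpha_l$ and thus:
\[ M_{k+1}^{-1} \alpha = \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix} M_{n+1}^{-1} \alpha, \qquad \alpha = M_{k+1} \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix} M_{n+1}^{-1} \alpha. \]
And we thus have a relation $\alpha = \frac{R\alpha + S}{T\alpha + U}$ with
\[ \begin{pmatrix} R & S \\ T & U \end{pmatrix} = M_{k+1} \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix} M_{n+1}^{-1}. \]
If this matrix were a multiple of the identity matrix we would have a $b \in k^*$ such that:
\[ M_{k+1} \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix} = M_{n+1} \begin{pmatrix} b & 0 \\ 0 & b \end{pmatrix}. \]
However, the lower left entry of the left side is $aQ_k$ and is thus of greater degree than the corresponding entry $bQ_n$ on the right side of the equation, and thus this equality can hold for no $b \in k^*$, and thus we have a relation as required by the theorem.
Now suppose that we have an $\alpha$ such that we have
\[ \alpha = \begin{pmatrix} R & S \\ T & U \end{pmatrix} \alpha \]
with $RU - ST \in k^*$ and $\begin{pmatrix} R & S \\ T & U \end{pmatrix}$ not a multiple of the identity matrix. Then by theorem 2 we have $\alpha_n = b\alpha_m$ for some $m, n \in \mathbb{N}$ and $b \in k^*$. We only need to show now that $m \ne n$. But the result was reached through lemma 4, which also gives $\frac{A}{C} = \frac{P_{m-1}}{Q_{m-1}}$ and $\frac{B}{D} = \frac{P_{m-2}}{Q_{m-2}}$. Suppose $m = n$, then we have $A = uP_{n-1}$, $C = uQ_{n-1}$, $B = vP_{n-2}$ and $D = vQ_{n-2}$ with $u, v \in k^*$. Substituting these into equation 17.1 gives
\[ \alpha = \frac{uP_{n-1}\alpha_n + vP_{n-2}}{uQ_{n-1}\alpha_n + vQ_{n-2}}; \]
since also $\alpha = M_n \alpha_n$ we have $v = u$, and thus
\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} = uM_n \quad \text{and} \quad \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} R & S \\ T & U \end{pmatrix} M_n. \]
But since the matrix with entries $R$, $S$, $T$, $U$ is not a multiple of the identity matrix, this is impossible, and thus $m \ne n$. From this it follows that $\alpha$ has a pseudoperiodic continued fraction expansion. [80]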
17.6 Calculating continued fraction
Up until now we have only looked at abstract properties of continued fractions. But we might also be interested in calculating the continued fraction expansion of a zero of an equation. There are two strategies that will be treated here.
First of all, we can calculate the continued fraction expansion from its power series expansion. Suppose we have the element $\alpha = \sum_{i=1}^{\infty} X^{-i}$ of $GF(2)((X^{-1}))$, that is, $\alpha = 1/(1+X)$. We get its continued fraction expansion by repeatedly taking the polynomial part of $\alpha_i$ (the non-negative powers of $X$), which is the next partial quotient, and using that to obtain $\alpha_{i+1}$. In this case this gives $A_0 = 0$ and $\alpha_1 = 1 + X$, and the continued fraction expansion is thus $[0, 1 + X]$.
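For a rational element such as $1/(1+X)$, taking the polynomial part repeatedly amounts to polynomial division with remainder. The following is a minimal sketch of that special case only, not a method taken from the text. It assumes polynomials over GF(2) are stored as Python integers (bit $i$ is the coefficient of $X^i$); the function names are hypothetical.

```python
def deg(a: int) -> int:
    """Degree of a GF(2) polynomial stored as a bitmask; -1 stands for the zero polynomial."""
    return a.bit_length() - 1


def poly_divmod(a: int, b: int) -> tuple[int, int]:
    """Quotient and remainder of a / b in GF(2)[X] (subtraction is XOR)."""
    if b == 0:
        raise ZeroDivisionError("division by the zero polynomial")
    q = 0
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        q ^= 1 << shift   # add X^shift to the quotient
        a ^= b << shift   # cancel the leading term of a
    return q, a


def cf_of_rational(num: int, den: int) -> list[int]:
    """Partial quotients of num/den: split off the polynomial part, invert the rest, repeat."""
    quotients = []
    while den != 0:
        q, r = poly_divmod(num, den)
        quotients.append(q)
        num, den = den, r
    return quotients


# 1 / (X + 1): the polynomial 1 is 0b1 and X + 1 is 0b11.
print(cf_of_rational(0b1, 0b11))  # partial quotients 0 and X + 1
```

An irrational power series would need a truncated series representation instead; that case is not covered by this sketch.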
However, it is not always easy to obtain the power series expansion of an element, and it can be rather hard to manipulate it in certain situations. There are sometimes alternative approaches.
Suppose $\alpha, \beta \in GF(p)((X^{-1}))$, and suppose we know the continued fraction expansion of $\alpha$. If we now have
\[ \beta = \frac{R\alpha^p + S}{T\alpha^p + U}, \]
then we can use a method by Mills and Robbins [59].
The method of Mills and Robbins consists of an iterated process, in which during each iteration one executes one of two possible steps. These steps either produce a new partial quotient of $\beta$, or consume a partial quotient of $\alpha$. The state of the algorithm after each iteration is a relation
\[ \beta_i = \frac{R\alpha_j^p + S}{T\alpha_j^p + U}, \]
and we can for the rest of this piece assume that the greatest common divisor of $R$, $S$, $T$ and $U$ is 1.
We first need some notational conventions. We define $D = RU - ST$, and let $r, s, t, u, d$ be the degrees of $R, S, T, U, D$ respectively; these are $-\infty$ if the corresponding polynomial is 0. When we write $[\alpha]$ this means we take the integer part of $\alpha$ (its polynomial part, i.e. the non-negative powers of $X$). We can now go on to look at the actual steps of the algorithm.
17.6.1 Step of type I
Steps of type I produce an element of the continued fraction expansion of $\beta$, and can thus only be executed in the situation that no further element of the continued fraction expansion of $\alpha$ influences it.
Let
\[ \beta_i = \frac{R\alpha_j^p + S}{T\alpha_j^p + U}. \]
We can execute a step of type I when:
\[ \deg(\alpha_j) \ge 1, \qquad t + p > u, \qquad 2t + p > d. \]
Then let $B_i = \left[\frac{R}{T}\right]$, and we have
\[ \beta_{i+1} = \frac{T\alpha_j^p + U}{(R - B_iT)\alpha_j^p + S - B_iU}. \]
17.6.2 Lemma 5
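A step of type I is valid under the given conditions.
Proof: $D = RU - ST$, which gives
\[ \beta_i - \frac{R}{T} = \frac{R\alpha_j^p + S}{T\alpha_j^p + U} - \frac{R}{T} = \frac{-D}{T\left(T\alpha_j^p + U\right)}. \]
Since $\deg(\alpha_j) \ge 1$ and $t + p > u$, we have $\deg\left(T\left(T\alpha_j^p + U\right)\right) \ge 2t + p$. Also $2t + p > d$, and thus the right hand side of the equation has integer part 0. It follows that the next partial quotient of $\beta$ is $\left[\frac{R}{T}\right]$.
We now have $\beta_i = B_i + \frac{1}{\beta_{i+1}}$, and thus $\beta_{i+1} = \frac{1}{\beta_i - B_i}$. Substituting for $\beta_i$ we get:
\[ \beta_{i+1} = \frac{1}{\frac{R\alpha_j^p + S}{T\alpha_j^p + U} - B_i} = \frac{T\alpha_j^p + U}{R\alpha_j^p + S - B_i\left(T\alpha_j^p + U\right)} = \frac{T\alpha_j^p + U}{(R - B_iT)\alpha_j^p + S - B_iU}, \]
which proves the lemma.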
17.6.3 Step of type II
Steps of type II consume an element of the continued fraction expansion of $\alpha$, and can always be done. Because in practical calculations one wants to calculate the required number of partial quotients as fast as possible, one will usually only want to do steps of type II if one cannot do a step of type I, though this is by no means necessary.
Let the next partial quotient of $\alpha$ be $A_j$. Then the result of a step of type II is
\[ \beta_i = \frac{\left(RA_j^p + S\right)\alpha_{j+1}^p + R}{\left(TA_j^p + U\right)\alpha_{j+1}^p + T}. \]
17.6.4 Lemma 6
A step of type II is valid.
Proof: We have $\alpha_j = A_j + \frac{1}{\alpha_{j+1}}$. Since we are working over $GF(p)$, raising to the $p$-th power is additive, so we have $\alpha_j^p = A_j^p + \frac{1}{\alpha_{j+1}^p}$. Putting this in the original relation gives:
\[ \beta_i = \frac{R\left(A_j^p + \frac{1}{\alpha_{j+1}^p}\right) + S}{T\left(A_j^p + \frac{1}{\alpha_{j+1}^p}\right) + U} = \frac{\left(S + RA_j^p\right)\alpha_{j+1}^p + R}{\left(U + TA_j^p\right)\alpha_{j+1}^p + T}. \]
Both types of steps can easily be shown to preserve the determinant up to sign.
Now that we have both types of steps, one last thing that is useful to prove is that it can only take a finite number of steps of type II to make the next step of type I possible. In other words, given enough steps, one will always get the next partial quotient. One cannot get stuck.
17.6.5 Lemma 7
It only takes a finite number of steps of type II to make a step of type I possible.
Proof: Suppose that $\alpha_i$ is of degree 0. The definition of the complete quotients then gives that $i = 0$. A single step of type II will thus make it so that this property is satisfied, and further steps of type I and type II will preserve this property.
Suppose $t_i + p \le u_i$. Then after a step of type II we have $u_{i+1} = t_i \le u_{i+k}$. If we would now have another step of type II we have $t_i + p > u_i$ and thus $t_{i+1} \ge t_i + p > t_i = u_{i+1}$. Thus further steps will preserve the property $t + p > u$.
Suppose $t_i + p > u_i$ but $2t_i + p \le d$. The previous also gave that if $t_i + p > u_i$ then $t_{i+1} > t_i$. Since $t_i$ is an integer it takes only a finite number of steps to satisfy the property $2t_i + p > d$. By the same argument steps of type II preserve the property once it is satisfied.
There is thus need for only a finite number of steps of type II between steps of type I.
17.6.6 Calculating a continued fraction from a relation with itself
Suppose we have $\alpha \in GF(p)((X^{-1}))$, which satisfies the equation
\[ \alpha = \frac{R\alpha^p + S}{T\alpha^p + U}. \]
Then, as long as the method of Mills and Robbins produces every partial quotient before it is needed as input for the procedure, we can use it to calculate the continued fraction expansion of $\alpha$.
For example: Let $\alpha \in GF(2)((X^{-1}))$ satisfy
\[ \alpha = \frac{\left(x^2 + x\right)\alpha^2 + 1}{x\alpha^2 + 1}, \qquad \deg(\alpha) \ge 1. \]
This has determinant $D = x^2$. The degree of the determinant is thus 2. We can verify that the conditions for a type I step hold, so doing that gives $A_0 = x + 1$ and
\[ \alpha_1 = \frac{x\alpha^2 + 1}{x}. \]
Now we need to do a step of type II, giving
\[ \alpha_1 = \frac{\left(x^3 + x + 1\right)\alpha_1^2 + x}{x\alpha_1^2}. \]
We can now once again do a step of type I, giving $A_1 = x^2 + 1$ and
\[ \alpha_2 = \frac{x\alpha_1^2}{\alpha_1^2 + x}. \]
Again, we now need a step of type II, giving
\[ \alpha_2 = \frac{\left(x^5 + x\right)\alpha_2^2 + x}{\left(x^4 + x + 1\right)\alpha_2^2 + 1}. \]
Continuing this way will give us further partial quotients of $\alpha$.
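The alternation of type I and type II steps in this example can be traced mechanically. The sketch below is an illustration for $p = 2$ only, under the assumption that polynomials over GF(2) are stored as Python integers (bit $i$ is the coefficient of $x^i$) and that $\deg(\alpha) \ge 1$, so every complete quotient has degree at least 1. All function names are hypothetical, and the degree $-\infty$ of the zero polynomial is represented by $-1$.

```python
def deg(a: int) -> int:
    """Degree of a GF(2) polynomial stored as a bitmask; -1 plays the role of -infinity."""
    return a.bit_length() - 1


def mul(a: int, b: int) -> int:
    """Carry-less product of two GF(2) polynomials."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def quotient(a: int, b: int) -> int:
    """Polynomial part [a / b] in GF(2)[x]."""
    q = 0
    while deg(a) >= deg(b):
        shift = deg(a) - deg(b)
        q ^= 1 << shift
        a ^= b << shift
    return q


def mills_robbins_gf2(R: int, S: int, T: int, U: int, count: int) -> list[int]:
    """First `count` partial quotients of alpha = (R alpha^2 + S) / (T alpha^2 + U)."""
    p = 2
    out = []       # partial quotients produced so far (type I steps)
    consumed = 0   # how many of them have been fed back in (type II steps)
    while len(out) < count:
        D = mul(R, U) ^ mul(S, T)  # determinant; minus is XOR in characteristic 2
        t, u, d = deg(T), deg(U), deg(D)
        if T != 0 and t + p > u and 2 * t + p > d:
            # Type I: emit B = [R / T] and pass to the next complete quotient.
            B = quotient(R, T)
            out.append(B)
            R, S, T, U = T, U, R ^ mul(B, T), S ^ mul(B, U)
        else:
            # Type II: consume the next partial quotient A (using A^2, since p = 2).
            if consumed >= len(out):
                raise RuntimeError("partial quotient needed before it was produced")
            A2 = mul(out[consumed], out[consumed])
            R, S, T, U = mul(R, A2) ^ S, R, mul(T, A2) ^ U, T
            consumed += 1
    return out


# R = x^2 + x, S = 1, T = x, U = 1, as in the example relation.
print([bin(q) for q in mills_robbins_gf2(0b110, 0b1, 0b10, 0b1, 3)])
```

The first two values returned correspond to $x + 1$ and $x^2 + 1$; the third is what this sketch computes for the next type I step.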
Chapter 18
Computing Möbius transformations
18.1 Introduction
One of the difficulties of the continued fraction of a real number is that it is rather hard to do calculations with it. This appendix will look at a technique, first described by G.N. Raney in [70], for doing Möbius transformations on the continued fraction expansion of a real number.
A Möbius transformation is a function $f : \mathbb{R} \to \mathbb{R}$ of the form
\[ f(\alpha) = \frac{A\alpha + B}{C\alpha + D} \]
with $A, B, C, D \in \mathbb{Z}$. In order to look at a technique for applying these to the continued fraction expansion of a number we need to get some preliminaries out of the way.
18.2 Words
Words are finite or infinite sequences of symbols from an alphabet $\Sigma$. An alphabet is simply a set of symbols. So for example we can have $\Sigma = \{1, 2, 3\}$ and then we can have the words 123, 1212121212121212..., etc. We can also have the empty word, that is, a word consisting of 0 symbols. We denote this as $\epsilon$. The length of a word $w$ is denoted by $|w|$.
The set of all possible finite words over an alphabet $\Sigma$ is denoted as $\Sigma^*$. Oftentimes it is useful to consider all finite words excluding the empty one. This set is denoted by $\Sigma^+$. Last, the set of all infinite words is denoted as $\Sigma^\omega$.
If we have two words $w$ and $v$, then $wv$ denotes the concatenation of these two words, that is, the word formed by first writing down all symbols from the first word ($w$) and then all symbols from the second word ($v$). Using this notation we can now make the following definitions. If we have $x, w \in \Sigma^*$, $v \in \Sigma^\omega$, then:
• $x$ is a prefix of $w$ iff there exists a $u \in \Sigma^*$ such that $xu = w$.
• $x$ is a strict prefix of $w$ iff there exists a $u \in \Sigma^+$ such that $xu = w$.
• $x$ is a suffix of $w$ iff there exists a $u \in \Sigma^*$ such that $ux = w$.
• $x$ is a strict suffix of $w$ iff there exists a $u \in \Sigma^+$ such that $ux = w$.
• $x$ is a prefix of $v$ iff there exists a $u \in \Sigma^\omega$ such that $xu = v$.
Sometimes, we don't want to talk about all words over an alphabet, but over a subset of them. Such a subset is called a language. A language $L$ is defined as being a set $L$ such that $L \subseteq \Sigma^*$.
There is also a useful type of map on words. A morphism is a function $\varphi : \Sigma^* \to \Delta^*$, where $\Sigma$, $\Delta$ are two alphabets, with the property that for all $u, v \in \Sigma^*$ we have $\varphi(uv) = \varphi(u)\varphi(v)$. From this definition it follows that a morphism is completely defined by its action on the symbols of $\Sigma$ [2].
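Because a morphism is fixed by its action on single symbols, it can be stored as a lookup table. The snippet below is a small sketch of that idea, with words represented as Python strings; the name `apply_morphism` and the example table are illustrative assumptions.

```python
def apply_morphism(images: dict[str, str], word: str) -> str:
    """Apply the morphism given by its symbol images to a finite word."""
    return "".join(images[symbol] for symbol in word)


# Example table over the alphabet {L, R}: exchange the two symbols.
swap = {"L": "R", "R": "L"}

u, v = "LLR", "RL"
print(apply_morphism(swap, u + v))  # image of the concatenation uv
print(apply_morphism(swap, u) + apply_morphism(swap, v))  # concatenation of the images
```

Both printed lines agree, which is the defining property of a morphism applied to this pair of words.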
18.3 Finite automata
A finite automaton is the simplest model for calculations. For some languages, we can create a finite automaton that determines for a word whether or not it is an element of the language.
A finite automaton can be seen as a set of positions, called states, with instructions on where to go next from a state and the first unused symbol of a word. We can draw these in the form of diagrams:
[Figure: state diagram of a finite automaton with states A, B, C and D and transitions labelled 0 and 1.]
We start in state A (given by the arrow without source on this node). Suppose we have the word 011010. We then walk through the automaton in the following way:
• We have as first unused symbol a 0. The arrow with 0 beside it going away from state A points to state A, so we are now in state A.
• We have as first unused symbol a 1. The arrow with 1 beside it going away from state A points to state B, so we are now in state B.
• We have as first unused symbol a 1. The arrow with 1 beside it going away from state B points to state C, so we are now in state C.
• We have as first unused symbol a 0. The arrow with 0 beside it going away from state C points to state A, so we are now in state A.
• We have as first unused symbol a 1. The arrow with 1 beside it going away from state A points to state B, so we are now in state B.
• We have as first unused symbol a 0. The arrow with 0 beside it going away from state B points to state A, so we are now in state A.
• We have no unused symbols left, so we are done.
We say the finite automaton accepts a word if the state in which it ends after processing all the symbols in that word is an accepting state. Such a state is drawn with a double edge. So the word in the example is accepted by the automaton, since state A is an accepting state. The set of all words accepted by an automaton is called the corresponding language. For this automaton, it is the set of words with no three consecutive 1s.
We can formalize these notions in the following way. A finite automaton $M$ is a tuple $(Q, \Sigma, \delta, q_0, F)$ where
• $Q$ is a finite set of states (the circles in the diagram).
• $\Sigma$ is the finite alphabet of the input words.
• $\delta : Q \times \Sigma \to Q$ is the transition function (the arrows in the diagram).
• $q_0 \in Q$ is the initial state.
• $F \subseteq Q$ is the set of accepting states.
We start defining the notion of accepted words and corresponding language by extending $\delta$. We define $\delta^* : Q \times \Sigma^* \to Q$ with the following property. Let $w \in \Sigma^*$, $a \in \Sigma$ and $q \in Q$, then:
\[ \delta^*(q, \epsilon) = q, \qquad \delta^*(q, aw) = \delta^*(\delta(q, a), w). \]
Using this, we say the automaton $M$ accepts a word $w$ iff $\delta^*(q_0, w) \in F$, and the corresponding language $L(M)$ is defined by $L(M) = \{ w \in \Sigma^* \mid \delta^*(q_0, w) \in F \}$ [2].
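The extended transition function can be sketched directly as a loop over the input word. The transition table below is a reconstruction, not a copy of the diagram: the moves used in the walk for 011010 come from the text, while the moves out of C on 1 and out of D, and the choice of A, B and C as accepting states, are assumptions made so that the accepted language is the set of words without three consecutive 1s.

```python
# (state, symbol) -> next state; D is assumed to be a non-accepting sink.
TRANSITIONS = {
    ("A", "0"): "A", ("A", "1"): "B",
    ("B", "0"): "A", ("B", "1"): "C",
    ("C", "0"): "A", ("C", "1"): "D",
    ("D", "0"): "D", ("D", "1"): "D",
}
INITIAL = "A"
ACCEPTING = {"A", "B", "C"}  # assumption: every state except the sink D


def delta_star(state: str, word: str) -> str:
    """Extended transition function: follow one arrow per input symbol."""
    for symbol in word:
        state = TRANSITIONS[(state, symbol)]
    return state


def accepts(word: str) -> bool:
    """True iff the automaton ends in an accepting state after reading `word`."""
    return delta_star(INITIAL, word) in ACCEPTING


print(delta_star(INITIAL, "011010"), accepts("011010"))
print(accepts("0111"))
```

With this table the walk for 011010 ends in state A, as in the text.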
18.4 Transducers
A finite automaton is not that practical for our purposes, since it produces only one piece of information as its output. However, we can make a variation on the finite automaton that also produces output. We take a regular finite automaton, and to every edge we assign a word that gets concatenated to the end of the current output when that transition is taken. For example, take the following transducer:
[Figure: transducer with states A and B and edges labelled 1/ε, 1/1, 0/0 and 0/0.]
Suppose we feed it with the word 011110110101110. Then the output will be 01110100110. Notice that we have no more accepting states. We also have no need for them anymore, since the output of the automaton is generated during its execution, not afterwards. This means that we can also execute the automaton on infinite words. This then produces a possibly infinitely long output word, depending on the automaton.
We can formalize this in the following way. A transducer $T$ is a tuple $(Q, \Sigma, \delta, q_0, \Delta, \lambda)$ where
• $Q$ is a finite set of states.
• $\Sigma$ is the finite alphabet of the input word.
• $\delta : Q \times \Sigma \to Q$ is the transition function.
• $q_0 \in Q$ is the initial state.
• $\Delta$ is the finite alphabet of the output word.
• $\lambda : Q \times \Sigma \to \Delta^*$ is the output function.
We define the function $\lambda^* : Q \times \Sigma^* \to \Delta^*$ with:
\[ \lambda^*(q, \epsilon) = \epsilon, \qquad \lambda^*(q, aw) = \lambda(q, a)\,\lambda^*(\delta(q, a), w) \]
with $q \in Q$, $a \in \Sigma$, $w \in \Sigma^*$. The output of the transducer on a word $w \in \Sigma^*$ is defined as being $\lambda^*(q_0, w)$ [2].
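The output function $\lambda^*$ can be sketched as a loop that appends the output of each transition. The tables below are a reconstruction under assumptions: the diagram only shows the edge labels 1/ε, 1/1, 0/0 and 0/0, and the assignment of those labels to edges is chosen here so that the sample input reproduces the sample output quoted in the text.

```python
# Assumed edges: A -0/0-> A, A -1/eps-> B, B -1/1-> B, B -0/0-> A.
DELTA = {("A", "0"): "A", ("A", "1"): "B", ("B", "0"): "A", ("B", "1"): "B"}
LAMBDA = {("A", "0"): "0", ("A", "1"): "", ("B", "0"): "0", ("B", "1"): "1"}


def transduce(word: str, state: str = "A") -> str:
    """Concatenate the outputs of the transitions taken while reading `word`."""
    output = []
    for symbol in word:
        output.append(LAMBDA[(state, symbol)])
        state = DELTA[(state, symbol)]
    return "".join(output)


print(transduce("011110110101110"))  # 01110100110
```

Under this reading the transducer drops the first 1 of every run of 1s and copies everything else.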
18.4.1 Multi-symbol input
For our purposes it is useful to make a further generalization of the transducer, by allowing the automaton to consume multiple symbols from the input in one transition. To make this generalization, we first assume that all our input words are of infinite length. We can then define some of the necessary concepts to formalize this.
A set of words $B \subseteq \Sigma^*$ is called a base iff for every word $w \in \Sigma^\omega$ there exists a unique element $b \in B$ that is a prefix of $w$. Using this we can now formally define a transducer with multi-symbol input.
A multi-symbol input transducer $T$ is a tuple $(Q, \Sigma, \Delta, P, q_0)$, where:
• $Q$ is the finite collection of states.
• $\Sigma$ is the input alphabet.
• $\Delta$ is the output alphabet.
• $P \subseteq Q \times \Sigma^* \times Q \times \Delta^*$ is the transition table.
• $q_0$ is the initial state.
To make this a valid multi-symbol input transducer, we require that $B_q = \{ l \mid (q, l, \cdot, \cdot) \in P \}$ is a base for all $q \in Q$. We now define a function $\tau$, a transformation function, as the function satisfying the following equation:
\[ \forall (p, u, p', v) \in P,\ w \in \Sigma^\omega : \quad \tau(p, uw) = v\,\tau(p', w). \]
It follows from the requirement on $P$ that this is a unique well defined function, and the output of the transducer on an input word $w$ is defined to be $\tau(q_0, w)$ [70].
18.5 LR representation of the continued fraction expansion
Now that we have these definitions out of the way, there is only one piece missing before we can talk about Möbius transformations. The definitions of transducers assume finite input and output alphabets. However, the default way of writing down a continued fraction does not lend itself well to a finite alphabet. There is, however, a representation of the same information that is more useful for this purpose: the LR representation.
We define two matrices $L$ and $R$ as:
\[ L = \begin{pmatrix} 1 & 0 \\ 1 & 1 \end{pmatrix}, \qquad R = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}. \]
An LR word is now a sequence of $L$s and $R$s, and represents the matrix obtained by calculating the matrix product of the sequence [70].
We say that a vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$, $x_1, x_2 \in \mathbb{R}_{\ge 0}$, accepts a word $W$ iff there exists a vector $y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}$, $y_1, y_2 \in \mathbb{R}_{\ge 0}$, such that $x = Wy$. We now say that a number $z \in \mathbb{R}_{\ge 0}$ is represented by $x$ iff $z = \frac{x_1}{x_2}$. A number $z \in \mathbb{R}_{\ge 0}$ accepts a word $W$ iff its representative accepts $W$. This is a sensible definition because if $x, x'$ both represent $z$, then $x = ax'$ with $a \in \mathbb{R}_{> 0}$, and thus if one accepts $W$, then so does the other [70].
We now want to prove a correspondence between the LR words a number accepts, and its continued fraction expansion.
Lemma 18.5.1. Let $x$ represent $\alpha \neq 1$. Then $x$ accepts exactly one of $L$, $R$.
Proof. First we analyse the effects of $L^{-1}$ and $R^{-1}$. We have
\[ L^{-1}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 \\ x_2 - x_1 \end{pmatrix}, \qquad R^{-1}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 - x_2 \\ x_2 \end{pmatrix}. \]
In other words, $x$ accepts $L$ iff $x_2 - x_1 \ge 0$, and $x$ accepts $R$ iff $x_1 - x_2 \ge 0$. We now have two cases.
Case 1: $\alpha < 1$. We have $\frac{x_1}{x_2} = \alpha < 1$. Then $x_2 - x_1 > 0$, and thus $x$ accepts $L$ but not $R$.
Case 2: $\alpha > 1$. We have $\frac{x_1}{x_2} = \alpha > 1$. Then $x_1 - x_2 > 0$, and thus $x$ accepts $R$ but not $L$.
Lemma 18.5.2. If $\alpha$ accepts the LR word $w$, then $\frac{1}{\alpha}$ accepts the LR word $\varphi(w)$, where $\varphi$ is the morphism with $\varphi(L) = R$ and $\varphi(R) = L$.
Proof. We look at the basis transformation $V$ with the matrix:
\[ V = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}. \]
Now we have $V = V^{-1}$, $VLV^{-1} = R$ and $VRV^{-1} = L$. But we also have that every $x$ accepts $V$, and that if $x$ represents $\alpha$, then $Vx$ represents $\frac{1}{\alpha}$. Thus if $x$ represents $\frac{1}{\alpha}$, then $x$ accepts $VwV$. And because of the identities above we have $VwV = \varphi(w)$. Thus $\frac{1}{\alpha}$ accepts $\varphi(w)$.
Lemma 18.5.3. Let $\alpha \in \mathbb{R} \setminus \mathbb{Q}$. Then the infinitely long LR word accepted by $\alpha$ is unique and equal to $R^{A_0}L^{A_1}R^{A_2}L^{A_3}\cdots$.
Proof. We know $A_0 = \lfloor \alpha \rfloor$, and also
\[ R^{-n}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} x_1 - nx_2 \\ x_2 \end{pmatrix}. \]
Let $x$ be a representative of $\alpha$. Because $\alpha$ is irrational we know that there exist no nonzero $m, n \in \mathbb{Z}$ such that $mx_1 = nx_2$. $R^{-1}$ and $L^{-1}$ only create linear combinations of $x_1$ and $x_2$, and thus it follows that for any word $w$, $w^{-1}x$ represents another irrational number. Since 1 is rational it follows from Lemma 18.5.1 that if $\alpha$ accepts $w$ and $v$ and $|v| \le |w|$, then $v$ is a prefix of $w$.
We have $\frac{x_1}{x_2} = \alpha \ge A_0$, from which we can conclude $x_1 - A_0x_2 \ge 0$. We also have $\frac{x_1}{x_2} = \alpha < A_0 + 1$, from which we can conclude $x_1 < (A_0 + 1)x_2$. From these facts combined it follows that the word $R^{A_0}$ is a prefix of the infinite word that $\alpha$ accepts, but $R^{A_0+1}$ is not. We also have that $R^{-A_0}x$ represents $\frac{1}{\alpha_1}$, and we know $\alpha_1$ is irrational. Lemma 18.5.2 now gives that the accepting word of this is the accepting word of $\alpha_1$ with $L$ and $R$ switched. Using induction we now have proven the lemma.
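The correspondence between partial quotients and blocks of the LR word can be checked on small inputs. The sketch below builds the word $R^{A_0}L^{A_1}R^{A_2}\cdots$ from a list of partial quotients, and separately reads off an accepted prefix from a positive vector by repeatedly applying $R^{-1}$ or $L^{-1}$. Integer vectors are used so the arithmetic is exact; the function names and the `max_len` cut-off are illustrative assumptions.

```python
def cf_to_lr(partial_quotients: list[int]) -> str:
    """The word R^{A_0} L^{A_1} R^{A_2} ... for the given partial quotients."""
    return "".join(
        ("R" if index % 2 == 0 else "L") * a
        for index, a in enumerate(partial_quotients)
    )


def lr_prefix(x1: int, x2: int, max_len: int) -> str:
    """Accepted prefix of the vector (x1, x2), stopping at x1 == x2 or at max_len symbols."""
    word = []
    while len(word) < max_len and x1 != x2:
        if x1 > x2:
            word.append("R")   # apply R^{-1}: (x1, x2) -> (x1 - x2, x2)
            x1 -= x2
        else:
            word.append("L")   # apply L^{-1}: (x1, x2) -> (x1, x2 - x1)
            x2 -= x1
    return "".join(word)


# 1393/985 has partial quotients starting 1, 2, 2, 2, 2 (it approximates sqrt(2)).
print(lr_prefix(1393, 985, 9))
print(cf_to_lr([1, 2, 2, 2, 2]))
```

For a rational input the loop stops once the remaining vector represents 1, which is the case the following lemmas treat separately.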
Lemma 18.5.4. Let $x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}$. Then the only infinite words $x$ accepts are $RL^\omega$ and $LR^\omega$.
Proof. $x$ accepts $R$ with $x' = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$. Then by Lemma 18.5.1 $x'$ accepts only one of $R$, $L$. Since $Lx' = x'$, $x$ accepts the infinite word $RL^\omega$, and no other infinite word starting with $R$. $x$ also accepts $L$ with $x'' = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. Again by Lemma 18.5.1 $x''$ accepts only one of $R$, $L$. Since $Rx'' = x''$, $x$ accepts the infinite word $LR^\omega$, and no other infinite word starting with $L$. Since every infinite word should start with either $R$ or $L$, these are the only two options, proving the lemma.
Lemma 18.5.5. Let $\alpha \in \mathbb{Q}$. Then $\alpha = [A_0, A_1, \ldots, A_n]$ iff we have:
• If $n$ is even then $\alpha$ accepts $R^{A_0}L^{A_1}\cdots R^{A_n}L^\omega$.
• If $n$ is odd then $\alpha$ accepts $R^{A_0}L^{A_1}\cdots L^{A_n}R^\omega$.
Proof. Let $[A'_0, A'_1, \ldots, A'_{n'}]$ be the unique continued fraction expansion of $\alpha$ with $A'_{n'} = 1$. We know from the introduction lectures that we have only two options for the continued fraction expansion of $\alpha$: the one we constructed and $[A'_0, A'_1, \ldots, A'_{n'-1} + A'_{n'}]$. We can use the procedure of Lemma 18.5.3 up to the point where we are left with a vector that represents 1. Depending on the parity of $n'$ we then have that every word that $\alpha$ accepts is either a prefix of, or has as a prefix, the word $R^{A'_0}L^{A'_1}\cdots L^{A'_{n'-1}}$ or $R^{A'_0}L^{A'_1}\cdots R^{A'_{n'-1}}$. Then by Lemma 18.5.4 the only two infinite words that $\alpha$ accepts are $R^{A'_0}L^{A'_1}\cdots L^{A'_{n'-1}+1}R^\omega$ and $R^{A'_0}L^{A'_1}\cdots L^{A'_{n'-1}}R^{1}L^\omega$, or $R^{A'_0}L^{A'_1}\cdots R^{A'_{n'-1}+1}L^\omega$ and $R^{A'_0}L^{A'_1}\cdots R^{A'_{n'-1}}L^{1}R^\omega$. These correspond each with one of the two continued fraction expansions of $\alpha$ as required by the lemma, thus proving it.
18.6 Other 2 × 2 matrices over N
In order to construct the automata to calculate the Möbius transformation it is useful to look at and classify the other $2 \times 2$ matrices over $\mathbb{N}$. We first split these matrices into categories depending on their determinant. We let $\mathcal{D}_n$ denote all $2 \times 2$ matrices over $\mathbb{N}$ with determinant $n$.
We now introduce the concept of dominance. A row or column is said to be dominant over the other if its values are all greater than or equal to those of the other row or column. A $2 \times 2$ matrix is called row-balanced (resp. column-balanced) if neither of its rows (resp. columns) is dominant. We denote the set of these matrices with $\mathcal{RB}_n$ (resp. $\mathcal{CB}_n$), where $n$ denotes the determinant of all the matrices in that set. If a matrix is both row-balanced and column-balanced it is called doubly-balanced. The set of all doubly-balanced matrices is denoted with $\mathcal{DB}_n$ [70].
The following facts are useful in proving the more complicated theorems later; proving them is left as an exercise to the reader. These originate from [70].
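Lemma 18.6.1.
1. $\mathcal{RB}_n$ and $\mathcal{CB}_n$ are finite sets.
2. $\mathcal{RB}_1 = \mathcal{CB}_1 = \{I\}$.
3. For $M \in \mathcal{D}_n$ the first (second) row of $M$ is dominant iff $M \in R\,\mathcal{D}_n$ ($M \in L\,\mathcal{D}_n$).
4. The sets $R\,\mathcal{D}_n$, $L\,\mathcal{D}_n$ and $\mathcal{RB}_n$ are pairwise disjoint and their union is $\mathcal{D}_n$.
5. For $M \in \mathcal{D}_n$ the first (second) column of $M$ is dominant iff $M \in \mathcal{D}_n L$ ($M \in \mathcal{D}_n R$).
6. The sets $\mathcal{D}_n L$, $\mathcal{D}_n R$ and $\mathcal{CB}_n$ are pairwise disjoint and their union is $\mathcal{D}_n$.
7. For every $M \in \mathcal{D}_1$ there exists exactly one word $w$ such that the matrix represented by $w$ equals $M$.
8. Each matrix $M \in \mathcal{D}_n$ has a unique decomposition of the form $PQ$ with $P \in \mathcal{D}_1$ and $Q \in \mathcal{RB}_n$.
9. Each matrix $M \in \mathcal{D}_n$ has a unique decomposition of the form $QP$ with $P \in \mathcal{D}_1$ and $Q \in \mathcal{CB}_n$.
10. If $M \in \mathcal{CB}_n$ and $M = PQ$ with $P \in \mathcal{D}_1$ and $Q \in \mathcal{RB}_n$, then $Q \in \mathcal{DB}_n$.
11. If $M \in \mathcal{RB}_n$ and $M = QP$ with $P \in \mathcal{D}_1$ and $Q \in \mathcal{CB}_n$, then $Q \in \mathcal{DB}_n$.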
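For small determinants the balanced matrices can be listed by brute force, which gives a quick check of parts 1 and 2. The sketch below stores a matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ as the tuple `(a, b, c, d)`. It relies on an observation that is not stated in the text: a row-balanced or column-balanced matrix of determinant $n > 0$ has all entries at most $n$, so searching entries in $0, \ldots, n$ is enough. The function names are hypothetical.

```python
from itertools import product


def is_row_balanced(a: int, b: int, c: int, d: int) -> bool:
    """Neither row (a, b) nor row (c, d) is entrywise >= the other."""
    first_dominant = a >= c and b >= d
    second_dominant = c >= a and d >= b
    return not (first_dominant or second_dominant)


def is_column_balanced(a: int, b: int, c: int, d: int) -> bool:
    """Neither column (a, c) nor column (b, d) is entrywise >= the other."""
    first_dominant = a >= b and c >= d
    second_dominant = b >= a and d >= c
    return not (first_dominant or second_dominant)


def balanced_matrices(n: int):
    """Lists of row-balanced, column-balanced and doubly-balanced matrices of determinant n."""
    rb, cb, db = [], [], []
    for a, b, c, d in product(range(n + 1), repeat=4):
        if a * d - b * c != n:
            continue
        row_ok = is_row_balanced(a, b, c, d)
        col_ok = is_column_balanced(a, b, c, d)
        if row_ok:
            rb.append((a, b, c, d))
        if col_ok:
            cb.append((a, b, c, d))
        if row_ok and col_ok:
            db.append((a, b, c, d))
    return rb, cb, db


for n in (1, 2, 3):
    rb, cb, db = balanced_matrices(n)
    print(n, len(rb), len(cb), db)
```

For $n = 1$ all three lists contain only the identity matrix, in line with part 2.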
18.7 Enumerating matrices in $\mathcal{RB}_n$, $\mathcal{CB}_n$ and $\mathcal{DB}_n$
We now introduce a formalism useful in enumerating balanced matrices, and showing that they remain balanced after certain transformations. Let $M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathcal{D}_n$. We define $r(M) = \begin{pmatrix} p \\ q \end{pmatrix}$, where $p = d - b$ and $q = a - c$. When $M$ is row-balanced both $p$ and $q$ are positive. In this case we denote the generating word of $r(M)$ with $W_M$, that is, the finite LR word with $r(M) = W_M\begin{pmatrix} g \\ g \end{pmatrix}$ for some $g$ [70].
We then have the following identities:
Lemma 18.7.1. For every $M \in \mathcal{D}_n$ we have $r(ML) = L^{-1}r(M)$ and $r(MR) = R^{-1}r(M)$.
Proof. A straightforward calculation shows this.
Lemma 18.7.2. [70] For every $M \in \mathcal{D}_n$ and LR-word $W$ we have: $r(MW) = W^{-1}r(M)$.
Proof. From linear algebra we know $(AB)^{-1} = B^{-1}A^{-1}$. The lemma follows from repeatedly applying Lemma 18.7.1.
From the definition we also know that if $r(M) = \begin{pmatrix} g \\ g \end{pmatrix}$, then the matrix is of the form $M = \begin{pmatrix} s' + g & s \\ s' & s + g \end{pmatrix}$, with $g(s' + s + g) = n$. We now introduce two more definitions. A triple $(g, s, s')$ of non-negative integers will be called a (*)-triple. A (*)-triple is associated with a matrix $M$ if and only if there exists a word $W$ such that $MW = \begin{pmatrix} s' + g & s \\ s' & s + g \end{pmatrix}$.
Theorem 20. [70] For every $M \in \mathcal{RB}_n$ there is exactly one (*)-triple $(g, s, s')$ associated with $M$ and for this triple the equation
\[ MW_M = \begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix} \]
holds.
Proof. We have $r(M) = W_M\begin{pmatrix} g \\ g \end{pmatrix}$. By Lemma 18.7.2 we then get $r(MW_M) = \begin{pmatrix} g \\ g \end{pmatrix}$. It now follows that there exists a (*)-triple $(g, s, s')$ with $MW_M = \begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix}$. This (*)-triple is associated with $M$. Suppose that a (*)-triple $(g_1, s_1, s'_1)$ is associated with $M$. Then by Lemma 18.7.2 we know that there exists a word $W_1$ such that $W_1\begin{pmatrix} g_1 \\ g_1 \end{pmatrix} = r(M) = W_M\begin{pmatrix} g \\ g \end{pmatrix}$. From this equality and the properties of LR-words it follows that $W_1 = W_M$ and $g_1 = g$. But then it also follows that $s_1 = s$ and $s'_1 = s'$.
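The construction in Theorem 20 can be followed on a concrete matrix. The sketch below computes $r(M)$, reads off the generating word $W_M$ by repeatedly applying $R^{-1}$ or $L^{-1}$ until both entries are equal, multiplies $M$ by $W_M$, and extracts the (*)-triple. Matrices are stored as tuples `(a, b, c, d)`; the function names and the sample matrix are illustrative assumptions, and the input is assumed to be row-balanced.

```python
def r_vector(m):
    """r(M) = (p, q) with p = d - b and q = a - c."""
    a, b, c, d = m
    return d - b, a - c


def generating_word(p: int, q: int):
    """Word W and value g with (p, q) = W (g, g); assumes p, q > 0."""
    word = []
    while p != q:
        if p > q:
            word.append("R")   # R^{-1}: (p, q) -> (p - q, q)
            p -= q
        else:
            word.append("L")   # L^{-1}: (p, q) -> (p, q - p)
            q -= p
    return "".join(word), p


def times_word(m, word: str):
    """Right-multiply M by the matrix of an LR word."""
    a, b, c, d = m
    for symbol in word:
        if symbol == "L":      # M L = (a + b, b; c + d, d)
            a, c = a + b, c + d
        else:                  # M R = (a, a + b; c, c + d)
            b, d = a + b, c + d
    return a, b, c, d


def star_triple(m):
    """W_M and the (*)-triple (g, s, s') read off from M W_M = (g + s', s; s', g + s)."""
    word, g = generating_word(*r_vector(m))
    a, b, c, d = times_word(m, word)
    assert a - c == g and d - b == g
    return word, (g, b, c)


M = (3, 1, 1, 2)   # determinant 5, row-balanced
print(r_vector(M), star_triple(M))
```

For this sample matrix the triple satisfies $g(s' + s + g) = 5$, the determinant of $M$.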
Theorem 21. [70] For every (*)-triple $(g, s, s')$ there is exactly one matrix $Q \in \mathcal{DB}_n$ whose associated (*)-triple is $(g, s, s')$. The matrices $M \in \mathcal{RB}_n$ having $(g, s, s')$ as their associated (*)-triple are precisely those of the form $M = QU$ with $U$ an LR-word that is a prefix of $W_Q$. Furthermore we have $W_Q = UW_M$.
Proof. Let $(g, s, s')$ be a (*)-triple. We have $\begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix} \in \mathcal{RB}_n$ and thus (by Lemma 18.6.1 parts 7, 9 and 11) there exist $Q \in \mathcal{DB}_n$ and an LR-word $W$ such that $\begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix} = QW$. Because of this equation we know that $Q$ has $(g, s, s')$ as its associated (*)-triple, and by Theorem 20 that $W = W_Q$.
Now let $M \in \mathcal{RB}_n$ with $(g, s, s')$ as its associated (*)-triple. Then $MW_M = \begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix} = QW_Q$. By Lemma 18.6.1 parts 7, 9 and 11 we have $M = Q'U$ where $Q' \in \mathcal{DB}_n$ and $U$ is an LR-word. This gives $Q'UW_M = QW_Q$ and this implies that $Q' = Q$ and $W_Q = UW_M$. If $M$ was in fact an element of $\mathcal{DB}_n$ then this gives, in combination with Lemma 18.6.1 part 9, that $M = Q$. Thus $Q$ is the only matrix in $\mathcal{DB}_n$ with $(g, s, s')$ as its associated triple.
Lemma 18.7.3. [70] If $M \in \mathcal{RB}_n$ and $V$ is an LR-word, then $MV \in \mathcal{RB}_n$ iff $V$ is a prefix of $W_M$.
Proof. From Theorem 21 we know that we have $Q \in \mathcal{DB}_n$ associated to the same (*)-triple as $M$, and that $MW_M = QUW_M = QW_Q$. If $V$ is a prefix of $W_M$, the theorem gives immediately that $MV \in \mathcal{RB}_n$. If $MV \in \mathcal{RB}_n$, then we know $MVW_{MV} = QW_Q$, and since $W_Q = UW_M$ we get that $V$ is a prefix of $W_M$, which completes the proof.
18.8 Transformations on row balanced matrices
We now have the tools to go towards transformations on row-balanced matrices. We need one final definition. Let $M \in \mathcal{RB}_n$. Define the base $B_M$ to be the base formed by taking all words $w$ such that every strict prefix of $w$ is a prefix of $W_M$, and $w$ itself is not a prefix of $W_M$. In other words, $w$ either differs from a prefix of $W_M$ only in its last symbol, or is $W_M$ followed by one extra symbol [70].
Theorem 22. [70] Let $M_1 \in \mathcal{RB}_n$ and $V_1 \in B_{M_1}$. Then if $V_1$ ends on an $L$, we have $M_1V_1 \in L\,\mathcal{CB}_n$, and if $V_1$ ends on an $R$, we have $M_1V_1 \in R\,\mathcal{CB}_n$. Furthermore, there also exist an LR-word $V_2$ and a matrix $M_2 \in \mathcal{DB}_n$ such that $M_1V_1 = V_2M_2$.
Proof. We start by proving the first conclusion. For this, we separate the proof into 4 distinct cases.
Case 1: $V_1 = W_{M_1}L$.
In this case we have
\[ M_1V_1 = M_1W_{M_1}L = \begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix} L = \begin{pmatrix} s' + g + s & s \\ s' + s + g & s + g \end{pmatrix} = L\begin{pmatrix} s' + g + s & s \\ 0 & g \end{pmatrix} \in L\,\mathcal{CB}_n. \]
Case 2: $V_1 = W_{M_1}R$.
In this case we have
\[ M_1V_1 = M_1W_{M_1}R = \begin{pmatrix} g + s' & s \\ s' & g + s \end{pmatrix} R = \begin{pmatrix} s' + g & s' + g + s \\ s' & s' + s + g \end{pmatrix} = R\begin{pmatrix} g & 0 \\ s' & s' + s + g \end{pmatrix} \in R\,\mathcal{CB}_n. \]
Case 3: $W_{M_1} = URZ$ for some LR-words $U$ and $Z$, and $V_1 = UL$.
First let $Z = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. Since $Z \in \mathcal{D}_1$ we have $Z^{-1} = \begin{pmatrix} w & -y \\ -z & x \end{pmatrix}$. It now follows that
\[ M_1U = M_1W_{M_1}Z^{-1}R^{-1} = \begin{pmatrix} s' + g & s \\ s' & s + g \end{pmatrix}\begin{pmatrix} w & -y \\ -z & x \end{pmatrix}\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} \]
\[ = \begin{pmatrix} s'w + gw - sz & -s'w - s'y - gw - gy + sz + sx \\ s'w - sz - gz & -s'w - s'y + sz + sx + gz + gx \end{pmatrix}. \]
Since the elements of $M_1U$ are nonnegative we get $s'w - sz - gz \ge 0$ and $s'w + gw - sz > 0$. These facts can now be used to obtain
\[ M_1V_1 = M_1UL = \begin{pmatrix} -s'y - gy + sx & -s'w - s'y - gw - gy + sz + sx \\ -s'y + sx + gx & -s'w - s'y + sz + sx + gz + gx \end{pmatrix} \]
\[ = L\begin{pmatrix} -s'y - gy + sx & -s'w - s'y - gw - gy + sz + sx \\ g(x + y) & g(z + x + w + y) \end{pmatrix} \in L\,\mathcal{CB}_n. \]
Case 4: $W_{M_1} = ULZ$ for some LR-words $U$ and $Z$, and $V_1 = UR$.
Again take $Z = \begin{pmatrix} x & y \\ z & w \end{pmatrix}$. We get:
\[ M_1U = M_1W_{M_1}Z^{-1}L^{-1} = \begin{pmatrix} s'w + s'y + gw + gy - sz - sx & -s'y - gy + sx \\ s'w + s'y - sz - sx - gz - gx & -s'y + sx + gx \end{pmatrix}. \]
Again, this matrix has no negative elements, and thus $-s'y - gy + sx \ge 0$ and $-s'y + sx + gx > 0$. We then get
\[ M_1V_1 = M_1UR = R\begin{pmatrix} g(w + y + z + x) & g(w + z) \\ s'w + s'y - sz - sx - gz - gx & s'w - sz - gz \end{pmatrix} \in R\,\mathcal{CB}_n. \]
We now only need to prove the second part. However, it follows from Lemma 18.6.1 parts 7, 8 and 10 that $R\,\mathcal{CB}_n \subseteq \mathcal{D}_1\mathcal{DB}_n$ (and likewise for $L\,\mathcal{CB}_n$), and that there thus exist a unique LR-word $V_2$ and a matrix $M_2 \in \mathcal{DB}_n$ such that $M_1V_1 = V_2M_2$.
18.9 Transducers for Möbius transformations
Using the results from the previous section we can now build transducers that calculate certain Möbius transformations.
Let $Q$ be the set of all matrices $M \in \mathcal{DB}_n$ which have a given greatest common divisor of their elements. Then let $P$ be the set of all tuples $(M_1, V_1, M_2, V_2)$ with $M_1, M_2 \in Q$, $V_1 \in B_{M_1}$ and $M_1V_1 = V_2M_2$ [70].
Theorem 23. [70] $T = (Q, \{L, R\}, \{L, R\}, P, M)$ is a transducer for every $M \in Q$, which produces the result of the Möbius transformation specified by $M$ on its input.
Proof. First, since all matrices in $\mathcal{D}_1$ are either the unit matrix or can be written as an LR word, we get that if $M_1V_1 = V_2M_2$ for $M_1, M_2 \in \mathcal{D}_n$ and $V_1, V_2 \in \mathcal{D}_1$, then the greatest common divisor of the elements of $M_1$ is equal to the greatest common divisor of the elements of $M_2$.
With this fact it follows from Lemma 18.6.1 and Theorem 22 that $P$ does satisfy the requirements from our definition of a multi-symbol input transducer. Let $x_1, x_2 \in \mathbb{R}_{\ge 0}$ be such that $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ accepts the LR word given as input to the transducer. It then follows from the definition of $P$ that $Mx$ accepts the output word of the transducer. But $Mx$ represents the result of the Möbius transformation on the number represented by $x$. Thus our transducer performs the Möbius transformation $M$, proving the theorem.
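The effect of such a transducer can be imitated with a much simpler streaming loop. The sketch below is not the transducer of Theorem 23: instead of a finite table of doubly-balanced states it keeps an arbitrary matrix with non-negative entries and positive determinant as its state, multiplies it on the right by each input symbol, and emits an output symbol whenever one row is dominant (Lemma 18.6.1 part 3). The function names are hypothetical.

```python
def _emit(a, b, c, d, out):
    """Factor out R or L on the left for as long as one row is dominant."""
    while True:
        if a >= c and b >= d:      # first row dominant: M = R M'
            out.append("R")
            a, b = a - c, b - d
        elif c >= a and d >= b:    # second row dominant: M = L M'
            out.append("L")
            c, d = c - a, d - b
        else:                      # row-balanced: more input is needed
            return a, b, c, d


def apply_mobius(matrix, word: str) -> str:
    """Output LR symbols determined by a finite prefix `word` of the input."""
    a, b, c, d = matrix
    if min(a, b, c, d) < 0 or a * d - b * c <= 0:
        raise ValueError("need non-negative entries and positive determinant")
    out = []
    a, b, c, d = _emit(a, b, c, d, out)
    for symbol in word:
        if symbol == "L":          # M L = (a + b, b; c + d, d)
            a, c = a + b, c + d
        elif symbol == "R":        # M R = (a, a + b; c, c + d)
            b, d = a + b, c + d
        else:
            raise ValueError("input must be an LR word")
        a, b, c, d = _emit(a, b, c, d, out)
    return "".join(out)


# Doubling the number with LR word RLRLRL... (all partial quotients equal to 1).
print(apply_mobius((2, 0, 0, 1), "RL" * 6))
```

On this input the output starts with $R^3L^4R^4L^4$, the beginning of the LR word of a number with partial quotients 3, 4, 4, 4. The identity matrix simply echoes its input.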
18.10 Conclusion
The result from the previous section gives a method for calculating certain Möbius transformations on continued fraction expansions representing positive numbers. In the original work, G.N. Raney [70] also presents a set of steps by which this method can be extended to continued fraction expansions of arbitrary numbers and arbitrary Möbius transformations. These results show that Möbius transformations can be calculated relatively easily on the continued fraction expansion of a number.
Bibliography
[1] Adler, R., M.S. Keane and M. Smorodinsky A construction of a Normal
Number for the Continued Fraction Transformation, J. of Number Th. 13
(1981), 95-105.
[2] Allouche, Jean-Paul and Jeffrey Shallit, Automatic sequences: theory,
applications, generalizations, Cambridge University Press 2003.
[3] Austin, D. Trees, Teeth, and Time: The mathematics of clock making,
http://www.ams.org/samplings/feature-column/fcarc-stern-brocot,
Accessed 29 January 2013.
[4] Bagemihl, F. and J.R. McLaughlin Generalization of some classical
theorems concerning triples of consecutive convergents to simple continued
fractions, J. reine Angew. Math. 221 (1966), 146-149.
[5] Barbolosi, D. and H. Jager On a theorem of Legendre in the theory of
continued fractions, Sem. Th. Nombres Bordeaux 6 (1994), 81-94.
[6] Barrionuevo, Jose, Robert M. Burton, Karma Dajani and Cor
Kraaikamp Ergodic Properties of Generalized Lüroth Series, Acta
Arithm., LXXIV (4) (1996), 311-327.
[7] Billingsley, P. Ergodic Theory and Information, John Wiley and Sons,
1965.
[8] Billingsley, P. Probability and Measure, John Wiley and Sons, 2nd Ed.
1986.
[9] Blanchard, F. β-Expansions and Symbolic Dynamics, Theoretical
Comp. Sc., 65 (1989), 131-141.
[10] Bogomolny, A. Stern-Brocot Tree. Introduction
from Interactive Mathematics Miscellany and Puzzles,
http://www.cut-the-knot.org/blue/Stern.shtml, Accessed 29
January 2013
[11] Bogomolny, A. Stern-Brocot Tree, a second look at the binary
encoding from Interactive Mathematics Miscellany and Puzzles
http://www.cut-the-knot.org/blue/chaos game.shtml#tree, Ac-
cessed 27 January 2013
[12] Borel, É. Contribution à l'analyse arithmétique du continu,
J. Math. Pures Appl. (5) 9 (1903), 329-375.
[13] Bosma, W. and D. Gruenewald Complex numbers with bounded partial
quotients, to appear in Journal of the Australian Mathematical Society.
[14] Bosma, W., H. Jager and F. Wiedijk Some metrical observations on
the approximation by continued fractions, Indag. Math. 45 (1983), 281-
299.
[15] Boyd, David W. On the beta expansion for Salem numbers of degree
6, Math. Comp. 65 (1996), 861-875, S29-S31.
[16] Brauer, A. On algebraic equations with all but one root in the interior
of the unit circle, Math. Nachr. 4 (1951), 250-257.
[17] Bressoud, D.M. Factorization and Primality Testing, Springer UTM,
Springer Verlag, Berlin, New York, 1989.
[18] Brown, James R. Ergodic Theory and Topological Dynamics, Academic
Press, New York, San Francisco, London, 1976.
[19] Champernowne, D.G. The construction of decimals normal in the scale
of ten, J. London Math. Soc., 8 (1933), 254-260.
[20] Cornfeld, I.P., S.V. Fomin and Ya.G. Sinai Ergodic Theory,
Grundlehren der math. Wiss. 245, Springer-Verlag New York, Heidel-
berg, Berlin (1982).
[21] Dajani, K., C. Kraaikamp Ergodic Theory of Numbers, Mathematical
Association of America, 2002.
[22] Dajani, K., C. Kraaikamp and B. Solomyak The natural extension of
the β-transformation, Acta Math. Hungar., 73 (1-2) (1996), 97-109.
[23] Davenport, H. The higher arithmetic. An introduction to the theory
of numbers, Sixth edition. Cambridge University Press, Cambridge, 1992.
[24] Davenport, H. and Erdős, P. Note on normal decimals, Canadian J.
Math. 4 (1952), 58-63. MR 13,825g
[25] Erdős, P., I. Joó and V. Komornik Characterization of the unique
expansions $1 = \sum_{i=1}^{\infty} q^{-n_i}$ and related problems, Bull. Soc. Math. France
118 (1990), (3), 377-390. MR 91j:11006
[26] William Feller An Introduction to Probability Theory and Its Appli-
cations, II, John Wiley and Sons, 1966.
[27] Ford, L.R. Fractions, The American Mathematical Monthly, Vol. 45,
No. 9, pp. 586-601. Mathematical Association of America, November 1938.
[28] Friedman, N.A. and D.S. Ornstein On isomorphisms of weak Bernoulli
transformations, Adv. in Math. 5 (1970), 365-390.
[29] Frougny, C. and B. Solomyak Finite beta-expansions, Ergod. Th. &
Dynam. Sys. 12 (1992), 713-723.
[30] Galambos, J. Representations of Real numbers by Infinite Series,
Springer LNM 502, Springer-Verlag, Berlin, Heidelberg, New York, 1976.
[31] Gauss, C.F. Mathematisches Tagebuch 1796-1814, Akademische Ver-
lagsgesellschaft Geest & Portig K.G., Leipzig 1976.
[32] Hardy, G. H. and E.M. Wright An introduction to the theory of num-
bers, Fifth edition. The Clarendon Press, Oxford University Press, New
York, 1979. MR 81i:10002
[33] Havinga, E. and W.E. van Wijk and J.F.M.G. d'Aumerie Planetariumboek
Eise Eisinga, Arnhem, 1928, 382-410. Also accessible via
http://adcs.home.xs4all.nl/Huygens/21/plan.v.html (Accessed 3
February 2013)
[34] Hayes, B. On the Teeth of Wheels, American Scientist, July-August
2000, 88-4, 296-300.
[35] Hensley, D. The Hurwitz complex continued fraction, preprint, Jan-
uary 2006.
[36] Irwin, M.C. Geometry of Continued Fractions, The American
Mathematical Monthly, Vol. 96, No. 8, pp. 696-703. Mathematical Association
of America, October 1989.
[37] Jager, H. The distribution of certain sequences connected with the
continued fraction, Indag. Math. 48 (1986), no. 1, 61-69. MR 87g:11092
[38] Jager, H. Continued Fractions and Ergodic Theory, Transcendental
Numbers and related Topics, RIMS Kokyuroku 599, Kyoto University,
Kyoto, Japan (1986), 55-59.
[39] Jager, H. and C. Kraaikamp On the approximation by continued frac-
tions, Indag. Math. 51 (1989), 289-307. MR 90k:11084
[40] Jager, H. and C. de Vroedt Lüroth series and their ergodic properties,
Indag. Math. 31 (1968), 31-42. MR 39 #157
[41] Kakutani, Shizuo Induced measure preserving transformations, Proc.
Imp. Acad. Tokyo 19, (1943), 635-641. MR 7,255f
[42] Kamae, T. A simple proof of the ergodic theorem using non-standard
analysis, Israel J. Math. 42 (1982), 284-290.
[43] Katznelson, Y. and B. Weiss A simple proof of some ergodic theorems,
Israel J. Math. 42 (1982), 291-296.
[44] Kesseböhmer and Stratmann Multifractal analysis for Stern-Brocot
intervals, J. reine angew. Math. 605 (2007), 133-163.
[45] Khintchine, A.Ya. Continued Fractions, Groningen: Noordhoff, 1963.
[46] Kitchens, Bruce P. Symbolic Dynamics, Springer Universitext,
Springer-Verlag Berlin Heidelberg New York, 1998.
[47] Kingman, J.F.C. and S.J. Taylor Introduction to measure and proba-
bility, Cambridge University Press, Cambridge, 1966.
[48] Kolmogorov, A.N. A new metric invariant of transitive dynamical
systems and Lebesgue space automorphisms, Dokl. Acad. Sc. USSR 119,
no. 5, (1958), 861-864.
[49] Kraaikamp, C. On the approximation by continued fractions, II,
Indag. Math. New Series 1 (1990), 63-75.
[50] Kraaikamp, C. A new class of continued fraction expansions, Acta
Arithm., LVII (1) (1991), 1-39.
[51] Krengel, Ulrich Ergodic theorems, de Gruyter Studies in Mathematics,
6. Walter de Gruyter & Co., Berlin-New York, 1985. MR 87i:28001
[52] Lenstra, A.K., H.W. Lenstra and L. Lovász Factoring Polynomials
with Rational Coefficients, Mathematische Annalen 1982, pp. 515-534.
Springer-Verlag.
[53] Lind, D. and B. Marcus Symbolic dynamics and coding, Cambridge
University Press 1995.
[54] Lochs, G. Vergleich der Genauigkeit von Dezimalbruch und
Kettenbruch, Abh. Math. Sem. Hamburg, 27 (1964), 142-144.
[55] Lochs, G. Die ersten 968 Kettenbruchnenner von π, Monatsh. Math.
67 (1963), 311-316.
[56] Lüroth, J. Ueber eine eindeutige Entwickelung von Zahlen in eine
unendliche Reihe, Math. Annalen 21 (1883), 411-423.
[57] Mañé, Ricardo Ergodic theory and differentiable dynamics,
Ergebnisse der Mathematik und ihrer Grenzgebiete (3) [Results in Mathematics
and Related Areas (3)], 8. Springer-Verlag, Berlin-New York, 1987.
MR 88c:58040
[58] Martin, Nathaniel F.G. and James W. England Mathematical The-
ory of Entropy, Encyclopedia of Mathematics and its Applications, 12.
Addison-Wesley Publishing Co., Reading, Mass., 1981. MR 83k:28019
[59] Mills, W.H. and D.P. Robbins Continued fractions for certain
algebraic power series, Journal of Number Theory 23-3 (1986), 388-404.
[60] Nakada, H. Metrical Theory for a Class of Continued Fraction Ex-
pansions and their Natural Extensions, Tokyo J. Math. 4 (1981), 399-426.
[61] Nakada, H., Sh. Ito and S. Tanaka On the invariant measure for the
transformations associated with some real continued fractions, Keio
Engineering Reports 30 (1977), 159-175.
[62] Danny Oorburg Een onderzoek naar het LLL- en Kettingbreukalgo-
ritme, Groningen 1997.
[63] Oxtoby, J.C. Measure and Category, Springer GTM 2, Springer-
Verlag New York Heidelberg Berlin, 1971.
[64] Parry, W. On the β-expansions of real numbers, Acta Math. Acad. Sci.
Hungary 11 (1960), 401-416.
[65] Perron, O. Die Lehre von den Kettenbrüchen, Band I, B.G. Teubner,
Stuttgart, 3. verb. u. erw. Aufl.
[66] Perron, O. Irrationalzahlen, Walter de Gruyter & Co., Berlin, 1960.
[67] Petersen, Karl Ergodic Theory, Corrected reprint of the 1983 original.
Cambridge Studies in Advanced Mathematics, 2. Cambridge University
Press, Cambridge, 1989. MR 92c:28010
[68] Pollicott, Mark and Michiko Yuri Dynamical Systems and Ergodic
Theory, London Mathematical Society Student Texts 40, Cambridge Uni-
versity Press 1998.
[69] Hans Rademacher, Higher Mathematics from an Elementary Point of
View. Birkhäuser, 1983. Chapter 8: Ford Circles.
[70] Raney, G.N. On continued fractions and finite automata,
Mathematische Annalen, 206-4 (1973), 265-283.
[71] Rényi, A. Representations for real numbers and their ergodic
properties, Acta Math. Acad. Sci. Hungary 8 (1957), 477-493.
[72] Rockett, A.M. and P. Szüsz Continued Fractions, Singapore: World
Scientific, 1992.
[73] Rohlin, V.A. Exact endomorphisms of a Lebesgue space, Izv. Akad.
Nauk SSSR, Ser. Mat., 24 (1960); English AMS translation, Series 2, 39
(1969), 1-36.
[74] Royden, H.L. Real Analysis, Collier MacMillan International Editions,
2nd Ed., 1968.
[75] Rudin, Walter Real and Complex Analysis, McGraw-Hill Book Com-
pany, 3rd Ed., 1986.
[76] Rudolph, Daniel J Fundamentals of measurable dynamics. Ergodic
theory on Lebesgue spaces, Oxford Science Publications. The Clarendon
Press, Oxford University Press, New York, 1990. MR 92e:28006
[77] Salem, R. Algebraic Numbers and Fourier Analysis, Duke Math. J.
12 (1945), 153-172.
[78] Schmidt, K. On periodic expansions of Pisot numbers and Salem num-
bers, Bull. London Math. Soc. 12 (1980), 269-278.
[79] Schmidt, W.M. Diophantine Approximation, Springer LNM 785,
1980.
[80] Schmidt, W.M. On continued fractions and diophantine
approximation in power series fields, Acta Arithmetica 95-2 (2000), 139-166.
[81] Fritz Schweiger Ergodic Theory of Fibered Systems and Metric Num-
ber Theory, Clarendon Press, Oxford 1995.
[82] Series, Caroline Non-Euclidean geometry, continued fractions, and
ergodic theory, Math. Intelligencer 4 (1982), no. 1, 24-31. MR 84h:58086
[83] Series, Caroline The geometry of Markoff numbers, Math.
Intelligencer 7 (1985), 20-29. MR 86j:11069
[84] Segre, B. Lattice points in infinite domains and asymmetric
Diophantine approximation, Duke Math. J. 12 (1945), 337-365.
[85] Shannon, C. A mathematical theory of communication, Bell Syst.
Tech. J. 27 (1948), 379-423, 623-656.
[86] Short, Ian Ford circles, Continued Fractions, and Rational
Approximation, The American Mathematical Monthly, Vol. 118, No. 2, pp. 130-135.
Mathematical Association of America, February 2011.
[87] Swinden, J.H. van Beschrijving van het Eisinga planetarium,
uitgeverij Van Wijnen, Franeker, 1994. ISBN: 90 5194 105 6
[88] Tong, Jingcheng Approximation by nearest integer continued frac-
tions, Math. Scand. 71 (1992), 161-166.
[89] Vitányi, Paul Randomness, CWI Quarterly 8 (1995), 67-82. MR
97c:01032
[90] Walters, P. An Introduction to Ergodic Theory, GTM 79, Springer-
Verlag New York, Heidelberg, Berlin (1982).
