
ALGEBRAIC AND LOGICAL METHODS IN QUANTUM COMPUTATION

arXiv:1510.02198v1 [quant-ph] 8 Oct 2015

by

Neil J. Ross

Submitted in partial fulfillment of the requirements


for the degree of Doctor of Philosophy

at

Dalhousie University
Halifax, Nova Scotia
August 2015


© Copyright by Neil J. Ross, 2015
À mes maîtres.

Table of Contents

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vii

Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii

List of Abbreviations and Symbols Used . . . . . . . . . . . . . . . . . . ix

Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

Chapter 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.1 Approximate synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2 The mathematical foundations of Quipper . . . . . . . . . . . . . . . 6

1.3 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Chapter 2 Quantum computation . . . . . . . . . . . . . . . . . . . . 10

2.1 Linear Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10


2.1.1 Finite dimensional Hilbert spaces . . . . . . . . . . . . . . . . 10
2.1.2 Operators and matrices . . . . . . . . . . . . . . . . . . . . . . 11
2.1.3 Unitary, Hermitian, and positive matrices . . . . . . . . . . . 12
2.1.4 Tensor products . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2 Quantum Computation . . . . . . . . . . . . . . . . . . . . . . . . . . 13


2.2.1 A single quantum bit . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.2 Multiple quantum bits . . . . . . . . . . . . . . . . . . . . . . 14
2.2.3 Evolution of a quantum system . . . . . . . . . . . . . . . . . 15
2.2.4 The QRAM model and quantum circuits . . . . . . . . . . . . 17

Chapter 3 Algebraic number theory . . . . . . . . . . . . . . . . . . 21

3.1 Rings of integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21


3.1.1 Extensions of Z . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.1.2 Automorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.1.3 Norms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.2 Diophantine equations . . . . . . . . . . . . . . . . . . . . . . . . . . 24


3.2.1 Euclidean domains . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2.2 Relative norm equations . . . . . . . . . . . . . . . . . . . . . 25

Chapter 4 The lambda calculus . . . . . . . . . . . . . . . . . . . . . 27

4.1 The untyped lambda calculus . . . . . . . . . . . . . . . . . . . . . . 27


4.1.1 Concrete terms . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.1.2 Reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.1.3 Abstract terms . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.1.4 Properties of the untyped lambda calculus . . . . . . . . . . . 32

4.2 The simply typed lambda calculus . . . . . . . . . . . . . . . . . . . . 33


4.2.1 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.2.2 Operational semantics . . . . . . . . . . . . . . . . . . . . . . 34
4.2.3 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.2.4 Properties of the type system . . . . . . . . . . . . . . . . . . 37

4.3 Linearity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
4.3.1 Contraction, weakening, and strict linearity . . . . . . . . . . 38
4.3.2 Reintroducing non-linearity . . . . . . . . . . . . . . . . . . . 39

4.4 The quantum lambda calculus . . . . . . . . . . . . . . . . . . . . . . 40


4.4.1 Terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.4.2 Operational semantics . . . . . . . . . . . . . . . . . . . . . . 42
4.4.3 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

Chapter 5 Grid problems . . . . . . . . . . . . . . . . . . . . . . . . . 48

5.1 Grid problems over Z[i] . . . . . . . . . . . . . . . . . . . . . . . . . . 49


5.1.1 Upright rectangles . . . . . . . . . . . . . . . . . . . . . . . . 49
5.1.2 Upright sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
5.1.3 Grid operators . . . . . . . . . . . . . . . . . . . . . . . . . . 52

5.1.4 Ellipses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
5.1.5 The enclosing ellipse of a bounded convex set . . . . . . . . . 55
5.1.6 General solution to grid problems over Z[i] . . . . . . . . . . . 55
5.1.7 Scaled grid problems over Z[i] . . . . . . . . . . . . . . . . . . 56

5.2 Grid problems over Z[ω] . . . . . . . . . . . . . . . . . . . . . . . . . 57


5.2.1 One-dimensional grid problems . . . . . . . . . . . . . . . . . 59
5.2.2 Upright rectangles and upright sets . . . . . . . . . . . . . . . 61
5.2.3 Grid operators . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2.4 Ellipses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2.5 General solution to grid problems over Z[ω] . . . . . . . . . . 79
5.2.6 Scaled grid problems over Z[ω] . . . . . . . . . . . . . . . . . . 79

Chapter 6 Clifford+V approximate synthesis . . . . . . . . . . . . . 81

6.1 Exact synthesis of Clifford+V operators . . . . . . . . . . . . . . . . 81

6.2 Approximate synthesis of z-rotations . . . . . . . . . . . . . . . . . . 88


6.2.1 A reduction of the problem . . . . . . . . . . . . . . . . . . . 88
6.2.2 The algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.2.3 Analysis of the algorithm . . . . . . . . . . . . . . . . . . . . . 91

6.3 Approximate synthesis of special unitaries . . . . . . . . . . . . . . . 97

Chapter 7 Clifford+T approximate synthesis . . . . . . . . . . . . . 99

7.1 Exact synthesis of Clifford+T operators . . . . . . . . . . . . . . . . . 99

7.2 Approximate synthesis of z-rotations . . . . . . . . . . . . . . . . . . 101

7.3 Approximation up to a phase . . . . . . . . . . . . . . . . . . . . . . 105

Chapter 8 The Proto-Quipper language . . . . . . . . . . . . . . . . 111

8.1 From the quantum lambda calculus to Proto-Quipper . . . . . . . . . 111

8.2 The syntax of Proto-Quipper . . . . . . . . . . . . . . . . . . . . . . 112

8.3 The operational semantics of Proto-Quipper . . . . . . . . . . . . . . 119

Chapter 9 Type-safety of Proto-Quipper . . . . . . . . . . . . . . . . 127
9.1 Properties of the type system . . . . . . . . . . . . . . . . . . . . . . 127
9.2 Subject reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
9.3 Progress . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

Chapter 10 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140


10.1 Approximate synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . 140
10.2 Proto-Quipper . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142

Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

List of Figures

2.1 The Hadamard and Pauli matrices. . . . . . . . . . . . . . . . 12


2.2 The Bloch sphere representation of the state of a qubit. . . . . 14
2.3 The QRAM model of quantum computation. . . . . . . . . . . 18

4.1 Reduction rules for the simply typed lambda calculus. . . . . . 35


4.2 Typing rules for the simply typed lambda calculus. . . . . . . 36
4.3 Reduction rules for the quantum lambda calculus. . . . . . . . 43
4.4 Subtyping rules for the quantum lambda calculus. . . . . . . . 45
4.5 Typing rules for the quantum lambda calculus. . . . . . . . . . 46

5.1 Grid problems over Z[i]. . . . . . . . . . . . . . . . . . . . . . 50


5.2 Grid problems over Z[i] for upright and non-upright sets. . . . 51
5.3 Complex grids. . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.4 Real grids. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.5 The action of a grid operator on a grid problem over Z[ω]. . . 63
5.6 The grid operators R, A, B, X, K, and Z. . . . . . . . . . . . 67
5.7 Two coverings of the plane . . . . . . . . . . . . . . . . . . . . 78

6.1 The ε-region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

8.1 Subtyping rules for Proto-Quipper. . . . . . . . . . . . . . . . 118


8.2 Typing rules for Proto-Quipper. . . . . . . . . . . . . . . . . . 120
8.3 A representation of Append(C, D, b). . . . . . . . . . . . . . . . 122
8.4 Reduction rules for Proto-Quipper. . . . . . . . . . . . . . . . 124

Abstract

This thesis contains contributions to the theory of quantum computation.


We first define a new method to efficiently approximate special unitary operators.
Specifically, given a special unitary U and a precision ε > 0, we show how to efficiently
find a sequence of Clifford+V or Clifford+T operators whose product approximates
U up to ε in the operator norm. In the general case, the length of the approximating
sequence is asymptotically optimal. If the unitary to approximate is diagonal then
our method is optimal: it yields the shortest sequence approximating U up to ε.
Next, we introduce a mathematical formalization of a fragment of the Quipper
quantum programming language. We define a typed lambda calculus called Proto-
Quipper which formalizes a restricted but expressive fragment of Quipper. The type
system of Proto-Quipper is based on intuitionistic linear logic and prohibits the du-
plication of quantum data, in accordance with the no-cloning property of quantum
computation. We prove that Proto-Quipper is type-safe in the sense that it enjoys
the subject reduction and progress properties.

List of Abbreviations and Symbols Used

N The natural numbers.


Z The integers.
Q The rational numbers.
R The real numbers.
C The complex numbers.
R× The group of units of the ring R.
∥.∥ The norm of a scalar, vector, or matrix.
⊗ The tensor product.
Mm,n (R) The set of m by n matrices over the ring R.
U(n) The unitary group of order n.
SU(n) The special unitary group of order n.
U −1 The inverse of the matrix U .
U† The conjugate transpose of the matrix U .
det(U ) The determinant of the matrix U .
tr(U ) The trace of the matrix U .
K(α) The extension of the field K by α.
R[α] The extension of the ring R by α.
OK The ring of integers of the field K.
(.)• The bullet automorphism of Z[ω].
≡ Modular congruence.
| Divisibility.
NR (.) The norm for the ring of integers R.
a{y/x} The renaming of x by y in a.
FV(a) The free variables of a.
a[b/x] The substitution of b for x in a.
→,  The one-step and multi-step β-reduction.
=α The α-equivalence.
≡β The β-equivalence.

<: The subtyping relation.
[Q, L, a] A closure of the quantum lambda calculus.
up(A) The uprightness of the set A.
BBox(A) The bounding box of the set A.
area(A) The area of the set A.
Grid(A) The grid for the set A.
O(.) The big-O notation.
Skew(D) The skew of the ellipse D.
Bias(D, ∆) The bias of the state (D, ∆).
sinhλ (.) The hyperbolic sine in base λ.
coshλ (.) The hyperbolic cosine in base λ.
Rε The ε-region.
D The closed unit disk.
Re(a) The real part of the complex number a.
Pf (X) The set of finite subsets of the set X.
FQ(a) The free quantum variables of a term a.
Bijf (X) The set of finite bijections on the set X.
SpecX (T ) An X-specimen for T .
[C, a] A closure of Proto-Quipper.
⌊.⌋ The floor function.
∪· The disjoint union.

Acknowledgements

I would like to express my profound gratitude to the people who, in one way or
another, have contributed to the writing of this thesis.
I want to thank my supervisor Peter Selinger, for his insight, his guidance, and,
most importantly, his contagious passion for mathematics. Special thanks are due to
the members of my supervisory committee, Dorette Pronk and Richard Wood, for
accepting to read a thesis which, alas, contains so few arrows. I am very grateful to
Prakash Panangaden for accepting to be my external examiner.
The staff of the Department of Mathematics and Statistics of Dalhousie Univer-
sity have played a large part in making my stay in Halifax enjoyable. I am very
appreciative of their hard work.
I am greatly indebted to the many teachers I had the chance to learn from during
the past ten years. In particular, I want to acknowledge the profound influence of Su-
sana Berestovoy, Julien Dutant, Vincent Homer, Jean-Baptiste Joinet, and Damiano
Mazza.
I want to thank the researchers I had the pleasure to collaborate with, notably
D. Scott Alexander, Henri Chataing, Alexander S. Green, Peter LeFanu Lumsdaine,
Jonathan M. Smith, and Benoît Valiron.
My fellow students have always provided inspiration, support, and laughter. I
especially want to thank Abdullah Al-Shaghay, Ali Alilooee, Chloé Berruyer, Samir
Blakaj, Méven Cadet, Antonio Chavez, Hoda Chuangpishit, Hugo Férée, Florent
Franchette, Brett Giles, Giulio Guerrieri, François Guignot, Zhenyu Victor Guo,
D. Leigh Herman, Ben Hersey, Joey Mingrone, Lucas Mol, Alberto Naibo, Mattia
Petrolo, Francisco Rios, Kira Scheibelhut, Matthew Stephen, Aurélien Tonneau, An-
tonio Vargas, Kim Whoriskey, Bian Xiaoning, Amelia Yzaguirre, and Kevin Zatloukal.
Finally, I want to thank my friends and loved ones, far and near. My brother and
my sister. My parents, for their unwavering support. And Kira, for her kindness.

Chapter 1

Introduction

Quantum computation, introduced in the early 1980s by Feynman [18], is a paradigm


for computation based on the laws of quantum physics. The interest in quantum
computation lies in the fact that quantum computers can solve certain problems more
efficiently than their classical counterparts. Most famously, Shor showed in 1994 that
integers can be factored in polynomial time on a quantum computer [62]. This is
in striking contrast with the exponential running time of the best known classical
methods. Since then, many algorithms leveraging the power of quantum computers
have been introduced (e.g., [29], [48], [30]). This promised increase in efficiency has
provided great incentive to solve the theoretical and practical challenges associated
with building quantum computers.
The fundamental unit of quantum computation is the quantum bit or qubit. The
state of a qubit is described by a unit vector in the two-dimensional Hilbert space C2 .
A system of n qubits is similarly described by a unit vector in the Hilbert space

C^{2^n} = C² ⊗ · · · ⊗ C²  (n factors).

A computation is performed by acting on the state of a system of qubits. This


can be done in two ways. One can either apply a unitary transformation to the
state or one can measure some of the qubits making up the system. A quantum
algorithm describes a sequence of unitary operations to be performed on the state of
a system of qubits, usually followed by a single final measurement, or in some cases,
measurements throughout the computation. It is customary to represent a sequence
of unitary operations in the form of a quantum circuit, which is built up from a basic
set of unitaries, called gates, using composition and tensor product. An important
peculiarity of quantum computation is that the state of a quantum system cannot in
general be duplicated. This is the so-called no-cloning property. This contrasts with
the situation in classical computation, where the state of a bit can be freely copied.


Quantum algorithms are run on a physical device. Just as in classical computing,


there is a chain of successive translations, starting from a mathematical description
of an algorithm in a research paper. The abstract algorithm is first implemented in
a programming language, which is then turned into a quantum circuit. This first
circuit is then rewritten using a finite set of basic unitaries available on the hardware.
Finally, this second circuit is rewritten according to some error correcting scheme,
which redundantly encodes the circuit to make it robust to some of the errors that
are bound to occur on the physical machine. At this point, the abstract description
of the algorithm can be mapped to a physical system for the computation to be
effectively performed.
All of the above-mentioned steps in the execution of a quantum algorithm raise
interesting mathematical questions. For many years the problems at the top of this
translation chain, those closer to the abstract description of the algorithm, were either
overlooked or considered adequately solved. This was justified by the fact that the
challenges of building a reliable physical quantum computer were so far from being
met that the higher-level problems seemed somewhat irrelevant. However, as the
prospect of usable quantum computers draws nearer, the need to develop tools to
define proper solutions to these problems has become more pressing.
In this thesis, we contribute to two of the mathematical problems that arise in the
higher level of this execution phase. We first develop methods to decompose unitaries
into certain basic sets of unitaries. Secondly, we introduce a lambda calculus which
serves as a foundation for the Quipper quantum programming language. We now
briefly outline each of these contributions.

1.1 Approximate synthesis

The unitary group of order 2, denoted U(2), is the group of 2 × 2 complex unitary
matrices. The special unitary group of order 2, denoted SU(2), is the subgroup of
U(2) consisting of unitary matrices of determinant 1.
Let S ⊆ U(2) be a set of unitaries and ⟨S⟩ be the set of words over S. In the
context of quantum computing, the elements of S are called single-qubit gates and the
elements of ⟨S⟩ are called single-qubit circuits over S. Matrix multiplication defines
a map µ : ⟨S⟩ → U(2). However, we often abuse notation and simply write W for
µ(W ).
We think of S ⊆ U(2) as the set of unitary operations that can be performed
natively on a quantum computer. Because a quantum computer is a physical device, S
is finite and the set ⟨S⟩ of circuits expressible on our quantum computer is countable.
In contrast, both U(2) and SU(2) are uncountable. Hence, regardless of which S is
chosen, there will be U ∈ SU(2) such that U ∉ ⟨S⟩. This is a fortiori true of U(2)
also. This might be problematic, as it is often desirable to have all unitaries at our
disposal when writing quantum algorithms. This tension is alleviated by the fact that
unitaries can be approximated.

Definition 1.1.1. The distance between two operators U, W ∈ U(2) is defined as

∥U − W∥ = sup{ ∥Uv − Wv∥ ; v ∈ C² and ∥v∥ = 1 }.

The notion of distance introduced in Definition 1.1.1, based on the operator norm,
is adopted because the physically observable difference between two unitaries U and
W is a function of ∥U − W∥. As a result, if ∥U − W∥ is small enough, the actions of the
unitaries U and W are observably almost indistinguishable. The finiteness of the
gate set S can therefore be remedied if ⟨S⟩ is dense in SU(2).
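As an aside, the distance of Definition 1.1.1 is simply the largest singular value of U − W, so it is easy to compute numerically. The following Python sketch is illustrative only; the two example gates are arbitrary choices.

    import numpy as np

    def distance(U, W):
        # Operator norm of U - W, i.e., its largest singular value.
        return np.linalg.norm(U - W, ord=2)

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])
    print(distance(H, T))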

Definition 1.1.2. A set S ⊆ U(2) is universal if for any U ∈ SU(2) and any ε > 0,
there exists W ∈ ⟨S⟩ such that ∥U − W ∥ < ε.

Note that a set S of gates is universal if it is dense in SU(2), rather than in U(2).
This is because a global phase has no observable effect in quantum mechanics, so we
can without loss of generality focus on special unitary matrices.
If a gate set S is universal, then any special unitary can be approximated up to an
arbitrarily small precision by a circuit over S. However, the fact that S is universal
does not provide an efficient method which, given a special unitary U and a precision
ε, allows us to construct the approximating circuit W . An algorithm that performs
such a task is a solution to the approximate synthesis problem for S.

Problem 1.1.3 (Approximate synthesis problem for S). Given a unitary U ∈ SU(2)
and a precision ε > 0, construct a circuit W over S such that ∥W − U∥ ≤ ε.

The approximate synthesis problem is important for quantum computing because


it significantly impacts the resources required to run a quantum algorithm. Indeed,
a quantum circuit, to be executed by a quantum computer, must first be compiled


into some universal gate set. The complexity of the final physical circuit therefore
crucially depends on the chosen synthesis method. In view of the considerable re-
sources required for most quantum algorithms on interesting problem sizes, which
can require upwards of 30 trillion gates [28], a universal gate set can be realistically
considered for practical quantum computing only if it comes equipped with a good
synthesis algorithm.
In this thesis, we are interested in Problem 1.1.3 for two universal extensions of
the Clifford group. The Clifford group is generated by the following gates
ω = e^{iπ/4}, \qquad H = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad \text{and} \qquad S = \begin{bmatrix} 1 & 0 \\ 0 & -i \end{bmatrix}.

Note that ω is a complex number, rather than a matrix. By a slight abuse of notation,
we write ω to denote the unitary ωI, where I is the identity 2 × 2 matrix. The
Clifford group is of great interest in quantum computation because Clifford circuits
can be fault-tolerantly implemented at very low cost in most error-correcting schemes.
For this reason, the Clifford group is often seen as a prime candidate for practical
quantum computing. However, the Clifford group is finite and therefore not universal
for quantum computing. Moreover, Gottesman and Knill showed that Clifford circuits
can be efficiently simulated on a classical computer [26]. It is therefore necessary to
consider universal extensions of the Clifford group.
The first extension we will consider, the Clifford+V gate set, arises by adding the
following V -gates to the set of Clifford generators
V_X = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 & 2i \\ 2i & 1 \end{bmatrix}, \qquad V_Y = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 & 2 \\ -2 & 1 \end{bmatrix}, \qquad \text{and} \qquad V_Z = \frac{1}{\sqrt{5}}\begin{bmatrix} 1 + 2i & 0 \\ 0 & 1 - 2i \end{bmatrix}.

The V -gates were introduced in [46] and [47] and later considered in the context of
approximate synthesis in [32], [7], [54], and [5].
The second extension we will consider, the Clifford+T gate set, arises by adding
the following T -gate to the set of Clifford generators
T = \begin{bmatrix} 1 & 0 \\ 0 & ω \end{bmatrix}

where ω = e^{iπ/4} as above. The T gate is the most common extension of the Clifford
group considered in the literature. One reason for this is that the Clifford+T gate set
enjoys nice error-correction properties [50]. This gate set has often been considered
in the approximate synthesis literature (see, e.g., [41], [40], [60], and [56]).
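As a quick numerical sanity check, the generators above can be written down and verified to be unitary. The following sketch is illustrative only; the matrices are transcribed from the definitions given in this section.

    import numpy as np

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    S = np.array([[1, 0], [0, -1j]])
    T = np.array([[1, 0], [0, np.exp(1j * np.pi / 4)]])
    VX = np.array([[1, 2j], [2j, 1]]) / np.sqrt(5)
    VY = np.array([[1, 2], [-2, 1]]) / np.sqrt(5)
    VZ = np.array([[1 + 2j, 0], [0, 1 - 2j]]) / np.sqrt(5)

    for name, U in [("H", H), ("S", S), ("T", T), ("VX", VX), ("VY", VY), ("VZ", VZ)]:
        # A matrix U is unitary when U times its conjugate transpose is the identity.
        assert np.allclose(U @ U.conj().T, np.eye(2)), name
    print("all generators are unitary")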

We evaluate an approximate synthesis algorithm with respect to its time complex-


ity and its circuit complexity. The time complexity of an algorithm is the number
of arithmetic operations it requires to produce an approximating circuit. The circuit
complexity is the length of the produced circuit, which we identify with the number
of non-Clifford gates that appear in the circuit. This is motivated by the high cost of
error correction for non-Clifford gates.

Until recently, there were two main approaches to the approximate synthesis prob-
lem: the ones based on exhaustive search, like Fowler’s algorithm of [20], and the ones
based on geometric methods, like the Solovay-Kitaev algorithm ([37], [14], [38]). The
methods based on exhaustive search achieve optimal circuit sizes but, due to their
exponential runtimes, are impractical for small ε. In contrast, the well-known Solovay-
Kitaev algorithm has polynomial runtime and achieves circuit sizes of O(log^c(1/ε)),
where c > 3. However, the information-theoretic lower bound is O(log(1/ε)), so the
Solovay-Kitaev algorithm leaves ample room for improvement.

In the past few years, number theoretic methods, and in particular Diophantine
equations, have been used to define new synthesis algorithms. This has rejuvenated
the field of quantum circuit synthesis and significant progress has been made ([42],
[7], [54], [5], [41], [40], [60], [56]).

The algorithms we introduce in this thesis belong to this new kind of number-
theoretic algorithms. We give algorithms for Clifford+V and Clifford+T circuits.
Both algorithms are efficient (they run in probabilistic polynomial time) and achieve
near-optimal circuit length. In certain specific cases, the algorithms are optimal.
That is, the produced circuits are the shortest approximations possible. This solves
a long-standing open problem, as no such efficient optimal synthesis method was
previously known for any gate set.

1.2 The mathematical foundations of Quipper

Quipper is a programming language for quantum computation (see [1], [27], [28], and
[55]). The Quipper language was developed in 2011–2013 in the context of a research
contract for the U.S. Intelligence Advanced Research Project Activity (see [35]). As
part of this project, seven non-trivial algorithms from the quantum computing lit-
erature were implemented in Quipper ([10], [2], [30], [65], [31], [53], and [48]). I
participated in the development of the Quipper language and in the implementation
of the Triangle Finding Algorithm [48] and the Unique Shortest Vector Algorithm
[53].

An important aspect of the Quipper language is that it acts as a circuit description


language. This means that Quipper provides a syntax in which to express quantum
circuits. Quipper moreover provides the ability to treat circuits as data and to manip-
ulate them as a whole. For example, Quipper has operators for reversing and iterating
circuits, decomposing them into gate sets, etc. This circuit-as-data paradigm is re-
markably useful for the programmer, as it is very close to the way in which quantum
algorithms are described in the literature.

Currently, Quipper is implemented as an embedded language [12]. This means


that Quipper can be seen as a collection of functions and data types within some
pre-existing host language. Quipper’s host language is Haskell [34], a strongly-typed
functional programming language. An advantage of this embedded language approach
is that it allows for the implementation of a large-scale system without having to first
design and implement a compiler, a parser, etc. The embedded language approach
also has drawbacks, however. In particular, Quipper inherits the type system of its
host and while Haskell’s type system provides many type-safety properties, it is not
in general strong enough to ensure the full type-safety of quantum programs. In
the current Quipper implementation, it is therefore the programmer’s responsibility
to ensure that quantum components are plugged together in physically meaningful
ways. This means that certain types of programming errors will not be prevented by
the compiler. In the worst case, this may lead to ill-formed output or run-time errors.

In this thesis, we introduce a quantum programming language which we call Proto-


Quipper. It is defined as a typed lambda calculus and can be seen as a mathematical
formalization of a fragment of Quipper. Proto-Quipper is meant to provide a foun-


dation for the development of a stand-alone (i.e., non-embedded) version of Quipper.
Moreover, Proto-Quipper is designed to “enforce the physics”, in the sense that it de-
tects, at compile-time, programming errors that could lead to ill-formed or undefined
circuits. In particular, the no-cloning property of quantum computation is enforced.
In designing the Proto-Quipper language, our approach was to start with a limited,
but still expressive, fragment of the Quipper language and make it type safe. This
fragment will serve then as a robust basis for future language extensions. The idea is
to eventually close the gap between Proto-Quipper and Quipper by extending Proto-
Quipper with one feature at a time while retaining type safety.
Our main inspiration for the design of Proto-Quipper is the quantum lambda
calculus (see [64], [61], or [58]). The quantum lambda calculus represents an ideal
starting point for the design of Proto-Quipper because it is equipped with a type
system tailored for quantum computation. However, the quantum lambda calculus
only manipulates qubits and all quantum operations are immediately carried out on
a quantum device, not stored for symbolic manipulation. We therefore extend the
quantum lambda calculus with the minimal set of features that makes it Quipper-like.
The current version of Proto-Quipper is designed to

• incorporate Quipper’s ability to generate and act on quantum circuits, and to

• provide a linear type system to guarantee that the produced circuits are physi-
cally meaningful (in particular, properties like no-cloning are respected).

To achieve these goals, we extend the types of the quantum lambda calculus with
a type Circ(T, U ) of circuits, and add constant terms to capture some of Quipper’s
circuit-level operations, like reversing. We give a formal operational semantics of
Proto-Quipper in terms of a reduction relation on pairs [C, t] of a term t of the
language and a so-called circuit state C. The state C represents the circuit currently
being built. The reduction is defined as a rewrite procedure on such pairs, with the
state being affected when terms involve quantum constants.

1.3 Outline

The thesis can be divided into three parts, whose contents are outlined below.

• The first part of the thesis, which corresponds to chapters 2 – 4, contains back-
ground material. Chapter 2 is an exposition of the basic notions of quantum
computation. Chapter 3 introduces concepts and methods from algebraic num-
ber theory. Chapter 4 presents the lambda calculus as well as the quantum
lambda calculus.

• The second part of the thesis, which corresponds to chapters 5 – 7, contains


algebraic contributions to quantum computation. In Chapter 5, we introduce
and solve grid problems. In Chapter 6, we define an algorithm to solve the
problem of approximate synthesis of special unitaries over the Clifford+V gate
set. In Chapter 7 we show how the methods of the previous chapter can be
adapted to the Clifford+T gate set.

• The third and last part of the thesis, which corresponds to chapters 8 and
9, contains logical contributions to quantum computation. In Chapter 8 we
introduce the syntax, type system, and operational semantics of Proto-Quipper.
In Chapter 9 we prove that Proto-Quipper enjoys the subject reduction and
progress properties.

In the interest of brevity, the introductory chapters 2 – 4 contain only the material
that is necessary to the subsequent chapters. In particular, most proofs are omitted.
In each of these chapters, we provide references to the relevant literature.

1.4 Contributions

My original contributions are contained in chapters 5 – 9 of the thesis. They have


appeared in several published papers. The algorithm for the Clifford+V approximate
synthesis of unitaries and its analysis (Section 5.1 and Chapter 6) appeared in the
single-authored paper [54]. The algorithm for the Clifford+T approximate synthesis
of unitaries and its analysis (Section 5.2 and Chapter 7) appeared in the paper [56],
co-authored with my supervisor Peter Selinger, following earlier work by Selinger [60].
The results of Section 5.2.4, providing a method to make two ellipses simultaneously
upright, are my original contribution. The other results of [56] are the product of an
equal collaboration. Finally, the definition of the Proto-Quipper language, as well as
the proof of its type safety (Chapter 8 and 9) appeared in the report [9], co-authored
with Henri Chataing and Peter Selinger. The results in these chapters are my original
work; Peter Selinger’s role was supervisory, and Henri Chataing was a summer intern
whom I helped supervise.
Chapter 2

Quantum computation

In this chapter, we provide a brief introduction to quantum computation. Further


details can be found in the literature. References for this material include [50] and
[36]. For a concise introduction, we also refer the reader to [59] which we loosely
follow here.

2.1 Linear Algebra

We write N for the semiring of non-negative integers (including 0) and Z for the ring
of integers. The fields of rational numbers, real numbers, and complex numbers are
denoted by Q, R, and C respectively. Recall that C = {a + bi ; a, b ∈ R} so that the
complex numbers can be identified with the two-dimensional real plane R2 . Recall
moreover that if α = a + bi is a complex number, then its conjugate is α† = a − bi.

2.1.1 Finite dimensional Hilbert spaces

We will be interested in complex vector spaces of the form Cn for some n ∈ N. The
integer n is called the dimension of Cn . The vector space Cn has a canonical basis,
whose elements we denote by e_i, for 1 ≤ i ≤ n. Every element α ∈ C^n can be uniquely
written as a linear combination

α = a_1 e_1 + · · · + a_n e_n   (2.1)

with a_1, . . . , a_n ∈ C. We will often represent the vector α of (2.1) as a column vector


α = \begin{bmatrix} a_1 \\ \vdots \\ a_n \end{bmatrix}.   (2.2)

We identify C1 with C and frequently refer to the elements of C as scalars. The vector
space Cn is equipped with the usual operations of addition and scalar multiplication.


Given a vector α as in (2.2), its dual is the row vector


α^† = \begin{bmatrix} a_1^† & \cdots & a_n^† \end{bmatrix}.

Note that under the identification of C1 and C, no ambiguity arises from using (−)†
to denote the dual of a vector and the conjugate of a scalar. The norm of a vector α

is defined as ||α|| = √(α†α), where α†α is obtained by matrix multiplication. A vector
whose norm is 1 is called a unit vector. Equipped with the norm || − ||, the vector
space Cn has the structure of an n-dimensional Hilbert space.

2.1.2 Operators and matrices

A linear operator Cn → Cm can be represented by an m × n complex matrix. We


write Mm,n (C) for the set of all m × n complex matrices and we say that m and n are
the dimensions of U ∈ Mm,n (C). If m = n, then we simply say that U has dimension
n. We identify Mn,1 (C) and Cn .
We will use some well-known operations on matrices. For any n, the identity
matrix of dimension n is denoted by In , or simply by I if the dimension is clear
from context. If U ∈ Mn,n (C), an inverse of U , written U −1 , is a matrix such that
U U −1 = U −1 U = I. If it exists, the inverse of a matrix is unique. We note that
for any n, the set of invertible matrices of dimension n forms a group under matrix
multiplication. We denote this group by Mn,n (C)× . If U ∈ Mm,n (C), the conjugate

transpose of U , written U † , is defined by Ui,j = (Uj,i )† . The identification of Mn,1 (C)
and Cn ensures that no ambiguity arises from our use of (−)† to denote the conjugate
transpose of a matrix, since the conjugate transpose of a column vector is its dual.
We will also use the well-known notions of trace and determinant of a square
matrix. Both are functions that assign a complex number to any complex square
matrix. For U ∈ Mn,n (C), the trace of U is defined as
tr(U) = \sum_{i=1}^{n} U_{i,i}.

The formula to express the determinant of an arbitrary n × n matrix is somewhat


cumbersome. Since we will only be considering determinants of 2 × 2 matrices, we
give an explicit definition of the determinant in this case only. For U ∈ M2,2 (C), the
H = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad Y = \begin{bmatrix} 0 & -i \\ i & 0 \end{bmatrix} \qquad Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}

Figure 2.1: The Hadamard matrix H and the Pauli matrices X, Y , and Z.

determinant of U is defined as
\det\left(\begin{bmatrix} α & β \\ γ & δ \end{bmatrix}\right) = αδ − βγ.

We note that in any dimension the determinant is multiplicative and det(I) = 1.


Finally, if U ∈ M_{n,n}(C), α ∈ C^n, λ ∈ C, and Uα = λα, then α is an eigenvector
of U with eigenvalue λ.

2.1.3 Unitary, Hermitian, and positive matrices

A complex matrix U is unitary if U −1 = U † . Unitary matrices preserve the norm of


vectors in Cn . That is, if U is unitary then ||U α|| = ||α|| for any vector α ∈ Cn . The
composition of two unitary matrices is again unitary. Moreover, the identity matrix
is unitary. Hence, the set U(n) of all unitary matrices of dimension n forms a group,
called the unitary group of order n. Since the determinant is a multiplicative function,
U(n) has a subgroup which consists of those unitary matrices whose determinant is
1. This group is called the special unitary group of order n and is denoted by SU(n).
We thus have the inclusions

SU(n) ⊆ U(n) ⊆ Mn,n (C)× .

Examples of useful unitary matrices are provided in Figure 2.1. The matrix H is
known as the Hadamard matrix and the matrices X, Y , and Z are known as the
Pauli matrices.
A complex matrix U is Hermitian if U = U†. If U is Hermitian, then α†Uα is
always real for any α ∈ C^n. Note that all the matrices in Figure 2.1 are Hermitian.
A matrix U is positive semidefinite (resp. positive definite) if U is Hermitian and
α†Uα ≥ 0 for all α ∈ C^n (resp. α†Uα > 0 for all non-zero α ∈ C^n).

2.1.4 Tensor products

The tensor product of vector spaces and matrices is defined as usual and denoted by
⊗. We note that C^n ⊗ C^m = C^{nm} and that the tensor product acts on square matrices
as
⊗ : Mn,n (C) × Mm,m (C) → Mnm,nm (C).

Given two vectors α ∈ Cn , β ∈ Cm , their tensor product γ = α ⊗ β ∈ Cnm is defined


by γ(i,j) = αi βj , with the pairs (i, j) ordered lexicographically. One obtains a basis for
the tensor product H ⊗ H′ of two vector spaces H and H′ by tensoring the elements
of the bases for H and H′ . For example, if {α1 , α2 } and {β1 , β2 } are two bases for C2 ,
then
{α1 ⊗ β1 , α1 ⊗ β2 , α2 ⊗ β1 , α2 ⊗ β2 }

forms a basis for C2 ⊗ C2 = C4 . We note, however, that not all elements of C4 are of
the form α ⊗ β with α, β ∈ C2 .
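Concretely, the tensor product of coordinate vectors and matrices is the Kronecker product with the lexicographic convention described above. The following minimal sketch is illustrative only.

    import numpy as np

    alpha = np.array([1, 0])                 # a vector in C^2
    beta = np.array([1, 1]) / np.sqrt(2)     # another vector in C^2
    print(np.kron(alpha, beta))              # alpha tensor beta, a vector in C^4

    X = np.array([[0, 1], [1, 0]])
    I = np.eye(2)
    print(np.kron(I, X))                     # a 4 x 4 matrix acting on C^2 tensor C^2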

2.2 Quantum Computation

2.2.1 A single quantum bit

Recall that the fundamental unit of classical computation is the bit. By analogy, the
fundamental unit of quantum computation is called the quantum bit, or qubit.

Definition 2.2.1. The state of a qubit is a unit vector in C2 considered up to a phase,


that is, up to multiplication by a unit vector of C.

As is customary, we use the so-called ket notation to denote the state of a qubit,
which is written |φ⟩. Moreover, we write |0⟩ and |1⟩ for the elements e1 and e2 of the
standard basis for C2 . In the context of quantum computation, the basis {|0⟩, |1⟩} is
referred to as the computational basis. Since the state of a (classical) bit is an element
of the set {0, 1}, we sometimes refer to the states |0⟩ = 1|0⟩+0|1⟩ and |1⟩ = 0|0⟩+1|1⟩
as the classical states.
By Definition 2.2.1, the state of a qubit can be one of the classical states but
also any linear combination α|0⟩ + β|1⟩ such that ||α||2 + ||β||2 = 1. The coefficients
α and β in such a linear combination are called the amplitudes of the state. When

Figure 2.2: The Bloch sphere representation of the state of a qubit. The state (θ, φ)
is represented by the black dot, with θ corresponding to the angle pictured in red and
φ to the angle pictured in green.

both amplitudes of the state of a qubit are non-zero, then the qubit is said to be in
a superposition of |0⟩ and |1⟩.
The fact that the vector in Definition 2.2.1 is considered only up to a phase gives
rise to a nice geometric representation of the state of a qubit. Let α|0⟩ + β|1⟩ be a
unit vector. We can rewrite this linear combination as
α|0⟩ + β|1⟩ = e^{iγ}(cos(θ/2)|0⟩ + e^{iφ}sin(θ/2)|1⟩)

for some real numbers θ ∈ [0, π] and γ, φ ∈ [0, 2π]. Since the state of a qubit is defined
up to a phase, the same state is described by

cos(θ/2)|0⟩ + e^{iφ}sin(θ/2)|1⟩
with θ, φ ∈ R. The pair (θ, φ) defines a point on the 2-sphere known as the Bloch
sphere representation of the state, as pictured in Figure 2.2. The Cartesian coordi-
nates of the point (θ, φ) on the Bloch sphere are given by (sin θ cos φ, sin θ sin φ, cos θ).
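For a concrete illustration, the Cartesian point on the Bloch sphere can be computed directly from a pair of amplitudes by stripping the global phase and reading off θ and φ. The sketch below is illustrative only.

    import numpy as np

    def bloch_point(alpha, beta):
        # Remove the global phase so that the |0> amplitude is real and non-negative.
        phase = np.angle(alpha) if abs(alpha) > 0 else 0.0
        alpha, beta = alpha * np.exp(-1j * phase), beta * np.exp(-1j * phase)
        theta = 2 * np.arccos(min(abs(alpha), 1.0))
        phi = np.angle(beta) if abs(beta) > 0 else 0.0
        return (np.sin(theta) * np.cos(phi), np.sin(theta) * np.sin(phi), np.cos(theta))

    # The state (|0> + |1>)/sqrt(2) sits on the positive x-axis of the Bloch sphere.
    print(bloch_point(1 / np.sqrt(2), 1 / np.sqrt(2)))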

2.2.2 Multiple quantum bits

In classical computation, the state of a system of n bits is represented by an element of


the set {0, 1}n . The set of states of a complex system therefore arises as a Cartesian
product. In contrast, the set of states of a system composed of multiple qubits is
obtained using the tensor product.
Definition 2.2.2. The state of a system of n qubits is a unit vector in C^{2^n} considered
up to a phase.

The basis {|0⟩, |1⟩} for C² can be used to construct a basis for C^{2^n}, which we
also call the computational basis. As is customary, we denote the basis element
|x1 ⟩ ⊗ . . . ⊗ |xn ⟩ by |x1 . . . xn ⟩, for any x1 , . . . , xn ∈ {0, 1}. For example, the state of
a system of two qubits is described, up to a phase, by a linear combination

α0 |00⟩ + α1 |01⟩ + α2 |10⟩ + α3 |11⟩

for some complex numbers αᵢ such that Σᵢ ||αᵢ||² = 1. As mentioned in Section 2.1.4,
not all elements of C⁴ arise as the tensor product of two elements of C². If the state of a
two-qubit system can be written as

(α|0⟩ + β|1⟩) ⊗ (α′ |0⟩ + β ′ |1⟩)

then the two qubits are said to be separable. Otherwise, they are said to be entangled.

2.2.3 Evolution of a quantum system

A computation is performed by acting on the state of a system of qubits. This can


be done in two ways: via unitary transformation or via measurement.

Unitary evolution

The state of a system of n qubits can evolve under the action of a unitary operator.
If |φ⟩ is such a state (viewed as a column vector in C^{2^n}) and U is a unitary matrix,
the evolution of |φ⟩ under U is given by |φ⟩ ↦ U|φ⟩. For this reason, we sometimes
refer to a unitary matrix of dimension 2n as an n-qubit unitary.
Recall from Section 2.2.1 that the state of a qubit can be interpreted as a point
on the Bloch sphere. This interpretation extends to single-qubit unitary matrices,
which can be seen as rotations of the Bloch sphere. Let v = (x, y, z) be a unit vector
in R3 and θ ∈ R and define the matrix
R_v(θ) = cos(θ/2)I − i·sin(θ/2)(xX + yY + zZ),
where X, Y , and Z are the Pauli matrices. The matrix Rv (θ) defines a rotation of
the Bloch sphere by θ radians about the v-axis. For example, the Pauli matrices X,
Y , and Z correspond to rotations about the x-, y-, and z-axes by π radians. The
following theorem states that, up to a phase, every unitary can be seen as a rotation
of the Bloch sphere.

Theorem 2.2.3. If U ∈ U(2), then there exist real numbers α and θ, and a unit
vector w in R3 such that U = eiα Rw (θ).
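The following numerical sketch is illustrative only; it builds R_v(θ) as defined above and checks that the Pauli matrix X agrees with a rotation by π about the x-axis up to the global phase i.

    import numpy as np

    I = np.eye(2)
    X = np.array([[0, 1], [1, 0]])
    Y = np.array([[0, -1j], [1j, 0]])
    Z = np.array([[1, 0], [0, -1]])

    def R(v, theta):
        x, y, z = v
        return np.cos(theta / 2) * I - 1j * np.sin(theta / 2) * (x * X + y * Y + z * Z)

    # X = e^{i pi/2} R_x(pi), in accordance with Theorem 2.2.3.
    assert np.allclose(X, np.exp(1j * np.pi / 2) * R((1, 0, 0), np.pi))
    print("X is a rotation by pi about the x-axis, up to a phase")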

Measurement

The second way in which one can act on the state of a system of qubits is by mea-
surement. Unlike unitary evolutions, measurements are probabilistic processes.
We first describe the effect of a measurement on a single qubit. Assume a qubit is
in the state α|0⟩ + β|1⟩. If the qubit is measured, then the result of the measurement
is either 0 or 1 and the state of the qubit post-measurement is the corresponding
classical state. Moreover, the measurement result 0 occurs with probability ||α||2 and
the measurement result 1 occurs with probability ||β||2 . Since the state of a qubit
was described by a unit vector, these probabilities sum to 1.
We now discuss the case of a complex system. For simplicity, we only consider a
two-qubit system. Assume the state of our system is given by the following vector

α0 |00⟩ + α1 |01⟩ + α2 |10⟩ + α3 |11⟩.

If the first qubit is measured, then with probability ||α0 ||2 + ||α1 ||2 the measurement
result is 0 and the post-measurement state is

(α₀ |00⟩ + α₁ |01⟩)/√(||α₀||² + ||α₁||²),

while with probability ||α2 ||2 + ||α3 ||2 the measurement result is 1 and the post-
measurement state is

(α₂ |10⟩ + α₃ |11⟩)/√(||α₂||² + ||α₃||²).

Note that the linear combinations have been renormalized to ensure that the resulting
states are described by unit vectors.
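A small simulation of this measurement rule, for the first qubit of a two-qubit state, might look as follows. This is an illustrative sketch only.

    import numpy as np

    def measure_first_qubit(state):
        # state = [a0, a1, a2, a3] in the basis |00>, |01>, |10>, |11>.
        p0 = abs(state[0])**2 + abs(state[1])**2
        if np.random.random() < p0:
            return 0, np.array([state[0], state[1], 0, 0]) / np.sqrt(p0)
        return 1, np.array([0, 0, state[2], state[3]]) / np.sqrt(1 - p0)

    # Measuring (|00> + |11>)/sqrt(2) gives 0 or 1 with probability 1/2 each,
    # and leaves the system in |00> or |11> respectively.
    print(measure_first_qubit(np.array([1, 0, 0, 1]) / np.sqrt(2)))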

No-cloning

The well-known no-cloning theorem is a property of quantum computation which will


be of importance in chapters 4, 8, and 9. The theorem states that there is no physical
device whose action is described by the following mapping

|ψ⟩ ⊗ |0⟩ ↦ |ψ⟩ ⊗ |ψ⟩.

In other words, it is impossible to “clone” quantum states.

2.2.4 The QRAM model and quantum circuits

In Section 2.2, we described the mathematical formalism of quantum computation,


but did not spend any time explaining what a quantum computer might look like,
nor how one would describe quantum programs and protocols. Various models of
quantum computation have been devised in the literature (see, e.g., [15], [52]). Here,
we discuss two complementary approaches: the QRAM model introduced by Knill in
[44] and the circuit model described by Deutsch in [16]. We also refer the reader to
[1] where this model was described in more detail.

The QRAM model of quantum computation

In the QRAM model of a quantum computation, a quantum computer is thought


of as consisting of two devices, a classical device and a quantum device, sharing
computational tasks. The classical device performs operations such as compilation
and correctness checking. The quantum device only performs the specifically quantum
operations. In particular, it is assumed to hold an array of qubits and to be able to

• initialize qubits to a specified state,

• perform unitary operations on qubits, and

• measure qubits.

The execution of a program in this model proceeds as follows. The source code of the
program resides on the classical device where it is compiled into an executable. When
the program is executed on the classical device, it can communicate with the quantum
device if required. Through this communication, it can instruct the quantum device

Figure 2.3: The QRAM model of quantum computation. The classical device sends instructions to the quantum device and receives measurement results in return.

to perform one of the above described quantum operations. Measurement results, if


any, are returned to the classical device for post-processing or further computation.
This system is schematically represented in Figure 2.3.

Quantum circuits

While the QRAM model of quantum computation describes the overall architecture of
a quantum computer, quantum circuits are a language to express (parts of) quantum
algorithms. In particular, one can think of a quantum circuit as the description of a
sequence of operations that, in the QRAM model, the classical device would send to
its quantum counterpart.
In the quantum circuit model, a quantum computation is thought of as a sequence
of unitary gates applied to an array of qubits, followed by a measurement of some or
all of the qubits. The sequence of unitary operations are arranged in the form of a
circuit, akin to the classical boolean circuits.
The identity matrix on n qubits, i.e., the identity matrix of dimension 2n , is
represented by n distinct wires. For example, the identity on 3 qubits is represented
by

(three parallel horizontal wires)
A non-identity unitary U acting on n qubits is depicted as a box, labelled U , with n
input wires and n output wires. For example, the Hadamard matrix H is represented

as

(a single wire passing through a box labelled H)

Because of the similarities between quantum circuits and classical boolean circuits,
we sometimes refer to a unitary U as a quantum gate.
The graphical representation of the composition U V , of two unitary matrices U
and V acting on n qubits, is obtained by horizontally concatenating the gates for U
and V . That is, by connecting the output wires of the gate for V to the input wires
of the gate for U , as illustrated below in the case of matrices on 4 qubits.

(a box labelled V followed by a box labelled U on the same four wires)

Finally, the graphical representation of the tensor product U ⊗ V of two unitary


matrices U and V is obtained by vertically concatenating the gates for U and V ,
as illustrated below in the case of unitary matrices U and V on 3 and 2 qubits
respectively.

(a box labelled U on the top three wires, stacked above a box labelled V on the bottom two wires)
Now let S be a set of unitary matrices and write S † for the set {U † ; U ∈ S}. A
circuit over S is constructed using the gates of S ∪ S † , as well as arbitrary identity
gates, using the graphical representations for composition and tensor product. For
example, a circuit over S = {U, V, W }, where U acts on 2 qubits and V and W both
act on 3 qubits, is represented below

(a circuit on five wires containing the gates U, V, W, U†, and V†)

We write ⟨S⟩ for the set of all circuits over S.


One can recover the matrix represented by a circuit by interpreting the operations
of horizontal and vertical concatenation as composition and tensor product respec-
tively. For example, let S consist of the Hadamard gate H and the Pauli gate X, and

let C be the following circuit


(a two-qubit circuit: the first qubit passes through H; the second qubit passes through X and then H)
Then C represents the unitary matrix U given by
U = (I_2 ⊗ H) ◦ (H ⊗ X) = \frac{1}{2}\begin{bmatrix} 1 & 1 & 1 & 1 \\ -1 & 1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ -1 & 1 & 1 & -1 \end{bmatrix}.

Note that C could have alternatively been interpreted as the unitary (I2 ◦H)⊗(H ◦X).
However, both interpretations coincide since (I2 ◦ H) ⊗ (H ◦ X) = (I2 ⊗ H) ◦ (H ⊗ X).
This is due to the so-called bifunctoriality of ⊗, which guarantees in particular that

(U ◦ U ′ ) ⊗ (V ◦ V ′ ) = (U ⊗ V ) ◦ (U ′ ⊗ V ′ )

for any U, U ′ , V, V ′ . By a slight abuse of notation, we often write C = U if the circuit


C represents the matrix U .
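This bifunctoriality identity is easy to confirm numerically for the circuit C above. The following sketch, which is illustrative only, compares the column-by-column and wire-by-wire readings.

    import numpy as np

    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    X = np.array([[0, 1], [1, 0]])
    I2 = np.eye(2)

    columnwise = np.kron(I2, H) @ np.kron(H, X)   # (I2 tensor H) composed with (H tensor X)
    wirewise = np.kron(I2 @ H, H @ X)             # (I2 o H) tensor (H o X)
    assert np.allclose(columnwise, wirewise)
    print("both readings of the circuit C coincide")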
The language of quantum circuits can be extended with measurement gates, which
are depicted as
(a meter symbol on a wire)
Chapter 3

Algebraic number theory

In this chapter, we introduce basic concepts of algebraic number theory. In particular,


we describe ring extensions of Z and computational methods to solve certain Diophan-
tine equations, known as relative norm equations, over these rings. References for this
material include [49] and [13].

3.1 Rings of integers

If K is a field, S a subfield of K, and α an element of K, then the field extension


S(α) is the smallest subfield of K which contains S and α.
An element α ∈ C is an algebraic number if it is the root of some polynomial over
Q. A field extension of Q of the form Q(α) for some algebraic number α is called an
algebraic number field.
An element β ∈ C is an algebraic integer if it is the root of some monic polynomial
over Z, i.e., of some polynomial over Z whose leading coefficient is 1. The set of
algebraic integers of a number field Q(α) forms a ring, called the ring of integers of
Q(α) and denoted by OQ(α) . For any algebraic number field Q(α), the ring OQ(α) is
an integral domain whose field of fractions is Q(α).

3.1.1 Extensions of Z

If R is a ring, R′ a subring of R, and α an element of R, then the ring extension R′ [α]


is the smallest subring of R which contains R′ and α.

Definition 3.1.1 (Extensions of Z). We are interested in the following four ring
extensions of Z.

• The ring Z of integers.

• The ring Z[i] of Gaussian integers.


• The ring Z[√2] of quadratic integers with radicand 2.

• The ring Z[ω] of cyclotomic integers of degree 8, where ω = e^{iπ/4} = (1 + i)/√2.

We note that since i = ω² and √2 = ω − ω³, we have the inclusions Z ⊆ Z[i] ⊆ Z[ω]
and Z ⊆ Z[√2] ⊆ Z[ω]. Moreover, one can show that Z[√2] and Z[ω] are dense in
R and C respectively. The rings introduced in Definition 3.1.1 are rings of algebraic
integers. Indeed we have

Z = O_Q,  Z[i] = O_{Q(i)},  Z[√2] = O_{Q(√2)},  and  Z[ω] = O_{Q(ω)}.

Explicit expressions for the elements of Z[i], Z[√2], and Z[ω] are given below.

• Z[i] = {a₀ + a₁i ; aⱼ ∈ Z}.

• Z[√2] = {a₀ + a₁√2 ; aⱼ ∈ Z}.

• Z[ω] = {a₀ + a₁ω + a₂ω² + a₃ω³ ; aⱼ ∈ Z}.

Note that there is a bijection between Z[i] and Z². As the following proposition shows,
there is also a bijection between Z[ω] and two disjoint copies of Z[√2] × Z[√2].

Proposition 3.1.2. A complex number α is in Z[ω] if and only if it can be written
in the form α = a₀ + a₁i or in the form α = a₀ + a₁i + ω, where a₀, a₁ ∈ Z[√2].

Proof. The right-to-left implication is trivial. For the left-to-right implication, let
α = a + bω + cω² + dω³, where a, b, c, d ∈ Z. Noting that ω = (1 + i)/√2, we have

α = (a + ((b − d)/2)√2) + (c + ((b + d)/2)√2)i.

If b − d (and therefore b + d) is even, then α is of the first form, with a₀ = a + ((b − d)/2)√2
and a₁ = c + ((b + d)/2)√2. If b − d (and therefore b + d) is odd, then α is of the second
form, with a₀ = a + ((b − d − 1)/2)√2 and a₁ = c + ((b + d − 1)/2)√2.
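The proof translates directly into code. The following sketch is illustrative only; it represents an element of Z[√2] by its coefficient pair (x, y), standing for x + y√2.

    def decompose_zomega(a, b, c, d):
        # Write a + b*omega + c*omega^2 + d*omega^3 as a0 + a1*i or a0 + a1*i + omega,
        # with a0, a1 in Z[sqrt(2)] given as coefficient pairs.
        if (b - d) % 2 == 0:
            return (a, (b - d) // 2), (c, (b + d) // 2), False     # first form
        return (a, (b - d - 1) // 2), (c, (b + d - 1) // 2), True  # second form, plus omega

    # 1 + 2*omega + omega^3 decomposes as 1 + (sqrt(2))*i + omega.
    print(decompose_zomega(1, 2, 0, 1))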

We close this subsection with the definition of two algebraic integers which will
be useful in chapters 5, 6, and 7.

Definition 3.1.3. The algebraic integers λ ∈ Z[√2] and δ ∈ Z[ω] are defined as
follows:

• λ = 1 + √2 and

• δ = 1 + ω.

3.1.2 Automorphisms

Recall that an automorphism of a ring R is an isomorphism R → R. The ring Z[ω]


has four automorphisms. One of these automorphisms is complex conjugation, which
we denote (−)† as in Chapter 2. Explicitly, (−)† acts on an arbitrary element of Z[ω]
as follows
(a₀ + a₁ω + a₂ω² + a₃ω³)† = a₀ − a₃ω − a₂ω² − a₁ω³.

A second automorphism of Z[ω] is √2-conjugation, denoted (−)•, which acts on an
arbitrary element of Z[ω] as follows

(a₀ + a₁ω + a₂ω² + a₃ω³)• = a₀ − a₁ω + a₂ω² − a₃ω³.


The remaining two automorphisms of Z[ω] are the identity as well as the composite (−)•† = (−)†•.

The rings Z[i] and Z[√2] both have two automorphisms, while Z has exactly one.
All of these are obtained by restricting the automorphisms of Z[ω]. Because (−)† acts
trivially on Z[√2], the only non-identity automorphism of Z[√2] is (−)•. Explicitly,
the action of (−)• on an element of Z[√2] is given by (a + b√2)• = a − b√2. Similarly,
the only non-identity automorphism of Z[i] is (−)†, whose action is explicitly given
by (a + bi)† = a − bi. The ring Z has no non-trivial automorphism.

We note that for t ∈ Z[ω], we have t ∈ Z[√2] iff t = t†, t ∈ Z[i] iff t = t•, and
t ∈ Z iff t = t† and t = t•.

3.1.3 Norms

Let R be one of the rings Z, Z[i], Z[√2], or Z[ω]. We define the norm N_R(α) of an
element α ∈ R to be

N_R(α) = \prod_{σ} σ(α),

where the product is taken over all automorphisms σ : R → R. We provide explicit


formulas for each norm in the definition below.

Definition 3.1.4 (Norms).

• If α ∈ Z, then NZ (α) = α.

• If α = a₀ + a₁i ∈ Z[i], then N_{Z[i]}(α) = α†α = a₀² + a₁².

• If α = a₀ + a₁√2 ∈ Z[√2], then N_{Z[√2]}(α) = α•α = a₀² − 2a₁².

• If α = a₀ + a₁ω + a₂ω² + a₃ω³ ∈ Z[ω], then

N_{Z[ω]}(α) = α α† α• α•† = (a₀² + a₁² + a₂² + a₃²)² − 2(a₃a₂ + a₂a₁ + a₁a₀ − a₃a₀)².

All the norms introduced in Definition 3.1.4 are multiplicative and integer valued.
This means that NR (αβ) = NR (α)NR (β) and NR (α) ∈ Z. The norms NZ[ω] and NZ[i]
are moreover valued in the non-negative integers. Finally, we have NR (α) = 0 iff
α = 0 and NR (α) is a unit if and only if α is a unit, that is, an invertible element.
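As an illustration only, the norms of Definition 3.1.4 can be implemented on coefficient vectors and checked for multiplicativity. The sketch below is not part of the thesis.

    def norm_zsqrt2(a):
        # N(a0 + a1*sqrt(2)) = a0^2 - 2*a1^2
        a0, a1 = a
        return a0 * a0 - 2 * a1 * a1

    def norm_zomega(a):
        # N(a0 + a1*omega + a2*omega^2 + a3*omega^3)
        a0, a1, a2, a3 = a
        s = a0 * a0 + a1 * a1 + a2 * a2 + a3 * a3
        t = a3 * a2 + a2 * a1 + a1 * a0 - a3 * a0
        return s * s - 2 * t * t

    def mult_zsqrt2(a, b):
        # (a0 + a1*sqrt(2)) * (b0 + b1*sqrt(2))
        return (a[0] * b[0] + 2 * a[1] * b[1], a[0] * b[1] + a[1] * b[0])

    a, b = (3, 1), (1, 2)
    assert norm_zsqrt2(mult_zsqrt2(a, b)) == norm_zsqrt2(a) * norm_zsqrt2(b)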

Remark 3.1.5. If α and β are two distinct elements of Z[√2], then the following
inequality holds:
|α − β| · |α• − β•| ≥ 1.   (3.1)

This follows from the fact that |α − β| · |α• − β•| = |N_{Z[√2]}(α − β)|. The same inequality
holds for α, β ∈ Z[ω].

3.2 Diophantine equations

3.2.1 Euclidean domains



The rings Z, Z[i], Z[√2], and Z[ω] are integral domains, whose fields of fractions are
Q, Q(i), Q(√2), and Q(ω). An important property of these rings is that they are
Euclidean domains.

Definition 3.2.1. A Euclidean domain is an integral domain R equipped with a


function f : R \ {0} → N such that for every a ∈ R and b ∈ R \ {0}, there exist
q, r ∈ R such that a = bq + r and r = 0 or f (r) < f (b).

Proposition 3.2.2. Let R be one of Z, Z[i], Z[√2], or Z[ω]. Then the function
|NR (−)| makes R into a Euclidean domain.

The notion of divisibility, as well as many essential properties of the divisibility of


integers can be defined in an arbitrary Euclidean domain. In particular, we write x | y
if x is a divisor of y, and x ∼ y if x | y and y | x. An element x is prime if x is not a unit
and x = ab implies that either a or b is a unit. The notion of greatest common divisor,
as well as Euclid’s algorithm, can be defined in any Euclidean domain. Finally, every
Euclidean domain is also a unique factorization domain, which means that every non-
zero non-unit element of the ring can be factored into primes in an essentially unique
way.
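For example, Euclidean division and Euclid's algorithm in the Gaussian integers can be realized by rounding the exact quotient to the nearest Gaussian integer. The sketch below is illustrative only and uses floating-point complex arithmetic, which is adequate for small examples.

    def gauss_divmod(a, b):
        # Round a/b to the nearest Gaussian integer q; then N(a - b*q) < N(b).
        q = a / b
        q = complex(round(q.real), round(q.imag))
        return q, a - b * q

    def gauss_gcd(a, b):
        while b != 0:
            _, r = gauss_divmod(a, b)
            a, b = b, r
        return a

    print(gauss_gcd(complex(4, 7), complex(1, 3)))   # a greatest common divisor, up to units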

3.2.2 Relative norm equations

In chapters 6 and 7, we will be interested in solving certain equations known as relative


norm equations. Specifically, we will be concerned with the following two problems.

Problem 3.2.3 (Relative norm equation over Z[i]). Given β ∈ Z, find α ∈ Z[i] such
that α† α = β.

Problem 3.2.4 (Relative norm equation over Z[ω]). Given β ∈ Z[√2], find α ∈ Z[ω]
such that α† α = β.

Solving Problem 3.2.3 amounts to finding α ∈ Z[i] such that NZ[i] (α) = β. The
equation to be solved is therefore a norm equation. In Problem 3.2.4, however, α† α ̸=
NZ[ω] (α). In this case, we do not consider all the automorphic images of α. Instead,
we consider the automorphic images of α under the automorphisms of Z[ω] that fix

Z[√2]. For this reason the equation in Problem 3.2.4 is a relative norm equation. For
uniformity, we refer to both equations as relative norm equations.
A Diophantine equation is a polynomial equation in integer variables. By writing
α = a + bi with a, b ∈ Z, Problem 3.2.3 becomes equivalent to the Diophantine
equation
a² + b² = β.

Similarly, by writing α = a + bω + cω² + dω³ and β = a′ + b′√2 with a, a′, b, b′, c, d ∈ Z,
Problem 3.2.4 becomes equivalent to the system of Diophantine equations

a² + b² + c² + d² = a′
ab − ad + cb + cd = b′.

In light of these equivalences, we sometimes abuse terminology and refer to the equa-
tions of problems 3.2.3 and 3.2.4 as Diophantine equations.

Remark 3.2.5. Problems 3.2.3 and 3.2.4 are computational problems. This means that
a solution to either of these problems is an algorithm which decides whether the given
equation has a solution and produces a solution if one exists. The algorithms solving
problems 3.2.3 and 3.2.4 that we consider here are probabilistic in the sense that they
make certain choices at random. We evaluate the time-complexity of an algorithm
by estimating the number of arithmetic operations it requires to solve one of the above
problems. By arithmetic operations, we mean addition, subtraction, multiplication,
division, exponentiation, and logarithm. When we say that an algorithm runs in
probabilistic polynomial time, we mean that the algorithm is probabilistic, requires
a polynomial number of expected arithmetic operations to solve the given problem,
and produces a correct solution with probability greater than 1/2.

Since α†α ≥ 0, a necessary condition for Problem 3.2.3 to have a solution is β ≥ 0.
Similarly, necessary conditions for Problem 3.2.4 to have a solution are β ≥ 0 and
β• ≥ 0. This follows from the fact that α†α = β implies (α•)†(α•) = β•. There are
also sufficient conditions for the above relative norm equations to have solutions.

Proposition 3.2.6. Let β ∈ Z be such that β > 0. If β is prime and β ≡ 1 (mod 4),
then the equation α† α = β has a solution.

Proposition 3.2.7. Let β ∈ Z[√2] be such that β ≥ 0 and β• ≥ 0 and let n = β•β ∈
Z. If n is prime and n ≡ 1 (mod 8), then the equation α† α = β has a solution.

We close this chapter by stating that there are solutions to problems 3.2.3 and
3.2.4.

Proposition 3.2.8. Let β ∈ Z. Given the prime factorization of β, there exists


an algorithm that determines, in probabilistic polynomial time, whether the equation
α† α = β has a solution α ∈ Z[i] or not, and finds a solution if there is one.

Proposition 3.2.9. Let β ∈ Z[ 2], and let n = β • β. Given the prime factorization of
n, there exists an algorithm that determines, in probabilistic polynomial time, whether
the equation α† α = β has a solution α ∈ Z[ω] or not, and finds a solution if there is
one.
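
To give a concrete sense of these algorithms, the following Python sketch treats the special case of Proposition 3.2.8 in which β is a prime congruent to 1 modulo 4 (the situation of Proposition 3.2.6). It first searches probabilistically for a square root of −1 modulo β and then extracts a solution with Euclid's algorithm in Z[i]; the pair representation of Gaussian integers and all function names are ours, and the sketch makes no attempt at the full factorization-based algorithm.

    import random

    def gi_mul(x, y):
        (a, b), (c, d) = x, y
        return (a * c - b * d, a * d + b * c)          # (a+bi)(c+di)

    def gi_divmod(x, y):
        # Rounded division in Z[i]: the remainder has strictly smaller norm.
        (a, b), (c, d) = x, y
        n = c * c + d * d                              # N(y)
        re, im = a * c + b * d, b * c - a * d          # x times conj(y)
        q = ((re + n // 2) // n, (im + n // 2) // n)   # nearest-integer quotient
        qy = gi_mul(q, y)
        return q, (a - qy[0], b - qy[1])

    def gi_gcd(x, y):
        while y != (0, 0):
            _, r = gi_divmod(x, y)
            x, y = y, r
        return x

    def solve_norm_equation_prime(p):
        # Return (a, b) with a^2 + b^2 = p, for a prime p congruent to 1 mod 4.
        assert p % 4 == 1
        while True:                                    # probabilistic step
            c = random.randrange(2, p)
            t = pow(c, (p - 1) // 4, p)
            if t * t % p == p - 1:                     # t^2 = -1 (mod p)
                break
        a, b = gi_gcd((p, 0), (t, 1))                  # a Gaussian prime above p
        assert a * a + b * b == p
        return abs(a), abs(b)

    # solve_norm_equation_prime(13) returns (3, 2) or (2, 3): 9 + 4 = 13.
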
Chapter 4

The lambda calculus

In this chapter, we introduce the untyped lambda calculus as well as various typed
lambda calculi, including the quantum lambda calculus. The standard reference for
the untyped lambda calculus is [4]. For typed lambda calculi, see [24]. For the
quantum lambda calculus, see [64], [61], or [58].

4.1 The untyped lambda calculus

4.1.1 Concrete terms

We start by defining the syntax of the untyped lambda calculus.

Definition 4.1.1. The concrete terms of the untyped lambda calculus are defined
by
a, b ::= x | (λx.a) | (ab)

where x comes from a countable set V of variables.

In Definition 4.1.1, concrete terms are given in the so-called Backus-Naur Form.
This notation should be interpreted as defining the collection L of all concrete terms
as the smallest set of words on the alphabet V ∪ {(, ), λ, .} such that

• V ⊆ L,

• if a ∈ L and x ∈ V, then (λx.a) ∈ L, and

• if a, b ∈ L, then (ab) ∈ L.

A concrete term of the form (λx.a) is called a lambda abstraction and we say
that a is the body of the abstraction. A concrete term of the form (ab) is called an
application. The operations of lambda abstraction and application are called term
forming operations.


To increase the readability of concrete terms, we adopt the following notational


conventions

• outermost parentheses are omitted,

• applications associate to the left,

• the body of a lambda abstraction extends as far to the right as possible, and

• multiple lambda abstractions are contracted.

This implies, for example, that the term (λx.(λy.((xy)x))) will be written λxy.xyx.
The intended interpretation of concrete terms is as follows. The lambda abstrac-
tion λx.a represents the function defined by the rule x ↦→ a. The application ab
represents the application of the function a to the argument b.
As a first example, consider the identity function. Since it acts as x ↦→ x, its
representation as a concrete term should be λx.x. The application of the identity
function to some input a is written as (λx.x)a. As a second example, consider the
function that acts as x ↦→ (y ↦→ x). This function inputs x and outputs the constant
function to x. Its representation as a concrete term is λxy.x.

4.1.2 Reduction

We now define the operational semantics of the terms of the untyped lambda calculus.
That is, we attribute meaning to the terms by specifying their behavior.
For the concrete term λx.x to be an acceptable representative for the identity
function, it should “behave” accordingly, i.e., the concrete term (λx.x)a should reduce
to a. One way to achieve this is to define the reduction relation by

(λx.b)a → b[a/x], (4.1)

where b[a/x] stands for the substitution of a for every occurrence of x in b. Under
this definition of the reduction relation we have (λx.x)a → a, as intended.
Now consider the concrete term λxy.x. Under the interpretation of terms given
above, it represents the function x ↦→ (y ↦→ x). If we apply this concrete term to x′ y ′ ,
then (4.1) yields the expected result

(λxy.x)(x′ y ′ ) → (λy.x)[(x′ y ′ )/x] = λy.x′ y ′ .



However, if we apply λxy.x to the variable y and reduce according to (4.1) again, we
get
(λxy.x)y → (λy.x)[y/x] = λy.y.

This is unsatisfactory because the concrete term λy.y represents the identity function,
not the constant function to y. The way around this problem is to start by renaming
λxy.x to, say, λxz.x. Under this renaming, the reduction of (4.1) would produce the
concrete term λz.y, representing a constant function to y. We therefore need to define
a substitution method that appropriately renames variables.

Definition 4.1.2. Let x and y be two variables and a be a concrete term. The
renaming of x by y in a, written a{y/x}, is defined as

• x{y/x} = y,

• z{y/x} = z if x ̸= z,

• ab{y/x} = (a{y/x})(b{y/x}),

• λx.a{y/x} = λy.(a{y/x}), and

• λz.a{y/x} = λz.(a{y/x}) if x ̸= z.

Definition 4.1.3. The set of free variables of a concrete term a, written FV(a), is
defined as

• FV(x) = {x},

• FV(ab) = FV(a) ∪ FV(b), and

• FV(λx.a) = FV(a) \ {x}.

Any variable that appears in a but does not belong to FV(a) is said to be bound. For
this reason, we call λ a binder.

Definition 4.1.4. If a is a concrete term such that FV(a) = ∅, then a is said to be


closed.

In the terminology of Definition 4.1.3, the problem with the relation defined by
(4.1) is that the variable y is free on the left of → but bound on the right. The
variable y is said to have been captured in the course of the reduction. Hence, we
introduce a capture-avoiding notion of substitution.

Definition 4.1.5. Let x be a variable, and a and b be two concrete terms. The
substitution of b for x in a, written a[b/x], is defined as

• if a = x, then a[b/x] = b,

• if a = y and y ̸= x, then a[b/x] = a,

• if a = cc′ , then a[b/x] = c[b/x]c′ [b/x],

• if a = λx.c, then a[b/x] = a, and

• if a = λy.c, then a[b/x] = λz.(c{z/y})[b/x] where z ∉ FV(b) ∪ FV(c).

Remark 4.1.6. The last clause of Definition 4.1.5 is slightly ambiguous. Indeed, the
variable z is not specified, but only required to belong to V \ (FV(b) ∪ FV(c)). We
say of such a z that it is fresh with respect to b and c. To be more rigorous, we
should require the set V to be well-ordered and choose z to be the least element of
V \ (FV(b) ∪ FV(c)). Since this ambiguity will be lifted below when we move from
concrete terms to abstract ones, we leave the definition unchanged.

To formally define the reduction relation, we start by identifying the strings, within
a concrete term, that will give rise to a reduction. As one might expect, a reduction
will take place whenever a function is applied to an argument.

Definition 4.1.7. A redex is a concrete term of the form (λx.a)b. By extension, a


redex of a concrete term c is a redex that appears in c.

Definition 4.1.8. The one-step β-reduction, written →, is defined on concrete terms
by the following rules:

• (λx.a)b → a[b/x],

• if a → a′, then ab → a′b,

• if b → b′, then ab → ab′, and

• if a → a′, then λx.a → λx.a′.

The β-reduction, written ↠, is the reflexive and transitive closure of →.



Remark 4.1.9. In Definition 4.1.8, ↠ is defined as the reflexive and transitive closure
of →. This defines ↠ as the smallest relation containing → that is reflexive and
transitive.

With the reduction of concrete terms now defined, we can confirm that the con-
crete terms λx.x and λxy.x behave as expected since for any concrete term a we
have
(λx.x)a → a and (λxy.x)a → λz.a

where z ∉ FV(a) ∪ {x}.
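
These definitions translate almost verbatim into a small interpreter. The following Python sketch (purely illustrative; the tuple encoding of terms, the choice of fresh names z0, z1, . . . , and the leftmost-outermost strategy are ours, whereas Definition 4.1.8 does not fix an order of reduction) implements free variables, capture-avoiding substitution, and one-step β-reduction, and reproduces the renaming behaviour discussed above.

    import itertools

    # Concrete terms are encoded as tuples:
    #   ('var', x)        a variable
    #   ('lam', x, a)     the lambda abstraction  lambda x. a
    #   ('app', a, b)     the application  ab

    def fv(t):
        if t[0] == 'var':
            return {t[1]}
        if t[0] == 'lam':
            return fv(t[2]) - {t[1]}
        return fv(t[1]) | fv(t[2])

    def fresh(avoid):
        # First variable z0, z1, ... not occurring in `avoid`.
        return next('z%d' % i for i in itertools.count() if 'z%d' % i not in avoid)

    def subst(a, b, x):
        # Capture-avoiding substitution a[b/x], following Definition 4.1.5;
        # bound variables are renamed eagerly, which is safe if not minimal.
        if a[0] == 'var':
            return b if a[1] == x else a
        if a[0] == 'app':
            return ('app', subst(a[1], b, x), subst(a[2], b, x))
        y, c = a[1], a[2]
        if y == x:
            return a                            # x is bound here, nothing to do
        z = fresh(fv(b) | fv(c))                # fresh variable avoids capture
        return ('lam', z, subst(subst(c, ('var', z), y), b, x))

    def step(t):
        # One step of beta-reduction (leftmost-outermost); None if t is reduced.
        if t[0] == 'app':
            f, a = t[1], t[2]
            if f[0] == 'lam':
                return subst(f[2], a, f[1])     # (lambda x. c) a  ->  c[a/x]
            s = step(f)
            if s is not None:
                return ('app', s, a)
            s = step(a)
            return None if s is None else ('app', f, s)
        if t[0] == 'lam':
            s = step(t[2])
            return None if s is None else ('lam', t[1], s)
        return None

    # (lambda x y. x) y reduces to lambda z0. y, not to the identity:
    # the bound y is renamed before the substitution is carried out.
    K = ('lam', 'x', ('lam', 'y', ('var', 'x')))
    print(step(('app', K, ('var', 'y'))))       # ('lam', 'z0', ('var', 'y'))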

4.1.3 Abstract terms

As noted in Remark 4.1.6, the reduction (λxy.x)a → λz.a depends on the choice of
an ordering of the set V, since z is the least element of V \ (FV(a) ∪ {x}). Thus, two
different orderings of V will yield two different concrete terms λz.a and λz ′ .a. This
difference, however, is inessential since both terms define the same function L → L
because z, z′ ∉ FV(a). The same inessential difference occurs between the concrete
terms λz.z and λz ′ .z ′ . To remove this distinction, we define an equivalence relation
=α on L that equates the concrete terms differing only in the name of their bound
variables. The idea behind this equivalence is that in the concrete term λx.a, the
occurrences of the variable x in a are place holders, rather than variables possessing
an intrinsic identity. They can therefore be renamed without affecting the overall
meaning of the concrete term.

Definition 4.1.10. The α-equivalence, written =α , is defined on concrete terms as


the smallest equivalence relation satisfying the following rules, where y ∉ a means
that y does not appear in a:

• if y ∉ a, then λx.a =α λy.(a{y/x}),

• if a =α a′, then ab =α a′b,

• if b =α b′, then ab =α ab′, and

• if a =α a′, then λx.a =α λx.a′.

Definition 4.1.11. The abstract terms of the untyped lambda calculus are the ele-
ments of L/=α . We write Λ for the set of all abstract terms.

If two concrete terms a and b are α-equivalent, then they have the same structure
(i.e., a is an application if and only if b is an application, and so on). For this reason,
we keep writing x, ab, and λx.a for abstract terms even though we are dealing with
equivalence classes of concrete terms.
When defining a function or a relation on abstract terms using the underlying
concrete terms, we should make sure that this function or relation is well-defined.
For example, if we define a function F on abstract terms in this way, then we should
verify that a =α b implies F (a) = F (b). One can check that the notions of free
variable, capture-avoiding substitution and β-reduction are well-defined on abstract
terms. In these cases, and in what follows, we generally overlook this obligation and
omit the proofs of well-definedness.
Because from now on we will always be manipulating abstract terms, we refer to
these as terms for brevity.

4.1.4 Properties of the untyped lambda calculus

An important property of the untyped lambda calculus is that it forms a complete


model of computation in the sense of the Church-Turing thesis.

Proposition 4.1.12. The untyped lambda calculus is Turing-complete, i.e., every


Turing machine can be simulated by a lambda term.

A term a may have any number of redexes. If a has no redexes, then the compu-
tation of a is finished.

Definition 4.1.13. A term that does not contain any redexes is reduced or in normal
form. If b is reduced and a ↠ b we say that b is a normal form for a.

For example, a variable x is reduced. Similarly, the terms λx.x and λxy.x are both
reduced. The following term, on the other hand, is not reduced, since it contains two
redexes
(λx.x)((λy.z)x′ ).

When reducing such a term, nothing in the definition of the β-reduction tells us which
redex to reduce first. We say that the β-reduction is non-deterministic.

Proposition 4.1.14. The untyped lambda calculus is confluent, i.e., if a, b, and b′


are terms such that a ↠ b and a ↠ b′, then there exists a term c such that b ↠ c and
b′ ↠ c.

Proposition 4.1.14 was first established by Church and Rosser in [11] and is there-
fore known as the Church-Rosser property. It is also referred to as the confluence
property of the β-reduction. Confluence guarantees that, despite the non-determinism
of the reduction, normal forms are unique.

Corollary 4.1.15. Let a be a term. If a has a normal form, then it is unique.

The uniqueness of normal forms implies that the lambda calculus is consistent in
the sense that not all terms are equated by the β-reduction.

Corollary 4.1.16. The untyped lambda calculus is consistent, i.e., there exist two
terms a and b such that a ̸≡β b, where ≡β denotes the reflexive, symmetric, and
transitive closure of →.

4.2 The simply typed lambda calculus

The term xx is a well-formed term of the untyped lambda calculus. If we think of


this term in light of the interpretation discussed in Section 4.1 we are led to interpret
the variable x as both function and argument in xx. This unusual construction is
admitted in the untyped lambda calculus because there are no notions of domain and
codomain for terms. Types can be seen as a method to endow the lambda calculus
with these notions.

4.2.1 Terms

The language of the simply typed lambda calculus is an extension of the language of
the untyped lambda calculus.

Definition 4.2.1. The terms of the simply typed lambda calculus are defined by

a, b ::= x | λx.a | ab | ∗ | ⟨a, b⟩ | let ∗ = a in b | let ⟨x, y⟩ = a in b

where x and y are variables of the untyped lambda calculus.

Definition 4.2.1 extends the language of the untyped lambda calculus by adding
the constant ∗ as well as two new term forming operations. The intended meaning of
these new terms is as follows.

• ⟨a, b⟩ is the pair of a and b.

• ∗ is the empty pair, i.e., the 0-ary version of ⟨a, b⟩.

• let ⟨x, y⟩ = a in b is a term that will reduce a and, in case a ↠ ⟨b1, b2⟩, will
assign b1 to x and b2 to y.

• let ∗ = a in b is the nullary version of let ⟨x, y⟩ = a in b.

We extend the notion of free-variables of a term to account for the new term
forming operations.

Definition 4.2.2. The set of free variables of a term a of the simply typed lambda
calculus, written FV(a), is defined as

• FV(x) = {x},

• FV(ab) = FV(a) ∪ FV(b),

• FV(λx.a) = FV(a) \ {x},

• FV(∗) = ∅,

• FV(⟨a, b⟩) = FV(a) ∪ FV(b),

• FV(let ∗ = a in b) = FV(a) ∪ FV(b), and

• FV(let ⟨x, y⟩ = a in b) = FV(a) ∪ (FV(b) \ {x, y}).

The notions of α-equivalence and capture-avoiding substitution can be extended


to the setting of the simply typed lambda calculus. Note that in let ⟨x, y⟩ = a in b
the variables x and y are bound in b (but not in a) so that let is a binder. As in the
untyped case, we say of a term a such that FV(a) = ∅ that it is closed.

4.2.2 Operational semantics

To account for the new term forming operations of our extended language, we intro-
duce additional reduction rules.

let ∗ = ∗ in a → a let ⟨x, y⟩ = ⟨b, c⟩ in a → a[b/x, c/y]


b → b′ a → a′
⟨a, b⟩ → ⟨a, b′ ⟩ ⟨a, b⟩ → ⟨a′ , b⟩
a → a′
let ∗ = a in b → let ∗ = a′ in b
a → a′
let ⟨x, y⟩ = a in b → let ⟨x, y⟩ = a′ in b
b → b′
let ∗ = a in b → let ∗ = a in b′
b → b′
let ⟨x, y⟩ = a in b → let ⟨x, y⟩ = a in b′

Figure 4.1: Additional reduction rules for the simply typed lambda calculus.

Definition 4.2.3. The one-step β-reduction, written →, is defined on the terms of


the simply typed lambda calculus by the rules of Definition 4.1.8 as well as those given
in Figure 4.1. The β-reduction, written ↠, is the reflexive and transitive closure of
→.

4.2.3 Types

Definition 4.2.4. The types of the simply typed lambda calculus are defined by

A, B ::= X | (A × B) | 1 | (A → B)

where X comes from a set T of basic types.

It can be useful to think of types as sets of terms. Under this interpretation, we


have

• (A × B) is the set of pairs,

• 1 is the set containing the unique empty tuple, and

• (A → B) is the set of functions from A to B.

We now explain how to use types to restrict the formation of terms.



(ax)   Γ, x : A ⊢ x : A

(λ)    from Γ, x : A ⊢ b : B, infer Γ ⊢ λx.b : A → B

(app)  from Γ ⊢ c : A → B and Γ ⊢ a : A, infer Γ ⊢ ca : B

(∗i)   Γ ⊢ ∗ : 1

(∗e)   from Γ ⊢ b : 1 and Γ ⊢ a : A, infer Γ ⊢ let ∗ = b in a : A

(×i)   from Γ ⊢ a : A and Γ ⊢ b : B, infer Γ ⊢ ⟨a, b⟩ : A × B

(×e)   from Γ ⊢ b : (B1 × B2) and Γ, x : B1, y : B2 ⊢ a : A, infer Γ ⊢ let ⟨x, y⟩ = b in a : A

Figure 4.2: Typing rules for the simply typed lambda calculus.

Definition 4.2.5. A typing context is a finite set {x1 : A1 , . . . , xn : An } of pairs of a


variable and a type, such that no variable occurs more than once. The expressions of
the form x : A in a typing context are called type declarations.

Definition 4.2.6. A typing judgment is an expression of the form

Γ⊢a:A

where Γ is a typing context, a is a term, and A is a type.

Definition 4.2.7. A typing judgment is valid if it can be inferred from the rules
given in Figure 4.2.

If a is a term, one shows that Γ ⊢ a : A is valid by exhibiting a typing derivation.


If such a derivation exists, we say that a is well-typed of type A, or sometimes simply
well-typed. For example, below are two typing derivations, establishing that both
λx.x and λxy.x are well-typed.

x : X, y : Y ⊢ x : X
x:X⊢x:X x : X ⊢ λy.x : Y → X
⊢ λx.x : X → X ⊢ λxy.x : X → (Y → X)

The term xx, however, is not well-typed. Indeed, suppose a typing derivation π
of Γ ⊢ xx : B exists. Then the last rule of π must be the (app) rule, since this rule
is the only one allowing the construction of an application. Moreover, the only rule
permitting the introduction of a variable is the (ax) rule. The typing derivation π
must therefore be the following.

Γ, x : A → B ⊢ x : A → B Γ, x : A ⊢ x : A .
Γ ⊢ xx : B

But there are no types A and B such that A = A → B. Hence there is no typing
derivation of Γ ⊢ xx : B.
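
The argument above can be mechanized. The following Python sketch (ours, and simplified: lambda binders carry explicit type annotations, so no type inference is needed, which is an assumption not made in the text) computes the type of a term from the (ax), (λ), and (app) rules, and fails on self-application however x is annotated.

    def infer(ctx, t):
        # ctx maps variables to types; types are ('base', name) or ('arrow', A, B).
        if t[0] == 'var':
            if t[1] not in ctx:
                raise TypeError('unbound variable: ' + t[1])
            return ctx[t[1]]                               # rule (ax)
        if t[0] == 'lam':
            _, x, a_type, body = t
            b_type = infer({**ctx, x: a_type}, body)       # rule (lambda)
            return ('arrow', a_type, b_type)
        _, f, a = t                                        # rule (app)
        f_type, a_type = infer(ctx, f), infer(ctx, a)
        if f_type[0] != 'arrow' or f_type[1] != a_type:
            raise TypeError('ill-typed application')
        return f_type[2]

    X, Y = ('base', 'X'), ('base', 'Y')
    print(infer({}, ('lam', 'x', X, ('var', 'x'))))                   # X -> X
    print(infer({}, ('lam', 'x', X, ('lam', 'y', Y, ('var', 'x')))))  # X -> (Y -> X)
    # infer({'x': A}, ('app', ('var', 'x'), ('var', 'x'))) raises TypeError for
    # every annotation A, since no type satisfies A = A -> B.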

4.2.4 Properties of the type system

The type system of the simply typed lambda calculus restricts the construction of
terms in order to syntactically rule out “ill-behaved” terms. To verify that the type
system achieves this intended goal, we need to prove that all well-typed terms “behave
well”. This intuitive idea is captured by establishing the type safety of the language.
Following [17], we consider that a language is type safe if it enjoys the subject reduction
and progress properties. The latter property relies on a notion of value. These values
are a particular set of distinguished normal forms. The definition of value varies from
language to language. For now, we can take the set of values to consist of all normal
forms. Later, when we consider specific languages, this definition will be adjusted
and explicitly stated.

Subject reduction: This property guarantees that the type of a term is stable under
reduction. As a corollary, it also shows that if a term is well-typed, then it never
reduces to an ill-typed term.

Progress: This property shows that a well-typed closed term is either a value or
admits further reductions.

The simply typed lambda calculus is type safe as it enjoys both of the above
properties.

Proposition 4.2.8 (Subject reduction). If Γ ⊢ a : A and a → a′ , then Γ ⊢ a′ : A.

Proposition 4.2.9 (Progress). If ⊢ a : A, then either a is a value or there exists a′


such that a → a′ .

The simply typed lambda calculus also enjoys a property known as strong nor-
malization.

Definition 4.2.10. A term a is weakly normalizing if there exists a finite sequence of


reductions a → . . . → b where b is in normal form, and strongly normalizing if every
sequence of reductions starting from a is finite.

Note that any strongly normalizing term is also weakly normalizing. Variables
are examples of strongly normalizing terms. As an example of a term that is neither
strongly nor weakly normalizing, consider ΩΩ, where Ω is the term λx.xx. ΩΩ is
neither weakly nor strongly normalizing since we have

ΩΩ = (λx.xx)λx.xx → xx[λx.xx/x] = (λx.xx)λx.xx = ΩΩ.

As an example of a term that is weakly but not strongly normalizing, consider


(λz.y)(ΩΩ).
Since Ω contains xx, we know that ΩΩ is not well-typed. In contrast, the well-
typed terms we have encountered so far, λx.x and λxy.x, are both strongly normal-
izing. In fact, the simply typed lambda calculus has the property that all well-typed
terms are strongly normalizable.

Proposition 4.2.11 (Strong normalization). If ⊢ a : A is a valid typing judgment,


then a is strongly normalizing.

4.3 Linearity

We now sketch a version of Girard’s intuitionistic linear logic ([23]). As we shall see
in the next section the use of linear logic in the context of quantum computation is
motivated by the no-cloning property of quantum information.

4.3.1 Contraction, weakening, and strict linearity

Informally, a variable is used linearly if it is used exactly once. In the simply typed
lambda calculus, variables can be used non-linearly. As a first example, consider the
following typing derivation.

x:A⊢x:A x:A⊢x:A
x : A ⊢ ⟨x, x⟩ : A × A

In the above derivation the variable x is used non-linearly, in the sense that only a
single occurrence of x in the context is required to construct the pair ⟨x, x⟩ in which x
occurs twice. This is possible because the two occurrences of the declaration x : A in
the leaves of the typing derivation were implicitly contracted by the application of the
(×i ) rule. As another example, note that the typing judgement x : A, y : B ⊢ y : B
is valid by the (ax) rule. In this case, the variable x is handled non-linearly because
it appears in the context but is not used at all. This second kind of non-linearity is
due to the implicit weakening of the context in the (ax) rule.
It is possible to modify the typing rules of the simply typed lambda calculus to
force variables to be used in a strictly linear fashion. To obtain such a system, we
can replace the (ax) and (∗i ) rules with
    x : A ⊢ x : A        and        ⊢ ∗ : 1.

In the above rules, the typing contexts are minimal, which guarantees that no implicit
weakening can occur. To forbid implicit contractions we need to ensure that contexts
are not merged but juxtaposed in binary rules. This can be achieved in the case of
the (×i ) rule as follows.
Γ1 ⊢ a : A Γ2 ⊢ b : B
Γ1 , Γ2 ⊢ ⟨a, b⟩ : A × B
The above rule carries the side condition that the contexts Γ1 and Γ2 are distinct, so
that the notation Γ1 , Γ2 denotes the disjoint union of the two contexts.
If contexts are removed from all nullary rules and juxtaposed rather than merged
in all binary rules, then we obtain a strictly linear system. In this system, a variable
occurs in the context of a valid typing judgement if and only if it appears exactly
once in the term being typed.

4.3.2 Reintroducing non-linearity

The restrictions imposed by the strictly linear type system sketched above are very
strong. Our interest in Girard’s linear logic is the fact that it allows us to reintroduce
a controlled form of non-linearity. The idea is to use a modality called bang and
denoted ! to identify the variables that can be used non-linearly. To this end, we
extend the grammar of types as follows.

A, B ::= X | 1 | (A × B) | (A → B) | !A

The new type !A consists of all the elements of the type A that can be used non-
linearly. One can think of the elements of type !A as those elements of A that have
the property of being reusable or duplicable.
We can modify the typing rules to account for this new modality. For example,
the (×i ) rule becomes
!∆, Γ1 ⊢ a : A !∆, Γ2 ⊢ b : B .
!∆, Γ1 , Γ2 ⊢ ⟨a, b⟩ : A × B

where the context !∆ denotes a set of declarations of the form x1 : !A1 , . . . xn : !An . In
this new rule, the contracted part of the context consists exclusively of declarations
of the form x : !A. This ensures that the only variables that are used non-linearly are
the ones of a non-linear type.
It should be possible to use a duplicable variable only once. In other words, if a
variable x is declared of type !A, it should also have type A. One way to achieve this
is to equip the type system with a subtyping relation, denoted <:, satisfying !A <: A
for every type A.

4.4 The quantum lambda calculus

Various lambda calculi for quantum computation have appeared in the literature
(e.g., [63], [3]). Here, we focus on the quantum lambda calculus (see [64], [61], or [58])
as it is the main inspiration for the Proto-Quipper language defined and studied in
chapters 8 and 9.
The quantum lambda calculus is based on the QRAM model of quantum com-
putation described in Section 2.2.4. To embody the QRAM model, the reduction
relation of the quantum lambda calculus is defined on closures. These closures are
triples [Q, L, a] where Q is a unit vector in C² ⊗ ⋯ ⊗ C² (n factors), for some integer
n, L is a list of n distinct term variables, and a is a term of the quantum lambda calculus.


The vector Q represents the state of a system of n qubits held in some hypothetical
quantum device. In a well-formed closure, the free variables of a are required to form
a subset of L and the list L is interpreted as a link between the variables of a and the
qubits of Q. This way, the qubits whose state is described by Q become accessible to
the operations of the quantum lambda calculus.

4.4.1 Terms

Definition 4.4.1. The terms of the quantum lambda calculus are defined by
a, b, c ::= x | u | λx.a | ab | ⟨a, b⟩ | ∗
          | let ⟨x, y⟩ = a in b | let rec x y = b in c
          | injl(a) | injr(a) | match a with (x ↦ b | y ↦ c)

where u comes from a set U of quantum constants and x, y come from a countable
set V of variables.

The meaning of most terms is intended to be the standard one, as described in


the previous sections. The term let rec x y = b in c is a recursion operator. The
terms injl(a) and injr(a) denote the left and right inclusion in a disjoint union
respectively. The term match a with (x ↦→ b | y ↦→ c) denotes a case distinction
depending on a. The terms injl(∗) and injr(∗) form a two element set on which
one can perform a case distinction. The classical bits are defined as the elements of
this set, with 0 = injr(∗) and 1 = injl(∗).
The set U contains syntactical representatives of certain operations that can be
executed by the quantum device. In particular, U is assumed to contain the constants
new and meas, whose intended interpretation is as follows.

• The term new represents an initialization function. It inputs a bit (i.e., one of 0
or 1 as defined above) and produces a qubit in the corresponding classical state
(i.e., |0⟩ or |1⟩ respectively).

• The term meas represents a measurement function. It inputs a qubit and mea-
sures it in the computational basis, returning the corresponding bit.

Assume that U contains a constant H representing the Hadamard gate and con-
sider the term coin defined as

coin = λ ∗ . meas(H(new 0)).

This term represents a “fair coin”. When applied to any argument, it will prepare a
qubit in the state |0⟩ and apply a Hadamard gate to it. This results in the superpo-
sition

    (|0⟩ + |1⟩)/√2.

The qubit is then measured, which results in 0 or 1 with equal probability.
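
The behaviour of coin is easy to mimic classically. The following Python sketch (illustrative only; it represents a single qubit by its two amplitudes and makes no attempt to model the QRAM itself) composes the three steps just described.

    import random

    def new(bit):
        # Prepare a fresh qubit in |0> or |1>, as a pair of amplitudes.
        return [1.0, 0.0] if bit == 0 else [0.0, 1.0]

    def hadamard(q):
        a, b = q
        s = 2 ** -0.5
        return [s * (a + b), s * (a - b)]

    def meas(q):
        # Measure in the computational basis, with the Born probabilities.
        return 0 if random.random() < abs(q[0]) ** 2 else 1

    def coin(_):
        return meas(hadamard(new(0)))

    print([coin('*') for _ in range(10)])   # ten fair coin flips, e.g. [0, 1, 1, 0, ...]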

4.4.2 Operational semantics

In the quantum lambda calculus, one must choose a reduction strategy. To see why
this is the case, assume that bplus is a term from the quantum lambda calculus
representing addition modulo 2 and consider the following term

a = (λx.bplus ⟨x, x⟩)(coin ∗).

The term a has two redexes. If the outer redex is reduced first, we obtain the term
bplus ⟨(coin ∗), (coin ∗)⟩, which will reduce to 0 or 1 with equal probability. How-
ever, if we evaluate (coin ∗) first, then a reduces to

(λx.bplus ⟨x, x⟩)0 or (λx.bplus ⟨x, x⟩)1

with equal probability. Either way, the final result of computation will be 0. This ex-
ample shows that confluence fails in the quantum lambda calculus, forcing us to choose
an order of evaluation. In the quantum lambda calculus, evaluation of terms follows
a call-by-value reduction strategy. In particular, this means that when evaluating an
application we reduce the argument before applying the function. To determine when
a term is reduced, we define a notion of value.

Definition 4.4.2. The values of the quantum lambda calculus are defined by

v, w ::= x | u | ∗ | ⟨v, w⟩ | λx.a | injr(v) | injl(v).

Definition 4.4.3. A closure is a triple [Q, L, a] where

• Q is a normalized vector of C² ⊗ ⋯ ⊗ C² (n factors), for some n ≥ 0, called a quantum array,

• L is a list of n distinct term variables, and

• a is a term whose free variables appear in L.

The closure [Q, L, a] is a value if a is a value.

Definition 4.4.4. The one-step β-reduction, written →p , is defined on closures by


the rules given in Figure 4.3. The notation [Q, L, a] →p [Q′ , L′ , a′ ] means that the
reduction takes place with probability p.

[Q, L, a] →p [Q′ , L′ , a′ ] [Q, L, b] →p [Q′ , L′ , b′ ]


[Q, L, av] →p [Q′ , L′ , a′ v] [Q, L, ab] →p [Q′ , L′ , ab′ ]
[Q, L, b] →p [Q′ , L′ , b′ ] [Q, L, a] →p [Q′ , L′ , a′ ]
[Q, L, ⟨a, b⟩] →p [Q′ , L′ , ⟨a, b′ ⟩] [Q, L, ⟨a, v⟩] →p [Q′ , L′ , ⟨a′ , v⟩]
[Q, L, a] →p [Q′ , L′ , a′ ] [Q, L, a] →p [Q′ , L′ , a′ ]
[Q, L, injl(a)] →p [Q′ , L′ , injl(a′ )] [Q, L, injr(a)] →p [Q′ , L′ , injr(a′ )]
[Q, L, a] →p [Q′ , L′ , a′ ]
[Q, L, let ⟨x, y⟩ = a in b] →p [Q′ , L′ , let ⟨x, y⟩ = a′ in b]
[Q, L, a] →p [Q′ , L′ , a′ ]
[Q, L, match a with (x ↦→ b | y ↦→ c)] →p [Q′ , L′ , match a′ with (x ↦→ b | y ↦→ c)]

[Q, L, (λx.a)v] →1 [Q, L, a[v/x]]


[Q, L, let ∗ = ∗ in a] →1 [Q, L, a]
[Q, L, let ⟨x, y⟩ = ⟨v, w⟩ in a] →1 [Q, L, a[v/x, w/y]]
[Q, L, match injl(v) with (x ↦→ b | y ↦→ c)] →1 [Q, L, b[v/x]]
[Q, L, match injr(v) with (x ↦→ b | y ↦→ c)] →1 [Q, L, c[v/y]]
[Q, L, let rec x y = b in c] →1 [Q, L, c[(λy.let rec x y = b in b)/x]]

[Q, L, u⟨xj1 , . . . , xjn ⟩] →1 [Q′ , L, ⟨xj1 , . . . , xjn ⟩]


[α|Q0 ⟩ + β|Q1 ⟩, L, meas(xi )] →|α|2 [|Q0 ⟩, L, 0]

[α|Q0 ⟩ + β|Q1 ⟩, L, meas(xi )] →|β|2 [|Q1 ⟩, L, 1]

[Q, |x1 . . . xn ⟩, new(0)] →1 [Q ⊗ |0⟩, |x1 . . . xn+1 ⟩, xn+1 ]


[Q, |x1 . . . xn ⟩, new(1)] →1 [Q ⊗ |1⟩, |x1 . . . xn+1 ⟩, xn+1 ]

Figure 4.3: Reduction rules for the quantum lambda calculus.



The rules are separated in three groups. The first group contains the congruence
rules. In particular, the rules for the reduction of an application can be seen to define
a call-by-value reduction strategy. The second group of rules contains the classical
rules. These rules define the reduction of redexes that do not involve any of the
constants from the set U . The last group of rules contains the quantum rules. These
rules define the interaction between the classical device and the quantum device. In
the first quantum rule, we have Q′ = u(Q). This rule corresponds to the application
of the unitary u to the relevant qubits. Note that the only probabilistic reduction
step is the one corresponding to measurement.
The chosen reduction strategy guarantees that, at every step of a reduction, only
one rule applies. Hence, unlike the untyped lambda calculus, the quantum lambda
calculus is deterministic.

Proposition 4.4.5. If [Q, L, a] is a closure, then at most one reduction rule applies
to it.

4.4.3 Types

Definition 4.4.6. The types of the quantum lambda calculus are defined by

A, B ::= qubit | 1 | !A | A ⊗ B | A ⊕ B | A ⊸ B.

The type system of the quantum lambda calculus is based on intuitionistic linear
logic as sketched in Section 4.3. The notation is adopted from linear logic, with A⊗B
for the type of pairs, A ⊕ B for the type of sums, and A ⊸ B for the type of functions.
The type qubit represents the set of all 1-qubit states. As in Section 4.3, the type
!A can be understood as the subset of A consisting of values that have the additional
property of being duplicable or reusable. We will sometimes write !ⁿA, with n ∈ N,
to mean

    !⋯!A        (n occurrences of !).

Similarly, we sometimes write A^⊗n to mean

    A ⊗ ⋯ ⊗ A        (n factors).

qubit <: qubit        1 <: 1

from A1 <: B1 and A2 <: B2, infer (A1 ⊗ A2) <: (B1 ⊗ B2)

from A1 <: B1 and A2 <: B2, infer (A1 ⊕ A2) <: (B1 ⊕ B2)

from A2 <: A1 and B1 <: B2, infer (A1 ⊸ B1) <: (A2 ⊸ B2)

from A <: B and (n = 0 ⇒ m = 0), infer !ⁿA <: !ᵐB

Figure 4.4: Subtyping rules for the quantum lambda calculus.

We also write bit for the type 1 ⊕ 1.


The fact that a term is reusable should not prevent us from using it exactly once.
Intuitively, this should imply that if !A is a valid type for a given term, then A should
also be a valid type for it. To capture this idea, we use a subtyping relation on types.

Definition 4.4.7. The subtyping relation <: is the smallest relation on types satis-
fying the rules given in Figure 4.4.

Proposition 4.4.8. The subtyping relation is reflexive and transitive.

To define the type system of the quantum lambda calculus, we first introduce
axioms for the elements of the set U .

Definition 4.4.9. We introduce a type for the constants of U. For new and meas we
set

    A_new = bit ⊸ qubit,        A_meas = qubit ⊸ !bit,

and for the remaining elements v ∈ U we set

    A_v = qubit^⊗n ⊸ qubit^⊗n.

Definition 4.4.10. A typing judgment of the quantum lambda calculus is valid if it


can be inferred from the rules given in Figure 4.5.

There are two rules for the construction of a lambda abstraction. The rule (λ1 ) is
similar to the (λ) rule of the simply typed lambda calculus. However, the produced
function is not duplicable. In contrast, the rule (λ2 ) produces a duplicable function.
The main difference is that in the (λ2 ) rule, the free variables that appear in b must
A <: B !Au <: B


(ax1 ) (ax2 )
Γ, x : A ⊢ x : B Γ⊢u:B
Γ ⊢ a : !n A (⊕ ) Γ ⊢ b : !n B (⊕i2 )
i
Γ ⊢ injl(a) : !n (A ⊕ B) 1
Γ ⊢ injr(b) : !n (A ⊕ B)
Γ2 , !∆, y : !n A ⊢ c : C
Γ1 , !∆ ⊢ a : !n (A ⊕ B) Γ2 , !∆, x : !n A ⊢ b : C
(⊕e )
Γ1 , Γ2 , !∆ ⊢ match a with (x ↦→ b | y ↦→ c) : C
Γ, x : A ⊢ b : B Γ, !∆, x : A ⊢ b : B FV(b) ∩ |Γ| = ∅
(λ1 ) (λ2 )
Γ ⊢ λx.b : A ( B Γ, !∆ ⊢ λx.b : !n+1 (A ( B)
Γ1 , !∆ ⊢ c : A ( B Γ2 , !∆ ⊢ a : A
(app)
Γ1 , Γ2 , !∆ ⊢ ca : B
(∗ )
Γ ⊢ ∗ : !n 1 i
Γ1 , !∆; Q1 ⊢ a : !n A Γ2 , !∆; Q2 ⊢ b : !n B
(⊗i )
Γ1 , Γ2 , !∆ ⊢ ⟨a, b⟩ : !n (A ⊗ B)
Γ1 , !∆ ⊢ b : !n (B1 ⊗ B2 ) Γ2 , !∆, x : !n B1 , y : !n B2 ⊢ a : A
(⊗e )
Γ1 , Γ2 , !∆ ⊢ let ⟨x, y⟩ = b in a : A
!∆, x : !(A ( B), y : A ⊢ b : B Γ, !∆, x : !(A ( B) ⊢ c : C
(rec)
Γ, !∆ ⊢ let rec x y = b in c : C

Figure 4.5: Typing rules for the quantum lambda calculus.

all be of a duplicable type. This prevents b from having any embedded quantum data,
which could not be cloned. Note that the type system prevents us from assigning the
type qubit ⊸ qubit ⊗ qubit to the term λx.⟨x, x⟩.

Definition 4.4.11. A typed closure is an expression of the form

[Q, L, a] : A,

where [Q, L, a] is a closure and A is a type. It is valid if

x1 : qubit, . . . , xn : qubit ⊢ a : A

is a valid typing judgement, with L = |x1 , . . . , xn ⟩.

The quantum lambda calculus is a type safe language, in the sense that it enjoys
the subject reduction and progress properties.

Proposition 4.4.12. If [Q, L, a] : A is a valid typed closure and

[Q, L, a] →p [Q′ , L′ , a′ ],

then [Q′ , L′ , a′ ] : A is a valid typed closure.

Proposition 4.4.13. Let [Q, L, a] be a valid typed closure of type A. Then either
[Q, L, a] is a value, or there is a valid typed closure [Q′ , L′ , a′ ] such that [Q, L, a] →p
[Q′ , L′ , a′ ]. Moreover, the total probability of all possible one-step reductions from
[Q, L, a] is 1.
Chapter 5

Grid problems

In this chapter, we present an efficient method to solve a type of lattice point enumer-
ation problem which we call a grid problem. As a first approximation, a grid problem
can be thought of as follows: given a discrete subset L ⊆ R2 (such as a lattice), which
we call the grid, and a bounded convex subset A ⊆ R2 with non-empty interior, enu-
merate all the points u ∈ A ∩ L. Specifically, we will be interested in grid problems
for which L is a subset of Z[i] or of Z[ω], as defined in Chapter 3. We refer to the
first kind of problem as a grid problem over Z[i] and to the second kind of problem
as a grid problem over Z[ω]. As we shall see in chapters 6 and 7, these problems have
applications in quantum computation. However, they are treated here independently
of any quantum considerations. The results contained in this chapter first appeared
in [54] and [56].
We note that the method presented here is not the only method for solving grid
problems. Alternatively, grid problems over Z[i] or Z[ω] can be reduced to so-called
integer programming problems in some fixed dimension which can be efficiently solved
using the techniques pioneered by Lenstra in [45]. Nevertheless, we believe our method
is novel and interesting.
Even though grid problems over Z[i] are significantly simpler than grid problems
over Z[ω], the overall method remains the same. For this reason, the case of Z[i],
which is treated first, is used as an introduction to the case of Z[ω], which is treated
second.
Let R be one of Z[i] or Z[ω]. As in Chapter 3, we quantify the complexity of our
methods by estimating the number of arithmetic operations required to produce an
element u ∈ A ∩ L, for L ⊆ R. Our algorithms will input bounded convex subsets
A ⊆ R2 (as well as closed intervals [x0 , x1 ] ⊆ R). If we were to give a rigorous
complexity-theoretic account, we should indicate what it means for a subset A of R2
to be “given” as the input to an algorithm. The details of this do not matter much.


For our purposes, it will suffice to assume that a convex set is given along with the
following information.

• A convex polygon enclosing A, say with rational vertices, and such that the
area of the polygon exceeds that of A by at most a fixed constant factor;

• a method to decide, for any given point of R², whether it is in A or not; and

• a method to compute the intersection of A with any straight line in R². More
precisely, given any straight line parameterized as L(t) = p + tq, with p, q ∈ R²,
we can effectively determine the interval {t | L(t) ∈ A} in the sense of the
above.

5.1 Grid problems over Z[i]

Recall from Chapter 3 that the elements of Z[i] are of the form a + bi, with a and b
in Z. We can therefore identify Z[i] with the set Z2 ⊆ R2 . When viewed in this way,
we refer to Z[i] as the grid and to elements u ∈ Z[i] as grid points.

Problem 5.1.1 (Grid problem over Z[i]). Given a bounded convex subset A of R2
with non-empty interior, enumerate all the points u ∈ A ∩ Z[i].

A point u ∈ A ∩ Z[i] is called a solution to the grid problem over Z[i] for A.
Figure 5.1 (a) illustrates a grid problem for which A is a disk centered at the origin.
The grid is shown as black dots and the set A is shown in red.

5.1.1 Upright rectangles

We first consider grid problems over Z[i] for which A is an upright rectangle, i.e., of
the form [x1 , x2 ] × [y1 , y2 ]. These instances are easily solved, as they can be reduced
to a problem in a lower dimension. Indeed, it suffices to independently solve the grid
problem in the x-axis (i.e., by enumerating the integers in [x1 , x2 ]) and in the y-axis
(i.e. by enumerating the integers in the interval [y1 , y2 ]), as illustrated in Figure 5.1 (b).

Proposition 5.1.2. Let A be an upright rectangle. Then there is an algorithm which


enumerates all the solutions to the grid problem over Z[i] for A. Moreover, the algo-
rithm requires only a constant number of arithmetic operations per solution produced.
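
Concretely, the algorithm of Proposition 5.1.2 is just a pair of nested enumerations of integers, as in the following Python sketch (the function name and interface are ours).

    from math import ceil, floor

    def grid_points_in_rectangle(x1, x2, y1, y2):
        # All points of Z[i], identified with Z x Z, lying in [x1, x2] x [y1, y2].
        for a in range(ceil(x1), floor(x2) + 1):
            for b in range(ceil(y1), floor(y2) + 1):
                yield (a, b)

    print(list(grid_points_in_rectangle(-0.5, 1.7, 0.2, 2.3)))
    # [(0, 1), (0, 2), (1, 1), (1, 2)]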


Figure 5.1: (a) A grid problem over Z[i] for which A is a disk of radius 1.5 centered at
the origin. (b) A grid problem over Z[i] for which A is an upright rectangle, together
with the projections of A along the x- and y-axes.

5.1.2 Upright sets

We now generalize the method of the previous subsection to convex sets that are close
to upright rectangles in a suitable sense.

Definition 5.1.3 (Uprightness). Let A be a bounded convex subset of R2 with non-


empty interior. The bounding box of A, denoted BBox(A), is the smallest set of
the form [x1 , x2 ] × [y1 , y2 ] that contains A. The uprightness of A, denoted up(A), is
defined to be the ratio of the area of A to the area of its bounding box:
    up(A) = area(A) / area(BBox(A)).
We say that A is M -upright if up(A) > M .

Proposition 5.1.4. Let A be an M -upright set. Then there exists an algorithm


which enumerates all the solutions to the grid problem over Z[i] for A. Moreover, the
algorithm requires O(1/M ) arithmetic operations per solution produced. In particular,
when M > 0 is fixed, it requires only a constant number of operations per solution.

Proof. By Proposition 5.1.2, we can efficiently enumerate the solutions of the grid
problem for BBox(A). For each such candidate solution u, we only need to check

Figure 5.2: Grid problems over Z[i] for upright and non-upright sets.

whether u is also a solution for A. To establish the efficiency of the algorithm, we


need to ensure that the total number of solutions is not too small in relation to the
total number of candidates produced. To see this, note that, with the exception
of trivial cases, when the number of rows or columns is very small, M -uprightness
and convexity ensure that the proportion of candidates u that are solutions for A
is approximately M : 1. Therefore, the runtime per solution differs from that of
Proposition 5.1.2 by at most a factor of O(1/M ).

Figure 5.2 shows three different examples of grid problems over Z[i]. The sets Ai
are again shown in red, for i = 1, 2, 3, and their bounding boxes are shown in outline.
The typical case of an upright set is A1 . Here, a fixed proportion of grid points from
the bounding box of A1 are elements of A1 . The exceptional case of an upright set is
A2 . Its bounding box spans only two columns of the grid. Therefore, although the
bounding box contains many grid points, A2 does not. However, this case is easily
dealt with by solving the problem in a lower dimension for each of the grid columns
separately. Finally, the set A3 is not upright. In this case, Lemma 5.1.4 is not helpful,
and a priori, it could be a difficult problem to find grid points in A3 .

5.1.3 Grid operators

The method of the previous subsection can be further generalized by using certain
linear transformations to turn non-upright sets into upright sets. The linear trans-
formations that are useful for this purpose are special grid operators.

Definition 5.1.5 (Grid operator). A grid operator is an integer matrix, or equiva-


lently, a linear operator, that maps Z2 to itself. A grid operator G is called special if
it has determinant ±1, in which case G−1 is also a grid operator.

Remark 5.1.6. If A is a subset of R2 and G is a grid operator, then G(A), the direct
image of A, is defined as usual by G(A) = {G(v) ; v ∈ A}.

Remark 5.1.7. The interest in special grid operators lies in the fact that u is a solution
to the grid problem over Z[i] for A if and only if G(u) is a solution to the grid problem
over Z[i] for G(A).

5.1.4 Ellipses

Combining the results of the previous two subsections, we know that the grid problem
over Z[i] for A can be solved efficiently provided that we can find a grid operator G such
that G(A) is sufficiently upright. In this subsection, we show that if A is an ellipse,
then this can always be done.

Definition 5.1.8 (Ellipse). Let D be a positive definite real 2 × 2-matrix with non-
zero determinant, and let p ∈ R2 be a point. The ellipse defined by D and centered
at p is the set
E = {u ∈ R² ; (u − p)†D(u − p) ≤ 1}.

Remark 5.1.9. If G is a grid operator and E is an ellipse centered at the origin


and defined by D, then G(E) is an ellipse centered at the origin and defined by
(G−1 )† DG−1 .

The notion of uprightness introduced above was defined for an arbitrary bounded
convex subset of R2 . If the set in question is an ellipse, we can expand the definition
of uprightness into an explicit expression.

Proposition 5.1.10. Let E be the ellipse defined by D and centered at p, with

    D = ⎡ a  b ⎤
        ⎣ b  d ⎦ .

Then up(E) = π√(det(D)) / (4√(ad)).

Proof. We can compute the area of E and the area of its bounding box using D.
Indeed, we have area(E) = π/√(det(D)) and area(BBox(E)) = 4√(ad)/det(D). Sub-
stituting these in Definition 5.1.3 yields the desired expression for uprightness:

    up(E) = area(E) / area(BBox(E)) = π√(det(D)) / (4√(ad)).

Remark 5.1.11. The uprightness of an ellipse is invariant under translation and scalar
multiplication.
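
The expression of Proposition 5.1.10 is immediate to evaluate; the following Python sketch does so from the entries of D, assuming D is positive definite (the second call uses the quadratic form appearing in Figure 5.3 (c)).

    from math import pi, sqrt

    def uprightness(a, b, d):
        # Uprightness of the ellipse defined by D = [[a, b], [b, d]] (Prop. 5.1.10).
        det = a * d - b * b
        assert a > 0 and det > 0            # D must be positive definite
        return pi * sqrt(det) / (4 * sqrt(a * d))

    print(uprightness(1, 0, 1))             # a circle: pi/4, about 0.785
    print(uprightness(6, 8, 11))            # a skewed ellipse: about 0.137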

Definition 5.1.12 (Skew). The skew of a matrix is the product of its anti-diagonal
entries. The skew of an ellipse defined by D is the skew of D.

By Proposition 5.1.10, the skew of an ellipse is small if and only if its uprightness
is large. Our strategy for increasing the uprightness will therefore be to reduce the
skew.

Proposition 5.1.13. Let E be an ellipse. There exists a grid operator G such that
G(E) is 1/2-upright. Moreover, if E is M -upright, then G can be efficiently computed
in O(log(1/M )) arithmetic operations.

Proof. Let A and B be the following special grid operators

    A = ⎡ 1  1 ⎤ ,        B = ⎡ 1  0 ⎤ ,
        ⎣ 0  1 ⎦              ⎣ 1  1 ⎦

and consider an ellipse E defined by D and centered at p. Since uprightness is


invariant under translation and scaling, we may without loss of generality assume
that E is centered at the origin and that D has determinant 1. Suppose moreover
that the entries of D are as follows:

    D = ⎡ a  b ⎤
        ⎣ b  d ⎦ .

Note that D can be written in this form because it is symmetric. We first show
that there exists a grid operator G such that Skew(G(E)) ≤ 1. Indeed, assume that
Skew(E) = b² > 1. In case a ≤ d, choose n such that |na + b| ≤ a/2. Then we have:

    (Aⁿ)†DAⁿ = ⎡   ⋯      na + b ⎤
               ⎣ na + b      ⋯   ⎦ .

Therefore, using Remark 5.1.9 with G₁ = (Aⁿ)⁻¹, we have:

    Skew(G₁(E)) = (na + b)² ≤ a²/4 ≤ ad/4 = (1 + b²)/4 = (1 + Skew(E))/4
                ≤ 2 Skew(E)/4 = (1/2) Skew(E).

Similarly, in case d < a, choose n such that |nd + b| ≤ d/2. A similar calculation
shows that in this case, with G₁ = (Bⁿ)⁻¹, we get Skew(G₁(E)) ≤ (1/2) Skew(E). In
both cases, the skew of E is reduced by a factor of 2 or more. Applying this process
repeatedly yields a sequence of operators G₁, . . . , Gₘ and letting G = Gₘ · . . . · G₁ we
find that Skew(G(E)) ≤ 1.
Now let D′ be the matrix defining G(E), with entries as follows:

    D′ = ⎡ α  β ⎤
         ⎣ β  δ ⎦ .

Then Skew(G(E)) ≤ 1 implies that β² ≤ 1. Moreover, since A and B are special
grid operators we have det(D′) = αδ − β² = 1. Using the expression from Proposi-
tion 5.1.10 for the uprightness of G(E) we get the desired result:

    up(G(E)) = π√(det(D′)) / (4√(αδ)) = π / (4√(αδ)) = π / (4√(β² + 1)) ≥ π / (4√2) ≥ 1/2.
Finally, to bound the number of arithmetic operations, note that each application
of Gⱼ reduces the skew by at least a factor of 2. Therefore, the number n of grid
operators required satisfies n ≤ log₂(Skew(E)). Now note that since D has determinant
1, we have

    M ≤ up(E) = π / (4√(ad)) = π / (4√(b² + 1)).

Therefore Skew(E) = b² ≤ (π²/16M²) − 1, so that the computation of G requires
O(log(1/M)) arithmetic operations.
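
The proof above is effectively an algorithm, which the following Python sketch implements (illustrative only: it works with floating-point entries, assumes D = [[a, b], [b, d]] is positive definite with determinant 1, and the bookkeeping of G as an integer matrix is ours). Each pass through the loop conjugates D by a power of A or B and at least halves the skew, exactly as in the proof.

    from math import pi, sqrt

    def reduce_skew(a, b, d):
        # Returns a special grid operator G (as an integer matrix) such that the
        # ellipse G(E) is at least 1/2-upright, together with its uprightness.
        # Applying G1 = (A^n)^(-1) replaces D by (A^n)^dagger D A^n, and B likewise.
        G = [[1, 0], [0, 1]]
        while b * b > 1:                       # Skew = b^2
            if a <= d:
                n = round(-b / a)              # ensures |n*a + b| <= a/2
                a, b, d = a, n * a + b, n * n * a + 2 * n * b + d
                G = [[G[0][0] - n * G[1][0], G[0][1] - n * G[1][1]],
                     [G[1][0], G[1][1]]]       # G <- (A^n)^(-1) . G
            else:
                n = round(-b / d)              # ensures |n*d + b| <= d/2
                a, b, d = a + 2 * n * b + n * n * d, n * d + b, d
                G = [[G[0][0], G[0][1]],
                     [G[1][0] - n * G[0][0], G[1][1] - n * G[0][1]]]   # (B^n)^(-1) . G
        return G, pi * sqrt(a * d - b * b) / (4 * sqrt(a * d))

    G, up = reduce_skew(13.0, 8.0, 5.0)        # det = 65 - 64 = 1, skew = 64
    print(G, round(up, 3))                     # [[-3, -2], [2, 1]], 0.785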

5.1.5 The enclosing ellipse of a bounded convex set

The final step in our solution of grid problems over Z[i] is to generalize Proposi-
tion 5.1.13 from ellipses to arbitrary bounded convex sets with non-empty interior.
This can be done because every such set A can be inscribed in an ellipse whose area
is not much greater than that of A, as stated in the following proposition, which was
proved in [56].

Proposition 5.1.14. Let A be a bounded convex subset of R² with non-empty interior.
Then there exists an ellipse E such that A ⊆ E, and such that

    area(E) ≤ (4π / (3√3)) · area(A).                                  (5.1)

Note that 4π/(3√3) ≈ 2.4184. We remark that the bound in Proposition 5.1.14 is
sharp; the bound is attained in case A is an equilateral triangle. In this case, the
enclosing ellipse is a circle, and the ratio of the areas is exactly 4π/(3√3).

5.1.6 General solution to grid problems over Z[i]

We can now describe our algorithm to solve Problem 5.1.1.

Proposition 5.1.15. There is an algorithm which, given a bounded convex subset


A of R2 with non-empty interior, enumerates all solutions of the grid problem over
Z[i] for A. Moreover, if A is M -upright, then the algorithm requires O(log(1/M ))
arithmetic operations overall, plus a constant number of arithmetic operations per
solution produced.

Proof. Given A, with an enclosing ellipse A′ whose area only exceeds that of A by a
fixed constant factor N , use Proposition 5.1.13 to find a grid operator G such that
G(A′ ) is 1/2-upright. Then, use Proposition 5.1.4 to enumerate the grid points of
G(A′ ). For each grid point u found, check whether it belongs to G(A). This is the
case if and only if G−1 (u) is a solution to the grid problem over Z[i] for A.

Remark 5.1.16. Note that the complexity of O(log(1/M )) overall operations in Propo-
sition 5.1.15 is exponentially better than the complexity of O(1/M ) per candidate we
obtained in Proposition 5.1.4. This improvement is entirely due to the use of grid
operators in Proposition 5.1.13.

5.1.7 Scaled grid problems over Z[i]

If A is a convex bounded subset of R2 with non-empty interior, then so is the set


rA, for any non-zero real number r. Hence, by the results of the previous subsection,
we can solve grid problems over Z[i] for rA where r is a non-zero scalar and A is
a bounded convex subset of R2 with non empty interior. We call such a problem a
scaled grid problem over Z[i] for A and r. In Chapter 6, we will be interested in
solving a sequence of such scaled grid problems for specific values of r. The reasons
behind our particular choice of a sequence will be detailed in Chapter 6.
Consider the set of real numbers of the form √2^k √5^ℓ, with k, ℓ ∈ N and 0 ≤ k ≤ 2.
Note that the set {√2^k √5^ℓ}_{k,ℓ} is ordered as a subset of R. In particular, if
√2^k √5^ℓ ≤ √2^{k′} √5^{ℓ′}, then ℓ ≤ ℓ′. When we say that an algorithm “enumerates all
solutions of the scaled grid problem over Z[i] for A and √2^k √5^ℓ in order of increasing
ℓ”, we mean that the algorithm first outputs all solutions for ℓ = 0, then for ℓ = 1,
etc.

Proposition 5.1.17. There is an algorithm which, given a bounded convex subset A
of R² with non-empty interior, enumerates (the infinite sequence of) all solutions of
the scaled grid problem over Z[i] for A and √2^k √5^ℓ in order of increasing ℓ. Moreover,
if A is M -upright, then the algorithm requires O(log(1/M )) arithmetic operations
overall, plus a constant number of arithmetic operations per solution produced.

Proof. This follows from Proposition 5.1.15.



We finish this subsection with some lower bounds on the number of solutions to
scaled grid problems.

Remark 5.1.18. If a bounded convex subset A ⊆ R² contains a circle of radius 1/√5^k,
then the grid problem over Z[i] for A and √2^k √5^ℓ has at least three solutions.

Proposition 5.1.19. Let A be a bounded convex subset of R² with non-empty interior
and assume that the scaled grid problem over Z[i] for A and √2^k √5^ℓ has at least two
distinct solutions. Then for all j ≥ 0, the scaled grid problem over Z[i] for A and
√2^k √5^{ℓ+2j} has at least 5^j + 1 solutions.

Proof. Let u ≠ v be solutions of the scaled grid problem over Z[i] for A and √2^k √5^ℓ.
That is, u, v ∈ (√2^k √5^ℓ A) ∩ Z[i]. For each n = 0, 1, . . . , 5^j, let φ = n/5^j, and
consider u_n = φu + (1 − φ)v. Then u_n is a convex combination of u and v. Since
√2^k √5^ℓ A is convex, it follows that u_n ∈ √2^k √5^ℓ A, so that 5^j u_n is a solution of
the scaled grid problem over Z[i] for A and √2^k √5^{ℓ+2j}, yielding 5^j + 1 distinct such
solutions.

5.2 Grid problems over Z[ω]

We now turn to grid problems where the lattice L is a subset of Z[ω]. Viewing C as
the real plane, we can consider the elements of Z[ω] as points in R2 . However, we
know from Chapter 3 that Z[ω] is dense in C and therefore does not form a lattice
in the plane. To circumvent this issue, we consider subsets of Z[ω] that arise as
the image, under the automorphism (−)• , of the intersection of Z[ω] and a bounded
convex set B ⊆ R² with non-empty interior. We use this notion of grid to formulate
grid problems over Z[ω].

Definition 5.2.1. Let B be a subset of R2 . The (complex) grid for B is the set

Grid(B) = {u ∈ Z[ω] | u• ∈ B}. (5.2)

Remark 5.2.2. We will only be interested in the case where B is a bounded convex
set with non-empty interior. In this case, the grid is discrete and infinite. It is infinite
by the density of Z[ω]: there are infinitely many points u ∈ B ∩ Z[ω], and for each
of them, u• is a grid point. To see that it is discrete, recall from Remark 3.1.5 of
Chapter 3 that for u, v ∈ Z[ω] we have

|u − v| · |u• − v • | > 1.
58


Figure 5.3: The complex grid for three different convex sets B. In each case, the set
B is shown in green and grid points are shown as black dots. (a) B = [−1, 1]2 . (b)
B = {(x, y) | x2 + y 2 6 2}. (c) B = {(x, y) | 6x2 + 16xy + 11y 2 6 2}.

Since the distance between points in B is bounded above, the distance between their
bullets is bounded below.

Figure 5.3 illustrates the complex grids for several different convex sets B. Note
that the grid has a 90-degree symmetry in (a), a 45-degree symmetry in (b), and a
180-degree symmetry in (c).

Problem 5.2.3 (Grid problem over Z[ω]). Given two bounded convex subsets A and
B of R2 with non-empty interior, enumerate all the points u ∈ A ∩ Grid(B).


Figure 5.4: The real grid for two different intervals B. In both cases, the interval B
is shown in green, and grid points are shown as black dots.

Alternatively, grid problems over Z[ω] can be understood as looking for points u in
Z[ω] such that u ∈ A and u• ∈ B. We also refer to the conditions u ∈ A and u• ∈ B
as grid constraints. As before, an element u ∈ A ∩ Grid(B) is called a solution to the
grid problem over Z[ω] for A and B.
We solve grid problems over Z[ω] by reasoning as in the previous section. That
is, we first deal with the case of upright rectangles and then generalize our methods
to arbitrary convex sets using ellipses.

5.2.1 One-dimensional grid problems

A grid problem over Z[i] for an upright rectangle A is solved by considering the x and
y coordinates of the problem independently. To extend this method to grid problems
over Z[ω], we define a one-dimensional analogue of Problem 5.2.3.

Definition 5.2.4. Let B be a subset of R. The (real) grid for B is the set

Grid(B) = {u ∈ Z[√2] | u• ∈ B}. (5.3)

Remark 5.2.5. In the following, we will only be interested in the case where B is a
closed interval [y0 , y1 ] with y0 < y1 . In this case also the grid is discrete and infinite.
Figure 5.4 illustrates the grids for the intervals [−1, 1] and [−3, 3], respectively.

For example, the first few non-negative points in Grid([−1, 1]) are 0, 1, 1 + √2,
2 + √2, 2 + 2√2, 3 + 2√2, and 4 + 3√2. As one would expect, the grid for [−3, 3]
is about three times denser than that for [−1, 1]. We also note that B ⊆ B′ implies
Grid(B) ⊆ Grid(B′).

Problem 5.2.6 (One-dimensional grid problem). Given two subsets A and B of R,
enumerate all the points u ∈ A ∩ Grid(B).

As in the two-dimensional case, Problem 5.2.6 can be equivalently expressed by



the grid constraints u ∈ A and u• ∈ B for u ∈ Z[√2]. In the case where A and B are
finite intervals, the grid problem is guaranteed to have a finite number of solutions.
We recall the following facts from [60].

Lemma 5.2.7. Let A = [x0 , x1 ] and B = [y0 , y1 ] be closed real intervals, such that
x1 − x0 = δ and y1 − y0 = ∆. If δ∆ < 1, then the one-dimensional grid problem for

A and B has at most one solution. If δ∆ ≥ (1 + √2)², then the one-dimensional grid
problem for A and B has at least one solution.

Proof. Lemmas 16 and 17 of [60].

Proposition 5.2.8. Let A = [x0 , x1 ] and B = [y0 , y1 ] be closed real intervals. There
is an algorithm which enumerates all solutions to the one-dimensional grid problem
for A and B. Moreover, the algorithm only requires a constant number of arithmetic
operations per solution produced.

Proof. It was already noted in [60, Lemma 17] that there is an efficient algorithm for
computing one solution. To see that we can efficiently enumerate all solutions, let
δ = x1 − x0 and ∆ = y1 − y0 as before. Recall from Definition 3.1.3 of Chapter 3 that
λ = √2 + 1 and that λ⁻¹ = −λ•. The grid problem for the sets A and B is equivalent
to the grid problem for λ⁻¹A and −λB, because u ∈ A and u• ∈ B hold if and only
if λ⁻¹u ∈ λ⁻¹A and (λ⁻¹u)• ∈ −λB. Using such rescaling, we may without loss of
generality assume that λ⁻¹ ≤ δ < 1.
Now consider any solution u = a + b√2 ∈ Z[√2]. From u ∈ [x0, x1], we know that
x0 − b√2 ≤ a ≤ x1 − b√2. But since x1 − x0 < 1, it follows that for any b ∈ Z, there
is at most one a ∈ Z yielding a solution. Moreover, we note that b = (u − u•)/(2√2),
so that any solution satisfies (x0 − y1)/(2√2) ≤ b ≤ (x1 − y0)/(2√2). The algorithm then
proceeds by enumerating all the integers b in the interval [(x0 − y1)/(2√2), (x1 − y0)/(2√2)].
For each such b, find the unique integer a (if any) in the interval [x0 − b√2, x1 − b√2].
Finally, check if a + b√2 is a solution. The runtime is governed by the number of
b ∈ Z that need to be checked, of which there are at most O(y1 − y0) = O(δ∆)
(this uses the fact that δ ≥ λ⁻¹ after rescaling). As a consequence of Lemma 5.2.7,
the total number of solutions is at least Ω(δ∆), and so the algorithm is efficient.
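To make the enumeration concrete, here is a small Python sketch of the procedure just described. The function name is ours, floating-point arithmetic is used for simplicity, and the rescaling step is omitted (an actual implementation would use exact arithmetic over Z[√2]):

from math import sqrt, ceil, floor

SQRT2 = sqrt(2)

def one_dim_grid_solutions(x0, x1, y0, y1):
    """Enumerate all u = a + b*sqrt(2) in Z[sqrt(2)] with u in [x0, x1]
    and u_bullet = a - b*sqrt(2) in [y0, y1].  Returns pairs (a, b)."""
    solutions = []
    # Any solution satisfies (x0 - y1)/(2*sqrt(2)) <= b <= (x1 - y0)/(2*sqrt(2)).
    b_lo = ceil((x0 - y1) / (2 * SQRT2))
    b_hi = floor((x1 - y0) / (2 * SQRT2))
    for b in range(b_lo, b_hi + 1):
        # For each b, the candidates for a lie in [x0 - b*sqrt(2), x1 - b*sqrt(2)].
        for a in range(ceil(x0 - b * SQRT2), floor(x1 - b * SQRT2) + 1):
            if y0 <= a - b * SQRT2 <= y1:      # the bullet constraint
                solutions.append((a, b))
    return solutions

# Example: the points of Grid([-1, 1]) lying in [0, 5], as listed above.
print(sorted(a + b * SQRT2 for a, b in one_dim_grid_solutions(0, 5, -1, 1)))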

5.2.2 Upright rectangles and upright sets

Recall, from Proposition 3.1.2 of Chapter 3, that Z[ω] can be seen as two disjoint
copies of Z[√2] × Z[√2]. We use this fact to reduce two-dimensional grid problems over
Z[ω] for upright rectangles A and B to independent one-dimensional grid problems.

Proposition 5.2.9. Let A and B be upright rectangles. Then there is an algorithm
which enumerates all the solutions to the grid problem over Z[ω] for A and B. Moreover,
the algorithm requires only a constant number of arithmetic operations per solution
produced.

Proof. By assumption, A = Ax × Ay and B = Bx × By, where Ax, Ay, Bx, and
By are closed intervals. By Proposition 3.1.2, any potential solution is of the form
u = a + bi or u = a + bi + ω, where a, b ∈ Z[√2]. When u = a + bi, then u• = a• + b•i.
Therefore, the two-dimensional grid constraints u ∈ A and u• ∈ B are equivalent to
the one-dimensional constraints a ∈ Ax, a• ∈ Bx and b ∈ Ay, b• ∈ By. On the other
hand, when u = a + bi + ω, let v = u − ω = a + bi. Then v• = u• + ω, and the
constraints u ∈ A and u• ∈ B are equivalent to v ∈ A − ω and v• ∈ B + ω, which
reduces to the one-dimensional constraints a ∈ Ax − 1/√2, a• ∈ Bx + 1/√2 and
b ∈ Ay − 1/√2, b• ∈ By + 1/√2. In both cases, the solutions to the one-dimensional
constraints can be efficiently enumerated by Proposition 5.2.8.
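Continuing the sketch above, this reduction can be phrased directly in code. The following hypothetical helper (again a floating-point sketch, not the thesis implementation) combines two one-dimensional problems per coset, reusing one_dim_grid_solutions from the previous sketch:

def upright_grid_solutions(Ax, Ay, Bx, By):
    """Solutions in Z[omega] of the grid problem for the upright rectangles
    A = Ax x Ay and B = Bx x By, returned as complex floats."""
    s = 1 / SQRT2
    omega = complex(s, s)
    sols = []
    # Coset u = a + b*i with a, b in Z[sqrt(2)]
    for (ax, bx) in one_dim_grid_solutions(Ax[0], Ax[1], Bx[0], Bx[1]):
        for (ay, by) in one_dim_grid_solutions(Ay[0], Ay[1], By[0], By[1]):
            sols.append(complex(ax + bx * SQRT2, ay + by * SQRT2))
    # Coset u = a + b*i + omega: shift the constraints by 1/sqrt(2) as in the proof
    for (ax, bx) in one_dim_grid_solutions(Ax[0] - s, Ax[1] - s, Bx[0] + s, Bx[1] + s):
        for (ay, by) in one_dim_grid_solutions(Ay[0] - s, Ay[1] - s, By[0] + s, By[1] + s):
            sols.append(complex(ax + bx * SQRT2, ay + by * SQRT2) + omega)
    return sols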

Now that we are able to solve grid problems over Z[ω] for upright rectangles A and
B, we can reason as in Subsection 5.1.2 to establish the following proposition.

Proposition 5.2.10. Let A, B be a pair of M -upright sets. Then there exists an


algorithm which enumerates all the solutions to the grid problem over Z[ω] for A
and B. Moreover, the algorithm requires O(1/M 2 ) arithmetic operations per solution
produced. In particular, when M > 0 is fixed, it requires only a constant number of
operations per solution.

5.2.3 Grid operators

We now adapt the notion of grid operator from Subsection 5.1.3 to the setting of grid
problems over Z[ω].

Definition 5.2.11. We regard Z[ω] as a subset of R2 . A real linear operator G :


R2 → R2 is called a grid operator if G(Z[ω]) ⊆ Z[ω]. Moreover, a grid operator G is
called special if it has determinant ±1.

Grid operators are characterized by the following lemma.

Lemma 5.2.12. Let G : R² → R² be a linear operator, which we can identify with a
real 2 × 2-matrix. Then G is a grid operator if and only if it is of the form

G = \begin{bmatrix} a + a'/\sqrt{2} & b + b'/\sqrt{2} \\ c + c'/\sqrt{2} & d + d'/\sqrt{2} \end{bmatrix},   (5.4)

where a, b, c, d, a′, b′, c′, d′ are integers satisfying a + b + c + d ≡ 0 (mod 2) and
a′ ≡ b′ ≡ c′ ≡ d′ (mod 2).

Proof. By Proposition 3.1.2 from Chapter 3, we know that a vector u ∈ R² is in Z[ω]
if and only if it can be written in the form

u = \begin{bmatrix} x_1 + x_2/\sqrt{2} \\ y_1 + y_2/\sqrt{2} \end{bmatrix},   (5.5)

where x1, x2, y1, y2 are integers and x2 ≡ y2 (mod 2). It can then be shown by
computation that every operator of the form (5.4) is a grid operator. For the converse,
consider an arbitrary grid operator G. We prove the claim by applying G to the three
points [1, 0]ᵀ, [0, 1]ᵀ, and (1/√2)[1, 1]ᵀ ∈ Z[ω]. From G[1, 0]ᵀ ∈ Z[ω] and G[0, 1]ᵀ ∈ Z[ω],
it follows that the columns of G are of the form (5.5), so that G is of the form (5.4),
with integers a, b, c, d, a′, b′, c′, d′ satisfying a′ ≡ c′ (mod 2) and b′ ≡ d′ (mod 2).
Moreover, we have

G \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} (a'+b')/2 + (a+b)/\sqrt{2} \\ (c'+d')/2 + (c+d)/\sqrt{2} \end{bmatrix} ∈ Z[ω],

which implies a + b ≡ c + d (mod 2) and a′ + b′ ≡ c′ + d′ ≡ 0 (mod 2). Together,
these conditions imply a + b + c + d ≡ 0 (mod 2) and a′ ≡ b′ ≡ c′ ≡ d′ (mod 2), as
claimed.

Remark 5.2.13. The composition of two (special) grid operators is again a (special)
grid operator. If G is a special grid operator, then G is invertible and G−1 is a
special grid operator. If G is a (special) grid operator, then G• is a (special) grid
operator, defined by applying (−)• separately to each matrix entry, and satisfying
G• u• = (Gu)• .


Figure 5.5: (a) The grid problem over Z[ω] for two sets A and B. (b) The grid
problem over Z[ω] for G(A) and G• (B). Note that the solutions of (a), which are the
grid points in the set A, are in one-to-one correspondence with the solutions of (b),
which are the grid points in the set G(A).

The interest of special grid operators lies in the following fact.

Proposition 5.2.14. Let G be a special grid operator, and let A and B be subsets of
R2 . Define
G(A) = {Gu | u ∈ A},
G• (B) = {G• u | u ∈ B}.
Then u ∈ Z[ω] is a solution to the grid problem over Z[ω] for A and B if and only if
Gu is a solution to the grid problem over Z[ω] for G(A) and G• (B). In particular, the
grid problem over Z[ω] for A and B is computationally equivalent to that for G(A)
and G• (B).

Proof. Let u ∈ Z[ω]. Then u is a solution to the grid problem for A and B if and
only if u ∈ A and u• ∈ B, if and only if Gu ∈ G(A) and G• u• = (Gu)• ∈ G• (B), if
and only if Gu is a solution to the grid problem for G(A) and G• (B).

Figure 5.5(a) illustrates the grid problem for a pair of sets A and B. As before,
the set B is shown in green, and Grid(B) is shown as black dots. The set A is shown
in red, and the solutions to the grid problem are the seven grid points that lie in A.

Figure 5.5(b) shows the grid problem for the sets G(A) and G•(B), where G is the
special grid operator

G = \begin{bmatrix} 1 & \sqrt{2} \\ 0 & 1 \end{bmatrix}.
Note that, as predicted by Proposition 5.2.14, the solutions of the transformed grid
problem are in one-to-one correspondence with those of the original problem; namely,
in each case, there are seven solutions.

5.2.4 Ellipses

Proceeding as in Subsection 5.1.4, we now prove that if A and B are two ellipses, then
the grid problem for A and B over Z[ω] can be solved efficiently. For this, we show
that one can find a grid operator G such that G(A) and G•(B) are both sufficiently
upright. Indeed, we know from Proposition 5.2.10 that if both A and B are upright
sets, then the grid problem over Z[ω] for A and B can be solved efficiently. The fact
that the two ellipses have to be made simultaneously upright makes the problem of
finding an appropriate grid operator significantly more complicated.
We start by reformulating the problem in more convenient terms. Recall from
Definition 3.1.3 of Chapter 3 that λ = √2 + 1. The matrix D corresponding to an
ellipse E therefore has determinant 1 if and only if it can be written in the form

D = \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix}   (5.6)

for some b, e, z ∈ R with e > 0 and e² = b² + 1. As established in Proposition 5.1.10,
the definition of uprightness simplifies in this case to

up(E) = π/(4e) = π/(4√(b² + 1)).   (5.7)

Equivalently, if up(E) = M, then

b² = π²/(16M²) − 1.   (5.8)

Since we now have to deal with pairs of ellipses, it is convenient to introduce the
following terminology for discussing pairs of matrices.

Definition 5.2.15. A state is a pair of real symmetric positive definite matrices of
determinant 1. Given a state (D, ∆) with

D = \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix}, \qquad ∆ = \begin{bmatrix} \varepsilon\lambda^{-\zeta} & \beta \\ \beta & \varepsilon\lambda^{\zeta} \end{bmatrix},   (5.9)

we define its skew as Skew(D, ∆) = b² + β² and its bias as Bias(D, ∆) = ζ − z.

Note that the skew of a state is small if and only if both b2 and β 2 are small, which
happens, by (5.7), if and only if the ellipses corresponding to D and ∆ both have
large uprightness. So our strategy for increasing the uprightness will be to reduce the
skew, as in Subsection 5.1.4. In what follows, we use (D, ∆) to denote an arbitrary
state and always assume that the entries of D and ∆ are given as in (5.9). For future
reference, we record here another useful property of states.
Remark 5.2.16. If (D, ∆) is a state with b ≥ 0, then −be ≤ −b². Indeed:

e² = b² + 1 ⇒ e² > b² ⇒ e > b ⇒ −be ≤ −b².

Similarly, if b ≤ 0, then be ≤ −b². Analogous inequalities also hold for β and ε.


The action of a grid operator on an ellipse can be adapted to states in a natural
way, provided that the operator is special.

Definition 5.2.17. The action of special grid operators on states is defined as follows.
Here, G† denotes the transpose of G, and G• is defined by applying (−)• separately
to each matrix entry, as in Remark 5.2.13.

(D, ∆) · G = (G† DG, G•† ∆G• ).

Lemma 5.2.18. Let (D, ∆) be a state, and let A and B be the ellipses centered at the
origin that are defined by D and ∆, respectively. Then the ellipses G(A) and G• (B)
are defined by the matrices D′ and ∆′ , where

(D′ , ∆′ ) = (D, ∆) · G−1

Proof. We have

G(A) = {G(u) ∈ R² | u†Du ≤ 1}
     = {v ∈ R² | (G⁻¹v)†D(G⁻¹v) ≤ 1}
     = {v ∈ R² | v†(G⁻¹)†DG⁻¹v ≤ 1},

so the ellipse G(A) is defined by the positive operator D′ = (G⁻¹)†DG⁻¹. The proof
for G•(B) is similar.
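The following Python sketch (using NumPy; all names are ours) illustrates Definition 5.2.15, Definition 5.2.17 and Lemma 5.2.18 by building a state from its parameters and applying the action of a special grid operator:

import numpy as np

LAMBDA = 1 + np.sqrt(2)

def state(b, z, beta, zeta):
    """Build the pair (D, Delta) of Definition 5.2.15; e and epsilon are
    determined by e^2 = b^2 + 1 and epsilon^2 = beta^2 + 1."""
    e, eps = np.sqrt(b * b + 1), np.sqrt(beta * beta + 1)
    D = np.array([[e * LAMBDA ** (-z), b], [b, e * LAMBDA ** z]])
    Delta = np.array([[eps * LAMBDA ** (-zeta), beta], [beta, eps * LAMBDA ** zeta]])
    return D, Delta

def skew(D, Delta):
    return D[0, 1] ** 2 + Delta[0, 1] ** 2

def act(D, Delta, G, G_bullet):
    """The action (D, Delta) . G of Definition 5.2.17; G_bullet is the operator
    obtained from G by replacing sqrt(2) with -sqrt(2) in each entry."""
    return G.T @ D @ G, G_bullet.T @ Delta @ G_bullet

# Example: the special grid operator G = [[1, sqrt(2)], [0, 1]] of Figure 5.5,
# whose bullet is [[1, -sqrt(2)], [0, 1]].
D, Delta = state(b=2.0, z=0.1, beta=1.5, zeta=-0.2)
G = np.array([[1, np.sqrt(2)], [0, 1]])
G_bullet = np.array([[1, -np.sqrt(2)], [0, 1]])
print(skew(D, Delta), skew(*act(D, Delta, G, G_bullet)))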

The main ingredient in our proof that states can be made upright is the following
Step Lemma.

Lemma 5.2.19 (Step Lemma). For any state (D, ∆), if Skew(D, ∆) > 15, then
there exists a special grid operator G such that Skew((D, ∆) · G) 6 0.9 Skew(D, ∆).
Moreover, G can be computed using a constant number of arithmetic operations.

Before proving the Step Lemma, we show how it can be used to derive the following
proposition.

Proposition 5.2.20. Let A and B be ellipses. Then there exists a grid operator G
such that G(A) and G• (B) are 1/6-upright. Moreover, if A and B are M -upright,
then G can be efficiently computed in O(log(1/M )) arithmetic operations.

Proof. Let D and ∆ be the matrices defining A and B respectively, in the sense of
Definition 5.1.8. Since uprightness is invariant under translations and scaling, we may
without loss of generality assume that both ellipses are centered at the origin, and
that det D = det ∆ = 1.
The pair (D, ∆) is a state. By applying Lemma 5.2.19 repeatedly, we get grid
operators G1 , . . . , Gn such that:

Skew((D, ∆) · G1 . . . Gn ) 6 15. (5.10)

Now let (D′ , ∆′ ) = (D, ∆) · G1 . . . Gn and set G = (G1 · · · Gn )−1 . By Lemma 5.2.18,
the ellipses G(A) and G• (B) are defined by the matrices D′ and ∆′ , respectively. Let
b and β be the anti-diagonal entries of the matrices D′ and ∆′ , respectively. We have:

b2 + β 2 = Skew(D′ , ∆′ ) = Skew((D, ∆) · G−1 ) = Skew((D, ∆) · G1 . . . Gn ) 6 15,

hence b² ≤ 15 and β² ≤ 15. Using (5.7), we get

up(G(A)) = π/(4√(b² + 1)) ≥ π/(4√16) > 1/6

and similarly up(G•(B)) > 1/6, as desired.


R = (1/√2) \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \qquad A = \begin{bmatrix} 1 & -2 \\ 0 & 1 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & \sqrt{2} \\ 0 & 1 \end{bmatrix}

K = (1/√2) \begin{bmatrix} -\lambda^{-1} & -1 \\ \lambda & 1 \end{bmatrix} \qquad X = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad Z = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}

Figure 5.6: The grid operators R, A, B, X, K, and Z.

To bound the number of operations, note that each application of Gj reduces
the skew by at least 10 percent. Therefore, the number n in (5.10) satisfies
n ≤ log_{0.9}(15/Skew(D, ∆)) = O(log(Skew(D, ∆))). Using (5.8), we have

log(Skew(D, ∆)) = log(b² + β²) ≤ log((π²/(16M²) − 1) + (π²/(16M²) − 1)) = O(log(1/M)).

It follows that the computation of G requires O(log(1/M)) applications of the Step
Lemma, each of which requires a constant number of arithmetic operations, proving
the final claim of the proposition.
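Reusing the helpers from the sketch above, the reduction loop of this proof can be outlined as follows. Here step_operator is a hypothetical function assumed to realize the Step Lemma by returning a special grid operator and its bullet; as in the proof, the operator to be applied to the ellipses is the inverse of the accumulated product:

def make_upright(D, Delta, step_operator):
    """Repeatedly apply the Step Lemma until the skew is at most 15.
    Returns the accumulated product G1...Gn, its bullet, and the final state."""
    G_total = np.eye(2)
    G_total_bullet = np.eye(2)
    while skew(D, Delta) > 15:
        G, G_bullet = step_operator(D, Delta)     # hypothetical Step Lemma oracle
        D, Delta = act(D, Delta, G, G_bullet)
        G_total = G_total @ G
        G_total_bullet = G_total_bullet @ G_bullet
    return G_total, G_total_bullet, D, Delta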

The remainder of this subsection is devoted to proving the Step Lemma. To each
state, we associate the pair (z, ζ). The proof of the Step Lemma is essentially a case
distinction on the location of the pair (z, ζ) in the plane. We find coverings of the plane
with the property that if the point (z, ζ) belongs to some region O of our covering,
then it is easy to compute a special grid operator G such that Skew((D, ∆) · G) 6
0.9 Skew(D, ∆). The relevant grid operators are given in Figure 5.6.
Each one of the next five subsections is dedicated to a particular region of the
plane.

The Shift Lemma

In this section, we consider states (D, ∆) such that |Bias(D, ∆)| > 1. Any such state
can be “shifted” to a state (D′ , ∆′ ) of equal skew but with |Bias(D′ , ∆′ )| 6 1.

Definition 5.2.21. The shift operators σ and τ are defined by:

σ = √(λ⁻¹) \begin{bmatrix} \lambda & 0 \\ 0 & 1 \end{bmatrix}, \qquad τ = √(λ⁻¹) \begin{bmatrix} 1 & 0 \\ 0 & -\lambda \end{bmatrix}.

Even though σ and τ are not grid operators, we can use them to define an operation
on states called a shift by k. By abuse of notation, we write this operation as an action.

Definition 5.2.22. Given a state (D, ∆) and k ∈ Z, the k-shift of (D, ∆) is defined
as:
(D, ∆) · Shiftk = (σ k Dσ k , τ k ∆τ k ).

The notation (D, ∆) · Shiftk is justified by the following lemma.

Lemma 5.2.23. The shift of a state is a state and moreover:

Skew((D, ∆) · Shiftk ) = Skew(D, ∆) and Bias((D, ∆) · Shiftk ) = Bias(D, ∆) + 2k

Proof. Compute (D, ∆) · Shiftk:

(D, ∆) · Shiftk = (σ^k D σ^k, τ^k ∆ τ^k)
= \left(σ^k \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix} σ^k,\; τ^k \begin{bmatrix} \varepsilon\lambda^{-\zeta} & \beta \\ \beta & \varepsilon\lambda^{\zeta} \end{bmatrix} τ^k\right)
= \left(\begin{bmatrix} e\lambda^{-z+k} & b \\ b & e\lambda^{z-k} \end{bmatrix},\; \begin{bmatrix} \varepsilon\lambda^{-\zeta-k} & (-1)^k\beta \\ (-1)^k\beta & \varepsilon\lambda^{\zeta+k} \end{bmatrix}\right).

The resulting matrices are clearly symmetric and positive definite. Moreover, since
σ^k and τ^k have determinant ±1, both σ^k D σ^k and τ^k ∆ τ^k have determinant 1. Finally:

• Skew((D, ∆) · Shiftk) = b² + ((−1)^k β)² = b² + β² = Skew(D, ∆) and

• Bias((D, ∆) · Shiftk) = (ζ + k) − (z − k) = Bias(D, ∆) + 2k,

which completes the proof.
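In terms of the parameters (b, z, β, ζ) of a state, the k-shift therefore acts as follows (a small sketch; names are ours):

def shift(b, z, beta, zeta, k):
    """Parameters of (D, Delta) . Shift_k, following Lemma 5.2.23."""
    return b, z - k, (-1) ** k * beta, zeta + k

# The skew b^2 + beta^2 is unchanged and the bias zeta - z increases by 2k:
b, z, beta, zeta = 3.0, 1.7, -2.0, -2.3      # bias = -4
print(shift(b, z, beta, zeta, 2))            # bias of the result is 0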

For every special grid operator G, there is a special grid operator G′ whose action
on a state corresponds to shifting the state by k, applying G and then shifting the
state by −k.

Lemma 5.2.24. If G is a special grid operator and k ∈ Z, then G′ = σ k Gσ k is a


special grid operator and moreover G′• = (−τ )k G• τ k .
Proof. It suffices to show this for k = 1. Suppose G = \begin{bmatrix} w & x \\ y & z \end{bmatrix} is a special grid
operator and note that:

G′ = σGσ = \begin{bmatrix} \lambda w & x \\ y & \lambda^{-1} z \end{bmatrix} = \begin{bmatrix} \lambda^{-1} & 0 \\ 0 & \lambda^{-1} \end{bmatrix} \begin{bmatrix} \lambda & 0 \\ 0 & 1 \end{bmatrix} G \begin{bmatrix} \lambda & 0 \\ 0 & 1 \end{bmatrix}.

Since all the factors in the above product are grid operators, the result is also a grid
operator. Moreover, det(σGσ) = det(G) = ±1, so that σGσ is special. Finally:

G′• = (σGσ)• = \begin{bmatrix} \lambda^{\bullet} w^{\bullet} & x^{\bullet} \\ y^{\bullet} & (\lambda^{-1})^{\bullet} z^{\bullet} \end{bmatrix} = \begin{bmatrix} -\lambda^{-1} w^{\bullet} & x^{\bullet} \\ y^{\bullet} & -\lambda z^{\bullet} \end{bmatrix} = -\tau G^{\bullet} \tau.

Lemma 5.2.25. If G is a grid operator, then:

(((D, ∆) · Shiftk ) · G) · Shiftk = (D, ∆) · (σ k Gσ k ).

Proof. Write G′ = σ k Gσ k . Simple computation then yields the result:

(((D, ∆) · Shiftk ) · G) · Shiftk = ((σ k Dσ k , τ k ∆τ k ) · G) · Shiftk


= (G† σ k Dσ k G, G•† τ k ∆τ k G• ) · Shiftk
= (σ k G† σ k Dσ k Gσ k , τ k G•† τ k ∆τ k G• τ k )
= (σ k G† σ k Dσ k Gσ k , ((−τ )k G•† τ k )∆((−τ )k G• τ k ))
= (G′† DG′ , G′•† ∆G′ • )
= (D, ∆) · G′
= (D, ∆) · (σ k Gσ k ).

Shifts allow us to consider only states (D, ∆) with Bias(D, ∆) ∈ [−1, 1] in the
proof of the Step Lemma.

Lemma 5.2.26. If the Step Lemma holds for all states (D, ∆) with Bias(D, ∆) ∈
[−1, 1], then it holds for all states.

Proof. Let (D, ∆) be some state with Skew(D, ∆) ≥ 15. Let x = Bias(D, ∆) and
set k = ⌊(1 − x)/2⌋. Then by Lemma 5.2.23, we have Skew((D, ∆) · Shiftk) = Skew(D, ∆)

and Bias((D, ∆) · Shiftk ) ∈ [−1, 1]. Then by assumption, there exists a special grid
operator G such that Skew(((D, ∆) · Shiftk ) · G) 6 0.9 Skew((D, ∆) · Shiftk ). Now by
Lemma 5.2.24 we know that G′ = σ k G σ k is a special grid operator. Moreover, by
Lemma 5.2.25 and 5.2.23, we have:

Skew((D, ∆) · G′ ) = Skew((((D, ∆) · Shiftk ) · G) · Shiftk )


= Skew(((D, ∆) · Shiftk ) · G)
6 0.9 Skew((D, ∆) · Shiftk )
= 0.9 Skew(D, ∆),
which completes the proof.

The R Lemma

Definition 5.2.27. The hyperbolic sine in base λ is defined as:

sinh_λ(x) = (λ^x − λ^{-x})/2.
Lemma 5.2.28. Recall the operator R from Figure 5.6. If (D, ∆) is a state such that
Skew(D, ∆) ≥ 15, and such that −0.8 ≤ z ≤ 0.8 and −0.8 ≤ ζ ≤ 0.8, then:

Skew((D, ∆) · R) ≤ 0.9 Skew(D, ∆).

Proof. Compute the action of R on (D, ∆):

R†DR = (1/2) \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} \ldots & e\,sinh_λ(z) \\ e\,sinh_λ(z) & \ldots \end{bmatrix},

R•†∆R• = (1/2) \begin{bmatrix} -1 & -1 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \varepsilon\lambda^{-\zeta} & \beta \\ \beta & \varepsilon\lambda^{\zeta} \end{bmatrix} \begin{bmatrix} -1 & 1 \\ -1 & -1 \end{bmatrix} = \begin{bmatrix} \ldots & \varepsilon\,sinh_λ(\zeta) \\ \varepsilon\,sinh_λ(\zeta) & \ldots \end{bmatrix}.

Therefore Skew((D, ∆) · R) = e² sinh²_λ(z) + ε² sinh²_λ(ζ). But recall that e² = b² + 1
and ε² = β² + 1, so that in fact:

Skew((D, ∆) · R) = (b² + 1) sinh²_λ(z) + (β² + 1) sinh²_λ(ζ).

We assumed −0.8 ≤ z, ζ ≤ 0.8, and this implies that sinh²_λ(z), sinh²_λ(ζ) ≤ sinh²_λ(0.8).
Writing y = sinh²_λ(0.8) for brevity, and using the assumption that Skew(D, ∆) ≥ 15,
we get:

Skew((D, ∆) · R) = (b² + 1) sinh²_λ(z) + (β² + 1) sinh²_λ(ζ)
≤ (b² + 1)y + (β² + 1)y
= (b² + β² + 2)y
≤ Skew(D, ∆)(1 + 2/15)y.

This completes the proof, since (1 + 2/15)y = (1 + 2/15) sinh²_λ(0.8) ≈ 0.663 ≤ 0.9.
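As a quick numerical sanity check of this lemma (a sketch reusing the state, skew and act helpers from the earlier sketch; the sample values are arbitrary):

R = np.array([[1, -1], [1, 1]]) / np.sqrt(2)
R_bullet = -R                      # the entries 1/sqrt(2) map to -1/sqrt(2)

# A state with skew 25 >= 15 and |z|, |zeta| <= 0.8:
D, Delta = state(b=4.0, z=0.5, beta=3.0, zeta=0.3)
new_skew = skew(*act(D, Delta, R, R_bullet))
print(skew(D, Delta), new_skew, new_skew <= 0.9 * skew(D, Delta))   # last value: True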

The K Lemma

Definition 5.2.29. The hyperbolic cosine in base λ is defined as:

cosh_λ(x) = (λ^x + λ^{-x})/2.

Lemma 5.2.30. Recall the operator K from Figure 5.6. If (D, ∆) is a state such
that Bias(D, ∆) ∈ [−1, 1], Skew(D, ∆) > 15, and such that b, β > 0, z 6 0.3, and
0.8 6 ζ, then:

Skew((D, ∆) · K) 6 0.9 Skew(D, ∆).

Proof. Compute the action of K on (D, ∆):

K†DK = (1/2) \begin{bmatrix} -\lambda^{-1} & \lambda \\ -1 & 1 \end{bmatrix} \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix} \begin{bmatrix} -\lambda^{-1} & -1 \\ \lambda & 1 \end{bmatrix} = \begin{bmatrix} \ldots & e\,cosh_λ(z+1) − \sqrt{2}\,b \\ e\,cosh_λ(z+1) − \sqrt{2}\,b & \ldots \end{bmatrix},

K•†∆K• = (1/2) \begin{bmatrix} \lambda & -\lambda^{-1} \\ -1 & 1 \end{bmatrix} \begin{bmatrix} \varepsilon\lambda^{-\zeta} & \beta \\ \beta & \varepsilon\lambda^{\zeta} \end{bmatrix} \begin{bmatrix} \lambda & -1 \\ -\lambda^{-1} & 1 \end{bmatrix} = \begin{bmatrix} \ldots & \sqrt{2}\,\beta − \varepsilon\,cosh_λ(\zeta−1) \\ \sqrt{2}\,\beta − \varepsilon\,cosh_λ(\zeta−1) & \ldots \end{bmatrix}.

Therefore:

Skew((D, ∆) · K) = (√2 b − e cosh_λ(z + 1))² + (√2 β − ε cosh_λ(ζ − 1))².   (5.11)

But recall that e² = b² + 1, and from Remark 5.2.16 that b ≥ 0 implies −be ≤ −b², so:

(√2 b − e cosh_λ(z + 1))²
= 2b² − 2√2 be cosh_λ(z + 1) + e² cosh²_λ(z + 1)
≤ 2b² − 2√2 b² cosh_λ(z + 1) + (b² + 1) cosh²_λ(z + 1)
= b²(2 − 2√2 cosh_λ(z + 1) + cosh²_λ(z + 1)) + cosh²_λ(z + 1)
= b²(√2 − cosh_λ(z + 1))² + cosh²_λ(z + 1).   (5.12)

Reasoning analogously, we also have

(√2 β − ε cosh_λ(ζ − 1))² ≤ β²(√2 − cosh_λ(ζ − 1))² + cosh²_λ(ζ − 1).   (5.13)

By assumption, Bias(D, ∆) ∈ [−1, 1], thus ζ ≤ z + 1. This, together with the
assumptions 0.8 ≤ ζ and z ≤ 0.3, implies that both z + 1 and ζ − 1 are in the interval
[−0.2, 1.3]. On this interval, the function cosh²_λ(x) assumes its maximum at x = 1.3,
and the function f(x) = (√2 − cosh_λ(x))² assumes its maximum at x = 0. Therefore,

b²(√2 − cosh_λ(z + 1))² + cosh²_λ(z + 1) ≤ b²(√2 − cosh_λ(0))² + cosh²_λ(1.3)   (5.14)

and

β²(√2 − cosh_λ(ζ − 1))² + cosh²_λ(ζ − 1) ≤ β²(√2 − cosh_λ(0))² + cosh²_λ(1.3).   (5.15)

Combining (5.11)–(5.15), together with the assumption that Skew(D, ∆) ≥ 15, yields:

Skew((D, ∆) · K) = (√2 b − e cosh_λ(z + 1))² + (√2 β − ε cosh_λ(ζ − 1))²
≤ (b² + β²)(√2 − cosh_λ(0))² + 2 cosh²_λ(1.3)
= Skew(D, ∆)(√2 − cosh_λ(0))² + 2 cosh²_λ(1.3)
≤ Skew(D, ∆)((√2 − cosh_λ(0))² + (2/15) cosh²_λ(1.3)).

This completes the proof since (√2 − cosh_λ(0))² + (2/15) cosh²_λ(1.3) ≈ 0.571 ≤ 0.9.

The A Lemma

Definition 5.2.31. Let g(x) = (1 − 2x)2 .

Lemma 5.2.32. Recall the operator A from Figure 5.6. If (D, ∆) is a state such that
Bias(D, ∆) ∈ [−1, 1], Skew(D, ∆) > 15, and such that b, β > 0 and 0.3 6 z, ζ, then
there exists n ∈ Z such that:

Skew((D, ∆) · An ) 6 0.9 Skew(D, ∆).

Proof. Let c = min{z, ζ} and n = max{1, ⌊λ^c/2⌋}. Compute the action of A^n on
(D, ∆):

A^{n†}DA^n = \begin{bmatrix} 1 & 0 \\ -2n & 1 \end{bmatrix} \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix} \begin{bmatrix} 1 & -2n \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \ldots & b − 2ne\lambda^{-z} \\ b − 2ne\lambda^{-z} & \ldots \end{bmatrix},

A^{n•†}∆A^{n•} = A^{n†}∆A^n = \begin{bmatrix} \ldots & \beta − 2n\varepsilon\lambda^{-\zeta} \\ \beta − 2n\varepsilon\lambda^{-\zeta} & \ldots \end{bmatrix}.

Therefore:
Skew((D, ∆) · An ) = (b − 2neλ−z )2 + (β − 2nελ−ζ )2

But recall that e2 = b2 + 1 and ε2 = β 2 + 1, and from Remark 5.2.16 that b, β > 0
implies −be 6 −b2 and −εβ 6 −β 2 . Using these facts, we can expand the above

formula as follows:

Skew((D, ∆) · An )
= (b − 2neλ−z )2 + (β − 2nελ−ζ )2
= b2 − 4nbeλ−z + 4n2 e2 λ−2z + β 2 − 4nβελ−ζ + 4n2 ε2 λ−2ζ
6 b2 − 4nb2 λ−z + 4n2 (b2 + 1)λ−2z + β 2 − 4nβ 2 λ−ζ + 4n2 (β 2 + 1)λ−2ζ
= b2 (1 − 4nλ−z + 4n2 λ−2z ) + β 2 (1 − 4nλ−ζ + 4n2 λ−2ζ ) + 4n2 (λ−2z + λ−2ζ )
= b2 (1 − 2nλ−z )2 + β 2 (1 − 2nλ−ζ )2 + 4n2 (λ−2z + λ−2ζ )
= b2 g(nλ−z ) + β 2 g(nλ−ζ ) + 4n2 (λ−2z + λ−2ζ ).

Writing y = max{g(nλ^{-z}), g(nλ^{-ζ})} for brevity, and using the assumption that
Skew(D, ∆) ≥ 15 together with the fact that c ≤ z, ζ, we get:

Skew((D, ∆) · A^n) ≤ b²y + β²y + 8n²λ^{-2c}
= Skew(D, ∆)y + 8n²λ^{-2c}
≤ Skew(D, ∆)(y + (8/15)n²λ^{-2c}).

To finish the proof, it remains to show that y + (8/15)n²λ^{-2c} ≤ 0.9. There are two cases:

• If ⌊λ^c/2⌋ ≥ 1, then λ^c/4 ≤ n ≤ λ^c/2. From n ≤ λ^c/2, we have 2nλ^{-c} ≤ 1, and so
(8/15)n²λ^{-2c} ≤ 2/15. Moreover, because Bias(D, ∆) ∈ [−1, 1], we have c ≤ z, ζ ≤
c + 1. Hence 1/(4λ) = (λ^c/4)λ^{-c-1} ≤ nλ^{-c-1} ≤ nλ^{-z}, nλ^{-ζ} ≤ nλ^{-c} ≤ 1/2. On the interval
[1/(4λ), 1/2], the function g(x) assumes its maximum at x = 1/(4λ). This implies that
y ≤ g(1/(4λ)). This completes the present case since we get:

y + (8/15)n²λ^{-2c} ≤ g(1/(4λ)) + 2/15 ≈ 0.762 ≤ 0.9.

• If ⌊λ^c/2⌋ < 1, then n = 1 and λ^c < 2. From 0.3 ≤ c, we have (8/15)n²λ^{-2c} ≤ (8/15)λ^{-0.6}.
Moreover, because Bias(D, ∆) ∈ [−1, 1], we have 0.3 ≤ c ≤ z, ζ ≤ c + 1. With
λ^c ≤ 2, this implies that 1/(2λ) ≤ λ^{-c-1} ≤ λ^{-z}, λ^{-ζ} ≤ λ^{-0.3}. Therefore both
λ^{-z} and λ^{-ζ} are in the interval [1/(2λ), λ^{-0.3}]. On this interval, the function g(x)
assumes its maximum at x = 1/(2λ), and therefore y ≤ g(1/(2λ)). This completes the
proof since:

y + (8/15)n²λ^{-2c} ≤ g(1/(2λ)) + (8/15)λ^{-0.6} ≈ 0.657 ≤ 0.9.

The B Lemma

Definition 5.2.33. Let h(x) = (1 − √2 x)².

Lemma 5.2.34. Recall the operator B from Figure 5.6. If (D, ∆) is a state such that
Bias(D, ∆) ∈ [−1, 1], Skew(D, ∆) > 15, and such that b 6 0 6 β and −0.2 6 z, ζ,
then there exists n ∈ Z such that:

Skew((D, ∆) · B n ) 6 0.9 Skew(D, ∆).


Proof. Let c = min{z, ζ}, n = max{1, ⌊λ^c/√2⌋} and compute the action of B^n on (D, ∆):

B^{n†}DB^n = \begin{bmatrix} 1 & 0 \\ \sqrt{2}n & 1 \end{bmatrix} \begin{bmatrix} e\lambda^{-z} & b \\ b & e\lambda^{z} \end{bmatrix} \begin{bmatrix} 1 & \sqrt{2}n \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \ldots & b + \sqrt{2}ne\lambda^{-z} \\ b + \sqrt{2}ne\lambda^{-z} & \ldots \end{bmatrix},

B^{n•†}∆B^{n•} = \begin{bmatrix} 1 & 0 \\ -\sqrt{2}n & 1 \end{bmatrix} \begin{bmatrix} \varepsilon\lambda^{-\zeta} & \beta \\ \beta & \varepsilon\lambda^{\zeta} \end{bmatrix} \begin{bmatrix} 1 & -\sqrt{2}n \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} \ldots & \beta − \sqrt{2}n\varepsilon\lambda^{-\zeta} \\ \beta − \sqrt{2}n\varepsilon\lambda^{-\zeta} & \ldots \end{bmatrix}.

Therefore:

Skew((D, ∆) · B^n) = (b + √2 neλ^{-z})² + (β − √2 nελ^{-ζ})².

But recall that e² = b² + 1, that ε² = β² + 1, and from Remark 5.2.16 that b ≤ 0 ≤ β
implies be ≤ −b² and −βε ≤ −β². Using these facts, we can expand the above
formula as follows:

Skew((D, ∆) · B^n)
= (b + √2 neλ^{-z})² + (β − √2 nελ^{-ζ})²
= b² + 2√2 nbeλ^{-z} + 2n²e²λ^{-2z} + β² − 2√2 nβελ^{-ζ} + 2n²ε²λ^{-2ζ}
≤ b² − 2√2 nb²λ^{-z} + 2n²(b² + 1)λ^{-2z} + β² − 2√2 nβ²λ^{-ζ} + 2n²(β² + 1)λ^{-2ζ}
= b²(1 − 2√2 nλ^{-z} + 2n²λ^{-2z}) + β²(1 − 2√2 nλ^{-ζ} + 2n²λ^{-2ζ}) + 2n²(λ^{-2z} + λ^{-2ζ})
= b²(1 − √2 nλ^{-z})² + β²(1 − √2 nλ^{-ζ})² + 2n²(λ^{-2z} + λ^{-2ζ})
= b²h(nλ^{-z}) + β²h(nλ^{-ζ}) + 2n²(λ^{-2z} + λ^{-2ζ}).

Writing y = max{h(nλ^{-z}), h(nλ^{-ζ})} for brevity, and using the assumption that
Skew(D, ∆) ≥ 15, together with the fact that c ≤ z, ζ, we get:

Skew((D, ∆) · B^n) ≤ b²y + β²y + 4n²λ^{-2c}
= Skew(D, ∆)y + 4n²λ^{-2c}
≤ Skew(D, ∆)(y + (4/15)n²λ^{-2c}).

To finish the proof, it remains to show that y + (4/15)n²λ^{-2c} ≤ 0.9. There are two cases:

• If ⌊λ^c/√2⌋ ≥ 1, then λ^c/(2√2) ≤ n ≤ λ^c/√2. From n ≤ λ^c/√2, we have 2n²λ^{-2c} ≤ 1, and so
(4/15)n²λ^{-2c} ≤ 2/15. Moreover, because Bias(D, ∆) ∈ [−1, 1], we have c ≤ z, ζ ≤ c + 1.
Hence 1/(2√2 λ) = (λ^c/(2√2))λ^{-c-1} ≤ nλ^{-c-1} ≤ nλ^{-z}, nλ^{-ζ} ≤ nλ^{-c} ≤ 1/√2. On the interval
[1/(2√2 λ), 1/√2], the function h(x) assumes its maximum at x = 1/(2√2 λ). This implies
that y ≤ h(1/(2√2 λ)). This completes the present case since we get:

y + (4/15)n²λ^{-2c} ≤ h(1/(2√2 λ)) + 2/15 ≈ 0.762 ≤ 0.9.

• If ⌊λ^c/√2⌋ < 1, then n = 1 and λ^c < √2. From −0.2 ≤ c, we have (4/15)n²λ^{-2c} ≤
(4/15)λ^{0.4}. Moreover, because Bias(D, ∆) ∈ [−1, 1], we have −0.2 ≤ c ≤ z, ζ ≤
c + 1. With λ^c ≤ √2, this implies that 1/(√2 λ) ≤ λ^{-c-1} ≤ λ^{-z}, λ^{-ζ} ≤ λ^{0.2}.
Therefore both λ^{-z} and λ^{-ζ} are in the interval [1/(√2 λ), λ^{0.2}]. On this interval, the
function h(x) assumes its maximum at x = λ^{0.2}, and therefore y ≤ h(λ^{0.2}). This
completes the proof since:

y + (4/15)n²λ^{-2c} ≤ h(λ^{0.2}) + (4/15)λ^{0.4} ≈ 0.851 ≤ 0.9.

Proof of the Step Lemma

The proof of the Step Lemma is now basically a case distinction, using the cases
enumerated in lemmas 5.2.26–5.2.34, as well as some additional symmetric cases. In
particular, the following remark will allow us to use the grid operators X and Z to
reduce the number of cases to consider.

Remark 5.2.35. The grid operator Z negates the anti-diagonal entries of a state while
the operator X swaps the diagonal entries of a state. This follows by simple computation
since

(D, ∆) · Z = \left(\begin{bmatrix} e\lambda^{-z} & -b \\ -b & e\lambda^{z} \end{bmatrix}, \begin{bmatrix} \varepsilon\lambda^{-\zeta} & -\beta \\ -\beta & \varepsilon\lambda^{\zeta} \end{bmatrix}\right)

and

(D, ∆) · X = \left(\begin{bmatrix} e\lambda^{z} & b \\ b & e\lambda^{-z} \end{bmatrix}, \begin{bmatrix} \varepsilon\lambda^{\zeta} & \beta \\ \beta & \varepsilon\lambda^{-\zeta} \end{bmatrix}\right).

Moreover, Bias((D, ∆) · Z) = Bias(D, ∆) and Bias((D, ∆) · X) = −Bias(D, ∆).

Lemma (Step Lemma). For any state (D, ∆), if Skew(D, ∆) > 15, then there exists
a special grid operator G such that Skew((D, ∆) · G) 6 0.9 Skew(D, ∆). Moreover, G
can be computed using a constant number of arithmetic operations.

Proof. Let (D, ∆) be a state such that Skew(D, ∆) > 15. By Lemma 5.2.26 we can
assume w.l.o.g. that Bias(D, ∆) ∈ [−1, 1]. Moreover, by Remark 5.2.35, we can also
assume that β > 0 and z + ζ > 0. Note that the application of the grid operators
X and/or Z in Remark 5.2.35 preserves the fact that Bias(D, ∆) ∈ [−1, 1]. We now
treat in turn the cases b > 0 and b 6 0.

Case 1 b > 0. A covering of the strip defined by z − ζ ∈ [−1, 1] and z + ζ > 0 is


depicted in Figure 5.7(a). The R region (in green) and the A region (in red)
are defined as the intersection of this space with {(z, ζ) | − 0.8 6 z, ζ 6 0.8}
and {(z, ζ) | z 6 0.3 and 0.8 6 ζ} respectively. The K and K • regions (both in
blue) fill the remaining space.

We now consider in turn the possible locations of the pair (z, ζ) in this covering.

1. If −0.8 6 z, ζ 6 0.8, then by Lemma 5.2.28 we have Skew((D, ∆) · R) 6


0.9 Skew(D, ∆).

2. If z 6 0.3 and 0.8 6 ζ, then by Lemma 5.2.30 we have Skew((D, ∆) · K) 6


0.9 Skew(D, ∆).

3. If 0.3 6 z, ζ, then there exists n ∈ Z such that Skew((D, ∆) · An ) 6


0.9 Skew(D, ∆) by Lemma 5.2.32.


Figure 5.7: (a) A covering of the region z − ζ ∈ [−1, 1] and z + ζ > 0 for the case
b > 0. (b) A covering of the region z − ζ ∈ [−1, 1] and z + ζ > 0 for the case b 6 0.

4. If 0.8 ≤ z and ζ ≤ 0.3, then note that (D, ∆) · K• = (∆, D) · K, and
therefore Skew((D, ∆) · K•) ≤ 0.9 Skew(D, ∆) by Lemma 5.2.30:

Skew((D, ∆)·K • ) = Skew((∆, D)·K) 6 0.9 Skew(∆, D) = 0.9 Skew(D, ∆).

Case 2 b 6 0. As above, we use a covering of the strip defined by z − ζ ∈ [−1, 1]


and z + ζ > 0 and consider the possible locations of (z, ζ) in this space. The
relevant covering is depicted in Figure 5.7(b), where the R region (in green) is
defined as above and the B region (in red) is defined as the intersection of the
strip with {(z, ζ) | z, ζ > −0.2}.

1. If −0.8 6 z, ζ 6 0.8, then by Lemma 5.2.28 we have Skew((D, ∆) · R) 6


0.9 Skew(D, ∆).

2. If z, ζ > −0.2 then there exists n ∈ Z such that Skew((D, ∆) · B n ) 6


0.9 Skew(D, ∆) by Lemma 5.2.34.

Finally, note that only a constant number of calculations are required to decide which
of the above cases applies. Moreover, each case only requires a constant number of
operations. Specifically, the computation of k and σ k in Lemma 5.2.26, of n and An

in Lemma 5.2.32, and of n and B n in Lemma 5.2.34 each require just a fixed number
of operations, and each of the remaining cases produces a fixed grid operator.
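The case distinction above can be summarized by a small selection function. The following Python sketch (names and return convention are ours) assumes the state has already been normalized as in the proof, so that Bias(D, ∆) ∈ [−1, 1], β ≥ 0 and z + ζ ≥ 0, and returns a label for the grid operator to apply together with the exponent n where relevant:

from math import floor, sqrt

LAMBDA = 1 + sqrt(2)

def step_case(b, z, zeta):
    c = min(z, zeta)
    if -0.8 <= z <= 0.8 and -0.8 <= zeta <= 0.8:
        return ('R', 1)                                   # Lemma 5.2.28
    if b >= 0:
        if z <= 0.3 and zeta >= 0.8:
            return ('K', 1)                               # Lemma 5.2.30
        if z >= 0.3 and zeta >= 0.3:
            return ('A', max(1, floor(LAMBDA ** c / 2)))  # Lemma 5.2.32
        return ('K_bullet', 1)                            # z >= 0.8 and zeta <= 0.3
    # b <= 0: either the R region above or the B region of Lemma 5.2.34
    return ('B', max(1, floor(LAMBDA ** c / sqrt(2))))

print(step_case(b=2.0, z=1.4, zeta=0.9))    # -> ('A', 1)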

5.2.5 General solution to grid problems over Z[ω]

We are finally in a position to solve Problem 5.2.3.

Proposition 5.2.36. There is an algorithm which, given two bounded convex subset
A and B of R2 with non-empty interior, enumerates all solutions of the grid problem
over Z[ω] for A and B. Moreover, if A and B are M -upright, then the algorithm
requires O(log(1/M )) arithmetic operations overall, plus a constant number of arith-
metic operations per solution produced.

Proof. Analogous to the proof of Proposition 5.1.15, using Proposition 5.2.20 to find
the appropriate grid operator.

5.2.6 Scaled grid problems over Z[ω]

We close this chapter by considering scaled versions of Problem 5.2.3, as in Subsection 5.1.7.
More specifically, given k ∈ N and two bounded convex subsets A and B
of R² with non-empty interior, we are interested in solving grid problems over Z[ω]
for √2^k A and (−√2)^k B. We call such problems scaled grid problems over Z[ω] for
A, B, and k. These scaled grid problems will also be useful in Chapter 7. Using
Proposition 5.2.36, we can establish the following proposition.

Proposition 5.2.37. There is an algorithm which, given two bounded convex subset
A and B of R2 with non-empty interior, enumerates (the infinite sequence of ) all
solutions of the scaled grid problem over Z[ω] for A, B, and k in order of increasing
k. Moreover, if A and B are M -upright, then the algorithm requires O(log(1/M ))
arithmetic operations overall, plus a constant number of arithmetic operations per
solution produced.

Finally, we give some lower bounds on the number of solutions to scaled grid
problems over Z[ω].

Lemma 5.2.38. Let A and B be convex subsets of R², and let k ≥ 0. Assume A
contains a circle of radius r and B contains a circle of radius R, such that rR ≥
(1 + √2)²/2^k. Then the scaled grid problem over Z[ω] for A, B, and k has at least 2
solutions.
solutions.
Proof. By assumption, √2^k A contains a circle of radius r′ = √2^k r and (−√2)^k B
contains a circle of radius R′ = √2^k R, with r′R′ ≥ (1 + √2)². Let δ = r′/√2 and
∆ = R′√2, and inscribe two disjoint squares of size δ × δ in the first circle, with
horizontal extents [x0, x1] and [x2, x3] and common vertical extent [y0, y1], and one
square of size ∆ × ∆ in the second circle, with horizontal extent [z0, z1] and vertical
extent [w0, w1].
Since δ∆ = r′R′ ≥ (1 + √2)², by Lemma 5.2.7, we can find a, a′, b ∈ Z[√2] such that
a ∈ [x0, x1], a• ∈ [z0, z1], a′ ∈ [x2, x3], a′• ∈ [z0, z1], b ∈ [y0, y1], and b• ∈ [w0, w1].
Then u = a + ib and v = a′ + ib are two different solutions to the scaled grid problem
over Z[ω] for A, B, and k, as claimed.

Lemma 5.2.39. Let A and B be convex subsets of R2 , and assume that the two-
dimensional scaled grid problem for k has at least two distinct solutions. Then for all
ℓ > 0, the scaled grid problem for k + 2ℓ has at least 2ℓ + 1 solutions.

Proof. Analogous to the proof of Proposition 5.1.19.


Chapter 6

Clifford+V approximate synthesis

In this chapter, we introduce an efficient algorithm to solve the problem of approxi-


mate synthesis of special unitaries over the Clifford+V gate set. Recall from Chapter 1
that the Clifford group is generated by
ω = e^{iπ/4}, \qquad H = (1/√2) \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}, \qquad \text{and} \qquad S = \begin{bmatrix} 1 & 0 \\ 0 & i \end{bmatrix}.

Recall moreover that the Clifford+V group is obtained by adding the following V -
gates to the generators of the Clifford group
V_X = (1/√5) \begin{bmatrix} 1 & 2i \\ 2i & 1 \end{bmatrix}, \qquad V_Y = (1/√5) \begin{bmatrix} 1 & 2 \\ -2 & 1 \end{bmatrix}, \qquad \text{and} \qquad V_Z = (1/√5) \begin{bmatrix} 1+2i & 0 \\ 0 & 1-2i \end{bmatrix}.

The problem of approximate synthesis of special unitaries over the Clifford+V


gate set is the following. Given a special unitary U ∈ SU(2) and a precision ε > 0,
construct a Clifford+V circuit W whose V-count is as small as possible and such that
∥W − U ∥ 6 ε.
We solve the problem in three steps. We first characterize the unitaries which
can be expressed exactly as Clifford+V circuits. We then use this characterization
to define an algorithm solving the problem of approximate synthesis of z-rotations
over the Clifford+V gate set. Finally, we show how to this method can be used to
approximately synthesize arbitrary special unitaries.

6.1 Exact synthesis of Clifford+V operators

We start by solving the problem of exact synthesis of Clifford+V operators.

Problem 6.1.1 (Exact synthesis of Clifford+V operators). Given a unitary U ∈


U(2), determine whether there exists a Clifford+V circuit W such that U = W and,
in case such a circuit exists, construct one whose V-count is minimal.


The problem of exact synthesis of Pauli+V operators was first solved in [7] using
the arithmetic of quaternions. Here, we extend this result to the Clifford+V group.
To characterize Clifford+V operators, we consider the following set of unitaries.

Definition 6.1.2. The set V consists of unitary matrices of the form

U = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} \alpha & \gamma \\ \beta & \delta \end{bmatrix}   (6.1)

where k, ℓ ∈ N with 0 ≤ k ≤ 2, α, β, γ, δ ∈ Z[i], and such that det(U) is a power of i.

We will show that a unitary U is a Clifford+V operator if and only if U ∈ V. As a
corollary, this will establish that V is a group, which might not be obvious, due to the
seemingly arbitrary restriction on the exponent k. We note that V does not coincide
with the subgroup of U(2) whose entries are in the ring Z[1/√2, 1/√5, i] and whose
determinant is a power of i. Indeed, the matrix

\frac{1}{5^{3}} \begin{bmatrix} 2i + \sqrt{5} & -80 + 96i \\ 80 + 96i & -2i + \sqrt{5} \end{bmatrix}

has entries in Z[1/√2, 1/√5, i] and has determinant 1 but is not an element of V.

Definition 6.1.3. Let U ∈ V be as in (6.1). The integers k and ℓ are called the
√2-denominator exponent and √5-denominator exponent of U respectively. The least k
(resp. ℓ) such that U can be written as in (6.1) is the least √2-denominator exponent
(resp. least √5-denominator exponent) of U. These notions extend naturally to
vectors and scalars of the form

\frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \text{and} \qquad \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}}\,\alpha,   (6.2)

where k, ℓ ∈ N, with 0 ≤ k ≤ 2, and α, β ∈ Z[i].

In what follows, we refer to the pair (k, ℓ) as the denominator exponent of a
matrix, vector, or scalar. It is then understood that the first component of the pair
(k, ℓ) is the √2-exponent, while the second is the √5-exponent. Note that the least
denominator exponent of a matrix, vector, or scalar is the pair (k, ℓ), where k and ℓ
are the least √2- and √5-exponents respectively.
Remark 6.1.4. Since √5 ∉ Z[i], if ℓ and ℓ′ are two √5-denominator exponents of a
matrix U ∈ V, then ℓ ≡ ℓ′ (mod 2). A similar property holds for √2-denominator
exponents.

We first show that if U is a Clifford+V operator, then U ∈ V.

Lemma 6.1.5. If U is a Clifford+V operator, then U = ABC where A is a product


of V-gates, B is a Pauli+S operator, and C is one of I, H, HS, ω, Hω, and HSω.

Proof. Clifford gates and V-gates can be commuted in the sense that for every pair
C, V of a Clifford gate and a V-gate, there exists a pair C ′ , V ′ such that CV = V ′ C ′ .
This implies that a Clifford+V operator U can always be written as U = AA′ , where
A is a product of V-gates and A′ is a Clifford operator. Furthermore, the Pauli+S
group has index 6 as a subgroup of the Clifford group and its cosets are: Pauli+S,
Pauli+S · H, Pauli+S · HS, Pauli+S · ω, Pauli+S · Hω, and Pauli+S · HSω. It
thus follows that a Clifford operator A′ can always be written as A′ = BC with B a
Pauli+S operator and C one of I, H, HS, ω, Hω, and HSω.

Corollary 6.1.6. If U is a Clifford+V operator, then U ∈ V.

To show, conversely, that every element of V can be represented by a Clifford+V


circuit, we proceed as follows. First, we show that every unit vector of the form (6.2)
can be reduced to e1 = [ 10 ] by applying a sequence of carefully chosen Clifford+V
gates. Then, we show how applying this method to the first column of a unitary
matrix U of the form (6.1) yields a Clifford+V circuit for U .

Lemma 6.1.7. If u is a unit vector of the form (6.2) with least √5-denominator
exponent ℓ and W is a Clifford circuit, then W u has least √5-denominator exponent
ℓ.

Proof. It suffices to show that the generators of the Clifford group preserve the least
√5-denominator exponent of u. The general result then follows by induction. To this
end, write u as in (6.2), with α = a + ib and β = c + id:

u = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} a + ib \\ c + id \end{bmatrix}.

Now apply H, ω, and S to u:

Hu = \frac{1}{\sqrt{2}^{k+1}\sqrt{5}^{\ell}} \begin{bmatrix} (a+c) + i(b+d) \\ (a-c) + i(b-d) \end{bmatrix}, \qquad \omega u = \frac{1}{\sqrt{2}^{k+1}\sqrt{5}^{\ell}} \begin{bmatrix} (a-b) + i(a+b) \\ (c-d) + i(c+d) \end{bmatrix},

Su = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} a + ib \\ -d + ic \end{bmatrix}.

By minimality of ℓ, one of a, b, c, d is not divisible by 5. The least 5-denominator of
Su is therefore ℓ. Moreover, for any two integers x and y, x + y ≡ x − y ≡ 0 (mod 5)

implies x ≡ y ≡ 0 (mod 5). Thus the least 5-denominator exponent of Hu and ωu
is also ℓ.

Lemma 6.1.8. If u is a unit vector of the form (6.2) with least denominator exponent
(k, ℓ), then there exists a Clifford circuit W such that W u has least denominator
exponent (0, ℓ).

Proof. By Lemma 6.1.7, we need not worry about ℓ and only have to focus on reducing
k. Write u as in (6.2), with 0 6 k 6 2, α = a + ib, and β = c + id. Since u has unit
norm, we have a2 + b2 + c2 + d2 = 2k · 5ℓ . We prove the lemma by case distinction on
k. If k = 0, there is nothing to prove. The remaining cases are treated as follows.

• k = 1. In this case a2 + b2 + c2 + d2 = 2 · 5ℓ ≡ 0 (mod 2). Therefore only an


even number amongst a, b, c, d can be odd. Using a Pauli+S operator, we can
without loss of generality assume that a ≡ c (mod 2) and b ≡ d (mod 2) or that
a ≡ b (mod 2) and c ≡ d (mod 2). It then follows that either Hu or ωu has
denominator exponent (0, ℓ) since
Hu = \frac{1}{2\sqrt{5}^{\ell}} \begin{bmatrix} (a+c) + i(b+d) \\ (a-c) + i(b-d) \end{bmatrix} \qquad \text{and} \qquad \omega u = \frac{1}{2\sqrt{5}^{\ell}} \begin{bmatrix} (a-b) + i(a+b) \\ (c-d) + i(c+d) \end{bmatrix}.

• k = 2. In this case a2 + b2 + c2 + d2 = 4 · 5ℓ ≡ 0 (mod 4). This implies that


a, b, c and d must have the same parity and thus, by minimality of k, must all
be odd. Using a Pauli+S operator, we can without loss of generality assume
that a ≡ b ≡ c ≡ d ≡ 1 (mod 4). It then follows that Hωu has denominator
exponent (0, ℓ) since
H\omega u = \frac{1}{4\sqrt{5}^{\ell}} \begin{bmatrix} (a-b+c-d) + i(a+b+c+d) \\ (a-b-c+d) + i(a+b-c-d) \end{bmatrix}.
4 5ℓ (a − b − c + d) + i(a + b − c − d)

Remark 6.1.9. Let V be one of the V-gates, u be a vector of the form (6.2), and ℓ and
ℓ′ be the least √5-denominator exponents of u and V u respectively. Then ℓ′ ≤ ℓ + 1.
Moreover, if it were the case that ℓ′ < ℓ − 1, then the least √5-denominator exponent
of V†V u = u would be strictly less than ℓ, which is absurd. Thus ℓ − 1 ≤ ℓ′ ≤ ℓ + 1.

Lemma 6.1.10. If u is a unit vector of the form (6.2) with least denominator expo-
nent (0, ℓ), then there exists a Pauli+V circuit W of V-count ℓ such that W u = e1 ,
the first standard basis vector.

Proof. Write u as in (6.2) with k = 0, α = a + ib, and β = c + id. Since u has unit
norm, we have a2 + b2 + c2 + d2 = 20 · 5ℓ = 5ℓ . We prove the lemma by induction on ℓ.

• ℓ = 0. In this case a2 + b2 + c2 + d2 = 1. It follows that exactly one of a, b, c, d


is ±1 while all the others are 0. Then u can be reduced to e1 by acting on it
using a Pauli operator.

• ℓ > 0. In this case a2 + b2 + c2 + d2 ≡ 0 (mod 5). We will show that there exists
a Pauli+V operator U of V-count 1 such that the least denominator exponent
of U u is ℓ − 1. It then follows by the induction hypothesis that there exists U ′
of V-count ℓ − 1 such that U ′ U u = e1 , which then completes the proof.

Consider the residues modulo 5 of a, b, c, and d. Since 0, 1, and 4 are the only
squares modulo 5, then, up to a reordering of the tuple (a, b, c, d), we must have:

(a, b, c, d) ≡ (0, 0, 0, 0), (±2, ±1, 0, 0), or (±2, ±2, ±1, ±1) (mod 5).

However, by minimality of ℓ, we know that a ≡ b ≡ c ≡ d ≡ 0 is impossible, so


the other two cases are the only possible ones. We treat them in turn.

First, assume that one of a, b, c, d is congruent to ±2, one is congruent to ±1,
and the remaining two are congruent to 0. By acting on u with a Pauli operator,
we can moreover assume without loss of generality that a ≡ 2. Now if b ≡ 1,
consider V_Z u:

V_Z u = \frac{1}{\sqrt{5}^{\ell+1}} \begin{bmatrix} (a-2b) + i(2a+b) \\ (c+2d) + i(d-2c) \end{bmatrix}.

Since a ≡ 2, b ≡ 1, and c ≡ d ≡ 0, we get (a − 2b) ≡ (2a + b) ≡ (c + 2d) ≡
(d − 2c) ≡ 0 (mod 5). The least denominator exponent of V_Z u is therefore ℓ − 1.
If on the other hand b ≡ −1, then

V_Z^{\dagger} u = \frac{1}{\sqrt{5}^{\ell+1}} \begin{bmatrix} (a+2b) + i(b-2a) \\ (c-2d) + i(d+2c) \end{bmatrix}

and reasoning analogously shows that the least denominator exponent of V_Z^{\dagger} u
is ℓ − 1. A similar argument can be made in the remaining cases, i.e., when
c ≡ ±1 or d ≡ ±1. For brevity, we list the desired operators in the table below.
c ≡ ±1 or d ≡ ±1. For brevity, we list the desired operators in the table below.
The left column describes the residues of a, b, c, and d modulo 5 and the right
column gives the operator U such that U u has least denominator exponent ℓ−1.

(a, b, c, d) U
(2, 1, 0, 0) VZ
(2, 0, 1, 0) VY †
(2, 0, 0, 1) VX
(2, −1, 0, 0) VZ†
(2, 0, −1, 0) VY
(2, 0, 0, −1) VX †

Now assume that two of a, b, c, d are congruent to ±2 while the remaining two
are congruent to ±1. We can use Pauli operators to guarantee that a ≡ 2
and c > 0. As above, we list the desired operators in a table for conciseness.
It can be checked that in each case the given operator is such that the least
denominator exponent of U u is ℓ − 1.

(a, b, c, d) U
(2, 2, 1, 1) VY †
(2, 1, 2, 1) VX
(2, 1, 1, 2) VZ
(2, 1, 2, −1) VZ
(2, −1, 2, 1) VZ †
(2, 2, 1, −1) VX †
(2, −2, 1, 1) VX
(2, 1, 1, −2) VY †
(2, −1, 1, 2) VY †
(2, −1, 1, −2) VZ †
(2, −1, 2, −1) VX †
(2, −2, 1, −1) VY †
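To illustrate one reduction step of the proof above, the following Python sketch (our own naming and representation, not the implementation referenced later in the thesis) represents a vector (a + bi, c + di)/√5^ℓ by the integer tuple (a, b, c, d, ℓ), applies a V-gate scaled by √5 so that all its entries are Gaussian integers, and checks whether the √5-denominator exponent can be lowered; the remaining five V-gates are handled in exactly the same way:

def gmul(p, q):
    """Product of Gaussian integers represented as (re, im) pairs."""
    (a, b), (c, d) = p, q
    return (a * c - b * d, a * d + b * c)

# sqrt(5) * V_Z and its adjoint, as Gaussian-integer matrices.
SQRT5_VZ     = (((1, 2), (0, 0)), ((0, 0), (1, -2)))
SQRT5_VZ_DAG = (((1, -2), (0, 0)), ((0, 0), (1, 2)))

def try_gate(gate, a, b, c, d, l):
    """Apply gate/sqrt(5) to (a + b*i, c + d*i)/sqrt(5)^l.  If all four integer
    components become divisible by 5, return the new tuple with exponent l - 1,
    otherwise return None."""
    (g00, g01), (g10, g11) = gate
    top = tuple(map(sum, zip(gmul(g00, (a, b)), gmul(g01, (c, d)))))
    bot = tuple(map(sum, zip(gmul(g10, (a, b)), gmul(g11, (c, d)))))
    comps = top + bot
    if all(x % 5 == 0 for x in comps):
        return tuple(x // 5 for x in comps) + (l - 1,)
    return None

# Example from the proof: residues (a, b, c, d) = (2, 1, 0, 0) are handled by V_Z.
print(try_gate(SQRT5_VZ, 2, 1, 0, 0, 1))   # -> (0, 1, 0, 0, 0), i.e. V_Z u = (i, 0)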

We can now solve Problem 6.1.1.

Proposition 6.1.11. A unitary operator U ∈ U (2) is exactly representable by a


Clifford+V circuit if and only if U ∈ V. Moreover, there exists an efficient al-
gorithm that computes a Clifford+V circuit for U with V-count equal to the least

5-denominator exponent of U , which is minimal.

Proof. The left-to-right implication is given by Corollary 6.1.6. For the right-to-left
implication, it suffices to show that there exists a Clifford+V circuit W of V-count ℓ
such that W U = I, since we then have U = W † . To construct W , apply Lemma 6.1.8
and Lemma 6.1.10 to the first column u1 of U . This yields a circuit W ′ such that the
first column of W ′ U is e1 . Since W ′ U is unitary, it follows that its second column u2
is a unit vector orthogonal to e1 . Therefore u2 = λe2 where λ is a unit of the Gaussian
integers. Since the determinant of W ′ is im for some integer m, the determinant of
W ′ U is in+m , so that λ = in+m . Thus one of the following equalities must hold

W ′ U = I, ZW ′ U = I, SW ′ U = I or ZSW ′ U = I.

To prove the second claim, suppose that the least 5-denominator exponent of U is
ℓ. Then W can be efficiently computed because the algorithm described in the proofs

of Lemma 6.1.8 and Lemma 6.1.10 requires O(ℓ) arithmetic operations. Moreover,
W has V-count ℓ by Lemma 6.1.10, which is minimal since any Clifford+V circuit of

V-count up to ℓ − 1 has least 5-denominator exponent at most ℓ − 1.

Remark 6.1.12. By restricting k to be equal to 0 in (6.1) and the determinant of U


to be ±1 we get a solution to the problem of exact synthesis for Pauli+V operators.

6.2 Approximate synthesis of z-rotations

We now turn to the approximate synthesis of z-rotations over the Clifford+V gate
set. A z-rotation is a unitary matrix of the form
R_z(\theta) = \begin{bmatrix} e^{-i\theta/2} & 0 \\ 0 & e^{i\theta/2} \end{bmatrix}   (6.3)

for some real number θ. Matrices of the form (6.3) are called z-rotations because they
act as rotations of the Bloch sphere along the z-axis.

Problem 6.2.1. Given an angle θ and a precision ε > 0, construct a Clifford+V


circuit U whose V-count is as small as possible and such that ∥U − Rz (θ)∥ 6 ε.

6.2.1 A reduction of the problem

Lemma 6.2.2. Let cV = |1 − eiπ/4 |. If ε < cV , then all solutions to Problem 6.2.1
have determinant 1. If ε > cV , then there exists a solution of the form ω n for some
n ∈ N.

Proof. Every complex 2 × 2 unitary operator U can be written as

U = \begin{bmatrix} a & -b^{\dagger} e^{i\varphi} \\ b & a^{\dagger} e^{i\varphi} \end{bmatrix},

for a, b ∈ C and φ ∈ [−π, π]. This, together with the characterization of Clifford+V
operators given by Proposition 6.1.11, implies that a complex 2 × 2 unitary operator
U can be exactly synthesized over the Clifford+V gate set if and only if

U = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} \alpha & -\beta^{\dagger} i^{n} \\ \beta & \alpha^{\dagger} i^{n} \end{bmatrix},

Figure 6.1: The ε-region.

with k, ℓ, n ∈ N, α, β ∈ Z[i], and 0 ≤ k ≤ 2. Now assume that ε < |1 − e^{iπ/4}| and
∥U − Rz(θ)∥ ≤ ε. Let e^{iφ1} and e^{iφ2} be the eigenvalues of U Rz(θ)⁻¹, with φ1, φ2 ∈ [−π, π].
Then |1 − e^{iπ/4}| > ε ≥ ∥U − Rz(θ)∥ = ∥I − U Rz(θ)⁻¹∥ = max{|1 − e^{iφ1}|, |1 − e^{iφ2}|},
so that |1 − e^{iφj}| < |1 − e^{iπ/4}|. Therefore −π/4 < φj < π/4, for j ∈ {1, 2}, which
implies that −π/2 < φ1 + φ2 < π/2. Hence |1 − e^{i(φ1+φ2)}| < |1 − e^{iπ/2}| = √2. But
e^{i(φ1+φ2)} = det(U Rz(θ)⁻¹) = i^n. Thus |1 − i^n| < √2, which proves that i^n = 1.
For the second statement, note that if θ/2 ∈ [−π/4, π/4], then ∥I − Rz (θ)∥ =
|1 − eiθ/2 | 6 |1 − eiπ/4 |. Similarly, if θ/2 belongs to one of [π/4, 3π/4], [3π/4, 5π/4], or
[5π/4, 7π/4], then one of ∥ω 2 − Rz (θ)∥, ∥ − I − Rz (θ)∥, or ∥ − ω 2 − Rz (θ)∥ is less than
|1 − eiπ/4 |. In each case, Rz (θ) is approximated to within ε by a Clifford operator.

Definition 6.2.3. Let θ be an angle and ε > 0 a precision. The ε-region for θ is the
subset of the plane defined by

Rε = {u ∈ D ; u · z ≥ 1 − ε²/2}

where z = e^{−iθ/2} and D is the unit disk.

The ε-region is illustrated in Figure 6.1. We now show how to reduce Problem 6.2.1
to three distinct problems.

Proposition 6.2.4. Problem 6.2.1 reduces to a grid problem, a Diophantine equation,
and an exact synthesis problem, namely:

1. find k, ℓ ∈ N with 0 ≤ k ≤ 2 and α ∈ Z[i] such that α ∈ √2^k√5^ℓ Rε,

2. find β ∈ Z[i] such that β†β = 2^k 5^ℓ − α†α, and

3. find a Clifford+V circuit for the unitary matrix

U = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} \alpha & -\beta^{\dagger} \\ \beta & \alpha^{\dagger} \end{bmatrix}.

Moreover, the least √2^k√5^ℓ for which the above three problems can be solved yields
an optimal solution to Problem 6.2.1.

Proof. Let U be the matrix

U = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} \alpha & -\beta^{\dagger} \\ \beta & \alpha^{\dagger} \end{bmatrix}   (6.4)

with α, β ∈ Z[i] and k, ℓ ∈ N satisfying 0 ≤ k ≤ 2 and α†α + β†β = 2^k 5^ℓ. Given ε and θ,
we can express the requirement ∥U − Rz(θ)∥ ≤ ε as a constraint on the top left entry
α/(√2^k√5^ℓ) of U. Indeed, let z = e^{−iθ/2}, α′ = α/(√2^k√5^ℓ), and β′ = β/(√2^k√5^ℓ).
Since α′†α′ + β′†β′ = 1 and z†z = 1, we have

∥U − Rz(θ)∥² = |α′ − z|² + |β′|²
= (α′ − z)†(α′ − z) + β′†β′
= α′†α′ + β′†β′ − z†α′ − α′†z + z†z
= 2 − 2 Re(z†α′).

Thus ∥Rz(θ) − U∥ ≤ ε if and only if 2 − 2 Re(z†α′) ≤ ε², or equivalently, Re(z†α′) ≥
1 − ε²/2. If we identify the complex numbers z = x + yi and α′ = a + bi with 2-
dimensional real vectors ⃗z = (x, y)ᵀ and ⃗α′ = (a, b)ᵀ, then Re(z†α′) is just their inner
product ⃗z · ⃗α′, and therefore ∥U − Rz(θ)∥ ≤ ε is equivalent to ⃗z · ⃗α′ ≥ 1 − ε²/2. Hence

∥U − Rz(θ)∥ ≤ ε ⟺ α ∈ √2^k√5^ℓ Rε.

The fact that, by Lemma 6.2.2, all the solutions to Problem 6.2.1 are of the form
(6.4) completes the reduction. Since by Proposition 6.1.11 the minimal V-count of
an element U ∈ V is its least √5-denominator exponent, the least √2^k√5^ℓ for
which problems 1–3 can be solved is an optimal solution.
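The grid constraint of item 1 can be tested numerically as follows (a floating-point sketch with our own function name; a real implementation would use exact arithmetic):

import cmath
from math import atan

def in_scaled_region(alpha, theta, eps, k, l):
    """True iff alpha lies in sqrt(2)^k sqrt(5)^l * R_eps for the angle theta."""
    alpha_p = alpha / ((2 ** k * 5 ** l) ** 0.5)
    z = cmath.exp(-1j * theta / 2)
    return abs(alpha_p) <= 1 and (z.conjugate() * alpha_p).real >= 1 - eps ** 2 / 2

# alpha = 2 + i at (k, l) = (0, 1): for theta = -2*atan(1/2), alpha' equals e^{-i theta/2},
# so ||U - Rz(theta)|| = 0 and the test succeeds even for small eps.
print(in_scaled_region(complex(2, 1), theta=-2 * atan(0.5), eps=0.1, k=0, l=1))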

6.2.2 The algorithm

The reduction of Proposition 6.2.4 describes an algorithm which we explicitly state


below.

Algorithm 6.2.5. Let θ and ε > 0 be given.

1. Use the algorithm from Proposition 5.1.17 of Chapter 5 to enumerate the infinite
sequence of solutions to the scaled grid problem over Z[i] for Rε and √2^k√5^ℓ
in order of increasing ℓ.

2. For each solution α:

(a) Let n = 2k 5ℓ − α† α.

(b) Attempt to find a prime factorization of n. If n ̸= 0 but no prime factor-


ization is found, skip step 2(c) and continue with the next α.

(c) Use the algorithm from Proposition 3.2.8 of Chapter 3 to solve the equation
β † β = n. If a solution β exists, go to step 3; otherwise, continue with the
next α.

3. Define U as

U = \frac{1}{\sqrt{2}^{k}\sqrt{5}^{\ell}} \begin{bmatrix} \alpha & -\beta^{\dagger} \\ \beta & \alpha^{\dagger} \end{bmatrix}
and use the exact synthesis algorithm of Proposition 6.1.11 to find a Clifford+V
circuit for U . Output this circuit and stop.

Remark 6.2.6. In analogy with Remark 6.1.12, we note that restricting k to be equal
to 0 throughout the algorithm yields a method for the approximate synthesis of z-
rotations in the Pauli+V basis.

6.2.3 Analysis of the algorithm

We now discuss the properties of Algorithm 6.2.5. We are interested in three aspects
of the algorithm: its correctness, its circuit complexity, and its time complexity. We
treat each of these aspects in turn. We note that the results established here also
hold for the restricted algorithm of Remark 6.2.6.

Correctness

Proposition 6.2.7 (Correctness). If Algorithm 6.2.5 terminates, then it yields a


valid solution to the approximate synthesis problem, i.e., it yields a Clifford+V circuit
approximating Rz (θ) up to ε.

Proof. By construction, following the reduction described by Proposition 6.2.4.

Circuit complexity

In the presence of a factoring oracle, the algorithm has optimal circuit complexity.

Proposition 6.2.8 (Optimality in the presence of a factoring oracle). In the presence


of an oracle for integer factoring, the circuit returned by Algorithm 6.2.5 has the
smallest V-count of any single-qubit Clifford+V circuit approximating Rz (θ) up to ε.

Proof. By construction, step (1) of the algorithm enumerates all solutions α to the
√ √
scaled grid problem over Z[i] for Rε and 2k 5ℓ in order of increasing ℓ. Step 2(a)
always succeeds and, in the presence of the factoring oracle, so does step 2(b). When
step 2(c) succeeds, the algorithm has found a solution of Problem 6.2.1 of minimal

5-denominator exponent, which therefore has minimal V-count.

In the absence of a factoring oracle, the algorithm is still nearly optimal. Our
proof of this near-optimality relies on the following number-theoretic hypothesis. We
do not have a proof of this hypothesis, but it appears to be valid in practice.

Hypothesis 6.2.9. For each number n produced in step 2(a) of Algorithm 6.2.5, write
n = 2j m, where m is odd. Then m is asymptotically as likely to be a prime congruent
to 1 modulo 4 as a randomly chosen odd number of comparable size. Moreover, each
m can be modelled as an independent random event.

Definition 6.2.10. Let U′ and U′′ be the following two solutions of the approximate
synthesis problem:

U′ = \begin{bmatrix} \alpha' & -\beta'^{\dagger} \\ \beta' & \alpha'^{\dagger} \end{bmatrix} \qquad \text{and} \qquad U′′ = \begin{bmatrix} \alpha'' & -\beta''^{\dagger} \\ \beta'' & \alpha''^{\dagger} \end{bmatrix}.   (6.5)

U′ and U′′ are said to be equivalent solutions if α′ = α′′.



Proposition 6.2.11 (Near-optimality in the absence of a factoring oracle). Let ℓ


be the V-count of the solution of the approximate synthesis problem found by Algo-
rithm 6.2.5 in the absence of a factoring oracle. Then

1. The approximate synthesis problem has at most O(log(1/ε)) non-equivalent so-


lutions with V-count less than ℓ.

2. The expected value of ℓ is ℓ′′′ + O(log(log(1/ε))), where ℓ′ , ℓ′′ , and ℓ′′′ are the
V-counts of the optimal, second-to-optimal, and third-to-optimal solutions of the
approximate synthesis problem (up to equivalence).

Proof. If ε > |1 − eiπ/4 |, then by Lemma 6.2.2 there is a solution of V-count 0 and
the algorithm easily finds it. In this case there is nothing to show, so assume without
loss of generality that ε < |1 − eiπ/4 |. Then by Lemma 6.2.2, all solutions are of the
form (6.4).

1. Consider the list α1, α2, . . . of candidates generated in step 1 of the algorithm.
Let ℓ1, ℓ2, . . . be their least √5-denominator exponents and let n1, n2, . . . be the
corresponding integers calculated in step 2(a). Note that nj ≤ 4 · 5^{ℓj} for all j.
Write nj = 2^{zj} mj, where mj is odd. By Hypothesis 6.2.9, the probability that
mj is a prime congruent to 1 modulo 4 is asymptotically no smaller than that
of a randomly chosen odd integer less than 4 · 5^{ℓj}, which, by the well-known
prime number theorem, is

pj := 1/ln(4 · 5^{ℓj}) = 1/(ℓj ln 5 + ln 4).   (6.6)

By the pigeon-hole principle, two of ℓ1 , ℓ2 , and ℓ3 must be congruent modulo


2. Assume without loss of generality that ℓ2 ≡ ℓ3 (mod 2). Then α2 and
α3 are two distinct solutions to the scaled grid problem over Z[i] for Rε and
√ k√ ℓ √
2 5 with (not necessarily least) 5-denominator exponent ℓ3 . It follows
by Proposition 5.1.19 from Chapter 5 that there are at least 5r + 1 distinct
candidates of denominator exponent ℓ3 + 2r, for all r > 0. In other words,
for all j, if j 6 5r + 1, we have ℓj 6 ℓ3 + 2r. In particular, this holds for
r = ⌊1 + log5 j⌋, and therefore,

ℓj 6 ℓ3 + 2(1 + log5 j). (6.7)



Combining (6.7) with (6.6), we have

pj ≥ 1/((ℓ3 + 2(1 + log5 j)) ln 5 + ln 4) = 1/((ℓ3 + 2) ln 5 + 2 ln j + ln 4).   (6.8)

Let j0 be the smallest index such that mj0 is a prime congruent to 1 modulo 4.
By Hypothesis 6.2.9, we can treat each mj as an independent random variable.
Therefore,

P(j0 > j) = P(n1, . . . , nj are not prime)
≤ (1 − p1)(1 − p2) · · · (1 − pj)
≤ (1 − pj)^j
≤ (1 − 1/((ℓ3 + 2) ln 5 + 2 ln j + ln 4))^j.

The expected value of j0 is given by the sum E(j0) = Σ_{j=0}^∞ P(j0 > j). It was
proved in [56] that this sum can be estimated as follows:

E(j0) = Σ_{j=0}^∞ P(j0 > j) ≤ 1 + Σ_{j=1}^∞ (1 − 1/((ℓ3 + 2) ln 5 + 2 ln j + ln 4))^j = O(ℓ3).   (6.9)

Next, we will estimate ℓ3. First note that if the ε-region contains a circle of
radius greater than 1/√5^ℓ, then it contains at least 3 solutions to the scaled grid
problem for Rε with √5-denominator exponent ℓ. The width of the ε-region
Rε is ε²/2 at the widest point, and we can inscribe a disk of radius r = ε²/4 in
it. Hence the scaled grid problem over Z[i] for Rε, as in step 1 of the algorithm,
has at least three solutions with denominator exponent ℓ, provided that

r = ε²/4 ≥ 1/√5^ℓ,

or equivalently, provided that

ℓ ≥ 2 log5(2) + 2 log5(1/ε).

It follows that
ℓ3 = O(log(1/ε)), (6.10)

and therefore, using (6.9), also

E(j0 ) = O(log(1/ε)). (6.11)

To finish the proof of part 1, recall that j0 was defined to be the smallest
index such that mj0 is a prime congruent to 1 modulo 4. The primality of
mj0 ensures that step 2(b) of the algorithm succeeds for the candidate αj0.
Furthermore, because mj0 ≡ 1 (mod 4), the equation β†β = n has a solution by
Proposition 3.2.6. Hence the remaining steps of the algorithm also succeed for
αj0.

Now let s be the number of non-equivalent solutions of the approximate synthe-


sis problem of V-count strictly less than ℓ. As noted above, any such solution
U is of the form (6.4). Then the least denominator exponent of α is strictly
smaller than ℓj0 , so that α = αj for some j < j0 . In this way, each of the s
non-equivalent solutions is mapped to a different index j < j0 . It follows that
s < j0 , and hence that E(s) 6 E(j0 ) = O(log(1/ε)), as was to be shown.

2. Let U ′ be an optimal solution of the approximate synthesis problem, let U ′′


be optimal among the solutions that are not equivalent to U ′ and let U ′′′ be
optimal among the solutions that are not equivalent to either U ′ or U ′′ . Assume
that U ′ , U ′′ , and U ′′′ are written as in (6.5) with top-left entry α′ , α′′ , and α′′′
respectively. Now let ℓ′ , ℓ′′ , and ℓ′′′ be the least denominator exponents of α′ ,
α′′ , and α′′′ , respectively. Let ℓ3 and j0 be as in the proof of part (i). Note that,
by definition, ℓ3 6 ℓ′′′ . Let ℓ be the least denominator exponent of the solution
of the approximate synthesis problem found by the algorithm. Then ℓ 6 ℓj0 .
Using (6.7), we have

ℓ 6 ℓj0 6 ℓ3 + 2(1 + log5 j0 ) 6 ℓ′′′ + 2(1 + log5 j0 ).

This calculation applies to any one run of the algorithm. Taking expected values
over many randomized runs, we therefore have

E(ℓ) 6 ℓ′′′ + 2 + 2E(log5 j0 ) 6 ℓ′′′ + 2 + 2 log5 E(j0 ). (6.12)

Note that we have used the law E(log j0 ) 6 log(E(j0 )), which holds because
log is a concave function. Combining (6.12) with (6.11), we therefore have the

desired result:
E(ℓ) = ℓ′′′ + O(log(log(1/ε))).

Time complexity

Finally, we turn to the time complexity of the algorithm. For this again, we rely on
our number theoretic conjecture on the distribution of primes to estimate how many
candidates must be tried before one that is prime is reached.

Proposition 6.2.12. Algorithm 6.2.5 runs in expected time O(polylog(1/ε)). This


is true whether or not a factorization oracle is used.

Proof. Let M be the uprightness of the ε-region. Let j0 be the average number of
candidates tried in steps 2(a)–(c) of the algorithm, and let ℓj0 be the least denominator
exponent of the final candidate. Let n be the largest integer that appears in step 2(a)
of the algorithm.
By Proposition 5.1.17, step 1 of the algorithm requires O(log(1/M )) arithmetic
operations, plus a constant number per candidate. For each of the j0 candidates,
step 2(a) requires O(1) arithmetic operations. Step 2(b) also requires O(1) arithmetic
operations, either due to the use of a factoring oracle, or else, because we can put
an arbitrary fixed bound on the amount of effort invested in factoring any given
integer. At minimum, this will succeed when the integer in question is prime, which
is sufficient for the estimates of Proposition 6.2.11. Step 2(c) requires O(polylog(n))
operations by Proposition 3.2.8. Finally, step 3 requires O(ℓj0 ) arithmetic operations
by Proposition 6.1.11. So the total expected number of arithmetic operations is

O(log(1/M )) + j0 · O(polylog(n)) + O(ℓj0 ). (6.13)

Recall that the ε-region Rε, shown in Figure 6.1, contains a disk of radius ε²/4;
therefore, area(Rε) ≥ (π/16)ε⁴. On the other hand, the square [−1, 1] × [−1, 1] is a (not
very tight) bounding box for Rε. It follows that

M = up(Rε) = area(Rε)/area(BBox(Rε)) = Ω(ε⁴),

hence log(1/M ) = O(log(1/ε)). From (6.11), the expected value of j0 is O(log(1/ε)).


Combining (6.7) with (6.10), we therefore have

ℓj0 ≤ ℓ3 + 2(1 + log5 j0 ) = O(log(1/ε)) + O(log(log(1/ε))) = O(log(1/ε)).

Now note that for any i, ni ≤ 5^{ℓi +1} . This, together with the fact that candidates are
enumerated in order of increasing denominator exponent, implies that n ≤ 5^{ℓj0 +1} , hence

polylog(n) = O(poly(ℓj0 )) = O(polylog(1/ε)).

Combining all of these estimates with (6.13), the expected number of arithmetic
operations for the algorithm is O(polylog(1/ε)). Moreover, each individual arithmetic
operation can be performed with precision O(log(1/ε)), taking time O(polylog(1/ε)).
Therefore the total expected time complexity of the algorithm is O(polylog(1/ε)), as
desired.

6.3 Approximate synthesis of special unitaries

The algorithm of the previous section allows us to approximate z-rotations up to arbi-


trarily small accuracy. This method can be used to solve the problem of approximate
synthesis of arbitrary special unitaries over the Clifford+V gate set.

Problem 6.3.1. Given a special unitary U ∈ SU(2) and a precision ε > 0, construct a
Clifford+V circuit W whose V-count is as small as possible and such that ∥W − U ∥ ≤ ε.

Indeed, an element U ∈ SU(2) can always be decomposed as a product of three


rotations using Euler angles. Hence, for some angles θ1 , θ2 , and θ3 ,

U = Rz (θ1 )Rx (θ2 )Rz (θ3 ).

Using the Hadamard gate, the central x-rotation can be expressed as a z-rotation.
Thus
U = Rz (θ1 )HRz (θ2 )HRz (θ3 ).

We can therefore use Algorithm 6.2.5 to find a Clifford+V circuit approximating each
of the Rz (θi ) up to ε/3. Since the Hadamard gate is a Clifford operator, this yields a
Clifford+V approximation of U up to ε.

We note that optimality is lost in the process of decomposing an operator as a


product of z-rotations. Indeed, if we write a special unitary U as

U = Rz (θ1 )HRz (θ2 )HRz (θ3 )

and approximate each z-rotation using Algorithm 6.2.5, we obtain a circuit whose
length exceeds the optimal one by a factor of 3 in the typical case.
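To make the Euler-angle reduction concrete, the following Haskell sketch shows how a z-rotation synthesizer could be lifted to arbitrary special unitaries. The Gate type and the function synthesizeRz are hypothetical placeholders standing in for the actual implementation [57]; only the decomposition U = Rz (θ1 )HRz (θ2 )HRz (θ3 ) and the splitting of the error budget into ε/3 per rotation are taken from the discussion above.

-- A sketch of Euler-angle based approximation, assuming a z-rotation
-- synthesizer is available.  'Gate' and 'synthesizeRz' are hypothetical
-- placeholders, not the thesis' actual code.
data Gate = H | S | V1 | V2 | V3
  deriving (Show, Eq)

-- Assumed: a Clifford+V word approximating Rz(theta) up to eps
-- (for instance, Algorithm 6.2.5).
synthesizeRz :: Double -> Double -> [Gate]
synthesizeRz = undefined

-- Approximate U = Rz(theta1) H Rz(theta2) H Rz(theta3) up to eps by
-- approximating each z-rotation up to eps/3.  The list is read as a
-- matrix product, from left to right.
approxSpecialUnitary :: (Double, Double, Double) -> Double -> [Gate]
approxSpecialUnitary (theta1, theta2, theta3) eps =
     synthesizeRz theta1 (eps / 3)
  ++ [H]
  ++ synthesizeRz theta2 (eps / 3)
  ++ [H]
  ++ synthesizeRz theta3 (eps / 3)

As noted above, this composition is correct because the errors of the three factors add, but it typically costs about three times the optimal gate count.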
Chapter 7

Clifford+T approximate synthesis

In this chapter, we introduce an efficient algorithm to solve the problem of approxi-


mate synthesis of special unitaries over the Clifford+T gate set. Recall from Chapter 1
that the T gate is the following matrix
T = [ 1  0 ; 0  ω ] .

The Clifford+T gate set is obtained by adding the T gate to the generators ω, H,
and S of the Clifford group.
The results presented in this chapter are obtained by adapting the methods of
Chapter 6 to the Clifford+T setting. Like in Chapter 6, we first consider the exact
synthesis of Clifford+T operators. This provides a characterization of Clifford+T
circuits which we then use to define an algorithm for the approximate synthesis of
z-rotations. Because of the similarities between this chapter and the previous one, we
omit most proofs in order to avoid redundancy. However, we explain the differences
when they occur.
The approximate synthesis algorithms introduced in this chapter (Algorithm 7.2.5
and Algorithm 7.3.9) have been implemented in Haskell. The implementations are
freely available [57].

7.1 Exact synthesis of Clifford+T operators

Problem 7.1.1 (Exact synthesis of Clifford+T operators). Given a unitary U ∈ U(2),


determine whether there exists a Clifford+T circuit W such that U = W and, in case
such a circuit exists, construct one whose T-count is minimal.

Problem 7.1.1 was first solved in [41]. A version of Problem 7.1.1 generalized to
multi-qubit circuits was solved in [21].
To characterize Clifford+T operators, we consider the following set of unitaries.


Definition 7.1.2. The set T consists of unitary matrices of the form


U = (1/√2^k) [ α  γ ; β  δ ]                                    (7.1)

where k ∈ N and α, β, γ, δ ∈ Z[ω].

We note that T is the subgroup of U(2) consisting of matrices with entries in
Z[1/√2, i]. This is slightly different from the situation in the previous chapter, where
we had defined V to be a strict subset of the group of unitary matrices over Z[1/√5, i].
We will also use a notion of denominator exponent for the elements of T .

Definition 7.1.3. Let U ∈ T be as in (7.1). The integer k is called a denominator


exponent of U . The least k such that U can be written as in (7.1) is the least
denominator exponent of U . These notions extend naturally to vectors and scalars of
the form

(1/√2^k) [ α ; β ]        and        (1/√2^k) α,                    (7.2)
where k ∈ N and α, β ∈ Z[ω].

In the previous chapter, we used the set V to characterize Clifford+V operators.


Similarly, one can prove that Clifford+T operators are exactly the elements of T .

Proposition 7.1.4 (Kliuchnikov, Maslov, Mosca [42]). A unitary operator U ∈ U (2)


is exactly representable by a Clifford+T circuit if and only if U ∈ T . Moreover, there
exists an efficient algorithm that computes a Clifford+T circuit for U with minimal
T -count.

The above proposition can be proved by a technique similar to the one used to
establish Proposition 6.1.11. To this end one first shows that every vector of the
form (7.2) can be reduced to e1 = [ 1 ; 0 ] by applying well-chosen Clifford+T operators.
Applying this method to the first column of an element U of T then yields a circuit
for U .
Recall that in Proposition 6.1.11, the minimal V -count of the operator U was equal

to its least 5-denominator exponent. The relation between denominator exponent
and minimal T -count is slightly more complicated.

Proposition 7.1.5. Let U ∈ T with least denominator exponent k and minimal


T -count t. Then 2k − 3 ≤ t ≤ 2k + 1.

Proof. See, e.g., [22].

7.2 Approximate synthesis of z-rotations

As in Chapter 6, we consider the problem of approximate synthesis of z-rotations.

Problem 7.2.1. Given an angle θ and a precision ε > 0, construct a Clifford+T


circuit U whose T-count is as small as possible and such that

∥U − Rz (θ)∥ ≤ ε. (7.3)

An algorithm to solve Problem 7.2.1 can be used to solve the problem of approx-
imate synthesis of arbitrary special unitaries using Euler angles, as in Section 6.3.
Our algorithm solving Problem 7.2.1 relies on a reduction of the problem to a
grid problem, a Diophantine equation and an exact synthesis problem. This is anal-
ogous to the reduction described in the Clifford+V case by Proposition 6.2.4. In the
Clifford+T context, we must first show that enumerating candidate solutions in order
of increasing denominator exponents allows us to also enumerate candidate solutions
in order of minimal T -count. This is not immediate, due to Proposition 7.1.5.

Lemma 7.2.2. If ε < |1 − e^{iπ/8}|, then all solutions to Problem 7.2.1 have the form

U = (1/√2^k) [ u  −t† ; t  u† ] .                                   (7.4)

If ε ≥ |1 − e^{iπ/8}|, then there exists a solution of T -count 0 (i.e., a Clifford operator),
and it is also of the form (7.4).

Proof. Analogous to the proof of Lemma 6.2.2.

Lemma 7.2.3. Let U be a unitary operator as in (7.4) with least denominator ex-
ponent k. Then the T -count of U is either 2k − 2 or 2k. Moreover, if k > 0 and
U has T -count 2k, then U ′ = T U T † has T -count 2k − 2. We further note that
∥Rz (θ) − U ′ ∥ = ∥Rz (θ) − U ∥, so for the purpose of solving (7.3), it does not matter
whether U or U ′ is used. Hence, without loss of generality, we may assume that U as
in (7.4) always has T -count exactly 2k − 2 when k > 0, and 0 when k = 0.

Proof. The claims about the T -counts of U and U ′ follow by inspection of Figure 2
of [22]. Using the terminology of Definitions 7.4 and 7.6 of [22], this figure shows
every possible k-residue of a Clifford+T operator, modulo a right action of the group
⟨S, X, ω⟩. Because U is of the form (7.4), only a subset of the k-residues is actually
possible, and the figure shows that for this subset, the T -count is 2k or 2k − 2.
Moreover, in each of the possible cases where k > 0 and U has T -count 2k, the figure
also shows that U ′ = T U T † has T -count 2k − 2.
For the final claim, we have ∥Rz (θ) − U ∥ = ∥T Rz (θ)T † − T U T † ∥ = ∥Rz (θ) − U ′ ∥
because Rz (θ) and T commute.

We can now state a reduction for Problem 7.2.1 as we did in Proposition 6.2.4 for
Problem 6.2.1.

Proposition 7.2.4. Problem 7.2.1 reduces to a grid problem, a Diophantine equation,


and an exact synthesis problem, namely:
1. find k ∈ N and α ∈ Z[ω] such that α ∈ √2^k Rε and α• ∈ (−√2)^k D,

2. find β ∈ Z[ω] such that β†β = 2^k − α†α, and

3. define the unitary matrix U as


U = (1/√2^k) [ α  −β† ; β  α† ]

and find a Clifford+T circuit for U or T U T † , whichever has the smaller T -


count.

Moreover, the least k for which the above three problems can be solved yields an optimal
solution to Problem 7.2.1.

Note that item 1 of Proposition 7.2.4 is a grid problem over Z[ω]. This is in
contrast with the corresponding item of Proposition 6.2.4, which was a grid problem
over Z[i]. In both cases, we look for points in a scaled ε-region in order of increasing
denominator exponent. However, in the Clifford+T case, the desired points must be
elements of Z[ω]. Since Z[ω] is dense in R2 , there are infinitely many elements in
Rε ∩ Z[ω] for any fixed denominator exponent. To circumvent this issue, we only

consider those elements of Rε ∩ Z[ω] for which the Diophantine equation of item 2
can potentially be solved. Since we have
α†α + β†β = 2^k =⇒ α ∈ √2^k D and α• ∈ (−√2)^k D,

where D is the closed unit disk, the points of interest are precisely the solutions to
the scaled grid problem over Z[ω] for Rε and D.

Algorithm 7.2.5. Let θ and ε > 0 be given.

1. Use the algorithm from Proposition 5.2.37 of Chapter 5 to enumerate the infinite
sequence of solutions to the scaled grid problem over Z[ω] for Rε and D and k
in order of increasing k.

2. For each solution α:

(a) Let ξ = 2^k − α†α and n = ξ•ξ.

(b) Attempt to find a prime factorization of n. If n ̸= 0 but no prime factor-


ization is found, skip step 2(c) and continue with the next α.

(c) Use the algorithm from Proposition 3.2.9 of Chapter 3 to solve the equation
β † β = n. If a solution β exists, go to step 3; otherwise, continue with the
next α.

3. Define U as

U = (1/√2^k) [ α  −β† ; β  α† ]

and use the exact synthesis algorithm of Proposition 7.1.4 to find a Clifford+T
circuit for U or T U T †, whichever has the smaller T -count. Output this circuit
and stop.
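The control flow of Algorithm 7.2.5 can be summarized by the following Haskell sketch. All of the types and functions appearing below (gridSolutions, residue, integerNorm, attemptFactor, solveNormEq, exactSynthesis) are hypothetical placeholders for the subroutines referenced in steps 1–3; only the way they are chained together is depicted here.

-- A schematic rendering of the loop structure of Algorithm 7.2.5.
-- All names below are placeholders for the actual subroutines.
type ZOmega   = (Integer, Integer, Integer, Integer)  -- a + b*omega + c*omega^2 + d*omega^3
type ZRootTwo = (Integer, Integer)                    -- a + b*sqrt(2)
type Circuit  = [String]                              -- a word over the gate set

gridSolutions  :: Double -> Double -> [(Integer, ZOmega)]  -- step 1 (Prop. 5.2.37)
gridSolutions  = undefined

residue        :: Integer -> ZOmega -> ZRootTwo            -- xi = 2^k - alpha† alpha
residue        = undefined

integerNorm    :: ZRootTwo -> Integer                      -- n = xi• xi
integerNorm    = undefined

attemptFactor  :: Integer -> Maybe [Integer]               -- step 2(b), bounded effort
attemptFactor  = undefined

solveNormEq    :: ZRootTwo -> [Integer] -> Maybe ZOmega    -- step 2(c) (Prop. 3.2.9)
solveNormEq    = undefined

exactSynthesis :: Integer -> ZOmega -> ZOmega -> Circuit   -- step 3 (Prop. 7.1.4)
exactSynthesis = undefined

approximateRz :: Double -> Double -> Circuit
approximateRz theta eps = go (gridSolutions theta eps)
  where
    -- step 2: try each candidate alpha, in order of increasing k
    go ((k, alpha) : rest) =
      let xi = residue k alpha
      in case attemptFactor (integerNorm xi) >>= solveNormEq xi of
           Just beta -> exactSynthesis k alpha beta   -- success: stop
           Nothing   -> go rest                       -- otherwise, next candidate
    go [] = error "unreachable: the candidate stream is infinite"

The sketch makes explicit that the algorithm stops at the first candidate for which the Diophantine step succeeds, which is the fact the optimality argument below relies on.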

We now state the properties of Algorithm 7.2.5. In most cases it enjoys the same
properties as the Clifford+V algorithm.

Proposition 7.2.6 (Correctness). If Algorithm 7.2.5 terminates, then it yields a


valid solution to the approximate synthesis problem, i.e., it yields a Clifford+T circuit
approximating Rz (θ) up to ε.

Proposition 7.2.7 (Optimality in the presence of a factoring oracle). In the presence


of an oracle for integer factoring, the circuit returned by Algorithm 7.2.5 has the
smallest T-count of any single-qubit Clifford+T circuit approximating Rz (θ) up to ε.

Correctness and optimality are proved like the corresponding propositions in


Chapter 6.
Here also, we rely on a number-theoretic assumption on the distribution of primes
to establish the remaining properties of the algorithm.

Hypothesis 7.2.8. For each n produced in step 2(a) of Algorithm 7.2.5, write n =
2^j m, where m is odd. Then m is asymptotically as likely to be prime as a randomly
chosen odd number of comparable size. Moreover, the primality of each m can be
modelled as an independent random event.

Note that Hypothesis 7.2.8 is slightly different than Hypothesis 6.2.9. Indeed,
Hypothesis 6.2.9 makes an additional assumption on the residue class of the integer
m. Here it is not necessary to make such an assumption, since we can prove that the
number n produced in step 2(a) of the algorithm satisfies n ≥ 0 and moreover is such
that either n = 0 or n ≡ 1 (mod 8). A proof of this fact can be found in Appendix D
of [56].

Proposition 7.2.9 (Near-optimality in the absence of a factoring oracle). Let m


be the T-count of the solution of the approximate synthesis problem found by Algo-
rithm 7.2.5 in the absence of a factoring oracle. Then

1. The approximate synthesis problem has at most O(log(1/ε)) non-equivalent so-


lutions with T -count less than m.

2. The expected value of m is m′′ + O(log(log(1/ε))), where m′ and m′′ are the
T -counts of the optimal and second-to-optimal solutions of the approximate syn-
thesis problem (up to equivalence).

Note that in Proposition 7.2.9, we use the second-to-optimal solution, rather than
the third-to-optimal solution as in Proposition 6.2.11. This is due to the fact that
√2 ∈ Z[ω] whereas √5 ∉ Z[i]. Indeed, if α and α′ are two solutions of least denom-
inator exponent k and k′ with k ≤ k′, then they are both solutions of denominator

exponent k ′ . But in the case of the Clifford+V gates, we need to have three solutions
to guarantee that two will have the same denominator exponent.
The last property of the algorithm can be proved just like the corresponding one
from Chapter 6.

Proposition 7.2.10. Algorithm 7.2.5 runs in expected time O(polylog(1/ε)). This


is true whether or not a factorization oracle is used.

7.3 Approximation up to a phase

So far, we have considered the problem of approximate synthesis “on the nose”, i.e.,
the operator U in Problem 7.2.1 was literally required to approximate Rz (θ) in the
operator norm. However, it is well-known that global phases have no observable effect
in quantum mechanics, so in quantum computing, it is also common to consider the
problem of approximate synthesis “up to a phase”. This is made precise in the
following definition.

Problem 7.3.1. Given θ and some ε > 0, the approximate synthesis problem for
z-rotations up to a phase is to find an operator U expressible in the single-qubit
Clifford+T gate set, and a unit scalar λ, such that

∥Rz (θ) − λU ∥ 6 ε. (7.5)

Moreover, it is desirable to find U of smallest possible T -count. As before, the norm


in (7.5) is the operator norm.

In this section, we will give a version of Algorithm 7.2.5 that optimally solves the
approximate synthesis problem up to a phase. The central insight is that it is in fact

sufficient to restrict λ to only two possible phases, namely λ = 1 and λ = e^{iπ/8}.
First, note that if W is a unitary 2 × 2-matrix and det W = 1, then tr W is real.
This is obvious, because det W = 1 ensures that the two eigenvalues of W are each
other’s complex conjugates.

Lemma 7.3.2. Let W be a unitary 2 × 2-matrix, and assume that det W = 1 and
tr W ≥ 0. Then for all unit scalars λ, we have

∥I − W ∥ ≤ ∥I − λW ∥.

Proof. We may assume without loss of generality that W is diagonal. Since det W =
1, we can write
W = [ e^{iφ}  0 ; 0  e^{−iφ} ]

for some φ. By symmetry, we can assume without loss of generality that 0 ≤ φ ≤ π.
Since tr W ≥ 0, we have φ ≤ π/2. Now consider a unit scalar λ = e^{iψ}, where
−π ≤ ψ ≤ π. Then ∥I − λW ∥ = max{|1 − e^{i(ψ+φ)}|, |1 − e^{i(ψ−φ)}|} and ∥I − W ∥ = |1 − e^{iφ}|.
If ψ ≥ 0, then |1 − e^{iφ}| ≤ |1 − e^{i(ψ+φ)}|. Similarly, if ψ ≤ 0, then |1 − e^{iφ}| ≤ |1 − e^{i(ψ−φ)}|.
In either case, we have ∥I − W ∥ ≤ ∥I − λW ∥, as claimed.

Lemma 7.3.3. Fix ε, a unitary operator R with det R = 1, and a Clifford+T operator
U . The following are equivalent:

1. There exists a unit scalar λ such that

∥R − λU ∥ ≤ ε;

2. There exists n ∈ Z such that

∥R − e^{inπ/8} U ∥ ≤ ε.

Proof. It is obvious that (2) implies (1). For the opposite implication, first note that,
because U is a Clifford+T operator, we have det U = ω^k for some k ∈ Z, and therefore
det(R^{−1} U) = ω^k. Let V = e^{−ikπ/8} R^{−1} U, so that det V = 1. If tr V ≥ 0, let W = V;
otherwise, let W = −V. Either way, we have W = e^{inπ/8} R^{−1} U, where n ∈ Z, and
det W = 1, tr W ≥ 0. Let λ′ = e^{−inπ/8} λ. By Lemma 7.3.2, we have

∥I − W ∥ ≤ ∥I − λ′ W ∥
⇒ ∥I − e^{inπ/8} R^{−1} U ∥ ≤ ∥I − λ′ e^{inπ/8} R^{−1} U ∥
⇒ ∥R − e^{inπ/8} U ∥ ≤ ∥R − λ′ e^{inπ/8} U ∥
⇒ ∥R − e^{inπ/8} U ∥ ≤ ∥R − λU ∥,

which implies the desired claim.

Remark 7.3.4. A version of Lemma 7.3.3 applies to gate sets other than Clifford+T ,
as long as the gate set has discrete determinants.

Corollary 7.3.5. In Problem 7.3.1, it suffices without loss of generality to consider
only the two scalars λ = 1 and λ = e^{iπ/8}.

Proof. Suppose U is a Clifford+T operator satisfying (7.5) for some unit scalar λ.
By Lemma 7.3.3, there exists a λ of the form e^{inπ/8} also satisfying (7.5). Then we
can write λ = ω^k λ′, where k ∈ Z and λ′ ∈ {1, e^{iπ/8}}. Letting U ′ = ω^k U , we have
λ′ U ′ = λU , and therefore
∥Rz (θ) − λ′ U ′ ∥ ≤ ε,

as claimed. Moreover, since ω = e^{iπ/4} is a Clifford operator, U and U ′ have the same
T -count.

To solve the approximate synthesis problem up to a phase, we therefore need an


algorithm for finding optimal solutions of (7.5) in the cases λ = 1 and λ = e^{iπ/8}. For
λ = 1, this is of course just Algorithm 7.2.5. So all that remains to do is to find an
algorithm for solving

∥Rz (θ) − e^{iπ/8} U ∥ ≤ ε.                                          (7.6)

We use a sequence of steps very similar to those of Proposition 7.2.4 to reduce this
to a grid problem and a Diophantine equation. We first consider the form of U .

Lemma 7.3.6. If ε < |1 − e^{iπ/8}|, then all solutions of (7.6) have the form

U = [ u  −t†ω^{−1} ; t  u†ω^{−1} ] .                                 (7.7)

Proof. This is completely analogous to the proof of Lemma 7.2.2, using e^{iπ/8} U in
place of U .

Recall that δ = 1 + ω, and note that δ/|δ| = e^{iπ/8}. Also note that δω^{−1} = δ†, and
that δ^{−1} = (ω − i)/√2. Suppose that U is of the form (7.7). Let u′ = δu and t′ = δt.
We have:

∥Rz (θ) − e^{iπ/8} U ∥ = ∥ Rz (θ) − (δ/|δ|) [ u  −t†ω^{−1} ; t  u†ω^{−1} ] ∥
                      = ∥ Rz (θ) − (1/|δ|) [ δu  −δ†t† ; δt  δ†u† ] ∥
                      = ∥ Rz (θ) − (1/|δ|) [ u′  −t′† ; t′  u′† ] ∥ .

Using exactly the same argument as in Proposition 7.2.4, it follows that (7.6) holds
if and only if u′/|δ| ∈ Rε , i.e., u′ ∈ |δ| Rε .
As before, in order for U to be unitary, of course it must satisfy u† u + t† t = 1, and
a necessary condition for this is u, u• ∈ D. The latter condition can be equivalently
re-expressed in terms of u′ by requiring u′ ∈ |δ| D and u′ • ∈ |δ •| D. Therefore, finding
solutions to (7.6) of the form (7.7) reduces to the two-dimensional grid problem u′ ∈
|δ|Rε and u′ • ∈ |δ •| D, together with the usual Diophantine equation u† u+t† t = 1. The
last remaining piece of the puzzle is to compute the T -count of U , and in particular,
to ensure that potential solutions are found in order of increasing T -count.

Lemma 7.3.7. Let U be a Clifford+T operator of the form (7.7), and let k be the
least denominator exponent of u′ = δu. Then the T -count of U is either 2k − 1 or
2k + 1. Moreover, if k > 0 and U has T -count 2k + 1, then U ′ = T U T † has T -count
2k − 1.

Proof. This can be proved by a tedious but easy induction, analogous to Lemma 7.2.3.

We therefore arrive at the following algorithm for solving (7.6). Here we assume
ε < |1 − eiπ/8 |, so that Lemma 7.3.6 applies.

Algorithm 7.3.8. Given θ and ε, let A = |δ|Rε , and let B = |δ •| D.

1. Use Proposition 5.2.37 to enumerate the infinite sequence of solutions to the


scaled grid problem over Z[ω] for A, B, and k in order of increasing k.

2. For each such solution u′ :

(a) Let ξ = 2^k − u′†u′, and n = ξ•ξ.

(b) Attempt to find a prime factorization of n. If n ̸= 0 but no prime factor-


ization is found, skip step 2(c) and continue with the next u′ .

(c) Use the algorithm of Proposition 3.2.9 to solve the equation t† t = ξ. If a


solution t exists, go to step 3; otherwise, continue with the next u′ .

3. Define U as in equation (7.7), let U ′ = T U T † , and use the exact synthesis


algorithm of [42] to find a Clifford+T circuit implementing either U or U ′ ,
whichever has smaller T -count. Output this circuit and stop.

Algorithm 7.3.8 is optimal in the presence of a factoring oracle, and near-optimal


in the absence of a factoring oracle, in the same sense as Algorithm 7.2.5. Its expected
time complexity is O(polylog(1/ε)). The proofs are completely analogous to those
of the previous section. We then arrive at the following composite algorithm for the
approximate synthesis problem for z-rotations up to a phase:

Algorithm 7.3.9 (Approximate synthesis up to a phase). Given θ and ε, run both


Algorithms 7.2.5 and 7.3.8, and return whichever circuit has the smaller T -count.
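Assuming the two sub-algorithms are available as functions, the composite algorithm amounts to a one-line comparison; the names below are hypothetical placeholders for the actual implementations.

import Data.List (minimumBy)
import Data.Ord (comparing)

-- Hypothetical interfaces for Algorithms 7.2.5 and 7.3.8, together with a
-- T-count function; all three are placeholders.
type Circuit = [String]

tCount :: Circuit -> Int
tCount = undefined

algorithm725, algorithm738 :: Double -> Double -> Circuit
algorithm725 = undefined
algorithm738 = undefined

-- Algorithm 7.3.9: run both sub-algorithms and keep the circuit with the
-- smaller T-count.
approxUpToPhase :: Double -> Double -> Circuit
approxUpToPhase theta eps =
  minimumBy (comparing tCount) [algorithm725 theta eps, algorithm738 theta eps]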

Proposition 7.3.10 (Correctness, time complexity, and optimality). Algorithm 7.3.9


yields a valid solution to the approximate synthesis problem up to a phase. It runs
in expected time O(polylog(1/ε)). In the presence of a factoring oracle, the algo-
rithm is optimal, i.e., the returned circuit has the smallest T -count of any single-qubit
Clifford+T circuit approximating Rz (θ) up to ε and up to a phase. Moreover, in the
absence of a factoring oracle, the algorithm is near-optimal in the following sense.
Let m be the T -count of the solution found. Then:

1. The approximate synthesis problem up to a phase has an expected number of at


most O(log(1/ε)) non-equivalent solutions with T -count less than m.

2. The expected value of m is m′′′ + O(log(log(1/ε))), where m′′′ is the T -count


of the third-to-optimal solution (up to equivalence) of the approximate synthesis
problem up to a phase.

Proof. The correctness and time complexity of Algorithm 7.3.9 follows from that
of Algorithms 7.2.5 and 7.3.8. The optimality results follow from those of Algo-
rithms 7.2.5 and 7.3.8, keeping in mind that Algorithm 7.2.5 finds an optimal (or
near-optimal) solution for the phase λ = 1, Algorithm 7.3.8 finds an optimal (or
near-optimal) solution for the phase λ = eiπ/8 , and by Corollary 7.3.5, these are the
only two phases that need to be considered.
The only subtlety that must be pointed out is that in part 2 of the near-
optimality, we use the T -count of the third-to-optimal solution, rather than the
second-to-optimal one as in Proposition 7.2.9. This is because the optimal and second-
to-optimal solutions may belong to Algorithms 7.2.5 and 7.3.8, respectively, so that
it may not be until the third-to-optimal solution that the near-optimality result of
either Algorithm 7.2.5 or Algorithm 7.3.8 can be invoked.

Remark 7.3.11. Algorithms 7.2.5 and 7.3.8 share the same ε-region up to scaling, and
therefore the uprightness computation only needs to be done once.

Remark 7.3.12. By Lemmas 7.2.3 and 7.3.7, Algorithm 7.2.5 always produces circuits
with even T -count, and Algorithm 7.3.8 always produces circuits with odd T -count.
Instead of running both algorithms to completion, it is possible to interleave the two
algorithms, so that all potential solutions are considered in order of increasing T -
count. This is a slight optimization which does not, however, affect the asymptotic
time complexity.
Chapter 8

The Proto-Quipper language

In this chapter, we introduce the syntax and operational semantics of the Proto-
Quipper language.

8.1 From the quantum lambda calculus to Proto-Quipper

Proto-Quipper is based on the quantum lambda calculus. As was discussed in Sec-


tion 4.4, the execution of programs is modelled in the quantum lambda calculus by
a reduction relation defined on closures, which are triples [Q, L, a] consisting of a
quantum state Q, a list of term variables L, and a term a. The quantum state is held
in a quantum device capable of performing certain operations (applying unitaries,
measuring qubits,. . . ). The reduction relation in the quantum lambda calculus is
then defined as a probabilistic rewrite procedure on these closures. Typically, the
reduction will be classical until a redex involving a quantum constant is reached. At
this point, the quantum device will be instructed to perform the appropriate quantum
operation. For example: “Apply a Hadamard gate to qubit number 3”.
Our approach in designing the Proto-Quipper language was to start with a lim-
ited (but still expressive) fragment of the Quipper language and make it completely
type-safe. The central aspect of Quipper that we chose to focus on is Quipper’s cir-
cuit description abilities: to generate and act on quantum circuits. Indeed, Quipper
provides the ability to treat circuits as data, and to manipulate them as a whole.
For example, Quipper has operators for reversing circuits, decomposing them into
gate sets, etc. This is in contrast with the quantum lambda calculus, where one
only manipulates qubits and all quantum operations are immediately carried out on
a quantum device, not stored for symbolic manipulation.
We therefore extend the quantum lambda calculus with the minimal set of features
that makes it Quipper-like. The current version of Proto-Quipper is designed to:

• incorporate Quipper’s ability to generate and act on quantum circuits, and to


• provide a linear type system to guarantee that the produced circuits are physi-
cally meaningful (in particular, properties like no-cloning are respected).

To achieve these goals, we define Proto-Quipper as a typed lambda calculus, whose


type system is similar to that of the quantum lambda calculus. The main difference
between Proto-Quipper and the quantum lambda calculus is that the reduction re-
lation of Proto-Quipper is defined on closures [C, a] that consist of a term a and a
circuit state C. Here, the state C represents the circuit currently being built. Instead
of having a quantum device capable of performing quantum operations, we assume
that we have a circuit constructor capable of performing certain circuit building op-
erations (such as appending gates, reversing, etc.). The reduction is then defined as a
rewrite procedure on closures. As in the quantum lambda calculus, some redexes will
affect the state by sending instructions to the circuit constructor. For example: “Ap-
pend a Hadamard gate to wire number 3”. In the current version of Proto-Quipper,
we make the simplifying assumption that no measurements are available, so that the
reduction relation is non-probabilistic.

8.2 The syntax of Proto-Quipper

In this section, we present in detail the syntax and type system of Proto-Quipper.

Definition 8.2.1. The types of Proto-Quipper are defined by

A, B ::= qubit 1 bool A⊗B A(B !A Circ(T, U ).

Among the types, we single out the subset of quantum data types

T, U ::= qubit 1 T ⊗ U.

The types 1, bool, A ⊗ B, A ( B, and !A are inherited from the quantum


lambda calculus and should be interpreted as they were in Section 4.4. The elements
of qubit are references to a logical qubit within a computation. They can be thought
of as references to quantum bits on some physical device, or simply as references to
quantum wires within the circuit currently being constructed. Elements of quantum
data types describe sets of circuit endpoints, and consist of tuples of wire identifiers.
We can think of these as describing circuit interfaces. Finally, the type Circ(T, U ) is

the set of all circuits having an input interface of type T and an output interface of
type U .
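For concreteness, the grammar of types can be transcribed into a Haskell datatype. This is only an illustrative rendering and not part of the formal development; the constructor names are ours, and TArrow stands for the linear function type A ( B.

-- An illustrative Haskell rendering of Proto-Quipper types (Definition 8.2.1).
data Type
  = TQubit               -- qubit
  | TUnit                -- 1
  | TBool                -- bool
  | TTensor Type Type    -- A ⊗ B
  | TArrow  Type Type    -- A ( B (linear function type)
  | TBang   Type         -- !A
  | TCirc   Type Type    -- Circ(T, U)
  deriving (Eq, Show)

-- Quantum data types are the types built from qubit, 1, and ⊗ only.
isQDataType :: Type -> Bool
isQDataType TQubit        = True
isQDataType TUnit         = True
isQDataType (TTensor t u) = isQDataType t && isQDataType u
isQDataType _             = False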

Definition 8.2.2. The terms of Proto-Quipper are defined by

a, b, c ::= x q (t, C, a) True False ⟨a, b⟩ ∗ ab λx.a


rev unbox boxT if a then b else c let ∗ = a in b
let ⟨x, y⟩ = a in b.

where x and y come from a countable set V of term variables, q comes from a countable
set Q of quantum variables, and C comes from a countable set C of circuit constants.
Among the terms, we single out the subset of quantum data terms

t, u ::= q ∗ ⟨t, u⟩.

Moreover, we assume that C is equipped with two functions In, Out : C → Pf (Q) and
that Q is well-ordered. Here, Pf (Q) denotes the set of finite subsets of Q.
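The terms can be rendered similarly, using the Type datatype from the previous sketch; representing variables, quantum variables, and circuit constants by strings is an assumption of this illustration.

-- An illustrative rendering of Proto-Quipper terms (Definition 8.2.2),
-- assuming the Type datatype sketched above.
type Var    = String   -- term variables x, y
type QVar   = String   -- quantum variables q
type CConst = String   -- circuit constants C

data Term
  = Var Var
  | QVar QVar
  | CircTerm Term CConst Term        -- (t, C, a)
  | TTrue | TFalse
  | Pair Term Term                   -- ⟨a, b⟩
  | Star                             -- ∗
  | App Term Term                    -- a b
  | Lam Var Term                     -- λx.a
  | Rev | Unbox | Box Type           -- rev, unbox, box_T
  | If Term Term Term                -- if a then b else c
  | LetStar Term Term                -- let ∗ = a in b
  | LetPair Var Var Term Term        -- let ⟨x, y⟩ = a in b
  deriving (Show)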

The meaning of most terms is intended to be the standard one. For example ⟨a, b⟩
is the pair of a and b, True and False are the booleans and λx.a is the function which
maps x to a. We briefly discuss the meaning of the more unusual terms.

• A circuit constant C represents a low-level quantum circuit. Because it would


be complicated, and somewhat beside the point, to define a formal language
for describing low-level quantum circuits, Proto-Quipper assumes that there
exists a constant symbol for every possible quantum circuit. Each circuit C
is equipped with a finite set of inputs and a finite set of outputs, which are
subsets of the set of quantum variables Q. Proto-Quipper’s abstract treatment
of quantum circuits is further explained in Section 8.3.

• The term (t, C, a) represents a quantum circuit, regarded as Proto-Quipper


data. The purpose of the terms t and a is to provide structure on the (otherwise
unordered) sets of inputs and outputs of C, so that these inputs and outputs
can take the shape of Proto-Quipper quantum data. For example, suppose that
C is a circuit with inputs {q1 , q2 , q3 } and outputs {q4 , q5 , q6 }. Then the term

(⟨q2 , ⟨q3 , q1 ⟩⟩, C, ⟨⟨q4 , q6 ⟩, q5 ⟩)



represents the circuit C, but also specifies what it means to apply this circuit to
a quantum data term ⟨p, ⟨r, s⟩⟩. Namely, in this case, the circuit inputs q2 , q3 ,
and q1 will be applied to qubits p, r, and s, respectively. Moreover, if the output
of this circuit is to be matched against the pattern ⟨⟨x, y⟩, z⟩, then the variables
x, y, and z will be bound, respectively, to the quantum bits at endpoints q4 , q6 ,
and q5 .

Terms of the form (t, C, a) are not intended to be written by the user of the pro-
gramming language; in fact, a Proto-Quipper implementation would not provide
a concrete syntax for such terms. Rather, these terms are internally generated
during the evaluation of Proto-Quipper programs. However, the circuits for
certain basic gates may be made available to the user as pre-defined symbols.

• boxT is a built-in function to turn a circuit-producing function (for example, a


function of type T ( U ) into a circuit regarded as data (for example, of type
Circ(T, U )).

• unbox is a built-in function for turning a circuit regarded as data into a circuit-
producing function. It is an inverse of boxT .

• rev is a built-in function for reversing a low-level circuit.

Note that the term boxT is parameterized by a type T . This Church-style typing of
the language is the reason why types were introduced before terms. Also note that
in a term like (t, C, a), t is assumed to be a quantum data term, but a is not. The
type system to be introduced below will guarantee that even though a is not yet a
quantum data term it will eventually reduce to one.

Examples 8.2.3. Suppose that H is the circuit constant for the Hadamard gate. The
term ⟨q1 , H, q2 ⟩ then represents the circuit consisting only of the Hadamard gate,
regarded as Proto-Quipper data. We can then define the circuit producing function
H = unbox⟨q1 , H, q2 ⟩. Similarly, if CN OT is the circuit constant for the controlled-
not gate, then the term CNOT = unbox⟨⟨q1 , q2 ⟩, CN OT, ⟨q3 , q4 ⟩⟩ is the corresponding
circuit producing function. In an implementation of the Proto-Quipper language, a fi-
nite set of such circuit producing functions would be provided as basic operations. For
example, a candidate such gate set would consist of the Clifford+T gate set extended

with the controlled-not gate: H, S, T, CNOT. Basic gates can then be combined. For
example, the term
λx. T(S(H x))

is the circuit producing function which applies in sequence the H, S, and T gates.
Using the boxT operator, we can turn this circuit producing function into a circuit

boxqubit (λx. T(S(H x))).
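Using the illustrative Term and Type datatypes sketched earlier, the boxed term of this example could be written as follows; the names hGate, sGate, and tGate abbreviate the pre-defined circuit-producing functions and are assumptions of this sketch.

-- The term box_qubit (λx. T(S(H x))), written in the illustrative Term datatype.
hGate, sGate, tGate :: Term
hGate = App Unbox (CircTerm (QVar "q1") "H" (QVar "q2"))
sGate = App Unbox (CircTerm (QVar "q1") "S" (QVar "q2"))
tGate = App Unbox (CircTerm (QVar "q1") "T" (QVar "q2"))

example :: Term
example =
  App (Box TQubit)
      (Lam "x" (App tGate (App sGate (App hGate (Var "x")))))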

As in the quantum lambda calculus, the operational semantics of Proto-Quipper


will be defined according to a call-by-value reduction strategy. We therefore define
what it means, for a term of Proto-Quipper, to be a value.

Definition 8.2.4. The values of Proto-Quipper are defined by

v, w ::= x q (t, C, u) True False ⟨v, w⟩


∗ λx.a boxT rev unbox unbox v.

Note that according to Definition 8.2.4, some applications are values, namely
terms of the form unbox v. This is consistent with the meaning of the unbox constant
discussed above. Indeed, if unbox turns a circuit into a circuit-generating function,
then a term of the form unbox v should be seen as a function awaiting an argument,
much like a term of the form λx.a, and therefore considered a value.
We now introduce some useful syntactic operations on types and terms. We start
by defining the notion of free variable for Proto-Quipper terms.

Definition 8.2.5. The set of free (term) variables of a term a, written FV(a), is
defined as

• FV(x) = {x},

• FV(⟨a, b⟩) = FV(a) ∪ FV(b),

• FV(ab) = FV(a) ∪ FV(b),

• FV(λx.a) = FV(a) \ {x},

• FV(if a then b else c) = FV(a) ∪ FV(b) ∪ FV(c),

• FV(let ∗ = a in b) = FV(a) ∪ FV(b),



• FV(let ⟨x, y⟩ = a in b) = FV(a) ∪ (FV(b) \ {x, y}),

• FV((t, C, a)) = FV(a), and

• FV(a) = ∅ in all remaining cases.

The above definition of free variables extends the standard one. Note that the free
variables of a term of the form (t, C, a) are the free variables of a. This is justified
since no variables ever appear in the quantum data term t.
The notions of α-equivalence, capture-avoiding substitution, etc., are defined in a
straightforward manner.
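Definition 8.2.5 translates directly into a recursive function over the illustrative Term datatype; as before, this is only a sketch.

import qualified Data.Set as Set
import Data.Set (Set)

-- Free term variables (Definition 8.2.5), over the illustrative Term datatype.
freeVars :: Term -> Set Var
freeVars (Var x)           = Set.singleton x
freeVars (Pair a b)        = freeVars a `Set.union` freeVars b
freeVars (App a b)         = freeVars a `Set.union` freeVars b
freeVars (Lam x a)         = Set.delete x (freeVars a)
freeVars (If a b c)        = Set.unions [freeVars a, freeVars b, freeVars c]
freeVars (LetStar a b)     = freeVars a `Set.union` freeVars b
freeVars (LetPair x y a b) =
  freeVars a `Set.union` Set.difference (freeVars b) (Set.fromList [x, y])
freeVars (CircTerm _ _ a)  = freeVars a
freeVars _                 = Set.empty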
By analogy with the free term variables of a term, we introduce a notion of quan-
tum variable of a term.

Definition 8.2.6. The set of free quantum variables of a term a, written FQ(a), is
defined as

• FQ(q) = {q},

• FQ(⟨a, b⟩) = FQ(a) ∪ FQ(b),

• FQ(ab) = FQ(a) ∪ FQ(b),

• FQ(λx.a) = FQ(a),

• FQ(if a then b else c) = FQ(a) ∪ FQ(b) ∪ FQ(c),

• FQ(let ∗ = a in b) = FQ(a) ∪ FQ(b),

• FQ(let ⟨x, y⟩ = a in b) = FQ(a) ∪ FQ(b), and

• FQ(a) = ∅ in all remaining cases.

Note that FQ((t, C, a)) = ∅. This reflects the idea that the quantum variables
appearing in t and a are “bound” in (t, C, a).
To append circuits, we will need to be able to express the way in which wires
should be connected. For this, we use the notion of a binding.

Definition 8.2.7. A finite bijection on a set X is a bijection between two finite


subsets of X. We write Bijf (X) for the set of finite bijections on X. The domain and
codomain of a finite bijection b are denoted dom(b) and cod(b), respectively.

Definition 8.2.8. A binding is a finite bijection on Q. We will usually denote


bindings by b.

Definition 8.2.9. If a is a term, b is a binding and FQ(a) = {q1 , . . . , qn } ⊆ dom(b),


then b(a) is the following term

b(a) = a[b(q1 )/q1 , . . . , b(qn )/qn ].

Definition 8.2.10. The partial function bind : QDataTerm2 → Bijf (Q) is defined as

• bind(∗, ∗) = ∅;

• bind(q1 , q2 ) = {(q1 , q2 )};

• bind(⟨t1 , t2 ⟩, ⟨u1 , u2 ⟩) = bind(t1 , u1 ) ∪· bind(t2 , u2 ), provided that bind(t1 , u1 ) ∩


bind(t2 , u2 ) = ∅;

• bind(t, u) = undefined, in all remaining cases.

Definition 8.2.11. Let T be a quantum data type and X a finite subset of Q. An


X-specimen for T is a quantum data term, written SpecX (T ), defined as

• SpecX (1) = ∗,

• SpecX (qubit) = q where q is the smallest quantum index of Q \ X,

• SpecX (T ⊗ U ) = ⟨t, u⟩ where t = SpecX (T ) and u = SpecX∪FQ(t) (U ).

Informally, an X-specimen for T is a quantum data term t that is “fresh” with


respect to the quantum variables appearing in X. If X is clear from the context, we
simply write Spec(T ). Note that the definition of specimen uses the fact that Q is
well-ordered.
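The operations bind and SpecX can likewise be sketched over the illustrative datatypes. Here fq is the quantum-variable analogue of freeVars restricted to quantum data terms, and an infinite supply of names q0, q1, ... plays the role of the well-ordering on Q; all names are assumptions of this sketch.

import qualified Data.Map as Map
import Data.Map (Map)
import qualified Data.Set as Set
import Data.Set (Set)

-- Free quantum variables of a quantum data term (a fragment of Definition 8.2.6).
fq :: Term -> Set QVar
fq (QVar q)   = Set.singleton q
fq (Pair t u) = fq t `Set.union` fq u
fq _          = Set.empty

-- bind (Definition 8.2.10): a finite bijection between the quantum variables of
-- two quantum data terms of the same shape, if one exists.  The side condition
-- of the definition is rendered here as disjointness of the two sub-bindings.
bind :: Term -> Term -> Maybe (Map QVar QVar)
bind Star         Star          = Just Map.empty
bind (QVar q1)    (QVar q2)     = Just (Map.singleton q1 q2)
bind (Pair t1 t2) (Pair u1 u2)  = do
  b1 <- bind t1 u1
  b2 <- bind t2 u2
  if Set.null (Map.keysSet b1 `Set.intersection` Map.keysSet b2)
    then Just (Map.union b1 b2)
    else Nothing
bind _ _                        = Nothing

-- Spec_X(T) (Definition 8.2.11): a quantum data term of shape T whose quantum
-- variables are fresh for X.
spec :: Set QVar -> Type -> Term
spec _ TUnit         = Star
spec x TQubit        = QVar (head [q | n <- [0 :: Int ..]
                                     , let q = "q" ++ show n
                                     , q `Set.notMember` x])
spec x (TTensor t u) = let s = spec x t in Pair s (spec (x `Set.union` fq s) u)
spec _ _             = error "spec: not a quantum data type"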
As in the quantum lambda calculus, we use a subtyping relation to deal with the
! modality.

Definition 8.2.12. The subtyping relation <: is the smallest relation on types satis-
fying the rules given in Figure 8.1.

Note that the subtyping of A ( B and Circ(A, B) is contravariant in the left


argument, i.e., A <: A′ implies A′ ( B <: A ( B.

qubit <: qubit        1 <: 1        bool <: bool

A1 <: B1    A2 <: B2
(A1 ⊗ A2 ) <: (B1 ⊗ B2 )

A2 <: A1    B1 <: B2
(A1 ( B1 ) <: (A2 ( B2 )

A2 <: A1    B1 <: B2
Circ(A1 , B1 ) <: Circ(A2 , B2 )

A <: B    (n = 0 ⇒ m = 0)
!n A <: !m B

Figure 8.1: Subtyping rules for Proto-Quipper.

Remark 8.2.13. If A <: B then:

1. if A ∈ {qubit, 1, bool}, then A = B;

2. if A = A1 ⊗ A2 , then B = B1 ⊗ B2 , A1 <: B1 and A2 <: B2 ;

3. if A = A1 ( A2 , then B = B1 ( B2 , B1 <: A1 and A2 <: B2 ;

4. if A = Circ(A1 , A2 ), then B = Circ(B1 , B2 ), B1 <: A1 and A2 <: B2 ;

5. if B = !B ′ , then A = !A′ and A′ <: B ′ ;

6. if A is not of the form !A′ , then B is not of the form !B ′ .

Proposition 8.2.14. The subtyping relation is reflexive and transitive.

As in the quantum lambda calculus, the following subtyping rule is derivable

!A <: A .
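The subtyping relation of Figure 8.1 can be decided by a simple recursive procedure over the illustrative Type datatype: strip the outer ! constructors from both types, check the side condition on the number of bangs, and compare the remaining structure. The following Haskell sketch is one possible such procedure, not the thesis' implementation.

-- Strip outer ! constructors, returning their number and the bang-free core.
stripBangs :: Type -> (Int, Type)
stripBangs (TBang a) = let (n, a') = stripBangs a in (n + 1, a')
stripBangs a         = (0, a)

-- A decision procedure for the subtyping relation of Figure 8.1 (sketch).
subtype :: Type -> Type -> Bool
subtype a b =
  let (n, a') = stripBangs a
      (m, b') = stripBangs b
  in (m == 0 || n > 0)       -- side condition: n = 0 implies m = 0
     && core a' b'
  where
    core TQubit          TQubit          = True
    core TUnit           TUnit           = True
    core TBool           TBool           = True
    core (TTensor a1 a2) (TTensor b1 b2) = subtype a1 b1 && subtype a2 b2
    core (TArrow  a1 a2) (TArrow  b1 b2) = subtype b1 a1 && subtype a2 b2  -- contravariant
    core (TCirc   a1 a2) (TCirc   b1 b2) = subtype b1 a1 && subtype a2 b2  -- contravariant
    core _               _               = False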

Definition 8.2.15. A typing context is a finite set {x1 : A1 , . . . , xn : An } of pairs


of a variable and a type, such that no variable occurs more than once. A quantum
context is a finite set of quantum variables. The expressions of the form x : A in a
typing context are called type declarations.

We write Γ or ∆ for a typing context and Q for a quantum context. We also adopt
the previous notational conventions when dealing with typing contexts: |Γ|, Γ(xi ),
!Γ, and Γ <: Γ′ . Moreover, we still write Γ, Γ′ to denote the union of two contexts,
which is defined when |Γ| ∩ |Γ′ | = ∅.

Definition 8.2.16. Let T, U be quantum data types. For each of the constants boxT ,
unbox, and rev, we introduce a type as follows

• AboxT (T, U ) = !(T ( U ) ( ! Circ(T, U ),

• Aunbox (T, U ) = Circ(T, U ) ( !(T ( U ), and

• Arev (T, U ) = Circ(T, U ) ( ! Circ(U, T ).

Definition 8.2.17. A typing judgment is an expression of the form:

Γ; Q ⊢ a : A

where Γ is a typing context, Q is a quantum context, a is a term and A is a type.


A typing judgment is valid if it can be inferred from the rules given in Figure 8.2.
In the rule (cst), c ranges over the set {boxT , unbox, rev}. Each typing rule carries
an implicit side condition that the judgements appearing in it are well-formed. In
particular, a rule containing a context of the form Γ1 , Γ2 may not be applied unless
|Γ1 | ∩ |Γ2 | = ∅.

Note that in the typing judgements of Proto-Quipper, quantum variables and


variables are kept separate. As a result, we do not have to specify that q : qubit for
every quantum variable q since the typing rules implicitly enforce this. However, when
a future version of Proto-Quipper will be equipped with the ability to manipulate
quantum and classical wires, the type of a wire might have to be explicitly stated.
As a first illustration of the safety properties of the type system, note that the
(⊗i ) rule ensures that λx.⟨x, x⟩ cannot be given the type qubit ( qubit ⊗ qubit.

8.3 The operational semantics of Proto-Quipper

As mentioned in Section 8.1, the reduction relation for Proto-Quipper is defined in the
presence of a circuit constructor. This is a device capable of performing certain basic
circuit building operations. It is not necessary to have a detailed description of the
inner workings of this device. In fact, all that is required for the definition of Proto-
Quipper’s operational semantics is the existence of some primitive operations. We
now axiomatize these operations. Their intuitive meaning will be explained following
Definition 8.3.1.
A <: B (axc ) (axq )
!∆, x : A; ∅ ⊢ x : B !∆; {q} ⊢ q : qubit
!Ac (T, U ) <: B
(cst) (∗i )
!∆; ∅ ⊢ c : B !∆; ∅ ⊢ ∗ : !n 1
Γ, x : A; Q ⊢ b : B !∆, x : A; ∅ ⊢ b : B
(λ1 ) (λ2 )
Γ; Q ⊢ λx.b : A ( B !∆; ∅ ⊢ λx.b : !n+1 (A ( B)
Γ1 , !∆; Q1 ⊢ c : A ( B Γ2 , !∆; Q2 ⊢ a : A
(app)
Γ1 , Γ2 , !∆; Q1 , Q2 ⊢ ca : B
Γ1 , !∆; Q1 ⊢ a : !n A Γ2 , !∆; Q2 ⊢ b : !n B
(⊗i )
Γ1 , Γ2 , !∆; Q1 , Q2 ⊢ ⟨a, b⟩ : !n (A ⊗ B)
Γ1 , !∆; Q1 ⊢ b : !n (B1 ⊗ B2 ) Γ2 , !∆, x : !n B1 , y : !n B2 ; Q2 ⊢ a : A
(⊗e )
Γ1 , Γ2 , !∆; Q1 , Q2 ⊢ let ⟨x, y⟩ = b in a : A
Γ1 , !∆; Q1 ⊢ b : !n 1 Γ2 , !∆; Q2 ⊢ a : A
(∗ )
Γ1 , Γ2 , !∆; Q1 , Q2 ⊢ let ∗ = b in a : A e
(⊤) (⊥)
!∆; ∅ ⊢ True : !n bool !∆; ∅ ⊢ False : !n bool
Γ1 , !∆; Q1 ⊢ b : bool Γ2 , !∆; Q2 ⊢ a1 : A Γ2 , !∆; Q2 ⊢ a2 : A
(if )
Γ1 , Γ2 , !∆; Q1 , Q2 ⊢ if b then a1 else a2 : A
Q1 ⊢ t : T !∆; Q2 ⊢ a : U In(C) = Q1 Out(C) = Q2
(circ)
!∆; ∅ ⊢ (t, C, a) : !n Circ(T, U )

Figure 8.2: Typing rules for Proto-Quipper.

Definition 8.3.1. A circuit constructor consists of a pair of countable sets ⟨Q, S⟩


together with the following maps

• New : Pf (Q) → S,

• In : S → Pf (Q),

• Out : S → Pf (Q),

• Rev : S → S,

• Append : S × S × Bijf (Q) → S × Bijf (Q)

satisfying the following conditions

1. Rev ◦ Rev = 1S ,

2. In ◦ Rev = Out and Out ◦ Rev = In,

3. In ◦ New = Out ◦ New = 1Pf (Q) , and

4. if Append(C, D, b) = (C ′ , b′ ) and dom(b) ⊆ Out(C) and cod(b) = In(D), then

(a) In(C ′ ) = In(C),

(b) dom(b′ ) = Out(D) and cod(b′ ) ⊆ Out(C ′ ) and

(c) Out(C ′ ) = (Out(C) \ dom(b)) ∪· cod(b′ ).

If ⟨Q, S⟩ is a circuit constructor, we call the elements of S circuit states and the
elements of Q wire identifiers. We now explain the intended meaning of a circuit
constructor and its constituents. An element C ∈ S is a quantum circuit; for example,
C could be a two-qubit circuit with inputs {q1 , q2 } and outputs {q3 , q4 }, consisting of
a Hadamard gate on the wire entering at q1 , controlled by the wire entering at q2 .
Each circuit has a finite set of inputs and a finite set of outputs, given by the functions
In and Out. In this example, In(C) = {q1 , q2 } and Out(C) = {q3 , q4 }. For X ⊆ Q, the
circuit New(X) is the identity circuit with inputs and outputs X; for example,
New({q1 , q2 , q3 }) is the identity circuit on the three wires q1 , q2 , and q3 .
The operator Rev reverses a circuit, swapping its inputs and outputs in the process.
When (C ′ , b′ ) = Append(C, D, b), the circuit C ′ is obtained by appending the circuit
D to the end of the circuit C. The function b is used to specify along which wires
to compose C and D while the function b′ updates the wire names post composition.
An illustration of this is given in Figure 8.3.
We note that the axiomatization of Definition 8.3.1 does not mention the concept
of a gate. Indeed, any gate is a circuit, and thus a member of the set S; conversely,
any circuit can be used as a gate. In Proto-Quipper, we simply assume that certain
members of S are available as pre-defined constants, serving as “elementary” gates.
The operation of appending a gate to a circuit is subsumed by the more general
operation of composing circuits.
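In an implementation, a circuit constructor amounts to an abstract interface. The following Haskell record is one way to sketch it; Wire and CircState are hypothetical stand-ins for Q and S, and the equations of Definition 8.3.1 become laws that an instance is expected to satisfy, recorded here only as comments.

import Data.Set (Set)
import Data.Map (Map)

type Wire = Int                      -- wire identifiers (the set Q)
data CircState = CircState           -- abstract circuit states (the set S)

-- A sketch of the circuit-constructor interface of Definition 8.3.1.
data CircuitConstructor = CircuitConstructor
  { new     :: Set Wire -> CircState
  , inputs  :: CircState -> Set Wire
  , outputs :: CircState -> Set Wire
  , rev     :: CircState -> CircState
    -- append c d b returns (c', b'): d is appended to c along the binding b,
    -- and b' renames the outputs of d to outputs of c'.
  , append  :: CircState -> CircState -> Map Wire Wire -> (CircState, Map Wire Wire)
  }
-- Expected laws (Definition 8.3.1): rev . rev = id;
-- inputs . rev = outputs and outputs . rev = inputs;
-- inputs (new x) = outputs (new x) = x; and the stated conditions on append.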
Proto-Quipper’s quantum variables and circuit constants are supposed to be the
syntactic representatives of a circuit constructor’s wire identifiers. This idea is for-
malized in the following definition.
[Figure 8.3 (diagram): the circuit D, with inputs q1′ , q2′ , q3′ and outputs p1 , p2 , p3 , is
appended to the first three output wires of C along the binding b; the binding b′ then
renames the outputs of D to p′1 , p′2 , p′3 , which become outputs of the composite circuit,
while the remaining wires q4 and q5 of C are unaffected.]

Figure 8.3: A representation of Append(C, D, b).

Definition 8.3.2. A circuit constructor ⟨Q, S⟩ is adequate if it can be equipped with


bijections Wire : Q → Q and Name : C → S such that:

In = Wire′ ◦ In ◦ Name and Out = Wire′ ◦ Out ◦ Name

where Wire′ denotes the lifting of Wire−1 from Q to P(Q).

Remark 8.3.3. The existence of the bijections Wire and Name has the following con-
sequences:

• C can be equipped with an involution:

(.)−1 = Name ◦ Rev ◦ Name−1 : C → C

such that In(C −1 ) = Out(C) and Out(C −1 ) = In(C).

• If t and u are quantum data terms such that bind(t, u) = b, then we can define
b = Wire ◦ b ◦ Wire−1 ∈ Bijf (Q).

From now on, we always assume an adequate circuit constructor. Moreover, we work
under the simplifying assumptions that Q = Q, C = S, Wire = 1Q , and Name = 1S .
This notably implies that In = In and Out = Out.

We are now in a position to define Proto-Quipper’s operational semantics.



Definition 8.3.4. Let ⟨Q, S⟩ be an adequate circuit constructor. A closure is a pair


[C, a] where C ∈ S, a is a term and FQ(a) ⊆ Out(C).

Definition 8.3.5. The one-step reduction relation, written →, is defined on closures


by the rules given in Figure 8.4. The reduction relation, written →∗ , is defined to be
the reflexive and transitive closure of →.

The rules are separated into three groups. The first and second groups contain
the congruence rules and the classical rules, respectively. Except for the rule for (t, D, a),
these rules are standard. They describe a call-by-value reduction strategy. The
circuit generating rule for rev (t, C, t′ ) is straightforward. We briefly discuss the
remaining rules.
The rule for boxT (v) is to be understood as follows. To reduce a closure of the
form [C, boxT (v)], start by generating a specimen of type T . Then apply the function
v on the input t in the context of an empty circuit of the appropriate arity. By the
congruence rule for (t, D, a), this computation will continue until a value is reached,
i.e., a term of the form (t, D, t′ ). Note that while this computation is taking place,
the state C is not accessible. When a value of the form (t, D, t′ ) is reached, the
construction of C can resume. Note that it was necessary to know the type T in
order to generate the appropriate specimen. This explains the choice of a Church-
style typing of the box operator.
The rule for (unbox (u, D, u′ ))v will first generate a binding from v and the input
u of D. Then, it will compose C and D along that binding and update the names of
the wire identifiers appearing in u′ according to b′ .
The recursive nature of the reduction rules explains why closures are not required
to satisfy FQ(a) = Out(C). The requirement that FQ(a) ⊆ Out(C) is justified by
the idea that a term should not affect a wire outside of C. But if we also asked for
the opposite inclusion, it would not be possible to define a recursive reduction in a
straightforward way. For example, the reduction of a pair is done component-wise: to
reduce ⟨a, b⟩ one first reduces b. The simplest way to express this in terms of closures
is to carry the whole circuit state along. This implies that if both a and b contain
wire identifiers, then the equality FQ(a) = Out(C) cannot be satisfied.
Unlike in the quantum lambda calculus, Proto-Quipper’s reduction is not proba-
bilistic, in the sense that the right member of any reduction rule is a unique closure.

[C, a] → [C ′ , a′ ] [C, b] → [C ′ , b′ ]
(f un) (arg)
[C, ab] → [C ′ , a′ b] [C, vb] → [C ′ , vb′ ]
[C, b] → [C ′ , b′ ] [C, a] → [C ′ , a′ ]
(right) (lef t)
[C, ⟨a, b⟩] → [C ′ , ⟨a, b′ ⟩] [C, ⟨a, v⟩] → [C ′ , ⟨a′ , v⟩]
[C, a] → [C ′ , a′ ]
(let∗)
[C, let ∗ = a in b] → [C ′ , let ∗ = a′ in b]
[C, a] → [C ′ , a′ ]
(let)
[C, let ⟨x, y⟩ = a in b] → [C ′ , let ⟨x, y⟩ = a′ in b]
[C, a] → [C ′ , a′ ]
(cond)
[C, if a then b else c] → [C ′ , if a′ then b else c]
[D, a] → [D′ , a′ ]
(circ)
[C, (t, D, a)] → [C, (t, D′ , a′ )]

(β)
[C, (λx.a)v] → [C, a[v/x]]
(unit)
[C, let ∗ = ∗ in a] → [C, a]
(pair)
[C, let ⟨x, y⟩ = ⟨v, w⟩ in a] → [C, a[v/x, w/y]]
(if F )
[C, if False then a else b] → [C, b]
(if T )
[C, if True then a else b] → [C, a]

SpecFQ(v) (T ) = t new(FQ(t)) = D
(box)
[C, boxT (v)] → [C, (t, D, vt)]
bind(v, u) = b Append(C, D, b) = (C ′ , b′ ) FQ(u′ ) ⊆ dom(b′ )
(unbox)
[C, (unbox (u, D, u′ ))v] → [C ′ , b′ (u′ )]
(rev)
[C, rev (t, C, t′ )] → [C, (t′ , C −1 , t)]

Figure 8.4: Reduction rules for Proto-Quipper.



The following proposition establishes that Proto-Quipper’s reduction is moreover de-


terministic.

Proposition 8.3.6. If [C, a] is a closure, then at most one reduction rule applies to
it.

Proof. By case distinction on a.

To close this chapter, we illustrate the reduction of Proto-Quipper with an exam-


ple. Assume that we are given the basic circuit generating functions H, S, and CNOT
of Examples 8.2.3 and let F be the following term

F = λz.(let ⟨x, y⟩ = z in CNOT⟨H x, S y⟩)

Since F can be given the type qubit ⊗ qubit ( qubit ⊗ qubit, we can use
boxqubit⊗qubit to turn F into a Proto-Quipper circuit. Now consider the closure

[−, boxqubit⊗qubit F ] (8.1)

where − is any circuit state. The (box) rule applies, so that a specimen of type
qubit ⊗ qubit is created, say ⟨q1 , q2 ⟩, and (8.1) reduces to

[−, (⟨q1 , q2 ⟩, C, F ⟨q1 , q2 ⟩)]

where C = new({q1 , q2 }) is the empty circuit on {q1 , q2 }. Since F ⟨q1 , q2 ⟩ is not a


value, the (circ) rule applies. This means that we consider the closure

[C, F ⟨q1 , q2 ⟩].

We now repeatedly consider reducts of this closure until a value is reached. For clarity,
we describe the circuit states explicitly. In two classical reductions we reach the closure

[C, CNOT⟨H q1 , S q2 ⟩],

where C is still the empty circuit on q1 and q2 . Following Proto-Quipper's reduction
strategy, the right argument is reduced first, yielding

[C1 , CNOT⟨H q1 , q2 ⟩],

where C1 consists of a single S gate on the wire q2 . Here we assumed for simplicity that
the output wire of the S gate was not renamed. Since q2 is now a value, we next reduce
H q1 . This yields

[C2 , CNOT⟨q1 , q2 ⟩],

where C2 consists of an H gate on q1 and an S gate on q2 . Finally, the CNOT gate is
appended, with its control on q2 , giving

[D, ⟨q1 , q2 ⟩], (8.2)

where D consists of an H gate on q1 and an S gate on q2 , followed by the controlled-not
gate. Since ⟨q1 , q2 ⟩ is a value, the execution is finished. The final circuit is now returned
in the form of a term of the language, e.g., as

[−, (⟨q1 , q2 ⟩, D, ⟨q1 , q2 ⟩)]

where D is the constant representing the circuit constructed in (8.2).
Chapter 9

Type-safety of Proto-Quipper

In this chapter, we establish that Proto-Quipper is a type safe language. As discussed


in Section 4.2.4, type safety is established by proving that the language enjoys the
subject reduction and progress properties.

9.1 Properties of the type system

Before proving the subject reduction and progress, we record some properties of the
type system, including the technical but important Substitution Lemma. Note that
the typing rules enforce a strict linearity on variables and quantum variables. In
particular, if a quantum variable appears in the quantum context of a valid typing
judgement for a term a, then it must belong to the free quantum variables of a.

Lemma 9.1.1.

1. If Γ; Q ⊢ a : A is valid, then Q = FQ(a).

/ FV(a), then B = !B ′ and Γ; Q ⊢ a : A is


2. If Γ, x : B; Q ⊢ a : A is valid, and x ∈
valid.

3. If Γ; Q ⊢ a : A is valid, then Γ, !∆; Q ⊢ a : A is valid.

4. If Γ; Q ⊢ a : A is valid, ∆ <: Γ and A <: B, then ∆; Q ⊢ a : B is valid.

Proof. By induction on the corresponding typing derivation.

Lemma 9.1.2. If T is a quantum data type and X is a finite subset of Q, then


FQ(SpecX (T )) ⊢ SpecX (T ) : T is valid.

Proof. We prove the Lemma by induction on T .

• If T = 1, then SpecX (T ) = ∗ and we can use the (∗i ) rule.


• If T = qubit, then SpecX (T ) = q for some quantum variable q and we can use
the (axq ) rule.

• If T = T1 ⊗ T2 , then SpecX (T ) = ⟨t1 , t2 ⟩ where t1 = SpecX (T1 ) and u =


SpecX∪FQ(t1 ) (T2 ). By the induction hypothesis, both FQ(t1 ) ⊢ t1 : T1 and
FQ(t2 ) ⊢ t2 : T2 are valid typing judgements. We can therefore conclude by
applying the (⊗i ) rule.

Lemma 9.1.3. If Γ; Q ⊢ a : A is valid and b is a binding such that F Q(a) ⊆ dom(b)


then Γ; b(Q) ⊢ b(a) : A is valid.

Proof. By induction on the typing derivation of Γ; Q ⊢ a : A.

Lemma 9.1.4. If v ∈ Val and Γ; Q ⊢ v : !A is valid, then Q = ∅ and Γ = !∆ for


some ∆.

Proof. By induction on the typing derivation of Γ; Q ⊢ v : !A. In the case of (axc ),


use Remark 8.2.13, item 5.

Lemma 9.1.5. If a term a is not a value then it is of one of the following forms

• (t, C, a′ ) with a′ ∉ Val,

• ⟨a1 , a2 ⟩ with a1 ∉ Val or a2 ∉ Val,

• if a1 then a2 else a3 ,

• let ∗ = a1 in a2 ,

• let ⟨x, y⟩ = a1 in a2 , or

• a1 a2 with a1 ≠ unbox or a2 ∉ Val.

Proof. By definition of terms and values.

Lemma 9.1.6. A well-typed value v is either a variable, a quantum variable, a con-


stant or one of the following case occurs

• if it is of type !n Circ(T, U ), it is of the form (t, C, u) with t and u values,

• if it is of type !n bool it is either True or False,



• if it is of type !n (A ⊗ B), it is of the form ⟨w, w′ ⟩, with w and w′ values and


FQ(w) ∩ FQ(w′ ) = ∅,

• if it is of type !n 1, it is precisely the term ∗, or

• if it is of type !n (A ( B), it is a lambda abstraction, a constant, or of the form


unbox (t, C, u).

Proof. By induction on the typing derivation of v.

Corollary 9.1.7. If T is a quantum data type and v is a well-typed value of type T


then v is a quantum data term.

Lemma 9.1.8. If T is a quantum data type and v1 , v2 are well-typed values of type
T , then b = bind(v1 , v2 ) is a well-defined binding, dom(b) = FQ(v1 ), and cod(b) =
FQ(v2 ).

Proof. By Corollary 9.1.7, we know that v1 and v2 are quantum data terms so that
the statement of the lemma makes sense. The proof then proceeds by induction on
T , using Lemma 9.1.6.

Lemma 9.1.9 (Substitution). If v ∈ Val and both Γ′ , !∆; Q′ ⊢ v : B and Γ, !∆, x :


B; Q ⊢ a : A are valid typing judgements, then Γ, Γ′ , !∆; Q, Q′ ⊢ a[v/x] : A is also
valid.

Proof. Let π1 and π2 be the typing derivations of Γ, !∆, x : B; Q ⊢ a : A and


Γ′ , !∆; Q′ ⊢ v : B respectively. We prove the Lemma by induction on π1 .

• If the last rule of π1 is (axc ) and a = x, then π1 is

B <: A (axc )
!∆, x : B; ∅ ⊢ x : A

with Γ = Q = ∅. Then a[v/x] = v and we can conclude by applying Lemma 9.1.1.4


to π2 .

• If the last rule of π1 is (axc ) and a = y ̸= x, then π1 is

A′ <: A (axc )
!∆, x : !B ′ , y : A′ ; ∅ ⊢ y : A

with B = !B ′ , Q = ∅ and Γ = {y : A′ } or Γ = ∅ depending on whether or not


A′ is duplicable. Therefore v is a value of type !B ′ and by Lemma 9.1.4, we
know that Γ′ = Q′ = ∅. Since a[v/x] = y and x ∉ FV(y), we can conclude by
applying Lemma 9.1.1.2 to π1 .

• If the last rule of π1 is one of (axq ), (cst), (∗i ), (⊤) and (⊥), and a is the
corresponding constant, then x ∉ FV(a) and x must be declared of some type
!B ′ . We can therefore reason as in the previous case.

• If the last rule of π1 is (λ1 ) and a = λy.b, then π1 is


..
.
Γ, !∆, x : B, y : A1 ; Q ⊢ b : A2
(λ )
Γ, !∆, x : B; Q ⊢ λy.b : A1 ( A2 1

with A = A1 ( A2 . By the induction hypothesis, Γ, Γ′ , !∆, y : A1 ; Q, Q′ ⊢


b[v/x] : A2 is valid and we can conclude by applying (λ1 ).

• If the last rule of π1 is (λ2 ) and a = λy.b, then π1 is


..
.
!∆, x : !B ′ , y : A1 ; ∅ ⊢ b : A2
(λ2 )
!∆, x : !B ′ ; ∅ ⊢ λy.b : !n+1 (A1 ( A2 )

with A = !n+1 (A1 ( A2 ) and B = !B ′ . Hence v is a value of type !B ′ and by


Lemma 9.1.4, we know that Γ′ = Q′ = ∅. The induction hypothesis therefore
implies that !∆, y : A1 ; ∅ ⊢ b[v/x] : A2 is valid and we can conclude by applying
(λ2 ).

• If the last rule of π1 is (app), and a = ca′ , then π1 can be of one of three forms
depending on B. If B is duplicable, then π1 is
.. ..
. .
Γ1 , x : !B ′ , !∆; Q1 ⊢ c : A′ ( A Γ2 , x : !B ′ , !∆; Q2 ⊢ a′ : A′
(app)
Γ1 , Γ2 , !∆, x : !B ′ ; Q1 , Q2 ⊢ ca′ : A

with B = !B ′ . Using Lemma 9.1.4 again, we know that Γ′ = Q′ = ∅. The


induction hypothesis therefore implies that Γ1 , !∆; Q1 ⊢ c[v/x] : A′ ( A and
Γ2 , !∆; Q2 ⊢ a′ [v/x] : A′ are valid and we can conclude by applying (app). If,

instead, B is non-duplicable, then the declaration x : B can only appear in one


branch of the derivation. This means that π1 is either
.. ..
. .
Γ1 , x : B, !∆; Q1 ⊢ c : A′ ( A Γ2 , !∆; Q2 ⊢ a′ : A′
(app)
Γ1 , Γ2 , !∆, x : B; Q1 , Q2 ⊢ ca′ : A

or
.. ..
. .
Γ1 , !∆; Q1 ⊢ c : A′ ( A Γ2 , x : B, !∆; Q2 ⊢ a′ : A′
(app).
Γ1 , Γ2 , !∆, x : B; Q1 , Q2 ⊢ ca′ : A

In the first case, the induction hypothesis implies that Γ1 , Γ′ !∆; Q1 , Q′ ⊢ c[v/x] :
A′ ( A is valid and we can conclude by (app). The second case is treated
analogously.

• If the last rule of π1 is one of (⊗i ), (⊗e ), (∗e ) and (if ), and a is the corresponding
term, then we can reason as above by considering in turn the case where B is
duplicable and the case where B is non-duplicable.

• If the last rule of π1 is (circ), and a = (t, C, a′ ), then π1 is


.. ..
. .
Q1 ⊢ t : T !∆, x : !B ′ ; Q2 ⊢ a′ : U In(C) = Q1 Out(C) = Q2
(circ)
!∆, x : !B ′ ; ∅ ⊢ (t, C, a′ ) : !n Circ(T, U )

with A = !n Circ(T, U ) and B = !B ′ for some types T , U and B ′ . Using


Lemma 9.1.4 again, we know that Γ′ = Q′ = ∅. The induction hypothesis
therefore implies that !∆; Q2 ⊢ a′ [v/x] : U is valid and we can conclude by
applying (circ).

9.2 Subject reduction

We now prove that Proto-Quipper enjoys the subject reduction property. Since the
reduction relation is defined on closures but the typing rules apply to terms, we start
by extending the notions of typing judgement and validity to closures.

Definition 9.2.1. A typed closure is an expression of the form:

Γ; Q ⊢ [C, a] : A, (Q′ |Q′′ ).



It is valid if In(C) = Q′ and Out(C) = Q, Q′′ , and Γ; Q ⊢ a : A is a valid typing


judgement.

Lemma 9.2.2. If [C, a] → [C ′ , a′ ] then In(C) = In(C ′ ).

Proof. By induction on the derivation of [C, a] → [C ′ , a′ ]. In all but the (unbox) case,
the result follows either from the induction hypothesis or from the fact that C = C ′ .
In the (unbox) case, use Definition 8.3.1.4a.

Theorem 9.2.3 (Subject reduction). If Γ; FQ(a) ⊢ [C, a] : A, (Q′ |Q′′ ) is a valid


typed closure and [C, a] → [C ′ , a′ ], then Γ; FQ(a′ ) ⊢ [C ′ , a′ ] : A, (Q′ |Q′′ ) is a valid
typed closure.

Proof. We prove the theorem by induction on the derivation of the reduction [C, a] →
[C ′ , a′ ]. In each case, we start by reconstructing the unique typing derivation π of
Γ; FQ(a) ⊢ a : A and we use it to prove that Γ; FQ(a′ ) ⊢ [C ′ , a′ ] : A, (Q′ |Q′′ ) is valid.
By Lemma 9.2.2 we never need to verify that In(C ′ ) = Q′ so that we only need to
show:

• Out(C ′ ) = FQ(a′ ), Q′′ and

• Γ; FQ(a′ ) ⊢ a′ : A is valid.

Throughout the proof, we write IH(π) to denote the proof obtained by applying the
induction hypothesis to π.

Congruence rules: These rules are treated uniformly. We illustrate the (f un) and
(circ) cases.

• (f un): the reduction rule is

[C, c] → [C ′ , c′ ]
[C, cb] → [C ′ , c′ b]

with a = cb and a′ = c′ b. The typing derivation π is therefore


.. ..
. π1 . π2
Γ1 , !∆; FQ(c) ⊢ c : B ( A Γ2 , !∆; FQ(b) ⊢ b : B
Γ1 , Γ2 , !∆; FQ(c), FQ(b) ⊢ cb : A

and Γ1 , Γ2 , !∆; FQ(c), FQ(b) ⊢ [C, cb] : A, (Q′ |Q′′ ) is valid. It follows that

Γ1 , !∆; FQ(c) ⊢ [C, c] : B ( A, (Q′ | FQ(b), Q′′ )

is valid and, by the induction hypothesis, this implies that Γ1 , !∆; FQ(c′ ) ⊢
[C ′ , c′ ] : B ( A, (Q′ | FQ(b), Q′′ ) is also valid. In particular, it follows that
Out(C ′ ) = FQ(c′ ), FQ(b), Q′′ . This, together with the following typing
derivation,
.. ..
. IH(π1 ) . π2
Γ1 , !∆; FQ(c′ ) ⊢ c′ : B ( A Γ2 , !∆; FQ(b) ⊢ b : B
Γ1 , Γ2 , !∆; FQ(c′ ), FQ(b) ⊢ c′ b : A

shows that Γ1 , Γ2 , !∆; FQ(c′ ), FQ(b) ⊢ [C ′ , c′ b] : A, (Q′ |Q′′ ) is valid.

• (circ): the reduction rule is

[D, b] → [D′ , b′ ]
(circ)
[C, (t, D, b)] → [C, (t, D′ , b′ )]

with a = (t, D, b) and a′ = (t, D′ , b′ ). The typing derivation π is therefore

.. ..
. π1 . π2 Out(D) = FQ(b)
FQ(t) ⊢ t : T !∆; FQ(b) ⊢ b : U In(D) = FQ(t)
!∆; ∅ ⊢ (t, D, b) : !n Circ(T, U )

and !∆; ∅ ⊢ [C, (t, D, b)] : !n Circ(T, U ), (Q′ |Q′′ ) is valid. Disregarding π1 ,
it follows from the assumptions in the above rule that !∆; FQ(b) ⊢ [D, b] :
U, (FQ(t)|∅) is valid and, by the induction hypothesis, this implies that
!∆; FQ(b′ ) ⊢ [D′ , b′ ] : U, (FQ(t)|∅) is also valid. This, together with the
following typing derivation,

.. ..
. π1 . IH(π2 ) Out(D′ ) = FQ(b′ )
FQ(t) ⊢ t : T !∆; FQ(b′ ) ⊢ b′ : U In(D′ ) = FQ(t) .
!∆; ∅ ⊢ (t, D′ , b′ ) : !n Circ(T, U )

shows that !∆; ∅ ⊢ [C, (t, D′ , b′ )] : !n Circ(T, U ), (Q′ |Q′′ ) is valid.

Classical rules: These rules are also treated uniformly; we illustrate the (β) case.

• (β): the reduction rule is

[C, (λx.b)v] → [C, b[v/x]]

with a = (λx.b)v and a′ = b[v/x]. The typing derivation π is therefore


..
. π1
..
Γ1 , !∆, x : B; FQ(b) ⊢ b : A . π2
Γ1 , !∆; FQ(b) ⊢ λx.b : B ( A Γ2 , !∆; FQ(v) ⊢ v : B
Γ1 , Γ2 , !∆; FQ(b), FQ(v) ⊢ (λx.b)v : A

and Γ1 , Γ2 , !∆; FQ(b), FQ(v) ⊢ [C, (λx.b)v] : A, (Q′ |Q′′ ) is valid. We then
know, by Lemma 9.1.9, that Γ1 , Γ2 , !∆; FQ(b), FQ(v) ⊢ b[v/x] : A is a valid
typing judgement which implies that

Γ1 , Γ2 , !∆; FQ(b), FQ(v) ⊢ [C, b[v/x]] : A, (Q′ |Q′′ )

is a valid typed closure.

Circuit generating rules: These rules represent the most interesting cases. We
treat them individually.

• (box): the reduction rule is


Spec(T ) = t New(FQ(t)) = D
[C, boxT (v)] → [C, (t, D, vt)]

with a = boxT (v) and a′ = (t, D, vt). Since v is a value, we know by


Lemma 9.1.4 that the typing derivation π is
..
. π1
!∆; ∅ ⊢ boxT : !(T ( U ) ( !n Circ(T, U ) !∆; ∅ ⊢ v : !(T ( U )
!∆; ∅ ⊢ boxT (v) : !n Circ(T, U )

and !∆; ∅ ⊢ [C, boxT (v)] : !n Circ(T, U ), (Q′ |Q′′ ) is valid. There exists
a typing derivation π2 of FQ(t) ⊢ t : T , by Lemma 9.1.2. Applying
Lemma 9.1.1.4 to π1 we get a derivation π1′ of !∆; ∅ ⊢ v : T ( U . We can
therefore construct the following derivation τ :
.. ..
. π1′ . π2′
!∆; ∅ ⊢ v : T ( U !∆; FQ(t) ⊢ t : T
!∆; FQ(t) ⊢ vt : U

where π2′ is obtained from π2 by Lemma 9.1.1.3. Moreover, since FQ(vt) =


FQ(t) = Out(D) = In(D), we have:
.. ..
. π2 .τ Out(D) = FQ(vt)
FQ(t) ⊢ t : T !∆; FQ(vt) ⊢ vt : U In(D) = FQ(t) .
!∆; ∅ ⊢ (t, D, vt) : !n Circ(T, U )

Hence !∆; ∅ ⊢ [C, (t, D, vt)] : !n Circ(T, U ), (Q′ |Q′′ ) is a valid typed closure.
• (unbox): the reduction rule is
bind(v, u) = b Append(C, D, b) = (C ′ , b′ ) FQ(u′ ) ⊆ dom(b′ )
[C, (unbox (u, D, u′ ))v] → [C ′ , b′ (u′ )]
with a = (unbox (u, D, u′ ))v and a′ = b′ (u′ ). To reconstruct the typing
derivation π, first note that we have the following derivation π1 of !∆; ∅ ⊢
unbox (u, D, u′ ) : T ( U
.. ..
. π1¹ . π1²
FQ(u) ⊢ u : T !∆; FQ(u′ ) ⊢ u′ : U
!∆; ∅ ⊢ unbox : Circ(T, U ) ( (T ( U ) !∆; ∅ ⊢ (u, D, u′ ) : Circ(T, U )

!∆; ∅ ⊢ unbox (u, D, u′ ) : T ( U

with In(D) = FQ(u), Out(D) = FQ(u′ ). We can then use π1 to rebuild π


as follows:
.. ..
. π1 . π2
!∆; ∅ ⊢ unbox (u, D, u′ ) : T ( U !∆; FQ(v) ⊢ v : T
!∆; FQ(v) ⊢ (unbox (u, D, u′ ))v : U

and the typed closure

!∆; FQ(v) ⊢ [C, (unbox (u, D, u′ ))v] : U, (Q′ |Q′′ )

is valid. In the conclusion of π2 , all the term variables are declared of


a duplicable type. This follows from Corollary 9.1.7 and Lemma 9.1.1.2.
By assumption, we know that FQ(u′ ) ⊆ dom(b′ ). We can therefore apply
Lemma 9.1.3 to π1² to get a typing derivation τ of

!∆; FQ(b′ (u′ )) ⊢ b′ (u′ ) : U.

Now by Definition 8.3.1.4c we have:


Out(C ′ ) = b′ (Out(D)) ∪· (Out(C) \ b−1 (In(D)))
= b′ (FQ(u′ )) ∪· ((Q′′ ∪· FQ(v)) \ b−1 (FQ(u)))
= FQ(b′ (u′ )) ∪· ((Q′′ ∪· FQ(v)) \ FQ(v))
= FQ(b′ (u′ )) ∪· Q′′ .

Hence !∆; FQ(b′ (u′ )) ⊢ [C ′ , b′ (u′ )] : U, (Q′ |Q′′ ) is valid.


• (rev): the reduction rule is
(rev)
[C, rev (t, D, t′ )] → [C, (t′ , D−1 , t)]

with a = rev (t, D, t′ ) and a′ = (t′ , D−1 , t). The typing derivation π is
therefore
.. ..
. π1 . π2
FQ(t) ⊢ t : T !∆; FQ(t′ ) ⊢ t′ : U
!∆; ∅ ⊢ rev : Circ(T, U ) ( !n Circ(U, T ) !∆; ∅ ⊢ (t, D, t′ ) : Circ(T, U )
!∆; ∅ ⊢ rev (t, D, t′ ) : !n Circ(U, T )

with In(D) = FQ(t), Out(D) = FQ(t′ ) and

!∆; ∅ ⊢ [C, rev (t, D, t′ )] : !n Circ(U, T ), (Q′ |Q′′ )

is valid. Now note that since t′ is a quantum data term, it contains no


term variables. Applying Lemma 9.1.1.2 to π2 repeatedly we therefore get a
derivation π2′ of FQ(t′ ) ⊢ t′ : U . Moreover, by applying Lemma 9.1.1.3 to π1
we get a typing derivation π1′ of !∆; FQ(t) ⊢ t : T . Since, by Remark 8.3.3,
we have Out(D−1 ) = In(D) = FQ(t) and In(D−1 ) = Out(D) = FQ(t′ ), we can
construct the following typing derivation:
.. ..
. π2′ . π1′ Out(D−1 ) = FQ(t)
FQ(t′ ) ⊢ t′ : U !∆; FQ(t) ⊢ t : T In(D−1 ) = FQ(t′ )
!∆; ∅ ⊢ (t′ , D−1 , t) : !n Circ(U, T )

Hence !∆; ∅ ⊢ [C, (t′ , D−1 , t)] : !n Circ(U, T ), (Q′ |Q′′ ) is valid.

Corollary 9.2.4. If Γ; FQ(a) ⊢ [C, a] : A, (Q′ |Q′′ ) is a valid typed closure and
[C, a] →∗ [C ′ , a′ ], then Γ; FQ(a′ ) ⊢ [C ′ , a′ ] : A, (Q′ |Q′′ ) is also a valid typed closure.

Proof. By induction on the length of the reduction sequence. The base case is pro-
vided by Theorem 9.2.3.

The above formulation of Subject Reduction explains why a typed closure contains
information about the input and output wires of the circuit state. Indeed, Subject
Reduction now guarantees (1) that the input wires of a circuit remain unchanged
through reduction and (2) that a term can only affect wires whose identifiers are
among its quantum variables.

9.3 Progress

We now prove that Proto-Quipper enjoys the progress property.

Theorem 9.3.1 (Progress). If FQ(a) ⊢ [C, a] : A, (Q′ |Q′′ ) is a valid typed closure
then either a ∈ Val or there exists a closure [C ′ , a′ ] such that [C, a] → [C ′ , a′ ].

First, note that the Progress property is stated for a typed closure whose typing
context is empty. This is because the property is not expected to hold if we allow
for a non-empty typing context. Indeed, it is easy to see that there are well-typed,
non-closed closures such as [C, xy], which are neither values nor reducible. We now
prove the theorem.

Proof. We prove the theorem by induction on the typing derivation π of FQ(a) ⊢


a : A. If a is a value then there is nothing to prove. If a is not a value, then
by Lemma 9.1.5 there are 6 cases to consider. In each case we show that [C, a] is
reducible in the sense that there exists a closure [C, b] such that [C, a] → [C, b]

1. If a = (t, D, a′ ) with a′ ∉ Val, then the typing derivation π is:

.. ..
. π1 . π2 Out(D) = FQ(a′ )
FQ(t) ⊢ t : T FQ(a′ ) ⊢ a′ : U In(D) = FQ(t) .

∅ ⊢ (t, D, a′ ) : Circ(T, U )

The typed closure

FQ(a′ ) ⊢ [D, a′ ] : U, (FQ(t)|∅)

is therefore valid. Since a′ is not a value, the induction hypothesis implies that
there exists a′′ such that [D, a′ ] → [D′ , a′′ ] and [C, (t, D, a′ )] therefore reduces
to [C, (t, D′ , a′′ )] by the (circ) reduction rule.

2. If a = ⟨a1 , a2 ⟩ with a1 ∉ Val or a2 ∉ Val, then the typing derivation π is:

.. ..
. π1 . π2
FQ(a1 ) ⊢ a1 : !n A1 FQ(a2 ) ⊢ a2 : !n A2 .

FQ(a1 ), FQ(a2 ) ⊢ ⟨a1 , a2 ⟩ : !n (A1 ⊗ A2 )



The typed closures


FQ(a1 ) ⊢ [C, a1 ] : !n A1 , (Q′ | FQ(a2 ), Q′′ )
FQ(a2 ) ⊢ [C, a2 ] : !n A2 , (Q′ | FQ(a1 ), Q′′ )
are therefore both valid. Now if a2 ∉ Val, then by the induction hypothesis
[C, a2 ] → [C ′ , a′2 ]. Hence [C, ⟨a1 , a2 ⟩] reduces to [C ′ , ⟨a1 , a′2 ⟩] by the (right)
reduction rule. If on the other hand a2 ∈ Val, then it must be the case that
a1 ∉ Val and we can conclude by reasoning analogously that [C, ⟨a1 , a2 ⟩] reduces
to some [C ′ , ⟨a′1 , a2 ⟩] by the (lef t) reduction rule.

3. If a = if a1 then a2 else a3 , then the typing derivation π is:


.. .. ..
. π1 . π2 . π3
FQ(a1 ) ⊢ a1 : bool Q ⊢ a2 : A Q ⊢ a3 : A .
FQ(a1 ), Q ⊢ if a1 then a2 else a3 : A

The typed closure

FQ(a1 ) ⊢ [C, a1 ] : bool, (Q′ | FQ(a2 ), FQ(a3 ), Q′′ )

is therefore valid. Now if a1 ∉ Val, then by the induction hypothesis [C, a1 ] →
[C ′ , a′1 ] and thus [C, if a1 then a2 else a3 ] reduces to [C ′ , if a′1 then a2 else a3 ]
by the (cond) reduction rule. If on the other hand a1 ∈ Val, then either a1 =
True or a1 = False, by Lemma 9.1.6. Thus [C, if a1 then a2 else a3 ] reduces
either to [C, a2 ] by the (if T ) reduction rule, or to [C, a3 ] by the (if F ) reduction
rule.

4. If a = (let ∗ = a1 in a2 ), then we can reason as above to show that if a1 is not


a value, then the (let∗) congruence rule applies, and that if a1 is a value then
Lemma 9.1.6 guarantees that the (∗) rule applies.

5. If a = (let ⟨x, y⟩ = a1 in a2 ), then we can reason as above to show that if a1


is not a value, then the (let) congruence rule applies and that if a1 is a value
then Lemma 9.1.6 guarantees that the (pair) rule applies.

6. If a = a1 a2 then the typing derivation π is:


.. ..
. π1 . π2
FQ(a1 ) ⊢ a1 : B ( A FQ(a2 ) ⊢ a2 : B .
FQ(a1 ), FQ(a2 ) ⊢ a1 a2 : A

The typed closures

FQ(a1 ) ⊢ [C, a1 ] : B ( A, (Q′ | FQ(a2 ), Q′′ ) (9.1)


FQ(a2 ) ⊢ [C, a2 ] : B, (Q′ | FQ(a1 ), Q′′ ) (9.2)

are therefore valid. There are three cases to treat.

• If a1 ∉ Val, then [C, a1 a2 ] → [C ′ , a′1 a2 ] by the induction hypothesis and
the (f un) rule.

• If a1 ∈ Val and a2 ∉ Val, then [C, a1 a2 ] → [C ′ , a1 a′2 ] by the induction
hypothesis and the (arg) rule.
• If a1 , a2 ∈ Val then by Lemma 9.1.6, a1 is either an abstraction, a constant,
or of the form unbox (t, C, u). If a1 is a lambda abstraction, then [C, a1 a2 ]
reduces by the (β) rule. If a1 is a constant, then it cannot be unbox,
since a1 a2 would then be a value. If a1 = rev, then a2 is a value of type
Circ(T, U ). Hence a2 is of the form (t, C, u) by Lemma 9.1.6, so that
[C, a1 a2 ] reduces by the (rev) rule. If a1 = boxT , then [C, a1 a2 ] reduces by
the (box) rule.
It remains to treat the case a1 = unbox (u, D, t). For the (unbox) rule to
apply, we need to show that bind(a2 , u) is well-defined, and that FQ(t) ⊆
dom(b′ ), where Append(C, D, b) = (C ′ , b′ ). The typing derivation π1 is the
following:
.. ..
. π1¹ . π1²
FQ(u) ⊢ u : T !∆; FQ(t) ⊢ t : U
!∆; ∅ ⊢ unbox : Circ(T, U ) ( (T ( U ) !∆; ∅ ⊢ (u, D, t) : Circ(T, U )
!∆; ∅ ⊢ unbox (u, D, t) : T ( U

with In(D) = FQ(u), Out(D) = FQ(t), A = U and B = T for some quan-


tum data types T, U . It follows that a2 and u are two values of type T , so
that, by Lemma 9.1.8, b = bind(a2 , u) is defined with dom(b) = FQ(a2 ) and
cod(b) = FQ(u). Moreover, FQ(a2 ) ⊆ Out(C) by the validity of the typed
closure (9.2), and FQ(u) = In(D) as noted above. Therefore, dom(b) ⊆
Out(C) and cod(b) = In(D) hold, as required by Definition 8.3.1.4b. By
Definition 8.3.1.4b, we conclude that dom(b′ ) = Out(D) = FQ(t), so that
the (unbox) rule in fact applies.
Chapter 10

Conclusion

In this thesis, we applied tools from algebraic number theory and mathematical logic
to problems in the theory of quantum computation. We described algorithms to
solve the problem of approximate synthesis of special unitaries over the Clifford+V
and Clifford+T gate sets. We also defined a typed lambda calculus for quantum
computation called Proto-Quipper which serves as a mathematical foundation for the
Quipper quantum programming language. In conclusion, we briefly describe some
avenues for future research.

10.1 Approximate synthesis

The synthesis methods described in chapters 6 and 7 belong to a very recent fam-
ily of number-theoretic algorithms. Many generalizations of these methods can be
considered.

• The algorithms described in chapters 6 and 7 are only optimal for z-rotations.
While Euler angle decompositions can be used to extend these methods to
arbitrary special unitaries, optimality is lost in the process. A first potential
generalization of the methods of chapters 6 and 7 is to define optimal number-
theoretic synthesis algorithms for arbitrary special unitaries (a sketch of the
Euler angle reduction appears after this list).

• A second restriction of the decomposition methods presented in chapters 6 and


7 is that they are only defined for specific gate sets, namely the Clifford+V
and Clifford+T gate sets. Another future generalization of this work is to
extend the number-theoretic synthesis methods to different gate sets. This line
of enquiry has already been pursued in the recent literature with encouraging
results. In particular, it is known that asymptotically optimal number-theoretic
decomposition methods can be defined for certain gate sets based on anyonic
braidings (see [39] and [6]). Further, exact synthesis methods have recently been


devised for a relatively general family of gate sets ([19] and [43]). While these
exact synthesis methods have not yet been extended to approximate synthesis
algorithms, we expect that, at least in some cases, this extension should carry
through with relative ease.

• A further possible generalization is to consider unitary groups in higher dimen-


sions. Higher-dimensional versions of the Clifford gates have previously been
studied in the literature [25]. Moreover, higher-dimensional analogues of the T
gate were recently introduced [33]. Together, these define a higher-dimensional
Clifford+T gate set which stands as a natural candidate for an adaptation of
the methods of chapters 6 and 7.
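
The Euler angle reduction mentioned in the first item can be made concrete with the following Haskell sketch, which recovers angles (α, β, γ) such that U = Rz(α)Ry(β)Rz(γ) for a matrix U assumed to be in SU(2); each of the three axial rotations can then be synthesized separately, which is why the optimality of the single-rotation algorithms is lost. This is only a minimal numerical illustration under the stated SU(2) assumption, not the synthesis algorithm of chapters 6 and 7; the names rz, ry and eulerZYZ are ad hoc, and a z-x-z variant is analogous.

import Data.Complex

type C    = Complex Double
type Mat2 = ((C, C), (C, C))   -- row-major 2x2 matrix ((u00,u01),(u10,u11))

-- z- and y-rotations (standard conventions, determinant 1).
rz, ry :: Double -> Mat2
rz t = ((cis (-t / 2), 0), (0, cis (t / 2)))
ry t = ((cos (t / 2) :+ 0, negate (sin (t / 2)) :+ 0),
        (sin (t / 2) :+ 0,  cos (t / 2) :+ 0))

mul :: Mat2 -> Mat2 -> Mat2
mul ((a, b), (c, d)) ((e, f), (g, h)) =
  ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

-- Euler angles (alpha, beta, gamma) with u = rz alpha `mul` ry beta `mul` rz gamma,
-- assuming u is (numerically) in SU(2).  The degenerate cases beta = 0 and
-- beta = pi are resolved by setting gamma = 0.
eulerZYZ :: Mat2 -> (Double, Double, Double)
eulerZYZ ((u00, _), (u10, _))
  | magnitude u10 < eps = (-2 * phase u00, 0, 0)
  | magnitude u00 < eps = (2 * phase u10, pi, 0)
  | otherwise           = (alpha, beta, gamma)
  where
    eps   = 1.0e-12
    beta  = 2 * atan2 (magnitude u10) (magnitude u00)
    alpha = phase u10 - phase u00
    gamma = negate (phase u10) - phase u00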

We note that the decomposition algorithm of Fowler [20] as well as the Solovay-Kitaev
algorithm [14] are both very general. Indeed, both algorithms allow the synthesis
of arbitrary special unitaries over any gate set and in any dimension. Since the
algorithms presented in chapters 6 and 7 rely on specific properties of the rings of
algebraic integers associated with the Clifford+V and Clifford+T gate set, there is
no reason for these methods to generalize to arbitrary gate sets. However, it might
be possible to identify a general class of gate sets to which these methods apply.
Another interesting avenue of future research lies in a modification of the state-
ment of the synthesis problem itself, by allowing a broader notion of circuit. As a first
such generalization, one can introduce ancillary qubits in the approximating circuit.
Suppose a special unitary U and a precision ε > 0 are given. Instead of searching for
W ∈ U(2) such that ∥U − W ∥ < ε, one can look for W ∈ U(2^(n+1)) such that for any
state |φ⟩ we have

∥(U |φ⟩)|0 . . . 0⟩ − W (|φ⟩|0 . . . 0⟩)∥ < ε.

In other words, unitaries acting on more than one qubit can be considered, provided
that they return the additional qubits nearly unchanged. The advantage of such a
generalized notion of circuit is that it opens the door to a certain form of paralleliza-
tion. In particular, even if the number of non-Clifford gates in the approximating
circuit remains unchanged, applying them in parallel, rather than sequentially, may
represent a gain.
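
For a concrete candidate W, the condition above can be evaluated exactly: the map |φ⟩ ↦ |φ⟩|0 . . . 0⟩ is an isometry V, so the supremum over states of the left-hand side is the largest singular value of the two-column matrix ((U ⊗ I) − W)V. The Haskell sketch below computes this value. It is a self-contained illustration using dense list-of-lists matrices, with ad hoc names such as ancillaError, and is not part of the synthesis software accompanying this thesis; it assumes k ancillas (n in the text above), with U of size 2 x 2 and W of size 2^(k+1) x 2^(k+1).

import Data.Complex
import Data.List (transpose)

type C   = Complex Double
type Mat = [[C]]               -- dense complex matrices, list of rows

kron :: Mat -> Mat -> Mat
kron a b = [ [ x * y | x <- ra, y <- rb ] | ra <- a, rb <- b ]

matMul :: Mat -> Mat -> Mat
matMul a b = [ [ sum (zipWith (*) ra cb) | cb <- transpose b ] | ra <- a ]

matSub :: Mat -> Mat -> Mat
matSub = zipWith (zipWith (-))

dagger :: Mat -> Mat
dagger = map (map conjugate) . transpose

identity :: Int -> Mat
identity n = [ [ if i == j then 1 else 0 | j <- [1 .. n] ] | i <- [1 .. n] ]

-- The isometry V : |phi> -> |phi>|0...0> with k ancillas, as a 2^(k+1) x 2 matrix.
ancillaIso :: Int -> Mat
ancillaIso k = kron (identity 2) [ [ if i == 0 then 1 else 0 ] | i <- [0 .. 2 ^ k - 1] ]

-- Worst-case error  sup_phi || (U|phi>)|0..0> - W(|phi>|0..0>) ||,
-- i.e. the largest singular value of M = ((U (x) I) - W) V, obtained in
-- closed form from the eigenvalues of the 2x2 matrix M^dagger M.
ancillaError :: Int -> Mat -> Mat -> Double
ancillaError k u w = sqrt lamMax
  where
    m      = matMul (matSub (kron u (identity (2 ^ k))) w) (ancillaIso k)
    g      = matMul (dagger m) m
    tr     = realPart (g !! 0 !! 0 + g !! 1 !! 1)
    det    = realPart (g !! 0 !! 0 * g !! 1 !! 1 - g !! 0 !! 1 * g !! 1 !! 0)
    lamMax = (tr + sqrt (max 0 (tr * tr - 4 * det))) / 2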

Another generalization of the synthesis problem is to allow the approximating


circuits to use measurements or other adaptive methods. It is known that using
such methods can decrease the gate count below the information-theoretic lower
bound. These methods are relatively new, but very promising results have already
been achieved ([66], [51], [8], and [67]).

10.2 Proto-Quipper

As already mentioned in Chapter 1, the rationale behind the design of Proto-Quipper


was to start with the simplest language possible, establish type-safety, and then ex-
tend the language in small steps with the goal of eventually adding most of Quipper’s
features to Proto-Quipper in a type-safe way. This defines a natural set of problems
for future work. Many such extensions are conceivable, but we only describe a few
here.

• In the current version of Proto-Quipper, all circuits are reversible. This follows
from the definition of the (rev) and (circ) typing rules and will have to be
modified to accommodate non-reversible gates such as measurements. In such
a setting, the type system should ensure that circuits are reversed only if it is
meaningful to do so. In particular, if a circuit contains a measurement, then it
should not be possible to reverse it.

• A circuit generating function that inputs a list of qubits does not define just one
circuit, but rather a family of circuits parameterized by the length n of the list.
To box such a function, a particular value of n has to be given. In Quipper, we
refer to n as the “shape” of the argument of the function. Operations such as
boxing and reversing often require shape information. An alternative solution
would be to equip Proto-Quipper with a dependent type system. This would
allow shape information to be stored at the type level (a sketch of this idea
appears after this list).

• In contrast to the quantum lambda calculus, the reduction in Proto-Quipper


is non-probabilistic. Of course, the hypothetical quantum device running the
circuit produced by Proto-Quipper would have to perform probabilistic opera-
tions, but the circuit generation itself does not have to. Even if the language

is extended with measurement gates, it is still possible to generate the circuits


deterministically. This is justified by the “principle of deferred measurement”
which states that any quantum circuit is equivalent to one where all measure-
ments are performed as the very last operations (see, e.g., [50] p.186). We
therefore do not need to rely on the result of a measurement to construct cir-
cuits and, in theory, no computational power is lost by making this assumption.
In practice, however, this delaying of measurement may significantly increase
the size of the circuit. Thus in terms of computational resources it is some-
times advantageous to permit circuit generating functions that access previous
measurement results. Several existing quantum algorithms rely on such inter-
active circuit building. In Quipper, this capability is captured by the notion
of dynamic lifting (a toy illustration of interactive circuit generation appears
after this list). Adding such a feature to Proto-Quipper would make the reduction
relation probabilistic. It is an interesting research problem how such an extension
can be carried out in a type-safe way.
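
To illustrate the second item above, the following Haskell sketch uses GADTs and data kinds, a lightweight approximation of the relevant fragment of dependent types, to record the shape of a register in its type: a register indexed by a type-level length guarantees statically that a circuit-generating function preserves the number of wires, and boxing at a fixed shape becomes boxing at a fixed type. The names Qubit, QList, mapWires and boxAt are hypothetical and do not reflect Quipper's or Proto-Quipper's actual interface.

{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat                 -- type-level natural numbers

newtype Qubit = Qubit Int            -- a wire identifier, purely illustrative

-- A register of qubits whose length n is part of its type.
data QList (n :: Nat) where
  QNil  :: QList 'Z
  QCons :: Qubit -> QList n -> QList ('S n)

-- Applying a single-qubit operation to every wire: the type guarantees that
-- the output register has exactly the shape of the input register.
mapWires :: (Qubit -> Qubit) -> QList n -> QList n
mapWires _ QNil         = QNil
mapWires f (QCons q qs) = QCons (f q) (mapWires f qs)

-- "Boxing" a circuit-generating function at a fixed shape n: the shape is
-- carried by the type QList n rather than supplied as a separate argument.
newtype Boxed n = Boxed (QList n -> QList n)

boxAt :: (QList n -> QList n) -> Boxed n
boxAt = Boxed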
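
For the last item, the following toy Haskell sketch contrasts circuit generation with deferred measurements against interactive generation in the style of dynamic lifting. The Circ monad below is just a state monad accumulating a list of gates, and dynamicLift fakes the measurement outcome; in Quipper the corresponding operation would consult the quantum device or a simulator, which is what makes the generation process probabilistic. None of these names are Quipper's actual API, and the sketch assumes the mtl package is available.

import Control.Monad.State (State, execState, modify)

type Wire = Int
data Gate = H Wire | X Wire | Meas Wire
  deriving Show

-- A toy circuit-description monad: the "circuit" is the list of emitted gates.
type Circ a = State [Gate] a

emit :: Gate -> Circ ()
emit g = modify (++ [g])

-- Deferred measurement: the outcome is never consulted during generation,
-- so the circuit can be produced deterministically.
deferred :: Wire -> Circ ()
deferred w = do
  emit (H w)
  emit (Meas w)

-- Interactive generation: the (hypothetical) dynamicLift makes the outcome
-- available to the generator, so the rest of the circuit may depend on it.
-- Here the outcome is simply faked as True.
dynamicLift :: Wire -> Circ Bool
dynamicLift w = do
  emit (Meas w)
  return True

interactive :: Wire -> Circ ()
interactive w = do
  emit (H w)
  r <- dynamicLift w
  if r then emit (X w) else return ()

-- For example, execState (interactive 0) [] yields the gate list [H 0, Meas 0, X 0].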
Bibliography

[1] D.S. Alexander, Neil J. Ross, P. Selinger, J.M. Smith, and B. Valiron. Program-
ming the quantum future. Communications of the ACM, 2015. To appear.

[2] A. Ambainis, A. M. Childs, B.W. Reichardt, R. Špalek, and S. Zhang. Any AND-
OR formula of size n can be evaluated in time n^(1/2+o(1)) on a quantum computer.
SIAM J. Comput., 39:2513–2530, 2010.

[3] Pablo Arrighi and Gilles Dowek. Linear-algebraic lambda-calculus: higher-order,


encodings, and confluence. In Proceedings of the 19th international conference on
Rewriting Techniques and Applications, volume 5117 of Lecture Notes in Com-
puter Science, pages 17–31, 2008.

[4] H.P. Barendregt. The Lambda Calculus: Its Syntax and Semantics, volume 103
of Studies in Logic and the Foundations of Mathematics. MIT Press, 1981.

[5] Andreas Blass, Alex Bocharov, and Yuri Gurevich. Optimal ancilla-free Pauli+V
circuits for axial rotations. Available from arXiv:1412.1033, December 2014.

[6] Alex Bocharov, Xingshan Cui, Vadym Kliuchnikov, and Zhenghan Wang. Effi-
cient topological compilation for weakly-integral anyon model. Available from
arXiv:1504.03383, April 2015.

[7] Alex Bocharov, Yuri Gurevich, and Krysta M. Svore. Efficient decomposition
of single-qubit gates into V basis circuits. Phys. Rev. A, 88:012313 (13 pages),
2013. Also available from arXiv:1303.1411.

[8] Alex Bocharov, Martin Roetteler, and Krysta M. Svore. Efficient synthesis of
universal repeat-until-success circuits. Available from arXiv:1404.5320, April
2014.

[9] H. Chataing, N. J. Ross, and P. Selinger. Report on Proto-Quipper 0.2. Unpub-


lished report delivered to IARPA in the context of the QCS project, 2013.

[10] A. M. Childs, R. Cleve, E. Deotto, E. Farhi, S. Gutmann, and D. A. Spielman.


Exponential algorithmic speedup by a quantum walk. In Proceedings of the
Thirty-Fifth Annual ACM Symposium on Theory of Computing, pages 59–68,
2003.

[11] A. Church and J. B. Rosser. Some properties of conversion. Transactions of the


American Mathematical Society, 39:472–482, 1936.

[12] Koen Claessen. Embedded Languages for Describing and Verifying Hardware.
PhD thesis, Chalmers University of Technology and Göteborg University, 2001.


[13] Henri Cohen. Advanced topics in computational number theory. Graduate texts
in mathematics. Springer, New York, N.Y., Berlin, Heidelberg, 2000.

[14] Christopher M. Dawson and Michael A. Nielsen. The Solovay-Kitaev algorithm.


Quantum Information and Computation, 6(1):81–95, January 2006. Also avail-
able from arXiv:quant-ph/0505030.

[15] David Deutsch. Quantum theory, the Church-Turing principle and the univer-
sal quantum computer. Proceedings of the Royal Society of London, Series A,
Mathematical and Physical Sciences, 400:97–117, 1985.

[16] David Deutsch. Quantum computational networks. Proceedings of the Royal


Society of London, Series A, Mathematical and Physical Sciences, 425(1868):73–
90, 1989.

[17] M. Felleisen and A. K. Wright. A syntactic approach to type soundness. Infor-


mation and Computation, pages 38–94, November 1994.

[18] Richard P. Feynman. Simulating physics with computers. International Journal


of Theoretical Physics, 21(6-7):467–488, June 1982.

[19] Simon Forest, David Gosset, Vadym Kliuchnikov, and David McKinnon. Exact
synthesis of single-qubit unitaries over Clifford-cyclotomic gate sets. Available
from arXiv:1501.4150, January 2015.

[20] Austin G. Fowler. Constructing arbitrary Steane code single logical qubit fault-
tolerant gates. Quantum Information and Computation, 11(9–10):867–873, 2011.
Also available from arXiv:quant-ph/0411206.

[21] Brett Giles and Peter Selinger. Exact synthesis of multiqubit Clifford+T circuits.
Physical Review A, 87:032332, 2013. Also available from arXiv:1212.0506.

[22] Brett Giles and Peter Selinger. Remarks on Matsumoto and Amano’s normal
form for single-qubit Clifford+T operators. Available from arXiv:1312.6584,
December 2013.

[23] J.-Y. Girard. Linear logic. Theoretical Computer Science, 50(1):1–102, 1987.

[24] Jean-Yves Girard, Yves Lafont, and Paul Taylor. Proofs and Types. Number 7
in Cambridge Tracts in Theoretical Computer Science. Cambridge University
Press, 1989.

[25] Daniel Gottesman. Fault-tolerant quantum computation with higher-dimension-


al systems. In Quantum Computing and Quantum Communications, Proceedings
of the 1st NASA International Conference on Quantum Computing and Quantum
Communications (QCQC), pages 302–313, New York, NY, USA, 1998. Springer-
Verlag.

[26] Daniel Gottesman. The Heisenberg representation of quantum computers. In


International Conference on Group Theoretic Methods in Physics, page 9807006,
1998.
[27] Alexander Green, Peter LeFanu Lumsdaine, Neil J. Ross, Peter Selinger, and
Benoı̂t Valiron. An introduction to quantum programming in Quipper. In Pro-
ceedings of the 5th International Conference on Reversible Computation, volume
7948 of Lecture Notes in Computer Science, pages 110–124, 2013. Preprint avail-
able from arXiv:1304.5485.
[28] Alexander Green, Peter LeFanu Lumsdaine, Neil J. Ross, Peter Selinger, and
Benoı̂t Valiron. Quipper: a scalable quantum programming language. In Pro-
ceedings of the 34th ACM SIGPLAN Conference on Programming Language De-
sign and Implementation, PLDI ’13, pages 333–342, 2013. Preprint available
from arXiv:1304.3390.
[29] Lov K. Grover. A fast quantum mechanical algorithm for database search. In
Proceedings of the Twenty-eighth Annual ACM Symposium on Theory of Com-
puting, STOC ’96, pages 212–219, New York, NY, USA, 1996. ACM.
[30] Sean Hallgren. Polynomial-time quantum algorithms for Pell’s equation and the
principal ideal problem. J. ACM, 54(1):4:1–4:19, March 2007.
[31] Aram W. Harrow, Avinatan Hassidim, and Seth Lloyd. Quantum algorithm for
linear systems of equations. Phys. Rev. Lett., 103(15):150502, 2009.
[32] A.W. Harrow, B. Recht, and I.L. Chuang. Efficient discrete approximations of
quantum gates. Journal of Mathematical Physics, 43, 2002. Also available from
arXiv:quant-ph/0111031.
[33] Mark Howard and Jiri Vala. Qudit versions of the qubit π/8 gate. Phys. Rev.
A, 86:022316, Aug 2012.
[34] Graham Hutton. Programming in Haskell. Cambridge University Press, January
2007.
[35] IARPA Quantum Computer Science Program. Broad Agency Announce-
ment IARPA-BAA-10-02. Available from https://www.fbo.gov/notices/
637e87ac1274d030ce2ab69339ccf93c, April 2010.
[36] Phillip Kaye, Raymond Laflamme, and Michele Mosca. An Introduction to Quan-
tum Computing. Oxford University Press, Inc., New York, NY, USA, 2007.
[37] A. Yu. Kitaev. Quantum computations: algorithms and error correction. Russian
Mathematical Surveys, 52(6):1191–1249, 1997.
[38] A. Yu. Kitaev, A. H. Shen, and M. N. Vyalyi. Classical and Quantum Compu-
tation. Graduate Studies in Mathematics 47. American Mathematical Society,
2002.

[39] Vadym Kliuchnikov, Alex Bocharov, and Krysta M. Svore. Asymptotically opti-
mal topological quantum compiling. Available from arXiv:1310.4150, October
2013.
[40] Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. Practical approxima-
tion of single-qubit unitaries by single-qubit quantum Clifford and T circuits.
Also available from arXiv:1212.6964, December 2012.
[41] Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. Asymptotically optimal
approximation of single qubit unitaries by Clifford and T circuits using a constant
number of ancillary qubits. Phys. Rev. Lett., 110:190502 (5 pages), 2013. Also
available from arXiv:1212.0822v2.
[42] Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca. Fast and efficient exact
synthesis of single qubit unitaries generated by Clifford and T gates. Quan-
tum Information and Computation, 13(7–8):607–630, 2013. Also available from
arXiv:1206.5236v4.
[43] Vadym Kliuchnikov and Jon Yard. A framework for exact synthesis. Available
from arXiv:1504.04350, April 2015.
[44] E. Knill. Conventions for quantum pseudocode, 1996.
[45] H.W. Lenstra, Jr. Integer programming with a fixed number of variables. Math.
Oper. Res., 8:538–548, 1983.
[46] A. Lubotzky, R. Phillips, and P. Sarnak. Hecke operators and distributing points
on the sphere I. Communications on Pure and Applied Mathematics, 39:S149–
S186, 1986.
[47] A. Lubotzky, R. Phillips, and P. Sarnak. Hecke operators and distributing points
on S 2 II. Communications on Pure and Applied Mathematics, 40:401–420, 1987.
[48] Frédéric Magniez, Miklos Santha, and Mario Szegedy. Quantum algorithms for
the triangle problem. quant-ph/0310134, 2003.
[49] Wladyslaw Narkiewicz. Elementary and analytic theory of algebraic numbers.
Springer-Verlag Warszawa, Berlin, 1990.
[50] Michael A. Nielsen and Isaac L. Chuang. Quantum Computation and Quantum
Information. Cambridge University Press, 2002.
[51] Adam Paetznick and Krysta M. Svore. Repeat-until-success: Non-deterministic
decomposition of single-qubit unitaries. Available from arXiv:1311.1074,
November 2013.
[52] Robert Raussendorf and Hans J. Briegel. Computational model underlying the
one-way quantum computer. Quantum Info. Comput., 2(6):443–486, October
2002.

[53] Oded Regev. Quantum computation and lattice problems. SIAM J. Comput.,
33(3):738–760, 2004.

[54] Neil J. Ross. Optimal ancilla-free Clifford+V approximation of z-rotations.


Quantum Information and Computation, 15(11–12):932–950, 2015. Preprint
available from arXiv:1409.4355.

[55] Neil J. Ross, P. Selinger, J.M. Smith, and B. Valiron. Quipper: Concrete resource
estimation in quantum algorithms. Extended abstract for QAPL 2014. Available
from arXiv:1412.0625, 2014.

[56] Neil J. Ross and Peter Selinger. Optimal ancilla-free Clifford+T approximation
of z-rotations. Available from arXiv:1403.2975, March 2014.

[57] Neil J. Ross and Peter Selinger. Exact and approximate synthesis of quantum
circuits, version 0.3.0.1. Software implementation available from http://www.
mathstat.dal.ca/~selinger/newsynth/, 2015.

[58] P. Selinger and B. Valiron. Quantum lambda calculus. In S. Gay and I. Mackie,
editors, Semantic Techniques in Quantum Computation, pages 135–172. Cam-
bridge University Press, 2009.

[59] Peter Selinger. Towards a quantum programming language. Mathematical Struc-


tures in Comp. Sci., 14(4):527–586, August 2004.

[60] Peter Selinger. Efficient Clifford+T approximation of single-qubit operators.


Quantum Information and Computation, 15(1–2):159–180, 2015. Preprint avail-
able from arXiv:1212.6253.

[61] Peter Selinger and Benoı̂t Valiron. A lambda calculus for quantum computation
with classical control. Mathematical Structures in Computer Science, 16(3):527–
552, 2006.

[62] Peter W. Shor. Algorithms for quantum computation: discrete logarithms


and factoring. In Proceedings of the 35th Annual Symposium on Founda-
tions of Computer Science, pages 124–134, 1994. Also available from arXiv:
quant-ph/9508027.

[63] André van Tonder. A lambda calculus for quantum computation. SIAM J.
Comput., 33(5):1109–1135, May 2004.

[64] Benoı̂t Valiron. A functional programming language for quantum computation


with classical control. Master’s thesis, University of Ottawa, September 2004.

[65] James D. Whitfield, Jacob Biamonte, and Alán Aspuru-Guzik. Simulation of


electronic structure Hamiltonians using quantum computers. Molecular Physics,
109(5):735–750, 2011.

[66] Nathan Wiebe and Vadym Kliuchnikov. Floating point representations in quan-
tum circuit synthesis. Available from arXiv:1305.5528, May 2013.

[67] Nathan Wiebe and Martin Roetteler. Quantum arithmetic and numerical anal-
ysis using repeat-until-success circuits. Available from arXiv:1406.2040, June
2014.
