Lecture Notes On Quantum Algorithms
Quantum Algorithms
Andrew M. Childs
Department of Computer Science,
Institute for Advanced Computer Studies, and
Joint Center for Quantum Information and Computer Science
University of Maryland
30 May 2017
Contents

Preface
1 Preliminaries
  1.1 Quantum data
  1.2 Quantum circuits
  1.3 Universal gate sets
  1.4 Reversible computation
  1.5 Uniformity
  1.6 Quantum complexity
  1.7 Fault tolerance
I Quantum circuits
2 Efficient universality of quantum circuits
  2.1 Subadditivity of errors
  2.2 The group commutator and a net around the identity
  2.3 Proof of the Solovay-Kitaev Theorem
  2.4 Proof of Lemma 2.3
3 Quantum circuit synthesis over Clifford+T
  3.1 Converting to Matsumoto-Amano normal form
  3.2 Uniqueness of Matsumoto-Amano normal form
  3.3 Algebraic characterization of Clifford+T unitaries
  3.4 From exact to approximate synthesis
Bibliography
Preface
This is a set of lecture notes on quantum algorithms. It is primarily intended for graduate students who have already taken an introductory course on quantum information. Such a course typically covers only the early breakthroughs in quantum algorithms, namely Shor's factoring algorithm (1994) and Grover's searching algorithm (1996). Here we show that there is much more to quantum computing by exploring some of the many quantum algorithms that have been developed over the past twenty years.
These notes cover several major topics in quantum algorithms, divided into six parts:

In Part I, we discuss quantum circuits: in particular, the problem of expressing a quantum algorithm using a given universal set of quantum gates.

In Part II, we discuss quantum algorithms for algebraic problems. Many of these algorithms generalize the main idea of Shor's algorithm. These algorithms use the quantum Fourier transform and typically achieve an exponential (or at least superpolynomial) speedup over classical computers. In particular, we explore a group-theoretic problem called the hidden subgroup problem. A solution of this problem for abelian groups leads to several applications; we also discuss what is known about the nonabelian case.

In Part III, we explore the concept of quantum walk, a quantum generalization of random walk. This concept leads to a powerful framework for solving search problems, generalizing Grover's search algorithm.

In Part IV, we discuss the model of quantum query complexity. We cover the two main methods for proving lower bounds on quantum query complexity (the polynomial method and the adversary method), demonstrating limitations on the power of quantum algorithms. We also discuss how the concept of span programs turns the quantum adversary method into an upper bound, giving optimal quantum algorithms for evaluating Boolean formulas.

In Part V, we describe quantum algorithms for simulating the dynamics of quantum systems. We also discuss an application of quantum simulation to an algorithm for linear systems.

In Part VI, we discuss adiabatic quantum computing, a general approach to solving optimization problems (in a similar spirit to simulated annealing). Related ideas may also provide insights into how one might build a quantum computer.
These notes were originally prepared for a course that was offered three times at the University of
Waterloo: in the winter terms of 2008 (as CO 781) and of 2011 and 2013 (as CO 781/CS 867/QIC 823). I
thank the students in the course for their feedback on the lecture notes. Each offering of the course covered
a somewhat different set of topics. This document collects the material from all versions of the course and
includes a few subsequent improvements.
The material on quantum algorithms for algebraic problems has been collected into a review article that
was written with Wim van Dam [33]. I thank Wim for his collaboration on that project, which strongly
influenced the presentation in Part II.
Please keep in mind that these are rough lecture notes; they are not meant to be a comprehensive treatment of the subject, and there are surely at least a few mistakes. Corrections (by email to amchilds@umd.edu) are welcome.
I hope you find these notes to be a useful resource for learning about quantum algorithms.
Chapter 1
Preliminaries
This chapter briefly reviews some background material on quantum computation. We cover these topics at
a very high level, just to give a sense of what you should know to understand the rest of the lecture notes.
If any of these topics are unfamiliar, you can learn more about them from a text on quantum computation
such as Nielsen and Chuang [75]; Kitaev, Shen, and Vyalyi [62]; or Kaye, Laflamme, and Mosca [60].
where the a_x ∈ ℂ satisfy ∑_x |a_x|^2 = 1. We refer to the basis of states |x⟩ as the computational basis.
It will often be useful to think of quantum states as storing data in a more abstract form. For example, given a group G, we could write |g⟩ for a basis state corresponding to the group element g ∈ G, and

  |φ⟩ = ∑_{g∈G} b_g |g⟩    (1.2)

for an arbitrary superposition over the group. We assume that there is some canonical way of efficiently representing group elements using bit strings; it is usually unnecessary to make this representation explicit.
If a quantum computer stores the state |ψ⟩ and the state |φ⟩, its overall state is given by the tensor product of those two states. This may be denoted |ψ⟩ ⊗ |φ⟩ = |ψ⟩|φ⟩ = |ψ, φ⟩.
  ‖U - U_t ⋯ U_2 U_1‖ ≤ ε.    (1.3)

Here ‖·‖ denotes some appropriate matrix norm, which should have the property that if ‖U - V‖ is small, then U should be hard to distinguish from V no matter what quantum state they act on. A natural choice (which will be suitable for our purposes) is the spectral norm

  ‖A‖ := max_{|ψ⟩} ‖A|ψ⟩‖ / ‖|ψ⟩‖    (1.4)

(where ‖|ψ⟩‖ = √⟨ψ|ψ⟩ denotes the vector 2-norm of |ψ⟩), i.e., the largest singular value of A. Then we call a set of elementary gates universal if any unitary operator on a fixed number of qubits can be approximated to any desired precision using elementary gates.
It turns out that there are finite sets of gates that are universal: for example, the set {H, T, C} with

  H := (1/√2) [1 1; 1 -1],   T := [e^{-iπ/8} 0; 0 e^{iπ/8}],   C := [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0]    (1.5)

(matrices written row by row, with rows separated by semicolons).
There are situations in which we say a set of gates is effectively universal, even though it cannot actually approximate any unitary operator on n qubits. For example, the set {H, T^2, Tof}, where

  Tof := [1 0 0 0 0 0 0 0; 0 1 0 0 0 0 0 0; 0 0 1 0 0 0 0 0; 0 0 0 1 0 0 0 0; 0 0 0 0 1 0 0 0; 0 0 0 0 0 1 0 0; 0 0 0 0 0 0 0 1; 0 0 0 0 0 0 1 0]    (1.6)

is universal, but only if we allow the use of ancilla qubits (qubits that start and end in the |0⟩ state).
Similarly, the basis {H, Tof} is universal in the sense that, with ancillas, it can approximate any orthogonal
matrix. It clearly cannot approximate complex unitary matrices, since the entries of H and Tof are real;
but the effect of arbitrary unitary transformations can be simulated using orthogonal ones by simulating the
real and imaginary parts separately.
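The real/imaginary simulation described above can be checked numerically. The following Python/NumPy sketch encodes a unitary U = A + iB (A, B real) as the orthogonal matrix [A -B; B A] and verifies that this orthogonal matrix reproduces the action of U on the real and imaginary parts of a state:

```python
import numpy as np

# Encode a unitary U = A + iB as the real matrix O = [[A, -B], [B, A]],
# which is orthogonal, and a state |psi> = u + iv as the real vector (u; v).
# Then O (u; v) carries exactly the real and imaginary parts of U |psi>.
rng = np.random.default_rng(0)
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
U, _ = np.linalg.qr(M)                  # a random 2x2 unitary
A, B = U.real, U.imag
O = np.block([[A, -B], [B, A]])

psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

out = U @ psi
enc = O @ np.concatenate([psi.real, psi.imag])

assert np.allclose(O @ O.T, np.eye(4))  # O is orthogonal
assert np.allclose(enc[:2], out.real) and np.allclose(enc[2:], out.imag)
```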
remove the accumulated information. After performing the classical computation with reversible gates, we simply XOR the answer into an ancilla register, and then perform the computation in reverse. Thus we can implement the map (x, y) ↦ (x, y ⊕ f(x)) even when f is a complicated circuit consisting of many gates.
Using this trick, any computation that can be performed efficiently on a classical computer can be performed efficiently on a quantum computer: if we can efficiently implement the map x ↦ f(x) on a classical computer, we can efficiently perform the transformation |x, y⟩ ↦ |x, y ⊕ f(x)⟩ on a quantum computer. This transformation can be applied to any superposition of computational basis states, so for example, we can perform the transformation

  (1/√2^n) ∑_{x∈{0,1}^n} |x, 0⟩ ↦ (1/√2^n) ∑_{x∈{0,1}^n} |x, f(x)⟩.    (1.7)

Note that this does not necessarily mean we can efficiently implement the map |x⟩ ↦ |f(x)⟩, even when f is a bijection (so that this is indeed a unitary transformation). However, if we can efficiently invert f, then we can indeed do this efficiently.
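The XOR trick can already be illustrated classically. The following Python sketch (with a toy function f of our choosing) checks that (x, y) ↦ (x, y ⊕ f(x)) is a bijection and is its own inverse, even though f itself is not invertible:

```python
# Even when f is not invertible, (x, y) -> (x, y XOR f(x)) is a permutation
# and is its own inverse, so running the circuit twice uncomputes f.
def oracle(x, y, f):
    return x, y ^ f(x)

f = lambda x: x % 3            # a toy non-invertible function on 3-bit inputs
pairs = [(x, y) for x in range(8) for y in range(8)]
images = [oracle(x, y, f) for x, y in pairs]

assert sorted(images) == sorted(pairs)                               # a bijection
assert all(oracle(*oracle(x, y, f), f) == (x, y) for x, y in pairs)  # self-inverse
```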
1.5 Uniformity
When we give an algorithm for a computational problem, we consider inputs of varying sizes. Typically,
the circuits for instances of different sizes will be related to one another in a simple way. But this need not
be the case; and indeed, given the ability to choose an arbitrary circuit for each input size, we could have
circuits computing uncomputable languages. Thus we require that our circuits be uniformly generated: say, that there exists a fixed (classical) Turing machine that, given a tape containing the symbol 1 n times, outputs a description of the nth circuit in time poly(n).
(depending on the noise model, but typically in the range of 10^{-3} to 10^{-4}), an arbitrarily long computation can be performed with an arbitrarily small amount of error (see for example [48]).
In this course, we will always assume implicitly that fault-tolerant protocols have been applied, such that
we can effectively assume a perfectly functioning quantum computer.
Part I
Quantum circuits
Chapter 2

Efficient universality of quantum circuits
Are some universal gate sets better than others? Classically, this is not an issue: the set of possible operations
is discrete, so any gate acting on a constant number of bits can be simulated exactly using a constant number
of gates from any given universal gate set. But we might imagine that some quantum gates are much more
powerful than others. For example, given two rotations about strange axes by strange angles, it may not be
obvious how to implement a Hadamard gate, and we might worry that implementing such a gate to high
precision could take a very large number of elementary operations, scaling badly with the required precision.
Fortunately, this is not the case: a unitary operator that can be realized efficiently with one set of 1- and
2-qubit gates can also be realized efficiently with another such set. In particular, we have the following (see
[75, Appendix 3], [37], and [62, Chapter 8]).
Theorem 2.1 (Solovay-Kitaev). Fix two universal gate sets that are closed under inverses. Then any t-gate circuit using one gate set can be implemented to precision ε using a circuit of t poly(log(t/ε)) gates from the other set (indeed, there is a classical algorithm for finding this circuit in time t poly(log(t/ε))).
Thus, not only are the two gate sets equivalent under polynomial-time reduction, but the running time
of an algorithm using one gate set is the same as that using the other gate set up to logarithmic factors.
This means that even polynomial quantum speedups are robust with respect to the choice of gate set.
Thus, in order to simulate a t-gate quantum circuit with total error at most ε, it suffices to simulate each individual gate with error at most ε/t.
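This subadditivity is easy to check numerically. The following Python/NumPy sketch (with illustrative parameters of our choosing) replaces each of t random unitaries by a nearby perturbation and verifies that the spectral-norm error of the product is at most the sum of the individual gate errors:

```python
import numpy as np

# Subadditivity of errors in the spectral norm: if each gate U_i is replaced
# by an approximation V_i with ||U_i - V_i|| <= eps/t, then the error of the
# whole product is at most eps.
rng = np.random.default_rng(1)

def random_unitary(d):
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

def perturb(U, delta):
    # multiply U by exp(i delta K) for a random Hermitian K with ||K|| = 1
    M = rng.normal(size=U.shape) + 1j * rng.normal(size=U.shape)
    K = (M + M.conj().T) / 2
    K /= np.linalg.norm(K, 2)
    w, V = np.linalg.eigh(K)
    return U @ (V * np.exp(1j * delta * w)) @ V.conj().T

t, eps = 20, 1e-2
Us = [random_unitary(2) for _ in range(t)]
Vs = [perturb(U, eps / (2 * t)) for U in Us]   # each within eps/t of its U

gate_errs = [np.linalg.norm(U - V, 2) for U, V in zip(Us, Vs)]
prod_err = np.linalg.norm(np.linalg.multi_dot(Us) - np.linalg.multi_dot(Vs), 2)

assert max(gate_errs) <= eps / t
assert prod_err <= sum(gate_errs) + 1e-12
```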
  ⟦U, V⟧ := U V U^{-1} V^{-1}.    (2.7)
To approximate general unitaries, we will effectively translate them close to the identity.
Note that it suffices to consider unitary gates with determinant 1 (i.e., elements of SU(2)) since a global
phase is irrelevant. Let
  S_ε := {U ∈ SU(2) : ‖I - U‖ ≤ ε}    (2.8)

denote the ε-ball around the identity. Given sets Γ, S ⊆ SU(2), we say that Γ is an ε-net for S if for any A ∈ S, there is a U ∈ Γ such that ‖A - U‖ ≤ ε. The following result (to be proved later on) indicates how the group commutator helps us to make a fine net around the identity.

Lemma 2.3. If Γ is an ε^2-net for S_ε, then ⟦Γ, Γ⟧ := {⟦U, V⟧ : U, V ∈ Γ} is an O(ε^3)-net for S_{ε^2}.
To make an arbitrarily fine net, we apply this idea recursively. But first it is helpful to derive a consequence of the lemma that is more suitable for recursion. We would like to maintain the quadratic relationship between the size of the ball and the quality of the net. If we aim for a k^2 ε^3-net (for some constant k), we would like it to apply to arbitrary points in S_{k ε^{3/2}}, whereas the lemma only lets us approximate points in S_{ε^2}. To handle an arbitrary A ∈ S_{k ε^{3/2}}, we first let W ∈ Γ be the gate closest to A. For sufficiently small ε we have k ε^{3/2} < ε, so S_{k ε^{3/2}} ⊆ S_ε, and therefore A ∈ S_ε. Since Γ is an ε^2-net for S_ε, we have ‖A - W‖ ≤ ε^2, i.e., ‖A W† - I‖ ≤ ε^2, so A W† ∈ S_{ε^2}. Then we can apply the lemma to find U, V ∈ Γ such that ‖A W† - ⟦U, V⟧‖ = ‖A - ⟦U, V⟧ W‖ ≤ k^2 ε^3. In other words, if Γ is an ε^2-net for S_ε, then ⟦Γ, Γ⟧ Γ := {⟦U, V⟧ W : U, V, W ∈ Γ} is a k^2 ε^3-net for S_{k ε^{3/2}}.
Now suppose that Γ_0 is an ε_0^2-net for S_{ε_0}, and let Γ_i := ⟦Γ_{i-1}, Γ_{i-1}⟧ Γ_{i-1} for all positive integers i. Then Γ_i is an ε_i^2-net for S_{ε_i}, where ε_i = k ε_{i-1}^{3/2}. Solving this recursion gives ε_i = (k^2 ε_0)^{(3/2)^i} / k^2.
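The quadratic behavior of the group commutator near the identity, which drives Lemma 2.3, can be observed numerically. In the following Python/NumPy sketch, U and V are small rotations generated by Pauli matrices (an illustrative choice); halving ε roughly quarters the distance from ⟦U, V⟧ to the identity:

```python
import numpy as np

# For U, V within distance ~eps of the identity, the group commutator
# [[U,V]] = U V U^{-1} V^{-1} is within distance O(eps^2) of the identity.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expi(A, s):
    """exp(i s A) for Hermitian A, via the eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(1j * s * w)) @ V.conj().T

def commutator_dist(eps):
    U, V = expi(X, eps), expi(Z, eps)
    W = U @ V @ U.conj().T @ V.conj().T
    return np.linalg.norm(np.eye(2) - W, 2)

r1, r2 = commutator_dist(1e-2), commutator_dist(5e-3)
assert 3.5 < r1 / r2 < 4.5   # quadratic scaling: the ratio is close to 4
```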
Proof of Theorem 2.1. It suffices to consider how to approximate an arbitrary U ∈ SU(2) to precision ε by a sequence of gates from a given universal gate set Γ.
First we take products of elements of Γ to form a new universal gate set Γ_0 that is an ε_0^2-net for SU(2), for some sufficiently small constant ε_0. We know this can be done since Γ is universal. Since ε_0 is a constant, the overhead in constructing Γ_0 is constant.
Now we can find V_0 ∈ Γ_0 such that ‖U - V_0‖ ≤ ε_0^2. Since ‖U - V_0‖ = ‖U V_0† - I‖, we have U V_0† ∈ S_{ε_0^2}. If ε_0 is sufficiently small, then ε_0^2 < k ε_0^{3/2} = ε_1, so U V_0† ∈ S_{ε_1}.
Since Γ_0 is an ε_0^2-net for SU(2), in particular it is an ε_0^2-net for S_{ε_0}. Thus by the above argument, Γ_1 is an ε_1^2-net for S_{ε_1}, so we can find V_1 ∈ Γ_1 such that ‖U V_0† - V_1‖ ≤ ε_1^2 < k ε_1^{3/2} = ε_2, i.e., U V_0† V_1† ∈ S_{ε_2}.
In general, suppose we are given V_0, V_1, …, V_{i-1} such that U V_0† V_1† ⋯ V_{i-1}† ∈ S_{ε_i}. Since Γ_i is an ε_i^2-net for S_{ε_i}, we can find V_i ∈ Γ_i such that ‖U V_0† V_1† ⋯ V_{i-1}† - V_i‖ ≤ ε_i^2. In turn, this implies that U V_0† V_1† ⋯ V_i† ∈ S_{ε_{i+1}}.
Repeating this process t times gives a very good approximation of U by V_t ⋯ V_1 V_0: in particular, we have ‖U - V_t ⋯ V_1 V_0‖ ≤ ε_t^2. Suppose we consider a gate from Γ_0 to be elementary. (These gates can be implemented using only a constant number of gates from Γ, so there is only a constant factor overhead if we count gates in Γ_0 as elementary.) The number of elementary gates needed to implement a gate from Γ_i is 5^i,
so the total number of gates in the approximation is ∑_{i=0}^{t} 5^i = (5^{t+1} - 1)/4 = O(5^t). To achieve an overall error at most ε, we need ε_t^2 = ((k^2 ε_0)^{(3/2)^t} / k^2)^2 ≤ ε, i.e.,

  (3/2)^t ≥ log(k^4 ε) / (2 log(k^2 ε_0)).    (2.9)
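The recursion for ε_i is easy to explore numerically. The following Python sketch (with illustrative constants k and ε_0 of our choosing, such that k^2 ε_0 < 1) checks the closed form against the recursion and computes the level t, and hence the O(5^t) gate count, needed for a target precision:

```python
import math

# The Solovay-Kitaev cost recursion: eps_i = k * eps_{i-1}^{3/2}.
k, eps0 = 2.0, 0.01

def eps(i):
    e = eps0
    for _ in range(i):
        e = k * e ** 1.5
    return e

def eps_closed(i):
    return (k**2 * eps0) ** (1.5 ** i) / k**2

for i in range(6):
    assert math.isclose(eps(i), eps_closed(i), rel_tol=1e-9)

# Number of levels t needed so that eps_t^2 <= eps, and the resulting
# O(5^t) gate count; t grows like log log(1/eps), so 5^t is polylog(1/eps).
target = 1e-10
t = 0
while eps(t) ** 2 > target:
    t += 1
print(t, 5 ** t)
```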
Note that it is possible to improve the construction somewhat over the version described above. Furthermore, it can be generalized to SU(N) for arbitrary N. In general, the cost is exponential in N^2, but for any fixed N this is just a constant.
Chapter 3

Quantum circuit synthesis over Clifford+T
As we discussed in Chapter 2, the Solovay-Kitaev Theorem tells us that we can convert between gate sets with overhead that is only poly(log(1/ε)). However, the overhead may not be that small in practice (we upper bounded the power of the log by log 5 / log(3/2) ≈ 3.97), and it is natural to ask if we can do better. A counting argument shows that the best possible exponent is 1. Can we get close to this lower bound, ideally while retaining a fast algorithm?
In general, no such result is known (even if we do not require a fast algorithm). However, there are
strong circuit synthesis results for particular gate sets with nice structure. In particular, one can perform
fast, nearly-optimal synthesis for the set of single-qubit Clifford+T circuits. Not only does it admit fast
synthesis, but this gate set is also the most common choice for fault-tolerant quantum computation, so it is
likely to be relevant in practice.
To understand the synthesis of Clifford+T circuits, we focus here on the problem of exactly expressing
a given unitary operation over that gate set, assuming such a representation is possible. This result can
be extended to give an algorithm for approximately synthesizing arbitrary single-qubit gates, although the
details are beyond the scope of this lecture. (Note that some of these ideas can also be applied to the
synthesis of multi-qubit circuits, but that is also beyond our scope.)
By adding the T gate, we get a universal gate set: in other words, the group ⟨H, T⟩ generated by H and T is dense in U(2). We call any unitary operation that can be represented exactly over this gate set a Clifford+T operation.
Clearly, any single-qubit Clifford+T operation M can be written in the form

  M = C_n T C_{n-1} ⋯ C_1 T C_0    (3.2)
Let S := ⟨S, X, ω⟩ ⊆ C, where ω := e^{iπ/4}. Any element of S can be pushed through T (say, to the right), since we have

  S T = T S    (3.3)
  X T = ω^{-1} T X S    (3.4)
  ω T = T ω.    (3.5)

Thus we can assume C_1, …, C_n ∉ S. (In some cases, pushing elements of S to the right might cause two T gates to merge into an S ∈ C; we take n to be the number of Clifford gates after any such cancellations.)
An explicit calculation shows that |S| = 64, whereas |C| = 192. Since I, H, and SH are in different left cosets of S in C, they can be chosen as the three coset representatives, and we can write every element of the Clifford group in the form H′S′, where H′ ∈ {I, H, SH} and S′ ∈ S. Similarly, every element of C \ S can be written in the same form, where H′ ∈ {H, SH}. Thus we can write M in the form

  M = H′_n S_n T H′_{n-1} S_{n-1} ⋯ H′_1 S_1 T C_0    (3.6)

where C_0 ∈ C, H′_1, …, H′_n ∈ {H, SH}, and S_1, …, S_n ∈ S.
Now we can further simplify this expression by again pushing elements of S to the right. We have already seen that such operators can be pushed through T gates, giving new elements of S. But furthermore, they can also be pushed through elements of {H, SH}, since

  S H = (SH)    S (SH) = S^2 H = H X    (3.7)
  X H = H Z = H S^2    X (SH) = (SH) Y = (SH)(ω^2 X S^2)    (3.8)
  ω H = H ω    ω (SH) = (SH) ω.    (3.9)
After applying these rules, we are left with an expression of the form

  M = H′_k T H′_{k-1} ⋯ H′_1 T C_0    (3.10)

where H′_1, …, H′_{k-1} ∈ {H, SH} and H′_k ∈ {I, H, SH}. (Note that we can have k < n, since again we could find cancellations as we push gates to the right.) This expression is now in Matsumoto-Amano (MA) normal form. In terms of regular expressions, we can write this form as (ε | T)(HT | SHT)* C.
Since the above argument is constructive, it also gives an algorithm for converting circuits to MA normal
form. A naive implementation would take time O(n^2), since we might make a pass through O(n) gates to find
a simplification, and we might have to repeat this O(n) times before reaching MA normal form. However,
we can reduce to MA normal form in linear time by simplifying the given circuit gate-by-gate, maintaining
MA normal form along the way. If N is in MA normal form and C C, then N C can be reduced to MA
normal form in constant time (we simply combine the rightmost two Clifford operators). On the other hand,
case analysis shows that reducing N T to MA normal form only requires updating the rightmost 5 gates, so
it can also be reduced in constant time. Overall, this approach takes O(n) steps, each taking time O(1), for
a total running time of O(n).
An important parameter of a Clifford+T circuit is its T -count, which is simply the number of T gates it
contains. Clearly there is a way of writing any Clifford+T circuit in MA normal form such that the T -count
is minimal, simply because the reduction procedure described above never increases the T -count.
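The regular expression above can be used directly to recognize MA normal form if we encode a circuit as a string of syllables read left to right, with a single letter C standing for the trailing Clifford operator (a hypothetical string encoding chosen here purely for illustration):

```python
import re

# MA normal form as a regular expression over syllables: (eps | T)(HT | SHT)* C.
MA = re.compile(r"^T?(?:HT|SHT)*C$")

def t_count(word):
    """The T-count of a circuit written in this string encoding."""
    return word.count("T")

assert MA.match("C")          # a bare Clifford operator
assert MA.match("THTSHTC")    # T (HT) (SHT) C
assert not MA.match("TTC")    # TT should have merged into a Clifford
assert not MA.match("HHTC")   # HH should have cancelled
assert t_count("THTSHTC") == 3
```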
These generators belong to the ring Z[1/√2] = {(a + b√2)/√2^k : a, b ∈ ℤ, k ∈ ℕ}, so clearly the Bloch sphere representation of any Clifford+T operator has entries in this ring.
We say that k ∈ ℕ is a denominator exponent of x ∈ Z[1/√2] if √2^k x ∈ Z[√2] = {a + b√2 : a, b ∈ ℤ}. We call the smallest such k the least denominator exponent of x.
Define the parity of x ∈ Z[√2], denoted p(x), such that p(a + b√2) is the parity of a (i.e., 0 if a is even and 1 if a is odd). If k is a denominator exponent for x, define the k-parity of x ∈ Z[1/√2] as p_k(x) := p(√2^k x).
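These definitions are easy to make concrete. The following Python sketch represents x = (a + b√2)/√2^k by the integer triple (a, b, k) (an encoding of our choosing) and computes the least denominator exponent by repeatedly dividing out √2:

```python
# Represent x = (a + b*sqrt(2)) / sqrt(2)**k by the integer triple (a, b, k).
def reduce(a, b, k):
    """Return the representation with the least denominator exponent."""
    # (a + b*sqrt(2))/sqrt(2) = b + (a/2)*sqrt(2), possible only when a is even
    while k > 0 and a % 2 == 0:
        a, b, k = b, a // 2, k - 1
    return a, b, k

def least_denominator_exponent(a, b, k):
    return reduce(a, b, k)[2]

def parity(a, b):
    """p(a + b*sqrt(2)) from the text: the parity of a."""
    return a % 2

# x = (2 + 3*sqrt(2))/2 has least denominator exponent 1: sqrt(2)*x = 3 + sqrt(2)
assert reduce(2, 3, 2) == (3, 1, 1)
assert least_denominator_exponent(2, 3, 2) == 1
```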
Observe that the Bloch sphere representation of a Clifford operator is a signed permutation matrix, so it
has denominator exponent 0, and its parity (applied to the matrix elementwise) is a permutation.
We can define an equivalence relation on (k-)parity matrices of Bloch sphere representations of Clifford+T
operators such that they are equivalent if they differ by right-multiplication by the parity matrix of a Clifford
operator (in other words, by permutation of the columns). Now consider what happens to the k-parity
matrix of the operator as we proceed through the MA normal form, where we increase k by one every time
we multiply by a T gate. A simple calculation shows that transitions between the resulting equivalence
classes are as follows:
  [Transition diagram (3.12): representatives of the four equivalence classes of k-parity matrices are

    [1 0 0; 0 1 0; 0 0 1],  [1 1 0; 1 1 0; 0 0 0],  [0 0 0; 1 1 0; 1 1 0],  [1 1 0; 0 0 0; 1 1 0],

  with arrows labeled by the gates (T, H, S, C) that induce transitions between the classes.]
Here the matrices are representatives of equivalence classes of k-parity matrices, the labels on the arrows
show what gates induce the transitions, k = 0 at the leftmost (starting) matrix, and the value of k is increased
by 1 along each thick arrow. For example, for the transitions under a T gate, the Bloch matrix A with entries a_ij + b_ij √2 maps as

  A ↦ (1/√2) [1 -1 0; 1 1 0; 0 0 √2] A
    = (1/√2) [(a_11 - a_21) + (b_11 - b_21)√2, (a_12 - a_22) + (b_12 - b_22)√2, (a_13 - a_23) + (b_13 - b_23)√2;
              (a_11 + a_21) + (b_11 + b_21)√2, (a_12 + a_22) + (b_12 + b_22)√2, (a_13 + a_23) + (b_13 + b_23)√2;
              2b_31 + a_31 √2, 2b_32 + a_32 √2, 2b_33 + a_33 √2].    (3.13)

At the leftmost matrix, we have a_11, a_22, a_33 odd and a_ij even for i ≠ j. Clearly the resulting 1-parity matrix is of the indicated form. Similar calculations verify the other transitions.
From this transition diagram, we can easily see that the MA normal form is unique. If we remain at the
leftmost matrix, the operation is Clifford. On the other hand, if we end at one of the next three matrices to
the right, the leftmost syllable of M is T , HT , or SHT , respectively. Given a matrix M , let k be its least
denominator exponent. By computing pk (M ) and determining which equivalence class it belongs to, we can
determine the final syllable of its MA normal form. By induction, the entire MA normal form is determined.
Note that this also shows that the least denominator exponent of M is its minimal T -count.
This argument also implies an algorithm for exact synthesis given the matrix of a Clifford+T operation
(instead of some initial Clifford+T circuit). We simply convert to the Bloch matrix representation, compute
the least denominator exponent k, use pk (M ) to determine the leftmost syllable of the MA normal form, and
recurse until we are left with a Clifford operation. In this algorithm, the number of arithmetic operations
performed is O(k).
entries are in Z[1/√2]. As noted above, the "only if" part is trivial; it remains to show that any such matrix corresponds to a Clifford+T operation.
The proof of this statement uses the orthogonality condition on U to characterize the possible values of p_k(U) (specifically, to show it is one of the forms in the above transition diagram, up to permutation of the columns), and then shows that the least denominator exponent can always be reduced by multiplying from the left by the inverse of a matrix from {T, HT, SHT}. The proofs of these statements are straightforward, but involve some explicit calculation and case analysis; see [47] for details.
As a simple corollary, we can establish that U is a Clifford+T unitary if and only if its entries are in Z[1/√2, i]. Again the "only if" direction is trivial. For the other direction, simply observe that if the entries of U are in Z[1/√2, i], then the entries of its Bloch sphere representation are in Z[1/√2], and we can apply the characterization of Bloch matrices of Clifford+T unitaries. Note that this only determines the actual matrix up to a phase, but this phase must be a power of ω, so indeed the original U must be a Clifford+T unitary.
In summary, we have seen that any Clifford+T unitary can be synthesized into a Clifford+T circuit, with
the minimal number of T gates (equal to the least denominator exponent of its Bloch sphere representation),
in time linear in the T -count.
Part II

Quantum algorithms for algebraic problems
Chapter 4

The abelian quantum Fourier transform and phase estimation
  F_G := (1/√|G|) ∑_{x∈G} ∑_{y∈Ĝ} χ_y(x) |y⟩⟨x|    (4.1)

where Ĝ is a complete set of characters of G, and χ_y(x) denotes the yth character of G evaluated at x. (You can verify that this is a unitary operator using the orthogonality of characters.) Since G and Ĝ are isomorphic, we can label the elements of Ĝ using elements of G, and it is often useful to do so.
The simplest QFT over a family of groups is the QFT over G = ℤ_2^n. The characters of this group are χ_y(x) = (-1)^{x·y}, so the QFT is simply

  F_{ℤ_2^n} = (1/√2^n) ∑_{x,y∈ℤ_2^n} (-1)^{x·y} |y⟩⟨x| = H^{⊗n}.    (4.2)
You have presumably seen how this transformation is used in the solution of Simon's problem [92].
  F_{ℤ_{2^n}} = (1/√2^n) ∑_{x,y∈ℤ_{2^n}} ω_{2^n}^{xy} |y⟩⟨x|    (4.3)

where ω_m := exp(2πi/m) is a primitive mth root of unity. To see how to realize this transformation by a quantum circuit, it is helpful to represent the input x as a string of bits, x = x_{n-1} … x_1 x_0, and to consider
  |x⟩ ↦ (1/√2^n) ∑_{y∈ℤ_{2^n}} ω_{2^n}^{xy} |y⟩    (4.4)
     = (1/√2^n) ∑_{y∈ℤ_{2^n}} ω_{2^n}^{x (∑_{k=0}^{n-1} y_k 2^k)} |y_{n-1} … y_1 y_0⟩    (4.5)
     = (1/√2^n) ∑_{y∈ℤ_{2^n}} ∏_{k=0}^{n-1} ω_{2^n}^{x y_k 2^k} |y_{n-1} … y_1 y_0⟩    (4.6)
     = (1/√2^n) ⊗_{k=0}^{n-1} ∑_{y_k∈ℤ_2} ω_{2^n}^{x y_k 2^k} |y_k⟩    (4.7)
     = ⊗_{k=0}^{n-1} |z_k⟩    (4.8)

where

  |z_k⟩ := (1/√2) ∑_{y_k∈ℤ_2} ω_{2^n}^{x y_k 2^k} |y_k⟩    (4.9)
        = (1/√2)(|0⟩ + ω_{2^n}^{x 2^k} |1⟩)    (4.10)
        = (1/√2)(|0⟩ + ω_{2^n}^{∑_{j=0}^{n-1} x_j 2^{j+k}} |1⟩)    (4.11)
        = (1/√2)(|0⟩ + e^{2πi(x_0 2^{k-n} + x_1 2^{k+1-n} + ⋯ + x_{n-1-k} 2^{-1})} |1⟩).    (4.12)

(A more succinct way to write this is |z_k⟩ = (1/√2)(|0⟩ + ω_{2^{n-k}}^x |1⟩), but the above expression is more helpful for understanding the circuit.) In other words, F|x⟩ is a tensor product of single-qubit states, where the kth qubit only depends on the n - k least significant bits of x.
This decomposition immediately gives a circuit for the QFT over ℤ_{2^n}. Let R_k denote the single-qubit unitary operator

  R_k := [1 0; 0 ω_{2^k}].    (4.13)
  [Circuit diagram: for each j, qubit |x_j⟩ is acted on by H followed by controlled rotations R_2, …, R_{j+1} (controlled by the lower-order qubits x_{j-1}, …, x_0), and emerges as |z_{n-1-j}⟩. In particular, |x_0⟩ ↦ |z_{n-1}⟩ after H alone, |x_1⟩ ↦ |z_{n-2}⟩ after H and R_2, and |x_{n-1}⟩ ↦ |z_0⟩ after H and R_2, …, R_n.]
This circuit uses O(n^2) gates. However, there are many rotations by small angles that do not affect the final result very much. If we simply omit the gates R_k with k = Ω(log n), then we obtain a circuit with O(n log n) gates that implements the QFT with precision 1/poly(n).
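The tensor-product decomposition (4.8) can be verified numerically. The following Python/NumPy sketch compares F|x⟩, computed from the defining matrix (4.3), against the product of the single-qubit states |z_k⟩ from (4.10):

```python
import numpy as np

# Check that F|x> equals the tensor product of the |z_k> from (4.10),
# with z_{n-1} as the most significant qubit.
n = 3
N = 2 ** n
omega = np.exp(2j * np.pi / N)

F = np.array([[omega ** (x * y) for x in range(N)] for y in range(N)]) / np.sqrt(N)

for x in range(N):
    z = [np.array([1, omega ** (x * 2 ** k)]) / np.sqrt(2) for k in range(n)]
    prod = np.array([1.0])
    for k in reversed(range(n)):      # z_{n-1} (x) ... (x) z_0
        prod = np.kron(prod, z[k])
    assert np.allclose(F[:, x], prod)
```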
4.3 Phase estimation
apply an inverse Fourier transform on the first register, and measure. If the binary expansion of θ/2π terminates after at most n bits (i.e., if θ = 2πy/2^n for some y ∈ ℤ_{2^n}), then the state (4.16) is F_{2^n}|y⟩ ⊗ |φ⟩, so the result is guaranteed to be the binary expansion of θ/2π. In general, we obtain a good approximation with high probability. In particular, the probability of obtaining the result y (corresponding to the estimate 2πy/2^n for the phase) is

  Pr(y) = (1/2^{2n}) sin^2(2^{n-1} θ) / sin^2(θ/2 - πy/2^n),    (4.17)

which is strongly peaked around the best n-bit approximation (in particular, it gives the best n-bit approximation with probability at least 4/π^2). We will see the details of a similar calculation when we discuss period finding.
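Formula (4.17) can be checked against a direct computation of the amplitudes. The following Python/NumPy sketch does this for an arbitrarily chosen phase θ, and also confirms that the best n-bit approximation appears with probability at least 4/π^2:

```python
import numpy as np

# Compare the phase-estimation outcome distribution computed directly from
# the amplitudes with the closed form (4.17).
n = 5
N = 2 ** n
theta = 1.2345

ys = np.arange(N)
# amplitude of outcome y: (1/N) sum_x exp(i x (theta - 2 pi y / N))
delta = theta - 2 * np.pi * ys / N
amps = np.array([np.exp(1j * np.arange(N) * d).sum() / N for d in delta])
direct = np.abs(amps) ** 2

formula = np.sin(2 ** (n - 1) * theta) ** 2 / (
    N**2 * np.sin(theta / 2 - np.pi * ys / N) ** 2)

assert np.allclose(direct, formula)
assert np.isclose(direct.sum(), 1.0)
best = int(np.rint(theta * N / (2 * np.pi))) % N   # best n-bit approximation
assert direct[best] >= 4 / np.pi**2
```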
The circuit we derived using the binary representation of the input and output only works when N is a power of two (or, with a slight generalization, some other small integer). But there is a simple way to realize F_{ℤ_N} (approximately) using phase estimation.
We would like to perform the transformation that maps |x⟩ ↦ |x̃⟩, where |x̃⟩ := F_{ℤ_N}|x⟩ denotes a Fourier basis state. (By linearity, if the transformation acts correctly on a basis, it acts correctly on all states.) It is straightforward to perform the transformation |x, 0⟩ ↦ |x, x̃⟩; then it remains to erase the register |x⟩ from such a state.
Consider the unitary operator that adds 1 modulo N:

  U := ∑_{x∈ℤ_N} |x + 1⟩⟨x|.    (4.19)

The eigenstates of this operator are precisely the Fourier basis states |x̃⟩ := F_{ℤ_N}|x⟩, since (as a simple calculation shows)

  F_{ℤ_N}† U F_{ℤ_N} = ∑_{x∈ℤ_N} ω_N^{-x} |x⟩⟨x|.    (4.20)
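Equation (4.20) is easy to verify numerically. The following Python/NumPy sketch conjugates the increment operator by the QFT matrix for a small N and checks that the result is diagonal with entries ω_N^{-x}:

```python
import numpy as np

# Conjugating the mod-N increment operator U by the QFT diagonalizes it,
# with eigenvalue omega_N^(-x) on the Fourier basis state |x~>.
N = 5
omega = np.exp(2j * np.pi / N)

F = np.array([[omega ** (x * y) for x in range(N)] for y in range(N)]) / np.sqrt(N)
U = np.zeros((N, N))
for x in range(N):
    U[(x + 1) % N, x] = 1          # U = sum_x |x+1><x|

D = F.conj().T @ U @ F
assert np.allclose(D, np.diag(omega ** (-np.arange(N))))
```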
Thus, using phase estimation on U (with n bits of precision, where n = O(log N)), we can perform the transformation

  |x̃, 0⟩ ↦ |x̃, x⟩    (4.21)

(actually, phase estimation only gives an approximation of x, so we implement this transformation only approximately). By running this operation in reverse, we can erase |x⟩, and thereby produce the desired QFT.
Given the Fourier transform over ZN , it is straightforward to implement the QFT over an arbitrary finite
abelian group: any finite abelian group can be written as a direct product of cyclic factors, and the QFT
over a direct product of groups is simply the tensor product of QFTs over the individual groups.
Chapter 5

Discrete log and the hidden subgroup problem
In this lecture we will discuss the discrete logarithm problem and its relevance to cryptography. We will introduce the general hidden subgroup problem, and show how Shor's algorithm solves a particular instance of it, giving an efficient quantum algorithm for discrete log.
At the end of the protocol, Alice and Bob share a key K, and Eve has only seen p, g, A, and B.
The security of the Diffie-Hellman protocol relies on the assumption that discrete log is hard. Clearly, if
Eve can compute discrete logarithms, she can recover a and b, and hence the key. (Note that it is an open
question whether, given the ability to break the protocol, Eve can calculate discrete logarithms, though some
partial results in this direction are known.)
This protocol only provides a means of exchanging a secret key, not of sending private messages. However,
very similar ideas can be used to create a public-key cryptosystem (similar in spirit to RSA).
for some unknown subgroup H ≤ G. We say that such a function hides H. The goal of the HSP is to learn H (say, specified in terms of a generating set) using queries to f.
It's clear that H can in principle be reconstructed if we are given the entire truth table of f. Notice in particular that f(1) = f(x) if and only if x ∈ H: the hiding function is constant on the hidden subgroup, and does not take that value anywhere else.
But the hiding function has a lot more structure as well. If we fix some element g ∈ G with g ∉ H, we see that f(g) = f(x) if and only if x ∈ gH, a left coset of H in G with coset representative g. So f is constant on the left cosets of H in G, and distinct on different left cosets.
In the above definition of the HSP, we have made an arbitrary choice to multiply by elements of H on
the right, which is why the hiding function is constant on left cosets. We could just as well have chosen to
multiply by elements of H on the left, in which case the hiding function would be constant on right cosets;
the resulting problem would be equivalent. Of course, in the case where G is abelian, we don't need to make
such a choice. For reasons that we will see later, this case turns out to be considerably simpler than the
general case; and indeed, there is an efficient quantum algorithm for the HSP in any abelian group, whereas
there are only a few nonabelian groups for which efficient algorithms are known.
You should be familiar with Simon's problem, the HSP with G = Z₂ⁿ and H = {0, s} for some s ∈ Z₂ⁿ.
There is a very simple quantum algorithm for this problem, yet one can prove that any classical algorithm
for finding s must query the hiding function exponentially many times (in n). The gist of the argument is
that, since the set S is unstructured, we can do no better than querying random group elements so long as
we do not know two elements x, y for which f(x) = f(y). But by the birthday problem, we are unlikely to
see such a collision until we make Ω(√(|G|/|H|)) random queries.
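A minimal concrete hiding function for Simon's problem makes this structure explicit (a sketch; the function name is ours, and this particular choice of canonical coset representative is one arbitrary option among many):

```python
# A hiding function for Simon's problem with G = Z_2^n, H = {0, s}:
# f is constant on each coset {x, x XOR s} and distinct across cosets.
def simon_hiding_function(n, s):
    return lambda x: min(x, x ^ s)   # pick a canonical coset representative

n, s = 3, 0b101
f = simon_hiding_function(n, s)

# f(x) = f(y) exactly when y lies in the coset x + H = {x, x XOR s}.
for x in range(2 ** n):
    for y in range(2 ** n):
        assert (f(x) == f(y)) == (y in (x, x ^ s))
```

A classical algorithm querying such an f sees only unstructured collisions, which is what drives the birthday-bound lower bound above.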
A similar argument applies to any HSP with a large number of trivially intersecting subgroups. More
precisely, we have
Theorem 5.1. Suppose that G has a set 𝓗 of N subgroups whose only common element is the identity.
Then a classical computer must make Ω(√N) queries to solve the HSP.
Proof. Suppose the oracle does not a priori hide a particular subgroup, but instead behaves adversarially,
as follows. On the ℓth query, the algorithm queries g_ℓ, which we assume to be different from g₁, …, g_{ℓ−1}
without loss of generality. If there is any subgroup H ∈ 𝓗 for which g_k ∉ g_j H for all 1 ≤ j < k ≤ ℓ (i.e.,
there is some consistent way the oracle could assign g_ℓ to an as-yet-unqueried coset of a hidden subgroup
from 𝓗), then the oracle simply outputs ℓ; otherwise the oracle concedes defeat and outputs a generating
set for some H ∈ 𝓗 consistent with its answers so far (which must exist, by construction).
The goal of the algorithm is to force the oracle to concede, and we want to lower bound the number of
queries required. (Given an algorithm for the HSP in G, there is clearly an algorithm that forces this oracle
to concede using only one more query.) Now consider an algorithm that queries the oracle t times before
forcing the oracle to concede. This algorithm simply sees a fixed sequence of responses 1, 2, …, t, so for the
first t queries, the algorithm cannot be adaptive. But observe that, regardless of which t group elements are
queried, there are at most t(t − 1)/2 values of g_k g_j⁻¹, whereas there are N possible subgroups in 𝓗. Thus, to satisfy
the N conditions that for all H ∈ 𝓗, there is some pair j, k such that g_k g_j⁻¹ ∈ H, we must have t(t − 1)/2 ≥ N,
i.e., t = Ω(√N).
Note that there are cases where a classical algorithm can find the hidden subgroup with a polynomial
number of queries. In particular, since a classical computer can easily test whether a certain subgroup is
indeed the hidden one, the HSP is easy for a group with only a polynomial number of subgroups. Thus, for
example, a classical computer can easily solve the HSP in Z_p for p prime (since it has only 2 subgroups) and
in Z_{2ⁿ} (since it has only n + 1 subgroups).
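For Z_{2ⁿ} this classical test is easy to make concrete: the subgroups are exactly ⟨2^k⟩ for k = 0, …, n, and they are nested, so roughly n + 1 queries suffice (a sketch; the function and variable names are ours):

```python
# Classically find the hidden subgroup of Z_{2^n} with about n + 1 queries.
# The subgroups of Z_{2^n} are <2^k> for k = 0, ..., n, nested by inclusion,
# and h lies in the hidden subgroup exactly when f(h) = f(0).
def hidden_subgroup_z2n(f, n):
    N = 2 ** n
    for k in range(n + 1):
        g = pow(2, k, N)          # 2^k, with 2^n = 0 in Z_{2^n}
        if f(g) == f(0):
            return g              # smallest k works: this generates the hidden subgroup

# Example: in Z_16, the function x -> x mod 4 hides the subgroup <4>.
assert hidden_subgroup_z2n(lambda x: x % 4, 4) == 4
```

The nesting of the subgroups is what makes this efficient; for a group with exponentially many trivially intersecting subgroups, Theorem 5.1 rules out any such shortcut.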
H = L₀ = {(0, 0), (1, log_g x), (2, 2 log_g x), …, (N − 1, (N − 1) log_g x)}.  (5.4)
The cosets of H in Z_N × Z_N are of the form (μ, ν) + H with μ, ν ∈ Z_N. In particular, the set of cosets of the
form
(0, β) + H = {(α, β + α log_g x) : α ∈ Z_N} = L_β  (5.5)
with β varying over all of Z_N gives a complete set of cosets (so the set {0} × Z_N is a complete set of coset
representatives, i.e., a transversal of H in Z_N × Z_N).
Shor's algorithm for finding H proceeds as follows. We start from the uniform superposition over Z_N × Z_N
and compute the hiding function:
|Z_N × Z_N⟩ := (1/N) Σ_{α,β∈Z_N} |α, β⟩ ↦ (1/N) Σ_{α,β∈Z_N} |α, β, f(α, β)⟩.  (5.6)
Next we discard the third register. To see what this does, it may be conceptually helpful to imagine
that we actually measure the third register. Then the post-measurement state is a superposition over group
elements consistent with the observed function value, which by definition is some coset of H. In particular,
if the measurement outcome is g^β, we are left with the coset state corresponding to (0, β) + H, namely
|(0, β) + H⟩ = |L_β⟩ = (1/√N) Σ_{α∈Z_N} |α, β + α log_g x⟩.  (5.7)
However, note that the measurement outcome is unhelpful: each possible value g^β occurs with equal probability,
and we cannot obtain β from g^β unless we know how to take discrete logarithms. This is why we may
as well simply discard the third register, leaving the system in the mixed state described by the ensemble of
pure states (5.7) where β is uniformly random and unknown.
Now we can exploit the symmetry of the quantum state by performing a QFT over Z_N × Z_N; then the
state becomes
(1/N^{3/2}) Σ_{α,μ,ν∈Z_N} ω_N^{αμ + (β + α log_g x)ν} |μ, ν⟩ = (1/N^{3/2}) Σ_{μ,ν∈Z_N} ω_N^{βν} Σ_{α∈Z_N} ω_N^{α(μ + ν log_g x)} |μ, ν⟩,  (5.8)
and using the identity Σ_{α∈Z_N} ω_N^{αμ} = N δ_{μ,0}, we have
(1/√N) Σ_{ν∈Z_N} ω_N^{βν} |−ν log_g x, ν⟩.  (5.9)
N ZN
Now suppose we measure this state in the computational basis. Then we obtain some pair ( logg x, ) for
uniformly random ZN . If has a multiplicative inverse modulo N , we can divide the first register by
to get the desired answer. If does not have a multiplicative inverse, we simply repeat the entire procedure
again. The probability of success for each independent attempt is (N )/N = (1/ log log N ), so we dont
have to repeat the procedure many times before we find an invertible .
This algorithm can be carried out for any cyclic group G so long as we have a unique representation of
the group elements, and we are able to efficiently compute products in G. (We need to be able to compute
high powers of a group element, but recall that this can be done quickly by repeated squaring.)
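The classical post-processing step can be sketched concretely (the helper names are ours, and the quantum measurement is replaced by directly sampling the pair appearing in (5.9)):

```python
from math import gcd
import random

# Simulate the measurement outcome of Shor's discrete log algorithm: a pair
# (-v*d mod N, v) with v uniform in Z_N, where d = log_g x is the secret.
def sample_outcome(d, N, rng):
    v = rng.randrange(N)
    return ((-v * d) % N, v)

def recover_dlog(pair, N):
    a, v = pair
    if gcd(v, N) != 1:
        return None                    # v not invertible: discard and retry
    return (-a * pow(v, -1, N)) % N    # divide the first register by -v

rng = random.Random(0)
N, d = 100, 37
d_rec = None
while d_rec is None:                   # phi(N)/N of the attempts succeed
    d_rec = recover_dlog(sample_outcome(d, N, rng), N)
assert d_rec == d
```

The retry loop mirrors the argument above: each independent attempt succeeds with probability φ(N)/N, so only a few repetitions are needed.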
Chapter 6
The abelian HSP and decomposing abelian groups
Here we describe an algorithm to solve the HSP in any finite abelian group of known structure. We also
explain how related ideas can be used to determine the structure of a black-box abelian group.
where
χ_y(H) := (1/|H|) Σ_{h∈H} χ_y(h).  (6.8)
Note that applying the QFT was the right thing to do because the state ρ_H is G-invariant. In other words,
it commutes with the regular representation of G, the unitary matrices U(x) satisfying U(x)|y⟩ = |x + y⟩ for
all x, y ∈ G: we have
U(x) ρ_H = (1/|G|) Σ_{y∈G} |x + y + H⟩⟨y + H|  (6.9)
= (1/|G|) Σ_{z∈G} |z + H⟩⟨z − x + H|  (6.10)
= ρ_H U(−x)†  (6.11)
= ρ_H U(x).  (6.12)
It follows that ρ̃_H := F_G ρ_H F_G† is diagonal (indeed, we verify this explicitly below), so we can measure
without losing any information. We will talk about this phenomenon more when we discuss nonabelian
Fourier sampling.
Note that χ_y is a character of H if we restrict our attention to that subgroup. If χ_y(h) = 1 for all h ∈ H,
then clearly χ_y(H) = 1. On the other hand, if there is any h₀ ∈ H with χ_y(h₀) ≠ 1 (i.e., if the restriction of
χ_y to H is not the trivial character of H), then since h₀ + H = H, we have
χ_y(H) = (1/|H|) Σ_{h∈h₀+H} χ_y(h)  (6.13)
= (1/|H|) Σ_{h∈H} χ_y(h₀ + h)  (6.14)
= χ_y(h₀) χ_y(H),  (6.15)
which implies that χ_y(H) = 0. (This also follows from the orthogonality of characters of H,
(1/|H|) Σ_{x∈H} χ_y(x)* χ_{y′}(x) = δ_{y,y′}.)  (6.16)
Next we measure in the computational basis. Then we obtain some character χ_y that is trivial on the
hidden subgroup H. This information narrows down the possible elements of the hidden subgroup: we can
restrict our attention to those elements g ∈ G satisfying χ_y(g) = 1. The set of such elements is called the
kernel of χ_y,
ker χ_y := {g ∈ G : χ_y(g) = 1};  (6.19)
it is a subgroup of G. Now our strategy is to repeat the entire sampling procedure many times and compute
the intersection of the kernels of the resulting characters. After only polynomially many steps, we claim that
the resulting subgroup is H with high probability. It clearly cannot be smaller than H (since the kernel of
every sampled character contains H), so it suffices to show that each sample is likely to reduce the size of
the intersection by a substantial fraction until H is reached.
Suppose that at some point in this process, the intersection of the kernels is K ≤ G with K ≠ H. Since
K is a subgroup of G with H < K, we have |K| ≥ 2|H| (by Lagrange's theorem). Because each character
χ_y of G satisfying χ_y(H) = 1 has probability |H|/|G| of appearing, the probability that we see some y for
which K ⊆ ker χ_y is
(|H|/|G|) · |{y ∈ G : K ⊆ ker χ_y}|.  (6.20)
But the number of such y's is precisely |G|/|K|, since we know that if the subgroup K were hidden, we would
sample such y's uniformly, with probability |K|/|G|. Therefore the probability that we see a y for which
K ⊆ ker χ_y is precisely |H|/|K| ≤ 1/2. Now if we observe a y such that K ⊄ ker χ_y, then |K ∩ ker χ_y| ≤ |K|/2;
furthermore, this happens with probability at least 1/2. Thus, if we repeat the process O(log |G|) times, it
is extremely likely that the resulting subgroup is in fact H.
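For a cyclic group G = Z_N this sampling-and-intersection strategy is easy to simulate classically (a sketch with names of our choosing; the characters of Z_N are χ_y(x) = ω_N^{xy}, and χ_y is trivial on H exactly when yh ≡ 0 (mod N) for every h ∈ H):

```python
import random

# Simulate the abelian HSP algorithm for G = Z_N with hidden subgroup H = <h>:
# repeatedly sample a character trivial on H and intersect the kernels.
def find_hidden_subgroup(N, h, samples, seed=1):
    rng = random.Random(seed)
    H = {(k * h) % N for k in range(N)}
    # Fourier sampling returns y with chi_y trivial on H, uniformly at random.
    trivial = [y for y in range(N) if all((y * g) % N == 0 for g in H)]
    K = set(range(N))                       # start from the full group
    for _ in range(samples):
        y = rng.choice(trivial)
        K &= {g for g in range(N) if (y * g) % N == 0}   # kernel of chi_y
    return K

# Hidden subgroup <3> = {0, 3, 6, 9} in Z_12, recovered after a few samples.
assert find_hidden_subgroup(12, 3, samples=20) == {0, 3, 6, 9}
```

Each sampled kernel contains H, and the intersection shrinks by at least half with probability at least 1/2 per sample, exactly as in the argument above.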
G ≅ Z_{|⟨h₁⟩|} ⊕ Z_{|⟨h₂⟩|} ⊕ ⋯ ⊕ Z_{|⟨h_t⟩|}.  (6.22)
Given such a decomposition, it is straightforward to implement FG and thereby solve HSPs in G. We might
also use this tool to decompose the structure of the hidden subgroup H output by the HSP algorithm, e.g.,
to compute |H|.
First, it is helpful to simplify the problem by reducing to the case of a p-group for some prime p. For
each given generator g of G, we compute its order, the smallest positive integer r such that rg = 0
(where we are using additive notation; in multiplicative notation we would write g^r = 1). Recall that there
is an efficient quantum algorithm for order finding. Furthermore, there is an efficient quantum algorithm for
factoring, so suppose we can write r = st for some relatively prime integers s, t. By Euclid's algorithm, we
can find a, b such that as + bt = 1, so asg + btg = g. Therefore, we can replace the generator g by the two
generators sg and tg and still have a generating set. By repeating this procedure, we eventually obtain a
generating set in which all the generators have prime power order.
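The splitting step can be sketched concretely in the additive group Z_m (illustrative names; the extended Euclidean algorithm supplies the coefficients a and b):

```python
# Split a generator g of order r = s*t (with gcd(s,t) = 1) in the additive
# group Z_m into the two generators s*g and t*g, using a*s + b*t = 1.
def ext_gcd(s, t):
    """Return (gcd, a, b) with a*s + b*t = gcd."""
    if t == 0:
        return (s, 1, 0)
    g, a, b = ext_gcd(t, s % t)
    return (g, b, a - (s // t) * b)

def split_generator(g, s, t, m):
    gcd, a, b = ext_gcd(s, t)
    assert gcd == 1
    # a*(s*g) + b*(t*g) = (a*s + b*t)*g = g, so {s*g, t*g} generates <g>.
    return (s * g) % m, (t * g) % m, a, b

# Example: g = 1 has order 12 = 3*4 in Z_12; s*g has order 4, t*g has order 3.
sg, tg, a, b = split_generator(1, 3, 4, 12)
assert (a * sg + b * tg) % 12 == 1
```

Repeating this on every generator yields a generating set of prime power orders, as described above.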
For a given prime p, let G_p be the group generated by all the generators of G whose order is a power of
p. Then G = ⊕_p G_p: every element of G can be written as a sum of elements from the G_p's (since together
they include a generating set), and since G_p is a p-group (i.e., the orders of all its elements are powers of
p), G_p ∩ G_{p′} = {0} for p ≠ p′. Thus, it suffices to focus on the generators of G_p and determine the structure of this
p-group. So from now on we assume that the order of G is a power of p.
Now, given a generating set {g₁, …, g_d} for G, let q (which is some power of p) be the largest order of
any of the generators. We consider a hidden subgroup problem in the group Z_q^d whose solution allows us to
determine the structure of G. Define f : Z_q^d → G by
f(x₁, …, x_d) = x₁g₁ + ⋯ + x_d g_d.
Now f(x₁, …, x_d) = f(y₁, …, y_d) if and only if (x₁ − y₁)g₁ + ⋯ + (x_d − y_d)g_d = 0, i.e., if and only if
f(x − y) = 0. The elements of Z_q^d on which f takes the value 0,
K := {x ∈ Z_q^d : f(x) = 0},
form a subgroup of Z_q^d called the kernel of f. Using the algorithm for the hidden subgroup problem in Z_q^d,
we can find generators for K. Suppose this generating set is W = {w₁, …, w_m}, where w_i ∈ Z_q^d.
The function f is clearly a homomorphism from Z_q^d to G, and it is also surjective (i.e., onto, meaning
that the image of f is all of G), which implies that Z_q^d/K ≅ G (this is called the first isomorphism theorem).
Thus, to determine the structure of G, it suffices to determine the structure of the quotient Z_q^d/K. In
particular, if Z_q^d/K = ⟨u₁ + K⟩ ⊕ ⋯ ⊕ ⟨u_t + K⟩, then G = ⟨f(u₁)⟩ ⊕ ⋯ ⊕ ⟨f(u_t)⟩. The final ingredient is a
polynomial-time classical algorithm that produces such a direct sum decomposition of a quotient group.
To find such a decomposition, it is helpful to view the problem in terms of linear algebra. With x ∈ Z_q^d,
we have x + K = K (so that f(x) = 0, and there is no need to include x as a generator) if and only if
x ∈ span_{Z_q} W (recall that W is a generating set for K). We can easily modify this to allow arbitrary integer
vectors x ∈ Z^d: then x + K = K if and only if x ∈ span_Z(W ∪ {qe₁, …, qe_d}), where e_i is the ith standard
basis vector. In other words, as x varies over the integer span of the vectors w₁, …, w_m, qe₁, …, qe_d, we
obtain redundant vectors.
Now we use a tool from integer linear algebra called the Smith normal form. A square integer matrix
is called unimodular if it has determinant ±1. Given an integer matrix M, its Smith normal form is a
decomposition M = UDV⁻¹, where U and V are unimodular and D = diag(1, …, 1, d₁, …, d_t, 0, …, 0) is an integer diagonal matrix with
its positive diagonal entries satisfying d₁ | d₂ | ⋯ | d_t. The Smith normal form can be computed classically
in polynomial time.
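The reduction underlying the Smith normal form can be sketched in a few lines: repeatedly pick a pivot of minimal magnitude, clear its row and column with integer (unimodular) row and column operations, and enforce the divisibility condition. For brevity this educational version returns only the diagonal entries, not the transformation matrices U and V, and it does not control entry growth the way a genuinely polynomial-time implementation must:

```python
# Diagonal of the Smith normal form of an integer matrix, via elementary
# (unimodular) row and column operations. Educational sketch only.
def smith_diagonal(M):
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    diag, t = [], 0
    while t < min(m, n):
        # choose a pivot: the nonzero entry of smallest magnitude
        piv = min(((i, j) for i in range(t, m) for j in range(t, n)
                   if A[i][j] != 0),
                  key=lambda ij: abs(A[ij[0]][ij[1]]), default=None)
        if piv is None:
            break                        # remaining submatrix is zero
        i0, j0 = piv
        A[t], A[i0] = A[i0], A[t]        # move the pivot to position (t, t)
        for row in A:
            row[t], row[j0] = row[j0], row[t]
        dirty = False
        for i in range(t + 1, m):        # clear column t with row operations
            q = A[i][t] // A[t][t]
            for j in range(t, n):
                A[i][j] -= q * A[t][j]
            dirty |= A[i][t] != 0
        for j in range(t + 1, n):        # clear row t with column operations
            q = A[t][j] // A[t][t]
            for i in range(t, m):
                A[i][j] -= q * A[i][t]
            dirty |= A[t][j] != 0
        if dirty:
            continue                     # smaller entries appeared; re-pivot
        # enforce divisibility: the pivot must divide all remaining entries
        bad = next(((i, j) for i in range(t + 1, m) for j in range(t + 1, n)
                    if A[i][j] % A[t][t] != 0), None)
        if bad is not None:
            for j in range(t, n):        # add the offending row to row t
                A[t][j] += A[bad[0]][j]
            continue
        diag.append(abs(A[t][t]))
        t += 1
    return diag

assert smith_diagonal([[2, 0], [0, 3]]) == [1, 6]   # diag(2,3) ~ diag(1,6)
```

The returned entries d₁ | d₂ | ⋯ are exactly the diagonal of D; the entries not equal to 0 or 1 determine the cyclic factors used below.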
In the present context, let M be the matrix with columns w₁, …, w_m, qe₁, …, qe_d. Let M = UDV⁻¹ be
its Smith normal form, and let u₁, …, u_t be the columns of U corresponding to diagonal entries of D that
are not 0 or 1 (i.e., if the ith diagonal entry of D is not 0 or 1, the ith column of U is included). We claim
that Z_q^d/K = ⟨u₁ + K⟩ ⊕ ⋯ ⊕ ⟨u_t + K⟩.
Since U is nonsingular, it is clear that we still have a generating set if we take all the columns of U.
We're claiming that the columns corresponding to 0 or 1 diagonal entries of D are redundant. Let u be
the jth column of U; we know that u + K = K (i.e., u is redundant) if u ∈ span_Z cols(M) (where cols(M)
denotes the set of columns of M). Since V is unimodular, span_Z cols(M) = span_Z cols(MV). So u + K = K
if u ∈ span_Z cols(MV), i.e., if e_j ∈ span_Z cols(U⁻¹MV) = span_Z cols(D). If the jth diagonal entry of D is 0
or 1, then clearly this is true, so u + K = K. This shows that the cosets u₁ + K, …, u_t + K alone indeed
generate Z_q^d/K.
It remains to show that they generate Z_q^d/K as a direct sum. The above argument shows that d_i u_i + K =
K, and this is not true for any smaller positive multiple of u_i, so the order of u_i + K is d_i. Now suppose Σ_i x_i u_i + K =
K. Then Σ_i x_i u_i ∈ span_Z cols(M) = span_Z cols(MV), or in other words, x ∈ span_Z cols(U⁻¹MV) =
span_Z cols(D). But this implies that x_i is an integer multiple of d_i, which shows that ⟨u₁ + K⟩ ⊕ ⋯ ⊕ ⟨u_t + K⟩
is indeed a direct sum decomposition.
Chapter 7
Quantum attacks on elliptic curve cryptography
In Chapter 5 we discussed Shor's algorithm, which can calculate discrete logarithms over any cyclic group.
In particular, this algorithm can be used to break the Diffie-Hellman key exchange protocol, which assumes
that the discrete log problem in Z_p^* (p prime) is hard. However, Shor's algorithm also breaks elliptic curve
cryptography, the main competitor to RSA. In this lecture we will introduce elliptic curves and show how
they give rise to abelian groups that can be used to define cryptosystems.
This lecture is only intended to be a survey of the main ideas behind elliptic curve cryptography. While
breaking such cryptosystems is a major potential application of quantum computers, only a few implementation
details differ between the algorithms for discrete log over the integers and over elliptic curves; no new
quantum ideas are required.
y² = x³ + ax + b  (7.1)
where a, b ∈ F are parameters. The set of points (x, y) ∈ F² satisfying this equation, together with a special
point O called the point at infinity, is called the elliptic curve E_{a,b}. A curve is called nonsingular if its
discriminant, Δ := −16(4a³ + 27b²), is nonzero, and we will assume that this is the case for all curves we
consider.
Here are a few examples of elliptic curves over R²:
[Figure: three examples of elliptic curves plotted over the real plane.]
Such pictures are helpful for developing intuition. However, for cryptographic applications it is useful to
have a curve whose points can be represented exactly with a finite number of bits, so we use curves over
finite fields. For simplicity, we will only consider the case F = F_p where p is a prime different from 2 or 3.
As an example, consider the curve
y² = x³ − 2x + 1  (7.2)
over F₇. This curve has 4a³ + 27b² = −32 + 27 = −5 ≡ 2 (mod 7), so it is nonsingular. It is tedious but
straightforward to check that the points on this curve are
E_{−2,1} = {O, (0, 1), (0, 6), (1, 0), (3, 1), (3, 6), (4, 1), (4, 6), (5, 2), (5, 5), (6, 3), (6, 4)}.  (7.3)
In general, the number of points on the curve depends on the parameters a and b. However, for large
p it is quite close to p for all curves. Specifically, a theorem of Hasse says it is p + 1 − t, where |t| ≤ 2√p.
(Note that for elliptic curves, there is a classical algorithm, Schoof's algorithm, that computes the number
of points on the curve in time poly(log p). For more general curves defined by polynomial equations over
finite fields, there are similar estimates to the one provided by Hasse's theorem, yet computing the precise
number of points may be a classically hard problem. But for some such curves, there is an efficient quantum
algorithm, Kedlaya's algorithm, for counting the number of points on the curve [61].)
It turns out that an elliptic curve defines an abelian group. Specifically, there is a binary operation +
that maps a pair of points on the curve to a new point on the curve, in a way that satisfies all the group
axioms. To motivate this definition, we go back to the case where F = R. Given two points P, Q ∈ E_{a,b},
their sum P + Q is defined geometrically, as follows. For now, assume that neither point is O. Draw a line
through the points P and Q (or, if P = Q, draw the tangent to the curve at P ), and let R denote the third
point of intersection (defined to be O if the line is vertical). Then P + Q is defined as the reflection of R
about the x axis (where the reflection of O is O). If one of P or Q is O, we draw a vertical line through the
other point, giving the result that P + O = P : O acts as the additive identity. Thus we define O + O = O.
Note that reflection about the x axis corresponds to negation, so we can think of the rule as saying that the
three points of intersection of any line with the curve sum to 0.
[Figure: the chord through P and Q intersects the curve at a third point R; reflecting R about the x axis gives P + Q.]
It turns out that this law makes E_{a,b} into an abelian group for which the identity is O and the inverse of
P = (x, y) is −P = (x, −y). By definition, it is clear that (E_{a,b}, +) is abelian (the line through P and Q does
not depend on which point is chosen first) and closed (we always choose P + Q to be some point on the curve).
The only remaining group axiom to check is associativity: we must show that (P + Q) + T = P + (Q + T).
Using a diagram of a typical curve, and picking three arbitrary points, you should be able to convince yourself
that associativity appears to hold. Actually proving it in these geometric terms requires a little algebraic
geometry.
For calculations, it is helpful to produce an algebraic description of the definition of elliptic curve point
addition. Let P = (x_P, y_P) and Q = (x_Q, y_Q). The slope of the line through P and Q (with P ≠ Q) is
λ = (y_Q − y_P)/(x_Q − x_P).  (7.4)
Thus the set of points (x, y) on this line satisfies y = λx + y₀, where y₀ = y_P − λx_P. Substituting this into (7.1)
gives the equation
x³ − λ²x² + (a − 2λy₀)x + b − y₀² = 0,  (7.5)
and solving this equation with the cubic formula shows that x_P + x_Q + x_R = λ². Thus we have
x_{P+Q} = x_R  (7.6)
= λ² − x_P − x_Q  (7.7)
= ((y_Q − y_P)/(x_Q − x_P))² − x_P − x_Q  (7.8)
and
y_{P+Q} = −y_R  (7.9)
= −(λ x_{P+Q} + y₀)  (7.10)
= λ(x_P − x_{P+Q}) − y_P  (7.11)
= ((y_Q − y_P)/(x_Q − x_P)) (x_P − x_{P+Q}) − y_P.  (7.12)
A similar formula can be derived for the case where P = Q (i.e., we are computing 2P). It is straightforward
to compute the slope of the tangent to the curve at P; if y_P = 0 then the slope is infinite, so 2P = O, but
otherwise
λ = (3x_P² + a)/(2y_P).  (7.13)
The rest of the calculation proceeds as before.
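The chord-and-tangent addition law translates directly into code for a curve over F_p (a sketch; we represent O by None and use Python's modular inverse, pow(x, -1, p), available since Python 3.8):

```python
# Chord-and-tangent point addition on the elliptic curve y^2 = x^3 + a*x + b
# over F_p, with the point at infinity O represented by None.
def ec_add(P, Q, a, p):
    if P is None:
        return Q
    if Q is None:
        return P
    (xP, yP), (xQ, yQ) = P, Q
    if xP == xQ and (yP + yQ) % p == 0:
        return None                              # vertical line: P + (-P) = O
    if P == Q:
        lam = (3 * xP * xP + a) * pow(2 * yP, -1, p) % p   # tangent slope
    else:
        lam = (yQ - yP) * pow(xQ - xP, -1, p) % p          # chord slope
    xR = (lam * lam - xP - xQ) % p
    yR = (lam * (xP - xR) - yP) % p
    return (xR, yR)

# Spot checks against the point list of the curve over F_7 given above:
assert ec_add((0, 1), (1, 0), -2, 7) == (0, 6)
assert ec_add((0, 1), (0, 1), -2, 7) == (1, 0)   # doubling
assert ec_add((1, 0), (1, 0), -2, 7) is None     # 2P = O when y_P = 0
```

Note that every output is again a point of the curve (or O), illustrating closure of the group law.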
simply multiplication in our additive notation), we can define analogs of Diffie-Hellman key exchange and
related cryptosystems such as ElGamal. The security of such a cryptosystem then relies on the assumption
that the discrete log problem on ⟨g⟩ is hard.
In practice, there are many details to consider when choosing an elliptic curve for cryptographic purposes.
Algorithms are known for calculating discrete log on supersingular and anomalous curves that run faster
than algorithms for the general case, so such curves should be avoided. Also, g should be chosen to be a
point of high order; ideally, the elliptic curve group should be cyclic, and g should be a generator. Such
curves can be found efficiently, and in the general case, it is not known how to solve the discrete log problem
over an elliptic curve classically any faster than by general methods (e.g., Pollard's rho algorithm), which
run in time O(√p).
Chapter 8
Quantum algorithms for number fields

In this and the next lecture, we will explore a natural extension of the abelian hidden subgroup problem,
namely an algorithm discovered by Hallgren for solving a quadratic diophantine equation known as Pell's
equation [54, 59]. This algorithm is interesting for at least two reasons. First, it gives an application of
quantum algorithms to a new area of mathematics, algebraic number theory (and indeed, subsequent work
has shown that quantum computers can also efficiently solve other problems in this area). Second, it extends
the solution of the abelian HSP to the case of an infinite group, namely the real numbers.
There are two main parts to the quantum algorithm for solving Pells equation. First, we define a periodic
function whose period encodes the solution to the problem. To define this function, we must introduce some
notions from algebraic number theory. Second, we show how to find the period of a black-box function
defined over the real numbers even when the period is irrational.
x² − dy² = 1  (8.1)
is known as Pell's equation. Amusingly, Pell had nothing whatsoever to do with the equation. The misattribution
is apparently due to Euler, who confused Pell with a contemporary, Brouncker, who had actually
worked on the equation. In fact, Pell's equation was studied in ancient India, where (inefficient) methods
for solving it were developed about a millennium before Pell.
The left hand side of Pell's equation can be factored as
x² − dy² = (x + y√d)(x − y√d).  (8.2)
Note that a solution (x, y) ∈ Z² of the equation can be encoded uniquely as the real number x + y√d: since
√d is irrational, x + y√d = w + z√d if and only if (x, y) = (w, z). (Otherwise we could write √d = (x − w)/(z − y), which is rational.) Thus we can also
refer to the number x + y√d as a solution of Pell's equation.
There is clearly no loss of generality in restricting our attention to positive solutions of the equation,
namely those for which x > 0 and y > 0. It is straightforward to show that if x₁ + y₁√d is a positive
solution, then (x₁ + y₁√d)ⁿ is also a positive solution for any n ∈ N. In fact, one can show that all positive
solutions are obtained in this way, where x₁ + y₁√d is the fundamental solution, the smallest positive solution
of the equation. Thus, even though Pell's equation has an infinite number of solutions, we can in a sense
find them all by finding the fundamental solution.
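The claim that powers of a solution are again solutions is easy to check numerically, using the multiplication rule (x₁ + y₁√d)(x₂ + y₂√d) = (x₁x₂ + d y₁y₂) + (x₁y₂ + y₁x₂)√d (a small sketch; the function names are ours):

```python
# Multiply solutions of Pell's equation represented as integer pairs (x, y),
# i.e. as real numbers x + y*sqrt(d), and check that powers remain solutions.
def pell_multiply(s1, s2, d):
    (x1, y1), (x2, y2) = s1, s2
    return (x1 * x2 + d * y1 * y2, x1 * y2 + y1 * x2)

def pell_power(sol, n, d):
    out = (1, 0)                      # the trivial solution 1 + 0*sqrt(d)
    for _ in range(n):
        out = pell_multiply(out, sol, d)
    return out

# d = 2: the fundamental solution (3, 2) generates (17, 12), (99, 70), ...
for n in range(1, 5):
    x, y = pell_power((3, 2), n, 2)
    assert x * x - 2 * y * y == 1
```

This works because the Pell form is multiplicative: (x² − dy²) of a product of two solutions is the product of their forms, each equal to 1.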
Some examples of fundamental solutions for various values of d are shown in the following table. Notice
that while the size of the fundamental solution generally increases with increasing d, the behavior is far
from monotonic: for example, x₁ has 44 decimal digits when d = 6009, but only 11 decimal digits when
d = 6013. But it is possible for the solutions to be very large: the size of x₁ + y₁√d is only upper bounded
by 2^{O(√d log d)}. Thus it is not even possible to write down the fundamental solution with poly(log d) bits.
d      x₁                                               y₁
2      3                                                2
3      2                                                1
5      9                                                4
⋮
13     649                                              180
14     15                                               4
⋮
6009   131634010632725315892594469510599473884013975    1698114661157803451688949237883146576681644
       (≈ 1.3 × 10⁴⁴)                                   (≈ 1.6 × 10⁴²)
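The smaller table entries can be reproduced by brute-force search (a sanity check only; this search is exponentially slow for large d, which is precisely why a better algorithm is needed):

```python
from math import isqrt

# Find the fundamental (smallest positive) solution of x^2 - d*y^2 = 1
# by trying y = 1, 2, 3, ... until 1 + d*y^2 is a perfect square.
def pell_fundamental(d):
    y = 1
    while True:
        x2 = 1 + d * y * y
        x = isqrt(x2)
        if x * x == x2:
            return (x, y)
        y += 1

assert pell_fundamental(2) == (3, 2)
assert pell_fundamental(5) == (9, 4)
assert pell_fundamental(13) == (649, 180)
```

Since y₁ can be of size 2^{O(√d log d)}, this loop is hopeless for inputs like d = 6009; the quantum algorithm instead targets the regulator defined next.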
To get around this difficulty, we define the regulator of the fundamental solution,
R := ln(x₁ + y₁√d).  (8.3)
Since R = O(√d log d), we can write down ⌈R⌉ using O(log d) bits. Now R is an irrational number, so
determining only its integer part may seem unsatisfactory. But in fact, given the integer part of R, there is
a classical algorithm to compute n digits of R in time poly(log d, n). Thus it suffices to give an algorithm
that finds the integer part of R in time poly(log d). The best known classical algorithm for this problem
takes time 2^{O(√(log d log log d))} assuming the generalized Riemann hypothesis, or time O(d^{1/4} poly(log d)) with
no such assumptions.
You can easily check that this is a field with the usual addition and multiplication operations. We can also
define an operation called conjugation, defined by
ξ = x + y√d ↦ ξ̄ := x − y√d.  (8.5)
Conjugation of elements of Q(√d) has many of the same properties as complex conjugation, and indeed
Q(√d) behaves in many respects like C, with √d taking the place of the imaginary unit i = √−1. Defining
the ring Z[√d] ⊂ Q(√d) as
Z[√d] := {x + y√d : x, y ∈ Z},  (8.6)
we see that solutions of Pell's equation correspond to ξ ∈ Z[√d] satisfying ξξ̄ = 1.
Notice that any solution of Pell's equation, ξ ∈ Z[√d], has the property that its multiplicative inverse
over Q(√d), ξ⁻¹ = ξ̄/(ξξ̄) = ξ̄, is also an element of Z[√d]. In general, an element of a ring with an inverse
that is also an element of the ring is called a unit. In Z, the only units are ±1, but in other rings it is possible
to have more units. It should not be a surprise that the set of units of Z[√d] is closely related to the set of
solutions of Pell's equation. Specifically, we have
Proposition 8.1. ξ = x + y√d is a unit in Z[√d] if and only if ξξ̄ = x² − dy² = ±1.
Proof. We have
ξ⁻¹ = ξ̄/(ξξ̄) = (x − y√d)/(x² − dy²).  (8.7)
If x² − dy² = ±1, then clearly ξ⁻¹ ∈ Z[√d]. Conversely, if ξ⁻¹ ∈ Z[√d], then so is
ξ⁻¹ ξ̄⁻¹ = 1/(ξξ̄) = ((x − y√d)(x + y√d))/(x² − dy²)² = 1/(x² − dy²),  (8.8)
which lies in Z[√d] only if x² − dy² = ±1.
8.4 A periodic function for the units of Z[√d]
Principal ideals are useful because the function mapping the ring element ξ ∈ Z[√d] to the principal ideal
ξZ[√d] is periodic, and its periodicity corresponds to the units of Z[√d]. Specifically, we have
Proposition 8.2. ξZ[√d] = ζZ[√d] if and only if ξ = ζε where ε is a unit in Z[√d].
Proof. If ε is a unit, then ξZ[√d] = ζεZ[√d] = ζZ[√d], since εZ[√d] = Z[√d] by the definition of a unit.
Conversely, suppose that ξZ[√d] = ζZ[√d]. Since ξ ∈ ξZ[√d] = ζZ[√d], there is some
μ ∈ Z[√d] satisfying ξ = ζμ. Similarly, ζ ∈ ζZ[√d] = ξZ[√d], so there is some ν ∈ Z[√d] satisfying ζ = ξν.
Thus we have ξ = ξμν. This shows that μν = 1, so μ and ν are units (indeed, ν = μ⁻¹).
Thus the function g(ξ) = ξZ[√d] is (multiplicatively) periodic with period ε₁ = x₁ + y₁√d. In other words, letting
ξ = e^z, the function
h(z) = e^z Z[√d]  (8.9)
is (additively) periodic with period R. However, we cannot simply use this function, since it is not possible
to succinctly represent the values it takes.
To define a more suitable periodic function, Hallgren uses the concept of a reduced ideal, and a way of
measuring the distance between principal ideals. The definition of a reduced ideal is rather technical, and
we will not go into the details. For our purposes, it is sufficient to note that there are only finitely many
reduced principal ideals, and in fact only O(d) of them, so we can represent a reduced principal ideal using
poly(log d) bits.
Hallgren also uses a function that measures the distance of any principal ideal from the unit ideal, Z[√d].
This function is defined as
δ(ξZ[√d]) := ln|ξ/ξ̄| mod R.  (8.10)
Notice that the unit ideal has distance δ(1·Z[√d]) = ln|1/1| mod R = 0, as required. Furthermore, the
distance function does not depend on which generator we choose to represent an ideal, since (by the above
proposition) two equivalent ideals have generators that differ by some unit ε, and
δ(εZ[√d]) = ln|ε/ε̄| mod R = ln|ε²| mod R = 2 ln|ε| mod R = 0,  (8.11)
using ε̄ = ±ε⁻¹ and the fact that ln|ε| is an integer multiple of R.
With this definition of distance, one can show that the reduced ideals are not too far apart, so that there is
a reduced ideal close to any non-reduced ideal.
The periodic function used in Hallgren's algorithm, f(z), is defined as the reduced principal ideal whose
distance from the unit ideal is maximal among all reduced principal ideals of distance at most z (together
with its distance from z mod R, to ensure that the function is one-to-one within each period). In other
words, we select the reduced principal ideal to the left of or at z.
This function is periodic with period R, and can be computed in time poly(log d). Thus it remains to
show how to perform period finding when the period of the function might be irrational.
Chapter 9
Period finding from Z to R
In the previous chapter, we defined a periodic function over R whose period is an irrational number (the
regulator) encoding the solutions of Pell's equation. Here we review Shor's approach to period finding, and
show how it can be adapted to find an irrational period.
(1/√N) Σ_{x∈{0,…,N−1}} |x⟩ ↦ (1/√N) Σ_{x∈{0,…,N−1}} |x, f(x)⟩.  (9.1)
Next we measure the second register, leaving the first register in a uniform superposition over those values
consistent with the measurement outcome. When f is periodic with minimum period r, we obtain a
superposition over points separated by the period r. The number of such points, n, depends on where the
first point, x₀ ∈ {0, 1, …, r − 1}, appears. When restricted to {0, 1, …, N − 1}, the function has ⌊N/r⌋ full
periods and N − r⌊N/r⌋ remaining points, as depicted below. Thus n = ⌊N/r⌋ + 1 if x₀ < N − r⌊N/r⌋ and
n = ⌊N/r⌋ otherwise.
[Diagram: the domain {0, 1, …, N − 1} contains ⌊N/r⌋ full periods of length r, beginning at offset x₀, followed by N − r⌊N/r⌋ remaining points.]
Discarding the measurement outcome, we are left with the quantum state
(1/√n) Σ_{j=0}^{n−1} |x₀ + jr⟩  (9.2)
where x₀ occurs nearly uniformly at random (it appears with probability n/N) and is unknown. To obtain
information about the period, we apply the Fourier transform over Z_N, giving
(1/√(nN)) Σ_{j=0}^{n−1} Σ_{k∈Z_N} ω_N^{k(x₀ + jr)} |k⟩ = (1/√(nN)) Σ_{k∈Z_N} ω_N^{kx₀} Σ_{j=0}^{n−1} ω_N^{jkr} |k⟩.  (9.3)
Now if we were lucky enough to choose a value of N for which r | N, then in fact n = N/r regardless of the
value of x₀, and the sum over j above is
Σ_{j=0}^{n−1} ω_N^{jkr} = Σ_{j=0}^{n−1} ω_n^{jk},  (9.4)
and measurement of k is guaranteed to give an integer multiple of n = N/r, with each of the r multiples
occurring with probability 1/r. But more generally, the sum over j in (9.3) is the geometric series
Σ_{j=0}^{n−1} ω_N^{jkr} = (ω_N^{krn} − 1)/(ω_N^{kr} − 1)  (9.7)
= ω_N^{(n−1)kr/2} · sin(πkrn/N)/sin(πkr/N).  (9.8)
The probability of seeing a particular value k is given by the normalization factor 1/nN times the magnitude
squared of this sum, namely
Pr(k) = sin²(πkrn/N)/(nN sin²(πkr/N)).  (9.9)
From the case where n = N/r, we expect this distribution to be strongly peaked around values of k that are
close to integer multiples of N/r. The probability of seeing k = ⌊jN/r⌉ = jN/r + η for some j ∈ Z, where
⌊x⌉ denotes the nearest integer to x and η is the rounding error, is
Pr(k = ⌊jN/r⌉) = sin²(πηrn/N)/(nN sin²(πηr/N)).  (9.10)
Now, to upper bound the denominator, we use sin²x ≤ x². To lower bound the numerator, observe that
since |η| ≤ 1/2 and rn/N ≤ 1 + O(1/n), we have |πηrn/N| ≤ π/2 + O(1/n); thus sin²(πηrn/N) ≥ c(πηrn/N)² for some
constant c (in particular, we can take c ≈ 4/π² for large n). Thus we have
Pr(k = ⌊jN/r⌉) ≥ c(πηrn/N)²/(nN (πηr/N)²)  (9.12)
= cn/N  (9.13)
≈ c/r.  (9.14)
This bound shows that Fourier sampling produces a value of k that is the closest integer to one of the r
integer multiples of N/r with probability lower bounded by a constant.
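The peaked shape of this distribution is easy to verify numerically for small parameters (a sketch; we compute the Fourier amplitudes directly from the state (9.2) rather than via a quantum simulation):

```python
import cmath

# Probability distribution over k after Fourier sampling the state
# (1/sqrt(n)) sum_j |x0 + j*r>, whether or not r divides N.
def fourier_probs(N, r, x0=0):
    n = len(range(x0, N, r))                 # number of points x0 + j*r < N
    probs = []
    for k in range(N):
        amp = sum(cmath.exp(2j * cmath.pi * k * x / N)
                  for x in range(x0, N, r)) / (n * N) ** 0.5
        probs.append(abs(amp) ** 2)
    return probs

# With N = 16 and r = 4 (so r | N), all weight sits on multiples of N/r = 4.
probs = fourier_probs(16, 4)
assert all(abs(probs[k] - 0.25) < 1e-9 for k in (0, 4, 8, 12))
assert abs(sum(probs) - 1) < 1e-9
```

Trying a value of r that does not divide N (say fourier_probs(16, 5)) shows the same peaks near multiples of N/r, now with small but nonzero weight elsewhere, matching the bound above.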
To discover r given one of the values ⌊jN/r⌉, we can divide by N to obtain a rational approximation to
j/r that deviates by at most 1/2N. Then consider the continued fraction expansion
⌊jN/r⌉/N = 1/(a₁ + 1/(a₂ + 1/(a₃ + ⋯))).  (9.15)
Truncating this expansion after a finite number of terms gives a convergent of the expansion. The convergents
provide a sequence of successively better approximations to ⌊jN/r⌉/N by fractions that can be computed
in polynomial time (see for example Knuth's The Art of Computer Programming, volume 2). Furthermore,
it can be shown that any fraction p/q with |p/q − ⌊jN/r⌉/N| < 1/2q² will appear as one of the convergents
(see for example Hardy and Wright, Theorem 184). Since j/r differs by at most 1/2N from ⌊jN/r⌉/N, the
fraction j/r will appear as a convergent provided r² < N. By taking N sufficiently large, this gives an
efficient means of recovering the period.
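The recovery of r from a sample k = ⌊jN/r⌉ can be sketched as follows (hypothetical helper names; we assume gcd(j, r) = 1, so that j/r appears in lowest terms, and take the convergent with the largest denominator q satisfying q² < N):

```python
# Recover the period r from a Fourier sample k close to j*N/r, by computing
# the convergents of the continued fraction expansion of k/N.
def convergents(p, q):
    """All convergents of the continued fraction expansion of p/q."""
    quots = []
    while q:
        quots.append(p // q)
        p, q = q, p % q
    out, (h0, h1, k0, k1) = [], (0, 1, 1, 0)
    for a in quots:
        h0, h1 = h1, a * h1 + h0     # standard convergent recurrences
        k0, k1 = k1, a * k1 + k0
        out.append((h1, k1))
    return out

def recover_period(k, N):
    best = 1
    for _, q in convergents(k, N):
        if q * q < N:                # j/r is guaranteed to appear when r^2 < N
            best = q
    return best

# Example: r = 10, N = 256, j = 3 gives the sample k = round(3*256/10) = 77.
assert convergents(77, 256)[-1] == (77, 256)
assert recover_period(77, 256) == 10
```

Here 3/10 indeed appears among the convergents of 77/256, and it is the last one whose denominator squared stays below N.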
Now measuring the second register gives, with constant probability, a value for which f is pseudo-periodic.
Say that this value is f(x₀) where 0 ≤ x₀ < r. As before, we see n = ⌊N/r⌋ + 1 points if x₀ < N − r⌊N/r⌋
or n = ⌊N/r⌋ points otherwise (possibly offset by 1 depending on how the rounding occurs for the largest
value of x, but let's not be concerned with this detail). We will write [ℓ] to denote an integer that could be
either ⌊ℓ⌋ or ⌈ℓ⌉. With this notation, we obtain
  (1/√n) Σ_{j=0}^{n−1} |x₀ + [jr]⟩.  (9.17)
We would like this to be close to the corresponding sum in the case where the offsets δ_j := [jr] − jr are zero (which,
when normalized, is Ω(1/√r) by the same calculation as in the case of period finding over Z). Consider the
deviation in amplitude,

  Σ_{j=0}^{n−1} |ω_N^{k[jr]} − ω_N^{kjr}| = Σ_{j=0}^{n−1} |ω_N^{kδ_j} − 1|  (9.20)
  = 2 Σ_{j=0}^{n−1} |sin(πkδ_j/N)|  (9.21)
  ≤ 2 Σ_{j=0}^{n−1} πk|δ_j|/N  (9.22)
  ≤ 2πkn/N.  (9.23)
At least insofar as this bound is concerned, the amplitudes may not be close for all values of k. However,
suppose we only consider values of k less than N/ log r. (We will obtain such a k with probability about
1/ log r, so we can condition on this event with only polynomial overhead.) For such a k, we have
  (1/√(nN)) |Σ_{j=0}^{n−1} ω_N^{k[jr]}| = Ω(1/√r) − O(n/(√(nN) log r))  (9.24)
  = Ω(1/√r) − O(1/(√r log r))  (9.25)
  = Ω(1/√r).  (9.26)
Thus, as in the case of period finding over Z, Fourier sampling allows us to sample from a distribution for
which some value k = ⌊jN/r⌉ (with j ∈ Z) appears with reasonably large probability (now Ω(1/poly(log r))
instead of Ω(1)).
Finally, we must obtain an approximation to r using these samples. Since r is not an integer, the
procedure used in Shor's period-finding algorithm does not suffice. However, we can perform Fourier sampling
sufficiently many times that we obtain two values ⌊jN/r⌉, ⌊j′N/r⌉ such that j and j′ are relatively prime,
again with only polynomial overhead. We prove below that if N ≥ 3r², then j/j′ is guaranteed to be one
of the convergents in the continued fraction expansion of ⌊jN/r⌉/⌊j′N/r⌉. Thus we can learn j, and hence
compute jN/⌊jN/r⌉, which gives a good approximation to r: in particular, |r − jN/⌊jN/r⌉| ≤ 1.
Lemma 9.1. If N ≥ 3r², then j/j′ appears as a convergent in the continued fraction expansion of ⌊jN/r⌉/⌊j′N/r⌉.
Furthermore, |r − jN/⌊jN/r⌉| ≤ 1.
Proof. A standard result on the theory of approximation by continued fractions says that if a, b ∈ Z with
|x − a/b| ≤ 1/(2b²), then a/b appears as a convergent in the continued fraction expansion of x (see for example
Hardy and Wright, An Introduction to the Theory of Numbers, Theorem 184). Thus it is sufficient to show
that

  |⌊jN/r⌉/⌊j′N/r⌉ − j/j′| < 1/(2j′²).  (9.27)
Letting ⌊jN/r⌉ = jN/r + δ and ⌊j′N/r⌉ = j′N/r + ε with |δ|, |ε| ≤ 1/2, we have

  ⌊jN/r⌉/⌊j′N/r⌉ − j/j′ = (jN/r + δ)/(j′N/r + ε) − j/j′  (9.28)
  = (jN + rδ)/(j′N + rε) − j/j′  (9.29)
  = r(δj′ − εj)/(j′(j′N + rε)),  (9.30)

so

  |⌊jN/r⌉/⌊j′N/r⌉ − j/j′| ≤ r(j + j′)/(2j′²N − j′r)  (9.31)
  ≤ r/(j′N − r/2),  (9.32)

where in the last step we have assumed j < j′ wlog. This is upper bounded by 1/(2j′²) provided j′N ≥
r/2 + 2j′²r, which certainly holds if N ≥ 3r² (using the fact that j′ < r).
Finally,

  r − jN/⌊jN/r⌉ = r − jN/(jN/r + δ)  (9.33)
  = r − jNr/(jN + rδ)  (9.34)
  = r²δ/(jN + rδ).  (9.35)

Since |δ| ≤ 1/2 and jN ≥ N ≥ 3r², this quantity is at most 1 in absolute value, as claimed.
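Lemma 9.1 can be exercised numerically (all values below are arbitrary test choices of ours): pick a non-integer period r, take N ≥ 3r², generate two samples ⌊jN/r⌉ and ⌊j′N/r⌉ with j, j′ coprime, and check that j/j′ appears among the convergents of their ratio and that jN/⌊jN/r⌉ is within 1 of r.

```python
from fractions import Fraction
from math import floor

def convergents(x):
    """Continued fraction convergents of a nonnegative rational x, as Fractions."""
    p0, q0, p1, q1 = 0, 1, 1, 0
    while True:
        a = floor(x)
        p0, p1 = p1, a * p1 + p0
        q0, q1 = q1, a * q1 + q0
        yield Fraction(p1, q1)
        if x == a:
            return
        x = 1 / (x - a)

r = 12.34          # non-integer period (illustrative)
N = 512            # N >= 3 r^2 = 456.8...
j, jp = 5, 7       # coprime sample indices
k = round(j * N / r)    # = 207
kp = round(jp * N / r)  # = 290

convs = list(convergents(Fraction(k, kp)))
assert Fraction(j, jp) in convs   # j/j' appears as a convergent of k/k'
r_estimate = j * N / k            # = jN / round(jN/r)
assert abs(r - r_estimate) <= 1   # within 1 of the true period
```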
Chapter 10

Quantum query complexity of the HSP

So far, we have considered the hidden subgroup problem in abelian groups. We now turn to the case where
the group might be nonabelian. We will look at some of the potential applications of the HSP, and then
show that the general problem has polynomial quantum query complexity.
In other words, f is constant on left cosets H, g1 H, g2 H, . . . of H in G, and distinct on different left cosets.
When G is a nonabelian group, we refer to this problem as the nonabelian HSP.
The nonabelian HSP is of interest not only because it generalizes the abelian case in a natural way, but
because a solution of certain nonabelian hidden subgroup problems would have particularly useful applica-
tions. The most well-known (and also the most straightforward) applications are to the graph automorphism
problem and the graph isomorphism problem, problems for which no efficient classical algorithm is currently
known.
In the graph automorphism problem, we are given a graph Γ on n vertices, and the goal is to determine
whether it has some nontrivial automorphism. In other words, we would like to know whether there is any
nontrivial permutation π ∈ Sₙ such that π(Γ) = Γ. The automorphisms of Γ form a subgroup Aut Γ ≤ Sₙ;
if Aut Γ is trivial then we say Γ is rigid. We may cast the graph automorphism problem as an HSP over
Sₙ by considering the function f(π) := π(Γ), which hides Aut Γ. If we could solve the HSP in Sₙ, then by
checking whether or not the automorphism group is trivial, we could decide graph automorphism.
In the graph isomorphism problem, we are given two graphs Γ, Γ′, each on n vertices, and our goal is to
determine whether there is any permutation π ∈ Sₙ such that π(Γ) = Γ′, in which case we say that Γ and
Γ′ are isomorphic. We can cast graph isomorphism as an HSP in the wreath product Sₙ ≀ S₂ ≤ S₂ₙ, the
subgroup of S₂ₙ generated by permutations of the first n points, permutations of the second n points, and
swapping the two sets of points. Writing elements of Sₙ ≀ S₂ in the form (σ, τ, b) where σ, τ ∈ Sₙ represent
permutations of Γ, Γ′, respectively, and b ∈ {0, 1} denotes whether to swap the two graphs, we can define a
function

  f(σ, τ, b) := (σ(Γ), τ(Γ′)) if b = 0;  (σ(Γ′), τ(Γ)) if b = 1.  (10.2)

This function hides the automorphism group of the disjoint union of Γ and Γ′, which contains an element
that swaps the two graphs if and only if they are isomorphic. In particular, if Γ and Γ′ are rigid (which
seems to be the hardest case for the HSP approach to graph isomorphism), the hidden subgroup is trivial
when Γ, Γ′ are non-isomorphic; and has order two, with its nontrivial element the involution (σ, σ⁻¹, 1), when
Γ′ = σ(Γ).
The second major potential application of the hidden subgroup problem is to lattice problems. An n-
dimensional lattice is the set of all integer linear combinations of n linearly independent vectors in Rn (a basis
for the lattice). In the shortest vector problem, we are asked to find a shortest nonzero vector in the lattice.
In particular, in the g(n)-unique shortest vector problem, we are promised that the shortest nonzero vector is
unique (up to its sign), and is shorter than any other non-parallel vector by a factor g(n). This problem can
be solved in polynomial time on a classical computer if g(n) is sufficiently large (say, if it is exponentially
large), and is NP-hard if g(n) = O(1). Less is known about intermediate cases, but the problem is suspected
to be classically hard even for g(n) = poly(n), to the extent that cryptosystems have been designed based
on this assumption.
Regev showed that an efficient quantum algorithm for the dihedral hidden subgroup problem based on
the so-called standard method (described below) could be used to solve the poly(n)-unique shortest vector
problem. Such an algorithm would be significant since it would break lattice cryptosystems, which are some
of the few proposed cryptosystems that are not compromised by Shor's algorithm.
So far, only the symmetric and dihedral hidden subgroup problems are known to have significant ap-
plications. Nevertheless, there has been considerable interest in understanding the complexity of the HSP
for general groups. There are at least three reasons for this. First, the problem is simply of fundamental
interest: it appears to be a natural setting for exploring the extent of the advantage of quantum computers
over classical ones. Second, techniques developed for other HSPs may eventually find application to the sym-
metric or dihedral groups. Finally, exploring the limitations of quantum computers for HSPs may suggest
cryptosystems that could be robust even to quantum attacks.
We then compute the value f (g) in an ancilla register, giving the state
  (1/√|G|) Σ_{g∈G} |g, f(g)⟩.  (10.4)
Finally, we measure the second register and discard the result (or equivalently, simply discard the second
register). If we obtain the outcome s ∈ S, then the state is projected onto the uniform superposition of those
g ∈ G such that f(g) = s, which by the definition of f is simply some left coset of H. Since every coset
contains the same number of elements, each left coset occurs with equal probability. Thus this procedure
produces the coset state

  |gH⟩ := (1/√|H|) Σ_{h∈H} |gh⟩, with g ∈ G uniformly random  (10.5)
(or, equivalently, we can view g as being chosen uniformly at random from some left transversal of H in G).
Depending on context, it may be more convenient to view the outcome either as a random pure state, or
equivalently, as the mixed quantum state
  ρ_H := (1/|G|) Σ_{g∈G} |gH⟩⟨gH|,  (10.6)
which we refer to as a hidden subgroup state. In the standard approach to the hidden subgroup problem, we
attempt to determine H using samples of this hidden subgroup state. In other words, given ρ_H^{⊗k} for some
k = poly(log |G|), we try to find a generating set for H.
In fact, by the minimax theorem, this holds even without assuming a prior distribution for the ensemble.
Given only one copy of the hidden subgroup state, (10.8) will typically give only a trivial bound. However,
by taking multiple copies of the hidden subgroup states, we can ensure that the overall states are nearly
orthogonal, and hence distinguishable. In particular, using k copies of ρ, we see that there is a measurement
for identifying ρ with probability at least

  1 − N √(max_{i≠j} F(ρ_i^{⊗k}, ρ_j^{⊗k})) = 1 − N √(max_{i≠j} F(ρ_i, ρ_j)^k)  (10.9)

(since the fidelity is multiplicative under tensor products). Setting this expression equal to 1 − ε and solving
for k, we see that error probability at most ε can be achieved provided we use

  k ≥ 2(log N − log ε) / log(1/max_{i≠j} F(ρ_i, ρ_j))  (10.10)

copies of ρ.
Provided that G does not have too many subgroups, and that the fidelity between two distinct hidden
subgroup states is not too close to 1, this shows that polynomially many copies of ρ_H suffice to solve the
HSP. The total number of subgroups of G is 2^{O(log² |G|)}, which can be seen as follows. Any group K can be
specified in terms of at most log₂ |K| generators, since every additional (non-redundant) generator increases
the size of the group by at least a factor of 2. Since every subgroup of G can be specified by a subset of
at most log₂ |G| elements of G, the number of subgroups of G is upper bounded by |G|^{log₂ |G|} = 2^{(log₂ |G|)²}.
This shows that we can take log N = poly(log |G|) in (10.10). Thus k = poly(log |G|) copies of ρ_H suffice
to identify H with constant probability provided the maximum fidelity is bounded away from 1 by at least
1/poly(log |G|).
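The subgroup-counting bound can be checked by brute force on a tiny example of our choosing, G = S₃: a nonempty subset of a finite group that is closed under multiplication is automatically a subgroup, so it suffices to test closure.

```python
from itertools import permutations, combinations
from math import log2

# Brute-force check of  #subgroups(G) <= |G|^{log2 |G|}  for G = S_3.
G = list(permutations(range(3)))

def mul(p, q):
    # composition of permutations of {0,1,2}: (p*q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(3))

def closed(S):
    return all(mul(a, b) in S for a in S for b in S)

subgroups = [set(S) for size in (1, 2, 3, 6)        # divisors of |G| = 6
             for S in combinations(G, size) if closed(set(S))]
assert len(subgroups) == 6        # {e}, three order-2 subgroups, A_3, S_3
assert len(subgroups) <= 6 ** log2(6)
```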
To upper bound the fidelity between two states ρ, ρ′, consider the two-outcome measurement {Π, 1 − Π},
where Π is the projector onto the support of ρ. The classical fidelity of the resulting distributions is an
upper bound on the quantum fidelity, so

  F(ρ, ρ′) ≤ √(tr(Πρ) tr(Πρ′)) + √(tr((1 − Π)ρ) tr((1 − Π)ρ′))  (10.11)
  = √(tr(Πρ′)),  (10.12)

since tr(Πρ) = 1 and tr((1 − Π)ρ) = 0.
  ρ_H = (1/|G|) Σ_{g∈G} |gH⟩⟨gH| = (|H|/|G|) Σ_{g∈T_H} |gH⟩⟨gH|,  (10.13)

where T_H denotes some left transversal of H in G. Since the right hand expression is a spectral decomposition
of ρ_H (the states |gH⟩ for g ∈ T_H are orthonormal), the projector onto the support of ρ_H is

  Π_H = Σ_{g∈T_H} |gH⟩⟨gH| = (1/|H|) Σ_{g∈G} |gH⟩⟨gH|.  (10.14)
Then we have

  F(ρ_H, ρ_{H′})² ≤ tr(Π_H ρ_{H′})  (10.15)
  = (1/(|H| |G|)) Σ_{g,g′∈G} |⟨gH|g′H′⟩|²  (10.16)
  = (1/(|H| |G|)) Σ_{g,g′∈G} |gH ∩ g′H′|²/(|H| |H′|)  (10.17)
  = (1/(|G| |H|² |H′|)) Σ_{g,g′∈G} |gH ∩ g′H′|².  (10.18)
Now gH ∩ g′H′ is nonempty if and only if g′ ∈ gHH′, in which case it is a left coset of H ∩ H′, so

  Σ_{g,g′∈G} |gH ∩ g′H′|² = |G| |HH′| |H ∩ H′|².  (10.22)
Thus, using the identity |HH′| = |H| |H′|/|H ∩ H′|, we have

  F(ρ_H, ρ_{H′})² ≤ |G| |H| |H′| |H ∩ H′| / (|G| |H|² |H′|)  (10.24)
  = |H ∩ H′|/|H|  (10.25)
  ≤ 1/2,  (10.26)

where the last step holds because H ∩ H′ is a proper subgroup of H (exchanging the roles of H and H′ if
necessary, which we may do by the symmetry of the fidelity).
This shows that F(ρ_H, ρ_{H′}) ≤ 1/√2 for any distinct subgroups H, H′, thereby establishing that the query complexity of the HSP is poly(log |G|).
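The chain of (in)equalities above can be verified numerically on a small example of our own, G = Z₂ × Z₂ with H = ⟨(0,1)⟩ and H′ = ⟨(1,0)⟩: the overlap tr(Π_H ρ_{H′}) equals |H ∩ H′|/|H| = 1/2, and the Uhlmann fidelity respects the 1/√2 bound.

```python
import numpy as np
from itertools import product

# Hidden subgroup states for G = Z_2 x Z_2 (our worked example).
G = list(product(range(2), repeat=2))
idx = {g: i for i, g in enumerate(G)}

def coset_state(g, H):
    v = np.zeros(len(G))
    for h in H:
        v[idx[((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)]] = 1
    return v / np.sqrt(len(H))

def rho(H):
    return sum(np.outer(coset_state(g, H), coset_state(g, H)) for g in G) / len(G)

def sqrtm_psd(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

H1, H2 = [(0, 0), (0, 1)], [(0, 0), (1, 0)]
r1, r2 = rho(H1), rho(H2)

w, V = np.linalg.eigh(r1)
P = V[:, w > 1e-12] @ V[:, w > 1e-12].T          # projector onto supp(rho_H1)
assert np.isclose(np.trace(P @ r2), 0.5)         # tr(Pi_H rho_H') = |H∩H'|/|H|

# Uhlmann fidelity F = || sqrt(r1) sqrt(r2) ||_1 (sum of singular values)
F = np.linalg.svd(sqrtm_psd(r1) @ sqrtm_psd(r2), compute_uv=False).sum()
assert F <= 1 / np.sqrt(2) + 1e-9                # fidelity bound (10.26)
```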
Chapter 11

Fourier analysis in nonabelian groups
We have seen that hidden subgroup states contain sufficient information to determine the hidden subgroup.
Now we would like to know whether this information can be extracted efficiently. In this lecture, we will
introduce the theory of Fourier analysis over general groups, an important tool for getting a handle on this
problem.
Another way to combine two representations is with the tensor product. The tensor product of σ: G → V
and σ′: G → V′ is σ ⊗ σ′: G → V ⊗ V′, a representation of G of dimension d_{σ⊗σ′} = d_σ d_{σ′}.
The character of a representation σ is the function χ_σ: G → C defined by χ_σ(x) := tr σ(x). We have
χ_σ(1) = d_σ (since σ(1) is I_{d_σ}, the d_σ-dimensional identity matrix),
χ_σ(x⁻¹) = χ_σ(x)* (since we can assume that σ is unitary), and
χ_σ(yx) = χ_σ(xy) for all x, y ∈ G (since the trace is cyclic).
In particular, χ_σ(yxy⁻¹) = χ_σ(x), so characters are constant on conjugacy classes. For two representations
σ, σ′, we have χ_{σ⊕σ′} = χ_σ + χ_{σ′} and χ_{σ⊗σ′} = χ_σ χ_{σ′}.
The most useful result in representation theory is probably Schur's Lemma, which can be stated as
follows:
Theorem 11.1 (Schur's Lemma). Let σ and σ′ be two irreducible representations of G, and let M ∈ C^{d_σ × d_{σ′}}
be a matrix satisfying σ(x)M = Mσ′(x) for all x ∈ G. Then if σ ≇ σ′, M = 0; and if σ = σ′, M is a scalar
multiple of the identity matrix.
Schurs Lemma can be used to prove the following orthogonality relation for irreducible representations:
Theorem 11.2 (Orthogonality of irreps). For two irreps σ and σ′ of G, we have

  (d_σ/|G|) Σ_{x∈G} σ(x)*_{i,j} σ′(x)_{i′,j′} = δ_{σ,σ′} δ_{i,i′} δ_{j,j′}.  (11.2)
The characters of G supply an orthonormal basis for the space of class functions, functions that are constant
on conjugacy classes of G. (Recall that the characters themselves are class functions.) This is expressed by
the orthonormality of the character table of G, the square matrix whose rows are labeled by irreps, whose
columns are labeled by conjugacy classes, and whose entries are the corresponding characters. The character
orthogonality theorem says that the rows of this matrix are orthonormal, provided each entry is weighted
by the square root of the size of the corresponding conjugacy class divided by |G|. In fact the columns are
orthonormal in the same sense.
Any representation of G can be broken up into its irreducible components. The regular representations
of G are useful for understanding such decompositions, since they contain every possible irreducible representation
of G, with each irrep occurring a number of times equal to its dimension. Let Ĝ denote a complete
set of irreps of G (which are unique up to isomorphism). Then we have

  L ≅ ⊕_{σ∈Ĝ} σ ⊗ I_{d_σ},  R ≅ ⊕_{σ∈Ĝ} I_{d_σ} ⊗ σ*.  (11.4)

In fact, this holds with the same isomorphism for both L and R, since the left and right regular representations
commute. This isomorphism is simply the Fourier transform over G, which we discuss further below.
Considering χ_L(1) = χ_R(1) = |G| and using this decomposition, we find the well-known identity

  Σ_{σ∈Ĝ} d_σ² = |G|.  (11.5)

Also, noting that χ_L(x) = χ_R(x) = 0 for any x ∈ G \ {1}, we see that

  Σ_{σ∈Ĝ} d_σ χ_σ(x) = 0.  (11.6)
Characters also provide a simple test for irreducibility: for any representation σ, (χ_σ, χ_σ) is a positive integer,
and is equal to 1 if and only if σ is irreducible.
Any representation of G can also be viewed as a representation of any subgroup H G, simply by
restricting its domain to elements of H. We denote the resulting restricted representation by Res^G_H σ. Even
if σ is irreducible over G, it may not be irreducible over H.
(If σ is one-dimensional, then |σ(x)⟩ is simply a phase factor σ(x) = χ_σ(x) ∈ C with |σ(x)| = 1.) The Fourier
transform over G is the unitary matrix

  F_G := Σ_{x∈G} |x̂⟩⟨x|  (11.10)
  = Σ_{x∈G} Σ_{σ∈Ĝ} √(d_σ/|G|) Σ_{j,k=1}^{d_σ} σ(x)_{j,k} |σ, j, k⟩⟨x|.  (11.11)
Note that the Fourier transform over G is not uniquely defined, but rather, depends on a choice of basis for
each irreducible representation.
It is straightforward to check that F_G is indeed a unitary transformation. Using the identity
⟨σ(y)|σ(x)⟩ = χ_σ(y⁻¹x)/d_σ, we have

  ⟨ŷ|x̂⟩ = Σ_{σ∈Ĝ} (d_σ²/|G|) ⟨σ(y)|σ(x)⟩  (11.15)
  = Σ_{σ∈Ĝ} (d_σ/|G|) χ_σ(y⁻¹x),  (11.16)

which equals 1 if x = y (by (11.5)) and 0 otherwise (by (11.6)).
F_G is precisely the transformation that decomposes both the left and right regular representations of G
into their irreducible components. Let us check this explicitly for the left regular representation L. Recall
that this representation satisfies L(x)|y⟩ = |xy⟩, so we have

  F_G L(x) F_G† = Σ_{y∈G} Σ_{σ,σ′∈Ĝ} Σ_{j,k,ℓ=1}^{d_σ} Σ_{j′,k′=1}^{d_{σ′}} (√(d_σ d_{σ′})/|G|) σ(x)_{j,ℓ} σ(y)_{ℓ,k} σ′(y)*_{j′,k′} |σ, j, k⟩⟨σ′, j′, k′|  (11.20)
  = Σ_{σ∈Ĝ} Σ_{j,k,ℓ=1}^{d_σ} σ(x)_{j,ℓ} |σ, j, k⟩⟨σ, ℓ, k|  (11.21)
  = ⊕_{σ∈Ĝ} σ(x) ⊗ I_{d_σ},  (11.22)

where we have used the orthogonality relation for irreducible representations (Theorem 11.2) to evaluate
the sum over y.
A similar calculation can be done for the right regular representation defined by R(x)|y⟩ = |yx⁻¹⟩, giving

  F_G R(x) F_G† = ⊕_{σ∈Ĝ} I_{d_σ} ⊗ σ(x)*.  (11.23)

This identity will be useful when analyzing the application of the quantum Fourier transform to the hidden
subgroup problem.
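The decomposition (11.22) can be checked concretely for G = S₃ (this small worked example, including the particular basis chosen for the 2-dimensional irrep, is ours): build F_G from the trivial, sign, and standard representations, and verify that it is unitary and block-diagonalizes the left regular representation.

```python
import numpy as np
from itertools import permutations

G = list(permutations(range(3)))
n = len(G)  # |S_3| = 6

def perm_matrix(p):
    M = np.zeros((3, 3))
    for i in range(3):
        M[p[i], i] = 1
    return M

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                s = -s
    return s

# Orthonormal basis of the plane orthogonal to (1,1,1); restricting the
# permutation matrices to this plane gives the standard 2-dim irrep.
B = np.array([[1, 1], [-1, 1], [0, -2]]) / np.sqrt([2, 6])
irreps = [
    (1, lambda p: np.array([[1.0]])),
    (1, lambda p: np.array([[float(sign(p))]])),
    (2, lambda p: B.T @ perm_matrix(p) @ B),
]

# F_G[(sigma,j,k), x] = sqrt(d_sigma/|G|) * sigma(x)_{jk}, as in (11.11)
rows = []
for d, rep in irreps:
    for j in range(d):
        for k in range(d):
            rows.append([np.sqrt(d / n) * rep(x)[j, k] for x in G])
F = np.array(rows)
assert np.allclose(F @ F.conj().T, np.eye(n))   # F_G is unitary

def mul(x, y):
    return tuple(x[y[i]] for i in range(3))

def L(x):  # left regular representation: L(x)|y> = |xy>
    M = np.zeros((n, n))
    for col, y in enumerate(G):
        M[G.index(mul(x, y)), col] = 1
    return M

# Check (11.22): F L(x) F^dagger = blockdiag over irreps of sigma(x) ⊗ I_d
for x in G:
    expected = np.zeros((n, n))
    i = 0
    for d, rep in irreps:
        b = np.kron(rep(x), np.eye(d))
        expected[i:i + d * d, i:i + d * d] = b
        i += d * d
    assert np.allclose(F @ L(x) @ F.conj().T, expected)
```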
To use the Fourier transform as part of a quantum computation, we must be able to implement it efficiently
by some quantum circuit. Efficient quantum circuits for the quantum Fourier transform are known for many,
but not all, nonabelian groups. Groups for which an efficient QFT is known include metacyclic groups (i.e.,
semidirect products of cyclic groups), such as the dihedral group; the symmetric group; and many families
of groups that have suitably well-behaved towers of subgroups. There are a few notable groups for which
efficient QFTs are not known, such as the general linear group GL_n(q) of n × n invertible matrices over F_q,
the finite field with q elements.
Chapter 12
Fourier sampling
In this lecture, we will see how the Fourier transform can be used to simplify the structure of the states
obtained in the standard approach to the hidden subgroup problem. In particular, we will see how weak
Fourier sampling is sufficient to identify any normal hidden subgroup (generalizing the solution of the abelian
HSP). We will also briefly discuss the potential of strong Fourier sampling to go beyond the limitations of
weak Fourier sampling.
where each g ∈ G occurs uniformly at random; or equivalently, the hidden subgroup state

  ρ_H := (1/|G|) Σ_{g∈G} |gH⟩⟨gH|.  (12.2)
The symmetry of such a state can be exploited using the quantum Fourier transform. In particular, we
have

  |gH⟩ = (1/√|H|) Σ_{h∈H} R(h)|g⟩  (12.3)

where R is the right regular representation of G. Thus the hidden subgroup state can be written

  ρ_H = (1/(|G| |H|)) Σ_{g∈G} Σ_{h,h′∈H} R(h)|g⟩⟨g|R(h′)†  (12.4)
  = (1/(|G| |H|)) Σ_{h,h′∈H} R(hh′⁻¹)  (12.5)
  = (1/|G|) Σ_{h∈H} R(h).  (12.6)
Since the right regular representation is block-diagonal in the Fourier basis, the same is true of ρ_H. In
particular, we have

  ρ̂_H := F_G ρ_H F_G†  (12.7)
  = (1/|G|) ⊕_{σ∈Ĝ} I_{d_σ} ⊗ σ(H)*,  (12.8)

where

  σ(H) := Σ_{h∈H} σ(h).  (12.9)
Since ρ̂_H is block diagonal, with blocks labeled by irreducible representations, we may now measure
the irrep label without loss of information. This procedure is referred to as weak Fourier sampling. The
probability of observing representation σ ∈ Ĝ under weak Fourier sampling is

  Pr(σ) = (1/|G|) tr(I_{d_σ} ⊗ σ(H)*)  (12.10)
  = (d_σ/|G|) Σ_{h∈H} χ_σ(h)*  (12.11)
  = (d_σ |H|/|G|) (χ_σ, 1)_H,  (12.12)

or in other words, d_σ |H|/|G| times the number of times the trivial representation appears in Res^G_H σ, the
restriction of σ to H. We may now ask whether polynomially many samples from this distribution are
sufficient to determine H, and if so, whether H can be reconstructed from this information efficiently.
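The distribution (12.12) is easy to tabulate from a character table. As a small worked example of ours, take G = S₃ and the normal hidden subgroup H = A₃; weak Fourier sampling then returns the trivial and sign representations with probability 1/2 each, and never the standard representation.

```python
from fractions import Fraction

# Character table of S_3 by conjugacy class: e, transpositions, 3-cycles.
chars = {
    'trivial':  {'e': 1, 'transposition': 1, '3-cycle': 1},
    'sign':     {'e': 1, 'transposition': -1, '3-cycle': 1},
    'standard': {'e': 2, 'transposition': 0, '3-cycle': -1},
}
dims = {'trivial': 1, 'sign': 1, 'standard': 2}
order = 6
# How many elements of H = A_3 lie in each class: e once, both 3-cycles.
in_H = {'e': 1, 'transposition': 0, '3-cycle': 2}

def pr(sigma):
    s = sum(in_H[c] * chars[sigma][c] for c in in_H)  # sum over h in H of chi(h)
    return Fraction(dims[sigma] * s, order)           # (d_sigma/|G|) sum chi

dist = {s: pr(s) for s in chars}
assert sum(dist.values()) == 1
# H lies in ker(trivial) and ker(sign) but not in ker(standard):
assert dist == {'trivial': Fraction(1, 2), 'sign': Fraction(1, 2),
                'standard': Fraction(0)}
```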
When H is a normal subgroup of G, this distribution simplifies: Pr(σ) = d_σ² |H|/|G| if H ≤ ker σ, and
Pr(σ) = 0 otherwise,  (12.14)
where ker σ := {g ∈ G : σ(g) = I_{d_σ}} is the kernel of the representation σ (a normal subgroup of G). To see
this, note that if H ≰ ker σ, then there is some h₀ ∈ H with σ(h₀) ≠ 1; but then σ(h₀)σ(H) = Σ_{h∈H} σ(h₀h) =
σ(H), and since σ(h₀) is unitary and σ(H) is a scalar multiple of the identity (by Schur's Lemma, as σ(H)
commutes with σ(g) for every g ∈ G when H is normal), this can only be satisfied if in
fact σ(H) = 0. On the other hand, if H ≤ ker σ, then χ_σ(h) = d_σ for all h ∈ H, and the result is immediate.
To find H, we can simply proceed as in the abelian case: perform weak Fourier sampling O(log |G|) times
and compute the intersection of the kernels of the resulting irreps (assuming this can be done efficiently).
Again, it is clear that the resulting subgroup contains H, and we claim that it is equal to H with high
probability. For suppose that at some stage during this process, the intersection of the kernels is K ⊴ G with
K ≠ H; then the probability of obtaining an irrep σ for which K ≤ ker σ is

  Σ_{σ: K ≤ ker σ} (|H|/|G|) d_σ² = |H|/|K| ≤ 1/2,  (12.15)
where we have used the fact that the distribution (12.14) remains normalized if H is replaced by any normal
subgroup of G. Since each repetition of weak Fourier sampling has a probability of at least 1/2 of cutting
the intersection of the kernels at least in half, O(log |G|) repetitions suffice to converge to H with substantial
probability. In fact, applying the same approach when H is not necessarily normal in G gives an algorithm
to find the normal core of H, the largest subgroup of H that is normal in G.
This algorithm can be applied to find hidden subgroups in groups that are close to abelian in a certain
sense. In particular, Grigni et al. showed that if κ(G), the intersection of the normalizers of all subgroups of
G, is sufficiently large (specifically, if |G|/|κ(G)| = 2^{O(log^{1/2} n)}, such as when G = Z_3 ⋊ Z_{2^n}), then the HSP
in G can be solved in polynomial time [49]. The idea is simply to apply the algorithm for normal subgroups
to the restriction of G to all subgroups containing κ(G); the union of all subgroups obtained in this way
gives the hidden subgroup with high probability. This result was subsequently improved (by Gavinsky) to
give a polynomial-time quantum algorithm whenever |G|/|κ(G)| = poly(log |G|).
  ρ̂_{H,σ} := σ(H)* / Σ_{h∈H} χ_σ(h)*.  (12.16)
In fact, this state is proportional to a projector whose rank is simply the number of times the trivial
representation appears in Res^G_H σ. This follows because

  σ(H)² = Σ_{h,h′∈H} σ(hh′) = |H| σ(H),  (12.17)

which gives

  ρ̂²_{H,σ} = (|H| / Σ_{h∈H} χ_σ(h)*) ρ̂_{H,σ}.  (12.18)
values of p, q, unlike the case q < p1 mentioned above, measurement in a random basis is information-
theoretically sufficient. Indeed, we do not know of any example of an HSP for which strong Fourier sampling
succeeds, yet random strong Fourier sampling fails; it would be interesting to find any such example (or to
prove that none exists).
Note that simply finding an informative basis is not sufficient; it is also important that the measurement
results can be efficiently post-processed. This issue arises not only in the context of measurement in a
pseudo-random basis, but also in the context of certain explicit bases. For example, Ettinger and Høyer
gave a basis for the dihedral HSP in which a measurement gives sufficient classical information to infer the
hidden subgroup, but no efficient means of post-processing this information is known [39].
For some groups, it turns out that strong Fourier sampling simply fails. Moore, Russell, and Schulman
showed that, regardless of what basis is chosen, strong Fourier sampling provides insufficient information
to solve the HSP in the symmetric group [72]. Specifically, they showed that for any measurement basis
(indeed, for any POVM applied to a hidden subgroup state), the distribution of outcomes in the cases where
the hidden subgroup is trivial and where the hidden subgroup is an involution are exponentially close. Thus,
in general one has to consider entangled measurements on multiple copies of the hidden subgroup states.
(Indeed, entangled measurements on Ω(log |G|) copies may be necessary, as Hallgren et al. showed for the
symmetric group [51].) In the next two lectures, we will see some examples of quantum algorithms for the
HSP that make use of entangled measurements.
Chapter 13

Kuperberg's algorithm for the dihedral HSP

We now discuss a quantum algorithm for the dihedral hidden subgroup problem. No polynomial-time algorithm
for this problem is known. However, Kuperberg gave a quantum algorithm that runs in subexponential
(though superpolynomial) time: specifically, it runs in time 2^{O(√(log |G|))} [64].
(In particular, this shows that the dihedral group is the semidirect product Z_N ⋊ Z_2, where φ: Z_2 →
Aut(Z_N) is defined by φ(a)(y) = (−1)^a y.) It is also easy to see that the group inverse is (x, a)⁻¹ = ((−1)^{a+1} x, a).
The subgroups of D_N are either cyclic or dihedral. The possible cyclic subgroups are of the form ⟨(x, 0)⟩
where x ∈ Z_N is either 0 or some divisor of N. The possible dihedral subgroups are of the form ⟨(y, 1)⟩ where
y ∈ Z_N, and of the form ⟨(x, 0), (y, 1)⟩ where x ∈ Z_N is some divisor of N and y ∈ Z_x. A result of Ettinger
and Høyer reduces the general dihedral HSP, in which the hidden subgroup could be any of these possibilities,
to the dihedral HSP with the promise that the hidden subgroup is of the form ⟨(y, 1)⟩ = {(0, 0), (y, 1)}, i.e.,
a subgroup of order 2 generated by the reflection (y, 1).
The basic idea of the Ettinger-Høyer reduction is as follows. Suppose that f: D_N → S hides a subgroup
H = ⟨(x, 0), (y, 1)⟩. Then we can consider the function f restricted to elements from the abelian group
Z_N × {0} ≤ D_N. This restricted function hides the subgroup ⟨(x, 0)⟩, and since the restricted group is
abelian, we can find x efficiently using Shor's algorithm. Now ⟨(x, 0)⟩ ⊴ D_N (since (z, a)(x, 0)(z, a)⁻¹ =
(z + (−1)^a x, a)((−1)^{a+1} z, a) = ((−1)^a x, 0) ∈ Z_N × {0}), so we can define the quotient group D_N/⟨(x, 0)⟩.
But this is simply a dihedral group (of order 2N/(N/x) = 2x), and if we now define a function f′ as f
evaluated on some coset representative, it hides the subgroup ⟨(y, 1)⟩. Thus, in the rest of this lecture, we
will assume that the hidden subgroup is of the form ⟨(y, 1)⟩ for some y ∈ Z_N without loss of generality.
  |(z, 0){(0, 0), (y, 1)}⟩ = (1/√2)(|z, 0⟩ + |y + z, 1⟩).  (13.7)
We would like to determine y using samples of this state.
We have seen that to distinguish coset states in general, one should start by performing weak Fourier
sampling: apply a Fourier transform over G and then measure the irrep label. However, in this case we will
instead simply Fourier transform the first register over ZN , leaving the second register alone. It is possible
to show that measuring the first register of the resulting state is essentially equivalent to performing weak
Fourier sampling over DN (and discarding the row register), but for simplicity we will just consider the
abelian procedure.
Fourier transforming the first register over Z_N, we obtain

  (F_{Z_N} ⊗ I_2)|(z, 0)H⟩ = (1/√(2N)) Σ_{k∈Z_N} (ω_N^{kz} |k, 0⟩ + ω_N^{k(y+z)} |k, 1⟩)  (13.8)
  = (1/√N) Σ_{k∈Z_N} ω_N^{kz} |k⟩ ⊗ (1/√2)(|0⟩ + ω_N^{ky} |1⟩).  (13.9)
If we then measure the first register, we obtain one of the N values of k uniformly at random, and we are
left with the post-measurement state

  |ψ_k⟩ := (1/√2)(|0⟩ + ω_N^{yk} |1⟩).  (13.10)
Thus we are left with the problem of determining y given the ability to produce single-qubit states |ψ_k⟩ of
this form (where k is known).
Then a measurement on the second qubit leaves the first qubit in the state |ψ_{p±q}⟩ (up to an irrelevant global
phase), with the + sign occurring when the outcome is 0 and the − sign occurring when the outcome is 1,
each outcome occurring with probability 1/2.
This combination operation has a nice representation-theoretic interpretation: the state indices p and q
can be viewed as labels of irreducible representations of D_N, and the extraction of |ψ_{p±q}⟩ can be viewed as
decomposing their tensor product (a reducible representation of D_N) into one of two irreducible components.
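The combination step can be verified on explicit two-qubit state vectors (a simulation of ours; the values of N, y, p, q are arbitrary): apply a CNOT to |ψ_p⟩ ⊗ |ψ_q⟩ and project the second qubit onto |0⟩ or |1⟩.

```python
import numpy as np

N, y = 64, 23                      # arbitrary illustrative values
w = np.exp(2j * np.pi / N)

def psi(k):
    return np.array([1, w ** (y * k)]) / np.sqrt(2)

p, q = 11, 30
state = np.kron(psi(p), psi(q))    # basis order |00>, |01>, |10>, |11>
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])    # control = first qubit, target = second
state = cnot @ state

branch0, branch1 = state[[0, 2]], state[[1, 3]]        # second qubit = 0 / 1
assert np.isclose(np.linalg.norm(branch0) ** 2, 0.5)   # each outcome w.p. 1/2

def same_up_to_phase(a, b):
    return np.isclose(abs(np.vdot(a, b)), 1)

assert same_up_to_phase(branch0 / np.linalg.norm(branch0), psi(p + q))
assert same_up_to_phase(branch1 / np.linalg.norm(branch1), psi(p - q))
```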
This analysis is not quite correct because we do not obtain precisely a 1/8 fraction of the paired states
for use in the next stage. For most of the stages, we have many more than 2 2m states, so nearly all of them
can be paired, and the expected fraction remaining for the next stage is close to 1/4. Of course, the precise
fraction will experience statistical fluctuations. However, since we are working with a large number of states,
the deviations from the expected values are very small, and a more careful analysis (using the Chernoff
bound) shows that the procedure succeeds with high probability. For a detailed argument, see section 3.1 of
Kuperberg's paper (SICOMP version). That paper also gives an improved algorithm that runs faster and
that works for general N .
Note that this algorithm uses not only superpolynomial time, but also superpolynomial space, since all
Θ(16^√n) coset states are present at the start of the algorithm. However, by creating a smaller number of
coset states at a time and combining them according to the solution of a subset sum problem, Regev showed
how to make the space requirement polynomial with only a slight increase in the running time [77, 32].
If we postselect on obtaining this outcome (which happens with probability 1/2 over the uniformly random
value of k, assuming y ≠ 0), then we effectively obtain each value k ∈ Z_N with probability Pr(k|+) =
(2/N) cos²(πyk/N). It is not hard to show that these distributions are statistically far apart for different values of y,
so that they can in principle be distinguished with only polynomially many samples. However, no efficient
(or even subexponential time) classical (or even quantum) algorithm for doing so is known.
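These two claims about Pr(k|+) are easy to check numerically (the values of N and y below are our arbitrary choices): the distribution is normalized, and the total variation distance between the distributions for two different values of y is bounded away from zero.

```python
import numpy as np

N = 128
k = np.arange(N)

def dist(y):
    # Pr(k|+) = (2/N) cos^2(pi * y * k / N)
    return (2 / N) * np.cos(np.pi * y * k / N) ** 2

for y in (1, 5, 17):
    assert np.isclose(dist(y).sum(), 1)   # normalized for y != 0 mod N

tv = 0.5 * np.abs(dist(3) - dist(11)).sum()   # total variation distance
assert tv > 0.1                               # statistically far apart
```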
Chapter 14

The HSP in the Heisenberg group

We showed that the quantum query complexity of the general hidden subgroup problem is poly(log |G|),
by measuring ρ_H^{⊗k} using a particular measurement strategy (the pretty good measurement) that
identifies H with high probability. One strategy for finding an efficient quantum algorithm for the HSP is
to find an efficient way of implementing that particular measurement [14]. In this lecture, we will describe
an efficient quantum algorithm for the HSP in the Heisenberg group that effectively implements the pretty
good measurement.
over F_p, and the semidirect product Z_p² ⋊ Z_p, where φ: Z_p → Aut(Z_p²) is defined by φ(c)(a, b) = (a + bc, b).
To solve the HSP in the Heisenberg group, it is sufficient to be able to distinguish the following cyclic
subgroups of order p:

  H_{a,b} := ⟨(a, b, 1)⟩ = {(a, b, 1)^x : x ∈ Z_p}.  (14.5)

The reduction to this case is essentially the same as the reduction of the dihedral hidden subgroup problem
to the case of a hidden reflection, so we omit the details. The elements of such a subgroup are

  (a, b, 1)² = (2a + b, 2b, 2)  (14.6)
  (a, b, 1)³ = (a, b, 1)(2a + b, 2b, 2) = (3a + 3b, 3b, 3)  (14.7)
  (a, b, 1)⁴ = (a, b, 1)(3a + 3b, 3b, 3) = (4a + 6b, 4b, 4)  (14.8)
  (a, b, 1)⁵ = (a, b, 1)(4a + 6b, 4b, 4) = (5a + 10b, 5b, 5)  (14.9)

etc., and a straightforward inductive argument shows that a general element has the form

  (a, b, 1)^x = (xa + C(x,2) b, xb, x),  (14.10)

where C(x,2) := x(x − 1)/2 denotes a binomial coefficient.
Furthermore, it is easy to see that the p² elements (ℓ, m, 0) for ℓ, m ∈ Z_p form a left transversal of H_{a,b} in
the Heisenberg group for any a, b ∈ Z_p.
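The closed form (14.10) can be checked by direct multiplication. The sketch below (ours) assumes the product convention (a, b, c)(a′, b′, c′) = (a + a′ + b′c, b + b′, c + c′) mod p, which reproduces (14.6)-(14.9):

```python
from math import comb

p = 7  # any odd prime works here

def mul(g, h):
    (a, b, c), (A, B, C) = g, h
    return ((a + A + B * c) % p, (b + B) % p, (c + C) % p)

# Check (a,b,1)^x = (xa + C(x,2) b, xb, x) for all a, b and a range of x.
for a in range(p):
    for b in range(p):
        acc = (0, 0, 0)                     # identity element
        for x in range(1, 2 * p):
            acc = mul(acc, (a, b, 1))
            assert acc == ((x * a + comb(x, 2) * b) % p,
                           (x * b) % p, x % p)
```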
for some uniformly random, unknown ℓ, m ∈ Z_p. Our goal is to determine the parameters a, b ∈ Z_p using
the ability to produce such states.
At this point, we could perform weak Fourier sampling over the Heisenberg group without discarding
any information. However, as in the case of the dihedral group, it will be simpler to consider an abelian
Fourier transform instead of the full nonabelian Fourier transform. Using the representation theory of the
Heisenberg group, one can show that this procedure is essentially equivalent to nonabelian Fourier sampling.
Fourier transforming the first two registers over Z_p², we obtain the state

  (F_{Z_p} ⊗ F_{Z_p} ⊗ I_p)|(ℓ, m, 0)H_{a,b}⟩ = (1/p^{3/2}) Σ_{x,s,t∈Z_p} ω_p^{s(ℓ + xa + C(x,2)b) + t(m + xb)} |s, t, x⟩.  (14.12)

Now suppose we measure the values s, t appearing in the first two registers. In fact this can be done without
loss of information, since the density matrix of the state (mixed over the uniformly random values of ℓ, m)
is block diagonal, with blocks labeled by s, t. Collecting the coefficients of the unknown parameters a, b, the
resulting p-dimensional quantum state is

  |Ĥ_{a,b;s,t}⟩ := (1/√p) Σ_{x∈Z_p} ω_p^{s(xa + C(x,2)b) + t(xb)} |x⟩  (14.13)
  = (1/√p) Σ_{x∈Z_p} ω_p^{a(sx) + b(sC(x,2) + tx)} |x⟩,  (14.14)

where the values s, t ∈ Z_p are known, and are obtained uniformly at random. We would like to use samples
of this state to determine a, b ∈ Z_p.
where

  α := sx + uy  (14.17)
  β := sC(x,2) + tx + uC(y,2) + vy,  (14.18)

and where we suppress the dependence of α, β on s, t, u, v, x, y for clarity. If we could replace |x, y⟩ by |α, β⟩,
then the resulting state would be simply the Fourier transform of |a, b⟩, and an inverse Fourier transform
would reveal the solution. So let's compute the values of α, β in ancilla registers, giving the state

  (1/p) Σ_{x,y∈Z_p} ω_p^{aα + bβ} |x, y, α, β⟩,  (14.19)
where we use the convention that |S⟩ := Σ_{s∈S} |s⟩/√|S| denotes the normalized uniform superposition over
the elements of the set S. Here S^{s,t,u,v}_{α,β} := {(x, y) ∈ Z_p² : sx + uy = α, sC(x,2) + tx + uC(y,2) + vy = β}
denotes the set of solutions (x, y) giving rise to a particular pair (α, β). Thus, if we could perform a unitary
transformation satisfying

  |S^{s,t,u,v}_{α,β}⟩ ↦ |α, β⟩ for |S^{s,t,u,v}_{α,β}| ≠ 0  (14.22)

(and defined in any way consistent with unitarity for other values of α, β), we could erase the first two
registers of (14.19), producing the state

  (1/p) Σ_{α,β∈Z_p} √|S^{s,t,u,v}_{α,β}| ω_p^{aα + bβ} |α, β⟩.  (14.23)
(Note that in fact we could just apply the transformation (14.22) directly to the state (14.16); there is no
need to explicitly compute the values $\alpha, \beta$ in an ancilla register.)
We refer to the inverse of the transformation (14.22) as quantum sampling, since the goal is to produce
a uniform superposition over the set of solutions, a natural quantum analog of random sampling from those
solutions.
Since the system of equations (14.17) and (14.18) consists of a pair of quadratic equations in two variables over $\mathbb{F}_p$, it has either zero, one, or two solutions $x, y \in \mathbb{F}_p$. In particular, a straightforward calculation shows that the solutions can be expressed in closed form as
$$x = \frac{s\alpha + sv - tu \mp \sqrt{\Delta}}{s(s+u)} \qquad y = \frac{u\alpha + tu - sv \pm \sqrt{\Delta}}{u(s+u)} \qquad (14.24)$$
where
$$\Delta := (2\beta s + \alpha s - \alpha^2 - 2t\alpha)(s+u)u + (u\alpha + tu - sv)^2. \qquad (14.25)$$
Provided $su(s+u) \neq 0$, the number of solutions is completely determined by the value of $\Delta$. If $\Delta$ is a nonzero square in $\mathbb{F}_p$, then there are two distinct solutions; if $\Delta = 0$ then there is only one solution; and if $\Delta$ is a non-square then there are no solutions. In any event, since we can efficiently compute an explicit list of solutions in each of these cases, we can efficiently perform the transformation (14.22).
62 Chapter 14. The HSP in the Heisenberg group
It remains to show that the state (14.23) can be used to recover a, b. This state is close to the Fourier
transform of |a, bi provided the solutions are nearly uniformly distributed. Since the values of s, t, u, v are
uniformly distributed over $\mathbb{F}_p$, it is easy to see that $\Delta$ is uniformly distributed over $\mathbb{F}_p$. This means that $\Delta$ is a square about half the time, and a non-square about half the time (with $\Delta = 0$ occurring only with probability $1/p$). Thus there are two solutions about half the time and no solutions about half the time.
This distribution of solutions is uniform enough for the procedure to work.
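This distribution can be checked directly by brute force. The following sketch (the prime $p$, random seed, and trial count are illustrative choices, not part of the algorithm) counts solutions of the system (14.17)–(14.18) for random parameters:

```python
# Brute-force sanity check over a small prime p: for random s, t, u, v and
# random (alpha, beta), the system
#   alpha = s*x + u*y,
#   beta  = s*binom(x,2) + t*x + u*binom(y,2) + v*y   (over F_p)
# has two solutions about half the time and none about half the time.
import random

p = 31  # illustrative small prime
random.seed(0)

def num_solutions(s, t, u, v, alpha, beta):
    count = 0
    for x in range(p):
        for y in range(p):
            ok1 = (s * x + u * y) % p == alpha
            ok2 = (s * (x * (x - 1) // 2) + t * x
                   + u * (y * (y - 1) // 2) + v * y) % p == beta
            if ok1 and ok2:
                count += 1
    return count

counts = {0: 0, 1: 0, 2: 0}
for _ in range(200):
    s, t, u, v = (random.randrange(1, p) for _ in range(4))
    if (s * u * (s + u)) % p == 0:
        continue  # exclude the degenerate case, as in the text
    alpha, beta = random.randrange(p), random.randrange(p)
    counts[num_solutions(s, t, u, v, alpha, beta)] += 1

print(counts)  # roughly half the trials give 2 solutions, half give 0
```

Note that a single solution ($\Delta = 0$) occurs only for an $O(1/p)$ fraction of the trials, matching the analysis above.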
Applying the inverse quantum Fourier transform over $\mathbb{Z}_p \times \mathbb{Z}_p$, we obtain the state
$$\frac{1}{p^2} \sum_{\alpha,\beta,k,\ell \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\; \omega_p^{\alpha(a-k) + \beta(b-\ell)}\, |k, \ell\rangle. \qquad (14.26)$$
Measuring this state, the probability of obtaining the outcome $k = a$ and $\ell = b$ for any particular values of $s, t, u, v$ is
$$\frac{1}{p^4} \Bigg|\sum_{\alpha,\beta \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\Bigg|^2. \qquad (14.27)$$
Since those values occur uniformly at random, the overall success probability of the algorithm is
$$\frac{1}{p^8} \sum_{s,t,u,v \in \mathbb{Z}_p} \Bigg|\sum_{\alpha,\beta \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\Bigg|^2 \geq \frac{1}{p^{12}} \Bigg|\sum_{s,t,u,v \in \mathbb{Z}_p} \sum_{\alpha,\beta \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\Bigg|^2 \qquad (14.28)$$
$$\geq \frac{1}{p^{12}} \Bigg(\sum_{\alpha,\beta \in \mathbb{Z}_p} \frac{p^4}{\sqrt{2 + o(1)}}\Bigg)^2 \qquad (14.29)$$
$$= \frac{1}{2}(1 - o(1)), \qquad (14.30)$$
which shows that the algorithm succeeds with probability close to $1/2$.
Chapter 15
Approximating the Jones polynomial
In this final chapter of the part on algebraic problems, we discuss a very different class of quantum algorithms, ones that approximately solve various #P-complete problems. The best-known example of such a quantum algorithm is for approximating the value of a link invariant called the Jones polynomial. This algorithm is not based on the Fourier transform, but it does use properties of group representations.
64 Chapter 15. Approximating the Jones polynomial
The Jones polynomial of an oriented link $L$ is a Laurent polynomial $V_L(t)$ in the variable $\sqrt{t}$, i.e., a polynomial in $\sqrt{t}$ and $1/\sqrt{t}$. It is a link invariant, meaning that $V_L(t) = V_{L'}(t)$ if the oriented links $L$ and $L'$ are isotopic. While it is possible for the Jones polynomial to take the same value on two non-isotopic links, it can often distinguish links; for example, the Jones polynomials of the two orientations of the trefoil knot are different.
An oriented link L can be specified by a link diagram, a drawing of the link in the plane with over- and
under-crossings indicated. One way to define the Jones polynomial of a link diagram is as follows. First, let
us define the Kauffman bracket hLi, which does not depend on the orientation of L. Each crossing in the
link diagram can be opened in one of two ways, and for any given crossing we have
$$\langle L_{\times} \rangle = t^{1/4}\, \langle L_{\asymp} \rangle + t^{-1/4}\, \langle L_{)(} \rangle, \qquad (15.4)$$
where $L_{\times}$, $L_{\asymp}$, and $L_{)(}$ denote diagrams that are identical except at the given crossing, which is respectively left intact or opened in one of the two possible ways, with the rest of the link unchanged. Repeatedly applying this rule, we eventually arrive at a link
consisting of disjoint unknots. The Kauffman bracket of a single unknot is $\langle\bigcirc\rangle := 1$, and more generally, the Kauffman bracket of $n$ unknots is $(-t^{1/2} - t^{-1/2})^{n-1}$. By itself, the Kauffman bracket is not a link invariant,
but it can be turned into one by taking into account the orientation of the link, giving the Jones polynomial.
For any oriented link diagram $L$, we define its writhe $w(L)$ as the number of positive crossings minus the number of negative crossings, where the sign of a crossing is determined by the orientations of the two strands in the standard way. Then the Jones polynomial is defined as
$$V_L(t) := (-t^{3/4})^{-w(L)}\, \langle L \rangle.$$
Computing the Jones polynomial of a link diagram is quite difficult. A brute-force calculation using
the definition in terms of the Kauffman bracket takes time exponential in the number of crossings. Indeed,
exactly computing the Jones polynomial is #P-hard (except for a few special values of t), as shown by
Jaeger, Vertigan, and Welsh. Here #P is the class of counting problems associated to problems in NP (e.g.,
computing the number of satisfying assignments of a Boolean formula). Of course, approximate counting
can be easier than exact counting, and sometimes #P-hard problems have surprisingly good approximation
algorithms.
number of paths of length $n$ that start from one end of a path with $k - 1$ vertices), so it corresponds to a unitary operation on $\mathrm{poly}(n)$ qubits. The Jones polynomial of the plat closure of a braid is proportional to the expectation $\langle\psi|U|\psi\rangle$ of the associated representation matrix $U$ in a fixed quantum state $|\psi\rangle$.
that this problem is complete for the one clean qubit model, and hence apparently unlikely to be solvable
by classical computers.
Quantum walk

Chapter 16
Continuous-time quantum walk
We now turn to our second major topic in quantum algorithms, the concept of quantum walk. In this lecture
we will introduce continuous-time quantum walk as a natural analog of continuous-time classical random
walk, and we'll see some examples of how the two kinds of processes differ.
where $\deg(j)$ denotes the degree of vertex $j$. (The Laplacian is sometimes defined differently than this, e.g., sometimes with the opposite sign. We use this definition because it makes $L$ a discrete approximation of the Laplacian operator $\nabla^2$ in the continuum.)
The continuous-time random walk on $G$ is defined as the solution of the differential equation
$$\frac{\mathrm{d}}{\mathrm{d}t}\, p_j(t) = \sum_{k \in V} L_{jk}\, p_k(t). \qquad (16.3)$$
Here $p_j(t)$ denotes the probability associated with vertex $j$ at time $t$. This can be viewed as a discrete analog of the diffusion equation. Note that
$$\frac{\mathrm{d}}{\mathrm{d}t} \sum_{j \in V} p_j(t) = \sum_{j,k \in V} L_{jk}\, p_k(t) = 0 \qquad (16.4)$$
(since the columns of $L$ sum to 0), which shows that an initially normalized distribution remains normalized: the evolution of the continuous-time random walk for any time $t$ is a stochastic process. The solution of the differential equation can be given in closed form as
$$p(t) = e^{Lt}\, p(0). \qquad (16.5)$$
70 Chapter 16. Continuous-time quantum walk
Now notice that the equation (16.3) is very similar to the Schrödinger equation
$$i\, \frac{\mathrm{d}}{\mathrm{d}t}\, |\psi\rangle = H\, |\psi\rangle \qquad (16.6)$$
except that it lacks the factor of $i$. If we simply insert this factor, and rename the probabilities $p_j(t)$ as quantum amplitudes $q_j(t) = \langle j|\psi(t)\rangle$ (where $\{|j\rangle : j \in V\}$ is an orthonormal basis for the Hilbert space), then we obtain the equation
$$i\, \frac{\mathrm{d}}{\mathrm{d}t}\, q_j(t) = \sum_{k \in V} L_{jk}\, q_k(t), \qquad (16.7)$$
which is simply the Schrödinger equation with the Hamiltonian given by the Laplacian of the graph. Since the Laplacian is a Hermitian operator, these dynamics preserve normalization in the sense that $\frac{\mathrm{d}}{\mathrm{d}t} \sum_{j \in V} |q_j(t)|^2 = 0$. Again the solution of the differential equation can be given in closed form, but here it is $|\psi(t)\rangle = e^{-iLt}\, |\psi(0)\rangle$.
We could also define a continuous-time quantum walk using any Hermitian Hamiltonian that respects
the structure of G. For example, we could use the adjacency matrix A of G, even though this matrix cannot
be used as the generator of a continuous-time classical random walk.
For general $n$, the graph is the direct product of this graph with itself $n$ times, and the adjacency matrix is
$$A = \sum_{j=1}^{n} \sigma_x^{(j)} \qquad (16.9)$$
where $\sigma_x^{(j)}$ denotes the operator acting as $\sigma_x$ on the $j$th bit, and as the identity on every other bit.
For simplicity, let's consider the quantum walk with the Hamiltonian given by the adjacency matrix. (In fact, since the graph is regular, the walk generated by the Laplacian would only differ by an overall phase.) Since the terms in the above expression for the adjacency matrix commute, the unitary operator describing the evolution of this walk is simply
$$e^{-iAt} = \prod_{j=1}^{n} e^{-i \sigma_x^{(j)} t} \qquad (16.10)$$
$$= \bigotimes_{j=1}^{n} \begin{pmatrix} \cos t & -i \sin t \\ -i \sin t & \cos t \end{pmatrix}. \qquad (16.11)$$
After time $t = \pi/2$, this operator flips every bit of the state (up to an overall phase), mapping any input state $|x\rangle$ to the state $|\bar{x}\rangle$ corresponding to the opposite vertex of the hypercube.
In contrast, consider the continuous- or discrete-time random walk starting from the vertex $x$. It is not hard to show that the probability of reaching the opposite vertex $\bar{x}$ is exponentially small at any time, since the walk rapidly reaches the uniform distribution over all $2^n$ vertices of the hypercube. So this simple example shows that random and quantum walks can exhibit radically different behavior.
16.3. Random and quantum walks in one dimension 71
where $p \in [-\pi, \pi]$. A direct calculation shows that the state $|p\rangle$ with amplitudes $\langle j|p\rangle = e^{ipj}$ satisfies $L|p\rangle = 2(\cos p - 1)\,|p\rangle$, so the corresponding eigenvalue is $2(\cos p - 1)$. Thus the amplitude for the walk to move from $j$ to $k$ in time $t$ is
$$\langle k|e^{-iLt}|j\rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-2it(\cos p - 1)}\, \langle k|p\rangle \langle p|j\rangle\, \mathrm{d}p \qquad (16.17)$$
$$= \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ip(k-j) - 2it(\cos p - 1)}\, \mathrm{d}p \qquad (16.18)$$
$$= e^{2it}\, (-i)^{k-j}\, J_{k-j}(2t) \qquad (16.19)$$
where $J_\nu$ is the Bessel function of order $\nu$. This expression can be understood using basic asymptotic properties of the Bessel function. For large values of $\nu$, the function $J_\nu(t)$ is exponentially small in $\nu$ for $\nu \gg t$, of order $t^{-1/3}$ for $\nu \approx t$, and of order $t^{-1/2}$ for $\nu \ll t$. Thus (16.19) describes a wave propagating with speed 2.
We can use a similar calculation to exactly describe the corresponding continuous-time classical random walk, which is simply the analytic continuation of the quantum case with $t \to it$. Here the probability of moving from $j$ to $k$ in time $t$ is
$$e^{-2t}\, I_{k-j}(2t),$$
where $I_\nu$ is the modified Bessel function of order $\nu$. For large $t$, this expression is approximately $\frac{1}{\sqrt{4\pi t}}\, e^{-(k-j)^2/4t}$, a Gaussian of width $\sqrt{2t}$, in agreement with our expectations for a classical random walk in one dimension.
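The Bessel-function expression can be checked numerically. The sketch below (the path length, time, and displacement are illustrative) compares the exact amplitude on a long but finite path with the integral representation (16.18); away from the boundaries the two agree to high precision:

```python
# Compare the exact walk amplitude on a long finite path with the integral
# representation (16.18), evaluated by a periodic trapezoid rule.
import numpy as np

N = 201                                   # illustrative path length
L = np.zeros((N, N))
for i in range(N - 1):
    L[i, i + 1] = L[i + 1, i] = 1.0
L -= np.diag(L.sum(axis=1))               # Laplacian L = A - D

vals, vecs = np.linalg.eigh(L)
t = 5.0
U = vecs @ np.diag(np.exp(-1j * vals * t)) @ vecs.T
j, k = N // 2, N // 2 + 7                 # move 7 sites in time t
exact = U[k, j]

# (1/2pi) * integral of e^{ip(k-j) - 2it(cos p - 1)} over [-pi, pi)
p = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
integral = np.mean(np.exp(1j * p * (k - j) - 2j * t * (np.cos(p) - 1)))

print(abs(exact - integral))              # tiny: the two expressions agree
```

Boundary effects are negligible here because the amplitude beyond distance roughly $2t$ from the start is exponentially small, as the asymptotics above indicate.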
Suppose we take a random walk on the graph starting from the root of the left tree. It is not hard to
see that such a walk rapidly gets lost in the middle of the graph and never has a substantial probability
of reaching the opposite root. In fact, by specifying the graph in such a way that it can only be explored
locally, we can ensure that no classical procedure starting from the left root can efficiently reach the right
root. However, a quantum walk starting from the left root produces a state with a large (lower bounded by
1/ poly(n)) overlap on the right root in a short (upper bounded by poly(n)) amount of time.
To establish a provable separation between classical and quantum strategies, we will formulate the graph
traversal problem in terms of query complexity.
Let $G = (V, E)$ be a graph with $N$ vertices. To represent $G$ by a black box, let $m$ be such that $2^m > N$,
and let k be at least as large as the maximum degree of G. For each vertex a V , assign a distinct m-bit
string (called the name of a), not assigning 11 . . . 1 as the name of any vertex. For each b V with (a, b) E,
assign a unique label from {1, 2, . . . , k} to the ordered pair (a, b). For a {0, 1}m (identifying the vertex
with its name) and c {1, 2, . . . , k}, define vc (a) as the name of the vertex reached by following the outgoing
edge of a labeled by c, if such an edge exists. If there is no vertex of G named a or no outgoing edge from a
labeled c, then let vc (a) = 11 . . . 1. The black box for G takes a {0, 1}m and c {1, 2, . . . , k} as input and
returns vc (a).
The black box graph traversal problem is as follows. Let G be a graph and let entrance and exit be
two vertices of G. Given a black box for G as described above, with the additional promise that the name of
the entrance is 00 . . . 0, the goal is to output the name of the exit. We say an algorithm for this problem
is efficient if its running time is polynomial in m.
Of course, a random walk is not necessarily the best classical strategy for this problem. For example,
there is an efficient classical algorithm for traversing the n-dimensional hypercube (exercise: what is it?)
even though a random walk does not work. However, no classical algorithm can efficiently traverse the glued
trees, whereas a quantum walk can.
$$|\mathrm{col}\ j\rangle := \frac{1}{\sqrt{N_j}} \sum_{\delta(a,\,\mathrm{entrance}) = j} |a\rangle \qquad (16.21)$$
16.5. Quantum walk algorithm to traverse the glued trees graph 73
where
$$N_j := \begin{cases} 2^j & 0 \le j \le n \\ 2^{2n+1-j} & n+1 \le j \le 2n+1 \end{cases} \qquad (16.22)$$
is the number of vertices at distance $j$ from the entrance, and where $\delta(a, b)$ denotes the length of the shortest path in $G$ from $a$ to $b$. It is straightforward to see that the subspace $\mathrm{span}\{|\mathrm{col}\ j\rangle : 0 \le j \le 2n+1\}$ is invariant under the action of the adjacency matrix $A$ of $G$. At the entrance and exit, we have
$$A\,|\mathrm{col}\ 0\rangle = \sqrt{2}\,|\mathrm{col}\ 1\rangle \qquad (16.23)$$
$$A\,|\mathrm{col}\ 2n+1\rangle = \sqrt{2}\,|\mathrm{col}\ 2n\rangle. \qquad (16.24)$$
For $0 < j < n$ (within the left tree), we have
$$A\,|\mathrm{col}\ j\rangle = \frac{1}{\sqrt{N_j}} \sum_{\delta(a,\,\mathrm{entrance}) = j} A\,|a\rangle \qquad (16.25)$$
$$= \frac{1}{\sqrt{N_j}} \Bigg(2 \sum_{\delta(a,\,\mathrm{entrance}) = j-1} |a\rangle + \sum_{\delta(a,\,\mathrm{entrance}) = j+1} |a\rangle\Bigg) \qquad (16.26)$$
$$= \frac{1}{\sqrt{N_j}} \left(2\sqrt{N_{j-1}}\,|\mathrm{col}\ j-1\rangle + \sqrt{N_{j+1}}\,|\mathrm{col}\ j+1\rangle\right) \qquad (16.27)$$
$$= \sqrt{2}\,\big(|\mathrm{col}\ j-1\rangle + |\mathrm{col}\ j+1\rangle\big). \qquad (16.28)$$
Similarly, for $n+1 < j < 2n+1$ (within the right tree),
$$A\,|\mathrm{col}\ j\rangle = \frac{1}{\sqrt{N_j}} \left(\sqrt{N_{j-1}}\,|\mathrm{col}\ j-1\rangle + 2\sqrt{N_{j+1}}\,|\mathrm{col}\ j+1\rangle\right) \qquad (16.29)$$
$$= \sqrt{2}\,\big(|\mathrm{col}\ j-1\rangle + |\mathrm{col}\ j+1\rangle\big). \qquad (16.30)$$
The only difference occurs at the middle of the graph, where we have
$$A\,|\mathrm{col}\ n\rangle = \frac{1}{\sqrt{N_n}} \left(2\sqrt{N_{n-1}}\,|\mathrm{col}\ n-1\rangle + 2\sqrt{N_{n+1}}\,|\mathrm{col}\ n+1\rangle\right) \qquad (16.31)$$
$$= \sqrt{2}\,|\mathrm{col}\ n-1\rangle + 2\,|\mathrm{col}\ n+1\rangle \qquad (16.32)$$
and similarly
$$A\,|\mathrm{col}\ n+1\rangle = \frac{1}{\sqrt{N_{n+1}}} \left(2\sqrt{N_n}\,|\mathrm{col}\ n\rangle + 2\sqrt{N_{n+2}}\,|\mathrm{col}\ n+2\rangle\right) \qquad (16.33)$$
$$= 2\,|\mathrm{col}\ n\rangle + \sqrt{2}\,|\mathrm{col}\ n+2\rangle. \qquad (16.34)$$
In summary, in the basis $\{|\mathrm{col}\ 0\rangle, \ldots, |\mathrm{col}\ 2n+1\rangle\}$, the matrix elements of $A$ describe a weighted path: the entrance ($\mathrm{col}\ 0$) and exit ($\mathrm{col}\ 2n+1$) are the endpoints, and every edge of the path has weight $\sqrt{2}$, except the middle edge between $\mathrm{col}\ n$ and $\mathrm{col}\ n+1$, which has weight 2.
By identifying the subspace of states |col ji, we have found that the quantum walk on the glued trees
graph starting from the entrance is effectively the same as a quantum walk on a weighted path of 2n + 2
vertices, with all edge weights the same except for the middle one. Given our example of the quantum walk
on the infinite path, we can expect this walk to reach the exit with amplitude 1/ poly(n) in time linear in
n. To prove that the walk indeed reaches the exit in polynomial time, we will use the notion of the mixing
time of a quantum walk.
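The reduced walk is easy to simulate directly. The following sketch (the value of $n$ and the sampled times are illustrative choices) builds the weighted-path adjacency matrix just described and tracks the probability at the exit:

```python
# Simulate the quantum walk on the reduced (2n+2)-vertex weighted path for
# the glued trees: every weight is sqrt(2) except the middle edge, which is 2.
import numpy as np

n = 10                                    # illustrative size
d = 2 * n + 2
A = np.zeros((d, d))
for j in range(d - 1):
    A[j, j + 1] = A[j + 1, j] = 2.0 if j == n else np.sqrt(2.0)

vals, vecs = np.linalg.eigh(A)
entrance, exit_ = 0, d - 1

best = 0.0
for t in np.linspace(0.0, 4.0 * n, 400):
    phases = np.exp(-1j * vals * t)
    amp = (vecs[exit_] * phases) @ vecs[entrance]   # <exit| e^{-iAt} |entrance>
    best = max(best, abs(amp) ** 2)

print(best)   # substantial probability at the exit, well above 1/poly(n)
```

The peak occurs around the time a wavepacket moving at the speed set by the $\sqrt{2}$ couplings first reaches the far end of the path, consistent with the one-dimensional analysis above.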
(where in exponentiating $L$ we have used the fact that $\sqrt{|V|}\,u$ is a normalized eigenvector of $L$, so that $|V|\,uu^T$ is the projector onto the corresponding subspace). The Laplacian is a negative semidefinite operator, so the contributions $e^{\lambda t}$ for $\lambda \neq 0$ decrease exponentially in time; thus the walk asymptotically approaches the uniform distribution. The deviation from uniform is small when $t$ is large compared to the inverse of the largest (i.e., least negative) nonzero eigenvalue of $L$.
Since a quantum walk is a unitary process, we should not expect it to approach a limiting quantum state, no matter how long we wait. Nevertheless, it is possible to define a notion of the limiting distribution of a quantum walk as follows. Suppose we pick a time $t$ uniformly at random between 0 and $T$, run the quantum walk starting at $a \in V$ for a total time $t$, and then measure in the vertex basis. The resulting distribution is
$$p_{a \to b}(T) = \frac{1}{T} \int_0^T |\langle b|e^{-iHt}|a\rangle|^2\, \mathrm{d}t \qquad (16.39)$$
$$= \sum_{\lambda, \lambda'} \langle b|\lambda\rangle \langle\lambda|a\rangle \langle a|\lambda'\rangle \langle\lambda'|b\rangle\, \frac{1}{T} \int_0^T e^{-i(\lambda - \lambda')t}\, \mathrm{d}t \qquad (16.40)$$
$$= \sum_{\lambda} |\langle a|\lambda\rangle \langle b|\lambda\rangle|^2 + \sum_{\lambda \neq \lambda'} \langle b|\lambda\rangle \langle\lambda|a\rangle \langle a|\lambda'\rangle \langle\lambda'|b\rangle\, \frac{1 - e^{-i(\lambda - \lambda')T}}{i(\lambda - \lambda')T} \qquad (16.41)$$
where we have considered a quantum walk generated by an unspecified Hamiltonian $H$ (it could be the Laplacian or the adjacency matrix, or some other operator as desired), and where we have assumed for simplicity that the spectrum of $H = \sum_\lambda \lambda\, |\lambda\rangle\langle\lambda|$ is nondegenerate. We see that the distribution $p_{a \to b}(T)$ tends toward a limiting distribution
$$p_{a \to b}(\infty) := \sum_{\lambda} |\langle a|\lambda\rangle \langle b|\lambda\rangle|^2. \qquad (16.42)$$
The timescale for approaching this distribution is again governed by the spectrum of H, but now we see that
T must be large compared to the inverse of the smallest gap between any pair of distinct eigenvalues, not
just the smallest gap between a particular pair of eigenvalues as in the classical case.
Let's apply this notion of quantum mixing to the quantum walk on the glued trees. It will be simplest to consider the walk generated by the adjacency matrix $A$. Since the subspace of states $|\mathrm{col}\ j\rangle$ has dimension only $2n + 2$, it should not be surprising that the limiting probability of traversing from entrance to exit is bigger than $1/\mathrm{poly}(n)$. To see this, notice that $A$ commutes with the reflection operator $R$ defined as $R\,|\mathrm{col}\ j\rangle = |\mathrm{col}\ 2n+1-j\rangle$, so these two operators can be simultaneously diagonalized. Now $R^2 = 1$, so it
16.6. Classical and quantum mixing 75
has eigenvalues $\pm 1$, which shows that we can choose the eigenstates $|\lambda\rangle$ of $A$ to satisfy $\langle\mathrm{entrance}|\lambda\rangle = \pm\langle\mathrm{exit}|\lambda\rangle$. Therefore,
$$p_{\mathrm{entrance} \to \mathrm{exit}}(\infty) = \sum_{\lambda} |\langle\mathrm{entrance}|\lambda\rangle \langle\mathrm{exit}|\lambda\rangle|^2 \qquad (16.43)$$
$$= \sum_{\lambda} |\langle\mathrm{entrance}|\lambda\rangle|^4 \qquad (16.44)$$
$$\geq \frac{1}{2n+2} \Bigg(\sum_{\lambda} |\langle\mathrm{entrance}|\lambda\rangle|^2\Bigg)^2 \qquad (16.45)$$
$$= \frac{1}{2n+2} \qquad (16.46)$$
where the lower bound follows by the Cauchy-Schwarz inequality. Thus it suffices to show that the mixing
time of the quantum walk is poly(n).
To see how long we must wait before the probability of reaching the exit is close to its limiting value, we can calculate
$$|p_{\mathrm{entrance} \to \mathrm{exit}}(\infty) - p_{\mathrm{entrance} \to \mathrm{exit}}(T)|$$
$$= \Bigg|\sum_{\lambda \neq \lambda'} \langle\mathrm{exit}|\lambda\rangle \langle\lambda|\mathrm{entrance}\rangle \langle\mathrm{entrance}|\lambda'\rangle \langle\lambda'|\mathrm{exit}\rangle\, \frac{1 - e^{-i(\lambda - \lambda')T}}{i(\lambda - \lambda')T}\Bigg| \qquad (16.47)$$
$$\leq \frac{2}{\Delta T} \sum_{\lambda, \lambda'} |\langle\mathrm{exit}|\lambda\rangle \langle\lambda|\mathrm{entrance}\rangle \langle\mathrm{entrance}|\lambda'\rangle \langle\lambda'|\mathrm{exit}\rangle| \qquad (16.48)$$
$$= \frac{2}{\Delta T} \sum_{\lambda, \lambda'} |\langle\mathrm{entrance}|\lambda\rangle|^2\, |\langle\mathrm{entrance}|\lambda'\rangle|^2 \qquad (16.49)$$
$$= \frac{2}{\Delta T}, \qquad (16.50)$$
where $\Delta$ denotes the smallest gap between any pair of distinct eigenvalues of $A$. All that remains is to lower bound $\Delta$.
To understand the spectrum of $A$, recall that an infinite path has eigenstates of the form $e^{ipj}$. For any value of $p$, the state $|\lambda\rangle$ with amplitudes $\langle\mathrm{col}\ j|\lambda\rangle = e^{ipj}$ satisfies $\langle\mathrm{col}\ j|A|\lambda\rangle = \lambda\, \langle\mathrm{col}\ j|\lambda\rangle$, where the eigenvalue is $\lambda = 2\sqrt{2}\cos p$, for all values of $j$ except $0, n, n+1, 2n+1$. We can satisfy the eigenvalue condition for $j = 0, 2n+1$ by taking linear combinations of $e^{\pm ipj}$ that vanish for $j = -1$ and $j = 2n+2$, namely
$$\langle\mathrm{col}\ j|\lambda\rangle = \begin{cases} \sin(p(j+1)) & 0 \le j \le n \\ \pm\sin(p(2n+2-j)) & n+1 \le j \le 2n+1. \end{cases} \qquad (16.51)$$
The left hand side of this equation decreases monotonically, with poles at integer multiples of $\pi/(n+1)$.
With a bit of analysis (see quant-ph/0209131 for details), one can show that the solutions of this equation give $2n$ values of $p$, each of which is separated from the integer multiples of $\pi/(n+1)$ by $\Omega(1/n^2)$. The spacings between the corresponding eigenvalues of $A$, $\lambda = 2\sqrt{2}\cos p$, are $\Omega(1/n^3)$. The remaining two eigenvalues of $A$ can be obtained by considering solutions with $p$ imaginary, and it is easy to show that they are separated from the rest of the spectrum by a constant amount. By taking (say) $T = 5n/\Delta = O(n^4)$, we can ensure that the probability to reach the exit is $\Omega(1/n)$. Thus there is an efficient quantum algorithm to traverse the glued trees graph.
Chapter 17
Discrete-time quantum walk
In the last lecture we introduced the notion of continuous-time quantum walk. We now turn our attention to discrete-time quantum walk, which provides a convenient framework for quantum search algorithms.
for $j, k \in V$: an initial probability distribution $p$ over the vertices evolves to $p' = Mp$ after one step of the walk.
To define a quantum analog of this process, we would like to specify a unitary operator $U$ with the property that an input state $|j\rangle$ corresponding to the vertex $j \in V$ evolves to a superposition of the neighbors of $j$. We would like this to happen in essentially the same way at every vertex, so we are tempted to propose the definition
$$|j\rangle \overset{?}{\mapsto} |\partial j\rangle := \frac{1}{\sqrt{\deg(j)}} \sum_{k : (j,k) \in E} |k\rangle. \qquad (17.2)$$
However, a moment's reflection shows that this typically does not define a unitary transformation, since the orthogonal states $|j\rangle$ and $|k\rangle$ corresponding to adjacent vertices $j, k$ with a common neighbor $\ell$ evolve to non-orthogonal states. We could potentially avoid this problem using a rule that sometimes introduces phases, but that would violate the spirit of defining a process that behaves in the same way at every vertex.
In fact, even if we give that up, there are some graphs that simply do not allow local unitary dynamics [88].
We can get around this difficulty if we allow ourselves to enlarge the Hilbert space, an idea proposed by Watrous as part of a logarithmic-space quantum algorithm for deciding whether two vertices are connected in a graph [96]. Let the Hilbert space consist of states of the form $|j, k\rangle$ where $(j, k) \in E$. We can think of the walk as taking place on the (directed) edges of the graph; the state $|j, k\rangle$ represents a walker at vertex $j$ that will move toward vertex $k$. Each step of the walk consists of two operations. First, we apply a unitary transformation that operates on the second register conditional on the first register. This transformation is sometimes referred to as a coin flip, as it modifies the next destination of the walker. A common choice is the Grover diffusion operator over the neighbors of $j$, namely
$$C := \sum_{j \in V} |j\rangle\langle j| \otimes \big(2\,|\partial j\rangle\langle\partial j| - I\big). \qquad (17.3)$$
Next, the walker is moved to the vertex indicated in the second register. Of course, since the process must
78 Chapter 17. Discrete-time quantum walk
be unitary, the only way to do this is to swap the two registers using the operator
$$S := \sum_{(j,k) \in E} |j, k\rangle\langle k, j|. \qquad (17.4)$$
Overall, one step of the discrete-time quantum walk is described by the unitary operator SC.
In principle, this construction can be used to define a discrete-time quantum walk on any graph (although
care must be taken if the graph is not regular). However, in practice it is often more convenient to use an
alternative framework introduced by Szegedy [94], as described in the next section.
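As a concrete illustration of the coined construction above (the graph and its size are arbitrary choices), here is one step $U = SC$ on a cycle; for a degree-2 vertex the Grover coin $2|\partial j\rangle\langle\partial j| - I$ is simply the swap of the two coin directions:

```python
# One step U = S*C of the coined quantum walk on an N-cycle.
import numpy as np

N = 8  # illustrative cycle size
edges = [(j, (j + 1) % N) for j in range(N)] + [(j, (j - 1) % N) for j in range(N)]
idx = {e: i for i, e in enumerate(edges)}
dim = len(edges)

# Coin: Grover diffusion 2|dj><dj| - I over the neighbors of each vertex j.
C = np.zeros((dim, dim))
for j in range(N):
    nbrs = [(j + 1) % N, (j - 1) % N]
    for k in nbrs:
        for kp in nbrs:
            C[idx[(j, kp)], idx[(j, k)]] = 2.0 / len(nbrs) - (k == kp)

# Shift: move to the vertex named in the second register by swapping registers.
S = np.zeros((dim, dim))
for (j, k) in edges:
    S[idx[(k, j)], idx[(j, k)]] = 1.0

U = S @ C
print(np.allclose(U @ U.T, np.eye(dim)))   # → True: U is unitary
```

Note that $C$ is a reflection ($C^2 = I$) and $S$ is a permutation, so $U$ is automatically unitary; this is exactly why the two-register construction succeeds where (17.2) fails.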
Let
$$\Pi := \sum_{j=1}^{N} |\psi_j\rangle\langle\psi_j| \qquad (17.7)$$
be the projector onto the span of these states, and let
$$S := \sum_{j,k=1}^{N} |j, k\rangle\langle k, j| \qquad (17.8)$$
be the operator that swaps the two registers. Then a single step of the quantum walk is defined as the unitary operator $U := S(2\Pi - 1)$.
Notice that if Pjk = Ajk / deg(k) (i.e., if the walk simply chooses an outgoing edge of an underlying
digraph uniformly at random), then this is exactly the coined quantum walk with the Grover diffusion
operator as the coin flip.
If we take two steps of the walk, then the corresponding unitary operator is
$$U^2 = S(2\Pi - 1)\, S(2\Pi - 1) = (2S\Pi S - 1)(2\Pi - 1),$$
which can be interpreted as the reflection about $\mathrm{span}\{|\psi_j\rangle\}$ followed by the reflection about $\mathrm{span}\{S|\psi_j\rangle\}$ (the states where we condition on the second register to do a coin operation on the first). To understand the behavior of the walk, we will now compute the spectrum of $U$; but note that it is also possible to compute the spectrum of a product of reflections more generally.
17.3. Spectrum of the quantum walk 79
Theorem 17.1. Fix an $N \times N$ stochastic matrix $P$, and let $\{|\lambda\rangle\}$ denote a complete set of orthonormal eigenvectors of the $N \times N$ matrix $D$ with entries $D_{jk} = \sqrt{P_{jk} P_{kj}}$, with eigenvalues $\{\lambda\}$. Then the eigenvalues of the discrete-time quantum walk $U = S(2\Pi - 1)$ corresponding to $P$ are $\pm 1$ and $\lambda \pm i\sqrt{1 - \lambda^2} = e^{\pm i \arccos \lambda}$.
To prove this, consider the operator
$$T := \sum_{j=1}^{N} |\psi_j\rangle\langle j| \qquad (17.11)$$
$$= \sum_{j,k=1}^{N} \sqrt{P_{kj}}\, |j, k\rangle\langle j|. \qquad (17.12)$$
We have
$$T T^\dagger = \sum_{j,k=1}^{N} |\psi_j\rangle\langle j|k\rangle\langle\psi_k| \qquad (17.13)$$
$$= \sum_{j=1}^{N} |\psi_j\rangle\langle\psi_j| \qquad (17.14)$$
$$= \Pi, \qquad (17.15)$$
whereas
$$T^\dagger T = \sum_{j,k=1}^{N} |j\rangle\langle\psi_j|\psi_k\rangle\langle k| \qquad (17.16)$$
$$= \sum_{j,k,\ell,m=1}^{N} \sqrt{P_{\ell j} P_{mk}}\, |j\rangle\langle j, \ell|k, m\rangle\langle k| \qquad (17.17)$$
$$= \sum_{j,\ell=1}^{N} P_{\ell j}\, |j\rangle\langle j| \qquad (17.18)$$
$$= I \qquad (17.19)$$
and
$$T^\dagger S T = \sum_{j,k=1}^{N} |j\rangle\langle\psi_j|S|\psi_k\rangle\langle k| \qquad (17.20)$$
$$= \sum_{j,k,\ell,m=1}^{N} \sqrt{P_{\ell j} P_{mk}}\, |j\rangle\langle j, \ell|S|k, m\rangle\langle k| \qquad (17.21)$$
$$= \sum_{j,k=1}^{N} \sqrt{P_{jk} P_{kj}}\, |j\rangle\langle k| \qquad (17.22)$$
$$= D. \qquad (17.23)$$
We see that the subspace $\mathrm{span}\{T|\lambda\rangle, S T|\lambda\rangle\}$ is invariant under $U$, so we can find eigenvectors of $U$ within this subspace. Now let $|\tilde\lambda_\mu\rangle := T|\lambda\rangle - \mu\, S T|\lambda\rangle$, and let us choose $\mu \in \mathbb{C}$ so that $|\tilde\lambda_\mu\rangle$ is an eigenvector of $U$; this yields the eigenvalues $\lambda \pm i\sqrt{1 - \lambda^2} = e^{\pm i \arccos \lambda}$. Finally, note that for any vector in the orthogonal complement of $\mathrm{span}\{T|\lambda\rangle, S T|\lambda\rangle\}$, $U$ simply acts as $-S$ (since $\Pi = T T^\dagger = \sum_\lambda T|\lambda\rangle\langle\lambda|T^\dagger$ projects onto $\mathrm{span}\{T|\lambda\rangle\}$). In this subspace, the eigenvalues are $\pm 1$.
Let us assume from now on that the original walk $P$ is symmetric, though the modified walk $P'$ clearly is not provided $M$ is non-empty. If we order the vertices so that the marked ones come last, the matrix $P'$ has the block form
$$P' = \begin{pmatrix} P_M & 0 \\ Q & I \end{pmatrix}, \qquad (17.36)$$
where $P_M$ denotes the restriction of $P$ to the unmarked vertices.
Now if we start from the uniform distribution over unmarked items (if we start from a marked item we are done, so we might as well condition on this not happening), then the probability of not reaching a marked item after $t$ steps is
$$\frac{1}{N - |M|} \sum_{j,k \notin M} [P_M^t]_{jk} \leq \|P_M^t\| \leq \|P_M\|^t,$$
where the inequality follows because the left hand side is the expectation of $P_M^t$ in the normalized state $|V \setminus M\rangle = \frac{1}{\sqrt{N - |M|}} \sum_{j \notin M} |j\rangle$. Now if $\|P_M\| = 1 - \epsilon$, then the probability of reaching a marked item after $t$ steps is at least $1 - \|P_M\|^t = 1 - (1 - \epsilon)^t$, which is $\Omega(1)$ provided $t = O(1/\epsilon) = O\big(\frac{1}{1 - \|P_M\|}\big)$.
It turns out that we can bound $\|P_M\|$ away from 1 knowing only the fraction of marked vertices and the spectrum of the original walk. Thus we can upper bound the hitting time, the time required to reach some marked vertex with constant probability.
Lemma 17.2. If the second largest eigenvalue of $P$ (in absolute value) is at most $1 - \delta$ and $|M| \geq \epsilon N$, then $\|P_M\| \leq 1 - \frac{\delta\epsilon}{2}$.
Proof. Let $|v\rangle \in \mathbb{R}^{N - |M|}$ be the principal eigenvector of $P_M$, and let $|w\rangle \in \mathbb{R}^N$ be the vector obtained by padding $|v\rangle$ with 0s for all the marked vertices. We will decompose $|w\rangle$ in the eigenbasis of $P$. Since $P$ is symmetric, it is actually doubly stochastic, and the uniform vector $|V\rangle = \frac{1}{\sqrt{N}} \sum_j |j\rangle$ corresponds to the eigenvalue 1. All other eigenvectors $|\lambda\rangle$ have
Figure 17.1: The classical gap, $1 - \lambda = 1 - \cos\theta$, appears on the real axis. The quantum phase gap, $\theta = \arccos\lambda$, is quadratically larger, since $\cos\theta \geq 1 - \theta^2/2$, i.e., $\arccos\lambda \geq \sqrt{2(1 - \lambda)}$. (The figure shows the point $e^{i\theta}$ on the unit circle in the complex plane.)
and measuring whether the first register corresponds to a marked vertex; if it does then we are done, and if not then we have prepared $|\psi\rangle$.
The matrix $D$ for the walk $P'$ is
$$D = \begin{pmatrix} P_M & 0 \\ 0 & I \end{pmatrix}, \qquad (17.51)$$
so according to Theorem 17.1, the eigenvalues of the resulting walk operator $U$ are $\pm 1$ and $e^{\pm i \arccos \lambda}$, where $\lambda$ runs over the eigenvalues of $P_M$. If the marked set $M$ is empty, then $P' = P$, and $|\psi\rangle$ is an eigenvector of $U$ with eigenvalue 1, so phase estimation on $U$ is guaranteed to return a phase of 0. But if $M$ is non-empty, then the state $|\psi\rangle$ lives entirely within the subspace with eigenvalues $e^{\pm i \arccos \lambda}$. Thus if we perform phase estimation on $U$ with precision $O(\min_\lambda \arccos \lambda)$, we will see a phase different from 0. Since $\arccos \lambda \geq \sqrt{2(1 - \lambda)}$ (see Figure 17.1 for an illustration), we see that precision $O(\sqrt{1 - \|P_M\|})$ suffices. So the quantum algorithm can decide whether there is a marked vertex in time $O(1/\sqrt{1 - \|P_M\|}) = O(1/\sqrt{\delta\epsilon})$.
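The quadratic relation between the classical gap and the quantum phase gap invoked here amounts to the elementary inequality $\arccos\lambda \geq \sqrt{2(1-\lambda)}$, which a quick numerical sweep confirms:

```python
# Numerical check of the phase-gap inequality arccos(lambda) >= sqrt(2(1 - lambda)).
import numpy as np

lam = np.linspace(-1.0, 1.0, 1001)
print(np.all(np.arccos(lam) >= np.sqrt(2 * (1 - lam)) - 1e-12))   # → True
```

Equivalently, writing $\lambda = \cos\theta$, the inequality reduces to $\theta \geq 2\sin(\theta/2)$, which holds for all $\theta \in [0, \pi]$.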
Chapter 18
Unstructured search
Now we begin to discuss applications of quantum walks to search algorithms. We start with the most basic
of all search problems, the unstructured search problem (which is solved optimally by Grover's algorithm).
We discuss how this problem fits into the framework of quantum walk search, and also describe amplitude
amplification and quantum counting in this setting. We also discuss quantum walk algorithms for the search
problem under locality constraints.
This random walk gives rise to a very simple classical algorithm for unstructured search. In this algorithm, we start from a uniformly random item and repeatedly choose a new item uniformly at random from the other $N - 1$ possibilities, stopping when we reach a marked item. The fraction of marked items is $\epsilon = |M|/N$, so the hitting time of this walk is
$$O\!\left(\frac{1}{\delta\epsilon}\right) = \frac{(N-1)N}{(N-2)\,|M|} = O(N/|M|) \qquad (18.3)$$
(this is only an upper bound on the hitting time, but in this case we know it is optimal). Of course, if we have no a priori lower bound on $|M|$ in the event that $M$ is non-empty, the best we can say is that $\epsilon \geq 1/N$, giving a running time $O(N)$.
The corresponding quantum walk search algorithm has a hitting time of
$$O\!\left(\frac{1}{\sqrt{\delta\epsilon}}\right) = O(\sqrt{N/|M|}), \qquad (18.4)$$
corresponding to the running time of Grover's algorithm. To see that this actually gives an algorithm using $O(\sqrt{N/|M|})$ queries, we need to see that a step of the quantum walk can be performed using only $O(1)$ quantum queries. In the case where the first item is marked, the modified classical walk matrix is
$$P' = \frac{1}{N-1} \begin{pmatrix} N-1 & 1 & 1 & \cdots & 1 \\ 0 & 0 & 1 & \cdots & 1 \\ 0 & 1 & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 1 \\ 0 & 1 & 1 & \cdots & 0 \end{pmatrix}, \qquad (18.5)$$
so that the vectors $|\psi_j\rangle$ are $|\psi_1\rangle = |1, 1\rangle$ and
$$|\psi_j\rangle = |j, S \setminus \{j\}\rangle = \sqrt{\tfrac{N}{N-1}}\, |j, S\rangle - \tfrac{1}{\sqrt{N-1}}\, |j, j\rangle \quad \text{for } j = 2, \ldots, N.$$
With a general marked set $M$, the projector onto the span of these states is
$$\Pi = \sum_{j \in M} |j, j\rangle\langle j, j| + \sum_{j \notin M} |j, S \setminus \{j\}\rangle\langle j, S \setminus \{j\}|, \qquad (18.6)$$
so the operator $2\Pi - 1$ acts as Grover diffusion over the neighbors when the vertex is unmarked, and as a phase flip when the vertex is marked. (Note that since we start from the state $|\psi\rangle = \sum_{j \notin M} |\psi_j\rangle / \sqrt{N - |M|}$, we stay in the subspace of states $\mathrm{span}\{|j, k\rangle : (j, k) \in E\}$, and in particular have zero support on any state $|j, j\rangle$ for $j \in V$, so $2\Pi - 1$ acts as $-1$ when the first register holds a marked vertex.) Each such step can be implemented using two queries of the black box, one to compute whether we are at a marked vertex and one to uncompute that information; the subsequent swap operation requires no queries. Thus the query complexity is indeed $O(\sqrt{N/|M|})$.
This algorithm is not exactly the same as Grover's; for example, it works in the Hilbert space $\mathbb{C}^N \otimes \mathbb{C}^N$ instead of $\mathbb{C}^N$. Nevertheless, it is clearly closely related. In particular, notice that in Grover's algorithm, the unitary operation $2|S\rangle\langle S| - 1$ can be viewed as a kind of discrete-time quantum walk on the complete graph, where in this particular case no coin is necessary to define the walk.
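The correspondence with Grover's algorithm can be seen in simulation. The sketch below ($N$ and the number of steps are illustrative choices) iterates $U = S(2\Pi - 1)$ with the projector (18.6), starting from the uniform superposition over the unmarked $|\psi_j\rangle$, and watches the probability of the first register holding a marked vertex grow:

```python
# Quantum-walk search on the complete graph, per (18.6): marked vertices get
# the self-loop coin state |j,j>. The success probability becomes a constant
# after O(sqrt(N)) steps. N, the marked set, and the step count are illustrative.
import numpy as np

N = 16
marked = {0}
dim = N * N                                # basis |j,k>, index j*N + k

Pi = np.zeros((dim, dim))
for j in range(N):
    v = np.zeros(dim)
    if j in marked:
        v[j * N + j] = 1.0                 # |j, j>
    else:
        for k in range(N):
            if k != j:
                v[j * N + k] = 1.0 / np.sqrt(N - 1)   # |j, S \ {j}>
    Pi += np.outer(v, v)

S = np.zeros((dim, dim))
for j in range(N):
    for k in range(N):
        S[k * N + j, j * N + k] = 1.0
U = S @ (2 * Pi - np.eye(dim))

state = np.zeros(dim, dtype=complex)
for j in range(N):
    if j not in marked:
        for k in range(N):
            if k != j:
                state[j * N + k] = 1.0
state /= np.linalg.norm(state)

best = 0.0
for _ in range(15):                        # a few times sqrt(N) steps
    state = U @ state
    p_marked = sum(abs(state[j * N + k]) ** 2 for j in marked for k in range(N))
    best = max(best, p_marked)
print(best)                                # grows to a constant
```

The rotation angle per step is roughly $\arccos\|P_M\|$, so the peak arrives after about $\pi/\arccos\|P_M\| = O(\sqrt{N})$ steps, matching (18.4).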
The algorithm we have described so far only solves the decision version of unstructured search. To find a marked item, we could use bisection, but this would introduce a logarithmic overhead. In fact, it can be
shown that the final state of the quantum walk algorithm actually encodes a marked item when one exists.
Amplitude amplification is a general method for boosting the success probability of a (classical or quantum) subroutine [23]. It can be implemented by quantum walk search as follows. Suppose we have a procedure that produces a correct answer with probability $p$ (i.e., with an amplitude of magnitude $\sqrt{p}$ if we view it as a quantum process). From this procedure we can define a two-state Markov chain that, at each step, moves from the state where the answer is not known to the state where the answer is known with probability $p$, and then remains there. This walk has the transition matrix
$$P' = \begin{pmatrix} 1 - p & 0 \\ p & 1 \end{pmatrix},$$
so $P_M = 1 - p$, giving a quantum hitting time of $O(1/\sqrt{1 - \|P_M\|}) = O(1/\sqrt{p})$.
For some applications, it may be desirable to estimate the value of $p$. Quantizing the above two-state Markov chain gives eigenvalues in the non-marked subspace of $e^{\pm i \arccos(1-p)} = e^{\pm i(\sqrt{2p} + O(p^{3/2}))}$. By applying phase estimation, we can determine $p$ approximately. Recall that phase estimation gives an estimate with precision $\epsilon$ using $O(1/\epsilon)$ applications of the given unitary [34] (assuming we cannot apply high powers of the unitary any more efficiently than simply applying it repeatedly). An estimate of $\sqrt{p}$ with precision $\epsilon$ gives an estimate of $p$ with precision $\epsilon\sqrt{p}$ (since $(\sqrt{p} + O(\epsilon))^2 = p + O(\epsilon\sqrt{p})$), so we can produce an estimate of $p$ with precision $\epsilon$ in $O(\sqrt{p}/\epsilon)$ steps.
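The expansion used here is just $\arccos(1 - p) = \sqrt{2p} + O(p^{3/2})$, which a quick numerical check confirms:

```python
# Check the small-p expansion arccos(1 - p) = sqrt(2p) + O(p^{3/2}).
import numpy as np

for p in [0.1, 0.01, 0.001]:
    err = np.arccos(1 - p) - np.sqrt(2 * p)
    print(p, err)          # err shrinks like p**1.5
```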
In particular, if the Markov chain is a search of the complete graph as described in the previous section, with $|M|$ marked sites out of $N$, then $p = |M|/N$, and this allows us to count the number of marked items. We obtain an estimate of $|M|/N$ with precision $\epsilon$ in $O(\sqrt{|M|/N}/\epsilon)$ steps. If we want a multiplicative approximation of $|M|$ with precision $\epsilon$, this means we need $O(\sqrt{N/|M|}/\epsilon)$ steps.
Note that for exact counting, no speedup is possible in general. If $|M| = \Theta(N)$ then we need to estimate $p$ with precision $O(1/N)$ to uniquely determine $|M|$, but then the running time of the above procedure is $O(N)$. In fact, it can be shown that exact counting requires $\Omega(N)$ queries [16].
given by
$$|k\rangle := \frac{1}{\sqrt{N}} \sum_{x} e^{2\pi i\, k \cdot x / N^{1/d}}\, |x\rangle \qquad (18.7)$$
where $k$ is a $d$-component vector of integers from 0 to $N^{1/d} - 1$. The corresponding eigenvalues are
$$\sum_{j=1}^{d} 2 \cos\!\left(\frac{2\pi k_j}{N^{1/d}}\right). \qquad (18.8)$$
Normalizing to obtain a stochastic matrix, we simply divide these eigenvalues by $2d$. The 1 eigenvector has $k = (0, 0, \ldots, 0)$, and the second largest eigenvalue comes from (e.g.) $k = (1, 0, \ldots, 0)$, with an eigenvalue
$$\frac{1}{d}\left(d - 1 + \cos\frac{2\pi}{N^{1/d}}\right) \approx 1 - \frac{1}{2d}\left(\frac{2\pi}{N^{1/d}}\right)^2. \qquad (18.9)$$
2
2 2/d
Thus the gap of the walk matrix P is about dN 2/d = O(N ). This is another case in which the bound
on the classical hitting time in terms of eigenvalues of P is too loose (it gives only O(N 1+2/d )), and instead
we must directly estimate the gap of PM . One can show that the classical hitting time is O(N 2 ) in d = 1,
O(N log N ) in d = 2, and O(N ) for any d 3. Thus there is a local quantum walk search algorithm that
saturates the lower bound for any d 3, and one that runs in time time O( N log N ) for d = 2. We already
argued that there could be no speedup for d = 1, and indeed we see that the quantum hitting time in this
case is O(N ).
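The gap estimate (18.9) is easy to check numerically. The sketch below (an illustration with made-up side lengths, not from the text) computes the exact gap 1 − (1/d)(d − 1 + cos(2π/N^{1/d})) of the normalized walk matrix and confirms that it agrees with (2π²/d)N^{−2/d} to leading order.

```python
import math

def torus_gap(L, d):
    """Exact spectral gap of the normalized walk matrix P on the d-dimensional
    torus of side L (so N = L**d sites). The second largest eigenvalue comes
    from k = (1, 0, ..., 0)."""
    second = (d - 1 + math.cos(2 * math.pi / L)) / d
    return 1 - second

for d, L in [(1, 64), (2, 32), (3, 16)]:
    N = L ** d
    gap = torus_gap(L, d)
    predicted = (2 * math.pi ** 2 / d) * N ** (-2 / d)
    # agreement to leading order (within a few percent for these sizes)
    assert abs(gap - predicted) / predicted < 0.05
```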
Chapter 19

Quantum walk search
In this lecture we will discuss the algorithm that cemented the importance of quantum walk as a tool for quantum query algorithms: Ambainis's algorithm for the element distinctness problem [11]. The key new conceptual idea of this algorithm is to consider walks that store information obtained from many queries at each vertex, but that do not require many queries to update this information for an adjacent vertex. This idea leads to a general, powerful framework for quantum walk search [70, 84].
will detect. Hence a k-query element distinctness algorithm implies an O(√k)-query collision algorithm; or equivalently, a k-query collision lower bound implies an Ω(k²) element distinctness lower bound.

Now the question remains: can we close the gap between the O(n^{3/4}) upper bound and this Ω(n^{2/3}) lower bound? Ambainis's quantum walk algorithm does exactly this.
Chapter 20

Query complexity and the polynomial method
So far, we have discussed several different kinds of quantum algorithms. In the next few chapters, we discuss
ways of establishing limitations on the power of quantum algorithms [57]. After reviewing the model of
quantum query complexity, this chapter presents the polynomial method, an approach that relates quantum
query algorithms to properties of polynomials.
D(f) denotes the deterministic query complexity, where the algorithm is classical and must always work correctly.

R_ε denotes the randomized query complexity with error probability at most ε. Note that it does not depend strongly on ε, since we can boost the success probability by repeating the computation several times and taking a majority vote. Therefore R_ε(f) = Θ(R_{1/3}(f)) for any constant ε, so sometimes we simply write R(f).

Q_ε denotes the quantum query complexity, again with error probability at most ε. Similarly to the randomized case, Q_ε(f) = Θ(Q_{1/3}(f)) for any constant ε, so sometimes we simply write Q(f).
We know that D(or) = n and R(or) = Θ(n). Grover's algorithm shows that Q(or) = O(√n). In this lecture we will use the polynomial method to show (among other things) that Q(or) = Ω(√n), a tight lower bound.
This is simply the linear extension of the natural reversible oracle mapping (i, b) ↦ (i, b ⊕ x_i), which can be performed efficiently given the ability to efficiently compute i ↦ x_i. Note that the algorithm may involve states in a larger Hilbert space; implicitly, the oracle acts as the identity on any ancillary registers.
It is often convenient to instead consider the phase oracle, which is obtained by conjugating the bit-flip oracle by Hadamard gates: by the well-known phase kickback trick, O'_x = (I ⊗ H)O_x(I ⊗ H) satisfies O'_x|i, b⟩ = (−1)^{b x_i}|i, b⟩. Note that this is slightly wasteful since O'_x|i, 0⟩ = |i, 0⟩ for all i; we could equivalently consider a phase oracle O''_x defined by O''_x|0⟩ = |0⟩ and O''_x|i⟩ = (−1)^{x_i}|i⟩ for all i ∈ {1, ..., n}. However, it is essential to include the ability to not query the oracle by giving the oracle some eigenstate of known eigenvalue, independent of x. If we could only perform the phase flip |i⟩ ↦ (−1)^{x_i}|i⟩ for i ∈ {1, ..., n}, then we could not tell a string x from its bitwise complement x̄.
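The phase kickback identity above can be checked directly with small matrices. The following sketch (an illustration; the string x and the toy n-dimensional index register are made up) builds the bit-flip oracle and verifies that conjugation by H on the output qubit diagonalizes it with entries (−1)^{b x_i}.

```python
import numpy as np

n = 3
x = [1, 0, 1]                            # hypothetical oracle string x_1 x_2 x_3
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# Bit-flip oracle O_x |i,b> = |i, b XOR x_i>, on the basis ordered as (i, b)
Ox = np.zeros((2 * n, 2 * n))
for i in range(n):
    for b in range(2):
        Ox[2 * i + (b ^ x[i]), 2 * i + b] = 1

# Conjugating the output qubit by Hadamard gives the phase oracle
phase = np.kron(np.eye(n), H) @ Ox @ np.kron(np.eye(n), H)

# The result is diagonal with entries (-1)^(b * x_i): H X H = Z blockwise
expected = np.diag([(-1) ** (b * x[i]) for i in range(n) for b in range(2)])
assert np.allclose(phase, expected)
```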
These constructions can easily be generalized to the case of a d-ary input alphabet, say Σ = Z_d (identifying input symbols with integers modulo d). Then for b ∈ Σ, we can define an oracle O_x by O_x|i, b⟩ = |i, b + x_i⟩ (with addition in Z_d). Taking the Fourier transform of the second register gives a phase oracle Ô_x = (I ⊗ F_{Z_d})O_x(I ⊗ F_{Z_d}^†) satisfying Ô_x|i, b⟩ = ω_d^{b x_i}|i, b⟩, where ω_d := e^{2πi/d}.
Lemma 20.1. The acceptance probability of a t-query quantum algorithm for a problem with black-box input x ∈ {0,1}^n is a polynomial in x_1, ..., x_n of degree at most 2t.

Proof. We claim that the amplitude of any basis state is a polynomial of degree at most t, so that the probability of any basis state (and hence the probability of success) is a polynomial of degree at most 2t.

The proof is by induction on t. If an algorithm makes no queries to the input, then its success probability is independent of the input, so it is a constant, a polynomial of degree 0.

For the induction step, a query maps

    |i, b⟩ ↦ (−1)^{b x_i} |i, b⟩    (20.6)
          = (1 − 2 b x_i) |i, b⟩,    (20.7)

which is linear in x_i. Thus, if the amplitudes before a query are polynomials of degree at most t − 1, then the amplitudes after that query are polynomials of degree at most t (and the subsequent unitary, which is independent of x, does not change the degree).
Consider a Boolean function f: {0,1}^n → {0,1}. We say a polynomial p ∈ R[x_1, ..., x_n] represents f if p(x) = f(x) for all x ∈ {0,1}^n. Letting deg(f) denote the smallest degree of any polynomial representing f, we have Q_0(f) ≥ deg(f)/2.

To handle bounded-error algorithms, we introduce the concept of approximate degree. We say a polynomial p ε-represents f if |p(x) − f(x)| ≤ ε for all x ∈ {0,1}^n. Then the ε-approximate degree of f, denoted \widetilde{deg}_ε(f), is the smallest degree of any polynomial that ε-represents f. Clearly, Q_ε(f) ≥ \widetilde{deg}_ε(f)/2. Since bounded-error query complexity does not depend strongly on the particular error probability ε, we can define, say, \widetilde{deg}(f) := \widetilde{deg}_{1/3}(f).

Now to lower bound the quantum query complexity of a Boolean function, it suffices to lower bound its approximate degree.
20.4 Symmetrization
While polynomials are well-understood objects, the acceptance probability is a multivariate polynomial, so it can be rather complicated. Since x² = x for x ∈ {0,1}, we can restrict our attention to multilinear polynomials, but it is still somewhat difficult to deal with such polynomials directly. Fortunately, for many functions it suffices to consider a related univariate polynomial obtained by symmetrization.

For a string x ∈ {0,1}^n, let |x| denote the Hamming weight of x, the number of 1s in x.
Lemma 20.2. Given any n-variate multilinear polynomial p, let P(k) := E_{|x|=k}[p(x)]. Then P is a polynomial with deg(P) ≤ deg(p).
Thus the polynomial method is a particularly natural approach for symmetric functions, those that only
depend on the Hamming weight of the input.
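Lemma 20.2 is easy to check numerically. The sketch below (an illustration with a made-up polynomial) symmetrizes the multilinear polynomial p(x) = x₁x₂ + x₃ on n = 4 variables and confirms that the resulting univariate P is exactly a polynomial of degree at most deg(p) = 2.

```python
import itertools
import numpy as np

n = 4
def p(x):
    # multilinear polynomial of degree 2 (a made-up example)
    return x[0] * x[1] + x[2]

# P(k) = average of p over all strings of Hamming weight k
P = [np.mean([p(x) for x in itertools.product([0, 1], repeat=n) if sum(x) == k])
     for k in range(n + 1)]

# Fit a degree-2 univariate polynomial through the n+1 points; a perfect fit
# confirms deg(P) <= deg(p) = 2.
coeffs = np.polyfit(range(n + 1), P, 2)
assert np.allclose(np.polyval(coeffs, range(n + 1)), P)
```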
20.5 Parity
Let parity: {0,1}^n → {0,1} denote the symmetric function parity(x) = x_1 ⊕ ⋯ ⊕ x_n. Recall that Deutsch's problem, which is the problem of computing the parity of 2 bits, can be solved exactly with only one quantum query. Applying this algorithm to a pair of bits at a time and then taking the parity of the results, we see that Q_0(parity) ≤ n/2.

What can we say about lower bounds for computing parity? Symmetrizing parity gives the function P: {0, 1, ..., n} → R defined by

    P(k) = { 0 if k is even; 1 if k is odd }.    (20.14)

Since P changes direction n times, deg(P) ≥ n, so we see that Q_0(parity) ≥ n/2. Thus Deutsch's algorithm is tight among zero-error algorithms.
What about bounded-error algorithms? To understand this, we would like to lower bound the approximate degree of parity. If |p(x) − f(x)| ≤ ε for all x ∈ {0,1}^n, then

    |P(k) − F(k)| = |E_{|x|=k}[p(x) − f(x)]| ≤ ε    (20.15)

for all k ∈ {0, 1, ..., n}, where P is the symmetrization of p and F is the symmetrization of f. Thus, a multilinear polynomial p that ε-approximates parity implies a univariate polynomial P satisfying P(k) ≤ ε for k even and P(k) ≥ 1 − ε for k odd. For any ε < 1/2, this function still changes direction n times, so in fact we have \widetilde{deg}_ε(f) ≥ n, and hence Q_ε(parity) ≥ n/2.

This shows that the strategy for computing parity using Deutsch's algorithm is optimal, even among bounded-error algorithms. This is an example of a problem for which a quantum computer cannot get a significant speedup: here the speedup is only by a factor of 2. In fact, we need at least n/2 queries to succeed with any bounded error, even with very small advantage (e.g., even if we only want to be correct with probability 1/2 + 10^{−100}). In contrast, while the adversary method can prove an Ω(n) lower bound for parity, the constant factor that it establishes is error-dependent.

Note that this also shows we need Ω(n) queries to exactly count the number of marked items in an unstructured search problem, since exactly determining the number of 1s would in particular determine whether the number of 1s is odd or even.
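A concrete supplementary check that the approximate degree of parity is at least n (using finite differences rather than the direction-counting argument above): the n-th finite difference Σ_k (−1)^k C(n,k) Q(k) vanishes for every polynomial Q of degree less than n, yet for the symmetrized parity F(k) = k mod 2 it has magnitude 2^{n−1}. Since Σ_k C(n,k) = 2^n, any P with |P(k) − F(k)| ≤ ε would have a finite difference of magnitude at least 2^{n−1} − 2^n ε > 0 for ε < 1/2, so P must have degree at least n.

```python
import math

n = 10

# n-th finite difference of the symmetrized parity F(k) = k mod 2
diff_F = sum((-1) ** k * math.comb(n, k) * (k % 2) for k in range(n + 1))
assert abs(diff_F) == 2 ** (n - 1)

# the same signed sum annihilates any polynomial of degree < n,
# e.g. the monomial k^(n-1)
diff_P = sum((-1) ** k * math.comb(n, k) * k ** (n - 1) for k in range(n + 1))
assert diff_P == 0
```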
20.6 Unstructured search

Now we apply the polynomial method to unstructured search, i.e., to the or function. We use the Markov brothers' inequality, which states that for any polynomial P,

    max_{x∈[0,n]} |dP(x)/dx| ≤ (deg(P)²/n) (max_{x∈[0,n]} P(x) − min_{x∈[0,n]} P(x)).    (20.16)

Equivalently, if we let

    h := max_{x∈[0,n]} P(x) − min_{x∈[0,n]} P(x)    (20.17)

denote the total variation of P over that range and

    d := max_{x∈[0,n]} |dP(x)/dx|    (20.18)

denote the largest derivative of P in that range, then we have deg(P) ≥ √(nd/h).
Now let P be a polynomial that ε-approximates or. Since P(0) ≤ ε and P(1) ≥ 1 − ε, P must increase by at least 1 − 2ε in going from k = 0 to k = 1, so d ≥ 1 − 2ε.

We have no particular bound on h, since we have no control over the value of P at non-integer points; the function could become arbitrarily large or small. However, since P(k) ∈ [0, 1] for k ∈ {0, 1, ..., n}, a large value of h implies a large value of d, since P must change fast enough to start from and return to values in the range [0, 1]. In particular, P must change by at least (h − 1)/2 over a range of k of width at most 1/2, so we have d ≥ h − 1. Therefore,

    deg(P) ≥ √(n · max{1 − 2ε, h − 1} / h)    (20.19)
           = Ω(√n).    (20.20)

It follows that Q(or) = Ω(√n).
Note that the same argument applies for a function that takes the value 0 whenever |x| = w and the value
1 whenever |x| = w + 1, for any w; in particular, it applies to any non-constant symmetric function. (Of
course, we can do better for some symmetric functions, such as parity and also majority, among others.)
Chapter 21

The collision problem
We now discuss the quantum lower bound for the collision problem. This lower bound is a more involved application of the polynomial method than the simple examples we've seen so far.
query random indices until we observe a collision. For two distinct random indices i, j ∈ {1, ..., n}, we have Pr(x_i = x_j) = 1/(n − 1). If we query m indices in this way, there are \binom{m}{2} pairs, so the expected number of collisions seen is \binom{m}{2}/(n − 1). With m = √n this is Θ(1), so we expect to see a collision. Indeed, a second moment argument shows that this happens with constant probability.

In fact, this algorithm is optimal. Until we see a collision, there is no better strategy than to query randomly. By the union bound, the probability of seeing a collision after making m queries is at most \binom{m}{2}/(n − 1) = O(m²/n), so m = Ω(√n).
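The birthday argument can be checked by simulation. The following Monte Carlo sketch (all parameters made up) queries m ≈ √n random positions of a random 2-to-1 function and confirms that a collision appears with constant probability.

```python
import random

def trial(n, m, rng):
    """One experiment: sample m distinct indices of a random 2-to-1 function
    on {0, ..., n-1} and report whether a collision was seen."""
    vals = list(range(n // 2)) * 2      # a random 2-to-1 function
    rng.shuffle(vals)
    seen = set()
    for i in rng.sample(range(n), m):
        if vals[i] in seen:
            return True
        seen.add(vals[i])
    return False

rng = random.Random(1234)
n = 10_000
m = int(n ** 0.5)                       # ~ C(m,2)/(n-1) = Theta(1) expected pairs
hits = sum(trial(n, m, rng) for _ in range(300))
assert 0.1 < hits / 300 < 0.9           # collision probability is a constant
```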
Then we claim that the acceptance probability of a t-query quantum algorithm is a multilinear polynomial in the {δ_ij} of degree at most 2t.

This can be proved along similar lines to the binary case. Suppose we use an addition modulo n query, where |i, j⟩ ↦ |i, j + x_i⟩. Then we have

    |i, j + x_i⟩ = Σ_k δ_ik |i, j + k⟩    (21.4)
                = Σ_ℓ δ_{i,ℓ−j} |i, ℓ⟩,    (21.5)
so the degree of each amplitude can increase by at most 1 when we make a query.
Next we would like to obtain a simpler polynomial. We cannot directly symmetrize over the variables {δ_ij}, as this would destroy too much of the structure of the problem.
The original idea leading to a nontrivial collision lower bound, due to Aaronson, was to express the acceptance probability as a bivariate polynomial in n and r, where the function is r-to-one. The main difficulty with this approach is that we need to have r | n in order for such inputs to make sense (so that we can at least say that the acceptance probability of a quantum algorithm is defined, and hence a given approximating polynomial is bounded between 0 and 1). This approach originally gave a lower bound of Ω(n^{1/5}) [1]. Subsequently, Shi improved this to give the optimal result of Ω(n^{1/3}) [90].
Given such an input, we can obtain a family of inputs by permuting the input alphabet and the characters of the string arbitrarily. Specifically, for any permutations π, σ of {1, ..., n}, we define an input x̃ with x̃_i := σ(x_{π(i)}). This induces corresponding binary variables δ_ij with δ_ij = 1 iff x̃_i = j.

Now we claim that the acceptance probability of a quantum algorithm presented with such an input is a polynomial in m, a, b.
Lemma 21.1. Let p({δ_ij}) be a polynomial in the δ_ij. For any valid triple (m, a, b), let

    P(m, a, b) := E_{π,σ}[p({δ_ij})].    (21.7)

Then P is a polynomial in m, a, b of degree at most deg(p).

where

    (n)_k := n!/(n − k)! = n(n − 1) ⋯ (n − k + 1).    (21.13)
Here the numerator of (21.11) has three contributions: the number of ways to permute the u function values in the a-to-one part is (m/a)_u, the number of ways to permute the t − u function values in the b-to-one part is ((n − m)/b)_{t−u}, and the number of ways to permute the n − t remaining function values is (n − t)!. Note that this expression is a rational function in m, a, b whose numerator has degree t and whose denominator is a^u b^{t−u}.
Now consider the latter term of (21.10). Given that X occurs, Pr_π[∀j, ∀i ∈ S_j, δ_ij = 1] is independent of σ, so suppose σ is any permutation such that X occurs. In other words, we only need to count consistent ways of permuting the indices i. Observe that the number of ways to permute the indices i such that x_i = j for some j ∈ U is

    a(a − 1) ⋯ (a − |S_j| + 1) = (a)_{|S_j|}.    (21.14)

Similarly, the number of ways to permute the indices i such that x_i = j for some j ∉ U is (b)_{|S_j|}. In addition, we can permute the remaining n − s indices however we like. Thus we have

    Pr_π[∀j, ∀i ∈ S_j, δ_ij = 1 | X] = (n − s)! ∏_{j∈U} (a)_{|S_j|} ∏_{j∉U} (b)_{|S_j|} / n!    (21.15)
                                     = ∏_{j∈U} (a)_{|S_j|} ∏_{j∉U} (b)_{|S_j|} / (n)_s.    (21.16)

This expression is a polynomial in a, b of degree s. Also, it is divisible by a^u and b^{t−u}. Thus P(m, a, b) is a polynomial in m, a, b of degree t + s − u − (t − u) = s.
Lemma 21.2 (Paturi). Let f ∈ R[x], let a, b ∈ Z with a < b, and let ξ ∈ [a, b]. If there are constants c, d such that

    |f(i)| ≤ c for all integers i ∈ [a, b], and
    |f(⌊ξ⌋) − f(ξ)| ≥ d,

then deg(f) = Ω(√((ξ − a + 1)(b − ξ + 1))).

Regardless of ξ, Paturi's lemma always shows that the degree is Ω(√(b − a)). If ξ is near the middle of the range then it does much better, showing that the degree is Ω(b − a). Also note that by continuity, it is sufficient for f to differ by a constant amount between two consecutive integers.
Now we are ready to prove that the quantum query complexity of the collision problem is Ω(n^{1/3}). Let p({δ_ij}) be the acceptance probability of a t-query quantum algorithm that solves the collision problem with error probability at most 1/3, and let P(m, a, b) be as above. We know that t ≥ deg(P)/2. Furthermore, we know that P has the following properties:

    0 ≤ P(m, 1, 1) ≤ 1/3
    2/3 ≤ P(m, 2, 2) ≤ 1

Now consider inputs that are roughly half one-to-one and half two-to-one. For concreteness, let m = 2⌊n/4⌋. Since n − m is even (recall that n is always even by assumption since otherwise the problem is trivial), (m, 1, 2) is valid. We consider two cases, depending on whether the algorithm is more likely to call this a yes or a no input. First suppose P(m, 1, 2) ≥ 1/2.

Let r be the smallest integer such that |P(m, 1, r)| ≥ 2. First we consider P(m, 1, x) as a function of x. For all x ∈ {1, ..., r − 1}, we have −2 ≤ P(m, 1, x) ≤ 2. But we also know that |P(m, 1, 1) − P(m, 1, 2)| ≥ 1/2 − 1/3 = 1/6. By Paturi's lemma, this implies that deg(P) = Ω(√r).
On the other hand, consider the polynomial g(x) := P(n − rx, 1, r). When x ∈ Z is such that rx ∈ {0, ..., n}, the triple (n − rx, 1, r) is valid, so we have 0 ≤ g(x) ≤ 1 for all integers x ∈ [0, ⌊n/r⌋]. However, |g((n − m)/r)| = |P(m, 1, r)| ≥ 2, and (n − m)/r is about halfway between 0 and ⌊n/r⌋. Thus by Paturi's lemma, deg(P) = Ω(n/r).

Combining these results, we have deg(P) = Ω(√r + n/r). The weakest lower bound is obtained when the two terms are equal, i.e., when r = Θ(n^{2/3}); therefore deg(P) = Ω(n^{1/3}).
It remains to consider the case where P(m, 1, 2) < 1/2. But the same conclusion holds here by a very similar argument. (Let r be the smallest even integer for which |P(m, r, 2)| ≥ 2; on the one hand, deg(P) = Ω(√r) as before, but on the other hand, the polynomial h(x) := P(rx, r, 2) shows that deg(P) = Ω(n/r).)

Overall, it follows that any quantum algorithm for solving the collision problem must use Ω(n^{1/3}) queries.
Chapter 22

The quantum adversary method
We now discuss a second approach to proving quantum query lower bounds, the quantum adversary method [10]. In fact, we'll see later that the generalized version of the adversary method we consider here (allowing negative weights [55]) turns out to be an upper bound on quantum query complexity, up to constant factors [79, 68].
An algorithm does not have direct access to the oracle string, and hence can only perform unitary operations that act as the identity on the adversary's superposition. After t steps, an algorithm maps the overall state to

    |ψ^t⟩ := (I ⊗ U_t)O ⋯ (I ⊗ U_2)O(I ⊗ U_1)O (Σ_{x∈S} a_x |x⟩|ψ⟩)    (22.2)
          = Σ_{x∈S} a_x |x⟩|ψ^t_x⟩.    (22.3)
The main idea of the approach is that for the algorithm to learn x, this state must become very entangled. To measure the entanglement of the pure state |ψ^t⟩, we can consider the reduced density matrix of the oracle,

    ρ^t := Σ_{x,y∈S} a_x a_y^* ⟨ψ^t_y|ψ^t_x⟩ |x⟩⟨y|.    (22.4)

Initially, the state ρ^0 is pure. Our goal is to quantify how mixed it must become (i.e., how entangled the overall state must be) before we can compute f with error at most ε. To do this we could consider, for example, the entropy of ρ^t. However, it turns out that other measures are easier to deal with.
In particular, we have the following basic fact about the distinguishability of quantum states (for a proof, see for example Section A.9 of KLM):

Fact 22.1. Given one of two pure states |ψ⟩, |φ⟩, we can make a measurement that determines which state we have with error probability at most ε ∈ [0, 1/2] if and only if |⟨ψ|φ⟩| ≤ 2√(ε(1 − ε)).

Thus it is convenient to consider measures that are linear in the inner products ⟨ψ^t_x|ψ^t_y⟩.
Note that this is a simple function of the entries of ρ^j. The idea of the lower bound is to show that W^j starts out large, must become small in order to compute f, and cannot change by much if we make a query.

The initial value of the weight function is

    W^0 = Σ_{x,y∈S} Γ_{xy} a_x^* a_y ⟨ψ^0_x|ψ^0_y⟩    (22.6)
        = Σ_{x,y∈S} Γ_{xy} a_x^* a_y    (22.7)

since |ψ^0_x⟩ cannot depend on x. To make this as large as possible, we take a to be a principal eigenvector of Γ, an eigenvector with eigenvalue ±‖Γ‖. Then |W^0| = ‖Γ‖.
The final value of the weight function is easier to bound if we assume a nonnegative adversary matrix. The final value is constrained by the fact that we must distinguish x from y with error probability at most ε whenever f(x) ≠ f(y). For this to hold after t queries, we need |⟨ψ^t_x|ψ^t_y⟩| ≤ 2√(ε(1 − ε)) for all pairs x, y ∈ S with f(x) ≠ f(y) (by the above Fact). Thus we have

    |W^t| ≤ Σ_{x,y∈S} Γ_{xy} a_x a_y · 2√(ε(1 − ε))    (22.8)
          = 2√(ε(1 − ε)) ‖Γ‖.    (22.9)

Here we can include the terms where f(x) = f(y) in the sum since Γ_{xy} = 0 for such pairs. We also used the fact that the principal eigenvector of a nonnegative matrix can be taken to have nonnegative entries (by the Perron-Frobenius theorem).
A similar bound holds if Γ has negative entries, but we need a different argument. In general, one can only show that |W^t| ≤ (2√(ε(1 − ε)) + 2ε)‖Γ‖. But if we assume that f: S → {0,1} has Boolean output, then we can prove the same bound as in the nonnegative case, and the proof is simpler than for a general output space. We use the following simple result, stated in terms of the Frobenius norm ‖X‖_F² := Σ_{a,b} |X_{ab}|²:

Proposition 22.2. For any X ∈ C^{m×n}, Y ∈ C^{n×n}, Z ∈ C^{n×m}, we have |tr(XYZ)| ≤ ‖X‖_F ‖Y‖ ‖Z‖_F.
Proof. We have

    tr(XYZ) = Σ_{a,b,c} X_{ab} Y_{bc} Z_{ca}    (22.10)
            = Σ_a (x^a)† Y z^a    (22.11)

where (x^a)_b := X_{ab}^* and (z^a)_c := Z_{ca}. Thus

    |tr(XYZ)| ≤ Σ_a ‖x^a‖ ‖Y z^a‖    (22.12)
              ≤ ‖Y‖ Σ_a ‖x^a‖ ‖z^a‖    (22.13)
              ≤ ‖Y‖ √(Σ_a ‖x^a‖² · Σ_{a'} ‖z^{a'}‖²) = ‖X‖_F ‖Y‖ ‖Z‖_F    (22.14)

as claimed, where we used the Cauchy-Schwarz inequality in the first and third steps.
To upper bound |W^t| for the negative adversary with Boolean output, write W^t = tr(ΓV) where V_{xy} := a_x^* a_y ⟨ψ^t_x|ψ^t_y⟩ [f(x) ≠ f(y)]. Define

    C := Σ_{x∈S} a_x Π_{f(x)} |ψ^t_x⟩⟨x|    (22.16)
    C̄ := Σ_{x∈S} a_x Π_{1−f(x)} |ψ^t_x⟩⟨x|    (22.17)

with Π_0, Π_1 denoting the projectors onto the subspaces indicating f(x) = 0, 1, respectively. Then, since Π_{f(x)} + Π_{1−f(x)} = I and Π_{f(x)} Π_{1−f(y)} = 0 unless f(x) ≠ f(y), we have V = C†C̄ + C̄†C, so

    W^t = tr(Γ C†C̄) + tr(Γ C̄†C).

By the Proposition, |W^t| ≤ 2‖Γ‖ ‖C‖_F ‖C̄‖_F. Finally, we upper bound ‖C‖_F and ‖C̄‖_F. We have

    ‖C‖_F² + ‖C̄‖_F² = Σ_{x∈S} |a_x|² (‖Π_{f(x)} |ψ^t_x⟩‖² + ‖Π_{1−f(x)} |ψ^t_x⟩‖²) = 1    (22.23)
    ‖C̄‖_F² = Σ_{x∈S} |a_x|² ‖Π_{1−f(x)} |ψ^t_x⟩‖² ≤ ε.    (22.24)

Therefore ‖C‖_F ‖C̄‖_F ≤ max_{x∈[0,ε]} √(x(1 − x)) = √(ε(1 − ε)) (assuming ε ∈ [0, 1/2]), and we find that |W^t| ≤ 2√(ε(1 − ε))‖Γ‖, as claimed.
It remains to understand how much the weight function can decrease at each step of the algorithm. We have

    W^{j+1} − W^j = Σ_{x,y∈S} Γ_{xy} a_x^* a_y (⟨ψ^{j+1}_x|ψ^{j+1}_y⟩ − ⟨ψ^j_x|ψ^j_y⟩).    (22.25)

Consider how the state changes when we make a query. We have |ψ^{j+1}_x⟩ = U_{j+1} O_x |ψ^j_x⟩. Thus the elements of the Gram matrix of the states {|ψ^{j+1}_x⟩ : x ∈ S} are ⟨ψ^{j+1}_x|ψ^{j+1}_y⟩ = ⟨ψ^j_x| O_x O_y |ψ^j_y⟩.

Observe that O_x O_y |i, b⟩ = (−1)^{b(x_i ⊕ y_i)} |i, b⟩. Let P_0 = I ⊗ |0⟩⟨0| denote the projection onto the b = 0 states, and let P_i denote the projection |i, 1⟩⟨i, 1|. (As with O_x, the projections P_i implicitly act as the identity on any ancilla registers, so Σ_{i=0}^n P_i = I.) Then O_x O_y = P_0 + Σ_{i=1}^n (−1)^{x_i ⊕ y_i} P_i, so O_x O_y − I = −2 Σ_{i : x_i ≠ y_i} P_i. Thus we have

    W^{j+1} − W^j = −2 Σ_{x,y∈S} Σ_{i : x_i ≠ y_i} Γ_{xy} a_x^* a_y ⟨ψ^j_x| P_i |ψ^j_y⟩.    (22.29)
For each i ∈ {1, ..., n}, define a matrix Γ_i by (Γ_i)_{xy} := Γ_{xy} if x_i ≠ y_i and (Γ_i)_{xy} := 0 otherwise. Then we have

    W^{j+1} − W^j = −2 Σ_{x,y∈S} Σ_{i=1}^n (Γ_i)_{xy} a_x^* a_y ⟨ψ^j_x| P_i |ψ^j_y⟩    (22.31)
                  = −2 Σ_{i=1}^n tr(Q_i Γ_i Q_i†)    (22.32)

where Q_i := Σ_{x∈S} a_x P_i |ψ^j_x⟩⟨x|. By Proposition 22.2, |tr(Q_i Γ_i Q_i†)| ≤ ‖Γ_i‖ ‖Q_i‖_F². Since

    Σ_{i=1}^n ‖Q_i‖_F² = Σ_{i=1}^n Σ_{x∈S} |a_x|² ‖P_i |ψ^j_x⟩‖²    (22.35)
                       ≤ Σ_{x∈S} |a_x|²    (22.36)
                       = 1,    (22.37)

we have

    |W^{j+1} − W^j| ≤ 2 max_{i∈{1,...,n}} ‖Γ_i‖.
Combining these three facts gives the adversary lower bound. Since |W^0| = ‖Γ‖, we have

    |W^t| ≥ ‖Γ‖ − 2t max_{i∈{1,...,n}} ‖Γ_i‖.    (22.39)

Thus, to have |W^t| ≤ 2√(ε(1 − ε))‖Γ‖, we require

    t ≥ ((1 − 2√(ε(1 − ε)))/2) Adv(f)    (22.40)
where

    Adv(f) := max_Γ ‖Γ‖ / max_{i∈{1,...,n}} ‖Γ_i‖    (22.41)

with the maximum taken over all adversary matrices Γ for the function f. (Often the notation Adv(f) is reserved for the maximization over nonnegative adversary matrices, with the notation Adv±(f) for the generalized adversary method allowing negative weights.)
22.3 Example: Unstructured search

for some nonnegative coefficients γ_1, ..., γ_n. Symmetry suggests that we should take γ_1 = ⋯ = γ_n. This can be formalized, but for the present purposes we can take this as an ansatz.

Setting γ_1 = ⋯ = γ_n = 1 (since an overall scale factor does not affect the bound), we have
    Γ² = \begin{pmatrix} n & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 1 & \cdots & 1 \end{pmatrix}    (22.43)

which has norm ‖Γ²‖ = n, and hence ‖Γ‖ = √n. We also have

    Γ_1 = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}    (22.44)

and similarly for the other Γ_i, so ‖Γ_i‖ = 1. Thus we find Adv(or) ≥ √n, and it follows that Q_ε(or) ≥ ((1 − 2√(ε(1 − ε)))/2)√n. This shows that Grover's algorithm is optimal up to a constant factor (recall that Grover's algorithm finds a unique marked item with probability 1 − o(1) in (π/4)√n (1 + o(1)) queries).
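These norm computations are easy to verify numerically. The following sketch (with a made-up n) builds Γ for the unique-marked-item version of or, whose inputs are 0 and e_1, ..., e_n, and checks that ‖Γ‖ = √n while ‖Γ_i‖ = 1.

```python
import numpy as np

n = 16
# inputs: the all-zeros string and the n strings of Hamming weight 1
S = [np.zeros(n, dtype=int)] + [np.eye(n, dtype=int)[j] for j in range(n)]

# Gamma connects 0 to each e_j with weight gamma_j = 1 (a star graph)
Gamma = np.zeros((n + 1, n + 1))
Gamma[0, 1:] = Gamma[1:, 0] = 1

norm = np.linalg.norm(Gamma, ord=2)          # spectral norm
assert abs(norm - np.sqrt(n)) < 1e-9

# Gamma_i keeps only the entries of Gamma with x_i != y_i
for i in range(n):
    Gi = np.array([[Gamma[a, b] if S[a][i] != S[b][i] else 0.0
                    for b in range(n + 1)] for a in range(n + 1)])
    assert abs(np.linalg.norm(Gi, ord=2) - 1) < 1e-9

# hence Adv(or) >= ||Gamma|| / max_i ||Gamma_i|| = sqrt(n)
```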
Chapter 23

Span programs and formula evaluation
Having discussed lower bounds on quantum query complexity, we now turn our attention back to upper
bounds. The framework of span programs is a powerful tool for understanding quantum query complexity
[80, 78]. Span programs are closely related to the quantum adversary method, and can be used to show that
the (generalized) adversary method actually characterizes quantum query complexity up to constant factors
[79, 68].
For simplicity, we restrict our attention to the case of a (possibly partial) Boolean function f: S → {0,1} where S ⊆ {0,1}^n. Many (but not all) of the considerations for this case generalize to other kinds of functions.
such that Q(f) = Θ(Adv±(f)). Although not immediately obvious from the above expression, it can be shown that Adv±(f) is the value of a semidefinite program (SDP), a kind of optimization problem in which a linear objective function is optimized subject to linear and positive semidefiniteness constraints.
Unfortunately, the details of semidefinite programming are beyond the scope of this course. For a good introduction in the context of quantum information, see Watrous's lecture notes on Theory of Quantum Information [97, Lecture 7].
A useful feature of SDPs is that they can be solved efficiently. Thus, we can use a computer program to
find the optimal adversary lower bound for a fixed (finite-size) function. However, while this may be useful
for getting intuition about a problem, in general this does not give a strategy for determining asymptotic
quantum query complexity.
Another key feature of SDPs is the concept of semidefinite programming duality. To every primal SDP,
phrased as a maximization problem, there is a dual SDP, which is a minimization problem. Whereas feasible
solutions of the primal SDP give lower bounds, feasible solutions of the dual SDP give upper bounds. The
dual problem can be constructed from the primal problem by a straightforward (but sometimes tedious)
process. Semidefinite programs satisfy weak duality, which says that the value of the primal problem is at
most the value of the dual problem. Furthermore, almost all SDPs actually satisfy strong duality, which
says that the primal and dual values are equal. (In particular, this holds under the Slater conditions, which
essentially say that the primal or dual constraints are strictly feasible.)
To understand any SDP, one should always construct its dual. Carrying this out for the adversary
method would require some experience with semidefinite programs, so we simply state the result here.
The variables of the dual problem can be viewed as a set of vectors |v_{x,i}⟩ ∈ C^d for all inputs x ∈ S and all indices i ∈ [n] := {1, ..., n}, for some dimension d. For b ∈ {0, 1}, we define the b-complexity C_b := max_{x∈f^{-1}(b)} Σ_{i∈[n]} ‖|v_{x,i}⟩‖². Since strong duality holds, we have the following.

Theorem 23.1. For any function f: S → {0,1} with S ⊆ {0,1}^n, we have Adv±(f) = min max{C_0, C_1}, where the minimization is over all positive integers d and all sets of vectors {|v_{x,i}⟩ ∈ C^d : x ∈ S, i ∈ [n]} satisfying the constraint

    Σ_{i : x_i ≠ y_i} ⟨v_{x,i}|v_{y,i}⟩ = 1 − δ_{f(x),f(y)}  for all  x ≠ y.    (23.3)
By constructing solutions of the adversary dual, we place upper bounds on the best possible adversary
lower bound. But more surprisingly, one can construct an algorithm from a solution of the adversary dual,
giving an upper bound on the quantum query complexity itself.
Observe that if we replace |v_{x,i}⟩ ↦ β|v_{x,i}⟩ for all x ∈ f^{-1}(0) and |v_{y,i}⟩ ↦ |v_{y,i}⟩/β for all y ∈ f^{-1}(1), we don't affect the constraints (23.3), but we map C_0 ↦ β²C_0 and C_1 ↦ C_1/β². Taking β = (C_1/C_0)^{1/4}, we make the two complexities equal. Thus we have

    Adv±(f) = min_{{|v_{x,i}⟩}} √(C_0 C_1).    (23.4)
Note that the constraint (23.3) for f(x) = f(y), where the right-hand side is zero, can be removed without changing the value of the optimization problem. (For functions with non-Boolean output, one loses a factor strictly between 1 and 2 in the analogous relaxation.) To see this, suppose we have a set of vectors {|v_{x,i}⟩} satisfying the constraint (23.3) for f(x) ≠ f(y) but not for f(x) = f(y). Simply let |v'_{x,i}⟩ = |v_{x,i}⟩ ⊗ |x_i ⊕ f(x)⟩ for all x ∈ S and all i ∈ [n]. Then ‖|v'_{x,i}⟩‖ = ‖|v_{x,i}⟩‖, and for the terms where x_i ≠ y_i, we have ⟨v'_{x,i}|v'_{y,i}⟩ = ⟨v_{x,i}|v_{y,i}⟩ if f(x) ≠ f(y) and ⟨v'_{x,i}|v'_{y,i}⟩ = 0 if f(x) = f(y).
for x ∈ f^{-1}(0), and the witness vectors for x ∈ f^{-1}(1) give the remaining dual adversary vectors. For more detail on this translation, see [78, Lemma 6.5] (and see the rest of that paper for more than you ever wanted to know about span programs).
We focus on dual adversary solutions here, as these are simpler to work with for the applications we
consider. However, for other applications it may be useful to work directly with span programs instead; in
particular, (non-canonical) span programs offer more freedom when trying to devise upper bounds.
for all j ∈ [n] (where e_j ∈ {0,1}^n is the jth standard basis vector) and

    Σ_{i : (e_j)_i ≠ (e_k)_i} ⟨v_{e_j,i}|v_{e_k,i}⟩ = ⟨v_{e_j,j}|v_{e_k,j}⟩ + ⟨v_{e_j,k}|v_{e_k,k}⟩ = 0.    (23.6)

Since C_0 C_1 = n, we see that Adv±(or) ≤ √n, demonstrating that the previously discussed adversary lower bound is the best possible adversary lower bound.
It is easy to extend this dual adversary solution to one for the total or function. For any x ≠ 0, simply let |v_{x,i}⟩ = δ_{ij}, where j is the index of any particular bit for which x_j = 1 (e.g., the first such bit). Then the constraints are still satisfied, and the complexity is the same. As an exercise, you should work out an optimal dual adversary for and.
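The or solution above can be verified mechanically. The following sketch (with a made-up n, using the ansatz v_{0,i} = 1 and v_{e_j,i} = δ_{ij} for inputs restricted to Hamming weight at most 1) checks the constraint (23.3) and that √(C₀C₁) = √n.

```python
import numpy as np

n = 9
# inputs of Hamming weight at most 1
inputs = [np.zeros(n, dtype=int)] + [np.eye(n, dtype=int)[j] for j in range(n)]
f = lambda x: int(x.any())                       # the OR function
# one-dimensional dual adversary vectors: v_{0,i} = 1, v_{e_j,i} = delta_{ij}
v = lambda x, i: 1.0 if (not x.any() or x[i] == 1) else 0.0

# constraint (23.3): sum over differing bits equals 1 iff f(x) != f(y)
for x in inputs:
    for y in inputs:
        if not np.array_equal(x, y):
            lhs = sum(v(x, i) * v(y, i) for i in range(n) if x[i] != y[i])
            assert lhs == (1 if f(x) != f(y) else 0)

C0 = max(sum(v(x, i) ** 2 for i in range(n)) for x in inputs if f(x) == 0)  # = n
C1 = max(sum(v(x, i) ** 2 for i in range(n)) for x in inputs if f(x) == 1)  # = 1
assert (C0, C1) == (n, 1) and np.sqrt(C0 * C1) == np.sqrt(n)
```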
that leads to the outcome 1, so vertices representing his moves correspond to or gates. Andrea wins if she can make any move that gives 0 (i.e., she only loses if all her moves give 1), so her vertices correspond to and gates.
What is the query complexity of evaluating this balanced d-ary and-or tree? Let us first consider
randomized classical algorithms. Notice that it is sometimes possible to avoid evaluating all the leaves: for
example, if we learn that one input to an and gate is 0, then we do not need to evaluate the other inputs
to know that the gate evaluates to 0. In the case where all inputs are 1, we must evaluate all of them; but
the inputs to an and gate are given by the outputs of or gates, and an or gate evaluating to 1 is exactly
the case where it is possible to learn the value of the gate without knowing all of its inputs. Similarly, the
hardest input to the or gate is precisely the output of an and gate for which it is possible to learn the
output without evaluating all inputs.
With these observations in mind, a sensible classical algorithm is as follows. Suppose that to evaluate
any given vertex of the tree, we guess a random child and evaluate it (recursively), only evaluating other
children when necessary. By analyzing a simple recurrence, one can show that this algorithm uses

    O( ((d − 1 + √(d² + 14d + 1))/4)^k ) = O( n^{log_d ((d − 1 + √(d² + 14d + 1))/4)} )    (23.9)

queries, where n = d^k is the input size (e.g., for d = 2, O(n^{0.753})) [93, 82]. In fact, it is possible to show that this algorithm is asymptotically optimal [83]. Notice, in particular, that the classical query complexity becomes larger as d is increased with n fixed. In the extreme case where k = 1, so that n = d, we are simply evaluating the and gate, which is equivalent (by de Morgan's laws) to evaluating an or gate, and which we know takes Θ(n) queries.
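The randomized recursive strategy above can be sketched directly. The code below evaluates a balanced binary and-or tree, visiting children in random order and short-circuiting as soon as the gate's value is forced (the tree depth and the random inputs are made up, so the observed query counts are only illustrative; the n^{0.753} bound is a worst-case statement).

```python
import random

def evaluate(gate, leaves, lo, hi, depth, rng, counter):
    """Evaluate a balanced binary AND-OR tree over leaves[lo:hi], querying
    children in random order and stopping as soon as the value is forced."""
    if depth == 0:
        counter[0] += 1                    # one leaf query
        return leaves[lo]
    mid = (lo + hi) // 2
    halves = [(lo, mid), (mid, hi)]
    rng.shuffle(halves)                    # random child order
    child_gate = 'or' if gate == 'and' else 'and'
    short = 0 if gate == 'and' else 1      # value that decides this gate
    for (a, b) in halves:
        if evaluate(child_gate, leaves, a, b, depth - 1, rng, counter) == short:
            return short                   # short-circuit: skip the sibling
    return 1 - short

rng = random.Random(7)
k = 12                                     # n = 2^k leaves
n = 2 ** k
total, trials = 0, 20
for _ in range(trials):
    leaves = [rng.randint(0, 1) for _ in range(n)]
    counter = [0]
    evaluate('and', leaves, 0, n, k, rng, counter)
    total += counter[0]
avg = total / trials
assert avg < n                             # strictly fewer than all n leaves
```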
A quantum computer can evaluate such games faster if k is sufficiently small. Of course, the k = 1 case is solved in O(√n) queries by Grover's algorithm. By applying Grover's algorithm recursively, suitably amplifying the success probability, it is possible to evaluate the formula in √n O(log n)^{k−1} queries [25], which is nearly optimal for constant k. This can be improved slightly to O(√n c^k) queries for some constant c using a variant of Grover's algorithm that allows noisy inputs [56]. But both of these algorithms are only close to tight when k is constant. Indeed, for very low degree (such as d = 2, so that k = log₂ n), nothing better than the classical algorithm was known until 2007 [41]. Here we will describe how to solve that problem in only O(√n) quantum queries. However, rather than presenting the original algorithm, we show how a composition property of span programs offers a particularly simple analysis.
Proof. Let {|vx,i i : x {0, 1}n , i [n]} be an optimal dual adversary solution for f , and let {|uy,j i : y
{0, 1}m , j [m]i} be an optimal dual adversary solution for g. Let y = (y 1 , . . . , y n ) where each y i {0, 1}m .
Then define
We claim that this is a dual adversary solution for f g. To see this, we compute
Σ_{(i,j) : y^i_j ≠ z^i_j} ⟨w_{y,(i,j)}|w_{z,(i,j)}⟩ = Σ_{i∈[n]} ⟨v_{g(y),i}|v_{g(z),i}⟩ Σ_{j : y^i_j ≠ z^i_j} ⟨u_{y^i,j}|u_{z^i,j}⟩   (23.11)

= Σ_{i∈[n]} ⟨v_{g(y),i}|v_{g(z),i}⟩ (1 − δ_{g(y^i),g(z^i)})   (23.12)

= Σ_{i : g(y^i) ≠ g(z^i)} ⟨v_{g(y),i}|v_{g(z),i}⟩,   (23.13)

which equals 1 whenever f(g(y)) ≠ f(g(z)), since {|v_{x,i}⟩} is a dual adversary solution for f. Furthermore,

Adv±(f ∘ g) ≤ max_y Σ_{i∈[n]} ‖|v_{g(y),i}⟩‖² Σ_{j∈[m]} ‖|u_{y^i,j}⟩‖² ≤ Adv±(f) Adv±(g)   (23.15)

as claimed.
Note that here we needed the constraint (23.3) in the case where f(x) = f(y).
In particular, combining this with the dual adversary for or and a similar solution for and, this shows
that Adv±(f) ≤ √n for the n-input balanced binary and-or tree.
|ψ_x⟩ := (1/√α_x) ( |0⟩ + (1/√(2A)) Σ_{i∈[n]} |i⟩|v_{x,i}⟩|x_i⟩ )   (23.17)

with {|v_{x,i}⟩} an optimal dual adversary solution. Here the normalization factor is

α_x = 1 + (1/2A) Σ_{i∈[n]} ‖|v_{x,i}⟩‖² ≤ 3/2.   (23.18)
The reflection 2Λ − I requires no queries to implement. Let Π_x = |0⟩⟨0| + Σ_{i∈[n]} |i⟩⟨i| ⊗ I ⊗ |x_i⟩⟨x_i| be
the projector onto |0⟩ and states where the query and output registers are consistent. Then the reflection
2Π_x − I can be implemented using only two queries to the oracle O_x.
The algorithm runs phase estimation with precision Θ(1/A) on the unitary U := (2Π_x − I)(2Λ − I),
with initial state |0⟩. If the estimated phase is 1, then the algorithm reports that f(x) = 1; otherwise it
116 Chapter 23. Span programs and formula evaluation
reports that f (x) = 0. This procedure uses O(A) queries. It remains to see why the algorithm is correct
with bounded error.
First, we claim that if f(x) = 1, then |0⟩ is close to the 1-eigenspace of U. We have Π_x|ψ_x⟩ = |ψ_x⟩ for
all x and Λ|ψ_x⟩ = |ψ_x⟩ for f(x) = 1, so clearly U|ψ_x⟩ = |ψ_x⟩. Furthermore, |⟨0|ψ_x⟩|² = 1/α_x ≥ 2/3 for all
x, so the squared projection of |0⟩ onto the 1-eigenspace is at least 2/3. Thus the algorithm is correct with probability at least 2/3 when f(x) = 1.
On the other hand, we claim that if f(x) = 0, then |0⟩ has small projection onto the subspace of
eigenvectors with eigenvalue e^{iθ} for |θ| ≤ c/A, for some constant c. To prove this, we use the following [68]:
Lemma 23.3 (Effective spectral gap lemma). Let |u⟩ be a unit vector with Λ|u⟩ = 0; let P_Θ be the projector
onto eigenvectors of U = (2Π − I)(2Λ − I) with eigenvalues e^{iθ} with |θ| < Θ for some Θ ≥ 0. Then
‖P_Θ Π|u⟩‖ ≤ Θ/2.
Let

|φ_x⟩ := (1/√β_x) ( |0⟩ − √(2A) Σ_{i∈[n]} |i⟩|v_{x,i}⟩|x_i⟩ ),   (23.19)

where β_x = 1 + 2A Σ_{i∈[n]} ‖|v_{x,i}⟩‖² ≤ 1 + 2A², so that Λ|φ_x⟩ = 0. Also, observe that Π_x|φ_x⟩ = |0⟩/√β_x. By the effective spectral gap lemma, ‖P_Θ|0⟩‖ ≤
√β_x Θ/2 ≤ √(1 + 2A²) Θ/2 ≈ AΘ/√2. Thus, choosing Θ = √(2/3)/A gives a projection of at most 1/√3, so
the algorithm fails with probability at most 1/3 (plus the error of phase estimation, which can be made
negligible, and the small error from approximating √(1 + 2A²) ≈ √2 A, which is negligible if A ≫ 1).
It remains to prove the lemma.
Proof. We apply Jordan's lemma, which says that for any two projections acting on the same finite-
dimensional space, there is a decomposition of the space into a direct sum of one- and two-dimensional
subspaces that are invariant under both projections.
We can assume without loss of generality that |u⟩ only has support on 2 × 2 blocks of the Jordan
decomposition in which Π and Λ both have rank one. If the block is 1 × 1, or if either projection has rank 0
or 2 within the block, then U acts as ±I on the block; components with eigenvalue −1 are annihilated
by P_Θ, and components with eigenvalue +1 are annihilated by Π.
Now, by an appropriate choice of basis, restricting Λ and Π to any particular 2 × 2 block gives

Λ = [ 1  0 ; 0  0 ]   (23.22)

Π = [ cos²(θ/2)  cos(θ/2) sin(θ/2) ; cos(θ/2) sin(θ/2)  sin²(θ/2) ]   (23.23)

where θ/2 is the angle between the vectors onto which the two projectors project within the block. A simple calculation
shows that (2Π − I)(2Λ − I) is a rotation by an angle θ, so its eigenvalues are e^{±iθ}. Since Λ|u⟩ = 0, the
component of |u⟩ in the relevant subspace is proportional to (0, 1)ᵀ, and

‖ Π (0, 1)ᵀ ‖ = ‖ sin(θ/2) (cos(θ/2), sin(θ/2))ᵀ ‖ = |sin(θ/2)| ≤ |θ|/2   (23.24)

as claimed.
Chapter 24
Learning graphs
While span programs provide a powerful tool for proving upper bounds on quantum query complexity, they
can be difficult to design. The model of learning graphs, introduced by Belovs [17], is a restricted class of
span programs that are more intuitive to design and understand. This model has led to improved upper
bounds for various problems, such as subgraph finding and k-distinctness.
where the sums only run over those vertices θ for which e_{θ,i} is an edge of the learning graph.
It is easy to check that this definition satisfies the dual adversary constraints. For any x, y ∈ S with
f(x) = 0 and f(y) = 1, we have

Σ_{i : x_i ≠ y_i} ⟨v_{x,i}|v_{y,i}⟩ = Σ_{i : x_i ≠ y_i} Σ_θ √(w_{e_{θ,i}}) (p_{e_{θ,i}} / √(w_{e_{θ,i}})) ⟨x_θ|y_θ⟩   (24.2)

= Σ_{i : x_i ≠ y_i} Σ_{θ : x_θ = y_θ} p_{e_{θ,i}}.   (24.3)

Now observe that the set of edges {e_{θ,i} : x_θ = y_θ, x_i ≠ y_i} forms a cut in the graph between the vertex sets
{θ : x_θ = y_θ} and {θ : x_θ ≠ y_θ}. Since the root ∅ is in the former set and all sinks are in the latter set, the total flow
through the cut must be 1.
Recall that we do not have to satisfy the constraint for f(x) = f(y), since there is a construction that
enforces this condition without changing the complexity, provided the condition for f(x) ≠ f(y) is satisfied.
It remains to see that the complexity of this dual adversary solution equals the original learning graph
complexity. For b ∈ {0,1}, we have

C̃_b = max_{x∈f⁻¹(b)} Σ_{i∈[n]} ‖|v_{x,i}⟩‖²   (24.4)

= max_{x∈f⁻¹(b)} Σ_{i∈[n]} Σ_θ { w_{e_{θ,i}}  if b = 0;  p_{e_{θ,i}}² / w_{e_{θ,i}}  if b = 1 }   (24.5)

= { C₀  if b = 0;  max_{x∈f⁻¹(1)} C₁(x)  if b = 1 }   (24.6)

= C_b.   (24.7)

Therefore √(C̃₀ C̃₁) = √(C₀ C₁) = C as claimed. In particular, Adv±(f) ≤ C, so Q(f) = O(C).
Learning graphs are simpler to design than span programs: the constraints are automatically satisfied,
so one can focus on optimizing the objective value. In contrast, span programs have exponentially many
constraints (in n, if f is a total function), and in general it is not obvious how to even write down a solution
satisfying the constraints.
Note, however, that learning graphs are not equivalent to general span programs. For example, learning
graphs (as defined above) only depend on the 1-certificates of a function, so two functions with the same
1-certificates have the same learning graph complexity. The 2-threshold function (the symmetric Boolean
function that is 1 iff two or more input bits are 1) has the same 1-certificates as element distinctness, so
its learning graph complexity is Ω(n^{2/3}), whereas its query complexity is O(√n). This barrier can be
circumvented by modifying the learning graph model, but even such variants are apparently less powerful
than general span programs.
24.4 Element distinctness
(since the 1-norm upper bounds the 2-norm), so C ≤ Σ_{j=1}^k √(C₀^j C₁^j).
Another useful modification is to allow multiple vertices corresponding to the same subset of indices. It
is straightforward to show that such learning graphs can be converted to span programs at the same cost,
or to construct a new learning graph with no multiple vertices and the same or better complexity.
The learning graph for element distinctness has three stages. For the first stage, we load subsets of size
r − 2. We do this by first adding edges from ∅ to \binom{n-1}{r-3} copies of vertex {i}, so that there are
Σ_{i=1}^n \binom{n-1}{r-3} = (r − 2)\binom{n}{r-2} singleton vertices. Then, from each of these singleton vertices, we load the remaining indices of each
possible subset of size r − 2, one index at a time. Every edge has weight 1. Then the 0-complexity of the
first stage is (r − 2)\binom{n}{r-2}.
To upper bound the 1-complexity of the first stage, we route flow only through vertices that do not
contain the indices of a collision, sending equal flow of 1/\binom{n-2}{r-2} to all subsets of size r − 2 avoiding those indices. This gives
1-complexity of at most (r − 2)\binom{n-2}{r-2}\binom{n-2}{r-2}^{-2} = (r − 2)\binom{n-2}{r-2}^{-1}.
Overall, the complexity of the first stage is at most

√( (r − 2)² \binom{n}{r-2} \binom{n-2}{r-2}^{-1} ) = (r − 2) √( n(n − 1) / ((n − r + 2)(n − r + 1)) ) = O(r).   (24.9)
The second and third stages each include all possible edges that load one additional index from the
terminal vertices of the previous stage. Again every edge has unit weight. Thus, the 0-complexity is
(n − r + 2)\binom{n}{r-2} for the second stage and (n − r + 1)\binom{n}{r-1} for the third stage. We send the flow through
vertices that contain the collision pair (namely, that contain the first index of a collision in the second stage
and the second index of a collision in the third stage). Thus, the 1-complexity is \binom{n-2}{r-2}\binom{n-2}{r-2}^{-2} = \binom{n-2}{r-2}^{-1}
in both the second and the third stages. This gives total complexity

√( (n − r + 2) \binom{n}{r-2} \binom{n-2}{r-2}^{-1} ) = O(√n)   (24.10)

for the second stage.
Part V
Quantum simulation

Chapter 25
Simulating Hamiltonian dynamics

Another major potential application of quantum computers is the simulation of quantum dynamics. Indeed,
this was the idea that first led Feynman to propose the concept of a quantum computer [44]. In this lecture we
will see how a universal quantum computer can efficiently simulate several natural families of Hamiltonians.
These simulation methods could be used either to simulate actual physical systems or to implement quantum
algorithms defined in terms of Hamiltonian dynamics, such as continuous-time quantum walks (Part III) and
adiabatic quantum algorithms (Part VI).
The dynamics of a quantum system are governed by the Schrödinger equation

i ℏ (d/dt)|ψ(t)⟩ = H(t)|ψ(t)⟩.   (25.1)

Here H(t) is the Hamiltonian, an operator with units of energy, and ℏ is Planck's constant. For convenience
it is typical to choose units in which ℏ = 1. Given an initial wave function |ψ(0)⟩, we can solve this differential
equation to determine |ψ(t)⟩ at any later (or earlier) time t.
For H independent of time, the solution of the Schrödinger equation is |ψ(t)⟩ = e^{−iHt}|ψ(0)⟩. For simplicity
we will only consider this case. There are many situations in which time-dependent Hamiltonians arise,
not only in physical systems but also in computational applications such as adiabatic quantum computing.
In such cases, the evolution cannot in general be written in such a simple form, but nevertheless similar ideas
can be used to simulate the dynamics.
implement arbitrary unitaries. Instead, we will simply describe a few classes of Hamiltonians that can be
efficiently simulated. Our strategy will be to start from simple Hamiltonians that are easy to simulate and
define ways of combining the known simulations to give more complicated ones.
There are a few cases where a Hamiltonian can obviously be simulated efficiently. For example, this is the
case if H only acts nontrivially on a constant number of qubits, simply because any unitary evolution on a
constant number of qubits can be approximated with error at most ε using poly(log(1/ε)) one- and two-qubit
gates, using the Solovay-Kitaev theorem.
Note that since we require a simulation for an arbitrary time t (with poly(t) gates), we can rescale the
evolution by any polynomial factor: if H can be efficiently simulated, then so can cH for any c = poly(n).
This holds even if c < 0, since any efficient simulation is expressed in terms of quantum gates, and can
simply be run in reverse.
In addition, we can rotate the basis in which a Hamiltonian is applied using any unitary transformation
with an efficient decomposition into basic gates. In other words, if H can be efficiently simulated and the
unitary transformation U can be efficiently implemented, then U†HU can be efficiently simulated. This
follows from the simple identity

e^{−i U†HU t} = U† e^{−iHt} U.   (25.2)
Another simple but useful trick for simulating Hamiltonians is the following. Suppose H is diagonal in
the computational basis, and any diagonal element d(a) = ⟨a|H|a⟩ can be computed efficiently. Then H can
be simulated efficiently using the following sequence of operations, for any input computational basis state
|a⟩:

|a, 0⟩ ↦ |a, d(a)⟩ ↦ e^{−i d(a) t}|a, d(a)⟩ ↦ e^{−i d(a) t}|a, 0⟩,

where we first compute d(a) in an ancilla register, then apply a phase conditioned on the ancilla, and finally uncompute d(a). By linearity, this simulates e^{−iHt} on an arbitrary input state.
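As a sanity check of this diagonal trick, the following sketch (illustrative; here d is just an arbitrary efficiently computable function of the basis label) verifies that applying the phase e^{−i d(a) t} to each basis state reproduces e^{−iHt} exactly for diagonal H.

```python
import numpy as np

n = 3                                  # number of qubits
dim = 2 ** n
rng = np.random.default_rng(1)
d = rng.standard_normal(dim)           # diagonal entries d(a) = <a|H|a>
t = 0.7

# Phase kickback on basis states: |a> -> e^{-i d(a) t} |a>
U_sim = np.diag(np.exp(-1j * d * t))

# Reference: exp(-iHt) for H = diag(d), via eigendecomposition
H = np.diag(d)
w, V = np.linalg.eigh(H)
U_exact = (V * np.exp(-1j * w * t)) @ V.conj().T

err = np.linalg.norm(U_sim - U_exact)
```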
For example, consider a particle in one dimension subject to a potential V(x), with Hamiltonian

H = p²/2m + V(x).
To simulate this on a digital quantum computer, we can imagine discretizing the x coordinate. The operator
V(x) is diagonal, and natural discretizations of p² = −d²/dx² are diagonal in the discrete Fourier basis. Thus
we can efficiently simulate both V(x) and p²/2m. Similarly, consider the Hamiltonian of a spin system, say
of the form

H = Σ_i h_i X_i + Σ_{i,j} J_{ij} Z_i Z_j

(or more generally, any k-local Hamiltonian, a sum of terms that each act on at most k qubits). This consists
of a sum of terms, each of which acts on only a constant number of qubits and hence is easy to simulate.
In general, if H1 and H2 can be efficiently simulated, then H1 + H2 can also be efficiently simulated.
If the two Hamiltonians commute, then this is trivial, since e^{−iH1t} e^{−iH2t} = e^{−i(H1+H2)t}. However, in the
general case where the two Hamiltonians do not commute, we can still simulate their sum as a consequence
of the Lie product formula

e^{−i(H1+H2)t} = lim_{m→∞} ( e^{−iH1t/m} e^{−iH2t/m} )^m.   (25.7)
A simulation using a finite number of steps can be achieved by truncating this expression to a finite number
of terms, which introduces some amount of error that must be kept small. In particular, if we want to have

‖ ( e^{−iH1t/m} e^{−iH2t/m} )^m − e^{−i(H1+H2)t} ‖ ≤ ε,   (25.8)

it suffices to take m = O((Λt)²/ε), where Λ := max{‖H1‖, ‖H2‖}. (The requirement that H1 and H2 be
efficiently simulable means that Λ can be at most poly(n).)
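The 1/m decay of the truncation error is easy to observe numerically; the following sketch uses two random Hermitian matrices as an illustrative stand-in for efficiently simulable terms.

```python
import numpy as np

def expmh(H, t):
    """exp(-iHt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

rng = np.random.default_rng(2)
dim = 8
def rand_herm():
    A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return (A + A.conj().T) / 2

H1, H2 = rand_herm(), rand_herm()
t = 1.0
exact = expmh(H1 + H2, t)

def lie_error(m):
    """Spectral-norm error of the m-step first-order product formula."""
    step = expmh(H1, t / m) @ expmh(H2, t / m)
    return np.linalg.norm(np.linalg.matrix_power(step, m) - exact, 2)

errs = [lie_error(m) for m in (50, 100, 200)]
```

Doubling m roughly halves the error, consistent with m = O((Λt)²/ε).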
It is somewhat unappealing that to simulate an evolution for time t, we need a number of steps propor-
tional to t². Fortunately, the situation can be improved if we use higher-order approximations of (25.7). For
example, one can show that

‖ ( e^{−iH1t/2m} e^{−iH2t/m} e^{−iH1t/2m} )^m − e^{−i(H1+H2)t} ‖ ≤ ε   (25.9)

with a smaller value of m. In fact, by using even higher-order approximations, it is possible to show that
H1 + H2 can be simulated for time t with only O(t^{1+δ}) steps, for any fixed δ > 0, no matter how small [28, 19].
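The gain from symmetrization can be seen directly. The sketch below (again with arbitrary random Hermitian terms, not a specific physical model) compares the first-order formula against the symmetric product of (25.9), whose error scales as 1/m² rather than 1/m.

```python
import numpy as np

def expmh(H, t):
    """exp(-iHt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

rng = np.random.default_rng(3)
dim = 8
def rand_herm():
    A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return (A + A.conj().T) / 2

H1, H2 = rand_herm(), rand_herm()
t = 1.0
exact = expmh(H1 + H2, t)

def error(step, m):
    return np.linalg.norm(np.linalg.matrix_power(step, m) - exact, 2)

m = 100
e_lie = error(expmh(H1, t / m) @ expmh(H2, t / m), m)
e_sym = error(expmh(H1, t / (2 * m)) @ expmh(H2, t / m) @ expmh(H1, t / (2 * m)), m)
# doubling m should cut the symmetric error by about a factor of 4
e_sym2 = error(expmh(H1, t / (4 * m)) @ expmh(H2, t / (2 * m)) @ expmh(H1, t / (4 * m)), 2 * m)
```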
A Hamiltonian that is a sum of polynomially many terms can be efficiently simulated by composing the
simulation of two terms, or by directly using an approximation to the identity

e^{−i(H1+⋯+Hk)t} = lim_{m→∞} ( e^{−iH1t/m} ⋯ e^{−iHkt/m} )^m.   (25.10)
Another way of combining Hamiltonians comes from commutation: if H1 and H2 can be efficiently
simulated, then i[H1, H2] can be efficiently simulated. This is a consequence of the identity

e^{−[H1,H2]t} = lim_{m→∞} ( e^{−iH1√(t/m)} e^{−iH2√(t/m)} e^{iH1√(t/m)} e^{iH2√(t/m)} )^m,   (25.11)

which can again be approximated with a finite number of terms. (The overall sign of the generator is immaterial, since any efficient simulation can be run in reverse.) However, I don't know of any algorithmic
application of such a simulation.
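This group-commutator limit can be checked numerically. The sketch below (random Hermitian matrices with an arbitrary scale) approximates e^{−[H1,H2]t} = e^{−iKt} with K = −i[H1, H2] Hermitian, and shows the error shrinking as m grows.

```python
import numpy as np

def expmh(H, t):
    """exp(-iHt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

rng = np.random.default_rng(4)
dim = 6
def rand_herm():
    A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return 0.5 * (A + A.conj().T) / 2

H1, H2 = rand_herm(), rand_herm()
t = 0.5

# exp(-[H1,H2]t) = exp(-iKt) with K = -i[H1,H2] Hermitian
K = -1j * (H1 @ H2 - H2 @ H1)
target = expmh(K, t)

def approx(m):
    s = np.sqrt(t / m)   # each factor runs for time sqrt(t/m)
    step = expmh(H1, s) @ expmh(H2, s) @ expmh(H1, -s) @ expmh(H2, -s)
    return np.linalg.matrix_power(step, m)

e_coarse = np.linalg.norm(approx(100) - target, 2)
e_fine = np.linalg.norm(approx(1000) - target, 2)
```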
Given this lemma, the simulation proceeds as follows. First, to ensure that the graph of H is bipartite,
we actually simulate evolution according to the Hamiltonian σ_x ⊗ H, which is bipartite and has the same
sparsity as H. Since e^{−i(σ_x⊗H)t}|+⟩|ψ⟩ = |+⟩e^{−iHt}|ψ⟩, we can recover a simulation of H from a simulation
of σ_x ⊗ H.
Now write H as a diagonal matrix plus a matrix with zeros on the diagonal. We have already shown how
to simulate the diagonal part, so we can assume H has zeros on the diagonal without loss of generality.
It suffices to simulate the term corresponding to the edges of a particular color c. We show how to make
the simulation work for any particular vertex x; then it works in general by linearity. By computing the
complete list of neighbors of x and computing each of their colors, we can reversibly compute v_c(x), the
vertex adjacent to x via an edge with color c, along with the associated matrix element:

|x⟩ ↦ |x, v_c(x), H_{x,v_c(x)}⟩.
since it is easily diagonalized, as it consists of a direct sum of two-dimensional blocks. Finally, we can
uncompute the second and third registers. Before the uncomputation, the simulation produces a linear
combination of the states |x, v_c(x), H_{x,v_c(x)}⟩ and |v_c(x), x, H*_{x,v_c(x)}⟩. Since

|v_c(x), x, H*_{x,v_c(x)}⟩ = |v_c(x), v_c(v_c(x)), H_{v_c(x),x}⟩,   (25.14)
where |E_a⟩ are the eigenstates of H with eigenvalues E_a. Suppose we prepare the pointer in the state |x = 0⟩,
a narrow wave packet centered at x = 0. Since the momentum operator generates translations in position,
the above evolution performs the transformation

|E_a⟩|x = 0⟩ ↦ |E_a⟩|x = E_a t⟩.

If we can measure the position of the pointer with sufficiently high precision that all relevant spacings
Δx_{ab} = t|E_a − E_b| can be resolved, then measurement of the position of the pointer (a fixed, easy-to-measure
observable, independent of H) effects a measurement of H.
Von Neumann's measurement protocol makes use of a continuous variable, the position of the pointer.
To turn it into an algorithm that can be implemented on a digital quantum computer, we can approximate
the evolution (25.15) using r quantum bits to represent the pointer. The full Hilbert space is thus a tensor
the evolution (25.15) using r quantum bits to represent the pointer. The full Hilbert space is thus a tensor
product of a 2^n-dimensional space for the system and a 2^r-dimensional space for the pointer. We let the
computational basis of the pointer, with basis states {|z⟩}, represent the basis of momentum eigenstates.
The label z is an integer between 0 and 2^r − 1, and the r bits of the binary representation of z specify the
states of the r qubits. In this basis, p acts as
p|z⟩ = (z/2^r)|z⟩.   (25.17)
In other words, the evolution e^{−i(H⊗p)t} can be viewed as the evolution e^{−iHt} on the system for a time controlled
by the value of the pointer.
Expanded in the momentum eigenbasis, the initial state of the pointer is

|x = 0⟩ = 2^{−r/2} Σ_{z=0}^{2^r−1} |z⟩.   (25.18)
The measurement is performed by evolving under H ⊗ p for some appropriately chosen time t. After this
evolution, the position of the simulated pointer can be measured by measuring the qubits that represent it
in the x basis, i.e., the Fourier transform of the computational basis.
Note that this discretized von Neumann measurement procedure is equivalent to phase estimation. Recall
that in the phase estimation problem, we are given an eigenvector |ψ⟩ of a unitary operator U and asked to
determine its eigenvalue e^{iφ}. The algorithm uses two registers, one that initially stores |ψ⟩ and one that will
store an approximation of the phase φ. The first and last steps of the algorithm are Fourier transforms on
the phase register. The intervening step is to perform the transformation

|ψ⟩|z⟩ ↦ U^z|ψ⟩|z⟩,

where |z⟩ is a computational basis state. If we take |z⟩ to be a momentum eigenstate with eigenvalue z (i.e.,
if we choose a different normalization than in (25.17)) and let U = e^{−iHt}, this is exactly the transformation
induced by e^{−i(H⊗p)t}. Thus we see that the phase estimation algorithm for a unitary operator U is exactly
von Neumann's prescription for measuring i ln U.
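This correspondence can be checked numerically. In the toy sketch below (a hypothetical setup: an integer eigenvalue E and evolution time t = 2π are chosen so that the position peak lands exactly on a grid point, and signs are matched to numpy's FFT convention), reading out the r-qubit pointer in the Fourier basis recovers the eigenvalue of H.

```python
import numpy as np

r = 5                        # pointer qubits
N = 2 ** r
E = 3                        # eigenvalue of H on the input eigenstate (illustrative)
t = 2 * np.pi                # chosen so the position peak is exact

# Pointer starts in |x=0>, the uniform superposition over momentum states (25.18).
# Under exp(-i (H x p) t), each component |z> acquires the phase exp(-i E t z / 2^r).
pointer = np.exp(-1j * E * t * np.arange(N) / N) / np.sqrt(N)

# "Measuring the position" = reading out in the Fourier basis
amps = np.fft.fft(pointer) / np.sqrt(N)
probs = np.abs(amps) ** 2
estimate = (N - np.argmax(probs)) % N   # undo the FFT's frequency sign convention
```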
Chapter 26
Fast quantum simulation algorithms

While product formulas provide the most straightforward approach to Hamiltonian simulation, alternative
approaches can offer improved performance. Here we explore Hamiltonian simulation beyond product
formulas.
26.1 No fast-forwarding
Before introducing improved upper bounds, we begin by establishing a limitation on the ability of algorithms
to simulate sparse Hamiltonians. Specifically, as mentioned in Chapter 25, we show that no general procedure
can simulate a sparse Hamiltonian acting for time t using o(t) queries [19].
The lower bound is based on a reduction from parity. Recall from Section 20.5 that computing the parity
of n bits requires Ω(n) queries. Given an input string x ∈ {0,1}^n, construct a graph on vertices (i, b) for
i ∈ {0, 1, …, n} and b ∈ {0,1}, such that (i − 1, b) is adjacent to (i, b ⊕ x_i) for all i ∈ {1, …, n} and b ∈ {0,1}.
This graph is the disjoint union of two paths of length n, and (0, 0) is connected to (n, b) for exactly one value
of b, namely b = x₁ ⊕ ⋯ ⊕ x_n, the parity of the input string. The main idea of the proof is to construct a
Hamiltonian whose nonzero entries correspond to this graph, such that the dynamics for some time t = O(n)
map the state |0, 0⟩ to the state |n, x₁ ⊕ ⋯ ⊕ x_n⟩. Then a simulation of the Hamiltonian dynamics for time
t using o(t) queries would violate the parity lower bound.
The most obvious choice is to simply use the adjacency matrix of the graph as the Hamiltonian. However,
then the dynamics generate a continuous-time quantum walk on a finite path, which does not reach the
opposite end of the path with constant amplitude after linear time.
Instead, we choose the matrix elements of the Hamiltonian H so that

⟨i − 1, b|H|i, b ⊕ x_i⟩ = √(i(n − i + 1))/n.   (26.1)
Clearly, a black box for this 2-sparse Hamiltonian can be implemented using O(1) queries to the black box
for the input string to answer each neighbor query. The weights are chosen to reflect the transitions between
column states (in the sense of Section 16.5) for an unweighted hypercube. Specifically, letting Q denote the
adjacency matrix of the hypercube and |wt_k⟩ := \binom{n}{k}^{−1/2} Σ_{|x|=k} |x⟩, we have

Q|wt_k⟩ = \binom{n}{k}^{−1/2} [ (n − k + 1) Σ_{|x|=k−1} |x⟩ + (k + 1) Σ_{|x|=k+1} |x⟩ ]   (26.2)

= √(k(n − k + 1)) |wt_{k−1}⟩ + √((k + 1)(n − k)) |wt_{k+1}⟩.   (26.3)
Thus, with these weights on the edges, the dynamics behave just as the walk on the hypercube within its
column subspace. In particular, since the dynamics on the hypercube map a vertex into the opposite corner
in time π/2 (as shown in Section 16.2), the chosen Hamiltonian maps |0, 0⟩ to |n, x₁ ⊕ ⋯ ⊕ x_n⟩ in time O(n).
It follows that a generic procedure for simulating sparse Hamiltonians for time t must have complexity
Ω(t) in general. In other words, one cannot fast-forward the dynamics of arbitrary Hamiltonians.
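The weighted-path construction is easy to verify directly. The sketch below builds the (n+1)-dimensional tridiagonal Hamiltonian of (26.1) restricted to one of the two paths and checks that |0⟩ is mapped to |n⟩ (up to phase) at t = πn/2.

```python
import numpy as np

n = 7
H = np.zeros((n + 1, n + 1))
for i in range(1, n + 1):
    # <i-1|H|i> = sqrt(i(n-i+1))/n, as in (26.1)
    H[i - 1, i] = H[i, i - 1] = np.sqrt(i * (n - i + 1)) / n

w, V = np.linalg.eigh(H)
t = np.pi * n / 2            # hypercube time pi/2, rescaled by n
psi = (V * np.exp(-1j * w * t)) @ V.conj().T @ np.eye(n + 1)[0]
transfer = abs(psi[n])       # magnitude of the amplitude at the far end
```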
where X ≥ max_j Σ_{k=1}^N |H_{jk}|. Define the operators S, T as in the proof of Theorem 17.1. Then we get
T†ST = H/X, so the walk has eigenvalues ±e^{±i arccos(λ/X)}, where λ is an eigenvalue of H. The eigenvectors
corresponding to these two eigenvalues can be found within the subspace span{T|λ⟩, ST|λ⟩}, where |λ⟩ is
the eigenvector of H with eigenvalue λ.
To simulate H on a given input state |ψ⟩, we proceed as follows:
1. Apply the isometry T to produce the state T|ψ⟩.
2. Perform phase estimation on the quantum walk with precision δ (to be determined).
3. Given a value approximating arccos(λ/X), compute an estimate λ̃ of λ.
4. Introduce the phase e^{−iλ̃t}.
5. Uncompute the estimate λ̃.
6. Invert the phase estimation procedure.
7. Apply T† to return to a state in the original Hilbert space.
Since a step of the quantum walk can be implemented using two applications of the isometry T, this
procedure makes O(1/δ) calls to T. In turn, T can be implemented using a number of queries that is
polynomial in the sparsity of the Hamiltonian, so up to factors of the sparsity, the query complexity of
simulation is simply O(1/δ). Thus it remains to determine what value of δ suffices to ensure that the overall
procedure reproduces the dynamics up to error at most ε.
The details of this analysis are presented in [29, 20], but we can understand it roughly as follows. Suppose
the estimate of arccos(λ/X) deviates from its true value by of order δ. Since the cosine function has Lipschitz
constant 1 (i.e., |cos(x + δ) − cos(x)| ≤ |δ|), the resulting error in the value of λ/X is also of order δ. In
other words, the error in the value of λ is of order δX. To ensure that e^{−iλ̃t} deviates by at most ε from its
true value, we take δXt = Θ(ε), i.e., 1/δ = Θ(Xt/ε). Thus we see that the complexity is linear in t and
polynomial in 1/ε. Note that if H is d-sparse, then we can choose X ≤ d‖H‖_max, so the factor of X
just introduces polynomial overhead with respect to the sparsity.
Using a more refined implementation and analysis of this approach, one can achieve query complexity
O(‖H‖t/√ε + d‖H‖_max t) = O(d‖H‖_max t/√ε) for a d-sparse Hamiltonian H [20].
for higher-order formulas. The query complexity of this approach is O(τ log(τ/ε)/log log(τ/ε)), where τ := d²‖H‖_max t
(with ‖H‖_max denoting the largest magnitude of an entry of H).
Denote the Taylor series for the evolution up to time t, truncated at order K, by

Ũ(t) := Σ_{k=0}^K (−iHt)^k / k!.   (26.5)
For sufficiently large K, the operator Ũ(t) is a good approximation of exp(−iHt). Specifically, by Taylor's
theorem, we have

‖Ũ(t) − exp(−iHt)‖ ≤ exp(‖H‖t) (‖H‖t)^{K+1} / (K + 1)!,   (26.6)

so we can ensure that the error is at most ε by taking K = O(log(‖H‖t/ε)/log log(‖H‖t/ε)). If we take
‖H‖t constant, then we get an approximation with K = O(log(1/ε)/log log(1/ε)). If we could implement
the evolution for constant time with this complexity, then by reducing the error to ε/t, we could repeat the
process O(t) times and get a simulation with complexity O(t log(t/ε)/log log(t/ε)) and overall error at most
ε.
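The bound (26.6) can be checked directly. The sketch below uses a random Hermitian H normalized so that ‖H‖ = 1 (an arbitrary test instance); with K = 10 the truncated series already approximates exp(−iHt) to within the stated remainder.

```python
import numpy as np
from math import factorial, exp

rng = np.random.default_rng(5)
dim = 8
A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
H = (A + A.conj().T) / 2
H = H / np.linalg.norm(H, 2)          # normalize so ||H|| = 1
t = 1.0

w, V = np.linalg.eigh(H)
exact = (V * np.exp(-1j * w * t)) @ V.conj().T

K = 10
U_trunc = np.zeros((dim, dim), dtype=complex)
term = np.eye(dim, dtype=complex)
for k in range(K + 1):
    U_trunc = U_trunc + term          # accumulate (-iHt)^k / k!
    term = term @ (-1j * t * H) / (k + 1)

err = np.linalg.norm(U_trunc - exact, 2)
bound = exp(1.0) * 1.0 ** (K + 1) / factorial(K + 1)   # exp(||H||t)(||H||t)^{K+1}/(K+1)!
```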
Now suppose we can decompose the given Hamiltonian in the form

H = Σ_{ℓ=1}^L α_ℓ H_ℓ   (26.7)

for some coefficients α_ℓ ∈ ℝ, where the individual terms H_ℓ are both unitary and Hermitian. This is
straightforward if H is k-local, since we can express the local terms as linear combinations of Pauli operators.
If H is sparse, then such a decomposition can also be constructed efficiently [21].
To implement Ũ(t), we begin by writing it as a linear combination of unitaries, namely

Ũ(t) = Σ_{k=0}^K (−iHt)^k / k!   (26.8)

= Σ_{k=0}^K Σ_{ℓ₁,…,ℓ_k=1}^L (t^k/k!) α_{ℓ₁} ⋯ α_{ℓ_k} (−i)^k H_{ℓ₁} ⋯ H_{ℓ_k}   (26.9)

= Σ_{j=0}^{m−1} β_j V_j,   (26.10)

where the V_j are products of the form (−i)^k H_{ℓ₁} ⋯ H_{ℓ_k}, and the β_j are the corresponding coefficients.
How can we implement such a linear combination of unitaries? Let B be an operation that prepares the
state

|β⟩ := (1/√s) Σ_{j=0}^{m−1} √β_j |j⟩,   (26.11)

where s := Σ_{j=0}^{m−1} β_j.
Let

W := B† select(V) B   (26.15)

with

select(V) := Σ_{j=0}^{m−1} |j⟩⟨j| ⊗ V_j.   (26.16)
Then we have

(⟨0| ⊗ I) W (|0⟩ ⊗ |ψ⟩) = (⟨0| ⊗ I) B† select(V) (1/√s) Σ_j √β_j |j⟩|ψ⟩   (26.17)

= (⟨0| ⊗ I) B† (1/√s) Σ_j √β_j |j⟩ V_j|ψ⟩   (26.18)

= (1/s) Σ_j β_j V_j |ψ⟩   (26.19)

= (1/s) Ũ(t)|ψ⟩.   (26.20)

In other words, if we postselect the state W(|0⟩ ⊗ |ψ⟩) on having its first register in the state |0⟩, we obtain
the desired result. However, this postselection only succeeds with probability (approximately) 1/s².
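This block structure is easy to verify numerically. The sketch below uses a made-up decomposition into three Hermitian Pauli-product unitaries (the particular terms and weights are arbitrary illustrative choices), completes the state-preparation map B to a unitary, and checks that the |0⟩ block of W = B† select(V) B is (1/s) Σ_j β_j V_j.

```python
import numpy as np

rng = np.random.default_rng(6)
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

V = [np.kron(X, I2), np.kron(Z, Z), np.kron(I2, X)]   # Hermitian unitaries V_j
beta = np.array([0.6, 0.5, 0.4])                      # coefficients beta_j > 0
s = beta.sum()
m, dim = len(V), 4

# B prepares |beta> = (1/sqrt(s)) sum_j sqrt(beta_j)|j>; complete it to a unitary
col = np.sqrt(beta / s)
Q, _ = np.linalg.qr(np.column_stack([col, rng.standard_normal((m, m - 1))]))
if Q[0, 0] * col[0] < 0:
    Q[:, 0] = -Q[:, 0]        # fix the sign so that B|0> = |beta>
B = Q

sel = np.zeros((m * dim, m * dim), dtype=complex)     # select(V), block diagonal
for j in range(m):
    sel[j * dim:(j + 1) * dim, j * dim:(j + 1) * dim] = V[j]

W = np.kron(B.conj().T, np.eye(dim)) @ sel @ np.kron(B, np.eye(dim))
block = W[:dim, :dim]                                 # (<0| x I) W (|0> x I)
target = sum(b * v for b, v in zip(beta, V)) / s
err_lcu = np.linalg.norm(block - target)
```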
Considering the action of W on the full space, we have

W|0⟩|ψ⟩ = (1/s) |0⟩ Ũ(t)|ψ⟩ + √(1 − 1/s²) |Φ⟩   (26.21)

for |ψ⟩ ∈ H and some state |Φ⟩ whose ancillary register is supported in the subspace orthogonal to |0⟩. To boost
the chance of success, we might like to apply amplitude amplification to W. However, the initial state |ψ⟩ is
unknown, so we cannot reflect about it. Fortunately, something similar can be achieved using the reflection
R := (I − 2|0⟩⟨0|) ⊗ I   (26.22)

about the subspace with |0⟩ in the first register. Specifically, letting P := |0⟩⟨0| ⊗ I, a direct calculation shows that

(⟨0| ⊗ I) W R W† R W |0⟩|ψ⟩ = ( (3/s) Ũ(t) − (4/s³) Ũ(t) Ũ(t)† Ũ(t) ) |ψ⟩,   (26.25)

which is close to ((3/s) − (4/s³)) Ũ(t)|ψ⟩ since Ũ(t) is close to unitary. In particular, if s = 2 then this process boosts
the amplitude from 1/2 to 1, analogous to Grover search with a single marked item out of 4. For the purpose
of Hamiltonian simulation, we can choose the parameters such that a single segment of the evolution has
this value of s, and we repeat the process as many times as necessary to simulate the full evolution.
More generally, the operation W R W† R W is analogous to the Grover iterate, and it can be applied many
times to boost the amplitude for success from something small to a value close to 1. Using this oblivious
amplitude amplification, a general linear combination of unitaries as in (26.10) can be implemented with
complexity O(s).
Part VI
Chapter 27
The quantum adiabatic theorem
In the last part of this course, we will discuss an approach to quantum computation based on the concept
of adiabatic evolution. According to the quantum adiabatic theorem, a quantum system that begins in the
nondegenerate ground state of a time-dependent Hamiltonian will remain in the instantaneous ground state
provided the Hamiltonian changes sufficiently slowly. In this lecture we will prove the quantum adiabatic
theorem, which quantifies this statement.
For a time-independent Hamiltonian H, the solution of the Schrödinger equation

i (d/dt)|ψ(t)⟩ = H|ψ(t)⟩   (27.1)

with the initial quantum state |ψ(0)⟩ is given by

|ψ(t)⟩ = e^{−iHt}|ψ(0)⟩.   (27.2)

So any eigenstate |E⟩ of the Hamiltonian, with H|E⟩ = E|E⟩, simply acquires a phase exp(−iEt). In
particular, there are no transitions between eigenstates.
If the Hamiltonian varies in time, the evolution it generates can be considerably more complicated.
However, if the change in the Hamiltonian occurs sufficiently slowly, the dynamics remain relatively simple:
roughly speaking, if the system begins close to an eigenstate, it remains close to an eigenstate. The quantum
adiabatic theorem is a formal description of this phenomenon.
For a simple example of adiabatic evolution in action, consider a spin in a magnetic field that is rotated
from the x direction to the z direction in a total time T:

H(t) = −cos(πt/2T) σ_x − sin(πt/2T) σ_z.   (27.3)

Suppose that initially, the spin points in the x direction: |ψ(0)⟩ = (|0⟩ + |1⟩)/√2, the ground state of H(0).
As the magnetic field is slowly rotated toward the z direction, the spin begins to precess about the new
direction of the field, moving it toward the z axis (and also producing a small component out of the xz
plane). If T is made larger and larger, so that the rotation of the field direction happens more and more
slowly (as compared to the speed of precession), the state will precess in a tighter and tighter orbit about
the field direction. In the limit of arbitrarily slow rotation of the field, the state simply tracks the field,
remaining in the instantaneous ground state of H(t).
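This example is simple enough to integrate numerically. The sketch below (step count and the two evolution times are arbitrary choices; the integrator applies the instantaneous propagator over small time slices) shows the final overlap with the ground state of H(T) approaching 1 as T grows.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H(t, T):
    a = np.pi * t / (2 * T)
    return -np.cos(a) * sx - np.sin(a) * sz      # the rotating field (27.3)

def final_fidelity(T, steps=4000):
    dt = T / steps
    psi = np.array([1, 1], dtype=complex) / np.sqrt(2)   # ground state of H(0)
    for k in range(steps):
        w, V = np.linalg.eigh(H((k + 0.5) * dt, T))      # midpoint propagator
        psi = (V * np.exp(-1j * w * dt)) @ V.conj().T @ psi
    return abs(psi[0]) ** 2    # overlap with |0>, the ground state of H(T) = -sz

f_slow = final_fidelity(50.0)
f_fast = final_fidelity(2.0)
```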
More generally, for s ∈ [0, 1], let H(s) be a Hermitian operator that varies smoothly as a function of
s. (The notion of smoothness will be made precise in the following section.) Let s := t/T. Then for T
arbitrarily large, H(s) varies arbitrarily slowly as a function of t. An initial quantum state |ψ(0)⟩ evolves
according to
i (d/dt)|ψ(t)⟩ = H(t)|ψ(t)⟩,   (27.4)

or equivalently,

i (d/ds)|ψ(s)⟩ = T H(s)|ψ(s)⟩.   (27.5)

Now suppose that |ψ(0)⟩ is an eigenstate of H(0), which we assume for simplicity is the ground state, and
is nondegenerate. Furthermore, suppose that the ground state of H(s) is nondegenerate for all values of
s ∈ [0, 1]. Then the adiabatic theorem says that in the limit T → ∞, the final state |ψ(T)⟩ obtained by the
evolution (27.4) will be the ground state of H(1).
Of course, evolution for an infinite time is rather impractical. For computational purposes, we need a
quantitative version of the adiabatic theorem: we would like to understand how large T must be so that the
final state is guaranteed to differ from the adiabatically evolved state by at most some fixed small amount.
In particular, we would like to understand how the required evolution time depends on spectral properties of
the interpolating Hamiltonian H(s). We will see that the timescale for adiabaticity is intimately connected
to the energy gap between the ground and first excited states.
generates exactly adiabatic evolution, where we use a dot to denote differentiation with respect to s. In
other words, we claim that the differential equation

i (d/ds)|ψ(s)⟩ = H_a(s)|ψ(s)⟩   (27.9)

with |ψ(0)⟩ = |φ(0)⟩ has the solution |ψ(s)⟩ = e^{iθ(s)}|φ(s)⟩ for some time-dependent phase θ(s). Equivalently,
the density matrix P(s) = |φ(s)⟩⟨φ(s)| = |ψ(s)⟩⟨ψ(s)| satisfies the differential equation

i (d/ds)P(s) = i ( |φ̇(s)⟩⟨φ(s)| + |φ(s)⟩⟨φ̇(s)| )   (27.10)

= [H_a(s), P(s)].   (27.11)
(dropping the argument s when it is clear from context). Differentiating the identity P² = P gives

Ṗ = Ṗ P + P Ṗ,   (27.14)

and multiplying by P on both sides gives

P Ṗ P = 0.   (27.15)
for some unitary operator U(s). It is helpful to write the evolution in terms of a differential equation for
U(s). We have

(d/ds) U(s)|ψ(0)⟩ = (d/ds)|ψ(s)⟩   (27.17)

= −iT H(s)|ψ(s)⟩   (27.18)

= −iT H(s)U(s)|ψ(0)⟩,   (27.19)

and since this holds for any initial state |ψ(0)⟩, we see that U(s) satisfies the differential equation

i U̇(s) = T H(s) U(s).   (27.20)
Similarly, we have

i U̇_a(s) = H_a(s) U_a(s)   (27.21)

for the corresponding adiabatic evolution.
We would like to show that the difference between U and U_a is small. Thus we consider

U(1) − U_a(1) = −U(1) ∫₀¹ (d/ds)(U† U_a) ds   (27.22)

= i U(1) ∫₀¹ U† [H_a − T H] U_a ds   (27.23)

= −U(1) ∫₀¹ U† [Ṗ, P] U_a ds   (27.24)

where the first line follows from the fundamental theorem of calculus, the second from (27.20) and (27.21),
and the third from the definition of H_a.
It turns out that the expression [Ṗ, P] can be written as a commutator with the Hamiltonian, [Ṗ, P] =
[H, F], where

F := R Ṗ P + P Ṗ R   (27.25)

and where we have defined the resolvent

R := (H − E)^{−1}   (27.26)

(which has poles at the eigenvalues of H). This can be seen as follows: noting that (H − E)R = 1 so that
HR = 1 + ER, and PH = EP, we have
as claimed.
Now let us define

F̃ := U† F U.   (27.30)

Using (27.20), we have

(d/ds)F̃ = iT U†[H, F]U + U† Ḟ U;   (27.31)
therefore

U†[Ṗ, P]U = U†[H, F]U   (27.32)

= (1/iT) ( (d/ds)F̃ − U† Ḟ U ).   (27.33)
Now we insert this into (27.24) and integrate the first term by parts:

U(1) − U_a(1) = (i/T) U(1) ∫₀¹ ( (d/ds)F̃ − U†ḞU ) U†U_a ds   (27.34)

= (i/T) U(1) { [ F̃ U†U_a ]₀¹ − ∫₀¹ ( F̃ (d/ds)(U†U_a) + U†ḞU U†U_a ) ds }   (27.35)

= (i/T) U(1) { [ F̃ U†U_a ]₀¹ − ∫₀¹ ( F̃ U†[Ṗ, P]U_a + U†ḞU_a ) ds }.   (27.36)
Now

‖F‖ ≤ 2‖R Ṗ P‖   (27.38)

= 2‖R(1 − P) Ṗ P‖   (27.39)

≤ 2‖R(1 − P)‖ ‖Ṗ‖   (27.40)

≤ 2‖Ṗ‖/Δ,   (27.41)

where we have used (27.14) to see that ṖP = (1 − P)ṖP, and where Δ(s) is the gap between the smallest
eigenvalue E(s) of H(s) and the nearest distinct eigenvalue of H(s). Also,
Ḟ = ṘṖP + RP̈P + RṖ² + Ṗ²R + PP̈R + PṖṘ (27.42)
and
Ṙ = −(1/(H − E)) Ḣ (1/(H − E)) (27.43)
(to see this, differentiate the identity (H − E)R = 1), so (by similar calculations as above)
‖Ḟ‖ ≤ 2 ( ‖Ḣ‖‖Ṗ‖/Δ² + ‖P̈‖/Δ + ‖Ṗ‖²/Δ ). (27.44)
Thus we have
‖U(1) − Ua(1)‖ ≤ (2/T) [ ‖Ṗ(0)‖/Δ(0) + ‖Ṗ(1)‖/Δ(1) + ∫₀¹ ( 3‖Ṗ‖²/Δ + ‖Ḣ‖‖Ṗ‖/Δ² + ‖P̈‖/Δ ) ds ]. (27.45)
Finally, we would like to express ‖Ṗ‖ and ‖P̈‖ in terms of H. We can obtain upper bounds for these quantities using first and second order perturbation theory. Intuitively, if the Hamiltonian changes slowly, and if its eigenvalues are not close to degenerate, then its eigenvectors should also change slowly. At first order, we have
‖Ṗ‖ ≤ c₁ ‖Ḣ‖/Δ (27.46)
for some constant c₁, and at second order,
‖P̈‖ ≤ c₂ ‖Ḧ‖/Δ + c₃ ‖Ḣ‖²/Δ². (27.47)
Overall, we have proved the following quantitative version of the adiabatic theorem:
Theorem 27.1. Suppose H(s) has a nondegenerate ground state |φ(s)⟩ for all s ∈ [0, 1], and suppose that the total evolution time satisfies
T ≥ (2/ε) [ c₁ ‖Ḣ(0)‖/Δ(0)² + c₁ ‖Ḣ(1)‖/Δ(1)² + ∫₀¹ ( (3c₁² + c₁ + c₃) ‖Ḣ‖²/Δ³ + c₂ ‖Ḧ‖/Δ² ) ds ]. (27.49)
Then evolution of the initial state |ψ(0)⟩ = |φ(0)⟩ under the Schrödinger equation (27.5) produces a final state |ψ(1)⟩ satisfying
‖|ψ(1)⟩ − |φ(1)⟩‖ ≤ ε. (27.50)
Chapter 28
Adiabatic optimization
Having established the quantum adiabatic theorem, we will now see how it can be applied to solve optimization problems.
After describing the general framework [42], we will see how this approach gives an alternative O(√N)-time algorithm for unstructured search.
Given a cost function h : {0,1}ⁿ → ℝ, define a Hamiltonian that is diagonal in the computational basis, HP := Σ_{z∈{0,1}ⁿ} h(z)|z⟩⟨z|. We refer to HP as the problem Hamiltonian, since it corresponds to the problem of minimizing h. Clearly, its ground state consists of strings z such that h(z) is minimized. Therefore, if we could prepare the ground state of HP, we could solve the minimization problem.
To prepare the ground state of HP , we will adiabatically evolve from the ground state of a simpler
Hamiltonian. Let the the beginning Hamiltonian HB be some Hamiltonian whose ground state is easy
to prepare. Then let HT (t) be a smoothly varying time-dependent Hamiltonian with HT (0) = HB and
HT (T ) = HP , where T is the total run time of the evolution. Assuming the evolution is sufficiently close to
adiabatic, the initial ground state will evolve into a state close to the final ground state, thereby solving the
problem.
For any given HB and HP , there are many possible choices for the interpolation HT (t). One simple
choice is a time-dependent Hamiltonian of the form
HT(t) = H(t/T) := (1 − f(t/T)) HB + f(t/T) HP (28.3)
where f(s) is a smooth, monotonic function of s ∈ [0, 1] satisfying f(0) = 0 and f(1) = 1, so that H(0) = HB
and H(1) = HP . In other words, the interpolating function f (t/T ) should vary smoothly from 0 to 1 as the
time t varies from 0 to T . If f (s) is twice differentiable, and if the ground state of H(s) is nondegenerate
for all s ∈ [0, 1], then the adiabatic theorem guarantees that the evolution will become arbitrarily close
to adiabatic in the limit T → ∞. An especially simple choice for this interpolation schedule is the linear
interpolation f (s) = s, but many other choices are possible.
Finally, how should we choose the beginning Hamiltonian? If we choose an interpolation of the form
(28.3), then HB clearly should not commute with HP, or else no evolution will occur. One natural choice for HB is
HB = −Σ_{j=1}^n σx^{(j)} (28.4)
where σx^{(j)} is the Pauli x operator on the jth qubit. This beginning Hamiltonian has the ground state
|S⟩ := (1/√2ⁿ) Σ_{z∈{0,1}ⁿ} |z⟩, (28.5)
a uniform superposition of all possible solutions S = {0,1}ⁿ. But as for the method of interpolation, many
other choices for HB are possible.
To summarize, a quantum adiabatic optimization algorithm works as follows:
1. Prepare the quantum computer in the ground state of the beginning Hamiltonian HB .
2. Evolve the state with the Hamiltonian H(t) for a total time T , ending with the problem Hamiltonian
HP .
3. Measure in the computational basis.
Step 1 can be performed efficiently if HB has a sufficiently simple ground state, for example, if it is the
state (28.5). Step 2 can be simulated efficiently on a universal quantum computer, assuming the Hamiltonian
is of a suitable form (say, if it is sparse) and the run time T is not too large. Step 3 is straightforward to
implement, and will yield a state close to the ground state assuming the simulation of the evolution is
sufficiently good and the evolution being simulated meets the conditions of the adiabatic theorem.
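As an illustration of these three steps, the following sketch runs the algorithm on a toy 3-qubit instance; the cost function (Hamming weight), schedule, and run time are all illustrative choices, not part of the text.

```python
import numpy as np
from scipy.linalg import expm

n, T, steps = 3, 50.0, 5000
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def op(single, j):                       # single-qubit operator acting on qubit j
    out = np.array([[1.0]])
    for i in range(n):
        out = np.kron(out, single if i == j else np.eye(2))
    return out

HB = -sum(op(X, j) for j in range(n))    # beginning Hamiltonian, as in (28.4)
h = np.array([bin(z).count("1") for z in range(2 ** n)])
HP = np.diag(h.astype(float))            # problem Hamiltonian for a toy cost function

# Step 1: prepare the ground state of HB, the uniform superposition (28.5)
psi = np.ones(2 ** n) / np.sqrt(2 ** n)

# Step 2: evolve under H(t/T) with the linear schedule f(s) = s
dt = T / steps
for i in range(steps):
    s = (i + 0.5) / steps
    psi = expm(-1j * dt * ((1 - s) * HB + s * HP)) @ psi

# Step 3: measure in the computational basis
print(np.abs(psi[0]) ** 2)               # probability of the minimizer 000 (close to 1)
```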
For the linear interpolation f(s) = s, the derivatives of the Hamiltonian are simply
Ḣ = HP − HB (28.6)
Ḧ = 0. (28.7)
Now let
Δmin := min_{s∈[0,1]} Δ(s) (28.8)
be the minimum gap between the ground and first excited states. Then by Theorem 27.1, it suffices to take
T ≥ (2/ε) [ 2c₁ ‖HP − HB‖/Δ²min + (3c₁² + c₁ + c₃) ‖HP − HB‖²/Δ³min ]. (28.9)
Recall that to be efficiently simulable, HB and HP should not have very large norm. Thus we see that if
the minimum gap min is not too small, the run time need not be too large. In particular, to show that the
adiabatic algorithm runs in polynomial time, it suffices to show that the minimum gap is only polynomially small, i.e., that 1/Δmin is upper bounded by a polynomial in n.
Of course, this does not answer the question of whether the adiabatic algorithm runs in polynomial time
unless the minimum gap can be estimated. In general, calculating the gap for a particular Hamiltonian is
a difficult problem, which makes the adiabatic algorithm difficult to analyze. Nevertheless, there are a few
examples of interest for which the gap can indeed be estimated.
[Figure: the gap Δ(f) as a function of the interpolation parameter f(t/T) for the unstructured search Hamiltonian, with a sharp minimum at f = 1/2.]
In general, the minimum value occurs at f = 1/2, where we have Δmin = a = 1/√N.
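These properties of the gap can be checked numerically. The sketch below assumes the rank-one projector form of the search Hamiltonians, HB = 1 − |S⟩⟨S| and HP = 1 − |m⟩⟨m| (an assumption about the form of (28.13), which is defined earlier in this section), and compares the exact gap to √(1 − 4f(1 − f)(1 − a²)).

```python
import numpy as np

N, m = 64, 17                            # m is an arbitrary marked item
S = np.ones(N) / np.sqrt(N)
HB = np.eye(N) - np.outer(S, S)          # 1 - |S><S|
HP = np.eye(N)
HP[m, m] = 0.0                           # 1 - |m><m|
a = 1.0 / np.sqrt(N)

for f in np.linspace(0.0, 1.0, 11):
    w = np.linalg.eigvalsh((1 - f) * HB + f * HP)
    gap = w[1] - w[0]
    pred = np.sqrt(1 - 4 * f * (1 - f) * (1 - a ** 2))
    assert abs(gap - pred) < 1e-9

print("minimum gap:", np.sqrt(1 - (1 - a ** 2)))   # at f = 1/2, exactly a = 1/sqrt(N)
```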
To finish specifying the algorithm, we must choose a particular interpolation function (or schedule) f (s).
The simplest choice is to use the linear interpolation f(s) = s, but it turns out that this simple choice does not work. Applying (28.9), which pessimistically depends solely on the minimum value of the gap, only shows it is sufficient to take T = O(1/Δ³min) = O(N^{3/2}). But even if we use the full adiabatic theorem, we only find that it is sufficient to take the run time to be large compared to
∫₀¹ df/Δ³ = ∫₀¹ df/[1 − 4f(1 − f)(1 − a²)]^{3/2} = 1/a² = N. (28.19)
While the adiabatic theorem only gives an upper bound on the running time, it turns out that the bound
is essentially tight in this case: with linear interpolation, the run time must be Ω(N) for the evolution to
remain approximately adiabatic.
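The value of the integral (28.19) is easy to confirm numerically:

```python
import numpy as np
from scipy.integrate import quad

N = 256
a = 1 / np.sqrt(N)
val, err = quad(lambda f: (1 - 4 * f * (1 - f) * (1 - a ** 2)) ** -1.5, 0, 1)
print(val, N)   # the integral equals 1/a^2 = N
```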
However, we can do better by choosing a different interpolation schedule f(s). Intuitively, since the gap is smallest when f(s) is close to 1/2, we should evolve more slowly for such values. The fact that the gap is only of order 1/√N for values of |f − 1/2| of order 1/√N ultimately means that it is possible to choose a schedule for which a total run time of O(√N) suffices. Since we should evolve most slowly when the gap is smallest, it is reasonable to let ḟ ∝ Δᵖ for some power p. For concreteness, we will use p = 3/2, although any p ∈ (1, 2) would work.
If we let
ḟ = cΔ^{3/2}, (28.20)
then the coefficient c is fixed by the equation ∫₀¹ ds = ∫₀¹ df/ḟ = 1, i.e.,
c = ∫₀¹ df/Δ^{3/2} (28.21)
= ∫₀¹ df/[1 − 4f(1 − f)(1 − a²)]^{3/4} (28.22)
= (N^{3/4}/√(2(N − 1))) Im B_{(1+i√(N−1))/2}(1/4, 1/4) (28.23)
= 2 (Γ(5/4)/Γ(3/4)) N^{1/4} + O(1) (28.24)
where B_z(a, b) denotes the incomplete beta function, and Γ(z) denotes the gamma function. Then for
example, with N = 1000, the schedule obtained by integrating (28.20) looks as follows:
28.3. Adiabatic optimization algorithm for unstructured search 145
[Figure: the schedule f(s) for N = 1000, rising quickly near s = 0 and s = 1 and varying slowly where f is near 1/2.]
Now we want to evaluate the terms appearing in the adiabatic theorem. For the first three terms, we
need to calculate
‖Ḣ(0)‖/Δ(0)² = ‖Ḣ(1)‖/Δ(1)² = c√(1 − a²) (28.28)
= O(N^{1/4}), (28.29)
and
dΔ/df = 2(2f − 1)(1 − a²)/Δ. (28.38)
Then we have
∫₀¹ (‖Ḧ‖/Δ²) ds = ∫₀¹ (‖Ḧ‖/Δ²) (df/ḟ) (28.39)
= (3c/2) √(1 − a²) ∫₀¹ Δ^{−3/2} |dΔ/df| df (28.40)
= 3c(1 − a²)^{3/2} ∫₀¹ |2f − 1|/[1 − 4f(1 − f)(1 − a²)]^{5/4} df (28.41)
= 6c(1 − a²)^{3/2}/(√a (1 + √a)(1 + a)) (28.42)
= O(√N). (28.43)
Overall, we find a total run time of T = O(√N) suffices to make the evolution arbitrarily close to adiabatic.
In the above analysis, it was essential to understand the behavior of the gap as a function of f . In
particular, since the spectrum of the Hamiltonian (28.13) does not depend on which item m is marked, we
can choose a schedule that is simultaneously good for all possible marked items. For general instances of
adiabatic optimization, this may not be the case.
To implement this adiabatic optimization algorithm for unstructured search in the conventional quantum query model, we must simulate evolution according to this Hamiltonian. Using the fact that ‖[HB, HP]‖ = O(1/√N), it is possible to perform this simulation using O(√N) queries to a black box for h(z).
Chapter 29
An example of the success of adiabatic optimization
In this lecture, we describe a simple example of a function that can be minimized by adiabatic optimization
in polynomial time [42].
Specifically, we consider the ring of agrees, with cost function
h(z) := Σ_{j=1}^n (1 − δ_{z_j, z_{j+1}}), (29.1)
where we make the identification z_{n+1} := z₁. Thus, the problem Hamiltonian can be written in terms of
Pauli operators as
HP := Σ_z h(z)|z⟩⟨z| (29.3)
= (1/2) Σ_{j=1}^n (1 − σz^{(j)} σz^{(j+1)}) (29.4)
where we make the similar identification σz^{(n+1)} := σz^{(1)}. To prepare the ground state of HP, we will use
linear interpolation from a magnetic field in the x direction (i.e., the adjacency matrix of the hypercube),
giving
H(s) = −(1 − s) Σ_{j=1}^n σx^{(j)} + (s/2) Σ_{j=1}^n (1 − σz^{(j)} σz^{(j+1)}). (29.5)
To understand how well the resulting adiabatic algorithm performs, we would like to calculate the gap
Δ(s) of this Hamiltonian as a function of s. Strictly speaking, this gap is zero, since the final ground state
is degenerate: any state in the two-dimensional subspace span{|0 . . . 0i, |1 . . . 1i} has zero energy. However,
148 Chapter 29. An example of the success of adiabatic optimization
the Hamiltonian commutes with the spin flip operator
G := σx^{(1)} σx^{(2)} ⋯ σx^{(n)}, (29.6)
and the initial state |S⟩ (where S = {0,1}ⁿ) is an eigenstate of G with eigenvalue +1. The evolution takes place entirely within the +1 eigenspace of G, so we can restrict our attention to this subspace. So let Δ(s) denote the gap between the ground state of H(s) and the first excited state in the +1 eigenspace of G. This is the relevant gap for adiabatic evolution starting in |S⟩, with the ultimate goal of producing the
unique G = +1 ground state of HP, the GHZ state
(|0…0⟩ + |1…1⟩)/√2. (29.7)
Measurement of this state in the computational basis will yield one of the two satisfying assignments of the
n bits, each occurring with probability 1/2.
The Hamiltonian (29.5) is well-known in statistical mechanics, where it is referred to as a ferromagnetic
Ising model in a transverse magnetic field. It can be diagonalized using the Jordan-Wigner transform, which
we describe next.
This Hamiltonian is of the general Ising form
H = Σ_{i=1}^n Jᵢ σz^{(i)} σz^{(i+1)} + Σ_{i=1}^n hᵢ σx^{(i)} (29.8)
for some values of the real numbers Jᵢ and hᵢ. We may either have periodic boundary conditions (by identifying σz^{(n+1)} with σz^{(1)}) or open boundary conditions (by setting Jn = 0).
The Jordan-Wigner transformation consists of the definition
aj := σx^{(1)} σx^{(2)} ⋯ σx^{(j−1)} σ₋^{(j)} ⊗ 1^{(j+1)} ⊗ ⋯ ⊗ 1^{(n)} (29.9)
(which will turn out to be a fermion annihilation operator), where we have defined spin raising and lowering
operators in the x basis,
σ± := R ((σx ∓ iσy)/2) R (29.10)
= |∓⟩⟨±| (29.11)
where
R := (1/√2) ( 1 1 ; 1 −1 ) (29.12)
is the Hadamard transformation, and |±⟩ := (|0⟩ ± |1⟩)/√2 are the eigenvectors of σx.
To see that the aj's correspond to fermion annihilation operators, we observe that aj and
aj† = σx^{(1)} σx^{(2)} ⋯ σx^{(j−1)} σ₊^{(j)} ⊗ 1^{(j+1)} ⊗ ⋯ ⊗ 1^{(n)} (29.13)
satisfy the canonical anticommutation relations. Here the anticommutator is
29.2. The Jordan-Wigner transformation: From spins to fermions 149
{A, B} := AB + BA (29.14)
and
for j < k we have, for example,
{aj, ak} = {σ₋^{(j)}, σx^{(j)}} σx^{(j+1)} ⋯ σx^{(k−1)} σ₋^{(k)} ⊗ 1 ⊗ ⋯ = 0 (29.16)
since {σ₋, σx} = 0. Proceeding similarly for the other cases, one finds
{aj, ak} = 0 (29.19)
{aj, ak†} = δ_{j,k}. (29.20)
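For small n, the anticommutation relations (29.19) and (29.20) can be verified directly by building the operators (29.9) as explicit matrices:

```python
import numpy as np

n = 3
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, -1j], [1j, 0.0]])
R = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)   # Hadamard, (29.12)
sm = R @ ((X + 1j * Y) / 2) @ R                        # sigma_- = |+><-|, from (29.10)-(29.11)

def chain(ops):
    out = np.array([[1.0 + 0j]])
    for o in ops:
        out = np.kron(out, o)
    return out

a = [chain([X] * j + [sm] + [I2] * (n - j - 1)) for j in range(n)]

anti = lambda A, B: A @ B + B @ A
for j in range(n):
    for k in range(n):
        assert np.allclose(anti(a[j], a[k]), 0)                        # (29.19)
        assert np.allclose(anti(a[j], a[k].conj().T),
                           (j == k) * np.eye(2 ** n))                  # (29.20)
print("CAR verified for n =", n)
```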
To fermionize H, we need to express σx^{(j)} and σz^{(j)}σz^{(j+1)} in terms of fermion operators. The important point is that even though the aj's and aj†'s are highly nonlocal spin operators, certain local combinations of them correspond to local spin operators, and vice versa. For the magnetic field, we have
aj†aj = σ₊^{(j)} σ₋^{(j)} (29.21)
= (|−⟩⟨−|)^{(j)} (29.22)
= (1 − σx^{(j)})/2, (29.23)
so σx^{(j)} = 1 − 2aj†aj. Similarly, one finds σz^{(j)}σz^{(j+1)} = (aj† − aj)(a_{j+1}† + a_{j+1}) for j < n, while with periodic boundary conditions the boundary coupling σz^{(n)}σz^{(1)} acquires an additional factor of G,
where G is the spin flip operator defined in (29.6). Since σx anticommutes with σz, the operator G commutes
with each Ising coupling term, and thus commutes with any H of the form (29.8). Therefore, to find the
spectrum of H, it suffices to separately determine the spectra in the subspaces with G = +1 and G = −1.
Note that since σx = (−1)^{(1−σx)/2}, we can write
G = (−1)^{Σ_{j=1}^n (1−σx^{(j)})/2} (29.30)
= (−1)^{Σ_{j=1}^n aj†aj}. (29.31)
Thus the cases G = +1, G = −1 correspond to the cases of an even or an odd number of occupied fermion
modes, respectively.
Overall, the Jordan-Wigner transformation results in the expression
H = Σ_{i=1}^n Jᵢ′ (aᵢ† − aᵢ)(a_{i+1}† + a_{i+1}) − Σ_{i=1}^n hᵢ (aᵢ†aᵢ − aᵢaᵢ†). (29.32)
Using the fermion anticommutation relations (29.19) and (29.20), we can rewrite this Hamiltonian as
H = a† ( α β ; β† −αᵀ ) a + tr α (29.35)
where α and β denote the matrices whose j,k entries are α_{jk} and β_{jk}, respectively, and a denotes the column vector whose first block has entries a₁, …, an and whose second block has entries a₁†, …, an†. Since H is hermitian, we can always choose α, β so that α = α† and β = −βᵀ.
We would like to define a change of basis to a new set of fermion operators bj, bj† in which the Hamiltonian
is diagonal. If we let
bj := Σ_{k=1}^n ( γ_{jk} ak + μ_{jk} ak† ) (29.36)
for some matrices γ and μ, or in block form,
( b ; b† ) = ( γ μ ; μ̄ γ̄ ) ( a ; a† ). (29.37)
The matrices γ and μ are not arbitrary, since we require that the transformed bj's and bj†'s remain fermion
operators, i.e., that they satisfy the fermion anticommutation relations
{bj, bk} = 0 (29.38)
{bj, bk†} = δ_{j,k}. (29.39)
It is a good exercise to check that these relations are satisfied if and only if the matrix in (29.37) is unitary.
Although we will not describe the proof here,1 it turns out that any quadratic fermion Hamiltonian can
be diagonalized by such a transformation. In particular, it is always possible to choose γ, μ so that
H = b† ( λ 0 ; 0 −λ ) b + tr α (29.40)
1 The diagonalization of H in the case of real α, β appears in [69]. For general α, β, as well as the case where we include terms that are linear in the fermion operators, see [35].
29.3. Diagonalizing a system of free fermions 151
where λ is a diagonal matrix whose diagonal entries are the positive eigenvalues of the 2 × 2 block matrix (representing a 2n × 2n matrix whose eigenvalues occur in ± pairs) appearing in (29.35). Expanding this expression, we have
H = Σ_{j=1}^n λj (2bj†bj − 1) + tr α (29.41)
where we have again used the fermion anticommutation relations. Since the bj†bj's are commuting operators with eigenvalues 0 and 1, we see that the spectrum of H is given by the 2ⁿ numbers
Σ_{j=1}^n sj λj + tr α (29.42)
over all choices of s₁, …, sn ∈ {−1, +1}.
The eigenvalues corresponding to eigenstates with G = ±1 can be identified as follows. The transformation (29.37) is invertible, so any quadratic expression in the aj's and aj†'s can be written as a quadratic expression in the bj's and bj†'s. Since quadratic fermion operators do not change the parity of the total number of occupied modes, this means that the parity of the a modes is the same as the parity of the b modes. In other words,
G = (−1)^{Σ_{j=1}^n bj†bj}. (29.48)
Thus the eigenvalues with G = +1 are those with an even number of sj's equal to +1 in (29.42), whereas the eigenvalues with G = −1 are those with an odd number of sj's equal to +1. In particular, we see that the gap between the ground and first excited states in the G = +1 subspace is equal to 2(λ₁ + λ₂), where λ₁ and λ₂ are the square roots of the two smallest eigenvalues of (29.47).
In the case of periodic boundary conditions, note that we have two distinct matrices (29.47), one for each value of G. However, with a fixed value of G, only half the possible assignments of the sj's give rise to eigenvalues of the Hamiltonian, so we still find the correct number of eigenvalues. Here again, the gap between the ground and first excited states in the G = +1 subspace is equal to 2(λ₁ + λ₂), where now λ₁ and λ₂ are the square roots of the two smallest eigenvalues of (29.47) with G = +1.
For the ring of agrees, the matrix (29.47) becomes
(Jᵢ² + hᵢ²) 1 − Jᵢhᵢ (D + D⁻¹) = (1/4) ( s² + 4(1 − s)² − 2s(1 − s)(D + D⁻¹) ), (29.49)
where D is the skew-circulant matrix
D := ⎛ 0 1 0 ⋯ 0 ⎞
     ⎜ 0 0 1 ⋱ ⋮ ⎟
     ⎜ ⋮ ⋱ ⋱ ⋱ 0 ⎟
     ⎜ 0 ⋯ 0 0 1 ⎟
     ⎝ −1 0 ⋯ 0 0 ⎠ (29.50)
= Σ_{x=0}^{n−2} |x + 1⟩⟨x| − |0⟩⟨n − 1|. (29.51)
Just as a circulant matrix is diagonal in the Fourier basis
|φk⟩ := (1/√n) Σ_{x=0}^{n−1} e^{2πikx/n} |x⟩ (29.54)
29.4. Diagonalizing the ring of agrees 153
for k = 0, 1, …, n − 1, one can show that the matrix D (and hence any skew-circulant matrix) is diagonal in the skew-Fourier basis
|φ̃k⟩ := (1/√n) Σ_{x=0}^{n−1} e^{iπ(2k+1)x/n} |x⟩, (29.55)
also for k = 0, 1, …, n − 1. In particular,
D|φ̃k⟩ = e^{−iπ(2k+1)/n} |φ̃k⟩. (29.56)
Thus, the eigenvalues of (29.49) are given by
(1/4) ( s² + 4(1 − s)² − 4s(1 − s) cos(π(2k + 1)/n) ). (29.57)
The smallest two eigenvalues (which are equal) occur for k = 0 and k = n − 1, so the gap as a function of the interpolating parameter is
Δ(s) = 2 √( s² + 4(1 − s)² − 4s(1 − s) cos(π/n) ), (29.58)
which looks like this for n = 50:
[Figure: the gap Δ(s) for n = 50, decreasing from 4 at s = 0 to a small minimum near s = 2/3, then returning to 2 at s = 1.]
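The gap formula can be tested against exact diagonalization in the G = +1 sector for small n, taking Δ(s) = 2√(s² + 4(1 − s)² − 4s(1 − s) cos(π/n)) as in (29.58):

```python
import numpy as np

n = 6
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def site(op, j):                               # operator op acting on spin j
    out = np.array([[1.0]])
    for i in range(n):
        out = np.kron(out, op if i == j else I2)
    return out

Xs = [site(X, j) for j in range(n)]
ZZ = [site(Z, j) @ site(Z, (j + 1) % n) for j in range(n)]   # periodic ring
G = Xs[0]
for x in Xs[1:]:
    G = G @ x                                  # spin flip operator (29.6)

wG, VG = np.linalg.eigh(G)
B = VG[:, wG > 0]                              # orthonormal basis of the G = +1 subspace

for s in [0.2, 0.5, 0.8]:
    H = -(1 - s) * sum(Xs) + (s / 2) * sum(np.eye(2 ** n) - zz for zz in ZZ)
    w = np.linalg.eigvalsh(B.T @ H @ B)        # spectrum restricted to G = +1
    pred = 2 * np.sqrt(s ** 2 + 4 * (1 - s) ** 2 - 4 * s * (1 - s) * np.cos(np.pi / n))
    assert abs((w[1] - w[0]) - pred) < 1e-8
print("gap formula (29.58) verified for n =", n)
```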
For large n,
cos(π/n) = 1 − π²/(2n²) + O(1/n⁴), (29.59)
so
Δ(s) = 2 √( (2 − 3s)² + 2π² s(1 − s)/n² ) + O(1/n⁴). (29.60)
n2
Setting d(s)2 /ds equal to zero, we see that the minimum occurs at s = 2/3 + O(1/n2 ), at which the
minimum gap is
4
= + O(1/n3 ) . (29.61)
3n
Since the minimum gap decreases only as 1/poly(n), we see that adiabatic optimization can efficiently find
a satisfying assignment for the ring of agrees. Even though the ring of agrees is not by itself an interesting
computational problem, we can take this as preliminary evidence that adiabatic optimization sometimes
succeeds.
However, it is also possible for the adiabatic algorithm to fail (at least for certain natural choices of
the interpolating Hamiltonian), even for cost functions that are almost as simple as the ring of agrees. For
example, suppose we have 4n spins arranged on a ring, and we define the cost function
h′(z) = Σ_{j=1}^n (1 − δ_{z_j,z_{j+1}}) + 2 Σ_{j=n+1}^{2n} (1 − δ_{z_j,z_{j+1}}) + Σ_{j=2n+1}^{3n} (1 − δ_{z_j,z_{j+1}}) + 2 Σ_{j=3n+1}^{4n} (1 − δ_{z_j,z_{j+1}}). (29.62)
In other words, we again penalize a string when adjacent bits disagree, but the penalty is either 1 or 2
for contiguous blocks of n pairs of spins. In this case one can show that the gap is exponentially small.
Unfortunately, we did not have time to discuss the details of this calculation.
Chapter 30
Universality of adiabatic quantum computation
In this final chapter, we see how adiabatic evolution can be used to implement an arbitrary quantum circuit
[8]. In particular, this can be done with a local, linearly interpolated Hamiltonian. We may think of such
Hamiltonians as describing a model of quantum computation. We know that this model can be efficiently
simulated in the quantum circuit model. In this lecture we will see how the circuit model can be efficiently
simulated by the adiabatic model, so that in fact the two models have equivalent computational power (up
to polynomial factors).
This does not necessarily mean that there is an efficient adiabatic optimization algorithm for any problem
that can be solved efficiently by a quantum computer. For example, Shor's algorithm shows that quantum
computers can factor integers efficiently, yet we do not know if there is an adiabatic factoring algorithm that
works by optimizing some cost function (such as the squared difference between the integer and a product
of smaller integers). In general, it does not seem that the constructions of universal adiabatic quantum
computers give much insight into how one might design efficient quantum adiabatic optimization algorithms.
Nevertheless, they show that there is some sense in which the idea of adiabatic evolution captures much of
the power of quantum computation.
Specifically, given a quantum circuit composed of gates U₁, …, Uk, Feynman's construction uses the Hamiltonian
HF := Σ_{j=1}^k Hj (30.1)
where
Hj := Uj ⊗ |j⟩⟨j − 1| + Uj† ⊗ |j − 1⟩⟨j|. (30.2)
Here the first register consists of n qubits, and the second register stores a quantum state in a (k + 1)-dimensional space spanned by states |j⟩ for j ∈ {0, 1, …, k}. The second register acts as a clock that records
the progress of the computation. Later, we will show how to represent the clock using qubits, but for now,
we treat it as a convenient abstraction.
156 Chapter 30. Universality of adiabatic quantum computation
If we start the computer in the state |ψ⟩ ⊗ |0⟩, then the evolved state remains in the subspace spanned by the k + 1 states
|ψj⟩ := (Uj ⋯ U₁|ψ⟩) ⊗ |j⟩ (30.3)
for j ∈ {0, 1, …, k}. In this subspace, the nonzero matrix elements of HF are
⟨ψj|HF|ψ_{j±1}⟩ = 1, (30.4)
so the evolution is the same as that of a free particle propagating on a discretized line segment. Such a particle moves with constant speed, so in a time proportional to k, the initial state |ψ₀⟩ will evolve to a state with substantial overlap on the state |ψk⟩ = (Uk ⋯ U₁|ψ⟩) ⊗ |k⟩, corresponding to the final state of the computation. For large k, one can show that
|⟨ψk| e^{−iHF k/2} |ψ₀⟩|² = Ω(k^{−2/3}), (30.5)
so that after time k/2, a measurement of the clock will yield the result k, and hence give the final state
of the computation, with a probability that is only polynomially small in the total number of gates in the
original circuit.
The success probability of Feynmans computer can be made close to 1 by a variety of techniques. The
simplest approach is to repeat the process O(k^{2/3}) times. Or we could pad the end of the computation
with a large number of identity gates, boosting the probability that we reach a state in which the entire
computation has been performed. Alternatively, as Feynman suggested, the success probability can be made
arbitrarily close to 1 in a single shot by preparing the initial state in a narrow wave packet that will propagate
ballistically without substantial spreading. But perhaps the best approach is to make the process perfect by
changing the Hamiltonian to
HFG := (1/2) Σ_{j=1}^k √(j(k + 1 − j)) Hj. (30.6)
In this case, the choice t = π gives the exact transformation e^{−iHFG t}|ψ₀⟩ = |ψk⟩ (up to an overall phase). This can be understood by viewing |ψj⟩ as a state of total angular momentum k/2 (with squared angular momentum (k/2)(k/2 + 1)) and z component j − k/2. Then HFG is simply the x component of angular momentum, which rotates between the states with z component ∓k/2 in time π.
Equivalently, HF G can be viewed as the Hamiltonian in the Hamming weight subspace of a hypercube.
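The perfect state transfer generated by HFG is easy to confirm numerically. Taking all Uj = I, only the clock register matters:

```python
import numpy as np
from scipy.linalg import expm

k = 7                                        # clock states |0>, ..., |k>
H = np.zeros((k + 1, k + 1))                 # HFG of (30.6) with all U_j = I
for j in range(1, k + 1):
    H[j, j - 1] = H[j - 1, j] = 0.5 * np.sqrt(j * (k + 1 - j))

psi0 = np.zeros(k + 1)
psi0[0] = 1.0
out = expm(-1j * np.pi * H) @ psi0           # evolve for time t = pi
print(abs(out[-1]))                          # 1.0: the clock reaches |k> exactly (up to phase)
```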
In the Hamiltonians (30.1) and (30.6), the clock space is not represented using qubits. However, we can
easily create a Hamiltonian expressed entirely in terms of k + 1 qubits using a unary representation of the
clock. Let
|j⟩ := |0⋯0 1 0⋯0⟩ (30.7)
with j zeros before the single 1 and k − j zeros after it.
In the Hamiltonian, we then make the replacement
|j⟩⟨j − 1| → (|01⟩⟨10|)^{(j−1,j)} (30.8)
(and similarly for the adjoint), where the parenthesized superscript indicates which qubits are acted on. Then the subspace of states for which the clock register has the form (30.7) is invariant under the Hamiltonian, and within this subspace, its action is identical to that of the original Hamiltonian.
Notice that if the quantum circuit consists of one- and two-qubit gates, then the Hamiltonians (30.1) and
(30.6) are local in the sense that the interactions involve at most four qubits. We call such a Hamiltonian
4-local.
This construction shows that even a time-independent Hamiltonian of a particularly simple form can be
universal for quantum computation. Now lets see how we can modify the construction to use adiabatic
evolution instead of a time-independent Hamiltonian.
The idea is to begin with a Hamiltonian whose ground state is the initial state of the computation together with the initial configuration of the clock, and to slowly evolve to a Hamiltonian (essentially, minus the Feynman Hamiltonian (30.1)) whose ground state encodes not the final state of the computation, but rather a uniform superposition over the entire history of the computation.
As before, we will find it convenient to start with an abstract description of the clock register in terms
of k + 1 basis states |0i, |1i, . . . , |ki, without worrying about how these states are represented in terms of
qubits. Later, we will consider issues of locality in this type of construction.
For the beginning Hamiltonian, we will use
HB := −I ⊗ |0⟩⟨0| + Hpenalty (30.9)
where
Hpenalty := Σ_{j=1}^n |1⟩⟨1|^{(j)} ⊗ |0⟩⟨0|. (30.10)
Here the parenthesized superscript again indicates which qubit is acted on. The first term of (30.9) says that
the energy is lower if the clock is in the initial state |0⟩. Adding Hpenalty gives an energy penalty to states whose clock is in the state |0⟩, yet for which the state of the computation is not the initial state |00…0⟩. Thus the unique ground state of HB is |00…0⟩ ⊗ |0⟩.
For the final Hamiltonian (which we denote HC , since it encodes the final result of an arbitrary circuit,
rather than the solution of a particular problem), we will use
HC := −HF + Hpenalty (30.11)
where HF is the Feynman Hamiltonian defined in (30.1). From (30.4), we see that HF has a degenerate ground state subspace, where any state of the form
|η⟩ := (1/√(k + 1)) Σ_{j=0}^k |ψj⟩ (30.12)
(with |ψj⟩ defined in (30.3)), with an arbitrary initial state |ψ⟩, has minimal energy. Adding Hpenalty penalizes those states for which the initial state of the computation is not |00…0⟩, so that (30.12) with |ψ⟩ = |00…0⟩ is the unique ground state of HC. This state is almost as good as the final state of the
computation, since if we measure the clock, we obtain the result k with probability 1/(k + 1), which is
1/ poly(n) assuming the length of the circuit is only k = poly(n). By repeating the entire process poly(k)
times, we can obtain the final state of the computation with high probability.
Finally, we use linear interpolation to get from HB to HC , defining
H(s) := (1 s)HB + sHC . (30.13)
If we begin in the state |00…0⟩ ⊗ |0⟩ and evolve according to HT(t) := H(t/T) for a sufficiently large time T, the adiabatic theorem guarantees that the final state will be close to |η⟩. It remains to estimate the gap
Δ(s) to show that T = poly(k) is sufficient.
In fact, the (k + 1)-dimensional computational subspace spanned by the states |ψj⟩ with |ψ⟩ = |00…0⟩ is invariant under H(s), so it suffices to compute the gap within this subspace. Let us examine how H(s) acts within the computational subspace. Note that Hpenalty|ψj⟩ = 0 for all j ∈ {0, 1, …, k}. We have
⟨ψj|HB|ψ_{j′}⟩ = −δ_{j,j′} δ_{j,0} (30.14)
and
⟨ψj|HC|ψ_{j′}⟩ = −(δ_{j,j′+1} + δ_{j,j′−1}), (30.15)
so we need to lower bound the gap between the smallest and second smallest eigenvalues of the matrix
⎛ s−1 −s 0 ⋯ 0 ⎞
⎜ −s 0 −s ⋱ ⋮ ⎟
⎜ 0 −s 0 ⋱ 0 ⎟
⎜ ⋮ ⋱ ⋱ ⋱ −s ⎟
⎝ 0 ⋯ 0 −s 0 ⎠ (30.16)
We will show
Lemma 30.1. The gap between the smallest and second smallest eigenvalues of the matrix (30.16) for s ∈ [0, 1] is Ω(1/k²).
Proof. The reduced Hamiltonian (30.16) essentially describes a free particle on a finite, discrete line, with
a nonzero potential at one end. Thus the eigenstates are simply plane waves with a quantization condition
determining the allowed values of the momentum. We will show the lower bound on the gap by analyzing
this quantization condition.
We claim that the (unnormalized) eigenstates of (30.16), denoted |Ep⟩, are given by
⟨ψj|Ep⟩ = sin((k + 1 − j)p) (30.17)
for j = 0, 1, …, k, and where p is yet to be determined. It is straightforward to verify that these states satisfy
⟨ψj|H(s)|Ep⟩ = Ep ⟨ψj|Ep⟩ (30.18)
for j = 1, 2, …, k, with the energy given by
Ep = −2s cos p (30.19)
(where p may be either real or imaginary). The allowed values of p are determined by the quantization condition obtained by demanding that (30.18) also holds at j = 0, i.e., that we have
(s − 1) sin((k + 1)p) − s sin(kp) = −2s cos p sin((k + 1)p), (30.20)
which simplifies to
(1 − s) sin((k + 1)p) = s sin((k + 2)p). (30.21)
In terms of the Chebyshev polynomials of the second kind, defined by U_n(cos p) := sin((n + 1)p)/sin p, this condition reads
U_{k+1}(cos p)/U_k(cos p) = (1 − s)/s. (30.22)
[Figure: the left hand side of (30.22), U_{k+1}(cos p)/U_k(cos p), plotted as a function of cos p for k = 8.]
Since the roots of Uk(x) are given by cos(jπ/(k + 1)) for j = 1, 2, …, k, the left hand side of (30.22) has simple poles at those values (and zeros at cos(jπ/(k + 2)) for j = 1, 2, …, k + 1). One can show that the left hand side of (30.22) is strictly increasing. So there is one solution of (30.22) to the left of the leftmost pole, one between each pair of poles, and one to the right of the rightmost pole, giving a total of k + 1 solutions, and thus accounting for all the eigenvalues of (30.16).
It remains to show that the gap between the two rightmost solutions of (30.22) is not too small. It is
easy to see that the gap is Ω(1/k³), because the ground state has cos p ≥ cos(π/(k + 2)) (since it must occur to the right of the rightmost root), and the first excited state has cos p ≤ cos(π/(k + 1)) (since it must occur to the left of the rightmost pole). This shows the gap is at least 2s(cos(π/(k + 2)) − cos(π/(k + 1))) = Ω(1/k³) for constant s (and it is easy to show that the gap is a constant for s = o(1)).
However, we might like to prove a tighter result. To do this, we can separately consider the cases where
the value of p corresponding to the ground state is real (giving a plane wave) and where it is imaginary (giving a bound state). Since U_{k+1}(1)/U_k(1) = (k + 2)/(k + 1), the value of s separating these two regimes is s* := (k + 1)/(2k + 3).
For s ≤ s*, the ground state has cos p ≥ 1, whereas the first excited state has cos p ≤ cos(π/(k + 1)) (as observed above). Therefore, the gap satisfies
Δ(s) ≥ 2s (1 − cos(π/(k + 1))) = Ω(1/k²) (30.23)
for constant s (and as mentioned above, it is easy to see that Δ(s) = Ω(1) for s = o(1)).
For s > s*, the ground state has cos p ≥ cos(π/(k + 2)) (as mentioned above). For the first excited state, we will show that the solution of (30.22) not only lies to the left of the rightmost pole, but that its distance from that pole is at least a constant fraction more than the distance of that pole from cos p = 1. In particular, for any constant a > 0, we have
U_{k+1}(1 − (1 + a)(1 − cos(π/(k + 1)))) / U_k(1 − (1 + a)(1 − cos(π/(k + 1)))) = sin((k + 2) cos⁻¹((1 + a) cos(π/(k + 1)) − a)) / sin((k + 1) cos⁻¹((1 + a) cos(π/(k + 1)) − a)) (30.24)
= 1 + π√(1 + a) cot(π√(1 + a))/k + O(1/k²) (30.25)
where the second line follows by Taylor expansion. In comparison,
(k + 2)/(k + 1) = 1 + 1/k + O(1/k²). (30.26)
So if we fix (say) a = 1, then for k sufficiently large, (30.25) is larger than (30.26), which implies that the first excited state has cos p ≤ 2 cos(π/(k + 1)) − 1. In turn, this implies that
Δ(s) ≥ 2s ( cos(π/(k + 2)) − 2 cos(π/(k + 1)) + 1 ) = Ω(1/k²), (30.27)
which completes the proof.
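The Ω(1/k²) scaling of Lemma 30.1 can be observed directly from the matrix (30.16):

```python
import numpy as np

def gap(k, s):
    M = np.zeros((k + 1, k + 1))           # the matrix (30.16)
    M[0, 0] = s - 1
    for j in range(1, k + 1):
        M[j, j - 1] = M[j - 1, j] = -s
    w = np.linalg.eigvalsh(M)
    return w[1] - w[0]

for k in [20, 40, 80]:
    print(k, gap(k, 0.5) * k ** 2)         # roughly constant, as the lemma predicts
```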
30.3 Locality
The Hamiltonian (30.13) is local in terms of the computational qubits, but not in terms of the clock. However,
it is possible to make the entire construction local.
The basic idea is again to use a unary representation of the clock, as in (30.7). We saw above that
this makes HF 4-local. However, HB and Hpenalty remain nonlocal with this clock, since they include the
projector |0ih0| acting on the clock register, which involves all k + 1 of the clock qubits. Thus we must
modify the construction slightly.
Let's try adding a term to Hpenalty that penalizes clock states which are not of the correct form. To do
this, it will be useful to change the unary representation from (30.7) to a form that can be checked locally,
this time with k + 2 qubits, labeled 0 through k + 1:
|j⟩ := |0⋯0 1⋯1⟩ (30.28)
with j + 1 zeros followed by k − j + 1 ones,
for j ∈ {0, 1, …, k}. (Note that the first qubit is always in the state |0⟩, and the last qubit is always in the state |1⟩.) Now we can verify that the clock state is of the form (30.28) by ensuring that there is no occurrence of the string 10 in the clock register, that the first bit is not 1, and that the last bit is not 0; then we can check whether the clock is in its initial state by checking whether the second clock qubit is in the state |1⟩. Thus, let us redefine
Hpenalty := Σ_{j=1}^n |1⟩⟨1|^{(j)} ⊗ (|1⟩⟨1|)^{(1)} + I ⊗ (|1⟩⟨1|)^{(0)} + Σ_{j=1}^k I ⊗ (|10⟩⟨10|)^{(j,j+1)} + I ⊗ (|0⟩⟨0|)^{(k+1)} (30.29)
where the parenthesized superscripts again indicate which qubits are acted on. We redefine the beginning
Hamiltonian as
HB := I ⊗ (|0⟩⟨0|)^{(1)} + Hpenalty, (30.30)
and in the Feynman term HF of the computational Hamiltonian HC , we make the replacement
|j⟩⟨j − 1| → (|001⟩⟨011|)^{(j−1,j,j+1)} (30.31)
(and similarly for the adjoint). With these redefinitions, the overall Hamiltonian H(s) = (1 s)HB + sHC is
5-local, assuming as before that the gates in the quantum circuit to be simulated involve at most two qubits
each.
As with the original nonlocal-clock construction, HB and HC have unique ground states |0…0⟩ ⊗ |01…1⟩ and (1/√(k + 1)) Σ_{j=0}^k (Uj ⋯ U₁|0…0⟩) ⊗ |0^{j+1} 1^{k−j+1}⟩, respectively. Again, the computational subspace spanned
by the states |j i from (30.3) (but now with the clock representation (30.28)) is invariant under H(s); and
within this subspace, the Hamiltonian acts according to (30.16), which has a gap of (1/k 2 ). Overall, this
shows that there is a 5-local Hamiltonian H(s) implementing an arbitrary quantum circuit by adiabatic
evolution.
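Within the computational subspace, the Hamiltonian is a hopping matrix on the k + 1 clock values, and the Ω(1/k²) gap is the familiar scaling for a path of that length. As a rough numerical illustration (using the Laplacian of a path graph as a stand-in with the same hopping structure — this is not the exact matrix of (30.16), which also depends on s):

```python
# Numerical illustration of the Omega(1/k^2) gap scaling. The matrix below
# is the Laplacian of a path on k+1 vertices (one vertex per clock value),
# a stand-in for the hopping matrix of (30.16). Its spectral gap is
# 2(1 - cos(pi/(k+1))) ~ (pi/(k+1))^2 = Theta(1/k^2).
import numpy as np

def path_laplacian_gap(k):
    n = k + 1                      # one vertex per clock value j = 0, ..., k
    L = 2.0 * np.eye(n)
    L[0, 0] = L[-1, -1] = 1.0      # free (reflecting) endpoints
    for i in range(n - 1):
        L[i, i + 1] = L[i + 1, i] = -1.0
    evals = np.linalg.eigvalsh(L)
    return evals[1] - evals[0]     # ground state is uniform, eigenvalue 0

for k in [9, 19, 39, 79]:
    gap = path_laplacian_gap(k)
    # gap * (k+1)^2 approaches pi^2 ~ 9.87 as k grows
    print(f"k = {k:3d}: gap * (k+1)^2 = {gap * (k + 1) ** 2:.4f}")
```

The rescaled gap converging to a constant is the 1/k² scaling; the adiabatic run time, which depends inversely on (a power of) the minimum gap, is therefore polynomial in the circuit size k.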
By suitable engineering, it's possible to produce variants of this construction with even better locality
properties. One can even make the Hamiltonian spatially local, with nearest-neighbor interactions between
qubits on a two-dimensional square lattice [76]. (In fact, one can even use a one-dimensional array of quantum
systems, although not necessarily with qubits, but with higher-dimensional particles [5].)
Bibliography
[1] Scott Aaronson, Quantum lower bound for the collision problem, Proc. 34th ACM Symposium on Theory of Computing, pp. 635–642, 2002, quant-ph/0111102. [p. 101]
[2] Scott Aaronson and Andris Ambainis, Quantum search of spatial regions, Theory of Computing 1 (2005), 47–79, quant-ph/0303041, preliminary version in FOCS 2003. [p. 85]
[3] Scott Aaronson and Yaoyun Shi, Quantum lower bounds for the collision and the element distinctness problems, Journal of the ACM 51 (2004), no. 4, 595–605, quant-ph/0111102 and quant-ph/0112086, preliminary versions in STOC 2002 and FOCS 2002. [p. 87]
[4] Dorit Aharonov, Itai Arad, Elad Eban, and Zeph Landau, Polynomial quantum algorithms for additive approximations of the Potts model and other points of the Tutte plane, quant-ph/0702008. [p. 66]
[5] Dorit Aharonov, Daniel Gottesman, Sandy Irani, and Julia Kempe, The power of quantum systems on a line, Commun. Math. Phys. 287 (2009), no. 1, 41–65, arXiv:0705.4077. [p. 160]
[6] Dorit Aharonov, Vaughan Jones, and Zeph Landau, A polynomial quantum algorithm for approximating the Jones polynomial, Proc. 38th ACM Symposium on Theory of Computing, pp. 427–436, 2006. [p. 63]
[7] Dorit Aharonov and Amnon Ta-Shma, Adiabatic quantum state generation and statistical zero knowledge, Proceedings of the 35th ACM Symposium on Theory of Computing, pp. 20–29, 2003, quant-ph/0301023. [p. 125]
[8] Dorit Aharonov, Wim van Dam, Julia Kempe, Zeph Landau, Seth Lloyd, and Oded Regev, Adiabatic quantum computation is equivalent to standard quantum computation, SIAM Journal on Computing 37 (2007), no. 1, 166–194, quant-ph/0405098, preliminary version in FOCS 2004. [p. 155]
[9] Gorjan Alagic, Cristopher Moore, and Alexander Russell, Quantum algorithms for Simon's problem over general groups, Proceedings of the 18th ACM-SIAM Symposium on Discrete Algorithms, pp. 1217–1224, 2007, quant-ph/0603251. [p. 58]
[10] Andris Ambainis, Quantum lower bounds by quantum arguments, Journal of Computer and System Sciences 64 (2002), no. 4, 750–767, quant-ph/0002066, preliminary version in STOC 2000. [p. 105]
[11] ———, Quantum walk algorithm for element distinctness, SIAM J. Comput. 37 (2007), no. 1, 210–239, quant-ph/0311001, preliminary version in FOCS 2004. [p. 87]
[12] Andris Ambainis, Julia Kempe, and Alexander Rivosh, Coins make quantum walks faster, Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms, pp. 1099–1108, 2005, quant-ph/0402107. [p. 85]
[13] Itai Arad and Zeph Landau, Quantum computation and the evaluation of tensor networks, arXiv:0805.0040. [p. 66]
[14] Dave Bacon, Andrew M. Childs, and Wim van Dam, From optimal measurement to efficient quantum algorithms for the hidden subgroup problem over semidirect product groups, Proceedings of the 46th IEEE Symposium on Foundations of Computer Science, pp. 469–478, 2005, quant-ph/0504083. [p. 59]
[15] Howard Barnum and Emanuel Knill, Reversing quantum dynamics with near-optimal quantum and classical fidelity, Journal of Mathematical Physics 43 (2002), no. 5, 2097–2106, quant-ph/0004088. [p. 45]
[16] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, and Ronald de Wolf, Quantum lower bounds by polynomials, Journal of the ACM 48 (2001), no. 4, 778–797, quant-ph/9802049, preliminary version in FOCS 1998. [pp. 85, 94]
[17] Aleksandrs Belovs, Span programs for functions with constant-sized 1-certificates, Proceedings of the 44th Symposium on Theory of Computing, pp. 77–84, 2012, arXiv:1105.4024. [pp. 117, 120]
[18] Aleksandrs Belovs and Ansis Rosmanis, Adversary lower bounds for the collision and the set equality problems, arXiv:1310.5185. [p. 100]
[19] Dominic W. Berry, Graeme Ahokas, Richard Cleve, and Barry C. Sanders, Efficient quantum algorithms for simulating sparse Hamiltonians, Communications in Mathematical Physics 270 (2007), no. 2, 359–371, quant-ph/0508139. [pp. 123, 125, 129]
[20] Dominic W. Berry and Andrew M. Childs, Black-box Hamiltonian simulation and unitary implementation, Quantum Information and Computation 12 (2012), no. 1-2, 29–62, arXiv:0910.4157. [p. 130]
[21] Dominic W. Berry, Andrew M. Childs, Richard Cleve, Robin Kothari, and Rolando D. Somma, Exponential improvement in precision for simulating sparse Hamiltonians, Proceedings of the 46th ACM Symposium on Theory of Computing, pp. 283–292, 2014, arXiv:1312.1414. [pp. 125, 130, 131]
[22] ———, Simulating Hamiltonian dynamics with a truncated Taylor series, Physical Review Letters 114 (2015), no. 9, 090502, arXiv:1412.4687. [p. 130]
[23] Gilles Brassard, Peter Høyer, Michele Mosca, and Alain Tapp, Quantum amplitude amplification and estimation, Quantum Computation and Information (S. J. Lomonaco and H. E. Brandt, eds.), AMS Contemporary Mathematics Series, vol. 305, AMS, Providence, RI, 2002, quant-ph/0005055. [p. 85]
[24] Gilles Brassard, Peter Høyer, and Alain Tapp, Quantum algorithm for the collision problem, quant-ph/9705002. [p. 100]
[25] Harry Buhrman, Richard Cleve, and Avi Wigderson, Quantum vs. classical communication and computation, Proceedings of the 30th ACM Symposium on Theory of Computing, pp. 63–68, 1998, quant-ph/9802040. [p. 114]
[26] Harry Buhrman, Christoph Dürr, Mark Heiligman, Peter Høyer, Frédéric Magniez, Miklós Santha, and Ronald de Wolf, Quantum algorithms for element distinctness, SIAM Journal on Computing 34 (2005), no. 6, 1324–1330, quant-ph/0007016, preliminary version in CCC 2001. [p. 87]
[27] Kevin K. H. Cheung and Michele Mosca, Decomposing finite abelian groups, Quantum Information and Computation 1 (2001), no. 3, 26–32, cs.DS/0101004. [p. 27]
[28] Andrew M. Childs, Quantum information processing in continuous time, Ph.D. thesis, Massachusetts Institute of Technology, 2004. [p. 125]
[29] ———, On the relationship between continuous- and discrete-time quantum walk, Communications in Mathematical Physics 294 (2010), no. 2, 581–603, arXiv:0810.0312. [p. 130]
[30] Andrew M. Childs, Richard Cleve, Enrico Deotto, Edward Farhi, Sam Gutmann, and Daniel A. Spielman, Exponential algorithmic speedup by quantum walk, Proceedings of the 35th ACM Symposium on Theory of Computing, pp. 59–68, 2003, quant-ph/0209131. [p. 71]
[31] Andrew M. Childs and Jeffrey Goldstone, Spatial search by quantum walk, Physical Review A 70 (2004), no. 2, 022314, quant-ph/0306054. [p. 85]
[32] Andrew M. Childs, David Jao, and Vladimir Soukharev, Constructing elliptic curve isogenies in quantum subexponential time, Journal of Mathematical Cryptology 8 (2014), no. 1, 1–29, arXiv:1012.4019. [p. 58]
[33] Andrew M. Childs and Wim van Dam, Quantum algorithms for algebraic problems, Reviews of Modern Physics 82 (2010), no. 1, 1–52, arXiv:0812.0380. [p. vii]
[34] Richard Cleve, Artur Ekert, Chiara Macchiavello, and Michele Mosca, Quantum algorithms revisited, Proceedings of the Royal Society of London A 454 (1998), no. 1969, 339–354, quant-ph/9708016. [pp. 19, 85]
[35] J. H. P. Colpa, Diagonalisation of the quadratic fermion Hamiltonian with a linear part, J. Phys. A 12 (1979), no. 4, 469–488. [p. 150]
[36] Wim van Dam, Michele Mosca, and Umesh Vazirani, How powerful is adiabatic quantum computation?, Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, pp. 279–287, 2001, quant-ph/0206003. [p. 143]
[37] Christopher M. Dawson and Michael A. Nielsen, The Solovay-Kitaev algorithm, Quantum Information and Computation 6 (2006), no. 1, 81–95, quant-ph/0505030. [p. 7]
[38] Kirsten Eisenträger, Sean Hallgren, Alexei Kitaev, and Fang Song, A quantum algorithm for computing the unit group of an arbitrary degree number field, Proceedings of the 46th ACM Symposium on Theory of Computing, pp. 293–302, 2014. [p. 41]
[39] Mark Ettinger and Peter Høyer, On quantum algorithms for noncommutative hidden subgroups, Advances in Applied Mathematics 25 (2000), 239–251, quant-ph/9807029. [pp. 54, 58]
[40] Mark Ettinger, Peter Høyer, and Emanuel Knill, The quantum query complexity of the hidden subgroup problem is polynomial, Information Processing Letters 91 (2004), no. 1, 43–48, quant-ph/0401083. [p. 45]
[41] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann, A quantum algorithm for the Hamiltonian NAND tree, Theory of Computing 4 (2008), no. 1, 169–190, quant-ph/0702144. [pp. 114, 115]
[42] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser, Quantum computation by adiabatic evolution, quant-ph/0001106. [pp. 141, 147]
[43] Edward Farhi and Sam Gutmann, Quantum computation and decision trees, Physical Review A 58 (1998), no. 2, 915–928, quant-ph/9706062. [p. 69]
[44] Richard P. Feynman, Simulating physics with computers, International Journal of Theoretical Physics 21 (1982), no. 6-7, 467–488. [p. 123]
[45] ———, Quantum mechanical computers, Optics News 11 (1985), 11–20. [p. 155]
[46] M. H. Freedman, A. Yu. Kitaev, M. J. Larsen, and Z. Wang, Topological quantum computation, Bull. Amer. Math. Soc. 40 (2003), 31–38, quant-ph/0101025. [p. 63]
[47] Brett Giles and Peter Selinger, Remarks on Matsumoto and Amano's normal form for single-qubit Clifford+T operators, arXiv:1312.6584. [pp. 11, 14]
[48] Daniel Gottesman, An introduction to quantum error correction and fault-tolerant quantum computation, Quantum Information Science and Its Contributions to Mathematics (Samuel J. Lomonaco, Jr., ed.), Proceedings of Symposia in Applied Mathematics, vol. 68, AMS, 2010, arXiv:0904.2557. [p. 4]
[49] M. Grigni, L. J. Schulman, M. Vazirani, and U. Vazirani, Quantum mechanical algorithms for the nonabelian hidden subgroup problem, Combinatorica 24 (2004), no. 1, 137–154, preliminary version in STOC 2001. [p. 53]
[50] Lov K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Physical Review Letters 79 (1997), no. 2, 325–328, quant-ph/9706033, preliminary version in STOC 1996. [p. 83]
[51] S. Hallgren, C. Moore, M. Rötteler, A. Russell, and P. Sen, Limitations of quantum coset states for graph isomorphism, Proc. 38th ACM Symposium on Theory of Computing, pp. 604–617, 2006, quant-ph/0511148, quant-ph/0511149. [p. 54]
[52] S. Hallgren, A. Russell, and A. Ta-Shma, The hidden subgroup problem and quantum computation using group representations, SIAM Journal on Computing 32 (2003), no. 4, 916–934, preliminary version in STOC 2000. [p. 52]
[53] Sean Hallgren, Fast quantum algorithms for computing the unit group and class group of a number field, Proc. 37th ACM Symposium on Theory of Computing, pp. 468–474, 2005. [p. 41]
[54] ———, Polynomial-time quantum algorithms for Pell's equation and the principal ideal problem, Journal of the ACM 54 (2007), no. 1, article 4, preliminary version in STOC 2002. [pp. 33, 41]
[55] P. Høyer, T. Lee, and R. Špalek, Negative weights make adversaries stronger, Proc. 39th ACM Symposium on Theory of Computing, pp. 526–535, 2007, quant-ph/0611054. [pp. 105, 106]
[56] P. Høyer, M. Mosca, and R. de Wolf, Quantum search on bounded-error inputs, Proc. 30th International Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer Science, vol. 2719, pp. 291–299, 2003, quant-ph/0304052. [p. 114]
[57] P. Høyer and R. Špalek, Lower bounds on quantum query complexity, Bulletin of the European Association for Theoretical Computer Science 87 (2005), 78–103, quant-ph/0509153. [p. 93]
[58] Sabine Jansen, Mary-Beth Ruskai, and Ruedi Seiler, Bounds for the adiabatic approximation with applications to quantum computation, Journal of Mathematical Physics 48 (2007), 102111, quant-ph/0603175. [p. 136]
[59] Richard Jozsa, Quantum computation in algebraic number theory: Hallgren's efficient quantum algorithm for solving Pell's equation, Annals of Physics 306 (2003), no. 2, 241–279, quant-ph/0302134. [p. 33]
[60] Phillip Kaye, Raymond Laflamme, and Michele Mosca, An introduction to quantum computing, Oxford University Press, 2007. [p. 1]
[61] Kiran S. Kedlaya, Quantum computation of zeta functions of curves, Computational Complexity 15 (2006), no. 1, 1–19, math.NT/0411623. [p. 30]
[62] Alexei Yu. Kitaev, Alexander H. Shen, and Mikhail N. Vyalyi, Classical and quantum computation, AMS, 2002. [pp. 1, 7, 19, 155]
[63] Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca, Fast and efficient exact synthesis of single qubit unitaries generated by Clifford and T gates, Quantum Information and Computation 13 (2013), no. 7-8, 607–630, arXiv:1206.5236. [p. 11]
[64] Greg Kuperberg, A subexponential-time quantum algorithm for the dihedral hidden subgroup problem, SIAM Journal on Computing 35 (2005), no. 1, 170–188, quant-ph/0302112. [p. 55]
[65] Samuel Kutin, Quantum lower bound for the collision problem with small range, Theory of Computing 1 (2005), no. 2, 29–36, quant-ph/0304162. [p. 101]
[66] Troy Lee, Frédéric Magniez, and Miklós Santha, Learning graph based quantum query algorithms for finding constant-size subgraphs, Chicago Journal of Theoretical Computer Science (2011), no. 10, arXiv:1109.5135. [p. 120]
[67] ———, Improved quantum query algorithms for triangle finding and associativity testing, Proceedings of the 24th ACM-SIAM Symposium on Discrete Algorithms, pp. 1486–1502, 2013, arXiv:1210.1014. [p. 120]
[68] Troy Lee, Rajat Mittal, Ben W. Reichardt, Robert Špalek, and Mario Szegedy, Quantum query complexity of state conversion, Proceedings of the 52nd IEEE Symposium on Foundations of Computer Science, pp. 344–353, 2011, arXiv:1011.3020. [pp. 105, 111, 114, 115, 116]
[69] E. Lieb, T. Schultz, and D. Mattis, Two soluble models of an antiferromagnetic chain, Ann. Phys. 16 (1961), no. 3, 407–466. [p. 150]
[70] Frédéric Magniez, Ashwin Nayak, Jérémie Roland, and Miklós Santha, Search via quantum walk, SIAM Journal on Computing 40 (2011), no. 1, 142–164, quant-ph/0608026. [pp. 87, 89]
[71] Ken Matsumoto and Kazuyuki Amano, Representation of quantum circuits with Clifford and π/8 gates, arXiv:0806.3834. [p. 11]
[72] C. Moore, A. Russell, and L. J. Schulman, The symmetric group defies strong Fourier sampling, Proc. 46th IEEE Symposium on Foundations of Computer Science, pp. 479–490, 2005, quant-ph/0501056. [p. 54]
[73] Cristopher Moore, Daniel N. Rockmore, Alexander Russell, and Leonard J. Schulman, The power of strong Fourier sampling: Quantum algorithms for affine groups and hidden shifts, SIAM J. Comput. 37 (2007), no. 3, 938–958, quant-ph/0503095, preliminary version in SODA 2004. [p. 53]
[74] Cristopher Moore, Alexander Russell, and Piotr Śniady, On the impossibility of a quantum sieve algorithm for graph isomorphism, Proc. 39th ACM Symposium on Theory of Computing, pp. 536–545, 2007, quant-ph/0612089. [p. 58]
[75] M. A. Nielsen and I. L. Chuang, Quantum computation and quantum information, Cambridge University Press, Cambridge, 2000. [pp. 1, 7]
[76] R. Oliveira and B. M. Terhal, The complexity of quantum spin systems on a two-dimensional square lattice, Quantum Information and Computation 8 (2008), no. 10, 900–924, quant-ph/0504050. [p. 160]
[77] Oded Regev, A subexponential time algorithm for the dihedral hidden subgroup problem with polynomial space, quant-ph/0406151. [p. 58]
[78] B. W. Reichardt, Span programs and quantum query complexity: The general adversary bound is nearly tight for every Boolean function, Proc. 50th IEEE Symposium on Foundations of Computer Science, pp. 544–551, 2009, arXiv:0904.2759. [pp. 111, 112, 113]
[79] ———, Reflections for quantum query algorithms, Proceedings of the 22nd ACM-SIAM Symposium on Discrete Algorithms, pp. 560–569, 2011, arXiv:1005.1601. [pp. 105, 111, 114, 115]
[80] B. W. Reichardt and R. Špalek, Span-program-based quantum algorithm for evaluating formulas, Proc. 40th ACM Symposium on Theory of Computing, pp. 103–112, 2008, arXiv:0710.2630. [pp. 111, 112]
[81] J. Roland and N. J. Cerf, Quantum search by local adiabatic evolution, Physical Review A 65 (2002), no. 4, 042308, quant-ph/0107015. [p. 143]
[82] M. Saks and A. Wigderson, Probabilistic Boolean decision trees and the complexity of evaluating game trees, Proc. 27th IEEE Symposium on Foundations of Computer Science, pp. 29–38, 1986. [pp. 114, 115]
[83] Miklós Santha, On the Monte Carlo Boolean decision tree complexity of read-once formulae, Random Structures and Algorithms 6 (1995), no. 1, 75–87. [p. 114]
[84] Miklós Santha, Quantum walk based search algorithms, Theory and Applications of Models of Computation, Lecture Notes in Computer Science, vol. 4978, Springer, 2008, arXiv:0808.0059, pp. 31–46. [p. 87]
[85] Arthur Schmidt and Ulrich Vollmer, Polynomial time quantum algorithm for the computation of the unit group of a number field, Proc. 37th ACM Symposium on Theory of Computing, pp. 475–480, 2005. [p. 41]
[86] Pranab Sen, Random measurement bases, quantum state distinction and applications to the hidden subgroup problem, Proc. 21st IEEE Conference on Computational Complexity (2006), 274–287, quant-ph/0512085. [p. 53]
[87] Jean-Pierre Serre, Linear representations of finite groups, Graduate Texts in Mathematics, vol. 42, Springer, 1977. [p. 47]
[88] Simone Severini, On the digraph of a unitary matrix, SIAM Journal on Matrix Analysis and Applications 25 (2003), no. 1, 295–300, math.CO/0205187. [p. 77]
[89] Neil Shenvi, Julia Kempe, and K. Birgitta Whaley, A quantum random walk search algorithm, Physical Review A 67 (2003), no. 5, 052307, quant-ph/0210064. [p. 85]
[90] Yaoyun Shi, Quantum lower bounds for the collision and the element distinctness problems, Proceedings of the 43rd IEEE Symposium on Foundations of Computer Science, pp. 513–519, 2002, quant-ph/0112086. [p. 101]
[91] Peter W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM Journal on Computing 26 (1997), no. 5, 1484–1509, quant-ph/9508027, preliminary version in FOCS 1994. [p. 23]
[92] D. R. Simon, On the power of quantum computation, SIAM Journal on Computing 26 (1997), no. 5, 1474–1483, preliminary version in FOCS 1994. [p. 17]
[93] M. Snir, Lower bounds on probabilistic linear decision trees, Theoretical Computer Science 38 (1985), 69–82. [pp. 114, 115]
[94] Mario Szegedy, Quantum speed-up of Markov chain based algorithms, Proceedings of the 45th IEEE Symposium on Foundations of Computer Science, pp. 32–41, 2004, quant-ph/0401053. [pp. 78, 80]
[95] Stefan Teufel, Adiabatic perturbation theory in quantum dynamics, Lecture Notes in Mathematics, vol. 1821, Springer-Verlag, 2003. [p. 136]
[96] John Watrous, Quantum simulations of classical random walks and undirected graph connectivity, Journal of Computer and System Sciences 62 (2001), no. 2, 376–391, cs.CC/9812012. [p. 77]
[97] ———, Theory of quantum information, lecture notes, 2011, http://cs.uwaterloo.ca/watrous/LectureNotes.html. [p. 111]
[98] Pawel Wocjan and Jon Yard, The Jones polynomial: Quantum algorithms and applications in quantum complexity theory, quant-ph/0603069. [p. 66]
[99] Yechao Zhu, Quantum query complexity of subgraph containment with constant-sized certificates, International Journal of Quantum Information 10 (2012), no. 3, 1250019, arXiv:1109.4165. [p. 120]