Lecture Notes On Quantum Algorithms
Quantum Algorithms
Andrew M. Childs
Department of Computer Science,
Institute for Advanced Computer Studies, and
Joint Center for Quantum Information and Computer Science
University of Maryland
30 May 2017
Contents

Preface
1 Preliminaries
  1.1 Quantum data
  1.2 Quantum circuits
  1.3 Universal gate sets
  1.4 Reversible computation
  1.5 Uniformity
  1.6 Quantum complexity
  1.7 Fault tolerance
I Quantum circuits
2 Efficient universality of quantum circuits
  2.1 Subadditivity of errors
  2.2 The group commutator and a net around the identity
  2.3 Proof of the Solovay-Kitaev Theorem
  2.4 Proof of Lemma 2.3
3 Quantum circuit synthesis over Clifford+T
  3.1 Converting to Matsumoto-Amano normal form
  3.2 Uniqueness of Matsumoto-Amano normal form
  3.3 Algebraic characterization of Clifford+T unitaries
  3.4 From exact to approximate synthesis
Bibliography
Preface
This is a set of lecture notes on quantum algorithms. It is primarily intended for graduate students who have already taken an introductory course on quantum information. Such a course typically covers only the early breakthroughs in quantum algorithms, namely Shor's factoring algorithm (1994) and Grover's searching algorithm (1996). Here we show that there is much more to quantum computing by exploring some of the many quantum algorithms that have been developed over the past twenty years.
These notes cover several major topics in quantum algorithms, divided into six parts:

In Part I, we discuss quantum circuits: in particular, the problem of expressing a quantum algorithm using a given universal set of quantum gates.

In Part II, we discuss quantum algorithms for algebraic problems. Many of these algorithms generalize the main idea of Shor's algorithm. These algorithms use the quantum Fourier transform and typically achieve an exponential (or at least superpolynomial) speedup over classical computers. In particular, we explore a group-theoretic problem called the hidden subgroup problem. A solution of this problem for abelian groups leads to several applications; we also discuss what is known about the nonabelian case.

In Part III, we explore the concept of quantum walk, a quantum generalization of random walk. This concept leads to a powerful framework for solving search problems, generalizing Grover's search algorithm.

In Part IV, we discuss the model of quantum query complexity. We cover the two main methods for proving lower bounds on quantum query complexity (the polynomial method and the adversary method), demonstrating limitations on the power of quantum algorithms. We also discuss how the concept of span programs turns the quantum adversary method into an upper bound, giving optimal quantum algorithms for evaluating Boolean formulas.

In Part V, we describe quantum algorithms for simulating the dynamics of quantum systems. We also discuss an application of quantum simulation to an algorithm for linear systems.

In Part VI, we discuss adiabatic quantum computing, a general approach to solving optimization problems (in a similar spirit to simulated annealing). Related ideas may also provide insights into how one might build a quantum computer.
These notes were originally prepared for a course that was offered three times at the University of
Waterloo: in the winter terms of 2008 (as CO 781) and of 2011 and 2013 (as CO 781/CS 867/QIC 823). I
thank the students in the course for their feedback on the lecture notes. Each offering of the course covered
a somewhat different set of topics. This document collects the material from all versions of the course and
includes a few subsequent improvements.
The material on quantum algorithms for algebraic problems has been collected into a review article that
was written with Wim van Dam [33]. I thank Wim for his collaboration on that project, which strongly
influenced the presentation in Part II.
Please keep in mind that these are rough lecture notes; they are not meant to be a comprehensive treatment of the subject, and there are surely at least a few mistakes. Corrections (by email to amchilds@umd.edu) are welcome.
I hope you find these notes to be a useful resource for learning about quantum algorithms.
Chapter 1
Preliminaries
This chapter briefly reviews some background material on quantum computation. We cover these topics at
a very high level, just to give a sense of what you should know to understand the rest of the lecture notes.
If any of these topics are unfamiliar, you can learn more about them from a text on quantum computation
such as Nielsen and Chuang [75]; Kitaev, Shen, and Vyalyi [62]; or Kaye, Laflamme, and Mosca [60].
where the a_x ∈ ℂ satisfy ∑_x |a_x|^2 = 1. We refer to the basis of states |x⟩ as the computational basis.
It will often be useful to think of quantum states as storing data in a more abstract form. For example, given a group G, we could write |g⟩ for a basis state corresponding to the group element g ∈ G, and

  |φ⟩ = ∑_{g∈G} b_g |g⟩    (1.2)

for an arbitrary superposition over the group. We assume that there is some canonical way of efficiently representing group elements using bit strings; it is usually unnecessary to make this representation explicit.
If a quantum computer stores the state |ψ⟩ and the state |φ⟩, its overall state is given by the tensor product of those two states. This may be denoted |ψ⟩ ⊗ |φ⟩ = |ψ⟩|φ⟩ = |ψ, φ⟩.
  ‖U - U_t ⋯ U_2 U_1‖ ≤ ε.    (1.3)

Here ‖·‖ denotes some appropriate matrix norm, which should have the property that if ‖U - V‖ is small, then U should be hard to distinguish from V no matter what quantum state they act on. A natural choice (which will be suitable for our purposes) is the spectral norm

  ‖A‖ := max_{|ψ⟩} ‖A|ψ⟩‖ / ‖|ψ⟩‖    (1.4)

(where ‖|ψ⟩‖ = √⟨ψ|ψ⟩ denotes the vector 2-norm of |ψ⟩), i.e., the largest singular value of A. Then we call a set of elementary gates universal if any unitary operator on a fixed number of qubits can be approximated to any desired precision using elementary gates.
It turns out that there are finite sets of gates that are universal: for example, the set {H, T, C} with

  H := (1/√2) [1 1; 1 -1],   T := [e^{-iπ/8} 0; 0 e^{iπ/8}],   C := [1 0 0 0; 0 1 0 0; 0 0 0 1; 0 0 1 0]    (1.5)

(matrices written row by row, with rows separated by semicolons).
There are situations in which we say a set of gates is effectively universal, even though it cannot actually approximate any unitary operator on n qubits. For example, the set {H, T^2, Tof}, where

  Tof := [1 0 0 0 0 0 0 0; 0 1 0 0 0 0 0 0; 0 0 1 0 0 0 0 0; 0 0 0 1 0 0 0 0; 0 0 0 0 1 0 0 0; 0 0 0 0 0 1 0 0; 0 0 0 0 0 0 0 1; 0 0 0 0 0 0 1 0]    (1.6)

is universal, but only if we allow the use of ancilla qubits (qubits that start and end in the |0⟩ state).
Similarly, the basis {H, Tof} is universal in the sense that, with ancillas, it can approximate any orthogonal
matrix. It clearly cannot approximate complex unitary matrices, since the entries of H and Tof are real;
but the effect of arbitrary unitary transformations can be simulated using orthogonal ones by simulating the
real and imaginary parts separately.
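The real/imaginary simulation described above can be checked numerically. The following Python/NumPy sketch encodes a unitary U = A + iB (A, B real) as the orthogonal matrix [A -B; B A] and verifies that this orthogonal matrix reproduces the action of U on the real and imaginary parts of a state:

```python
import numpy as np

# Encode a unitary U = A + iB as the real matrix O = [[A, -B], [B, A]],
# which is orthogonal, and a state |psi> = u + iv as the real vector (u; v).
# Then O (u; v) carries exactly the real and imaginary parts of U |psi>.
rng = np.random.default_rng(0)
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
U, _ = np.linalg.qr(M)                  # a random 2x2 unitary
A, B = U.real, U.imag
O = np.block([[A, -B], [B, A]])

psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

out = U @ psi
enc = O @ np.concatenate([psi.real, psi.imag])

assert np.allclose(O @ O.T, np.eye(4))  # O is orthogonal
assert np.allclose(enc[:2], out.real) and np.allclose(enc[2:], out.imag)
```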
remove the accumulated information. After performing the classical computation with reversible gates, we simply XOR the answer into an ancilla register, and then perform the computation in reverse. Thus we can implement the map (x, y) ↦ (x, y ⊕ f(x)) even when f is a complicated circuit consisting of many gates.
Using this trick, any computation that can be performed efficiently on a classical computer can be performed efficiently on a quantum computer: if we can efficiently implement the map x ↦ f(x) on a classical computer, we can efficiently perform the transformation |x, y⟩ ↦ |x, y ⊕ f(x)⟩ on a quantum computer. This transformation can be applied to any superposition of computational basis states, so for example, we can perform the transformation

  (1/√2^n) ∑_{x∈{0,1}^n} |x, 0⟩ ↦ (1/√2^n) ∑_{x∈{0,1}^n} |x, f(x)⟩.    (1.7)

Note that this does not necessarily mean we can efficiently implement the map |x⟩ ↦ |f(x)⟩, even when f is a bijection (so that this is indeed a unitary transformation). However, if we can efficiently invert f, then we can indeed do this efficiently.
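The XOR trick can already be illustrated classically. The following Python sketch (with a toy function f of our choosing) checks that (x, y) ↦ (x, y ⊕ f(x)) is a bijection and is its own inverse, even though f itself is not invertible:

```python
# Even when f is not invertible, (x, y) -> (x, y XOR f(x)) is a permutation
# and is its own inverse, so running the circuit twice uncomputes f.
def oracle(x, y, f):
    return x, y ^ f(x)

f = lambda x: x % 3            # a toy non-invertible function on 3-bit inputs
pairs = [(x, y) for x in range(8) for y in range(8)]
images = [oracle(x, y, f) for x, y in pairs]

assert sorted(images) == sorted(pairs)                               # a bijection
assert all(oracle(*oracle(x, y, f), f) == (x, y) for x, y in pairs)  # self-inverse
```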
1.5 Uniformity
When we give an algorithm for a computational problem, we consider inputs of varying sizes. Typically,
the circuits for instances of different sizes will be related to one another in a simple way. But this need not
be the case; and indeed, given the ability to choose an arbitrary circuit for each input size, we could have
circuits computing uncomputable languages. Thus we require that our circuits be uniformly generated: say, that there exists a fixed (classical) Turing machine that, given a tape containing the symbol 1 n times, outputs a description of the nth circuit in time poly(n).
(depending on the noise model, but typically in the range of 10^{-3} to 10^{-4}), an arbitrarily long computation can be performed with an arbitrarily small amount of error (see for example [48]).
In this course, we will always assume implicitly that fault-tolerant protocols have been applied, such that
we can effectively assume a perfectly functioning quantum computer.
Part I
Quantum circuits
Chapter 2

Efficient universality of quantum circuits
Are some universal gate sets better than others? Classically, this is not an issue: the set of possible operations
is discrete, so any gate acting on a constant number of bits can be simulated exactly using a constant number
of gates from any given universal gate set. But we might imagine that some quantum gates are much more
powerful than others. For example, given two rotations about strange axes by strange angles, it may not be
obvious how to implement a Hadamard gate, and we might worry that implementing such a gate to high
precision could take a very large number of elementary operations, scaling badly with the required precision.
Fortunately, this is not the case: a unitary operator that can be realized efficiently with one set of 1- and
2-qubit gates can also be realized efficiently with another such set. In particular, we have the following (see
[75, Appendix 3], [37], and [62, Chapter 8]).
Theorem 2.1 (Solovay-Kitaev). Fix two universal gate sets that are closed under inverses. Then any t-gate circuit using one gate set can be implemented to precision ε using a circuit of t poly(log(t/ε)) gates from the other set (indeed, there is a classical algorithm for finding this circuit in time t poly(log(t/ε))).
Thus, not only are the two gate sets equivalent under polynomial-time reduction, but the running time
of an algorithm using one gate set is the same as that using the other gate set up to logarithmic factors.
This means that even polynomial quantum speedups are robust with respect to the choice of gate set.
Thus, in order to simulate a t-gate quantum circuit with total error at most ε, it suffices to simulate each individual gate with error at most ε/t.
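This subadditivity is easy to check numerically. The following Python/NumPy sketch (with illustrative parameters of our choosing) replaces each of t random unitaries by a nearby perturbation and verifies that the spectral-norm error of the product is at most the sum of the individual gate errors:

```python
import numpy as np

# Subadditivity of errors in the spectral norm: if each gate U_i is replaced
# by an approximation V_i with ||U_i - V_i|| <= eps/t, then the error of the
# whole product is at most eps.
rng = np.random.default_rng(1)

def random_unitary(d):
    Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return Q

def perturb(U, delta):
    # multiply U by exp(i delta K) for a random Hermitian K with ||K|| = 1
    M = rng.normal(size=U.shape) + 1j * rng.normal(size=U.shape)
    K = (M + M.conj().T) / 2
    K /= np.linalg.norm(K, 2)
    w, V = np.linalg.eigh(K)
    return U @ (V * np.exp(1j * delta * w)) @ V.conj().T

t, eps = 20, 1e-2
Us = [random_unitary(2) for _ in range(t)]
Vs = [perturb(U, eps / (2 * t)) for U in Us]   # each within eps/t of its U

gate_errs = [np.linalg.norm(U - V, 2) for U, V in zip(Us, Vs)]
prod_err = np.linalg.norm(np.linalg.multi_dot(Us) - np.linalg.multi_dot(Vs), 2)

assert max(gate_errs) <= eps / t
assert prod_err <= sum(gate_errs) + 1e-12
```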
  ⟦U, V⟧ := U V U^{-1} V^{-1}.    (2.7)
To approximate general unitaries, we will effectively translate them close to the identity.
Note that it suffices to consider unitary gates with determinant 1 (i.e., elements of SU(2)) since a global
phase is irrelevant. Let
  S_ε := {U ∈ SU(2) : ‖I - U‖ ≤ ε}    (2.8)

denote the ε-ball around the identity. Given sets Γ, S ⊆ SU(2), we say that Γ is an ε-net for S if for any A ∈ S, there is a U ∈ Γ such that ‖A - U‖ ≤ ε. The following result (to be proved later on) indicates how the group commutator helps us to make a fine net around the identity.

Lemma 2.3. If Γ is an ε^2-net for S_ε, then ⟦Γ, Γ⟧ := {⟦U, V⟧ : U, V ∈ Γ} is an O(ε^3)-net for S_{ε^2}.
To make an arbitrarily fine net, we apply this idea recursively. But first it is helpful to derive a consequence of the lemma that is more suitable for recursion. We would like to maintain the quadratic relationship between the size of the ball and the quality of the net. If we aim for a k^2 ε^3-net (for some constant k), we would like it to apply to arbitrary points in S_{k ε^{3/2}}, whereas the lemma only lets us approximate points in S_{ε^2}. To handle an arbitrary A ∈ S_{k ε^{3/2}}, we first let W ∈ Γ be the gate closest to A. For sufficiently small ε we have k ε^{3/2} < ε, so S_{k ε^{3/2}} ⊆ S_ε, and therefore A ∈ S_ε. Since Γ is an ε^2-net for S_ε, we have ‖A - W‖ ≤ ε^2, i.e., ‖A W† - I‖ ≤ ε^2, so A W† ∈ S_{ε^2}. Then we can apply the lemma to find U, V ∈ Γ such that ‖A W† - ⟦U, V⟧‖ = ‖A - ⟦U, V⟧ W‖ ≤ k^2 ε^3. In other words, if Γ is an ε^2-net for S_ε, then ⟦Γ, Γ⟧ Γ := {⟦U, V⟧ W : U, V, W ∈ Γ} is a k^2 ε^3-net for S_{k ε^{3/2}}.
Now suppose that Γ_0 is an ε_0^2-net for S_{ε_0}, and let Γ_i := ⟦Γ_{i-1}, Γ_{i-1}⟧ Γ_{i-1} for all positive integers i. Then Γ_i is an ε_i^2-net for S_{ε_i}, where ε_i = k ε_{i-1}^{3/2}. Solving this recursion gives ε_i = (k^2 ε_0)^{(3/2)^i} / k^2.
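The quadratic behavior of the group commutator near the identity, which drives Lemma 2.3, can be observed numerically. In the following Python/NumPy sketch, U and V are small rotations generated by Pauli matrices (an illustrative choice); halving ε roughly quarters the distance from ⟦U, V⟧ to the identity:

```python
import numpy as np

# For U, V within distance ~eps of the identity, the group commutator
# [[U,V]] = U V U^{-1} V^{-1} is within distance O(eps^2) of the identity.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def expi(A, s):
    """exp(i s A) for Hermitian A, via the eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(1j * s * w)) @ V.conj().T

def commutator_dist(eps):
    U, V = expi(X, eps), expi(Z, eps)
    W = U @ V @ U.conj().T @ V.conj().T
    return np.linalg.norm(np.eye(2) - W, 2)

r1, r2 = commutator_dist(1e-2), commutator_dist(5e-3)
assert 3.5 < r1 / r2 < 4.5   # quadratic scaling: the ratio is close to 4
```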
Proof of Theorem 2.1. It suffices to consider how to approximate an arbitrary U ∈ SU(2) to precision ε by a sequence of gates from a given universal gate set Γ.
First we take products of elements of Γ to form a new universal gate set Γ_0 that is an ε_0^2-net for SU(2), for some sufficiently small constant ε_0. We know this can be done since Γ is universal. Since ε_0 is a constant, the overhead in constructing Γ_0 is constant.
Now we can find V_0 ∈ Γ_0 such that ‖U - V_0‖ ≤ ε_0^2. Since ‖U - V_0‖ = ‖U V_0† - I‖, we have U V_0† ∈ S_{ε_0^2}. If ε_0 is sufficiently small, then ε_0^2 < k ε_0^{3/2} = ε_1, so U V_0† ∈ S_{ε_1}.
Since Γ_0 is an ε_0^2-net for SU(2), in particular it is an ε_0^2-net for S_{ε_0}. Thus by the above argument, Γ_1 is an ε_1^2-net for S_{ε_1}, so we can find V_1 ∈ Γ_1 such that ‖U V_0† - V_1‖ ≤ ε_1^2 < k ε_1^{3/2} = ε_2, i.e., U V_0† V_1† ∈ S_{ε_2}.
In general, suppose we are given V_0, V_1, …, V_{i-1} such that U V_0† V_1† ⋯ V_{i-1}† ∈ S_{ε_i}. Since Γ_i is an ε_i^2-net for S_{ε_i}, we can find V_i ∈ Γ_i such that ‖U V_0† V_1† ⋯ V_{i-1}† - V_i‖ ≤ ε_i^2. In turn, this implies that U V_0† V_1† ⋯ V_i† ∈ S_{ε_{i+1}}.
Repeating this process t times gives a very good approximation of U by V_t ⋯ V_1 V_0: in particular, we have ‖U - V_t ⋯ V_1 V_0‖ ≤ ε_t^2. Suppose we consider a gate from Γ_0 to be elementary. (These gates can be implemented using only a constant number of gates from Γ, so there is only a constant factor overhead if we count gates in Γ_0 as elementary.) The number of elementary gates needed to implement a gate from Γ_i is 5^i,
so the total number of gates in the approximation is ∑_{i=0}^{t} 5^i = (5^{t+1} - 1)/4 = O(5^t). To achieve an overall error at most ε, we need ε_t^2 = ((k^2 ε_0)^{(3/2)^t} / k^2)^2 ≤ ε, i.e.,

  (3/2)^t ≥ log(k^4 ε) / (2 log(k^2 ε_0)).    (2.9)
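The recursion for ε_i is easy to explore numerically. The following Python sketch (with illustrative constants k and ε_0 of our choosing, such that k^2 ε_0 < 1) checks the closed form against the recursion and computes the level t, and hence the O(5^t) gate count, needed for a target precision:

```python
import math

# The Solovay-Kitaev cost recursion: eps_i = k * eps_{i-1}^{3/2}.
k, eps0 = 2.0, 0.01

def eps(i):
    e = eps0
    for _ in range(i):
        e = k * e ** 1.5
    return e

def eps_closed(i):
    return (k**2 * eps0) ** (1.5 ** i) / k**2

for i in range(6):
    assert math.isclose(eps(i), eps_closed(i), rel_tol=1e-9)

# Number of levels t needed so that eps_t^2 <= eps, and the resulting
# O(5^t) gate count; t grows like log log(1/eps), so 5^t is polylog(1/eps).
target = 1e-10
t = 0
while eps(t) ** 2 > target:
    t += 1
print(t, 5 ** t)
```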
Note that it is possible to improve the construction somewhat over the version described above. Furthermore, it can be generalized to SU(N) for arbitrary N. In general, the cost is exponential in N^2, but for any fixed N this is just a constant.
Chapter 3

Quantum circuit synthesis over Clifford+T
As we discussed in Chapter 2, the Solovay-Kitaev Theorem tells us that we can convert between gate sets with overhead that is only poly(log(1/ε)). However, the overhead may not be that small in practice (we upper bounded the power of the log by log 5 / log(3/2) ≈ 3.97), and it is natural to ask if we can do better. A counting argument shows that the best possible exponent is 1. Can we get close to this lower bound, ideally while retaining a fast algorithm?
In general, no such result is known (even if we do not require a fast algorithm). However, there are
strong circuit synthesis results for particular gate sets with nice structure. In particular, one can perform
fast, nearly-optimal synthesis for the set of single-qubit Clifford+T circuits. Not only does it admit fast
synthesis, but this gate set is also the most common choice for fault-tolerant quantum computation, so it is
likely to be relevant in practice.
To understand the synthesis of Clifford+T circuits, we focus here on the problem of exactly expressing
a given unitary operation over that gate set, assuming such a representation is possible. This result can
be extended to give an algorithm for approximately synthesizing arbitrary single-qubit gates, although the
details are beyond the scope of this lecture. (Note that some of these ideas can also be applied to the
synthesis of multi-qubit circuits, but that is also beyond our scope.)
By adding the T gate, we get a universal gate set: in other words, the group ⟨H, T⟩ generated by H and T is dense in U(2). We call any unitary operation that can be represented exactly over this gate set a Clifford+T operation.
Clearly, any single-qubit Clifford+T operation M can be written in the form

  M = C_n T C_{n-1} ⋯ C_1 T C_0    (3.2)
Let S := ⟨S, X, ω⟩ ⊆ C, where ω := e^{iπ/4}. Any element of S can be pushed through T (say, to the right), since we have

  S T = T S    (3.3)
  X T = ω^{-1} T X S    (3.4)
  ω T = T ω.    (3.5)

Thus we can assume C_1, …, C_n ∉ S. (In some cases, pushing elements of S to the right might cause two T gates to merge into an S ∈ C; we take n to be the number of Clifford gates after any such cancellations.)
An explicit calculation shows that |S| = 64, whereas |C| = 192. Since I, H, and SH are in different left cosets of S in C, they can be chosen as the three coset representatives, and we can write every element of the Clifford group in the form H′S′, where H′ ∈ {I, H, SH} and S′ ∈ S. Similarly, every element of C \ S can be written in the same form, where H′ ∈ {H, SH}. Thus we can write M in the form

  M = H′_n S_n T H′_{n-1} S_{n-1} ⋯ H′_1 S_1 T C_0    (3.6)

where C_0 ∈ C, H′_1, …, H′_n ∈ {H, SH}, and S_1, …, S_n ∈ S.
Now we can further simplify this expression by again pushing elements of S to the right. We have already seen that such operators can be pushed through T gates, giving new elements of S. But furthermore, they can also be pushed through elements of {H, SH}, since

  S H = (SH)    S (SH) = S^2 H = H X    (3.7)
  X H = H Z = H S^2    X (SH) = (SH) Y = (SH)(ω^2 X S^2)    (3.8)
  ω H = H ω    ω (SH) = (SH) ω.    (3.9)
After applying these rules, we are left with an expression of the form

  M = H′_k T H′_{k-1} ⋯ H′_1 T C_0    (3.10)

where H′_1, …, H′_{k-1} ∈ {H, SH} and H′_k ∈ {I, H, SH}. (Note that we can have k < n, since again we could find cancellations as we push gates to the right.) This expression is now in Matsumoto-Amano (MA) normal form. In terms of regular expressions, we can write this form as (ε | T)(HT | SHT)* C.
Since the above argument is constructive, it also gives an algorithm for converting circuits to MA normal
form. A naive implementation would take time O(n^2), since we might make a pass through O(n) gates to find
a simplification, and we might have to repeat this O(n) times before reaching MA normal form. However,
we can reduce to MA normal form in linear time by simplifying the given circuit gate-by-gate, maintaining
MA normal form along the way. If N is in MA normal form and C C, then N C can be reduced to MA
normal form in constant time (we simply combine the rightmost two Clifford operators). On the other hand,
case analysis shows that reducing N T to MA normal form only requires updating the rightmost 5 gates, so
it can also be reduced in constant time. Overall, this approach takes O(n) steps, each taking time O(1), for
a total running time of O(n).
An important parameter of a Clifford+T circuit is its T -count, which is simply the number of T gates it
contains. Clearly there is a way of writing any Clifford+T circuit in MA normal form such that the T -count
is minimal, simply because the reduction procedure described above never increases the T -count.
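The regular expression above can be used directly to recognize MA normal form if we encode a circuit as a string of syllables read left to right, with a single letter C standing for the trailing Clifford operator (a hypothetical string encoding chosen here purely for illustration):

```python
import re

# MA normal form as a regular expression over syllables: (eps | T)(HT | SHT)* C.
MA = re.compile(r"^T?(?:HT|SHT)*C$")

def t_count(word):
    """The T-count of a circuit written in this string encoding."""
    return word.count("T")

assert MA.match("C")          # a bare Clifford operator
assert MA.match("THTSHTC")    # T (HT) (SHT) C
assert not MA.match("TTC")    # TT should have merged into a Clifford
assert not MA.match("HHTC")   # HH should have cancelled
assert t_count("THTSHTC") == 3
```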
These generators belong to the ring Z[1/√2] = {(a + b√2)/√2^k : a, b ∈ ℤ, k ∈ ℕ}, so clearly the Bloch sphere representation of any Clifford+T operator has entries in this ring.
We say that k ∈ ℕ is a denominator exponent of x ∈ Z[1/√2] if √2^k x ∈ Z[√2] = {a + b√2 : a, b ∈ ℤ}. We call the smallest such k the least denominator exponent of x.
Define the parity of x ∈ Z[√2], denoted p(x), such that p(a + b√2) is the parity of a (i.e., 0 if a is even and 1 if a is odd). If k is a denominator exponent for x, define the k-parity of x ∈ Z[1/√2] as p_k(x) := p(√2^k x).
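These definitions are easy to make concrete. The following Python sketch represents x = (a + b√2)/√2^k by the integer triple (a, b, k) (an encoding of our choosing) and computes the least denominator exponent by repeatedly dividing out √2:

```python
# Represent x = (a + b*sqrt(2)) / sqrt(2)**k by the integer triple (a, b, k).
def reduce(a, b, k):
    """Return the representation with the least denominator exponent."""
    # (a + b*sqrt(2))/sqrt(2) = b + (a/2)*sqrt(2), possible only when a is even
    while k > 0 and a % 2 == 0:
        a, b, k = b, a // 2, k - 1
    return a, b, k

def least_denominator_exponent(a, b, k):
    return reduce(a, b, k)[2]

def parity(a, b):
    """p(a + b*sqrt(2)) from the text: the parity of a."""
    return a % 2

# x = (2 + 3*sqrt(2))/2 has least denominator exponent 1: sqrt(2)*x = 3 + sqrt(2)
assert reduce(2, 3, 2) == (3, 1, 1)
assert least_denominator_exponent(2, 3, 2) == 1
```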
Observe that the Bloch sphere representation of a Clifford operator is a signed permutation matrix, so it
has denominator exponent 0, and its parity (applied to the matrix elementwise) is a permutation.
We can define an equivalence relation on (k-)parity matrices of Bloch sphere representations of Clifford+T
operators such that they are equivalent if they differ by right-multiplication by the parity matrix of a Clifford
operator (in other words, by permutation of the columns). Now consider what happens to the k-parity
matrix of the operator as we proceed through the MA normal form, where we increase k by one every time
we multiply by a T gate. A simple calculation shows that transitions between the resulting equivalence
classes are as follows:
  [Transition diagram (3.12): representatives of the four equivalence classes of k-parity matrices are

    [1 0 0; 0 1 0; 0 0 1],  [1 1 0; 1 1 0; 0 0 0],  [0 0 0; 1 1 0; 1 1 0],  [1 1 0; 0 0 0; 1 1 0],

  with arrows labeled by the gates (T, H, S, C) that induce transitions between the classes.]
Here the matrices are representatives of equivalence classes of k-parity matrices, the labels on the arrows
show what gates induce the transitions, k = 0 at the leftmost (starting) matrix, and the value of k is increased
by 1 along each thick arrow. For example, for the transitions under a T gate, the Bloch matrix A with entries a_ij + b_ij √2 maps as

  A ↦ (1/√2) [1 -1 0; 1 1 0; 0 0 √2] A
    = (1/√2) [(a_11 - a_21) + (b_11 - b_21)√2, (a_12 - a_22) + (b_12 - b_22)√2, (a_13 - a_23) + (b_13 - b_23)√2;
              (a_11 + a_21) + (b_11 + b_21)√2, (a_12 + a_22) + (b_12 + b_22)√2, (a_13 + a_23) + (b_13 + b_23)√2;
              2b_31 + a_31 √2, 2b_32 + a_32 √2, 2b_33 + a_33 √2].    (3.13)

At the leftmost matrix, we have a_11, a_22, a_33 odd and a_ij even for i ≠ j. Clearly the resulting 1-parity matrix is of the indicated form. Similar calculations verify the other transitions.
From this transition diagram, we can easily see that the MA normal form is unique. If we remain at the
leftmost matrix, the operation is Clifford. On the other hand, if we end at one of the next three matrices to
the right, the leftmost syllable of M is T , HT , or SHT , respectively. Given a matrix M , let k be its least
denominator exponent. By computing pk (M ) and determining which equivalence class it belongs to, we can
determine the final syllable of its MA normal form. By induction, the entire MA normal form is determined.
Note that this also shows that the least denominator exponent of M is its minimal T -count.
This argument also implies an algorithm for exact synthesis given the matrix of a Clifford+T operation
(instead of some initial Clifford+T circuit). We simply convert to the Bloch matrix representation, compute
the least denominator exponent k, use pk (M ) to determine the leftmost syllable of the MA normal form, and
recurse until we are left with a Clifford operation. In this algorithm, the number of arithmetic operations
performed is O(k).
entries are in Z[1/√2]. As noted above, the "only if" part is trivial; it remains to show that any such matrix corresponds to a Clifford+T operation.
The proof of this statement uses the orthogonality condition on U to characterize the possible values of p_k(U) (specifically, to show it is one of the forms in the above transition diagram, up to permutation of the columns), and then shows that the least denominator exponent can always be reduced by multiplying from the left by the inverse of a matrix from {T, HT, SHT}. The proofs of these statements are straightforward, but involve some explicit calculation and case analysis; see [47] for details.
As a simple corollary, we can establish that U is a Clifford+T unitary if and only if its entries are in Z[1/√2, i]. Again the "only if" direction is trivial. For the other direction, simply observe that if the entries of U are in Z[1/√2, i], then the entries of its Bloch sphere representation are in Z[1/√2], and we can apply the characterization of Bloch matrices of Clifford+T unitaries. Note that this only determines the actual matrix up to a phase, but this phase must be a power of ω, so indeed the original U must be a Clifford+T unitary.
In summary, we have seen that any Clifford+T unitary can be synthesized into a Clifford+T circuit, with
the minimal number of T gates (equal to the least denominator exponent of its Bloch sphere representation),
in time linear in the T -count.
Part II

Quantum algorithms for algebraic problems
Chapter 4

The abelian quantum Fourier transform and phase estimation
  F_G := (1/√|G|) ∑_{x∈G} ∑_{y∈Ĝ} χ_y(x) |y⟩⟨x|    (4.1)

where Ĝ is a complete set of characters of G, and χ_y(x) denotes the yth character of G evaluated at x. (You can verify that this is a unitary operator using the orthogonality of characters.) Since G and Ĝ are isomorphic, we can label the elements of Ĝ using elements of G, and it is often useful to do so.
The simplest QFT over a family of groups is the QFT over G = ℤ_2^n. The characters of this group are χ_y(x) = (-1)^{x·y}, so the QFT is simply

  F_{ℤ_2^n} = (1/√2^n) ∑_{x,y∈ℤ_2^n} (-1)^{x·y} |y⟩⟨x| = H^{⊗n}.    (4.2)
You have presumably seen how this transformation is used in the solution of Simon's problem [92].
  F_{ℤ_{2^n}} = (1/√2^n) ∑_{x,y∈ℤ_{2^n}} ω_{2^n}^{xy} |y⟩⟨x|    (4.3)

where ω_m := exp(2πi/m) is a primitive mth root of unity. To see how to realize this transformation by a quantum circuit, it is helpful to represent the input x as a string of bits, x = x_{n-1} … x_1 x_0, and to consider
  |x⟩ ↦ (1/√2^n) ∑_{y∈ℤ_{2^n}} ω_{2^n}^{xy} |y⟩    (4.4)
     = (1/√2^n) ∑_{y∈ℤ_{2^n}} ω_{2^n}^{x (∑_{k=0}^{n-1} y_k 2^k)} |y_{n-1} … y_1 y_0⟩    (4.5)
     = (1/√2^n) ∑_{y∈ℤ_{2^n}} ∏_{k=0}^{n-1} ω_{2^n}^{x y_k 2^k} |y_{n-1} … y_1 y_0⟩    (4.6)
     = (1/√2^n) ⊗_{k=0}^{n-1} ∑_{y_k∈ℤ_2} ω_{2^n}^{x y_k 2^k} |y_k⟩    (4.7)
     = ⊗_{k=0}^{n-1} |z_k⟩    (4.8)

where

  |z_k⟩ := (1/√2) ∑_{y_k∈ℤ_2} ω_{2^n}^{x y_k 2^k} |y_k⟩    (4.9)
        = (1/√2)(|0⟩ + ω_{2^n}^{x 2^k} |1⟩)    (4.10)
        = (1/√2)(|0⟩ + ω_{2^n}^{∑_{j=0}^{n-1} x_j 2^{j+k}} |1⟩)    (4.11)
        = (1/√2)(|0⟩ + e^{2πi(x_0 2^{k-n} + x_1 2^{k+1-n} + ⋯ + x_{n-1-k} 2^{-1})} |1⟩).    (4.12)

(A more succinct way to write this is |z_k⟩ = (1/√2)(|0⟩ + ω_{2^{n-k}}^x |1⟩), but the above expression is more helpful for understanding the circuit.) In other words, F|x⟩ is a tensor product of single-qubit states, where the kth qubit only depends on the n - k least significant bits of x.
This decomposition immediately gives a circuit for the QFT over ℤ_{2^n}. Let R_k denote the single-qubit unitary operator

  R_k := [1 0; 0 ω_{2^k}].    (4.13)
  [Circuit diagram: for each j, qubit |x_j⟩ is acted on by H followed by controlled rotations R_2, …, R_{j+1} (controlled by the lower-order qubits x_{j-1}, …, x_0), and emerges as |z_{n-1-j}⟩. In particular, |x_0⟩ ↦ |z_{n-1}⟩ after H alone, |x_1⟩ ↦ |z_{n-2}⟩ after H and R_2, and |x_{n-1}⟩ ↦ |z_0⟩ after H and R_2, …, R_n.]
This circuit uses O(n^2) gates. However, there are many rotations by small angles that do not affect the final result very much. If we simply omit the gates R_k with k = Ω(log n), then we obtain a circuit with O(n log n) gates that implements the QFT with precision 1/poly(n).
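The tensor-product decomposition (4.8) can be verified numerically. The following Python/NumPy sketch compares F|x⟩, computed from the defining matrix (4.3), against the product of the single-qubit states |z_k⟩ from (4.10):

```python
import numpy as np

# Check that F|x> equals the tensor product of the |z_k> from (4.10),
# with z_{n-1} as the most significant qubit.
n = 3
N = 2 ** n
omega = np.exp(2j * np.pi / N)

F = np.array([[omega ** (x * y) for x in range(N)] for y in range(N)]) / np.sqrt(N)

for x in range(N):
    z = [np.array([1, omega ** (x * 2 ** k)]) / np.sqrt(2) for k in range(n)]
    prod = np.array([1.0])
    for k in reversed(range(n)):      # z_{n-1} (x) ... (x) z_0
        prod = np.kron(prod, z[k])
    assert np.allclose(F[:, x], prod)
```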
4.3 Phase estimation
apply an inverse Fourier transform on the first register, and measure. If the binary expansion of θ/2π terminates after at most n bits (i.e., if θ = 2πy/2^n for some y ∈ ℤ_{2^n}), then the state (4.16) is F_{2^n}|y⟩ ⊗ |φ⟩, so the result is guaranteed to be the binary expansion of θ/2π. In general, we obtain a good approximation with high probability. In particular, the probability of obtaining the result y (corresponding to the estimate 2πy/2^n for the phase) is

  Pr(y) = (1/2^{2n}) sin^2(2^{n-1} θ) / sin^2(θ/2 - πy/2^n),    (4.17)

which is strongly peaked around the best n-bit approximation (in particular, it gives the best n-bit approximation with probability at least 4/π^2). We will see the details of a similar calculation when we discuss period finding.
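Formula (4.17) can be checked against a direct computation of the amplitudes. The following Python/NumPy sketch does this for an arbitrarily chosen phase θ, and also confirms that the best n-bit approximation appears with probability at least 4/π^2:

```python
import numpy as np

# Compare the phase-estimation outcome distribution computed directly from
# the amplitudes with the closed form (4.17).
n = 5
N = 2 ** n
theta = 1.2345

ys = np.arange(N)
# amplitude of outcome y: (1/N) sum_x exp(i x (theta - 2 pi y / N))
delta = theta - 2 * np.pi * ys / N
amps = np.array([np.exp(1j * np.arange(N) * d).sum() / N for d in delta])
direct = np.abs(amps) ** 2

formula = np.sin(2 ** (n - 1) * theta) ** 2 / (
    N**2 * np.sin(theta / 2 - np.pi * ys / N) ** 2)

assert np.allclose(direct, formula)
assert np.isclose(direct.sum(), 1.0)
best = int(np.rint(theta * N / (2 * np.pi))) % N   # best n-bit approximation
assert direct[best] >= 4 / np.pi**2
```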
The circuit we derived using the binary representation of the input and output only works when N is a power of two (or, with a slight generalization, some other small integer). But there is a simple way to realize F_{ℤ_N} (approximately) using phase estimation.
We would like to perform the transformation that maps |x⟩ ↦ |x̃⟩, where |x̃⟩ := F_{ℤ_N}|x⟩ denotes a Fourier basis state. (By linearity, if the transformation acts correctly on a basis, it acts correctly on all states.) It is straightforward to perform the transformation |x, 0⟩ ↦ |x, x̃⟩; then it remains to erase the register |x⟩ from such a state.
Consider the unitary operator that adds 1 modulo N:

  U := ∑_{x∈ℤ_N} |x + 1⟩⟨x|.    (4.19)

The eigenstates of this operator are precisely the Fourier basis states |x̃⟩ := F_{ℤ_N}|x⟩, since (as a simple calculation shows)

  F_{ℤ_N}† U F_{ℤ_N} = ∑_{x∈ℤ_N} ω_N^{-x} |x⟩⟨x|.    (4.20)
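Equation (4.20) is easy to verify numerically. The following Python/NumPy sketch conjugates the increment operator by the QFT matrix for a small N and checks that the result is diagonal with entries ω_N^{-x}:

```python
import numpy as np

# Conjugating the mod-N increment operator U by the QFT diagonalizes it,
# with eigenvalue omega_N^(-x) on the Fourier basis state |x~>.
N = 5
omega = np.exp(2j * np.pi / N)

F = np.array([[omega ** (x * y) for x in range(N)] for y in range(N)]) / np.sqrt(N)
U = np.zeros((N, N))
for x in range(N):
    U[(x + 1) % N, x] = 1          # U = sum_x |x+1><x|

D = F.conj().T @ U @ F
assert np.allclose(D, np.diag(omega ** (-np.arange(N))))
```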
Thus, using phase estimation on U (with n bits of precision, where n = O(log N)), we can perform the transformation

  |x̃, 0⟩ ↦ |x̃, x⟩    (4.21)

(actually, phase estimation only gives an approximation of x, so we implement this transformation only approximately). By running this operation in reverse, we can erase |x⟩, and thereby produce the desired QFT.
Given the Fourier transform over ZN , it is straightforward to implement the QFT over an arbitrary finite
abelian group: any finite abelian group can be written as a direct product of cyclic factors, and the QFT
over a direct product of groups is simply the tensor product of QFTs over the individual groups.
Chapter 5

Discrete log and the hidden subgroup problem
In this lecture we will discuss the discrete logarithm problem and its relevance to cryptography. We will introduce the general hidden subgroup problem, and show how Shor's algorithm solves a particular instance of it, giving an efficient quantum algorithm for discrete log.
At the end of the protocol, Alice and Bob share a key K, and Eve has only seen p, g, A, and B.
The security of the Diffie-Hellman protocol relies on the assumption that discrete log is hard. Clearly, if
Eve can compute discrete logarithms, she can recover a and b, and hence the key. (Note that it is an open
question whether, given the ability to break the protocol, Eve can calculate discrete logarithms, though some
partial results in this direction are known.)
This protocol only provides a means of exchanging a secret key, not of sending private messages. However,
very similar ideas can be used to create a public-key cryptosystem (similar in spirit to RSA).
for some unknown subgroup H ≤ G. We say that such a function hides H. The goal of the HSP is to learn H (say, specified in terms of a generating set) using queries to f.
It's clear that H can in principle be reconstructed if we are given the entire truth table of f. Notice in particular that f(1) = f(x) if and only if x ∈ H: the hiding function is constant on the hidden subgroup, and does not take that value anywhere else.
But the hiding function has a lot more structure as well. If we fix some element g ∈ G with g ∉ H, we see that f(g) = f(x) if and only if x ∈ gH, a left coset of H in G with coset representative g. So f is constant on the left cosets of H in G, and distinct on different left cosets.
In the above definition of the HSP, we have made an arbitrary choice to multiply by elements of H on
the right, which is why the hiding function is constant on left cosets. We could just as well have chosen to
multiply by elements of H on the left, in which case the hiding function would be constant on right cosets;
the resulting problem would be equivalent. Of course, in the case where G is abelian, we don't need to make
such a choice. For reasons that we will see later, this case turns out to be considerably simpler than the
general case; and indeed, there is an efficient quantum algorithm for the HSP in any abelian group, whereas
there are only a few nonabelian groups for which efficient algorithms are known.
You should be familiar with Simon's problem, the HSP with G = Z₂ⁿ and H = {0, s} for some s ∈ Z₂ⁿ.
There is a very simple quantum algorithm for this problem, yet one can prove that any classical algorithm
for finding s must query the hiding function exponentially many times (in n). The gist of the argument is
that, since the set S is unstructured, we can do no better than querying random group elements so long as
we do not know two elements x, y for which f(x) = f(y). But by the birthday problem, we are unlikely to
see such a collision until we make Ω(√(|G|/|H|)) random queries.
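A minimal concrete hiding function for Simon's problem makes this structure explicit (a sketch; the function name is ours, and this particular choice of canonical coset representative is one arbitrary option among many):

```python
# A hiding function for Simon's problem with G = Z_2^n, H = {0, s}:
# f is constant on each coset {x, x XOR s} and distinct across cosets.
def simon_hiding_function(n, s):
    return lambda x: min(x, x ^ s)   # pick a canonical coset representative

n, s = 3, 0b101
f = simon_hiding_function(n, s)

# f(x) = f(y) exactly when y lies in the coset x + H = {x, x XOR s}.
for x in range(2 ** n):
    for y in range(2 ** n):
        assert (f(x) == f(y)) == (y in (x, x ^ s))
```

A classical algorithm querying such an f sees only unstructured collisions, which is what drives the birthday-bound lower bound above.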
A similar argument applies to any HSP with a large number of trivially intersecting subgroups. More
precisely, we have
Theorem 5.1. Suppose that G has a set 𝓗 of N subgroups whose only common element is the identity.
Then a classical computer must make Ω(√N) queries to solve the HSP.
Proof. Suppose the oracle does not a priori hide a particular subgroup, but instead behaves adversarially,
as follows. On the ℓth query, the algorithm queries g_ℓ, which we assume to be different from g₁, …, g_{ℓ−1}
without loss of generality. If there is any subgroup H ∈ 𝓗 for which g_k ∉ g_j H for all 1 ≤ j < k ≤ ℓ (i.e.,
there is some consistent way the oracle could assign g_ℓ to an as-yet-unqueried coset of a hidden subgroup
from 𝓗), then the oracle simply outputs ℓ; otherwise the oracle concedes defeat and outputs a generating
set for some H ∈ 𝓗 consistent with its answers so far (which must exist, by construction).
The goal of the algorithm is to force the oracle to concede, and we want to lower bound the number of
queries required. (Given an algorithm for the HSP in G, there is clearly an algorithm that forces this oracle
to concede using only one more query.) Now consider an algorithm that queries the oracle t times before
forcing the oracle to concede. This algorithm simply sees a fixed sequence of responses 1, 2, …, t, so for the
first t queries, the algorithm cannot be adaptive. But observe that, regardless of which t group elements are
queried, there are at most t(t − 1)/2 values of g_k g_j⁻¹, whereas there are N possible subgroups in 𝓗. Thus, to satisfy
the N conditions that for all H ∈ 𝓗, there is some pair j, k such that g_k g_j⁻¹ ∈ H, we must have t(t − 1)/2 ≥ N,
i.e., t = Ω(√N).
Note that there are cases where a classical algorithm can find the hidden subgroup with a polynomial
number of queries. In particular, since a classical computer can easily test whether a certain subgroup is
indeed the hidden one, the HSP is easy for a group with only a polynomial number of subgroups. Thus, for
example, a classical computer can easily solve the HSP in Z_p for p prime (since it has only 2 subgroups) and
in Z_{2ⁿ} (since it has only n + 1 subgroups).
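For Z_{2ⁿ} this classical test is easy to make concrete: the subgroups are exactly ⟨2^k⟩ for k = 0, …, n, and they are nested, so roughly n + 1 queries suffice (a sketch; the function and variable names are ours):

```python
# Classically find the hidden subgroup of Z_{2^n} with about n + 1 queries.
# The subgroups of Z_{2^n} are <2^k> for k = 0, ..., n, nested by inclusion,
# and h lies in the hidden subgroup exactly when f(h) = f(0).
def hidden_subgroup_z2n(f, n):
    N = 2 ** n
    for k in range(n + 1):
        g = pow(2, k, N)          # 2^k, with 2^n = 0 in Z_{2^n}
        if f(g) == f(0):
            return g              # smallest k works: this generates the hidden subgroup

# Example: in Z_16, the function x -> x mod 4 hides the subgroup <4>.
assert hidden_subgroup_z2n(lambda x: x % 4, 4) == 4
```

The nesting of the subgroups is what makes this efficient; for a group with exponentially many trivially intersecting subgroups, Theorem 5.1 rules out any such shortcut.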
H = L₀ = {(0, 0), (1, log_g x), (2, 2 log_g x), …, (N − 1, (N − 1) log_g x)}.  (5.4)
The cosets of H in Z_N × Z_N are of the form (μ, ν) + H with μ, ν ∈ Z_N. In particular, the set of cosets of the
form
(0, β) + H = {(α, β + α log_g x) : α ∈ Z_N} = L_β  (5.5)
with β varying over all of Z_N gives a complete set of cosets (so the set {0} × Z_N is a complete set of coset
representatives, i.e., a transversal of H in Z_N × Z_N).
Shor's algorithm for finding H proceeds as follows. We start from the uniform superposition over Z_N × Z_N
and compute the hiding function:
|Z_N × Z_N⟩ := (1/N) Σ_{α,β∈Z_N} |α, β⟩ ↦ (1/N) Σ_{α,β∈Z_N} |α, β, f(α, β)⟩.  (5.6)
Next we discard the third register. To see what this does, it may be conceptually helpful to imagine
that we actually measure the third register. Then the post-measurement state is a superposition over group
elements consistent with the observed function value, which by definition is some coset of H. In particular,
if the measurement outcome is g^β, we are left with the coset state corresponding to (0, β) + H, namely
|(0, β) + H⟩ = |L_β⟩ = (1/√N) Σ_{α∈Z_N} |α, β + α log_g x⟩.  (5.7)
However, note that the measurement outcome is unhelpful: each possible value g^β occurs with equal probability,
and we cannot obtain β from g^β unless we know how to take discrete logarithms. This is why we may
as well simply discard the third register, leaving the system in the mixed state described by the ensemble of
pure states (5.7) where β is uniformly random and unknown.
Now we can exploit the symmetry of the quantum state by performing a QFT over Z_N × Z_N; then the
state becomes
(1/N^{3/2}) Σ_{α,μ,ν∈Z_N} ω_N^{αμ + (β + α log_g x)ν} |μ, ν⟩ = (1/N^{3/2}) Σ_{μ,ν∈Z_N} ω_N^{βν} Σ_{α∈Z_N} ω_N^{α(μ + ν log_g x)} |μ, ν⟩,  (5.8)
and using the identity Σ_{α∈Z_N} ω_N^{αμ} = N δ_{μ,0}, we have
(1/√N) Σ_{ν∈Z_N} ω_N^{βν} |−ν log_g x, ν⟩.  (5.9)
N ZN
Now suppose we measure this state in the computational basis. Then we obtain some pair ( logg x, ) for
uniformly random ZN . If has a multiplicative inverse modulo N , we can divide the first register by
to get the desired answer. If does not have a multiplicative inverse, we simply repeat the entire procedure
again. The probability of success for each independent attempt is (N )/N = (1/ log log N ), so we dont
have to repeat the procedure many times before we find an invertible .
This algorithm can be carried out for any cyclic group G so long as we have a unique representation of
the group elements, and we are able to efficiently compute products in G. (We need to be able to compute
high powers of a group element, but recall that this can be done quickly by repeated squaring.)
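The classical post-processing step can be sketched concretely (the helper names are ours, and the quantum measurement is replaced by directly sampling the pair appearing in (5.9)):

```python
from math import gcd
import random

# Simulate the measurement outcome of Shor's discrete log algorithm: a pair
# (-v*d mod N, v) with v uniform in Z_N, where d = log_g x is the secret.
def sample_outcome(d, N, rng):
    v = rng.randrange(N)
    return ((-v * d) % N, v)

def recover_dlog(pair, N):
    a, v = pair
    if gcd(v, N) != 1:
        return None                    # v not invertible: discard and retry
    return (-a * pow(v, -1, N)) % N    # divide the first register by -v

rng = random.Random(0)
N, d = 100, 37
d_rec = None
while d_rec is None:                   # phi(N)/N of the attempts succeed
    d_rec = recover_dlog(sample_outcome(d, N, rng), N)
assert d_rec == d
```

The retry loop mirrors the argument above: each independent attempt succeeds with probability φ(N)/N, so only a few repetitions are needed.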
Chapter 6
The abelian HSP and decomposing abelian groups
Here we describe an algorithm to solve the HSP in any finite abelian group of known structure. We also
explain how related ideas can be used to determine the structure of a black-box abelian group.
where
χ_y(H) := (1/|H|) Σ_{h∈H} χ_y(h).  (6.8)
Note that applying the QFT was the right thing to do because the state ρ_H is G-invariant. In other words,
it commutes with the regular representation of G, the unitary matrices U(x) satisfying U(x)|y⟩ = |x + y⟩ for
all x, y ∈ G: we have
U(x) ρ_H = (1/|G|) Σ_{y∈G} |x + y + H⟩⟨y + H|  (6.9)
= (1/|G|) Σ_{z∈G} |z + H⟩⟨z − x + H|  (6.10)
= ρ_H U(−x)†  (6.11)
= ρ_H U(x).  (6.12)
It follows that ρ̃_H := F_G ρ_H F_G† is diagonal (indeed, we verify this explicitly below), so we can measure
without losing any information. We will talk about this phenomenon more when we discuss nonabelian
Fourier sampling.
Note that χ_y is a character of H if we restrict our attention to that subgroup. If χ_y(h) = 1 for all h ∈ H,
then clearly χ_y(H) = 1. On the other hand, if there is any h₀ ∈ H with χ_y(h₀) ≠ 1 (i.e., if the restriction of
χ_y to H is not the trivial character of H), then since h₀ + H = H, we have
χ_y(H) = (1/|H|) Σ_{h∈h₀+H} χ_y(h)  (6.13)
= (1/|H|) Σ_{h∈H} χ_y(h₀ + h)  (6.14)
= χ_y(h₀) χ_y(H),  (6.15)
which implies that χ_y(H) = 0. (This also follows from the orthogonality of characters of H,
(1/|H|) Σ_{x∈H} χ_y(x)* χ_{y′}(x) = δ_{y,y′}.)  (6.16)
Next we measure in the computational basis. Then we obtain some character χ_y that is trivial on the
hidden subgroup H. This information narrows down the possible elements of the hidden subgroup: we can
restrict our attention to those elements g ∈ G satisfying χ_y(g) = 1. The set of such elements is called the
kernel of χ_y,
ker χ_y := {g ∈ G : χ_y(g) = 1};  (6.19)
it is a subgroup of G. Now our strategy is to repeat the entire sampling procedure many times and compute
the intersection of the kernels of the resulting characters. After only polynomially many steps, we claim that
the resulting subgroup is H with high probability. It clearly cannot be smaller than H (since the kernel of
every sampled character contains H), so it suffices to show that each sample is likely to reduce the size of
the intersection by a substantial fraction until H is reached.
Suppose that at some point in this process, the intersection of the kernels is K ≤ G with K ≠ H. Since
K is a subgroup of G with H < K, we have |K| ≥ 2|H| (by Lagrange's theorem). Because each character
χ_y of G satisfying χ_y(H) = 1 has probability |H|/|G| of appearing, the probability that we see some y for
which K ⊆ ker χ_y is
(|H|/|G|) · |{y ∈ G : K ⊆ ker χ_y}|.  (6.20)
But the number of such y's is precisely |G|/|K|, since we know that if the subgroup K were hidden, we would
sample such y's uniformly, with probability |K|/|G|. Therefore the probability that we see a y for which
K ⊆ ker χ_y is precisely |H|/|K| ≤ 1/2. Now if we observe a y such that K ⊄ ker χ_y, then |K ∩ ker χ_y| ≤ |K|/2;
furthermore, this happens with probability at least 1/2. Thus, if we repeat the process O(log |G|) times, it
is extremely likely that the resulting subgroup is in fact H.
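For a cyclic group G = Z_N this sampling-and-intersection strategy is easy to simulate classically (a sketch with names of our choosing; the characters of Z_N are χ_y(x) = ω_N^{xy}, and χ_y is trivial on H exactly when yh ≡ 0 (mod N) for every h ∈ H):

```python
import random

# Simulate the abelian HSP algorithm for G = Z_N with hidden subgroup H = <h>:
# repeatedly sample a character trivial on H and intersect the kernels.
def find_hidden_subgroup(N, h, samples, seed=1):
    rng = random.Random(seed)
    H = {(k * h) % N for k in range(N)}
    # Fourier sampling returns y with chi_y trivial on H, uniformly at random.
    trivial = [y for y in range(N) if all((y * g) % N == 0 for g in H)]
    K = set(range(N))                       # start from the full group
    for _ in range(samples):
        y = rng.choice(trivial)
        K &= {g for g in range(N) if (y * g) % N == 0}   # kernel of chi_y
    return K

# Hidden subgroup <3> = {0, 3, 6, 9} in Z_12, recovered after a few samples.
assert find_hidden_subgroup(12, 3, samples=20) == {0, 3, 6, 9}
```

Each sampled kernel contains H, and the intersection shrinks by at least half with probability at least 1/2 per sample, exactly as in the argument above.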
G ≅ Z_{|⟨h₁⟩|} ⊕ Z_{|⟨h₂⟩|} ⊕ ⋯ ⊕ Z_{|⟨h_t⟩|}.  (6.22)
Given such a decomposition, it is straightforward to implement FG and thereby solve HSPs in G. We might
also use this tool to decompose the structure of the hidden subgroup H output by the HSP algorithm, e.g.,
to compute |H|.
First, it is helpful to simplify the problem by reducing to the case of a p-group for some prime p. For
each given generator g of G, we compute its order, the smallest positive integer r such that rg = 0
(where we are using additive notation; in multiplicative notation we would write g^r = 1). Recall that there
is an efficient quantum algorithm for order finding. Furthermore, there is an efficient quantum algorithm for
factoring, so suppose we can write r = st for some relatively prime integers s, t. By Euclid's algorithm, we
can find a, b such that as + bt = 1, so asg + btg = g. Therefore, we can replace the generator g by the two
generators sg and tg and still have a generating set. By repeating this procedure, we eventually obtain a
generating set in which all the generators have prime power order.
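The splitting step can be sketched concretely in the additive group Z_m (illustrative names; the extended Euclidean algorithm supplies the coefficients a and b):

```python
# Split a generator g of order r = s*t (with gcd(s,t) = 1) in the additive
# group Z_m into the two generators s*g and t*g, using a*s + b*t = 1.
def ext_gcd(s, t):
    """Return (gcd, a, b) with a*s + b*t = gcd."""
    if t == 0:
        return (s, 1, 0)
    g, a, b = ext_gcd(t, s % t)
    return (g, b, a - (s // t) * b)

def split_generator(g, s, t, m):
    gcd, a, b = ext_gcd(s, t)
    assert gcd == 1
    # a*(s*g) + b*(t*g) = (a*s + b*t)*g = g, so {s*g, t*g} generates <g>.
    return (s * g) % m, (t * g) % m, a, b

# Example: g = 1 has order 12 = 3*4 in Z_12; s*g has order 4, t*g has order 3.
sg, tg, a, b = split_generator(1, 3, 4, 12)
assert (a * sg + b * tg) % 12 == 1
```

Repeating this on every generator yields a generating set of prime power orders, as described above.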
For a given prime p, let G_p be the group generated by all the generators of G whose order is a power of
p. Then G = ⊕_p G_p: every element of G can be written as a sum of elements from the G_p's (since together
they include a generating set), and since G_p is a p-group (i.e., the orders of all its elements are powers of
p), G_p ∩ G_{p′} = {0} for p ≠ p′. Thus, it suffices to focus on the generators of G_p and determine the structure of this
p-group. So from now on we assume that the order of G is a power of p.
Now, given a generating set {g₁, …, g_d} for G, let q (which is some power of p) be the largest order of
any of the generators. We consider a hidden subgroup problem in the group Z_q^d whose solution allows us to
determine the structure of G. Define f : Z_q^d → G by
f(x₁, …, x_d) = x₁g₁ + ⋯ + x_d g_d.
Now f(x₁, …, x_d) = f(y₁, …, y_d) if and only if (x₁ − y₁)g₁ + ⋯ + (x_d − y_d)g_d = 0, i.e., if and only if
f(x − y) = 0. The elements of Z_q^d on which f takes the value 0,
K := {x ∈ Z_q^d : f(x) = 0},
form a subgroup of Z_q^d called the kernel of f. Using the algorithm for the hidden subgroup problem in Z_q^d,
we can find generators for K. Suppose this generating set is W = {w₁, …, w_m}, where w_i ∈ Z_q^d.
The function f is clearly a homomorphism from Z_q^d to G, and it is also surjective (i.e., onto, meaning
that the image of f is all of G), which implies that Z_q^d/K ≅ G (this is called the first isomorphism theorem).
Thus, to determine the structure of G, it suffices to determine the structure of the quotient Z_q^d/K. In
particular, if Z_q^d/K = ⟨u₁ + K⟩ ⊕ ⋯ ⊕ ⟨u_t + K⟩, then G = ⟨f(u₁)⟩ ⊕ ⋯ ⊕ ⟨f(u_t)⟩. The final ingredient is a
polynomial-time classical algorithm that produces such a direct sum decomposition of a quotient group.
To find such a decomposition, it is helpful to view the problem in terms of linear algebra. With x ∈ Z_q^d,
we have x + K = K (so that f(x) = 0, and there is no need to include x as a generator) if and only if
x ∈ span_{Z_q} W (recall that W is a generating set for K). We can easily modify this to allow arbitrary integer
vectors x ∈ Z^d: then x + K = K if and only if x ∈ span_Z(W ∪ {qe₁, …, qe_d}), where e_i is the ith standard
basis vector. In other words, as x varies over the integer span of the vectors w₁, …, w_m, qe₁, …, qe_d, we
obtain redundant vectors.
Now we use a tool from integer linear algebra called the Smith normal form. A square integer matrix
is called unimodular if it has determinant ±1. Given an integer matrix M, its Smith normal form is a
decomposition M = UDV⁻¹, where U and V are unimodular and D = diag(1, …, 1, d₁, …, d_t, 0, …, 0) is an integer diagonal matrix with
its positive diagonal entries satisfying d₁ | d₂ | ⋯ | d_t. The Smith normal form can be computed classically
in polynomial time.
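The reduction underlying the Smith normal form can be sketched in a few lines: repeatedly pick a pivot of minimal magnitude, clear its row and column with integer (unimodular) row and column operations, and enforce the divisibility condition. For brevity this educational version returns only the diagonal entries, not the transformation matrices U and V, and it does not control entry growth the way a genuinely polynomial-time implementation must:

```python
# Diagonal of the Smith normal form of an integer matrix, via elementary
# (unimodular) row and column operations. Educational sketch only.
def smith_diagonal(M):
    A = [row[:] for row in M]
    m, n = len(A), len(A[0])
    diag, t = [], 0
    while t < min(m, n):
        # choose a pivot: the nonzero entry of smallest magnitude
        piv = min(((i, j) for i in range(t, m) for j in range(t, n)
                   if A[i][j] != 0),
                  key=lambda ij: abs(A[ij[0]][ij[1]]), default=None)
        if piv is None:
            break                        # remaining submatrix is zero
        i0, j0 = piv
        A[t], A[i0] = A[i0], A[t]        # move the pivot to position (t, t)
        for row in A:
            row[t], row[j0] = row[j0], row[t]
        dirty = False
        for i in range(t + 1, m):        # clear column t with row operations
            q = A[i][t] // A[t][t]
            for j in range(t, n):
                A[i][j] -= q * A[t][j]
            dirty |= A[i][t] != 0
        for j in range(t + 1, n):        # clear row t with column operations
            q = A[t][j] // A[t][t]
            for i in range(t, m):
                A[i][j] -= q * A[i][t]
            dirty |= A[t][j] != 0
        if dirty:
            continue                     # smaller entries appeared; re-pivot
        # enforce divisibility: the pivot must divide all remaining entries
        bad = next(((i, j) for i in range(t + 1, m) for j in range(t + 1, n)
                    if A[i][j] % A[t][t] != 0), None)
        if bad is not None:
            for j in range(t, n):        # add the offending row to row t
                A[t][j] += A[bad[0]][j]
            continue
        diag.append(abs(A[t][t]))
        t += 1
    return diag

assert smith_diagonal([[2, 0], [0, 3]]) == [1, 6]   # diag(2,3) ~ diag(1,6)
```

The returned entries d₁ | d₂ | ⋯ are exactly the diagonal of D; the entries not equal to 0 or 1 determine the cyclic factors used below.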
In the present context, let M be the matrix with columns w₁, …, w_m, qe₁, …, qe_d. Let M = UDV⁻¹ be
its Smith normal form, and let u₁, …, u_t be the columns of U corresponding to diagonal entries of D that
are not 0 or 1 (i.e., if the ith diagonal entry of D is not 0 or 1, the ith column of U is included). We claim
that Z_q^d/K = ⟨u₁ + K⟩ ⊕ ⋯ ⊕ ⟨u_t + K⟩.
Since U is nonsingular, it is clear that we still have a generating set if we take all the columns of U.
We're claiming that the columns corresponding to 0 or 1 diagonal entries of D are redundant. Let u be
the jth column of U; we know that u + K = K (i.e., u is redundant) if u ∈ span_Z cols(M) (where cols(M)
denotes the set of columns of M). Since V is unimodular, span_Z cols(M) = span_Z cols(MV). So u + K = K
if u ∈ span_Z cols(MV), i.e., if e_j ∈ span_Z cols(U⁻¹MV) = span_Z cols(D). If the jth diagonal entry of D is 0
or 1, then clearly this is true, so u + K = K. This shows that the cosets u₁ + K, …, u_t + K alone indeed
generate Z_q^d/K.
It remains to show that they generate Z_q^d/K as a direct sum. The above argument shows that d_i u_i + K =
K, and this is not true for any smaller positive multiple of u_i, so the order of u_i + K is d_i. Now suppose Σ_i x_i u_i + K =
K. Then Σ_i x_i u_i ∈ span_Z cols(M) = span_Z cols(MV), or in other words, x ∈ span_Z cols(U⁻¹MV) =
span_Z cols(D). But this implies that x_i is an integer multiple of d_i, which shows that ⟨u₁ + K⟩ ⊕ ⋯ ⊕ ⟨u_t + K⟩
is indeed a direct sum decomposition.
Chapter 7
Quantum attacks on elliptic curve cryptography
In Chapter 5 we discussed Shor's algorithm, which can calculate discrete logarithms over any cyclic group.
In particular, this algorithm can be used to break the Diffie-Hellman key exchange protocol, which assumes
that the discrete log problem in Z_p^* (p prime) is hard. However, Shor's algorithm also breaks elliptic curve
cryptography, the main competitor to RSA. In this lecture we will introduce elliptic curves and show how
they give rise to abelian groups that can be used to define cryptosystems.
This lecture is only intended to be a survey of the main ideas behind elliptic curve cryptography. While
breaking such cryptosystems is a major potential application of quantum computers, only a few implementation
details differ between the algorithms for discrete log over the integers and over elliptic curves; no new
quantum ideas are required.
y² = x³ + ax + b  (7.1)
where a, b ∈ F are parameters. The set of points (x, y) ∈ F² satisfying this equation, together with a special
point O called the point at infinity, is called the elliptic curve E_{a,b}. A curve is called nonsingular if its
discriminant, Δ := −16(4a³ + 27b²), is nonzero, and we will assume that this is the case for all curves we
consider.
Here are a few examples of elliptic curves over R²:
[Figure: three examples of elliptic curves plotted over the real plane.]
Such pictures are helpful for developing intuition. However, for cryptographic applications it is useful to
have a curve whose points can be represented exactly with a finite number of bits, so we use curves over
finite fields. For simplicity, we will only consider the case F = F_p where p is a prime different from 2 or 3.
As an example, consider the curve
y² = x³ − 2x + 1  (7.2)
over F₇. This curve has 4a³ + 27b² = −32 + 27 = −5 ≡ 2 (mod 7), so it is nonsingular. It is tedious but
straightforward to check that the points on this curve are
E_{−2,1} = {O, (0, 1), (0, 6), (1, 0), (3, 1), (3, 6), (4, 1), (4, 6), (5, 2), (5, 5), (6, 3), (6, 4)}.  (7.3)
In general, the number of points on the curve depends on the parameters a and b. However, for large
p it is quite close to p for all curves. Specifically, a theorem of Hasse says it is p + 1 − t, where |t| ≤ 2√p.
(Note that for elliptic curves, there is a classical algorithm, Schoof's algorithm, that computes the number
of points on the curve in time poly(log p). For more general curves defined by polynomial equations over
finite fields, there are similar estimates to the one provided by Hasse's theorem, yet computing the precise
number of points may be a classically hard problem. But for some such curves, there is an efficient quantum
algorithm, Kedlaya's algorithm, for counting the number of points on the curve [61].)
It turns out that an elliptic curve defines an abelian group. Specifically, there is a binary operation +
that maps a pair of points on the curve to a new point on the curve, in a way that satisfies all the group
axioms. To motivate this definition, we go back to the case where F = R. Given two points P, Q ∈ E_{a,b},
their sum P + Q is defined geometrically, as follows. For now, assume that neither point is O. Draw a line
through the points P and Q (or, if P = Q, draw the tangent to the curve at P ), and let R denote the third
point of intersection (defined to be O if the line is vertical). Then P + Q is defined as the reflection of R
about the x axis (where the reflection of O is O). If one of P or Q is O, we draw a vertical line through the
other point, giving the result that P + O = P : O acts as the additive identity. Thus we define O + O = O.
Note that reflection about the x axis corresponds to negation, so we can think of the rule as saying that the
three points of intersection of any line with the curve sum to 0.
[Figure: the chord through P and Q intersects the curve at a third point R; reflecting R about the x axis gives P + Q.]
It turns out that this law makes E_{a,b} into an abelian group for which the identity is O and the inverse of
P = (x, y) is −P = (x, −y). By definition, it is clear that (E_{a,b}, +) is abelian (the line through P and Q does
not depend on which point is chosen first) and closed (we always choose P + Q to be some point on the curve).
The only remaining group axiom to check is associativity: we must show that (P + Q) + T = P + (Q + T).
Using a diagram of a typical curve, and picking three arbitrary points, you should be able to convince yourself
that associativity appears to hold. Actually proving it in these geometric terms requires a little algebraic
geometry.
For calculations, it is helpful to produce an algebraic description of the definition of elliptic curve point
addition. Let P = (x_P, y_P) and Q = (x_Q, y_Q). The slope of the line through P and Q (with P ≠ Q) is
λ = (y_Q − y_P)/(x_Q − x_P).  (7.4)
Thus the set of points (x, y) on this line satisfies y = λx + y₀, where y₀ = y_P − λx_P. Substituting this into (7.1)
gives the equation
x³ − λ²x² + (a − 2λy₀)x + b − y₀² = 0,  (7.5)
and solving this equation with the cubic formula shows that x_P + x_Q + x_R = λ². Thus we have
x_{P+Q} = x_R  (7.6)
= λ² − x_P − x_Q  (7.7)
= ((y_Q − y_P)/(x_Q − x_P))² − x_P − x_Q  (7.8)
and
y_{P+Q} = −y_R  (7.9)
= −(λ x_{P+Q} + y₀)  (7.10)
= λ(x_P − x_{P+Q}) − y_P  (7.11)
= ((y_Q − y_P)/(x_Q − x_P)) (x_P − x_{P+Q}) − y_P.  (7.12)
A similar formula can be derived for the case where P = Q (i.e., we are computing 2P). It is straightforward
to compute the slope of the tangent to the curve at P; if y_P = 0 then the slope is infinite, so 2P = O, but
otherwise
λ = (3x_P² + a)/(2y_P).  (7.13)
The rest of the calculation proceeds as before.
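The chord-and-tangent addition law translates directly into code for a curve over F_p (a sketch; we represent O by None and use Python's modular inverse, pow(x, -1, p), available since Python 3.8):

```python
# Chord-and-tangent point addition on the elliptic curve y^2 = x^3 + a*x + b
# over F_p, with the point at infinity O represented by None.
def ec_add(P, Q, a, p):
    if P is None:
        return Q
    if Q is None:
        return P
    (xP, yP), (xQ, yQ) = P, Q
    if xP == xQ and (yP + yQ) % p == 0:
        return None                              # vertical line: P + (-P) = O
    if P == Q:
        lam = (3 * xP * xP + a) * pow(2 * yP, -1, p) % p   # tangent slope
    else:
        lam = (yQ - yP) * pow(xQ - xP, -1, p) % p          # chord slope
    xR = (lam * lam - xP - xQ) % p
    yR = (lam * (xP - xR) - yP) % p
    return (xR, yR)

# Spot checks against the point list of the curve over F_7 given above:
assert ec_add((0, 1), (1, 0), -2, 7) == (0, 6)
assert ec_add((0, 1), (0, 1), -2, 7) == (1, 0)   # doubling
assert ec_add((1, 0), (1, 0), -2, 7) is None     # 2P = O when y_P = 0
```

Note that every output is again a point of the curve (or O), illustrating closure of the group law.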
simply multiplication in our additive notation), we can define analogs of Diffie-Hellman key exchange and
related cryptosystems such as ElGamal. The security of such a cryptosystem then relies on the assumption
that the discrete log problem on ⟨g⟩ is hard.
In practice, there are many details to consider when choosing an elliptic curve for cryptographic purposes.
Algorithms are known for calculating discrete log on supersingular and anomalous curves that run faster
than algorithms for the general case, so such curves should be avoided. Also, g should be chosen to be a
point of high order; ideally, the elliptic curve group should be cyclic, and g should be a generator. Such
curves can be found efficiently, and in the general case, it is not known how to solve the discrete log problem
over an elliptic curve classically any faster than by general methods (e.g., Pollard's rho algorithm), which
run in time O(√p).
Chapter 8
Quantum algorithms for number fields

In this and the next lecture, we will explore a natural extension of the abelian hidden subgroup problem,
namely an algorithm discovered by Hallgren for solving a quadratic diophantine equation known as Pell's
equation [54, 59]. This algorithm is interesting for at least two reasons. First, it gives an application of
quantum algorithms to a new area of mathematics, algebraic number theory (and indeed, subsequent work
has shown that quantum computers can also efficiently solve other problems in this area). Second, it extends
the solution of the abelian HSP to the case of an infinite group, namely the real numbers.
There are two main parts to the quantum algorithm for solving Pells equation. First, we define a periodic
function whose period encodes the solution to the problem. To define this function, we must introduce some
notions from algebraic number theory. Second, we show how to find the period of a black-box function
defined over the real numbers even when the period is irrational.
x² − dy² = 1  (8.1)
is known as Pell's equation. Amusingly, Pell had nothing whatsoever to do with the equation. The misattribution
is apparently due to Euler, who confused Pell with a contemporary, Brouncker, who had actually
worked on the equation. In fact, Pell's equation was studied in ancient India, where (inefficient) methods
for solving it were developed about a millennium before Pell.
The left hand side of Pell's equation can be factored as
x² − dy² = (x + y√d)(x − y√d).  (8.2)
Note that a solution (x, y) ∈ Z² of the equation can be encoded uniquely as the real number x + y√d: since
√d is irrational, x + y√d = w + z√d if and only if (x, y) = (w, z). (Otherwise we could write √d = (x − w)/(z − y), which is rational.) Thus we can also
refer to the number x + y√d as a solution of Pell's equation.
There is clearly no loss of generality in restricting our attention to positive solutions of the equation,
namely those for which x > 0 and y > 0. It is straightforward to show that if x₁ + y₁√d is a positive
solution, then (x₁ + y₁√d)ⁿ is also a positive solution for any n ∈ N. In fact, one can show that all positive
solutions are obtained in this way, where x₁ + y₁√d is the fundamental solution, the smallest positive solution
of the equation. Thus, even though Pell's equation has an infinite number of solutions, we can in a sense
find them all by finding the fundamental solution.
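The claim that powers of a solution are again solutions is easy to check numerically, using the multiplication rule (x₁ + y₁√d)(x₂ + y₂√d) = (x₁x₂ + d y₁y₂) + (x₁y₂ + y₁x₂)√d (a small sketch; the function names are ours):

```python
# Multiply solutions of Pell's equation represented as integer pairs (x, y),
# i.e. as real numbers x + y*sqrt(d), and check that powers remain solutions.
def pell_multiply(s1, s2, d):
    (x1, y1), (x2, y2) = s1, s2
    return (x1 * x2 + d * y1 * y2, x1 * y2 + y1 * x2)

def pell_power(sol, n, d):
    out = (1, 0)                      # the trivial solution 1 + 0*sqrt(d)
    for _ in range(n):
        out = pell_multiply(out, sol, d)
    return out

# d = 2: the fundamental solution (3, 2) generates (17, 12), (99, 70), ...
for n in range(1, 5):
    x, y = pell_power((3, 2), n, 2)
    assert x * x - 2 * y * y == 1
```

This works because the Pell form is multiplicative: (x² − dy²) of a product of two solutions is the product of their forms, each equal to 1.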
Some examples of fundamental solutions for various values of d are shown in the following table. Notice
that while the size of the fundamental solution generally increases with increasing d, the behavior is far
from monotonic: for example, x₁ has 44 decimal digits when d = 6009, but only 11 decimal digits when
d = 6013. But it is possible for the solutions to be very large: the size of x₁ + y₁√d is only upper bounded
by 2^{O(√d log d)}. Thus it is not even possible to write down the fundamental solution with poly(log d) bits.
d      x₁                                               y₁
2      3                                                2
3      2                                                1
5      9                                                4
⋮
13     649                                              180
14     15                                               4
⋮
6009   131634010632725315892594469510599473884013975    1698114661157803451688949237883146576681644
       (≈ 1.3 × 10⁴⁴)                                   (≈ 1.6 × 10⁴²)
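The smaller table entries can be reproduced by brute-force search (a sanity check only; this search is exponentially slow for large d, which is precisely why a better algorithm is needed):

```python
from math import isqrt

# Find the fundamental (smallest positive) solution of x^2 - d*y^2 = 1
# by trying y = 1, 2, 3, ... until 1 + d*y^2 is a perfect square.
def pell_fundamental(d):
    y = 1
    while True:
        x2 = 1 + d * y * y
        x = isqrt(x2)
        if x * x == x2:
            return (x, y)
        y += 1

assert pell_fundamental(2) == (3, 2)
assert pell_fundamental(5) == (9, 4)
assert pell_fundamental(13) == (649, 180)
```

Since y₁ can be of size 2^{O(√d log d)}, this loop is hopeless for inputs like d = 6009; the quantum algorithm instead targets the regulator defined next.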
To get around this difficulty, we define the regulator of the fundamental solution,
R := ln(x₁ + y₁√d).  (8.3)
Since R = O(√d log d), we can write down ⌈R⌉ using O(log d) bits. Now R is an irrational number, so
determining only its integer part may seem unsatisfactory. But in fact, given the integer part of R, there is
a classical algorithm to compute n digits of R in time poly(log d, n). Thus it suffices to give an algorithm
that finds the integer part of R in time poly(log d). The best known classical algorithm for this problem
takes time 2^{O(√(log d log log d))} assuming the generalized Riemann hypothesis, or time O(d^{1/4} poly(log d)) with
no such assumptions.
You can easily check that this is a field with the usual addition and multiplication operations. We can also
define an operation called conjugation, defined by
ξ = x + y√d ↦ ξ̄ := x − y√d.  (8.5)
Conjugation of elements of Q(√d) has many of the same properties as complex conjugation, and indeed
Q(√d) behaves in many respects like C, with √d taking the place of the imaginary unit i = √−1. Defining
the ring Z[√d] ⊂ Q(√d) as
Z[√d] := {x + y√d : x, y ∈ Z},  (8.6)
we see that solutions of Pell's equation correspond to ξ ∈ Z[√d] satisfying ξξ̄ = 1.
Notice that any solution of Pell's equation, ξ ∈ Z[√d], has the property that its multiplicative inverse
over Q(√d), ξ⁻¹ = ξ̄/(ξξ̄) = ξ̄, is also an element of Z[√d]. In general, an element of a ring with an inverse
that is also an element of the ring is called a unit. In Z, the only units are ±1, but in other rings it is possible
to have more units. It should not be a surprise that the set of units of Z[√d] is closely related to the set of
solutions of Pell's equation. Specifically, we have
Proposition 8.1. ξ = x + y√d is a unit in Z[√d] if and only if ξξ̄ = x² − dy² = ±1.
Proof. We have
ξ⁻¹ = ξ̄/(ξξ̄) = (x − y√d)/(x² − dy²).  (8.7)
If x² − dy² = ±1, then clearly ξ⁻¹ ∈ Z[√d]. Conversely, if ξ⁻¹ ∈ Z[√d], then so is
ξ⁻¹ ξ̄⁻¹ = 1/(ξξ̄) = ((x − y√d)(x + y√d))/(x² − dy²)² = 1/(x² − dy²),  (8.8)
which lies in Z[√d] only if x² − dy² = ±1.
8.4 A periodic function for the units of Z[√d]
Principal ideals are useful because the function mapping the ring element ξ ∈ Z[√d] to the principal ideal
ξZ[√d] is periodic, and its periodicity corresponds to the units of Z[√d]. Specifically, we have
Proposition 8.2. ξZ[√d] = ζZ[√d] if and only if ξ = ζε where ε is a unit in Z[√d].
Proof. If ε is a unit, then ξZ[√d] = ζεZ[√d] = ζZ[√d], since εZ[√d] = Z[√d] by the definition of a unit.
Conversely, suppose that ξZ[√d] = ζZ[√d]. Since ξ ∈ ξZ[√d] = ζZ[√d], there is some
μ ∈ Z[√d] satisfying ξ = ζμ. Similarly, ζ ∈ ζZ[√d] = ξZ[√d], so there is some ν ∈ Z[√d] satisfying ζ = ξν.
Thus we have ξ = ξμν. This shows that μν = 1, so μ and ν are units (indeed, ν = μ⁻¹).
Thus the function g(ξ) = ξZ[√d] is (multiplicatively) periodic with period ε₁ = x₁ + y₁√d. In other words, letting
ξ = e^z, the function
h(z) = e^z Z[√d]  (8.9)
is (additively) periodic with period R. However, we cannot simply use this function, since it is not possible
to succinctly represent the values it takes.
To define a more suitable periodic function, Hallgren uses the concept of a reduced ideal, and a way of
measuring the distance between principal ideals. The definition of a reduced ideal is rather technical, and
we will not go into the details. For our purposes, it is sufficient to note that there are only finitely many
reduced principal ideals, and in fact only O(d) of them, so we can represent a reduced principal ideal using
poly(log d) bits.
Hallgren also uses a function that measures the distance of any principal ideal from the unit ideal, Z[√d].
This function is defined as
δ(ξZ[√d]) := ln|ξ/ξ̄| mod R.  (8.10)
Notice that the unit ideal has distance δ(1·Z[√d]) = ln|1/1| mod R = 0, as required. Furthermore, the
distance function does not depend on which generator we choose to represent an ideal, since (by the above
proposition) two equivalent ideals have generators that differ by some unit ε, and
δ(εZ[√d]) = ln|ε/ε̄| mod R = ln|ε²| mod R = 2 ln|ε| mod R = 0,  (8.11)
using ε̄ = ±ε⁻¹ and the fact that ln|ε| is an integer multiple of R.
With this definition of distance, one can show that the reduced ideals are not too far apart, so that there is
a reduced ideal close to any non-reduced ideal.
The periodic function used in Hallgren's algorithm, f(z), is defined as the reduced principal ideal whose
distance from the unit ideal is maximal among all reduced principal ideals of distance at most z (together
with its distance from z mod R, to ensure that the function is one-to-one within each period). In other
words, we select the reduced principal ideal to the left of or at z.
This function is periodic with period R, and can be computed in time poly(log d). Thus it remains to
show how to perform period finding when the period of the function might be irrational.
Chapter 9
Period finding from Z to R
In the previous chapter, we defined a periodic function over R whose period is an irrational number (the
regulator) encoding the solutions of Pell's equation. Here we review Shor's approach to period finding, and
show how it can be adapted to find an irrational period.
(1/√N) Σ_{x∈{0,…,N−1}} |x⟩ ↦ (1/√N) Σ_{x∈{0,…,N−1}} |x, f(x)⟩.  (9.1)
Next we measure the second register, leaving the first register in a uniform superposition over those values
consistent with the measurement outcome. When f is periodic with minimum period r, we obtain a
superposition over points separated by the period r. The number of such points, n, depends on where the
first point, x₀ ∈ {0, 1, …, r − 1}, appears. When restricted to {0, 1, …, N − 1}, the function has ⌊N/r⌋ full
periods and N − r⌊N/r⌋ remaining points, as depicted below. Thus n = ⌊N/r⌋ + 1 if x₀ < N − r⌊N/r⌋ and
n = ⌊N/r⌋ otherwise.
[Diagram: the domain {0, 1, …, N − 1} contains ⌊N/r⌋ full periods of length r, beginning at offset x₀, followed by N − r⌊N/r⌋ remaining points.]
Discarding the measurement outcome, we are left with the quantum state
(1/√n) Σ_{j=0}^{n−1} |x₀ + jr⟩  (9.2)
where x₀ occurs nearly uniformly at random (it appears with probability n/N) and is unknown. To obtain
information about the period, we apply the Fourier transform over Z_N, giving
(1/√(nN)) Σ_{j=0}^{n−1} Σ_{k∈Z_N} ω_N^{k(x₀ + jr)} |k⟩ = (1/√(nN)) Σ_{k∈Z_N} ω_N^{kx₀} Σ_{j=0}^{n−1} ω_N^{jkr} |k⟩.  (9.3)
Now if we were lucky enough to choose a value of N for which r | N, then in fact n = N/r regardless of the
value of x₀, and the sum over j above is
Σ_{j=0}^{n−1} ω_N^{jkr} = Σ_{j=0}^{n−1} ω_n^{jk},  (9.4)
and measurement of k is guaranteed to give an integer multiple of n = N/r, with each of the r multiples
occurring with probability 1/r. But more generally, the sum over j in (9.3) is the geometric series
Σ_{j=0}^{n−1} ω_N^{jkr} = (ω_N^{krn} − 1)/(ω_N^{kr} − 1)  (9.7)
= ω_N^{(n−1)kr/2} · sin(πkrn/N)/sin(πkr/N).  (9.8)
The probability of seeing a particular value k is given by the normalization factor 1/nN times the magnitude
squared of this sum, namely
Pr(k) = sin²(πkrn/N)/(nN sin²(πkr/N)).  (9.9)
From the case where n = N/r, we expect this distribution to be strongly peaked around values of k that are
close to integer multiples of N/r. The probability of seeing k = ⌊jN/r⌉ = jN/r + η for some j ∈ Z, where
⌊x⌉ denotes the nearest integer to x and η is the rounding error, is
Pr(k = ⌊jN/r⌉) = sin²(πηrn/N)/(nN sin²(πηr/N)).  (9.10)
Now, to upper bound the denominator, we use sin²x ≤ x². To lower bound the numerator, observe that
since |η| ≤ 1/2 and rn/N ≤ 1 + O(1/n), we have |πηrn/N| ≤ π/2 + O(1/n); thus sin²(πηrn/N) ≥ c(πηrn/N)² for some
constant c (in particular, we can take c ≈ 4/π² for large n). Thus we have
Pr(k = ⌊jN/r⌉) ≥ c(πηrn/N)²/(nN (πηr/N)²)  (9.12)
= cn/N  (9.13)
≈ c/r.  (9.14)
This bound shows that Fourier sampling produces a value of k that is the closest integer to one of the r
integer multiples of N/r with probability lower bounded by a constant.
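The peaked shape of this distribution is easy to verify numerically for small parameters (a sketch; we compute the Fourier amplitudes directly from the state (9.2) rather than via a quantum simulation):

```python
import cmath

# Probability distribution over k after Fourier sampling the state
# (1/sqrt(n)) sum_j |x0 + j*r>, whether or not r divides N.
def fourier_probs(N, r, x0=0):
    n = len(range(x0, N, r))                 # number of points x0 + j*r < N
    probs = []
    for k in range(N):
        amp = sum(cmath.exp(2j * cmath.pi * k * x / N)
                  for x in range(x0, N, r)) / (n * N) ** 0.5
        probs.append(abs(amp) ** 2)
    return probs

# With N = 16 and r = 4 (so r | N), all weight sits on multiples of N/r = 4.
probs = fourier_probs(16, 4)
assert all(abs(probs[k] - 0.25) < 1e-9 for k in (0, 4, 8, 12))
assert abs(sum(probs) - 1) < 1e-9
```

Trying a value of r that does not divide N (say fourier_probs(16, 5)) shows the same peaks near multiples of N/r, now with small but nonzero weight elsewhere, matching the bound above.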
To discover r given one of the values ⌊jN/r⌉, we can divide by N to obtain a rational approximation to
j/r that deviates by at most 1/2N. Then consider the continued fraction expansion
⌊jN/r⌉/N = 1/(a₁ + 1/(a₂ + 1/(a₃ + ⋯))).  (9.15)
Truncating this expansion after a finite number of terms gives a convergent of the expansion. The convergents
provide a sequence of successively better approximations to ⌊jN/r⌉/N by fractions that can be computed
in polynomial time (see for example Knuth's The Art of Computer Programming, volume 2). Furthermore,
it can be shown that any fraction p/q with |p/q − ⌊jN/r⌉/N| < 1/2q² will appear as one of the convergents
(see for example Hardy and Wright, Theorem 184). Since j/r differs by at most 1/2N from ⌊jN/r⌉/N, the
fraction j/r will appear as a convergent provided r² < N. By taking N sufficiently large, this gives an
efficient means of recovering the period.
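The recovery of r from a sample k = ⌊jN/r⌉ can be sketched as follows (hypothetical helper names; we assume gcd(j, r) = 1, so that j/r appears in lowest terms, and take the convergent with the largest denominator q satisfying q² < N):

```python
# Recover the period r from a Fourier sample k close to j*N/r, by computing
# the convergents of the continued fraction expansion of k/N.
def convergents(p, q):
    """All convergents of the continued fraction expansion of p/q."""
    quots = []
    while q:
        quots.append(p // q)
        p, q = q, p % q
    out, (h0, h1, k0, k1) = [], (0, 1, 1, 0)
    for a in quots:
        h0, h1 = h1, a * h1 + h0     # standard convergent recurrences
        k0, k1 = k1, a * k1 + k0
        out.append((h1, k1))
    return out

def recover_period(k, N):
    best = 1
    for _, q in convergents(k, N):
        if q * q < N:                # j/r is guaranteed to appear when r^2 < N
            best = q
    return best

# Example: r = 10, N = 256, j = 3 gives the sample k = round(3*256/10) = 77.
assert convergents(77, 256)[-1] == (77, 256)
assert recover_period(77, 256) == 10
```

Here 3/10 indeed appears among the convergents of 77/256, and it is the last one whose denominator squared stays below N.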
Now measuring the second register gives, with constant probability, a value for which f is pseudo-periodic.
Say that this value is f(x₀) where 0 ≤ x₀ < r. As before, we see n = ⌊N/r⌋ + 1 points if x₀ < N − r⌊N/r⌋
or n = ⌊N/r⌋ points otherwise (possibly offset by 1 depending on how the rounding occurs for the largest
value of x, but let's not be concerned with this detail). We will write [ℓ] to denote an integer that could be
either ⌊ℓ⌋ or ⌈ℓ⌉. With this notation, we obtain
  (1/√n) Σ_{j=0}^{n−1} |x₀ + [jr]⟩.  (9.17)
We would like this to be close to the corresponding sum in the case where the offsets δ_j := [jr] − jr are zero (which,
when normalized, is Ω(1/√r) by the same calculation as in the case of period finding over Z). Consider the
deviation in amplitude,

  Σ_{j=0}^{n−1} |ω_N^{k[jr]} − ω_N^{kjr}| = Σ_{j=0}^{n−1} |ω_N^{kδ_j} − 1|  (9.20)
  = 2 Σ_{j=0}^{n−1} |sin(πkδ_j/N)|  (9.21)
  ≤ 2 Σ_{j=0}^{n−1} πk|δ_j|/N  (9.22)
  ≤ 2πkn/N.  (9.23)
At least insofar as this bound is concerned, the amplitudes may not be close for all values of k. However,
suppose we only consider values of k less than N/ log r. (We will obtain such a k with probability about
1/ log r, so we can condition on this event with only polynomial overhead.) For such a k, we have
  (1/√(nN)) |Σ_{j=0}^{n−1} ω_N^{k[jr]}| = Ω(1/√r) − O(n/(√(nN) log r))  (9.24)
  = Ω(1/√r) − O(1/(√r log r))  (9.25)
  = Ω(1/√r).  (9.26)
Thus, as in the case of period finding over Z, Fourier sampling allows us to sample from a distribution for
which some value k = ⌊jN/r⌉ (with j ∈ Z) appears with reasonably large probability (now Ω(1/poly(log r))
instead of Ω(1)).
Finally, we must obtain an approximation to r using these samples. Since r is not an integer, the
procedure used in Shor's period-finding algorithm does not suffice. However, we can perform Fourier sampling
sufficiently many times that we obtain two values ⌊jN/r⌉, ⌊j′N/r⌉ such that j and j′ are relatively prime,
again with only polynomial overhead. We prove below that if N ≥ 3r², then j/j′ is guaranteed to be one
of the convergents in the continued fraction expansion of ⌊jN/r⌉/⌊j′N/r⌉. Thus we can learn j, and hence
compute jN/⌊jN/r⌉, which gives a good approximation to r: in particular, |r − jN/⌊jN/r⌉| ≤ 1.
Lemma 9.1. If N ≥ 3r², then j/j′ appears as a convergent in the continued fraction expansion of ⌊jN/r⌉/⌊j′N/r⌉.
Furthermore, |r − jN/⌊jN/r⌉| ≤ 1.
Proof. A standard result on the theory of approximation by continued fractions says that if a, b ∈ Z with
|x − a/b| ≤ 1/(2b²), then a/b appears as a convergent in the continued fraction expansion of x (see for example
Hardy and Wright, An Introduction to the Theory of Numbers, Theorem 184). Thus it is sufficient to show
that

  |⌊jN/r⌉/⌊j′N/r⌉ − j/j′| < 1/(2j′²).  (9.27)
Letting ⌊jN/r⌉ = jN/r + δ and ⌊j′N/r⌉ = j′N/r + ε with |δ|, |ε| ≤ 1/2, we have

  ⌊jN/r⌉/⌊j′N/r⌉ − j/j′ = (jN/r + δ)/(j′N/r + ε) − j/j′  (9.28)
  = (jN + rδ)/(j′N + rε) − j/j′  (9.29)
  = r(δj′ − εj)/(j′(j′N + rε)),  (9.30)

so

  |⌊jN/r⌉/⌊j′N/r⌉ − j/j′| ≤ r(j + j′)/(2j′²N − j′r)  (9.31)
  ≤ r/(j′N − r/2),  (9.32)

where in the last step we have assumed j < j′ wlog. This is upper bounded by 1/(2j′²) provided j′N ≥
r/2 + 2j′²r, which certainly holds if N ≥ 3r² (using the fact that j′ < r).
Finally,

  r − jN/⌊jN/r⌉ = r − jN/(jN/r + δ)  (9.33)
  = r − jNr/(jN + rδ)  (9.34)
  = r²δ/(jN + rδ).  (9.35)

Since |δ| ≤ 1/2 and jN ≥ N ≥ 3r², this quantity is at most 1 in absolute value, as claimed.
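Lemma 9.1 can be exercised numerically (all values below are arbitrary test choices of ours): pick a non-integer period r, take N ≥ 3r², generate two samples ⌊jN/r⌉ and ⌊j′N/r⌉ with j, j′ coprime, and check that j/j′ appears among the convergents of their ratio and that jN/⌊jN/r⌉ is within 1 of r.

```python
from fractions import Fraction
from math import floor

def convergents(x):
    """Continued fraction convergents of a nonnegative rational x, as Fractions."""
    p0, q0, p1, q1 = 0, 1, 1, 0
    while True:
        a = floor(x)
        p0, p1 = p1, a * p1 + p0
        q0, q1 = q1, a * q1 + q0
        yield Fraction(p1, q1)
        if x == a:
            return
        x = 1 / (x - a)

r = 12.34          # non-integer period (illustrative)
N = 512            # N >= 3 r^2 = 456.8...
j, jp = 5, 7       # coprime sample indices
k = round(j * N / r)    # = 207
kp = round(jp * N / r)  # = 290

convs = list(convergents(Fraction(k, kp)))
assert Fraction(j, jp) in convs   # j/j' appears as a convergent of k/k'
r_estimate = j * N / k            # = jN / round(jN/r)
assert abs(r - r_estimate) <= 1   # within 1 of the true period
```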
Chapter 10

Quantum query complexity of the HSP

So far, we have considered the hidden subgroup problem in abelian groups. We now turn to the case where
the group might be nonabelian. We will look at some of the potential applications of the HSP, and then
show that the general problem has polynomial quantum query complexity.
In other words, f is constant on left cosets H, g1 H, g2 H, . . . of H in G, and distinct on different left cosets.
When G is a nonabelian group, we refer to this problem as the nonabelian HSP.
The nonabelian HSP is of interest not only because it generalizes the abelian case in a natural way, but
because a solution of certain nonabelian hidden subgroup problems would have particularly useful applica-
tions. The most well-known (and also the most straightforward) applications are to the graph automorphism
problem and the graph isomorphism problem, problems for which no efficient classical algorithm is currently
known.
In the graph automorphism problem, we are given a graph Γ on n vertices, and the goal is to determine
whether it has some nontrivial automorphism. In other words, we would like to know whether there is any
nontrivial permutation π ∈ Sₙ such that π(Γ) = Γ. The automorphisms of Γ form a subgroup Aut Γ ≤ Sₙ;
if Aut Γ is trivial then we say Γ is rigid. We may cast the graph automorphism problem as an HSP over
Sₙ by considering the function f(π) := π(Γ), which hides Aut Γ. If we could solve the HSP in Sₙ, then by
checking whether or not the automorphism group is trivial, we could decide graph automorphism.
In the graph isomorphism problem, we are given two graphs Γ, Γ′, each on n vertices, and our goal is to
determine whether there is any permutation π ∈ Sₙ such that π(Γ) = Γ′, in which case we say that Γ and
Γ′ are isomorphic. We can cast graph isomorphism as an HSP in the wreath product Sₙ ≀ S₂ ≤ S₂ₙ, the
subgroup of S₂ₙ generated by permutations of the first n points, permutations of the second n points, and
swapping the two sets of points. Writing elements of Sₙ ≀ S₂ in the form (σ, τ, b) where σ, τ ∈ Sₙ represent
permutations of Γ, Γ′, respectively, and b ∈ {0, 1} denotes whether to swap the two graphs, we can define a
function

  f(σ, τ, b) := (σ(Γ), τ(Γ′)) if b = 0;  (σ(Γ′), τ(Γ)) if b = 1.  (10.2)

This function hides the automorphism group of the disjoint union of Γ and Γ′, which contains an element
that swaps the two graphs if and only if they are isomorphic. In particular, if Γ and Γ′ are rigid (which
seems to be the hardest case for the HSP approach to graph isomorphism), the hidden subgroup is trivial
when Γ, Γ′ are non-isomorphic; and has order two, with its nontrivial element the involution (σ, σ⁻¹, 1), when
Γ′ = σ(Γ).
The second major potential application of the hidden subgroup problem is to lattice problems. An n-
dimensional lattice is the set of all integer linear combinations of n linearly independent vectors in Rn (a basis
for the lattice). In the shortest vector problem, we are asked to find a shortest nonzero vector in the lattice.
In particular, in the g(n)-unique shortest vector problem, we are promised that the shortest nonzero vector is
unique (up to its sign), and is shorter than any other non-parallel vector by a factor g(n). This problem can
be solved in polynomial time on a classical computer if g(n) is sufficiently large (say, if it is exponentially
large), and is NP-hard if g(n) = O(1). Less is known about intermediate cases, but the problem is suspected
to be classically hard even for g(n) = poly(n), to the extent that cryptosystems have been designed based
on this assumption.
Regev showed that an efficient quantum algorithm for the dihedral hidden subgroup problem based on
the so-called standard method (described below) could be used to solve the poly(n)-unique shortest vector
problem. Such an algorithm would be significant since it would break lattice cryptosystems, which are some
of the few proposed cryptosystems that are not compromised by Shor's algorithm.
So far, only the symmetric and dihedral hidden subgroup problems are known to have significant ap-
plications. Nevertheless, there has been considerable interest in understanding the complexity of the HSP
for general groups. There are at least three reasons for this. First, the problem is simply of fundamental
interest: it appears to be a natural setting for exploring the extent of the advantage of quantum computers
over classical ones. Second, techniques developed for other HSPs may eventually find application to the sym-
metric or dihedral groups. Finally, exploring the limitations of quantum computers for HSPs may suggest
cryptosystems that could be robust even to quantum attacks.
We then compute the value f (g) in an ancilla register, giving the state
  (1/√|G|) Σ_{g∈G} |g, f(g)⟩.  (10.4)
Finally, we measure the second register and discard the result (or equivalently, simply discard the second
register). If we obtain the outcome s ∈ S, then the state is projected onto the uniform superposition of those
g ∈ G such that f(g) = s, which by the definition of f is simply some left coset of H. Since every coset
contains the same number of elements, each left coset occurs with equal probability. Thus this procedure
produces the coset state

  |gH⟩ := (1/√|H|) Σ_{h∈H} |gh⟩, with g ∈ G uniformly random  (10.5)
(or, equivalently, we can view g as being chosen uniformly at random from some left transversal of H in G).
Depending on context, it may be more convenient to view the outcome either as a random pure state, or
equivalently, as the mixed quantum state
  ρ_H := (1/|G|) Σ_{g∈G} |gH⟩⟨gH|,  (10.6)
which we refer to as a hidden subgroup state. In the standard approach to the hidden subgroup problem, we
attempt to determine H using samples of this hidden subgroup state. In other words, given ρ_H^{⊗k} for some
k = poly(log |G|), we try to find a generating set for H.
In fact, by the minimax theorem, this holds even without assuming a prior distribution for the ensemble.
Given only one copy of the hidden subgroup state, (10.8) will typically give only a trivial bound. However,
by taking multiple copies of the hidden subgroup states, we can ensure that the overall states are nearly
orthogonal, and hence distinguishable. In particular, using k copies of ρ, we see that there is a measurement
for identifying ρ with probability at least

  1 − N √(max_{i≠j} F(ρ_i^{⊗k}, ρ_j^{⊗k})) = 1 − N √(max_{i≠j} F(ρ_i, ρ_j)^k)  (10.9)

(since the fidelity is multiplicative under tensor products). Setting this expression equal to 1 − ε and solving
for k, we see that error probability at most ε can be achieved provided we use

  k ≥ 2(log N − log ε) / log(1/max_{i≠j} F(ρ_i, ρ_j))  (10.10)

copies of ρ.
Provided that G does not have too many subgroups, and that the fidelity between two distinct hidden
subgroup states is not too close to 1, this shows that polynomially many copies of ρ_H suffice to solve the
HSP. The total number of subgroups of G is 2^{O(log² |G|)}, which can be seen as follows. Any group K can be
specified in terms of at most log₂ |K| generators, since every additional (non-redundant) generator increases
the size of the group by at least a factor of 2. Since every subgroup of G can be specified by a subset of
at most log₂ |G| elements of G, the number of subgroups of G is upper bounded by |G|^{log₂ |G|} = 2^{(log₂ |G|)²}.
This shows that we can take log N = poly(log |G|) in (10.10). Thus k = poly(log |G|) copies of ρ_H suffice
to identify H with constant probability provided the maximum fidelity is bounded away from 1 by at least
1/poly(log |G|).
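The subgroup-counting bound can be checked by brute force on a tiny example of our choosing, G = S₃: a nonempty subset of a finite group that is closed under multiplication is automatically a subgroup, so it suffices to test closure.

```python
from itertools import permutations, combinations
from math import log2

# Brute-force check of  #subgroups(G) <= |G|^{log2 |G|}  for G = S_3.
G = list(permutations(range(3)))

def mul(p, q):
    # composition of permutations of {0,1,2}: (p*q)(i) = p[q[i]]
    return tuple(p[q[i]] for i in range(3))

def closed(S):
    return all(mul(a, b) in S for a in S for b in S)

subgroups = [set(S) for size in (1, 2, 3, 6)        # divisors of |G| = 6
             for S in combinations(G, size) if closed(set(S))]
assert len(subgroups) == 6        # {e}, three order-2 subgroups, A_3, S_3
assert len(subgroups) <= 6 ** log2(6)
```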
To upper bound the fidelity between two states ρ, ρ′, consider the two-outcome measurement {Π, 1 − Π},
where Π is the projector onto the support of ρ. The classical fidelity of the resulting distributions is an
upper bound on the quantum fidelity, so

  F(ρ, ρ′) ≤ √(tr(Πρ) tr(Πρ′)) + √(tr((1 − Π)ρ) tr((1 − Π)ρ′))  (10.11)
  = √(tr(Πρ′)),  (10.12)

since tr(Πρ) = 1 and tr((1 − Π)ρ) = 0.
  ρ_H = (1/|G|) Σ_{g∈G} |gH⟩⟨gH| = (|H|/|G|) Σ_{g∈T_H} |gH⟩⟨gH|,  (10.13)

where T_H denotes some left transversal of H in G. Since the right hand expression is a spectral decomposition
of ρ_H (the states |gH⟩ for g ∈ T_H are orthonormal), the projector onto the support of ρ_H is

  Π_H = Σ_{g∈T_H} |gH⟩⟨gH| = (1/|H|) Σ_{g∈G} |gH⟩⟨gH|.  (10.14)
Then we have

  F(ρ_H, ρ_{H′})² ≤ tr(Π_H ρ_{H′})  (10.15)
  = (1/(|H| |G|)) Σ_{g,g′∈G} |⟨gH|g′H′⟩|²  (10.16)
  = (1/(|H| |G|)) Σ_{g,g′∈G} |gH ∩ g′H′|²/(|H| |H′|)  (10.17)
  = (1/(|G| |H|² |H′|)) Σ_{g,g′∈G} |gH ∩ g′H′|².  (10.18)
Now gH ∩ g′H′ is nonempty if and only if g′ ∈ gHH′, in which case it is a left coset of H ∩ H′, so

  Σ_{g,g′∈G} |gH ∩ g′H′|² = |G| |HH′| |H ∩ H′|².  (10.22)
Thus, using the identity |HH′| = |H| |H′|/|H ∩ H′|, we have

  F(ρ_H, ρ_{H′})² ≤ |G| |H| |H′| |H ∩ H′| / (|G| |H|² |H′|)  (10.24)
  = |H ∩ H′|/|H|  (10.25)
  ≤ 1/2,  (10.26)

where the last step holds because H ∩ H′ is a proper subgroup of H (exchanging the roles of H and H′ if
necessary, which we may do by the symmetry of the fidelity).
This shows that F(ρ_H, ρ_{H′}) ≤ 1/√2 for any distinct subgroups H, H′, thereby establishing that the query complexity of the HSP is poly(log |G|).
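The chain of (in)equalities above can be verified numerically on a small example of our own, G = Z₂ × Z₂ with H = ⟨(0,1)⟩ and H′ = ⟨(1,0)⟩: the overlap tr(Π_H ρ_{H′}) equals |H ∩ H′|/|H| = 1/2, and the Uhlmann fidelity respects the 1/√2 bound.

```python
import numpy as np
from itertools import product

# Hidden subgroup states for G = Z_2 x Z_2 (our worked example).
G = list(product(range(2), repeat=2))
idx = {g: i for i, g in enumerate(G)}

def coset_state(g, H):
    v = np.zeros(len(G))
    for h in H:
        v[idx[((g[0] + h[0]) % 2, (g[1] + h[1]) % 2)]] = 1
    return v / np.sqrt(len(H))

def rho(H):
    return sum(np.outer(coset_state(g, H), coset_state(g, H)) for g in G) / len(G)

def sqrtm_psd(M):
    w, V = np.linalg.eigh(M)
    return V @ np.diag(np.sqrt(np.clip(w, 0, None))) @ V.T

H1, H2 = [(0, 0), (0, 1)], [(0, 0), (1, 0)]
r1, r2 = rho(H1), rho(H2)

w, V = np.linalg.eigh(r1)
P = V[:, w > 1e-12] @ V[:, w > 1e-12].T          # projector onto supp(rho_H1)
assert np.isclose(np.trace(P @ r2), 0.5)         # tr(Pi_H rho_H') = |H∩H'|/|H|

# Uhlmann fidelity F = || sqrt(r1) sqrt(r2) ||_1 (sum of singular values)
F = np.linalg.svd(sqrtm_psd(r1) @ sqrtm_psd(r2), compute_uv=False).sum()
assert F <= 1 / np.sqrt(2) + 1e-9                # fidelity bound (10.26)
```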
Chapter 11

Fourier analysis in nonabelian groups
We have seen that hidden subgroup states contain sufficient information to determine the hidden subgroup.
Now we would like to know whether this information can be extracted efficiently. In this lecture, we will
introduce the theory of Fourier analysis over general groups, an important tool for getting a handle on this
problem.
Another way to combine two representations is with the tensor product. The tensor product of σ: G → V
and σ′: G → V′ is σ ⊗ σ′: G → V ⊗ V′, a representation of G of dimension d_{σ⊗σ′} = d_σ d_{σ′}.
The character of a representation σ is the function χ_σ: G → C defined by χ_σ(x) := tr σ(x). We have
χ_σ(1) = d_σ (since σ(1) is I_{d_σ}, the d_σ-dimensional identity matrix),
χ_σ(x⁻¹) = χ_σ(x)* (since we can assume that σ is unitary), and
χ_σ(yx) = χ_σ(xy) for all x, y ∈ G (since the trace is cyclic).
In particular, χ_σ(yxy⁻¹) = χ_σ(x), so characters are constant on conjugacy classes. For two representations
σ, σ′, we have χ_{σ⊕σ′} = χ_σ + χ_{σ′} and χ_{σ⊗σ′} = χ_σ χ_{σ′}.
The most useful result in representation theory is probably Schur's Lemma, which can be stated as
follows:
Theorem 11.1 (Schur's Lemma). Let σ and σ′ be two irreducible representations of G, and let M ∈ C^{d_σ × d_{σ′}}
be a matrix satisfying σ(x)M = Mσ′(x) for all x ∈ G. Then if σ ≇ σ′, M = 0; and if σ = σ′, M is a scalar
multiple of the identity matrix.
Schurs Lemma can be used to prove the following orthogonality relation for irreducible representations:
Theorem 11.2 (Orthogonality of irreps). For two irreps σ and σ′ of G, we have

  (d_σ/|G|) Σ_{x∈G} σ(x)*_{i,j} σ′(x)_{i′,j′} = δ_{σ,σ′} δ_{i,i′} δ_{j,j′}.  (11.2)
The characters of G supply an orthonormal basis for the space of class functions, functions that are constant
on conjugacy classes of G. (Recall that the characters themselves are class functions.) This is expressed by
the orthonormality of the character table of G, the square matrix whose rows are labeled by irreps, whose
columns are labeled by conjugacy classes, and whose entries are the corresponding characters. The character
orthogonality theorem says that the rows of this matrix are orthonormal, provided each entry is weighted
by the square root of the size of the corresponding conjugacy class divided by |G|. In fact the columns are
orthonormal in the same sense.
Any representation of G can be broken up into its irreducible components. The regular representations
of G are useful for understanding such decompositions, since they contain every possible irreducible representation
of G, with each irrep occurring a number of times equal to its dimension. Let Ĝ denote a complete
set of irreps of G (which are unique up to isomorphism). Then we have

  L ≅ ⊕_{σ∈Ĝ} σ ⊗ I_{d_σ},  R ≅ ⊕_{σ∈Ĝ} I_{d_σ} ⊗ σ*.  (11.4)

In fact, this holds with the same isomorphism for both L and R, since the left and right regular representations
commute. This isomorphism is simply the Fourier transform over G, which we discuss further below.
Considering χ_L(1) = χ_R(1) = |G| and using this decomposition, we find the well-known identity

  Σ_{σ∈Ĝ} d_σ² = |G|.  (11.5)

Also, noting that χ_L(x) = χ_R(x) = 0 for any x ∈ G \ {1}, we see that

  Σ_{σ∈Ĝ} d_σ χ_σ(x) = 0.  (11.6)
Characters also provide a simple test for irreducibility: for any representation σ, (χ_σ, χ_σ) is a positive integer,
and is equal to 1 if and only if σ is irreducible.
Any representation of G can also be viewed as a representation of any subgroup H G, simply by
restricting its domain to elements of H. We denote the resulting restricted representation by Res^G_H σ. Even
if σ is irreducible over G, it may not be irreducible over H.
(If σ is one-dimensional, then |σ(x)⟩ is simply a phase factor σ(x) = χ_σ(x) ∈ C with |σ(x)| = 1.) The Fourier
transform over G is the unitary matrix

  F_G := Σ_{x∈G} |x̂⟩⟨x|  (11.10)
  = Σ_{x∈G} Σ_{σ∈Ĝ} √(d_σ/|G|) Σ_{j,k=1}^{d_σ} σ(x)_{j,k} |σ, j, k⟩⟨x|.  (11.11)
Note that the Fourier transform over G is not uniquely defined, but rather, depends on a choice of basis for
each irreducible representation.
It is straightforward to check that F_G is indeed a unitary transformation. Using the identity
⟨σ(y)|σ(x)⟩ = χ_σ(y⁻¹x)/d_σ, we have

  ⟨ŷ|x̂⟩ = Σ_{σ∈Ĝ} (d_σ²/|G|) ⟨σ(y)|σ(x)⟩  (11.15)
  = Σ_{σ∈Ĝ} (d_σ/|G|) χ_σ(y⁻¹x),  (11.16)

which equals 1 if x = y (by (11.5)) and 0 otherwise (by (11.6)).
F_G is precisely the transformation that decomposes both the left and right regular representations of G
into their irreducible components. Let us check this explicitly for the left regular representation L. Recall
that this representation satisfies L(x)|y⟩ = |xy⟩, so we have

  F_G L(x) F_G† = Σ_{y∈G} Σ_{σ,σ′∈Ĝ} Σ_{j,k,ℓ=1}^{d_σ} Σ_{j′,k′=1}^{d_{σ′}} (√(d_σ d_{σ′})/|G|) σ(x)_{j,ℓ} σ(y)_{ℓ,k} σ′(y)*_{j′,k′} |σ, j, k⟩⟨σ′, j′, k′|  (11.20)
  = Σ_{σ∈Ĝ} Σ_{j,k,ℓ=1}^{d_σ} σ(x)_{j,ℓ} |σ, j, k⟩⟨σ, ℓ, k|  (11.21)
  = ⊕_{σ∈Ĝ} σ(x) ⊗ I_{d_σ},  (11.22)

where we have used the orthogonality relation for irreducible representations (Theorem 11.2) to evaluate
the sum over y.
A similar calculation can be done for the right regular representation defined by R(x)|y⟩ = |yx⁻¹⟩, giving

  F_G R(x) F_G† = ⊕_{σ∈Ĝ} I_{d_σ} ⊗ σ(x)*.  (11.23)

This identity will be useful when analyzing the application of the quantum Fourier transform to the hidden
subgroup problem.
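The decomposition (11.22) can be checked concretely for G = S₃ (this small worked example, including the particular basis chosen for the 2-dimensional irrep, is ours): build F_G from the trivial, sign, and standard representations, and verify that it is unitary and block-diagonalizes the left regular representation.

```python
import numpy as np
from itertools import permutations

G = list(permutations(range(3)))
n = len(G)  # |S_3| = 6

def perm_matrix(p):
    M = np.zeros((3, 3))
    for i in range(3):
        M[p[i], i] = 1
    return M

def sign(p):
    s = 1
    for i in range(3):
        for j in range(i + 1, 3):
            if p[i] > p[j]:
                s = -s
    return s

# Orthonormal basis of the plane orthogonal to (1,1,1); restricting the
# permutation matrices to this plane gives the standard 2-dim irrep.
B = np.array([[1, 1], [-1, 1], [0, -2]]) / np.sqrt([2, 6])
irreps = [
    (1, lambda p: np.array([[1.0]])),
    (1, lambda p: np.array([[float(sign(p))]])),
    (2, lambda p: B.T @ perm_matrix(p) @ B),
]

# F_G[(sigma,j,k), x] = sqrt(d_sigma/|G|) * sigma(x)_{jk}, as in (11.11)
rows = []
for d, rep in irreps:
    for j in range(d):
        for k in range(d):
            rows.append([np.sqrt(d / n) * rep(x)[j, k] for x in G])
F = np.array(rows)
assert np.allclose(F @ F.conj().T, np.eye(n))   # F_G is unitary

def mul(x, y):
    return tuple(x[y[i]] for i in range(3))

def L(x):  # left regular representation: L(x)|y> = |xy>
    M = np.zeros((n, n))
    for col, y in enumerate(G):
        M[G.index(mul(x, y)), col] = 1
    return M

# Check (11.22): F L(x) F^dagger = blockdiag over irreps of sigma(x) ⊗ I_d
for x in G:
    expected = np.zeros((n, n))
    i = 0
    for d, rep in irreps:
        b = np.kron(rep(x), np.eye(d))
        expected[i:i + d * d, i:i + d * d] = b
        i += d * d
    assert np.allclose(F @ L(x) @ F.conj().T, expected)
```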
To use the Fourier transform as part of a quantum computation, we must be able to implement it efficiently
by some quantum circuit. Efficient quantum circuits for the quantum Fourier transform are known for many,
but not all, nonabelian groups. Groups for which an efficient QFT is known include metacyclic groups (i.e.,
semidirect products of cyclic groups), such as the dihedral group; the symmetric group; and many families
of groups that have suitably well-behaved towers of subgroups. There are a few notable groups for which
efficient QFTs are not known, such as the general linear group GL_n(q) of n × n invertible matrices over F_q,
the finite field with q elements.
Chapter 12
Fourier sampling
In this lecture, we will see how the Fourier transform can be used to simplify the structure of the states
obtained in the standard approach to the hidden subgroup problem. In particular, we will see how weak
Fourier sampling is sufficient to identify any normal hidden subgroup (generalizing the solution of the abelian
HSP). We will also briefly discuss the potential of strong Fourier sampling to go beyond the limitations of
weak Fourier sampling.
where each g ∈ G occurs uniformly at random; or equivalently, the hidden subgroup state

  ρ_H := (1/|G|) Σ_{g∈G} |gH⟩⟨gH|.  (12.2)
The symmetry of such a state can be exploited using the quantum Fourier transform. In particular, we
have

  |gH⟩ = (1/√|H|) Σ_{h∈H} R(h)|g⟩  (12.3)

where R is the right regular representation of G. Thus the hidden subgroup state can be written

  ρ_H = (1/(|G| |H|)) Σ_{g∈G} Σ_{h,h′∈H} R(h)|g⟩⟨g|R(h′)†  (12.4)
  = (1/(|G| |H|)) Σ_{h,h′∈H} R(hh′⁻¹)  (12.5)
  = (1/|G|) Σ_{h∈H} R(h).  (12.6)
Since the right regular representation is block-diagonal in the Fourier basis, the same is true of ρ_H. In
particular, we have

  ρ̂_H := F_G ρ_H F_G†  (12.7)
  = (1/|G|) ⊕_{σ∈Ĝ} I_{d_σ} ⊗ σ(H)*,  (12.8)

where

  σ(H) := Σ_{h∈H} σ(h).  (12.9)
Since ρ̂_H is block diagonal, with blocks labeled by irreducible representations, we may now measure
the irrep label without loss of information. This procedure is referred to as weak Fourier sampling. The
probability of observing representation σ ∈ Ĝ under weak Fourier sampling is

  Pr(σ) = (1/|G|) tr(I_{d_σ} ⊗ σ(H)*)  (12.10)
  = (d_σ/|G|) Σ_{h∈H} χ_σ(h)*  (12.11)
  = (d_σ |H|/|G|) (χ_σ, 1)_H,  (12.12)

or in other words, d_σ |H|/|G| times the number of times the trivial representation appears in Res^G_H σ, the
restriction of σ to H. We may now ask whether polynomially many samples from this distribution are
sufficient to determine H, and if so, whether H can be reconstructed from this information efficiently.
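The distribution (12.12) is easy to tabulate from a character table. As a small worked example of ours, take G = S₃ and the normal hidden subgroup H = A₃; weak Fourier sampling then returns the trivial and sign representations with probability 1/2 each, and never the standard representation.

```python
from fractions import Fraction

# Character table of S_3 by conjugacy class: e, transpositions, 3-cycles.
chars = {
    'trivial':  {'e': 1, 'transposition': 1, '3-cycle': 1},
    'sign':     {'e': 1, 'transposition': -1, '3-cycle': 1},
    'standard': {'e': 2, 'transposition': 0, '3-cycle': -1},
}
dims = {'trivial': 1, 'sign': 1, 'standard': 2}
order = 6
# How many elements of H = A_3 lie in each class: e once, both 3-cycles.
in_H = {'e': 1, 'transposition': 0, '3-cycle': 2}

def pr(sigma):
    s = sum(in_H[c] * chars[sigma][c] for c in in_H)  # sum over h in H of chi(h)
    return Fraction(dims[sigma] * s, order)           # (d_sigma/|G|) sum chi

dist = {s: pr(s) for s in chars}
assert sum(dist.values()) == 1
# H lies in ker(trivial) and ker(sign) but not in ker(standard):
assert dist == {'trivial': Fraction(1, 2), 'sign': Fraction(1, 2),
                'standard': Fraction(0)}
```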
When H is a normal subgroup of G, this distribution simplifies: Pr(σ) = d_σ² |H|/|G| if H ≤ ker σ, and
Pr(σ) = 0 otherwise,  (12.14)
where ker σ := {g ∈ G : σ(g) = I_{d_σ}} is the kernel of the representation σ (a normal subgroup of G). To see
this, note that if H ≰ ker σ, then there is some h₀ ∈ H with σ(h₀) ≠ 1; but then σ(h₀)σ(H) = Σ_{h∈H} σ(h₀h) =
σ(H), and since σ(h₀) is unitary and σ(H) is a scalar multiple of the identity (by Schur's Lemma, as σ(H)
commutes with σ(g) for every g ∈ G when H is normal), this can only be satisfied if in
fact σ(H) = 0. On the other hand, if H ≤ ker σ, then χ_σ(h) = d_σ for all h ∈ H, and the result is immediate.
To find H, we can simply proceed as in the abelian case: perform weak Fourier sampling O(log |G|) times
and compute the intersection of the kernels of the resulting irreps (assuming this can be done efficiently).
Again, it is clear that the resulting subgroup contains H, and we claim that it is equal to H with high
probability. For suppose that at some stage during this process, the intersection of the kernels is K ⊴ G with
K ≠ H; then the probability of obtaining an irrep σ for which K ≤ ker σ is

  Σ_{σ: K ≤ ker σ} (|H|/|G|) d_σ² = |H|/|K| ≤ 1/2,  (12.15)
where we have used the fact that the distribution (12.14) remains normalized if H is replaced by any normal
subgroup of G. Since each repetition of weak Fourier sampling has a probability of at least 1/2 of cutting
the intersection of the kernels at least in half, O(log |G|) repetitions suffice to converge to H with substantial
probability. In fact, applying the same approach when H is not necessarily normal in G gives an algorithm
to find the normal core of H, the largest subgroup of H that is normal in G.
This algorithm can be applied to find hidden subgroups in groups that are close to abelian in a certain
sense. In particular, Grigni et al. showed that if κ(G), the intersection of the normalizers of all subgroups of
G, is sufficiently large (specifically, if |G|/|κ(G)| = 2^{O(log^{1/2} n)}, such as when G = Z_3 ⋊ Z_{2^n}), then the HSP
in G can be solved in polynomial time [49]. The idea is simply to apply the algorithm for normal subgroups
to the restriction of G to all subgroups containing κ(G); the union of all subgroups obtained in this way
gives the hidden subgroup with high probability. This result was subsequently improved (by Gavinsky) to
give a polynomial-time quantum algorithm whenever |G|/|κ(G)| = poly(log |G|).
  ρ̂_{H,σ} := σ(H)* / Σ_{h∈H} χ_σ(h)*.  (12.16)
In fact, this state is proportional to a projector whose rank is simply the number of times the trivial
representation appears in Res^G_H σ. This follows because

  σ(H)² = Σ_{h,h′∈H} σ(hh′) = |H| σ(H),  (12.17)

which gives

  ρ̂²_{H,σ} = (|H| / Σ_{h∈H} χ_σ(h)*) ρ̂_{H,σ}.  (12.18)
values of p, q, unlike the case q < p1 mentioned above, measurement in a random basis is information-
theoretically sufficient. Indeed, we do not know of any example of an HSP for which strong Fourier sampling
succeeds, yet random strong Fourier sampling fails; it would be interesting to find any such example (or to
prove that none exists).
Note that simply finding an informative basis is not sufficient; it is also important that the measurement
results can be efficiently post-processed. This issue arises not only in the context of measurement in a
pseudo-random basis, but also in the context of certain explicit bases. For example, Ettinger and Høyer
gave a basis for the dihedral HSP in which a measurement gives sufficient classical information to infer the
hidden subgroup, but no efficient means of post-processing this information is known [39].
For some groups, it turns out that strong Fourier sampling simply fails. Moore, Russell, and Schulman
showed that, regardless of what basis is chosen, strong Fourier sampling provides insufficient information
to solve the HSP in the symmetric group [72]. Specifically, they showed that for any measurement basis
(indeed, for any POVM applied to a hidden subgroup state), the distribution of outcomes in the cases where
the hidden subgroup is trivial and where the hidden subgroup is an involution are exponentially close. Thus,
in general one has to consider entangled measurements on multiple copies of the hidden subgroup states.
(Indeed, entangled measurements on Ω(log |G|) copies may be necessary, as Hallgren et al. showed for the
symmetric group [51].) In the next two lectures, we will see some examples of quantum algorithms for the
HSP that make use of entangled measurements.
Chapter 13

Kuperberg's algorithm for the dihedral HSP

We now discuss a quantum algorithm for the dihedral hidden subgroup problem. No polynomial-time algorithm
for this problem is known. However, Kuperberg gave a quantum algorithm that runs in subexponential
(though superpolynomial) time: specifically, it runs in time 2^{O(√(log |G|))} [64].
(In particular, this shows that the dihedral group is the semidirect product Z_N ⋊ Z_2, where φ: Z_2 →
Aut(Z_N) is defined by φ(a)(y) = (−1)^a y.) It is also easy to see that the group inverse is (x, a)⁻¹ = ((−1)^{a+1} x, a).
The subgroups of D_N are either cyclic or dihedral. The possible cyclic subgroups are of the form ⟨(x, 0)⟩
where x ∈ Z_N is either 0 or some divisor of N. The possible dihedral subgroups are of the form ⟨(y, 1)⟩ where
y ∈ Z_N, and of the form ⟨(x, 0), (y, 1)⟩ where x ∈ Z_N is some divisor of N and y ∈ Z_x. A result of Ettinger
and Høyer reduces the general dihedral HSP, in which the hidden subgroup could be any of these possibilities,
to the dihedral HSP with the promise that the hidden subgroup is of the form ⟨(y, 1)⟩ = {(0, 0), (y, 1)}, i.e.,
a subgroup of order 2 generated by the reflection (y, 1).
The basic idea of the Ettinger-Høyer reduction is as follows. Suppose that f: D_N → S hides a subgroup
H = ⟨(x, 0), (y, 1)⟩. Then we can consider the function f restricted to elements from the abelian group
Z_N × {0} ≤ D_N. This restricted function hides the subgroup ⟨(x, 0)⟩, and since the restricted group is
abelian, we can find x efficiently using Shor's algorithm. Now ⟨(x, 0)⟩ ⊴ D_N (since (z, a)(x, 0)(z, a)⁻¹ =
(z + (−1)^a x, a)((−1)^{a+1} z, a) = ((−1)^a x, 0) ∈ Z_N × {0}), so we can define the quotient group D_N/⟨(x, 0)⟩.
But this is simply a dihedral group (of order 2N/(N/x) = 2x), and if we now define a function f′ as f
evaluated on some coset representative, it hides the subgroup ⟨(y, 1)⟩. Thus, in the rest of this lecture, we
will assume that the hidden subgroup is of the form ⟨(y, 1)⟩ for some y ∈ Z_N without loss of generality.
  |(z, 0){(0, 0), (y, 1)}⟩ = (1/√2)(|z, 0⟩ + |y + z, 1⟩).  (13.7)
We would like to determine y using samples of this state.
We have seen that to distinguish coset states in general, one should start by performing weak Fourier
sampling: apply a Fourier transform over G and then measure the irrep label. However, in this case we will
instead simply Fourier transform the first register over ZN , leaving the second register alone. It is possible
to show that measuring the first register of the resulting state is essentially equivalent to performing weak
Fourier sampling over DN (and discarding the row register), but for simplicity we will just consider the
abelian procedure.
Fourier transforming the first register over Z_N, we obtain

  (F_{Z_N} ⊗ I_2)|(z, 0)H⟩ = (1/√(2N)) Σ_{k∈Z_N} (ω_N^{kz} |k, 0⟩ + ω_N^{k(y+z)} |k, 1⟩)  (13.8)
  = (1/√N) Σ_{k∈Z_N} ω_N^{kz} |k⟩ ⊗ (1/√2)(|0⟩ + ω_N^{ky} |1⟩).  (13.9)
If we then measure the first register, we obtain one of the N values of k uniformly at random, and we are
left with the post-measurement state

  |ψ_k⟩ := (1/√2)(|0⟩ + ω_N^{yk} |1⟩).  (13.10)
Thus we are left with the problem of determining y given the ability to produce single-qubit states |ψ_k⟩ of
this form (where k is known).
Then a measurement on the second qubit leaves the first qubit in the state |ψ_{p±q}⟩ (up to an irrelevant global
phase), with the + sign occurring when the outcome is 0 and the − sign occurring when the outcome is 1,
each outcome occurring with probability 1/2.
This combination operation has a nice representation-theoretic interpretation: the state indices p and q
can be viewed as labels of irreducible representations of D_N, and the extraction of |ψ_{p±q}⟩ can be viewed as
decomposing their tensor product (a reducible representation of D_N) into one of two irreducible components.
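The combination step can be verified on explicit two-qubit state vectors (a simulation of ours; the values of N, y, p, q are arbitrary): apply a CNOT to |ψ_p⟩ ⊗ |ψ_q⟩ and project the second qubit onto |0⟩ or |1⟩.

```python
import numpy as np

N, y = 64, 23                      # arbitrary illustrative values
w = np.exp(2j * np.pi / N)

def psi(k):
    return np.array([1, w ** (y * k)]) / np.sqrt(2)

p, q = 11, 30
state = np.kron(psi(p), psi(q))    # basis order |00>, |01>, |10>, |11>
cnot = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])    # control = first qubit, target = second
state = cnot @ state

branch0, branch1 = state[[0, 2]], state[[1, 3]]        # second qubit = 0 / 1
assert np.isclose(np.linalg.norm(branch0) ** 2, 0.5)   # each outcome w.p. 1/2

def same_up_to_phase(a, b):
    return np.isclose(abs(np.vdot(a, b)), 1)

assert same_up_to_phase(branch0 / np.linalg.norm(branch0), psi(p + q))
assert same_up_to_phase(branch1 / np.linalg.norm(branch1), psi(p - q))
```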
This analysis is not quite correct because we do not obtain precisely a 1/8 fraction of the paired states
for use in the next stage. For most of the stages, we have many more than 2 2m states, so nearly all of them
can be paired, and the expected fraction remaining for the next stage is close to 1/4. Of course, the precise
fraction will experience statistical fluctuations. However, since we are working with a large number of states,
the deviations from the expected values are very small, and a more careful analysis (using the Chernoff
bound) shows that the procedure succeeds with high probability. For a detailed argument, see section 3.1 of
Kuperberg's paper (SICOMP version). That paper also gives an improved algorithm that runs faster and
that works for general N .
Note that this algorithm uses not only superpolynomial time, but also superpolynomial space, since all
Θ(16^√n) coset states are present at the start of the algorithm. However, by creating a smaller number of
coset states at a time and combining them according to the solution of a subset sum problem, Regev showed
how to make the space requirement polynomial with only a slight increase in the running time [77, 32].
If we postselect on obtaining this outcome (which happens with probability 1/2 over the uniformly random
value of k, assuming y ≠ 0), then we effectively obtain each value k ∈ Z_N with probability Pr(k|+) =
(2/N) cos²(πyk/N). It is not hard to show that these distributions are statistically far apart for different values of y,
so that they can in principle be distinguished with only polynomially many samples. However, no efficient
(or even subexponential time) classical (or even quantum) algorithm for doing so is known.
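These two claims about Pr(k|+) are easy to check numerically (the values of N and y below are our arbitrary choices): the distribution is normalized, and the total variation distance between the distributions for two different values of y is bounded away from zero.

```python
import numpy as np

N = 128
k = np.arange(N)

def dist(y):
    # Pr(k|+) = (2/N) cos^2(pi * y * k / N)
    return (2 / N) * np.cos(np.pi * y * k / N) ** 2

for y in (1, 5, 17):
    assert np.isclose(dist(y).sum(), 1)   # normalized for y != 0 mod N

tv = 0.5 * np.abs(dist(3) - dist(11)).sum()   # total variation distance
assert tv > 0.1                               # statistically far apart
```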
Chapter 14

The HSP in the Heisenberg group

We showed that the quantum query complexity of the general hidden subgroup problem is poly(log |G|),
by measuring ρ_H^{⊗k} using a particular measurement strategy (the pretty good measurement) that
identifies H with high probability. One strategy for finding an efficient quantum algorithm for the HSP is
to find an efficient way of implementing that particular measurement [14]. In this lecture, we will describe
an efficient quantum algorithm for the HSP in the Heisenberg group that effectively implements the pretty
good measurement.
over F_p, and the semidirect product Z_p² ⋊ Z_p, where φ: Z_p → Aut(Z_p²) is defined by φ(c)(a, b) = (a + bc, b).
To solve the HSP in the Heisenberg group, it is sufficient to be able to distinguish the following cyclic
subgroups of order p:

  H_{a,b} := ⟨(a, b, 1)⟩ = {(a, b, 1)^x : x ∈ Z_p}.  (14.5)

The reduction to this case is essentially the same as the reduction of the dihedral hidden subgroup problem
to the case of a hidden reflection, so we omit the details. The elements of such a subgroup are

  (a, b, 1)² = (2a + b, 2b, 2)  (14.6)
  (a, b, 1)³ = (a, b, 1)(2a + b, 2b, 2) = (3a + 3b, 3b, 3)  (14.7)
  (a, b, 1)⁴ = (a, b, 1)(3a + 3b, 3b, 3) = (4a + 6b, 4b, 4)  (14.8)
  (a, b, 1)⁵ = (a, b, 1)(4a + 6b, 4b, 4) = (5a + 10b, 5b, 5)  (14.9)

etc., and a straightforward inductive argument shows that a general element has the form

  (a, b, 1)^x = (xa + C(x,2) b, xb, x),  (14.10)

where C(x,2) := x(x − 1)/2 denotes a binomial coefficient.
Furthermore, it is easy to see that the p² elements (ℓ, m, 0) for ℓ, m ∈ Z_p form a left transversal of H_{a,b} in
the Heisenberg group for any a, b ∈ Z_p.
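The closed form (14.10) can be checked by direct multiplication. The sketch below (ours) assumes the product convention (a, b, c)(a′, b′, c′) = (a + a′ + b′c, b + b′, c + c′) mod p, which reproduces (14.6)-(14.9):

```python
from math import comb

p = 7  # any odd prime works here

def mul(g, h):
    (a, b, c), (A, B, C) = g, h
    return ((a + A + B * c) % p, (b + B) % p, (c + C) % p)

# Check (a,b,1)^x = (xa + C(x,2) b, xb, x) for all a, b and a range of x.
for a in range(p):
    for b in range(p):
        acc = (0, 0, 0)                     # identity element
        for x in range(1, 2 * p):
            acc = mul(acc, (a, b, 1))
            assert acc == ((x * a + comb(x, 2) * b) % p,
                           (x * b) % p, x % p)
```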
for some uniformly random, unknown ℓ, m ∈ Z_p. Our goal is to determine the parameters a, b ∈ Z_p using
the ability to produce such states.
At this point, we could perform weak Fourier sampling over the Heisenberg group without discarding
any information. However, as in the case of the dihedral group, it will be simpler to consider an abelian
Fourier transform instead of the full nonabelian Fourier transform. Using the representation theory of the
Heisenberg group, one can show that this procedure is essentially equivalent to nonabelian Fourier sampling.
Fourier transforming the first two registers over Z_p², we obtain the state

  (F_{Z_p} ⊗ F_{Z_p} ⊗ I_p)|(ℓ, m, 0)H_{a,b}⟩ = (1/p^{3/2}) Σ_{x,s,t∈Z_p} ω_p^{s(ℓ + xa + C(x,2)b) + t(m + xb)} |s, t, x⟩.  (14.12)

Now suppose we measure the values s, t appearing in the first two registers. In fact this can be done without
loss of information, since the density matrix of the state (mixed over the uniformly random values of ℓ, m)
is block diagonal, with blocks labeled by s, t. Collecting the coefficients of the unknown parameters a, b, the
resulting p-dimensional quantum state is

  |Ĥ_{a,b;s,t}⟩ := (1/√p) Σ_{x∈Z_p} ω_p^{s(xa + C(x,2)b) + t(xb)} |x⟩  (14.13)
  = (1/√p) Σ_{x∈Z_p} ω_p^{a(sx) + b(sC(x,2) + tx)} |x⟩,  (14.14)

where the values s, t ∈ Z_p are known, and are obtained uniformly at random. We would like to use samples
of this state to determine a, b ∈ Z_p.
where

  α := sx + uy  (14.17)
  β := sC(x,2) + tx + uC(y,2) + vy,  (14.18)

and where we suppress the dependence of α, β on s, t, u, v, x, y for clarity. If we could replace |x, y⟩ by |α, β⟩,
then the resulting state would be simply the Fourier transform of |a, b⟩, and an inverse Fourier transform
would reveal the solution. So let's compute the values of α, β in ancilla registers, giving the state

  (1/p) Σ_{x,y∈Z_p} ω_p^{aα + bβ} |x, y, α, β⟩,  (14.19)
where we use the convention that |S⟩ := Σ_{s∈S} |s⟩/√|S| denotes the normalized uniform superposition over
the elements of the set S. Here S^{s,t,u,v}_{α,β} := {(x, y) ∈ Z_p² : sx + uy = α, sC(x,2) + tx + uC(y,2) + vy = β}
denotes the set of solutions (x, y) giving rise to a particular pair (α, β). Thus, if we could perform a unitary
transformation satisfying

  |S^{s,t,u,v}_{α,β}⟩ ↦ |α, β⟩ for |S^{s,t,u,v}_{α,β}| ≠ 0  (14.22)

(and defined in any way consistent with unitarity for other values of α, β), we could erase the first two
registers of (14.19), producing the state

  (1/p) Σ_{α,β∈Z_p} √|S^{s,t,u,v}_{α,β}| ω_p^{aα + bβ} |α, β⟩.  (14.23)
(Note that in fact we could just apply the transformation (14.22) directly to the state (14.16); there is no
need to explicitly compute the values $\alpha, \beta$ in an ancilla register.)
We refer to the inverse of the transformation (14.22) as quantum sampling, since the goal is to produce
a uniform superposition over the set of solutions, a natural quantum analog of random sampling from those
solutions.
Since the system of equations (14.17) and (14.18) consists of a pair of quadratic equations in two variables over $\mathbb{F}_p$, it has either zero, one, or two solutions $x, y \in \mathbb{F}_p$. In particular, a straightforward calculation shows that the solutions can be expressed in closed form as
$$x = \frac{s\alpha + sv - tu \mp \sqrt{\Delta}}{s(s+u)} \qquad y = \frac{u\alpha + tu - sv \pm \sqrt{\Delta}}{u(s+u)} \qquad (14.24)$$
where
$$\Delta := (2\beta s + \alpha s - \alpha^2 - 2t\alpha)(s+u)u + (u\alpha + tu - sv)^2. \qquad (14.25)$$
Provided $su(s+u) \neq 0$, the number of solutions is completely determined by the value of $\Delta$. If $\Delta$ is a nonzero square in $\mathbb{F}_p$, then there are two distinct solutions; if $\Delta = 0$ then there is only one solution; and if $\Delta$ is a non-square then there are no solutions. In any event, since we can efficiently compute an explicit list of solutions in each of these cases, we can efficiently perform the transformation (14.22).
62 Chapter 14. The HSP in the Heisenberg group
It remains to show that the state (14.23) can be used to recover a, b. This state is close to the Fourier
transform of |a, bi provided the solutions are nearly uniformly distributed. Since the values of s, t, u, v are
uniformly distributed over $\mathbb{F}_p$, it is easy to see that $\Delta$ is uniformly distributed over $\mathbb{F}_p$. This means that $\Delta$ is a square about half the time, and a non-square about half the time (with $\Delta = 0$ occurring only with probability $1/p$). Thus there are two solutions about half the time and no solutions about half the time.
This distribution of solutions is uniform enough for the procedure to work.
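This distribution can be checked directly by brute force. The following sketch (the prime $p$, random seed, and trial count are illustrative choices, not part of the algorithm) counts solutions of the system (14.17)–(14.18) for random parameters:

```python
# Brute-force sanity check over a small prime p: for random s, t, u, v and
# random (alpha, beta), the system
#   alpha = s*x + u*y,
#   beta  = s*binom(x,2) + t*x + u*binom(y,2) + v*y   (over F_p)
# has two solutions about half the time and none about half the time.
import random

p = 31  # illustrative small prime
random.seed(0)

def num_solutions(s, t, u, v, alpha, beta):
    count = 0
    for x in range(p):
        for y in range(p):
            ok1 = (s * x + u * y) % p == alpha
            ok2 = (s * (x * (x - 1) // 2) + t * x
                   + u * (y * (y - 1) // 2) + v * y) % p == beta
            if ok1 and ok2:
                count += 1
    return count

counts = {0: 0, 1: 0, 2: 0}
for _ in range(200):
    s, t, u, v = (random.randrange(1, p) for _ in range(4))
    if (s * u * (s + u)) % p == 0:
        continue  # exclude the degenerate case, as in the text
    alpha, beta = random.randrange(p), random.randrange(p)
    counts[num_solutions(s, t, u, v, alpha, beta)] += 1

print(counts)  # roughly half the trials give 2 solutions, half give 0
```

Note that a single solution ($\Delta = 0$) occurs only for an $O(1/p)$ fraction of the trials, matching the analysis above.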
Applying the inverse quantum Fourier transform over $\mathbb{Z}_p \times \mathbb{Z}_p$, we obtain the state
$$\frac{1}{p^2} \sum_{\alpha,\beta,k,\ell \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\; \omega_p^{\alpha(a-k) + \beta(b-\ell)}\, |k, \ell\rangle. \qquad (14.26)$$
Measuring this state, the probability of obtaining the outcome $k = a$ and $\ell = b$ for any particular values of $s, t, u, v$ is
$$\frac{1}{p^4} \Bigg|\sum_{\alpha,\beta \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\Bigg|^2. \qquad (14.27)$$
Since those values occur uniformly at random, the overall success probability of the algorithm is
$$\frac{1}{p^8} \sum_{s,t,u,v \in \mathbb{Z}_p} \Bigg|\sum_{\alpha,\beta \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\Bigg|^2 \geq \frac{1}{p^{12}} \Bigg|\sum_{s,t,u,v \in \mathbb{Z}_p} \sum_{\alpha,\beta \in \mathbb{Z}_p} \sqrt{|S^{s,t,u,v}_{\alpha,\beta}|}\Bigg|^2 \qquad (14.28)$$
$$\geq \frac{1}{p^{12}} \Bigg(\sum_{\alpha,\beta \in \mathbb{Z}_p} \frac{p^4}{\sqrt{2 + o(1)}}\Bigg)^2 \qquad (14.29)$$
$$= \frac{1}{2}(1 - o(1)), \qquad (14.30)$$
which shows that the algorithm succeeds with probability close to $1/2$.
Chapter 15
Approximating the Jones polynomial
In this final chapter of the part on algebraic problems, we discuss a very different class of quantum algorithms, ones that approximately solve various #P-complete problems. The best-known example of such a quantum algorithm is for approximating the value of a link invariant called the Jones polynomial. This algorithm is not based on the Fourier transform, but it does use properties of group representations.
64 Chapter 15. Approximating the Jones polynomial
The Jones polynomial of an oriented link $L$ is a Laurent polynomial $V_L(t)$ in the variable $\sqrt{t}$, i.e., a polynomial in $\sqrt{t}$ and $1/\sqrt{t}$. It is a link invariant, meaning that $V_L(t) = V_{L'}(t)$ if the oriented links $L$ and $L'$ are isotopic. While it is possible for the Jones polynomial to take the same value on two non-isotopic links, it can often distinguish links; for example, the Jones polynomials of the two orientations of the trefoil knot are different.
An oriented link L can be specified by a link diagram, a drawing of the link in the plane with over- and
under-crossings indicated. One way to define the Jones polynomial of a link diagram is as follows. First, let
us define the Kauffman bracket hLi, which does not depend on the orientation of L. Each crossing in the
link diagram can be opened in one of two ways, and for any given crossing we have
$$\langle L_{\times} \rangle = t^{1/4}\, \langle L_{\asymp} \rangle + t^{-1/4}\, \langle L_{)(} \rangle, \qquad (15.4)$$
where $L_{\times}$, $L_{\asymp}$, and $L_{)(}$ denote diagrams that are identical except at the given crossing, which is respectively left intact or opened in one of the two possible ways, with the rest of the link unchanged. Repeatedly applying this rule, we eventually arrive at a link
consisting of disjoint unknots. The Kauffman bracket of a single unknot is $\langle\bigcirc\rangle := 1$, and more generally, the Kauffman bracket of $n$ unknots is $(-t^{1/2} - t^{-1/2})^{n-1}$. By itself, the Kauffman bracket is not a link invariant,
but it can be turned into one by taking into account the orientation of the link, giving the Jones polynomial.
For any oriented link diagram $L$, we define its writhe $w(L)$ as the number of positive crossings minus the number of negative crossings, where the sign of a crossing is determined by the orientations of the two strands in the standard way. Then the Jones polynomial is defined as
$$V_L(t) := (-t^{3/4})^{-w(L)}\, \langle L \rangle.$$
Computing the Jones polynomial of a link diagram is quite difficult. A brute-force calculation using
the definition in terms of the Kauffman bracket takes time exponential in the number of crossings. Indeed,
exactly computing the Jones polynomial is #P-hard (except for a few special values of t), as shown by
Jaeger, Vertigan, and Welsh. Here #P is the class of counting problems associated to problems in NP (e.g.,
computing the number of satisfying assignments of a Boolean formula). Of course, approximate counting
can be easier than exact counting, and sometimes #P-hard problems have surprisingly good approximation
algorithms.
number of paths of length $n$ that start from one end of a path with $k - 1$ vertices), so it corresponds to a unitary operation on $\mathrm{poly}(n)$ qubits. The Jones polynomial of the plat closure of a braid is proportional to the expectation $\langle\psi|U|\psi\rangle$ of the associated representation matrix $U$ in a fixed quantum state $|\psi\rangle$.
that this problem is complete for the one clean qubit model, and hence apparently unlikely to be solvable
by classical computers.
Quantum walk

Chapter 16
Continuous-time quantum walk
We now turn to our second major topic in quantum algorithms, the concept of quantum walk. In this lecture
we will introduce continuous-time quantum walk as a natural analog of continuous-time classical random
walk, and we'll see some examples of how the two kinds of processes differ.
where $\deg(j)$ denotes the degree of vertex $j$. (The Laplacian is sometimes defined differently than this, e.g., sometimes with the opposite sign. We use this definition because it makes $L$ a discrete approximation of the Laplacian operator $\nabla^2$ in the continuum.)
The continuous-time random walk on $G$ is defined as the solution of the differential equation
$$\frac{\mathrm{d}}{\mathrm{d}t}\, p_j(t) = \sum_{k \in V} L_{jk}\, p_k(t). \qquad (16.3)$$
Here $p_j(t)$ denotes the probability associated with vertex $j$ at time $t$. This can be viewed as a discrete analog of the diffusion equation. Note that
$$\frac{\mathrm{d}}{\mathrm{d}t} \sum_{j \in V} p_j(t) = \sum_{j,k \in V} L_{jk}\, p_k(t) = 0 \qquad (16.4)$$
(since the columns of $L$ sum to 0), which shows that an initially normalized distribution remains normalized: the evolution of the continuous-time random walk for any time $t$ is a stochastic process. The solution of the differential equation can be given in closed form as
$$p(t) = e^{Lt}\, p(0). \qquad (16.5)$$
70 Chapter 16. Continuous-time quantum walk
Now notice that the equation (16.3) is very similar to the Schrödinger equation
$$i\, \frac{\mathrm{d}}{\mathrm{d}t}\, |\psi\rangle = H\, |\psi\rangle \qquad (16.6)$$
except that it lacks the factor of $i$. If we simply insert this factor, and rename the probabilities $p_j(t)$ as quantum amplitudes $q_j(t) = \langle j|\psi(t)\rangle$ (where $\{|j\rangle : j \in V\}$ is an orthonormal basis for the Hilbert space), then we obtain the equation
$$i\, \frac{\mathrm{d}}{\mathrm{d}t}\, q_j(t) = \sum_{k \in V} L_{jk}\, q_k(t), \qquad (16.7)$$
which is simply the Schrödinger equation with the Hamiltonian given by the Laplacian of the graph. Since the Laplacian is a Hermitian operator, these dynamics preserve normalization in the sense that $\frac{\mathrm{d}}{\mathrm{d}t} \sum_{j \in V} |q_j(t)|^2 = 0$. Again the solution of the differential equation can be given in closed form, but here it is $|\psi(t)\rangle = e^{-iLt}\, |\psi(0)\rangle$.
We could also define a continuous-time quantum walk using any Hermitian Hamiltonian that respects
the structure of G. For example, we could use the adjacency matrix A of G, even though this matrix cannot
be used as the generator of a continuous-time classical random walk.
For general $n$, the graph is the direct product of this graph with itself $n$ times, and the adjacency matrix is
$$A = \sum_{j=1}^{n} \sigma_x^{(j)} \qquad (16.9)$$
where $\sigma_x^{(j)}$ denotes the operator acting as $\sigma_x$ on the $j$th bit, and as the identity on every other bit.
For simplicity, let's consider the quantum walk with the Hamiltonian given by the adjacency matrix. (In fact, since the graph is regular, the walk generated by the Laplacian would only differ by an overall phase.) Since the terms in the above expression for the adjacency matrix commute, the unitary operator describing the evolution of this walk is simply
$$e^{-iAt} = \prod_{j=1}^{n} e^{-i \sigma_x^{(j)} t} \qquad (16.10)$$
$$= \bigotimes_{j=1}^{n} \begin{pmatrix} \cos t & -i \sin t \\ -i \sin t & \cos t \end{pmatrix}. \qquad (16.11)$$
After time $t = \pi/2$, this operator flips every bit of the state (up to an overall phase), mapping any input state $|x\rangle$ to the state $|\bar{x}\rangle$ corresponding to the opposite vertex of the hypercube.
In contrast, consider the continuous- or discrete-time random walk starting from the vertex $x$. It is not hard to show that the probability of reaching the opposite vertex $\bar{x}$ is exponentially small at any time, since the walk rapidly reaches the uniform distribution over all $2^n$ vertices of the hypercube. So this simple example shows that random and quantum walks can exhibit radically different behavior.
16.3. Random and quantum walks in one dimension 71
where $p \in [-\pi, \pi]$. A direct calculation shows that the state $|p\rangle$ with amplitudes $\langle j|p\rangle = e^{ipj}$ satisfies $L|p\rangle = 2(\cos p - 1)\,|p\rangle$, so the corresponding eigenvalue is $2(\cos p - 1)$. Thus the amplitude for the walk to move from $j$ to $k$ in time $t$ is
$$\langle k|e^{-iLt}|j\rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-2it(\cos p - 1)}\, \langle k|p\rangle \langle p|j\rangle\, \mathrm{d}p \qquad (16.17)$$
$$= \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{ip(k-j) - 2it(\cos p - 1)}\, \mathrm{d}p \qquad (16.18)$$
$$= e^{2it}\, (-i)^{k-j}\, J_{k-j}(2t) \qquad (16.19)$$
where $J_\nu$ is the Bessel function of order $\nu$. This expression can be understood using basic asymptotic properties of the Bessel function. For large values of $\nu$, the function $J_\nu(t)$ is exponentially small in $\nu$ for $\nu \gg t$, of order $t^{-1/3}$ for $\nu \approx t$, and of order $t^{-1/2}$ for $\nu \ll t$. Thus (16.19) describes a wave propagating with speed 2.
We can use a similar calculation to exactly describe the corresponding continuous-time classical random walk, which is simply the analytic continuation of the quantum case with $t \to it$. Here the probability of moving from $j$ to $k$ in time $t$ is
$$e^{-2t}\, I_{k-j}(2t),$$
where $I_\nu$ is the modified Bessel function of order $\nu$. For large $t$, this expression is approximately $\frac{1}{\sqrt{4\pi t}}\, e^{-(k-j)^2/4t}$, a Gaussian of width $\sqrt{2t}$, in agreement with our expectations for a classical random walk in one dimension.
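The Bessel-function expression can be checked numerically. The sketch below (the path length, time, and displacement are illustrative) compares the exact amplitude on a long but finite path with the integral representation (16.18); away from the boundaries the two agree to high precision:

```python
# Compare the exact walk amplitude on a long finite path with the integral
# representation (16.18), evaluated by a periodic trapezoid rule.
import numpy as np

N = 201                                   # illustrative path length
L = np.zeros((N, N))
for i in range(N - 1):
    L[i, i + 1] = L[i + 1, i] = 1.0
L -= np.diag(L.sum(axis=1))               # Laplacian L = A - D

vals, vecs = np.linalg.eigh(L)
t = 5.0
U = vecs @ np.diag(np.exp(-1j * vals * t)) @ vecs.T
j, k = N // 2, N // 2 + 7                 # move 7 sites in time t
exact = U[k, j]

# (1/2pi) * integral of e^{ip(k-j) - 2it(cos p - 1)} over [-pi, pi)
p = np.linspace(-np.pi, np.pi, 20000, endpoint=False)
integral = np.mean(np.exp(1j * p * (k - j) - 2j * t * (np.cos(p) - 1)))

print(abs(exact - integral))              # tiny: the two expressions agree
```

Boundary effects are negligible here because the amplitude beyond distance roughly $2t$ from the start is exponentially small, as the asymptotics above indicate.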
Suppose we take a random walk on the graph starting from the root of the left tree. It is not hard to
see that such a walk rapidly gets lost in the middle of the graph and never has a substantial probability
of reaching the opposite root. In fact, by specifying the graph in such a way that it can only be explored
locally, we can ensure that no classical procedure starting from the left root can efficiently reach the right
root. However, a quantum walk starting from the left root produces a state with a large (lower bounded by
1/ poly(n)) overlap on the right root in a short (upper bounded by poly(n)) amount of time.
To establish a provable separation between classical and quantum strategies, we will formulate the graph
traversal problem in terms of query complexity.
Let $G = (V, E)$ be a graph with $N$ vertices. To represent $G$ by a black box, let $m$ be such that $2^m > N$,
and let k be at least as large as the maximum degree of G. For each vertex a V , assign a distinct m-bit
string (called the name of a), not assigning 11 . . . 1 as the name of any vertex. For each b V with (a, b) E,
assign a unique label from {1, 2, . . . , k} to the ordered pair (a, b). For a {0, 1}m (identifying the vertex
with its name) and c {1, 2, . . . , k}, define vc (a) as the name of the vertex reached by following the outgoing
edge of a labeled by c, if such an edge exists. If there is no vertex of G named a or no outgoing edge from a
labeled c, then let vc (a) = 11 . . . 1. The black box for G takes a {0, 1}m and c {1, 2, . . . , k} as input and
returns vc (a).
The black box graph traversal problem is as follows. Let G be a graph and let entrance and exit be
two vertices of G. Given a black box for G as described above, with the additional promise that the name of
the entrance is 00 . . . 0, the goal is to output the name of the exit. We say an algorithm for this problem
is efficient if its running time is polynomial in m.
Of course, a random walk is not necessarily the best classical strategy for this problem. For example,
there is an efficient classical algorithm for traversing the n-dimensional hypercube (exercise: what is it?)
even though a random walk does not work. However, no classical algorithm can efficiently traverse the glued
trees, whereas a quantum walk can.
$$|\mathrm{col}\ j\rangle := \frac{1}{\sqrt{N_j}} \sum_{\delta(a,\,\mathrm{entrance}) = j} |a\rangle \qquad (16.21)$$
16.5. Quantum walk algorithm to traverse the glued trees graph 73
where
$$N_j := \begin{cases} 2^j & 0 \le j \le n \\ 2^{2n+1-j} & n+1 \le j \le 2n+1 \end{cases} \qquad (16.22)$$
is the number of vertices at distance $j$ from the entrance, and where $\delta(a, b)$ denotes the length of the shortest path in $G$ from $a$ to $b$. It is straightforward to see that the subspace $\mathrm{span}\{|\mathrm{col}\ j\rangle : 0 \le j \le 2n+1\}$ is invariant under the action of the adjacency matrix $A$ of $G$. At the entrance and exit, we have
$$A\,|\mathrm{col}\ 0\rangle = \sqrt{2}\,|\mathrm{col}\ 1\rangle \qquad (16.23)$$
$$A\,|\mathrm{col}\ 2n+1\rangle = \sqrt{2}\,|\mathrm{col}\ 2n\rangle. \qquad (16.24)$$
For $0 < j < n$ (within the left tree), we have
$$A\,|\mathrm{col}\ j\rangle = \frac{1}{\sqrt{N_j}} \sum_{\delta(a,\,\mathrm{entrance}) = j} A\,|a\rangle \qquad (16.25)$$
$$= \frac{1}{\sqrt{N_j}} \Bigg(2 \sum_{\delta(a,\,\mathrm{entrance}) = j-1} |a\rangle + \sum_{\delta(a,\,\mathrm{entrance}) = j+1} |a\rangle\Bigg) \qquad (16.26)$$
$$= \frac{1}{\sqrt{N_j}} \left(2\sqrt{N_{j-1}}\,|\mathrm{col}\ j-1\rangle + \sqrt{N_{j+1}}\,|\mathrm{col}\ j+1\rangle\right) \qquad (16.27)$$
$$= \sqrt{2}\,\big(|\mathrm{col}\ j-1\rangle + |\mathrm{col}\ j+1\rangle\big). \qquad (16.28)$$
Similarly, for $n+1 < j < 2n+1$ (within the right tree),
$$A\,|\mathrm{col}\ j\rangle = \frac{1}{\sqrt{N_j}} \left(\sqrt{N_{j-1}}\,|\mathrm{col}\ j-1\rangle + 2\sqrt{N_{j+1}}\,|\mathrm{col}\ j+1\rangle\right) \qquad (16.29)$$
$$= \sqrt{2}\,\big(|\mathrm{col}\ j-1\rangle + |\mathrm{col}\ j+1\rangle\big). \qquad (16.30)$$
The only difference occurs at the middle of the graph, where we have
$$A\,|\mathrm{col}\ n\rangle = \frac{1}{\sqrt{N_n}} \left(2\sqrt{N_{n-1}}\,|\mathrm{col}\ n-1\rangle + 2\sqrt{N_{n+1}}\,|\mathrm{col}\ n+1\rangle\right) \qquad (16.31)$$
$$= \sqrt{2}\,|\mathrm{col}\ n-1\rangle + 2\,|\mathrm{col}\ n+1\rangle \qquad (16.32)$$
and similarly
$$A\,|\mathrm{col}\ n+1\rangle = \frac{1}{\sqrt{N_{n+1}}} \left(2\sqrt{N_n}\,|\mathrm{col}\ n\rangle + 2\sqrt{N_{n+2}}\,|\mathrm{col}\ n+2\rangle\right) \qquad (16.33)$$
$$= 2\,|\mathrm{col}\ n\rangle + \sqrt{2}\,|\mathrm{col}\ n+2\rangle. \qquad (16.34)$$
In summary, in the basis $\{|\mathrm{col}\ 0\rangle, \ldots, |\mathrm{col}\ 2n+1\rangle\}$, the matrix elements of $A$ describe a weighted path: the entrance ($\mathrm{col}\ 0$) and exit ($\mathrm{col}\ 2n+1$) are the endpoints, and every edge of the path has weight $\sqrt{2}$, except the middle edge between $\mathrm{col}\ n$ and $\mathrm{col}\ n+1$, which has weight 2.
By identifying the subspace of states |col ji, we have found that the quantum walk on the glued trees
graph starting from the entrance is effectively the same as a quantum walk on a weighted path of 2n + 2
vertices, with all edge weights the same except for the middle one. Given our example of the quantum walk
on the infinite path, we can expect this walk to reach the exit with amplitude 1/ poly(n) in time linear in
n. To prove that the walk indeed reaches the exit in polynomial time, we will use the notion of the mixing
time of a quantum walk.
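The reduced walk is easy to simulate directly. The following sketch (the value of $n$ and the sampled times are illustrative choices) builds the weighted-path adjacency matrix just described and tracks the probability at the exit:

```python
# Simulate the quantum walk on the reduced (2n+2)-vertex weighted path for
# the glued trees: every weight is sqrt(2) except the middle edge, which is 2.
import numpy as np

n = 10                                    # illustrative size
d = 2 * n + 2
A = np.zeros((d, d))
for j in range(d - 1):
    A[j, j + 1] = A[j + 1, j] = 2.0 if j == n else np.sqrt(2.0)

vals, vecs = np.linalg.eigh(A)
entrance, exit_ = 0, d - 1

best = 0.0
for t in np.linspace(0.0, 4.0 * n, 400):
    phases = np.exp(-1j * vals * t)
    amp = (vecs[exit_] * phases) @ vecs[entrance]   # <exit| e^{-iAt} |entrance>
    best = max(best, abs(amp) ** 2)

print(best)   # substantial probability at the exit, well above 1/poly(n)
```

The peak occurs around the time a wavepacket moving at the speed set by the $\sqrt{2}$ couplings first reaches the far end of the path, consistent with the one-dimensional analysis above.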
(where in exponentiating $L$ we have used the fact that $\sqrt{|V|}\,u$ is a normalized eigenvector of $L$, so that $|V|\,uu^T$ is the projector onto the corresponding subspace). The Laplacian is a negative semidefinite operator, so the contributions $e^{\lambda t}$ for $\lambda \neq 0$ decrease exponentially in time; thus the walk asymptotically approaches the uniform distribution. The deviation from uniform is small when $t$ is large compared to the inverse of the largest (i.e., least negative) nonzero eigenvalue of $L$.
Since a quantum walk is a unitary process, we should not expect it to approach a limiting quantum state, no matter how long we wait. Nevertheless, it is possible to define a notion of the limiting distribution of a quantum walk as follows. Suppose we pick a time $t$ uniformly at random between 0 and $T$, run the quantum walk starting at $a \in V$ for a total time $t$, and then measure in the vertex basis. The resulting distribution is
$$p_{a \to b}(T) = \frac{1}{T} \int_0^T |\langle b|e^{-iHt}|a\rangle|^2\, \mathrm{d}t \qquad (16.39)$$
$$= \sum_{\lambda, \lambda'} \langle b|\lambda\rangle \langle\lambda|a\rangle \langle a|\lambda'\rangle \langle\lambda'|b\rangle\, \frac{1}{T} \int_0^T e^{-i(\lambda - \lambda')t}\, \mathrm{d}t \qquad (16.40)$$
$$= \sum_{\lambda} |\langle a|\lambda\rangle \langle b|\lambda\rangle|^2 + \sum_{\lambda \neq \lambda'} \langle b|\lambda\rangle \langle\lambda|a\rangle \langle a|\lambda'\rangle \langle\lambda'|b\rangle\, \frac{1 - e^{-i(\lambda - \lambda')T}}{i(\lambda - \lambda')T} \qquad (16.41)$$
where we have considered a quantum walk generated by an unspecified Hamiltonian $H$ (it could be the Laplacian or the adjacency matrix, or some other operator as desired), and where we have assumed for simplicity that the spectrum of $H = \sum_\lambda \lambda\, |\lambda\rangle\langle\lambda|$ is nondegenerate. We see that the distribution $p_{a \to b}(T)$ tends toward a limiting distribution
$$p_{a \to b}(\infty) := \sum_{\lambda} |\langle a|\lambda\rangle \langle b|\lambda\rangle|^2. \qquad (16.42)$$
The timescale for approaching this distribution is again governed by the spectrum of H, but now we see that
T must be large compared to the inverse of the smallest gap between any pair of distinct eigenvalues, not
just the smallest gap between a particular pair of eigenvalues as in the classical case.
Let's apply this notion of quantum mixing to the quantum walk on the glued trees. It will be simplest to consider the walk generated by the adjacency matrix $A$. Since the subspace of states $|\mathrm{col}\ j\rangle$ has dimension only $2n + 2$, it should not be surprising that the limiting probability of traversing from entrance to exit is bigger than $1/\mathrm{poly}(n)$. To see this, notice that $A$ commutes with the reflection operator $R$ defined as $R\,|\mathrm{col}\ j\rangle = |\mathrm{col}\ 2n+1-j\rangle$, so these two operators can be simultaneously diagonalized. Now $R^2 = 1$, so it
16.6. Classical and quantum mixing 75
has eigenvalues $\pm 1$, which shows that we can choose the eigenstates $|\lambda\rangle$ of $A$ to satisfy $\langle\mathrm{entrance}|\lambda\rangle = \pm\langle\mathrm{exit}|\lambda\rangle$. Therefore,
$$p_{\mathrm{entrance} \to \mathrm{exit}}(\infty) = \sum_{\lambda} |\langle\mathrm{entrance}|\lambda\rangle \langle\mathrm{exit}|\lambda\rangle|^2 \qquad (16.43)$$
$$= \sum_{\lambda} |\langle\mathrm{entrance}|\lambda\rangle|^4 \qquad (16.44)$$
$$\geq \frac{1}{2n+2} \Bigg(\sum_{\lambda} |\langle\mathrm{entrance}|\lambda\rangle|^2\Bigg)^2 \qquad (16.45)$$
$$= \frac{1}{2n+2} \qquad (16.46)$$
where the lower bound follows by the Cauchy-Schwarz inequality. Thus it suffices to show that the mixing
time of the quantum walk is poly(n).
To see how long we must wait before the probability of reaching the exit is close to its limiting value, we can calculate
$$|p_{\mathrm{entrance} \to \mathrm{exit}}(\infty) - p_{\mathrm{entrance} \to \mathrm{exit}}(T)|$$
$$= \Bigg|\sum_{\lambda \neq \lambda'} \langle\mathrm{exit}|\lambda\rangle \langle\lambda|\mathrm{entrance}\rangle \langle\mathrm{entrance}|\lambda'\rangle \langle\lambda'|\mathrm{exit}\rangle\, \frac{1 - e^{-i(\lambda - \lambda')T}}{i(\lambda - \lambda')T}\Bigg| \qquad (16.47)$$
$$\leq \frac{2}{\Delta T} \sum_{\lambda, \lambda'} |\langle\mathrm{exit}|\lambda\rangle \langle\lambda|\mathrm{entrance}\rangle \langle\mathrm{entrance}|\lambda'\rangle \langle\lambda'|\mathrm{exit}\rangle| \qquad (16.48)$$
$$= \frac{2}{\Delta T} \sum_{\lambda, \lambda'} |\langle\mathrm{entrance}|\lambda\rangle|^2\, |\langle\mathrm{entrance}|\lambda'\rangle|^2 \qquad (16.49)$$
$$= \frac{2}{\Delta T}, \qquad (16.50)$$
where $\Delta$ denotes the smallest gap between any pair of distinct eigenvalues of $A$. All that remains is to lower bound $\Delta$.
To understand the spectrum of $A$, recall that an infinite path has eigenstates of the form $e^{ipj}$. For any value of $p$, the state $|\lambda\rangle$ with amplitudes $\langle\mathrm{col}\ j|\lambda\rangle = e^{ipj}$ satisfies $\langle\mathrm{col}\ j|A|\lambda\rangle = \lambda\, \langle\mathrm{col}\ j|\lambda\rangle$, where the eigenvalue is $\lambda = 2\sqrt{2}\cos p$, for all values of $j$ except $0, n, n+1, 2n+1$. We can satisfy the eigenvalue condition for $j = 0, 2n+1$ by taking linear combinations of $e^{\pm ipj}$ that vanish for $j = -1$ and $j = 2n+2$, namely
$$\langle\mathrm{col}\ j|\lambda\rangle = \begin{cases} \sin(p(j+1)) & 0 \le j \le n \\ \pm\sin(p(2n+2-j)) & n+1 \le j \le 2n+1. \end{cases} \qquad (16.51)$$
The left hand side of this equation decreases monotonically, with poles at integer multiples of $\pi/(n+1)$.
With a bit of analysis (see quant-ph/0209131 for details), one can show that the solutions of this equation give $2n$ values of $p$, each of which is separated from the integer multiples of $\pi/(n+1)$ by $\Omega(1/n^2)$. The spacings between the corresponding eigenvalues of $A$, $\lambda = 2\sqrt{2}\cos p$, are $\Omega(1/n^3)$. The remaining two eigenvalues of $A$ can be obtained by considering solutions with $p$ imaginary, and it is easy to show that they are separated from the rest of the spectrum by a constant amount. By taking (say) $T = 5n/\Delta = O(n^4)$, we can ensure that the probability to reach the exit is $\Omega(1/n)$. Thus there is an efficient quantum algorithm to traverse the glued trees graph.
Chapter 17
Discrete-time quantum walk
In the last lecture we introduced the notion of continuous-time quantum walk. We now turn our attention to discrete-time quantum walk, which provides a convenient framework for quantum search algorithms.
for $j, k \in V$: an initial probability distribution $p$ over the vertices evolves to $p' = Mp$ after one step of the walk.
To define a quantum analog of this process, we would like to specify a unitary operator $U$ with the property that an input state $|j\rangle$ corresponding to the vertex $j \in V$ evolves to a superposition of the neighbors of $j$. We would like this to happen in essentially the same way at every vertex, so we are tempted to propose the definition
$$|j\rangle \overset{?}{\mapsto} |\partial j\rangle := \frac{1}{\sqrt{\deg(j)}} \sum_{k : (j,k) \in E} |k\rangle. \qquad (17.2)$$
However, a moment's reflection shows that this typically does not define a unitary transformation, since the orthogonal states $|j\rangle$ and $|k\rangle$ corresponding to adjacent vertices $j, k$ with a common neighbor $\ell$ evolve to non-orthogonal states. We could potentially avoid this problem using a rule that sometimes introduces phases, but that would violate the spirit of defining a process that behaves in the same way at every vertex.
In fact, even if we give that up, there are some graphs that simply do not allow local unitary dynamics [88].
We can get around this difficulty if we allow ourselves to enlarge the Hilbert space, an idea proposed by Watrous as part of a logarithmic-space quantum algorithm for deciding whether two vertices are connected in a graph [96]. Let the Hilbert space consist of states of the form $|j, k\rangle$ where $(j, k) \in E$. We can think of the walk as taking place on the (directed) edges of the graph; the state $|j, k\rangle$ represents a walker at vertex $j$ that will move toward vertex $k$. Each step of the walk consists of two operations. First, we apply a unitary transformation that operates on the second register conditional on the first register. This transformation is sometimes referred to as a coin flip, as it modifies the next destination of the walker. A common choice is the Grover diffusion operator over the neighbors of $j$, namely
$$C := \sum_{j \in V} |j\rangle\langle j| \otimes \big(2\,|\partial j\rangle\langle\partial j| - I\big). \qquad (17.3)$$
Next, the walker is moved to the vertex indicated in the second register. Of course, since the process must
78 Chapter 17. Discrete-time quantum walk
be unitary, the only way to do this is to swap the two registers using the operator
$$S := \sum_{(j,k) \in E} |j, k\rangle\langle k, j|. \qquad (17.4)$$
Overall, one step of the discrete-time quantum walk is described by the unitary operator SC.
In principle, this construction can be used to define a discrete-time quantum walk on any graph (although
care must be taken if the graph is not regular). However, in practice it is often more convenient to use an
alternative framework introduced by Szegedy [94], as described in the next section.
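As a concrete illustration of the coined construction above (the graph and its size are arbitrary choices), here is one step $U = SC$ on a cycle; for a degree-2 vertex the Grover coin $2|\partial j\rangle\langle\partial j| - I$ is simply the swap of the two coin directions:

```python
# One step U = S*C of the coined quantum walk on an N-cycle.
import numpy as np

N = 8  # illustrative cycle size
edges = [(j, (j + 1) % N) for j in range(N)] + [(j, (j - 1) % N) for j in range(N)]
idx = {e: i for i, e in enumerate(edges)}
dim = len(edges)

# Coin: Grover diffusion 2|dj><dj| - I over the neighbors of each vertex j.
C = np.zeros((dim, dim))
for j in range(N):
    nbrs = [(j + 1) % N, (j - 1) % N]
    for k in nbrs:
        for kp in nbrs:
            C[idx[(j, kp)], idx[(j, k)]] = 2.0 / len(nbrs) - (k == kp)

# Shift: move to the vertex named in the second register by swapping registers.
S = np.zeros((dim, dim))
for (j, k) in edges:
    S[idx[(k, j)], idx[(j, k)]] = 1.0

U = S @ C
print(np.allclose(U @ U.T, np.eye(dim)))   # → True: U is unitary
```

Note that $C$ is a reflection ($C^2 = I$) and $S$ is a permutation, so $U$ is automatically unitary; this is exactly why the two-register construction succeeds where (17.2) fails.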
Let
$$\Pi := \sum_{j=1}^{N} |\psi_j\rangle\langle\psi_j| \qquad (17.7)$$
be the projector onto the span of these states, and let
$$S := \sum_{j,k=1}^{N} |j, k\rangle\langle k, j| \qquad (17.8)$$
be the operator that swaps the two registers. Then a single step of the quantum walk is defined as the unitary operator $U := S(2\Pi - 1)$.
Notice that if Pjk = Ajk / deg(k) (i.e., if the walk simply chooses an outgoing edge of an underlying
digraph uniformly at random), then this is exactly the coined quantum walk with the Grover diffusion
operator as the coin flip.
If we take two steps of the walk, then the corresponding unitary operator is
$$U^2 = S(2\Pi - 1)\, S(2\Pi - 1) = (2S\Pi S - 1)(2\Pi - 1),$$
which can be interpreted as the reflection about $\mathrm{span}\{|\psi_j\rangle\}$ followed by the reflection about $\mathrm{span}\{S|\psi_j\rangle\}$ (the states where we condition on the second register to do a coin operation on the first). To understand the behavior of the walk, we will now compute the spectrum of $U$; but note that it is also possible to compute the spectrum of a product of reflections more generally.
17.3. Spectrum of the quantum walk 79
Theorem 17.1. Fix an $N \times N$ stochastic matrix $P$, and let $\{|\lambda\rangle\}$ denote a complete set of orthonormal eigenvectors of the $N \times N$ matrix $D$ with entries $D_{jk} = \sqrt{P_{jk} P_{kj}}$, with eigenvalues $\{\lambda\}$. Then the eigenvalues of the discrete-time quantum walk $U = S(2\Pi - 1)$ corresponding to $P$ are $\pm 1$ and $\lambda \pm i\sqrt{1 - \lambda^2} = e^{\pm i \arccos \lambda}$.
To prove this, consider the operator
$$T := \sum_{j=1}^{N} |\psi_j\rangle\langle j| \qquad (17.11)$$
$$= \sum_{j,k=1}^{N} \sqrt{P_{kj}}\, |j, k\rangle\langle j|. \qquad (17.12)$$
We have
$$T T^\dagger = \sum_{j,k=1}^{N} |\psi_j\rangle\langle j|k\rangle\langle\psi_k| \qquad (17.13)$$
$$= \sum_{j=1}^{N} |\psi_j\rangle\langle\psi_j| \qquad (17.14)$$
$$= \Pi, \qquad (17.15)$$
whereas
$$T^\dagger T = \sum_{j,k=1}^{N} |j\rangle\langle\psi_j|\psi_k\rangle\langle k| \qquad (17.16)$$
$$= \sum_{j,k,\ell,m=1}^{N} \sqrt{P_{\ell j} P_{mk}}\, |j\rangle\langle j, \ell|k, m\rangle\langle k| \qquad (17.17)$$
$$= \sum_{j,\ell=1}^{N} P_{\ell j}\, |j\rangle\langle j| \qquad (17.18)$$
$$= I \qquad (17.19)$$
and
$$T^\dagger S T = \sum_{j,k=1}^{N} |j\rangle\langle\psi_j|S|\psi_k\rangle\langle k| \qquad (17.20)$$
$$= \sum_{j,k,\ell,m=1}^{N} \sqrt{P_{\ell j} P_{mk}}\, |j\rangle\langle j, \ell|S|k, m\rangle\langle k| \qquad (17.21)$$
$$= \sum_{j,k=1}^{N} \sqrt{P_{jk} P_{kj}}\, |j\rangle\langle k| \qquad (17.22)$$
$$= D. \qquad (17.23)$$
We see that the subspace $\mathrm{span}\{T|\lambda\rangle, S T|\lambda\rangle\}$ is invariant under $U$, so we can find eigenvectors of $U$ within this subspace. Now let $|\tilde\lambda_\mu\rangle := T|\lambda\rangle - \mu\, S T|\lambda\rangle$, and let us choose $\mu \in \mathbb{C}$ so that $|\tilde\lambda_\mu\rangle$ is an eigenvector of $U$; this yields the eigenvalues $\lambda \pm i\sqrt{1 - \lambda^2} = e^{\pm i \arccos \lambda}$. Finally, note that for any vector in the orthogonal complement of $\mathrm{span}\{T|\lambda\rangle, S T|\lambda\rangle\}$, $U$ simply acts as $-S$ (since $\Pi = T T^\dagger = \sum_\lambda T|\lambda\rangle\langle\lambda|T^\dagger$ projects onto $\mathrm{span}\{T|\lambda\rangle\}$). In this subspace, the eigenvalues are $\pm 1$.
Let us assume from now on that the original walk $P$ is symmetric, though the modified walk $P'$ clearly is not provided $M$ is non-empty. If we order the vertices so that the marked ones come last, the matrix $P'$ has the block form
$$P' = \begin{pmatrix} P_M & 0 \\ Q & I \end{pmatrix}, \qquad (17.36)$$
where $P_M$ denotes the restriction of $P$ to the unmarked vertices.
Now if we start from the uniform distribution over unmarked items (if we start from a marked item we are done, so we might as well condition on this not happening), then the probability of not reaching a marked item after $t$ steps is
$$\frac{1}{N - |M|} \sum_{j,k \notin M} [P_M^t]_{jk} \leq \|P_M^t\| \leq \|P_M\|^t,$$
where the inequality follows because the left hand side is the expectation of $P_M^t$ in the normalized state $|V \setminus M\rangle = \frac{1}{\sqrt{N - |M|}} \sum_{j \notin M} |j\rangle$. Now if $\|P_M\| = 1 - \epsilon$, then the probability of reaching a marked item after $t$ steps is at least $1 - \|P_M\|^t = 1 - (1 - \epsilon)^t$, which is $\Omega(1)$ provided $t = O(1/\epsilon) = O\big(\frac{1}{1 - \|P_M\|}\big)$.
It turns out that we can bound $\|P_M\|$ away from 1 knowing only the fraction of marked vertices and the spectrum of the original walk. Thus we can upper bound the hitting time, the time required to reach some marked vertex with constant probability.
Lemma 17.2. If the second largest eigenvalue of $P$ (in absolute value) is at most $1 - \delta$ and $|M| \geq \epsilon N$, then $\|P_M\| \leq 1 - \frac{\delta\epsilon}{2}$.
Proof. Let $|v\rangle \in \mathbb{R}^{N - |M|}$ be the principal eigenvector of $P_M$, and let $|w\rangle \in \mathbb{R}^N$ be the vector obtained by padding $|v\rangle$ with 0s for all the marked vertices. We will decompose $|w\rangle$ in the eigenbasis of $P$. Since $P$ is symmetric, it is actually doubly stochastic, and the uniform vector $|V\rangle = \frac{1}{\sqrt{N}} \sum_j |j\rangle$ corresponds to the eigenvalue 1. All other eigenvectors $|\lambda\rangle$ have
Figure 17.1: The classical gap, $1 - \lambda = 1 - \cos\theta$, appears on the real axis. The quantum phase gap, $\theta = \arccos\lambda$, is quadratically larger, since $\cos\theta \geq 1 - \theta^2/2$, i.e., $\arccos\lambda \geq \sqrt{2(1 - \lambda)}$. (The figure shows the point $e^{i\theta}$ on the unit circle in the complex plane.)
and measuring whether the first register corresponds to a marked vertex; if it does then we are done, and if not then we have prepared $|\psi\rangle$.
The matrix $D$ for the walk $P'$ is
$$D = \begin{pmatrix} P_M & 0 \\ 0 & I \end{pmatrix}, \qquad (17.51)$$
so according to Theorem 17.1, the eigenvalues of the resulting walk operator $U$ are $\pm 1$ and $e^{\pm i \arccos \lambda}$, where $\lambda$ runs over the eigenvalues of $P_M$. If the marked set $M$ is empty, then $P' = P$, and $|\psi\rangle$ is an eigenvector of $U$ with eigenvalue 1, so phase estimation on $U$ is guaranteed to return a phase of 0. But if $M$ is non-empty, then the state $|\psi\rangle$ lives entirely within the subspace with eigenvalues $e^{\pm i \arccos \lambda}$. Thus if we perform phase estimation on $U$ with precision $O(\min_\lambda \arccos \lambda)$, we will see a phase different from 0. Since $\arccos \lambda \geq \sqrt{2(1 - \lambda)}$ (see Figure 17.1 for an illustration), we see that precision $O(\sqrt{1 - \|P_M\|})$ suffices. So the quantum algorithm can decide whether there is a marked vertex in time $O(1/\sqrt{1 - \|P_M\|}) = O(1/\sqrt{\delta\epsilon})$.
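The quadratic relation between the classical gap and the quantum phase gap invoked here amounts to the elementary inequality $\arccos\lambda \geq \sqrt{2(1-\lambda)}$, which a quick numerical sweep confirms:

```python
# Numerical check of the phase-gap inequality arccos(lambda) >= sqrt(2(1 - lambda)).
import numpy as np

lam = np.linspace(-1.0, 1.0, 1001)
print(np.all(np.arccos(lam) >= np.sqrt(2 * (1 - lam)) - 1e-12))   # → True
```

Equivalently, writing $\lambda = \cos\theta$, the inequality reduces to $\theta \geq 2\sin(\theta/2)$, which holds for all $\theta \in [0, \pi]$.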
Chapter 18
Unstructured search
Now we begin to discuss applications of quantum walks to search algorithms. We start with the most basic
of all search problems, the unstructured search problem (which is solved optimally by Grover's algorithm).
We discuss how this problem fits into the framework of quantum walk search, and also describe amplitude
amplification and quantum counting in this setting. We also discuss quantum walk algorithms for the search
problem under locality constraints.
This random walk gives rise to a very simple classical algorithm for unstructured search. In this algorithm, we start from a uniformly random item and repeatedly choose a new item uniformly at random from the other $N - 1$ possibilities, stopping when we reach a marked item. The fraction of marked items is $\epsilon = |M|/N$, so the hitting time of this walk is
$$O\!\left(\frac{1}{\delta\epsilon}\right) = \frac{(N-1)N}{(N-2)\,|M|} = O(N/|M|) \qquad (18.3)$$
(this is only an upper bound on the hitting time, but in this case we know it is optimal). Of course, if we have no a priori lower bound on $|M|$ in the event that $M$ is non-empty, the best we can say is that $\epsilon \geq 1/N$, giving a running time $O(N)$.
The corresponding quantum walk search algorithm has a hitting time of
$$O\!\left(\frac{1}{\sqrt{\delta\epsilon}}\right) = O(\sqrt{N/|M|}), \qquad (18.4)$$
corresponding to the running time of Grover's algorithm. To see that this actually gives an algorithm using $O(\sqrt{N/|M|})$ queries, we need to see that a step of the quantum walk can be performed using only $O(1)$ quantum queries. In the case where the first item is marked, the modified classical walk matrix is
$$P' = \frac{1}{N-1} \begin{pmatrix} N-1 & 1 & 1 & \cdots & 1 \\ 0 & 0 & 1 & \cdots & 1 \\ 0 & 1 & 0 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & 1 \\ 0 & 1 & 1 & \cdots & 0 \end{pmatrix}, \qquad (18.5)$$
so that the vectors $|\psi_j\rangle$ are $|\psi_1\rangle = |1, 1\rangle$ and
$$|\psi_j\rangle = |j, S \setminus \{j\}\rangle = \sqrt{\tfrac{N}{N-1}}\, |j, S\rangle - \tfrac{1}{\sqrt{N-1}}\, |j, j\rangle \quad \text{for } j = 2, \ldots, N.$$
With a general marked set $M$, the projector onto the span of these states is
$$\Pi = \sum_{j \in M} |j, j\rangle\langle j, j| + \sum_{j \notin M} |j, S \setminus \{j\}\rangle\langle j, S \setminus \{j\}|, \qquad (18.6)$$
so the operator $2\Pi - 1$ acts as Grover diffusion over the neighbors when the vertex is unmarked, and as a phase flip when the vertex is marked. (Note that since we start from the state $|\psi\rangle = \sum_{j \notin M} |\psi_j\rangle / \sqrt{N - |M|}$, we stay in the subspace of states $\mathrm{span}\{|j, k\rangle : (j, k) \in E\}$, and in particular have zero support on any state $|j, j\rangle$ for $j \in V$, so $2\Pi - 1$ acts as $-1$ when the first register holds a marked vertex.) Each such step can be implemented using two queries of the black box, one to compute whether we are at a marked vertex and one to uncompute that information; the subsequent swap operation requires no queries. Thus the query complexity is indeed $O(\sqrt{N/|M|})$.
This algorithm is not exactly the same as Grover's; for example, it works in the Hilbert space $\mathbb{C}^N \otimes \mathbb{C}^N$ instead of $\mathbb{C}^N$. Nevertheless, it is clearly closely related. In particular, notice that in Grover's algorithm, the unitary operation $2|S\rangle\langle S| - 1$ can be viewed as a kind of discrete-time quantum walk on the complete graph, where in this particular case no coin is necessary to define the walk.
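The correspondence with Grover's algorithm can be seen in simulation. The sketch below ($N$ and the number of steps are illustrative choices) iterates $U = S(2\Pi - 1)$ with the projector (18.6), starting from the uniform superposition over the unmarked $|\psi_j\rangle$, and watches the probability of the first register holding a marked vertex grow:

```python
# Quantum-walk search on the complete graph, per (18.6): marked vertices get
# the self-loop coin state |j,j>. The success probability becomes a constant
# after O(sqrt(N)) steps. N, the marked set, and the step count are illustrative.
import numpy as np

N = 16
marked = {0}
dim = N * N                                # basis |j,k>, index j*N + k

Pi = np.zeros((dim, dim))
for j in range(N):
    v = np.zeros(dim)
    if j in marked:
        v[j * N + j] = 1.0                 # |j, j>
    else:
        for k in range(N):
            if k != j:
                v[j * N + k] = 1.0 / np.sqrt(N - 1)   # |j, S \ {j}>
    Pi += np.outer(v, v)

S = np.zeros((dim, dim))
for j in range(N):
    for k in range(N):
        S[k * N + j, j * N + k] = 1.0
U = S @ (2 * Pi - np.eye(dim))

state = np.zeros(dim, dtype=complex)
for j in range(N):
    if j not in marked:
        for k in range(N):
            if k != j:
                state[j * N + k] = 1.0
state /= np.linalg.norm(state)

best = 0.0
for _ in range(15):                        # a few times sqrt(N) steps
    state = U @ state
    p_marked = sum(abs(state[j * N + k]) ** 2 for j in marked for k in range(N))
    best = max(best, p_marked)
print(best)                                # grows to a constant
```

The rotation angle per step is roughly $\arccos\|P_M\|$, so the peak arrives after about $\pi/\arccos\|P_M\| = O(\sqrt{N})$ steps, matching (18.4).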
The algorithm we have described so far only solves the decision version of unstructured search. To find a marked item, we could use bisection, but this would introduce a logarithmic overhead. In fact, it can be
shown that the final state of the quantum walk algorithm actually encodes a marked item when one exists.
Amplitude amplification is a general method for boosting the success probability of a (classical or quantum) subroutine [23]. It can be implemented by quantum walk search as follows. Suppose we have a procedure that produces a correct answer with probability $p$ (i.e., with an amplitude of magnitude $\sqrt{p}$ if we view it as a quantum process). From this procedure we can define a two-state Markov chain that, at each step, moves from the state where the answer is not known to the state where the answer is known with probability $p$, and then remains there. This walk has the transition matrix
$$P' = \begin{pmatrix} 1 - p & 0 \\ p & 1 \end{pmatrix},$$
so $P_M = 1 - p$, giving a quantum hitting time of $O(1/\sqrt{1 - \|P_M\|}) = O(1/\sqrt{p})$.
For some applications, it may be desirable to estimate the value of $p$. Quantizing the above two-state Markov chain gives eigenvalues in the non-marked subspace of $e^{\pm i \arccos(1-p)} = e^{\pm i(\sqrt{2p} + O(p^{3/2}))}$. By applying phase estimation, we can determine $p$ approximately. Recall that phase estimation gives an estimate with precision $\epsilon$ using $O(1/\epsilon)$ applications of the given unitary [34] (assuming we cannot apply high powers of the unitary any more efficiently than simply applying it repeatedly). An estimate of $\sqrt{p}$ with precision $\epsilon$ gives an estimate of $p$ with precision $\epsilon\sqrt{p}$ (since $(\sqrt{p} + O(\epsilon))^2 = p + O(\epsilon\sqrt{p})$), so we can produce an estimate of $p$ with precision $\epsilon$ in $O(\sqrt{p}/\epsilon)$ steps.
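The expansion used here is just $\arccos(1 - p) = \sqrt{2p} + O(p^{3/2})$, which a quick numerical check confirms:

```python
# Check the small-p expansion arccos(1 - p) = sqrt(2p) + O(p^{3/2}).
import numpy as np

for p in [0.1, 0.01, 0.001]:
    err = np.arccos(1 - p) - np.sqrt(2 * p)
    print(p, err)          # err shrinks like p**1.5
```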
In particular, if the Markov chain is a search of the complete graph as described in the previous section, with $|M|$ marked sites out of $N$, then $p = |M|/N$, and this allows us to count the number of marked items. We obtain an estimate of $|M|/N$ with precision $\epsilon$ in $O(\sqrt{|M|/N}/\epsilon)$ steps. If we want a multiplicative approximation of $|M|$ with precision $\epsilon$, this means we need $O(\sqrt{N/|M|}/\epsilon)$ steps.
Note that for exact counting, no speedup is possible in general. If $|M| = \Theta(N)$ then we need to estimate $p$ with precision $O(1/N)$ to uniquely determine $|M|$, but then the running time of the above procedure is $O(N)$. In fact, it can be shown that exact counting requires $\Omega(N)$ queries [16].
given by
$$|k\rangle := \frac{1}{\sqrt{N}} \sum_{x} e^{2\pi i\, k \cdot x / N^{1/d}}\, |x\rangle \qquad (18.7)$$
where $k$ is a $d$-component vector of integers from 0 to $N^{1/d} - 1$. The corresponding eigenvalues are
$$\sum_{j=1}^{d} 2 \cos\!\left(\frac{2\pi k_j}{N^{1/d}}\right). \qquad (18.8)$$
Normalizing to obtain a stochastic matrix, we simply divide these eigenvalues by $2d$. The 1 eigenvector has $k = (0, 0, \ldots, 0)$, and the second largest eigenvalue comes from (e.g.) $k = (1, 0, \ldots, 0)$, with an eigenvalue
$$\frac{1}{d}\left(d - 1 + \cos\frac{2\pi}{N^{1/d}}\right) \approx 1 - \frac{1}{2d}\left(\frac{2\pi}{N^{1/d}}\right)^2. \qquad (18.9)$$
2
2 2/d
Thus the gap of the walk matrix P is about dN 2/d = O(N ). This is another case in which the bound
on the classical hitting time in terms of eigenvalues of P is too loose (it gives only O(N 1+2/d )), and instead
we must directly estimate the gap of PM . One can show that the classical hitting time is O(N 2 ) in d = 1,
O(N log N ) in d = 2, and O(N ) for any d 3. Thus there is a local quantum walk search algorithm that
saturates the lower bound for any d 3, and one that runs in time time O( N log N ) for d = 2. We already
argued that there could be no speedup for d = 1, and indeed we see that the quantum hitting time in this
case is O(N ).
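The gap estimate (18.9) is easy to check numerically. The sketch below (an illustration with made-up side lengths, not from the text) computes the exact gap 1 − (1/d)(d − 1 + cos(2π/N^{1/d})) of the normalized walk matrix and confirms that it agrees with (2π²/d)N^{−2/d} to leading order.

```python
import math

def torus_gap(L, d):
    """Exact spectral gap of the normalized walk matrix P on the d-dimensional
    torus of side L (so N = L**d sites). The second largest eigenvalue comes
    from k = (1, 0, ..., 0)."""
    second = (d - 1 + math.cos(2 * math.pi / L)) / d
    return 1 - second

for d, L in [(1, 64), (2, 32), (3, 16)]:
    N = L ** d
    gap = torus_gap(L, d)
    predicted = (2 * math.pi ** 2 / d) * N ** (-2 / d)
    # agreement to leading order (within a few percent for these sizes)
    assert abs(gap - predicted) / predicted < 0.05
```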
Chapter 19

Quantum walk search
In this lecture we will discuss the algorithm that cemented the importance of quantum walk as a tool for quantum query algorithms: Ambainis's algorithm for the element distinctness problem [11]. The key new conceptual idea of this algorithm is to consider walks that store information obtained from many queries at each vertex, but that do not require many queries to update this information for an adjacent vertex. This idea leads to a general, powerful framework for quantum walk search [70, 84].
will detect. Hence a k-query element distinctness algorithm implies an O(√k)-query collision algorithm; or equivalently, a k-query collision lower bound implies an Ω(k²) element distinctness lower bound.

Now the question remains: can we close the gap between the O(n^{3/4}) upper bound and this Ω(n^{2/3}) lower bound? Ambainis's quantum walk algorithm does exactly this.
Chapter 20

Query complexity and the polynomial method
So far, we have discussed several different kinds of quantum algorithms. In the next few chapters, we discuss
ways of establishing limitations on the power of quantum algorithms [57]. After reviewing the model of
quantum query complexity, this chapter presents the polynomial method, an approach that relates quantum
query algorithms to properties of polynomials.
D(f) denotes the deterministic query complexity, where the algorithm is classical and must always work correctly.

R_ε denotes the randomized query complexity with error probability at most ε. Note that it does not depend strongly on ε, since we can boost the success probability by repeating the computation several times and taking a majority vote. Therefore R_ε(f) = Θ(R_{1/3}(f)) for any constant ε, so sometimes we simply write R(f).

Q_ε denotes the quantum query complexity, again with error probability at most ε. Similarly to the randomized case, Q_ε(f) = Θ(Q_{1/3}(f)) for any constant ε, so sometimes we simply write Q(f).
We know that D(or) = n and R(or) = Θ(n). Grover's algorithm shows that Q(or) = O(√n). In this lecture we will use the polynomial method to show (among other things) that Q(or) = Ω(√n), a tight lower bound.
This is simply the linear extension of the natural reversible oracle mapping (i, b) ↦ (i, b ⊕ x_i), which can be performed efficiently given the ability to efficiently compute i ↦ x_i. Note that the algorithm may involve states in a larger Hilbert space; implicitly, the oracle acts as the identity on any ancillary registers.
It is often convenient to instead consider the phase oracle, which is obtained by conjugating the bit-flip oracle by Hadamard gates: by the well-known phase kickback trick, O'_x = (I ⊗ H)O_x(I ⊗ H) satisfies O'_x|i, b⟩ = (−1)^{b x_i}|i, b⟩. Note that this is slightly wasteful since O'_x|i, 0⟩ = |i, 0⟩ for all i; we could equivalently consider a phase oracle O''_x defined by O''_x|0⟩ = |0⟩ and O''_x|i⟩ = (−1)^{x_i}|i⟩ for all i ∈ {1, ..., n}. However, it is essential to include the ability to not query the oracle by giving the oracle some eigenstate of known eigenvalue, independent of x. If we could only perform the phase flip |i⟩ ↦ (−1)^{x_i}|i⟩ for i ∈ {1, ..., n}, then we could not tell a string x from its bitwise complement x̄.
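The phase kickback identity above can be checked directly with small matrices. The following sketch (an illustration; the string x and the toy n-dimensional index register are made up) builds the bit-flip oracle and verifies that conjugation by H on the output qubit diagonalizes it with entries (−1)^{b x_i}.

```python
import numpy as np

n = 3
x = [1, 0, 1]                            # hypothetical oracle string x_1 x_2 x_3
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

# Bit-flip oracle O_x |i,b> = |i, b XOR x_i>, on the basis ordered as (i, b)
Ox = np.zeros((2 * n, 2 * n))
for i in range(n):
    for b in range(2):
        Ox[2 * i + (b ^ x[i]), 2 * i + b] = 1

# Conjugating the output qubit by Hadamard gives the phase oracle
phase = np.kron(np.eye(n), H) @ Ox @ np.kron(np.eye(n), H)

# The result is diagonal with entries (-1)^(b * x_i): H X H = Z blockwise
expected = np.diag([(-1) ** (b * x[i]) for i in range(n) for b in range(2)])
assert np.allclose(phase, expected)
```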
These constructions can easily be generalized to the case of a d-ary input alphabet, say Σ = Z_d (identifying input symbols with integers modulo d). Then for b ∈ Σ, we can define an oracle O_x by O_x|i, b⟩ = |i, b + x_i⟩ (with addition in Z_d). Taking the Fourier transform of the second register gives a phase oracle Ô_x = (I ⊗ F_{Z_d})O_x(I ⊗ F_{Z_d}^†) satisfying Ô_x|i, b⟩ = ω_d^{b x_i}|i, b⟩, where ω_d := e^{2πi/d}.
Lemma 20.1. The acceptance probability of a t-query quantum algorithm for a problem with black-box input x ∈ {0,1}^n is a polynomial in x_1, ..., x_n of degree at most 2t.

Proof. We claim that the amplitude of any basis state is a polynomial of degree at most t, so that the probability of any basis state (and hence the probability of success) is a polynomial of degree at most 2t.

The proof is by induction on t. If an algorithm makes no queries to the input, then its success probability is independent of the input, so it is a constant, a polynomial of degree 0.

For the induction step, a query maps

    |i, b⟩ ↦ (−1)^{b x_i} |i, b⟩    (20.6)
          = (1 − 2 b x_i) |i, b⟩,    (20.7)

which is linear in x_i. Thus, if the amplitudes before a query are polynomials of degree at most t − 1, then the amplitudes after that query are polynomials of degree at most t (and the subsequent unitary, which is independent of x, does not change the degree).
Consider a Boolean function f: {0,1}^n → {0,1}. We say a polynomial p ∈ R[x_1, ..., x_n] represents f if p(x) = f(x) for all x ∈ {0,1}^n. Letting deg(f) denote the smallest degree of any polynomial representing f, we have Q_0(f) ≥ deg(f)/2.

To handle bounded-error algorithms, we introduce the concept of approximate degree. We say a polynomial p ε-represents f if |p(x) − f(x)| ≤ ε for all x ∈ {0,1}^n. Then the ε-approximate degree of f, denoted \widetilde{deg}_ε(f), is the smallest degree of any polynomial that ε-represents f. Clearly, Q_ε(f) ≥ \widetilde{deg}_ε(f)/2. Since bounded-error query complexity does not depend strongly on the particular error probability ε, we can define, say, \widetilde{deg}(f) := \widetilde{deg}_{1/3}(f).

Now to lower bound the quantum query complexity of a Boolean function, it suffices to lower bound its approximate degree.
20.4 Symmetrization
While polynomials are well-understood objects, the acceptance probability is a multivariate polynomial, so it can be rather complicated. Since x² = x for x ∈ {0,1}, we can restrict our attention to multilinear polynomials, but it is still somewhat difficult to deal with such polynomials directly. Fortunately, for many functions it suffices to consider a related univariate polynomial obtained by symmetrization.

For a string x ∈ {0,1}^n, let |x| denote the Hamming weight of x, the number of 1s in x.
Lemma 20.2. Given any n-variate multilinear polynomial p, let P(k) := E_{|x|=k}[p(x)]. Then P is a polynomial with deg(P) ≤ deg(p).
Thus the polynomial method is a particularly natural approach for symmetric functions, those that only
depend on the Hamming weight of the input.
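Lemma 20.2 is easy to check numerically. The sketch below (an illustration with a made-up polynomial) symmetrizes the multilinear polynomial p(x) = x₁x₂ + x₃ on n = 4 variables and confirms that the resulting univariate P is exactly a polynomial of degree at most deg(p) = 2.

```python
import itertools
import numpy as np

n = 4
def p(x):
    # multilinear polynomial of degree 2 (a made-up example)
    return x[0] * x[1] + x[2]

# P(k) = average of p over all strings of Hamming weight k
P = [np.mean([p(x) for x in itertools.product([0, 1], repeat=n) if sum(x) == k])
     for k in range(n + 1)]

# Fit a degree-2 univariate polynomial through the n+1 points; a perfect fit
# confirms deg(P) <= deg(p) = 2.
coeffs = np.polyfit(range(n + 1), P, 2)
assert np.allclose(np.polyval(coeffs, range(n + 1)), P)
```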
20.5 Parity
Let parity: {0,1}^n → {0,1} denote the symmetric function parity(x) = x_1 ⊕ ⋯ ⊕ x_n. Recall that Deutsch's problem, which is the problem of computing the parity of 2 bits, can be solved exactly with only one quantum query. Applying this algorithm to a pair of bits at a time and then taking the parity of the results, we see that Q_0(parity) ≤ n/2.

What can we say about lower bounds for computing parity? Symmetrizing parity gives the function P: {0, 1, ..., n} → R defined by

    P(k) = { 0 if k is even; 1 if k is odd }.    (20.14)

Since P changes direction n times, deg(P) ≥ n, so we see that Q_0(parity) ≥ n/2. Thus Deutsch's algorithm is tight among zero-error algorithms.
What about bounded-error algorithms? To understand this, we would like to lower bound the approximate degree of parity. If |p(x) − f(x)| ≤ ε for all x ∈ {0,1}^n, then

    |P(k) − F(k)| = |E_{|x|=k}[p(x) − f(x)]| ≤ ε    (20.15)

for all k ∈ {0, 1, ..., n}, where P is the symmetrization of p and F is the symmetrization of f. Thus, a multilinear polynomial p that ε-approximates parity implies a univariate polynomial P satisfying P(k) ≤ ε for k even and P(k) ≥ 1 − ε for k odd. For any ε < 1/2, this function still changes direction n times, so in fact we have \widetilde{deg}_ε(f) ≥ n, and hence Q_ε(parity) ≥ n/2.

This shows that the strategy for computing parity using Deutsch's algorithm is optimal, even among bounded-error algorithms. This is an example of a problem for which a quantum computer cannot get a significant speedup: here the speedup is only by a factor of 2. In fact, we need at least n/2 queries to succeed with any bounded error, even with very small advantage (e.g., even if we only want to be correct with probability 1/2 + 10^{−100}). In contrast, while the adversary method can prove an Ω(n) lower bound for parity, the constant factor that it establishes is error-dependent.

Note that this also shows we need Ω(n) queries to exactly count the number of marked items in an unstructured search problem, since exactly determining the number of 1s would in particular determine whether the number of 1s is odd or even.
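A concrete supplementary check that the approximate degree of parity is at least n (using finite differences rather than the direction-counting argument above): the n-th finite difference Σ_k (−1)^k C(n,k) Q(k) vanishes for every polynomial Q of degree less than n, yet for the symmetrized parity F(k) = k mod 2 it has magnitude 2^{n−1}. Since Σ_k C(n,k) = 2^n, any P with |P(k) − F(k)| ≤ ε would have a finite difference of magnitude at least 2^{n−1} − 2^n ε > 0 for ε < 1/2, so P must have degree at least n.

```python
import math

n = 10

# n-th finite difference of the symmetrized parity F(k) = k mod 2
diff_F = sum((-1) ** k * math.comb(n, k) * (k % 2) for k in range(n + 1))
assert abs(diff_F) == 2 ** (n - 1)

# the same signed sum annihilates any polynomial of degree < n,
# e.g. the monomial k^(n-1)
diff_P = sum((-1) ** k * math.comb(n, k) * k ** (n - 1) for k in range(n + 1))
assert diff_P == 0
```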
20.6 Unstructured search

Now we apply the polynomial method to unstructured search, i.e., to the or function. We use the Markov brothers' inequality, which states that for any polynomial P,

    max_{x∈[0,n]} |dP(x)/dx| ≤ (deg(P)²/n) (max_{x∈[0,n]} P(x) − min_{x∈[0,n]} P(x)).    (20.16)

Equivalently, if we let

    h := max_{x∈[0,n]} P(x) − min_{x∈[0,n]} P(x)    (20.17)

denote the total variation of P over that range and

    d := max_{x∈[0,n]} |dP(x)/dx|    (20.18)

denote the largest derivative of P in that range, then we have deg(P) ≥ √(nd/h).
Now let P be a polynomial that ε-approximates or. Since P(0) ≤ ε and P(1) ≥ 1 − ε, P must increase by at least 1 − 2ε in going from k = 0 to k = 1, so d ≥ 1 − 2ε.

We have no particular bound on h, since we have no control over the value of P at non-integer points; the function could become arbitrarily large or small. However, since P(k) ∈ [0, 1] for k ∈ {0, 1, ..., n}, a large value of h implies a large value of d, since P must change fast enough to start from and return to values in the range [0, 1]. In particular, P must change by at least (h − 1)/2 over a range of k of width at most 1/2, so we have d ≥ h − 1. Therefore,

    deg(P) ≥ √(n · max{1 − 2ε, h − 1} / h)    (20.19)
           = Ω(√n).    (20.20)

It follows that Q(or) = Ω(√n).
Note that the same argument applies for a function that takes the value 0 whenever |x| = w and the value
1 whenever |x| = w + 1, for any w; in particular, it applies to any non-constant symmetric function. (Of
course, we can do better for some symmetric functions, such as parity and also majority, among others.)
Chapter 21

The collision problem
We now discuss the quantum lower bound for the collision problem. This lower bound is a more involved application of the polynomial method than the simple examples we've seen so far.
query random indices until we observe a collision. For two distinct random indices i, j ∈ {1, ..., n}, we have Pr(x_i = x_j) = 1/(n − 1). If we query m indices in this way, there are \binom{m}{2} pairs, so the expected number of collisions seen is \binom{m}{2}/(n − 1). With m = √n this is Θ(1), so we expect to see a collision. Indeed, a second moment argument shows that this happens with constant probability.

In fact, this algorithm is optimal. Until we see a collision, there is no better strategy than to query randomly. By the union bound, the probability of seeing a collision after making m queries is at most \binom{m}{2}/(n − 1) = O(m²/n), so m = Ω(√n).
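The birthday argument can be checked by simulation. The following Monte Carlo sketch (all parameters made up) queries m ≈ √n random positions of a random 2-to-1 function and confirms that a collision appears with constant probability.

```python
import random

def trial(n, m, rng):
    """One experiment: sample m distinct indices of a random 2-to-1 function
    on {0, ..., n-1} and report whether a collision was seen."""
    vals = list(range(n // 2)) * 2      # a random 2-to-1 function
    rng.shuffle(vals)
    seen = set()
    for i in rng.sample(range(n), m):
        if vals[i] in seen:
            return True
        seen.add(vals[i])
    return False

rng = random.Random(1234)
n = 10_000
m = int(n ** 0.5)                       # ~ C(m,2)/(n-1) = Theta(1) expected pairs
hits = sum(trial(n, m, rng) for _ in range(300))
assert 0.1 < hits / 300 < 0.9           # collision probability is a constant
```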
Then we claim that the acceptance probability of a t-query quantum algorithm is a multilinear polynomial in the {δ_ij} of degree at most 2t.

This can be proved along similar lines to the binary case. Suppose we use an addition modulo n query, where |i, j⟩ ↦ |i, j + x_i⟩. Then we have

    |i, j + x_i⟩ = Σ_k δ_ik |i, j + k⟩    (21.4)
                = Σ_ℓ δ_{i,ℓ−j} |i, ℓ⟩,    (21.5)
so the degree of each amplitude can increase by at most 1 when we make a query.
Next we would like to obtain a simpler polynomial. We cannot directly symmetrize over the variables {δ_ij}, as this would destroy too much of the structure of the problem.
The original idea leading to a nontrivial collision lower bound, due to Aaronson, was to express the acceptance probability as a bivariate polynomial in n and r, where the function is r-to-one. The main difficulty with this approach is that we need to have r | n in order for such inputs to make sense (so that we can at least say that the acceptance probability of a quantum algorithm is defined, and hence a given approximating polynomial is bounded between 0 and 1). This approach originally gave a lower bound of Ω(n^{1/5}) [1]. Subsequently, Shi improved this to give the optimal result of Ω(n^{1/3}) [90].
Given such an input, we can obtain a family of inputs by permuting the input alphabet and the characters of the string arbitrarily. Specifically, for any permutations π, σ of {1, ..., n}, we define an input x̃ with x̃_i := σ(x_{π(i)}). This induces corresponding binary variables δ_ij with δ_ij = 1 iff x̃_i = j.

Now we claim that the acceptance probability of a quantum algorithm presented with such an input is a polynomial in m, a, b.
Lemma 21.1. Let p({δ_ij}) be a polynomial in the δ_ij. For any valid triple (m, a, b), let

    P(m, a, b) := E_{π,σ}[p({δ_ij})].    (21.7)

Then P is a polynomial in m, a, b of degree at most deg(p).

where

    (n)_k := n!/(n − k)! = n(n − 1) ⋯ (n − k + 1).    (21.13)
Here the numerator of (21.11) has three contributions: the number of ways to permute the u function values in the a-to-one part is (m/a)_u, the number of ways to permute the t − u function values in the b-to-one part is ((n − m)/b)_{t−u}, and the number of ways to permute the n − t remaining function values is (n − t)!. Note that this expression is a rational function in m, a, b whose numerator has degree t and whose denominator is a^u b^{t−u}.
Now consider the latter term of (21.10). Given that X occurs, Pr_π[∀j, ∀i ∈ S_j, δ_ij = 1] is independent of σ, so suppose σ is any permutation such that X occurs. In other words, we only need to count consistent ways of permuting the indices i. Observe that the number of ways to permute the indices i such that x_i = j for some j ∈ U is

    a(a − 1) ⋯ (a − |S_j| + 1) = (a)_{|S_j|}.    (21.14)

Similarly, the number of ways to permute the indices i such that x_i = j for some j ∉ U is (b)_{|S_j|}. In addition, we can permute the remaining n − s indices however we like. Thus we have

    Pr_π[∀j, ∀i ∈ S_j, δ_ij = 1 | X] = (n − s)! ∏_{j∈U} (a)_{|S_j|} ∏_{j∉U} (b)_{|S_j|} / n!    (21.15)
                                     = ∏_{j∈U} (a)_{|S_j|} ∏_{j∉U} (b)_{|S_j|} / (n)_s.    (21.16)

This expression is a polynomial in a, b of degree s. Also, it is divisible by a^u and b^{t−u}. Thus P(m, a, b) is a polynomial in m, a, b of degree t + s − u − (t − u) = s.
Lemma 21.2 (Paturi). Let f ∈ R[x], let a, b ∈ Z with a < b, and let ξ ∈ [a, b]. If there are constants c, d such that

    |f(i)| ≤ c for all integers i ∈ [a, b], and
    |f(⌊ξ⌋) − f(ξ)| ≥ d,

then deg(f) = Ω(√((ξ − a + 1)(b − ξ + 1))).

Regardless of ξ, Paturi's lemma always shows that the degree is Ω(√(b − a)). If ξ is near the middle of the range then it does much better, showing that the degree is Ω(b − a). Also note that by continuity, it is sufficient for f to differ by a constant amount between two consecutive integers.
Now we are ready to prove that the quantum query complexity of the collision problem is Ω(n^{1/3}). Let p({δ_ij}) be the acceptance probability of a t-query quantum algorithm that solves the collision problem with error probability at most 1/3, and let P(m, a, b) be as above. We know that t ≥ deg(P)/2. Furthermore, we know that P has the following properties:

    0 ≤ P(m, 1, 1) ≤ 1/3
    2/3 ≤ P(m, 2, 2) ≤ 1

Now consider inputs that are roughly half one-to-one and half two-to-one. For concreteness, let m = 2⌊n/4⌋. Since n − m is even (recall that n is always even by assumption since otherwise the problem is trivial), (m, 1, 2) is valid. We consider two cases, depending on whether the algorithm is more likely to call this a yes or a no input. First suppose P(m, 1, 2) ≥ 1/2.

Let r be the smallest integer such that |P(m, 1, r)| ≥ 2. First we consider P(m, 1, x) as a function of x. For all x ∈ {1, ..., r − 1}, we have −2 ≤ P(m, 1, x) ≤ 2. But we also know that |P(m, 1, 1) − P(m, 1, 2)| ≥ 1/2 − 1/3 = 1/6. By Paturi's lemma, this implies that deg(P) = Ω(√r).
On the other hand, consider the polynomial g(x) := P(n − rx, 1, r). When x ∈ Z is such that rx ∈ {0, ..., n}, the triple (n − rx, 1, r) is valid, so we have 0 ≤ g(x) ≤ 1 for all integers x ∈ [0, ⌊n/r⌋]. However, |g((n − m)/r)| = |P(m, 1, r)| ≥ 2, and (n − m)/r is about halfway between 0 and ⌊n/r⌋. Thus by Paturi's lemma, deg(P) = Ω(n/r).

Combining these results, we have deg(P) = Ω(√r + n/r). The weakest lower bound is obtained when the two terms are equal, i.e., when r = Θ(n^{2/3}); therefore deg(P) = Ω(n^{1/3}).
It remains to consider the case where P(m, 1, 2) < 1/2. But the same conclusion holds here by a very similar argument. (Let r be the smallest even integer for which |P(m, r, 2)| ≥ 2; on the one hand, deg(P) = Ω(√r) as before, but on the other hand, the polynomial h(x) := P(rx, r, 2) shows that deg(P) = Ω(n/r).)

Overall, it follows that any quantum algorithm for solving the collision problem must use Ω(n^{1/3}) queries.
Chapter 22

The quantum adversary method
We now discuss a second approach to proving quantum query lower bounds, the quantum adversary method [10]. In fact, we'll see later that the generalized version of the adversary method we consider here (allowing negative weights [55]) turns out to be an upper bound on quantum query complexity, up to constant factors [79, 68].
An algorithm does not have direct access to the oracle string, and hence can only perform unitary operations that act as the identity on the adversary's superposition. After t steps, an algorithm maps the overall state to

    |ψ^t⟩ := (I ⊗ U_t)O ⋯ (I ⊗ U_2)O(I ⊗ U_1)O (Σ_{x∈S} a_x |x⟩|ψ⟩)    (22.2)
          = Σ_{x∈S} a_x |x⟩|ψ^t_x⟩.    (22.3)
The main idea of the approach is that for the algorithm to learn x, this state must become very entangled. To measure the entanglement of the pure state |ψ^t⟩, we can consider the reduced density matrix of the oracle,

    ρ^t := Σ_{x,y∈S} a_x a_y^* ⟨ψ^t_y|ψ^t_x⟩ |x⟩⟨y|.    (22.4)

Initially, the state ρ^0 is pure. Our goal is to quantify how mixed it must become (i.e., how entangled the overall state must be) before we can compute f with error at most ε. To do this we could consider, for example, the entropy of ρ^t. However, it turns out that other measures are easier to deal with.
In particular, we have the following basic fact about the distinguishability of quantum states (for a proof, see for example Section A.9 of KLM):

Fact 22.1. Given one of two pure states |ψ⟩, |φ⟩, we can make a measurement that determines which state we have with error probability at most ε ∈ [0, 1/2] if and only if |⟨ψ|φ⟩| ≤ 2√(ε(1 − ε)).

Thus it is convenient to consider measures that are linear in the inner products ⟨ψ^t_x|ψ^t_y⟩.
Note that this is a simple function of the entries of ρ^j. The idea of the lower bound is to show that W^j starts out large, must become small in order to compute f, and cannot change by much if we make a query.

The initial value of the weight function is

    W^0 = Σ_{x,y∈S} Γ_{xy} a_x^* a_y ⟨ψ^0_x|ψ^0_y⟩    (22.6)
        = Σ_{x,y∈S} Γ_{xy} a_x^* a_y    (22.7)

since |ψ^0_x⟩ cannot depend on x. To make this as large as possible, we take a to be a principal eigenvector of Γ, an eigenvector with eigenvalue ±‖Γ‖. Then |W^0| = ‖Γ‖.
The final value of the weight function is easier to bound if we assume a nonnegative adversary matrix. The final value is constrained by the fact that we must distinguish x from y with error probability at most ε whenever f(x) ≠ f(y). For this to hold after t queries, we need |⟨ψ^t_x|ψ^t_y⟩| ≤ 2√(ε(1 − ε)) for all pairs x, y ∈ S with f(x) ≠ f(y) (by the above Fact). Thus we have

    |W^t| ≤ Σ_{x,y∈S} Γ_{xy} a_x a_y · 2√(ε(1 − ε))    (22.8)
          = 2√(ε(1 − ε)) ‖Γ‖.    (22.9)

Here we can include the terms where f(x) = f(y) in the sum since Γ_{xy} = 0 for such pairs. We also used the fact that the principal eigenvector of a nonnegative matrix can be taken to have nonnegative entries (by the Perron-Frobenius theorem).
A similar bound holds if Γ has negative entries, but we need a different argument. In general, one can only show that |W^t| ≤ (2√(ε(1 − ε)) + 2ε)‖Γ‖. But if we assume that f: S → {0,1} has Boolean output, then we can prove the same bound as in the nonnegative case, and the proof is simpler than for a general output space. We use the following simple result, stated in terms of the Frobenius norm ‖X‖_F² := Σ_{a,b} |X_{ab}|²:

Proposition 22.2. For any X ∈ C^{m×n}, Y ∈ C^{n×n}, Z ∈ C^{n×m}, we have |tr(XYZ)| ≤ ‖X‖_F ‖Y‖ ‖Z‖_F.
Proof. We have

    tr(XYZ) = Σ_{a,b,c} X_{ab} Y_{bc} Z_{ca}    (22.10)
            = Σ_a (x^a)† Y z^a    (22.11)

where (x^a)_b := X_{ab}^* and (z^a)_c := Z_{ca}. Thus

    |tr(XYZ)| ≤ Σ_a ‖x^a‖ ‖Y z^a‖    (22.12)
              ≤ ‖Y‖ Σ_a ‖x^a‖ ‖z^a‖    (22.13)
              ≤ ‖Y‖ √(Σ_a ‖x^a‖² · Σ_{a'} ‖z^{a'}‖²) = ‖X‖_F ‖Y‖ ‖Z‖_F    (22.14)

as claimed, where we used the Cauchy-Schwarz inequality in the first and third steps.
To upper bound |W^t| for the negative adversary with Boolean output, write W^t = tr(ΓV) where V_{xy} := a_x^* a_y ⟨ψ^t_x|ψ^t_y⟩ [f(x) ≠ f(y)]. Define

    C := Σ_{x∈S} a_x Π_{f(x)} |ψ^t_x⟩⟨x|    (22.16)
    C̄ := Σ_{x∈S} a_x Π_{1−f(x)} |ψ^t_x⟩⟨x|    (22.17)

with Π_0, Π_1 denoting the projectors onto the subspaces indicating f(x) = 0, 1, respectively. Then, since Π_{f(x)} + Π_{1−f(x)} = I and Π_{f(x)} Π_{1−f(y)} = 0 unless f(x) ≠ f(y), we have V = C†C̄ + C̄†C, so

    W^t = tr(Γ C†C̄) + tr(Γ C̄†C).

By the Proposition, |W^t| ≤ 2‖Γ‖ ‖C‖_F ‖C̄‖_F. Finally, we upper bound ‖C‖_F and ‖C̄‖_F. We have

    ‖C‖_F² + ‖C̄‖_F² = Σ_{x∈S} |a_x|² (‖Π_{f(x)} |ψ^t_x⟩‖² + ‖Π_{1−f(x)} |ψ^t_x⟩‖²) = 1    (22.23)
    ‖C̄‖_F² = Σ_{x∈S} |a_x|² ‖Π_{1−f(x)} |ψ^t_x⟩‖² ≤ ε.    (22.24)

Therefore ‖C‖_F ‖C̄‖_F ≤ max_{x∈[0,ε]} √(x(1 − x)) = √(ε(1 − ε)) (assuming ε ∈ [0, 1/2]), and we find that |W^t| ≤ 2√(ε(1 − ε))‖Γ‖, as claimed.
It remains to understand how much the weight function can decrease at each step of the algorithm. We have

    W^{j+1} − W^j = Σ_{x,y∈S} Γ_{xy} a_x^* a_y (⟨ψ^{j+1}_x|ψ^{j+1}_y⟩ − ⟨ψ^j_x|ψ^j_y⟩).    (22.25)

Consider how the state changes when we make a query. We have |ψ^{j+1}_x⟩ = U_{j+1} O_x |ψ^j_x⟩. Thus the elements of the Gram matrix of the states {|ψ^{j+1}_x⟩ : x ∈ S} are ⟨ψ^{j+1}_x|ψ^{j+1}_y⟩ = ⟨ψ^j_x| O_x O_y |ψ^j_y⟩.

Observe that O_x O_y |i, b⟩ = (−1)^{b(x_i ⊕ y_i)} |i, b⟩. Let P_0 = I ⊗ |0⟩⟨0| denote the projection onto the b = 0 states, and let P_i denote the projection |i, 1⟩⟨i, 1|. (As with O_x, the projections P_i implicitly act as the identity on any ancilla registers, so Σ_{i=0}^n P_i = I.) Then O_x O_y = P_0 + Σ_{i=1}^n (−1)^{x_i ⊕ y_i} P_i, so O_x O_y − I = −2 Σ_{i : x_i ≠ y_i} P_i. Thus we have

    W^{j+1} − W^j = −2 Σ_{x,y∈S} Σ_{i : x_i ≠ y_i} Γ_{xy} a_x^* a_y ⟨ψ^j_x| P_i |ψ^j_y⟩.    (22.29)
For each i ∈ {1, ..., n}, define a matrix Γ_i by (Γ_i)_{xy} := Γ_{xy} if x_i ≠ y_i and (Γ_i)_{xy} := 0 otherwise. Then we have

    W^{j+1} − W^j = −2 Σ_{x,y∈S} Σ_{i=1}^n (Γ_i)_{xy} a_x^* a_y ⟨ψ^j_x| P_i |ψ^j_y⟩    (22.31)
                  = −2 Σ_{i=1}^n tr(Q_i Γ_i Q_i†)    (22.32)

where Q_i := Σ_{x∈S} a_x P_i |ψ^j_x⟩⟨x|. By Proposition 22.2, |tr(Q_i Γ_i Q_i†)| ≤ ‖Γ_i‖ ‖Q_i‖_F². Since

    Σ_{i=1}^n ‖Q_i‖_F² = Σ_{i=1}^n Σ_{x∈S} |a_x|² ‖P_i |ψ^j_x⟩‖²    (22.35)
                       ≤ Σ_{x∈S} |a_x|²    (22.36)
                       = 1,    (22.37)

we have

    |W^{j+1} − W^j| ≤ 2 max_{i∈{1,...,n}} ‖Γ_i‖.
Combining these three facts gives the adversary lower bound. Since |W^0| = ‖Γ‖, we have

    |W^t| ≥ ‖Γ‖ − 2t max_{i∈{1,...,n}} ‖Γ_i‖.    (22.39)

Thus, to have |W^t| ≤ 2√(ε(1 − ε))‖Γ‖, we require

    t ≥ ((1 − 2√(ε(1 − ε)))/2) Adv(f)    (22.40)
where

    Adv(f) := max_Γ ‖Γ‖ / max_{i∈{1,...,n}} ‖Γ_i‖    (22.41)

with the maximum taken over all adversary matrices Γ for the function f. (Often the notation Adv(f) is reserved for the maximization over nonnegative adversary matrices, with the notation Adv±(f) for the generalized adversary method allowing negative weights.)
22.3 Example: Unstructured search

for some nonnegative coefficients γ_1, ..., γ_n. Symmetry suggests that we should take γ_1 = ⋯ = γ_n. This can be formalized, but for the present purposes we can take this as an ansatz.

Setting γ_1 = ⋯ = γ_n = 1 (since an overall scale factor does not affect the bound), we have
    Γ² = \begin{pmatrix} n & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 1 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 1 & \cdots & 1 \end{pmatrix}    (22.43)

which has norm ‖Γ²‖ = n, and hence ‖Γ‖ = √n. We also have

    Γ_1 = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}    (22.44)

and similarly for the other Γ_i, so ‖Γ_i‖ = 1. Thus we find Adv(or) ≥ √n, and it follows that Q_ε(or) ≥ ((1 − 2√(ε(1 − ε)))/2)√n. This shows that Grover's algorithm is optimal up to a constant factor (recall that Grover's algorithm finds a unique marked item with probability 1 − o(1) in (π/4)√n (1 + o(1)) queries).
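These norm computations are easy to verify numerically. The following sketch (with a made-up n) builds Γ for the unique-marked-item version of or, whose inputs are 0 and e_1, ..., e_n, and checks that ‖Γ‖ = √n while ‖Γ_i‖ = 1.

```python
import numpy as np

n = 16
# inputs: the all-zeros string and the n strings of Hamming weight 1
S = [np.zeros(n, dtype=int)] + [np.eye(n, dtype=int)[j] for j in range(n)]

# Gamma connects 0 to each e_j with weight gamma_j = 1 (a star graph)
Gamma = np.zeros((n + 1, n + 1))
Gamma[0, 1:] = Gamma[1:, 0] = 1

norm = np.linalg.norm(Gamma, ord=2)          # spectral norm
assert abs(norm - np.sqrt(n)) < 1e-9

# Gamma_i keeps only the entries of Gamma with x_i != y_i
for i in range(n):
    Gi = np.array([[Gamma[a, b] if S[a][i] != S[b][i] else 0.0
                    for b in range(n + 1)] for a in range(n + 1)])
    assert abs(np.linalg.norm(Gi, ord=2) - 1) < 1e-9

# hence Adv(or) >= ||Gamma|| / max_i ||Gamma_i|| = sqrt(n)
```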
Chapter 23

Span programs and formula evaluation
Having discussed lower bounds on quantum query complexity, we now turn our attention back to upper
bounds. The framework of span programs is a powerful tool for understanding quantum query complexity
[80, 78]. Span programs are closely related to the quantum adversary method, and can be used to show that
the (generalized) adversary method actually characterizes quantum query complexity up to constant factors
[79, 68].
For simplicity, we restrict our attention to the case of a (possibly partial) Boolean function f: S → {0,1} where S ⊆ {0,1}^n. Many (but not all) of the considerations for this case generalize to other kinds of functions.
such that Q(f) = Θ(Adv±(f)). Although not immediately obvious from the above expression, it can be shown that Adv±(f) is the value of a semidefinite program (SDP), a kind of optimization problem in which a linear objective function is optimized subject to linear and positive semidefiniteness constraints.
Unfortunately, the details of semidefinite programming are beyond the scope of this course. For a good introduction in the context of quantum information, see Watrous's lecture notes on Theory of Quantum Information [97, Lecture 7].
A useful feature of SDPs is that they can be solved efficiently. Thus, we can use a computer program to
find the optimal adversary lower bound for a fixed (finite-size) function. However, while this may be useful
for getting intuition about a problem, in general this does not give a strategy for determining asymptotic
quantum query complexity.
Another key feature of SDPs is the concept of semidefinite programming duality. To every primal SDP,
phrased as a maximization problem, there is a dual SDP, which is a minimization problem. Whereas feasible
solutions of the primal SDP give lower bounds, feasible solutions of the dual SDP give upper bounds. The
dual problem can be constructed from the primal problem by a straightforward (but sometimes tedious)
process. Semidefinite programs satisfy weak duality, which says that the value of the primal problem is at
most the value of the dual problem. Furthermore, almost all SDPs actually satisfy strong duality, which
says that the primal and dual values are equal. (In particular, this holds under the Slater conditions, which
essentially say that the primal or dual constraints are strictly feasible.)
To understand any SDP, one should always construct its dual. Carrying this out for the adversary
method would require some experience with semidefinite programs, so we simply state the result here.
The variables of the dual problem can be viewed as a set of vectors |v_{x,i}⟩ ∈ C^d for all inputs x ∈ S and all indices i ∈ [n] := {1, ..., n}, for some dimension d. For b ∈ {0, 1}, we define the b-complexity C_b := max_{x∈f^{-1}(b)} Σ_{i∈[n]} ‖|v_{x,i}⟩‖². Since strong duality holds, we have the following.

Theorem 23.1. For any function f: S → {0,1} with S ⊆ {0,1}^n, we have Adv±(f) = min max{C_0, C_1}, where the minimization is over all positive integers d and all sets of vectors {|v_{x,i}⟩ ∈ C^d : x ∈ S, i ∈ [n]} satisfying the constraint

    Σ_{i : x_i ≠ y_i} ⟨v_{x,i}|v_{y,i}⟩ = 1 − δ_{f(x),f(y)}  for all  x ≠ y.    (23.3)
By constructing solutions of the adversary dual, we place upper bounds on the best possible adversary
lower bound. But more surprisingly, one can construct an algorithm from a solution of the adversary dual,
giving an upper bound on the quantum query complexity itself.
Observe that if we replace |v_{x,i}⟩ ↦ β|v_{x,i}⟩ for all x ∈ f^{-1}(0) and |v_{y,i}⟩ ↦ |v_{y,i}⟩/β for all y ∈ f^{-1}(1), we don't affect the constraints (23.3), but we map C_0 ↦ β²C_0 and C_1 ↦ C_1/β². Taking β = (C_1/C_0)^{1/4}, we make the two complexities equal. Thus we have

    Adv±(f) = min_{{|v_{x,i}⟩}} √(C_0 C_1).    (23.4)
Note that the constraint (23.3) for f(x) = f(y), where the right-hand side is zero, can be removed without changing the value of the optimization problem. (For functions with non-Boolean output, one loses a factor strictly between 1 and 2 in the analogous relaxation.) To see this, suppose we have a set of vectors {|v_{x,i}⟩} satisfying the constraint (23.3) for f(x) ≠ f(y) but not for f(x) = f(y). Simply let |v'_{x,i}⟩ = |v_{x,i}⟩ ⊗ |x_i ⊕ f(x)⟩ for all x ∈ S and all i ∈ [n]. Then ‖|v'_{x,i}⟩‖ = ‖|v_{x,i}⟩‖, and for the terms where x_i ≠ y_i, we have ⟨v'_{x,i}|v'_{y,i}⟩ = ⟨v_{x,i}|v_{y,i}⟩ if f(x) ≠ f(y) and ⟨v'_{x,i}|v'_{y,i}⟩ = 0 if f(x) = f(y).
for x ∈ f^{-1}(0), and the witness vectors for x ∈ f^{-1}(1) give the remaining dual adversary vectors. For more detail on this translation, see [78, Lemma 6.5] (and see the rest of that paper for more than you ever wanted to know about span programs).
We focus on dual adversary solutions here, as these are simpler to work with for the applications we
consider. However, for other applications it may be useful to work directly with span programs instead; in
particular, (non-canonical) span programs offer more freedom when trying to devise upper bounds.
for all j ∈ [n] (where e_j ∈ {0,1}^n is the jth standard basis vector) and

    Σ_{i : (e_j)_i ≠ (e_k)_i} ⟨v_{e_j,i}|v_{e_k,i}⟩ = ⟨v_{e_j,j}|v_{e_k,j}⟩ + ⟨v_{e_j,k}|v_{e_k,k}⟩ = 0.    (23.6)

Since C_0 C_1 = n, we see that Adv±(or) ≤ √n, demonstrating that the previously discussed adversary lower bound is the best possible adversary lower bound.
It is easy to extend this dual adversary solution to one for the total or function. For any x ≠ 0, simply let |v_{x,i}⟩ = δ_{ij}, where j is the index of any particular bit for which x_j = 1 (e.g., the first such bit). Then the constraints are still satisfied, and the complexity is the same. As an exercise, you should work out an optimal dual adversary for and.
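The or solution above can be verified mechanically. The following sketch (with a made-up n, using the ansatz v_{0,i} = 1 and v_{e_j,i} = δ_{ij} for inputs restricted to Hamming weight at most 1) checks the constraint (23.3) and that √(C₀C₁) = √n.

```python
import numpy as np

n = 9
# inputs of Hamming weight at most 1
inputs = [np.zeros(n, dtype=int)] + [np.eye(n, dtype=int)[j] for j in range(n)]
f = lambda x: int(x.any())                       # the OR function
# one-dimensional dual adversary vectors: v_{0,i} = 1, v_{e_j,i} = delta_{ij}
v = lambda x, i: 1.0 if (not x.any() or x[i] == 1) else 0.0

# constraint (23.3): sum over differing bits equals 1 iff f(x) != f(y)
for x in inputs:
    for y in inputs:
        if not np.array_equal(x, y):
            lhs = sum(v(x, i) * v(y, i) for i in range(n) if x[i] != y[i])
            assert lhs == (1 if f(x) != f(y) else 0)

C0 = max(sum(v(x, i) ** 2 for i in range(n)) for x in inputs if f(x) == 0)  # = n
C1 = max(sum(v(x, i) ** 2 for i in range(n)) for x in inputs if f(x) == 1)  # = 1
assert (C0, C1) == (n, 1) and np.sqrt(C0 * C1) == np.sqrt(n)
```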
that leads to the outcome 1, so vertices representing his moves correspond to or gates. Andrea wins if she can make any move that gives 0 (i.e., she only loses if all her moves give 1), so her vertices correspond to and gates.
What is the query complexity of evaluating this balanced d-ary and-or tree? Let us first consider
randomized classical algorithms. Notice that it is sometimes possible to avoid evaluating all the leaves: for
example, if we learn that one input to an and gate is 0, then we do not need to evaluate the other inputs
to know that the gate evaluates to 0. In the case where all inputs are 1, we must evaluate all of them; but
the inputs to an and gate are given by the outputs of or gates, and an or gate evaluating to 1 is exactly
the case where it is possible to learn the value of the gate without knowing all of its inputs. Similarly, the
hardest input to the or gate is precisely the output of an and gate for which it is possible to learn the
output without evaluating all inputs.
With these observations in mind, a sensible classical algorithm is as follows. Suppose that to evaluate
any given vertex of the tree, we guess a random child and evaluate it (recursively), only evaluating other
children when necessary. By analyzing a simple recurrence, one can show that this algorithm uses

    O( ((d − 1 + √(d² + 14d + 1))/4)^k ) = O( n^{log_d ((d − 1 + √(d² + 14d + 1))/4)} )    (23.9)

queries, where n = d^k is the input size (e.g., for d = 2, O(n^{0.753})) [93, 82]. In fact, it is possible to show that this algorithm is asymptotically optimal [83]. Notice, in particular, that the classical query complexity becomes larger as d is increased with n fixed. In the extreme case where k = 1, so that n = d, we are simply evaluating the and gate, which is equivalent (by de Morgan's laws) to evaluating an or gate, and which we know takes Θ(n) queries.
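The randomized recursive strategy above can be sketched directly. The code below evaluates a balanced binary and-or tree, visiting children in random order and short-circuiting as soon as the gate's value is forced (the tree depth and the random inputs are made up, so the observed query counts are only illustrative; the n^{0.753} bound is a worst-case statement).

```python
import random

def evaluate(gate, leaves, lo, hi, depth, rng, counter):
    """Evaluate a balanced binary AND-OR tree over leaves[lo:hi], querying
    children in random order and stopping as soon as the value is forced."""
    if depth == 0:
        counter[0] += 1                    # one leaf query
        return leaves[lo]
    mid = (lo + hi) // 2
    halves = [(lo, mid), (mid, hi)]
    rng.shuffle(halves)                    # random child order
    child_gate = 'or' if gate == 'and' else 'and'
    short = 0 if gate == 'and' else 1      # value that decides this gate
    for (a, b) in halves:
        if evaluate(child_gate, leaves, a, b, depth - 1, rng, counter) == short:
            return short                   # short-circuit: skip the sibling
    return 1 - short

rng = random.Random(7)
k = 12                                     # n = 2^k leaves
n = 2 ** k
total, trials = 0, 20
for _ in range(trials):
    leaves = [rng.randint(0, 1) for _ in range(n)]
    counter = [0]
    evaluate('and', leaves, 0, n, k, rng, counter)
    total += counter[0]
avg = total / trials
assert avg < n                             # strictly fewer than all n leaves
```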
A quantum computer can evaluate such games faster if k is sufficiently small. Of course, the k = 1 case is solved in O(√n) queries by Grover's algorithm. By applying Grover's algorithm recursively, suitably amplifying the success probability, it is possible to evaluate the formula in √n O(log n)^{k−1} queries [25], which is nearly optimal for constant k. This can be improved slightly to O(√n c^k) queries for some constant c using a variant of Grover's algorithm that allows noisy inputs [56]. But both of these algorithms are only close to tight when k is constant. Indeed, for very low degree (such as d = 2, so that k = log₂ n), nothing better than the classical algorithm was known until 2007 [41]. Here we will describe how to solve that problem in only O(√n) quantum queries. However, rather than presenting the original algorithm, we show how a composition property of span programs offers a particularly simple analysis.
Proof. Let {|vx,i i : x {0, 1}n , i [n]} be an optimal dual adversary solution for f , and let {|uy,j i : y
{0, 1}m , j [m]i} be an optimal dual adversary solution for g. Let y = (y 1 , . . . , y n ) where each y i {0, 1}m .
Then define
We claim that this is a dual adversary solution for f g. To see this, we compute
Σ_{(i,j) : y^i_j ≠ z^i_j} ⟨w_{y,(i,j)}|w_{z,(i,j)}⟩ = Σ_{i∈[n]} ⟨v_{g(y),i}|v_{g(z),i}⟩ Σ_{j : y^i_j ≠ z^i_j} ⟨u_{y^i,j}|u_{z^i,j}⟩   (23.11)

= Σ_{i∈[n]} ⟨v_{g(y),i}|v_{g(z),i}⟩ (1 − δ_{g(y^i),g(z^i)})   (23.12)

= Σ_{i : g(y^i) ≠ g(z^i)} ⟨v_{g(y),i}|v_{g(z),i}⟩,   (23.13)

which equals 1 whenever f(g(y)) ≠ f(g(z)), since {|v_{x,i}⟩} is a dual adversary solution for f. Furthermore,

Adv±(f ∘ g) ≤ max_y Σ_{i∈[n]} ‖|v_{g(y),i}⟩‖² Σ_{j∈[m]} ‖|u_{y^i,j}⟩‖² ≤ Adv±(f) Adv±(g)   (23.15)

as claimed.
Note that here we needed the constraint (23.3) in the case where f(x) = f(y).
In particular, combining this with the dual adversary for or and a similar solution for and, this shows
that Adv±(f) ≤ √n for the n-input balanced binary and-or tree.
|ψ_x⟩ := (1/√α_x) ( |0⟩ + (1/√(2A)) Σ_{i∈[n]} |i⟩|v_{x,i}⟩|x_i⟩ )   (23.17)

with {|v_{x,i}⟩} an optimal dual adversary solution. Here the normalization factor is

α_x = 1 + (1/2A) Σ_{i∈[n]} ‖|v_{x,i}⟩‖² ≤ 3/2.   (23.18)
The reflection 2Λ − I requires no queries to implement. Let Π_x = |0⟩⟨0| + Σ_{i∈[n]} |i⟩⟨i| ⊗ I ⊗ |x_i⟩⟨x_i| be
the projector onto |0⟩ and states where the query and output registers are consistent. Then the reflection
2Π_x − I can be implemented using only two queries to the oracle O_x.
The algorithm runs phase estimation with precision Θ(1/A) on the unitary U := (2Π_x − I)(2Λ − I),
with initial state |0⟩. If the estimated phase is 1, then the algorithm reports that f(x) = 1; otherwise it
116 Chapter 23. Span programs and formula evaluation
reports that f (x) = 0. This procedure uses O(A) queries. It remains to see why the algorithm is correct
with bounded error.
First, we claim that if f(x) = 1, then |0⟩ is close to the 1-eigenspace of U. We have Π_x|ψ_x⟩ = |ψ_x⟩ for
all x and Λ|ψ_x⟩ = |ψ_x⟩ for f(x) = 1, so clearly U|ψ_x⟩ = |ψ_x⟩. Furthermore, |⟨0|ψ_x⟩|² = 1/α_x ≥ 2/3 for all
x, so the squared projection of |0⟩ onto the 1-eigenspace is at least 2/3. Thus the algorithm is correct with probability at least 2/3 when f(x) = 1.
On the other hand, we claim that if f(x) = 0, then |0⟩ has small projection onto the subspace of
eigenvectors with eigenvalue e^{iθ} for |θ| ≤ c/A, for some constant c. To prove this, we use the following [68]:
Lemma 23.3 (Effective spectral gap lemma). Let |u⟩ be a unit vector with Λ|u⟩ = 0; let P_Θ be the projector
onto eigenvectors of U = (2Π − I)(2Λ − I) with eigenvalues e^{iθ} with |θ| < Θ for some Θ ≥ 0. Then
‖P_Θ Π|u⟩‖ ≤ Θ/2.
Let

|φ_x⟩ := (1/√β_x) ( |0⟩ − √(2A) Σ_{i∈[n]} |i⟩|v_{x,i}⟩|x_i⟩ ),   (23.19)

where β_x = 1 + 2A Σ_{i∈[n]} ‖|v_{x,i}⟩‖² ≤ 1 + 2A², so that Λ|φ_x⟩ = 0. Also, observe that Π_x|φ_x⟩ = |0⟩/√β_x. By the effective spectral gap lemma, ‖P_Θ|0⟩‖ ≤
√β_x Θ/2 ≤ √(1 + 2A²) Θ/2 ≈ AΘ/√2. Thus, choosing Θ = √(2/3)/A gives a projection of at most 1/√3, so
the algorithm fails with probability at most 1/3 (plus the error of phase estimation, which can be made
negligible, and the small error from approximating √(1 + 2A²) ≈ √2 A, which is negligible if A ≫ 1).
It remains to prove the lemma.
Proof. We apply Jordan's lemma, which says that for any two projections acting on the same finite-
dimensional space, there is a decomposition of the space into a direct sum of one- and two-dimensional
subspaces that are invariant under both projections.
We can assume without loss of generality that |u⟩ only has support on 2 × 2 blocks of the Jordan
decomposition in which Π and Λ both have rank one. If the block is 1 × 1, or if either projection has rank 0
or 2 within the block, then U acts as ±I on the block; components with eigenvalue −1 are annihilated
by P_Θ, and components with eigenvalue +1 are annihilated by Π.
Now, by an appropriate choice of basis, restricting Λ and Π to any particular 2 × 2 block gives

Λ = [ 1  0 ; 0  0 ]   (23.22)

Π = [ cos²(θ/2)  cos(θ/2) sin(θ/2) ; cos(θ/2) sin(θ/2)  sin²(θ/2) ]   (23.23)

where θ/2 is the angle between the vectors onto which the two projectors project within the block. A simple calculation
shows that (2Π − I)(2Λ − I) is a rotation by an angle θ, so its eigenvalues are e^{±iθ}. Since Λ|u⟩ = 0, the
component of |u⟩ in the relevant subspace is proportional to (0, 1)ᵀ, and

‖ Π (0, 1)ᵀ ‖ = ‖ sin(θ/2) (cos(θ/2), sin(θ/2))ᵀ ‖ = |sin(θ/2)| ≤ |θ|/2   (23.24)

as claimed.
Chapter 24
Learning graphs
While span programs provide a powerful tool for proving upper bounds on quantum query complexity, they
can be difficult to design. The model of learning graphs, introduced by Belovs [17], is a restricted class of
span programs that are more intuitive to design and understand. This model has led to improved upper
bounds for various problems, such as subgraph finding and k-distinctness.
where the sums only run over those vertices θ for which e_{θ,i} is an edge of the learning graph.
It is easy to check that this definition satisfies the dual adversary constraints. For any x, y ∈ S with
f(x) = 0 and f(y) = 1, we have

Σ_{i : x_i ≠ y_i} ⟨v_{x,i}|v_{y,i}⟩ = Σ_{i : x_i ≠ y_i} Σ_θ √(w_{e_{θ,i}}) (p_{e_{θ,i}} / √(w_{e_{θ,i}})) ⟨x_θ|y_θ⟩   (24.2)

= Σ_{i : x_i ≠ y_i} Σ_{θ : x_θ = y_θ} p_{e_{θ,i}}.   (24.3)

Now observe that the set of edges {e_{θ,i} : x_θ = y_θ, x_i ≠ y_i} forms a cut in the graph between the vertex sets
{θ : x_θ = y_θ} and {θ : x_θ ≠ y_θ}. Since the root ∅ is in the former set and all sinks are in the latter set, the total flow
through the cut must be 1.
Recall that we do not have to satisfy the constraint for f(x) = f(y), since there is a construction that
enforces this condition without changing the complexity, provided the condition for f(x) ≠ f(y) is satisfied.
It remains to see that the complexity of this dual adversary solution equals the original learning graph
complexity. For b ∈ {0,1}, we have

C̃_b = max_{x∈f⁻¹(b)} Σ_{i∈[n]} ‖|v_{x,i}⟩‖²   (24.4)

= max_{x∈f⁻¹(b)} Σ_{i∈[n]} Σ_θ { w_{e_{θ,i}}  if b = 0;  p_{e_{θ,i}}² / w_{e_{θ,i}}  if b = 1 }   (24.5)

= { C₀  if b = 0;  max_{x∈f⁻¹(1)} C₁(x)  if b = 1 }   (24.6)

= C_b.   (24.7)

Therefore √(C̃₀ C̃₁) = √(C₀ C₁) = C as claimed. In particular, Adv±(f) ≤ C, so Q(f) = O(C).
Learning graphs are simpler to design than span programs: the constraints are automatically satisfied,
so one can focus on optimizing the objective value. In contrast, span programs have exponentially many
constraints (in n, if f is a total function), and in general it is not obvious how to even write down a solution
satisfying the constraints.
Note, however, that learning graphs are not equivalent to general span programs. For example, learning
graphs (as defined above) only depend on the 1-certificates of a function, so two functions with the same
1-certificates have the same learning graph complexity. The 2-threshold function (the symmetric Boolean
function that is 1 iff two or more input bits are 1) has the same 1-certificates as element distinctness, so
its learning graph complexity is Ω(n^{2/3}), whereas its query complexity is O(√n). This barrier can be
circumvented by modifying the learning graph model, but even such variants are apparently less powerful
than general span programs.
24.4 Element distinctness
(since the 1-norm upper bounds the 2-norm), so C ≤ Σ_{j=1}^k √(C₀^j C₁^j).
Another useful modification is to allow multiple vertices corresponding to the same subset of indices. It
is straightforward to show that such learning graphs can be converted to span programs at the same cost,
or to construct a new learning graph with no multiple vertices and the same or better complexity.
The learning graph for element distinctness has three stages. For the first stage, we load subsets of size
r − 2. We do this by first adding edges from ∅ to \binom{n-1}{r-3} copies of vertex {i}, so that there are
Σ_{i=1}^n \binom{n-1}{r-3} = (r − 2)\binom{n}{r-2} singleton vertices. Then, from each of these singleton vertices, we load the remaining indices of each
possible subset of size r − 2, one index at a time. Every edge has weight 1. Then the 0-complexity of the
first stage is (r − 2)\binom{n}{r-2}.
To upper bound the 1-complexity of the first stage, we route flow only through vertices that do not
contain the indices of a collision, sending equal flow of 1/\binom{n-2}{r-2} to all subsets of size r − 2 avoiding those indices. This gives
1-complexity of at most (r − 2)\binom{n-2}{r-2}\binom{n-2}{r-2}^{-2} = (r − 2)\binom{n-2}{r-2}^{-1}.
Overall, the complexity of the first stage is at most

√( (r − 2)² \binom{n}{r-2} \binom{n-2}{r-2}^{-1} ) = (r − 2) √( n(n − 1) / ((n − r + 2)(n − r + 1)) ) = O(r).   (24.9)
The second and third stages each include all possible edges that load one additional index from the
terminal vertices of the previous stage. Again every edge has unit weight. Thus, the 0-complexity is
(n − r + 2)\binom{n}{r-2} for the second stage and (n − r + 1)\binom{n}{r-1} for the third stage. We send the flow through
vertices that contain the collision pair (namely, that contain the first index of a collision in the second stage
and the second index of a collision in the third stage). Thus, the 1-complexity is \binom{n-2}{r-2}\binom{n-2}{r-2}^{-2} = \binom{n-2}{r-2}^{-1}
in both the second and the third stages. This gives total complexity

√( (n − r + 2) \binom{n}{r-2} \binom{n-2}{r-2}^{-1} ) = O(√n)   (24.10)

for the second stage.
Part V
Quantum simulation

Chapter 25
Simulating Hamiltonian dynamics

Another major potential application of quantum computers is the simulation of quantum dynamics. Indeed,
this was the idea that first led Feynman to propose the concept of a quantum computer [44]. In this lecture we
will see how a universal quantum computer can efficiently simulate several natural families of Hamiltonians.
These simulation methods could be used either to simulate actual physical systems or to implement quantum
algorithms defined in terms of Hamiltonian dynamics, such as continuous-time quantum walks (Part III) and
adiabatic quantum algorithms (Part VI).
The dynamics of a quantum system are governed by the Schrödinger equation

i ℏ (d/dt)|ψ(t)⟩ = H(t)|ψ(t)⟩.   (25.1)

Here H(t) is the Hamiltonian, an operator with units of energy, and ℏ is Planck's constant. For convenience
it is typical to choose units in which ℏ = 1. Given an initial wave function |ψ(0)⟩, we can solve this differential
equation to determine |ψ(t)⟩ at any later (or earlier) time t.
For H independent of time, the solution of the Schrödinger equation is |ψ(t)⟩ = e^{−iHt}|ψ(0)⟩. For simplicity
we will only consider this case. There are many situations in which time-dependent Hamiltonians arise,
not only in physical systems but also in computational applications such as adiabatic quantum computing.
In such cases, the evolution cannot in general be written in such a simple form, but nevertheless similar ideas
can be used to simulate the dynamics.
implement arbitrary unitaries. Instead, we will simply describe a few classes of Hamiltonians that can be
efficiently simulated. Our strategy will be to start from simple Hamiltonians that are easy to simulate and
define ways of combining the known simulations to give more complicated ones.
There are a few cases where a Hamiltonian can obviously be simulated efficiently. For example, this is the
case if H only acts nontrivially on a constant number of qubits, simply because any unitary evolution on a
constant number of qubits can be approximated with error at most ε using poly(log(1/ε)) one- and two-qubit
gates, using the Solovay-Kitaev theorem.
Note that since we require a simulation for an arbitrary time t (with poly(t) gates), we can rescale the
evolution by any polynomial factor: if H can be efficiently simulated, then so can cH for any c = poly(n).
This holds even if c < 0, since any efficient simulation is expressed in terms of quantum gates, and can
simply be run in reverse.
In addition, we can rotate the basis in which a Hamiltonian is applied using any unitary transformation
with an efficient decomposition into basic gates. In other words, if H can be efficiently simulated and the
unitary transformation U can be efficiently implemented, then U†HU can be efficiently simulated. This
follows from the simple identity

e^{−i U†HU t} = U† e^{−iHt} U.   (25.2)
Another simple but useful trick for simulating Hamiltonians is the following. Suppose H is diagonal in
the computational basis, and any diagonal element d(a) = ⟨a|H|a⟩ can be computed efficiently. Then H can
be simulated efficiently using the following sequence of operations, for any input computational basis state
|a⟩:

|a, 0⟩ ↦ |a, d(a)⟩ ↦ e^{−i d(a) t}|a, d(a)⟩ ↦ e^{−i d(a) t}|a, 0⟩,

where we first compute d(a) in an ancilla register, then apply a phase conditioned on the ancilla, and finally uncompute d(a). By linearity, this simulates e^{−iHt} on an arbitrary input state.
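As a sanity check of this diagonal trick, the following sketch (illustrative; here d is just an arbitrary efficiently computable function of the basis label) verifies that applying the phase e^{−i d(a) t} to each basis state reproduces e^{−iHt} exactly for diagonal H.

```python
import numpy as np

n = 3                                  # number of qubits
dim = 2 ** n
rng = np.random.default_rng(1)
d = rng.standard_normal(dim)           # diagonal entries d(a) = <a|H|a>
t = 0.7

# Phase kickback on basis states: |a> -> e^{-i d(a) t} |a>
U_sim = np.diag(np.exp(-1j * d * t))

# Reference: exp(-iHt) for H = diag(d), via eigendecomposition
H = np.diag(d)
w, V = np.linalg.eigh(H)
U_exact = (V * np.exp(-1j * w * t)) @ V.conj().T

err = np.linalg.norm(U_sim - U_exact)
```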
For example, consider a particle in one dimension subject to a potential V(x), with Hamiltonian

H = p²/2m + V(x).
To simulate this on a digital quantum computer, we can imagine discretizing the x coordinate. The operator
V(x) is diagonal, and natural discretizations of p² = −d²/dx² are diagonal in the discrete Fourier basis. Thus
we can efficiently simulate both V(x) and p²/2m. Similarly, consider the Hamiltonian of a spin system, say
of the form

H = Σ_i h_i X_i + Σ_{i,j} J_{ij} Z_i Z_j

(or more generally, any k-local Hamiltonian, a sum of terms that each act on at most k qubits). This consists
of a sum of terms, each of which acts on only a constant number of qubits and hence is easy to simulate.
In general, if H1 and H2 can be efficiently simulated, then H1 + H2 can also be efficiently simulated.
If the two Hamiltonians commute, then this is trivial, since e^{−iH1t} e^{−iH2t} = e^{−i(H1+H2)t}. However, in the
general case where the two Hamiltonians do not commute, we can still simulate their sum as a consequence
of the Lie product formula

e^{−i(H1+H2)t} = lim_{m→∞} ( e^{−iH1t/m} e^{−iH2t/m} )^m.   (25.7)
A simulation using a finite number of steps can be achieved by truncating this expression to a finite number
of terms, which introduces some amount of error that must be kept small. In particular, if we want to have

‖ ( e^{−iH1t/m} e^{−iH2t/m} )^m − e^{−i(H1+H2)t} ‖ ≤ ε,   (25.8)

it suffices to take m = O((Λt)²/ε), where Λ := max{‖H1‖, ‖H2‖}. (The requirement that H1 and H2 be
efficiently simulable means that Λ can be at most poly(n).)
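The 1/m decay of the truncation error is easy to observe numerically; the following sketch uses two random Hermitian matrices as an illustrative stand-in for efficiently simulable terms.

```python
import numpy as np

def expmh(H, t):
    """exp(-iHt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

rng = np.random.default_rng(2)
dim = 8
def rand_herm():
    A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return (A + A.conj().T) / 2

H1, H2 = rand_herm(), rand_herm()
t = 1.0
exact = expmh(H1 + H2, t)

def lie_error(m):
    """Spectral-norm error of the m-step first-order product formula."""
    step = expmh(H1, t / m) @ expmh(H2, t / m)
    return np.linalg.norm(np.linalg.matrix_power(step, m) - exact, 2)

errs = [lie_error(m) for m in (50, 100, 200)]
```

Doubling m roughly halves the error, consistent with m = O((Λt)²/ε).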
It is somewhat unappealing that to simulate an evolution for time t, we need a number of steps propor-
tional to t². Fortunately, the situation can be improved if we use higher-order approximations of (25.7). For
example, one can show that

‖ ( e^{−iH1t/2m} e^{−iH2t/m} e^{−iH1t/2m} )^m − e^{−i(H1+H2)t} ‖ ≤ ε   (25.9)

with a smaller value of m. In fact, by using even higher-order approximations, it is possible to show that
H1 + H2 can be simulated for time t with only O(t^{1+δ}) steps, for any fixed δ > 0, no matter how small [28, 19].
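The gain from symmetrization can be seen directly. The sketch below (again with arbitrary random Hermitian terms, not a specific physical model) compares the first-order formula against the symmetric product of (25.9), whose error scales as 1/m² rather than 1/m.

```python
import numpy as np

def expmh(H, t):
    """exp(-iHt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

rng = np.random.default_rng(3)
dim = 8
def rand_herm():
    A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return (A + A.conj().T) / 2

H1, H2 = rand_herm(), rand_herm()
t = 1.0
exact = expmh(H1 + H2, t)

def error(step, m):
    return np.linalg.norm(np.linalg.matrix_power(step, m) - exact, 2)

m = 100
e_lie = error(expmh(H1, t / m) @ expmh(H2, t / m), m)
e_sym = error(expmh(H1, t / (2 * m)) @ expmh(H2, t / m) @ expmh(H1, t / (2 * m)), m)
# doubling m should cut the symmetric error by about a factor of 4
e_sym2 = error(expmh(H1, t / (4 * m)) @ expmh(H2, t / (2 * m)) @ expmh(H1, t / (4 * m)), 2 * m)
```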
A Hamiltonian that is a sum of polynomially many terms can be efficiently simulated by composing the
simulation of two terms, or by directly using an approximation to the identity

e^{−i(H1+⋯+Hk)t} = lim_{m→∞} ( e^{−iH1t/m} ⋯ e^{−iHkt/m} )^m.   (25.10)
Another way of combining Hamiltonians comes from commutation: if H1 and H2 can be efficiently
simulated, then i[H1, H2] can be efficiently simulated. This is a consequence of the identity

e^{−[H1,H2]t} = lim_{m→∞} ( e^{−iH1√(t/m)} e^{−iH2√(t/m)} e^{iH1√(t/m)} e^{iH2√(t/m)} )^m,   (25.11)

which can again be approximated with a finite number of terms. (The overall sign of the generator is immaterial, since any efficient simulation can be run in reverse.) However, I don't know of any algorithmic
application of such a simulation.
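This group-commutator limit can be checked numerically. The sketch below (random Hermitian matrices with an arbitrary scale) approximates e^{−[H1,H2]t} = e^{−iKt} with K = −i[H1, H2] Hermitian, and shows the error shrinking as m grows.

```python
import numpy as np

def expmh(H, t):
    """exp(-iHt) for Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * w * t)) @ V.conj().T

rng = np.random.default_rng(4)
dim = 6
def rand_herm():
    A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
    return 0.5 * (A + A.conj().T) / 2

H1, H2 = rand_herm(), rand_herm()
t = 0.5

# exp(-[H1,H2]t) = exp(-iKt) with K = -i[H1,H2] Hermitian
K = -1j * (H1 @ H2 - H2 @ H1)
target = expmh(K, t)

def approx(m):
    s = np.sqrt(t / m)   # each factor runs for time sqrt(t/m)
    step = expmh(H1, s) @ expmh(H2, s) @ expmh(H1, -s) @ expmh(H2, -s)
    return np.linalg.matrix_power(step, m)

e_coarse = np.linalg.norm(approx(100) - target, 2)
e_fine = np.linalg.norm(approx(1000) - target, 2)
```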
Given this lemma, the simulation proceeds as follows. First, to ensure that the graph of H is bipartite,
we actually simulate evolution according to the Hamiltonian σ_x ⊗ H, which is bipartite and has the same
sparsity as H. Since e^{−i(σ_x⊗H)t}|+⟩|ψ⟩ = |+⟩e^{−iHt}|ψ⟩, we can recover a simulation of H from a simulation
of σ_x ⊗ H.
Now write H as a diagonal matrix plus a matrix with zeros on the diagonal. We have already shown how
to simulate the diagonal part, so we can assume H has zeros on the diagonal without loss of generality.
It suffices to simulate the term corresponding to the edges of a particular color c. We show how to make
the simulation work for any particular vertex x; then it works in general by linearity. By computing the
complete list of neighbors of x and computing each of their colors, we can reversibly compute v_c(x), the
vertex adjacent to x via an edge with color c, along with the associated matrix element:

|x⟩ ↦ |x, v_c(x), H_{x,v_c(x)}⟩.
since it is easily diagonalized, as it consists of a direct sum of two-dimensional blocks. Finally, we can
uncompute the second and third registers. Before the uncomputation, the simulation produces a linear
combination of the states |x, v_c(x), H_{x,v_c(x)}⟩ and |v_c(x), x, H*_{x,v_c(x)}⟩. Since

|v_c(x), x, H*_{x,v_c(x)}⟩ = |v_c(x), v_c(v_c(x)), H_{v_c(x),x}⟩,   (25.14)
where |E_a⟩ are the eigenstates of H with eigenvalues E_a. Suppose we prepare the pointer in the state |x = 0⟩,
a narrow wave packet centered at x = 0. Since the momentum operator generates translations in position,
the above evolution performs the transformation

|E_a⟩|x = 0⟩ ↦ |E_a⟩|x = E_a t⟩.

If we can measure the position of the pointer with sufficiently high precision that all relevant spacings
Δx_{ab} = t|E_a − E_b| can be resolved, then measurement of the position of the pointer (a fixed, easy-to-measure
observable, independent of H) effects a measurement of H.
Von Neumann's measurement protocol makes use of a continuous variable, the position of the pointer.
To turn it into an algorithm that can be implemented on a digital quantum computer, we can approximate
the evolution (25.15) using r quantum bits to represent the pointer. The full Hilbert space is thus a tensor
the evolution (25.15) using r quantum bits to represent the pointer. The full Hilbert space is thus a tensor
product of a 2^n-dimensional space for the system and a 2^r-dimensional space for the pointer. We let the
computational basis of the pointer, with basis states {|z⟩}, represent the basis of momentum eigenstates.
The label z is an integer between 0 and 2^r − 1, and the r bits of the binary representation of z specify the
states of the r qubits. In this basis, p acts as
p|z⟩ = (z/2^r)|z⟩.   (25.17)
In other words, the evolution e^{−i(H⊗p)t} can be viewed as the evolution e^{−iHt} on the system for a time controlled
by the value of the pointer.
Expanded in the momentum eigenbasis, the initial state of the pointer is

|x = 0⟩ = 2^{−r/2} Σ_{z=0}^{2^r−1} |z⟩.   (25.18)
The measurement is performed by evolving under H ⊗ p for some appropriately chosen time t. After this
evolution, the position of the simulated pointer can be measured by measuring the qubits that represent it
in the x basis, i.e., the Fourier transform of the computational basis.
Note that this discretized von Neumann measurement procedure is equivalent to phase estimation. Recall
that in the phase estimation problem, we are given an eigenvector |ψ⟩ of a unitary operator U and asked to
determine its eigenvalue e^{iφ}. The algorithm uses two registers, one that initially stores |ψ⟩ and one that will
store an approximation of the phase φ. The first and last steps of the algorithm are Fourier transforms on
the phase register. The intervening step is to perform the transformation

|ψ⟩|z⟩ ↦ U^z|ψ⟩|z⟩,

where |z⟩ is a computational basis state. If we take |z⟩ to be a momentum eigenstate with eigenvalue z (i.e.,
if we choose a different normalization than in (25.17)) and let U = e^{−iHt}, this is exactly the transformation
induced by e^{−i(H⊗p)t}. Thus we see that the phase estimation algorithm for a unitary operator U is exactly
von Neumann's prescription for measuring i ln U.
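This correspondence can be checked numerically. In the toy sketch below (a hypothetical setup: an integer eigenvalue E and evolution time t = 2π are chosen so that the position peak lands exactly on a grid point, and signs are matched to numpy's FFT convention), reading out the r-qubit pointer in the Fourier basis recovers the eigenvalue of H.

```python
import numpy as np

r = 5                        # pointer qubits
N = 2 ** r
E = 3                        # eigenvalue of H on the input eigenstate (illustrative)
t = 2 * np.pi                # chosen so the position peak is exact

# Pointer starts in |x=0>, the uniform superposition over momentum states (25.18).
# Under exp(-i (H x p) t), each component |z> acquires the phase exp(-i E t z / 2^r).
pointer = np.exp(-1j * E * t * np.arange(N) / N) / np.sqrt(N)

# "Measuring the position" = reading out in the Fourier basis
amps = np.fft.fft(pointer) / np.sqrt(N)
probs = np.abs(amps) ** 2
estimate = (N - np.argmax(probs)) % N   # undo the FFT's frequency sign convention
```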
Chapter 26
Fast quantum simulation algorithms

While product formulas provide the most straightforward approach to Hamiltonian simulation, alternative
approaches can offer improved performance. Here we explore Hamiltonian simulation beyond product
formulas.
26.1 No fast-forwarding
Before introducing improved upper bounds, we begin by establishing a limitation on the ability of algorithms
to simulate sparse Hamiltonians. Specifically, as mentioned in Chapter 25, we show that no general procedure
can simulate a sparse Hamiltonian acting for time t using o(t) queries [19].
The lower bound is based on a reduction from parity. Recall from Section 20.5 that computing the parity
of n bits requires Ω(n) queries. Given an input string x ∈ {0,1}^n, construct a graph on vertices (i, b) for
i ∈ {0, 1, …, n} and b ∈ {0,1}, such that (i − 1, b) is adjacent to (i, b ⊕ x_i) for all i ∈ {1, …, n} and b ∈ {0,1}.
This graph is the disjoint union of two paths of length n, and (0, 0) is connected to (n, b) for exactly one value
of b, namely b = x₁ ⊕ ⋯ ⊕ x_n, the parity of the input string. The main idea of the proof is to construct a
Hamiltonian whose nonzero entries correspond to this graph, such that the dynamics for some time t = O(n)
map the state |0, 0⟩ to the state |n, x₁ ⊕ ⋯ ⊕ x_n⟩. Then a simulation of the Hamiltonian dynamics for time
t using o(t) queries would violate the parity lower bound.
The most obvious choice is to simply use the adjacency matrix of the graph as the Hamiltonian. However,
then the dynamics generate a continuous-time quantum walk on a finite path, which does not reach the
opposite end of the path with constant amplitude after linear time.
Instead, we choose the matrix elements of the Hamiltonian H so that

⟨i − 1, b|H|i, b ⊕ x_i⟩ = √(i(n − i + 1))/n.   (26.1)
Clearly, a black box for this 2-sparse Hamiltonian can be implemented using O(1) queries to the black box
for the input string to answer each neighbor query. The weights are chosen to reflect the transitions between
column states (in the sense of Section 16.5) for an unweighted hypercube. Specifically, letting Q denote the
adjacency matrix of the hypercube and |wt_k⟩ := \binom{n}{k}^{−1/2} Σ_{|x|=k} |x⟩, we have

Q|wt_k⟩ = \binom{n}{k}^{−1/2} [ (n − k + 1) Σ_{|x|=k−1} |x⟩ + (k + 1) Σ_{|x|=k+1} |x⟩ ]   (26.2)

= √(k(n − k + 1)) |wt_{k−1}⟩ + √((k + 1)(n − k)) |wt_{k+1}⟩.   (26.3)
Thus, with these weights on the edges, the dynamics behave just as the walk on the hypercube within its
column subspace. In particular, since the dynamics on the hypercube map a vertex into the opposite corner
in time π/2 (as shown in Section 16.2), the chosen Hamiltonian maps |0, 0⟩ to |n, x₁ ⊕ ⋯ ⊕ x_n⟩ in time O(n).
It follows that a generic procedure for simulating sparse Hamiltonians for time t must have complexity
Ω(t) in general. In other words, one cannot fast-forward the dynamics of arbitrary Hamiltonians.
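The weighted-path construction is easy to verify directly. The sketch below builds the (n+1)-dimensional tridiagonal Hamiltonian of (26.1) restricted to one of the two paths and checks that |0⟩ is mapped to |n⟩ (up to phase) at t = πn/2.

```python
import numpy as np

n = 7
H = np.zeros((n + 1, n + 1))
for i in range(1, n + 1):
    # <i-1|H|i> = sqrt(i(n-i+1))/n, as in (26.1)
    H[i - 1, i] = H[i, i - 1] = np.sqrt(i * (n - i + 1)) / n

w, V = np.linalg.eigh(H)
t = np.pi * n / 2            # hypercube time pi/2, rescaled by n
psi = (V * np.exp(-1j * w * t)) @ V.conj().T @ np.eye(n + 1)[0]
transfer = abs(psi[n])       # magnitude of the amplitude at the far end
```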
where X ≥ max_j Σ_{k=1}^N |H_{jk}|. Define the operators S, T as in the proof of Theorem 17.1. Then we get
T†ST = H/X, so the walk has eigenvalues ±e^{±i arccos(λ/X)}, where λ is an eigenvalue of H. The eigenvectors
corresponding to these two eigenvalues can be found within the subspace span{T|λ⟩, ST|λ⟩}, where |λ⟩ is
the eigenvector of H with eigenvalue λ.
To simulate H on a given input state |ψ⟩, we proceed as follows:
1. Apply the isometry T to produce the state T|ψ⟩.
2. Perform phase estimation on the quantum walk with precision δ (to be determined).
3. Given a value approximating arccos(λ/X), compute an estimate λ̃ of λ.
4. Introduce the phase e^{−iλ̃t}.
5. Uncompute the estimate λ̃.
6. Invert the phase estimation procedure.
7. Apply T† to return to a state in the original Hilbert space.
Since a step of the quantum walk can be implemented using two applications of the isometry T, this
procedure makes O(1/δ) calls to T. In turn, T can be implemented using a number of queries that is
polynomial in the sparsity of the Hamiltonian, so up to factors of the sparsity, the query complexity of
simulation is simply O(1/δ). Thus it remains to determine what value of δ suffices to ensure that the overall
procedure reproduces the dynamics up to error at most ε.
The details of this analysis are presented in [29, 20], but we can understand it roughly as follows. Suppose
the estimate of arccos(λ/X) deviates from its true value by of order δ. Since the cosine function has Lipschitz
constant 1 (i.e., |cos(x + δ) − cos(x)| ≤ |δ|), the resulting error in the value of λ/X is also of order δ. In
other words, the error in the value of λ is of order δX. To ensure that e^{−iλ̃t} deviates by at most ε from its
true value, we take δXt = Θ(ε), i.e., 1/δ = Θ(Xt/ε). Thus we see that the complexity is linear in t and
polynomial in 1/ε. Note that if H is d-sparse, then we can choose X ≤ d‖H‖_max, so the factor of X
just introduces polynomial overhead with respect to the sparsity.
Using a more refined implementation and analysis of this approach, one can achieve query complexity
O(‖H‖t/√ε + d‖H‖_max t) = O(d‖H‖_max t/√ε) for a d-sparse Hamiltonian H [20].
for higher-order formulas. The query complexity of this approach is O(τ log(τ/ε)/log log(τ/ε)), where τ := d²‖H‖_max t
(with ‖H‖_max denoting the largest magnitude of an entry of H).
Denote the Taylor series for the evolution up to time t, truncated at order K, by

Ũ(t) := Σ_{k=0}^K (−iHt)^k / k!.   (26.5)
For sufficiently large K, the operator Ũ(t) is a good approximation of exp(−iHt). Specifically, by Taylor's
theorem, we have

‖Ũ(t) − exp(−iHt)‖ ≤ exp(‖H‖t) (‖H‖t)^{K+1} / (K + 1)!,   (26.6)

so we can ensure that the error is at most ε by taking K = O(log(‖H‖t/ε)/log log(‖H‖t/ε)). If we take
‖H‖t constant, then we get an approximation with K = O(log(1/ε)/log log(1/ε)). If we could implement
the evolution for constant time with this complexity, then by reducing the error to ε/t, we could repeat the
process O(t) times and get a simulation with complexity O(t log(t/ε)/log log(t/ε)) and overall error at most
ε.
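The bound (26.6) can be checked directly. The sketch below uses a random Hermitian H normalized so that ‖H‖ = 1 (an arbitrary test instance); with K = 10 the truncated series already approximates exp(−iHt) to within the stated remainder.

```python
import numpy as np
from math import factorial, exp

rng = np.random.default_rng(5)
dim = 8
A = rng.standard_normal((dim, dim)) + 1j * rng.standard_normal((dim, dim))
H = (A + A.conj().T) / 2
H = H / np.linalg.norm(H, 2)          # normalize so ||H|| = 1
t = 1.0

w, V = np.linalg.eigh(H)
exact = (V * np.exp(-1j * w * t)) @ V.conj().T

K = 10
U_trunc = np.zeros((dim, dim), dtype=complex)
term = np.eye(dim, dtype=complex)
for k in range(K + 1):
    U_trunc = U_trunc + term          # accumulate (-iHt)^k / k!
    term = term @ (-1j * t * H) / (k + 1)

err = np.linalg.norm(U_trunc - exact, 2)
bound = exp(1.0) * 1.0 ** (K + 1) / factorial(K + 1)   # exp(||H||t)(||H||t)^{K+1}/(K+1)!
```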
Now suppose we can decompose the given Hamiltonian in the form

H = Σ_{ℓ=1}^L α_ℓ H_ℓ   (26.7)

for some coefficients α_ℓ ∈ ℝ, where the individual terms H_ℓ are both unitary and Hermitian. This is
straightforward if H is k-local, since we can express the local terms as linear combinations of Pauli operators.
If H is sparse, then such a decomposition can also be constructed efficiently [21].
To implement Ũ(t), we begin by writing it as a linear combination of unitaries, namely

Ũ(t) = Σ_{k=0}^K (−iHt)^k / k!   (26.8)

= Σ_{k=0}^K Σ_{ℓ₁,…,ℓ_k=1}^L (t^k/k!) α_{ℓ₁} ⋯ α_{ℓ_k} (−i)^k H_{ℓ₁} ⋯ H_{ℓ_k}   (26.9)

= Σ_{j=0}^{m−1} β_j V_j,   (26.10)

where the V_j are products of the form (−i)^k H_{ℓ₁} ⋯ H_{ℓ_k}, and the β_j are the corresponding coefficients.
How can we implement such a linear combination of unitaries? Let B be an operation that prepares the
state

|β⟩ := (1/√s) Σ_{j=0}^{m−1} √β_j |j⟩,   (26.11)

where s := Σ_{j=0}^{m−1} β_j.
Let

W := B† select(V) B   (26.15)

with

select(V) := Σ_{j=0}^{m−1} |j⟩⟨j| ⊗ V_j.   (26.16)
Then we have

(⟨0| ⊗ I) W (|0⟩ ⊗ |ψ⟩) = (⟨0| ⊗ I) B† select(V) (1/√s) Σ_j √β_j |j⟩|ψ⟩   (26.17)

= (⟨0| ⊗ I) B† (1/√s) Σ_j √β_j |j⟩ V_j|ψ⟩   (26.18)

= (1/s) Σ_j β_j V_j |ψ⟩   (26.19)

= (1/s) Ũ(t)|ψ⟩.   (26.20)

In other words, if we postselect the state W(|0⟩ ⊗ |ψ⟩) on having its first register in the state |0⟩, we obtain
the desired result. However, this postselection only succeeds with probability (approximately) 1/s².
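This block structure is easy to verify numerically. The sketch below uses a made-up decomposition into three Hermitian Pauli-product unitaries (the particular terms and weights are arbitrary illustrative choices), completes the state-preparation map B to a unitary, and checks that the |0⟩ block of W = B† select(V) B is (1/s) Σ_j β_j V_j.

```python
import numpy as np

rng = np.random.default_rng(6)
I2 = np.eye(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

V = [np.kron(X, I2), np.kron(Z, Z), np.kron(I2, X)]   # Hermitian unitaries V_j
beta = np.array([0.6, 0.5, 0.4])                      # coefficients beta_j > 0
s = beta.sum()
m, dim = len(V), 4

# B prepares |beta> = (1/sqrt(s)) sum_j sqrt(beta_j)|j>; complete it to a unitary
col = np.sqrt(beta / s)
Q, _ = np.linalg.qr(np.column_stack([col, rng.standard_normal((m, m - 1))]))
if Q[0, 0] * col[0] < 0:
    Q[:, 0] = -Q[:, 0]        # fix the sign so that B|0> = |beta>
B = Q

sel = np.zeros((m * dim, m * dim), dtype=complex)     # select(V), block diagonal
for j in range(m):
    sel[j * dim:(j + 1) * dim, j * dim:(j + 1) * dim] = V[j]

W = np.kron(B.conj().T, np.eye(dim)) @ sel @ np.kron(B, np.eye(dim))
block = W[:dim, :dim]                                 # (<0| x I) W (|0> x I)
target = sum(b * v for b, v in zip(beta, V)) / s
err_lcu = np.linalg.norm(block - target)
```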
Considering the action of W on the full space, we have

W|0⟩|ψ⟩ = (1/s) |0⟩ Ũ(t)|ψ⟩ + √(1 − 1/s²) |Φ⟩   (26.21)

for |ψ⟩ ∈ H and some state |Φ⟩ whose ancillary register is supported in the subspace orthogonal to |0⟩. To boost
the chance of success, we might like to apply amplitude amplification to W. However, the initial state |ψ⟩ is
unknown, so we cannot reflect about it. Fortunately, something similar can be achieved using the reflection
R := (I − 2|0⟩⟨0|) ⊗ I   (26.22)

about the subspace with |0⟩ in the first register. Specifically, letting P := |0⟩⟨0| ⊗ I, a direct calculation shows that

(⟨0| ⊗ I) W R W† R W |0⟩|ψ⟩ = ( (3/s) Ũ(t) − (4/s³) Ũ(t) Ũ(t)† Ũ(t) ) |ψ⟩,   (26.25)

which is close to ((3/s) − (4/s³)) Ũ(t)|ψ⟩ since Ũ(t) is close to unitary. In particular, if s = 2 then this process boosts
the amplitude from 1/2 to 1, analogous to Grover search with a single marked item out of 4. For the purpose
of Hamiltonian simulation, we can choose the parameters such that a single segment of the evolution has
this value of s, and we repeat the process as many times as necessary to simulate the full evolution.
More generally, the operation W R W† R W is analogous to the Grover iterate, and it can be applied many
times to boost the amplitude for success from something small to a value close to 1. Using this oblivious
amplitude amplification, a general linear combination of unitaries as in (26.10) can be implemented with
complexity O(s).
Part VI
Chapter 27
The quantum adiabatic theorem
In the last part of this course, we will discuss an approach to quantum computation based on the concept
of adiabatic evolution. According to the quantum adiabatic theorem, a quantum system that begins in the
nondegenerate ground state of a time-dependent Hamiltonian will remain in the instantaneous ground state
provided the Hamiltonian changes sufficiently slowly. In this lecture we will prove the quantum adiabatic
theorem, which quantifies this statement.
For a time-independent Hamiltonian H, the solution of the Schrödinger equation

i (d/dt)|ψ(t)⟩ = H|ψ(t)⟩   (27.1)

with the initial quantum state |ψ(0)⟩ is given by

|ψ(t)⟩ = e^{−iHt}|ψ(0)⟩.   (27.2)

So any eigenstate |E⟩ of the Hamiltonian, with H|E⟩ = E|E⟩, simply acquires a phase exp(−iEt). In
particular, there are no transitions between eigenstates.
If the Hamiltonian varies in time, the evolution it generates can be considerably more complicated.
However, if the change in the Hamiltonian occurs sufficiently slowly, the dynamics remain relatively simple:
roughly speaking, if the system begins close to an eigenstate, it remains close to an eigenstate. The quantum
adiabatic theorem is a formal description of this phenomenon.
For a simple example of adiabatic evolution in action, consider a spin in a magnetic field that is rotated
from the x direction to the z direction in a total time T:

H(t) = −cos(πt/2T) σ_x − sin(πt/2T) σ_z.   (27.3)

Suppose that initially, the spin points in the x direction: |ψ(0)⟩ = (|0⟩ + |1⟩)/√2, the ground state of H(0).
As the magnetic field is slowly rotated toward the z direction, the spin begins to precess about the new
direction of the field, moving it toward the z axis (and also producing a small component out of the xz
plane). If T is made larger and larger, so that the rotation of the field direction happens more and more
slowly (as compared to the speed of precession), the state will precess in a tighter and tighter orbit about
the field direction. In the limit of arbitrarily slow rotation of the field, the state simply tracks the field,
remaining in the instantaneous ground state of H(t).
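This example is simple enough to integrate numerically. The sketch below (step count and the two evolution times are arbitrary choices; the integrator applies the instantaneous propagator over small time slices) shows the final overlap with the ground state of H(T) approaching 1 as T grows.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def H(t, T):
    a = np.pi * t / (2 * T)
    return -np.cos(a) * sx - np.sin(a) * sz      # the rotating field (27.3)

def final_fidelity(T, steps=4000):
    dt = T / steps
    psi = np.array([1, 1], dtype=complex) / np.sqrt(2)   # ground state of H(0)
    for k in range(steps):
        w, V = np.linalg.eigh(H((k + 0.5) * dt, T))      # midpoint propagator
        psi = (V * np.exp(-1j * w * dt)) @ V.conj().T @ psi
    return abs(psi[0]) ** 2    # overlap with |0>, the ground state of H(T) = -sz

f_slow = final_fidelity(50.0)
f_fast = final_fidelity(2.0)
```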
More generally, for s ∈ [0, 1], let H(s) be a Hermitian operator that varies smoothly as a function of
s. (The notion of smoothness will be made precise in the following section.) Let s := t/T. Then for T
arbitrarily large, H(s) varies arbitrarily slowly as a function of t. An initial quantum state |ψ(0)⟩ evolves
according to
i (d/dt)|ψ(t)⟩ = H(t)|ψ(t)⟩,   (27.4)

or equivalently,

i (d/ds)|ψ(s)⟩ = T H(s)|ψ(s)⟩.   (27.5)

Now suppose that |ψ(0)⟩ is an eigenstate of H(0), which we assume for simplicity is the ground state, and
is nondegenerate. Furthermore, suppose that the ground state of H(s) is nondegenerate for all values of
s ∈ [0, 1]. Then the adiabatic theorem says that in the limit T → ∞, the final state |ψ(T)⟩ obtained by the
evolution (27.4) will be the ground state of H(1).
Of course, evolution for an infinite time is rather impractical. For computational purposes, we need a
quantitative version of the adiabatic theorem: we would like to understand how large T must be so that the
final state is guaranteed to differ from the adiabatically evolved state by at most some fixed small amount.
In particular, we would like to understand how the required evolution time depends on spectral properties of
the interpolating Hamiltonian H(s). We will see that the timescale for adiabaticity is intimately connected
to the energy gap between the ground and first excited states.
generates exactly adiabatic evolution, where we use a dot to denote differentiation with respect to s. In
other words, we claim that the differential equation

i (d/ds)|ψ(s)⟩ = H_a(s)|ψ(s)⟩   (27.9)

with |ψ(0)⟩ = |φ(0)⟩ has the solution |ψ(s)⟩ = e^{iθ(s)}|φ(s)⟩ for some time-dependent phase θ(s). Equivalently,
the density matrix P(s) = |φ(s)⟩⟨φ(s)| = |ψ(s)⟩⟨ψ(s)| satisfies the differential equation

i (d/ds)P(s) = i ( |φ̇(s)⟩⟨φ(s)| + |φ(s)⟩⟨φ̇(s)| )   (27.10)

= [H_a(s), P(s)].   (27.11)
(dropping the argument s when it is clear from context). Differentiating the identity P² = P gives

Ṗ = Ṗ P + P Ṗ,   (27.14)

and multiplying by P on both sides gives

P Ṗ P = 0.   (27.15)
for some unitary operator U(s). It is helpful to write the evolution in terms of a differential equation for
U(s). We have

(d/ds) U(s)|ψ(0)⟩ = (d/ds)|ψ(s)⟩   (27.17)

= −iT H(s)|ψ(s)⟩   (27.18)

= −iT H(s)U(s)|ψ(0)⟩,   (27.19)

and since this holds for any initial state |ψ(0)⟩, we see that U(s) satisfies the differential equation

i U̇(s) = T H(s) U(s).   (27.20)
Similarly, we have

i U̇_a(s) = H_a(s) U_a(s)   (27.21)

for the corresponding adiabatic evolution.
We would like to show that the difference between U and U_a is small. Thus we consider

U(1) − U_a(1) = −U(1) ∫₀¹ (d/ds)(U† U_a) ds   (27.22)

= i U(1) ∫₀¹ U† [H_a − T H] U_a ds   (27.23)

= −U(1) ∫₀¹ U† [Ṗ, P] U_a ds   (27.24)

where the first line follows from the fundamental theorem of calculus, the second from (27.20) and (27.21),
and the third from the definition of H_a.
It turns out that the expression [Ṗ, P] can be written as a commutator with the Hamiltonian, [Ṗ, P] =
[H, F], where

F := R Ṗ P + P Ṗ R   (27.25)

and where we have defined the resolvent

R := (H − E)^{−1}   (27.26)

(which has poles at the eigenvalues of H). This can be seen as follows: noting that (H − E)R = 1 so that
HR = 1 + ER, and PH = EP, we have
as claimed.
Now let us define

F̃ := U† F U.   (27.30)

Using (27.20), we have

(d/ds)F̃ = iT U†[H, F]U + U† Ḟ U;   (27.31)
therefore

U†[Ṗ, P]U = U†[H, F]U   (27.32)

= (1/iT) ( (d/ds)F̃ − U† Ḟ U ).   (27.33)
Now we insert this into (27.24) and integrate the first term by parts:

U(1) − U_a(1) = (i/T) U(1) ∫₀¹ ( (d/ds)F̃ − U†ḞU ) U†U_a ds   (27.34)

= (i/T) U(1) { [ F̃ U†U_a ]₀¹ − ∫₀¹ ( F̃ (d/ds)(U†U_a) + U†ḞU U†U_a ) ds }   (27.35)

= (i/T) U(1) { [ F̃ U†U_a ]₀¹ − ∫₀¹ ( F̃ U†[Ṗ, P]U_a + U†ḞU_a ) ds }.   (27.36)
Now

‖F‖ ≤ 2‖R Ṗ P‖   (27.38)

= 2‖R(1 − P) Ṗ P‖   (27.39)

≤ 2‖R(1 − P)‖ ‖Ṗ‖   (27.40)

≤ 2‖Ṗ‖/Δ,   (27.41)

where we have used (27.14) to see that ṖP = (1 − P)ṖP, and where Δ(s) is the gap between the smallest
eigenvalue E(s) of H(s) and the nearest distinct eigenvalue of H(s). Also,
Ḟ = ṘṖP + RP̈P + RṖ² + Ṗ²R + PP̈R + PṖṘ (27.42)
and
Ṙ = −(1/(H − E)) Ḣ (1/(H − E)) (27.43)
(to see this, differentiate the identity (H − E)R = 1), so (by similar calculations as above)
‖Ḟ‖ ≤ 2 ( ‖Ḣ‖‖Ṗ‖/Δ² + ‖P̈‖/Δ + ‖Ṗ‖²/Δ ). (27.44)
Thus we have
‖U(1) − Ua(1)‖ ≤ (2/T) [ ‖Ṗ(0)‖/Δ(0) + ‖Ṗ(1)‖/Δ(1) + ∫₀¹ ( 3‖Ṗ‖²/Δ + ‖Ḣ‖‖Ṗ‖/Δ² + ‖P̈‖/Δ ) ds ]. (27.45)
Finally, we would like to express ‖Ṗ‖ and ‖P̈‖ in terms of H. We can obtain upper bounds for these quantities using first and second order perturbation theory. Intuitively, if the Hamiltonian changes slowly, and if its eigenvalues are not close to degenerate, then its eigenvectors should also change slowly. At first order, we have
‖Ṗ‖ ≤ c₁ ‖Ḣ‖/Δ (27.46)
for some constant c₁, and at second order,
‖P̈‖ ≤ c₂ ‖Ḧ‖/Δ + c₃ ‖Ḣ‖²/Δ². (27.47)
Overall, we have proved the following quantitative version of the adiabatic theorem:
Theorem 27.1. Suppose H(s) has a nondegenerate ground state |φ(s)⟩ for all s ∈ [0, 1], and suppose that the total evolution time satisfies
T ≥ (2/ε) [ c₁ ‖Ḣ(0)‖/Δ(0)² + c₁ ‖Ḣ(1)‖/Δ(1)² + ∫₀¹ ( (3c₁² + c₁ + c₃) ‖Ḣ‖²/Δ³ + c₂ ‖Ḧ‖/Δ² ) ds ]. (27.49)
Then evolution of the initial state |ψ(0)⟩ = |φ(0)⟩ under the Schrödinger equation (27.5) produces a final state |ψ(1)⟩ satisfying
‖|ψ(1)⟩ − |φ(1)⟩‖ ≤ ε. (27.50)
Chapter 28
Adiabatic optimization
Having established the quantum adiabatic theorem, we will now see how it can be applied to solve optimization problems.
After describing the general framework [42], we will see how this approach gives an alternative O(√N)-time algorithm for unstructured search.
Given a cost function h : {0,1}ⁿ → ℝ, define a Hamiltonian that is diagonal in the computational basis, HP := Σ_{z∈{0,1}ⁿ} h(z)|z⟩⟨z|. We refer to HP as the problem Hamiltonian, since it corresponds to the problem of minimizing h. Clearly, its ground state consists of strings z such that h(z) is minimized. Therefore, if we could prepare the ground state of HP, we could solve the minimization problem.
To prepare the ground state of HP , we will adiabatically evolve from the ground state of a simpler
Hamiltonian. Let the the beginning Hamiltonian HB be some Hamiltonian whose ground state is easy
to prepare. Then let HT (t) be a smoothly varying time-dependent Hamiltonian with HT (0) = HB and
HT (T ) = HP , where T is the total run time of the evolution. Assuming the evolution is sufficiently close to
adiabatic, the initial ground state will evolve into a state close to the final ground state, thereby solving the
problem.
For any given HB and HP , there are many possible choices for the interpolation HT (t). One simple
choice is a time-dependent Hamiltonian of the form
HT(t) = H(t/T) := (1 − f(t/T)) HB + f(t/T) HP (28.3)
where f(s) is a smooth, monotonic function of s ∈ [0, 1] satisfying f(0) = 0 and f(1) = 1, so that H(0) = HB
and H(1) = HP . In other words, the interpolating function f (t/T ) should vary smoothly from 0 to 1 as the
time t varies from 0 to T . If f (s) is twice differentiable, and if the ground state of H(s) is nondegenerate
for all s ∈ [0, 1], then the adiabatic theorem guarantees that the evolution will become arbitrarily close
to adiabatic in the limit T → ∞. An especially simple choice for this interpolation schedule is the linear
interpolation f (s) = s, but many other choices are possible.
Finally, how should we choose the beginning Hamiltonian? If we choose an interpolation of the form
(28.3), then HB clearly should not commute with HP, or else no evolution will occur. One natural choice for HB is
HB = −Σ_{j=1}^n σx^{(j)} (28.4)
where σx^{(j)} is the Pauli x operator on the jth qubit. This beginning Hamiltonian has the ground state
|S⟩ := (1/√2ⁿ) Σ_{z∈{0,1}ⁿ} |z⟩, (28.5)
a uniform superposition of all possible solutions S = {0,1}ⁿ. But as for the method of interpolation, many
other choices for HB are possible.
To summarize, a quantum adiabatic optimization algorithm works as follows:
1. Prepare the quantum computer in the ground state of the beginning Hamiltonian HB .
2. Evolve the state with the Hamiltonian H(t) for a total time T , ending with the problem Hamiltonian
HP .
3. Measure in the computational basis.
Step 1 can be performed efficiently if HB has a sufficiently simple ground state, for example, if it is the
state (28.5). Step 2 can be simulated efficiently on a universal quantum computer, assuming the Hamiltonian
is of a suitable form (say, if it is sparse) and the run time T is not too large. Step 3 is straightforward to
implement, and will yield a state close to the ground state assuming the simulation of the evolution is
sufficiently good and the evolution being simulated meets the conditions of the adiabatic theorem.
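As an illustration of these three steps, the following sketch runs the algorithm on a toy 3-qubit instance; the cost function (Hamming weight), schedule, and run time are all illustrative choices, not part of the text.

```python
import numpy as np
from scipy.linalg import expm

n, T, steps = 3, 50.0, 5000
X = np.array([[0.0, 1.0], [1.0, 0.0]])

def op(single, j):                       # single-qubit operator acting on qubit j
    out = np.array([[1.0]])
    for i in range(n):
        out = np.kron(out, single if i == j else np.eye(2))
    return out

HB = -sum(op(X, j) for j in range(n))    # beginning Hamiltonian, as in (28.4)
h = np.array([bin(z).count("1") for z in range(2 ** n)])
HP = np.diag(h.astype(float))            # problem Hamiltonian for a toy cost function

# Step 1: prepare the ground state of HB, the uniform superposition (28.5)
psi = np.ones(2 ** n) / np.sqrt(2 ** n)

# Step 2: evolve under H(t/T) with the linear schedule f(s) = s
dt = T / steps
for i in range(steps):
    s = (i + 0.5) / steps
    psi = expm(-1j * dt * ((1 - s) * HB + s * HP)) @ psi

# Step 3: measure in the computational basis
print(np.abs(psi[0]) ** 2)               # probability of the minimizer 000 (close to 1)
```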
For the linear interpolation f(s) = s, the derivatives of the Hamiltonian are simply
Ḣ = HP − HB (28.6)
Ḧ = 0. (28.7)
Now let
Δmin := min_{s∈[0,1]} Δ(s) (28.8)
be the minimum gap between the ground and first excited states. Then by Theorem 27.1, it suffices to take
T ≥ (2/ε) [ 2c₁ ‖HP − HB‖/Δ²min + (3c₁² + c₁ + c₃) ‖HP − HB‖²/Δ³min ]. (28.9)
Recall that to be efficiently simulable, HB and HP should not have very large norm. Thus we see that if
the minimum gap min is not too small, the run time need not be too large. In particular, to show that the
adiabatic algorithm runs in polynomial time, it suffices to show that the minimum gap is only polynomially small, i.e., that 1/Δmin is upper bounded by a polynomial in n.
Of course, this does not answer the question of whether the adiabatic algorithm runs in polynomial time
unless the minimum gap can be estimated. In general, calculating the gap for a particular Hamiltonian is
a difficult problem, which makes the adiabatic algorithm difficult to analyze. Nevertheless, there are a few
examples of interest for which the gap can indeed be estimated.
[Figure: the gap Δ(f) as a function of the interpolation parameter f(t/T) for the unstructured search Hamiltonian, with a sharp minimum at f = 1/2.]
In general, the minimum value occurs at f = 1/2, where we have Δmin = a = 1/√N.
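These properties of the gap can be checked numerically. The sketch below assumes the rank-one projector form of the search Hamiltonians, HB = 1 − |S⟩⟨S| and HP = 1 − |m⟩⟨m| (an assumption about the form of (28.13), which is defined earlier in this section), and compares the exact gap to √(1 − 4f(1 − f)(1 − a²)).

```python
import numpy as np

N, m = 64, 17                            # m is an arbitrary marked item
S = np.ones(N) / np.sqrt(N)
HB = np.eye(N) - np.outer(S, S)          # 1 - |S><S|
HP = np.eye(N)
HP[m, m] = 0.0                           # 1 - |m><m|
a = 1.0 / np.sqrt(N)

for f in np.linspace(0.0, 1.0, 11):
    w = np.linalg.eigvalsh((1 - f) * HB + f * HP)
    gap = w[1] - w[0]
    pred = np.sqrt(1 - 4 * f * (1 - f) * (1 - a ** 2))
    assert abs(gap - pred) < 1e-9

print("minimum gap:", np.sqrt(1 - (1 - a ** 2)))   # at f = 1/2, exactly a = 1/sqrt(N)
```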
To finish specifying the algorithm, we must choose a particular interpolation function (or schedule) f (s).
The simplest choice is to use the linear interpolation f(s) = s, but it turns out that this simple choice does not work. Applying (28.9), which pessimistically depends solely on the minimum value of the gap, only shows it is sufficient to take T = O(1/Δ³min) = O(N^{3/2}). But even if we use the full adiabatic theorem, we only find that it is sufficient to take the run time to be large compared to
∫₀¹ df/Δ³ = ∫₀¹ df/[1 − 4f(1 − f)(1 − a²)]^{3/2} = 1/a² = N. (28.19)
While the adiabatic theorem only gives an upper bound on the running time, it turns out that the bound
is essentially tight in this case: with linear interpolation, the run time must be Ω(N) for the evolution to
remain approximately adiabatic.
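The value of the integral (28.19) is easy to confirm numerically:

```python
import numpy as np
from scipy.integrate import quad

N = 256
a = 1 / np.sqrt(N)
val, err = quad(lambda f: (1 - 4 * f * (1 - f) * (1 - a ** 2)) ** -1.5, 0, 1)
print(val, N)   # the integral equals 1/a^2 = N
```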
However, we can do better by choosing a different interpolation schedule f(s). Intuitively, since the gap is smallest when f(s) is close to 1/2, we should evolve more slowly for such values. The fact that the gap is only of order 1/√N for values of |f − 1/2| of order 1/√N ultimately means that it is possible to choose a schedule for which a total run time of O(√N) suffices. Since we should evolve most slowly when the gap is smallest, it is reasonable to let ḟ ∝ Δᵖ for some power p. For concreteness, we will use p = 3/2, although any p ∈ (1, 2) would work.
If we let
ḟ = cΔ^{3/2}, (28.20)
then the coefficient c is fixed by the equation ∫₀¹ ds = ∫₀¹ df/ḟ = 1, i.e.,
c = ∫₀¹ df/Δ^{3/2} (28.21)
= ∫₀¹ df/[1 − 4f(1 − f)(1 − a²)]^{3/4} (28.22)
= (N^{3/4}/√(2(N − 1))) Im B_{(1+i√(N−1))/2}(1/4, 1/4) (28.23)
= 2 (Γ(5/4)/Γ(3/4)) N^{1/4} + O(1) (28.24)
where B_z(a, b) denotes the incomplete beta function, and Γ(z) denotes the gamma function. Then for
example, with N = 1000, the schedule obtained by integrating (28.20) looks as follows:
28.3. Adiabatic optimization algorithm for unstructured search 145
[Figure: the schedule f(s) for N = 1000, rising quickly near s = 0 and s = 1 and varying slowly where f is near 1/2.]
Now we want to evaluate the terms appearing in the adiabatic theorem. For the first three terms, we
need to calculate
‖Ḣ(0)‖/Δ(0)² = ‖Ḣ(1)‖/Δ(1)² = c√(1 − a²) (28.28)
= O(N^{1/4}), (28.29)
and
dΔ/df = 2(2f − 1)(1 − a²)/Δ. (28.38)
Then we have
∫₀¹ (‖Ḧ‖/Δ²) ds = ∫₀¹ (‖Ḧ‖/Δ²) (df/ḟ) (28.39)
= (3c/2) √(1 − a²) ∫₀¹ Δ^{−3/2} |dΔ/df| df (28.40)
= 3c(1 − a²)^{3/2} ∫₀¹ |2f − 1|/[1 − 4f(1 − f)(1 − a²)]^{5/4} df (28.41)
= 6c(1 − a²)^{3/2}/(√a (1 + √a)(1 + a)) (28.42)
= O(√N). (28.43)
Overall, we find a total run time of T = O(√N) suffices to make the evolution arbitrarily close to adiabatic.
In the above analysis, it was essential to understand the behavior of the gap as a function of f . In
particular, since the spectrum of the Hamiltonian (28.13) does not depend on which item m is marked, we
can choose a schedule that is simultaneously good for all possible marked items. For general instances of
adiabatic optimization, this may not be the case.
To implement this adiabatic optimization algorithm for unstructured search in the conventional quantum query model, we must simulate evolution according to this Hamiltonian. Using the fact that ‖[HB, HP]‖ = O(1/√N), it is possible to perform this simulation using O(√N) queries to a black box for h(z).
Chapter 29
An example of the success of adiabatic optimization
In this lecture, we describe a simple example of a function that can be minimized by adiabatic optimization
in polynomial time [42].
Specifically, we consider the ring of agrees, with cost function
h(z) := Σ_{j=1}^n (1 − δ_{z_j, z_{j+1}}), (29.1)
where we make the identification z_{n+1} := z₁. Thus, the problem Hamiltonian can be written in terms of
Pauli operators as
HP := Σ_z h(z)|z⟩⟨z| (29.3)
= (1/2) Σ_{j=1}^n (1 − σz^{(j)} σz^{(j+1)}) (29.4)
where we make the similar identification σz^{(n+1)} := σz^{(1)}. To prepare the ground state of HP, we will use
linear interpolation from a magnetic field in the x direction (i.e., the adjacency matrix of the hypercube),
giving
H(s) = −(1 − s) Σ_{j=1}^n σx^{(j)} + (s/2) Σ_{j=1}^n (1 − σz^{(j)} σz^{(j+1)}). (29.5)
To understand how well the resulting adiabatic algorithm performs, we would like to calculate the gap
Δ(s) of this Hamiltonian as a function of s. Strictly speaking, this gap is zero, since the final ground state
is degenerate: any state in the two-dimensional subspace span{|0 . . . 0i, |1 . . . 1i} has zero energy. However,
148 Chapter 29. An example of the success of adiabatic optimization
the Hamiltonian commutes with the spin flip operator
G := σx^{(1)} σx^{(2)} ⋯ σx^{(n)}, (29.6)
and the initial state |S⟩ (where S = {0,1}ⁿ) is an eigenstate of G with eigenvalue +1. The evolution takes place entirely within the +1 eigenspace of G, so we can restrict our attention to this subspace. So let Δ(s) denote the gap between the ground state of H(s) and the first excited state in the +1 eigenspace of G. This is the relevant gap for adiabatic evolution starting in |S⟩, with the ultimate goal of producing the
unique G = +1 ground state of HP, the GHZ state
(|0…0⟩ + |1…1⟩)/√2. (29.7)
Measurement of this state in the computational basis will yield one of the two satisfying assignments of the
n bits, each occurring with probability 1/2.
The Hamiltonian (29.5) is well-known in statistical mechanics, where it is referred to as a ferromagnetic
Ising model in a transverse magnetic field. It can be diagonalized using the Jordan-Wigner transform, which
we describe next.
This Hamiltonian is of the general Ising form
H = Σ_{i=1}^n Jᵢ σz^{(i)} σz^{(i+1)} + Σ_{i=1}^n hᵢ σx^{(i)} (29.8)
for some values of the real numbers Jᵢ and hᵢ. We may either have periodic boundary conditions (by identifying σz^{(n+1)} with σz^{(1)}) or open boundary conditions (by setting Jn = 0).
The Jordan-Wigner transformation consists of the definition
aj := σx^{(1)} σx^{(2)} ⋯ σx^{(j−1)} σ₋^{(j)} ⊗ 1^{(j+1)} ⊗ ⋯ ⊗ 1^{(n)} (29.9)
(which will turn out to be a fermion annihilation operator), where we have defined spin raising and lowering
operators in the x basis,
σ± := R ((σx ∓ iσy)/2) R (29.10)
= |∓⟩⟨±| (29.11)
where
R := (1/√2) ( 1 1 ; 1 −1 ) (29.12)
is the Hadamard transformation, and |±⟩ := (|0⟩ ± |1⟩)/√2 are the eigenvectors of σx.
To see that the aj's correspond to fermion annihilation operators, we observe that aj and
aj† = σx^{(1)} σx^{(2)} ⋯ σx^{(j−1)} σ₊^{(j)} ⊗ 1^{(j+1)} ⊗ ⋯ ⊗ 1^{(n)} (29.13)
satisfy the canonical anticommutation relations. Here the anticommutator is
29.2. The Jordan-Wigner transformation: From spins to fermions 149
{A, B} := AB + BA (29.14)
and
for j < k we have, for example,
{aj, ak} = {σ₋^{(j)}, σx^{(j)}} σx^{(j+1)} ⋯ σx^{(k−1)} σ₋^{(k)} ⊗ 1 ⊗ ⋯ = 0 (29.16)
since {σ₋, σx} = 0. Proceeding similarly for the other cases, one finds
{aj, ak} = 0 (29.19)
{aj, ak†} = δ_{j,k}. (29.20)
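For small n, the anticommutation relations (29.19) and (29.20) can be verified directly by building the operators (29.9) as explicit matrices:

```python
import numpy as np

n = 3
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Y = np.array([[0.0, -1j], [1j, 0.0]])
R = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)   # Hadamard, (29.12)
sm = R @ ((X + 1j * Y) / 2) @ R                        # sigma_- = |+><-|, from (29.10)-(29.11)

def chain(ops):
    out = np.array([[1.0 + 0j]])
    for o in ops:
        out = np.kron(out, o)
    return out

a = [chain([X] * j + [sm] + [I2] * (n - j - 1)) for j in range(n)]

anti = lambda A, B: A @ B + B @ A
for j in range(n):
    for k in range(n):
        assert np.allclose(anti(a[j], a[k]), 0)                        # (29.19)
        assert np.allclose(anti(a[j], a[k].conj().T),
                           (j == k) * np.eye(2 ** n))                  # (29.20)
print("CAR verified for n =", n)
```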
To fermionize H, we need to express σx^{(j)} and σz^{(j)}σz^{(j+1)} in terms of fermion operators. The important point is that even though the aj's and aj†'s are highly nonlocal spin operators, certain local combinations of them correspond to local spin operators, and vice versa. For the magnetic field, we have
aj†aj = σ₊^{(j)} σ₋^{(j)} (29.21)
= (|−⟩⟨−|)^{(j)} (29.22)
= (1 − σx^{(j)})/2, (29.23)
so σx^{(j)} = 1 − 2aj†aj. Similarly, one finds σz^{(j)}σz^{(j+1)} = (aj† − aj)(a_{j+1}† + a_{j+1}) for j < n, while with periodic boundary conditions the boundary coupling σz^{(n)}σz^{(1)} acquires an additional factor of G,
where G is the spin flip operator defined in (29.6). Since σx anticommutes with σz, the operator G commutes
with each Ising coupling term, and thus commutes with any H of the form (29.8). Therefore, to find the
spectrum of H, it suffices to separately determine the spectra in the subspaces with G = +1 and G = −1.
Note that since σx = (−1)^{(1−σx)/2}, we can write
G = (−1)^{Σ_{j=1}^n (1−σx^{(j)})/2} (29.30)
= (−1)^{Σ_{j=1}^n aj†aj}. (29.31)
Thus the cases G = +1, G = −1 correspond to the cases of an even or an odd number of occupied fermion
modes, respectively.
Overall, the Jordan-Wigner transformation results in the expression
H = Σ_{i=1}^n Jᵢ′ (aᵢ† − aᵢ)(a_{i+1}† + a_{i+1}) − Σ_{i=1}^n hᵢ (aᵢ†aᵢ − aᵢaᵢ†). (29.32)
Using the fermion anticommutation relations (29.19) and (29.20), we can rewrite this Hamiltonian as
H = a† ( α β ; β† −αᵀ ) a + tr α (29.35)
where α and β denote the matrices whose j,k entries are α_{jk} and β_{jk}, respectively, and a denotes the column vector whose first block has entries a₁, …, an and whose second block has entries a₁†, …, an†. Since H is hermitian, we can always choose α, β so that α = α† and β = −βᵀ.
We would like to define a change of basis to a new set of fermion operators bj, bj† in which the Hamiltonian
is diagonal. If we let
bj := Σ_{k=1}^n ( γ_{jk} ak + μ_{jk} ak† ) (29.36)
for some matrices γ and μ, or in block form,
( b ; b† ) = ( γ μ ; μ̄ γ̄ ) ( a ; a† ). (29.37)
The matrices γ and μ are not arbitrary, since we require that the transformed bj's and bj†'s remain fermion
operators, i.e., that they satisfy the fermion anticommutation relations
{bj, bk} = 0 (29.38)
{bj, bk†} = δ_{j,k}. (29.39)
It is a good exercise to check that these relations are satisfied if and only if the matrix in (29.37) is unitary.
Although we will not describe the proof here,1 it turns out that any quadratic fermion Hamiltonian can
be diagonalized by such a transformation. In particular, it is always possible to choose γ, μ so that
H = b† ( λ 0 ; 0 −λ ) b + tr α (29.40)
1 The diagonalization of H in the case of real α, β appears in [69]. For general α, β, as well as the case where we include terms that are linear in the fermion operators, see [35].
29.3. Diagonalizing a system of free fermions 151
where λ is a diagonal matrix whose diagonal entries are the positive eigenvalues of the 2 × 2 block matrix (representing a 2n × 2n matrix whose eigenvalues occur in ± pairs) appearing in (29.35). Expanding this expression, we have
H = Σ_{j=1}^n λj (2bj†bj − 1) + tr α (29.41)
where we have again used the fermion anticommutation relations. Since the bj†bj's are commuting operators with eigenvalues 0 and 1, we see that the spectrum of H is given by the 2ⁿ numbers
Σ_{j=1}^n sj λj + tr α (29.42)
over all choices of s₁, …, sn ∈ {−1, +1}.
The eigenvalues corresponding to eigenstates with G = ±1 can be identified as follows. The transformation (29.37) is invertible, so any quadratic expression in the aj's and aj†'s can be written as a quadratic expression in the bj's and bj†'s. Since quadratic fermion operators do not change the parity of the total number of occupied modes, this means that the parity of the a modes is the same as the parity of the b modes. In other words,
G = (−1)^{Σ_{j=1}^n bj†bj}. (29.48)
Thus the eigenvalues with G = +1 are those with an even number of sj's equal to +1 in (29.42), whereas the eigenvalues with G = −1 are those with an odd number of sj's equal to +1. In particular, we see that the gap between the ground and first excited states in the G = +1 subspace is equal to 2(λ₁ + λ₂), where λ₁ and λ₂ are the square roots of the two smallest eigenvalues of (29.47).
In the case of periodic boundary conditions, note that we have two distinct matrices (29.47), one for each value of G. However, with a fixed value of G, only half the possible assignments of the sj's give rise to eigenvalues of the Hamiltonian, so we still find the correct number of eigenvalues. Here again, the gap between the ground and first excited states in the G = +1 subspace is equal to 2(λ₁ + λ₂), where now λ₁ and λ₂ are the square roots of the two smallest eigenvalues of (29.47) with G = +1.
For the ring of agrees, the matrix (29.47) becomes
(Jᵢ² + hᵢ²) 1 − Jᵢhᵢ (D + D⁻¹) = (1/4) ( s² + 4(1 − s)² − 2s(1 − s)(D + D⁻¹) ), (29.49)
where D is the skew-circulant matrix
D := ⎛ 0 1 0 ⋯ 0 ⎞
     ⎜ 0 0 1 ⋱ ⋮ ⎟
     ⎜ ⋮ ⋱ ⋱ ⋱ 0 ⎟
     ⎜ 0 ⋯ 0 0 1 ⎟
     ⎝ −1 0 ⋯ 0 0 ⎠ (29.50)
= Σ_{x=0}^{n−2} |x + 1⟩⟨x| − |0⟩⟨n − 1|. (29.51)
Just as a circulant matrix is diagonal in the Fourier basis
|φk⟩ := (1/√n) Σ_{x=0}^{n−1} e^{2πikx/n} |x⟩ (29.54)
29.4. Diagonalizing the ring of agrees 153
for k = 0, 1, …, n − 1, one can show that the matrix D (and hence any skew-circulant matrix) is diagonal in the skew-Fourier basis
|φ̃k⟩ := (1/√n) Σ_{x=0}^{n−1} e^{iπ(2k+1)x/n} |x⟩, (29.55)
also for k = 0, 1, …, n − 1. In particular,
D|φ̃k⟩ = e^{−iπ(2k+1)/n} |φ̃k⟩. (29.56)
Thus, the eigenvalues of (29.49) are given by
(1/4) ( s² + 4(1 − s)² − 4s(1 − s) cos(π(2k + 1)/n) ). (29.57)
The smallest two eigenvalues (which are equal) occur for k = 0 and k = n − 1, so the gap as a function of the interpolating parameter is
Δ(s) = 2 √( s² + 4(1 − s)² − 4s(1 − s) cos(π/n) ), (29.58)
which looks like this for n = 50:
[Figure: the gap Δ(s) for n = 50, decreasing from 4 at s = 0 to a small minimum near s = 2/3, then returning to 2 at s = 1.]
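The gap formula can be tested against exact diagonalization in the G = +1 sector for small n, taking Δ(s) = 2√(s² + 4(1 − s)² − 4s(1 − s) cos(π/n)) as in (29.58):

```python
import numpy as np

n = 6
I2 = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def site(op, j):                               # operator op acting on spin j
    out = np.array([[1.0]])
    for i in range(n):
        out = np.kron(out, op if i == j else I2)
    return out

Xs = [site(X, j) for j in range(n)]
ZZ = [site(Z, j) @ site(Z, (j + 1) % n) for j in range(n)]   # periodic ring
G = Xs[0]
for x in Xs[1:]:
    G = G @ x                                  # spin flip operator (29.6)

wG, VG = np.linalg.eigh(G)
B = VG[:, wG > 0]                              # orthonormal basis of the G = +1 subspace

for s in [0.2, 0.5, 0.8]:
    H = -(1 - s) * sum(Xs) + (s / 2) * sum(np.eye(2 ** n) - zz for zz in ZZ)
    w = np.linalg.eigvalsh(B.T @ H @ B)        # spectrum restricted to G = +1
    pred = 2 * np.sqrt(s ** 2 + 4 * (1 - s) ** 2 - 4 * s * (1 - s) * np.cos(np.pi / n))
    assert abs((w[1] - w[0]) - pred) < 1e-8
print("gap formula (29.58) verified for n =", n)
```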
For large n,
cos(π/n) = 1 − π²/(2n²) + O(1/n⁴), (29.59)
so
Δ(s) = 2 √( (2 − 3s)² + 2π² s(1 − s)/n² ) + O(1/n⁴). (29.60)
n2
Setting d(s)2 /ds equal to zero, we see that the minimum occurs at s = 2/3 + O(1/n2 ), at which the
minimum gap is
4
= + O(1/n3 ) . (29.61)
3n
Since the minimum gap decreases only as 1/poly(n), we see that adiabatic optimization can efficiently find
a satisfying assignment for the ring of agrees. Even though the ring of agrees is not by itself an interesting
computational problem, we can take this as preliminary evidence that adiabatic optimization sometimes
succeeds.
However, it is also possible for the adiabatic algorithm to fail (at least for certain natural choices of
the interpolating Hamiltonian), even for cost functions that are almost as simple as the ring of agrees. For
example, suppose we have 4n spins arranged on a ring, and we define the cost function
h′(z) = Σ_{j=1}^n (1 − δ_{z_j,z_{j+1}}) + 2 Σ_{j=n+1}^{2n} (1 − δ_{z_j,z_{j+1}}) + Σ_{j=2n+1}^{3n} (1 − δ_{z_j,z_{j+1}}) + 2 Σ_{j=3n+1}^{4n} (1 − δ_{z_j,z_{j+1}}). (29.62)
In other words, we again penalize a string when adjacent bits disagree, but the penalty is either 1 or 2
for contiguous blocks of n pairs of spins. In this case one can show that the gap is exponentially small.
Unfortunately, we did not have time to discuss the details of this calculation.
Chapter 30
Universality of adiabatic quantum computation
In this final chapter, we see how adiabatic evolution can be used to implement an arbitrary quantum circuit
[8]. In particular, this can be done with a local, linearly interpolated Hamiltonian. We may think of such
Hamiltonians as describing a model of quantum computation. We know that this model can be efficiently
simulated in the quantum circuit model. In this lecture we will see how the circuit model can be efficiently
simulated by the adiabatic model, so that in fact the two models have equivalent computational power (up
to polynomial factors).
This does not necessarily mean that there is an efficient adiabatic optimization algorithm for any problem
that can be solved efficiently by a quantum computer. For example, Shor's algorithm shows that quantum
computers can factor integers efficiently, yet we do not know if there is an adiabatic factoring algorithm that
works by optimizing some cost function (such as the squared difference between the integer and a product
of smaller integers). In general, it does not seem that the constructions of universal adiabatic quantum
computers give much insight into how one might design efficient quantum adiabatic optimization algorithms.
Nevertheless, they show that there is some sense in which the idea of adiabatic evolution captures much of
the power of quantum computation.
Specifically, given a quantum circuit composed of gates U₁, …, Uk, Feynman's construction uses the Hamiltonian
HF := Σ_{j=1}^k Hj (30.1)
where
Hj := Uj ⊗ |j⟩⟨j − 1| + Uj† ⊗ |j − 1⟩⟨j|. (30.2)
Here the first register consists of n qubits, and the second register stores a quantum state in a (k + 1)-dimensional space spanned by states |j⟩ for j ∈ {0, 1, …, k}. The second register acts as a clock that records
the progress of the computation. Later, we will show how to represent the clock using qubits, but for now,
we treat it as a convenient abstraction.
156 Chapter 30. Universality of adiabatic quantum computation
If we start the computer in the state |ψ⟩ ⊗ |0⟩, then the evolved state remains in the subspace spanned by the k + 1 states
|ψj⟩ := (Uj ⋯ U₁|ψ⟩) ⊗ |j⟩ (30.3)
for j ∈ {0, 1, …, k}. In this subspace, the nonzero matrix elements of HF are
⟨ψj|HF|ψ_{j±1}⟩ = 1, (30.4)
so the evolution is the same as that of a free particle propagating on a discretized line segment. Such a particle moves with constant speed, so in a time proportional to k, the initial state |ψ₀⟩ will evolve to a state with substantial overlap on the state |ψk⟩ = (Uk ⋯ U₁|ψ⟩) ⊗ |k⟩, corresponding to the final state of the computation. For large k, one can show that
|⟨ψk| e^{−iHF k/2} |ψ₀⟩|² = Ω(k^{−2/3}), (30.5)
so that after time k/2, a measurement of the clock will yield the result k, and hence give the final state
of the computation, with a probability that is only polynomially small in the total number of gates in the
original circuit.
The success probability of Feynmans computer can be made close to 1 by a variety of techniques. The
simplest approach is to repeat the process O(k^{2/3}) times. Or we could pad the end of the computation
with a large number of identity gates, boosting the probability that we reach a state in which the entire
computation has been performed. Alternatively, as Feynman suggested, the success probability can be made
arbitrarily close to 1 in a single shot by preparing the initial state in a narrow wave packet that will propagate
ballistically without substantial spreading. But perhaps the best approach is to make the process perfect by
changing the Hamiltonian to
HFG := (1/2) Σ_{j=1}^k √(j(k + 1 − j)) Hj. (30.6)
In this case, the choice t = π gives the exact transformation e^{−iHFG t}|ψ₀⟩ = |ψk⟩ (up to an overall phase). This can be understood by viewing |ψj⟩ as a state of total angular momentum k/2 (with squared angular momentum (k/2)(k/2 + 1)) and z component j − k/2. Then HFG is simply the x component of angular momentum, which rotates between the states with z component ∓k/2 in time π.
Equivalently, HF G can be viewed as the Hamiltonian in the Hamming weight subspace of a hypercube.
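The perfect state transfer generated by HFG is easy to confirm numerically. Taking all Uj = I, only the clock register matters:

```python
import numpy as np
from scipy.linalg import expm

k = 7                                        # clock states |0>, ..., |k>
H = np.zeros((k + 1, k + 1))                 # HFG of (30.6) with all U_j = I
for j in range(1, k + 1):
    H[j, j - 1] = H[j - 1, j] = 0.5 * np.sqrt(j * (k + 1 - j))

psi0 = np.zeros(k + 1)
psi0[0] = 1.0
out = expm(-1j * np.pi * H) @ psi0           # evolve for time t = pi
print(abs(out[-1]))                          # 1.0: the clock reaches |k> exactly (up to phase)
```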
In the Hamiltonians (30.1) and (30.6), the clock space is not represented using qubits. However, we can
easily create a Hamiltonian expressed entirely in terms of k + 1 qubits using a unary representation of the
clock. Let
|j⟩ := |0⋯0 1 0⋯0⟩ (30.7)
with j zeros before the single 1 and k − j zeros after it.
In the Hamiltonian, we then make the replacement
|j⟩⟨j − 1| → (|01⟩⟨10|)^{(j−1,j)} (30.8)
(and similarly for the adjoint), where the parenthesized superscript indicates which qubits are acted on. Then the subspace of states for which the clock register has the form (30.7) is invariant under the Hamiltonian, and within this subspace, its action is identical to that of the original Hamiltonian.
Notice that if the quantum circuit consists of one- and two-qubit gates, then the Hamiltonians (30.1) and
(30.6) are local in the sense that the interactions involve at most four qubits. We call such a Hamiltonian
4-local.
This construction shows that even a time-independent Hamiltonian of a particularly simple form can be
universal for quantum computation. Now lets see how we can modify the construction to use adiabatic
evolution instead of a time-independent Hamiltonian.
The idea is to begin with a Hamiltonian whose ground state is the initial state of the computation together with the initial configuration of the clock, and to slowly evolve to a Hamiltonian (essentially, minus the Feynman Hamiltonian (30.1)) whose ground state encodes not the final state of the computation, but rather a uniform superposition over the entire history of the computation.
As before, we will find it convenient to start with an abstract description of the clock register in terms
of k + 1 basis states |0i, |1i, . . . , |ki, without worrying about how these states are represented in terms of
qubits. Later, we will consider issues of locality in this type of construction.
For the beginning Hamiltonian, we will use
HB := −I ⊗ |0⟩⟨0| + Hpenalty (30.9)
where
Hpenalty := Σ_{j=1}^n |1⟩⟨1|^{(j)} ⊗ |0⟩⟨0|. (30.10)
Here the parenthesized superscript again indicates which qubit is acted on. The first term of (30.9) says that
the energy is lower if the clock is in the initial state |0⟩. Adding Hpenalty gives an energy penalty to states whose clock is in the state |0⟩, yet for which the state of the computation is not the initial state |00…0⟩. Thus the unique ground state of HB is |00…0⟩ ⊗ |0⟩.
For the final Hamiltonian (which we denote HC , since it encodes the final result of an arbitrary circuit,
rather than the solution of a particular problem), we will use
HC := −HF + Hpenalty (30.11)
where HF is the Feynman Hamiltonian defined in (30.1). From (30.4), we see that HF has a degenerate ground state subspace, where any state of the form
|η⟩ := (1/√(k + 1)) Σ_{j=0}^k |ψj⟩ (30.12)
(with |ψj⟩ defined in (30.3)), with an arbitrary initial state |ψ⟩, has minimal energy. Adding Hpenalty penalizes those states for which the initial state of the computation is not |00…0⟩, so that (30.12) with |ψ⟩ = |00…0⟩ is the unique ground state of HC. This state is almost as good as the final state of the
computation, since if we measure the clock, we obtain the result k with probability 1/(k + 1), which is
1/ poly(n) assuming the length of the circuit is only k = poly(n). By repeating the entire process poly(k)
times, we can obtain the final state of the computation with high probability.
Finally, we use linear interpolation to get from HB to HC , defining
H(s) := (1 s)HB + sHC . (30.13)
If we begin in the state |00…0⟩ ⊗ |0⟩ and evolve according to HT(t) := H(t/T) for a sufficiently large time T, the adiabatic theorem guarantees that the final state will be close to |η⟩. It remains to estimate the gap
Δ(s) to show that T = poly(k) is sufficient.
In fact, the (k + 1)-dimensional computational subspace spanned by the states |ψj⟩ with |ψ⟩ = |00…0⟩ is invariant under H(s), so it suffices to compute the gap within this subspace. Let us examine how H(s) acts within the computational subspace. Note that Hpenalty|ψj⟩ = 0 for all j ∈ {0, 1, …, k}. We have
⟨ψj|HB|ψ_{j′}⟩ = −δ_{j,j′} δ_{j,0} (30.14)
and
⟨ψj|HC|ψ_{j′}⟩ = −(δ_{j,j′+1} + δ_{j,j′−1}), (30.15)
so we need to lower bound the gap between the smallest and second smallest eigenvalues of the matrix
⎛ s−1 −s 0 ⋯ 0 ⎞
⎜ −s 0 −s ⋱ ⋮ ⎟
⎜ 0 −s 0 ⋱ 0 ⎟
⎜ ⋮ ⋱ ⋱ ⋱ −s ⎟
⎝ 0 ⋯ 0 −s 0 ⎠ (30.16)
We will show
Lemma 30.1. The gap between the smallest and second smallest eigenvalues of the matrix (30.16) for s ∈ [0, 1] is Ω(1/k²).
Proof. The reduced Hamiltonian (30.16) essentially describes a free particle on a finite, discrete line, with
a nonzero potential at one end. Thus the eigenstates are simply plane waves with a quantization condition
determining the allowed values of the momentum. We will show the lower bound on the gap by analyzing
this quantization condition.
We claim that the (unnormalized) eigenstates of (30.16), denoted |Ep⟩, are given by
⟨ψj|Ep⟩ = sin((k + 1 − j)p) (30.17)
for j = 0, 1, …, k, and where p is yet to be determined. It is straightforward to verify that these states satisfy
⟨ψj|H(s)|Ep⟩ = Ep ⟨ψj|Ep⟩ (30.18)
for j = 1, 2, …, k, with the energy given by
Ep = −2s cos p (30.19)
(where p may be either real or imaginary). The allowed values of p are determined by the quantization condition obtained by demanding that (30.18) also holds at j = 0, i.e., that we have
(s − 1) sin((k + 1)p) − s sin(kp) = −2s cos p sin((k + 1)p), (30.20)
which simplifies to
(1 − s) sin((k + 1)p) = s sin((k + 2)p). (30.21)
In terms of the Chebyshev polynomials of the second kind, defined by U_n(cos p) := sin((n + 1)p)/sin p, this condition reads
U_{k+1}(cos p)/U_k(cos p) = (1 − s)/s. (30.22)
[Figure: the left hand side of (30.22), U_{k+1}(cos p)/U_k(cos p), plotted as a function of cos p for k = 8.]
Since the roots of Uk(x) are given by cos(jπ/(k + 1)) for j = 1, 2, …, k, the left hand side of (30.22) has simple poles at those values (and zeros at cos(jπ/(k + 2)) for j = 1, 2, …, k + 1). One can show that the left hand side of (30.22) is strictly increasing. So there is one solution of (30.22) to the left of the leftmost pole, one between each pair of poles, and one to the right of the rightmost pole, giving a total of k + 1 solutions, and thus accounting for all the eigenvalues of (30.16).
It remains to show that the gap between the two rightmost solutions of (30.22) is not too small. It is
easy to see that the gap is Ω(1/k³), because the ground state has cos p ≥ cos(π/(k + 2)) (since it must occur to the right of the rightmost root), and the first excited state has cos p ≤ cos(π/(k + 1)) (since it must occur to the left of the rightmost pole). This shows the gap is at least 2s(cos(π/(k + 2)) − cos(π/(k + 1))) = Ω(1/k³) for constant s (and it is easy to show that the gap is a constant for s = o(1)).
However, we might like to prove a tighter result. To do this, we can separately consider the cases where
the value of p corresponding to the ground state is real (giving a plane wave) and where it is imaginary (giving a bound state). Since U_{k+1}(1)/U_k(1) = (k + 2)/(k + 1), the value of s separating these two regimes is s* := (k + 1)/(2k + 3).
For s ≤ s*, the ground state has cos p ≥ 1, whereas the first excited state has cos p ≤ cos(π/(k + 1)) (as observed above). Therefore, the gap satisfies
Δ(s) ≥ 2s (1 − cos(π/(k + 1))) = Ω(1/k²) (30.23)
for constant s (and as mentioned above, it is easy to see that Δ(s) = Ω(1) for s = o(1)).
For s > s*, the ground state has cos p ≥ cos(π/(k + 2)) (as mentioned above). For the first excited state, we will show that the solution of (30.22) not only lies to the left of the rightmost pole, but that its distance from that pole is at least a constant fraction more than the distance of that pole from cos p = 1. In particular, for any constant a > 0, we have
U_{k+1}(1 − (1 + a)(1 − cos(π/(k + 1)))) / U_k(1 − (1 + a)(1 − cos(π/(k + 1)))) = sin((k + 2) cos⁻¹((1 + a) cos(π/(k + 1)) − a)) / sin((k + 1) cos⁻¹((1 + a) cos(π/(k + 1)) − a)) (30.24)
= 1 + π√(1 + a) cot(π√(1 + a))/k + O(1/k²) (30.25)
where the second line follows by Taylor expansion. In comparison,
(k + 2)/(k + 1) = 1 + 1/k + O(1/k²). (30.26)
So if we fix (say) a = 1, then for k sufficiently large, (30.25) is larger than (30.26), which implies that the first excited state has cos p ≤ 2 cos(π/(k + 1)) − 1. In turn, this implies that
Δ(s) ≥ 2s ( cos(π/(k + 2)) − 2 cos(π/(k + 1)) + 1 ) = Ω(1/k²), (30.27)
which completes the proof.
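The Ω(1/k²) scaling of Lemma 30.1 can be observed directly from the matrix (30.16):

```python
import numpy as np

def gap(k, s):
    M = np.zeros((k + 1, k + 1))           # the matrix (30.16)
    M[0, 0] = s - 1
    for j in range(1, k + 1):
        M[j, j - 1] = M[j - 1, j] = -s
    w = np.linalg.eigvalsh(M)
    return w[1] - w[0]

for k in [20, 40, 80]:
    print(k, gap(k, 0.5) * k ** 2)         # roughly constant, as the lemma predicts
```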
30.3 Locality
The Hamiltonian (30.13) is local in terms of the computational qubits, but not in terms of the clock. However,
it is possible to make the entire construction local.
The basic idea is again to use a unary representation of the clock, as in (30.7). We saw above that
this makes HF 4-local. However, HB and Hpenalty remain nonlocal with this clock, since they include the
projector |0ih0| acting on the clock register, which involves all k + 1 of the clock qubits. Thus we must
modify the construction slightly.
Let's try adding a term to Hpenalty that penalizes clock states which are not of the correct form. To do
this, it will be useful to change the unary representation from (30.7) to a form that can be checked locally,
this time with k + 2 qubits, labeled 0 through k + 1:
|j⟩ := |0⋯0 1⋯1⟩ (30.28)
with j + 1 zeros followed by k − j + 1 ones,
for j ∈ {0, 1, …, k}. (Note that the first qubit is always in the state |0⟩, and the last qubit is always in the state |1⟩.) Now we can verify that the clock state is of the form (30.28) by ensuring that there is no occurrence of the string 10 in the clock register, that the first bit is not 1, and that the last bit is not 0; then we can check whether the clock is in its initial state by checking whether the second clock qubit is in the state |1⟩. Thus, let us redefine
Hpenalty := Σ_{j=1}^n |1⟩⟨1|^{(j)} ⊗ (|1⟩⟨1|)^{(1)} + I ⊗ (|1⟩⟨1|)^{(0)} + Σ_{j=1}^k I ⊗ (|10⟩⟨10|)^{(j,j+1)} + I ⊗ (|0⟩⟨0|)^{(k+1)} (30.29)
where the parenthesized superscripts again indicate which qubits are acted on. We redefine the beginning
Hamiltonian as
HB := I ⊗ (|0⟩⟨0|)^{(1)} + Hpenalty, (30.30)
and in the Feynman term HF of the computational Hamiltonian HC , we make the replacement
|j⟩⟨j − 1| → (|001⟩⟨011|)^{(j−1,j,j+1)} (30.31)
(and similarly for the adjoint). With these redefinitions, the overall Hamiltonian H(s) = (1 s)HB + sHC is
5-local, assuming as before that the gates in the quantum circuit to be simulated involve at most two qubits
each.
As with the original nonlocal-clock construction, HB and HC have unique ground states |0…0⟩ ⊗ |01…1⟩ and (1/√(k + 1)) Σ_{j=0}^k (Uj ⋯ U₁|0…0⟩) ⊗ |0^{j+1} 1^{k−j+1}⟩, respectively. Again, the computational subspace spanned
by the states |j i from (30.3) (but now with the clock representation (30.28)) is invariant under H(s); and
within this subspace, the Hamiltonian acts according to (30.16), which has a gap of (1/k 2 ). Overall, this
shows that there is a 5-local Hamiltonian H(s) implementing an arbitrary quantum circuit by adiabatic
evolution.
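Within the computational subspace, the Hamiltonian is a hopping matrix on the k + 1 clock values, and the Ω(1/k²) gap is the familiar scaling for a path of that length. As a rough numerical illustration (using the Laplacian of a path graph as a stand-in with the same hopping structure — this is not the exact matrix of (30.16), which also depends on s):

```python
# Numerical illustration of the Omega(1/k^2) gap scaling. The matrix below
# is the Laplacian of a path on k+1 vertices (one vertex per clock value),
# a stand-in for the hopping matrix of (30.16). Its spectral gap is
# 2(1 - cos(pi/(k+1))) ~ (pi/(k+1))^2 = Theta(1/k^2).
import numpy as np

def path_laplacian_gap(k):
    n = k + 1                      # one vertex per clock value j = 0, ..., k
    L = 2.0 * np.eye(n)
    L[0, 0] = L[-1, -1] = 1.0      # free (reflecting) endpoints
    for i in range(n - 1):
        L[i, i + 1] = L[i + 1, i] = -1.0
    evals = np.linalg.eigvalsh(L)
    return evals[1] - evals[0]     # ground state is uniform, eigenvalue 0

for k in [9, 19, 39, 79]:
    gap = path_laplacian_gap(k)
    # gap * (k+1)^2 approaches pi^2 ~ 9.87 as k grows
    print(f"k = {k:3d}: gap * (k+1)^2 = {gap * (k + 1) ** 2:.4f}")
```

The rescaled gap converging to a constant is the 1/k² scaling; the adiabatic run time, which depends inversely on (a power of) the minimum gap, is therefore polynomial in the circuit size k.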
By suitable engineering, it's possible to produce variants of this construction with even better locality
properties. One can even make the Hamiltonian spatially local, with nearest-neighbor interactions between
qubits on a two-dimensional square lattice [76]. (In fact, one can even use a one-dimensional array of quantum
systems, although not necessarily with qubits, but with higher-dimensional particles [5].)
Bibliography
[1] Scott Aaronson, Quantum lower bound for the collision problem, Proc. 34th ACM Symposium on Theory of Computing, pp. 635–642, 2002, quant-ph/0111102. [p. 101]
[2] Scott Aaronson and Andris Ambainis, Quantum search of spatial regions, Theory of Computing 1 (2005), 47–79, quant-ph/0303041, preliminary version in FOCS 2003. [p. 85]
[3] Scott Aaronson and Yaoyun Shi, Quantum lower bounds for the collision and the element distinctness problems, Journal of the ACM 51 (2004), no. 4, 595–605, quant-ph/0111102 and quant-ph/0112086, preliminary versions in STOC 2002 and FOCS 2002. [p. 87]
[4] Dorit Aharonov, Itai Arad, Elad Eban, and Zeph Landau, Polynomial quantum algorithms for additive approximations of the Potts model and other points of the Tutte plane, quant-ph/0702008. [p. 66]
[5] Dorit Aharonov, Daniel Gottesman, Sandy Irani, and Julia Kempe, The power of quantum systems on a line, Commun. Math. Phys. 287 (2009), no. 1, 41–65, arXiv:0705.4077. [p. 160]
[6] Dorit Aharonov, Vaughan Jones, and Zeph Landau, A polynomial quantum algorithm for approximating the Jones polynomial, Proc. 38th ACM Symposium on Theory of Computing, pp. 427–436, 2006. [p. 63]
[7] Dorit Aharonov and Amnon Ta-Shma, Adiabatic quantum state generation and statistical zero knowledge, Proceedings of the 35th ACM Symposium on Theory of Computing, pp. 20–29, 2003, quant-ph/0301023. [p. 125]
[8] Dorit Aharonov, Wim van Dam, Julia Kempe, Zeph Landau, Seth Lloyd, and Oded Regev, Adiabatic quantum computation is equivalent to standard quantum computation, SIAM Journal on Computing 37 (2007), no. 1, 166–194, quant-ph/0405098, preliminary version in FOCS 2004. [p. 155]
[9] Gorjan Alagic, Cristopher Moore, and Alexander Russell, Quantum algorithms for Simon's problem over general groups, Proceedings of the 18th ACM-SIAM Symposium on Discrete Algorithms, pp. 1217–1224, 2007, quant-ph/0603251. [p. 58]
[10] Andris Ambainis, Quantum lower bounds by quantum arguments, Journal of Computer and System Sciences 64 (2002), no. 4, 750–767, quant-ph/0002066, preliminary version in STOC 2000. [p. 105]
[11] ———, Quantum walk algorithm for element distinctness, SIAM J. Comput. 37 (2007), no. 1, 210–239, quant-ph/0311001, preliminary version in FOCS 2004. [p. 87]
[12] Andris Ambainis, Julia Kempe, and Alexander Rivosh, Coins make quantum walks faster, Proceedings of the 16th ACM-SIAM Symposium on Discrete Algorithms, pp. 1099–1108, 2005, quant-ph/0402107. [p. 85]
[13] Itai Arad and Zeph Landau, Quantum computation and the evaluation of tensor networks, arXiv:0805.0040. [p. 66]
[14] Dave Bacon, Andrew M. Childs, and Wim van Dam, From optimal measurement to efficient quantum algorithms for the hidden subgroup problem over semidirect product groups, Proceedings of the 46th IEEE Symposium on Foundations of Computer Science, pp. 469–478, 2005, quant-ph/0504083. [p. 59]
[15] Howard Barnum and Emanuel Knill, Reversing quantum dynamics with near-optimal quantum and classical fidelity, Journal of Mathematical Physics 43 (2002), no. 5, 2097–2106, quant-ph/0004088. [p. 45]
[16] Robert Beals, Harry Buhrman, Richard Cleve, Michele Mosca, and Ronald de Wolf, Quantum lower bounds by polynomials, Journal of the ACM 48 (2001), no. 4, 778–797, quant-ph/9802049, preliminary version in FOCS 1998. [pp. 85, 94]
[17] Aleksandrs Belovs, Span programs for functions with constant-sized 1-certificates, Proceedings of the 44th Symposium on Theory of Computing, pp. 77–84, 2012, arXiv:1105.4024. [pp. 117, 120]
[18] Aleksandrs Belovs and Ansis Rosmanis, Adversary lower bounds for the collision and the set equality problems, arXiv:1310.5185. [p. 100]
[19] Dominic W. Berry, Graeme Ahokas, Richard Cleve, and Barry C. Sanders, Efficient quantum algorithms for simulating sparse Hamiltonians, Communications in Mathematical Physics 270 (2007), no. 2, 359–371, quant-ph/0508139. [pp. 123, 125, 129]
[20] Dominic W. Berry and Andrew M. Childs, Black-box Hamiltonian simulation and unitary implementation, Quantum Information and Computation 12 (2012), no. 1-2, 29–62, arXiv:0910.4157. [p. 130]
[21] Dominic W. Berry, Andrew M. Childs, Richard Cleve, Robin Kothari, and Rolando D. Somma, Exponential improvement in precision for simulating sparse Hamiltonians, Proceedings of the 46th ACM Symposium on Theory of Computing, pp. 283–292, 2014, arXiv:1312.1414. [pp. 125, 130, 131]
[22] ———, Simulating Hamiltonian dynamics with a truncated Taylor series, Physical Review Letters 114 (2015), no. 9, 090502, arXiv:1412.4687. [p. 130]
[23] Gilles Brassard, Peter Høyer, Michele Mosca, and Alain Tapp, Quantum amplitude amplification and estimation, Quantum Computation and Information (S. J. Lomonaco and H. E. Brandt, eds.), AMS Contemporary Mathematics Series, vol. 305, AMS, Providence, RI, 2002, quant-ph/0005055. [p. 85]
[24] Gilles Brassard, Peter Høyer, and Alain Tapp, Quantum algorithm for the collision problem, quant-ph/9705002. [p. 100]
[25] Harry Buhrman, Richard Cleve, and Avi Wigderson, Quantum vs. classical communication and computation, Proceedings of the 30th ACM Symposium on Theory of Computing, pp. 63–68, 1998, quant-ph/9802040. [p. 114]
[26] Harry Buhrman, Christoph Dürr, Mark Heiligman, Peter Høyer, Frédéric Magniez, Miklós Santha, and Ronald de Wolf, Quantum algorithms for element distinctness, SIAM Journal on Computing 34 (2005), no. 6, 1324–1330, quant-ph/0007016, preliminary version in CCC 2001. [p. 87]
[27] Kevin K. H. Cheung and Michele Mosca, Decomposing finite abelian groups, Quantum Information and Computation 1 (2001), no. 3, 26–32, cs.DS/0101004. [p. 27]
[28] Andrew M. Childs, Quantum information processing in continuous time, Ph.D. thesis, Massachusetts Institute of Technology, 2004. [p. 125]
[29] ———, On the relationship between continuous- and discrete-time quantum walk, Communications in Mathematical Physics 294 (2010), no. 2, 581–603, arXiv:0810.0312. [p. 130]
[30] Andrew M. Childs, Richard Cleve, Enrico Deotto, Edward Farhi, Sam Gutmann, and Daniel A. Spielman, Exponential algorithmic speedup by quantum walk, Proceedings of the 35th ACM Symposium on Theory of Computing, pp. 59–68, 2003, quant-ph/0209131. [p. 71]
[31] Andrew M. Childs and Jeffrey Goldstone, Spatial search by quantum walk, Physical Review A 70 (2004), no. 2, 022314, quant-ph/0306054. [p. 85]
[32] Andrew M. Childs, David Jao, and Vladimir Soukharev, Constructing elliptic curve isogenies in quantum subexponential time, Journal of Mathematical Cryptology 8 (2014), no. 1, 1–29, arXiv:1012.4019. [p. 58]
[33] Andrew M. Childs and Wim van Dam, Quantum algorithms for algebraic problems, Reviews of Modern Physics 82 (2010), no. 1, 1–52, arXiv:0812.0380. [p. vii]
[34] Richard Cleve, Artur Ekert, Chiara Macchiavello, and Michele Mosca, Quantum algorithms revisited, Proceedings of the Royal Society of London A 454 (1998), no. 1969, 339–354, quant-ph/9708016. [pp. 19, 85]
[35] J. H. P. Colpa, Diagonalisation of the quadratic fermion Hamiltonian with a linear part, J. Phys. A 12 (1979), no. 4, 469–488. [p. 150]
[36] Wim van Dam, Michele Mosca, and Umesh Vazirani, How powerful is adiabatic quantum computation?, Proceedings of the 42nd IEEE Symposium on Foundations of Computer Science, pp. 279–287, 2001, quant-ph/0206003. [p. 143]
[37] Christopher M. Dawson and Michael A. Nielsen, The Solovay-Kitaev algorithm, Quantum Information and Computation 6 (2006), no. 1, 81–95, quant-ph/0505030. [p. 7]
[38] Kirsten Eisenträger, Sean Hallgren, Alexei Kitaev, and Fang Song, A quantum algorithm for computing the unit group of an arbitrary degree number field, Proceedings of the 46th ACM Symposium on Theory of Computing, pp. 293–302, 2014. [p. 41]
[39] Mark Ettinger and Peter Høyer, On quantum algorithms for noncommutative hidden subgroups, Advances in Applied Mathematics 25 (2000), 239–251, quant-ph/9807029. [pp. 54, 58]
[40] Mark Ettinger, Peter Høyer, and Emanuel Knill, The quantum query complexity of the hidden subgroup problem is polynomial, Information Processing Letters 91 (2004), no. 1, 43–48, quant-ph/0401083. [p. 45]
[41] Edward Farhi, Jeffrey Goldstone, and Sam Gutmann, A quantum algorithm for the Hamiltonian NAND tree, Theory of Computing 4 (2008), no. 1, 169–190, quant-ph/0702144. [pp. 114, 115]
[42] Edward Farhi, Jeffrey Goldstone, Sam Gutmann, and Michael Sipser, Quantum computation by adiabatic evolution, quant-ph/0001106. [pp. 141, 147]
[43] Edward Farhi and Sam Gutmann, Quantum computation and decision trees, Physical Review A 58 (1998), no. 2, 915–928, quant-ph/9706062. [p. 69]
[44] Richard P. Feynman, Simulating physics with computers, International Journal of Theoretical Physics 21 (1982), no. 6-7, 467–488. [p. 123]
[45] ———, Quantum mechanical computers, Optics News 11 (1985), 11–20. [p. 155]
[46] M. H. Freedman, A. Yu. Kitaev, M. J. Larsen, and Z. Wang, Topological quantum computation, Bull. Amer. Math. Soc. 40 (2003), 31–38, quant-ph/0101025. [p. 63]
[47] Brett Giles and Peter Selinger, Remarks on Matsumoto and Amano's normal form for single-qubit Clifford+T operators, arXiv:1312.6584. [pp. 11, 14]
[48] Daniel Gottesman, An introduction to quantum error correction and fault-tolerant quantum computation, Quantum Information Science and Its Contributions to Mathematics (Samuel J. Lomonaco, Jr., ed.), Proceedings of Symposia in Applied Mathematics, vol. 68, AMS, 2010, arXiv:0904.2557. [p. 4]
[49] M. Grigni, L. J. Schulman, M. Vazirani, and U. Vazirani, Quantum mechanical algorithms for the nonabelian hidden subgroup problem, Combinatorica 24 (2004), no. 1, 137–154, preliminary version in STOC 2001. [p. 53]
[50] Lov K. Grover, Quantum mechanics helps in searching for a needle in a haystack, Physical Review Letters 79 (1997), no. 2, 325–328, quant-ph/9706033, preliminary version in STOC 1996. [p. 83]
[51] S. Hallgren, C. Moore, M. Rötteler, A. Russell, and P. Sen, Limitations of quantum coset states for graph isomorphism, Proc. 38th ACM Symposium on Theory of Computing, pp. 604–617, 2006, quant-ph/0511148, quant-ph/0511149. [p. 54]
[52] S. Hallgren, A. Russell, and A. Ta-Shma, The hidden subgroup problem and quantum computation using group representations, SIAM Journal on Computing 32 (2003), no. 4, 916–934, preliminary version in STOC 2000. [p. 52]
[53] Sean Hallgren, Fast quantum algorithms for computing the unit group and class group of a number field, Proc. 37th ACM Symposium on Theory of Computing, pp. 468–474, 2005. [p. 41]
[54] ———, Polynomial-time quantum algorithms for Pell's equation and the principal ideal problem, Journal of the ACM 54 (2007), no. 1, article 4, preliminary version in STOC 2002. [pp. 33, 41]
[55] P. Høyer, T. Lee, and R. Špalek, Negative weights make adversaries stronger, Proc. 39th ACM Symposium on Theory of Computing, pp. 526–535, 2007, quant-ph/0611054. [pp. 105, 106]
[56] P. Høyer, M. Mosca, and R. de Wolf, Quantum search on bounded-error inputs, Proc. 30th International Colloquium on Automata, Languages, and Programming, Lecture Notes in Computer Science, vol. 2719, pp. 291–299, 2003, quant-ph/0304052. [p. 114]
[57] P. Høyer and R. Špalek, Lower bounds on quantum query complexity, Bulletin of the European Association for Theoretical Computer Science 87 (2005), 78–103, quant-ph/0509153. [p. 93]
[58] Sabine Jansen, Mary-Beth Ruskai, and Ruedi Seiler, Bounds for the adiabatic approximation with applications to quantum computation, Journal of Mathematical Physics 48 (2007), 102111, quant-ph/0603175. [p. 136]
[59] Richard Jozsa, Quantum computation in algebraic number theory: Hallgren's efficient quantum algorithm for solving Pell's equation, Annals of Physics 306 (2003), no. 2, 241–279, quant-ph/0302134. [p. 33]
[60] Phillip Kaye, Raymond Laflamme, and Michele Mosca, An introduction to quantum computing, Oxford University Press, 2007. [p. 1]
[61] Kiran S. Kedlaya, Quantum computation of zeta functions of curves, Computational Complexity 15 (2006), no. 1, 1–19, math.NT/0411623. [p. 30]
[62] Alexei Yu. Kitaev, Alexander H. Shen, and Mikhail N. Vyalyi, Classical and quantum computation, AMS, 2002. [pp. 1, 7, 19, 155]
[63] Vadym Kliuchnikov, Dmitri Maslov, and Michele Mosca, Fast and efficient exact synthesis of single qubit unitaries generated by Clifford and T gates, Quantum Information and Computation 13 (2013), no. 7-8, 607–630, arXiv:1206.5236. [p. 11]
[64] Greg Kuperberg, A subexponential-time quantum algorithm for the dihedral hidden subgroup problem, SIAM Journal on Computing 35 (2005), no. 1, 170–188, quant-ph/0302112. [p. 55]
[65] Samuel Kutin, Quantum lower bound for the collision problem with small range, Theory of Computing 1 (2005), no. 2, 29–36, quant-ph/0304162. [p. 101]
[66] Troy Lee, Frédéric Magniez, and Miklós Santha, Learning graph based quantum query algorithms for finding constant-size subgraphs, Chicago Journal of Theoretical Computer Science (2011), no. 10, arXiv:1109.5135. [p. 120]
[67] ———, Improved quantum query algorithms for triangle finding and associativity testing, Proceedings of the 24th ACM-SIAM Symposium on Discrete Algorithms, pp. 1486–1502, 2013, arXiv:1210.1014. [p. 120]
[68] Troy Lee, Rajat Mittal, Ben W. Reichardt, Robert Špalek, and Mario Szegedy, Quantum query complexity of state conversion, Proceedings of the 52nd IEEE Symposium on Foundations of Computer Science, pp. 344–353, 2011, arXiv:1011.3020. [pp. 105, 111, 114, 115, 116]
[69] E. Lieb, T. Schultz, and D. Mattis, Two soluble models of an antiferromagnetic chain, Ann. Phys. 16 (1961), no. 3, 407–466. [p. 150]
[70] Frédéric Magniez, Ashwin Nayak, Jérémie Roland, and Miklós Santha, Search via quantum walk, SIAM Journal on Computing 40 (2011), no. 1, 142–164, quant-ph/0608026. [pp. 87, 89]
[71] Ken Matsumoto and Kazuyuki Amano, Representation of quantum circuits with Clifford and π/8 gates, arXiv:0806.3834. [p. 11]
[72] C. Moore, A. Russell, and L. J. Schulman, The symmetric group defies strong Fourier sampling, Proc. 46th IEEE Symposium on Foundations of Computer Science, pp. 479–490, 2005, quant-ph/0501056. [p. 54]
[73] Cristopher Moore, Daniel N. Rockmore, Alexander Russell, and Leonard J. Schulman, The power of strong Fourier sampling: Quantum algorithms for affine groups and hidden shifts, SIAM J. Comput. 37 (2007), no. 3, 938–958, quant-ph/0503095, preliminary version in SODA 2004. [p. 53]
[74] Cristopher Moore, Alexander Russell, and Piotr Śniady, On the impossibility of a quantum sieve algorithm for graph isomorphism, Proc. 39th ACM Symposium on Theory of Computing, pp. 536–545, 2007, quant-ph/0612089. [p. 58]
[75] M. A. Nielsen and I. L. Chuang, Quantum computation and quantum information, Cambridge University Press, Cambridge, 2000. [pp. 1, 7]
[76] R. Oliveira and B. M. Terhal, The complexity of quantum spin systems on a two-dimensional square lattice, Quantum Information and Computation 8 (2008), no. 10, 900–924, quant-ph/0504050. [p. 160]
[77] Oded Regev, A subexponential time algorithm for the dihedral hidden subgroup problem with polynomial space, quant-ph/0406151. [p. 58]
[78] B. W. Reichardt, Span programs and quantum query complexity: The general adversary bound is nearly tight for every Boolean function, Proc. 50th IEEE Symposium on Foundations of Computer Science, pp. 544–551, 2009, arXiv:0904.2759. [pp. 111, 112, 113]
[79] ———, Reflections for quantum query algorithms, Proceedings of the 22nd ACM-SIAM Symposium on Discrete Algorithms, pp. 560–569, 2011, arXiv:1005.1601. [pp. 105, 111, 114, 115]
[80] B. W. Reichardt and R. Špalek, Span-program-based quantum algorithm for evaluating formulas, Proc. 40th ACM Symposium on Theory of Computing, pp. 103–112, 2008, arXiv:0710.2630. [pp. 111, 112]
[81] J. Roland and N. J. Cerf, Quantum search by local adiabatic evolution, Physical Review A 65 (2002), no. 4, 042308, quant-ph/0107015. [p. 143]
[82] M. Saks and A. Wigderson, Probabilistic Boolean decision trees and the complexity of evaluating game trees, Proc. 27th IEEE Symposium on Foundations of Computer Science, pp. 29–38, 1986. [pp. 114, 115]
[83] Miklós Santha, On the Monte Carlo Boolean decision tree complexity of read-once formulae, Random Structures and Algorithms 6 (1995), no. 1, 75–87. [p. 114]
[84] Miklós Santha, Quantum walk based search algorithms, Theory and Applications of Models of Computation, Lecture Notes in Computer Science, vol. 4978, Springer, 2008, arXiv:0808.0059, pp. 31–46. [p. 87]
[85] Arthur Schmidt and Ulrich Vollmer, Polynomial time quantum algorithm for the computation of the unit group of a number field, Proc. 37th ACM Symposium on Theory of Computing, pp. 475–480, 2005. [p. 41]
[86] Pranab Sen, Random measurement bases, quantum state distinction and applications to the hidden subgroup problem, Proc. 21st IEEE Conference on Computational Complexity (2006), 274–287, quant-ph/0512085. [p. 53]
[87] Jean-Pierre Serre, Linear representations of finite groups, Graduate Texts in Mathematics, vol. 42, Springer, 1977. [p. 47]
[88] Simone Severini, On the digraph of a unitary matrix, SIAM Journal on Matrix Analysis and Applications 25 (2003), no. 1, 295–300, math.CO/0205187. [p. 77]
[89] Neil Shenvi, Julia Kempe, and K. Birgitta Whaley, A quantum random walk search algorithm, Physical Review A 67 (2003), no. 5, 052307, quant-ph/0210064. [p. 85]
[90] Yaoyun Shi, Quantum lower bounds for the collision and the element distinctness problems, Proceedings of the 43rd IEEE Symposium on Foundations of Computer Science, pp. 513–519, 2002, quant-ph/0112086. [p. 101]
[91] Peter W. Shor, Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM Journal on Computing 26 (1997), no. 5, 1484–1509, quant-ph/9508027, preliminary version in FOCS 1994. [p. 23]
[92] D. R. Simon, On the power of quantum computation, SIAM Journal on Computing 26 (1997), no. 5, 1474–1483, preliminary version in FOCS 1994. [p. 17]
[93] M. Snir, Lower bounds on probabilistic linear decision trees, Theoretical Computer Science 38 (1985), 69–82. [pp. 114, 115]
[94] Mario Szegedy, Quantum speed-up of Markov chain based algorithms, Proceedings of the 45th IEEE Symposium on Foundations of Computer Science, pp. 32–41, 2004, quant-ph/0401053. [pp. 78, 80]
[95] Stefan Teufel, Adiabatic perturbation theory in quantum dynamics, Lecture Notes in Mathematics, vol. 1821, Springer-Verlag, 2003. [p. 136]
[96] John Watrous, Quantum simulations of classical random walks and undirected graph connectivity, Journal of Computer and System Sciences 62 (2001), no. 2, 376–391, cs.CC/9812012. [p. 77]
[97] ———, Theory of quantum information, lecture notes, 2011, http://cs.uwaterloo.ca/watrous/LectureNotes.html. [p. 111]
[98] Pawel Wocjan and Jon Yard, The Jones polynomial: Quantum algorithms and applications in quantum complexity theory, quant-ph/0603069. [p. 66]
[99] Yechao Zhu, Quantum query complexity of subgraph containment with constant-sized certificates, International Journal of Quantum Information 10 (2012), no. 3, 1250019, arXiv:1109.4165. [p. 120]