Lecture Notes v2
Gabriel T. Landi
University of São Paulo
3 Composite Systems
3.1 The age of Ulkron
3.2 Entanglement and mixed states
3.3 Reduced density matrices and the partial trace
3.4 Measurements in bipartite systems
3.5 Bell's inequality
5 Open quantum systems
5.1 Overview of quantum operations
5.2 Stinespring representation theorem
5.3 Choi's matrix and proof of the Kraus representation
5.4 Lindblad master equations
5.5 Collisional models
Exploring the 3 properties of the inner product, one may then show that given two states written in this basis, |ψ⟩ = Σ_i ψ_i|i⟩ and |φ⟩ = Σ_i φ_i|i⟩, the inner product becomes

⟨ψ|φ⟩ = Σ_i ψ_i* φ_i.    (1.3)
We always work with orthonormal bases. And even though the basis set is never
unique, the basis we are using is usually clear from the context. A general state such
as (1.1) is then generally written as a column vector
|ψ⟩ = (ψ_0, ψ_1, …, ψ_{d−1})^T.    (1.4)
The object hψ| appearing in the inner product, which is called a bra, may then be
written as a row vector
⟨ψ| = (ψ_0*, ψ_1*, …, ψ_{d−1}*).    (1.5)
The inner product formula (1.3) can now be clearly seen to be nothing but the multiplication of a row vector by a column vector. Notwithstanding, I am obligated to
emphasize that when we write a state as in Eq. (1.4), we are making specific reference
to a basis. If we were to use another basis, the coefficients would be different. The
inner product, on the other hand, does not depend on the choice of basis. If you use a
different basis, each term in the sum (1.3) will be different, but the total sum will be the
same.
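Here is a minimal numerical sketch of Eq. (1.3), written in Python with NumPy (an illustrative choice of tool; the two states are arbitrary examples):

```python
import numpy as np

# Two normalized states in a fixed orthonormal basis, as complex vectors.
psi = np.array([1 + 1j, 2 - 1j]) / np.sqrt(7)
phi = np.array([1j, 1.0]) / np.sqrt(2)

# <psi|phi> = sum_i psi_i^* phi_i; np.vdot conjugates its first argument.
inner = np.vdot(psi, phi)

# Equivalently, the bra <psi| is the conjugate row vector times the column.
inner_row_col = psi.conj() @ phi

print(inner, inner_row_col)   # the same complex number twice
```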
The vectors in the Hilbert space which represent physical states are also constructed
to satisfy the normalization condition
⟨ψ|ψ⟩ = 1.    (1.6)
This, as we will see, is related to the probabilistic nature of quantum mechanics. Moreover, two states that differ only by a global phase e^{iθ} are physically equivalent. Mathematically, this means that the relevant space in question is not a vector space, but rather a ray space.
You may also be wondering about wave-functions. Wave-functions are nothing but
the inner product of a ket with the position state |x⟩:

ψ(x) = ⟨x|ψ⟩.    (1.7)
The ket |xi may at first seem weird because the label inside the ket varies continuously.
But apart from that, you can use it just as a normal ket. Wave-functions are not very
useful in quantum information. In fact, I don’t think we will ever need them again in
this course. So bye-bye ψ(x).
Figure 1.1: Example of Bloch’s sphere which maps the general state of a qubit into a sphere of
unit radius.
The general state of a qubit may be parametrized as

|ψ⟩ = cos(θ/2)|0⟩ + e^{iφ} sin(θ/2)|1⟩,    (1.10)

where θ and φ are arbitrary real parameters. While this parametrization may not seem
unique, it turns out that it is since any other choice will only differ by a global phase
and hence will be physically equivalent. It also suffices to consider the parameters in
the range θ ∈ [0, π] and φ ∈ [0, 2π], as other values would just give the same state up to
a global phase.
You can probably see a similarity here with the way we parametrize a sphere in terms of a polar and an azimuthal angle. This is somewhat surprising since these are
completely different things. A sphere is an object in R3 , whereas in our case we have
a vector in C2 . But since our vector is constrained by the normalization (1.9), it is
possible to map one representation into the other. That is the idea of Bloch’s sphere,
which is illustrated in Fig. 1.1. In this representation, the state |0i is the north pole,
whereas |1i is the south pole. I also highlight in the figure two other states which
appear often, called |±i. They are defined as
|±⟩ = (|0⟩ ± |1⟩)/√2.    (1.11)
In terms of the angles θ and φ in Eq. (1.10), this corresponds to θ = π/2 and φ = 0, π. Thus, these states lie on the equator, as shown in Fig. 1.1.
A word of warning: Bloch’s sphere is only used as a way to represent a complex
vector as something real, so that we humans can visualize it. Be careful not to take this
mapping too seriously. For instance, if you look blindly at Fig. 1.1 you might think |0⟩ and |1⟩ are (anti-)parallel to each other, whereas in fact they are orthogonal, ⟨0|1⟩ = 0.
One may also think about the opposite operation of multiplying a column vector by a row vector.
The result will be a matrix. For instance, if |ψi = a|0i + b|1i and |φi = c|0i + d|1i, then
|ψ⟩⟨φ| = [ ac*  ad* ; bc*  bd* ].    (1.12)
This is the idea of the outer product. In linear algebra the resulting object is usually
referred to as a rank-1 matrix.
Let us go back now to the decomposition of an arbitrary state in a basis, as in Eq. (1.1). Multiplying on the left by ⟨i| and using the orthogonality (1.2) we see that

ψ_i = ⟨i|ψ⟩.    (1.13)
Substituting this back into the expansion of |ψ⟩ gives |ψ⟩ = Σ_i |i⟩⟨i|ψ⟩, from which we can read off

Σ_i |i⟩⟨i| = 1 = I,    (1.14)

which is the completeness relation, as expected since |0⟩, |1⟩ form an orthonormal basis.
But we can also do this with other bases. For instance, the states (1.11) also form an orthonormal basis, as you may check. Hence, they must also satisfy completeness:

|+⟩⟨+| + |−⟩⟨−| = (1/2)[ 1  1 ; 1  1 ] + (1/2)[ 1  −1 ; −1  1 ] = [ 1  0 ; 0  1 ].
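The same check can be done numerically in a couple of lines (a Python/NumPy sketch, with the states defined exactly as above):

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus  = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

def outer(a, b):
    """|a><b| as a matrix; the bra picks up a complex conjugate."""
    return np.outer(a, b.conj())

I = np.eye(2)
print(np.allclose(outer(ket0, ket0) + outer(ket1, ket1), I))    # True
print(np.allclose(outer(plus, plus) + outer(minus, minus), I))  # True
```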
basis element |0i and a rank-2 sub-space spanned by |1i and |2i. Or it may be divided
into 3 rank-1 sub-spaces.
Each term in the sum in Eq. (1.14) may now be thought of as a projection onto a
rank-1 sub-space. In fact, we define rank-1 projectors as operators of the form

P_i = |i⟩⟨i|.    (1.15)

They are called projection operators because if we apply them onto a general state of the form (1.1), they will only take the part of |ψ⟩ that lives in the sub-space |i⟩:

P_i|ψ⟩ = ψ_i|i⟩.
1.4 Operators
The outer product is our first example of a linear operator. That is, an operator that
acts linearly on vectors to produce other vectors:
A( Σ_i ψ_i|i⟩ ) = Σ_i ψ_i A|i⟩.
Such a linear operator is completely specified by knowing its action on all elements of
a basis set. The reason is that, when A acts on an element | ji of the basis, the result will
also be a vector and must therefore be a linear combination of the basis entries:
A|j⟩ = Σ_i A_{i,j}|i⟩.    (1.17)
The entries A_{i,j} are called the matrix elements of the operator A in the basis |i⟩. The quickest way to determine them is by taking the inner product of Eq. (1.17) with ⟨i|, which gives

A_{i,j} = ⟨i|A|j⟩.    (1.18)
We can also use the completeness (1.14) twice to write
A = 1A1 = Σ_{i,j} |i⟩⟨i|A|j⟩⟨j| = Σ_{i,j} A_{i,j}|i⟩⟨j|.    (1.19)
We therefore see that the matrix element Ai, j is the coefficient multiplying the outer
product |iih j|. Knowing the matrix form of each outer product then allows us to write
A as a matrix. For instance,

A = [ A_{0,0}  A_{0,1} ; A_{1,0}  A_{1,1} ].    (1.20)
Once this link is made, the transition from abstract linear operators to matrices is simply
a matter of convenience. For instance, when we have to multiply two linear operators
A and B we simply need to multiply their corresponding matrices.
Of course, as you well know, with matrix multiplication you have to be careful with the ordering. That is to say, in general, AB ≠ BA. This can be put in more elegant terms by defining the commutator

[A, B] = AB − BA.    (1.21)
When [A, B] ≠ 0 we then say the two operators do not commute. Commutators appear all the time. The commutation relations of a given set of operators are called the algebra of that set. And the algebra defines all properties of an operator. So in order to specify a physical theory, essentially all we need is the underlying algebra. We will see how that appears when we work out specific examples.
Commutators appear so often that it is useful to memorize the following formula:

[AB, C] = A[B, C] + [A, C]B.

This formula is really easy to remember: first A goes out to the left, then B goes out to the right. A similar formula holds for [A, BC] = B[A, C] + [A, B]C. Then B exits to the left and C exits to the right.
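Both identities are easy to verify numerically. A minimal sketch with random matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C = (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
           for _ in range(3))

def comm(X, Y):
    return X @ Y - Y @ X

# [AB, C] = A[B, C] + [A, C]B  (A exits to the left, B to the right)
print(np.allclose(comm(A @ B, C), A @ comm(B, C) + comm(A, C) @ B))  # True
# [A, BC] = B[A, C] + [A, B]C  (B exits to the left, C to the right)
print(np.allclose(comm(A, B @ C), B @ comm(A, C) + comm(A, B) @ C))  # True
```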
1. Every Hermitian operator of dimension d always has d (not necessarily distinct)
real eigenvalues.
2. The corresponding d eigenvectors can always be chosen to form an orthonormal
basis.
An example of a Hermitian operator is the rank-1 projector Pi = |iihi|. It has one
eigenvalue λ = 1 and all other eigenvalues zero. The eigenvector corresponding to
λ = 1 is precisely |ii and the other eigenvectors are arbitrary combinations of the other
basis vectors.
I will not prove the above properties, since they can be found in any linear algebra
textbook or on Wikipedia. The proof that the eigenvalues are real, however, is cute and
simple, so we can do it. Multiply Eq. (1.23) by ⟨λ|, which gives

⟨λ|A|λ⟩ = λ.    (1.25)

Because of the relation (1.24), it also follows that ⟨λ|A†|λ⟩ = λ*. But A is Hermitian, A† = A, so λ = λ* and the eigenvalue must be real.
Thus, an operator A is diagonal when written in its own basis. That is why the proce-
dure for finding eigenvalues and eigenvectors is called diagonalization.
Unitary matrices are those satisfying

UU† = U†U = 1.    (1.28)
Unitary matrices play a pivotal role in quantum mechanics. One of the main reasons
for this is that they preserve the normalization of vectors. That is, if |ψ′⟩ = U|ψ⟩ then ⟨ψ′|ψ′⟩ = ⟨ψ|ψ⟩. Unitaries are the complex version of rotation matrices: when you
rotate a vector, you don’t change its magnitude, just the direction. The idea is exactly
the same, except it is in Cd instead of R3 .
Unitary matrices also appear naturally in the diagonalization of Hermitian operators
that we just discussed [Eq. (1.23)]. Given the set of d eigenvectors |λ_i⟩, define the matrix

U = Σ_i |λ_i⟩⟨i|.    (1.29)
One can readily verify that since both |ii and |λi i form basis sets, this matrix will be
unitary. The entries of U in the basis |i⟩, U_{ij} = ⟨i|U|j⟩, are such that the eigenvectors |λ_i⟩ are arranged one in each column:

U = [ |λ_0⟩  |λ_1⟩  ⋯  |λ_{d−1}⟩ ].    (1.30)
This is one of those things that you simply need to stop and check for yourself. Please
take a second to do that.
Next we apply this matrix U to the operator A:

U†AU = Σ_{i,j} |i⟩⟨λ_i|A|λ_j⟩⟨j| = Σ_i λ_i|i⟩⟨i|.
Thus, we see that U † AU produces a diagonal matrix with the eigenvalues λi . This is
why finding the eigenstuff is called diagonalization. If we define
Λ = Σ_i λ_i|i⟩⟨i| = diag(λ_0, λ_1, …, λ_{d−1}),    (1.31)

then we may equivalently write

A = UΛU†.    (1.32)
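NumPy's np.linalg.eigh implements exactly this construction for Hermitian matrices, returning the real eigenvalues together with a unitary U whose columns are the eigenvectors, as in Eq. (1.30). A minimal sketch:

```python
import numpy as np

# A random Hermitian matrix.
rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = (M + M.conj().T) / 2

lam, U = np.linalg.eigh(A)   # eigenvalues and eigenvectors-as-columns

print(np.allclose(U.conj().T @ U, np.eye(4)))          # U is unitary
print(np.allclose(U.conj().T @ A @ U, np.diag(lam)))   # U†AU = Λ
print(np.allclose(U @ np.diag(lam) @ U.conj().T, A))   # A = UΛU†
```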
p_i = |⟨i|ψ⟩|².    (1.33)
Moreover, if the system was found in state |i⟩, then due to the action of the measurement the state collapses to |i⟩. That is, the measurement transforms the state as |ψ⟩ → |i⟩. I will not try to explain the physics behind this process right now. We will do that later on, in quite some detail. For now, just please accept that this crazy measurement thingy is actually possible.
The quantity hi|ψi is the probability amplitude to find the system in |ii. The modulus
squared of the probability amplitude is the actual probability. The probabilities (1.33)
are clearly non-negative. Moreover, they will sum to 1 when the state |ψ⟩ is properly normalized:

Σ_i p_i = Σ_i ⟨ψ|i⟩⟨i|ψ⟩ = ⟨ψ|ψ⟩ = 1.
I will leave for you to show that, using Eq. (1.33), we may also write this as

p_i = ⟨ψ|P_i|ψ⟩,

where P_i = |i⟩⟨i| is the projector of Eq. (1.15). The probabilities are therefore nothing but the expectation values of the projection operators on the state |ψ⟩.
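A minimal numerical sketch of Eq. (1.33) (the state is an arbitrary example): the probabilities coincide with the expectation values of the projectors, and we can even simulate the random outcomes:

```python
import numpy as np

psi = np.array([1.0, 1j, -1.0]) / np.sqrt(3)   # some normalized state, d = 3

# p_i = |<i|psi>|^2: in the computational basis this is just |psi_i|^2.
p = np.abs(psi) ** 2
print(p, p.sum())   # probabilities, summing to 1

# Same numbers as expectation values of the projectors P_i = |i><i|.
for i in range(3):
    P = np.zeros((3, 3), dtype=complex)
    P[i, i] = 1.0
    print(np.isclose(psi.conj() @ P @ psi, p[i]))   # True

# Simulate repeated measurements: outcome i occurs with frequency ~ p_i.
rng = np.random.default_rng(2)
outcomes = rng.choice(3, size=10_000, p=p)
print(np.bincount(outcomes) / 10_000)
```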
1 To be more precise, after we collapse, the state will start to evolve in time. If the second measurement
occurs right after the first, nothing will happen. But if it takes some time, we may get something non-trivial.
We can also keep on measuring a system on purpose, to always push it to a given state. That is called the
Zeno effect.
1.8 Pauli matrices
As far as qubits are concerned, the most important matrices are the Pauli matrices.
They are defined as
σ_x = [ 0  1 ; 1  0 ],   σ_y = [ 0  −i ; i  0 ],   σ_z = [ 1  0 ; 0  −1 ].    (1.37)
The Pauli matrices are both Hermitian, σ_i† = σ_i, and unitary, σ_i² = 1. The operator σ_z is diagonal in the |0⟩, |1⟩ basis:

σ_z|0⟩ = |0⟩,   σ_z|1⟩ = −|1⟩.

The operators σ_x and σ_y, on the other hand, flip the qubit. For instance, σ_x|0⟩ = |1⟩ and σ_x|1⟩ = |0⟩. It is also convenient to define
σ_+ = |0⟩⟨1| = [ 0  1 ; 0  0 ]   and   σ_− = |1⟩⟨0| = [ 0  0 ; 1  0 ],    (1.40)

or, equivalently,

σ_± = (σ_x ± iσ_y)/2.    (1.42)
The action of these operators on the states |0⟩ and |1⟩ can be a bit counter-intuitive:

σ_+|1⟩ = |0⟩,   σ_+|0⟩ = 0,   σ_−|0⟩ = |1⟩,   σ_−|1⟩ = 0.
In the way we defined the Pauli matrices, the indices x, y and z may seem rather
arbitrary. They acquire a stronger physical meaning in terms of Bloch’s sphere. The
states |0i and |1i are eigenstates of σz and they lie along the z axis of the Bloch sphere.
Similarly, the states |±i in Eq. (1.11) can be verified to be eigenstates of σ x and they
lie on the x axis. One can generalize this and define the Pauli matrix in an arbitrary
direction of Bloch’s sphere by first defining a unit vector
n = (sin θ cos φ, sin θ sin φ, cos θ),    (1.44)
where θ ∈ [0, π) and φ ∈ [0, 2π]. The spin operator at an arbitrary direction n is then
defined as
σ_n = σ · n = σ_x n_x + σ_y n_y + σ_z n_z.    (1.45)
Please take a second to check that we can recover σ_{x,y,z} just by taking appropriate choices of θ and φ. In terms of the parametrization (1.44) this spin operator becomes

σ_n = [ cos θ   e^{−iφ} sin θ ; e^{iφ} sin θ   −cos θ ].    (1.46)
I will leave for you the exercise of computing the eigenvalues and eigenvectors
of this operator. The eigenvalues are ±1, which is quite reasonable from a physical
perspective since the eigenvalues are a property of the operator and thus should not
depend on our choice of orientation in space. In other words, the spin components in
any direction in space are always ±1. As for the eigenvectors, they are

|n_+⟩ = e^{−iφ/2} cos(θ/2)|0⟩ + e^{iφ/2} sin(θ/2)|1⟩,   |n_−⟩ = −e^{−iφ/2} sin(θ/2)|0⟩ + e^{iφ/2} cos(θ/2)|1⟩.    (1.47)
If we stare at this for a second, then the connection with Bloch’s sphere in Fig. 1.1 starts
to appear: the state |n+ i is exactly the same as the Bloch sphere parametrization (1.10),
except for a global phase e−iφ/2 . Moreover, the state |n− i is simply the state opposite
to |n+ i.
Another connection to Bloch’s sphere is obtained by computing the expectation
values of the spin operators in the state |n_+⟩. They read

⟨σ_x⟩ = sin θ cos φ,   ⟨σ_y⟩ = sin θ sin φ,   ⟨σ_z⟩ = cos θ.    (1.48)
Thus, the average of σi is simply the i-th component of n: it makes sense! We have
now gone full circle: we started with C2 and made a parametrization in terms of a unit
sphere in R3 . Now we defined a point n in R3 , as in Eq. (1.44), and showed how to
write the corresponding state in C2 , Eq. (1.47).
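A minimal sketch of this full circle (the direction (θ, φ) is an arbitrary example): build σ_n as in Eq. (1.45), check that its eigenvalues are ±1, and check that |n_+⟩ reproduces the components of n as in Eq. (1.48):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

theta, phi = 0.7, 1.9    # arbitrary direction on the sphere
n = np.array([np.sin(theta) * np.cos(phi),
              np.sin(theta) * np.sin(phi),
              np.cos(theta)])

sn = n[0] * sx + n[1] * sy + n[2] * sz     # Eq. (1.45)

evals, evecs = np.linalg.eigh(sn)
print(evals)                               # [-1., 1.] for any direction

n_plus = evecs[:, 1]                       # eigenvector with eigenvalue +1
for s, ni in zip((sx, sy, sz), n):         # <sigma_i> reproduces n_i
    print(np.isclose((n_plus.conj() @ s @ n_plus).real, ni))   # True
```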
To finish, let us also write the diagonalization of σn in the form of Eq. (1.32). To
do that, we construct a matrix whose columns are the eigenvectors |n+ i and |n− i. This
matrix is then
G = [ e^{−iφ/2} cos(θ/2)   −e^{−iφ/2} sin(θ/2) ; e^{iφ/2} sin(θ/2)   e^{iφ/2} cos(θ/2) ].    (1.49)
The diagonal matrix Λ in Eq. (1.32) is the matrix containing the eigenvalues ±1. Hence
it is precisely σz . Thus, we conclude that
σ_n = G σ_z G†.    (1.50)
We therefore see that G is the unitary matrix that “rotates” a spin operator from an
arbitrary direction towards the z direction.
The Pauli matrices can also be used as a mathematical trick to simplify some cal-
culations with general 2 × 2 matrices. Finding eigenvalues is easy. But surprisingly, the eigenvectors can become quite ugly. A way to circumvent this is to express them in
terms of Pauli matrices σ x , σy , σz and σ0 = 1 (the identity matrix). We can write this
in an organized way as
A = a0 + a · σ, (1.51)
for a certain set of four numbers a_0, a_x, a_y and a_z. Next define a = |a| = √(a_x² + a_y² + a_z²) and n = a/a. Then A can be written as

A = a_0 + a(n · σ).    (1.52)
Functions of operators are super easy to compute when an operator is diagonal. Consider, for instance, a Hermitian operator A decomposed as A = Σ_i λ_i|λ_i⟩⟨λ_i|. Since ⟨λ_i|λ_j⟩ = δ_{i,j}, it follows that

A² = Σ_i λ_i²|λ_i⟩⟨λ_i|.
Thus, we see that the eigenvalues of A² are λ_i², whereas the eigenvectors are the same as those of A. Of course, this is also true for A³ or any other power. Inserting this in Eq. (1.55) we then get

f(A) = Σ_{n=0}^∞ c_n Σ_i λ_i^n |λ_i⟩⟨λ_i| = Σ_i [ Σ_n c_n λ_i^n ] |λ_i⟩⟨λ_i|.
The quantity inside the square brackets is nothing but f (λi ). Thus we conclude that
f(A) = Σ_i f(λ_i)|λ_i⟩⟨λ_i|.    (1.56)
This result is super useful: if an operator is diagonal, simply apply the function to the
diagonal elements, like one would do for numbers. For example, the exponential of the
Pauli matrix σz in Eq. (1.37) simply reads
e^{−iφσ_z/2} = [ e^{−iφ/2}  0 ; 0  e^{iφ/2} ].    (1.57)
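A minimal sketch of this recipe, with SciPy's expm as an independent brute-force check:

```python
import numpy as np
from scipy.linalg import expm

sz = np.diag([1.0, -1.0])
phi = 0.8

# Diagonal operator: just apply the function to the diagonal, Eq. (1.57).
direct = np.diag(np.exp(-1j * phi * np.diag(sz) / 2))

# Generic recipe f(A) = sum_i f(lambda_i)|lambda_i><lambda_i|, Eq. (1.56).
lam, U = np.linalg.eigh(sz)
spectral = U @ np.diag(np.exp(-1j * phi * lam / 2)) @ U.conj().T

print(np.allclose(direct, spectral))                    # True
print(np.allclose(direct, expm(-1j * phi * sz / 2)))    # True
```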
When the matrix is not diagonal, a bit more effort is required, as we now discuss.
The unitaries in the middle cancel and we get the same structure as A = U BU † , but
for A2 and B2 . Similarly, if we continue like this we will get A3 = U B3 U † or, more
generally, An = U Bn U † . Plugging this in Eq. (1.55) then yields
f(A) = Σ_{n=0}^∞ c_n U Bⁿ U† = U ( Σ_{n=0}^∞ c_n Bⁿ ) U† = U f(B) U†.    (1.58)
I call this the infiltration property of the unitary. Unitaries are sneaky! They can
enter inside functions, as long as you have U on one side and U † in the other. This is
the magic of unitaries. Right here, in front of your eyes. This is why unitaries are so
remarkable.
From this result we also get a simple recipe for computing the matrix exponential of a general, non-diagonal matrix. Simply write A in diagonal form as A = UΛU†, where Λ is the diagonal matrix. Then, computing f(Λ) is easy because all we need is to apply the function to the diagonal entries. After we do that, we simply multiply by U(·)U† to get back the full matrix. As an example, consider a general 2 × 2 matrix with the eigenstructure given by Eq. (1.54). One then finds that
the eigenstructure given by Eq. (1.54). One then finds that
e^{iθA} = G [ e^{iθ(a_0+a)}  0 ; 0  e^{iθ(a_0−a)} ] G†.    (1.59)
Of course, if we carry out the full multiplications, this will turn out to be a bit ugly, but
still, it is a closed formula for the exponential of an arbitrary 2 × 2 matrix. We can also
write down the same formula for the inverse:
A^{−1} = G [ 1/(a_0+a)  0 ; 0  1/(a_0−a) ] G†.    (1.60)
To practice, let us compute the exponential of some other Pauli operators. The eigen-
vectors of σ x , for instance, are the |±i states in Eq. (1.11). Thus
e^{iασ_x} = e^{iα}|+⟩⟨+| + e^{−iα}|−⟩⟨−| = [ cos α  i sin α ; i sin α  cos α ] = cos α + iσ_x sin α.    (1.63)
It is also interesting to compute this in another way. Recall that σ2x = 1. In fact, this is
true for any Pauli matrix σn . We can use this to compute eiασn via the definition of the
exponential in Eq. (1.61). Collecting the terms proportional to σ_n and σ_n² = 1 we get

e^{iασ_n} = ( 1 − α²/2 + α⁴/4! + ⋯ ) + σ_n ( iα − iα³/3! + ⋯ ).
Thus, we readily see that
e^{iασ_n} = cos α + iσ_n sin α,    (1.64)
where I remind you that the first term in Eq. (1.64) is actually cos α multiplying the
identity matrix. If we now replace σn by σ x , we recover Eq. (1.63). It is interesting
to point out that nowhere did we use the fact that the matrix was 2 × 2. If you are ever
given a matrix, of arbitrary dimension, but such that A2 = 1, then the same result will
also apply.
In the theory of angular momentum, we learn that the operator which effects a rotation around a given axis, defined by a vector n, is given by e^{−iασ_n/2}. We can use
this to construct the state |n+ i in Eq. (1.47). If we start in the north pole, we can get
to a general point in the R3 unit sphere by two rotations. First you rotate around the y
axis by an angle θ and then around the z axis by an angle φ (take a second to imagine
how this works in your head). Thus, one would expect that
|n_+⟩ = e^{−iφσ_z/2} e^{−iθσ_y/2} |0⟩.    (1.65)
I will leave for you to check that this is indeed Eq. (1.47). Especially in the context of
more general spin operators, these states are also called spin coherent states, since
they are the closest analog to a point in the sphere. The matrix G in Eq. (1.49) can also
be shown to be
G = e^{−iφσ_z/2} e^{−iθσ_y/2}.    (1.66)
The exponential of an operator is defined by means of the Taylor series (1.61).
However, that does not mean that it behaves just like the exponential of numbers. In
fact, the exponential of an operator does not satisfy the exponential property:
e^{A+B} ≠ e^A e^B.    (1.67)
In a sense this is obvious: the left-hand side is symmetric with respect to exchanging A
and B, whereas the right-hand side is not since eA does not necessarily commute with
eB . Another way to see this is by means of the interpretation of eiασn as a rotation:
rotations between different axes do not in general commute.
Exponentials of operators is a serious business. There is a vast mathematical liter-
ature on dealing with them. In particular, there are a series of popular formulas which
go by the generic name of Baker-Campbell-Hausdorff (BCH) formulas. For instance,
there is a BCH formula for dealing with eA+B , which in Wikipedia is also called Zassen-
haus formula. It reads
e^{t(A+B)} = e^{tA} e^{tB} e^{−(t²/2)[A,B]} e^{(t³/3!)(2[B,[A,B]] + [A,[A,B]])} ⋯,    (1.68)
where t is just a parameter to help keep track of the order of the terms. From the fourth
order onwards, things just become mayhem. There is really no mystery behind this
formula: it simply summarizes the ordering of non-commuting objects. You can derive
it by expanding both sides in a Taylor series and grouping terms of the same order in
t. It is a really annoying job, so everyone just trusts Zassenhaus. Notwithstanding, we
can extract some physics out of this. In particular, suppose t is a tiny parameter. Then
Eq. (1.68) can be seen as a series expansion in t: the error you make in writing et(A+B)
as etA etB will be a term proportional to t2 . A particularly important case of Eq. (1.68) is
when [A, B] commutes with both A and B. That generally means [A, B] = c, a number.
But it can also be that [A, B] is just some fancy matrix which happens to commute with
both A and B. We see in Eq. (1.68) that in this case all higher order terms commute and
the series truncates. That is
e^{t(A+B)} = e^{tA} e^{tB} e^{−(t²/2)[A,B]},   when [A, [A, B]] = 0 and [B, [A, B]] = 0.    (1.69)
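Here is a minimal numerical sketch of this truncation. The two nilpotent 3 × 3 matrices below are a simple example (not from the text) of a pair whose commutator commutes with both:

```python
import numpy as np
from scipy.linalg import expm

A = np.zeros((3, 3)); A[0, 1] = 1.0
B = np.zeros((3, 3)); B[1, 2] = 1.0
C = A @ B - B @ A                      # [A, B]
print(np.allclose(A @ C, C @ A), np.allclose(B @ C, C @ B))  # True True

t = 0.7                                # t need not even be small here
lhs = expm(t * (A + B))
rhs = expm(t * A) @ expm(t * B) @ expm(-t**2 / 2 * C)
print(np.allclose(lhs, rhs))           # True: the series truncates, Eq. (1.69)
```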
There is also another BCH formula that is very useful. It deals with the sandwich
of an operator between two exponentials, and reads
e^{tA} B e^{−tA} = B + t[A, B] + (t²/2!)[A, [A, B]] + (t³/3!)[A, [A, [A, B]]] + ⋯    (1.70)
Again, you can derive this formula by simply expanding the left-hand side and collect-
ing terms of the same order in t. I suggest you give it a try in this case, at least up to
order t2 . That will help give you a feeling of how messy things can get when dealing
with non-commuting objects.
1.10 The Trace
The trace of an operator is defined as the sum of its diagonal entries:
tr(A) = Σ_i ⟨i|A|i⟩.    (1.71)
It turns out that the trace is the same no matter which basis you use. You can see that
using completeness: for instance, if |a⟩ is some other basis then

Σ_i ⟨i|A|i⟩ = Σ_i Σ_a ⟨i|a⟩⟨a|A|i⟩ = Σ_a Σ_i ⟨a|A|i⟩⟨i|a⟩ = Σ_a ⟨a|A|a⟩.
The trace is a property of the operator, not of the basis you choose. Since it does not
matter which basis you use, let us choose the basis |λi i which diagonalizes the operator
A. Then ⟨λ_i|A|λ_i⟩ = λ_i will be an eigenvalue of A. Thus, we also see that

tr(A) = Σ_i λ_i = sum of all eigenvalues of A.    (1.73)
Another important property is that the trace is cyclic,

tr(AB) = tr(BA).    (1.74)

I will leave it for you to demonstrate this. Simply insert a convenient completeness relation in the middle of AB. Using the cyclic property (1.74) you can also move around an arbitrary number of operators, but only in cyclic permutations. For instance, tr(ABC) = tr(CAB) = tr(BCA).
The sum over |i⟩ becomes an identity due to completeness and we conclude that

tr(A|ψ⟩⟨ψ|) = ⟨ψ|A|ψ⟩.    (1.76)

Notice how this follows the same logic as Eq. (1.74), so you can pretend you just used the cyclic property. This formula turns out to be extremely useful, so it is definitely worth remembering.
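A quick numerical sketch of the cyclic property with random matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
A, B, C = (rng.standard_normal((3, 3)) for _ in range(3))

tr = np.trace
print(np.isclose(tr(A @ B), tr(B @ A)))            # True, Eq. (1.74)
print(np.isclose(tr(A @ B @ C), tr(C @ A @ B)))    # True: cyclic permutation
print(np.isclose(tr(A @ B @ C), tr(B @ A @ C)))    # generically False: not cyclic
```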
Chapter 2
Now let A be an observable. If the state is |ψ1 i, then the expectation value of A
will be hψ1 |A|ψ1 i. But if it is |ψ2 i then it will be hψ2 |A|ψ2 i. To compute the actual
expectation value of A we must therefore perform an average of quantum averages:
⟨A⟩ = Σ_i p_i ⟨ψ_i|A|ψ_i⟩.    (2.2)
We simply weight the possible expectation values hψi |A|ψi i by the relative probabilities
pi that each one occurs.
What is important to realize is that this type of average cannot be written as ⟨φ|A|φ⟩ for some ket |φ⟩. If we want to attribute a "state" to our system, then we must generalize the idea of a ket. To do that, we use Eq. (1.76) to write

⟨ψ_i|A|ψ_i⟩ = tr(A|ψ_i⟩⟨ψ_i|).

Then, defining the density matrix

ρ = Σ_i p_i |ψ_i⟩⟨ψ_i|,    (2.3)

the expectation value (2.2) may be written compactly as

⟨A⟩ = tr(Aρ),    (2.4)

which, by the way, is the same as tr(ρA) since the trace is cyclic [Eq. (1.74)].
With this idea, we may now recast all of quantum mechanics in terms of density
matrices, instead of kets. If it happens that a density matrix can be written as ρ = |ψihψ|,
we say we have a pure state. And in this case it is not necessary to use ρ at all. One
may simply continue to use |ψi. For instance, Eq. (2.4) reduces to the usual result:
tr(Aρ) = hψ|A|ψi. A state which is not pure is usually called a mixed state. In this case
kets won’t do us no good and we must use ρ.
Examples
Let's play with some examples. To start, suppose a machine tries to produce qubits in the state |0⟩. But it is not very good, so it only produces |0⟩ with probability p. And, with probability 1 − p, it produces the state |1⟩. The density matrix would then be

ρ = p|0⟩⟨0| + (1 − p)|1⟩⟨1| = [ p  0 ; 0  1−p ].
Or it could be such that it produces either |0⟩ or |+⟩ = (|0⟩ + |1⟩)/√2. Then,

ρ = p|0⟩⟨0| + (1 − p)|+⟩⟨+| = (1/2)[ 1+p  1−p ; 1−p  1−p ].
Maybe, if our device is not completely terrible, it will produce |0⟩ most of the time and, every once in a while, a state |ψ⟩ = cos(θ/2)|0⟩ + sin(θ/2)|1⟩, where θ is some small angle. The density matrix for this system will then be

ρ = p|0⟩⟨0| + (1 − p)|ψ⟩⟨ψ|.
Of course, the machine can very well produce more than 2 states. But you get the idea.
Next let's talk about something really cool (and actually quite deep), called the ambiguity of mixtures. The idea is quite simple: if you mix stuff, you generally lose information, so you don't always know where you started. To see what I mean, consider a state which is a 50-50 mixture of |0⟩ and |1⟩. The corresponding density matrix will then be

ρ = (1/2)|0⟩⟨0| + (1/2)|1⟩⟨1| = (1/2)[ 1  0 ; 0  1 ].
Alternatively, consider a 50-50 mixture of the states |±⟩ in Eq. (1.11). In this case we get

ρ = (1/2)|+⟩⟨+| + (1/2)|−⟩⟨−| = (1/2)[ 1  0 ; 0  1 ].
We see that both are identical. Hence, we have no way to tell if we began with a 50-50
mixture of |0i and |1i or of |+i and |−i. By mixing stuff, we have lost information.
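A minimal numerical sketch of this ambiguity:

```python
import numpy as np

ket0 = np.array([1.0, 0.0]); ket1 = np.array([0.0, 1.0])
plus = (ket0 + ket1) / np.sqrt(2); minus = (ket0 - ket1) / np.sqrt(2)

def proj(v):
    return np.outer(v, v.conj())   # |v><v|

rho_zmix = 0.5 * proj(ket0) + 0.5 * proj(ket1)
rho_xmix = 0.5 * proj(plus) + 0.5 * proj(minus)

print(np.allclose(rho_zmix, rho_xmix))   # True: indistinguishable mixtures
print(rho_zmix)                          # I/2, the maximally mixed state
```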
Let us now look at the basic properties of the density matrix. First, ρ is Hermitian:

ρ† = ρ.    (2.5)
Second,

tr(ρ) = Σ_i p_i tr(|ψ_i⟩⟨ψ_i|) = Σ_i p_i ⟨ψ_i|ψ_i⟩ = Σ_i p_i = 1.    (2.6)
This is the normalization condition of the density matrix. Another way to see this is from Eq. (2.4) by choosing A = 1. Then, since ⟨1⟩ = 1, we again get tr(ρ) = 1.
We also see from Eq. (2.8) that ⟨φ|ρ|φ⟩ is a sum of quantum probabilities |⟨φ|ψ_i⟩|² averaged by classical probabilities p_i. This entails the following interpretation: for an arbitrary state |φ⟩,

⟨φ|ρ|φ⟩ = probability of finding the system in state |φ⟩, given that its state is ρ.    (2.7)
Besides normalization, the other big property of a density matrix is that it is positive
semi-definite, which we write symbolically as ρ ≥ 0. What this means is that its
sandwich in any quantum state is always non-negative. In symbols, if |φ⟩ is an arbitrary quantum state then

⟨φ|ρ|φ⟩ = Σ_i p_i |⟨φ|ψ_i⟩|² ≥ 0.    (2.8)
Of course, this makes sense in view of the probabilistic interpretation of Eq. (2.7).
Please note that this does not mean that all entries of ρ are non-negative. Some of
them may be negative. It does mean, however, that the diagonal entries are always
non-negative, no matter which basis you use.
Another equivalent definition of a positive semi-definite operator is one whose
eigenvalues are always non-negative. In Eq. (2.3) it already looks as if ρ is in di-
agonal form. However, we need to be a bit careful because the |ψi i are arbitrary states
and do not necessarily form a basis (which can be seen explicitly in the examples given
above). Thus, in general, the diagonal structure of ρ will be different. Notwithstanding,
ρ is Hermitian and may therefore be diagonalized by some orthonormal basis |λk i as
ρ = Σ_k λ_k |λ_k⟩⟨λ_k|,    (2.9)
for certain eigenvalues λk . Since Eq. (2.8) must be true for any state |φi we may choose,
in particular, |φ⟩ = |λ_k⟩, which gives

λ_k = ⟨λ_k|ρ|λ_k⟩ ≥ 0.
Thus, we see that the statement of positive semi-definiteness is equivalent to saying
that the eigenvalues are non-negative. In addition to this, we also have that tr(ρ) = 1, which implies Σ_k λ_k = 1. Thus we conclude that the eigenvalues of ρ behave like probabilities:

λ_k ∈ [0, 1],   Σ_k λ_k = 1.    (2.10)
But they are not the same probabilities pi . They just behave like a set of probabilities,
that is all.
For future reference, let me summarize what we learned in a big box: the basic properties of a density matrix are

ρ† = ρ,   tr(ρ) = 1,   ρ ≥ 0.    (2.11)

Any normalized positive semi-definite matrix is a valid candidate for a density matrix.
I emphasize again that the notation ρ ≥ 0 in Eq. (2.11) means the matrix is positive
semi-definite, not that the entries are positive. For future reference, let me list here
some properties of positive semi-definite matrices:
• hφ|ρ|φi ≥ 0 for any state |φi;
• The eigenvalues of ρ are always non-negative.
• The diagonal entries are always non-negative.
• The off-diagonal entries in any basis satisfy |ρ_{ij}| ≤ √(ρ_{ii} ρ_{jj}).
2.3 Purity
Next let us look at ρ². The eigenvalues of this matrix are λ_k², so

tr(ρ²) = Σ_k λ_k² ≤ 1.    (2.12)
The only case when tr(ρ²) = 1 is when ρ is a pure state. In that case it can be written as ρ = |ψ⟩⟨ψ|, so it will have one eigenvalue λ_1 = 1 and all other eigenvalues equal to zero. Hence, the quantity tr(ρ²) represents the purity of the quantum state. When it is 1 the state is pure. Otherwise, it will be smaller than 1:

tr(ρ²) < 1   for mixed states.    (2.13)
As a side note, when the dimension of the Hilbert space d is finite, it also follows
that tr(ρ²) will have a lower bound:

1/d ≤ tr(ρ²) ≤ 1.    (2.14)
This lower bound occurs when ρ is the maximally disordered state

ρ = I_d/d,    (2.15)

where I_d is the identity matrix of dimension d.
A general qubit density matrix may be parametrized as

ρ = [ p  q ; q*  1−p ],    (2.16)

where p ∈ [0, 1] and I used 1 − p in the last entry due to the normalization tr(ρ) = 1.
If the state is pure then it can be written as |ψi = a|0i + b|1i, in which case the density
matrix becomes

ρ = |ψ⟩⟨ψ| = [ |a|²  ab* ; a*b  |b|² ].    (2.17)
This is the density matrix of a system which is in a superposition of |0i and |1i. Con-
versely, we could construct a state which can be in |0i or |1i with different probabilities.
According to the very definition of the density matrix in Eq. (2.3), this state would be
ρ = p|0⟩⟨0| + (1 − p)|1⟩⟨1| = [ p  0 ; 0  1−p ].    (2.18)
This is a classical state, obtained from classical probability theory. The examples in
Eqs. (2.17) and (2.18) reflect well the difference between quantum superpositions and
classical probability distributions.
Another convenient way to write the state (2.16) is as

ρ = (1/2)(1 + s · σ) = (1/2)[ 1+s_z  s_x−is_y ; s_x+is_y  1−s_z ].    (2.19)
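A minimal sketch of this parametrization: recover s from ρ via s_i = tr(ρσ_i), and verify the standard identity tr(ρ²) = (1 + |s|²)/2 connecting the purity to the Bloch radius:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

s = np.array([0.3, -0.2, 0.5])                       # any |s| <= 1 is valid
rho = (np.eye(2) + s[0]*sx + s[1]*sy + s[2]*sz) / 2  # Eq. (2.19)

# Recover the Bloch vector: s_i = tr(rho sigma_i).
s_back = np.real([np.trace(rho @ p) for p in (sx, sy, sz)])
print(np.allclose(s_back, s))                        # True

# Purity vs. Bloch radius: tr(rho^2) = (1 + |s|^2)/2.
print(np.isclose(np.trace(rho @ rho).real, (1 + s @ s) / 2))   # True
```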
Figure 2.1: Examples of pure and mixed states in the z axis. Left: a pure state. Center: an
arbitrary mixed state. Right: the maximally mixed state (2.23).
Schrödinger’s equation
We start with Schrödinger's equation. Interestingly, the structure of Schrödinger's equation can be obtained by only postulating that the transformation caused by the time evolution should be a linear operation, in the sense that it corresponds to the action of a linear operator on the original state. That is, we can write the time evolution from time t0 to time t as

|ψ(t)⟩ = U(t, t0)|ψ(t0)⟩,    (2.24)

where U(t, t0) is the operator which effects the transformation between states. This assumption of linearity is one of the most fundamental properties of quantum mechanics
and, in the end, is really based on experimental observations.
In addition to linearity, we continue to assume the probabilistic interpretation of
kets, which mean they must remain normalized. That is, they must satisfy hψ(t)|ψ(t)i =
1 at all times. Looking at Eq. (2.24), we see that this will only be true when the matrix
U(t, t0 ) is unitary. Hence, we conclude that the time evolution must be described by
a unitary matrix. This is very important and, as already mentioned, is ultimately the
main reason why unitaries live on a pedestal.
Eq. (2.24) doesn’t really look like the Schrödinger equation you know. We can get
to that by assuming we do a tiny evolution, from t to t + ∆t. The operator U must
of course satisfy U(t, t) = 1 since this means we haven’t evolved at all. Thus we can
expand it in a Taylor series in ∆t, which to first order can be written as

U(t + ∆t, t) ≃ 1 − iH(t)∆t,    (2.25)

where H(t) is some operator which, as you of course know, is called the Hamiltonian of your system. The reason why I put an i in front is to make H Hermitian. I also didn't introduce Planck's constant ℏ. In this course ℏ = 1. This simply means that time and energy have the same units.
Inserting Eq. (2.25) in Eq. (2.24), dividing by ∆t and then taking the limit ∆t → 0 we get Schrödinger's equation,

i ∂_t|ψ(t)⟩ = H(t)|ψ(t)⟩.    (2.26)

The same argument applied to U itself gives

i ∂_t U(t, t0) = H(t)U(t, t0),   U(t0, t0) = 1.    (2.27)

The initial condition here simply means that U must act trivially when t = t0, since
then there is no evolution to occur in the first place. Eq. (2.27) is also Schrödinger’s
equation, but written at the level of the time evolution operator itself. In fact, by lin-
earity of matrix multiplications and the fact that U(t0 , t0 ) = 1, each column of U(t, t0 )
can be viewed as the solution of Eq. (2.26) for an initial ket |ψ0 i = |ii. So Eq. (2.27) is
essentially the same as solving Eq. (2.26) d times.
If the Hamiltonian is time-independent, then the solution of Eq. (2.27) is given by
the time-evolution operator
U(t, t0 ) = e−i(t−t0 )H . (2.28)
For time-dependent Hamiltonians one may also write a similar equation, but it will involve the notion of the time-ordering operator. Defining the eigenstuff of the Hamiltonian as

H = Σ_n E_n |n⟩⟨n|,    (2.29)

and using the tricks of Sec. 1.9, we may also write the evolution operator as

U(t, t0) = Σ_n e^{−iE_n(t−t0)} |n⟩⟨n|.    (2.30)
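A minimal sketch of Eqs. (2.28) and (2.30) for a qubit (the Hamiltonian is an arbitrary example):

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5], [0.5, -1.0]])       # hbar = 1, as in the notes

t = 2.0
U = expm(-1j * t * H)                          # Eq. (2.28) with t0 = 0

# Same operator from the spectral form, Eq. (2.30).
En, V = np.linalg.eigh(H)
U_spec = V @ np.diag(np.exp(-1j * En * t)) @ V.conj().T
print(np.allclose(U, U_spec))                  # True

psi0 = np.array([1.0, 0.0])
psi_t = U @ psi0
print(np.isclose(np.vdot(psi_t, psi_t).real, 1.0))   # norm is preserved
```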
The von Neumann equation
We now translate Eq. (2.24) from kets to density matrices. Let us consider again
the preparation idea that led us to Eq. (2.3). We have a machine that prepares states
|ψ_i(t0)⟩ with probability p_i, so that the initial density matrix of the system is

ρ(t0) = Σ_i p_i |ψ_i(t0)⟩⟨ψ_i(t0)|.
These states then start to evolve in time. Each |ψ_i(t0)⟩ will therefore transform to |ψ_i(t)⟩ = U(t, t0)|ψ_i(t0)⟩. Consequently, the evolution of ρ(t) will be given by

ρ(t) = Σ_i p_i U(t, t0)|ψ_i(t0)⟩⟨ψ_i(t0)|U†(t, t0).

But notice how we can factor out the unitaries U(t, t0), since they are the same for each ψ_i. As a consequence, we find

dρ/dt = −i[H(t), ρ],   ρ(t) = U(t, t0) ρ(t0) U†(t, t0).    (2.32)

Thus, the purity is preserved under unitary evolution: tr[ρ(t)²] = tr[ρ(t0)²].
This makes some sense: unitaries are like rotations. And a rotation should not affect
the radius of the state (in the language of Bloch’s sphere). Notwithstanding, this fact
has some deep consequences, as we will soon learn.
2.6 Quantum operations
Von Neumann’s equation is nothing but a consequence of Schrödinger’s equation.
And Schrödinger’s equation was entirely based on two principles: linearity and normal-
ization. Remember: we constructed it by asking what kind of equation preserves normalization and is still linear. And voilà, we had the unitary.
But we only asked for normalization of kets. The unitary is the kind of operator
that preserves hψ(t)|ψ(t)i = 1. However, we now have density matrices and we can
simply ask the question once again: what is the most general kind of operation which
preserves the normalization of density matrices. Remember, a physical density matrix
is any operator which is positive semidefinite and has trace tr(ρ) = 1. One may then
ask, are there other linear operations, besides the unitary, which map physical density
matrices to physical density matrices. The answer, of course, is yes! They are called
quantum operations or quantum channels. And they are beautiful. :)
We will discuss the formal theory of quantum operations later on, when we have
more tools at our disposal. Here I just want to illustrate one result due to Kraus.1 A
quantum operation can always be implemented as

E(ρ) = Σ_k M_k ρ M_k†,   Σ_k M_k† M_k = 1.    (2.33)
Here {M_k} is an arbitrary set of operators, called Kraus operators, which only need to satisfy the normalization condition in Eq. (2.33). The unitary evolution is a particular case in which we have only one operator in the set {M_k}. Then normalization implies U†U = 1. Operations of this form are called Completely Positive Trace Preserving (CPTP) maps. Any map of the form (2.33) with the {M_k} satisfying Σ_k M_k† M_k = 1 is, in principle, a CPTP map.
Amplitude damping
The most famous, and widely used, quantum operation is the amplitude damping
channel. It is defined by the following set of operators:
√ !
λ
!
1 √0 0
M0 = , M1 = , (2.34)
0 1−λ 0 0
with λ ∈ [0, 1]. This is a valid set of Kraus operators since M0† M0 + M1† M1 = 1. Its
action on a general qubit density matrix of the form (2.16) is

E(ρ) = [ λ + p(1−λ)   q√(1−λ) ; q*√(1−λ)   (1−λ)(1−p) ].    (2.35)
¹ cf. K. Kraus, States, Effects and Operations: Fundamental Notions of Quantum Theory, Springer-Verlag, 1983.
If λ = 0 nothing happens, E(ρ) = ρ. Conversely, if λ = 1 then

E(ρ) = [ 1  0 ; 0  0 ],    (2.36)
for any initial density matrix ρ. This is why this is called an amplitude damping: no matter where you start, the map tries to push the system towards |0⟩. It does so by destroying coherences, q → q√(1−λ), and by affecting the populations, p → λ + p(1−λ). The larger the value of λ, the stronger the effect.
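A minimal sketch of this channel (the input state is an arbitrary example in the parametrization (2.16)):

```python
import numpy as np

def amplitude_damping(rho, lam):
    """Apply the amplitude damping channel of Eqs. (2.34)-(2.35)."""
    M0 = np.array([[1, 0], [0, np.sqrt(1 - lam)]], dtype=complex)
    M1 = np.array([[0, np.sqrt(lam)], [0, 0]], dtype=complex)
    assert np.allclose(M0.conj().T @ M0 + M1.conj().T @ M1, np.eye(2))
    return M0 @ rho @ M0.conj().T + M1 @ rho @ M1.conj().T

rho = np.array([[0.3, 0.2 - 0.1j], [0.2 + 0.1j, 0.7]])  # p = 0.3, q = 0.2 - 0.1i
print(amplitude_damping(rho, 0.0))              # lam = 0: state untouched
print(amplitude_damping(rho, 1.0))              # lam = 1: pushed to |0><0|
print(np.trace(amplitude_damping(rho, 0.4)))    # trace stays 1
```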
A generalized measurement is defined by a set of operators {M_k} whose only requirement is the normalization

Σ_k M_k† M_k = 1.    (2.37)
The values of k label the possible outcomes of an experiment, which will occur ran-
domly with each repetition of the experiment. The probability of obtaining outcome k
is

p_k = tr(M_k ρ M_k†).    (2.38)
If the outcome of the measurement is k, then the state after the measurement will be

ρ → M_k ρ M_k†/p_k.    (2.39)
When the M_k are projectors onto an orthonormal basis, M_k = |k⟩⟨k|, these formulas reduce to p_k = ⟨k|ρ|k⟩ and ρ → |k⟩⟨k|, which are the usual projective measurement/full collapse scenario.
If the state happens to be pure, ρ = |ψ⟩⟨ψ|, then Eqs. (2.38) and (2.39) also simplify slightly to

p_k = ⟨ψ|M_k† M_k|ψ⟩,   |ψ⟩ → M_k|ψ⟩/√p_k.    (2.41)
The final state after the measurement continues to be a pure state. Remember that pure
states contain no classical uncertainty. Hence, this means that performing a measure-
ment does not introduce any uncertainty.
Suppose, however, that one performs the measurement but does not check the out-
come. There was a measurement backaction, we just don’t know exactly which back-
action happened. So the best guess we can make to the new state of the system is
just a statistical mixture of the possible outcomes in Eq. (2.39), each weighted with
probability p_k. This gives

ρ' = Σ_k p_k (M_k ρ M_k†/p_k) = Σ_k M_k ρ M_k†,    (2.42)
which is nothing but the quantum operations introduced in Eq. (2.33). This gives a neat
interpretation of quantum operations: they are statistical mixtures of making measure-
ments but not knowing what the measurement outcomes were. This is, however, only
one of multiple interpretations of quantum operations. Eq. (2.33) is just mathematics:
how to write a CPTP map. Eq. (2.42), on the other hand, is physics.
The physics of outcome k = 0 seems a bit strange. But that of k = 1 is quite clear: if
that outcome happened it means the state collapsed to |0i.
This kind of channel is used to model photon emission. If you imagine the qubit is
an atom and |0i is the ground state, then the outcome k = 1 means the atom emitted a
photon. If that happened, then no matter which state it was, it will collapse to |0i. The
outcome k = 0 can then be interpreted as not emitting a photon. What is cool is that
this doesn’t mean nothing happened to the state. Even though it doesn’t emit a photon,
the state was still updated.
POVMs
Let us now go back to the probability of the outcomes, Eq. (2.38). Notice that as
far as the probabilities are concerned, we don’t really need to know the measurement
operators M_k. It suffices to know

E_k = M_k† M_k,    (2.47)

in terms of which the probabilities (2.38) become

p_k = tr(E_k ρ).    (2.48)

The set of operators {E_k} is called a Positive Operator-Valued Measure (POVM).
By construction, the E_k are positive semidefinite operators. Moreover, the normalization (2.37) translates to

Σ_k E_k = 1.    (2.49)
Any set of operators {Ek } which are positive semidefinite and satisfy this normalization
condition represents a valid POVM.
It is interesting to differentiate between generalized measurements and POVMs
because different sets {Mk } can actually produce the same set {Ek }. For instance, in the
amplitude damping example, instead of Eq. (2.34), we can choose Kraus operators
M_0' = M_0,   M_1' = [ 0  0 ; 0  √λ ].    (2.50)
The backaction caused by M_1' will be different from that caused by M_1. However, one may readily check that

M_1† M_1 = (M_1')†(M_1') = [ 0  0 ; 0  λ ].
Thus, even though the backaction is different, the POVM may be the same. This means
that, as far as the probability of outcomes is concerned, both give exactly the same
values.
In many experiments one does not have access to the post-measurement state, but
only to the probability outcomes. In this case, all one needs to talk about are POVMs.
We can also look at this from a more mathematical angle: what is the most general kind
of mapping which takes as input a density matrix ρ and outputs a set of probabilities
{pk }? The answer is a POVM (2.48), with positive semidefinite Ek satisfying (2.49).
Another example, which experimentalists prefer due to its simplicity, is the set of 6 Pauli states, E_k = (1/3)|ψ_k⟩⟨ψ_k| (the factor 1/3 ensures that the six elements satisfy Σ_k E_k = 1), with
|ψ_1⟩ = |z_+⟩ = (1, 0)^T,        |ψ_2⟩ = |z_−⟩ = (0, 1)^T,
|ψ_3⟩ = |x_+⟩ = (1, 1)^T/√2,     |ψ_4⟩ = |x_−⟩ = (1, −1)^T/√2,    (2.52)
|ψ_5⟩ = |y_+⟩ = (1, i)^T/√2,     |ψ_6⟩ = |y_−⟩ = (1, −i)^T/√2.
This POVM is not minimal: we have more elements than we need in principle. But
from an experimental point of view that is actually a good thing, as it means more data
is available.
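A minimal numerical sketch of this POVM; note that the 1/3 weight below is my reading of the normalization forced by Eq. (2.49), since the six rank-1 projectors sum to three times the identity:

```python
import numpy as np

r = 1 / np.sqrt(2)
kets = [np.array(v, dtype=complex) for v in
        ([1, 0], [0, 1],               # |z+>, |z->
         [r, r], [r, -r],              # |x+>, |x->
         [r, 1j * r], [r, -1j * r])]   # |y+>, |y->

E = [np.outer(k, k.conj()) / 3 for k in kets]   # E_k = |psi_k><psi_k| / 3
print(np.allclose(sum(E), np.eye(2)))           # True: sum_k E_k = 1

rho = np.array([[0.7, 0.2], [0.2, 0.3]])        # some qubit state
p = np.real([np.trace(Ek @ rho) for Ek in E])   # probabilities, Eq. (2.48)
print(p, p.sum())                               # non-negative, summing to 1
```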
The von Neumann entropy of a state ρ is defined as

S(ρ) = −tr(ρ log ρ) = −Σ_k λ_k log λ_k,    (2.53)
where λ_k are the eigenvalues of ρ. Working with the logarithm of an operator can be awkward. That is why in the last equality I expressed S(ρ) in terms of the eigenvalues. The logarithm in Eq. (2.53) can be either base 2 or base e, depending on whether the application is more oriented towards information theory or physics (respectively). The last expression in (2.53), in terms of a sum over probabilities, is also called the Shannon entropy.
The entropy is seen to be a sum of functions of the form −p log(p), where p ∈ [0, 1].
The behavior of this function is shown in Fig. 2.3. It tends to zero both when p → 0
and p → 1, and has a maximum at p = 1/e. Hence, any state which has pk = 0 or
pk = 1 will not contribute to the entropy (even though log(0) alone diverges, 0 log(0) is
well behaved). States that are too deterministic therefore contribute little to the entropy.
Entropy likes randomness.
Since each −p log(p) is always non-negative, the same must be true for S (ρ):
S (ρ) ≥ 0. (2.54)
Moreover, if the system is in a pure state, ρ = |ψ⟩⟨ψ|, then it will have one eigenvalue λ_1 = 1 and all others zero. Consequently, in a pure state the entropy will be zero:

S(ρ) = 0   for pure states.    (2.55)
In information theory the quantity − log(pk ) is sometimes called the surprise. When an
“event” is rare (pk ∼ 0) this quantity is big (“surprise!”) and when an event is common
(pk ∼ 1) this quantity is small (“meh”). The entropy is then interpreted as the average
surprise of the system, which I think is a little bit funny.
Figure 2.3: The function −p log(p), corresponding to each term in the von Neumann en-
tropy (2.53).
As we have just seen, the entropy is bounded from below by 0. But if the Hilbert
space dimension d is finite, then the entropy will also be bounded from above. I will
leave this proof for you as an exercise. What you need to do is maximize Eq. (2.53) with
respect to the p_k, but using Lagrange multipliers to impose the constraint Σ_k p_k = 1. Or, if you are not in the mood for Lagrange multipliers, wait until Eq. (2.63), where I will introduce a much easier method to demonstrate the same thing. In any case, the result is

max S = log(d),   which occurs when ρ = I/d.    (2.56)
The entropy therefore varies between 0 for pure states and log(d) for maximally disor-
dered states. Hence, it clearly serves as a measure of how mixed a state is.
Another very important property of the entropy (2.53) is that it is invariant under
unitary transformations:
S (UρU † ) = S (ρ). (2.57)
This is a consequence of the infiltration property of the unitaries U f (A)U † = f (UAU † )
[Eq. (1.58)], together with the cyclic property of the trace. Since the time evolution of a closed system is implemented by a unitary transformation, this means that the entropy is a constant of motion. We have seen that the same is true for the purity: unitary evolutions do not change the mixedness of a state. Or, in the Bloch sphere
picture, unitary evolutions keep the state on the same spherical shell. For open quantum
systems this will no longer be the case.
As a quick example, let us write down the formula for the entropy of a qubit. Recall the discussion in Sec. 2.4: the density matrix of a qubit may always be written as in Eq. (2.19). The eigenvalues of ρ are therefore (1 ± s)/2, where s = √(s_x² + s_y² + s_z²) represents the radius of the state in Bloch's sphere. Hence, applying Eq. (2.53) we get

S = −((1+s)/2) log((1+s)/2) − ((1−s)/2) log((1−s)/2).    (2.58)
For a pure state we have s = 1 which then gives S = 0. On the other hand, for a
maximally disordered state we have s = 0 which gives the maximum value S = log 2,
the log of the dimension of the Hilbert space. The shape of S is shown in Fig. 2.4.
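A minimal Python sketch of Eq. (2.58):

```python
import numpy as np

def qubit_entropy(s):
    """Von Neumann entropy of a qubit with Bloch radius s, Eq. (2.58)."""
    lam = np.array([(1 + s) / 2, (1 - s) / 2])   # eigenvalues of rho
    lam = lam[lam > 0]                           # 0 log 0 = 0
    return -np.sum(lam * np.log(lam))

print(qubit_entropy(1.0))                # 0.0: pure state
print(qubit_entropy(0.0), np.log(2))     # log(2): maximally mixed state
```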
Figure 2.4: The von Neumann entropy for a qubit, Eq. (2.58), as a function of the Bloch-sphere
radius s.
The relative entropy between two density matrices ρ and σ is defined as

S(ρ||σ) = tr(ρ log ρ − ρ log σ).    (2.59)

This quantity is important for a series of reasons. But one in particular is that it satisfies the Klein inequality:

S(ρ||σ) ≥ 0,   with equality iff ρ = σ.    (2.60)
The proof of this inequality is really boring and I’m not gonna do it here. You can find
it in Nielsen and Chuang or even in Wikipedia.
Eq. (2.60) gives us the idea that we could use the relative entropy as a measure of the distance between two density matrices. But that is not entirely precise since the relative entropy does not satisfy the triangle inequality,

d(ρ, σ) ≤ d(ρ, τ) + d(τ, σ).    (2.61)

This is something a true measure of distance must always satisfy. If you are wondering what quantities are actual distances, the trace distance is one of them:²

T(ρ, σ) = ||ρ − σ||_1 := tr√((ρ − σ)†(ρ − σ)).    (2.62)
As an application, define the information contained in a state ρ as its relative entropy to the maximally mixed state π = I_d/d:

I(ρ) = S(ρ||π) = tr(ρ log ρ) − tr(ρ log(I_d/d)).

I know it is a bit weird to manipulate I_d/d here. But remember that the identity matrix satisfies exactly the same properties as the number one, so we can just use the usual algebra of logarithms in this case. We then see that the information contained in a state is nothing but

I(ρ) = log(d) − S(ρ).    (2.63)
This shows how information is connected with entropy. The larger the entropy, the less
information we have about the system. For the maximally mixed state S (ρ) = log(d)
and we get zero information. For a pure state S (ρ) = 0 and we have the maximal
information log(d).
As I mentioned above, the relative entropy is very useful in proving some mathe-
matical relations. For instance consider the result in Eq. (2.56). If we look at Eq. (2.63)
and remember that S(ρ||σ) ≥ 0, this result becomes kind of obvious: S(ρ) ≤ log(d), and S(ρ) = log(d) iff ρ = I/d, which is precisely Eq. (2.56).
² The fact that ρ − σ is Hermitian can be used to simplify this a bit. I just wanted to write it in a more general way.
Chapter 3
Composite Systems
The tensor/Kronecker product
The mathematical structure that implements these ideas is called the tensor prod-
uct or Kronecker product. It is, in essence, a way to glue together two vector spaces
to form a larger space. The tensor product between two states |i⟩_A and |j⟩_B is written as |i⟩_A ⊗ |j⟩_B. The symbol ⊗ separates the two universes. We read this as "i tens j" or "i kron j". I
like the “kron” since it reminds me of a crappy villain from a Transformers or Marvel
movie. Similarly, the operators σ_x^A and σ_x^B are defined as

σ_x^A = σ_x ⊗ I_B,   σ_x^B = I_A ⊗ σ_x.

Products of tensor products follow the composition rule

(A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).    (3.4)

In this rule A, B, C and D can be any mathematical object, as long as the multiplications AC and BD make sense.
Let's see how this works. For instance,

σ_x^A |0, 0⟩ = (σ_x ⊗ I)(|0⟩ ⊗ |0⟩) = (σ_x|0⟩) ⊗ (I|0⟩).
The only thing I did was apply the rule (3.4) to combine stuff to the left of ⊗ with stuff
to the left and stuff to the right with stuff to the right. Now that we have σ x |0i we are
back to the single qubit business, so we can just write σ x |0i = |1i. Then we recombine
the result:
(σ_x|0⟩) ⊗ (I|0⟩) = |1⟩ ⊗ |0⟩ = |1, 0⟩,
which is what we would expect intuitively. As another example, the property (3.1), that operators pertaining to different systems should commute, now follows directly from our definitions:

σ_x^A σ_x^B = (σ_x ⊗ I)(I ⊗ σ_x) = σ_x ⊗ σ_x = (I ⊗ σ_x)(σ_x ⊗ I) = σ_x^B σ_x^A.
Let us also talk about how to combine other kinds of objects. Remember that all
we need is for the multiplications in the composition rule (3.4) to make sense. For
instance, an operation that makes no sense is
(⟨0| ⊗ |0⟩)(σ_x ⊗ σ_x) = crazy nonsense,

because even though ⟨0|σ_x makes sense, the operation |0⟩σ_x does not.
An operation which does make sense is
⟨k, ℓ|i, j⟩ = (⟨k| ⊗ ⟨ℓ|)(|i⟩ ⊗ |j⟩) = (⟨k|i⟩) ⊗ (⟨ℓ|j⟩).

The objects that remain here are two numbers, and the tensor product of two numbers is also a number. Thus, we arrive at a rule for the inner product:

⟨k, ℓ|i, j⟩ = ⟨k|i⟩⟨ℓ|j⟩.    (3.5)
Outer products are similarly defined:

|k, ℓ⟩⟨i, j| = |k⟩⟨i| ⊗ |ℓ⟩⟨j|.    (3.6)
One can also come up with somewhat weird operations which nonetheless make sense.
For instance,

(⟨k| ⊗ |ℓ⟩)(|i⟩ ⊗ ⟨j|) = (⟨k|i⟩) ⊗ |ℓ⟩⟨j| = (⟨k|i⟩)|ℓ⟩⟨j|.

In the last equality I used the fact that ⟨k|i⟩ is just a number.
One may write tensor products of kets as |i⟩_A ⊗ |j⟩_B, as |i, j⟩_AB, or simply as |i⟩_A|j⟩_B. In the third notation adding the suffixes A and B is essential. Otherwise one would not
know if |ii belongs to A or B. For completeness I also added the suffixes to the first two
notations. Sometimes that is redundant. But if there is ever room for confusion, add it:
it doesn’t cost much.
A notation like |iiA | jiB also allows you to move things around and write, for in-
stance, | jiB |iiA . There is no room for confusion because you know one symbol belongs
to A and the other to B. The same is true for operator multiplication. For instance,
σ_x^B |i, j⟩_AB = |i⟩_A σ_x^B |j⟩_B.
Notice that there is zero room for misinterpretation: the notation is not rigid, but no
one will interpret it wrong.
I strongly recommend you be cool about the notation. Each notation is useful for a
different thing, so feel free to change them at will. Just make sure there is no room for
misinterpretation.
Matrix representation of the Kronecker product
When using the Kronecker product in a computer, it is standard to order the basis
elements |i, ji in lexicographic order: for each entry of the first, you loop over all
elements of the last. For instance, if A and B have each dimension 3, we get
|0, 0i, |0, 1i, |0, 2i, |1, 0i, |1, 1i, |1, 2i, |2, 0i, |2, 1i, |2, 2i.
Similarly, if we have 3 qubits, we would order the basis elements as
|0, 0, 0i, |0, 0, 1i, |0, 1, 0i, |0, 1, 1i |1, 0, 0i ...
This ordering is not mandatory. But it is extremely convenient for the following reason.
We then associate to each element a unit vector. For instance, for 2 qubits we would
have

|0, 0⟩ = (1, 0, 0, 0)^T,   |0, 1⟩ = (0, 1, 0, 0)^T,   |1, 0⟩ = (0, 0, 1, 0)^T,   |1, 1⟩ = (0, 0, 0, 1)^T.    (3.8)
The matrix elements of an operator of the form A ⊗ B then become, using the property (3.4),

⟨k, ℓ|A ⊗ B|i, j⟩ = ⟨k|A|i⟩⟨ℓ|B|j⟩ = A_{ki} B_{ℓj}.
If we now arrange these entries in a matrix, looping over all elements of the second index for each element of the first, the matrix form will look like

A ⊗ B = [ A_{0,0}B  ⋯  A_{0,d_A−1}B ; ⋮  ⋱  ⋮ ; A_{d_A−1,0}B  ⋯  A_{d_A−1,d_A−1}B ].    (3.9)
This is just an easy way of visualizing the matrix: for each entry A_{ki} we introduce a full block B.
To be clear what is meant by this, consider for instance

σ_x ⊗ σ_x = [ 0·σ_x  1·σ_x ; 1·σ_x  0·σ_x ] = [ 0 0 0 1 ; 0 0 1 0 ; 0 1 0 0 ; 1 0 0 0 ].    (3.10)
This provides an automated way to construct tensor product matrices. The final result is not very intuitive. But computationally, it is quite trivial, especially since the Kronecker product is implemented in any linear algebra library. In MATLAB it is called kron() whereas in Mathematica it is KroneckerProduct[]. These functions are really useful. You should really try to play with them a bit.
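In Python, the same function lives in NumPy as np.kron. A minimal sketch, which also verifies the composition rule (3.4):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]])
print(np.kron(sx, sx))         # reproduces the 4x4 matrix in Eq. (3.10)

ket0 = np.array([1, 0])
print(np.kron(ket0, ket0))     # |0,0> = (1, 0, 0, 0), Eq. (3.11)

# The composition rule (3.4): (A ⊗ B)(C ⊗ D) = AC ⊗ BD.
rng = np.random.default_rng(4)
A, B, C, D = (rng.standard_normal((2, 2)) for _ in range(4))
print(np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D)))  # True
```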
As a consistency check, we can verify that the same logic also holds for vectors. For instance,

|0, 0⟩ = |0⟩ ⊗ |0⟩ = (1, 0)^T ⊗ (1, 0)^T = (1, 0, 0, 0)^T.    (3.11)
Proceeding similarly leads to all elements in Eq. (3.8).
If we look at the second line, this state seems like simply a linear combination of
the four basis elements |i, jiAB . However, this is not an arbitrary linear combination.
It contains a very special choice of parameters which are such that you can perfectly
factor the state into something related to A times something related to B. Cases like this
are what we call a product state. If A and B are in a product state, they are completely
independent of each other.
However, quantum theory also allows us to have more general linear combinations
which are not necessarily factorable into a product. Such a general linear combination
has the form

|ψ⟩_AB = Σ_{i,j} ψ_{ij} |i, j⟩_AB,    (3.12)

where the ψ_{ij} are any set of complex numbers satisfying Σ_{ij} |ψ_{ij}|² = 1. When a state like
this cannot be written as a product,1 we say A and B are entangled. An important set
of entangled states are the so called Bell states:
|Φ_+⟩ = (|0, 0⟩ + |1, 1⟩)/√2,    (3.13)
|Φ_−⟩ = (|0, 0⟩ − |1, 1⟩)/√2,    (3.14)
|Ψ_+⟩ = (|0, 1⟩ + |1, 0⟩)/√2,    (3.15)
|Ψ_−⟩ = (|0, 1⟩ − |1, 0⟩)/√2.    (3.16)
These states cannot be factored into a product of local states (please take a second
to convince yourself of that!). In fact, we will learn soon that they are maximally
¹ That is, when we cannot decompose ψ_{ij} = f_i g_j.
entangled states. If you are familiar with the theory of angular momentum, you will also notice that these states (especially |Ψ_±⟩) are exactly the singlet and triplet states of two spin-1/2 particles. Moreover, it is useful to note that they form an orthonormal basis for the Hilbert space of the two qubits.
Suppose we now start with two qubits reset to |0iA |0iB . We can prepare the two
qubits in a Bell state by applying two gates. First, we apply a Hadamard gate to A:
H = (1/√2)[ 1  1 ; 1  −1 ].    (3.18)
This produces

H_A|0⟩_A|0⟩_B = |+⟩_A|0⟩_B = ((|0⟩_A + |1⟩_A)/√2)|0⟩_B.
This is a gate acting only on A. It is a local operation and thus cannot entangle A and
B. To entangle them we now apply the CNOT (3.17). It gives

CNOT_AB |+⟩_A|0⟩_B = (|0, 0⟩ + |1, 1⟩)/√2 = |Φ_+⟩,

which is precisely the Bell state (3.13).
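A minimal sketch of this two-gate preparation (assuming the usual CNOT convention with A as control and B as target, in the ordering (3.8)):

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # Hadamard, Eq. (3.18)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]])                  # control = A, target = B

ket00 = np.kron([1, 0], [1, 0])                  # |0>_A |0>_B
bell = CNOT @ np.kron(H, np.eye(2)) @ ket00
print(bell)   # (1, 0, 0, 1)/sqrt(2): the Bell state |Phi+> of Eq. (3.13)
```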
Density matrices from entanglement
Now I want you to recall our original discussion in Sec. 2.1. We saw that the
concept of density matrix naturally appeared when we considered a crappy machine
that produced quantum states with some classical uncertainty. What we found was that
it was possible to combine quantum and classical effects by introducing an object of the form

ρ = Σ_i p_i |ψ_i⟩⟨ψ_i|,    (3.20)
where the |ψi i are arbitrary states and the pi are arbitrary probabilities. This construc-
tion may have left you with the impression that the density matrix is only necessary
when we want to mix quantum and classical stuff. That density matrices are not really
a quantum thing. Now I want to show you that this is not the case. It is definitely not
the case. I will show you that there is an intimate relation between mixed states and
entanglement. And this relation is one of the key steps relating quantum mechanics and information theory.
Essentially, the connection is made by the notion of reduced state or reduced
density matrix. When a composite system is in a product state |ψiA ⊗ |φiB , it makes
sense to say the state of A is simply |ψiA . But if A and B are entangled, then what is
exactly the “state” of A? To warm up, consider first a bipartite state of AB of the form
|ψ⟩_AB = Σ_i c_i |i⟩ ⊗ |i⟩,    (3.21)

for certain coefficients c_i satisfying Σ_i |c_i|² = 1. If c_i = 1 for some i and all other c_j = 0, then |ψ⟩ = |i⟩ ⊗ |i⟩ and we get a product state. In any other case, the state will be entangled.
Now let OA be an operator which acts only on system A. That is, an operator which
has the form O_A = O_A ⊗ I_B. The expectation value of O_A in the state (3.21) will be

⟨ψ|O_A|ψ⟩ = Σ_{i,j} c_i* c_j ⟨i|O_A|j⟩⟨i|j⟩ = Σ_i |c_i|² ⟨i|O_A|i⟩.    (3.22)

The sandwich that remains is now performed only over the reduced state of A. However, each sandwich ⟨i|O_A|i⟩ is now weighted by a factor |c_i|².
We now ask the following question: can we attribute a state |ψ_A⟩ to system A such that the above result can be expressed as ⟨ψ_A|O_A|ψ_A⟩? This is actually the same question we asked in Sec. 2.1. And we saw that the answer is no. In general, there is no pure
state we can associate with A. Instead, if we wish to associate a quantum state to A, it
will have to be a mixed state, described by a density matrix of the form
ρ_A = Σ_i |c_i|² |i⟩⟨i|.    (3.23)
This result has extremely important consequences. Eq. (3.23) has exactly the same form
as Eq. (3.20), with the classical probabilities pi replaced by quantum coefficients |ci |2 .
But there is absolutely nothing classical here. Nothing. We started with a pure state.
We are talking about a purely quantum effect. Notwithstanding, we see that in general
the state of A will be mixed. If ci = 1 for some i and all other c j = 0 then Eq. (3.23)
reduces to ρA = |iihi|, which is a pure state. In all other cases, the state of A will be
mixed. Thus,
3.3 Reduced density matrices and the partial trace

Any operator acting on the composite Hilbert space of AB can be decomposed as
O = \sum_\alpha A_\alpha \otimes B_\alpha,    (3.26)
for some index α and some set of operators A_α and B_α. So, to start, let us consider
simply an operator of the form O = A ⊗ B. Then, by linearity, it will be easy to extend
to more general operators.
We begin by computing the trace of O = A ⊗ B in the |a, b⟩ basis:
tr(O) = \sum_{a,b} \langle a,b|O|a,b\rangle
      = \sum_{a,b} (\langle a| \otimes \langle b|)(A \otimes B)(|a\rangle \otimes |b\rangle)
      = \sum_{a,b} \langle a|A|a\rangle \otimes \langle b|B|b\rangle
      = \Big( \sum_a \langle a|A|a\rangle \Big) \Big( \sum_b \langle b|B|b\rangle \Big).
I got rid of the ⊗ in the last line because the kron of two numbers is just a number. The two
terms in this formula are simply the traces of the operators A and B in their respective
Hilbert spaces. Whence, we conclude that
tr(A \otimes B) = tr(A)\, tr(B).
We started with an operator having support on two Hilbert spaces and ended up tracing
everything, so that we are left with only a single number.
We can now imagine an operation where we trace over only one of the Hilbert
spaces and obtain an operator still having support on the other part. This is what we
call the partial trace. It is defined, for product operators, as
tr_A (A \otimes B) = tr(A)\, B,    (3.27)
tr_B (A \otimes B) = A\, tr(B).    (3.28)
When you “trace over A”, you eliminate the variables pertaining to A and what you are
left with is an operator acting only on B. This is something we often forget, so please
pay attention: the result of a partial trace is still an operator. More generally, for an
arbitrary operator O as defined in Eq. (3.26), we have
tr_A O = \sum_\alpha tr(A_\alpha)\, B_\alpha,    tr_B O = \sum_\alpha A_\alpha\, tr(B_\alpha).    (3.29)
An important example is the partial trace of an operator of the form |a,b⟩⟨a′,b′|. To
take the partial trace, remember that this can be written as
|a,b\rangle\langle a',b'| = |a\rangle\langle a'| \otimes |b\rangle\langle b'|.
The partial trace over B, for instance, will simply go right through the first part and act
only on the second part; i.e.,
tr_B\, |a,b\rangle\langle a',b'| = |a\rangle\langle a'|\, tr\big( |b\rangle\langle b'| \big) = |a\rangle\langle a'|\, \delta_{b,b'}.
Thus, we conclude that
tr_A\, |a,b\rangle\langle a',b'| = \delta_{a,a'}\, |b\rangle\langle b'|,    tr_B\, |a,b\rangle\langle a',b'| = |a\rangle\langle a'|\, \delta_{b,b'}.    (3.30)
The partial trace also allows us to define the reduced density matrices of A and B from
a global state ρ_AB:
\rho_A = tr_B\, \rho_{AB},    \rho_B = tr_A\, \rho_{AB}.    (3.31)
As an example, consider the two-qubit state
|\psi\rangle = \sqrt{p}\, |0,1\rangle + \sqrt{1-p}\, |1,0\rangle,
for some number p ∈ [0, 1]. If p = 1/2 we recover the Bell state (3.13). To take the
partial trace we proceed as before:
\rho_{AB} = |\psi\rangle\langle\psi| = p\, |0,1\rangle\langle 0,1| + (1-p)\, |1,0\rangle\langle 1,0| + \sqrt{p(1-p)}\, \big( |0,1\rangle\langle 1,0| + |1,0\rangle\langle 0,1| \big).
Due to Eq. (3.30), the last two terms will always give 0 when we take the partial trace.
We are then left with
\rho_A = p\, |0\rangle\langle 0| + (1-p)\, |1\rangle\langle 1|,
\rho_B = (1-p)\, |0\rangle\langle 0| + p\, |1\rangle\langle 1|.
Example: X states
X states of two qubits are density matrices of the form
\rho = \begin{pmatrix} p_1 & 0 & 0 & \beta \\ 0 & p_2 & \alpha & 0 \\ 0 & \alpha^* & p_3 & 0 \\ \beta^* & 0 & 0 & p_4 \end{pmatrix}.    (3.35)
The meaning of α and β now becomes a bit clearer: they represent, respectively,
the non-local coherences between {|0,1⟩, |1,0⟩} and {|0,0⟩, |1,1⟩}. From Eq. (3.36) it is
easy to take the partial trace:
\rho_A = tr_B\, \rho = \begin{pmatrix} p_1 + p_2 & 0 \\ 0 & p_3 + p_4 \end{pmatrix},    (3.37)
\rho_B = tr_A\, \rho = \begin{pmatrix} p_1 + p_3 & 0 \\ 0 & p_2 + p_4 \end{pmatrix}.    (3.38)
We see that for X states, the reduced density matrices are diagonal. The entries which
are set to zero in Eq. (3.35) are precisely the ones that would lead to non-zero off-diagonal
entries in the reduced states. If we now look at observables, we find, for instance, that
local coherences vanish, ⟨σ_x^A⟩ = tr(σ_x ρ_A) = 0, since ρ_A is diagonal.
Non-local observables, on the other hand, can be non-zero. For instance, one may
check that
\langle \sigma_x^A \sigma_x^B \rangle = \alpha + \alpha^* + \beta + \beta^*.
Note, however, that the partial trace discards information. To put that more precisely,
given a general ρ_AB and its reduced density matrices (3.31), we have
\rho_A \otimes \rho_B \neq \rho_{AB},    (3.39)
with equality only when ρ_AB was already uncorrelated to begin with. Thus, in general, we
see that information is lost whenever A and B are correlated.
A state of AB is called separable when it can be written as
\rho_{AB} = \sum_i p_i\, \rho_A^i \otimes \rho_B^i,    (3.40)
for a set of probabilities p_i ∈ [0, 1], \sum_i p_i = 1, and arbitrary sets of density matrices
ρ_A^i and ρ_B^i. Of course, in light of Eq. (3.26), any density matrix of AB can be decomposed as
a sum of products. But usually each term in the sum is not a valid density matrix with
a valid probability. The reason why a state of the form (3.40) is physically interesting
is because it represents a classical statistical mixture of states of A and B.
This is just like the crappy machine of Sec. 2.1. With some probability p_1 the
machine prepares a state ρ_A^1 for A and ρ_B^1 for B; with probability p_2 it prepares
ρ_A^2 and ρ_B^2; and so on. The two systems will in general be
correlated: if we learn something about A, we can usually infer something about B. But
this is only because they share a common ignorance about the machine that produced
them. The states of A and B may very well be quantum. But their correlations are
purely classical. We say separable states of the form (3.40) are not entangled.
A particularly simple example of a separable state is
\rho_{AB} = \sum_{i,j} p_{i,j}\, |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B,    (3.41)
which will be a valid quantum state provided p_{i,j} ∈ [0, 1] and \sum_{i,j} p_{i,j} = 1. This state is
nothing but a classical probability distribution encoded in a density matrix. To compute
the partial trace over B we use Eq. (3.30), which gives
\rho_A = tr_B\, \rho_{AB} = \sum_{i,j} p_{i,j}\, |i\rangle\langle i|_A = \sum_i p_i^A\, |i\rangle\langle i|_A.
In the last equality I carried out the sum over j and defined
p_i^A = \sum_j p_{i,j}.    (3.42)
3.4 Measurements in bipartite systems
The measurement postulates discussed in Sec. 2.7 remain exactly the same for composite
systems. In fact, what is neat about those postulates is precisely their flexibility:
any set of operators {M_k} satisfying \sum_k M_k^\dagger M_k = 1 constitutes a possible set of
generalized measurement operators. The interesting new thing that appears when talking
about composite systems is that now we can specify measurement operators that act
only on one of the subsystems and then understand what is the measurement back-
action on the other. Or we can also define non-local measurements that act on both
systems simultaneously.
Suppose, for instance, that Alice and Bob share the Bell state |Φ+⟩ and that Alice
measures her qubit in the computational basis, without us keeping track of the outcome.
The post-measurement state is then
\rho' = \frac{|00\rangle\langle 00| + |11\rangle\langle 11|}{2}.    (3.45)
This state is now a classical probabilistic mixture of the two possibilities. It is therefore
fundamentally different from the original entangled state |Φ+ i.
No-signaling theorem
According to the previous example, if Alice performs a measurement on her qubit,
it will instantaneously affect the state of Bob, irrespective of how far they are from each
other. This is the spooky part of quantum theory. It suggests that this kind of process
could be used to signal faster than the speed of light, thus violating causality. Although
spooky, this result is correct and can be verified in the laboratory, as we will discuss in
the next section. However, it cannot be used to transmit information between distant
observers. That is the idea of the no-signaling (or no-communication) principle.
The reason why this is so is that, as far as Bob is concerned, he cannot tell that
Alice made the measurement. Before the measurement was made, Bob's state was the
maximally mixed state
\rho_B = tr_A\, |\Phi^+\rangle\langle\Phi^+| = \frac{I_B}{2}.
When Alice performs her measurement, the global state will collapse to either |00⟩ or |11⟩,
each with probability 1/2. But there exists no measurement scheme which Bob can use
to figure out which outcome she got. From a probabilistic point of view, the best guess
Bob can make about his state is
\rho_B' = \frac{1}{2}\, |0\rangle\langle 0| + \frac{1}{2}\, |1\rangle\langle 1| = \frac{I_B}{2}.
The only way Bob can tell that a measurement was actually done would be to classically
communicate with Alice (e.g. via WhatsApp). If Alice tells him that she obtained
0 in her measurement, then Bob would be able to update his state. Otherwise, he has
no way to tell. Thus, local measurements cannot be used to signal information.
We can make this proof more formal, as follows. Consider a local measurement
on Alice’s side, described by a set of Kraus operators {MkA = Mk ⊗ IB } acting only on
Alice’s Hilbert space. Alice and Bob are prepared initially in an arbitrary state ρ, which
is in general not a product state. The state of Bob after a measurement is performed on
Alice will then be
\rho_B' = tr_A \sum_k M_k^A\, \rho\, (M_k^A)^\dagger
        = tr_A \Big( \sum_k (M_k^A)^\dagger M_k^A\, \rho \Big)
        = tr_A(\rho)
        = \rho_B.
I was allowed to use the cyclic property of the trace here because the operators M_k^A act
only on the Hilbert space over which we are tracing. We therefore see from this general
argument that, as far as Bob can tell, he has no way of knowing that a local operation was
performed on Alice's side.
The no-signaling principle is an extremely useful tool when thinking about prob-
lems in quantum information. I use it all the time as a kind of sanity check: whenever
I’m confronted with a question regarding non-local properties of quantum theories, I
always ask myself: "would this violate no-signaling?" All one needs to remember is
that performing local operations on Alice has no effect on any local properties of Bob.
No-cloning theorem
Related to no-signaling is the notion of no-cloning: it is impossible to find a unitary
U which clones an arbitrary state |ψ⟩,
U\big( |\psi\rangle \otimes |0\rangle \big) = |\psi\rangle \otimes |\psi\rangle,
for all |ψ⟩. Of course, you can cook up a unitary that does this for a single |ψ⟩. But then
it won't work on another |φ⟩. In fact, the theorem can be proven by contradiction.
Suppose we were able to find a unitary U such that U(|ψ⟩ ⊗ |0⟩) = |ψ⟩ ⊗ |ψ⟩ and
U(|φ⟩ ⊗ |0⟩) = |φ⟩ ⊗ |φ⟩. Taking the inner product between these two relations and
using U†U = 1 gives
\langle\psi|\phi\rangle = \langle\psi|\phi\rangle^2.
But x = x² only has solutions x = 0 or x = 1. Thus, either |ψ⟩ = |φ⟩ or |ψ⟩ is orthogonal
to |φ⟩. Hence, a general cloning device is not possible. We can only clone orthogonal
states.
The connection between no-signaling and no-cloning can be made as follows. Sup-
pose Alice and Bob share a Bell state |Φ+ i. Bob then makes many copies of his qubit
and measures each one. If in all copies he always finds the same state, he would know
that Alice had performed a measurement. If he finds 0 and 1 with equal probabili-
ties, he would know she didn’t perform a measurement. Thus, we see that if cloning
were possible, Alice could use this idea to transmit a message. No-signaling therefore
implies no-cloning.
Consider, for instance, a measurement in the Bell basis, described by the projectors Q_i
onto the four Bell states. Non-local measurements of this form may, at first, seem quite
strange. However, they are actually quite simple to perform. All we need is to use
Eq. (3.19) to construct Bell states from the computational basis.
The idea goes as follows. Suppose we have some composite state ρAB we wish to
measure. We then first apply a sequence of two unitaries to the state ρAB : a CNOT
gate UCNOT [Eq. (3.17)] followed by a Hadamard gate HA [Eq. (3.18)] acting only on A.
Afterwards, we measure A and B in the usual computational basis defined by projectors
P0 = |00ih00|, P1 = |01ih01|, P2 = |10ih10| and P3 = |11ih11|. The probability of the
different outcomes will be
p_i = tr\Big\{ P_i\, (H_A \otimes I_B)\, U_{\rm CNOT}\, \rho_{AB}\, U_{\rm CNOT}^\dagger\, (H_A \otimes I_B) \Big\}.    (3.47)
But because of the relations in Eq. (3.19), it follows that the Bell projectors Q_i are
related to the computational basis projectors P_i according to
Q_i = U_{\rm CNOT}^\dagger\, (H_A \otimes I_B)\, P_i\, (H_A \otimes I_B)\, U_{\rm CNOT}.
Hence the probabilities (3.47) can equally be written as p_i = tr{Q_i ρ_AB},
which are nothing but the probabilities of finding the qubits in the four Bell states.
Thus, to perform measurements in the Bell basis, we simply need to first entangle the
two qubits and then measure them in the computational basis.
Consider, for comparison, the two states
|\psi\rangle = \sqrt{p}\, |00\rangle + \sqrt{1-p}\, |11\rangle,    (3.50)
\rho = p\, |00\rangle\langle 00| + (1-p)\, |11\rangle\langle 11|.    (3.51)
Both states are correlated, but they are fundamentally different. The first is a pure state.
There is no uncertainty associated to the global state. And it is also entangled. The
second, on the other hand, is just a statistical mixture of two possibilities. It has no
entanglement nor any other type of quantum correlations. That being said, however, in
both cases the probability that Alice finds her qubit in 0 is p. And, in both cases, if
she does find it in zero, then the state of Bob will be updated to |0⟩. So in this sense, it
seems these two states behave quite similarly.
The state (3.51) represents our degree of ignorance about the configurations of
Alice and Bob. We don’t know in which configuration they are, |00i or |11i. If Alice
measures and happens to find out, then we update our information. The state (3.50), on
the other hand, contains no ignorance at all. We know exactly which state the two qubits
are. According to quantum theory, the randomness associated with the state (3.50) has
nothing to do with ignorance. It is intrinsic.
This gets us into the idea of realism or ontology. According to philosophy, an
object is real when one can associate properties to it independent of observation. A rose
is red and that is independent of whether you look or not. As Einstein said, “the moon
is still there even if we don’t look at it”. In the mixture (3.51) realism is preserved.
The qubits have the property of being either 0 or 1, it is just us, silly physicists, who
don’t know in which state they are. A state of the form (3.50), however, is not a realist
state. Bob’s qubit is not in 0 or 1. It will only collapse to 0 or 1 if we happen to make
a measurement.
Realism is intimately related to locality. If properties can only be established when
a system is measured, then measurements on entangled particles would cause an in-
stantaneous backaction on other particles arbitrarily far away. A non-realist theory, like
quantum mechanics, must therefore also be intrinsically non-local. This led Einstein,
Podolsky and Rosen, in 1935,2 to propose that there should exist past factors (which, to
add some drama, they called hidden variables) which determined these properties long
ago, perhaps when the particles interacted. That is, there should be additional variables,
perhaps very difficult to access in the lab, but which, if we knew them, would tell us
with certainty the configurations of the two qubits. These past factors/hidden
variables would therefore restore the status of a realist theory to quantum mechanics.
This is the idea behind the EPR paradox.
3.5 Bell's inequality

Suppose Alice and Bob each measure their qubit along a direction of their own choosing,
labeled x for Alice and y for Bob, with outcomes a, b = ±1. Quantum mechanics predicts
outcome probabilities of the form
P(a, b|x, y) = tr\big\{ (P_a^x \otimes P_b^y)\, \rho_{AB} \big\},
where ρ_AB is the global state of the two qubits. However, the idea is precisely not
to assume that this is how these probabilities were computed. Instead, we take a Black
2 A. Einstein, B. Podolsky and N. Rosen, "Can Quantum-Mechanical Description of Physical Reality be Considered Complete?", Phys. Rev., 47, 777 (1935).
box (device independent) approach: we simply assume that Alice and Bob can make
measurements and build from experiment a set of probabilities P(a, b|x, y).
Once we have these probabilities, we can analyze them using the standard tools of
statistics. For instance, if we sum over all possible outcomes of Bob, we obtain
the marginal distribution of Alice,
P(a|x, y) = \sum_b P(a, b|x, y).    (3.55)
No-signaling requires this marginal to be independent of Bob's measurement choice, P(a|x, y) = P(a|x).
That is, the outcomes of Alice cannot be influenced by the choice of measurement y that
Bob decided to make. That would be super weird (Quantum Healing is not possible!).
In general, one will find that the probabilities P(a, b|x, y) are not independent. That
is,
P(a, b|x, y) \neq P(a|x)\, P(b|y).
The outcomes of Alice and Bob are correlated. But now comes Bell's reasoning.
If these probabilities were produced by a realist theory, then these correlations
must stem from a common source of ignorance; that is, from some past factor that
established a relation between the two qubits long ago, when they interacted. Let us call
this past factor/hidden variable by the generic name λ. What λ is, exactly, is not im-
portant. It simply summarizes a set of possible common factors that may have affected
the two qubits in the past. If we knew λ, then the outcomes of Alice and Bob would be
independent:
P(a, b|x, y, λ) = P(a|x, λ)P(b|y, λ). (3.57)
The only reason why A and B are correlated is because they share a common ignorance
about λ:
P(a, b|x, y) = \int d\lambda\, P(a|x, \lambda)\, P(b|y, \lambda)\, q(\lambda),    (3.58)
where q(λ) is the probability distribution of λ. This is the most general formulation of
a realist theory. Or also, as we mentioned above, a local theory since breaking realism
must also break locality.
CHSH inequality
The game is now to prove, experimentally, that quantum mechanics can actually
violate Eq. (3.58). That is, that Nature can produce probability distributions which
cannot be written in the form (3.58), for any probability distribution q(λ).
The easiest way to do that is using an inequality proved by Clauser, Horne, Shimony
and Holt in 1969.3 Let us construct expectation values of the possible outcomes. For
instance,
\langle A_x \rangle = \sum_{a,b} a\, P(a, b|x, y).
3 J. F. Clauser, M. A. Horne, A. Shimony and R. A. Holt, Phys. Rev. Lett., 23 880 (1969).
This is nothing but the average spin component of Alice’s qubit, in the direction x. We
can also compute correlation functions, such as
\langle A_x B_y \rangle = \sum_{a,b} a\, b\, P(a, b|x, y).
If we assume that the locality condition (3.58) is true, then we may write
\langle A_x B_y \rangle = \sum_{a,b} \int d\lambda\, a\, b\, P(a|x,\lambda)\, P(b|y,\lambda)\, q(\lambda) = \int d\lambda\, \langle A_x \rangle_\lambda \langle B_y \rangle_\lambda\, q(\lambda),
where I defined
\langle A_x \rangle_\lambda = \sum_a a\, P(a|x, \lambda),
and similarly for ⟨B_y⟩_λ. Now consider the combination of four such correlators,
\langle A_x B_y \rangle + \langle A_x B_{y'} \rangle + \langle A_{x'} B_y \rangle - \langle A_{x'} B_{y'} \rangle = \int d\lambda\, q(\lambda) \Big\{ \langle A_x \rangle_\lambda \big( \langle B_y \rangle_\lambda + \langle B_{y'} \rangle_\lambda \big) + \langle A_{x'} \rangle_\lambda \big( \langle B_y \rangle_\lambda - \langle B_{y'} \rangle_\lambda \big) \Big\}.
All the expectation values appearing here are bounded to the interval [−1, 1]. Hence,
the quantity inside {} must lie in the interval [−2, 2]. The integral does not change this,
because q(λ) is a probability distribution and convex sums cannot give you something
larger than the largest element. Hence, we reach the remarkable conclusion that any
local (realist) theory must satisfy
\big| \langle A_x B_y \rangle + \langle A_x B_{y'} \rangle + \langle A_{x'} B_y \rangle - \langle A_{x'} B_{y'} \rangle \big| \leq 2.
This is the CHSH inequality.
Violating the CHSH
Suppose the two qubits are prepared in the Bell state |ψ⟩ = (|01⟩ − |10⟩)/√2. And
suppose, for simplicity, that Alice and Bob measure in directions (3.52) with angles θ_x
and θ_y and φ_x = φ_y = 0. Applying Eq. (3.54), one may write, after a bit of simplification,
P(a, b|x, y) = \frac{1}{4} \Big[ 1 - a\, b\, \cos(\theta_x - \theta_y) \Big].    (3.61)
The resulting correlators are ⟨A_x B_y⟩ = −cos(θ_x − θ_y). Next choose the directions to be,
for instance, θ_x = 0, θ_{x′} = π/2 for Alice and θ_y = π/4, θ_{y′} = −π/4 for Bob. Each
correlator then equals −1/√2, except ⟨A_{x′} B_{y′}⟩ = +1/√2, so the CHSH combination
reaches −2√2, violating the classical bound of 2. This maximal quantum value, 2√2, is
known as Tsirelson's bound, and violations of this kind have been confirmed experimentally,
including in loophole-free experiments.4
4 B. Hensen, et al., "Loophole-free Bell inequality violation using electron spins separated by 1.3 kilometres", Nature, 526, 682 (2015).
Chapter 4
Quantifying correlations
between quantum systems
Consider again a bipartite system prepared in an arbitrary pure state
|\psi\rangle = \sum_{a,b} \psi_{ab}\, |a, b\rangle,    (4.1)
where ψ_{ab} are coefficients. This state will in general be entangled. To see that at first
hand, let us look at the reduced density matrices of A and B. I will leave it for you as an
exercise to show that
\rho_A = tr_B\, |\psi\rangle\langle\psi| = \sum_{a,a'} \sum_b \psi_{ab}\, \psi_{a'b}^*\, |a\rangle\langle a'|,    (4.2)
\rho_B = tr_A\, |\psi\rangle\langle\psi| = \sum_{b,b'} \sum_a \psi_{ab}\, \psi_{ab'}^*\, |b\rangle\langle b'|.    (4.3)
Of course, these are kind of ugly because ρ_A and ρ_B are not diagonal. But what I want
to stress is that in general these states will be mixed. The only case in which they
will be pure is when the ψ_{ab} factor as a product of coefficients, ψ_{ab} = f_a g_b. Then
one can already see from (4.1) that |ψ⟩ will also factor as a product. Our goal will be to
quantify the degree of entanglement between A and B.
Figure 4.1: The size of the matrices appearing in Eq. (4.4). Left: A is short and fat (M ≤ N).
Right: A is thin and tall (M ≥ N).
The singular value decomposition (SVD) theorem states that an arbitrary M × N matrix A
can always be decomposed as
A = U S V^\dagger,    (4.4)
where
• U is M × min(M, N) and has orthonormal columns, U†U = 1. If M ≤ N then U
will be square and unitary, UU† = 1.
• V is N × min(M, N) and has orthonormal columns, V†V = 1. If M ≥ N then V will
be square and unitary, VV† = 1.
• S is min(M, N) × min(M, N) and diagonal, with entries S αα = σα ≥ 0, which are
called the singular values of the matrix A. It is convention to always order the
singular values in decreasing order, σ1 ≥ σ2 ≥ . . . ≥ σr > 0. The number of
non-zero singular values, called r, is known as the Schmidt rank of the matrix.
When the matrix is square, M = N, then both U and V become unitary. The sizes of A,
U, S and V are shown in Fig. 4.1. For future reference, I will also write down Eq. (4.4)
in terms of the components of A:
A_{ij} = \sum_{\alpha=1}^{r} U_{i\alpha}\, \sigma_\alpha\, V_{j\alpha}^*,    (4.5)
where the sum extends only up to the Schmidt rank r (beyond that the σ_α are zero, so we
don't need to include them).
The SVD is not in general related to the eigenvalues of A. In fact, it is defined even
for rectangular matrices. Instead, the SVD is actually related to the eigenvalues of A† A
and AA† . Starting from Eq. (4.4) and using the fact that U † U = 1 we see that
A^\dagger A = V S^2 V^\dagger.    (4.6)
By construction, the matrix A† A is Hermitian and positive semi-definite. Hence, we
see that V forms its eigenvectors and the σ_α² its eigenvalues. Similarly, using the fact that
V†V = 1, we get
A A^\dagger = U S^2 U^\dagger.    (4.7)
Thus, the σ_α² are also the eigenvalues of AA†. It is interesting to note that, when A is
rectangular, A†A and AA† will have different dimensions. The point is that the larger
of the two will have the same eigenvalues as the smaller one, plus a bunch of zero
eigenvalues. The only type of matrix for which the singular values are identically
equal to the eigenvalues are positive semi-definite matrices, like density matrices ρ.
One of the most important applications of the SVD is in making low-rank approximations
of matrices. To see the idea, suppose A is N × N. Then it has N² entries
which, if N is large, is a lot of entries. But now let u and v be vectors of
size N and consider the outer product u v†, which is also an N × N matrix, with entries
(u v†)_{ij} = u_i v_j^*. We see that even though this is N × N, the entries of this matrix are not
independent, but are completely specified by the 2N numbers u_i and v_i. A matrix of
this form is called a rank-1 matrix (just like the rank-1 projectors we studied before).
Going back now to Eq. (4.5), let uα denote a column vector with entries Uiα and,
similarly, let vα denote a column vector with entries V jα . Then it is easy to verify that
the matrix A in Eq. (4.5) can be written as
A = \sum_{\alpha=1}^{r} \sigma_\alpha\, u_\alpha v_\alpha^\dagger.    (4.8)
We have therefore decomposed the matrix A into a sum of rank-1 matrices, weighted by
the singular values σα . Since the singular values are always non-negative and appear
in decreasing order, we can now think about retaining only the largest singular values.
That is, instead of summing over the full Schmidt rank r, we sum only up to a smaller
number of singular values r0 < r to get an approximate representation of A:
A' = \sum_{\alpha=1}^{r'} \sigma_\alpha\, u_\alpha v_\alpha^\dagger.    (4.9)
This is called a rank-r′ approximation of the matrix A. If we keep just the largest
singular value (a rank-1 approximation), we replace N² elements by 2N, which
can be an enormous improvement if N is large. It turns out that this approximation is
controllable, in the sense that A′ is the best rank-r′ approximation of A with respect to
the Frobenius norm, defined as \|A\| = \sqrt{\sum_{ij} |A_{ij}|^2}. That is, A′ is the rank-r′ matrix which
minimizes \|A − A'\|.
Schmidt decomposition
I have introduced above the SVD as a general matrix decomposition, which is use-
ful to know since it appears often in many fields of research. Now I want to apply
the SVD to extract properties of quantum states. Consider again a bipartite system
described by the pure state
|\psi\rangle = \sum_{a,b} \psi_{ab}\, |a, b\rangle.    (4.10)
With a moment of thought, we see that ψ_{ab} can also be interpreted as a matrix of coefficients.
In fact, this matrix will in general be rectangular, with dimensions d_A × d_B. Interpreting
the entries of the vector (4.10) as a matrix is the key idea behind the Schmidt
decomposition.
I know this is confusing at first. But this is just a reordering of elements. For
instance, the state
|\psi\rangle = \sqrt{p}\, |01\rangle + \sqrt{1-p}\, |10\rangle
can also be represented as a matrix with coefficients ψ_{01} = √p and ψ_{10} = √(1−p). The
same idea also applies if A and B have different dimensions. For instance, suppose A
is a qubit but B is a qutrit (3 levels), and they are prepared in the state
|ψ⟩ = α|0,0⟩ + β|0,1⟩ + γ|1,1⟩, corresponding to the 2 × 3 matrix of coefficients
\psi = \begin{pmatrix} \alpha & \beta & 0 \\ 0 & \gamma & 0 \end{pmatrix}.
The matrix ψ_{ab} can thus be decomposed using the SVD (4.5),
\psi_{ab} = \sum_\alpha U_{a\alpha}\, \sigma_\alpha\, V_{b\alpha}^*.    (4.11)
The matrix ψ_{ab} is special in that the state |ψ⟩ must be normalized. This means that
\sum_{ab} |\psi_{ab}|^2 = 1, which in turn implies that
\sum_{\alpha=1}^{r} \sigma_\alpha^2 = 1.    (4.12)
In general the singular values are simply non-negative. But for states ψ_{ab} they are also
normalized in this way.
Inserting Eq. (4.11) back into (4.10) now gives
|\psi\rangle = \sum_{a,b,\alpha} \sigma_\alpha\, U_{a\alpha} V_{b\alpha}^*\, |a, b\rangle = \sum_\alpha \sigma_\alpha \Big( \sum_a U_{a\alpha}\, |a\rangle \Big) \otimes \Big( \sum_b V_{b\alpha}^*\, |b\rangle \Big).    (4.13)
This motivates us to define the sets of states
|\alpha_A\rangle = \sum_a U_{a\alpha}\, |a\rangle,    (4.14)
|\alpha_B\rangle = \sum_b V_{b\alpha}^*\, |b\rangle.    (4.15)
Note how these states are labeled by the same index α, even though they may be
completely different (recall that we can even have d_A ≠ d_B). Notwithstanding, we notice
that these states are orthonormal, because of the properties of the SVD matrices U and
V.
Thus, we can now write our entangled state |ψ⟩ as
|\psi\rangle = \sum_\alpha \sigma_\alpha\, |\alpha_A\rangle \otimes |\alpha_B\rangle.    (4.16)
This is way better than (4.10) because now we only have a single sum. It is a bit
like we diagonalized something (but what we did was find the singular values of ψab ).
Note also that this is exactly the type of state that we used in Eq. (3.21) when we first
introduced the connection between mixed states and entanglement. The step in going
from a general entangled state (4.10) to a state of the form (4.16) is called the Schmidt
decomposition of the state. The square of the singular values, λα := σ2α , are also
called Schmidt coefficients. As we will see, all the information about entanglement is
contained in these guys.
We have seen that a general state such as (4.10) will be a product state when ψab =
fa gb is a product of coefficients. But that can in practice be a hard thing to check. If
we look at the Schmidt form (4.16), however, it is now trivial to know when the state
will be a product or not: it will only be a product if σ1 = 1 and all other σα = 0.
That is, they will be in a product state when the Schmidt rank is r = 1. We can even
go further and use the singular values/Schmidt coefficients to quantify the degree
of entanglement. To do that, we compute the reduced density matrices of A and B,
starting from the state (4.16). Since the states |αA i and |αB i are orthonormal, it is
straightforward to find that
\rho_A = \sum_\alpha \sigma_\alpha^2\, |\alpha_A\rangle\langle\alpha_A|,    (4.17)
\rho_B = \sum_\alpha \sigma_\alpha^2\, |\alpha_B\rangle\langle\alpha_B|.    (4.18)
Once we have these reduced density matrices, we can compute their purity:
tr(\rho_A^2) = tr(\rho_B^2) = \sum_\alpha \sigma_\alpha^4 = \sum_\alpha \lambda_\alpha^2.    (4.19)
Quite remarkably, we see that the purities of A and B are equal (which is true even if
one has d_A = 2 and the other d_B = 1000). Thus, we conclude that the purity
of the reduced states can be directly used as a quantifier of entanglement: the more
entangled two systems are, the more mixed are their reduced density matrices.
In particular, if σ_1 = 1 then all the others must be zero, so the two parties are in a product
state. Otherwise, their degree of entanglement is quantified by the sum in Eq. (4.19). We also now
finally have the tools to define what is a maximally entangled state: it is a state in which all
singular values are equal. Due to the normalization (4.12), this then implies
\sigma_\alpha = \frac{1}{\sqrt{r}}    (maximally entangled states).    (4.20)
As an example, consider a state of the form
|\psi\rangle = \sqrt{p}\, |01\rangle + \sqrt{1-p}\, |10\rangle,    p \in [0, 1].    (4.21)
Recall from Eq. (4.6) that the singular values of ψ are the square roots of the
eigenvalues of ψ†ψ. Well, in our case this is pretty easy because
\psi^\dagger \psi = \begin{pmatrix} 1-p & 0 \\ 0 & p \end{pmatrix}.
When p = 0 or 1, one of the singular values equals 1, which is the case of a
product state. Conversely, when p = 1/2 both singular values are equal and we have a
maximally entangled state. Thus, the Bell state (3.15) is a maximally entangled state.
A worked example
Let us work out an example in detail. Consider two qubits prepared in the state
|\psi\rangle = \frac{c}{\sqrt{2}} \big( |00\rangle + |11\rangle \big) + \frac{d}{\sqrt{2}} \big( |01\rangle + |10\rangle \big),    (4.23)
where c and d are constants satisfying |c|² + |d|² = 1. For simplicity I will assume
they are real, so that they may be parametrized as c = cos(θ/2) and d = sin(θ/2). If
(c, d) = (1, 0) or (0, 1) we recover the Bell states |Φ+⟩ and |Ψ+⟩ in Eqs. (3.13) and
(3.14), which we know are entangled. Conversely, if c = d = 1/√2 we actually obtain
the product state |ψ⟩ = |+⟩|+⟩.
The reduced density matrices of A and B are
\rho_A = \rho_B = \frac{1}{2} \begin{pmatrix} 1 & 2cd \\ 2cd & 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & \sin\theta \\ \sin\theta & 1 \end{pmatrix}.    (4.24)
The purity of the global state ρ_AB = |ψ⟩⟨ψ| is of course equal to 1, tr ρ_AB² = 1, since the
state is pure. But in general, the reduced density matrices ρ_A and ρ_B will be mixed. As
we just learned from Schmidt's theorem, both will have the same purity,
tr\, \rho_A^2 = tr\, \rho_B^2 = \frac{1 + \sin^2\theta}{2}.    (4.25)
It is true that in this case ρ_A ≡ ρ_B, but even in cases where ρ_A ≠ ρ_B, both will still
have the same purity [Eq. (4.19)]. We see that the reduced state is maximally mixed (purity 1/2)
when θ = 0, π, and that the purity is 1 when θ = π/2.
The matrix ψ_{ab} corresponding to the state (4.23) is
\psi = \frac{1}{\sqrt{2}} \begin{pmatrix} c & d \\ d & c \end{pmatrix}.    (4.26)
If you try to plug this on Mathematica to get the Singular Value Decomposition, you
may get a bit frustrated. SVD routines are really optimized for numerics and can give
weird results for symbolics. In this case, however, the SVD (4.4) is pretty simple. I
will leave it for you to verify that both U and V are in this case nothing but the Hadamard
gate,
U = V = H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}.    (4.27)
That is,
\psi = H\, S\, H^\dagger,    (4.28)
where S = diag(σ_+, σ_-), with the singular values
\sigma_\pm = \frac{c \pm d}{\sqrt{2}}.    (4.29)
Applying the recipe (4.19) should give you back Eq. (4.25).
Let us now find the Schmidt basis in Eqs. (4.14) and (4.15). Since U = V = H,
in this case the Schmidt basis is nothing but the |±⟩ states. In fact, we may simply verify
that the original state (4.23) can actually be written as
|\psi\rangle = \frac{(c+d)}{\sqrt{2}}\, |{+}{+}\rangle + \frac{(c-d)}{\sqrt{2}}\, |{-}{-}\rangle.    (4.30)
This is the Schmidt form of the state. I personally think it is quite nice. Note how,
when written like this, it becomes much more evident when the state will be entangled
or not, compared to Eq. (4.23).
Quantifying the correlations of mixed states is much harder and is still an open topic of
research. The reason is that it is not easy to distinguish
between quantum correlations and classical correlations. To see what I mean, have a
look back at the state (3.41). This is a classical probability distribution. However, the
sub-systems A and B are not statistically independent, because p_{i,j} cannot be factored
as a product of two probability distributions. This is therefore an instance of classical
correlations. We will get back to this topic soon.
Another big question concerns multipartite systems. A system of N parties is in general
described by a state of the form
|\psi\rangle = \sum_{s_1, \ldots, s_N} \psi_{s_1 \ldots s_N}\, |s_1, \ldots, s_N\rangle.    (4.31)
The coefficients ψ_{s_1...s_N} contain all the information about this system. They say, for
instance, that particle 3 is entangled with particle 25, but particle 1 is not entangled with
particle 12. Or that particles 1, 2, 3, taken as a block, are completely independent of
4, 5, ..., N. Everything is encoded in ψ_{s_1,...,s_N}. And it's a mess.
Understanding this mess is a big topic of current research. To complicate things
further, notice that if the local Hilbert spaces have dimension d (e.g. d = 2
for qubits), then there will be d^N different entries in the state (4.31). And that is a big
problem because, for d = 2 and N = 300, 2^300 is already larger than the number of particles in
the observable universe. Thus, if we want to characterize the entanglement properties of only 300
silly qubits, we are already in deep trouble. This is not a computational limitation that
will be solved by the next generation of processors. It is a fundamental limitation.
However, this does not mean that all corners of Hilbert space are equally explored.
It may very well be that, in practice, all the action occurs essentially in a small part
of it. One is therefore left with the question of understanding which are the relevant
corners of Hilbert space.
A particularly fruitful approach to deal with these questions, which has gained
enormous popularity in recent years, is to interpret ψ s1 ,...,sN as a rank-N tensor. This
is the idea of Tensor Networks. The neat thing about tensors, is that one can employ
low-rank approximations. This is what was done, for instance, when we discussed the
SVD as an approximation tool in Eq. (4.9): matrix Ai j has N 2 entries, but a matrix
of the form Ai j = ui v j will only have 2N. Approximating a general A as a sum of
ui v j allows us to greatly reduce the number of entries. Another reason why Tensor
Networks are popular is that they allow one to draw diagrams that tell you who is
entangled with whom. Figuring out what parts of the many-body Hilbert space are
relevant is a million dollar question. Substantial progress has been made recently for
certain classes of quantum systems, such as one-dimensional chains with short-range
interactions. But the problem is nonetheless still in its infancy.
State purification
We finish this section with the concept of purifying a state. Consider a physical
system A described by a general mixed state ρ_A with diagonal form
\rho_A = \sum_a p_a\, |a\rangle\langle a|.
By purification we mean writing this mixed state as a pure state in a larger Hilbert
space. Of course, this can always be done, because we can always imagine
that A is mixed because it was entangled with some other system B. All we need is
to make that formal. One thing we note from the start is that the purification is not
unique: the system B can have any size, so there is an infinite number of pure states
which purify ρ_A. The simplest approach is then to consider B to be a copy of A. We
then define the pure state
|\psi\rangle = \sum_a \sqrt{p_a}\, |a\rangle \otimes |a\rangle.    (4.32)
Taking the partial trace over B indeed recovers tr_B |ψ⟩⟨ψ| = ρ_A, so |ψ⟩ is a legitimate
purification of ρ_A.
The von Neumann entropy of a quantum state ρ is defined as
S(\rho) = - tr\big( \rho \ln \rho \big).
The entropy quantifies the lack of information we have about a quantum state. It is zero
when the state is pure (because in this case we have all information – we know exactly
what the state is). Moreover, it is ln d when the state is the identity, π = I/d (the
maximally mixed state). Another quantity, intimately related to S(ρ), is the relative entropy
(or Kullback-Leibler divergence). Given two density matrices ρ and σ, it is defined as
S(\rho \| \sigma) = tr\big( \rho \ln \rho - \rho \ln \sigma \big).    (4.35)
The relative entropy represents a kind of distance between ρ and σ. It satisfies S(ρ||σ) ≥
0, with equality if and only if ρ = σ. To make the link between the von Neumann entropy
and the relative entropy, we compute the relative entropy between a state ρ and
the maximally mixed state π = I/d. It reads S(ρ||π) = −S(ρ) + ln d. This motivates us
to define the information we have about a quantum state as
I(\rho) := \ln d - S(\rho) = S(\rho \| \pi).    (4.36)
The information is therefore the distance between ρ and the maximally mixed state.
Makes sense.
The logic behind Eq. (4.36) is extremely powerful and is the basis for most quan-
tifiers in quantum information theory. In (4.36) we wish to quantify information, so
we consider the distance between our state and a reference state, which is the state for
which we have no information.
We use this idea to introduce the concept of Mutual Information. Consider a
bipartite system AB prepared in an arbitrary state ρ_AB. This state will in general not be
a product, so from the marginals ρ_A = tr_B ρ_AB and ρ_B = tr_A ρ_AB we cannot reconstruct
the original state; that is, ρ_A ⊗ ρ_B ≠ ρ_AB. We then ask: what is the information that is
contained in ρ_AB, but which is not present in the marginalized state ρ_A ⊗ ρ_B? This is a
measure of the correlations present in ρ_AB, and it can be defined as the distance between
ρ_AB and the marginalized state ρ_A ⊗ ρ_B:
I_{\rho_{AB}}(A{:}B) := S(\rho_{AB} \| \rho_A \otimes \rho_B).    (4.37)
Due to the properties of the relative entropy, this quantity is non-negative, and zero if
and only if ρ_AB is a product state. For this reason, the mutual information quantifies the
amount of correlations in ρ_AB. Using Eq. (4.35) we can also write
S(\rho_{AB} \| \rho_A \otimes \rho_B) = -S(\rho_{AB}) - tr\big( \rho_{AB} \ln \rho_A \big) - tr\big( \rho_{AB} \ln \rho_B \big).
But now we can compute the partial traces in two steps. For instance,
- tr\big( \rho_{AB} \ln \rho_A \big) = - tr_A\big( tr_B(\rho_{AB}) \ln \rho_A \big) = - tr_A\big( \rho_A \ln \rho_A \big) = S(\rho_A).
We therefore arrive at the standard formula
I_{\rho_{AB}}(A{:}B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB}).
The mutual information quantifies the total amount of correlations present in a state,
irrespective of whether these correlations are classical or quantum. It is, by far, the most
important quantifier of correlations in quantum information theory. It is also extremely
easy to use. A harder question to address is "how much of I_{ρAB}(A:B) is quantum and
how much is classical?" That is a big question, and still an open topic of research. We
will talk more about it below. If the state ρ_AB is pure, then S(ρ_AB) = 0. Moreover, as we
saw in Eqs. (4.17) and (4.18) of the previous section, the eigenvalues of ρ_A and ρ_B in
this case will be the same, hence S(ρ_A) = S(ρ_B). We therefore conclude that, for pure
states,
I_{\rho_{AB}}(A{:}B) = 2 S(\rho_A) = 2 S(\rho_B)    (for pure states).    (4.38)
When the state is pure, all correlations are quantum and correspond to entanglement.
Hence, we see that in this case the mutual information becomes twice the entanglement
entropy.
As a byproduct of Eq. (4.37), since I_{ρAB}(A:B) ≥ 0, we also learn that
S(\rho_{AB}) \leq S(\rho_A) + S(\rho_B).    (4.39)
This is called the subadditivity of the von Neumann entropy: the entropy of the whole
is always less than or equal to the entropy of the parts.
Next consider the information (4.36) of a quantum state ρ_AB (not the mutual information,
just the standard information). It reads I(ρ_AB) = ln(d_A d_B) − S(ρ_AB). Using Eq. (4.37)
we can write this as
I(\rho_{AB}) = I(\rho_A) + I(\rho_B) + I_{\rho_{AB}}(A{:}B).    (4.40)
This makes the physical meaning of the mutual information particularly clear: The total
information I(ρAB ) contained in the state ρAB is split into the local information, con-
tained in the marginals ρA and ρB , plus the mutual information that is shared between
A and B.
As we just saw, the mutual information is a distance between a state ρ_AB and the
marginalized state ρ_A ⊗ ρ_B. We can actually make this a bit more general and define
the mutual information as the minimum distance between ρ_AB and all product states,
I_{\rho_{AB}}(A{:}B) = \min_{\sigma_A, \sigma_B} S(\rho_{AB} \| \sigma_A \otimes \sigma_B),    (4.41)
where σ_A and σ_B are arbitrary states of A and B. It is easy to convince ourselves that
the closest product state to ρ_AB is actually ρ_A ⊗ ρ_B. To see this, we use the definition of
the relative entropy in Eq. (4.35), together with the fact that ρ_A and ρ_B are the marginals
of ρ_AB, to write
S(\rho_{AB} \| \sigma_A \otimes \sigma_B) = S(\rho_{AB} \| \rho_A \otimes \rho_B) + S(\rho_A \| \sigma_A) + S(\rho_B \| \sigma_B) \geq S(\rho_{AB} \| \rho_A \otimes \rho_B),
since the relative entropy is non-negative. Hence, the minimization clearly occurs for
σ_{A(B)} = ρ_{A(B)}. We thus see that, out of the set of all product states, the one closest
to ρ_AB ("closest" in the sense of the relative entropy) is exactly ρ_A ⊗ ρ_B.
4.3 Other quantifiers based on relative entropies
A very nice idea
The reason why it is interesting to write the mutual information as the minimization
in Eq. (4.41) is because it introduces an extremely powerful idea:1 suppose we want to
quantify some property P of a quantum state ρ. Any property. We then first define the
set S of all states which do not have that property. The distance between ρ and the
closest state without that property can then be used to quantify that property:
P(\rho) = \min_{\sigma \in S} S(\rho \| \sigma).    (4.42)
This is one of the basic ideas behind quantum resource theories. We say a property P
is a resource, and we establish the set of states which do not have that resource
(the free states). The amount of resource that a state ρ has can then be associated with
the minimum distance between ρ and the set of resourceless states. If this distance
is large, then ρ has a substantial amount of this resource. There is also another
essential piece to resource theories, which is the set of operations one can perform
which do not increase a resource. We will talk more about this when we discuss
quantum operations in general, next chapter.
A first example is coherence. Here the free states are the incoherent states, i.e. the states
which are diagonal in a fixed basis |i⟩. Calling I the set of such states, the amount of
coherence in ρ may be quantified as
C(\rho) = \min_{\delta \in I} S(\rho \| \delta).    (4.43)
This is called the relative entropy of coherence.2 In this case the minimization can
again be done in closed form. Given a general ρ = \sum_{ij} \rho_{ij} |i\rangle\langle j|, define \rho_{\rm diag} = \sum_i \rho_{ii} |i\rangle\langle i|
as the state containing only the diagonal entries of ρ in the basis |i⟩. One may then
verify that, for any δ ∈ I, tr(ρ ln δ) = tr(ρ_diag ln δ). This allows us to write
S(\rho \| \delta) = S(\rho_{\rm diag} \| \delta) + S(\rho_{\rm diag}) - S(\rho).
1 K. Modi, T. Paterek, W. Son, V. Vedral and M. Gu, "Unified view of quantum and classical correlations", Phys. Rev. Lett., 104, 080501 (2010).
2 T. Baumgratz, M. Cramer, and M. B. Plenio, "Quantifying coherence", Phys. Rev. Lett., 113, 140401 (2014).
Hence, the minimum occurs precisely for δ = ρ_diag, so that the relative entropy of
coherence may be written as
C(\rho) = S(\rho_{\rm diag}) - S(\rho).    (4.44)
This quantity serves as a faithful measure of the amount of coherence, in the basis |i⟩,
present in a quantum state ρ.
Next we turn to entanglement. For mixed states, the free states are the separable states
already introduced in Eq. (3.40),
\rho_{AB} = \sum_i p_i\, \rho_A^i \otimes \rho_B^i,    (4.45)
where the ρ_A^i and ρ_B^i are valid density matrices for systems A and B. This is a typical state
mixing classical and quantum stuff. The states ρ_A^i and ρ_B^i can be anything we want,
but we are mixing them in a classical way: with some classical probability p_1, system
A is prepared in ρ_A^1 and B in ρ_B^1; with probability p_2 they are prepared in ρ_A^2 and ρ_B^2;
and so on. Thus, even though quantumness may be hidden inside the ρ_A^i and the ρ_B^i,
the correlations between A and B are purely classical, being related simply to a lack of
information about which preparation took place.
It can be shown that separable states can be produced solely with Local Operations
and Classical Communications (LOCC). That is, they can be produced by doing stuff
locally in A and B, as well as sending WhatsApp messages. For this reason, sepa-
rable states are said not to be entangled. That is, we define entangled mixed
states as those which are not separable. Following the spirit of Eq. (4.42), the amount
of entanglement in a quantum state ρ_AB can then be quantified by the relative entropy of
entanglement,
E(\rho_{AB}) = \min_{\sigma_{AB} \in S} S(\rho_{AB} \| \sigma_{AB}),    (4.46)
where S is the set of separable states. Unfortunately, no closed form exists for E(ρAB ).
And, what is worse, this quantity is notoriously difficult to compute. In fact, this is
considered one of the major open problems in quantum information theory.
This is frustrating, since entanglement is a key resource in quantum information
processing applications. An alternative is to quantify entanglement using what is
called the Entanglement of Formation (EoF). This at least has a closed (albeit ugly)
formula for two qubits. The idea is as follows. We know how to quantify entanglement
for pure states: in this case the entanglement entropy is simply the von Neumann
entropy of the reduced states,
E(|\psi\rangle\langle\psi|) = S(\rho_A) = S(\rho_B).    (4.47)
Given a mixed state ρ_AB, we therefore look into all possible decompositions of it in
terms of an ensemble of pure states. That is, decompositions of the form
\rho_{AB} = \sum_i q_i\, |\psi_i\rangle\langle\psi_i|,    (4.48)
where the q_i are probabilities and the |ψ_i⟩ are arbitrary states. The entanglement of formation
is then defined as
E_F(\rho_{AB}) = \min \sum_i q_i\, E(|\psi_i\rangle\langle\psi_i|),    (4.49)
where the minimization is done over all possible decompositions into ensembles of
pure states.
For the case of two qubits, one can find a closed formula for E_F(ρ_AB), which reads as
follows.3 I will list it here in case you ever need it in the future. But please, don't let
the messiness of the next few lines interrupt your reading.
Define a matrix
\tilde{\rho}_{AB} = (\sigma_y \otimes \sigma_y)\, \rho_{AB}^*\, (\sigma_y \otimes \sigma_y),
where ρ* is the complex conjugate of ρ. Then define the matrix
R = \sqrt{ \sqrt{\rho_{AB}}\, \tilde{\rho}_{AB}\, \sqrt{\rho_{AB}} }.
(Yeah, this is ugly. I know.) Next let λ_1 ≥ λ_2 ≥ λ_3 ≥ λ_4 denote the four eigenvalues of
R in decreasing order. We then define the Concurrence as
C(\rho_{AB}) = \max\big\{ 0,\; \lambda_1 - \lambda_2 - \lambda_3 - \lambda_4 \big\}.    (4.50)
The Entanglement of Formation is then given in terms of the concurrence by
E_F(\rho_{AB}) = H_{\rm bin}\Big( \frac{1 + \sqrt{1 - C^2}}{2} \Big),    (4.51)
where H_bin(x) = −x ln x − (1−x) ln(1−x) is the binary (Shannon) entropy.
3 S. Hill and W. K. Wootters, "Entanglement of a pair of quantum bits", Phys. Rev. Lett., 78, 5022 (1997).
Figure 4.2: Entanglement of Formation EF [Eq. (4.51)] as a function of p for the Werner
state (4.52).
Consider now a state of the form
\rho_{AB} = \sum_{i,j} p_{ij}\, |i\rangle\langle i|_A \otimes |j\rangle\langle j|_B,    (4.53)
where the p_{ij} are probabilities and |i⟩_A and |j⟩_B represent arbitrary bases of A and B. Why is
this state classical? Because it satisfies absolutely all properties expected from classical
probability theory. Or, to put it differently, for a state like this all we need is the joint
probability distribution p_{ij}; from it we can deduce any other property.
Let me try to explain what I mean. In classical probability theory if you are given a
joint probability distribution pi j of A and B, one can compute the marginal distributions
by simply summing over the undesired index:
p_i^A = \sum_j p_{ij},    p_j^B = \sum_i p_{ij}.    (4.54)
On the other hand, given the quantum state ρ_AB, we can compute the reduced density
matrix by taking the partial trace. For instance,
\rho_A = tr_B\, \rho_{AB} = \sum_{i,j} p_{ij}\, |i\rangle\langle i| = \sum_i p_i^A\, |i\rangle\langle i|.
Thus, we see that taking the partial trace is the same as just computing the marginal
distribution p_i^A. We don't even need the quantum state: we can simply operate with the
probability distribution.
But there is also a stronger reason as to why the state (4.53) can be considered
classical. And it is related to measurements and conditional probability distributions.
In probability theory we define the conditional distribution of A given B as
p_{i|j} = \frac{p_{ij}}{p_j^B}.    (4.55)
This represents the probability of observing outcome i for A given that outcome j for
B was observed.
Here is where things get interesting, because in quantum mechanics, in order to
observe an outcome for B, one must perform a measurement on it. And, as we know,
measurements usually have a backaction. The state (4.53) is special precisely because it does
not. Suppose we measure B in the basis |j⟩. If outcome j is observed, the state of AB
after the measurement will then be updated to
\rho_{AB|j} = \frac{1}{p_j} \big( I_A \otimes |j\rangle\langle j| \big)\, \rho_{AB}\, \big( I_A \otimes |j\rangle\langle j| \big),    (4.56)
where
p_j = tr\Big[ \big( I_A \otimes |j\rangle\langle j| \big)\, \rho_{AB}\, \big( I_A \otimes |j\rangle\langle j| \big) \Big].    (4.57)
To compute p_j we take the trace. Not surprisingly, we find that p_j = p_j^B = \sum_i p_{ij}, the
marginal of B. That is, the probability of finding outcome j when we measure B is
simply the marginal distribution p_j^B. The conditional state, given that j was observed,
also becomes
\rho_{AB|j} = \sum_i \frac{p_{ij}}{p_j^B}\, |i, j\rangle\langle i, j|.    (4.58)
The reduced density matrix of B is simply ρ_{B|j} = |j⟩⟨j|. And, which is the interesting
part, the reduced density matrix of A will be
\rho_{A|j} = tr_B\, \rho_{AB|j} = \sum_i \frac{p_{ij}}{p_j^B}\, |i\rangle\langle i| = \sum_i p_{i|j}\, |i\rangle\langle i|.    (4.59)
We see that this is once again nothing but what one would obtain from the conditional
probability distribution (4.55). The basis elements |iihi| aren’t really doing anything
interesting. They are just sort of hanging in there.
To summarize, the state (4.53) can be considered a genuinely classical state because
it satisfies all properties expected from classical probability theory. There exist basis
sets {|i⟩_A} and {|j⟩_B} for which no backaction in the measurements occurs, so that the
notion of conditional probability can be defined without a problem. The state still has
correlations because in general p_{ij} ≠ p_i^A p_j^B. But these correlations are fully classical.
States of the form (4.53) are also called classical-classical states, not because they are
really really classical, but because they are classical on both sides A and B. I think
it is really nice how the density matrix formalism contains all of classical probability
theory. Density matrices can go from purely quantum stuff to purely classical.
The Mutual Information (4.37) can be written in a neat way if we note that, for instance,
S(\rho_A) = -\sum_{i,j} p_{ij} \log p_i^A.
This formula is true because j only appears in the first p_{ij}, so that if we sum over j we
get \sum_j p_{ij} = p_i^A. With this trick we can write all 3 terms in Eq. (4.37) in a single
sum, leading to
I(A{:}B) = \sum_{i,j} p_{ij} \log \frac{p_{ij}}{p_i^A\, p_j^B} = D(p_{AB} \| p_A p_B),    (4.60)
where D(p_AB || p_A p_B) is the classical version of the relative entropy (4.35),
D(p \| q) = \sum_i p_i \log\big( p_i / q_i \big).    (4.61)
This result provides an intuitive feel for the mutual information: it quantifies, on average,
how different the joint distribution p_{ij} is from the product of marginals p_i^A p_j^B.
We can also write it in terms of the conditional distributions p_{i|j} in Eq. (4.55). It
then becomes
I(A{:}B) = \sum_j p_j^B \sum_i p_{i|j} \log \frac{p_{i|j}}{p_i^A} = \sum_j p_j^B\, D(p_{A|j} \| p_A).    (4.62)
Each term D(p_{A|j} || p_A) measures how much the distribution of A is updated by the
knowledge of j. The quantity p_{i|j}/p_i^A represents how much learning something about
B affects the outcomes of A.
We can also write (4.62) in a third way. If we split the log in two terms, the one
involving p_i^A will simply reduce to
-\sum_j p_j^B \sum_i p_{i|j} \log p_i^A = -\sum_i p_i^A \log p_i^A = H(A),
which is the Shannon entropy H of the distribution p_A. As for the other term, we define
the conditional entropy
H(A|B) = -\sum_j p_j^B \sum_i p_{i|j} \log p_{i|j}.    (4.63)
The entropy H(A) measures the lack of information we have about A. The conditional
entropy H(A|B) measures the lack of information about A given that we know the
outcome of B. In terms of these quantities, the mutual information (4.62) can be
written as
I(A{:}B) = H(A) - H(A|B).    (4.64)
Thus, the mutual information is nothing but the difference between how much you
don't know about A and how much you don't know about A given that you know
B. Of course, the result is symmetric in A and B, so that it is also true that
I(A:B) = H(B) − H(B|A).
Quantum-classical states
The results above for the mutual information apply only to the classical-classical
state (4.53). Let us now explore what happens when we start to introduce some
quantumness in our state. The next best thing is a quantum-classical state,
\rho_{AB} = \sum_j \rho_A^j \otimes p_j^B\, |j\rangle\langle j|_B,    (4.65)
where the ρ_A^j are arbitrary density operators for A. This state behaves in a simple way with
respect to B, but may contain a bunch of quantumness in A. The reduced states are
\rho_A = \sum_j p_j^B\, \rho_A^j,    \rho_B = \sum_j p_j^B\, |j\rangle\langle j|.
As a particular case, if ρ_A^j = \sum_i p_{i|j} |i\rangle\langle i|, then we recover the classical-classical state (4.53).
To give an example, suppose A and B are qubits, with ρ_A^0 = |0⟩⟨0| and ρ_A^1 = |+⟩⟨+|.
Then
\rho_{AB} = p_0^B\, |0\rangle\langle 0| \otimes |0\rangle\langle 0| + p_1^B\, |+\rangle\langle +| \otimes |1\rangle\langle 1|.    (4.66)
The reduced state of A will then be
\rho_A = p_0^B\, |0\rangle\langle 0| + p_1^B\, |+\rangle\langle +|.    (4.67)
We now try to play with conditional distributions and see what happens. First,
suppose we measure B in the basis |j⟩. Following the same steps as in Eq. (4.58), we get
\big( I_A \otimes |j\rangle\langle j| \big)\, \rho_{AB}\, \big( I_A \otimes |j\rangle\langle j| \big) = \rho_A^j \otimes p_j^B\, |j\rangle\langle j|.
Thus, the probability of obtaining outcome j is simply p_j^B, as expected. Moreover, the
conditional state of A, given that j was observed, is simply ρ_{A|j} = ρ_A^j, again with no
backaction on A.
Quantum discord
Let us now turn to the Mutual Information for the quantum-classical state (4.65).
The Mutual Information is of course given by Eq. (4.37). That definition is rock solid
and always holds, no matter what type of state. However, we also showed that in the
classical-classical state we could write it as (4.64). This way of writing is nice because
it relates information to the difference in entropy between the case where you know, and
the case where you do not know, the other side. Can we have an equivalent formulation
valid for more general states?
The answer is "sort of". First, we need to realize that the type of measurement
matters: the amount of information we can learn about A, given that we measured B,
depends on what measurement we did on B. So let us make things a bit more formal
by using the concept of generalized measurements studied in Secs. 2.7 and 3.4. Let
{M_k^B} denote a set of measurement operators acting on B, satisfying \sum_k (M_k^B)^\dagger M_k^B = I.
We then start with an arbitrary state ρ_AB and perform the M_k^B measurement on B. The
probability of outcome k will be
p_k^B = tr\Big[ \big( I_A \otimes M_k^B \big)\, \rho_{AB}\, \big( I_A \otimes M_k^B \big)^\dagger \Big],    (4.68)
and the state of AB, given that outcome k was obtained, will be
\rho_{AB|k} = \frac{1}{p_k^B} \big( I_A \otimes M_k^B \big)\, \rho_{AB}\, \big( I_A \otimes M_k^B \big)^\dagger.    (4.69)
The corresponding conditional state of A is ρ_{A|k} = tr_B ρ_{AB|k}. In analogy with Eq. (4.64),
we can then define the information gained about A by measuring B as
J_M(A|B) = S(\rho_A) - \sum_k p_k^B\, S(\rho_{A|k}).    (4.71)
This quantity represents how much the information about A was updated, given the
knowledge we learned from measuring B. It is the generalization of Eq. (4.64) for
arbitrary quantum states.
The key point, however, which we will now explore, is that for general quantum
states J_M(A|B) is not the same as the mutual information I(A:B). Their
difference is called the Quantum Discord:
Q_M(A|B) = I(A{:}B) - J_M(A|B).    (4.72)
The discord is always non-negative (as we will show in a second). It represents the
mismatch between the total amount of shared information between A and B and the
information gain related to the specific choice of measurement M_k^B. There is a "discord"
because in quantum theory the type of measurement matters, and may not give you the
full information.
In fact, we can make that more rigorous: we can define a measurement-independent
discord by minimizing (4.72) over all possible measurements:
Q(A|B) = \min_{M_k^B} Q_M(A|B).    (4.73)
The remarkable feature of quantum mechanics is that there exist states for which this
quantity is non-zero, meaning no measurement is capable of recovering the results
expected from classical probability theory. Computing this minimization is not an easy
task, which is a bummer. And also, sometimes the measurement which extracts the
largest information is actually quite crazy and would be impossible to perform in a
lab. But nevertheless, you have to agree with me that conceptually the idea is quite
beautiful.
Discord in quantum-classical states
Let us now go back to the quantum-classical state (4.65) and explore these new
ideas. We begin by proving something called the joint entropy theorem. Consider the
spectral decomposition of each ρ_A^j in the form
\rho_A^j = \sum_k p_{k|j}^A\, |k_j\rangle\langle k_j|.
Remember that the ρ_A^j are just arbitrary density matrices of A. Thus, for a fixed j, the
|k_j⟩ form an orthonormal basis, ⟨k_j|q_j⟩ = δ_{k,q}. But for different j there is no particular
relation between them: in general ⟨k_j|q_{j′}⟩ ≠ δ_{k,q}. The quantum-classical state (4.65) can then be
written as
\rho_{AB} = \sum_{k,j} p_{k|j}^A\, p_j^B\, |k_j, j\rangle\langle k_j, j|.    (4.74)
The states |k_j, j⟩ are orthonormal,
\langle k_j, j | q_{j'}, j' \rangle = \delta_{j,j'}\, \langle k_j | q_j \rangle = \delta_{j,j'}\, \delta_{k,q}.
Thus, we conclude that the eigenvalues of ρ_AB are simply p_{k|j}^A p_j^B.
The von Neumann entropy of ρ_AB will then be
S(\rho_{AB}) = - tr\big( \rho_{AB} \log \rho_{AB} \big) = - \sum_{k,j} p_{k|j}^A\, p_j^B \log\big( p_{k|j}^A\, p_j^B \big).
Splitting the logarithm in two terms and carrying out the sums, we arrive at the joint
entropy theorem:
S\Big( \sum_j \rho_A^j \otimes p_j^B\, |j\rangle\langle j|_B \Big) = H(p_B) + \sum_j p_j^B\, S(\rho_A^j).    (4.75)
The connection with quantum discord can now be made clear. Using Eq. (4.75), together
with ρ_B = \sum_j p_j^B |j\rangle\langle j| (so that S(ρ_B) = H(p_B)), the mutual information (4.37) for the
quantum-classical state becomes
I(A{:}B) = S(\rho_A) + S(\rho_B) - S(\rho_{AB}) = S(\rho_A) - \sum_j p_j^B\, S(\rho_A^j).
Figure 4.3: Basis dependent quantum discord (4.72) for the quantum-classical state (4.66) and
a projective measurement performed in the basis (4.76). (a) When the measurement
is performed in A. (b) Measurement performed in B.
This is nothing but J_{|j⟩}(A|B) in Eq. (4.71). Thus, we see that for a quantum-classical
state, if we measure B in the basis |j⟩ there will be no discord at all. Of course, if we
measure B in some crazy basis, there will be a discord. But at least there exists one
basis for which the minimization in Eq. (4.73) gives exactly zero.
The discord, however, is not symmetric in A and B. That is the whole point of hav-
ing a quantum-classical state. So even though there exists zero-discord measurements
on B, in general this is not the case for measurements performed on A. That is, even
though we may have Q(A|B) = 0, we may very well have Q(B|A) , 0.
To illustrate this, let us consider the quantum-classical state in Eq. (4.66), and let us
suppose that we perform projective measurements on each qubit in a basis parametrized
by Bloch angles, of the form
|\theta, \phi\rangle = \cos(\theta/2)\, |0\rangle + e^{i\phi} \sin(\theta/2)\, |1\rangle,    (4.76)
together with its orthogonal companion state.
In Fig. 4.3 we show the basis-dependent discord (4.72) as a function of θ, for the case
where the measurement is performed in A and in B. The curves assume p = 0.3. In
the case of measurements in B, the discord turns out not to depend on φ. The curves for
when the measurement is performed in A [Fig. 4.3(a)] represent multiple values of φ.
As can be seen, when the measurement is performed in B, which is the classical
side of the quantum-classical state, there exist values of θ for which the discord is zero.
This happens precisely at θ = 0, which means measuring in the computational basis. Conversely,
when the measurement is performed in A, no matter what measurement direction (θ, φ)
we choose, we always get a non-zero discord.
Chapter 5

Open quantum systems

5.1 Overview of quantum operations

In closed quantum systems, states evolve unitarily,
\rho' = U \rho U^\dagger,    U^\dagger U = 1.    (5.1)
As we already saw in previous chapters, this is not the most general type of transfor-
mation taking density matrices to density matrices. The most general map is called a
quantum operation or quantum channel. It has the form
\rho' = \mathcal{E}(\rho) = \sum_k M_k\, \rho\, M_k^\dagger,    (5.2)
where {M_k} is an arbitrary set of operators satisfying
\sum_k M_k^\dagger M_k = 1,    (5.3)
known as Kraus operators. If the set {M_k} has only one operator, we are back to
Eq. (5.1). The map described by Eq. (5.2) is called Completely Positive and Trace
Preserving (CPTP). The trace-preserving condition is ensured by Eq. (5.3). The term
"completely positive" means that it takes positive operators to positive operators, even
when acting on only part of a larger system (a stronger requirement than mere positivity).
There are four names in this chapter: Kraus, Stinespring, Lindblad and Choi. Here
is what each name is associated with:
• Kraus: the representation of a channel as in Eq. (5.2).
• Stinespring: a channel of the form (5.2) can always be viewed as the interac-
tion of a system with an environment (hence justifying the name open quantum
systems).
• Lindblad: a way to write the map (5.2) as a differential equation, dρ/dt = ...,
a bit like how von Neumann's equation is derived from (5.1). This is only
possible when the environment is very large.
• Choi: a weird, but super fancy, way of representing channels.
We will investigate each of these ideas throughout this chapter. In the remainder of this
section, I want to show you first how the four names appear in a specific example.
As an example, consider the amplitude damping channel, defined by the Kraus operators
M_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sqrt{1-\lambda} \end{pmatrix},    M_1 = \begin{pmatrix} 0 & \sqrt{\lambda} \\ 0 & 0 \end{pmatrix},    (5.4)
with λ ∈ [0, 1]. This is a valid set of Kraus operators since M_0^\dagger M_0 + M_1^\dagger M_1 = 1. Its
action on a general qubit density matrix reads:
\rho = \begin{pmatrix} p & q \\ q^* & 1-p \end{pmatrix}    \to    \rho' = \begin{pmatrix} \lambda + p(1-\lambda) & q\sqrt{1-\lambda} \\ q^*\sqrt{1-\lambda} & (1-\lambda)(1-p) \end{pmatrix}.    (5.5)
This is why this is called an amplitude damping: no matter where you start, the map
tries to push the system towards |0⟩. It does so by shifting the populations,
p \to \lambda + p(1-\lambda),    (5.6)
and by destroying the coherences,
q \to q\sqrt{1-\lambda}.    (5.7)
The larger the value of λ, the stronger the effect. This is Kraus.
Next let us see how this same map can arise from a system-environment interaction.
Consider our system qubit S together with an auxiliary qubit E (an ancilla), interacting
via an exchange Hamiltonian of the form H = g(σ_+^S σ_-^E + σ_-^S σ_+^E). In the basis
{|00⟩, |01⟩, |10⟩, |11⟩} of SE, the corresponding unitary reads
U = e^{-iHt} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos gt & -i\sin gt & 0 \\ 0 & -i\sin gt & \cos gt & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.    (5.10)
Now suppose that the ancilla starts in the state |0⟩_E, whereas the system starts in an
arbitrary state ρ_S. They then evolve unitarily as in Eq. (5.1):
\rho'_{SE} = U \big( \rho_S \otimes |0\rangle\langle 0|_E \big) U^\dagger.    (5.11)
Taking the partial trace over E, one may verify that the reduced state of S is exactly the
amplitude damping map (5.5), with λ = sin²(gt).
Notice how subtle this is: I am not talking about the evolution in time. In fact, in the
Kraus map (5.5) we don't even have time. What I am talking about is the structure of
the map; the structure of how an input state ρ_S is processed into an output state ρ'_S.
This is Stinespring: the effect of a quantum channel can be viewed as the interaction
of a system with an environment, followed by a partial trace over the environment.
The same equation was found independently by Gorini, Kossakowski and Sudarshan but, like
many things in science, Lindblad got most of the credit. That is slowly changing, and a lot of people are now
calling Eq. (5.14) the GKSL equation. Maybe I should do the same!
The amplitude damping dynamics can also be generated continuously in time, by a
master equation of the form
\frac{d\rho}{dt} = \gamma \Big[ \sigma_- \rho\, \sigma_+ - \frac{1}{2} \{ \sigma_+ \sigma_-, \rho \} \Big].    (5.14)
Comparing its solution with Eq. (5.5), we see that the evolution stemming from (5.14) can also
be viewed as an amplitude damping channel, provided we identify
\lambda = 1 - e^{-\gamma t}.    (5.16)
This is Lindblad.
Finally, Choi. Consider two qubits A and B, and apply the map E only to qubit B;
that is, consider I_A ⊗ E_B. Here I wrote E_B to emphasize that the map is acting only on
qubit B. Now consider the action of this map on an unnormalized Bell state,
|\Omega\rangle = |00\rangle + |11\rangle,    (5.18)
which produces the operator
\Lambda_{\mathcal{E}} = (I_A \otimes \mathcal{E}_B)\big( |\Omega\rangle\langle\Omega| \big).    (5.19)
This is the Choi matrix. The reason why it is useful is because, surprisingly, it turns
out that it completely characterizes the channel. Every channel E has, associated to it,
a corresponding Choi matrix ΛE . And the Choi matrix completely defines the channel
in the sense that from ΛE one can reconstruct E. This is surprising because it means
that, in order to know how a channel will act on all quantum states, it suffices to know
how it acts on the maximally entangled state |Ω⟩ in Eq. (5.18). I know this is not at all
obvious just by looking at Eq. (5.19). But I will show you in the next section how to do this.
Before moving on, a comment about measurements. Recall from Sec. 2.7 that, in a
generalized measurement, if outcome k is obtained, the state ρ is updated to
\rho \to \rho_k = \frac{M_k\, \rho\, M_k^\dagger}{p_k},    (5.20)
where p_k = tr(M_k ρ M_k^\dagger). This map definitely takes valid density matrices to valid
density matrices. However, it has a fundamental difference with respect to the maps we
discussed before: it is non-linear, because ρ also appears in the probability p_k. I
think this is kind of funny: professors and books always pound into our heads the idea that
quantum mechanics is linear. But measurements are not linear.
We can recover linearity if we consider instead a map of the form
Ek (ρ) = Mk ρMk† . (5.21)
This is definitely linear, but it no longer preserves the trace. It is an example of a
Completely Positive Trace Non-Increasing (CPTNI) map. The trace cannot increase
because, since $\sum_k M_k^\dagger M_k = I$, it follows that $M_k^\dagger M_k \leq I$. The fact that it no longer preserves the trace may at first seem disturbing, but we can always normalize it whenever
we need it, by writing $\rho_k = \mathcal{E}_k(\rho)/\operatorname{tr}\mathcal{E}_k(\rho)$. Thus, we can keep linearity, as long as we
get used to working with states that are no longer normalized. That is the idea of “trace
non-increasing maps”.
5.2 Stinespring representation theorem
In this section we make the Stinespring argument more rigorous. Consider a system
S with arbitrary density matrix ρS . Now suppose this system is coupled to an environ-
ment E, which can have any dimension, but which we assume is prepared in a certain
pure state |ψi (we will lift this restriction in a second). The composite S + E system
then interacts via a unitary U (Fig. 5.1) after which we trace over the environment. This
leads to a map
E(ρS ) = trE U ρS ⊗ |ψihψ|E U † . (5.24)
This map is clearly linear in ρ. It is also CPTP because we know that both unitary
dynamics and partial traces are CPTP, so the operations we are doing are certainly
on the safe side. In order to take the trace over E, we introduce a basis |kiE for the
environment, leading to
$$\mathcal{E}(\rho_S) = \sum_k \langle k|\, U\, \rho_S \otimes |\psi\rangle\langle\psi|\, U^\dagger\, |k\rangle = \sum_k {}_E\langle k|U|\psi\rangle_E \;\rho_S\; {}_E\langle\psi|U^\dagger|k\rangle_E. \qquad (5.25)$$
This last step is a bit confusing, I know. What I did was split |ψihψ|E in two and pass
the left one through ρS . I am allowed to do this because |ψiE lives in the space of the
environment and thus commutes with ρS .2
The quantities E hk|U|ψiE are still operators on the side of the system, as the con-
traction was done only on the side of the environment. If we define
Mk = E hk|U|ψiE , (5.26)
we then see that Eq. (5.25) may be written in the form of a quantum channel,
$$\mathcal{E}(\rho_S) = \sum_k M_k \rho_S M_k^\dagger.$$
² You can also see this in a more clumsy way using tensor products. Any unitary U of SE can always be written as $U = \sum_\alpha A_\alpha \otimes B_\alpha$, where $A_\alpha$, $B_\alpha$ are operators of S and E respectively. In tensor product notation, the kets $|k\rangle_E$ should be written as $I_S \otimes |k\rangle_E$. The map (5.24) may then be written as
$$\mathcal{E}(\rho_S) = \sum_k \big(I_S \otimes \langle k|\big)\Big(\sum_\alpha A_\alpha\otimes B_\alpha\Big)\, \rho_S\otimes|\psi\rangle\langle\psi|\, \Big(\sum_\beta A_\beta^\dagger\otimes B_\beta^\dagger\Big)\big(I_S\otimes|k\rangle\big)$$
$$= \sum_k\sum_{\alpha,\beta} A_\alpha\rho_S A_\beta^\dagger\; \langle k|B_\alpha|\psi\rangle\langle\psi|B_\beta^\dagger|k\rangle = \sum_k \Big(\sum_\alpha A_\alpha\langle k|B_\alpha|\psi\rangle\Big)\rho_S\Big(\sum_\beta A_\beta^\dagger\langle\psi|B_\beta^\dagger|k\rangle\Big),$$
Figure 5.1: Idea behind a Stinespring dilation: a quantum operation E(ρ) can always be con-
structed by evolving the system together with an environment, with a global unitary
U, and then discarding the environment.
We can also neatly see how the unitarity of U implies that the {Mk } are indeed Kraus
operators:
$$\sum_k M_k^\dagger M_k = \sum_k {}_E\langle\psi|U^\dagger|k\rangle_E\; {}_E\langle k|U|\psi\rangle_E = {}_E\langle\psi|U^\dagger U|\psi\rangle_E = I_S. \qquad (5.27)$$
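The recipe (5.26) is straightforward to implement. The sketch below extracts the Kraus operators from the unitary (5.10) by contracting the environment indices; the |s⟩_S ⊗ |e⟩_E index ordering is an assumption of this example:

```python
import numpy as np

g, t = 1.0, 0.4
c, s = np.cos(g*t), np.sin(g*t)
U = np.array([[1,     0,     0, 0],
              [0,     c, -1j*s, 0],
              [0, -1j*s,     c, 0],
              [0,     0,     0, 1]])      # Eq. (5.10)

U4 = U.reshape(2, 2, 2, 2)                # indices: s', e', s, e
M = [U4[:, k, :, 0] for k in (0, 1)]      # M_k = <k|_E U |0>_E, Eq. (5.26)

# unitarity of U guarantees the normalization (5.27)
assert np.allclose(sum(Mk.conj().T @ Mk for Mk in M), np.eye(2))
print(M[0], M[1], sep="\n")               # amplitude damping with lambda = sin^2(gt)
```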
This has again the same structure as the Kraus decomposition, except that the Kraus
operators now have two indices,
$$M_{k\mu} = \sqrt{q_\mu}\; {}_E\langle k|U|\mu\rangle_E. \qquad (5.28)$$
Even though the two indices may seem like a complication, we can simply interpret
(k, µ) as a collective index, so that formally this is not at all different from what we
already had before. This also agrees with the notion of state purification, as discussed
in Eq. (4.32): any mixed state ρE can always be viewed as a pure state in a larger
Hilbert space. Hence, we could also simply apply the same recipe (5.26), but use
instead a purified state $|\psi\rangle_{EE'} = \sum_\mu \sqrt{q_\mu}\, |\mu\rangle_E \otimes |\mu\rangle_{E'}$ living in a larger Hilbert space EE′.
Stinespring dilations
Eq. (5.26) provides a recipe on how to construct the set of Kraus operators given
a U and a |ψi of the environment. But what about the converse? Given a quantum
channel, specified by a set of Kraus operators {Mk }, is it always possible to associate
to it a certain system-environment unitary dynamics? The answer is yes. This is the
idea of a dilation: any quantum channel can be represented as a unitary evolution in a
dilated Hilbert space.
It can be shown that any quantum operation in a system with dimension d can
always be described by an environment with dimension at most d2 . Thus, channels for
a single qubit, for instance, can always be constructed by coupling it to an environment
having 4 levels (e.g. two other qubits). The reason why this works is best shown using
an example. Suppose we have the amplitude damping with Kraus operators (5.4). Now
construct an environment in a state |0i and a unitary U of the form
$$U = \begin{pmatrix} 1 & 0 & x & x \\ 0 & \sqrt{1-\lambda} & x & x \\ 0 & \sqrt{\lambda} & x & x \\ 0 & 0 & x & x \end{pmatrix}. \qquad (5.30)$$
Here x means arbitrary elements (not all equal) which should be included to ensure that
U is indeed a unitary (the choices are not unique). Note how the first two columns are
simply the entries of the Kraus operators (5.4) stacked together. Something like
$$U = \begin{pmatrix} M_0 & \cdots \\ M_1 & \cdots \end{pmatrix}.$$
This shows why the construction works: we simply build environments by hand, putting
in U the elements of the Mk we want to have. It is, above all, mostly a matter of order-
ing elements.
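This stacking can be automated: put the Kraus operators in the first columns and complete the rest by Gram–Schmidt. In the sketch below the random completion, and the ordering that makes the columns read as M₀ stacked on M₁ as in (5.30), are choices of this example:

```python
import numpy as np

lam = 0.3
M0 = np.array([[1, 0], [0, np.sqrt(1 - lam)]])
M1 = np.array([[0, np.sqrt(lam)], [0, 0]])

cols = np.vstack([M0, M1])                # first two columns of U, as in (5.30)
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(np.hstack([cols, rng.normal(size=(4, 2))]))
for j in range(2):                        # undo sign flips introduced by QR
    ph = np.vdot(Q[:, j], cols[:, j])
    Q[:, j] *= ph / abs(ph)

U = Q
assert np.allclose(U.conj().T @ U, np.eye(4))   # U is unitary
assert np.allclose(U[:, :2], cols)              # columns = stacked Kraus operators
```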
Generalized measurements can also be realized by means of a projective measurement in the ancilla. Here is how this goes. Consider the S + E interaction in Eq. (5.24). I am using the word ancilla now (which means auxiliary system) instead of environment, but it doesn’t matter what you call it. After the interaction the global state of S + E is
ρ0S E = U ρS ⊗ |ψihψ|E U † ,
We now perform a projective measurement in the ancilla, in the basis |k⟩. This will
update the state of the system to
$$\rho_S^k = {}_E\langle k|U|\psi\rangle_E\; \rho_S\; {}_E\langle\psi|U^\dagger|k\rangle_E = M_k \rho_S M_k^\dagger,$$
where I left out the normalization factor. Whence, we see that the generalized measure-
ment Mk ρS Mk† can always be viewed as the interaction of a system with an ancilla by
means of some unitary U, followed by a projective measurement in the ancilla in the
basis |ki. The total quantum channel E(ρ) is then a sum of each possible outcome ρkS
(here we don’t have to weight by the probabilities because the ρkS are not normalized).
The channel therefore encompasses all possible outcomes of the measurement. It is
like measuring but not reading the outcomes of the measurement.
If we now combine these states to form a quantum channel we get the same channel
$$\sum_\alpha \rho_S^\alpha = \sum_\alpha {}_E\langle\alpha|U|\psi\rangle_E\; \rho_S\; {}_E\langle\psi|U^\dagger|\alpha\rangle_E = \operatorname{tr}_E\big[U\, \rho_S\otimes|\psi\rangle\langle\psi|_E\, U^\dagger\big],$$
where I simply used the fact that {|α⟩} also forms a basis. Thus, we reach the important
conclusion that, as far as the channel is concerned, it doesn’t matter in which basis you
measure the ancilla: they all produce the same $\mathcal{E}(\rho)$.
However, depending on the choice of measurement basis, one may get different
Kraus operators. These are called the different unravelings of the quantum channel.
Indeed, Eq. (5.32), for instance, invites us to define
$$F_\alpha = {}_E\langle\alpha|U|\psi\rangle_E. \qquad (5.33)$$
Hence,
$$\mathcal{E}(\rho_S) = \sum_k M_k \rho_S M_k^\dagger = \sum_\alpha F_\alpha \rho_S F_\alpha^\dagger. \qquad (5.34)$$
The sets of operators {M_k} and {F_α} represent different unravelings of the same
channel: different ways in which measurements can take place, but which lead
to the same channel in the end. This ambiguity in the choice of Kraus operators is called
freedom in the operator-sum representation. An “operator-sum representation” is
a representation of the form $\sum_k M_k \rho_S M_k^\dagger$. A given channel $\mathcal{E}$, however, may have
multiple operator-sum representations.
The operators Fα in Eq. (5.33) and Mk in Eq. (5.26) are seen to be connected by a
basis transformation unitary $V_{\alpha k} = \langle\alpha|k\rangle$:
$${}_E\langle\alpha|U|\psi\rangle_E = \sum_k \langle\alpha|k\rangle\; {}_E\langle k|U|\psi\rangle_E.$$
Whence,
$$F_\alpha = \sum_k V_{\alpha k} M_k. \qquad (5.35)$$
In words: sets of Kraus operators connected by a unitary lead to the same quantum
channel.
If we measure the ancilla in the computational basis we then find the Kraus operators
$$M_0 = |0\rangle\langle 0|, \qquad M_1 = |1\rangle\langle 1|.$$
Thus, the outcomes we get from the ancilla represent the probabilities of finding the
system in 0 and 1:
$$p_i = \operatorname{tr}\big(M_i \rho_S M_i^\dagger\big) = \langle i|\rho_S|i\rangle, \qquad i = 0, 1.$$
But now suppose we decide to measure the ancillas in a different basis, say |±iE .
Then the Kraus operators become
$$F_0 = {}_E\langle +|U|0\rangle_E = \frac{I_S}{\sqrt{2}}, \qquad F_1 = {}_E\langle -|U|0\rangle_E = \frac{\sigma_z^S}{\sqrt{2}}. \qquad (5.38)$$
The measurement outcomes are now completely uninformative, q0 = q1 = 1/2. Notwith-
standing, the channel is still exactly the same.
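A quick numerical confirmation of this freedom, assuming the U in question is a CNOT that copies the system into the ancilla, $U|i\rangle_S|0\rangle_E = |i\rangle_S|i\rangle_E$ (an assumption consistent with Eq. (5.38)):

```python
import numpy as np

ket = lambda i: np.eye(2)[:, [i]]
M = [ket(k) @ ket(k).conj().T for k in (0, 1)]             # M_k = |k><k|
F = [np.eye(2)/np.sqrt(2), np.diag([1., -1.])/np.sqrt(2)]  # Eq. (5.38)

rho = np.array([[0.7, 0.2 - 0.3j], [0.2 + 0.3j, 0.3]])
out_M = sum(K @ rho @ K.conj().T for K in M)
out_F = sum(K @ rho @ K.conj().T for K in F)
assert np.allclose(out_M, out_F)                # same channel, Eq. (5.34)

V = np.array([[1, 1], [1, -1]]) / np.sqrt(2)    # V_{alpha k} = <alpha|k>
assert all(np.allclose(F[a], sum(V[a, k]*M[k] for k in (0, 1))) for a in (0, 1))
```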
5.3 Choi’s matrix and proof of the Kraus representation
All I want to do in this section is to prove the following mathematical result: Let
E(ρ) denote a map satisfying
1. Linearity: E(αρ1 + βρ2 ) = αE(ρ1 ) + βE(ρ2 ).
2. Trace preserving: tr[E(ρ)] = tr(ρ).
3. Completely positive: if ρ ≥ 0 then E(ρ) ≥ 0.
Then this map can always be represented in the form (5.2) for a certain set of Kraus
operators {M_k}. Proving this is nice because it shows that the quantum channel is the most general map satisfying 1, 2, and 3.
There is a subtle difference between a map that is positive and a map that is completely positive. Completely positive means that $(\mathcal{E}\otimes\mathcal{I})(\rho) \geq 0$ even if ρ is a density matrix living in a larger space than the one $\mathcal{E}$ acts on. For instance, suppose $\mathcal{E}$ acts on the space of a qubit. But maybe we want to apply this map to one qubit of an entangled pair. If even in this case the resulting ρ′ is positive semi-definite, we say the map is completely positive.³
The proof of our claim is based on a powerful, yet abstract, idea related to what is
called the Choi isomorphism. Let S denote the space where our map E acts and define
an auxiliary space R which is an exact copy of S. Define also the (unnormalized) Bell
state
$$|\Omega\rangle = \sum_i |i\rangle_R \otimes |i\rangle_S, \qquad (5.39)$$
where |ii is an arbitrary basis and from now on I will always write the R space in the
left and the S space in the right. We now construct the following operator:
$$\Lambda_{\mathcal{E}} = (\mathcal{I}_R \otimes \mathcal{E}_S)\big(|\Omega\rangle\langle\Omega|\big). \qquad (5.40)$$
This is called the Choi matrix of the map $\mathcal{E}$. It is the outcome of applying the map
$\mathcal{E}_S$ on one side of the maximally entangled Bell state of R+S. Hence, it is a bit like a
density matrix in the space RS: it is positive semi-definite, although not normalized.
The most surprising thing about the Choi matrix is that it completely determines
the map E. That is, if we somehow learn how our map E acts on |ΩihΩ| we have
completely determined how it will act on any other density matrix. This is summarized
by the following formula:
E(ρ) = trR (ρT ⊗ IS )ΛE . (5.41)
³ There aren’t many examples of maps that are positive but not completely positive. The classic example is the transpose (see, for instance, Box 8.2 of Nielsen and Chuang).
I know what you are thinking: this is super weird. And I agree! It is! But now let’s
check and see that this sorcery actually works.
Note that here ρT is placed on the auxiliary space R in which the trace is being
taken. Consequently, the result on the left-hand side is still an operator living on S. To
verify that Eq. (5.41) is true we first rewrite (5.40) as
$$\Lambda_{\mathcal{E}} = \sum_{i,j} |i\rangle_R\langle j| \otimes \mathcal{E}(|i\rangle\langle j|). \qquad (5.42)$$
Then we get
$$\operatorname{tr}_R\big[(\rho^T \otimes I_S)\Lambda_{\mathcal{E}}\big] = \operatorname{tr}_R\Big[(\rho^T \otimes I_S)\sum_{i,j} |i\rangle\langle j| \otimes \mathcal{E}(|i\rangle\langle j|)\Big] = \sum_{i,j} \langle j|\rho^T|i\rangle\, \mathcal{E}(|i\rangle\langle j|) = \mathcal{E}\Big(\sum_{i,j}\rho_{i,j}|i\rangle\langle j|\Big) = \mathcal{E}(\rho).$$
Here I used the fact that h j|ρT |ii = hi|ρ| ji = ρi, j . Moreover, I used our assumption that
E is a linear map.
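Here is a compact numerical sketch of this sorcery, for the amplitude-damping channel: build Λ_E from Eq. (5.42) and reconstruct the channel’s action via Eq. (5.41):

```python
import numpy as np

lam = 0.3
Ks = [np.array([[1, 0], [0, np.sqrt(1 - lam)]]),
      np.array([[0, np.sqrt(lam)], [0, 0]])]
E = lambda r: sum(K @ r @ K.conj().T for K in Ks)

d = 2
Lam = np.zeros((d*d, d*d), dtype=complex)
for i in range(d):
    for j in range(d):
        Eij = np.zeros((d, d)); Eij[i, j] = 1
        Lam += np.kron(Eij, E(Eij))        # Eq. (5.42): |i><j| ⊗ E(|i><j|)

rho = np.array([[0.6, 0.1 + 0.2j], [0.1 - 0.2j, 0.4]])
big = np.kron(rho.T, np.eye(d)) @ Lam
out = sum(big[d*r:d*r+d, d*r:d*r+d] for r in range(d))   # partial trace over R
assert np.allclose(out, E(rho))            # Eq. (5.41) works
```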
We are now in the position to prove our claim. As I mentioned, the Choi matrix
looks like a density matrix on R+S. In fact, we are assuming that our map E is CPTP.
Thus, since |ΩihΩ| is a positive semi-definite operator, then so will ΛE (although it will
not be normalized). We may then diagonalize $\Lambda_{\mathcal{E}}$ as
$$\Lambda_{\mathcal{E}} = \sum_k \lambda_k |\lambda_k\rangle\langle\lambda_k|,$$
where |λk i are vectors living in the big R+S space and λk ≥ 0. For the purpose of what
we are going to do next, it is convenient to absorb the eigenvalues into the eigenvectors
(which will no longer be normalized) and define
$$\Lambda_{\mathcal{E}} = \sum_k |m_k\rangle\langle m_k|, \qquad |m_k\rangle = \sqrt{\lambda_k}\,|\lambda_k\rangle. \qquad (5.43)$$
Note that here CPTP is crucial because it implies that $\lambda_k \geq 0$, so that $\langle m_k| = \langle\lambda_k|\sqrt{\lambda_k}$.
To finish the proof we insert this into Eq. (5.41) to get
$$\mathcal{E}(\rho) = \sum_k \operatorname{tr}_R\big[(\rho^T \otimes I_S)\,|m_k\rangle\langle m_k|\big]. \qquad (5.44)$$
The right-hand side will still be an operator living in S, since we only traced over R. All
we are left to do is convince ourselves that this will have the shape of the operator-sum
representation in Eq. (5.2).
In order to do that, things will get a bit nasty. The trick is to connect the states
|mk i of the Choi matrix ΛE with the Kraus operators Mk appearing in the operator-sum
representation (5.2). This is done by noting that, since |m_k⟩ lives on the R+S space, it
can be decomposed as
$$|m_k\rangle = \sum_{i,j} (M_k)_{j,i}\, |i\rangle_R \otimes |j\rangle_S, \qquad (5.45)$$
where $(M_k)_{j,i}$ are a set of coefficients which we can interpret as a matrix $M_k$. Eq. (5.45) implies that ${}_R\langle i|m_k\rangle = M_k|i\rangle_S$. We now manipulate (5.44) to read
$$\mathcal{E}(\rho) = \sum_k \sum_{i,j} {}_R\langle i|\rho^T|j\rangle_R\; {}_R\langle j|m_k\rangle\langle m_k|i\rangle_R = \sum_k \sum_{i,j} \rho_{j,i}\, M_k |j\rangle_S\langle i|\, M_k^\dagger = \sum_k M_k \rho M_k^\dagger,$$
and voilà!
In conclusion, we have seen that any map which is linear and CPTP can be described by an operator-sum representation, Eq. (5.2). I like this a lot because we are
not asking for much: linearity and CPTP are just the basic things we expect from a
physical map. Linearity should be there because everything in quantum mechanics is
linear, and CPTP must be there because the evolution must map a physical state into a
physical state. When we first arrived at the idea of a unitary, we were also very relaxed,
because all we required was the conservation of ket probabilities. The spirit here is the
same. For this reason, the quantum operation is really just a very natural generalization of the evolution of quantum systems, using density matrices instead of
kets.
But when there is also the coupling to an environment, we expect an evolution equation
of the form
$$\frac{d\rho}{dt} = -i[H,\rho] + D(\rho), \qquad (5.46)$$
where D(ρ) is an additional term, called the dissipator, which describes the effects of
the environment. This type of equation is historically known as a master equation, a
name which was first introduced in a completely different problem⁴ and is supposed
to mean an equation from which all other properties can be derived. I think it’s a
weird name. Sounds like a Dan Brown book: the master equation.
In this section I want to address the question of what are the typical forms one can
have for D(ρ) that lead to a valid CPTP evolution. That is, such that the solution ρ(t) of
Eq. (5.46) is a positive semidefinite density matrix at all times t. This is the content of
Lindblad’s theorem, which states that in order for this to happen, D(ρ) must have the
form
$$D(\rho) = \sum_k \gamma_k\Big(L_k\rho L_k^\dagger - \frac{1}{2}\{L_k^\dagger L_k,\rho\}\Big), \qquad \gamma_k \geq 0, \qquad (5.47)$$
where $L_k$ are arbitrary operators. If you have any equation satisfying this structure, then
the corresponding evolution is guaranteed to be CPTP (i.e., physical). Master equations
having this structure are then called Lindblad master equations or, more generally,
Gorini–Kossakowski–Sudarshan–Lindblad (GKSL) equations.
The operator D(ρ) in Eq. (5.46) is called a superoperator. It is still a linear oper-
ator, as we are used to in quantum mechanics. But it acts on density matrices, instead
of kets, which means it can multiply ρ on both sides. Notwithstanding, it is essential
to realize that despite this complication, Eq. (5.46) still has the general structure of a
linear equation,
$$\frac{d\rho}{dt} = \mathcal{L}(\rho). \qquad (5.48)$$
The superoperator $\mathcal{L}(\rho)$ is called the Liouvillian (because of the analogy with Liouville’s
equation in classical mechanics). This equation is just like a matrix-vector equation
$$\frac{dx}{dt} = Ax, \qquad (5.49)$$
where x is a vector and A is a matrix. The density matrix is now the “vector” (matrices
form a vector space) and the Liouvillian L is now the “matrix” (linear operator acting
on the vector space).
The solution of Eq. (5.49) is well known: it is simply $x(t) = e^{At}x(0)$, which is the
map from the initial to the final state. We can call $B = e^{At}$ the map; A is then the
generator of the map. The same is true for Eq. (5.48); that is, we can also write
$$\rho(t) = \mathcal{E}_t\big(\rho(0)\big), \qquad (5.50)$$
where $\mathcal{E}_t$ is the linear map taking ρ(0) to ρ(t), which is related to the generator $\mathcal{L}$ by
means of $\mathcal{E}_t = e^{\mathcal{L}t}$.⁵

⁴ A. Nordsieck, W. E. Lamb and G. T. Uhlenbeck, Physica 7, 344 (1940).
The letter E already hints at where I want to get to eventually: If the generator is
in Lindblad form (5.47), then the map will be a quantum channel (5.2). That is the
essence of Lindblad’s theorem. But what about the converse? When can a quantum
channel be expressed as the exponential of a Lindblad generator?
To answer this requires us to introduce another property of equations of the form (5.48),
known as the semigroup property. Using again the analogy with the linear Eq. (5.49),
we know that we can split the evolution into multiple steps. If we first evolve to t1
and then evolve for an extra t2 , it is the same as if we evolved all the way through by
t₁ + t₂. The matrix exponential makes this obvious: $e^{At_2}e^{At_1} = e^{A(t_2+t_1)}$. Since the master
equation has the same structure, this must also be true for the map $\mathcal{E}_t$. That is, it must
satisfy the semigroup property⁶
$$\mathcal{E}_{t_2}\big(\mathcal{E}_{t_1}(\rho)\big) = \mathcal{E}_{t_2+t_1}(\rho). \qquad (5.51)$$
Lindblad’s Theorem
We can now update our question: What is the structure of a map which is
both CPTP and semigroup? This is the content of Lindblad’s theorem:a The
generator of any quantum operation satisfying the semigroup property must
have the form:
$$\frac{d\rho}{dt} = \mathcal{L}(\rho) = -i[H,\rho] + \sum_k \gamma_k\Big(L_k\rho L_k^\dagger - \frac{1}{2}\{L_k^\dagger L_k,\rho\}\Big), \qquad (5.52)$$
5 This formula is pretty on a formal level, but it is difficult to apply because L does not multiply ρ on the
left only. But luckily we won’t have to enter into this issue.
⁶ Here “group” refers to the family of CPTP maps $\mathcal{E}_t$ characterized by a single parameter t. Eq. (5.51) is the composition property for a group. The reason why it is only a semigroup is that the inverse is not a member of the set (as would be required for it to be called a group). Here the inverse is $\mathcal{E}_{-t}$; while this exists, it is not in general CPTP (unless $\mathcal{E}_t$ is unitary, as Lindblad shows).
Of course, this does not say anything about how to derive such an equation. That is
a hard question, which we will start to tackle in the next section. But this result gives
us an idea of what kind of structure we should look for and that is already remarkably
useful.
where the Kraus operators $M_k(\Delta t)$ cannot depend on the time t. We are interested in a
differential equation for ρ(t), which would look something like $d\rho/dt \simeq [\rho(t+\Delta t) - \rho(t)]/\Delta t$. To get one, we parametrize the Kraus operators for small Δt as
$$M_0 = I + G\Delta t, \qquad M_k = \sqrt{\gamma_k \Delta t}\, L_k \quad (k \neq 0),$$
where G and Lk are arbitrary operators and γk ≥ 0 are constants that I introduce simply
to make the Lk dimensionless. The normalization condition for the Kraus operators
implies that
$$I = \sum_k M_k^\dagger M_k = M_0^\dagger M_0 + \sum_{k\neq 0} M_k^\dagger M_k = (I + G^\dagger\Delta t)(I + G\Delta t) + \Delta t \sum_{k\neq 0}\gamma_k L_k^\dagger L_k = I + (G + G^\dagger)\Delta t + \Delta t \sum_{k\neq 0}\gamma_k L_k^\dagger L_k + O(\Delta t^2).$$
This shows why we need this G guy. Otherwise, we would never be able to normalize
the Kraus operators. Since G is arbitrary, we may parametrize it as
G = K − iH, (5.54)
where K and H are both Hermitian. It then follows from the normalization condition
that
$$K = -\frac{1}{2}\sum_{k\neq 0} \gamma_k L_k^\dagger L_k, \qquad (5.55)$$
whereas nothing can be said about H. The operator G then becomes
$$G = -iH - \frac{1}{2}\sum_{k\neq 0}\gamma_k L_k^\dagger L_k. \qquad (5.56)$$
This concludes our recipe for constructing Mk and M0 . They are now properly normal-
ized to order ∆t. And when ∆t → 0 we get only M0 = I (nothing happens).
With this at hand, we can finally substitute our results in Eq. (5.53) to find
$$\rho(t+\Delta t) = (I + G\Delta t)\,\rho\,(I + G^\dagger\Delta t) + \Delta t\sum_{k\neq 0}\gamma_k L_k \rho L_k^\dagger$$
$$= \rho(t) + \Delta t\big(G\rho + \rho G^\dagger\big) + \Delta t\sum_{k\neq 0}\gamma_k L_k\rho L_k^\dagger + O(\Delta t^2)$$
$$= \rho(t) - i\Delta t[H,\rho] + \Delta t\sum_{k\neq 0}\gamma_k\Big(L_k\rho L_k^\dagger - \frac{1}{2}\{L_k^\dagger L_k, \rho\}\Big) + O(\Delta t^2).$$
These equations can also be used in the reverse way: they tell you what the
corresponding Kraus operators will be if you integrate the Lindblad Eq. (5.52) over
an infinitesimal time ∆t. This is neat because we already have a nice intuition
for the action of the Kraus operators as generalized measurements. We know
that the Mk cause quantum jumps: abrupt transitions in the state of the system
as ρ → Mk ρMk† . Over a small time interval ∆t, there is a large probability
that nothing happens (M0 ) and a small probability that one of the jumps Mk
occur. The operators $L_k$ are called jump operators. They represent the types
of jumps one may have in the quantum dynamics.
Figure 5.2: Example evolution of ρ(t) under the map (5.59). Left: energy-level diagram, with levels |e⟩ = |0⟩ and |g⟩ = |1⟩, showing also the transition rates γN (upwards) and γ(N+1) (downwards). Right: dynamics in the Bloch sphere. The initial state is taken as |ψ⟩ = (cos π/8, e^{iπ/4} sin π/8).
Consider, for instance, a qubit with Hamiltonian $H = \frac{\Omega}{2}\sigma_z$, evolving according to the master equation
$$\frac{d\rho}{dt} = -i[H,\rho] + \gamma(N+1)D[\sigma_-] + \gamma N D[\sigma_+], \qquad (5.59)$$
where γ is a constant and
$$N = \frac{1}{e^{\beta\Omega} - 1}$$
is the Bose-Einstein distribution with inverse temperature β = 1/kB T and frequency Ω.
In Eq. (5.59) I also introduced the notation
$$D[L] = L\rho L^\dagger - \frac{1}{2}\{L^\dagger L, \rho\}. \qquad (5.60)$$
After all, since the dissipator is fully specified by the jump operator L, we don’t need
to write the full dissipator all the time.
An example of the evolution of the density matrix under Eq. (5.59) is shown in
Fig. 5.2. The Hamiltonian part induces the qubit to precess around the z axis. If there
was no dissipation the spin would precess indefinitely. But in the presence of dissipa-
tion, it precesses and is also damped towards the z axis.
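A minimal Euler integration of Eq. (5.59) reproduces this behavior, as well as the steady state (5.62) below (the convention σ₋ = |1⟩⟨0|, with |1⟩ the ground state, follows the text; the parameter values are arbitrary):

```python
import numpy as np

sz = np.diag([1.0, -1.0])
sm = np.array([[0, 0], [1, 0]])      # sigma_-: lowers |0> (excited) to |1> (ground)
sp = sm.T
D = lambda L, r: L @ r @ L.conj().T - 0.5*(L.conj().T @ L @ r + r @ L.conj().T @ L)

Omega, gamma, N = 1.0, 0.5, 0.3
H = 0.5 * Omega * sz
psi = np.array([np.cos(np.pi/8), np.exp(1j*np.pi/4)*np.sin(np.pi/8)])
rho = np.outer(psi, psi.conj())

dt = 1e-3
for _ in range(40000):               # evolve to t = 40 >> 1/gamma
    rho = rho + dt*(-1j*(H @ rho - rho @ H)
                    + gamma*(N+1)*D(sm, rho) + gamma*N*D(sp, rho))

print(np.real(np.trace(sz @ rho)), -1/(2*N + 1))   # both ~ -0.625, Eq. (5.62)
```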
Steady-state: After a long time has elapsed the system will tend to a steady-state,
which is the solution of
$$\frac{d\rho}{dt} = \mathcal{L}(\rho) = 0. \qquad (5.61)$$
You may check that the steady-state is in this case diagonal, with
$$\langle\sigma_x\rangle_{\rm ss} = \langle\sigma_y\rangle_{\rm ss} = 0, \qquad \langle\sigma_z\rangle_{\rm ss} = -\frac{1}{2N+1}, \qquad (5.62)$$
which corresponds to a thermal equilibrium density matrix
$$\rho_{\rm ss} = \begin{pmatrix} \frac{N}{2N+1} & 0 \\ 0 & \frac{N+1}{2N+1} \end{pmatrix}. \qquad (5.63)$$
Emission and absorption: Let us now understand this from the perspective of the
jump operators (5.58). Our master equation (5.59) is characterized by jump operators
σ− and σ+ , which lower and raise the energy of the atom respectively. Moreover, these
transitions occur with rates γ(N + 1) and γN, so that it is more likely to observe a
transition downwards than upwards. A transition upwards represents the absorption of
a photon and a transition downwards represents an emission. In the limit of zero tem-
perature we get N → 0. In this case the atom only interacts with the electromagnetic
vacuum. As a consequence, the absorption tends to zero. However, the emission does
not. This is the idea behind spontaneous emission. Even in an environment with zero
photons on average, the atom still interacts with the vacuum and can emit radiation.
This is the physical meaning of the factor “1” in N + 1.
Unitary vs. dissipative: The dissipative dynamics of Eq. (5.59) is described by the
jump operators σ− and σ+ . We could also have introduced jump operators directly in
the Hamiltonian. For instance, consider the purely unitary dynamics under
$$H = \frac{\Omega}{2}\sigma_z + \frac{\lambda}{2}(\sigma_+ + \sigma_-). \qquad (5.64)$$
This type of Hamiltonian appears when the atom is pumped by a coherent light source
(i.e., a laser). How similar is this dynamics to that of the Lindblad equation? They are
completely different. First, note that Hamiltonians have to be Hermitian, so the rates
for upward and downward transitions have to be equal. This reflects the fact that unitary
dynamics is always reversible. An example of the dynamics generated by Eq. (5.64)
is shown in Fig. 5.3. As can be seen, all it does is shift the axis of precession. Instead
of precessing around z, it will precess around some other axis. There is no damping. In
unitary dynamics there is never any damping.
Dissipators and Hamiltonians compete: Let us now suppose, for some reason, that
the system evolves according to the MEq. (5.59), but with the Hamiltonian (5.64). This
could mean, for instance, a damped atom also pumped by an external laser. We want
to compute the steady-state ρss , which is the solution of
$$-i\Big[\frac{\Omega}{2}\sigma_z + \frac{\lambda}{2}(\sigma_+ + \sigma_-),\; \rho_{\rm ss}\Big] + \gamma(N+1)D[\sigma_-] + \gamma N D[\sigma_+] = 0. \qquad (5.65)$$
Figure 5.3: Unitary evolution of ρ(t) under the Hamiltonian (5.64). Compare this with the dissi-
pative dynamics in Fig. 5.2. The term σ x also causes transitions, but the Hamiltonian
dynamics is fundamentally different from the open dynamics.
The point I want to emphasize about this equation, is that the steady-state depends on
both the unitary as well as the dissipative part. The steady-state (5.63) happened not
to depend on the Hamiltonian which was simply σz . But in general, dissipators and
Hamiltonians compete. And the steady-state is somewhere in the middle. In the case
of Eq. (5.65) the steady-state turns out to be quite ugly, and is given by
$$\langle\sigma_x\rangle_{\rm ss} = -\frac{4\lambda\Omega}{(2N+1)\big[\gamma^2(2N+1)^2 + 2(\lambda^2 + 2\Omega^2)\big]}, \qquad (5.66)$$
$$\langle\sigma_y\rangle_{\rm ss} = -\frac{2\gamma\lambda}{\gamma^2(2N+1)^2 + 2(\lambda^2 + 2\Omega^2)}, \qquad (5.67)$$
$$\langle\sigma_z\rangle_{\rm ss} = -\frac{4\Omega^2 + \gamma^2(2N+1)^2}{(2N+1)\big[\gamma^2(2N+1)^2 + 2(\lambda^2 + 2\Omega^2)\big]}. \qquad (5.68)$$
In addition to ugly, this steady-state is also not very intuitive. For instance, if we take
the limit λ → 0 we recover the results in (5.62). But if we instead take the limit λ → ∞
(a very strong laser) we get instead the maximally mixed state hσi i = 0. The message
I want you to take from this is that the steady-state of a master equation is always
a competition between different terms. Moreover, Hamiltonian and dissipative terms
compete in different ways. As a consequence, the steady-state is not always intuitive.
Master equations are model specific, depending not only on the system, but also on the type of environment
and the specific system–environment interaction. Hence, they should be derived on a
case-by-case basis. Of course, with some practice, one can start to gain some intuition
as to what the Lindblad equations should look like, and so on.
So to start this adventure, I propose we study a very simple model of master equa-
tions, which I absolutely love. They are called collisional models and are illustrated
in Fig. 5.4. A system S, prepared in an arbitrary state ρS, is allowed to interact with
a sequence of ancillas. The ancillas are all independent and have been prepared in
identical states ρE (each ancilla can be a qubit, for instance, so ρE is the 2 × 2 matrix
of a single qubit). Each interaction is described by some unitary US E (τ) (in the same
spirit as what we did in Stinespring’s theorem in Sec. 5.2) that lasts for some duration
τ. Thus, if we think about one system-ancilla interaction event, their global state at the
end will be given by
ρS E 0 = U ρS ⊗ ρE U † ,
(5.69)
where, in order to make the notation more compact, I write only U instead of US E (τ).
After one system-ancilla interaction, we then throw away the ancilla and
bring in a fresh new one, again prepared at the state ρE . Since we threw away
the ancilla, the state of the system is now ρ0S = trE ρ0S E . We then repeat the
process, using ρ0S as the input in Eq. (5.69) and evolving the system towards
another state ρ00S . All we care about is the stroboscopic evolution of the system,
in integer multiples of τ. In fact, we can make this more precise as follows. Let
ρS (n) ≡ ρS (nτ) denote the state of the system after n interactions. So the initial
state is ρS(0), the state after the first interaction is ρS(1), and so on. The game of
the collisional model is therefore given by the stroboscopic map
$$\rho_S(n+1) = \operatorname{tr}_E\big[U\, \rho_S(n)\otimes\rho_E\, U^\dagger\big]. \qquad (5.70)$$
This kind of evolution is really neat. It is exactly the map of the Stinespring
dilation [Eq (5.24)], but applied multiple times, each with a brand new envi-
ronment. This map is therefore constructed to satisfy the semigroup property
at the stroboscopic level.
I’m not saying we take the “limit of τ → 0”. There is no limit in the mathematical sense
here: τ is still finite. But if it is sufficiently small, compared to other time scales, then
the discrete difference ρS(n+1) − ρS(n) can be a good approximation for a derivative.
In order to gain intuition, let us introduce a specific model for system-ancilla inter-
actions. The system is assumed to be arbitrary, but the ancillas are taken to be qubits
prepared in a diagonal (thermal) state
$$\rho_E = \begin{pmatrix} f & 0 \\ 0 & 1-f \end{pmatrix} = f|0\rangle\langle 0| + (1-f)|1\rangle\langle 1|, \qquad (5.72)$$
where f ∈ [0, 1] is the probability of finding the ancilla in |0i. We then assume that the
unitary U is generated by a system-ancilla interaction of the form
$$V = g\big(L\,\sigma_+^E + L^\dagger \sigma_-^E\big), \qquad (5.73)$$
where g is a real number and L is an arbitrary operator of the system. This type of
interaction is very common and is sometimes referred to as an exchange Hamiltonian:
it essentially says that whenever the qubit goes up a level (by applying σ+E ), one should
apply L to the system (whatever the meaning of L is). Whereas if the qubit goes down
a level, we apply L† . For instance, if our system was another qubit, maybe we could
just use L = σS− . Then if the bath goes up, the system goes down. But in general, the
choice of L is arbitrary.
The unitary U in Eq. (5.70) will then be given by U = e−iτV . For simplicity, I will
not worry here about the Hamiltonians of the system and environment, just about the
interaction V. We now consider the unitary map (5.69) and expand the sandwich of
exponentials using the BCH formula:
$$\rho'_{SE} = e^{-i\tau V}\rho_S\rho_E\, e^{i\tau V} = \rho_S\rho_E - i\tau[V, \rho_S\rho_E] - \frac{\tau^2}{2}\big[V,[V,\rho_S\rho_E]\big] + \ldots \qquad (5.74)$$
To obtain the equation for S , we then trace over the environment. The first term is
trivial: trE ρS ρE = ρS . In the second term we have
$$\operatorname{tr}_E[V, \rho_S\rho_E] = \operatorname{tr}_E\big(V\rho_S\rho_E - \rho_S\rho_E V\big).$$
This is the trickiest term: you cannot use the cyclic property of the trace because now
V acts on both subspaces of S and E, whereas the trace is only partial. But what we
can do is commute ρS and ρE at will, so that this may be written as
trE [V, ρS ρE ] = trE (VρE )ρS − ρS trE (ρE V).
However, due to our choice of state ρE and interaction V in Eqs. (5.72) and (5.73), we
have that
trE (σ±E ρE ) = 0.
Hence, this term is also identically zero:
trE [V, ρS ρE ] = 0.
Hence, taking the partial trace of Eq. (5.74) yields
$$\rho'_S = \rho_S - \frac{\tau^2}{2}\operatorname{tr}_E\big[V,[V,\rho_S\rho_E]\big]. \qquad (5.75)$$
Dividing by τ and arranging things so as to make the derivative in Eq. (5.71) appear, we find
$$\frac{d\rho_S}{dt} = D(\rho_S) := -\frac{\tau}{2}\operatorname{tr}_E\big[V,[V,\rho_S\rho_E]\big]. \qquad (5.76)$$
This result is a bit weird, because there is still a τ in there. This is why I said above that
we should not take the limit τ → 0. The approximation as a derivative in Eq. (5.71)
does not mean that anything multiplying τ should be negligible. The reason is that V
still has a constant g [Eq. (5.73)]. So it can be that even though τ is small, g2 τ will be
finite.
This type of issue also appears in classical stochastic processes. For instance, the
Langevin equation describing Brownian motion is
$$m\frac{d^2x}{dt^2} = -\gamma\frac{dx}{dt} + f(x) + \sqrt{2\gamma k_B T}\,\xi(t),$$
where ξ(t) is a random noise satisfying ⟨ξ(t)ξ(t′)⟩ = δ(t − t′). The noise is a delta
function, so that it acts only for an infinitesimal time interval. But if you want to get
something non-trivial, it must also be infinitely large to compensate. The logic is the
same in Eq. (5.76): we are assuming the interaction time τ with each ancilla is very
short. But if we want to get something non-trivial out of it, we must also make this
interaction sufficiently strong.
In any case, continuing with our endeavor, we now write
$$-[V,[V,\rho_S\rho_E]] = 2V\rho_S\rho_E V - V^2\rho_S\rho_E - \rho_S\rho_E V^2.$$
We then compute the traces over E using the specific form of V in Eq. (5.73): For
instance,
$$\operatorname{tr}_E(V^2\rho_S\rho_E) = \operatorname{tr}_E(V^2\rho_E)\,\rho_S = g^2\operatorname{tr}_E\big[(L^\dagger L\,\sigma_-^E\sigma_+^E + LL^\dagger\,\sigma_+^E\sigma_-^E)\rho_E\big]\rho_S = g^2\big(\langle\sigma_-^E\sigma_+^E\rangle\, L^\dagger L\rho_S + \langle\sigma_+^E\sigma_-^E\rangle\, LL^\dagger\rho_S\big).$$
Using Eq. (5.72) we get $\langle\sigma_-^E\sigma_+^E\rangle = 1-f$ and $\langle\sigma_+^E\sigma_-^E\rangle = f$. Thus,
$$\operatorname{tr}_E(V^2\rho_S\rho_E) = g^2\big[(1-f)\,L^\dagger L\rho_S + f\, LL^\dagger\rho_S\big].$$
Finally, we compute
$$\operatorname{tr}_E(V\rho_S\rho_E V) = g^2\operatorname{tr}_E\big[(L^\dagger\sigma_-^E + L\sigma_+^E)\rho_S\rho_E(L^\dagger\sigma_-^E + L\sigma_+^E)\big]$$
$$= g^2\big[L^\dagger\rho_S L\operatorname{tr}_E(\sigma_-^E\rho_E\sigma_+^E) + L\rho_S L^\dagger\operatorname{tr}_E(\sigma_+^E\rho_E\sigma_-^E) + L^\dagger\rho_S L^\dagger\operatorname{tr}_E(\sigma_-^E\rho_E\sigma_-^E) + L\rho_S L\operatorname{tr}_E(\sigma_+^E\rho_E\sigma_+^E)\big].$$
The last two terms are zero because $(\sigma_\pm^E)^2 = 0$. In the first two terms, we can now use
the cyclic property of the trace, which yields
$$\operatorname{tr}_E(V\rho_S\rho_E V) = g^2\big[f\,L^\dagger\rho_S L + (1-f)\,L\rho_S L^\dagger\big].$$
Putting everything together, the dissipator (5.76) becomes
$$D(\rho_S) = g^2\tau(1-f)\Big[L\rho_S L^\dagger - \frac{1}{2}\{L^\dagger L, \rho_S\}\Big] + g^2\tau f\Big[L^\dagger\rho_S L - \frac{1}{2}\{LL^\dagger, \rho_S\}\Big]. \qquad (5.78)$$
We therefore recognize here a Lindblad master equation with jump operators L and L†,
and rates γ₋ = g²τ(1−f) and γ₊ = g²τf.
Collisional models are neat because they give you substantial control over the types
of master equations you can construct. Eq. (5.78) provides a very general recipe. Given
any system S , if you want to couple it to an environment with an interaction of the
form (5.73), this will produce a dissipator D(ρS ) having both L and L† as jump opera-
tors.
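As a sketch of this recipe, the code below iterates the stroboscopic map (5.70) for a qubit system with the arbitrary choice L = σ₋, and compares it against an Euler integration of the dissipator (5.78); the S ⊗ E tensor ordering is an assumption of this example:

```python
import numpy as np
from scipy.linalg import expm

sm = np.array([[0, 0], [1, 0]]); sp = sm.conj().T
f, g, tau = 0.8, 1.0, 0.05
L = sm
V = g * (np.kron(L, sp) + np.kron(L.conj().T, sm))   # Eq. (5.73), ordering S ⊗ E
U = expm(-1j * tau * V)
rhoE = np.diag([f, 1 - f])                            # Eq. (5.72)

def collide(rhoS):
    big = U @ np.kron(rhoS, rhoE) @ U.conj().T
    return big.reshape(2, 2, 2, 2).trace(axis1=1, axis2=3)   # trace out E

D = lambda r, A: A @ r @ A.conj().T - 0.5*(A.conj().T @ A @ r + r @ A.conj().T @ A)
rho_c = np.diag([1.0, 0.0]); rho_l = rho_c.astype(complex)
for _ in range(400):
    rho_c = collide(rho_c)
    rho_l = rho_l + tau*(g**2*tau*(1-f)*D(rho_l, L) + g**2*tau*f*D(rho_l, L.conj().T))

print(np.max(np.abs(rho_c - rho_l)))   # small: the mismatch vanishes as g*tau -> 0
```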
Chapter 6
Continuous variables
The basic objects of this chapter are the annihilation and creation operators, a and a†, which satisfy the algebra
$$[a, a^\dagger] = 1. \qquad (6.1)$$
All properties of these operators and the Hilbert space they represent follow from this
simple commutation relation, as we will see below. Another set of operators which can
be used as the starting point of the discussion are the position and momentum operators
q and p. They satisfy
[q, p] = i. (6.2)
In quantum optics, they no longer represent position and momentum, but are related
to the electric and magnetic fields. In this case they are usually called quadrature
operators. We define q and p to be dimensionless. Then they are related to the creation
and annihilation operators according to
$$q = \frac{1}{\sqrt{2}}(a^\dagger + a), \qquad p = \frac{i}{\sqrt{2}}(a^\dagger - a) \qquad\Longleftrightarrow\qquad a = \frac{1}{\sqrt{2}}(q + ip), \qquad a^\dagger = \frac{1}{\sqrt{2}}(q - ip). \qquad (6.3)$$
From this it can be clearly seen that q and p are Hermitian operators, even though a
is not. Also, please take a second to verify that with this relation Eq. (6.1) indeed
implies (6.2) and vice-versa.
Mechanical oscillators
The operators a, a† , q and p appear in two main contexts: mechanical oscillators
and second quantization. The latter will be discussed below. A mechanical oscillator
is specified by the Hamiltonian
$$H = \frac{P^2}{2m} + \frac{1}{2}m\omega^2 Q^2, \qquad (6.4)$$
2m 2
where m is the mass and ω is the frequency. Moreover Q and P are the position and
momentum operators satisfying
$$[Q, P] = i\hbar. \qquad (6.5)$$
Now define the dimensionless operators
$$q = \sqrt{\frac{m\omega}{\hbar}}\, Q, \qquad p = \frac{P}{\sqrt{m\hbar\omega}}. \qquad (6.6)$$
Then Eq. (6.5) implies that q and p will satisfy (6.2). In terms of q and p, the Hamilto-
nian (6.4) becomes
$$H = \frac{\hbar\omega}{2}(p^2 + q^2), \qquad (6.7)$$
which, you have to admit, is way more elegant than (6.4). Using now Eq. (6.3) we
finally write the Hamiltonian as
$$H = \hbar\omega\Big(a^\dagger a + \frac{1}{2}\Big). \qquad (6.8)$$
Eqs. (6.7) and (6.8) show very well why ℏ is not important: it simply redefines the
energy scale. If we set ℏ = 1, as we shall henceforth do, we are simply measuring
energy in units of frequency.
In the days of Schrödinger, harmonic oscillators were usually used either as toy
models or as an effective description of some other phenomena such as, for instance,
the vibration of molecules. In the last two decades this has changed, and we are now
able to observe quantum effects in actual mesoscopic (nano- or micro-) mechanical
oscillators. This is usually done by engineering thin suspended membranes, which can
then undergo mechanical vibrations. This field is usually known as optomechanics
since most investigations involve the contact of the membranes with radiation. I find it
absolutely fascinating that in our day and age we can observe quantum effects as awe-
some as entanglement and coherence in these mechanical objects. I love the century
we live in!
An algebraic problem
In Eq. (6.8) we see the appearance of the Hermitian operator a† a, called the num-
ber operator. To find the eigenstuff of H we therefore only need to know the eigenstuff
of a† a. We have therefore arrived at a very clean mathematical problem: given a non-
Hermitian operator a, satisfying [a, a† ] = 1, find the eigenvalues and eigenvectors of
a† a. This is a really important problem that appears often in all areas of quantum
physics: given an algebra, find the eigenstuff. Maybe you have seen this before, but I
will nonetheless do it again, because I think this is one of those things that everyone
should know.
Here we go. Since a† a is Hermitian, its eigenvalues must be real and its eigenvec-
tors can be chosen to form an orthonormal basis. Let us write them as
$$a^\dagger a\,|n\rangle = n|n\rangle.$$
Our goal is to find the allowed n and the corresponding |n⟩. The first thing we notice is
that a†a must be a positive semi-definite operator, so n cannot be negative:
$$n = \langle n|a^\dagger a|n\rangle = \big\| a|n\rangle \big\|^2 \geq 0.$$
Next we use Eq. (6.1) to show that
$$[a^\dagger a, a] = -a, \qquad [a^\dagger a, a^\dagger] = a^\dagger.$$
This type of structure is a signature of a ladder-like spectrum (that is, one where the eigenvalues are equally spaced). To see that, we use these commutation relations to compute:
(a† a)a|ni = [a(a† a) − a]|ni = a(a† a − 1)|ni = (n − 1)a|ni.
Hence, we conclude that if |ni is an eigenvector with eigenvalue n, then a|ni is also
an eigenvector, but with eigenvalue (n − 1) [This is the key argument. Make sure you
understand what this sentence means.]. However, I wouldn’t call this |n − 1i just yet
because a|ni is not normalized. Thus we need to write
|n − 1i = γa|ni,
where γ is a normalization constant. To find it we simply write
$$1 = \langle n-1|n-1\rangle = |\gamma|^2\,\langle n|a^\dagger a|n\rangle = |\gamma|^2\, n.$$
Thus |γ|² = 1/n. The actual sign of γ is arbitrary, so we choose it for simplicity as
being real and positive. We then get
$$|n-1\rangle = \frac{a}{\sqrt{n}}\,|n\rangle.$$
From this analysis we conclude that a reduces the eigenvalues by unity:
$$a|n\rangle = \sqrt{n}\,|n-1\rangle.$$
An analogous computation gives $(a^\dagger a)a^\dagger|n\rangle = a^\dagger(a^\dagger a + 1)|n\rangle = (n+1)\,a^\dagger|n\rangle$. Thus a† raises the eigenvalue by unity. The normalization factor is found by a similar
procedure: we write $|n+1\rangle = \beta a^\dagger|n\rangle$, for some constant β, and then compute
$$1 = \langle n+1|n+1\rangle = |\beta|^2\langle n|aa^\dagger|n\rangle = |\beta|^2(n+1).$$
Thus
$$a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle.$$
These results are important, so let me summarize them in a boxed equation:
$$\boxed{\; a|n\rangle = \sqrt{n}\,|n-1\rangle, \qquad a^\dagger|n\rangle = \sqrt{n+1}\,|n+1\rangle \;} \qquad (6.11)$$
From this formula we can see why the operators a and a† also receive the name lower-
ing and raising operators.
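These relations are easy to verify with truncated matrices (the truncation dimension nmax below is arbitrary; the algebra only fails at the last Fock level):

```python
import numpy as np

nmax = 6
a = np.diag(np.sqrt(np.arange(1, nmax)), k=1)    # <n-1| a |n> = sqrt(n)
adag = a.conj().T

assert np.allclose(np.diag(adag @ a), np.arange(nmax))   # a†a |n> = n |n>

ket = lambda n: np.eye(nmax)[:, n]
n = 3
assert np.allclose(a @ ket(n), np.sqrt(n) * ket(n - 1))        # Eq. (6.11)
assert np.allclose(adag @ ket(n), np.sqrt(n + 1) * ket(n + 1))

comm = a @ adag - adag @ a                       # [a, a†] = 1, except the corner
assert np.allclose(comm[:-1, :-1], np.eye(nmax - 1))
```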
Now comes the trickiest (and most beautiful) argument. We have seen that if n is
an eigenvalue, then n ± 1, n ± 2, etc., will all be eigenvalues. But this doesn’t mean
that n itself should be an integer. Maybe we find one eigenvalue which is 42.42 so that
the eigenvalues are 41.42, 43.42 and so on. Of course, you know that is not true and
n must be integer. To show that, we proceed as follows. Suppose we start with some
eigenstate |n⟩ and keep on applying a a bunch of times. At each application we will
lower the eigenvalue by one tick: $a^\ell|n\rangle \propto |n-\ell\rangle$.
But this crazy party cannot continue forever because, as we have just discussed, the
eigenvalues of a†a cannot be negative. They can, at most, be zero. The only way for
this to happen is if there exists a certain integer ℓ for which $a^\ell|n\rangle \neq 0$ but $a^{\ell+1}|n\rangle = 0$.
And this can only happen if ℓ = n because then $a^{\ell+1}|n\rangle \propto a|n-\ell\rangle = \sqrt{n-\ell}\,|n-\ell-1\rangle$,
and the term n − ℓ will vanish. Since ℓ is an integer, we therefore conclude that n must
also be an integer. Thus, we finally conclude that the eigenvalues of a†a are the non-negative integers, n = 0, 1, 2, . . . .
It is for this reason that a†a is called the number operator: we usually say a†a counts the
number of quanta in a given state: given a state |n⟩, you first apply a to annihilate one
quantum and then a† to create it back again. The proportionality factor is the eigenvalue
n. Curiously, this analysis seem to imply that if you want to count how many people
there are in a room, you first need to annihilate one person and then create a fresh new
human. Quantum mechanics is indeed strange.
This analysis also serves to define the state with n = 0, which we call the vacuum,
|0i. It is defined by
a|0i = 0. (6.13)
We can build all states starting from the vacuum and applying a† successively:
$$|n\rangle = \frac{(a^\dagger)^n}{\sqrt{n!}}\,|0\rangle. \qquad (6.14)$$
Using this and the algebra of a and a† it then follows that the states |ni form an or-
thonormal basis, as expected:
hn|mi = δn,m .
The states |ni are called Fock states, although this nomenclature is more correctly
employed in the case of multiple modes, as we will now discuss.
That is, ai with a†i behaves just like before, whereas ai with a†j commute if j , i. More-
over annihilation operators always commute among themselves. Taking the adjoint of
[ai , a j ] = 0 we see that the same will be true for the creation operators [a†i , a†j ] = 0.
Using the same transformation as in Eq. (6.3), but with indices everywhere, we can
also define quadrature operators $q_i$ and $p_i$, which will then satisfy $[q_i, p_j] = i\delta_{ij}$.
Multi-mode systems can appear in mechanical contexts. For instance, consider two
mechanical oscillators coupled by springs, as in Fig. 6.1. Each oscillator has a natural
frequency ω1 and ω2 and they are coupled by a spring constant k. Assuming unit mass,
the Hamiltonian will then be
$$H = \frac{p_1^2}{2} + \frac{\omega_1^2}{2}q_1^2 + \frac{p_2^2}{2} + \frac{\omega_2^2}{2}q_2^2 + \frac{k}{2}(q_1 - q_2)^2. \qquad (6.17)$$
If we want we can also transform this into ai and a†i , or we can extend it to multiple
oscillators forming a chain. In fact, these “harmonic chains” are a widely studied
topic in the literature because they can always be solved analytically and they are the
starting point for a series of interesting quantum effects. We will have the opportunity
to practice with some of these solutions later on.
But by far the most important use of multi-mode systems is in second quantiza-
tion. Since operators pertaining to different modes commute, the Hilbert space of a
multi-mode system will be described by a basis $|n_1, n_2, \ldots\rangle$.
These are called Fock states and are the eigenstates of the number operators $a_i^\dagger a_i$:
$$a_i^\dagger a_i\,|n_1, n_2, \ldots\rangle = n_i\,|n_1, n_2, \ldots\rangle.$$
where {A, B} = AB + BA is the anti-commutator. If we repeat the diagonalization pro-
cedure of the last section for this kind of algebra we will find a similar “Fock structure”
but with the only allowed eigenvalues being ni = 0 and ni = 1.
The most important bosonic system is the electromagnetic field. The excitations
are then the photons and the modes are usually chosen to be the momentum and polar-
ization. Hence, we usually write an annihilation operator as ak,λ where k = (k x , ky , kz )
is the momentum and λ = ±1 is the polarization. Moreover, the Hamiltonian of the
electromagnetic field is written as
$$H = \sum_{k,\lambda} \omega_k\, a_{k,\lambda}^\dagger a_{k,\lambda}, \qquad (6.21)$$
where ωk is the frequency of each mode and is given by1 ωk = c|k| where c is the
speed of light.
You have noticed that my discussion of second quantization was rather shallow. I
apologize for that. But I have to do it like this, otherwise we would stray too far. Second
quantization is covered in many books on condensed matter, quantum many-body physics and
quantum field theory. A book which I really like is Feynman’s “Statistical Mechanics:
A Set of Lectures”.
Optical cavities
Many controlled experiments take place inside optical cavities, like the one repre-
sented in my amazing drawing in Fig. 6.2 (it took me 30 minutes to draw it!). The
cavity is made up of highly reflective mirrors allowing the photons to survive for some
time, forming standing wave patterns. Unlike in free space, where all radiation modes
can exist equally, the confinement inside the cavity favors those radiation modes whose
frequencies are close to the cavity frequency ωc , which is related to the geometry of the
cavity. It is therefore common to consider only one radiation mode, with operator a
and frequency ωc .
The photons always have a finite lifetime so more photons need to be injected all
the time. This is usually done by making one of the mirrors semi-transparent and
pumping it with a laser from the outside, with frequency ω p . Of course, since photons
can come in, they can also leak out. This leakage is an intrinsically irreversible process
and can only be described using the theory of open quantum systems, discussed in the
previous chapter. Hence, we will omit the process of photon losses for now.
The Hamiltonian describing a single mode pumped externally by a laser then has the
form
$$H = \omega_c a^\dagger a + \epsilon a^\dagger e^{-i\omega_p t} + \epsilon^* a\, e^{i\omega_p t}, \qquad (6.22)$$
¹ If we define ω = 2πν and |k| = 2π/λ, we see that this is nothing but the relation c = λν that you learned in high school.
Figure 6.2: An optical cavity of frequency ωc , pumped from the outside by a laser of frequency
ωp.
Figure 6.3: (a) Typical scenario for light-matter interaction: an atom, modeled as a two-level
system, is placed inside a cavity in which there is only one cavity mode. The atom
then absorbs and emits photons jumping up and down from the ground-state to the
excited state. (b) The cavity field is represented by a harmonic oscillator of fre-
quency ωc . (c) The atom is represented as a two-level system (qubit) with energy
gap Ω. When the atom Hamiltonian is +σz then the ground-state will be |1i and the
excited state will be |0i.
where ε is the pump amplitude, which is related to the laser power P according to |ε|² =
γP/ℏω_p, where γ is the cavity loss rate (the rate at which photons can go through
the semi-transparent mirror). This Hamiltonian is very simple, but is time-dependent.
Lucky for us, however, this time dependence can be eliminated using the concept of a
rotating frame, as will be discussed below.
The Jaynes–Cummings model, illustrated in Fig. 6.3, reads
$$H = \omega_c a^\dagger a + \frac{\Omega}{2}\sigma_z + \lambda(a\sigma_+ + a^\dagger\sigma_-). \qquad (6.23)$$
The first two terms are the free Hamiltonians of the cavity field, with frequency ωc , and
the atom, with energy gap Ω. Whenever the atom Hamiltonian is written as +σz , the
ground-state will be |gi = |1i and the excited state will be |ei = |0i [see Fig. 6.3(c)].
Finally, the last term in (6.23) is the light-atom coupling. The term aσ+ describes the
process where a photon is annihilated and the atom jumps to the excited state. Simi-
larly, a† σ− describes the opposite process. The Hamiltonian must always be Hermitian
so every time we include a certain type of process, we must also include its reverse.
The type of interaction in Eq. (6.23) introduces a special symmetry to the problem.
Namely, it conserves the number of quanta in the system:
$$\big[H,\; a^\dagger a + \tfrac{1}{2}\sigma_z\big] = 0. \qquad (6.24)$$
This means that if you start the evolution with 7 photons and the atom in the ground-
state, then at all times you will either have 7 photons + ground-state or 6 photons and
the atom in the excited state. This is a very special symmetry and is the reason why the
Jaynes-Cummings model turns out to be easy to deal with.
However, if we start with a physical derivation of the light-atom interaction, we
will see that it is not exactly like the Jaynes-Cummings Hamiltonian (6.23). Instead, it
looks more like the Rabi model
$$H = \omega_c a^\dagger a + \frac{\Omega}{2}\sigma_z + \lambda(a + a^\dagger)\sigma_x. \qquad (6.25)$$
The difference is only in the last term. In fact, if we recall that σ x = σ+ + σ− , we get
(a + a† )σ x = (aσ+ + a† σ− ) + (a† σ+ + aσ− ).
The first term in parenthesis is exactly the Jaynes-Cummings interaction, so the new
thing here is the term (a† σ+ + aσ− ). It describes a process where the atom jumps to the
excited state and emits a photon, something which seems rather strange at first. More-
over, this new term destroys the pretty symmetry (6.24), making the Rabi model much
more complicated to deal with, but also much richer from a physical point of view.
Notwithstanding, as we will see below, if λ is small compared to ωc , Ω this new term
becomes negligible and the Rabi model approximately tends to the JC Hamiltonian.
where H(t) is a possibly time-dependent Hamiltonian. We can always move to a rotating frame by defining a new density matrix
$$\tilde\rho(t) = S(t)\,\rho(t)\,S^\dagger(t), \qquad (6.27)$$
where S(t) is an arbitrary unitary. I will leave it to you as an exercise to show that ρ̃ will
also obey a von Neumann equation
$$\frac{d\tilde\rho}{dt} = -i[\tilde H(t), \tilde\rho], \qquad (6.28)$$
but with an effective Hamiltonian2
$$\tilde H(t) = i\frac{dS}{dt}S^\dagger + S H S^\dagger. \qquad (6.29)$$
Thus, we see that in any rotating frame the system always obeys von Neumann’s (or
Schrödinger’s) equation, but the Hamiltonian changes from H(t) to H̃(t). Note that this
result is absolutely general and holds for any unitary S (t). Of course, whether it is
useful or not will depend on your smart choice for S (t).
Before we move to applications, I need to mention that computing the first term
in Eq (6.29) can be tricky. Usually we write unitaries as S (t) = eiK(t) where K is
Hermitian. Then, one may easily verify the following BCH expansion
$$\frac{de^{iK}}{dt}\,e^{-iK} = i\frac{dK}{dt} + \frac{i^2}{2}\Big[K, \frac{dK}{dt}\Big] + \frac{i^3}{3!}\Big[K, \Big[K, \frac{dK}{dt}\Big]\Big] + \ldots. \qquad (6.30)$$
The important point here is whether or not K commutes with dK/ dt. If that is the case
then only the first term survives and things are easy and pretty. Otherwise, you may
get an infinite series. I strongly recommend you always use this formula, because then
you are always sure you will not get into trouble.
Eliminating time-dependences
A simple yet useful application of rotating frames is to eliminate the time-dependence
of certain simple Hamiltonians, such as the pumped cavity (6.22). In this case the uni-
tary that does the job is
$$S(t) = e^{i\omega_p t\, a^\dagger a}. \qquad (6.31)$$
² To derive this equation it is necessary to use the following trick: since SS† = 1, then
$$0 = \frac{d(SS^\dagger)}{dt} = \frac{dS}{dt}S^\dagger + S\frac{dS^\dagger}{dt} \;\longrightarrow\; S\frac{dS^\dagger}{dt} = -\frac{dS}{dt}S^\dagger.$$
That is, we move to a frame that is rotating at the same frequency as the pump laser
ω p . Using the BCH expansion (1.70) one may show that
$$e^{i\alpha a^\dagger a}\, a\, e^{-i\alpha a^\dagger a} = e^{-i\alpha}\,a, \qquad e^{i\alpha a^\dagger a}\, a^\dagger\, e^{-i\alpha a^\dagger a} = e^{i\alpha}\,a^\dagger, \qquad (6.32)$$
which are easy to remember: a goes with negative α and a† with positive α. It then
follows that
$$S(t)\,\big(\epsilon a^\dagger e^{-i\omega_p t} + \epsilon^* a\, e^{i\omega_p t}\big)\,S^\dagger(t) = \epsilon a^\dagger + \epsilon^* a,$$
while S (t) has no effect on a† a. Moreover, this is one of those cases where only the
first term in (6.30) contributes:
$$\frac{dS}{dt}S^\dagger = i\omega_p\, a^\dagger a.$$
Thus Eq. (6.29) becomes
$$\tilde H = (\omega_c - \omega_p)\,a^\dagger a + \epsilon a^\dagger + \epsilon^* a, \qquad (6.33)$$
which is time-independent. Note that a Hamiltonian containing, for instance, a term $(a + a^\dagger)^4$ would not have a time-independent rotating frame under the transformation (6.31), because if you expand $(a + a^\dagger)^4$ there will be terms with an unbalanced number of a's and a†'s.
A similar rotating frame transformation also works for qubit systems of the form
$$H = \frac{\Omega}{2}\sigma_z + \frac{\lambda}{2}\big(\sigma_+ e^{-i\omega_p t} + \sigma_- e^{i\omega_p t}\big) \qquad (6.34)$$
$$\;= \frac{\Omega}{2}\sigma_z + \frac{\lambda}{2}\big[\sigma_x\cos(\omega_p t) + \sigma_y\sin(\omega_p t)\big]. \qquad (6.35)$$
This Hamiltonian appears often in magnetic resonance because it represents a spin 1/2
particle subject to a constant field Ω in the z direction and a rotating field λ in the xy
plane. Remarkably, the transformation here is almost exactly as in the bosonic case:
In this case the idea of a rotating frame becomes a bit more intuitive: the Hamiltonian
is time-dependent because there is a field rotating in the xy plane. So to get rid of it,
we go to a frame that is rotating around the z axis by an angle ω p t. I will leave for
you to check that this S (t) indeed does the job. One thing that is useful to know is that
Eq. (6.32) is translated almost literally to the spin case:
Interaction picture
Now let us consider another scenario. Suppose the Hamiltonian is time-independent
but can be written in the standard perturbation-theory style
$$H = H_0 + V. \qquad (6.38)$$
If we choose $S(t) = e^{iH_0 t}$, then Eq. (6.29) gives the rotating frame Hamiltonian
$$\tilde H(t) = S V S^\dagger = e^{iH_0 t}\, V\, e^{-iH_0 t}.$$
This is the interaction picture: we eliminate the dependence on H₀ at the cost of transforming a time-independent Hamiltonian H₀ + V into a time-dependent Hamiltonian
SVS†.
The interaction picture is usually employed as the starting point of time-dependent
perturbation theory. We will learn a bit more about this below. But to get a first glimpse,
consider the Rabi Hamiltonian (6.25) and let us move to the interaction picture with
respect to $H_0 = \omega_c a^\dagger a + \frac{\Omega}{2}\sigma_z$. Using Eqs. (6.32) and (6.37) we then find
$$\tilde H(t) = \lambda\big(a\sigma_+ e^{i(\Omega-\omega_c)t} + a^\dagger\sigma_- e^{-i(\Omega-\omega_c)t}\big) + \lambda\big(a^\dagger\sigma_+ e^{i(\Omega+\omega_c)t} + a\sigma_- e^{-i(\Omega+\omega_c)t}\big). \qquad (6.41)$$
In the interaction picture we see more clearly the difference between the two types of
couplings. The first term, which is the Jaynes-Cummings coupling, oscillates in time
with a frequency Ω − ω_c, which will be very small when Ω is close to ω_c. The second
term, on the other hand, oscillates quickly with frequency Ω + ω_c, which is in general
much faster than Ω − ω_c. We therefore see the appearance of two time scales:
the JC term, which is slow, and the Rabi term, which gives rise to fast oscillations.
Eq. (6.41) is frequently used as the starting point to justify why sometimes we can
throw away the last term (and hence obtain the Jaynes-Cummings model (6.23) from
the Rabi model). The idea is called the rotating-wave approximation (RWA) and is
motivated by the fact that if Ω + ω is very large, the last terms will oscillate rapidly
around zero average and hence will have a small contribution to the dynamics. But this
explanation is only partially convincing, so be careful. At the end of the day, the RWA is
really an argument of time-dependent perturbation theory. Hence, it will only be good
when λ is small compared to ω_c and Ω. Thus, the RWA is better stated as follows: if
λ ≪ ω_c, Ω and ω_c ∼ Ω, it is reasonable to throw away the fast oscillating terms in the
interaction picture. For an interesting discussion of the connection with perturbation theory,
see the Appendix in arXiv 1601.07528.
Heisenberg picture
In the interaction picture we started with a Hamiltonian H = H0 + V and went to
a rotating frame with H0 . In the Heisenberg picture, we go all the way through. That
is, we go to a rotating frame (6.29) with S (t) = eiHt . For now I will assume H is time-
independent, but the final result also holds in the time-dependent case. As a result we
find
H̃ = 0 (6.42)
Consequently, the solution of the rotating frame Eq. (6.28) will be simply
ρ̃(t) = ρ̃(0) = ρ(0). (6.43)
But by Eq. (6.27) we have ρ̃(t) = S (t)ρ(t)S † (t) so we get
ρ(t) = S † (t)ρ(0)S (t) = e−iHt ρ(0)eiHt . (6.44)
You may now be thinking “DUH! This is just the solution of von Neumann’s
equation!”. Yes, that’s exactly the point. The solution of von Neumann’s equation is
exactly that special rotating frame where time stands still (like in the Rush song!).
In the Heisenberg picture we usually transfer the time-dependence to the operators,
instead of the states. Recall that given an arbitrary operator A, its expectation value
will be ⟨A⟩ = tr(Aρ). Using Eq. (6.44) we then get
$$\langle A\rangle = \operatorname{tr}\big[A\, e^{-iHt}\rho(0)\,e^{iHt}\big] = \operatorname{tr}\big[e^{iHt} A\, e^{-iHt}\,\rho(0)\big]. \qquad (6.45)$$
This formula summarizes well the Schrödinger vs. Heisenberg ambiguity. It provides
two equivalent ways to compute hAi. In the first, which is the usual Schrödinger picture
approach, the state ρ(t) evolves in time and A is time-independent. In the second, the
state ρ is fixed at ρ(0) and we transfer the time evolution to the operator. It is customary
to define the Heisenberg operator
AH (t) = A(t) = eiHt Ae−iHt . (6.46)
Some people write A_H(t) to emphasize that this is different from A. What I usually do
is just be careful to always write the time argument in A(t).
By direct differentiation one may verify that the operator A(t) satisfies the Heisen-
berg equation
$$\frac{dA(t)}{dt} = i[H, A(t)]. \qquad (6.47)$$
This is to be interpreted as an equation for the evolution of the operator A(t). If what
you are interested is instead the evolution of the expectation value hAit , then it doesn’t
matter which picture you use. In the Heisenberg picture, Eq. (6.47) directly gives you
$$\frac{d\langle A\rangle}{dt} = i\big\langle [H, A]\big\rangle. \qquad (6.48)$$
But you can also get the same equation in the Schrödinger picture using the von Neu-
mann equation:
$$\frac{d\langle A\rangle}{dt} = \operatorname{tr}\Big(A\frac{d\rho}{dt}\Big) = -i\operatorname{tr}\big(A[H,\rho]\big) = i\operatorname{tr}\big([H,A]\rho\big),$$
where, in the last line, all I did was rearrange the commutator using the cyclic property
of the trace.
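A small numerical check of Eq. (6.45); the Hamiltonian, observable and state below are arbitrary choices:

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[0., 1.], [1., 0.]])     # sigma_x
A = np.diag([1., -1.])                 # sigma_z
rho0 = np.array([[0.9, 0.1], [0.1, 0.1]])
t = 0.8
Ut = expm(-1j * H * t)

rho_t = Ut @ rho0 @ Ut.conj().T        # Schrödinger picture: evolve the state
exp_S = np.trace(A @ rho_t)

A_t = Ut.conj().T @ A @ Ut             # Heisenberg picture, Eq. (6.46)
exp_H = np.trace(A_t @ rho0)

assert np.isclose(exp_S, exp_H)        # same expectation value either way
```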
Of course, this discretization is just a trick. We can now take ∆t → 0 and we will have
solved for the most general time-dependent Hamiltonian.
If we define the time-evolution operator according to
$$|\psi(t)\rangle = U(t, t_0)\,|\psi(t_0)\rangle, \qquad (6.52)$$
then, since this becomes exact when Δt → 0, we conclude that this is the general solution
of the time-dependent problem. Admittedly, this solution is still quite a mess and part
of our effort below will be to clean it up a bit. But if you ever wonder “what is the
solution with a time-dependent Hamiltonian?”, I recommend you think about (6.53).
It is interesting to note that this operator U(t, t0 ) satisfies all properties of its time-
independent cousin:
$$U(t_0, t_0) = 1, \qquad (6.54)$$
$$U(t, t_1)\,U(t_1, t_0) = U(t, t_0). \qquad (6.55)$$
Eq. (6.55) is particularly important, because it shows that even in the time-dependent
case the solution can still be broken down in pieces.
The important point that must be remembered concerning Eq. (6.53) is that in gen-
eral you cannot recombine the exponentials since the Hamiltonian at different times
may not commute:
$$\text{in general:}\quad [H(t), H(t')] \neq 0. \qquad (6.58)$$
If, on the other hand, the Hamiltonian at different times does commute, then the problem is very easy and Eq. (6.53) becomes
$$U(t, t_0) = \exp\Big\{-i\Delta t\sum_{n=M}^{N} H(n\Delta t)\Big\} = \exp\Big\{-i\int_{t_0}^{t} H(t')\,dt'\Big\},$$
where, in the last line, I already took the limit ∆t → 0 and transformed the sum to an
integral.
However, if H(t) does not commute at different times, this solution is incorrect.
Instead, we can use a trick to write down the solution in a way that looks formally
similar. We define the time-ordering operator T such that, when acting on any set of
time-dependent operators, it always puts later times to the left:
$$\mathcal{T}\, A(t_1)A(t_2) = \begin{cases} A(t_1)A(t_2) & \text{if } t_1 > t_2 \\ A(t_2)A(t_1) & \text{if } t_2 > t_1 \end{cases} \qquad (6.59)$$
This time-ordering operator can now be used to combine exponentials. Recall the
Zassenhaus (BCH) formula,
$$e^{t(A+B)} = e^{tA}\,e^{tB}\,e^{-\frac{t^2}{2}[A,B]}\,e^{\frac{t^3}{3!}(2[B,[A,B]]+[A,[A,B]])}\cdots. \qquad (6.60)$$
Consequently, if we expand $e^{A(t_2)+B(t_1)}$ and then apply $\mathcal{T}$, the only term that will survive
will be $e^{A(t_2)}e^{B(t_1)}$. Hence,
$$e^{A(t_2)}\,e^{B(t_1)} = \mathcal{T}\, e^{A(t_2)+B(t_1)}. \qquad (6.61)$$
Within the protection of the time-ordering operator, we can freely recombine exponen-
tials.
Using this time-ordering trick we may now recombine all terms in the product (6.53),
leading to
$$U(t, t_0) = \mathcal{T}\exp\Big\{-i\int_{t_0}^{t} H(t')\,dt'\Big\}, \qquad (6.62)$$
where I already transformed this into an integral. This is the way we usually write the
formal solution of a time-dependent problem. The time-ordering operator T is just a
compact way to write down the solution in Eq. (6.53). If you are ever confused about
how to operate with it, go back to Eq. (6.53). Finally, let me mention that Eq. (6.62)
can also be viewed as the solution of the initial value problem
$$\frac{dU(t,t_0)}{dt} = -iH(t)\,U(t,t_0), \qquad U(t_0,t_0) = 1. \qquad (6.63)$$
This may not be so evident from Eq. (6.62), but it is if we substitute Eq. (6.52) into (6.49).
Magnus expansion
We are now at a good point to discuss time-dependent perturbation theory. The
scenario is as follows. We start with H₀ + V and move to the interaction picture, where
the rotating frame Hamiltonian becomes the time-dependent operator (6.40). We then
try to solve the von Neumann equation for this operator. Or, what is equivalent, we try
to find the time-evolution operator Ũ(t, t0 ) which, as in (6.63), will be the solution of
dŨ(t, t0 )
= −iH̃(t)Ũ(t, t0 ), Ũ(t0 , t0 ) = 1. (6.64)
dt
There are many ways to do this. Sometimes the perturbation theory is done in terms of
states and sometimes it is done in terms of operators (in which case it is called a Dyson
series).
Here I will try to do it in a slightly different way, using something called a Magnus expansion. Parametrize the time-evolution operator as
$$\tilde{U}(t,t_0) = e^{-i\Omega(t,t_0)}, \qquad \Omega(t_0,t_0) = 0, \tag{6.65}$$
where Ω(t, t₀) is an operator to be determined. To find an equation for it, we first multiply Eq. (6.64) by Ũ† on the right, leading to
$$\frac{de^{-i\Omega}}{dt}\; e^{i\Omega} = -i\tilde{H}(t).$$
Then we use Eq. (6.30) to find
$$\dot{\Omega} - \frac{i}{2}[\Omega, \dot{\Omega}] - \frac{1}{3!}[\Omega, [\Omega, \dot{\Omega}]] + \ldots = \tilde{H}(t), \tag{6.66}$$
which is a really weird equation for Ω(t, t₀).
We now write this in perturbation-theory style by assuming that H̃(t) → εH̃(t), where ε is a small parameter. Moreover, we expand Ω as
$$\Omega = \epsilon\,\Omega_1 + \epsilon^2\,\Omega_2 + \epsilon^3\,\Omega_3 + \ldots \tag{6.67}$$
Substituting in Eq. (6.66) and collecting terms of the same order in ε, we are then led to a system of equations,
$$\dot{\Omega}_1 = \tilde{H}(t), \tag{6.68}$$
$$\dot{\Omega}_2 = \frac{i}{2}[\Omega_1, \dot{\Omega}_1], \tag{6.69}$$
$$\dot{\Omega}_3 = \frac{i}{2}[\Omega_1, \dot{\Omega}_2] + \frac{i}{2}[\Omega_2, \dot{\Omega}_1] + \frac{1}{3!}[\Omega_1, [\Omega_1, \dot{\Omega}_1]], \tag{6.70}$$
and so on. These can now be solved sequentially, leading to
$$\Omega_1(t) = \int_{t_0}^{t} dt_1\; \tilde{H}(t_1), \tag{6.71}$$
$$\Omega_2(t) = -\frac{i}{2}\int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2\; [\tilde{H}(t_1), \tilde{H}(t_2)], \tag{6.72}$$
$$\Omega_3(t) = -\frac{1}{6}\int_{t_0}^{t} dt_1 \int_{t_0}^{t_1} dt_2 \int_{t_0}^{t_2} dt_3\; \Big([\tilde{H}(t_1),[\tilde{H}(t_2),\tilde{H}(t_3)]] + [\tilde{H}(t_3),[\tilde{H}(t_2),\tilde{H}(t_1)]]\Big). \tag{6.73}$$
This is the Magnus expansion. Higher-order terms become more and more cumbersome. From this one may obtain the Dyson series by expanding Eq. (6.65) in a Taylor series.
It is also important to note that if the Hamiltonian commutes at different times, then
the series truncates at the first term. If this were always the case, there would be no
need for perturbation theory at all. The need for time-dependent perturbation theory is
really a consequence of the non-commutativity of H̃ at different times.
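It can be instructive to see how good the truncated expansion is in practice. Below is a hedged numerical sketch (the driven-qubit Hamiltonian is an assumed example, not taken from the text) comparing the exact propagator with e^{−i(Ω₁+Ω₂)}, the Magnus expansion truncated at second order:

```python
import numpy as np
from scipy.linalg import expm

# Assumed example: H(t) = (w/2) sigma_z + eps*cos(nu*t) sigma_x.
sz = np.diag([1.0, -1.0]).astype(complex)
sx = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)
w, eps, nu, T = 1.0, 0.2, 0.9, 3.0
H = lambda s: 0.5 * w * sz + eps * np.cos(nu * s) * sx

# "Exact" propagator: fine-grained time-ordered product, cf. Eq. (6.53).
steps = 20000; dt = T / steps
U = np.eye(2, dtype=complex)
for n in range(steps):
    U = expm(-1j * dt * H((n + 0.5) * dt)) @ U

# Magnus terms Omega_1 and Omega_2, Eqs. (6.71)-(6.72), by nested quadrature.
M = 400; ds = T / M
s = (np.arange(M) + 0.5) * ds
Hs = np.array([H(si) for si in s])
O1 = Hs.sum(axis=0) * ds
O2 = np.zeros((2, 2), dtype=complex)
for i in range(M):          # t1 = s[i]
    for j in range(i):      # t2 = s[j] < t1
        O2 += Hs[i] @ Hs[j] - Hs[j] @ Hs[i]
O2 *= -0.5j * ds * ds

print(np.abs(expm(-1j * (O1 + O2)) - U).max())  # small for weak driving
```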
As an example, consider the Rabi model (a qubit of frequency Ω coupled to a cavity mode of frequency ω_c with strength λ). In the interaction picture, the first Magnus term (6.71) works out to
$$\Omega_1(t) = \frac{\lambda}{i(\Omega-\omega_c)}\Big[a\sigma_+\big(e^{i(\Omega-\omega_c)t}-1\big) - a^\dagger\sigma_-\big(e^{-i(\Omega-\omega_c)t}-1\big)\Big] + \frac{\lambda}{i(\Omega+\omega_c)}\Big[a^\dagger\sigma_+\big(e^{i(\Omega+\omega_c)t}-1\big) - a\sigma_-\big(e^{-i(\Omega+\omega_c)t}-1\big)\Big].$$
The rotating-wave approximation scenario is now apparent: when we do perturbation theory, the Jaynes-Cummings terms multiply λ/(Ω − ω_c), whereas the non-JC terms contain λ/(Ω + ω_c). If we are close to resonance (Ω ∼ ω_c) and if λ is small, the first term will be very large and the second very small. Consequently, the second term may be neglected.
We begin by defining the displacement operator
$$D(\alpha) = e^{\alpha a^\dagger - \alpha^* a}, \tag{6.74}$$
where α is an arbitrary complex number and α* is its complex conjugate. The reason why it is called a "displacement" operator will become clear soon. A coherent state is defined as the action of D(α) onto the vacuum state:
$$|\alpha\rangle = D(\alpha)|0\rangle. \tag{6.75}$$
We sometimes say that "a coherent state is a displaced vacuum". This sounds like a typical Star Trek sentence: "Oh no! He displaced the vacuum. Now the entire planet will be annihilated!"

The key property of the displacement operator is that it displaces a:
$$D^\dagger(\alpha)\, a\, D(\alpha) = a + \alpha. \tag{6.79}$$
The coherent state is an eigenstate of a

What I want to do now is apply a to the coherent state |α⟩ in Eq. (6.75). Start with Eq. (6.79) and multiply by D(α) on the left. Since D is unitary, we get aD(α) = D(α)(a + α). Thus
$$a|\alpha\rangle = aD(\alpha)|0\rangle = D(\alpha)(a+\alpha)|0\rangle = \alpha\, D(\alpha)|0\rangle = \alpha|\alpha\rangle,$$
where I used the fact that a|0⟩ = 0. Hence we conclude that the coherent state is an eigenvector of the annihilation operator:
$$a|\alpha\rangle = \alpha|\alpha\rangle.$$
The annihilation operator is not Hermitian, so its eigenvalues do not have to be real. In fact, this equation shows that the eigenvalue of a can be any complex number α.
Using the BCH splitting of the exponential, the displacement operator may also be written as
$$D(\alpha) = e^{-|\alpha|^2/2}\; e^{\alpha a^\dagger}\, e^{-\alpha^* a} = e^{|\alpha|^2/2}\; e^{-\alpha^* a}\, e^{\alpha a^\dagger}. \tag{6.84}$$
This result is useful because now the exponentials of a and a† are completely separated. From this result it follows that
$$D(\alpha)D(\beta) = e^{(\beta^*\alpha - \alpha^*\beta)/2}\; D(\alpha + \beta). \tag{6.85}$$
This means that if you do two displacements in a sequence, it is almost the same as doing just a single displacement; the only thing you get is a phase factor (the quantity in the exponential is purely imaginary).
Poisson statistics
Let us use Eq. (6.84) to write the coherent state a little differently. Since a|0⟩ = 0, it follows that e^{−α*a}|0⟩ = |0⟩. Hence we may also write Eq. (6.75) as
$$|\alpha\rangle = e^{-|\alpha|^2/2}\, e^{\alpha a^\dagger}|0\rangle. \tag{6.86}$$
Now we may expand the exponential and use Eq. (6.14) to write (a†)ⁿ|0⟩ in terms of the number states. We get
$$|\alpha\rangle = e^{-|\alpha|^2/2}\sum_{n=0}^{\infty}\frac{\alpha^n}{\sqrt{n!}}\,|n\rangle. \tag{6.87}$$
Thus we find that
$$\langle n|\alpha\rangle = e^{-|\alpha|^2/2}\,\frac{\alpha^n}{\sqrt{n!}}. \tag{6.88}$$
The probability of finding it in a given state |n⟩, given that it is in a coherent state, is therefore
$$|\langle n|\alpha\rangle|^2 = e^{-|\alpha|^2}\,\frac{(|\alpha|^2)^n}{n!}. \tag{6.89}$$
This is a Poisson distribution with parameter λ = |α|². The photons in a laser are usually in a coherent state, and the Poisson statistics of photon counts can be measured experimentally. If you measure these statistics for thermal light you will find that they are not Poissonian (usually they follow a geometric distribution). Hence, Poisson statistics is a signature of coherent states.
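Here is a minimal numerical sanity check of this statement (the truncation N and the value of α are my choices): build the coherent state from the displacement operator and compare the photon-number distribution with a Poissonian.

```python
import numpy as np
from scipy.linalg import expm
from math import factorial

# Build |alpha> = D(alpha)|0> in a truncated Fock space, then verify Eq. (6.89).
N = 40                                          # truncation (assumption); needs N >> |alpha|^2
alpha = 1.3 + 0.7j
a = np.diag(np.sqrt(np.arange(1, N)), k=1)      # annihilation operator: a|n> = sqrt(n)|n-1>
D = expm(alpha * a.conj().T - np.conj(alpha) * a)   # displacement operator, Eq. (6.74)

vac = np.zeros(N); vac[0] = 1.0
ket = D @ vac                                   # coherent state, Eq. (6.75)

lam = abs(alpha) ** 2
poisson = np.array([np.exp(-lam) * lam**n / factorial(n) for n in range(N)])
print(np.abs(np.abs(ket)**2 - poisson).max())   # ~1e-12, away from the truncation edge
```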
Orthogonality
Coherent states are not orthogonal. To figure out the overlap between two coherent states |α⟩ and |β⟩, we use Eq. (6.86):
$$\langle\beta|\alpha\rangle = e^{-|\beta|^2/2}\, e^{-|\alpha|^2/2}\;\langle 0|e^{\beta^* a}\, e^{\alpha a^\dagger}|0\rangle.$$
We need to exchange the two operators, because we know how a acts on |0⟩ and how a† acts on ⟨0|. To do that we use Eq. (6.83):
$$e^{\beta^* a}\, e^{\alpha a^\dagger} = e^{\alpha a^\dagger}\, e^{\beta^* a}\; e^{\beta^*\alpha}. \tag{6.90}$$
We then get
$$\langle\beta|\alpha\rangle = \exp\left\{\beta^*\alpha - \frac{|\beta|^2}{2} - \frac{|\alpha|^2}{2}\right\}, \tag{6.91}$$
whence
$$|\langle\beta|\alpha\rangle|^2 = e^{-|\alpha-\beta|^2}. \tag{6.92}$$
Hence, the overlap between two coherent states decays exponentially with their distance. For large α and β they therefore become approximately orthogonal. Also, as a sanity check, if β = α then
$$\langle\alpha|\alpha\rangle = 1, \tag{6.93}$$
which we already knew from Eq. (6.75) and the fact that D is unitary. Coherent states
are therefore normalized, but they do not form an orthonormal basis. In fact, they form
an overcomplete basis in the sense that there are more states than actually needed.
Completeness
Even though the coherent states do not form an orthonormal basis, we can still write down a completeness relation for them. However, it looks a little different:
$$\int \frac{d^2\alpha}{\pi}\; |\alpha\rangle\langle\alpha| = 1. \tag{6.94}$$
This integral is over the entire complex plane. That is, if α = x + iy then d²α = dx dy. This is, therefore, just your old-fashioned integral over two variables. The proof of Eq. (6.94) is a little bit cumbersome. You can find it in Gardiner and Zoller.
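Eq. (6.94) can also be probed numerically, at least in a truncated Fock space. A rough sketch (the grid and truncation below are arbitrary choices, and the accuracy is limited by both):

```python
import numpy as np
from math import factorial

# Sum |alpha><alpha| d^2alpha / pi over a grid of the complex plane, Eq. (6.94).
N = 8                       # small truncation; grid must cover |alpha| >> sqrt(N)
xs = np.linspace(-6, 6, 121)
dxdy = (xs[1] - xs[0]) ** 2

resolution = np.zeros((N, N), dtype=complex)
for x in xs:
    for y in xs:
        alpha = x + 1j * y
        # coherent-state amplitudes <n|alpha> from Eq. (6.88)
        ket = np.array([np.exp(-abs(alpha)**2 / 2) * alpha**n / np.sqrt(factorial(n))
                        for n in range(N)])
        resolution += np.outer(ket, ket.conj()) * dxdy / np.pi

print(np.abs(resolution - np.eye(N)).max())   # small (~1e-3 from grid/truncation)
```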
As an application of the completeness relation (6.94), let us compute the trace of the displacement operator:
$$\mathrm{tr}\, D(\lambda) = \int\frac{d^2\alpha}{\pi}\,\langle\alpha|D(\lambda)|\alpha\rangle = \int\frac{d^2\alpha}{\pi}\,\langle 0|D^\dagger(\alpha)D(\lambda)D(\alpha)|0\rangle.$$
But since D(α) is unitary, it infiltrates everywhere:
$$D^\dagger(\alpha)D(\lambda)D(\alpha) = \exp\left\{D^\dagger(\alpha)\left(\lambda a^\dagger - \lambda^* a\right)D(\alpha)\right\} = e^{\lambda\alpha^* - \lambda^*\alpha}\; D(\lambda).$$
Thus we get
$$\mathrm{tr}\, D(\lambda) = \int\frac{d^2\alpha}{\pi}\; e^{\lambda\alpha^*-\lambda^*\alpha}\,\langle 0|D(\lambda)|0\rangle = e^{-|\lambda|^2/2}\int\frac{d^2\alpha}{\pi}\; e^{\lambda\alpha^*-\lambda^*\alpha}, \tag{6.96}$$
where I used the fact that ⟨0|D(λ)|0⟩ = ⟨0|λ⟩ = e^{−|λ|²/2} [Eq. (6.88)].
Writing α = x + iy and λ = λ′ + iλ′′, the exponent becomes λα* − λ*α = 2i(λ′′x − λ′y), so the integral factors into two one-dimensional Fourier integrals. But each one is now a Dirac delta,
$$\int_{-\infty}^{\infty} dx\; e^{ixk} = 2\pi\,\delta(k).$$
Whence
$$\int\frac{d^2\alpha}{\pi}\; e^{\lambda\alpha^* - \lambda^*\alpha} = \pi\,\delta(\lambda), \tag{6.97}$$
where δ(λ) := δ(Re(λ))δ(Im(λ)). This integral is therefore nothing but a two-dimensional Fourier transform in terms of the complex variable α.
Substituting this in Eq. (6.96), we finally conclude that
$$\mathrm{tr}\, D(\lambda) = \pi\,\delta(\lambda), \tag{6.98}$$
where I omitted the factor of e^{−|λ|²/2} since the Dirac delta makes it irrelevant. Using this, one may show that an arbitrary operator F can be decomposed in terms of displacement operators as
$$F = \int\frac{d^2\alpha}{\pi}\; f(\alpha)\, D^\dagger(\alpha), \tag{6.100}$$
where
$$f(\alpha) := \mathrm{tr}\Big[F D(\alpha)\Big]. \tag{6.101}$$
This is just like decomposing a state in a basis, but we are actually decomposing an operator.
There are several phase-space representations of a bosonic state; the three most important are the Husimi-Q function, the Wigner function and the Glauber-Sudarshan P function. Each has its own advantages and disadvantages. Since this chapter is meant to be a first look into this topic, we will focus here on the simplest one of them, the Q function.
The Husimi-Q function is defined as the expectation value of the density matrix in a coherent state,
$$Q(\alpha^*, \alpha) = \frac{1}{\pi}\,\langle\alpha|\rho|\alpha\rangle. \tag{6.102}$$
Using the completeness relation (6.94) we get
$$1 = \mathrm{tr}\,\rho = \int\frac{d^2\alpha}{\pi}\,\langle\alpha|\rho|\alpha\rangle.$$
Thus, we conclude that the Husimi Q function is normalized as
$$\int d^2\alpha\; Q(\alpha^*, \alpha) = 1. \tag{6.103}$$
The Q function can also be used to compute the expectation values of certain operators. For instance,
$$\langle a\rangle = \mathrm{tr}(\rho a) = \int\frac{d^2\alpha}{\pi}\,\langle\alpha|\rho a|\alpha\rangle = \int d^2\alpha\; Q(\alpha^*, \alpha)\,\alpha,$$
$$\langle a a^\dagger\rangle = \mathrm{tr}(a^\dagger \rho a) = \int\frac{d^2\alpha}{\pi}\,\langle\alpha|a^\dagger\rho a|\alpha\rangle = \int d^2\alpha\; Q(\alpha^*, \alpha)\,|\alpha|^2.$$
It is interesting to see here how the ordering of operators plays a role. Suppose you want to compute ⟨a†a⟩. Then you should first reorder it as ⟨a†a⟩ = ⟨aa†⟩ − 1 and then use the above result for ⟨aa†⟩.
More generally, we may obtain a rule for computing the expectation values of anti-normally ordered operators, that is, operators which have all the a†'s to the right. If this is the case, then we can write
$$\langle a^k (a^\dagger)^\ell\rangle = \int d^2\alpha\; \alpha^k (\alpha^*)^\ell\; Q(\alpha^*, \alpha). \tag{6.104}$$
Thus, to compute the expectation value of an arbitrary operator, we should first use the commutation relations to put it in anti-normal order and then use this result.
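As a quick illustration, here is a sketch that evaluates the moments in Eq. (6.104) numerically for a coherent state |µ⟩, whose Husimi function is the Gaussian Q = e^{−|α−µ|²}/π (the form the text later refers to as Eq. (6.106); µ and the grid are my choices):

```python
import numpy as np

# Compute <a a^dagger> = int |alpha|^2 Q d^2alpha on a grid, then obtain
# <a^dagger a> = <a a^dagger> - 1, which should equal |mu|^2.
mu = 0.8 - 0.5j
xs = np.linspace(-6, 6, 301)
X, Y = np.meshgrid(xs, xs)
alpha = X + 1j * Y
dxdy = (xs[1] - xs[0]) ** 2

Q = np.exp(-np.abs(alpha - mu) ** 2) / np.pi
print(np.sum(Q) * dxdy)                          # ~1, normalization Eq. (6.103)
a_adag = np.sum(np.abs(alpha) ** 2 * Q) * dxdy   # anti-normally ordered moment
print(a_adag - 1, abs(mu) ** 2)                  # <a^dagger a> vs the exact |mu|^2
```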
[Figure: Husimi function Q(α*, α) of a coherent state |µ⟩: a Gaussian in the plane (Re α, Im α), centered at the point (Re µ, Im µ).]
The Q function is always non-negative. But not all Q functions correspond to valid states. For instance, δ(α) is not a valid Husimi function, since it would lead to
$$\langle a a^\dagger\rangle = \int d^2\alpha\; |\alpha|^2\,\delta(\alpha) = 0, \tag{6.105}$$
which is impossible since ⟨aa†⟩ = ⟨a†a⟩ + 1 and ⟨a†a⟩ ≥ 0.
Let us now turn to some examples of Q functions.
Figure 6.6: Example of the Husimi function (6.108) for a Schrödinger cat state (6.107), assuming µ real. The plots correspond to a cut at Im(α) = 0.
An example of this function is shown in Fig. 6.6. It corresponds to roughly two Gaus-
sians superposed. If µ is small then the two peaks merge into one, but as µ increases
they become more distinguishable.
Another important example is a thermal state, ρ = e^{−βωa†a}/Z. Its Husimi function is a straightforward and fun calculation, which I will leave for you as an exercise. All you need is the overlap formula (6.88). The result is
$$Q(\alpha^*, \alpha) = \frac{1}{\pi(\bar{n}+1)}\,\exp\left\{-\frac{|\alpha|^2}{\bar{n}+1}\right\}, \tag{6.111}$$
where
$$\bar{n} = \frac{1}{e^{\beta\omega}-1} \tag{6.112}$$
is the Bose-Einstein thermal occupation of the harmonic oscillator. Thus, we see that the thermal state is also a Gaussian distribution, centered at zero but with a variance proportional to n̄ + 1. At zero temperature we get n̄ = 0 and we recover the Q function for the vacuum ρ = |0⟩⟨0|. The width of the Gaussian distribution can be taken as a measure of the fluctuations in the system. At high temperatures n̄ becomes large and so do the fluctuations. Thus, in the classical limit we get a big fat Gaussian. But even at T = 0 there is still a finite width, which is a consequence of quantum fluctuations.
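This result is easy to check numerically. A hedged sketch that builds ⟨α|ρ|α⟩/π by summing the thermal Fock populations against the overlaps (6.89), and compares with the Gaussian (6.111); the value of βω is an arbitrary choice:

```python
import numpy as np
from math import factorial

beta_omega = 0.5                      # beta*omega (assumption)
nbar = 1 / (np.exp(beta_omega) - 1)   # Bose-Einstein occupation, Eq. (6.112)
p_n = (1 - np.exp(-beta_omega)) * np.exp(-beta_omega * np.arange(60))

for alpha in [0.3, 1.0 + 0.5j, 2.0]:
    lam = abs(alpha) ** 2
    # |<n|alpha>|^2 from Eq. (6.89)
    overlaps2 = np.array([np.exp(-lam) * lam**n / factorial(n) for n in range(60)])
    Q_sum = np.sum(p_n * overlaps2) / np.pi
    Q_formula = np.exp(-lam / (nbar + 1)) / (np.pi * (nbar + 1))
    print(alpha, Q_sum, Q_formula)    # the two agree
```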
The two examples above motivate us to consider a displaced thermal state. It is defined in terms of the displacement operator (6.74) as
$$\rho = D(\mu)\,\frac{e^{-\beta\omega a^\dagger a}}{Z}\,D^\dagger(\mu). \tag{6.113}$$
The corresponding Q function, as you can probably expect, is
$$Q(\alpha^*, \alpha) = \frac{1}{\pi(\bar{n}+1)}\,\exp\left\{-\frac{|\alpha-\mu|^2}{\bar{n}+1}\right\}, \tag{6.114}$$
which is sort of a mixture of Eqs. (6.106) and (6.111): it represents a thermal Gaussian displaced in the complex plane by an amount µ.
Heterodyne measurements
The Husimi-Q function allows for an interesting interpretation in terms of measurements in the coherent state basis |α⟩, called heterodyne measurements. Recall that the basis |α⟩ is not orthonormal, and therefore such a measurement is not a projective measurement. Instead, it is a generalized measurement in the same spirit of Sec. 2.7. In particular, please recall Eqs. (2.37)-(2.39). In our case, the set of measurement operators is
$$M_\alpha = \frac{1}{\sqrt{\pi}}\,|\alpha\rangle\langle\alpha|. \tag{6.115}$$
They are appropriately normalized as
$$\int d^2\alpha\; M_\alpha^\dagger M_\alpha = \int\frac{d^2\alpha}{\pi}\,|\alpha\rangle\langle\alpha| = 1,$$
which is nothing but the completeness relation (6.94). If outcome α is obtained, then the state after the measurement will collapse to |α⟩⟨α|. And the probability of obtaining outcome α is, by Eq. (2.38),
$$p_\alpha = \mathrm{tr}\Big[M_\alpha\,\rho\, M_\alpha^\dagger\Big] = \frac{1}{\pi}\langle\alpha|\rho|\alpha\rangle = Q(\alpha^*, \alpha). \tag{6.116}$$
Thus, we see that the Husimi-Q function is nothing but the outcome probability distribution if we were to perform a heterodyne measurement. This gives a nice interpretation to Q: whenever you see a plot of Q(α*, α), you can imagine "this is what I would get if I were to measure in the coherent state basis".
We now discuss a measurement model which is a variation of an original proposal by von Neumann. Suppose we have a system S that has been prepared in some state |ψ⟩, and we wish to measure some observable K in this state. We write the eigenstuff of K as
$$K = \sum_k k\,|k\rangle\langle k|. \tag{6.117}$$
In order to measure this observable, what we are going to do is couple the system to an ancilla, consisting of a single continuous-variable bosonic mode a, according to the interaction Hamiltonian
$$H = igK(a^\dagger - a). \tag{6.118}$$
This Hamiltonian represents a displacement of the bosonic mode which is proportional to the operator K. We could also do the same with (a + a†), which looks more like a coordinate q. But doing it for i(a† − a) turns out to be a bit simpler.
We assume the ancilla starts in the vacuum, so the initial state is
$$|\Phi(0)\rangle_{SA} = |\psi\rangle_S \otimes |0\rangle_A.$$
We then compute the time evolution of S+A under the interaction Hamiltonian (6.118). We will not worry here about the free part of the Hamiltonian. Including it would complicate the analysis, but will not lead to any new physics. Our goal then is to compute the state at time t,
$$|\Phi(t)\rangle_{SA} = e^{-iHt}\,|\psi\rangle_S \otimes |0\rangle_A. \tag{6.120}$$
To do that, we expand the exponential,
$$e^{-iHt} = 1 - iHt + \frac{(-i)^2}{2}H^2 t^2 + \ldots$$
We now note that, using the eigenstuff (6.117), we can write (being a bit sloppy with the ⊗)
$$H = \sum_k |k\rangle\langle k|\,(igk)(a^\dagger - a),$$
$$H^2 = \sum_k |k\rangle\langle k|\,(igk)^2 (a^\dagger - a)^2,$$
and so on; in general,
$$H^n = \sum_k |k\rangle\langle k|\,(igk)^n (a^\dagger - a)^n.$$
It is now easy to apply the evolution operator to the initial state, as in Eq. (6.120). We simply get
$$|\Phi(t)\rangle_{SA} = \left\{\sum_k |k\rangle\langle k| \otimes D(gtk)\right\}\, |\psi\rangle_S \otimes |0\rangle_A,$$
or
$$|\Phi(t)\rangle_{SA} = \sum_k \langle k|\psi\rangle\; |k\rangle_S \otimes |gtk\rangle_A, \tag{6.122}$$
where |gtk⟩_A = D(gtk)|0⟩_A is the coherent state at position α = gtk. This result is quite important. It says that after a time t the combined S+A system will be in an entangled state, corresponding to a superposition of the system being in |k⟩ and the ancilla being in |gtk⟩.

The reduced density matrix of the ancilla is then
$$\rho_A(t) = \mathrm{tr}_S\,|\Phi(t)\rangle\langle\Phi(t)| = \sum_k |\langle k|\psi\rangle|^2\; |gtk\rangle\langle gtk|. \tag{6.123}$$
This is just an incoherent combination of coherent states, with the coherent state |gtk⟩ occurring with probability
$$p_k = |\langle k|\psi\rangle|^2. \tag{6.124}$$
The corresponding Q function will then be simply a sum of terms of the form (6.106):
$$Q(\alpha^*, \alpha) = \frac{1}{\pi}\sum_k p_k\; e^{-|\alpha - gtk|^2}. \tag{6.125}$$
Figure 6.7: Example of the Q function (6.125) computed for the example state (6.126) for
different values of gt. Namely (a) 1, (b) 2 and (c) 4.
As can be seen, if gt is small the different peaks become blurred, so such a measurement would not be able to properly distinguish between them. Conversely, as gt gets larger (which means a longer interaction time or a stronger interaction), the peak separation becomes clearer. Thus, the more S and A interact (or, what is equivalent, the more entangled they are), the larger is the amount of information that you can learn about S by performing a heterodyne detection on A.
We may also ask what happens to the reduced density matrix of the system, ρ_S(t) = tr_A |Φ(t)⟩⟨Φ(t)|. Taking the partial trace of Eq. (6.122) produces overlaps of coherent states, which we can simplify using the orthogonality relation between coherent states, Eq. (6.91):
$$\langle gtk|gtk'\rangle = \exp\left\{-\frac{(gt)^2}{2}(k-k')^2\right\}.$$
Thus, the reduced density matrix of S becomes
$$\rho_S(t) = \sum_{k,k'}\rho_{k,k'}(t)\,|k\rangle\langle k'|, \tag{6.127}$$
where
$$\rho_{k,k'}(t) = \langle k|\psi\rangle\langle\psi|k'\rangle\,\exp\left\{-\frac{(gt)^2}{2}(k-k')^2\right\}. \tag{6.128}$$
The diagonal elements (k = k′) of Eq. (6.128) are left untouched. Conversely, the off-diagonal coherences are exponentially damped, and if we never turn off the S+A interaction we will eventually end up with
$$\rho_S(t\to\infty) = \sum_k p_k\,|k\rangle\langle k|.$$
Thus, the system initially started in a state |ψ⟩ which was a superposition of the states |k⟩. But, if we allow the system and ancilla to interact for a really long time, the system will end up in an incoherent mixture of states. It is also cool to note how the damping of the coherences is stronger for k and k′ which are farther apart.

This analysis shows the emergence of a preferred basis. Before we turned on the S+A interaction, the system had no preferred basis. But once that interaction was turned on, the basis of the operator K, which is the operator we chose to couple to the ancilla in Eq. (6.118), becomes a preferred basis, in the sense that populations and coherences behave differently in this basis.
Our model also allows us to interpolate between weak measurements and strong
measurements. If gt is small then we perturb the system very little but we also don’t
learn a lot about it by measuring A. Conversely, if gt is large then we can learn a great
deal more, but we also damage the system way more.
Finally, suppose we perform a heterodyne measurement on the ancilla, with measurement operators (6.115). If outcome α is obtained, the global state is updated to
$$|\Phi(t)\rangle\langle\Phi(t)| \;\to\; \frac{M_\alpha\, |\Phi(t)\rangle\langle\Phi(t)|\, M_\alpha^\dagger}{Q(\alpha^*, \alpha)}, \tag{6.131}$$
where I already used Eq. (6.116) to relate the outcome probability p_α with the Husimi function. After the measurement the ancilla will collapse to the coherent state |α⟩⟨α|. Taking the partial trace of Eq. (6.131) over A, we then get the reduced density matrix of S, given that the measurement outcome was α. I will leave the details of this calculation to you. The result is
$$\rho_{S|\alpha}(t) = \sum_{k,k'}\rho_{k,k'|\alpha}(t)\,|k\rangle\langle k'|, \tag{6.132}$$
where
$$\rho_{k,k'|\alpha} = \frac{1}{\pi\, Q(\alpha^*, \alpha)}\,\langle k|\psi\rangle\langle\psi|k'\rangle\,\langle\alpha|gtk\rangle\langle gtk'|\alpha\rangle. \tag{6.133}$$
In particular, we can look at the diagonal elements ρ_{k,k|α},
$$\rho_{k|\alpha}(t) = \frac{p_k\; e^{-|\alpha - gtk|^2}}{\sum_{k'} p_{k'}\; e^{-|\alpha - gtk'|^2}}. \tag{6.134}$$
These quantities represent the populations in the |k⟩ basis, given that the measurement outcome was α.
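A small sketch of Eq. (6.134): since the example state (6.126) is not reproduced in this part of the notes, the eigenvalues k and populations p_k below are assumptions chosen only for illustration.

```python
import numpy as np

gt = 1.0
ks = np.array([-2, -1, 0, 1, 2])          # eigenvalues of K (assumption)
p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # populations p_k = |<k|psi>|^2 (assumption)

def conditional_populations(alpha):
    """Populations in the |k> basis given heterodyne outcome alpha, Eq. (6.134)."""
    w = p * np.exp(-np.abs(alpha - gt * ks) ** 2)
    return w / w.sum()

print(conditional_populations(0.0))   # an outcome near alpha = 0 favors k = 0
print(conditional_populations(2.0))   # an outcome near alpha = 2 favors k = 2
```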
Figure 6.8: The conditional populations in Eq. (6.134) for the example state (6.126) and gt = 1.
The master equation for the lossy cavity has the form
$$\frac{d\rho}{dt} = -i[H, \rho] + D(\rho), \tag{6.136}$$
with the dissipator
$$D(\rho) = \gamma\left(a\rho a^\dagger - \frac{1}{2}\{a^\dagger a, \rho\}\right). \tag{6.137}$$
Here γ > 0 is a constant which quantifies the loss rate of the cavity. Recall that the pump term in Eq. (6.135) was related to the laser power P by |ε|² = γP/ℏω_p, which therefore depends on γ. This is related to the fact that the mechanism allowing the photons to get in is the same that allows them to get out, which is the semi-transparent mirror. I should also mention that sometimes Eq. (6.137) is written instead with another constant, γ = 2κ. There is a sort of unspoken rule that if Eq. (6.137) has a 2 in front, the constant should be named κ. If there is no factor of 2, it should be named γ. If you ever want to be mean to a referee, try swapping that convention.
For qubits the dimension of the Hilbert space is finite so we can describe the master
equation by simply solving for the density matrix. Here things are not so easy. Finding
a general solution for any density matrix is a more difficult task. Instead, we need to
learn alternative ways of dealing with (and understanding) this type of equation.
Before we do anything else, it is important to understand the meaning of the structure of the dissipator, in particular the meaning of a term such as aρa†. Suppose at t = 0 we prepare the system with certainty in a number state, so ρ(0) = |n⟩⟨n|. Then
$$D(|n\rangle\langle n|) = \gamma n\Big(|n-1\rangle\langle n-1| - |n\rangle\langle n|\Big).$$
The first term, which comes from aρa†, represents a state with one photon less. This is precisely the idea of a loss process. But this process must also preserve probability, which is why we also have another term to compensate. The structure of the dissipator (6.137) is very finely tuned: the system loses photons, but does so in such a way that the density matrix remains positive and normalized at all times. We also see from this result that
$$D(|0\rangle\langle 0|) = 0. \tag{6.138}$$
Thus, if you start with zero photons, nothing happens with the dissipator term. We say that the vacuum is a fixed point of the dissipator (it is not necessarily a fixed point of the unitary evolution).
Let us then look at the populations p_n(t) = ⟨n|ρ(t)|n⟩. They represent the probability of finding the system in the Fock state |n⟩. We can find an equation for p_n(t) by sandwiching Eq. (6.136) in ⟨n| · · · |n⟩. The unitary part turns out to give zero, since |n⟩ is an eigenstate of H = ω_c a†a. As for ⟨n|D(ρ)|n⟩, I will leave for you to check that we get
$$\frac{dp_n}{dt} = \gamma\Big[(n+1)\,p_{n+1} - n\,p_n\Big]. \tag{6.141}$$
This is called a Pauli master equation and is nothing but a rate equation, specifying how the population p_n(t) changes with time. Positive terms increase p_n and negative terms decrease it. So the first term in Eq. (6.141) describes the increase in p_n due to populations coming from p_{n+1}. This represents the decays from higher levels. Similarly, the second term in Eq. (6.141) is negative and so describes how p_n decreases due to populations at p_n that are falling down to p_{n−1}.

The steady state of Eq. (6.141) is obtained by setting dp_n/dt = 0, which gives
$$p_{n+1} = \frac{n}{n+1}\, p_n. \tag{6.142}$$
In particular, if n = 0 we get p₁ = 0. Then plugging this in n = 1 gives p₂ = 0, and so on. Thus, the steady state corresponds to all p_n = 0 for n ≥ 1. The only exception is p₀ which, by normalization, must then be p₀ = 1.
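The rate equation (6.141) is also easy to integrate numerically. A minimal sketch (the initial Poissonian populations are an arbitrary choice), which also checks that ⟨a†a⟩ decays exponentially at rate γ, a result we will derive shortly [Eq. (6.147)]:

```python
import numpy as np
from math import factorial
from scipy.integrate import solve_ivp

# Pauli master equation (6.141) for cavity loss, truncated at N Fock states.
gamma, N = 1.0, 60
n = np.arange(N)

def rhs(t, p):
    p_up = np.append(p[1:], 0.0)   # p_{n+1}, with zero beyond the truncation
    return gamma * ((n + 1) * p_up - n * p)

lam = 4.0                                                  # initial <n> (assumption)
p0 = np.array([np.exp(-lam) * lam**k / factorial(k) for k in n])

sol = solve_ivp(rhs, (0, 5), p0, t_eval=[0, 1, 2, 5])
for t, p in zip(sol.t, sol.y.T):
    print(t, n @ p, lam * np.exp(-gamma * t))   # <n>(t) vs the exact exponential decay
```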
Evolution of observables
Another useful thing to study is the evolution of observables, such as ⟨a⟩, ⟨a†a⟩, etc. Starting from the master equation (6.136), the expectation value of any observable evolves as
$$\frac{d\langle O\rangle}{dt} = \mathrm{tr}\left(O\,\frac{d\rho}{dt}\right) = -i\,\mathrm{tr}\Big(O[H,\rho]\Big) + \mathrm{tr}\Big(O D(\rho)\Big).$$
Rearranging the first term, we may write this as
$$\frac{d\langle O\rangle}{dt} = i\langle [H, O]\rangle + \mathrm{tr}\Big(O D(\rho)\Big). \tag{6.143}$$
The first term is simply Heisenberg's equation (6.48) for the unitary part. What is new is the second term. It is convenient to write it as the trace of ρ times "something", so that we can write it as an expectation value. We can do this using the cyclic property of the trace:
$$\mathrm{tr}\left[O\left(a\rho a^\dagger - \frac{1}{2}a^\dagger a\rho - \frac{1}{2}\rho a^\dagger a\right)\right] = \left\langle a^\dagger O a - \frac{1}{2}a^\dagger a O - \frac{1}{2}O a^\dagger a\right\rangle. \tag{6.144}$$
Using this result for O = a and O = a†a gives, playing with the algebra a bit,
$$\mathrm{tr}\Big(a\, D(\rho)\Big) = -\frac{\gamma}{2}\,\langle a\rangle, \qquad \mathrm{tr}\Big(a^\dagger a\, D(\rho)\Big) = -\gamma\,\langle a^\dagger a\rangle. \tag{6.145}$$
Using these results in Eq. (6.143) then gives
$$\frac{d\langle a\rangle}{dt} = -\left(i\omega_c + \frac{\gamma}{2}\right)\langle a\rangle, \tag{6.146}$$
$$\frac{d\langle a^\dagger a\rangle}{dt} = -\gamma\,\langle a^\dagger a\rangle. \tag{6.147}$$
Thus, both moments relax exponentially (⟨a⟩ at rate γ/2 and ⟨a†a⟩ at rate γ), except that ⟨a⟩ will also oscillate:
$$\langle a\rangle_t = e^{-(i\omega_c + \gamma/2)t}\,\langle a\rangle_0, \qquad \langle a^\dagger a\rangle_t = e^{-\gamma t}\,\langle a^\dagger a\rangle_0.$$
As t → ∞ the average number of photons ⟨a†a⟩ tends to zero, no matter which state you begin at. Looking at a handful of observables is a powerful way to get an idea of what the density matrix is doing.
This transforms the Hamiltonian (6.135) into the detuned time-independent Hamiltonian (6.33). In this case the steady state turns out to be a coherent state |α⟩. One way to check this is to take the coherent state as an ansatz and then find the value of α which solves Eq. (6.151). The average number of photons will then be
$$\langle a^\dagger a\rangle = |\alpha|^2 = \frac{\epsilon^2}{\Delta^2 + \gamma^2/4}. \tag{6.155}$$
The purpose of this section was to show you a practical use of master equations and open quantum systems. This "cavity loss" dissipator is present in literally every quantum optics setup that involves a cavity. In fact, I know of several papers which forget to even mention that this dissipator is there; but it always is.
We now turn to the spin-boson model: a single qubit coupled to a bath of bosonic modes. It comes in two variants,
$$H = \frac{\omega}{2}\sigma_z + \sum_k \Omega_k b_k^\dagger b_k + \sum_k \lambda_k\,\sigma_z(b_k + b_k^\dagger), \quad \text{(good boy)}, \tag{6.156}$$
$$H = \frac{\omega}{2}\sigma_z + \sum_k \Omega_k b_k^\dagger b_k + \sum_k \lambda_k\,\sigma_x(b_k + b_k^\dagger), \quad \text{(bad boy)}. \tag{6.157}$$
The fundamental difference between the two models is that in the first the operator appearing in the S-E interaction (σ_z) is the same as the operator in H_S. Consequently, the model (6.156) cannot generate transitions between energy levels (population changes), and the most that can happen is environment-induced decoherence. In Eq. (6.157), on the other hand, the operator σ_x is the spin flip and therefore causes population changes. Consequently, it will give rise to an amplitude-damping type of dynamics. In this section we will talk about the good-boy spin-boson model, Eq. (6.156). It has the cool, and quite rare, feature that it can be analytically solved.
Exact solution
In this section we find the exact solution for ρ_S(t). This is one of the few models for which exact solutions are available, so enjoy it! The starting point is von Neumann's equation (in the Schrödinger picture) for the total density matrix of S+E,
$$\frac{d\rho}{dt} = -i[H, \rho], \tag{6.158}$$
where H is the total Hamiltonian (6.156). This is subject to the initial condition
$$\rho(0) = \rho_S(0)\,\rho_E(0), \qquad \rho_E(0) = \frac{e^{-\beta H_E}}{Z_E}. \tag{6.159}$$
However, now we are interested in the exact dynamics, so the bath will also evolve in time and the system and bath will become correlated.
The solution of Eq. (6.158) is ρ(t) = e^{−iHt}ρ(0)e^{iHt}. To compute it, we split the Hamiltonian as H = H_S + H₀, where H_S = (ω/2)σ_z and
$$H_0 = \sum_k \Omega_k b_k^\dagger b_k + \sum_k \lambda_k\,\sigma_z(b_k + b_k^\dagger).$$
Since both terms depend on the qubit only through σ_z, we have [H_S, H₀] = 0, so that e^{−iHt} = e^{−iH_S t}e^{−iH₀t}. The Hamiltonian H_S lives on the qubit space and can therefore be taken out of the partial trace:
$$\rho_S(t) = e^{-iH_S t}\;\mathrm{tr}_E\left[e^{-iH_0 t}\,\rho_S(0)\rho_E(0)\,e^{iH_0 t}\right]\, e^{iH_S t}.$$
In this way, we have separated the local unitary dynamics, described by H_S, from the dissipative dynamics described by everything inside the trace. In fact, if you think about it, this whole partial trace is a quantum operation in the spirit of Stinespring's theorem. So for now let us focus on this dissipative part, defined by the map
$$\tilde{\rho}_S(t) = \mathrm{tr}_E\left[e^{-iH_0 t}\,\rho_S(0)\rho_E(0)\,e^{iH_0 t}\right]. \tag{6.160}$$
The easiest way to proceed from here is to actually look at the matrix elements of this map in the computational basis. The reason why this is useful is because H₀ is already diagonal in the qubit sector. In fact, we can define
$$H_0|0\rangle = H_0^+|0\rangle, \qquad H_0|1\rangle = H_0^-|1\rangle,$$
where
$$H_0^\pm = \sum_k \Omega_k b_k^\dagger b_k \pm \sum_k \lambda_k (b_k + b_k^\dagger).$$
We then have, for instance,
$$\langle 0|\tilde{\rho}_S(t)|0\rangle = \langle 0|\,\mathrm{tr}_E\left[e^{-iH_0 t}\rho_S(0)\rho_E(0)e^{iH_0 t}\right]|0\rangle$$
$$= \mathrm{tr}_E\left[\langle 0|e^{-iH_0 t}\rho_S(0)\rho_E(0)e^{iH_0 t}|0\rangle\right]$$
$$= \mathrm{tr}_E\left[e^{-iH_0^+ t}\,\langle 0|\rho_S(0)|0\rangle\,\rho_E(0)\,e^{iH_0^+ t}\right]$$
$$= \langle 0|\rho_S(0)|0\rangle\;\mathrm{tr}_E\left[e^{-iH_0^+ t}\,\rho_E(0)\,e^{iH_0^+ t}\right].$$
This set of steps is important and a bit confusing, so make sure you understand what I am doing. I push the system bra and ket |0⟩ inside the partial trace. But then I know how H₀ acts on it. And after it has acted, H₀⁺ no longer has any components on the qubit space, so we can move |0⟩ through it at will. Finally, when ⟨0| and |0⟩ encounter ρ_S(0), they form a number, which can then be taken outside the partial trace.

But now comes the magic trick: H₀± is an operator that lives only on the environment's Hilbert space. Hence, we are now allowed to use the cyclic property of the trace. This is a useful trick to remember: if an operator lives on a larger space, the cyclic property is forbidden. But if it acts only on the space you are tracing over, then it becomes allowed again. And if we do that, the two exponentials cancel and we are left with
$$\langle 0|\tilde{\rho}_S(t)|0\rangle = \mathrm{tr}_E\left[\langle 0|\rho_S(0)|0\rangle\,\rho_E(0)\right] = \langle 0|\rho_S(0)|0\rangle. \tag{6.161}$$
Thus, as anticipated, we see that the action of the bath does not change the populations (diagonal elements) of ρ_S. A similar argument can of course be used for ⟨1|ρ̃_S(t)|1⟩, but we don't need to do it because, if ⟨0|ρ̃_S(t)|0⟩ doesn't change, then ⟨1|ρ̃_S(t)|1⟩ cannot change either, due to normalization.
Next we look at the off-diagonal element,
$$\langle 0|\tilde{\rho}_S(t)|1\rangle = \langle 0|\rho_S(0)|1\rangle\;\mathrm{tr}_E\left[e^{-iH_0^+ t}\,\rho_E(0)\,e^{iH_0^- t}\right]. \tag{6.162}$$
We see now that the exponentials do not cancel, so the result of the trace will not be just tr_E ρ_E(0) = 1. In fact, we can define a general dephasing rate as
$$e^{-\Lambda(t)} = \mathrm{tr}_E\left[e^{-iH_0^+ t}\,\rho_E(0)\,e^{iH_0^- t}\right]. \tag{6.163}$$
Our task has now been reduced to the calculation of the decoherence rate Λ(t). Since H₀± and ρ_E(0) both factor into contributions from the independent modes, the trace factors as a product,
$$e^{-\Lambda(t)} = \prod_k \mathrm{tr}\left[e^{-it\left(\Omega_k b_k^\dagger b_k + \lambda_k(b_k + b_k^\dagger)\right)}\,\rho_k\; e^{it\left(\Omega_k b_k^\dagger b_k - \lambda_k(b_k + b_k^\dagger)\right)}\right],$$
where ρ_k is the initial state of mode k of the environment. If we assume the environment is in a thermal state, then
$$\rho_k = (1 - e^{-\beta\Omega_k})\,e^{-\beta\Omega_k b_k^\dagger b_k}.$$
Since the calculations for all modes are equivalent, let us clean up the notation a bit and focus on the quantity
$$B = \left\langle e^{it\left(\Omega b^\dagger b - \lambda(b+b^\dagger)\right)}\; e^{-it\left(\Omega b^\dagger b + \lambda(b+b^\dagger)\right)}\right\rangle. \tag{6.165}$$
The trick to evaluate this is to use the displacement operators discussed in Sec. 6.4. Recall that D†(α)bD(α) = b + α. We can then use this to write
$$\Omega b^\dagger b \pm \lambda(b+b^\dagger) = \Omega\, D^\dagger(\pm\lambda/\Omega)\,(b^\dagger b)\,D(\pm\lambda/\Omega) - \frac{\lambda^2}{\Omega}. \tag{6.166}$$
But the displacement operator is unitary, so it can enter or leave exponentials at will. Consequently,
$$e^{-it\left(\Omega b^\dagger b + \lambda(b+b^\dagger)\right)} = e^{it\lambda^2/\Omega}\; D^\dagger(\lambda/\Omega)\, e^{-i\Omega t\, b^\dagger b}\, D(\lambda/\Omega),$$
with a similar result (with the opposite scalar phase) for the other exponential, so that the phases cancel. Eq. (6.165) then becomes
$$B = \left\langle D^\dagger(-\lambda/\Omega)\, e^{i\Omega t b^\dagger b}\, D(-\lambda/\Omega)\; D^\dagger(\lambda/\Omega)\, e^{-i\Omega t b^\dagger b}\, D(\lambda/\Omega)\right\rangle = \left\langle D(\lambda/\Omega)\, e^{i\Omega t b^\dagger b}\, D^\dagger(2\lambda/\Omega)\, e^{-i\Omega t b^\dagger b}\, D(\lambda/\Omega)\right\rangle,$$
where I used the fact that D(−α) = D†(α) and that D(α)D(α) = D(2α) (all these properties are described in Sec. 6.4).

In the middle term we infiltrate the exponentials inside D†(2λ/Ω):
$$e^{i\Omega t b^\dagger b}\, D^\dagger(2\lambda/\Omega)\, e^{-i\Omega t b^\dagger b} = \exp\left\{-\frac{2\lambda}{\Omega}\, e^{i\Omega t b^\dagger b}\,(b^\dagger - b)\, e^{-i\Omega t b^\dagger b}\right\} = \exp\left\{-\frac{2\lambda}{\Omega}\left(b^\dagger e^{i\Omega t} - b\, e^{-i\Omega t}\right)\right\} = D^\dagger\left(\frac{2\lambda}{\Omega}\, e^{i\Omega t}\right).$$
Thus
$$B = \left\langle D(\lambda/\Omega)\; D^\dagger\!\left(\frac{2\lambda}{\Omega}\, e^{i\Omega t}\right)\; D(\lambda/\Omega)\right\rangle.$$
Finally, we combine the three displacement operators using D(α)D(β) = e^{(β*α − α*β)/2} D(α + β) [Eq. (6.85)]. The resulting phase factors cancel, and we are left with
$$B = \langle D(\alpha_t)\rangle, \qquad \alpha_t := \frac{2\lambda}{\Omega}\left(1 - e^{i\Omega t}\right). \tag{6.167}$$
This result is somewhat general since it holds for an arbitrary bath initial state.
Next let us specialize it to the case of a thermal state. In this case, it turns out that the expectation value of a displacement operator in a thermal state is³
$$B = \langle D(\alpha_t)\rangle = \exp\left\{-|\alpha_t|^2\left(\bar{n}+\frac{1}{2}\right)\right\}, \qquad \text{when } \rho = (1-e^{-\beta\Omega})\,e^{-\beta\Omega b^\dagger b}, \tag{6.168}$$
where n̄ = (e^{βΩ} − 1)^{−1}. Multiplying the contributions of all modes, the dephasing exponent defined in (6.163) becomes
$$\Lambda(t) = \sum_k \frac{4\lambda_k^2}{\Omega_k^2}\,\left(1 - \cos\Omega_k t\right)\coth\left(\frac{\Omega_k}{2T}\right), \tag{6.169}$$
where I used |α_t|² = (8λ²/Ω²)(1 − cos Ωt) and n̄ + 1/2 = (1/2) coth(βΩ/2).

³ In K. Cahill and R. Glauber, Phys. Rev. 177, 1857-1881 (1969), it is shown that ⟨n|D(α)|n⟩ = e^{−|α|²/2} L_n(|α|²), where L_n(x) are the Laguerre polynomials. The sum in n may then be related to the generating function of the Laguerre polynomials,
$$\sum_{n=0}^{\infty} x^n L_n(y) = \frac{1}{1-x}\,\exp\left\{-\frac{yx}{1-x}\right\}.$$
Using this yields, after some simplifications, the result in Eq. (6.168).
Figure 6.10: The decoherence rate Λ(t)/t defined in Eq. (6.169) for different numbers of bath
modes N. The parameters Ωk and λk were chosen as in Eqs. (6.170) and (6.171).
The temperature was fixed at T/Ωc = 1.
The logic behind this choice will be explained below; essentially, it is the condition to obtain what is called an Ohmic bath. We also rescale the λ_k with the number of modes, since this allows us to compare different values of N.

We present some results for different values of N in Fig. 6.10. As can be seen, when N is small (e.g. N = 2) the damping rate is first positive, but then comes back all the way to zero at certain points. Having Λ(t) = 0 means the system didn't dephase at all. This is a signature of non-Markovian behavior: for initial times there is some dephasing, but then information flows back towards the system, which can eventually return exactly to its initial state when Λ(t) = 0. As we increase N, these backflows become rarer and occur at larger and larger times. Then, as N → ∞, information never flows back and the dynamics becomes Markovian.
On the other hand, for large N we see that at large times Λ(t)/t tends to a constant. This means that the coherences decay as q(t) = q₀e^{−Λ₀t}, which is the same result one obtains from the Lindblad equation
$$\frac{d\rho}{dt} = \Lambda\left(\sigma_z\rho\sigma_z - \rho\right). \tag{6.172}$$
But we also see, for instance in the curve with N = 100, that for small times there is an adjustment period in which Λ(t)/t is not constant. So this means that for very short times there is always some weird stuff going on, even if the bath is infinitely large. The microscopic derivations of master equations don't capture this type of effect, because they only take into account a coarse-grained dynamics at large times.
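Eqs. (6.170) and (6.171) did not survive in this copy of the notes, so the mode frequencies and couplings below are assumptions in the Ohmic spirit described in the text (equally spaced Ω_k up to Ω_c, λ_k ∼ √(Ω_k/N)). The sketch just evaluates the discrete sum (6.169):

```python
import numpy as np

def Lambda(t, N=100, Omega_c=1.0, T=1.0):
    """Decoherence exponent Lambda(t), Eq. (6.169), for N bath modes."""
    Omega = Omega_c * np.arange(1, N + 1) / N   # bath frequencies (assumption)
    lam = np.sqrt(Omega / N)                    # couplings, rescaled with N (assumption)
    return np.sum(4 * lam**2 / Omega**2 * (1 - np.cos(np.outer(t, Omega)))
                  / np.tanh(Omega / (2 * T)), axis=1)

ts = np.linspace(0.01, 50, 500)
for N in (2, 10, 100):
    print(N, (Lambda(ts, N=N) / ts)[-1])   # Lambda(t)/t at late times, cf. Fig. 6.10
```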
In the limit of an infinite number of modes we may introduce the spectral density of the bath, J(Ω) = 2π Σ_k λ_k² δ(Ω − Ω_k), so that Eq. (6.169) becomes
$$\Lambda(t) = \frac{2}{\pi}\int_0^\infty d\Omega\; \frac{J(\Omega)}{\Omega^2}\,\left(1 - \cos\Omega t\right)\coth\left(\frac{\Omega}{2T}\right). \tag{6.173}$$
We continue to assume Eqs. (6.170) and (6.171) for Ω_k and λ_k. Since λ_k ∼ √Ω_k and J(Ω) ∼ λ_k², we see that these assumptions imply an Ohmic spectral density, J(Ω) ∼ Ω. As for the cut-off, we have two choices. One is to assume J(Ω) = 0 when Ω > Ω_c (a hard cut-off) and the other is to assume that J(Ω) ∼ e^{−Ω/Ω_c} (a soft cut-off). We shall take the latter. That is, we shall assume an Ohmic spectral density with a soft cut-off,
$$J(\Omega) = \Omega\, e^{-\Omega/\Omega_c},$$
up to an overall coupling constant.
This integral can actually be worked out analytically. You will find this analysis in Sec. 4.2 of Breuer and Petruccione. The result is
$$\Lambda(t) \simeq \begin{cases} \Omega_c^2 t^2/2, & t \ll \Omega_c^{-1}, \\[1mm] \ln(\Omega_c t), & \Omega_c^{-1} \ll t \ll \dfrac{1}{\pi T}, \\[1mm] \pi T t, & \dfrac{1}{\pi T} \ll t. \end{cases} \tag{6.176}$$
Here Ω_c^{−1} and 1/(πT) represent characteristic time scales of the problem. The first is a very small time scale (because the cut-off is usually insanely large) and describes the behavior at very short times. Conversely, 1/(πT) marks the crossover to the long-time regime, where the decoherence grows linearly in t.
Figure 6.11: The three regimes of the decoherence rate, Eq. (6.176), compared with numerical simulations for N = 10⁴ bath modes. The other parameters were Ω_c = 100 and T = 1.
Let us now see what this dephasing does to a system of many qubits. I will denote the spin states by a variable σ = ±1. The action of the dephasing channel is then given by
$$\langle\sigma|\mathcal{E}(\rho_S)|\sigma'\rangle = \langle\sigma|\rho_S|\sigma'\rangle\; e^{-\frac{\Lambda(t)}{4}(\sigma-\sigma')^2}, \tag{6.177}$$
where σ, σ′ = ±1.

Now I want to consider what happens when we have N spins, with each spin coupled to its own individual spin-boson bath. In this case the computational basis will be given by the 2^N kets |σ⟩ = |σ₁, …, σ_N⟩. The logic behind Eq. (6.177) will now continue to apply, but at the level of each individual spin. For instance, suppose only spin i was coupled to a spin-boson bath. Then the action of this bath would be
$$\langle\sigma|\mathcal{E}_i(\rho_S)|\sigma'\rangle = \langle\sigma|\rho_S|\sigma'\rangle\; e^{-\frac{\Lambda(t)}{4}(\sigma_i-\sigma_i')^2}.$$
That is to say, the bath will only cause decoherence in those elements of ρ_S which are off-diagonal in spin component i. For instance, in the case of two qubits, a state of the form
$$\rho = p\,|0\rangle\langle 0|\otimes|\psi\rangle\langle\psi| + (1-p)\,|1\rangle\langle 1|\otimes|\psi'\rangle\langle\psi'|$$
would not suffer any decoherence if we applied E₁, but would in general suffer decoherence if we applied E₂, provided |ψ⟩ and |ψ′⟩ have off-diagonal elements.
If we now apply the map to all spins, the effects will simply add up, and so we will get
$$\langle\sigma|\left(\mathcal{E}_1\otimes\ldots\otimes\mathcal{E}_N\right)(\rho_S)|\sigma'\rangle = \langle\sigma|\rho_S|\sigma'\rangle\;\exp\left\{-\frac{\Lambda(t)}{4}\sum_{i=1}^{N}(\sigma_i-\sigma_i')^2\right\}. \tag{6.178}$$
This equation actually hides a dramatic effect, which explains why quantum effects are so easily washed away in macroscopic systems (like Schrödinger's cat, for instance). To see this, suppose that the system is prepared in a GHZ state,
$$|\psi\rangle = \frac{|0\ldots 0\rangle + |1\ldots 1\rangle}{\sqrt{2}} := \frac{|\mathbf{0}\rangle + |\mathbf{1}\rangle}{\sqrt{2}}. \tag{6.179}$$
The corresponding density matrix ρ_S = |ψ⟩⟨ψ| will have four terms. Two of them will not suffer decoherence; namely, |**0**⟩⟨**0**| and |**1**⟩⟨**1**|. But the cross terms |**0**⟩⟨**1**| and |**1**⟩⟨**0**| will. And according to Eq. (6.178), this decoherence will be proportional to
$$\frac{1}{4}\sum_{i=1}^{N}(\sigma_i - \sigma_i')^2 = N.$$
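Spelling out this last step: for the cross term we have σ_i = +1 and σ_i′ = −1 for every i, so each spin contributes (σ_i − σ_i′)²/4 = 1 to the exponent, and Eq. (6.178) gives
$$\langle\mathbf{0}|\rho_S(t)|\mathbf{1}\rangle = \frac{1}{2}\, e^{-N\Lambda(t)}.$$
The GHZ coherence therefore decays N times faster than that of a single spin: for a macroscopic number of particles, even a minuscule single-spin dephasing Λ(t) is enough to destroy the superposition almost instantly.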