CME 335
Spring Quarter 2010-11
Lecture 7 Notes
Jacobi Methods
One of the major drawbacks of the symmetric QR algorithm is that it is not parallelizable. Each
orthogonal similarity transformation that is needed to reduce the original matrix A to diagonal
form is dependent upon the previous one. In view of the evolution of parallel architectures, it is
therefore worthwhile to consider whether there are alternative approaches to reducing an n × n
symmetric matrix A to diagonal form that can exploit these architectures.
To that end, we consider an n × n orthogonal matrix J(p, q, θ) that is equal to the identity, except for its entries in rows and columns p and q, which are given by
$$
\begin{bmatrix} j_{pp} & j_{pq} \\ j_{qp} & j_{qq} \end{bmatrix} = \begin{bmatrix} c & s \\ -s & c \end{bmatrix},
$$
where c = cos θ and s = sin θ. This matrix, when applied as a similarity transformation to a
symmetric matrix A, rotates rows and columns p and q of A through the angle θ so that the (p, q)
and (q, p) entries are zeroed. We call the matrix J(p, q, θ) a Jacobi rotation. It is actually identical
to a Givens rotation, but in this context we call it a Jacobi rotation to acknowledge its inventor.
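For concreteness, the following Python sketch (the helper name jacobi_rotation and the use of NumPy are my own choices, not part of the notes) builds J(p, q, θ) explicitly and checks that it is orthogonal.

import numpy as np

def jacobi_rotation(n, p, q, theta):
    """Return the n x n identity with rows/columns p and q replaced by a plane rotation."""
    c, s = np.cos(theta), np.sin(theta)
    J = np.eye(n)
    J[p, p] = J[q, q] = c
    J[p, q], J[q, p] = s, -s
    return J

J = jacobi_rotation(5, 1, 3, 0.3)
print(np.allclose(J.T @ J, np.eye(5)))   # True: J is orthogonal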
Let off(A) be the square root of the sum of squares of all off-diagonal elements of A. That is,
$$
\mathrm{off}(A)^2 = \|A\|_F^2 - \sum_{i=1}^n a_{ii}^2.
$$
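As a small illustration (a sketch of my own, with NumPy assumed), off(A) can be computed directly from this definition:

import numpy as np

def off(A):
    """off(A): Frobenius norm of A with its diagonal set to zero."""
    return np.linalg.norm(A - np.diag(np.diag(A)), 'fro')

A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.5],
              [2.0, 0.5, 5.0]])
print(off(A)**2)                                            # 2*(1 + 4 + 0.25) = 10.5
print(np.linalg.norm(A, 'fro')**2 - np.sum(np.diag(A)**2))  # same value, per the formula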
Furthermore, let
$$
B = J(p, q, \theta)^T A\, J(p, q, \theta).
$$
Then, because the Frobenius norm is invariant under orthogonal transformations, and because only
rows and columns p and q of A are modified in B, we have
$$
\begin{aligned}
\mathrm{off}(B)^2 &= \|B\|_F^2 - \sum_{i=1}^n b_{ii}^2 \\
&= \|A\|_F^2 - \sum_{i \neq p,q} b_{ii}^2 - (b_{pp}^2 + b_{qq}^2) \\
&= \|A\|_F^2 - \sum_{i \neq p,q} a_{ii}^2 - (a_{pp}^2 + 2a_{pq}^2 + a_{qq}^2) \\
&= \|A\|_F^2 - \sum_{i=1}^n a_{ii}^2 - 2a_{pq}^2 \\
&= \mathrm{off}(A)^2 - 2a_{pq}^2 \\
&< \mathrm{off}(A)^2.
\end{aligned}
$$
We see that the “size” of the off-diagonal part of the matrix is guaranteed to decrease under such
a similarity transformation, provided that a_pq ≠ 0.
If we define
$$
\tau = \frac{a_{qq} - a_{pp}}{2a_{pq}}, \qquad t = \frac{s}{c},
$$
then t satisfies the quadratic equation
$$
t^2 + 2\tau t - 1 = 0.
$$
Solving this equation for t, and using the identity c^2 + s^2 = 1, we obtain
$$
c = \frac{1}{\sqrt{1 + t^2}}, \qquad s = ct.
$$
We choose the root t of smaller magnitude, which ensures that |θ| ≤ π/4 and minimizes the difference between B and A, since
$$
\|B - A\|_F^2 = 4(1 - c)\sum_{i \neq p,q} \left(a_{ip}^2 + a_{iq}^2\right) + \frac{2a_{pq}^2}{c^2}.
$$
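The following Python sketch (the helper name jacobi_cs is hypothetical, and NumPy is assumed) computes (c, s) from a_pp, a_pq, a_qq using the smaller root of the quadratic above, then verifies numerically that the similarity transformation zeroes the (p, q) entry and reduces off(A)^2 by exactly 2a_pq^2.

import numpy as np

def jacobi_cs(app, apq, aqq):
    """Cosine and sine of the Jacobi rotation that zeroes the (p, q) entry."""
    if apq == 0.0:
        return 1.0, 0.0
    tau = (aqq - app) / (2.0 * apq)
    # Root of t^2 + 2*tau*t - 1 = 0 of smaller magnitude, so that |theta| <= pi/4.
    t = np.sign(tau) / (np.abs(tau) + np.sqrt(1.0 + tau**2)) if tau != 0.0 else 1.0
    c = 1.0 / np.sqrt(1.0 + t**2)
    return c, c * t

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)); A = (A + A.T) / 2          # random symmetric matrix
p, q = 1, 3
c, s = jacobi_cs(A[p, p], A[p, q], A[q, q])
J = np.eye(5); J[p, p] = J[q, q] = c; J[p, q], J[q, p] = s, -s
B = J.T @ A @ J
off2 = lambda M: np.linalg.norm(M, 'fro')**2 - np.sum(np.diag(M)**2)
print(B[p, q])                                              # ~ 0
print(off2(B) - (off2(A) - 2 * A[p, q]**2))                 # ~ 0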
In the classical Jacobi algorithm, the indices p and q are chosen so that |a_pq| is the largest off-diagonal entry in magnitude. Since off(A)^2 is the sum of 2a_{ij}^2 over the N = n(n − 1)/2 pairs i < j, this choice guarantees
$$
\mathrm{off}(A)^2 \le 2N a_{pq}^2,
$$
and therefore
$$
\mathrm{off}(B)^2 = \mathrm{off}(A)^2 - 2a_{pq}^2 \le \left(1 - \frac{1}{N}\right)\mathrm{off}(A)^2,
$$
which implies linear convergence. However, it has been shown that for sufficiently large k, there
exists a constant c such that
$$
\mathrm{off}(A^{(k+N)}) \le c\,\mathrm{off}(A^{(k)})^2,
$$
where A^{(k)} is the matrix obtained after k Jacobi updates, meaning that the classical Jacobi algorithm converges quadratically as a function of sweeps, a sweep consisting of N updates. Heuristically, it has been argued that approximately
log n sweeps are needed in practice.
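Putting these pieces together, here is a minimal Python sketch of the classical Jacobi iteration (my own illustration; NumPy assumed). At each update it zeroes the off-diagonal entry of largest magnitude and accumulates the rotations, stopping once off(A) is small relative to the Frobenius norm of A.

import numpy as np

def classical_jacobi(A, tol=1e-12, max_sweeps=30):
    """Classical Jacobi: repeatedly zero the largest off-diagonal entry of A."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    V = np.eye(n)                        # accumulated product of Jacobi rotations
    N = n * (n - 1) // 2                 # number of updates per sweep
    for _ in range(max_sweeps * N):
        offdiag = np.abs(A - np.diag(np.diag(A)))
        p, q = np.unravel_index(np.argmax(offdiag), A.shape)
        # Since off(A)^2 <= 2*N*a_pq^2, this guarantees off(A) <= tol * ||A||_F.
        if 2 * N * offdiag[p, q]**2 <= (tol * np.linalg.norm(A, 'fro'))**2:
            break
        tau = (A[q, q] - A[p, p]) / (2.0 * A[p, q])
        t = np.sign(tau) / (np.abs(tau) + np.sqrt(1.0 + tau**2)) if tau != 0.0 else 1.0
        c = 1.0 / np.sqrt(1.0 + t**2)
        s = c * t
        J = np.eye(n); J[p, p] = J[q, q] = c; J[p, q], J[q, p] = s, -s
        A = J.T @ A @ J
        V = V @ J
    return np.diag(A), V                 # approximate eigenvalues and eigenvectors

rng = np.random.default_rng(1)
M = rng.standard_normal((6, 6)); M = (M + M.T) / 2
w, V = classical_jacobi(M)
print(np.sort(w))
print(np.linalg.eigvalsh(M))             # agrees with the Jacobi eigenvalues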
It is worth noting that the guideline that θ be chosen so that |θ| ≤ π/4 is actually essential
to ensure quadratic convergence, because otherwise it is possible that Jacobi updates may simply
interchange nearly converged diagonal entries.
Error Analysis
In terms of floating-point operations and comparisons, the Jacobi method is not competitive with
the symmetric QR algorithm, as the expense of two Jacobi sweeps is comparable to that of the entire
symmetric QR algorithm, even with the accumulation of transformations to obtain the matrix of
eigenvectors. On the other hand, the Jacobi method can exploit a known approximate eigenvector
matrix, whereas the symmetric QR algorithm cannot.
The relative error in the computed eigenvalues is quite small if A is positive definite. If λi is an
exact eigenvalue of A and λ̃i is the closest computed eigenvalue, then it has been shown by Demmel
and Veselić that
$$
\frac{|\tilde{\lambda}_i - \lambda_i|}{|\lambda_i|} \approx u\,\kappa_2(D^{-1}AD^{-1}) \ll u\,\kappa_2(A),
$$
where u is the unit roundoff and D is a diagonal matrix with diagonal entries √a11, √a22, . . . , √ann.
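As a quick numerical illustration of this scaling effect (my own sketch, not from the notes), consider a badly scaled positive definite matrix: the scaled condition number κ2(D^{-1}AD^{-1}) can be far smaller than κ2(A).

import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
C = B @ B.T + 5 * np.eye(5)                   # modestly conditioned SPD matrix
G = np.diag(10.0 ** np.arange(0, 10, 2))      # grading 1, 1e2, ..., 1e8
A = G @ C @ G                                 # badly scaled SPD matrix

Dinv = np.diag(1.0 / np.sqrt(np.diag(A)))
print(np.linalg.cond(A))                      # very large
print(np.linalg.cond(Dinv @ A @ Dinv))        # modest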
Parallel Jacobi
The primary advantage of the Jacobi method over the symmetric QR algorithm is its parallelism.
As each Jacobi update consists of a row rotation that affects only rows p and q, and a column
rotation that affects only columns p and q, up to n/2 Jacobi updates can be performed in parallel.
Therefore, a sweep can be efficiently implemented by performing n − 1 rounds of n/2 parallel updates,
in which each row i is paired with a different row j, i ≠ j, so that every pair is visited exactly once over the sweep.
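One standard way to generate such pairings is a round-robin (chess-tournament) ordering; the Python sketch below (my own illustration) produces, for each of the n − 1 rounds, n/2 disjoint (p, q) pairs, so the corresponding updates within a round involve disjoint rows and columns and can run in parallel.

def round_robin_pairs(n):
    """Yield n - 1 rounds of n/2 disjoint (p, q) pairs, covering every pair once.

    Assumes n is even. Index idx[0] stays fixed while the others rotate.
    """
    idx = list(range(n))
    for _ in range(n - 1):
        yield [(min(idx[i], idx[n - 1 - i]), max(idx[i], idx[n - 1 - i]))
               for i in range(n // 2)]
        idx = [idx[0], idx[-1]] + idx[1:-1]   # rotate all indices except the first

for rnd in round_robin_pairs(6):              # 5 rounds of 3 disjoint pairs
    print(rnd)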
As the size of the matrix, n, is generally much greater than the number of processors, p, it is
common to use a block approach, in which each update consists of the computation of a 2r × 2r
symmetric Schur decomposition for some chosen block size r. This is accomplished by applying
another algorithm, such as the symmetric QR algorithm, on a smaller scale. Then, if p ≥ n/(2r),
an entire block Jacobi sweep can be parallelized.
• One-sided Jacobi: This approach, like the Golub-Kahan SVD algorithm, implicitly applies
the Jacobi method for the symmetric eigenvalue problem to A^T A. The idea is, within each
update, to use a column Jacobi rotation to rotate columns p and q of A so that they are
orthogonal, which has the effect of zeroing the (p, q) entry of A^T A. Once all columns of AV
are orthogonal, where V is the accumulation of all column rotations, the relation AV = U Σ
is used to obtain U and Σ by simple column scaling. To find a suitable rotation, we note that
if a_p and a_q, the pth and qth columns of A, are rotated through an angle θ, then the rotated
columns c a_p − s a_q and s a_p + c a_q are orthogonal when
$$
0 = (c\,a_p - s\,a_q)^T (s\,a_p + c\,a_q) = cs\left(\|a_p\|_2^2 - \|a_q\|_2^2\right) + (c^2 - s^2)\,a_p^T a_q,
$$
where c = cos θ and s = sin θ. Dividing by c^2 and defining t = s/c, we obtain a quadratic
equation for t that can be solved to obtain c and s.
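A minimal Python sketch of one such one-sided update (my own illustration; NumPy assumed): it solves the quadratic for t using the inner products of columns p and q, rotates the two columns of A, and accumulates the rotation in V.

import numpy as np

def one_sided_rotation(A, V, p, q):
    """Rotate columns p and q of A (and of V) so that they become orthogonal."""
    ap, aq = A[:, p], A[:, q]
    alpha, beta, gamma = ap @ ap, aq @ aq, ap @ aq
    if abs(gamma) <= 1e-15 * np.sqrt(alpha * beta):
        return                               # columns already (numerically) orthogonal
    # Orthogonality condition divided by (c^2 * gamma) gives t^2 + 2*tau*t - 1 = 0.
    tau = (beta - alpha) / (2.0 * gamma)
    t = np.sign(tau) / (np.abs(tau) + np.sqrt(1.0 + tau**2)) if tau != 0.0 else 1.0
    c = 1.0 / np.sqrt(1.0 + t**2)
    s = c * t
    A[:, p], A[:, q] = c * ap - s * aq, s * ap + c * aq
    V[:, p], V[:, q] = c * V[:, p] - s * V[:, q], s * V[:, p] + c * V[:, q]

# Sweeping one_sided_rotation over all pairs p < q (with V initialized to the
# identity) until every pair of columns is orthogonal yields A_final = A_initial V,
# after which Sigma holds the column norms of A_final and U its normalized columns.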