Matrix Norms: Tom Lyche

The document discusses matrix norms. It defines what constitutes a matrix norm and provides examples of common matrix norms such as the Frobenius norm, sum norm, and max norm. It also covers properties of matrix norms such as equivalence of norms, continuity, unitary invariance, submultiplicativity, and consistency. Explicit expressions are provided for p-norms including the 2-norm (spectral norm). The document discusses how matrix norms relate to vector norms and operator norms.

Matrix Norms

Tom Lyche
Centre of Mathematics for Applications,
Department of Informatics,
University of Oslo

September 28, 2009


Matrix Norms

We consider matrix norms on (Cm,n , C). All results hold for (Rm,n , R) as well.

Definition (Matrix Norm)
A function k·k : Cm,n → R is called a matrix norm on Cm,n if for all A, B ∈ Cm,n and all α ∈ C:
1. kAk ≥ 0, with equality if and only if A = 0 (positivity);
2. kαAk = |α| kAk (homogeneity);
3. kA + Bk ≤ kAk + kBk (subadditivity).
A matrix norm is simply a vector norm on the finite-dimensional vector space (Cm,n , C) of m × n matrices.
Equivalent norms

Adapting some general results on vector norms to matrix norms gives:

Theorem
1. All matrix norms are equivalent. Thus, if k·k and k·k0 are two matrix norms on Cm,n, then there are positive constants µ and M such that µkAk ≤ kAk0 ≤ MkAk holds for all A ∈ Cm,n.
2. A matrix norm is a continuous function k·k : Cm,n → R.
Examples
- From any vector norm k·kV on Cmn we can define a matrix norm on Cm,n by kAk := kvec(A)kV, where vec(A) ∈ Cmn is the vector obtained by stacking the columns of A on top of each other.
- Three standard examples, corresponding to p = 1, 2, ∞:

  kAkS := Σ_{i=1}^m Σ_{j=1}^n |aij|,           (p = 1, Sum norm)
  kAkF := (Σ_{i=1}^m Σ_{j=1}^n |aij|²)^{1/2},  (p = 2, Frobenius norm)    (1)
  kAkM := max_{i,j} |aij|.                     (p = ∞, Max norm)
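These three norms are straightforward to evaluate directly from the entries of A. A minimal NumPy sketch (the function names are ours, not from the notes):

```python
import numpy as np

# The three norms of (1), written out directly from the definitions.
def sum_norm(A):
    return np.sum(np.abs(A))              # kAkS = sum of all |a_ij|

def frobenius_norm(A):
    return np.sqrt(np.sum(np.abs(A)**2))  # kAkF

def max_norm(A):
    return np.max(np.abs(A))              # kAkM = largest |a_ij|

A = np.array([[3.0, -4.0],
              [0.0, 12.0]])
print(sum_norm(A))        # 19.0
print(frobenius_norm(A))  # 13.0, since 9 + 16 + 0 + 144 = 169
print(max_norm(A))        # 12.0
```

NumPy's built-in `np.linalg.norm(A, 'fro')` agrees with `frobenius_norm` above.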
The Frobenius Matrix Norm 1.

- kAHkF² = Σ_{j=1}^n Σ_{i=1}^m |aij|² = Σ_{i=1}^m Σ_{j=1}^n |aij|² = kAkF².
The Frobenius Matrix Norm 2.

- kAkF² = Σ_{i=1}^m Σ_{j=1}^n |aij|² = Σ_{i=1}^m kai·k2², where ai· denotes the i-th row of A,
- kAkF² = Σ_{j=1}^n Σ_{i=1}^m |aij|² = Σ_{j=1}^n ka·jk2², where a·j denotes the j-th column of A.
Unitary Invariance.

- If A ∈ Cm,n and U ∈ Cm,m, V ∈ Cn,n are unitary, then
- kUAkF² = Σ_{j=1}^n kUa·jk2² = Σ_{j=1}^n ka·jk2² = kAkF² (by 2., the column formula, since U preserves the Euclidean norm),
- kAVkF = kVH AH kF = kAH kF = kAkF (by 1.).
Submultiplicativity

- Suppose A, B are rectangular matrices such that the product AB is defined.
- kABkF² = Σ_{i=1}^n Σ_{j=1}^k |ai·ᵀ b·j|² ≤ Σ_{i=1}^n Σ_{j=1}^k kai·k2² kb·jk2² = kAkF² kBkF², where the inequality is Cauchy–Schwarz applied to each entry ai·ᵀ b·j of AB.
Subordinance

- kAxk2 ≤ kAkF kxk2 for all x ∈ Cn.
- Since kvkF = kvk2 for a vector v, this follows from submultiplicativity.
Explicit Expression

- Let A ∈ Cm,n have singular values σ1, . . . , σn and SVD A = UΣVH. Then
- kAkF = kUH AVkF = kΣkF = (σ1² + · · · + σn²)^{1/2}, by unitary invariance (3.).
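This identity is easy to confirm numerically; the following NumPy check (ours, not part of the notes) compares the two sides on a random matrix:

```python
import numpy as np

# kAkF equals the square root of the sum of squared singular values.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
sigma = np.linalg.svd(A, compute_uv=False)   # singular values of A
lhs = np.linalg.norm(A, 'fro')
rhs = np.sqrt(np.sum(sigma**2))
print(lhs - rhs)   # zero up to round-off
```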
Consistency

- A matrix norm is called consistent on Cn,n if
  4. kABk ≤ kAk kBk  (submultiplicativity)
  holds for all A, B ∈ Cn,n.
- A matrix norm is consistent if it is defined on Cm,n for all m, n ∈ N, and 4. holds for all matrices A, B for which the product AB is defined.
- The Frobenius norm is consistent.
- The Sum norm is consistent.
- The Max norm is not consistent.
- The norm kAk := √(mn) kAkM, A ∈ Cm,n, is consistent.
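A quick NumPy check (ours, not from the notes) of the last two claims, using A = B = the 2 × 2 matrix of ones:

```python
import numpy as np

# The max norm violates kABk <= kAk kBk, but sqrt(mn) kAkM repairs it.
max_norm = lambda X: np.max(np.abs(X))
scaled   = lambda X: np.sqrt(X.shape[0] * X.shape[1]) * max_norm(X)

A = np.ones((2, 2))
print(max_norm(A @ A), max_norm(A) * max_norm(A))  # 2.0 1.0 -> fails
print(scaled(A @ A), scaled(A) * scaled(A))        # 4.0 4.0 -> holds
```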
Subordinate Matrix Norm
Definition
- Suppose m, n ∈ N are given.
- Let k·kα on Cm and k·kβ on Cn be vector norms, and let k·k be a matrix norm on Cm,n.
- We say that the matrix norm k·k is subordinate to the vector norms k·kα and k·kβ if kAxkα ≤ kAk kxkβ for all A ∈ Cm,n and all x ∈ Cn.
- If k·kα = k·kβ then we say that k·k is subordinate to k·kα.

Examples:
- The Frobenius norm is subordinate to the Euclidean vector norm.
- The Sum norm is subordinate to the l1-norm.
- kAxk∞ ≤ kAkM kxk1, so the Max norm is subordinate to the pair k·k∞, k·k1.
Operator Norm

Definition
Suppose m, n ∈ N are given and let k·kα be a vector norm on Cn and k·kβ a vector norm on Cm. For A ∈ Cm,n we define

  kAk := kAkα,β := max_{x≠0} kAxkβ / kxkα.   (2)

We call this the (α, β) operator norm, the (α, β)-norm, or simply the α-norm if α = β.
Observations

- kAkα,β = max_{x∉ker(A)} kAxkβ / kxkα = max_{kxkα=1} kAxkβ.
- kAxkβ ≤ kAk kxkα for all x ∈ Cn.
- kAkα,β = kAx∗kβ for some x∗ ∈ Cn with kx∗kα = 1 (the max in (2) is attained).
- The operator norm is a matrix norm on Cm,n.
- The Sum norm and the Frobenius norm are not an α operator norm for any α.
Operator norm Properties

- The operator norm is a matrix norm on Cm,n.
- The operator norm is consistent if the vector norm k·kα is defined for all m ∈ N and k·kβ = k·kα.
Proof

In 2. and 3. below we take the max over the unit sphere Sα = {x : kxkα = 1}.

1. Nonnegativity is obvious. If kAk = 0 then kAykβ = 0 for each y ∈ Cn. In particular, each column Aej of A is zero. Hence A = 0.
2. kcAk = max_x kcAxkβ = max_x |c| kAxkβ = |c| kAk.
3. kA + Bk = max_x k(A + B)xkβ ≤ max_x kAxkβ + max_x kBxkβ = kAk + kBk.
4. For consistency take k·kβ = k·kα. Then
   kABk = max_{Bx≠0} kABxkα / kxkα
        = max_{Bx≠0} (kABxkα / kBxkα)(kBxkα / kxkα)
        ≤ max_{y≠0} kAykα / kykα · max_{x≠0} kBxkα / kxkα = kAk kBk.
The p matrix norm

- The operator norms k·kp defined from the p-vector norms are of special interest.
- Recall kxkp := (Σ_{j=1}^n |xj|^p)^{1/p} for p ≥ 1, and kxk∞ := max_{1≤j≤n} |xj|.
- They are used most frequently for p = 1, 2, ∞.
- For any 1 ≤ p ≤ ∞ we define

  kAkp := max_{x≠0} kAxkp / kxkp = max_{kykp=1} kAykp.   (3)

- The p-norms are consistent matrix norms which are subordinate to the p-vector norm.
Explicit expressions

Theorem
For A ∈ Cm,n we have

  kAk1 = max_{1≤j≤n} Σ_{k=1}^m |ak,j|   (max column sum),
  kAk2 = σ1                             (largest singular value of A),   (4)
  kAk∞ = max_{1≤k≤m} Σ_{j=1}^n |ak,j|   (max row sum).

The quantity kAk2 is called the two-norm or the spectral norm of A. The explicit expression for it follows from the minmax theorem for singular values.
Examples

For A := (1/15) [ 14 4 16 ; 2 22 13 ] we find
- kAk1 = 29/15,
- kAk2 = 2,
- kAk∞ = 37/15,
- kAkF = √5 (the squared entries sum to 1125/225 = 5).

For A := [ 1 2 ; 3 4 ] we find
- kAk1 = 6,
- kAk2 ≈ 5.4650,
- kAk∞ = 7,
- kAkF = √30 ≈ 5.4772.
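Both examples are easy to reproduce with NumPy's built-in matrix norms (a check of ours, not part of the notes):

```python
import numpy as np

# Verify the explicit expressions (4) on the two example matrices.
A1 = np.array([[14.0, 4.0, 16.0], [2.0, 22.0, 13.0]]) / 15.0
A2 = np.array([[1.0, 2.0], [3.0, 4.0]])
for A in (A1, A2):
    print(np.linalg.norm(A, 1),       # max column sum
          np.linalg.norm(A, 2),       # largest singular value
          np.linalg.norm(A, np.inf),  # max row sum
          np.linalg.norm(A, 'fro'))
```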
The 2 norm

Theorem
Suppose A ∈ Cn,n has singular values σ1 ≥ σ2 ≥ · · · ≥ σn and eigenvalues |λ1| ≥ |λ2| ≥ · · · ≥ |λn|. Then

  kAk2 = σ1   and  kA⁻¹k2 = 1/σn,   (5)
  kAk2 = λ1   and  kA⁻¹k2 = 1/λn,   if A is symmetric positive definite,   (6)
  kAk2 = |λ1| and  kA⁻¹k2 = 1/|λn|, if A is normal.   (7)

For the norms of A⁻¹ we assume, of course, that A is nonsingular.
Unitary Transformations

Definition
A matrix norm k k on Cm,n is called unitary invariant if
kUAVk = kAk for any A ∈ Cm,n and any unitary matrices
U ∈ Cm,m and V ∈ Cn,n .
If U and V are unitary then U(A + E)V = UAV + F with F := UEV, where kFk = kEk for a unitary invariant norm.
Theorem
The Frobenius norm and the spectral norm are unitary
invariant. Moreover kAH kF = kAkF and kAH k2 = kAk2 .
Proof 2 norm

- kUAk2 = max_{kxk2=1} kUAxk2 = max_{kxk2=1} kAxk2 = kAk2.
- kAHk2 = kAk2, since A and AH have the same singular values.
- kAVk2 = k(AV)Hk2 = kVH AHk2 = kAHk2 = kAk2.
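A numerical spot check (ours; random orthogonal factors generated via QR):

```python
import numpy as np

# Unitary invariance: multiplying by unitary (here real orthogonal)
# matrices leaves the Frobenius and spectral norms unchanged.
rng = np.random.default_rng(1)
A = rng.standard_normal((4, 3))
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # orthogonal factor
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))
for norm in ('fro', 2):
    print(np.linalg.norm(U @ A @ V, norm) - np.linalg.norm(A, norm))
```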
Perturbation of linear systems

- Consider the 2 × 2 linear system
    x1 + x2 = 20
    x1 + (1 − 10⁻¹⁶)x2 = 20 − 10⁻¹⁵.
- The exact solution is x1 = x2 = 10.
- Suppose we replace the second equation by
    x1 + (1 + 10⁻¹⁶)x2 = 20 − 10⁻¹⁵;
- the exact solution changes to x1 = 30, x2 = −10.
- A small change in one of the coefficients, from 1 − 10⁻¹⁶ to 1 + 10⁻¹⁶, changed the exact solution by a large amount.
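Because 1 − 10⁻¹⁶ rounds to 1 in double precision, this example has to be reproduced in exact rational arithmetic; a small sketch of ours (Cramer's rule over Python's `fractions`):

```python
from fractions import Fraction as F

# Solve a 2x2 system exactly by Cramer's rule.
def solve2(a11, a12, a21, a22, b1, b2):
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

eps = F(1, 10**16)           # 1e-16, exactly
b2 = 20 - 10 * eps           # 20 - 1e-15, exactly
print(solve2(1, 1, 1, 1 - eps, 20, b2))  # x1 = x2 = 10
print(solve2(1, 1, 1, 1 + eps, 20, b2))  # x1 = 30, x2 = -10
```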
Ill Conditioning

- A mathematical problem in which the solution is very sensitive to changes in the data is called ill-conditioned or sometimes ill-posed.
- Such problems are difficult to solve on a computer.
- If at all possible, the mathematical model should be changed to obtain a more well-conditioned or properly-posed problem.
Perturbations

- We consider what effect a small change (perturbation) in the data A, b has on the solution x of a linear system Ax = b.
- Suppose y solves (A + E)y = b + e, where E is a (small) n × n matrix and e a (small) vector.
- How large can y − x be?
- To measure this we use vector and matrix norms.
Conditions on the norms

- k·k will denote both a vector norm on Cn and a submultiplicative matrix norm on Cn,n which in addition is subordinate to the vector norm.
- Thus for any A, B ∈ Cn,n and any x ∈ Cn we have

  kABk ≤ kAk kBk and kAxk ≤ kAk kxk.

- This is satisfied if the matrix norm is the operator norm corresponding to the given vector norm, or the Frobenius norm together with the Euclidean vector norm.
Absolute and relative error

- The difference ky − xk measures the absolute error in y as an approximation to x.
- ky − xk/kxk or ky − xk/kyk is a measure of the relative error.
Perturbation in the right hand side
Theorem
Suppose A ∈ Cn,n is invertible, b, e ∈ Cn, b ≠ 0, and Ax = b, Ay = b + e. Then

  (1/K(A)) · kek/kbk ≤ ky − xk/kxk ≤ K(A) · kek/kbk,  where K(A) = kAk kA⁻¹k.   (8)

- In (8), kek/kbk is a measure of the size of the perturbation e relative to the size of b.
- In the worst case ky − xk/kxk can be K(A) = kAk kA⁻¹k times as large as kek/kbk.

Condition number

- K(A) is called the condition number with respect to inversion of a matrix, or just the condition number if it is clear from the context that we are talking about solving linear systems.
- The condition number depends on the matrix A and on the norm used.
- If K(A) is large, A is called ill-conditioned (with respect to inversion).
- If K(A) is small, A is called well-conditioned (with respect to inversion).
Condition number properties

- Since kAk kA⁻¹k ≥ kAA⁻¹k = kIk ≥ 1, we always have K(A) ≥ 1.
- Since all matrix norms are equivalent, the dependence of K(A) on the norm chosen is less important than the dependence on A.
- Usually one chooses the spectral norm when discussing properties of the condition number, and the l1 and l∞ norms when one wishes to compute or estimate it.
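In NumPy the condition number is available directly; a small check of ours on a variant of the earlier 2 × 2 example (with ε = 10⁻⁸ instead of 10⁻¹⁶, so the matrix is representable in double precision):

```python
import numpy as np

# Condition numbers in the 1-, 2- and infinity-norms.
eps = 1e-8
A = np.array([[1.0, 1.0], [1.0, 1.0 - eps]])
for p in (1, 2, np.inf):
    print(p, np.linalg.cond(A, p))   # all of order 4e8: ill-conditioned
print(np.linalg.cond(np.eye(2)))     # 1.0: best possible
```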
The 2-norm

- Suppose A has singular values σ1 ≥ σ2 ≥ · · · ≥ σn > 0 and, if A is square, eigenvalues |λ1| ≥ |λ2| ≥ · · · ≥ |λn|.
- K2(A) = kAk2 kA⁻¹k2 = σ1/σn.
- K2(A) = kAk2 kA⁻¹k2 = |λ1|/|λn| if A is normal.
- K2(A) = kAk2 kA⁻¹k2 = λ1/λn if A is positive definite.
- It follows that A is ill-conditioned with respect to inversion if and only if σ1/σn is large, or |λ1|/|λn| is large when A is normal.
The residual

Suppose we have computed an approximate solution y to Ax = b. The vector r(y) := Ay − b is called the residual vector, or just the residual. We can bound x − y in terms of r(y).

Theorem
Suppose A ∈ Cn,n, b ∈ Cn, A is nonsingular and b ≠ 0. Let r(y) = Ay − b for each y ∈ Cn. If Ax = b then

  (1/K(A)) · kr(y)k/kbk ≤ ky − xk/kxk ≤ K(A) · kr(y)k/kbk.   (9)
Discussion

- If A is well-conditioned, (9) says that ky − xk/kxk ≈ kr(y)k/kbk.
- In other words, the accuracy in y is about the same order of magnitude as the residual as long as kbk ≈ 1.
- If A is ill-conditioned, anything can happen:
- the solution can be inaccurate even if the residual is small;
- we can have an accurate solution even if the residual is large.
The inverse of A + E

Theorem
Suppose A ∈ Cn,n is nonsingular and let k·k be a consistent matrix norm on Cn,n. If E ∈ Cn,n is so small that r := kA⁻¹Ek < 1, then A + E is nonsingular and

  k(A + E)⁻¹k ≤ kA⁻¹k / (1 − r).   (10)

If r < 1/2 then

  k(A + E)⁻¹ − A⁻¹k / kA⁻¹k ≤ 2 K(A) kEk / kAk.   (11)
Proof
- We use that if B ∈ Cn,n and kBk < 1, then I − B is nonsingular and k(I − B)⁻¹k ≤ 1/(1 − kBk).
- Set B := −A⁻¹E, so that I − B = I + A⁻¹E and kBk = r. Since r < 1 the matrix I − B is nonsingular.
- Since (I − B)⁻¹A⁻¹(A + E) = I, we see that A + E is nonsingular with inverse (I − B)⁻¹A⁻¹.
- Hence k(A + E)⁻¹k ≤ k(I − B)⁻¹k kA⁻¹k ≤ kA⁻¹k/(1 − r), and (10) follows.
- From the identity (A + E)⁻¹ − A⁻¹ = −A⁻¹E(A + E)⁻¹ we obtain, using (10),
  k(A + E)⁻¹ − A⁻¹k ≤ kA⁻¹k kEk k(A + E)⁻¹k ≤ K(A) (kEk/kAk) kA⁻¹k/(1 − r).
- Dividing by kA⁻¹k and using 1/(1 − r) < 2 for r < 1/2 proves (11).
Perturbation in A

Theorem
Suppose A, E ∈ Cn,n, b ∈ Cn with A invertible and b ≠ 0. If r := kA⁻¹Ek < 1/2 for some operator norm, then A + E is invertible. If Ax = b and (A + E)y = b, then

  ky − xk/kyk ≤ kA⁻¹Ek ≤ K(A) kEk/kAk,   (12)
  ky − xk/kxk ≤ 2 K(A) kEk/kAk.   (13)
Proof

- By the previous theorem A + E is invertible.
- (12) follows by taking norms in the equation x − y = A⁻¹Ey and dividing by kyk.
- From the identity y − x = ((A + E)⁻¹ − A⁻¹)Ax we obtain ky − xk ≤ k(A + E)⁻¹ − A⁻¹k kAk kxk, and (13) follows from (11).
Finding the rank of a matrix

- Gauss-Jordan elimination cannot be used to determine the rank numerically.
- Use the singular value decomposition instead.
- Even when the exact rank is smaller than n, a numerically computed SVD will normally give σn > 0.
- Determine the smallest r so that σ_{r+1}, . . . , σn are "close" to the round-off unit.
- Use this r as an estimate for the rank.
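The recipe above can be sketched in a few lines of NumPy (the function name and tolerance rule are ours; NumPy's `matrix_rank` applies a similar default):

```python
import numpy as np

# Estimate rank by counting singular values above a round-off tolerance.
def numerical_rank(A, tol=None):
    s = np.linalg.svd(A, compute_uv=False)
    if tol is None:
        tol = max(A.shape) * np.finfo(A.dtype).eps * s[0]
    return int(np.sum(s > tol))

u = np.arange(1.0, 5.0)
A = np.outer(u, u)                # rank-1 matrix, formed in floating point
print(numerical_rank(A))          # 1
print(np.linalg.matrix_rank(A))   # 1, using NumPy's built-in tolerance
```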
Convergence in Rm,n and Cm,n

- Consider an infinite sequence of matrices {Ak} = A0, A1, A2, . . . in Cm,n.
- {Ak} is said to converge to the limit A in Cm,n if each element sequence {Ak(ij)}k converges to the corresponding element A(ij) for i = 1, . . . , m and j = 1, . . . , n.
- {Ak} is a Cauchy sequence if for each ε > 0 there is an integer N ∈ N such that for each k, l ≥ N and all i, j we have |Ak(ij) − Al(ij)| ≤ ε.
- {Ak} is bounded if there is a constant M such that |Ak(ij)| ≤ M for all i, j, k.
More on Convergence

- By stacking the columns of each Ak into a vector in Cmn we obtain the following results:
- A sequence {Ak} in Cm,n converges to a matrix A ∈ Cm,n if and only if lim_{k→∞} kAk − Ak = 0 for any matrix norm k·k.
- A sequence {Ak} in Cm,n is convergent if and only if it is a Cauchy sequence.
- Every bounded sequence {Ak} in Cm,n has a convergent subsequence.
The Spectral Radius

- ρ(A) := max_{λ∈σ(A)} |λ| is the spectral radius of A.
- For any consistent matrix norm k·k on Cn,n and any A ∈ Cn,n we have ρ(A) ≤ kAk.
- Proof: Let (λ, x) be an eigenpair for A and set X := [x, . . . , x] ∈ Cn,n.
- Then λX = AX, which implies |λ| kXk = kλXk = kAXk ≤ kAk kXk.
- Since kXk ≠ 0 we obtain |λ| ≤ kAk.
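A numerical illustration (ours) of the bound ρ(A) ≤ kAk for several norms at once:

```python
import numpy as np

# The spectral radius is a lower bound for each of these matrix norms.
rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
rho = np.max(np.abs(np.linalg.eigvals(A)))
norms = [np.linalg.norm(A, p) for p in (1, 2, np.inf, 'fro')]
print(rho, min(norms))   # rho(A) does not exceed any of the norms
```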
A Special Norm

Theorem
Let A ∈ Cn,n and  > 0 be given. There is a consistent matrix
norm k·k0 on Cn,n such that ρ(A) ≤ kAk0 ≤ ρ(A) + .
A Very Important Result
Theorem
For any A ∈ Cn,n we have

  lim_{k→∞} Aᵏ = 0  ⇐⇒  ρ(A) < 1.

- Proof:
- Suppose ρ(A) < 1.
- By the previous theorem there is a consistent matrix norm k·k on Cn,n such that kAk < 1.
- But then kAᵏk ≤ kAkᵏ → 0 as k → ∞.
- Hence Aᵏ → 0.
- The converse is easier.
Convergence can be slow

   
A = [ 0.99 1    0    ]          A¹⁰⁰ ≈ [ 0.4 37  1849 ]
    [ 0    0.99 1    ]                 [ 0   0.4 37   ]
    [ 0    0    0.99 ]                 [ 0   0   0.4  ]

A²⁰⁰⁰ ≈ [ 10⁻⁹ 4·10⁻⁶ 0.004  ]
        [ 0    10⁻⁹   4·10⁻⁶ ]
        [ 0    0      10⁻⁹   ]
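This can be reproduced with `matrix_power`; a check of ours:

```python
import numpy as np

# rho(A) = 0.99 < 1, so A^k -> 0, but the corner entry first grows
# to about 1849 (at k = 100) before decaying toward zero.
A = np.diag([0.99, 0.99, 0.99]) + np.diag([1.0, 1.0], 1)
P = np.linalg.matrix_power
print(P(A, 100)[0, 2])           # ~1849
print(np.abs(P(A, 2000)).max())  # ~0.004
```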
More limits

- For any submultiplicative matrix norm k·k on Cn,n and any A ∈ Cn,n we have

  lim_{k→∞} kAᵏk^{1/k} = ρ(A).   (14)
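Equation (14) (Gelfand's formula) converges rather slowly for the matrix above; a quick look of ours in the spectral norm:

```python
import numpy as np

# kA^kk^(1/k) approaches rho(A) = 0.99 as k grows.
A = np.diag([0.99, 0.99, 0.99]) + np.diag([1.0, 1.0], 1)
for k in (10, 100, 1000, 10000):
    nk = np.linalg.norm(np.linalg.matrix_power(A, k), 2) ** (1.0 / k)
    print(k, nk)   # decreases toward 0.99
```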
