( (Robust) ) ( (Optimal) ) Robust & Optimal Control
( (Robust) ) ( (Optimal) ) Robust & Optimal Control
Zhou, Kemin.
Robust and optimal control / Kemin Zhou with Join C. Doyle and
Keith Glover.
p. cm.
Includes bibliographical references and index.
ISBN O-13-456567-3
1. Control theory. 2. Mathematical optimization. I. Doyle, John
Comstock. II. Glover, K. (Keith), 1946- III. Title.
QA402.3.Z48 1 9 9 5
629.8’3 12--dc20 95-303 11
CIP
ISBN O-13-456567-3
HW@82@
Contents
...
Preface x111
1 Introduction 1
2 Linear Algebra 17
vii
.. .
Vlll CONTENTS
4 Performance Specifications 91
4.1 Normed Spaces . . . . . . . 91
4.2 Hilbert Spaces . . . . . . . . . . ........... 93
4.3 Hardy Spaces ?tz and ?im . . . ........... 97
4.4 Power and Spectral Signals . . ........... 102
4.5 Induced System Gains . . . . . ........... 104
4.6 Computing .& and Xz Norms . ........... 112
4.7 Computing J& and IFt, Norms ........... 114
4.8 Notes and References , . . . . . ........... 116
Bibliography 573
Index 591
Preface
This book is inspired by the recent development in the robust and ‘& control the-
ory, particularly the state-space XFt, control theory. We give a fairly comprehensive
and step-by-step treatment of the state-space ‘H, control theory in the style of Doyle,
Glover, Khargonekar, and Francis [1989]. We have tried to make this derivation as self-
contained as possible and for reference have included many background results on linear
systems, the theory and application of Riccati equations and model reduction. We also
treat the robust control problems with unstructured and structured uncertainties. The
linear fractional transformation (LFT) and the structured singular value (known as p)
are introduced as the unified tools for robust stability and performance analysis and
synthesis. Chapter 1 contains a more detailed chapter-by-chapter review of the topics
and results presented in this book. We have not included any exercises in this edition.
However, exercises and their solutions will be available through anonymous ftp on In-
ternet: /pub/lcemin/ZDG at hot.caltech.edu. Like any book, there are inevitably errors
and we therefore encourage all readers to write us about errors and suggestions. We
also encourage readers to send us exercise problems so that they can be shared by all
readers.
We would like to thank Professor Bruce A. Francis at University of Toronto for his
helpful comments and suggestions on early versions of the manuscript. As a matter
of fact, this manuscript was inspired by his lectures given at Caltech in 1987 and his
masterpiece - A Course in ‘H, Control Theory. We are grateful to Professor Andre Tits
at University of Maryland who has made numerous helpful comments and suggestions
that have greatly improved the quality of the manuscript. Professor Jakob Stoustrup,
Professor Hans Henrik Niemann, and their students at The Technical University of
Denmark have read various versions of this manuscript and have made many helpful
comments and suggestions. We are grateful to their help. Special thanks go to Professor
Andrew Packard at University of California-Berkeley for E: help during the preparation
of the early versions of this manuscript. We are also grateful to Professor Jie Chen at
University of California-Riverside for providing material used in Chapter 6. We would
also like to thank Professor Kang-Zhi Liu at Chiba University (Japan) and Professor
Tongwen Chen at University of Calgary for their valuable comments and suggestions.
In addition, we would like to thank Gary Balas, Carolyn Beck, Dennis S. Bernstein,
Bobby Bodenheimer, Guoxiang Gu, Weimin Lu, John Morris, Matt Newlin, Li Qiu,
.
x111
xiv PREFACE
Hector P. Rotstein, Malcolm Smith, and many other people for their comments and
suggestions. The first author is especially grateful to Professor Pramod P. Khargonekar
at The University of Michigan for introducing him to robust and 7-&, control and to
Professor Tryphon Georgiou at University of Minnesota for encouraging him to complete
this work. The trailing authors are pleased to acknowledge that the major contribution
to the writing of this book has been made by the first author.
Finally, the first author wants to thank his family for their support and encourage-
ment.
Kemin Zhou
John C. Doyle
Keith Glover
Notation and Symbols
E belong to
C subset
U union
n intersection
0 end of proof
0 end of example
0 end of remark
.-
.- defined as
M> asymptotically greater than
< asymptotically less than
M
> much greater than
< much less than
75 complex conjugate of cr E @
I4 absolute value of (Y E @
Ma) real part of Q E Cc
xv
xvi NOTATION AND SYMBOLS
Jw polynomial ring
%ss) rational proper transfer matrices
H-l
A B
shorthand for state space realization C(s1 - A)-‘B + D
c D
or C(z1 - A)-‘B + D
lower LFT
upper LFT
star product
List of Acronyms
xix
xx LIST OF ACRONYMS
80’s - 90’s -
for specifying both the level of plant uncertainty and the signal gain from disturbance
inputs to error outputs in the controlled system.
The “standard” ‘& optimal control problem is concerned with the following block
diagram:
The ‘& norm gives the maximum energy gain (the induced Cz system gain), or sinu-
soidal gain of the system. This is in contrast to the ‘Fls norm, ]]TrW]]z, which for example
gives the variance of the output given white noise disturbances. The important property
of the ‘& norm comes from the application of the small gain theorem, which states
that if []TZW[]oo I y then the system with block diagram,
Z W
G
Y U
K
@
will be stable for all stable n with /]A(], < I/y. It is probably the case that this robust
stability consequence was the main motivation for the development of ‘HW methods
rather than the worst case signal gain.
The synthesis of controllers that achieve an ?fW norm specification hence gives a well-
defined mathematical problem whose solution we will now discuss. Most of the original
solution techniques were in an input-output setting and involved analytic functions
(Nevanlinna-Pick interpolation) or operator-theoretic methods [Sarason, 1967; Adamjan
et al., 1978; Ball and Helton, 19831, and such derivations involved a fruitful collaboration
between Operator Theorists and Control Engineers (see Dym [1994] for some historical
remarks). Indeed, ‘& theory seemed to many to signal the beginning of the end
for the state-space methods which had dominated control for the previous 20 years.
Unfortunately, the standard frequency-domain approaches to ?YW started running into
significant obstacles in dealing with multi-input multi-output (MIMO) systems, both
mathematically and computationally, much as the ‘Hz (or LQG) theory of the 1950’s
had.
Not surprisingly, the first solution to a general rational MIMO ‘H, optimal control
problem, presented in Doyle [1984], relied heavily on state-space methods, although
more as a computational tool than in any essential way. The procedure involved state-
space inner/outer and coprime factorizations of transfer function matrices which reduced
the problem to a Nehari/Hankel norm problem solvable by the state-space method in
Glover [1984]. Both [Francis, 19871 and [Francis and Doyle, 19871 give expositions of
this approach, which in a mathematical sense “solved” the general rational problem but
in fact suffered from severe problems with the high order of the Riccati equations to be
solved.
4 INTRODUCTION
It is interesting to observe at this point th;lq the above techniques related to Hankel
operators were simultaneously being exploiter I and developed in the model reduction
literature. In particularly the striking result ‘)f Adamjan, Arov and Krein [1978] on
rational approximat,ion in the Hankel norm, which had been communicated to the
West,ern Systems and Control community by ielton), had led to the state space mul-
tivariable results in Kung and Lin [1981] and Glover [1984]. The latter paper gave a
self-contained state-spa.ce treatment exploitin:, the balanced realizations proposed for
model reduction by Moore [1981] which arc al: ‘) of independent interest.
The simple state space 3-i, controller formu i<te t,o be presented in this book were first
announced in Glover and Doyle [1988] (after : I )me sustained manipulation). However
the very simplicity of the new formulae and thl lr similarity with the ‘Hz ones suggested
a more direct approach. Independent encourag ment for a simpler approach to the ?tm
problem came from papers by Khargonekar, ‘etersen, Rotea, a n d Z h o u [1987,1988],
They showed that the state-feedback 3-1, probl ‘rn can be solved by solving an algebraic
Riccati equation and completing the square.
Derivations of the controller formulae in G ~lvcr and Doyle [1988] using derivations
more akin to the above state feedback results v, xre given in Doyle, Glover, Khargonekar
and Francis [1989] and will form the basis 01’ the developments in this book. The
operator theory still plays a central role (as do 3s R.edheffer’s work [Redheffer, 19601 on
linear fractional transformations), but its use i more straightforward. The key to this
was a return to simple and familiar state-space 1( ~ols, in the style of Willems [1971], such
as completing the square, and the connection beI ween frequency domain inequalities (e.g
[]G(lw < l), R,iccati equations, and spectral fat .orizations.
This has been a brief and personal account of . hese developments and more extensive,
but still inadequate, comments are made in set Ion 16.12. Relations between 3-1, have
now been established with many other topics i I control: e.g. risk sensitive control of
Whittle [1981, 19901; differential games (see Ba. ,U and Bernhard [1991], Limebeer et al
[1992], Green and Limebeer [1995]); J-lossless I’.lctorization (Green [1992]); maximum
entropy methods of Dym and Gohberg [1986] Isee Mustafa and Glover [1990]). The
state-space theory of ‘H, has been also been carried much further, by generalizing
time-invariant to time-varying, infinite horizon ! 1) finite horizon, and finite dimensional
to infinite dimensional and even to some nonlinr i1.r results. It should be mentioned that
in these generalizations the term XF1, has come to be (mis-)used to mean the induced
norm in ,&. Some of these developments also jrovided alternative derivations of the
standard ‘P& results. Indeed one of the attractif Ins of this area to researchers is that it
can be approached from such a diverse technic; I backgrounds with each providing its
own interpretations. These developments are bt \ ond the scope of the present book.
Having established that the above 3-t, contra problem can be relatively easily solved
and can represent specifications for performant ( *and robustness let us return to the
question of whether this gives a suitable robust control design tool. There is no question
that the algorithm can be used to provide poor CC lntrollers due to poorly chosen problem
descriptions resulting in, for example, very high 1 (tndwidth controllers. T WO approaches
mentioned in this book attempt to satisfy this rl aquirement. Firstly the method of ‘&
1.2. How to Use This Book 5
loop shaping is described where the desired loop shupc is spccifiod together with a
requirement of robust stability, and has been found to be an intuitively appealing and
robust procedure with C~OSC connections to stabilization in the ga.p metric. Secondly
the method of ,u analysis (the structured singular value) is introduced. This approach
due to Doyle [1981] gives an cffectivc analysis tool for assessing robust performance in
the presence of structured uncertainty. Note that the N, norm alone can only give a
conservative prediction of robust performance. The synthesis of controllers that, satisfy
such a criterion (the /l-synthesis problem), can bc approachc:tI iteratively with the ‘&
synthesis for a scaled system as an intermediate step.
Finally it is interesting to consider the question of why t,hc induced norm in C2 has
been used. Is it just for the mathematical convenience of Lx being a Hilbert space?
Apart from the relative simplicity of the solution the ma.in advantage is probably its
easy interpretation in terms of the familiar frequency response considerations. How-
ever roughly in parallel with the development of the ‘HFt, theory has been the work on
L, induced norms of Pearson and coworkers (set> the book by Dahleh [1995]) where
analogous results on robustness and performance ran be made.
In view of the above classification, one possible choice for an one-semester course
on robust control (analysis) would cover Chapters 4 - 5, 9 - 11 or 4 - 11 and an one-
semester advanced course on ?fz and ‘H, control would cover (parts of) Chapters 12-19.
Another possible choice for an one-semester course on ‘HH, control may include Chapter
4, parts of Chapter 5 (5.1- 5.3, 5.5, 5.7), Chapter 10, Chapter 12 (except Section 12.6),
parts of Chapter 13 (13.2, 13.4, 13.6), Chapter 15 and Chapter 16. Although Chapters
7 - 8 are very much independent of other topics and can, in principle, be studied at any
stage with the background of Section 3.9, they may serve as an introduction to sources
of model uncertainties and hence to robustness problems.
Table 1.1: Possible choices for an one-semester course (* chapters may be omitted)
Table 1.1 lists several possible choices of topics for an one-semester course. A course
on model and controller reductions may only include the concept of 3-1, control and
the 3c, controller formulas with the detailed proofs omitted as suggested in the above
table.
not explicitly stated. Readers should consult the corresponding chapters for the exact
statements and conditions.
Chapter 2 reviews some basic linear algebr;l facts and treats a special class of matrix
dilation problems. In particular, we show
Chapter 4 defines several norms for signals and introduces the 7-t~ spaces and the 7-1,
s p a c e s . The input/output gains of a stable liuear system under various input signals
are characterized. We show that ‘Hz and 7-t, norms come out naturally as measures of
the worst possible performance for many classes of input signals. For example, let
1
where
LiB”/y2
-C*C -A* ’
Chapter 5 introduces the feedback structux<b and discusses its stability and perfor-
mance properties.
14 INTRODUCTION
with
F = -R-l@* + B*X).
Chapter 14 treats the optimal control of line.-Lr time-invariant systems with quadratic
performance criteria, i.e., L&R and 3-t~ probll-sms. We consider a dynamical system
described by an LFT with
Define
A+B2~2-i-LzG I-;2 1.
Ko,t(s) :=
Chapter 15 solves a max-min problem, i.e.. a full information (or state feedback)
7-t, control problem, which is the key to the ‘&, theory considered in the next chapter.
Consider a dynamical system
i = Az + BIW + BZU
z = Cla: + D12u, DT2 [ C, D12 ] = [ 0 I ] .
Then we show that min llzll2 < y if and only if H, E dom(Ric) and X, =
w:2*+ uEC2+
Ric(H,) > 0 where
y211,B; - B2B;
-A* I
Linear Algebra
Some basic linear algebra facts will be reviewed in this chapter. The detailed treatment
of this topic can be found in the references listed at the end of the chapter. Hence we shall
omit most proofs and provide proofs only for those results that either cannot be easily
found in the standard linear algebra textbooks or are insightful to the understanding of
some related problems. We then treat a special class of matrix dilation problems which
will be used in Chapters 8 and 17; however, most of the results presented in this book
can be understood without the knowledge of the matrix dilation theory.
17
18 LINEAR ALGEBRA
such a basis for a subspace S is not unique but all bases for S have the same number
of elements. This number is called the dimension of S, denoted by dim(S).
A set of vectors (51, x2,. . . ,xk} in IF” are mutually orthogonal if xfxj = 0 for all
i # j and orthonormal if X~XC~ = Sij, where the superscript * denotes complex conjugate
transpose and Sij is the Kronecker delta function with Sij = 1 for i = j and 6ij = 0
for i # j. More generally, a collection of subspaces S~,SZ, . . . , Sk of IFn are mutually
orthogonal if x*y = 0 whenever x E Si and y E Sj for i # j.
The orthogonal complement of a subspace ,S c IF” is defined by
We call a set of vectors (~1, ~2,. . . , uk} an orthonormal basis for a subspace S E IF if
they form a basis of S and are orthonormal. It is always possible to extend such a basis
to a full orthonormal basis {ul,uz,, . . ,u,} for IF‘. Note that in this case
s’ = span{?&+1 , . . . , ‘IL,},
(Note that a vector x E IF” can also be viewed as a linear transformation from F to IF”,
hence anything said for the general matrix case is also true for the vector case.) Then
the kernel or null space of the linear t,ransformation A is defined by
ImA=R(A):={yEP :y=Ax,x~F}.
It is clear that KerA is a subspace of IF and Im.4 is a subspace of IF”t . Moreover, it can
be easily seen that dim(KerA) -t dim(ImA) = IL and dim(ImA) = dim(KerA)I. Note
that (KerA)‘- is a subspace of IF”.
Let ai, i = 1,2,. . . , n denote the columns of a matrix A E YPXn, then
rank(A) = dim(ImA).
It is a fact that rank(A) = rank(A*), and thus the rank of a matrix equals the maximal
number of independent rows or columns. A matrix A E lPXn is said to have full row
rank if m 5 n and rank(A) = m. Dually, it is said to have fuU column rank if n 5 m
and rank(A) = n. A full rank square matrix is called a nonsingular matrix. It is easy
2.1. Linear SubsDaces 19
to see that rank(A) = rank(AT) = rank(PA) if T and P are nonsingular matrices with
appropriate dimensions.
A square matrix U E Fnx” whose columns form an orthonormal basis for P is called
an unitary matrix (or orthogonal matrix if IF = R), and it satisfies U’U = I = UU*.
The following lemma is useful.
Lemma 2.1 Let D = [ dl . . dk ] E pxk (n > k) be such that D*D = I, so
di, i = 1,2,. . , k are orthonormal. Then there exists a matrix Dl E lVX(“-k) such that
[ D Dl ] is a unitary matrix. Furthermore, the columns of Dl, di, i = 5 + 1,. . . , n,
form an orthonormal completion of {dl, dz, . . . , dk}.
Trace(A) := 2 aii.
2=1
Ax = X..,
Theorem 2.4 For any square complex matrix A E P’“, there exists a nonsingular
matrix T such that
A = TJT -~I
where
J= diag{JI,Jz....,Jl}
Ji=diag{Jil,Ji2....,Jim;}
A; 1
xi 1
Jij = .. . ... E (-JL, XWJ
xi 1
xi _
T = [ TI Tz . Tl ]
are called the generalized eigenvectors of A. For a given integer q 5 n+, the generalized
eigenvectors tijl,kfl < q, are called the lower rank generalized eigenvectors of tzj,.
Definition 2.1 A square matrix A E IPx”is called cyclic if the Jordan canonical form
of A has one and only one Jordan block associated with each distinct eigenvalue.
More specifically, a matrix A is cyclic if its Jordan form has rni = 1, i = 1,. . . , 1. Clearly,
a square matrix A with all distinct eigenvalues is cyclic and can be diagonalized:
. I.
Xl
x2
A [ xl z2 . . . x, ] = [ x1 x2 . . x, ]
&I
where y; E P is given by
I!
Y;
Y;
= [ Xl 22 ..’ xn 1-l.
YG
In general, eigenvalues need not be real, and neither do their corresponding eigenvec-
tors. However, if A is real and X is a real eigenvalue of A, then there is a real eigenvector
corresponding to X. In the case that all eigenvalues of a matrix A are real’, we will
denote X ,,,(A) for the largest eigenvalue of A and X,,, (A) for the smallest eigenvalue.
In particular, if A is a Hermitian matrix, then there exist a unitary matrix U and a
real diagonal matrix A such that A = UAU*, where the diagonal elements of A are the
eigenvalues of A and the columns of U are the eigenvectors of A.
The following theorem is useful in linear system theory.
Theorem 2.5 (Cayley-Hamilton) Let A E CnXn and denote
Then
A” + alA”-l + . + a,,1 = 0.
‘For example, this is the case if A is Hermitian, i.e., A = A*.
22 b LINEAR ALGEBRA
and Xi is an eigenvalue of A. The proof for tht! general case follows from the following
lemma.
Lemma 2.6 Let A E Px”. Then
and
det(XI - A) = X” + al An-l + . . . + a,
where ai and Ri can be computed from the following recursive formulas:
al = - Trace A R1 = I
a2 = - $ Trace( R2 A) R2 = RIA+alI
The proof is left to the reader as an exercise. IVote that the Cayley-Hamilton Theorem
follows from the fact that
A := [ ;;; :::;; ]
where All and A22 are also square matrices. r\ow suppose AlI is nonsingular, then A
has the following decomposition:
let x1, x2, . . . , xl be the corresponding eigenvector and the generalized eigenvectors ob-
tained through the following equations:
( A - X1I)zl = o
(A-X11)22 = xl
( A - Xll)zl = x1-1.
Then a subspace S with xt E S for some t 5 1 is an A-invariant subspace only if all lower
rank eigenvectors and generalized eigenvectors of x1 are in S, i.e., xi E S, Vl 5 i 5 t.
This will be further illustrated in Example 2.1.
On the other hand, if S is a nontrivial subspace and is A-invariant, then there is
x E S and X such that Ax = Xx.
Example 2.1 Suppose a matrix A has the following Jordan canonical form
1 1
Xl 1
Xl
A [ Xl 22 x3 x4 ] = [ x1 x2 53 x4 ]
x3
x4
with ReXr < 0, X3 < 0, and X4 > 0. Then it is easy to verify that
are all A-invariant subspaces. Moreover, Sr , S3, Sr2, S13, and Sl23 are stable A-invariant
subspaces. However, t h e s u b s p a c e s S2 = span{xz}, S23 = span{xs,xs},
SZ4 = span(x2, x4}, a n d S 234 = span{xa, x3, x4} are not A-invariant subspaces since
the lower rank generalized eigenvector x1 of x2 is not in these subspaces. TO illus-
trate, consider the subspace $3. Then by definition, Ax2 E S23 if it is an A-invariant
subspace. Since
Ax2 = Xx2 + ~1,
Ax2 E S23 would require that x1 be a linear combination of x2 and x3, but this is
impossible since XI is independent of x2 and 23. 0
for any x E X and y E X. A function is said tc) be a semi-norm if it satisfies (i), (iii),
and (iv) but not necessarily (ii).
Let x E U?. Then we define the vector pnorm of x as
IlP
I n
Ml2 := c /xi12;
J i=l
1141, := 1’=Fl&
- -’ IXil.
Clearly, norm is an abstraction and extension of our usual concept of length in 3-
dimensional Euclidean space. So a norm of a vet tor is a measure of the vector “length”,
for example (jx((s is the Euclidean distance of the vector 3: from the origin. Similarly,
we can introduce some kind of measure for a miltrix.
Let A = [a+] E CmXn, then the matrix norm induced by a vector p-norm is defined
as
In particular, for p = 1,2,00, the corresponding Induced matrix norm can be computed
as
m
2.7. Vector Norms and Matrix Norms 29
The matrix norms induced by vector pnorms are sometimes called induced p-norms.
This is because llAl[r is defined by or induced from a vector p-norm. In fact, A can
be viewed as a mapping from a vector space C” equipped with a vector norm 11.11, to
another vector space C” equipped with a vector norm 11.11,. So from a system theoretical
point of view, the induced norms have the interpretation of input/output amplification
gains.
We shall adopt the following convention throughout the book for the vector and matrix
norms unless specified otherwise: let x E P and A E Cmxn, then we shall denote the
Euclidean 2-norm of x simply by
11x11 := 11x112
and the induced 2-norm of A by
II4 := IL‘% .
I. Suppose n 2 m. Then llxll = llyjl iJf there is a matrix U E IF”‘” such that x = Uy
and U*U = I.
2. Suppose n = m. Then (x*yl 5 llxll IIy(l. M oreover, the equality holds iff x = ay
forsomeaEF ory=O.
Another often used matrix norm is the so called Frobenius norm. It is defined as
llAllF := dm = Fk laijl’ .
i=l j=l
Lemma 2.9 Let A and B be any matrices with appropriate dimensions. Then
30 LINEAR ALGEBRA
1. ~(4 5 II4 (Th’as is also true for F norm and any induced matrix norm).
3. IPAVll = ll4L and IIUAVII, = IIAIIF, for. any appropriately dimensioned unitary
matrices U and V.
and let each Aij be an appropriately dimension6 d matrix. Then for any induced matrix
:1
p-norm
MII Ilp IIAn Ilp . . . IIAdp
IL421 I& IL422 Ilp . . IIAz&,
(2.2)
Proof. It is obvious that if the F-norm is used, then the right hand side of inequal-
ity (2.2) equals the left hand side. Hence only the induced pnorm cases, 1 5 p 5 00,
will be shown. Let a vector x be partitioned consistently with A as
H
Xl
X2
x= . ;
xq
I .!
IlZl Ilp
IlQ Ilp
l141p = : .
llx& p
Then
= sup L sup
ll41,,=1 ll4,,=1
p
11 cg1 A@, 11
p- P
IIAII Ilp IlA1211p . . IIAdlp 11x1 lip
=
5
sup
~~211,,=1
II-421
i IIAn Ilp
i
lL4m211p
.
.
II& lip
lli, I& Ii : I
IIx211p
11~111, p
= (1 [IIAijllp] lip.
32 LINEAR ALGEBRA
where
Cl =
I u‘1 0 0 . . CT2 0 0 ‘.‘. . . 0.0 : .
up
4
and
Ul 2 ff2 2 ‘.’ 2 ffP > 0, p = min{m,n}.
Proof. Let g = I(Al( and without loss of gentrality assume m 2 n. Then from the
definition of llA[l, there exists a z E IF” such that
IMI = oll4l.
I -
By Lemma 2.8, there is a matrix i? E Fmxn su,,h that U*U = I and
Az=d.?:.
Now let
and
u=[y u+.IFnxm
be unitary.4 Consequently, U*AV has the follo\ving structure:
Al := U*AV
it follows that J\A1112 2 g2 + w*w. But since g = \lAl\ = l]AlII, we must have w = 0.
An obvious induction argument gives
U*AV = C.
The cri is the i-th singular value of A, and the vectors u, and v~j are, respectively,
the i-th left singular vector and the j-th right singular vector. It is easy to verify that
Avi = uiu,
A*u, = (T~v;.
A*Avi = a&
AA*ui = gf~i.
and
a(A) = urnin = up = the smallest singular value of A .
Geometrically, the singular values of a matrix A are precisely the lengths of the semi-
axes of the hyperellipsoid E defined by
Thus vi is the direction in which llyll is largest for all llzll = 1; while w, is the direction
in which llylj is smallest for all 1(zlj = 1. From the input/output point of view, vr (vn)
is the highest (lowest) gain input direction, while u1 (u,) is the highest (lowest) gain
observing direction. This can be illustrated by the following 2 x 2 matrix:
cos 01 - sin 01
A =
sin Or cos 81
34 LINEAR ALGEBRA
It is easy to see that A maps a unit disk to an ellipsoid with semi-axes of ~1 and CJQ.
Hence it is often convenient to introduce the following alternative definitions for the
largest singular value O-:
F(A) := ,r,yl IIA4I
Proof.
(i) By definition
a(AA) : = ,,yyl
I IIA~4l
ZZ min x*-1*A*AAx
d 11~11=1
2 g(A) ,,ry;, ll~xll = a(A
I
(iii) Let the singular value decomposition of A be A = UCV*, then A-l = VFIU*.
Hence i~(A-l) = iY(C-‘) = l/a(E) = l/a( 4).
Then
1. rank(A) = r;
A = &riviv: = U&V;
i=l
6. IIAII = ~1;
Proof. We shall only give a proof for part 8. It is easy to see that rank(Ak) 5 t and
IIA - Akll = CJ~+I. Hence, we only need to show that rang;)5k IV - BII 2: n+l. Let
More generally, if a matrix A has neither ful: row rank nor full column rank, then all
the ordinary matrix inverses do not exist; however, the so called pseudo-inverse, known
also as the Moore-Penrose inverse, is useful. ‘l’his pseudo-inverse is denoted by A+,
which satisfies the following conditions:
(i) AA+A = A;
(ii) A+AA+ = A+;
(iii) (AA+)* = AA’;
(iv) (A+A)* = A+A.
It can be shown that pseudo-inverse is unique. I.)ne way of computing A+ is by writing
A=BC
so that B has full column rank and C has full IOW rank. Then
A+ = C*(CC*)-l(,!g*B)-lB*.
Another way to compute A+ is by using SVD. Suppose A has a singular value decom-
position
A=UCS*
[ 1 ) c, >o.
with
c= : ;
where B1 and Cr are full row rank. Then B1 and Cr have the same number of rows
and V3 := BIC:(CIC~)-l satisfies &*r,<l = I since B*B = C’C. Hence vy is a unitary
matrix and V;C B1 = Cl. Finally let
We can define square root for a positive semi-definite matrix A, Ali” = (Al/‘)* > 0,
by
Clearly, A1/2 can be computed by using spectral decomposition or SVD: let A = UAU*,
then
Ali2 = UA’/“U*
where
A = diag{Ar,. . , A,,}, A 1/2 = diag{ a, . . , a}.
Lemma 2.15 Suppose A = A* > 0 and B = B* > 0. Then A > B i;ffp(BA-l) < 1.
However A-1/2BA-1/2 and BA-l are similar, hence X;(BA-‘) = &(A-‘/’ BA-‘i2).
Therefore, the conclusion follows by the fact that
0 < I - A-l/2BA-l12
x=
[
;:
12
x12
x22 1
Then KerXa2 c KerXr2. Consequently, if X,+, is the pseudo-inverse of X22, then
Y = X12X,f, solves
YX’L2 = Xl2
and
38 LINEAR ALGEBRA
and
Moreover
(2.3)
where X, B, C, and A are constant matrices of compatible dimensions.
[ 1
X B
The matrix C A is a dilation of its sub-matrices as indicated in the following
diagram:
-- d
X
C
B
A I -
c
- - [1
B
A
t I
d c d c
I 1 7 I
d
[CA1 -
- (’ - - PI
2.11. Matrix Dilation Problems* 39
In this diagram, “c” stands for the operation of compression and “d” stands for dilation.
Compression is always norm non-increasing and dilation is always norm non-decreasing.
Sometimes dilation can be made to be norm preserving. Norm preserving dilations are
the focus of this section.
The simplest matrix dilation problem occurs when solving
min
X
(2.4)
Although (2.4) is a much simplified version of (2.3), we will see that it contains all the
essential features of the general problem. Letting ys denote the minimum norm in (2.4),
it is immediate that
YO = ll-4.
The following theorem characterizes all solutions to (2.4).
Il[ Ill
ifs there is a Y with l/Yll 5 1 such that
X
A 5-Y
X = Y(y”I - A*A)lj2.
Proof.
I [ Ill
X
A Ir
iff
X*X + A’A < y21
iff
X*X 2 (y21 - A*A).
Y := X [(y21 - A*A)‘i’] +
This theorem implies that, in general, (2.4) has more than one solution, which is
in contrast to the minimization in the Frobenius norm in which X = 0 is the unique
solution. The solution X = 0 is the central solution but others are possible unless
A*A = $1.
40 LINEAR ALGEBRA
Remark 2.1 The theorem still holds if (y21 -. A*A)li2 is replaced by any matrix R
such that y21 - A*A = R*R. V
A more restricted version of the above theorc,m is shown in the following corollary.
ID Il X
A 5-Y (C-Y)
Theorem 2.19 VT 2 yo
IN x A III 5-f
X = (y21 - AA*)l12Y.
(2.5)
The following so called Parrott’s theorem will IJay an important role in many control
related optimization problems. The proof is tire straightforward application of Theo-
rem 2.17 and its dual, Theorem 2.19.
Proof. Denote by 3 the right hand side of the equation (2.6). Clearly, ^fo 2 y since
compressions are norm non-increasing, and that ys 5 9 will be shown by using Theo-
rem 2.17 and Theorem 2.19.
Suppose A E Cnxm and n 2 m (the case for m > n can be shown in the same
fashion). Then A has the following singular value decomposition:
Hence
+21 - A*A = V(+21 - c&)V*
and
Now let
(T21 - A*A) l/2 := vpjq- pJ/qf*
and
I[ 1
(T21 - A*A)li2 (,;121- A*#/2 *
(921 r;:h*,1/2
A A
= +.I [
5
(+21 - A*A)1/2
A Ill
42 LINEAR ALGEBRA
This theorem gives one solution to (2.3) anti an expression for 70. As in (2.4), there
may be more than one solution to (2.3), although the proof of theorem 2.21 exhibits
only one. Theorem 2.22 considers the problem of parameterizing all solutions. The
solution X = -YA*Z is the %entral” solution analogous to X = 0 in (2.4).
Theorem 2.22 Suppose y > 70. The solution: X such that
(2.7)
are exactly those of the form
X = -YA*Z + $I- YY*)‘/2W(I - 2*2)1/2 (2.8)
where W is an arbitrary contraction (I]W]l 5 l), and Y with [[Y/l 5 1 and Z with
llZl/ 5 1 solve the linear equations
B = Y(y21 - A*A)li2, (2.9)
c = (y21 - A.4*)1’22. (2.10)
Proof. Since y > 70, again from Theorem 2.19 there exists a 2 with ll.Zl] < 1 such
that
C = (y21 - AA *)lj2 2.
Note that using the above expression for C we have
* -;(I -
y”I-[C A ] * [ C A ]
I[ 1
pqv2
(y21 - :*A)‘ia -A*Z (+I - ;*A)‘/2 ’
Now apply Theorem 2.17 (Remark 2.1) to inequdity (2.7) with respect to the partitioned
matrix
@I - :*A)‘/2 1
for some contraction fi, 6’ 5 1. Partition I$’ as fi = [ Wr Y ] to obtain the
II II
expression for X and B:
X = -YA*Z + yW’, (I - 2*2)1’2,
B = Y(y21 - A*A)]12.
Then llY]l < 1 and the theorem follows by notmg that I] [ WI Y ] ]I < 1 iff there is a
W, l]W]l 5 1, such that Wr = (I - YY*)li2W. 0
The following corollary gives an alternative version of Theorem 2.22 when y > ^lo.
Linear Dynamical Systems
This chapter reviews some basic system theoretical concepts. The notions of controlla-
bility, observability, stabilizability, and detectability are defined and various algebraic
and geometric characterizations of these notions are summarized. Kalman canonical de-
composition, pole placement, and observer theory are then introduced. The solutions of
Lyapunov equations and their connections with system stability, controllability, and so
on, are discussed. System interconnections and realizations, in particular the balanced
realization, are studied in some detail. Finally, the concepts of system poles and zeros
are introduced.
where x(t) E Iw” is called the system state, x(to) is called the initial condition of the
system, u(t) E Iw” is called the system input, a n d y(t) E IfP i s t h e s y s t e m o u t p u t .
The A, B, C, and D are appropriately dimensioned real constant matrices. A dynamical
system with single input (m = 1) and single output (p = 1) is called a SISO (single input
and single output) system, otherwise it is called MIMO (multiple input and multiple
45
46 LINEAR DYNAMICAL SYSTEMS
where U(s) and Y(s ) are the Laplace transform of u(t) and y(t) with zero initial condi-
tion (z(0) = 0). Hence, we have
Note that the system equations (3.1) and (3.2) (‘an be written in a more compact matrix
[:I=[: :r][:]-
form:
[i-lA B
C D
:= C(s1 -- A)-lB + D
will be used. Other reasons for using this not.ation will be discussed in Chapter 10.
[ 1
Note that
A B
C D
is a real block matrix, not a transfer function.
Now given the initial condition x(to) and the input u(t), the dynamical system
response z(t) and y(t) for t >_ to can be determined from the following formulas:
l+(t) : = ;: ; z ;;
1
3.2. Controllability and Observability 47
The input/output relationship (i.e., with zero initial state: ~0 = 0) can be described by
the convolution equation
Definition 3.1 The dynamical system described by the equation (3.1) or the pair
(A,B) is said to be controllable if, for any initial state x(O) = zu, tr > 0 and final
state 21, there exists a (piecewise continuous) input ?I,(.) such that the solution of (3.1)
satisfies x(tr) = 21. Otherwise, the system or the pair (A, B) is said to be uncontrollable.
The controllability (or observability introduced next) of a system can be verified through
some algebraic or geometric criteria.
C = [ B AB A2B . . . A”-‘B ]
has full row rank or, in other words, (A IlmB) := Cy=“=, Lm(Ai-lB) = II+?.
(iv) The matrix [A - XI, B] has full row rank for all X in C.
(v) Let A and x be any eigenvalue and any corresponding left eigenvector of A, i.e.,
x*A = x*X, then x*B # 0.
(vi) The eigenvalues of A+ BF can be freely assigned (with the restriction that complex
eigenvalues are in conjugate pairs) by a suitable choice of F.
48 LINEAR DYNAMICAL SYSTEMS
Proof.
(i) * (ii): supp ose Wc(ti) > 0 for some ti > U, and let the input be defined as
U(T) = -B*eA*(t’-‘)w.. (t1)-l(eAt120 - 51).
Then it is easy to verify using the formu;;r in (3.3) that ~c(ti) = ~1. Since 21 is
arbitrary, the pair (A, B) is controllable.
To show that the controllability of (A,B, implies that WC(t) > 0 for any t > 0,
assume that (A,B) is controllable but W.(ti) is singular for some tl > 0. Since
eAtBB*eAlt >- 0 for all t, there exists a rt ~1 vector 0 # IJ E IV such that
v*eAtB = 0, t E [0, tl].
Now let z(tl) = zi = 0, and then from thl: solution (3.3), we have
0 = eAtlz(0) + t1 At’-Q+)&.
.I’0
Pre-multiply the above equation by v* to get
0 = v*P ’ x(0).
If we chose the initial state s(O) = epAt IV. then v = 0, and this is a contradiction.
Hence, WC(t) can not be singular for any i: > 0.
(ii) e (iii): s u p p o s e Wc( t) > 0 for all t > 0 (in fact, it can be shown that WC(t) > 0
for all t > 0 iff, for some tl, W,(tl) > 0) b!rt the controllability matrix C does not
have full row rank. Then there exists a v E lFY such that
v*AiB = 0
for all 0 5 i 5 n - 1. In fact, this equaiity holds for all i > 0 by the Cayley-
Hamilton Theorem. Hence,
v*eAtB = 0
for all t or, equivalently, v*Wc(t) = 0 for it11 t; this is a contradiction, and hence,
the controllability matrix C must be full row rank. Conversely, suppose C has full
row rank but WC(t) is singular for some t , Then there exists a 0 # v E IF such
that v*eAtB = 0 for all t E [0, tl]. Therefore, set t = 0, and we have
v*B =: 0.
Next, evaluate the i-th derivative of v*eA1 lj’ = 0 at t = 0 to get
v*AiB = 0. i > 0.
Hence, we have
v* [ B A B A2B . . A”-lB ] = 0
or, in other words, the controllability matrix C does not have full row rank. This
is again a contradiction.
3.2. Controllability and Observabilitv 49
[A-XI B]
does not have full row rank for some X E UJ.. Then there exists a vector z E c”
such that
x*[ A - X I B]=O
i.e., rc*A = XX* and z*B = 0. However, this will result in
i.e., the controllability matrix C does not have full row rank, and this is a contra-
diction.
(u) + (iii): We will again prove this by contradiction. Assume that (w) holds but
rank C = Ic < n. Then in section 3.3, we will show that there is a transformation
with A, E lR(n-k)x(n-k). Let Xr and x,- be any eigenvalue and any corresponding
left eigenvector of A,-, i.e., xf& = Aix,“. Then x*(TB) = 0 and
x= [ X1E0
(wi) =s- (i): T h ’is f0110~s the same arguments as in the proof of (w) + (iii): assume
that (vi) holds but (A, B) is uncontrollable. Then, there is a decomposition so
that some subsystems are not affected by the control, but this contradicts the
condition (wi).
(i) + (vi): This will be clear in section 3.4. In that section, we will explicitly construct
a matrix F so that the eigenvalues of A + BF are in the desired locations.
Definition 3.3 The dynamical system (3.1), or the pair (A, B), is said to be stabilizable
if there exists a state feedback u = Fx such that the system is stable, i.e., A + BF is
stable.
Therefore, it is more appropriate to call this stabilizability the state feedback stabiliz-
ability to differentiate it from the output feedback stabilizability defined later.
The following theorem is a consequence of Theorem 3.1.
(ii) The matrix [A - XI, B] has full row rank for all ReA 2 0.
(iii) For all X and x such that x*A = x*X and ReX 2 0, x*B # 0.
We now consider the dual notions of observability and detectability of the system
described by equations (3.1) and (3.2).
Definition 3.4 The dynamical system described by the equations (3.1) and (3.2) or
by the pair (C,A) is said to be observable if, for any tl > 0, the initial state ~(0) = $0
can be determined from the time history of the input u(t) and the output y(t) in the
interval of [0, tl]. Otherwise, the system, or (C, .4), is said to be unobservable.
[ 1
A - XI
(iv) The matrix has full column rank for all X in Cc.
C
3.2. Controllability and Observability 51
(v) Let X and y be any eigenvalue and any corresponding right eigenvector of A, i.e.,
Ay = Xy, then Cy # 0.
(vi) The eigenvalues of A + LC can be freely assigned (with the restriction that complex
eigenvalues are in conjugate pairs) by a suitable choice of L.
Proof. First, we will show the equivalence between conditions (i) and (iii). Once this
is done, the rest will follow by the duality or condition (vii).
(i) -t= (iii): No t e that given the input u(t) and the initial condition x0, the output in
the time interval [0, tl] is given by
t
y(t) = ceAtx(0) + CeA(t-‘hL(7)d7- + IAL(t).
s0
Y(O)
G(O)
40)
,in-b (0)
:
where yci) stands for the i-th derivative of y. Since the observability matrix c3
has full column rank, there is a unique solution x(0) in the above equation. This
completes the proof.
(i) + (iii): This will be proven by contradiction. Assume that (C, A) is observable but
that the observability matrix does not have full column rank, i.e., there is a vector
x0 such that 0~0 = 0 or equivalently CAZxo = 0, Vi 2 0 by the Cayley-Hamilton
Theorem. Now suppose the initial state z(O) = x0, then y(t) = CeAtx(0) = 0.
This implies that the system is not observable since x(0) cannot be determined
from y(t) = 0.
Definition 3.5 The system, or the pair (C,A), is detectable if A + LC is stable for
some L.
52 LINEAR DYNAMICAL SYSTEMS
(ii) The matrix has full column lank for all ReX 2 0.
The conditions (iv) and (v) of Theorem 3.1 and Theorem 3.3 and the conditions
(ii) and (iii) of Theorem 3.2 and Theorem 3.4 <rre often called Popov-Belevitch-Hautus
(PBH) tests. In particular, the following definitions of modal controllability and
observability are often useful.
It follows that a system is controllable (observa t)le) if and only if every mode is control-
lable (observable). Similarly, a system is stabilizable (detectable) if and only if every
unstable mode is controllable (observable).
For example, consider the following 4th ord’r system:
with X1 # As. Then, the mode Xr is not controllable if Q = 0, and As is not observable
if p = 0. Note that if Xr = As, the system is un#-,ontrollable and unobservable for any a
and /3 since in that case, both
are the left eigenvectors of A corresponding to Ar. Hence any linear combination of x1
and 22 is still an eigenvector of A corresponding to X1. In particular, let 5 = x1 - crxz,
then x*B = 0, and as a result, the system is nor. controllable. Similar arguments can be
applied to check observability. However, if the 13 matrix is changed into a 4 x 2 matrix
3.3. Kalman Canonical Decomposition 53
with the last two rows independent of each other, then the system is controllable even
if Xr = As. For example, the reader may easily verify that the system with
for some ICI x ICI matrix A,. Similarly, each column of the matrix B is a linear combi-
nation of qi, i = 1,2,. . . , k1, hence
1
. . . ,+-l& . . . A;-l&
... cc) . . . 0 .
Corollary 3.7 If the system is stabilizable and the controllability matrix C has rank
k1 < n, then there exists a similarity transformation T such that
So the controllable subspace is the span of qi, i = 1,. . . , kl or, equivalently, Im C. On the
other hand, the uncontrollable subspace is given by the complement of the controllable
subspace.
By duality, we have the following decomposition if the system is not completely
observable.
3.3. Kalman Canonical Decomposition 57
where the vector %‘cO is controllable and observable, %:ca is controllable but unobserv-
able, Zc,-, is observable but uncontrollable, and z,-~ is uncontrollable and unobservable.
Moreover, the transfer matrix from u to y is given by
One important issue is that although the transfer matrix of a dynamical system
H-1
A B
C D
their internal behaviors are very different. In other words, while their input/output
behaviors are the same, their state space response with nonzero initial conditions are
very different, This can be illustrated by the state space response for the simple system
with zc,,, controllable and observable, 2,,- controllable but unobservable, %c,-, observable
but uncontrollable, and zFd uncontrollable and unobservable.
The solution to this system is given by
eA~,~t3co(0) + $ eA~f~(t-‘)Bcou(7-)d7
eA~.<i’~c6(()) + 1; eL(t-‘)&u(+j~
e&ztgzo(o)
e&“tp5(o)
I
y(t) = c,,z,,(t) + c&i&;
note that zCTo(t) and ftc5(t) are not affected by the input U, while zCc6(t) and 3&(t) do
not show up in the output y. Moreover, if the initial condition is zero, i.e., Z(O) = 0,
then the output
t
Y(t) = C,,eA’n(t-r)Bc,u(7)d7.
s0
58 LINEAR DYNAMICAL SYSTEMS
However, if the initial state is not zero, then the response g,,(t) will show up in the
output. In particular, if &, is not stable, then the output y(t) will grow without bound.
The problems with uncontrollable and/or unobservable unstable modes are even more
profound than what we have mentioned. Since the states are the internal signals of the
dynamical system, any unbounded internal signal will eventually destroy the system.
On the other hand, since it is impossible to make the initial states exuctly zero, any
uncontrollable and/or unobservable unstable mode will result in unacceptable system
behavior. This issue will be exploited further in section 3.7. In the next section, we will
consider how to place the system poles to achieve desired closed-loop behavior if the
system is controllable.
i = Ax+Bu
y = Cx+Du,
and let u be a state feedback control law given by
u = Fx + 11.
This closed-loop system is as shown in Figure 3.1, and the closed-loop system equations
are given by
2 = (A + BF).r: + BTJ
Y = (C + DF).c + Dw.
Then we have the following lemma in which the proof is simple and is left as an
exercise to the reader.
3.4. Pole Placement and Canonical Forms 59
Lemma 3.11 Let F be a constant matrix with appropriate dimension; then (A, B) is
controllable (stabilizable) if and only if (A + BF, B)zs controllable (stabilizable).
However, the observability of the system may change under state feedback. For
example, the following system
u= Fx= [ - 1 - 1 lx,
ri=Ax+Bu-j:=Ax+Bu+Ly
By duality, the output injection does not change the system observability (detectability)
but may change the system controllability (stabilizability).
Remark 3.1 We would like to call attention to the diagram and the signals flow con-
vention used in this book. It may not be conventional to let signals flow from the right
to the left, however, the reader will find that this representation is much more visually
appealing than the traditional representation, and is consistent with the matrix ma-
nipulations. For example, the following diagram represents the multiplication of three
matrices and will be very helpful in dealing with complicated systems:
z = M1M2M3w.
Z W
- MI - M2 - MS -
The conventional signal block diagrams, i.e., signals flowing from left to right, will also
be used in this book. 0
60 LINEAR DYNAMICAL SYSTEMS
We now consider some special state space representations of the dynamical system
described by equations (3.1) and (3.2). First, In-e will consider the systems with single
inputs.
Assume that a single input and multiple out put dynamical system is given by
H--l
A b
G(s) = c d 7 b E W C E I’“, d c I
and define
1 --a1
1
-a2
0
.‘.
...
-a,-] -
0
a
0
0
,
1 1
0
Al := 0 1 ... 0 bl := o
10 0 ‘.. 1 0 J 0
and
C = [ b Ab . . . An-lb ]
Cl = 1 bl Albl . A;-‘bl ] .
Then it is easy to verify that both C and C1 arlh nonsingular. Moreover, the transfor-
mation
T, = CIC-’
will give the equivalent system representation
where
CT,-’ = [ PI P2 . . . Pn-1 Pn ]
for some ,0i E l@‘. This state space representation is usually called controllable canonical
form or controller canonical form. It is also ea:y to show that the transfer matrix is
given by
which also shows that, given a column vector of transfer matrix G(s), a state space rep-
resentation of the transfer matrix can be obtainecl as above. The quadruple (A, b, C, d)
is called a state space realization of the transfer matrix G(s).
3.4. Pole Placement and Canonical Forms 61
i = (AI + bIF)x
G(s) =
H-l
4 ; , BcRnxm, c*EEP, d” EEP,
and assume that (c, A) is observable; then there is a transformation TO such that
-ala 0 0 ... 0 qn
_ 1 oO...O d _
and
n-1 + 712Sn-2
+. ‘. + %x-lS + %I + d
G(s) = c(s1 - A)-lB + d = al;n + alsn--l +
. . . + a,-1s + a,
The pole placement problem for a multiple input system (A, B) can be converted
into a simple single input pole placement problem. To describe the procedure, we need
some preliminary results.
Lemma 3.12 If an m input system pair (A,B) zs controllable and if A is cyclic, then
for almost all 21 E IRm, the single input pair (A, Bv) is controllable.
62 LINEAR DYNAMICAL SYSTEMS
Proof. Without loss of generality, assume that A is in the Jordan canonical form and
that the matrix B is partitioned accordingly:
Jl
52
A=
..
i . J/c
where Ji is in the form of
xi ...
J, = .: ..
xi 1
Ai _
and X; # Xj if i # j. By PBH test, the pair (A, 11) is controllable if and only if, for each
i = l,..., Ic, the last row of B; is not zero. Let li, E KY” be the last row of Bi, and then
we only need to show that, for almost all u E IV’, biv # 0 for each i = 1,. . . , k which is
clear since for each i, the set v E Iw” such that biv = 0 has measure zero in Iw” since
bi # 0. 0
The cyclicity assumption in this theorem is essential. Without this assumption, the
theorem does not hold. For example, the pair
A = [ ; ;I, B:-=[; ;]
is controllable but there is no v E R2 such that (A, Bv) is controllable since A is not
cyclic.
Since a matrix A with distinct eigenvalues i-; cyclic, by definition we have the fol-
lowing lemma.
Lemma 3.13 If (A, B) a . s controllable, then for almost any K E Rmx”, all the eigen-
values of A + BK are distinct and, consequentl:r,. A + BK is cyclic.
A proof can be found in Brasch and Pearson [1970], Davison [1968], and Heymann
[1968].
Now it is evident that given a multiple input controllable pair (A, B), there is a
matrix K E lF!Px” and a vector v E Iw” such th<lt A + BK is cyclic and (A + BK, Bv)
is controllable. Moreover, from the pole placemlnt results for the single input system,
there is a matrix f E IX”” so that the eigenvaluer; of (A+BK)+(Bv)f can be arbitrarily
assigned. Hence, the eigenvalues of A + BF cari be arbitrarily assigned by choosing a
state feedback in the form of
u = Fx = (K t vf)x.
3.5. Observers and Observer-Based Controllers 63
4 Ix Mq+Nu+Hy
i = Qq+Ru+Sy
so that g(t) -z(t) -+ 0 as t --+ 03 for all initial states X(O), q(0) and for every input u(.).
Theorem 3.14 An observer exists i;fs (C,A) is detectable. Further, if (C, A) is de-
tectable, then a full order Luenberger observer is given by
4 = Aq+Bu+L(Cq+Du-y) (3.5)
P = q (3.6)
Proof. We first show that the detectability of (C,A) is sufficient for the existence
of an observer. To that end, we only need to show that the so-called Luenberger
observer defined in the theorem is indeed an observer. Note that equation (3.5) for q
is a simulation of the equation for Z, with an additional forcing term L(Cq + Du - y),
which is a gain times the output error. Equivalent equations are
4 = (A+LC)q+Bu+LDu-Ly
i = q.
64 LINEAR DYNAMICAL SYSTEMS
These equations have the form allowed in the d&nition of an observer. Define the error,
e := i - 5, and then simple algebra gives
G = (A + L(:‘)e;
(j zz Mq + nu -t Hy
i = Qq -+ Ru + Sy.
Take q(0) = 0 and u(t) G 0. Then the equation, for x and the candidate observer are
i = Ax
(j = Mq+ HCx
2 = Qq + SCx.
The above Luenberger observer has dimensic 111 n, which is the dimension of the state
x. It’s possible to get an observer of lower dinrension. The idea is this: since we can
measure y - Du = Cx, we already know x module Ker C, so we only need to generate
the part of x in Ker C. If C has full row rank and p := dim y, then the dimension of
Ker C equals n - p, so we might suspect that we can get an observer of order n - p.
This is true. Such an observer is called a “minii:lal order observer”. We will not pursue
this issue further here. The interested reader may consult Chen [1984].
Recall that, for a dynamical system descrilod by the equations (3.1) and (3.2), if
(A, B) is controllable and state x is available for feedback, then there is a state feedback
‘u. = Fx such that the closed-loop poles of the system can be arbitrarily assigned.
Similarly, if (C, A) is observable, then the system observer poles can be arbitrarily
placed so that the state estimator 2 can be marle to approach x arbitrarily fast. Now
let us consider what will happen if the system states are not available for feedback
so that the estimated state has to be used. Ilence, the controller has the following
dynamics:
BF
I[ 1 5
B F + LC 2
=[ A+LC 0 e
i
e
H -LC A+BF I[ I ’
i
and the closed-loop poles consist of two parts: the poles resulting from state feedback
(T(A+BF) and the poles resulting from the state estimation a(A+LC). Now if (A, B) is
controllable and (C, A) is observable, then there exist F and L such that the eigenvalues
of A + BF and A + LC can be arbitrarily assigned. In particular, they can be made to
be stable. Note that a slightly weaker result can also result even if (A, B) and (C, A)
are only stabilizable and detectable.
The controller given above is called an observer-based controller and is denoted as
u = K(s)y
and
1
A+BF+LC+LDF -L
K(s) =
F 0
Now denote the open loop plant by
[+I
A B
G= c D ;
Suppose that Gr and Gs are two subsystems with state space representations:
Then the series or cascade connection of these two subsystems is a system with the
output of the second subsystem as the input l)f the first subsystem as shown in the
following diagram:
This operation in terms of the transfer matrices of the two subsystems is essentially the
product of two transfer matrices. Hence, a representation for the cascaded system can
be obtained as
where Rl2 = I+DlDz and R2l = I+DzD r. Note that these state space representations
may not be necessarily controllable and/or observable even if the original subsystems
Gr and Gs are.
For future reference, we shall also introduce the following definitions.
3.6. Operations on Systems 67
Definition 3.7 The transpose of a transfer matrix G(s) or the dual system is defined
as
G - GT(s) = B*(sI - A*)-%* + D*
or equivalently
or equivalently
[q-g] - [$+-I.
Definition 3.9 A real rational matrix G(s) is called a right (left) inverse of a transfer
matrix G(s) if G(s)&(s) = 1 ( G(s)G(s) = 1 ). M oreover, if G(s) is both a right inverse
and a left inverse of G(s), then it is simply called the inverse of G(s).
Lemma 3.15 Let Dt denote a right (left) inverse of D if D has full row (column) rank.
Then
Gt=[ “-;;+“I-“,:+I
Proof. The right inverse case will be proven and the left inverse case follows by duality.
Suppose DDi = I. Then
= [ 1 Af;%-yt]
GGt = [i 0 0
A-,i++.Dt
0 I
,
= I.
68 LINEAR DYNAMICAL SYSTEMS
G(s) =
[-I 1
; ; ,
a realization of G(s).
We note that if the transfer matrix is eithe: single input or single output, then the
formulas in Section 3.4 can be used to obtain ,1 controllable or observable realization.
The realization for a general MIMO transfer mal,rix is more complicated and is the focus
of this section.
Definition 3.10 A state space realization (A, ,J, C, D) of G(s) is said to be a minimal
realization of G(s) if A has the smallest possiblf dimension.
Theorem 3.16 A state space realization (A, B C, D) of G(s) is minimal if and only if
(A, B) is controllable and (C, A) is observable.
Proof. We shall first show that if (A, B, C, D) is a minimal realization of G(s), then
(A, B) must be controllable and (C, A) must bc’ observable. Suppose, on the contrary,
that (A, B) is not controllable and/or (C,A) ‘/I< not observable. Then from Kalman
decomposition, there is a smaller dimensioned ( ontrollable and observable state space
realization that has the same transfer matrix; t iris contradicts the minimality assump-
tion. Hence (A, B) must be controllable and (C, A) must be observable.
Next we show that if an n-th order realizat!on (A, B, C, D) is controllable and ob-
servable, then it is minimal. But supposing it i. not minimal, let (A,, B,, Cm, D) be
a minimal realization of G(s) with order k < n. Since
we have
CAiB = C,,AkB,, Vi > 0 .
This implies that
c?c = o,,c,,, (3.7)
3.7. State Space Realizations for Transfer Matrices 69
where C and 0 are the controllability and observability matrices of (A, B) and (C, A),
respectively, and
By Sylvester’s inequality,
and, therefore, we have rank (UC) = 71 since rank C = rank (3 = 71, by the controllability
and observability assumptions. Similarly, since (A,,, , IL, C,,, , D) is minim4 (&, B,,)
is controllable and (C,,, A,,) is observable. Moreover,
The following property of minimal realizations can also be verified, and this is left
to the reader.
Theorem 3.17 Let (Al, B1, CL, D) and (AZ, Bz, C2, D) be two minimal realizations of
a real rational transfer matrix G(s), and let Cl, Ca, 01, and 02 be the correspond-
ing controllability and observability matrices, rcspectivcly. Then there exists a unique
nonsingular T such that
We now describe several ways to obtain a state space realization for a given multiple
input and multiple output transfer matrix G(s). The simplest and most straightforward
way to obtain a realization is by realizing each element of the matrix G(s) and then
combining all these individual realizations to form a realization for G(s). To illustrate,
let us consider a 2 x 2 (block) transfer matrix such as
A B,
and assume that G,(s) has a state space realization of
H-l
Gz(s) = c, Di , i = 1,. ,4.
70 L.INEAR DYNAMICAL SYSTEMS
Note that Gi(s) may itself be a multiple input and multiple output transfer matrix. In
particular, if Gi(s) is a column or row vector of transfer functions, then the formulas
in Section 3.4 can be used to obtain a controll,Lble or observable realization for Gi(s).
Then a realization for G(s) can be given by
Alternatively, if the transfer matrix G(s) can be factored into the product and/or
the sum of several simply realized transfer matrices, then a realization for G can be
obtained by using the cascade or addition formulas in the last section.
A problem inherited with these kinds of realization procedures is that a realization
thus obtained will generally not be minimal. To obtain a minimal realization, a Kalman
controllability and observability decomposition has to be performed to eliminate the
uncontrollable and/or unobservable states. (An alternative numerically reliable method
to eliminate uncontrollable and/or unobservable states is the balanced realization method
which will be discussed later.)
We will now describe one factorization procedure that does result in a minimal
realization by using partial fractional expansion (The resulting realization is sometimes
called Gilbert’s realization due to Gilbert).
Let G(s) be a p x m transfer matrix and write it in the following form:
with d(s) a scalar polynomial. For simplicity, we shall assume that d(s) has only real
and distinct roots ;\i # Xj if i # j and
G(s)=D+g=.
i=l S-A.z
Suppose
rank Wi = ki
and let B,z E RkZ xm and Ci E llPxk~ be two constant matrices such that
3.8. Lyapunov Equations 71
It follows immediately from PBH tests that this realization is controllable and observ-
able. Hence, it is minimal.
An immediate consequence of this minimal realization is that a transfer matrix with
an r-th order polynomial denominator does not necessarily have an r-th order state
space realization unless IV; for each i is a rank one matrix.
This approach can, in fact, be generalized to more complicated cases where d(s) may
have complex and/or repeated roots. Readers may convince themselves by trying some
simple examples.
Lemma 3.18 Assume that A is stable, then the following statements hold:
(i) X = s: eA*‘QeAtdt.
(ii) X>OifQ>OandX>OifQ>O
An immediate consequence of part (iii) is that, given a stable matrix A, a pair (C, A)
is observable if and only if the solution to the following Lyapunov equation is positive
definite:
A*L, + L,A + C*C = 0.
The solution L, is called the observability Gramian. Similarly, a pair (A, B) is con-
trollable if and only if the solution to
Now if X > 0 then v*Xv > 0, and it is clear that ReX 5 0 if Q 2 0 and ReX < 0 if
Q > 0. Hence (i) and (ii) hold. To see (iii), WC assume ReX 2 0. Then we must have
v*Qv = 0, i.e., Qv = 0. This implies that X IS an unstable and unobservable mode,
which contradicts the assumption that (Q, A) i> detectable. 0
[-tl
A B
Lemma 3.20 Let C D be a state space realization of a (not necessarily stable)
transfer matrix G(s). Suppose that there exists a symmetric matrix
AP+PA*+BrI*=O.
Since l/a2 is very large if Eli is small, this shows that the state corresponding to the
last diagonal term is strongly observable. This example shows that controllability (or
observability) Gramian alone can not give an accurate indication of the dominance of
the system states in the input/output behavior.
This motivates the introduction of a balanced realization which gives balanced
Gramians for controllability and observability.
A B .
Suppose G = C D 1s stable, i.e., A is stable. Let P and Q denote the control-
[-tl
lability Gramian and observability Gramian, respectively. Then by Lemma 3.18, P and
Q satisfy the following Lyapunov equations
AP+PA*+BB*=O (3.9)
A*Q+QA+C*C=O, (3.10)
and P > 0, Q 2 0. Furthermore, the pair (A, B) is controllable iff P > 0, and (C, A) is
observable iff Q > 0.
Suppose the state is transformed by a nonsingular T to !2 = TX to yield the realiza-
tion
G=[j-$]=[w].
Then the Gramians are transformed to p = TPT* and Q = (T-l)*QT-‘. Note that
l%j = TPQT-l, and therefore the eigenvalues of the product of the Gramians are
invariant under state transformation.
Consider the similarity transformation T which gives the eigenvector decomposition
Then the columns of T-l are eigenvectors of PQ corresponding to the eigenvalues {Xi}.
Later, it will be shown that PQ has a real diagonal Jordan form and that A > 0, which
are consequences of P 2 0 and Q 2 0.
Although the eigenvectors are not unique, in the case of a minimal realization they
can always be chosen such that
0 = (T-')*QT-~ = c,
where C = diag(cr,g2,...,c,) a n d C 2 = A. This new realization with controllabil-
ity and observability Gramians P = Q = C will be referred to as a balanced real-
ization (also called internally balanced realization). The decreasingly order numbers,
cl > rr2 > . . > u, > 0, are called the Hankel singular values of the system.
More generally, if a realization of a stable system is not minimal, then there is a trans-
formation such that the controllability and observability Gramians for the transformed
realization are diagonal and the controllable and observable subsystem is balanced. This
is a consequence of the following matrix fact.
3.9. Balanced Realizations 77
Define
(T;)-1 =
and let
T = T4T3T2TI.
Then
with C2 = I. ?
Corollary 3.23 The product of two positive semi-definite matrices is similar to a pos-
itive semi-definite matrix.
Proof. Let P and Q be any positive semi-definite matrices. Then it is easy to see that
with the transformation given above
TPQT-1 =
[ 1
4 “0 .
21 = y
x2 = u-t.
[:;I = [i I:][::]+[#
Y =
[ .I
[ l 0 1 i: .:
3.10. Hidden Modes and Pole-Zero Cancelation 79
and is a second order system. Moreover, it is easy to show that the unstable mode 1 is
uncontrollable but observable. Hence, the output can be unbounded if the initial state
ccl(O) is not zero. We should also note that the above problem does not go away by
changing the interconnection order:
Y S - l 7) 1 u
.
Sfl s - l
In the later case, the unstable mode 1 becomes controllable but unobservable. The
unstable mode can still result in the internal signal r] unbounded if the initial state
~(0) is not zero. Of course, there are fundamental differences between these two types
of interconnections as far as control design is concerned. For instance, if the state is
available for feedback control, then the latter interconnection can be stabilized while
the former cannot be.
This example shows that we must be very careful in canceling unstable modes in
the procedure of forming a transfer function in control designs; otherwise the results
obtained may be misleading and those unstable modes become hidden modes waiting
to blow. One observation from this example is that the problem is really caused by the
unstable zero of the subsystem $. Although the zeros of an SISO transfer function
are easy to see, it is not quite so for an MIMO transfer matrix. In fact, the notion of
“system zero” cannot be generalized naturally from the scalar transfer function zeros.
For example, consider the following transfer matrix
- 1 1 -
G ( s ) = ‘il ‘T2
- -
I s+2 S-l-l I
which is stable and each element of G(s) has no finite zeros. Let
1
sf2 - s+l -
K = s-Jz S-Jz
0 1
s+Jz
-(s+l)(s+2) O
KG=
2 1 1
1 -
s+2 -
s+l 1
is stable. This implies that G(s) must have an unstable zero at fi that cancels the
unstable pole of K. This leads us to the next topic: multivariable system poles and
zeros.
80 LINEAR DYNAMICAL SYSTEMS
Definition 3.11 Let Q(s) E R[s] be a (p x m 1 polynomial matrix. Then the normal
rank of Q(S ), denoted normalrank (Q(S)), is t;le maximally possible rank of Q(s) for
at least one s E @.
In short, sometimes we say that a polynomial matrix Q(S ) has rank(Q(s)) in R[s] when
we refer to the normal rank of Q(s).
= [ ,“s21111 .
To show the difference between the normal rank of a polynomial matrix and the
rank of the polynomial matrix evaluated at certain point, consider
Q(s)
Then Q(s) has normal rank 2 since rank Q(2) =- 2. However, Q(0) has rank 1.
It is a fact in linear algebra that any polynomial matrix can be reduced to a so-
called Smith form through some pre- and post- unimodular operations. [cf. Kailath,
1984, pp.3911.
Lemma 3.25 (Smith form) Let P(s) E R[s] h e any polynomial matrix, then there
exist unimodular matrices U(s), V(s) E IR[s] SK/I that
- n(s) 0 '.. 0 0
0 72(s) .'. 0 0
U(s)P(s)V(s) = S(s) : = ; ; ... ; ;
0 0 ... yr(s) 0
0 0 ... 0 0
S(s) is called the Smith form of P(s). It is also c,lear that r is the normal rank of P(s).
We shall illustrate the procedure of obtaining a Smith form by an example. Let
1
s+l (s + 1)(2s t 1) s(s + 1)
s+2 (s+2)(s2+;ls+3) s(s+2) .
1 2s + 1 S
3.11. Multivariable System Poles and Zeros 81
u= [ 8 ; +].
1
Then
1 2s + 1 S
PI(S) := U(s)P(s) =
0 (s + l)(s +2)2 0 .
[ 0 0 0
Next use column operation to zero out the 2s + 1 and s terms in Pr. This process can
be done by post-multiplying a unimodular matrix V to PI(S):
1
1 -(2s+ 1) -s
V(s) = [ 00 10 01
and
1 0 0
PI(S)V(S) = [ 0 ( s + l)(s +2)2 0 I
0 0 0
Then we have
I
0
S(s) = U(s)P(s)V(s) = :, (s + 1,;s + 2)2 0 .
0 0 0
Similarly, let Rp(s) d enote the set of rational proper transfer matrices.’ Then any
real rational transfer matrix can be reduced to a so-called McMillan form through some
pre- and post- unimodular operations.
Lemma 3.26 (McMillan form) Let G(s) E 7&,(s) b e any proper real rational transfer
matrix, then there exist unimodular matrices U(s), V(s) E R[s] such that
-a,(s) 0 . . 0 o-
Pi(S)
0 & . . . 0 0
@2(s)
U(s)G(s)V(s) = M(s) : = : : . . . ; ;
0 0 ‘.’ $j”
_ 0 0 ... 0 o-
‘Similarly, we say a transfer matrix G(s) has normal rank T if G(s) has maximally possible rank T
for at least one s E @.
82 LINEAR DYNAMICAL SYSTEMS
Proof. If we write the transfer matrix G(s) as G(s) = N(s)/d(s) such that d(s) is a
scalar polynomial and N(s) is a p x m polynomial matrix and if let the Smith form of
N(s) be S(s) = U(s)N(s)V(s), th e conclusion follows by letting M(s) = S(s)/d(s). 0
Definition 3.12 The number Cideg(&(s)) .I S called the McMilZan degree of G(s)
where deg(&(s)) denotes the degree of the polynomial pi(s), i.e., the highest power
of s in pi(s).
Definition 3.13 The roots of all the polynomials ,&(s) in the McMillan form for G(s)
are called the poles of G.
Let (A, B, C, D) be a minimal realization of G(s). Then it is fairly easy to show that
a complex number is a pole of G(s) if and only if it is an eigenvalue of A.
Definition 3.14 The roots of all the polynomials ai in the McMillan form for G(s)
are called the transmission zeros of G(s). A complex number zo E C is called a blocking
zero of G(s) if G(zu) = 0.
It is clear that a blocking zero is a transmission zero. Moreover, for a scalar transfer
function, the blocking zeros and the transmission zeros are the same.
We now illustrate the above concepts through an example. Consider a 3 x 3 transfer
matrix:
N( s )
Then G(s) can be written as
1
s+l (s + l)(Zs + 1) s(s + 1)
s+2 (s + 2)(s’ + 5s + 3) s(s + 2)
G(s) = (S + l):(s + 2) 1 2s + 1 S := d o ’
3.11. Multivariable System Poles and Zeros 83
Since N(s) is exactly the same as the P(s) in the previous example, it is clear that the
G(s) has the McMillan form
M(s) = U(s)G(s)V(s) = s + 2
0 - 0
Sfl
and G(s) has McMillan degree of 4. The poles of the transfer matrix are (-1, -1, -1, -2)
and the transmission zero is (-2). Note that the transfer matrix has pole and zero at
the same location (-2); this is the unique feature of multivariable systems.
To get a minimal state space realization for G(s), note that G(s) has the following
partial fractional expansion:
G(s) =
Since there are repeated poles at -1, the Gilbert’s realization procedure described in the
last section cannot be used directly. Nevertheless, a careful inspection of the fractional
expansion results in a 4-th order minimal state space realization:
- 1 0 10 0 3 1
0 -1 1 0 -1 3 2
0 0 -1 0 1 -1 -1
G(s) = 0 0 0 -2 1 -3 -2 .
0 0 l - 1 0 0 0
1 0 0 0 0 1 0
0 1 0 1 0 0 0
We remind readers that there are many different, definitions of system zeros. The
definitions introduced here are the most common ones and are useful in this book.
Lemma 3.27 Let G(s) be a p x m proper transfer matrix with full column normal rank.
Then zo E @ is a transmission zero of G(s) if and only if there exists a 0 # uo E Cm
such that G(zo)uo = 0.
84 LINEAR DYNAMICAL SYSTEMS
Proof. We shall outline a proof of this lemma. We shall first show that there is a.
vector uo E Cc” such that G(zs)us = 0 if ~0 E :I2 is a transmission zero. Without loss of
generality, assume
-al(s) ... 0 -
0
O'(y I421 . . . 0
for some unimodular matrices Ur (s) and VI (5 ) and suppose zn is a zero of CY~ (s), i.e.,
al(z0) = 0. Let
uo = v,-‘(z, ‘(:I # 0
where er = [l, O,O, . . .]* E IP. Then it is easy o verify that G(ZO)UO = 0. On the other
hand, suppose there is a ~0 E Cm such that GI ~0)~s = 0. Then
- w(zo) .. . . . 0
a1;") 0
pJ 0
Ul(ZO) : : .. : V,(zo)uo = 0.
0 0 ..1 &
_ 0 0 ... 0 _
Define
II
Ul
u2
= vl(Zl)UO # 0.
%n
Then
Note that the lemma may not be true if G(s) does not have full column normal rank.
This can be seen from the following example. C‘onsider
It is easy to see that G has no transmission zero but G(s)un = 0 for all s. It should
also be noted that the above lemma applies even if za is a pole of G(s) although G(za)
is not defined. The reason is that G(zs)ua may be well defined. For example,
Lemma 3.28 Let G(s) be a p x m proper transfer matrix with full row normal rank.
Then zo E @ is a transmission zero of G(s) ‘fa an d only if there exists a 0 # ~0 E 0’
such that $G(zo) = 0.
In the case where the transmission zero is not a pole of G(s), we can give a useful
alternative characterization of the transfer matrix transmission zeros. Furthermore,
G(s) is not required to be full column (or row) rank in this case.
The following lemma is easy to show from the definition of zeros.
Corollary 3.30 Let G(s) b e a square m x m proper transfer matrix and det G(s) $ 0.
Suppose z0 E C is not a pole of G(s). T h en zo E @ is a transmission zero of G(s) if
and only if det G(zo) = 0.
Using the above corollary, we can confirm that the example in the last section does have
a zero at fi since
1
- 1
1
-
s+l s+2 2 - s2
det
2 1 = (s + 1)2(s + 2)2’
- -
s+2 s-t1
Note that the above corollary may not be true if za is a pole of G. For example,
Definition 3.15 The eigenvalues of A are cahed the poles of the realization of G(s).
A-s1 B
Q(s)=[ c D].
Definition 3.16 A complex number ~0 E C i> called an invariant zero of the system
realization if it satisfies
A - “I B <
C D 1 normtrlrank
A-sI
C
B
D‘ 1
The invariant zeros are not changed by constant state feedback since
A + B F -zoI B
1 A-zoIB I
I[ I F
O
I
=
C+DF D = ra’k
C D’
I
It is also clear that invariant zeros are not changed under similarity transformation.
The following lemma is obvious.
u II
= 0.
1 # 0 such that
A-z01
C
B X
D I[ u I= 0
since A - ‘I B
[ C D 1 has full column normal rauk.
[1 X
column normal rank, i.e., = 0 which is a contradiction.
U
Finally, note that if w, = 0, then
[ A--&?I]x=O
and za is a non-observable mode by PBH test. 0
1 has
invariant zero of a realiiation (A, B, 6, D) 2fan only
d
full row normal rank. Then zo E Cc is an
if there exist 0 # y E cc” and
v E P such that
1
A-zoI B
[ y* w* ] c D = 0.
A-sI B1
Lemma 3.33 G(s) has full column (row) normal rank if and only if
L c Dl
has full column (row) normal rank.
A-sI B
C D = I C(A
[
I
-sI)-’
0
I I[ A-sI
0
B
G(s) 1
and
1
A-sI B
normalrank = n + normalrank(G(s)).
C D
0
Theorem 3.34 Let G(s) b e a real rational proper transfer matrix and let (A, B, C, D)
be a corresponding minimal realization. Then a complex number zo is a transmission
zero of G(s) if and only if it is an invariant zero of the minimal realization.
Proof. We will give a proof only for the case that the transmission zero is not a pole
of G(s). Then, of course, zs is not an eigenvalue of A since the realization is minimal.
Note that
1[ I[ 1
A-sI B I 0 A-sI B
C D = C(A- sI)-’ I 0 G(s) ’
88 LINEAR DYNAMICAL SYSTEMS
= n + rank G(za).
I
Hence
1 < normArank
A-s1
C
B
D 1
if and only if rank G(Q) < normalrank G( 5). Then the conclusion follows from
Lemma 3.29. 0
Note that the minimality assumption is esseatial for the converse statement. For ex-
ample, consider a transfer matrix G(s) = D ( constant) and a realization of
G(s) = ; ; where A is any square matrix with any dimension and C is any
H-1
matrix with compatible dimension. Then G(s:I has no poles or zeros but every eigen-
H---l
A 0
value of A is an invariant zero of the realization{
C D .
Nevertheless, we have the following corollar r if a realization of the transfer matrix
is not minimal.
Corollary 3.35 Every transmission zero of a kansfer matrix G(s) is an invariant zero
of all its realizations, and every pole of a trans.,% matrix G(s) is a pole of all its real-
izations.
Lemma 3.36 Let G(s) E T&,(s) be a p x m hnsfer matrix and let (A, B, C, D) be a
minimal realization. If the system input is of tht: form u(t) = uOeXt, where X E @. is not
a pole of G(s) and us E P is an arbitrary con.<tant vector, then the output due to the
input u(t) and the initial state x(0) = (XI - A) -‘Buo is y(t) = G(X)uoext, Vt 2 0.
Proof. The system response with respect to the input u(t) = uaeXt and the initial
condition x(0) = (XI - A)-IB ug is (in terms of Laplace transform):
Combining the above two lemmas, we have thl, following results that give a dynamical
interpretation of a system’s transmission zero.
3.12. Notes and References 89
Corollary 3.37 Let G(s) E 7&(s) be a p x m transfer matrix and let (A, B, C, D) be
a minimal realization. Suppose that zo E @ is a transmission zero of G(s) and is not
a pole of G(s). Then for any nonzero vector uo E P the output of the system due to
the initial state x(0) = (~01 - A)-‘Bub and the input u = uoezot is identically zero:
y(t) = G(zo)uOezo’ = 0.
The following lemma characterizes the relationship between zeros of a transfer func-
tion and poles of its inverse.
A B
Lemma 3.38 Suppose that G = C D is a square transfer matrix with D non-
H-1
singular, and suppose ze is not an eigenvalue of A (note that the realization is not
necessarily minimal). Then there exists xo such that
(A - BD-%)x0 = zoxo, Cxo # 0
iff there exists ug # 0 such that
G(zo)uo = 0.
has a pole at zo which is observable. Then, by definition, there exists x0 such that
(A - BD-%)x,-, = /zoxo
and
cxo # 0.
(+) Set ua = -D-lCxo # 0. Then
(zol - A)xo = -BD-%x0 = BuO.
Using this equality, one gets
G(zo)uo = C(z,,I - A)-‘Bu,, + Duo = Cxo - Cxo = 0.
The above lemma implies that ~0 is a zero of an invertible G(s) if and only if it is a
pole of G-l(s).
4.1 Normed S p a c e s
Let V be a vector space over @ (or IX) and let ((.(( be a norm defined on V. Then V
is a normed space. For example, the vector space Q’ with any vector p-norm, ]].]lP,
for 1 5 p < 03, is a normed space. As another example, consider the linear vector
space C[a, b] of all bounded continuous functions on the real interval [a, b]. Then C[a, b]
92 PER.FORMANCE SPECIFICATIONS
(Ixl(p : = 2 :cJP .
( i=o 1
and
Ilxc(t)ll,:= es;,y’r”P lx(t)l.
Some of these spaces, for example, &(--cc, 01, Cz[O, co) and &(-co, co), will be
discussed in more detail later on.
C[a, b] space:
C[a, b] consists of all continuous functions WI the real interval [a, b] with the norm
defined as
lI4, := ,;w;,, Ix(t)l.
94 PERPORMANCE SPECIFICATIONS
Note that many important metric notions and geometrical properties such as length,
distance, angle, and the energy of physical syhtems can be deduced from this inner
product. For instance, the length of a vector IC C: P is defined as
and the angle between two vectors 2, y E a?’ call be computed from
= (x,((x(( l,y,l’
Y)
COS4?Y) .L(X,Y) E P,d.
The two vectors are said to be orthogonal if L(x. y) = 5.
We now consider a natural generalization of the inner product on cc” to more general
(possibly infinite dimensional) vector spaces.
Definition 4.2 Let V be a vector space over Q1. An inner product’ on V is a complex
valued function,
(.;) : v x v k----) cc
such that for any x, y, z E V and o, ,B E UZ
6) I( <: II4 IIYII CCauchy-S c h warz ineqmlity). Moreover, the equality holds if
and only if x = ay for some constant Q: or !I = 0.
Next we shall derive some simple and useful bounds for the 3-1, norm and the Lr
norm of a stable system. Suppose
[t-l
A B
G(s) = c o E 723-t,
C=diag(ai,az,...,a,) 20
AC+CA*+BB*=O A*C+CA+C*C=O.
Remark 4.3 It should be clear that the inequalities stated in the theorem do not
depend on a particular state space realization of G(s). However, use of the balanced
realization does make the proof simple. V
Proof. The inequality (~1 5 llGl/, follows from the Nehari Theorem of Chapter 8.
:= J 03
We will now show the other inequalities. Since
sup
Re(s)>O 0
To prove the last inequality, let ui be the ith unit vector. Then
1 ifi=j
u;uj = &j = a n d 2 UiUf = I.
0 ifi#j
i=l
114 PERFORMANCE SPECIFICATIONS
To compute the LZz norm of a rational transfer function, G(s) E &, using state space
approach. Let G(s) = [G(s)]+ + [G(s)]- with G+ E KHZ and G- E R3-I:. Then
Note that this characterization of the ?fz norm can be appropriately generalized for
nonlinear time varying systems, see Chen and Francis [1992] for an application of this
norm in sampled-data control.
{w,...,wJ.
This value is usually read directly from a Bode singular value plot. The &, norm can
also be computed in state space if G is rational.
E RL,. (4.3)
4.7. Computing ,!I, and TfFt, Norms 115
Then llGjloo < y if and only if O(D) < y and H has no eigenualues on the imaginary
axis where
A + BR-lD*C BR-1 B’
H := (4.4)
-C*(I + DR-lD*)C - ( A + BR-lD*C)* I
Proof. Let Q(s) = y”l - G”(s)G(s). Tlren it is clear that llGlloo < y if and only if
(a(jw) > 0 for all w E IR. Since @(XI) = R > 0 and since @(jw) is a continuous function
of w, (a(jw) > 0 for all w E IR if and only if @(jw) is nonsingular for all w E IF! U {cm},
i.e., a(s) has no imaginary axis zero. Equivalently, F’(s) has no imaginary axis pole.
It is easy to compute by some simple algebra that
R-l 1
BR-’
H
tl+(s) = -C*DR-1 .
R-lD*C R-IB” ]
Thus the conclusion follows if the above realization has neither uncontrollable modes
nor unobservable modes on the imaginary axis. Assume that jws is an eigenvalue
of H but not a pole of F’(s). Then jwn must be either an unobservable mode of
( [ R-lD*C R-lB* ] ,H)or an uncontrollable mode of (H, -:*Fi-, ). N O W
1
suppose jws is an unobservable mode of ([ R-lD*C R-lB* ] :H). Then thke exists
a n 20 = such that
(jwol- A)zl = 0
(jwoI + A*)x2 = -c*cx1
D*Cxl + Bfx2 = 0 .
0
Similarly, a contradiction will also be arrived if jws is assumed to be an uncontrol-
lable mode of (H,
1 ).
116 PERFORMANCE SPECIFICATIONS
Bisection Algorithm
Lemma 4.7 suggests the following bisection algozithm to compute RC, norm:
(a) select an upper bound “ill and a lower bou Id yl such that yl 5 ]]G]], 5 yU;
(b) if (rU - n)/rl <specified level, stop; ]]G]] z (rU + yl)/2. Otherwise go to next
step;
Of course, the above algorithm applies to XW norm computation as well. Thus fZ,
norm computation requires a search, over eithe: y or w, in contrast to & (‘Es) norm
computation, which does not. A somewhat analogous situation occurs for constant
matrices with the norms ]]M]]z = trace(M*M) and ]]M]lW = i?[M]. In principle, ]]M]]3
can be computed exactly with a finite number of operations, as can the test for whether
F(M) < y (e.g. y21 - M*M > 0), but the value> of F(M) cannot. To compute CT(M),
we must use some type of iterative algorithm.
Remark 4.4 It is clear that ]]G]], < y iff /r-‘G]I, < 1. Hence, there is no loss of
generality to assume y = 1. This assumption uill often be made in the remainder of
this book. It is also noted that there are other I’;rst algorithms to carry out the above
norm computation; nevertheless, this bisection algorithm is the simplest. 0
The 3-1, norm of a stable transfer function (‘an also be estimated experimentally
using the fact that the ‘H, norm of a stable trans!cr function is the maximum magnitude
of the steady-state response to all possible unit ; .tnplitude sinusoidal input signals.
This chapter introduces the feedback structure and discusses its stability and perfor-
mance properties. The arrangement of this chapter is as follows: Section 5.1 discusses
the necessity for introducing feedback structure and describes the general feedback con-
figuration. In section 5.2, the well-posedness of the feedback loop is defined. Next, the
notion of internal stability is introduced and the relationship is established between the
state space characterization of internal stability and the transfer matrix characteriza-
tion of internal stability in section 5.3. The stable coprime factorizations of rational
matrices are also introduced in section 5.4. Section 5.5 considers feedback properties
and discusses how to achieve desired performance using feedback control. These discus-
sions lead to a loop shaping control design technique which is introduced in section 5.6.
Finally, we consider the mathematical formulations of optimal tiz and ‘H, control
problems in section 5.7.
117
118 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
uncertainty. Indeed, this requirement was the oAgina1 motivation for the development
of feedback systems. Feedback is only requireci when system performance cannot be
achieved because of uncertainty in system characteristics. The more detailed treatment
of model uncertainties and their representations will be discussed in Chapter 9.
For the moment, assuming we are given a node1 including a representation of un-
certainty which we believe adequately captures the essential features of the plant, the
next step in the controller design process is to cletermine what structure is necessary
to achieve the desired performance. Prefilterinq input signals (or open loop control)
can change the dynamic response of the model :;et but cannot reduce the effect of un-
certainty. If the uncertainty is too great to achieve the desired accuracy of response,
then a feedback structure is required. The merfr assumption of a feedback structure,
however, does not guarantee a reduction of uncertainty, and there are many obstacles
to achieving the uncertainty-reducing benefits ot‘ feedback. In particular, since for any
reasonable model set representing a physical system uncertainty becomes large and the
phase is completely unknown at sufficiently high frequencies, the loop gain must be small
at those frequencies to avoid destabilizing the high frequency system dynamics. Even
worse is that the feedback system actually increases uncertainty and sensitivity in the
frequency ranges where uncertainty is significantly large. In other words, because of the
type of sets required to reasonably model physical systems and because of the restriction
that our controllers be causal, we cannot use feetlback (or any other control structure)
to cause our closed-loop model set to be a prol)er subset of the open-loop model set.
Often, what can be achieved with intelligent use of feedback is a significant reduction
of uncertainty for certain signals of importance lvith a small increase spread over other
signals. Thus, the feedback design problem centers around the tradeoff involved in re-
ducing the overall impact of uncertainty. This tradeoff also occurs, for example, when
using feedback to reduce command/disturbance error while minimizing response degra-
dation due to measurement noise. To be of pr;Ictical value, a design technique must
provide means for performing these tradeoffs. 1\‘e will discuss these tradeoffs in more
detail later in section 5.5 and in Chapter 6.
To focus our discussion, we will consider the : t andard feedback configuration shown
in Figure 5.1. It consists of the interconnected plant P and controller K forced by
command I-, sensor noise n, plant input disturbance di, and plant output disturbance
d. In general, all signals are assumed to be multivariable, and all transfer matrices are
assumed to have appropriate dimensions.
(s + 2) ?$d,
u=-(r-n-d)-
i.e., the transfer functions from the external signals r -n - d and di to u are not proper.
Hence, the feedback system is not physically realizable!
Now suppose that all the external signals T, n, d, and di are specified and that the
closed-loop transfer matrices from them to ‘u. are respectively well-defined and proper.
Then, y and all other signals are also well-defined and the related transfer matrices are
proper. Furthermore, since the transfer matrices from d and n to ‘u. are the same and
differ from the transfer matrix from r to u by only a sign, the system is well-posed if
4
and only if the transfer matrix from to ‘u. exists and is proper.
[ d I
In order to be consistent with the notation used in the rest of the book, we shall
denote
and regroup the external input signals into the feedback loop as wi and wz and regroup
the input signals of the plant and the controller as er and es. Then the feedback loop
with the plant and the controller can be simply represented as in Figure 5.2 and the
system is well-posed if and only if the transfer matrix from
proper.
[1 Wl
w2
to er exists and is
Lemma 5.1 The feedback system in Figure 5.2 is well-posed if and only if
I - 2(cO)P(cO) (5.1)
is invertible.
120 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
Proof. The system in the above diagram can 1~ represented in equation form as
el = w1 + IT-e2
e2 = wg+Pel.
Thus well-posedness is equivalent to the condition that (I - k’P)-’ exists and is proper.
But this is equivalent to the condition that the constant term of the transfer function
I - kP is invertible. 0
p=
k+l
A
C
B
D-- (5.3)
(5.4)
Fortunately, in most practical cases we will have D = 0, and hence well-posedness for
most practical control systems is guaranteed.
5.3. Internal Stabilitv 121
i = Ax+Bel (5.6)
e2 = CxfDel (5.7)
i = Ai+Be;? (5.8)
el = Ci +L?ez. (5.9)
Definition 5.2 The system of Figure 5.2 is said to be internally stable if the origin
(z, 2) = (0,O) is asymptotically stable, i.e., the states (z, 2) go to zero from all initial
states when wi = 0 and wz = 0.
Note that internal stability is a state space notion. To get a concrete characterization
of internal stability, solve equations (5.7) and (5.9) for ei and e2:
Note that the existence of the inverse is guaranteed by the well-posedness condition.
Now substitute this into (5.6) and (5.8) to get
[+a[;]
where
Thus internal stability is equivalent to the condition that A has all its eigenvalues in
the open left-half plane. In fact, this can be taken as a definition of internal stability.
Lemma 5.2 The system of Figure 5.2 with given stabilizable and detectable realizations
for P and l? is internally stable if and only if 2 is a Hurwitz matrix.
It is routine to verify that the above definition of internal stability depends on!y on
P and l?, not on specific realizations of them as long as the realizations of P and K are
both stabilizable and detectable, i.e., no extra unstable modes are introduced by the
realizations.
The above notion of internal stability is defined in terms of state-space realizations
of P and 2. It is also important and useful to characterize internal stability from the
122 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
transfer matrix point of view. Note that the feedback system in Figure 5.2 is described,
in term of transfer matrices, by
Now it is intuitively clear that if the system in Figure 5.2 is internally stable, then for all
bounded inputs (IQ, wz), the outputs (er, es) are also bounded. The following lemma
shows that this idea leads to a transfer matrix characterization of internal stability.
Lemma 5.3 The system in Figure 5.2 is internally stable if and only if the transfer
matrix
Proof. As above let [$-/-&I and [-$-/$I be stabilizable and detectable realiza-,
tions of P and I?, respectively. Let yr denote the output of P and y2 the output of l?.
Then the state-space equations for the system in Figure 5.2 are
are in the open left-half plane, it follows that the transfer matrix from (~1, wz) to (ei, es)
given in (5.11) is in R’H,.
Conversely, suppose that (I - PJ?) is invertible and the transfer matrix in (5.11) is
in RI-t,. Then, in particular, (I - Pk)-l is proper which implies that (I - P@(oo) =
(I - DD) is invertible. Therefore,
as a transfer matrix belongs to 7&Y,. Finally, since (A, B, C) and (A, B, 6’) are stabi-
lizable and detectable,
is stabilizable and detectable. It then follows that the eigenvalues of A are in the open
left-half plane. 0
Note that to check internal stability, it is necessary (and sufficient) to test whether
each of the four transfer matrices in (5.11) is in RX,. Stability cannot be concluded
even if three of the four transfer matrices in (5.11) are in R;Ft,. For example, let an
interconnected system transfer function be given by
S - l j&-l.
p=-----
s+l’ s - l
Then it is easy to compute
which shows that the system is not internally stable although three of the four transfer
functions are stable. This can also be seen by calculating the closed-loop A-matrix with
any stabilizable and detectable realizations of P and k.
124 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
Remark 5.1 It should be noted that internal stability is a basic requirement for a
practical feedback system. This is because all interconnected systems may be unavoid-
ably subject to some nonzero initial conditions and some (possibly small) errors, and
it cannot be tolerated in practice that such errors at some locations will lead to un-
bounded signals at some other locations in the closed-loop system. Internal stability
guarantees that all signals in a system are born ded provided that the injected signals
(at any locations) are bounded. 0
However, there are some special cases under, which determining system stability is
simple.
Corollary 5.4 Suppose I? E ICY,. Then the system in Figure 5.2 is internally stable
if and only if it is well-posed and P(I - kP)-1 E R’H,.
Proof. The necessity is obvious. To prove the &ficiency, it is sufficient to show that
(I - Pk)-l E RX,. But this follows from
(I - PI?)-1 = I + (I -- PI+lPI?
This corollary is in fact the basis for the clasbical control theory where the stability
is checked only for one closed-loop transfer function with the implicit assumption that
the controller itself is stable. Also, we have
Corollary 5.5 Suppose P E R’X,. Then the s@em in Figure 5.2 is internally stable
if and only if it is well-posed and K(I - PI?)-l 5: RIH,.
Corollary 5.6 Suppose P E ‘R’H, a7;d k E ‘RW,. Then the system in Figure 5.2 is
internally stable if and only if (I - PK)-’ E %3x’,.
Theorem 5.7 The system is internally stable if and only if it is well-posed and
( i i ) 4(s) : = d_et(l - P(s)K(s)) h as all its WI‘OS in the open left-half plane (i.e.,
(I - P(s)K(s))-’ is stable).
5.3. Internal Stability 125
Proof. It is easy to show that PI? and (I - PI?)-’ have the following realizations:
where
( I - P&)-l =
H-1 ; ;
I-Dlj)-l [ C 06 ]
c = (I-D&‘[ C DC]
b = ( I - Dfi)-l.
It is also easy to see that A = A. Hence, the system is internally stable iff A is stable.
Now suppose that the system is internally stable, then (I - PI?-’ E ‘R&. This
implies that all zeros of det(1 - P(s)k(s)) must be in the left-half plane. So we only
need to show that given condition (ii), condition (i) is necessary and sufficient for the
internal stability. This follows by noting that (A, B) is stabilizable iff
(5.12)
[C De’] (5.13)
is detectable. But conditions (5.12) and (5.13) are equivalent to condition (i), i.e., PI?
has no unstable pole/zero cancelations. 0
With this observation, the MIMO version of the Nyquist stability theorem is obvious.
Theorem 5.8 (Nyquist Stability Theorem) The system is internally stable if and only
if it is well-posed, condition (i) in Theorem 5.7 is satisfied and the Nyquist plot of $(jw)
for --co 5 w 2 00 encircles the origin, (O,O), nk + nP times in the counter-clockwise
direction.
Proof. Note that by SISO Nyquist stability theorem, 4(s) has all zeros in the open
left-half plane if and only if the Nyquist plot of &jw) for -cc 5 w 5 00 encircles the
origin, (0, 0), n111, + nP times in the counter-clockwise direction. 0
126 STABILITY AND PERFORMAXCE OF FEEDBACK SYSTEMS
zm+yn= 1.
The more primitive, but equivalent, definition is that m and n are coprime if every
common divisor of m and n is invertible in R’F&, i.e.,
M
Note that these definitions are equivalent to saving that the matrix N is left-
[ I
invertible in R’FI, and the matrix [ i@ # ] is right-invertible in RF&,. These two
equations are often called Bezout identities.
Now let P be a proper real-rational matrix. A right-coprime factorization (rcf)
of P is a factorization P = NM-l where N atld M are right-coprime over R’H,.
Similarly, a left-coprime factorization (lcf) has tile form P = i’klfi where N and M
are left-coprime over RF&. A matrix P(s) E 7&(s) is said to have double coprime
factorization if there exist a right coprime factorization P = NM-l, a left coprime
factorization P = &klN, and X,, Y,, Xl, J$ E I?‘& such that
(5.14)
Of course implicit in these definitions is the requkment that both M and h;r be square
and nonsingular.
‘See, e.g., [Kailath, 1980, pp. 140-1411.
5.4. Coprime Factorization over ‘R’H, 127
p=
[-+I
A
C
B
D
(5.15)
(5.16)
Then P = NM-l = n/“l-r# are rcf and lcf, respectively, and, furthermore, (5.14) is
satisfied.
Remark 5.3 The coprime factorization of a transfer matrix can be given a feedback
control interpretation. For example, right coprime factorization comes out naturally
from changing the control variable by a state feedback. Consider the state space equa-
tions for a plant P:
2 = AxfBu
Y = Cx-kDu.
u:=u-Fx
i = (A+ BF)x + Bv
u = Fxfv
Y = (C + DF)x + Dv.
M(s)= [+I,
128 STABILITY AND PERFORMA.NCE OF FEEDBACK SYSTEMS
Therefore
u=Mu, y=:Nv
so that y = NM-l+ i.e., P = NM-l. 0
We shall now see how coprime factorizations cSm be used to obtain alternative charac-
terizations of internal stability conditions. Consic:r:r again the standard stability analysis
diagram in Figure 5.2. We begin with any rcf’s xnd lcf’s of P and I?:
~=uv-l=i-lU. (5.18)
Lemma 5.10 Consider the system in Figure 5.,.‘. The following conditions are equiwa-
lent:
is invertible in R7tFt,.
] is invertible in RF&,.
.
or, equivalently,
(5.19)
Now
so that
5.4. Coprime Factorization over R’H, 129
are right-coprime (this fact is left as an exercise for the reader), (5.19) holds iff
This proves the equivalence of conditions 1 and 2. The equivalence of 1 and 3 is proved
similarly.
The conditions 4 and 5 are implied by 2 and 3 from the following equation:
Since the left hand side of the above equation is invertible in R’H,, so is the right hand
side. Hence, conditions 4 and 5 are satisfied. We only need to show that either condition
4 or condition 5 implies condition 1. Let us show condition 5 -+ 1; this is obvious since
Combining Lemma 5.10 and Theorem 5.9, we have the following corollary.
Corollary 5.11 Let P be a proper real-rational matrix and P = NM-l = &l-la be
corresponding rcf and lcf over RF&. Then there exists a controller
8, zz uovo-l zz p-lo
0 0
(5.20)
Furthermore, let F and L be such that A+ BF and A+ LC are stable. Then a particular
set of state space realizations for these matrices can be given by
(5.21)
(5.22)
130 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
Proof. The idea behind the choice of these mi~,trices is as follows. Using the observer
theory, find a controller ke achieving internal srability; for example
k .= A+BF+;C-LDF 1 ---I.
0. (5.23)
Perform factorizations
go = u v-l = i-10
0 0 0 0
which are analogous to the ones performed on 1’. Then Lemma 5.10 implies that each
of the two left-hand side block matrices of (5.20) must be invertible in RX,. In fact,
(5.20) is satisfied by comparing it with the equation (5.14). cl
Consider again the feedback system shown in l’igure 5.1. For convenience, the system
diagram is shown again in Figure 5.3. For furthc r discussion, it is convenient to define
the input loop transfer matrix, L;, and output loop transfer matrix, L,, as
respectively, where L, is obtained from breaking the loop at the input (u) of the plant
while L, is obtained from breaking the loop at the output (y) of the plant. The input
sensitivity matrix is defined as the transfer matrix from di to up:
And the output sensitivity matrix is defined as the transfer matrix from d to y:
s, = (I + LJl, r/ = Sod.
5.5. Feedback ProDerties 131
T, = I - S, = Li(I + Li)-’
T, = I - S, = L,(I + LO)-‘,
respectively. (The word complementary is used to signify the fact that T is the comple-
ment of S, T = 1- S.) The matrix I + L, is called input return difference matrix and
I + L, is called output return difference matrix.
It is easy to see that the closed-loop system, if it is internally stable, satisfies the
following equations:
These four equations show the fundamental benefits and design objectives inherent in
feedback loops. For example, equation (5.24) shows that the effects of disturbance d
on the plant output can be made “small” by making the output sensitivity function SO
small. Similarly, equation (5.27) shows that the effects of disturbance d; on the plant
input can be made small by making the input sensitivity function Si small. The notion
of smallness for a transfer matrix in a certain range of frequencies can be made explicit
using frequency dependent singular values, for example, ??(S,) < 1 over a frequency
range would mean that the effects of disturbance d at the plant output are effectively
desensitized over that frequency range.
Hence, good disturbance rejection at the plant output (y) would require that
be made small and good disturbance rejection at the plant input (up) would require
that
be made small, particularly in the low frequency range where d and di are usually
significant.
Note that
then
1
<CT(&)< l if e(PK) > 1
a(PK) + 1 - ,,(PK) - 1’
1
-c F(Si) 5 l --, i f c~(KP) > 1 .
(r(KP) + 1 - ,(KP) - I
Z(S,)<l * _a:PK)>>l
Z(Si) << 1 w CT: K P ) > 1 .
a(PK) > 1 or a(KP) >> 1 W F(S,P) = 7’ ((I -I- PK)-IP) z F(IC1) = -&
Hence good performance at plant output (1~) rec,rlires in general large output loop gain
o(L,) = ,(PK) > 1 in the frequency range wllrxe d is significant for desensitizing d
and large enough controller gain a(K) > 1 in them frequency range where di is significant
for desensitizing di. Similarly, good performance at plant input (u,) requires in general
large input loop gain a(L,) = a(KP) >> 1 in the frequency range where d; is significant
for desensitizing d; and large enough plant gain rl:[P) >> 1 in the frequency range where
d is significant, which can not changed by conr roller design, for desensitizing d. (It
should be noted that in general S, # Si unless h and P are square and diagonal which
is true if P is a scalar system. Hence, small pi S,) does not necessarily imply small
F(Si); in other words, good disturbance rejection at the output does not necessarily
mean good disturbance rejection at the plant in1 lot.)
Hence, good multivariable feedback loop desigr, boils down to achieving high loop (and
possibly controller) gains in the necessary freqwtlcy range.
Despite the simplicity of this statement, fee clback design is by no means trivial.
This is true because loop gains cannot be made ,xrbitrarily high over arbitrarily large
frequency ranges. Rather, they must satisfy ccl tain performance tradeoff and design
limitations. A major performance tradeoff, for (sample, concerns commands and dis-
turbance error reduction versus stability under the: model uncertainty. Assume that the
plant model is perturbed to (I + A)P with a stable, and assume that the system is
nominally stable, i.e., the closed-loop system wit,1 a = 0 is stable. Now the perturbed
closed-loop system is stable if
has no right-half plane zero. This would in gener 11 amount to requiring that IlKP,II be
small or that F(T,) be small at those frequencic’s where a is significant, typically at
5.5. Feedback ProDerties 133
high frequency range, which in turn implies that the loop gain, F(L,), should be small
at those frequencies.
Still another tradeoff is with the sensor noise error reduction. The conflict between
the disturbance rejection and the sensor noise reduction is evident in equation (5.24).
Large (~(L,(jw)) values over a large frequency range make errors due to d small. How-
ever, they also make errors due to ‘n large because this noise is “passed through” over
the same frequency range, i.e.,
Note that R is typically significant in the high frequency range. Worst still, large loop
gains outside of the bandwidth of P, i.e., a(L,(jw)) > 1 or &i(jw)) >> 1 while
i?(P(jw)) << 1, can make the control activity (1~) quite unacceptable, which may cause
the saturation of actuators. This follows from
Here, we have assumed P to be square and invertible for convenience. The resulting
equation shows that disturbances and sensor noise are actually amplified at u whenever
the frequency range significantly exceeds the bandwidth of P, since for w such that
Tj(P(jw)) < 1, we have
,,[P-l(jw)] = _ 1 > 1.
QY.cJ)l
Similarly, the controller gain, Z(K), should also be kept not too large in the frequency
range where the loop gain is small in order to not saturate the actuators. This is because
for small loop gain (T(L,(jw)) << 1 or F(L,(jw)) < 1
Therefore, it is desirable to keep F(K) not too large when the loop gain is small.
To summarize the above discussion, we note that good performance requires in some
frequency range, typically some low frequency range (0, ~1):
and good robustness and good sensor noise rejection require in some frequency range,
typically some high frequency range (wjL, oo)
where A4 is not too large. These design requirements are shown graphically in Figure 5.4.
The specific frequencies wl and ~1~ depend on the specific applications and the knowledge
one has on the disturbance characteristics, the modeling uncertainties, and the sensor
noise levels.
134 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
(1) Find a rational strictly proper transfer function L which contains all the right half
plane poles and zeros of P such that IL/ ( 1ears the boundaries specified by the
performance requirements at low frequencies and by the robustness requirements
at high frequencies as shown in Figure 5.4.
L must also be chosen so that 1 + L has all zeros in the open left half plane, which
can usually be guaranteed by making L well-behaved in the crossover region, i.e.,
L should not be decreasing too fast in the frequency range of IL(jw)I M 1.
The loop shaping for MIMO system can be done similarly if the singular values of
the loop transfer functions are used for the loop gains.
140 STABILITY AND PERFORMANCE OF FEEDBACK SYSTEMS
with some appropriate choice of weighting matrix W, and scalar p. The parameter p
clearly defines the tradeoff we discussed earlier between good disturbance rejection at
the output and control effort (or disturbance and sensor noise rejection at the actuators).
Note that p can be set to p = 1 by an approp?ate choice of W,. This problem can
be viewed as minimizing the energy consumed by the system in order to reject the
disturbance d.
This type of problem was the dominant par.idigm in the 1960’s and 1970’s and is
usually referred to as Linear Quadratic Gaussi,jn Control or simply as LQG. (They
will also be referred to as ?fz mixed sensitivity problems for the consistency with the
‘H, problems discussed next.) The development of this paradigm stimulated extensive
research efforts and is responsible for important technological innovation, particularly
in the area of estimation. The theoretical contril)utions include a deeper understanding
of linear systems and improved computational methods for complex systems through
state-space techniques. The major limitation of this theory is the lack of formal treat-
ment of uncertainty in the plant itself. By allowing only additive noise for uncertainty,
the stochastic theory ignored this important pr;..ctical issue. Plant uncertainty is par-
ticularly critical in feedback systems.
l-l, Performance
Although the 3-t~ norm (or fZ2 norm) may be a meaningful performance measure and
although LQG theory can give efficient design cl:)mpromises under certain disturbance
and plant assumptions, the X2 norm suffers a major deficiency. This deficiency is due
to the fact that the tradeoff between disturbance error reduction and sensor noise error
reduction is not the only constraint on feedback design. The problem is that these
performance tradeoffs are often overshadowed by a second limitation on high loop gains
- namely, the requirement for tolerance to uncertainties. Though a controller may be
designed using FDLTI models, the design must bt’ implemented and operated with a real
physical plant. The properties of physical systems, in particular the ways in which they
deviate from finite-dimensional linear models, pllt strict limitations on the frequency
range over which the loop gains may be large.
A solution to this problem would be to put explicit constraints on the loop gain in
the cost function. For instance, one may chose to minimize
This problem can also be regarded as minimizing the maximum power of the error
subject to all bounded power disturbances: let
then
and
se: = we so wd
WuKSoWd 1[“’
wesow,
wuKsowd
1 *
wesowd 2
Traces,-;(jw) d w =
I ll[ WuKSoWd Ill m’
Alternatively, if the system robust stability margin is the major concern, the weighted
complementary sensitivity has to be limited. Thus the whole cost function may be
where WI and W2 are the frequency dependent uncertainty scaling matrices. These
design problems are usually called ‘HFI, mixed sensitivity problems. For a scalar system,
an ‘FI, norm minimization problem can also be viewed as minimizing the maximum
magnitude of the system’s steady-state response with respect to the worst case sinusoidal
inputs.
These expressions for B(jw)B*(jw) and G*(jw)c(j w ) are then substituted into (7.13)
to obtain
Now consider one-step order reduction, i.e., T = II, - 1, then Cs = CJ~ and
where 0 := $-*(jw)$(jw) = O-* is an “all pi\ss” scalar function. (This is the only
place we need the assumption of si = 1) Hence I(-)(jw)( = 1.
Using triangle inequality we get
Cl = diag(~rISI,~&, . . ,c&).
Let I&(S) = Gk+r(s) - Go for Ic = 1,2,. . ,N - 1 and let GN(s) = G(s). Then
since Go is a reduced order model obtained from the internally balanced realization
of Gk+r (i)’ and the bound for one-step order reduction, (7.15) holds.
Noting that
We shall now give an alternative proof of the error bound using matrix dilation.
Another alternative proof will be given in the next chapter using the optimal Hankel
norm approximation.
7.2. Frequency-Weighted Balanced Model Reduction 163
and P = Q --+ iln as o + co. So the Hankel singular values ‘~j + i and 2(~i + cr2 +
. . . + CT,) + n = llG(s)ll, as a -+ 00.
The model reduction bound can also be loose for systems with Hankel singular values
close to each other. For example, consider the balanced realization of a fourth order
system
-19.9579 -5.4682 9.6954 0.9160
The approximation errors and the estimated bounds are listed in the following table.
The table shows that the actual error for an r-th order approximation is almost the same
as 2a,+i which would be the estimated bound if we regard (~,+i = or+2 = . . . = ~4. In
general, it is not hard to construct an n-th order system so that the r-th order balanced
model reduction error is approximately 2g,+i but the error bound is arbitrarily close
to 2(” - T)U,+1. One method to construct such a system is as follows: Let G(s) be
a stable all-pass function, i.e., G”(s)G(s) = I, then there is a balanced realization for
G so that the controllability and observability Gramians are P = Q = 1. Next make
a very small perturbation to the balanced realization then the perturbed system has
a balanced realization with distinct singular values and P = Q M I. This perturbed
system will have the desired properties and this is exactly how the above example is
constructed.
0 1 2 3
IIG -h, 2 1.996 1.991 1.9904
Bounds: 2 Cz=,+, u, 7.9772 5.9772 3.9818 1.9904
2ar+1 2 1.9954 1.9914 1.9904
The above bounds are then used to show that the Ic-th order optimal Hankel norm
approximation, G(S), together with some constilnt matrix Dn satisfies
We shall also provide an alternative proof for the error bounds derived in the last chapter
for the truncated balanced realizations using thf> results obtained in this chapter.
Finally we consider the Hankel operator in discrete time and offer an alternative
proof of the well-known Nehari’s theorem.
rG : tit -3 ?‘f2
,f L2
MG
p+
/-.-.....I T
There is a corresponding Hankel operator in the time domain. Let g(t) denote the
inverse (bilateral) Laplace transform of G(s). Then the time domain Hankel operator
is
r g : &?(-co, O] t--t C2[O,cQ)
r,.f := P+(g * f), for .f E C,(-oo,O].
Thus
(r,.f)(t) = 1 J”, g(t - 7- f(T)dT, t 2 0;
0, t < 0.
Because of the isometric isomorphism property between the L2 spaces in the time do-
main and in the frequency domain, we have
176 HAN KEL NORM APPROXIMATION
from the initial state to the future output. These two operators will be called the con-
trollability operator, QC, and the observability opwator, Q,, respectively, and are defined
as
Q c : L2(-CqO] - c
0
Q!,u := e-A7 Bu(T)dT
I-cc
and
Q o: cc” -L,:[0,m)
QoxO := CeAtxO, t 2 0.
(If all the data are real, then the two operators become Q, : Lc,(-oo,O] - IR” and
‘3r 0 : KY H Cz[O,oo).) Clearly, x0 = Q-,u(t) for u(t) E &(--oo,O] i s t h e s y s t e m s t a t e
at t = 0 due to the past input and y(t) = QOxu, t 2 0, is the future output due to the
initial state xc with the input set to zero.
It is easy to verify that
r Ll
The adjoint operators of Q, and 9, can also +e obtained easily from their definitions
as follows: let u(t) E Lz(-oo,O], x0 E C”, and y(t) E L2[O,cm), then
0
and
(~,~o,Y)L~[o,~)
J
= oli x;leA*tC*y(t)dt = (x0, /lr eA*tC*y(t)dt)o~ = (x0, QZY)C
where (., .)x denotes the inner product in the Hilbert space X. Therefore, we have
Q?‘:: : C” - C2(-CqO]
CP’,“xo = B*e-A*TxO, 7- 5 0
and
Jm
!IJE : Cz[O, co) - @”
KY(t) = eA*tC*y(t)dt.
0
r;y =
J O"
0
B*eA*(+‘)C*y(t)dt, 75 0.
Let L, and L, be the controllability and observability Gramians of the system, i.e.,
L, =
J0
O” eAtBB*eA’tdt
L, =
SW
0
eA’tC*CeAtdt.
Then we have
~‘k,xP;xo = L&J
xk~Q,xo = L&J
for every 50 E C”. Thus L, and L, are the matrix representations of the operators
iJ!C@E and !I$+,,.
Theorem 8.1 The o p e r a t o r l?;P, (or I’ZPC) a n d the matrix L,L, have the same
nonzero eigenvalues. In particular lpy = Jm.
Proof. Let u2 # 0 be an eigenvalue of P;Ps, and let 0 # u E &(-co, 0] be a corre-
sponding eigenvector. Then by definition
Note that z = 9,~ # 0 since otherwise u2u = 0 from (8.1) which is impossible. So u2
is an eigenvalue of L,L,.
On the other hand, suppose u2 # 0 and x # 0 are an eigenvalue and a corresponding
eigenvector of L,L,. Pre-multiply (8.2) by QlfL,, and define u = !QEL,z to get (8.1). It
is easy to see that u # 0 since 9,u = !Plc+\ErLoz = L,L,x = u2x # 0. Therefore ff2 is
an eigenvalue of l?zPs.
Finally, since G(s) is rational, P;Ps is compact and self-adjoint and has only discrete
spectrum. Hence ])l?s])2 = I]P;l?,II = p(L,L,). 0
hgZl
u E ~~10, CO).
v :=
r,u = (TV
rp = ou.
This pair of vectors (u, v) are called a Schmidt pair of Pg. The proof given above suggests
a way to construct this pair: find the eigenvalueh and eigenvectors of L,L,, i.e., up and
xi such that
L,L,x; = up ci.
Then the pairs (ui, vi) given below are the corresponding Schmidt pairs:
Remark 8.2 As seen in various literature, there are some alternative ways to write a
Hankel operator. For comparison, let us examine some of the alternatives below:
(i) Let v(t) = u(--t) for u(t) E Cz(-oo,O], and then v(t) E &[O,oo). Hence, the
Hankel operator can be written as
rg : Or rG
s =
0
t
CeA(t+T)Bv(7)d7, for 1 0.
8.2. All-pass Dilations 179
[+I
A B
Let G(s) = c o be an antistable transfer matrix, i.e., all the eigenvalues of
A have positive real parts. Then the Hankel operator associated with G(s) can
be written as
P, : C,[O,ccJ) - C2(-m,O]
RJW = { o
7 t>o
= Ce”(t-‘)Bv(T)d7, for t 5 0
r
0
Definition 8.1 The inertia of a general complex, square matrix A denoted In(A) is the
triple (n(A), v(A), 6(A)) where
Theorem 8.3 Given complex n x n and n x m l/ratrices A and B, and hermitian matrix
P = P* satisfying
AP+PA*+BF=O (8.3)
then
(1) If 6(P) = 0 then r(A) 5 v(P), v(A) 5 ~(1’).
(2) If S(A) = 0 then r(P) 5 v(A),v(P) 5 ~(11).
(8.4)
(1) IffA,B,C) as completely controllable and completely observable the following two
statements are equivalent:
(a) there exists a D such that GG” = a21 where G(s) = D + C(sI - A)-lB.
(b) there exist P, Q E cnxn such that
(i) P = P”, Q = Q*
(ii) AP + PA* + BB* = 0
(iii) A*Q + QA + C*C = 0
(iv) PQ = 021
(2) Given that part (lb) is satisfied then there exists a D satisfying
D*D = a21
D*C+B*Q = 0
DB*+CP = 0
and any such D will satisfy part (la) (note, observability and controllability are
not assumed).
Proof. Any systems satisfying part (la) or (lb) can be transformed to the case u = 1
by & = Bffi, C = C/fi, 6 = D/o, p = P/o, Q = Q/u. Hence, without loss of
generality the proof will be given for the case u = 1 only.
(la) =S (lb) This is proved by constructing P and Q to satisfy (lb) as follows. Given
(la), G(oo) = D + DD* = I. Also GG” = I + G” = G-l, i.e.,
xz G-=[$j-$].
These two transfer functions are identical and both minimal (since (A, B, C) is assumed
to be minimal), and hence there exists a similarity transformation T relating the state-
space descriptions, i.e.,
Further
TA+A*T-CC’=0 (8.16)
which verifies (lb), equation (iii). Also (8.16) implies
AT-l + T-lA* - T-lC*CT-l = 0 (8.17)
which together with (8.10) implies part (lb), equation (ii).
(1b) $\Rightarrow$ (1a): This is proved by first constructing $D$ according to part (2) and then verifying part (1a) by calculation. First note that since $Q = P^{-1}$, multiplying ((1b), equation (ii)) by $Q$ on both sides gives
\[ QA + A^*Q + QBB^*Q = 0, \qquad (8.18) \]
which together with part (1b), equation (iii) implies that
\[ QBB^*Q = C^*C, \qquad (8.19) \]
and hence by Lemma 2.14 there exists a $D$ such that $D^*D = I$ and
\[ DB^*Q = -C \qquad (8.20) \]
\[ \Rightarrow \quad DB^* = -CQ^{-1} = -CP. \qquad (8.21) \]
Equations (8.20) and (8.21) imply that the conditions of part (2) are satisfied. Now
note that
\[ BB^* = (sI - A)P + P(-sI - A^*) \]
\[ \Rightarrow \quad C(sI - A)^{-1}BB^*(-sI - A^*)^{-1}C^* = CP(-sI - A^*)^{-1}C^* + C(sI - A)^{-1}PC^* \]
which by (8.21) equals
\[ -DB^*(-sI - A^*)^{-1}C^* - C(sI - A)^{-1}BD^*. \]
Substituting this into the expansion of $G(s)G^\sim(s)$ shows that the cross terms cancel, and hence
\[ G(s)G^\sim(s) = DD^* = I. \]
Part (2) follows immediately from the proof of (1b) $\Rightarrow$ (1a) above. $\Box$
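The construction in the proof is easy to check numerically. In the following sketch (hypothetical scalar data; NumPy/SciPy assumed, and $A$ is taken stable only so that the Lyapunov solver applies, which the theorem itself does not require), the Gramians satisfy $PQ = \sigma^2 I$, a $D$ is built from the conditions of part (2), and $G(j\omega)G(j\omega)^* = \sigma^2 I$ is verified on a frequency grid:

import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[-2.0]])

P = solve_continuous_lyapunov(A, -B @ B.T)        # A P + P A* + B B* = 0
Q = solve_continuous_lyapunov(A.T, -C.T @ C)      # A* Q + Q A + C* C = 0
sigma2 = (P @ Q)[0, 0]                            # P Q = sigma^2 I for this example
print("P Q =", (P @ Q), " sigma^2 =", sigma2)

# Part (2): pick D with D*D = sigma^2 I and D*C + B*Q = 0 (scalar case)
D = np.array([[-(B.T @ Q)[0, 0] / C[0, 0]]])
assert np.isclose(D[0, 0] ** 2, sigma2)            # D*D = sigma^2
assert np.isclose((D.T @ C + B.T @ Q)[0, 0], 0.0)  # D*C + B*Q = 0
assert np.isclose((D @ B.T + C @ P)[0, 0], 0.0)    # D B* + C P = 0

# Verify |G(jw)|^2 = sigma^2 on a frequency grid
for w in np.logspace(-2, 2, 5):
    G = D + C @ np.linalg.inv(1j * w * np.eye(1) - A) @ B
    assert np.isclose(abs(G[0, 0]) ** 2, sigma2)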
\[ D_e^*D_e = I \qquad (8.54) \]
\[ D_e^*C_e + B_e^*Q_e = 0 \qquad (8.55) \]
\[ D_eB_e^* + C_eP_e = 0. \qquad (8.56) \]
Equation (8.54) is immediate; (8.55) follows by substituting the definitions of $B_e$, $C_e$, $D_e$,
and $Q_e$; and (8.56) follows from $D_e \times (8.55) \times P_e$.
(3) (a) To show that $\delta(\hat A) = 0$, we will assume that there exist $x \in \mathbb{C}^n$
and $\lambda \in \mathbb{C}$ such that $\hat A x = \lambda x$ and $\lambda + \bar\lambda = 0$, and show that this implies $x = 0$. From
$x^* \times (8.53) \times x$,
\[ (\lambda + \bar\lambda)\, x^*\hat Q x + x^*\hat C^*\hat C x = 0 \qquad (8.57) \]
\[ \Rightarrow \quad \hat C x = 0. \qquad (8.58) \]
Post-multiplying (8.53) by $x$ and using (8.58) then gives
\[ \hat A^*\hat Q x = -\lambda\,\hat Q x = \bar\lambda\,\hat Q x, \]
so that $\hat Q x$ is an eigenvector of $\hat A^*$, and repeating the argument forces $x = 0$.
Example 8.2 Let us also illustrate Theorem 8.5 when $(\Sigma_1^2 - \sigma^2 I)$ is indefinite. Take
$G(s)$ as in the above example and permute the first and third states of the balanced
realization so that $\Sigma = \mathrm{diag}(2, \frac{1}{2}, 1)$, $\Sigma_1 = \mathrm{diag}(2, \frac{1}{2})$, and $\sigma = 1$. The construction of
Theorem 8.5 now gives a $\hat W(s)$ whose stable part has McMillan degree 1, as predicted by Theorem 8.5, part (3c). However, this example has been constructed to show that $(\hat A, \hat B, \hat C)$
itself may not be minimal when the conditions of part (3d) are not satisfied, and in this
case the unstable pole at $+5$ is both uncontrollable and unobservable. $\hat G(s)$ is in fact
an optimal Hankel norm approximation to $G(s)$ of degree 1.
In general, the error $E(j\omega)$ will have modulus equal to $\sigma$, but $E(s)$ will contain unstable
poles. $\Box$
Example 8.3 Let us finally complete the analysis of this $G(s)$ by permuting the second
and third states in the balanced realization of the last example to obtain $\Sigma_1 = \mathrm{diag}(2, 1)$,
$\sigma = \frac{1}{2}$. We will find
\[ W(s) = \frac{1}{5}\,\frac{-s^2 + 123s + 110}{s^2 + 21s + 10}, \]
\[ E(s) = G(s) - W(s) = -\frac{1}{2}\,\frac{s^2 - 21s + 10}{s^2 + 21s + 10}. \]
Note that $-\Sigma_1\Gamma = \mathrm{diag}(-15/2, -3/4)$, so that $\hat A$ is stable by Theorem 8.5, part (3b);
$|E(j\omega)| = \frac{1}{2}$ by Theorem 8.5, part (2); and $(\hat A, \hat B, \hat C)$ is minimal by Theorem 8.5, part (3d).
$W(s)$ is in fact an optimal second-order Hankel norm approximation to $G(s)$. $\Box$
Lemma 8.6 Given a stable, rational, $p \times m$ transfer function matrix $G(s)$ with Hankel
singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k \ge \sigma_{k+1} \ge \sigma_{k+2} \ge \cdots \ge \sigma_n > 0$, then for all $\hat G(s)$ stable
and of McMillan degree $\le k$,
\[ \sigma_i\bigl(G(s) - \hat G(s)\bigr) \ge \sigma_{i+k}\bigl(G(s)\bigr). \qquad (8.61) \]
In particular,
\[ \|G(s) - \hat G(s)\|_H \ge \sigma_{k+1}\bigl(G(s)\bigr). \qquad (8.63) \]
Proof. We shall prove (8.61) only; the inequality (8.62) follows from (8.61) by
setting
\[ G(s) = \bigl(G(s) - \hat G(s)\bigr) - \bigl(-\hat G(s)\bigr). \]
Let $(\hat A, \hat B, \hat C)$ be a minimal state space realization of $\hat G(s)$; then $(A_e, B_e, C_e)$ given by
(8.33) will be a state space realization of $G(s) - \hat G(s)$. Now let $P = P^*$ and $Q = Q^*$
satisfy (8.34) and (8.35), respectively (but not necessarily (8.36) and (8.37)), and write
\[ P = RR^*, \qquad R = \begin{bmatrix} R_{11} & R_{12} \\ 0 & R_{22} \end{bmatrix} \]
with
\[ R_{22} = P_{22}^{1/2}, \qquad R_{12} = P_{12}P_{22}^{-1/2}, \qquad R_{11}R_{11}^* = P_{11} - R_{12}R_{12}^*. \]
($P_{22} > 0$ since $(\hat A, \hat B, \hat C)$ is a minimal realization.)
Then
\[ \sigma_i^2\bigl(G - \hat G\bigr) = \lambda_i(PQ) = \lambda_i(RR^*Q) = \lambda_i(R^*QR) \]
\[ \ge \lambda_i\!\left( \begin{bmatrix} I_n & 0 \end{bmatrix} R^*QR \begin{bmatrix} I_n \\ 0 \end{bmatrix} \right)
= \lambda_i(R_{11}^*Q_{11}R_{11}) = \lambda_i(Q_{11}R_{11}R_{11}^*) \]
\[ = \lambda_i\bigl(Q_{11}(P_{11} - R_{12}R_{12}^*)\bigr) \]
\[ = \lambda_i\bigl(Q_{11}^{1/2}P_{11}Q_{11}^{1/2} - XX^*\bigr), \qquad \text{where } X = Q_{11}^{1/2}R_{12}, \]
\[ \ge \lambda_{i+k}\bigl(Q_{11}^{1/2}P_{11}Q_{11}^{1/2}\bigr) \qquad (8.64) \]
\[ = \lambda_{i+k}(P_{11}Q_{11}) = \sigma_{i+k}^2(G), \]
where (8.64) follows from the fact that $X$ is an $n \times k$ matrix (so that $\mathrm{rank}(XX^*) \le k$). $\Box$
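The inequality behind (8.64), namely that subtracting a rank-$k$ positive term moves each eigenvalue down by at most $k$ positions, $\lambda_i(M - XX^*) \ge \lambda_{i+k}(M)$, can be sanity-checked with random data (a minimal sketch assuming NumPy; indices are 0-based in the code):

import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
M = rng.standard_normal((n, n)); M = M @ M.T      # hermitian M >= 0
X = rng.standard_normal((n, k))                   # n x k  =>  rank(X X*) <= k

lam = np.sort(np.linalg.eigvalsh(M))[::-1]               # decreasing
lam_pert = np.sort(np.linalg.eigvalsh(M - X @ X.T))[::-1]

for i in range(n - k):
    assert lam_pert[i] >= lam[i + k] - 1e-10      # lambda_i(M - XX*) >= lambda_{i+k}(M)

This is exactly Weyl's inequality with the perturbation $-XX^*$ of rank at most $k$.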
We can now give a solution to the optimal Hankel norm approximation problem for
square transfer functions.
Theorem 8.7 Given a stable, rational, $m \times m$ transfer function matrix $G(s)$:
(1) \[ \inf\bigl\{\|G - \hat G\|_H : \hat G \text{ stable and of McMillan degree } \le k\bigr\} = \sigma_{k+1}(G), \qquad (8.65) \]
and the infimum is achieved, in which case
\[ \|G(s) - \hat G(s)\|_H = \sigma_{k+1}(G). \qquad (8.66) \]
(3) Let $G(s)$ be as in (2) above; then an optimal Hankel norm approximation of
McMillan degree $k$, $\hat G(s)$, can be constructed as follows. Let $(A, B, C)$ be a balanced
realization of $G(s)$ with corresponding
\[ \Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_k, \sigma_{k+r+1}, \ldots, \sigma_n, \sigma_{k+1}, \ldots, \sigma_{k+r}), \]
and let $(\hat A, \hat B, \hat C, \hat D)$ be given by the construction of Theorem 8.5. Then
\[ \hat G(s) + F(s) = \begin{bmatrix} \hat A & \hat B \\ \hline \hat C & \hat D \end{bmatrix}, \qquad (8.67) \]
where $\hat G(s) \in \mathcal{H}_\infty$ and $F(s) \in \mathcal{H}_\infty^-$, with the McMillan degree of $\hat G(s) = k$ and the
McMillan degree of $F(s) = n - k - r$.
Proof. By Lemma 8.6 and the definition of the Hankel norm, for all $F(s) \in \mathcal{H}_\infty^-$ and $\hat G(s) \in \mathcal{H}_\infty$ of McMillan
degree $k$,
\[ \sigma_{k+1}(G) \le \|G - \hat G\|_H \le \|G - \hat G - F\|_\infty. \qquad (8.68) \]
Hence, $\hat G$ has McMillan degree $k$ and is in the correct class, and therefore (8.69) implies
that the inequalities in (8.68) become equalities, and part (1) is proven, as is part (3).
Clearly, the sufficiency of part (2) can be similarly verified by noting that (8.65) implies
that (8.68) is satisfied with equality.
To show the necessity of part (2), suppose that $\hat G(s)$ is an optimal Hankel norm approximation to $G(s)$ of McMillan degree $k$, i.e., equation (8.66) holds. Now Theorem 8.5
can be applied to $G(s) - \hat G(s)$ to produce an optimal anticausal approximation $F(s)$
such that $\bigl(G(s) - \hat G(s) - F(s)\bigr)/\sigma_{k+1}(G)$ is all-pass, since $\sigma_{k+1}(G) = \sigma_1(G - \hat G)$. Further,
the McMillan degree of this $F(s)$ will be the McMillan degree of $\bigl(G(s) - \hat G(s)\bigr)$ minus
the multiplicity of $\sigma_1(G - \hat G)$, which is $\le n + k - 1$. $\Box$
The following corollary gives the solution to the well-known Nehari problem.
Corollary 8.8 Let $G(s)$ be a stable, rational transfer matrix with Hankel singular values $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n > 0$. Then
\[ \inf_{F \in \mathcal{H}_\infty^-} \|G - F\|_\infty = \sigma_1(G), \]
and a solution is given by Theorem 8.5 with $k = 0$. Indeed, let $F(s)$ be an optimal
anticausal approximation of degree $n - r$ given by the construction of Theorem 8.5.
Proof. (1) is proved in Theorem 8.5, part (2). (2) is obtained from the forms of $P_e$ and
$Q_e$ in Theorem 8.5, part (1). $F(-s)$ is used since it will be stable and have well-defined
Hankel singular values. $\Box$
The optimal Hankel norm approximation for the non-square case can be obtained by
first augmenting the function to form a square function. For example, consider a stable,
rational, $p \times m$ ($p < m$) transfer function $G(s)$. Let
\[ G_a = \begin{bmatrix} G \\ 0 \end{bmatrix} \]
be an augmented square transfer function, and let
\[ \hat G_a = \begin{bmatrix} \hat G \\ \hat G_2 \end{bmatrix} \]
be the optimal Hankel norm approximation of $G_a$ obtained from the construction of Theorem 8.5; then $PQ = \sigma^2 I$ implies that the error can be expressed in terms of a pair of transfer functions $U$ and $V$ constructed from a Schmidt pair of the Hankel operator.
It is easy to show that if $G$ is a scalar, then $U$ and $V$ are scalar transfer functions and
$U^\sim U = V^\sim V$. Hence $V/U$ is all-pass. The details are left as an exercise for the
reader.
Next consider the discrete time case. Let
\[ G(\lambda) = C(\lambda^{-1}I - A)^{-1}B = \begin{bmatrix} A & B \\ \hline C & 0 \end{bmatrix}, \]
and let $L_c$ and $L_o$ be the corresponding controllability and observability Gramians:
\[ AL_cA^* - L_c + BB^* = 0 \]
\[ A^*L_oA - L_o + C^*C = 0. \]
And let $\sigma_1^2$, $x$ be the largest eigenvalue and a corresponding eigenvector of $L_cL_o$:
\[ L_cL_o x = \sigma_1^2 x. \]
Define the corresponding Schmidt pair
\[ u_i = \frac{1}{\sigma_1}\, B^*(A^*)^{i-1}L_o x, \quad i \ge 1, \qquad v_i = CA^i x, \quad i \ge 0. \]
Then
\[ V(\lambda) = \sum_{i=0}^{\infty} (CA^i x)\,\lambda^i = C(I - \lambda A)^{-1}x, \]
where $u$ and $v$ are partitioned such that $u_i \in \mathbb{C}^m$ and $v_i \in \mathbb{C}^p$. Finally, $U(\lambda)$ and $V(\lambda)$
can be obtained from the sequences $u$ and $v$ above.
In particular, if $G(\lambda)$ is an $n$-th order matrix polynomial, then the matrix $H$ has only a
finite number of nonzero elements, and
\[ H = \begin{bmatrix} H_n & 0 \\ 0 & 0 \end{bmatrix} \]
with
\[ H_n = \begin{bmatrix}
G_1 & G_2 & \cdots & G_{n-1} & G_n \\
G_2 & G_3 & \cdots & G_n & 0 \\
G_3 & G_4 & \cdots & 0 & 0 \\
\vdots & \vdots & & \vdots & \vdots \\
G_n & 0 & \cdots & 0 & 0
\end{bmatrix}. \]
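Building $H_n$ from the coefficients is mechanical. A minimal sketch (hypothetical scalar coefficients $G_1, \ldots, G_n$; matrix coefficients would be placed blockwise; NumPy assumed) whose singular values are then the Hankel singular values of the polynomial part:

import numpy as np

def hankel_from_coeffs(G):
    """Hn[i, j] = G_{i+j+1} when i + j + 1 <= n, and zero otherwise."""
    n = len(G)
    Hn = np.zeros((n, n))
    for i in range(n):
        for j in range(n - i):
            Hn[i, j] = G[i + j]
    return Hn

G = [1.0, 0.5, 0.25]   # hypothetical G(lambda) = G1*lambda + G2*lambda^2 + G3*lambda^3
Hn = hankel_from_coeffs(G)
print(Hn)
print("singular values:", np.linalg.svd(Hn, compute_uv=False))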
modeling problem is much deeper - the universe of mathematical models from which a
model set is chosen is distinct from the universe of physical systems. Therefore, a model
set which includes the true physical plant can never be constructed. It is necessary for
the engineer to make a leap of faith regarding the applicability of a particular design
based on a mathematical model. To be practical, a design technique must help make
this leap small by accounting for the inevitable inadequacy of models. A good model
should be simple enough to facilitate design, yet complex enough to give the engineer
confidence that designs based on the model will work on the true plant.
The term uncertainty refers to the differences or errors between models and reality,
and whatever mechanism is used to express these errors will be called a representation
of uncertainty. Representations of uncertainty vary primarily in terms of the amount
of structure they contain. This reflects both our knowledge of the physical mechanisms
which cause differences between the model and the plant and our ability to represent
these mechanisms in a way that facilitates convenient manipulation. For example, con-
sider the problem of bounding the magnitude of the effect of some uncertainty on the
output of a nominally fixed linear system. A useful measure of uncertainty in this con-
text is to provide a bound on the power spectrum of the output's deviation from its
nominal response. In the simplest case, this power spectrum is assumed to be indepen-
dent of the input. This is equivalent to assuming that the uncertainty is generated by
an additive noise signal with a bounded power spectrum; the uncertainty is represented
as additive noise. Of course, no physical system is linear with additive noise, but some
aspects of physical behavior are approximated quite well using this model. This type
of uncertainty received a great deal of attention in the literature during the 1960’s and
1970's, and elegant solutions were obtained for many interesting problems, e.g., white
noise propagation in linear systems, Wiener and Kalman filtering, and LQG optimal
control. Unfortunately, LQG optimal control did not address uncertainty adequately
and hence had less practical impact than might have been hoped.
Generally, the deviation’s power spectrum of c,he true output from the nominal will
depend significantly on the input. For example, an additive noise model is entirely in-
appropriate for capturing uncertainty arising from variations in the material properties
of physical plants. The actual construction of model sets for more general uncertainty
can be quite difficult. For example, a set membership statement for the parameters of
an otherwise known FDLTI model is a highly-structured representation of uncertainty.
It typically arises from the use of linear incremental models at various operating points,
e.g., aerodynamic coefficients in flight control vary with flight environment and aircraft
configurations, and equation coefficients in power plant control vary with aging, slag
buildup, coal composition, etc. In each case, the amounts of variation and any known
relationships between parameters can be expressed by confining the parameters to ap-
propriately defined subsets of parameter space. However, for certain classes of signals
(e.g., high frequency), the parameterized FDLTI model fails to describe the plant be-
cause the plant will always have dynamics which are not represented in the fixed order
model.
In general, we are forced to use not just a single parameterized model but model sets
Remark 9.7 It is important to note that in this case the robust stability condition is
given in terms of $L_i = KP$ while the nominal performance condition is given in terms
of $L_o = PK$. These classes of problems are called skewed problems or problems with
skewed specifications.² Since, in general, $PK \neq KP$, the robust stability margin or
tolerances for uncertainties at the plant input and output are generally not the same.
Remark 9.8 It is also noted that the robust performance condition is related to the
condition number of the weighted nominal model. So, in general, if the weighted nominal
model is ill-conditioned in the range of critical frequencies, then the robust performance
condition may be far more restrictive than the robust stability condition and the nominal
performance condition together. For simplicity, assume $W_1 = I$, $W_d = I$, and $W_2 = w_tI$,
where $w_t \in \mathcal{RH}_\infty$ is a scalar function; further, $P$ is assumed to be invertible. Then the
robust performance condition (9.10) can be rewritten accordingly, and comparing the
resulting conditions with those obtained for non-skewed problems shows that the
condition related to robust stability is scaled by the condition number of the plant.³
Since $\kappa(P) \ge 1$, it is clear that the skewed specifications are much harder to satisfy
if the plant is not well conditioned. This problem will be discussed in more detail in
Section 11.3.3 of Chapter 11. $\Box$
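The role of the condition number is easy to inspect numerically. A minimal sketch (hypothetical two-by-two plant data, chosen here only for illustration; NumPy assumed) that evaluates $\kappa(P(j\omega))$ across frequency:

import numpy as np

# Hypothetical ill-conditioned 2x2 plant P(s) = C (sI - A)^{-1} B
A = np.diag([-1.0, -10.0])
B = np.eye(2)
C = np.diag([10.0, 0.1])

for w in np.logspace(-2, 2, 5):
    P = C @ np.linalg.inv(1j * w * np.eye(2) - A) @ B
    print(f"w = {w:8.2f}   kappa(P(jw)) = {np.linalg.cond(P):10.1f}")

Large values of $\kappa(P(j\omega))$ at the critical frequencies signal that the skewed robust performance specification will be much harder to meet than the unskewed one.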
These skewed specifications also create problems for MIMO loop shaping design,
which was discussed briefly in Chapter 5. The idea of loop shaping design is
based on the fact that robust performance is guaranteed by designing a controller with
a sufficient nominal performance margin and a sufficient robust stability margin. For
example, if $\kappa(W_1PW_d) \approx 1$, the output multiplicative perturbed robust performance is
guaranteed by designing a controller with twice the required nominal performance and
robust stability margins.
²See Stein and Doyle [1991].
³An alternative condition can be derived so that the condition related to nominal performance is scaled
by the condition number.
Hence every polynomial is a linear fraction of its indeterminates. More generally, any
multidimensional (matrix) polynomial is also an LFT in its indeterminates; for example, a polynomial in $(\delta_1, \delta_2)$ can be written as $F_\ell(N, \Delta)$ for a suitable constant matrix $N$ and $\Delta = \mathrm{diag}(\delta_1 I, \delta_2 I)$.
Rational Functions
A rational matrix function can be written as
\[ F(\delta_1, \delta_2, \ldots, \delta_n) = N(\delta_1, \delta_2, \ldots, \delta_n)\,\bigl(d(\delta_1, \delta_2, \ldots, \delta_n)\, I\bigr)^{-1}, \]
where $N(\delta_1, \delta_2, \ldots, \delta_n)$ is a multidimensional matrix polynomial and $d(\delta_1, \delta_2, \ldots, \delta_n)$
is a scalar multidimensional polynomial with $d(0, 0, \ldots, 0) \neq 0$. Both $N$ and $dI$ can
be represented as LFTs and, furthermore, since $d(0, 0, \ldots, 0) \neq 0$, the inverse of $dI$ is
also an LFT, as shown in Lemma 10.3. Now the conclusion follows from the fact that the
product of LFTs is also an LFT. (Of course, the above LFT representation problem is
exactly the problem of state space realization for a multidimensional transfer matrix.)
However, this is usually not a nice way to get an LFT representation for a rational matrix,
since this approach usually results in a much higher dimensioned $\Delta$ than required. For
example,
\[ \alpha + \beta\delta = F_\ell(M, \delta), \qquad f(\delta) = \frac{1}{\alpha + \beta\delta} = F_\ell(N, \delta) \quad (\alpha \neq 0) \]
with
\[ M = \begin{bmatrix} \alpha & \beta \\ 1 & 0 \end{bmatrix} \]
and
\[ N = \begin{bmatrix} 1/\alpha & -\beta/\alpha \\ 1/\alpha & -\beta/\alpha \end{bmatrix}. \]
Similarly, a state space model
\[ \dot{x} = Ax + Bu, \qquad y = Cx + Du \]
has a transfer matrix of
\[ G(s) = D + C(sI - A)^{-1}B = F_u\!\left( \begin{bmatrix} A & B \\ C & D \end{bmatrix}, \frac{1}{s}\, I \right). \]
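The lower and upper LFTs used above are straightforward to code. The following sketch (NumPy assumed; $p, q$ are the row and column sizes of the $M_{11}$ block, a partitioning convention chosen here for the example) verifies both examples:

import numpy as np

def lft_lower(M, Delta, p, q):
    """F_l(M, Delta) = M11 + M12 Delta (I - M22 Delta)^{-1} M21."""
    M11, M12 = M[:p, :q], M[:p, q:]
    M21, M22 = M[p:, :q], M[p:, q:]
    I = np.eye(M22.shape[0])
    return M11 + M12 @ Delta @ np.linalg.inv(I - M22 @ Delta) @ M21

def lft_upper(M, Delta, p, q):
    """F_u(M, Delta) = M22 + M21 Delta (I - M11 Delta)^{-1} M12."""
    M11, M12 = M[:p, :q], M[:p, q:]
    M21, M22 = M[p:, :q], M[p:, q:]
    I = np.eye(M11.shape[0])
    return M22 + M21 @ Delta @ np.linalg.inv(I - M11 @ Delta) @ M12

# alpha + beta*delta = F_l([[alpha, beta], [1, 0]], delta)
alpha, beta, delta = 2.0, 3.0, 0.4
M = np.array([[alpha, beta], [1.0, 0.0]])
assert np.isclose(lft_lower(M, np.array([[delta]]), 1, 1)[0, 0], alpha + beta * delta)

# G(s) = D + C (sI - A)^{-1} B = F_u([[A, B], [C, D]], (1/s) I)
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[1.0], [0.5]]); C = np.array([[1.0, 0.0]]); D = np.array([[0.2]])
s = 1j * 0.7
Mss = np.block([[A, B], [C, D]])
G_direct = D + C @ np.linalg.inv(s * np.eye(2) - A) @ B
assert np.allclose(G_direct, lft_upper(Mss, (1 / s) * np.eye(2), 2, 2))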
More generally, consider a discrete time 2-D (or m-D) system realized by a first-order
state space equation. In the same way, take
\[ \Delta = \begin{bmatrix} \delta_1 I & 0 \\ 0 & \delta_2 I \end{bmatrix}. \]
We have seen that in the $\mathcal{H}_2$ FI case, the optimal controller uses just the state $x$ even
though the controller is provided with full information. We will show below that, in the
$\mathcal{H}_\infty$ case, a suboptimal controller exists which also uses just $x$. This case could have
been restricted to state feedback, which is more traditional, but we believe that, once
one gets outside the pure $\mathcal{H}_2$ setting, the full information problem is more fundamental
and more natural than the state feedback problem.
One setting in which the full information case is more natural occurs when the
parameterization of all suboptimal controllers is considered. It is also appropriate when
studying the general case when $D_{11} \neq 0$ in the next chapter or when $\mathcal{H}_\infty$ optimal
(not just suboptimal) controllers are desired. Even though the optimal problem is not
studied in detail in this book, we want the methods to extend to the optimal case in a
natural and straightforward way.
The assumptions relevant to the FI problem which are inherited from the output
feedback problem are
(i) $(C_1, A)$ is detectable;
Theorem 16.9 There exists an admissible controller $K(s)$ for the FI problem such that
$\|T_{zw}\|_\infty < \gamma$ if and only if $H_\infty \in \mathrm{dom}(\mathrm{Ric})$ and $X_\infty = \mathrm{Ric}(H_\infty) \ge 0$. Furthermore,
if these conditions are satisfied, a class of admissible controllers satisfying $\|T_{zw}\|_\infty < \gamma$
can be parameterized as in (16.9).
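The test in Theorem 16.9 reduces to one Riccati computation. A generic sketch of the $\mathrm{Ric}(\cdot)$ operation via the stable invariant subspace of the Hamiltonian (NumPy/SciPy assumed; the data below are a stand-in, not the FI Hamiltonian itself, whose off-diagonal blocks would involve terms such as $\gamma^{-2}B_1B_1^* - B_2B_2^*$ and $-C_1^*C_1$):

import numpy as np
from scipy.linalg import schur

def ric(H):
    """X = Ric(H) via the stable invariant subspace of the 2n x 2n Hamiltonian H.
    Returns None when the subspace is deficient (H not in dom(Ric))."""
    n = H.shape[0] // 2
    T, Z, sdim = schur(H, output='real', sort='lhp')   # stable eigenvalues first
    if sdim != n:
        return None
    X1, X2 = Z[:n, :n], Z[n:, :n]
    if np.linalg.cond(X1) > 1e12:
        return None
    return X2 @ np.linalg.inv(X1)

# Stand-in Hamiltonian H = [[A, R], [-Q, -A*]] with R = R*, Q = Q*
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
R = -0.5 * np.eye(2)
Q = np.eye(2)
H = np.block([[A, R], [-Q, -A.T]])

X = ric(H)
if X is not None:
    # X solves A*X + XA + XRX + Q = 0; the theorem also requires X >= 0
    print("residual :", np.linalg.norm(A.T @ X + X @ A + X @ R @ X + Q))
    print("min eig X:", np.linalg.eigvalsh(X).min())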
point of the constraints. Then there exists a unique multiplier $P = P^* \in \mathbb{R}^{l \times l}$ such that
if we set $F(x) = f(x) + \mathrm{Trace}(T(x)P)$, then $\nabla F(x_0) = 0$, i.e.,
\[ \nabla F(x_0) = \nabla f(x_0) + \nabla\,\mathrm{Trace}(T(x_0)P) = 0. \]
In general, in the case where a local minimal point $x_0$ is not necessarily a regular
point, we have the following corollary.
Corollary 20.5 Suppose that $x_0 \in \mathbb{R}^n$ is a local minimum of $f(x)$ subject to the con-
straints $T(x) = 0$, where $T(x) = T(x)^* \in \mathbb{R}^{l \times l}$. Then there exist $0 \neq (\lambda_0, P) \in \mathbb{R} \times \mathbb{R}^{l \times l}$
with $P = P^*$ such that
\[ \lambda_0 \nabla f(x_0) + \nabla\,\mathrm{Trace}(T(x_0)P) = 0. \]
Remark 20.2 We shall also note that the variable $x \in \mathbb{R}^n$ may be more conveniently
given in terms of a matrix $X \in \mathbb{R}^{p \times q}$, i.e., we have
\[ x = \mathrm{Vec}\,X := \begin{bmatrix} x_{11} \\ \vdots \\ x_{p1} \\ x_{12} \\ \vdots \\ x_{pq} \end{bmatrix}. \]
Then $\nabla F(x) = 0$ is equivalent to
\[ \frac{\partial F}{\partial X} = 0. \]
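The equivalence in Remark 20.2 is just the chain rule through the $\mathrm{Vec}$ map. A tiny numerical check (NumPy assumed) for the hypothetical objective $F(X) = \mathrm{Trace}(AX)$, whose matrix gradient is $\partial F/\partial X = A^T$:

import numpy as np

rng = np.random.default_rng(1)
p, q = 3, 2
A = rng.standard_normal((q, p))          # F(X) = Trace(A X), X is p x q

def F_vec(x):
    X = x.reshape(p, q, order='F')       # x = Vec X (column stacking)
    return np.trace(A @ X)

# Numerical vector gradient of F at a random point
x0 = rng.standard_normal(p * q)
eps = 1e-6
g = np.array([(F_vec(x0 + eps * e) - F_vec(x0 - eps * e)) / (2 * eps)
              for e in np.eye(p * q)])

# The matrix gradient is A^T, and Vec(A^T) must match the vector gradient
assert np.allclose(g, (A.T).flatten(order='F'), atol=1e-5)

Hence a stationary point in $x$ is a stationary point in $X$, and vice versa.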
Remark 20.5 This method can also be used to derive the $\mathcal{H}_\infty$ results presented in
the previous chapters. The interested reader should consult the references for details.
It should be pointed out that this method suffers a severe deficiency: global results
are hard to find due to the lack of convexity in general. Hence, even if a fixed-order
controller can be found, it may not be optimal. $\Box$
Remark 21.1 If $\lambda_i(A)\lambda_j(B) = 1$ for some $i, j$, then the equation (21.1) has either no
solution or more than one solution, depending on the specific data given. If $B = A^*$ and
$Q = Q^*$, then the equation is called the discrete Lyapunov equation. $\Box$
The following results are analogous to the corresponding continuous time cases, so they
will be stated without proof.
Lemma 21.6 Let $Q$ be a symmetric matrix, and consider the following Lyapunov equa-
tion:
\[ AXA^* - X + Q = 0. \]
1. Suppose that $A$ is stable; then the following statements hold:
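Discrete Lyapunov equations of this form are directly solvable; a minimal sketch (hypothetical stable $A$; SciPy assumed, whose solver uses exactly the convention $AXA^* - X + Q = 0$):

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

A = np.array([[0.5, 0.2], [0.0, 0.8]])   # stable: eigenvalues inside the unit circle
Q = np.eye(2)

X = solve_discrete_lyapunov(A, Q)        # solves A X A* - X + Q = 0
print("residual:", np.linalg.norm(A @ X @ A.T - X + Q))
print("X positive definite:", bool(np.all(np.linalg.eigvalsh(X) > 0)))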
A matrix
\[ S := \begin{bmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{bmatrix}, \qquad S_{ij} \in \mathbb{R}^{n \times n}, \]
is called symplectic if $S^TJS = J$, where $J = \begin{bmatrix} 0 & -I \\ I & 0 \end{bmatrix}$. Since $J$ is nonsingular, a
symplectic matrix has no eigenvalues at the origin and, furthermore, it is easy to see that
if $\lambda$ is an eigenvalue of a symplectic matrix $S$, then $\bar\lambda$, $1/\lambda$, and $1/\bar\lambda$ are also eigenvalues
of $S$.
If $S_{22}$ is assumed to be invertible, then the symplectic matrix $S$ is necessarily of the
form
\[ S = \begin{bmatrix} A + G(A^*)^{-1}Q & -G(A^*)^{-1} \\ -(A^*)^{-1}Q & (A^*)^{-1} \end{bmatrix}. \qquad (21.2) \]
Let $X_-(S)$ denote the invariant subspace of $S$ corresponding to its eigenvalues inside the unit circle, and write
\[ X_-(S) = \mathrm{Im}\begin{bmatrix} T_1 \\ T_2 \end{bmatrix}, \]
where $T_1, T_2 \in \mathbb{R}^{n \times n}$. If $T_1$ is nonsingular or, equivalently, if the two subspaces
\[ X_-(S), \qquad \mathrm{Im}\begin{bmatrix} 0 \\ I \end{bmatrix} \]
are complementary, then $X := T_2T_1^{-1}$ is well defined and independent of the particular choice of basis for $X_-(S)$.
Definition 21.1 The domain of Ric, denoted by $\mathrm{dom}(\mathrm{Ric})$, consists of all $(2n \times 2n)$
symplectic matrices $S$ such that $S$ has no eigenvalues on the unit circle and the two
subspaces $X_-(S)$ and $\mathrm{Im}\begin{bmatrix} 0 \\ I \end{bmatrix}$ are complementary. For $S \in \mathrm{dom}(\mathrm{Ric})$, set $X = \mathrm{Ric}(S) := T_2T_1^{-1}$.
Theorem 21.7 Let $S$ be defined as in (21.2), and suppose $S \in \mathrm{dom}(\mathrm{Ric})$ and $X = \mathrm{Ric}(S)$.
Then $X = X^*$ and $X$ satisfies the discrete Riccati equation (21.3). Note that the discrete Riccati equation in (21.3) can also be written as
\[ A^*(I + XG)^{-1}XA - X + Q = 0. \]
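When $A$ is nonsingular, $\mathrm{Ric}(S)$ can be computed exactly as described: form $S$ from (21.2), take a basis of the invariant subspace for eigenvalues inside the unit circle, and set $X = T_2T_1^{-1}$. A sketch (hypothetical data; NumPy assumed) with a residual check against the Riccati equation above:

import numpy as np

A = np.array([[2.0, 1.0], [0.0, 0.5]])   # nonsingular, one unstable mode
G = 0.5 * np.eye(2)                      # G = G* >= 0 (e.g. of the form B R^{-1} B*)
Q = np.eye(2)                            # Q = Q* >= 0

Ainv_star = np.linalg.inv(A.T)
S = np.block([[A + G @ Ainv_star @ Q, -G @ Ainv_star],
              [-Ainv_star @ Q,         Ainv_star]])

# Basis of the invariant subspace for eigenvalues inside the unit circle
w, V = np.linalg.eig(S)
T = V[:, np.abs(w) < 1]                  # 2n x n when S is in dom(Ric)
T1, T2 = T[:2, :], T[2:, :]
X = np.real(T2 @ np.linalg.inv(T1))

# Check A*(I + XG)^{-1} X A - X + Q = 0 and X >= 0
res = A.T @ np.linalg.inv(np.eye(2) + X @ G) @ X @ A - X + Q
print("residual:", np.linalg.norm(res))
print("X >= 0  :", bool(np.all(np.linalg.eigvalsh(X) >= -1e-9)))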
Remark 21.2 In the case that $A$ is singular, all results presented in this chapter will
still be true if the eigenvalue problem for $S$ is replaced by the following generalized
eigenvalue problem:
\[ \lambda \begin{bmatrix} I & G \\ 0 & A^* \end{bmatrix} x = \begin{bmatrix} A & 0 \\ -Q & I \end{bmatrix} x. \]
Theorem 21.11 Suppose that $G = G^* \ge 0$ and $Q = Q^* \ge 0$, and let
\[ S = \begin{bmatrix} A + G(A^*)^{-1}Q & -G(A^*)^{-1} \\ -(A^*)^{-1}Q & (A^*)^{-1} \end{bmatrix}. \]
Then $S \in \mathrm{dom}(\mathrm{Ric})$ iff $(A, G)$ is stabilizable and $(Q, A)$ has no unobservable modes on
the unit circle. Furthermore, $X = \mathrm{Ric}(S) \ge 0$ if $S \in \mathrm{dom}(\mathrm{Ric})$, and $X > 0$ if and only
if $(Q, A)$ has no unobservable stable modes.
Proof. Let $Q = C^*C$ for some matrix $C$. The first half of the theorem follows from
Lemmas 21.9 and 21.10. Now rewrite the discrete Riccati equation; for an unobservable
stable mode one obtains
\[ XAx = 0, \qquad Cx = 0. \qquad (21.17) \]
Now $X \ge 0$, $G \ge 0$, and $|\lambda| < 1$ imply that $|\lambda|^2(I + X^{1/2}GX^{1/2})^{-1} - I < 0$. Hence
$Xx = 0$, i.e., $X$ is singular. $\Box$
Lemma 21.12 Suppose that $D$ has full column rank, and let $R = D^*D > 0$; then the
following statements are equivalent:
(i) \[ \begin{bmatrix} A - e^{j\theta}I & B \\ C & D \end{bmatrix} \qquad (21.18) \]
has full column rank for all $\theta \in [0, 2\pi]$;
(ii) $\bigl((I - DR^{-1}D^*)C,\ A - BR^{-1}D^*C\bigr)$ has no unobservable modes on the unit circle.
Proof. Suppose $e^{j\theta}$ is an unobservable mode of $\bigl((I - DR^{-1}D^*)C,\ A - BR^{-1}D^*C\bigr)$;
then there is an $x \neq 0$ such that $(A - BR^{-1}D^*C)x = e^{j\theta}x$ and $(I - DR^{-1}D^*)Cx = 0$,
and a direct substitution shows that $\begin{bmatrix} x \\ -R^{-1}D^*Cx \end{bmatrix}$ is in the kernel of (21.18), so (21.18)
does not have full column rank. Conversely, suppose that (21.18) does not have full
column rank; then there exist $x$ and $y$, not both zero, such that
\[ (A - BR^{-1}D^*C - e^{j\theta}I)x + By = 0 \qquad (21.19) \]
\[ (I - DR^{-1}D^*)Cx + Dy = 0. \qquad (21.20) \]
Pre-multiply (21.20) by $D^*$ to get $y = 0$. Then we have
\[ (A - BR^{-1}D^*C)x = e^{j\theta}x, \qquad (I - DR^{-1}D^*)Cx = 0, \]
i.e., $e^{j\theta}$ is an unobservable mode of $\bigl((I - DR^{-1}D^*)C,\ A - BR^{-1}D^*C\bigr)$. $\Box$
Corollary 21.13 Suppose that $D$ has full column rank and denote $R = D^*D > 0$. Let
$S$ have the form
\[ S = \begin{bmatrix} E + G(E^*)^{-1}Q & -G(E^*)^{-1} \\ -(E^*)^{-1}Q & (E^*)^{-1} \end{bmatrix}, \]
where $E = A - BR^{-1}D^*C$, $G = BR^{-1}B^*$, $Q = C^*(I - DR^{-1}D^*)C$, and $E$ is assumed
to be nonsingular.
Note that the Riccati equation corresponding to the symplectic matrix in Corol-
lary 21.13 is
\[ E^*XE - X - E^*XG(I + XG)^{-1}XE + Q = 0. \]
Lemma 21.14 Let
\[ M(z) = \begin{bmatrix} A & B \\ \hline C & 0 \end{bmatrix} \in \mathcal{RH}_\infty, \]
and let $S$ be a symplectic matrix defined by
\[ S := \begin{bmatrix} A - BB^*(A^*)^{-1}C^*C & -BB^*(A^*)^{-1} \\ -(A^*)^{-1}C^*C & (A^*)^{-1} \end{bmatrix}. \]
Then the following statements are equivalent:
(i) $\|M(z)\|_\infty < 1$;
(ii) $S$ has no eigenvalues on the unit circle and $\|C(I - A)^{-1}B\| < 1$.
Proof. A state space calculation gives a realization of $[I - M^\sim(z)M(z)]^{-1}$ whose
state matrix is $S$; hence $(I - M^\sim M)^{-1} \in \mathcal{RL}_\infty$ if and only if $S$ has no eigenvalues on
the unit circle. (If $e^{j\theta}$ were such an eigenvalue with left eigenvector $q^* \neq 0$, one finds
$q^*B = 0$, and a similarity transformation of $S$ then leads to a contradiction.) So it is
sufficient to show that
\[ \|M(z)\|_\infty < 1 \iff (I - M^\sim M)^{-1} \in \mathcal{RL}_\infty \ \text{and}\ \|M(1)\| = \|C(I - A)^{-1}B\| < 1. \]
It is obvious that the right hand side is necessary. To show that it is also sufficient, sup-
pose $\|M(z)\|_\infty \ge 1$; then $\bar\sigma\bigl(M(e^{j\theta})\bigr) = 1$ for some $\theta \in [0, 2\pi]$, since $\bar\sigma\bigl(M(1)\bigr) < 1$
and $M(e^{j\theta})$ is continuous in $\theta$. This implies that $1$ is an eigenvalue of $M^*(e^{-j\theta})M(e^{j\theta})$,
so $I - M^*(e^{-j\theta})M(e^{j\theta})$ is singular. This contradicts $(I - M^\sim M)^{-1} \in \mathcal{RL}_\infty$. $\Box$
In the above lemma, we have assumed that the transfer matrix is strictly proper.
We shall now see how to handle the non-strictly proper case. For that purpose, we shall
focus our attention on stable systems, and we shall give an example below to show
why this restriction is sometimes necessary for the technique to work.
We first note that the $\mathcal{H}_\infty$ norm of a stable system is defined as
\[ \|M(z)\|_\infty = \sup_{\theta \in [0, 2\pi]} \bar\sigma\bigl(M(e^{j\theta})\bigr). \]
For example, let
\[ M_1(z) = \frac{z}{z - 1/a} = \begin{bmatrix} 1/a & 1 \\ \hline 1/a & 1 \end{bmatrix} \in \mathcal{RL}_\infty, \qquad 0 < a < \frac{1}{2}. \]
Then $\|M_1(z)\|_\infty = \dfrac{a}{1-a} < 1$, but $I - D^*D = 0$. In general, if $M \in \mathcal{RL}_\infty$ and $\|M\|_\infty < 1$,
then $I - D^*D$ can be indefinite. $\Box$
Lemma 21.15 Let $M(z) = \begin{bmatrix} A & B \\ \hline C & D \end{bmatrix} \in \mathcal{RH}_\infty$. Then $\|M(z)\|_\infty < 1$ if and only if
$\bar\sigma(D) < 1$ and
\[ N(z) = \begin{bmatrix} A + B(I - D^*D)^{-1}D^*C & B(I - D^*D)^{-1/2} \\ \hline (I - DD^*)^{-1/2}C & 0 \end{bmatrix} \]
satisfies $\|N(z)\|_\infty < 1$.
Theorem 21.16 Let $M(z) = \begin{bmatrix} A & B \\ \hline C & D \end{bmatrix} \in \mathcal{RH}_\infty$ and define
\[ E := A + B(I - D^*D)^{-1}D^*C \]
\[ G := -B(I - D^*D)^{-1}B^* \]
\[ Q := C^*(I - DD^*)^{-1}C \]
\[ S := \begin{bmatrix} E + G(E^*)^{-1}Q & -G(E^*)^{-1} \\ -(E^*)^{-1}Q & (E^*)^{-1} \end{bmatrix}. \]
Note that the Riccati equation in (c) can also be written as
\[ E^*(I + XG)^{-1}XE - X + Q = 0. \]
The last equality is obtained from substituting in the Riccati equation. Now pre-multiply
the above equation by $B^*(z^{-1}I - A^*)^{-1}$ and post-multiply by $(zI - A)^{-1}B$ to get
\[ I - M^*(z^{-1})M(z) = W^*(z^{-1})W(z), \]
where
\[ W(z) = \begin{bmatrix} A & B \\ \hline (I - B^*XB)^{-1/2}B^*XA & -(I - B^*XB)^{1/2} \end{bmatrix}, \]
that are all inside the unit circle. Hence $e^{j\theta}$ cannot be a zero of $W(z)$.
Therefore, we get $I - M^*(e^{-j\theta})M(e^{j\theta}) > 0$ for all $\theta \in [0, 2\pi]$, i.e., $\|M\|_\infty < 1$. $\Box$
The following more general results can be proven easily following the same procedure
as in the proof of (c) $\Rightarrow$ (a). Suppose now that $X = X^*$ solves the corresponding Riccati
equation. Then
(1) \[ I - M^*(z^{-1})M(z) = W^*(z^{-1})(I - B^*XB)W(z), \]
where
\[ W(z) = \begin{bmatrix} A & B \\ \hline (I - B^*XB)^{-1}B^*XA & -I \end{bmatrix}; \]
(2) if $I - B^*XB > 0\ (< 0)$ and $|\lambda_i\{(I - BB^*X)^{-1}A\}| \neq 1$, then $\|M(z)\|_\infty < 1\ (> 1)$.
Remark 21.4 As in the continuous time case, the equivalence between (a) and (b) in
Theorem 21.16 can be used to compute the $\mathcal{H}_\infty$ norm of a discrete time transfer matrix. $\Box$
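A sketch of that computation (hypothetical data; NumPy assumed): the test scales $(C, D)$ by $1/\gamma$, forms $(E, G, Q)$ and the symplectic matrix $S$ of Theorem 21.16, and accepts $\gamma$ when $S$ has no eigenvalues on the unit circle; bisection then brackets the norm. The tolerances and the assumption that $E$ is invertible are ours, not the book's.

import numpy as np

def symplectic_test(A, B, C, D):
    """Certify ||M||_inf < 1 in the style of Theorem 21.16 (assumes E invertible)."""
    n = A.shape[0]
    if np.linalg.norm(D, 2) >= 1:                       # need sigma_max(D) < 1
        return False
    if np.linalg.norm(D + C @ np.linalg.inv(np.eye(n) - A) @ B, 2) >= 1:
        return False                                    # and ||M(1)|| < 1
    Rti = np.linalg.inv(np.eye(D.shape[1]) - D.T @ D)
    E = A + B @ Rti @ D.T @ C
    G = -B @ Rti @ B.T
    Q = C.T @ np.linalg.inv(np.eye(D.shape[0]) - D @ D.T) @ C
    Ei = np.linalg.inv(E.T)
    S = np.block([[E + G @ Ei @ Q, -G @ Ei], [-Ei @ Q, Ei]])
    # no eigenvalues on the unit circle (up to a numerical tolerance)
    return bool(np.all(np.abs(np.abs(np.linalg.eigvals(S)) - 1.0) > 1e-8))

def hinf_norm(A, B, C, D, tol=1e-6):
    hi = 1.0
    while not symplectic_test(A, B, C / hi, D / hi):
        hi *= 2.0                                       # find an upper bound
    lo = 0.0
    while hi - lo > tol:
        g = 0.5 * (lo + hi)
        if symplectic_test(A, B, C / g, D / g):
            hi = g
        else:
            lo = g
    return hi

A = np.array([[0.5, 0.1], [0.0, 0.3]]); B = np.array([[1.0], [1.0]])
C = np.array([[1.0, 0.5]]);             D = np.array([[0.1]])
print("||M||_inf ~", hinf_norm(A, B, C, D))
# Cross-check by brute force on a frequency grid
grid = max(np.linalg.norm(D + C @ np.linalg.inv(np.exp(1j*t)*np.eye(2) - A) @ B, 2)
           for t in np.linspace(0.0, 2.0*np.pi, 2000))
print("grid max  ~", grid)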
21.5 Matrix Factorizations
21.5.1 Inner Functions
A transfer matrix $N(z)$ is called inner if $N(z)$ is stable and $N^\sim(z)N(z) = I$ for all
$z = e^{j\theta}$. Note that $N^\sim(e^{j\theta}) = N^*(e^{j\theta})$. A transfer matrix is called outer if all its
transmission zeros are stable (i.e., inside the unit disc).
Suppose $X = X^*$ satisfies
\[ A^*XA - X + C^*C = 0. \]
Then
(b) $A$ stable, $(A, B)$ reachable, and $N^\sim N = D^*D + B^*XB$ implies $D^*C + B^*XA = 0$.
Substituting $C^*C = X - A^*XA$ into the above equation and combining terms gives the conclusion.
The following corollary is a special case of this lemma, which gives necessary and
sufficient conditions for a discrete inner transfer matrix in terms of a state-space representa-
tion: $N(z)$ is inner if and only if there exists a matrix $X = X^* \ge 0$ such that
(a) $A^*XA - X + C^*C = 0$
(b) $D^*C + B^*XA = 0$
(c) $D^*D + B^*XB = I$.
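The corollary yields a direct numerical test for innerness. The sketch below (hypothetical first-order all-pass; NumPy/SciPy assumed) computes $X$ from (a), checks (b) and (c), then confirms $|N(e^{j\theta})| = 1$ on the unit circle:

import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# Hypothetical first-order all-pass N(z) = (1 - a z)/(z - a), |a| < 1
a = 0.4
A = np.array([[a]]);            B = np.array([[1.0]])
C = np.array([[1.0 - a * a]]);  D = np.array([[-a]])

# (a): A* X A - X + C* C = 0
X = solve_discrete_lyapunov(A.T, C.T @ C)
print("(b) D*C + B*XA =", (D.T @ C + B.T @ X @ A))   # should be 0
print("(c) D*D + B*XB =", (D.T @ D + B.T @ X @ B))   # should be I

# Confirm |N(e^{j theta})| = 1 on the unit circle
for t in np.linspace(0.0, 2.0 * np.pi, 7):
    z = np.exp(1j * t)
    N = D + C @ np.linalg.inv(z * np.eye(1) - A) @ B
    assert np.isclose(abs(N[0, 0]), 1.0)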
Index
weighting, 137
well-posedness, 119, 249
winding number, 417
zero, 78, 80
blocking, 82
direction, 144
invariant, 86
transmission, 82