MATHEMATICAL PHYSICS
EUGENE BUTKOV
St. John’s University, New York
ADDISON-WESLEY PUBLISHING COMPANY
Reading, Massachusetts · Menlo Park, California · London · Sydney · Manila
WORLD STUDENT SERIES EDITION
FIRST PRINTING 1973
A complete and unabridged reprint of the original American textbook, this World
Student Series edition may be sold only in those countries to which it is consigned
by Addison-Wesley or its authorized trade distributors. It may not be
re-exported from the country to which it has been consigned, and it may not be
sold in the United States of America or its possessions.
Copyright © 1968 by Addison-Wesley Publishing Company, Inc. All rights reserved. No
part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any
form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without
the prior written permission of the publisher. Original edition published in the United
States of America. Published simultaneously in Canada. Philippines copyright 1968.
Library of Congress Catalog Card Number: 68-11391.
PREFACE
During the past decade we have witnessed a remarkable increase in the number of
students seeking higher education as well as the development of many new colleges
and universities. The inevitable nonuniformity of conditions present in different
institutions necessitates considerable variety in purpose, general approach, and
the level of instruction in any given discipline. This has naturally contributed to
the proliferation of texts on almost any topic, and the subject of mathematical
physics is no exception. There is a number of texts in this field, and some of them
are undoubtedly of outstanding quality.
Nevertheless, many teachers often feel that none of the existing texts is properly
suited, for one reason or another, for their particular courses. More important,
students sometimes complain that they have difficulties studying the subject from
texts of unquestionable merit. This is not as surprising as it sounds: Some texts
have an encyclopedic character, with the material arranged in a different order
from the way it is usually taught; others become too much involved in complex
mathematical analysis, preempting the available space from practical examples;
still others cover a very wide variety of topics with utmost brevity, leaving the
student to struggle with a number of difficult questions of theoretical nature.
True enough, a well-prepared and bright student should be able to find his way
through most of such difficulties. A less-gifted student may, however, find it very
difficult to grasp and absorb the multitude of new concepts strewn across an ad-
vanced text.
Under these circumstances, it seems desirable to give more stress to the peda-
gogical side of a text to make it more readable to the student and more suitable for
independent study. Hopefully, the present work represents a step in this direction.
It has several features designed to conform to the path an average student may
conceivably follow in acquiring the knowledge of the subject.
First, the inductive approach is used in each chapter throughout the book. Fol-
lowing the fundamentals of modern physics, the text is almost entirely devoted to
linear problems, but the unifying concepts of linear space are fully developed rather
late in the book after the student is exposed to a number of practical mathematical
techniques. Also, almost every chapter starts with an example or discussion of
elementary nature, with subject matter that is probably familiar to the reader.
The introduction of new concepts is made against a familiar background and is
later extended to more sophisticated situations. A typical example is Chapter 8,
where the basic aspects of partial differential equations are illustrated using the
“elementary functions” exclusively. Another facet of this trend is the repeated
use of the harmonic oscillator and the stretched string as physical models: no
attempt is made to solve as many problems for the student as possible, but rather
to show how various methods can be used to the same end within a familiar phys-
ical context.
In the process of learning, students inevitably pose a number of questions
necessary to clarify the material under scrutiny. While most of these questions
naturally belong to classroom discussion, it is certainly beneficial to attempt to
anticipate some of them in a text. The Remarks and many footnotes are designed
to contribute to this goal. The author hopes they answer some questions in the
mind of the student as well as suggest some new ones, stimulating an interest in
further inquiry. A number of cross-references serves a similar purpose, inviting
the reader to make multiple use of various sections of the book. The absence of
numbered formulas is intentional: if the student bothers to look into the indicated
section or page, he should not simply check that the quoted formula “is indeed
there,” but, rather, glance through the text and recall its origin and meaning.
The question of mathematical rigor is quite important in the subject treated
here, although it is sometimes controversial. It is the author's opinion that a theo-
retical physicist should know where he stands, whether he is proving his own deduc-
tions, quoting somebody else’s proof, or just offering a reasonable conjecture.
Consequently, he should be trained in this direction, and the texts should be written
in this spirit. On the other hand, it would be unwise to overload every student
with mathematics for two main reasons: first, because of the limitations of time
in the classroom and the space in a text, and second, because physicists are apt to
change their mathematical postulates as soon as experimental physics lends support
to such suggestions. The reader can find examples of the latter philosophy in
Chapters 4 and 6 of the text. Whether the author was able to follow these principles
is left to the judgment of users of this book.
Each chapter is supplied with its share of problems proportional to the time
presumed to be allotted to its study. The student may find some of the problems
rather difficult since they require more comprehension of the material rather than
sheer technique. To balance this, a variety of hints and explanations are often
supplied. Answers are not given because many problems contain the answer in
their formulation; the remaining ones may be used to test the ability of the student
for independent work. The exercises within the text can be used as problems to
test the students’ manipulative skills.
For many of the methods of instruction of mathematical physics presented in
this book, the author is indebted to his own teachers at the University of British
Columbia and McGill University. The encouragement of his colleagues and
students at St. John’s University and Hunter College of the City University of
New York is greatly appreciated. Also, the author wishes to thank Mrs. Ludmilla
Verenicin and Miss Anne Marie Nowom for their help in the preparation of the
manuscript.
Palo Alto, Calif. E.B.
August 1966
CONTENTS
Chapter 1  Vectors, Matrices, and Coordinates
1.1  Introduction
1.2  Vectors in Cartesian Coordinate Systems
1.3  Changes of Axes. Rotation Matrices
1.4  Repeated Rotations. Matrix Multiplication
1.5  Skew Cartesian Systems. Matrices in General
1.6  Scalar and Vector Fields
1.7  Vector Fields in Plane
1.8  Vector Fields in Space
1.9  Curvilinear Coordinates
Chapter 2  Functions of a Complex Variable
2.1  Complex Numbers
2.2  Basic Algebra and Geometry of Complex Numbers
2.3  De Moivre Formula and the Calculation of Roots
2.4  Complex Functions. Euler’s Formula
2.5  Applications of Euler’s Formula
2.6  Multivalued Functions and Riemann Surfaces
2.7  Analytic Functions. Cauchy Theorem
2.8  Other Integral Theorems. Cauchy Integral Formula
2.9  Complex Sequences and Series
2.10  Taylor and Laurent Series
2.11  Zeros and Singularities
2.12  The Residue Theorem and its Applications
2.13  Conformal Mapping by Analytic Functions
2.14  Complex Sphere and Point at Infinity
2.15  Integral Representations
Chapter 3  Linear Differential Equations of Second Order
3.1  General Introduction. The Wronskian
3.2  General Solution of the Homogeneous Equation
3.3  The Nonhomogeneous Equation. Variation of Constants
3.4  Power Series Solutions
3.5  The Frobenius Method
3.6  Some Other Methods of Solution
Chapter 4  Fourier Series
4.1  Trigonometric Series
4.2  Definition of Fourier Series
4.3  Examples of Fourier Series
4.4  Parity Properties. Sine and Cosine Series
4.5  Complex Form of Fourier Series
4.6  Pointwise Convergence of Fourier Series
4.7  Convergence in the Mean
4.8  Applications of Fourier Series
Chapter 5  The Laplace Transformation
5.1  Operational Calculus
5.2  The Laplace Integral
5.3  Basic Properties of Laplace Transform
5.4  The Inversion Problem
5.5  The Rational Fraction Decomposition
5.6  The Convolution Theorem
5.7  Additional Properties of Laplace Transform
5.8  Periodic Functions. Rectification
5.9  The Mellin Inversion Integral
5.10  Applications of Laplace Transforms
Chapter 6  Concepts of the Theory of Distributions
6.1  Strongly Peaked Functions and the Dirac Delta Function
6.2  Delta Sequences
6.3  The δ-Calculus
6.4  Representations of Delta Functions
6.5  Applications of the δ-Calculus
6.6  Weak Convergence
6.7  Correspondence of Functions and Distributions
6.8  Properties of Distributions
6.9  Sequences and Series of Distributions
6.10  Distributions in N Dimensions
Chapter 7  Fourier Transforms
7.1  Representations of a Function
7.2  Examples of Fourier Transformations
7.3  Properties of Fourier Transforms
7.4  Fourier Integral Theorem
7.5  Fourier Transforms of Distributions
7.6  Fourier Sine and Cosine Transforms
7.7  Applications of Fourier Transforms. The Principle of Causality
Chapter 8  Partial Differential Equations
8.1  The Stretched String. Wave Equation
8.2  The Method of Separation of Variables
8.3  Laplace and Poisson Equations
8.4  The Diffusion Equation
8.5  Use of Fourier and Laplace Transforms
8.6  The Method of Eigenfunction Expansions and Finite Transforms
8.7  Continuous Eigenvalue Spectrum
8.8  Vibrations of a Membrane. Degeneracy
8.9  Propagation of Sound. Helmholtz Equation
Chapter 9  Special Functions
9.1  Cylindrical and Spherical Coordinates
9.2  The Common Boundary-Value Problems
9.3  The Sturm-Liouville Problem
9.4  Self-Adjoint Operators
9.5  Legendre Polynomials
9.6  Fourier-Legendre Series
9.7  Bessel Functions
9.8  Associated Legendre Functions and Spherical Harmonics
9.9  Spherical Bessel Functions
9.10  Neumann Functions
9.11  Modified Bessel Functions
Chapter 10  Finite-Dimensional Linear Spaces
10.1  Oscillations of Systems with Two Degrees of Freedom
10.2  Normal Coordinates and Linear Transformations
10.3  Vector Spaces, Bases, Coordinates
10.4  Linear Operators, Matrices, Inverses
10.5  Changes of Basis
10.6  Inner Product. Orthogonality. Unitary Operators
10.7  The Metric. Generalized Orthogonality
10.8  Eigenvalue Problems. Diagonalization
10.9  Simultaneous Diagonalization
Chapter 11  Infinite-Dimensional Vector Spaces
11.1  Spaces of Functions
11.2  The Postulates of Quantum Mechanics
11.3  The Harmonic Oscillator
11.4  Matrix Representations of Linear Operators
11.5  Algebraic Methods of Solution
11.6  Bases with Generalized Orthogonality
11.7  Stretched String with a Discrete Mass in the Middle
11.8  Applications of Eigenfunctions
Chapter 12  Green’s Functions
12.1  Introduction
12.2  Green’s Function for the Sturm-Liouville Operator
12.3  Series Expansions for G(x|ξ)
12.4  Green’s Functions in Two Dimensions
12.5  Green’s Functions for Initial Conditions
12.6  Green’s Functions with Reflection Properties
12.7  Green’s Functions for Boundary Conditions
12.8  The Green’s Function Method
12.9  A Case of Continuous Spectrum
Chapter 13  Variational Methods
13.1  The Brachistochrone Problem
13.2  The Euler-Lagrange Equation
13.3  Hamilton’s Principle
13.4  Problems involving Sturm-Liouville Operators
13.5  The Rayleigh-Ritz Method
13.6  Variational Problems with Constraints
13.7  Variational Formulation of Eigenvalue Problems
13.8  Variational Problems in Many Dimensions
13.9  Formulation of Eigenvalue Problems by the Ratio Method
Chapter 14  Traveling Waves, Radiation, Scattering
14.1  Motion of Infinite Stretched String
14.2  Propagation of Initial Conditions
14.3  Semi-infinite String. Use of Symmetry Properties
14.4  Energy and Power Flow in a Stretched String
14.5  Generation of Waves in a Stretched String
14.6  Radiation of Sound from a Pulsating Sphere
14.7  The Retarded Potential
14.8  Traveling Waves in Nonhomogeneous Media
14.9  Scattering Amplitudes and Phase Shifts
14.10  Scattering in Three Dimensions. Partial Wave Analysis
Chapter 15  Perturbation Methods
15.1  Introduction
15.2  The Born Approximation
15.3  Perturbation of Eigenvalue Problems
15.4  First-Order Rayleigh-Schrödinger Theory
15.5  The Second-Order Nondegenerate Theory
15.6  The Case of Degenerate Eigenvalues
Chapter 16  Tensors
16.1  Introduction
16.2  Two-Dimensional Stresses
16.3  Cartesian Tensors
16.4  Algebra of Cartesian Tensors
16.5  Kronecker and Levi-Civita Tensors. Pseudotensors
16.6  Derivatives of Tensors. Strain Tensor and Hooke’s Law
16.7  Tensors in Skew Cartesian Frames. Covariant and Contravariant Representations
16.8  General Tensors
16.9  Algebra of General Tensors. Relative Tensors
16.10  The Covariant Derivative
16.11  Calculus of General Tensors
CHAPTER 1
VECTORS, MATRICES, AND COORDINATES
1.1 INTRODUCTION
To be able to follow this text without undue difficulties, the reader is expected to
have adequate preparation in mathematics and physics. This involves a good
working knowledge of advanced calculus, a basic course in differential equations,
and a basic course in undergraduate algebra. Rudimentary knowledge of complex
numbers, matrices, and Fourier series is very desirable but not indispensable.
As for the subjects in physics, the reader should have completed the standard
undergraduate training in mechanics, thermodynamics, electromagnetism, and
atomic physics.
Despite these prerequisites, a need is often recognized for reviewing some of
the preparatory material at the beginning of a text. Let us follow this custom and
devote some time to the subject of vector analysis which has a bearing, in more
than one way, on the material developed in this text. Of course, such a review
must be brief and we must omit all the details, in particular those involving
mathematical proofs. The reader is referred to standard textbooks in advanced
calculus and vector analysis* for a full discussion. On the other hand, we hope to
draw attention to some interesting points not always emphasized in commonly
used texts.
1.2 VECTORS IN CARTESIAN COORDINATE SYSTEMS
In many elementary textbooks a vector is defined as a quantity characterized by
magnitude and direction. We shall see in Chapter 10 that vectors are much more
general than this, but it is fair to say that the concept of vectors was first intro-
duced into mathematics (by physicists) to represent “quantities with direction,”
e.g., displacement, velocity, force, etc. Doubtless, they are the simplest and most
familiar kinds of vectors.
As we well know, quantities with direction can be graphically represented by
arrows and are subject to two basic operations:
a) multiplication by a scalar,†   b) addition.
These operations are illustrated in Fig. 1.1.
* For example, A. E. Taylor, Advanced Calculus; T. M. Apostol, Mathematical Analysis;
W. Kaplan, Advanced Calculus.
† Until we are ready to discuss complex vectors (Chapter 10) we shall assume that scalars
are real numbers.
In many cases we can plot various vectors from a single point, the origin.
Then each vector can be characterized by the coordinates of its “tip.” Various
coordinate systems are possible but the cartesian coordinate systems are the most
convenient. The reason is very simple and very deep: The cartesian coordinates
of a point can serve as the components of the corresponding vector at the same time.
This is illustrated in Fig. 1.2, where orthogonal cartesian systems, in plane and in
space, are selected. Note that the three-dimensional system is “right-handed”;*
in general, we shall use right-handed systems in this book.
Figure 1.1
Figure 1.2
We can now associate with a vector $\mathbf{u}$ (in space) a set of three scalars $(u_x, u_y, u_z)$,
such that $\lambda\mathbf{u}$ will correspond to $(\lambda u_x, \lambda u_y, \lambda u_z)$ and $\mathbf{u} + \mathbf{v}$ will correspond to
$(u_x + v_x, u_y + v_y, u_z + v_z)$. Note that no such relations hold, in general, if a
vector is characterized by other types of coordinates, e.g., spherical or cylindrical.
In addition, orthogonal cartesian coordinates result in very simple formulas
for other common quantities associated with vectors, such as
a) length (magnitude) of a vector:
$$|\mathbf{u}| = u = (u_x^2 + u_y^2 + u_z^2)^{1/2},$$
b) projections of a vector on coordinate axes:†
$$u_x = u\cos(\mathbf{u}, \mathbf{i}), \qquad u_y = u\cos(\mathbf{u}, \mathbf{j}), \qquad u_z = u\cos(\mathbf{u}, \mathbf{k}),$$
* Rotation of the x-axis by 90° to coincide with the y-axis appears counterclockwise for
all observers with z > 0.
† Standard notation is used: The symbols $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are unit vectors in x-, y-, and z-directions,
respectively. The symbol $(\mathbf{u}, \mathbf{v})$ stands for the angle between the directions given by
$\mathbf{u}$ and $\mathbf{v}$.
c) projection of a vector on an arbitrary direction defined by vector $\mathbf{s}$ (Fig. 1.3):
$$OP = u_s = u\cos(\mathbf{u}, \mathbf{s}) = u_x\cos(\mathbf{s}, \mathbf{i}) + u_y\cos(\mathbf{s}, \mathbf{j}) + u_z\cos(\mathbf{s}, \mathbf{k}),$$
d) scalar product (dot product) of two vectors:
$$(\mathbf{u}\cdot\mathbf{v}) = uv\cos(\mathbf{u}, \mathbf{v}) = u_xv_x + u_yv_y + u_zv_z,$$
e) vector product (cross product):
$$[\mathbf{u}\times\mathbf{v}] = (u_yv_z - u_zv_y)\,\mathbf{i} + (u_zv_x - u_xv_z)\,\mathbf{j} + (u_xv_y - u_yv_x)\,\mathbf{k}.$$
Figure 1.3
The important distinctive feature of the cross product is that $[\mathbf{u}\times\mathbf{v}] \neq [\mathbf{v}\times\mathbf{u}]$,
namely, it is not commutative; rather, it is anticommutative:
$$[\mathbf{u}\times\mathbf{v}] = -[\mathbf{v}\times\mathbf{u}].$$
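The component formulas for the scalar and vector products can be checked numerically. The following is a minimal sketch in plain Python; the function names (`dot`, `cross`, `length`) and the sample vectors are illustrative assumptions, not notation from the text.

```python
import math

def dot(u, v):
    """Scalar product (u . v) from orthogonal cartesian components."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Vector product [u x v] from components in a right-handed system."""
    ux, uy, uz = u
    vx, vy, vz = v
    return (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)

def length(u):
    """Magnitude |u| = (ux^2 + uy^2 + uz^2)^(1/2)."""
    return math.sqrt(dot(u, u))

# Sample vectors (arbitrary illustrative values).
u = (1.0, 2.0, 3.0)
v = (4.0, 5.0, 6.0)

uxv = cross(u, v)   # (-3.0, 6.0, -3.0)
vxu = cross(v, u)   # (3.0, -6.0, 3.0): anticommutativity
```

Here `uxv` and `vxu` differ only in sign, and `dot(uxv, u)` vanishes, as expected for a vector perpendicular to the parallelogram formed by the two vectors.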
Remark. Apart from its important physical applications, the cross product of two
vectors leads us to the concept of “oriented area.” The magnitude of $[\mathbf{u}\times\mathbf{v}]$, namely
$uv\,|\sin(\mathbf{u}, \mathbf{v})|$, is equal to the area of the parallelogram formed by $\mathbf{u}$ and $\mathbf{v}$. The direction
of $[\mathbf{u}\times\mathbf{v}]$ can serve to distinguish the “positive side” of the parallelogram from its
“negative side.” Figure 1.4 shows two views of the same parallelogram illustrating
this idea.
Figure 1.4 (two views of the parallelogram with the vector [u × v]: positive side and negative side)
Closely related to this property is the concept of a right-handed triple of
vectors. Any three vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$, taken in this order, are said to form a
right-handed (or positive) triple if the so-called triple product
$$([\mathbf{u}\times\mathbf{v}]\cdot\mathbf{w})$$
is positive.* This happens if $\mathbf{w}$ is on the same side of the plane defined by the
vectors $\mathbf{u}$ and $\mathbf{v}$ as the vector $[\mathbf{u}\times\mathbf{v}]$, as illustrated in Fig. 1.5. It is not hard to
verify that $([\mathbf{u}\times\mathbf{v}]\cdot\mathbf{w})$ represents, in this case, the volume $V$ of the parallelepiped
formed by the vectors $\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$.
* These vectors form a left-handed (negative) triple if $([\mathbf{u}\times\mathbf{v}]\cdot\mathbf{w}) < 0$.
Show that
$$V = |([\mathbf{u}\times\mathbf{v}]\cdot\mathbf{w})|$$
under any circumstances. Show also that the sign of the
triple product is unchanged under cyclic permutation of
$\mathbf{u}$, $\mathbf{v}$, and $\mathbf{w}$, that is,
$$([\mathbf{u}\times\mathbf{v}]\cdot\mathbf{w}) = ([\mathbf{w}\times\mathbf{u}]\cdot\mathbf{v}) = ([\mathbf{v}\times\mathbf{w}]\cdot\mathbf{u}).$$
Figure 1.5
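The two statements about the triple product can be illustrated numerically before they are proved. The sketch below uses plain Python; the helper names and the three sample vectors are assumptions chosen for illustration only.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    ux, uy, uz = u
    vx, vy, vz = v
    return (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)

def triple(u, v, w):
    """Triple product ([u x v] . w)."""
    return dot(cross(u, v), w)

# Sample right-handed triple (integer components keep the arithmetic exact).
u = (1, 0, 0)
v = (1, 2, 0)
w = (1, 1, 3)

cyclic = [triple(u, v, w), triple(w, u, v), triple(v, w, u)]   # all equal
swapped = triple(v, u, w)   # interchanging two vectors reverses the sign
volume = abs(triple(u, v, w))
```

For these vectors the three cyclic arrangements give the same value, while the interchange of two vectors gives a left-handed triple with the opposite sign; the volume is the absolute value in either case.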
1.3 CHANGES OF AXES. ROTATION MATRICES
We have seen that a given vector $\mathbf{u}$ is associated with a set of three numbers, namely
its components,* with respect to some orthogonal cartesian system. However, it
is clear that if the system of axes is changed, the components change as well. Let
us study these changes.
Consider, for vectors in a plane, a change in the system of axes which is produced
by a rotation by the angle $\theta$, as illustrated in Fig. 1.6. The old system is
$(x, y)$ and the new system is $(x', y')$. Since $\mathbf{u} = u_x\mathbf{i} + u_y\mathbf{j}$, the $x'$-component of $\mathbf{u}$
is the sum of projections of vectors $u_x\mathbf{i}$ and $u_y\mathbf{j}$ on the $x'$-axis, and similarly for
the $y'$-component. From the diagram we see that this yields
$$u'_x = u_x\cos\theta + u_y\sin\theta, \qquad u'_y = -u_x\sin\theta + u_y\cos\theta.$$
It is instructive to note that the angle between the $x'$- and $y$-axes is $(\pi/2 - \theta)$
while the angle between the $y'$- and $x$-axes is $(\pi/2 + \theta)$. In view of
$$\sin\theta = \cos\left(\frac{\pi}{2} - \theta\right) \qquad\text{and}\qquad -\sin\theta = \cos\left(\frac{\pi}{2} + \theta\right),$$
we see that all four coefficients in the above equations represent cosines of the
angles between the respective axes.
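The plane transformation can be tried out numerically. The following sketch, in plain Python, computes the new components both from the sine-cosine formulas and from the four cosines of the angles between the axes; the function name `rotate_axes_2d` and the sample values are illustrative assumptions.

```python
import math

def rotate_axes_2d(u, theta):
    """Components of the plane vector u = (ux, uy) in axes rotated by theta."""
    ux, uy = u
    return (ux * math.cos(theta) + uy * math.sin(theta),
            -ux * math.sin(theta) + uy * math.cos(theta))

theta = math.radians(30.0)   # sample rotation angle
u = (3.0, 4.0)               # sample vector of length 5
u_new = rotate_axes_2d(u, theta)

# The same four coefficients written as cosines of the angles between axes:
# (x', x) = theta, (x', y) = pi/2 - theta, (y', x) = pi/2 + theta, (y', y) = theta.
cosines = [
    [math.cos(theta), math.cos(math.pi / 2 - theta)],
    [math.cos(math.pi / 2 + theta), math.cos(theta)],
]
u_from_cosines = tuple(row[0] * u[0] + row[1] * u[1] for row in cosines)
```

Both routes give the same components up to rounding, and the length of the vector is the same in the old and the new axes.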
Let us now turn to the three-dimensional case. Figure 1.7 represents two
orthogonal cartesian systems, both right-handed, centered at O. It is intuitively
clear that the primed system can be obtained from the unprimed one by the mo-
tion of a “rigid body about a fixed point.” In fact, it is shown in almost any text-
book on mechanics that such a motion can be reduced to a rotation about some
axis (Euler’s theorem).†
* Instead of “components,” the term “coordinates of a vector” is often used (see also
Section 10.3).
† For example, Goldstein, Classical Mechanics, Section 4.6.
Write
$$\mathbf{u} = u_x\mathbf{i} + u_y\mathbf{j} + u_z\mathbf{k},$$
collect contributions to $u'_x$ from the three vectors $u_x\mathbf{i}$, $u_y\mathbf{j}$, and $u_z\mathbf{k}$, and obtain
$$u'_x = u_x\cos(\mathbf{i}', \mathbf{i}) + u_y\cos(\mathbf{i}', \mathbf{j}) + u_z\cos(\mathbf{i}', \mathbf{k}),$$
where $\mathbf{i}'$ is, of course, the unit vector in the $x'$-direction. Note that the cosines
involved are the directional cosines of the $x'$-direction with respect to the unprimed
system or, for that matter, the dot products of $\mathbf{i}'$ with $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$.
Figure 1.7
It is clear that similar formulas can be written for $u'_y$ and $u'_z$. At this stage,
however, it is very convenient to switch to a different notation: Instead of writing
$(u_x, u_y, u_z)$, let us write $(u_1, u_2, u_3)$ and similarly $(u'_1, u'_2, u'_3)$ for $(u'_x, u'_y, u'_z)$. Moreover,
denote by $\alpha_{mn}$ the angle between the $m$th primed axis and the $n$th unprimed
axis (three such angles are marked on Fig. 1.6) and by $a_{mn}$ the corresponding
cosine (that is, $a_{mn} = \cos\alpha_{mn}$). This new notation permits us to write the
transformation formulas in an easily memorized pattern:
$$u'_1 = a_{11}u_1 + a_{12}u_2 + a_{13}u_3,$$
$$u'_2 = a_{21}u_1 + a_{22}u_2 + a_{23}u_3,$$
$$u'_3 = a_{31}u_1 + a_{32}u_2 + a_{33}u_3,$$
or, if desired, in the compact form
$$u'_m = \sum_{n=1}^{3} a_{mn}u_n \qquad (m = 1, 2, 3).$$
From this analysis we conclude that the new components $(u'_1, u'_2, u'_3)$ can be
obtained from the old components $(u_1, u_2, u_3)$ with the help of nine coefficients.
These nine coefficients, arranged in the self-explanatory pattern below, are said
to form a matrix.* We shall denote matrices by capital letters.
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
(The columns are counted 1st, 2nd, 3rd from left to right; the rows are counted 1st, 2nd, 3rd from top to bottom.)
Matrix $A$ has three rows and three columns; the individual coefficients $a_{mn}$ are
referred to as matrix elements, or entries. It is customary to use the first subscript
($m$ in our case) to label the row and the second one to label the column; thus the
matrix element $a_{kl}$ should be located at the intersection of the $k$th row with the
$l$th column.
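The nine direction cosines can be assembled into a matrix directly from their definition as dot products of unit vectors. The sketch below assumes, purely for illustration, that the primed axes are obtained by turning the axes through 30° about the z-axis; note that Python indices start at 0, so `A[m][n]` holds the element with subscripts m + 1 and n + 1.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Assumed example: primed unit vectors i', j', k' given by their components
# in the unprimed system (axes turned through 30 degrees about the z-axis).
theta = math.radians(30.0)
c, s = math.cos(theta), math.sin(theta)
primed_axes = [(c, s, 0.0), (-s, c, 0.0), (0.0, 0.0, 1.0)]
unprimed_axes = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]

# Each entry is the cosine of the angle between a primed and an unprimed axis,
# i.e. the dot product of the corresponding unit vectors.
# First index labels the row (primed axis), second the column (unprimed axis).
A = [[dot(p, e) for e in unprimed_axes] for p in primed_axes]

# New components from the transformation formulas u'_m = sum_n a_mn u_n.
u = (1.0, 2.0, 3.0)
u_new = [sum(A[m][n] * u[n] for n in range(3)) for m in range(3)]
```

Each new component equals the projection of the vector on the corresponding primed axis, which is the content of the transformation formulas.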
The set of elements in a given row is very often called a row vector and the
set of elements in a column, a column vector. This nomenclature is justified by the
fact that any three numbers can be treated as components of some vector in space.
However, at this stage it is worthwhile to make a digression and establish a geo-
metric interpretation for the column vectors of A. The reader should pay par-
ticular attention to the argument because of its general significance.
Let us imagine a unit vector u. Suppose the unprimed system was oriented
in such a way that u was along the x-axis. Then the components of u are (1, 0, 0)
and u actually coincides with the vector i. If the coordinate system is now rotated,
the new components of $\mathbf{u}$ are given by
$$u'_1 = a_{11}\cdot 1 + a_{12}\cdot 0 + a_{13}\cdot 0 = a_{11},$$
$$u'_2 = a_{21}\cdot 1 + a_{22}\cdot 0 + a_{23}\cdot 0 = a_{21},$$
$$u'_3 = a_{31}\cdot 1 + a_{32}\cdot 0 + a_{33}\cdot 0 = a_{31}.$$
We see that the first column vector of matrix $A$ is composed of the new components
of vector $\mathbf{u}$. In other words, we can say that the new components of $\mathbf{i}$ are
$(a_{11}, a_{21}, a_{31})$ and we can write
$$\mathbf{i} = a_{11}\mathbf{i}' + a_{21}\mathbf{j}' + a_{31}\mathbf{k}'.$$
Similar statements relate $\mathbf{j}$ and $\mathbf{k}$ to the second and third column vectors of $A$.
Note that in this discussion the unit vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ assume a role independent of
their respective coordinate axes. The axes are rotated but the vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ stay
in place and are then referred to the rotated system of axes.
* More precisely, a 3 × 3 matrix is formed. The reader can easily construct an analogous
2 × 2 matrix to account for two-dimensional rotations.
Exercise. Establish the geometrical meaning of the row vectors of matrix A representing
a rotation.
The definitions introduced above allow the computation of $(u'_1, u'_2, u'_3)$ from
$(u_1, u_2, u_3)$ by the following rule: To obtain $u'_k$, take the dot product of the $k$th
row of matrix $A$ with the vector $\mathbf{u}$, as given by the triple $(u_1, u_2, u_3)$. Since in this
process each row of the matrix is “dotted” with $(u_1, u_2, u_3)$, we may regard it
as some kind of multiplication of a vector by a matrix. In fact, this operation is
commonly known as vector-matrix multiplication and is visually exhibited as shown:
$$\underbrace{\begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}}_{\text{Matrix } A} \cdot \underbrace{\begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}}_{\text{Column vector } u} = \underbrace{\begin{pmatrix} u'_1 \\ u'_2 \\ u'_3 \end{pmatrix}}_{\text{Column vector } u'}$$
As we see, the old components are arranged in a column which we shall denote
by $u$. This column vector is multiplied by the matrix $A$ and this results in another
column vector, denoted by $u'$. The multiplication means, of course: Form the
dot product of the first row of $A$ with $u$ for the first component of $u'$; then form the
dot product of the second row of $A$ with $u$ to get $u'_2$, and similarly for $u'_3$. The entire
procedure is symbolically written as
$$Au = u'.$$
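The row-by-row rule translates directly into a few lines of code. The following is a minimal sketch in plain Python; the function name `matvec` and the sample matrix (axes turned through 90° about the z-axis) are illustrative assumptions.

```python
def matvec(A, u):
    """Multiply matrix A by column u: dot each row of A with u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

# Assumed example: rotation of the axes by 90 degrees about the z-axis,
# so that the new x'-axis points along the old y-axis.
A = [[0, 1, 0],
     [-1, 0, 0],
     [0, 0, 1]]

u = [1, 2, 3]
u_new = matvec(A, u)          # [2, -1, 3]

# Applying A to the components (1, 0, 0) of the vector i picks out the first column of A.
i_new = matvec(A, [1, 0, 0])  # [0, -1, 0]
first_column = [row[0] for row in A]
```

The second computation repeats, for a concrete matrix, the argument that the columns of a rotation matrix are the new components of the old unit vectors.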
Remark. Note that the set $(u_1, u_2, u_3)$, arranged in a column, has not been denoted simply
by $\mathbf{u}$ but rather by a new symbol $u$.* The point is that in the context of our problem,
both $u$ and $u'$ represent the same vector $\mathbf{u}$, but with respect to different systems of axes.
We must think of $u$ and $u'$ as two different representations of $\mathbf{u}$, and the matrix $A$ shows
us how to switch from one representation to another.
Before we discuss further topics involving matrices, let us record the fact
that our matrix A is not just a collection of nine arbitrary scalars. Its matrix ele-
ments are interdependent and possess the following properties.
a) The columns of $A$ are orthogonal to each other, namely,
$$a_{11}a_{12} + a_{21}a_{22} + a_{31}a_{32} = 0,$$
$$a_{12}a_{13} + a_{22}a_{23} + a_{32}a_{33} = 0,$$
$$a_{13}a_{11} + a_{23}a_{21} + a_{33}a_{31} = 0.$$
This property follows from the fact that the columns of $A$ are representations
(in the new system) of the vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ and these vectors are mutually
orthogonal.
* The symbol $u$ should not be confused with $|\mathbf{u}|$, the magnitude of vector $\mathbf{u}$, which is also
denoted by $u$ (p. 2).
b) The columns of $A$ have unit magnitude, namely,
$$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1,$$
$$a_{12}^2 + a_{22}^2 + a_{32}^2 = 1,$$
$$a_{13}^2 + a_{23}^2 + a_{33}^2 = 1,$$
because $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are unit vectors.
c) The rows of $A$ are also mutually orthogonal and have unit magnitude. This
is verified by establishing the geometrical meaning of the row vectors of $A$.*
Matrices satisfying these three properties are called orthogonal. We may then
conclude that the matrices representing rotations of orthogonal cartesian systems
in space are orthogonal matrices.
Remark. There are orthogonal matrices which do not represent rotations. Rotation
matrices have an additional property: their determinant (i.e., the determinant of the
equations on p. 5) is equal to +1. Otherwise, orthogonal matrices can have the determinant
equal to −1. The point is that a rotation must yield a right-handed triple
$(\mathbf{i}', \mathbf{j}', \mathbf{k}')$ of unit vectors since $(\mathbf{i}, \mathbf{j}, \mathbf{k})$ is a right-handed triple.†
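Properties (a) and (b), together with the determinant criterion of the Remark, are easy to test for a given matrix. The sketch below is plain Python; the function names and the two sample matrices (a 90° rotation about the z-axis and a reflection in the xy-plane) are illustrative assumptions.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def column(A, n):
    return [row[n] for row in A]

def is_orthogonal(A, tol=1e-12):
    """Check properties (a) and (b): columns mutually orthogonal, unit magnitude."""
    size = len(A)
    for p in range(size):
        for q in range(size):
            expected = 1.0 if p == q else 0.0
            if abs(dot(column(A, p), column(A, q)) - expected) > tol:
                return False
    return True

def det3(A):
    """Determinant of a 3 x 3 matrix, expanded along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = A
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

rotation = [[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
reflection = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -1.0]]

checks = {
    "rotation": (is_orthogonal(rotation), det3(rotation)),        # (True, 1.0)
    "reflection": (is_orthogonal(reflection), det3(reflection)),  # (True, -1.0)
}
```

Both sample matrices are orthogonal, but only the first has determinant +1; the reflection turns a right-handed triple into a left-handed one.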
1.4 REPEATED ROTATIONS. MATRIX MULTIPLICATION
The matrix notation introduced in the preceding sections is particularly useful
when we are faced with repeated changes of coordinate axes. In addition to the
primed and unprimed systems related by a matrix $A$, let there be a third double-primed
system of axes $(x'', y'', z'')$ and let it be related to the primed system
by a matrix $B$.
Evidently, the system $(x'', y'', z'')$ can be related directly to the system $(x, y, z)$
through some matrix $C$ and our task is to evaluate the matrix elements $c_{mn}$ in
terms of matrix elements $a_{mn}$ and $b_{mn}$. We have
$$u''_1 = b_{11}u'_1 + b_{12}u'_2 + b_{13}u'_3,$$
$$u''_2 = b_{21}u'_1 + b_{22}u'_2 + b_{23}u'_3,$$
$$u''_3 = b_{31}u'_1 + b_{32}u'_2 + b_{33}u'_3,$$
and
$$u'_1 = a_{11}u_1 + a_{12}u_2 + a_{13}u_3,$$
$$u'_2 = a_{21}u_1 + a_{22}u_2 + a_{23}u_3,$$
$$u'_3 = a_{31}u_1 + a_{32}u_2 + a_{33}u_3,$$
* It will be shown in Chapter 10 that any N × N matrix which satisfies (a) and (b) must
also satisfy (c).
† See Problem 4 at the end of this chapter.
from which it follows that
$$u''_1 = (b_{11}a_{11} + b_{12}a_{21} + b_{13}a_{31})u_1 + (b_{11}a_{12} + b_{12}a_{22} + b_{13}a_{32})u_2 + (b_{11}a_{13} + b_{12}a_{23} + b_{13}a_{33})u_3,$$
$$u''_2 = (b_{21}a_{11} + b_{22}a_{21} + b_{23}a_{31})u_1 + (b_{21}a_{12} + b_{22}a_{22} + b_{23}a_{32})u_2 + (b_{21}a_{13} + b_{22}a_{23} + b_{23}a_{33})u_3,$$
$$u''_3 = (b_{31}a_{11} + b_{32}a_{21} + b_{33}a_{31})u_1 + (b_{31}a_{12} + b_{32}a_{22} + b_{33}a_{32})u_2 + (b_{31}a_{13} + b_{32}a_{23} + b_{33}a_{33})u_3.$$
The maze of matrix elements above becomes quite manageable if we observe
that every coefficient associated with $u_n$ is a dot product of some row of matrix $B$
and some column of matrix $A$. A closer look at these relationships leads us to
the following statement: The element $c_{mn}$ of matrix $C$ is obtained by taking the
dot product of the $m$th row of matrix $B$ with the $n$th column of matrix $A$.
Now, if we record our relationship in the vector-matrix symbolic notation
$$u'' = Bu', \qquad u' = Au,$$
then we are naturally led to the relation
$$u'' = Cu = B(Au).$$
It seems reasonable now to define the product of two matrices, like $B$ and $A$,
to be equal to a third matrix, say $C$, so that the above relationship may also
be written as*
$$u'' = Cu = (BA)u.$$
In this sense we write
$$\begin{pmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \cdot \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$
or, symbolically,
$$C = BA,$$
given that the matrix elements of $C$ are defined by the rule quoted above.
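The row-by-column rule for the product matrix can be written out in a few lines and compared with two successive vector-matrix multiplications. The sketch below is plain Python; the function names and the two sample rotations (90° about the z-axis, then 90° about the new x-axis) are illustrative assumptions.

```python
def matvec(A, u):
    """Dot each row of A with the column u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def matmul(B, A):
    """Product BA: element (m, n) is the m-th row of B dotted with the n-th column of A."""
    return [[sum(B[m][l] * A[l][n] for l in range(len(A)))
             for n in range(len(A[0]))]
            for m in range(len(B))]

# Assumed sample rotations of the axes.
A = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]   # first rotation: unprimed -> primed
B = [[1, 0, 0], [0, 0, 1], [0, -1, 0]]   # second rotation: primed -> double-primed

C = matmul(B, A)                 # [[0, 1, 0], [0, 0, 1], [1, 0, 0]]

u = [1, 2, 3]
two_steps = matvec(B, matvec(A, u))   # B(Au)
one_step = matvec(C, u)               # (BA)u
```

Both routes lead to the same double-primed components, here `[2, 3, 1]`.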
Having introduced the notion of matrix multiplication, we are naturally
interested in determining whether it has the same properties as the multiplication
of ordinary numbers (scalars). A simple check shows that the associative law
holds: If we multiply three matrices A, B, and C in that order, then this can be
done in two ways:
ABC = (AB)C = A(BC)
(where it is understood that the operation in parentheses is performed first).
* The difference is, of course, that in $B(Au)$ the column vector $u$ is first multiplied by $A$,
producing another column vector which is, in turn, multiplied by $B$. In $(BA)u$ the matrices
are being multiplied first, resulting in a new matrix which acts on $u$.
Exercise. Verify this statement. [Hint: If $AB = D$, then the elements of $D$ are given by
$$d_{mn} = \sum_{l=1}^{3} a_{ml}b_{ln}.$$
Develop the matrix elements of $(AB)C$ and $A(BC)$ in this fashion and verify that they
are the same.]
On the other hand, matrix multiplication is not commutative:
$$AB \neq BA,$$
and, furthermore, there is no simple relation, in general, between $AB$ and $BA$.
division.”* However, it is possible to talk about the inverse of a matrix and this
concept arises naturally in our discussion of rotations. Indeed, if we rotate our
orthogonal cartesian system of axes, the new coordinates of vector u are obtained
from the vector-matrix equation
$$u' = Au.$$
Suppose now that we rotate the axes back to their original position. The new
components of vector u are given by $(u_1, u_2, u_3)$ and the old ones are given by
$(u_1', u_2', u_3')$; these components must be related by some matrix B:
$$u = Bu'.$$
Combining these relations we obtain
$$u = B(Au) = (BA)u.$$
Therefore, the matrix BA must transform the components $(u_1, u_2, u_3)$ into
themselves. It is easy to see that this task is accomplished by the so-called unit matrix
(“identity matrix”)
$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
Exercise. Show that if u = (BA)u is to hold for an arbitrary vector u, then BA must
necessarily be of the above form. In other words, the identity matrix is unique.
It is now customary to call B the inverse of matrix A and to denote it by the symbol
$A^{-1}$, so that we have $A^{-1}A = I$. Since we could have performed our rotations in
reverse order, it is not hard to see that AB = I as well, that is, $A^{-1}A = AA^{-1}$
and $A = B^{-1}$. While two rotation matrices may not commute, a rotation matrix
always commutes with its inverse.†
* If we write A/B = X the question would arise whether we imply A = BX or A = XB.
† See Section 10.4 for a general statement to that effect.
It may now be of interest to relate the elements of matrix B to those of matrix
A. To obtain $b_{mn}$, we should, in principle, solve the equations on p. 5 for
$u_1, u_2, u_3$. However, in the case of rotations, we have a much simpler method
at our disposal. Let us write the matrix equation BA = I in detail:
$$\begin{pmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.$$
To get the first row of I we must take the dot products of the first row vector of B
with each of the column vectors of A. However, we know that the latter are just
the vectors i, j, k in new representation. We see that the first row vector of B is
orthogonal to j and k and its dot product with i is unity. Consequently, it could
be nothing else but the vector i (in new representation, of course), and we conclude
that $b_{11} = a_{11}$, $b_{12} = a_{21}$, and $b_{13} = a_{31}$.
Repeat this argument for the other rows of B and deduce that the rows of B
are the columns of A and vice versa. This is also expressed by the formula
$$b_{mn} = a_{nm}.$$
Any two matrices A and B satisfying these conditions are called transposes of
each other and are denoted by $B = A^T$ and $A = B^T$. While it is not, in general,
true that the inverse and transpose of a matrix are identical, this rule holds for
rotation matrices and is very useful.
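The statements of this section can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names (`matmul`, `transpose`, `rot_z`, `rot_x`, `close`) and the sign convention chosen for the rotation matrices are assumptions made for illustration. It checks that the transpose of a rotation matrix acts as its inverse on either side, that matrix multiplication is associative, and that two rotations about different axes need not commute.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Interchange rows and columns: the (m, n) element becomes a[n][m]."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def rot_z(angle):
    """Rotation of the axes about z through `angle` radians (assumed sign convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(angle):
    """Rotation of the axes about x through `angle` radians (assumed sign convention)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, s], [0.0, -s, c]]

def close(a, b, tol=1e-12):
    """True if two 3x3 matrices agree element by element within `tol`."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(3) for j in range(3))

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

A = rot_z(0.3)
B = transpose(A)          # candidate for the inverse of A
R = rot_x(0.7)

# The transpose of a rotation matrix is its inverse, on either side.
inverse_is_transpose = close(matmul(B, A), IDENTITY) and close(matmul(A, B), IDENTITY)

# Associative law: (AR)B = A(RB).
associative = close(matmul(matmul(A, R), B), matmul(A, matmul(R, B)))

# Rotations about different axes do not commute in general.
rotations_commute = close(matmul(A, R), matmul(R, A))

print(inverse_is_transpose, associative, rotations_commute)
```

Plain nested lists are used instead of a numerical library so that the row-times-column rule stays visible in `matmul`.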
1.5 SKEW CARTESIAN SYSTEMS. MATRICES IN GENERAL
If the coordinate axes in a cartesian system form angles other than 90°, we have
a skew cartesian system. Figure 1.8 shows two such systems, one in plane and
one in space, along with the decomposition of a vector into its respective
components.
Figure 1.8
Skew systems are specified by the angles between the axes (one in plane,
three in space) which may vary between 0° and 180°. As before, i, j, and k will
be used to denote unit vectors in the direction of axes. Note that we can still talk
about right-handed systems in space, where the vectors i, j, k (in that order!) form
a right-handed triple.
The vectors are added and multiplied by scalars according to the same rules
that were stated before. However, the length of a vector is now given by a different
formula. For instance, for a plane vector we have from Fig. 1.8(a), by the cosine
theorem,
$$|u|^2 = u_x^2 + u_y^2 - 2u_x u_y \cos(\pi - \phi) = u_x^2 + u_y^2 + 2u_x u_y \cos\phi,$$
so that
$$|u| = \sqrt{u_x^2 + u_y^2 + 2u_x u_y \cos\phi}.$$
In general, the dot product is no longer given by the sum of the products of com-
ponents, but by a more complicated formula. As a matter of fact we shall even
introduce a new name for it and call it the inner product of two vectors which is
then defined by
$$(u \cdot v) = |u| \cdot |v| \cdot \cos(u, v).$$
The reason is that we would like to retain the name dot product to mean the sum
of the products of components of two vectors, regardless of whether the axes
are orthogonal or skew.*
The derivation of a formula for inner product in a skew system is greatly
facilitated by its distributive property, namely,
$$(u \cdot (v + w)) = (u \cdot v) + (u \cdot w).$$
Indeed, from Fig. 1.9, no matter which coordinate system is used, we see that
$$(u \cdot (v + w)) = |u| \cdot |v + w| \cos(u, v + w) = |u| \cdot (MN).$$
However, MN = MP + PN; since MP is the projection of v on u, we have
$|u| \cdot (MP) = |u| \cdot |v| \cdot \cos(u, v)$, and similarly for PN, establishing the result.
Note that this argument is also valid for vectors in space.
Now we can write† for two vectors in a plane $u = u_x i + u_y j$ and $v = v_x i + v_y j$:
$$(u \cdot v) = (u_x i \cdot v_x i) + (u_x i \cdot v_y j) + (u_y j \cdot v_x i) + (u_y j \cdot v_y j)$$
$$= u_x v_x + u_x v_y \cos\phi + u_y v_x \cos\phi + u_y v_y.$$
Note that this formula reduces to the usual dot product when $\phi = 90^\circ$.
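The formula for the inner product in a skew plane system is easy to try out numerically. The sketch below is an illustration only; the function names `skew_inner` and `skew_length` are hypothetical, and the angle between the axes is passed in radians. It evaluates the inner product for orthogonal axes, where it should coincide with the ordinary dot product, and the length of a vector with unit components along axes that meet at 60°.

```python
import math

def skew_inner(u, v, phi):
    """Inner product of two plane vectors given by their skew components.

    u, v : pairs (x-component, y-component) in the skew system
    phi  : angle between the coordinate axes, in radians
    """
    ux, uy = u
    vx, vy = v
    return ux * vx + (ux * vy + uy * vx) * math.cos(phi) + uy * vy

def skew_length(u, phi):
    """Length of a vector, obtained as the square root of its inner product with itself."""
    return math.sqrt(skew_inner(u, u, phi))

# Orthogonal axes: the inner product reduces to the sum of products of components.
orthogonal = skew_inner((1.0, 2.0), (3.0, 4.0), math.pi / 2)

# Axes at 60 degrees: unit components along both axes give length sqrt(3).
length_60 = skew_length((1.0, 1.0), math.pi / 3)

print(orthogonal, length_60)
```

Setting `phi` to other values shows how the same pair of components describes vectors of different length in differently skewed systems.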
* With this distinction, the dot product becomes an algebraic concept (referring to the
components) while inner product is a geometrical concept, independent of the coordinate
system. The two are identical provided the system is orthogonal.
† Since (u · v) = (v · u), the second distributive law ((u + v) · w) = (u · w) + (v · w)
is trivial.
There is no difficulty now in establishing other formulas for skew systems, in
plane or in space. We shall not go into these details but rather consider another
important question: the transformation of coordinates from an orthogonal to a
skew system of axes. Consider, for instance, Fig. 1.10; here a skew x’y’-system
with unit vectors i’ and j’ is superimposed on an orthogonal xy-system with unit
vectors i and j. A given vector u can be represented either as $u = u_x i + u_y j$ or as
$$u = u_x' i' + u_y' j'.$$
Figure 1.9   Figure 1.10
Although i’ = i, the components $u_x'$ and $u_x$ are not equal; rather, we have
$$u_x' = OQ' = OQ - Q'Q = u_x - u_y \tan\gamma.$$
Also,
$$u_y' = OP = u_y \sec\gamma.$$
We see that the new components $(u_x', u_y')$ are linearly related to the old components
$(u_x, u_y)$ and this relationship can be represented by means of vector-matrix
multiplication:
$$\begin{pmatrix} u_x' \\ u_y' \end{pmatrix} = \begin{pmatrix} 1 & -\tan\gamma \\ 0 & \sec\gamma \end{pmatrix} \begin{pmatrix} u_x \\ u_y \end{pmatrix}.$$
This can be written symbolically as
$$u' = Au,$$
where u’ stands for the column vector $(u_x', u_y')$, and u stands for the column vector
$(u_x, u_y)$. The obvious difference from the previous cases is that the matrix A
is no longer orthogonal. Its inverse $A^{-1}$ is readily calculated by solving for $u_x, u_y$
in terms of $u_x', u_y'$, and it reads
$$A^{-1} = \begin{pmatrix} 1 & \sin\gamma \\ 0 & \cos\gamma \end{pmatrix}.$$
Note that it is no longer the transpose of A.
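A short numerical check of this transformation may be helpful. The sketch below is not from the original text. It assumes that γ is the angle between the y’-axis and the y-axis, so that the skew axes meet at the angle π/2 − γ, and the helper names are hypothetical. It confirms that the quoted inverse really undoes A, that it differs from the transpose of A, and that the skew length formula applied to the new components reproduces the ordinary length of the vector.

```python
import math

def skew_matrices(gamma):
    """Return (A, A_inv) for the change from orthogonal to skew components.

    gamma is taken to be the angle (radians) between the y'-axis and the
    y-axis; this reading of the figure is an assumption.
    """
    A = [[1.0, -math.tan(gamma)], [0.0, 1.0 / math.cos(gamma)]]
    A_inv = [[1.0, math.sin(gamma)], [0.0, math.cos(gamma)]]
    return A, A_inv

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply2(m, u):
    """Vector-matrix multiplication m u for a 2x2 matrix and a 2-component column."""
    return [m[0][0] * u[0] + m[0][1] * u[1], m[1][0] * u[0] + m[1][1] * u[1]]

gamma = math.radians(30.0)
A, A_inv = skew_matrices(gamma)

product = matmul2(A, A_inv)            # should be the 2x2 identity
u_old = [2.0, 1.0]                     # components in the orthogonal system
u_new = apply2(A, u_old)               # components in the skew system

# Length from the skew formula, with the axes meeting at phi = pi/2 - gamma.
phi = math.pi / 2 - gamma
skew_len = math.sqrt(u_new[0] ** 2 + u_new[1] ** 2
                     + 2.0 * u_new[0] * u_new[1] * math.cos(phi))
orthogonal_len = math.hypot(u_old[0], u_old[1])

print(product, skew_len, orthogonal_len)
```

The agreement of `skew_len` and `orthogonal_len` illustrates that the length is a geometrical quantity, independent of the system in which the components are taken.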
It is still true that the columns of A are the old unit vectors in new representa-
tion (Fig. 1.11), that is,
$$i = 1 \cdot i' + 0 \cdot j', \qquad j = -\tan\gamma \cdot i' + \sec\gamma \cdot j'.$$
The rows of matrix A do not have a simple geometrical interpretation, but the
columns of matrix $A^{-1}$ do have one.*
From the above analysis we may conjecture that, in general, a change from
one set of unit vectors to another in a plane or in space involves a linear relationship
between the old and new components of a vector, expressible by a vector-matrix
multiplication u’ = Au (A is a 2 × 2 or a 3 × 3 matrix). We shall consider this
problem in detail in Chapter 10. For the time being we shall mention the fact that
not every matrix A can represent such a relationship. Consider, for instance, the
following hypothetical relation between the new and old coordinates of some
vector u:
$$u_x' = 4u_x - 2u_y, \qquad u_y' = 2u_x - u_y.$$
If we attempt to solve these two equations for $u_x$ and $u_y$ we find that they have no
solution if $u_x'$ and $u_y'$ are arbitrary. We say that the matrix
$$A = \begin{pmatrix} 4 & -2 \\ 2 & -1 \end{pmatrix}$$
does not possess an inverse; such matrices are called singular matrices. It is not
difficult to see that in this example the pair $u_x', u_y'$ cannot possibly represent
an arbitrary vector in plane: Our equations imply $u_x' = 2u_y'$ so that there is
only one independent component instead of the two required for a plane.
Figure 1.11
Remark. It is of interest to note that if $u_x' = 2u_y'$ is actually satisfied, our system of
equations has an infinity of solutions for $u_x$ and $u_y$ (since two equations reduce to a
single one). Furthermore, if an additional requirement $u_x = 2u_y$ is imposed, the system
has a unique solution ($u_x = \tfrac{1}{3}u_x'$; $u_y = \tfrac{1}{3}u_y'$). All these features should be remembered
since they are important in physical applications.
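The behavior of a singular matrix can be seen in a few lines of code. The sketch below is an illustration only. It assumes that the hypothetical relation reads $u_x' = 4u_x - 2u_y$, $u_y' = 2u_x - u_y$, which is the reading consistent with the stated consequence $u_x' = 2u_y'$; the helper names are not from the text. It shows that the determinant vanishes, that every image vector has its first component equal to twice the second, and that the extra requirement $u_x = 2u_y$ singles out one solution.

```python
def det2(m):
    """Determinant of a 2x2 matrix stored as nested lists of rows."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Assumed reading of the relation between new and old components.
A = [[4.0, -2.0], [2.0, -1.0]]

def transform(u):
    """New components (ux', uy') obtained from old components (ux, uy)."""
    return (A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1])

determinant = det2(A)     # zero: the matrix has no inverse

# Whatever the old components, the new ones satisfy ux' = 2 uy'.
images = [transform(u) for u in [(1.0, 0.0), (0.0, 1.0), (3.0, -5.0)]]

# With the additional requirement ux = 2 uy the solution is unique:
# ux = ux'/3 and uy = uy'/3.
ux_new, uy_new = transform((2.0, 1.0))
recovered = (ux_new / 3.0, uy_new / 3.0)

print(determinant, images, recovered)
```

A vanishing determinant is the usual practical test for a singular matrix; here it reflects the fact that the two rows of A are proportional.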
1.6 SCALAR AND VECTOR FIELDS
So far we have been discussing constant vectors, but we can also contemplate
vectors which depend on one or more variable parameters. The simplest example
is, perhaps, a position vector which depends on time t. In a fixed coordinate
system, this is equivalent to saying that its components are functions of time
* If A were orthogonal, the rows of A would be identical with columns of $A^{-1}$ (see pp. 11
and 440).
and we write
$$u = u(t) = u_x(t)\, i + u_y(t)\, j + u_z(t)\, k.$$
Such vectors can be differentiated with respect to the variable t according to the
definition
$$\frac{du}{dt} = \lim_{\Delta t \to 0} \frac{u(t + \Delta t) - u(t)}{\Delta t}.$$
With u(t) and u(t + Δt) expressed in terms of their components, it is trivial to
deduce that
$$\frac{du}{dt} = \frac{du_x}{dt}\, i + \frac{du_y}{dt}\, j + \frac{du_z}{dt}\, k,$$
so that the operation of differentiation of a vector is reduced to differentiation
of its components.
While vectors depending on time are widely used in mechanics of particles,
we shall be more interested in another type of variable vectors: those depending
on space coordinates (x, y, z).* Such vectors are said to form vector fields and
can be denoted as follows:
$$u = u(x, y, z) = u_x(x, y, z)\, i + u_y(x, y, z)\, j + u_z(x, y, z)\, k.$$
Common examples are electric and magnetic fields in space, velocity field of a
fluid in motion, and others.
The simplest kind of such a field is probably the so-called gradient field† which
can be derived from a single scalar function φ(x, y, z), usually referred to as a
scalar field. Familiar cases of scalar fields include the temperature distribution in
a solid body, density of a nonhomogeneous medium, electrostatic potential, etc.
A scalar field gives rise to numerous other quantities through its various
partial derivatives. In particular, let us concentrate our attention on
a) the total differential
$$d\varphi = \frac{\partial\varphi}{\partial x}\, dx + \frac{\partial\varphi}{\partial y}\, dy + \frac{\partial\varphi}{\partial z}\, dz,$$
and
b) the directional derivative‡
$$\frac{d\varphi}{ds} = \frac{\partial\varphi}{\partial x}\frac{dx}{ds} + \frac{\partial\varphi}{\partial y}\frac{dy}{ds} + \frac{\partial\varphi}{\partial z}\frac{dz}{ds}.$$
* These vectors may also depend on time, but we shall be mostly interested in
instantaneous relationships, where t has some fixed value.
† This is also called conservative field or potential field.
‡ Rate of change of φ per unit length in some particular direction characterized, say, by
the element of arc ds of some curve. See, e.g., Apostol, p. 104.
The expressions on the right-hand side of the equations in (a) and (b) have the
appearance of a dot product. It is convenient to define the gradient of a scalar
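field φ(x, y, z) by the vector
$$\operatorname{grad}\varphi = \frac{\partial\varphi}{\partial x}\, i + \frac{\partial\varphi}{\partial y}\, j + \frac{\partial\varphi}{\partial z}\, k.$$
Then we can write
$$d\varphi = (\operatorname{grad}\varphi \cdot ds) \quad \text{and} \quad \frac{d\varphi}{ds} = (\operatorname{grad}\varphi \cdot s_0),$$
where ds = dx i + dy j + dz k represents infinitesimal displacement in some
direction and
$$s_0 = \frac{dx}{ds}\, i + \frac{dy}{ds}\, j + \frac{dz}{ds}\, k$$
is the unit vector in the specified direction.*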
Since every differentiable scalar field generates a gradient field, it is natural
to ask whether any given vector field u = u(x, y, z) may not be the gradient of
some scalar φ. The answer is negative and this becomes clear as we examine the
basic properties of gradient fields. In this survey we shall need certain assumptions
regarding the differentiability of various functions and analytic properties of
curves and surfaces involved in vector analysis. We shall mention these assump-
tions as we need them. In many cases they can be relaxed and the results can be
generalized, but we shall confine ourselves to the common situations encountered
in physics.
A curve in space is called smooth if it can be represented by
$$x = x(t), \quad y = y(t), \quad z = z(t),$$
where x(t), y(t), and z(t) have continuous derivatives with respect to the parameter
t (for a curve in a plane, simply set z = 0). Smooth curves possess tangents at
all points and a (vector) line element ds can be defined at any point. The
smoothness also guarantees the existence of line integrals.† This last property is trivially
extended to piecewise smooth curves; i.e., those consisting of a finite number of
smooth parts. We shall assume that all curves considered by us are piecewise
smooth.
Regarding the differentiability of various functions, we must remember the
following definitions and statements: the interior of a sphere of arbitrary radius ε
(usually thought to be small) centered at some point M(x, y, z) is called a
neighborhood‡ of this point (in a plane, replace “sphere” by “circle”). If a set of points
* Observe that $dx/ds = \cos(s_0, i)$, etc., are the directional cosines of the direction
defined by ds or by $s_0$.
† We shall assume that all integrals are Riemann integrals which are adequate for our
purposes. For example, see Apostol p. 276.
‡ A more precise term is an ε-neighborhood.
is such that it contains some neighborhood of every one of its points, then it is
called an open set. For instance, the interior of a cube is an open set; we can
always draw a small sphere about each interior point which will lie entirely within
the cube. However, the cube with boundary points included is no longer an open set.
The reason these concepts are needed is that partial derivatives of a function
in space are defined by a limiting process that is tied to a neighborhood. We must
make sure a region is an open set before we can say that f(x, y, z) is differentiable
in this region.
In addition, we shall be mostly interested in connected open sets, or domains.
These are open sets any two points of which can be connected by a polygon, i.e.,
a curve which is formed by a finite number of connected straight-line segments.
From now on we shall assume that all our piecewise smooth curves lie in domains
where the functions under consideration (scalar fields and components of vector
fields) possess continuous first-order partial derivatives.
Let us now return to the properties of gradient fields. Suppose that
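$$u = \operatorname{grad}\varphi(x, y, z)$$
and consider the following integral between points
$$M(x_0, y_0, z_0) \quad \text{and} \quad N(x_1, y_1, z_1),$$
taken along some curve C:
$$\int_M^N (u \cdot ds) = \int_M^N \left( \frac{\partial\varphi}{\partial x}\, dx + \frac{\partial\varphi}{\partial y}\, dy + \frac{\partial\varphi}{\partial z}\, dz \right) \quad (\text{along } C).$$
Using the parameter t as the variable of integration, we have
$$\int_M^N (u \cdot ds) = \int_{t_0}^{t_1} \left( \frac{\partial\varphi}{\partial x}\frac{dx}{dt} + \frac{\partial\varphi}{\partial y}\frac{dy}{dt} + \frac{\partial\varphi}{\partial z}\frac{dz}{dt} \right) dt = \int_{t_0}^{t_1} \frac{d\varphi}{dt}\, dt = \varphi(N) - \varphi(M),$$
where $t_0$ and $t_1$ are the values of the parameter t corresponding to points M and
N. We see that the integral is simply the difference of values of φ(x, y, z) at points
N and M and, therefore, is independent of the choice of curve C.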
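The independence of path can be observed numerically. In the sketch below, which is only an illustration, the scalar field φ = xy + sin z, the two paths and the helper `line_integral` are all hypothetical choices; the integral is approximated by summing the field at the midpoint of each small chord times the chord. Both paths join M(0, 0, 0) to N(1, 2, 3), and both results should agree with φ(N) − φ(M) up to the discretization error.

```python
import math

def phi(x, y, z):
    """Sample scalar field (an arbitrary choice for illustration)."""
    return x * y + math.sin(z)

def grad_phi(x, y, z):
    """Gradient of the sample field, computed by hand."""
    return (y, x, math.cos(z))

def line_integral(field, path, n=20000):
    """Approximate the integral of (u . ds) along path(t), 0 <= t <= 1.

    The curve is replaced by n chords; on each chord the field is taken
    at the chord midpoint.
    """
    total = 0.0
    prev = path(0.0)
    for k in range(1, n + 1):
        cur = path(k / n)
        mid = [0.5 * (p + c) for p, c in zip(prev, cur)]
        u = field(*mid)
        total += sum(ui * (c - p) for ui, p, c in zip(u, prev, cur))
        prev = cur
    return total

def straight(t):
    """Straight segment from M(0, 0, 0) to N(1, 2, 3)."""
    return (t, 2.0 * t, 3.0 * t)

def curved(t):
    """A different smooth curve with the same end points."""
    return (t ** 2, 2.0 * t + math.sin(math.pi * t), 3.0 * t ** 3)

expected = phi(1.0, 2.0, 3.0) - phi(0.0, 0.0, 0.0)
along_straight = line_integral(grad_phi, straight)
along_curved = line_integral(grad_phi, curved)

print(expected, along_straight, along_curved)
```

Replacing `grad_phi` by a field that is not a gradient, for instance (−y, x, 0), makes the two results differ.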
Conversely, if the integral
$$\int_M^N (u \cdot ds)$$
is independent of path,* then keeping M fixed and treating N as a variable point,
we can define a function
$$\varphi(x, y, z) = \int_M^N (u \cdot ds) = \int_{(x_0, y_0, z_0)}^{(x, y, z)} (u_x\, dx + u_y\, dy + u_z\, dz).$$
* From now on, we shall occasionally use the term “path” to indicate a piecewise smooth
curve.
Figure 1.12 (a), (b)
It is now simple to show that grad φ = u. For instance,
$$\varphi(x + \Delta x, y, z) - \varphi(x, y, z) = \int_{(x, y, z)}^{(x + \Delta x, y, z)} (u_x\, dx + u_y\, dy + u_z\, dz) = \int_{(x, y, z)}^{(x + \Delta x, y, z)} u_x\, dx,$$
and the statement $u_x = \partial\varphi/\partial x$ follows from the fundamental theorem of integral
calculus.
We have then established the following theorem: The necessary and sufficient
condition that u = grad φ is the independence of path of the integral $\int (u \cdot ds)$.
An alternative way of stating this result follows from consideration of the
integral
$$\oint_C (u \cdot ds)$$
over a simple closed path C, called the circulation of vector u around C. By simple
closed path we mean a closed path which does not intersect itself.*
The following theorem holds: The circulation of u vanishes for an arbitrary
simple closed path C (in a domain D) if and only if the integral $\int_M^N (u \cdot ds)$ is
independent of path (in D).
Indeed, let C (Fig. 1.12a) be a simple closed path. Choose two arbitrary
points M and N on C and write
$$\oint_C (u \cdot ds) = \underset{\text{along } C_1}{\int_M^N (u \cdot ds)} + \underset{\text{along } C_2}{\int_N^M (u \cdot ds)} = \underset{\text{along } C_1}{\int_M^N (u \cdot ds)} - \underset{\text{along } C_2}{\int_M^N (u \cdot ds)}.$$
If the integral $\int_M^N (u \cdot ds)$ is independent of path, the right-hand side vanishes and
the circulation is zero.
Conversely, if two paths $C_1$ and $C_2$ connecting two points (Fig. 1.12b) do
not intersect (in space), a simple closed path can be formed from them and the
above equation holds. If the left-hand side is zero, so is the right-hand side,
yielding the independence of path. If $C_1$ and $C_2$ intersect a finite number of times,
the proof is obtained by splitting the closed path into a finite number of simple
* This property permits us to assign the direction of integration around the curve,
characterized by the vector ds, in a unique fashion.
closed paths. In the rather exceptional case when $C_1$ and $C_2$ cross each other
an infinite number of times, a limiting process can be invoked reducing this
case to the preceding one.*
Remark. Within the context of the above discussion, it is emphasized that by φ(x, y, z)
we mean a well-defined single-valued function† over the entire domain and nothing
short of this requirement will suffice. In many treatments‡ of the magnetostatic field H,
one introduces the so-called scalar magnetic potential X so that H = grad X and yet
the circulation of H does not vanish over some contours. However, in all such cases it
is impossible to define X uniquely over the entire contour (for those contours for which
it is possible, the circulation of H does indeed vanish).
It should now be clear that many vector fields do not fall into the category
of gradient fields since it is easy to construct a vector u for which the integral
$\int (u \cdot ds)$ will actually depend on path. It is perhaps even easier to sketch some
such fields, a task facilitated by the introduction of the concept of field lines.
These field lines are curves with tangents directed along the vector field u at
every point. For instance, Fig. 1.13 shows the velocity field (in the plane) of a
fluid rotating around a circular obstacle. In this case the field lines are the
trajectories along which the particles of fluid actually move.
Figure 1.13   Figure 1.14
It is evident that the circulation of the velocity vector u around any one of
the circles in Fig. 1.13 cannot be zero (the product (u · ds) has the same sign at each
point of the circle). Consequently, the above field cannot be a gradient field.
The velocity field of a fluid is, perhaps, the best starting point for investigation
of other types of vector fields because it naturally leads us to another fundamental
concept: the flux of a vector field.
Consider the element dS of a surface S (Fig. 1.14). Just as in the case of
curves, we shall deal only with piecewise smooth surfaces, i.e., those consisting of
* The details may be found in O. D. Kellogg, Foundations of Potential Theory.
† A multivalued function is not one function, but a collection of several different functions.
‡ For example, Reitz and Milford, Foundations of Electromagnetic Theory, Section 8.8.
a finite number of smooth portions. By a smooth surface we mean a surface
representable by
$$x = x(p, q), \quad y = y(p, q), \quad z = z(p, q),$$
where p and q are independent parameters and the functions x, y, and z have con-
tinuous first partials with respect to p and q in the domain under consideration.
Smooth surfaces possess tangential planes at all points and can be oriented; that
is, we can distinguish between the positive side and the negative side of the surface.
We shall also assume that our piecewise smooth surfaces are constructed in such
a way that they are oriented.* It is customary to represent surface elements dS
by vectors dS whose magnitudes equal the areas of the elements and which are directed
along the positive normal to the surface, as illustrated in Fig. 1.14. Suppose that the
vector field u represents the velocity of a moving fluid. It can be seen that the inner
product (u · dS) represents the amount (volume) of fluid passing through dS per unit
time. Indeed, the particles of fluid crossing dS at time t will occupy the face ABCD of
the shown parallelepiped at time t + dt and all particles of fluid which have crossed dS
between t and t + dt will be located at t + dt inside the parallelepiped. Consequently, the
amount of fluid passing through dS in the interval dt is given by the volume of the
parallelepiped, equal to
$$dS \cdot |u| \cdot dt \cos\theta = (u \cdot dS)\, dt.$$
Divide by dt and obtain the desired statement.
By analogy with these observations, we define, in general, the flux of a vector
field u through a surface S by the surface integral
$$\Phi = \iint_S (u \cdot dS).$$
In this formula, S can be either an open or a closed surface. A very familiar case
of the latter is found in Gauss’ theorem in electrostatics.
1.7 VECTOR FIELDS IN PLANE
According to the material of the preceding section, integrals representing circula-
tion and flux are important in the study of vector fields. For vectors in a plane,
the circulation integral has the form
$$\oint (u \cdot ds) = \oint (u_x\, dx + u_y\, dy).$$†
Integrals of this type can be analyzed by means of Green's theorem: If C is a (piece-
wise smooth) simple closed curve in a simply connected domain D and if P(x, y)
* For details, consult, e.g., Kaplan, p. 260 et seq.
† Unless stated otherwise, it is conventional to take the direction of integration in such
integrals, i.e., the orientation of ds, as counterclockwise.
Simply connected domain   Doubly connected domain
Figure 1.15
and Q(x, y) have continuous first partials in D, then
$$\oint_C (P\, dx + Q\, dy) = \iint_S \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dS,$$
where S is the area bounded by C.
The importance of the requirement that C is a simple closed curve (see p. 18)
lies in the fact that we can distinguish the interior of the curve from its exterior
by the following rule: As we proceed along the curve in the direction of ds we
designate the region appearing on our right as the exterior and that on our left
as the interior. If the curve crosses itself such a formulation leads to contradiction
as should be obvious by considering a curve in the shape of a “figure eight.”
A domain in a plane is said to be simply connected if every simple closed curve
in it has its interior inside the domain as well. Figuratively speaking, a domain
is simply connected if it has no “holes” (Fig. 1.15).
Without going into mathematical details we shall sketch a possible method
of proving Green’s theorem which greatly facilitates its physical interpretations.
First of all, note that P(x, y) and Q(x, y) can always be treated as the components
$u_x(x, y)$ and $u_y(x, y)$ of some vector field, and we shall adopt, for convenience,
this identification. Let us now divide the area S into a network of meshes, as
illustrated in Fig. 1.16(a). Taking the integrals $\oint (u \cdot ds)$ around each mesh in
Figure 1.16 (a), (b)
counterclockwise direction we can easily deduce that
$$\oint_C (u \cdot ds) = \sum_{\text{all meshes}} \oint (u \cdot ds).$$
(The contribution from a common boundary between two meshes cancels out
because of opposite orientations of vectors ds for each mesh; this leaves only the
contributions from the pieces of C.) Furthermore, multiplying and dividing each
term in the sum by the area ΔS of each mesh, we obtain
$$\oint_C (u \cdot ds) = \sum_{\text{all meshes}} \frac{\oint (u \cdot ds)}{\Delta S}\, \Delta S.$$
Suppose now that the number of meshes is increased to infinity and that each
mesh “shrinks to a point” so that ΔS → 0. If the limit
$$\lim_{\Delta S \to 0} \frac{\oint (u \cdot ds)}{\Delta S} = f(x, y)$$
exists and is independent of the shape of ΔS,* then the sum reduces to an integral
and we have
$$\oint_C (u \cdot ds) = \iint_S f(x, y)\, dS.$$
Therefore, it remains for us to evaluate the function f(x, y). A typical mesh is
shown in Fig. 1.16(b); it need not be rectangular since the arguments presented
below are valid for an arbitrary shape. If $u_x$ and $u_y$ have continuous partials,
then we can write
$$u_x \cong (u_x)_P + \left( \frac{\partial u_x}{\partial x} \right)_P (x - \xi) + \left( \frac{\partial u_x}{\partial y} \right)_P (y - \eta)$$
and
$$u_y \cong (u_y)_P + \left( \frac{\partial u_y}{\partial x} \right)_P (x - \xi) + \left( \frac{\partial u_y}{\partial y} \right)_P (y - \eta)$$
with the approximations being within the first order in |x − ξ| and |y − η|.†
Here P(ξ, η) is the fixed point to which ΔS ultimately shrinks and M(x, y) is an
arbitrary point on the boundary of the mesh. Writing now
$$\oint (u \cdot ds) = \oint u_x\, dx + \oint u_y\, dy,$$
* Except that the largest diameter of ΔS must approach zero: the mesh should not become
infinitesimally thin while retaining finite length.
† By the theorem on existence of total differential, guaranteed by the continuity of
partial derivatives.
we see that the following six integrals will be needed:
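$$\oint dx, \quad \oint dy, \quad \oint x\, dy, \quad \oint y\, dx, \quad \oint x\, dx, \quad \oint y\, dy.$$
The first two are zero, the second two are +ΔS and −ΔS, respectively, and the
last two are zero. As a result, we have
$$\oint (u \cdot ds) \cong \left( \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y} \right)_P \Delta S.$$
Consequently,
$$f(x, y) = \lim_{\Delta S \to 0} \frac{\oint (u \cdot ds)}{\Delta S} = \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y},$$
where the stipulation that the partials are to be calculated at P(ξ, η) can be omitted
since P is now an arbitrary point within C. We have then the result
$$\oint_C (u_x\, dx + u_y\, dy) = \iint_S \left( \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y} \right) dS,$$
which is simply Green’s theorem in our notation.
With regard to a vector field $u = u_x i + u_y j$, the function f(x, y) is known as
the curl of u so that, by definition,
$$\operatorname{curl} u = \lim_{\Delta S \to 0} \frac{\oint (u \cdot ds)}{\Delta S}.$$
We have then evaluated the expression for curl u in (orthogonal) cartesian
coordinates in plane:
$$\operatorname{curl} u = \frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y}.$$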
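The definition of curl u as circulation per unit area can be tried on a concrete field. The sketch below is an illustration only: the sample field u = (−xy, x² + 3y), the square mesh and the helper `circulation_square` are hypothetical choices. The counterclockwise circulation around a small square centered at a point is divided by the area of the square and compared with the value of ∂u_y/∂x − ∂u_x/∂y at that point, which for this field is 3x.

```python
def u(x, y):
    """Sample plane vector field (an arbitrary choice for illustration)."""
    return (-x * y, x ** 2 + 3.0 * y)

def curl_exact(x, y):
    """d(u_y)/dx - d(u_x)/dy for the sample field, computed by hand."""
    return 2.0 * x - (-x)

def circulation_square(field, xc, yc, h, n=200):
    """Counterclockwise circulation around a square of side h centered at (xc, yc).

    Each side is split into n pieces and the field is taken at the
    midpoint of each piece.
    """
    x0, x1 = xc - h / 2.0, xc + h / 2.0
    y0, y1 = yc - h / 2.0, yc + h / 2.0
    step = h / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * step
        total += field(x0 + s, y0)[0] * step   # bottom side, ds along +x
        total += field(x1, y0 + s)[1] * step   # right side,  ds along +y
        total -= field(x1 - s, y1)[0] * step   # top side,    ds along -x
        total -= field(x0, y1 - s)[1] * step   # left side,   ds along -y
    return total

h = 1e-2
point = (0.5, -0.2)
estimate = circulation_square(u, point[0], point[1], h) / h ** 2
exact = curl_exact(*point)

print(estimate, exact)
```

A square mesh is used only for convenience; according to the argument above, the limit does not depend on the shape of the mesh.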
Remark. Attention is drawn to the fact that, for a vector field in a plane, curl u is essen-
tially a scalar* and not a vector. The point is that by vectors we must mean quantities
expressible as a i + b j and curl u is definitely not this type, whether or not we introduce
the third axis.
A vector field u which has zero curl at some point is called an irrotational field
(at that point). If u is irrotational in a simply connected domain, then by Green’s
theorem, it is a conservative field (gradient field) in this domain, i.e., it has zero
circulation. The converse has to be worded rather carefully: If u is a gradient
field (namely, u = grad φ), then it is irrotational provided φ has continuous
second-order partial derivatives.
* In a more elaborate nomenclature, curl u is called pseudoscalar due to its peculiar
property of changing sign if the x- and y-axes are interchanged. See Section 16.5.
Figure 1.17   Figure 1.18
Example. The magnetic induction field (B-field) due to an infinite current-
carrying wire is known to be (outside the wire)*
$$B = \frac{\mu_0 I}{2\pi r}\, s_0 \quad \text{(MKSA units)}.$$
In the xy-plane (as shown in Fig. 1.17),
$$B_x = -B \sin\theta = -\frac{\mu_0 I}{2\pi r}\,\frac{y}{r} = -\frac{\mu_0 I}{2\pi}\,\frac{y}{x^2 + y^2},$$
$$B_y = B \cos\theta = \frac{\mu_0 I}{2\pi}\,\frac{x}{x^2 + y^2}.$$
This field happens to be irrotational everywhere except at the origin. Therefore
$\oint_C (B \cdot ds) = 0$ if C does not encircle the origin, but not otherwise. A function
χ(x, y) may be found such that B = grad χ in a simply connected domain D, but
this can be done only if the domain does not contain the origin.
The B-field inside the wire is known to be
$$B = \frac{\mu_0 I r}{2\pi R^2}\, s_0,$$
where R is the radius of the wire. Here $B_x = -(\mu_0 I / 2\pi R^2)\, y$ and $B_y = (\mu_0 I / 2\pi R^2)\, x$.
The field is not irrotational and cannot be represented as grad χ anywhere.
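The two statements about the field outside the wire can be illustrated by direct numerical integration. The sketch below is not part of the original example: the product μ0·I is set to 1 in arbitrary units, and the helper `circulation_circle` and the particular circles are hypothetical choices. For a circle around the wire the circulation should come out equal to μ0·I (Ampère's circuital law), while for a circle that does not encircle the origin it should vanish.

```python
import math

MU0_I = 1.0   # product mu_0 * I in arbitrary units (assumed value for illustration)

def b_outside(x, y):
    """Components (B_x, B_y) of the field outside the wire, with the wire along z."""
    r2 = x * x + y * y
    factor = MU0_I / (2.0 * math.pi)
    return (-factor * y / r2, factor * x / r2)

def circulation_circle(field, xc, yc, radius, n=2000):
    """Counterclockwise circulation around a circle, using n equal arcs.

    On each arc the field is taken at the midpoint and ds is the tangent
    vector (-sin t, cos t) times radius * dt.
    """
    total = 0.0
    dt = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dt
        x = xc + radius * math.cos(t)
        y = yc + radius * math.sin(t)
        bx, by = field(x, y)
        total += (-bx * math.sin(t) + by * math.cos(t)) * radius * dt
    return total

around_wire = circulation_circle(b_outside, 0.0, 0.0, 2.0)     # encircles the origin
away_from_wire = circulation_circle(b_outside, 5.0, 0.0, 2.0)  # does not encircle it

print(around_wire, away_from_wire)
```

The first result does not change if the radius is altered or the circle is shifted, as long as the origin stays inside; this is the behavior expected of a field that is irrotational everywhere except at one point.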
Let us now turn our attention to the concept of flux for vectors in plane. The
obvious analog to the three-dimensional case is the integral
$$\Phi = \int_C (u \cdot n_0)\, ds = \int_C (u \cdot dn)$$
taken over a curve C (not necessarily closed) with $n_0$ being the unit normal to
the curve and $dn = n_0\, ds$. This is illustrated in Fig. 1.18.
* $s_0$ is the unit vector that is tangential to the circle drawn around the axis of the wire.
Exercise. For a flow of fluid in a plane, relate the integral Φ above to the amount of fluid
crossing the curve C per unit time. Specify physical units of all the quantities used.
In many applications, the flux through a closed curve is involved. Evidently if
ds = dxi + dyj, then*
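$$dn = dy\, i - dx\, j$$
and
$$\oint_C (u \cdot dn) = \oint_C (u_x\, dy - u_y\, dx).$$
Setting $P = -u_y$, $Q = u_x$, this integral can be transformed by means of Green’s
theorem so that
$$\oint_C (u \cdot dn) = \iint_S \left( \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} \right) dx\, dy$$
provided, of course, that the partials are continuous everywhere inside C.
This relationship is usually referred to as the divergence theorem (in a plane)
and is written as
$$\oint_C (u \cdot dn) = \iint_S \operatorname{div} u \cdot dS,$$
where
$$\operatorname{div} u = \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y}$$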
is another function derivable from a vector field and is known as the divergence of u.
While the above derivation is straightforward, it does not reveal the geometric
(or physical) meaning of div u. It is instructive to invoke the technique used in
Green’s theorem: Dividing the area S into a network of meshes, we find that it
is not hard to establish that
$$\oint_C (u \cdot dn) = \sum_{\text{all meshes}} \frac{\oint (u \cdot dn)}{\Delta S}\, \Delta S,$$
because the vectors dn at the common boundary of two adjacent meshes are
oppositely directed. This observation permits us to define the divergence of u as
flux out of an infinitesimal area (per unit area), namely,
$$\operatorname{div} u = \lim_{\Delta S \to 0} \frac{\oint (u \cdot dn)}{\Delta S}.$$
Exercise. Derive the formula $\operatorname{div} u = \partial u_x/\partial x + \partial u_y/\partial y$ starting from the above
definition and using the arguments analogous to those for curl u. Spell out the conditions
required in the derivation.
* It is a standard convention that for closed curves in a plane (and closed surfaces in
space) the positive normal is the outward-pointing normal.
Remark. From our definitions of curl u and div u it is seen that both represent a new kind
of derivative, namely, a derivative with respect to infinitesimal area rather than infinitesi-
mal displacement:
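$$\operatorname{curl} u = \lim_{\Delta S \to 0} \frac{\oint (u \cdot ds)}{\Delta S}, \qquad \operatorname{div} u = \lim_{\Delta S \to 0} \frac{\oint (u \cdot dn)}{\Delta S}.$$
It may be of interest to mention that the gradient of a scalar field φ can be represented
in a similar fashion; i.e., the following statement holds:
$$\operatorname{grad}\varphi = \lim_{\Delta S \to 0} \frac{\oint \varphi\, dn}{\Delta S}.$$
Another interesting observation is that curl u can be related to an integral involving
dn and div u to an integral involving ds. Indeed, the identities
$$(u \cdot dn) = [u \times ds], \qquad (u \cdot ds) = [dn \times u],$$
are not hard to verify provided we treat the cross product of two vectors in plane as a
scalar, which is the logical thing to do.* Consequently, the following statements also hold:
$$\operatorname{curl} u = \lim_{\Delta S \to 0} \frac{\oint [dn \times u]}{\Delta S}, \qquad \operatorname{div} u = \lim_{\Delta S \to 0} \frac{\oint [u \times ds]}{\Delta S}.$$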
Vector fields which have zero divergence are called solenoidal fields. They
are very common in physics. For instance, the electrostatic field is solenoidal in
the absence of charged matter; the magnetic induction field is solenoidal everywhere.
1.8 VECTOR FIELDS IN SPACE
We would now like to extend the analysis of the last section to vectors in space.
We shall start with the flux of a vector field u through a closed surface S, because
this integral is, perhaps, easiest to handle. It reads
$$\Phi = \oiint_S (u \cdot dS) = \oiint_S (u_x\, dS_x + u_y\, dS_y + u_z\, dS_z),$$
where $dS_x$, $dS_y$, and $dS_z$ are projections of vector dS on the coordinate axes.
Integrals of this type can be handled by Gauss’ theorem: If S is a (piecewise
smooth) closed orientable surface contained, along with its interior, in a domain D
and if L(x, y, z), M(x, y, z), and N(x, y, z) have continuous first partials in D, then
$$\oiint_S (L\, dS_x + M\, dS_y + N\, dS_z) = \iiint_V \left( \frac{\partial L}{\partial x} + \frac{\partial M}{\partial y} + \frac{\partial N}{\partial z} \right) dV,$$
where V is the volume bounded by S.
* A typical cross product of two vectors in physics is the torque [r × F] which, for
vectors in a plane, is completely described by magnitude and sign (clockwise or
counterclockwise). See the remark on p. 23 which may lead to a conjecture that the cross product
in a plane is a pseudoscalar (and, indeed, it is).
Figure 1.19
Gauss’ theorem can be restated in vector notation, by identifying L, M, and
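N with the components of a vector field $u = u_x i + u_y j + u_z k$. The concept of
divergence of u,
$$\operatorname{div} u = \frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} + \frac{\partial u_z}{\partial z},$$
is readily introduced so that Gauss’ theorem can be rewritten as
$$\oiint_S (u \cdot dS) = \iiint_V \operatorname{div} u\, dV$$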
and is often referred to as the divergence theorem.
The deduction of this theorem can be done again by a method analogous to
that in the preceding section. In this case we cut the volume V into small pieces,
say, rectangular blocks, one of which is shown in Fig. 1.19. If we calculate the
flux of u through each block and add all the results, we must obtain the flux
through the outer boundary S; just as before, the flux of u through an interface
between two blocks must appear in the sum twice, but with the opposite sign
because of the changed direction of the “outward normal” (see the footnote on
p. 25). Consequently,
$$\oiint_S (u \cdot dS) = \sum_{\text{all blocks}} \frac{\oiint (u \cdot dS)}{\Delta V}\, \Delta V,$$
introducing, for an obvious purpose, the volume ΔV of each block. We now
define the divergence of u by means of
$$\operatorname{div} u = \lim_{\Delta V \to 0} \frac{\oiint (u \cdot dS)}{\Delta V}.$$
Gauss’ theorem can now be deduced by ironing out the mathematical details
and developing the formula for div u in cartesian coordinates. We shall
give a simplified version of this from the consideration of a rectangular block
ΔV (Fig. 1.19).
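It is not hard to show that the flux of u through a rectangle such as ABCD is given* within the first order in $\Delta y$ and $\Delta z$ by
$$\Phi_{ABCD} = (u_x)_P\,\Delta y\,\Delta z,$$
where $(u_x)_P$ is calculated at the center P of the rectangle. Indeed, for any point within ABCD we have
$$u_x \cong (u_x)_P + \left(\frac{\partial u_x}{\partial y}\right)_P (y - y_P) + \left(\frac{\partial u_x}{\partial z}\right)_P (z - z_P),$$
and the last two terms yield zero when integrated over the area ABCD.
Consequently, the flux of u through the face A′B′C′D′ is to be determined by the value of $u_x$ at the center of this face and is given by
$$\left[(u_x)_P + \left(\frac{\partial u_x}{\partial x}\right)_P \frac{\Delta x}{2}\right]\Delta y\,\Delta z.$$
Similarly, the flux through the opposite face is
$$-\left[(u_x)_P - \left(\frac{\partial u_x}{\partial x}\right)_P \frac{\Delta x}{2}\right]\Delta y\,\Delta z.$$
There is a minus sign in front because we now want the positive normal in the negative x-direction. Adding these two fluxes we obtain
$$\left(\frac{\partial u_x}{\partial x}\right)_P \Delta x\,\Delta y\,\Delta z.$$
The fluxes through the other four faces are obtained in a similar way, leading to the statement about the flux through a small parallelepiped $\Delta V$:
$$\Phi_{\Delta V} = \left(\frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} + \frac{\partial u_z}{\partial z}\right)_P \Delta V.$$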
The expression for divergence now readily follows and concludes our analysis.
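The block argument above can be checked numerically. The following is a minimal sketch, assuming a sample field u = (x²y, yz, xz²) chosen purely for illustration (it does not appear in the text): the net outward flux through a small cube, with each face flux approximated by the normal component at the face center times the face area, is divided by the volume and compared with the cartesian expression for div u.

```python
def u(x, y, z):
    """Hypothetical sample field u = (x^2 y, y z, x z^2), for illustration only."""
    return (x * x * y, y * z, x * z * z)


def div_exact(x, y, z):
    # d(x^2 y)/dx + d(y z)/dy + d(x z^2)/dz
    return 2 * x * y + z + 2 * x * z


def flux_per_volume(px, py, pz, d):
    """Net outward flux through a cube of side d centered at P, divided by d^3.

    Each face flux is approximated by the normal component of u at the
    center of the face times the area of the face.
    """
    h = d / 2
    area = d * d
    flux = 0.0
    flux += (u(px + h, py, pz)[0] - u(px - h, py, pz)[0]) * area  # x-faces
    flux += (u(px, py + h, pz)[1] - u(px, py - h, pz)[1]) * area  # y-faces
    flux += (u(px, py, pz + h)[2] - u(px, py, pz - h)[2]) * area  # z-faces
    return flux / d**3


P = (0.7, -0.4, 1.3)
estimate = flux_per_volume(*P, 1e-3)
exact = div_exact(*P)
print(estimate, exact)
```

For this particular polynomial field the two numbers agree up to rounding; for a general smooth field the agreement would only hold in the limit of a shrinking block.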
* Assuming a positive normal in the positive x-direction.
Figure 1.20
Let us now consider the question of circulation of a vector in space. Such an integral has the form
$$\oint_C (\mathbf{u}\cdot d\mathbf{s}) = \oint_C (u_x\,dx + u_y\,dy + u_z\,dz),$$
and we shall assume that C is a (piecewise smooth) simple closed curve in space. It need not be a plane curve but we shall assume that it can be spanned by a (piecewise smooth) orientable surface S; that is, C can serve as the boundary for such a surface. This situation is illustrated in Fig. 1.20(a). It is a standard convention
to define the positive direction of circulation around a surface element dS so
that it forms the “right-hand screw system” with the positive normal to the
surface.* The positive direction of circulation around the boundary curve C is
now chosen so that it coincides with such direction for the adjacent surface element
(Fig. 1.20a). When all this is done, we can quote Stokes’ theorem: If S is a (piece-
wise smooth) orientable surface spanning a (piecewise smooth) simple closed curve
C, mutually oriented as described, then
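$$\oint_C \left[L(x,y,z)\,dx + M(x,y,z)\,dy + N(x,y,z)\,dz\right] = \iint_S \left[\left(\frac{\partial N}{\partial y} - \frac{\partial M}{\partial z}\right)dy\,dz + \left(\frac{\partial L}{\partial z} - \frac{\partial N}{\partial x}\right)dz\,dx + \left(\frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}\right)dx\,dy\right],$$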
provided L, M, and N have continuous partials in a domain containing S and C.
As before, we can treat L, M, and N as the components of a vector field u. By
the now familiar technique, we divide S into a network of (curved) meshes
(Fig. 1.20b) and claim
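$$\oint_C (\mathbf{u}\cdot d\mathbf{s}) = \sum_{\text{all meshes}} \frac{\oint (\mathbf{u}\cdot d\mathbf{s})}{\Delta S}\,\Delta S,$$
where $\Delta S$ is the area of a given mesh. As before, we calculate
$$\lim_{\Delta S\to 0}\frac{\oint (\mathbf{u}\cdot d\mathbf{s})}{\Delta S} = \lim_{\Delta S\to 0}\frac{\oint (u_x\,dx + u_y\,dy + u_z\,dz)}{\Delta S}.$$
Considering a small mesh about a point $P(\xi, \eta, \zeta)$ we have, for each point M(x, y, z) on its boundary $\Gamma$ (Fig. 1.21a),
$$u_x(M) \cong (u_x)_P + \left(\frac{\partial u_x}{\partial x}\right)_P (x - \xi) + \left(\frac{\partial u_x}{\partial y}\right)_P (y - \eta) + \left(\frac{\partial u_x}{\partial z}\right)_P (z - \zeta), \quad\text{etc.}$$
* A rule well known from the study of magnetic fields.
Figure 1.21: (a) Case I; (b) Case II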
We shall need integrals like
$$\oint_\Gamma dx, \quad \oint_\Gamma x\,dx, \quad \oint_\Gamma y\,dx, \quad \oint_\Gamma z\,dx,$$
and similar ones. These integrals can be evaluated by projecting our mesh on the coordinate planes.
Suppose that we want to obtain the integrals of the type $\oint_\Gamma f(x,y)\,dx$ and $\oint_\Gamma f(x,y)\,dy$. Since the points M and M′ (see Fig. 1.21) have the same x and y, these integrals reduce to $\oint_{\Gamma'} f(x,y)\,dx$ and $\oint_{\Gamma'} f(x,y)\,dy$ except for the sign* which depends on the orientation of the mesh:
$$\oint_\Gamma f(x,y)\,dx = \pm\oint_{\Gamma'} f(x,y)\,dx, \quad\text{etc.}$$
The plus sign is for Case I (Fig. 1.21a), where the point M′ describes $\Gamma'$ in the counterclockwise direction; the minus sign is for Case II (Fig. 1.21b), with M′ going clockwise.†
In particular, note that
$$\oint_\Gamma dx = \oint_\Gamma x\,dx = 0$$
while
$$\oint_\Gamma y\,dx = \pm\oint_{\Gamma'} y\,dx = \mp\Delta S',$$
where $\Delta S'$ is the area bounded by $\Gamma'$, i.e., the area of the projection of the mesh on the xy-plane. Since the mesh is small, we have (Fig. 1.21)
$$\Delta S' \cong \Delta S\cos(\mathbf{n}_0, \mathbf{k}) \ \text{in Case I}, \qquad \Delta S' \cong -\Delta S\cos(\mathbf{n}_0, \mathbf{k}) \ \text{in Case II}$$
(where $\mathbf{n}_0$ is the unit normal to $\Delta S$). In either case,
$$\oint_\Gamma y\,dx = -\Delta S\cos(\mathbf{n}_0, \mathbf{k}).$$
Similarly, we can deduce
$$\oint_\Gamma x\,dy = +\Delta S\cos(\mathbf{n}_0, \mathbf{k})$$
(this integral is also needed). Other integrals are evaluated in an analogous fashion. In particular, $\oint_\Gamma z\,dx$ and $\oint_\Gamma x\,dz$ require projection on the xz-plane, and so on.
* The symbol $\oint_{\Gamma'}$ by itself is meant to indicate counterclockwise integration (see the footnote on p. 20).
† The motion of M′ cannot be chosen at will because it is determined by the motion of M in the positive direction of circulation around the mesh.
The net result of this calculation reads
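$$\oint_\Gamma (\mathbf{u}\cdot d\mathbf{s}) \cong \Delta S\left\{\left(\frac{\partial u_z}{\partial y} - \frac{\partial u_y}{\partial z}\right)_P \cos(\mathbf{n}_0, \mathbf{i}) + \left(\frac{\partial u_x}{\partial z} - \frac{\partial u_z}{\partial x}\right)_P \cos(\mathbf{n}_0, \mathbf{j}) + \left(\frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y}\right)_P \cos(\mathbf{n}_0, \mathbf{k})\right\}.$$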
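Introducing the curl of a vector in space (which is now a vector, in distinction to the plane case) by means of the relation
$$\lim_{\Delta S\to 0}\frac{\oint (\mathbf{u}\cdot d\mathbf{s})}{\Delta S} = (\operatorname{curl}\mathbf{u}\cdot\mathbf{n}_0),$$
we have the result
$$\operatorname{curl}\mathbf{u} = \left(\frac{\partial u_z}{\partial y} - \frac{\partial u_y}{\partial z}\right)\mathbf{i} + \left(\frac{\partial u_x}{\partial z} - \frac{\partial u_z}{\partial x}\right)\mathbf{j} + \left(\frac{\partial u_y}{\partial x} - \frac{\partial u_x}{\partial y}\right)\mathbf{k},$$
which also gives rise to the statement that circulation of u around an infinitesimal oriented area described by dS is equal to $(\operatorname{curl}\mathbf{u}\cdot d\mathbf{S})$. Thus Stokes’ theorem can now be written in the compact form:*
$$\oint_C (\mathbf{u}\cdot d\mathbf{s}) = \iint_S (\operatorname{curl}\mathbf{u}\cdot d\mathbf{S}).$$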
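The definition of curl as circulation per unit area can be illustrated with a short numerical sketch. It assumes a sample field u = (yz, x², xyz), chosen only for illustration, and a small square mesh in a plane z = const with unit normal k, traversed counterclockwise as seen from the positive z-direction (the right-hand screw convention described above).

```python
def u(x, y, z):
    """Hypothetical sample field u = (y z, x^2, x y z), for illustration only."""
    return (y * z, x * x, x * y * z)


def curl_z_exact(x, y, z):
    # (curl u)_z = d(u_y)/dx - d(u_x)/dy = 2x - z
    return 2 * x - z


def circulation_per_area(px, py, pz, d):
    """Circulation of u around a square of side d centered at P, divided by d^2.

    The square lies in the plane z = pz and is traversed counterclockwise
    as seen from +z; each edge integral uses the value of the tangential
    component at the midpoint of the edge.
    """
    h = d / 2
    c = 0.0
    c += u(px, py - h, pz)[0] * d  # bottom edge, direction +x
    c += u(px + h, py, pz)[1] * d  # right edge, direction +y
    c -= u(px, py + h, pz)[0] * d  # top edge, direction -x
    c -= u(px - h, py, pz)[1] * d  # left edge, direction -y
    return c / (d * d)


P = (0.6, -0.2, 1.1)
estimate = circulation_per_area(*P, 1e-3)
exact = curl_z_exact(*P)
print(estimate, exact)
```

The other two components follow in the same way from squares in the planes x = const and y = const.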
As in the case of vectors in a plane, the vector fields satisfying div u = 0 are called solenoidal and those satisfying curl u = 0 are called irrotational. The concept of irrotational field is closely related to the concept of conservative field ($\mathbf{u} = \operatorname{grad}\varphi$) but these two fields should not be identified because of topological complications. In particular, if a field is irrotational in a domain D, it does not follow that its circulation about an arbitrary simple closed curve in D is zero. The point is that we may not be able to invoke Stokes’ theorem because we may not be able to construct a suitable spanning surface S which would lie inside D.
* The shape of the area element is irrelevant; because of this we can identify $dS_x$ in $d\mathbf{S} = dS_x\mathbf{i} + dS_y\mathbf{j} + dS_z\mathbf{k}$ with dy dz, etc.
Example. Consider a tightly wound coil of current-carrying wire in the shape of a torus. The B-field inside the torus is irrotational; indeed, curl B satisfies the equation*
$$\operatorname{curl}\mathbf{B} = \mu_0\mathbf{J} + \epsilon_0\mu_0\frac{\partial\mathbf{E}}{\partial t} \quad\text{(MKSA units)}$$
and there is no current density J and no displacement current in the interior of the torus; we assume a dc situation. However, the B-field at the center of each turn of the coil is known to be $B = \mu_0 n I$, yielding a circulation along the central circle C of the torus:
$$\oint_C (\mathbf{B}\cdot d\mathbf{s}) = \mu_0 n I\cdot 2\pi R \neq 0.$$
Figure 1.22
It is easy to see that any surface spanning C must necessarily extend outside the torus and cut through the windings where curl B ≠ 0.
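The same topological effect can be seen in a simpler plane analog, sketched below. This is not the field of the torus; it is a hypothetical stand-in, u = (−y, x)/(x² + y²), which is irrotational everywhere except at the origin. Its circulation around the unit circle is nevertheless 2π, because any surface spanning the circle must contain the origin, where the field is not defined.

```python
import math


def u(x, y):
    """Plane vortex field (-y, x) / (x^2 + y^2); undefined at the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)


def curl_z(x, y, h=1e-5):
    """Central-difference estimate of d(u_y)/dx - d(u_x)/dy at (x, y)."""
    duy_dx = (u(x + h, y)[1] - u(x - h, y)[1]) / (2 * h)
    dux_dy = (u(x, y + h)[0] - u(x, y - h)[0]) / (2 * h)
    return duy_dx - dux_dy


def circulation(radius, n=2000):
    """Midpoint-rule estimate of the circulation around a circle about the origin."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t) * dt, radius * math.cos(t) * dt
        ux, uy = u(x, y)
        total += ux * dx + uy * dy
    return total


print(curl_z(0.8, -0.3))   # close to zero away from the origin
print(circulation(1.0))    # close to 2*pi
```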
Since the quantities div u and curl u in a plane have been defined as area derivatives (p. 26) and since div u in space has been defined as a volume derivative (p. 27), it may be conjectured that curl u in space is also reducible to a volume derivative. This is in fact true and the formula reads
$$\operatorname{curl}\mathbf{u} = \lim_{\Delta V\to 0}\frac{\oiint [d\mathbf{S}\times\mathbf{u}]}{\Delta V},$$
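where the surface integral is over the boundary enclosing the volume $\Delta V$. We shall sketch the proof of this relation by considering $\Delta V$ in the form of a rectangular parallelepiped as shown in Fig. 1.22. The contributions to $\oiint [d\mathbf{S}\times\mathbf{u}]$ from the top face involve only $u_x\mathbf{i}$ and $u_y\mathbf{j}$ and yield, in the usual notation,
$$\iint_{\text{top face}} [d\mathbf{S}\times\mathbf{u}] \cong dx\,dy\left\{\left(u_x + \frac{\partial u_x}{\partial z}\frac{dz}{2}\right)_P\mathbf{j} - \left(u_y + \frac{\partial u_y}{\partial z}\frac{dz}{2}\right)_P\mathbf{i}\right\},$$
where P is the center of the parallelepiped. The bottom face will have dS with the direction reversed and
$$\iint_{\text{bottom face}} [d\mathbf{S}\times\mathbf{u}] \cong -dx\,dy\left\{\left(u_x - \frac{\partial u_x}{\partial z}\frac{dz}{2}\right)_P\mathbf{j} - \left(u_y - \frac{\partial u_y}{\partial z}\frac{dz}{2}\right)_P\mathbf{i}\right\}.$$
Adding these, we obtain
$$\left\{\left(\frac{\partial u_x}{\partial z}\right)_P\mathbf{j} - \left(\frac{\partial u_y}{\partial z}\right)_P\mathbf{i}\right\}dx\,dy\,dz.$$
Contributions from the other four faces are treated similarly, establishing the result.
* One of Maxwell’s equations; see, e.g., Reitz and Milford, pp. 296-297.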
Remark. The gradient of a scalar field $\varphi$ can also be represented as a volume derivative, namely
$$\operatorname{grad}\varphi = \lim_{\Delta V\to 0}\frac{\oiint \varphi\,d\mathbf{S}}{\Delta V}.$$
We shall conclude this section by mentioning some quantities obtained from the repeated operations involving gradient, divergence, and curl. First of all observe the identity
$$\operatorname{curl}\operatorname{grad}\varphi = 0$$
representing the statement that a conservative field is always irrotational (provided, of course, that the second-order partials of $\varphi$ are continuous, as mentioned on p. 23). The second operation of similar type yields the definition of the Laplace differential operator $\nabla^2$ or simply the Laplacian,*
$$\nabla^2\varphi = \operatorname{div}\operatorname{grad}\varphi,$$
with the well-known expression in cartesian coordinates
$$\nabla^2\varphi = \frac{\partial^2\varphi}{\partial x^2} + \frac{\partial^2\varphi}{\partial y^2} + \frac{\partial^2\varphi}{\partial z^2}.$$
Both of these operations have their counterparts for plane vectors. There are also two operations which are only possible for vectors in space: div curl u and curl curl u. Straightforward calculation in cartesian coordinates yields div curl u = 0. A more sophisticated argument of some interest is to take a closed surface S in a domain D where curl u is defined (and its components have continuous partials). Split S into two parts, $S_1$ and $S_2$, by a curve C, as illustrated in Fig. 1.23. By Stokes’ theorem,
$$\iint_{S_1} (\operatorname{curl}\mathbf{u}\cdot d\mathbf{S}_1) = \oint_C (\mathbf{u}\cdot d\mathbf{s}) = -\iint_{S_2} (\operatorname{curl}\mathbf{u}\cdot d\mathbf{S}_2);$$
the minus sign arises from the “wrong” orientation of $S_2$ with respect to C. Consequently,
$$\iint_{S_1} (\operatorname{curl}\mathbf{u}\cdot d\mathbf{S}_1) + \iint_{S_2} (\operatorname{curl}\mathbf{u}\cdot d\mathbf{S}_2) = \oiint_S (\operatorname{curl}\mathbf{u}\cdot d\mathbf{S}) = 0$$
for any closed surface in D. Applying the divergence theorem, we obtain
$$\iiint_V \operatorname{div}\operatorname{curl}\mathbf{u}\,dV = 0.$$
Then, since this is true for an arbitrary volume V in D, it follows that div curl u = 0.
* The symbol $\nabla^2$ is related to the so-called “nabla” or “del” operator
$$\nabla = \mathbf{i}\frac{\partial}{\partial x} + \mathbf{j}\frac{\partial}{\partial y} + \mathbf{k}\frac{\partial}{\partial z},$$
capable of representing gradient, divergence, and curl by means of the notations $\operatorname{grad}\varphi = \nabla\varphi$, $\operatorname{div}\mathbf{u} = (\nabla\cdot\mathbf{u})$, and $\operatorname{curl}\mathbf{u} = [\nabla\times\mathbf{u}]$, which are sometimes quite handy.
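Regarding the operation curl curl u, we can derive the following expression for the cartesian coordinates:
$$\operatorname{curl}\operatorname{curl}\mathbf{u} = \operatorname{grad}\operatorname{div}\mathbf{u} - \nabla^2\mathbf{u},$$
where the symbol $\nabla^2\mathbf{u}$ stands for the operation $\nabla^2 u_x\,\mathbf{i} + \nabla^2 u_y\,\mathbf{j} + \nabla^2 u_z\,\mathbf{k}$. This formula, however, is not valid for other coordinate systems.*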
Exercise. Verify the above formula using the expressions for divergence, gradient, and
curl in cartesian coordinates.
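Before working the exercise analytically, the formula can be spot-checked numerically. The sketch below is only a sanity check, not a proof: it assumes a sample field u = (x²y, yz², xyz) chosen for illustration and approximates every partial derivative by a central difference, so that curl, div, grad, and the componentwise Laplacian are all built from one helper.

```python
def partial(g, axis, h=1e-3):
    """Return a function approximating dg/dx_axis by a central difference."""
    def dg(p):
        plus, minus = list(p), list(p)
        plus[axis] += h
        minus[axis] -= h
        return (g(tuple(plus)) - g(tuple(minus))) / (2 * h)
    return dg


def component(F, i):
    return lambda p: F(p)[i]


def curl(F):
    def c(p):
        d = lambda i, j: partial(component(F, i), j)(p)  # dF_i / dx_j
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return c


def div(F):
    return lambda p: sum(partial(component(F, i), i)(p) for i in range(3))


def grad(g):
    return lambda p: tuple(partial(g, j)(p) for j in range(3))


def vector_laplacian(F):
    """Cartesian Laplacian applied to each component of F."""
    def lap(p):
        return tuple(
            sum(partial(partial(component(F, i), j), j)(p) for j in range(3))
            for i in range(3)
        )
    return lap


def u(p):
    """Hypothetical sample field, for illustration only."""
    x, y, z = p
    return (x * x * y, y * z * z, x * y * z)


point = (0.5, -1.2, 0.8)
lhs = curl(curl(u))(point)
rhs = tuple(a - b for a, b in zip(grad(div(u))(point), vector_laplacian(u)(point)))
print(lhs)
print(rhs)
```

For this field both sides should come out close to (y, 3x − 2y, 2z) evaluated at the chosen point.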
Figure 1.23
* Of course, it is possible to define the operation $\nabla^2\mathbf{u}$ by (grad div u − curl curl u) in any coordinate system, but this is not done since $\nabla^2\mathbf{u}$ does not reduce to the operation “div grad” applied to components of u.
1.9 CURVILINEAR COORDINATES
Sometimes it is more convenient to use coordinate systems other than cartesian. In general, a point in space can be described by three parameters which we will denote by l, m, n. A well-known example is given by spherical coordinates $r, \theta, \phi$, as shown in Fig. 1.24, along with the usual cartesian coordinates x, y, z which are related to $r, \theta, \phi$ by
$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$
In general, x, y, z can be thought of as being functions of l, m, n:
$$x = x(l, m, n), \qquad y = y(l, m, n), \qquad z = z(l, m, n).$$
We shall assume that at least within some domain D in space, these functions have continuous derivatives and can also be solved for l, m, n:
$$l = l(x, y, z), \qquad m = m(x, y, z), \qquad n = n(x, y, z).$$
Observe that this implies that the Jacobian
$$J = \frac{\partial(x, y, z)}{\partial(l, m, n)}$$
does not vanish.* Let us now choose a particular point M with cartesian coordinates $(\xi, \eta, \zeta)$. It can also be denoted as $M(\lambda, \mu, \nu)$, in terms of the coordinates l, m, n. If we keep $m = \mu = \text{const}$ and $n = \nu = \text{const}$ and change l, then we obtain a (smooth) curve passing through M which may be called the l-curve. Similarly, we can define the m-curve and the n-curve. This is shown in Fig. 1.25. Furthermore, we can introduce unit vectors $\mathbf{l}_0$, $\mathbf{m}_0$, and $\mathbf{n}_0$ along the tangents to these curves (pointed in the direction of increasing l, m, n). This establishes a local system of axes. For convenience, the labels l, m, n are chosen so that $(\mathbf{l}_0, \mathbf{m}_0, \mathbf{n}_0)$ form a right-handed triple.
Figure 1.25
Our local systems possess, in general, the following features which distinguish
them from the system formed by cartesian unit vectors i, j, and k.
1. The axes may not be orthogonal; moreover, the angles between the axes may
change from one point to another.
2. The orientation of $\mathbf{l}_0$, $\mathbf{m}_0$, $\mathbf{n}_0$ (with respect to i, j, k) may change from one point to another, even if the angles between the axes remain the same.
3. The physical meaning of parameters l, m, n may not be the length, and dl, dm, dn need not be identical with the elements ds of arc in the respective directions.
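* See, e.g., Kaplan, pp. 31 et seq.
Let us investigate properties (1), (2), and (3). We can always think of point M as being defined by a position vector $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$. Treating x, y, z as functions of l, m, n we can write
$$\mathbf{r} = x(l, m, n)\,\mathbf{i} + y(l, m, n)\,\mathbf{j} + z(l, m, n)\,\mathbf{k}.$$
Changing r by dr amounts to changing x, y, z by dx, dy, dz which is, in turn, caused by changing l, m, n by dl, dm, dn. We have the following general relations:
$$dx = \frac{\partial x}{\partial l}dl + \frac{\partial x}{\partial m}dm + \frac{\partial x}{\partial n}dn,$$
$$dy = \frac{\partial y}{\partial l}dl + \frac{\partial y}{\partial m}dm + \frac{\partial y}{\partial n}dn,$$
$$dz = \frac{\partial z}{\partial l}dl + \frac{\partial z}{\partial m}dm + \frac{\partial z}{\partial n}dn.$$
If we move along the l-curve, then dm = dn = 0, and dr becomes*
$$(d\mathbf{r})_{m,n} = (dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k})_{m,n} = \left(\frac{\partial x}{\partial l}\mathbf{i} + \frac{\partial y}{\partial l}\mathbf{j} + \frac{\partial z}{\partial l}\mathbf{k}\right)dl.$$
This defines the derivative of r with respect to the parameter l:
$$\frac{\partial\mathbf{r}}{\partial l} = \frac{(d\mathbf{r})_{m,n}}{dl} = \frac{\partial x}{\partial l}\mathbf{i} + \frac{\partial y}{\partial l}\mathbf{j} + \frac{\partial z}{\partial l}\mathbf{k}.$$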
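By its very meaning, $\partial\mathbf{r}/\partial l$ is a vector along the direction of $\mathbf{l}_0$. Therefore, $\mathbf{l}_0$ can be expressed as
$$\mathbf{l}_0 = \frac{\partial\mathbf{r}/\partial l}{|\partial\mathbf{r}/\partial l|} = \frac{(\partial x/\partial l)\,\mathbf{i} + (\partial y/\partial l)\,\mathbf{j} + (\partial z/\partial l)\,\mathbf{k}}{\sqrt{(\partial x/\partial l)^2 + (\partial y/\partial l)^2 + (\partial z/\partial l)^2}}.$$
The quantity
$$h_l = \sqrt{\left(\frac{\partial x}{\partial l}\right)^2 + \left(\frac{\partial y}{\partial l}\right)^2 + \left(\frac{\partial z}{\partial l}\right)^2}$$
has a simple geometric interpretation: The length of elementary arc ds produced when only l changes is given by $ds = |(d\mathbf{r})_{m,n}| = h_l\,dl$.
In a similar fashion we deduce
$$\mathbf{m}_0 = \frac{(\partial x/\partial m)\,\mathbf{i} + (\partial y/\partial m)\,\mathbf{j} + (\partial z/\partial m)\,\mathbf{k}}{h_m}, \qquad h_m = \sqrt{\left(\frac{\partial x}{\partial m}\right)^2 + \left(\frac{\partial y}{\partial m}\right)^2 + \left(\frac{\partial z}{\partial m}\right)^2},$$
$$\mathbf{n}_0 = \frac{(\partial x/\partial n)\,\mathbf{i} + (\partial y/\partial n)\,\mathbf{j} + (\partial z/\partial n)\,\mathbf{k}}{h_n}, \qquad h_n = \sqrt{\left(\frac{\partial x}{\partial n}\right)^2 + \left(\frac{\partial y}{\partial n}\right)^2 + \left(\frac{\partial z}{\partial n}\right)^2}.$$
* We use the notation familiar from thermodynamics: $(d\mathbf{r})_{m,n}$ means such dr where m and n are kept constant.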
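Suppose now that the triple $\mathbf{l}_0$, $\mathbf{m}_0$, $\mathbf{n}_0$ is an orthogonal triple. Then we must have the relations
$$\frac{\partial x}{\partial l}\frac{\partial x}{\partial m} + \frac{\partial y}{\partial l}\frac{\partial y}{\partial m} + \frac{\partial z}{\partial l}\frac{\partial z}{\partial m} = 0,$$
together with the two analogous relations for the pairs (m, n) and (n, l).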
These relations are satisfied for most coordinate systems employed in physics.
In particular, this is true for spherical and cylindrical coordinate systems, as
can be easily verified.
This analysis clarifies feature (1) of local systems of axes. Regarding the
orientation of local axes, note that it does indeed vary from point to point for
spherical and cylindrical coordinate systems. In fact, this is the characteristic
property of curvilinear coordinate systems, as opposed to cartesian ones. Feature (3) is also illustrated by spherical and cylindrical coordinates where some of the parameters l, m, n represent angles rather than lengths. As a general rule, the elementary displacement dr decomposed along the local system of axes will read
$$d\mathbf{r} = \frac{\partial\mathbf{r}}{\partial l}dl + \frac{\partial\mathbf{r}}{\partial m}dm + \frac{\partial\mathbf{r}}{\partial n}dn = h_l\,dl\,\mathbf{l}_0 + h_m\,dm\,\mathbf{m}_0 + h_n\,dn\,\mathbf{n}_0.$$
Let us assume that the local system is orthogonal;* then the element of arc is given by a simple formula,
$$ds = |d\mathbf{r}| = \sqrt{h_l^2\,dl^2 + h_m^2\,dm^2 + h_n^2\,dn^2}.$$
For instance, for spherical coordinates, $h_r = 1$, $h_\theta = r$, $h_\phi = r\sin\theta$, and
$$ds = \sqrt{dr^2 + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2}.$$
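The scale factors quoted for spherical coordinates can be reproduced directly from their definition. The sketch below is a numerical illustration under simple assumptions (central differences with a fixed step, one arbitrarily chosen point): it forms the tangent vectors ∂r/∂l, ∂r/∂m, ∂r/∂n, takes their lengths, and also checks that they are mutually orthogonal.

```python
import math


def position(r, theta, phi):
    """Cartesian components of the position vector for spherical (r, theta, phi)."""
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


def tangent(coords, index, step=1e-6):
    """Central-difference estimate of the derivative of r with respect to one parameter."""
    plus, minus = list(coords), list(coords)
    plus[index] += step
    minus[index] -= step
    a, b = position(*plus), position(*minus)
    return tuple((p - q) / (2 * step) for p, q in zip(a, b))


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


point = (2.0, 0.9, 0.4)  # r, theta, phi (arbitrary illustrative values)
t_r, t_theta, t_phi = (tangent(point, i) for i in range(3))
h_r, h_theta, h_phi = norm(t_r), norm(t_theta), norm(t_phi)

print(h_r, h_theta, h_phi)  # compare with 1, r, r*sin(theta)
print(dot(t_r, t_theta), dot(t_theta, t_phi), dot(t_phi, t_r))  # all near zero
```

Replacing `position` by the corresponding map for another coordinate system gives its scale factors in the same way.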
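We shall conclude our survey of curvilinear coordinates by the derivation of formulas for common differential operations in vector calculus. In order to express $\operatorname{grad}\varphi$ in terms of new axes and new variables, we could start with
$$\operatorname{grad}\varphi = \frac{\partial\varphi}{\partial x}\mathbf{i} + \frac{\partial\varphi}{\partial y}\mathbf{j} + \frac{\partial\varphi}{\partial z}\mathbf{k},$$
then use
$$\frac{\partial\varphi}{\partial x} = \frac{\partial\varphi}{\partial l}\frac{\partial l}{\partial x} + \frac{\partial\varphi}{\partial m}\frac{\partial m}{\partial x} + \frac{\partial\varphi}{\partial n}\frac{\partial n}{\partial x}, \quad\text{etc.},$$
and also express i, j, k in terms of $\mathbf{l}_0$, $\mathbf{m}_0$, $\mathbf{n}_0$. A quicker way is to utilize the statement
$$(\operatorname{grad}\varphi\cdot d\mathbf{r}) = d\varphi = \frac{\partial\varphi}{\partial l}dl + \frac{\partial\varphi}{\partial m}dm + \frac{\partial\varphi}{\partial n}dn,$$
which, with $d\mathbf{r} = h_l\,dl\,\mathbf{l}_0 + h_m\,dm\,\mathbf{m}_0 + h_n\,dn\,\mathbf{n}_0$, yields*
$$\operatorname{grad}\varphi = \frac{1}{h_l}\frac{\partial\varphi}{\partial l}\mathbf{l}_0 + \frac{1}{h_m}\frac{\partial\varphi}{\partial m}\mathbf{m}_0 + \frac{1}{h_n}\frac{\partial\varphi}{\partial n}\mathbf{n}_0.$$
* The associated coordinates are then called orthogonal coordinates.
Figure 1.26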
The calculation of divergence can also be carried out starting from the general definition
$$\operatorname{div}\mathbf{u} = \lim_{\Delta V\to 0}\frac{\oiint (\mathbf{u}\cdot d\mathbf{S})}{\Delta V}.$$
Without loss of generality, $\Delta V$ can be taken as a volume element with the sides along the l-, m-, and n-curves (Fig. 1.26a). In general, the flux through an elementary area oriented in the $\mathbf{l}_0$-direction is given by
$$u_l\cdot h_m\,dm\cdot h_n\,dn.$$
As we subtract the fluxes through the areas M′N′P′Q′ and MNPQ we must not forget that not only $u_l$, but also $h_m$ and $h_n$, are functions of l, m, and n.† By an argument similar to that on pp. 27-28 (we give here a simplified version), we deduce that the net outward flux through these two faces is
$$\frac{\partial}{\partial l}(u_l h_m h_n)\,dl\,dm\,dn.$$
Adding the analogous contributions from the other four faces and dividing by the volume of our volume element (which is $h_l\,dl\,h_m\,dm\,h_n\,dn$), we obtain immediately
$$\operatorname{div}\mathbf{u} = \frac{1}{h_l h_m h_n}\left\{\frac{\partial}{\partial l}(u_l h_m h_n) + \frac{\partial}{\partial m}(u_m h_n h_l) + \frac{\partial}{\partial n}(u_n h_l h_m)\right\}.$$
* Recall that dr is arbitrary; therefore, by setting dm = dn = 0 we obtain $(\operatorname{grad}\varphi)_l = (1/h_l)(\partial\varphi/\partial l)$, etc. We tacitly assume that the coordinates are orthogonal.
† In cartesian coordinates $h_l = h_m = h_n = 1$.
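Example. In spherical coordinates we identify l, m, n with $r, \theta, \phi$, in that order. Then $h_l = h_r = 1$, $h_m = h_\theta = r$, $h_n = h_\phi = r\sin\theta$, and
$$\operatorname{div}\mathbf{u} = \frac{1}{r^2\sin\theta}\left\{\frac{\partial}{\partial r}(r^2\sin\theta\,u_r) + \frac{\partial}{\partial\theta}(r\sin\theta\,u_\theta) + \frac{\partial}{\partial\phi}(r\,u_\phi)\right\}.$$
This may be simplified and, if desirable, ultimately reduced to
$$\operatorname{div}\mathbf{u} = \frac{\partial u_r}{\partial r} + \frac{2}{r}u_r + \frac{1}{r}\frac{\partial u_\theta}{\partial\theta} + \frac{\cot\theta}{r}u_\theta + \frac{1}{r\sin\theta}\frac{\partial u_\phi}{\partial\phi}.$$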
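The curl of u can be deduced from the circulation of u around the faces of the very same volume element. For instance, the face MNPQ yields (Fig. 1.26)
$$\int_N^P (\mathbf{u}\cdot d\mathbf{s}) + \int_Q^M (\mathbf{u}\cdot d\mathbf{s}) = (u_n h_n\,dn)_{NP} - (u_n h_n\,dn)_{MQ} = \frac{\partial}{\partial m}(u_n h_n)\,dm\,dn,$$
which is combined with
$$\int_P^Q (\mathbf{u}\cdot d\mathbf{s}) + \int_M^N (\mathbf{u}\cdot d\mathbf{s}) = -(u_m h_m\,dm)_{PQ} + (u_m h_m\,dm)_{MN} = -\frac{\partial}{\partial n}(u_m h_m)\,dn\,dm$$
to form
$$(\operatorname{curl}\mathbf{u})_l\,h_m\,dm\,h_n\,dn = \left\{\frac{\partial}{\partial m}(u_n h_n) - \frac{\partial}{\partial n}(u_m h_m)\right\}dm\,dn.$$
This determines the $\mathbf{l}_0$-component of curl u. The complete formula reads
$$\operatorname{curl}\mathbf{u} = \frac{1}{h_m h_n}\left\{\frac{\partial}{\partial m}(u_n h_n) - \frac{\partial}{\partial n}(u_m h_m)\right\}\mathbf{l}_0 + \frac{1}{h_n h_l}\left\{\frac{\partial}{\partial n}(u_l h_l) - \frac{\partial}{\partial l}(u_n h_n)\right\}\mathbf{m}_0 + \frac{1}{h_l h_m}\left\{\frac{\partial}{\partial l}(u_m h_m) - \frac{\partial}{\partial m}(u_l h_l)\right\}\mathbf{n}_0.$$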
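Finally, the expression for the Laplacian $\nabla^2$ is obtained by combining the formulas for gradient and divergence:
$$\nabla^2\varphi = \operatorname{div}\operatorname{grad}\varphi = \frac{1}{h_l h_m h_n}\left\{\frac{\partial}{\partial l}\left(\frac{h_m h_n}{h_l}\frac{\partial\varphi}{\partial l}\right) + \frac{\partial}{\partial m}\left(\frac{h_n h_l}{h_m}\frac{\partial\varphi}{\partial m}\right) + \frac{\partial}{\partial n}\left(\frac{h_l h_m}{h_n}\frac{\partial\varphi}{\partial n}\right)\right\}.$$
For instance, in the spherical system this reads (after trivial simplification)
$$\nabla^2\varphi = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\varphi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial\varphi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\varphi}{\partial\phi^2}.$$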
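The spherical form of the Laplacian can be compared numerically with the cartesian one. The following sketch assumes a test function x²z, chosen only for illustration, whose cartesian Laplacian is 2z; each of the three terms of the spherical expression is approximated by a standard second-order finite difference.

```python
import math


def f_cartesian(x, y, z):
    """Hypothetical test function x^2 z; its cartesian Laplacian is 2 z."""
    return x * x * z


def f(r, theta, phi):
    """The same function expressed through spherical coordinates."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return f_cartesian(x, y, z)


def laplacian_spherical(r, theta, phi, h=1e-3):
    """Finite-difference estimate of the spherical Laplacian of f."""
    f0 = f(r, theta, phi)
    # d/dr (r^2 df/dr)
    radial = ((r + h / 2) ** 2 * (f(r + h, theta, phi) - f0)
              - (r - h / 2) ** 2 * (f0 - f(r - h, theta, phi))) / h**2
    # d/dtheta (sin(theta) df/dtheta)
    polar = (math.sin(theta + h / 2) * (f(r, theta + h, phi) - f0)
             - math.sin(theta - h / 2) * (f0 - f(r, theta - h, phi))) / h**2
    # d^2 f / dphi^2
    azimuthal = (f(r, theta, phi + h) - 2 * f0 + f(r, theta, phi - h)) / h**2
    return (radial / r**2
            + polar / (r**2 * math.sin(theta))
            + azimuthal / (r**2 * math.sin(theta) ** 2))


point = (1.5, 0.8, 0.3)  # r, theta, phi (arbitrary illustrative values)
numeric = laplacian_spherical(*point)
expected = 2 * point[0] * math.cos(point[1])  # 2 z
print(numeric, expected)
```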
BIBLIOGRAPHY
Apostol, T. M., Mathematical Analysis, Reading, Mass.: Addison-Wesley Publishing Co., 1957.
Goldstein, H., Classical Mechanics, Reading, Mass.: Addison-Wesley Publishing Co., 1959.
Kaplan, W., Advanced Calculus, Reading, Mass.: Addison-Wesley Publishing Co., 1957.
Kellogg, O. D., Foundations of Potential Theory, Berlin: Springer Verlag OHG, 1929.
Taylor, A. E., Advanced Calculus, Boston: Ginn & Company, 1955.
Reitz, J. R. and F. J. Milford, Foundations of Electromagnetic Theory, Reading, Mass.: Addison-Wesley Publishing Co., 1960.
PROBLEMS
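1. Let two vectors in a plane, $\mathbf{u}_1$ and $\mathbf{u}_2$, be defined by the polar coordinates of their tips: $(\theta_1, r_1)$ and $(\theta_2, r_2)$. If $\mathbf{u}_3 = \mathbf{u}_1 + \mathbf{u}_2$ is defined by $(\theta_3, r_3)$, show how $\theta_3$ and $r_3$ are related to $\theta_1$, $\theta_2$, $r_1$, and $r_2$.
2. A triple vector product of three vectors is defined by the expression $[\mathbf{u}\times[\mathbf{v}\times\mathbf{w}]]$. Show that for any three vectors the following identity holds:
$$[\mathbf{u}\times[\mathbf{v}\times\mathbf{w}]] + [\mathbf{v}\times[\mathbf{w}\times\mathbf{u}]] + [\mathbf{w}\times[\mathbf{u}\times\mathbf{v}]] = 0.$$
Hint: Use the vector identity
$$[\mathbf{a}\times[\mathbf{b}\times\mathbf{c}]] = \mathbf{b}(\mathbf{a}\cdot\mathbf{c}) - \mathbf{c}(\mathbf{a}\cdot\mathbf{b}).$$
The above formula, known as the Jacobi identity, appears in a variety of contexts in physics and mathematics.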
3. Consider the following three vectors in space given by their coordinates
0G -3 DP, WEED, w-h-$9-
a) Verify that these vectors are unit vectors, orthogonal to each other, and form a
right-handed triple, if ordered as above.
b) Construct the rotation matrix transforming the old components of a vector (namely those with respect to i, j, k) to the new ones (with respect to u, v, w).
c) Evaluate, by vector-matrix multiplication, the new coordinates of the vectors a(0, 3, 2), b(−1, 4, −3), and c(2, −2, −2). Can you give a geometrical interpretation of the peculiar behavior of vector c?
4. a) Show that the triple product (p. 3) of vectors $\mathbf{u}(u_1, u_2, u_3)$, $\mathbf{v}(v_1, v_2, v_3)$, and $\mathbf{w}(w_1, w_2, w_3)$ can be expressed by the determinant
$$\det\begin{pmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{pmatrix} = ([\mathbf{u}\times\mathbf{v}]\cdot\mathbf{w}).$$
b) Using this, prove that if a 3 × 3 matrix is orthogonal, then its determinant can have only two values, either +1 or −1.
c) Consider the matrices
100, v3/2 4 0
A= Of 0s 4 -Vv3/2 Oe
0 0 0 0 -1
-1 0 0 23/5 -3V3/10 $
c= o-1 oj, D= z ete vo/2 |.
O01 2 4 0
and indicate which ones represent rotations. Also, describe the geometrical
meaning of others.
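5. Compare, in general, the ij-th matrix element of AB with that of BA, for 3 × 3 matrices A and B. Construct two noncommuting 3 × 3 matrices of your choice, i.e., such that AB ≠ BA.
6. According to the discussion on p. 4, the matrix
$$A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$$
represents a rotation of the coordinate axes in a plane. Show that
$$A^2 = AA = \begin{pmatrix} \cos 2\theta & \sin 2\theta \\ -\sin 2\theta & \cos 2\theta \end{pmatrix} \quad\text{and}\quad A^3 = AAA = \begin{pmatrix} \cos 3\theta & \sin 3\theta \\ -\sin 3\theta & \cos 3\theta \end{pmatrix}$$
and give the geometrical interpretation of this result.
7. Show that the matrix
$$B = \begin{pmatrix} \cos\phi & \sin\phi \\ \sin\phi & -\cos\phi \end{pmatrix}$$
does not represent a rotation of axes. Give a geometrical interpretation of matrix B. [Hint: Draw the old and the new coordinate axes, as well as the straight line $y = x\tan(\phi/2)$.]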
8. Find the inverses of the following matrices by solving the equations B.A, = I or
otherwise:
2-10 4 3 3
A= eee Az 0 -1/v2 1/2 |,
35014: 22/3 -V2/6 —V2/6
-3 -3 % 1 Ky ri
ee eee ae 1 -1 oe
4% 2 0 1
Comment on the cases A, As, and Ag. ;
9. Let (x′, y′) be the coordinates of a point in a skew cartesian system in a plane. Let the x′- and y′-axes make angles $\alpha$ and $\beta$, respectively, with the x-axis (Fig. 1.27). Show that the equation of a circle with radius R and the center at the origin reads
$$x'^2 + y'^2 + 2x'y'\cos(\beta - \alpha) = R^2.$$
Figure 1.27
10. Show that the vector $\mathbf{v} = 2\mathbf{i} + \mathbf{j} - 6\mathbf{k}$ cannot be expressed as a linear combination of the vectors
$$\mathbf{u}_1 = \mathbf{i} + \mathbf{j} + 2\mathbf{k}, \qquad \mathbf{u}_2 = 3\mathbf{i} - \mathbf{j}, \qquad \mathbf{u}_3 = 2\mathbf{i} + \mathbf{k}.$$
Show that the vector $\mathbf{w} = -2\mathbf{j} - 3\mathbf{k}$ can be expressed in this fashion, in more than one way. Give the algebraic explanation of these facts. Also give a geometrical interpretation.
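11. Evaluate the following integrals around the circle $x^2 + y^2 = 1$; use Green’s theorem if it is convenient:
a) $\oint (\mathbf{u}\cdot d\mathbf{s})$, where $\mathbf{u} = (2y^2 - 3x^2y)\,\mathbf{i} + (xy - x^3)\,\mathbf{j}$,
b) $\oint (2x^2 - y^3)\,dx + (x^3 + y^3)\,dy$,
c) $\oint (\mathbf{v}\cdot d\mathbf{n})$, where $\mathbf{v} = (x^2 + y^2)\,\mathbf{i} - 2xy\,\mathbf{j}$ (dn is defined on p. 24).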
12. Let F(x, y) = x? — »?. Evaluate
a) J&B) (grad F - ds) along the curve y = x°,
b) $\oint \dfrac{\partial F}{\partial n}\,ds$ around the circle $x^2 + y^2 = 1$. Here $\partial F/\partial n$ is the directional derivative of F along the outer normal and $ds = |d\mathbf{s}|$.
13. Show that the vector field u = yzi + zxj + xyk is both irrotational and solenoidal.
Find $\varphi$ such that $\operatorname{grad}\varphi = \mathbf{u}$. Can you find a vector field A such that curl A = u?
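14. Prove the following identities for scalar fields $f$, $\varphi$ and vector fields u, v in space:
a) $\operatorname{grad}(f\varphi) = f\operatorname{grad}\varphi + \varphi\operatorname{grad}f$,
b) $\operatorname{curl}(f\mathbf{u}) = f\operatorname{curl}\mathbf{u} + [\operatorname{grad}f\times\mathbf{u}]$,
c) $\operatorname{div}[\mathbf{u}\times\mathbf{v}] = (\mathbf{v}\cdot\operatorname{curl}\mathbf{u}) - (\mathbf{u}\cdot\operatorname{curl}\mathbf{v})$.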
15. Using the divergence and Stokes’ theorems, if it is convenient, calculate the following integrals:
a) $\oiint_S (\mathbf{u}\cdot d\mathbf{S})$, where $\mathbf{u} = x^3\mathbf{i} + y^3\mathbf{j} + z^3\mathbf{k}$ and S is the sphere of radius R about the origin,
b) $\oiint_S (\mathbf{v}\cdot d\mathbf{S})$, where $\mathbf{v} = x^5\mathbf{i} + y^5\mathbf{j} + z^5\mathbf{k}$ and S is the sphere as in (a),
c) $\oiint_S (x\,dy\,dz + y\,dz\,dx + z\,dx\,dy)$, where S is the sphere as in (a),
d) $\oint_\Gamma (\mathbf{w}\cdot d\mathbf{s})$, where $\mathbf{w} = -3y\,\mathbf{i} + 3x\,\mathbf{j} + \mathbf{k}$ and $\Gamma$ is the circle $x^2 + y^2 = 1$ lying in the plane z = 2.
16. A flat disk rotates about the axis normal to its plane and passing through its center. Show that the velocity vector v of any point on the disk satisfies the equation
$$\operatorname{curl}\mathbf{v} = 2\boldsymbol{\omega},$$
where $\boldsymbol{\omega}$ is the angular velocity vector.
17. Consider a conducting medium with variable charge density $\rho(x, y, z)$ and variable current density $\mathbf{J}(x, y, z)$. Let V be an arbitrary volume within the medium bounded by a (piecewise smooth) closed surface S. Considering the total amount of charge inside V and the amount entering it per unit time through the surface S, deduce that
$$\frac{d}{dt}\iiint_V \rho\,dV = -\oiint_S (\mathbf{J}\cdot d\mathbf{S}).$$
With the help of the divergence theorem, deduce the so-called equation of continuity
$$\operatorname{div}\mathbf{J} + \frac{\partial\rho}{\partial t} = 0.$$
18. Using the techniques employed in Sections 1.7 and 1.8, outline the possible proofs of the following statements:
$$\operatorname{grad}\varphi = \lim_{\Delta S\to 0}\frac{\oint \varphi\,d\mathbf{n}}{\Delta S} \ \text{(in a plane)}, \qquad \operatorname{grad}\varphi = \lim_{\Delta V\to 0}\frac{\oiint \varphi\,d\mathbf{S}}{\Delta V} \ \text{(in space)}.$$
19. Evaluate the quantities $h_l$, $h_m$, $h_n$ (see p. 36) for the cylindrical coordinate system. Using appropriate formulas from Section 1.9, write the expressions for gradient, divergence, curl, and the Laplacian in cylindrical coordinates.
CHAPTER 2
FUNCTIONS OF A COMPLEX VARIABLE
2.1 COMPLEX NUMBERS
In the course of study of roots of algebraic equations and in particular the cubic
equation, it has been found convenient to introduce the concept of a number
whose square is equal to —1. By a well-established tradition, this number is
denoted by i, and we write $i^2 = -1$ and $i = \sqrt{-1}$. If we allow i to be multiplied by real numbers, we obtain the so-called imaginary numbers* of the form bi (where b is real). If the usual rules of multiplication are extended to imaginary numbers, then we must conclude that the products of imaginary numbers are real numbers; moreover, their squares are negative real numbers. For instance,
$$(3i)(-4i) = (3)(-4)\,i^2 = (-12)(-1) = 12,$$
$$(-5i)^2 = (-5)^2\,i^2 = -25.$$
If imaginary numbers are adjoined to real numbers, we have a system within
which we can perform multiplication and division (except by zero, of course).
We say that such a system is closed under multiplication and division. However,
our system is not closed under addition and subtraction.† To eliminate this deficiency, so-called complex numbers are introduced. These are numbers which are
most often written in the form
a+ bi (a,b = real numbers)
and are assumed to obey appropriate algebraic rules. As will be shown below,
the system of complex numbers is closed under addition, subtraction, multiplica-
tion, and division plus the “extraction of roots” operation. In short, it has all the
desirable algebraic characteristics and represents an extension of the real number
system. The study of complex numbers is invaluable for every physicist because
the description of physical laws is much more complicated without them.
* Imaginary numbers are also called pure imaginary numbers to stress the distinction from
the more general case of complex numbers. The name originated from the belief that
imaginary numbers, as well as complex numbers, do not represent directly observable
quantities in nature. While this point of view is now mostly abandoned, the original
nomenclature still exists.
† The system is not closed under the operation of extraction of the square root either; for example, $\sqrt{i}$ is neither real nor (pure) imaginary.
2.2 BASIC ALGEBRA AND GEOMETRY OF COMPLEX NUMBERS
If complex numbers are written in the usual form a + ib (or a + bi) then the usual algebraic operations with them are defined as follows.
1. Addition:
$$(a_1 + ib_1) + (a_2 + ib_2) = (a_1 + a_2) + i(b_1 + b_2).$$
2. Multiplication:
$$(a_1 + ib_1)(a_2 + ib_2) = (a_1a_2 - b_1b_2) + i(a_1b_2 + a_2b_1).$$
The second rule is easy to follow if we recognize that the expressions a + ib are multiplied in the same manner as binomials, using the distributive and associative laws, and $i^2$ is replaced by −1.
Complex numbers of the form a + i0 are tacitly identified with real numbers
since they obey the same algebraic rules and are generally indistinguishable from
each other.* Complex numbers of the form 0 + ib are then (pure) imaginary
numbers. It is customary to write simply a + i0 = a and 0 + ib = ib. Subtraction of complex numbers can be defined as inverse addition so that if
$$(a_1 + ib_1) - (a_2 + ib_2) = x + iy,$$
then
$$a_1 + ib_1 = (x + iy) + (a_2 + ib_2),$$
from which it follows that†
$$x = a_1 - a_2 \quad\text{and}\quad y = b_1 - b_2.$$
An alternative is to form the negative of a complex number,
$$-(a + ib) = (-1)(a + ib) = (-1 + i0)(a + ib) = -a - ib,$$
and reduce the subtraction to addition.
The rule for division can be similarly deduced by inverting the multiplication. A shortcut method is given by the following technique:
$$\frac{a + ib}{c + id} = \frac{a + ib}{c + id}\cdot\frac{c - id}{c - id} = \frac{(ac + bd) + i(bc - ad)}{c^2 + d^2} \qquad (c^2 + d^2 \neq 0).$$
It is readily seen that the divisor can be any complex number except zero (namely the number 0 + i0, which is unique and is written simply 0).
* In a more rigorous language, “the subset of complex numbers of the form a + i0 is isomorphic to the set of real numbers under the correspondence a + i0 ↔ a.”
† It is tacitly postulated that $x_1 + iy_1 = x_2 + iy_2$ if and only if $x_1 = x_2$ and $y_1 = y_2$.
Remarks
1. The addition of complex numbers obeys the same rule as the addition of vectors in a plane, provided a and b are identified with components of a vector. Note, however, that the multiplication of complex numbers differs from the formation of dot and cross products of vectors.
2. The use of the symbol i and the related binomial a + ib is conventional, but not indispensable. It is possible to define a complex number as a pair of real numbers, (a, b), obeying certain peculiar rules, e.g., the multiplication can be defined by
$$(a_1, b_1)(a_2, b_2) = (a_1a_2 - b_1b_2,\ a_1b_2 + a_2b_1),$$
and so on. It should be clear that the form a + ib is just a representation of a complex number.
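The pair definition in Remark 2 translates directly into code. The sketch below is one possible illustration; the function names `add`, `mul`, and `div` are hypothetical and chosen here only for readability. It implements the addition, multiplication, and division rules stated above on pairs (a, b) and compares the results with Python's built-in `complex` type.

```python
def add(p, q):
    """(a1, b1) + (a2, b2) = (a1 + a2, b1 + b2)."""
    return (p[0] + q[0], p[1] + q[1])


def mul(p, q):
    """(a1, b1)(a2, b2) = (a1 a2 - b1 b2, a1 b2 + a2 b1)."""
    a1, b1 = p
    a2, b2 = q
    return (a1 * a2 - b1 * b2, a1 * b2 + a2 * b1)


def div(p, q):
    """(a + ib)/(c + id) = ((ac + bd) + i(bc - ad)) / (c^2 + d^2)."""
    a, b = p
    c, d = q
    denom = c * c + d * d
    if denom == 0:
        raise ZeroDivisionError("division by the complex number 0 + i0")
    return ((a * c + b * d) / denom, (b * c - a * d) / denom)


p, q = (3, -2), (1, 4)
print(add(p, q), complex(*p) + complex(*q))
print(mul(p, q), complex(*p) * complex(*q))
print(div(p, q), complex(*p) / complex(*q))
print(mul((0, 3), (0, -4)))  # the product (3i)(-4i) from Section 2.1
```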
It is customary to represent complex numbers by points in the so-called complex plane, or Argand diagram (Fig. 2.1). If we denote the complex number x + iy by a single symbol z and write z = x + iy, then to each z there corresponds a point in the complex plane with the abscissa x and the ordinate y. This idea also leads us to the trigonometric representation of a complex number:
$$z = r(\cos\theta + i\sin\theta),$$
where $r = \sqrt{x^2 + y^2}$ and $\tan\theta = y/x$. In this representation r is unique (positive square root) but $\theta$ is not. A common convention is to demand that†
$$-\pi < \theta \le \pi,$$
along with the standard rule of quadrants, namely, $\theta < 0$ if $y < 0$.
Figure 2.1 (complex plane: real axis, imaginary axis, point P = x + iy)
The following nomenclature and notation will be widely used: If
$$z = x + iy = r(\cos\theta + i\sin\theta)$$
then
x = Re z is the real part of z,
y = Im z is the imaginary part of z,
r = |z| is the modulus of z, also known as the magnitude or absolute value of z,
$\theta$ is the argument of z, also called the polar angle or phase.‡
The number x − iy is called the complex conjugate of the number z = x + iy and vice versa. We shall denote it by z*. We can say that z and z* represent (on the complex plane) the reflections of each other with respect to the real axis.
† Another commonly used convention is $0 \le \theta < 2\pi$.
‡ A more precise name for $\theta$ would be the “principal value of the argument of z” (see p. 57).
Remarks
1. The quantity zz* is always a nonnegative real number equal to $|z|^2$ or to $|z^*|^2$ (which are the same).
2. The quantity z + z* is always a real number, equal to 2 Re z or to 2 Re z* (which are the same).
3. The rules $(z_1 + z_2)^* = z_1^* + z_2^*$ and $(z_1z_2)^* = z_1^*z_2^*$ are evident and should be remembered.
Figure 2.2
Because complex numbers obey the same addition rule that applies to vectors
in a plane, they can be added graphically by the parallelogram rule (Fig. 2.2a).
Conversely, vectors in a plane can be represented by complex numbers. The
scalar product of two such vectors can be obtained by the rule
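$$(\mathbf{z}_1\cdot\mathbf{z}_2) = \operatorname{Re}(z_1^*z_2) = \operatorname{Re}(z_1z_2^*),$$
where it is understood that $\mathbf{z}_1$ and $\mathbf{z}_2$ are vectors corresponding to complex numbers $z_1$ and $z_2$, respectively. The vector product can be obtained in a similar fashion:
$$[\mathbf{z}_1\times\mathbf{z}_2] = \operatorname{Im}(z_1^*z_2) = -\operatorname{Im}(z_1z_2^*).$$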
Exercise. Verify the validity of the above rules for scalar and vector products.
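A quick numerical spot check of these two rules (not a substitute for the algebraic verification asked for in the exercise) can be written with Python's built-in complex numbers. The sample values are arbitrary, and the helper names are hypothetical.

```python
def scalar_product(z1, z2):
    """Dot product of the plane vectors represented by z1 and z2: Re(z1* z2)."""
    return (z1.conjugate() * z2).real


def vector_product(z1, z2):
    """Signed magnitude of the cross product of the same vectors: Im(z1* z2)."""
    return (z1.conjugate() * z2).imag


z1, z2 = complex(3, -2), complex(1, 4)

# Component formulas for vectors (x1, y1) and (x2, y2) in a plane
dot_components = z1.real * z2.real + z1.imag * z2.imag
cross_components = z1.real * z2.imag - z1.imag * z2.real

print(scalar_product(z1, z2), dot_components)
print(vector_product(z1, z2), cross_components)
print((z1 * z2.conjugate()).real, -(z1 * z2.conjugate()).imag)
```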
In the theory of complex variables, the expression $|z_1 - z_2|$ is often used. According to Fig. 2.2(b) this quantity (modulus of the complex number $z_1 - z_2$) is equal to the distance between the points $z_1$ and $z_2$ in the complex plane. It follows that the statement $|z - z_0| < R$ (which often occurs in proofs of various theorems) means geometrically that point z is within the circle of radius R drawn around the point $z_0$ as a center (i.e., z is in the R-neighborhood of $z_0$; see p. 16).
The following two inequalities are easily proved from geometrical considerations:
1. $|z_1 + z_2| \le |z_1| + |z_2|$.
(A side of a triangle is less than or equal to the sum of the other two sides.)
2. $|z_1 - z_2| \ge \big||z_1| - |z_2|\big|$.
(The difference of two sides of a triangle is less than or equal to the third side.)
Remark. It should be emphasized that inequalities can exist only among the moduli of
complex numbers, not among the complex numbers themselves. A complex number
cannot be greater or smaller than another complex number. Also, there are no positive
or negative complex numbers.
2.3 DE MOIVRE FORMULA AND THE CALCULATION OF ROOTS
While addition and subtraction of complex numbers are most easily performed
in their cartesian form z = x + iy, multiplication and division are easier in
trigonometric form. If $z_1 = r_1(\cos\theta_1 + i\sin\theta_1)$ and $z_2 = r_2(\cos\theta_2 + i\sin\theta_2)$, then elementary calculation shows that
$$z_1z_2 = r_1r_2[\cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2)]$$
with the provision that if $\theta_1 + \theta_2$ happens to be greater than $\pi$, or less than or equal to $-\pi$, then the amount $2\pi$ should be added or subtracted to fulfill the condition $-\pi < (\theta_1 + \theta_2) \le \pi$.
0) to which the positive
imaginary semiaxis is added. The negative imaginary semiaxis is not included.
The second branch, which has no special name, maps the z-plane onto the left
half-plane (Re w < 0) plus the negative imaginary semiaxis. Except for z = 0,
no other point on the w-plane (image plane) is duplicated by both mappings.
Also observe another important feature of the two branches. Each branch
taken separately is discontinuous on the negative real semiaxis. The meaning of
this is as follows: The points
$$z_1 = e^{i(\pi - \delta)} \quad\text{and}\quad z_2 = e^{i(-\pi + \delta)},$$
where $\delta$ is a small positive number, are very close to each other. However, their
images under the principal branch mapping, namely
$$f_1(z_1) = e^{i(\pi/2 - \delta/2)} \quad\text{and}\quad f_1(z_2) = e^{i(-\pi/2 + \delta/2)},$$
are very far from each other. On the other hand, note that the image of $z_2$ under
the mapping $f_2(z)$, namely,
$$f_2(z_2) = e^{i(\pi/2 + \delta/2)},$$
is very close to the point $f_1(z_1)$. It appears that the continuity of mapping can be
preserved if we switch branches as we cross the negative real semiaxis.
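The jump across the negative real semiaxis can be seen numerically. The following is a minimal sketch, assuming the principal branch is $\sqrt{r}\,e^{i\theta/2}$ with $-\pi < \theta \le \pi$ and the second branch is $\sqrt{r}\,e^{i(\theta + 2\pi)/2}$; the function names `f1` and `f2` and the value of `delta` are chosen here for illustration.

```python
import cmath
import math


def f1(z: complex) -> complex:
    """Principal branch: sqrt(r) * exp(i*theta/2), with theta from cmath.polar."""
    r, theta = cmath.polar(z)
    return math.sqrt(r) * cmath.exp(1j * theta / 2)


def f2(z: complex) -> complex:
    """Second branch (assumed form): sqrt(r) * exp(i*(theta + 2*pi)/2)."""
    r, theta = cmath.polar(z)
    return math.sqrt(r) * cmath.exp(1j * (theta + 2 * math.pi) / 2)


delta = 1e-3  # small positive number, illustrative value
z1 = cmath.exp(1j * (math.pi - delta))   # just above the negative real semiaxis
z2 = cmath.exp(1j * (-math.pi + delta))  # just below it

gap_points = abs(z1 - z2)             # the two points are close
gap_same_branch = abs(f1(z1) - f1(z2))  # images under one branch are far apart
gap_switched = abs(f1(z1) - f2(z2))     # switching branches restores closeness

print(gap_points, gap_same_branch, gap_switched)
```

The first and third numbers are of the order of `delta`, while the second is close to 2, which is the discontinuity described in the text.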
To give this idea a more precise meaning we must define the concept of a continuous
function of a complex variable. Let $w = f(z)$ be defined in some neighborhood
(see pp. 47 and 16) of the point $z_0$ and let $f(z_0) = w_0$. We say that $f(z)$
is continuous at $z_0$ if* $f(z) \to w_0$ whenever $z \to z_0$, in the sense that given $\delta > 0$
(arbitrarily small), the inequality $|f(z) - w_0| < \delta$ holds whenever $|z - z_0| < \epsilon$
holds, for sufficiently small $\epsilon$. It is readily shown† that if $w = u(x, y) + iv(x, y)$,
then the continuity of $w$ implies the continuity of $u(x, y)$ and $v(x, y)$ and vice versa.
* Also written as $\lim_{z \to z_0} f(z) = f(z_0)$.
† For example, see Kaplan, p. 495.
Riemann proposed an ingenious device to represent both branches by means
of a single continuous mapping: Imagine two separate z-planes cut along the
negative real semiaxis from “minus infinity” to zero. Imagine that the planes are
superimposed on each other but retain their separate identity in the manner of
two sheets of paper laid on top of each other. Now suppose that the second
quadrant of the upper sheet is joined along the cut to the fourth quadrant of the
lower sheet to form a continuous surface (Fig. 2.6). It is now possible to start
a curve C in the third quadrant of the upper sheet, go around the origin, and cross
the negative real semiaxis into the third quadrant of the lower sheet in a con-
tinuous motion (remaining on the surface). The curve can be continued on the
lower sheet around the origin into the second quadrant of the lower sheet.
[Figure 2.6, with the lower sheet labeled; Figure 2.7]
Now imagine the second quadrant of the lower sheet joined to the third
quadrant of the upper sheet along the same cut (independently of the first joint and
actually disregarding its existence). The curve C can then be continued onto the
upper sheet and may return to the starting point. This process of cutting and cross-
joining two planes leads to the formation of a Riemann surface which is thought of
as a single continuous surface formed of two Riemann sheets (Fig. 2.7).
An important remark is now in order: The line between the second quadrant
of the upper sheet and the third quadrant of the lower sheet is to be considered as
distinct from the line between the second quadrant of the lower sheet and the third
quadrant of the upper one. This is where the paper model fails us. According to
this model the negative real semiaxis appears as the line where all four edges of
our cuts meet. However, the Riemann surface has no such property; there are
two real negative semiaxes on the Riemann surface just as there are two real positive
semiaxes. The mapping $f(z) = \sqrt{z}$ may help to visualize this: The principal
branch maps the upper Riemann sheet (negative real semiaxis excluded) onto
the region Re w > 0 of the w-plane. The line joining the second upper with the
third lower quadrants is also mapped by the principal branch onto the positive
imaginary semiaxis. The lower Riemann sheet (negative real semiaxis excluded)
is mapped by the second branch onto the region Re w < 0. The line joining the
second lower with the third upper quadrants is mapped (by the second branch)
onto the negative imaginary semiaxis. In this fashion the entire Riemann surface
is mapped one-to-one onto the w-plane (z = 0 is mapped onto w = 0; this
particular correspondence, strictly speaking, belongs to neither branch since
the polar angle $\theta$ is not defined for z = 0).
The splitting of a multivalued function into branches is arbitrary to a great
extent. For instance, define the following two functions which also may be treated
as branches of $f(z) = \sqrt{z}$:
$\sqrt{r}\,e^{i\theta/2}$ for $0 \le \theta \le \pi$,
$\sqrt{r}\,e^{i(\theta + 2\pi)/2}$ for $-\pi < \theta < 0$,
W/rell@t20/21 for O< 9 0) or with (W/2")" (g < 0) and has q branches. For irrational
a the power function is infinitely multivalued.
2.7 ANALYTIC FUNCTIONS. CAUCHY THEOREM
In this section we shall discuss the subject of calculus of functions of a complex
variable. The basic concept of the continuity of a complex function has already
been presented, and it is not difficult to verify that the sum, product, and quotient
(except for division by zero) of two continuous functions is continuous. A continuous
function of a continuous function is also continuous.†
Let C be a piecewise smooth curve in a complex plane. If f(z) is continuous
on C, then the complex integral
$$\int_C f(z)\,dz$$
can be defined and expressed in terms of real integrals by putting
$$f(z) = u(x, y) + iv(x, y) \quad\text{and}\quad dz = dx + i\,dy;$$
this yields
$$\int_C f(z)\,dz = \int_C (u\,dx - v\,dy) + i\int_C (v\,dx + u\,dy),$$
where the real integrals $\int_C (u\,dx - v\,dy)$ and $\int_C (v\,dx + u\,dy)$ are known to exist.‡
Curve C may be open or closed but the direction of integration must be specified
in either case. The reversal of this direction results in the change of sign of the
integral. Complex integrals are, therefore, reducible to curvilinear real integrals
and possess the following properties:
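$$\int_C [f(z) + g(z)]\,dz = \int_C f(z)\,dz + \int_C g(z)\,dz, \tag{1}$$
$$\int_C k f(z)\,dz = k\int_C f(z)\,dz \quad (k = \text{complex constant}), \tag{2}$$
$$\int_C f(z)\,dz = \int_{C_1} f(z)\,dz + \int_{C_2} f(z)\,dz, \tag{3}$$
where $C$ is decomposed into two curves, $C_1$ and $C_2$. The absolute value of an
integral can be estimated by the formula
$$\left|\int_C f(z)\,dz\right| \le ML,$$
where $M = \max |f(z)|$ on $C$, and $L$ is the length of $C$.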
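As an illustration of these properties, a complex integral can be approximated by parametrizing the curve and summing $f(z)\,dz$. The sketch below is an assumption-laden example rather than anything from the text: the helper `contour_integral`, the choice $f(z) = z^2$, and the straight path from $0$ to $1 + i$ are all introduced here. It shows the change of sign under reversal of direction and the estimate $|\int_C f\,dz| \le ML$.

```python
def contour_integral(f, path, dpath, t0, t1, n=2000):
    """Midpoint-rule approximation of the integral of f(z) dz along z = path(t)."""
    h = (t1 - t0) / n
    total = 0j
    for k in range(n):
        t = t0 + (k + 0.5) * h
        total += f(path(t)) * dpath(t) * h
    return total


f = lambda z: z * z  # illustrative integrand
end = 1 + 1j         # illustrative end point of a straight path from 0

# Same segment traversed in the two opposite directions.
forward = contour_integral(f, lambda t: t * end, lambda t: end, 0.0, 1.0)
backward = contour_integral(f, lambda t: (1 - t) * end, lambda t: -end, 0.0, 1.0)

M = abs(end) ** 2  # maximum of |z^2| on the segment
L = abs(end)       # length of the segment

print(forward, backward, abs(forward), M * L)
```

For this polynomial integrand the forward value can also be compared with $(1 + i)^3/3$, which anticipates the primitive-function result proved later in this section.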
* Another widespread notation is $\arcsin z = \sin^{-1} z$, $\operatorname{arsinh} z = \sinh^{-1} z$, and so on.
† For example, Kaplan, p. 496.
‡ In the sense of Riemann; see, e.g., Courant, Vol. 1, p. 133, and Kaplan, p. 299.
As our next concept we shall define the derivative of a complex function:
Changing $z$ into $z + \Delta z$ (with complex $\Delta z$), we obtain $f(z + \Delta z)$ and we can write*
$$f'(z) = \frac{d}{dz} f(z) = \lim_{\Delta z \to 0} \frac{f(z + \Delta z) - f(z)}{\Delta z}.$$
As in the case of real functions, this limit may or may not exist. It may be emphasized
that in the above formula $\Delta z$ may approach zero in an arbitrary fashion,
that is, $z + \Delta z$ may approach $z$ along any curve or by any sequence. This rather
stringent requirement implies that f(z) must indeed be “well behaved” at point z
in order to be differentiable.
Function f(z) is said to be analytic (regular, or holomorphic) at point z if it
possesses a derivative at z and at all points of some neighborhood of z (small but
finite). This additional requirement results in many desirable properties of analytic
functions, such as the existence of derivatives of all orders. The theory of functions
of a complex variable deals essentially with analytic functions.
Mere existence of a derivative at all points of a neighborhood may be shown to
imply that the derivative is continuous.† Also, it is a simple matter to verify (by
the same technique as for real variables) that the derivatives of complex functions
obey the usual rules:
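$$\frac{d}{dz}(f + g) = \frac{df}{dz} + \frac{dg}{dz}, \tag{1}$$
$$\frac{d}{dz}(fg) = f\frac{dg}{dz} + \frac{df}{dz}g, \tag{2}$$
$$\frac{dw}{dz} = \frac{dw}{d\zeta}\frac{d\zeta}{dz}, \quad\text{where } w = w(\zeta) \text{ and } \zeta = \zeta(z), \tag{3}$$
$$\frac{d}{dz} z^n = nz^{n-1} \quad (n = \text{integer}), \tag{4}$$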
and so on. The differentials of complex functions are defined in the same way as
for real functions: If w = f(z), then dw = f’(z) dz.
If we set $f(z) = w = u(x, y) + iv(x, y)$, then the definition of the derivative
can be rewritten as
$$f'(z) = \lim_{\substack{\Delta x \to 0 \\ \Delta y \to 0}} \frac{[u(x + \Delta x, y + \Delta y) - u(x, y)] + i[v(x + \Delta x, y + \Delta y) - v(x, y)]}{\Delta x + i\,\Delta y}.$$
The limiting value on the right-hand side must be the same for the arbitrary approach
$\Delta z \to 0$. In particular, set $\Delta z = \Delta x$ (approach along the real axis); then
$$f'(z) = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x}.$$
* See p. 54 (including the footnote) for the definition of a limit.
† See, e.g., Knopp, Theory of Functions, Vol. 1, p. 65.
Alternatively, set $\Delta z = i\,\Delta y$ (approach along the imaginary axis); then
$$f'(z) = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}.$$
It follows that for a differentiable function $w = u + iv$ we must have
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$
These are the Cauchy-Riemann conditions; they follow directly from the definition
of the derivative. If, further, f(z) is analytic, then f’(z) must be continuous which,
in turn,* implies that the partial derivatives of u and v are continuous.
The inverse theorem also holds: If u(x, y) and v(x, y) have continuous first
partial derivatives satisfying Cauchy-Riemann conditions in some neighborhood
of z, then f(z) = u + iv is analytic at z.
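The Cauchy-Riemann conditions lend themselves to a quick finite-difference check. The sketch below is illustrative only: the helper `cauchy_riemann_residuals`, the step `h`, the test point, and the two sample functions ($e^z$, which is analytic, and $z^*$, which is not) are assumptions made here, not taken from the text.

```python
import cmath


def cauchy_riemann_residuals(f, z, h=1e-6):
    """Central-difference estimates of (u_x - v_y, u_y + v_x) at the point z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)            # u_x + i v_x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)  # u_y + i v_y
    return fx.real - fy.imag, fy.real + fx.imag


z0 = 0.7 + 0.3j  # arbitrary test point

res_exp = cauchy_riemann_residuals(cmath.exp, z0)
res_conj = cauchy_riemann_residuals(lambda z: z.conjugate(), z0)

print(res_exp)   # both residuals should be close to zero
print(res_conj)  # the first residual should be close to 2
```

For $f(z) = z^*$ one has $u = x$, $v = -y$, so $\partial u/\partial x - \partial v/\partial y = 2$ everywhere and the function is nowhere differentiable in the complex sense.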
Integrals of analytic functions possess some very important properties. Per-
haps the most fundamental one is expressed by the Cauchy theorem: If f(z) is
analytic in a simply connected domain D, and C is a (piecewise smooth) simple
closed curve in D, then
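$$\oint_C f(z)\,dz = 0.$$
Proof. Write the integral as
$$\oint_C f(z)\,dz = \oint_C (u\,dx - v\,dy) + i\oint_C (v\,dx + u\,dy).$$
Analyticity of $f(z)$ implies continuity of the partial derivatives of $u$ and $v$, so Green's
theorem (p. 20) is applicable. However, then the Cauchy-Riemann conditions
imply
$$\oint_C (u\,dx - v\,dy) = -\iint_S \left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) dx\,dy = 0,$$
$$\oint_C (v\,dx + u\,dy) = \iint_S \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy = 0,$$
where $S$ is the region bounded by $C$, and the theorem follows.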
There is a converse of the Cauchy theorem, known as the
Morera theorem:† If $f(z)$ is continuous in a domain $D$ and if $\oint f(z)\,dz = 0$
for every simple closed path in $D$ with its interior also in $D$, then $f(z)$ is
analytic in $D$.
It is not difficult to see that the Cauchy theorem is true for multiply connected
domains provided the interior of the simple closed path $C$ is also inside the domain
(i.e., the path does not encircle a hole; see Fig. 1.15). Similar extensions hold for
related integral theorems that will be quoted later.
* See, e.g., Kaplan, p. 510.
† Knopp, Theory of Functions, Vol. 1, p. 66.
The vanishing of a contour integral (an integral around a simple closed path)
is closely related to the independence of path of an integral. In fact, the considera-
tions of Section 1.6 can be applied easily to complex integrals, leading to the
statement: If $\oint f(z)\,dz = 0$ for every simple closed path, then the integral
$$\int_{z_0}^{z} f(\zeta)\,d\zeta$$
is independent of path (between $z_0$ and $z$).
Suppose now that the point $z_0$ is fixed. If the integral $\int_{z_0}^{z} f(\zeta)\,d\zeta$ is independent
of path, then it must represent a function of $z$. This function is then a
primitive function of $f(z)$ (or an indefinite integral of $f(z)$) as follows from the
fundamental theorem of integral calculus: If $f(z)$ is analytic in a simply connected
domain $D$, then the function
$$F(z) = \int_{z_0}^{z} f(\zeta)\,d\zeta$$
is also analytic in $D$ and $f(z) = (d/dz)F(z)$.
Proof. Since f(z) is analytic, the integral is independent of path and is therefore
a function of z. In the expression
$$F(z) = U + iV = \int_{(x_0, y_0)}^{(x, y)} (u\,dx - v\,dy) + i\int_{(x_0, y_0)}^{(x, y)} (v\,dx + u\,dy),$$
both integrals are independent of path (by Green's theorem and the Cauchy-
Riemann conditions). It also follows that*
$$\frac{\partial U}{\partial x} = u, \quad \frac{\partial U}{\partial y} = -v, \quad \frac{\partial V}{\partial x} = v, \quad \frac{\partial V}{\partial y} = u,$$
so that $U$ and $V$ satisfy the Cauchy-Riemann conditions as well. Therefore $F(z)$
is analytic. Moreover,
$$\frac{dF}{dz} = \frac{\partial U}{\partial x} + i\frac{\partial V}{\partial x} = u + iv = f(z),$$
and the theorem follows.
Any two primitive functions must differ by a (complex) constant; this follows
from the fact that $f'(z) = 0$ implies $f(z) = \text{const}$ (integrate $\partial u/\partial x = 0$, $\partial u/\partial y = 0$,
etc.).
* For example, Kaplan, p. 244.
2.8 OTHER INTEGRAL THEOREMS. CAUCHY INTEGRAL FORMULA
It should be emphasized that all conditions stated in the Cauchy theorem must
be checked before applications. Consider, for instance, the integral
$$I = \oint_C \frac{1}{z - a}\,dz \quad (a = \text{const}).$$
Is this integral zero or not? Generally speaking, $f(z) = 1/(z - a)$ is an
analytic function but it fails to be analytic at one (and only one) point, namely,
$z = a$. The function is not even defined at this point and thus cannot possess a
derivative.
Let the curve C involved in the definition of I be a simple closed curve. Then,
if the point z = a is outside the curve, the Cauchy theorem holds and I = 0.
If it is inside, the Cauchy theorem cannot be applied. In fact, the integral is not
equal to zero, as demonstrated by the following considerations: If C is a circle of
radius R centered at z = a, then the integral is easily evaluated by setting
$z = a + Re^{i\theta}$. In this case $dz = iRe^{i\theta}\,d\theta$ and
$$I = \int_{-\pi}^{+\pi} i\,d\theta = 2\pi i.$$
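The value $2\pi i$, and the vanishing of the integral when $z = a$ lies outside the curve, can be reproduced numerically. This is a minimal sketch under assumptions made here: the helper `circle_integral`, the value of `a`, and the particular circles are illustrative choices, and a simple equally spaced sum over the polar angle stands in for the exact integral.

```python
import cmath
import math


def circle_integral(f, center, radius, n=4000):
    """Approximate the counterclockwise integral of f around |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        z = center + radius * e
        dz = 1j * radius * e * dtheta  # dz = i R e^{i theta} d(theta)
        total += f(z) * dz
    return total


a = 0.5 + 0.25j  # illustrative position of the singular point
f = lambda z: 1 / (z - a)

centered = circle_integral(f, a, 2.0)           # circle centered at z = a
off_center = circle_integral(f, 0.0, 2.0)       # different circle, a still inside
outside = circle_integral(f, 3.0 + 3.0j, 1.0)   # a lies outside this circle

print(centered, off_center, outside)
```

The first two results agree with $2\pi i$ and the third is zero to rounding accuracy, in line with the discussion of paths that do or do not encircle $z = a$.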
It is not difficult to show that the same result is true for any simple closed
path $C_1$ which encircles the point $z = a$. Suppose that $C_1$ is entirely inside the circle
$C$ (Fig. 2.8). Then a thin channel made up of curves $B_1$ and $B_2$ can be constructed
to connect the interior of $C_1$ with the exterior of $C$ and the Cauchy theorem can
be applied to the shaded region; a domain $D$ can be constructed so that the
shaded region is within it. The integral over $C_1$ is clockwise. As the sides $B_1$
and $B_2$ of the channel are allowed to approach each other, the integrals of
$f(z) = 1/(z - a)$ along $B_1$ and $B_2$ will (in the limit) cancel out, leaving us with
the statement
$$\underbrace{\oint_C f(z)\,dz}_{\text{counterclockwise}} + \underbrace{\oint_{C_1} f(z)\,dz}_{\text{clockwise}} = 0.$$
Reversing the direction of the second integration to
counterclockwise, we obtain
$$\oint_C f(z)\,dz = \oint_{C_1} f(z)\,dz$$
(with both directions counterclockwise).
Figure 2.8
If $C$ is entirely within $C_1$, the proof is similar. If $C$ and $C_1$ intersect, the proof
is even simpler.
Exercise. Produce a proof of the discussed statement given that $C$ and $C_1$ intersect
at two points.
If the integral $I$ is evaluated around a closed path which is not simple, its
value is not necessarily $2\pi i$. In cases of practical interest it will be equal to $n \cdot 2\pi i$,
where $n$ is the number of times the path encircles the point $z = a$ counterclockwise
less the number of times it encircles the point $z = a$ clockwise.
Of course, it should be understood that the integral $\oint f(z)\,dz$ may happen
to be zero even if the Cauchy theorem does not apply. For instance, calculate
the integral
$$I = \oint \frac{1}{(z - a)^n}\,dz,$$
where $n$ is a positive integer not equal to unity and the contour is a circle of radius
$R$ around $z = a$. Using $z = a + Re^{i\theta}$, obtain
$$I = \int_{-\pi}^{+\pi} \frac{iRe^{i\theta}}{R^n e^{in\theta}}\,d\theta = \frac{i}{R^{n-1}}\int_{-\pi}^{+\pi} e^{i(1 - n)\theta}\,d\theta = 0.$$
This result evidently holds for any closed path encircling z = a.
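A short numerical experiment supports the statement for a path that is not centered at $z = a$. The sketch below is only an illustration: the helper `circle_integral`, the point `a`, and the circle $|z| = 3$ are assumptions introduced here.

```python
import cmath
import math


def circle_integral(f, center, radius, n=2000):
    """Approximate the counterclockwise integral of f around |z - center| = radius."""
    total = 0j
    dtheta = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * dtheta)
        total += f(center + radius * e) * 1j * radius * e * dtheta
    return total


a = 1.0 - 0.5j  # illustrative singular point, inside the circle |z| = 3

# Integrate 1/(z - a)^p for p = 1, 2, 3, 4 around the same off-center circle.
results = {}
for power in (1, 2, 3, 4):
    results[power] = circle_integral(lambda z, p=power: 1 / (z - a) ** p, 0.0, 3.0)

for power, value in results.items():
    print(power, value)
```

Only the first power gives a nonzero value; the higher powers integrate to zero although the integrand is not analytic at $z = a$.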
In both of these examples, the possibility of the point z = a being exactly on
the path of integration has been avoided, and for a good reason; such integrals
cannot actually be defined. Whenever this situation occurs in practical problems,
the path must be deformed to avoid the troublesome point. How this is to be done
depends on the nature of the problem.*
Function f(z) in the Cauchy theorem must, of course, be single valued. It
may be a particular branch of a multiple-valued function, but then care should be
taken that this particular branch is analytic. Consider, for instance, the integral
$$\oint \sqrt{z}\,dz$$
along the unit circle about the origin. First of all, the branch of the (double-valued)
function $\sqrt{z}$ must be specified. Suppose it is the principal branch. Then
$$\oint \sqrt{z}\,dz = \int_{-\pi}^{+\pi} e^{i\theta/2}\,ie^{i\theta}\,d\theta = -\frac{4}{3}i.$$
The Cauchy theorem is not applicable because $f(z)$ is not analytic within the circle
$|z| = 1$. The points where it fails to be analytic are along the real axis from
$x = -1$ to $x = 0$ where $f(z)$ is not even continuous. Note also that although
$f(z)$ is continuous at $z = 0$, it is not analytic at that point either.
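The nonzero value of this integral can be checked numerically. The sketch below assumes that Python's `cmath.sqrt` is an acceptable stand-in for the principal branch (its branch cut lies along the negative real axis) and that the circle is traversed with the polar angle running from $-\pi$ to $+\pi$; the function name and the number of sample points are illustrative choices.

```python
import cmath
import math


def principal_sqrt_unit_circle(n=20000):
    """Midpoint-rule value of the integral of sqrt(z) dz over |z| = 1.

    The polar angle runs from -pi to +pi, so the sample points never
    fall exactly on the branch cut along the negative real axis.
    """
    h = 2 * math.pi / n
    total = 0j
    for k in range(n):
        theta = -math.pi + (k + 0.5) * h
        z = cmath.exp(1j * theta)
        total += cmath.sqrt(z) * 1j * z * h  # sqrt(z) * dz, with dz = i z d(theta)
    return total


value = principal_sqrt_unit_circle()
print(value)
```

The result is close to $-\tfrac{4}{3}i$, the value obtained by integrating $ie^{3i\theta/2}$ from $-\pi$ to $+\pi$.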
* One such example is given in Section 12.9.
Consider now the same integral
$$\oint \sqrt{z}\,dz$$
taken around the point $z = -2$ (Fig. 2.9). If the principal branch is involved in
the integration, the Cauchy theorem is not applicable. However, split $\sqrt{z}$ into
the following two branches (as on p. 56):
(\/z = v/r ef? if O<0 K for all n sufficiently large, and k > 1,
then $\sum z_n$ diverges.
Root test. If $\sqrt[n]{|z_n|} < k < 1$ for all $n$ sufficiently large, then $\sum z_n$ converges
absolutely, and if $\sqrt[n]{|z_n|} > k > 1$ for all $n$ sufficiently large, then $\sum z_n$ diverges.
Proofs of these theorems are similar to those for real series; they are based
on the inequalities for absolute values which are also true for complex numbers.
Divergence of a series can often be quickly established by the nth term test:
If $z_n$ fails to converge to zero then the series $\sum z_n$ diverges.
If necessary, the question of convergence of a complex series can always be reduced
to that of two real series by the basic theorem: The series $\sum z_n = \sum (x_n + iy_n)$
converges to $S = P + iQ$ if and only if $\sum x_n$ converges to $P$ and $\sum y_n$ converges
to $Q$. For instance, series which are convergent but not absolutely convergent
can be treated in this fashion.
Complex series can be added and subtracted provided they are convergent.
They can be multiplied only if they are absolutely convergent, the product being
also an absolutely convergent series; if the series are not absolutely convergent,
then we are faced with the problem of arranging the product series.†
* For example, see Section 6.4.
† See Knopp, Theory and Application of Infinite Series, Section 45.
Terms of a complex series may depend on a complex variable z. Most common
series are power series, for instance,
$$1 + z + z^2 + z^3 + \cdots$$
In many cases such series will converge only if z is confined to a certain region.
The above series converges absolutely, by ratio test, provided |z| < 1. This series
diverges, by ratio test, if |z| > 1. The ratio test is inconclusive if |z| = 1, but then
the nth-term test shows that the series diverges. It is seen that the above power
series converges absolutely for all points inside a circle of radius R = 1 called
the radius of convergence.
The concept of the radius of convergence can be applied to every power
series. Indeed, if a power series is convergent on a circle of some radius r, then
it is absolutely convergent everywhere inside this circle* (by comparison test).
The problem is then to find the upper bound of r which is the sought radius
of convergence.
Exercise. Show that the series
$$1 - 3z + 9z^2 - 27z^3 + 81z^4 - \cdots$$
has a radius of convergence equal to $\tfrac{1}{3}$, while the series
$$1!\,z + 2!\,z^2 + 3!\,z^3 + 4!\,z^4 + \cdots$$
has a radius of convergence equal to zero; the point z = 0 is the only value for which the
series converges.
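The root test suggests a numerical way to look at both series of the exercise: for a power series $\sum c_n z^n$ the quantity $1/\sqrt[n]{|c_n|}$ indicates how large $|z|$ may be before the $n$th term stops decreasing. The sketch below is a hedged illustration, not a proof; the helper `radius_estimate` and the sampled values of $n$ are assumptions made here, and logarithms of the coefficients are used only to avoid overflow for $n!$.

```python
import math


def radius_estimate(log_abs_coeff, n):
    """Root-test estimate 1 / |c_n|^(1/n), computed from log|c_n|."""
    return math.exp(-log_abs_coeff(n) / n)


log_c_geometric = lambda n: n * math.log(3.0)     # |c_n| = 3^n
log_c_factorial = lambda n: math.lgamma(n + 1.0)  # |c_n| = n!

orders = (10, 100, 1000)
geo = [radius_estimate(log_c_geometric, n) for n in orders]
fac = [radius_estimate(log_c_factorial, n) for n in orders]

print(geo)  # stays at 1/3
print(fac)  # keeps shrinking as n grows
```

The first list is constant at $\tfrac{1}{3}$, while the second decreases steadily, consistent with a zero radius of convergence.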
If the upper bound described above does not exist, then the series converges
absolutely for all values of z and is said to have an infinite radius of convergence.
For example,
$$1 + \frac{z}{1!} + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots.$$
Partial sums of a power series represent a sequence of polynomials in z.
Sequences of other functions can also be considered. The sequence {f,(z)} of
functions defined in a region R (z belongs to R) is said to converge to a limit func-
tion f(z) in R, provided
$$\lim_{n \to \infty} f_n(z) = f(z)$$
for each $z$ in $R$. For instance, the partial sums of the series
$$\sum_{m=0}^{\infty} z^m = 1 + z + z^2 + z^3 + \cdots$$
form a sequence of functions (polynomials)
$$f_n(z) = \sum_{m=0}^{n} z^m,$$
and this sequence converges to the function $f(z) = 1/(1 - z)$ in the (open)
region $|z| < 1$ because
$$f_n(z) = 1 + z + z^2 + \cdots + z^n = \frac{1 - z^{n+1}}{1 - z}$$
and
$$\lim_{n \to \infty} z^{n+1} = 0 \quad \text{for } |z| < 1.$$
For this reason the function $f(z) = 1/(1 - z)$ is said to be the sum of the
above series (for $|z| < 1$ only!):
$$\sum_{m=0}^{\infty} z^m = \frac{1}{1 - z} \quad (|z| < 1).$$
* This statement is known as Abel's theorem. See Kaplan, p. 350.
When representing a function by a sequence of other functions, it is necessary
to know how well a certain function f(z) is approximated by the nth term of a
sequence $\{f_n(z)\}$. This question leads to the definition of uniform convergence:
The sequence $\{f_n(z)\}$ is said to converge uniformly to $f(z)$ in a region $R$ provided
that the inequality
$$|f(z) - f_n(z)| < \epsilon,$$
which is satisfied if $n > N$, holds simultaneously for all $z$ in $R$.
In plain terms: Let us suppose we desire a certain accuracy $\epsilon$ for our approximation.
For some particular $z$, the tenth function in the sequence may
suffice (ten terms of a series, if we are talking about partial sums). But for another
z, the tenth function may be inadequate because the speed of convergence is
slower. In general, we may need to go farther and farther along the sequence as
we proceed to points where the convergence gets increasingly worse. Uniform
convergence sets the end to this process. The convergence can be no worse than
a certain specified degree and the Nth term will guarantee a certain accuracy
for the entire region.
Example. The sequence of partial sums of the series
$$\sum_{m=0}^{\infty} z^m$$
is convergent for $|z| < 1$ but it is not uniformly convergent; the convergence
becomes increasingly worse as $|z| \to 1$. However, in the region $|z| \le k$ where
$k < 1$, the convergence is uniform. It may be “bad” at $z = k$, but once $N$ is
found such that
$$|f(z) - f_n(z)| < \epsilon$$
for all $n > N$ and $z = k$, then the same value of $N$ will hold for all other $z$ with
$|z| \le k$.
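The difference between ordinary and uniform convergence can be made concrete for the geometric series $1 + z + z^2 + \cdots$ with sum $1/(1 - z)$. The following sketch is an illustration under assumptions made here: the helper names, the accuracy `eps`, the radius `k = 0.9`, and the sample points are all hypothetical choices. It counts how many terms are needed at several points and then checks that the count found at $z = k$ also suffices at other points with $|z| \le k$.

```python
import cmath


def partial_sum_error(z, n):
    """|1/(1 - z) - (1 + z + ... + z^n)| by direct summation."""
    partial = 0j
    power = 1 + 0j
    for _ in range(n + 1):
        partial += power
        power *= z
    return abs(1 / (1 - z) - partial)


def terms_needed(z, eps, n_max=100000):
    """Smallest n for which the n-th partial sum is within eps of 1/(1 - z)."""
    target = 1 / (1 - z)
    partial = 0j
    power = 1 + 0j
    for n in range(n_max + 1):
        partial += power
        if abs(target - partial) < eps:
            return n
        power *= z
    raise ValueError("accuracy not reached within n_max terms")


eps = 1e-6
# The required number of terms grows without bound as |z| approaches 1.
needed = [terms_needed(x, eps) for x in (0.5, 0.9, 0.99)]

# Inside |z| <= k, the count found at z = k works everywhere.
k = 0.9
N = terms_needed(k, eps)
samples = [k * r * cmath.exp(1j * phi)
           for r in (0.25, 0.5, 1.0) for phi in (0.0, 1.0, 2.0, 3.0)]
worst = max(partial_sum_error(z, N) for z in samples)

print(needed, N, worst)
```

The reason the single value of `N` works is that the error equals $|z|^{n+1}/|1 - z|$, which for $|z| \le k$ is largest at $z = k$.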
The series of functions is called uniformly convergent (in a region R) if the
sequence of its partial sums is uniformly convergent (in that region). Uniform
convergence of series is most commonly established by the Weierstrass M-test:
The series $\sum f_n(z)$ of functions is uniformly convergent in a region $R$ if there exists
a series of positive constants $M_n$ such that
$$|f_n(z)| \le M_n \quad \text{for all } z \text{ in } R$$
and the series $\sum M_n$ is convergent. The proof follows from the comparison test
and the fact that the M,, are independent of z.
Several important theorems can be established for a uniformly convergent
series as follows.
Continuity theorem. The sum of a uniformly convergent series of continuous
functions is a continuous function.
Integrability theorem. A uniformly convergent series of continuous functions
can be integrated term by term.
Differentiability theorem. A uniformly convergent series can be differentiated
term by term, provided all terms have continuous derivatives and the re-
sulting series is uniformly convergent.
All these theorems can be proved by the same methods as those used for real
variables.* One can also show that the sums and products of uniformly con-
vergent series are uniformly convergent (within the same region, of course).
From the above results one can deduce the
Weierstrass theorem. If the terms of the series $\sum f_n(z)$ are analytic inside
and on a simple closed curve C and the series converges uniformly on C,
then its sum is an analytic function (inside and on C) and the series may be
differentiated or integrated any number of times.
2.10 TAYLOR AND LAURENT SERIES
Consider a power series of powers $(z - a)^n$ where $a$ is a fixed complex number:
$$c_0 + c_1(z - a) + c_2(z - a)^2 + c_3(z - a)^3 + \cdots$$
If this series converges for some value $z_0 \ne a$ (for $z_0 = a$ the series always
converges), then it is absolutely convergent everywhere in the interior of the circle
of radius $|z_0 - a| = R_0$ about the point $a$ (by comparison test). Moreover, it
will be uniformly convergent within a circle of radius $R$ less than $R_0$ (by the
Weierstrass M-test). It follows that the above power series represents, within a
circle $R$ (at least), a complex function
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n.$$
* For these methods see Kaplan, pp. 345-348.
By the Weierstrass theorem, this function must be analytic within the circle.
Summarizing these results, we can state that every power series with a nonzero
radius of convergence represents a regular function in some neighborhood of the
point z = a. Such series may be added and multiplied (where the neighborhoods
overlap) and differentiated and integrated any number of times.
The converse statement is also true: Every function
$f(z)$ analytic at $z = a$ can be expanded in a power series
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n$$
valid in some neighborhood of the point $a$. This series, known
as the Taylor series, is unique, and the coefficients $c_n$ can
be obtained from the formula
$$c_n = \frac{1}{n!} f^{(n)}(a).$$
Figure 2.12
Proof. Let f(z) be analytic within a circle C about the point a and let z be inside
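C (Fig. 2.12). It is then always possible to construct a circle $\Gamma$ such that $\Gamma$ is
inside $C$ and the point $z$ is inside $\Gamma$. This is necessary to ensure that $f(z)$ is analytic
on $\Gamma$ and the Cauchy formula can be applied:
$$f(z) = \frac{1}{2\pi i}\oint_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta.$$
The quantity $1/(\zeta - z)$ may now be expanded by means of a geometric series:
$$\frac{1}{\zeta - z} = \frac{1}{(\zeta - a) - (z - a)} = \frac{1}{\zeta - a}\sum_{n=0}^{\infty}\left(\frac{z - a}{\zeta - a}\right)^n.$$
This is allowable because the series converges (by ratio test). Then
$$f(z) = \frac{1}{2\pi i}\oint_\Gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta = \frac{1}{2\pi i}\oint_\Gamma \sum_{n=0}^{\infty} \frac{f(\zeta)(z - a)^n}{(\zeta - a)^{n+1}}\,d\zeta.$$
The series appearing in the integrand, viewed as a function of $\zeta$, is uniformly
convergent on $\Gamma$ and, by the Weierstrass theorem, can be integrated term by term:
$$f(z) = \sum_{n=0}^{\infty} (z - a)^n \frac{1}{2\pi i}\oint_\Gamma \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta.$$
In view of the formula (p. 66)
$$f^{(n)}(a) = \frac{n!}{2\pi i}\oint_\Gamma \frac{f(\zeta)}{(\zeta - a)^{n+1}}\,d\zeta$$
we obtain
$$f(z) = \sum_{n=0}^{\infty} \frac{1}{n!} f^{(n)}(a)(z - a)^n.$$
It remains for us to show that this power series is unique. Indeed, if there is a
series with undetermined coefficients $c_n$,
$$f(z) = \sum_{n=0}^{\infty} c_n (z - a)^n,$$
which represents an analytic function in some neighborhood of the point $a$, then it is
uniformly convergent within and on a circle $\Gamma$ inside the neighborhood. Differentiating
the series $n$ times and setting $z = a$, we obtain
$$c_n = \frac{1}{n!} f^{(n)}(a),$$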
which completes the proof.
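The integral expression for the Taylor coefficients can be tested numerically. The sketch below is a minimal illustration under assumptions made here: the helper `taylor_coefficient`, the choice $f(z) = e^z$, the expansion point `a`, and the circle of radius 1 are not from the text. With $\zeta = a + re^{i\theta}$ the contour integral reduces to an average of $f(\zeta)e^{-in\theta}/r^n$ over the circle, which is what the code sums.

```python
import cmath
import math


def taylor_coefficient(f, a, n, radius=1.0, samples=256):
    """Approximate c_n = (1/(2 pi i)) * integral of f(zeta)/(zeta - a)^(n+1) d(zeta)
    over a circle of the given radius about a, using equally spaced angles."""
    total = 0j
    for k in range(samples):
        e = cmath.exp(2j * math.pi * k / samples)
        total += f(a + radius * e) / e ** n
    return total / (samples * radius ** n)


a = 0.3 + 0.2j  # illustrative expansion point

numeric = [taylor_coefficient(cmath.exp, a, n) for n in range(6)]
# For the exponential every derivative at a equals exp(a), so c_n = exp(a)/n!.
exact = [cmath.exp(a) / math.factorial(n) for n in range(6)]

for n in range(6):
    print(n, numeric[n], exact[n])
```

The two columns agree to rounding accuracy, as the uniqueness argument requires.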
Power series can be generalized to contain negative powers of (z — a) to read
$$\sum_{n=-\infty}^{+\infty} c_n (z - a)^n.$$
Such series may be split into two parts:
$$\sum_{n=0}^{\infty} c_n (z - a)^n \quad\text{and}\quad \sum_{n=1}^{\infty} \frac{c_{-n}}{(z - a)^n},$$
and the original series will converge provided both parts converge.
The series of positive powers converges inside a circle of some radius $R_2$
about the point $z = a$. The series of negative powers will, in general, converge
outside a circle of some radius $R_1$ about $z = a$. To see this, denote
$$\xi = \frac{1}{z - a},$$
so that the series of negative powers of $(z - a)$ becomes
$$\sum_{n=1}^{\infty} c_{-n}\,\xi^n.$$
Unless this series happens to have zero radius of convergence it will, in general,
converge within a circle of radius $R'$ about the origin. But $|\xi| < R'$ implies
$$|z - a| > 1/R' = R_1,$$
and the statement follows. Therefore we conclude that if $R_2 > R_1$, then the series
$$\sum_{n=-\infty}^{+\infty} c_n (z - a)^n$$
will converge within the annulus
$$R_1 < |z - a| < R_2.$$
It can, of course, happen that $R_2 < R_1$, in which case our series will diverge
everywhere.
The following theorem can now be derived: Every function $f(z)$ analytic in an
annulus
$$R_1 < |z - a| < R_2$$
can be expanded in a series of positive and negative powers of $(z - a)$, namely
$$f(z) = \sum_{n=-\infty}^{+\infty} c_n (z - a)^n.$$
This series, known as the Laurent series, is unique for a
given annulus, and the coefficients $c_n$ can be obtained
from
$$c_n = \frac{1}{2\pi i}\oint_\Gamma \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}},$$
where $\Gamma$ is a circle of radius $R$ such that $R_1 < R < R_2$.
Figure 2.13
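Proof. Contract the radius $R_2$ slightly and expand the radius $R_1$ slightly to obtain
an annular region, with the point $z$ inside, to which the generalized Cauchy formula
is applicable (Fig. 2.13):*
$$f(z) = \frac{1}{2\pi i}\ointctrclockwise_{\Gamma_2} \frac{f(\zeta)\,d\zeta}{\zeta - z} + \frac{1}{2\pi i}\ointclockwise_{\Gamma_1} \frac{f(\zeta)\,d\zeta}{\zeta - z}.$$
The first integral can be treated as in the derivation of the Taylor series:
$$\frac{1}{2\pi i}\ointctrclockwise_{\Gamma_2} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \frac{1}{2\pi i}\ointctrclockwise_{\Gamma_2} \sum_{n=0}^{\infty} \frac{f(\zeta)(z - a)^n}{(\zeta - a)^{n+1}}\,d\zeta = \sum_{n=0}^{\infty} (z - a)^n \frac{1}{2\pi i}\ointctrclockwise_{\Gamma_2} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}}.$$
Term-by-term integration is permissible by uniform convergence.
* We shall occasionally use the convenient symbols $\ointclockwise$ and $\ointctrclockwise$ to indicate clockwise or
counterclockwise direction of integration.
The second integral is treated by expanding $1/(\zeta - z)$ in a somewhat different
geometric series:
$$\frac{1}{\zeta - z} = -\frac{1}{(z - a) - (\zeta - a)} = -\sum_{m=0}^{\infty} \frac{(\zeta - a)^m}{(z - a)^{m+1}},$$
which is convergent by ratio test. Then
$$\frac{1}{2\pi i}\ointclockwise_{\Gamma_1} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \frac{1}{2\pi i}\ointctrclockwise_{\Gamma_1} \sum_{m=0}^{\infty} \frac{f(\zeta)(\zeta - a)^m}{(z - a)^{m+1}}\,d\zeta = \sum_{m=0}^{\infty} (z - a)^{-m-1} \frac{1}{2\pi i}\ointctrclockwise_{\Gamma_1} f(\zeta)(\zeta - a)^m\,d\zeta.$$
Replace $m$ by $-(n + 1)$ ($n$ must be negative) and rewrite the above as
$$\frac{1}{2\pi i}\ointclockwise_{\Gamma_1} \frac{f(\zeta)\,d\zeta}{\zeta - z} = \sum_{n=-\infty}^{-1} (z - a)^n \frac{1}{2\pi i}\ointctrclockwise_{\Gamma_1} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}}.$$
Finally, note that the integrals
$$\ointctrclockwise_{\Gamma_2} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}} \quad (n = 0, 1, 2, \ldots)$$
and
$$\ointctrclockwise_{\Gamma_1} \frac{f(\zeta)\,d\zeta}{(\zeta - a)^{n+1}} \quad (n = -1, -2, \ldots)$$
may be just as well evaluated over a common circle $\Gamma$, concentric with $\Gamma_1$ and $\Gamma_2$,
and lying within the annulus $R_1 < R < R_2$.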
To prove the uniqueness, assume that an expansion
$$f(z) = \sum_{n=-\infty}^{+\infty} c_n (z - a)^n$$
exists and is valid in the annulus $R_1 < |z - a| < R_2$. Choose an arbitrary integer
$k$, multiply both sides of this expression by $(z - a)^{-k-1}$, and integrate around a
circle $\Gamma$ about $z = a$, lying within the annulus. Then
$$\oint_\Gamma \frac{f(z)\,dz}{(z - a)^{k+1}} = \sum_{n=-\infty}^{+\infty} c_n \oint_\Gamma \frac{dz}{(z - a)^{k-n+1}}.$$
All integrals on the right-hand side will vanish except one, for which $n = k$, and
whose value is $2\pi i$. Therefore
$$\oint_\Gamma \frac{f(z)\,dz}{(z - a)^{k+1}} = c_k\,2\pi i,$$
which completes the proof.
The part of the Laurent series consisting of positive powers of (z — a) is
called the regular part. It resembles the Taylor series but it should be emphasized
that the $n$th coefficient cannot be associated, in general, with $f^{(n)}(a)$ because the
latter may not exist. In most applications, f(z) is not analytic at z = a. The
other part, consisting of negative powers, is called the principal part. Either part
(or both) may terminate or be identically zero. Of course, if the principal part is
identically zero, then f(z) is analytic at z = a, and the Laurent series is identical
with the Taylor series.
Remark. The Laurent series is unique only for a specified annulus. In general, a function
$f(z)$ may possess two or more entirely different Laurent series about a given point,
valid for different (nonoverlapping) regions. For instance,
1
2 —2)
1 A
aptitet 2 thee, O lal. Write
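$$f(z) = \frac{1}{z - a} = \frac{1}{z}\cdot\frac{1}{1 - a/z}.$$
If $|z| > |a|$, then $|a/z| < 1$, and we can expand
$$\frac{1}{1 - a/z} = \sum_{n=0}^{\infty}\left(\frac{a}{z}\right)^n.$$
Therefore
$$f(z) = \sum_{n=0}^{\infty} \frac{a^n}{z^{n+1}} \quad (|z| > |a|).$$
This is the desired Laurent series.
The function $f(z)$ can be expanded by this method about any point $z = b$:
Indeed, write
$$f(z) = \frac{1}{z - a} = \frac{1}{(z - b) - (a - b)}.$$
Then, either
$$f(z) = -\sum_{n=0}^{\infty} \frac{(z - b)^n}{(a - b)^{n+1}} \quad (|z - b| < |a - b|),$$
or
$$f(z) = \sum_{n=0}^{\infty} \frac{(a - b)^n}{(z - b)^{n+1}} \quad (|z - b| > |a - b|).$$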
Example 2. Rational fraction decomposition.
$$f(z) = \frac{1}{z^2 - (2 + i)z + 2i}.$$
The roots of the denominator are $a = i$, $b = 2$ (simple and distinct). Therefore
$f(z)$ fails to be analytic only at $z = i$ and $z = 2$ and should possess a Taylor
series about $z = 0$ valid for $|z| < 1$ ($|i| = 1$) and two Laurent series about $z = 0$
valid for $1 < |z| < 2$ and $|z| > 2$. To obtain these three series, we use the identities
$$z^2 - (2 + i)z + 2i = (z - i)(z - 2)$$
and
$$f(z) = \frac{1}{(z - i)(z - 2)} = \frac{1}{2 - i}\left(\frac{1}{z - 2} - \frac{1}{z - i}\right).$$
Suppose that the Laurent series valid for 1 < |z| < 2 is desired. The function
1/(z — 2) should then be expanded in Taylor series about z = 0 (Example 1).
This series is, in particular, valid for 1 < |z| < 2. The function 1/(z — i) should
be expanded in the Laurent series about z = 0 valid for |z| > 1 (Example 1).
This series is also valid for 1 < |z| < 2. If these two series are subtracted, we may
obtain (multiplying by $1/(2 - i)$) a series for $f(z)$ valid for $1 < |z| < 2$ which
is the desired Laurent series.
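The recipe of Example 2 can be carried out numerically and compared with the function itself. This is a hedged sketch: the names `f_direct` and `f_laurent`, the number of retained terms, and the test point are assumptions introduced here. It uses the two expansions of Example 1 about $z = 0$, namely $1/(z - 2) = -\sum z^n/2^{n+1}$ for $|z| < 2$ and $1/(z - i) = \sum i^n/z^{n+1}$ for $|z| > 1$.

```python
import cmath


def f_direct(z):
    """f(z) = 1 / ((z - i)(z - 2)) evaluated directly."""
    return 1 / ((z - 1j) * (z - 2))


def f_laurent(z, terms=150):
    """Truncated Laurent series of f about z = 0, intended for 1 < |z| < 2."""
    taylor_part = 0j   # 1/(z - 2) = -sum z^n / 2^(n+1), valid for |z| < 2
    laurent_part = 0j  # 1/(z - i) =  sum i^n / z^(n+1), valid for |z| > 1
    z_pow = 1 + 0j     # running value of z^n
    i_pow = 1 + 0j     # running value of i^n
    for n in range(terms):
        taylor_part -= z_pow / 2 ** (n + 1)
        laurent_part += i_pow / (z_pow * z)
        z_pow *= z
        i_pow *= 1j
    # Subtract the two series and multiply by 1/(2 - i), as in the text.
    return (taylor_part - laurent_part) / (2 - 1j)


z = 1.5 * cmath.exp(0.7j)  # illustrative point inside the annulus 1 < |z| < 2
print(f_laurent(z), f_direct(z))
```

Both parts converge geometrically at this point (ratios 0.75 and 1/1.5), so a moderate number of terms reproduces $f(z)$ to rounding accuracy.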
Example 3. Differentiation. $f(z) = 1/(z - 1)^2$.
The method applied in Example 2 fails here because of the double root of the
denominator. Among the alternative methods the simplest one is, perhaps, to
observe that
$$\frac{1}{(z - 1)^2} = \frac{d}{dz}\left(\frac{1}{1 - z}\right).$$
The series
$$\frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n \quad (|z| < 1)$$
can be differentiated term by term within the circle of convergence. Therefore
$$\frac{1}{(z - 1)^2} = 1 + 2z + 3z^2 + \cdots = \sum_{n=0}^{\infty} (n + 1)z^n \quad (|z| < 1).$$
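The differentiated series can be compared with the closed form at a point inside the unit circle. The sketch below is only a numerical illustration; the function name `series_value`, the number of terms, and the test point are assumptions made here.

```python
def series_value(z, terms=400):
    """Partial sum of the series sum_{n >= 0} (n + 1) z^n."""
    total = 0j
    power = 1 + 0j  # running value of z^n
    for n in range(terms):
        total += (n + 1) * power
        power *= z
    return total


z = 0.4 - 0.5j  # illustrative point with |z| < 1
closed_form = 1 / (z - 1) ** 2

print(series_value(z), closed_form)
```

For $|z| \ge 1$ the terms $(n + 1)z^n$ do not tend to zero, so the same partial sums would not settle down.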
Example 4. Integration. $f(z) = \log(1 + z) = \log|1 + z| + i\arg(1 + z)$.
This is the principal branch of the (multivalued) logarithmic function. The
branch line extends from “minus infinity” to minus one and $\log(1 + z)$ is analytic
within the circle $|z| = 1$.
We know that
$$\frac{d}{dz}\log(1 + z) = \frac{1}{1 + z}.$$
Therefore we may expand
Wdta)slozt+7—-F4A—---= VC" (A M
(any M) for |z — a| < € (some 6).
3. Neither of the two cases above; in plain terms, f(z) oscillates in a “wild”
manner.
Examples of these three types of singularities (at $z = 0$) are
Case 1, $f(z) = \sin z/z$,
Case 2, $f(z) = 1/\sin z$,
Case 3, $f(z) = e^{1/z}$.
The first case turns out to be trivial because then the limit $\lim_{z \to a} f(z)$ must exist,
and if the function $f(z)$ is defined at $z = a$ by $f(a) = \lim_{z \to a} f(z)$, then it must
be regular at $z = a$ as well.
Remark. The formula f(z) = sin z/z does not define, in a rigorous sense, the value of
the function at z = 0. The extended formula
$$f(z) = \begin{cases} \sin z/z, & z \ne 0, \\ 1, & z = 0, \end{cases}$$
does define the function f(z) at z = 0 (and elsewhere).
To prove the above statement concerning Case 1, observe that f(z) is analytic
in an annulus ρ < |z − a| < R within a neighborhood of z = a. By the Cauchy
theorem, for any point z within this annulus (Fig. 2.14), we have
$$f(z) = \frac{1}{2\pi i}\oint_{\Gamma}\frac{f(\zeta)\,d\zeta}{\zeta - z} - \frac{1}{2\pi i}\oint_{\gamma}\frac{f(\zeta)\,d\zeta}{\zeta - z},$$
where Γ denotes the outer circle |ζ − a| = R and γ the inner circle |ζ − a| = ρ, both
traversed counterclockwise.
We shall show that the second integral must be zero for
all ρ. If this is so, then f(z) must approach the limit
$$\lim_{z \to a} f(z) = \frac{1}{2\pi i}\oint_{\Gamma}\frac{f(\zeta)\,d\zeta}{\zeta - a},$$
which will prove two things at once: (a) lim_{z→a} f(z)
exists, (b) if f(a) is defined by this limit then the redefined
function f(z) is analytic at z = a as well.
To achieve this result, write ζ − z = (ζ − a) − (z − a) and observe that
$$|(\zeta - a) - (z - a)| \geq |z - a| - |\zeta - a| = |z - a| - \rho.$$
Then, for a fixed z,
$$\left|\frac{1}{2\pi i}\oint_{\gamma}\frac{f(\zeta)\,d\zeta}{\zeta - z}\right| \leq \frac{1}{2\pi}\,\frac{B \cdot 2\pi\rho}{|z - a| - \rho} = \frac{B\rho}{|z - a| - \rho},$$
where B is an upper bound of |f(ζ)| near z = a (it exists in Case 1).
The integral must be independent of ρ because of the analyticity of the integrand.
Since it is less than an arbitrary positive number (for sufficiently small ρ) it must
be equal to zero. The proof is now complete.
Because of the described property, isolated singularities of the first type are
called removable singularities. In practice, if a function is defined by a formula
which fails for some isolated point z = a, then the formula is tacitly replaced by
the corresponding limit. In this sense, the functions
f(2) = sinz/z, gz) = e"*"*, (2) = 1/z — cotz
are analytic at z = 0.
The second type of isolated singularity, when |f(z)| → ∞ as z → a, is called
a pole. Since the singularity is isolated, there must exist a Laurent series
$$f(z) = \sum_{n=-\infty}^{+\infty} c_n (z - a)^n$$
valid for 0 < |z − a| < R (for some R). If the principal part terminates, i.e., if
the Laurent series is of the form
$$f(z) = \sum_{n=-m}^{+\infty} c_n (z - a)^n \qquad (c_{-m} \neq 0),$$
then f(z) has a pole of order m at z = a. Conversely, if f(z) has a pole at z = a,
it must have the Laurent series (for 0 < |z − a| < R) of the above form as seen
from the following argument: Consider the function
$$g(z) = 1/f(z).$$
Unless f(z) = 0 there must exist a neighborhood of z = a where f(z) has no zeros.
In this neighborhood, g(z) is analytic and
$$|g(z)| \to 0 \quad \text{as } z \to a$$
(because |f(z)| → ∞). Therefore g(z) has a zero at z = a. This zero must be of
some definite order m, namely,
$$g(z) = \sum_{n=m}^{\infty} b_n (z - a)^n = (z - a)^m \psi(z) \qquad (b_m \neq 0).$$
Since ψ(z) is analytic at z = a and does not vanish there, it follows that 1/ψ(z)
must be analytic as well and possess a Taylor series at z = a. Then
$$f(z) = \frac{1}{(z - a)^m \psi(z)} = \sum_{n=-m}^{\infty} c_n (z - a)^n.$$
However, this is evidently a Laurent series with a terminating principal part so
that the argument is complete.
Example. The function f(z) = csc z = 1/sin z has the Laurent series valid for
O0 0 nor approaches infinity, for an
arbitrary manner of approach: For instance, if z approaches zero along the
negative real semiaxis, then |f(z)| → 0; if it approaches zero along the positive
real semiaxis, then |f(z)| → ∞; if it approaches zero along the imaginary axis,
then |f(z)| remains constant but arg f(z) oscillates, and so on.
In fact it is not difficult to show that even in an arbitrarily small neighborhood
of an essential singularity, a function f(z) assumes values arbitrarily close to
any desired complex number (Weierstrass-Casorati theorem). Also, an even more
explicit statement can be proved, known as
Picard’s theorem. In an arbitrarily small neighborhood of an essential
singularity, a function f(z) assumes infinitely many times every complex
value except, perhaps, one particular value.
Remark. It should be emphasized that an infinite principal part in the Laurent series
implies essential singularity only if the series is valid “up to the singular point”:*
Thus the expansion
$$f(z) = \frac{1}{(z - 1)^2} + \frac{1}{(z - 1)^3} + \frac{1}{(z - 1)^4} + \cdots$$
does not mean that f(z) has essential singularity at z = 1. The point is that the above
series converges only if |z − 1| > 1. It actually represents the function
$$f(z) = \frac{1}{z^2 - 3z + 2}$$
in the annulus 1 < |z − 1| < ∞.
... 1 and represents a branch of the function √(z² − 1)
which is analytic in this region. The branch line joins, in this case, two branch
points z = +1 and z = −1 and does not extend to infinity (Fig. 2.15). Replacing
z by (z − 1), we obtain a Laurent series centered about the branch point z = 1.
The latter will converge for |z − 1| > 2.
* Namely, for all points in a neighborhood |z − a| < ε except z = a.
Another type of singular behavior of an analytic function occurs when it
possesses an infinite number of isolated singularities converging to some limit
point. Consider, for instance,
$$f(z) = \csc\frac{1}{z} = \frac{1}{\sin(1/z)}.$$
The denominator has simple zeros whenever
$$z = \frac{1}{n\pi} \qquad (n = \pm 1, \pm 2, \ldots).$$
The function f(z) has simple poles at these points and the sequence of these poles
converges toward the origin. The origin cannot be called an isolated singularity
because every neighborhood of it contains at least one pole (actually an infinite
number of them).
Figure 2.15 (branch line)
2.12 THE RESIDUE THEOREM AND ITS APPLICATIONS
Let f(z) be analytic in some neighborhood of z = a except, perhaps, at z = a
itself [in other words f(z) is either analytic at z = a or has an isolated singularity
there]. Let C be a simple closed path lying in this neighborhood and surrounding
z = a. Then the integral
$$\operatorname{Res} f(a) = \frac{1}{2\pi i}\oint_C f(z)\,dz$$
is independent of the choice of C and is called the residue of the function f(z) at the point
z = a. Evidently, if f(z) is analytic at z = a (the point z = a is then called a
regular point), the residue is zero. If z = a is an isolated singularity, then the
residue may or may not be zero.
Examples
1. f(z) = 1/z; the residue at z = 0 is equal to unity.
2. f(z) = 1/z²; the residue at z = 0 is equal to zero.
According to the formula on p. 74, the residue is seen to be identical with the
coefficient c_{-1} in the Laurent series
$$f(z) = \sum_{n=-\infty}^{+\infty} c_n (z - a)^n,$$
which is valid for 0 < |z − a| < R (for some R).
The residues of a function at its isolated singularities find their application
in the evaluation of integrals, complex or real. The basis for these applications is
the residue theorem: If f(z) is analytic on and inside a closed contour C except for
a finite number of isolated singularities at z = a₁, a₂, ..., aₙ, which are all located
inside C, then
$$\oint_C f(z)\,dz = 2\pi i \sum_{k=1}^{n} \operatorname{Res} f(a_k).$$
This theorem is proved by the technique of cutting the channels between the contour
C and small circles C₁, C₂, ... around each singularity (Fig. 2.16).
Figure 2.16    Figure 2.17
There is a variety of practical methods for quick evaluation of residues:
FIRST METHOD. From the definition,
$$\operatorname{Res} f(a) = \frac{1}{2\pi i}\oint_C f(z)\,dz$$
(using a suitably chosen contour C). This method is rarely used, but may be valuable
if the primitive function of f(z) is known and has a branch point at z = a.
Example. f(z) = 1/z, F(z) = Log z. Any branch of Log z may be chosen, but in order
to preserve the relationship
$$f(z) = \frac{dF(z)}{dz},$$
the closed contour must be disconnected and the appropriate limiting process must be
applied. For instance (Fig. 2.17),
$$\oint_C \frac{dz}{z} = \lim\,\{\log(A) - \log(B)\} = 2\pi i,$$
where the end points A and B of the cut contour approach the negative real semiaxis
from above and from below, respectively.
Here the principal branch was used, which possesses a discontinuity 2πi on the negative
real semiaxis.
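SECOND METHOD. For a simple pole at z = a the following formula holds:
$$\operatorname{Res} f(a) = \lim_{z \to a} (z - a) f(z).$$
The limit involved is often obtained by simple substitution or the use of well-
known limiting values.
Example. f(z) = tan z/z². Then
$$\operatorname{Res} f(0) = \lim_{z \to 0} \frac{\tan z}{z} = \lim_{z \to 0} \frac{\sin z}{z}\,\frac{1}{\cos z} = 1.$$
THIRD METHOD. For a pole of order m at z = a the following formula holds:
$$\operatorname{Res} f(a) = \frac{1}{(m - 1)!}\lim_{z \to a}\left\{\frac{d^{m-1}}{dz^{m-1}}\left[(z - a)^m f(z)\right]\right\}.$$
Example. f(z) = e^z/z⁴. Then
$$\operatorname{Res} f(0) = \frac{1}{3!}\lim_{z \to 0}\frac{d^3}{dz^3}\,e^z = \frac{1}{6}\lim_{z \to 0} e^z = \frac{1}{6}.$$
FOURTH METHOD. A common case of a simple pole is when f(z) has the form
$$f(z) = \frac{\varphi(z)}{\psi(z)},$$
where φ(a) ≠ 0 and ψ(z) has a simple zero at z = a. In this case
$$\operatorname{Res} f(a) = \frac{\varphi(a)}{\psi'(a)}.$$
Note that if z = a is a simple zero of ψ(z), then ψ′(a) cannot vanish.
Example. f(z) = e^z/sin z. Then
$$\operatorname{Res} f(0) = \left.\frac{e^z}{\cos z}\right|_{z=0} = 1.$$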
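The three worked examples above can be cross-checked numerically by going back to the first method, the defining contour integral. The sketch below is only an illustration: the helper name `residue`, the circle radius and the number of sample points are assumptions, and the trapezoidal rule on a circle is a standard numerical technique rather than anything taken from the text.

```python
import cmath
import math

def residue(f, a, radius=1.0, n=256):
    """Approximate Res f(a) = (1/(2 pi i)) * integral of f over |z - a| = radius.

    The circle must enclose no singularity other than z = a.
    """
    total = 0.0 + 0.0j
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)   # w = z - a on the circle
        # dz = i w dtheta, so f(z) dz / (2 pi i) contributes f(z) * w / n
        total += f(a + w) * w
    return total / n

print(residue(lambda z: cmath.tan(z) / z ** 2, 0.0))        # second method example
print(residue(lambda z: cmath.exp(z) / z ** 4, 0.0))        # third method example
print(residue(lambda z: cmath.exp(z) / cmath.sin(z), 0.0))  # fourth method example
```

The three printed values should agree with 1, 1/6 and 1 to rounding error. The radius 1 works here because the nearest other singularities (at ±π/2 for tan z and ±π for 1/sin z) lie outside the unit circle.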
FIFTH METHOD. Expand f(z) in the Laurent series and pick out the residue.
This method is valuable if f(z) is a product of functions with known Laurent
series. The series for f(z) is then obtained by multiplication. In practice, the
coefficient c_{-1} can be picked out by inspection.
Example, f(z) = e#/(e + 3(z — 1).
The residue at z = 1 is desired. In the first step, transfer the pole to the origin:
z-l=u zeo+l.
Then
ef
f=86 FUNCTIONS OF A COMPLEX VARIABLE. 2.12
In the second step, expand e“ and 1/(3 + 2
2
MeL wth wot Sul 4 (allw),
1 tae 1 eo,
he thee e | (lel <3).
In the third step, evaluate (by inspection) the coefficient with w* from the product of the
above two series:
In the fourth step, evaluate the residue:
Exercise. Prove the validity of the third method given above. (Hint: Represent f(z) as
f(z) = φ(z)/(z − a)^m
and use the formula from p. 66.)
The residue theorem can be applied to the evaluation of a wide variety of
definite integrals, real or complex. Some of the most frequently used procedures
are shown in the several examples which follow.
Example 1. Consider the real integral
$$I = \int_0^{2\pi} \frac{d\theta}{1 - 2p\cos\theta + p^2} \qquad (|p| \neq 1).$$
This integral can be converted into a contour integral in the complex plane by setting
z = e^{iθ}. Then
$$d\theta = \frac{dz}{iz}, \qquad \cos\theta = \frac{1}{2}\left(z + \frac{1}{z}\right)$$
and
$$I = \oint_C \frac{dz}{i(1 - pz)(z - p)},$$
where C is the unit circle in the z-plane. The integrand has two poles: at z = p
and z = 1/p. If |p| < 1, the pole z = p is inside the contour, while the pole
z = 1/p is outside. Only the residue at z = p is needed; it is equal to
$$\frac{1}{i(1 - p^2)};$$
therefore
$$I = \frac{2\pi}{1 - p^2} \qquad (|p| < 1).$$
If |p| > 1, the needed residue is at z = 1/p and it is equal to
$$\frac{1}{i(p^2 - 1)},$$
yielding
$$I = \frac{2\pi}{p^2 - 1} \qquad (|p| > 1).$$
Note that both results can be combined into
$$I = \frac{2\pi}{|p^2 - 1|} \qquad (|p| \neq 1),$$
while the integral is not defined for |p| = 1.
This method can be used for integrals of the type ∫_0^{2π} R(cos θ, sin θ) dθ,
where R(cos θ, sin θ) is a rational function of cos θ and sin θ.
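A direct quadrature offers an independent check of the combined result of Example 1. The following sketch assumes real p; the function name and the number of subintervals are illustrative choices. It uses the midpoint rule, which converges very quickly for smooth periodic integrands.

```python
import math

def integral_example_1(p, n=4000):
    """Midpoint rule for the integral of 1/(1 - 2 p cos(theta) + p^2) over [0, 2 pi]."""
    h = 2.0 * math.pi / n
    return h * sum(
        1.0 / (1.0 - 2.0 * p * math.cos((k + 0.5) * h) + p * p)
        for k in range(n)
    )

for p in (0.5, 2.0, -3.0):
    # numerical value next to the residue-calculus value 2 pi / |p^2 - 1|
    print(p, integral_example_1(p), 2.0 * math.pi / abs(p * p - 1.0))
```

Values of p on both sides of |p| = 1 are included so that both residue computations (the pole at z = p and the pole at z = 1/p) are exercised.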
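Example 2. Consider the real integral
$$I = \int_{-\infty}^{+\infty} \frac{dx}{x^2 + a^2} = \lim_{R \to \infty} \int_{-R}^{+R} \frac{dx}{x^2 + a^2} \qquad (a > 0).$$
The integral ∫_{−R}^{+R} dx/(x² + a²) can be treated as a portion of the complex integral
∮_C dz/(z² + a²) evaluated over the contour C shown in Fig. 2.18. Indeed (set
z = x on the real axis):
$$\oint_C \frac{dz}{z^2 + a^2} = \int_{-R}^{+R} \frac{dx}{x^2 + a^2} + \int_{C_R} \frac{dz}{z^2 + a^2}.$$
Let us estimate the integral over the semicircle C_R when R is very large: Write
$$\frac{1}{z^2 + a^2} = \frac{1}{z^2}\cdot\frac{1}{1 + a^2/z^2}.$$
If |z| = R is very large, then |a²/z²| = a²/R² is small and |1 + a²/z²| is almost
equal to unity (Fig. 2.19). To be precise, observe that |1 + a²/z²| > 1/2 for
R > a√2, and consequently
$$\frac{1}{|1 + a^2/z^2|} < 2 \qquad (\text{for } R > a\sqrt{2}).$$
This implies
$$\left|\frac{1}{z^2 + a^2}\right| < \frac{2}{R^2} \qquad (\text{for } R > a\sqrt{2}).$$
Figure 2.18    Figure 2.19
Now, employ the estimate (p. 58)
$$\left|\int_{C_R} \frac{dz}{z^2 + a^2}\right| \leq \pi R \,\max\left|\frac{1}{z^2 + a^2}\right| < \pi R \cdot \frac{2}{R^2} = \frac{2\pi}{R}.$$
Then
$$\lim_{R \to \infty} \int_{C_R} \frac{dz}{z^2 + a^2} = 0.$$
Observe now that the integral ∮_C dz/(z² + a²) is independent of the radius R (so
long as R is greater than a) because the only singularity of the integrand within C
is at z = ai and, by the residue theorem,
$$\oint_C \frac{dz}{z^2 + a^2} = 2\pi i \operatorname{Res} f(ai) = 2\pi i\,\frac{1}{2ai} = \frac{\pi}{a}$$
(for all C such that R > a). Consequently, if we let R → ∞, we have
$$\oint_C \frac{dz}{z^2 + a^2} = \lim_{R \to \infty} \int_{-R}^{+R} \frac{dx}{x^2 + a^2} + \lim_{R \to \infty} \int_{C_R} \frac{dz}{z^2 + a^2},$$
which reduces to
$$\frac{\pi}{a} = \int_{-\infty}^{+\infty} \frac{dx}{x^2 + a^2} \qquad (a > 0).$$
Exercise. Show that if a < 0, then
$$\int_{-\infty}^{+\infty} \frac{dx}{x^2 + a^2} = -\frac{\pi}{a}.$$
[Hint: No new calculations need be done, just some logical deductions.]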
The above procedure can be applied to integrals of the type
$$\int_{-\infty}^{+\infty} \frac{P(x)}{Q(x)}\,dx,$$
where P(x) and Q(x) are polynomials in x and (a) Q(x) should have no real zeros*
and (b) the degree of Q(x) must exceed the degree of P(x) by at least 2 (otherwise
the integral over the semicircle C_R may not tend to zero). For such integrals it
is true that
$$\int_{-\infty}^{+\infty} \frac{P(x)}{Q(x)}\,dx = 2\pi i \sum_{+} \operatorname{Res},$$
where Σ₊ Res is the sum of the residues of the integrand in the upper half-plane.
* If Q(x) has real zeros, then the integral is not defined (see p. 63), unless particular
modifications are made (e.g., p. 111).
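To illustrate the formula just stated, the sketch below evaluates the integral of 1/(1 + x⁴) over the real axis in two ways. The integrand is an example chosen here, not one from the text; its poles in the upper half-plane and the substitution x = tan t used for the direct quadrature are likewise assumptions of this illustration. The residues are computed by the fourth method, P(a)/Q′(a).

```python
import cmath
import math

# Poles of 1/(1 + z^4) in the upper half-plane: exp(i pi/4) and exp(3 i pi/4)
poles = [cmath.exp(1j * math.pi / 4), cmath.exp(3j * math.pi / 4)]

# Simple poles: residue = P(a)/Q'(a) with P = 1 and Q = 1 + z^4
residues = [1.0 / (4.0 * a ** 3) for a in poles]
by_residues = 2j * math.pi * sum(residues)

def direct(n=2000):
    """Midpoint rule after the substitution x = tan(t), -pi/2 < t < pi/2."""
    h = math.pi / n
    total = 0.0
    for k in range(n):
        t = -math.pi / 2 + (k + 0.5) * h
        c, s = math.cos(t), math.sin(t)
        total += c * c / (c ** 4 + s ** 4)   # dx / (1 + x^4) expressed in t
    return h * total

print(by_residues, direct(), math.pi / math.sqrt(2.0))
```

Both computations should reproduce the known value π/√2, and the imaginary part of the residue sum should vanish to rounding error, as it must for a real integral.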
This statement is a special case of the following theorem: If f(z) is continuous for
|z| > R₀ (some R₀) and |zf(z)| → 0 uniformly as |z| → ∞, then
$$\lim_{R \to \infty} \oint_{|z| = R} f(z)\,dz = 0.$$
Proof
$$\left|\oint_{|z| = R} f(z)\,dz\right| = \left|\oint_{|z| = R} z f(z)\,\frac{dz}{z}\right| \leq \max |z f(z)| \cdot \frac{2\pi R}{R}.$$
Uniform convergence of |zf(z)| means that |zf(z)| < ε (for arbitrarily small ε)
whenever |z| > R (that is, independently of the manner in which z approaches
infinity). Then
$$\left|\oint_{|z| = R} f(z)\,dz\right| < 2\pi\varepsilon,$$
and the theorem follows. The conditions of the theorem are satisfied by the
function
$$f(z) = \frac{P(z)}{Q(z)} \qquad (\text{with } \deg Q \geq \deg P + 2)$$
because (a) all zeros of Q must be within some fixed circle about the origin and
(b) the condition |zf(z)| < ε for |z| > k (for some k) can be satisfied.
Remark. The method described above can be extended to certain integrals of the type
$$\int_0^{\infty} f(x)\,dx,$$
where f(x) is an even function of x.
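Example 3. Consider the real integral
$$I = \int_0^{\infty} \frac{\cos x\,dx}{x^2 + a^2} \qquad (a > 0).$$
Note, first of all, that
$$I = \frac{1}{2}\int_{-\infty}^{+\infty} \frac{\cos x\,dx}{x^2 + a^2}.$$
The replacement of x by z will not help in this case because cos z is not “well
behaved” in the upper half-plane; it is not bounded. However, the function e^{iz}
is bounded in the upper half-plane because e^{iz} = e^{−y}e^{ix} and |e^{ix}| = 1 (all real x)
while |e^{−y}| ≤ 1 (all nonnegative y). For this reason, consider the complex integral
$$J = \oint_C \frac{e^{iz}\,dz}{z^2 + a^2} = \int_{-R}^{+R} \frac{e^{ix}\,dx}{x^2 + a^2} + \int_{C_R} \frac{e^{iz}\,dz}{z^2 + a^2}$$
evaluated over the contour shown in Fig. 2.20. Observe that
$$\lim_{R \to \infty} \int_{C_R} \frac{e^{iz}\,dz}{z^2 + a^2} = 0$$
(as in Example 2). Also,
$$J = 2\pi i \operatorname{Res} f(ai) = 2\pi i\,\frac{e^{-a}}{2ai} = \frac{\pi}{a}\,e^{-a},$$
so that
$$\int_{-\infty}^{+\infty} \frac{e^{ix}\,dx}{x^2 + a^2} = \int_{-\infty}^{+\infty} \frac{\cos x\,dx}{x^2 + a^2} + i\int_{-\infty}^{+\infty} \frac{\sin x\,dx}{x^2 + a^2} = \frac{\pi}{a}\,e^{-a} \qquad (a > 0).$$
Since the right-hand side is real, it follows that
$$I = \int_0^{\infty} \frac{\cos x\,dx}{x^2 + a^2} = \frac{\pi}{2a}\,e^{-a}.$$
Remark. The statement
$$\int_{-\infty}^{+\infty} \frac{\sin x\,dx}{x^2 + a^2} = 0$$
could have been made immediately on the grounds of symmetry. However, if it were
not true, we would have obtained the value of this integral as well from the imaginary
part of 2πi Σ₊ Res.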
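The closed form of Example 3 can be compared with a brute-force quadrature. The sketch below is a rough check under stated assumptions: the infinite range is truncated at 200π (a multiple of π, so that the leading oscillatory tail term vanishes), and Simpson's rule with a fixed step is used. The function name and both numerical parameters are illustrative.

```python
import math

def cos_integral(a, upper=200.0 * math.pi, steps=200000):
    """Simpson's rule for the integral of cos(x)/(x^2 + a^2) over [0, upper].

    steps must be even; the neglected tail beyond `upper` is of order 1/upper^3.
    """
    h = upper / steps
    total = 1.0 / (a * a) + math.cos(upper) / (upper * upper + a * a)
    for k in range(1, steps):
        x = k * h
        weight = 4.0 if k % 2 else 2.0
        total += weight * math.cos(x) / (x * x + a * a)
    return total * h / 3.0

for a in (1.0, 2.0):
    # truncated quadrature next to the residue result (pi / 2a) exp(-a)
    print(a, cos_integral(a), math.pi / (2.0 * a) * math.exp(-a))
```

Agreement to six or seven digits is all that should be expected from this truncation, which is enough to confirm the factor e^{−a} produced by the residue at z = ai.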
Example 4. Consider the real integral
$$I = \int_0^{\infty} \frac{\sin x}{x}\,dx.$$
As before, observe that
$$I = \frac{1}{2}\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx.$$
Since sin z is not “well behaved” on the upper half-plane, we shall try to evaluate
the complex integral
$$\oint_C \frac{e^{iz}}{z}\,dz.$$
A new problem now arises: the integrand has a pole on the real axis. Note that
this pole is not caused by sin x but rather by cos x, which has been added to form
the complex integral.
Figure 2.20    Figure 2.21
To be able to apply the residue theorem we must avoid the pole at the origin
in some fashion. Let us suppose we do this by means of a semicircle of a small
radius r in the upper half-plane, as shown in Fig. 2.21. Then we can write
$$\oint_C \frac{e^{iz}}{z}\,dz = \int_{-R}^{-r} \frac{e^{ix}}{x}\,dx + \int_{C_r} \frac{e^{iz}}{z}\,dz + \int_{r}^{R} \frac{e^{ix}}{x}\,dx + \int_{C_R} \frac{e^{iz}}{z}\,dz.$$
The reason the chosen contour C is helpful in the evaluation of our real integral is
as follows. Note that (by continuity of sin x/x)
$$\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \lim_{\substack{r \to 0 \\ R \to \infty}} \int_{-R}^{-r} \frac{\sin x}{x}\,dx + \lim_{r \to 0} \int_{-r}^{+r} \frac{\sin x}{x}\,dx + \lim_{\substack{r \to 0 \\ R \to \infty}} \int_{r}^{R} \frac{\sin x}{x}\,dx.$$
Now, since evidently
$$\lim_{r \to 0} \int_{-r}^{+r} \frac{\sin x}{x}\,dx = 0,$$
it follows that
$$\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \lim_{\substack{r \to 0 \\ R \to \infty}} \int_{-R}^{-r} \frac{\sin x}{x}\,dx + \lim_{\substack{r \to 0 \\ R \to \infty}} \int_{r}^{R} \frac{\sin x}{x}\,dx.$$
We can now see that the imaginary parts of the first and the third integrals on the
right-hand side of the relation
$$\oint_C \frac{e^{iz}}{z}\,dz = \int_{-R}^{-r} \frac{e^{ix}}{x}\,dx + \int_{C_r} \frac{e^{iz}}{z}\,dz + \int_{r}^{R} \frac{e^{ix}}{x}\,dx + \int_{C_R} \frac{e^{iz}}{z}\,dz$$
will give us the desired result in the limit where r → 0 and R → ∞, provided we
manage to calculate the other three integrals in the formula.
Let us estimate the integral over C_R. The method of Example 2 fails here;
however, we can perform integration by parts:*
$$\int_{C_R} \frac{e^{iz}}{z}\,dz = \left.\frac{e^{iz}}{iz}\right|_{z=R}^{z=-R} + \int_{C_R} \frac{e^{iz}}{iz^2}\,dz.$$
If R → ∞, both terms on the right-hand side approach zero (cos R is bounded
because R is real).
* From the results of Section 2.7 it is trivial to verify that integration by parts applies to
complex integrals as well as to real ones.
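Next, we evaluate the integral over C_r:
$$\int_{C_r} \frac{e^{iz}}{z}\,dz = \int_{C_r} \frac{dz}{z} + \int_{C_r} \frac{e^{iz} - 1}{z}\,dz.$$
Since e^{iz} is continuous at z = 0 and is equal to unity there, it follows that
|e^{iz} − 1| < ε for sufficiently small r, so that the second integral is bounded in
modulus by πε and vanishes as r → 0, while the first integral is equal to −πi.
Finally, the integral over the closed contour C is
zero (no poles of e^{iz}/z within the contour). Taking the limits r → 0, R → ∞
(in either order), we obtain
$$0 = i\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx - \pi i.$$
Therefore*
$$I = \int_0^{\infty} \frac{\sin x}{x}\,dx = \frac{\pi}{2}.$$
* Of course, this integral can be evaluated by much more elementary methods. The purpose
of the above analysis is to illustrate the techniques of the residue calculus on a
simple example rather than to obtain this particular result.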
Example 5. We shall now evaluate the integral of Example 4 by a somewhat
different method. We write
$$\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \int_{-\infty}^{+\infty} \frac{\sin z}{z}\,dz.$$
Here we treat our integral as a complex integral over a path (real axis) which is
open (so far). Since sin z/z is continuous at z = 0, we may deform the contour
as shown in Fig. 2.22 and claim that
$$\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \lim_{r \to 0} \int_{C} \frac{\sin z}{z}\,dz,$$
where C denotes the deformed path.
Now set
$$\sin z = \frac{e^{iz} - e^{-iz}}{2i}.$$
Figure 2.22
The problem is to evaluate
$$I_1 = \lim_{r \to 0} \int_{C} \frac{e^{iz}}{z}\,dz \quad \text{and} \quad I_2 = \lim_{r \to 0} \int_{C} \frac{e^{-iz}}{z}\,dz.$$
For I₁, close the contour as usual (Fig. 2.23a). Show that the integral over C_R
approaches zero, and deduce that
$$I_1 = 0.$$
For I₂, close the contour through the lower half-plane as shown in Fig. 2.23(b).
Now |e^{−iz}| is bounded in the lower half-plane and the integral over C′_R approaches
zero. On the other hand, note that (a) there is a contribution from the pole at
the origin and (b) the clockwise integration introduces a change in the sign.
When this is taken into account, we obtain
$$I_2 = \lim_{r \to 0} \int_{C} \frac{e^{-iz}}{z}\,dz = -2\pi i \operatorname{Res} f(0) = -2\pi i.$$
Figure 2.23 (a), (b)
Combining these results, we deduce that
$$\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \frac{1}{2i}\,(I_1 - I_2) = \frac{1}{2i}\,(0 + 2\pi i) = \pi,$$
in conformity with the previous result.
Remark 1. The integral over the semicircle C_r in Example 4 has been shown to yield the
value −πi in the limit r → 0. This is just one-half the value obtained by integration over
the full circle. One may prove a general theorem to this effect: Let f(z) be analytic at
z = a. Consider the integral
$$I_r = \int_{z_1}^{z_2} \frac{f(z)}{z - a}\,dz$$
taken from z₁ = a + re^{iθ₁} to z₂ = a + re^{iθ₂} along the circle
|z − a| = r (Fig. 2.24). Then
$$\lim_{r \to 0} I_r = \alpha i f(a),$$
where α = θ₂ − θ₁ + 2nπ (choose n so that |α| < 2π).
Figure 2.24
Exercise. Prove this theorem using a technique similar to that in Example 4.
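The arc theorem of Remark 1 lends itself to a quick numerical experiment. The sketch below integrates f(z)/(z − a) along a circular arc for decreasing radii; the function name, the choice f(z) = e^{iz} with a = 0 (the situation of Example 4) and the quadrature parameters are assumptions made for illustration.

```python
import cmath
import math

def arc_integral(f, a, r, theta1, theta2, n=2000):
    """Midpoint rule for the integral of f(z)/(z - a) along z = a + r exp(i theta)."""
    h = (theta2 - theta1) / n
    total = 0.0 + 0.0j
    for k in range(n):
        theta = theta1 + (k + 0.5) * h
        z = a + r * cmath.exp(1j * theta)
        total += f(z) * 1j * h      # dz / (z - a) = i dtheta on the circle
    return total

def f(z):
    return cmath.exp(1j * z)

# Upper semicircle traversed clockwise, from theta = pi to theta = 0
for r in (1e-1, 1e-2, 1e-3):
    print(r, arc_integral(f, 0.0, r, math.pi, 0.0))
```

Here α = 0 − π = −π and f(0) = 1, so the printed values should approach −πi as r decreases, with a deviation proportional to r.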
Remark 2. The statement that the integral of e^{iz}/z over C_R in Example 4 vanishes in the
limit R → ∞ is also a special case of a more general result, known as
Jordan’s lemma: If f(z) converges uniformly to zero whenever z approaches infinity, then
$$\lim_{R \to \infty} \int_{C_R} f(z)\,e^{i\lambda z}\,dz = 0,$$
where λ is any positive number and C_R is the upper half of the circle |z| = R.
The term “uniform convergence” as z → ∞ means that |f(z)| < ε whenever |z| > M
(for some M) no matter what the phase of z is.
Example 6. The integral reads
(-1 oo. It follows that J, is bounded by, say 2e~**®®, and this
approaches zero since |a| 0) is the principal
branch, with k = 0. It is convenient to work with the branch for which
0 0 and with the branch
k = 1 for Imz <0. Then our real integral I coincides* with the integral of this
branch along the upper edge of the branch cut (see Fig. 2.26). Close the contour
as shown in the figure. The integral over the lower edge of the branch cut is then
° 2
ree ears
0 I+z o I+x
dz [ ote
THz tle, t+z—
Therefore,
2ni Res f(—1) = [l — enemy gf
Now, for [zh = R,
Zo dz ar
fg oe ee Gtk)
27" dz| pa @
[at war) = In 30 (asr—0).
‘The residue at z = —1 is e@~*; therefore
ee
— ei d2* ~ sin ow
* In the limit ε → 0, r → 0, and R → ∞, of course.
Figure 2.26    Figure 2.27
Remark. It is also possible to operate with the standard principal branch −π < Arg z ...
2.13 CONFORMAL MAPPING BY ANALYTIC FUNCTIONS
Figure 2.29 (z-plane and w-plane)
Note that the function f(z) = z² maps half the z-plane onto the entire w-plane.
If the points on the real z-axis are included then the real positive semiaxis in the
w-plane serves as a double image (for a single image it is necessary to exclude
either the real negative or the real positive semiaxis in the z-plane).
It is, perhaps, most convenient to say that w = z² maps the open upper half-plane
(the region Im z > 0) onto the open region consisting of the w-plane with
the real positive semiaxis removed. This is usually referred to as the “w-plane with
a cut along the positive real semiaxis.”
Now consider a point z = z₀ in the z-plane and a smooth curve C passing
through it. Let f(z) be analytic at z = z₀ and let f(z) map point z₀ onto w₀
and the curve C onto curve Γ passing through it (Fig. 2.30).
Also consider a point z₁ on C, close to z₀, and its image w₁ on Γ. Denote
z₁ − z₀ = Δz, w₁ − w₀ = Δw. By definition of the derivative,
$$\lim_{\Delta z \to 0} \frac{\Delta w}{\Delta z} = f'(z_0).$$
Therefore
$$\lim_{\Delta z \to 0} \frac{|\Delta w|}{|\Delta z|} = |f'(z_0)|.$$
Geometrically, the left-hand side can be interpreted as the magnification of an
(infinitesimal) arc of curve C as the curve C is transformed into the curve Γ.
This magnification is the same for all curves passing through z₀.
Now consider the angles θ₀ and φ₀ which the curves C and Γ make with
the real axis. We have
$$\varphi_0 - \theta_0 = \lim_{\Delta z \to 0}\,[\arg \Delta w - \arg \Delta z] = \lim_{\Delta z \to 0} \arg\frac{\Delta w}{\Delta z} = \arg f'(z_0) = \alpha.$$
Geometrically, this means that the curve C is rotated through an angle α when
it is transformed into Γ. This angle of rotation is the same for all curves passing
through z₀. Observe, however, that these two geometrical statements lose their
meaning at the points z₀ where f′(z₀) = 0.
Figure 2.30 (z-plane and w-plane)
The properties described above are usually referred to as the conformal
mapping properties. Customarily the term “conformal mapping” is defined by:
a) Invariance of angles: An angle at zo formed by two curves remains unchanged
(in magnitude and orientation).
b) Invariance of infinitesimal circles: An infinitesimal circle around zo retains its
shape; it differs from a circle by higher order infinitesimal than its radius.
Our analysis leads then to the following theorem: If a function f(z) is analytic
at z₀ and the derivative f′(z₀) does not vanish, then the mapping z → f(z) is conformal
at z₀.
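The theorem can be probed numerically with finite differences. The sketch below measures the angle between the images of two directions through a point; the helper name `image_angle`, the step size, the test point and the two directions are all illustrative assumptions. It also shows what happens at a point where the derivative vanishes and for the complex-conjugate mapping.

```python
import cmath

def image_angle(f, z0, direction1, direction2, step=1e-6):
    """Signed angle from the image of direction1 to the image of direction2 at f(z0)."""
    dw1 = f(z0 + step * direction1) - f(z0)
    dw2 = f(z0 + step * direction2) - f(z0)
    return cmath.phase(dw2 / dw1)

def square(z):
    return z * z

def conjugate(z):
    return z.conjugate()

d1, d2 = cmath.exp(0.3j), cmath.exp(1.1j)   # two directions, 0.8 rad apart

print(image_angle(square, 1.0 + 1.0j, d1, d2))     # f'(z0) != 0: angle preserved
print(image_angle(square, 0.0j, d1, d2))           # f'(0) = 0: angle is doubled
print(image_angle(conjugate, 1.0 + 1.0j, d1, d2))  # magnitude kept, orientation reversed
```

The first value should be close to 0.8, the second close to 1.6 and the third close to −0.8, in line with the condition f′(z₀) ≠ 0 and with the requirement that orientation be preserved.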
Remarks
1. The statement f′(z₀) ≠ 0 guarantees the existence of the inverse mapping because if
w = f(z) and z = g(w), then g′(w₀) = 1/f′(z₀) provided f′(z₀) ≠ 0.
2. The mapping f(z) = z*, which reflects the z-plane in the real axis, preserves the
angles and maps circles into circles but it is not considered conformal because the
orientation of the angles is reversed (Fig. 2.31).
Figure 2.31 (upper panels: conformal mapping; lower panels: mapping f(z) = z*)
2.14 COMPLEX SPHERE AND POINT AT INFINITY
Many concepts of the theory of complex variables are greatly simplified through
the introduction of the so-called point at infinity. This is done with the help of
the stereographic projection between the complex plane and the complex sphere
which is defined as follows: Construct a sphere of radius R (for convenience, R
may be taken as ½) such that the complex plane is tangential to it at the origin,
as shown in Fig. 2.32. The point P on the sphere opposite to the origin (called
the north pole for convenience) is used as the “eye” of stereographic projection.
Lines through P are drawn which intersect both the sphere and the plane,
permitting a mapping of point z on the plane onto a point ζ on the sphere (Fig. 2.32).
In this fashion the entire complex plane is mapped onto the complex sphere (or
Riemann sphere). Curves and regions in the z-plane are mapped onto curves and
regions on the ζ-sphere.
Note that the point P itself has no counterpart on the z-plane. Nevertheless,
it has been found convenient to adjoin to the z-plane an extra point, known as the
point at infinity, in such a way that a curve passing through P on the ζ-sphere
is defined as approaching the point at infinity in the z-plane.
Figure 2.32
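The construction can be made concrete with explicit coordinates. The sketch below assumes a sphere of radius 1/2 resting on the plane at the origin, so that its center is at height 1/2 and the north pole P is at height 1; the coordinate convention, the function name and the resulting formula are derived here for illustration and do not appear in the text.

```python
def to_sphere(z):
    """Stereographic image of the complex number z.

    Assumes a sphere of radius 1/2 tangent to the plane at the origin; the image
    is where the line from the north pole P = (0, 0, 1) to (x, y, 0) meets the sphere.
    """
    d = 1.0 + abs(z) ** 2
    return (z.real / d, z.imag / d, abs(z) ** 2 / d)

for z in (0j, 1 + 0j, 3 - 4j, 1e6 + 0j):
    print(z, to_sphere(z))
```

The origin maps to the south pole (0, 0, 0), points of the unit circle land on the equator at height 1/2, and points with very large |z| approach the north pole, which is the sense in which P represents the point at infinity.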
The concept of the point at infinity is very useful, particularly in the analysis
of mappings. The following statements of properties can be verified without much
difficulty.
1. Circles in the plane are mapped onto circles on the sphere which do not pass
through P.
2. Straight lines in the plane are mapped onto circles on the sphere which do pass
through P.
3. Maps of intersecting straight lines have two common points on the ζ-sphere,
one of which is P.
4. Maps of parallel straight lines have only the point P in common and they have
a common tangent at P.
5. The exterior of a circle |z| = R with R >> 1 is mapped onto the interior of a
small spherical cap around point P. As R → ∞ the cap “shrinks to P.”
These and similar properties give rise to a series of definitions applied to the
so-called extended complex plane, i.e., a complex plane to which the point at
infinity is adjoined.
Examples
a) The mapping w = 1/z maps z₀ = 0 onto w₀ = ∞ (point at infinity adjoined
to w-plane) and vice versa. The rigorous meaning of this is, of course, that
if a sequence of points in the z-plane converges to zo = 0, then a correspond-
ing sequence of points on the w-sphere converges to its north pole.
b) The region |z| > R is a neighborhood of the point at infinity.
The importance of the complex sphere is greatly enhanced by the fact that if
two curves intersect in the z-plane at an angle Y, then their images on the sphere
will intersect at the same angle. In fact, the stereographic projection is conformal.
(The proof is not difficult, but will be omitted here.) This permits the definition
of the angle at infinity which two curves make if they recede to infinity in the
z-plane. This angle is defined to be the angle that their images on the sphere
make at the point P.
Now the following theorem can be stated.
Theorem. The mapping w = 1/z is conformal at the origin z = 0.
Observe that the function f(z) = 1/z is not defined at z = 0 but the mapping
w = 1/z is defined (by the subterfuge of the sphere). The conformality does not
follow from the analyticity, but rather from the relationships on the complex
sphere.
Corollary. The mapping w = 1/z is conformal at infinity (despite the fact
that f′(z) approaches zero as z recedes to infinity).
The concept of the point at infinity is closely interwoven with the study of
singularities of analytic functions. The very notion of analyticity can be extended
to the point at infinity by the following device: A function f(z) is defined to be
analytic at infinity if the function
g(z) = f(1/z)
is analytic at z = 0. Moreover, it is possible to introduce the concept of a pole at
infinity, branch point at infinity, etc., through the corresponding behavior of g(z)
at the origin. In this connection, the function f(z) = e^z, which has no zeros
and no singularities in the entire complex plane, turns out to possess an essential
singularity at infinity. Other functions which have no singularities (e.g., all
polynomials in z) are also found to have a breakdown of analyticity at infinity.
Exercise. Show that a polynomial P(z) of degree n has a pole of nth order at infinity.
In fact, a survey of familiar functions reveals the fact that functions which are
analytic at infinity possess at least one singularity elsewhere, i.e. for some finite
value of z. The natural conjecture is that there may not be a “perfectly analytic”
function. This problem has actually been resolved and is embodied in the
Liouville theorem. The only function f(z) which is analytic in the entire complex
plane and the point at infinity is the constant function f(z) = const.
In conclusion, it may be mentioned that in some texts the term complex plane
is tacitly assumed to mean the extended complex plane, with the point at infinity
included. Certain theorems may then be more conveniently stated. However, one
should never forget that while there is a point at infinity, there is still no such thing
as a complex number “infinity,” in the sense that it should possess the algebraic
properties shared by other complex numbers.
2.15 INTEGRAL REPRESENTATIONS
It is a very common occurrence that certain functions are represented and even
defined by integrals, with constant or variable limits. For instance, consider the
real function, the derivative of which is equal to sin x/x. It is known that the
indefinite integral of sin x/x is not expressible in terms of elementary functions*
in finite form. Therefore, there is no other alternative but to define such functions
by means of an integral and it is customary to write
$$\mathrm{Si}(x) = \int_0^x \frac{\sin t}{t}\,dt,$$
where Si (x) is a new function, called the sine-integral. It is, of course, but one of
many primitive functions of sin x/x (i.e. the one which vanishes at the origin)
the others being obtainable by adding an arbitrary constant to Si (x).
If we want to extend the definition of Si (x) to complex variables, it can be
done in a trivial way. We write
$$\mathrm{Si}(z) = \int_0^z \frac{\sin \zeta}{\zeta}\,d\zeta,$$
where ζ is a complex variable and the integral is now a curvilinear integral in the
complex ζ-plane over some path connecting the points ζ = 0 and ζ = z. It
does not matter which particular path is chosen since the function f(ζ) = sin ζ/ζ
is analytic for all (finite) values† of ζ and the integral is independent of path.
* That is, algebraic, exponential, trigonometric, and their inverse functions.
† See the discussion on pp. 79-80.
A different situation is encountered in the so-called cosine integral, defined by
$$\mathrm{Ci}(x) = -\int_x^{\infty} \frac{\cos t}{t}\,dt.$$
This function is easily seen to be a primitive function of cos x/x. The choice of
limits is conventional but the main feature is that the integral diverges at x = 0.
While it is still possible to extend the definition to complex variables by writing
$$\mathrm{Ci}(z) = -\int_z^{\infty} \frac{\cos \zeta}{\zeta}\,d\zeta,$$
the question arises as to the specification of the path of integration. Presumably,
the upper limit implies an asymptotic approach to the real axis.* Three such paths
are illustrated in Fig. 2.33. The integrals taken along C₁ and C₂ are easily shown
to be identical, but the integral along C₃ will differ by 2πi times the residue of the
integrand at ζ = 0. This means essentially that Ci (z) is a multivalued function,
the integrals along different paths yielding different branches. Two of these
branches are characterized by the paths C₁ and C₃ (or other paths equivalent
to them). Other branches can be obtained by circling the origin n times before
proceeding to “plus infinity.”
Figure 2.33
Remark. The above analysis reveals that Ci (x) for real negative x should be defined
by one of the branches of complex function Ci (z) because the real integral
$$\int_x^{\infty} \frac{\cos t}{t}\,dt$$
is divergent for x < 0 due to the behavior of the integrand at t = 0. It has been found,
however, that this divergent integral (like many similar others) can be given a definite
meaning by the prescription
$$\lim_{\varepsilon \to 0}\left\{\int_x^{-\varepsilon} \frac{\cos t}{t}\,dt + \int_{\varepsilon}^{\infty} \frac{\cos t}{t}\,dt\right\},$$
known as the Cauchy principal value (or simply the principal value) and is often denoted by
$$\mathrm{P}\!\!\int_x^{\infty} \frac{\cos t}{t}\,dt.$$
Because of this result, it is customary to define Ci (x) for x < 0 by
$$\mathrm{Ci}(x) = -\,\mathrm{P}\!\!\int_x^{\infty} \frac{\cos t}{t}\,dt \qquad (x < 0).$$
Principal values of integrals occur very often in physical applications; the complex
version of this definition† will be given shortly.
* As usual, we require Ci (z) to reduce to Ci (x) as z becomes real and positive. The
method of approaching infinity cannot be modified because cos z has essential singularity
at infinity (see p. 109).
† The principal-value integral defining Ci (x) for x < 0 is still a branch of Ci (z), although not one of those
described previously.
A second type of representation of functions by integrals occurs when the
limits of integration are fixed but the integrand depends on a parameter. A well-
known example of this kind is the integral
$$\int_0^{\infty} e^{-\lambda x^2}\,dx = \frac{1}{2}\sqrt{\frac{\pi}{\lambda}},$$
occurring in the kinetic theory of gases and many other branches of physics.*
It may be viewed as an integral representation of the function I(λ) = (√π/2)λ^{−1/2}.
This point of view is, as a matter of fact, utilized in practice when the function
I(λ) is differentiated† to yield other important integrals, like
$$\frac{dI(\lambda)}{d\lambda} = -\frac{\sqrt{\pi}}{4}\,\lambda^{-3/2} = -\int_0^{\infty} x^2 e^{-\lambda x^2}\,dx,$$
$$\frac{d^2 I(\lambda)}{d\lambda^2} = \frac{3\sqrt{\pi}}{8}\,\lambda^{-5/2} = \int_0^{\infty} x^4 e^{-\lambda x^2}\,dx.$$
In this particular example the function I(λ) represented by the integral is a familiar
one, but the same idea can be used to generate a variety of new functions. Consider,
for instance, the integral
$$f(n) = \int_0^{\infty} t^n e^{-t}\,dt \qquad (n = 1, 2, 3, \ldots),$$
which is equal to n! as is not difficult to prove.
The above integral representation of n! suggests the extension of the notion of
factorial. We may define the factorial function TI(x) by the integral
T(x) = if * Fe dt,
where x is not necessarily a positive integer. The integral converges at infinity
for all values of x, but we must demand that x > -1 for the integral to converge
at t = 0. Consequently, the function Π(x) is defined for all (real) x > -1 by
the above integral representation.
Since Π(n) = n!, we have
\[
\Pi(n + 1) = (n + 1)\, \Pi(n) \qquad (n = 1, 2, 3, \ldots).
\]
This formula, with n replaced by x, holds for all values of x > -1. Indeed, integrating by parts,
we obtain
\[
\int_0^{\infty} t^{x+1} e^{-t}\, dt = -t^{x+1} e^{-t} \Big|_0^{\infty} + (x + 1) \int_0^{\infty} t^x e^{-t}\, dt.
\]
* For the evaluation of this integral, see Kaplan, p. 218.
† Differentiation of improper integrals should be justified by appropriate theorems. See,
e.g., Kaplan, p. 379.
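Since
\[
\lim_{t \to \infty} t^{x+1} e^{-t} = 0 \qquad (\text{all } x),
\]
\[
\lim_{t \to 0} t^{x+1} e^{-t} = 0 \qquad (x > -1),
\]
it follows that
\[
\int_0^{\infty} t^{x+1} e^{-t}\, dt = (x + 1) \int_0^{\infty} t^x e^{-t}\, dt
\]
or
\[
\Pi(x + 1) = (x + 1)\, \Pi(x) \qquad (x > -1).
\]
In particular, Π(0) = 1. This is the actual reason for the commonly used convention
0! = 1.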
The recursion formula Π(x + 1) = (x + 1)Π(x) permits calculation of Π(x) for
any x (x > -1) provided a table of Π(x) is compiled for the interval 0 < x < 1.
Moreover, it permits an extension of the definition of Π(x) into the region x < -1.
Example. Calculate Π(1/2), Π(-1/2), Π(-3/2).
The first value can be obtained directly from the definition
\[
\Pi(\tfrac{1}{2}) = \int_0^{\infty} t^{1/2} e^{-t}\, dt
\]
by the substitution t = x². Then
\[
\Pi(\tfrac{1}{2}) = 2 \int_0^{\infty} x^2 e^{-x^2}\, dx = \frac{\sqrt{\pi}}{2}
\]
by a previous result. The value of Π(-1/2) can also be obtained directly in the same
manner. Alternatively, it can be obtained from the recursion formula. If
we set x = -1/2, then
\[
\Pi(\tfrac{1}{2}) = \tfrac{1}{2}\, \Pi(-\tfrac{1}{2});
\]
therefore
\[
\Pi(-\tfrac{1}{2}) = \sqrt{\pi}.
\]
The value of Π(-3/2) cannot be obtained directly (the integral diverges), but it may
be defined via the recursion formula. If we set x = -3/2, then
\[
\Pi(-\tfrac{1}{2}) = (-\tfrac{1}{2})\, \Pi(-\tfrac{3}{2}),
\]
yielding
\[
\Pi(-\tfrac{3}{2}) = -2\sqrt{\pi}.
\]
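The use of the recursion formula to reach the region x < -1 can be sketched in code. This is an illustration under stated assumptions: the name `factorial_function` is hypothetical, and the standard-library call `math.gamma(x + 1)` stands in for the table of Π(x) on the region x > -1 (it relies on the standard identity Γ(x + 1) = Π(x), which is not derived at this point in the text).

```python
import math

def factorial_function(x):
    """Evaluate Pi(x) for real x that is not a negative integer.

    For x > -1 the value is taken from math.gamma(x + 1), used here as a
    stand-in for a table of Pi(x). For x < -1 the recursion
    Pi(x) = Pi(x + 1) / (x + 1) is applied until the argument exceeds -1.
    """
    if x < 0 and x == int(x):
        raise ValueError("Pi(x) is infinite at negative integers")
    divisor = 1.0
    while x < -1.0:
        divisor *= (x + 1.0)   # accumulate the factors (x + 1) of the recursion
        x += 1.0
    return math.gamma(x + 1.0) / divisor

print(factorial_function(0.5), math.sqrt(math.pi) / 2)    # Pi(1/2)
print(factorial_function(-0.5), math.sqrt(math.pi))       # Pi(-1/2)
print(factorial_function(-1.5), -2 * math.sqrt(math.pi))  # Pi(-3/2)
```

The three printed pairs correspond to the three values computed in the example above.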
An important feature of Π(x) is that it approaches infinity as x approaches -1
from the right. The conclusion that Π(x) has infinite discontinuities at x equal to
a negative integer follows from this fact and the recursion formula. The graph
of Π(x) is sketched in Fig. 2.34.
While the extension of the concept of the factorial is widely used in applied
mathematics, it is not usually accomplished by means of the factorial function
Π(x) but rather by means of the related gamma function defined by
\[
\Gamma(x) = \Pi(x - 1).
\]
Evidently Γ(x) satisfies the relations
\[
\Gamma(x + 1) = x\, \Gamma(x) \qquad (\text{real } x),
\]
\[
\Gamma(x) = (x - 1)! \qquad (x = \text{positive integer}),
\]
and has the integral representation
\[
\Gamma(x) = \int_0^{\infty} t^{x-1} e^{-t}\, dt \qquad (x > 0).
\]
The behavior of Γ(x) as a function of x is easily obtained from Fig. 2.34 just
by shifting the origin to the point x = -1, y = 0.
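The step behind the word "evidently" can be written out as a short worked check. It uses only the definition Γ(x) = Π(x - 1) and the recursion formula for Π with x replaced by x - 1:
\[
\Gamma(x + 1) = \Pi(x) = x\, \Pi(x - 1) = x\, \Gamma(x),
\]
\[
\Gamma(n) = \Pi(n - 1) = (n - 1)! \qquad (n = 1, 2, 3, \ldots).
\]
Likewise, replacing x by x - 1 in the integral for Π(x) turns the convergence condition x > -1 into x > 0, which is the range quoted for the integral representation of Γ(x).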
Figure 2.34
Integral representations of the type given for factorial and gamma functions
can be extended to complex integrals as well. A complex function g(z) may be
defined by a definite integral
\[
g(z) = \int_{s_1}^{s_2} f(z, s)\, ds \qquad (\text{along } C),
\]
where the endpoints s_1 and s_2 as well as the path C between them are prescribed.
In many cases the points s_1 and s_2 coincide and the path is a closed contour.
From the theory of analytic functions, it follows that the path of integration
can be deformed within certain limits (to the extent that the Cauchy theorem and its
consequences permit).
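Path deformation can be illustrated numerically. The sketch below is not taken from the text: the integrand f(z, s) = e^{zs}/s³, the circular contours, and the names `circle_integral` and `g` are assumptions chosen for illustration. By the residue theorem, (1/2πi) ∮ e^{zs}/s³ ds = z²/2 for any closed contour encircling s = 0 once counterclockwise, so circles of different radii should give the same g(z).

```python
import cmath
import math

def circle_integral(f, radius, n_points=400):
    """Integrate f(s) ds counterclockwise around the circle |s| = radius.

    Uses equally spaced points in the angle theta, with ds = i * s * dtheta.
    """
    dtheta = 2 * math.pi / n_points
    total = 0j
    for k in range(n_points):
        s = radius * cmath.exp(1j * k * dtheta)
        total += f(s) * 1j * s * dtheta
    return total

def g(z, radius):
    """Illustrative g(z) = (1 / 2 pi i) * closed integral of exp(z s) / s**3 ds."""
    return circle_integral(lambda s: cmath.exp(z * s) / s**3, radius) / (2j * math.pi)

z = 1.0 + 0.5j            # arbitrary illustrative value of the parameter
small = g(z, radius=1.0)  # contour: unit circle
large = g(z, radius=3.0)  # deformed contour: circle of radius 3
exact = z**2 / 2          # value expected from the residue at s = 0

print(small, large, exact)
```

Both contours enclose the only singularity of the integrand (at s = 0), so deforming one into the other crosses no singular point and the value of the integral is unchanged.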