Chapter 6

Characteristic Functions and the Central Limit Theorem

6.1 Transforms and Characteristic Functions
There are several transforms or generating functions used in mathematics, probability and statistics. In general, they are all integrals of an exponential function, which has the advantage that it converts sums to products. They are all functions defined for $t \in \mathbb{R}$. In this section we use the notation $i = \sqrt{-1}$. For example:

1. (Probability) generating function: $g(s) = E(s^X)$.
2. Moment generating function: $m(t) = E[e^{tX}] = \int e^{tx}\,dF$.
3. Laplace transform: $L(t) = E[e^{-tX}] = \int e^{-tx}\,dF$.
4. Fourier transform: $E[e^{-itX}] = \int e^{-itx}\,dF$.
5. Characteristic function: $\varphi_X(t) = E[e^{itX}] = \int e^{itx}\,dF$.
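As a concrete, informal illustration (not part of the original text), the sketch below estimates the generating function, moment generating function and characteristic function of a Poisson($\lambda$) variable by Monte Carlo and compares them with the known closed forms; the choice of the Poisson distribution and of $\lambda$ is arbitrary.

```python
# Informal check: estimate E[s^X], E[e^{tX}] and E[e^{itX}] for X ~ Poisson(lam)
# by simulation, and compare with the closed forms
# g(s) = e^{lam(s-1)},  m(t) = e^{lam(e^t - 1)},  phi(t) = e^{lam(e^{it} - 1)}.
import numpy as np

rng = np.random.default_rng(0)
lam, n = 2.0, 200_000
x = rng.poisson(lam, size=n)

s, t = 0.7, 1.3
print(np.mean(s**x), np.exp(lam * (s - 1)))                             # generating fn.
print(np.mean(np.exp(t * x)), np.exp(lam * (np.exp(t) - 1)))            # m.g.f.
print(np.mean(np.exp(1j * t * x)), np.exp(lam * (np.exp(1j * t) - 1)))  # char. fn.
```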
Definition 106 (Characteristic Function) Define the characteristic function of a random variable $X$, or of its cumulative distribution function $F_X$, to be the complex-valued function on $t \in \mathbb{R}$
$$\varphi_X(t) = E[e^{itX}] = \int e^{itx}\,dF = E(\cos(tX)) + iE(\sin(tX)).$$
The main advantage of the characteristic function over transforms such as the Laplace transform, probability generating function or the moment generating function is property (a) below. Because we are integrating a bounded function, $|e^{itx}| = 1$ for all $x, t \in \mathbb{R}$, the integral exists for any probability distribution.
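This is not shared by the moment generating function: for a heavy-tailed law such as the Cauchy, $E[e^{tX}]$ is infinite for every $t \ne 0$, while $E[e^{itX}]$ always exists. A small simulation sketch (assuming the standard Cauchy; its characteristic function $e^{-|t|}$ is derived later in this chapter):

```python
# Sketch: the characteristic function exists even when the m.g.f. does not.
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_cauchy(500_000)
t = 0.5
# Monte Carlo estimate of E[e^{itX}] versus the true value e^{-|t|}.
print(np.mean(np.exp(1j * t * x)), np.exp(-abs(t)))
# An attempted estimate of E[e^{tX}] is dominated by a few huge terms and grows
# without bound as the sample size increases: the m.g.f. is infinite for t != 0.
```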
6.1.2 Properties of the Characteristic Function
The characteristic function $\varphi = \varphi_X$ has the following properties:

(a) $\varphi$ exists for any distribution of $X$.
(b) $\varphi(0) = 1$.
(c) $|\varphi(t)| \le 1$ for all $t$.
(d) $\varphi$ is uniformly continuous. That is, for all $\epsilon > 0$ there exists $\delta > 0$ such that $|\varphi(t) - \varphi(s)| \le \epsilon$ whenever $|t - s| \le \delta$.
(e) The characteristic function of $a + bX$ is $e^{iat}\varphi(bt)$.
(f) The characteristic function of $-X$ is the complex conjugate $\bar{\varphi}(t)$.
(g) A characteristic function is real-valued if and only if the distribution of the corresponding random variable $X$ is symmetric about zero, that is, if and only if $P[X > z] = P[X < -z]$ for all $z \ge 0$.
(h) The characteristic function of a convolution $F * G$ is $\varphi_F(t)\varphi_G(t)$.

Proofs.
(a) Note that for each $x$ and $t$, $|e^{itx}|^2 = \sin^2(tx) + \cos^2(tx) = 1$ and the constant 1 is integrable. Therefore $E|e^{itX}|^2 = 1$, and it follows that
$$E|e^{itX}| \le \sqrt{E|e^{itX}|^2} = 1.$$
(b) $e^{itX} = 1$ when $t = 0$. Therefore $\varphi(0) = Ee^{0} = 1$.
(c) This is included in the proof of (a).
(d) Let $h = s - t$ and assume without loss of generality that $s > t$. Then
$$|\varphi(t) - \varphi(s)| = |E[e^{itX}(e^{ihX} - 1)]| \le E[|e^{itX}(e^{ihX} - 1)|] = E[|e^{itX}||e^{ihX} - 1|] \le E[|e^{ihX} - 1|].$$
But as $h \to 0$ the function $e^{ihX} - 1$ converges to 0 for each $\omega$, and it is dominated by the constant 2. Therefore, by the Lebesgue Dominated Convergence Theorem, $E[|e^{ihX} - 1|] \to 0$ as $h \to 0$. So for a given $\epsilon > 0$, we can choose $h = |s - t|$ sufficiently small that $|\varphi(t) - \varphi(s)| \le \epsilon$.
(e) By definition, $Ee^{it(a+bX)} = e^{ita}E[e^{itbX}] = e^{iat}\varphi(bt)$.
(f) Recall that the complex conjugate of $a + bi$ is $a - bi$ when $a$ and $b$ are real, and that of $e^{iz}$ is $e^{-iz}$ when $z$ is real. Then
$$E[e^{it(-X)}] = E[e^{-itX}] = E[\cos(tX) - i\sin(tX)] = \overline{E[\cos(tX) + i\sin(tX)]} = \bar{\varphi}(t).$$
(g) The distribution of the corresponding random variable $X$ is symmetric if and only if $X$ has the same distribution as does $-X$. This is true if and only if they have the same characteristic function. By property (f) and the corollary below, this is true if and only if $\varphi(t) = \bar{\varphi}(t)$, which holds if and only if the function $\varphi(t)$ takes on only real values.
(h) Put $H = F * G$. When there is a possible ambiguity about the variable over which we are integrating, we occasionally use the notation $\int h(x)F(dx - y)$ to indicate the integral $\int h(x)\,dK(x)$, where $K$ is the cumulative distribution function given by $K(x) = F(x - y)$. Then
$$\int e^{itx}H(dx) = \int\!\!\int e^{itx}F(dx - y)G(dy) = \int\!\!\int e^{it(z+y)}F(dz)G(dy), \quad \text{with } z = x - y,$$
and this is $\varphi_F(t)\varphi_G(t)$.
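Property (h) is easy to check empirically. The following sketch (an illustration with arbitrary choices, $X$ exponential and $Y$ uniform) estimates the characteristic function of $X + Y$ and the product of the individual characteristic functions:

```python
# Sketch of property (h): the characteristic function of an independent sum
# factors into the product of the characteristic functions.
import numpy as np

rng = np.random.default_rng(2)
n = 400_000
x, y = rng.exponential(1.0, n), rng.uniform(0.0, 1.0, n)
t = 2.0
lhs = np.mean(np.exp(1j * t * (x + y)))
rhs = np.mean(np.exp(1j * t * x)) * np.mean(np.exp(1j * t * y))
print(lhs, rhs)  # the two agree up to simulation error
```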
The major reason for our interest in characteristic functions is that they uniquely describe the distribution. Probabilities of intervals can be recovered from the characteristic function using the following inversion theorem.

Theorem 107 (Inversion Formula) If $X$ has characteristic function $\varphi_X(t)$, then for any interval $(a, b)$,
$$P[a < X < b] + \frac{P[X = a] + P[X = b]}{2} = \lim_{T \to \infty} \frac{1}{2\pi}\int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\,\varphi_X(t)\,dt.$$

Proof. Substituting $\varphi_X(t) = \int_{\mathbb{R}} e^{itx}F(dx)$,
$$\frac{1}{2\pi}\int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\,\varphi_X(t)\,dt = \frac{1}{2\pi}\int_{-T}^{T}\int_{\mathbb{R}} \frac{e^{it(x-a)} - e^{it(x-b)}}{it}\,F(dx)\,dt.$$
Since the cosine part of the integrand is an odd function of $t$, for any real constant $c$
$$\int_{-T}^{T} \frac{e^{itc}}{it}\,dt = 2\int_{0}^{T} \frac{\sin(tc)}{t}\,dt,$$
and so we obtain from above
$$\frac{1}{2\pi}\int_{-T}^{T} \frac{e^{-ita} - e^{-itb}}{it}\,\varphi(t)\,dt = \int_{\mathbb{R}} \left\{\frac{1}{\pi}\int_{0}^{T} \frac{\sin(t(x-a))}{t}\,dt - \frac{1}{\pi}\int_{0}^{T} \frac{\sin(t(x-b))}{t}\,dt\right\}F(dx).$$
But as $T \to \infty$, it is possible to show that the integral (this is known as the sine integral function)
$$\frac{1}{\pi}\int_{0}^{T} \frac{\sin(t(x-a))}{t}\,dt \to \begin{cases} -\frac{1}{2}, & x < a \\ \frac{1}{2}, & x > a \\ 0, & x = a. \end{cases}$$
Substituting this above and taking the limit through the integral using the Lebesgue Dominated Convergence Theorem, the limit is the integral with respect to $F(dx)$ of the function
$$g(x) = \begin{cases} \frac{1}{2}, & x = a \\ \frac{1}{2}, & x = b \\ 1, & a < x < b \\ 0, & \text{elsewhere,} \end{cases}$$
and this integral equals
$$P[a < X < b] + \frac{P[X = a] + P[X = b]}{2}.$$
Corollary 108 If the characteristic functions of two random variables $X$ and $Y$ agree, then $X$ and $Y$ have the same distribution.

Proof. This follows immediately from the inversion formula above.

We have seen that if a sequence of cumulative distribution functions $F_n(x)$ converges pointwise to a limit, the limiting function $F(x)$ is not necessarily a cumulative distribution function. To ensure that it is, we require that the distributions be tight. Similarly, if a sequence of characteristic functions converges for each $t$, the limit is not necessarily the characteristic function of a probability distribution. However, in this case the tightness of the sequence translates into a very simple condition on the limiting characteristic function.

Theorem 109 (Continuity Theorem) If $X_n$ has characteristic function $\varphi_n(t)$, then $X_n$ converges weakly if and only if there exists a function $\varphi(t)$ which is continuous at 0 such that $\varphi_n(t) \to \varphi(t)$ for each $t$. (Note: in this case $\varphi$ is the characteristic function of the limiting random variable $X$.)
Proof. Suppose $X_n \Rightarrow X$. Then since $e^{itx}$ is a continuous bounded function of $x$, $E(e^{itX_n}) \to E(e^{itX})$. Conversely, suppose that $\varphi_n(t) \to \varphi(t)$ for each $t$ and $\varphi$ is continuous at $t = 0$. First prove that for all $\epsilon > 0$ there exists $c < \infty$ such that $P[|X_n| > c] \le \epsilon$ for all $n$. This is Problem 14 below. It shows that the sequence of random variables $X_n$ is tight, in the sense that any subsequence of it contains a further subsequence which converges in distribution to a proper cumulative distribution function. By the first half of the proof, $\varphi(t)$ is the characteristic function of the limit. Thus, since every subsequence has the same limit, $X_n \Rightarrow X$.

Example 110 (Problem 18) Suppose $X_n \sim U[-n, n]$. Then the characteristic function of $X_n$ is $\varphi_n(t) = \sin(tn)/(tn)$. Does this converge as $n \to \infty$? Is the limit continuous at 0?

Example 111 (Problem 19) Suppose $X_1, \ldots, X_n, \ldots$ are independent Cauchy distributed random variables with probability density function
$$f(x) = \frac{1}{\pi(1 + x^2)}, \quad x \in \mathbb{R}.$$
Then the sample mean $\bar{X}$ has the same distribution as $X_1$. Note: we may use the integral formula
$$\int_{0}^{\infty} \frac{\cos(tx)}{b^2 + x^2}\,dx = \frac{\pi}{2b}e^{-tb}, \quad t \ge 0,$$
to obtain the characteristic function of the above Cauchy distribution, $\varphi(t) = e^{-|t|}$.
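Indeed, with $\varphi(t) = e^{-|t|}$ the mean of $n$ i.i.d. Cauchy variables has characteristic function $(e^{-|t|/n})^n = e^{-|t|}$. A simulation is consistent with this surprising fact: averaging many Cauchy observations does not concentrate the distribution (sample size and quantiles below are arbitrary choices).

```python
# Sketch for Example 111: the mean of 50 i.i.d. standard Cauchy variables is
# again standard Cauchy; compare empirical quantiles of the means with the
# Cauchy quantile function tan(pi(p - 1/2)).
import numpy as np

rng = np.random.default_rng(3)
means = rng.standard_cauchy((100_000, 50)).mean(axis=1)
for p in (0.25, 0.5, 0.75):
    print(np.quantile(means, p), np.tan(np.pi * (p - 0.5)))
```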
6.1.3 The Characteristic Function of $N(\mu, \sigma^2)$

The characteristic function of a random variable with the $N(\mu, \sigma^2)$ distribution is
$$\varphi(t) = \exp\left\{it\mu - \frac{\sigma^2 t^2}{2}\right\}.$$
(Note: recall that for any real constant $c$,
$$\int_{\mathbb{R}} e^{-(x-c)^2/2}\,dx = \sqrt{2\pi}.)$$
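The formula can be checked by numerical integration against the definition $\varphi(t) = \int e^{itx}f(x)\,dx$; the parameter values below are arbitrary.

```python
# Numerical check of the N(mu, sigma^2) characteristic function against
# exp(i t mu - sigma^2 t^2 / 2), integrating the density with scipy.
import numpy as np
from scipy.integrate import quad

mu, sigma, t = 1.0, 2.0, 0.8
dens = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
re = quad(lambda x: np.cos(t * x) * dens(x), -np.inf, np.inf)[0]
im = quad(lambda x: np.sin(t * x) * dens(x), -np.inf, np.inf)[0]
print(re + 1j * im, np.exp(1j * t * mu - sigma**2 * t**2 / 2))
```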
6.2 The Central Limit Theorem
Our objective is to show that the sum of independent random variables, when standardized, converges in distribution to the standard normal distribution. The proof usually used in undergraduate statistics requires the moment generating function. However, the moment generating function exists only if moments of all orders exist, and so a more general result, requiring only that the random variables have finite mean and variance, needs to use characteristic functions. Two preliminary lemmas are used in the proof.

Lemma 112 For real $x$,
$$e^{ix} - \left(1 + ix - \frac{x^2}{2}\right) = r(x), \quad \text{where } |r(x)| \le \min\left(\frac{|x|^3}{6},\; x^2\right).$$
Consequently, if $E(X^2) < \infty$, the characteristic function of $X$ satisfies
$$\varphi(t) = 1 + itE(X) - \frac{t^2}{2}E(X^2) + o(t^2) \quad \text{as } t \to 0.$$

Proof. By expanding $e^{ix}$ in a Taylor series with remainder we obtain
$$e^{ix} = 1 + ix - \frac{x^2}{2} - \frac{i}{2}b_2(x), \quad \text{where } b_n(x) = \int_0^x (x - s)^n e^{is}\,ds,$$
and a crude approximation provides $|b_2| \le |\int_0^x (x - s)^2\,ds| = |x|^3/3$, so $|r(x)| = |b_2|/2 \le |x|^3/6$. Integration by parts shows that $b_2 = ix^2 - 2ib_1$, and since $|b_1| \le x^2/2$, substituting this provides the remaining bound $|r(x)| \le x^2$ on the error term.

Lemma 113 For any complex numbers $w_i, z_i$ with $|z_i| \le 1$ and $|w_i| \le 1$,
$$\left|\prod_{i=1}^{n} z_i - \prod_{i=1}^{n} w_i\right| \le \sum_{i=1}^{n} |z_i - w_i|.$$

Proof. This is proved by induction using the fact that
$$\prod_{i=1}^{n} z_i - \prod_{i=1}^{n} w_i = (z_n - w_n)\prod_{i=1}^{n-1} z_i + w_n\left(\prod_{i=1}^{n-1} z_i - \prod_{i=1}^{n-1} w_i\right),$$
so that
$$\left|\prod_{i=1}^{n} z_i - \prod_{i=1}^{n} w_i\right| \le |z_n - w_n| + \left|\prod_{i=1}^{n-1} z_i - \prod_{i=1}^{n-1} w_i\right|.$$

This shows the often used result that
$$\left(1 - \frac{c}{n} + o\left(\frac{1}{n}\right)\right)^n - \left(1 - \frac{c}{n}\right)^n \to 0$$
and hence that
$$\left(1 - \frac{c}{n} + o\left(\frac{1}{n}\right)\right)^n \to e^{-c} \quad \text{as } n \to \infty.$$
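The following numerical sketch illustrates this limit with $c = 2$ and a hypothetical $o(1/n)$ perturbation of size $c/n^{3/2}$:

```python
# Sketch: (1 - c/n + o(1/n))^n approaches e^{-c}; here the o(1/n) term is
# taken, for illustration only, to be c / n**1.5.
import numpy as np

c = 2.0
for n in (10, 100, 1000, 10_000):
    print(n, (1 - c/n + c/n**1.5)**n, np.exp(-c))
```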
Theorem 114 (Central Limit Theorem) If $X_i$ are independent identically distributed random variables with $E(X_i) = \mu$ and $\mathrm{var}(X_i) = \sigma^2$, then
$$S_n = \frac{1}{\sigma\sqrt{n}}\sum_{i=1}^{n}(X_i - \mu)$$
converges weakly to the standard normal distribution.

Proof. Let $\varphi(t)$ be the characteristic function of $(X_i - \mu)/\sigma$. By Lemma 112, the characteristic function of $S_n$ is
$$\varphi^n(t/\sqrt{n}) = \left\{1 - \frac{t^2}{2n} + o(t^2/n)\right\}^n.$$
Note that by Lemma 113,
$$\left|\left\{1 - \frac{t^2}{2n} + o(t^2/n)\right\}^n - \left(1 - \frac{t^2}{2n}\right)^n\right| \le n\,o(t^2/n) \to 0,$$
and the second term $(1 - \frac{t^2}{2n})^n \to e^{-t^2/2}$. Since this is the characteristic function of the standard normal distribution, it follows that $S_n$ converges weakly to the standard normal distribution.
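The key step of the proof can be watched numerically. For standardized Exp(1) summands, $X - 1$ has characteristic function $e^{-it}/(1 - it)$, and the sketch below shows $\varphi^n(t/\sqrt{n})$ approaching $e^{-t^2/2}$ (the exponential distribution is an arbitrary choice):

```python
# Sketch of the CLT proof step: phi^n(t/sqrt(n)) -> e^{-t^2/2}, where phi is
# the characteristic function of X - 1 with X ~ Exp(1) (mean 1, variance 1).
import numpy as np

t = 1.5
phi = lambda u: np.exp(-1j * u) / (1 - 1j * u)
for n in (10, 100, 1000, 100_000):
    print(n, phi(t / np.sqrt(n))**n)
print(np.exp(-t**2 / 2))  # limiting value
```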
6.3 Problems
1. Find the characteristic function of the Normal(0,1) distribution. Prove using characteristic functions that if $F$ is the $N(\mu, \sigma^2)$ c.d.f., then $G(x) = F(\mu + \sigma x)$ is the $N(0,1)$ c.d.f.

2. Let $F$ be a distribution function and define $G(x) = 1 - F(-x^-)$, where $F(y^-)$ denotes the limit from the left. Prove that $F * G$ is symmetric.

3. Prove that $F * G = G * F$.

4. Prove using characteristic functions that if $F_n \Rightarrow F$ and $G_n \Rightarrow G$, then $F_n * G_n \Rightarrow F * G$.

5. Prove that convolution is associative: $(F * G) * H = F * (G * H)$.

6. Prove that if $\varphi$ is a characteristic function, so is $|\varphi(t)|^2$.
7. Prove that any characteristic function is non-negative definite:
$$\sum_{i=1}^{n}\sum_{j=1}^{n} \varphi(t_i - t_j)\, z_i \bar{z}_j \ge 0$$
for all real $t_1, \ldots, t_n$ and complex $z_1, \ldots, z_n$.

8. Find the characteristic function of the Laplace distribution with density on $\mathbb{R}$
$$f(x) = \frac{1}{2}e^{-|x|}. \quad (6.1)$$
What is the characteristic function of $X_1 + X_2$ where the $X_i$ are independent with the probability density function (6.1)?

9. (Stable Laws) A family of distributions of importance in financial modelling is the symmetric stable family. These are unimodal densities, symmetric about their mode, and roughly similar in shape to the normal or Cauchy distribution (both special cases, $\alpha = 2$ or 1 respectively). They are most easily described by their characteristic function, which, upon setting location equal to 0 and scale equal to 1, is $Ee^{iXt} = e^{-|t|^\alpha}$. The parameter $0 < \alpha \le 2$ indicates what moments exist, for except in the special case $\alpha = 2$ (the normal distribution), moments of order less than $\alpha$ exist while moments of order $\alpha$ or more do not. Of course, for the normal distribution, moments of all orders exist. The probability density function does not have a simple closed form except in the cases $\alpha = 1$ (the Cauchy distribution) and $\alpha = 2$ (the normal distribution), but can, at least theoretically, be determined from the series expansion of the probability density, valid in the case $1 < \alpha < 2$ (the cases, other than the normal and Cauchy, of most interest in applications):
$$f_c(x) = \frac{1}{\pi\alpha c}\sum_{k=0}^{\infty} (-1)^k \frac{\Gamma((k+1)/\alpha)}{k!}\cos\left(\frac{k\pi}{2}\right)\left(\frac{x}{c}\right)^k.$$

(a) Let $X_1, \ldots, X_n$ be independent random variables all with a symmetric stable($\alpha$) distribution. Show that $n^{-1/\alpha}\sum_{i=1}^{n} X_i$ has the same stable distribution.

(b) Verify that the characteristic function of the probability density function
$$f_1(x) = \frac{1}{\pi\alpha}\sum_{n=0}^{\infty} (-1)^n \frac{\Gamma((2n+1)/\alpha)}{(2n)!}\,x^{2n}$$
is given by $e^{-|t|^\alpha}$. (Hint: the series expansion of $\cos(x)$ is $\cos(x) = \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!}$.)
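A numerical check of (b), a sketch assuming the series as reconstructed above with arbitrary choices of $\alpha$ and $x$, compares the partial sums of $f_1$ with direct Fourier inversion of $e^{-|t|^\alpha}$:

```python
# Sketch for Problem 9(b): partial sums of the series for f_1 versus the
# numerical inversion (1/pi) * integral_0^inf cos(tx) e^{-t^alpha} dt.
import numpy as np
from scipy.integrate import quad
from scipy.special import gamma, factorial

alpha, x = 1.5, 0.5
series = sum((-1)**n * gamma((2*n + 1) / alpha) * x**(2*n) / factorial(2*n)
             for n in range(30)) / (alpha * np.pi)
inv = quad(lambda t: np.cos(t * x) * np.exp(-t**alpha), 0, np.inf)[0] / np.pi
print(series, inv)  # the two agree to several decimal places
```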
10. Let $\Omega$ be the unit interval and $P$ the uniform distribution, and suppose we express each $\omega \in [0, 1]$ in the binary expansion which does not terminate with finitely many terms. If $\omega = .\omega_1\omega_2\ldots$, define $R_n(\omega) = 1$ if $\omega_n = 1$ and otherwise $R_n(\omega) = -1$. These are called the Rademacher functions. Prove that they are independent random variables with the same distribution. (A numerical sketch follows Problem 14 below.)

11. For the Rademacher functions $R_n$ defined on the unit interval, Borel sets and Lebesgue measure, let
$$Y_1 = R_1/2 + R_3/2^2 + R_6/2^3 + \ldots$$
$$Y_2 = R_2/2 + R_4/2^2 + R_7/2^3 + \ldots$$
$$Y_3 = R_5/2 + R_8/2^2 + R_{12}/2^3 + \ldots$$
Note that each $R_i$ appears in the definition of only one $Y_j$. Prove that the $Y_i$ are independent identically distributed and find their distribution.

12. Find the characteristic function of:
(a) the binomial distribution;
(b) the Poisson distribution;
(c) the geometric distribution.
Prove that, suitably standardized, both the binomial distribution and the Poisson distribution approach the standard normal distribution as one of the parameters $\to \infty$.

13. (Families closed under convolution.) Show that each of the following families of distributions is closed under convolution. That is, suppose $X_1, X_2$ are independent and have a distribution in the given family; then show that the distribution of $X = X_1 + X_2$ is also in the family and identify the parameters.
(a) Bin$(n, p)$, with $p$ fixed.
(b) Poisson$(\lambda)$.
(c) Normal$(\mu, \sigma^2)$.
(d) Gamma$(\alpha, \lambda)$, with $\lambda$ fixed.
(e) Chi-squared.
(f) Negative binomial, with $p$ fixed.

14. Suppose that a sequence of random variables $X_n$ has characteristic functions $\varphi_n(t) \to \varphi(t)$ for each $t$, where $\varphi$ is continuous at $t = 0$. Prove that the sequence of distributions of the $X_n$ is tight, i.e. for all $\epsilon > 0$ there exists $c < \infty$ such that $P[|X_n| > c] \le \epsilon$ for all $n$.
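The sketch promised in Problem 10 (an empirical illustration, not a proof): compute $R_1$ and $R_2$ from the binary digits of uniform samples and check that each is $\pm 1$ with probability $1/2$ and that they are uncorrelated, a necessary consequence of independence.

```python
# Sketch for Problem 10: Rademacher functions from binary digits.
import numpy as np

def rademacher(omega, n):
    """R_n(omega) = +1 if the n-th binary digit of omega is 1, else -1."""
    digit = int(omega * 2**n) % 2
    return 1 if digit == 1 else -1

rng = np.random.default_rng(4)
u = rng.uniform(size=200_000)
r1 = np.array([rademacher(w, 1) for w in u])
r2 = np.array([rademacher(w, 2) for w in u])
# Each mean is ~ 0 (so P = 1/2 for each sign) and the product mean is ~ 0.
print(r1.mean(), r2.mean(), (r1 * r2).mean())
```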
15. Prove, using the central limit theorem, that
$$\sum_{i=0}^{n} \frac{n^i e^{-n}}{i!} \to \frac{1}{2} \quad \text{as } n \to \infty.$$
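The sum in Problem 15 is the Poisson($n$) c.d.f. evaluated at $n$, so the claim is that $P[Y_n \le n] \to \Phi(0) = 1/2$ for $Y_n \sim$ Poisson($n$). A quick numerical check:

```python
# Sketch for Problem 15: sum_{i=0}^n n^i e^{-n} / i! = P[Poisson(n) <= n].
from scipy.stats import poisson

for n in (10, 100, 1000, 100_000):
    print(n, poisson.cdf(n, n))   # approaches 1/2
```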
16. (Negative binomial) Suppose we decide in advance that we wish a fixed number $k$ of successes in a sequence of Bernoulli trials, and sample repeatedly until we obtain exactly this number. Then the number of trials $X$ is random and has probability function
$$f(x) = \binom{x-1}{k-1} p^k (1 - p)^{x-k}, \quad x = k, k+1, \ldots.$$
Use the central limit theorem to show that this distribution can be approximated by a normal distribution when $k$ is large. Verify the central limit theorem by showing that the characteristic function of the standardized negative binomial approaches that of the normal.

17. Consider a random walk built from independent Bernoulli random variables $X_i = 1$ with probability $p = \lambda/\sqrt{n}$ and otherwise $X_i = 0$. Define the process
$$B_n(t) = \frac{1}{\sqrt{n}}\sum_{i=1}^{[nt]} X_i$$
for all $0 \le t \le 1$. Find the limiting distribution of $B(t)$ and the limiting joint distribution of $B(s),\ B(t) - B(s)$ for $0 < s < t < 1$.
18. Suppose $X_n \sim U[-n, n]$. Show that the characteristic function of $X_n$ is $\varphi_n(t) = \sin(tn)/(tn)$. Does this converge as $n \to \infty$? Is the limit continuous at 0?

19. Suppose $X_1, \ldots, X_n$ are independent Cauchy distributed random variables with probability density function
$$f(x) = \frac{1}{\pi(1 + x^2)}, \quad x \in \mathbb{R}.$$
Show that the characteristic function of the Cauchy distribution is $\varphi(t) = e^{-|t|}$, and verify that the sample mean $\bar{X}$ has the same distribution as does $X_1$. Note: we may use the integral formula
$$\int_{0}^{\infty} \frac{\cos(tx)}{b^2 + x^2}\,dx = \frac{\pi}{2b}e^{-tb}, \quad t \ge 0.$$

20. What distribution corresponds to the following characteristic functions?
(a) $\varphi(t) = \exp\{ita - b|t|\}$;
(b) $\varphi(t) = \dfrac{2 - 2\cos t}{t^2}$;
(c) $\varphi(t) = \dfrac{\sin t}{t}$.
21. Let $X$ be a discrete random variable which takes only integer values. Show that $\varphi(t)$ is a periodic function with period $2\pi$. (A numerical sketch follows Problem 22 below.)

22. Let $X$ be a random variable and $a \ne 0$. Show that the following conditions are equivalent:
(a) $\varphi_X(a) = 1$;
(b) $\varphi_X$ is periodic with period $|a|$;
(c) $P[X \in \{\frac{2\pi j}{a},\ j = \ldots, -2, -1, 0, 1, 2, \ldots\}] = 1$.
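The sketch referred to in Problem 21, using the Poisson characteristic function as an arbitrary integer-valued example:

```python
# Sketch for Problem 21: for integer-valued X, phi(t + 2*pi) = phi(t).
import numpy as np

lam = 3.0
phi = lambda t: np.exp(lam * (np.exp(1j * t) - 1))   # Poisson(lam) char. fn.
for t in (0.3, 1.0, 2.5):
    print(phi(t), phi(t + 2 * np.pi))                # identical pairs
```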
23. Let $X$ be a random variable which is bounded by a finite constant. Prove that
$$E(X^n) = \frac{1}{i^n}\frac{d^n}{dt^n}\varphi_X(0).$$

24. What distribution corresponds to the following characteristic functions?
(a) $\varphi_X(t) = \frac{1}{2}e^{-it} + \frac{1}{3} + \frac{1}{6}e^{2it}$;
(b) $\varphi_X(t) = \cos\left(\frac{t}{2}\right)$;
(c) $\varphi_X(t) = \dfrac{1}{3e^{it} - 2}$.
25. Show that if $X$ corresponds to an absolutely continuous distribution with probability density function $f(x)$, then
$$f(x) = \frac{1}{2\pi}\int_{\mathbb{R}} e^{-itx}\varphi_X(t)\,dt.$$

26. What distribution corresponds to the following characteristic functions?
(a) $\varphi_X(t) = \frac{1}{2}e^{it} + \frac{1}{2}e^{-t^2/2}$;
(b) $\varphi_X(t) = \dfrac{1}{1 + t^2}$;
(c) $\varphi_X(t) = \frac{1}{3}e^{2it} + \frac{2}{3}e^{3it}$.