UNIT 17 TESTING HYPOTHESES
Structure
17.1 Introduction
Objectives
17.2 Some Concepts
17.3 Neyman-Pearson Lemma
17.4 Likelihood-ratio Tests
17.5 Summary
17.6 Solutions and Answers
17.1 INTRODUCTION
In Unit 15, we introduced some basic notions about testing of hypotheses. There we
described some concepts and definitions useful in hypothesis testing problems. In
this unit, we shall discuss the problem of testing hypotheses in greater detail. To
begin with, we shall recall some concepts and definitions. Next, we shall
describe an important result (the Neyman-Pearson Lemma) for constructing critical
regions for testing a simple hypothesis against a simple alternative. We also discuss
the likelihood ratio test. The use of these two testing procedures is illustrated with examples.
Objectives
After reading this unit, you should be able to:
derive critical regions for testing of hypothesis,
derive the power of these tests.
17.2 SOME CONCEPTS
In Section 15.5 of Unit 15, you were introduced to some basic notions about
testing of hypotheses, such as the two types of error, level of significance, power,
critical region, etc. We recall these concepts.
Let $X_1, \ldots, X_n$ be a random sample with joint distribution function
$F(x; \theta)$, $\theta \in \Omega$. On the basis of the observed sample we wish to test the null
hypothesis $H_0 : \theta \in \Omega_0$ against an alternative $H_1 : \theta \in \Omega_1 = \Omega - \Omega_0$. Both
$H_0$ and $H_1$ may be simple or composite hypotheses. Let $\mathcal{X}^n$, the set of all possible
values of $(X_1, \ldots, X_n)$, denote the sample space. Then $\mathcal{X}^n \subseteq R^n$. A rule that specifies
a subset $C$, $C \subset \mathcal{X}^n$, such that
if $(x_1, \ldots, x_n) \in C$, reject $H_0$
if $(x_1, \ldots, x_n) \notin C$, do not reject $H_0$
is called a test of $H_0$ against $H_1$, and $C$ is called a critical region of the test. The
statistic used in the specification of $C$ is called a test statistic. In such a test
procedure, one might commit two types of error. The probability of type I error is
$$\alpha(\theta) = P_\theta(\text{Reject } H_0) = P_\theta(C), \quad \text{when } \theta \in \Omega_0,$$
and the probability of type II error is
$$\beta(\theta) = P_\theta(C^c), \quad \text{when } \theta \in \Omega_1,$$
where $C^c$ denotes the complement of $C$ in $\mathcal{X}^n$.
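The function $\gamma(\theta) = 1 - \beta(\theta)$, regarded as a function of $\theta$, is called the power function
of the test. In the construction of a test procedure, we fix the probability of the
Type I error at a desired small level and, among the tests that meet this requirement,
choose one for which $\gamma(\theta)$ is maximum (or, equivalently, $\beta(\theta)$ is minimum). Thus,
given $0 \le \alpha \le 1$, our interest is to construct a test procedure for which
$$\alpha(\theta) \le \alpha \quad \text{for all } \theta \in \Omega_0$$
and $\gamma(\theta) = 1 - \beta(\theta)$ is as large as possible (or $\beta(\theta)$ is as small as possible)
for $\theta \in \Omega_1$. A test of the null hypothesis $H_0 : \theta \in \Omega_0$ against $H_1 : \theta \in \Omega_1$ is said to
have size $\alpha$, $0 \le \alpha \le 1$, if
$$\sup_{\theta \in \Omega_0} \alpha(\theta) = \alpha.$$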
A chosen size $\alpha$ is not always attainable. In fact, in many problems only a
countable number of levels $\alpha$ in $[0, 1]$ are attainable. In such a case we usually take
the largest attainable level that is less than $\alpha$. We also call $\alpha$ the level of
significance of the critical region $C$ if
$$\alpha(\theta) \le \alpha \quad \text{for all } \theta \in \Omega_0.$$
If $\sup_{\theta \in \Omega_0} \alpha(\theta) = \alpha$, then the level of significance and the size of the critical region $C$ both
equal $\alpha$. On the other hand, if $\sup_{\theta \in \Omega_0} \alpha(\theta) < \alpha$, then the size of the critical region $C$ is
smaller than its level of significance $\alpha$. If $H_0$ is a simple hypothesis, then it is clear
that $\alpha(\theta)$, $\theta \in \Omega_0$, is the size of the critical region $C$, which may or may not equal a
given significance level $\alpha$.
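The distinction between size and level of significance can be seen numerically. The sketch below is our own illustration, not part of the unit: it assumes a single binomial count $X$ with $n = 10$ and null value $p_0 = 0.5$, and critical regions of the form $\{X \ge c\}$. The function names are hypothetical.

```python
from math import comb

def upper_tail(n, p, c):
    """Return P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(c, n + 1))

def attainable_sizes(n, p0):
    """Sizes of the critical regions {X >= c} for c = 0, 1, ..., n + 1."""
    return [upper_tail(n, p0, c) for c in range(n + 2)]

def largest_size_not_exceeding(n, p0, alpha):
    """Return (c, size) for the largest attainable size that is <= alpha."""
    for c, size in enumerate(attainable_sizes(n, p0)):
        if size <= alpha:  # sizes decrease as c grows, so the first hit is the largest
            return c, size

# Assumed illustration: n = 10 trials, p0 = 0.5, level of significance 0.05.
c, size = largest_size_not_exceeding(n=10, p0=0.5, alpha=0.05)
print(c, size)
```

Only finitely many sizes are attainable here, so the region chosen at level 0.05 has a size strictly smaller than 0.05.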
We now define a criterion for selecting a test for testing $H_0 : \theta \in \Omega_0$ against
$H_1 : \theta \in \Omega_1$, when $H_1$ is a composite hypothesis. A test with critical region $C_0$ of size
$\alpha = \sup_{\theta \in \Omega_0} \alpha(\theta)$ is said to be Uniformly Most Powerful of size $\alpha$ for testing $H_0$ if it
has the maximum power among all critical regions $C$ of the same size. In other
words, $C_0$ is the best (Uniformly Most Powerful) if for all tests $C$ with size $\alpha$
(which is the size of $C_0$), the inequality
$$P_\theta(C_0) \ge P_\theta(C)$$
holds for each $\theta \in \Omega_1$.
Uniformly most powerful tests do not exist for many hypothesis testing problems.
Even when they do exist, they are often not easy to find. We now describe a test
procedure (equivalently, obtain a critical region) for testing a simple hypothesis
against a simple alternative. In this case, the power function $\gamma(\theta)$ reduces to a
single number, so that the "uniformly" in uniformly most powerful becomes
redundant, and we examine the question of the existence of a most powerful test of
given significance level $\alpha$.
17.3 NEYMAN-PEARSON LEMMA
We first state (without proof) an important result, called the Neyman-Pearson Lemma,
which is very useful for constructing uniformly most powerful tests.
Lemma 1: Let $f_0, f_1, \ldots$ be integrable functions of $x_1, \ldots, x_n$ over a space $S$ and
let $C$ be any region such that
$$\int_C f_i \, dx = a_i \ \text{(given)}, \quad i = 1, 2, \ldots \qquad (1)$$
Further, let there exist constants $k_1, k_2, \ldots$ such that for the region $C_0$ within which
$f_0 \ge k_1 f_1 + k_2 f_2 + \cdots$ and outside which $f_0 < k_1 f_1 + k_2 f_2 + \cdots$,
the conditions (1) are satisfied. Then
$$\int_{C_0} f_0 \, dx \ge \int_C f_0 \, dx.$$
We now describe the application of Lemma 1 to the problem of testing a simple
hypothesis against a simple alternative.
For a fixed positive integer $n$, let $X_1, \ldots, X_n$ denote a random sample from a
density $f(x, \theta)$. Let $x_1, \ldots, x_n$ denote the observed sample. Then the density of
$X = (X_1, \ldots, X_n)$ is
$$P_\theta(x) = \prod_{j=1}^{n} f(x_j, \theta).$$
Let $P_{\theta_0}(x)$ and $P_{\theta_1}(x)$ be the densities of
$X$ under $H_0$ and $H_1$ respectively. The problem is that of determining a critical region
$C_0$ such that
$$\int_{C_0} P_{\theta_0}(x) \, dx = \alpha$$
and
$$\int_{C_0} P_{\theta_1}(x) \, dx \ge \int_{C} P_{\theta_1}(x) \, dx \quad \text{for every region } C \text{ of size } \alpha.$$
The optimum region $C_0$ is provided by choosing $f_0 = P_{\theta_1}(x)$ and $f_1 = P_{\theta_0}(x)$ in
Lemma 1. The optimum region $C_0$ is defined by
$$P_{\theta_1}(x) \ge k \, P_{\theta_0}(x),$$
provided there exists a $k$ such that the size condition $\int_{C_0} P_{\theta_0}(x) \, dx = \alpha$ is satisfied. The test can thus be written as: reject $H_0$ if
$$T = \frac{P_{\theta_1}(x)}{P_{\theta_0}(x)} \ge k.$$
Thus we determine the distribution of $T$ under $H_0$. If the distribution is continuous,
then there exists a $k$ such that
$$P_{\theta_0}(T \ge k) = \alpha$$
for any assigned $\alpha$. The test, $T \ge k$, depends on the simple alternative $H_1$. If the test
is independent of the alternative hypothesis in a class of alternatives, then we have a
uniformly most powerful test of the simple hypothesis against all alternative hypotheses
in that class.
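We now consider some examples.
Example 1: Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution
$N(\mu, \sigma^2)$. Assume that $\sigma$ is fixed and known and $\mu \in (-\infty, \infty)$. We wish to
obtain a critical region for testing $H_0 : \mu = \mu_0$ against $H_1 : \mu = \mu_1$, where $\mu_0$
and $\mu_1$ are the specified values of $\mu$.
We have,
$$P_{\mu_0}(x) = (2\pi\sigma^2)^{-n/2} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu_0)^2\right]$$
and
$$P_{\mu_1}(x) = (2\pi\sigma^2)^{-n/2} \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu_1)^2\right].$$
The critical region, using the Neyman-Pearson Lemma, is obtained as follows:
$$\exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu_1)^2\right] \ge k \exp\left[-\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu_0)^2\right],$$
that is,
$$\bar{x} \, (\mu_1 - \mu_0) \ge k',$$
taking natural logarithms and simplifying, where $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $k' = \frac{\sigma^2}{n} \log k + \frac{\mu_1^2 - \mu_0^2}{2}$.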
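Case I
Let $\mu_1 > \mu_0$. Then the critical region is
$$\bar{x} \ge k_1$$
where $k_1$ is to be determined such that
$$P_{\mu_0}\{\bar{X} \ge k_1\} = \alpha, \quad \text{that is,} \quad P\left\{Z \ge \frac{k_1 - \mu_0}{\sigma/\sqrt{n}}\right\} = \alpha,$$
where $Z$ is distributed as $N(0, 1)$. Therefore choose $k_1$ so that
$$\frac{k_1 - \mu_0}{\sigma/\sqrt{n}} = z_\alpha.$$
Denoting by $z_\alpha$ the upper $\alpha$ probability point of the $N(0, 1)$ distribution, we have
$$k_1 = \mu_0 + \frac{\sigma}{\sqrt{n}} z_\alpha.$$
Hence the best critical region in this case is
$$C_0 = \left\{ x \;\middle|\; \bar{x} \ge \mu_0 + \frac{\sigma}{\sqrt{n}} z_\alpha \right\}.$$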
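Case II
Let $\mu_1 < \mu_0$. Then the critical region is
$$\bar{x} \le k_2$$
where $k_2$ is chosen such that
$$P_{\mu_0}\{\bar{X} \le k_2\} = \alpha.$$
The best critical region in this case is
$$C_0 = \left\{ x \;\middle|\; \bar{x} \le \mu_0 - \frac{\sigma}{\sqrt{n}} z_\alpha \right\},$$
where $-z_\alpha = z_{1-\alpha}$ is the lower $\alpha$ probability point of the standard normal
distribution.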
It may be seen that the test $\bar{x} \ge k_1$ ($\bar{x} \le k_2$) is uniformly most powerful for the
class of alternatives $\mu_1 > \mu_0$ ($\mu_1 < \mu_0$) because $k_1$ ($k_2$) is independent of $\mu_1$. But
there is no uniformly most powerful test for the entire class of alternatives $\mu_1 \ne \mu_0$.
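In Case I, the power of the test is
$$P_{\mu_1}\{\bar{X} \ge k_1\} = P\left[Z \ge \frac{k_1 - \mu_1}{\sigma/\sqrt{n}}\right] = 1 - P\left[Z \le \frac{k_1 - \mu_1}{\sigma/\sqrt{n}}\right] = 1 - \Phi\left(\frac{k_1 - \mu_1}{\sigma/\sqrt{n}}\right),$$
where $\Phi(\cdot)$ is the distribution function of a standard normal distribution.
Similarly the power of the test for the second case is
$$P_{\mu_1}\{\bar{X} \le k_2\} = \Phi\left(\frac{k_2 - \mu_1}{\sigma/\sqrt{n}}\right).$$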
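The Case I computations of Example 1 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name and the numerical values ($\mu_0 = 0$, $\mu_1 = 0.5$, $\sigma = 1$, $n = 25$, $\alpha = 0.05$) are assumptions chosen for illustration and do not come from the unit.

```python
from math import sqrt
from statistics import NormalDist

def np_normal_test(mu0, mu1, sigma, n, alpha):
    """Cutoff k1 and power of the test 'reject H0 if the sample mean >= k1'.

    Valid for the case mu1 > mu0 with known sigma.
    """
    std_normal = NormalDist()                  # N(0, 1)
    z_alpha = std_normal.inv_cdf(1 - alpha)    # upper alpha probability point
    se = sigma / sqrt(n)                       # standard deviation of the sample mean
    k1 = mu0 + se * z_alpha                    # k1 = mu0 + (sigma / sqrt(n)) * z_alpha
    power = 1 - std_normal.cdf((k1 - mu1) / se)
    return k1, power

# Assumed illustrative values, not taken from the text.
k1, power = np_normal_test(mu0=0.0, mu1=0.5, sigma=1.0, n=25, alpha=0.05)
print(round(k1, 4), round(power, 4))
```

Setting `mu1` equal to `mu0` returns a "power" equal to $\alpha$, which confirms that the cutoff gives a test of the required size.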
E1) Let $X_1$ be a random sample of size 1 from a population with p.d.f.
$f(x, \theta) = (1/\theta) \exp(-x/\theta)$, $x \ge 0$, $\theta > 0$. Obtain a best critical region of size $\alpha$
for testing $H_0 : \theta = \theta_0$ against $H_1 : \theta = \theta_1 \ne \theta_0$, and also the power of the test.
E2) Obtain a test, the size of the test and the power of the test for testing a null
hypothesis $H_0 : X \sim \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)$, $x \in R$, against an alternative
$H_1 : X \sim \frac{1}{2} \exp(-|x|)$, $x \in R$. Develop the test on the basis of a single
observation.
17.4 LIKELIHOOD RATIO TESTS
In Section 17.3 we described the Neyman-Pearson lemma for obtaining the best test
for testing a simple hypothesis against a simple alternative. But when the
hypothesis to be tested is composite rather than simple, it becomes necessary to
introduce some other principle for obtaining good tests.
Neyman and Pearson suggested a simple method of construction of a test statistic
which is closely related to the maximum likelihood method of estimation.
Suppose $L(\theta \mid x)$ is the likelihood function of $\theta$ corresponding to the set of values
$x = (x_1, x_2, \ldots, x_n)$. Suppose we are required to test the simple hypothesis
$$H_0 : \theta = \theta_0$$
against the composite hypothesis
$$H_1 : \theta \ne \theta_0.$$
In this situation, given the observation $x$, intuitively we should reject $H_0$ in case
$L(\theta_0 \mid x)$ is too small and accept it otherwise. This means that the test should be based on
the critical region
$$\lambda < \lambda_0,$$
where $\lambda$ is the likelihood ratio defined by
$$\lambda = \frac{L(\theta_0 \mid x)}{\sup_{\theta \in \Omega} L(\theta \mid x)}$$
and $\lambda_0$ is a constant so chosen as to make the probability of Type I error associated
with the test equal to $\alpha$.
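When $H_0$ itself is a composite hypothesis, say
$$H_0 : \theta \in \Omega_0,$$
the likelihood is not a constant under $H_0$, and in order to judge the acceptability of
the null hypothesis in the light of the observation $x$, we compare the highest value
of the likelihood under $H_0$, i.e.
$$\sup_{\theta \in \Omega_0} L(\theta \mid x),$$
with its highest value under the model, i.e.,
$$\sup_{\theta \in \Omega} L(\theta \mid x).$$
Thus here we base our test on the likelihood ratio
$$\lambda(x) = \frac{\sup_{\theta \in \Omega_0} L(\theta \mid x)}{\sup_{\theta \in \Omega} L(\theta \mid x)}.$$
The critical region of the size-$\alpha$ likelihood ratio test of $H_0$ against $H_1$ is
$$\{ x \mid \lambda(x) < \lambda_0 \},$$
where $\lambda_0$ is determined by the condition
$$\sup_{\theta \in \Omega_0} P_\theta \{ x \mid \lambda(x) < \lambda_0 \} = \alpha.$$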
The critical value of the ratio is determined by consideration of the size of the test.
It is clear that $0 \le \lambda \le 1$. As in the case of the Neyman-Pearson lemma, if the
distribution of $\lambda$ is continuous, then any size $\alpha$ is attainable. If, however, the
distribution of $\lambda$ is discrete, it may not be possible to find a likelihood ratio test whose size is
exactly equal to $\alpha$. It is, however, possible to obtain a likelihood ratio test of size $\alpha$
by randomization, a procedure which we shall not discuss here. We may
also choose the largest $\lambda_0$ such that
$$\sup_{\theta \in \Omega_0} P_\theta \{ x \mid \lambda(x) < \lambda_0 \} \le \alpha.$$
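Example 2: We consider here the problem of testing $H_0 : \mu = \mu_0$ against all its
alternatives in sampling from $N(\mu, \sigma^2)$, where both $\mu$ and $\sigma^2$ are unknown. In this
case
$$\Omega_0 = \{ (\mu_0, \sigma^2) ; \sigma^2 > 0 \} \quad \text{and} \quad \Omega = \{ (\mu, \sigma^2) ; -\infty < \mu < \infty, \ \sigma^2 > 0 \}.$$
We shall write $\theta = (\mu, \sigma^2)$.
Under $H_0$, the MLE of $\sigma^2$ is
$$\hat{\sigma}_0^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_0)^2.$$
Thus,
$$\sup_{\theta \in \Omega_0} L(\theta \mid x) = (2\pi \hat{\sigma}_0^2)^{-n/2} e^{-n/2}.$$
Now, over the whole parameter space $\Omega$, the MLE of $(\mu, \sigma^2)$ is $(\bar{x}, \hat{\sigma}^2)$, where $\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$.
Thus
$$\sup_{\theta \in \Omega} L(\theta \mid x) = (2\pi \hat{\sigma}^2)^{-n/2} e^{-n/2}.$$
The likelihood ratio test rejects $H_0$ if
$$\lambda(x) = \left( \frac{\hat{\sigma}^2}{\hat{\sigma}_0^2} \right)^{n/2} = \left[ \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \mu_0)^2} \right]^{n/2} < \lambda_0.$$
Since $\sum (x_i - \mu_0)^2 = \sum (x_i - \bar{x})^2 + n (\bar{x} - \mu_0)^2$, $\lambda(x)$ is a decreasing function of $n (\bar{x} - \mu_0)^2 / \sum (x_i - \bar{x})^2$, and we reject $H_0$ if
$$\frac{|\bar{x} - \mu_0|}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}} > c_1,$$
that is, if
$$\left| \frac{\sqrt{n} (\bar{x} - \mu_0)}{s} \right| > c_2, \quad \text{where } s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2.$$
The statistic
$$t(X) = \frac{\sqrt{n} (\bar{X} - \mu_0)}{s}$$
has a Student's t distribution with $(n - 1)$ d.f. under $H_0 : \mu = \mu_0$, but under
$H_1 : \mu \ne \mu_0$, $t(X)$ has a non-central t-distribution with $(n - 1)$ d.f. and
non-centrality parameter $\delta = \sqrt{n} (\mu - \mu_0)/\sigma$. Thus the critical region is
$$|t| > c_2 \quad \text{(for simplicity, we write } t \text{ for } t(X)\text{)}$$
where $c_2$ is so chosen that
$$P_{\mu_0} \{ |t| > c_2 \} = \alpha.$$
Let $c_2 = t_{n-1, \alpha/2}$, the upper $\alpha/2$ probability point of the t distribution with $(n - 1)$ d.f., in accordance with the distribution of $t(X)$ under $H_0$. Thus the
two-sided test obtained here is: reject $H_0$ if
$$\left| \frac{\sqrt{n} (\bar{x} - \mu_0)}{s} \right| > t_{n-1, \alpha/2}.$$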
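Suppose we now consider the problem of testing $H_0 : \mu = \mu_0$ against the class of
alternatives $H_1 : \mu = \mu_1 > \mu_0$. In this case
$$\Omega_0 = \{ (\mu_0, \sigma^2) ; \sigma^2 > 0 \}$$
and
$$\Omega = \{ (\mu, \sigma^2) ; \mu \ge \mu_0, \ \sigma^2 > 0 \}.$$
As before,
$$\sup_{\theta \in \Omega_0} L(\theta \mid x) = (2\pi \hat{\sigma}_0^2)^{-n/2} e^{-n/2}, \quad \hat{\sigma}_0^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_0)^2,$$
and
$$\sup_{\theta \in \Omega} L(\theta \mid x) = \sup_{\mu \ge \mu_0, \ \sigma^2 > 0} \left( \frac{1}{\sigma \sqrt{2\pi}} \right)^{n} \exp\left[ -\frac{1}{2\sigma^2} \sum_{i=1}^{n} (x_i - \mu)^2 \right].$$
The MLE of $\mu$ is $\bar{x}$ when $\bar{x} \ge \mu_0$ and is $\mu_0$ if $\bar{x} < \mu_0$. Similarly, the MLE of
$\sigma^2$ is $\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$ when $\bar{x} \ge \mu_0$ and is $\frac{1}{n} \sum_{i=1}^{n} (x_i - \mu_0)^2$ when $\bar{x} < \mu_0$.
Thus
$$\sup_{\theta \in \Omega} L(\theta \mid x) = \begin{cases} \left( \frac{2\pi}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^{-n/2} e^{-n/2} & \text{if } \bar{x} \ge \mu_0, \\ \left( \frac{2\pi}{n} \sum_{i=1}^{n} (x_i - \mu_0)^2 \right)^{-n/2} e^{-n/2} & \text{if } \bar{x} < \mu_0. \end{cases}$$
Thus
$$\lambda(x) = \begin{cases} \left[ \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{\sum_{i=1}^{n} (x_i - \mu_0)^2} \right]^{n/2} & \text{if } \bar{x} \ge \mu_0, \\ 1 & \text{if } \bar{x} < \mu_0. \end{cases}$$
Hence we consider those $x$ for which $\bar{x} \ge \mu_0$. Proceeding on the same lines as for
the set of alternatives $H_1 : \mu \ne \mu_0$, we get the test as
$$\frac{\sqrt{n} (\bar{x} - \mu_0)}{s} > t_{n-1, \alpha}$$
to reject the null hypothesis. The one-sided test is UMP.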
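The link between the likelihood ratio and the t statistic in Example 2 can be verified numerically: from $\sum (x_i - \mu_0)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \mu_0)^2$ one gets $\lambda = (1 + t^2/(n-1))^{-n/2}$. The sketch below checks this identity on a small made-up sample; the data values and the function name are assumptions for illustration only.

```python
from math import sqrt

def lr_and_t(sample, mu0):
    """Return (lambda, t) for testing mu = mu0 in N(mu, sigma^2), both parameters unknown."""
    n = len(sample)
    xbar = sum(sample) / n
    ss_xbar = sum((x - xbar) ** 2 for x in sample)   # sum of squares about the sample mean
    ss_mu0 = sum((x - mu0) ** 2 for x in sample)     # sum of squares about mu0
    lam = (ss_xbar / ss_mu0) ** (n / 2)              # likelihood ratio for the two-sided problem
    s = sqrt(ss_xbar / (n - 1))                      # sample standard deviation
    t = sqrt(n) * (xbar - mu0) / s                   # Student's t statistic
    return lam, t

# Made-up data, used only to illustrate the identity.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
n = len(sample)
lam, t = lr_and_t(sample, mu0=5.0)
lam_from_t = (1 + t**2 / (n - 1)) ** (-n / 2)
print(lam, lam_from_t)
```

Because $\lambda$ is a decreasing function of $t^2$, rejecting for small $\lambda$ is the same as rejecting for large $|t|$, which is why the test can be stated entirely in terms of $t$.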
In the preceding illustrations, $\lambda(x)$ was a simple function of $\bar{x}$ and $s^2$ whose
distribution is known. In general, however, there is no guarantee that some such
nice relationship to a familiar variable will exist. Then we must use whatever tools are
available to find the distribution of $\lambda$. Fortunately, for large samples there is a good
approximation to the distribution of $\lambda$ which eliminates the necessity for finding the
distribution of $\lambda$ in situations where this is difficult to find. Under certain regularity
conditions, the random variable $-2 \log_e \lambda$ has an asymptotic $\chi^2$-distribution. The
degrees of freedom equals the number of unknown parameters under $\Omega$ minus the
number of unknown parameters under $\Omega_0$.
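As an illustration of the large-sample approximation, consider testing $H_0 : p = p_0$ against $H_1 : p \ne p_0$ from $r$ successes in $n$ Bernoulli trials. This example is our own and is not worked in the unit. Here $\Omega$ has one unknown parameter and $\Omega_0$ has none, so $-2 \log_e \lambda$ is compared with a $\chi^2$ point on 1 d.f., which equals the square of a standard normal point. The numbers $r = 62$, $n = 100$, $p_0 = 0.5$ are assumed values.

```python
from math import log
from statistics import NormalDist

def neg2_log_lambda(r, n, p0):
    """-2 log(lambda) for H0: p = p0 against p != p0, with r successes in n trials."""
    def term(count, expected):
        # The convention 0 * log(0) = 0 handles r = 0 or r = n.
        return 0.0 if count == 0 else count * log(count / expected)
    return 2 * (term(r, n * p0) + term(n - r, n * (1 - p0)))

def chi2_1_upper(alpha):
    """Upper alpha point of chi-square with 1 d.f., computed as z_{alpha/2} squared."""
    return NormalDist().inv_cdf(1 - alpha / 2) ** 2

# Assumed illustrative data.
stat = neg2_log_lambda(r=62, n=100, p0=0.5)
reject = stat > chi2_1_upper(0.05)
print(round(stat, 3), reject)
```

The resulting test has size only approximately equal to $\alpha$, since it relies on the asymptotic distribution.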
E3) Let $X_1, \ldots, X_n$ be a random sample from the Bernoulli distribution with
parameter $p$, $0 \le p \le 1$. Construct a level $\alpha$ likelihood ratio test of $H_0 : p \le p_0$
against $H_1 : p > p_0$.
17.5 SUMMARY
In this unit we have
1. briefly introduced the problem of testing of hypothesis,
2. discussed the Neyman-Pearson Lemma for testing a simple hypothesis against
a simple alternative,
3. described the likelihood ratio test for testing hypotheses.
17.6 SOLUTIONS AND ANSWERS
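E1) We have
$$P_{\theta_1}(x) = \frac{1}{\theta_1} \exp[-x_1/\theta_1]$$
and
$$P_{\theta_0}(x) = \frac{1}{\theta_0} \exp[-x_1/\theta_0].$$
Using the Neyman-Pearson Lemma, the best critical region is obtained as
$$\frac{P_{\theta_1}(x)}{P_{\theta_0}(x)} = \frac{\theta_0}{\theta_1} \exp\left[-x_1 \left( \frac{1}{\theta_1} - \frac{1}{\theta_0} \right)\right] \ge k.$$
After taking logarithms, we have
$$x_1 (\theta_1 - \theta_0) \ge k_1.$$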
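Case I
Let $\theta_1 > \theta_0$.
The test is
$$x_1 \ge k_2$$
where $k_2$ is to be determined such that
$$P_{\theta_0}(X_1 \ge k_2) = \alpha,$$
that is,
$$\frac{1}{\theta_0} \int_{k_2}^{\infty} \exp(-x_1/\theta_0) \, dx_1 = \alpha$$
or
$$\exp(-k_2/\theta_0) = \alpha \ \Rightarrow \ k_2 = \theta_0 \log(1/\alpha).$$
The critical region is thus
$$C_0 = \{ x_1 \mid x_1 \ge \theta_0 \log(1/\alpha) \}.$$
The power of the test is
$$\frac{1}{\theta_1} \int_{\theta_0 \log(1/\alpha)}^{\infty} \exp(-x_1/\theta_1) \, dx_1 = \exp\left[-\theta_0 \log(1/\alpha)/\theta_1\right] = \alpha^{\theta_0/\theta_1}.$$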
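Case II
Let $\theta_1 < \theta_0$.
The test is
$$x_1 \le k_3$$
where $k_3$ is to be determined such that
$$P_{\theta_0}(X_1 \le k_3) = \alpha,$$
that is,
$$\frac{1}{\theta_0} \int_{0}^{k_3} \exp(-x_1/\theta_0) \, dx_1 = \alpha$$
$$\Rightarrow \ 1 - \exp(-k_3/\theta_0) = \alpha \ \Rightarrow \ k_3 = -\theta_0 \log(1 - \alpha).$$
The critical region is thus
$$C_0 = \{ x_1 \mid x_1 \le -\theta_0 \log(1 - \alpha) \}.$$
The power of the test is
$$1 - \exp(-k_3/\theta_1) = 1 - (1 - \alpha)^{\theta_0/\theta_1}.$$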
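The cutoffs and powers obtained in E1 can be checked with a few lines of code. The sketch below evaluates $k_2 = \theta_0 \log(1/\alpha)$ and $k_3 = -\theta_0 \log(1 - \alpha)$ and the corresponding size and power from the exponential distribution function. The function names and the parameter values are assumptions made for illustration.

```python
from math import exp, log

def exp_test_case1(theta0, theta1, alpha):
    """Case theta1 > theta0: reject H0 if x1 >= k2. Returns (k2, size, power)."""
    k2 = theta0 * log(1 / alpha)
    size = exp(-k2 / theta0)     # P(X1 >= k2) when the mean is theta0
    power = exp(-k2 / theta1)    # P(X1 >= k2) when the mean is theta1
    return k2, size, power

def exp_test_case2(theta0, theta1, alpha):
    """Case theta1 < theta0: reject H0 if x1 <= k3. Returns (k3, size, power)."""
    k3 = -theta0 * log(1 - alpha)
    size = 1 - exp(-k3 / theta0)   # P(X1 <= k3) when the mean is theta0
    power = 1 - exp(-k3 / theta1)  # P(X1 <= k3) when the mean is theta1
    return k3, size, power

# Assumed illustrative values.
k2, size1, power1 = exp_test_case1(theta0=1.0, theta1=2.0, alpha=0.05)
k3, size2, power2 = exp_test_case2(theta0=2.0, theta1=1.0, alpha=0.05)
print(k2, size1, power1)
print(k3, size2, power2)
```

In both cases the computed size equals $\alpha$, and the powers agree with the closed forms $\alpha^{\theta_0/\theta_1}$ and $1 - (1 - \alpha)^{\theta_0/\theta_1}$.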
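E2) Since both the densities (under $H_0$ and $H_1$) are completely specified, it is a case
of testing a simple hypothesis against a simple alternative. Using the
Neyman-Pearson Lemma, the test is obtained as: reject $H_0$ if $T(x) \ge k$, where
$$T(x) = \frac{\frac{1}{2} \exp(-|x|)}{\frac{1}{\sqrt{2\pi}} \exp(-x^2/2)} = \sqrt{\frac{\pi}{2}} \exp\left[ \frac{(|x| - 1)^2 - 1}{2} \right].$$
It is clearly seen that $T(x)$ is an increasing function of $\big| |x| - 1 \big|$, so that
$T(x) \ge k$ if and only if $\big| |x| - 1 \big| \ge k'$. It follows that the test is of the
form: reject $H_0$ if
$$|x| \ge 1 + k' \quad \text{or} \quad |x| \le 1 - k',$$
which means that if either a very large or a very small value of $|X|$ is observed,
then we suspect that $H_1$ is true and $H_0$ is false. For $0 < k' < 1$, the size of the test is
$$\alpha = P(|Z| \ge 1 + k') + P(|Z| \le 1 - k') = 2[1 - \Phi(1 + k')] + 2\Phi(1 - k') - 1,$$
where $Z$ is the standard normal variate. The power of the test is
$$P_{H_1}(|X| \ge 1 + k') + P_{H_1}(|X| \le 1 - k') = e^{-(1 + k')} + 1 - e^{-(1 - k')},$$
since under $H_1$ the variable $|X|$ has the exponential distribution with mean 1.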
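For a concrete feel of the E2 test, the size and power can be evaluated for a given cutoff. The sketch below assumes the test has the form "reject $H_0$ if $\big| |x| - 1 \big| \ge k'$" with $0 < k' < 1$, that $H_0$ is the standard normal density and that $H_1$ is the density $\frac{1}{2} \exp(-|x|)$. The function name and the value $k' = 0.5$ are illustrative assumptions.

```python
from math import exp
from statistics import NormalDist

def size_and_power(k_prime):
    """Size and power of 'reject H0 if | |x| - 1 | >= k_prime', for 0 < k_prime < 1."""
    phi = NormalDist().cdf   # standard normal distribution function
    # Under H0 (standard normal): P(|Z| >= 1 + k') + P(|Z| <= 1 - k')
    size = 2 * (1 - phi(1 + k_prime)) + (2 * phi(1 - k_prime) - 1)
    # Under H1, |X| is exponential with mean 1: P(|X| >= 1 + k') + P(|X| <= 1 - k')
    power = exp(-(1 + k_prime)) + 1 - exp(-(1 - k_prime))
    return size, power

# Assumed illustrative cutoff.
size, power = size_and_power(0.5)
print(round(size, 4), round(power, 4))
```

With a single observation the two densities are hard to tell apart, so a cutoff of this kind gives a large size; a smaller size requires $k'$ close to or above 1, in which case the second term in each formula must be dropped.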
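E3) The likelihood function is given by
$$L(p, x) = p^r (1 - p)^{n - r}, \quad \text{where } r = \sum_{i=1}^{n} x_i.$$
Now
$$\sup_{p \in \Omega} L(p, x) = \sup_{0 \le p \le 1} p^r (1 - p)^{n - r}.$$
The maximum likelihood estimate of $p$ is $r/n$. Thus
$$\sup_{0 \le p \le 1} p^r (1 - p)^{n - r} = \left( \frac{r}{n} \right)^{r} \left( 1 - \frac{r}{n} \right)^{n - r}.$$
Also
$$\sup_{p \in \Omega_0} p^r (1 - p)^{n - r} = \sup_{p \le p_0} p^r (1 - p)^{n - r}.$$
Over $p \le p_0$, the maximum likelihood estimate of $p$ is $p_0$ if $p_0 < r/n$ and is $r/n$ if $p_0 \ge r/n$.
Thus,
$$\sup_{p \le p_0} p^r (1 - p)^{n - r} = \begin{cases} p_0^r (1 - p_0)^{n - r} & \text{if } p_0 < r/n, \\ \left( \frac{r}{n} \right)^{r} \left( 1 - \frac{r}{n} \right)^{n - r} & \text{if } p_0 \ge r/n, \end{cases}$$
so that
$$\lambda(x) = \begin{cases} \dfrac{p_0^r (1 - p_0)^{n - r}}{(r/n)^r (1 - r/n)^{n - r}} & \text{if } r > n p_0, \\ 1 & \text{if } r \le n p_0. \end{cases}$$
We have $\lambda(x) \le 1$ for $r > n p_0$ and $\lambda(x) = 1$ for $r \le n p_0$. Moreover, for $r > n p_0$ the derivative of $\log \lambda$ with respect to $r$ is $\log \frac{p_0}{1 - p_0} - \log \frac{r/n}{1 - r/n} < 0$, so $\lambda(x)$ is a
non-increasing function of $r$. Thus the condition $\lambda(x) < c$ is equivalent to $r > c'$, where
$c'$ is so chosen that
$$\sup_{p \le p_0} P_p (r > c') = \alpha.$$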
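The distribution of $r$ is binomial with parameters $n$ and $p$, $b(n, p)$, and
$$P_p (r > c') = \sum_{j = c' + 1}^{n} \binom{n}{j} p^j (1 - p)^{n - j}.$$
It can be seen that $P_p (r > c')$ is a non-decreasing function of $p$, so that
$$\sup_{p \le p_0} P_p (r > c') = \sum_{j = c' + 1}^{n} \binom{n}{j} p_0^j (1 - p_0)^{n - j}.$$
Thus for a preassigned $\alpha$, $0 < \alpha < 1$, choose $c'$ so that
$$\sum_{j = c' + 1}^{n} \binom{n}{j} p_0^j (1 - p_0)^{n - j} = \alpha.$$
Since $r$ has a discrete distribution, no $c'$ may exist for which we get the exact
probability $\alpha$. In this case, choose $c'$ such that
$$\sum_{j = c' + 1}^{n} \binom{n}{j} p_0^j (1 - p_0)^{n - j} \le \alpha$$
and
$$\sum_{j = c'}^{n} \binom{n}{j} p_0^j (1 - p_0)^{n - j} > \alpha.$$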
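The choice of $c'$ in E3 is easy to carry out by direct search. The sketch below finds the smallest integer $c'$ for which the binomial upper tail at $p_0$ does not exceed $\alpha$, so that the tail starting one step earlier exceeds $\alpha$. The function names and the values $n = 20$, $p_0 = 0.3$, $\alpha = 0.05$ are assumptions chosen for illustration.

```python
from math import comb

def binom_upper_tail(n, p, c):
    """Return P(r > c) for r ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(c + 1, n + 1))

def lrt_cutoff(n, p0, alpha):
    """Smallest integer c' with P_{p0}(r > c') <= alpha (reject H0 when r > c')."""
    for c in range(n + 1):
        if binom_upper_tail(n, p0, c) <= alpha:
            return c

# Assumed illustrative values.
n, p0, alpha = 20, 0.3, 0.05
c_prime = lrt_cutoff(n, p0, alpha)
attained = binom_upper_tail(n, p0, c_prime)
print(c_prime, round(attained, 4))
```

The attained size is below the nominal level $\alpha$ because $r$ is discrete; reaching $\alpha$ exactly would require a randomized test.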