
Likelihood Ratio Tests
The origin and properties of the likelihood ratio in hypothesis testing
Teresa Wollschied
Colleen Kenney
Outline
 Background/History
 Likelihood Function
 Hypothesis Testing
 Introduction to Likelihood Ratio Tests
 Examples
 References
Jerzy Neyman (1894 – 1981)
 April 16, 1894: Born in Bendery, Russia/Moldavia (Russian version: Yuri Czeslawovich)
 1906: Father died. Neyman and his mother moved to Kharkov.
 1912: Neyman began studying both physics and mathematics at the University of Kharkov, where Professor Sergei Bernstein introduced him to probability
 1919: Traveled south to the Crimea and met Olga Solodovnikova. In 1920, ten days after their wedding, he was imprisoned for six weeks in Kharkov.
 1921: Moved to Poland and worked as an assistant statistical analyst at the Agricultural Institute in Bromberg, then at the State Meteorological Institute in Warsaw.
Neyman biography
 1923-1924: Became an assistant at Warsaw University and taught at the College of Agriculture. Earned a doctorate for a thesis that applied probability to agricultural experimentation.
 1925: Received a Rockefeller fellowship to study at University College London with Karl Pearson (where he met Egon Pearson)
 1926-1927: Went to Paris. Visited by Egon Pearson in 1927; they began collaborative work on testing hypotheses.
 1934-1938: Took position at University College London
 1938: Offered a position at UC Berkeley. Set up Statistical Laboratory
within Department of Mathematics. Statistics became a separate
department in 1955.
 Died on August 5, 1981
Egon Pearson (1895 – 1980)
 August 11, 1895: Born in Hampstead, England. Middle child of
Karl Pearson
 1907-1909: Attended Dragon School Oxford
 1909-1914: Attended Winchester College
 1914: Started at Cambridge, interrupted by influenza.
 1915: Joined war effort at Admiralty and Ministry of Shipping
 1920: Awarded a B.A. by taking the Military Special Examination; began research in solar physics, attending lectures by Eddington
 1921: Became lecturer at University College London with his
father
 1924: Became assistant editor of Biometrika
Pearson biography
 1925: Met Neyman and corresponded with him through letters
while Neyman was in Paris. Also corresponded with Gosset at the
same time.
 1933: After his father retired, became Head of the Department of Applied Statistics
 1935: Won the Weldon Prize for work done with Neyman and began work on revising the Tables for Statisticians and Biometricians (1954, 1972)
 1939: Did war work, eventually receiving a C.B.E.
 1961: Retired from University College London
 1966: Retired as Managing Editor of Biometrika
 Died June 12, 1980
Likelihood and Hypothesis Testing
 “On The Use and Interpretation of Certain Test Criteria for
Purposes of Statistical Inference, Part I,” 1928, Biometrika:
Likelihood Ratio Tests explained in detail by Neyman and
Pearson

“Probability is a ratio of frequencies and this relative measure cannot be termed the ratio of probabilities of the hypotheses, unless we speak of probability a posteriori and postulate some a priori frequency distribution of sampled populations. Fisher has therefore introduced the term likelihood, and calls this comparative measure the ratio of the likelihoods of the two hypotheses.”
Likelihood and Hypothesis Testing
 “On the Problem of the most Efficient Tests of Statistical
Hypotheses,” 1933, Philosophical Transactions of the Royal
Society of London: The concept of developing an ‘efficient’
test is expanded upon.

“Without hoping to know whether each hypothesis is true or false,
we may search for rules to govern our behavior with regard to them,
in following which we insure that, in the long run of experience, we
shall not be too often wrong”
Likelihood Function
Suppose $X_1, X_2, \ldots, X_n$ is a random sample from a distribution with p.d.f. or p.m.f. $f(x \mid \theta)$. Then, given that $\mathbf{X} = \mathbf{x}$ is observed, the likelihood function is the function of $\theta$ defined by:
$$L(\theta \mid x_1, \ldots, x_n) = f(\mathbf{x} \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta)$$
Hypothesis Testing
$H_0: \theta \in \Theta_0$
$H_a: \theta \in \Theta_a$
 Define a test statistic $T = r(\mathbf{x})$.
 Rejection region: $R = \{\mathbf{x} : T > c\}$ for some constant $c$.
Power Function
 The probability that a test will reject $H_0$ is given by the power function:
$$\beta(\theta) = P_\theta(\mathbf{X} \in R)$$
 Size $\alpha$ test: $\sup_{\theta \in \Theta_0} \beta(\theta) = \alpha$, where $0 \le \alpha \le 1$.
 Level $\alpha$ test: $\sup_{\theta \in \Theta_0} \beta(\theta) \le \alpha$, where $0 \le \alpha \le 1$.
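As a concrete illustration (a sketch added here, not from the original slides): for a N(μ, 1) sample and the one-sided test of $H_0: \mu \le \mu_0$ against $H_a: \mu > \mu_0$ with rejection region $\{\bar{x} > c\}$, the power function is $\beta(\mu) = P_\mu(\bar{X} > c) = 1 - \Phi(\sqrt{n}(c - \mu))$. The values of n, μ0, and α below are assumptions for the example.

```python
import numpy as np
from scipy.stats import norm

n, mu0, alpha = 25, 0.0, 0.05

# Choose c so the test has size alpha: beta is increasing in mu, so
# sup over {mu <= mu0} of beta(mu) = beta(mu0) = alpha.
c = mu0 + norm.ppf(1 - alpha) / np.sqrt(n)

def power(mu):
    """beta(mu) = P_mu(Xbar > c) when Xbar ~ N(mu, 1/n)."""
    return 1 - norm.cdf(np.sqrt(n) * (c - mu))

print("size at mu0  :", power(mu0))   # equals alpha by construction
print("power at 0.5 :", power(0.5))   # probability of rejecting when mu = 0.5
```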
Types of Error
 Type I Error:
Rejecting H0 when H0 is true
 Type II Error:
Accepting H0 when H0 is false
Likelihood Ratio Test (LRT)
 The LRT statistic for testing $H_0: \theta \in \Theta_0$ versus $H_a: \theta \in \Theta_a$ is:
$$\lambda(\mathbf{x}) = \frac{\sup_{\Theta_0} L(\theta \mid \mathbf{x})}{\sup_{\Theta} L(\theta \mid \mathbf{x})}$$
 An LRT is any test that has a rejection region of the form $\{\mathbf{x} : \lambda(\mathbf{x}) \le c\}$, where $c$ is any number such that $0 \le c \le 1$.
Uniformly Most Powerful (UMP) Test

Let $\delta$ be a test procedure for testing $H_0: \theta \in \Theta_0$ vs. $H_a: \theta \in \Theta_a$, with level of significance $\alpha_0$. Then $\delta$, with power function $\beta(\theta)$, is a UMP level $\alpha_0$ test if:
(1) $\sup_{\theta \in \Theta_0} \beta(\theta) \le \alpha_0$, and
(2) for every test procedure $\delta'$ with level at most $\alpha_0$, we have $\beta'(\theta) \le \beta(\theta)$ for every $\theta \in \Theta_a$, where $\beta'(\theta)$ is the power function of $\delta'$.
Neyman-Pearson Lemma
Consider testing $H_0: \theta = \theta_0$ versus $H_a: \theta = \theta_1$, where the pdf or pmf corresponding to $\theta_i$ is $f(\mathbf{x} \mid \theta_i)$, $i = 0, 1$, using a test with rejection region $R$ that satisfies
(1) $\mathbf{x} \in R$ if $f(\mathbf{x} \mid \theta_1) > k f(\mathbf{x} \mid \theta_0)$ and $\mathbf{x} \in R^c$ if $f(\mathbf{x} \mid \theta_1) < k f(\mathbf{x} \mid \theta_0)$, for some $k \ge 0$, and
(2) $\alpha = P_{\theta_0}(\mathbf{X} \in R)$.
Neyman-Pearson Lemma (cont’d)

Then
(a) Any test that satisfies (1) and (2) is a UMP level $\alpha$ test.
(b) If there exists a test satisfying (1) and (2) with $k > 0$, then every UMP level $\alpha$ test satisfies (2) and every UMP level $\alpha$ test satisfies (1), except perhaps on a set $A$ satisfying $P_{\theta_0}(\mathbf{X} \in A) = P_{\theta_1}(\mathbf{X} \in A) = 0$.
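As a small illustration of the lemma (an added sketch, not from the original slides; the values of θ0, θ1, n, and α are assumptions): for a N(θ, 1) sample and two simple hypotheses with θ1 > θ0, the ratio $f(\mathbf{x} \mid \theta_1)/f(\mathbf{x} \mid \theta_0)$ is an increasing function of the sample mean, so a rejection region of the form (1) reduces to a threshold on x̄, and condition (2) fixes that threshold. The code below builds the region and checks its size and power by simulation.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
theta0, theta1, n, alpha = 0.0, 0.5, 20, 0.05

# With theta1 > theta0, f(x|theta1) > k f(x|theta0) iff xbar > c for some c,
# so condition (2), alpha = P_theta0(X in R), determines c directly:
c = theta0 + norm.ppf(1 - alpha) / np.sqrt(n)

# Monte Carlo check of the size and power of R = {xbar > c}.
xbar0 = rng.normal(theta0, 1, size=(100_000, n)).mean(axis=1)
xbar1 = rng.normal(theta1, 1, size=(100_000, n)).mean(axis=1)
print("estimated size :", (xbar0 > c).mean())   # should be close to alpha
print("estimated power:", (xbar1 > c).mean())   # power at theta1
```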
Proof: Neyman-Pearson Lemma
Note that, if $\alpha = P_{\theta_0}(\mathbf{X} \in R)$, we have a size $\alpha$ test and hence a level $\alpha$ test, because $\sup_{\theta \in \Theta_0} P_\theta(\mathbf{X} \in R) = P_{\theta_0}(\mathbf{X} \in R) = \alpha$, since $\Theta_0$ has only one point.
Define the test function as: $\phi(\mathbf{x}) = 1$ if $\mathbf{x} \in R$, $\phi(\mathbf{x}) = 0$ if $\mathbf{x} \in R^c$.
Let $\phi(\mathbf{x})$ be the test function of a test satisfying (1) and (2) and $\phi'(\mathbf{x})$ be the test function for any other level $\alpha$ test, where the corresponding power functions are $\beta(\theta)$ and $\beta'(\theta)$.
Since $0 \le \phi'(\mathbf{x}) \le 1$, $(\phi(\mathbf{x}) - \phi'(\mathbf{x}))(f(\mathbf{x} \mid \theta_1) - k f(\mathbf{x} \mid \theta_0)) \ge 0$ for every $\mathbf{x}$. Thus,
$$\text{(3)} \quad 0 \le \int [\phi(\mathbf{x}) - \phi'(\mathbf{x})][f(\mathbf{x} \mid \theta_1) - k f(\mathbf{x} \mid \theta_0)]\, d\mathbf{x} = \beta(\theta_1) - \beta'(\theta_1) - k(\beta(\theta_0) - \beta'(\theta_0)).$$
Proof: Neyman-Pearson Lemma (cont’d)
(a) is proved by noting that $\beta(\theta_0) - \beta'(\theta_0) = \alpha - \beta'(\theta_0) \ge 0$. Thus, with $k \ge 0$ and (3),
$$0 \le \beta(\theta_1) - \beta'(\theta_1) - k(\beta(\theta_0) - \beta'(\theta_0)) \le \beta(\theta_1) - \beta'(\theta_1),$$
showing $\beta(\theta_1) \ge \beta'(\theta_1)$. Since $\phi'$ is arbitrary and $\theta_1$ is the only point in $\Theta_0^c$, $\phi$ is a UMP test.
Proof: Neyman-Pearson Lemma (cont’d)
Proof of (b):
Now, let $\phi'$ be the test function of any UMP level $\alpha$ test. By (a), $\phi$, the test satisfying (1) and (2), is also a UMP level $\alpha$ test. So, $\beta(\theta_1) = \beta'(\theta_1)$. Using this, (3), and $k > 0$,
$$\alpha - \beta'(\theta_0) = \beta(\theta_0) - \beta'(\theta_0) \le 0.$$
Since $\phi'$ is a level $\alpha$ test, $\beta'(\theta_0) \le \alpha$, so $\beta'(\theta_0) = \alpha$; that is, $\phi'$ is a size $\alpha$ test, implying that (3) is an equality. But the nonnegative integrand in (3) will be 0 only if $\phi'$ satisfies (1), except perhaps on a set $A$ with $\int_A f(\mathbf{x} \mid \theta_i)\, d\mathbf{x} = 0$.
LRTs and MLEs


Let $\hat{\theta}$ be the MLE of $\theta$ obtained by maximizing $L(\theta \mid \mathbf{x})$ over $\Theta$, and let $\hat{\theta}_0$ be the MLE of $\theta$ obtained by maximizing over $\Theta_0$. Then the LRT statistic is
$$\lambda(\mathbf{x}) = \frac{L(\hat{\theta}_0 \mid \mathbf{x})}{L(\hat{\theta} \mid \mathbf{x})}.$$
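A minimal numerical sketch of this MLE form (added for illustration; the data, the N(μ, 1) model with known variance, the point null, and the use of scipy's optimizer are all assumptions): the unrestricted MLE is found by maximizing the likelihood over Θ = ℝ, the maximizer over Θ0 = {μ0} is μ0 itself, and λ(x) is the ratio of the two maximized likelihoods.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

x = np.array([5.4, 4.8, 5.9, 5.1, 5.3, 5.7, 4.9, 5.2])  # hypothetical sample
mu0 = 5.0                             # H0: mu = mu0, variance known to be 1

def neg_loglik(mu):
    """Negative log-likelihood of a N(mu, 1) model for the sample x."""
    return -norm.logpdf(x, loc=mu, scale=1.0).sum()

# Unrestricted MLE over Theta = R, found numerically (analytically it is x.mean()).
mu_hat = minimize_scalar(neg_loglik).x

# Under the point null Theta_0 = {mu0}, the restricted maximizer is mu0 itself, so
# lambda(x) = L(mu0 | x) / L(mu_hat | x).
lam = np.exp(-neg_loglik(mu0) + neg_loglik(mu_hat))
print("MLE:", mu_hat, "  lambda(x):", lam)
```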
Example: Normal LRT
Let $X_1, \ldots, X_n$ be a random sample from a N($\theta$, 1) population.
Test $H_0: \theta = \theta_0$ versus $H_1: \theta \ne \theta_0$.
Then the LRT statistic is:
$$\lambda(\mathbf{x}) = \frac{L(\theta_0 \mid \mathbf{x})}{L(\bar{x} \mid \mathbf{x})}
= \frac{(2\pi)^{-n/2}\, e^{-\sum_{i=1}^n (x_i - \theta_0)^2/2}}{(2\pi)^{-n/2}\, e^{-\sum_{i=1}^n (x_i - \bar{x})^2/2}}
= e^{-\left[\sum_{i=1}^n (x_i - \theta_0)^2 - \sum_{i=1}^n (x_i - \bar{x})^2\right]/2}$$
Note that $\sum_{i=1}^n (x_i - \theta_0)^2 = \sum_{i=1}^n (x_i - \bar{x})^2 + n(\bar{x} - \theta_0)^2$.
$$\therefore\ \lambda(\mathbf{x}) = e^{-n(\bar{x} - \theta_0)^2/2}$$
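A quick numerical check of this simplification (an added sketch; the sample values and θ0 are assumptions): the code verifies the identity $\sum (x_i - \theta_0)^2 = \sum (x_i - \bar{x})^2 + n(\bar{x} - \theta_0)^2$ and confirms that the direct ratio of the two likelihoods equals $e^{-n(\bar{x} - \theta_0)^2/2}$.

```python
import numpy as np
from scipy.stats import norm

x = np.array([1.2, 0.4, -0.3, 0.9, 1.5, 0.1, 0.7])  # hypothetical sample
theta0 = 0.0
n, xbar = len(x), x.mean()

# The identity used on the slide.
lhs = ((x - theta0) ** 2).sum()
rhs = ((x - xbar) ** 2).sum() + n * (xbar - theta0) ** 2
print("identity holds:", np.isclose(lhs, rhs))

# Direct ratio of likelihoods vs. the closed form e^{-n (xbar - theta0)^2 / 2}.
lam_direct = np.exp(norm.logpdf(x, theta0, 1).sum() - norm.logpdf(x, xbar, 1).sum())
lam_closed = np.exp(-n * (xbar - theta0) ** 2 / 2)
print("lambda direct :", lam_direct, "  closed form:", lam_closed)
```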
Example: Normal LRT (cont’d)
 We will reject $H_0$ if $\lambda(\mathbf{x}) \le c$. We have:
$$\{\mathbf{x} : \lambda(\mathbf{x}) \le c\} = \{\mathbf{x} : e^{-n(\bar{x} - \theta_0)^2/2} \le c\}
= \{\mathbf{x} : (\bar{x} - \theta_0)^2 \ge -(2 \ln c)/n\}
= \{\mathbf{x} : |\bar{x} - \theta_0| \ge \sqrt{-(2 \ln c)/n}\,\},$$
where $0 \le \sqrt{-(2 \ln c)/n} < \infty$.
 Therefore, the LRTs are those tests that reject $H_0$ if the sample mean differs from the value $\theta_0$ by more than $\sqrt{-(2 \ln c)/n}$.
Example: Size of the Normal LRT
Choose $c$ such that $\sup_{\theta \in \Theta_0} P_\theta(\lambda(\mathbf{X}) \le c) = \alpha$.
For the previous example, we have $\Theta_0 = \{\theta_0\}$, and $\sqrt{n}(\bar{X} - \theta_0) \sim N(0, 1)$ if $\theta = \theta_0$.
The test that rejects $H_0$ if
$$|\bar{X} - \theta_0| \ge \frac{z_{\alpha/2}}{\sqrt{n}},$$
where $z_{\alpha/2}$ satisfies $P(Z \ge z_{\alpha/2}) = \alpha/2$ with $Z \sim N(0,1)$, is a size $\alpha$ test.
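A short sketch of this size calculation (added here; the choices of n, θ0, and α are assumptions): $z_{\alpha/2}$ comes from the standard normal quantile function, and a simulation under $H_0$ confirms that the rejection probability is close to α.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, theta0, alpha = 30, 2.0, 0.05
z = norm.ppf(1 - alpha / 2)          # z_{alpha/2}: P(Z >= z) = alpha/2

# Simulate samples under H0 and estimate the rejection probability (the size).
xbar = rng.normal(theta0, 1, size=(200_000, n)).mean(axis=1)
reject = np.abs(xbar - theta0) >= z / np.sqrt(n)
print("nominal size:", alpha, "  estimated size:", reject.mean())
```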
Sufficient Statistics and LRTs
Theorem: If $T(\mathbf{X})$ is a sufficient statistic for $\theta$, and $\lambda^*(t)$ and $\lambda(\mathbf{x})$ are the LRT statistics based on $T$ and $\mathbf{X}$, respectively, then $\lambda^*(T(\mathbf{x})) = \lambda(\mathbf{x})$ for every $\mathbf{x}$ in the sample space.
Example: Normal LRT with unknown variance
Let $X_1, \ldots, X_n$ be a random sample from a N($\mu$, $\sigma^2$) population.
Test $H_0: \mu \le \mu_0$ versus $H_1: \mu > \mu_0$. (Note: $\sigma^2$ is a nuisance parameter.)
Then the LRT statistic is:
$$\lambda(\mathbf{x}) = \frac{\max_{\{(\mu,\sigma^2):\ \mu \le \mu_0,\ \sigma^2 \ge 0\}} L(\mu, \sigma^2 \mid \mathbf{x})}{\max_{\{(\mu,\sigma^2):\ -\infty < \mu < \infty,\ \sigma^2 \ge 0\}} L(\mu, \sigma^2 \mid \mathbf{x})}
= \begin{cases} 1 & \text{if } \bar{x} \le \mu_0 \\[6pt] \dfrac{L(\mu_0, \hat{\sigma}_0^2 \mid \mathbf{x})}{L(\hat{\mu}, \hat{\sigma}^2 \mid \mathbf{x})} & \text{if } \bar{x} > \mu_0 \end{cases}$$
which is equivalent to a test based on Student's t statistic.
Example: Normal LRT with unknown variance (cont'd)
If $\bar{x} > \mu_0$, then
$$\lambda(\mathbf{x}) = \frac{L(\mu_0, \hat{\sigma}_0^2 \mid \mathbf{x})}{L(\hat{\mu}, \hat{\sigma}^2 \mid \mathbf{x})}
= \frac{(2\pi\hat{\sigma}_0^2)^{-n/2}\, e^{-\sum_{i=1}^n (x_i - \mu_0)^2 / (2\hat{\sigma}_0^2)}}{(2\pi\hat{\sigma}^2)^{-n/2}\, e^{-\sum_{i=1}^n (x_i - \hat{\mu})^2 / (2\hat{\sigma}^2)}}
= \frac{(\hat{\sigma}_0^2)^{-n/2}}{(\hat{\sigma}^2)^{-n/2}}
= \left(\frac{\hat{\sigma}^2}{\hat{\sigma}_0^2}\right)^{n/2}$$
(the exponential factors cancel because each exponent equals $-n/2$).
Example: Normal LRT with unknown variance (cont'd)
Note that
$$\hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2 = \frac{n-1}{n} S^2$$
and
$$\hat{\sigma}_0^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \mu_0)^2 = \frac{1}{n}\left[\sum_{i=1}^n (x_i - \bar{x})^2 + n(\bar{x} - \mu_0)^2\right] = \frac{n-1}{n} S^2 + (\bar{x} - \mu_0)^2.$$
Therefore, $\lambda(\mathbf{x}) \le c$ when
$$\lambda(\mathbf{x}) = \left(\frac{\hat{\sigma}^2}{\hat{\sigma}_0^2}\right)^{n/2}
= \left(\frac{\frac{n-1}{n} S^2}{\frac{n-1}{n} S^2 + (\bar{x} - \mu_0)^2}\right)^{n/2}
= \left(\frac{1}{1 + \frac{n}{n-1}\,\frac{(\bar{x} - \mu_0)^2}{S^2}}\right)^{n/2} \le c' \quad \text{and} \quad \bar{x} > \mu_0;$$
this is analogous to rejecting $H_0$ when
$$\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \ge t_{n-1,\alpha}.$$
Asymptotic Distribution of the LRT – Simple H0
Theorem: For testing $H_0: \theta = \theta_0$ versus $H_a: \theta \ne \theta_0$, suppose $X_1, \ldots, X_n$ are iid $f(x \mid \theta)$, $\hat{\theta}$ is the MLE of $\theta$, and $f(x \mid \theta)$ satisfies the following regularity conditions:
(1) The parameter is identifiable; i.e., if $\theta \ne \theta'$, then $f(x \mid \theta) \ne f(x \mid \theta')$.
(2) The densities $f(x \mid \theta)$ have common support, and $f(x \mid \theta)$ is differentiable in $\theta$.
(3) The parameter space $\Theta$ contains an open set $\omega$ of which the true parameter value $\theta_0$ is an interior point.
(4) For every $x \in \mathcal{X}$, the density $f(x \mid \theta)$ is three times differentiable with respect to $\theta$, the third derivative is continuous in $\theta$, and $\int f(x \mid \theta)\, dx$ can be differentiated three times under the integral sign.
(5) For every $\theta_0 \in \Theta$, there exist a constant $c > 0$ and a function $M(x)$ (both of which may depend on $\theta_0$) such that
$$\left|\frac{\partial^3}{\partial \theta^3} \log f(x \mid \theta)\right| \le M(x) \quad \text{for all } x \in \mathcal{X},\ \theta_0 - c < \theta < \theta_0 + c,$$
with $E_{\theta_0}[M(X)] < \infty$.
Asymptotic Distribution of the LRT – Simple H0 (cont'd)
Then under $H_0$, as $n \to \infty$,
$$-2 \log \lambda(\mathbf{X}) \xrightarrow{\ D\ } \chi^2_1.$$
If, instead, the null hypothesis is $H_0: \theta \in \Theta_0$, then
$$-2 \log \lambda(\mathbf{X}) \xrightarrow{\ D\ } \chi^2_\nu,$$
where the degrees of freedom $\nu$ of the limiting distribution is the difference between the number of free parameters specified by $\theta \in \Theta_0$ and the number of free parameters specified by $\theta \in \Theta$.
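A small simulation sketch of this result (an addition; the sample size, number of replications, and θ0 are assumptions): for the known-variance normal example above, $-2\log\lambda(\mathbf{X}) = n(\bar{X} - \theta_0)^2$, and under $H_0$ its distribution should be close to $\chi^2_1$.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
n, theta0, reps = 50, 0.0, 100_000

# Under H0, -2 log lambda(X) = n * (Xbar - theta0)^2 for the N(theta, 1) example.
xbar = rng.normal(theta0, 1, size=(reps, n)).mean(axis=1)
stat = n * (xbar - theta0) ** 2

# Compare simulated quantiles with chi-squared(1) quantiles.
for q in (0.5, 0.9, 0.95, 0.99):
    print(f"q={q}: simulated {np.quantile(stat, q):.3f}   chi2_1 {chi2.ppf(q, df=1):.3f}")
```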
Restrictions
 When a UMP test does not exist, other methods
must be used.

 Consider a restricted subset of tests and search for a test that is UMP within that subset.
References
Casella, G. and Berger, R. L. (2002). Statistical Inference. Duxbury: Pacific Grove, CA.
Neyman, J. and Pearson, E., “On The Use and Interpretation of Certain Test Criteria for
Purposes of Statistical Inference, Part I,” Biometrika, Vol. 20A, No.1/2 (July 1928),
pp.175-240.
Neyman, J. and Pearson, E., “On the Problem of the most Efficient Tests of Statistical
Hypotheses,” Philosophical Transactions of the Royal Society of London, Vol. 231 (1933),
pp. 289-337.
http://www-history.mcs.st-andrews.ac.uk/Mathematicians/Pearson_Egon.html
http://www-history.mcs.st-andrews.ac.uk/Mathematicians/Neyman.html