Hypothesis Testing
Basic Problem We are interested in deciding whether some data credits or discredits some “hypothesis” (often a statement about the value of a parameter or the relationship among parameters).
Suppose we consider the value of
μ = (true) average lifetime of some battery of a certain cell size and for a specified usage,
and hypothesize:
H0: μ = 160
H1: μ ≠ 160
This would usually involve a scenario in which either (1) 160 is a “standard,” or (2) the previous μ was 160 (has there been a change?), or something analogous.
H0: μ = 160
H1: μ ≠ 160
H0 is called the “Null Hypothesis”; H1 is called the “Alternate Hypothesis.”
We must make a decision whether to ACCEPT H0 or REJECT H0. (ACCEPT H0 is the same as REJECT H1; REJECT H0 is the same as ACCEPT H1.) We decide by looking at X̄ from a sample of size n.
Basic Logic:
H0: μ = 160
H1: μ ≠ 160
(1) Assume (for the moment) that H0 is true.
(2) Find the probability that X̄ would be “that far away,” if, indeed, H0 is true.
(3) If the probability is small, we reject H0; if it isn’t too small, we give the benefit of the doubt to H0 and accept H0.
BUT — what’s “too small” or “not too small”? Essentially — you decide!
You pick (somewhat arbitrarily) a value α, sometimes .10, and most often α = .05, called the SIGNIFICANCE LEVEL.
If the probability of getting “as far away” from the value alleged by H0 as we indeed got is greater than or equal to α, we say “the chance of getting what we got isn’t that small, and the difference could well be due to sample error, and, hence, we accept H0 (or, at least, do not reject H0).”
If the probability is < α, we say that the chance of getting the result we got is too small (beyond a reasonable doubt) to have been simply “sample error,” and hence, we REJECT H0.
Suppose we want to decide if a coin is fair. We flip it 100 times.
H0: p = 1/2 (the coin is fair)
H1: p ≠ 1/2 (the coin is not fair)
Let X = number of heads.
Case 1) X = 49: Perfectly consistent with H0; could easily happen if p = 1/2. ACCEPT H0.
Case 2) X = 81: Are you kiddin’? If p = 1/2, the chance of gettin’ what we got is one in a billion! REJECT H0.
Case 3) X = 60: NOT CLEAR!
What is the chance that, if p = 1/2, we’d get “as much as” 10 away from the ideal (of 50 out of 100)?
If this chance < α, reject H0.
If this chance ≥ α, accept H0.
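As a check on the unclear Case 3, that chance can be computed exactly from the Binomial(100, 1/2) distribution. A minimal sketch in Python (the α = .05 cutoff follows the text; everything else is standard library):

```python
from math import comb

# P(X >= 60 or X <= 40) when X ~ Binomial(100, 1/2):
# by symmetry, this is 2 * P(X >= 60).
p_value = 2 * sum(comb(100, k) for k in range(60, 101)) / 2**100
print(f"P(at least 10 away from 50) = {p_value:.4f}")

alpha = 0.05
print("REJECT H0" if p_value < alpha else "ACCEPT H0")
```

The chance comes out a bit above .05, so with α = .05 the X = 60 result is (barely) given the benefit of the doubt.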
Important logic: H0 gets a huge “benefit of the doubt”; H1 has the “Burden of Proof”; we reject H0 only if the results are “overwhelming.”
To tie together the α value chosen and the X̄ values which lead to accepting (or rejecting) H0, we must figure out the probability law of X̄ if H0 is true.
Assuming a NORMAL distribution (and the Central Limit Theorem suggests that this is overwhelmingly likely to be true), the answer is:
[Figure: normal curve for X̄, centered at μ = 160.]
We can find (using normal distribution tables) a region such that α = the probability of being outside the region:
[Figure: normal curve for X̄ centered at μ = 160, with area α/2 in each tail, cutoffs at 150.2 and 169.8.]
(I made up the values of 150.2 and 169.8.)
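Those cutoffs can be reproduced in code instead of from tables. A sketch under the assumption that the standard deviation of X̄ is 5 (a hypothetical value, chosen here because with α = .05 it lands almost exactly on the made-up 150.2 and 169.8):

```python
from statistics import NormalDist

mu0 = 160     # value of mu alleged by H0
sd_xbar = 5   # hypothetical Std. Dev. of X-bar (not given in the text)
alpha = 0.05

# two-sided region: put alpha/2 of the probability in each tail
z = NormalDist().inv_cdf(1 - alpha / 2)   # roughly 1.96
lower = mu0 - z * sd_xbar
upper = mu0 + z * sd_xbar
print(f"Acceptance Region: {lower:.1f} to {upper:.1f}")
```

This prints an acceptance region of roughly 150.2 to 169.8, matching the figure.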
Note: logic suggests (in this example) a “rejection” region which is two-sided; in experimental design, most regions are one-sided.
150.2 ≤ X̄ ≤ 169.8 is called the Acceptance Region (AR).
X̄ < 150.2 or X̄ > 169.8 is called the Critical Region (CR).
[Figure: normal curve for X̄ centered at μ = 160, with α/2 in each tail beyond 150.2 and 169.8.]
Decision Rule: If X̄ is in the AR, accept H0. If X̄ is in the CR, reject H0.
X̄ is called the “TEST STATISTIC” (that function of the data whose value we examine to see if it’s in the AR or CR).
ONE-SIDED LOWER TAIL:
H0: μ ≥ 20
H1: μ < 20
(critical value below 20)
ONE-SIDED UPPER TAIL:
H0: μ ≤ 10
H1: μ > 10
(critical value above 10)
α has another meaning, which in many contexts is important:
If we accept H0 and H0 is true: Good! (Correct!)
If we reject H0 and H0 is true: Type I Error, or “α Error.”
If we accept H0 and H0 is false: Type II Error, or “β Error.”
If we reject H0 and H0 is false: Good! (Correct!)
α = Probability of Type I error = P(rej. H0 | H0 true)
β = Probability of Type II error = P(acc. H0 | H0 false)
We often preset α. The value of β depends on the specifics of H1 (and most often in the real world, we don’t know these specifics).
EXAMPLE:
H0: μ ≤ 100
H1: μ > 100
Suppose the Critical Value = 141:
[Figure: normal curve for X̄ centered at μ = 100, with critical value C = 141.]
What is β? Since H1 does not specify a single value of μ, β depends on the true μ. These values correspond to a value of 25 for the Std. Dev. of X̄:
β = P(X̄ < 141 | μ = 150) = .3594
β = P(X̄ < 141 | μ = 160) = .2236
β = P(X̄ < 141 | μ = 170) = .1230
β = P(X̄ < 141 | μ = 180) = .0594
In each case, β = P(X̄ < 141 | H0 false), evaluated at that particular true μ.
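These β values can be checked directly in code. A sketch using Python’s standard library, with the Std. Dev. of X̄ equal to 25 as stated in the example:

```python
from statistics import NormalDist

critical_value = 141
sd_xbar = 25  # Std. Dev. of X-bar, as given in the example

# beta = P(X-bar < 141 | true mu), for several true mu values under H1
for mu in (150, 160, 170, 180):
    beta = NormalDist(mu, sd_xbar).cdf(critical_value)
    print(f"mu = {mu}: beta = {beta:.4f}")
```

This reproduces .3594, .2236, .1230, and .0594, and makes the pattern visible: the farther the true μ is from the H0 boundary, the smaller β becomes.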
Note: Had α been preset at .025 (instead of .05), C would have been 149 (and β would be larger); had α been preset at .10, C would have been 132 (and β would be smaller). α and β “trade off.”
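The critical values quoted in that note follow from the same normal setup. A sketch assuming H0’s boundary mean of 100 and a Std. Dev. of X̄ of 25 (consistent with the C = 141 used above for α = .05):

```python
from statistics import NormalDist

mu0 = 100     # boundary value of mu under H0
sd_xbar = 25  # Std. Dev. of X-bar, as in the example

# one-sided upper-tail test: C is the point with area alpha to its right
for alpha in (0.05, 0.025, 0.10):
    c = mu0 + NormalDist().inv_cdf(1 - alpha) * sd_xbar
    print(f"alpha = {alpha}: C = {c:.0f}")
```

Smaller α pushes C farther out (to 149), which enlarges β; larger α pulls C in (to 132), shrinking β — the trade-off in action.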
In ANOVA, we have
H0: μ1 = μ2 = ··· = μc
H1: not all (column) means are equal.
The probability law of “Fcalc” in the ANOVA table is an F distribution with appropriate degrees-of-freedom values, assuming H0 is true:
[Figure: F distribution with critical value C; α is the area to the right of C.]
Fcalc = MSBcol / MSWError
E(MSBcol) = σ² + Vcol
E(MSWError) = σ²
The larger the ratio Fcalc, the more suggestive that H0 is false.
C is the critical value chosen so that, if Vcol = 0 (all μ’s equal), P(Fcalc > C) = α.
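To make the ratio concrete, here is a minimal sketch that computes Fcalc = MSBcol / MSWError by hand for a small made-up data set of c = 3 columns with 3 observations each (the numbers are hypothetical, not from the text):

```python
# hypothetical data: three columns (factor levels), three observations each
columns = [[10, 12, 11], [14, 15, 16], [9, 8, 10]]

n_total = sum(len(col) for col in columns)
grand_mean = sum(sum(col) for col in columns) / n_total

# between-columns sum of squares and mean square (df = c - 1)
ssb = sum(len(col) * (sum(col) / len(col) - grand_mean) ** 2 for col in columns)
msb = ssb / (len(columns) - 1)

# within-columns (error) sum of squares and mean square (df = n - c)
ssw = sum(sum((x - sum(col) / len(col)) ** 2 for x in col) for col in columns)
msw = ssw / (n_total - len(columns))

f_calc = msb / msw
print(f"MSB = {msb:.1f}, MSW = {msw:.1f}, F_calc = {f_calc:.1f}")
```

Here the column means (11, 15, 9) differ a lot relative to the within-column scatter, so Fcalc comes out far above typical F critical values for these degrees of freedom — strongly suggestive that H0 is false.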
Note: What is β here? (The case in which the μ’s are not all equal, i.e., the level of the factor does matter!!)
Answer: It cannot be determined, because we would need an exact specification of the “non-equality.”
[Hardly ever known, in practice!]
HOWEVER: the fact that we cannot compute the numerical value of β in no way means it doesn’t exist!
And we can prove that whatever β is, it still “trades off” with α.