Transcript Slide 1
1 Schaum’s Outline
Probability and Statistics
Chapter 7: HYPOTHESIS TESTING
presented by Professor Carol Dahl
Examples by Alfred Aird, Kira Jeffery, Catherine Keske, Hermann Logsend, Yris Olaya
2 Outline of Topics
Topics Covered
Statistical Decisions
Statistical Hypotheses
Null Hypotheses
Tests of Hypotheses
Type I and Type II Errors
Level of Significance
Tests Involving the Normal Distribution
One and Two-Tailed Tests
P-Value
3 Outline of Topics (Continued )
Special Tests of Significance
Large Samples
Small Samples
Estimation Theory/Hypotheses Testing Relationship
Operating Characteristic Curves and Power of a Test
Fitting Theoretical Distributions to Sample Frequency Distributions
Chi-Square Test for Goodness of Fit
4 “The Truth Is Out There”
The Importance of Hypothesis Testing
Hypothesis testing
helps evaluate models based upon real data
enables one to build a statistical model
enhances your credibility as analyst, economist
5 Statistical Decisions
Innocent until proven guilty principle
Want to prove someone is guilty
Assume the opposite or status quo: innocent
$H_0$: Innocent
$H_1$: Guilty
Take subsample of possible information
If evidence is not consistent with innocence, reject $H_0$
Otherwise the person is pronounced not guilty, which is not the same as innocent
6 Statistical Decisions
Status quo innocence = null hypothesis
Evidence = sample result
Reasonable doubt = confidence level
7 Statistical Decisions
E.g. a tantalum ore deposit is feasible if quality > 0.0600 g/kg with 99% confidence
100 samples collected at random from a large deposit
Sample mean 0.071 g/kg, sample variance 0.0025 (sample standard deviation 0.05 g/kg)
8 Statistical Decisions
Should the deposit be developed?
Evidence = 0.071 (sample mean)
Reasonable doubt = 99%
Status quo = do not develop the deposit
$H_0: \mu \le 0.0600$
$H_1: \mu > 0.0600$
9 Statistical Hypothesis
General Principles
Inferences about population using sample statistic
Prove A is true by assuming it isn’t true
Results of experiment (sample) compared with model
If the sample results are unlikely under the model, reject model
If the sample results are explained by the model, do not reject
10 Statistical Hypothesis
Event A fairly likely, model would be retained
Event B unlikely, model would be rejected
[Figure: density over z centred at 0; A marks a likely region near the centre, B marks a tail area]
11 Statistical Decisions
Should the deposit be developed?
Evidence = 0.071 (sample mean)
Reasonable doubt = 99%
Status quo = do not develop the deposit
$H_0: \mu = 0.0600$
$H_1: \mu > 0.0600$
How likely is $H_0$?
12 Need Sampling Statistic
Need a statistic with
the population parameter
an estimate for the population parameter
its distribution
13 Need Sampling Statistic
Population Normal - Two Choices (Small Sample < 30)
Known Variance: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
Unknown Variance: $\frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} \sim t_{n-1}$
14 Need Sampling Statistic
Population Not-Normal, Large Sample
Known Variance: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$
Unknown Variance: $\frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} \sim N(0,1)$
Doesn’t matter whether the variance is known or not
If the population is finite and sampling is without replacement, an adjustment is needed
15 Normal Distribution
$X \sim N(0,1)$, $\mu = 0$
Within 1 SD: 68%
Within 2 SD: 95%
Within 3 SD: 99.7%
16 Statistical Decisions
Should the deposit be developed?
Evidence: 0.071 (sample mean), 0.0025 (sample variance), 0.05 g/kg (sample standard deviation)
Reasonable doubt = 99%
Status quo = do not develop the deposit
$H_0: \mu = 0.0600$
$H_1: \mu > 0.0600$
One tailed test
How likely is $H_0$?
17 Hypothesis test
Evidence: 0.071 (sample mean), 0.05 g/kg (sample standard deviation)
Reasonable doubt = 99%
Status quo = do not develop the deposit
$H_0: \mu = 0.0600$
$H_1: \mu > 0.0600$
$P\left(\frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} < Z_c\right) = 0.99 = 1 - \alpha$
18 Statistical Hypothesis
E.g. $Z = \frac{0.071 - 0.0600}{0.05/\sqrt{100}} = 2.2$
$Z = 2.2 < Z_c = 2.33$
Conclusion: Don’t reject $H_0$, don’t develop deposit
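The tantalum calculation on slides 16 to 18 can be reproduced with a few lines. This is a minimal sketch using only the Python standard library; the function name `one_tailed_z_test` and its parameters are illustrative and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_tailed_z_test(sample_mean, mu0, sample_sd, n, confidence):
    """Upper-tail z-test of H0: mu = mu0 against H1: mu > mu0."""
    # Standardize the sample mean using the standard error s / sqrt(n).
    z = (sample_mean - mu0) / (sample_sd / sqrt(n))
    # Critical value with `confidence` probability to its left.
    z_crit = NormalDist().inv_cdf(confidence)
    return z, z_crit, z > z_crit


# Values from the slides: mean 0.071, H0 value 0.0600, sd 0.05, n = 100, 99%.
z, z_crit, reject = one_tailed_z_test(0.071, 0.0600, 0.05, 100, 0.99)
print(f"z = {z:.2f}, z_c = {z_crit:.2f}, reject H0: {reject}")
```

With the slide's numbers the statistic stays below the critical value, so the sketch agrees with the slide's conclusion not to reject the null.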
19 Null Hypothesis
Hypotheses cannot be proven
Reject or fail to reject based on likelihood of event occurring
Null hypothesis is not accepted
20 Test of Hypotheses
Maple Creek Mine and Potaro Diamond field in Guyana
Mine potential for producing large diamonds
Experts want to know true mean carat size produced
True mean said to be 4 carats
Experts want to know if true with 95% confidence
Random sample taken
Sample mean found to be 3.6 carats
Based on sample, is 4 carats true mean for mine?
21 Tests of Hypotheses Tests referred to as: “Tests of Hypotheses” “Tests of Significance” “Rules of Decision”
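22 Types of Errors
$H_0: \mu = 4$ (suppose this is true)
$H_1: \mu \ne 4$
Two tailed test
Choose $\alpha = 0.05$
Sample n = 100 (assume X is normal), $\sigma = 1$
$P\left(-1.96 < \frac{\bar{X} - 4}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 = 1 - \alpha$
[Figure: standard normal density with rejection area $\alpha/2$ in each tail]
23 Type I error ($\alpha$) – reject true $H_0$
$H_0: \mu = 4$, suppose true
$P\left(-1.96 < \frac{\bar{X} - 4}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 = 1 - \alpha$
[Figure: standard normal density with rejection area $\alpha/2$ in each tail]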
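24 Type II Error ($\beta$) - Fail to Reject False $H_0$
$H_0: \mu = 4$ not true, $\mu = 6$ true
$\bar{X} - 4$ has mean 2, not mean 0
[Figure: sampling densities centred at 0 ($\mu = 4$) and at 2 ($\mu = 6$); $\beta$ is the area of the $\mu = 6$ density that falls in the non-rejection region]
25 Lower Type I: What happens to Type II?
$H_0: \mu = 4$ not true, $\mu = 6$ true
[Figure: densities centred at 0 ($\mu = 4$) and at 2 ($\mu = 6$), with $\beta$ marked]
26 Higher $\mu$: What happens to Type II?
$H_0: \mu = 4$ not true, $\mu = 7$ true
$\bar{X} - 4$ has mean 3, not mean 0
[Figure: densities centred at 0 ($\mu = 4$) and at 3 ($\mu = 7$), with $\beta$ marked]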
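The questions on slides 24 to 26 can be answered numerically. The following is a rough sketch, assuming the horizontal axis of the figures is in standard-error units, so that a true mean of 6 or 7 sits 2 or 3 units above the null value of 4. That scaling, the function name `type_ii_error`, and the `shift` parameter are assumptions made for illustration.

```python
from statistics import NormalDist


def type_ii_error(shift, alpha):
    """Beta for a two-tailed z-test when the true mean lies `shift`
    standard errors above the hypothesized mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the statistic still lands in the non-rejection region.
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)


beta_base = type_ii_error(shift=2, alpha=0.05)       # mu = 6 true
beta_low_alpha = type_ii_error(shift=2, alpha=0.01)  # lower Type I risk
beta_far = type_ii_error(shift=3, alpha=0.05)        # mu = 7 true

print(f"beta (shift 2, alpha 0.05): {beta_base:.3f}")
print(f"beta (shift 2, alpha 0.01): {beta_low_alpha:.3f}")
print(f"beta (shift 3, alpha 0.05): {beta_far:.3f}")
```

Under these assumptions, lowering $\alpha$ raises $\beta$, while a true mean further from the null lowers $\beta$.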
27 Type I and Type II Errors
Two types of errors can occur in hypothesis testing
To reduce errors, increase sample size when possible
$P(\text{Type I Error}) = \alpha$
$P(\text{Type II Error}) = \beta$
             Reject H0           Do Not Reject H0
H0 True      Type I Error        Correct Decision
H0 False     Correct Decision    Type II Error
28 To Reduce Errors
Increase sample size when possible
[Figure: Mean Sampling Distributions, Different Sample Sizes; curves for the population and for n = 5, 10, 20]
29 Error Examples
Type I Error – rejecting a true null hypothesis
Convicting an innocent person
Rejecting that true mean carat size is 4 when it is
Type II Error – not rejecting a false null hypothesis
Setting a guilty person free
Not rejecting that mean carat size is 4 when it’s not
30 Level of Significance ($\alpha$)
α = max probability we’re willing to risk Type I Error
= tail area of probability density function
If Type I Error’s “cost” high, choose α low
α defined before hypothesis test conducted
α typically defined as 0.10, 0.05 or 0.01
α = 0.10 for 90% confidence of correct test decision
α = 0.05 for 95% confidence of correct test decision
α = 0.01 for 99% confidence of correct test decision
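Each choice of α maps to a critical value of the standard normal distribution. The sketch below tabulates the one-tailed and two-tailed values for the three typical levels; the helper name `critical_values` is made up for this example.

```python
from statistics import NormalDist


def critical_values(alpha):
    """Return (one-tailed, two-tailed) standard normal critical values."""
    std_normal = NormalDist()
    one_tailed = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)  # alpha / 2 in each tail
    return one_tailed, two_tailed


table = {alpha: critical_values(alpha) for alpha in (0.10, 0.05, 0.01)}
for alpha, (one_tailed, two_tailed) in table.items():
    print(f"alpha = {alpha:.2f}: one-tailed {one_tailed:.3f}, "
          f"two-tailed {two_tailed:.3f}")
```

The two-tailed value for α = 0.01 comes out near 2.576, which the slides round to 2.575.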
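31 Diamond Hypothesis Test Example
$H_0: \mu = 4$
$H_1: \mu \ne 4$
Choose α = 0.01 for 99% confidence
Sample n = 100, $\sigma = 1$
$\bar{X} = 3.6$, $-Z_c = -2.575$, $Z_c = 2.575$
[Figure: standard normal density with rejection area 0.005 in each tail, beyond -2.575 and 2.575]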
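32 Example Continued
$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = -2$
$-2$ $(z)$ is not below $-2.575$ $(-z_{\alpha/2})$
Observed not “significantly” different from expected
Fail to reject null hypothesis
At the 1% significance level the sample is consistent with a true mean of 4 carats; this does not prove that the mean is 4 carats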
33 Tests Involving the t Distribution
Billy Ray has inherited large, 25,000 acre homestead
Located on outskirts of Murfreesboro, Arkansas, near:
Crater of Diamonds State Park
Prairie Creek Volcanic Pipe
Land now used for agriculture and recreation
No official mining has taken place
34 Case Study in Statistical Analysis
Billy Ray’s Inheritance
Billy Ray must now decide upon land usage
Options:
Exploration for diamonds
Conservation
Land biodiversity and recreation
Agriculture and recreation
Land development
35 Consider Costs and Benefits of Mining
Cost and Benefits of Mining
Opportunity cost
Excessive diamond exploration damages land’s value
Exploration and Mining Costs
Benefit
Value of mineral produced
36 Consider Costs and Benefits of Mining
Cost and Benefits of Mining
Sample for geologic indicators for diamonds: kimberlite or lamproite
Larger sample more likely to represent “true population”
Larger sample will cost more
37 How to decide one tailed or two tailed
One tailed test
Do we change status quo only if it’s bigger than null?
Do we change status quo only if it’s smaller than null?
Two tailed test
Change status quo if it’s bigger or if it’s smaller
38 Tests of Mean: Normal or t
Population normal, known variance, small sample: Normal
Population normal, unknown variance, small sample: t
Large sample: Normal
39 Difference Normal and t
[Figure: normal and t densities plotted from -5 to 5]
t has “fatter” tails than the normal bell curve
40 Hypothesis and Sample
Need at least 30 g/m³ to mine
Null hypothesis $H_0: \mu = 30$
Alternative hypothesis $H_1$: ?
Sample data: n = 16 (holes drilled), X close to normal
$\bar{X} = 31$ g/m³, variance of the sample mean $\hat{s}^2/n = 0.268$
41 Normal or t?
One tailed
Null hypothesis $H_0: \mu = 30$
Alternative hypothesis $H_1: \mu > 30$
Sample data: n = 16 (holes drilled)
$\bar{X} = 31$ g/m³, variance $\hat{s}^2 = 4.29$, standard deviation $\hat{s} = 2.07$
Small sample, estimated variance, X close to normal
Not exactly t but close if X close to normal
$t_{n-1} = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} \sim t_{16-1}$
42 Tests Involving the t Distribution
[Figure: $t_{15}$ density centred at 0; reject in the upper 5% tail beyond $t_c = 1.75$]
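43 Tests Involving the t Distribution
$t_{n-1} = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \frac{31 - 30}{2.07/\sqrt{16}} = 1.93$, distributed $t_{16-1}$
[Figure: $t_{15}$ density centred at 0; reject in the upper 5% tail beyond $t_c = 1.75$]
$t = 1.93 > t_c = 1.75$: reject $H_0$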
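The one-tailed t test on the drilling sample can be checked as follows. This is a small sketch; the critical value 1.75 is copied from the slide rather than computed, and the function name `t_statistic` is illustrative.

```python
from math import sqrt


def t_statistic(sample_mean, mu0, sample_sd, n):
    """t statistic for H0: mu = mu0 with an estimated standard deviation."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))


T_CRIT = 1.75  # upper 5% point of t with 15 degrees of freedom, from the slide

t = t_statistic(sample_mean=31, mu0=30, sample_sd=2.07, n=16)
reject = t > T_CRIT
print(f"t = {t:.2f}, t_c = {T_CRIT}, reject H0: {reject}")
```

The statistic of roughly 1.93 exceeds 1.75, which matches the rejection region drawn on the slide.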
44 Wells produce oil
X = API gravity, approximately normal with mean 37
Periodically test to see if the mean has changed
If too heavy or too light, revise contract
$H_0$: ?
$H_1$: ?
Sample of 9 wells, $\bar{X} = 38$, $\hat{s} = 2$
What is test statistic? Normal or t?
$t_{n-1} = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}}$
45 Two tailed t test on mean
[Figure: t density centred at 0; reject $\alpha/2$ in each tail, beyond $-t_c$ and $t_c$]
46 Two tailed t test on mean
$H_0: \mu = 37$
$H_1: \mu \ne 37$
Sample of 9 wells, $\bar{X} = 38$, $\hat{s} = 2$, $\alpha = 10\%$
$t_{n-1} = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \frac{38 - 37}{2/\sqrt{9}} = 1.5$
47 P-values - one tailed test
Level of significance for a sample statistic under null
Smallest $\alpha$ for which statistic would reject null
$t_{16-1} = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \frac{31 - 30}{2.07/\sqrt{16}} = 1.93$
P = 0.04
=TDIST(1.93,15,1)
48 P-value two tailed test
$H_0: \mu = 37$
$H_1: \mu \ne 37$
Sample of 9 wells, $\bar{X} = 38$, $\hat{s} = 2$, $\alpha = 10\%$
$t_{n-1} = \frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \frac{38 - 37}{2/\sqrt{9}} = 1.5$
=TDIST(1.5,8,2) = 0.172
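The spreadsheet p-values on slides 47 and 48 can be approximated without a statistics package. The sketch below integrates the Student t density numerically with Simpson's rule; this is a stand-in chosen for illustration, not the method the spreadsheet uses, and the helper names `t_pdf` and `t_upper_tail` are hypothetical.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of the Student t distribution with `df` degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        # Simpson weights: 4 on odd interior points, 2 on even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the probability mass lies above 0; subtract the mass on [0, t].
    return 0.5 - total * h / 3


p_one_tailed = t_upper_tail(1.93, 15)     # drilling example, one tailed
p_two_tailed = 2 * t_upper_tail(1.5, 8)   # oil wells example, two tailed

print(f"one-tailed p = {p_one_tailed:.3f}")
print(f"two-tailed p = {p_two_tailed:.3f}")
```

The two-tailed value should land close to the 0.172 reported on slide 48, and the one-tailed value rounds to the 0.04 shown on slide 47.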
49 Formal Representation of p-Values
p-Value < $\alpha$: Reject $H_0$
p-Value > $\alpha$: Fail to reject $H_0$
50 More tests
Survey: ranking refinery managers
Daily refinery production (1000 b/cd)
Sample two refineries, sample sizes 40 and 35
First refinery: mean = 74, stand. dev. = 8
Second refinery: mean = 78, stand. dev. = 7
Questions: difference of means? variances? differences of variances?
Again Statistics Can Help!!!!
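51 Differences of Means
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
$X_1$ and $X_2$ normal with known variance, or large sample with known variance
$\alpha = 10\%$
[Figure: standard normal density with 5% in each tail, beyond $-Z_c$ and $Z_c$]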
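52 Differences of Means
$H_0: \mu_1 - \mu_2 = 0$
$H_1: \mu_1 - \mu_2 \ne 0$
$n_1 = 40$, $\bar{X}_1 = 74$, $\sigma_1 = 8$
$n_2 = 35$, $\bar{X}_2 = 78$, $\sigma_2 = 7$
$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}} = \frac{74 - 78}{\sqrt{8^2/40 + 7^2/35}} = \frac{-4}{\sqrt{3}} = -2.309$
[Figure: standard normal density with 5% in each tail, beyond $-Z_c = -1.645$ and $Z_c = 1.645$]
$|Z| = 2.309 > 1.645$: reject $H_0$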
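The difference-of-means statistic for the two refineries can be recomputed directly from the formula on slide 51. This is a minimal sketch with an illustrative function name, `two_sample_z`; with the stated inputs the arithmetic gives about -2.31.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Z statistic for H0: mu1 - mu2 = 0 with known standard deviations."""
    return (mean1 - mean2) / sqrt(sd1**2 / n1 + sd2**2 / n2)


alpha = 0.10
z = two_sample_z(74, 8, 40, 78, 7, 35)
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two tailed, 5% in each tail
reject = abs(z) > z_crit
print(f"z = {z:.3f}, z_c = {z_crit:.3f}, reject H0: {reject}")
```

Because the absolute value of the statistic exceeds 1.645, the two-tailed test at α = 10% rejects equality of the two means.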
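53 Difference of Means
X normal, unknown but equal variances
Do above test with
$t_{n_1+n_2-2} = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{(n_1-1)\hat{s}_1^2 + (n_2-1)\hat{s}_2^2}{n_1+n_2-2} \cdot \frac{n_1+n_2}{n_1 n_2}}}$
[Figure: t density with rejection area $\alpha/2$ in each tail]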
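The pooled-variance statistic above can be sketched in code. Applying it to the refinery figures (means 74 and 78, standard deviations 8 and 7, sample sizes 40 and 35) is an assumption made here for illustration, since the slide gives only the formula; the function name `pooled_t` is also hypothetical.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic and degrees of freedom assuming equal unknown variances."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_err = sqrt(pooled_var * (n1 + n2) / (n1 * n2))
    return (mean1 - mean2) / std_err, df


t, df = pooled_t(74, 8, 40, 78, 7, 35)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic would then be compared with the $t_{n_1+n_2-2}$ critical value for the chosen α.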
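54 Variance test ($\chi^2$ distribution)
Two tailed
$\chi^2_{n-1} = \frac{(n-1)\hat{s}^2}{\sigma^2}$
[Figure: $\chi^2$ density with rejection area $\alpha/2$ in each tail]
55 Variance test ($\chi^2$ distribution)
One tailed
$\chi^2_{n-1} = \frac{(n-1)\hat{s}^2}{\sigma^2}$
[Figure: $\chi^2$ density with rejection area $\alpha$ in the upper tail]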
56 Hypothesis Test on Variance
Suppose best practice in refinery: $\sigma = 6$, i.e. $\sigma^2 = 6^2$
Does refinery 2 have different variability than best practice?
$H_0: \sigma^2 = 6^2$
$H_1: \sigma^2 \ne 6^2$
Example: 2nd refinery, n – 1 = 34, standard deviation = 7
$P\left(\chi^2_{c1} < \frac{(n-1)\hat{S}^2}{\sigma^2} < \chi^2_{c2}\right) = 1 - \alpha$
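57 Hypothesis Test on Variance
$H_0: \sigma^2 = 6^2$
$H_1: \sigma^2 \ne 6^2$
Example: 2nd refinery, n – 1 = 34, standard deviation = 7, $\alpha = 10\%$
$\frac{(n-1)\hat{S}^2}{\sigma^2} = \frac{(35-1)\,7^2}{6^2} = 46.278$
$P\left(\chi^2_{c1} < \frac{(n-1)\hat{S}^2}{\sigma^2} < \chi^2_{c2}\right) = 1 - \alpha$
[Figure: $\chi^2$ density with rejection area $\alpha/2$ in each tail]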
58 Hypothesis Test on Variance
Suppose best practice in refinery
$H_0: \sigma^2 = 6^2$
$H_1: \sigma^2 \ne 6^2$
Example: 2nd refinery, n – 1 = 34, standard deviation = 7
chiinv(0.95, 34) = 21.664, chiinv(0.05, 34) = 48.602
[Figure: $\chi^2$ density with rejection area $\alpha/2$ in each tail]
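59 Variance test ($\chi^2$ distribution)
Two tailed
$\chi^2 = \frac{(n-1)\hat{S}^2}{\sigma^2} = 46.278$
[Figure: $\chi^2$ density with 0.05 in each tail; critical values 21.664 and 48.602]
$21.664 < 46.278 < 48.602$: fail to reject $H_0$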
60 Variance test ($\chi^2$ distribution)
More variance than best practice
$H_0: \sigma^2 = 6^2$
$H_1: \sigma^2 > 6^2$
One tailed
[Figure: $\chi^2$ density with rejection area 0.10 in the upper tail]
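61 Variance test ($\chi^2$ distribution)
More variance than best practice
$H_0: \sigma^2 = 6^2$
$H_1: \sigma^2 > 6^2$
One tailed
$\chi^2 = \frac{(n-1)\hat{S}^2}{\sigma^2} = 46.278$
[Figure: $\chi^2$ density with rejection area 0.10 in the upper tail]
chiinv(0.10,34)=44.903
$46.278 > 44.903$: reject $H_0$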
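The variance test can be traced in a few lines. This sketch takes the benchmark standard deviation as 6, the value used in the slide arithmetic $(35-1)\,7^2/6^2$, and copies the critical values from the slides instead of computing them; the function and constant names are illustrative.

```python
def variance_chi_square(n, sample_sd, sigma0):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for a variance test."""
    return (n - 1) * sample_sd**2 / sigma0**2


# Critical values for 34 degrees of freedom, as quoted on the slides.
TWO_TAILED_LOWER = 21.664   # chiinv(0.95, 34)
TWO_TAILED_UPPER = 48.602   # chiinv(0.05, 34)
ONE_TAILED_UPPER = 44.903   # chiinv(0.10, 34)

chi2 = variance_chi_square(n=35, sample_sd=7, sigma0=6)

reject_two_tailed = chi2 < TWO_TAILED_LOWER or chi2 > TWO_TAILED_UPPER
reject_one_tailed = chi2 > ONE_TAILED_UPPER

print(f"chi-square = {chi2:.3f}")
print(f"two-tailed (alpha = 10%): reject H0 = {reject_two_tailed}")
print(f"one-tailed (alpha = 10%): reject H0 = {reject_one_tailed}")
```

The same statistic fails to reject in the two-tailed test but rejects in the one-tailed test, because the one-tailed test places all 10% in the upper tail.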
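62 Testing if Variances the Same
F Distribution
2 samples of size $n_1$ and $n_2$, sample variances $\hat{s}_1^2$, $\hat{s}_2^2$
$H_0: \sigma_1^2 = \sigma_2^2 \Rightarrow H_0: \sigma_2^2/\sigma_1^2 = 1$
$H_1: \sigma_1^2 \ne \sigma_2^2 \Rightarrow H_1: \sigma_2^2/\sigma_1^2 \ne 1$
$F = \frac{\hat{S}_1^2/\sigma_1^2}{\hat{S}_2^2/\sigma_2^2} = \frac{\hat{S}_1^2\,\sigma_2^2}{\hat{S}_2^2\,\sigma_1^2}$ is $F_{n_1-1,\,n_2-1}$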
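63 Testing if Variances the Same
F Distribution
$H_0: \sigma_1^2/\sigma_2^2 = 1$
$H_1: \sigma_1^2/\sigma_2^2 \ne 1$
Two tailed
Test statistic $\hat{S}_1^2/\hat{S}_2^2$
[Figure: F density with rejection area $\alpha/2$ in each tail]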
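64 Testing if Variances the Same
F Distribution
$H_0: \sigma_1^2/\sigma_2^2 = 1$
$H_1: \sigma_1^2/\sigma_2^2 > 1$
One tailed
Test statistic $\hat{S}_1^2/\hat{S}_2^2$
[Figure: F density with rejection area $\alpha = 10\%$ in the upper tail]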
65 Example Testing if Variances the Same
2 samples of size $n_1 = 40$ and $n_2 = 35$
Sample variances: $\hat{s}_1^2 = 8^2$, $\hat{s}_2^2 = 7^2$
$H_0: \sigma_2^2/\sigma_1^2 = 1$
$H_1: \sigma_2^2/\sigma_1^2 \ne 1$
$P\left(\mathrm{Finv}(0.95, 39, 34) < \frac{\hat{S}_1^2\,\sigma_2^2}{\hat{S}_2^2\,\sigma_1^2} < \mathrm{Finv}(0.05, 39, 34)\right) = 1 - 0.10$
Non-rejection region: [0.579, 1.749]
$8^2/7^2 = 1.306$
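66 Testing if Variances the Same
F Distribution
$H_0: \sigma_1^2/\sigma_2^2 = 1$
$H_1: \sigma_1^2/\sigma_2^2 \ne 1$
Two tailed
$\hat{S}_1^2/\hat{S}_2^2 = 1.306$
[Figure: F density with 0.05 in each tail]
Finv(0.95,39,34)=0.579
Finv(0.05,39,34)=1.749
$0.579 < 1.306 < 1.749$: fail to reject $H_0$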
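67 Testing if Variances the Same
F Distribution
$H_0: \sigma_1^2/\sigma_2^2 = 1$
$H_1: \sigma_1^2/\sigma_2^2 > 1$
One tailed
$\hat{S}_1^2/\hat{S}_2^2 = 1.306$
[Figure: F density with rejection area 0.10 in the upper tail]
Finv(0.10,39,34)=1.544
$1.306 < 1.544$: fail to reject $H_0$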
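The F comparison for the two refineries reduces to a ratio of sample variances. In this sketch the critical values are copied from the slides (Finv with 39 and 34 degrees of freedom) rather than computed, and the names are illustrative.

```python
def variance_ratio(sd1, sd2):
    """F statistic under H0 of equal variances: ratio of sample variances."""
    return sd1**2 / sd2**2


# Critical values quoted on the slides for F(39, 34).
F_LOWER_TWO_TAILED = 0.579   # Finv(0.95, 39, 34)
F_UPPER_TWO_TAILED = 1.749   # Finv(0.05, 39, 34)
F_UPPER_ONE_TAILED = 1.544   # Finv(0.10, 39, 34)

f_stat = variance_ratio(sd1=8, sd2=7)

reject_two_tailed = not (F_LOWER_TWO_TAILED < f_stat < F_UPPER_TWO_TAILED)
reject_one_tailed = f_stat > F_UPPER_ONE_TAILED

print(f"F = {f_stat:.3f}")
print(f"two-tailed (alpha = 10%): reject H0 = {reject_two_tailed}")
print(f"one-tailed (alpha = 10%): reject H0 = {reject_one_tailed}")
```

In both versions the ratio of about 1.306 stays inside the non-rejection region, so equal variances are not rejected.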
68 Power of a test
Type II error: $\beta = P(\text{Fail to reject } H_0 \mid H_1 \text{ is true})$
Power $= 1 - \beta$
[Figure: sampling densities centred at 0 ($\mu = 4$) and at 2 ($\mu = 6$)]
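70 Power of a test
Researcher controls level of significance, $\alpha$
Increase $\alpha$: what happens to $\beta$?
71 Raise Type I ($\alpha$): What happens to Type II ($\beta$)?
$H_0: \mu = 4$ not true, $\mu = 6$ true
$\bar{X} - 4$ has mean 2, not mean 0
[Figure: sampling densities centred at 0 ($\mu = 4$) and at 2 ($\mu = 6$), with $\beta$ marked]
72 Higher $\alpha$: What happens to Type II?
[Figure: sampling densities centred at 0 ($\mu = 4$) and at 2 ($\mu = 6$), with $\beta$ marked]
Trade-off: raising $\alpha$ reduces $\beta$; reducing $\alpha$ increases $\beta$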
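73 Operating Characteristic Curve
[Figure: densities under $H_0$ ($\mu = \mu_0$) and $H_1$ ($\mu = \mu_1$) on an axis from -10 to 10, with $\beta$ and $Z_\beta$ marked]
Can graph $\beta$ against $\mu$
called operating characteristic curve
useful in experimental design
74 Operating Characteristic Curve
[Figure: two panels comparing $H_0$ ($\mu = \mu_0$) with $H_1$ at $\mu = \mu_1$ and at $\mu = \mu_2$, each with $\beta$ and $Z_\beta$ marked]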
75 Fitting a probability distribution Is electricity demand a log-normal distribution Observed Mean: 18.42
Observed Variance 43 Observations : 20
9.8261
20.8787
35.6834
13.1139
15.9879
13.2253
20.2954
18.1785
24.3539
16.4685
30.2449
14.182
20.275
17.243
12.8461
9.2554
23.3099
17.2652
21.9764
13.9045
76 Fitting a probability distribution
Does electricity demand follow a normal distribution?
9.8261 20.8787 35.6834 13.1139 15.9879
13.2253 20.2954 18.1785 24.3539 16.4685
30.2449 14.182 20.275 17.243 12.8461
9.2554 23.3099 17.2652 21.9764 13.9045
Observed Mean: 18.42
Observed Variance: 43
Observations: 20
77 You can test your model graphically:
1. Order observations from smallest $Y_1$ to largest $Y_n$
2. Compute cumulative frequency distribution
3. Plot ordered observations versus $P_i$ on special probability sheet
4. If straight line within critical range, can’t reject normal
78 You can test your model graphically:
Y_i      P_i
9.26     0.05
9.83     0.10
12.85    0.15
13.11    0.20
13.23    0.25
13.90    0.30
14.18    0.35
15.99    0.40
16.47    0.45
17.24    0.50
17.27    0.55
18.18    0.60
20.28    0.65
20.30    0.70
20.88    0.75
21.98    0.80
23.31    0.85
24.35    0.90
30.24    0.95
35.68    1.00
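The first two steps of the graphical check, ordering the observations and attaching cumulative frequencies, can be sketched as below. The code uses $P_i = i/n$, which matches the values listed on the slide; the variable names are illustrative.

```python
# Electricity demand observations from the slides.
demand = [
    9.8261, 20.8787, 35.6834, 13.1139, 15.9879,
    13.2253, 20.2954, 18.1785, 24.3539, 16.4685,
    30.2449, 14.182, 20.275, 17.243, 12.8461,
    9.2554, 23.3099, 17.2652, 21.9764, 13.9045,
]

# Step 1: order the observations from smallest to largest.
ordered = sorted(demand)
n = len(ordered)

# Step 2: cumulative relative frequency P_i = i / n for the i-th smallest value.
plot_points = [(y, (i + 1) / n) for i, y in enumerate(ordered)]

for y, p in plot_points:
    print(f"{y:6.2f}  {p:.2f}")
```

The resulting pairs are the points that would be plotted on normal probability paper in step 3.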
79 Or use the Graph/Probability Plot … Option in Minitab
80 Statistical test of distribution
$H_0: X \sim N(\mu, \sigma^2)$
$H_1$: X does not follow $N(\mu, \sigma^2)$
Order data
Estimate sample mean & variance
Observed Mean: 18.42
Observed Variance: 43
Observations: 20
$\chi^2$ statistic: goodness of fit of model
9.26 9.83 12.85 13.11 13.23 13.90 14.18 15.99 16.47 17.24
17.27 18.18 20.28 20.30 20.88 21.98 23.31 24.35 30.24 35.68
81 Statistical test of distribution
Again order sample
9.26 9.83 12.85 13.11 13.23 13.90 14.18 15.99 16.47 17.24
17.27 18.18 20.28 20.30 20.88 21.98 23.31 24.35 30.24 35.68
Create m = 5 categories: <10, 10-15, 15-20, 20-25, >25
82 Statistical test of distribution
Actual frequencies
Category   Actual
<10        2
10-15      5
15-20      5
20-25      6
>25        2
83 Statistical test of distribution
Frequencies
Category   Actual   Expected
<10        2        Normdist(10,18.42,6.56,1)*20
10-15      5        (Normdist(15,18.42,6.56,1) - Normdist(10,18.42,6.56,1))*20
15-20      5        (Normdist(20,18.42,6.56,1) - Normdist(15,18.42,6.56,1))*20
20-25      6
>25        2
84 Statistical test of distribution
Frequencies
Category   Observed   Expected
<10        2          1.99
10-15      5          4.03
15-20      5          5.88
20-25      6          4.94
>25        2          3.16
85 Goodness of Fit Test
Is based on:
$\chi^2 = \sum_{i=1}^{m} \frac{(o_i - e_i)^2}{e_i}$
df = m – k – 1
k = number of parameters replaced by estimates
$o_i$: observed frequency, $e_i$: expected frequency
86 Statistical test of distribution
Frequencies
Category   o_i   e_i
<10        2     1.99
10-15      5     4.03
15-20      5     5.88
20-25      6     4.94
>25        2     3.16
$\chi^2 = \sum_{i=1}^{m} \frac{(o_i - e_i)^2}{e_i} = \frac{(2-1.99)^2}{1.99} + \frac{(5-4.03)^2}{4.03} + \frac{(5-5.88)^2}{5.88} + \frac{(6-4.94)^2}{4.94} + \frac{(2-3.16)^2}{3.16} = 1.02$
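87 Statistical test of distribution
$H_0: X \sim N(\mu, \sigma^2)$
$H_1$: X does not follow $N(\mu, \sigma^2)$
df = m – k – 1 = 5 – 2 – 1 = 2
$\chi^2 = \sum \frac{(o_i - e_i)^2}{e_i} = 1.02$
CHIINV(0.05,2)=5.99
$1.02 < 5.99$: cannot reject that demand is normally distributed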
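The whole goodness-of-fit calculation can be reproduced from the raw data. The sketch below uses the fitted mean 18.42 and standard deviation 6.56 from the slides, bins the data into the five categories, and compares the statistic with the slide's critical value 5.99; variable and function names are illustrative, and expected counts are left unrounded, so the result may differ slightly from a hand calculation with rounded values.

```python
from statistics import NormalDist

demand = [
    9.8261, 20.8787, 35.6834, 13.1139, 15.9879,
    13.2253, 20.2954, 18.1785, 24.3539, 16.4685,
    30.2449, 14.182, 20.275, 17.243, 12.8461,
    9.2554, 23.3099, 17.2652, 21.9764, 13.9045,
]
edges = [10, 15, 20, 25]            # boundaries of the m = 5 categories
fitted = NormalDist(mu=18.42, sigma=6.56)


def bin_index(x):
    """Index of the category containing x (0 for <10, 4 for >25)."""
    return sum(x >= edge for edge in edges)


# Observed counts per category.
observed = [0] * (len(edges) + 1)
for x in demand:
    observed[bin_index(x)] += 1

# Expected counts: n times the fitted normal probability of each category.
cdf = [0.0] + [fitted.cdf(edge) for edge in edges] + [1.0]
expected = [len(demand) * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 2 - 1          # m - k - 1 with k = 2 estimated parameters
CHI2_CRIT = 5.99                    # CHIINV(0.05, 2), from the slide

print("observed:", observed)
print("expected:", [round(e, 2) for e in expected])
print(f"chi-square = {chi2:.2f}, df = {df}, reject normality: {chi2 > CHI2_CRIT}")
```

The statistic is far below the critical value, so the normal model is not rejected for this sample.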
88 Outline of Topics (Continued )
Estimation Theory/Hypotheses Testing Relationship
Operating Characteristic Curves and Power of a Test
Fitting Theoretical Distributions to Sample Frequency Distributions
Chi-Square Test for Goodness of Fit
89 Sum Up Chapter 7
Hypothesis testing
null vs alternative
null with equal sign
null often status quo
alternative often what want to prove
type I error vs type II error
type I called level of significance
P-values
$1 - \beta$ = power of test = probability of rejecting a false null
one tailed vs two tailed
90 Sum Up Chapter 7
Hypothesis tests
mean – Normal test: population normal, known variance, large sample: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
mean – t test: population normal, unknown variance, small sample: $\frac{\bar{X} - \mu}{\hat{s}/\sqrt{n}}$
91 Sum Up Chapter 7
Normal and t
92 Sum Up Chapter 7
Hypothesis tests
difference of means – Normal test: population normal, known variance: $\frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
Hypothesis tests
variance: $\chi^2_{n-1} = \frac{(n-1)\hat{S}^2}{\sigma^2}$
93 Sum Up Chapter 7
Are variances equal: $\frac{\hat{S}_1^2\,\sigma_2^2}{\hat{S}_2^2\,\sigma_1^2}$ is $F_{n_1-1,\,n_2-1}$
94 Sum Up Chapter 7
$\chi^2$ and F
95 Sum Up Chapter 7
How is random variable distributed
normal – graph cumulative frequency distribution on special paper: straight line
Statistical: $\chi^2_{k-m-1} = \sum \frac{(o_i - e_i)^2}{e_i}$
k = categories, m = estimated parameters
always 1 tailed
End of Chapter 7!
96