Transcript Slide 1

1 Schaum’s Outline

Probability and Statistics

Chapter 7 HYPOTHESIS TESTING presented by Professor Carol Dahl Examples by Alfred Aird Kira Jeffery Catherine Keske Hermann Logsend Yris Olaya

2 Outline of Topics

Topics Covered

Statistical Decisions

Statistical Hypotheses

Null Hypotheses

Tests of Hypotheses

Type I and Type II Errors

Level of Significance

Tests Involving the Normal Distribution

One- and Two-Tailed Tests

P-Value

3 Outline of Topics (Continued )

Special Tests of Significance

Large Samples

Small Samples

Estimation Theory/Hypotheses Testing Relationship

Operating Characteristic Curves and Power of a Test

Fitting Theoretical Distributions to Sample Frequency Distributions

Chi-Square Test for Goodness of Fit

4 “The Truth Is Out There” The Importance of Hypothesis Testing

Hypothesis testing

helps evaluate models based upon real data

enables one to build a statistical model

enhances your credibility as analyst, economist

5 Statistical Decisions Innocent until proven guilty principle Want to prove someone is guilty Assume the opposite or status quo - innocent H o : Innocent H 1 : Guilty Take subsample of possible information If evidence not consistent with innocent - reject Person not pronounced innocent but not guilty

6 Statistical Decisions Status quo innocence = null hypothesis Evidence = sample result Reasonable doubt = confidence level

7 Statistical Decisions Eg. Tantalum ore deposit feasible if quality > 0.0600g/kg with 99% confidence 100 samples collected from large deposit at random.

Sample distribution: mean of 0.071 g/kg, variance 0.0025 (standard deviation 0.05 g/kg).

8 Statistical Decisions Should the deposit be developed?

Evidence = 0.071 (sample mean)

Reasonable doubt = 99%

Status quo = do not develop the deposit

$H_0: \mu < 0.0600$

$H_1: \mu > 0.0600$

9 Statistical Hypothesis General Principles Inferences about population using sample statistic Prove A is true by assuming it isn’t true Results of experiment (sample) compared with model If results of model unlikely, reject model If results explained by model, do not reject

10 Statistical Hypothesis

Event A fairly likely, model would be retained

Event B unlikely, model would be rejected

[Figure: normal density over z centered at 0, with A near the center and B in the tail area]

11 Statistical Decisions Should the deposit be developed?

Evidence = 0.071 (sample mean)

Reasonable doubt = 99%

Status quo = do not develop the deposit

$H_0: \mu = 0.0600$

$H_1: \mu > 0.0600$

How likely is $H_0$?

12 Need Sampling Statistic Need statistic with population parameter estimate for population parameter its distribution

13 Need Sampling Statistic

Population Normal - Two Choices (Small Sample < 30)

Known Variance: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

Unknown Variance: $\dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} \sim t_{n-1}$

14 Need Sampling Statistic

Population Not-Normal, Large Sample

Known Variance: $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$

Unknown Variance: $\dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} \sim N(0,1)$

Doesn’t matter if variance is known or not

If population is finite and sampling is without replacement, need adjustment

15 Normal Distribution X~N(0,1)

[Figure: standard normal density with $\mu = 0$; area within SD=1 (68%), SD=2 (95%), SD=3 (99.7%)]

16 Statistical Decisions Should the deposit be developed?

Evidence: 0.071 g/kg (sample mean), 0.0025 (sample variance), 0.05 g/kg (sample standard deviation)

Reasonable doubt = 99%

Status quo = do not develop the deposit

$H_0: \mu = 0.0600$

$H_1: \mu > 0.0600$

One tailed test

How likely is $H_0$?

17 Hypothesis test

Evidence: 0.071 (sample mean), 0.05 g/kg (sample standard deviation)

Reasonable doubt = 99%

Status quo = do not develop the deposit

$H_0: \mu = 0.0600$

$H_1: \mu > 0.0600$

$P\left(\dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} < Z_c\right) = 0.99 = 1 - \alpha$

18 Statistical Hypothesis

Eg. $Z = \dfrac{0.071 - 0.0600}{0.05/\sqrt{100}} = 2.2$

$2.2 < Z_c = 2.33$

Conclusion: Don’t reject $H_0$, don’t develop deposit
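The tantalum calculation can be checked with a few lines of Python. This is only a sketch using the standard library; the function name `one_tailed_z_test` is a hypothetical helper, not something from the slides, and the inputs are the slide's figures (sample mean 0.071, hypothesized mean 0.0600, standard deviation 0.05, n = 100, 99% confidence).

```python
from math import sqrt
from statistics import NormalDist

def one_tailed_z_test(sample_mean, mu0, sd, n, confidence):
    """Upper-tailed z test: returns (z, critical value, reject H0?)."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    z_crit = NormalDist().inv_cdf(confidence)  # 0.99 gives about 2.33
    return z, z_crit, z > z_crit

# Slide values for the tantalum deposit
z, z_crit, reject = one_tailed_z_test(0.071, 0.0600, 0.05, 100, 0.99)
print(f"z = {z:.2f}, z_c = {z_crit:.2f}, reject H0: {reject}")
```

Because z falls short of the critical value, the status quo (do not develop) is retained.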

19 Null Hypothesis Hypotheses cannot be proven reject or fail to reject based on likelihood of event occurring null hypothesis is not accepted

20 Test of Hypotheses Maple Creek Mine and Potaro Diamond field in Guyana

Mine potential for producing large diamonds

Experts want to know true mean carat size produced True mean said to be 4 carats Experts want to know if true with 95% confidence

Random sample taken Sample mean found to be 3.6 carats

Based on sample, is 4 carats true mean for mine?

21 Tests of Hypotheses Tests referred to as: “Tests of Hypotheses” “Tests of Significance” “Rules of Decision”

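22 Types of Errors

$H_0: \mu = 4$ (Suppose this is true)

$H_1: \mu \ne 4$

Two tailed test

Choose $\alpha = 0.05$

Sample n = 100 (assume X is normal), $\sigma = 1$

$P\left(-1.96 < \dfrac{\bar{X} - 4}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 = 1 - \alpha$

[Figure: normal density with area $\alpha/2$ in each tail]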

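23 Type I error ($\alpha$) – reject true $H_0$

$H_0: \mu = 4$ suppose true

$P\left(-1.96 < \dfrac{\bar{X} - 4}{\sigma/\sqrt{n}} < 1.96\right) = 0.95 = 1 - \alpha$

[Figure: normal density with area $\alpha/2$ in each tail]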

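24 Type II Error (ß) - Fail to Reject False $H_0$

$H_0: \mu = 4$ not true, $\mu = 6$ true

X − µ not mean 0 but mean 2

[Figure: two sampling distributions, one centered at 0 (μ = 4) and one centered at 2 (μ = 6); ß is the area of the μ = 6 curve that falls in the non-rejection region]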

25 Lower Type I: What happens to Type II?

Ho: µ = 4 not true, µ = 6 true

[Figure: sampling distributions centered at 0 (μ = 4) and 2 (μ = 6), with ß shaded]

26 Higher µ: What happens to Type II?

Ho: µ = 4 not true, µ = 7 true

X − µ not mean 0 but mean 3

[Figure: sampling distributions centered at 0 (μ = 4) and 3 (μ = 7), with ß shaded]
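The questions on these slides can be explored numerically. The sketch below assumes a two-tailed z test and a standard error of 1, a hypothetical value chosen so the curves sit 2 and 3 units apart as in the figures; `type_ii_error` is an illustrative helper name, not part of the slides.

```python
from statistics import NormalDist

def type_ii_error(mu0, mu_true, se, alpha):
    """P(fail to reject H0: mu = mu0) when the true mean is mu_true."""
    z = NormalDist()
    z_c = z.inv_cdf(1 - alpha / 2)       # two-tailed critical value
    shift = (mu_true - mu0) / se         # mean of the test statistic under mu_true
    # Probability the statistic lands inside (-z_c, z_c)
    return z.cdf(z_c - shift) - z.cdf(-z_c - shift)

beta_6 = type_ii_error(4, 6, 1.0, 0.05)            # true mean 6
beta_6_low_alpha = type_ii_error(4, 6, 1.0, 0.01)  # lower Type I risk
beta_7 = type_ii_error(4, 7, 1.0, 0.05)            # true mean further away
print(round(beta_6, 3), round(beta_6_low_alpha, 3), round(beta_7, 3))
```

Under these assumptions, lowering α raises ß, while a true mean further from the null lowers ß.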

27 Type I and Type II Errors

Two types of errors can occur in hypothesis testing

To reduce errors, increase sample size when possible

$P(\text{Type I Error}) = \alpha$

$P(\text{Type II Error}) = \beta$

Reject H o, H o True: Type I Error

Reject H o, H o False: Correct Decision

Do Not Reject H o, H o True: Correct Decision

Do Not Reject H o, H o False: Type II Error

28 To Reduce Errors

Increase sample size when possible

[Figure: Mean Sampling Distributions, Different Sample Sizes; curves for the population and for n = 5, 10, 20]

29 Error Examples Type I Error – rejecting a true null hypothesis Convicting an innocent person Rejecting true mean carat size is 4 when it is Type II Error – not rejecting a false null hypothesis Setting a guilty person free Not rejecting mean carat size is 4 when it’s not

30 Level of Significance ($\alpha$)

α = max probability we’re willing to risk Type I Error

= tail area of probability density function

If Type I Error’s “cost” high, choose α low

α defined before hypothesis test conducted

α typically defined as 0.10, 0.05 or 0.01

α = 0.10 for 90% confidence of correct test decision α = 0.05 for 95% confidence of correct test decision α = 0.01 for 99% confidence of correct test decision

31 Diamond Hypothesis Test Example

$H_0: \mu = 4$

$H_1: \mu \ne 4$

Choose α = 0.01 for 99% confidence

Sample n = 100, $\sigma = 1$

$\bar{X} = 3.6$, $-Z_c = -2.575$, $Z_c = 2.575$

[Figure: normal density with rejection regions of area .005 below -2.575 and above 2.575]

32 Example Continued

$z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

$z = -2$ is not below $-z_{\alpha/2} = -2.575$

Observed not “significantly” different from expected Fail to reject null hypothesis

At the 1% significance level, we cannot reject that the true mean is 4 carats

33 Tests Involving the t Distribution

Billy Ray has inherited large, 25,000 acre homestead

Located on outskirts of Murfreesboro, Arkansas, near: Crater of Diamonds State Park, Prairie Creek Volcanic Pipe

Land now used for agriculture and recreation

No official mining has taken place

34 Case Study in Statistical Analysis Billy Ray’s Inheritance Billy Ray must now decide upon land usage Options: Exploration for diamonds Conservation Land biodiversity and recreation Agriculture and recreation Land development

35 Consider Costs and Benefits of Mining Cost and Benefits of Mining Opportunity cost Excessive diamond exploration damages land’s value Exploration and Mining Costs Benefit Value of mineral produced

36 Consider Costs and Benefits of Mining

Sample for geologic indicators for diamonds: kimberlite or lamproite

larger sample more likely to represent “true population”

larger sample will cost more

37 How to decide one tailed or two tailed

One tailed test

Do we change status quo only if it’s bigger than null?

Do we change status quo only if it’s smaller than null?

Two tailed test

Change status quo if it’s bigger or if it’s smaller

38 Tests of Mean

Normal or t

population normal, known variance, small sample: Normal

population normal, unknown variance, small sample: t

large sample: Normal

39 Difference Normal and t

t “fatter” tail than normal bell-curve

[Figure: normal and t densities plotted from -5 to 5]

40 Hypothesis and Sample

Need at least 30 g/m³ to mine

Null hypothesis $H_0: \mu = 30$

Alternative hypothesis $H_1$: ?

Sample data: n = 16 (holes drilled), X close to normal

$\bar{X} = 31$ g/m³, variance ($\hat{s}^2/n$) = 0.286

41 Normal or t? One tailed

Null hypothesis $H_0: \mu = 30$

Alternative hypothesis $H_1: \mu > 30$

Sample data: n = 16 (holes drilled)

$\bar{X} = 31$ g/m³, variance $\hat{s}^2 = 4.29$

standard deviation $\hat{s} = 2.07$

small sample, estimated variance, X close to normal

not exactly t but close if X close to normal

$t_{n-1} = \dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} \sim t_{16-1}$

42 Tests Involving the t Distribution

[Figure: $t_{15}$ density centered at 0, with a 5% rejection region to the right of $t_c = 1.75$]

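43 Tests Involving the t Distribution

$t_{n-1} = \dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \dfrac{31 - 30}{2.07/\sqrt{16}} = 1.93$

[Figure: $t_{15}$ density centered at 0, with a 5% rejection region to the right of $t_c = 1.75$]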
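As a quick check of the drill-hole example, the statistic can be computed directly. This is a minimal sketch: `t_statistic` is a hypothetical helper name, and the critical value 1.75 is simply copied from the slide rather than computed.

```python
from math import sqrt

def t_statistic(sample_mean, mu0, s_hat, n):
    """One-sample t statistic (sample_mean - mu0) / (s_hat / sqrt(n))."""
    return (sample_mean - mu0) / (s_hat / sqrt(n))

t_obs = t_statistic(31, 30, 2.07, 16)
t_crit = 1.75            # 5% one-tailed critical value for 15 df, from the slide
reject = t_obs > t_crit  # upper-tailed test, H1: mu > 30
print(f"t = {t_obs:.2f}, reject H0: {reject}")
```

The observed statistic lies in the rejection region, so the null of 30 g/m³ is rejected in favor of a higher mean.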

44 Wells produce oil

X = API Gravity, approximately normal with mean 37

periodically test to see if the mean has changed

too heavy or too light: revise contract

H o : ? H 1 : ?

Sample of 9 wells, $\bar{X} = 38$, $\hat{s} = 2$

What is test statistic?

Normal or t?

$t_{n-1} = \dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}}$

45 Two tailed t test on mean

[Figure: t density centered at 0, with rejection regions of area $\alpha/2$ below $-t_c$ and above $t_c$]

46 Two tailed t test on mean

$H_0: \mu = 37$

$H_1: \mu \ne 37$

Sample of 9 wells, $\bar{X} = 38$, $\hat{s} = 2$, $\alpha = 10\%$

$t_{n-1} = \dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \dfrac{38 - 37}{2/\sqrt{9}} = 1.5$

47 P-values - one tailed test

Level of significance for a sample statistic under null

Smallest $\alpha$ for which statistic would reject null

$t_{16-1} = \dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \dfrac{31 - 30}{2.07/\sqrt{16}} = 1.93$

P = 0.04

=TDIST(1.93,15,1)

48 P-value two tailed test

$H_0: \mu = 37$

$H_1: \mu \ne 37$

Sample of 9 wells, $\bar{X} = 38$, $\hat{s} = 2$, $\alpha = 10\%$

$t_{n-1} = \dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}} = \dfrac{38 - 37}{2/\sqrt{9}} = 1.5$

=TDIST(1.5,8,2) = 0.172

49 Formal Representation of p-Values

p-Value < $\alpha$ => Reject $H_0$

p-Value > $\alpha$ => Fail to reject $H_0$
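The spreadsheet p-values can be reproduced without a statistics package. The sketch below is one possible approach, not the slides' method: it integrates the Student t density numerically with Simpson's rule, and `t_pdf` and `t_upper_tail` are hypothetical helper names.

```python
from math import gamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

p_one_tailed = t_upper_tail(1.93, 15)    # drill-hole example, one tail
p_two_tailed = 2 * t_upper_tail(1.5, 8)  # oil-well example, two tails
alpha = 0.10
print(round(p_one_tailed, 3), round(p_two_tailed, 3), p_two_tailed < alpha)
```

The one-tailed value rounds to the slide's 0.04, and the two-tailed value matches 0.172, which exceeds α = 10%, so the oil-well null is not rejected.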

50 More tests Survey: - Ranking refinery managers Daily refinery production Sample two refineries of 40 and 35 1000 b/cd First refinery: mean = 74, stand. dev. = 8 Second refinery: mean = 78, stand. dev. = 7 Questions: difference of means?

variances?

differences of variances Again Statistics Can Help!!!!

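51 Differences of Means

$H_0: \mu_1 - \mu_2 = 0$

$H_1: \mu_1 - \mu_2 \ne 0$

$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

$X_1$ and $X_2$ normal, known variance; or large sample, known variance

$\alpha = 10\%$

[Figure: standard normal density with 5% rejection regions below $-Z_c$ and above $Z_c$]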

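52 Differences of Means

$H_0: \mu_1 - \mu_2 = 0$

$H_1: \mu_1 - \mu_2 \ne 0$

$n_1 = 40$, $\bar{X}_1 = 74$, $\sigma_1 = 8$

$n_2 = 35$, $\bar{X}_2 = 78$, $\sigma_2 = 7$

$Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \dfrac{74 - 78}{\sqrt{\dfrac{8^2}{40} + \dfrac{7^2}{35}}} = \dfrac{-4}{\sqrt{3}} \approx -2.31$

[Figure: standard normal density with 5% rejection regions below $-Z_c = -1.645$ and above $Z_c = 1.645$]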
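The refinery comparison is easy to evaluate in code. This is a sketch under the slide's stated inputs (means 74 and 78, standard deviations 8 and 7, sample sizes 40 and 35); `two_sample_z` is a hypothetical helper name. With these inputs the formula evaluates to roughly -2.31.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for H0: mu1 - mu2 = 0 with known (or large-sample) variances."""
    return (mean1 - mean2) / sqrt(sd1**2 / n1 + sd2**2 / n2)

z = two_sample_z(74, 8, 40, 78, 7, 35)
z_c = NormalDist().inv_cdf(0.95)  # alpha = 10%, so 5% in each tail
reject = abs(z) > z_c
print(f"z = {z:.3f}, z_c = {z_c:.3f}, reject H0: {reject}")
```

Since |z| exceeds 1.645, the two-tailed test at α = 10% rejects equal mean production.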

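53 Difference of Means

X normal

Unknown but equal variances

Do above test with $t_{n_1 + n_2 - 2}$

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)\hat{s}_1^2 + (n_2 - 1)\hat{s}_2^2}{n_1 + n_2 - 2} \cdot \dfrac{n_1 + n_2}{n_1 n_2}}}$

[Figure: t density with rejection region of area $\alpha/2$ in the tail]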

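54 Variance test ($\chi^2$ distribution)

Two tailed

$\chi^2_{n-1} = \dfrac{(n - 1)\hat{S}^2}{\sigma^2}$

[Figure: $\chi^2$ density with rejection regions of area $\alpha/2$ in each tail]

55 Variance test ($\chi^2$ distribution)

One tailed

$\chi^2_{n-1} = \dfrac{(n - 1)\hat{S}^2}{\sigma^2}$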

56 Hypothesis Test on Variance

Suppose best practice in refinery $\sigma^2 = 6^2$

Does refinery 2 have different variability than best practice?

$H_0: \sigma^2 = 6^2$

$H_1: \sigma^2 \ne 6^2$

Example: 2nd refinery, n – 1 = 34, Standard deviation = 7

$P\left(\chi^2_{c_1} < \dfrac{(n - 1)\hat{S}^2}{\sigma^2} < \chi^2_{c_2}\right) = 1 - \alpha$

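57 Hypothesis Test on Variance

$H_0: \sigma^2 = 6^2$

$H_1: \sigma^2 \ne 6^2$

Example: 2nd refinery, n – 1 = 34, Standard deviation = 7

$\alpha = 10\%$

$\dfrac{(n - 1)\hat{S}^2}{\sigma^2} = \dfrac{(35 - 1)\,7^2}{6^2} = 46.278$

$P\left(\chi^2_{c_1} < \dfrac{(n - 1)\hat{S}^2}{\sigma^2} < \chi^2_{c_2}\right) = 1 - \alpha$

[Figure: $\chi^2$ density with rejection regions of area $\alpha/2$ in each tail]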

58 Hypothesis Test on Variance

Suppose best practice in refinery

$H_0: \sigma^2 = 6^2$

$H_1: \sigma^2 \ne 6^2$

Example: 2nd refinery, n – 1 = 34, Standard deviation = 7

chiinv(0.95, 34) = 21.664, chiinv(0.05, 34) = 48.602

[Figure: $\chi^2$ density with rejection regions of area $\alpha/2$ in each tail]

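59 Variance test ($\chi^2$ distribution)

Two tailed

$\chi^2 = \dfrac{(n - 1)\hat{S}^2}{\sigma^2} = 46.278$

[Figure: $\chi^2$ density with 0.05 rejection regions below 21.664 and above 48.602; the observed 46.278 lies between them]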

60 Variance test ($\chi^2$ distribution)

More variance than best practice

$H_0: \sigma^2 = 6^2$

$H_1: \sigma^2 > 6^2$

One tailed

[Figure: $\chi^2$ density with a 0.10 rejection region in the upper tail]

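61 Variance test ($\chi^2$ distribution)

More variance than best practice

$H_0: \sigma^2 = 6^2$

$H_1: \sigma^2 > 6^2$

One tailed

$\chi^2 = \dfrac{(n - 1)\hat{S}^2}{\sigma^2} = 46.278$

[Figure: $\chi^2$ density with a 0.10 rejection region in the upper tail]

chiinv(0.10,34)=44.903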
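The variance tests can be replayed with a short sketch. It assumes the best-practice standard deviation is 6, as in the slide's arithmetic (7²/6²), and it copies the chiinv critical values from the slides instead of computing them; `variance_chi2` is a hypothetical helper name.

```python
def variance_chi2(n, sample_sd, sigma0):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for a test on a variance."""
    return (n - 1) * sample_sd**2 / sigma0**2

chi2_obs = variance_chi2(35, 7, 6)

# Critical values quoted on the slides for 34 degrees of freedom
lower, upper = 21.664, 48.602  # two-tailed, 5% in each tail
upper_one_tailed = 44.903      # one-tailed, 10% in the upper tail

reject_two_tailed = chi2_obs < lower or chi2_obs > upper
reject_one_tailed = chi2_obs > upper_one_tailed
print(f"chi2 = {chi2_obs:.3f}", reject_two_tailed, reject_one_tailed)
```

The same statistic fails to reject in the two-tailed test but rejects in the one-tailed test, because the one-tailed test places all 10% in the upper tail.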

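62 Testing if Variances the Same F Distribution

2 samples of size $n_1$ and $n_2$

sample variances: $\hat{s}_1^2$, $\hat{s}_2^2$

$H_0: \sigma_1^2 = \sigma_2^2 \Rightarrow H_0: \sigma_2^2/\sigma_1^2 = 1$

$H_1: \sigma_1^2 \ne \sigma_2^2 \Rightarrow H_1: \sigma_2^2/\sigma_1^2 \ne 1$

$F = \dfrac{\hat{S}_1^2/\sigma_1^2}{\hat{S}_2^2/\sigma_2^2} = \dfrac{\hat{S}_1^2\,\sigma_2^2}{\hat{S}_2^2\,\sigma_1^2}$ is $F_{n_1 - 1,\, n_2 - 1}$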

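63 Testing if Variances the Same F Distribution

$H_0: \sigma_1^2/\sigma_2^2 = 1$

$H_1: \sigma_1^2/\sigma_2^2 \ne 1$

Two tailed

Test statistic: $\hat{S}_1^2/\hat{S}_2^2$

[Figure: F density with rejection regions of area $\alpha/2$ in each tail]

64 Testing if Variances the Same F Distribution

$H_0: \sigma_2^2/\sigma_1^2 = 1$

$H_1: \sigma_2^2/\sigma_1^2 > 1$

One tailed

Test statistic: $\hat{S}_1^2/\hat{S}_2^2$

$\alpha = 10\%$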

65 Example Testing if Variances the Same

2 samples of size $n_1 = 40$ and $n_2 = 35$

sample variances: $\hat{s}_1^2 = 8^2$, $\hat{s}_2^2 = 7^2$

$H_0: \sigma_2^2/\sigma_1^2 = 1$

$H_1: \sigma_2^2/\sigma_1^2 \ne 1$

$P\left(\text{Finv}(0.95, 39, 34) < \dfrac{\hat{S}_1^2\,\sigma_2^2}{\hat{S}_2^2\,\sigma_1^2} < \text{Finv}(0.05, 39, 34)\right) = 1 - 0.10$

Non-rejection interval: [0.579, 1.749]

$8^2/7^2 = 1.306$

66 Testing if Variances the Same F Distribution

$H_0: \sigma_1^2/\sigma_2^2 = 1$

$H_1: \sigma_1^2/\sigma_2^2 \ne 1$

Two tailed

$\hat{S}_1^2/\hat{S}_2^2 = 1.306$

[Figure: F density with 0.05 rejection regions in each tail]

Finv(0.95,39,34)=0.579

Finv(0.05,39,34)=1.749
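The two-tailed comparison reduces to checking whether the variance ratio falls inside the interval bounded by the two Finv values. A minimal sketch, with the critical values copied from the slides and `variance_ratio` as a hypothetical helper name:

```python
def variance_ratio(sd1, sd2):
    """Ratio of sample variances, the F statistic under H0: equal variances."""
    return sd1**2 / sd2**2

f_obs = variance_ratio(8, 7)

# Finv(0.95, 39, 34) and Finv(0.05, 39, 34) as quoted on the slides
lower, upper = 0.579, 1.749
reject = f_obs < lower or f_obs > upper
print(f"F = {f_obs:.3f}, reject H0: {reject}")
```

The ratio lies inside the interval, so equal variances cannot be rejected at α = 10%.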

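67 Testing if Variances the Same F Distribution

$H_0: \sigma_2^2/\sigma_1^2 = 1$

$H_1: \sigma_2^2/\sigma_1^2 > 1$

One tailed

$\hat{S}_1^2/\hat{S}_2^2 = 1.306$

[Figure: F density, one-tailed rejection region; tail area labeled 0.05]

Finv(0.10,39,34)=1.544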

68 Power of a test

Type II error: $\beta$ = P(Fail to reject $H_0$ | $H_1$ is true)

Power = $1 - \beta$

[Figure: sampling distributions centered at 0 (μ = 4) and 2 (μ = 6)]


70 Power of a test

Researcher controls level of significance, $\alpha$

Increase $\alpha$: what happens to ß?

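71 Raise Type I ($\alpha$): What happens to Type II (ß)?

Ho: µ = 4 not true, µ = 6 true

X − µ not mean 0 but mean 2

[Figure: sampling distributions centered at 0 (μ = 4) and 2 (μ = 6), with ß shaded]

72 Higher $\alpha$: What happens to Type II?

Increase ß, reduce $\alpha$

[Figure: sampling distributions centered at 0 (μ = 4) and 2 (μ = 6), with ß shaded]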

73 Operating Characteristic Curve

Can graph $\beta$ against $\mu$

called operating characteristic curve

useful in experimental design

[Figure: densities under $H_0$ (μ = μ 0) and $H_1$ (μ = μ 1) on an axis from -10 to 10, with ß and $Z_\beta$ marked]

74 Operating Characteristic Curve

[Figure: two panels showing $H_0$ (μ = μ 0) against $H_1$ at μ = μ 1 and at μ = μ 2, each with ß and $Z_\beta$ marked]

75 Fitting a probability distribution

Is electricity demand a log-normal distribution

Observed Mean: 18.42

Observed Variance: 43

Observations: 20

9.8261 20.8787 35.6834 13.1139 15.9879

13.2253 20.2954 18.1785 24.3539 16.4685

30.2449 14.182 20.275 17.243 12.8461

9.2554 23.3099 17.2652 21.9764 13.9045

76 Fitting a probability distribution

Does electricity demand follow a normal distribution?

Observed Mean: 18.42

Observed Variance: 43

Observations: 20

9.8261 20.8787 35.6834 13.1139 15.9879

13.2253 20.2954 18.1785 24.3539 16.4685

30.2449 14.182 20.275 17.243 12.8461

9.2554 23.3099 17.2652 21.9764 13.9045

77 You can test your model graphically:

1. Order observations from smallest $Y_1$ to largest $Y_n$

2. Compute cumulative frequency distribution $P_i$

3. Plot ordered observations versus $P_i$ on special probability sheet

4. If straight line within critical range, can’t reject normal

78 You can test your model graphically:

Ordered observation, cumulative frequency $P_i$

9.26 0.05
9.83 0.10
12.85 0.15
13.11 0.20
13.23 0.25
13.90 0.30
14.18 0.35
15.99 0.40
16.47 0.45
17.24 0.50
17.27 0.55
18.18 0.60
20.28 0.65
20.30 0.70
20.88 0.75
21.98 0.80
23.31 0.85
24.35 0.90
30.24 0.95
35.68 1.00
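The ordering and cumulative-frequency steps can be sketched in a few lines. The data are the 20 electricity-demand observations from the slides; the convention $P_i = i/n$ is inferred from the tabulated values (0.05, 0.10, ..., 1.00), and the variable names are illustrative.

```python
demand = [
    9.8261, 20.8787, 35.6834, 13.1139, 15.9879,
    13.2253, 20.2954, 18.1785, 24.3539, 16.4685,
    30.2449, 14.182, 20.275, 17.243, 12.8461,
    9.2554, 23.3099, 17.2652, 21.9764, 13.9045,
]

ordered = sorted(demand)  # step 1: smallest to largest
n = len(ordered)
# step 2: cumulative frequency P_i = i / n for the i-th ordered observation
table = [(y, (i + 1) / n) for i, y in enumerate(ordered)]

for y, p in table:
    print(f"{y:6.2f}  {p:.2f}")
```

Plotting the first column against the second on normal probability paper (step 3) is left to a plotting tool.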

79 Or use the Graph/Probability Plot … Option in Minitab

80 Statistical test of distribution

$H_0: X_e \sim N(\mu, \sigma^2)$

$H_1: X_e$ does not follow $N(\mu, \sigma^2)$

Order data

Estimate sample mean & variance

Observed Mean: 18.42

Observed Variance: 43

Observations: 20

$\chi^2$ statistic: goodness of fit of model

81 Statistical test of distribution

Again order sample

9.26 9.83 12.85 13.11 13.23 13.90 14.18 15.99 16.47 17.24

17.27 18.18 20.28 20.30 20.88 21.98 23.31 24.35 30.24 35.68

Create m = 5 categories: <10, 10-15, 15-20, 20-25, >25

82 Statistical test of distribution

9.26 9.83 12.85 13.11 13.23 13.90 14.18 15.99 16.47 17.24

17.27 18.18 20.28 20.30 20.88 21.98 23.31 24.35 30.24 35.68

Actual frequencies

<10: 2
10-15: 5
15-20: 5
20-25: 6
>25: 2

83 Statistical test of distribution

Frequencies: category, actual, expected

<10, 2, Normdist(10,18.42,6.56,1)*20
10-15, 5, (Normdist(15,18.42,6.56,1) - Normdist(10,18.42,6.56,1))*20
15-20, 5, (Normdist(20,18.42,6.56,1) - Normdist(15,18.42,6.56,1))*20
20-25, 6
>25, 2

84 Statistical test of distribution

Frequencies: category, observed, expected

<10, 2, 1.99
10-15, 5, 4.03
15-20, 5, 5.88
20-25, 6, 4.94
>25, 2, 3.16

85 Goodness of Fit Test

Is based on:

$\chi^2 = \sum_{i=1}^{m} \dfrac{(o_i - e_i)^2}{e_i}$

df = m – k – 1

k = number of parameters replaced by estimates

$o_i$: observed frequency, $e_i$: expected frequency

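86 Statistical test of distribution

Frequencies: category, $o_i$, $e_i$

<10, 2, 1.99
10-15, 5, 4.03
15-20, 5, 5.88
20-25, 6, 4.94
>25, 2, 3.16

$\chi^2 = \sum \dfrac{(o_i - e_i)^2}{e_i} = \dfrac{(2 - 1.99)^2}{1.99} + \dfrac{(5 - 4.03)^2}{4.03} + \dfrac{(5 - 5.88)^2}{5.88} + \dfrac{(6 - 4.94)^2}{4.94} + \dfrac{(2 - 3.16)^2}{3.16} \approx 1.02$

87 Statistical test of distribution

$H_0: X \sim N(\mu, \sigma^2)$

$H_1: X$ does not follow $N(\mu, \sigma^2)$

df = m – k – 1 = 5 – 2 – 1 = 2

$\chi^2 = \sum \dfrac{(o_i - e_i)^2}{e_i} \approx 1.02$

CHIINV(0.05,2)=5.99

1.02 < 5.99: can’t reject normal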
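The whole goodness-of-fit calculation can be sketched with the standard library. The fitted normal uses the slides' estimates (mean 18.42, standard deviation 6.56), the category edges and observed counts come from the slides, and the critical value 5.99 is copied from the CHIINV result; the variable names are illustrative.

```python
from statistics import NormalDist

edges = [10, 15, 20, 25]     # boundaries of the 5 categories
observed = [2, 5, 5, 6, 2]   # <10, 10-15, 15-20, 20-25, >25
n = sum(observed)
fitted = NormalDist(mu=18.42, sigma=6.56)

# Cumulative probabilities at the edges, padded with 0 and 1 for the open ends
cdf = [0.0] + [fitted.cdf(e) for e in edges] + [1.0]
expected = [n * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 2 - 1   # m categories, k = 2 estimated parameters
critical = 5.99              # CHIINV(0.05, 2) as quoted on the slide
print([round(e, 2) for e in expected], round(chi2, 2), df, chi2 < critical)
```

Run as written, the statistic comes to roughly 1.0, well below the critical value, so normality is not rejected.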

88 Outline of Topics (Continued )

Estimation Theory/Hypotheses Testing Relationship

Operating Characteristic Curves and Power of a Test

Fitting Theoretical Distributions to Sample Frequency Distributions

Chi-Square Test for Goodness of Fit

89 Sum Up Chapter 7

Hypothesis testing

null vs alternative

null with equal sign

null often status quo

alternative often what want to prove

type I error vs type II error

probability of type I error called level of significance

P-values

1-ß = power of test = probability of rejecting false null

one tailed vs two tailed

90 Sum Up Chapter 7

Hypothesis tests

mean – Normal test: population normal, known variance; or large sample

$\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$

mean – t test: population normal, unknown variance, small sample

$\dfrac{\bar{X} - \mu}{\hat{s}/\sqrt{n}}$

91 Sum Up Chapter 7 Normal and t

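92 Sum Up Chapter 7

Hypothesis tests

difference of means – Normal test: population normal, known variance

$\dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

Hypothesis tests variance

$\chi^2 = \dfrac{(n - 1)\hat{S}^2}{\sigma^2}$

93 Sum Up Chapter 7

Are variances equal

$\dfrac{\hat{S}_1^2\,\sigma_2^2}{\hat{S}_2^2\,\sigma_1^2}$ is $F_{n_1 - 1,\, n_2 - 1}$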

94 Sum Up Chapter 7

$\chi^2$ and F

95 Sum Up Chapter 7

How is random variable distributed

normal – graph cumulative frequency distribution on special paper: straight line

Statistical: $\chi^2_{k-m-1} = \sum \dfrac{(o_i - e_i)^2}{e_i}$

k = categories, m = estimated parameters

always 1 tailed

End of Chapter 7!
