Statistics Review – Part II
• Topics:
  – Hypothesis Testing
  – Paired Tests
  – Tests of variability


Sampling and the Normal Distribution
Now that we have a good sampling plan, how do we monitor the population?
[Figure: inference from the sample to the population]

Tests of Means
• Suppose the standard of care for diabetics in the ICU is to have an HbA1c level of μ0 = 9.5
• You want to see whether your diabetic patients in the ICU meet the standard of care
• What do you do next?

Tests of Means
• Sample your diabetic patients in the ICU
• Randomization…
• Sample all diabetic patients every day?
  – Independence between measurements
• Time of day?
  – Controlling for external factors

Tests of Means
• Randomization
  – Control for all known factors
  – Then, randomize the rest
  – Uncontrollable factors will randomly affect your diabetic patients’ measurements
• Uncontrollable factors will ‘average out’
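One way to picture "control for known factors, then randomize the rest" is to select patients at random within each controlled group. The sketch below is only an illustration: the patient IDs, the time-of-draw groups, and the helper name `pick_patients` are all hypothetical and not part of the original sampling plan.

```python
import random

def pick_patients(patients_by_stratum, per_stratum, seed=None):
    """Randomly choose up to `per_stratum` patients within each controlled stratum."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    chosen = {}
    for stratum, patients in patients_by_stratum.items():
        k = min(per_stratum, len(patients))
        chosen[stratum] = rng.sample(patients, k)  # sampling without replacement
    return chosen

# Hypothetical roster, grouped by a controlled factor (time of the blood draw)
roster = {
    "morning": ["P01", "P02", "P03", "P04", "P05"],
    "evening": ["P06", "P07", "P08", "P09"],
}
selection = pick_patients(roster, per_stratum=3, seed=42)
for stratum, ids in selection.items():
    print(stratum, ids)
```

Holding the time of day fixed within each group controls that factor, while the random draw spreads the uncontrolled factors across the selected patients.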

Tests of Means
• Develop a sampling plan…
• What are the controllable elements?
• How do we randomize and collect our blood sugar measurements?

Tests of Means
• Suppose we collect A1c data on 20 diabetic patients in the ICU
  – Mean 7.5
  – Standard deviation 4.0
• How do we proceed?

Tests of Means
• Identify the null and alternative hypotheses
  – What is it we’re testing?
• Equal versus less than (left-tailed test)
  – H0: mean A1c = 9.5
  – H1: mean A1c < 9.5

Tests of Means
Compute the test statistic
$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
which follows Student’s t-distribution with n – 1 degrees of freedom.

Tests of Means
• Our hypothesis test expressed on the normal distribution
[Figure: distribution curve; x-axis from 7.0 to 12.0]

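Tests of Means
• Calculate the t-statistic
  – 19 degrees of freedom
$$t = \frac{7.5 - 8.6}{4 / \sqrt{20}} = -1.23$$
  – The sign is negative because the sample mean is below the hypothesized mean; the magnitude is 1.23
  – This arithmetic uses 8.6 as the hypothesized mean rather than the 9.5 stated in H0; with 9.5, t = −2.24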
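The arithmetic for the 20-patient sample can be checked in a few lines. This is a minimal sketch; the function name `one_sample_t` is made up for illustration. Because the calculation subtracts 8.6 while H0 states 9.5, the sketch evaluates both values rather than guessing which one was intended.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """Return (t, df) where t = (x_bar - mu_0) / (s / sqrt(n)) and df = n - 1."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesized_mean) / standard_error, n - 1

t_calc, df = one_sample_t(7.5, 8.6, 4.0, 20)  # hypothesized mean used in the arithmetic
t_h0, _ = one_sample_t(7.5, 9.5, 4.0, 20)     # hypothesized mean stated in H0
print(f"df = {df}, t (mu0 = 8.6) = {t_calc:.2f}, t (mu0 = 9.5) = {t_h0:.2f}")
# prints: df = 19, t (mu0 = 8.6) = -1.23, t (mu0 = 9.5) = -2.24
```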

Tests of Means
• Compare to the critical value of the t-statistic with 19 degrees of freedom
  – What do you conclude?
• Do your conclusions match your graphical interpretation?
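For a left-tailed test at α = 0.05 with 19 degrees of freedom, the tabulated critical value is about −1.729, so a statistic of −1.23 (the 1.23 above, with the sign of 7.5 − 8.6) does not reach it. The sketch below reaches the same conclusion through the left-tail area. It uses only the standard library and integrates the t density numerically with Simpson's rule; the helper names `t_pdf` and `t_cdf` are illustrative and do not come from any statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t), integrating the density from 0 to |t| with Simpson's rule."""
    a = abs(t)
    h = a / steps                      # `steps` must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3               # area between 0 and |t|
    return 0.5 + area if t >= 0 else 0.5 - area

df = 19
t_observed = -1.23                     # t-statistic from the 20-patient sample
p_left = t_cdf(t_observed, df)         # left-tailed test: P(T <= t_observed)
print(f"left-tail p-value = {p_left:.3f}")
print("reject H0 at alpha = 0.05" if p_left < 0.05
      else "fail to reject H0 at alpha = 0.05")
```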

Tests of Means
• Instead, say that you sampled 80 diabetic patients in the ICU
• Mean = 7.5
• Standard deviation = 4.0
• What do you conclude now?

Tests of Means
$$t = \frac{7.5 - 8.6}{4 / \sqrt{80}} = -2.46$$
• How much area is in the ‘curve’ now?
• What are we basing our assumptions on?
  – How do we validate our assumptions?

Tests of Means
A P-value is the probability of observing a sample statistic as extreme or more extreme than the one observed under the assumption the null hypothesis is true.

Tests of Means
• The P-value for a t value of 2.46 is about 0.01
  – If the null hypothesis were true, about 1 in 100 experiments would produce a t value at least as extreme as 2.46 through ‘random chance’ alone
  – Because such a result is unlikely under the null hypothesis, it is more plausible that the mean A1c differs from the hypothesized value
• If p-value < 0.05, we often say we reject the null hypothesis in favor of the alternative (there is strong evidence against the null hypothesis).
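The "1 in 100 experiments" reading can be illustrated by simulation: repeat the experiment many times in a world where the null hypothesis is true and count how often the t-statistic is at least as extreme as the observed one. This is a hedged sketch rather than the method used in the slides. It assumes normally distributed measurements, and the values `mu0 = 8.6`, `sigma = 4.0`, the trial count, and the seed are illustrative choices.

```python
import math
import random

def simulate_left_tail(t_observed, n, mu0, sigma, trials=5000, seed=1):
    """Fraction of null-hypothesis samples whose t-statistic is <= t_observed."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
        t = (mean - mu0) / math.sqrt(var / n)
        if t <= t_observed:
            hits += 1
    return hits / trials

# n = 80 and |t| = 2.46 come from the slides; mu0 and sigma only set the scale
# of the simulated data and do not change the distribution of t under H0.
fraction = simulate_left_tail(t_observed=-2.46, n=80, mu0=8.6, sigma=4.0)
print(f"fraction of null experiments with t <= -2.46: {fraction:.4f}")
```

The simulated fraction estimates the one-sided P-value, so it should land near 0.01.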

Tests of Means
Lastly, we have to validate the assumptions on which we based our test. Here, we use a normal probability plot.
• Fat pencil test to detect normality
[Figure: normal probability plot; x-axis: Thickness (Ang), 140 to 160; y-axis: cumulative probability, .01 to .99]
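The coordinates behind a normal probability plot can be sketched with the standard library. Each sorted observation is paired with the standard-normal score expected at its rank; if the points fall close to a straight line (one a "fat pencil" would cover), normality is plausible. The data values below are made up for illustration, and the plotting position (i − 0.5)/n is one common convention assumed here, not necessarily the one behind the original figure.

```python
from statistics import NormalDist

def normal_plot_points(data):
    """Pair each sorted observation with its expected standard-normal score."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(ordered, start=1):
        prob = (i - 0.5) / n   # plotting position (assumed convention)
        points.append((std_normal.inv_cdf(prob), value))
    return points

# Hypothetical thickness measurements (Ang), for illustration only
thickness = [148.2, 151.0, 149.5, 153.3, 146.8, 150.1, 152.4, 147.9, 150.7, 149.0]
for z, value in normal_plot_points(thickness):
    print(f"{z:6.2f}  {value:6.1f}")
```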

Tests of Means
• Example:
  – Is the turn-around time in the lab less than 120 minutes?
  – Null Hypothesis:
  – Alternative Hypothesis:

Tests of Means
• Is the turn-around time in the lab less than 120 minutes?
  – 100 samples
  – Mean = 138.0
  – Standard deviation = 25.8
  – t = 6.98
  – df = 99
  – p-value
[Figure: time-series plot; x-axis: Week, 0 to 100]
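The reported statistic can be reproduced from the summary numbers. This is a small check and assumes 120 minutes is the hypothesized mean. The positive sign shows the sample mean lies above 120, so these data give no support for a turn-around time below 120 minutes.

```python
import math

n, sample_mean, sample_sd = 100, 138.0, 25.8   # summary statistics from the slide
target = 120.0                                 # turn-around time being tested (minutes)

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - target) / standard_error
print(f"standard error = {standard_error:.2f}, t = {t_stat:.2f}, df = {n - 1}")
# prints: standard error = 2.58, t = 6.98, df = 99
```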

Tests of Means
• What is the ‘strength’ of the evidence against the null hypothesis?
• What do you conclude relative to the null hypothesis?

Tests of Means
• End: Z-statistics, T-statistics
• Start: Independent samples t-test

Tests of Means
• Objective:
  – Determine whether there are statistically significant differences between means of independently-drawn samples
• Example:
  – Mean length of stay (LOS) is the same for 2 physicians
  – Mean registration time is the same on Fridays as it is on Tuesdays


Tests of Means
• Claim:
  – Physician A avg LOS = Physician B avg LOS
• Alternative Hypothesis:
  – Physician A avg LOS not equal to Physician B avg LOS

Tests of Means
• Compare the average LOS for visual differences:
[Figure: time-series plot; x-axis: Week, 0 to 50]

Tests of Means
• Graphically, we have a distribution for the difference in avg LOS

Tests of Means
• Hypothesis: Physician A and B have the same average LOS
  – Sample 12 Physician A and B patients
  – A:
    • Mean LOS is 5.8
    • Standard deviation is 4.0
  – B:
    • Mean LOS is 8.5
    • Standard deviation is 3.2

Tests of Means
• Calculate the test statistic
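For two independent samples whose variances are not pooled, the usual test statistic is the one below. It is given here as an assumption about which formula is meant; it uses the same standard error that appears in the confidence bounds for the difference in means.

```latex
t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```

Here $\bar{x}_i$, $s_i$ and $n_i$ are the sample mean, sample standard deviation and sample size for physician $i$.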

Tests of Means
The degrees of freedom used to determine the critical value(s) presented in the last example are conservative. More accurate results can be obtained by using the following degrees of freedom:
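The conservative choice is usually the smaller of $n_1 - 1$ and $n_2 - 1$. The more accurate value is commonly taken from the Welch–Satterthwaite approximation, shown here as an assumption about which formula is intended:

```latex
df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}
          {\dfrac{\left(s_1^2 / n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2 / n_2\right)^2}{n_2 - 1}}
```

The result is generally not an integer; tables require rounding down, while software can use the fractional value directly.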

Tests of Means
• What is the t-statistic?
• What is the p-value associated with the two-sided test?
• What do you conclude?
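These three questions can be worked through numerically for the first physician comparison. The sketch assumes 12 patients were sampled from each physician (the slide does not say whether 12 is per physician or in total) and uses the unpooled (Welch) statistic. The helper names are illustrative, and the p-value comes from a Simpson's-rule integration of the t density rather than from a statistics library.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, Welch-Satterthwaite df) for two independent samples."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

def t_pdf(x, df):
    """Density of Student's t-distribution (df may be fractional)."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|) via Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3)

# Assumption: 12 patients from EACH physician
t, df_welch = welch_t(5.8, 4.0, 12, 8.5, 3.2, 12)
df_conservative = min(12, 12) - 1
p_welch = two_sided_p(t, df_welch)
p_conservative = two_sided_p(t, df_conservative)
print(f"t = {t:.2f}")
print(f"Welch df = {df_welch:.1f}, two-sided p = {p_welch:.3f}")
print(f"conservative df = {df_conservative}, two-sided p = {p_conservative:.3f}")
```

Under these assumptions both p-values fall above 0.05, so the difference in mean LOS is not statistically significant at the 5% level with either choice of degrees of freedom.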

Tests of Means
• Say, now, that you sample 25 patients from each physician:
  – Physician A and B patients:
  – A:
    • Mean LOS is 6.8
    • Standard deviation is 4.3
  – B:
    • Mean LOS is 8.1
    • Standard deviation is 5.0

Tests of Means
• What is the t-statistic?
• What is the p-value associated with the two-sided test?
• What do you conclude?

Tests of Means
$$\text{Lower Bound} = (\bar{x}_1 - \bar{x}_2) - t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$\text{Upper Bound} = (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
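The bounds can be evaluated for the 25-patient samples (A: mean 6.8, SD 4.3; B: mean 8.1, SD 5.0). This sketch makes two assumptions: it uses the conservative degrees of freedom, min(n1, n2) − 1, and it finds the critical value by bisection on a numerically integrated t distribution instead of a table lookup. All function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with `df` degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) via Simpson's rule on the density between 0 and |t|."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t >= 0 else 0.5 - area

def t_quantile(prob, df):
    """Find t with P(T <= t) = prob by bisection (for 0.5 < prob < 1)."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def diff_ci(mean1, sd1, n1, mean2, sd2, n2, alpha=0.05):
    """Bounds for mu1 - mu2 using the conservative df = min(n1, n2) - 1."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    t_crit = t_quantile(1 - alpha / 2, min(n1, n2) - 1)
    diff = mean1 - mean2
    return diff - t_crit * se, diff + t_crit * se

lower, upper = diff_ci(6.8, 4.3, 25, 8.1, 5.0, 25)
print(f"95% CI for mean LOS difference (A - B): ({lower:.2f}, {upper:.2f})")
```

An interval that contains 0 is consistent with no difference in mean LOS, which matches a two-sided test that fails to reject at the same α.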

Tests of Means
• End: Independent Samples Test
• Start: Tests of Variability – F-Tests

Tests of Variances
• When might you want to know whether the variability in 2 populations is the same?
  – Time series: whether a process is ‘in control’; or
  – Comparison between two populations

Tests of Variances
[Figure: time-series plot; x-axis: Week, 0 to 50]

Tests of Variances
[Figure: time-series plot; x-axis: Week, 0 to 100]

Tests of Variances
• Comparing variability is based on the ratio of sample variances
• The ratio of sample variances is called the F-statistic and follows the F-distribution

Tests of Variances
• Since $F = s_1^2 / s_2^2$:
• If $s_1^2$ is large relative to $s_2^2$, then F is very large and suggests the population variances are different.
• If $s_2^2$ is large relative to $s_1^2$, then F is very small and suggests the population variances are different.

Tests of Variances
• How ‘large’ or how ‘small’ does F have to be to be ‘significant’?
  – That depends on the F-distribution, which takes into account the sample sizes for both sample variances
  – Recall, the larger the sample size, the more precisely we can characterize the sample variance (and test for differences between the sample variances)


Tests of Variances
Testing Claims Regarding Two Population Standard Deviations
1. The samples are independent simple random samples from both populations.
  • Drawing observations in one population does not affect the drawing of observations in the second population
2. The populations from which the samples are drawn are normally distributed.
  • If the populations from which the samples are drawn are not normally distributed, do not use the F-test for equality of variances


Tests of Variances
Testing hypotheses about equality of variances:
$F_{\alpha}$ is the critical F with $n_1 - 1$ degrees of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator and an area of α to the right of the critical F.


Tests of Variances
• Example:
  – Change process in lab.
[Figure: time-series plot; x-axis: Week, 0 to 100]

Tests of Variances
• Summary:
  – Pre-change
    • Minimum: 51
    • Mean: 128 minutes
    • Maximum: 236
    • Standard deviation: 40 minutes
  – Post-change
    • Minimum: 105
    • Mean: 121 minutes
    • Maximum: 143
    • Standard deviation: 10 minutes

Tests of Variances
• 50 weeks pre- and post-change
• Calculate the F-statistic as the ratio of the sample variances (not of the standard deviations)
  • F = 40² / 10² = 1600 / 100 = 16
• Look it up!
• P-value!
  – Less than critical value?
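The lookup can be approximated numerically. The sketch assumes one observation per week, giving 49 degrees of freedom in both numerator and denominator, and reports the upper-tail area P(F > 16). The helpers `f_pdf` and `f_upper_tail` are illustrative names. The tail is computed from the identity P(F(d1, d2) > f) = P(F(d2, d1) < 1/f), which keeps the numerical integration on a short, finite interval.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F-distribution; returns 0 at x <= 0 (valid for d1 > 2)."""
    if x <= 0:
        return 0.0
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    return math.exp((d1 / 2) * math.log(d1 / d2) + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log1p(d1 * x / d2) - log_beta)

def f_upper_tail(f_stat, d1, d2, steps=2000):
    """P(F > f_stat) for F(d1, d2) with d2 > 2, via Simpson's rule on [0, 1/f_stat]."""
    b = 1.0 / f_stat
    h = b / steps
    # Integrate the density of the reciprocal, which follows F(d2, d1)
    total = f_pdf(0.0, d2, d1) + f_pdf(b, d2, d1)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d2, d1)
    return total * h / 3

sd_pre, sd_post = 40.0, 10.0           # standard deviations from the summary
n_pre, n_post = 50, 50                 # assumed: one observation per week
f_stat = sd_pre ** 2 / sd_post ** 2    # ratio of variances, not of SDs
p_value = f_upper_tail(f_stat, n_pre - 1, n_post - 1)
print(f"F = {f_stat:.0f} with ({n_pre - 1}, {n_post - 1}) df, "
      f"upper-tail p = {p_value:.1e}")
```

An upper-tail area this far below 0.05 means F lies well beyond the critical value, so equality of the two variances is rejected.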

Tests of Variances
• What do you conclude about the equality of the variability of the distributions?
• Does this ‘intuitively’ make sense when compared to the graph?
[Figure: time-series plot; x-axis: Week, 0 to 100]

Conclusions

• Review:
  – Paired t-tests
  – Independent samples t-test
  – F-test for equality of variances
  – Hypothesis testing
  – P-values
  – Confidence intervals