Inference for µ1 − µ2
Confidence Intervals and Hypothesis
Tests for the Difference between
Two Population Means µ1 - µ2:
Independent Samples
Confidence Intervals for the
Difference between Two Population
Means µ1 - µ2: Independent Samples
• Two random samples are drawn from the
two populations of interest.
• Because we compare two population
means, we use the statistic x̄1 − x̄2.
Population 1
  Parameters: µ1 and σ1² (values are unknown)
  Sample size: n1
  Statistics: x̄1 and s1²
Population 2
  Parameters: µ2 and σ2² (values are unknown)
  Sample size: n2
  Statistics: x̄2 and s2²
Estimate µ1 − µ2 with x̄1 − x̄2
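Sampling distribution model for x̄1 − x̄2?
$E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
$SD(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
Estimate using
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
Shape? Approximately a t distribution with
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$
An estimate of the degrees of freedom is min(n1 − 1, n2 − 1).
[Figure: t curve (with the df above) for x̄1 − x̄2, centered at µ1 − µ2, with spread $\sqrt{s_1^2/n_1 + s_2^2/n_2}$]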
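The two ways of choosing the degrees of freedom can be compared with a short sketch. The function names `welch_df` and `conservative_df` are illustrative (they do not come from the slides), and the inputs are the summary statistics of the high-fiber cereal example that appears later in this transcript.

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom from the df formula on the slide.

    s1_sq and s2_sq are sample variances (not standard deviations).
    """
    v1 = s1_sq / n1
    v2 = s2_sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


def conservative_df(n1, n2):
    """The simpler estimate min(n1 - 1, n2 - 1)."""
    return min(n1 - 1, n2 - 1)


# Summary statistics from the high-fiber cereal example (s1^2 = 4103, s2^2 = 10670).
df_formula = welch_df(4103, 43, 10670, 107)
df_simple = conservative_df(43, 107)
print(round(df_formula, 1), df_simple)  # the slides report 122.6 and 42
```

The simpler estimate is never larger than the formula value, so it leads to a t* that is at least as large and an interval that is at least as wide.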
Two sample t-confidence interval
Practical use of t: t*

C is the area between −t* and t*.

We find the value of t* in the line
of the t-table for the correct df and
the column for confidence level C.
[Figure: t density curve; C is the central area between −t* and t*]
Confidence Interval for µ1 − µ2
Confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t^*_{df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
where $t^*_{df}$ is the value from the t-table that corresponds to the confidence level, and
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$
An estimate of the degrees of freedom is min(n1 − 1, n2 − 1).
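As a rough illustration of the interval formula, the sketch below takes summary statistics and a t* value read from a t-table and returns the two endpoints. The function name `two_sample_ci` is an assumption made for this example; the numbers are the cereal example's summary statistics with t* = 2.0181 for df = 42.

```python
import math


def two_sample_ci(x1_bar, s1_sq, n1, x2_bar, s2_sq, n2, t_star):
    """Return (lower, upper) for mu1 - mu2 from summary statistics and t*.

    s1_sq and s2_sq are sample variances; t_star comes from a t-table.
    """
    se = math.sqrt(s1_sq / n1 + s2_sq / n2)  # standard error of x1_bar - x2_bar
    diff = x1_bar - x2_bar
    margin = t_star * se
    return diff - margin, diff + margin


# Cereal example: df = min(42, 106) = 42, so t* = 2.0181 for 95% confidence.
lower, upper = two_sample_ci(604.02, 4103, 43, 633.239, 10670, 107, 2.0181)
print(f"({lower:.2f}, {upper:.2f})")
```

The endpoints agree with the interval worked out by hand in the example, up to rounding in the intermediate steps.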
Hypothesis test for µ1 − µ2
H0: µ1 − µ2 = 0; Ha: µ1 − µ2 > 0 (or < 0, or ≠ 0)
Test statistic:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2}$
An estimate of the degrees of freedom is min(n1 − 1, n2 − 1).
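A minimal sketch of the test statistic, assuming only summary statistics are available. The names `two_sample_t`, `null_diff`, and `reject_lower_tail` are hypothetical helpers for this illustration; the critical values 2.0181 and 2.4185 are the df = 42 table values quoted in the cereal example later in these slides.

```python
import math


def two_sample_t(x1_bar, s1_sq, n1, x2_bar, s2_sq, n2, null_diff=0.0):
    """Two-sample t statistic for H0: mu1 - mu2 = null_diff (variances not pooled)."""
    se = math.sqrt(s1_sq / n1 + s2_sq / n2)
    return ((x1_bar - x2_bar) - null_diff) / se


def reject_lower_tail(t, t_star):
    """Lower-tailed test (Ha: mu1 - mu2 < 0): reject H0 when t < -t*."""
    return t < -t_star


# Cereal example summary statistics.
t = two_sample_t(604.02, 4103, 43, 633.239, 10670, 107)
print(f"t = {t:.2f}")
print("reject at .025:", reject_lower_tail(t, 2.0181))
print("reject at .01: ", reject_lower_tail(t, 2.4185))
```

Rejecting at the .025 level but not at the .01 level is the same statement as .01 < P-value < .025.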
Example: confidence interval for µ1 − µ2 using
min(n1 − 1, n2 − 1) to approximate the df
• Example
– Do people who eat high-fiber cereal for
breakfast consume, on average, fewer
calories for lunch than people who do
not eat high-fiber cereal for breakfast?
– A sample of 150 people was randomly
drawn. Each person was identified as a
consumer or a non-consumer of high-fiber cereal.
– For each person the number of calories
consumed at lunch was recorded.
Example: confidence interval for µ1 − µ2
Consumers / Non-consumers (calories at lunch; partial listing, in the order shown on the slide):
568 498 589 681 540 646 636 739 539 596 607 529 637 617 633 555 ...
705 819 706 509 613 582 601 608 787 573 428 754 741 628 537 748 ...
n1 = 43, n2 = 107
x̄1 = 604.02, x̄2 = 633.239
s1² = 4103, s2² = 10670
Solution:
• The parameter to be tested is
the difference between two means.
• The claim to be tested is:
The mean caloric intake of consumers (µ1)
is less than that of non-consumers (µ2).
$df = \dfrac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^2 + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^2} = 122.6$
Let’s use df = min(43 − 1, 107 − 1) = min(42, 106) = 42;
t42* = 2.0181
Example: confidence interval for µ1 − µ2
• df = 42; t42* = 2.0181
• The confidence interval estimator for the difference
between two means using the formula is
$(\bar{x}_1 - \bar{x}_2) \pm t^*_{42}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$= (604.02 - 633.239) \pm 2.0181\sqrt{\dfrac{4103}{43} + \dfrac{10670}{107}}$
$= -29.21 \pm 28.19 = (-57.40, -1.02)$
Interpretation
• The 95% CI is (-57.40, -1.02).
• Since the interval is entirely negative (that is,
does not contain 0), there is evidence from
the data that µ1 is less than µ2. We estimate
that non-consumers of high-fiber breakfast
consume on average between 1.02 and 57.40
more calories for lunch.
Beware!! Common Mistake!!!
A common mistake is to calculate a one-sample
confidence interval for µ1, a one-sample confidence interval for
µ2, and then to conclude that µ1 and µ2 are equal if the
confidence intervals overlap.
This is WRONG because the variability in the sampling
distribution for x̄1 − x̄2 from two independent samples is more
complex and must take into account variability coming from both
samples. Hence the more complex formula for the standard error:
$SE = \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
INCORRECT Two single-sample 95% confidence intervals:
The confidence interval for the male mean and the
confidence interval for the female mean overlap,
suggesting no significant difference between the true
mean for males and the true mean for females.
             Male    Female
mean         19.4    17.9
st. dev. s   2.52    3.39
n            50      50
Male interval: (18.68, 20.12)
Female interval: (16.94, 18.86)
CORRECT The 2-sample 95% confidence interval of the form
$(\bar{y}_1 - \bar{y}_2) \pm t^*_{.025,\,df}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
for the difference µmale − µfemale between the means
is (.313, 2.69). The interval is entirely positive, suggesting a significant difference
between the true mean for males and the true mean for females
(evidence that the true male mean is larger than the true female mean).
[Figure: number line marking 0, .313, 1.5, and 2.69]
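Reason for Contradictory Result
It's always true that $\sqrt{a + b} \le \sqrt{a} + \sqrt{b}$ for nonnegative a and b. Specifically,
$\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} \le \dfrac{s_1}{\sqrt{n_1}} + \dfrac{s_2}{\sqrt{n_2}}$
$SE(\bar{x}_1 - \bar{x}_2) \le SE(\bar{x}_1) + SE(\bar{x}_2)$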
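The inequality can be checked numerically with the male/female summary statistics from the earlier slide. This is only an illustrative sketch; the variable names are chosen for this example.

```python
import math

# Male/female summary statistics from the "INCORRECT" slide.
s_male, n_male = 2.52, 50
s_female, n_female = 3.39, 50

# Standard errors of the two one-sample means.
se_male = s_male / math.sqrt(n_male)
se_female = s_female / math.sqrt(n_female)

# Standard error of the difference between the two sample means.
se_diff = math.sqrt(s_male**2 / n_male + s_female**2 / n_female)

print(f"SE(diff)              = {se_diff:.3f}")
print(f"SE(male) + SE(female) = {se_male + se_female:.3f}")
```

Checking whether two one-sample intervals overlap behaves roughly like using a margin built from SE(male) + SE(female), which is larger than the margin built from SE(diff). That is why the intervals can overlap even though the two-sample interval excludes 0.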
Example: hypothesis test for µ1 − µ2
Consumers / Non-consumers: same data as in the confidence interval example above.
Solution:
• The parameter to be tested is
the difference between two means.
• The claim to be tested is:
The mean caloric intake of consumers (µ1)
is less than that of non-consumers (µ2).
n1 = 43, n2 = 107
x̄1 = 604.02, x̄2 = 633.239
s1² = 4103, s2² = 10670
Example: hypothesis test for µ1 − µ2 (cont.)
H0: µ1 − µ2 = 0; Ha: µ1 − µ2 < 0
Test statistic:
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{604.02 - 633.239}{\sqrt{\dfrac{4103}{43} + \dfrac{10670}{107}}} = -2.09$
Let’s use df = min(n1 − 1, n2 − 1) =
min(43 − 1, 107 − 1) = min(42, 106) = 42
From t-table: for df = 42,
−2.4185 < t = −2.09 < −2.0181
⇒ .01 < P-value < .025
Conclusion: reject H0 and conclude high-fiber
breakfast eaters consume fewer calories at lunch
Does smoking damage the lungs of children exposed
to parental smoking?
Forced vital capacity (FVC) is the volume (in milliliters) of
air that an individual can exhale in 6 seconds.
FVC was obtained for a sample of children not exposed to
parental smoking and a group of children exposed to
parental smoking.
Parental smoking   FVC x̄   s      n
Yes                75.5     9.3    30
No                 88.2     15.1   30
We want to know whether parental smoking decreases
children’s lung capacity as measured by the FVC test.
Is the mean FVC lower in the population of children
exposed to parental smoking?
95% confidence interval for (µ1 − µ2), with
df = min(30 − 1, 30 − 1) = 29 ⇒ t* = 2.0452:
µ1 = mean FVC of children with a smoking parent;
µ2 = mean FVC of children without a smoking parent
$(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$= (75.5 - 88.2) \pm 2.0452\sqrt{\dfrac{9.3^2}{30} + \dfrac{15.1^2}{30}}$
$= -12.7 \pm 2.0452 \times 3.24$
$= -12.7 \pm 6.63 = (-19.33, -6.07)$
We are 95% confident that mean lung capacity is between
6.07 and 19.33 milliliters LESS in children of smoking
parents.
Do left-handed people have a shorter life expectancy than
right-handed people?
• Some psychologists believe that the stress of being left-handed in a right-handed world leads to earlier deaths among
left-handers.
• Several studies have compared the life expectancies of left-handers and right-handers.
• One such study resulted in the data shown in the table.
Handedness   Mean age at death x̄   s      n
Left         66.8                   25.3   99
Right        75.2                   15.1   888
[Photo: star left-handed quarterback Steve Young]
[Photo: left-handed presidents]
We will use the data to construct a confidence interval
for the difference in mean life expectancies for left-
handers and right-handers.
Is the mean life expectancy of left-handers less
than the mean life expectancy of right-handers?
95% confidence interval for (µ1 − µ2), with
df = min(99 − 1, 888 − 1) = 98 ⇒ t* = 1.9845:
µ1 = mean life expectancy of left-handers;
µ2 = mean life expectancy of right-handers
$(\bar{x}_1 - \bar{x}_2) \pm t^*\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
$= (66.8 - 75.2) \pm 1.9845\sqrt{\dfrac{(25.3)^2}{99} + \dfrac{(15.1)^2}{888}}$
$= -8.4 \pm 1.9845 \times 2.59$
$= -8.4 \pm 5.14 = (-13.54, -3.26)$
We are 95% confident that the mean life expectancy for left-handers is between 3.26 and 13.54 years LESS than the mean
life expectancy for right-handers.
[Photo: The “Bambino”, left-handed Babe Ruth, baseball’s all-time best player.]
Matched pairs t procedures
Sometimes we want to compare treatments or conditions at the
individual level. These situations produce two samples that are not
independent — they are related to each other. The members of one
sample are identical to, or matched (paired) with, the members of the
other sample.
– Example: Pre-test and post-test studies look at data collected on the
same sample elements before and after some experiment is performed.
– Example: Twin studies often try to sort out the influence of genetic
factors by comparing a variable between sets of twins.
– Example: Using people matched for age, sex, and education in social
studies allows canceling out the effect of these potential lurking
variables.
Matched pairs t procedures
• The data:
– “before”: x11 x12 x13 … x1n
– “after”: x21 x22 x23 … x2n
• The data we deal with are the differences di of the
paired values:
d1 = x11 – x21 d2 = x12 – x22 d3 = x13 – x23 … dn = x1n – x2n
• A confidence interval for matched pairs data is
calculated just like a confidence interval for 1 sample
data: $\bar{d} \pm t^*_{n-1}\dfrac{s_d}{\sqrt{n}}$
• A matched pairs hypothesis test is just like a one-sample test:
H0: µdifference = 0; Ha: µdifference > 0 (or < 0, or ≠ 0)
Sweetening loss in colas
The sweetness loss due to storage was evaluated by 10 professional
tasters (comparing the sweetness before and after storage):
Taster   Before sweetness − after sweetness
1         2.0
2         0.4
3         0.7
4         2.0
5        −0.4
6         2.2
7        −1.3
8         1.2
9         1.1
10        2.3
Summary stats: d̄ = 1.02, s = 1.196
95% Confidence interval:
1.02 ± 2.2622(1.196/sqrt(10)) = 1.02 ± 2.2622(.3782)
= 1.02 ± .8556 = (.1644, 1.8756)
We want to test if storage results in a
loss of sweetness, thus:
H0: µdifference = 0
versus Ha: µdifference > 0
This is a pre-/post-test design and the variable is the cola sweetness
before storage minus cola sweetness after storage.
A matched pairs test of significance is indeed just like a one-sample
test.
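The matched pairs interval can be reproduced from the ten differences with a small sketch. The function name `paired_ci` is an assumption for this illustration; t* = 2.2622 is the df = 9 table value used on the slide.

```python
import math
import statistics

# Before-minus-after sweetness differences for the 10 tasters.
differences = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]


def paired_ci(diffs, t_star):
    """One-sample t interval applied to the paired differences."""
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)  # sample standard deviation (divides by n - 1)
    margin = t_star * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


lower, upper = paired_ci(differences, t_star=2.2622)  # df = 9, 95% confidence
print(f"({lower:.4f}, {upper:.4f})")
```

Only the differences enter the calculation, which is why the procedure is the same as for a single sample.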
Sweetening loss in colas hypothesis test
• H0: µdifference = 0 vs Ha: µdifference > 0
• Test statistic:
$t = \dfrac{1.02 - 0}{1.196/\sqrt{10}} = \dfrac{1.02}{.3782} = 2.6970$
• From t-table: for df = 9,
2.2622 < t = 2.6970 < 2.8214
⇒ .01 < P-value < .025
• TI-83 gives P-value = .012263…
• Conclusion: reject H0 and conclude colas do
lose sweetness in storage (note that the CI was
entirely positive).
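Without a calculator, an upper-tail P-value can be approximated by integrating the t density numerically. The sketch below uses only the Python standard library and Simpson's rule. It is a stand-in for the calculator's routine, not a description of how the TI-83 computes the value, and the function names are assumptions.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail_p(t, df, steps=2000):
    """Approximate P(T > t): Simpson's rule on [0, |t|] plus symmetry.

    steps must be even for Simpson's rule.
    """
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area between 0 and |t|
    tail = 0.5 - area             # area beyond |t| on one side
    return tail if t >= 0 else 1.0 - tail


p_value = upper_tail_p(2.6970, df=9)
print(f"P-value ≈ {p_value:.4f}")
```

The result should fall inside the table bracket .01 < P-value < .025 and close to the calculator value quoted above.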
Does lack of caffeine increase depression?
Individuals diagnosed as caffeine-dependent are
deprived of caffeine-rich foods and assigned
to receive daily pills. Sometimes, the pills
contain caffeine and other times they contain
a placebo. Depression was assessed (larger number means more depression).
Subject   Depression with Caffeine   Depression with Placebo   Placebo − Caffeine
1         5                          16                        11
2         5                          23                        18
3         4                          5                         1
4         3                          7                         4
5         8                          14                        6
6         5                          24                        19
7         0                          6                         6
8         0                          3                         3
9         2                          15                        13
10        11                         12                        1
11        1                          0                         −1
– There are 2 data points for each subject, but we’ll only look at the difference.
– The sample distribution appears appropriate for a t-test.
[Figure: normal quantile plot of the 11 “difference” data points; vertical axis DIFFERENCE, horizontal axis normal quantiles]
Hypothesis Test: Does lack of caffeine increase depression?
For each individual in the sample, we have calculated a difference in depression score
(placebo minus caffeine).
There were 11 “difference” points, thus df = n − 1 = 10.
We calculate that x̄ = 7.36; s = 6.92
H0: µdifference = 0; Ha: µdifference > 0
$t = \dfrac{\bar{x} - 0}{s/\sqrt{n}} = \dfrac{7.36}{6.92/\sqrt{11}} = 3.53$
For df = 10, 3.169 < t = 3.53 < 3.581 ⇒ 0.005 > p > 0.0025
TI-83 gives P-value = .0027
Caffeine deprivation causes a significant increase in depression.
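The summary statistics and the test statistic can be recomputed from the two columns of the data table. This is a minimal sketch using the standard library; the variable names are chosen for the illustration.

```python
import math
import statistics

# Depression scores for subjects 1-11, taken from the data table.
caffeine = [5, 5, 4, 3, 8, 5, 0, 0, 2, 11, 1]
placebo = [16, 23, 5, 7, 14, 24, 6, 3, 15, 12, 0]

# Paired differences: placebo minus caffeine, one per subject.
differences = [p - c for p, c in zip(placebo, caffeine)]

n = len(differences)
d_bar = statistics.mean(differences)
s_d = statistics.stdev(differences)      # sample standard deviation
t = (d_bar - 0) / (s_d / math.sqrt(n))   # one-sample t on the differences

print(differences)
print(f"mean = {d_bar:.2f}, s = {s_d:.2f}, t = {t:.2f}, df = {n - 1}")
```

The recomputed values can be compared with x̄ = 7.36, s = 6.92, and t = 3.53 from the slide.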