6.2a - Two Means
CHAPTER 6
Statistical Inference & Hypothesis Testing
• 6.1 - One Sample
Mean μ, Variance σ², Proportion π
• 6.2 - Two Samples
Means, Variances, Proportions
μ1 vs. μ2, σ1² vs. σ2², π1 vs. π2
• 6.3 - Multiple Samples
Means, Variances, Proportions
μ1, …, μk; σ1², …, σk²; π1, …, πk
Consider two independent populations…and a random variable X, normally distributed in each.
POPULATION 1: X1 ~ N(μ1, σ1)
POPULATION 2: X2 ~ N(μ2, σ2)
Null Hypothesis
H0: μ1 = μ2, i.e., μ1 – μ2 = 0 (the null value μ0)
(“No mean difference”)
Test at signif level α
Classic Example: “Randomized Clinical Trial”… Pop 1 = Treatment, Pop 2 = Control
Random Sample 1, size n1, sample mean \( \bar{X}_1 \)
Random Sample 2, size n2, sample mean \( \bar{X}_2 \)
Sampling Distribution of \( \bar{X}_1 - \bar{X}_2 \) =?
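Sampling Distribution =?
\[ \bar{X}_1 \sim N\!\left(\mu_1,\ \frac{\sigma_1}{\sqrt{n_1}}\right), \qquad \bar{X}_2 \sim N\!\left(\mu_2,\ \frac{\sigma_2}{\sqrt{n_2}}\right) \]
Recall from section 4.1 (Discrete Models):
Mean(X – Y) = Mean(X) – Mean(Y)
and if X and Y are independent…
Var(X – Y) = Var(X) + Var(Y)
\[ \bar{X}_1 - \bar{X}_2 \sim N\!\left(\mu_1 - \mu_2,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right), \qquad \text{where } \mu_1 - \mu_2 = 0 \text{ under } H_0 \]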
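The mean and variance rules recalled above can be checked numerically. The following is a minimal simulation sketch in R; the parameter values (`mu1`, `sigma1`, `n1`, and so on) are illustrative assumptions and do not come from the slides.

```r
# Simulation sketch: compare the simulated mean and standard deviation of
# xbar1 - xbar2 with the theoretical values mu1 - mu2 and
# sqrt(sigma1^2/n1 + sigma2^2/n2).
# All parameter values are illustrative assumptions.
set.seed(1)
mu1 <- 10; sigma1 <- 3; n1 <- 25
mu2 <- 8;  sigma2 <- 4; n2 <- 40
reps <- 20000

# Each replicate draws one independent sample per population
# and records the difference in sample means.
diffs <- replicate(
  reps,
  mean(rnorm(n1, mu1, sigma1)) - mean(rnorm(n2, mu2, sigma2))
)

theory_mean <- mu1 - mu2
theory_sd   <- sqrt(sigma1^2 / n1 + sigma2^2 / n2)
sim_mean    <- mean(diffs)
sim_sd      <- sd(diffs)

print(c(theory_mean = theory_mean, sim_mean = sim_mean))
print(c(theory_sd = theory_sd, sim_sd = sim_sd))
```

With this many replicates the simulated values should sit close to the theoretical ones; the agreement is approximate, not exact.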
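Null Distribution
\[ \bar{X}_1 - \bar{X}_2 \sim N\!\left(0,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) \]
The second parameter is the standard error (s.e.) of \( \bar{X}_1 - \bar{X}_2 \).
But what if σ1² and σ2² are unknown?
Then use sample estimates s1² and s2²,
\[ \text{s.e.} \approx \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}, \]
with Z- or t-test, if n1 and n2 are large.
Later…
(But what if n1 and n2 are small?)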
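For reference, the large-sample test statistic and confidence interval implied by this null distribution can be written out as follows. The critical-value symbol \( z_{\alpha/2} \) is notation assumed here for the upper α/2 point of the standard normal distribution; it does not appear on the slide.

```latex
% Large-sample test of H0: mu_1 - mu_2 = 0, using the estimated standard error
\[
Z = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\text{reject } H_0 \text{ (two-sided) if } |Z| > z_{\alpha/2}
\]
% Corresponding (1 - alpha) confidence interval for mu_1 - mu_2
\[
(\bar{x}_1 - \bar{x}_2) \ \pm\ z_{\alpha/2} \sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}
\]
```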
Example: X = “$ Cost of a certain medical service”
Assume X is known to be normally distributed at each of k = 2 health care facilities (“groups”).
Hospital: X1 ~ N(μ1, σ1)
Clinic: X2 ~ N(μ2, σ2)
• Null Hypothesis H0: μ1 = μ2,
i.e., μ1 – μ2 = 0
(“No difference exists.”)
2-sided test at significance level α = .05
• Data
Sample 1: n1 = 137, x̄1 = 630, s1² = 788.5
Sample 2: n2 = 140, x̄2 = 546, s2² = 1663.0
NOTE: x̄1 – x̄2 = 84 > 0
Null Distribution
\[ \bar{X}_1 - \bar{X}_2 \sim N\!\left(0,\ \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\right) = N\!\left(0,\ \sqrt{\frac{788.5}{137} + \frac{1663.0}{140}}\right) \approx N(0,\ 4.2) \]
95% Margin of Error = (1.96)(4.2) = 8.232
95% Confidence Interval for μ1 – μ2:
(84 – 8.232, 84 + 8.232) = (75.768, 92.232), which does not contain 0
Z-score = (84 – 0) / 4.2 = 20 >> 1.96, so p << .05
Reject H0; extremely strong significant difference
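The arithmetic in this example can be reproduced from the summary statistics alone. This is a minimal R sketch; the variable names are chosen here for illustration, and it uses the unrounded standard error and `qnorm(0.975)` instead of 1.96, so its results differ slightly from the rounded slide values.

```r
# Large-sample two-sample Z-test from summary statistics (values from the example).
n1 <- 137; xbar1 <- 630; s1sq <- 788.5    # Sample 1 (hospital)
n2 <- 140; xbar2 <- 546; s2sq <- 1663.0   # Sample 2 (clinic)
alpha <- 0.05

diff <- xbar1 - xbar2                     # observed difference in means
se   <- sqrt(s1sq / n1 + s2sq / n2)       # estimated standard error
z    <- (diff - 0) / se                   # Z-score under H0: mu1 - mu2 = 0

z_crit  <- qnorm(1 - alpha / 2)           # two-sided critical value
ci      <- diff + c(-1, 1) * z_crit * se  # 95% confidence interval
p_value <- 2 * pnorm(-abs(z))             # two-sided p-value

print(c(se = se, z = z))
print(ci)
print(p_value)
```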
Consider two independent populations…and a random variable X, normally distributed in each.
POPULATION 1: X1 ~ N(μ1, σ1); Sample size n1, sample variance s1²
POPULATION 2: X2 ~ N(μ2, σ2); Sample size n2, sample variance s2²
Null Hypothesis
H0: μ1 = μ2, i.e., μ1 – μ2 = 0
(“No mean difference”)
Test at signif level α
Null Distribution
\[ \bar{X}_1 - \bar{X}_2 \sim N\!\left(0,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) \]
unknown σ1² and σ2², small n1 and n2
IF the two populations are equivariant, i.e., H0: σ1² = σ2², then conduct a t-test on the “pooled” samples.
H0: σ1² = σ2² vs. HA: σ1² ≠ σ2²
Test Statistic
\[ F = \frac{s_1^2}{s_2^2} \]
Sampling Distribution =?
Working Rule of Thumb
Acceptance Region for H0: ¼ < F < 4
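The rule of thumb can be set beside an exact calculation. Under normality and H0: σ1² = σ2², the ratio s1²/s2² follows an F distribution with n1 – 1 and n2 – 1 degrees of freedom. The R sketch below is one way to compare the two; `f_ratio_check` is a hypothetical helper name, and the inputs are illustrative values, not data from the slides.

```r
# Sketch: variance-ratio statistic, the 1/4 < F < 4 rule of thumb,
# and an exact two-sided p-value from the F distribution.
# f_ratio_check is a hypothetical helper written for this illustration.
f_ratio_check <- function(s1sq, s2sq, n1, n2) {
  f_stat <- s1sq / s2sq
  df1 <- n1 - 1
  df2 <- n2 - 1
  lower_tail <- pf(f_stat, df1, df2)   # P(F <= f_stat) under H0
  list(
    f_stat = f_stat,
    within_rule_of_thumb = (f_stat > 1 / 4) && (f_stat < 4),
    p_value = 2 * min(lower_tail, 1 - lower_tail)
  )
}

# Illustrative inputs (assumed, not from the slides).
res <- f_ratio_check(s1sq = 20, s2sq = 12, n1 = 8, n2 = 10)
print(res)
```

The rule of thumb ignores the sample sizes, so it can disagree with the exact p-value, particularly when n1 and n2 are large.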
Consider two independent populations…and a random variable X, normally distributed in each.
POPULATION 1: X1 ~ N(μ1, σ1)
POPULATION 2: X2 ~ N(μ2, σ2)
Null Hypothesis
H0: μ1 = μ2, i.e., μ1 – μ2 = 0
(“No mean difference”)
Test at signif level α
Null Distribution
\[ \bar{X}_1 - \bar{X}_2 \sim N\!\left(0,\ \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}\right) \]
unknown σ1² and σ2², small n1 and n2
IF equal variances H0: σ1² = σ2² is accepted, then estimate their common value with a “pooled” sample variance.
\[ s_{\text{pooled}}^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2} \]
The pooled variance is a weighted average of s1² and s2², using the degrees of freedom as the weights.
\[ \text{s.e.} = \sqrt{\frac{s_{\text{pooled}}^2}{n_1} + \frac{s_{\text{pooled}}^2}{n_2}} = s_{\text{pooled}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \]
IF equal variances H0: σ1² = σ2² is rejected, then use Satterthwaite Test, Welch Test, etc.
SEE LECTURE NOTES AND TEXTBOOK.
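The pooled variance and its standard error translate directly into code. The following is a minimal R sketch; `pooled_variance` and `pooled_se` are hypothetical helper names, and the inputs are illustrative values, not data from the slides.

```r
# Pooled sample variance: weighted average of s1^2 and s2^2,
# with the degrees of freedom (n1 - 1) and (n2 - 1) as weights.
# Both helpers are hypothetical names written for this illustration.
pooled_variance <- function(s1sq, s2sq, n1, n2) {
  ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)
}

# Standard error of xbar1 - xbar2 under the equal-variance assumption.
pooled_se <- function(s1sq, s2sq, n1, n2) {
  sqrt(pooled_variance(s1sq, s2sq, n1, n2) * (1 / n1 + 1 / n2))
}

# Illustrative inputs (assumed, not from the slides).
sp2 <- pooled_variance(s1sq = 20, s2sq = 12, n1 = 8, n2 = 10)
se  <- pooled_se(s1sq = 20, s2sq = 12, n1 = 8, n2 = 10)
print(c(pooled_variance = sp2, standard_error = se))
```

Because it is a weighted average, the pooled variance always falls between the two group variances.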
Example: Y = “$ Cost of a certain medical service”
Assume Y is known to be normally distributed at each of k = 2 health care facilities (“groups”).
Hospital: Y1 ~ N(μ1, σ1)
Clinic: Y2 ~ N(μ2, σ2)
• Null Hypothesis H0: μ1 = μ2,
i.e., μ1 – μ2 = 0
(“No difference exists.”)
2-sided test at significance level α = .05
• Data: Sample 1 = {667, 653, 614, 612, 604}; n1 = 5
Sample 2 = {593, 525, 520}; n2 = 3
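• Analysis via T-test (if equivariance holds):
Point estimates
“Group Means”
\[ \bar{y} = \frac{\sum y_i}{n}: \qquad \bar{y}_1 = \frac{667 + 653 + 614 + 612 + 604}{5} = 630, \qquad \bar{y}_2 = \frac{593 + 525 + 520}{3} = 546 \]
NOTE: ȳ1 – ȳ2 = 84 > 0
“Group Variances” s² = SS/df
\[ s_1^2 = \frac{(667 - 630)^2 + \cdots + (604 - 630)^2}{5 - 1} = 788.5, \qquad s_2^2 = \frac{(593 - 546)^2 + \cdots + (520 - 546)^2}{3 - 1} = 1663 \]
\[ F = \frac{1663}{788.5} = 2.11 < 4 \]
Pooled Variance
\[ s_{\text{pooled}}^2 = \frac{(5 - 1)(788.5) + (3 - 1)(1663)}{(5 - 1) + (3 - 1)} = \frac{SS}{df} = \frac{6480}{6} = 1080 \]
The pooled variance is a weighted average of the group variances, using the degrees of freedom as the weights.
Standard Error
\[ \text{s.e.} = \sqrt{s_{\text{pooled}}^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{1080 \left(\frac{1}{5} + \frac{1}{3}\right)} = 24 \]
\[ \text{p-value} = 2\,P(\bar{Y}_1 - \bar{Y}_2 \ge 84) = 2\,P\!\left(T_6 \ge \frac{84 - 0}{24}\right) = 2\,P(T_6 \ge 3.5) \]
> 2 * (1 - pt(3.5, 6))
[1] 0.01282634
Reject H0 at α = .05
stat signif, Hosp > Clinic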
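The hand calculation can be retraced step by step from the raw data. This is a minimal R sketch using only base functions; the intermediate variable names are chosen here for illustration.

```r
# Step-by-step pooled two-sample t-test on the example data.
y1 <- c(667, 653, 614, 612, 604)   # Sample 1 (hospital)
y2 <- c(593, 525, 520)             # Sample 2 (clinic)
n1 <- length(y1)
n2 <- length(y2)

diff <- mean(y1) - mean(y2)        # difference in group means
dof  <- n1 + n2 - 2                # pooled degrees of freedom

# Pooled variance: group variances weighted by their degrees of freedom.
sp2 <- ((n1 - 1) * var(y1) + (n2 - 1) * var(y2)) / dof
se  <- sqrt(sp2 * (1 / n1 + 1 / n2))

t_stat  <- (diff - 0) / se         # test statistic under H0
p_value <- 2 * pt(-abs(t_stat), dof)

print(c(diff = diff, sp2 = sp2, se = se, t = t_stat, df = dof))
print(p_value)
```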
R code:
> y1 = c(667, 653, 614, 612, 604)
> y2 = c(593, 525, 520)
>
> t.test(y1, y2, var.equal = TRUE)

        Two Sample t-test

data:  y1 and y2
t = 3.5, df = 6, p-value = 0.01283
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  25.27412 142.72588
sample estimates:
mean of x mean of y
      630       546

Formal Conclusion
p-value < α = .05
Reject H0 at this level.
Interpretation
The samples provide evidence that the difference between mean costs is (moderately) statistically significant, at the 5% level, with the hospital being higher than the clinic (by an average of $84).
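As a sensitivity check on the equal-variance assumption, the same data can be run through the Welch version of the test, which is what R's `t.test` uses when `var.equal` is left at its default of `FALSE`. This is only a sketch of that comparison; the slides defer unequal-variance methods to the lecture notes and textbook.

```r
# Compare the pooled (equal-variance) t-test with the Welch t-test
# on the example data.
y1 <- c(667, 653, 614, 612, 604)   # Sample 1 (hospital)
y2 <- c(593, 525, 520)             # Sample 2 (clinic)

pooled <- t.test(y1, y2, var.equal = TRUE)   # pooled-variance t-test
welch  <- t.test(y1, y2)                     # default: Welch approximation

print(c(pooled_t = unname(pooled$statistic), welch_t = unname(welch$statistic)))
print(c(pooled_df = unname(pooled$parameter), welch_df = unname(welch$parameter)))
print(c(pooled_p = pooled$p.value, welch_p = welch$p.value))
```

The Welch test replaces the pooled standard error with sqrt(s1²/n1 + s2²/n2) and uses approximate, generally non-integer, degrees of freedom.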
NEXT UP…
PAIRED MEANS
page 6.2-7, etc.