Chapter 1: Introduction to Statistics
COURSE: JUST 3900
TIPS FOR APLIA
Chapter 12: Analysis of Variance (ANOVA)
Developed By:
Ethan Cooper (Lead Tutor)
John Lohman
Michael Mattocks
Aubrey Urwick
Key Terms: Don’t Forget
Notecards
Factors (p. 388)
Levels (p. 388)
Testwise Alpha Level (p. 391)
Experimentwise Alpha Level (p. 391)
Error Term (p. 394)
Post Hoc Tests or Post Tests (p. 416)
ANOVA Notation
k is the number of treatment conditions.
n is the number of scores in each treatment condition.
N is the total number of scores in the entire study.
  N = kn when all samples are the same size.
T is the treatment total, calculated as ΣX: the sum of the scores in one treatment condition.
G is the sum of all scores in the study (the grand total).
  Calculate G by adding up all N scores or by adding the treatment totals (G = ΣT).
You will also need SS and M for each sample, and ΣX² for the entire set of scores.
Formulas
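F-ratio: $F = \frac{MS_{\text{between}}}{MS_{\text{within}}}$
SStotal: $SS_{\text{total}} = \sum X^2 - \frac{G^2}{N}$
SSwithin: $SS_{\text{within}} = SS_1 + SS_2 + SS_3 + \dots$
SSbetween: $SS_{\text{between}} = SS_{\text{total}} - SS_{\text{within}}$
SSbetween: $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N}$
dftotal: $df_{\text{total}} = N - 1$
dfwithin: $df_{\text{within}} = \sum (n - 1)$ or $df_{\text{within}} = N - k$
dfbetween: $df_{\text{between}} = k - 1$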
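A quick way to check your arithmetic with these formulas is that both SS and df partition into a between part and a within part. The first line below is just the SSbetween formula rearranged; the second is a standard identity that follows from the df formulas rather than something stated on the slide.

```latex
\begin{align*}
SS_{\text{total}} &= SS_{\text{between}} + SS_{\text{within}} \\
df_{\text{total}} &= df_{\text{between}} + df_{\text{within}}
                   = (k - 1) + (N - k) = N - 1
\end{align*}
```

If your computed values do not add up this way, one of the three pieces contains an error.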
More Formulas
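MSwithin: $MS_{\text{within}} = s^2_{\text{within}} = \frac{SS_{\text{within}}}{df_{\text{within}}}$
MSbetween: $MS_{\text{between}} = s^2_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}}$
Tukey’s HSD: $HSD = q\sqrt{\frac{MS_{\text{within}}}{n}}$
Scheffe Test: $F_{A \text{ versus } B} = \frac{MS_{\text{between}}}{MS_{\text{within}}}$
Effect Size: $\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}} = \frac{SS_{\text{between}}}{SS_{\text{total}}}$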
Hypothesis Testing with ANOVA
Question 1: A psychologist studied three computer
keyboard designs. Three samples of individuals were
given material to type on a particular keyboard, and the
number of errors committed by each participant was
recorded. The data are as follows:
Keyboard A    Keyboard B    Keyboard C
0             6             6
4             8             5
0             5             9
1             4             4
0             2             6
T =           T =           T =
SS =          SS =          SS =
N =     G =     ΣX² =
Hypothesis Testing with ANOVA
Question 1: Are these data sufficient to conclude that
there are significant differences in typing performance
among the three keyboard designs? Set alpha at
α = 0.05
Keyboard A    Keyboard B    Keyboard C
0             6             6
4             8             5
0             5             9
1             4             4
0             2             6
T = 5         T = 25        T = 30
SS = 12       SS = 20       SS = 14
N = 15     G = 60     ΣX² = 356
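The summary values in the table can be reproduced from the raw scores. The following is a minimal Python sketch, not part of the original slides; the helper names `treatment_total` and `sum_of_squares` are made up for illustration, and SS is computed with the computational formula SS = ΣX² − (ΣX)²/n.

```python
# Scores from Question 1, one list per keyboard (treatment condition).
scores = {
    "A": [0, 4, 0, 1, 0],
    "B": [6, 8, 5, 4, 2],
    "C": [6, 5, 9, 4, 6],
}

def treatment_total(xs):
    """T: the sum of the scores in one treatment condition."""
    return sum(xs)

def sum_of_squares(xs):
    """SS for one treatment: sum(X^2) - (sum X)^2 / n."""
    return sum(x * x for x in xs) - sum(xs) ** 2 / len(xs)

T = {name: treatment_total(xs) for name, xs in scores.items()}
SS = {name: sum_of_squares(xs) for name, xs in scores.items()}
N = sum(len(xs) for xs in scores.values())                   # total number of scores
G = sum(T.values())                                          # grand total
sum_x2 = sum(x * x for xs in scores.values() for x in xs)    # sum of squared scores

print("T:", T)
print("SS:", SS)
print("N:", N, "G:", G, "sum of X^2:", sum_x2)
```

Working from the raw scores like this is a useful check before moving on, because every later step depends on T, SS, G, and ΣX².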
Hypothesis Testing with ANOVA
Question 1 Answer:
Step 1: State the hypotheses.
H0: μ1 = μ2 = μ3 (Type of keyboard has no effect)
H1: At least one of the treatment means is different.
Hypothesis Testing with ANOVA
Question 1 Answer:
Step 2: Locate the critical region
$df_{\text{between}} = k - 1 = 3 - 1 = 2$
$df_{\text{within}} = N - k = 15 - 3 = 12$
For this problem df = 2, 12, and the critical value for α = 0.05 is F = 3.88.
If F-ratio ≤ Fcritical (3.88), then fail to reject H0.
If F-ratio > Fcritical (3.88), then reject H0.
Hypothesis Testing with ANOVA
Question 1 Answer:
Step 3: Perform the analysis.
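$SS_{\text{total}} = \sum X^2 - \frac{G^2}{N} = 356 - \frac{60^2}{15} = 356 - \frac{3600}{15} = 356 - 240 = 116$
$SS_{\text{within}} = SS_1 + SS_2 + SS_3 = 12 + 20 + 14 = 46$
$SS_{\text{between}} = SS_{\text{total}} - SS_{\text{within}} = 116 - 46 = 70$
or $SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2}{5} + \frac{25^2}{5} + \frac{30^2}{5} - \frac{60^2}{15} = 310 - 240 = 70$
$MS_{\text{between}} = s^2_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{70}{2} = 35$
$MS_{\text{within}} = s^2_{\text{within}} = \frac{SS_{\text{within}}}{df_{\text{within}}} = \frac{46}{12} = 3.83$
$F = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{35}{3.83} = 9.14$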
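The analysis in this step can be sketched in Python, starting from the summary values given with Question 1. This is an illustrative sketch rather than part of the slides, and the variable names are arbitrary. Because it carries MSwithin at full precision instead of rounding to 3.83, the F-ratio comes out near 9.13 rather than 9.14; the decision is unaffected.

```python
# Summary values from Question 1.
T = [5, 25, 30]      # treatment totals
SS = [12, 20, 14]    # SS inside each treatment
n = 5                # scores per treatment
k = len(T)           # number of treatment conditions
N = k * n            # total number of scores
G = sum(T)           # grand total
sum_x2 = 356         # sum of squared scores for the whole study

# Sums of squares.
ss_total = sum_x2 - G ** 2 / N
ss_within = sum(SS)
ss_between = sum(t ** 2 / n for t in T) - G ** 2 / N   # direct formula

# Degrees of freedom.
df_between = k - 1
df_within = N - k

# Mean squares and the F-ratio.
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_ratio = ms_between / ms_within

print(f"SS total = {ss_total}, SS within = {ss_within}, SS between = {ss_between}")
print(f"MS between = {ms_between}, MS within = {ms_within:.2f}, F = {f_ratio:.2f}")
```

Computing SSbetween with the direct formula and then comparing it with SStotal − SSwithin gives a built-in check on the arithmetic.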
Hypothesis Testing with ANOVA
Source     SS     df     MS
Between    70     2      35       F = 9.14
Within     46     12     3.83
Total      116    14
Hypothesis Testing with ANOVA
Question 1 Answer:
Step 4: Make a decision
If F-ratio ≤ Fcritical (3.88), then fail to reject H0.
If F-ratio > Fcritical (3.88), then reject H0.
F-ratio (9.14) > Fcritical (3.88). Therefore, we reject H0. The type of
keyboard used has a significant effect on the number of errors
committed.
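The decision rule can be written out as a short Python sketch. The critical value 3.88 is copied from the F table for df = 2, 12 rather than computed, and `decide` is a hypothetical helper name used only for illustration.

```python
k, N = 3, 15
f_ratio = 9.14       # obtained F-ratio from step 3
f_critical = 3.88    # table value for df = (2, 12) at alpha = 0.05

df_between = k - 1
df_within = N - k

def decide(f_ratio, f_critical):
    """Reject H0 only when the obtained F exceeds the critical value."""
    return "reject H0" if f_ratio > f_critical else "fail to reject H0"

decision = decide(f_ratio, f_critical)
print(f"df = ({df_between}, {df_within}), F = {f_ratio}: {decision}")
```

Note that an F-ratio exactly equal to the critical value falls on the "fail to reject" side, matching the ≤ in the rule.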
Computing Effect Size for ANOVA
Question 2: Compute the effect size (η²), the percentage of
variance explained, for the data that were analyzed in
Question 1.
Computing Effect Size for ANOVA
Question 2 Answer:
$\eta^2 = \frac{SS_{\text{between}}}{SS_{\text{between}} + SS_{\text{within}}} = \frac{SS_{\text{between}}}{SS_{\text{total}}} = \frac{70}{116} = 0.60 = 60\%$
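The same calculation as a minimal Python sketch, using the SS values from Question 1 (variable names are arbitrary):

```python
ss_between = 70
ss_within = 46

# Proportion of total variability explained by the treatment differences.
eta_squared = ss_between / (ss_between + ss_within)

print(f"eta squared = {eta_squared:.2f} ({eta_squared:.0%} of variance explained)")
```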
Post Hoc Tests
Question 3: For the data used in Question 1, perform a
post hoc test to determine which mean differences are
significant and which are not. Use both Tukey’s HSD and
the Scheffe Test.
Post Hoc Tests: Tukey’s HSD
Question 3 Answer:
$HSD = q\sqrt{\frac{MS_{\text{within}}}{n}}$
1. Find q. q = 3.77 (Table B.5, p. 708)
2. $HSD = q\sqrt{\frac{MS_{\text{within}}}{n}} = 3.77\sqrt{\frac{3.83}{5}} = 3.77\sqrt{0.766} = 3.30$
3. Thus, the mean difference between any two samples must be at least 3.30 to be significant.
4. Find the means for each treatment.
   1. $M_A = \frac{\sum X}{n} = \frac{5}{5} = 1$
   2. $M_B = \frac{\sum X}{n} = \frac{25}{5} = 5$
   3. $M_C = \frac{\sum X}{n} = \frac{30}{5} = 6$
Post Hoc Tests: Tukey’s HSD
Question 3 Answer:
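$HSD = q\sqrt{\frac{MS_{\text{within}}}{n}} = 3.30$
5. $M_A - M_B = 1 - 5 = -4$. The absolute difference (4) exceeds HSD, so Treatment A is significantly different from Treatment B.
6. $M_A - M_C = 1 - 6 = -5$. The absolute difference (5) exceeds HSD, so Treatment A is significantly different from Treatment C.
7. $M_B - M_C = 5 - 6 = -1$. The absolute difference (1) is smaller than HSD, so Treatment B is not significantly different from Treatment C.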
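The Tukey comparisons can be sketched in Python as follows. The value of q is read from the studentized range table, not computed, and MSwithin is carried at full precision (46/12); the names used here are illustrative only.

```python
from itertools import combinations
from math import sqrt

q = 3.77               # table value for k = 3 treatments, df within = 12, alpha = 0.05
ms_within = 46 / 12    # MS within from the ANOVA
n = 5                  # scores per treatment
means = {"A": 1.0, "B": 5.0, "C": 6.0}

# Minimum mean difference needed for significance.
hsd = q * sqrt(ms_within / n)

# Compare the absolute difference for every pair of treatments against HSD.
significant = {
    (a, b): abs(means[a] - means[b]) >= hsd
    for a, b in combinations(means, 2)
}

print(f"HSD = {hsd:.2f}")
for (a, b), flag in significant.items():
    diff = abs(means[a] - means[b])
    print(f"{a} vs {b}: |difference| = {diff:.0f}, significant = {flag}")
```

Only the size of the difference matters for the comparison, which is why the sketch uses absolute values.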
Post Hoc Tests: Scheffe Test
Question 3 Answer:
First, compute SSbetween for Treatments A and B.
$SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2}{5} + \frac{25^2}{5} - \frac{30^2}{10} = 5 + 125 - 90 = 40$
Notice: G is equal to the total of Treatments A and B, not A, B, and C.
Similarly, N is equal to nA + nB.
Now, find MSbetween. For dfbetween, use k − 1 = 2, where k = 3 is the number of treatments in the full study.
$MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{40}{2} = 20$
$F_{A \text{ versus } B} = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{20}{3.83} = 5.22$
For df = 2, 12 and α = 0.05, the critical value for F is 3.88. Our
obtained F-ratio is in the critical region, so we conclude
that these data show a significant difference between treatment A
and treatment B.
Post Hoc Tests: Scheffe Test
Question 3 Answer:
First, compute SSbetween for Treatments A and C.
$SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{5^2}{5} + \frac{30^2}{5} - \frac{35^2}{10} = 5 + 180 - 122.5 = 62.5$
Notice: G is equal to the total of Treatments A and C, not A, B, and C.
Similarly, N is equal to nA + nC.
Now, find MSbetween. For dfbetween, use k − 1 = 2, where k = 3 is the number of treatments in the full study.
$MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{62.5}{2} = 31.25$
$F_{A \text{ versus } C} = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{31.25}{3.83} = 8.16$
For df = 2, 12 and α = 0.05, the critical value for F is 3.88. Our
obtained F-ratio is in the critical region, so we conclude
that these data show a significant difference between treatment A
and treatment C.
Post Hoc Tests: Scheffe Test
Question 3 Answer:
First, compute SSbetween for Treatments B and C.
$SS_{\text{between}} = \sum \frac{T^2}{n} - \frac{G^2}{N} = \frac{25^2}{5} + \frac{30^2}{5} - \frac{55^2}{10} = 125 + 180 - 302.5 = 2.5$
Notice: G is equal to the total of Treatments B and C, not A, B, and C.
Similarly, N is equal to nB + nC.
Now, find MSbetween. For dfbetween, use k − 1 = 2, where k = 3 is the number of treatments in the full study.
$MS_{\text{between}} = \frac{SS_{\text{between}}}{df_{\text{between}}} = \frac{2.5}{2} = 1.25$
$F_{B \text{ versus } C} = \frac{MS_{\text{between}}}{MS_{\text{within}}} = \frac{1.25}{3.83} = 0.33$
For df = 2, 12 and α = 0.05, the critical value for F is 3.88. Our
obtained F-ratio is not in the critical region, so we
conclude that these data show no significant difference between
treatment B and treatment C.
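All three Scheffe comparisons follow the same recipe, so they can be sketched as one small Python function. This is an illustrative sketch with a made-up helper name, `scheffe_f`; it assumes equal sample sizes and uses MSwithin at full precision (46/12), so second decimals can differ slightly from the hand calculations (for example, about 8.15 rather than 8.16 for A versus C).

```python
from itertools import combinations

T = {"A": 5, "B": 25, "C": 30}   # treatment totals from Question 1
n = 5                            # scores per treatment
k = len(T)                       # treatments in the full study
ms_within = 46 / 12              # MS within from the overall ANOVA
f_critical = 3.88                # table value for df = (2, 12), alpha = 0.05

def scheffe_f(t1, t2):
    """F-ratio for one pair of treatments under the Scheffe procedure."""
    g = t1 + t2                                   # G uses only the two treatments
    ss_between = t1 ** 2 / n + t2 ** 2 / n - g ** 2 / (2 * n)
    ms_between = ss_between / (k - 1)             # df between still uses the full k
    return ms_between / ms_within

results = {(a, b): scheffe_f(T[a], T[b]) for a, b in combinations(T, 2)}
significant = {pair: f > f_critical for pair, f in results.items()}

for pair, f in results.items():
    print(f"{pair[0]} versus {pair[1]}: F = {f:.2f}, significant = {significant[pair]}")
```

Keeping k − 1 from the full study as dfbetween, and MSwithin from the overall ANOVA as the error term, is what makes the Scheffe test conservative.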
Assumptions for ANOVA
Question 4: What three assumptions are required for
ANOVA?
Assumptions for ANOVA
Question 4 Answer:
1. The observations within each sample must be independent.
2. The populations from which the samples are selected must be normal.
3. The populations from which the samples are selected must have equal variances (homogeneity of variance).
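As an informal look at the third assumption, the sample variances for the Question 1 data can be compared directly. This sketch is an addition, not part of the slides: it only computes s² = SS/(n − 1) for each sample and the ratio of the largest to the smallest variance, and it does not apply any formal test or cutoff.

```python
SS = {"A": 12, "B": 20, "C": 14}   # SS for each keyboard sample from Question 1
n = 5                              # scores per sample

# Sample variance for each treatment: s^2 = SS / (n - 1).
variances = {name: ss / (n - 1) for name, ss in SS.items()}

# Ratio of the largest to the smallest sample variance (1.0 means identical).
ratio = max(variances.values()) / min(variances.values())

print("sample variances:", variances)
print(f"largest / smallest = {ratio:.2f}")
```

A ratio close to 1 is consistent with similar population variances, but judging whether a given ratio is acceptable requires a formal test with its own critical values.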