Power Fifteen Analysis of Variance (ANOVA) 1

Power Fifteen
Analysis of Variance (ANOVA)
Analysis of Variance

One-Way ANOVA
• Tabular
• Regression

Two-Way ANOVA
• Tabular
• Regression
One-Way ANOVA



Apple Juice Concentrate Example, Data File xm 15-01
New product
Try 3 different advertising strategies, one in each of three cities
• City 1: convenience of use
• City 2: quality of product
• City 3: price

Record Weekly Sales
Advertising Strategies & Weekly Sales for 20 Weeks

Convenience    Quality        Price
529            804            672
658            630            531
793            774            443
-              -              -
614            624            532
Mean: 577.55   Mean: 653.0    Mean: 608.65
Is There a Significant Difference in Average Sales?
Null Hypothesis, $H_0: \mu_1 = \mu_2 = \mu_3$
Alternative Hypothesis: $\mu_1 \neq \mu_2$, $\mu_1 \neq \mu_3$, or $\mu_2 \neq \mu_3$ (at least two means differ)
Table 3: 1-Way ANOVA of Apple Juice Sales By Advertising Strategy

Source of Variation               Sum of Squares                                                     Degrees of Freedom   Mean Square
Explained (between treatments)    $ESS = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2$                   k-1                  ESS/(k-1)
Unexplained (within treatments)   $USS = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2$      n-k                  USS/(n-k)
Total                             $TSS = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x})^2$        n-1

$F_{k-1,\,n-k} = [ESS/(k-1)]/[USS/(n-k)]$
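The formulas in Table 3 can be sketched in a few lines of Python. This is only an illustration: the function name `one_way_anova` and the toy data are assumptions made for the example, not part of the lecture or of data file xm 15-01.

```python
def one_way_anova(groups):
    """Return (ESS, USS, TSS, F) for a list of samples, one list per treatment."""
    k = len(groups)                                   # number of treatments
    n = sum(len(g) for g in groups)                   # total observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]         # treatment means
    # Explained: treatment means around the grand mean, weighted by n_j
    ess = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Unexplained: observations around their own treatment mean
    uss = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    # Total: observations around the grand mean
    tss = sum((x - grand_mean) ** 2 for g in groups for x in g)
    f_stat = (ess / (k - 1)) / (uss / (n - k))
    return ess, uss, tss, f_stat

# Hypothetical toy data (three treatments, three observations each)
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
ess, uss, tss, f_stat = one_way_anova(toy)
print(ess, uss, tss, f_stat)
```

Note that ESS + USS = TSS, which is the decomposition the table rests on.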
Apple Juice Concentrate ANOVA

Source of Variation               Sum of Squares      Degrees of Freedom   Mean Square
Explained (Between Treatments)    ESS = 57,512.23     k-1 = 2              ESS/(k-1) = 28,756.12
Unexplained (Within Treatments)   USS = 506,984       n-k = 57             USS/(n-k) = 8,894.45
Total                             TSS = 564,496       n-1 = 59

F2, 57 = 28,756.12/8,894.45 = 3.23
F-Distribution Test of the Null Hypothesis of No Difference in Mean Sales with Advertising Strategy
Figure 2: F-Distribution Density For 2 DOF, 57 DOF (vertical axis: density, 0.0 to 1.0; horizontal axis: F variable, 0 to 10)
F2, 60 (critical) @ 5% = 3.15
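As a rough check on the test, the F ratio and its tail probability can be computed directly from the sums of squares in the ANOVA table. The sketch below relies on a closed form that holds only when the numerator has 2 degrees of freedom, namely P(F > f) = (1 + 2f/d2)^(-d2/2); the helper names `f_sf_2` and `f_crit_2` are made up for this example.

```python
def f_sf_2(f, d2):
    """P(F > f) for an F(2, d2) variable; valid only for 2 numerator dof."""
    return (1.0 + 2.0 * f / d2) ** (-d2 / 2.0)

def f_crit_2(alpha, d2):
    """Critical value of F(2, d2) at level alpha (inverse of f_sf_2)."""
    return (d2 / 2.0) * (alpha ** (-2.0 / d2) - 1.0)

# Sums of squares from the apple juice one-way ANOVA table
ess, uss = 57512.23, 506984.0
k, n = 3, 60

f_stat = (ess / (k - 1)) / (uss / (n - k))   # F with (2, 57) dof
p_value = f_sf_2(f_stat, n - k)              # tail probability
crit_60 = f_crit_2(0.05, 60)                 # tabulated dof used on the slide
crit_57 = f_crit_2(0.05, 57)                 # exact denominator dof

print(round(f_stat, 2), round(p_value, 4), round(crit_60, 2), round(crit_57, 2))
```

Since the F ratio exceeds the 5% critical value, the null hypothesis of equal mean sales is rejected at the 5% level, though not by a wide margin.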
One-Way ANOVA and Regression
Regression Set-Up: y(1) is a column of 20 sales observations for city 1, 1 is a column of 20 ones, 0 is a column of 20 zeros. Regression of a quantitative variable on three dummies:

y(1)     1 0 0
y(2)  =  0 1 0
y(3)     0 0 1

Y = C(1)*Dummy(city 1) + C(2)*Dummy(city 2) + C(3)*Dummy(city 3) + e
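A minimal sketch of this set-up, assuming NumPy and a small made-up data set (three observations per city rather than the 20 in the lecture), shows that regressing on one dummy per city with no constant returns the city means as coefficients.

```python
import numpy as np

# Hypothetical toy sales for three cities (not the xm 15-01 data)
y1 = np.array([10.0, 12.0, 14.0])
y2 = np.array([20.0, 21.0, 25.0])
y3 = np.array([15.0, 15.0, 18.0])
y = np.concatenate([y1, y2, y3])

# One dummy column per city, no constant: rows of the 3x3 identity
city = np.repeat([0, 1, 2], 3)
X = np.eye(3)[city]

# Least squares fit of y on the three dummies
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
group_means = np.array([y1.mean(), y2.mean(), y3.mean()])
print(coef, group_means)
```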
One-Way ANOVA and Regression
Table 5: One-Way ANOVA Estimated Using Regression

Dependent Variable: SALESAJ
Method: Least Squares
Sample: 1 60
Included observations: 60

Variable       Coefficient   Std. Error   t-Statistic   Prob.
CONVENIENCE    577.5500      21.08844     27.38704      0.0000
QUALITY        653.0000      21.08844     30.96483      0.0000
PRICE          608.6500      21.08844     28.86178      0.0000

R-squared            0.101882    Mean dependent var      613.0667
Adjusted R-squared   0.070370    S.D. dependent var      97.81474
S.E. of regression   94.31038    Akaike info criterion   11.97977
Sum squared resid    506983.5    Schwarz criterion       12.08448
Log likelihood       -356.3930   F-statistic             3.233041
Durbin-Watson stat   1.525930    Prob(F-statistic)       0.046773

Regression coefficients are the city means; the F statistic equals the one-way ANOVA F of 3.23
Table 6: Test of the Null Hypothesis: All Treatment Means Are Equal

Wald Test:
Equation: Untitled
Null Hypothesis:   C(1)=C(3)
                   C(2)=C(3)

F-statistic   3.233041    Probability   0.046773
Chi-square    6.466083    Probability   0.039437
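The Wald statistic for the two restrictions C(1)=C(3) and C(2)=C(3) can be sketched with NumPy as below. The data are a made-up toy sample, and the variable names are chosen for the example; the point is only that the chi-square form is q times the F form (here q = 2, matching 6.466083 = 2 × 3.233041 in Table 6) and that the Wald F coincides with the one-way ANOVA F.

```python
import numpy as np

# Hypothetical toy data: three treatments, three observations each
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
y = np.concatenate(groups)
X = np.kron(np.eye(3), np.ones((3, 1)))     # 9x3 block of treatment dummies
n, k = X.shape

b = np.linalg.solve(X.T @ X, X.T @ y)       # OLS coefficients = treatment means
resid = y - X @ b
s2 = resid @ resid / (n - k)                # residual variance, USS/(n-k)

# Restrictions R b = 0: c(1) - c(3) = 0 and c(2) - c(3) = 0
R = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0]])
q = R.shape[0]
Rb = R @ b
cov_Rb = s2 * R @ np.linalg.inv(X.T @ X) @ R.T
chi_square = float(Rb @ np.linalg.solve(cov_Rb, Rb))
wald_f = chi_square / q

# One-way ANOVA F from the same data, for comparison
grand = y.mean()
ess = sum(len(g) * (np.mean(g) - grand) ** 2 for g in groups)
anova_f = (ess / (k - 1)) / s2
print(wald_f, chi_square, anova_f)
```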
Anova and Regression: One-Way Interpretation

Salesaj = c(1)*convenience + c(2)*quality + c(3)*price + e
E[salesaj | convenience=1, quality=0, price=0] = c(1) = mean for city(1)
• c(1) = mean for city(1) (convenience)
• c(2) = mean for city(2) (quality)
• c(3) = mean for city(3) (price)
• Test the null hypothesis that the means are equal using a Wald test: c(1) = c(2) = c(3)
Anova and Regression: One-Way
Alternative Specification: Drop Price

Salesaj = c(1) + c(2)*convenience + c(3)*quality + e
E[Salesaj | convenience=0, quality=0] = c(1) = mean for city(3) (price, the omitted one)
E[Salesaj | convenience=1, quality=0] = c(1) + c(2) = mean for city(1) (convenience)
• so mean for city(1) = c(1) + c(2)
• so mean for city(1) = mean for city(3) + c(2)
• and so c(2) = mean for city(1) - mean for city(3)
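The reparameterization above can be illustrated with a short NumPy sketch on made-up data (three observations per city, not the lecture's sales figures): with a constant and the price dummy dropped, the constant picks up the mean of the omitted city and each slope is a difference in means.

```python
import numpy as np

# Hypothetical toy sales, three observations per city (not the xm 15-01 data)
y = np.array([10.0, 12.0, 14.0, 20.0, 21.0, 25.0, 15.0, 15.0, 18.0])
city = np.repeat([1, 2, 3], 3)

# Constant plus dummies for city 1 (convenience) and city 2 (quality);
# city 3 (price) is the omitted category
X = np.column_stack([
    np.ones(len(y)),
    (city == 1).astype(float),
    (city == 2).astype(float),
])
c, *_ = np.linalg.lstsq(X, y, rcond=None)

means = {j: y[city == j].mean() for j in (1, 2, 3)}
print(c)        # constant, then the two differences in means
print(means)
```

Applied to the city means reported in the slides, the same algebra gives c(1) = 608.65, c(2) = 577.55 - 608.65 = -31.10 and c(3) = 653.00 - 608.65 = 44.35.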
Anova and Regression: One-Way
Alternative Specification

Salesaj = c(1) + c(2)*convenience + c(3)*quality + e
• Test that the mean for city(1) = mean for city(3)
• Using the t-statistic for c(2)
$H_0: \bar{x}_1 = \bar{x}_3$, i.e. $H_0: \bar{x}_1 - \bar{x}_3 = 0$
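For a sense of scale, the t-statistic for c(2) can be worked out by hand from numbers already reported in Table 5. The sketch assumes the standard result that, with equal group sizes, the standard error of a difference between two group means is s·sqrt(1/n + 1/n), where s is the S.E. of regression; the variable names are illustrative only.

```python
import math

# Values taken from Table 5
mean_convenience = 577.55
mean_price = 608.65
s = 94.31038          # S.E. of regression (57 degrees of freedom)
n_per_city = 20       # weekly observations per city

c2 = mean_convenience - mean_price                       # difference in means
se_c2 = s * math.sqrt(1 / n_per_city + 1 / n_per_city)   # s.e. of the difference
t_stat = c2 / se_c2

print(round(c2, 2), round(se_c2, 2), round(t_stat, 2))
```

The absolute t value is well below 2, so on these figures the convenience and price cities do not differ significantly at the 5% level.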
Anova and Regression: One-Way
Alternative Specification, Drop Quality

Salesaj = c(1) + c(2)*convenience + c(3)*price + e
E[Salesaj | convenience=0, price=0] = c(1) = mean for city(2) (quality, the omitted one)
E[Salesaj | convenience=1, price=0] = c(1) + c(2) = mean for city(1) (convenience)
• so mean for city(1) = c(1) + c(2)
• and so mean for city(1) = mean for city(2) + c(2)
• so c(2) = mean for city(1) - mean for city(2)
Anova and Regression: One-Way
Alternative Specification, Drop Quality

Salesaj = c(1) + c(2)*convenience + c(3)*price + e
• Test that the mean for city(1) = mean for city(2)
• Using the t-statistic for c(2)
Two-Way ANOVA

Apple Juice Concentrate
Two Factors
• 3 advertising strategies
• 2 advertising media: TV & Newspapers

6 cities
• City 1: convenience on TV
• City 2: convenience in Newspapers
• City 3: quality on TV
• Etc.
Advertising Strategies In Two Media: Weekly Sales
Table 7: Apple Juice Concentrate Sales in Six Cities

City 1   City 2   City 3   City 4   City 5   City 6
491      464      677      689      575      803
712      559      627      650      614      584
558      759      590      704      706      525
447      557      632      652      484      498
479      528      683      576      478      812
624      670      760      836      650      565
546      534      690      628      583      708
444      657      548      798      536      546
582      557      579      497      579      616
672      474      644      841      795      587
Mean Weekly Sales By Strategy and Medium
Table 9: Mean Weekly Sales, Apple Juice Concentrate, Six Cities

              Convenience     Quality         Price
Television    city 1: 555.5   city 3: 643     city 5: 600
Newspapers    city 2: 575.9   city 4: 687.1   city 6: 624.4
Figure 3: Mean Apple Juice Sales by Advertising Strategy and Medium (vertical axis: average sales, 0 to 700; horizontal axis: strategy, with convenience, quality, price; series: television, newspapers)
Figure: Average Weekly Sales By Strategy & Medium (vertical axis: average sales, 400 to 750; horizontal axis: convenience, quality, price; series: newspapers, television)
Is There Any Difference In Mean Sales Among the Six Cities?
Table 8: 1-Way ANOVA of Apple Juice Sales, Six Cities

Source of Variation               Sum of Squares    Degrees of Freedom   Mean Square
Explained (between treatments)    ESS = 113,620     k-1 = 5              ESS/(k-1) = 22,724
Unexplained (within treatments)   USS = 501,137     n-k = 54             USS/(n-k) = 9,280
Total                             TSS = 614,757     n-1 = 59

F5, 54 = (22,724/9,280) = 2.45, critical value at 5% = 2.38
Table of ANOVA for Two-Way
Table 10: Schematic For 2-Way ANOVA of Apple Juice Sales

Source of Variation               Sum of Squares      Degrees of Freedom   Mean Square
Explained (between treatments)    ESS                 k-1                  ESS/(k-1)
  Strategy                        ESS(Strategy)       a-1                  ESS(Strat.)/(a-1)
  Medium                          ESS(Medium)         b-1                  ESS(Med)/(b-1)
  Interaction                     ESS(Interaction)    (a-1)(b-1)           ESS(I)/[(a-1)(b-1)]
Unexplained (within treatments)   USS                 n-ab                 USS/(n-ab)
Total                             TSS                 n-1
Formulas For Sums of Squares

$TSS = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{r} (x_{ijk} - \bar{x})^2$

a is the # of treatments for strategies = 3
b is the # of treatments for media = 2
r is the # of replicates or observations = 10

The Grand Mean:
$\bar{x} = \{\sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{r} x_{ijk}\}/n$
Formulas For Sums of Squares (Cont.)

$ESS(Strategy) = r b \sum_{i=1}^{a} (\bar{x}_i^{S} - \bar{x})^2$

Where the mean for treatment i, strategy, is:
$\bar{x}_i^{S} = \{\sum_{j=1}^{b} \sum_{k=1}^{r} x_{ijk}\}/(r b)$
Formulas For Sums of Squares (Cont.)

$ESS(Medium) = r a \sum_{j=1}^{b} (\bar{x}_j^{M} - \bar{x})^2$

Where the mean for treatment j, medium, is:
$\bar{x}_j^{M} = \{\sum_{i=1}^{a} \sum_{k=1}^{r} x_{ijk}\}/(r a)$
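Formulas For Sums of Squares (Cont.)

$ESS(Interaction) = r \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}_{ij}^{SM} - \bar{x}_i^{S} - \bar{x}_j^{M} + \bar{x})^2$

$USS = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{r} (x_{ijk} - \bar{x}_{ij})^2$

Where $\bar{x}_{ij}$ (written $\bar{x}_{ij}^{SM}$ in the interaction formula) is the mean for each city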
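The sums-of-squares formulas can be checked against the weekly sales in Table 7. The sketch below assumes NumPy and arranges the data as x[i, j, k] with i = strategy (convenience, quality, price), j = medium (television, newspapers) and k = week, following the city layout of Table 9; the variable names are chosen for this example.

```python
import numpy as np

# Weekly sales from Table 7, one row per city (cities 1 to 6)
cities = np.array([
    [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],  # city 1: convenience, TV
    [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],  # city 2: convenience, newspapers
    [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],  # city 3: quality, TV
    [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],  # city 4: quality, newspapers
    [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],  # city 5: price, TV
    [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],  # city 6: price, newspapers
], dtype=float)

a, b, r = 3, 2, 10
x = cities.reshape(a, b, r)          # x[i, j, k]

grand = x.mean()                     # grand mean
cell = x.mean(axis=2)                # city (cell) means, shape (a, b)
strat = x.mean(axis=(1, 2))          # strategy means, shape (a,)
med = x.mean(axis=(0, 2))            # medium means, shape (b,)

tss = ((x - grand) ** 2).sum()
ess_strategy = r * b * ((strat - grand) ** 2).sum()
ess_medium = r * a * ((med - grand) ** 2).sum()
ess_interaction = r * ((cell - strat[:, None] - med[None, :] + grand) ** 2).sum()
uss = ((x - cell[:, :, None]) ** 2).sum()

print(round(ess_strategy, 1), round(ess_medium, 1), round(ess_interaction, 1),
      round(uss, 1), round(tss, 2))
```

The four components add up to TSS because the design is balanced, with 10 observations in every city.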
Table of Two-Way ANOVA for Apple Juice Sales
Table 11: 2-Way ANOVA of Apple Juice Sales

Source of Variation               Sum of Squares          Degrees of Freedom     Mean Square
Explained (between treatments)    ESS =
  Strategy                        ESS(Strat) = 98838.6    (a-1) = 2              49419.3
  Medium                          ESS(Med) = 13172.0      (b-1) = 1              13172.0
  Interaction                     ESS(I) = 1609.6         (a-1)(b-1) = 2         804.8
Unexplained (within treatments)   USS = 501136.7          (n-ab) = 60 - 6 = 54   9280.3
Total                             TSS = 614756.98         (n-1) = 59
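F-Distribution Tests
Test for Interaction:
F2, 54 = 804.8/9280.3 = 0.09, well below the 5% critical value, so there is no evidence of interaction
Test for Advertising Medium:
F1, 54 = 13172/9280.3 = 1.42, and the critical value at the 5% level is 4.02, so the medium effect is not significant
Test for Advertising Strategy:
F2, 54 = 49419.3/9280.3 = 5.32, with a critical value of 3.17 at the 5% level, so the strategy effect is significant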
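The three F ratios and the resulting decisions can be reproduced from Table 11 with a few lines of arithmetic. This is only a sketch: the dictionary layout is an assumption for the example, and the interaction test reuses the 3.17 critical value quoted for the strategy test because both have (2, 54) degrees of freedom.

```python
# Mean squares and degrees of freedom from Table 11
uss, dof_within = 501136.7, 54
mse = uss / dof_within                      # unexplained mean square

# (mean square, 5% critical value quoted in the slides)
effects = {
    "interaction": (1609.6 / 2, 3.17),      # F(2, 54)
    "medium": (13172.0 / 1, 4.02),          # F(1, 54)
    "strategy": (98838.6 / 2, 3.17),        # F(2, 54)
}

f_ratios = {name: ms / mse for name, (ms, _) in effects.items()}
reject = {name: f_ratios[name] > crit for name, (_, crit) in effects.items()}

for name in effects:
    print(name, round(f_ratios[name], 3), "reject H0" if reject[name] else "do not reject H0")
```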
Two-Way ANOVA and Regression

With Two-Way ANOVA you cannot include all 3 dummy variables for strategy and both dummy variables for media, even without a constant, because each set of dummies sums to a column of ones, so a different specification is needed.
You need to drop one of the strategy variables and one of the media variables and include the constant.
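The collinearity problem can be seen by checking the rank of the design matrix, here with one row per city. The sketch assumes NumPy; the column order (convenience, quality, price, television, newspapers) is chosen for the example.

```python
import numpy as np

# One row per city: convenience, quality, price, television, newspapers
cells = np.array([
    [1, 0, 0, 1, 0],   # city 1
    [1, 0, 0, 0, 1],   # city 2
    [0, 1, 0, 1, 0],   # city 3
    [0, 1, 0, 0, 1],   # city 4
    [0, 0, 1, 1, 0],   # city 5
    [0, 0, 1, 0, 1],   # city 6
], dtype=float)

# All five dummies: strategy dummies and media dummies both sum to one,
# so the five columns are linearly dependent
rank_all_dummies = np.linalg.matrix_rank(cells)

# Constant + convenience + quality + television (price and newspapers dropped)
reduced = np.hstack([np.ones((6, 1)), cells[:, [0, 1, 3]]])
rank_reduced = np.linalg.matrix_rank(reduced)

print(rank_all_dummies, cells.shape[1])    # rank is below the column count
print(rank_reduced, reduced.shape[1])      # full column rank
```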
Regression Set-Up

         Convenience dummy   Quality dummy   TV dummy   constant
y(1)     1                   0               1          1
y(2)     1                   0               0          1
y(3)     0                   1               1          1
y(4)     0                   1               0          1
y(5)     0                   0               1          1
y(6)     0                   0               0          1
SALESAPJ   CONVENIENCE   QUALITY   PRICE   TELEVISION   NEWSPAPERS
491        1             0         0       1            0
712        1             0         0       1            0
558        1             0         0       1            0
447        1             0         0       1            0
479        1             0         0       1            0
624        1             0         0       1            0
546        1             0         0       1            0
444        1             0         0       1            0
582        1             0         0       1            0
672        1             0         0       1            0
464        1             0         0       0            1
559        1             0         0       0            1
759        1             0         0       0            1
557        1             0         0       0            1
528        1             0         0       0            1
670        1             0         0       0            1
ANOVA and Regression: Two-Way
Series of Regressions; Compare to Table 11, Lecture 15

Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + c(5)*convenience*television + c(6)*quality*television + e, SSR = 501,136.7
Salesaj = c(1) + c(2)*convenience + c(3)*quality + c(4)*television + e, SSR = 502,746.3
Test for interaction effect: F2, 54 = [(502746.3 - 501136.7)/2]/(501136.7/54) = (1609.6/2)/9280.3 = 0.09
Dependent Variable: SALESAPJ
Method: Least Squares
Sample: 1 60
Included observations: 60

Variable                 Coefficient   Std. Error   t-Statistic   Prob.
CONVENIENCE              -48.50000     43.08204     -1.125759     0.2652
QUALITY                  62.70000      43.08204     1.455363      0.1514
TELEVISION               -24.40000     43.08204     -0.566361     0.5735
C                        624.4000      30.46360     20.49659      0.0000
CONVENIENCE*TELEVISION   4.000000      60.92720     0.065652      0.9479
QUALITY*TELEVISION       -19.70000     60.92720     -0.323337     0.7477

R-squared            0.184821    Mean dependent var      614.3167
Adjusted R-squared   0.109342    S.D. dependent var      102.0765
S.E. of regression   96.33436    Akaike info criterion   12.06817
Sum squared resid    501136.7    Schwarz criterion       12.27760
Log likelihood       -356.0450   F-statistic             2.448631
Durbin-Watson stat   2.452725    Prob(F-statistic)       0.045165
Dependent Variable: SALESAPJ
Method: Least Squares
Sample: 1 60
Included observations: 60

Variable       Coefficient   Std. Error   t-Statistic   Prob.
CONVENIENCE    -46.50000     29.96267     -1.551931     0.1263
QUALITY        52.85000      29.96267     1.763862      0.0832
TELEVISION     -29.63333     24.46441     -1.211283     0.2309
C              627.0167      24.46441     25.62974      0.0000

R-squared            0.182203    Mean dependent var      614.31
Adjusted R-squared   0.138393    S.D. dependent var      102.0765
S.E. of regression   94.75027    Akaike info criterion   12.00471
Sum squared resid    502746.3    Schwarz criterion       12.14433
Log likelihood       -356.1412   F-statistic             4.158888
Durbin-Watson stat   2.456222    Prob(F-statistic)
ANOVA By Difference

Regression with interaction terms, USS = 501,136.7
Regression dropping interaction terms, USS = 502,746.3
Difference is 1,609.6 and is the sum of squares explained by interaction terms
F-test of the interaction terms: F2, 54 = [1609.6/2]/[501,136.7/54]
ANOVA and Regression: Two-Way
Series of Regressions

Salesaj = c(1) + c(2)*convenience + c(3)*quality + e, SSR = 515,918.3
Test for media effect: F1, 54 = [(515918.3 - 502746.3)/1]/(501136.7/54) = 13172/9280.3 = 1.42
Salesaj = c(1) + e, SSR = 614,757
Test for strategy effect: F2, 54 = [(614757 - 515918.3)/2]/(501136.7/54) = (98838.7/2)/(9280.3) = 5.32
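The series of regressions can be reproduced from the Table 7 sales with ordinary least squares. The following is a sketch under a few assumptions: NumPy's `lstsq` stands in for the regression package used in the lecture, the helper `ssr` is a name made up for this example, and price and newspapers are the omitted categories as in the regression output.

```python
import numpy as np

# Weekly sales from Table 7, one row per city (cities 1 to 6)
cities = np.array([
    [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
    [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],
    [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
    [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],
    [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
    [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],
], dtype=float)
y = cities.ravel()                                   # city 1 first, then city 2, ...

# Dummies: 10 weeks per city, in city order
conv = np.repeat([1, 1, 0, 0, 0, 0], 10).astype(float)
qual = np.repeat([0, 0, 1, 1, 0, 0], 10).astype(float)
tv = np.repeat([1, 0, 1, 0, 1, 0], 10).astype(float)
const = np.ones_like(y)

def ssr(*columns):
    """Sum of squared residuals from regressing y on the given columns."""
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

ssr_full = ssr(const, conv, qual, tv, conv * tv, qual * tv)   # with interactions
ssr_additive = ssr(const, conv, qual, tv)                     # drop interactions
ssr_strategy = ssr(const, conv, qual)                         # drop medium
ssr_const = ssr(const)                                        # constant only

mse = ssr_full / (60 - 6)
f_interaction = ((ssr_additive - ssr_full) / 2) / mse
f_medium = ((ssr_strategy - ssr_additive) / 1) / mse
f_strategy = ((ssr_const - ssr_strategy) / 2) / mse

print(round(ssr_full, 1), round(ssr_additive, 1), round(ssr_strategy, 1), round(ssr_const, 1))
print(round(f_interaction, 3), round(f_medium, 3), round(f_strategy, 3))
```

Each difference in SSR between successive regressions is the sum of squares attributed to the terms that were dropped, which is why the results line up with Table 11.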
Dependent Variable: SALESAPJ
Method: Least Squares
Sample: 1 60
Included observations: 60

Variable       Coefficient   Std. Error   t-Statistic   Prob.
CONVENIENCE    -46.50000     30.08521     -1.545610     0.1277
QUALITY        52.85000      30.08521     1.756677      0.0843
C              612.2000      21.27346     28.77765      0.0000

R-squared            0.160777    Mean dependent var      614.31
Adjusted R-squared   0.131330    S.D. dependent var      102.07
S.E. of regression   95.13779    Akaike info criterion   11.99724
Sum squared resid    515918.3    Schwarz criterion       12.101
Log likelihood       -356.9171   F-statistic             5.459975
Durbin-Watson stat   2.379774    Prob(F-statistic)       0.006769
Wald Test:
Equation: Untitled
Null Hypothesis:   C(2)=C(3)

F-statistic   138.2678    Probability   0.000000
Chi-square    138.2678    Probability   0.000000