Different Types of SS

MANOVA
• All statistical methods we have learned so far have only one continuous
DV and one or more IVs which may be continuous or categorical
• There are cases with multiple DVs.
• All statistical methods that deal with multiple DVs or multiple variables
without DV/IV specification are grouped into multivariate statistics
• Multivariate analysis of variance (MANOVA) as an example
• Why MANOVA instead of multiple ANOVA for each DV?
– Experimentwise error rate
– See what univariate analysis cannot see
• The power of the MANOVA test generally decreases with the number of
variables that do not differ among groups, so be cautious about which
variables you include in MANOVA (or any other multivariate statistical method).
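The experimentwise error rate argument can be made concrete with a minimal sketch. It assumes the separate ANOVAs are independent tests, each run at the same alpha; this is an idealization, because correlated DVs do not give independent tests. The function name is illustrative and does not come from the slides.

```python
def experimentwise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive in n_tests independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


# One ANOVA per DV: the chance of a spurious "significant" result grows with the number of DVs.
for n_dvs in (1, 2, 4, 10):
    rate = experimentwise_error_rate(0.05, n_dvs)
    print(f"{n_dvs:2d} DVs -> experimentwise error rate {rate:.4f}")
```

A single MANOVA test keeps the overall rate at the nominal alpha, which is the first reason listed for preferring it over one ANOVA per DV.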
Xuhua Xia
Slide 1
Advantage of MANOVA
Data Shape;
Input Sex $ Height Width @@;
datalines;
Male 69 70 Male 68 74 Male 75 80 Male 78 85 Male 68 68
Male 63 68 Male 72 74 Male 63 66 Male 71 76 Male 72 78
Male 71 73 Male 70 73 Male 56 59 Male 77 83
Female 72 79 Female 64 65 Female 74 74 Female 72 75
Female 82 84 Female 69 68 Female 76 76 Female 68 65
Female 78 79 Female 70 71 Female 60 61
;
proc glm ;
class Sex;
model Height Width=Sex / solution;
manova h=Sex;
lsmeans Sex;
title "MANOVA: Differences in Height and Width between sexes";
run;
The SAS program will run both individual ANOVAs and a
MANOVA. Run and explain output.
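As a rough cross-check on the SAS output, the same quantities can be computed by hand. This is a sketch under stated assumptions, not the PROC GLM implementation. It uses the Height/Width data from the DATA step, builds the within-group (E) and hypothesis (H) sums-of-squares-and-cross-products matrices, and converts Wilks' lambda to an F statistic using the exact two-group formula. All function and variable names are illustrative.

```python
# Height/Width pairs transcribed from the DATA step above.
male = [(69, 70), (68, 74), (75, 80), (78, 85), (68, 68), (63, 68), (72, 74),
        (63, 66), (71, 76), (72, 78), (71, 73), (70, 73), (56, 59), (77, 83)]
female = [(72, 79), (64, 65), (74, 74), (72, 75), (82, 84), (69, 68),
          (76, 76), (68, 65), (78, 79), (70, 71), (60, 61)]


def mean_vector(rows):
    """Column means of a list of (Height, Width) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def sscp(rows, center):
    """2x2 matrix of sums of squares and cross-products of rows about center."""
    m = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                m[i][j] += d[i] * d[j]
    return m


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


groups = [male, female]
all_rows = male + female
n_total, p = len(all_rows), 2

# E: within-group (error) SSCP, summed over groups.
E = [[0.0, 0.0], [0.0, 0.0]]
for g in groups:
    s = sscp(g, mean_vector(g))
    E = [[E[i][j] + s[i][j] for j in range(2)] for i in range(2)]

# T: total SSCP about the grand mean; H = T - E is the hypothesis SSCP.
T = sscp(all_rows, mean_vector(all_rows))
H = [[T[i][j] - E[i][j] for j in range(2)] for i in range(2)]

wilks = det2(E) / det2(T)  # Wilks' lambda = |E| / |E + H|

# With two groups, lambda converts to an exact F with (p, N - p - 1) df.
f_multi = (1 - wilks) / wilks * (n_total - p - 1) / p

# Univariate one-way ANOVA F for each DV: (SSB / 1) / (SSW / (N - 2)).
f_uni = [H[j][j] / (E[j][j] / (n_total - 2)) for j in range(2)]

print("Wilks' lambda:", round(wilks, 4))
print("Multivariate F (df = 2, 22):", round(f_multi, 3))
print("Univariate F for Height, Width (df = 1, 23):", [round(f, 3) for f in f_uni])
```

Comparing the multivariate F with the two univariate F values shows what the MANOVA picks up from the joint distribution of Height and Width that each ANOVA alone does not use.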
Slide 2
Fisher Iris Data
• Collected by Dr. Edgar Anderson, published in
Fisher (1936)
• Sepal length and width, petal length and width (in
cm) of fifty plants for each of three types of iris
– Iris setosa, diploid with 38 chromosomes
– Iris versicolor, hexaploid (108 chromosomes)
– Iris virginica, tetraploid
• Fisher, R.A. (1936). "The Use of Multiple
Measurements in Taxonomic Problems". Annals of
Eugenics 7: 179–188.
Slide 3
SAS Program
Data Iris;
input SepalLen SepalWid PetalLen PetalWid Species $ @@;
cards;
5.1 3.5 1.4 0.2 Is 7.0 3.2 4.7 1.4 Ive 6.3 3.3 6.0 2.5 Ivi
4.9 3.0 1.4 0.2 Is 6.4 3.2 4.5 1.5 Ive 5.8 2.7 5.1 1.9 Ivi
4.7 3.2 1.3 0.2 Is 6.9 3.1 4.9 1.5 Ive 7.1 3 5.9 2.1 Ivi
4.6 3.1 1.5 0.2 Is 5.5 2.3 4 1.3 Ive 6.3 2.9 5.6 1.8 Ivi
5 3.6 1.4 0.2 Is 6.5 2.8 4.6 1.5 Ive 6.5 3 5.8 2.2 Ivi
5.4 3.9 1.7 0.4 Is 5.7 2.8 4.5 1.3 Ive 7.6 3 6.6 2.1 Ivi
4.6 3.4 1.4 0.3 Is 6.3 3.3 4.7 1.6 Ive 4.9 2.5 4.5 1.7 Ivi
5 3.4 1.5 0.2 Is 4.9 2.4 3.3 1 Ive 7.3 2.9 6.3 1.8 Ivi
4.4 2.9 1.4 0.2 Is 6.6 2.9 4.6 1.3 Ive 6.7 2.5 5.8 1.8 Ivi
4.9 3.1 1.5 0.1 Is 5.2 2.7 3.9 1.4 Ive 7.2 3.6 6.1 2.5 Ivi
5.4 3.7 1.5 0.2 Is 5 2 3.5 1 Ive 6.5 3.2 5.1 2 Ivi
4.8 3.4 1.6 0.2 Is 5.9 3 4.2 1.5 Ive 6.4 2.7 5.3 1.9 Ivi
4.8 3 1.4 0.1 Is 6 2.2 4 1 Ive 6.8 3 5.5 2.1 Ivi
4.3 3 1.1 0.1 Is 6.1 2.9 4.7 1.4 Ive 5.7 2.5 5 2 Ivi
5.8 4 1.2 0.2 Is 5.6 2.9 3.6 1.3 Ive 5.8 2.8 5.1 2.4 Ivi
5.7 4.4 1.5 0.4 Is 6.7 3.1 4.4 1.4 Ive 6.4 3.2 5.3 2.3 Ivi
5.4 3.9 1.3 0.4 Is 5.6 3 4.5 1.5 Ive 6.5 3 5.5 1.8 Ivi
5.1 3.5 1.4 0.3 Is 5.8 2.7 4.1 1 Ive 7.7 3.8 6.7 2.2 Ivi
5.7 3.8 1.7 0.3 Is 6.2 2.2 4.5 1.5 Ive 7.7 2.6 6.9 2.3 Ivi
5.1 3.8 1.5 0.3 Is 5.6 2.5 3.9 1.1 Ive 6 2.2 5 1.5 Ivi
5.4 3.4 1.7 0.2 Is 5.9 3.2 4.8 1.8 Ive 6.9 3.2 5.7 2.3 Ivi
5.1 3.7 1.5 0.4 Is 6.1 2.8 4 1.3 Ive 5.6 2.8 4.9 2 Ivi
4.6 3.6 1 0.2 Is 6.3 2.5 4.9 1.5 Ive 7.7 2.8 6.7 2 Ivi
5.1 3.3 1.7 0.5 Is 6.1 2.8 4.7 1.2 Ive 6.3 2.7 4.9 1.8 Ivi
4.8 3.4 1.9 0.2 Is 6.4 2.9 4.3 1.3 Ive 6.7 3.3 5.7 2.1 Ivi
5 3 1.6 0.2 Is 6.6 3 4.4 1.4 Ive 7.2 3.2 6 1.8 Ivi
5 3.4 1.6 0.4 Is 6.8 2.8 4.8 1.4 Ive 6.2 2.8 4.8 1.8 Ivi
5.2 3.5 1.5 0.2 Is 6.7 3 5 1.7 Ive 6.1 3 4.9 1.8 Ivi
5.2 3.4 1.4 0.2 Is 6 2.9 4.5 1.5 Ive 6.4 2.8 5.6 2.1 Ivi
4.7 3.2 1.6 0.2 Is 5.7 2.6 3.5 1 Ive 7.2 3 5.8 1.6 Ivi
4.8 3.1 1.6 0.2 Is 5.5 2.4 3.8 1.1 Ive 7.4 2.8 6.1 1.9 Ivi
5.4 3.4 1.5 0.4 Is 5.5 2.4 3.7 1 Ive 7.9 3.8 6.4 2 Ivi
5.2 4.1 1.5 0.1 Is 5.8 2.7 3.9 1.2 Ive 6.4 2.8 5.6 2.2 Ivi
5.5 4.2 1.4 0.2 Is 6 2.7 5.1 1.6 Ive 6.3 2.8 5.1 1.5 Ivi
4.9 3.1 1.5 0.2 Is 5.4 3 4.5 1.5 Ive 6.1 2.6 5.6 1.4 Ivi
5 3.2 1.2 0.2 Is 6 3.4 4.5 1.6 Ive 7.7 3 6.1 2.3 Ivi
5.5 3.5 1.3 0.2 Is 6.7 3.1 4.7 1.5 Ive 6.3 3.4 5.6 2.4 Ivi
4.9 3.6 1.4 0.1 Is 6.3 2.3 4.4 1.3 Ive 6.4 3.1 5.5 1.8 Ivi
4.4 3 1.3 0.2 Is 5.6 3 4.1 1.3 Ive 6 3 4.8 1.8 Ivi
5.1 3.4 1.5 0.2 Is 5.5 2.5 4 1.3 Ive 6.9 3.1 5.4 2.1 Ivi
5 3.5 1.3 0.3 Is 5.5 2.6 4.4 1.2 Ive 6.7 3.1 5.6 2.4 Ivi
4.5 2.3 1.3 0.3 Is 6.1 3 4.6 1.4 Ive 6.9 3.1 5.1 2.3 Ivi
4.4 3.2 1.3 0.2 Is 5.8 2.6 4 1.2 Ive 5.8 2.7 5.1 1.9 Ivi
5 3.5 1.6 0.6 Is 5 2.3 3.3 1 Ive 6.8 3.2 5.9 2.3 Ivi
5.1 3.8 1.9 0.4 Is 5.6 2.7 4.2 1.3 Ive 6.7 3.3 5.7 2.5 Ivi
4.8 3 1.4 0.3 Is 5.7 3 4.2 1.2 Ive 6.7 3 5.2 2.3 Ivi
5.1 3.8 1.6 0.2 Is 5.7 2.9 4.2 1.3 Ive 6.3 2.5 5 1.9 Ivi
4.6 3.2 1.4 0.2 Is 6.2 2.9 4.3 1.3 Ive 6.5 3 5.2 2 Ivi
5.3 3.7 1.5 0.2 Is 5.1 2.5 3 1.1 Ive 6.2 3.4 5.4 2.3 Ivi
5 3.3 1.4 0.2 Is 5.7 2.8 4.1 1.3 Ive 5.9 3 5.1 1.8 Ivi
;
SAS program (cont.)
proc glm data=Iris;
class Species;
model SepalLen SepalWid PetalLen PetalWid=Species / solution;
manova h=_all_; /* Print multivariate tests together
                   with characteristic roots and
                   vectors of E^-1 * H. */
means Species / tukey;
title "MANOVA test";
run;

/* pool=yes|no: assume equal|unequal covariance matrices.
   pool=test: test the equal-covariance assumption, with slpool
   specifying the significance level; the subsequent analysis
   depends on the outcome of the test. */
proc discrim pool=test slpool=0.05;
class Species;
var SepalLen SepalWid PetalLen PetalWid;
priors proportional;
title "Discriminant function analysis";
run;

proc stepdisc;
class Species;
var SepalLen SepalWid PetalLen PetalWid;
title "Stepwise discriminant function analysis";
run;
Run and explain
Slide 6
Interpretation of MANOVA
• If the multivariate test is
– not significant, report no group differences among the
mean vectors
– significant, perform univariate ANOVA and relevant
contrasts
• Correlation among the variables may lead to a significant
MANOVA test even when no univariate ANOVA is significant.
• Contrasts
– A priori (planned): theory predicts which treatments
should be different
– Post hoc (unplanned): no prior expectation about which
treatments should be different
• Control of experimentwise error rate
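One common way to control the experimentwise error rate across unplanned pairwise contrasts is the Bonferroni adjustment, sketched below. This is an illustrative choice, not the method prescribed by the slides (the SAS program elsewhere in this deck requests Tukey's test). The species codes are those used in the Iris data step; the function name is hypothetical.

```python
from itertools import combinations


def bonferroni_alpha(alpha: float, n_comparisons: int) -> float:
    """Per-comparison significance level that keeps the overall rate at or below alpha."""
    return alpha / n_comparisons


species = ["Is", "Ive", "Ivi"]
pairs = list(combinations(species, 2))  # every unplanned pairwise contrast
per_test = bonferroni_alpha(0.05, len(pairs))

for a, b in pairs:
    print(f"{a} vs {b}: test at alpha = {per_test:.4f}")
```

Planned contrasts usually involve fewer comparisons, so they pay a smaller penalty than the full set of post hoc comparisons.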
Slide 7
MANOVA Assumptions
• Independence assumption: All observations
are independent (residuals are uncorrelated)
• Multivariate normality
• Sphericity assumption in repeated measures
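• Homoscedasticity (equal variance and
covariance) assumption: Each sample (group)
has the same covariance matrix. Compound
symmetry (equal variances and equal covariances
within that matrix) is a separate condition that is
sufficient for the sphericity assumption above.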
• Linearity assumption: Relationships among
variables are linear.
Slide 8
Discriminant function analysis
[Figure: scatter plot of Y against X, both axes running from 0.0 to 1.0, with the centroid marked.]

$SS_{distances} = \sum_{x} [x - c(x)]^2$

$MSE = \frac{\sum_{x} [x - c(x)]^2}{(N - c)\,b} = \frac{SS_{distances}}{(N - c)\,b}$

N: number of genes (rows), c: number of clusters, b:
number of time points or replicates (columns).
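The two formulas can be sketched in a few lines. The sketch assumes that c(x) is the centroid of the cluster to which row x is assigned, and that [x - c(x)]^2 is the squared Euclidean distance summed over the b columns. The toy data are made up for illustration and are not from the slides.

```python
def ss_distances(rows, labels, centroids):
    """Sum over all rows of the squared Euclidean distance to the assigned centroid."""
    total = 0.0
    for row, k in zip(rows, labels):
        total += sum((v - c) ** 2 for v, c in zip(row, centroids[k]))
    return total


def mse(rows, labels, centroids):
    """SS_distances divided by (N - c) * b."""
    n = len(rows)          # N: number of rows
    c = len(centroids)     # c: number of clusters
    b = len(rows[0])       # b: number of columns
    return ss_distances(rows, labels, centroids) / ((n - c) * b)


# Hypothetical toy data: six rows, two columns, two clusters.
rows = [(0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (0.8, 0.9), (0.9, 0.8), (0.7, 0.7)]
labels = [0, 0, 0, 1, 1, 1]
centroids = [(0.2, 0.2), (0.8, 0.8)]  # column means of each cluster

print("SS_distances:", ss_distances(rows, labels, centroids))
print("MSE:", mse(rows, labels, centroids))
```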
Slide 9
Multi-group DFA
[Figure: scatter plot of Y against X, both axes running from 0.0 to 1.0.]
Slide 10
GLM and repeated measures
data dental;
input person gender$ y1-y4;
datalines;
1 F 21.0 20.0 21.5 23.0
2 F 21.0 21.5 24.0 25.5
3 F 20.5 24.0 24.5 26.0
4 F 23.5 24.5 25.0 26.5
5 F 21.5 23.0 22.5 23.5
6 F 20.0 21.0 21.0 22.5
7 F 21.5 22.5 23.0 25.0
8 F 23.0 23.0 23.5 24.0
9 F 20.0 21.0 22.0 21.5
10 F 16.5 19.0 19.0 19.5
11 F 24.5 25.0 28.0 28.0
12 M 26.0 25.0 29.0 31.0
13 M 21.5 22.5 23.0 26.5
14 M 23.0 22.5 24.0 27.5
15 M 25.5 27.5 26.5 27.0
16 M 20.0 23.5 22.5 26.0
17 M 24.5 25.5 27.0 28.5
18 M 22.0 22.0 24.5 26.5
19 M 24.0 21.5 24.5 25.5
20 M 23.0 20.5 31.0 26.0
21 M 27.5 28.0 31.0 31.5
22 M 23.0 23.0 23.5 25.0
23 M 21.5 23.5 24.0 28.0
24 M 17.0 24.5 26.0 29.5
25 M 22.5 25.5 25.5 26.0
26 M 23.0 24.5 26.0 30.0
27 M 22.0 21.5 23.5 25.0
;
proc glm;
class gender;
model y1-y4=gender / nouni;
repeated age 4 (8 10 12 14);
means gender;
lsmeans gender / pdiff;
run;
Slide 11
Adjusted F
• Greenhouse-Geisser Epsilon measures by how much
the sphericity assumption is violated. Epsilon is then
used to adjust for the potential bias in the F statistic.
– Epsilon = 1: the sphericity assumption is met perfectly.
– minimum Epsilon = 1/(k - 1), where k is the number of
levels in the repeated measure factor. For k = 3, the
minimum Epsilon = ½.
• The Huynh-Feldt epsilon is a correction of the
Greenhouse-Geisser epsilon because the latter is too
conservative.
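The two boundary cases for epsilon can be checked numerically with a minimal sketch. It assumes the standard Greenhouse-Geisser estimate computed from a symmetric k x k covariance matrix of the repeated measures: double-centre the matrix, then take (trace)^2 / ((k - 1) * sum of squared entries). The slides do not give this formula, and the two example matrices are made up for illustration.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a symmetric k x k covariance matrix."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    grand_mean = sum(row_means) / k
    # Double-centre: remove row and column means (equal here because cov is symmetric).
    centred = [[cov[i][j] - row_means[i] - row_means[j] + grand_mean
                for j in range(k)] for i in range(k)]
    trace = sum(centred[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centred for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)


# Compound symmetry (equal variances, equal covariances): sphericity holds.
compound_symmetric = [[2.0, 1.0, 1.0],
                      [1.0, 2.0, 1.0],
                      [1.0, 1.0, 2.0]]

# Rank-one matrix: the most extreme violation, giving the lower bound 1/(k - 1).
rank_one = [[1.0, 2.0, 3.0],
            [2.0, 4.0, 6.0],
            [3.0, 6.0, 9.0]]

print("compound symmetry:", greenhouse_geisser_epsilon(compound_symmetric))
print("rank one:", greenhouse_geisser_epsilon(rank_one))
```

In the adjusted F test, the numerator and denominator degrees of freedom are both multiplied by epsilon, so a smaller epsilon gives a more conservative test.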
Slide 12