Are Variance-Stabilizing Transformations Really Useful?


Classroom Simulation: Are Variance-Stabilizing Transformations Really Useful?
Bruce E. Trumbo*, Eric A. Suess, Rebecca E. Brafman†
Department of Statistics, California State University, Hayward
† Presentation, JSM 2004, Toronto
* [email protected]
Introduction to One-way ANOVA
In a one-way ANOVA, we test the null hypothesis that all group means μi are equal against the alternative hypothesis that not all group means are equal.
ANOVA Table

Source    DF       SS         MS         F-Ratio
Factor    I – 1    SS(Fact)   MS(Fact)   MS(Fact)/MS(Err)
Error     IJ – I   SS(Err)    MS(Err)
Total     IJ – 1
Model and Assumptions
We use the model: Xij ~ NORM(μi, σ²), independent, for i = 1, …, I and j = 1, …, J.
Assumptions:
– normal data
– independent groups
– independent observations within groups
– equal variances
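As a small illustration (ours, not part of the original poster), the R code below fits this one-way model to a single simulated normal dataset; the group means, standard deviation, and sample sizes are arbitrary.

# Illustration only: one-way ANOVA on one simulated normal dataset
I <- 3; J <- 5                                           # I groups, J observations per group
group <- factor(rep(1:I, each=J))                        # group labels
x <- rnorm(I*J, mean=rep(c(10, 12, 14), each=J), sd=2)   # normal data, equal variances
summary(aov(x ~ group))                                  # ANOVA table: DF, SS, MS, F-ratio, p-value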
When Data Are Not Normal…
• If H0 True: Distributional difficulties arise
– MS(Factor) and MS(Error) not chi-squared
– MS(Factor) and MS(Error) not independent
– F-ratio not distributed as F
• If H0 False:
– Different means may imply different variances
Commonly Recommended Method for Transforming Data to Stabilize Variances
Based on a two-term Taylor-series approximation.
Given a relationship between the mean and the variance: σ² = j(μ).
The following transformation makes variances approximately equal, even if the means differ:
Y = f(X), where f′(μ) = [j(μ)]^(–1/2)
Some Types of Nonnormal Data and Their Variance-Stabilizing Transformations

Type of Distribution    Relationship of Mean & Variance    Type of Transformation
Poisson                 Variance = Mean                    Square Root
Binomial Proportions    Mean = p, Variance = p(1–p)/n      Arcsine of Square Root
Exponential             SD = Mean                          Log and Rank
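As a rough numerical check (added here, not part of the original table), the same delta-method reasoning gives approximate transformed variances of 1/(4n) for the arcsine square root of a binomial proportion and π²/6 ≈ 1.64 for the log of an exponential variable:

# Rough check of the binomial and exponential entries in the table above
n <- 20
for (p in c(0.05, 0.15, 0.35)) {
  phat <- rbinom(100000, n, p)/n                 # simulated sample proportions
  cat("p =", p, "  var(asin(sqrt(phat))) =", round(var(asin(sqrt(phat))), 4),
      "  vs 1/(4n) =", 1/(4*n), "\n")
}
for (mu in c(1, 5, 10)) {
  x <- rexp(100000, rate=1/mu)                   # exponential data with mean mu
  cat("mean =", mu, "  var(log(x)) =", round(var(log(x)), 2),
      "  vs pi^2/6 =", round(pi^2/6, 2), "\n")
}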
Square Root Transformations (Right) of Three Poisson Samples Have Similar Variances
[Figure: histograms of Sample 1 from POIS(3), Sample 2 from POIS(10), and Sample 3 from POIS(20), with their square-root transforms at right. The original sample variances are roughly 3, 9.5, and 20, while the transformed samples all have variances near 0.23–0.25.]
Arcsine of Square Root Transformations (Right) of Three Binomial Samples Have Similar Variances
[Figure: histograms of sample proportions from BINOM(20, .05), BINOM(20, .15), and BINOM(20, .35), with their arcsine-square-root transforms at right. The original proportion variances are roughly 0.0024, 0.0064, and 0.011, while the transformed samples have variances of roughly 0.009, 0.014, and 0.013.]
Log Transformations (Right) of Three Exponential Samples Have Similar Variances
[Figure: histograms of Sample 1 from EXP(1), Sample 2 from EXP(5), and Sample 3 from EXP(10), with their log transforms at right. The original sample variances are roughly 1, 25, and 104, while the log-transformed samples all have variances near 1.6–1.8.]
Additional Transformations
We also consider rank transformations for
exponential data.
Possible future work (no results given here): Box-Cox transformations of the type Y = X^a, where a is estimated from the data (a brief sketch follows the examples below).
Examples:
• Square root if a = 1/2
• Reciprocal if a = –1
• Interpreted as log transformation if a = 0
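Purely as a sketch (not part of our study), the exponent a can be estimated in R from the profile likelihood using the boxcox function in the MASS package; the data and grouping below are made up for illustration:

# Sketch only: choosing the Box-Cox exponent a by profile likelihood (MASS package)
library(MASS)
group <- factor(rep(1:3, each=10))                   # hypothetical group labels
x <- rexp(30, rate=1/rep(c(1, 5, 10), each=10))      # hypothetical skewed positive data
bc <- boxcox(x ~ group, lambda=seq(-1, 1, 0.05), plotit=FALSE)
bc$x[which.max(bc$y)]                                # value of a maximizing the likelihood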
Simulation Study
1. Simulations are based on data with known
distributions: Poisson, binomial, or exponential.
2. Use R, S-Plus, and Minitab.
(SAS can also be used but is very time consuming.)
3. In each simulation we generate 20,000 datasets
from the nonnormal distribution under study.
4. Each dataset consists of I = 3 groups, usually
with J = 5 or 10 observations per group.
5. For each distribution, we generate datasets under H0 and under a variety of alternatives Ha.
Comparisons to Judge
Usefulness of Transformations
All tests have nominal size α = 5%.
P{Rej} is estimated as the proportion of 20,000
simulated datasets in which H0 is rejected.
With and without transformation:
When H0 is true, does P{Rej} = 5%?
For various alternatives:
When is P{Rej} larger, with or without
transformation?
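As a side note (ours, not from the poster), with 20,000 simulated datasets the Monte Carlo standard error of an estimated rejection probability near 5% is about 0.0015, so such estimates are accurate to roughly ±0.3 percentage points:

# Monte Carlo standard error of an estimated rejection probability near 5%
m <- 20000; p <- 0.05
sqrt(p*(1 - p)/m)      # approximately 0.0015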
R / S-Plus Code for Exponential Simulation
r <- 5; m <- 20000
# r = no. of obs. per group, m = no. of datasets
mu1 <- 10; mu2 <- 10; mu3 <- 10
# set the mean of each group
mu <- c(rep(mu1,r), rep(mu2,r), rep(mu3,r))
# combine means
x <- rexp(3*r*m, rate=1/mu)
# form vector of random exponential data
DTA <- matrix(x, nrow=m, byrow=TRUE)
# reshape vector into a matrix of m datasets (one dataset per row)
#DTA <- log(DTA)
# Activate line above for log transformation
#DTA <- t(apply(DTA, 1, rank))
# Activate line above for rank transformation
m1 <- rowMeans(DTA[,1:r])
m2 <- rowMeans(DTA[,(r+1):(2*r)])
m3 <- rowMeans(DTA[,(2*r+1):(3*r)])
# group sample means
v1 <- rowSums((DTA[,1:r] - m1)^2)/(r-1)
v2 <- rowSums((DTA[,(r+1):(2*r)] - m2)^2)/(r-1)
v3 <- rowSums((DTA[,(2*r+1):(3*r)] - m3)^2)/(r-1)
# group sample variances
g <- (m1 + m2 + m3)/3
# grand mean of each dataset
MSF <- r * rowSums((cbind(m1,m2,m3) - g)^2)/2
# calculates MS(Factor)
MSE <- rowMeans(cbind(v1, v2, v3))
# calculates MS(Error)
F.rat <- MSF/MSE
# calculates F-ratio
rej <- (F.rat > qf(.95, 2, 3*(r-1)))
# vector of TRUE/FALSE for reject/don't reject
mean(rej)
# rejection probability: proportion of TRUE in the rej vector
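The same skeleton can be adapted for the other distributions in the study; for instance, a minimal sketch (ours, not code from the poster's web page) of the Poisson case with the square-root transformation, using an arbitrary pattern of group means:

# Sketch only: Poisson / square-root variant of the simulation above
r <- 5; m <- 20000
mu <- c(rep(10, r), rep(15, r), rep(20, r))      # example pattern of Poisson group means
x <- rpois(3*r*m, lambda=mu)                     # Poisson data in place of rexp()
DTA <- sqrt(matrix(x, nrow=m, byrow=TRUE))       # square-root transformation in place of log()
# ...the MSF, MSE, and F-ratio calculations are unchanged from the code above.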
Summary of Findings
Within the limited scope of our study…
For Poisson data, the square root
transformation seems ineffective.
For binomial data, the “arcsine”
transformation seems ineffective.
For exponential data, both the log and the
rank transformations seem to be useful in
some cases—particularly for small samples.
Some Specific Results: P{Rej} for Poisson Data
Three groups, each with 5 observations

Pattern of Group Means   Not Transformed   Transformed   Transf. Useful?
10, 10, 10               ~ 0.05            ~ 0.05        NO
10, 15, 20               ~ 0.91            ~ 0.91        NO
Some Specific Results: P{Rej} for Binomial Proportions
Three groups, each with 5 observations

Pattern of p = P(Success) in each group   Not Transformed   Transformed   Transf. Useful?
0.2, 0.2, 0.2                             ~ 0.05            ~ 0.05        NO
0.1, 0.25, 0.4                            ~ 0.82            ~ 0.82        NO
For Exponential Data, Log and Rank Transformations Sometimes Useful
Power = P{Rej | Ha} is "often" larger for transformed data
(one borderline exceptional case shown)

                    Transformation
Group Means      None     Log      Rank
r = 5
10, 10, 10       0.040    0.046    0.056
1, 2, 4          0.24     0.28     0.31
1, 5, 10         0.39     0.65     0.68
1, 10, 10        0.34     0.76     0.78
1, 10, 100       0.76     0.986    0.985
r = 10
10, 10, 10       0.043    0.047    0.052
1, 2, 4          0.59     0.54     0.60
1, 5, 10         0.87     0.94     0.976
1, 10, 10        0.86     0.976    0.991
Exponential: Power Against Ha: 1, 10, 100
For Various Numbers of Replications
Log and rank transformations work well when r is small
and population means are widely separated.
[Figure: power vs. number of replications r. O = original data, * = log transform, + = rank transform.]
Exponential: Power Against Ha: 1, 2, 4
For Various Numbers r of Replications
When means are not so widely separated, log and rank transformations do some harm unless r is small.
[Figure: power vs. number of replications r. O = original data, * = log transform, + = rank transform.]
Exponential: Power for Various Alternatives
When M = 1, H0 is true; when M = 2, the group means are
1, 2, 4; when M = 4, the group means are 1, 4, 16; etc.
For r = 5 and M > 2, transformations are useful.
[Figure: power curves against these alternatives. Solid = original data, dotted = log transform, dashed = rank transform.]
Exponential: Power for Various Alternatives
When M = 1, H0 is true; when M = 2, the group means are
1, 2, 4; when M = 4, the group means are 1, 4, 16; etc.
For r = 20, transformations may be harmful.
[Figure: power curves against these alternatives. Solid = original data, dotted = log transform, dashed = rank transform.]
References / Acknowledgments
REFERENCES ON VARIANCE-STABILIZING TRANSFORMATIONS
G. Oehlert: A First Course in Design and Analysis of Experiments, Freeman (2000), Chapter 6.
D. Montgomery: Design and Analysis of Experiments, 5th ed., Wiley (2001), Chapter 3.
K. Brownlee: Statistical Theory and Methodology in Science and Engineering, 2nd ed., Wiley (1965), Chapter 3.
H. Scheffé: The Analysis of Variance, Wiley (1959), Chapter 10.
G. Snedecor and W. Cochran: Statistical Methods, 7th ed., Iowa State Univ. Press (1980), Chapter 15.
WEB PAGES including computer code and results for this paper:
www.sci.csuhayward.edu/~btrumbo/JSM2004/simtrans/
THANKS TO Jaimyoung Kwan (UC Berkeley / CSU Hayward) for suggestions, especially concerning the inclusion of power curves.
Rebecca Brafman's graduate study was supported by an NSF Graduate Research Fellowship.
About the Authors
• Rebecca E. Brafman, presenting this poster at JSM 2004 in Toronto, has recently completed her M.S. in Statistics at CSU Hayward.
• Eric A. Suess received his Ph.D. in Statistics from U.C. Davis and is Associate Professor of Statistics at CSU Hayward. His interests include statistical computation, time series, and Bayesian statistics. [email protected]
• Bruce E. Trumbo is a Fellow of the ASA and IMS and has been a professor in the Statistics Department at California State University, Hayward for over 30 years. [email protected]