Power Analysis
An Overview
Power Is
• The conditional probability
• that one will reject the null hypothesis
• given that the null is really false
• by a specified amount
• and given certain other specifications such
as sample size and the criterion of
statistical significance (alpha).
A Priori Power Analysis
• You want to find how many cases you will
need to have a specified amount of power
given
– a specified effect size
– the criterion of significance to be employed
– whether the hypotheses are directional or
nondirectional
• A very important part of the planning of
research.
A Posteriori Power Analysis
• You want to find out what power would be
for a specified
– effect size
– sample size
– and type of analysis
• Best done as part of the planning of
research.
– could be done after the research to tell you
what you should have known earlier.
Retrospective Power Analysis
• Also known as “observed power.”
• What would power be if I were to
– repeat this research
– with same number of cases etc.
– and the population effect size were exactly
what it was in the sample in the current
research
• Some stat packs (SPSS) provide this.
Hoenig and Heisey
• The American Statistician, 2001, 55, 19-24
• Retrospective power asks a foolish
question.
• It tells you nothing that you do not already
know from the p value.
• After the research you do not need a
power analysis, you need confidence
intervals for effect sizes.
One Sample Test of Mean
• Experimental treatment = memory drug
• H0: µIQ ≤ 100; σ = 15, N = 25
• Minimum Nontrivial Effect Size (MNES)
= 2 points.
• Thus, H1: µ = 102.
$\sigma_M = \frac{15}{\sqrt{25}} = 3$
α = .05, MNES = 2, Power = ?
• Under H0, CV = 100 + 1.645(3) = 104.935
– will reject null if sample mean ≥ 104.935
• Power = area under H1 ≥ 104.935
• Z = (104.935 − 102)/3 = 0.98
• P(Z > 0.98) = .1635
• β = 1 − .16 = .84
• Hope you like making Type II errors.
α = .05, ES = 5, Power = ?
• What if the Effect Size were 5?
• H1: µ = 105
• Z = (104.935 − 105)/3 = −0.02
• P(Z > −0.02) = .5080
• It is easier to find large things than small
things.
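The two directional calculations above can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the function name `power_one_tailed_z` is my own, and σ is treated as known, as on the slides.

```python
from math import sqrt
from statistics import NormalDist

def power_one_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed one-sample z test (sigma assumed known)."""
    z = NormalDist()                       # standard normal distribution
    se = sigma / sqrt(n)                   # standard error of the mean
    cv = mu0 + z.inv_cdf(1 - alpha) * se   # critical value under H0
    return 1 - z.cdf((cv - mu1) / se)      # area under H1 at or above cv

power_2 = power_one_tailed_z(mu0=100, mu1=102, sigma=15, n=25)
power_5 = power_one_tailed_z(mu0=100, mu1=105, sigma=15, n=25)
print(f"effect of 2 points: power = {power_2:.3f}, beta = {1 - power_2:.3f}")
print(f"effect of 5 points: power = {power_5:.3f}, beta = {1 - power_5:.3f}")
```

The results should land close to the hand-computed .16 and .51; small differences come from rounding Z to two decimals on the slides.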
H0: µ = 100 (nondirectional)
• CVLower = 100 − 1.96(3) = 94.12 or less
• CVUpper = 100 + 1.96(3) = 105.88 or more
• If µ = 105, Z = (105.88 − 105)/3 = .29
• P(Z > .29) = .3859
• Notice the drop in power.
• Power is greater with directional
hypotheses IF you can correctly PREdict
the direction of the effect.
Type III Error
• µ = 105 but we happen to get a very low
sample mean, at or below CVLower.
• We would correctly reject H0
• But incorrectly assert the direction of
effect.
• P(Z < (94.12 − 105)/3) = P(Z < −3.63),
which is very small.
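For the nondirectional test, the rejection probability splits into a correct-direction part and a wrong-direction (Type III) part. The sketch below computes both for µ = 105, again assuming σ is known; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

z = NormalDist()
mu0, mu1, sigma, n, alpha = 100, 105, 15, 25, 0.05
se = sigma / sqrt(n)
z_crit = z.inv_cdf(1 - alpha / 2)            # about 1.96
cv_lower = mu0 - z_crit * se
cv_upper = mu0 + z_crit * se

p_upper = 1 - z.cdf((cv_upper - mu1) / se)   # reject, correct direction
p_lower = z.cdf((cv_lower - mu1) / se)       # reject, wrong direction (Type III)
print(f"CVs: {cv_lower:.2f}, {cv_upper:.2f}")
print(f"correct-direction rejection: {p_upper:.4f}")
print(f"Type III error: {p_lower:.6f}")
```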
H0: µ = 100, N = 100
$\sigma_M = \frac{15}{\sqrt{100}} = 1.5$
• Under H0, CV = 100 + 1.96(1.5) = 102.94
• If µ = 105, Z = (102.94 − 105)/1.5 = −1.37
• P(Z > −1.37) = .9147
• Anything that reduces the SE increases
power (increase N or reduce σ)
Reduce α to .01
• CVUpper = 100 + 2.58(1.5) = 103.87
• If µ = 105, Z = (103.87 − 105)/1.5 = −0.75
• P(Z > −0.75) = .7734
• Reducing α reduces power, ceteris
paribus.
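To see the effects of N and α side by side, the sketch below tabulates the upper-tail rejection probability of the two-tailed z test for µ = 105. The helper name `upper_tail_power` is hypothetical, and like the slides it ignores the tiny wrong-direction tail.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_power(mu0, mu1, sigma, n, alpha):
    """Upper-tail rejection probability of a two-tailed z test (sigma known)."""
    z = NormalDist()
    se = sigma / sqrt(n)
    cv_upper = mu0 + z.inv_cdf(1 - alpha / 2) * se
    return 1 - z.cdf((cv_upper - mu1) / se)

results = {}
for n, alpha in [(25, 0.05), (100, 0.05), (100, 0.01)]:
    results[(n, alpha)] = upper_tail_power(100, 105, 15, n, alpha)
    print(f"N = {n:3d}, alpha = {alpha:.2f}: power = {results[(n, alpha)]:.3f}")
```

Power rises when N goes from 25 to 100 and falls again when α is tightened to .01, matching the pattern in the hand calculations.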
z versus t
• Unless you know σ (highly unlikely), you
really should use t, not z.
• Accordingly, the method I have shown you
is approximate.
• If N is not small, it provides a good
approximation.
• It is primarily of pedagogical value.
Howell’s Method
• The same approximation method, but
– You don’t need to think as much
– There is less arithmetic
– You need his power table
H0: µ = 100, N = 25, ES = 5
• IQ problem, minimum nontrivial effect size
at 5 IQ points, d = (105 − 100)/15 = 1/3.
• with N = 25,
• δ = d√N = (1/3)5 = 1.67.
• Using the power table in our text, for a .05
two-tailed test, power = 36% for a δ of 1.60
and 40% for a δ of 1.70
α = .05, δ = 1.67
• power for δ = 1.67 is 36% + .7(40% −
36%) = 38.8%
I Want 95% Power
• From the table, δ is 3.60.
$N = \left(\frac{\delta}{d}\right)^2 = \left(\frac{3.6}{1/3}\right)^2 = 116.64$
• If I get data on 117 cases, I shall have
power of 95%.
• With that much power, if I cannot reject the
null, I can assert its near truth.
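The tabled δ can be approximated without the power table as the sum of two normal quantiles, one for the two-tailed α and one for the desired power. This is a sketch of that normal approximation (not Howell's table itself); `required_n_one_sample` is a name I made up.

```python
from math import ceil
from statistics import NormalDist

def required_n_one_sample(d, power, alpha=0.05):
    """Approximate delta and N for a two-tailed one-sample test."""
    z = NormalDist()
    # delta needed so that the upper-tail rejection probability equals `power`
    delta = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return delta, (delta / d) ** 2

delta, n_exact = required_n_one_sample(d=1/3, power=0.95)
print(f"delta = {delta:.2f}, N = {n_exact:.2f}, round up to {ceil(n_exact)}")
```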
The Easy Way: GPower
• Test family: t tests
• Statistical test: Means: Difference from constant (one
sample case)
• Type of power analysis: Post hoc: Compute achieved
power – given α, sample size, and effect size
• Tails: Two
• Effect size d: 0.333333 (you could click “Determine” and
have G*Power compute d for you)
• α error prob: 0.05
• Total sample size: 25
• This is NOT an approximation, it uses the t distribution.
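One way to sanity-check a t-based power figure without special software is simulation. The sketch below is not G*Power's method (G*Power works from the noncentral t distribution); it simply draws many samples of N = 25 from a population with µ = 105, σ = 15 and counts how often a two-tailed one-sample t test rejects H0: µ = 100. The function name, the number of replications, and the seed are my own choices.

```python
import random
from math import sqrt

def simulated_t_power(mu0, mu1, sigma, n, t_crit, reps=20000, seed=1):
    """Estimate two-tailed one-sample t-test power by Monte Carlo simulation."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(mu1, sigma) for _ in range(n)]
        m = sum(sample) / n
        sd = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))  # sample SD
        t = (m - mu0) / (sd / sqrt(n))
        if abs(t) > t_crit:
            rejections += 1
    return rejections / reps

# 2.064 is the two-tailed .05 critical value of t on 24 df (from a t table).
power = simulated_t_power(mu0=100, mu1=105, sigma=15, n=25, t_crit=2.064)
print(f"simulated power = {power:.3f}")
```

The estimate should fall a little below the z-based approximation, since t has heavier tails than z.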
Significant Results, Power = 36%
• Bad news – you could only get 25 cases
• Good news – you got significant results
• Bad news – the editor will not publish it
because power was low.
• Duh. Significant results with low power
speak to a large effect size.
• But also a wide confidence interval.
Nonsignificant Results
• Power = 36%
– You got just what was to be expected, a Type
II error.
• Power = 95%
– If there was anything nontrivial to be found,
you should have found it, so the effect is
probably trivial.
– The confidence interval should show this.
I Want 95% Power
• How many cases do I need?
Sensitivity Analysis
• I had lots of data, N = 1500, but results
that were not significant.
• Can I assert the range null that d ≈ 0?
• Suppose that we consider d ≈ 0 if
−0.1 ≤ d ≤ +0.1.
• For what value of d would I have had 95%
power?
• If the effect were only .093, I would have almost
certainly found it.
• I did not find it, so it must be trivial in magnitude.
• I’d rather just compute a CI.
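The sensitivity question can be approximated by solving δ = d√N for d. The sketch assumes a one-sample, two-tailed test at α = .05 and uses the normal approximation; the type of test is my assumption, chosen because it reproduces the figure quoted above. `detectable_d` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist

def detectable_d(n, power, alpha=0.05):
    """Smallest d detectable with the given power (normal approximation)."""
    z = NormalDist()
    delta = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return delta / sqrt(n)

d_min = detectable_d(n=1500, power=0.95)
print(f"d detectable with 95% power at N = 1500: {d_min:.3f}")
```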
Two Independent Samples Test
of Means
• Effective sample size, $\tilde{n} = \frac{2}{\frac{1}{n_1} + \frac{1}{n_2}}$.
• The more nearly equal n1 and n2, the
greater the effective sample size.
• For n = 50, 50, it is 50. For n =10, 90, it is
18.
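The effective sample size is the harmonic mean of the two group sizes, which is easy to verify; `effective_n` is an illustrative name.

```python
def effective_n(n1, n2):
    """Harmonic mean of the two group sizes."""
    return 2 / (1 / n1 + 1 / n2)

for n1, n2 in [(50, 50), (10, 90)]:
    print(f"n1 = {n1}, n2 = {n2}: effective n = {effective_n(n1, n2):.2f}")
```

Both splits use 100 cases in total, but the unbalanced one behaves like two groups of only 18.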
Howell’s Method: A Posteriori
• n1 = 36, n2 = 48, effect size = 40 points,
SD = 98
$\tilde{n} = \frac{2}{\frac{1}{36} + \frac{1}{48}} = 41.14$
$d = \frac{\mu_1 - \mu_2}{\sigma} = 40/98 = .408$
$\delta = d\sqrt{\frac{\tilde{n}}{2}} = .408\sqrt{\frac{41.14}{2}} = 1.85$
• From the power table, power = 46%.
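The same numbers can be reproduced in a few lines. In place of the power table, the sketch converts δ to power with the normal approximation for a two-tailed .05 test, so treat the last step as an approximation to the tabled value.

```python
from math import sqrt
from statistics import NormalDist

n1, n2 = 36, 48
d = 40 / 98                                  # effect size in SD units
n_eff = 2 / (1 / n1 + 1 / n2)                # harmonic mean of group sizes
delta = d * sqrt(n_eff / 2)

z = NormalDist()
z_crit = z.inv_cdf(0.975)
# Both tails are included; the lower tail contributes almost nothing here.
power = (1 - z.cdf(z_crit - delta)) + z.cdf(-z_crit - delta)
print(f"n_eff = {n_eff:.2f}, d = {d:.3f}, delta = {delta:.2f}, power = {power:.2f}")
```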
I Want 80% Power
• For effect size d = 1/3.
• From power table, δ = 2.8 with alpha .05
• I plan on equal sample sizes.
$n = 2\left(\frac{\delta}{d}\right)^2 = 2\left(\frac{2.8}{1/3}\right)^2 = 141$
• Need a total of 2(141) = 282 subjects.
G*Power
• We have 36 scores in one group and 48 in
another.
• If µ1 - µ2 = 40, and σ = 98, what is power?
I Want 80% Power
• n1 = n2 = ? for d = 1/3, α = .05, power = .8.
• You need 286 cases.
Allocation Ratio = 9
• n1/n2 = 9. How many cases needed now?
• You need 788 cases!
Two Related Samples, Test of
Means
• Is equivalent to one sample test of null that
mean difference score = 0.
$\sigma_{Diff} = \sqrt{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}$
• With equal variances, $\sigma_{Diff} = \sigma\sqrt{2(1 - \rho)}$
• The greater ρ, the smaller the SE, the
greater the power.
dDiff
• Adjust the value of d to take into account
the power enhancing effect of this design.
$d_{Diff} = \frac{\mu_1 - \mu_2}{\sigma_{Diff}} = \frac{d}{\sqrt{2(1 - \rho_{12})}}$
Howell’s Method: A Posteriori
• Effect size = 20 points:
– Cortisol level when anxious vs. when relaxed
• σ1 = 108, σ2 = 114
• ρ = .75
• N = 16
• Power = ?
Howell’s Method
• Pooled SD = $\sqrt{.5(108^2) + .5(114^2)} = 111$
• d = 20/111 = .18.
$d_{Diff} = \frac{.18}{\sqrt{2(1 - .75)}} = .255$
$\delta = .255\sqrt{16} = 1.02$
• From the power table, power = 17%.
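The related-samples steps chain together as follows. As before, the final conversion from δ to power uses the normal approximation rather than the power table, so it should only roughly agree with the tabled figure.

```python
from math import sqrt
from statistics import NormalDist

sd1, sd2, rho, n, effect = 108, 114, 0.75, 16, 20

pooled_sd = sqrt(0.5 * sd1 ** 2 + 0.5 * sd2 ** 2)
d = effect / pooled_sd
d_diff = d / sqrt(2 * (1 - rho))     # adjust d for the correlation
delta = d_diff * sqrt(n)

z = NormalDist()
z_crit = z.inv_cdf(0.975)            # two-tailed, alpha = .05
power = (1 - z.cdf(z_crit - delta)) + z.cdf(-z_crit - delta)
print(f"pooled SD = {pooled_sd:.1f}, d = {d:.2f}, d_diff = {d_diff:.3f}")
print(f"delta = {delta:.2f}, power = {power:.2f}")
```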
I Want 95% Power
$n = \left(\frac{\delta}{d_{Diff}}\right)^2 = \left(\frac{3.6}{.255}\right)^2 = 199.3$
G*Power
• Dependent means,
post hoc.
• Set the total sample
size to 16.
• Click on “Determine.”
• Select “from group
parameters.”
• Calculate and transfer
to main window.
Power = 16%
I Want 95% Power
• You need 204 subjects.
Type III Errors
• You have correctly rejected H0: µ1= µ2.
• Which µ is greater?
• You conclude it is the one whose sample
mean was greater.
• If that is wrong, you made a Type III error.
• This probability is included in power.
• To exclude it, see
http://core.ecu.edu/psyc/wuenschk/StatHelp/Type_III.htm
Bivariate Correlation/Regression
• H0: ρMisanthropy-AnimalRights = 0
• For power = .95, α = .05, ρ = .2, N = ?
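A rough answer to this question is available from the Fisher z approximation, in which atanh(r) is treated as normal with standard error 1/√(N − 3). This is a swapped-in textbook approximation, not G*Power's exact routine, and it assumes a two-tailed test; expect the exact answer to differ by a case or two. `n_for_correlation` is a hypothetical name.

```python
from math import atanh, ceil
from statistics import NormalDist

def n_for_correlation(rho, power, alpha=0.05):
    """Approximate N to detect rho (two-tailed) via the Fisher z transform."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return (z_sum / atanh(rho)) ** 2 + 3

n_approx = n_for_correlation(rho=0.2, power=0.95)
print(f"approximate N = {n_approx:.1f}, round up to {ceil(n_approx)}")
```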
One-Way ANOVA,
Independent Samples
• f is the effect size statistic. Cohen
considered .1 to be small, .25 medium,
and .4 large.
• In terms of η², this is 1%, 6%, 14%.
$f = \sqrt{\frac{\sum_{j=1}^{k} (\mu_j - \mu)^2}{k\,\sigma^2_{error}}}$
Comparing three populations on
GRE-Q
• Minimum nontrivial effect size is if each
ordered mean differs from the next by 20
points (about 1/5 SD), σ = 100, n = 11.
• Σ(µj − µ)² = 20² + 0² + 20² = 800
$f = \sqrt{800 / 3 / 10000} = 0.163$
Power is only .115
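The effect size f for this design can be computed directly from a set of population means. The means below (480, 500, 520) are hypothetical values chosen only so that adjacent means differ by 20 points; equal group sizes are assumed, so the grand mean is the simple average.

```python
from math import sqrt

def cohens_f(means, sigma):
    """Cohen's f for equal-n groups: SD of the group means over error SD."""
    k = len(means)
    grand_mean = sum(means) / k
    sum_sq = sum((m - grand_mean) ** 2 for m in means)
    return sqrt(sum_sq / k) / sigma

f = cohens_f([480, 500, 520], sigma=100)
print(f"f = {f:.3f}")
```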
I Want 70% Power
Analysis of Covariance
• Adding covariates to the ANOVA model
can increase power.
• If they are well correlated with the
dependent variable.
• Adjust the f statistic this way, where r is
the corr between covariate(s) and Y.
$f' = \frac{f}{\sqrt{1 - r^2}}$
k = 3, f = .1, power = .95, N = ?
• f = .1 is a small effect.
• Ouch, that is a lot of data we need here.
Add a Covariate, r = .7
$f' = \frac{.1}{\sqrt{1 - .49}} = .14$
Reduce the error df by 1 for each
covariate
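The covariate adjustment above is a one-line computation. In this sketch `adjust_f_for_covariate` is an illustrative name, and r is the correlation between the covariate(s) and Y.

```python
from math import sqrt

def adjust_f_for_covariate(f, r):
    """Inflate f to reflect error variance removed by the covariate(s)."""
    return f / sqrt(1 - r ** 2)

f_adj = adjust_f_for_covariate(f=0.1, r=0.7)
print(f"adjusted f = {f_adj:.2f}")
```

The adjusted f is what would be entered in the power program, with the error df reduced by one per covariate.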
Factorial ANOVA, Independent
Samples
• We plan a 3 x 4 ANOVA.
• Want power = 80% for medium-sized
effect.
• Sample sizes will be constant across cells
• Will be three F tests, with df =
– 2 (the three-level factor)
– 3 (the four-level factor)
– 6 (the interaction)
The Three-Level Factor
• For a medium effect, you need 158 cases,
= 158/12 = 13.2 per cell. Bump N up to
14(12) = 168 cases.
The Four-Level Factor
The Interaction
Which N to Obtain?
• You will not have the same power for each
effect.
• If only interested in main effects, get the N
required for them.
• Suppose we are interested in the interaction.
225/12 = 18.75 cases/cell, bump up to 19(12) =
228 cases.
• This would give you 93% power for the one main
effect and almost 90% for the other.
Let GPower Determine the f
• What f corresponds to 2 of 6% ?
• Click Determine and enter 2 and 1- 2
Adjusting f for Other Effects
• That f ignores the fact
that other effects in
the model reduce the
error variance.
• Suppose that I expect
other effects to
account for 14% of
the total variance.
• I enter 6% for the
effect and (100-6-14)
= 80% for error.
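The conversion behind the Determine dialog is, as far as I know, f = √(effect variance / error variance). The sketch applies it to both cases: the effect alone (6% vs. 94%) and the effect with 14% of the variance assigned to other effects (6% vs. 80%). `f_from_variance` is my own name for the helper.

```python
from math import sqrt

def f_from_variance(effect, error):
    """Cohen's f from proportions of variance for the effect and for error."""
    return sqrt(effect / error)

f_simple = f_from_variance(effect=0.06, error=0.94)
f_adjusted = f_from_variance(effect=0.06, error=0.80)
print(f"f ignoring other effects = {f_simple:.3f}")
print(f"f adjusted for other effects = {f_adjusted:.3f}")
```

Removing variance from the error term raises f, which lowers the N required for a given power.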
ANOVA With Related Factors
• For the univariate-approach analysis, you
need to add two more parameters
– The correlation between scores in one
condition and those in another condition
– Epsilon, if you suspect that correlation to differ
across pairs of conditions
• k = 4, f = .25 (medium), power = .95,
r = .5, ε = 1.
Need only 36 Cases
Increase r to .75
Estimate ε to be .6
Multivariate Approach:
No Sphericity Assumption
Contingency Table Analysis
(Two-Way)
• Effect size = $w = \sqrt{\sum_{i=1}^{k} \frac{(P_{1i} - P_{0i})^2}{P_{0i}}}$
• P0i is the population proportion in cell i
under the null hypothesis.
• P1i is the population proportion in cell i
under the alternative hypothesis.
• .1 is small, .3 medium, .5 large
• For a 2 x 2, w is identical to φ
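To make the definition of w concrete, the sketch below computes it for a made-up 2 x 2 table of alternative-hypothesis proportions, taking the null proportions from the product of the marginals, and compares the result with φ. All cell values are hypothetical.

```python
from math import sqrt

def cohens_w(p_null, p_alt):
    """Cohen's w from cell proportions under H0 and under H1."""
    return sqrt(sum((p1 - p0) ** 2 / p0 for p0, p1 in zip(p_null, p_alt)))

# Hypothetical 2 x 2 table under H1, cells listed row by row.
p_alt = [0.30, 0.20, 0.20, 0.30]
rows = [p_alt[0] + p_alt[1], p_alt[2] + p_alt[3]]
cols = [p_alt[0] + p_alt[2], p_alt[1] + p_alt[3]]
p_null = [r * c for r in rows for c in cols]   # independence model

w = cohens_w(p_null, p_alt)
phi = (p_alt[0] * p_alt[3] - p_alt[1] * p_alt[2]) / sqrt(
    rows[0] * rows[1] * cols[0] * cols[1]
)
print(f"w = {w:.3f}, phi = {phi:.3f}")
```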
2 x 4, 95% Power, w = .1:
Need 1,717 Cases !
MANOVA and DFA
• There will be one root (discriminant function,
canonical variate) for each treatment df.
• Each is a weighted linear combination of the Y
variables.
• Each maximizes the ratio of the among groups
SS to within group SS (the eigenvalue, λ).
• Within a set, each root is independent of the
others.
Test Statistics for a Given Effect
• For each df there will be one λ and θ, where $\theta = \frac{\lambda}{\lambda + 1}$
• Hotelling’s Trace: $\sum \lambda$
• Wilks’ Lambda: $\prod \frac{1}{1 + \lambda}$
• Pillai’s Trace: $\sum \theta$
• Roy’s Greatest Root: λ for the first root
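All four statistics are simple functions of the eigenvalues, using the standard textbook definitions. The sketch below computes them for two made-up eigenvalues; the function name and the values .5 and .1 are hypothetical.

```python
from math import prod

def manova_statistics(eigenvalues):
    """Multivariate test statistics from the eigenvalues (one per root)."""
    thetas = [lam / (1 + lam) for lam in eigenvalues]
    return {
        "hotelling_trace": sum(eigenvalues),
        "wilks_lambda": prod(1 / (1 + lam) for lam in eigenvalues),
        "pillai_trace": sum(thetas),
        "roy_greatest_root": max(eigenvalues),   # lambda for the first root
    }

stats = manova_statistics([0.5, 0.1])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Note that some programs report Roy's statistic as θ rather than λ for the first root.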
The Effect Size Parameter
• It is f.
• .1 is small, .25 medium, .4 large.
• GPower will convert from value of trace
to f if you wish.
• We plan a one-way MANOVA, four
groups, two Y variables.
• Want 95% power for a medium effect.
Planning the 1-Way MANOVA
Planning the Post-MANOVA
• What will you do if the MANOVA is
significant?
• You decide to do two univariate ANOVAs,
one on each outcome variable.
• How much power would you have for each
of those?
Oh My, Only 25% Power
But I Want 95% Power !
• You have it, for the canonical variate you
have created, just not for the original
variables.
• Maybe you should just work with the
canonical variates.
• But maybe you, or your editors, don’t
really understand canonical variates.
95% Power for the Post-MANOVA
Analyses of Variance
• Does the significant MANOVA protect you
from inflating familywise error?
• You decide to employ the Bonferroni
correction.
• To keep familywise error capped at .05,
you use a .025 criterion for each of the two
ANOVAs.
• How many cases do you need?
Need 320 Cases. Ouch !
The Type I Boogey Man
• Paranoid obsession with this creature can
really mess up your research life.
• If that univariate ANOVA is significant, you
plan to make, for each Y, six comparisons
(1-2, 1-3, 1-4, 2-3, 2-4, 3-4).
• Bonferroni per comparison alpha = .05/12
= .00416.
• How many cases now?
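The per-comparison alpha follows from counting the comparisons: every pair of the four groups, for each of the two outcome variables. A short check, with illustrative variable names:

```python
from math import comb

groups, outcomes, familywise_alpha = 4, 2, 0.05

pairs_per_outcome = comb(groups, 2)              # 1-2, 1-3, 1-4, 2-3, 2-4, 3-4
total_comparisons = pairs_per_outcome * outcomes
per_comparison_alpha = familywise_alpha / total_comparisons
print(f"{total_comparisons} comparisons, alpha each = {per_comparison_alpha:.5f}")
```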
Need 165 x 4 = 660 Cases !
• 330/2 groups = 165 per group.
Links
• Assorted Stats Links
• G*Power 3 – download site
– User Guide – sorted by type of analysis
– User Guide – sort by test distribution
• Internet Resources for Power Analysis
• List of the analyses available in G*Power