Stat 512 Class 1 - Purdue University

Topic 5: Power
Outline
• Review estimation and inference for
simple linear regression
• Power / Sample Size Estimation
– Slope
– Intercept
Simple Linear Normal
Error Regression Model
• Yi = β0 + β1Xi + εi
• εi is a Normally distributed random
variable with mean 0 and variance σ²
• εi and εj are uncorrelated, and therefore
independent under Normality
Parameter Estimators
• β1: $b_1 = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}$
• β0: $b_0 = \bar{Y} - b_1 \bar{X}$
• σ²: $s^2 = \dfrac{\sum (Y_i - b_0 - b_1 X_i)^2}{n - 2}$
95% Confidence Intervals
for β0 and β1
b0 ± tc·s(b0) and b1 ± tc·s(b1)
where tc = t(.975, n-2), the 97.5th
percentile of the t distribution with n-2
degrees of freedom
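As a minimal sketch of the interval above, the data step below builds b1 ± tc·s(b1) with the same TINV function used later in this topic. The values of b1 and sb1 are placeholders assumed purely for illustration; they are not taken from these slides.

```sas
/* Sketch: 95% confidence interval for the slope */
data ci;
  n=25; alpha=.05;
  b1=3.57; sb1=0.35;        /* placeholder estimate and standard error (assumed) */
  df=n-2;
  t_c=tinv(1-alpha/2,df);   /* t(.975, n-2) */
  lower=b1-t_c*sb1;         /* lower confidence limit */
  upper=b1+t_c*sb1;         /* upper confidence limit */
run;
proc print data=ci;
run;
```

The same steps apply to the intercept by substituting b0 and s(b0).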
Significance tests for β0
and β1
• H0: β0 = 0, Ha: β0 ≠ 0
t* = b0/s(b0)
• H0: β1 = 0, Ha: β1 ≠ 0
t* = b1/s(b1)
Reject H0 if the P-value is small (<.05)
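The estimates, t statistics, P-values, and confidence limits reviewed so far can all be obtained in one step from PROC REG. The sketch below uses a small made-up data set (the name toy and its values are hypothetical); the CLB option requests confidence limits for β0 and β1.

```sas
/* Sketch: estimation and inference with PROC REG on hypothetical data */
data toy;
  input x y;
  datalines;
1 3
2 5
3 4
4 8
5 9
;
run;
proc reg data=toy;
  model y=x / clb alpha=0.05;   /* clb: confidence limits for b0 and b1 */
run;
quit;
```

In the output, the Parameter Estimates table gives b0, b1, their standard errors, t* and the two-sided P-values; Root MSE is s.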
Power
• The power of a significance test is the
probability that the null hypothesis is
rejected when, in fact, it is false.
• This is 1-P(Type II error)
• This probability depends on the
particular value of the parameter in Ha.
Power for β1
• H0: β1 = 0, Ha: β1 ≠ 0
t* = b1/s(b1)
• When H0 is true, t* ~ t(n-2)
• We reject H0 when |t*| > t(1-α/2, n-2)
Power for β1
• To compute power, we need to find
P(|t*| > t(1-α/2, n-2))
for arbitrary values of β1
• Note: When β1 = 0, calculation gives α
Power for β1
• When H0 is false, t* ~ t(n-2, δ)
• This refers to the noncentral t distribution
• δ = β1/σ(b1) is the noncentrality parameter
• Need to assume values to get σ(b1)
• Often use prior info or pilot study data
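The points above can be collected into a single expression for the power. The notation $F_{n-2,\delta}$ for the cumulative distribution function of the noncentral t distribution is introduced here only for illustration; it is not used on the slides.

```latex
\[
\text{Power}(\delta) = P\bigl(|t^*| > t_c\bigr)
  = 1 - F_{n-2,\delta}(t_c) + F_{n-2,\delta}(-t_c),
\qquad t_c = t(1-\alpha/2;\, n-2), \quad \delta = \frac{\beta_1}{\sigma(b_1)}
\]
```

When δ = 0 the distribution is the ordinary (central) t, so the expression reduces to 1 - (1 - α/2) + α/2 = α, in agreement with the note that the calculation gives α when β1 = 0.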
Power for β1
• $\sigma^2(b_1) = \dfrac{\sigma^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
• Need to assume values for σ², n, and $\sum_{i=1}^{n} (X_i - \bar{X})^2$
• KNNL use tables, see pg 51
• We will use SAS
Example of Power for β1
• From KNNL pg 51
• They assume σ² = 2500, n = 25, and
$\sum (X_i - \bar{X})^2 = 19800$ based on s = 48.82
and other results from pg 20
• Results in
$\sigma^2(b_1) = \sigma^2 / \sum (X_i - \bar{X})^2 = 2500/19800 = 0.1263$
Example of Power for β1
• Suppose β1 were 1.5
• We can calculate δ = β1/σ(b1) and use
the distribution t* ~ t(n-2, δ) to find
P(|t*| > t(1-α/2, n-2))
• We will use a function to calculate this
probability
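For reference, the noncentrality parameter for this example can be worked out by hand from the assumed values (σ² = 2500, sum of squares 19800, β1 = 1.5):

```latex
\[
\sigma(b_1) = \sqrt{\frac{2500}{19800}} \approx 0.3553,
\qquad
\delta = \frac{\beta_1}{\sigma(b_1)} = \frac{1.5}{0.3553} \approx 4.22
\]
```

With n = 25, the reference distribution is therefore the noncentral t with 23 degrees of freedom and δ ≈ 4.22.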
SAS CODE
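data a1;
  /* Assumed inputs from KNNL pg 51 */
  n=25; sig2=2500; ssx=19800;
  alpha=.05; beta1=1.5;
  sig2b1=sig2/ssx; df=n-2;       /* variance of b1 and error df */
  delta=beta1/sqrt(sig2b1);      /* noncentrality parameter */
  t_c=tinv(1-alpha/2,df);        /* critical value t(1-alpha/2, n-2) */
  /* P(|t*| > t_c) under the noncentral t(df, delta) */
  power=1-probt(t_c,df,delta)
        +probt(-t_c,df,delta);
  output;
run;
proc print data=a1;
run;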
SAS OUTPUT
Obs   n   sig2    ssx   alpha  beta1   sig2b1  df    delta      t_c    power
  1  25   2500  19800    0.05    1.5  0.12626  23  4.22137  2.06866  0.98121
SAS CODE
*Computes power for range of beta1;
data a2;
  n=25; sig2=2500; ssx=19800;
  alpha=.05; sig2b1=sig2/ssx; df=n-2;
  t_c=tinv(1-alpha/2,df);
  do beta1=-2.0 to 2.0 by .05;
    delta=beta1/sqrt(sig2b1);
    power=1-probt(t_c,df,delta)
          +probt(-t_c,df,delta);
    output;   /* one row per value of beta1 */
  end;
run;
SAS CODE
title1 'Power for the slope in Simple linear regression';
symbol1 v=none i=join;
proc gplot data=a2;
  plot power*beta1;
run;
quit;
proc print data=a2;
run;
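The same calculation can be turned around for sample size estimation by looping over n instead of beta1. The sketch below is not from KNNL or the course files: it assumes the spread of X per observation stays at 19800/25, so that the sum of squares grows in proportion to n. The names a3 and ssx_per_obs are hypothetical.

```sas
/* Sketch: power as a function of n for beta1=1.5 */
data a3;
  sig2=2500; alpha=.05; beta1=1.5;
  ssx_per_obs=19800/25;          /* assumption: sum of squares of X grows with n */
  do n=5 to 40;
    ssx=n*ssx_per_obs;
    sig2b1=sig2/ssx; df=n-2;
    delta=beta1/sqrt(sig2b1);    /* noncentrality parameter for this n */
    t_c=tinv(1-alpha/2,df);
    power=1-probt(t_c,df,delta)
          +probt(-t_c,df,delta);
    output;
  end;
run;
proc print data=a3;
  var n ssx delta power;
run;
```

Reading down the printed table, the smallest n whose power exceeds a chosen target (for example 0.80) is the estimated sample size under these assumptions; the row for n=25 matches the single-value example.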
Background Reading
• File knnl051.sas contains the SAS code
used in this Topic (addresses example
on page 51)
• Chapter 2
– 2.4 : Estimation of E(Yh)
– 2.5 : Prediction of new observation