Inference about the Slope and Intercept


Decomposition of Sum of Squares
• The total sum of squares (SS) in the response variable is
$$SSTO = \sum_i (Y_i - \bar{Y})^2$$
• The total SS can be decomposed into two main sources: error SS and regression SS.
• The error SS is
$$SSE = \sum_i e_i^2$$
• The regression SS is
$$SSR = b_1^2 \sum_i (X_i - \bar{X})^2$$
It is the amount of variation in the Y's that is explained by the linear relationship of Y with X.
STA302/1001 - week 4
Claims
• First, SSTO = SSR + SSE, that is,
$$SSTO = \sum_i (Y_i - \bar{Y})^2 = b_1^2 \sum_i (X_i - \bar{X})^2 + \sum_i e_i^2$$
• Proof: ….
• An alternative decomposition is
$$SSTO = \sum_i (Y_i - \bar{Y})^2 = \sum_i (\hat{Y}_i - \bar{Y})^2 + \sum_i (Y_i - \hat{Y}_i)^2$$
• Proof: Exercises.
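The decomposition SSTO = SSR + SSE can be checked numerically. A minimal sketch in Python, assuming NumPy is available; the data values are made up for illustration and are not the course data set:

```python
import numpy as np

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

n = len(x)
Sxx = np.sum((x - x.mean()) ** 2)
Sxy = np.sum((x - x.mean()) * (y - y.mean()))
b1 = Sxy / Sxx                      # least-squares slope
b0 = y.mean() - b1 * x.mean()       # least-squares intercept

y_hat = b0 + b1 * x
e = y - y_hat                       # residuals

SSTO = np.sum((y - y.mean()) ** 2)
SSE = np.sum(e ** 2)
SSR = b1 ** 2 * Sxx                 # equivalently np.sum((y_hat - y.mean()) ** 2)

print(SSTO, SSR + SSE)              # the two agree up to floating-point rounding
```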
Analysis of Variance Table
• The decomposition of SS discussed above is usually summarized in an analysis of variance (ANOVA) table. For simple linear regression the standard layout is:

Source       df       SS      MS                  F
Regression   1        SSR     MSR = SSR/1         MSR/MSE
Error        n − 2    SSE     MSE = SSE/(n − 2)
Total        n − 1    SSTO

• Note that the MSE is s², our estimate of σ².
Coefficient of Determination
• The coefficient of determination is
$$R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}$$
• It must satisfy 0 ≤ R2 ≤ 1.
• R² gives the proportion of variation in the Y's that is explained by the
regression line.
Claim
• R² = r², that is, the coefficient of determination is the square of the
correlation coefficient.
• Proof:…
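The claim R² = r² is easy to verify numerically. A sketch in Python, assuming NumPy is available, using made-up illustrative data:

```python
import numpy as np

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)

SSTO = np.sum((y - y.mean()) ** 2)
SSE = np.sum(e ** 2)

R2 = 1 - SSE / SSTO                 # coefficient of determination
r = np.corrcoef(x, y)[0, 1]         # sample correlation coefficient

print(R2, r ** 2)                   # identical up to floating-point rounding
```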
Important Comments about R2
• It is a useful measure but…
• There is no absolute rule about how big it should be.
• It is not resistant to outliers.
• It is not meaningful for models with no intercept.
• It is not useful for comparing models unless they have the same Y and one
set of predictors is a subset of the other.
ANOVA F Test
• The ANOVA table gives us another test of H0: β1 = 0.
• The test statistic is
$$F_{stat} = \frac{MSR}{MSE}$$
• Derivations …
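The F statistic and its p-value can be computed directly from the ANOVA quantities. A sketch in Python, assuming NumPy and SciPy are available, with made-up illustrative data:

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

n = len(x)
Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)

SSR = b1 ** 2 * Sxx
SSE = np.sum(e ** 2)
MSR = SSR / 1                       # regression df = 1
MSE = SSE / (n - 2)                 # error df = n - 2

F = MSR / MSE
p_value = stats.f.sf(F, 1, n - 2)   # P(F_{1, n-2} > F), the p-value for H0: beta1 = 0
print(F, p_value)
```

A small p-value leads us to reject H0: β1 = 0, i.e., to conclude there is a linear relationship between X and Y.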
Prediction of Mean Response
• Very often, we want to use the estimated regression line to make
predictions about the mean of the response for a particular X value
(assumed to be fixed).
• We know that the least-squares line $\hat{Y} = b_0 + b_1 X$ is an estimate of
$E(Y \mid X) = \beta_0 + \beta_1 X$.
• Now, we can pick a point, X = x* (in the range of the regression line);
then $\hat{Y}^* = b_0 + b_1 x^*$ is an estimate of $E(Y \mid X = x^*) = \beta_0 + \beta_1 x^*$.
• Claim:
$$\operatorname{Var}(\hat{Y}^* \mid X = x^*) = \sigma^2 \left( \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}} \right)$$
• Proof: …
• This is the variance of the estimate of E(Y | X = x*).
Confidence Interval for E(Y | X = x*)
• For a given x, x*, a 100(1−α)% CI for the mean value of Y is
$$\hat{Y}^* \pm t_{n-2;\,\alpha/2} \; s \sqrt{\frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}$$
where $s^2 = MSE$.
Example
• Consider the smoking and cancer data.
• Suppose we wish to predict the mean mortality index when the
smoking index is 101, that is, when x* = 101….
Prediction of New Observation
• Suppose we want to predict a particular value of Y* when X = x*.
• The predicted value of a new point measured when X = x* is
$$\hat{Y}^* = b_0 + b_1 x^*$$
• Note, the above predicted value is the same as the estimate of
E(Y | X = x*).
• The predicted value $\hat{Y}^*$ has two sources of variability. One is due
to the regression line being estimated by b0 + b1X; the second is
due to ε*, i.e., points don't fall exactly on the line.
• To calculate the variance of the error of prediction, we look at the
difference $Y^* - \hat{Y}^*$ ….
Prediction Interval for New Observation
• A 100(1−α)% prediction interval for Y* when X = x* is
$$\hat{Y}^* \pm t_{n-2;\,\alpha/2} \; s \sqrt{1 + \frac{1}{n} + \frac{(x^* - \bar{x})^2}{S_{XX}}}$$
• This is not a confidence interval; CIs are for parameters, and here we are
estimating the value of a random variable.
• The prediction interval is wider than the CI for E(Y | X = x*).
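The extra 1 under the square root makes the prediction interval strictly wider than the CI at the same x*. A sketch in Python, assuming NumPy and SciPy are available, with made-up illustrative data:

```python
import numpy as np
from scipy import stats

# Hypothetical data for illustration only.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

n = len(x)
Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()
e = y - (b0 + b1 * x)
s = np.sqrt(np.sum(e ** 2) / (n - 2))

x_star = 3.5                               # arbitrary point in the range of x
y_hat_star = b0 + b1 * x_star
t_crit = stats.t.ppf(0.975, n - 2)         # 95% intervals

se_mean = s * np.sqrt(1 / n + (x_star - x.mean()) ** 2 / Sxx)      # CI term
se_pred = s * np.sqrt(1 + 1 / n + (x_star - x.mean()) ** 2 / Sxx)  # PI term

ci = (y_hat_star - t_crit * se_mean, y_hat_star + t_crit * se_mean)
pi = (y_hat_star - t_crit * se_pred, y_hat_star + t_crit * se_pred)
print(ci, pi)                              # the prediction interval is wider
```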
Dummy Variable Regression
• A dummy or indicator variable takes two values: 0 or 1.
• It indicates which category an observation is in.
• Example…
• Interpretation of regression coefficient in a dummy variable
regression…
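One standard interpretation in simple regression on a dummy variable: b0 estimates the mean response in the x = 0 group, and b1 estimates the difference between the two group means. A sketch in Python, assuming NumPy is available, with a made-up two-group example:

```python
import numpy as np

# Hypothetical example: x is a 0/1 indicator of group membership.
x = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
y = np.array([5.1, 4.8, 5.3, 5.0, 7.2, 6.9, 7.1, 7.0])

Sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / Sxx
b0 = y.mean() - b1 * x.mean()

# With a dummy predictor, b0 equals the sample mean of the x = 0 group,
# and b1 equals the difference between the two group means.
print(b0, y[x == 0].mean())
print(b1, y[x == 1].mean() - y[x == 0].mean())
```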