Multiple Regression
Fitting Models for Multiple Independent Variables
By Ellen Ludlow
If you wanted to predict someone’s weight
based on their height, you would collect
data by recording the height and weight
and fit a model.
Let’s say our population is males ages 16-25, and this is a table of collected data...
height (in):  60  63  65  66  67  68  68  69  70  70  71  72  72  73  75
weight (lb): 120 135 130 143 137 149 144 150 156 152 154 162 169 163 168
Next, we graph the data...

[Scatterplot: Height vs Weight — Height (ins) on one axis, Weight (lbs) on the other]

And because the data looks linear, fit an LSR line…
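As a quick sketch of this fitting step (not part of the original slides), the LSR line can be computed with NumPy. Two weight entries (135, 137) are assumed reconstructions of values that were garbled in the source table, so treat the fitted numbers as illustrative only.

```python
import numpy as np

# Data from the table above.  The 135 and 137 weight entries are assumed
# reconstructions of garbled source values (shown there as 35 and 37).
height = np.array([60, 63, 65, 66, 67, 68, 68, 69, 70, 70, 71, 72, 72, 73, 75])
weight = np.array([120, 135, 130, 143, 137, 149, 144, 150, 156, 152, 154, 162, 169, 163, 168])

# Fit the least-squares regression (LSR) line: weight = b0 + b1 * height.
# np.polyfit returns coefficients from highest degree down: [slope, intercept].
b1, b0 = np.polyfit(height, weight, 1)
print(f"weight-hat = {b0:.2f} + {b1:.3f} * height")
```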
But weight isn’t the only factor that has an
impact on someone’s height. The height of
someone’s parents may be another
predictor.
With multiple regression you may have more
than one independent variable, so you could
use someone’s weight and his parents’
height to predict his own height.
Our new table, which adds the average
height of each subject’s parents, looks like this…
height:           60  63  65  66  67  68  68  69  70  70  71  72  72  73  75
weight:          120 135 130 143 137 149 144 150 156 152 154 162 169 163 168
parent's height:  59  67  62  59  71  66  71  67  69  73  69  75  72  69  73
This data can’t be graphed like simple linear
regression, because there are two
independent variables.
There is software, however, such as Minitab,
that can analyze data with multiple independent
variables.
Let’s take a look at a Minitab output for our
data…
Predictor   Coef      Stdev     t-ratio   p
Constant    25.028    4.326     5.79      0.000
weight      0.24020   0.03140   7.65      0.000
parenth     0.11493   0.09035   1.27      0.227

s = 1.165    R-sq = 92.6%    R-sq(adj) = 91.4%

Analysis of Variance
SOURCE       DF   SS       MS       F       p
Regression   2    205.31   102.65   75.62   0.000
Error        12   16.29    1.36
Total        14   221.60

What does all this mean?
First, let’s look at the multiple regression
model…
The general model for multiple regression is
similar to the model for simple linear
regression.
Simple linear regression model:
y = β₀ + β₁x
Multiple regression model:
y = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ
Just like linear regression, when you fit a
multiple regression to data, the terms in the
model equation are statistics, not parameters.
A multiple regression model using statistical
notation looks like...
ŷ = B₀ + B₁x₁ + B₂x₂ + … + Bₖxₖ
where k is the number of independent variables.
The multiple regression model for our data
is…
predicted height = 25.028 + 0.24020·weight + 0.11493·parenth
We get the coefficient values from the Minitab
output…
Predictor   Coef      Stdev     t-ratio   p
Constant    25.028    4.326     5.79      0.000
weight      0.24020   0.03140   7.65      0.000
parenth     0.11493   0.09035   1.27      0.227
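As a sketch, the coefficient column can be reproduced with an ordinary least-squares solve in NumPy. The 135 and 137 weight entries are assumed reconstructions of garbled source values, so the results may differ slightly from the quoted Minitab output.

```python
import numpy as np

height  = np.array([60, 63, 65, 66, 67, 68, 68, 69, 70, 70, 71, 72, 72, 73, 75])
weight  = np.array([120, 135, 130, 143, 137, 149, 144, 150, 156, 152, 154, 162, 169, 163, 168])
parenth = np.array([59, 67, 62, 59, 71, 66, 71, 67, 69, 73, 69, 75, 72, 69, 73])  # parents' avg height

# Design matrix with an intercept column: height = B0 + B1*weight + B2*parenth.
X = np.column_stack([np.ones(len(height)), weight, parenth])
coef, *_ = np.linalg.lstsq(X, height, rcond=None)
B0, B1, B2 = coef
print(f"height-hat = {B0:.3f} + {B1:.5f}*weight + {B2:.5f}*parenth")
```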
Once the regression is fitted, we need to
know how well the model fits the data…
•First, we check and see if there is a good
overall fit.
•Then, we test the significance of each
independent variable. You will notice that
this is the same way we test for
significance in a simple linear regression.
The Overall Test…
Hypotheses:
H₀: β₁ = β₂ = β₃ = … = βₖ = 0
All independent variables are unimportant for
predicting y.
Hₐ: At least one βₖ ≠ 0
At least one independent variable is useful
for predicting y.
What type of test should be used?
The distribution used is the F distribution,
named for R. A. Fisher. The F-statistic is used with
this distribution.

[Figure: the F distribution]
How do you calculate the F-statistic?
It can easily be found in the Minitab output, along
with the p-value…

SOURCE       DF   SS       MS       F       p
Regression   2    205.31   102.65   75.62   0.000
Error        12   16.29    1.36
Total        14   221.60
Or you can calculate it by hand…
But, before you can calculate the F-statistic,
you need to be introduced to some other
terms.
Regression sum of squares (regression SS)
- the variation in Y accounted for by the
regression model with respect to the mean
model.
Error sum of squares (error SS) - the
variation in Y not accounted for by the
regression model.
Total sum of squares (total SS) - the total
variation in Y.
Now that we understand these terms, we need to
know how to calculate them…

Regression SS = Σᵢ₌₁ⁿ (Ŷᵢ − Ȳ)²

Error SS = Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²

Total SS = Σᵢ₌₁ⁿ (Yᵢ − Ȳ)²

Total SS = Regression SS + Error SS:

Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² = Σᵢ₌₁ⁿ (Ŷᵢ − Ȳ)² + Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)²
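The decomposition can be checked numerically for any least-squares fit that includes an intercept. The sketch below uses the height column as Y and a simple index as a stand-in predictor (an assumption for illustration only; the identity holds regardless of the predictor chosen).

```python
import numpy as np

y = np.array([60.0, 63, 65, 66, 67, 68, 68, 69, 70, 70, 71, 72, 72, 73, 75])
X = np.column_stack([np.ones(len(y)), np.arange(len(y), dtype=float)])  # intercept + toy predictor

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
yhat = X @ coef  # fitted values from the least-squares model

reg_ss   = np.sum((yhat - y.mean()) ** 2)  # variation explained by the model
err_ss   = np.sum((y - yhat) ** 2)         # variation left unexplained
total_ss = np.sum((y - y.mean()) ** 2)     # total variation in Y

# Total SS = Regression SS + Error SS (up to floating-point error).
print(total_ss, reg_ss + err_ss)
```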
There are also regression mean of squares,
error mean of squares, and total mean of
squares (abbreviated MS).
To calculate these terms, you divide the sum of
squares by its respective degrees of
freedom…
Regression d.f. = k
Error d.f. = n − k − 1
Total d.f. = n − 1
where k is the number of independent variables and n
is the total number of observations used to calculate
the regression.
So…

Regression MS = Regression SS / k = Σᵢ₌₁ⁿ (Ŷᵢ − Ȳ)² / k

Error MS = Error SS / (n − k − 1) = Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² / (n − k − 1)

Total MS = Total SS / (n − 1) = Σᵢ₌₁ⁿ (Yᵢ − Ȳ)² / (n − 1)

(Note that, unlike the sums of squares, the mean squares do not add up: Regression MS + Error MS ≠ Total MS in general.)
Both sum of squares and mean squares
values can be found in Minitab…

SOURCE       DF   SS       MS       F       p
Regression   2    205.31   102.65   75.62   0.000
Error        12   16.29    1.36
Total        14   221.60

Now we can calculate the F-statistic.
Test Statistic and Distribution
Test statistic:

F = model mean square / error mean square = 102.65 / 1.36 ≈ 75.48

which is very close to the F-statistic from Minitab
(75.62); the small difference comes from rounding
the mean squares before dividing.
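The same calculation without the intermediate rounding recovers Minitab's value. A quick sketch, using the SS figures quoted in the ANOVA table (n = 15 observations, k = 2 predictors):

```python
# Sums of squares and dimensions quoted from the ANOVA table in the slides.
reg_ss, err_ss = 205.31, 16.29
n, k = 15, 2

reg_ms = reg_ss / k            # regression MS = regression SS / k
err_ms = err_ss / (n - k - 1)  # error MS = error SS / (n - k - 1)
f_stat = reg_ms / err_ms       # F = regression MS / error MS

print(f"F = {f_stat:.2f}")     # matches Minitab's 75.62 once rounding is avoided
```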
The p-value for the F-statistic is then found in
an F-distribution table. As you saw before, it
can also be easily calculated by software.
A small p-value leads us to reject the null hypothesis
that none of the independent variables are
significant. That is to say, at least one of the
independent variables is significant.
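For instance, assuming SciPy is available, the upper-tail p-value for the observed F can be computed directly (a sketch, not part of the original slides):

```python
from scipy.stats import f

# Observed F-statistic with k = 2 and n - k - 1 = 12 degrees of freedom.
p_value = f.sf(75.62, 2, 12)  # sf = survival function = upper-tail area
print(p_value)                # a very small number, effectively 0
```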
The conclusion in the context of our data is:
we have strong evidence (p ≈ 0) to
reject the null hypothesis. That is to say,
either someone’s weight or his parents’
average height is significant in predicting his
height.
Once you know that at least one independent
variable is significant, you can go on to test
each independent variable separately.
Testing Individual Terms
If an independent variable does not contribute
significantly to predicting the value of Y, the
coefficient of that variable will be 0.
The test of these hypotheses determines
whether the estimated coefficient is significantly
different from 0.
From this, we can tell whether an independent
variable is important for predicting the dependent
variable.
Test for Individual Terms:
H₀: βⱼ = 0
The independent variable, xⱼ, is not important
for predicting y.
Hₐ: βⱼ ≠ 0 (or βⱼ > 0, or βⱼ < 0)
The independent variable, xⱼ, is important for
predicting y.
where j identifies a specified independent variable.
Test Statistic:

t = Bⱼ / s(Bⱼ)

d.f. = n − k − 1

Remember, this test is only to be
performed if the overall model test
is significant.
[Figure: the t-distribution]
Tests of individual terms for significance are
the same as a test of significance in simple
linear regression.
A small p-value means that the independent
variable is significant.
Predictor   Coef      Stdev     t-ratio   p
Constant    25.028    4.326     5.79      0.000
weight      0.24020   0.03140   7.65      0.000
parenth     0.11493   0.09035   1.27      0.227
This test of significance shows that weight is
a significant independent variable for
predicting height, but average parent height
is not.
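The t-ratios in the table can be verified directly from the quoted coefficients and standard errors, since t = coefficient / stdev:

```python
# Coefficients and standard errors quoted from the Minitab output.
rows = {
    "Constant": (25.028, 4.326),
    "weight":   (0.24020, 0.03140),
    "parenth":  (0.11493, 0.09035),
}

for name, (coef, stdev) in rows.items():
    print(f"{name}: t = {coef / stdev:.2f}")  # reproduces 5.79, 7.65, 1.27
```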
Now that you know how to do tests of
significance for multiple regression, there are
many other things that you can learn, such
as…
•How to create confidence intervals
•How to use categorical variables in multiple
regression
•How to test for significance in groups of
independent variables