Lecture 27
• Chapter 20.3: Nominal Variables
• HW6 due by 5 p.m. Wednesday
• Office hour today after class. Extra office
hour Wednesday from 9-10.
• Final Exam: May 1st, 4-6 p.m., SHDH 351
• Practice Exam will be posted tomorrow.
20.3 Nominal Independent Variables
• In many real-life situations one or more independent
variables are nominal.
• Including nominal variables in a regression analysis
model is done via indicator (or dummy) variables.
• An indicator variable (I) can take one of two values, “zero” or “one”. Examples:
– I = 1 if the temperature was below 50°, 0 if the temperature was 50° or more
– I = 1 if data were collected before 1980, 0 if data were collected after 1980
– I = 1 if a degree earned is in Finance, 0 if a degree earned is not in Finance
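A minimal sketch of the temperature indicator defined above. The function name `below_50` and the sample readings are illustrative assumptions, not part of the lecture.

```python
def below_50(temperature):
    """Indicator I: 1 if the temperature was below 50 degrees, 0 otherwise."""
    return 1 if temperature < 50 else 0

# Hypothetical readings, chosen to include the boundary value 50.
readings = [42, 50, 63]
print([below_50(t) for t in readings])  # [1, 0, 0]
```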
Nominal Independent Variables;
Example: Auction Car Price (II)
• Example 18.2 - revised (Xm18-02a)
– Recall: A car dealer wants to predict the auction
price of a car.
– The dealer believes now that odometer reading
and the car color are variables that affect a
car’s price.
– Three color categories are considered:
• White
• Silver
• Other colors
Note: Color is a
nominal variable.
Nominal Independent Variables;
Example: Auction Car Price (II)
• Example 18.2 - revised (Xm18-02b)
I1 = 1 if the color is white, 0 if the color is not white
I2 = 1 if the color is silver, 0 if the color is not silver
The category “Other colors” is defined by: I1 = 0; I2 = 0
How Many Indicator Variables?
• Note: To represent the situation of three possible
colors we need only two indicator variables.
• Conclusion: To represent a nominal variable with
m possible categories, we must create m-1
indicator variables.
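The m-1 rule can be sketched as follows for the three car colors. The helper name `encode` and the choice of the last category as the baseline are assumptions made for illustration. The loop also shows why a third indicator would add nothing: its value is fully determined by the other two.

```python
CATEGORIES = ["white", "silver", "other"]  # m = 3; the last entry is the baseline

def encode(color, categories=CATEGORIES):
    """Return the m-1 indicators for `color`; the baseline maps to all zeros."""
    if color not in categories:
        raise ValueError(f"unknown category: {color}")
    return [1 if color == c else 0 for c in categories[:-1]]

for color in CATEGORIES:
    i1, i2 = encode(color)
    # A hypothetical third indicator for "other" is always 1 - I1 - I2,
    # so it carries no information beyond I1 and I2.
    i3 = 1 - i1 - i2
    print(color, (i1, i2), "implied I3 =", i3)
```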
Nominal Independent Variables;
Example: Auction Car Price
• Solution
– The proposed model is
y = b0 + b1(Odometer) + b2I1 + b3I2 + e
– The data

Price   Odometer   I-1   I-2   Category
14636   37388      1     0     White car
14122   44758      1     0     White car
14016   45833      0     0     Other color
15590   30862      0     0     Other color
15568   31705      0     1     Silver color
14718   34010      0     1     Silver color
.       .          .     .
.       .          .     .
Example: Auction Car Price
The Regression Equation
From JMP (Xm18-02b) we get the regression equation
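PRICE = 16701 - .0555(Odometer) + 90.48(I-1) + 295.48(I-2)
Setting the indicators for each color category gives three equations:
– Silver: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(1)
– White: Price = 16701 - .0555(Odometer) + 90.48(1) + 295.48(0)
– Other colors: Price = 16701 - .0555(Odometer) + 90.48(0) + 295.48(0)
[Plot: Price (vertical axis) against Odometer (horizontal axis), one line per color category.]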
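As a rough illustration, the fitted equation can be evaluated for each color category at a common odometer reading. The coefficients are copied from the JMP equation above. The function name `predict_price` and the 40,000-mile reading are assumptions chosen only for the example.

```python
# Coefficients copied from the fitted regression equation (Xm18-02b).
INTERCEPT = 16701.0
B_ODOMETER = -0.0555
B_WHITE = 90.48    # coefficient of I-1
B_SILVER = 295.48  # coefficient of I-2

def predict_price(odometer, color):
    """Predicted auction price; any color other than white or silver is the baseline."""
    i1 = 1 if color == "white" else 0
    i2 = 1 if color == "silver" else 0
    return INTERCEPT + B_ODOMETER * odometer + B_WHITE * i1 + B_SILVER * i2

miles = 40000  # hypothetical odometer reading
for color in ("white", "silver", "other"):
    print(color, round(predict_price(miles, color), 2))
```

Because all three equations share the odometer coefficient, the gap between any two categories is the same at every odometer reading.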
Example: Auction Car Price
The Regression Equation
From JMP we get the regression equation
PRICE = 16701-.0555(Odometer)+90.48(I-1)+295.48(I-2)
• For each additional mile on the odometer, the auction price decreases by 5.55 cents on average, holding color fixed.
• A white car sells, on average, for $90.48 more than a car of the “Other color” category with the same odometer reading.
• A silver car sells, on average, for $295.48 more than a car of the “Other color” category with the same odometer reading.
Comprehension Question
From JMP we get the regression equation
PRICE = 16701-.0555(Odometer)+90.48(I-1)+295.48(I-2)
• Consider two cars, one white and one silver,
with the same number of miles. How much
more on average does the silver car sell for
than the white car?
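One way to work this out from the equation alone: when the two cars have the same odometer reading, the intercept and the odometer term cancel, leaving only the difference between the two indicator coefficients.

\[
\widehat{\text{Price}}_{\text{silver}} - \widehat{\text{Price}}_{\text{white}}
= \bigl(16701 - 0.0555\,x + 295.48\bigr) - \bigl(16701 - 0.0555\,x + 90.48\bigr)
= 295.48 - 90.48 = 205.00
\]

Here x denotes the common odometer reading (a symbol introduced only for this calculation). On average, the silver car is predicted to sell for $205 more than the white car.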
Example: Auction Car Price
The Regression Equation
SUMMARY OUTPUT (Xm18-02b)

Regression Statistics
Multiple R          0.8355
R Square            0.6980
Adjusted R Square   0.6886
Standard Error      284.5
Observations        100

ANOVA
             df   SS         MS        F       Significance F
Regression   3    17966997   5988999   73.97   0.0000
Residual     96   7772564    80964
Total        99   25739561

            Coefficients   Standard Error   t Stat   P-value
Intercept   16701          184.3330576      90.60    0.0000
Odometer    -0.0555        0.0047           -11.72   0.0000
I-1         90.48          68.17            1.33     0.1876
I-2         295.48         76.37            3.87     0.0002

• There is insufficient evidence to infer that a white color car and a car of “other color” sell for a different auction price.
• There is sufficient evidence to infer that a silver color car sells for a larger price than a car of the “other color” category.
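The summary figures in this output follow from the sums of squares, degrees of freedom, coefficients and standard errors. The sketch below recomputes them from the printed values using the standard formulas. The variable names are illustrative, and small differences from the printed output are possible because the inputs are rounded.

```python
import math

# Values copied from the ANOVA table.
ss_regression, df_regression = 17966997, 3
ss_residual, df_residual = 7772564, 96
ss_total = ss_regression + ss_residual
n = 100  # observations

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual              # F = MSR / MSE
r_square = ss_regression / ss_total               # R^2 = SSR / SST
adj_r_square = 1 - (1 - r_square) * (n - 1) / df_residual
std_error = math.sqrt(ms_residual)                # standard error of estimate

# t Stat = coefficient / standard error, for the two indicator variables.
t_white = 90.48 / 68.17
t_silver = 295.48 / 76.37

print(round(f_stat, 2), round(r_square, 4), round(adj_r_square, 4))
print(round(std_error, 1), round(t_white, 2), round(t_silver, 2))
```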
Nominal Independent Variables;
Example: MBA Program Admission
(MBA II)
• Recall: The Dean wanted to evaluate applications for
the MBA program by predicting future performance of
the applicants.
• The following three predictors were suggested:
– Undergraduate GPA
– GMAT score
– Years of work experience
• It is now believed that the type of undergraduate degree should be included in the model.
Note: The undergraduate degree is nominal data.
Nominal Independent Variables;
Example: MBA Program Admission
(II)
I1 = 1 if B.A., 0 otherwise
I2 = 1 if B.B.A., 0 otherwise
I3 = 1 if B.Sc. or B.Eng., 0 otherwise
The category “Other group” is defined by: I1 = 0; I2 = 0; I3 = 0
MBA Program Admission (II)
Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Ratio   Prob > F
Model      6    54.751842        9.12531       17.1554   <.0001
Error      82   43.617378        0.53192
C. Total   88   98.369220

Parameter Estimates
Term        Estimate    Std Error   t Ratio   Prob>|t|
Intercept   0.2886998   1.396475    0.21      0.8367
UnderGPA    -0.006059   0.113968    -0.05     0.9577
GMAT        0.0127928   0.001356    9.43      <.0001
Work        0.0981817   0.030323    3.24      0.0017
Degree[1]   -0.443872   0.146288    -3.03     0.0032
Degree[2]   0.6068391   0.160425    3.78      0.0003
Degree[3]   -0.064081   0.138484    -0.46     0.6448
Practice Problems
• 20.6, 20.8, 20.22, 20.24