Interactions in regression models: An introduction

Calculating interaction effects
from OLS coefficients:
Interaction between two categorical
independent variables
Jane E. Miller, PhD
The Chicago Guide to Writing about Multivariate Analysis, 2nd edition.
Overview
• General equation for a model with main effects
and interactions
• Review: Coding of main effects and interaction
terms
• Solving for the interaction pattern based on
estimated coefficients
• Graphical depiction of the interaction pattern
Estimated coefficients
OLS model of birth weight (grams)

Term                                  β
Main effect terms
  Non-Hispanic black (NHB)         –168
  Less than high school (<HS)       –54
  High school diploma (=HS)         –62
Interaction terms
  NHB_<HS                           –39
  NHB_=HS                           +18

Reference category: Non-Hispanic whites with >HS education.
All variables are dummy-coded: 1 = named value, 0 = other values.
Interpreting the main effects
• The main effect terms estimate the difference in birth
weight relative to those in the reference category
(non-Hispanic whites whose mothers have more than a
high school education).
– βNHB is an estimate of the difference in intercept between
non-Hispanic black infants and those in the reference
category.
– β<HS and β=HS estimate the difference in intercept between
infants in the reference category and those born to mothers
with less than a high school education and with a high
school diploma, respectively.
– Units are those of the dependent variable, grams.
Interpreting the interaction
between race and education
• The race_education interaction tests whether
the differences in birth weight across mothers’
education categories (e.g., <HS versus the >HS
reference) are different for non-Hispanic black infants
than for their non-Hispanic white counterparts.
– We calculate the overall effect for NHB and <HS as
βNHB + β<HS + βNHB_<HS
– If the difference in birth weight across mothers’
education categories were the same for blacks as
for whites, then the interaction term βNHB_<HS = 0.
Calculating overall effect of interaction
for specific case characteristics
• The general equation to calculate how a case differs
from the reference category:
– main effects coefficients
– interaction term coefficients
– values of the independent variables
= (βNHB × NHB) + (β<HS × <HS) + (β=HS × =HS) +
(βNHB_<HS × NHB_<HS) + (βNHB_=HS × NHB_=HS)
• To see which βs pertain to which cases, fill in values of
variables for different combinations of race and
education.
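The general equation above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the names `BETAS` and `overall_effect` are hypothetical, and the coefficients are the rounded values from the estimated coefficients slide.

```python
# Estimated coefficients from the OLS model of birth weight (grams).
# The dictionary name and layout are assumptions made for this sketch.
BETAS = {
    "NHB": -168,
    "<HS": -54,
    "=HS": -62,
    "NHB_<HS": -39,
    "NHB_=HS": 18,
}


def overall_effect(values, betas=BETAS):
    """Sum beta * value over every term in the model.

    The result is the difference in birth weight (grams) between the
    described case and the reference category.
    """
    return sum(betas[term] * values.get(term, 0) for term in betas)


# Non-Hispanic black infants born to mothers with less than high school.
case = {"NHB": 1, "<HS": 1, "=HS": 0, "NHB_<HS": 1, "NHB_=HS": 0}
print(overall_effect(case))
```

Any β whose variable is 0 drops out of the sum, which is why only the βs that pertain to a given case remain.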
Review: Coding of main effects and
interaction term variables

                        Main effects terms     Interaction terms
                        Race   Education       Race & education
Case characteristics    NHB    <HS    =HS      NHB_<HS   NHB_=HS
Non-H white & <HS        0      1      0          0         0
Non-H white & =HS        0      0      1          0         0
Non-H white & >HS        0      0      0          0         0     (reference category)
Non-H black & <HS        1      1      0          1         0
Non-H black & =HS        1      0      1          0         1
Non-H black & >HS        1      0      0          0         0
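The coding scheme can also be generated programmatically. The sketch below is an assumption rather than the author's procedure: it builds each interaction dummy as the product of its two component dummies, which reproduces the values in the coding table. The function name `code_case` and the category labels are hypothetical.

```python
from itertools import product


def code_case(race, education):
    """Return the dummy values for one combination of race and education."""
    nhb = 1 if race == "Non-H black" else 0
    lt_hs = 1 if education == "<HS" else 0
    eq_hs = 1 if education == "=HS" else 0
    return {
        "NHB": nhb,
        "<HS": lt_hs,
        "=HS": eq_hs,
        # Assumption: interaction dummy = product of the component dummies.
        "NHB_<HS": nhb * lt_hs,
        "NHB_=HS": nhb * eq_hs,
    }


races = ["Non-H white", "Non-H black"]
educations = ["<HS", "=HS", ">HS"]
for race, education in product(races, educations):
    row = code_case(race, education)
    print(f"{race} & {education}: {list(row.values())}")
```

Cases in the reference category for both variables (non-Hispanic white, >HS) come out as all zeros.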
Cases in the reference category for
both independent variables
General equation to calculate how a case differs
from the reference category:
= (βNHB × NHB) + (β<HS × <HS) + (β=HS × =HS) +
(βNHB_<HS × NHB_<HS) + (βNHB_=HS × NHB_=HS)
Fill in values of variables for non-Hispanic whites with >HS:
                     NHB   <HS   =HS   NHB_<HS   NHB_=HS
Non-H white & >HS     0     0     0       0         0
= (βNHB × 0) + (β<HS × 0) + (β=HS × 0) +
(βNHB_<HS × 0) + (βNHB_=HS × 0) = 0
• All of the coefficients fall out of the equation for
non-Hispanic whites born to mothers with >HS
because each β is multiplied by a value of 0.
• Thus, cases in the reference category for both race
and education have a calculated overall effect of 0.
– As it should be, because there is no difference between
them and themselves!
Cases in the reference category
for 1 but not both independent variables
Fill values of variables for non-Hispanic whites with =HS into
the general equation:
                     NHB   <HS   =HS   NHB_<HS   NHB_=HS
Non-H white & =HS     0     0     1       0         0
= (βNHB × 0) + (β<HS × 0) + (β=HS × 1) +
(βNHB_<HS × 0) + (βNHB_=HS × 0) = β=HS
The equation for non-Hispanic white infants born to
mothers with a high school diploma collapses to include
only β=HS because all of the other coefficients are
multiplied by a value of 0.
Cases not in the reference category
for either independent variable
Fill in values of variables for non-Hispanic blacks with =HS:
                     NHB   <HS   =HS   NHB_<HS   NHB_=HS
Non-H black & =HS     1     0     1       0         1
= (βNHB × 1) + (β<HS × 0) + (β=HS × 1) +
(βNHB_<HS × 0) + (βNHB_=HS × 1)
= βNHB + β=HS + βNHB_=HS
Thus, the equation for non-Hispanic black infants born to
mothers with a high school diploma collapses to include
the main effects terms for both βNHB and β=HS and
the interaction term βNHB_=HS.
All the other βs fall out because they are multiplied by 0.
Equations to calculate overall effect

Subgroup                Equation (symbols)           Estimated β                  Value
Non-Hisp. white, <HS    = β<HS                       = –54                        = –54
Non-Hisp. white, =HS    = β=HS                       = –62                        = –62
Non-Hisp. white, >HS    NA (ref cat)                 NA                           0
Non-Hisp. black, <HS    = βNHB + β<HS + βNHB_<HS     = (–168) + (–54) + (–39)     = –261
Non-Hisp. black, =HS    = βNHB + β=HS + βNHB_=HS     = (–168) + (–62) + (+18)     = –212
Non-Hisp. black, >HS    = βNHB                       = –168                       = –168

Difference in birth weight (grams) compared to infants born to non-Hispanic
white women with more than a high school education = reference category.
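As a cross-check on the six subgroup values, the following sketch lists which βs pertain to each subgroup and adds them up. The names `BETAS`, `SUBGROUPS`, and `effects` are hypothetical; the coefficients are the rounded estimates reported earlier.

```python
BETAS = {"NHB": -168, "<HS": -54, "=HS": -62, "NHB_<HS": -39, "NHB_=HS": 18}

# Terms whose dummy variable equals 1 for each subgroup.
SUBGROUPS = {
    "Non-Hisp. white, <HS": ["<HS"],
    "Non-Hisp. white, =HS": ["=HS"],
    "Non-Hisp. white, >HS": [],  # reference category: no terms apply
    "Non-Hisp. black, <HS": ["NHB", "<HS", "NHB_<HS"],
    "Non-Hisp. black, =HS": ["NHB", "=HS", "NHB_=HS"],
    "Non-Hisp. black, >HS": ["NHB"],
}

# Overall effect = sum of the pertinent coefficients.
effects = {
    name: sum(BETAS[term] for term in terms)
    for name, terms in SUBGROUPS.items()
}

for name, value in effects.items():
    print(f"{name:<22} {value:>5}")
```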
Interpreting the sign of the
interaction terms: NHB_<HS
• βNHB_<HS = –39, meaning that infants in that group have
lower estimated birth weight than would be predicted
from their race and mother’s education alone, based on
the main effects (βNHB + β<HS)
– All three βs (both main effects and interaction) have
negative signs, meaning that they cumulate to a
large deficit in birth weight for NHB <HS.
– The β on the interaction term reinforces (adds to)
the predicted deficit based on race and education
alone.
Calculating overall effect for non-Hispanic blacks with <HS education
βNHB = –168
β<HS = –54
βNHB_<HS = –39
= βNHB + β<HS + βNHB_<HS = (–168) + (–54) + (–39) = –261
[Figure: chart segments labeled –168, –54, and –39.]
Compared to infants born to non-Hispanic white women with more than a high
school education = reference category.
Interpreting the sign of the
interaction term: NHB_=HS
• On the other hand, βNHB_=HS = +18, meaning that infants
in that group have higher estimated birth weight than
would be predicted from their race and mother’s
education alone, based on the main effects (βNHB and
β=HS).
– Both main effects terms have negative signs, but the
interaction term has a positive (opposite) sign, so it
partially offsets the deficit in birth weight predicted
based on race and education alone.
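The contrast between the two interaction terms can be made explicit by comparing the prediction from the main effects alone with the prediction that includes the interaction term. This is a minimal sketch; the function name `decompose` and its labels are hypothetical.

```python
BETAS = {"NHB": -168, "<HS": -54, "=HS": -62, "NHB_<HS": -39, "NHB_=HS": 18}


def decompose(education):
    """Split the overall effect for non-Hispanic black infants into parts.

    Returns (main effects only, interaction term, overall effect, role),
    where role says whether the interaction reinforces or offsets the
    deficit predicted from the main effects alone.
    """
    main_only = BETAS["NHB"] + BETAS[education]
    interaction = BETAS[f"NHB_{education}"]
    if interaction == 0:
        role = "no interaction"
    elif (interaction < 0) == (main_only < 0):
        role = "reinforces"  # same sign as the main effects
    else:
        role = "offsets"  # opposite sign to the main effects
    return main_only, interaction, main_only + interaction, role


for education in ("<HS", "=HS"):
    print(education, decompose(education))
```

For <HS the main effects alone predict –222 grams and the interaction brings the total to –261; for =HS the main effects alone predict –230 grams and the interaction raises the total to –212.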
Calculating overall effect for non-Hispanic blacks with =HS education
βNHB = –168
β=HS = –62
βNHB_=HS = +18
Note that the interaction term has the
OPPOSITE SIGN of the two main effects,
partially offsetting their two negative effects
on birth weight with a positive effect.
= βNHB + β=HS + βNHB_=HS = (–168) + (–62) + (+18) = –212
[Figure: chart segments labeled –168, –62, and +18.]
Compared to infants born to non-Hispanic white women with more than a high
school education = reference category.
Overall effects of race and mother’s
education on birth weight
[Figure: chart of overall effects by race and mother’s education; labeled value = –261.]
Solid = main effect term.
Striped = interaction of education level w/ NHB.
Compared to non-Hispanic whites with >HS education.
Predicted value and the intercept term
• The intercept (or “constant”) term estimates the
value of the dependent variable Y for cases in
the reference category.
• To calculate the predicted value of Y for each
combination of the Xi,
– add the estimated coefficient for the intercept (β0) to
the βs for each variable that pertains to the category
of interest.
Examples: Predicted value
• For instance, β0 = 3,042.8.
• So infants who are in the reference category for all variables
are estimated to weigh 3,042.8 grams.
• This includes non-Hispanic whites born to women with >HS.
– Reference category for race and mother’s education
• Those born to Mexican American women with less than
a high school education:
• β0 + βMA + β<HS + βMA_<HS
= 3,042.8 + (–104.2) + (–54.2) + 99.4
= 3,042.8 – 59.0
= 2,983.8 grams.
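The arithmetic for this predicted value can be checked with a short script. The variable names are hypothetical; the four numbers are the ones quoted in the example for infants born to Mexican American women with less than a high school education.

```python
# Coefficients quoted in the example (grams).
beta_0 = 3042.8       # intercept: predicted value for the reference category
beta_ma = -104.2      # main effect: Mexican American
beta_lt_hs = -54.2    # main effect: less than high school
beta_ma_lt_hs = 99.4  # interaction: Mexican American with <HS

# Difference from the reference category, then the predicted value.
difference = beta_ma + beta_lt_hs + beta_ma_lt_hs
predicted = beta_0 + difference

print(f"difference from reference: {difference:.1f} grams")
print(f"predicted birth weight:    {predicted:.1f} grams")
```

The three non-intercept terms sum to –59.0, so the predicted value is 3,042.8 – 59.0 = 2,983.8 grams.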
Use a spreadsheet to calculate and
graph the interaction
• Spreadsheets can
– Store
• The estimated coefficients
• The input values of the independent variables
• The correct generalized formula to calculate the predicted
values for many combinations of the IVs involved in the
interaction
– Graph the overall pattern
• See spreadsheet template and podcast
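For readers who prefer a script to a spreadsheet, the sketch below produces the same kind of table as CSV text, with separate columns for the main effects and the interaction so the pieces could be charted. It is not the author's spreadsheet template; the column names and the helper `chart_rows` are assumptions, and the coefficients are the rounded estimates from the birth weight model.

```python
import csv
import io

BETAS = {"NHB": -168, "<HS": -54, "=HS": -62, "NHB_<HS": -39, "NHB_=HS": 18}


def chart_rows():
    """One row per race-by-education combination, split into components."""
    rows = []
    for race in ("NHW", "NHB"):
        for education in ("<HS", "=HS", ">HS"):
            is_nhb = race == "NHB"
            race_main = BETAS["NHB"] if is_nhb else 0
            # >HS is the reference category, so it has no coefficient.
            education_main = BETAS.get(education, 0)
            interaction = BETAS.get(f"NHB_{education}", 0) if is_nhb else 0
            rows.append({
                "race": race,
                "education": education,
                "race_main": race_main,
                "education_main": education_main,
                "interaction": interaction,
                "overall": race_main + education_main + interaction,
            })
    return rows


rows = chart_rows()
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The resulting text can be pasted into a spreadsheet or passed to a plotting tool to graph the overall pattern.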
Summary
• Calculating the overall shape of an interaction
pattern requires adding together the pertinent
main effects and interaction term coefficients for
each possible combination of the two categorical
IVs in the interaction.
– A spreadsheet can be helpful for storing and
organizing the coefficients and formulas.
• Depending on the respective signs of those βs,
the interaction can either amplify or dampen the
main effects of the component variables.
Suggested resources
• Chapter 16 of Miller, J.E. 2013. The Chicago
Guide to Writing about Multivariate Analysis,
2nd Edition.
• Chapters 8 and 9 of Cohen et al. 2003. Applied
Multiple Regression/Correlation Analysis for the
Behavioral Sciences, 3rd Edition. Florence, KY:
Routledge.
Supplemental online resources
• Podcast on creating interaction term variables
• Spreadsheet template for calculating overall
effect of an interaction between two categorical
variables.
Suggested practice exercises
• Study guide to The Chicago Guide to Writing
about Multivariate Analysis, 2nd Edition.
– Questions #3 and 5 in the problem set for Chapter 16
– Suggested course extensions for Chapter 16
• “Applying statistics and writing” exercise #1.
Contact information
Jane E. Miller, PhD
[email protected]
Online materials available at
http://press.uchicago.edu/books/miller/multivariate/index.html