A Practical Guide to Multiplicative Interaction Variables

A Practical Guide to Multiplicative
Interaction Variables in Policy
Research
Garry Young
GW Institute of Public Policy
January 25, 2006
Why Interactive Variables?
Answer: Sometimes the impact of a given
independent variable depends on (is
conditional on) the level of another
independent variable.
Example: Impact of education on income
may depend on gender or race.
What I’ll Cover
• Interactive effects using OLS
• Case of a single discrete (dummy) variable
interacted with a continuous variable.
• Case of two continuous variables
interacted.
• Centering, significance tests, interpretation
• Very brief discussion of extensions.
Misc.
• Data, Stata do file, and this powerpoint
available at www.gwu.edu/~gwipp.
• The data are fake and include the
following variables:
– Race (Klingon and Earthling)
– Income (100 – 280 credits)
– Education (4 – 16 years)
– Age (25 – 60)
Initial Model
Variable       Coef.    Std. Err.    t        Prob.
Intercept      23.01    11.97        1.92     0.06
Education      12.86    .987         13.03    0.00
Earthling      52.81    6.01         8.78     0.00
R² = .70, n = 100
Interactive Model
• Does education affect income differently
by race?
• Find out by multiplying each observation's
Education value by its Earthling (race dummy) value
• Education_i × Earthling_i
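To make the construction concrete, here is a minimal sketch of building the product term. The three observations and the name `educ_x_earth` are hypothetical, chosen only for illustration; they are not taken from the talk's data file.

```python
# Hypothetical observations; earthling is a 0/1 dummy (0 = Klingon).
rows = [
    {"education": 12, "earthling": 1},
    {"education": 16, "earthling": 0},
    {"education": 8, "earthling": 1},
]

# The interactive variable is the observation-by-observation product.
for row in rows:
    row["educ_x_earth"] = row["education"] * row["earthling"]

products = [row["educ_x_earth"] for row in rows]
print(products)  # [12, 0, 8]
```

The product equals Education for Earthlings and zero for Klingons, which is why its coefficient captures the difference in the Education slope between the two groups.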
What Does this Mean?
Variable                 Coef.     Std. Err.    t        Prob.
Intercept                51.92     13.94        3.72     0.00
Education                10.31     1.18         8.76     0.00
Earthling                -22.86    22.07        -1.04    0.30
Education * Earthling    6.85      1.93         3.55     0.00
R² = .74, n = 100
Common Mistakes
• Omitting variables that are part of the
interaction
– All variables that are part of the interaction
stay in the equation
– e.g., don’t drop the Education and Earthling
variables while leaving in Education *
Earthling
Common Mistakes
• Omitting variables.
• Not performing an F-test
– Need to know if interaction contributes to
model
Common Mistakes
• Omitting variables.
• Not performing an F-test.
• Failure to understand the conditional
nature of coefficients
Education is Conditional on
Earthling = 0; Earthling is
conditional on Education = 0
Variable                 Coef.     Std. Err.    t        Prob.
Intercept                51.92     13.94        3.72     0.00
Education                10.31     1.18         8.76     0.00
Earthling                -22.86    22.07        -1.04    0.30
Education * Earthling    6.85      1.93         3.55     0.00
R² = .74, n = 100
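The conditional reading follows from writing the model out. The symbols b_0 through b_3 are notation assumed here for the four coefficients in the table, in row order.

```latex
\begin{aligned}
\text{Income} &= b_0 + b_1\,\text{Education} + b_2\,\text{Earthling}
                 + b_3\,(\text{Education} \times \text{Earthling}) \\
\frac{\partial\,\text{Income}}{\partial\,\text{Education}}
              &= b_1 + b_3\,\text{Earthling} = 10.31 + 6.85\,\text{Earthling} \\
\text{Income}_{\text{Earthling}=1} - \text{Income}_{\text{Earthling}=0}
              &= b_2 + b_3\,\text{Education} = -22.86 + 6.85\,\text{Education}
\end{aligned}
```

So 10.31 is the Education slope only when Earthling = 0, and -22.86 is the race gap only when Education = 0, a value outside the 4 to 16 year range of the data.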
Common Mistakes
• Omitting variables that are part of the
interaction
• Not performing an F-test
• Failure to understand the conditional
nature of coefficients
• Failure to test whether the conditional
slopes differ significantly from zero
Evaluating the Overall Model
• Interactive terms lessen parsimony,
increase difficulty of interpretation.
• Don’t include an interaction unless the
interactive term adds explanatory power.
• For OLS perform an F-test.
F-Test Formula
The F-test formula is
$$F = \frac{(R_2^2 - R_1^2)/(k_2 - k_1)}{(1 - R_2^2)/(N - k_2 - 1)}$$
where k denotes the number of variables,
N the number of observations,
subscript 1 refers to the original model and
subscript 2 refers to the expanded model.
F-Test
$$F = \frac{(R_2^2 - R_1^2)/(k_2 - k_1)}{(1 - R_2^2)/(N - k_2 - 1)} = \frac{(.74 - .70)/(3 - 2)}{(1 - .74)/(100 - 3 - 1)} \approx 14.8$$
Critical value for F(1, 96) at the .05 level ≈ 3.94
14.8 > 3.94, so the interactive term adds statistically significant explanatory power
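The arithmetic on this slide can be checked with a short sketch. The function name `incremental_f` and its argument names are assumptions made for illustration; the inputs are the rounded R² values and variable counts from the two models.

```python
def incremental_f(r2_small, r2_large, k_small, k_large, n):
    """F statistic for the variables added when moving from the
    smaller (original) model to the larger (expanded) model."""
    numerator = (r2_large - r2_small) / (k_large - k_small)
    denominator = (1.0 - r2_large) / (n - k_large - 1)
    return numerator / denominator

# Initial model: R2 = .70 with 2 variables; interactive model: R2 = .74 with 3.
f_stat = incremental_f(0.70, 0.74, k_small=2, k_large=3, n=100)
print(round(f_stat, 1))  # 14.8
```

When only one term is added, this F statistic equals the square of that term's t statistic. Here 3.55 squared is about 12.6 rather than 14.8, a gap that is consistent with the R² values having been rounded to two decimals before being plugged in.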
Problem of Interpretation:
The Meaning of Zero
Variable                 Coef.     Std. Err.    t        Prob.
Intercept                51.92     13.94        3.72     0.00
Education                10.31     1.18         8.76     0.00
Earthling                -22.86    22.07        -1.04    0.30
Education * Earthling    6.85      1.93         3.55     0.00
R² = .74, n = 100
Mean Centering
Def.: Subtracting the mean from each observation of the
independent variable of interest so that the new mean is
equal to zero.
Xi    X bar    Xi - X bar
10    30       -20
20    30       -10
30    30       0
40    30       10
50    30       20
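The same five values can be centered in a few lines. This is only a sketch of the definition above; the variable names are illustrative.

```python
from statistics import mean

x = [10, 20, 30, 40, 50]

# Subtract the sample mean from every observation.
x_bar = mean(x)
x_centered = [xi - x_bar for xi in x]

print(x_centered)       # [-20, -10, 0, 10, 20]
print(sum(x_centered))  # 0
```

The centered values keep their spacing and spread; only the location of zero moves, so zero now means "at the mean".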
Why mean center?
• Makes coefficients easier to interpret
• Some argue it reduces multicollinearity
(Cronbach 1987)
• Otherwise does not affect the substance of the
results, e.g., R² is unaffected.
Centering Education
Using Stata:
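* compute the sample mean of education (stored in r(mean)) without printing output
summarize education, meanonly
gen educmean = r(mean)
* subtract the mean from every observation
gen educ_ct = education - educmean
* show the two variables in comparison
summarize education educ_ct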
    Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
   education |       100       11.11     3.06131          4         16
     educ_ct |       100    3.43e-07     3.06131      -7.11       4.89
Same Model but with Education
Mean Centered
Variable                 Coef.     Std. Err.    t        Prob.
Intercept                166.43    4.02         41.44    0.00
Education                10.31     1.18         8.76     0.00
Earthling                53.22     5.68         9.36     0.00
Education * Earthling    6.84      1.93         3.55     0.00
R² = .74, n = 100
The Impact of Education
Impact of Education conditional on Klingon
(Earthling = 0):
Simply take the value of the Education
coefficient: 10.31
Impact of Education conditional on Earthling
(Earthling = 1):
Education coef. + Interactive coef. = 10.31
+ 6.85 = 17.16
The Impact of Race
Earthling income conditional on average
education: 166.43 + 53.22(1) = 219.65
Klingon income conditional on average
education: 166.43 + 53.22(0) = 166.43
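The conditional slopes and the predicted incomes at average education can both be read off the mean-centered model. The sketch below assumes the interactive coefficient 6.85 used in the slide arithmetic (the centered table prints 6.84); the function names are hypothetical.

```python
# Coefficients from the mean-centered model.
# Assumption: 6.85 for the product term, as in the slide arithmetic.
B0, B_EDUC, B_EARTH, B_INTER = 166.43, 10.31, 53.22, 6.85

def education_slope(earthling):
    """Impact of one more year of education, given race (0 = Klingon)."""
    return B_EDUC + B_INTER * earthling

def predicted_income(educ_centered, earthling):
    """Predicted income; educ_centered = 0 means average education."""
    return (B0 + B_EDUC * educ_centered + B_EARTH * earthling
            + B_INTER * educ_centered * earthling)

print(round(education_slope(0), 2), round(education_slope(1), 2))    # 10.31 17.16
print(round(predicted_income(0, 0), 2), round(predicted_income(0, 1), 2))  # 166.43 219.65
```

Because Education is centered, setting `educ_centered` to 0 evaluates both groups at the sample mean of 11.11 years instead of at zero years of schooling.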
Impact of Education
Conditional on Race
[Figure: line plot of Income (vertical axis) against Education in years (horizontal axis, 4 to 16), with one line each for Earthling and Klingon.]
Slope Significance
• Is each individual slope statistically distinct
from zero?
• Two ways to calculate
– Formula using variance-covariance matrix.
See Friedrich (1982: 810).
– A simple trick: Rescoring and recomputing
Recall that the slope for Education is
conditional on Klingon (Earthling = 0). That
slope is statistically significant (t = 8.76).
Variable                 Coef.     Std. Err.    t        Prob.
Intercept                166.43    4.02         41.44    0.00
Education                10.31     1.18         8.76     0.00
Earthling                53.22     5.68         9.36     0.00
Education * Earthling    6.84      1.93         3.55     0.00
R² = .74, n = 100
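For the first approach listed above, the standard result for the variance of a linear combination of coefficients gives the standard error of a conditional slope b1 + b3·z. The sketch below assumes that result; whether it matches the cited page term for term is not checked here. The covariance between the two coefficients is not reported on the slides, so it is left as an input that would have to come from the estimated variance-covariance matrix (in Stata, `estat vce` after `regress` displays it).

```python
import math

def conditional_slope_se(var_b1, var_b3, cov_b1_b3, z):
    """Standard error of the conditional slope b1 + b3*z, using
    var(b1) + z^2 * var(b3) + 2 * z * cov(b1, b3)."""
    variance = var_b1 + (z ** 2) * var_b3 + 2 * z * cov_b1_b3
    return math.sqrt(variance)

# At z = 0 (Klingon) only var(b1) matters, so the covariance value
# passed here is a placeholder with no effect on the result.
se_klingon = conditional_slope_se(1.18 ** 2, 1.93 ** 2, cov_b1_b3=0.0, z=0)
t_klingon = 10.31 / se_klingon
print(round(se_klingon, 2), round(t_klingon, 2))
```

Using the rounded table entries, the t statistic comes out near 8.74, close to the 8.76 reported from the unrounded estimates. For z = 1 (Earthling) the covariance term is needed, which is what makes the rescoring trick convenient.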
How to Rescore and Recompute
• Create a new variable where Klingon = 1
gen klingon = 0
replace klingon = 1 if earthling == 0
• Create new product interactive variable
using new race variable: Education X
Klingon
• Re-run regression.
Education is now conditional on
Earthling
Variable                 Coef.     Std. Err.    t        Prob.
Intercept                219.65    4.02         54.61    0.00
Education                17.16     1.53         11.22    0.00
Klingon                  -53.22    5.68         -9.36    0.00
Education * Klingon      -6.85     1.93         -3.55    0.00
R² = .74, n = 100
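The rescored coefficients are an algebraic rearrangement of the earlier ones: substituting Earthling = 1 - Klingon into the model regroups the terms. A minimal sketch, with a hypothetical function name and assuming 6.85 for the original product term:

```python
def rescore_dummy(b0, b_x, b_d, b_xd):
    """Coefficients after replacing a 0/1 dummy D with 1 - D in
    y = b0 + b_x*x + b_d*D + b_xd*x*D.
    Returns (intercept, slope of x, new dummy, new product term)."""
    return (b0 + b_d, b_x + b_xd, -b_d, -b_xd)

# Mean-centered model with the Earthling dummy.
rescored = tuple(round(c, 2) for c in rescore_dummy(166.43, 10.31, 53.22, 6.85))
print(rescored)  # (219.65, 17.16, -53.22, -6.85)
```

The coefficients can be recovered this way, but the standard error of the new Education slope (1.53) cannot; it comes from re-running the regression, which is the point of the trick.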
Interacting 2 Continuous Variables
• Say we think Age and Education interact
• Steps:
– Center both variables
– Create new product term (Age * Education)
– Run regressions
Initial Model
Variable       Coef.     Std. Err.    t         Prob.
Intercept      192.25    .900         213.72    0.00
Education      2.02      .379         5.32      0.00
Age            4.63      .108         42.86     0.00
R² = .97, n = 100
Model with Product Term
Variable           Coef.     Std. Err.    t         Prob.
Intercept          196.25    .965         203.47    0.00
Education          .865      .361         2.39      0.02
Age                4.48      .098         49.86     0.00
Education * Age    -.200     .030         -6.59     0.00
R² = .98, n = 100
Analysis
• Carry out F-Test as before
• Interpretation of Age: Impact of Age on Income
conditional on Education=0
– Note that Education is mean centered so Age
coefficient is conditional on Education=mean
• Interpretation of Education: Impact of Education
on Income conditional on Age=0
– Note that Age is mean centered so Education
coefficient is conditional on Age=mean
Analysis (cont)
• Other diagnostics are similar to before
• Can use rescoring to center variables at
values of interest, e.g., what is the impact
of Age on Income conditional on Education
being set to High, Medium, and Low
values. See, e.g., Young and Perkins
(2005: 1197-1198).
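As an illustration of evaluating the Age slope at several Education values, the sketch below uses the coefficients from the product-term model. Treating one standard deviation below and above the mean (3.06131 years, from the summary output) as "Low" and "High" is an assumption made here, not a choice stated in the talk.

```python
# Coefficients from the model with the Education * Age product term.
B_AGE, B_INTER = 4.48, -0.200
# Assumption: low/high education = mean -/+ one standard deviation.
EDUC_SD = 3.06131

def age_slope(educ_centered):
    """Impact of one more year of age, given centered education."""
    return B_AGE + B_INTER * educ_centered

for label, educ_ct in [("low", -EDUC_SD), ("medium", 0.0), ("high", EDUC_SD)]:
    print(label, round(age_slope(educ_ct), 2))
# low 5.09
# medium 4.48
# high 3.87
```

This gives the conditional slopes only. To get a standard error for each one, re-center Education at the value of interest, rebuild the product term, and re-run the regression, as with the dummy variable case.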
Extensions
• Standardized variables, see Jaccard and Turrisi
(2003)
• Multiple dummy variables, e.g., the case where
Romulans, Earthlings, and Klingons yield two
dummy variables. See, e.g., Young and Perkins
(2005)
• Three-way interactions, e.g., Age * Education *
Tenure. See Friedrich (1982) or Jaccard &
Turrisi (2003)
• Other complex interactions, e.g., quadratic, see
Friedrich or Jaccard and Turrisi.
• Non-linear functional forms (Jaccard 2001)
Places for More Info
• Friedrich (1982): Still considered the gold
standard, at least in political science
• Jaccard, Turrisi, and Wan (1990): Better on
providing technical details than its successor
• Jaccard and Turrisi (2003): Good for practical
applications
• Braumoeller (2004): Good tips on hypothesis testing
• Brambor, Clark, and Golder (2005): Dos and
Don’ts
References
• Brambor, Thomas, William Roberts Clark, and Matt Golder. 2005. “Understanding
Interaction Models: Improving Empirical Analyses.” Political Analysis 13: 120.
• Braumoeller, Bear. 2004. “Hypothesis Testing and Multiplicative Interaction
Terms.” International Organization 58: 807-820.
• Cronbach, L.J. 1987. “Statistical Tests for Moderator Variables.”
Psychological Bulletin 87: 51-57.
• Friedrich, Robert. 1982. “In Defense of Multiplicative Terms in Multiple
Regression Equations.” American Journal of Political Science 26: 797-833.
• Jaccard, James. 2001. Interaction Effects in Logistic Regression. Thousand
Oaks: Sage Publications.
• Jaccard, James, and Robert Turrisi. 2003. Interaction Effects in Multiple
Regression. Second Edition. Thousand Oaks: Sage Publications.
• Jaccard, James, Robert Turrisi, and Choi K. Wan. 1990. Interaction Effects in
Multiple Regression. Thousand Oaks: Sage Publications.
• Young, Garry, and William Perkins. 2005. “Presidential Rhetoric, the Public
Agenda, and the End of Presidential Television’s ‘Golden Age.’” Journal of
Politics 67: 1190-1205.