econometric modeling 2
More on Experimental Design
• Angrist and Pischke
• Emphasize the identification of causal effects.
• Ask, “What is your identification strategy?”
• Point is to control for unobserved selection effects
• Offer several solutions:
Randomized Controlled Trials
Experiments: We’ll talk more about this in the next class.
Selection correction like the Heckman two-step
Main ideas behind RCTs
• RCTs try to bring the controls of hard science research to social science
• Some treatment is envisioned
• Participants are assigned randomly to the treatment and control groups
• Because getting the treatment is random, the difference in the
outcome, after controlling for covariates, is attributable to the treatment
• Removes selection effects. What are selection effects?
• Covariates help control for differences in how the
treatment’s impact varies across groups
• A problem with RCTs is that there is selection into the
experiment – people who agree to participate may be
different from those not willing to participate
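To make the selection-effects point concrete, here is a minimal simulation sketch. It is not taken from Angrist and Pischke. It assumes a hypothetical unobserved trait (called `ability` here) that raises both the outcome and the chance of opting into treatment. The effect size, the opt-in rule, and all variable names are illustrative assumptions.

```python
import random
import statistics


def simulate(n=20000, true_effect=2.0, seed=0):
    """Return the treated-minus-control gap in mean outcomes for two designs."""
    rng = random.Random(seed)
    gaps = {}
    for design in ("self_selected", "randomized"):
        treated, control = [], []
        for _ in range(n):
            ability = rng.gauss(0.0, 1.0)  # unobserved by the analyst
            if design == "self_selected":
                # Hypothetical rule: higher-ability people opt in more often.
                takes_treatment = ability + rng.gauss(0.0, 1.0) > 0
            else:
                # Random assignment: a coin flip unrelated to ability.
                takes_treatment = rng.random() < 0.5
            outcome = (1.0 + true_effect * takes_treatment
                       + 3.0 * ability + rng.gauss(0.0, 1.0))
            (treated if takes_treatment else control).append(outcome)
        gaps[design] = statistics.mean(treated) - statistics.mean(control)
    return gaps


gaps = simulate()
print("true effect: 2.0")
for design, gap in gaps.items():
    print(f"{design:>14}: naive gap = {gap:.2f}")
```

Under these assumptions the randomized gap should sit close to the true effect, while the self-selected gap is inflated because it also picks up the ability difference between people who opt in and people who do not.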
What is the idea behind natural experiments?
• Basically the same as an RCT, but with less control in
assignment to group
• Looking for something natural that randomly assigns
people into separate categories for getting treatment or not
– More rare than people like to think because behavior
and policy are inherently endogenous
– Need to meet a high standard; many seeming
exogenous differences are endogenous
– Looking for something unrelated to the treatment that
– Best are natural disasters, etc. Often different political
outcomes are used, but that suffers from the “Tiebout”
• Does eliminate the selection into RCTs problem
A caution about “natural experiments” and the Tiebout
– Solon (1985) estimated effects of unemployment
insurance on duration of unemployment spells
– Compared states that recently changed standards
– Ignores that the changed standards could be
endogenous. Long spell states might have
purposely tightened standards
– See Tiebout (1956), “A Pure Theory of Local Expenditures”
We are looking for External Validity
• Do the observed impacts carry over if the
magnitude of the change in the variable used to define the
experiment is very different? But ….
– Internal validity (the design) makes experiments
narrow and idiosyncratic
– Empirical evidence is always local to the data
– The underlying variation never is completely
representative, so extrapolation is always
– Calls for repeated experiments, with a range of
– Accumulate more evidence
Kennedy’s paper addresses similar issues
• Applied econometricians “follow” a set of rules to
translate econometric theory to econometric practice.
• So why doesn’t theory translate easily into practice?
– Reliance of theory on asymptotic properties. Applied
econometrics works with finite samples.
– Econometric training focuses on estimation, and has lots of
tools to fix estimation problems (i.e., things like sample selection
bias) by focusing on technique. But harder problems are likely
to occur at the specification stage.
• As a result, applied econometricians “violate” the rules
they learn from their classes, as they move into practice.
Kennedy’s paper outlines where violating theory has
become acceptable, and how to work around it.
Kennedy: Ten Rules for Applied Econometrics
1. Use common sense and economic theory
– Use good statistical practices
– Match like measured variables
– Select functional forms appropriate for your
dependent variable (e.g., a beta distribution for a variable
with values constrained between 0 and 1)
– Don’t add trends for trendless variables
– Don’t use a formula for your empirical work; think
about what you are doing.
– My Rule: Let good theory drive your econometrics.
– From Angrist and Pischke: Know what you identified.
2. Avoid Type III errors (producing the right answer to
the wrong question)
– Corollary: an approximate answer to the right
question is worth more than a precise answer to
the wrong question
3. Know the context, which means get the facts
– How was the data collected and imputed?
– How were observations selected?
– These are parts of my “Know your data” rule
– But also, understand the system you are trying to
4. Inspect the data (I need say nothing more on this)
– But put together graphs of the data to see
patterns and anomalies
5. Keep it sensibly simple
– Begin with simple models, then make them more
complicated (but only if necessary)
– This is the empirical analog to what I said about
– Conflict between complexity (general) and
– Use the simplest method and simplest
specification appropriate for your analysis
6. Use the interocular trauma test (what is this?)
– Look at the results until the answer hits you
between the eyes.
– Look at it hard until you are comfortable taking
ownership (telling someone you did it)
– Only then should you check that the results make
sense with regard to signs, magnitudes,
significance and other statistical properties.
7. Understand the costs and benefits of data mining
– Goal is not a high R²
– Significance level is contextual
– Specification depends on what data you have, and
if it is relevant
– Coase: “If you torture the data long enough,
Nature will confess.” What does this mean?
– Do remember, the data can drive theory.
• You observe something, and try to explain it.
• Econometrics often is useful in understanding what we
• Make sure your model focuses on your central
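One cost of data mining can be sketched with a small simulation. This is an illustration, not something from Kennedy's paper. It assumes a pure-noise outcome and a set of pure-noise candidate regressors, tries each candidate one at a time, and counts how many clear a conventional |t| > 1.96 cutoff. The sample size, the number of candidates, and the function names are hypothetical choices.

```python
import math
import random


def correlation(x, y):
    """Pearson correlation computed from sums of squares and cross-products."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def count_spurious(n_obs=100, n_candidates=200, t_crit=1.96, seed=1):
    """Count noise regressors that look 'significant' against a noise outcome."""
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    hits, largest_t = 0, 0.0
    for _ in range(n_candidates):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        r = correlation(x, y)
        # t-statistic of the slope in a bivariate regression of y on x.
        t_stat = r * math.sqrt((n_obs - 2) / (1.0 - r * r))
        if abs(t_stat) > t_crit:
            hits += 1
        largest_t = max(largest_t, abs(t_stat))
    return hits, largest_t


t_crit = 1.96
hits, largest_t = count_spurious(t_crit=t_crit)
print(f"{hits} of 200 unrelated regressors pass |t| > {t_crit}")
print(f"largest |t| found by searching: {largest_t:.2f}")
```

Roughly 5 percent of the candidates should pass by chance alone, so a search that reports only the best-looking specification will tend to find "significance" that reflects nothing but the search.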
8. Be prepared to compromise
– Understand the gap between the statistical theory
underlying your analysis, and the actual
application you are doing
– For example, there are few populations that are
9. Do not confuse statistical significance with
meaningful magnitude. I talked enough about this
already; think McCloskey and Ziliak.
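The gap between statistical significance and meaningful magnitude can be sketched as follows. This is a hypothetical example with simulated data and a deliberately tiny assumed slope of 0.03. With a large enough sample the t-statistic becomes large even though the regressor explains almost none of the variation. The helper `simple_ols` is written here for illustration and is not a library function.

```python
import math
import random


def simple_ols(x, y):
    """Bivariate OLS with an intercept: returns slope, its std. error, and R-squared."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    r_squared = 1.0 - rss / syy
    return slope, se_slope, r_squared


rng = random.Random(42)
n_obs = 100_000
tiny_effect = 0.03  # assumed effect, small relative to a noise s.d. of 1
x = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
y = [tiny_effect * xi + rng.gauss(0.0, 1.0) for xi in x]

slope, se_slope, r_squared = simple_ols(x, y)
t_stat = slope / se_slope
print(f"slope = {slope:.4f}, t = {t_stat:.1f}, R-squared = {r_squared:.5f}")
```

The estimate is "highly significant" by the usual cutoff, yet a one standard deviation change in x moves y by only a few hundredths of a standard deviation. Whether that matters is a question about the economics, not about the t-statistic.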
10. Report a sensitivity analysis
– Pay attention to robustness
– Confess your errors and shortcomings (know the
limitations of what you did, and admit to them)
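A sensitivity analysis can be as simple as reporting the coefficient of interest under several specifications. The sketch below uses simulated data with an assumed confounder `z`, and the variable names and coefficients are hypothetical. It compares the slope on `x` with no controls, controlling for `z`, and controlling for `z` on a subsample. The control is applied by partialling out `z` from both `x` and `y` (the Frisch-Waugh approach), which gives the same coefficient on `x` as a multiple regression.

```python
import random


def slope_and_residuals(x, y):
    """OLS of y on x with an intercept: returns (slope, residuals)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return slope, residuals


def slope_controlling_for(x, y, z):
    """Coefficient on x in a regression of y on x and z, via partialling out z."""
    _, x_resid = slope_and_residuals(z, x)
    _, y_resid = slope_and_residuals(z, y)
    return slope_and_residuals(x_resid, y_resid)[0]


rng = random.Random(7)
n_obs = 5000
z = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]   # assumed confounder
x = [0.8 * zi + rng.gauss(0.0, 1.0) for zi in z]  # regressor of interest
y = [1.0 * xi + 2.0 * zi + rng.gauss(0.0, 1.0) for xi, zi in zip(x, z)]

half = n_obs // 2
results = {
    "no controls": slope_and_residuals(x, y)[0],
    "controlling for z": slope_controlling_for(x, y, z),
    "controlling for z, first half": slope_controlling_for(x[:half], y[:half], z[:half]),
}
for spec, coef in results.items():
    print(f"{spec:>30}: coefficient on x = {coef:.3f}")
```

In this setup the true coefficient on x is 1.0. The estimate without controls is far from it, while the two controlled estimates agree with each other. A table like this lets the reader see which conclusions depend on a particular specification.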