Recent Advances in Causal Modelling Using Directed Graphs


Causal Inference and Ambiguous Manipulations
Richard Scheines, Grant Reaber, Peter Spirtes
Carnegie Mellon University

1. Motivation

Wanted: Answers to Causal Questions:
• Does attending Day Care cause Aggression?
• Does watching TV cause obesity?
• How can we answer these questions empirically?
• When and how can we estimate the size of the effect?
• Can we know our estimates are reliable?

Causation & Intervention

Conditioning is not the same as intervening:
P(Lung Cancer | Tar-stained teeth = no) ≠ P(Lung Cancer | Tar-stained teeth set= no)

[Show Teeth Slides]

[Figure: two causal graphs over Gender, CEO, and Earnings; the second adds an intervention variable I on CEO]

Causal Inference: Experiments

Gold Standard: Randomized Clinical Trials
- Intervene: Randomly assign treatment
- Observe Response

Estimate P(Response | Treatment assigned)

Causal Inference: Observational Studies

Collect a sample on:
- Potential Causes (X)
- Response (Y)
- Covariates (potential confounders Z)

Individual   Day Care (X)   Aggressiveness (Y)
John         A lot          A lot
Mary         None           A little

Estimate P(Y | X, Z)
• Highly unreliable
• We can estimate sampling variability, but we don’t know how to estimate specification uncertainty from data

2. Progress 1985 – Present

1. Representing causal structure, and connecting it to probability
2. Modeling Interventions
3. Indistinguishability and Discovery Algorithms

Representing Causal Structures

Causal Graph G = {V, E}
Each edge X → Y represents a direct causal claim: X is a direct cause of Y relative to V.

Example: Exposure → Infection → Symptoms

Direct Causation

X is a direct cause of Y relative to S iff
∃ z, x1 ≠ x2 such that P(Y | X set= x1, Z set= z) ≠ P(Y | X set= x2, Z set= z),
where Z = S − {X, Y}.

X → Y

Causal Bayes Networks

The Joint Distribution factors according to the Causal Graph, i.e., for all X in V:
P(V) = ∏ P(X | Immediate Causes of X)

Graph: Smoking [0,1] → Yellow Fingers [0,1]; Smoking [0,1] → Lung Cancer [0,1]
P(S, YF, LC) = P(S) P(YF | S) P(LC | S)

P(S = 0) = .7              P(S = 1) = .3
P(YF = 0 | S = 0) = .99    P(YF = 1 | S = 0) = .01
P(YF = 0 | S = 1) = .20    P(YF = 1 | S = 1) = .80
P(LC = 0 | S = 0) = .95    P(LC = 1 | S = 0) = .05
P(LC = 0 | S = 1) = .80    P(LC = 1 | S = 1) = .20

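To make the factorization concrete, here is a minimal Python sketch (an illustration, not code from the talk) that builds the joint over the Smoking example from exactly these conditional probability tables:

```python
from itertools import product

# CPTs from the slide: Smoking -> Yellow Fingers, Smoking -> Lung Cancer.
p_s = {0: 0.7, 1: 0.3}                          # P(S)
p_yf = {(0, 0): 0.99, (1, 0): 0.01,             # P(YF | S), keyed by (yf, s)
        (0, 1): 0.20, (1, 1): 0.80}
p_lc = {(0, 0): 0.95, (1, 0): 0.05,             # P(LC | S), keyed by (lc, s)
        (0, 1): 0.80, (1, 1): 0.20}

def joint(s, yf, lc):
    """P(S, YF, LC) = P(S) P(YF | S) P(LC | S), per the causal graph."""
    return p_s[s] * p_yf[(yf, s)] * p_lc[(lc, s)]

# Sanity check: the factored joint sums to 1 over all eight assignments.
assert abs(sum(joint(*v) for v in product([0, 1], repeat=3)) - 1.0) < 1e-12
```
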
Modeling Ideal Interventions: Interventions on the Effect

[Figure: pre-experimental and post-manipulation system over Room Temperature and Wearing Sweater]

Modeling Ideal Interventions: Interventions on the Cause

[Figure: pre-experimental and post-manipulation system over Room Temperature and Wearing Sweater]

Interventions & Causal Graphs

• Model an ideal intervention by adding an “intervention” variable outside the original system
• Erase all arrows pointing into the variable intervened upon

Intervene to change Inf:
Pre-intervention graph: Exp → Inf → Rash
Post-intervention graph: I → Inf → Rash, with the arrow from Exp into Inf erased

Calculating the Effect of Interventions

Pre-manipulation graph: Exp → Inf → Rash
Pre-manipulation Joint Distribution:
P(Exp, Inf, Rash) = P(Exp) P(Inf | Exp) P(Rash | Inf)

Intervention on Inf (add I, erase arrows into Inf): I → Inf → Rash
Post-manipulation Joint Distribution:
P(Exp, Inf, Rash) = P(Exp) P(Inf | I) P(Rash | Inf)

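A minimal Python sketch of this “graph surgery” (an assumed illustration with made-up numbers, not the authors’ code): intervening on Inf simply swaps P(Inf | Exp) for the distribution imposed by I.

```python
# Hypothetical CPTs for Exp -> Inf -> Rash (numbers invented for illustration).
p_exp = {0: 0.9, 1: 0.1}                        # P(Exp)
p_inf = {(0, 0): 0.95, (1, 0): 0.05,            # P(Inf | Exp), keyed by (inf, exp)
         (0, 1): 0.30, (1, 1): 0.70}
p_rash = {(0, 0): 0.99, (1, 0): 0.01,           # P(Rash | Inf), keyed by (rash, inf)
          (0, 1): 0.20, (1, 1): 0.80}

def pre(exp, inf, rash):
    """Pre-manipulation: P(Exp) P(Inf | Exp) P(Rash | Inf)."""
    return p_exp[exp] * p_inf[(inf, exp)] * p_rash[(rash, inf)]

def post(exp, inf, rash, inf_set):
    """Post-manipulation: P(Inf | Exp) is replaced by P(Inf | I), here a point mass."""
    return p_exp[exp] * (1.0 if inf == inf_set else 0.0) * p_rash[(rash, inf)]

# P(Rash = 1 | Inf set= 1): sum out Exp in the manipulated joint.
print(sum(post(e, 1, 1, inf_set=1) for e in (0, 1)))   # 0.8
```
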
Causal Discovery from Observational Studies

The Causal Markov Axiom (d-separation) takes a causal graph to its independence relations; a Discovery Algorithm takes observed independence relations back to an equivalence class of causal graphs.

Independence relation: X1 _||_ X3 | X2
Equivalence class of causal graphs: X1 → X2 → X3,  X1 ← X2 ← X3,  X1 ← X2 → X3

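For readers who want the mechanics, here is a minimal sketch (assumed, not from the talk) of the standard moralization test for d-separation, verified on the X1 → X2 → X3 chain from this slide:

```python
def d_separated(parents, x, y, z):
    """Test x _||_ y | z in a DAG given as {node: set_of_parents}."""
    # 1. Restrict to ancestors of {x, y} union z.
    relevant, stack = set(), [x, y, *z]
    while stack:
        n = stack.pop()
        if n not in relevant:
            relevant.add(n)
            stack.extend(parents.get(n, ()))
    # 2. Moralize: link co-parents of each child, then drop edge directions.
    und = {n: set() for n in relevant}
    for child in relevant:
        ps = [p for p in parents.get(child, ()) if p in relevant]
        for p in ps:
            und[child].add(p)
            und[p].add(child)
        for a in ps:
            for b in ps:
                if a != b:
                    und[a].add(b)
    # 3. Delete the conditioning set; x _||_ y | z iff x and y are disconnected.
    seen, stack = set(), [x]
    while stack:
        n = stack.pop()
        if n in seen or n in z:
            continue
        seen.add(n)
        stack.extend(und[n])
    return y not in seen

chain = {"X1": set(), "X2": {"X1"}, "X3": {"X2"}}
print(d_separated(chain, "X1", "X3", {"X2"}))  # True:  X1 _||_ X3 | X2
print(d_separated(chain, "X1", "X3", set()))   # False: X1 and X3 are dependent
```
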
Equivalence Class with Latents: PAGs (Partial Ancestral Graphs)

Assumptions:
• Acyclic graphs
• Latent variables
• Sample Selection Bias

Equivalence: independence over measured variables

[Figure: a PAG over X1, X2, X3, and the set of causal graphs it represents, some including latent variables T1, T2, etc.]

Causal Inference from Observational Studies

Knowing when we know enough to calculate the effect of interventions:
the Prediction Algorithm (SGS, 2000)

Causal Discovery from Observational Studies

Observed Independencies:
X1 _||_ X4
X1 _||_ X3 | X2
X4 _||_ X3 | X2

Discovery Algorithm → Equivalence Class (PAG) over X1, X2, X3, X4
Prediction Algorithm → Predictions:

P(X3 | X2 set=)    yes
P(X2 | X1 set=)    Don’t know
P(X1 | X2 set=)    yes
…

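The step from observed independencies to an equivalence class can be sketched in a few lines of Python (a PC-style illustration under the slide’s assumptions, not the actual Discovery or Prediction implementation):

```python
from itertools import combinations

V = ["X1", "X2", "X3", "X4"]
# The observed independencies from this slide, as pair -> separating set.
SEPSET = {frozenset({"X1", "X4"}): set(),        # X1 _||_ X4
          frozenset({"X1", "X3"}): {"X2"},       # X1 _||_ X3 | X2
          frozenset({"X3", "X4"}): {"X2"}}       # X4 _||_ X3 | X2

# Skeleton: two variables stay adjacent unless some set separates them.
edges = {frozenset(p) for p in combinations(V, 2)} - set(SEPSET)

# Colliders: x - m - y with x, y nonadjacent and m outside sepset(x, y)
# orients as x -> m <- y.
arrows = set()
for pair, sep in SEPSET.items():
    x, y = sorted(pair)
    for m in V:
        if frozenset({x, m}) in edges and frozenset({y, m}) in edges and m not in sep:
            arrows |= {(x, m), (y, m)}

print(sorted(tuple(sorted(e)) for e in edges))  # X1-X2, X2-X3, X2-X4
print(sorted(arrows))                           # X1 -> X2, X4 -> X2
```

The collider at X2 is what lets the Prediction Algorithm answer some manipulation queries here while leaving others unknown.
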
3. The Ambiguity of Manipulation

Assumptions:
• Causal graph known: Total Blood Cholesterol → Heart Disease
  (Cholesterol is a cause of Heart Condition)
• No Unmeasured Common Causes

Therefore, the manipulated and unmanipulated distributions are the same:
P(H | TC = x) = P(H | TC set= x)

The Problem with Predicting the Effects of Acting

Problem: the cause is a composite of causes that don’t act uniformly,
e.g., Total Blood Cholesterol (TC) = HDL + LDL

[Figure: HDL (+) and LDL (+) each cause Heart Disease; TC = HDL + LDL]

• The observed distribution over TC is determined by the unobserved joint distribution over HDL and LDL
• Ideally intervening on TC does not determine a joint distribution for HDL and LDL

The Problem with Predicting the Effects of Setting TC

[Figure: HDL (+) and LDL (+) each cause Heart Disease; TC = HDL + LDL]

• P(H | TC set1= x) puts NO constraints on P(H | TC set2= x)
• P(H | TC = x) puts NO constraints on P(H | TC set= x)
• Nothing in the data tips us off about our ignorance, i.e., we don’t know that we don’t know.

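A tiny numeric sketch of the point (hypothetical coefficients, chosen only for illustration): two interventions that set TC to the same value through different LDL/HDL splits can yield different risks of heart disease, and the distribution of TC alone cannot distinguish them.

```python
def p_hd(ldl, hdl):
    """Hypothetical P(HD = 1 | LDL, HDL): both raise risk, LDL more strongly."""
    return 0.002 * ldl + 0.0005 * hdl

# Two ways of realizing TC set= 200:
print(p_hd(ldl=150, hdl=50))   # 0.325  (TC = 200 via high LDL)
print(p_hd(ldl=50, hdl=150))   # 0.175  (TC = 200 via high HDL)
```
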
Examples Abound

Total TV = Violent Junk + PBS, Discovery Channel
Total Day Care = Overcrowded, Poor Quality + High Quality

[Figure: the components act with opposite signs (+, +, −) on outcomes such as Social Adjustment and Aggressiveness]

Possible Ways Out

• Causal Graph is Not Known: Cholesterol does not really cause Heart Condition
• Confounders (unmeasured common causes) are present: LDL and HDL are confounders

Cholesterol is not really a cause of Heart Condition

Relative to a set of variables S (and a background), X is a cause of Y iff
∃ x1 ≠ x2 such that P(Y | X set= x1) ≠ P(Y | X set= x2)

• Total Cholesterol is a cause of Heart Disease

Cholesterol is not really a cause of Heart Condition

Is Total Cholesterol a direct cause of Heart Condition relative to {TC, LDL, HDL, HD}?

• TC is logically related to LDL and HDL, so manipulating it once LDL and HDL are set is impossible.

LDL, HDL are confounders

[Figure: HDL and LDL each point into TC and into Heart Disease; the TC → Heart Disease edge is marked “?”]

• No way to manipulate TC without affecting HDL, LDL
• HDL, LDL are logically related to TC

Logico-Causal Systems
S: Atomic Variables
• independently manipulable
• effects of all manipulations are unambiguous
S’: Defined Variables
• defined logically from variables in S
For example:
S: LDL, HDL, HD, Disease1, Disease2
S’: TC
Logico-Causal Systems: Adding Edges

S: LDL, HDL, HD, D1, D2    S’: TC

[Figure: the system over S (D1, D2, HDL, LDL, HD) beside the system over S ∪ S’, which adds TC and a “?” edge from TC to HD]

TC → HD iff manipulations of TC are unambiguous wrt HD

Logico-Causal Systems: Unambiguous Manipulations

For each variable X’ in S’, let Parents(X’) be the set of variables in S that logically determine X’, i.e., X’ = f(Parents(X’)), e.g., TC = LDL + HDL

Inv(x’) = the set of all values p of Parents(X’) s.t. f(p) = x’

A manipulation of a variable X’ in S’ to a value x’ wrt another variable Y is unambiguous iff
∀ p1 ≠ p2 ∈ Inv(x’): P(Y | Parents(X’) set= p1) = P(Y | Parents(X’) set= p2)

TC → HD iff all manipulations of TC are unambiguous wrt HD

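Continuing the hypothetical sketch from before, the definition can be checked mechanically: enumerate Inv(x’) and test whether every realization gives the same conditional distribution for Y (same made-up coefficients as earlier).

```python
def p_hd(ldl, hdl):
    """Hypothetical P(HD = 1 | LDL, HDL) from the earlier sketch."""
    return 0.002 * ldl + 0.0005 * hdl

def inv(tc, step=25):
    """Inv(tc): the (LDL, HDL) value pairs with f(LDL, HDL) = LDL + HDL = tc."""
    return [(ldl, tc - ldl) for ldl in range(0, tc + 1, step)]

def unambiguous(tc, tol=1e-9):
    """TC set= tc is unambiguous wrt HD iff P(HD | p) agrees for all p in Inv(tc)."""
    risks = [p_hd(ldl, hdl) for ldl, hdl in inv(tc)]
    return max(risks) - min(risks) < tol

print(unambiguous(200))   # False: the manipulation of TC is ambiguous wrt HD
```
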
Logico-Causal Systems: Removing Edges

S: LDL, HDL, HD, D1, D2    S’: TC

[Figure: the system over S beside the system over S ∪ S’; after adding TC, the LDL → HD and HDL → HD edges are marked “?”]

Remove LDL → HD iff LDL _||_ HD | TC

Logico-Causal Systems: Faithfulness

Faithfulness: independences are entailed by structure, not by special parameter values. Crucial to inference.

[Figure: D1, D2, HDL, LDL, TC, HD; the effect of TC on HD is unambiguous]

Unfaithfulness: LDL _||_ HDL | TC,
because LDL and TC determine HDL, and similarly, HDL and TC determine LDL.

Effect on Prediction Algorithm

Observed System: TC, HD, D1, D2
Still sound, but less informative.

Manipulate   Effect on   Assuming manipulation unambiguous   Manipulation maybe ambiguous
Disease 1    Disease 2   None                                None
Disease 1    HD          Can’t tell                          Can’t tell
Disease 1    TC          Can’t tell                          Can’t tell
Disease 2    Disease 1   None                                None
Disease 2    HD          Can’t tell                          Can’t tell
Disease 2    TC          Can’t tell                          Can’t tell
TC           Disease 1   None                                Can’t tell
TC           Disease 2   None                                Can’t tell
TC           HD          Can’t tell                          Can’t tell
HD           Disease 1   None                                Can’t tell
HD           Disease 2   None                                Can’t tell
HD           TC          Can’t tell                          Can’t tell

[Figure: underlying graph over D1, D2, HDL, LDL, TC, HD, with several edges involving TC and HD marked “?”]

Effect on Prediction Algorithm

Observed System: TC, HD, D1, D2, X

[Figure: underlying graph over X, D1, D2, HDL, LDL, TC, HD, with a “?” edge from TC to HD]

Not completely sound. There is no general characterization of when the Prediction algorithm, suitably modified, is still informative and sound. Conjectures, but no proof yet.

Example: if the observed system has no deterministic relations, all orientations due to marginal independence relations are still valid.

Effect on Causal Inference of Ambiguous Manipulations

Experiments, e.g., RCTs. Manipulating treatment is:
• unambiguous → sound
• ambiguous → unsound

Observational Studies, e.g., Prediction Algorithm. Manipulation is:
• unambiguous → potentially sound
• ambiguous → potentially sound

References

• Spirtes, P., Glymour, C., and Scheines, R. (2000). Causation, Prediction, and Search, 2nd Edition. MIT Press.
• Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
• Spirtes, P., Scheines, R., Glymour, C., Richardson, T., and Meek, C. (2004). “Causal Inference,” in Handbook of Quantitative Methodology in the Social Sciences, ed. David Kaplan. Sage Publications, 447-478.
• Spirtes, P., and Scheines, R. (2004). “Causal Inference of Ambiguous Manipulations,” in Proceedings of the Philosophy of Science Association Meetings, 2002.
• Reaber, G. (2005). The Theory of Ambiguous Manipulations. Master’s Thesis, Department of Philosophy, Carnegie Mellon University.