Tuesday slides
Download
Report
Transcript Tuesday slides
INTERVENTIONS AND
INFERENCE / REASONING
Causal models
Recall from yesterday:
Represent
relevance using graphs
Causal relevance ⇒ DAGs
Quantitative component = joint probability distribution
And
so clear definitions for independence & association
Connect
DAG & jpd with two assumptions:
Markov:
No edge ⇒ Independent given direct parents
Faithfulness: Conditional independence ⇒ No edge
Three uses of causal models
Represent (and predict the effects of) interventions
on variables
Causal
models only, of course
Efficiently determine independencies
I.e.,
which variables are informationally relevant for
which other ones?
Use those independencies to rapidly update beliefs
in light of evidence
Representing interventions
Central intuition: When we intervene, we control the
state of the target variable
And
so the direct causes of the target variable no
longer matter
But the target still has its usual effects
Directly
applying current to the light bulb ⇒
light switch doesn’t matter, but the plant still grows
Representing interventions
Formal implementation:
Add
a variable representing the intervention, and make
it a direct cause of the target
When the intervention is “active,” remove all other
edges into the target
Leave intact all edges directed out of the target, even
when the intervention is “active”
Representing interventions
Example:
Light Switch
Light Bulb
Plant Growth
Representing interventions
Example:
Add
a manipulation variable as a “cause”
Current
Light Switch
Light Bulb
Plant Growth
Representing interventions
Example:
Add
a manipulation variable as a “cause” that does not
matter when it is inactive
Current
Light Switch
Inactive
Light Bulb
Plant Growth
Inactive Manipulation
Representing interventions
Example:
Add
a manipulation variable as a “cause” that does not
matter when it is inactive
When it is active,
Current
Light Switch
Inactive
Light Bulb
Plant Growth
Inactive Manipulation
Current
Light Switch
Light Bulb
Plant Growth
Active Manipulation
Representing interventions
Example:
Add
a manipulation variable as a “cause” that does not
matter when it is inactive
When it is active, break the incoming edges, but leave
the outgoing edges
Current
Light Switch
Inactive
Light Bulb
Plant Growth
Inactive Manipulation
Current
Light Switch
Light Bulb
Plant Growth
Active Manipulation
Representing interventions
Straightforward extension to more interesting types
of interventions
Interventions
away from current state
Multi-variate interventions
Etc.
Key: For all of these, the “intervention operator”
takes a causal graphical model as input, and yields
a causal graphical model as output
“Post-intervention
CGM” is an ordinary CGM
Why randomize?
Standard scientific practice: randomize Treatment to
find its Effects
E.g.,
don’t let people decide on their own whether to
take the drug or placebo
What is the value of randomization?
Randomization
is an intervention
All edges into T will be broken, including from any
common causes of T and E!
⇒ If T
E, then we must have: T → E
⇒
Why randomize?
Graphically,
Treatment
?
Effect
Why randomize?
Graphically,
Treatment
?
Unobserved
Factors
Effect
Why randomize?
Graphically,
Treatment
?
Unobserved
Factors
Effect
Why randomize?
Graphically,
Treatment
?
Unobserved
Factors
Effect
Why randomize?
Graphically,
Treatment
?
Unobserved
Factors
Effect
Three uses of causal models
Represent (and predict the effects of) interventions
on variables
Causal
models only, of course
Efficiently determine independencies
I.e.,
which variables are informationally relevant for
which other ones?
Use those independencies to rapidly update beliefs
in light of evidence
Determining independence
Markov & Faithfulness ⇒ DAG structure determines
all statistical independencies and associations
Graphical criterion: d-separation
X and Y are independent given S iff
X and Y are d-separated given S iff
X and Y are not d-connected given S
Intuition: X and Y are d-connected iff
information can “flow” from X to Y along some path
d-separation
C is a collider on a path iff A → C ← B
Formally:
A
path between A and B is active given S iff
Every
non-collider on the path is not in S; and
Every collider on the path is either in S, or else one of its
descendants is in S
X
and Y are d-connected by S iff there is an active
path between X and Y given S
d-separation
Surprising feature being exploited here:
Conditioning
on a common effect induces an association
between independent causes
Motivating example:
Gas Tank → Car Starts ← Spark Plugs
Gas
and Plugs are independent, but if we know that the car
doesn’t start, then they’re associated
And
In that case, learning Gas = Full changes the likelihood that
Plugs = Bad
similarly if Car Starts → Emits Exhaust
d-separation
Algorithm to determine d-separation:
Write down every path between X and Y
1.
–
–
–
Edge direction is irrelevant for this step
Just write down every sequence of edges that lies
between X and Y
But don’t use a node twice in the same path
d-separation
Algorithm to determine d-separation:
Write down every path between X and Y
For each path, determine whether it is active by
checking the status of each node on the path
1.
2.
The node is not active if either:
–
–
–
–
–
N is a collider + not in S (and no descendants of N are in S);
or
N is not a collider and in S.
I.e., “multiply” the “not”s to get the node status
Any node not active ⇒ path not active
d-separation
Algorithm to determine d-separation:
1.
2.
3.
Write down every path between X and Y
For each path, determine whether it is active by
checking the status of each node on the path
Any path active ⇒ d-connected ⇒ X & Y associated
No path active ⇒ d-separated ⇒ X & Y independent
d-separation
Exercise and Weight given Metabolism?
E
→M→W
Food
Eaten
Blocked!
M is
an included non-collider
E
→ FE → W
FE is
a non-included non-collider
Exercise
Weight
Unblocked!
⇒E
W|M
Metabolism
d-separation
Metabolism and FE given Exercise?
M
→ W ← FE
Food
Eaten
Blocked!
W is
a non-included collider
M
← E → FE
E is
an included non-collider
Exercise
Weight
Blocked!
⇒M
FE | E
Metabolism
d-separation
Metabolism and FE given Weight?
M
→ W ← FE
Food
Eaten
Unblocked!
W is
an included collider
M
← E → FE
E is
a non-included non-collider
Exercise
Weight
Unblocked!
⇒M
FE | W
Metabolism
Updating beliefs
For both statistical and causal models, efficient
computation of independencies ⇒
efficient prediction from observations
Specific
instance of belief updating
Typically, “just” compute conditional probabilities
Significantly
easier if we have (conditional) independencies,
since we can ignore variables
Bayes (and Bayesianism)
Bayes’ Theorem: P (T | D) = P ( D | T ) P (T )
proof
is trivial…
P ( D)
Interpretation is the interesting part:
Let
D be the observation and T be our target
variable(s) of interest
⇒ Bayes’ theorem says how to update our beliefs
about T given some observation(s)
Bayes (and Bayesianism)
Terminology:
Posterior
distribution
Likelihood
function
P ( D | T ) P (T )
P (T | D) =
P ( D)
Data distribution
Prior
distribution
Bayes and independence
Knowing independencies can greatly speed
Bayesian updating
P(C | E, F, G) = [complex mess]
Suppose C independent of F, G given E
⇒
P(C | E, F, G) = P(C | E) = [something simpler]
Updating beliefs
Compute: P(M = Hi | E = Hi, FE = Lo)
FE
M|E ⇒
P(M | E, FE) = P(M | E)
Food
Eaten
And
P(M | E) is a term in the
Markov factorization!
Exercise
Weight
Metabolism
Looking ahead…
Have:
Basic
formal representation for causation
Fundamental causal asymmetry (of intervention)
Inference & reasoning methods
Need:
Search
& causal discovery methods