Results of the Causality Challenge
Isabelle Guyon, Clopinet
Constantin Aliferis and Alexander Statnikov, Vanderbilt Univ.
André Elisseeff and Jean-Philippe Pellet, IBM Zürich
Gregory F. Cooper, University of Pittsburgh
Peter Spirtes, Carnegie Mellon
Causality Workbench
clopinet.com/causality
Causal discovery
What affects…
…your health?
…the economy?
…climate changes?
Which actions will have beneficial effects?
Systemic causality
[Figure: the system and an external agent.]
Feature Selection
Predict Y from features X_1, X_2, … Select the most predictive features.
Causation
Predict the consequences of actions: Under “manipulations” by an external agent, some features are no longer predictive.
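The point above can be illustrated with a small simulation. This is a minimal sketch, not the challenge's data generator: it assumes a hypothetical three-variable chain (cause -> Y -> effect) with Gaussian noise, and the names `simulate` and `pearson` are made up for illustration. In natural data both features correlate with Y; when an external agent sets the features, the effect loses its predictive power while the cause keeps it.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, manipulated, seed=0):
    """Sample (cause, effect, target) from the toy chain cause -> Y -> effect.

    Assumption for illustration: when `manipulated` is True, an external
    agent sets both features at random. Y still responds to its cause,
    but the effect no longer depends on Y.
    """
    rng = random.Random(seed)
    cause, effect, target = [], [], []
    for _ in range(n):
        c = rng.gauss(0, 1)            # the cause has no parents in this toy model
        y = c + rng.gauss(0, 1)        # Y depends on its cause in both regimes
        if manipulated:
            e = rng.gauss(0, 1)        # the agent overrides the effect
        else:
            e = y + rng.gauss(0, 1)    # naturally, the effect depends on Y
        cause.append(c)
        effect.append(e)
        target.append(y)
    return cause, effect, target


for regime in (False, True):
    c, e, y = simulate(5000, manipulated=regime)
    label = "manipulated" if regime else "natural"
    print(label, "corr(cause, Y) =", round(pearson(c, y), 2),
          "corr(effect, Y) =", round(pearson(e, y), 2))
```

A feature selector trained on the natural regime would rank the effect highly, which is exactly the feature that stops being useful once the agent intervenes.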
Challenge Design
Available data
• A lot of “observational” data.
  Correlation ≠ Causality!
• Experiments are often needed, but:
  – Costly
  – Unethical
  – Infeasible
• This challenge, semi-artificial data:
  – Re-simulated data
  – Real data with artificial “probes”
Challenge datasets
Four tasks
Toy datasets
On-line feed-back
Difficulties
• Violated assumptions:
  – Causal sufficiency
  – Markov equivalence
  – Faithfulness
  – Linearity
  – “Gaussianity”
• Overfitting (statistical complexity):
  – Finite sample size
• Algorithm efficiency (computational complexity):
  – Thousands of variables
  – Tens of thousands of examples
Evaluation
• Fulfillment of an objective
• Prediction of a target variable
• Predictions under manipulations
• Causal relationships:
  – Existence
  – Strength
  – Degree
Setting
• Predict a target variable (on training and test data).
• Return the set of features used.
• Flexibility:
  – Sorted or unsorted list of features
  – Single prediction or table of results
Complete entry = xxx0, xxx1, xxx2 results (for at least one dataset).
Metrics
• Results are ranked according to the test set target prediction performance (“Tscore”).
• We also directly assess the feature set with an “Fscore”, which is not used for ranking.
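The transcript does not preserve how the Tscore is computed. Since a later slide is titled "AUC distribution", the sketch below assumes an AUC-type score for a binary target. It is an illustration of the area under the ROC curve in its pairwise-ranking form, not the organizers' scoring code, and the example scores and labels are made up.

```python
def auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Computed as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count for one half.
    """
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predictions on five test examples.
example_scores = [0.9, 0.4, 0.35, 0.6, 0.2]
example_labels = [1, 0, 1, 1, 0]
example_auc = auc(example_scores, example_labels)
print(example_auc)
```

The quadratic pair loop keeps the definition visible; a rank-based formula would be preferable for tens of thousands of examples.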
Toy Examples
Causality assessment with manipulations
LUCAS0: natural
[Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue and Car Accident.]
Causality assessment with manipulations
LUCAS1: manipulated
[Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue and Car Accident.]
Causality assessment with manipulations
LUCAS2: manipulated
[Figure: causal graph over Anxiety, Peer Pressure, Born an Even Day, Yellow Fingers, Smoking, Genetics, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue and Car Accident.]
Goal driven causality
• We define: V = variables of interest (e.g. the Markov blanket (MB), direct causes, ...).
• Participants return: S = selected subset (ordered or not).
• We assess causal relevance: Fscore = f(V, S).
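The slide leaves f unspecified. As a hedged illustration of what comparing V and S can look like for an unordered subset, the sketch below uses plain set overlap (precision, recall and their harmonic mean). This is an assumed stand-in chosen for simplicity, not the challenge's actual Fscore; the function name `set_overlap` and the variable indices are hypothetical.

```python
def set_overlap(V, S):
    """Compare a selected subset S with the variables of interest V.

    Returns (precision, recall, harmonic_mean):
      precision = share of selected variables that are of interest,
      recall    = share of variables of interest that were selected.
    """
    V, S = set(V), set(S)
    hits = len(V & S)
    precision = hits / len(S) if S else 0.0
    recall = hits / len(V) if V else 0.0
    if hits == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example: four variables of interest, three selected features.
precision, recall, harmonic = set_overlap(V=[1, 2, 3, 4], S=[2, 4, 9])
print(precision, recall, harmonic)
```

For an ordered S, a ranking-based measure would use the order as well; the set-based version ignores it.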
Causality assessment without manipulation?
Using artificial “probes”
LUCAP0: natural
[Figure: causal graph over Yellow Fingers, Anxiety, Smoking, Peer Pressure, Genetics, Born an Even Day, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue and Car Accident, plus probes P_1, P_2, P_3, …, P_T.]
Using artificial “probes”
LUCAP1&2: manipulated
[Figure: causal graph over Yellow Fingers, Anxiety, Smoking, Peer Pressure, Genetics, Born an Even Day, Allergy, Lung Cancer, Attention Disorder, Coughing, Fatigue and Car Accident, plus probes P_1, P_2, P_3, …, P_T.]
Scoring using “probes”
• What we can compute (Fscore):
  – Negative class = probes (here, all “non-causes”, all manipulated).
  – Positive class = other variables (may include causes and non-causes).
• What we want (Rscore):
  – Positive class = causes.
  – Negative class = non-causes.
• What we get (asymptotically):
  Fscore = (N_{TruePos} / N_{Real}) · Rscore + 0.5 · (N_{TrueNeg} / N_{Real})
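The asymptotic relation can be checked with a little arithmetic. The sketch below assumes that N_Real counts the real (non-probe) variables and splits into N_TruePos causes and N_TrueNeg non-causes, so N_Real = N_TruePos + N_TrueNeg. That reading is an interpretation of the slide, and the counts in the example are made up.

```python
def expected_fscore(rscore, n_true_pos, n_true_neg):
    """Asymptotic Fscore implied by an Rscore, per the relation on the slide.

    Assumption: n_real = n_true_pos + n_true_neg (real variables are
    either causes or non-causes).
    """
    n_real = n_true_pos + n_true_neg
    return (n_true_pos / n_real) * rscore + 0.5 * (n_true_neg / n_real)


def rscore_from_fscore(fscore, n_true_pos, n_true_neg):
    """Invert the same relation to recover the Rscore from a measured Fscore."""
    n_real = n_true_pos + n_true_neg
    return (fscore - 0.5 * n_true_neg / n_real) * n_real / n_true_pos


# Hypothetical counts: 20 real causes, 80 real non-causes, Rscore of 0.9.
f = expected_fscore(0.9, n_true_pos=20, n_true_neg=80)   # 0.2 * 0.9 + 0.5 * 0.8
r = rscore_from_fscore(f, n_true_pos=20, n_true_neg=80)
print(round(f, 4), round(r, 4))
```

With few real causes the Fscore is pulled toward 0.5 whatever the Rscore is, so a modest Fscore can hide a good Rscore.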
Results
Challenge statistics
• Start: December 15, 2007.
• End: April 30, 2008.
• Total duration: 20 weeks.
• Last (complete) entry ranked: number of ranked entrants; number of ranked submissions.
Learning curves
[Figure: learning curves in four panels (REGED, CINA, SIDO, MARTI). Horizontal axis: days into the challenge, 0 to 140. Vertical axis: 0.3 to 1. Legend entries: 0, 1, 2.]
AUC distribution
REGED
[Figure: results by entrant on REGED. Legend: Gavin Cawley; Yin-Wen Chang; Mehreen Saeed; Alexander Borisov; E. Mwebaze & J. Quinn; H. Jair Escalante; J.G. Castellano; Chen Chu An; Louis Duclos-Gosselin; Cristian Grozea; H.A. Jen; J. Yin & Z. Geng Gr.; Jinzhu Jia; Jianming Jin; L.E.B & Y.T.; M.B.; Vladimir Nikulin; Alexey Polovinkin; Marius Popescu; Ching-Wei Wang; Wu Zhili; Florin Popescu; CaMML Team; Nistor Grozavu.]
SIDO
[Figure: results by entrant on SIDO; same legend as for REGED.]
CINA
[Figure: results by entrant on CINA; same legend as for REGED.]
MARTI
[Figure: results by entrant on MARTI; same legend as for REGED.]
Pairwise comparisons
[Figure: pairwise comparisons between entrants; same legend of entrants as in the per-dataset figures.]
Top ranking methods
• According to the rules of the challenge:
  – Yin-Wen Chang: SVM => best prediction accuracy on REGED and CINA. Prize: $400 donated by Microsoft.
  – Gavin Cawley: Causal Explorer + linear ridge regression ensembles => best prediction accuracy on SIDO and MARTI. Prize: $400 donated by Microsoft.
• According to pairwise comparisons:
  – Jianxin Yin and Prof. Zhi Geng’s group: Partial Orientation and Local Structural Learning => best on Pareto front, new original causal discovery algorithm. Prize: free WCCI 2008 registration.
Pairwise comparisons
[Figure: pairwise comparisons per dataset in four panels (REGED, SIDO, CINA, MARTI); same legend of entrants as in the per-dataset figures.]
Conclusion
• We have found good correlation between causation and prediction under manipulations.
• Several algorithms have demonstrated their effectiveness at discovering causal relationships.
• We still need to investigate what makes them fail in some cases.
• We need to capitalize on the power of classical feature selection methods.