
Connections between Bayesian and Conditional Inference in Matched Studies
Ken Rice
University of Washington, Dept of Biostatistics
November 2006
Outline
• Matched case-control studies (simple but very highly
stratified data)
• Conditional and Bayesian approaches
• Conditional and Bayesian dogma
• A resolution
• Nice properties of this resolution
Formal description
• First ‘match’ subjects; same age, sex, etc; one case to one control
• Then record exposures: Z1k for the control and Z2k for the case, in pair k
• e.g. for binary exposure Z1k ~Bern( p1k ), Z2k ~Bern( p2k )
• Assume
logit(p2k) = logit(p1k) + log(ψ)
• Generates one nuisance parameter for each pair (Neyman-Scott)
• The likelihood factorizes;
Pr(Z1k, Z2k) = Lcond(ψ; Z1k, Z2k) × Lmarg(ψ, p1k; Z1k + Z2k)
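The setup can be made concrete with a small simulation. A minimal sketch (my own illustration, not from the talk), assuming the model above with pair-specific nuisance probabilities p1k and a common odds ratio ψ:

```python
# Sketch of the pair-matched model: each pair k has its own nuisance
# probability p1k; the case probability p2k satisfies
# logit(p2k) = logit(p1k) + log(psi).
import numpy as np

rng = np.random.default_rng(0)

def simulate_pairs(psi, p1):
    """Simulate (Z1k, Z2k) for matched pairs with control probabilities p1."""
    odds1 = p1 / (1 - p1)                  # control exposure odds
    p2 = psi * odds1 / (1 + psi * odds1)   # case exposure probability
    z1 = rng.binomial(1, p1)               # control exposures Z1k
    z2 = rng.binomial(1, p2)               # case exposures Z2k
    return z1, z2

# One nuisance parameter per pair (the Neyman-Scott setting):
p1 = rng.uniform(0.1, 0.9, size=10_000)
z1, z2 = simulate_pairs(psi=2.0, p1=p1)
```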
Analysis via conditioning
• You can maximize w.r.t. all parameters; but the MLE for ψ converges to ψ² (awful!)
• ’Conditional likelihood’ ignores the difficult, unhelpful term: keep
Lcond(ψ; Z1k, Z2k), drop Lmarg(ψ, p1k; Z1k + Z2k)
• Conditional likelihood contributions for a binary exposure;

                            Number of cases exposed
                                0            1
  Number of         0           1         ψ/(1+ψ)
  controls
  exposed           1        1/(1+ψ)         1

• Ratio of discordant pairs gives cMLE for ψ
• Well behaved, standard likelihood asymptotics work. The general form is
conditional logistic regression
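As a sketch (my own code, continuing the simulation above), the cMLE here is just the ratio of the two discordant-pair counts:

```python
# cMLE for psi in 1:1 matching: (pairs with only the case exposed) /
# (pairs with only the control exposed). Concordant pairs contribute 1
# to the conditional likelihood, so they drop out.
import numpy as np

def cmle_psi(z1, z2):
    case_only = np.sum((z2 == 1) & (z1 == 0))     # case exposed, control not
    control_only = np.sum((z1 == 1) & (z2 == 0))  # control exposed, case not
    return case_only / control_only

# With z1, z2 simulated earlier, this lands near the true psi = 2.0.
print(cmle_psi(z1, z2))
```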
Random-effects analysis
• Suppose p1k ~ G, a random-effects/mixing/prior distribution
• Marginal likelihood contributions (dropping k);
                            Number of cases exposed
                                0                               1
  Number of         0   EG(Pr(Z1+Z2 = 0))            [ψ/(1+ψ)]·EG(Pr(Z1+Z2 = 1))
  controls
  exposed           1   [1/(1+ψ)]·EG(Pr(Z1+Z2 = 1))  EG(Pr(Z1+Z2 = 2))
• Inference comes from marginal likelihood (often fully Bayesian)
• Define EG(Pr(Z1k + Z2k = t)) = mt; the marginal probabilities
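In code, the marginal log-likelihood depends on the data only through the four cell counts, and on G only through m. A sketch in the same notation (m0, m1, m2 assumed known or parameterized; not code from the talk):

```python
# Marginal log-likelihood in terms of psi and m = (m0, m1, m2),
# using the cell probabilities from the table above.
import numpy as np

def marginal_loglik(psi, m, z1, z2):
    m0, m1, m2 = m
    n00 = np.sum((z1 == 0) & (z2 == 0))  # neither exposed
    n01 = np.sum((z1 == 0) & (z2 == 1))  # discordant: case exposed
    n10 = np.sum((z1 == 1) & (z2 == 0))  # discordant: control exposed
    n11 = np.sum((z1 == 1) & (z2 == 1))  # both exposed
    return (n00 * np.log(m0)
            + n01 * np.log(psi / (1 + psi) * m1)
            + n10 * np.log(1.0 / (1 + psi) * m1)
            + n11 * np.log(m2))
```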
Beating up on random-effects
• Innocuous-looking G can do even worse than the (awful)
naïve MLE! (Seaman and Richardson 2004)
• Where did G come from? A big assumption, and hard to
check, with this design.
• Consistency? Efficiency? Software for non-experts?
• Seems quite subjective (and so must be garbage!)
Beating up on conditioning!
• Why throw away information? There can be some
information about ψ in the marginal probabilities
• The cMLE is a bit biased towards the null. You can
(sometimes) do better with e.g. a normal distribution for G
• The conditioning argument completely falls to pieces
outside “pretty” models; it’s not a general prescription to get
rid of nuisance parameters
• Doesn’t use a full model (… so must be garbage!)
A pragmatic common ground
• Philosophy aside, everything would be fine if m were free of ψ
• This is actually possible. For pair-matched studies, with any
number of categorical covariates, there are mixing distributions with
exactly this property. [under review; see also Rice 2004, JASA]
• Call these invariant distributions. They exist in closed form, are
‘proper priors’, and by definition have nice conjugacy properties
• An example, for 1:1 matched case-control;
p1k = 1/2   with probability 1/2
p2k = 1/2   with probability 1/2
• This gives m={0.25,0.5,0.25}, and is invariant. More generally,
transformations of multivariate Normals can be used
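A quick numerical check of this example (my own code, not from the talk): mixing the two components with probability 1/2 each returns m = {0.25, 0.5, 0.25} whatever ψ is.

```python
# Verify invariance: component A fixes p1 = 1/2 (so p2 = psi/(1+psi));
# component B fixes p2 = 1/2 (so p1 = 1/(1+psi)); mix them 50/50.
def pair_m(p1, p2):
    return ((1 - p1) * (1 - p2),            # Pr(Z1 + Z2 = 0)
            p1 * (1 - p2) + (1 - p1) * p2,  # Pr(Z1 + Z2 = 1)
            p1 * p2)                        # Pr(Z1 + Z2 = 2)

def marginal_m(psi):
    a = pair_m(0.5, psi / (1 + psi))
    b = pair_m(1 / (1 + psi), 0.5)
    return tuple(0.5 * x + 0.5 * y for x, y in zip(a, b))

print(marginal_m(2.0))   # (0.25, 0.5, 0.25)
print(marginal_m(10.0))  # (0.25, 0.5, 0.25) -- the same, free of psi
```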
About invariant distributions #1
• Consider the previous example; using the marginal likelihood, all
you need to know is that m={0.25,0.5,0.25} – other details about G
don’t affect analysis
• There are infinitely large classes of distributions that lead to
identical marginal probabilities m. We get equivalence classes of
mixing distributions; hence we need only make non-parametric
assumptions about G
• This is (?) entirely novel in Bayesian analysis. It is Bayesian and
non-parametric, but nothing like “Bayesian non-parametrics”!
• In a weaker sense, absolutely all invariant G are equivalent; the
marginal probabilities m and odds ratio(s) ψ are orthogonal; for any m
free of ψ the marginal and conditional likelihoods are proportional
About invariant distributions #2
• Return to the example; if the case/control labels are switched then
nothing happens. Other examples show the same behavior.
• This is quite a general property. It has to hold for the marginal
probabilities – and we saw that nothing else in G actually matters.
• So it’s natural (but not necessary) that examples have this property.
Conditional logistic regression also behaves in this way; re-labelling
everyone gives the same (effective) answer
• Can be interpreted as centering the variables (so G depends on ψ)
• Assuming symmetry is surprisingly controversial among
Bayesians! But who can really quantify some a priori difference
between cases and controls? Why would they do the study?
Beating up on fundamentalists
• Recommend using invariant mixing distributions. (Call them
whatever you like in order to sleep at night)
• A direct consequence; without really good justification of some
non-invariant G, random-effects fans should use conditional logistic
regression
• Conditioning fans have absolutely nothing to boast about; random-effects
models can have all the nice properties too
• For extensions, the random-effects paradigm allows much more
flexibility. Measurement error, missing data, hierarchical models, prior
information are all quite easy extensions
Conclusions
• Highly stratified data leads naturally to considerations of
exchangeability
• Bayesians in particular have been thinking hard about
exchangeability for a long time; they have a lot to offer in this field
• Puritanical adherence to your favorite mode of inference is
unhelpful
• My thanks to many colleagues (and referees) who have helped the
development of this work immensely
References, talks, very silly posters at
http://faculty.washington.edu/kenrice
Ken’s philosophy of statistical
research (in pictures)
Boring simple models
• Fast
• Generic, reliable
• No power gains to be made
• Easy to use/misuse

Exciting (!) complex models
• Slow
• Problem-specific, can behave oddly
• Can get you extra power
• Requires training to get anywhere
Fundamentally, most types of regression are interpretable as exercises
in model-fitting.
Ken’s philosophy of statistical
research (in pictures)
As you know, not all analyses fit this description
• Cox Regression
• Conditional Logistic Regression
• Robust covariance estimation (sandwich estimates)
• Robustness to outliers through bounded influence
In vehicular form: (guess!)
It is typically very hard to adapt submarines for monster-truck jobs
But measurement error, missing data, prior information, multiple sources of data
are much more straightforward when you start with a model.
Ken’s philosophy of statistical
research (in pictures)
An all-purpose (& rather cool) solution;
It’s possible to find full-likelihood interpretations of non-standard analyses –
although not easy; CLR and bounded influence have been tackled in this way, others remain open
• Helps understand why the non-standard methods work, & potential problems
• Much easier to allow for measurement error etc;
Sub → Truck conversions become Car → Truck
• Just really cool