No Slide Title


Paul Switzer
MPEPM by H. Ö.
• aggregation of micro-environments
• temporal & spatial variability of concentrations in ME studies
• estimate percentiles of population exposure
  - how affected by source mitigation
• how to assess indoor / outdoor relation
  - time aggregation issues
  - population stratification
  - personal vs ambient
• 60% of indoor is outdoor (vs 25%)?
• deposition K proportional to concentration?
• more details for Monte Carlo study
• student data - lognormal?
  - over time
  - over students
• m.e. exposure time variability affects estimates of source mitigation efficacy
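A minimal sketch of the Monte Carlo idea behind these points: aggregate time-weighted microenvironment concentrations per person, then read off population exposure percentiles with and without mitigation of an indoor source. Every distribution and parameter below is an illustrative assumption (only the 0.6 indoor/outdoor fraction echoes the "60%" figure queried above); it is not the MPEPM model itself.

```python
import random

def simulate_exposures(n_people, indoor_source_scale=1.0, seed=0):
    """Time-weighted exposure per person over two microenvironments.

    All distributions and parameter values are hypothetical.
    """
    rng = random.Random(seed)
    exposures = []
    for _ in range(n_people):
        # Fraction of the day spent at home varies between people.
        t_home = rng.uniform(0.5, 0.9)
        t_out = 1.0 - t_home
        # Lognormal concentrations: ambient, plus an indoor source at home.
        c_out = rng.lognormvariate(3.0, 0.5)
        c_src = indoor_source_scale * rng.lognormvariate(2.5, 0.8)
        c_home = 0.6 * c_out + c_src  # 0.6 = assumed indoor fraction of outdoor PM
        exposures.append(t_home * c_home + t_out * c_out)
    return exposures

def percentile(values, q):
    """Simple empirical percentile (q in [0, 1])."""
    s = sorted(values)
    return s[min(len(s) - 1, int(q * len(s)))]

baseline = simulate_exposures(10000)
mitigated = simulate_exposures(10000, indoor_source_scale=0.5)  # halve the indoor source
for q in (0.5, 0.9, 0.99):
    print(q, round(percentile(baseline, q), 1), round(percentile(mitigated, q), 1))
```

Because both runs share a seed, each simulated person has the same time budget and ambient level in both scenarios, so the difference between the two percentile columns reflects the mitigation alone.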
Jerry Sacks
MODELS (1)
PRE-SELECTION OF DATA
Example O3 & Respiratory Hospital Admissions
O3 is much higher in summer.
Respiratory Hospital Admissions peak in January & February
Jerry Sacks
MODELS (2)
VARIATION IN THE SUSCEPTIBLE ‘POOL’
(RISK-WEIGHTING)
Jerry Sacks
MODELS (3)
EXAMPLES -
- HEAT + O3
- PM10 + WINTER CONDITIONS
Jerry Sacks
MODELS (4)
Effect of HEAT + O3 greater than either alone
Effect of PM10 exposure greater in winter (because
susceptible pool is greater) than at other times.
Jerry Sacks
MODELS (5)
It follows that there will be inconsistency in the
effect of a given pollution increase.
Evidence of inconsistency is not evidence against
a causal inference.
Jerry Sacks
MODELS (6)
Evidence of an association getting stronger when
the exposure is more accurately measured is evidence
in favor of a causal inference.
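A small simulation can illustrate why a real effect should look stronger as exposure measurement improves. The sketch assumes a linear response and classical (additive, independent) measurement error, neither of which is stated on the slide; under those assumptions the fitted slope is attenuated by roughly 1 / (1 + error_sd²).

```python
import random

def fitted_slope(error_sd, n=20000, true_slope=1.0, seed=1):
    """Least-squares slope of response on error-prone exposure (hypothetical setup)."""
    rng = random.Random(seed)
    w, y = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                          # true exposure
        w.append(x + rng.gauss(0.0, error_sd))           # measured exposure
        y.append(true_slope * x + rng.gauss(0.0, 1.0))   # health response
    mw, my = sum(w) / n, sum(y) / n
    cov = sum((wi - mw) * (yi - my) for wi, yi in zip(w, y))
    var = sum((wi - mw) ** 2 for wi in w)
    return cov / var

# Slope estimates as the measurement error shrinks.
for sd in (2.0, 1.0, 0.5, 0.0):
    print(sd, round(fitted_slope(sd), 2))
```

The estimated slope moves toward the true value of 1.0 as `error_sd` decreases, which is the pattern the slide describes as evidence in favor of causality.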
Allan Marcus
Berkson errors can look like classical errors if
exposure-response is nonlinear
[Figure: nonlinear exposure-response curve (axes: Exposure, Response). Marked exposures: true for non-cooker, assigned from stationary monitor, true for cooker, true for smoker.]
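One ingredient of this point can be checked numerically: under Berkson error (true exposure = assigned exposure + independent error), the average response at the assigned exposure matches the response curve only if the curve is linear. The sketch below assumes a Gaussian error and a quadratic curve purely for illustration; it does not reproduce the figure.

```python
import random

def mean_response(response, assigned=30.0, error_sd=10.0, n=100000, seed=2):
    """Average true response among people sharing one assigned exposure."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        # Berkson error: the true exposure scatters around the assigned value.
        true_exposure = assigned + rng.gauss(0.0, error_sd)
        total += response(true_exposure)
    return total / n

def linear(x):
    return 2.0 * x

def quadratic(x):
    return x * x

# Gap between the population-average response and the curve at the assigned value.
linear_gap = mean_response(linear) - linear(30.0)
quadratic_gap = mean_response(quadratic) - quadratic(30.0)
print(round(linear_gap, 2), round(quadratic_gap, 1))
```

For the linear curve the gap is essentially zero; for the quadratic curve it is close to the error variance (here about 100), so a monitor-based exposure no longer gives an unbiased picture of the response.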
Allan Marcus
Models for Microenvironment Measurement Error
$$C_{ME} = C_{ambient}\,\frac{aP}{a+K} + \frac{aFP}{a+K} + \frac{Q_{source}}{V(a+K)}$$
a = ventilation rate (air exchanges per hour)
P = penetration fraction for ME
(P ≈ 0.9 for PM2.5, P ≈ 0.7 for PM10)
K = deposition rate (per hour)
F = random variation in Cambient adjusting for wind direction,
local sources, (PM) filter errors and flow rate errors
Qsource = ME source emission rate (mg PM per hour)
V = volume of indoor / personal space
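A minimal sketch of how these definitions combine, assuming the steady-state mass-balance form in which ambient PM (plus its random variation F, treated here as additive) enters through the factor aP/(a+K) and the indoor source contributes Qsource/(V(a+K)). That reading of the slide, the function name, and all numeric inputs are assumptions.

```python
def me_concentration(c_ambient, a, p, k, q_source, v, f=0.0):
    """Steady-state microenvironment concentration (assumed mass-balance form).

    c_ambient : ambient concentration (mg/m^3)
    a         : ventilation rate (air exchanges per hour)
    p         : penetration fraction
    k         : deposition rate (per hour)
    q_source  : ME source emission rate (mg PM per hour)
    v         : volume of the indoor / personal space (m^3)
    f         : random variation in the ambient concentration (mg/m^3)
    """
    infiltration = a * p / (a + k)
    return infiltration * (c_ambient + f) + q_source / (v * (a + k))

# Hypothetical inputs: 30 ug/m^3 ambient PM2.5, one air change per hour.
c_me = me_concentration(0.030, a=1.0, p=0.9, k=0.4, q_source=1.0, v=300.0)
print(round(c_me * 1000, 1), "ug/m^3")
```

With the source switched off (`q_source=0.0`) the result is always below the ambient level, since aP/(a+K) < 1 whenever P ≤ 1 and K > 0.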
Allan Marcus
Recent Results on EPA Personal Monitoring
for Riverside CA (PTEAM)
$$\hat{E}_{indoors} = E_{personal} - C_{out}\,\frac{a + Kt}{a + K}$$
a = ventilation, K = deposition, t = fraction of time spent indoors
Estimated indoor exposure to ambient penetrating indoors
Within-subject (longitudinal)
$E_{personal}$ shows little correlation with $C_{outdoors}$ in some subjects,
high correlation in others.
$\hat{E}_{indoors}$ is almost uncorrelated with $C_{outdoors}$.
$C_{outdoors}$, visit j, is highly correlated with $E_{personal}$.
Is the ME model adequate for indoor source variability?
PM10 indoors  19  0.226 PM10 out  11.6 Pets  2.0 NOIG
Adrian Raftery
Discussion of Merlise Clyde’s paper
Goal: Towards causal inference from observational env. data
X → Y ? (PM → death ?)
Obstacles:
Y → X ? (time ordering)
OR
[Diagrams: alternative structures in which a third variable Z is linked to X and to Y]
What should Z be?
R.A. Fisher: “Make your model as big as an elephant”.
Here dim(Z) ≈ ∞.
Adrian Raftery
But: include all Z’s
⇒ (very) inefficient
Statistical variable selection:
Standard method (S+, SAS,…): stepwise + condition on selected model
• (frequentist) properties unknown
• underestimates uncertainty
  ⇒ overstates significance
  ⇒ understates SEs, CIs.
• can be VERY misleading (Freedman ’83).
BMA:
• accounts for model uncertainty
• optimal predictive performance (MR ’94)
• tests minimize total (frequentist) error rates (= Type I + Type II) (Jeffreys ’61).
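As a rough illustration of how BMA accounts for model uncertainty, the sketch below averages over all subsets of a few candidate predictors in a linear regression, weighting each model by exp(-BIC/2). The BIC approximation, the equal prior model probabilities, and the simulated data are assumptions made for this example, not details taken from the slides.

```python
import itertools
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def model_bic(xs, y, subset):
    """BIC of the OLS fit of y on an intercept plus the columns in `subset`."""
    n = len(y)
    rows = [[1.0] + [xs[i][j] for j in subset] for i in range(n)]
    p = len(subset) + 1
    xtx = [[sum(r[u] * r[v] for r in rows) for v in range(p)] for u in range(p)]
    xty = [sum(r[u] * yi for r, yi in zip(rows, y)) for u in range(p)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(bj * rj for bj, rj in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return n * math.log(rss / n) + p * math.log(n)

def bma_inclusion_probs(xs, y):
    """Approximate posterior probability that each coefficient is nonzero."""
    q = len(xs[0])
    models = [s for size in range(q + 1)
              for s in itertools.combinations(range(q), size)]
    bics = [model_bic(xs, y, s) for s in models]
    best = min(bics)
    weights = [math.exp(-0.5 * (b - best)) for b in bics]  # equal prior model odds
    total = sum(weights)
    return [sum(w for w, s in zip(weights, models) if j in s) / total
            for j in range(q)]

# Simulated data: only the first of three candidate predictors affects y.
rng = random.Random(3)
xs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
y = [1.0 + 2.0 * row[0] + rng.gauss(0, 1) for row in xs]
probs = bma_inclusion_probs(xs, y)
print([round(p, 3) for p in probs])
```

Instead of conditioning on one selected model, each predictor gets a posterior inclusion probability that reflects every model in which it appears.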
Adrian Raftery
Designing a REALISTIC simulation study
• We analyzed the 49 “relevant” case-control studies in Amer. J. Epi. in 1996
and based our simulation study on these.
• IQRs of reported OR’s:
  n                 IQR         max    # OR’s
  n ∈ [200, 400]    2.2 – 5.6   27.0   26
  n ∈ [700, 1300]   1.1 – 2.2   10.7   41
• Simulation design:
  32 X-variables
  10 of these associated with Y
  22 of these indep. of Y
             n=300       n=100
  OR range   1.6 – 5.5   1.4 – 3.0
Adrian Raftery
Results:
Standard approaches
                                         % of coefs ≠ 0
Significance   p-values      (1-p)%      Stepwise   2-stage
Barely sig.    .05 - .10     90 – 95     31         37
Significant    .01 - .05     95 – 99     52         63
Highly sig.    .001 - .01    99 – 99.9   87         90
VHS            < .001        > 99.9      99.3       99.7
BMA
Evidence      % post. prob. that β ≠ 0   % of coefs. that actually were ≠ 0
Weak          50 – 75                    61
Positive      75 – 95                    75
Strong        95 – 99                    91
Very Strong   > 99                       99
Point estimation: MSE of $\hat{\beta}$ is lower for BMA.
Best = BMA < 2-stage < stepwise = worst.
Adrian Raftery
Merlise:
• BMA over large q
• Presentation of results
• Many cute tricks
Orthogonalization:
Recall: BMA = inference for the full model
$$Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_q X_q + \varepsilon$$
with prior point mass on $\{\beta_j = 0\}$.
Justified by prior beliefs / approx. ≈ prior mass on $\{|\beta_j| < \delta\}$ if $\delta < \tfrac{1}{2}$ SE.
1st version of paper: orthogonalize X.
Very fast, but what prior does it correspond to?
Adrian Raftery
E.g. X1 = PM; X2 = temperature
Then PCs: $W_1 = .7X_1 + .3X_2$
$W_2 = .3X_1 - .7X_2$ (say)
model: $Y = \gamma_0 + \gamma_1 W_1 + \gamma_2 W_2 + \varepsilon$
BMA prior puts point mass on $\{\gamma_1 = 0\}$, i.e. $\{.7\beta_1 + .3\beta_2 = 0\}$. WHY?
Different example: X1 = PM_t ; X2 = PM_{t-1}
Then $W_1 \approx$ general PM level
$W_2 \approx$ increase in PM since yesterday
Then the hypotheses $\{\gamma_1 = 0\}$, $\{\gamma_2 = 0\}$ may make sense.
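The link between a point mass on a principal-component coefficient and a constraint on the original coefficients can be worked out by substitution. The notation is an assumption: γ denotes coefficients in the PC model, β those in the model on X1 and X2, and W2 is taken as .3X1 − .7X2 so that the two loading vectors are orthogonal.

```latex
\begin{align*}
Y &= \gamma_0 + \gamma_1 W_1 + \gamma_2 W_2 + \varepsilon \\
  &= \gamma_0 + \gamma_1 (.7X_1 + .3X_2) + \gamma_2 (.3X_1 - .7X_2) + \varepsilon \\
  &= \gamma_0 + \underbrace{(.7\gamma_1 + .3\gamma_2)}_{\beta_1} X_1
              + \underbrace{(.3\gamma_1 - .7\gamma_2)}_{\beta_2} X_2 + \varepsilon, \\[4pt]
.7\beta_1 + .3\beta_2 &= (.49 + .09)\gamma_1 + (.21 - .21)\gamma_2 = .58\,\gamma_1 .
\end{align*}
```

So {γ1 = 0} holds exactly when .7β1 + .3β2 = 0: the prior singles out one particular linear combination of the PM and temperature effects, which is hard to motivate scientifically.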
Adrian Raftery
⇒ Orthogonalization may make sense for groups of variables
measuring the same thing.
Better alternative? Measurement model
LISREL (Jöreskog)
[Path diagram: a latent X and a latent Y, each measured by observed indicators (x1, x2 and η1, η2, η3) subject to measurement error; the link between the latent variables is marked "?".]
Larry Cox
Workshop Summary
Objectives
• immerse statisticians in the scientific framework / problems for PM
• illustrate role of statistical methods / analysis towards their solution
• make connections
Objectives met? I think so.
Perhaps nothing entirely new - but “new to you”.
Larry Cox
Study Design, Sampling & Modeling
– MEASUREMENT ERROR
– inter-individual variability and susceptibility
– 100+ covariates -- which are bad actors?
– susceptible subpopulation and other factors may change with:
  • location
  • season
  • covariates
  • pollution
  dynamic models?
– what is best (= appropriate + available) measure of exposure? how does it relate to health effects?
– individual → cohort → aggregation → ecological models: whither and whence?
– role of ambient monitoring data
– estimating means and extremes
– making sense of the whole:
  • combining studies
  • understanding differences in covariates / lags between studies
– proper adjustment by meteorology
– model selection / averaging
– multiplicities
– case crossover designs
Larry Cox
Estimating Status and Trend
– proper removal of trend and seasonal effects
– inference robust to smoothing method
– investigate effects of combined (e.g. PCA) pollutants instead of one at a time
– how to define trend / regional trend
Spatio - Temporal Modeling
– accounting for bias due to network design
– validating atmospheric models using ambient measurements
– spatio - temporal models of ambient exposure
– use of short term / mobile monitoring
– simulating regulatory outcomes
Health Effects
– specificity of effects
– pollutant mixtures
– get dose in context
Larry Cox
Aggregation and Scale
– aggregating short term measurements to (ANOVA) averages
– imputing for unobserved subgroups / microenvironments
– predict aggregate or aggregate predictions?
(consistency)
– what is appropriate scale(s) on which to link exposure and health effects?
– tradeoff between temporal and spatial resolution
– role of (daily) observations in estimating (hourly) exposures
– stratification
– aggregating microenvironments
– simulating individual activities over multiple days
“Aggregate when similar; disaggregate when different.” P.S.
Synergies
– Atmospheric and spatio - temporal models
– Atmospheric and receptor models
– Spatio - temporal and receptor models
Personal Lessons Learned
– Particles are cunning little devils. They
• grow
• bounce
• pig in poke?
but
• occasionally are coarse
– We breathe our colleagues’ air
– Computers don’t have unions
– We need more data!
Workshop Outcomes
– workshop summary + videos + etc
– research
• NRCSE
– Connections
• others
⇒ More Work for Everyone.
Larry Cox