SPATIAL ANALYSIS OF PM2.5 DATA

FOUR METHODS OF ESTIMATING PM2.5 ANNUAL AVERAGES

Yan Liu and Amy Nail

Department of Statistics, North Carolina State University

EPA Office of Air Quality Planning and Standards, Emissions, Monitoring, and Analysis Division

Project Objectives

Estimation of the annual average PM2.5 concentration

Estimation of standard errors associated with annual average estimates

Estimation of the probability that a site’s annual average exceeds 15 µg/m³

At 2400 lattice points for 2000 and 2001

Comparisons of 4 different methodologies:

1. Quarter-based analysis (Yan)

2. Annual-based analysis (Yan)

Daily-based analyses:

3. “Doug’s method” (Bill)

4. Generalized least squares in SAS Proc Mixed (Amy)

Why are Standard Errors Important?

We may estimate that the annual average for lattice point 329 is 16 µg/m³, which exceeds the standard of 15 µg/m³. But since our estimate has some uncertainty (its standard error), we’d like to take this uncertainty into account in order to determine the probability that lattice point 329 exceeds 15 µg/m³.

In addition to maps like this ...

…we also want maps like this.

Note: This Map is WRONG--so don’t show it to anyone! We haven’t figured out the correct way to determine errors, so we cannot correctly draw a probability map yet.
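Once a trustworthy standard error is available, the exceedance probability at a lattice point could be computed roughly as sketched below. This assumes the annual-average estimate is approximately normally distributed around the true value; the function name and the standard error of 1.5 are made up for illustration and are not results from this project.

```python
from statistics import NormalDist


def exceedance_probability(estimate, std_error, threshold=15.0):
    """P(true annual average > threshold) under a normal approximation.

    Assumption: the estimate is normal with the given standard error.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    return 1.0 - NormalDist(mu=estimate, sigma=std_error).cdf(threshold)


# Hypothetical values: estimate of 16 with an assumed standard error of 1.5
p = exceedance_probability(16.0, 1.5)
print(f"P(annual average > 15) = {p:.3f}")
```

The same estimate of 16 gives a very different probability depending on the standard error, which is why an incorrect standard error makes the probability map incorrect.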

Map of 2400 Lattice Points

Data Description

Concentrations of PM2.5 measured during 2000 and 2001

The domain analyzed: the portion of the U.S. east of -100° longitude

Concentrations measured every third day

Methods 3 & 4 - Daily-Based

Used every third day data (122 days per year)

Kriged each day to obtain predictions at 2400 lattice points

At each lattice point fit a timeseries to the 122 days’ estimates to estimate annual average

Calculated timeseries error for annual average (using proc arima)

Method 4 - “Amy’s Method”

Fit a quadratic surface using Generalized Least Squares in SAS Proc Mixed

Restricted (or residual) Maximum Likelihood used to estimate all parameters

Did not assume iid errors when fitting the quadratic surface, so its coefficients are estimated taking the covariance structure into account

Specified an exponential covariance structure with a nugget

Estimated each parameter each day

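Model for one day

$$Y_{ij} = \beta_0 + \beta_1 i + \beta_2 i^2 + \beta_3 j + \beta_4 j^2 + \beta_5 ij + \varepsilon_{ij}$$

Where $i$ = latitude and $j$ = longitude

$$E(\varepsilon_{ij}) = 0$$

$$\mathrm{Cov}(\varepsilon_{ij}, \varepsilon_{i'j'}) = \begin{cases} \sigma_n^2 + \sigma^2 e^{-\mathrm{dist}/\theta} & i = i' \text{ and } j = j' \\ \sigma^2 e^{-\mathrm{dist}/\theta} & i \ne i' \text{ or } j \ne j' \end{cases}$$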
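As a rough illustration of the one-day model, the sketch below fits a quadratic surface by generalized least squares under an exponential-plus-nugget covariance. It is not the SAS Proc Mixed fit: the covariance parameters (`nugget`, `sill`, `range_`) are taken as known rather than estimated by REML, distance is plain Euclidean distance in coordinate units, and the function names and synthetic data are all assumptions made for the example.

```python
import numpy as np


def quad_design(lat, lon):
    """Design matrix with columns 1, i, i^2, j, j^2, i*j (i = lat, j = lon)."""
    return np.column_stack([np.ones_like(lat), lat, lat**2, lon, lon**2, lat * lon])


def exp_nugget_cov(lat, lon, nugget, sill, range_):
    """Exponential covariance plus a nugget on the diagonal.

    Assumption: Euclidean distance in the units of the coordinates.
    """
    coords = np.column_stack([lat, lon])
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sill * np.exp(-dist / range_) + nugget * np.eye(len(lat))


def gls_fit(X, y, Sigma):
    """GLS estimate (X' S^-1 X)^-1 X' S^-1 y and its covariance matrix."""
    Si_X = np.linalg.solve(Sigma, X)
    Si_y = np.linalg.solve(Sigma, y)
    cov_beta = np.linalg.inv(X.T @ Si_X)
    beta = cov_beta @ (X.T @ Si_y)
    return beta, cov_beta


# Synthetic, noise-free data on hypothetical centered coordinates,
# so the fit should reproduce beta_true up to rounding error.
rng = np.random.default_rng(0)
lat = rng.uniform(-1.0, 1.0, 40)
lon = rng.uniform(-1.0, 1.0, 40)
X = quad_design(lat, lon)
beta_true = np.array([12.0, 1.0, -0.5, 2.0, 0.3, -1.0])
y = X @ beta_true
Sigma = exp_nugget_cov(lat, lon, nugget=0.5, sill=2.0, range_=0.7)
beta_hat, cov_beta = gls_fit(X, y, Sigma)
```

With real monitoring data the coefficients would differ from an ordinary least squares fit, because nearby sites are down-weighted as partly redundant.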

Model for one site

$$Y_k = \mu + \rho\,(Y_{k-1} - \mu) + e_k, \qquad k = 1, \dots, 122$$

Where $E(e_k) = 0$ and $\mathrm{Var}(e_k) = \sigma^2$

Note: this is an AR1 model. The errors are iid $(0, \sigma^2)$ because the temporal correlation is accounted for using the $\rho\,(Y_{k-1} - \mu)$ term.
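For a stationary AR1 series, the standard error of the sample mean has a closed form, sketched below. This is the textbook formula and treats the daily values as exact observations, so it ignores the kriging error in each day's prediction; it is not a reproduction of the proc arima output. The function name and the parameter values are illustrative assumptions.

```python
import math


def ar1_se_of_mean(rho, sigma2, n):
    """Standard error of the mean of n values from a stationary AR1 process.

    rho    : lag-1 autocorrelation, must satisfy -1 < rho < 1
    sigma2 : variance of the iid innovations e_k
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    gamma0 = sigma2 / (1.0 - rho**2)  # stationary variance of Y_k
    corr_sum = sum((1.0 - h / n) * rho**h for h in range(1, n))
    return math.sqrt(gamma0 / n * (1.0 + 2.0 * corr_sum))


# Hypothetical parameter values for 122 sampled days
se_ar1 = ar1_se_of_mean(rho=0.4, sigma2=9.0, n=122)
se_iid = ar1_se_of_mean(rho=0.0, sigma2=9.0, n=122)
```

With positive autocorrelation the standard error is larger than the iid value, since correlated days carry less independent information.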

What if we “propagate” errors?

At a given lattice point we have 122 days’ worth of predictions, each with a kriging prediction error. What if we treat the 122 days as independent observations (they aren’t, they are AR1) and combine the errors accordingly? We do this for each of our 2400 lattice points.
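Treating the days as independent, the variance of the annual mean is the sum of the daily kriging variances divided by n squared. A minimal sketch of that calculation for one lattice point follows; the function name and the daily error values are hypothetical.

```python
import math


def naive_se_of_mean(daily_se):
    """Standard error of the mean of n daily predictions, assuming the
    daily kriging prediction errors are independent across days."""
    n = len(daily_se)
    if n == 0:
        raise ValueError("need at least one daily standard error")
    return math.sqrt(sum(s * s for s in daily_se)) / n


# Hypothetical: 122 days, each with a kriging standard error of 2.0
se_year = naive_se_of_mean([2.0] * 122)
```

Because the days are in fact autocorrelated, this calculation should be read as an approximation rather than a correct standard error.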

The Big Problem

None of our standard error estimates are correct!

We need to learn how to put spatial error components together with temporal error components.

Model for all sites and days?

$$Y_{ijk} = \beta_{0,k} + \beta_{1,k}\, i + \beta_{2,k}\, i^2 + \beta_{3,k}\, j + \beta_{4,k}\, j^2 + \beta_{5,k}\, ij + \varepsilon_{ijk} + e_{ijk}$$

Where $E(\varepsilon_{ijk}) = 0$ and $E(e_{ijk}) = 0$

We’ve assumed isotropy and stationarity for simplicity.

But how do we model $\mathrm{Cov}(\varepsilon_{ijk}, \varepsilon_{i'j'k'})$, $\mathrm{Cov}(e_{ijk}, e_{i'j'k'})$, and $\mathrm{Cov}(\varepsilon_{ijk}, e_{i'j'k'})$?

Separability

We’ve been treating the covariance structure as separable--meaning that the 1-D temporal and 2-D spatial covariance structures can be estimated separately and then can be mathematically combined to obtain a 3-D space-time covariance structure. We need to test for separability, and if the covariance components are separable, we need to appropriately combine them. We are just now learning how to do this.
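One common way to combine separable components is a Kronecker product of the temporal correlation matrix with the spatial covariance matrix. The sketch below shows that construction under assumed forms (exponential in space, AR1 in time); it is not the combination method the project settled on, and all names and parameter values are hypothetical.

```python
import numpy as np


def exp_cov(dist, sill, range_):
    """Exponential spatial covariance for a matrix of distances."""
    return sill * np.exp(-dist / range_)


def ar1_corr(n_days, rho):
    """AR1 correlation matrix with entries rho ** |k - k'|."""
    days = np.arange(n_days)
    lags = np.abs(np.subtract.outer(days, days))
    return rho ** lags


def separable_cov(space_cov, time_corr):
    """Space-time covariance for observations ordered day by day
    (all sites for day 0, then all sites for day 1, and so on)."""
    return np.kron(time_corr, space_cov)


# Hypothetical small example: 3 sites on a line, 4 days
x = np.array([0.0, 1.0, 3.0])
dist = np.abs(np.subtract.outer(x, x))
C_space = exp_cov(dist, sill=2.0, range_=1.5)
R_time = ar1_corr(4, rho=0.4)
C_full = separable_cov(C_space, R_time)  # shape (12, 12)
```

Under separability, the covariance between site s on day k and site s' on day k' is simply `R_time[k, k'] * C_space[s, s']`, which is the assumption that would need to be tested.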

Next steps….

Investigate the separability of the covariance structure and the correct method for combining space and time covariance components.

Attempt a 3-dimensional kriging. No assumption of separability is required to do this. We must, however, write our own code for this project because there is no software package (to our knowledge) that performs such an analysis. This method would allow us to use even more data than we are using now, as we would not be restricted to every third day.

Other next steps….

Try two methods Stefanski recommended.

One method avoids the issue of separability by treating the kriging prediction errors as measurement errors on the timeseries “observations.”

The other method…