Use of imputed tree lists for
FVS landscape projections:
An overview of some
issues and opportunities.
Eric L. Smith
Forest Health Technology Enterprise Team
U.S. Forest Service
Fort Collins, CO
Problem: We would like to run FVS simulations
for large landscapes, but we only have plot data
for some of the stands.
One solution: For each uninventoried
stand, use imputation techniques to
find plot data taken from a similar site
and use that data as if it were taken
from the uninventoried stand.
Imputation
“Imputation” is a generic term for methods
which can be used to estimate missing
data. There are many ways to do this. For
example, in FVS, you can provide a tree
height but, if you don’t, FVS can impute it,
estimating it from a model of height as a
function of DBH.
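As a minimal sketch of this kind of fill-in, the snippet below estimates missing heights from DBH. The functional form and the coefficients `B0` and `B1` are illustrative assumptions, not FVS's calibrated values for any species or variant.

```python
import math

# Hypothetical coefficients for illustration only; not taken from FVS.
B0, B1 = 4.8, -9.0

def impute_height(dbh_in):
    """Estimate total height (ft) from DBH (in) with an assumed curve."""
    return 4.5 + math.exp(B0 + B1 / (dbh_in + 1.0))

def fill_heights(trees):
    """Return copies of the tree records with missing heights filled in."""
    filled = []
    for tree in trees:
        rec = dict(tree)
        if rec.get("height") is None:
            # No measured height: fall back on the height-DBH curve.
            rec["height"] = impute_height(rec["dbh"])
            rec["height_imputed"] = True
        else:
            rec["height_imputed"] = False
        filled.append(rec)
    return filled

trees = [{"dbh": 10.0, "height": 62.0}, {"dbh": 14.0, "height": None}]
filled = fill_heights(trees)
```

Measured heights pass through unchanged; only the missing ones are replaced, and the `height_imputed` flag records which is which.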
Nearest Neighbor Imputation
“Nearest Neighbor” (NN) imputation is a
statistical technique that fills in many
missing values at once by substituting those
from another sample plot. The donor plot is
chosen because it resembles the plot with the
missing data, based on the information you do
have about that plot.
The information we do have (or can get)
includes mapped data such as GIS coverages
and satellite data.
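A minimal sketch of the idea, assuming a plain Euclidean distance on standardized mapped variables. MSN and GNN derive their distance weights from multivariate analyses, so this is a stand-in for illustration, not either method. The plot IDs and variable values are hypothetical.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Scale each column to zero mean and unit SD; return rows and the scaler."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns

    def scale(row):
        return [(v - m) / s for v, m, s in zip(row, mus, sds)]

    return [scale(r) for r in rows], scale

def nearest_reference(target_x, ref_ids, ref_x):
    """Return the ID of, and distance to, the most similar reference plot."""
    scaled_refs, scale = standardize(ref_x)
    t = scale(target_x)
    dists = [math.dist(t, r) for r in scaled_refs]
    best = min(range(len(ref_ids)), key=dists.__getitem__)
    return ref_ids[best], dists[best]

# Hypothetical mapped variables: elevation (m), slope (%), a satellite index.
ref_ids = ["plot_A", "plot_B", "plot_C"]
ref_x = [[1500, 10, 0.62], [2400, 35, 0.41], [1800, 20, 0.55]]

# An uninventoried stand with the same mapped variables but no tree list.
donor, distance = nearest_reference([2300, 30, 0.43], ref_ids, ref_x)
```

The whole tree list of `donor` would then be used as if it had been measured in the uninventoried stand.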
Why NN Imputation?
• In general, use of an entire plot sample
ensures the group of data elements
represents a realistic combination of
conditions
• For use in FVS, we need the whole tree list
and sometimes additional plot data
Process example
Gradient Nearest Neighbor
from Ohmann and Gregory, 2002
Example mapped data
Landsat: Bands, transformations, texture
Climate: Means, seasonal variability
Topography: Elevation, slope, aspect, solar
Soil: Texture, drainage, mineral type
Disturbance: Past fires, harvest, I&D
Location: Lat., Long.
Ownership: Federal, state, forest industry, other private
Adapted from Ohmann and Gregory
Mapped data information
• Physiographic variables relate to “potential
vegetation” or successional pathway
• Satellite data relate to current tree sizes and
density (pathway state)
• If management (or fire) has created variation
in understory conditions which is hidden from
the satellite by the overstory, this could be a
problem.
The status of NN for FVS
• The NN technique most associated with FVS, Most
Similar Neighbor (MSN), has been around for over
10 years; additional techniques are being added to
the software by Crookston and others.
• There is increased recognition of the need for
landscape simulations for fire and other
applications.
• FIA annual data are increasingly available for all forested
lands, while recent stand exam data are declining.
• Adequate computer storage, processing power,
software, and GIS-based mapped data are now
widely available to perform large imputation
projects.
Some current Major NN Efforts
• Crookston et al, RMRS, Moscow
– MSN support, new yaImpute package
• Ohmann et al, PNWRS, Corvallis
– Gradient NN (GNN), mapping in CA, OR, WA
• McRoberts & Finley, NRS, St. Paul
– Faster processing (ANN), variance estimation
• Twombly, NRIS, have Informs, will travel
– MSN inside INFORMS, creates Nat’l Forest maps
• LeMay et al, UBC, Vancouver
– Various applications in Canada
Large NN imputations are here
PNW, Ohmann
Mn, McRoberts
Pa, Lister
NFs, Twombly
Scale: Compartments to States
The application of NN imputation to fill in a
(small?) number of uninventoried stands
in a small landscape takes place in a very
different information context than the NN
allocation of large scale inventory plots to
a large area (sub-states to multi-states).
Small area application
• Can know conditions and history
• Can gather more ground information
• Can relate imputation results to the
on-the-ground reality
• Inventory is often linked to its purpose and
reasonably intensive
• Homogeneous areas (stands) can be
predefined and used as the sampled unit
• Data and relationships between data are
likely to be consistent
Large area application
• Too large to know directly
• Sampling intensity is generally low
• Homogeneous areas are not predefined, but can be
delineated (using image analysis and GIS tools)
• Data and relationships between data are often
inconsistent across the area
• Can gather more information, but only through existing
sources of remote sensing and other mapped data
• Inventories may not be linked to the desired
applications of the users
• However, inventory design may provide
statistically reliable population estimates
Scale shifts focus to map data
Fine-scale details become less reliable as sample intensity
decreases and the geographic range of the imputation
increases.
But, from the standpoint of the inventory estimates,
imputation allows:
(1) more precise estimation of inventory data for
small areas;
(2) estimation of additional types of summary
variables for post-stratified conditions;
(3) FVS projection of inventory subpopulations
using associated tree lists by area, adjusted for a
range of site conditions.
Error and Variance
• Need goodness of fit measures to evaluate
the relative quality of procedures
• An understanding of the error sources that
contribute to variance is needed to know
whether and how they can be reduced
• Variance estimates for NN results are
complex and difficult, and under active
investigation
• There are different approaches used by
different disciplines
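One commonly used goodness-of-fit measure is a leave-one-out root mean squared difference (RMSD): each reference plot is treated as if it were unmeasured, imputed from its nearest remaining neighbor, and compared with its observed value. The sketch below assumes plain Euclidean distance on already-scaled variables, and the data are hypothetical. It says nothing about the variance of the resulting estimates, which is the harder problem noted above.

```python
import math

def loo_rmsd(x, y):
    """Leave-one-out RMSD: impute each plot's y from its nearest other plot."""
    sq_err = []
    for i, xi in enumerate(x):
        others = [j for j in range(len(x)) if j != i]
        nn = min(others, key=lambda j: math.dist(xi, x[j]))
        sq_err.append((y[i] - y[nn]) ** 2)
    return math.sqrt(sum(sq_err) / len(sq_err))

# Hypothetical data: x = scaled mapped variables, y = basal area (ft2/acre).
x = [[0.0, 0.1], [0.1, 0.0], [1.0, 1.1], [1.1, 1.0]]
y = [80.0, 90.0, 150.0, 140.0]
rmsd = loo_rmsd(x, y)
```

Comparing this value across candidate distance measures or variable sets gives a relative ranking of procedures on the same reference data.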
FIA Plot Design
Trees 5 inches and over are measured on 4 subplots, each 1/24th acre
Trees 1 to 5 inches are measured on 4 microplots, each 1/300th acre
Eventually, there should be at least one plot per 6,000 forested acres, nationwide
Spatial scale: FIA vs. Landsat
• Landsat pixels are 30 × 30 meters (900 m²)
• Each FIA subplot (>5 in.) is 167 m² (19% of the pixel)
• Each FIA microplot (1 to 5 in.) is 13.7 m² (1.5% of the pixel)
• This difference in scale may
result in an underestimate of the
accuracy of the imputation if
the sample estimates are
assumed to be “true”
• In addition, there is positional
error and other sampling and
measurement error associated
with FIA plot data, Landsat
data, and other map data
Figure: a 30 m × 30 m pixel (image from McRoberts, 2006)
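The plot-to-pixel comparison can be checked with a few lines of arithmetic. This sketch works from the nominal plot sizes in acres, so the areas come out close to, but not exactly equal to, the rounded figures quoted on the slide.

```python
SQ_M_PER_ACRE = 4046.8564224  # square meters in one international acre

pixel_m2 = 30 * 30                     # one Landsat pixel
subplot_m2 = SQ_M_PER_ACRE / 24        # one FIA subplot (1/24 acre)
microplot_m2 = SQ_M_PER_ACRE / 300     # one FIA microplot (1/300 acre)

# Share of a single pixel's area covered by one subplot or microplot.
subplot_share = subplot_m2 / pixel_m2
microplot_share = microplot_m2 / pixel_m2

print(f"subplot:   {subplot_m2:.1f} m2, {subplot_share:.1%} of a pixel")
print(f"microplot: {microplot_m2:.1f} m2, {microplot_share:.1%} of a pixel")
```

Even a full subplot samples under a fifth of the pixel it falls in, which is why a plot and its pixel can disagree without either being wrong.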
k Nearest Neighbor
• The k Nearest Neighbor (kNN) technique selects
more than one reference plot; their values are usually
averaged to estimate target conditions (using the
3 closest neighbors would be “k=3”).
• In FVS, the kNN approach could treat the multiple
near neighbors as imputed sub-plots.
• This may be desirable in the case of a scale
mismatch between the intensive plot and the map
data. It also creates more variation across the
landscape, perhaps better representing transitions
between conditions.
• A kNN option is included in yaImpute
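A minimal sketch of treating the k nearest neighbors as imputed sub-plots, assuming Euclidean distance on already-scaled variables and equal weight for each neighbor. Each record's trees-per-acre value is divided by k so the pooled list still represents one acre. The species codes, values, and record layout are hypothetical and are not an FVS input format.

```python
import math

def k_nearest(target_x, ref_x, k):
    """Indices of the k reference plots closest to the target."""
    order = sorted(range(len(ref_x)),
                   key=lambda j: math.dist(target_x, ref_x[j]))
    return order[:k]

def pooled_tree_list(neighbors, tree_lists):
    """Combine neighbor tree lists, treating each as one of k equal sub-plots."""
    k = len(neighbors)
    pooled = []
    for sub, j in enumerate(neighbors, start=1):
        for species, dbh, tpa in tree_lists[j]:
            pooled.append({"subplot": sub, "species": species,
                           "dbh": dbh, "tpa": tpa / k})
    return pooled

# Hypothetical reference plots: scaled mapped variables and tree lists
# of (species, DBH in inches, trees per acre).
ref_x = [[0.0, 0.0], [0.2, 0.1], [0.3, 0.3], [2.0, 2.0]]
tree_lists = [
    [("PP", 12.0, 30.0), ("DF", 8.0, 60.0)],
    [("PP", 14.0, 24.0)],
    [("DF", 6.0, 90.0), ("DF", 10.0, 36.0)],
    [("LP", 5.0, 300.0)],
]

neighbors = k_nearest([0.1, 0.1], ref_x, k=3)
pooled = pooled_tree_list(neighbors, tree_lists)
```

With k=1 this reduces to ordinary single-neighbor imputation; larger k blends conditions from several plots into one stand.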
Questions:
• Would k > 1 be a good tradeoff between
real mixes of plot conditions and the
sample uncertainty of plots smaller than
pixels?
• Could additional pixel-sized information be
gathered at sample point locations (e.g.
photo-interpreted crown cover or cover
type) and included in the multivariate data
analysis?
How much does it matter?
• How good the imputation needs to be depends on
the context of the simulation: how the results
will be used and how sensitive the models are
to inaccurate inputs.
• Model applications have a range of
sensitivity
• Analysis projects have a range of sensitivity
• Sensitivity tests can be performed
Envision project using imputed data
This imputation application has a low sensitivity to error
Crystal Lakes Fuel Trt Project
Arapaho-Roosevelt NF
as seen from road intersection
A fire-beetle project using MSN
This FFE WWPB application has catastrophic and contagion behaviors,
and may be sensitive to imputation errors
Five Buttes Analysis Area Deschutes Nat’l Forest
Imputation Sensitivity Analysis
In this analysis, two landscapes were imputed, one with high and one with low pine beetle risk,
based on risk-rating the stands that fell in each of many stand classifications. These maps
show the no-action, “after beetle outbreak” basal area (BA) for each.
Red River pine beetle analysis, Nez Perce National Forest
2011 HIGH
2011 LOW
Sensitivity: High minus Low
The difference between the two extremes shows how much the results might
change if better data were available, and where the uncertainty is
manifested on the landscape. This is the “no action” alternative, so a
comparison can also be made of how sensitive the action vs. no-action
difference is to these 2 extreme landscapes.
1986 H-L
2011 H-L
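The high-minus-low comparison can be sketched as a per-stand difference between the two projections. The stand IDs and basal area values below are hypothetical and are not results from the Red River analysis.

```python
def sensitivity(high, low):
    """Per-stand difference between the high-risk and low-risk projections."""
    shared = high.keys() & low.keys()
    return {stand: high[stand] - low[stand] for stand in shared}

# Hypothetical projected basal area (ft2/acre) after the outbreak, by stand.
high_risk = {"s1": 60.0, "s2": 110.0, "s3": 95.0}
low_risk = {"s1": 120.0, "s2": 115.0, "s3": 100.0}

diff = sensitivity(high_risk, low_risk)

# The stand where the two imputed extremes disagree the most.
most_uncertain = max(diff, key=lambda s: abs(diff[s]))
```

Mapping `diff` back onto the stand polygons shows where on the landscape better data would change the answer most.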
An additional challenge
What is “most similar” depends on what aspects are
considered in the analysis.
If these products are used in decision making, we
face the challenge to produce understandable,
useful products which can be integrated with other
corporate resource data systems and analyses.
(It's not so good to have several different estimates
of where something important might be. It drives
the boss crazy, but the appellants’ lawyers love it)
Acknowledgements
• Nick Crookston
• Andrew McMahan
• Ron McRoberts
• Ken Pierce
• Al Stage
• and to all of you from out of town, who are here on
Valentine’s Day, away from those you hold dear