GIS and Species Prediction Models

Making maps, many maps!
[What is GIS?]
Why do I want to know where they occur?
Dr. Brian Klinkenberg
Department of Geography, UBC
For Zoology 502
March 9, 2008
ToC
• Why predict species ranges?
• What is GIS?
• Example: West Nile virus (based on
species biology)
• Example: Cryptococcus gattii (GARP:
correlative model)
Why predict species distributions?
• We need maps showing species distributions because land
use activities and disease prevention actions are often spatially
explicit (e.g., SARA implications for the spotted owl, tall
bugbane and Pacific water shrew) and occur at a range of scales.
• We need multi-scale ‘scientific’ approaches because the
impacts of land use and global change are multi-scaled.
• We can’t sample everywhere so we can never really know the
‘truth.’
• Many different approaches, and each approach has its
strengths and weaknesses.
A good reference: Scott, M., P.J. Heglund, M.L. Morrison, M.G. Raphael, W.A. Wall
& F.B. Samson (eds) 2002. Predicting Species Occurrences: Issues of accuracy and scale.
Island Press, Covelo, CA. 847pp.
Distribution looks different at different scales
Similar methods
Different data
Different utility
Distributions will change
[Map panels: 2011-2040, 2041-2070, 2071-2100]
http://www.glfc.cfs.nrcan.gc.ca/landscape/index_e.html
What are we modeling?
• Range
– Total extent occupied by a given taxon; “limits within which a
species can be found” (Morrison and Hall 2002).
– Considers only geographic space.
• Distribution (fundamental niche)
– Spatial pattern of environments suitable for occupation by a given
taxon; “spread or scatter of a species within its range” (Morrison and
Hall 2002).
– Considers geographic space and environmental components.
• Habitat (realized niche)
– Combination of resources and conditions that promote occupancy,
survival, and reproduction by individuals of a given taxon (Morrison
et al. 1992).
– Considers geographic space, environmental components and species
responses.
Range
Distribution
Habitat
Observations
Distributions: Level of detail
Range
Observations
Distribution of Bidens amplissima
Source: E-Flora BC
Source: Conservation of Grizzly Bears in British
Columbia. Min. of Environment, Lands and Parks, 1995
Level of detail continuum
Dot Map
a) “Definitive” presences
b) Usually under-predicts occupied area
c) Usually over-predicts unoccupied area
d) Accuracy heavily dependent on sampling effort
e) Can be difficult to validate or test
Range Map
a) “Definitive” absences
b) Usually over-predicts occupied area
c) Usually under-predicts unoccupied area
d) Often subjective and difficult to replicate
e) Can be difficult to validate or test
Tools and results along the continuum
• Range
– Largely deductive; using expert opinion based on coarse datasets.
– Easy to generate. Limited local utility.
• Distribution
– Deductive or inductive; using statistical algorithms, GIS
modeling on refined datasets.
– More difficult to generate than range. Data intensive. Regionally
useful.
• Habitat
– Deductive or inductive; using local knowledge based on specific
research data.
– Based on research and/or local knowledge. Time-consuming and
difficult to extrapolate. Locally useful.
• Observations
– Actual data from field sampling.
– Raw data. Expensive. Limited utility without supplementary info.
Two Broad Approaches
1. Deductive: conclusions are developed from a
combination of premises
– spatial expressions of qualitative data
– overlays of predictor variables
– E.g., GIS-based multi-criteria evaluations
[Figure: example raster grid of cell values]
2. Inductive: conclusions are developed as an
extrapolation from available data
– quantitative and often statistical
– what most folks consider “modeling”
[Figure: example raster grid of cell values]
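The deductive "overlay of predictor variables" idea can be sketched as a weighted overlay of two small grids. This is only an illustration: the layer names, the 1-5 scores, the equal weights and the cut-off of 3.0 are all hypothetical and do not come from the lecture.

```python
# Hypothetical 1-5 suitability scores for two predictor layers on the same grid.
vegetation = [
    [1, 3, 2, 2],
    [2, 1, 1, 1],
    [2, 3, 1, 1],
    [2, 3, 3, 2],
]
wetness = [
    [3, 4, 5, 5],
    [4, 2, 5, 5],
    [4, 4, 5, 4],
    [3, 4, 3, 5],
]


def weighted_overlay(layers, weights):
    """Cell-by-cell weighted sum of equally sized grids."""
    rows, cols = len(layers[0]), len(layers[0][0])
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights))
         for c in range(cols)]
        for r in range(rows)
    ]


# Equal weights and the 3.0 cut-off are assumptions made for this example.
suitability = weighted_overlay([vegetation, wetness], [0.5, 0.5])
suitable = [[score >= 3.0 for score in row] for row in suitability]
```

In a real multi-criteria evaluation the weights would encode expert judgement about each predictor, which is what makes the approach deductive rather than fitted to data.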
Model input: Occurrence data
• Quality and Quantity
– opportunistic vs. systematic
– limited vs. abundant
– presence-only vs. presence/absence
• Correcting and Filtering
– spelling, duplicates, misidentification
– location, precision, spatial autocorrelation
– seasonal, sinks, historical cut-off
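A few of the filtering steps listed above (duplicates, location precision, historical cut-off) can be sketched as a single pass over occurrence records. The record fields, coordinates and thresholds below are hypothetical and chosen only to show the idea.

```python
def clean_occurrences(records, earliest_year, max_uncertainty_m):
    """Drop duplicate, too-old and too-imprecise occurrence records."""
    seen = set()
    kept = []
    for rec in records:
        # Normalise the name so trivial spelling variants count as duplicates.
        key = (
            rec["species"].strip().lower(),
            round(rec["lat"], 4),
            round(rec["lon"], 4),
            rec["year"],
        )
        if key in seen:
            continue  # duplicate record
        if rec["year"] < earliest_year:
            continue  # before the historical cut-off
        if rec["uncertainty_m"] > max_uncertainty_m:
            continue  # location too imprecise
        seen.add(key)
        kept.append(rec)
    return kept


# Hypothetical records; coordinates and years are made up for illustration.
records = [
    {"species": "Bidens amplissima", "lat": 49.2, "lon": -123.1, "year": 2005, "uncertainty_m": 50},
    {"species": " bidens amplissima", "lat": 49.2, "lon": -123.1, "year": 2005, "uncertainty_m": 50},
    {"species": "Bidens amplissima", "lat": 49.3, "lon": -123.0, "year": 1950, "uncertainty_m": 50},
    {"species": "Bidens amplissima", "lat": 49.4, "lon": -122.9, "year": 2004, "uncertainty_m": 5000},
    {"species": "Bidens amplissima", "lat": 49.5, "lon": -122.8, "year": 2003, "uncertainty_m": 100},
]
cleaned = clean_occurrences(records, earliest_year=1980, max_uncertainty_m=1000)
```

Misidentification, spatial autocorrelation and sink populations need expert review or spatial statistics and are not covered by this sketch.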
Model input: Environmental data
• Influence element distribution
• Fewer variables better than more
• Complete coverage of study area
• Climatic influence on distribution
An Ecosystem
[Diagram: an ecosystem linking vegetation, climate, animals, terrain,
micro-organisms and soil, with groups labelled "The Biotic Component"
and "Physical Parameters"]
Distribution model approaches
• A variety of approaches:
– similarity metrics (e.g., DOMAIN)
– envelope models (e.g., BIOCLIM, ANUCLIM)
– Maximum Entropy (e.g., MaxEnt)
– rule-based (e.g., GARP)
– splines (e.g., MARS)
– classification trees (e.g., CART)
– ordination (e.g., CCA, DA, Biomapper)
– classical statistics (e.g., GLM, GAM, logistic regression)
– neural networks
– others …
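To make one of these approaches concrete, here is a minimal sketch of a rectilinear environmental envelope, in the spirit of envelope models such as BIOCLIM but much simpler (real implementations typically use percentile limits rather than the raw minimum and maximum). The variable names and values are hypothetical.

```python
def fit_envelope(presences):
    """Return the (min, max) of each variable across presence records."""
    variables = presences[0].keys()
    return {
        v: (min(p[v] for p in presences), max(p[v] for p in presences))
        for v in variables
    }


def in_envelope(site, envelope):
    """A site is predicted suitable only if every variable is within range."""
    return all(lo <= site[v] <= hi for v, (lo, hi) in envelope.items())


# Hypothetical presence records with two environmental variables.
presences = [
    {"jan_temp_c": 2.0, "elevation_m": 40},
    {"jan_temp_c": 3.5, "elevation_m": 120},
    {"jan_temp_c": 1.5, "elevation_m": 300},
]
envelope = fit_envelope(presences)

mild_lowland = in_envelope({"jan_temp_c": 2.5, "elevation_m": 200}, envelope)
cold_site = in_envelope({"jan_temp_c": -4.0, "elevation_m": 200}, envelope)
```

Note that this uses presence-only data, which is one reason envelope models are popular when absences are not recorded.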
Example software: DOMAIN, WhyWhere, BIOMOD
Comparison of distribution models (Elith and Burgman 2003)
[Figure: actual occurrences compared with the predictions of seven models:
1. HSI, 2. Convex Hull, 3. ANUCLIM, 4. DOMAIN, 5. GLM, 6. GAM, 7. GARP.
Accompanying plots show the Mann-Whitney statistic by model and Kappa by
model (& threshold).]
Model evaluation
• Commission vs. Omission Errors
– insufficient sample size
– measurement error
– insufficient spatial resolution
– critical environmental variables excluded
• Validation Methods
– expert review
– classifying independent occurrence data
– post-modeling field surveys
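Commission and omission errors, and the Kappa statistic used in model comparisons, can all be computed from a presence/absence confusion matrix. The sketch below follows one common convention (omission as the share of observed presences that were missed, commission as the share of observed absences predicted present); the test data are hypothetical, and the function assumes both presences and absences were observed.

```python
def evaluate(predicted, observed):
    """Omission rate, commission rate and Cohen's kappa for boolean lists."""
    pairs = list(zip(predicted, observed))
    tp = sum(p and o for p, o in pairs)          # predicted present, observed present
    fp = sum(p and not o for p, o in pairs)      # commission errors
    fn = sum(not p and o for p, o in pairs)      # omission errors
    tn = sum(not p and not o for p, o in pairs)  # predicted absent, observed absent
    n = tp + fp + fn + tn

    observed_agreement = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    chance_agreement = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "omission": fn / (tp + fn),
        "commission": fp / (fp + tn),
        "kappa": (observed_agreement - chance_agreement) / (1 - chance_agreement),
    }


# Hypothetical independent test sites: model prediction vs. field observation.
predicted = [True, True, True, False, False, False, True, False]
observed = [True, True, False, False, False, True, True, False]
scores = evaluate(predicted, observed)
```

For these eight sites the model makes one commission and one omission error, giving rates of 0.25 each and a kappa of 0.5.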
Model Selection
• Depends on many factors…
– data quality and quantity
– study area size and history
– element biology
– intended use of predicted distribution
• Use multiple models
– overlapping predicted distributions
– determine best model
GIS?
Geographic
Information
System
Why geography matters
• The examination of spatial patterns
invites questions, raises concerns.
– Theory of evolution (Darwin’s finches)
– John Snow’s cholera mapping
• He mapped deaths from cholera in 1854
• Map led him to question the quality of the
water from the Broad Street pump
• Removing the pump handle stopped the
epidemic (over 500 people died)
John Snow’s map
Why Geography matters
• Almost everything happens somewhere
• Nothing is ‘atomic’; we must consider the
whole (context is everything) (ecological fallacy, MAUP).
• Knowing where some things happen is
critically important
– Position of country boundaries
– Location of hospitals
– Routing delivery vehicles
– Management of forest stands
– Locations of dead corvids
– Streams suitable for Pacific Water Shrew
If geography matters, GIS can
be used to study the
problem.
Definition of GIS
A system of hardware, software, data and people
for collecting, storing, analyzing and disseminating
information about areas of the earth
GIS integrates data.
GIS integrates technologies
GIS enables model development
Modeling West Nile virus
• West Nile virus (WNv) has recently emerged as a health threat to
the North American population. After the initial disease outbreak
in New York City in 1999, WNv has spread widely and quickly
across North America to every contiguous American state and
Canadian province, with the exceptions of British Columbia (BC),
Prince Edward Island and Newfoundland.
• In our study we developed models of mosquito population
dynamics for Culex tarsalis and C. pipiens, and created a spatial
risk assessment of WNv prior to its arrival in BC by creating a
raster-based mosquito abundance model using basic geographic
and temperature data. Among the parameters included in the
model are spatial factors determined from the locations of BC
Centre for Disease Control mosquito traps (e.g., distance of the
trap from the closest wetland or lake), while other parameters
related to the biology of the mosquitoes were obtained from the
literature.
West Nile virus presence in North America
First appearance of positive birds
Primary route of transmission
Integrated approach using GIS
• Mosquito biology
– Temperature
– Precipitation
– Vegetation
– Water bodies
• Mosquito habitat
• Bird migration
• Health regions
• Population at risk
• Land use
• Sensitive habitat
• Disease surveillance
– Monitor corvid populations
– Mosquito trap data
Developing the model
• Mosquitoes have four stages: egg,
larva, pupa and adult. Generally,
mosquitoes develop more rapidly
at higher temperatures. Previous
studies concluded that progression
to the next stage is determined by
degree-days (i.e., the product of the
temperature excess above a base
temperature (in degrees) and the
duration of that excess (in days)).
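The degree-day idea can be sketched with daily mean temperatures: each day contributes its excess above the base temperature, and a stage is complete once the running total reaches a requirement. The base temperature of 10°C and the requirement of 15 degree-days below are hypothetical placeholders, not the values used in the study.

```python
def degree_days(daily_mean_temps_c, base_temp_c):
    """Sum of daily temperature excess above the base temperature."""
    return sum(max(0.0, t - base_temp_c) for t in daily_mean_temps_c)


def days_to_complete_stage(daily_mean_temps_c, base_temp_c, required_degree_days):
    """First day (1-based) on which accumulated degree-days meet the requirement."""
    accumulated = 0.0
    for day, t in enumerate(daily_mean_temps_c, start=1):
        accumulated += max(0.0, t - base_temp_c)
        if accumulated >= required_degree_days:
            return day
    return None  # requirement not reached within the series


# Hypothetical five-day temperature series, base temperature and requirement.
temps = [12, 15, 18, 20, 22]
total = degree_days(temps, base_temp_c=10)
stage_day = days_to_complete_stage(temps, base_temp_c=10, required_degree_days=15)
```

Here the series accumulates 2 + 5 + 8 + 10 + 12 = 37 degree-days, and a 15 degree-day stage would complete on day 3. Days at or below the base temperature contribute nothing, which is why warmer periods shorten development time.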
Mosquito abundance model
Flowchart illustrating the mosquito abundance model developed in our study.
Risk assessment
Flowchart illustrating the WNv risk assessment methodology used in the study
Model validation
A comparison of the model outputs and the observed mosquito numbers.
Risk: Mosquito presence
Annual total of weighted daily mosquito numbers per grid cell (C. tarsalis only).
Weight: 1 for daily mean temperature (T) below 16°C, 2 for 16°C ≤ T<20°C,
3 for 20°C ≤ T<24°C, 4 for 24°C ≤ T<28°C, 5 for T ≥ 28°C
(Weight is determined for each day and for each grid cell)
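The temperature weights above translate directly into a lookup, sketched here for a single grid cell. Multiplying each day's mosquito number by that day's weight is an assumption about how the "weighted" total is formed; the daily counts and temperatures are hypothetical.

```python
def temperature_weight(daily_mean_temp_c):
    """Weight classes from the slide: 1 (<16°C) up to 5 (>=28°C)."""
    if daily_mean_temp_c < 16:
        return 1
    if daily_mean_temp_c < 20:
        return 2
    if daily_mean_temp_c < 24:
        return 3
    if daily_mean_temp_c < 28:
        return 4
    return 5


def weighted_total(daily_counts, daily_mean_temps_c):
    """Sum of daily mosquito numbers, each multiplied by that day's weight."""
    return sum(
        count * temperature_weight(temp)
        for count, temp in zip(daily_counts, daily_mean_temps_c)
    )


# Hypothetical three-day series for one grid cell.
cell_total = weighted_total([10, 20, 5], [15.0, 22.0, 28.0])
```

For this series the total is 10×1 + 20×3 + 5×5 = 95; a full run would repeat the calculation for every day of the year and every grid cell.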
Risk: Bird species abundances
Total abundance of high risk bird species in breeding season.
The map shows the average number of individual birds
considered to be high risk species by the BCCDC.
Risk: Mosquito-bird cycle
Total risk of forming a mosquito-bird cycle.
Risk by Health Regions
Use of GIS in developing adulticiding
scenarios
• First week of August: human cases have
been reported; mosquito infection rates
have been increasing; the short-term
weather forecast is for a continued hot and dry
spell; the MHO has given the order to spray
• BCCDC will work with the regional health
authority, local government, mosquito
control contractor and the provincial
emergency program to determine which
areas can and should be sprayed to
reduce the risk of human illness
Use of GIS
Emerging infectious diseases
• For some species (e.g., Cryptococcus
gattii) we have very little knowledge of their
ecological requirements (what favours them,
what is detrimental to them).
• For species such as this we cannot
develop distribution models based on
species biology (it is unknown), so we let
the software determine which
environmental layers are more significant
than others.
Cryptococcus gattii
• Microscopic (1-2 µm) sized yeast-like fungus
• Environmental reservoir is vegetation and soil
• Traditionally associated with Eucalyptus trees in tropics
and sub-tropics (e.g., Australia, California)
• May cause illness in humans and animals: cryptococcal
disease or cryptococcosis
• Hosts are immunocompetent
• Transmission by aerosolization and inhalation of spores
A cryptic story
• An increase in the number of animal and human
cryptococcosis cases was noted in 2001.
• Clinical symptoms: prolonged cough, sharp chest pain,
unexplained shortness of breath, severe headache,
fever, night sweats, weight loss; skin lesions (animals).
• Profiles of human cases did not fit the traditional
understanding of cryptococcosis.
• All cases resided on or had visited Vancouver Island
prior to the onset of illness.
Cryptococcus gattii identified
• Environmental sampling performed: Cryptococcus
gattii isolated from native vegetation, soil, air, water
Image sources:
BCCDC, 2004
David Ellis, 2005
UBC, 2006
Human Cryptococcosis in British Columbia 1999-2007*
[Bar chart: number of probable and confirmed cases per year, 1999-2007]
*2007 data up to Nov 21/07
Study objective
To delineate the geographic areas where Cryptococcus
gattii is currently established and forecast areas that
could support Cryptococcus gattii in the future for
targeted public health messaging of Cryptococcus gattii
risk and prioritization of environmental sampling
Landscape epidemiology
Explores the relationship between the ecology and
epidemiology of infectious diseases to identify
geographical areas where disease transmission occurs
→ Ecological Niche Modeling
Ecological niche modeling
Ecological niche: the total range of environmental conditions that
are suitable for a species’ existence and maintenance of populations
(Grinnell, 1917).
Hutchinson (1959) provided the valuable distinction between the
fundamental niche, which is the range of theoretical possibilities,
and the realized niche (that part which is actually occupied, given
interactions with other species such as competition). Although it can
be argued that only the realized niche is observable in nature, by
examining species across their entire geographic distributions,
species’ distributional possibilities can be observed against varied
community backgrounds, and thus a view of the fundamental
ecological niche can be assembled (Peterson et al. 1999).
[Figure: Fundamental Niche and Realized Niche]
http://www.specifysoftware.org/Informatics/bios/biostownpeterson/PK_USDA_2005.pdf
Genetic Algorithm for Rule-set Prediction
• GARP is a species distribution or ecological niche
modeling algorithm
• GARP is used to predict whether an area of study is
suitable habitat for the species in question
• GARP works in an iterative process of rule selection,
evaluation, testing, and incorporation or rejection
Methodology and Data
[Flowchart]
• Case data labels: Human, Animal, Environmental
• Environmental layers: Elevation, Aspect, Slope, Biogeoclimatic,
January Temp (x3), July Temp (x3), Precipitation (x3), Soil (x2)
• GARP: determine significant variables, then produce the resulting models
• GARP: 20 ecological niche model outputs
produced for each set of cases
• GIS:
Optimal = 11-20 model agreement
Potential = 1-10 model agreement
• Model accuracy is based on:
(# of correct predictions) /
(# of correct predictions + commission and omission errors)
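The model-agreement step can be sketched as counting, for each grid cell, how many of the 20 model outputs predict presence, then applying the Optimal and Potential classes. The tiny one-row grid is hypothetical, as is the "none" label for cells that no model predicts; the accuracy function simply restates the ratio given above.

```python
def agreement_counts(model_outputs):
    """Per-cell count of models predicting presence (1) across equal-sized grids."""
    rows, cols = len(model_outputs[0]), len(model_outputs[0][0])
    return [
        [sum(model[r][c] for model in model_outputs) for c in range(cols)]
        for r in range(rows)
    ]


def classify(count):
    """Classes from the slide; 'none' for zero agreement is an assumed label."""
    if count >= 11:
        return "optimal"
    if count >= 1:
        return "potential"
    return "none"


def accuracy(correct, commission_errors, omission_errors):
    """Correct predictions as a share of correct predictions plus both error types."""
    return correct / (correct + commission_errors + omission_errors)


# Hypothetical 1x3 grid: cell 0 predicted by all 20 models, cell 1 by 4, cell 2 by none.
models = [[[1, 1 if i < 4 else 0, 0]] for i in range(20)]
counts = agreement_counts(models)
classes = [[classify(n) for n in row] for row in counts]
example_accuracy = accuracy(correct=98, commission_errors=1, omission_errors=1)
```

Summing binary outputs in this way is a simple form of ensemble agreement: cells that most runs agree on are treated as the most reliable predictions.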
Ecological niche modeling of C. gattii in BC
Observations
• Suitable ecological niche for Cryptococcus gattii is
available on the BC mainland
• Cryptococcus gattii distribution in BC is associated with
areas having >1°C January average temperature and
<770 m elevation (mean = 100 m)
• Animal distribution of cryptococcosis corresponds
directly with human distribution
Observations
• Ecological niche modeling of Cryptococcus gattii
produced very accurate predictions (>98% accuracy)
• The ecological niche model based on environmental
sampling data produced the most conservative forecast
• Environmental sampling for Cryptococcus gattii in
geographic locations identified as “optimal” ecological
niche areas is currently underway
Conclusions
• Species distribution models can be used
in a wide variety of applications (rare and
invasive species management, infectious
disease monitoring and prevention).
• Using several different approaches is
considered the best option, since no one
method works best in all situations.