EarthChem: Solid Earth Geochemistry in Geoinformatics


Why Do We Need Data Management in Solid Earth Geochemistry?
• Geochemical data are essential for answering fundamental questions about the composition, structure, & evolution of the Earth, its oceans, continents, and climate
• Data are dispersed in the literature, often not in electronic form
• Compilations by investigators are time-consuming, redundant, and often incomplete
• Links among related data are missing
• Data are lost due to incomplete publication
Data Management in Solid Earth Geochemistry
• Offer the only generally accessible compilations of large volumes of data on the compositional variation of igneous rocks.
• Provide desktop access to the entire published geochemical literature within minutes,
  • allowing researchers to address questions that otherwise would be dropped due to the large effort required to find and compile the data.
  • allowing students to explore the global dataset within a formerly unimaginable timeframe that can be accommodated in the course schedule.
• Compile and serve ALL ‘raw’ geochemical data
• Share common relational data model (Lehnert et al. 2000)
• Data fully integrated
• Wide range of sample & analytical metadata
• Generally applicable for sample-based petrological and chemical data for rocks
• Each value linked to original publication or producer
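To make the idea of a relational model in which each value is linked to its publication more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows are hypothetical placeholders chosen for illustration; they are not the schema of Lehnert et al. (2000) or of any of the databases named in this talk.

```python
import sqlite3

# Hypothetical, simplified schema. Table and column names are illustrative
# assumptions, not the published data model.
SCHEMA = """
CREATE TABLE publication (
    pub_id   INTEGER PRIMARY KEY,
    citation TEXT NOT NULL
);
CREATE TABLE sample (
    sample_id TEXT PRIMARY KEY,
    rock_type TEXT,
    latitude  REAL,
    longitude REAL
);
CREATE TABLE chem_value (
    value_id  INTEGER PRIMARY KEY,
    sample_id TEXT NOT NULL REFERENCES sample(sample_id),
    pub_id    INTEGER NOT NULL REFERENCES publication(pub_id),
    analyte   TEXT NOT NULL,
    amount    REAL NOT NULL,
    unit      TEXT NOT NULL,
    technique TEXT
);
"""


def build_demo_db():
    """Create an in-memory database holding placeholder (made-up) rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO publication VALUES (1, 'Placeholder citation A')")
    conn.execute("INSERT INTO sample VALUES ('DEMO-001', 'basalt', 0.0, 0.0)")
    conn.executemany(
        "INSERT INTO chem_value "
        "(sample_id, pub_id, analyte, amount, unit, technique) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            ("DEMO-001", 1, "SiO2", 50.0, "wt%", "XRF"),
            ("DEMO-001", 1, "MgO", 8.0, "wt%", "XRF"),
        ],
    )
    conn.commit()
    return conn


def values_with_provenance(conn, sample_id):
    """Return every value for a sample together with the citation it came from."""
    query = """
        SELECT v.analyte, v.amount, v.unit, p.citation
        FROM chem_value AS v
        JOIN publication AS p ON p.pub_id = v.pub_id
        WHERE v.sample_id = ?
        ORDER BY v.analyte
    """
    return conn.execute(query, (sample_id,)).fetchall()
```

Because every row in `chem_value` carries both a sample key and a publication key, a single join is enough to return a value together with its source.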
Interactive, Dynamic Web Interfaces
• Select, filter, view, download customized data sets
• Explore metadata
Other Features (database-specific)
• Visualization tools (NAVDAT)
• Interoperability (PetDB)
• Interactive map interfaces (NAVDAT)
• Disparate data for individual samples linked via unique sample IDs
Data Quality Control
• Comprehensive analytical metadata allow proper data quality assessment
• Example: PetDB interface, where analytical metadata can be used as data quality filters
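As an illustration of how analytical metadata might serve as data quality filters, the following sketch selects analyses by technique, year, and whether a reference standard was reported. The record fields, filter options, and example rows are assumptions made for this sketch; they do not describe the actual PetDB interface.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    # Hypothetical record layout; field names are illustrative only.
    sample_id: str
    analyte: str
    value: float
    technique: str
    year: int
    standard_reported: bool


def quality_filter(analyses, techniques=None, min_year=None, require_standard=False):
    """Keep only the analyses whose metadata pass every requested criterion."""
    selected = []
    for a in analyses:
        if techniques is not None and a.technique not in techniques:
            continue
        if min_year is not None and a.year < min_year:
            continue
        if require_standard and not a.standard_reported:
            continue
        selected.append(a)
    return selected


# Placeholder rows (made-up values) to exercise the filter.
DEMO = [
    Analysis("DEMO-001", "Nd", 10.0, "ICP-MS", 1998, True),
    Analysis("DEMO-002", "Nd", 12.0, "XRF", 1975, False),
    Analysis("DEMO-003", "Nd", 11.0, "ICP-MS", 1989, False),
]

recent_icpms = quality_filter(DEMO, techniques={"ICP-MS"}, min_year=1990)
```

Each criterion is optional, so a casual user can skip the filters while an expert can combine several of them.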
Content of PetDB, NAVDAT, GEOROC
• > 4 million individual chemical values
• for > ca. 230,000 igneous rock samples
• from > 6,300 publications
Benefits of Rigorous Scientific Data Management
• Maximized Utility of the Geochemical Dataset
• Enhanced Data Quality Control
• Data Integration & Visualization across the Geosciences
• Impact on Science & Education
Maximize Utility of the Geochemical Dataset
“More than just a timesaver, these databases make it
possible to address both global and regional questions
that I would otherwise never bother to attempt.
The amount of time saved is such that countless ideas
cross from the realm of the totally impractical for a busy
working scientist into the realm of easy to squeeze into a
spare half hour.
Simply put, I can now test theoretical ideas against all
the world's data, and can readily compare any specific
region I am working on to its global counterparts. This is
a monumental benefit.”
Paul Asimov, California Institute of Technology
EarthChem User Survey January 2005
Scientific Return
>120 papers cite PetDB & GEOROC, for example:
• Plank, T.: Constraints from Thorium/Lanthanum on Sediment Recycling at Subduction Zones and the Evolution of the Continents. Journal of Petrology 46, 921-944, 2005.
• Ballentine, C.J. et al.: Neon isotopes constrain convection and volatile origin in the Earth's mantle. Nature 433, 33-38, 2005.
• Salters, V. & Stracke, A.: Composition of the depleted mantle. G3, 2004.
• Cipriani, A. et al.: Oceanic crust generated by elusive parents: Sr and Nd isotopes in basalt-peridotite pairs from the Mid-Atlantic Ridge. Geology 32 (8), 657-660, 2004.
• Herzberg, C.: Geodynamic Information in Peridotite Petrology. Journal of Petrology 45, 2507-2530, 2004.
• Hirschmann, M. et al.: Alkalic magmas generated by partial melting of garnet pyroxenite. Geology 31, 2003.
• Kellogg, J. B., Jacobsen, S. B., O’Connell, R. J.: Modeling the distribution of isotopic ratios in geochemical reservoirs. Earth Planet. Sci. Letters 217, 2004.
Application to Education
Challenges for Database Providers
• Optimize interaction with the data for a broad audience ranging from the casual to the expert user
• Efficiently populate databases with legacy and new data
• Integrate data with the larger Earth Science dataset
• Ensure longevity of data systems
The Problem of Distributed Datasets
A typical science question: What is the relationship between what is being subducted at the Aleutian trench and what is being erupted in Aleutian volcanoes?
• Need Nd, Sr, Pb, Hf isotope ratios, and incompatible trace element compositions
Aleutian Volcanics
North Pacific (Juan de Fuca Ridge) MORB
Sediments off the Aleutian
The EarthChem Consortium
Founded in 2003 by R. Carlson, A. Hofmann, K. Lehnert & D. Walker
• Build an integrated data management and information system for solid earth geochemistry,
  • based on and expanding the collaboration of PetDB,
• Nurture synergies among projects
• Minimize duplication of efforts
• Share tools and approaches
EarthChem Activities
• Community Workshop (October 2003, Carnegie Institution
  • Reviewed the current status of data management efforts in Solid Earth
  • Discussed ways in which these activities can grow and collaborate to best participate in and contribute to the Cyberinfrastructure revolution in the Geosciences
• Exhibits & demos at AGU 2003 & 2004 and GSA 2004
• Presentations at GSA 2003, AGU 2004, & various workshops
• Session on “Geoinformatics for Geochemistry” at AGU 2004, co-chaired with GERM
• Web site at
EarthChem Priorities
• Build the EarthChem portal as a central access point to a system of federated geochemistry databases (One-Stop Shop for Geochemical Data)
• Ensure efficient and continuing update and expansion of data holdings
Proposal submitted to NSF (EAR I&F) January 2005 (K. Lehnert, D. Walker)
One-Stop Shop for Geochemical Data
Geoscience CI
• Uniform data submission
• Search capability across federated databases
• Standardized & integrated data output
• Generally applicable tools for data quality (DQ) assessment & data analysis/visualization
• and more...
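The combination of search across federated databases and standardized output can be sketched as follows. This is only an illustration under stated assumptions: the two in-memory "databases", their field names, the adapter functions, and the common record layout are all hypothetical, and a real portal would query remote services rather than Python lists.

```python
# Two stand-in "databases" with different native field names (made-up rows).
DB_A = [
    {"SampleID": "A-1", "Setting": "arc", "SiO2_wt": 52.0},
    {"SampleID": "A-2", "Setting": "ridge", "SiO2_wt": 49.5},
]
DB_B = [
    {"sample": "B-7", "tectonic_setting": "ARC", "sio2": 60.5},
    {"sample": "B-8", "tectonic_setting": "RIDGE", "sio2": 50.1},
]


def adapt_a(row):
    """Map a DB_A row onto the common (hypothetical) output layout."""
    return {
        "sample_id": row["SampleID"],
        "setting": row["Setting"].lower(),
        "SiO2": row["SiO2_wt"],
        "source": "db_a",
    }


def adapt_b(row):
    """Map a DB_B row onto the same common output layout."""
    return {
        "sample_id": row["sample"],
        "setting": row["tectonic_setting"].lower(),
        "SiO2": row["sio2"],
        "source": "db_b",
    }


SOURCES = [(DB_A, adapt_a), (DB_B, adapt_b)]


def federated_search(setting, sources=SOURCES):
    """Query every source and return matches in one standardized layout."""
    results = []
    for rows, adapt in sources:
        for row in rows:
            record = adapt(row)
            if record["setting"] == setting:
                results.append(record)
    return results


arc_samples = federated_search("arc")
```

Keeping one small adapter per source means a new database can join the federation without changes to the search function, and the `source` field preserves the identity of the contributing database in the merged output.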
Building the One-Stop Shop
• Interface federated databases
  • Implement web services: SOAP/XML/WSDL, OAI, OGC
  • Standardize metadata (ISO 19115, OGC-GML)
  • Systematize nomenclature & vocabulary (ontologies)
  • Register database schemas with GEON?
  • Implement unique sample identification through use of the International Geo Sample Number
• Build user interfaces with flexible data selection and extraction, tiered for different levels of expertise
  • Use customized GEON Portal technology?
  • Use EarthChem map viewer, GeoMapApp browser, or other tools to integrate with other data types such as seismic tomography, gravity, structural features, etc.
• Provide tools for data evaluation such as
  • interactive discriminant plots, P/T calculators, data quality filters
The Bottleneck: Data Entry
• Difficult to find knowledgeable data
• Missing metadata (e.g. locations, analytical info)
• No unique sample identification
• Missing standards for data presentation (e.g. units)
• Unavailable data files
• Errors in original data tables
• Missing cooperation from authors
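The lack of standards for data presentation, units in particular, is one item on this list that lends itself to a small worked example. The sketch below normalizes mass-fraction concentrations to ppm, using the fact that 1 wt% equals 10,000 ppm and 1 ppm equals 1,000 ppb. The function name and the set of accepted unit spellings are assumptions for illustration.

```python
# Conversion factors to ppm (parts per million by mass).
# 1 wt% = 10,000 ppm; 1 ppb = 0.001 ppm; ug/g is equivalent to ppm.
TO_PPM = {
    "wt%": 1.0e4,
    "ppm": 1.0,
    "ug/g": 1.0,
    "ppb": 1.0e-3,
}


def to_ppm(value, unit):
    """Convert a mass-fraction concentration to ppm.

    Raises ValueError for units that are not in the lookup table, so that
    unrecognized spellings are flagged instead of being silently passed on.
    """
    key = unit.strip().lower()
    if key not in TO_PPM:
        raise ValueError(f"unrecognized unit: {unit!r}")
    return value * TO_PPM[key]
```

For example, `to_ppm(0.5, "wt%")` gives 5000.0. The sketch only rescales units; converting between an oxide and its element (for instance K2O to K) needs an additional stoichiometric factor that is not handled here.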
Efficient Update & Expansion of Data Holdings
• Encourage direct data contributions from the
• Build on-line data submission capability for future data (compliance with data policies for science programs!)
• Provide services for on-line storage of routine data about analytical procedures (“MyEarthChem”)
• Facilitate incorporation of existing large data
• Provide technical assistance to investigators who want to compile new datasets
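One way an on-line submission capability could address the missing-metadata problem is to check each submitted record before it is accepted. The following is a rough sketch under stated assumptions: the list of required fields and the record layout are hypothetical and do not represent an EarthChem submission format.

```python
# Hypothetical list of fields a submission must provide.
REQUIRED_FIELDS = (
    "sample_id", "latitude", "longitude",
    "analyte", "value", "unit", "technique",
)


def validate_submission(record):
    """Return a list of problems found in one submitted record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")

    lat = record.get("latitude")
    if isinstance(lat, (int, float)) and not -90 <= lat <= 90:
        problems.append("latitude out of range")

    lon = record.get("longitude")
    if isinstance(lon, (int, float)) and not -180 <= lon <= 180:
        problems.append("longitude out of range")

    return problems


# Placeholder records (made-up values) to exercise the checks.
complete = {
    "sample_id": "DEMO-001", "latitude": 0.0, "longitude": 0.0,
    "analyte": "Sr", "value": 1.0, "unit": "ppm", "technique": "ICP-MS",
}
incomplete = {
    "sample_id": "DEMO-002", "longitude": 200.0,
    "analyte": "Sr", "value": 1.0, "unit": "ppm",
}
```

Returning a list of problems, and not stopping at the first one, lets a contributor fix everything in a single pass.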
Facilitate Community Contributions
• Assist contributors with design, implementation, & population of databases.
• Serve databases via the EarthChem portal.
• Contributed datasets will retain their identity within the EarthChem system.
“A relational database of the Mexican Volcanic Belt” (Straub, Ferrari, Langmuir)
Expansion of Data Holdings
• Generate additional datasets
• Identify and prioritize new target datasets through community outreach and the EarthChem Advisory Committee
• Data entry by dedicated EarthChem
Integration with Science & GeoInformatics
A User’s Vision
“… in theory the best thing would be one
big Geo-database where all different types
of geochemical reservoirs are included
and all analytical tools as well and where
you can search for either regions or
reservoir type or method...
ok that’s a big goal.”