Urmas Kõljalg: Data sources (WP1) Task 1.1 Assessment and evaluation of biodiversity data sources Lead UTARTU; MfN, UEF, UFZ, SGN, HCMR, BGBM,

Urmas Kõljalg: Data sources (WP1)
Task 1.1 Assessment and evaluation of biodiversity data sources
Lead UTARTU; MfN, UEF, UFZ, SGN, HCMR, BGBM, UCPH, MRAC, Plazi, NRM, IBSAS, EBCC,
NBIC, WCMC;
The evaluation will assess important data characteristics, such as
coverage, accessibility, quality, and format. Targeted data types: 1) remote
sensing data (incl. derived products, e.g. vegetation and habitat maps,
habitat classification schemes); 3) taxonomic backbone data; 4) ecological
data; 5) specimen data from scientific collections; 6) species profile data,
including descriptions and functional traits, conservation status,
distribution and abundance data; 7) DNA sequence data.
Task 1.2 Harmonization of European taxonomic backbone and
analysis of taxonomic coverage
Lead BGBM; MfN, UTARTU, UEF, UFZ, SGN, FIN, HCMR, UCPH, MRAC, Plazi, NRM,
IBSAS, NBIC
A unified taxonomic backbone will be built on the Pan-European Species directories
Infrastructure (PESI, www.eu-nomen.eu), and harmonized with ongoing attempts towards
a global Catalogue of Life (CoL). This task will integrate the Fauna Europaea and Euro+Med
PlantBase databases into the EDIT Platform for Cybertaxonomy, set up a mechanism for
regular updates, develop a full set of GEO BON and LifeWatch compliant services for
integration of taxonomic backbone data into the overall EU BON framework, and integrate
these services with INSPIRE conformant national data sources that include taxonomic
names.
Task 1.3 Gap analysis of available biodiversity information sources and
identifying priorities
Lead MfN; UTARTU, GBIF, FIN, BGBM, MRAC, WCMC
Based on the assessments of data sources undertaken for task 1.1 and the review of
policy requirements in WP6 (task 6.1), identified gaps in data coverage and quality for
different information layers will be evaluated against scientific interests and capacities in
the biodiversity information stakeholder communities, focusing on the European level.
Taxonomic, geographic, thematic, and other areas of bias will be analyzed in assessed
datasets and priority levels for closing gaps will be provided, also comparing European to
global level coverage.
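A minimal sketch of how taxonomic coverage gaps might be ranked, assuming records arrive as (group, species) pairs and that a reference checklist gives the number of known species per group. The function names, the 0.5 threshold, and the sample figures are illustrative assumptions, not part of the EU BON work plan.

```python
def coverage_by_group(records, reference_counts):
    """Share of known species per group that appear at least once in the records."""
    recorded = {}
    for group, species in records:
        recorded.setdefault(group, set()).add(species)
    return {
        group: len(recorded.get(group, set())) / known
        for group, known in reference_counts.items()
        if known > 0
    }


def priority_gaps(coverage, threshold=0.5):
    """Groups below the (hypothetical) threshold, worst coverage first."""
    below = [group for group, share in coverage.items() if share < threshold]
    return sorted(below, key=lambda group: coverage[group])


# Invented sample data, for illustration only.
records = [
    ("Aves", "Parus major"),
    ("Aves", "Pica pica"),
    ("Aves", "Parus major"),
    ("Fungi", "Amanita muscaria"),
]
reference_counts = {"Aves": 2, "Fungi": 10, "Insecta": 100}

coverage = coverage_by_group(records, reference_counts)
gaps = priority_gaps(coverage)
```

The same pattern would apply to geographic or thematic bias by swapping the grouping key, for example a grid cell or a habitat class instead of a taxonomic group.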
Task 1.4 Integrated approaches for focused biodiversity data mobilization
Lead NRM; UTARTU, Pensoft, BGBM, UCPH, MRAC, Plazi, GlueCAD, IBSAS, NBIC;
With this task, EU BON will advance data mobilization efforts targeting collection-based
and molecular data. In particular, three activities will be pursued:
i) Through the open-source DINA initiative led by NRM to develop a web-based collection
management system, an integrated solution for digitizing, managing and mobilizing
specimen data in natural history collections will be developed.
ii) The open access system JACQ (Virtual Herbaria) for capturing botanical data will be
integrated with other web services, to allow increased participation of a significant
number of European herbaria to feed their data into GBIF, BioCASE, and other networks
relevant for GEO BON.
iii) For DNA and genomic datasets, a set of web-based services, integrating distributional
and molecular information will be provided, also linking to molecular identification.
Task 1.5 Exploring citizen-science-based approaches for mobilizing and
generating biodiversity data
Lead UTARTU; MfN, Pensoft, FIN, HCMR, UCPH, Plazi, GlueCAD, NRM, IBSAS, NBIC;
This task will explore the potential of citizen-science-based approaches for biodiversity
assessment and monitoring, particularly for achieving more comprehensive data
coverage and towards future GEO BON developments. Using highly successful
technology-based citizen science recording schemes from EU BON partners and
associates, the status of citizen science and its links to curricula and environmental
education will be evaluated for best practice examples from at least five EU countries.
Elements for an action plan for pan-European citizen science networking for biodiversity
information will be developed, including supporting an education network that will
provide links to international and global initiatives and programs.
Data Integration and Interoperability
WP2
Hannu Saarenmaa
Objectives
• Establish an information architecture for the EU BON project that will be
compatible with the global GEO BON, INSPIRE, other European projects,
and the LifeWatch research infrastructure
• Develop data integration and interoperability between the various
networks and, with a new generation of data sharing tools, enhance links
between observational data, ecosystem monitoring data, and remote
sensing data
• Develop new web service interfaces for data holdings using
state-of-the-art standards and protocols. Register the networks on the GEOSS Common
Infrastructure (GCI) using harmonised metadata
• Develop a new portal to enable fast access to EU BON integrated data and
products by researchers, decision makers and other stakeholders
• Ensure global coordination of development efforts through an
international data interoperability task force and adoption of the results
through a helpdesk and a comprehensive training programme
Backdrop: GEO BON wants to achieve
an operational system by 2015
• What do we mean by “operational”?
– data flow from observations through various
aggregation and processing services,
end-to-end to Essential Biodiversity Variables
(EBV) and indicators;
– automated and streamlined, as appropriate;
– using a plug-and-play service-oriented approach;
– coordinated through the GEO BON registry system
and linked to the GEOSS Common Infrastructure;
– transparent to users through portals.
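The end-to-end flow described above can be sketched in a few lines, assuming a toy record shape of (species, grid cell, year) and using species richness as a stand-in for an EBV. The function names and sample records are hypothetical and do not reflect any actual GEO BON or EU BON service interface.

```python
from collections import defaultdict


def integrate(*sources):
    """Merge several sources into one analysable dataset, dropping exact duplicates."""
    seen = set()
    merged = []
    for source in sources:
        for record in source:
            if record not in seen:
                seen.add(record)
                merged.append(record)
    return merged


def species_richness(dataset):
    """Processing service: distinct species per (grid cell, year), an EBV-like variable."""
    species = defaultdict(set)
    for name, cell, year in dataset:
        species[(cell, year)].add(name)
    return {key: len(names) for key, names in species.items()}


def richness_change(ebv, cell, start, end):
    """Indicator: change in richness for one cell between two years."""
    return ebv.get((cell, end), 0) - ebv.get((cell, start), 0)


# Invented sample records, for illustration only.
specimens = [("Parus major", "A1", 2010), ("Pica pica", "A1", 2010)]
observations = [("Parus major", "A1", 2010), ("Parus major", "A1", 2012)]

dataset = integrate(specimens, observations)
ebv = species_richness(dataset)
indicator = richness_change(ebv, "A1", 2010, 2012)
```

Each stage takes only the output of the previous one, which is the property that would let stages be swapped for remote services in a plug-and-play arrangement.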
Streamlined data flow, end-to-end
[Figure: data-flow diagram. Labels: GEOSS & GEO BON Registries; Specimen, Observation, Occurrence, Taxonomy; Individual, Ecological, Plot, Measurement; Remote sensing, Geospatial, In Situ, Spatial data; Surveys, National Statistics, EcoSystem Services, Other inputs; Integration; Analysable dataset (x4); Processing service-1, -2, -3; EBV-1, EBV-2, EBV-x; Indicator-1, Indicator-2; Scenarios.]
EU BON WP2 Deliverables
1. Review (January 2014)
– Design of information architecture
– Review of data standards
2. New data sharing tools (2014/2015)
3. Registry and metadata catalogue
(2014/2015)
4. Portal (2015/2016)
5. Assessment of training activities (2016)
EU BON WP2 Milestones in year 1
• A global Informatics Task force
– Invited (March)
– First meeting (May/June)
• Helpdesk opened (April)
• Review
– Draft outlines (May)
– Deliverable (January 2014)
• Initial informatics workshop (May/June)
• Specifications for data sharing tools (January)
• First training workshop (January)
GEOSS Infrastructure Components
[Figure: GEOSS architecture diagram. Labels: Main GEO Web Site; GEOSS Common Infrastructure; Registered Community Resources; Client Tier; Mediation Tier; Access Tier; Registries (Components & Services, Standards and Interoperability, Best Practices Wiki, User Requirements); Community Portals; GEO Web Portals; Client Applications; Community Catalogues; GEOSS Clearinghouses; Mediation Servers; Alert Servers; Workflow Management; Test Facility; GEONETCast; Product Access Servers; Processing Servers; Sensor Web Servers; Model Access Servers. Interface labels: CSW, WMS, WFS, WPS, SOS, SAS, SPS, W*S.]
Task 2.1 Design of information
architecture for EU BON
• Starting from the information architectures of relevant infrastructures,
i.e., GBIF, LTER, GEOSS, GEO BON, LifeWatch, and INSPIRE, adopt a
coherent architecture that will guide the development, integration and
interoperability efforts within the EU BON project.
• The architecture will highlight the relevant components of registry, portal,
semantic mediation, workflows, and e-services as envisaged in the GEO
BON Detailed Implementation Plan and open access as recommended by
the GEOSS Data Sharing Principles.
• Link to, and adopt informatics components and approaches of other
relevant EU projects.
• The task will address heterogeneity of projects and networks by ensuring
that the developments of EU BON can be migrated to permanent
infrastructures.
• In particular, the architecture will map GCI components to European and
global biodiversity infrastructure.
• (Lead CSIC; UTARTU, UEF, GBIF, MRAC, GlueCAD, IBSAS, NBIC, TerraData;
Months 4-14)
Task 2.2 Improving data standards and
interoperability
• Starting from the GEO BON Detailed Implementation Plan and the
architecture (task 2.1) as well as relevant European projects
(ALTERNet, EBONE, LifeWatch), review the state-of-the-art and
needs for improvement of the current data standards of TDWG,
OGC, BioCASE, GBIF, LTER-Europe, PESI, and INSPIRE.
• Consider how the available protocols and mechanisms for
interoperability can be best used for integrating different data
layers (i.e., genetic data, primary occurrence data, monitoring data,
ecological measurements, remote sensing data) in the European
context.
• Consider reasons for heterogeneity of biodiversity information and
make recommendations for use of standards by the various
networks.
• (Lead GBIF; UTARTU, UEF, CSIC, Pensoft, MRAC, Plazi, GlueCAD,
INPA, IBSAS, NBIC, TerraData; Months 4-51)
Task 2.3 Tools for data sharing
• This task will work with international partners (task 2.7) to scope
the requirements and build new releases of data sharing tools for
relevant data providers.
• These open source tools implement the selected interoperability
mechanisms (task 2.2) and data publishing mechanisms (task 8.5)
for use by the relevant networks, and provide registration and
query functions towards the GCI.
• As the basis of development, existing tools for metadata,
occurrence data and ecological data from GBIF and LTER will be
used.
• New tools for sharing habitat data will be investigated.
• A model for distributed development will be adopted.
• (Lead MRAC; UTARTU, UEF, GBIF, Pensoft, Plazi, GlueCAD, INPA,
IBSAS; Months 9-51)
Task 2.4 Metadata registry and
catalogue
• Building on the existing GBIF and LTER registry and metadata
catalogues, an enhanced and integrated metadata system will be
developed for EU BON.
• The various entities such as networks, projects, sites, and datasets
identified in the analysis and mobilization efforts of WP1 will be
described in the new registry/catalogue.
• The entity descriptions should include web service interfaces or
other access points, and will also be registered at the GCI and other
indexing services.
• In order to overcome heterogeneity of data, accommodate
multilingualism, enhance discoverability and interoperability, and
facilitate querying in portals, the use of Knowledge Organisation
Systems (KOS; e.g., thesauri) will be explored.
• (Lead GBIF; UEF, CSIC, Pensoft, MRAC, INPA, IBSAS; Months 9-51)
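As an illustration of the registry idea in this task, the sketch below models registry entries with access points and uses a tiny thesaurus to resolve multilingual search terms to one preferred keyword. All class names, field names, identifiers, thesaurus entries, and the example.org URL are assumptions made for this example; they are not the GBIF or LTER registry schema.

```python
from dataclasses import dataclass, field

# Hypothetical thesaurus (a very small KOS): maps search terms in several
# languages to one preferred English term.
THESAURUS = {"linnud": "birds", "vögel": "birds", "birds": "birds"}


@dataclass
class RegistryEntry:
    identifier: str
    kind: str  # e.g. "network", "project", "site", "dataset"
    title: str
    keywords: set = field(default_factory=set)
    access_points: dict = field(default_factory=dict)  # protocol label -> URL


class Registry:
    def __init__(self):
        self._entries = {}

    def register(self, entry):
        """Add an entry, refusing duplicate identifiers."""
        if entry.identifier in self._entries:
            raise ValueError(f"duplicate identifier: {entry.identifier}")
        self._entries[entry.identifier] = entry

    def search(self, term, kind=None):
        """Find entries by keyword, resolving the term through the thesaurus first."""
        preferred = THESAURUS.get(term.lower(), term.lower())
        return [
            entry
            for entry in self._entries.values()
            if preferred in entry.keywords and (kind is None or entry.kind == kind)
        ]


registry = Registry()
registry.register(
    RegistryEntry(
        "ds-001",
        "dataset",
        "Breeding bird survey",
        {"birds", "monitoring"},
        {"OGC:WFS": "https://example.org/wfs"},
    )
)
registry.register(
    RegistryEntry("site-001", "site", "Forest monitoring plot", {"forest", "monitoring"})
)

hits = registry.search("Vögel")
```

Resolving terms before matching means a German or Estonian query finds entries that were described with English keywords, which is the discoverability benefit the task attributes to a KOS.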
Task 2.5 European Biodiversity Portal
• A European Biodiversity Portal (EBP) will be developed as the main GEO
BON information hub.
• It will link to relevant databases and information systems, policy contacts
and recommendations, and structured advice for assessing relevant
distributed information/datasets for different user groups, including
contributions from citizen science data gathering gateways.
• The EBP will technically integrate the various data sources under one
search facility and spatially/temporally oriented user interface.
• The portal will build on the tools developed by task 2.3 and the functions
developed by task 2.4.
• It will provide access to full detailed data, geographic visualisation, and
remotely sensed data. It will be closely linked to the GCI and GEO Portal,
and access layers and data from GEOSS sources.
• The portal would also act as showcase for the products from the
analytical and modelling activities of other WPs and support workflows
for building such products using the registered e-services.
• The portal will also serve general dissemination functions for WP8.
• (Lead CSIC; UEF, GBIF, UnivLeeds, Pensoft, FIN, MRAC, Plazi, GlueCAD,
NBIC; Months 1-54)
Task 2.6 Technical support and
helpdesk
• This task will set up a technical coordination unit and helpdesk for
European networks and for global outreach as envisaged in the GEO BON
Detailed Implementation Plan.
• The helpdesk will also support developments in other WPs, and will play a
key role in reviewing all design documents, testing the developments and
giving feedback to developers.
• The helpdesk will actively promote and assist national BONs and other
users in installing and using the tools that enable the interoperability
mechanisms (task 2.3).
• It will promote open access and assist in registering EU BON services at
the GCI, and help populating the portal with content from other WPs and
partners.
• The helpdesk facility will be set in place in collaboration and synergy with
the common Helpdesk platform of the MRAC, which is currently active for
six projects.
• (Lead MRAC; UEF, Pensoft; Months 4-54)
Task 2.8 Training programme
• This task will develop and operate a training programme in data and
metadata integration strategies, use of standards, and use of data tools,
within the EU BON consortium and beyond, thereby contributing also to
WP8 and the long term impact of the project.
• The training events will start with introductory courses and will first target
the consortium members. As the project evolves and the tools become
available, the training will also target external users. The plan is to invite
stakeholders from and beyond Europe. We target an audience of up to 25
participants per event.
• The training programme will be organized in collaboration with the DEST
(the Distributed European School of Taxonomy) set up under the European
project EDIT and currently maintained by RBINS, MRAC and NBGB. Initially
mainly dedicated to taxonomy courses, the scope has now been
broadened to other biodiversity and environment related training
activities (http://www.taxonomytraining.eu/).
• (Lead MRAC; UEF, Pensoft; Months 4-54)
Task 2.7 Informatics task force and
global cooperation
• In order to ensure integration of the various initiatives and
reduction of heterogeneity of biodiversity information, EU BON will
need to liaise broadly on technical issues.
• In this task an informatics and data standards task force and
advisory board will be organized and operated.
• In particular, the task will cooperate with, and build on technical
solutions developed by the LifeWatch infrastructure and several FP7
infrastructure projects for biodiversity.
• Close and regular contact with the GEOSS Infrastructure
Implementation Board and the GEO BON Steering Committee will
be ensured.
• It will also be necessary to link to and cooperate with related
initiatives in other parts of the world (such as ALA, DataOne,
regional BONs, etc.), and help to ensure their compatibility.
• (Lead UEF; MfN, GBIF, CSIC, MRAC, IBSAS, NBIC; Months 4-54)
Summary
• EU BON WP2 aligns very closely to GEO BON
WG8 – global task force will ensure this
• We have a lot to learn on how to be part of
the GEOSS Common Infrastructure
• We need to pilot structure and function for
LifeWatch
• We will need to build on tools already being
worked on in the community
LifeWatch Component Architecture
Some Generic Data Standards,
Interoperability Requirements, and Tools
Multi-dimensional   | XYZ, t, P   | NetCDF   | NetCDF
Traditional Spatial | XY, t, P    | S-DB     | WFS, WMS, …
Signals             | XYZ, t, P/B | O&M      | SOS
Ecosystem           | XYZ, t, P/B | MetaCat  | EML, CSV
Occurrence          | XYZ, T, Tx  | GBIF IPT | EML, DwC
Genome              | XYZ, T, Al  | GenBank  | FTP/ASN.1
Meta-Data Landscape
[Figure labels: ISO 19115/p2; EML; FGDC; Semantic Web / Linked Open Data; Dublin Core; Darwin Core; OPeNDAP/NetCDF/HDF-5]
What is the role of a portal in a
Service Oriented Architecture?
• and in distributed communities?
[Figure labels: EAI; Project-ware; SOA; GBIF; Pre-GBIF; LifeWatch, GEOSS, SEIS. Source: Sam Gentile]
A Service Centric Portal
[Figure: service-centric portal diagram. Labels: Biodiversity science; Multiple Service Consumers; Multiple Business Processes; Ecosystem Services; Service Architecture; Shared Services; Occurrence; Ecological measurements; Remote sensing; Multiple Discrete Resources; Multiple Service Providers.]
SOA structures the business and its systems as a set of capabilities that are
offered as services, organized into a service architecture. A service
virtualizes how that capability is performed, and where and by whom the
resources are provided, enabling multiple providers and consumers to
participate together in shared business activities.
Source: TietoEnator AB, Kurts Bilder