GLOBAL BIODIVERSITY INFORMATION FACILITY
The Global Biodiversity Information Facility (GBIF): The distributed architecture
Biodiversity Information Standards (TDWG) 2009 Conference
9-13 November 2009
Samy Gaiji
Head of Informatics
GBIF
WWW.GBIF.ORG
Objectives of this presentation

Expose the challenges faced by GBIF in building a global information network;

Present the GBIF distributed architecture strategy;

Introduce the key building components of the GBIF Informatics Suite;

Call on the community to participate.
A growing global network…
53 country participants
43 associated participants
Millions of primary biodiversity records
A growing network…
189.4 million records
5% increase/month
8186 data resources
306 data publishers
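As a rough check on the growth figures quoted on this slide, the short sketch below works out what a 5% monthly increase means for a base of 189.4 million records. The record count and the rate come from the slide; treating the rate as constant and compounding it over twelve months is an assumption made purely for illustration, not a GBIF forecast.

```python
# Figures from the slide; the constant-rate compounding model is an assumption.
RECORDS_MILLIONS = 189.4
MONTHLY_GROWTH = 0.05


def records_added_next_month(total: float, rate: float) -> float:
    """Records added in one month at a given relative growth rate."""
    return total * rate


def projected_total(total: float, rate: float, months: int) -> float:
    """Total after `months` of compound growth at `rate` per month."""
    return total * (1 + rate) ** months


if __name__ == "__main__":
    added = records_added_next_month(RECORDS_MILLIONS, MONTHLY_GROWTH)
    print(f"added in the next month: {added:.2f} M")
    year = projected_total(RECORDS_MILLIONS, MONTHLY_GROWTH, 12)
    print(f"after 12 months at the same rate: {year:.1f} M")
```

Five percent of 189.4 million is about 9.5 million records, slightly above the 8-9 M/month indexing figure given elsewhere in the deck, which suggests both numbers are rounded.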
Data publishers
<1% IPT
3% TAPIR
16% BioCASE
80% DiGIR
80% DwC
18% ABCD
2% others
Discovering
Indexing
189 M records
8-9 M/month
>300 publishers
Publishing
Architecture
A one-stop entry point to data discovery
http://data.gbif.org
What are the challenges today?
Improved discovery
Better synchronisation
Richer user interface
Better management
More data types
Richer content
Decentralisation is therefore aimed at empowering GBIF Nodes and Participants
What are the key processes?
Registry
Service
Publishers
Registering
Data
Publishers
Discovering
Harvesting
Indexing
Node
Access
What are the key components?
The GBIF Informatics Suite for Participants
Portal toolkit
Registry
Data flow
Registration &
Discovery
Harvesting toolkit
Publishing toolkit
Publishing Component

Data
Publishers


Provide a robust and user-friendly publishing tool (TAPIR compliant, WFS-WMS, EML etc.),
Improve the existing standards (DwC, DwC Archive) and enable the provision of richer content through extensions for specialised communities,
Support the publishing of more data types such as Metadata, Names, etc.
The Integrated Publishing Toolkit (IPT)
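The idea of "richer content through extensions" can be pictured as a core record plus any number of extension rows that point back to it by identifier, which is the star layout used by the DwC Archive. The sketch below is only an illustration: the sample values are made up, and the extension field names (`trait`, `value`) are hypothetical placeholders rather than terms from any published extension.

```python
from collections import defaultdict
from typing import Dict, List

# Core records: one row per specimen or accession (made-up sample values).
core_records: List[Dict[str, str]] = [
    {"id": "acc-1", "scientificName": "Hordeum vulgare"},
    {"id": "acc-2", "scientificName": "Triticum aestivum"},
]

# Extension rows: zero or more per core record, linked through "coreid".
# The "trait" and "value" field names are hypothetical.
extension_rows: List[Dict[str, str]] = [
    {"coreid": "acc-1", "trait": "plant height", "value": "85 cm"},
    {"coreid": "acc-1", "trait": "days to heading", "value": "62"},
]


def attach_extensions(core, extensions):
    """Return core records with their extension rows nested under 'extensions'."""
    by_core = defaultdict(list)
    for row in extensions:
        # Drop the linking key; it is implied by the nesting.
        by_core[row["coreid"]].append(
            {k: v for k, v in row.items() if k != "coreid"}
        )
    return [{**rec, "extensions": by_core.get(rec["id"], [])} for rec in core]


if __name__ == "__main__":
    for record in attach_extensions(core_records, extension_rows):
        print(record["id"], record["scientificName"], len(record["extensions"]))
```

A core record with no extension rows simply carries an empty list, so specialised communities can add detail without changing the core schema.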
Harvesting/Indexing component

Harvesting
Indexing
Provide a tool that will:
- harvest distributed data publishers using multiple protocols and schemas,
- harvest multiple data types (Primary Biodiversity Data, Metadata, Names),
- synchronise with the GBIF Registry (part of the GBRDS),
- index into a central database.
The Harvesting and Indexing Toolkit (HIT)
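One way to picture "harvest using multiple protocols, then index into a central database" is a table of per-protocol handlers feeding a single index. The sketch below is not the HIT implementation: the handler functions return canned records instead of contacting a server, and every name in it (`Resource`, `HANDLERS`, `harvest_and_index`, the example.org endpoints) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Record = Dict[str, str]


@dataclass(frozen=True)
class Resource:
    key: str
    protocol: str  # e.g. "DiGIR", "BioCASE", "TAPIR"
    endpoint: str


Handler = Callable[[Resource], Iterable[Record]]
HANDLERS: Dict[str, Handler] = {}


def handler(protocol: str):
    """Register a harvesting function for one protocol (case-insensitive)."""
    def register(fn: Handler) -> Handler:
        HANDLERS[protocol.lower()] = fn
        return fn
    return register


@handler("DiGIR")
def harvest_digir(resource: Resource) -> List[Record]:
    # Stand-in: a real handler would page through the DiGIR endpoint.
    return [{"id": f"{resource.key}:1", "scientificName": "Puma concolor"}]


@handler("TAPIR")
def harvest_tapir(resource: Resource) -> List[Record]:
    # Stand-in: a real handler would issue TAPIR search requests.
    return [{"id": f"{resource.key}:1", "scientificName": "Quercus robur"}]


def harvest_and_index(resources: Iterable[Resource],
                      index: Dict[str, Record]) -> List[str]:
    """Harvest each resource into `index`; return keys of skipped resources."""
    skipped = []
    for res in resources:
        fn = HANDLERS.get(res.protocol.lower())
        if fn is None:
            skipped.append(res.key)  # no handler for this protocol
            continue
        for rec in fn(res):
            index[rec["id"]] = {**rec, "resource": res.key}
    return skipped


if __name__ == "__main__":
    central_index: Dict[str, Record] = {}
    todo = [
        Resource("r1", "DiGIR", "http://example.org/digir"),
        Resource("r2", "TAPIR", "http://example.org/tapir"),
    ]
    print(harvest_and_index(todo, central_index), sorted(central_index))
```

Keeping the protocol-specific code behind one lookup table means a new protocol can be supported by adding a handler, without touching the indexing loop.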
Registry component
Registry


Provide a mechanism that will:
- provide a registry of organisations and resources (collections),
- provide a registry of schemas and extensions,
- provide a registry of services and tools.
A compass for all the information networks.
The Global Biodiversity Resources Discovery System (GBRDS)
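The three registries listed on this slide (organisations and resources, schemas and extensions, services and tools) can be sketched as a small in-memory data model. This is an illustration under stated assumptions, not the GBRDS design: the class names, fields, and the `discover` method are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Service:
    protocol: str  # e.g. "TAPIR"
    url: str


@dataclass
class Organisation:
    key: str
    name: str


@dataclass
class ResourceEntry:
    key: str
    title: str
    organisation_key: str
    services: List[Service] = field(default_factory=list)


class Registry:
    """Hypothetical in-memory registry of organisations, resources, extensions."""

    def __init__(self) -> None:
        self.organisations: Dict[str, Organisation] = {}
        self.resources: Dict[str, ResourceEntry] = {}
        self.extensions: Dict[str, str] = {}  # extension name -> schema URL

    def register_organisation(self, org: Organisation) -> None:
        self.organisations[org.key] = org

    def register_resource(self, res: ResourceEntry) -> None:
        # A resource must belong to an organisation that is already registered.
        if res.organisation_key not in self.organisations:
            raise KeyError(f"unknown organisation: {res.organisation_key}")
        self.resources[res.key] = res

    def discover(self, protocol: str) -> List[ResourceEntry]:
        """Return resources that expose at least one service for `protocol`."""
        wanted = protocol.lower()
        return [
            res for res in self.resources.values()
            if any(s.protocol.lower() == wanted for s in res.services)
        ]


if __name__ == "__main__":
    reg = Registry()
    reg.register_organisation(Organisation("org-1", "Example Herbarium"))
    reg.register_resource(ResourceEntry(
        "res-1", "Example collection", "org-1",
        [Service("TAPIR", "http://example.org/tapir")],
    ))
    print([r.title for r in reg.discover("tapir")])
```

A harvester could call something like `discover` to find which endpoints to visit, which is the sense in which a registry acts as a compass for the network.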
Portal component

Node
Access

Provide a platform that will publish:
- Primary Biodiversity Data,
- Names,
- Metadata.
Design it as a flexible and customisable platform to meet the needs of a variety of communities.
The Nodes Portal Toolkit
Where are we today?
Planning phase
- Node Portal Toolkit (NPT)
Development/Testing phase
- Harvesting and Indexing Toolkit (HIT)
- Global Biodiversity Resources Discovery System (GBRDS)
Production phase
- Integrated Publishing Toolkit (IPT)
Some successful examples…
Broadening standards
The DarwinCore
Germplasm Extension
Some successful examples…
Broadening standards
Diagram labels: Sample acquisition, Collecting event, Breeding event, DarwinCore, Trait experiment, Trait measurement, ‘IPR’
The DarwinCore Germplasm Extension
Some successful examples…
Publishing richer content.
The DarwinCore
Germplasm Extension
Towards decentralisation
Better discovery,
Improved integration.
World Database on Protected Areas
Species richness changes…
Global Register of Migratory Species
More data types,
Increased content,
Better data quality,
More participants.
A complex challenge…
A call to the community for participation
1. Improve standards (within and across domains);
2. Evaluate and contribute to the GBIF Informatics Suite;
3. Develop specific use cases (assessing threats to biodiversity, monitoring impacts of invasive species, agrobiodiversity…);
4. Actively engage in the decentralisation of the GBIF architecture to meet YOUR needs;
5. Address challenges in data quality and completeness;
6. Constantly monitor data usage and review/prioritise the Informatics developments.
Ask the GBIF Team!
Nick King, GBIF Executive Secretary
Samy Gaiji, Head of Informatics
David Remsen, Senior Programme Officer for ECAT
Vishwas Chavan, Senior Programme Officer for DIGIT
Éamonn Ó Tuama, Senior Programme Officer for IDA
Andrea Hahn, Data Portal Manager
José Miguel Cuadra Morales, Programmer
Kyle Braak, Programmer
Markus Döring, Senior Programmer
Challenges: broadening data types!