eCrystallographyDataReports:
An Open Archive Route for the Reporting and Dissemination
of Crystal Structures
S.J. Coles (a)*, J.G. Frey (a), M.B. Hursthouse (a), L. Carr (b) & C.J. Gutteridge (b).
(a) School of Chemistry, University of Southampton, UK; (b) School of Electronics & Computer Science, University of Southampton, UK.
The Publication Problem
Advances in crystallographic instrumentation and computational resources
over the last decade have caused an explosion of crystallographic data.
Traditional peer-reviewed publication of chemical data cannot keep up with
this new pace of data generation, causing a publication bottleneck. This
problem will become even more severe with developments in high-throughput
chemistry (Combichem) and the impact of eScience (Combechem). As a result,
the user community is deprived of valuable information, and the funding
bodies are getting a poor return on their investments!
The Open Archives Initiative (OAI) approach of EPrints offers a solution to
this problem through publicly accessible archives. Such archives are currently
a method for disseminating, via the internet, scholarly and research output
that cannot enter the public domain through conventional routes. However, open
archives are the subject of much debate, as researchers are now depositing
published papers in such repositories, causing concern for publishers.
Data Publication @ Source
Crystallographic eReports use the OAI concept to make available ALL the
DATA generated during the course of a structure determination experiment.
That is, the publishable output is constructed from all the raw, results and
derived data generated during the experiment.
eCrystallographyDataReports present metadata in a searchable, hierarchical
system. The top searchable level comprises bibliographic and chemical
identifier items, which give access to a secondary level of searchable
crystallographic items that are directly linked to the associated archived
data.
Hence the results of a crystal structure determination may be disseminated in
such a manner that anyone wishing to utilise the information may access the
entire archive of related data and assess its validity and worth. This
approach also enables chemical interpretations and discussions to be published
without being cluttered by lengthy crystallographic experimental details.
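The two-level arrangement of searchable metadata described in this section can be pictured as a small data structure. The following is a minimal sketch only, not the archive's actual schema: the class, field and function names (`DataReport`, `CrystalData`, `find_by_inchi`) and the file paths are hypothetical, and the values in the example entry are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CrystalData:
    """Second level: searchable crystallographic items (hypothetical fields)."""
    space_group: str
    r_factor: float
    # Links from the crystallographic items to the archived data files.
    data_files: List[str] = field(default_factory=list)


@dataclass
class DataReport:
    """Top level: bibliographic and chemical identifier metadata."""
    authors: List[str]
    chemical_name: str
    formula: str
    inchi: str
    crystal: CrystalData


def find_by_inchi(reports, inchi):
    """Search the top level, returning reports whose InChI matches exactly."""
    return [r for r in reports if r.inchi == inchi]


# Illustrative entry; the values are placeholders, not real results.
example = DataReport(
    authors=["A. Author"],
    chemical_name="benzene",
    formula="C6H6",
    inchi="InChI=1S/C6H6/c1-2-4-6-5-3-1/h1-6H",
    crystal=CrystalData(
        space_group="Pbca",
        r_factor=0.045,
        data_files=["raw/frames.tar", "results/structure.cif"],
    ),
)
```

A search on the chemical identifier at the top level returns the report, and the report in turn leads to the crystallographic items and the archived files beneath it.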
Simple input of bibliographic & crystallographic information
Core bibliographic data in a searchable and harvestable format. Includes
authors, affiliations, chemical name and formula along with IUPAC’s
International Chemical Identifier (InChI). May retrospectively include
references to the eReport (e.g. CSD entry or paper in learned society journal).
At-a-glance quality indicators and key information are automatically
abstracted from the underlying data.
Chemically interactive and rotatable 3D representation of the molecular
structure through a Java applet displaying the Chemical Markup Language (CML)
format file.
Automatic validation
Direct access to ALL the data
Current Developments
We are now engaging chemistry publishers who deal with crystallographic data
in order to embed this approach into the publication process. The most likely
scenario is that the crystallographic experimental details and structure are
published as an eCrystallographyDataReport, which is then referred to in the
discussion part of a paper in a learned society journal.
We are also working with the International Union of Crystallography to develop
a set of standards and semantics for OAI publishing of crystallography data.
The Bigger Picture: Entering the ‘Scholarly Knowledge’ cycle
The archive openly publishes core bibliographic and chemical data. This enables information about a new entry in the archive
to be ‘harvested’. Information providers regularly probe the archive interface for new or updated entries and download the
associated metadata. These information portals may then ‘aggregate’ the metadata, that is, perform linking and
cross-referencing exercises that enable the researcher to navigate seamlessly through the academic literature.
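The harvesting step can be illustrated with the OAI-PMH `ListRecords` request, which a harvester issues to collect new or updated entries. This is a minimal sketch, not the project's harvester: the function names `list_records_url` and `parse_records` are hypothetical, the archive base URL is supplied by the caller, and unqualified Dublin Core (`oai_dc`) is assumed as the metadata format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces defined by OAI-PMH 2.0 and by Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, since=None):
    """Build a ListRecords request, optionally limited to entries changed since a date."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if since:
        params["from"] = since  # date in YYYY-MM-DD form
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract identifier, datestamp and Dublin Core titles from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        header = rec.find("oai:header", NS)
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            # Deleted records carry no metadata, so this list may be empty.
            "titles": [t.text for t in rec.iterfind(".//dc:title", NS)],
        })
    return records
```

A harvester would fetch the URL returned by `list_records_url`, pass the response body to `parse_records`, and remember the latest datestamp for its next visit. A complete harvester would also follow the `resumptionToken` that OAI-PMH uses to page through long result lists, which this sketch leaves out.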
eCrystallographyDataReports have been devised as part of the eBank UK project, which is addressing the challenge of whole-lifecycle use by
investigating the role of aggregator services in linking datasets from Grid-enabled projects to ePrints contained in digital repositories through to
peer-reviewed articles. eCrystallographyDataReport archives make metadata available for harvesting via OAI-PMH, which enables our project partners at
UKOLN (University of Bath) to automatically extract this metadata from our archive. UKOLN are working with PSIgate (University of Manchester) to
provide a mechanism to aggregate eCrystallographyDataReport information with related studies already available in the broader literature.
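One way to picture the aggregation step is as grouping harvested records from several sources under a shared key. The sketch below assumes the InChI string is that key; this is an illustrative assumption, not a description of the mechanism being developed by the project partners. The function name `aggregate_by_inchi` and all record contents are hypothetical.

```python
from collections import defaultdict


def aggregate_by_inchi(*sources):
    """Group records from several (name, records) sources under their shared InChI."""
    linked = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            linked[record["inchi"]].append((source_name, record["title"]))
    return dict(linked)


# Placeholder records; the titles are invented for illustration.
ereports = [{"inchi": "InChI=1S/CH4/h1H4", "title": "Data report: methane"}]
literature = [{"inchi": "InChI=1S/CH4/h1H4", "title": "A journal article on methane"}]

links = aggregate_by_inchi(("archive", ereports), ("literature", literature))
```

Each key in `links` then lists every source that mentions the same compound, which is the kind of cross-reference a portal could present to a researcher.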