VECNet Digital Library for ALA


Multiple Identities: Managing Authorities in
Repositories and Digital Collections
Controlled Authorities implementation for the
Vector Ecology and Control Network
Donald Brower, Banu Lakshminarayanan & Natalie Meyers
VECNet and University of Notre Dame
18/07/2015
Page 1
Malaria is DEADLY . . . .
Example of campaign from Malaria NO MORE
malarianomore.org
Page 2
Malaria is Preventable & Curable
Image from Malaria NO MORE
malarianomore.org
Page 3
VECNet Digital Repository
dl.vecnet.org
Page 4
National Library of Medicine - Medical
Subject Headings (MeSH)
Page 5
MeSH
MeSH terms should not be regarded as
representing an authoritative subject
classification system but rather as arrangements
of descriptors for the guidance and convenience
of persons who are assigning subject headings
to documents or are searching for literature.
http://www.nlm.nih.gov/mesh/intro_trees.html
Page 6
National Library of Medicine - Medical
Subject Headings (MeSH)
Page 7
MeSH Trees & Terms
Page 8
MeSH Trees & Terms
Each term appears in at least one place in the
MeSH trees and may appear in as many
additional places as are appropriate.
Cataloged items can therefore have
broader subject terms in more than one
tree.
Page 9
Because each term may appear in as many
additional places as are appropriate in MeSH,
cataloged items can have broader subject terms
in more than one tree.
Page 10
Facet Creation Algorithm
Chemical Actions & Uses > Specialty Uses of Chemicals > Agrochemicals > Pesticides > Insecticides
Chemical Actions & Uses > Specialty Uses of Chemicals > Pesticides > Insecticides
Chemical Actions & Uses > Toxic Actions > Pesticides > Insecticides
The term “Insecticides” is in three locations.
Page 11
Facet Creation Algorithm
Agrochemicals > Pesticides > Insecticides
Pesticides > Insecticides
Cut and consolidate the duplicate trees.
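The cut-and-consolidate step can be sketched in a few lines of Python. This is a minimal illustration, not the VECNet implementation: the function name `facet_paths`, the `start_level` parameter, and the list-of-labels path representation are assumptions, and the three paths are our reading of the slide diagram for “Insecticides”.

```python
def facet_paths(term_paths, start_level=3):
    """Cut each root-to-term path at start_level and drop duplicate results.

    term_paths: list of paths, each a list of labels from the tree root
    down to the term. start_level is 1-based (3 = third level from top).
    Returns the distinct cut paths in first-seen order.
    """
    seen = set()
    facets = []
    for path in term_paths:
        cut = tuple(path[start_level - 1:])  # discard the levels above the cut
        if cut and cut not in seen:          # consolidate identical subtrees
            seen.add(cut)
            facets.append(cut)
    return facets


# The three locations of "Insecticides", as read from the slide (assumed layout).
INSECTICIDES_PATHS = [
    ["Chemical Actions & Uses", "Specialty Uses of Chemicals",
     "Agrochemicals", "Pesticides", "Insecticides"],
    ["Chemical Actions & Uses", "Specialty Uses of Chemicals",
     "Pesticides", "Insecticides"],
    ["Chemical Actions & Uses", "Toxic Actions",
     "Pesticides", "Insecticides"],
]

for facet in facet_paths(INSECTICIDES_PATHS):
    print(" > ".join(facet))
```

Under these assumptions the three original locations collapse to two facet trees, one rooted at Agrochemicals and one rooted at Pesticides.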
Page 12
Descending the Tree(s)
Page 13
Search Expansion
Page 14
How it Works: Indexing MeSH
• We index all the broader terms and synonyms
associated with each subject assigned to a given
document.
• Most of our content is in a few subtrees, so we
start faceting at the third level from the top.
• A document’s subject term may appear in
more than one tree. Duplicate trees are
consolidated (cf. “Insecticides”).
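As a rough sketch of the first bullet, the index entry for a document might be built by walking upward from each assigned subject and collecting every broader term plus the subject's synonyms. Everything named here is an assumption for illustration: the `expand_subjects` function, the field names in the returned dict, and the small `BROADER` and `SYNONYMS` tables, which are hand-written samples rather than data loaded from MeSH.

```python
# Hypothetical child -> parents table covering the "Insecticides" example.
BROADER = {
    "Insecticides": ["Pesticides"],
    "Pesticides": ["Agrochemicals", "Specialty Uses of Chemicals", "Toxic Actions"],
    "Agrochemicals": ["Specialty Uses of Chemicals"],
    "Specialty Uses of Chemicals": ["Chemical Actions & Uses"],
    "Toxic Actions": ["Chemical Actions & Uses"],
}

# Illustrative synonym table; not the actual MeSH entry terms.
SYNONYMS = {"Insecticides": ["Insecticide"]}


def expand_subjects(subjects, broader, synonyms):
    """Collect all broader terms and synonyms for a document's subjects."""
    ancestors = set()
    stack = list(subjects)
    while stack:
        term = stack.pop()
        for parent in broader.get(term, []):
            if parent not in ancestors:   # a term reachable by several paths is kept once
                ancestors.add(parent)
                stack.append(parent)
    alternates = {s for term in subjects for s in synonyms.get(term, [])}
    return {
        "subject": sorted(subjects),
        "broader": sorted(ancestors),
        "synonym": sorted(alternates),
    }


entry = expand_subjects(["Insecticides"], BROADER, SYNONYMS)
print(entry)
```

A search on a broader term such as “Pesticides” or on the synonym would then match a document cataloged only under “Insecticides”, because all three fields are indexed.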
Page 15
Why MeSH Matters
• Diverse users can find what they are looking
for even though they use different search
terms
• MeSH trees facilitate faceted browsing
• MeSH synonyms enhance discoverability
• These features are possible because we are
using the structure and synonyms in MeSH
Page 16
Indexing Flow Chart
Page 17
Modular Approach
• Preprocessing method can be used for other
hierarchical authority files
– Geographic Names
– Species Names
– Art and Architecture Thesaurus
• Process can be repeated easily when authority
files are updated (new terms added, etc.).
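One way to picture the modular preprocessing is a generic routine that needs only a child-to-parents mapping, so the same code could serve MeSH, a gazetteer, or a taxonomy. The sketch below is an assumption about how such a step could look, not the project's code: `paths_to_roots` and the `GEO_PARENTS` sample are hypothetical, and the hierarchy is assumed to be free of cycles.

```python
def paths_to_roots(term, parents):
    """Return every root-to-term path for `term`.

    parents maps each term to the list of its broader terms; a term with
    no entry (or an empty list) is treated as a root. Assumes no cycles.
    """
    ups = parents.get(term, [])
    if not ups:
        return [[term]]
    # A term with several broader terms yields one path per route to a root.
    return [path + [term] for up in ups for path in paths_to_roots(up, parents)]


# Hypothetical geographic authority file, hand-written for illustration.
GEO_PARENTS = {
    "Kenya": ["Eastern Africa"],
    "Eastern Africa": ["Africa"],
}

for path in paths_to_roots("Kenya", GEO_PARENTS):
    print(" > ".join(path))
```

Because the routine reads only the mapping, rerunning it after an authority file is updated amounts to rebuilding the mapping and recomputing the paths.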
Page 18
Future Work
• Implement researcher IDs
• Implement a cartographic UI for search and
retrieval against data with location and
geospatial coverage elements
• Support RDF export
• Automatic Ingest and automatic cataloging
from VECNet Cyberinfrastructure tools
Page 19
Amidst Heterogeneous Data
The Goal is Achieving Eradication
Page 20
Acknowledgements
• VECNet, funded by the Bill & Melinda Gates
Foundation
• Hesburgh Libraries, University of Notre Dame
• Hydra Project
• CurateND, UND’s Institutional Repository
• Penn State’s Scholarsphere
A work in progress: dl.vecnet.org
Page 21