Developments and Trends in the LMS and Discovery Arenas



Marshall Breeding
Director for Innovative Technology and Research, Vanderbilt University Library
Founder and Publisher, Library Technology Guides
http://www.librarytechnology.org/
http://twitter.com/mbreeding
26 August 2010, Stockholm

Program on National Infrastructure

Seminar Goal

• The aim of the seminar is to create an understanding of the infrastructural challenges and to contribute to a plan of action for the future.
• Library Directors and System managers will discuss different solutions of availability and management of e-resources in order to make strategic choices for the development of the infrastructure at a national level.

Presentation Themes

• Trends and recent developments in the library system market, resource discovery services and resource management as indexing/knowledge bases
• Creation and management of data wells for metadata
• Ongoing discussion regarding options for building data wells in-house, open source or partnering with commercial actors.

Summary

• Development and trends in the library system market, regarding resource discovery services and resource management as indexing/knowledge bases. If I should emphasize something special, it is the question of data wells for metadata. We have been investigating the data well question in a report (please see below, Summary in English), and there is a discussion about building data wells in-house, with open source, or with commercial actors. We have also invited three commercial actors to the seminar. Not an easy question!

Related is also the topic of the national catalogue LIBRIS as a local OPAC for the libraries. How can Libris work not only as the national catalogue but also as a local OPAC? The third topic is the future of Ex Libris MetaLib/SFX in Sweden. We're happy with SFX, but not with MetaLib/federated search; how should we continue? But the main focus at the seminar will be resource management/data well, although Libris and MetaLib/SFX questions need to be included in the discussions.

Basic Discovery Concepts

Crowded Landscape of Information Providers on the Web

• Lots of non-library Web destinations deliver content to library patrons
   • Google Search / Google Scholar
   • Amazon.com
   • Wikipedia
   • Ask.com

User expectations

Evolution of library collection discovery tools

• Bound handwritten catalogs
• Card Catalogs
• Library online catalogs – OPACs
• Next-Gen Catalogs / Discovery interfaces
• Web-scale discovery services

Bound Catalog

Card Catalog

Online Card Catalog

Web-based online catalog

Next-generation Catalog

Next-generation Catalog

Modernized Interface

• Single search box
• Query tools
   • Did you mean
   • Type-ahead
• Relevance ranked results
• Faceted navigation
• Enhanced visual displays
   • Cover art
   • Summaries, reviews
• Recommendation services

Web site as menu of search options

Disjointed approach to information and service delivery

• Silos Prevail
   • Books: Library OPAC (ILS module)
   • Articles: Aggregated content products, e-journal collections
   • OpenURL linking services
   • E-journal finding aids (often managed by link resolver)
   • Local digital collections: ETDs, photos, rich media collections
   • Metasearch engines
• All searched separately

Lack of unified Web presence

• Users don't understand the distinctions we make
   • Catalog?
   • Articles and Databases?
   • Digital Library?
   • Search our Site?
• Search interfaces based on content formats or management applications

Non-library Web sites are much more unified

A simple vision

• A single point of entry to all the content and services offered by the library
Search:
• …but with precision, nuanced sophistication, and multiple dimensions

Web-scale discovery

Online Catalog vs. Discovery Layer

• Online Catalog
   • Interface conventions from an earlier Web era
   • Scope: tied to the ILS and its content domain
• Discovery Layer
   • Modern interface elements
   • Scope: aims to address the broad range of components that constitute library collections

Discovery Products

Decoupled from ILS

Social discovery

• Tags, user-supplied ratings and reviews
• Leverage social networking interactions to assist readers in identifying interesting materials: BiblioCommons
• Leverage use data for a recommendation service of scholarly content based on link resolver data: Ex Libris bX service

Deep indexing

• Metadata can no longer serve as the only basis for discovery
• Increasing opportunities to search the full contents
   • Google Library Print, Google Publisher, Open Content Alliance, government publications, etc.
• High-quality metadata will improve search precision
• Commercial search providers already offer “search inside the book” and searching across the full text of large book collections
• Important transition to full-text book search beginning in library projects
   • HathiTrust indexing 6 million volumes
• Must become a routine component of library discovery
• Deep search highly improved by high-quality metadata

Discovery product Trend

• Initial products focused on technology
   • AquaBrowser, Endeca, Primo, Encore, VuFind
   • Mostly locally-installed software
• Current phase focused on integrated access to both local content and remote articles to deliver Web-scale discovery. Examples:
   • Summon (Serials Solutions)
   • WorldCat Local (OCLC)
   • EBSCO Discovery Service (EBSCO)
   • Primo Central
   • Encore Synergy

Beyond Federated search

• Federated Search / Metasearch uses real-time queries against multiple information targets
• No centralized index: presentation of dynamic results
• Shallow results: only a few results initially fetched from each target
• Difficult to calculate relevancy
• Performance challenges
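The points above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the target names, the titles, and the substring matching stand in for remote databases queried over the network at search time. It is not how any particular metasearch product works.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import zip_longest

# Hypothetical in-memory "targets"; a real target would be a remote
# database queried over the network at search time.
TARGETS = {
    "catalog": ["Library systems handbook", "Discovery in libraries", "Cataloging rules"],
    "articles_a": ["Discovery services compared", "Link resolvers in practice"],
    "articles_b": ["Metasearch performance", "Discovery and relevance", "Discovery at scale"],
}


def query_target(name, term, limit):
    """Return at most `limit` matches from one target, in the target's own order."""
    hits = [title for title in TARGETS[name] if term.lower() in title.lower()]
    return [(name, title) for title in hits[:limit]]


def federated_search(term, limit_per_target=2):
    """Query every target in parallel and interleave the shallow result sets."""
    names = list(TARGETS)
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        per_target = list(
            pool.map(lambda n: query_target(n, term, limit_per_target), names)
        )
    # No shared index means no shared relevance score, so the results are
    # interleaved round-robin: first hit of each target, then second, ...
    merged = []
    for row in zip_longest(*per_target):
        merged.extend(hit for hit in row if hit is not None)
    return merged


if __name__ == "__main__":
    for source, title in federated_search("discovery", limit_per_target=1):
        print(source, "-", title)
```

With `limit_per_target=1`, the second matching title in `articles_b` is never fetched, which is the "shallow results" problem; the round-robin merge shows why relevancy is hard to calculate without a common index.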

Beyond local discovery interfaces

• Pre-populated indexes
• Web-scale
• Exploits the full depth and breadth of library collections
• Beyond the bounds of the local library’s collection
• Targets the universe of objective, vetted library content

Pre-populated discovery services

• New-generation interface
• Harvested local content
   • ILS metadata
   • Institutional repositories, ETDs, Digital Collection platforms
• Vendor-supplied indexes of library content
   • E-journals, databases, e-books
   • Full-text and metadata corresponding to e-content subscriptions
• Book collections beyond local library collections
• Includes full-text indexing to the fullest extent possible

[Diagram: Online Catalog. Labels: Search, Search Results, ILS Data]

[Diagram: Federated Search. Labels: Search, Search Results, Real-time query and responses, ILS Data, Digital Collections, ProQuest, EBSCOhost, …, MLA Bibliography, ABC-CLIO]

[Diagram: Discovery Interface. Labels: Search, Search Results, Local Index, ILS Data, Digital Collections, Real-time query and responses, ProQuest, EBSCOhost, …, MLA Bibliography, ABC-CLIO]

[Diagram: Web-scale Search. Labels: Search, Search Results, Pre-built harvesting and indexing, ILS Data, Digital Collections, ProQuest, EBSCOhost, …, MLA Bibliography, ABC-CLIO]

[Diagram, untitled. Labels: Search, Search Results, Pre-built harvesting and indexing, Digital Collections, ProQuest, …, MLA Bibliography, Fed Search, ABC-CLIO, Non harvestable Resources]

Discovery

Delivery

• Discovered content delivered through original repositories
• Publisher agreements generally preclude exposing content for direct access
• Should not necessarily circumvent core role of publisher

Benefits

• Libraries: increased access to high-cost electronic content
• Users: easier access to research resources
• Publishers: increased impact of content products
• IT perspective: advance harvesting makes more efficient use of resources than simultaneous real-time queries

Toward a Large-scale National Discovery environment

Obstacles and Challenges

• Scalable technology platform
• Acceptable relevancy-based retrieval for large heterogeneous collections
• Acquisition of data and metadata for aggregated index

Opportunities

• Climate more favorable to harvesting e-content for indexing
• Highly scalable, open source tools for discovery infrastructure
   • Lucene
   • Solr
• Many ongoing synergistic projects as possible collaborative partners

Potential Commercial Partners

• Three commercial organizations will participate in the seminar:
   • Ex Libris
   • Serials Solutions
   • EBSCO
• Each has negotiated access to commercial content products
• Paved the way for library-driven projects

Other similar projects

Summa

• State and University Library of Denmark
• Locally built integrated search
• Catalogs + articles
• Failed to receive EU funding due to lack of guarantees to receive article data from publishers
• Now partnering with Serials Solutions to use article index from Summon via API

Trove

• National Library of Australia
• Previously called Single Business Discovery Project
• Brings together many previously separate discovery systems
• Built in-house at NLA
• Prototype released May 2009
• Includes some full-text as well as metadata
• Technology: Java, Lucene, Solr, MySQL
• Details: http://www.nla.gov.au/pub/gateways/issues/101/story01.html

What about OCLC?

• WorldCat: ever-expanding repository of metadata
• Books mostly, increasing article metadata
• Focused on expanding WorldCat for broad discovery
• ArticleFirst: 23 million records
• April 2009 agreement with EBSCO for article metadata (withdrawn?)
• Quantity of article metadata apparently not on track to attain the same level of comprehensiveness as seen in Summon, EDS, Primo Central

Developing the Data Well / Aggregated index

• Aggregation of metadata and content
• Normalization: map metadata to make indexing, facets, and presentation meaningful
• De-duplication of records within and between content sources
• FRBR: collapsible groupings according to FRBR concepts:
   • work – expression – manifestation – item
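The normalization, de-duplication, and grouping steps listed above can be sketched in a few lines of Python. The records, field names, and matching keys below are all hypothetical, and grouping on normalized title plus author is only a crude stand-in for real FRBR work-level clustering.

```python
import re
from collections import defaultdict

# Hypothetical source records with inconsistent field names.
RAW_RECORDS = [
    {"source": "catalog", "title": "Hamlet", "author": "Shakespeare, William",
     "year": "1992", "format": "book"},
    {"source": "vendor", "Title": "HAMLET.", "creator": "Shakespeare, William",
     "date": "1992", "format": "book"},
    {"source": "vendor", "Title": "Hamlet", "creator": "Shakespeare, William",
     "date": "2003", "format": "ebook"},
]


def normalize(raw):
    """Map source-specific fields onto one common schema."""
    return {
        "source": raw["source"],
        "title": raw.get("title") or raw.get("Title", ""),
        "author": raw.get("author") or raw.get("creator", ""),
        "year": raw.get("year") or raw.get("date", ""),
        "format": raw.get("format", ""),
    }


def key(text):
    """Lowercase and strip punctuation so trivially different strings match."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each (title, author, year, format) key."""
    seen = {}
    for rec in records:
        k = (key(rec["title"]), key(rec["author"]), rec["year"], rec["format"])
        seen.setdefault(k, rec)
    return list(seen.values())


def group_by_work(records):
    """Rough work-level grouping: same normalized title and author."""
    works = defaultdict(list)
    for rec in records:
        works[(key(rec["title"]), key(rec["author"]))].append(rec)
    return dict(works)


if __name__ == "__main__":
    unique = deduplicate([normalize(r) for r in RAW_RECORDS])
    for work, members in group_by_work(unique).items():
        print(work, [(m["year"], m["format"]) for m in members])
```

In this toy data the two 1992 print records collapse into one, and the remaining print and e-book records are grouped under a single work. Real sources would need far more careful matching (identifiers, editions, translations).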

Content sources populating the Aggregated Index      Article metadata and full text   Index views according to profile Coordinated with local OpenURL knowledge bases Digital Collections LMS Metadata  Books, Microfilm, periodical titles, DVD, etc Blending of vendor provided metadata and locally managed unique content At the cusp of being able to represent library collections comprehensively

Acquiring content for Aggregated Index

• Agreements with publishers and providers of article content to libraries
• Open access content
• Any OAI target
• Local digital collections
• Relevant library catalog data
   • OK with OCLC record use policies when aggregated at a national level?
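For the "any OAI target" case, harvesting is driven by OAI-PMH requests. The Python sketch below only builds ListRecords request URLs; the base URL is a made-up placeholder, and fetching and parsing the XML responses is left out.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # A resumption token is an exclusive argument in OAI-PMH:
        # it is sent with the verb only, replacing the other arguments.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: only records changed since this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    base = "https://repository.example.org/oai"  # hypothetical endpoint
    print(list_records_url(base))
    print(list_records_url(base, from_date="2010-08-01"))
    print(list_records_url(base, resumption_token="page2"))
```

The `from` argument is what makes incremental harvesting possible: after the initial load, only records changed since the last harvest need to be requested.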

Data Well Construction

• Technical
   • Assembling technologies of adequate scale and capacity
   • Indexing, search and retrieval
   • Normalizing
• Business / Political
   • Agreements with commercial publishers to provide metadata or content
   • Increasing expectation from libraries to allow harvesting for discovery (similar to COUNTER compliance, OpenURL support)
   • Improved performance at delivering library end users to publisher content

Relationship with OpenURL Knowledgebase

• The aggregation of article-level citations and content relates to journal title-level profile and availability data in the OpenURL knowledgebase
• Important source of profiling needed to deliver appropriate views of the index for different libraries.
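One way to picture the relationship: title-level holdings from a knowledge base act as a filter over article-level records in the shared index. The Python sketch below assumes a toy knowledge base keyed by ISSN with year ranges; the library names, ISSNs, and articles are invented for illustration.

```python
# Hypothetical knowledge base: for each library, the journals (by ISSN)
# it subscribes to and the years covered by the subscription.
KNOWLEDGE_BASE = {
    "library_a": {"1234-5678": (2000, 2010)},
    "library_b": {"1234-5678": (2005, 2010), "8765-4321": (1995, 2010)},
}

# Hypothetical article-level records in the shared index.
ARTICLES = [
    {"title": "Article one", "issn": "1234-5678", "year": 2002},
    {"title": "Article two", "issn": "1234-5678", "year": 2008},
    {"title": "Article three", "issn": "8765-4321", "year": 1999},
]


def is_available(article, holdings):
    """True if the article's journal and year fall within the holdings."""
    coverage = holdings.get(article["issn"])
    if coverage is None:
        return False
    first_year, last_year = coverage
    return first_year <= article["year"] <= last_year


def profile_view(library, articles=ARTICLES):
    """Titles from the shared index that this library can actually deliver."""
    holdings = KNOWLEDGE_BASE[library]
    return [a["title"] for a in articles if is_available(a, holdings)]


if __name__ == "__main__":
    for library in KNOWLEDGE_BASE:
        print(library, profile_view(library))
```

The same index thus yields a different view per library, driven entirely by title-level profile data.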

A labor-intensive project

• Business process
   • Develop relationships with providers and publishers
   • Construct contracts and licenses
• Technical
   • Create import process for each source: normalization, mapping, de-duplication, FRBR groupings
   • Initial load + constant incremental updates
   • Creation of highly scalable indexing and retrieval platform: must scale up to 1 billion articles
   • Develop algorithms and tunings for appropriate relevancy rankings
   • Interface design
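The "initial load + constant incremental updates" pattern can be sketched as a single batch routine applied repeatedly. The Python below is a simplified illustration under stated assumptions: the index is a plain dictionary, and the record fields `id`, `modified` (an ISO date string), and `deleted` are hypothetical.

```python
def apply_batch(index, batch):
    """Apply one batch of records to an in-memory index keyed by record id.

    The same routine serves the initial load (empty index) and every
    later incremental update.
    """
    stats = {"added": 0, "updated": 0, "deleted": 0, "skipped": 0}
    for rec in batch:
        rec_id = rec["id"]
        current = index.get(rec_id)
        if rec.get("deleted"):
            if current is not None:
                del index[rec_id]
                stats["deleted"] += 1
            else:
                stats["skipped"] += 1
        elif current is None:
            index[rec_id] = rec
            stats["added"] += 1
        elif rec["modified"] > current["modified"]:
            # ISO dates compare correctly as plain strings.
            index[rec_id] = rec
            stats["updated"] += 1
        else:
            stats["skipped"] += 1  # stale or unchanged record
    return stats


if __name__ == "__main__":
    index = {}
    print(apply_batch(index, [
        {"id": "a", "modified": "2010-08-01", "title": "Old title"},
        {"id": "b", "modified": "2010-08-01", "title": "Another"},
    ]))
    print(apply_batch(index, [
        {"id": "a", "modified": "2010-08-20", "title": "New title"},
        {"id": "b", "modified": "2010-08-21", "deleted": True},
    ]))
    print(index)
```

A production platform would push these changes into a search engine rather than a dictionary, but the add/update/delete bookkeeping per source stays the same.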

Building Expectations for Article Discovery

• Libraries should require agreements for harvesting as part of content licensing process
• Library licenses have led to broad support for:
   • COUNTER
   • SUSHI
   • OpenURL Linking

Beyond Metadata

• Increasing expectation for full-text indexing
• Capacity present in e-journals for many years
• Full-text book indexing more problematic
   • Much full text not available
   • Complex to index

Heterogeneous index

• Books – mere millions
• Articles – many hundreds of millions
• Digital objects – many hundreds of millions

How to deal with non-harvestable resources

• Metasearch?
• Resource recommendation service
• Database spotlighting

Positioning of Discovery vs Native Interfaces

• Current generation of discovery interfaces lacks important features
   • Service delivery (items borrowed, renewals, fee payments, etc.)
   • Browse and other advanced search or retrieval features
• Many libraries use native Web-based catalog to supplement
• Native interfaces of major information products appeal to discipline specialists

Content + Services

• Must go beyond discovery to fulfillment
• Further integration of user services features into discovery interface
• Increased resource sharing capabilities

LIBRIS

• National Union Catalog > Local catalog?
• Local LMS?

LMS deployments in Sweden - Academic

LMS deployments in Sweden - Public

Mobile

The next new front for Library Discovery

Relevant Technology Trends

Service-oriented architecture

• Key technology for interoperability among diverse software applications
• New applications built with SOA throughout
• Legacy applications with a services layer

Aggregating data and metadata

• Open source
• Commercial partnerships

Mobile access to library content and services

• New opportunity to retain and attract library users
• Mobile web and apps
• Working toward a unified mobile library presence
• Unify disjointed mobile silos: the same ambition we have for the Web

Questions and Discussion