
The Richer the Record, The Better the Results

Options for enriching bibliographic records in your OPAC

Felicity Dykas, Head of Cataloging, University of Missouri
Ember Stevens, MLS 2011 MOBIUS Annual Conference June 8, 2011

OPAC Record Enrichment

• Literature review: what’s out there?

• Enrichment options: commercial and in-house possibilities • Record enrichment at MU: what we’re doing

Enrich Records with …

• Typically • Table of contents • Summaries and abstracts • Publishers’ descriptions, including information from book jackets

Enrich records with …

• Other
  • Author biographies
  • Reviews
  • Recommendations
  • Full text
  • Sample text
  • Book cover images
  • Other images
  • Sound and video clips
  • Names of directors, actors, etc.
  • Character names for fiction
  • Notes: Language of material, etc.
  • Incipits for music material
  • Subject headings
  • Form and genre terms
  • DDC and LCC index entry, captions, and notes (Markey, 1987)
  • Titles and other descriptive information using non-Roman scripts
  • User tags, reviews, ratings
  • Related works
  • Donor information and bookplates
  • Call numbers for online resources

Why enrich?

• “If it is too inconvenient I’m not going after it:” convenience as a critical factor in information-seeking behaviors (Connaway, 2011) • The concept of convenience can include • their choice of an information source • their satisfaction with the source and its ease of use • their time horizon in information seeking

Why enrich?

Users overwhelmingly search by keyword

• Records need more than author, title, subject headings • Subject headings contributed an average of 4.84 unique words, while the contents and summary notes fields contributed an average of 15.50 unique words per record. (Markey, 1987) • Notes fields add words that may be more current and/or more relevant to a particular discipline

Why enrich?

More data allows users to evaluate the resource
• Libraries need to make it easier for end users to quickly ascertain whether items meet their needs. (Calhoun, 2009)
• … giving users significantly enhanced intellectual access to library materials at the point of searching, regardless of where the search takes place. (Banush, 2002)

User expectations
• OPACs have to compete with internet searches, federated searches, full-text ebooks, etc.

• … Google, Amazon, Barnes & Noble

Summon … MERLIN Metadata

Summon … Database Metadata

Research

• After adding TOCs/Summary Notes, circulation rose significantly more than expected. … The percentage increase in circulation after TOCs/Summary Notes were added was 20.40%. (Faiks, 2007)
• The study found that tables of contents do increase usage. … In general, even after adjusting for all the variables (publication date, location, circulation status, subject, and previous use), the odds of a title being used increased by 43% if the titles had online tables of contents. … The largest effect of including a table of contents was for the most recent items. (Morris, 2001)

Research

• … this study suggests that content-enriched metadata overall contribute to higher circulation across the four subject fields. [History, social sciences, language & literature, science & technology] Content-enriched data also play an important role in OPAC discovery. (Tosaka, 2010)
• These data reveal that users’ queries have more matches in TOC than in LCSH. However, users’ queries often failed to find items similar to the target items whether their terms were searched against TOC or LCSH. (Choi, 2007)

Research

• … manual metadata enhancements greatly increase the use of our digital image collections. Enhanced images accounted for quadruple the amount of use as unenhanced images. [Internet searches analyzed using Google Analytics] (Chapman, 2011)

Research: Good Overviews

• Anderies, J. (2004), “Enhancing library catalogs for music”, Conference on Music & Technology in the Liberal Arts Environment, June 2004, Hamilton College.
• Byrum, John D. Jr, and David W. Williamson (2006). Enriching Traditional Cataloging for Improved Access to Information: Library of Congress Tables of Contents Projects. Information Technology and Libraries. March 2006. p. 4-11.
• Tosaka, Yuji, and Cathy Weng. Re-examining Content enriched Access: Its Effect on Usage and Discovery. (2010)

Caveats

• Adding keywords: increase in recall; decrease in precision (Banush)
• If TOC is long and contains many entries, does this dilute the value of the information once it is put into a 505 field? (Byrum, 2006)
• Time consuming and costly to add
• Copyright issues?
• Enrichment can’t overcome all system limitations
  • Displaying content-enriched data in OPAC with matching keywords highlighted is essential in helping users identify the resources they need. (Tosaka, 2010)

Projects to Enhance Records

• Library of Congress • OCLC

Library of Congress Projects

BEAT – Bibliographic Enrichment Advisory Team

• Scan and add links to digital information • Some automated processes • Add publisher supplied information • Add information from other sources • Annotations for sites selected annually by the MARS Best Free Reference Web Sites Committee

CIP

• Cover art • ONIX data including summaries

More information: http://www.loc.gov/catdir/beat/ and Byrum (2006)

LC record in MERLIN

(DLC|cDLC|dBTCTA|dYDXCP|dIG#|dBWX) “Machine-generated contents note:” “—Provided by publisher.”

Best Free Reference Web Sites

OCLC Projects

• Next Generation Cataloging • Pilot project related to use of upstream metadata and enrichment using publisher and vendor ONIX metadata • Where possible, will use automated processing • www.oclc.org/partnerships/material/nextgencataloging.htm

• Work record project: Will re-use summaries, contents, subject headings for identical works

OCLC Projects

• OCLC has partnered with All Music Guide and Rovi to enhance records for pop and classical music. Information added includes descriptions, genres, styles, release dates, tracks, ratings, etc. (olac.org/drupal/newsletters/enews/2010Dec/oclcnews.html) • AllMusic metadata is attached to bibliographic records for sound recordings. Adding third-party enhanced content is an OCLC priority. (MOUG newsletter, June 2011, p. 26, 28)

Some allmusic.com Notes in WorldCat

• 500 Song titles from AllMusic.

• 500 "The explicit stated sequel to 1975's masterpiece 'Captain Fantastic & the brown dirt cowboy'"- AllMusic.Com.

• 500 Originally released in 1953 as Jazz funeral in New Orleans--Allmusic.com, accessed 31 Jan. 2011.

• 500 "Jazz-rock, fusion, avant-garde"- www.allmusic.com

• 500 "Alternative metal, Industrial metal, Rap-metal"- allmusic.com

Slide from presentation at the OCLC ARC America Regional Council meeting ALA Midwinter 2011:

WorldCat Quality / Karen Calhoun

http://www.oclc.org/multimedia/2011/ARC_ALA2011.htm

Recent Cataloging (MU)

These projects are making a difference for us!
• Recent new books list (monographs)
  • 355 records total
  • 67 records have summaries (520) – 19%
  • 181 records have table of contents (505) – 51%
• Recent FastCat records (monographs, DLC and PCC)
  • 682 records total
  • 127 records have summaries (520) – 19%
  • 520 records have table of contents (505) – 59%

Enrichment Options

• Commercial
  • Convenient
  • Costly
• In-house
  • Flexible
  • Customizable
  • Time consuming

Commercial Services

• Many backed by Bowker (via Books in Print data) • Usually between $1-$2 per record; may be as cheap as $.02 per record with Syndetics

Commercial Services: Syndetics

• Distributed by Bowker • Table of contents, first chapters/excerpts, summaries, author notes, reviews, additional media • Boston University, Oklahoma State, University of Chicago • Cost depends on collection size • Doesn’t become permanent part of the record

Boston University

Table of contents Summary & Review

University of Chicago

Ability to Tag Summary Table of contents Book covers

Commercial Services: Blackwell/YBP

• “Tables of Contents Catalog Enrichment Service” • Distributed by Bowker • Acquired by Baker & Taylor/YBP in 2009 • TOCs, author notes, and book jacket summaries • Information is permanently added to bibliographic record • UM cost: $1.18 per title (expect this will increase)

MU’s MERLIN

Problems: Promotional Blurbs

"This book provides an excellent overview on opportunities for economic applications of the Information-Gap Theory." "A must-read for serious economic decision-makers."

Problems: Formatting

All caps

Commercial Services: LibraryThing

• Distributed by Bowker • Book recommendations, tagging, other editions, patron reviews, shelf browse • 20% discount for consortia • University of Denver, Brigham Young University, Cal State Channel Islands • Info stored on LibraryThing Server, updates in real time • Annual subscription

University of Denver

Virtual Shelf Browser

Brigham Young University

Tags Similar items Similar items

Brigham Young University

Reviews Similar items

Commercial Services: MARCIVE

• Table of contents, summaries, Accelerated Reader program, Fiction/Biography information, Lexile Measures, Reading Counts • Table of Contents: $.50/record • Fiction/Biography enrichment: $.50/record • Summaries: $.30/record • UC Merced, Brown University, U. of Georgia School of Law

University of California Merced

Table of contents Abstract & Review

Record Enrichment at MU

• Focus is on monographs
• Regular practices – at point of cataloging
  • Abstracts or summaries for MU theses, dissertations, master’s projects
  • Title added entries for music materials
  • Genre/form terms
  • Local collection headings
  • Donor information (Honor with Books, MU Remembers)
  • TOCs for ebooks using macros (post-cataloging projects, too)
• Regular practices – post cataloging
  • OCLC bib notification
  • TOCs for print engineering books (added by public services staff)
  • Recatalog to analyze volumes when requested by selector
• Commercial: YBP TOCs, summaries, author information

Record Enrichment at MU

• Experiments • Links to WorldCat Identities for authors • Summaries and other information for fiction • Summaries from publishers • Convert TOCs from LC links to TOC in bib records • Descriptions from exhibits and Special Collections pages (pending)

YBP – Purchased Metadata

• Record criteria for April 2011 enrichment
  • No table of contents (MARC tags 505 or 970s)
  • Published between 2008 and 2012
  • Monograph
  • Not an online book
  • Not a government publication
  • No subject headings with “examination questions” or “questions”
• April 2011 enrichment
  • 21,224 records sent
  • 9,701 records enriched (46%)
  • 7,070 table of contents added (73%)
  • 5,240 summaries added (54%)
  • 2,076 author information added (21%)
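The record criteria above amount to a filter over the catalog. A minimal sketch of that filter follows, assuming a simplified record structure: `BibRecord` and `needs_toc_enrichment` are hypothetical names for illustration, not part of Millennium or the YBP service, and the subject test is a plain substring match.

```python
from dataclasses import dataclass, field


@dataclass
class BibRecord:
    """Hypothetical, simplified stand-in for a MARC bibliographic record."""
    tags: set                      # MARC tags present, e.g. {"245", "505", "650"}
    pub_year: int
    is_monograph: bool = True
    is_online: bool = False
    is_gov_pub: bool = False
    subjects: list = field(default_factory=list)


def needs_toc_enrichment(rec: BibRecord) -> bool:
    """Return True when a record meets the April 2011 selection criteria."""
    # A record already has a TOC if it carries a 505 or any 970-979 field.
    has_toc = any(t == "505" or t.startswith("97") for t in rec.tags)
    # "questions" also catches "examination questions".
    excluded_subject = any("questions" in s.lower() for s in rec.subjects)
    return (
        not has_toc
        and 2008 <= rec.pub_year <= 2012
        and rec.is_monograph
        and not rec.is_online
        and not rec.is_gov_pub
        and not excluded_subject
    )
```

For example, `needs_toc_enrichment(BibRecord(tags={"245", "650"}, pub_year=2010))` selects the record, while the same record with a 505 field, or with a publication year of 2005, is skipped.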

Record Enrichment at MU: OCLC Bib Notification

• We signed up in 2009 to receive reports for records with new tables of contents and encoding level increases (free) – information is sent based on holdings in WorldCat • There is an option to receive updated records ($) • http://www.oclc.org/bibnote/default.htm

MU Process

• Use WorldCat batch processing to search for records using OCLC numbers (from reports), to add constant data, and to export records. Records match on OCLC number.
• Load table replaces 001 and 019 in existing local record, and inserts 505 and 520 fields
• We do not overlay records since we have a shared catalog and we do not want to lose local edits
• Each record is reviewed and duplicate 505s and 520s are deleted
• We used to insert subject headings, too, but reviewing for duplication was too time consuming
• Procedures: mulibraries.missouri.edu/staff/catalogdept/OCLCbibnotification.htm
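The load-table behavior described above (replace 001 and 019, insert 505 and 520, keep everything else local) can be sketched roughly as follows. This is an illustration only, not the Millennium load table itself: records are modeled as lists of `(tag, value)` pairs, `apply_load_table` is a hypothetical name, and the automatic skipping of exact duplicates is an assumption standing in for the manual review step.

```python
def apply_load_table(local, incoming):
    """Merge an exported WorldCat record into the existing local record.

    Both records are lists of (tag, value) pairs, a simplified stand-in for MARC.
    """
    replace_tags = {"001", "019"}   # taken from the incoming record
    insert_tags = {"505", "520"}    # added alongside any existing local fields

    def norm(value):
        # Compare notes ignoring case and spacing differences.
        return " ".join(value.split()).lower()

    # Keep every local field except the ones the load table replaces.
    merged = [(t, v) for t, v in local if t not in replace_tags]
    merged += [(t, v) for t, v in incoming if t in replace_tags]

    # Insert incoming 505/520 fields unless an identical note is already present
    # (assumption: near-duplicates still need a human reviewer).
    seen = {(t, norm(v)) for t, v in merged if t in insert_tags}
    for t, v in incoming:
        if t in insert_tags and (t, norm(v)) not in seen:
            merged.append((t, v))
            seen.add((t, norm(v)))

    # Return fields in tag order; the sort is stable, so repeated tags keep their order.
    return sorted(merged, key=lambda f: f[0])
```

Because only four tags are touched, local edits in other fields (for example a 590 local note) survive the load, which is the reason given above for not overlaying whole records.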

Sample Report

MU OCLC Bib Notification statistics

• Time: Averages about two minutes per enriched record • Statistics: February 2011

Pub dates    Processed    New content
-1899        21           19
1900-1949    191          186
1950-1979    620          598
1980-1989    132          129
1990-1999    151          95
2000-2009    185          116
2010         66           61
Total        1366         1204
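As a rough check on the February 2011 figures, the sketch below recomputes the totals, the share of processed records that gained new content in each date range, and the staff time implied by the two-minute average. The names are illustrative, and the time estimate assumes the average applies evenly to every enriched record.

```python
# Figures copied from the February 2011 statistics: (processed, new content).
stats = {
    "-1899": (21, 19),
    "1900-1949": (191, 186),
    "1950-1979": (620, 598),
    "1980-1989": (132, 129),
    "1990-1999": (151, 95),
    "2000-2009": (185, 116),
    "2010": (66, 61),
}


def new_content_share(processed, enriched):
    """Percentage of processed records that received new content."""
    return round(100 * enriched / processed, 1)


shares = {dates: new_content_share(p, n) for dates, (p, n) in stats.items()}
total_processed = sum(p for p, _ in stats.values())
total_enriched = sum(n for _, n in stats.values())

# Assumption: about two minutes of staff time per enriched record.
MINUTES_PER_RECORD = 2
estimated_hours = total_enriched * MINUTES_PER_RECORD / 60

for dates, share in shares.items():
    print(f"{dates:>10}: {share}%")
print(f"Total: {total_enriched} of {total_processed} records, about {estimated_hours:.0f} hours")
```

The per-range shares make it easy to see which publication periods benefit most from the notification reports.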

OCLC Bib Notification Sample information

Enhancing Ebooks with TOC

• Question: How do we enhance ebooks with table of contents?

• Next best thing to full-text searching • Due to budget reasons, do not send out e-resources for enhancement (table of contents, summaries) • Policy not to duplicate print titles • Selectors would appreciate this added service

Solution

• Most vendors provide table of contents listings

How we did it

• Used Microsoft Word macro feature • Copy table of contents • Paste into Microsoft Word (set options to keep text only, which eliminates HTML coding) • Show codes (paragraph symbols, etc.) • Initially: Use a series of find/replace operations to format TOC correctly and save this into macro • Now: Use Visual Basic coding in macro program

Example after pasted into Word

After macro is run: Ready for 505

Preface, Sponsors and Organizing Committees -- Effects of Surface Active Element on the Biocompatibility of High Nitrogen Stainless Steel -- Quench Brittleness of 12%Cr Martensitic Heat-Resistant Steel -- The Formation and Occurrence of Non-Metallic Inclusions of Si-Doped Steel during Continuous Casting -- Microstructure and Mechanical Properties of Molybdenum Alloy Strengthened by Lanthanum Oxide and Silicon -- AZ80 Mg Alloy Synthesized by Spray Forming and its Extrudability -- Effect of Carbon Migration on Sulfide Stress Corrosion Cracking Behavior of Dissimilar Joints in Wet H2S Environment -- Microstructures and Mechanical Properties of Dissimilar Metal Weld A508/52M/316L Used in Nuclear Power Plants
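The find/replace step that produces a string like the one above can be sketched in a few lines. This is a Python illustration of the idea, not the Word macro MU uses: `toc_to_505` is a hypothetical name, and it assumes the pasted text has one TOC entry per line.

```python
def toc_to_505(pasted_text: str) -> str:
    """Turn a pasted, one-entry-per-line table of contents into a 505-style string."""
    entries = []
    for line in pasted_text.splitlines():
        # Collapse tabs and runs of spaces left over from the web page.
        cleaned = " ".join(line.split())
        if cleaned:                      # skip blank lines between entries
            entries.append(cleaned)
    # MARC 505 notes separate titles with space-hyphen-hyphen-space.
    return " -- ".join(entries)
```

Diacritics, subscripts, and capitalization are left untouched here, so manual cleanup after the automated pass would still be needed.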

Before Macro

After Macro: Ready to Insert into Millennium

Public display in MERLIN

Advantages

• Great keywords, particularly for conference proceedings • Provides user with info about contents • If adding into OCLC records (new or already existing), enrich the record for everyone • We’re able to provide quality customer service for users and public services without spending extra (besides human resources) • Ebook records often better than print (need FRBR enhanced catalog)

Disadvantages

If adding only to MERLIN record (such as for titles from record loads) • Only enhances it locally (not internationally) • We use non-standard 970 (but does display better) • Dependent on browser version (Firefox) • If browser upgrades/changes, then macro may be affected (re: upgrade to Firefox 4.0) • Still often need to do manual changes after running macro (diacritics, subscripts, etc.) • Can be time-consuming & requires hard decisions about what’s most important (cost/benefit ratio).

• Accept case of TOC • Sometimes only include titles (not authors)

Record Enrichment at MU: Fiction

1) Focused on one publisher: HarperCollins.

• Limited search in MERLIN Catalog to genre = fiction and the publisher • Searched HarperCollins web site for each title on the list and added book descriptions to MERLIN record (MARC 520) when found (cut and paste) • 520 “Description.”—Publisher’s website.

• Generally the book descriptions were about three paragraphs of text • Added 47 summaries for books published in 1990s and 2000s • Focusing on one publisher made this a fairly quick process

Fiction: HarperCollins Sample information

Record Enrichment at MU: Fiction

2) Added subject headings, summaries, and TOC from WorldCat • Focus was on titles published in 1980s and 1990s • Searched by OCLC number • If new information was available, added constant data and exported the record (load table set to only load TOCs, summaries, and subject headings) • Mostly added subject headings • 110 records searched. 63 had new subject headings, 24 had summaries, and eight had TOCs. 45 records had no enhancements. (65 records enriched) • About two hours of time (32 records per hour)

Fiction Record that has not been enriched

Fiction Record that has been enriched

Summary and three subject headings

Record Enrichment at MU: Fiction

3) Focus on authors listed on a course syllabus • Worked from a list of authors on a creative non-fiction writing syllabus • E.g., David Foster Wallace and David Sedaris • Found records needing enrichment and added information from WorldCat Identities and WorldCat bibliographic records

Fiction: Course Syllabus Sample information

WorldCat Identities

• David Sedaris: http://www.worldcat.org/identities/lccn-n94-15692
• Information is aggregated from WorldCat records
• Includes:
  • Overview of works
  • Genre
  • Subject headings
  • Book covers
  • Time line showing publications about David Sedaris and publications by David Sedaris
  • Summaries
  • Links to LC authority file, VIAF, and Wikipedia entry

Record Enrichment at MU: Fiction

4) Added links to WorldCat Identity pages • These pages include information on an author’s works, genres, subject headings, and a timeline of publications by and about the person • Added links to bib record (MARC 856) • Created a Millennium macro to speed up the process • 22 added • Downside: added a link; information is not embedded in the record
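The link-adding step could look something like the sketch below, which builds a display-form 856 field for a given Identities URL. It is not the Millennium macro: `identities_856` is a hypothetical helper, and the indicator values (4 for HTTP, 2 for related resource) and the wording of the `$z` public note are assumptions about local practice.

```python
def identities_856(identity_url: str, author_name: str) -> str:
    """Build a display-form MARC 856 field linking to a WorldCat Identities page."""
    # First indicator 4 = HTTP access; second indicator 2 = related resource
    # (assumed here, since the page describes the author rather than the item).
    note = f"WorldCat Identities page for {author_name}"
    # $u holds the URL and $z a public note shown as the link text.
    return f"856 42 $u {identity_url} $z {note}"
```

Calling `identities_856(url, "David Sedaris")` with the author's Identities URL yields a single field ready to paste into the record.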

Record Enrichment at MU: Special Collections

5) Enhanced Special Collections material • Limited to monographs in Special Collections about the Civil War • Searched WorldCat and the internet to find information • Found a resource at UNC with a lot of information: Documenting the South. Great summaries and outlines for books. Added links to this site. • Information added to 20 records

Fiction: Special Collections Sample information

Record Enrichment at MU: LC links

6) Transferred Library of Congress digital TOCs into AACR2 format for inclusion in bibliographic records (MARC 505) • Text was reformatted in Notepad • Took three to ten minutes per record. Using a macro would speed this up.

Note: “A human cannot compete with an automated process.”

Record Enrichment at MU: Google Books API

Book cover and link to Google Books: * Preview and search inside book * Full text
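One way to retrieve a cover image and preview link is an ISBN lookup against the Google Books API. The sketch below assumes the public Books API v1 `volumes` endpoint and its JSON field names (`imageLinks.thumbnail`, `previewLink`, `accessInfo.viewability`); it is not necessarily the mechanism used in MERLIN, and the function names are hypothetical. It only builds the request URL and reads an already-parsed response, so no network call is made.

```python
from urllib.parse import urlencode

VOLUMES_ENDPOINT = "https://www.googleapis.com/books/v1/volumes"


def volumes_query_url(isbn: str) -> str:
    """URL for an ISBN lookup against the Books API volumes endpoint."""
    return f"{VOLUMES_ENDPOINT}?{urlencode({'q': 'isbn:' + isbn})}"


def cover_and_link(response: dict):
    """Pull the cover thumbnail, preview link, and viewability from a parsed response.

    Returns None when the lookup found no matching volume.
    """
    items = response.get("items") or []
    if not items:
        return None
    info = items[0].get("volumeInfo", {})
    access = items[0].get("accessInfo", {})
    return {
        "cover": info.get("imageLinks", {}).get("thumbnail"),
        "preview": info.get("previewLink"),
        # Distinguishes partial preview from full text in the API's own terms.
        "viewability": access.get("viewability"),
    }
```

The `viewability` value is what would let an OPAC label the link as a preview or as full text.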

MU Indexing Rules

• Tables of contents • 970 $t Title and keyword indexes • Summaries and other notes • Keyword index • Subject headings • LCSH and MeSH: Subject and keyword indexes • Other subject heading fields: Keyword index • Genre/Form terms • Genre/form and keyword indexes

Where To Start

• Commercial or in-house • Type of content to add • Select material • By date • Rare books • Fiction • Items being sent to remote storage • Non-major publishers, e.g., self-published titles • Books with chapters by different authors (edited books) • Music material • Theses, dissertations, and other local material • Foreign language material

Where To Start

• If in-house, add to WorldCat or just to local catalog(s) • WorldCat Enrichment • See OCLC Bibliographic Formats and Standards, Chapter 5.3 on Database Enrichment. • http://www.oclc.org/bibformats/en/quality/default.shtm#database enrichment • Allow user tags, reviews, and ratings

Where To Get Information

• WorldCat • Publisher catalogs and web sites • WorldCat Identities • Best Free Reference Web Sites • Book reviews • Other ideas?

Enriched Records …

• Researchers can and do use the catalog the way an entire library is used, not only as a source of material and information, but also as a gateway to additional information. Through adding more keyword-rich information to the catalog, libraries can serve the extended information needs of the researcher as well as offer structured pathways to their own information resources. (Byrum, 2006)

The end • Thanks to Mary Aycock for sharing information on the ebook macros she created!