Transcript Slide 1

Unbundling the ILS @ NCSU:
implementation of an e-commerce search solution
Emily Lynema
Andrew K. Pace
North Carolina State University Libraries
LITA 2006 National Forum
Or better yet…
Endeca: implementing a faceted
search solution for the library catalog
Agenda
The Context:
  Next Gen Search Tools vs. OPAC Problems
Local Implementation
  Why, What and How?
  Challenges Encountered
Assessment
  Usage Statistics
  Usability Testing
The Future
The Context
Online Catalogs
"Most integrated library systems, as
they are currently configured and
used, should be removed from public
view."
- Roy Tennant, CDL
Next gen search tools

Proving that it’s possible to improve the
search experience beyond the
functionality that traditional OPACs have
supported.
NextGen Library Search Tools
WorldCat.org (Beta)
RedLightGreen (RLG), subsumed by WorldCat
OCLC Fictionfinder
Vivisimo clustered search (Serials Solutions, Ex Libris)
Aquabrowser visual context
Endeca ProFind
Innovative Interfaces “OPAC Pro” and “Encore”
Ex Libris “Primo”
Polaris, AJAX-Enabled OPAC
SirsiDynix Enterprise Portal System, FAST
Talis, et al.: Web Services
EBSCO Research Databases
Georgia PINES, Koha, and the Library 2.0 Bandwagon
And of course the entire commercial web
Existing catalogs are hard to use
Known item searching works pretty well (sometimes), but …
Lots of topical searches and poor subject access
  keyword gives too many or too few results – leads to general distrust among users
  authority searching is under-utilized and misunderstood
Relevance = system sort order
Impossible to browse the collection
Unforgiving on spelling errors, stemming
Response time doesn’t meet expectations of web-savvy users
Valuable metadata is buried
Subject headings are not leveraged in keyword searching
  they should be browsed or linked from, not searched
Data from the item record is not leveraged
  should be able to easily filter based on user’s changing requirements using item type, location, circulation status, popularity
What’s the big picture?
Improve the quality of the library catalog user experience
Exploit our existing authority infrastructure (aka make MARC data work harder)
Build a more flexible catalog tool that can be integrated with discovery tools of the future.
What is Endeca?
Software company based in Cambridge, MA
Search and information access technology provider for a number of major e-commerce websites
Developers of the Endeca Information Access Platform
Why Endeca?
Customized relevance ranking of results
Better subject access by leveraging available metadata (including item level data!) through facets
Improved response time
Enhanced natural language searching through spell correction, etc.
Browse
Local Implementation
Demo
Relevance ranking
Based on locally customizable algorithm:
Most relevant: query as entered
For multi-term searches: phrase match
Field match
  title match more relevant than notes match
Other factors:
  number of fields matched
  weighted frequency (tf/idf)
  static ordering (publication date, circulation stats)
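The tiered ranking described on the relevance ranking slide can be illustrated with a small sketch. This is not Endeca's algorithm or NCSU's configuration: the `Record` class, the `FIELD_WEIGHTS` values, and the sample records are all hypothetical, and tf/idf weighting is left out. It only shows the idea of comparing results tier by tier: phrase match first, then the best field matched, then the number of fields matched, then a static ordering by publication year.

```python
from dataclasses import dataclass

# Hypothetical field weights: a title match outranks a notes match.
FIELD_WEIGHTS = {"title": 3, "subject": 2, "notes": 1}


@dataclass
class Record:
    title: str
    subject: str
    notes: str
    pub_year: int


def rank_key(record: Record, query: str) -> tuple:
    """Build a sort key for one record; larger tuples rank higher."""
    q = query.lower().strip()
    terms = q.split()
    fields = {
        "title": record.title.lower(),
        "subject": record.subject.lower(),
        "notes": record.notes.lower(),
    }
    # Tier 1: for multi-term queries, does the whole phrase appear in a field?
    phrase = int(len(terms) > 1 and any(q in text for text in fields.values()))
    # Tier 2: weight of the best field that contains any query term.
    matched = [name for name, text in fields.items()
               if any(term in text for term in terms)]
    best_field = max((FIELD_WEIGHTS[name] for name in matched), default=0)
    # Tier 3: number of fields matched. Tier 4: static ordering by year.
    return (phrase, best_field, len(matched), record.pub_year)


def rank(records, query):
    """Return the records that match at least one term, best first."""
    hits = [r for r in records if rank_key(r, query)[1] > 0]
    return sorted(hits, key=lambda r: rank_key(r, query), reverse=True)


# Made-up sample records for illustration only.
SAMPLE = [
    Record("The war, a civil account", "History", "", 2000),
    Record("Gardening", "Plants", "mentions the civil war", 2005),
    Record("Civil war history", "United States", "", 1990),
]
```

Because Python compares tuples element by element, a phrase match in the notes field still outranks a title match that contains the terms only separately, which mirrors the ordering of the tiers on the slide.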
Faceted browse
Combine search and browse in single interface (Guided Navigation™)
Filter results across multiple facets
Remove facets in any order
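A minimal sketch of the faceted browse behaviour described above, assuming records are plain dictionaries. The facet names come from the slides; the record values and the `refine` and `facet_counts` helpers are hypothetical and are not part of any Endeca API.

```python
from collections import Counter

# Toy records; facet names mirror the slides, values are made up.
RECORDS = [
    {"title": "Record 1", "Format": "Book", "Language": "English", "Library": "Main"},
    {"title": "Record 2", "Format": "Book", "Language": "Spanish", "Library": "Main"},
    {"title": "Record 3", "Format": "DVD", "Language": "English", "Library": "Branch"},
    {"title": "Record 4", "Format": "Book", "Language": "English", "Library": "Branch"},
]


def refine(records, selections):
    """Keep only records that match every selected facet value."""
    return [r for r in records
            if all(r.get(facet) == value for facet, value in selections.items())]


def facet_counts(records, facet):
    """Count the values of one facet across the current result set."""
    return Counter(r[facet] for r in records if facet in r)


# Filter across multiple facets at once.
selections = {"Format": "Book", "Language": "English"}
narrowed = refine(RECORDS, selections)

# Remove a facet in any order: drop the first selection, keep the second.
del selections["Format"]
widened = refine(RECORDS, selections)
```

Storing the selections as an unordered mapping is what makes it possible to remove any facet without replaying the others in sequence.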
Facet refinements
Availability
Author
Library
Format
Language
New
LC Classification
Subject: Topic
Subject: Genre
Subject: Region
Subject: Era
True browse

Regain ability to browse catalog without
entering any search terms
Added search tools

Automatic spell correction

“Did you mean…” suggestions

Automatic stemming
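A “Did you mean…” suggestion can be approximated with the Python standard library, as a rough illustration only. The sketch below uses `difflib.get_close_matches` against a hypothetical vocabulary; it makes no claim about how Endeca implements spell correction, and it does not cover stemming.

```python
import difflib

# Hypothetical vocabulary, standing in for terms found in the catalog index.
VOCABULARY = ["history", "chemistry", "engineering", "architecture", "textiles"]


def did_you_mean(term, vocabulary=VOCABULARY, cutoff=0.8):
    """Suggest the closest indexed term, or None if no suggestion applies."""
    term = term.lower()
    if term in vocabulary:
        return None  # spelled as indexed, nothing to suggest
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

The `cutoff` value is an arbitrary choice for this sketch; a lower value produces more suggestions, including less plausible ones.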
The nitty gritty
Endeca co-exists with SirsiDynix Unicorn ILS and Web2 online catalog
  Endeca handles keyword search
  Web2 handles authority search and detail page display
Endeca indexes MARC records exported nightly from Unicorn
Endeca = discovery portion of the ILS
Technical overview
[Diagram: Information Access Platform. Labels: Raw MARC data; NCSU exports and reformats; Flat text files; Data Foundry (parse text files); Indices; MDEX Engine; HTTP; NCSU Web Application. The same diagram appears on two further slides with the highlights "Offline - Nightly" and "Always Online".]
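The "exports and reformats" stage of the diagram, which turns MARC data into flat text files, might look roughly like the sketch below. Everything here is an assumption made for illustration: the slides do not describe the flat-file layout, so the pipe-delimited format, the `TAG_TO_PROPERTY` mapping, and the `flatten` helper are hypothetical. Only the MARC tags themselves (001 control number, 100 main entry, 245 title statement, 650 topical subject) are standard. Records are assumed to be already parsed into a mapping from tag to a list of values.

```python
# Hypothetical mapping from MARC tags to flat-file property names.
TAG_TO_PROPERTY = {"245": "Title", "100": "Author", "650": "Subject"}


def flatten(record):
    """Turn one parsed record (tag -> list of values) into flat text lines."""
    record_id = record["001"][0]  # MARC control number identifies the record
    lines = []
    for tag, prop in TAG_TO_PROPERTY.items():
        for value in record.get(tag, []):
            # One line per value, so repeated fields such as 650 are kept.
            lines.append(f"{record_id}|{prop}|{value.strip()}")
    return lines


# Made-up record for illustration only.
EXAMPLE = {
    "001": ["123"],
    "245": ["Faceted search "],
    "650": ["Catalogs", "Searching"],
}
```

Emitting one line per value keeps repeatable fields such as subject headings intact, which matters if they are later exposed as facets.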
Implementation team
Seven member team
  5 IT/DLI staff, 1 cataloging librarian, 1 reference librarian
  As a team: functional requirements, metadata, interface issues (total of 40-60 hours)
  Java-trained IT librarian (~40 hrs/wk for 14 weeks)
  IT project manager: (~10 hours/wk for 20 weeks)
Timeline
  License / negotiation: Spring 2005
  Software acquisition: Summer 2005
  Implementation: Aug 2005 to Jan 2006
Local decision points
Identifying appropriate facets
Local decision points


Identifying appropriate facets
Designing the user interface
1. Availability
2. Library of Congress Classification
3. Subject: Topic
4. Subject: Genre
5. Format
6. Library
7. Subject: Region
8. Subject: Era
9. Language
10. Author
Local decision points
Identifying appropriate facets
Designing the user interface
Integrating authority searching and Endeca keyword searching
Pre-Endeca Catalog Search
• 6 search tabs
• 14 radio buttons
• 1-4 drop down boxes
• Title begins with search default
Post-Endeca catalog search
• 3 search tabs
• No radio buttons
• 2 search boxes
• Keyword search default
Endeca keyword
Web2 authority
Local decision points
Identifying appropriate facets
Designing the user interface
Integrating authority searching and Endeca keyword searching
Creating the relevance ranking algorithm for each field index
Special challenges encountered
ILS data with MARC-8 encoding => Text data with UTF-8 encoding
Data consistency between ILS and Endeca catalog indexes (updates!)
Data issues revealed by exposing metadata (ex: subject headings) in facets
Assessment
Usage statistics
Requests by Search Type, July - September 2006
Search: 68%
Search + Navigation: 21%
Navigation: 11%
Usage statistics
Navigation by Facet: July - September 2006
[Bar chart of requests per facet; axis from 0 to 60,000 requests. Facets in the order shown: Subject: Topic; LC Classification; New; Format; Library; Subject: Genre; Author; Subject: Region; Language; Subject: Era; Availability]
Usage statistics
Navigation by Facet: July - September 2006
Subject: Topic 24%
LC Classification 22%
New 14%
Format 10%
Library 8%
Subject: Genre 6%
Other (≤ 5%) 16%
Usage statistics
Navigation by Facet: July - September 2006
[Bar chart of requests per facet; axis from 0 to 60,000 requests. Facets in the order shown: Availability; LC Classification; Subject: Topic; Subject: Genre; Format; Library; Subject: Region; Subject: Era; Language; Author]
Usability testing
10 undergraduate students
  5 with new Endeca-based interface
  5 with old catalog interface
Identical searching tasks
Data collected
  Task difficulty/failure
  Task duration
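Usability testing
Task Difficulty: New Catalog and Task Difficulty: Old Catalog
[Two pie charts. The extracted labels do not show which chart each value belongs to. Grouped so that each chart sums to 100%: Easy 59%, Medium 12%, Hard 7%, Failed 22%; and Easy 43%, Medium 12%, Hard 22%, Failed 23%]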
Usability testing
Average Task Duration: Old vs New Catalog
[Bar chart comparing Old Catalog and New Catalog for Task 1 through Task 10; duration axis from 00:00.0 to 03:36.0]
Usability testing
For students, relevance ranking is key.
  March 2006: ~13% continue to page 2
Faceted browsing is intuitive, even for students who don’t use it.
Beware of library jargon
  “keyword anywhere”, “keyword in subject”
User behavior is influenced by previous experience.
Relevance
Are search results in Endeca more likely to be relevant to a user’s query than search results in old OPAC?
100 topical user searches from 1 month in Fall 2005
How many of top 5 results relevant?
  40% relevant in Web2 OPAC; 31 no hits
  68% relevant in Endeca catalog; 12 no hits
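The "how many of top 5 results relevant?" measure can be sketched as follows. The slides do not say exactly how the percentages were computed, so this is only one plausible reading: it assumes each search has a list of relevance judgments for its top results, that no-hit searches are counted separately, and that the share is taken over the results actually judged. The `summarize` function and its input shape are hypothetical.

```python
def summarize(judgments, top_n=5):
    """Summarize relevance judgments for a set of searches.

    judgments: one list of booleans per search, True where a returned
    result was judged relevant; an empty list means the search had no hits.
    Returns (share of judged top results that were relevant, no-hit count).
    """
    no_hits = sum(1 for per_search in judgments if not per_search)
    judged = [flag for per_search in judgments for flag in per_search[:top_n]]
    share = sum(judged) / len(judged) if judged else 0.0
    return share, no_hits
```

Running the same function over judgments from the old and new catalogs would give two comparable pairs of figures, in the spirit of the comparison reported on the slide.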
The Future
Future directions
Experiment with FRBR search/display through partnership with OCLC.
Update circulation status throughout the day.
Integrate catalog w/other tools through web services:
  OpenSearch, RSS
Enrich catalog through external web services:
  book jackets, reviews, etc. – Amazon/OCLC
Build modular shopping cart functionality.
Use Endeca to index local collections.
From the Calhoun report

"If one accepts the premise that library
collections have value, then library leaders
must move swiftly to establish the catalog
within the framework of online information
discovery systems of all kinds. Because it is
catalog data that has made collections
accessible over time, to fail to define a strategic
future for library catalogs places in jeopardy
the legacy of the world's library collections
themselves. For this reason, the option of
rejecting library catalogs is not considered in
this report."
So what? It’s still just a catalog
Serials
A&I / FT DBs
Metasearch
ERM Systems
GS
Catalog
Guided
Navigation
Digital
Repositories
Web
Legacy ILS
IR
Strong to our finish
“Too often, we have an "eat your spinach"
message about the library: come to the
library, it is good for you.”
Lorcan Dempsey, OCLC
Moving in a new direction
OLD SEARCH MODEL
NEW SEARCH MODEL
Things to read
Rethinking how we provide bibliographic services for the University of California by the Bibliographic Services Task Force
http://libraries.universityofcalifornia.edu/sopag/BSTF/Final.pdf
The Changing nature of the catalog and its integration with other discovery tools by Karen Calhoun
http://www.loc.gov/catdir/calhoun-report-final.pdf
The Changing nature of the catalog and its integration with other discovery tools: A Critical review by Thomas Mann
http://www.guild2910.org/AFSCMECalhounReviewREV.pdf
A “Next Generation Catalog”, Eric Morgan
http://dewey.library.nd.edu/morgan/ngc/
Metadata Research Center, SILS
http://ils.unc.edu/mrc/
University of Rochester eXtensible Catalog
http://www.extensiblecatalog.info/
Toward a 21st Century Catalog, ITAL, Sept. 2006, Antelman, Lynema, and Pace
http://www.lib.ncsu.edu/endeca/publications/antelman_lynema_pace.pdf
Thanks
NCSU project site:
  http://www.lib.ncsu.edu/endeca
Andrew K. Pace
  Head, Information Technology
  [email protected]
Emily Lynema
  Systems Librarian for Digital Projects
  [email protected]