Aggravation and Aggregation:
A Sweet Story About Statistics
Lauren Fancher, GALILEO
University System of Georgia
ER&L, January 2010
Lies, @#$% Lies
This is not a sweet story.
But you are: about you
Photograph of bookmobile, Bainbridge, Decatur County, Georgia, between 1936 and 1938.
Reported to be the first bookmobile in Decatur County.
The woman in the middle is the librarian, Mrs. Freddy Campbell.
Vanishing Georgia, Georgia Division of Archives and History, Office of Secretary of State.
Vanishing Georgia Database, Digital Library of Georgia, GALILEO
Lots of Practice
• GALILEO, Georgia’s Virtual Library
• Provides databases and a research portal with
federated search, SFX, and EZproxy to over
2000 institutions via 5 audience-specific
interfaces and 400+ profiles, which provide
the aggregation points for usage data
• 1995 to 2010
Those Happy, Golden Years:
1995-1998
• GALILEO originally built on OCLC’s SiteSearch
software
• Z39.50 searches of both local (ProQuest and
EBSCO) and vendor-hosted (OCLC) collections
• Detailed information about sessions, types of
searches, indexes searched
• Perfect correlation between use and data
Were any of you still in high school?
Vendor Web Interfaces: 1998 to ?
Bainbridge, ca. 1928. Service Drug Co., the Rexall Store, owned by Julian B. Ehrlich,
was located at 124 East Broughton Street. Items carried by the store, as advertised by being
painted on the store's side wall, are: cold drinks, candy, cigars, fountain pens, stationery, perfumes,
Eastman Kodaks, sundries, and novelties.
Vanishing Georgia, Georgia Division of Archives and History, Office of Secretary of State.
Vanishing Georgia Database, Digital Library of Georgia, GALILEO
Familiar Problems?
• From 1998 to 2002, vendor data proved elusive
and impossible to capture
• Many vendors did not and do not provide data on
the usage of their products
• Consortial reporting features were and are not
widely available
• Data elements were and are not consistent from
vendor to vendor
• The disappearance of data from the aggregated
repository belied the actual use, undermining
accountability reporting
The Dark Cloud
• Unavailable or uncollectable data
– Consortial aggregation issues
• Too much information
• Not enough information
– Limitations of vendor reporting tools
• Only at an institution level
• Only at a product level, or only at a platform level
• Data utility and definition (hits vs. searches)
• Unique product features (video, topic overviews,
ebooks)
Perils
• The fiend of inconsistency
• Changes in subscriptions or institutions require new
effort or revisions to methods
• Access to previous statistics for products no longer
available or even in existence
• Every change a vendor makes to a database may
impact their statistical reports
• Staff turnover
• New services, new data: SFX, federated search,
ebooks, ejournal packages
Statistics
Gathering and Consolidation Process
Vendor Data
Vendor Data
Usage Data Repository
•Sessions
•Days, Months, Years
•Locally-Loaded Database Searches, Full-Text,
Citations, Indexes Searched, and More
•Digital Library of Georgia Usage of Collections
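The consolidation step can be pictured as mapping each vendor's column names onto the repository's data types, then summing by institution, database, and month. What follows is a minimal sketch only: the vendor names, field names, and record layout are hypothetical and are not taken from the GALILEO system.

```python
from collections import defaultdict

# Hypothetical mappings from each vendor's column names to repository data types.
FIELD_MAP = {
    "vendor_a": {"Searches": "searches", "FT Requests": "full_text", "Logins": "sessions"},
    "vendor_b": {"queries": "searches", "fulltext_views": "full_text"},
}


def normalize(vendor, row):
    """Yield (institution, database, month, data_type, count) for each mapped field."""
    for vendor_field, data_type in FIELD_MAP[vendor].items():
        if vendor_field in row:
            yield (row["institution"], row["database"], row["month"],
                   data_type, int(row[vendor_field]))


def consolidate(reports):
    """Sum normalized counts across all vendor reports."""
    totals = defaultdict(int)
    for vendor, rows in reports:
        for row in rows:
            for inst, db, month, data_type, count in normalize(vendor, row):
                totals[(inst, db, month, data_type)] += count
    return dict(totals)


# Illustrative input: two vendors reporting the same ideas under different names.
reports = [
    ("vendor_a", [{"institution": "inst1", "database": "db1", "month": "2009-07",
                   "Searches": "120", "FT Requests": "45", "Logins": "30"}]),
    ("vendor_b", [{"institution": "inst1", "database": "db2", "month": "2009-07",
                   "queries": "80", "fulltext_views": "12"}]),
]
totals = consolidate(reports)
```

In this sketch the second vendor supplies no session counts, so no session total exists for its database. That is the "not enough information" case rather than a zero.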
Definitions
GALILEO terms and their COUNTER definitions
COUNTER, Code of Practice, Version 3.0, Released August 2008
Appendix A: Glossary of Terms
http://www.projectcounter.org/code_practice.html
• Searches: 3.1.2.10 Search
A specific intellectual query, typically equated to submitting the search form of the
online service to the server (EBSCO, abridged)
• Citation Views: 3.1.2.6 Article header
That subsection of an article which includes the following information: publisher;
journal title, volume, issue and page numbers; copyright information; list of names
and affiliations of the authors; author organization addresses; title and abstract
(where present) of the article; keywords (where present)
• Links Chosen: 3.1.2.13 Link-out
Linking from one online resource to another. The act of clicking the link and
moving to a page on another site. Generally used to measure activity for
library-configurable links as might be found in a link server. The domain name of the
target of the link in the transaction to be recorded. (EBSCO)
• Sessions: 3.1.4.2 Session
A successful request of an online service. It is one cycle of user activities that
typically starts when a user connects to the service or database and ends by
terminating activity that is either explicit (by leaving the service through exit or
logout) or implicit (timeout due to user inactivity) (NISO)
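The implicit-termination part of the Session definition can be illustrated with a small counter that starts a new session whenever the gap between requests exceeds a timeout. This is a sketch only: the 30-minute timeout and the sample timestamps are assumptions, not values given in the slides.

```python
from datetime import datetime, timedelta


def count_sessions(timestamps, timeout=timedelta(minutes=30)):
    """Count sessions for one user.

    A new session starts at the first request and whenever the gap since
    the previous request exceeds the timeout (assumed here to be 30 minutes).
    """
    sessions = 0
    previous = None
    for ts in sorted(timestamps):
        if previous is None or ts - previous > timeout:
            sessions += 1
        previous = ts
    return sessions


# Illustrative requests: gaps of 10, 50, and 5 minutes.
requests = [
    datetime(2010, 1, 15, 9, 0),
    datetime(2010, 1, 15, 9, 10),
    datetime(2010, 1, 15, 10, 0),
    datetime(2010, 1, 15, 10, 5),
]
session_count = count_sessions(requests)
```

Only the 50-minute gap exceeds the assumed timeout, so these four requests count as two sessions. A vendor using a different timeout would report a different number for the same activity.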
Definitions
GALILEO term and its COUNTER definitions
COUNTER, Code of Practice, Version 3.0, Released August 2008
Appendix A: Glossary of Terms
http://www.projectcounter.org/code_practice.html
• Full-Text Views
– 3.1.2.1 Item (full-text article, TOC, abstract, database record)
A uniquely identifiable piece of published work that may
be: a full-text article (original or a review of other
published work); an abstract or digest of a full-text article;
a sectional HTML page; supplementary material associated
with a full-text article (e.g. a supplementary data set), or
non-textual resources, such as an image, a video, or audio.
– 3.1.2.1.1 Full-text item (full-text article, book chapter)
A category of ‘item’ such as a full-text journal article, a
book chapter, or an encyclopedia entry
– 3.1.2.2 Full-content unit
Journals: article
Books: minimum requestable unit, which may be the entire
book or a section thereof
Reference works: content unit appropriate to resource (e.g.
dictionary definitions, encyclopedia articles, biographies,
etc.)
Non-textual resources: file type as appropriate to resource
(e.g. image, audio, video, etc.) (ICOLC)
– 3.1.2.3 Article
An item of original written work published in a journal,
other serial publication, or in a book. An article is complete
in itself, but usually cites other relevant published works in
its list of references, if it has one.
Other related COUNTER definitions: PDF, HTML
What is Not Included?
• Data from vendors that provide only institution-specific
reports and vendors that do not provide statistics at all
• Yet-to-be mapped vendors
• Data that distinguishes between on- and offsite usage
• Lags: one month (Britannica) or two months
(Lexis Nexis) behind the current month
• Journal usage data at the journal title level
• Federated search data from the search service
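The reporting lags above determine which month is the most recent one a report can be expected to cover. A minimal sketch of that arithmetic follows. The lag values are the ones from the slide, while the function name and the sample date are illustrative assumptions.

```python
from datetime import date

# Lags in months behind the current month, as listed on the slide.
LAG_MONTHS = {"Britannica": 1, "Lexis Nexis": 2}


def latest_available_month(today, lag_months):
    """Return (year, month) of the most recent month expected to have data."""
    # Work with a zero-based month index so year rollover is handled by division.
    index = today.year * 12 + (today.month - 1) - lag_months
    return index // 12, index % 12 + 1


today = date(2010, 1, 15)
available = {vendor: latest_available_month(today, lag)
             for vendor, lag in LAG_MONTHS.items()}
```

For a report run in January 2010, this gives December 2009 for Britannica and November 2009 for Lexis Nexis.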
Sad or Wonderful?
• Britannica hits
• Federated searching
• EBSCO federated search gateway and WebFeat
• MetaLib IP
Old and New Reporting Tools
Original Reporting Tool
• http://dbs.galib.uga.edu/stats/html/stats.html
• Offers data repository collected from the GALILEO
system (1995-present) and database vendors (2002 to
present for most)
• Tool allows selection of institutions, databases, data
types, including date ranges (days, weeks, months)
• Reports group each data type separately (searches,
full-text, etc.)
• Reports output to screen (HTML) or as delimited text
Old and New Reporting Tools
New Reporting Tool
• http://www.galileo.usg.edu/stats
• Utilizes same data repository as original reporting tool (data
collected from GALILEO system (1995-present) and database
vendors (2002 to present for most))
• PHP, MySQL, webservice from repository, ChartDirector
• Provides a default landing page for each institution that shows
current month’s data and links to additional reporting tool options
• Reporting tool allows selection of institutions, databases, data
types, including date ranges (months, fiscal years)
• Reports output to screen (HTML) in graph or table format. Graphs
can be downloaded for use in documents. Tables utilize standard
column headings for data types (searches, full-text, etc.) and rows
for databases. Tables can be exported as delimited file (download).
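The table layout described above (one row per database, one standard column per data type, exportable as a delimited file) can be sketched as follows. The new tool is described as PHP and MySQL. Python is used here only for illustration, and the data-type names and sample records are assumptions.

```python
import csv
import io

# Assumed standard column headings for the data types.
DATA_TYPES = ["searches", "full_text", "sessions"]


def to_table(records):
    """Pivot (database, data_type, count) records into one row per database."""
    table = {}
    for database, data_type, count in records:
        row = table.setdefault(database, dict.fromkeys(DATA_TYPES, 0))
        row[data_type] += count
    return table


def export_delimited(table, delimiter="\t"):
    """Render the table as delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(["database"] + DATA_TYPES)
    for database in sorted(table):
        writer.writerow([database] + [table[database][t] for t in DATA_TYPES])
    return buffer.getvalue()


# Illustrative records only.
records = [
    ("db1", "searches", 120),
    ("db1", "full_text", 45),
    ("db2", "searches", 80),
]
output = export_delimited(to_table(records))
```

Using the same column headings for every database is what lets tables from different vendors be compared side by side. A missing data type appears as 0 in this sketch, though a real report might prefer to mark it as unavailable.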
Hope
• Pro: COUNTER for standardization and
SUSHI for retrieval already helping
institutions make decisions based on
cost-per-use and other analysis points
• Neutral: Funding for consortia too
complex to reduce easily
• Con: Adoption, complexity
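Cost-per-use, mentioned above as one analysis point, is the subscription cost divided by a usage count for the same period. A minimal sketch, with purely hypothetical figures and a choice of full-text views as the usage measure that is an assumption rather than GALILEO's stated method:

```python
def cost_per_use(annual_cost, uses):
    """Return cost divided by uses, or None when there was no recorded use."""
    if uses == 0:
        return None
    return annual_cost / uses


# Hypothetical figures for illustration only.
example = cost_per_use(annual_cost=12000.0, uses=48000)
```

The result depends heavily on which count is used as "uses" (searches, sessions, or full-text views), which is why consistent definitions across vendors matter for this kind of comparison.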
COUNTER Assessment Findings
• 11 reports defined in COUNTER Revision 3
• In FY09, GALILEO hosted 203 databases
available through 16 vendor platforms
• COUNTER Reports Most Commonly Available
from GALILEO Vendors
– Journal Report 1: Number of Successful Full-Text Article
Requests by Month and Journal (7 platforms, 178 databases,
88% of databases)
– Database Report 1: Total Searches and Sessions by Month
and Database (8 platforms, 179 databases, 89% of databases)
– Database Report 2: Turnaways by Month and Database (6
platforms, 116 databases, 57% of databases)
– Database Report 3: Total Searches and Sessions by Month
and Service (8 platforms, 179 databases, 89% of databases)
– No other reports are delivered in this format by vendors, with the exception of
Journal Report 3 at ProQuest
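The coverage percentages above are each report's database count divided by the 203 databases hosted in FY09. A short sketch of that arithmetic, using only the counts from the slide (the variable and function names are illustrative):

```python
TOTAL_DATABASES = 203  # databases hosted in FY09, per the slide

# Number of databases for which each COUNTER report is available.
REPORT_COVERAGE = {
    "Journal Report 1": 178,
    "Database Report 1": 179,
    "Database Report 2": 116,
    "Database Report 3": 179,
}


def coverage_percent(count, total=TOTAL_DATABASES):
    """Share of hosted databases covered by a report, as a percentage."""
    return 100 * count / total


percentages = {name: round(coverage_percent(count), 1)
               for name, count in REPORT_COVERAGE.items()}
```

Computed this way, 179 of 203 comes to roughly 88.2%, so the 89% shown for Database Reports 1 and 3 may rest on a different denominator or rounding.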
COUNTER Assessment Findings
• COUNTER Report Data Partially Available in
GALILEO
– Database Report 1: Total Searches and Sessions by Month
and Database
• COUNTER Report Similar to Current GALILEO
Reports, Not Available at Vendors
– Consortium Report 2: Total Searches by Month and
Database
Onward
• Next phase: GALILEO will provide Database Reports
1-3 and Consortium Report 2 as part of COUNTER
compliance and ICOLC endorsement. Data may not
be collected in this format from vendors.
• Additional COUNTER report elements need to be
accounted for in repository and collected, including:
– Publisher
– Platform
– Searches (federated and automated)
– Total sessions
– Sessions (federated and automated)
– Page Type (Database Turnaways)
– Service name
– Page Type (HTML)
– Page Type (PDF)
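One way to picture accounting for these elements in the repository is a record type with a field for each. This is a sketch under assumptions: the class name, field names, types, and the month format are hypothetical, and treating the three page types as counts is one interpretation of the list.

```python
from dataclasses import dataclass


@dataclass
class CounterUsageRecord:
    """Hypothetical repository record holding the additional COUNTER elements."""
    publisher: str
    platform: str
    service_name: str
    month: str  # assumed "YYYY-MM" format
    searches_federated_automated: int = 0
    total_sessions: int = 0
    sessions_federated_automated: int = 0
    turnaways: int = 0       # Page Type (Database Turnaways)
    html_requests: int = 0   # Page Type (HTML)
    pdf_requests: int = 0    # Page Type (PDF)


# Illustrative values only.
record = CounterUsageRecord(
    publisher="Example Publisher",
    platform="Example Platform",
    service_name="Example Service",
    month="2009-07",
    total_sessions=30,
    pdf_requests=12,
)
```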
Stuff
• [email protected]
• www.galileo.usg.edu