Implementing
The HKUST
Institutional
Repository
Diana Chan
Head of Reference
HKUST Library
Nov, 2005
2005 Library Conference: Balancing the
External and Traditional Libraries at the
Tamkang University, Taiwan
Library and Online Resources Technologies
2005 Conference at Xiamen University, PRC
Contents
1. Open Access and Institutional Repositories
2. HKUST IR
3. Software Selection
4. Planning and Policies
5. Strategies in Acquiring Content
6. Challenges
2
HKUST
Opened in 1991
4 schools (SSCI,
SENG, SBM, HSS)
450 faculty, 5,500
UGs, 2,800 PGs
Ranks 42 among the
top 200 universities
(2004 The Times
Higher Education
Supplement)
Library: 22 librarians,
75 support staff
3
1. Open Access and
Institutional Repositories
Technological and social trends that lead to the
Open Access Movement
Fruits of Open Access
What is an Institutional Repository?
Why create one?
4
Technological Trends
Increasing ease of sharing documents via FTP and Web (HTTP)
Enables researchers to “publish” their research results (working
papers, pre-prints, etc) in subject-specific, web-based open archives
for faster and wider dissemination
Individual scholars or institutions post abstracts and full-text
Social Science Research Network (SSRN)
IDEAS – Working papers in Economics
The success of such collections led to the Open Archives Initiative
(OAI) which promotes author self-archiving & interoperable standards
for file sharing
Major outcome: Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH)
5
Social Trends
“Serials Crisis”
Journal titles increasing
+
Prices rising
+
Library budgets cut
= Market dysfunction
(since the 1980’s)
Source ARL Statistics: Monographs and Serials Costs in ARL Libraries, 1986-2003
6
Open Access Movement Example
Scholarly Publishing and Academic Resources
Coalition (SPARC)
Sponsored by Association of Research Libraries
Endorsed by many different groups: Assoc. of American Universities,
Assoc. of Universities and Colleges of Canada, Australian Vice-Chancellors' Committee, etc.
Founded in 1997 to correct market dysfunction in
scholarly publishing
“Expand competition & support Open Access
to address high & rising journal costs”
7
A Fruit of OA Movement:
Open Access Journals
Refereed or peer reviewed
Emerging Infectious
Diseases
Journal of Machine
Learning Research
More in Directory of Open Access Journals (DOAJ)
8
A Fruit of OA Movement: OAIster
One searchable interface for open archives from 536 academic institutions
5.9 million documents: articles from Open Access journals; working
papers, discussion papers, & conference papers; dissertations & theses
+ All of the above & more from Institutional Repositories
9
A Fruit of OA Movement:
Institutional Repositories
Development of IRs gained momentum with
the release of two open source systems:
Eprints (U of Southampton)
DSpace (MIT)
Examples of Individual IRs
Australian National University Eprint
Repository
eScholarship Repository (U of California)
CalTech CODA
Institutional Archives Registry (468 as of Oct 5,
2005)
10
What is an Institutional Repository
(IR)?
A “digital collection capturing and preserving the
intellectual output of a single or multi-university
community”.
-
Adopted from “The case for institutional repositories: a SPARC
position paper” prepared by Raym Crow.
<http://www.arl.org/sparc/IR/ir.html>
11
Why Create the IR?
Budapest Open Access Initiative
http://www.soros.org/openaccess/index.shtml
Recommends 2 Strategies:
1. Self-archiving in Open Electronic Archives
2. Open Access Journals
12
Dual Open-Access Strategy
BOAI-2 ("gold"): Publish your article in a suitable
open-access journal whenever one exists.
BOAI-1 ("green"): Otherwise, publish your article
in a suitable toll-access journal and also self-archive it.
13
Must Satisfy Two Conditions
The author…grants to all users a free …right of
access to, and a license to copy, use,
distribute, transmit and display the work
publicly …
A complete version of the work is deposited
in…at least one online repository
- From the Berlin Declaration
14
Why We Created an IR at HKUST
To create a permanent record of the scholarly
output of HKUST
To make available and disseminate the scholarly
output of HKUST in a free and interoperable
digital format
To help the international Open Access effort.
Because the mission of disseminating
knowledge is only half complete if it is not widely
and readily available to society.
- Adapted from the Berlin Declaration
15
2. HKUST Institutional Repository
Collects,
disseminates, and
preserves in digital
format the scholarly
output of the HKUST
community
Uses DSpace
software, OAI-PMH
compliant, supports
Chinese
Easily discovered by
Internet search
engines and indexing
tools
http://library.ust.hk/repository/
16
Total Number of Documents (as of Oct 5, 2005)

Collection                                                        Size    %
Conference Papers                                                  579   26
Working Papers, Technical Reports, Research Reports, Pre-prints    534   25
Journal Articles                                                   493   23
Doctoral Theses                                                    394   18
Patents                                                             58    3
Presentations                                                       56    2
Book Chapters                                                       37    2
Miscellaneous                                                        8    1
Total                                                            2,159  (incl. 100 duplicates)
17
Contributors by Department
(as of Oct 5, 2005)
[Pie chart; shares by department:]
HSS&SOSC: 6%
OTHER: 12%
COMP: 21%
SBM: 13%
ELEC: 13%
OTHER SCI: 9%
PHY: 5%
MATH: 6%
OTHER ENG: 8%
MECH: 7%
18
Home Page of the HKUST
Institutional Repository
19
Browsing by Communities and Collections
20
Communities in HKUST IR
Accounting
Advanced Engineering
Materials Facility
Applied Technology Center
Atmospheric, Marine and
Coastal Environment Program
Biochemistry
Biology
Center for Enhanced Learning
and Teaching
Centre for Display Research
Chemical Engineering
Chemistry
Civil Engineering
Computer Science
Economics
Electrical and Electronic
Engineering
Finance
Humanities
Industrial Engineering and
Engineering Management
Information and System
Management
Institute of Nano Science and
Technology
Language Center
Library
Management of Organizations
Marketing
Mathematics
Mechanical Engineering
Physics
Social Science
21
To Find Papers by Authors
kwok y
22
23
The View of an IR Record
Click to
see full
text
24
Full Text in pdf Format
25
To Search in IR
Fill in keywords and
click Search
26
27
To Submit A Paper
Put in your
UST account
name and
password
28
Fill in the form, click
the “Submit” button
at the bottom of the
page
29
30
You will receive a
confirmation email
31
Access Data
32
3. Software Selection
The July/August 2004 issue of Library Technology Reports
on IR systems and functional requirements
We followed CalTech’s model and based our IR
on open source software with an OAI-PMH interface.
We evaluated 2 IR systems: EPrints and DSpace
33
DSpace
Jointly developed by MIT Libraries
and Hewlett-Packard Company
Open source software
Released on Sourceforge during
our system evaluation period in
late December 2002
Written in Java, with PostgreSQL
database, Lucene search engine,
and a Tomcat web servlet
container
34
DSpace
We chose DSpace in 2003 because:
DSpace began the development with the
experience gained from EPrints - the first and
most popular open source IR software at that
time
EPrints did not fully support Unicode
and was not Java- and servlet-based
Both EPrints and DSpace are open source
software, fulfill our functional requirements,
and follow state-of-the-art library standards
35
Current Configuration
of HKUST IR
As of Oct 5, 2005:
Home URL: http://repository.ust.hk/
IR Software: DSpace Version 1.2.1
System Software: Fedora Core 2 Linux; Tomcat 5.0.28; JDK1.4.2_05
Server: Intel Pentium 4 2.4GHz, 2GB RAM
Content: 2,059 documents from 40 communities
Usage: Documents were accessed 5,792 times in September 2005
36
Major Features
Data structure
Document submission form
Add item form
CJK support
OAI data provider
SRW/U interface
37
Data Structure
Document Types
journal articles, theses, etc,
Document Formats
Mainly PDF files; also contains PowerPoint files
DSpace data model
Communities (and sub-communities)
Collections
Items
Metadata
Bundles of bitstreams
HKUST implementation: Items are grouped by
Departments (i.e. communities)
then by Document Types (i.e. collections).
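The hierarchy above can be sketched as a handful of nested record types. This is only an illustration of the containment relationships named on the slide, not DSpace's actual Java classes; the class names, the `count_items` helper, the sample paper, and the bundle name are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    """A single stored file, for example one PDF."""
    name: str
    mime_type: str


@dataclass
class Bundle:
    """A named group of bitstreams belonging to one item."""
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    """An archived document: metadata plus bundles of bitstreams."""
    metadata: Dict[str, List[str]]
    bundles: List[Bundle] = field(default_factory=list)


@dataclass
class Collection:
    """In the HKUST layout, one collection per document type."""
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    """In the HKUST layout, one community per department."""
    name: str
    collections: List[Collection] = field(default_factory=list)
    sub_communities: List["Community"] = field(default_factory=list)


def count_items(community: Community) -> int:
    """Count items in a community, including its sub-communities."""
    own = sum(len(c.items) for c in community.collections)
    return own + sum(count_items(s) for s in community.sub_communities)


# Hypothetical sample data: one working paper filed under Physics.
paper = Item(
    metadata={"title": ["An example working paper"], "creator": ["Doe, J."]},
    bundles=[Bundle("ORIGINAL", [Bitstream("paper.pdf", "application/pdf")])],
)
physics = Community("Physics", [Collection("Working Papers", [paper])])
```

Grouping by department first and document type second means a department's full output is one subtree, which is what `count_items(physics)` walks.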
38
Document Submission Form
Faculty are not willing to do self-submission
DSpace’s submission and workflow functions
are too lengthy
In need of a simple and effortless submission
form - as a quick medium for submitting
documents
Written in Perl
Submitted data stored in DSpace “Simple
Archive Format”
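As a rough illustration of what storing a submission in Simple Archive Format involves, the sketch below writes one item directory containing a `dublin_core.xml` file, the uploaded file, and a `contents` file listing it. The HKUST form itself was written in Perl; this Python version, the `write_saf_item` name, and the sample metadata are assumptions for the example only.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def write_saf_item(item_dir, fields, files):
    """Write one Simple Archive Format item directory.

    fields: list of (element, qualifier, value) Dublin Core triples.
    files:  mapping of file name to file content (bytes).
    """
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)

    # dublin_core.xml holds one <dcvalue> per metadata value.
    root = ET.Element("dublin_core")
    for element, qualifier, value in fields:
        dcvalue = ET.SubElement(root, "dcvalue",
                                element=element, qualifier=qualifier)
        dcvalue.text = value
    (item_dir / "dublin_core.xml").write_bytes(
        ET.tostring(root, encoding="utf-8"))

    # Each submitted file is copied in and listed in "contents".
    names = []
    for name, data in files.items():
        (item_dir / name).write_bytes(data)
        names.append(name)
    (item_dir / "contents").write_text("\n".join(names) + "\n",
                                       encoding="utf-8")
    return item_dir


# Hypothetical submission, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    item = write_saf_item(
        Path(tmp) / "item_0001",
        [("title", "none", "A test paper"),
         ("contributor", "author", "Doe, J.")],
        {"paper.pdf": b"%PDF-1.4 placeholder"},
    )
    saved = sorted(p.name for p in item.iterdir())
    listed = (item / "contents").read_text(encoding="utf-8").split()
```

Writing to an intermediate directory like this keeps the faculty-facing form simple: library staff can review and enrich the metadata before anything is imported into the repository.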
39
Add Item Form
Is a locally developed JSP application to add
items to DSpace by library staff
Allows staff to:
Create new item from scratch
Enhance the metadata from faculty
submission and then add the item to
DSpace
40
41
CJK Support
CJK (Chinese, Japanese, Korean) Support
DSpace supports Unicode
Problem - Lucene search engine is unable to search
by CJK characters
Solved by replacing DSpace’s Tokenizer with a
CJKTokenizer - but has an interesting side effect
Problem - URL of query containing CJK characters is
not properly encoded
Solved by setting Tomcat URIEncoding="UTF8"
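Both problems can be illustrated with a small sketch. The first function approximates the idea behind a CJK-aware tokenizer: because Chinese text has no spaces between words, runs of CJK characters are indexed as overlapping two-character tokens. The second shows how a CJK query term is percent-encoded as UTF-8 bytes, which is the form the server must be told to decode. The character ranges and function names are simplifying assumptions; this is not Lucene's actual implementation.

```python
from urllib.parse import quote, unquote


def is_cjk(ch):
    """Rough test: CJK ideographs, kana, and Hangul syllables only."""
    code = ord(ch)
    return (0x4E00 <= code <= 0x9FFF      # CJK Unified Ideographs
            or 0x3040 <= code <= 0x30FF   # Hiragana and Katakana
            or 0xAC00 <= code <= 0xD7AF)  # Hangul syllables


def tokenize(text):
    """Split text into lowercase words and overlapping CJK bigrams."""
    tokens, run, run_is_cjk = [], "", False

    def flush():
        nonlocal run
        if not run:
            return
        if run_is_cjk and len(run) > 1:
            # Overlapping pairs: ABCD -> AB, BC, CD
            tokens.extend(run[i:i + 2] for i in range(len(run) - 1))
        else:
            tokens.append(run.lower())
        run = ""

    for ch in text:
        if ch.isalnum():
            cjk = is_cjk(ch)
            if run and cjk != run_is_cjk:
                flush()  # script changed, close the previous run
            run_is_cjk = cjk
            run += ch
        else:
            flush()
    flush()
    return tokens


def encode_query(term):
    """Percent-encode a query term as UTF-8 for use in a URL."""
    return quote(term, safe="")


bigrams = tokenize("香港科技大學 DSpace")
encoded = encode_query("科技")
```

A query for 科技 is tokenized the same way at search time, so it matches the 科技 bigram produced from the longer indexed string.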
42
43
44
OAI Data Provider
DSpace is OAI-compliant
This means that OAI harvesters can easily
collect the metadata (in Dublin Core format)
from various IRs (including HKUST’s) for their
added-value indexing/searching services.
For example: OAIster
OAI Path to IR at HKUST:
http://repository.ust.hk/dspace-oai/request?
45
http://repository.ust.hk/dspace-oai/request?verb=GetRecord& ... 1783.1/1805
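A harvester's side of this exchange might look like the sketch below: build a GetRecord request against the base URL shown on the slide, then pull the Dublin Core fields out of the XML response. The verb and the `oai_dc` metadata prefix come from the OAI-PMH specification; the record identifier and the sample response are hypothetical placeholders, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_BASE = "http://repository.ust.hk/dspace-oai/request"  # from the slide
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def get_record_url(identifier, base=OAI_BASE, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base}?{query}"


def dublin_core_fields(response_xml):
    """Collect Dublin Core elements from a response as name -> values."""
    fields = {}
    for node in ET.fromstring(response_xml).iter():
        if node.tag.startswith(DC_NS) and node.text:
            name = node.tag[len(DC_NS):]
            fields.setdefault(name, []).append(node.text.strip())
    return fields


# Hypothetical identifier and a hand-written, trimmed sample response.
url = get_record_url("oai:example.org:1234/5")
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <GetRecord><record><metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:title>An example title</dc:title>
      <dc:creator>Doe, J.</dc:creator>
      <dc:creator>Roe, R.</dc:creator>
    </oai_dc:dc>
  </metadata></record></GetRecord>
</OAI-PMH>"""
record = dublin_core_fields(SAMPLE)
```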
46
SRW/U Interface
Search and Retrieval for the Web (or by URL)
Retain core functionality of Z39.50 but in the
form of web services
This means search service providers can
broadcast a search to various IRs and deliver
the search results in their own GUI interface
SRW/U Interface for the IR at HKUST
Based on OCLC’s SRW/U software
URL: http://repository.ust.hk/SRW/
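The "search by URL" half of SRW/U can be sketched as plain URL construction. The parameter names below (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) are the standard SRU 1.1 request parameters; the exact endpoint path beneath the URL on the slide, the second server, and the CQL index name `dc.title` are assumptions for illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

SRU_BASE = "http://repository.ust.hk/SRW/"  # from the slide; the real
                                            # endpoint path may be longer


def sru_search_url(cql_query, base=SRU_BASE,
                   start_record=1, maximum_records=10):
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return f"{base}?{urlencode(params)}"


def broadcast(cql_query, bases):
    """Build the same search against several repositories."""
    return [sru_search_url(cql_query, base=b) for b in bases]


# Hypothetical query and a hypothetical second repository.
urls = broadcast('dc.title = "nanotechnology"',
                 [SRU_BASE, "http://ir.example.org/SRW/"])
decoded = parse_qs(urlsplit(urls[0]).query)
```

Because every server accepts the same parameters, a search service only needs the list of base URLs to fan a query out and can then merge the XML results into its own interface.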
47
The result of a SRW/U search, with XSLT transformation
48
Enhancements to DSpace
Document submission form
CJK searching problem
Subscript and superscript problem
Number of items displayed
Access data
Top 20
Recommend an item link
Faculty & staff link
49
4. Planning and Policies
Task Force – software, scope, policies, database
structure, problems, action plans
Information Services Committee – guidelines on
publications, publishers’ policies, data formats,
faculty concerns.
Library Administrative Committee – problems,
issues, final decision, strategies.
50
Work Team – Subject Librarians
[Workflow diagram: Dr. Samson Soong & subject librarians. Step labels: Liaise with Faculty; Check Pub List; Ascertain Pubs’ Policies; Harvest Document; Verify Document Version; Index Document. Branch labels: Correct Version; Incorrect Version. Output label: To Data Entry Staff.]
Work Team – Data Entry Staff
Verify and Convert
PDF Documents
Final Review
Input Metadata
Using Submission Form
Add Items to Repository
Set PDF Document Security &
Properties. Add Watermark for
Pre-published Version
Proof-Read
52
53
Guidelines on Different Publications
Type
Copyright
Action
Book chapter
Book
Conf paper
Conf proceed.
US Patent
Publisher
Need permission
Publisher, 50 years
Need permission
Author
Can archive
Publisher
Need permission
Public Domain
Author
Can archive
US Patents
Working Paper,
Technical Report
Author
Can archive
Presentation
Standard
Author
Can archive
Issuing Organization
No
54
SHERPA Summary of Publishers' Policies
55
Guidelines on Journal Articles
Columns: publisher’s policy (which versions may be archived). Rows: version available on hand.

Version on hand        No Arch.  Pub’s  Pre-Ref’ed  Post-Ref’ed  Both         All  Not Specified
Pre-Refereed Version   No        Yes    Yes         Yes          Yes          Yes  Ask Pub
Post-Refereed Version  No        Yes    No          Yes          Yes          Yes  Ask Pub
Publisher’s Version    No        Yes    No          Ask Faculty  Ask Faculty  Yes  Ask Pub
Guidelines on Publishers’ Policies
Studied publishers’ copyright & self-archiving
policies (SHERPA/RoMEO , Stevan Harnad’s
and publishers’ websites)
Constructed our own table for reference
Printout of publishers’ copyright statements and
date-stamped
Noted their acknowledgement or credit
requirement
57
Credit to Publisher
In the Rights field of a record:
APS copyright statement:
"[Journal title] © copyright (year) American
Physical Society. The Journal's web site is
located at http://....."
58
59
Other Policies
Withdrawal
Replacing Versions
Cooperation with User Groups
Authority Control
Indexing
Rights and Acknowledgement
60
5. Strategies in Acquiring Content
Our logic
How to Acquire by Type of Document?
How to Use Different Channels?
Sustainable Growth
61
Logic Behind Our Strategies
The research output is the University’s
intellectual property
Create a critical mass of papers
Copyright and self-archiving rights are our
concerns
Ascertain publishers’ policies
Ask permission from authors and publishers
Deal with publications which are easier to
obtain and sources which are more accessible
Those posted on the web
Those from publishers allowing published
versions
62
How to Acquire by
Type of Document?
1. Working Papers, Technical Reports, Research
Reports
2. Conference Papers
3. Conference Presentations
4. Theses
5. Book Chapters
6. Peer-reviewed Journal Articles
7. Open Access Journal Articles
63
Sources of Scholarly Content
Library
Collection
Researchers
Web
Scholarly
Content
Publishers
Journals
64
Copyright VS. Self-Archiving Rights
Copyrighted: Journal articles, book chapters, conference proceedings
Non-copyrighted: Working papers, theses, technical reports, presentations
Author’s
Permission
Author’s
Permission
Publisher’s &
Author’s
Permission
Archivable
University
Owned
Author
Owned
Publisher
Owned
Nonarchivable
Selected items to ask for
author’s & publisher’s
permission
Author’s permission
Department’s
Permission
65
Journal Articles
Journal Article
Check Author’s
Archiving Rights
No or Unclear
Ask
Publisher
Yes Publisher’s
Version
Harvest from
The Web
Yes Pre-refereed
Or Post-refereed
Version
Ask
Author
Deposit Into IR
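One reading of the flowchart above is a three-way branch on the author's archiving rights, with every branch ending in deposit. The sketch below encodes that reading; the enum and function names are hypothetical, and the assumption that deposit follows only once permission or a suitable version has been obtained is ours, not stated on the slide.

```python
from enum import Enum


class ArchivingRights(Enum):
    """Outcome of checking the author's archiving rights."""
    NO_OR_UNCLEAR = "no or unclear"
    PUBLISHERS_VERSION = "publisher's version may be archived"
    PRE_OR_POST_REFEREED = "pre- or post-refereed version may be archived"


def acquisition_step(rights):
    """Return the step the flowchart shows for each outcome."""
    if rights is ArchivingRights.NO_OR_UNCLEAR:
        return "Ask Publisher"
    if rights is ArchivingRights.PUBLISHERS_VERSION:
        return "Harvest from the Web"
    return "Ask Author"


def plan(rights):
    """Full path for one journal article.

    Assumption: the final deposit happens only after the middle step
    succeeds (permission granted or the right version obtained).
    """
    return ["Check Author's Archiving Rights",
            acquisition_step(rights),
            "Deposit Into IR"]


steps = plan(ArchivingRights.PUBLISHERS_VERSION)
```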
66
How to Use Different Channels?
1. Self Submission
2. Harvest from Websites (departmental, faculty, research centers)
3. Library Collection
   Conference proceedings
   Theses and dissertations
   University Archives
4. Harvest from the Source (databases, E-journals, Open Access publications)
5. Publishers
6. Liaisons with Faculty, departments, research centers
7. Public Relations
67
Electronic Thesis Approval Form
Student Agreement:
I hereby grant to the Hong Kong University of
Science and Technology Library the non-exclusive right to archive my thesis in digital
format, and make it freely accessible, such as
over the Internet.
Signed:
Date:
68
Publisher’s policy: Emerald
Emerald’s Principles on Copyright
Emerald seeks to retain copyright of the articles it
publishes, without the authors giving up their rights to
use their own material. Authors are not required to seek
permission to re-use their own work. As an author you
can use your paper in part or in full,…in another article
written for us or another publisher, on your website, or
any other use, without asking us first.
http://ninetta.emeraldinsight.com/pdfs/jarform.pdf
69
Collection Growth Milestones
[Chart: No. of Documents (0 to 1,800) by month, May 2003 to Sep 2004, annotated with these milestones:]
83 Research Centers
79 Univ. Archives
50 IOP papers
142 conference papers
35 papers with publishers' permission
96 CS papers
110 theses + 211 working papers
53 patents
116 papers from faculty websites
105 CS technical reports
70
Towards Sustainability for the
HKUST Institutional Repository
How to make the submission to IR part of the
publication process?
Seeking permission from faculty to archive
papers supported by RGC grants
Making use of the OCGA Research Output
report process: a checkbox was added to the
report form to denote agreement to archiving
in the IR. 100+ papers were received in
summer 2005.
71
6. Challenges - Faculty
Low awareness of Open Access
Concern over copyright issues
Apathy in self submission
Lack of willingness to negotiate on non-exclusive rights or self-archiving rights
Lack of willingness to provide the right versions
of documents (pre- or post-refereed)
Only a small % of their scholarly work can be
archived
72
Example of a Faculty
Retaining Self Archiving Rights
73
Challenges - Institution
Needs to make a commitment to deposit all
research output with the Institutional Repository
Needs to give financial support to faculty who
submit papers to open access journals
Needs to give financial support to the Library for
archiving work
74
Challenges - Publishers
In SHERPA project, 73 out of 107 publishers
(68%) allow some sort of archiving, as of Nov’04
Many have no policy (Camford, Genetic Society
of America)
Many have an unclear policy
Need to include self-archiving into license
agreements with publishers
75
Challenges – Library
Provide support for university research self-archiving
Promote the IR
Educate users and faculty about the IR
Showcase the IR
Find champions and partners
Seek institutional commitment and support
Harvest documents
Make self submission a part of faculty’s
publication reporting system
76
Challenges - Librarians
System Evaluation
Formulating and interpreting policies
Internal and publishers’ policies
Content Recruitment
Advocacy
Education
Advisory
Perceived benefits
Public relations
Use Assistance
77
References and Additional Resources
Chan, Diana L.H. (2004) “Managing the challenges : acquiring content for the HKUST Institutional Repository” International conference
on developing digital institutional repositories : experiences and challenges, Hong Kong, December 9-10, 2004, California Institute of
Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/1973
(accessed September 24, 2005)
Chan, Diana L.H. (2004) “Strategies for acquiring content : experiences at HKUST” International conference on developing digital
institutional repositories : experiences and challenges, Hong Kong, December 9-10 2004, California Institute of Technology Libraries and
the Hong Kong University of Science and Technology Library, available at: http://hdl.handle.net/1783.1/1974 (accessed September 24,
2005)
Chan, Diana L. H., Kwok, Catherine S. Y., Yip, Stephen K. F. (2005) “Changing roles of reference librarians : the case of HKUST
Institutional Repository.” Reference Services Review, Vol. 33, No. 3, pp.268-282, available at http://hdl.handle.net/1783.1/2039 (accessed
September 24, 2005)
Crow, Raym. (2002) “SPARC Institutional repository checklist and resource guide” The Scholarly Publishing & Academic Resources
Coalition, November.
Crow, Raym. (2002) “The case for institutional repositories: a SPARC position paper”, available at
http://www.arl.org/sparc/IR/ir.html (accessed September 24, 2005)
Gibbons, Susan. (2004) “Establishing an institutional repository” Library Technology Reports, July/August, Vol. 40 No. 4, pp. 5-67.
Lam, Ki-Tat. (2004) “DSpace in action: implementing the HKUST Institutional Repository system” International Conference on Developing
Digital Institutional Repositories : Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology
Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/2023 (accessed
September 24, 2005)
Special issue on reference librarians and institutional repositories (2005). Reference Services Review, vol. 33, no.3. pp. 259-346.
78