Overview of The HKUST Institutional Repository Diana Chan Head of Reference HKUST Library Managing Scholarly Assets in Institutional Repositories: Sharing Experiences among JULAC Libraries. HKUST, Feb 24,


Overview of The HKUST
Institutional Repository
Diana Chan
Head of Reference
HKUST Library
Managing Scholarly Assets in Institutional Repositories:
Sharing Experiences among JULAC Libraries.
HKUST, Feb 24, 2006
Contents
1. Changing Landscape of Scholarly
Communication
2. The HKUST Institutional Repository
3. Software, Features and Enhancements
4. Indexed by Search Engines and Indexing Tools
5. Access Statistics
2
1. Changing Landscape of
Scholarly Communication
1.1 Developments in Academia
1.2 Developments in Commercial Sector
1.3 What is an Institutional Repository (IR)?
1.4 Why Create an IR?
3
1.1 Developments in Academia
Converging Trends in Academia:
Technological + Social → Open Access Movement
4
Technological Trends
 Increasing ease of sharing documents via FTP and Web
(HTTP)
 Enables researchers to “publish” their research results
(working papers, pre-prints, etc.) in subject-specific, web-based open archives for faster and wider dissemination
 Individual scholars or institutions post abstracts and full text
 arXiv.org e-Print archive
 Social Science Research Network (SSRN)
 The success of such collections led to the Open Archives
Initiative (OAI) which promotes author self-archiving &
interoperable standards for file sharing
 Major outcome: Open Archives Initiative Protocol for
Metadata Harvesting (OAI-PMH)
5
Social and Cultural Trends
“Serials Crisis”
Journal titles proliferating + Prices rising + Library budgets cut
= Market dysfunction (since the 1980s)
Graphic source: ARL Statistics, Monographs and Serials Costs in ARL Libraries, 1986-2003
6
Further Social Trends
 Increasing legal restrictions imposed by corporations on the
licensing and use of print and digital resources (since the
1990s)
 New Model offered by academic disciplines (e.g.
Physics & Computer Science) with a culture & long
history of sharing results outside of formal publication
via preprints etc.
 Internet Culture – “Information wants to be free” trend
 Campaigns for free and unrestricted online access for
research literature worldwide develop into “Open
Access Movement”
7
Open Access Movement Example
 Scholarly Publishing and Academic Resources
Coalition (SPARC)
 Sponsored by Association of Research Libraries
 Endorsed by many different groups: Association of
American Universities, Association of Universities
and Colleges of Canada, Australian Vice-Chancellors
Committee, etc.
 Founded in 1997 to address market dysfunction
in scholarly publishing
 “Expand competition & support Open Access to
address high & rising journal costs”
8
A Fruit of OA Movement:
Open Access Journals
Refereed or peer reviewed
 Emerging Infectious Diseases
 Journal of Machine Learning
Research
More in Directory of Open Access Journals (DOAJ)
9
A Fruit of OA Movement: OAIster
 One searchable interface for open archives from 600 academic institutions
 6.5 million documents: articles from Open Access journals; working papers,
discussion papers, conference papers; dissertations & theses, & more from
Institutional Repositories
10
A Fruit of OA Movement:
Institutional Repositories
 Create, acquire, store, disseminate & preserve the
scholarly output of the researchers at the institution in a
free & interoperable digital format
 Development of IRs gained momentum with the release
of two open source systems:
 Eprints (University of Southampton)
 DSpace (MIT)
 Examples of Individual IRs
 Australian National University Eprint Repository
 eScholarship Repository (U of California)
 Caltech CODA
 Directory of Open Access Repositories
11
Within 10 years, all major universities are
likely to be running an IR
- Robin Yeates, 2003
Research Fellow, Dept of Information Science, School of Informatics,
City University, London
12
Support from Funding Bodies
 The UK Parliament Science and Technology Committee
issued a report recommending that publicly-funded
institutions should “establish institutional repositories on
which their published output can be read free-of-charge
online” (July 2004).
 Wellcome Trust, a major funding body for biomedical
research in the U.K., embraces open-access future –
“From Oct 1st, 2005, all papers from new research
projects must be deposited in PubMed Central within six
months of publication” (May 2005).
 U.S. National Institutes of Health and Australia’s National
Scholarly Communications Forum share a similar vision
(2004).
13
1.2 Developments in Commercial
Sector
 Web & Open Access Citation Counting
 ISI Web Citation Index
 Citation index for web-based scholarly resources, including preprints,
proceedings, technical reports, other Open Access research sources
 Thomson ISI & NEC teaming up to produce it
 Elsevier’s Scopus Service
 Has cited reference searching for 13,000 journals from 4,000 Science &
Technology publishers & 100 open access journals
 Google Scholar – Citations & Abstracts only
 Automatically analyzes & extracts citations and includes them in
the search results
 Cited by – provides the number of web documents that cite a
reference
 Web Search – to find the cited item on the web, maybe for sale
 Library search – to find the cited item in a library
14
Compared to Web of Science
15
Google Scholar
16
1.3 What is an IR?
A “digital collection capturing and preserving the
intellectual output of a single or multi-university
community”.
 adopted from “The case for institutional
repositories: a SPARC position paper” prepared
by Raym Crow.
 <http://www.arl.org/sparc/IR/ir.html>
17
1.4 Why Create an IR?
 Budapest Open Access Initiative
http://www.soros.org/openaccess/index.shtml
 Recommends 2 Strategies:
1. Self-archiving in Open Electronic Archives
2. Open Access Journals
 We recommend a 3rd strategy:
3. Publish in your Institutional Repository
18
Dual Open-Access Strategy
 BOAI-2 ("gold"): Publish your article in a suitable
open-access journal whenever one exists.
 BOAI-1 ("green"): Otherwise, publish your article
in a suitable toll-access journal and also self-archive it.
19
Must Satisfy Two Conditions
 The author…grants to all users a free …right of
access to, and a license to copy, use, distribute,
transmit and display the work publicly …
 A complete version of the work is deposited
in…at least one online repository
- From the Berlin Declaration
20
Why We Created an IR at HKUST
 To create a permanent record of the scholarly
output of HKUST
 To make available and disseminate the scholarly
output of HKUST in a free and interoperable
digital format
 To help the international Open Access effort.
Because the mission of disseminating
knowledge is only half complete if it is not widely
and readily available to society.
- Adapted from the Berlin Declaration
21
2. The HKUST Institutional Repository
 Collects, disseminates, and preserves in
digital format the scholarly output of the
HKUST community
 Uses DSpace software, OAI-PMH
compliant, supports Chinese
 Easily discovered by Internet search
engines and indexing tools
22
Total Number of Documents (as of Feb 13, 2006)

Collection                                                         Size    %
Conference Papers                                                   594   27
Working Papers, Technical Reports, Research Reports, Pre-prints     534   24
Journal Articles                                                    503   23
Doctoral Theses                                                     431   19
Patents                                                              58    3
Presentations                                                        66    2
Book Chapters                                                        37    2
Miscellaneous                                                         8    1
Total                                                             2,231
23
Contributors by Department (as of Feb 13, 2006)
[Pie chart; shares as labelled on the slide]
HSS&SOSC: 6%
OTHER: 6%
COMP: 20%
SBM: 14%
ELEC: 14%
OTHER SCI: 9%
PHY: 5%
MATH: 6%
OTHER ENG: 11%
MECH: 7%
24
Home Page of the
HKUST Institutional Repository
25
Browsing by Communities and Collections
26
Communities in HKUST IR
 Accounting
 Advanced Engineering
Materials Facility
 Applied Technology Center
 Atmospheric, Marine and
Coastal Environment Program
 Biochemistry
 Biology
 Center for Enhanced Learning
and Teaching
 Centre for Display Research
 Chemical Engineering
 Chemistry
 Civil Engineering
 Computer Science
 Economics
 Electrical and Electronic
Engineering
 Finance
 Humanities
 Industrial Engineering and
Engineering Management
 Information and System
Management
 Institute of Nano Science and
Technology
 Language Center
 Library
 Management of Organizations
 Marketing
 Mathematics
 Mechanical Engineering
 Physics
 Social Science
27
Faculty and Staff Link
28
To Find Papers by Authors
kwok y
29
30
The View of a Record
31
The View of a Record
Click to see
full text
32
Full Text in pdf Format
33
Acknowledgement of Copyright
34
WebBridge Link
35
OpenURL Resolver
36
Link to the Publisher’s Version
of a Paper
37
To Search in IR
Fill in keywords and
click Search
38
39
Search Full Text via Scirus
40
Scirus search results page will look like this
41
To Submit A Paper
Put in your
UST account
name and
password
42
Fill in the form, click
the “Submit” button
at the bottom of the
page
43
44
You will receive a
confirmation email
45
3. Software, Features and
Enhancements
 The July/August 2004 issue of Library Technology Reports on IR systems and functional requirements
 We evaluated 2 IR systems: EPrints and DSpace
 We followed Caltech’s model and based our IR on open source software with an OAI-PMH interface.
46
DSpace
 Jointly developed by MIT Libraries
and Hewlett-Packard Company
 Open source software
 Released on Sourceforge during
our system evaluation period in
late December 2002
 Written in Java, with PostgreSQL
database, Lucene search engine,
and a Tomcat web servlet
container
47
DSpace
 We chose DSpace in 2003 because:
DSpace began the development with the
experience gained from EPrints - the first and
most popular open source IR software at that
time
EPrints did not fully support Unicode
and was not Java- and servlet-based
Both EPrints and DSpace are open source
software, fulfill our functional requirements,
and follow state-of-the-art library standards
48
Current Configuration
of HKUST IR
As of Feb 13, 2006:
Home URL: http://repository.ust.hk/
IR Software: DSpace Version 1.3.2
Software: Fedora Core release 4 Linux, Apache Tomcat 5.0.28, Sun Java JDK 1.4.2_10
Server: Intel Pentium 4 3.00 GHz, 3 GB RAM, 80 GB hard disk
Content: 2,231 documents from 42 communities
Usage: Documents were accessed 74,467 times since Oct 2004
49
Major Features
Data structure
Document submission form
Add item form
CJK support
OAI data provider
SRW/U interface
50
Data Structure
 Document Types
 journal articles, theses, etc,
 Document Formats
 Mainly PDF files; also contains PowerPoint files
 DSpace data model
 Communities (and sub-communities)
 Collections
 Items
 Metadata
 Bundles of bitstreams
 HKUST implementation: Items are grouped by
 Departments (i.e. communities)
 then by Document Types (i.e. collections).
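The containment hierarchy listed above can be sketched as plain data classes. This is only an illustration of the nesting (community, collection, item, bundle, bitstream) and of the HKUST grouping by department and then document type. The class and field names are assumptions made for this sketch, not DSpace's actual Java classes, and the sample item is invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Bitstream:
    """A single file, e.g. the PDF of a paper."""
    name: str
    mime_type: str


@dataclass
class Bundle:
    """A named group of bitstreams belonging to one item."""
    name: str
    bitstreams: List[Bitstream] = field(default_factory=list)


@dataclass
class Item:
    """One deposited document with its metadata and bundles."""
    metadata: Dict[str, str]
    bundles: List[Bundle] = field(default_factory=list)


@dataclass
class Collection:
    """In the HKUST arrangement: one document type within a department."""
    name: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Community:
    """In the HKUST arrangement: one department."""
    name: str
    collections: List[Collection] = field(default_factory=list)
    sub_communities: List["Community"] = field(default_factory=list)


def count_items(community: Community) -> int:
    """Count items in a community, including its sub-communities."""
    own = sum(len(c.items) for c in community.collections)
    return own + sum(count_items(s) for s in community.sub_communities)


# Invented sample data: one conference paper filed under Physics.
paper = Item(
    metadata={"title": "An example paper", "type": "Conference Paper"},
    bundles=[Bundle("ORIGINAL", [Bitstream("paper.pdf", "application/pdf")])],
)
physics = Community("Physics", [Collection("Conference Papers", [paper])])
```

Calling `count_items(physics)` walks the department and any sub-communities, which mirrors how a per-department document count could be derived from this structure.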
51
Document Submission Form
Faculty are unwilling to self-submit
DSpace’s submission and workflow functions are too lengthy
We needed a simple, effortless submission form as a quick way to submit documents
The form is written in Perl
Submitted data are stored in the DSpace “Simple Archive Format”
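To make the "Simple Archive Format" concrete: each item is a directory holding a `dublin_core.xml` metadata file, a `contents` file listing the item's files, and the files themselves. The sketch below writes such a directory. It is a minimal Python illustration only; the HKUST form itself is written in Perl, and the function name, field values and file name here are hypothetical.

```python
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def write_saf_item(item_dir, fields, file_names):
    """Write dublin_core.xml and contents for one item directory.

    fields: iterable of (element, qualifier, value) tuples.
    file_names: names of the files that belong to the item.
    """
    item_dir.mkdir(parents=True, exist_ok=True)
    lines = ["<dublin_core>"]
    for element, qualifier, value in fields:
        lines.append(
            "  <dcvalue element=%s qualifier=%s>%s</dcvalue>"
            % (quoteattr(element), quoteattr(qualifier), escape(value))
        )
    lines.append("</dublin_core>")
    (item_dir / "dublin_core.xml").write_text(
        "\n".join(lines) + "\n", encoding="utf-8"
    )
    # The contents file lists one file name per line.
    (item_dir / "contents").write_text(
        "".join(name + "\n" for name in file_names), encoding="utf-8"
    )


# Hypothetical submission, written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    item = Path(tmp) / "item_000"
    write_saf_item(
        item,
        [("title", "none", "An example paper"),
         ("contributor", "author", "Author, A.")],
        ["paper.pdf"],
    )
    dc_xml = (item / "dublin_core.xml").read_text(encoding="utf-8")
    contents = (item / "contents").read_text(encoding="utf-8")
```

Storing submissions this way lets library staff review and enhance the metadata before the item is actually added to DSpace.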
52
Add Item Form
Is a locally developed JSP application to add
items to DSpace by library staff
Allows staff to:
Create new item from scratch
Enhance the metadata from faculty
submission and then add the item to
DSpace
53
54
CJK Support
 CJK (Chinese, Japanese, Korean) Support
 DSpace supports Unicode
 Problem - Lucene search engine is unable to search
by CJK characters
Solved by replacing DSpace’s Tokenizer with a
CJKTokenizer - but has an interesting side effect
 Problem - URL of query containing CJK characters is
not properly encoded
Solved by setting Tomcat URIEncoding="UTF8"
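Both fixes can be illustrated outside DSpace. Lucene's CJKTokenizer indexes runs of CJK characters as overlapping two-character tokens, so a query can match text that has no spaces between words. The sketch below imitates that bigram idea in Python and then shows a CJK query percent-encoded as UTF-8, which is what the server must be told to expect. It is a simplified illustration, not Lucene's Java code; the function names and the character ranges checked are assumptions.

```python
from urllib.parse import quote, unquote


def is_cjk(ch):
    """Rough check: CJK ideographs, Japanese kana, Hangul syllables."""
    code = ord(ch)
    return (
        0x4E00 <= code <= 0x9FFF
        or 0x3040 <= code <= 0x30FF
        or 0xAC00 <= code <= 0xD7A3
    )


def bigram_tokens(text):
    """Overlapping bigrams for CJK runs, lower-cased whole words otherwise."""
    tokens, run, word = [], [], []

    def flush_run():
        if len(run) == 1:
            tokens.append(run[0])      # a lone CJK character stays as is
        for a, b in zip(run, run[1:]):
            tokens.append(a + b)       # overlapping pairs: AB, BC, CD ...
        run.clear()

    def flush_word():
        if word:
            tokens.append("".join(word).lower())
            word.clear()

    for ch in text:
        if is_cjk(ch):
            flush_word()
            run.append(ch)
        elif ch.isalnum():
            flush_run()
            word.append(ch)
        else:                          # whitespace or punctuation ends both
            flush_run()
            flush_word()
    flush_run()
    flush_word()
    return tokens


# A CJK query must be percent-encoded as UTF-8 bytes in the URL and
# decoded with the same charset on the server side.
query = "普通話"
encoded_query = quote(query, safe="")
decoded_query = unquote(encoded_query, encoding="utf-8")
```

For example, `bigram_tokens("普通話教學")` yields the four overlapping pairs of adjacent characters, so a search for any two-character word inside the title can match.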
55
56
57
OAI Data Provider
DSpace is OAI-compliant
This means that OAI harvesters can easily
collect the metadata (in Dublin Core format)
from various IRs (including HKUST’s) for their
value-added indexing/searching services.
For example: OAIster
OAI Path to IR at HKUST:
http://repository.ust.hk/dspace-oai/request?
58
http://repository.ust.hk/dspace-oai/request?verb=GetRecord& ... 1783.1/1805
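As a rough illustration of what a harvester does, the sketch below builds an OAI-PMH request URL against the base path shown on the slide and pulls Dublin Core fields out of a record. `ListRecords` and the `oai_dc` metadata prefix are standard OAI-PMH. The sample record is invented for this example and the function names are assumptions. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "http://repository.ust.hk/dspace-oai/request"  # from the slide


def oai_request_url(verb, **params):
    """Build an OAI-PMH request URL: the verb first, then its arguments."""
    query = {"verb": verb}
    query.update(params)
    return BASE_URL + "?" + urlencode(query)


# Ask for all records in unqualified Dublin Core.
list_url = oai_request_url("ListRecords", metadataPrefix="oai_dc")

DC_NS = "{http://purl.org/dc/elements/1.1/}"

# Invented oai_dc payload, standing in for one harvested record.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An example paper</dc:title>
  <dc:creator>Author, A.</dc:creator>
  <dc:type>Conference Paper</dc:type>
</oai_dc:dc>"""


def dc_fields(xml_text):
    """Collect Dublin Core elements into a dict of lists (fields may repeat)."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        if child.tag.startswith(DC_NS):
            name = child.tag[len(DC_NS):]
            fields.setdefault(name, []).append(child.text or "")
    return fields


record = dc_fields(SAMPLE_RECORD)
```

A service such as OAIster would issue requests like `list_url` against many repositories and index the returned fields in one place.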
59
SRW/U Interface
Search and Retrieval for the Web (or by URL)
Retains the core functionality of Z39.50 but in the
form of web services
This means search service providers can
broadcast a search to various IRs and deliver
the search results in their own GUI
SRW/U Interface for the IR at HKUST
Based on OCLC’s SRW/U software
URL: http://repository.ust.hk/SRW/
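A search "by URL" can be sketched as follows. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) come from the SRU specification. The exact endpoint path below the URL on the slide, and the `dc.title` index used in the query, are assumptions made for this example. The code only builds the URL and sends nothing.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def sru_search_url(base_url, cql_query, max_records=10, start_record=1):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": str(start_record),
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params)


# Assumption: the slide URL is used as the endpoint; a real deployment
# may expose the searchable database under a longer path.
search_url = sru_search_url(
    "http://repository.ust.hk/SRW/",
    'dc.title = "institutional repository"',
    max_records=5,
)

# Decoding the query string again shows what the server would receive.
received = parse_qs(urlsplit(search_url).query)
```

Because the whole search is expressed in the URL, a portal can send the same query to several repositories and merge the XML responses in its own interface.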
60
The result of a SRW/U search, with XSLT transformation
61
HKUST’s Enhancements to DSpace
Document submission form
CJK searching problem
Subscript and superscript problem
Number of items displayed
Access data
Top 20
Recommend an item link
Faculty & staff link
WebBridge Link
62
4. Indexed by Search Engines and
Indexing Tools
63
Indexed in
Registry of Open Access Repositories
(ROAR)
Directory of Open Access Repositories
(DOAR)
OAIster
Celestial
Google Scholar
Scopus Scirus
64
Indexed in
DOAR, ROAR, OAIster, Celestial
65
Google Scholar
66
5. Access Statistics
67
Monthly Access from May ‘03 to Jan ‘06
[Line chart: monthly counts from 2003-05 to 2006-01; vertical axis 0 to 120,000. Series: Item Viewed; doc. access (robot excluded); doc. access (all).]
Note:
"Item viewed": Access to metadata
"doc access (all)": Access to documents
"doc access (robot excluded)": Non-Robot Access to documents
68
Monthly Document Access from Oct ‘04 to Jan ‘06
[Line chart: monthly counts from 2004-10 to 2006-01; vertical axis 0 to 18,000. Series: Access (Robot excluded); Access (all).]
69
"Top 20" Group
 From Oct 04 to Jan 06 (16 months), 116
documents made it onto the monthly "Top 20" list
 They account for around 11,600 accesses during
the period (~15% of the non-robot access in the
same period)
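The figures above imply two further numbers, worked out here as a quick check. Both inputs are rounded on the slide ("around 11,600" and "~15%"), so the results are rough estimates only. The variable names are chosen for this sketch.

```python
top20_documents = 116        # documents that appeared on a monthly Top 20 list
top20_accesses = 11_600      # approximate accesses to those documents
share_of_non_robot = 0.15    # approximate share of all non-robot access

# Average number of accesses per Top 20 document over the 16 months.
avg_per_document = top20_accesses / top20_documents

# Total non-robot accesses implied by the ~15% share.
implied_non_robot_total = top20_accesses / share_of_non_robot
```

So each "Top 20" document was accessed about 100 times on average, and the implied non-robot total for the period is on the order of 77,000.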
70
Ten Documents with Highest Access

Title                                                                           Dept/Unit  Doc. Type        No. of Months Listed  Total Access
1. 普通話教學 [Teaching Putonghua]                                                   LANG       Conf. Paper                        14           519
2. Blogging in an MBA classroom                                                 ISMT       Conf. Paper                        11           491
3. Tasks, talk and teaching                                                     LANG       Research Report                    13           483
4. Three-dimensional numerical investigations ...                               CIVL       Journal Article                     4           458
5. A study of semi-technical vocabulary...                                      LANG       Research Report                    11           452
6. 「中文傳意」課程的學與教 [Learning and teaching in the "Chinese Communication" course]  LANG       Conf. Paper                        10           450
7. Matching leadership styles with employment modes                             MGTO       Journal Article                    12           429
8. A comparison of chemisorption kinetic models                                 CENG       Journal Article                    11           415
9. Quorumcast routing by multispace search                                      ELEC       Conf. Paper                         2           400
10. Changing roles of reference librarians, the case of HKUST IR                LIB        Journal Article                    10           394
71
Composition of the "Top 20" Group by Types (Oct ‘04 to Jan ‘06)
[Pie chart; shares as labelled on the slide]
Conference Paper: 19%
Case Study: 1%
Book Chapter: 3%
Working Paper: 14%
Thesis: 4%
Technical Report: 4%
Research Report: 6%
Journal Article: 39%
Preprint: 1%
Presentation: 9%
72
References and Additional Resources

Chan, Diana L.H. (2004) “Managing the challenges: acquiring content for the HKUST Institutional Repository” International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/1973 (accessed September 24, 2005)

Chan, Diana L.H. (2004) “Strategies for acquiring content: experiences at HKUST” International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/1974 (accessed September 24, 2005)

Chan, Diana L.H., Kwok, Catherine S.Y., Yip, Stephen K.F. (2005) “Changing roles of reference librarians: the case of HKUST Institutional Repository” Reference Services Review, Vol. 33, No. 3, pp. 268-282, available at http://hdl.handle.net/1783.1/2039 (accessed September 24, 2005)

Crow, Raym. (2002) “SPARC Institutional repository checklist and resource guide” The Scholarly Publishing & Academic Resources Coalition, November.

Crow, Raym. (2002) “The case for institutional repositories: a SPARC position paper”, available at http://www.arl.org/sparc/IR/ir.html (accessed September 24, 2005)

Gibbons, Susan. (2004) “Establishing an institutional repository” Library Technology Reports, July/August, Vol. 40, No. 4, pp. 5-67.

Lam, Ki-Tat. (2004) “DSpace in action: implementing the HKUST Institutional Repository system” International Conference on Developing Digital Institutional Repositories: Experiences and Challenges, Hong Kong, December 9-10, 2004, California Institute of Technology Libraries and the Hong Kong University of Science and Technology Library, available at http://hdl.handle.net/1783.1/2023 (accessed September 24, 2005)

Special issue on reference librarians and institutional repositories (2005). Reference Services Review, Vol. 33, No. 3, pp. 259-346.
73