What is an institutional repository?
A university-based institutional repository is a set of
services that a university offers to the members of its
community for the management and dissemination of
digital materials created by the institution and its
community members.
It is most essentially an organizational commitment to
the stewardship of these digital materials, including
long-term preservation where appropriate, as well as
organization and access or distribution.
Clifford Lynch, Executive Director
Coalition for Networked Information
Building an Institutional
Repository
Sarah L. Shreeves
September 24, 2007
Illinois Digital Environment for Access to Learning and Scholarship
© 2007, IDEALS
This work is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 2.5 License.
To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/2.5/
What is IDEALS?
Illinois Digital Environment for Access to Learning and Scholarship
Institutional repository for the scholarship and
research in digital form of the faculty, students, and
staff of the University of Illinois at Urbana-Champaign.
Supported by the Office of the Provost, CITES, and the University Library.
• Dissemination
• Preservation
• Persistent and reliable access
http://ideals.uiuc.edu/
Benefits for our faculty, students,
and staff?
Increased dissemination of research
Persistent URLs
Preservation
Promotion of research
Full text searching of textual material
Control over copyright
What is in scope for IDEALS?
Services
Preservation
Facilitating deposit of materials through
consulting, training, and batch loading
Consultation around copyright issues and
IDEALS
Providing as many access and dissemination
points as possible for deposited material
Providing additional services for end users and
depositors as appropriate
What type of materials?
• Published research and scholarship
• Unpublished research and scholarship in a
‘final’ state
• In the future: digital art, complex data sets…
Publications
Presentations
Grey literature
Pre-prints
Raw and processed research data
Management & organization of digital output
Journal articles, books, etc. by faculty
Theses / Dissertations
Manuscripts
Some scholarly web sites
What’s out of scope?
Collections
Administrative/electronic records
Everyday curriculum material
Published material where publisher policy does not allow
deposit
Services
E-Portfolio for students
Journal publishing
Digitization of materials
Shared collaborative space for groups
Roles in a digital repository?
Project manager
Collections specialists
Programmer / Technology specialists
Metadata specialists
Digitization specialists
Legal specialists
Public relations specialists
Why should the library be
responsible?
Expertise in large scale collection
management, description, and access
Usually have a preservation component
Long term commitment
Is the library’s mission!
But…
Library should partner with others when
needed
Libraries
ICT units
Consortia
Granting agencies
Demo of IDEALS
Production site: http://ideals.uiuc.edu/
Test site: http://loki.grainger.uiuc.edu/ideals/
Why Now for IDEALS?
Management & organization of digital output
Influence direction of scholarly communication
Open Access
Preservation of digital output
Dissemination of scholarship and research
See original proposal at:
http://www.ideals.uiuc.edu/handle/2142/3
Influence Direction of Scholarly
Communication
Educate faculty on copyright issues
IDEALS highly encourages open access to
deposited material
Funders beginning to mandate open access to
research
Provides multiple dissemination routes
The Fundamental Issue
Scholarly Literature is Different from
Commercial Publication
Not written for direct compensation
Freely given to publishers
Research and writing are supported through
public funds
Access is intended to be as wide as possible
(From “Trends in Scholarly Communications” by Richard Fyffe)
Barriers to Broad Access
High Costs
Restrictive Licensing Terms
Slow Speed of Publication
Too Much Information
(From “Trends in Scholarly Communications” by Richard Fyffe)
High Costs
Erosion of Subscription-based Access to
Journals
Decline in Book Purchases and Erosion of
Scholarly Monograph Publishing
(From “Trends in Scholarly Communications” by Richard Fyffe)
Current Model of Scholarly Communications
The Academy: Published Research is a Contribution
The Commercial Publisher: Published Research is a Commodity
From “Anatomy of a Crisis: Dysfunction in the Scholarly Communications System” by Lee C. Van Orsdel
Restrictive Licenses
Contract Terms Supersede Copyright Law
and “Fair Use.”
Contracts May (and Do) Restrict:
Who may use the journal
Permissible uses or kinds of research
Classroom use
Scholarly sharing
(From “Trends in Scholarly Communications” by Richard Fyffe)
Author Rights
Typically copyright is transferred from the author to
the publisher
An author can request to keep copyright, but there is no
guarantee that the publisher will grant it
Author addendums retain certain rights but not full
copyright. See Hirtle at
http://www.dlib.org/dlib/november06/hirtle/11hirtle.html
Open Access
“free availability of the results of research
mainly in the form of scholarly articles”
“Open Access Publishing: A developing country view” Papin-Ramcharan
and Dawe in First Monday (http://www.firstmonday.org/)
Two roads:
Open access journals
Archiving (self, institutional, discipline)
Open Access Journals
With internet access, articles are free to read,
download, copy, distribute, and print
Can also have a print fee-based version
Costs of journal (including access and
dissemination) paid for by author side fees
(sometimes supplied by author, institution,
granting agency) or by sponsorship
Issues with OA Journals
Sustainability for publisher?
Preservation issues?
For user, accessibility? Requires internet
and broadband access because most articles
are PDF (exceptions are Bioline International, First Monday, Ariadne,
Journal of Digital Information, and a handful of others that publish in HTML)
Self Archiving
Collect, describe, preserve, and provide
access to digital output in a personal,
institutional, or disciplinary repository
Cost is generally paid for by organization
maintaining repository
Issues with Self Archiving
Sustainability for organization
Copyright issues
Preservation issues
Take up by faculty / researchers
Federal Research Public Access
Act of 2006 (Cornyn-Lieberman)
Publications from federally-funded research
must be deposited in agency repository;
Agency ensures the manuscript is preserved
in a stable, digital repository;
Free, online access to each taxpayer-funded
manuscript available no later than six months
after its publication in a peer-reviewed
journal.
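The six-month window above is simple date arithmetic. A minimal sketch, assuming the deadline is counted in calendar months from the publication date and that a day missing from the target month is clamped to that month's last day; the function names are hypothetical, and the Act itself is the authority on how the period is counted:

```python
import calendar
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole calendar months, clamping the day if needed."""
    month_index = d.month - 1 + months
    year = d.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def open_access_deadline(published: date, embargo_months: int = 6) -> date:
    """Latest date for free online access (assumption: six calendar months)."""
    return add_months(published, embargo_months)


if __name__ == "__main__":
    print(open_access_deadline(date(2007, 9, 24)))
```

A repository could use a check like this to flag deposited manuscripts whose embargo has expired.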
Benefits of Open Access
Free and open access to research for all who have
ability to access it
Higher citation impact for open access articles
Pressure on commercial publishers for pricing
Forcing changes in the scholarly communication
lifecycle
Digital Preservation
Not just about back-ups and
storage
Technology, organization, and
resources
Looking forward towards
certification as a “Trusted
Digital Repository”
Only 42% of journal publishers
have established formal
arrangements for the long-term
preservation of their journals
Image from Cornell University Library
What is Digital Preservation
Management (DPM)?
Process that requires the use of the best
available technology as well as carefully
thought out administrative policies and
procedures.
Consists of:
Organizational
concerns
Technological development
Resource management
Building a “New” Library
Format Support
Less Preservable
proprietary
supported by only one software platform
has low use
lossy data compression
contains embedded files or programs/scripts
More Preservable
openly documented
supported by a range of software platforms
is widely adopted
lossless data compression
does not contain embedded files, programs, or scripts
Example format marked on each scale: TIFF
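The five criteria on the Format Support slide can be read as a simple checklist. A minimal sketch, assuming each criterion is a yes/no answer with equal weight; the class, the field names, and the two sample profiles are hypothetical and are not an IDEALS policy or a statement about any real format:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class FormatProfile:
    """Yes/no answers to the five 'more preservable' criteria (hypothetical model)."""
    name: str
    openly_documented: bool
    multi_platform_support: bool
    widely_adopted: bool
    lossless_compression: bool
    free_of_embedded_content: bool


def preservability_score(profile: FormatProfile) -> int:
    """Count how many of the five criteria the format satisfies (0 to 5)."""
    criteria = [f.name for f in fields(profile) if f.name != "name"]
    return sum(bool(getattr(profile, c)) for c in criteria)


# Hypothetical sample profiles, for illustration only.
open_format = FormatProfile("example-open-format", True, True, True, True, True)
closed_format = FormatProfile("example-closed-format", False, False, False, False, False)

if __name__ == "__main__":
    for p in (open_format, closed_format):
        print(p.name, preservability_score(p))
```

Equal weighting is a simplification; a real preservation policy would likely weigh the criteria differently.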
Trusted Digital Repositories
RLG/NARA Digital Preservation Repository Certification Task Force
Audit Checklist
Objectives:
Produce certification requirements (for both self and external
assessment), delineate a process for certifications, and identify a
certifying body (or bodies) that can implement the process.
http://www.crl.edu/content.asp?l1=13&l2=58&l3=162&l4=91
Principles:
− External to the digital archives (cannot consist solely of self-assessment)
− Managed/performed by recognized authorities
− Well-documented with comprehensive and explicit policies,
procedures, and practices
− Sustainable and monitorable over time
− Replicable
Dissemination of Scholarship and
Research
Open to Google Scholar and other spiders
Provide harvesting through the Open
Archives Initiative Protocol for Metadata
Harvesting
Most hits to IRs come from the outside in
http://www.oaister.org/ - OAIster at the University of Michigan
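Harvesting through the Open Archives Initiative Protocol for Metadata Harvesting comes down to plain HTTP requests with a `verb` parameter. A minimal sketch of building a `ListRecords` request, assuming a hypothetical base URL (`https://repository.example.edu/oai/request`); the `verb`, `metadataPrefix`, and `resumptionToken` arguments come from the protocol, while the function name is illustrative:

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In the protocol, resumptionToken is an exclusive argument: when it is
    present, no other argument besides the verb may be sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the repository's real OAI base URL.
    print(list_records_url("https://repository.example.edu/oai/request"))
```

A harvester would fetch the first URL, read any resumption token from the response, and repeat the request with that token until none is returned.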
Challenges to Establishing Digital
Repositories
Gathering content from faculty
Digital preservation
Metadata
Copyright issues
E-Science, digital art, and other data coming our
way…
Faculty reluctance
Deposit is out of their ordinary workflow
Concerns about copyright issues
Want to keep research data private
Don’t see value in the digital repository
Copyright Issues
Deposit of pre-print and post-print materials
83% of journal publishers require authors to transfer copyright in
their articles to the publisher
Generally publishers do not allow deposit of the publisher pdf
copy
Will authors want the penultimate copy deposited?
Resistance to open access
Generally digital repositories require non-exclusive dissemination
and preservation license
Sherpa/Romeo Publisher copyright policies:
http://www.sherpa.ac.uk/romeo.php?all=yes
4 Approaches to Building Content
1) Working directly with faculty and scholarly units
2) Identifying publications from faculty that are able to
be deposited
3) Inserting ourselves into the process of
disseminating grey literature such as technical
reports and working papers
4) Working directly with publishers
Metadata
Generally author-supplied
For full text objects, is minimal metadata
enough?
How many resources should be spent on
metadata enhancement?
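One way to make "minimal metadata" concrete is to check author-supplied records against a short list of required fields. A minimal sketch, assuming a required set of `title`, `creator`, and `date`; the field names are loosely modeled on Dublin Core element names, and both the required set and the sample record are assumptions, not an IDEALS requirement:

```python
# Assumption: a hypothetical minimal set, not a documented repository policy.
REQUIRED_FIELDS = ("title", "creator", "date")


def missing_fields(record: dict) -> list:
    """Return the required fields that are absent or blank in a record."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(record.get(field, "")).strip()
    ]


if __name__ == "__main__":
    # Hypothetical author-supplied record with no date.
    record = {"title": "Working paper on repositories", "creator": "A. Author"}
    print(missing_fields(record))
```

Records that fail the check could be routed to a metadata specialist, which keeps enhancement effort focused on items that need it.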
Different types of data
Datasets (E-Science and other disciplines)
Digital art
Preservation issues
Complex objects
Massive datasets with many types of supporting
documentation
Preserving relationships between pieces difficult
Many research questions!
Technical Infrastructure
DSpace 1.4.1
MIT Libraries & Hewlett-Packard
http://www.dspace.org
DSpace Community
130+ sites, 45+ countries
Communicate via listservs, wiki, conferences
Advisory Board – 13 institutions
Technical Infrastructure
UIUC Services Integration
CITES
Bluestem / LDAP
NetFiles
Illinois Compass
UI Portal?
Library
Online Catalog
Online Research Resources
Conclusions
What do you want to do and
what does your institution need?
Better dissemination of local research?
Better management and organization of locally
produced materials?
Digital preservation?
Can you phase these in?
Who is going to be responsible?
Resources
IDEALS: http://www.ideals.uiuc.edu/
IDEALS About Pages:
http://ideals.uiuc.edu/about/aboutus.html
IDEALS Initiative Wiki:
https://www.ideals.uiuc.edu/wiki/
Sarah Shreeves, Coordinator
Tim Donohue, Programmer
Copyright Notice
Parts of this file are based on the work “Anatomy of a Crisis: Dysfunction in the Scholarly
Communications System” by Lee C. Van Orsdel, Dean of Libraries, Eastern Kentucky
University, and “Trends in Scholarly Communications” by Richard Fyffe, Assistant
Dean for Scholarly Communication, University of Kansas Libraries
This file is distributed under a Creative Commons license: Attribution-NonCommercial-ShareAlike 3.0
You are free to copy, distribute, display, and perform the work and to make
derivative works
Under the following conditions: Attribution. You must give the original author
credit. Noncommercial. You may not use this work for commercial purposes.
Share Alike. If you alter, transform, or build upon this work, you may distribute
the resulting work only under a license identical to this one.
For any reuse or distribution, you must make clear to others the license
terms of this work.
Any of these conditions can be waived if you get permission from the
copyright holder.
See www.creativecommons.org