Hardware and software considerations


Technical Framework
Charl Roberts
University of the Witwatersrand
Source: Repositories Support Project (JISC)
Technical Setup
In order to create an effective digital repository
it is important that the technical set-up process
is planned in detail.
• Defining requirements. Without a requirements specification,
informed decisions cannot be made about the choice of
repository platform and environment
• The installation of a repository platform which may require the
purchase of hardware and software, or could involve negotiating a
hosting contract
• Integrating the repository with other systems such as local
authentication systems
• Testing the repository to ensure that it works as expected, and
fulfils all the criteria set out in your requirements specification
• Creation of technical policies for long-term aspects such
as metadata, workflows and file formats
• Technical promotion of the repository. This is important to ensure
that other systems such as external search engines index
or harvest content properly.
Requirements and Specifications
Define the requirements:
What is the repository for?
Who are the stakeholders - those people with a vested interest in how the repository
represents the institution, and themselves, to the world, and what do they want from the
repository? In the case of an institutional repository, stakeholders will include senior
institutional managers, departmental leaders, and those who are expected to contribute
content. This approach is likely to reveal a series of questions: What is the target content of
the repository?
Are all content types to be managed in a single repository, or more than one?
What other systems and services might the repository be required to share information
with? This is often referred to as 'interoperability'.
Is there appropriate budget and staffing to support the requirements?
What will the repository store? For a higher education institution, repository content could
include research papers and data, electronic theses, as well as teaching and learning
resources, perhaps including some audio-visual content.
Lower level requirements
Interoperability (getting data in and out)
Open Archives Initiative (OAI)
Deposit protocols (SWORD)
Web 2.0
Standards (OAI, Dublin Core, W3C)
Software options
Open Source Software (OSS)
Free to download
Staff requirements for support
Hosted solutions
Better support
Comes at a cost
Platform requirements
- Depends on your software options – usually OSS requires other open source software (web server, database,
Programming requirements
- What skills will be required
- What standards do you need / want?
- How many items will you store, growth forecasting
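Growth forecasting can start as simple arithmetic. The sketch below is a minimal illustration, not a sizing method from the source: the function name, the linear growth model, the sample figures and the overhead factor (extra space for backups and preservation copies) are all assumptions to replace with local numbers.

```python
def forecast_storage_gb(current_items, items_per_year, avg_item_mb, years, overhead=2.0):
    """Return the projected storage need in GB per year (index 0 = today).

    Assumes linear growth in item count and a fixed average item size.
    `overhead` is a hypothetical multiplier for backups and extra copies.
    """
    projections = []
    for year in range(years + 1):
        items = current_items + items_per_year * year
        raw_gb = items * avg_item_mb / 1024
        projections.append(round(raw_gb * overhead, 1))
    return projections


# Illustrative figures only: 5000 items today, 2000 new items a year, 4 MB each.
plan = forecast_storage_gb(current_items=5000, items_per_year=2000,
                           avg_item_mb=4, years=3)
for year, gb in enumerate(plan):
    print("year", year, "->", gb, "GB")
```

Audio-visual content changes the average item size sharply, so it is worth forecasting it separately from text-based items.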
• IT staff required
– Campus IT, Web developers, programmers
– HTML, SQL, CSS, Java / PHP
• Supporting software
- Database, web server
• Hardware questions
- Technical requirements of the software
- Growth of the system
- Preservation (Physical & Access)
• Backup
• Maintenance
• Customization
Integrating the repository
Three types of integration:
Integration with external systems to get items in to a repository: While repositories are often populated with
items that have been submitted using the repository software, there are many cases where the information can
be gathered from external systems. A common use-case is to populate the repository from an institutional
publications database. Some institutions are investigating ways to populate their repository with learning and
teaching materials from their VLE in an effort to make them more accessible both within the institution and
sometimes in a more open way with the wider community. Another way of working is to provide depositors with
desktop-based smart deposit tools that integrate with their working environment to help capture their work as it
is created. The most widely adopted standard for depositing items into a repository is SWORD.
Integration with systems to get items out of a repository: Once a repository contains a useful corpus of items it
can be integrated with other systems that want to use that data. These may be local systems such as institutional
search engines or researcher web pages, national systems such as EThOS, or international systems such as Google
Scholar or OAIster. One of the most common methods for extracting the structured metadata of the items in
repositories is 'harvesting', with the standard protocol being the Open Archives Initiative Protocol for Metadata
Harvesting (OAI-PMH). RSS feeds are another standard mechanism that allows repositories to provide information
to other systems; in this case, RSS feed readers.
Integration with systems that provide services to a repository: Repository software specialises in storing items
and metadata, but can often work more effectively if it makes use of services provided by other systems.
Repositories are often configured to work with local authentication systems such
as Shibboleth, LDAP or Active Directory. These services allow the repository to look up usernames, passwords,
and user details (name, email, telephone number, etc.) from a centrally managed system. Other systems that may
provide services to a repository could include file format validation (JHOVE), virus scanning of ingested files, or
external cloud storage of files.
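As an illustration of getting items out by harvesting, the sketch below builds an OAI-PMH ListRecords request and reads Dublin Core titles from a response. It is a minimal sketch under assumptions: the base URL, the record identifier and the sample XML are made up, the helper names are hypothetical, and the sample is parsed locally so nothing is fetched over the network.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of unqualified Dublin Core elements, as used by the oai_dc format.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request URL; from_date limits it to newer records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def extract_titles(xml_text):
    """Return every dc:title found in an OAI-PMH response document."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# Hand-written sample response; a real harvester would download this.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
 <ListRecords>
  <record>
   <header><identifier>oai:repository.example.org:1234/1</identifier></header>
   <metadata>
    <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
               xmlns:dc="http://purl.org/dc/elements/1.1/">
     <dc:title>Sample thesis</dc:title>
    </oai_dc:dc>
   </metadata>
  </record>
 </ListRecords>
</OAI-PMH>"""

url = build_list_records_url("https://repository.example.org/oai/request")
titles = extract_titles(SAMPLE)
print(url)
print(titles)
```

A production harvester would also follow resumption tokens to page through large result sets, which this sketch leaves out.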
• Pilot System Testing/Functional testing
• Pilot User Acceptance Testing
• Production System Testing and User
Acceptance Testing
Creation of technical policies for long-term aspects
• Metadata (Metadata-reuse http://www.opendoar.org/find.php?format=c
• Workflows
• File Formats
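Metadata and file format policies are easier to apply when they can be checked automatically at deposit time. The sketch below is one hypothetical way to do that: the required fields, the accepted extensions and the function name are illustrative assumptions, not policy taken from the source.

```python
import os

# Hypothetical policy values; replace with the institution's own decisions.
REQUIRED_FIELDS = {"title", "creator", "date"}
ACCEPTED_EXTENSIONS = {".pdf", ".txt", ".xml", ".tif"}


def check_submission(metadata, filenames):
    """Return a list of policy problems; an empty list means the deposit passes."""
    problems = []
    # Metadata policy: every required field must be present and non-empty.
    for field in sorted(REQUIRED_FIELDS):
        if not metadata.get(field):
            problems.append("missing metadata field: " + field)
    # File format policy: compare extensions case-insensitively.
    for name in filenames:
        ext = os.path.splitext(name)[1].lower()
        if ext not in ACCEPTED_EXTENSIONS:
            problems.append("file format not covered by policy: " + name)
    return problems


issues = check_submission({"title": "T", "creator": "A"}, ["thesis.PDF", "data.xyz"])
for issue in issues:
    print(issue)
```

Checking by extension is only a first pass; a validation tool such as JHOVE inspects the file contents themselves.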
Wits Example
• Wits hosts our own Institutional Repository (IR) called
• We currently use DSpace as our software platform
• On the IT side we have one web developer (as the
manager of the unit I assist with development work)
• Our Technical Services Department work on our
metadata creation and enrichment
• Our DSpace instance runs on a virtual server on the
Wits cloud infrastructure. It is Solaris-based, although
we are considering moving to Ubuntu Linux