Transcript Slide 1

Managing Data in Difficult Times
Repositories Update (UK)
Peter Burnhill
Director, EDINA National Data Centre,
University of Edinburgh, Scotland UK
JISC/CNI Conference, Edinburgh, 1 & 2 July 2010
1
Overview
policies/strategies/technologies/infrastructure to manage research/teaching
• Scope
– Digital repositories at the level of the institution (for itself), at a level above the campus: for institutions, for UK, for much much more
* within the European and wider international context
* in support of research, learning & teaching … and management
• Having voice as …
– a provider of common services and national infrastructure [EDINA]
– a user of repository software [Eprints, DSpace, IntraLibrary]
– a member of SONEX and indirectly of COAR and UK-CORR
• and focus on repository-related progress in the UK since last JISC/CNI; where is the value, and how is this assessed/expressed?
– Size of investment in recent times
– Cost-effectiveness and ‘impact’ of provision
* Effort at institutional & inter/national level and the ‘shared services’ agenda?
• Wondering what Dorothea said next …
2
Managing Data in Difficult Times
Nostalgia for interesting but not difficult times?
• JISC Repositories & Preservation Programme: April 2006 to March 2009
“£14m investment in H.E. repository and digital content infrastructure”
• This included the JISC RepositoryNet, as four ‘support services’:
① Repository Support Project
② Repository Research Project
③ Intute Repository Search
④ ‘interim repository’ | Prospero | the Depot | OpenDepot
• Checking the JISC website today
– under the heading of ‘key digital repository activities’ are 21 funding programmes and 216 funded projects.
Including some that are just being awarded … & then there is:
• OR10: Open Repositories Conference, 6/9 July 2010, Madrid
• RepoFringe2010: Repository Fringe 2/3 September, Edinburgh
• and several others
3
R is for Repository
• What are Repositories?
– Facility/technology to support at least three basic types of service:
PUT: a service interface that allows one or more use communities to deposit/issue digital content (+ metadata on that content)
KEEP: a service that ensures the integrity of that content, for the life of the repository
GET: a service interface that allows one or more use communities to search/extract that content
* Use community: persons or machines/software; appropriate interface
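The three service types above can be illustrated with a minimal in-memory sketch. This is an illustration only, not the interface of Eprints, DSpace or any other product: the class name `Repository`, the method names, and the use of a SHA-256 checksum as the integrity check for KEEP are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Item:
    content: bytes
    metadata: dict
    checksum: str  # SHA-256 digest recorded at deposit time


@dataclass
class Repository:
    """Hypothetical in-memory repository; not any real product's API."""
    items: dict = field(default_factory=dict)

    def put(self, item_id: str, content: bytes, metadata: dict) -> None:
        # PUT: deposit content plus metadata, recording a fixity checksum.
        digest = hashlib.sha256(content).hexdigest()
        self.items[item_id] = Item(content, dict(metadata), digest)

    def keep_check(self) -> list:
        # KEEP: list ids whose content no longer matches its stored checksum.
        return [
            item_id
            for item_id, item in self.items.items()
            if hashlib.sha256(item.content).hexdigest() != item.checksum
        ]

    def search(self, **criteria) -> list:
        # GET (search): ids of items whose metadata matches every criterion.
        return [
            item_id
            for item_id, item in self.items.items()
            if all(item.metadata.get(k) == v for k, v in criteria.items())
        ]

    def get(self, item_id: str) -> bytes:
        # GET (extract): return the stored content for one item.
        return self.items[item_id].content


# Example deposits; the ids and metadata fields are made up for illustration.
repo = Repository()
repo.put("thesis-1", b"full text of a thesis", {"type": "thesis", "year": 2010})
repo.put("data-1", b"col_a,col_b\n1,2\n", {"type": "dataset", "year": 2010})
```

The same three calls would serve a person using a web form or a machine calling an interface, which is the point of "use community" above.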
• Digital Repositories Review (R.Heery and S.Anderson, 2005)
– Digital repository differs from other digital collections in that:
* "content is deposited, whether by content creator, owner or third party
* architecture manages content as well as metadata;
* repository offers a minimum set of basic services [put, get, search, access control]
* must be sustainable & trusted, well-supported & well-managed."
• "a university-based institutional repository is a set of services
… for the management and dissemination of digital materials created by
the institution and its community members. … an organizational
commitment to the stewardship of these digital materials, including long-term
preservation where appropriate, as well as organization and access ..."
(C. Lynch, 2003)
4
R is for Repository
• Who has Repositories and why?
5
R is for Repository
• What are Repositories and what are they for?
– Allowing deposit of and holding all sorts of digital things/stuff
* Metadata + Objects; Metadata + pointers; Metadata only
* All sorts of objects: images, datasets, theses, articles, etc.
• Special interest in serving our central task:
– ease & continuity of access to scholarly resources
8
Ensuring researchers, students and their teachers have ease and continuing access to online scholarly resources
[Diagram labels: projects; access to content & services; licence to use; research, learning & teaching in UK universities & colleges; acting as platform for network-level services & helping to build the JISC Integrated Information Environment; National Data Centres; JISC Collections; JISC Sub-Committees; UK funding councils; Research Councils UK]
Use case: article-length work published in e-journals, but other use cases apply
P.Burnhill, Edinburgh 2009
1&2 provider of services & user of software
• EDINA-run repositories, with and without JISC
– DataShare: for research data (institutional, U of Ed)
* Open Data; using DSpace
– Jorum: for learning materials [with Mimas]
* OER and turnstile (UK); using DSpace & IntraLibrary
– OpenDepot (the Depot): for research papers
* OA (world); using Eprints
– ShareGeo: for geo-spatial data
* Open Data and turnstile (UK); using DSpace
– OA Repository Junction as shared service tool
* using own code and Eprints as an 'escrow' repository during the transfer process.
– & maybe others … depending on definition of repository
11
Jorum: for learning materials [with Mimas]; OER and turnstile (UK); using DSpace & IntraLibrary
OpenDepot: for research papers; OA (world); using Eprints
ShareGeo: for geo-spatial data; Turnstile (UK) Data & Open Data; using DSpace
3. SONEX
• four individuals in JISC-sponsored mini think-tank
– from Denmark, Spain & UK
– Mogens Sandfaer, Pablo de Castro (Chair) & Jim Downing (Richard Jones) and Peter Burnhill
• came out of international workshop, Amsterdam, March 2009
– charged with looking at how repositories should inter-operate
– the focus group given name of ‘repository handshake’
– 3 other focus groups on citation, identifiers and ‘organisation’
* the latter an exit strategy for EU-funded DRIVER project?
• focus switched to ‘deposit opportunities’
– semi-automatic issue/deposit, under terms of Open Access
* concern about risk of ‘hollow ring of repositories’
* avoid diktat about standards and techno babble
– looking to interoperability via SWORD
15
3. SONEX
• focus switched to ‘deposit opportunities’
– Initial categorisation of repositories into which authors deposit
– Looking to onward interworking/interoperability (SWORD)
* Not just technical interoperability but workflow
– Role of repository managers
• But also recognition of other network-attached ‘systems’:
– Authoring tools
* Desktop software
– Bibliography tools
– Non-Author-based workflows
* CRIS
* REF
16
SONEX: Scholarly Output Notification & EXchange
• Re-branded ourselves as SONEX, to signal …
– ‘scholarly output’, not just research publications
– ‘notification’ using metadata only
– ‘exchange’ as two-way interoperability/negotiation
* push metadata; pull content; exploit always-on Internet
• SONEX use case: multi-person & multi-institutional
• SONEX activities:
– Identify/analyse deposit opportunities (use cases) for ingest into the repository space.
– Identify/promote projects tackling deposit use cases
– Gap analysis
• machine (third party systems) as user (PUT & GET)
http://sonexworkgroup.blogspot.com
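The 'push metadata; pull content' idea on this slide might look roughly like the sketch below. It is not SONEX's specification: the `Notification`, `Source` and `Receiver` names, the `institutions` metadata field and the matching rule are hypothetical, chosen only to show a metadata-only notification followed by an optional pull of the content.

```python
from dataclasses import dataclass


@dataclass
class Notification:
    metadata: dict    # pushed to interested repositories
    content_ref: str  # reference from which the full object can be pulled


class Source:
    """Hypothetical system holding a deposited work (e.g. another repository)."""

    def __init__(self):
        self.store = {}

    def publish(self, ref, content, metadata):
        # Keep the object locally and emit a metadata-only notification.
        self.store[ref] = content
        return Notification(metadata, ref)

    def fetch(self, ref):
        # Serve the content when a receiver decides to pull it.
        return self.store[ref]


class Receiver:
    """Hypothetical institutional repository listening for notifications."""

    def __init__(self, institution):
        self.institution = institution
        self.holdings = {}

    def on_notify(self, note, source):
        # Pull content only when one of the authors belongs to this institution.
        if self.institution in note.metadata.get("institutions", []):
            self.holdings[note.content_ref] = source.fetch(note.content_ref)
            return True
        return False


# A two-institution article: only the matching receiver pulls the content.
source = Source()
note = source.publish(
    "article-42",
    b"author's final copy",
    {"title": "An example article", "institutions": ["Edinburgh", "Madrid"]},
)
edinburgh = Receiver("Edinburgh")
hull = Receiver("Hull")
pulled_by_edinburgh = edinburgh.on_notify(note, source)
pulled_by_hull = hull.on_notify(note, source)
```

Sending only metadata first keeps the notification light; each receiving repository decides for itself whether the work is relevant before pulling the full object.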
17
SONEX Use Case Actors
• Use case Actor 1: Individual author/researcher [person]
– author of multi-authored article, other author(s) at other institution(s)
– sole author with entire career at a single institution [exception]
– Variant: author making deposit is the PI of funded research project (compliance with mandate from funder to deposit)
– Variant: author making deposit is not the PI of funded research project but work is associated with one or more funded research projects (PI)
• Use case Actors 2&3: Depositor is not author (Mediated deposit)
– Variant: support staff in research group
– Variant: Library’s own resources and document collections
– Variant: Institutional Research Support Systems (CRIS systems) [machine]
• Use case Actor 4: Repository Manager (RM) of an IR
– wishing to be notified & obtain copy from a subject (SR) or another IR
• Use case Actor 5: Publisher (which work is published) [machine]
a) deposit under OA of the author's final copy (OA-RJ & PEER projects)
b) OA of published copy
c) Pointer supply to published copy
• Other Actors: Vendor of authoring or repository software
18
SONEX Use Case Scenarios
Given opportunity, and motivation, to deposit content into the ‘repository space’, for onward notification and exchange:
1. PI(s) as co-author
* with felt obligation to notify grant funders of OA deposit
* via web-based or desktop environment
2. Publisher(s)
* assisting their author(s) in supply of full-text into appropriate repositories
3. CRIS, a campus research information system,
* managed support for researchers, including note of publications for the Project/Grant
4. ‘Bibliography’
* web-based publications lists
* as maintained by individual researchers, Research Groups, Departments, etc.
* including RAE/REF driven institutional actions
19
OA Repository Junction Project
• m2m broker supports:
– Discovery of user & content type
– Get / ingest package of data (metadata + digital object)
– Deduce / parse data object & deduce target repository(s)
– Pass / deposit package into repository targets
– Notify / send alert to appropriate 3rd party(s), e.g. repository managers
• Working with ‘Publisher’ and ‘Subject Repository’ via Broker Service
• Theo Andrew & Ian Stuart (EDINA)
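The get, deduce, pass and notify steps listed for the broker can be sketched as a small pipeline. This is a hedged illustration, not the OA Repository Junction code: the `Package` and `Target` classes, the `institutions` metadata field, the registry lookup and the example addresses are all assumptions, and the `deposit` method merely stands in for a real deposit call such as one made over SWORD.

```python
from dataclasses import dataclass, field


@dataclass
class Package:
    metadata: dict  # e.g. content type and the authors' institutions
    content: bytes  # the digital object itself


@dataclass
class Target:
    name: str
    manager_email: str
    deposits: list = field(default_factory=list)

    def deposit(self, package: Package) -> None:
        # Stand-in for a real deposit call; here the package is simply stored.
        self.deposits.append(package)


def deduce_targets(package: Package, registry: dict) -> list:
    # Deduce: match each author's institution against the known repositories.
    institutions = package.metadata.get("institutions", [])
    return [registry[name] for name in institutions if name in registry]


def broker(package: Package, registry: dict) -> list:
    # Pass the package to every deduced target, then collect alerts to send.
    notifications = []
    for target in deduce_targets(package, registry):
        target.deposit(package)
        notifications.append(
            (target.manager_email, f"New deposit in {target.name}")
        )
    return notifications


# Hypothetical registry of repositories keyed by institution name.
registry = {
    "Edinburgh": Target("Edinburgh IR", "manager@edinburgh.example"),
    "Hull": Target("Hull IR", "manager@hull.example"),
}
package = Package(
    {"type": "article", "institutions": ["Edinburgh", "Madrid"]},
    b"author's final copy",
)
alerts = broker(package, registry)
```

In this sketch an institution without a registered repository (Madrid here) is silently skipped; a real broker would need a policy for such cases, for example holding the package in an interim repository.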
20
Part 2: Showcase
21
O is for Open
• OA (for publications) not the only ‘open’ policy:
– OER: Open Educational Resources
* UKOER: Jorum and other subject/institutional repositories
* Open CourseWare – as open webpages
– Open Data
* Both repository and open databases; Linked Open Data
– Open Source Software
• Open Access
– the regime used for Subject Repositories
– seemed to be motive for creation of Institutional Repositories
* ‘Green OA’ self-archiving by authors: Creative Commons
• Is this how we should judge success of Repositories?
– OA now becoming mainstream, including uptake by publishers
– "One fifth of 2008's research papers now open access" (The Great Beyond, Nature blog, June 25, 2010)
• Are Repositories the only way to support OA?
– Repositories to align themselves with, and support, funder mandates for open access if they are to be successful
22
Informal discussion with JISC programme managers
“Dealing with institutional processes now, rather than repository
technology. Depending on type of content, the projects would fit much
more closely in:
• managing research data programme
• research information programme
• open educational resources programme
as they have much more in common with those projects than they do
with each other.”
23
“repositories have found their core business proposition via the REF and
making sure Universities list research outputs to obtain research
ratings
- have not succeeded in making the business case that IRs should be
doing the job of archiving, a core library platform, or the job of an
institutional demonstrator/poster space.
Repositories fit in the ‘University Enterprise Stack’ by virtue of being a
system that delivers a business solution to a real financial problem.”
24
UK-CORR: UK Council of Research Repositories
individual rather than institutional, [email protected]
UK has ‘rich heterogeneous repository landscape’ (C.Awre); lurk
following comment from Dorothea Salo
25
US mainly about OA full texts; UK mainly about … serving research assessment!
– Is there more to IRs than the REF: lots of bibliographic records & little full text?
– Should IRs only accept full text, not metadata only?
1. in absence of a CRIS, our IR had to do REF (Lancaster & Northampton)
2. was OA but then RAE2008, but should aim to include all (OU)
3. motive for IR was digital preservation, with different REF system; funder
mandate compliance for OA; visibility via OA (Oxford/Bodleian)
4. RAE/REF is opportunity to engage institution-wide (Warwick)
5. Advent of CRIS (which don’t manage outputs well) may be opportunity for IRs to
have role, including use of ‘metadata only’ as lever to obtain full text (Hull)
6. REF & research management information allows IRs to be embedded as
platform for OA (Southampton)
7. RAE/REF has different goals to OA and IRs with low % of full text may
undermine OA movement (Nottingham)
26
COAR: Confederation of Open Access Repositories
• New: 1st General Assembly in Madrid in March 2010
• 48 members drawn largely from Europe, but including both JISC & CNI, and also EDINA (University of Edinburgh)
• Work Plan for 2010/12, including
1. Advocacy on behalf of OA and repositories (Rs) [both together?]
2. Populating (OA) Rs
3. Best practice documents
4. Facilitate and ensure data interoperability of (across?) Rs
5. Interoperability with other systems (such as CRIS systems)
6. Support national helpdesks
7. Guidance on how Rs will form essential elements for global e-infrastructure
8. Promote R manager profession
9. Provide advice & guidance on suitable R infrastructure technologies
10. Global (meta)data store
11. Strategic partner other infrastructure-related initiatives worldwide
27
Managing Data in Difficult [Interesting] Times
End of an era?
1. Moving from technology to policy & practice: some domain-specific, some common to repositories
a) Collection management: active curation & Linked relationships
* versions, data|article|learning material
* Collections, ‘see also’
b) First point of public issue (availability); Take-down regimes
2. End of the R word? Embedded in domain-specific processes?
– Institutional stewardship responsibility for its born-digital [and digitised] content
"a university-based institutional repository [supports] a set of services … for the management and dissemination of digital materials created by the institution and its community members. … an organizational commitment to the stewardship of these digital materials, including long-term preservation where appropriate, as well as organization and access ..."
(C. Lynch, 2003)
3. What of the (new) shared services imperative?
– Who does what, at what level/scale?
28
Theoretical basis for digital library?
• Mix of document tradition & computation tradition
“considerable simplification, … helpful to think … of two traditions, or
mentalities, even cultures, co-exist in area of Information Science
1. “Approaches based on a concern with documents, with signifying
records: archives, bibliography, documentation, librarianship,
records management, and the like
2. “approaches based on uses for formal techniques, whether
mechanical (such as punch cards and data-processing
equipment) or mathematical (as in algorithmic procedures).”
Michael Buckland, UC Berkeley, 1998
http://people.ischool.berkeley.edu/~buckland/asis62.html
29
Time for me to stop
Hoping that I have left some space/place for questions

Thank you
Acknowledgements
Theo Andrew, Pablo de Castro & Robin Rice,
Dave Flanders & Andy McGregor
30
Multimedia resources: candidate for repository?
• platform for search and download of film, video and audio
– wide range of subject coverage, including documentary film
– Licensed for use in learning, teaching and research
• Being re-worked as the Digital Media Hub, combining
– Film & Sound Online
* initial 600 hours of film, digitised for downloading
– NewsFilm Online
* 3000 hours of material from ITN & Reuters
* Over 4TBs of clips to download
* Release of product from JISC Digitisation programmes
– Plus Education Image Gallery of still photography
– Visual and Sound Materials Portal project
* Discovering all sorts of audio-visual material
• Special interest for social science as non-print record of 20th Century: the first A-V century
– With new forms of research material to use and to master
31