ARL Survey on E-science and Data Support: Initial Findings
Wendy Pradt Lougee
E-Science Working Group
October 14, 2009
ARL E-Science
2006 Task Force inquiry about library activity
2007 Task Force Report recommendations
o Education, awareness
o Workforce development
o Relationships with relevant organizations
o Infrastructure development (CNI)
o Policy development, new publishing genre
2008 Forum, e-research presentations
2009 Working Group survey of membership
E-Science Survey
Status of institutional planning
Campus structures
Infrastructure development
Status of library planning & engagement
Involvement in campus planning
Library services, infrastructure, capacity (staff)
Pressure points, areas of interest
Institutional structure/organization
52 respondents
Institutional infrastructure in place or planned 75%
Most institutions have hybrid of institution-wide and
unit planning & infrastructure (59%)
Only 10% are pursuing institution-wide approach
Institution-wide approaches: include IT, Library,
faculty/researchers, Office of Research
Unit-based focus: grounded in science/medicine
units
“Organizational structure is too strong a word. There is
periodic interaction and base-touching between science
departments, ITS and sometimes the Libraries that
considers infrastructure needs…”
Institutional infrastructure
Of those institutions with focused infrastructure
(N=41):
50% report designated unit to provide data
curation support
40% have conducted assessment of data
resources and needs
53% use combo of central/distributed data
centers; 44% only distributed centers
Lack of awareness about digital lab notebook
support
Decentralized themes
“Most science and engineering departments, labs and
centers…have some infrastructure to support high
performance computing, or provide software tools to
process/visualize research data. But none…are clearly
documented on a single webpage or other place where
researchers can easily locate.”
“Currently there is no one central group or effort that focuses on overall
planning, but a collection of overlapping initiatives and activities –
this is largely because the university is highly decentralized and others in
the institution do not think in terms of e-science but in terms of
research supported by cyberinfrastructure.”
Centralized strategies
“A cyberinfrastructure task force is in the planning stages, and it will
report to the President of the University.”
“Two groups exist: Cyberinfrastructure Council and Knowledge
Management Committee. The Council is most involved in the
high performance computing, data centers, other computing and
network issues. The Knowledge Management Committee is
more oriented to the content of escience and data curation…”
“An eScience Institute was formed in January 2008 by the Vice
President for Technology and the Provost, in partnership with
key Deans and a group of highly distinguished research faculty…”
External funding
16 institutions engaged in DataNet proposal
development, 15 involved the library
13 libraries involved in other e-science grants
(NIH, NSF, Mellon and Gates Foundations)
“Investigating Data Curation Profiles Across Multiple Disciplines (explores
who is willing to share what data with whom)…Awarded
by IMLS to Purdue, Libraries PI.”
“NSF Office of Cyberinfrastructure award to [Cornell’s] Mann
Library: III-CXT: Promoting the curation of research data through
library-laboratory collaboration.”
Library eScience Support
Of libraries with institutional activity, 72% of
respondents reported library involvement.
Organization: group or group/department/
individual lead
E-Research Working Group, Data Curation Working
Group, e-Data Archiving Group, Science Data Services
Team, Data Executive Group, E-Research Team
86% libraries offering service collaborate with
other units (e.g., IT, colleges/departments,
centers, Ofc. Research)
Data Assessment/audits
Washington
http://www.washington.edu/lst/research_development/research_projects/LSTsurvey
U Oregon
http://libweb.uoregon.edu/faculty/SciDataAudit.html
Purdue/UIUC
http://www.datacurationprofiles.org
Wisconsin
http://digital.library.wisc.edu/1793/34859 and
http://digital.library.wisc.edu/1793/21443
Library service portfolio
Finding, using available infrastructure
8 libraries maintain web site on services
Finding relevant data, developing data
management plans, rights management
8 libraries offer training in data management
Metadata and archiving consultation/support
Most (86%) rely on discipline librarians, many
(69%) also have data librarians
Library Technology Infrastructure
Institutional repositories
Domain-specific repositories
Virtual community support (e.g., VIVO)
Short-term storage, partner in campus storage
solutions
Publishing infrastructure
GIS and social science data services, tools
Library Staff & Staff Development
62% reassigning existing staff
42% have hired, and 39% are planning to hire,
e-science expertise
62 positions detailed:
Two named chairs
70% had library/info science degree
32% had disciplinary degree (10% PhD only)
Staff development:
Conference support, in-house presentations, course
support
7 institutions collaborating with iSchools
Selected Case Studies
Wisconsin
Research Data Management Study Group (2008)
Proposed pilot: jointly funded and managed with
research partners DoIT and Library
Easily accessed, maintained storage and backup
for data; projects, address consultation needs
Libraries/DoIT digital curation service (2009)
Assess institutional models
User-based data management applications
integrated with storage/retrieval system
Develop digital curation processes & procedures,
data management assistance
University of Washington
Institution level
Study of campus needs: no clear consensus on
technology priorities. Areas of convergence: data
management, shared expertise, computing power &
network access, data collection & analysis,
communication & collaboration
eScience Institute formed 2008, interdisciplinary and
institution-wide coordinating body; Library interfaces on
planning and referral on data curation
Library:
Informal, evolving structure involving metadata unit,
research services unit, and health sciences libraries.
Planned Libraries integrated data services unit
Purdue
Institutional level
No one central group, but collection of overlapping
initiatives
Planned task force on data management (VP Research,
IT, Libraries, Provost office, colleges/schools)
Libraries Distributed Data Curation Center (D2C2)
pursues curation issues of organizing, facilitating
access to, archiving for and preserving research data
and data sets in complex environments. Brings
together project teams to apply for grants and
promote collaboration on advancing solutions for data
management.
Cornell University
DISCOVER Research Service Group
Partnership: domain scientists, Center for Advanced
Computing, Library, Fedora Commons; Sponsored by VP
Research
Facilitates collaboration, fosters cross-disciplinary analysis of
data using data mining and visualization tools, supports
development of cyberinfrastructure
Library Data Working Group white paper:
http://hdl.handle.net/1813/10903
Library maintains Research Data Management &
Publishing Support site
Library’s DataStar project: supports collaboration,
short term storage, & data sharing during research
Johns Hopkins
Institute for Data Intensive Engineering and Science (IDIES),
most visible umbrella organization for eScience
http://idies.jhu.edu
Coalesces data-intensive science efforts
Brings together scholars from School of Arts/Sciences,
Engineering, Sheridan Libraries to form interdisciplinary
teams.
Facilitates development of tools and methods
DataNet award: Data Conservancy Project
addressing creation, implementation, and sustained
management of an integrated and comprehensive
data curation strategy across initial disciplinary base
of astronomy, biodiversity, earth sciences, and social
sciences.
UCSD
Blueprint for the Digital University (2009)
recommendations: colocation facilities, centralized
disk storage, digital curation & data services, CI
network, “condo clusters,” expertise (labor pool).
Move toward centralized data service (new facility).
Service sponsored by UC Ofc of President, available
to UC researchers
Collaboration between SD Supercomputer Center and
Libraries.
Partner in Chronopolis, national center for the
management, long-term preservation, and
promulgation of national digital assets
http://chronopolis.sdsc.edu/
Pressure Points for ARL Libraries
Organizational:
Low recognition of importance of e-science support
Turf issues
Complexity of structures
Resources:
Staff with relevant expertise
Technology infrastructure
Budget constraints
Information Exchange Interests
Initiatives at member institutions
Share organizational models, position
descriptions
Assessments of researcher needs, environmental
scans
Support for digital humanities (models, programs)
Data curation technologies, data preservation
Next steps
Leverage the survey to increase info exchange:
Occasional paper including cases
Repeat survey in 2 years
Briefing paper by and for VPs for Research
“Institute” for e-science teams (discipline/data
librarians and professionals)
Build on Reinventing Science Librarianship Forum