DataONE UC Library Briefing - California Digital Library



UC3 Summer Webinar Series

Scientists’ Data and Information Practices and Needs
Carol Tenopir, University of Tennessee, and Mike Frame, USGS
June 15, 2011

Scientists’ Data and Information Practices and Needs: A Baseline Assessment & Implications for Libraries
Carol Tenopir, University of Tennessee, and Mike Frame, USGS
Co-Leaders of the DataONE Usability & Assessment Working Group

Provide universal access to data about life on earth and the environment that sustains it

1. Build on existing cyberinfrastructure
2. Create new cyberinfrastructure
3. Support new communities of practice

Assessment-stakeholders

• Public Officials
• Publishers
• Data Managers
• Scientists
• Students & Teachers
• Citizen scientists
• Libraries & Librarians

Data Life Cycle

Figure: data life cycle diagram with the stages Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, and Analyze, labeled "Assessment".

Baseline Assessment of Scientists (2010)

Primary Work Sector
• academic 80%
• government 12%
• others 8%

Primary Discipline
• environmental sciences & ecology 36%
• social sciences 16%
• biology 14%
• physical sciences 12%
• computer science/engineering 9%
• other 7%
• atmospheric science 4%
• medicine 2%

n=1329, n=1317


Meet the Scientists: Joe & Mabel

Joe is a biodiversity scientist employed by a government agency. He acts as a program manager and consultant. Joe oversees collection of new data in the field and also manages historical data from other providers. Joe has data from a variety of different projects conducted over the years.

Mabel is an academic environmental scientist. She collects and records data in the field on a variety of specimen variables and environmental impacts. Mabel has a data set related to her personal research interests, as well as data collected for a university museum collection.

Lessons Learned


1. Scientists need a variety of data types and many scientists are interested in sharing data.


Data Types

Bar chart (category labels not preserved). Values: 54%, 48%, 38%, 34%, 33%, 27%, 20%, 19%, 15%, 6%

Current Sharing Practices
• share my data with others: 75%
• place at least some of my data into a central data repository: 78%
• place all of my data into a central data repository: 41%

Many are interested in sharing data (percent agree)
• Willing to share data across a broad group of researchers: 81%
• Willing to place at least some of my data into a central data repository with no restrictions: 78%
• Appropriate to create new datasets from shared data: 76%
• Willing to place all of my data into a central data repository with no restrictions: 41%

Joe & Mabel: About Sharing Data

“We are torn between putting it out there for everyone and worry about suffering the risk of something bad happening with it. Saddest thing would be if the data loses its use, where it isn’t shared.”

“I don’t think I would be opposed to it. It would not be a decision I would make personally; we would have to have permission to share.”

“I’m interested in having data available to researchers interested in larger questions, particularly climate change questions.”

“If NBII required anyone who extracted data through the portal to also share data with the portal, then a resounding yes.”

2. There are many barriers to sharing data and conditions that must be met.


Gap Between Willingness to Share and Accessibility
• place at least some of my data into a central data repository: 78%
• place all of my data into a central data repository: 41%
• Others can access my data easily: 36%

Interest in Data Sharing
• use other researchers' datasets if their datasets were easily accessible: 84%
• willing to share data across a broad group of researchers: 81%
• it is appropriate to create new datasets from shared data: 76%

Conditions on data sharing

Percent agree:
• Formally cite provider/funder: 95%
• Acknowledge provider/funder: 93%
• Opportunity to collaborate: 81%
• Reciprocal sharing agreement: 72%
• Reprints of articles: 70%

More challenges ...

Percent agree:
• Don't have the rights to make the data public: 24%
• No place to put data: 24%
• Insufficient time to make data available: 54%
• Lack of funding: 40%

More challenges ...

Percent agree (two data series; series labels not preserved):
• Don't have the rights to make the data public: 24%, 18%
• No place to put data: 24%, 24%
• Insufficient time to make data available: 54%, 62%
• Lack of funding: 40%, 43%

Joe & Mabel: About Restrictions & Conditions to Sharing Data

“We will share it with people who want to use the data for restoration or research. If a consultant wants data to make money, then we are hesitant to hand it out.”

“Is there a mechanism by which we can know when our data is being used? Knowing how valuable we are to the general public comes from the use of our data.”

“We want to make sure that those of us who have been involved in gathering the data get appropriate recognition for it.”

“If someone were to ask about rare or endangered plants, I would limit that information to appropriate people: natural heritage, universities and federal agencies.”

3. There are different needs, attitudes, and practices between scientists who work in government agencies and those who work in academia.


“I am satisfied with …” (Government vs. Academic, percent agree/strongly agree)
• formal established process to store data beyond the project
• tools and technical support for data management during the life of the project
• the tools for preparing my documentation
• the process for cataloging/describing data

Bar values as extracted (pairing with items and series not preserved): 35%, 53%, 40%, 52%, 34%, 46%, 48%, 62%

Responsibilities for Data
• Academic respondents are more likely to have sole responsibility for approving access to some or all of their datasets.
  – Academic 83%, Government 63%

Organizational Involvement
• Government respondents are more likely to agree their organization was involved in:
  – “managing data during the life of the project”: Government 52%, Academic 39%
  – “storing data beyond the life of the project”: Government 53%, Academic 46%

Joe & Mabel: The View from Government & Academic Organizations

“I don’t have the authority to make decisions about data sharing.”

“Our data sharing policy makes it difficult for us to withhold parts of the datasets we receive. As a result, some data contributors only share sub-sets of their data.”

“I don’t have anything I’m keeping private. I’m willing to put it all out there.”

“If other people are using my data then I somehow need to report that. I need to know how it’s being used and if any publications result.”

4. The skill level of scientists and use and access to appropriate tools varies across the data life cycle.


What metadata standard do you currently use?

Number of respondents by metadata standard:
• none: 676
• My Lab: 266
• EML, FGDC, Open GIS, ISO: between 95 and 97 each
• DC: 26
• DwC: 21
• DIF: 12

Joe & Mabel: About Metadata

“For contemporary sets, the person who submits the data also submits a metadata record. We create another record representing what we think it is. We have one version of the data, submitter may have a version they keep on their website. We want to be able to show that these are two different things.”

“We write FGDC records.”

“For my research, very little metadata has been created. For metadata associated with the museum collection, Darwin Core has been used.”

“We are currently redoing all of our collection databases at the museum. We are building an in house system. We looked at available standards and decided to write our own.”

5. Scientists need assistance across the data life cycle.


My organization provides… (% Government, % Academic)
• Training on best practices: 23, 21
• Funds for data management long-term: 27, 20
• Funds for data management short-term: 34, 29
• Tools and technical support for data management long-term: 39, 34
• Tools and technical support for data management short-term: 48, 43

More challenges ...

Percent agree:
• Don't have the rights to make the data public: 24%
• No place to put data: 24%
• Insufficient time to make data available: 54%
• Lack of funding: 40%

Joe & Mabel: Looking for Assistance

“Ideally, we would like for our research results to be disseminated in a way that’s accessible and digestible to not just academics but to everybody.”

“Manpower. We need more people to handle these sorts of things.”

“Maximum utility of the data would require geo referencing of the data. We would need help geo-referencing the part of the collection that isn’t geo-referenced.”

“It is cumbersome to put those data sets together, but only because it is important. If there were ways to automate some of that information collection out of the data sets, it would help.”

Data Life Cycle Scientist Challenges

Figure: the data life cycle (Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze) surrounded by scientists' questions:
• Will I get credit for my work?
• What tools do I use?
• What is a data management plan?
• What is metadata?
• Are there standards?
• Who can help me?
• Where do I preserve my data?
• How much will it cost?
• How do I preserve my data?

Future Assessments

Timeline figure, Year 1 through Year 5, showing baseline (BL) and follow-up (FU) assessments for:
• Scientists
• Library Policies
• Librarians
• Policy Makers
• Educators

Library and Librarian Surveys

• Library (1 per library): current practices
• Librarian (individuals): attitudes and perceptions
• Started with ARL libraries (spring and summer 2011; 38 library responses and 223 librarians so far)
• Will expand to other North American academic libraries and librarians

Librarian & Library Assessment

Figure: the data life cycle (Collect, Assure, Describe, Deposit, Preserve, Discover, Integrate, Analyze) surrounded by assessment questions:
• Are RDS a priority?
• Level of knowledge & skills?
• Level of participation with data?
• Role in partnering with researcher?
• Level of involvement with metadata?
• Role of the librarian in discovering data?
• Role of the librarian in helping with preservation?
• Is there an agency repository that accepts data?
• Stewardship role (select & deselect)?

Library Survey: Research Data Services (RDS)
• Research data reference/consultation services to researchers are provided by individual discipline librarians (33%) or dedicated data librarians (17%) or a combination of both (50%). (n=18)
• Almost half of the libraries (45%) do not have policies and/or procedures associated with research data services.

Library Survey: Collaboration for RDS (chart not preserved; n=28)

Library Survey: Staffing issues (chart not preserved; n=25)

Library Survey: Opportunities for Staff for RDS (chart not preserved)

Librarian Survey

– Distributed to 950 librarians
– Science, data, metadata, scholarly communication, digital collection, and electronic resources librarians
– 223 people answered at least one question

Librarian Survey

• Interact with faculty, students, or staff in support of RDS: 28% Yes, integral part; 41% Yes, occasionally; 32% No (n=221)
• With faculty or staff consultation on (chart not preserved; n=193, n=192, n=194)

Frequency of research data services performed by the librarian (chart not preserved; n=167, n=166, n=167, n=167)

Librarian Survey

• Outreach and collaboration w/ other RDS
  – Off campus: 61% never, 34% few times a year (n=157)
  – On campus: 51% never, 34% few times a year (n=157)
• Participation in … about RDS
  – strategic planning (n=158): daily 3%, once a week 4%, once a month 11%, few times a year 40%, never 42%
  – policy development (n=158): daily 4%, once a week 4%, once a month 9%, few times a year 34%, never 50%
  – working groups/professional groups (n=156): daily 3%, once a week 8%, once a month 12%, few times a year 40%, never 39%
  – informal discussion groups (n=158): daily 2%, once a week 6%, once a month 20%, few times a year 49%, never 24%

Librarian Survey: Skills & Expertise
• I have the necessary skills, knowledge, training
• I have sufficient subject expertise
• I have access to training in RDS
• My library provides opportunities to develop skills related to RDS

Bar values as extracted (pairing with statements not preserved): 48%, 57%, 51%, 31%; n=157, n=156, n=156, n=157

Librarian Survey

Most important motivation to be involved in RDS
• RDS are important to subject discipline I support
• RDS is primary responsibility
• personal interest in RDS
• My job includes facilitating data contributions to our institutional repository
• My job includes metadata creation, training, and/or management
• Other
• My research includes RDS

Bar values as extracted (pairing with items not preserved): 25%, 23%, 16%, 14%, 13%, 9%, 2%

Next steps

• Follow-up to ARL libraries and librarians
• Expand scope to other academic libraries
• Federal libraries/librarians
• Data Managers
• Other Working Groups looking at citizen scientists and UG educators

Questions?

Carol Tenopir

Mike Frame