Transcript Slide 1
Scientific Data Curation in Government Agencies
Teaching Agency Data Creators How to Develop an OAIS-Compliant Digital Curation System
Lorraine L. Richards, William C. Regli, Adam Townes, YuanYuan Feng
Drexel University, College of Computing and Informatics

Introduction
For many U.S. federal agencies, scientific data management activities support the immediate research needs of agency scientists, but they support neither the recently mandated large-scale data sharing and data reuse requirements [3,4,5] nor the long-term preservation of the data. As a result, agencies are scrambling to learn how to curate their scientific data sets without sacrificing current mission-oriented research activities. This poster examines a case study of the Federal Aviation Administration’s William J. Hughes Technical Center (WJHTC), which contracted with the Drexel University project team to develop requirements and build capacity for a digital curation and preservation system that will meet the OAIS Reference Model recommendations for such a system. Specifically, this poster presents findings related to teaching personnel outside Archives and Records Management how to develop a “big data” digital curation and preservation system.

Problem Statement
The WJHTC is an organization that uses “big data” information resources in the course of large-scale scientific research. While the WJHTC has not previously engaged in data curation as a routine activity, it now requires a trustworthy repository for its scientific research data in order to meet government mandates and to engage in data sharing for future mission-critical projects.

Research and Development Activities
The Drexel University project team is performing activities such as:
• Completion of a data inventory;
• Development of a domain ontology and metadata taxonomy;
• Recommendation of a design for the ingest and tagging mechanisms to auto-generate metadata tags;
• Research into potential standards for the policies and rules for data sets and access controls; and
• Analysis of scientific research workflows and task analysis.

Educational Goals: The development of the organizational knowledge and capabilities needed to issue a request for proposal or additional statement of work for a contractor to implement, build, and maintain a digital curation repository that is compliant with the recommendations of the OAIS Reference Model for the WJHTC and its current and future users.

Methods
To support both the implementation and educational goals, the project team chose to engage in action research. This approach stressed mutual cooperation between the WJHTC scientists and the Drexel curation and cyberinfrastructure experts.

Action Research: “…an emergent inquiry process in which applied behavioural science knowledge is integrated with existing organizational knowledge and applied to solve real organizational problems. It is simultaneously concerned with bringing about change in organizations, in developing self-help competencies in organizational members and adding to scientific knowledge. Finally, it is an evolving process that is undertaken in a spirit of collaboration and co-inquiry” [6, 439].

Figures: Simulation Workstation for Human Factors Simulation [1, 4]; Simulation Data showing “Mean and standard deviation of elbow and wrist angles, and elevations of the arm during test scenarios” [1, 12].

Some Findings
• Constantly focus on current use to sustain the project and maintain interest. (See [1]).
• The value-add must continually be evaluated and applied to the key objectives of the organization, its departments, and its individuals, linking curation and preservation goals to the ongoing data use priorities.
• Tie the current curation project directly to other key strategic projects within the organization, e.g., the UAS (Unmanned Aircraft Systems, or drone) project, the NextGen (Next Generation Air Transportation System) project, or SWIM (System-Wide Information Management). This can require continually reformulating progress reports and education throughout the project as the organization’s priorities change.
• To communicate effectively, the teacher must be willing to be the student. Building trust requires reciprocity.
• Education is not so much “iterative” as “holographic.” One iterates through the entire process over and over, providing more detail to the overall “story board”:
  • Continued focus on the value-add of curation.
  • Continued focus on what steps must be followed.
  • Continued focus on the “big picture” “final” solution/service and how individual project steps fit into the big picture.
• Examining, documenting, and validating detailed workflows provides a common language with which to speak.
• Ideas need to be presented in concrete form, using examples specific to the domain background of the receiving party.
• Academic or preservation-oriented abstractions are not welcome.
• Existing information ontologies and taxonomies can be used to gain persuasive power and speed up the metadata development.
• Use organization-specific or IT-oriented language rather than preservation terms, which often lead to confusion and lessen impact.
• Although the DPCCM was used and initially presented to FAA personnel, we found that they responded more positively and with greater acceptance when the findings were “translated” into the language of NASA’s “Technology Readiness Levels,” with which they are already familiar.

References
[1] Higgins, J. Stephens et al. 2012. Human Factors Evaluation of Pointing Devices Used by Air Traffic Controllers: Changes in Physical Workload and Behavior. Atlantic City, NJ: Department of Transportation. http://hf.tc.faa.gov/technotes/dot-faa-tc-12-63.pdf.
[2] Shani, A.B. and Pasmore, W.A. 2010. “Organization Inquiry: Towards a Model of the Action Research Process,” in D. Coghlan and A.B. Shani (eds.) Fundamentals of Organization Development, Vol 1. London: SAGE, pp. 249-260.
[3] Blue Ribbon Task Force on the Sustainability of Digital Preservation and Access. 2010. Sustainable Economics for a Digital Planet: Ensuring Long-Term Access to Digital Information. San Diego: SDSC. http://brtf.sdsc.edu/biblio/BRTF_Final_Report.pdf.
[4] Office of Management and Budget. 2013. Open Data Policy – Managing Information as an Asset. Washington, D.C.: Executive Office of the President. http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-1313.pdf.
[5] Office of Science and Technology Policy (OSTP). 2013a. Increasing Access to the Results of Federally Funded Scientific Research. Washington, D.C.: Executive Office of the President. http://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf.
[6] Office of Science and Technology Policy (OSTP). 2013b. Science and Technology Priorities for the FY 2015 Budget. Washington, D.C.: Executive Office of the President. http://www.whitehouse.gov/sites/default/files/omb/memoranda/2013/m-1316.pdf.