Worked example: Global Change Information System Peter Fox, and … others Xinformatics 4400/6400 Week 10, April 8, 2014
And yet, we are still not done..
http://4.bp.blogspot.com/-7mYclB2oypk/TWrlhBPvHxI/AAAAAAAAALc/mwjhBbuZ9kU/s1600/yawn4.jpg

Assignment 3

Reading – long ago

The Global Change Research Act and USGCRP
• USGCRP was mandated by Congress in the Global Change Research Act (GCRA) of 1990 (P.L. 101-606): “To provide for development and coordination of a comprehensive and integrated United States Research Program which will assist the Nation and the world to understand, assess, predict, and respond to human-induced and natural processes of global change.”

U.S. Global Change Research Program
The Program:
• Coordinates Federal research to better understand and prepare the nation for global change
• Prioritizes and supports cutting edge scientific work in global change
• Assesses the state of scientific knowledge and the Nation’s readiness to respond to global change
• Communicates research findings to inform, educate, and engage the global community

Global Change Information System (GCIS)
Vision: A unified web-based source of authoritative, accessible, usable, and timely information about climate and global change for use by scientists, decision makers, and the public.

Global Change Research Act (1990), Section 106
…not less frequently than every 4 years, the Council… shall prepare… an assessment which–
• integrates, evaluates, and interprets the findings of the Program and discusses the scientific uncertainties associated with such findings;
• analyzes the effects of global change on the natural environment, agriculture, energy production and use, land and water resources, transportation, human health and welfare, human social systems, and biological diversity; and
• analyzes current trends in global change, both human-induced and natural, and projects major trends for the subsequent 25 to 100 years.

Previous National Climate Assessments
Climate Change Impacts on the United States (2000)
Global Climate Change Impacts in the United States (2009) http://nca2009.globalchange.gov
Target date for next NCA: 2013

NCA 2009
http://nca2009.globalchange.gov

Prototype Use Case
Name: Discover and visit data center website of dataset used to generate report figure.
Goal: The NCA Report reader sees a figure and wants to know where the data came from.
Summary: A reader of the NCA is browsing the content via the website. He/she sees a figure and wants to know where the data came from. A reference to the publication in which the figure originated appears in the figure caption. Selecting the link to the source publication displays a page of information about the publication including, if available, the publication DOI. The page also includes references to the datasets cited in the publication. Following each of the dataset reference links presents a page of information about the dataset, including links back to the agency/data center webpage that describes the dataset in more detail and makes the actual data available for order or download.
Actors: Primary Actor - reader of the NCA
Preconditions: Reader is viewing the NCA online report
Post Conditions: Reader visits the data center dataset website
Normal Flow:
1) System is presenting the NCA report to the reader in a web site. Presentation includes report figure with caption that includes reference to source publication.
2) Reader selects publication reference in figure caption.
3) System displays information about publication, including DOI (if available).
4) Publication information includes publication dataset citations.
5) Reader selects a dataset cited by the publication.
6) System displays information about dataset including links to agency / data center webpages where more information and (potentially) data download links are available.
7) Reader selects the data center link and is redirected to data center dataset webpage.
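
The normal flow above is essentially a walk along three links: figure to source publication, publication to cited datasets, dataset to data center page. A minimal sketch of that walk, assuming a simple in-memory model; the class names, the `trace_figure` helper, and the example.org records are all hypothetical and are not taken from the GCIS prototype:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Dataset:
    title: str
    data_center_url: str  # agency / data center landing page (steps 6-7)


@dataclass
class Publication:
    title: str
    doi: Optional[str] = None  # displayed only if available (step 3)
    datasets: List[Dataset] = field(default_factory=list)  # citations (step 4)


@dataclass
class Figure:
    caption: str
    source: Publication  # publication referenced in the figure caption (step 1)


def trace_figure(figure: Figure) -> List[str]:
    """Follow figure -> publication -> datasets -> data center pages."""
    publication = figure.source  # step 2: reader selects the publication
    # steps 5-7: each cited dataset leads to its data center page
    return [dataset.data_center_url for dataset in publication.datasets]


# Placeholder records, invented purely for illustration.
snow = Dataset("Example snow cover dataset", "https://datacenter.example.org/snow")
paper = Publication("Example source paper", doi=None, datasets=[snow])
fig = Figure("Example report figure", source=paper)

print(trace_figure(fig))
```

A publication with no dataset citations simply yields an empty list, which matches the "if available" hedges in the use case.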

Assessment links to information

Traceable accounts… Magic here!

Under the hood – a graph

Key Message & A Traceable Account

Key Message vs. “General” Message

Prototype 1
• Initial Implementation of UC-1
• Exposes Linked Data API
  – RESTful
  – RDF/XML, TTL, HTML, JSON supported
• Hosted at TWC / RPI
  – currently placeholder data
  – http://globalchange.tw.rpi.edu/elda/gcis/report/nca2009.html
• Implemented using Epimorphics Linked Data API (ELDA)
  – http://code.google.com/p/linked-data-api/ (spec)
  – https://code.google.com/p/elda/ (implementation)

Linked Data API Architecture

Prototype Screenshot

GCIS
• Create an entity from the structured metadata about each thing – tag with related concepts.
• Identify it with a persistent, controlled identifier.
• Present with a human readable web page and a machine interface.
• Represent all relationships between items.

GCIS and W3C Prov
For GCIS, we have agents (people, projects, agencies, data centers, publishers, etc.) who are associated with activities (measuring, deriving, modeling, analyzing, authoring, publishing, archiving, distributing, visualizing, etc.) involving the entities (software, data, images, figures, papers, reports, etc.) related to global change. We assign local identifiers to each (so we can persistently resolve them) and capture and represent their relationships. If possible, we link with external authorities: agency data centers, journal publishers, Researcher ID (researcherid.com) or ORCID (orcid.org).

Computer science-y things
[Diagram: W3C PROV core model. Nodes: ENTITY, ACTIVITY, AGENT. Edge labels: wasDerivedFrom, wasInformedBy, used, wasGeneratedBy, startedAtTime, endedAtTime, wasAttributedTo, wasAssociatedWith, actedOnBehalfOf.]
Diagram from W3C PROV group and Ivan Herman

Non-specialist Use Case
Name: Find Latest Datasets by Keyword
Goal: Search for datasets associated with the keyword “snow” and list the results by publication date, most recent first.
Summary: User story: I want to look for information concerning “snow.” I don’t know if it is a CLEAN word or a GCMD word, or even what GCMD or CLEAN is. How would I do it, and what would I see on my monitor during the process?
Assumptions: The reader is not assumed to have knowledge of the GCMD Keywords (or other) vocabulary.
Actors: Primary Actor - reader of the NCA
Preconditions: TBD
Post Conditions: Reader is presented with a list of datasets associated with the keyword “snow” sorted by dataset publication date.
Normal Flow: TBD
Notes: We are looking into two user interface options for dataset selection by keyword:
1) A free-text search where the user inputs “snow”.
2) A faceted browse interface with a vocabulary facet that presents the user with terms from a structured vocabulary. The user can manually select the term(s) which match or contain “snow”.
We intend to implement prototypes of both.

Free-text Search by Keyword (ELDA)

Faceted Browser (S2S)
Data type Facet
Vocabulary Facet

Climate Literacy & Energy Awareness Network (CLEAN)
• http://cleanet.org/index.html
• The CLEAN project, a part of the National Science Digital Library, provides a reviewed collection of resources coupled with the tools to enable an online community to share and discuss teaching about climate and energy science.
• Science Vocabularies for middle school to undergraduate students
• Vocabularies hosted at http://serc.carleton.edu/admin/manage/view_vocab.php?vocab_id=161

CLEAN Vocabulary
CLEAN Vocabulary (cont.)

Interagency Information Integration
GCIS can use relationships between all relevant information about global change across the agencies:
o From observations to datasets to research papers to models to analyses to organizations to people to synthesized reports to human impacts...
o Determine agency interdependencies: for example, an EPA analysis uses a NOAA model dependent on observations from a NASA satellite.
o Can present unique interagency metrics: "How many papers referenced datasets from a specific satellite?"
o Direct users back to agency data centers for more detailed information and the actual content and data.

GCIS “Data Mining”
Structured information with relationships allows integrated data mining, searching, metrics.
o What projects provided data used to produce figures that were referenced in the 2013 NCA section about coastal sea level rise impacts?
o Which data centers hold data referenced by papers related to forests in the Midwest?
o Which agencies have people working on projects related to societal impacts of extreme weather events?
o Show me the latest papers about health impacts of air quality in California. Which datasets were used in the analysis of air quality in California?

Be not afraid of informatics
Adopt, adapt, adapt, adapt,…
Coordinate, find gaps, be synergistic.

Project check-in
• There are a few students who have dropped the course…
• Red: Aayush Chhabra, Eric Dobson, Rikhya Ghosh, Daniel Zhao
• Orange: Eric Hayden, Ankita Khandelwal, Sisi Liu, Travis Scavone
• Yellow: Jennifer Chan, Benno Lee, Evan McCarty, James Ryan
• Green: Javier Camino, Lakshmi Chenicheri, Jonathan Dieter, Melissa Hay
• Blue: Mike Moore, Michael O’Keffe, Ranjani Sargunaraj, Jessie Sodolo
• Indigo: James Cataldo, Xueyang Guan, Thomas Hughes, Shruan Li, Amar Vishwanathan Kanna
• Violet: Sarah Cooper, Nicolle Negdely, Anirudh Prabhu, Renaldo Smith, Dian Yu