Collaboration on Large Datasets using Globus
Rachana Ananthakrishnan
University of Chicago

Data sharing in collaborations
[Diagram labels: Registry, Staging Store, Ingest Store, Community Store, Analysis Store, Archive, Mirror]

Data Management User Stories
• “I need a good place to store / backup / archive my (big) research data”
• “I need to easily, quickly, and reliably move or mirror portions of my data to other places.”
• “I need a way to easily and securely share my data with my colleagues at other institutions.”
• “I want to publish my data.”
• “I want to discover published data.”
• …

Exemplar: ISI-MIP
• Inter-Sectoral Impact Model Intercomparison Project
• Framework to collate climate impact data across scales and sectors
• World-wide collaboration with data assets managed by the collaboration
• Inputs from various climate models & output forms basis for model evaluation and improvement
Credits: Dr. Joshua Elliot, University of Chicago

ISI-MIP Use Cases
• Share data with researchers across institutions world-wide
  – Restricted sharing
  – Multiple institutions
• Accept data submissions
  – Restricted writing to archive
• Publish results
  – Move selected results to other locations
  – Track metadata
  – Discover data

What is Globus?
Big data publish*, transfer and sharing…
…with Dropbox-like simplicity…
…directly from your own storage systems
* In pilot phase

Publish walk-through
[Diagram labels: Univ. of Chicago, IIT, Argonne, UIUC; Scientist; Collaboration Archive; Curator]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
[Screenshots, in order:]
• Login with Campus Identity
• New submission
• Assemble the Dataset
• Move data to publish archive
• Grant Submission License
• Submission Complete
• Curator Logs in
• Curation Workflow Options
• Verify Metadata & Files
• Approve the Submission
• Submission is now Published with DOI

Discover walk-through
[Diagram labels: Univ. of Chicago, Argonne, IIT, UIUC; Scientist; Collaboration; Curator]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
[Screenshots, in order:]
• Search Published Datasets
• Discovering a Published Dataset
• Download the Published Dataset
• Select Download Destination

Globus Under the Covers
[Diagram labels: Sharing Service; Transfer Service; Identity, Group, Profile Management Services; Globus Toolkit; Globus Connect; …; Globus APIs]

Reliable, secure, high-performance file transfer and synchronization
• “Fire-and-forget” transfers
• Automatic fault recovery
• Seamless security integration
• Powerful GUI and APIs
[Diagram: Data Source to Data Destination]
1 User initiates transfer request
2 Globus moves and syncs files
3 Globus notifies user

Simple, secure sharing off existing storage systems
• Easily share large data with any user or group
• No cloud storage required
[Diagram: Data Source]
1 User A selects file(s) to share, selects user or group, and sets permissions
2 Globus tracks shared files; no need to move files to cloud storage!
3 User B logs in to Globus and accesses shared file

Thank you
• Signup and use Globus to transfer and share
  – globus.org/signup
• Signup as early adopters of publish
  – globus.org/data-publication
• Support
  – [email protected]

Thank you to our sponsors!
U.S. DEPARTMENT OF ENERGY
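
The three sharing steps described in the talk (User A selects files, a user or group, and permissions; the service tracks what is shared without moving it; User B then accesses it) can be illustrated with a small model. This is only a sketch under stated assumptions: the names SharedStorage, AccessRule, share and is_allowed, the group "isimip-modelers", and the example paths are hypothetical and are not part of any Globus API. It uses the ISI-MIP use cases of restricted sharing and restricted writing to an archive as the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccessRule:
    """One sharing grant: who may do what under which folder (hypothetical)."""
    folder: str        # folder on the owner's existing storage
    principal: str     # a user name or a group name
    permissions: str   # "r" (read-only) or "rw" (read and write)


@dataclass
class SharedStorage:
    """Hypothetical model of storage that is shared in place."""
    groups: dict = field(default_factory=dict)   # group name -> set of user names
    rules: list = field(default_factory=list)

    def share(self, folder, principal, permissions="r"):
        # Step 1: the owner picks a folder, a user or group, and permissions.
        if permissions not in ("r", "rw"):
            raise ValueError("permissions must be 'r' or 'rw'")
        folder = folder.rstrip("/") + "/"
        # Step 2: only the rule is recorded; no file is copied anywhere.
        self.rules.append(AccessRule(folder, principal, permissions))

    def is_allowed(self, user, path, action):
        # Step 3: another user asks to read ("r") or write ("w") a path.
        if action not in ("r", "w"):
            raise ValueError("action must be 'r' or 'w'")
        # A user matches rules granted to them directly or to their groups.
        principals = {user} | {
            group for group, members in self.groups.items() if user in members
        }
        return any(
            rule.principal in principals
            and path.startswith(rule.folder)
            and action in rule.permissions
            for rule in self.rules
        )


# Assumed example: a group may write submissions, one user may only read results.
storage = SharedStorage(groups={"isimip-modelers": {"alice", "bob"}})
storage.share("/archive/submissions", "isimip-modelers", "rw")
storage.share("/archive/published", "carol", "r")

print(storage.is_allowed("bob", "/archive/submissions/run1.nc", "w"))   # True
print(storage.is_allowed("carol", "/archive/published/run1.nc", "w"))   # False
```

Keeping only rules, rather than copies of the data, is what lets large datasets stay on the collaboration's own storage while access is granted per user or per group.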