Collaboration on Large Datasets using Globus Rachana Ananthakrishnan University of Chicago Data sharing in collaborations Registry Staging Store Ingest Store Community Store Analysis Store Archive Mirror.


Collaboration on Large Datasets
using Globus
Rachana Ananthakrishnan
University of Chicago
Data sharing in collaborations
[Diagram: Registry, Staging Store, Ingest Store, Community Store, Analysis Store, Archive, Mirror]
Data Management User Stories
• “I need a good place to store / backup / archive my (big) research data”
• “I need to easily, quickly, and reliably move or mirror portions of my data to other places.”
• “I need a way to easily and securely share my data with my colleagues at other institutions.”
• “I want to publish my data.”
• “I want to discover published data.”
• …
Exemplar: ISI-MIP
• Inter-Sectoral Impact Model Intercomparison Project
• Framework to collate climate impact data across scales and sectors
• World-wide collaboration with data assets managed by the collaboration
• Inputs come from various climate models; outputs form the basis for model evaluation and improvement
Credits: Dr. Joshua Elliot, University of Chicago
ISI-MIP Use Cases
• Share data with researchers across institutions world-wide
– Restricted sharing
– Multiple institutions
• Accept data submissions
– Restricted writing to archive
• Publish results
– Move selected results to other locations
– Track metadata
– Discover data
What is Globus?
Big data publishing*, transfer, and sharing…
…with Dropbox-like simplicity…
…directly from your own storage systems
* Publishing is in pilot phase
Publish walk-through
[Diagram: sites Univ. of Chicago, IIT, Argonne, UIUC; roles Scientist, Curator; Collaboration Archive]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
Login with Campus Identity
New submission
Assemble the Dataset
Move data to publish archive
Grant Submission License
Submission Complete
Curator Logs in
Curation Workflow Options
Verify Metadata & Files
Approve the Submission
Submission is now Published with DOI
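The publish walk-through above can be read as a small state machine: a submission is described, assembled, licensed, and then approved by a curator. The following Python sketch is illustrative only. The class, the state names, and the placeholder DOI are assumptions made for this example and are not part of any Globus API.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class State(Enum):
    DRAFT = auto()      # submission created and described by the scientist
    ASSEMBLED = auto()  # data moved to the publish archive
    SUBMITTED = auto()  # submission license granted, waiting for a curator
    PUBLISHED = auto()  # curator approved, identifier attached


@dataclass
class Submission:
    """Hypothetical model of one dataset submission."""
    title: str
    files: List[str] = field(default_factory=list)
    state: State = State.DRAFT
    doi: Optional[str] = None

    def _require(self, expected: State) -> None:
        # Refuse steps that are taken out of order.
        if self.state is not expected:
            raise RuntimeError(f"expected {expected.name}, got {self.state.name}")

    def assemble(self, files: List[str]) -> None:
        self._require(State.DRAFT)
        if not files:
            raise ValueError("a dataset needs at least one file")
        self.files = list(files)
        self.state = State.ASSEMBLED

    def grant_license(self) -> None:
        self._require(State.ASSEMBLED)
        self.state = State.SUBMITTED

    def approve(self, doi: str) -> None:
        # Called by the curator after verifying metadata and files.
        self._require(State.SUBMITTED)
        self.doi = doi
        self.state = State.PUBLISHED
```

A submission that skips a step, for example calling `approve` before `grant_license`, raises `RuntimeError`, which mirrors the curator gate shown in the walk-through.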
Discover walk-through
[Diagram: sites Univ. of Chicago, Argonne, IIT, UIUC; roles Scientist, Curator; Collaboration]
1. Publish Data
2. Describe Submission
3. Assemble Dataset (Transfer Data)
4. Curate Dataset
5. Search
6. Download
Search Published Datasets
Discovering a Published Dataset
Download the Published Dataset
Select Download Destination
Globus Under the Covers
• Globus APIs
• Sharing Service
• Transfer Service
• Identity, Group, Profile Management Services
• Globus Toolkit
• Globus Connect
• …
Reliable, secure, high-performance file transfer and synchronization
• “Fire-and-forget” transfers
• Automatic fault recovery
• Seamless security integration
• Powerful GUI and APIs
1. User initiates transfer request
2. Globus moves and syncs files (Data Source to Data Destination)
3. Globus notifies user
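The ideas of syncing only what has changed and recovering automatically from faults can be illustrated with a local stand-in. The sketch below uses only the Python standard library and copies between two directories. It is not the Globus transfer service or its API, and the function names and the retry limit are assumptions chosen for this example.

```python
import hashlib
import shutil
from pathlib import Path


def checksum(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def sync(source, destination, copy=shutil.copy2, max_attempts=3):
    """Copy files that are missing or different at the destination.

    Each file copy is retried up to max_attempts times on OSError.
    Returns the relative paths that were copied, which stands in for
    the notification sent to the user when the request completes.
    """
    source, destination = Path(source), Path(destination)
    copied = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        dst = destination / src.relative_to(source)
        if dst.exists() and checksum(dst) == checksum(src):
            continue  # already in sync, nothing to move
        dst.parent.mkdir(parents=True, exist_ok=True)
        for attempt in range(1, max_attempts + 1):
            try:
                copy(src, dst)
                break  # this file succeeded
            except OSError:
                if attempt == max_attempts:
                    raise  # give up after the last attempt
        copied.append(str(src.relative_to(source)))
    return copied
```

Calling `sync` a second time on unchanged directories returns an empty list, because the checksum comparison skips files that already match.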
Simple, secure sharing off existing storage systems
• Easily share large data with any user or group
• No cloud storage required
1. User A selects file(s) to share, selects user or group, and sets permissions
2. Globus tracks shared files; no need to move files to cloud storage!
3. User B logs in to Globus and accesses shared file (from the Data Source)
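The sharing steps above amount to recording who may do what with data that stays in place. A minimal Python sketch of that bookkeeping follows. The `SharedFolder` class, the two permission names, and the example path are hypothetical and do not reflect how the Globus sharing service is implemented.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Set

ALLOWED = {"read", "write"}  # assumed permission names for this sketch


@dataclass
class SharedFolder:
    """Hypothetical record of a folder shared in place on existing storage."""
    path: str
    owner: str
    # Maps a user or group name to the permissions granted to it.
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def share(self, principal: str, permissions: Iterable[str]) -> None:
        """Grant permissions to a user or group; the data itself is not moved."""
        perms = set(permissions)
        if not perms <= ALLOWED:
            raise ValueError(f"unknown permissions: {sorted(perms - ALLOWED)}")
        self.grants[principal] = perms

    def can(self, user: str, permission: str, groups: Iterable[str] = ()) -> bool:
        """Check access for a user, directly or through group membership."""
        if user == self.owner:
            return True
        principals = [user, *groups]
        return any(permission in self.grants.get(p, set()) for p in principals)
```

For example, after `folder.share("userB", ["read"])`, the check `folder.can("userB", "read")` is true while `folder.can("userB", "write")` is false.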
Thank you
• Sign up and use Globus to transfer and share
– globus.org/signup
• Sign up as an early adopter of publish
– globus.org/data-publication
• Support
• [email protected]
Thank you to our sponsors!
U.S. Department of Energy