QCloud - Research Computing Centre

QCloud
Queensland Cloud Data Storage and Services
27Mar2012
NATIONAL RDSI
• 6 primary, 4 additional nodes
• 150-200 PB by 2014
• 100Gb/sec node interconnect
• Storage for large national collections
• Shared access by research communities
• Research data management
• On-shore data cloud
• Low-cost research production quality of service
NATIONAL NeCTAR RESEARCH CLOUD
• 6 primary nodes (maybe 7)
• 24,000 cores by 2013
• On-demand compute capacity for research
• Hosted shared services for research communities
• Virtual laboratories
• Research tools, workflows
[Map labels: Uni of Melbourne, QCIF/UQ, ANU, Monash Uni]
QCLOUD COMPONENTS
[Diagram: NeCTAR RESEARCH CLOUD, RDSI DATASTORE, QERN CLOUD and GENOMICS CLOUD, with scale labels >15x and 5x]
QUEENSLAND RDSI NODES
• ~ 30 Petabytes by 2014
• Primary node hosted at UQ (embargoed)
• Additional Node at JCU (awaiting decision)
• National focus: ecosystems, genomics
75% Merit Allocated Collections
25% “Collection Development”
Can add RDSI+ commercial research storage
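As a rough worked example of the allocation split above, this Python sketch applies the 75% / 25% shares to the node's approximate 2014 capacity. The function name and the use of exactly 30 PB are illustrative assumptions; the slide gives only "~ 30 Petabytes".

```python
def split_capacity(total_pb, merit_share=0.75):
    """Split node capacity into merit-allocated and collection-development parts.

    total_pb and merit_share are illustrative inputs (assumptions): the slide
    quotes "~ 30 Petabytes" and a 75% / 25% split.
    """
    merit_pb = total_pb * merit_share
    development_pb = total_pb - merit_pb
    return merit_pb, development_pb


merit, development = split_capacity(30)
print(f"Merit allocated: {merit} PB, collection development: {development} PB")
```

With a 30 PB total this gives 22.5 PB of merit-allocated collections and 7.5 PB for collection development.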
COLLECTIONS
Market Data Collections
• Active research projects
• Collaborations across multiple locations
• Fast disk, flash memory – immediate access
Vault Data Collections
• Rarely accessed data archival storage
• Broad availability
• Tape, slow disk
RDSI NODE SERVICES
OWNER
Qualification
Capture
Ingest
Bulk Update
Curation
Metadata (with ANDS)
Resilience (multiple copies)
Management
Entitlements to Access
Monitoring
Support, Help Desk, Training
USER
Entitlements to Access
Discovery
Read
Update
Duplicate
Monitoring
Support, Help Desk, Training
RESEARCH COMMUNITY
TARGET RESEARCH USERS/SERVICES
POWER USERS
• PB-scale data collections
• Low-level access services
• Specialised treatment
LARGE DATA USERS
• TB-scale data collections
• Direct data access, FTP
• Automated allocations
LONG-TAIL USERS
• GB-scale data collections
• Cloud access
• Dropbox style on PC
• Little allocation surveillance
[Chart axes: data collection size, number of users]
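The three user tiers above can be sketched as a simple size-based classifier. The slide names only "GB-scale", "TB-scale" and "PB-scale", so the exact cut-offs (1 TB and 1 PB, in decimal units) and the function name are assumptions made for illustration.

```python
# Decimal byte units (assumption: the slide does not define the tier boundaries).
TB = 1000 ** 4
PB = 1000 ** 5


def data_user_tier(collection_bytes):
    """Return the slide's user tier for a collection of the given size in bytes."""
    if collection_bytes < 0:
        raise ValueError("collection size cannot be negative")
    if collection_bytes >= PB:
        return "POWER USERS"       # PB-scale: low-level access, specialised treatment
    if collection_bytes >= TB:
        return "LARGE DATA USERS"  # TB-scale: direct data access, FTP
    return "LONG-TAIL USERS"       # GB-scale: cloud access, Dropbox style on PC


for size in (50 * 1000 ** 3, 20 * TB, 2 * PB):
    print(size, data_user_tier(size))
```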
QCIF/UQ NeCTAR NODE
• 4,000 cores, up to 4,000 virtual machines
• Research service quality
• 70% for NeCTAR Allocated Services
• 30% for Queensland Allocation
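As a small worked example, the 70% / 30% split of the node's 4,000 cores can be computed as follows. The function name is hypothetical, and the sketch assumes the percentages apply directly to the core count.

```python
def split_cores(total_cores, national_percent=70):
    """Split cores between NeCTAR-allocated services and the Queensland allocation.

    Integer arithmetic keeps the result in whole cores. Applying the slide's
    percentages directly to the core count is an assumption.
    """
    national = total_cores * national_percent // 100
    queensland = total_cores - national
    return national, queensland


national, queensland = split_cores(4000)
print(f"NeCTAR allocated: {national} cores, Queensland allocation: {queensland} cores")
```

For 4,000 cores this gives 2,800 cores for NeCTAR allocated services and 1,200 for the Queensland allocation.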
SERVICES
HOSTED SERVICES
• Permanent web sites
• Research communities
• Virtual laboratories
• Research tools
• Collaboration
COMPUTATION
• On-demand virtual machines
• Off-load “small” jobs from HPC/clusters
COMPUTE USERS/SERVICES
Tier 1,2,3 HPC
• More than 16 cores
• Specialised treatment
MEDIUM COMPUTE USERS
• 4-16 cores
• Large memory VMs (1TB+)
• Automated allocations
• Smaller HPC jobs
LONG-TAIL USERS, HOSTED SERVICES
• 1-4 cores
• Small HPC jobs
[Chart axes: number of cores required, number of users]
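The compute tiers above can be sketched as a lookup on the number of cores a job needs. The slide's ranges overlap at 4 cores ("1-4" and "4-16"), so placing a 4-core request in the long-tail tier is an assumption, as is the function name.

```python
def compute_user_tier(cores):
    """Map a core requirement to the slide's compute tiers.

    Assumption: a request for exactly 4 cores falls in the long-tail tier,
    because the slide's "1-4" and "4-16" ranges overlap at 4.
    """
    if cores < 1:
        raise ValueError("at least one core is required")
    if cores > 16:
        return "Tier 1,2,3 HPC"              # specialised treatment
    if cores > 4:
        return "MEDIUM COMPUTE USERS"        # automated allocations, smaller HPC jobs
    return "LONG-TAIL USERS / HOSTED SERVICES"  # small HPC jobs


for cores in (2, 8, 64):
    print(cores, compute_user_tier(cores))
```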
EARLY NODE (QERN)
• pQERN
– Operational October 2011
– Pre-RDSI/NeCTAR
– Genomics++
• QERN
– Delivered February
– Operational ~April
– 330 cores
– 0.5 PB disk
– RDSI/NeCTAR services
– Early adopter researchers
– Operational experience
PROJECTED NODE SCALE-UP
[Chart: projected scale-up, 2011 to 2014. RDSI Data Storage in PB (axis 0 to 30) and NeCTAR Research Cloud in cores (axis 0 to 4000). Marker: Today, QERN.]
PARTICIPANT ROLES
• RDSI: Lead Agent
• NeCTAR: Lead Agent
• Collection & Service Owners & Users
• QCIF: Governance, Management, Research Support
• UQ RCC: Management, SysAdmin
• JCU: Management, Operations
• Partners: Research Support
• UQ ITS: Acquisition, Operations
QERN Layout Draft 9/2/12
[Diagram. Labels and callouts recovered from the slide; callout letters A to F mark positions on the diagram.]
Section labels: RDSI Specific, RDSI – Servers, NeCTAR Specific, NeCTAR - OpenStack, Pod A, Pod B
Layer and legend labels: Logical Storage, Physical Server, Physical Storage, Virtual Server
Virtual machines
• Research Collection VMs: research collections can have their own VM with collection management software
• VMs for ‘Direct’ Disk Access, e.g. iRODS
• Individual Research VMs: General Purpose VMs can be fitted with custom software stacks
• Individual Research Compute VMs
• Compute nodes with multiple cores and RAM can be delivered via virtual servers
Storage and access
• Disk can be provided to VMs by NFS or Samba shares if required
• GPFS (General Parallel File System): IBM-developed file system for multi-petabyte storage
• XFS - NeCTAR Disk
• Other File Systems
• This is the ‘logical’ view of the physical hard drives the Virtual Servers will see
• Access can be provided to the logical disk using iRODS or similar, establishing a network drive on the user's computer; dependent on user network speed
• 10 Gig Network between all devices
Administration
• Admin Node: contains the administration interface and the catalogue of services, and links collection metadata to external services
• An admin node provides services required to run the system, e.g. Auth
• Auth, entitlement, handle service for DOI (?)
• A dashboard page for system requests and search of collections and VMs hosted
• QERN website: search, web forms, login screens
• Admin Node (monitoring and disk performance)
Hardware
• VM Node: IBM X3755, 48 cores
• VM/Compute Node: IBM X3755 x4, 192+ cores
• NFS Node: IBM X3650
• Head Node: IBM X3650 (four shown)
• Disk: DCS3700 Array (four shown; the label "2x 100TB array" appears twice)
QERN Progress
Mid Feb
Familiarisation
• Hardware installation
• Software systems
Service Definition and Delivery
• COO and Solution Architect start
• Service catalogue
• Software stack
• Registration processes
• Trial usage
Production
• Early adopters
• Basic services
• Continuing service introduction
Primary Node (Jul2012)
QCLOUD PARTNERS
• QCIF
• UQ
• JCU
• CQU
• Griffith
• QUT
• USQ
• USC
• Bond
• CSIRO
• NICTA
• TERN
• DERM
• DTMR
• DEEDI
• Ergon Energy
FUNDING FOR QCLOUD
Cash
$15M
RDSI Primary NoDE $1.5M (contracting)
NeCTAR Research Cloud $2M (contracting)
Smart Futures Co-investment Fund $3.6M (signed)
RDSI Additional NoDE $266K (awaiting decision)
RDSI ReDS, DaSh, NRN funds $6.5M+ (projected)
DERM, DTMR co-investments $600K (offered)
QCIF QERN $200K
In-kind
$5M
UQ Data centre facilities, operations and power consumption plus researcher support $2.3M
JCU Data centre facilities, operations and power consumption plus researcher support $700K
Other members, CSIRO, NeCTAR, NICTA, TERN, DERM, DTMR, Ergon Energy, QCIF researcher support $2M
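As a quick arithmetic check of the headline figures, this Python sketch totals the listed cash and in-kind items. Amounts are in thousands of dollars as shown on the slide; the dictionary names are illustrative, and the "$6.5M+" item is treated as its stated lower bound.

```python
# Amounts in thousands of dollars, copied from the funding slide.
CASH_K = {
    "RDSI Primary NoDE": 1500,
    "NeCTAR Research Cloud": 2000,
    "Smart Futures Co-investment Fund": 3600,
    "RDSI Additional NoDE": 266,
    "RDSI ReDS, DaSh, NRN funds": 6500,  # slide says "$6.5M+", so a lower bound
    "DERM, DTMR co-investments": 600,
    "QCIF QERN": 200,
}
IN_KIND_K = {
    "UQ": 2300,
    "JCU": 700,
    "Other members": 2000,
}

cash_total_k = sum(CASH_K.values())
in_kind_total_k = sum(IN_KIND_K.values())

print(f"Cash: ${cash_total_k / 1000:.3f}M (slide headline: $15M)")
print(f"In-kind: ${in_kind_total_k / 1000:.1f}M (slide headline: $5M)")
```

The listed cash items sum to $14.666M, consistent with the rounded $15M headline, and the in-kind items sum to $5M.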
Agreements Progress
Smart Futures
• Signed with DEEDI, Jan2012; funds flow in July
• Contingent on RDSI, NeCTAR
RDSI
• Final adjustments in progress
• Signature expected within April
NeCTAR Research Cloud
• Feedback provided to NeCTAR
• Signature expected within April
Operations (UQ, JCU)
• Drafting agreement text
• RDSI, NeCTAR agreements as back-to-back schedules
Data, service owners
• To be completed
• Preliminary arrangements for early users
RISK MANAGEMENT – QCIF
• Power upgrade at UQ DC2 (JCU also?)
– Decision deferred to May
• Transfer to new UQ data centre
– PACE building offer replacing DC2 – network
• Staff and skills availability
• Software and skills maturity
• Authentication, entitlements
RISK MANAGEMENT – NATIONAL
• Levels of interaction: Lead Agents, other nodes
• National support and help desk structure
• RDSI ReDS and DaSh Programs
RISK MANAGEMENT – LONG TERM
• RDSI value proposition to researchers
• Growth past 2014
• Business model for sustainability
• Commercial service quality
QUESTIONS?