
Distributed Data Management
at DKRZ
Wolfgang Sell
Hartmut Fichtel
Deutsches Klimarechenzentrum GmbH
[email protected], [email protected]
9-Sept-2003
CAS2003, Annecy, France, WFS
1
Table of Contents
• DKRZ - a German HPC Center
• HPC System Architecture
suited for Earth System Modeling
• The HLRE Implementation at DKRZ
• Implementing IA64/Linux based
Distributed Data Management
• Some Results
• Summary
DKRZ - a German HPCC
•
Mission of DKRZ
• DKRZ and its Organization
• DKRZ Services
• Model and Data Services
Mission of DKRZ
In 1987 DKRZ was founded with the mission to
• provide state-of-the-art supercomputing
and data services to the German scientific
community to conduct top-of-the-line Earth
System and Climate Modelling;
• provide associated services including
high-level visualization.
DKRZ and its Organization (1)
Deutsches KlimaRechenZentrum = DKRZ
German Climate Computer Center
• organised under private law (GmbH)
with 4 shareholders
• investments funded by federal government,
operations funded by shareholders
• usage 50 % shareholders and 50 % community
DKRZ and its Organization (2)
DKRZ internal Structure
• 3 departments for
• systems and networks
• visualisation and consulting
• administration
• 20 staff in total
• until restructuring end of 1999 a fourth department
supported climate model applications and climate
data management
DKRZ Services
• operations center:
DKRZ
• technical organization of computational resources
(compute-, data- and network-services,
infrastructure)
• advanced visualisation
• assistance for parallel architectures
(consulting and training)
Model & Data Services
competence center: Model & Data
• professional handling of community models
• specific scenario runs
• scientific data handling
Model & Data Group external to DKRZ,
administered by MPI for Meteorology,
funded by BMBF
HPC System Architecture
suited for Earth System Modeling
• Principal HPC System Configuration
• Links between Different Services
• The Data Problem
Principal HPC System Configuration
Link between Compute Power
and Non-Computing Services
• Functionality and Performance Requirements for
Data Service
• Transparent Access to Migrated Data
• High Bandwidth for Data Transfer
• Shared Filesystem
• Possibility for Adaptation in Upgrade Steps
due to Changes in Usage Profile
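Transparent access to migrated data means an application uses ordinary POSIX I/O on a file that may live on tape; the HSM layer stages it back on demand. The sketch below illustrates the idea. The stub-detection heuristic (comparing allocated blocks with logical size) is a common convention, not a documented DKRZ interface.

```python
import os

# Transparent HSM access: applications open migrated files with plain
# POSIX calls; the HSM recalls the data from tape on demand. Guessing
# whether a file is a migrated stub by comparing allocated blocks with
# the logical size is a heuristic assumption, not a DKRZ-specific API.

def looks_migrated(path: str) -> bool:
    """Guess whether a file is an HSM stub without triggering a recall."""
    st = os.stat(path)
    allocated = st.st_blocks * 512      # st_blocks counts 512-byte units
    return st.st_size > 0 and allocated < st.st_size // 2

def read_header(path: str, nbytes: int = 4096) -> bytes:
    """Plain open/read; a migrated file is staged back transparently."""
    with open(path, "rb") as f:
        return f.read(nbytes)
```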
Compute server power

[Chart: installed compute power (peak) at DKRZ in GFlops, logarithmic scale from 0.1 to 10,000, for the years 1984-2006.]
Adaptation Problem for Data Server

[Chart: the data problem in HPC. Data generation rate in TB/year (0-3,000) plotted against effective compute power P in GFlops (0-500), comparing linear growth in P with sublinear P^(3/4) and P^(2/3) scaling.]
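The curves of the data-problem chart can be reproduced with a small scaling model. The reference point anchoring the curves below is an assumption chosen only for illustration, not a DKRZ measurement.

```python
# Data-growth scaling model behind the chart: annual data production D as
# a function of effective compute power P, for linear, P^(3/4) and P^(2/3)
# growth. The anchor point (300 TB/year at 50 GFlops) is an assumption.

P_REF, D_REF = 50.0, 300.0   # assumed anchor: 300 TB/year at 50 GFlops

def data_rate(p_gflops: float, exponent: float) -> float:
    """Projected data production [TB/year] if D scales as P**exponent."""
    return D_REF * (p_gflops / P_REF) ** exponent

for p in (50, 100, 250, 500):
    print(f"P={p:3d} GFlops:  linear {data_rate(p, 1.0):6.0f}  "
          f"P^3/4 {data_rate(p, 0.75):6.0f}  "
          f"P^2/3 {data_rate(p, 2 / 3):6.0f}  TB/year")
```

At the right edge of the chart (P = 500 GFlops) the linear curve reaches ten times the anchor value while the sublinear curves stay well below it; that widening gap is the adaptation problem the slide refers to.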
Pros of Shared Filesystem Coupling
• High Bandwidth between the Coupled
Servers
• Scalability supported by Operating System
• No Need for Multiple Copies
• Record Level Access to Data with High
Performance
• Minimized Data Transfers
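The record-level-access advantage can be made concrete: with a shared filesystem, a client seeks directly to the record it needs instead of transferring or duplicating the whole file first. The record layout below is an illustrative assumption.

```python
import struct

# Record-level access on a shared filesystem: the reader seeks straight
# to the record it needs; no staging, ftp copy, or second copy of the
# file is required. The record layout (int step + double value) is an
# illustrative assumption, not a DKRZ file format.

RECORD = struct.Struct("<id")   # 4-byte step + 8-byte value = 12 bytes

def write_records(path: str, records) -> None:
    """Write (step, value) pairs as fixed-size binary records."""
    with open(path, "wb") as f:
        for step, value in records:
            f.write(RECORD.pack(step, value))

def read_record(path: str, index: int):
    """Fetch one record by position without reading the rest of the file."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        return RECORD.unpack(f.read(RECORD.size))
```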
Cons of Shared Filesystem Coupling
• Proprietary Software needed
• Standardisation still missing
• Limited Number of Vendors whose Systems
can be connected
HLRE Implementation at DKRZ
HöchstLeistungsRechnersystem für die Erdsystemforschung = HLRE
High Performance Computer System for Earth
System Research
• Principal HLRE System Configuration
• HLRE Installation Phases
• IA64/Linux based Data Services
• Final HLRE Configuration
Principal HLRE System Configuration
HLRE Phases

Phase (date)                          Feb 2002   4Q 2002   3Q 2003
Nodes                                        8        16        24
CPUs                                        64       128       192
Expected sustained perf. [GFlops]      ca. 200   ca. 350   ca. 500
Expected throughput increase
  compared to CRAY C916                 ca. 40    ca. 75   ca. 100
Main memory [TBytes]                       0.5       1.0       1.5
Disk capacity [TBytes]                  ca. 30    ca. 50    ca. 60
Mass storage capacity [TBytes]            >720     >1400     >3400
DS phase 1: basic structure

• CS performance increase
  • f = 37
  • F = f^(3/4) = 15
  • minimal component performance indicated in diagram
• explicit user access
  • ftp, scp ...
• CS disks with local copies
• DS disks for cache
• physically distributed DS
• NAS architecture

[Diagram: CS client(s) with 11 TB of local disk and other clients connect via GE to the DS (16.5 TB disk cache, ~PB tape archive); minimal bandwidths indicated: 180 MB/s (CS to DS), 45 MB/s (other clients), 375 MB/s and 150 MB/s (within the DS).]
Adaptation Option for Data Server

[Chart: the data-problem chart shown earlier: data generation rate in TB/year (0-3,000) versus effective compute power P in GFlops (0-500), with linear, P^(3/4) and P^(2/3) curves.]
DS phases 2,3: basic structure

• CS performance increase
  • f = 63/100
  • F = f^(3/4) = 22.4/31.6
  • minimal component performance indicated in diagram
• implicit user access
  • local UFS commands
• CS disks with local copies
• shared disks (GFS)
• DS disks for IO buffer cache
• Intel/Linux platforms
  • homogeneous HW
  • technological challenge

[Diagram: CS client(s) with 11 TB of local disk and other clients share 25/30 TB of GFS disks over FC; the DS (16.5 TB cache, ~PB tape archive) attaches via GE; minimal bandwidths indicated: 270/325 MB/s, 560/675 MB/s, 225/270 MB/s and 70/80 MB/s.]
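The sizing factors quoted on the phase slides follow from the sublinear data-growth assumption: when compute power grows by a factor f, the data server is scaled by F = f^(3/4). A minimal check of the quoted numbers:

```python
# Data-server scaling factor F derived from the compute-server
# performance increase f via F = f^(3/4), as used on the phase slides.

def ds_scale(f: float) -> float:
    return f ** 0.75

print(round(ds_scale(37), 1))    # phase 1:  f = 37  -> F ~ 15
print(round(ds_scale(63), 1))    # phase 2:  f = 63  -> F ~ 22.4
print(round(ds_scale(100), 1))   # phase 3:  f = 100 -> F ~ 31.6
```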
Implementing IA64/Linux based
Distributed Data Management
• Overall Phase 1 Configurations
• Introducing Linux based Distributed HSM
• Introducing Linux based Distributed DBMS
• Final Overall Phase 3 Configuration
Proposed final phase 3 configuration

[Diagram: 24 SX-6 nodes coupled by an IXS crossbar, each node group with FC-RAID local disk (0.28 TB x 20 = 5.6 TB); Silkworm 12000 FC switches; GFS disks (Polestar, 0.28 TB x 53 = 14.8 TB per group); disk caches of 8.3 TB (DDN, 0.69 TB x 12) and 8.5 TB (Polestar, 0.57 TB x 15); GFS servers (AzusA/AsAmA 16-way running UVDM and UCFM/UDSN); GFS clients (AsAmA 4-way) running Oracle, connected via SQLNET to an Oracle Application Server (Sun, 4 CPUs) and an 8 TB Oracle DB (DDN, 2 TB x 4); STK tape drives (9940B x 20, 9840C x 5); a post-processing system; interconnects via Gigabit Ethernet, Fast Ethernet and Fibre Channel. Migration upon market availability of components.]
Some Results
• Growth of the Data Archive
• Growth of Transfer Rates
• Observed Transfer Rates for HLRE
• FLOPS-Rates
DS archive capacity [TB]

[Chart: archive capacity in TB, 1992-2002, rising to about 600 TB, split into originals and duplicates.]
DS archive capacity (2001-2003)

[Chart: archive capacity in TB (scale 0-1,000), Sep 01 through Jun 03, split into originals and duplicates.]
DS transfer rates [GB/day]

[Chart: daily transfer volume in GB, 1992-2002 (scale 0-3,000), split into store and fetch.]
DS transfer rates (2001-2003)

[Chart: daily transfer volume in GB, Sep 01 through Jun 03 (scale 0-4,000), split into store and fetch.]
DS transfer rates (2001-2003)

[Chart: daily transfer volume in GB, Sep 01 through Jun 03 (scale 0-10,000), showing minimum, average and maximum.]
Observed Transfer Rates for HLRE

Link                                  Single stream [MB/s]   Aggregate [MB/s]
CS -> DS via ftp (12.1 SUPER-UX)      13                     100
CS -> DS via ftp (12.2 SUPER-UX)      25                     200
CS -> local disk (12.1 SUPER-UX)      40 - 50                > 2,000
CS -> GFS disk (13.1 SUPER-UX)        up to 90               3,900
DS -> GFS disk (Linux)                up to 80               500 per node
Observed FLOPS Rates for HLRE

• 4-node performance > approx. 100 GFLOPS
(about 40 % efficiency) for
  • ECHAM (70-75)
  • MOM
  • Radar Reflection on Sea Ice
• 24-node performance for the turbulence code:
about 470 GFLOPS (30+ % efficiency)
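These efficiencies are consistent with an SX-6 peak of 64 GFLOPS per node (8 CPUs at 8 GFLOPS each); the per-node peak is taken here from the SX-6 specification, not from the slide.

```python
# Sustained/peak efficiency for the quoted HLRE runs, assuming the NEC
# SX-6 peak of 8 GFLOPS per CPU and 8 CPUs per node (64 GFLOPS/node);
# this per-node peak is an assumption from the SX-6 spec, not the slide.

GFLOPS_PER_NODE_PEAK = 64.0

def efficiency(sustained_gflops: float, nodes: int) -> float:
    return sustained_gflops / (nodes * GFLOPS_PER_NODE_PEAK)

print(f"{efficiency(100, 4):.0%}")    # 4 nodes,  100 GFLOPS -> ~39 %
print(f"{efficiency(470, 24):.0%}")   # 24 nodes, 470 GFLOPS -> ~31 %
```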
Summary

• DKRZ provides Computing Resources for
Climate Research in Germany at an
internationally competitive level
• The HLRE System Architecture is suited to cope
with a data-intensive Usage Profile
• Shared Filesystems are operational today in
Heterogeneous System Environments
• Standardisation Efforts for Shared Filesystems
are needed
Thank you for your attention!
Tape transfer rates (2001-2003)

[Chart: daily transfer volume in GB, Sep 01 through Jun 03 (scale 0-8,000), split into client and repack traffic.]
DS transfer requests (2001-2003)

[Chart: daily transfer requests, Sep 01 through Jun 03 (scale 0-40,000), split into store and fetch.]
DS archive capacity (2001-2003)

[Chart: archive capacity in TB, Sep 01 through Jun 03 (scale 0-1,000), broken down by media type: 9940, 9840, SD3, VHS, 9490.]
DS archive capacity (2001-2003)

[Chart: number of files stored in millions, Sep 01 through Jun 03 (scale 0-12 million), broken down by media type: 9940, 9840, SD3, VHS, 9490.]