Operation of the CERN Managed Storage environment; current status and future directions
CHEP 2004 / Interlaken
Data Services team:
Vladimír Bahyl, Hugo Caçote, Charles Curran,
Jan van Eldik, David Hughes, Gordon Lee,
Tony Osborne, Tim Smith
Managed Storage Dream
Free to open… Instant access
Any time later… Unbounded recall
Find exact same coins… Goods integrity
Managed Storage Reality
[Figure: disk cache in front of the tape store]
Maintain + upgrade, innovate + technology refresh
Ageing equipment, escalating requirements
Dynamic store / Active Data Management
CERN Managed Storage
[Diagram: the CASTOR managed storage layers]
CASTOR Grid Service (new service): SRM service, GRIDftp servers
CASTOR Stage Service: 42 stager/disk caches, 370 disk servers, 6,700 spinning disks
Tape Store: 70 tape servers, 35,000 tapes
CASTOR central servers
Keywords: Reliability, Uniformity, Automation, Redundancy, Scalability
A highly distributed system
CASTOR Service
Running experiments
CDR for NA48, COMPASS, Ntof
Experiment peaks of 120MB/s
Combined average 10TB/day
Sustained 10MB/s per dedicated 9940B drive
Record 1.5 PB in 2004
Pseudo-online analysis
Experiments in the analysis phase
LEP and Fixed Target
LHC experiments in construction
Data production / analysis (Tier0/1 operations)
Test beam CDR
Quattor-ising
Motivation: Scale
(See G.Cancio’s talk)
Uniformity; Manageability; Automation
Configuration Description (into CDB): HW and SW; nodes and services (see the sketch after this list)
Reinstallation
Quiescing a server ≠ draining a client!
Gigabit card gymnastics; BIOS upgrades for PXE
Eliminate peculiarities from CASTOR nodes
Switch misconfigurations, firmware upgrades
(ext2 -> ext3)
Manageable servers
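For illustration only: the real configuration description lives in CDB and is written in quattor's own template language, so the structure below is a hypothetical Python rendering of the kind of per-node HW/SW/service profile being captured, not the actual CDB format.

```python
# Hypothetical sketch of a node's configuration description; field names and the
# node name "lxfsrk123" are invented, the real entries are quattor CDB templates.
disk_server = {
    "hardware": {"cpu": "2x 2.4 GHz", "memory_gb": 1,
                 "nic": "gigabit", "raid": "hw-raid-1"},
    "software": {"os": "Linux", "filesystem": "ext3",
                 "packages": ["castor-stager", "lemon-agent"]},
    "service":  {"cluster": "castor-stage", "experiment": "compass"},
}

def summarise(name: str, node: dict) -> str:
    """One-line summary: the kind of uniformity check that automation relies on."""
    hw, sw = node["hardware"], node["software"]
    return f"{name}: {hw['cpu']}, {hw['nic']} NIC, {sw['filesystem']}, {node['service']['cluster']}"

print(summarise("lxfsrk123", disk_server))
```

Keeping such a description per node is what makes reinstallation and uniformity checks repeatable rather than hand-crafted.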
LEMON-ising
Lemon agent everywhere
Linux box monitoring and alarms
Automatic HW static checks
Adding CASTOR server-specific service monitoring
HW monitoring: temperatures, voltages, fans etc; lm_sensors -> IPMI (see tape section)
Disk errors: SMART via smartmontools; auto checks, predictive monitoring (see the sketch after this list)
Tape drive errors: SMART
Uniformly monitored servers
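As a rough illustration of smartmontools-based checking (not the actual Lemon sensor; the device list and the reallocated-sector threshold are assumptions), a minimal Python sketch:

```python
# Sketch: SMART health sweep on top of smartmontools (smartctl).
import subprocess

DISKS = ["/dev/sda", "/dev/sdb"]  # hypothetical device list

def smart_health(device: str) -> bool:
    """True if 'smartctl -H' reports the overall health self-assessment as PASSED."""
    out = subprocess.run(["smartctl", "-H", device],
                         capture_output=True, text=True).stdout
    return "PASSED" in out

def reallocated_sectors(device: str) -> int:
    """Read the Reallocated_Sector_Ct attribute as a crude predictive metric."""
    out = subprocess.run(["smartctl", "-A", device],
                         capture_output=True, text=True).stdout
    for line in out.splitlines():
        if "Reallocated_Sector_Ct" in line:
            return int(line.split()[-1])   # raw value is the last column
    return 0

for disk in DISKS:
    if not smart_health(disk) or reallocated_sectors(disk) > 10:
        print(f"ALARM: {disk} failing or degrading")  # would raise a Lemon alarm
```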
Warranties
[Chart: number of disk servers (0-500) in production, Jan-00 to Jan-09, by purchase generation (0th to 5th) and batch (COGESTRA, ELONEX, JTT, TECH machines from 450 MHz to 2.8 GHz), with the older generations falling out of warranty]
Disk Replacement
[Chart: % broken mirrors per month on the 0th-generation disk servers, Dec-03 to Sep-04 (scale 0-4.5%): an unacceptably high failure rate]
10 months before the case was agreed: head instabilities
4 weeks to execute
1224 disks exchanged (= 18%); and the cages
Disk Storage Developments
Disk Configurations / File systems
HW.Raid-1/ext3 -> HW.Raid-5+SW.Raid-0/XFS
IPMI: HW health monitoring + remote access
Remote reset + power-on/off (independent of the OS); serial console redirection over LAN (see the sketch after this list)
LEAF: Hardware and State Management
Next generations (see H.Meinhard’s talk)
360 TB SATA in a box
140 TB external SATA disk arrays
New CASTOR stager (JD.Durand’s talk)
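A minimal sketch of how such out-of-band operations can be scripted with the standard ipmitool CLI; the BMC host name and credentials are invented, and the production tooling may well differ:

```python
# Sketch: remote power-cycle of a disk server over IPMI, independent of the OS.
import subprocess

def ipmi(host: str, user: str, password: str, *args: str) -> str:
    """Run an ipmitool command against a server's BMC over the LAN interface."""
    cmd = ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-P", password, *args]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout

# Query the chassis power state, then force a power cycle if the node is wedged.
print(ipmi("lxfsrk123-ipmi", "admin", "secret", "chassis", "power", "status"))
ipmi("lxfsrk123-ipmi", "admin", "secret", "chassis", "power", "cycle")

# Serial-console redirection over the LAN ("ipmitool ... sol activate") is run
# interactively rather than captured like this.
```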
Tape Service
70 tape servers (Linux)
(mostly) Single FibreChannel attached drives
2 symmetric robotic installations
5 x STK 9310 Silos in each
Drives: 9940B (50), 9840 (20), 3590 (14), LTO (6), 9940A (4)
Media: 9940B (22,884 tapes), 9840 (8,149 tapes), 3590 (8,639 tapes)
Roles: bulk physics (9940), fast access (9840), backup (3590 / LTO)
Chasing Instabilities
Tape server temperatures?
Media Migration
Technology generations
Migrate data to avoid obsolescence and reliability issues in drives
[Timeline: 1986 3480/3490 -> 1995 Redwood -> 2001 9940]
Financial: capacity gain in sub-generations
Media Migration
Replace A drives by B drives
Capacity, Performance, Reliability (drive head tolerances)
1% of A tapes unreadable on B drives – keep A drives
Migrate A to B format: 9940A (60GB, 12MB/s) -> 9940B (200GB, 30MB/s)
9 months; 25% of B resources
[Chart: number of tapes per quarter, 2001 to 2004 (scale 0-18,000), showing the 9940A population migrating to 9940B]
Tape Service Developments
Removing tails…
Tracking of all tape errors (18 months)
Retiring of problematic media
Proactive retiring of heavily used media (>5000 mounts); repack on new media
Checksums: populated when writing to tape, verified when loading back to disk (see the sketch after this list)
Drive testing
Commodity LTO-2; High end IBM3592/STK-NG
New Technology; SL8500 library / Indigo
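A minimal sketch of the populate-then-verify checksum idea; the choice of Adler-32 and the file-at-a-time interface are assumptions for illustration, not the CASTOR implementation:

```python
# Sketch: checksum a file when it is written to tape, then verify the copy
# that is staged back to disk. Adler-32 via zlib, chunked for large files.
import zlib

def adler32_of(path: str, chunk: int = 1 << 20) -> int:
    """Compute the Adler-32 checksum of a file in 1 MB chunks."""
    value = 1  # Adler-32 seed
    with open(path, "rb") as f:
        while block := f.read(chunk):
            value = zlib.adler32(block, value)
    return value & 0xFFFFFFFF

# On writing: record the checksum alongside the tape copy (hypothetical paths).
recorded = adler32_of("/castor/stage/run12345.raw")

# On reading back: recompute and compare before handing the file to the user.
staged_back = adler32_of("/tmp/run12345.raw")
assert staged_back == recorded, "checksum mismatch: tape copy corrupted"
```

Comparing the recomputed value against the one recorded at write time turns a silent corruption into a visible, actionable error.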
CASTOR Central Servers
Combined Oracle DB and application daemons node
Assorted helper applications distributed (historically) across ageing nodes
FrontEnd / BackEnd split
FE: load-balanced application servers
Eliminate interference with DB
Load distribution, overload localisation
BE: (developing) clustered DB
Reliability, security
GRID Data Management
Former GridFTP + SRM servers: standalone, experiment-dedicated
Hard to intervene on; not scalable
New load-balanced shared 6-node service: castorgrid.cern.ch (see the lookup sketch after this list)
DNS hacks for Globus reverse-lookup issues
SRM modifications to support operation behind the load balancer
GridFTP standalone client
Retire ftp and bbftp access to CASTOR
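A minimal Python sketch of what a client sees when resolving the load-balanced alias; it is illustrative only (the lookup simply lists the set of addresses currently behind castorgrid.cern.ch), not part of the service code:

```python
# Sketch: inspect the DNS alias behind the shared GridFTP/SRM service.
# gethostbyname_ex returns (canonical_name, aliases, list_of_addresses).
import socket

name, aliases, addresses = socket.gethostbyname_ex("castorgrid.cern.ch")
print(f"{name} currently resolves to {len(addresses)} addresses:")
for addr in addresses:
    # Reverse lookup is the part Globus is picky about, hence the DNS tweaks.
    reverse = socket.gethostbyaddr(addr)[0]
    print(f"  {addr} -> {reverse}")
```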
Conclusions
Stabilising HW and SW
Automation
Monitoring and control
Reactive -> Proactive Data Management