
AMS Computing and Ground Centers

AMS TIM, CERN Jul 23, 2004

Alexei Klimentov — [email protected]

AMS Computing and Ground Data Centers

- AMS-02 Ground Centers
  - AMS centers at JSC
  - Ground data transfer
  - Science Operation Center prototype
    - Hardware and software evaluation
    - Implementation plan
- AMS/CERN computing and manpower issues
- MC Production Status
  - AMS-02 MC (2004A)
  - Open questions:
    - Plans for Y2005
    - AMS-01 MC

AMS-02 Ground Support Systems

Payload Operations Control Center (POCC) at CERN (first 2-3 months in Houston), CERN Bldg.892 wing A

- “control room”, usual source of commands
- receives Health & Status (H&S), monitoring and science data in real time
- receives NASA video
- voice communication with NASA flight operations

Backup Control Station at JSC (TBD); Monitor Station in MIT

- “backup” of the “control room”
- receives Health & Status (H&S) and monitoring data in real time
- voice communication with NASA flight operations

Science Operations Center (SOC) at CERN (first 2-3 months in Houston), CERN Bldg.892 wing A

- receives the complete copy of ALL data
- data processing and science analysis
- data archiving and distribution to Universities and Laboratories

Ground Support Computers (GSC) at Marshall Space Flight Center

receives data from NASA -> buffers -> retransmits to the Science Operations Center (see the store-and-forward sketch after this list)

Regional Centers

analysis facilities to support geographically close Universities: Madrid, MIT, Yale, Bologna, Milan, Aachen, Karlsruhe, Lyon, Taipei, Nanjing, Shanghai, …
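The GSC's role sketched above is essentially store-and-forward: receive data from NASA, buffer them on local disk, and retransmit them to the Science Operations Center. A minimal illustration of that pattern follows; the directory names and the retransmit step are hypothetical placeholders, not the actual GSC software.

```python
#!/usr/bin/env python3
"""Minimal store-and-forward sketch (hypothetical; not the actual GSC software).

Files arriving from NASA land in INBOX, are buffered on local disk,
and are then retransmitted to the Science Operations Center.
"""
import shutil
from pathlib import Path

INBOX = Path("gsc_inbox")    # hypothetical: where files from NASA arrive
BUFFER = Path("gsc_buffer")  # hypothetical: local disk buffer
SENT = Path("gsc_sent")      # hypothetical: files already forwarded to the SOC


def retransmit(path: Path) -> None:
    """Placeholder for the real transfer to the Science Operations Center."""
    print(f"retransmitting {path.name} to the SOC ...")


def store_and_forward() -> None:
    for d in (INBOX, BUFFER, SENT):
        d.mkdir(exist_ok=True)
    for incoming in sorted(INBOX.glob("*")):
        buffered = BUFFER / incoming.name
        shutil.copy2(incoming, buffered)       # 1. buffer on local disk
        retransmit(buffered)                   # 2. forward to the SOC
        buffered.rename(SENT / buffered.name)  # 3. record it as sent


if __name__ == "__main__":
    store_and_forward()
```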

[Diagrams: NASA facilities and AMS facilities]

AMS Ground Centers at JSC

http://ams.cern.ch/Computing/pocc_JSC.pdf

- Requirements to AMS Ground Systems at JSC
- Define AMS GS HW and SW components
- Computing facilities
  - “ACOP” flight
  - AMS pre-flight
  - AMS flight
  - “after 3 months”
- Data storage
- Data transmission

Discussed with NASA in Feb 2004


AMS-02 Computing facilities at JSC

Center: POCC
  Location: Bldg.30, Rm 212
  Function(s): commanding, telemetry monitoring, on-line processing
  Computers: Pentium MS Win (4), Pentium Linux (28), 19" monitors (19), networking switches (8), terminal console (2), MCC WS (2)

Center: SOC
  Location: Bldg.30, Rm 3301
  Function(s): data processing, data analysis, data/Web/news servers, data archiving
  Computers: Pentium Linux (35), IBM LTO tape drives (2), networking switches (10), 17" color monitors (5), terminal console (2)

Center: "Terminal room"
  Location: tbd
  Computers: notebooks and desktops (100)

Center: AMS CSR
  Location: Bldg.30M, Rm 236
  Function(s): monitoring
  Computers: Pentium Linux (2), 19" color monitor (2), MCC WS (1)

AMS Computing at JSC

(TBD)

Year: LR-8 months
  Actions: set up at JSC the “basic version” of the POCC; conduct tests with ACOP for commanding and data transmission
  Responsible: N.Bornas, P.Dennett, A.Klimentov, A.Lebedev, B.Robichaux, G.Carosi

Year: LR-6 months
  Actions: set up the POCC “basic version” at CERN; set up the “AMS monitoring station” in MIT; conduct tests with ACOP/MSFC/JSC commanding and data transmission
  Responsible: P.Dennett, A.Eline, P.Fisher, A.Klimentov, A.Lebedev, “Finns” (?)

Year: LR
  Actions: set up the POCC “flight configuration” at JSC
  Responsible: A.Klimentov, B.Robichaux

Year: LR
  Actions: set up the SOC “flight configuration” at JSC; set up the “terminal room” and AMS CSR
  Responsible: V.Choutko, A.Eline, A.Klimentov, B.Robichaux

Year: L-2 weeks
  Actions: commanding and data transmission verification
  Responsible: A.Lebedev, P.Dennett

Year: L+2 months (tbd)
  Actions: set up the POCC “flight configuration” at CERN; move part of the SOC computers from JSC to CERN; set up the SOC “flight configuration” at CERN
  Responsible: A.Klimentov

Year: L+3 months (tbd)
  Actions: activate the AMS POCC at CERN; move all SOC equipment to CERN; set up the AMS POCC “basic version” at JSC
  Responsible: A.Klimentov, A.Lebedev, A.Eline, V.Choutko

LR – launch ready date : Sep 2007, L – AMS-02 launch date


Data Transmission

High-rate data transfer between MSFC (AL) and POCC/SOC, between POCC and SOC, and between SOC and the Regional Centers will be of paramount importance.

- Will AMS need a dedicated line to send data from MSFC to the ground centers, or can the public Internet be used?

- What software (SW) should be used for bulk data transfer, and how reliable is it?

- What data transfer performance can be achieved?

G.Carosi, A.Eline, P.Fisher, A.Klimentov
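Whatever bulk-transfer tool is eventually chosen (the tests on the following slides used a bbftp-derived program), its reliability can be checked end to end by checksumming every file before it is sent and re-checking it on arrival. A minimal sketch of such a wrapper follows; the transfer command is a stand-in (plain scp here) and the sidecar-file convention is hypothetical, not the AMS transfer software.

```python
#!/usr/bin/env python3
"""Checksum-verified bulk transfer sketch (illustrative, not the AMS transfer SW)."""
import hashlib
import subprocess
from pathlib import Path


def md5sum(path: Path, chunk: int = 1 << 20) -> str:
    """MD5 of a file, read in 1 MB chunks so multi-GB files fit in memory."""
    h = hashlib.md5()
    with path.open("rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()


def transfer(local: Path, remote_host: str, remote_dir: str) -> None:
    """Stand-in for the real bulk-transfer tool (a bbftp-like program in the AMS tests)."""
    subprocess.run(["scp", str(local), f"{remote_host}:{remote_dir}/"], check=True)


def send_with_verification(local: Path, remote_host: str, remote_dir: str) -> None:
    checksum = md5sum(local)
    # Hypothetical convention: ship a sidecar .md5 file so the receiving end
    # can recompute the checksum and confirm a loss-free transfer.
    sidecar = local.with_suffix(local.suffix + ".md5")
    sidecar.write_text(f"{checksum}  {local.name}\n")
    transfer(local, remote_host, remote_dir)
    transfer(sidecar, remote_host, remote_dir)
```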


Global Network Topology


‘amsbbftp’ tests CERN/MIT & CERN/SEU Jan/Feb 2003

A.Elin, A.Klimentov, K.Scholberg and J.Gong


Data Transmission Tests (conclusions)

- In its current configuration, the Internet provides sufficient bandwidth to transmit AMS data from MSFC (AL) to the AMS ground centers at rates approaching 9.5 Mbit/s
- We are able to transfer and store data on a high-end PC reliably, with no data loss
- Data transmission performance is comparable to what is achieved with network monitoring tools
- We can transmit data simultaneously to multiple sites
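As a rough cross-check of what a sustained 9.5 Mbit/s means in daily volume (decimal units assumed):

```python
# Back-of-the-envelope conversion of the measured rate into a daily volume.
rate_mbit_s = 9.5
gb_per_day = rate_mbit_s * 1e6 / 8 * 86400 / 1e9   # bit/s -> byte/s -> byte/day -> GB/day
print(f"{rate_mbit_s} Mbit/s sustained ~ {gb_per_day:.0f} GB/day")
# ~103 GB/day, i.e. well above the ~24 GB/day of AMS-02 raw data and the
# 55 GB/day MC-production average quoted later in this talk.
```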


Data and Computation for Physics Analysis

[Data flow diagram: detector raw data -> event filter (selection & reconstruction) -> event reconstruction -> processed data (event summary data, ESD/DST) and event tags; batch physics analysis extracts analysis objects by physics topic; interactive physics analysis uses the analysis objects; event simulation feeds the same chain.]
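The diagram above maps onto a simple chain of processing steps. The schematic sketch below shows that chain; all type and function names are purely illustrative and are not the AMS offline software.

```python
"""Schematic event-processing chain (names are illustrative, not AMS code)."""
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class RawEvent:          # detector raw data (or event simulation output)
    payload: bytes


@dataclass
class RecoEvent:         # processed data (ESD/DST-level summary) with an event tag
    tag: str
    tracks: list


def event_filter(raw: Iterable[RawEvent]) -> Iterator[RawEvent]:
    """Selection: keep only events worth reconstructing."""
    return (ev for ev in raw if len(ev.payload) > 0)


def reconstruct(ev: RawEvent) -> RecoEvent:
    """Event reconstruction: raw data -> ESD/DST-level summary."""
    return RecoEvent(tag="candidate", tracks=[])


def batch_analysis(esd: Iterable[RecoEvent], topic: str) -> list:
    """Batch physics analysis: extract analysis objects by physics topic."""
    return [ev for ev in esd if ev.tag == topic]


# Raw data and simulated events flow through the same chain.
raw_stream = [RawEvent(b"..."), RawEvent(b"")]
analysis_objects = batch_analysis(map(reconstruct, event_filter(raw_stream)), "candidate")
print(f"{len(analysis_objects)} analysis object(s) selected")
```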

Symmetric Multi-Processor (SMP) Model

[Diagram: experiment data, tape storage, and terabytes of disks around a central SMP machine.]

AMS SOC (Data Production requirements)

A complex system consisting of computing components including I/O nodes, worker nodes, data storage and networking switches. It should perform as a single system.

Requirements:
- Reliability: high (24 h/day, 7 days/week)
- Performance goal: process data “quasi-online” (with a typical delay < 1 day)
- Disk space: 12 months of data “online”
- Minimal human intervention (automatic data handling, job control and book-keeping)
- System stability: months
- Scalability
- Price/performance
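The “quasi-online” and “minimal human intervention” requirements amount to an automatic loop that notices newly arrived raw-data files, submits processing jobs and keeps a book-keeping record. A minimal sketch of such a loop is shown below; the directory layout, file naming and submit step are hypothetical, not the actual SOC software.

```python
#!/usr/bin/env python3
"""Minimal automatic data-handling loop (hypothetical names, not the AMS SOC code)."""
import json
import time
from pathlib import Path

RAW = Path("raw_incoming")            # hypothetical: where raw-data files appear
BOOKKEEPING = Path("processed.json")  # hypothetical: record of files already handled


def load_done() -> set:
    return set(json.loads(BOOKKEEPING.read_text())) if BOOKKEEPING.exists() else set()


def submit_job(path: Path) -> None:
    """Stub: hand the file to the batch system / reconstruction program."""
    print(f"submitting reconstruction job for {path.name}")


def main(poll_seconds: int = 60) -> None:
    done = load_done()
    RAW.mkdir(exist_ok=True)
    while True:                        # runs 24 h/day, 7 days/week
        for raw_file in sorted(RAW.glob("*.raw")):
            if raw_file.name not in done:
                submit_job(raw_file)
                done.add(raw_file.name)
                BOOKKEEPING.write_text(json.dumps(sorted(done)))
        time.sleep(poll_seconds)       # "quasi-online": delay well below one day


if __name__ == "__main__":
    main()
```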

Production Farm Hardware Evaluation

“Processing node”
- Processor: Intel PIV 3.4+ GHz, HT
- Memory: 1 GB
- System disk and transient data storage: 400 GB IDE disk
- Ethernet cards: 2 x 1 Gbit
- Estimated cost: 2500 CHF

Disk server
- Processor: dual-CPU Intel Xeon 3.2+ GHz
- Memory: 2 GB
- System disk: 18 GB SCSI, double redundant
- Disk storage: 3 x 10 x 400 GB RAID 5 array or 4 x 8 x 400 GB RAID 5 array; effective disk volume 11.6 TB
- Ethernet cards: 3 x 1 Gbit
- Estimated cost: 33000 CHF (or 2.85 CHF/GB)
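The quoted 2.85 CHF/GB follows directly from the disk-server cost and its effective volume (decimal units):

```python
# Disk-server price/performance check from the figures above.
cost_chf = 33000
effective_tb = 11.6                # effective RAID 5 volume quoted above
chf_per_gb = cost_chf / (effective_tb * 1000)
print(f"{chf_per_gb:.2f} CHF/GB")  # ~2.84 CHF/GB, consistent with the quoted 2.85
```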

AMS-02 Ground Centers.

Science Operations Center. Computing Facilities.

- Analysis Facilities (Linux cluster): interactive and batch physics analysis; 10-20 dual-processor PCs; AMS Physics Services
- Central Data Services: shared tape servers (5 PC servers, tape robots, LTO and DLT tape drives); shared disk servers (25 TB of disk, 6 PC-based servers); data servers
- Production Facilities: 40-50 dual-CPU Linux computers (Intel and AMD); batch data processing
- Engineering Cluster: 5 dual-processor PCs; home directories & registry; consoles & monitors
- CERN/AMS network
- AMS Regional Centers

AMS Science Operation Center Computing Facilities

[Diagram: the production farm is organized in cells (Cell #1 ... Cell #7), each made of PC Linux 3.4+ GHz processing nodes and a disk server, interconnected through a Gigabit switch (1 Gbit/s). Around the farm: data servers for AMS data, NASA data and metadata, an MC data server, Web/news/production/DB servers (PC Linux servers, 2 x 3.4+ GHz, RAID 5, 10 TB), an AFS server, archiving and staging via CERN CASTOR, and the analysis facilities. Legend: “tested, prototype in production” vs. “not tested and no prototype yet”.]

AMS-02 Science Operations Center

Year 2004

- MC Production (18 AMS Universities and Labs)
  - SW: data processing, central DB, data mining, servers
  - AMS-02 ESD format
- Networking (A.Eline, Wu Hua, A.Klimentov)
  - Gbit private segment and monitoring SW in production since April
- Disk servers and data processing (V.Choutko, A.Eline, A.Klimentov)
  - dual-CPU Xeon 3.06 GHz server with 4.5 TB of disk space, in production since January
  - data processing node: single-CPU PIV 3.4 GHz in Hyper-Threading mode, in production since January
  - 2nd server: dual-CPU Xeon 3.2 GHz, 9.5 TB, to be installed in August (3 CHF/GB)
- Data transfer station (Milano group: M.Boschini, D.Grandi, E.Micelotta and A.Eline)
  - data transfer to/from CERN (used for MC production)
  - station prototype installed in May
  - SW in production since January

Status report at the next AMS TIM

AMS-02 Science Operations Center

Year 2005

- Q1: SOC infrastructure setup
  - Bldg.892 wing A: false floor, cooling, electricity
- Mar 2005: set up the production cell prototype
  - 6 processing nodes + 1 disk server with private Gbit ethernet
- LR-24 months (LR: “launch ready date”), Sep 2005:
  - 40% production farm prototype (1st bulk computer purchase)
  - database servers
  - data transmission tests between MSFC AL and CERN
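The production cell prototype above (6 processing nodes plus one disk server on a private Gbit segment) can be written down as a small configuration structure. The sketch below reuses the node and server specifications from the hardware-evaluation slide; the host and switch names are hypothetical.

```python
"""Sketch of a production-cell description (hypothetical host names)."""


def make_cell(cell_id: int, n_nodes: int = 6) -> dict:
    """One SOC production cell: processing nodes + a disk server on a private Gbit switch."""
    return {
        "cell": cell_id,
        "switch": f"gsw-cell{cell_id:02d}",            # private Gbit ethernet segment
        "disk_server": {
            "host": f"dsrv{cell_id:02d}",
            "cpu": "2 x Xeon 3.2+ GHz",
            "raid5_tb": 10,
        },
        "processing_nodes": [
            {"host": f"pnode{cell_id:02d}-{i:02d}", "cpu": "PIV 3.4+ GHz HT", "ram_gb": 1}
            for i in range(1, n_nodes + 1)
        ],
    }


# Mar 2005 prototype: a single cell; the full farm scales this out to several cells.
prototype = make_cell(cell_id=1)
print(f"cell 1: {len(prototype['processing_nodes'])} processing nodes + 1 disk server")
```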

AMS-02 Computing Facilities .

Function: GSC@MSFC
  Computer: Intel (AMD) dual-CPU, 2.5+ GHz; Qty: 3; Disks/tapes: 3 x 0.5 TB RAID array; Ready(*): LR-2
Function: POCC
  Computer: Intel and AMD, dual-CPU, 2.8+ GHz; Qty: 45; Disks/tapes: 6 TB RAID array; Ready(*): LR
Function: POCC prototype @ JSC
  Computer: Intel and AMD, dual-CPU, 2.8+ GHz; Qty: 5; Disks/tapes: 1 TB RAID array; Ready(*): LR-6
Function: Monitor Station in MIT
  Computer: Intel and AMD, dual-CPU, 2.8+ GHz
Function: Science Operations Centre, Production Farm
  Computer: dual-CPU, 2.8+ GHz; Qty: 50; Disks/tapes: 10 TB RAID array; Ready(*): LR-2
Function: Science Operations Centre, Database Servers
  Computer: Intel or Sun SMP; Qty: 2; Disks/tapes: 0.5 TB; Ready(*): LR-3
Function: Science Operations Centre, Event Storage and Archiving
  Computer: disk servers, dual-CPU Intel 2.8+ GHz; Qty: 6; Disks/tapes: 50 TB RAID array, tape library (250 TB); Ready(*): LR
Function: Science Operations Centre, Interactive and Batch Analysis
  Computer: SMP computer (4 GB RAM, 300 SPECint95) or Linux farm; Qty: 10; Disks/tapes: 1 TB RAID array; Ready(*): LR-1

(*) “Ready” = operational; bulk of CPU and disk purchasing at LR-9 months.

People and Tasks (“my” incomplete list) 1/4

AMS-02 GSC@MSFC

- Architecture: A.Mujunen, J.Ritakari, P.Fisher, A.Klimentov
- POIC/GSC SW and HW: A.Mujunen, J.Ritakari
- GSC/SOC data transmission SW: A.Klimentov, A.Elin
- GSC installation: MIT, HUT
- GSC maintenance: MIT

Status:
- Concept was discussed with MSFC representatives
- MSFC/CERN and MSFC/MIT data transmission tests done
- HUT has no funding for Y2004-2005

People and Tasks (“my” incomplete list) 2/4

AMS-02 POCC

- Architecture: P.Fisher, A.Klimentov, M.Pohl
- TReKGate, AMS Cmd Station: P.Dennett, A.Lebedev
- Commanding SW and concept: G.Carosi, A.Klimentov, A.Lebedev
- Voice and Video: G.Carosi
- Monitoring: V.Choutko, A.Lebedev
- Data validation and online processing: V.Choutko, A.Klimentov
- HW and SW maintenance

More manpower will be needed starting from LR-4 months.

People and Tasks (“my” incomplete list) 3/4

AMS-02 SOC

- Architecture: V.Choutko, A.Klimentov, M.Pohl
- Data Processing and Analysis: V.Choutko, A.Klimentov
- System SW and HEP applications: A.Elin, V.Choutko, A.Klimentov
- Book-keeping and Database: M.Boschini et al., A.Klimentov
- HW and SW maintenance

More manpower will be needed starting from LR-4 months.

Status:
- SOC prototyping is in progress
- SW debugging during MC production
- The implementation plan and milestones are being fulfilled

People and Tasks (“my” incomplete list) 4/4

AMS-02 Regional Centers

- INFN Italy: P.G. Rancoita et al.
- IN2P3 France: G.Coignet and C.Goy
- SEU China: J.Gong
- Academia Sinica: Z.Ren
- RWTH Aachen: T.Siedenburg
- …
- AMS@CERN: M.Pohl, A.Klimentov

Status:
- The proposal prepared by the INFN groups for IGS, and by J.Gong/A.Klimentov for CGS, can be used by other Universities.
- Successful tests of distributed MC production and data transmission between AMS@CERN and 18 Universities.
- Data transmission, book-keeping and process communication SW (M.Boschini, V.Choutko, A.Elin and A.Klimentov) released.

AMS/CERN computing and manpower issues

AMS Computing and Networking requirements are summarized in a Memo:
- Nov 2005: AMS will provide a detailed SOC and POCC implementation plan
- AMS will continue to use its own computing facilities for data processing and analysis, and for Web and News services
- There is no request to IT for support of AMS POCC HW or SW
- SW/HW ‘first line’ expertise will be provided by AMS personnel
- Y2005-2010: AMS will have guaranteed bandwidth on the USA/Europe line
- CERN IT-CS support in case of USA/Europe line problems
- Data storage: AMS-specific requirements will be defined on an annual basis
- CERN support of mail, printing and CERN AFS as for the LHC experiments; any license fees will be paid by the AMS collaboration according to IT specs
- IT-DB and IT-CS may be called on for consultancy within the limits of available manpower

Starting from LR-12 months the Collaboration will need more people to run computing facilities



Year 2004 MC Production

- Started Jan 15, 2004
- Central MC Database
- Distributed MC Production
- Central MC storage and archiving
- Distributed access (under test)
- SEU Nanjing, IAC Tenerife and CNAF Italy joined production since Apr 2004
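The central MC database is essentially book-keeping: which center produced which dataset, how large it is, and whether it has been archived at CERN. A minimal sketch with SQLite follows; the schema, dataset names and field names are hypothetical, not the actual AMS production database.

```python
"""Minimal MC-production bookkeeping sketch (hypothetical schema, not the AMS DB)."""
import sqlite3

conn = sqlite3.connect("mc_bookkeeping.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS datasets (
        dataset   TEXT PRIMARY KEY,   -- e.g. 'protons.2004A.run0042' (hypothetical)
        center    TEXT NOT NULL,      -- producing site, e.g. 'CIEMAT'
        size_gb   REAL NOT NULL,
        archived  INTEGER DEFAULT 0   -- 1 once copied to central storage at CERN
    )""")

# A remote center registers a finished dataset ...
conn.execute("INSERT OR REPLACE INTO datasets VALUES (?, ?, ?, ?)",
             ("protons.2004A.run0042", "CIEMAT", 4.2, 0))
# ... and CERN marks it archived after transfer and verification.
conn.execute("UPDATE datasets SET archived = 1 WHERE dataset = ?",
             ("protons.2004A.run0042",))
conn.commit()

# Production summary per center (compare the per-center table on the next slide).
for center, total in conn.execute(
        "SELECT center, SUM(size_gb) FROM datasets GROUP BY center ORDER BY 2 DESC"):
    print(f"{center:10s} {total:8.1f} GB")
```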

Y2004 MC production centers

MC Center | Responsible | GB | %
CIEMAT | J.Casuas | 2045 | 24.3
CERN | V.Choutko, A.Eline, A.Klimentov | 1438 | 17.1
Yale | E.Finch | 1268 | 15.1
Academia Sinica | Z.Ren, Y.Lei | 1162 | 13.8
LAPP/Lyon | C.Goy, J.Jacquemier | 825 | 9.8
INFN Milano | M.Boschini, D.Grandi | 528 | 6.2
CNAF & INFN Bologna | D.Casadei | 441 | 5.2
UMD | A.Malinine | 210 | 2.5
EKP, Karlsruhe | V.Zhukov | 202 | 2.4
GAM, Montpellier | J.Bolmont, M.Sapinski | 141 | 1.6
INFN Siena & Perugia, ITEP, LIP, IAC, SEU, KNU | P.Zuccon, P.Maestro, Y.Lyublev, F.Barao, C.Delgado, Ye Wei, J.Shin | 135 | 1.6

MC Production Statistics

185 days, 1196 computers; 8.4 TB; 250 PIII 1 GHz CPUs/day
97% of MC production done; will finish by end of July
URL: pcamss0.cern.ch/mm.html

Particle | Million Events | % of Total
protons | 7630 | 99.9
helium | 3750 | 99.6
electrons | 1280 | 99.7
positrons | 1280 | 100
deuterons | 250 | 100
anti-protons | 352.5 | 100
carbon | 291.5 | 97.2
photons | 128 | 100
nuclei (Z = 3…28) | 856.2 | 85

Y2004 MC Production Highlights

- Data are generated at remote sites, transmitted to AMS@CERN and made available for analysis (only 20% of the data was generated at CERN)
- Transmission, process communication and book-keeping programs have been debugged; the same approach will be used for AMS-02 data handling
- 185 days of running (~97% stability)
- 18 Universities & Labs
- 8.4 TB of data produced, stored and archived
- Peak rate 130 GB/day (12 Mbit/s), average 55 GB/day (AMS-02 raw data transfer will be ~24 GB/day)
- 1196 computers
- Daily CPU equivalent: 250 x 1 GHz CPUs running 184 days, 24 h/day
- Good simulation of AMS-02 data processing and analysis

Not tested yet :

Remote access to CASTOR

Access to ESD from personal desktops

TBD :

AMS-01 MC production, MC production in Y2005


AMS-01 MC Production

Send request to [email protected]

A dedicated meeting will be held in September; the target date to start AMS-01 MC production is October 1st.