TRIUMF SITE REPORT
Update since HEPiX Spring 2005
Corrie Kost
TRIUMF Site Report for HEPiX, SLAC, October 10-14, 2005

Google Mini comes to TRIUMF
• $2995 US with 1 yr support
• Indexes up to 100,000 docs
• 220 different file formats
• Two 10/100 Ethernet ports
  - 1st for normal operation
  - 2nd for setup using cross-over cable
• 120GB Seagate drive
• 2GB memory
• Maintenance via special Google dial-up modem
Read a complete in-depth review at http://www.anandtech.com/IT/showdoc.aspx?i=2523&p=2

The TRIUMF-CERN 1GbE Lightpath(s)
• 1st GbE circuit established April 18th 2005
• 2nd GbE circuit established July 19th 2005
• TRIUMF
• BCNET
• CANARIE
• SURFnet
• CERN

ATLAS Service Challenge Servers
3 EM64T systems, each with:
  - 2 GB memory
  - hardware RAID: 3ware 9xxx SATA RAID controller
  - Seagate Barracuda 7200.8 drives in hardware RAID 5 (8 x 250 GB)
1 dual Opteron 246 server with:
  - 2 GB memory
  - 3ware 9xxx SATA RAID controller
  - WD Caviar SE drives in hardware RAID 0 (2 x 250 GB)
2 4560-SLX IBM tape libraries (currently each with only 1 SDLT 320 tape drive)
1 borrowed EM64T system used temporarily as an FTS server with:
  - 1 GB memory
  - 2 SATA 80 GB drives for the OS and for Oracle's needs
Storage:
  - 5.5+ TB disk
  - 8+ TB tape
http://grid.triumf.ca/status/sc3.html

ATLAS Service Challenge
[figure]

10 GbE Lightpath to CERN
[diagram: status of the path segments between CERN and TRIUMF, including the Atlantic crossing; six segments marked √, one marked X]

10 GbE Lightpath to CERN
• Permanent 10GbE TRIUMF-CERN lightpath ~ year-end 2005
• Foundry BigIron RX-4s at TRIUMF & BCNET

10 GbE Lightpath to CERN
[figure]

TRIUMF WAN
[diagram; labels as extracted]
• CWDM PROBLEM: MRV needs 1550 +/- 3 nm but Foundry is 1550 +/- 15 nm
• MRV CWDM, 4-port optical mux: 1610 nm, 1590 nm, 1570 nm, 1550 nm
• Single-pair fiber to BCNET, 22 km
• Passport 8600: 4 1GbE channels (ORAN, WESTGRID, 2x CERN); potential to add 2 more 1GbE channels
• 10GbE Foundry switch (CERN / Ottawa)
• SFP; 2x GbE TDM

Raid5: Puzzling I/O results
• Repeated reads on the same set of files (at 600MB/sec): one or more files will “degrade”, typically after the set of 16 8GB files has been read 1000 times.
• Positive: read ~2PB during 50 days, averaging about 600MB/sec
• 8 SATA disks on each of a pair of RAID5 RocketRaid 1820A controllers
[plot: 8GB file read time (sec, axis 0 to 20) vs. file number (1 to 81, same file every 16th), with a "TRANSITION" marked]

Unix Backups at TRIUMF
• Amanda system: dual Opteron 248 2.2 GHz
  - 2G memory
  - 16 x 400G WD disks ~ 6TB (1.5TB present sys ~ 10 day cycle)
  - 2 LSI MegaRAID 8-disk controllers
• Disk based, ~1 month of backups
  - At least 2 full backups with daily incrementals
• 26-slot Overland DLT tape library
• SDLT 600 drive, 300G native capacity per tape
• 150 Linux machines (users: home dir, servers: full)

Cheap Hot-Swap Backup
• Promise SuperSwap 1100 enclosures
• Four 400 GB Seagate SATA drives
• Promise FastTrak S150 SX4 SATA controller
• RAID 5
• Linux 2.4.20-8 RedHat 9
A disk can be removed at any time and replaced at any time. Rebuilds run in the background. Used to keep multiple live (daily) rsync copies (via Dirvish) of critical servers (for ~1 month). See http://www.dirvish.com/

VOIP coming to TRIUMF
[figure]

TRIUMF Ticketing System (Request Tracker)
[figures, two slides]

http://hepix.caspur.it/afs/hepix.org/projects.html

Conclusions / Observations
- Site services (Web, Email, Batch, Windows) all much more stable: new hardware, more memory (typically 4-8GB) in servers
- Quad Opteron SUN I/O, using external SATA, still limited below 1 GB/sec
- Read 16 8GB files repeatedly, averaging over 600MB/sec for ~2PB
- Site “Backup” services still problematic
  - tape media capacity (outgrow in 2 years)
  - reliability (is SDLT robust?)
- Permanent 10GbE TRIUMF-CERN service by year-end
- ATLAS Service Challenge targets being met for TRIUMF as TIER1
- Started using Plone as content management for TRIUMF Web Server
- Moving some phones to voice-over-IP
- Scientific Linux (3 & 4) still preferred Linux OS at TRIUMF
- Moving away from distributed printing to print/scan-to-email/copy stations

TRIUMF Servers – May/2005
[diagram; labels as extracted: GPS TIME MSR WEB NAME DOCUMENTS CONDORG WEB SHARE MAIL FILE IBM CLUSTER LCG STORAGE WORKER NODES FEDORA / SL MIRROR IBM / SHARE STORAGE STORM2 SUN1 Foundry STORM1 AMANDA BACKUP (VIA DISKS)]

TRIUMF Servers – October/2005
[figure]
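
The throughput and capacity figures quoted on the RAID5, ATLAS Service Challenge and Cheap Hot-Swap Backup slides can be cross-checked with a little arithmetic. The sketch below is only a back-of-the-envelope aid, not anything taken from the talk: the function names are hypothetical, units are assumed to be decimal (1 GB = 1000 MB, 1 PB = 1e9 MB), reading is assumed to be continuous, and RAID usable capacity uses the textbook rule (RAID 0 keeps all disks, RAID 5 loses one disk to parity) without filesystem overhead.

```python
SECONDS_PER_DAY = 86_400


def volume_pb(rate_mb_per_s: float, days: float) -> float:
    """Volume in PB moved at a constant rate (assumes 1 PB = 1e9 MB)."""
    return rate_mb_per_s * days * SECONDS_PER_DAY / 1e9


def mean_rate_mb_per_s(total_pb: float, days: float) -> float:
    """Average rate in MB/s needed to move total_pb in the given days."""
    return total_pb * 1e9 / (days * SECONDS_PER_DAY)


def read_time_s(file_gb: float, rate_mb_per_s: float) -> float:
    """Time to read one file at a constant rate (assumes 1 GB = 1000 MB)."""
    return file_gb * 1000 / rate_mb_per_s


def raid_usable_gb(disks: int, disk_gb: float, level: int) -> float:
    """Textbook usable capacity for RAID 0 and RAID 5 (no fs overhead)."""
    if level == 0:
        return disks * disk_gb
    if level == 5:
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb  # one disk's worth of parity
    raise ValueError(f"unsupported RAID level: {level}")


if __name__ == "__main__":
    # RAID5 slide: 600 MB/s held for all 50 days vs. the quoted ~2 PB.
    print(f"600 MB/s for 50 days: {volume_pb(600, 50):.2f} PB")        # 2.59 PB
    print(f"2 PB in 50 days:      {mean_rate_mb_per_s(2, 50):.0f} MB/s")  # 463 MB/s
    print(f"one 8 GB file at 600 MB/s: {read_time_s(8, 600):.1f} s")    # 13.3 s

    # ATLAS servers: 3 x (8 x 250 GB, RAID 5) + 1 x (2 x 250 GB, RAID 0).
    atlas_gb = 3 * raid_usable_gb(8, 250, 5) + raid_usable_gb(2, 250, 0)
    print(f"ATLAS disk: {atlas_gb / 1000:.2f} TB")                      # 5.75 TB

    # Hot-swap backup box: four 400 GB drives in RAID 5.
    print(f"hot-swap box: {raid_usable_gb(4, 400, 5) / 1000:.1f} TB")   # 1.2 TB
```

Under these assumptions the ATLAS disk total (5.75 TB) agrees with the quoted "5.5+ TB", and an 8 GB read at 600 MB/s (about 13 s) falls inside the 0 to 20 s range of the read-time plot. The "~2PB in 50 days" and "about 600MB/sec" figures differ by roughly a quarter when taken literally, which would be consistent with rounding or with the test not running for the full 50 days; the slides do not say which.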