IBM Research BlueGene/L Project Update William R. Pulleyblank IBM Thomas J. Watson Research Center February 2004 © 2004 IBM Corporation.
The Financial Times: IBM's new Blue Gene supercomputer
By Clive Cookson
Financial Times; Nov 14, 2003

A small prototype of International Business Machines' long-awaited Blue Gene supercomputer has begun operating at the company's US research centre. Although it is no larger than a domestic dishwasher - less than 1 per cent of the size of the first full-scale machine, due in 2005 - it will be named today as one of the 100 most powerful computers in the industry's Top500 supercomputer rankings.

William Pulleyblank, the IBM executive in charge of the Blue Gene project, says the prototype will be seen as a landmark in the history of computing: "It will revolutionise the way supercomputers are built and broaden the kinds of applications we can run on them."

The architecture is particularly suitable for tackling big scientific problems, such as modelling the way protein molecules fold - the application highlighted four years ago when Blue Gene was announced as a $100m (£60m) research project.

HPCwire 2003 Readers Choice Award

HPCwire, the journal of record for high performance computing, has announced its 2003 Readers Choice Awards, the first of their kind. These accolades, unprecedented in the history of the HPC industry, have been determined by polling a sample representative of 30% of HPCwire's worldwide readership as well as a panel of luminaries including Fran Berman, Jay Boisseau, John Hurley, Earl Joseph, and Cherri Pancake, designated as the Editors Choice. Because HPCwire readers are the most elite in the industry, the HPCwire Readers Choice Awards will be the most prestigious in the industry.
While the top tier of computation has seen its share of benchmarks, white papers, analyst reports, and legislative studies, HPCwire's Readers Choice Awards mark the first time that those on the front lines of both commercial and academic high performance computing have offered their personal input on exactly where the cutting edge of technology lies. The results are sure to provoke much controversy as well as create serious food for thought.

There were 14 categories in HPCwire's open-ended surveys, designed to provide the most meaningful responses possible. Following is a comprehensive list of the HPCwire 2003 Readers Choice Award Winners:

1. Most innovative overall HPC technology for 2003
   Editors Choice: IBM for BlueGene/L

Blue Gene program

December 1999: IBM Research announced a 5-year, US$100M effort to build a petaflop/s-scale supercomputer to attack science problems such as protein folding.
  Goals:
    Advance the state of the art of scientific simulation.
    Advance the state of the art in computer design and software for extremely large scale systems.
November 2001: Announced research partnership with Lawrence Livermore National Laboratory (LLNL).
November 2002: Announced planned acquisition of a BG/L machine by LLNL as part of the ASCI Purple contract.
June 2003: First chips completed.
November 2003: BG/L half rack prototype (512 nodes) ranked #73 on the 22nd Top500 List announced at SC2003 (1.435 TFlop/s).
  32-node system folding proteins live on the demo floor at SC2003.
February 2, 2004: Second pass BG/L chips delivered to Research.

BlueGene/L

Level          Contents                       Dimensions   Performance      Memory
System         64 cabinets                    64x32x32     180/360 TF/s     16 TB DDR
Cabinet        32 node boards                 8x8x16       2.9/5.7 TF/s     256 GB DDR
Node Board     32 chips (16 compute cards)    4x4x2        90/180 GF/s      8 GB DDR
Compute Card   2 chips                        2x1x1        5.6/11.2 GF/s    0.5 GB DDR
Chip           2 processors                                2.8/5.6 GF/s     4 MB

BG/L half rack prototype (October 2003)
  500 MHz
  512 nodes / 1024 processors
  2 TFlop/s peak
  1.4 TFlop/s sustained

BlueGene/L Interconnection Networks

3 Dimensional Torus
  Interconnects all compute nodes (65,536)
  Virtual cut-through hardware routing
  1.4 Gb/s on all 12 node links (2.1 GB/s per node)
  Communications backbone for computations
  350/700 GB/s bisection bandwidth
Global Tree
  One-to-all broadcast functionality
  Reduction operations functionality
  2.8 Gb/s of bandwidth per link
  Latency of tree traversal on the order of 5 µs
  Interconnects all compute and I/O nodes (1024)
Ethernet
  Incorporated into every node ASIC
  Active in the I/O nodes (1:64)
  All external communication (file I/O, control, user interaction, etc.)
Low Latency Global Barrier and Interrupt
Control Network

BG/L – Familiar software environment

Fortran, C, C++ with MPI
  Full language support
  Automatic SIMD FPU exploitation
Linux development environment
  Cross-compilers and other cross-tools execute on Linux front-end nodes
  Users interact with system from front-end nodes
Tools – support for debuggers, hardware performance monitors, trace based visualization
POSIX system calls – compute processes “feel like” they are executing on a Linux environment (restrictions)

Measured MPI Send Bandwidth and Latency

[Figure: measured MPI send bandwidth (MB/s, axis from 0 to 700) at 500 MHz versus message size (1 byte to 1,048,576 bytes), one curve each for 1, 2, 3, 4, 5 and 6 neighbors.]

Latency @ 500 MHz = 5.9 + 0.13 × “Manhattan distance” µs

NAS Parallel Benchmarks Class C on 256 nodes

[Figure: Mop/s/node (axis from 0 to 160) for BG/L and T3E on the BT, CG, EP, FT, IS, LU, MG and SP benchmarks.]

All NAS Parallel Benchmarks run successfully on 256 nodes (and many other configurations)
Compared 500 MHz BG/L and 450 MHz Cray T3E
All BG/L benchmarks were compiled with GNU and XL compilers
  No tuning / code changes
  Report best result (GNU for IS)
BG/L is a factor of two to three faster on five benchmarks (BT, FT, LU, MG, and SP), a bit slower on one (EP)

BG/L partners

Internal
  IBM Research
  Engineering and Technical Services (Rochester)
  Systems Group (Deep Computing, Software)
  IMD (EDRAM, fabrication)
  Software Group (Compilers)
External
  Lawrence Livermore National Labs
  Technical University of Vienna
  Columbia University
  University of Barcelona
Broad set of science collaborations

Faster than a speeding bullet, ASCI’s partnership with IBM is creating BlueGene/L – a new supercomputer design with nearly 10x the peak speed, in 1/5th the area, and using a fraction of the electrical power of comparable supercomputers.

Generating a theoretical peak computing speed of 360 trillion operations per second, occupying 2,500 ft² of floor space, and consuming 1.5 MW of electrical power (a fraction of the space and power needed by other supercomputers at this scale), BlueGene/L will likely be the fastest supercomputer on the planet when it is deployed in early 2005.

In the time that a speeding bullet could fly across BlueGene/L, this system can perform 10,000 global sums over a value stored in each of its 65,536 nodes.

More powerful than a locomotive, BlueGene/L will use the electrical power equivalent of a 2000-horsepower diesel engine, in the space of a moderately sized suburban home.

To match BlueGene/L’s prodigious peak compute capability, every man, woman and child on Earth would need to perform 60,000 calculations per second without transposing digits or forgetting to “carry the one”.

The enormous bandwidth of its internal communications networks will support 150 simultaneous telephone conversations for every person in the US.

To match its tremendous input rate, an individual would need to speed-read the complete works of Shakespeare in 1/1000th of a second.

Without getting writer’s cramp, in less than 10 minutes BlueGene/L can write the entire 20 TB book collection of the Library of Congress.

In the time it takes for an individual to say “Mississippi one”, BlueGene/L can send and receive 100,000 round-trip MPI messages between its 2^16 dual-processor nodes.

Unfortunately, at a weight of 30 metric tons, BlueGene/L is not able to leap tall buildings in a single bound.
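
Several of the figures quoted on the preceding slides can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not anything taken from the BG/L software; all function and constant names are hypothetical. It makes three assumptions that are not in the slides: a world population of roughly 6.4 billion, a reading of "Manhattan distance" as the hop count along the shorter way around each torus ring, and the extrapolation of the latency formula (presumably measured on prototype hardware) to the full 64x32x32 system.

```python
# Back-of-envelope check of figures quoted on the slides.
# Function and constant names are hypothetical, chosen for this sketch.

# Packaging hierarchy, as given on the BlueGene/L packaging slide.
CABINETS = 64
NODE_BOARDS_PER_CABINET = 32
COMPUTE_CARDS_PER_NODE_BOARD = 16
CHIPS_PER_COMPUTE_CARD = 2          # one chip is one dual-processor node

CHIP_PEAK_GFLOPS = (2.8, 5.6)       # the two per-chip figures on the slide
TORUS_DIMS = (64, 32, 32)           # full-system dimensions from the slide

WORLD_POPULATION = 6.4e9            # ASSUMPTION: rough 2004 estimate, not from the slides


def total_nodes():
    """Number of compute nodes (chips) in the full system."""
    return (CABINETS * NODE_BOARDS_PER_CABINET
            * COMPUTE_CARDS_PER_NODE_BOARD * CHIPS_PER_COMPUTE_CARD)


def system_peak_tflops():
    """System figure in TF/s for each of the two per-chip figures."""
    return tuple(total_nodes() * g / 1000.0 for g in CHIP_PEAK_GFLOPS)


def torus_hops(src, dst, dims=TORUS_DIMS):
    """Hop count between two nodes, going the shorter way around each ring.

    ASSUMPTION: the slide's "Manhattan distance" is read as this hop count.
    """
    return sum(min(abs(a - b), d - abs(a - b))
               for a, b, d in zip(src, dst, dims))


def mpi_latency_us(hops):
    """Latency model quoted for 500 MHz: 5.9 + 0.13 * distance, in microseconds."""
    return 5.9 + 0.13 * hops


if __name__ == "__main__":
    print("nodes:", total_nodes())
    print("system TF/s:", system_peak_tflops())
    print("calculations per person per second:", 360e12 / WORLD_POPULATION)
    farthest = torus_hops((0, 0, 0), (32, 16, 16))
    print("modelled latency to farthest node (us):", mpi_latency_us(farthest))
```

The node count works out to the 65,536 quoted on the slides, and the computed system figures land slightly above the rounded 180/360 TF/s. The per-person result depends entirely on the assumed population, so it should only be expected to agree with the slide's 60,000 in order of magnitude.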
Agenda

10:00 - 10:30   Kick off                                        Tilak Agerwala, Bill Pulleyblank
10:30 - 11:00   BG/L Architecture                               Alan Gara
11:00 - 11:30   Power, Packaging, Cooling                       Todd Takken
11:30 - 12:00   Bring up                                        Burkhard Steinmacher-Burow
12:00 - 1:00    Lunch Break
1:00 - 1:30     System Software Architecture                    Derek Lieber
1:30 - 2:00     MPI on BG/L                                     Gheorghe Almasi
2:00 - 2:30     Benchmarks on BG/L: Parallel and Serial         John Gunnels
2:30 - 3:00     Blue Gene Science and Application               Robert Germain
3:00 - 3:30     Refreshments
3:30 - 4:30     Panel: "BG/L: The next 100 weeks"               Gyan Bhanot (Moderator)
4:30 - 5:00     BG/L Production and Manufacturing               Tom Liebsch
5:00 - 5:30     Blue Gene Program Research Challenges           George Chiu