The Ranger Supercomputer and its legacy


The Ranger Supercomputer and its legacy
Dan Stanzione
Texas Advanced Computing Center
The University of Texas at Austin
December 2, 2013
[email protected]
The Texas Advanced Computing Center: A
World Leader in High Performance Computing
1,000,000x performance increase in UT computing capability in 10 years (computation 1,000,000x; network 100x).
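As a quick back-of-envelope check (my own arithmetic, not from the slides): a 1,000,000x increase over 10 years implies roughly a 4x improvement per year, i.e. capability doubling about every six months.

```python
# Back-of-envelope check of the "1,000,000x in 10 years" claim.
import math

total_factor = 1_000_000          # claimed overall capability increase
years = 10

annual_factor = total_factor ** (1 / years)                  # growth per year
doubling_months = 12 * math.log(2) / math.log(annual_factor)

print(f"Implied annual growth: {annual_factor:.1f}x per year")  # ~4.0x
print(f"Implied doubling time: {doubling_months:.1f} months")   # ~6 months
```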
Ranger: 62,976 Processor Cores,
123TB RAM, 579 TeraFlops, Fastest
Open Science Machine in the World,
2008
Lonestar: 23,000 processors,
44TB RAM, Shared Mem and GPU
subsystems, #25 in the world 2011
Stampede: #7 in the world today. Somewhere around half a million processor cores with Intel Sandy Bridge and Intel MIC, built by Dell: >10 Petaflops.
NSF Cyberinfrastructure Strategic Plan
circa 2007 – much of this never happened
• NSF Cyberinfrastructure Strategic Plan released March 2007
  – Articulates importance of CI overall
  – Chapters on computing, data, collaboration, and workforce development
• NSF investing in world-class computing
  – Annual “Track2” HPC systems ($30M)
  – Single “Track1” HPC system in 2011 ($200M)
• Complementary solicitations for software, applications, education
  – Software Development for CI (SDCI)
  – Strategic Technologies for CI (STCI)
  – Petascale Applications (PetaApps)
  – CI-Training, Education, Advancement, Mentoring (CI-TEAM)
  – Cyber-enabled Discovery & Innovation (CDI) starting in 2008: $0.75B!
http://www.nsf.gov/od/oci/CI_Vision_March07.pdf
First NSF Track2 System: 1/2 Petaflop
• TACC selected for first NSF ‘Track2’ HPC system
  – $30M system acquisition
  – Sun Constellation Cluster
  – AMD Opteron processors
• Project included 4 years of operations and support
  – System maintenance
  – User support
  – Technology insertion
  – Extended to 5 years
Ranger System Summary
• Compute power - 579 Teraflops
– 3,936 Sun four-socket blades
– 15,744 AMD Opteron “Barcelona” processors
• Quad-core, 2.3 GHz, four flops/cycle (dual pipelines)
• Memory - 125 Terabytes
– 2 GB/core, 32 GB/node
– 132 TB/s aggregate bandwidth
• Disk subsystem - 1.7 Petabytes
– 72 Sun x4500 “Thumper” I/O servers, 24TB each
– ~72 GB/sec total aggregate bandwidth
– 1 PB in largest /work filesystem
• Interconnect - 10 Gbps / ~2.3 μsec latency
– Sun InfiniBand-based switches (2) with 3456 ports each
– Full non-blocking 7-stage Clos fabric
– Mellanox ConnectX IB cards
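The headline numbers on this slide follow directly from the per-node figures. The short sketch below is a consistency check added here for illustration (it is not part of the original talk); it uses the 2.3 GHz Barcelona clock, which is the rate consistent with the 579 TFlop peak.

```python
# Consistency check of the Ranger system summary figures.
blades            = 3_936   # four-socket Sun blades
sockets_per_blade = 4
cores_per_socket  = 4       # quad-core AMD Opteron "Barcelona"
clock_ghz         = 2.3     # clock rate consistent with the 579 TF peak
flops_per_cycle   = 4       # four flops/cycle (dual FP pipelines)
mem_per_core_gb   = 2

cores       = blades * sockets_per_blade * cores_per_socket
peak_tflops = cores * clock_ghz * flops_per_cycle / 1_000    # GFlops -> TFlops
memory_tb   = cores * mem_per_core_gb / 1_000                # GB -> TB

print(f"Cores:        {cores:,}")                 # 62,976
print(f"Peak:         {peak_tflops:,.0f} TFlops") # ~579
print(f"Total memory: {memory_tb:,.0f} TB")       # ~126 (32 GB per node)
```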
Ranger I/O Subsystem
• Disk Object Storage Servers (OSS) based on Sun x4500 “Thumper” servers
  – Each x4500:
    • 48 SATA II 500GB drives (24TB total)
    • running internal software RAID
    • Dual-socket, dual-core Opterons @ 2.6 GHz
  – Downside: these nodes have PCI-X, and raw I/O bandwidth can exceed a single PCI-X 4X InfiniBand HCA
    • We use dual PCI-X HCAs
  – 72 servers total: 1.7 PB raw storage
• Metadata Servers (MDS) based on Sun Fire x4600s
• MDS is Fibre Channel-connected to 9TB FlexLine storage
• Target performance
  – Aggregate bandwidth: 70+ GB/sec
  – To largest $WORK filesystem: ~40 GB/sec
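The capacity and bandwidth targets roll up from the per-server figures. The sketch below is an illustrative calculation added here; the roughly 1 GB/s deliverable per OSS through its dual PCI-X HCAs is an assumption implied by, but not stated on, the slide.

```python
# Roll-up of the Ranger I/O subsystem from per-server figures.
oss_servers       = 72      # Sun x4500 "Thumper" OSS nodes
drives_per_server = 48      # SATA II drives per server
drive_capacity_tb = 0.5     # 500 GB each

# Assumption: ~1 GB/s deliverable per OSS through its dual PCI-X HCAs,
# which is what the 70+ GB/s aggregate target implies.
per_oss_bw_gbs = 1.0

raw_capacity_pb = oss_servers * drives_per_server * drive_capacity_tb / 1_000
aggregate_bw    = oss_servers * per_oss_bw_gbs

print(f"Raw capacity:        {raw_capacity_pb:.2f} PB")  # ~1.73 PB
print(f"Aggregate bandwidth: ~{aggregate_bw:.0f} GB/s")  # ~72 GB/s
```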
Ranger Space, Power, and Cooling
• Total Project Power: 3.4 MW
• System: 2.4 MW
– 96 racks – 82 compute, 12 support, plus 2 switches
– 116 APC In-Row cooling units
– 2,054 sqft total footprint (~4,500 sqft including PDUs)
• Cooling: ~1 MW
– In-row units fed by three 350-ton chillers (N+1)
– Enclosed hot-aisles by APC
– Supplemental 280 tons of cooling from CRAC units
• Observations:
– Space less an issue than power
– Cooling > 25kW per rack difficult
– Power distribution a challenge, almost 1,400 circuits
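The ">25 kW per rack" observation falls straight out of the system power and rack counts; a short calculation, added here for illustration:

```python
# Power density behind the "cooling > 25 kW per rack" observation.
system_power_mw = 2.4
total_racks     = 96
compute_racks   = 82   # 82 of the 96 racks hold compute blades

# Spread evenly over all racks, and over compute racks only:
print(f"~{system_power_mw * 1_000 / total_racks:.0f} kW per rack (all 96 racks)")  # ~25 kW
print(f"~{system_power_mw * 1_000 / compute_racks:.0f} kW per compute rack")       # ~29 kW
```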
Interconnect Architecture
Ranger InfiniBand Topology
[Topology diagram: NEMs (Network Express Modules) in each compute chassis connect to the two central 3,456-port “Magnum” switches over 12x InfiniBand links, three cables combined per link.]
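The 3,456-port “Magnum” switches and the 7-stage end-to-end Clos quoted in the system summary are consistent with a fabric built from 24-port InfiniBand switch chips. The sketch below applies the standard folded-Clos (fat-tree) port formula to show where those numbers come from; the 24-port radix is my assumption about the underlying silicon, not something stated on the slide.

```python
# Folded-Clos (fat-tree) port scaling with fixed-radix switch chips.
# A non-blocking fat tree built from k-port switches with L levels
# supports k * (k/2)**(L-1) end ports.

def fat_tree_ports(radix: int, levels: int) -> int:
    return radix * (radix // 2) ** (levels - 1)

k = 24  # assumed radix of the switch chips inside a "Magnum" switch

for levels in (1, 2, 3):
    print(f"{levels} level(s): {fat_tree_ports(k, levels):,} ports")
# 3 levels -> 3,456 ports: a 5-stage Clos inside each Magnum switch.
# Adding the NEM switch level in each chassis gives the 7-stage Clos
# fabric quoted in the Ranger system summary.
```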
Who Used Ranger?
• On Ranger alone, TACC has ~6,000 users
who have run about three million simulations
over the last four years.
– UT-Austin
– UT System (through UT Research
Cyberinfrastructure)
– Texas A&M and Texas Tech (through Lonestar
Partnership)
– Industry (through the STAR program)
– Users from around the nation and world (through
NSF’s TeraGrid/XSEDE)
Japanese Earthquake Simulation
• Simulation of the seismic wave from the earthquake in Japan, propagating through an earth model
• Researchers using TACC’s Ranger supercomputer have modeled the processes responsible for continental drift and plate tectonics in greater detail than any previous simulation.
• Modeling the propagation of seismic waves through the earth is an essential first step to inferring the structure of earth’s interior.
• This research is led by Omar Ghattas at The University of Texas at Austin.
Studying H1N1 (“Swine Flu”)
Researchers at the University of Illinois and the University of Utah used Ranger to simulate the molecular dynamics of antiviral drugs interacting with different kinds of flu.
Image produced by Brandt Westing, TACC
• They discovered how commercial medications reach the “binding
pocket” – and why Tamiflu wasn’t working on the new swine flu
strain.
• UT researcher Lauren Meyers also used Lonestar to predict the
best course of action in the event of an outbreak
Science at the Center of the Storm
Using the Ranger supercomputer at the Texas Advanced Computing Center, National Oceanic and Atmospheric Administration (NOAA) scientists and their university colleagues tracked Hurricanes Gustav and Ike during the 2008 storm season.
The real-time, high-resolution global and mesoscale (regional) weather predictions they produced used up to 40,000 processing cores at once — nearly two-thirds of Ranger — and included for the first time
data streamed directly from NOAA planes inside the
storm. The forecasts also took advantage of ensemble
modeling, a method of prediction that runs dozens of
simulations with slightly different starting points in
order to determine the most likely path and intensity
forecasts.
This new method and workflow were only possible
because of the massive parallel processing power that
TeraGrid resources can devote to complex scientific
problems and the interagency collaboration that
brought scientists, resources and infrastructure
together seamlessly.
A simulation of Hurricane Ike on TACC's
Ranger supercomputer shortly before the storm
made landfall in Galveston, Texas, on Sept. 13,
2008. Credit: NOAA; Bill Barth, John Cazes,
Greg P. Johnson, Romy Schneider and Karl
Schulz, TACC
Large Eddy Simulation of the Near-Nozzle Region
of Jets Exhausting from Chevron Nozzles
Noise from jet engines causes hearing damage in the
military and angers communities near airports. With
funding from NASA, Ali Uzun (Florida State University)
is using Ranger to simulate new exhaust designs that may
significantly reduce jet noise.
One way to minimize jet noise is to modify the turbulent
mixing process using special control devices, such as
chevrons—triangle-shaped protrusions at the end of the
nozzle. Since noise is a by-product of the turbulent
mixing of jet exhaust with ambient air, one can reduce the
noise by modifying the mixing process.
To determine how a given design would react to high-speed jet exhaust, Uzun first created a computer model of
the chevron-shaped exhaust nozzle. This was then
integrated into a parallel simulation code that calculated
the turbulence of the air as exhaust was forced through
the nozzle. Uzun’s simulations had unprecedented
resolution and detail. They proved that computational
simulations can match experimental results, while
supplying much more detailed information about minute
physical processes.
A two-dimensional cut through the jet flow, visualizing the turbulence and the resulting noise radiation away from the jet.
Ranger Project Costs
• NSF Award: $59M
– Purchases full system, plus initial test equipment
– Includes 4 years of system maintenance
– Covers 4 years of operations and scientific support
• UT Austin providing power: $1M/year
• UT Austin upgraded data center infrastructure: $10-15M
• TACC upgrading storage archival system: $1M
• Total cost: $75-80M
  – Thus, system cost > $50K/operational day
  – Must enable users to conduct world-class science every day!
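The ">$50K per operational day" figure is simple division; the check below is added for illustration and assumes roughly a four-year production life.

```python
# Cost-per-day check for the ">$50K/operational day" figure.
total_cost_low  = 75e6          # total project cost, USD (low end)
total_cost_high = 80e6          # total project cost, USD (high end)
operational_days = 4 * 365      # roughly four years of production

print(f"~${total_cost_low / operational_days:,.0f} per day (low end)")    # ~$51K
print(f"~${total_cost_high / operational_days:,.0f} per day (high end)")  # ~$55K
```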
Ranger-Era TeraGrid HPC Systems
Big Deployments Always Have
Challenges
• We’ve gotten extremely good at bringing in large deployments on time, but it is not an easy process.
  – Impossible to rely solely on vendors; it must be a cooperative process.
• Ranger slipped several months, and was changed from the
original proposed plan:
– Original 2 phase deployment scrapped in favor of a single larger phase.
– Several “early product” design flaws detected and corrected through the
course of the project.
Cable Manufacturing Defect
Illustration of problematic InfiniBand 12X cables resulting from kinks imposed by the initial manufacturing process: (left) a dismantled cable with the inner foil removed and (right) cracked twinax as seen through a microscope.
Ranger: Circa 2007
Ranger Lives On
• 20 Ranger cabinets have been sent to CHPC for
distribution to South African Universities
• 16 more racks have been shipped to Tanzania.
• 4 more racks are awaiting shipment to Botswana.
• Other components are at Texas A&M, Baylor College of Medicine, and ARL (UT classified facility).
• The original Ranger user community has now migrated to Stampede.
• After a remarkably successful production run, Ranger will
continue to deliver science and educate HPC
researchers around the world.
Ongoing Partnerships
• We at TACC are eager to use Ranger as a
basis for building sustained and meaningful
collaborations
• Hardware is a start (and there is always the *next* system), but training, staff development, data sharing, etc., provide new opportunities as well.