Technical computing: Observations on an ever changing, occasionally repetitious, environment
Los Alamos National Laboratory
17 May 2002
Copyright Gordon Bell, LANL, 5/17/2002
A brief, simplified history of HPC
1. Sequential & data parallelism using shared memory, Cray’s Fortran computers, 60-02 (US: 90)
2. 1978: VAXen threaten general purpose centers…
3. NSF response: form many centers, 1988 - present
4. SCI: search for parallelism to exploit micros, 85-95
5. Scalability: “bet the farm” on clusters. Users “adapt” to clusters aka multi-computers with the LCD program model, MPI. >95
6. Beowulf clusters adopt standardized hardware and Linus’s software (Linux) to create a standard! >1995
7. “Do-it-yourself” Beowulfs impede new structures and threaten g.p. centers >2000
8. 1997-2002: Let’s tell NEC they aren’t “in step”.
9. High speed networking enables peer-to-peer computing and the Grid. Will this really work?
Outline
- Retracing scientific computing evolution: Cray, SCI & “killer micros”, ASCI, & clusters kick in
- Current taxonomy: cluster flavors
- Déjà vu, the rise of commodity computing: Beowulfs are a replay of VAXen c1978
- Centers: 2+1/2 at NSF; BRC on CyberInfrastructure urges $650M/year
- Role of the Grid and peer-to-peer
- Will commodities drive out or enable new ideas?
DARPA SCI: c1985-1995; prelude to DOE’s ASCI
- Motivated by the Japanese 5th Generation project … note the creation of MCC
- Realization that “killer micros” were …
- Custom VLSI and its potential
- Lots of ideas to build various high performance computers
- Threat and potential sale to military
[Photo: Steve Squires & G. Bell at our “Cray” at the start of DARPA’s SCI, c1984.]
What Is the System Architecture? (GB c1990)
[Diagram: MIMD computation divides into single-address-space, shared-memory multiprocessors and multiple-address-space, message-passing multicomputers; SIMD is marked as a dead branch (X).]
- Multiprocessors: single address space, shared memory computation
  - Central memory multiprocessors (not scalable)
    - Cross-point or multi-stage: Cray, Fujitsu, Hitachi, IBM, NEC, Tera
    - Simple ring multi … bus multi replacement
    - Bus multis: DEC, Encore, NCR, Sequent, SGI, Sun
  - Distributed memory multiprocessors (scalable)
    - Dynamic binding of addresses to processors: KSR
    - Static binding, ring multi: IEEE SCI proposal
    - Static binding, caching: Alliant, DASH
    - Static run-time binding: research machines
- Multicomputers: multiple address space, message passing computation
  - Distributed multicomputers (scalable)
    - Mesh connected: Intel
    - Butterfly/fat tree/cubes: CM5, NCUBE
    - Switch connected: IBM
  - Fast LANs for high availability and high capacity clusters: DEC, Tandem
  - LANs for distributed processing: workstations, PCs; GRID
Processor Architectures? Vectors or …?
- CS view: MISC >> CISC >> RISC >> VCISC (language directed, vectors) >> RISC >> super-scalar >> extra-long instruction word >> massively parallel (SIMD). Caches mostly alleviate the need for memory B/W.
- Supercomputer designer's view: VECTORS (multiple pipelines); memory B/W = performance.
The Bell-Hillis Bet c1991
Massive (>1000-processor) parallelism in 1995: the bet compared TMC with the world-wide supercomputer industry on three measures: applications, petaflops per month, and revenue.
Results from DARPA’s SCI c1983
- Many research and construction efforts … virtually all new hardware efforts failed except Intel and Cray.
- DARPA-directed purchases screwed up the market, including the many VC-funded efforts.
- No software funding!
- Users responded to the massive power potential with LCD software.
- Clusters, clusters, clusters using MPI. (A minimal sketch of that model follows below.)
- It’s not scalar vs. vector, it’s memory bandwidth!
  - 6-10 scalar processors = 1 vector unit
  - 16-64 scalars = a 2 - 6 processor SMP
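Since the deck keeps returning to MPI as the lowest-common-denominator programming model for clusters, here is a minimal sketch of that model in C; the program is illustrative, not from the talk, and assumes a standard MPI installation (mpicc, mpirun).

```c
/* Minimal sketch of the MPI message-passing model referred to above:
 * every rank computes a partial result and rank 0 collects the sum.
 * Illustrative only; build with mpicc and launch with mpirun. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* number of processes */

    double partial = (double)rank;          /* stand-in for real work */
    double total   = 0.0;
    MPI_Reduce(&partial, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d ranks = %g\n", size, total);

    MPI_Finalize();
    return 0;
}
```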
Dead Supercomputer Society
- ACRI
- Alliant
- American Supercomputer
- Ametek
- Applied Dynamics
- Astronautics
- BBN
- CDC
- Convex
- Cray Computer
- Cray Research
- Culler-Harris
- Culler Scientific
- Cydrome
- Dana/Ardent/Stellar/Stardent
- Denelcor
- Elexsi
- ETA Systems
- Evans and Sutherland Computer
- Floating Point Systems
- Galaxy YH-1
- Goodyear Aerospace MPP
- Gould NPL
- Guiltech
- Intel Scientific Computers
- International Parallel Machines
- Kendall Square Research
- Key Computer Laboratories
- MasPar
- Meiko
- Multiflow
- Myrias
- Numerix
- Prisma
- Tera
- Thinking Machines
- Saxpy
- Scientific Computer Systems (SCS)
- Soviet Supercomputers
- Supertek
- Supercomputer Systems
- Suprenum
- Vitesse Electronics
What a difference 25 years AND spending >10x makes!
ESRDC (the Earth Simulator): 40 Tflops across 640 nodes (8 x 8 Gflops vector processors per node), versus LLNL’s 150 Mflops machine room c1978.
Computer types
[Diagram: computer types arranged along a connectivity axis from WAN/LAN through SAN and DSM to shared memory (SM).]
- WAN/LAN: networked supers, GRID, Legion & P2P, Condor
- SAN clusters: T3E, SP2 (mP) clusters, Beowulf, NOW, NT clusters, VPP (uni)
- DSM: SGI DSM workstations and PCs
- SM: NEC super, NEC mP, Cray X…T (all mP vector)
- “Old world”: mainframes and multis
Top500 taxonomy… everything is a cluster aka multicomputer
- Clusters are the ONLY scalable structure
  - Cluster: n inter-connected computer nodes operating as one system. Nodes: uni-processor or SMP. Processor types: scalar or vector.
- MPP = miscellaneous, not massive (>1000), SIMD, or something we couldn’t name
- Cluster types (message passing implied; see the encoding sketch after this list):
  - Constellations = clusters of >=16-processor SMPs
  - Commodity clusters of uni- or <=4-processor SMPs
  - DSM: NUMA (and COMA) SMPs and constellations
  - DMA clusters (direct memory access) vs. message passing
  - Uni- and SMP vector clusters: vector clusters and vector constellations
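To make the taxonomy above concrete, here is a small, hypothetical C encoding of it; the type names and thresholds simply mirror the slide (>=16-processor SMP nodes means constellation, uni or <=4-processor nodes means commodity cluster) and are not any official Top500 interface.

```c
/* Hypothetical encoding of the cluster taxonomy above; the thresholds
 * follow the slide's definitions, nothing more. */
#include <stdio.h>

enum proc_kind { SCALAR, VECTOR };

struct node {
    int procs;               /* processors per node: 1 = uni, >1 = SMP */
    enum proc_kind kind;     /* scalar or vector processors */
};

static const char *classify(struct node n)
{
    if (n.kind == VECTOR)
        return n.procs >= 16 ? "vector constellation" : "vector cluster";
    if (n.procs >= 16)
        return "constellation (SMP nodes of >=16 processors)";
    if (n.procs <= 4)
        return "commodity cluster (uni or <=4-processor nodes)";
    return "SMP cluster";
}

int main(void)
{
    struct node beowulf = { 2, SCALAR };   /* dual-processor PC nodes */
    struct node sx_node = { 8, VECTOR };   /* 8-way vector SMP nodes */
    printf("%s\n%s\n", classify(beowulf), classify(sx_node));
    return 0;
}
```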
Linux - a web phenomenon
- Linus Torvalds writes a news reader for his PC
- Puts it on the Internet for others to play with
- Others add to it, contributing to open source software
- Beowulf adopts early Linux
- Beowulf adds Ethernet drivers for essentially all NICs
- Beowulf adds channel bonding to the kernel
- Red Hat distributes Linux with Beowulf software
- Low level Beowulf cluster management tools added
The Challenge leading to Beowulf
- NASA HPCC Program begun in 1992
- Comprised Computational Aero-Science and Earth and Space Science (ESS)
- Driven by the need for post-processing data manipulation and visualization of large data sets
- Conventional techniques imposed long user response times and shared-resource contention
- Cost low enough for a dedicated single-user platform
- Requirement: 1 Gflops peak, 10 Gbyte, < $50K (the cost gap is worked out below)
- Commercial systems: $1,000/Mflops, i.e. $1M/Gflops
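The gap between that requirement and commercial pricing is worth writing out; a tiny C calculation using only the numbers on the slide:

```c
/* Beowulf target vs. commercial pricing, figures from the slide:
 * 1 Gflops peak for under $50K, against roughly $1,000 per Mflops
 * (i.e. $1M per Gflops) for commercial systems. */
#include <stdio.h>

int main(void)
{
    double target_gflops   = 1.0;      /* required peak */
    double budget          = 50e3;     /* dollars */
    double dollars_per_mfl = 1000.0;   /* commercial rate */

    double commercial_cost = target_gflops * 1000.0 * dollars_per_mfl;
    printf("commercial: $%.0f, about %.0fx the $%.0f budget\n",
           commercial_cost, commercial_cost / budget, budget);
    /* prints: commercial: $1000000, about 20x the $50000 budget */
    return 0;
}
```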
The Virtuous Economic Cycle drives the PC industry… & Beowulf
[Cycle diagram: standards attract users; users create apps, tools, and training; that attracts suppliers; suppliers bring greater availability at lower cost, which in turn attracts more users and reinforces the standards.]
Lessons from Beowulf
- An experiment in parallel computing systems
- Established vision: low cost high end computing
- Demonstrated effectiveness of PC clusters for some (not all) classes of applications
- Provided networking software
- Provided cluster management tools
- Conveyed findings to the broad community
- Tutorials and the book
- Provided design standard to rally community!
- Standards beget: books, trained people, software … a virtuous cycle that allowed apps to form
- Industry begins to form beyond a research project
Courtesy, Thomas Sterling, Caltech.
Clusters: Next Steps
Scalability…
- They can exist at all levels: personal, group, … centers
- Clusters challenge centers… given that smaller users get small clusters
Disk Evolution
[Chart: byte-capacity scale from kilo, mega, giga, tera, peta, exa, zetta, to yotta.]
- Capacity: 100x in 10 years; 1 TB 3.5” drive in 2005; 20 TB? in 2012?! (growth-rate arithmetic sketched below)
- System on a chip
- High-speed SAN
- Disk replacing tape
- Disk is supercomputer!
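As a sanity check on those capacity points, a small C sketch of the compound growth implied by 100x in 10 years; the 2005 and 2012 figures come from the slide, while the constant-growth assumption is mine.

```c
/* 100x capacity growth in 10 years is about 58% per year
 * (100^(1/10) ~= 1.585). Projecting a 1 TB drive in 2005 forward
 * at that rate lands near the slide's 20 TB guess for 2012. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double annual  = pow(100.0, 1.0 / 10.0);      /* ~1.585x per year */
    double tb_2005 = 1.0;
    double tb_2012 = tb_2005 * pow(annual, 7.0);  /* 7 years later */
    printf("annual growth: %.2fx, projected 2012 capacity: %.0f TB\n",
           annual, tb_2012);                      /* ~25 TB */
    return 0;
}
```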
Intermediate Step: Shared Logic
- Brick with 8-12 disk drives
- 200 mips/arm (or more)
- 2 x Gbps Ethernet
- General purpose OS
- 10k$/TB to 100k$/TB
- Shared: sheet metal, power, support/config, security, network ports
Pictured bricks: Snap ~1 TB (12 x 80 GB NAS), NetApp ~0.5 TB (8 x 70 GB NAS), Maxstor ~2 TB (12 x 160 GB NAS), IBM TotalStorage ~360 GB (10 x 36 GB NAS).
These bricks could run applications, e.g. SQL, Mail…
SNAP Architecture
RLX “cluster” in a cabinet
- 366 servers per 44U cabinet
  - Single processor each
  - 2 - 30 GB/computer (24 TBytes)
  - 2 - 100 Mbps Ethernets
- ~10x performance*, power, disk, and I/O per cabinet
- ~3x price/performance
- Network services…
- Linux based
* versus 42 two-processor servers, 84 Ethernet ports, 3 TBytes
Computing in small spaces @ LANL (RLX cluster in a building with NO A/C)
- 240 processors @ 2/3 Gflops
- Filling the 4 racks gives a teraflops (rough arithmetic sketched below)
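A rough back-of-envelope in C for that claim, using only the per-processor rate and processor count above; the assumption that peak scales linearly with processor count is mine, and the rack contents are not specified on the slide.

```c
/* Back-of-envelope for the LANL RLX figures: 240 processors at
 * roughly 2/3 Gflops each, and how many such processors a peak
 * teraflops would take, assuming peak = processors * per-proc rate. */
#include <stdio.h>

int main(void)
{
    double per_proc = 2.0 / 3.0;              /* Gflops per processor */
    double current  = 240 * per_proc;         /* installed today */
    double needed   = 1000.0 / per_proc;      /* processors for 1 Tflops */
    printf("current: %.0f Gflops; ~%.0f processors needed for 1 Tflops\n",
           current, needed);                  /* 160 Gflops; ~1500 procs */
    return 0;
}
```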
Beowulf Clusters: space
[Chart: performance/space ratio in Mflops per sq. ft., comparing ASCI White with a bladed Beowulf.]
Beowulf clusters: power
[Chart: performance/power ratio in Mflops per watt, comparing a conventional Beowulf with a bladed Beowulf.]
“The network becomes the system.” - Bell, 2/10/82, Ethernet announcement with Noyce (Intel) and Liddle (Xerox)
“The network becomes the computer.” - Sun slogan, >1982
“The network becomes the system.” - GRID mantra, c1999
Computing SNAP built entirely from PCs
[Diagram: a space, time (bandwidth), and generation scalable environment. A wide-area global network and mobile nets connect wide & local area networks serving terminals, PCs, workstations, and servers; person servers (PCs); portables; TC = TV + PC in the home (via CATV, ATM, or satellite); legacy mainframes and minicomputers living on as servers; and centralized & departmental uni- and mP servers (UNIX & NT) built from PCs, i.e. scalable computers built from PCs.]
The virtuous cycle of bandwidth supply and demand
[Cycle diagram: standards and new services (Telnet & FTP, email, WWW, audio, voice!, video) increase demand; increased demand drives increased capacity (circuits & bandwidth); added capacity lowers response time and enables creation of the next new service.]
Internet II concerns, given its $0.5B cost
- Very high cost
  - Disks cost $1/GByte to purchase!
  - $(1 + 1)/GByte to send on the net; FedEx and 160 GByte shipments are cheaper (compared in the sketch below)
- Low availability of fast links (the last mile problem)
  - DSL at home is $0.15 - $0.30
  - Labs & universities have DS3 links at most, and they are very expensive
- Traffic: instant messaging, music stealing
- Performance at the desktop is poor
  - 1 - 10 Mbps; very poor communication links
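A small C comparison of those two claims, using the slide's $(1+1)/GByte network cost and 10 Mbps desktop speed; the courier charge is an assumed, illustrative figure, not from the talk.

```c
/* Shipping 160 GB over the net at $(1+1)/GByte versus a disk in a box.
 * The $50 courier charge is an assumption for illustration; the $2/GB
 * network figure and the 1-10 Mbps desktop speed are from the slide. */
#include <stdio.h>

int main(void)
{
    double gbytes        = 160.0;
    double net_cost      = gbytes * (1.0 + 1.0);            /* $320 */
    double courier_cost  = 50.0;                            /* assumed */
    double desktop_mbps  = 10.0;                            /* best case */
    double transfer_days = gbytes * 8e3 / desktop_mbps / 86400.0;

    printf("net: $%.0f and %.1f days at %g Mbps; courier: ~$%.0f overnight\n",
           net_cost, transfer_days, desktop_mbps, courier_cost);
    return 0;
}
```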
Scalable computing: the effects
- They come in all sizes; incremental growth
  - 10 or 100 to 10,000 (100X for most users)
  - debug vs. run; problem growth
- Allows compatibility heretofore impossible
  - 1978: VAX chose Cray Fortran
  - 1987: The NSF centers went to UNIX
  - Users chose the sensible environment
- The role of general purpose centers (e.g. NSF, state) is unclear. Necessity for support?
  - Acquisition and operational costs & environments
  - Cost to use as measured by the user’s time
  - Scientific data for a given community… community programs and data
  - Managing GRID discipline
- Are clusters ≈ Gresham’s Law? Do they drive out alternatives?
The end