AMD Opteron™ Overview
Top Level Agenda
• The HPC Market & AMD
• AMD64 – A Programmers View
• AMD Opteron Processor – The HW
– Core Improvements
– Integrated Memory Controller
– HyperTransport Technology Clustering
Performance
• System Solutions & Applications
• Development platforms
• Recent Events
• Summary
July 7, 2015
Computation Products Group
2
Computing System Evolution:
Mainframes to desktops to clusters
• Mainframes ~ 1965
– Tightly coupled processor, computer, OS and software from a single company
– Proprietary software
– >$1M
• Departmental Minicomputers ~ 1970
– Significant proliferation of servers as machines leave glass houses
– <$1M
• RISC Processors ~1982
– The beginning of “Commodity Computer Market”
– Processor, Computer and Software begin to un-bundle
– A proliferation of new companies (SUN, MIPS, SGI) with the promise of significantly
improved performance
• Desktop ~ 1985
– Commodity market takes hold - IBM and Apple emerge as market leaders
• Cluster HPC ~ 1992 Onwards
– Gradual and steady growth of clustered systems for HPC
Enterprises want legacy compatibility
IDC's Worldwide Quarterly Server Forecast- 12/12/02
[Chart: server forecast by architecture (IA32, IA64, RISC, CISC), 2000 to 2006, on a vertical scale of 0 to 8,000,000. Red labels = total market growth; black labels = IA32 growth. An annotation marks the target market for the AMD Opteron™ processor*.]
• The growth in today’s server market is toward technology based on the x86 platform.
• Competing 64-bit products are niche, proprietary technologies that are expensive and difficult to integrate with existing systems.
*Based on AMD market trend research
HPC Projected Market Breakout
What will happen to processor distribution in the next three years?
• HPC market will continue to grow >12%/yr
• Alpha & PA-RISC represent 32% of TAM -> 0%
• SGI and SPARC are losing market share
• Intel and AMD are focused on the prize
– 32% + 10% x86-32 = 42% up for grabs
– ~3M processors over the next three years
Processor share of the HPC market:
Architecture    | 2002 ($7B) | 2005 (>$10B)
Alpha           | 19%        | 0%
PA-RISC         | 13%        | 0%
SPARC           | 20%        | 16%
Power PC        | 21%        | 23%
x86-32 (IA32)   | 10%        | 19%
SGI             | 7%         | 5%
Vector          | 7%         | 7%
Itanium         | 3%         | 8%
AMD64           | -          | 22%
Evolving Server Market
“… open source platforms and mass commoditization, will
drive the server market through the next decade” *
• Standards-based, volume-driven, low-cost solutions
• Enhanced computing resources with support for legacy applications
• Improved return on investment (ROI)
• Examples:
– Windows NT/2000 leading the field with more than 55% share in the
SMB segment **
– Migration away from Precision Architecture (PA)-RISC and Alpha™
Processors and toward x86-based systems
– Revenue dollars for RISC-based servers are expected to be eclipsed by
other computing platforms for the first time in 2002
* Source IDC 2000
** Source IDC, 2002
AMD Server Successes
Key AMD Opteron™ processor milestones
• February 2002 – First public demonstrations of upcoming AMD Opteron™ processor, running in 32-bit mode.
• March 2002 – SuSE Linux AG announces its Linux OS will offer full 64-bit support for AMD's family of 64-bit processors.
• April 2002 – AMD announces the AMD Opteron processor brand for multiprocessor workstations and servers.
• April 2002 – Microsoft announces intent to incorporate 64-bit support for AMD Athlon™ and AMD Opteron processors.
• June 2002 – AMD demos first 4-way AMD Opteron processor-based server at Computex.
• July 2002 – IBM announces 64-bit enablement of the DB2 database software for the upcoming AMD Opteron processors.
• August 2002 – Red Hat announces it will offer global support for the upcoming AMD Opteron and AMD Athlon processors.
• October 2002 – Cray and Sandia National Labs announce plans to build “Red Storm,” a massively parallel processing supercomputer with >10,000 AMD Opteron processors.
• January 2003 – IBM announces beta availability of DB2 for AMD Opteron processors.
AMD Athlon™ processors in use…
• Rhythm & Hues employed 42 dual processor-based Angstrom Microsystems servers running AMD Athlon™ processors in its render farm.
• Pre-visualization by JAK Films and post-production by Industrial Light & Magic used systems powered by AMD Athlon™ MP processors. Used in Star Wars – Episode II
• Boeing selected AMD processors to power a 96-node AMD Athlon™ super-cluster to support its Delta IV program. Cluster used to simulate aerodynamic performance of rocket family.
• AMD Athlon™ MP processors power creativity in Spy Kids 2: The Island of Lost Dreams
• Electronic Arts to use AMD Athlon™ processor-based systems worldwide
• Mercedes-Benz Technology Center (MTC) in Germany utilizes a cluster based on AMD Athlon processors to do crash simulation analysis on its vehicles.
• Veritas Geophysical is processing high-resolution seismic data with an AMD Athlon™ processor-based Racksaver cluster that provides an incredible amount of computing power under one roof.
Some of the most powerful computers in the world today – driven by AMD!
Top 500 Supercomputing List (as of November 2002)
UMEA University
HPC2N Super Cluster
68th with 480 processor system running AMD Athlon processors
124th with 120 dual processor nodes running AMD Athlon processors
University of Bochum
215th with 264 processor Racksaver system running AMD Athlon processors
304th – Helix, a Megware system with 132 AMD Athlon processors
302nd – MegWare System with 128 AMD Athlon processors
University of Vienna
370th – Schroedinger I, a self-made system with 160 AMD Athlon processors
Heidelberg University
139th – Prairiefire, an Atipa Technology system with 246 AMD Athlon processors
64th with 256 dual processor nodes running AMD Athlon processors
The AMD Opteron™ processor
Project “Red Storm”
• Cray will build a 40+ teraflop supercomputer using x86-64 AMD Opteron™ processors for Sandia National Laboratories
• Will be used for advanced engineering simulations
• $90 million project will use more than 10,000 AMD Opteron™ processors
• Will feature a simple building block approach with HyperTransport™ technology that will enable easy implementation and reduce engineering, design, and component costs
Programmer's Point of View
x86 in High Performance Computing - The Six System Challenges
• x86 is the most widely installed instruction set in the world.
• Instruction set not relevant to CPU performance (“to first order”).
What is important:
#6: Watt density:
With clusters exceeding 10,000 processors, watt density is an important issue. As cluster size expands, cooling capacity and costs can be significant.
• Design the lowest watts/Gig Cycle solution leveraging state-of-the-art AMD64 architecture and silicon on insulator process
#5: The I/O infrastructure:
The bandwidth of a Front Side Bus causes an I/O bottleneck which continues to exclude IA32 from running challenging parallel applications.
• Provide a dedicated I/O bus which is separate from the memory bus and keeps pace with next generation I/O protocols and CPU clock.
#4: Addressable memory:
Large RAM resident databases and memory intensive applications exceed the 4 Giga-Bytes limit of 32 bit systems. Paging is not a solution.
• AMD64 processing is the only real solution
#3: Memory bandwidth:
With increased system memory come data intensive applications with strides and block sizes that cause cache thrashing. Making the cache larger is not cost-effective. Hence, performance is limited by the size of on chip cache and/or memory bandwidth.
• Improve memory bandwidth and latency – limit cache size $$$
#2: Cost per processing node:
Due to cost / performance and I/O constraints, IA32 clusters are limited to two (2) processors, putting additional stress on SMP cluster interconnect.
• Bring 4 and 8 processor SMP systems closer in cost/performance to 2 processor systems;
• Improve performance, decrease premium…without breaking IA32 “commodity” economics;
• Only possible if the same processor architecture is used on the desktop.
#1: Backward compatible to x86-32:
There is an enormous investment in IA32 for all market segments. In many applications, porting code is not an option.
• Provide a solution that is not only 100% backwards compatible, but designed to run IA32 code faster than any existing 32-bit architecture available.
• Provide a gradual and controlled migration path for porting to AMD64
• Make the total cost of ownership minimal.
AMD’s Solution is an Evolutionary Approach:
Backward compatible to IA32
[Figure labels: AMD: Single Platform; (< 3GB Limit); (4GB Limit)]
Designed to maintain legacy compatibility …
• Leverage existing infrastructure – thermal, enclosures, power, and BIOS
• Run existing 32-bit applications natively
• Allow customers to migrate according to their schedule
• Low learning curve for users and support staff
… and obey the Immutable Laws.
Compatibility Thunking Layer
64-bit Process
64-bit Process
IA32 Application
AMD64 Application
Thunking Layer
USER
KERNEL
AMD64 Operating System
AMD64 Device Drivers
Compatibility Mode
• Provides a mode where existing IA32 applications can run unchanged under a 64-bit OS (Long Mode)
• Selected on a code-segment basis (CS.L=0)
– Uses far transfer rather than a full mode switch
• Faster than mode switch
• Application-level code runs unchanged
– Legacy segmentation
– Legacy address and data size defaults
• System aspects use native 64-bit mode semantics
– Interrupts and exceptions use Long Mode handling
– Paging aspects use Long Mode semantics
[Figure: Compatibility Mode address translation: Selector (bits 15:0) and Effective Address (bits 31:0) -> Segmentation -> Virtual Address (bits 31:0; the figure also marks a 63:0 width) -> Paging -> Opteron’s Physical Address (bits 39:0)]
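The code-segment selection described on the Compatibility Mode slide can be sketched as a small classifier. This is a minimal illustration, not AMD's definition: the slide only mentions CS.L=0, while the EFER.LMA flag and the CS.D default-size bit are taken from the published AMD64 architecture, and the function name and the returned strings are hypothetical.

```python
def long_mode_submode(lma: bool, cs_l: bool, cs_d: bool) -> str:
    """Classify the operating mode from EFER.LMA and the CS.L / CS.D bits.

    lma  -- long mode active (assumption: EFER.LMA, not named on the slide)
    cs_l -- the L bit of the current code-segment descriptor
    cs_d -- the D (default size) bit of the current code-segment descriptor
    """
    if not lma:
        # Long mode is off: the processor behaves as a legacy x86.
        return "legacy mode"
    if cs_l:
        if cs_d:
            # L=1 with D=1 is a reserved combination in long mode.
            raise ValueError("CS.L=1 with CS.D=1 is reserved")
        return "64-bit mode"
    # CS.L=0 under a 64-bit OS: compatibility mode, legacy size defaults.
    if cs_d:
        return "compatibility mode (32-bit default)"
    return "compatibility mode (16-bit default)"


if __name__ == "__main__":
    for bits in [(False, False, True), (True, False, True), (True, True, False)]:
        print(bits, "->", long_mode_submode(*bits))
```

Because the mode is a property of the code segment, a far transfer that loads a different code-segment descriptor is enough to move between the two long-mode sub-modes.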
64-bit mode
• 64-bit mode presents a flat, un-segmented virtual address space
– The legacy x86 segmentation scheme is disabled in 64-bit mode
• Default data size is 32 bits
– Override to 64 bits using new REX prefix
– Override to 16 bits using legacy operation size prefix (66h)
• Default address size is 64 bits
– Pointers are 64 bits
• Switch between 64-bit mode and Compatibility Mode accomplished via normal Far Transfer instructions
– CALLF, RETF, JMPF, IRET, INT
Prefix Type | Operand Size
None        | 32
REX         | 64
66h         | 16
[Figure: 64-bit Mode address translation: Virtual (Linear) Address (bits 63:0) -> Paging -> Opteron’s Physical Address (bits 39:0)]
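The prefix table on the 64-bit mode slide can be expressed as a short lookup. This is a simplified sketch under stated assumptions: the slide says only "REX", while the code checks the W bit of the REX byte (REX bytes are 40h to 4Fh, W is bit 3), which comes from the published AMD64 encoding rather than from the slide. The function name is hypothetical, and instructions whose operand size defaults to 64 bits (for example near branches and stack pushes) are not modelled.

```python
def operand_size_64bit_mode(prefixes: bytes) -> int:
    """Return the effective operand size in bits for an instruction in 64-bit mode.

    prefixes -- the prefix bytes that precede the opcode.
    """
    # A REX prefix is any byte 0x40..0x4F; bit 3 (W) requests a 64-bit operand.
    rex_w = any(0x40 <= b <= 0x4F and (b & 0x08) for b in prefixes)
    if rex_w:
        # REX.W takes precedence over the 66h operand-size prefix.
        return 64
    if 0x66 in prefixes:
        return 16
    # No size-changing prefix: the default data size is 32 bits.
    return 32


if __name__ == "__main__":
    for p in (b"", b"\x48", b"\x66", b"\x66\x48", b"\x40"):
        print(p.hex() or "(none)", "->", operand_size_64bit_mode(p))
```

Keeping 32 bits as the default is what lets most existing instruction encodings stay the same length; only code that actually needs 64-bit data pays for the extra prefix byte.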
Register Extension
• AMD64
– 64-bit integer registers
– 48-bit Virtual Address
– 40-bit Physical Address
• Register Extensions
– Sixteen 64-bit integer registers
– Sixteen 128-bit SSE registers
• Vector Math Instruction Set
– 3DNow!™
– SSE & SSE2 support
• Done in such a way as to not alter the IA32 instruction set.
[Figure: register file, with a legend distinguishing "In x86" from "Added by x86-64". GPR: RAX (bits 63:0) extends EAX (bits 31:0), which contains AH and AL (bits 15:0); EAX through EDI are joined by R8 through R15; EIP. SSE: XMM0 through XMM7 (bits 127:0) are joined by XMM8 through XMM15. x87 (bits 79:0).]
http://www.x86-64.org
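The 48-bit virtual and 40-bit physical address widths listed on the Register Extension slide translate into address-space sizes as sketched below. The canonical-address check is an addition from the published AMD64 architecture (the upper bits of a 64-bit pointer must be copies of bit 47), not something the slide states; all function and constant names are hypothetical.

```python
VIRTUAL_BITS = 48   # from the slide
PHYSICAL_BITS = 40  # from the slide
TIB = 1024 ** 4     # bytes in one tebibyte


def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with the given width."""
    return 1 << bits


def is_canonical(addr: int, virtual_bits: int = VIRTUAL_BITS) -> bool:
    """True if bits 63..(virtual_bits-1) of a 64-bit address are all equal."""
    top = (addr & 0xFFFFFFFFFFFFFFFF) >> (virtual_bits - 1)
    all_ones = (1 << (64 - virtual_bits + 1)) - 1
    return top == 0 or top == all_ones


if __name__ == "__main__":
    print("physical:", address_space_bytes(PHYSICAL_BITS) // TIB, "TiB")
    print("virtual: ", address_space_bytes(VIRTUAL_BITS) // TIB, "TiB")
    print("32-bit:  ", address_space_bytes(32) // 1024 ** 3, "GiB")
    print(is_canonical(0x00007FFFFFFFFFFF), is_canonical(0x0000800000000000))
```

The 32-bit line shows the 4 GB ceiling that the earlier "Addressable memory" challenge refers to.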
REX (Register Extension)
The sweet spot is 8 to 15 registers - 90% of applications need <15
[Chart: % of Functions in Typical Applications Requiring N Registers. Stacked bars from 0% to 100% with bands for <8, 8 to 15, 16-32, and >32 registers, one bar per application (among them GCC, m88ksim, Perl, and Vortex).]
AMD Opteron™ Processor Ecosystem
Operating Systems
Operating System | Type | Available | Comments
Windows 2000 Server editions | 32-bit | Yes | http://www.microsoft.com/windows2000/ Users will need to get Opteron chipset drivers from AMD
Red Hat Professional 8.0 | 32-bit | Yes | http://www.redhat.com/software/linux/professional/
SuSE Linux 8.1 editions | 32-bit | Yes | http://www.suse.com/us/private/products/suse_linux/i386/
Solaris 9 for x86 | 32-bit | Yes | http://wwws.sun.com/software/solaris/x86/index.html
UnitedLinux Version 1 | 32-bit, 64-bit | Yes | http://www.unitedlinux.com/pdfs/UL1_0ProdSpecSheet.pdf Consortium includes SuSE, The SCO Group, Conectiva, Turbolinux
Linux 2.4 kernel patches | 32-bit | Yes | http://www.x86-64.org/
Mandrake Linux Corporate Server 2.1 | 32-bit, 64-bit | 4/22/03 | http://www.mandrakesoft.com/products/range?wslang=en
SuSE Linux Enterprise Server (SLES) 8 | 32-bit, 64-bit | 4/22/03 | http://www.suse.com/us/business/products/server/sles/index.html Beta available from AMD for OEMs, ISVs, and IHVs
Windows Server 2003 | 32-bit | 4/24/03 | http://www.microsoft.com/windowsserver2003/default.mspx
NetBSD | 64-bit | 4/22/03 | Development underway by Open Source community
Scyld Beowulf Operating System | 64-bit | 4/22/03 | Linux-based cluster operating system
Turbolinux | 64-bit | 4/22/03 | RC1 candidate available at launch
Red Hat Advanced Server 3.0 | 64-bit | Stay Tuned | Support announced 08/02 but no release schedule announced.
Windows for AMD64 | 64-bit | Stay Tuned | Pre-alpha available from Microsoft for OEMs, ISVs, IHVs at http://msdn.microsoft.com
FreeBSD, OpenBSD | 64-bit | Stay Tuned | Development underway by Open Source community.
AMD Opteron™ Processor Ecosystem
Open Source Development Tools
64-bit Tools | Type | Available | Comments
ATLAS 3.5.0 Developer Release | Library | Yes | http://math-atlas.sourceforge.net/ Optimized BLAS (Basic Linear Algebra Subroutines) library
Blackdown Java Platform 2 Version 1.4.2 | Linux JAVA | Yes | http://www.blackdown.com/java-linux/java2-status/jdk1.4-status.html SUN Java products ported to Linux by Blackdown group
GNU binutils | Utilities | Yes | http://www.gnu.org/software/binutils/ GNU collection of binary tools including GNU linker, GNU assembler
GNU C++ (g++) 3.2, GNU C (gcc) 3.2, GNU C (gcc) 3.3 (optimized) | Compilers | Yes | http://gcc.gnu.org/ The GNU Compiler Collection (gcc) includes a full-featured ANSI C compiler
GNU Debugger (GDB) | Debugger | Yes | Analysis tool for debugging programs - included with SuSE SLES 8
GNU glibc 2.2.5, GNU glibc 2.3.2 (optimized) | C Library | Yes | http://www.gnu.org/software/libc/libc.html GNU C Library
Other GNU Tools | Various | Yes | bash, csh, ksh, strace, libtool - included with SuSE SLES 8
MPICH | Library | Yes | Open Source message passing interface for Linux clusters
PERL, Python, Ruby, Tcl/Tk | Language | Yes | Scripting languages - included with SuSE SLES 8
GNU means "GNU's Not UNIX™" and is the primary project of the Free Software Foundation (FSF), a non-profit organization committed to the creation of a large body of useful, free, source-code-available software.
AMD Opteron™ Processor Ecosystem
ISV Development Tools
AMD64 Tools | Type | Available | Comments
64Express | Code Migrator | Yes | http://www.migratec.com/MigraTEC/ Source code migration technology for Windows and Linux by MigraTEC
RSA | Library | Yes | Encryption library available from RSA Security
SoftICE beta 1 | Debugger | Yes | Windows device driver debugger by CompuWare
PGI Workstation 5.0 beta | Optimized Compilers | Yes | http://www.pgroup.com/AMD64 to download. FORTRAN 77/90, C, C++ compilers for 64-bit Linux
PGI Workstation 5.0 beta | Optimized Compilers | 4/14/03 | http://www.pgroup.com/AMD64 to download. FORTRAN 77/90, C, C++ compilers for 32-bit Linux, Windows
AMD Core Math Libraries (ACML) | Optimized Libraries | 4/22/03 (Linux) | Optimized numerical functions (BLAS, LINPACK, FFTs) for Linux and Windows ported by NAG. Will be downloadable from AMD web site.
PGI Workstation 5.0 Production Release | Toolkit | 06/03 | Optimized FORTRAN 77/90, C, C++ compilers for 64-bit Linux, 32-bit Linux, and 32-bit Windows; PGDBG parallel application debugger; PGPROF parallel application performance profiler
PGI Cluster Development Kit (CDK) 5.0 | Toolkit | 07/03 | Toolkit that includes optimized FORTRAN 77/90, C, C++ compilers for 32-bit and 64-bit Linux plus tools for cluster application development.
TotalView | Debugger | Stay tuned | Etnus has announced 32-bit support. 64-bit discussions underway.
Vampir/Vampirtrace | Analysis | Stay tuned | Parallel performance analysis tool by Pallas GmbH
Visual C, C++ | Compiler | Stay tuned | Windows compiler included in Windows for AMD64 pre-alpha
Distributed Debugging Tool | Debugger | Stay tuned | Streaming Computing has announced support for a 64-bit graphical debugger for AMD64
Vega Prime | 3D App Dev | Stay tuned | Realtime 3D software tools by Multi-gen Paradyn.
Absoft | Compilers | Stay tuned | Fortran toolsets for Windows and Linux
AMD Opteron™ Processor Ecosystem
Server Applications
Software listed in alphabetic order
64-bit Applications | Type | Available | Comments
Apache HTTP Server | Web Server | Yes | http://httpd.apache.org/ - open source
Mental Ray by Mental Images | Rendering Engine | Yes | Packaged with graphic front ends
Zeus by Zeus | Web Server | Yes | www.zeus.com
Counter-Strike Server by Valve | Gaming Server | Yes | www.valvesoftware.com
SENDMail | Email Server | 4/22/03 | http://www.sendmail.org/ - open source
MySQL | Database Engine | 4/22/03 | Open source
DB2 by IBM | Database Engine | Q3 '02 | Beta available from IBM
Apache Server by Covalent | Web Server | Stay tuned | Support announced 10/02
Stronghold by Red Hat | Web Server | Stay tuned | Support announced 10/02
Ingres by CA | Database Engine | Stay tuned | Technology demo shown 01/03
Unicenter by CA | Management | Stay tuned | Technology demo shown 01/03
Oracle | Database Engine | Stay tuned | Technology demo shown 10/02 and 01/03
MS IIS by Microsoft | Web Server | Stay tuned | Available with Windows for AMD64 pre-alpha
Terminal Server by Microsoft | Transaction Server | Stay tuned | Available with Windows for AMD64 pre-alpha
MS SQL | Database Server | Stay tuned | In early development
Engage with ISVs, Open Source Community, and corporate developers to port applications that immediately benefit from 64-bit computing to AMD64
AMD Opteron™ Processor Ecosystem
Additional Target Applications
Type of Application | Vendor
Application Servers | BEA, IBM
Business Processing Applications | JD Edwards, Oracle, PeopleSoft, SAP, Siebel
Java Engines (JVMs) | BEA, IBM, SUN
HPC Applications | Abaqus, Ansys, Fluent, LS-DYNA, Landmark, NASTRAN
Messaging/Collaborative Engines | Lotus Domino, MS Exchange
Streaming Media Engines | Microsoft, Real Networks
Statistical Engines | SAS
Transaction Engines | Citrix
Workstation Software | Adobe, Alias/Wavefront, Autodesk, Avid, Discreet, Softimage, Solid Works
Engage with ISVs, Open Source Community, and corporate developers to migrate applications to AMD64
AMD Opteron™ Processor Ecosystem
Infrastructure Target Applications
Type of Application | Vendors/Application
BIOS | Phoenix, AMI
Diagnostics | AMI, UltrX, Eurosoft, PC-Doctor
Management | Altiris, BMC, CA, HP, Tivoli, NetIQ, Novell ZenWorks, OSA
Security/Antivirus | CA eTrust, Symantec
Storage Management | Legato, PowerQuest, Veritas
Utilities | Symantec Norton Utilities
Virtualization | Connectix, VMWare GSX
Engage with ISVs to validate 32-bit applications and to migrate applications to AMD64 when it makes sense
AMD Opteron™ Processor Ecosystem
SLES 8 for AMD64 Features
Applications and infrastructure components included in SLES 8 for AMD64 launch release
Web Server
• Apache web server with extensions
• PHP and PHP extensions
• Tomcat
File Sharing
• Windows: Samba 2.2.5
• Macintosh: netatalk
• Network filesystems: NFS
Network Printing
• CUPS, lprng
Authentication Server
• Windows domain controller: Samba
• Directory service: LDAP
• Single sign-on: Kerberos 5
• Logon: PAM module
• Yellow pages: NIS Server
Internet/Intranet Services
• DNS (bind)
• WINS
• DHCP server and client
• FTP, TFTP
Graphical Interfaces
• KDE 3.0.3 minimal system
• Mozilla 1.0.1
• KDE 1.0.1 libraries
• GNOME 2.0 libraries
Mail and News Servers
• SMTP (postfix), POP, IMAP
Proxy Servers
• Caching and filtering: squid
SQL Database Servers
• mysql, postgres
• Client support (ODBC, JDBC)
Standard Linux/UNIX shells
• bash, csh, ksh
Security
• Secure shell: ssh
• Secure sockets: OpenSSL
• Encryption: GnuPG
• Full crypto enabled
• GPG signed RPM files
• Firewall (iptables)
• VPN: FreeS/WAN
AMD Opteron™ Processor Ecosystem
SLES 8 for AMD64 Features
Tools
Description
Compilers
• C (gcc) 3.2
• C++ (g++) 3.2
Scripting Languages
• Perl, Python, Ruby, Tcl/Tk
Development Tools
• diff, patch, make, lex (flex)
• yacc (bison), autoconf, automake
• Binutils, libtool, GDB, strace
Archiving Tools
• tar, cpio, gzip, bzip2, rpm
Libraries and Core Functionality
• LSB 1.1 runtime environment
• glibc 2.2.5
Management Tools
• YaST – graphical administration tool
• AutoYaST – installation tool
Networking Tools
• Remote shell tools: ssh, scp
• ping, traceroute, nslookup, dig, host
• IPv6: ifconfig/route and config location
• Firewalling tools: ipchains, iptables, masquerading
• XFree86 4.2 (libs and server)
• X print server (libXp.so.6)
• SNMP
A complete suite of open source development, networking, and management tools is included in the SLES 8 for AMD64 launch release
The AMD Opteron™ Processor Ecosystem
SuSE SLES 8 for AMD64 Features
64-bit Linux device drivers included in SLES 8 for AMD64 launch release
Disk Adaptors
• Promise: FastTrack TX2000, FastTrack 100 TX2, FastTrack100, FastTrack SX6000, PDC20375
• 3Ware: 7500-2/4/8/12, 8500-4/8/12
• Adaptec: 29320x, 39320x, 29160x, 2200S series, 2120S series
• LSI Logic: LSI22320/53C1030, MegaRAID SCSI 320-2, MegaRAID SCSI 320-1, LSI7202XP-LC
NICs and Interconnects
• 3Com: 3C905CX, 3c996, Gig Ethernet Adapter BCM 5703, Gig Ethernet Adapter BCM 5704
• Myricom (HPC Interconnect): Myrinet Clustering Interconnect
• JNI (InfiniBand): InfiniStar 4X HBA
• Qlogic (Fibre Channel): SANblade 2310xx, SANblade 2340xx, SANblade 2342xx
Video & Audio
• NVIDIA: GeForce4, GeForce4MX200, GeForce4MX 400, Quadro2, Quadro2 DCC, Quadro4
• ATI: FireGL X1, FireGL 8800, FireGL 8700
• Matrox: Parhelia 128, Parhelia Pro 256, Parhelia 512
• Creative Labs: SB Audigy, SB Audigy2, SB Live! 5.1, SB Live! Cards
BIOS
• There are four sources for BIOS for the AMD Opteron.
They are:
Phoenix Technologies
Jim Grimm
Director of Sales NA
320 Norwood Park South
Norwood, MA 02062
Phone: (781)551-5023
Fax: 1-781-551-5002
email: [email protected]
American Megatrends, Inc.
Bill Clark
Strategic Account Mngr.
6145-F Northbelt Parkway
Norcross, GA 30071-2976
Phone: (770)326-9158
Fax: 1-770-246-8765
email: [email protected]
CodeGen
T.J. Merritt
Sales Manager
4725 First Street
Pleasanton, CA 94566
Phone: (925)462-4300
Fax: 1-925-462-4309
email: [email protected]
On April 22, 2003, AMD will release to the public documentation that will enable the
development of an open LinuxBIOS. In addition, SuSE has an engineering project
underway to develop a LinuxBIOS for Opteron systems. The intent of this project is to
deliver code to open source in a phased approach as different levels of functionality are
achieved. All questions concerning this development effort need to be addressed to:
Mr. Andreas Jaeger at SuSE <[email protected]>.
Linux for AMD64
What is Linux?
• Linux is an open source operating system that is a Unix clone
– Open Source refers to any software where both the executable (binary) files and source code are distributed.
• The Linux concept originated from Minix, a Unix-like operating system used to teach the inner workings of an OS to students
– Linux was introduced over the Internet in 1991 by Linus Torvalds. In 1994 Linus merged together software components from hundreds of programmers to create Linux Version 1.0.
• Linux distribution refers to a packaging of the Linux kernel (operating system core) with a set of utilities and other applications to make the OS user-friendly
– Vendors providing Linux distributions charge fees for add-on features and services, such as media distributions, documentation, and support.
Linux Distributions that will support AMD64 on 4/22/03:
• Mandrake Linux Corporate Server
• NetBSD
• Scyld Beowulf Linux
• SuSE Linux Enterprise Server 8
• Turbolinux
Linux for AMD64
Linux Kernel for AMD64
Kernel Features
Kernel 2.4.19
Raw device support for database
Async I/O, Direct I/O, Multipath I/O
POSIX Threads (linuxthreads)
PCI-X support
File systems: ext3, reiserfs, ext2
High Availability
• Logical Volume Manager (LVM) for use with all supported file systems
• Kernel refers to the set of core operating system services that are tightly integrated with the processor, such as memory access, I/O functions, file services, and device drivers.
• Linux Kernel for AMD64 was developed by AMD, SuSE, and the Open Source Community.
• Linux Kernel for AMD64 is open source software available for use by anyone. Any distribution of the binary version of this kernel must also be accompanied by the source code version.
Linux for AMD64
SLES 8 for AMD64 Features
Sample of some of the open source applications and utilities included in SuSE SLES 8 for AMD64
Web Server
• Apache web server with extensions
• PHP and PHP extensions
• Tomcat
File Sharing
• Windows: Samba 2.2.5
• Macintosh: netatalk
• Network file systems: NFS
Network Printing
• CUPS, lprng
Authentication Server
• Windows domain controller: Samba
• Directory service: LDAP
• Single sign-on: Kerberos Heimdal 0.4e
• Logon: Pluggable Authentication Modules (PAM)
• Yellow pages: NIS Server
Internet/Intranet Services
• DNS (bind)
• WINS
• DHCP server and client
• FTP, TFTP
Graphical Interfaces
• KDE 3.0.3 minimal system
• KDE 1.0.1 libraries
Mail and News Servers
• SMTP (postfix), POP, IMAP
Web Browsers
• Mozilla 1.0.1
• Konqueror
Proxy Servers
• Caching and filtering: squid
SQL Database Servers
• MySQL, PostgreSQL
• Client support (ODBC, JDBC)
Java (JVM)
• IBM 1.3.X (32-bit)
• Blackdown 1.4.2 (64-bit)
Standard Linux/UNIX Shells
• bash, csh, ksh
Security
• Secure shell: ssh
• Secure sockets: OpenSSL
• Encryption: GnuPG
• Full crypto enabled
• GPG signed RPM files
• Firewall (iptables)
• VPN: FreeS/WAN
Linux for AMD64
SLES 8 for AMD64 Features
Tools
Description
Compilers
• C (gcc) 3.2
• C (gcc) 3.3 (for better performance)
• C++ (g++) 3.2
Scripting Languages
• Perl, Python, Ruby, Tcl/Tk
Development Tools
• diff, patch, make, lex (flex)
• yacc (bison), autoconf, automake
• Binutils, libtool, GDB, strace
Archiving Tools
• tar, cpio, gzip, bzip2, rpm
Libraries
• glibc 2.2.5
• glibc 2.3.2 (for better performance)
Management Tools
• YaST – graphical administration tool
• AutoYaST – installation tool
Networking Tools
• Remote shell tools: ssh, scp
• ping, traceroute, nslookup, dig, host, ltrace
• IPv6: ifconfig/route and config location
• Firewalling tools: ipchains, iptables, masquerading
• Xfree86 4.2 (libs and server)
• X print server
• SNMP
• DHCP server (dhcpd)
Suite of open source development, networking, and
management tools included in SuSE SLES 8 for AMD64
Linux for AMD64
SuSE SLES 8 for AMD64 Features
Some of the 64-bit Linux device drivers included in SLES 8 for AMD64 launch release
Disk Adaptors
• Promise: FastTrack TX2000, FastTrack100, FasTrack100 TX
• 3Ware: 7500-2/4/8/12, 8500-4/8/12
• Adaptec: 29320x, 39320x, 29160x, 2200S series, 2120S series
• LSI Logic: LSI22320/53C1030, MegaRAID SCSI 320-2, MegaRAID SCSI 320-1, LSI7202XP-LC
• ATA (IDE) Drives: support for many popular drives
NICs
• 3Com: 3C905CX, 3c996
• Broadcom: 5701, 5702, 5703, 5704
• Qlogic (Fibre Channel): SANblade 2310xx, SANblade 2340xx, SANblade 2350xx
• Intel: Pro100MT, Pro100
Miscellaneous
• 2D Graphic Cards: support for many popular cards
• Creative Labs (audio): SB Live! 5.1
• Printers: support for many popular printers
• Other I/O Devices: CD-RW, DVD-ROM, USB devices, PS/2 keyboards, Mouse: PS/2, serial
Note: High speed interconnect device drivers available from Dolphin, Myrinet, Quadrics
3D graphic card device drivers available from ATI, NVIDIA, Matrox
Don’t take our word for it . . .
“Jim Allchin, the man in charge of Microsoft's operating systems, calls
the performance of software on the AMD machines ‘pretty amazing.’"
Fortune Magazine, February 2003
“The enterprise-class database solution features a DB2 database on a
SuSE Linux operating system, and was successfully enabled to
support x86-64 technology in two days.”
AMD News Release, July 30, 2002
"AMD's x86-64 technology offers a seamless migration path to 64-bit
computing, while allowing businesses to preserve their existing
investments in 32-bit x86 software.” Boris Nalbach, CTO of SuSE
Linux AG.
AMD/SuSE News Release, March 20, 2003
Don’t take our word for it . . .
Privately held computer seller Angstrom Microsystems will use
Hammer to build high-performance servers for financial institutions,
movie studios, and oil companies. The Boston company says
customers were concerned that they would have to rewrite software
for Itanium. "People want an evolutionary process, not a revolutionary
process," says CEO Lalit Jain.
Business Week: AMD's Hammer: The Right Tool for the Job? (3/10/03)
“Covalent will be developing 64-bit compatibility because we believe
the upcoming AMD Opteron processor-based server systems will
deliver superior performance and reliability for our easy-to-install
Apache server software.” Mark Douglas, senior vice president of
engineering, Covalent Technologies.
AMD News Release,
AMD Opteron™ Software
A Word on Architectural Feedback
• Code size is only up about 5%
– Mostly due to 64-bit literals
• Instruction count is down about 15%
– Additional registers really paying off
– Many spill/fill memory references eliminated
– Call-Exit sequences vastly improved
• Reduced instruction count and increased Instructions Per Cycle (IPC) mean substantial performance gains
– AMD Opteron IPC improves about 5%
– AMD64 instruction count down about 15%
– Net improvement about 20%
– Your mileage will vary
• IA64 feedback is exactly opposite
– Instruction count is up
– IPC is down
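The "about 20%" net figure on the Architectural Feedback slide can be checked with a little arithmetic. The sketch below assumes run time is proportional to instruction count divided by IPC at a fixed clock frequency; that model and the function name are illustrative assumptions, not something the slide spells out.

```python
def relative_run_time(instruction_ratio: float, ipc_ratio: float) -> float:
    """Run time relative to the baseline, at an unchanged clock frequency.

    instruction_ratio -- new instruction count / old instruction count
    ipc_ratio         -- new IPC / old IPC
    """
    # time = instructions / (IPC * frequency); frequency cancels out.
    return instruction_ratio / ipc_ratio


if __name__ == "__main__":
    # Slide figures: instruction count down about 15%, IPC up about 5%.
    run_time = relative_run_time(instruction_ratio=0.85, ipc_ratio=1.05)
    print(f"relative run time: {run_time:.3f}")
    print(f"time saved:        {(1.0 - run_time) * 100:.1f}%")
    print(f"speedup:           {1.0 / run_time:.3f}x")
```

Under this model the two effects compound to roughly 19% less run time, which is consistent with the slide's rounded "about 20%".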
Computing Strategy: x86-64
• Legacy: 32-bit OS
– Both AMD Athlon 64 and AMD Opteron processors run any 32-bit legacy O/S
– Compatible with all legacy drivers, OS & BIOS
– No application recompile required, no emulation layer
• 64-bit OS
• Desired applications can be written/ported to leverage the full 64-bit capabilities of x86-64
• Migrate only where warranted, and at the user’s pace
• All 32-bit applications run under 64-bit OS
• BIOS is standard x86 32-bit code.
• Transfer to 64-bit operation occurs under OS load/startup control
• 64-bit mode does not use segmentation - Flat addressing
The right way to get to 64 bits:
Investment Protection, Flexibility, No Penalty to 32 bit Performance
The Next Generation Processor
AMD Opteron™ & AMD Athlon™ 64
• Background
• Quick overview of chip
configurations
• Core improvements over AMD’s
7th Generation Athlon Processors
• Integrated Memory Controller
– Multi-processor performance
• HyperTransport Technology
AMD 64-bit Family of Processors
[Chart: product positioning, Performance (vertical axis) against Price (horizontal axis)]
• 800 Series (up to 8 way): 4P - 8P HPC Cluster
• 200 Series (2 way): 2P Servers, Workstation
• 100 Series (1 way): 1P Server & Workstation
• 1P Desktop & Mobile
True Customer-Centric Innovation
Performance-enhancing features include:
• Performance
– High-bandwidth integrated memory controller scales with processor frequency and number of processors
• Compatibility
– Approximately 10,000 legacy applications at time of launch
• Scalability
– Reduced costs for high-end systems
– Removes I/O bottlenecks
– Easy multiprocessor scaling
[Block diagram: DDR Memory Controller, AMD64 Processor Core, L1 Instruction Cache, L1 Data Cache, L2 Cache, HyperTransport™ links]
AMD Athlon™ 64 Processor
AMD64: Desktop Processor
• 8 Byte memory controller supporting 200, 266, & 333 MHz DDR Memory
– CHIPKILL ECC with x4 DRAMs
– Drive up to 4 registered DIMMs (4 DIMMs <266MHz, 2 DIMMs >333MHz)
– Future memory technology supported as it is defined
– Up to 4GB x4 DRAMS (4GB DIMMs)
• HyperTransport™ Technology I/O
• On chip L1 & L2 cache
– 64KB L1 ICache, 64KB L1 DCache
– Up to 1M ECC protected L2 Cache
• 740-pin µPGA Package
[Block diagram: DDR Memory Controller (72-bit interface), AMD64 Processor Core, L1 Instruction Cache, L1 Data Cache, L2 Cache, HyperTransport™ (16-bit link)]
Replaces Address, Data and Control Bus
July 7, 2015
Computation Products Group
42
1P AMD Athlon™ 64
Desktop Processor System
System Strengths
• Memory latency, bandwidth and memory reach:
  – 2^40 physical (1 terabyte)
  – 2^48 virtual
• I/O latency and bandwidth ~1600 MT/s
  – 6.4 GB/s
• 64-bit CPU
• More reliable
  – Lower chip count
  – Improved machine check
  – Improved error handling
[System diagram: AMD Athlon™ 64 with 4GB DRAM (200-333MHz 72-bit registered DDR); 16x16 HyperTransport @ 1600 MT/s to the AMD-8151™ AGP 8X tunnel (AGP 8X, 32 bits @ 533MHz); AMD-8111™ I/O hub with PCI 33/32, EIDE, LPC (FLASH, SIO), USB 1.1/2.0, AC97, ACR 1.0, MII (10/100 NIC)]
1P AMD Opteron™ 100 Series
• AMD64: 1 way Value Server
• 16 Byte memory controller supporting 200, 266, & 333 MHz DDR Memory
  – 18 CAS lines for 32GB of memory
  – CHIPKILL ECC with x4 DRAMs
  – Drive up to 8 registered DIMMs
    • 8 DIMMs <266MHz
    • 4 DIMMs >333MHz
  – Future memory technology supported as it is defined
  – Up to 4GB x4 DRAMs (4GB DIMMs)
• On chip L1 & L2 cache
  – 64KB L1 ICache, 64KB L1 DCache
  – Up to 1M ECC protected L2 Cache
• Three 16-bit non-Coherent HyperTransport™ Technology Links
• 940-pin µPGA Package
Replaces Address, Data and Control Bus
[Block diagram: AMD64 processor core, L1 instruction cache, L1 data cache, L2 cache, DDR memory controller (72/144), three HyperTransport™ links (16 each)]
1P AMD Opteron™ 100
Desktop Processor System
System Strengths
• Ideal for cost-sensitive system designs where I/O is the critical commodity
  – Storage servers
  – Low end DCC workstations
[System diagram: AMD Opteron™ with 8GB DRAM; 16x16 HyperTransport @ 1600 MT/s to two AMD-8131™ PCI-X tunnels (two PCI-X buses each) and an AMD-8151™ AGP 8X tunnel (32 bits @ 533MHz); AMD-8111™ I/O hub with PCI 33/32, EIDE, USB 1.1/2.0, AC97, ACR 1.0, MII (10/100 NIC), LPC (FLASH, SIO)]
2P - AMD Opteron™ 200 Series
• AMD64: 2 Way Performance Server
• 16 Byte memory controller supporting 200, 266, & 333 MHz DDR Memory
  – 18 CAS lines for 32GB of memory
  – CHIPKILL ECC with x4 DRAMs
  – Drive up to 8 registered DIMMs
    • 8 DIMMs <266MHz
    • 4 DIMMs >333MHz
  – Future memory technology supported as it is defined
  – Up to 4GB x4 DRAMs (4GB DIMMs)
• On chip L1 & L2 cache
  – 64KB L1 ICache, 64KB L1 DCache
  – Up to 1M ECC protected L2 Cache
• One coherent and two 16-bit non-Coherent HyperTransport™ Technology Links
• 940-pin µPGA Package
Replaces Address, Data and Control Bus
[Block diagram: AMD64 processor core, L1 instruction cache, L1 data cache, L2 cache, DDR memory controller (72/144), three HyperTransport™ links (16 each)]
2P AMD Opteron™ 200 Server
System Strengths
• Ideal for systems where large flat memory is important (16GB of SMP memory)
  – Data mining
  – Relational database applications
[System diagram: two AMD Opteron™ processors, each with 8GB DRAM; three AMD-8131™ PCI-X tunnels (two PCI-X buses each); a bridge or SSL/IPSec device; AMD-8111™ I/O hub with PCI 33/32, EIDE, USB 1.1/2.0, AC97, ACR 1.0, MII (10/100 NIC), LPC (FLASH, SIO)]
4P - 8P AMD Opteron™ 800
• AMD64: 4 - 8 Way Performance Server
• 16 Byte memory controller supporting 200, 266, & 333 MHz DDR Memory
  – CHIPKILL ECC with x4 DRAMs
  – Drive up to 8 registered DIMMs
    • 8 DIMMs <266MHz
    • 4 DIMMs >333MHz
  – Future memory technology supported as it is defined
  – Up to 4GB x4 DRAMs (4GB DIMMs)
• On chip L1 & L2 cache
  – 64KB L1 ICache, 64KB L1 DCache
  – Up to 1M ECC protected L2 Cache
• Three 16-bit Coherent HyperTransport™ Technology Links
• 940-pin µPGA Package
[Block diagram: AMD64 processor core, L1 instruction cache, L1 data cache, L2 cache, DDR memory controller (72/144), three HyperTransport™ links (16 each)]
AMD Opteron™ 800 HPC
Processing Node
HPC Strengths
• Flat SMP-like memory model:
  – All four reside within the same 2^48 memory map
  – Expandable to 8P NUMA
• Glue-less coherent multi-processing:
  – Low latency and high bandwidth ~1600 MT/s (6.4 GB/s)
• 32GB of external memory on a high-bandwidth memory bus (>5.3GB/sec.)
• Native high-bandwidth memory-mapped I/O (>25Gbits/sec.)
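The address-space figures quoted for these systems (2^40 bytes physical, 2^48 bytes virtual) can be checked with a few lines of arithmetic. This is only an illustrative sketch; the constant and function names are assumptions introduced here, not AMD terminology.

```python
# Address widths quoted on the slides: 40 physical bits, 48 virtual bits.
PHYSICAL_ADDRESS_BITS = 40
VIRTUAL_ADDRESS_BITS = 48

TIB = 1 << 40  # 2^40 bytes, the "terabyte" used on the slides


def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 1 << bits


physical_tib = address_space_bytes(PHYSICAL_ADDRESS_BITS) // TIB
virtual_tib = address_space_bytes(VIRTUAL_ADDRESS_BITS) // TIB
print(f"physical: {physical_tib} TiB, virtual: {virtual_tib} TiB")
```

Under these widths the physical reach is 1 TiB and the virtual reach is 256 times larger.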
Model Number Implementation
•First digit = scalability of AMD Opteron processor
•Second and third digits = relative performance among AMD
Opteron processors
•Model number conveys directional improvement
• AMD Opteron™ Processor Model
Series                                   Clock    Model
AMD Opteron™ 800 Series (up to 8 way)    1.4GHz   840
                                         1.6GHz   842
                                         1.8GHz   844
                                         2.0GHz   846
AMD Opteron™ 200 Series (up to 2 way)    1.4GHz   240
                                         1.6GHz   242
                                         1.8GHz   244
                                         2.0GHz   246
AMD Opteron™ 100 Series (1 way)          1.8GHz   144
                                         2.0GHz   146
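The model-number rule described above (first digit for scalability, last two digits for relative performance) could be sketched as a small decoder. The function and dictionary names are hypothetical, chosen only for this illustration.

```python
from typing import Tuple

# First digit -> scalability, as described on the slide.
SCALABILITY = {1: "1 way", 2: "up to 2 way", 8: "up to 8 way"}


def decode_model(model: int) -> Tuple[str, int]:
    """Split a three-digit model number into (scalability, relative performance)."""
    series, relative = divmod(model, 100)
    if series not in SCALABILITY:
        raise ValueError(f"unknown series digit in model {model}")
    return SCALABILITY[series], relative


for model in (146, 244, 846):
    scalability, relative = decode_model(model)
    print(f"Model {model}: {scalability}, relative performance {relative}")
```

The relative-performance digits only order parts within the Opteron family; they are not a clock frequency.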
Price Performance Positioning
[Figure: price (x-axis) vs. performance (y-axis) positioning of the 800, 200 and 100 Series; labels: 1M, 256K, "A solution unto itself"]
Opteron™ Processor
Architecture
The Elements of the CPU
[Block diagram, listed by element]
• Branch prediction, fetch, L1 instruction cache (64KB)
• Scan/align, Fastpath and microcode engine producing µOPs
• Instruction Control Unit (72 entries)
• Int decode & rename, three reservation stations (Res) each feeding an AGU and an ALU, plus MULT
• FP decode & rename, 36-entry FP scheduler, FADD, FMUL, FMISC
• 44-entry load/store queue, L1 data cache (64KB)
• L2 cache, bus unit
• System request queue, crossbar, memory controller, HyperTransport™
Processor Throughput
• Branch prediction and fetch: supply 16 instruction bytes to the decoder per cycle
• Scan/align, Fastpath and microcode engine: convert x86 instructions to fixed length µOPs
• Instruction Control Unit (72 entries): dispatch 3 µOPs per cycle to the integer (24-entry) and FP schedulers
Instructions use one of two decoding pipelines
• Fastpath: instructions that decode into two or fewer µOPs are decoded by hardware and then packed into 3 dispatch positions
• Microcode: for x86 instructions that decode into more than two µOPs, the decoder calculates the microcode ROM entry point and fetches the sequence from the microcode ROM
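The rule for choosing a decoding pipeline can be written as a one-line predicate. The sketch below only restates the threshold given above; the function name and the example op counts are hypothetical and do not correspond to specific x86 instructions.

```python
FASTPATH_MAX_OPS = 2  # from the slide: two or fewer ops use the Fastpath


def decode_path(op_count: int) -> str:
    """Return which decoding pipeline handles an instruction with `op_count` ops."""
    if op_count < 1:
        raise ValueError("an instruction decodes into at least one op")
    return "fastpath" if op_count <= FASTPATH_MAX_OPS else "microcode"


# Hypothetical op counts, for illustration only.
for ops in (1, 2, 3, 7):
    print(ops, decode_path(ops))
```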
• Compared to AMD Athlon™ XP, more instructions use the Fastpath
  – E.g.: packed SSE is microcoded in AMD Athlon XP and Fastpath in AMD Opteron processors
  – AMD Opteron has 8% fewer microcoded instructions for SPECint2000
  – AMD Opteron has 28% fewer microcoded instructions for SPECfp2000
Floating Point & Integer Performance
• FPU Throughput
  – SSE2, x87
    • Theoretical: (1 Mul + 1 Add)/cycle
    • Realized: 1.9 FLOPs/cycle
  – SSE, 3DNow!
    • Theoretical: (2 Mul + 2 Add)/cycle
    • Realized: 3.4+ FLOPs/cycle
• 32-bit Integer Throughput
  – 1 add / clock cycle
  – 1 multiply / clock cycle
  – Multiply latency has shrunk from 5 cycles on AMD Athlon™ to 3 cycles on the AMD Opteron™
• 64-bit Integer Throughput
  – 1 add / clock cycle
  – 1 multiply every other clock cycle
  – Multiply latency is 4 cycles
• Integer Instruction Scheduler
  – Out of order (OOO) from a queue of 24* integer macro-ops
*Athlon™ instruction scheduler is 18 macro-ops deep
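The per-cycle figures above translate into peak FLOPS by multiplying by clock rate and processor count. The sketch below assumes a 2.0 GHz clock and a four-processor node, both taken from elsewhere in this deck; the function name is an assumption made for this example.

```python
def peak_gflops(clock_ghz: float, flops_per_cycle: float, processors: int = 1) -> float:
    """Peak GFLOPS = clock (GHz) x FLOPs per cycle x number of processors."""
    return clock_ghz * flops_per_cycle * processors


# SSE2/x87: 1 multiply + 1 add per cycle = 2 FLOPs/cycle (theoretical).
theoretical_4p = peak_gflops(2.0, 2, processors=4)
# Realized throughput quoted on the slide: 1.9 FLOPs/cycle.
realized_4p = peak_gflops(2.0, 1.9, processors=4)

print(f"4P theoretical: {theoretical_4p:.1f} GFLOPS, realized: {realized_4p:.1f} GFLOPS")
```

With these inputs the theoretical figure is 16 GFLOPS for a 4P node at 2.0 GHz.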
Internal Caching
L1 caches
• 64k bytes instruction and data
• 2-way set associative
• Data Cache is ECC protected
• Instruction Cache is Parity protected
L2 cache
• Caches instruction and data streams
• 16-way set associative, ECC protected
• >2X Athlon XP L2-to-L1 bandwidth
Improved Translation Look-aside Buffer for large multiprocessor workloads
• Twice the size and lower latencies than AMD Athlon XP
• L2 Translation Look-aside Buffer
  – 512 entry - 4-way associative
• L1 Translation Look-aside Buffer
  – 32 entry Instruction & Data - fully associative
Machine check architecture for reporting failures
[Diagram: L1 instruction cache (64KB), L1 data cache (64KB), L2 cache, bus unit]
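The set counts implied by these cache and TLB organizations follow from capacity, associativity and line size. The slide does not state a line size, so the 64-byte line used below is an assumption, and the helper name is hypothetical.

```python
LINE_BYTES = 64  # ASSUMPTION: cache line size is not given on the slide


def num_sets(capacity: int, ways: int, entry_size: int = 1) -> int:
    """Number of sets in a set-associative structure of `capacity` units."""
    sets, remainder = divmod(capacity, ways * entry_size)
    if remainder:
        raise ValueError("capacity is not a multiple of ways x entry size")
    return sets


l1_sets = num_sets(64 * 1024, ways=2, entry_size=LINE_BYTES)     # 64KB, 2-way
l2_sets = num_sets(1024 * 1024, ways=16, entry_size=LINE_BYTES)  # up to 1M, 16-way
l2_tlb_sets = num_sets(512, ways=4)                              # 512 entries, 4-way

print(l1_sets, l2_sets, l2_tlb_sets)
```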
[Diagram label: 44-entry load/store queue]
Reliability Features
• L1 Cache
– Data cache is ECC protected via background scrubber
– Instruction cache is parity protected upon R/W
• L2 cache
– Cache Tag arrays are ECC protected via background scrubber
– Instructions are parity protected, Data is ECC protected
• ECC bit reused for Branch Prediction and Instruction Decode (end bits)
• DRAM is ECC protected with chipkill ECC support
– Each fetch is parity checked
– ECC via scrubber – period is user programmable from 40ns to 84usec.
• Remaining arrays are parity protected
– Instruction cache, tags and TLBs
– Data tags and TLBs
– Generally read only data which can be recovered
• Machine Check Architecture
– Report failures and predictive failure results
  • ECC
  • Branch Predictor
  • ThermTrip
  • Memory scrubbers
Branch Prediction Improvements
• Full L1 Cache Coverage
  – Twice the selectors as AMD Athlon™ XP
• 4K Branch Target Addresses
  – Backed up by Branch Address Calculator
  – 4 cycle correction for unconditional relative branches
• 16K Bimodal Counters
  – Four times AMD Athlon XP
• Full Pre-decode and Branch Identification in L2 Cache
  – New and unique to AMD Opteron Family of Processors
  – Reuses L2 ECC bits on clean/shared instruction lines and one extra bit
[Diagram: fetch, branch prediction, scan/align, Fastpath, microcode engine, µOPs, Instruction Control Unit (72 entries)]
Integrated Northbridge
Firmware View of Northbridge
• Performs same functions found in Northbridge
  – Memory Controller – fully integrated
  – Host-Bridge function as defined by the PCI spec
  – PCI to PCI Bridge as defined by the PCI spec
  – Graphics Address Resolution Table (GART)
  – Multi-processor coherency
• Controlled via PCI configuration registers
  – Memory controller configuration
  – HyperTransport™ technology routing
• Configured by Firmware
  – HyperTransport™ initialization via Hardware
    • Auto-size, coherent or non-coherent, “Legacy” path to the ROM in Southbridge
  – HyperTransport™ technology speed and routing via firmware
  – Everything else in firmware follows existing paradigms
    • PCI enumeration
    • Memory sizing and configuration
    • I/O controller setup
[Diagram: system request queue, crossbar, memory controller, HyperTransport™]
Systems View of Northbridge
(Assumes a 2GHz processor Clock)
HyperTransport™ Technology
• Screaming I/O for chip-to-chip communication
  – High bandwidth
  – Point-to-point links
  – Split transaction and full duplex
  – Differential Signaling
  – Tunneling capability
• HyperTransport Links
  – Three 16-bit links (3.2 GB/s per direction)
  – Reduced pin count compared to the typical Bus based systems
  – Compatible with high-volume PC board infrastructure
  – Each can be:
    • cHT: coherent (Processor-to-Processor) link or,
    • ncHT: non-coherent (Processor-to-I/O) link
  – For more info see: http://www.HyperTransport.org/
• Enables scalable 2-8 processor Cache-Coherent MP systems
  – Glueless MP
Performance
Multi-Processor Performance
Evaluation Simulation Parameters
• Microbenchmark Simulations:
– RTL based
– Cycle accurate
– DRAM Page hit
• System Parameters:
– AMD Opteron 2 GHz CPU
– Memory Clock = 333 MHz Data Rate
• Registered PC2700 DDR memory
– DRAM width = 128 bits interleaved
– CAS latency = 2.5 memory clocks
– HT frequency = 1600 MHz Data Rate (16 bits)
– DDR Peak Bandwidth = 5.4 GB/s
– HT Peak Bandwidth = 3.2 GB/s (each direction)
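The peak bandwidth parameters above follow from interface width times data rate. A minimal sketch of that arithmetic, with a hypothetical helper name and GB taken as 10^9 bytes:

```python
def peak_bandwidth_gb_s(width_bits: int, mega_transfers_per_s: float) -> float:
    """Peak bandwidth in GB/s for a given interface width and data rate (MT/s)."""
    return (width_bits / 8) * mega_transfers_per_s / 1000.0


ddr = peak_bandwidth_gb_s(128, 333)         # 128-bit DRAM interface at 333 MT/s
ht_one_way = peak_bandwidth_gb_s(16, 1600)  # 16-bit HyperTransport link at 1600 MT/s

print(f"DDR: {ddr:.2f} GB/s")
print(f"HT: {ht_one_way:.1f} GB/s per direction, {2 * ht_one_way:.1f} GB/s both directions")
```

The DDR result (about 5.33 GB/s) is close to the 5.4 GB/s listed above, which matches twice the nominal 2.7 GB/s of a PC2700 module.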
SPECint Performance
SPECint® 2000
[Chart: SPECint 2000 score (400-1300) vs. operating frequency (1000-3000 MHz), comparing AMD Opteron™ processor estimates with Intel Xeon processor* results]
AMD Opteron™ estimates are based on 2GHz lab hardware using 32-bit binaries.
*Source: http://www.spec.org/osg/cpu2000/results/cpu2000.html
SPECfp® Performance Comparison
[Chart: SPECfp 2000 score (400-1500) vs. operating frequency (1000-5000 MHz), comparing AMD Opteron™ processor estimates with Intel Xeon™ processor* results; points labelled A and B; annotations: ~1100 MHz, ~400 MHz]
AMD Opteron™ estimates are based on 2GHz lab hardware using 32-bit binaries.
*Source: http://www.spec.org/osg/cpu2000/results/cpu2000.html
SPECfp 2000 Base Competitive Summary
(32-bit Windows, PC2700 CAS2.5)
[Chart: SPECfp® 2000 score (0-1400) vs. CPU frequency (0-3 GHz); series: Base (IA32), Peak (IA32), AMD Opteron processor (estimated performance); labelled points: PIII 133FSB, P4 400FSB, P4 533FSB, AMD Opteron; annotation: "Redesign effort"]
Source: http://www.spec.org
AMD Opteron SPEC projections
compared to Alpha EV7
                      Alpha EV7   Opteron 1M (estimate)
Clock (MHz)           1250        2000
SPECint 2000 Peak     804         1202
SPECfp 2000 Peak      1253        1170
• AMD Opteron should be more cost-effective versus Alpha EV7
  – Standards versus Proprietary
  – Millions per month versus 100’s
AMD Opteron SPEC projections
compared to Itanium-2
                      Itanium-2   Opteron 1M (estimate)
Clock (MHz)           1000        2000
SPECint 2000 Peak     810         1202
SPECfp 2000 Peak      1427        1170
• AMD Opteron will be more cost-effective than Itanium-2
  – Standards versus Proprietary
  – Millions per month versus 1,000’s
Integrated Memory Controller
Latency (Local Memory Access, Registered Memory, CAS 2.5), 1.6GHz, PC2700
[Chart: access time in ns (0-130) vs. block size (4 bytes to 32M), one curve per stride from 4K to 32M bytes; stride regions marked: Stride <32k, 32k < Stride < 1M, Stride >1M; annotations: 85-95ns (L1 cache miss, TLB miss), 65ns (L1 cache miss, TLB hit)]
Integrated Memory Controller Performance
– Peak Memory Bandwidth
                   64-Bit DCT   128-Bit DCT
  DDR200 PC1600    1.6GB/s      3.2GB/s
  DDR266 PC2100    2.1GB/s      4.2GB/s
  DDR333 PC2700    2.7GB/s      5.33GB/s
– 1.6GHz AMD Opteron™ 800 Latency (333MHz PC2700 DIMM)
                   Page Hit     Page Miss
  0 hop            ~65ns        ~95ns
  1 hop            ~100ns       ~120ns
  2 hop            ~140ns       ~160ns
– Note: AMD Athlon™ and competitive 1P processors - Page Hit (0 hop) ~170ns
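To see what the per-hop latencies mean for a multiprocessor node, one can average them over the memory controllers a processor might access. The sketch below assumes a 4P node wired as a square (local memory, two neighbors at 1 hop, one node at 2 hops) and uniformly distributed accesses; both are assumptions made for illustration, not figures from the slides.

```python
# Latencies in ns by hop count, as quoted for a 1.6GHz Opteron 800 with PC2700.
PAGE_HIT_NS = {0: 65, 1: 100, 2: 140}
PAGE_MISS_NS = {0: 95, 1: 120, 2: 160}

# ASSUMPTION: hop counts from one processor to each memory controller
# in a 4P square topology (self, two neighbors, diagonal).
HOPS_4P_SQUARE = [0, 1, 1, 2]


def average_latency_ns(latency_by_hops: dict, hops: list) -> float:
    """Mean latency when accesses are spread evenly over all memory controllers."""
    return sum(latency_by_hops[h] for h in hops) / len(hops)


avg_hit = average_latency_ns(PAGE_HIT_NS, HOPS_4P_SQUARE)
avg_miss = average_latency_ns(PAGE_MISS_NS, HOPS_4P_SQUARE)
print(f"average page hit: {avg_hit} ns, average page miss: {avg_miss} ns")
```

A memory-aware OS would shift the weighting toward the 0-hop case, so real averages depend on placement.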
Symmetric Multi Processing (SMP)
• Advantages
  – The whole system appears like a single-processor system
  – The OS does not need to know how memory is laid out
  – The programmer need not worry about memory location
• Disadvantages
  – Systems are not scalable: the system bus becomes a HUGE bottleneck as more processors are added.
  – Hardware mechanisms are costly as they are specially built for low volume systems (e.g. the number of 8 ways is less than 10% that of 2 way systems)
  – With a large number of cache misses, the bus transfer activity rises and the whole system gets slower.
[Diagram: OS scheduling Task1/thread3, Task2/thread6, Task3/thread5, Task4/thread1, Task5/thread4 and Task6/thread2 onto processors that share system memory through intra-node communication]
Non Uniform Memory Architectures (NUMA)
• Advantages
  – NUMA architectures are the next logical step in scaling from SMP architectures.
  – They bring dramatic scalability advantages
• Disadvantages
  – May require specialized HW to implement
  – OS needs to be more aware of memory layout
  – The programmer may need to be aware of memory layout for optimum performance
  – Cache coherency protocols are more complex.
[Diagram: OS scheduling Task1/thread1 through Task12/thread12 across processor nodes]
Sufficiently Uniform Memory Organization (SUMO)
• Advantages
  – Software view of memory is SMP
    • Latency difference between local & remote memory is a function of the number of processors in the node
    • 1P and 2P look like an SMP machine
    • 3P and 4P are NUMA-like but can still be viewed as a ccUMA or asymmetric SMP node
    • >4P can be viewed as ccUMA and, depending on cache hit rate, may or may not require a NUMA-aware OS
  – Physical address space is flat and can be viewed as fully coherent or not (MOESI state)
  – DRAM can be contiguous or interleaved
  – Additional processor nodes bring true increased memory bandwidth
  – Designed for lower overall system chip count (glue-less interface)
• Disadvantages
  – 3P and 4P nodes work better if the OS is “aware” of the memory map
  – >4P may require a NUMA-aware OS if the cache hit rate is low
Future NUMA Systems
Scaling beyond 8 Processor
[Diagram: interconnect fabric of coherent HyperTransport switches (SW0-SW3), each connecting 4P nodes over a coherent interconnect]
• Scaling beyond 8P is enabled
• External Coherent HyperTransport switch
  – Snoop filter
  – Data caching
• Up to 16 processors within the same 2^40 SMP memory space
AMD Opteron™ Support ICs
• AMD is committed to delivering the highest quality systems solutions
• Providing a family of x86-64 processors is just the start
• AMD will promote and enable a broad range of HyperTransport™ support silicon from internal and external design efforts.
• AMD, with the HyperTransport™ consortium, will grow the HyperTransport™ ecosystem
HyperTransport Technology
Consortium
AMD-8131™
HyperTransport™ PCI-X Tunnel
• Dual PCI-X Master
  – Each PCI-X Bridge independently supports
    • 66, 100, 133MHz PCI-X Protocol
    • 33 and 66MHz PCI 2.2 Protocol
    • SHPC Controller
    • 64-bit data path
    • IOAPIC
    • Arbiter for up to 5 masters
    • Hot-swap
  – HyperTransport™ Support: 16/16 up, 8/8 down, independent support for
    • Up to 1600MT/s up and down
    • Full Link Auto sizing and speed selection
  – 829 OBGA, 37.5mm body, 1.27mm pitch, full array, 6-Layer Motherboard Breakout
[System diagram: AMD Opteron or AMD Athlon™ 64; 16x16 HyperTransport @ 1600MT/s to the AMD-8131 HyperTransport™ dual PCI-X tunnel; 8x8 HyperTransport @ 800MT/s to the AMD-8111™ I/O hub with 32 bits @ 33MHz PCI, LPC (FLASH, SIO), USB1.0/2.0, AC97, UDMA100, 10/100 Ethernet (10/100 Phy, 100 BaseT)]
AMD-8111™
HyperTransport™ I/O Hub
• I/O Hub
  – Engineered from past successful AMD I/O hub development efforts
  – 8x8 wide 200 MHz DDR HyperTransport™ technology interface (800MB/s aggregate BW)
  – Enhanced 10/100 Ethernet MAC
  – USB1.1, USB2.0, EDMA, AC’97
  – LPC for BIOS ROM and Super I/O
  – PCI version 2.2 - 33/32 Bridge (“legacy”)
  – Supports arbitration of up to 8 external masters
  – SMbus 1.0 and 2.0 controllers
  – 492 PBGA, 35x35mm body, 1.27mm pitch
[Diagram: AMD-8111™ I/O hub on 8x8 HyperTransport™ @ 800MHz, with 32 bits @ 33MHz PCI, EIDE, USB1.1/2.0, AC97, MII (10/100 BaseT NIC), LPC (FLASH, SIO)]
AMD-8151™
HyperTransport™ AGP Tunnel
– 8xAGP
  • Fully AGP 3.0 Compliant
  • 66, 133, 266, 533MHz operation
– HyperTransport™ Support: 16/16 up, 8/8 down, independent support for
  • Up to 1600MT/s up, up to 800MT/s down
  • Full Link Auto sizing and speed selection
– 564 OBGA, 31x31mm body, 1.27mm pitch, full array
[System diagram: AMD Opteron™ or AMD Athlon™ 64; AMD-8151 HyperTransport™ AGP tunnel with 8x AGP graphics; 8x8 HyperTransport @ 800MT/s to the AMD-8111™ I/O hub with 32 bits @ 33MHz PCI, USB1.0/2.0, AC97, UDMA100, 10/100 Ethernet (10/100 Phy, 100 BaseT), LPC (FLASH, SIO)]
Opteron™ & Athlon™
Server Chipset Roadmap
[Roadmap figure, timeline 2H02 - 2003 - 2004 - 2005]
• 7th Generation: AMD-760MP/MPX
• 8th Generation: AMD-8131 HyperTransport PCI-X Tunnel (2 PCI-X bridges), AMD-8151 HyperTransport AGP Tunnel, AMD-8111 HyperTransport I/O Hub
• Later: second generation HyperTransport PCI device, second generation HyperTransport™ I/O hub
Desktop Infrastructure Roadmap
Athlon 64 Desktop Chipset Roadmap
A Growing ecosystem of
HyperTransport™ enabled ICs
• Available today:
  – Dual MIPS processor - Broadcom BCM1250
  – PCI 66/64 Bridge from Alliance Semi.
  – NITROX Security Macro Processor from Cavium Networks
  – FPGA from XILINX and Altera
• Announced:
  – RM9000 MIPS processor from PMC Sierra
  – 4 Port 8/8 HyperTransport™ switch swap support from Alliance Semi.
  – SSL/TLS Record Processing Systems – Broadcom BC5850
  – Luminance™ Modular Array Technology - Lightspeed Semiconductor
• Planned:
  – InfiniBand™ Bridge
  – Proprietary High Speed Interconnect
  – 4 Port 16/16 non-coherent switch
  – 4 port 16/16 coherent switch
  – PCI-X Bridges
HyperTransportTM technology
4-way 16/16 Non-Coherent Switch
• Extends the fabric by re-mapping Unit_IDs at each port
  – Tracks the path of packets that pass through it, guaranteeing the same return path
  – Records the incoming Unit_ID so it can be restored in the response packet
• Follows same rules as Processor Host interface
  – Peer-to-peer through the switch, freeing up the host
  – Facilitates multiple Host fabrics
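The Unit_ID re-mapping described above can be pictured with a toy model: the switch records where a request came from and which Unit_ID it carried, substitutes a per-port Unit_ID on the way out, and restores the original on the response. This is a simplified sketch only; the `Packet` fields, the `tag` used to match responses, and the class name are assumptions for illustration and do not reproduce the HyperTransport packet format.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Packet:
    unit_id: int   # identifier of the requester as seen on the current link
    tag: int       # hypothetical tag used to pair a response with its request
    payload: str


class SwitchSketch:
    """Toy 4-port switch that re-maps Unit_IDs and restores them on responses."""

    def __init__(self, port_unit_ids: Dict[int, int]) -> None:
        self.port_unit_ids = port_unit_ids
        # (egress port, tag) -> (ingress port, original Unit_ID)
        self._pending: Dict[Tuple[int, int], Tuple[int, int]] = {}

    def forward_request(self, in_port: int, out_port: int, pkt: Packet) -> Packet:
        key = (out_port, pkt.tag)
        if key in self._pending:
            raise RuntimeError("tag already in flight on this port")
        self._pending[key] = (in_port, pkt.unit_id)
        return Packet(self.port_unit_ids[out_port], pkt.tag, pkt.payload)

    def return_response(self, from_port: int, pkt: Packet) -> Tuple[int, Packet]:
        in_port, original_id = self._pending.pop((from_port, pkt.tag))
        return in_port, Packet(original_id, pkt.tag, pkt.payload)


switch = SwitchSketch({0: 1, 1: 2, 2: 3, 3: 4})
outgoing = switch.forward_request(0, 2, Packet(unit_id=7, tag=5, payload="read"))
back_port, restored = switch.return_response(2, Packet(outgoing.unit_id, 5, "data"))
print(outgoing.unit_id, back_port, restored.unit_id)
```

Keeping the lookup keyed by egress port and tag is what guarantees that the response leaves through the port the request arrived on.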
Nine channel GigE Firewall
[System diagram: AMD Opteron™ with 16x16 HyperTransport @ 1600MT/s to the AMD-8131™ PCI-X tunnel (two PCI-X buses, 64 bits @ 133MHz each); 8x8 HyperTransport™ links (1000M transfers/sec. and 400MT/s); AMD-8111™ I/O hub with legacy PCI (VGA graphics), LPC (FLASH, SIO), Zircon BMC on a 100 BaseT management LAN (10/100 Phy), USB1.0, AC97, UDMA133, MII]
AMD Opteron DP - 2P Server with
SSL/IPsec encryption
[System diagram: security macro processor with DDR SDRAM; SP 8/8 switch; RM9000x2; SP1011 PCI bridge (PCI 66/64); Marvell Discovery 2; Marvell 64120A, 64240, 64241, 64244 system controllers (SysAD bus, PCI, DRAM, Gig Ethernet); Marvell 96100, 96122 communication controllers (serial channels, Ethernet, PCI, DRAM); customer ASICs and FPGAs]
1U/1P AMD Opteron™ Server
1U/2P AMD Opteron™ Server
4P Coherent System Based on two
2P MP Nodes
[System diagram: four AMD Opteron DP processors, each with 200-333MHz 9 byte registered DDR (8GB DRAM shown per node pair); probe directory (Horis) with SRAM; 16x16 HyperTransport @ 1600MT/s to the AMD-8131™ PCI-X tunnel (two PCI-X buses); AMD-8111™ I/O hub with legacy PCI (VGA graphics), LPC (FLASH, SIO), management controller on a 100 BaseT management LAN (10/100 Phy), USB1.0, AC97, UDMA133, MII]
AMD Opteron™
Beowulf 4P SMP Processing Node
One 4P SMP node
• 16G-flops
• 32GB DRAM
• 10GB/sec. Memory BW
[System diagram: four AMD Opteron™ processors, each with 8GB DRAM on 200-333MHz 9 byte registered DDR; two links to AMD 8131 tunnels; 16x16 HyperTransport @ 1600MT/s to the AMD-8131™ PCI-X tunnel (two PCI-X buses); AMD-8111™ I/O hub with legacy PCI (VGA graphics), LPC (FLASH, SIO), management controller on a 100 BaseT management LAN (10/100 Phy), USB1.0, AC97, UDMA133, MII]
HyperTransport Technology on the
Backplane – non coherent interconnect
[Diagram: 4P blades with hot swap connections to SI4041 switches; switches and 8111 on the backplane]
Two - 8 Processor System Topologies
(NUMA)
• Only three nodes are 3 hops away
• No nodes are 4 hops away
• More redundant Paths
To 8111
To I/O device
2P Server with Add-on Accelerator Daughter Card
[System diagram: two AMD Opteron™ processors, each with 8GB DRAM (200-333MHz 72-bit registered DDR); three AMD-8131™ PCI-X tunnels (two PCI-X buses each); HyperTransport-enabled daughter card carrying a Luminance™ Modular Array ASIC interface device on 8x8 HyperTransport @ 1.6GB/sec.; AMD-8111™ I/O hub with PCI 33/32, EIDE, LPC (FLASH, SIO), USB1.1/2.0, AC97, ACR 1.0, GMII (10/100 NIC)]
AMD Athlon 64 1P Blade Design
• Ultra low cost Blade design
  – 4GB 333MHz DRAM
  – 2GHz processor
  – ~35 Watts
• Luminance Device
  – Boots the Processor
  – Provides HCA network interface
[Diagram: AMD Athlon 64™ with 4GB DRAM and boot ROM; 16x16 HyperTransport @ 1,000MT/s to the Luminance™ Modular Array ASIC interface device with HCA interface]
AMD Opteron™ Processor
DP – 2P Graphics Workstation
2P AMD Opteron™ Processor
Graphics Workstation (Cave)
High density SprayCooled
Blade Configuration
• 4P – 16G-flop Blade Design
• 64GB of SMP DRAM
• ASIC boots the 4P unit
• PCI-X provides all I/O
• Vapor cooled in sealed
enclosure
External
VRM
How ISR SprayCoolTM Technology Works
a. As the electronics are sprayed, the fluid vaporizes, cooling the electronics to a low, stable temperature.
b. Vapor travels through the heat exchanger to be condensed
c. Fluid collects in reservoir
d. Fluid is purified by the filtration system
e. Fluid is pumped back into the electronics in a continuous cycle
f. Sealed enclosure protects electronics from dust, dirt, salt-air…
High Density HPC Cluster
SprayCool Technology from ISR
• 16 cards
• 16G-flops/card
• 256G-flops peak throughput
• 64GB of memory per card
• 1 terabyte of system memory
• 2,240 cubic inches (16” x 10” x 14”)
  – 114M-flops/cubic inch
  – ~0.46GB of memory storage per cubic inch
  – ~3 watts/cubic inch
• ~6K watts
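The density figures can be re-derived from the per-card numbers and the enclosure dimensions. The sketch below assumes the enclosure volume is the product of the three dimensions shown (16 x 10 x 14 inches) and takes 1 terabyte as 1024 GB; variable names are chosen for this example only.

```python
CARDS = 16
GFLOPS_PER_CARD = 16
MEMORY_GB_PER_CARD = 64
POWER_WATTS = 6000              # "~6K watts"
WIDTH_IN, DEPTH_IN, HEIGHT_IN = 16, 10, 14   # ASSUMPTION: enclosure is a simple box

volume_in3 = WIDTH_IN * DEPTH_IN * HEIGHT_IN
total_gflops = CARDS * GFLOPS_PER_CARD
total_memory_gb = CARDS * MEMORY_GB_PER_CARD

mflops_per_in3 = total_gflops * 1000 / volume_in3
memory_gb_per_in3 = total_memory_gb / volume_in3
watts_per_in3 = POWER_WATTS / volume_in3

print(f"volume: {volume_in3} cubic inches")
print(f"{mflops_per_in3:.0f} MFLOPS, {memory_gb_per_in3:.2f} GB, "
      f"{watts_per_in3:.1f} W per cubic inch")
```

Under this volume assumption the throughput and power densities agree with the quoted 114M-flops and ~3 watts per cubic inch.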
AMD Reference Design Kits
Four Hardware platforms
• Solo (AMD): 1P AMD Opteron motherboard for desktop applications
• Serenade (AMD): 2P AMD Opteron system board for HPC and server applications
• Quartet (AMD): 4U-4P AMD Opteron system board for HPC and server applications
• Khepri (Newisys): 1U-2P AMD Opteron server board
Solo Features
– Athlon64 Uni-processor
– Two Unbuffered PC2700, PC2100 DDR DIMMs
– AMD 8151 AGP8X – HyperTransport Tunnel
– AMD 8111 I/O Hub
  • Four PCI 32b 33MHz slots
  • Two ATA-100 EIDE connectors
  • Six USB 2.0 ports
    – 3 on back panel, 2 on front panel, and 1 on ACR
  • AC ’97 audio
  • SMBus 1.0 and 2.0 support
  • One ACR slot; 1 Fan with sense and 1 Fan without sense
– Floppy, serial, parallel, 2 PS/2 and 2 IEEE 1394a
connectors
– LPC Super I/O with 2 fans with sense
– 4-layer ATX form factor with ATX power supply
– PC2001, WHQL, Energy Star, WFM 2.0 compliant
Hammer Performance Desktop
(Solo-RDK)
1U/2P “Serenade”
CPU/Memory Complex
– Opteron processor 200 Series (supports up to 2 processors)
– Four banks of 128bit registered DDR memory/CPU (DDR 200-333)
I/O
– Full size PCI-X slots: Two PCI-X 64/100 MHz or one PCI-X 64/133 (not hot-pluggable)
– One mini-PCI slot
– Dual Broadcom 10/100/1000 Ethernet onboard
– Dual LSI U320 SCSI (one channel to disk, one channel to rear expansion)
– Single USB1.1: to front
– SIO (Floppy, Serial, Keyboard, Mouse)
Management
– Single dedicated management, LAN10/100
– Optional BMC management controller, IPMI 1.5 compliant
Storage
– Dual drive bays: (standard) IDE or (standard or hot-swap) SCSI drives
– Slim-line IDE CD-ROM or slim-line floppy drive
Physicals
– 1U Rack-mount server form factor, tool-less access, full extension slide rails
– Single 500W power-supply, rear accessible to line cord
– Removable blowers, cooling performed front-to-rear (passive CPU heatsinks)
– Front LED panel with activity and status: PWR, RESET, USB, PCI-Video
– Dimensions: (1U) x 19” W x 28” D
1U/2P “Serenade” Front View
[Annotated photo, front view, 28” deep chassis]
• Full size PCI-X slots (x2), 64/100 MHz, or single PCI-X 64/133 (riser w/sideband)
• SCSI disk option (Mini-PCI)
• AMD Opteron 200 Series (x2)
• 32/33MHz PCI (half-height/half length) (video option)
• 10 redundant blowers (front to back cooling)
• 8 DIMMs DDR 266-333 ECC (4 DIMMs/CPU)
• 500W power supply
• CDROM or floppy (slimline)
• Drive carriers (x2) (SCSI hot swappable)
1U/2P “Serenade” Rear View
[Annotated photo, rear view]
• Full size PCI-X slots (x2), 64/100 MHz, or single PCI-X 64/133 module assembly (riser w/sideband)
• AMD Opteron 200 Series (x2) cooling ducts
• U320 SCSI option (Mini-PCI)
• 32/33MHz PCI (half-height/half length) (std. half-height video option)
• PS2 ports
• USB port
• Dedicated 10/100 IPMI management port
• Dual 10/100/1000 ENET
“Quartet”: 4U/4P
SledgeHammer MP 940-pin Processor
Quartet System Features
– 4U Rack-mount server form factor (25” deep)
EIA-Std
– 4P Opteron (940-pin)
– Four banks of 128bit registered DDR memory per
CPU (designed for DDR-333) – 16 Total
– Five full size PCI-X slots (AMD 8131):
• Two PCI-X 64/133 MHz (hot plug-able)
• Three PCI-X 64/66 MHz
– Ethernet Ports:
• Dual Broadcom 10/100/1000 Ethernet onboard
• Single 10/100 (AMD-8111)
– Dual LSI U320 SCSI (one channel to disk, one
channel to rear expansion)
– System Management: Qlogic UL BMC IPMI 1.5 via
dedicated LAN/Modem
Quartet System Features (cont)
– Dual IDE: Slim-line CD-ROM, Slim Floppy
– Dual USB: one front, one rear
– SIO (Floppy, Serial, Keyboard, Mouse)
– Storage: Four 1” hot-swap Ultra320 SCSI drives
– Video: ATI 4 Meg (via card option PCI 32/33)
– Three 500W hot-swap power-supplies (2+1 redundancy) for 4U; rear accessible to three line cords
– Hot-swap redundant fans (10)
– Front LED panel with activity and status: PWR, RESET, USB, PCI-Video
– Full extension slide rails
– Dimensions: 5.25” H x 19” W x 28” D (*5.25” is main/processor section; an additional 1.75” is the power supply bay)
– Cooling front to rear (passive CPU heatsinks)
– Tool-less access
Dual Processor Opteron System
Khepri
• 1U 2P Opteron
• 16 GigaBytes RAM, max
• Fully Managed
• Linux 32 & 64 bit
• Windows 32 bit 2000 and .Net Server
• Windows 64 bit (when available)
Khepri Block Diagram
Khepri Alpha Internal View
Availability
• Solo (AMD Athlon 64)
  – Prototypes are available now
  – Production planned in Sept. 2003
• Serenade (AMD) – Development platform
  – RDK available now
  – Production planned for June 2003
• Quartet (AMD)
  – RDK available June 2003
  – Production planned for Aug. 2003
• Khepri (Newisys)
  – Development units are available now through AMD Beachhead Program
  – Production Now
Platform Enablement Program
• Over the past 24 months, AMD has provided technical design support to ~50 companies
• To date, Newisys has enabled over 17 vendors with its Khepri 2P platform reference design
• By Launch (April 2003) there will be 4+ announcements of 4P HPC servers based on AMD Opteron.
• By Nov. 2003 there will be many more vendors with 4P and up to four vendors with 8P SMP/NUMA AMD Opteron platforms.
• With the availability of a HyperTransport coherent switch, the NUMA server can grow to 32P and beyond.
2002-2003 AMD Server Roadmap
[Roadmap chart: DP/MP systems by quarter (4Q02, 1Q03, 2Q03, 3Q03, 4Q03) and by segment (Enterprise, Scalable, Basic +, Basic, Value +, Value, Ultra-Value)]
• Enterprise and Scalable: SH MP parts from 1.4 to 2.6
• Basic + and Basic: SH MP and SH DP parts from 1.4 to 2.6
• Value +, Value and Ultra-Value: SH DP 1.4 - 1.8, BAR 2.2/2800+, THR 1.67/2000+, THR 1.8/2200+, THR 2.0/2400+, THR 2.13/2600+
Legend:
• THR: AMD Athlon™ MP processor “Thoroughbred” (266MHz FSB)
• BAR: AMD Athlon MP processor “Barton” (266MHz FSB)
• SH MP: AMD Opteron processor “SledgeHammer” MP
• SH DP: AMD Opteron processor “SledgeHammer” DP
Summary
AMD Opteron Processor
• Optimized for high performance operation
– Chip infrastructure optimized for sub micron process impacting:
• Power distribution, Clocking, Circuit design and layout
• 20-25% better performance per clock than AMD Athlon XP
  – Smart low-latency memory controller
  – Branch prediction, Cache and TLB improvements
  – Advanced clock distribution methods
  – New operand/address sizes, rather than new instructions
• Integrated DDR Memory System Controller
  – Closing the gap between external memory access and CPU speed
  – Reduced latency compared with the current state of the art (AMD Athlon™ processor)
  – Greater bandwidth than the current state of the art (AMD Athlon™ system)
• Integrated Coherent HyperTransport I/O supporting
  – High speed peripheral connections - >6.4GB/s throughput
  – Coherent HyperTransport™ technology to support glueless MP interface
Trademark Attribution
©Copyright 2002 Advanced Micro Devices, Inc. All rights reserved.
AMD, the AMD Arrow Logo, AMD Athlon, AMD Opteron, 3DNow! and
combinations thereof are trademarks of Advanced Micro Devices, Inc.
HyperTransport is a licensed trademark of the HyperTransport
Consortium. MMX is a trademark of Intel Corporation. Other product
names used in this presentation are for identification purposes only and
may be trademarks of their respective companies.