keynote presentation
Architectural Musings
Rethinking Computer Systems Architecture
Christopher Vick
[email protected]
June 3, 2012
1
Introduction
Vision Talk
Mobile computing and current technologies fundamentally
change key parameters and constraints for computer
system architecture
Vast new opportunities for research of great interest to
and great relevance for industry
Outline
Computer System Architecture
Then (Circa 1970)
Scarce Resources & Bottlenecks
Optimizations
Now (Mobile Computing Platforms)
Scarce Resources & Bottlenecks
Optimizations?
Qualcomm Research
Questions?
3
COMPUTER SYSTEM
ARCHITECTURE
4
Computer System Architecture
Hardware
The 5 classic components (Patterson & Hennessy)
Input, Output, Memory, Datapath, Control
Software
System Virtual Machine (Hypervisor, VM, or VMM)
Operating System
Compilers & Tools
Definitions
The way components fit together
The arrangement of the various devices in a complete computer system or
network
The instruction set plus a model of the execution of the instruction set
(Amdahl et al.)
Computer System Architecture
The selection and combination of hardware and software components to
assemble an effective computer system
5
Combination
Software
Application Programs
Libraries
Operating System
Drivers
Memory Manager
Scheduler
Hypercall Interface
Virtual Machine
Hardware
Multicore Execution Unit
Interconnect
IO Devices
Memory
Effective
An optimization problem
Many variables
Selection of hardware/software components
Selection of interfaces/interconnects
Many constraints
Physical, sociological, technical & cost constraints
Scarce Resources and Bottlenecks
Maximize utilization of scarce resources
Minimize impact of bottlenecks
7
THEN
(CIRCA 1970)
8
Scarce Resources
CPU Cycles
CPUs expensive
Slow clock rates
Memory Locations
Random Access Memory expensive
Address/Data paths into CPU expensive
Skilled Programmers
Relatively new discipline
Poor language and tools support
9
Bottlenecks
Programmer Productivity
Software development slow and expensive
Low level programming paradigms
Memory Latency
RAM latency gated overall speed (~2-3 MHz)
Small RAM backed by vastly slower storage
I/O Bandwidth
Limited CPU connectivity
Crude communication mechanisms
10
Optimizations
Time Sharing
Effective sharing of limited resource
Virtual Memory
Effective sharing, and backing with cheaper alternative
Hardware Improvements
Smaller features provide more resource and faster clock
Large Scale Integration
Better signaling to improve bandwidth
High Level Programming Languages
Broadens productive programmer community
Abstracts away some hardware complexity
11
Examples
Digital PDP-11
16-bit address space
Orthogonal instruction set
Memory mapped I/O
Unix, DOS, many others
IBM System/370
24-bit address space
Virtual Memory
MVS, VM/370, DOS/VS
Backward compatibility with System/360
NOW
(MOBILE COMPUTING)
13
Scarce Resources
Energy
Fixed Energy Budget for mobile devices
Thermal issues at all scales
Tradeoff between performance and energy
Process shrinks no longer significantly improve energy consumption
Memory Bandwidth
Providing bandwidth is expensive
Memory interconnect consumes significant energy
14
Bottlenecks
Memory Latency
Increasing gap between CPU speed and DRAM latency
Physical distance to DRAM devices a factor
Concurrency
Shortage of programmers who can handle this
Inadequate language/tools support
I/O Bandwidth/Latency
Wireless bandwidth lower than wired
Consumes large amounts of energy
15
Example
HTC One
Processor: 1.5 GHz Dual Core Qualcomm MSM8960
OS: Android™ 4.0 (ICS)
Memory RAM: 1 GB DDR2
Memory Storage: 16 GB onboard storage
Display: 4.7" HD super LCD 1280 x 720
Network: LTE CAT3 - DL 100 /UL 50 LTE: 700/AWS
WCDMA: 2100/1900/AWS/850
EDGE: 850/900/1800/1900
Battery: 1800 mAh
Camera (Main): 8 MP, f/2.0, BSI, 1080p HD Video
(Front): 1.3 MP with 720p video
Dimensions: 134.8 x 69.9 x 8.9mm
This is a General Purpose Computer!
16
Optimizations?
Multi-core
Aggressive addition of cores and threads
Hardware concurrency outstripping software
New Concurrent Programming Models/Tools?
Memory Subsystem
Significant contributor to total energy consumption
Adding bandwidth is expensive
New technologies addressing some energy issues
Wireless bandwidth enhancements (LTE Advanced, etc.)
Solutions from desktop/server or embedded worlds
may not directly apply in mobile space!
17
Memory System Energy
Retaining data (one second)
DRAM: ~1-10 pJ/bit self-refresh
SRAM: 1200+ pJ/bit, and rising over time [ITRS 2009]
4 pJ/bit (45nm LP, standby) [Barasinski et al., ESSCIRC ‘08]
Flash, PCM, STT RAM…: Zero!
Moving Data
32-bit value:
Recompute: 60 pJ (Razor)
Send 1mm: 10 pJ
Retain in cache for 1 ms: 38 pJ
Retain in DRAM for 1 second: 32+ pJ
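The per-word retention figures above appear consistent with the per-bit figures quoted earlier on this slide. The sketch below redoes that arithmetic, assuming retention energy scales linearly with bit count and time, and assuming the cache figure is based on the 1200 pJ/bit SRAM number and the DRAM figure on the low end of the 1-10 pJ/bit range. The function and constant names are illustrative, not from the talk.

```python
# Per-bit energy to hold data for one second, as quoted on the slide.
DRAM_PJ_PER_BIT_S = 1.0      # low end of the ~1-10 pJ/bit self-refresh range
SRAM_PJ_PER_BIT_S = 1200.0   # ITRS 2009 figure

WORD_BITS = 32


def retention_energy_pj(pj_per_bit_per_s, bits, seconds):
    """Energy in pJ to hold `bits` for `seconds`.

    Assumes energy is linear in both bit count and time, which is a
    simplification made for this sketch.
    """
    return pj_per_bit_per_s * bits * seconds


# 32-bit value kept in an SRAM cache for 1 ms.
cache_1ms = retention_energy_pj(SRAM_PJ_PER_BIT_S, WORD_BITS, 1e-3)
# 32-bit value kept in self-refreshing DRAM for 1 second.
dram_1s = retention_energy_pj(DRAM_PJ_PER_BIT_S, WORD_BITS, 1.0)

print(f"cache, 1 ms: {cache_1ms:.1f} pJ")
print(f"DRAM, 1 s:   {dram_1s:.1f} pJ")
```

Under these assumptions the results land at roughly 38 pJ and 32 pJ, matching the slide, and make the point that a millisecond in cache costs about as much as a second in DRAM.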
Reducing Memory System Energy
Move less!
Caches physically close to CPU
Locality, locality, locality (the first rule of chip real estate)
Retain less!
Power off unused cache lines [Kaxiras et al., ISCA ‘01]
“Drowsy” caches [Flautner et al., ISCA ‘02]
… with compiler analysis
[Zhang et al., Trans. Emb. Comp. Sys. 4(3) 2005]
Don’t refresh unused DRAM
… e.g. with garbage collection [Chen et al., CODES+ISSS ‘03]
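One way to read "Retain less!" is as a break-even question: beyond some idle time, keeping a value in cache costs more energy than dropping it and recomputing it later. The sketch below uses only the two 32-bit figures quoted in this talk (60 pJ to recompute, 38 pJ per millisecond to retain in cache) and assumes retention cost grows linearly with time. The function names are hypothetical, and this is an illustration of the trade-off, not a policy from any of the cited papers.

```python
RECOMPUTE_PJ = 60.0             # recompute a 32-bit value (figure from the talk)
CACHE_RETAIN_PJ_PER_MS = 38.0   # retain 32 bits in cache for 1 ms (figure from the talk)


def break_even_ms():
    """Idle time at which retaining in cache costs as much as recomputing."""
    return RECOMPUTE_PJ / CACHE_RETAIN_PJ_PER_MS


def cheaper_choice(idle_ms):
    """Return 'retain' or 'recompute' for a value unused for `idle_ms`.

    Assumes retention energy is linear in time (a simplification).
    """
    retain_pj = CACHE_RETAIN_PJ_PER_MS * idle_ms
    return "retain" if retain_pj <= RECOMPUTE_PJ else "recompute"


print(f"break-even idle time: {break_even_ms():.2f} ms")
for idle in (0.5, 1.0, 2.0, 5.0):
    print(f"idle {idle} ms -> {cheaper_choice(idle)}")
```

With these figures the break-even point is a little over 1.5 ms, which suggests why powering off or "drowsing" lines that sit idle can pay off.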
19
Extending the Memory Model
Maintaining the illusion of a single flat memory address
space is too expensive
On-chip caches can be major consumers of area and energy
Coherence protocols are expensive and difficult to scale
• Alternative: software-managed memory hierarchies
– Tightly-coupled memory (TCM), scratchpads
– Do not require tag memory, address comparison logic
– More area- and energy-efficient
– Help bridge gap between bandwidth and throughput
20
New Challenges and Opportunities
Different programming paradigm: software explicitly
orchestrates all transfers between on-chip and off-chip
memory areas
Major implications on memory management
Scratchpad allocation strategies
Data partitioning strategies
Dynamic relocation between scratchpad and DRAM to track the
program’s locality characteristics
Opportunities for compile-time and runtime optimization
Challenges in both Hardware and Software!
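As a minimal sketch of one possible scratchpad allocation strategy, the code below greedily places the buffers with the most accesses per byte into a fixed-size scratchpad and leaves the rest in DRAM. The `Buffer` type, the buffer names, the access counts, and the 8 KB capacity are all made up for illustration; a real allocator would also handle lifetimes, alignment, and the dynamic relocation mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    name: str
    size_bytes: int   # must be positive
    accesses: int     # profiled or estimated access count (hypothetical data)


def allocate_scratchpad(buffers, capacity_bytes):
    """Greedy placement by access density (accesses per byte).

    Returns (placed, spilled): names put in the scratchpad, and names
    left in DRAM because they did not fit in the remaining space.
    """
    placed, spilled = [], []
    free = capacity_bytes
    by_density = sorted(
        buffers, key=lambda b: b.accesses / b.size_bytes, reverse=True
    )
    for buf in by_density:
        if buf.size_bytes <= free:
            placed.append(buf.name)
            free -= buf.size_bytes
        else:
            spilled.append(buf.name)
    return placed, spilled


buffers = [
    Buffer("coeffs", 1024, 500_000),
    Buffer("frame", 65536, 200_000),
    Buffer("lut", 4096, 400_000),
]
placed, spilled = allocate_scratchpad(buffers, capacity_bytes=8192)
print("scratchpad:", placed)
print("DRAM:      ", spilled)
```

The greedy rule is a heuristic and not guaranteed optimal; the underlying problem resembles a knapsack, which is one reason the talk flags this as an area for compile-time and runtime optimization.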
21
Qualcomm Research
Excellence in Wireless
MAY | 2012
WWW.QUALCOMM.COM/RESEARCH
State of the Art Capabilities Fostering Innovation
Human Resources
• 30% of engineers with PhD, 50% Masters
• Systems, HW, SW, Standards, Test Engineering
• Ventures, Bus Dev, Technical Marketing, Program Mgmt.
Complete Development Labs
• Prototype Development Facilities
• CPU Simulation Clusters
• Outdoor Field Systems
• Antenna Ranges
Global Research and Development Organization
UNITED STATES
• San Diego, CA
• Santa Clara, CA
• Bridgewater, NJ
EUROPE
• Cambridge, UK
• Nuremberg, Germany
• Vienna, Austria
ASIA
• Beijing, China
• Bangalore and Hyderabad, India
• Seoul, S. Korea
24
Qualcomm Research & University Relations
ACADEMIC COLLABORATION TO FOSTER ADVANCED RESEARCH
Ongoing relations with more than 30 US and 25 International Universities
Current funding includes MIT, UC Berkeley, Stanford, UCSD, UT Austin, ASU,
UIUC, Univ. of Michigan, EPFL, IISc Bangalore, KAIST, Tsinghua
Research collaboration spans variety of technical areas
Computer vision, multicore processing, context aware computing, machine
learning, low power devices, wireless networks and signal processing, etc.
Qualcomm Innovation Fellowship (QInF) invests in innovative ideas
Close interactions between Qualcomm Research engineers, graduate students and
professors
25
Qualcomm Research For The Wireless Future
TAKE WWAN TO THE NEXT LEVEL: IMPROVING WWAN TECHNOLOGY
INNOVATE BEYOND WAN: EXCELLING IN ALL FORMS OF WIRELESS
ENABLE SMART APPLICATIONS: TRANSFORMING THE MOBILE USER EXPERIENCE
BREAKTHROUGH PERFORMANCE: RE-ARCHITECTING NEXT-GEN MOBILE DEVICES
Innovate Beyond WAN
WIRELESS LOCAL AREA
PEANUT
• Next gen short range ultra-low power radio
WIFI ADVANCED
• Multi Gbps WLAN using 5 GHz and 60 GHz bands
• Next Gen low-power WiFi for Internet of Things
LTE D2D (FLASHLINQ)
• Proximal Wireless
• First Gen device-to-device wireless network
• Autonomous discovery
• Direct communications
INNAV
• Indoor positioning for indoor location based applications
• Map tools for Mobile Devices
Enable Smart Applications
ELEVATE THE WIRELESS USER EXPERIENCE
AUGMENTED REALITY
• Mobile user interface
• Computer vision for mobile devices
LOOK
• Multiple language text detection and recognition
• With Mobile phone camera view finder
LISTEN
• Background Audio processing
• Augmented user experience
DASH
• Efficient video delivery over HTTP for mobile devices
AWARE
• Build awareness in mobile devices
• For enhanced daily life situations
Breakthrough Device Performance
RE-ARCHITECTING NEXT-GEN DEVICES
ADVANCED RADIO TECHNOLOGIES
• New RF front-end and baseband technologies
• Advanced mobile device SW platforms
• RF/antenna and systems/protocol techniques
• Improved user experience
• Concurrent multi-radio operation
MANTICORE
GRYPHON
• Virtual machine design for SoC architecture
• Enabling higher power efficiency
Thank You