SYMP02 John Feo Architect Microsoft Corporation Bad sequential code will run faster on a faster processor Bad parallel code WILL NOT run faster on more.


SYMP02
John Feo
Architect
Microsoft Corporation
Bad sequential code will run faster on a faster processor
Bad parallel code WILL NOT run faster on more cores
Just using parallel code is not enough
[Chart: Speedup (vertical axis, 0 to 3) plotted at 1, 2, 4, 8, 16, and 32 on the horizontal axis]
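For reference, speedup can be written out in its usual sense. The slides do not define it and do not name Amdahl's law; both are added here as standard background, and the parallel fraction $f$ and core count $p$ are symbols introduced for this note only.

```latex
% Speedup on p cores: serial run time divided by parallel run time.
S(p) = \frac{T_1}{T_p}

% Amdahl's law (an assumption, not from the slides): if only a fraction f
% of the work can run in parallel, speedup is bounded no matter how many
% cores are added.
S(p) \le \frac{1}{(1 - f) + \dfrac{f}{p}}, \qquad
\lim_{p \to \infty} S(p) = \frac{1}{1 - f}

% Illustrative numbers only (not read off the chart): f = 0.7, p = 32
S(32) \le \frac{1}{0.3 + 0.7/32} = \frac{1}{0.321875} \approx 3.11
```

Under this assumption, a code whose serial part is 30% of the run time cannot exceed a speedup of about 3.3 on any number of cores.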





(throughput)
(capability)
[Diagram: four Memory blocks and four CPU blocks]
Optimize for critical resource

• Decompose to scale with problem size and processor count
• Too much parallelism may increase communication/synchronization costs
• Over-decompose to improve load balance
• Rely on MS runtime to schedule and manage threads
• Watch out for resources consumed by waiting tasks
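The over-decomposition guideline above can be sketched in standard C++. This is not code from the talk and it does not use the Microsoft runtime the slide refers to. It is a minimal illustration built on `std::thread`, and the names `cost`, `serial_sum`, and `dynamic_sum`, as well as the uneven workload, are hypothetical.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical uneven workload: item i needs (i % 64) + 1 loop iterations.
std::uint64_t cost(std::size_t i) {
    std::uint64_t acc = 0;
    for (std::size_t k = 0; k <= i % 64; ++k) acc += k;
    return acc;
}

// Reference result computed on one thread.
std::uint64_t serial_sum(std::size_t n) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < n; ++i) total += cost(i);
    return total;
}

// Over-decompose n items into many small chunks. Each worker repeatedly
// claims the next unclaimed chunk, so a worker that finishes early simply
// takes more chunks instead of sitting idle.
std::uint64_t dynamic_sum(std::size_t n, unsigned workers, std::size_t chunk) {
    assert(workers > 0 && chunk > 0);
    std::atomic<std::size_t> next{0};                // start of next unclaimed chunk
    std::vector<std::uint64_t> partial(workers, 0);  // one result slot per worker
    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            std::uint64_t local = 0;                 // accumulate privately
            for (;;) {
                const std::size_t begin = next.fetch_add(chunk);
                if (begin >= n) break;               // no work left
                const std::size_t end = std::min(begin + chunk, n);
                for (std::size_t i = begin; i < end; ++i) local += cost(i);
            }
            partial[w] = local;                      // single write at the end
        });
    }
    for (auto& t : pool) t.join();
    std::uint64_t total = 0;
    for (std::uint64_t p : partial) total += p;
    return total;
}
```

The chunk size is the tuning knob the bullets describe: smaller chunks balance load better, but every claim is an atomic operation, so very small chunks raise synchronization cost.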

[Diagram labels: Audio, Avatars, Vehicles, Video, UI, Network, Weapons]
[Figure, shown twice: graph G(V, E) with source vertex s and target vertex t; other labels are 1 to 9 and A to E]
Bad sequential code will run faster on a faster processor
Bad parallel code WILL NOT run faster on more cores
MSDN.com/concurrency
And download Parallel Extensions to the .NET Framework!
Jerry Bautista, PhD
Director, Microprocessor Technology Management
Intel Corporation
VISUAL COMPUTING
SOCIAL NETWORKING
BROADBAND CONNECTIVITY
USER-GENERATED CONTENT
MOBILE COMPUTING
Applications that use every available FLOP
Look real
Act real
Feel real
Bringing VC to connected usage models
Social networking, collaboration, online gaming, online retail, and more.
Enhancing the actual world
Creating new digital worlds
The Actual World
CONNECTED People Everywhere
Multiplayer Games
Earth Mapping
Rich Visual Interfaces
Virtual Worlds
Augmented Reality
Internet Data
Real-world data visualization
LIMITED
3D Digital Entertainment
Static
Web
Web 2.0
CVC
Better content quality, social interaction – a better user experience
RICH
• Realistic, representative visuals, both professional and user-generated
• Socialization, education, entertainment, collaboration
Virtual Worlds
Multiplayer Online Games
3D Cinema
All company and/or product names may be trade names, trademarks and/or registered trademarks of
the respective owners with which they are associated.
Sharing data and representing data in richer, more intuitive ways.
OpenSim N-body Simulation
Visualizing Real World Information: Dust storm in Morocco
West Nile Virus Visualization
Virtual Colonoscopy
Virtual team rooms
Enterprise-class environments to allow virtual teams to have realistic, natural interactions
Virtual information environments
Information space for documents, app-sharing, and visualizations
Combines real world info with data overlays
Location Information
Identification & Hyperlinking
Virtual Instruction
Translation
Mobile Augmented Reality (MAR) particularly compelling
Timeline: Today, 2010, 2012, 2014
Milestones: Map Hybrids, Visual Search, Text Overlays, 2D/3D Visual Overlays
Platform Optimization
• Server, client demands
• Network performance
• Energy-efficiency
Visual Content
• Interoperability
• User creation
Distributed Computing
• Scaling
• Client diversity
• Programmability
Mobile Experience
• Better connectivity, BW
• Sensor integration
[Figure axis labels: interaction complexity, complexity, richness]
SERVERS: 10x More Work
TYPE       SOFTWARE       MAX CLIENTS PER SERVER
MMORPGs    WoW            2500
VWs        Second Life    160
75%+ Time = Compute Intensive Work
65%+ Time = Compute Intensive Work

CLIENTS: 3x CPU, 20x GPU
APPLICATION     % CPU UTILIZATION    % GPU UTILIZATION
2D Websites     20                   0-1
Google Earth    50                   10-15
Second Life     70                   35-75

NETWORK: 100x Bandwidth
Maximum Bandwidth Limited by Server to Client
[Chart: Bandwidth (in KB/s) versus Time (in seconds, 0 to 150), Cached and Uncached]

Sources: WoW data (source www.warcraftrealms.com), Second Life data (source Intel Linden Labs CTO-CTO meeting and www.secondlife.com), and Intel measurements.
[Chart: Parallel Speedup (0 to 64) versus Cores (0 to 64), one curve per workload listed below]
Production Fluid
Production Face
Production Cloth
Game Fluid
Game Rigid Body
Game Cloth
Marching Cubes
Sports Video Analysis
Video Cast Indexing
Home Video Editing
Text Indexing
Ray Tracing
Foreground Estimation
Human Body Tracker
Portfolio Management
Geometric Mean
Graphics Rendering, Physical Simulation, Vision, Data Mining, Analytics
Intel® Threading Building Blocks
Intel Threading Building Blocks (Intel TBB) is a C++ runtime library that abstracts the low-level threading details necessary for optimal multicore performance.
Intel® Thread Checker
Intel Thread Checker is an analysis tool that pinpoints hard-to-find threading errors like data races and deadlocks in 32-bit and 64-bit applications.
Smoke
A game framework to maximize core utilization



*Other names and brands may be claimed as the property of others.
The Framework
How is Smoke highly threaded?
1. Scheduler manages system jobs
2. Change Control (CC) Manager minimizes thread synchronization
3. Data structured to support independent processing
4. System modularity (through interfaces)
5. Systems are specific to the demo (e.g. AI, physics, etc.)
[Diagram labels: Engine, Scheduler, Scene CC, UScene, Managers, Framework, Parser, Object CC, Environment, Task, Service, Platform, …, UObject, Interfaces, Systems, Definition Files, System]
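The slides say only that a Change Control Manager "minimizes thread synchronization" and do not show how. One common way to get that effect, assumed here and not taken from Smoke's source, is to let each system record its changes in a private buffer during the parallel phase and apply all buffers on one thread afterwards. The names `Change` and `run_frame` and the toy update rule are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical change record: add `delta` to object number `object`.
struct Change {
    std::size_t object;
    float delta;
};

// Run one frame. Every system reads the shared objects in parallel and
// records its changes in its own buffer, so the parallel phase needs no
// locks. The buffers are applied afterwards on the calling thread.
void run_frame(std::vector<float>& objects, unsigned systems) {
    assert(systems > 0);
    std::vector<std::vector<Change>> buffers(systems);  // one buffer per system
    std::vector<std::thread> pool;
    for (unsigned s = 0; s < systems; ++s) {
        pool.emplace_back([&objects, &buffers, s] {
            // Shared state is only read here; writes go to buffers[s].
            for (std::size_t i = 0; i < objects.size(); ++i) {
                buffers[s].push_back({i, objects[i] * 0.5f + 1.0f});
            }
        });
    }
    for (auto& t : pool) t.join();  // end of the parallel phase

    // Serial distribution phase: apply every recorded change.
    for (const auto& buffer : buffers) {
        for (const Change& c : buffer) objects[c.object] += c.delta;
    }
}
```

The trade-off in this sketch is that changes become visible only at the end of the frame, in exchange for a parallel phase with no locking.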
A throughput programming model
TVEC<F32> a(src1), b(src2);
TVEC<F32> c = a + b;
c.copyOut(dest);
User Writes Core Independent C++ Code
[Diagram: a 16-element vector addition is split into four 4-element additions, one each for Thread 1 to Thread 4, which run on the SIMD units of Core 1 to Core 4]
Ct Parallel Runtime: Auto-Scale to Increasing Cores
Ct JIT Compiler: Auto-vectorization, SSE, AVX, Larrabee
Programmer Thinks Serially; Ct Exploits Parallelism
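To make the picture concrete, here is a hand-written sketch in standard C++ of the work the slide says Ct does automatically for `c = a + b`. It does not use Ct or the `TVEC` type. The function name `parallel_add` is hypothetical, and the slicing scheme is an assumption about one reasonable way to divide the work.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Element-wise c = a + b, with the index range split into one contiguous
// slice per thread. Each slice is a simple loop that a compiler may be able
// to turn into SIMD instructions.
std::vector<float> parallel_add(const std::vector<float>& a,
                                const std::vector<float>& b,
                                unsigned threads) {
    assert(a.size() == b.size() && threads > 0);
    const std::size_t n = a.size();
    std::vector<float> c(n);
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        // Slice boundaries cover [0, n) exactly once, even if n is not a
        // multiple of the thread count.
        const std::size_t begin = n * t / threads;
        const std::size_t end = n * (t + 1) / threads;
        pool.emplace_back([&a, &b, &c, begin, end] {
            for (std::size_t i = begin; i < end; ++i) c[i] = a[i] + b[i];
        });
    }
    for (auto& th : pool) th.join();
    return c;
}
```

With 16 elements and 4 threads, each thread adds 4 elements, which matches the four-way split in the slide's diagram. The difference from the Ct model is that here the programmer chooses the thread count and writes the partitioning by hand.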
CVC Environment
[Diagram: CVC Processing Loop: USERS ACT, WORLD REACTS, DISPLAYS REFRESH. Labels: Visual Computing Clients, Sensors, “Light” Clients, Other CVC Environments, Rendering & Reasoning Services Cloud]
DATA PIPES: Potential Bottlenecks
Service Providers
S/W Platforms (Engines)
H/W Platforms
(Server/Client)
World Operators
Infrastructure
Marketing - Ad, Promotion…
Digital Asset Marketplace
Enterprise Integration
Content Tools
Development Tools
S/W Infrastructure
Game Engines
O/S
Device Manufacturers
Sales
OEMs
GPU Vendors
CPU Vendors
A broad effort is required to fully enable CVC
Walled Gardens
Proprietary
1993-1995
Browser
Server
HTML
HTTP
Proprietary
Proprietary
Open Standards
CVC Future Architecture
Common Building Blocks
Support Services
Communication
Asset & Inventory
Identity
Transactions
Presentation
Behavior
User Input/Control
Sensors/Context
Game
Physics
Scripted
Behavior
User
Feedback
A/V Effects
Services
WORLD SIMULATOR
Rendering
Services
CVC Future Architecture
From Monolithic to Building Blocks
Today: Vertical, proprietary CVC apps
WORLD (SERVER)
Simulation, Synchronization
Transactions
Communication
Identities & Assets
VIEW (CLIENT)
User Input & Control
Audio/Visual Effects
Rendering
Tomorrow: More horizontal, open, building blocks
Behavioral functions
Presentation
Support Services
Parameterized Content Research
3D Face Database
Create a Face Model
Simple Control Parameters: FULLNESS, FLATNESS, SHAPE, CHIN
Parameter values shown: Full, Flat, Narrow, Round, Square, Sharp, Triangle, Round
Expression Modeling
Customized Faces
CHALLENGE
RESEARCH
Platform Optimization
• Workload Characterization
• Understanding platform demands
• Optimizations for future platforms
Distributed Computation
• Scalable system, app architectures
• Dynamic repartitioning of workloads
• Execution on diverse clients
Visual Content
• Parameterized Content
• Easy User-generated 3D Content
• Standards enabling content reuse
Mobile Experience
• Data-enhanced real world interaction
• Mirror-world creation and navigation
www.microsoftpdc.com
© 2008 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market
conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation.
MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
Processing Will Span Client/Server
Tera-scale Server or Compute Cloud
Get Input
World Simulation
• Collision Physics
• NPC
• Script Execution
• Simulation
Send Requested Update
Collects Changes From All Users
Resolves All Object Behaviors and Interactions
Generates A New Per-User Model
Client (moving to TS)
User Takes Action on the “World”
User Inputs
Audio/Visual Effects
• Animation
• Spatial Audio
• Smoke, Crowds, Fluids
Rendering
Display Updates
Observations:
• Significant client/server compute every cycle
• Many aspects best computed on client
• Extensive use of MIPS, FLOPS, threads
• Partitioning depends on client capability, connectivity
[Diagram axis: More Server Compute, Always Connected, More Client Compute]
• Combining location data, a camera, online satellite maps and social networking
• Provides an enhanced view of the real world
At the center of interoperability innovation
[Diagram: WORLD MAP tiled with simulators (S). Simulator labels: Script Engine, Object Model, Game Engine, Collision Detection]
DECENTRALIZED SIMULATORS
CORE INFRASTRUCTURE: Identity, Inventory, Assets, Voice, World Map, Presence
“Connected” Visual Computing
Users Create
Users Collaborate & Play
World of Warcraft Avatar
Eiffel Tower in Google Earth
Virtual Teamroom
Scenario Play
Users Explore and Learn
Users Enhance the Actual World
Machinima Interactive Movies
Qwaq Treefort Virtual Room
Visualizing Real World Information: Dust storm in Morocco
West Nile Virus Visualization
CVC apps will transform the Internet from 2D to 3D
…but require LOTS of compute horsepower