Mapping Computational Concepts to GPUs

General-Purpose Computation on
Graphics Hardware
Introduction
David Luebke
University of Virginia
Course Introduction
• The GPU on commodity video cards has evolved
into an extremely flexible and powerful processor
– Programmability
– Precision
– Power
• This course will address how to harness that power
for general-purpose computation
Motivation:
Computational Power
• GPUs are fast…
– 3 GHz Pentium4 theoretical: 6 GFLOPS, 5.96 GB/sec peak
– GeForceFX 5900 observed: 20 GFLOPS, 25.3 GB/sec peak
• GPUs are getting faster, faster
– CPUs: annual growth ~1.5× → decade growth ~60×
– GPUs: annual growth > 2.0× → decade growth > 1000×
Courtesy Kurt Akeley,
Ian Buck & Tim Purcell, GPU Gems (see course notes)
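The decade figures on this slide follow from compounding the annual rates over ten years. A minimal sketch of that arithmetic (the function name `decade_growth` is illustrative and does not come from the course):

```python
def decade_growth(annual_factor: float, years: int = 10) -> float:
    """Compound a yearly speedup factor over a number of years."""
    return annual_factor ** years


cpu_decade = decade_growth(1.5)  # about 57.7, which the slide rounds to 60x
gpu_decade = decade_growth(2.0)  # 1024, hence "> 1000x" for rates above 2.0x
```

The gap between the two annual rates looks small, but compounding widens it to more than an order of magnitude within a decade.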
Motivation:
Computational Power
[Chart: computational power, GPU vs. CPU. Courtesy Naga Govindaraju]
An Aside:
Computational Power
• Why are GPUs getting faster so fast?
– Arithmetic intensity: the specialized nature of GPUs makes it
easier to use additional transistors for computation, not cache
– Economics: multi-billion dollar video game market is a
pressure cooker that drives innovation
Motivation:
Flexible and precise
• Modern GPUs are deeply programmable
– Programmable pixel, vertex, video engines
– Solidifying high-level language support
• Modern GPUs support high precision
– 32 bit floating point throughout the pipeline
– High enough for many (not all) applications
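One way to see why 32-bit floating point is enough for many but not all applications: a 32-bit float carries a 24-bit significand, so not every integer above 2^24 is representable. The sketch below emulates 32-bit rounding on the CPU with Python's standard `struct` module; the helper name `to_float32` is an assumption made for illustration, and real GPU hardware of this era may round differently.

```python
import struct


def to_float32(x: float) -> float:
    """Round a Python float (64-bit) to the nearest IEEE 754 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]


# 2**24 fits exactly in a 24-bit significand.
exact = to_float32(16777216.0)

# 2**24 + 1 does not fit; it rounds back to 2**24, so the "+ 1" is lost.
rounded = to_float32(16777217.0)
```

Computations such as large index arithmetic or long-running accumulations can hit this limit, which is one reason precision needs to be checked per application.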
Motivation:
The Potential of GPGPU
• The power and flexibility of GPUs make them an
attractive platform for general-purpose computation
• Example applications range from in-game physics
simulation to conventional computational science
• Goal: make the inexpensive power of the GPU
available to developers as a sort of computational
coprocessor
The Problem:
Difficult To Use
• GPUs designed for and driven by video games
– Programming model is unusual & tied to computer graphics
– Programming environment is tightly constrained
• Underlying architectures are:
– Inherently parallel
– Rapidly evolving (even in basic feature set!)
– Largely secret
• Can’t simply “port” code written for the CPU!
Course goals
• A detailed introduction to general-purpose
computing on graphics hardware
• We emphasize:
– Core computational building blocks
– Strategies and tools for programming GPUs
– Tips & tricks, perils & pitfalls of GPU programming
• Several case studies to bring it all together
Why a SIGGRAPH Course?
• Why SIGGRAPH, instead of (say) Supercomputing?
– Many graphics applications stand to benefit from GPGPU
• “Hot topic” case studies: tone mapping, level sets, fluids
• Keeping computation on-card!
– Many graphics applications strive for visual plausibility rather
than rigorous scientific realism
• Better tolerate GPU limitations in precision, memory
• Well suited as GPGPU “early adopters”
– GPGPU programming still requires the expertise of the SIGGRAPH
audience
Course Prerequisites
• We assume
– Familiarity with interactive graphics and computer graphics
hardware
– Ideally, some experience programming vertex & pixel shaders
• Target audience
– Researchers interested in GPGPU
– Graphics and games developers interested in incorporating
these techniques into their applications
– Attendees wishing for a survey of this exciting new field
Course Topics
• GPU building blocks
• Languages and tools
• Effective GPU programming
• GPGPU case studies
Course Topics: Details
• GPU building blocks
– Linear algebra
– Sorting and searching
– Database operations
• Languages and tools
– High-level languages
– Debugging tools
Course Topics: Details
• Effective GPU programming
– Efficient data-parallel programming
– Data formatting & addressing
– GPU computation strategies & tricks
• Case studies in GPGPU Programming
– Physically-based simulation on GPUs
– Ray tracing & photon mapping on GPUs
– Tone mapping on GPUs
– Level sets on GPUs
Speakers
In Order of Appearance
• David Luebke, University of Virginia
• Mark Harris, NVIDIA
• Jens Krüger, TU-Munich
• Tim Purcell, Stanford (NVIDIA)
• Naga Govindaraju, University of North Carolina
• Ian Buck, Stanford
• Cliff Woolley, University of Virginia
• Aaron Lefohn, University of California Davis
Course Schedule:
GPU Building Blocks
• 8:30 Introduction (Luebke)
– Welcome, overview, the graphics pipeline
• 9:00 Mapping computational concepts to the GPU (Harris)
– Streaming, Resources, CPU-GPU analogies, branching
• 9:20 Linear algebra (Krüger)
– Representations, operations, example algorithms
• 9:55 Sorting & searching, part 1 (Purcell)
– Bitonic sort, binary search
• 10:15 Break
Course Schedule:
Languages & Tools
• 10:30 Sorting & searching, part 2 (Purcell)
– Nearest-neighbor search
• 10:45 Database operations (Govindaraju)
– Queries, boolean predicates, aggregation
• 11:15 High-level languages (Buck)
– Cg/HLSL/GLslang, Sh, Brook
• 11:45 Debugging tools (Purcell)
– imdebug, DirectX/OpenGL shader IDEs, ShadeSmith
• 12:15 Lunch break
Course Schedule:
Effective GPU Programming
• 1:45 Efficient data-parallel GPU programming (Woolley)
– Computational frequency, profiling, load balancing
• 2:15 Data formatting & addressing (Lefohn)
– Memory layout, data structures
• 2:45 GPU Computation Strategies & Tricks (Buck)
– Precision, performance, scatter, branching
• 3:15 Q&A (All)
– Questions for the speakers?
• 3:30 Break
Course Schedule:
GPGPU Case Studies
• 3:45 Physically-based simulation on GPUs (Harris)
– Reaction-diffusion, fluids, clouds
• 4:10 Tone mapping on GPUs (Woolley)
– High-dynamic range images, tone mapping
• 4:35 Level sets on GPUs (Lefohn)
– Streaming level sets, visualization, segmentation
• 5:00 Global illumination on GPUs (Purcell)
– Ray tracing, photon mapping
• 5:30 Wrap!
GPU Fundamentals:
The Graphics Pipeline
[Diagram: Application (CPU) → Vertices (3D) → Transform → Xformed, Lit Vertices (2D) → Rasterizer → Fragments (pre-pixels) → Shade → Final pixels (Color, Depth). Transform, Rasterizer, and Shade run on the GPU. Other labels: Graphics State, Video Memory (Textures), Render-to-texture]
• A simplified graphics pipeline
– Note that pipe widths vary
– Many caches, FIFOs, and so on not shown
GPU Fundamentals:
The Modern Graphics Pipeline
[Diagram: Application (CPU) → Vertices (3D) → Vertex Processor → Xformed, Lit Vertices (2D) → Rasterizer → Fragments (pre-pixels) → Fragment (Pixel) Processor → Final pixels (Color, Depth). The Vertex Processor, Rasterizer, and Fragment Processor run on the GPU. Other labels: Graphics State, Video Memory (Textures), Render-to-texture]
• Programmable vertex processor!
• Programmable pixel processor!
GPU Pipeline:
Transform
• Vertex Processor (multiple operate in parallel)
– Transform from “world space” to “image space”
– Compute per-vertex lighting
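As a rough CPU-side illustration of what one vertex processor does for one vertex, the sketch below projects a position into pixel coordinates and computes a simple diffuse lighting term. Everything here is an assumption made for illustration: the OpenGL-style projection matrix, the Lambertian lighting model, the input already being in eye space, and all function names. Real vertex programs run on the GPU in a shading language.

```python
import math


def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def perspective(fov_y_deg, aspect, near, far):
    """Build an OpenGL-style perspective projection matrix."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def transform_vertex(position, projection, width, height):
    """Project an eye-space position (x, y, z) to pixel coordinates."""
    x, y, _z, w = mat_vec(projection, [*position, 1.0])
    ndc_x, ndc_y = x / w, y / w  # perspective divide
    return ((ndc_x * 0.5 + 0.5) * width, (ndc_y * 0.5 + 0.5) * height)


def diffuse(normal, light_dir):
    """Lambertian term max(N . L, 0) for unit-length vectors."""
    return max(0.0, sum(n * l for n, l in zip(normal, light_dir)))


projection = perspective(60.0, 640 / 480, 0.1, 100.0)
center = transform_vertex((0.0, 0.0, -5.0), projection, 640, 480)
facing = diffuse((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
```

A point on the view axis lands at the center of the 640 by 480 image, and a surface facing the light directly receives the full diffuse term. Each vertex is processed independently, which is what lets several vertex processors operate in parallel.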
GPU Pipeline:
Rasterizer
• Rasterizer
– Convert geometric rep. (vertex) to image rep. (fragment)
• Fragment = image fragment
– Pixel + associated data: color, depth, stencil, etc.
– Interpolate per-vertex quantities across pixels
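The rasterizer's two jobs, finding the pixels a triangle covers and interpolating per-vertex quantities across them, can be sketched on the CPU with barycentric weights. This is a simplified illustration, not how hardware rasterizers are implemented: it tests every pixel center by brute force, ignores perspective correction, and all names are hypothetical.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Return (x, y, color) fragments for pixel centers inside the triangle."""
    area = edge(v0, v1, v2)
    if area == 0:
        return []  # degenerate triangle covers nothing
    fragments = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Barycentric weights; dividing by area handles either winding.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                color = tuple(
                    w0 * a + w1 * b + w2 * c for a, b, c in zip(c0, c1, c2)
                )
                fragments.append((x, y, color))
    return fragments


# A right triangle on a 4x4 grid with red, green, and blue corners.
fragments = rasterize(
    (0.0, 0.0), (4.0, 0.0), (0.0, 4.0),
    (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
    4, 4,
)
```

Each fragment carries its pixel position plus the interpolated data, here a color; depth, texture coordinates, and other per-vertex attributes would be interpolated with the same weights.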
GPU Pipeline: Shade
• Fragment Processors (multiple in parallel)
– Compute a color for each pixel
– Optionally read colors from textures (images)
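A fragment program can be thought of as a small function applied independently to every fragment, with textures available as read-only arrays. The sketch below mimics that on the CPU; the fragment layout (a dict with `u`, `v`, and `color`), the nearest-neighbor lookup, and the tinting operation are assumptions chosen for illustration.

```python
def sample_nearest(texture, u, v):
    """Nearest-neighbor lookup; texture is a list of rows, u and v in [0, 1]."""
    height, width = len(texture), len(texture[0])
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return texture[y][x]


def shade(fragment, texture):
    """Hypothetical fragment program: tint the texel by the fragment's color."""
    texel = sample_nearest(texture, fragment["u"], fragment["v"])
    return tuple(t * c for t, c in zip(texel, fragment["color"]))


def run_fragment_program(fragments, texture):
    # Each output depends only on its own fragment, so a GPU is free to
    # evaluate these in parallel; this loop runs them one after another.
    return [shade(f, texture) for f in fragments]


texture = [
    [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]
pixels = run_fragment_program(
    [{"u": 0.75, "v": 0.25, "color": (0.5, 0.5, 0.5)}], texture
)
```

In GPGPU terms the texture plays the role of an input array and the list of output colors plays the role of the result array.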
Coming Up
• Next: Mapping computational concepts to the GPU
• Also coming up:
– Core building blocks for GPGPU computing
– Memory layout, data structures, and algorithms
– Detailed advice on writing high performance GPGPU code
– Lots of examples