C++ AMP in one, two, or three slides

This deck has 1-, 2-, and 3-slide variants for C++ AMP.
If your own deck uses 4:3, get with the 21st century and switch to 16:9
(Design tab, Page Setup button).
C++ AMP in 1 slide
(for notes, see the comments section of the slides
in the 2- and 3-slide variants)
C++ Accelerated Massive Parallelism
What
– Heterogeneous platform support
– Part of C++ & Visual Studio
– STL-like library for parallel patterns on large arrays
– Builds on DirectX
Why
– Performance
– Productivity
– Portability
How
#include <amp.h>
using namespace concurrency;
void AddArrays(int n, int * pA, int * pB, int * pC)
{
    array_view<int,1> a(n, pA);
    array_view<int,1> b(n, pB);
    array_view<int,1> sum(n, pC);
    parallel_for_each( sum.extent,
        [=](index<1> idx) restrict(amp) {
            sum[idx] = a[idx] + b[idx];
        }
    );
}
C++ AMP in 2 or 3 slides
(for 2 slides, just drop the 3rd one)
(see comments section of each slide for notes)
C++ AMP
• Heterogeneous platform support
• Part of Visual C++
• Visual Studio integration
• STL-like library for multidimensional data
• Builds on DirectX
• Is open spec
performance
productivity
portability
Basic Elements of C++ AMP coding
void AddArrays(int n, int * pA, int * pB, int * pC)
{
    array_view<int,1> a(n, pA);
    array_view<int,1> b(n, pB);
    array_view<int,1> sum(n, pC);
    parallel_for_each(
        sum.extent,
        [=](index<1> idx) restrict(amp)
        {
            sum[idx] = a[idx] + b[idx];
        }
    );
}
parallel_for_each: execute the lambda on the accelerator once per thread
restrict(amp): tells the compiler to check that this code can execute on Direct3D hardware (aka accelerator)
array_view: wraps the data to operate on the accelerator
extent: the number and shape of threads to execute the lambda
index: the thread ID that is running the lambda, used to index into data
array_view variables captured and associated data copied to accelerator (on demand)
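To make the "once per thread" wording concrete, here is a minimal serial sketch of what the AddArrays kernel computes. It is plain standard C++, not C++ AMP: serial_for_each, AddArraysReference, and MatchesReference are hypothetical names introduced only for this illustration, and the loop runs sequentially on the CPU rather than on an accelerator. A reference like this may be handy for checking the accelerator's output on small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of C++ AMP): calls body(idx) once for every
// idx in [0, extent), sequentially on the CPU. parallel_for_each makes the
// same one-call-per-index promise, but across accelerator threads.
template <typename Body>
void serial_for_each(std::size_t extent, Body body) {
    for (std::size_t idx = 0; idx < extent; ++idx) {
        body(idx);
    }
}

// CPU reference taking the same parameters as the slide's AddArrays.
void AddArraysReference(int n, const int* pA, const int* pB, int* pC) {
    assert(n >= 0);
    serial_for_each(static_cast<std::size_t>(n), [=](std::size_t idx) {
        pC[idx] = pA[idx] + pB[idx];  // same body as the restrict(amp) lambda
    });
}

// Compares a result (e.g. one produced by the accelerator) with the reference.
bool MatchesReference(const std::vector<int>& a, const std::vector<int>& b,
                      const std::vector<int>& result) {
    if (a.size() != b.size() || a.size() != result.size()) return false;
    std::vector<int> expected(a.size());
    AddArraysReference(static_cast<int>(a.size()), a.data(), b.data(),
                       expected.data());
    return expected == result;
}
```

Calling MatchesReference(a, b, c) after AddArrays has filled c would be one way to sanity-check a run; the serial version says nothing about performance.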
C++ AMP at a Glance
• restrict(amp, cpu)
• parallel_for_each
• class array<T,N>
• class array_view<T,N>
• class index<N>
• class extent<N>
• class accelerator
• class accelerator_view
• class tiled_extent< , , >
• class tiled_index< , , >
• class tile_barrier
• tile_static storage class
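The tiling entries above split an extent into fixed-size tiles, and each thread can then be addressed both globally and within its tile. The sketch below shows that arithmetic for the 1-D case in plain standard C++. It does not use amp.h: TiledIndex1D and decompose are hypothetical names, with fields chosen to echo the roles of the global, tile, local, and tile_origin members of tiled_index, and it assumes the extent is an exact multiple of the tile size.

```cpp
#include <cassert>

// Hypothetical plain struct (not the C++ AMP class) holding the four ways a
// thread in a 1-D tiled extent can be addressed.
struct TiledIndex1D {
    int global;       // position in the whole extent
    int tile;         // which tile the thread belongs to
    int local;        // position inside that tile
    int tile_origin;  // global position of the tile's first thread
};

// Splits a global index into tile-relative coordinates for a given tile size.
TiledIndex1D decompose(int global, int tile_size) {
    assert(global >= 0 && tile_size > 0);
    TiledIndex1D t;
    t.global = global;
    t.tile = global / tile_size;
    t.local = global % tile_size;
    t.tile_origin = t.tile * tile_size;
    return t;
}
```

With a tile size of 4, global index 10 falls in tile 2 at local position 2, and that tile starts at global index 8. Threads that share a tile are the ones that can share tile_static storage and meet at a tile_barrier.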
C++ AMP resources
• Native parallelism blog (team blog)
– http://blogs.msdn.com/b/nativeconcurrency/
• MSDN Forums to ask questions
– http://social.msdn.microsoft.com/Forums/en/parallelcppnative/threads
• Daniel Moth's blog (PM of C++ AMP)
– http://www.danielmoth.com/Blog/