Multi-Layer Perceptron On A GPU
Scott Finley
ECE 539 Fall 2008
UW-Madison
Modern GPUs have hundreds of "stream processors"
Can now be used for non-graphics computing
◦ NVIDIA CUDA (used for this project)
◦ OpenCL
Three implementations compared:
1. CPU-only
◦ Basic Linear Algebra Subprograms (BLAS)
2. NVIDIA's cuBLAS library
◦ No explicit GPU use; the library uses the GPU "under the hood"
◦ Lots of copies of data from CPU to GPU
3. cuBLAS with CUDA
◦ Same cuBLAS use as above, non-BLAS operations done with CUDA
Data from the US Forestry Service
Large feature vectors: 54
Large number of training samples: 500 per epoch
Two hidden layers
◦ Number of neurons per layer varied
[Chart: Time per Epoch (ms) vs. Neurons in Hidden Layers, log-log scale; series: BLAS, cuBLAS, cuBLAS + CUDA]
[Chart: Time per Epoch (ms) vs. Neurons in Hidden Layers, log-log scale; series: BLAS, cuBLAS, cuBLAS + CUDA]
The GPU is a very powerful parallel processor
◦ Up to two orders of magnitude improvement possible
Much more effective for large computations
Many improvements possible
◦ CUDA-only version needed