WCET Measurement-based and Extreme Value Theory Characterisation of CUDA Kernels
Kostiantyn Berezovskyi
CISTER ISEP/IPP
Porto, Portugal
Konstantinos Bletsas
CISTER ISEP/IPP
Porto, Portugal
Luca Santinelli
Onera
Toulouse, France
Eduardo Tovar
CISTER ISEP/IPP
Porto, Portugal
Supported by National Funds through FCT (Portuguese Foundation for Science and Technology) and by ERDF (European
Regional Development Fund) through COMPETE (Operational Programme ’Thematic Factors of Competitiveness’), within
projects ref. FCOMP-01-0124-FEDER-037281 (CISTER) and FCOMP-01-0124-FEDER-020447 (REGAIN); by FCT and the EU
ARTEMIS JU funding, within project ARTEMIS/0001/2013, JU grant nr. 621429 (EMC2); by FCT and by ESF (European
Social Fund) through POPH (Portuguese Human Potential Operational Program), under PhD grant SFRH/BD/82069/2011.
Stream Processing
Streams
Collection of data.
All data is expressed in streams.
Kernels
Series of operations.
Input: streams.
Output: streams.
Why Streams?
Data parallelism
Stream elements can be processed at once.
Task parallelism
Pipeline.
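The stream and kernel vocabulary above can be sketched in a few lines of Python. This is only an illustration of the idea, not GPU code: the kernel names `scale` and `offset` and the helper `pipeline` are hypothetical, and the elements are processed sequentially here, whereas a GPU would process independent elements at once.

```python
from typing import Callable, Iterable, Iterator

# A stream is any iterable of data elements; a kernel maps a stream to a stream.
Stream = Iterable[float]
Kernel = Callable[[Stream], Iterator[float]]


def scale(stream: Stream) -> Iterator[float]:
    """Hypothetical kernel: double every element (no dependency between elements)."""
    for x in stream:
        yield x * 2.0


def offset(stream: Stream) -> Iterator[float]:
    """Hypothetical kernel: add one to every element."""
    for x in stream:
        yield x + 1.0


def pipeline(stream: Stream, *kernels: Kernel) -> Stream:
    """Chain kernels so that the output stream of one is the input of the next."""
    for kernel in kernels:
        stream = kernel(stream)
    return stream


result = list(pipeline([1.0, 2.0, 3.0], scale, offset))
print(result)
```

Because each element is handled independently inside a kernel, the per-element work is a candidate for data parallelism, while the chaining of kernels corresponds to the pipeline form of task parallelism.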
GPU software application
Large data collections.
Data parallelism.
High arithmetic intensity.
Minimal dependency between data elements.
Application areas
GPU as a co-processor
© Kirk, David B. and Hwu, Wen-mei W.
Related work
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. Patel, William D. Gropp, Wen-mei W. Hwu
15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2010.
Sunpyo Hong and Hyesoon Kim
36th International Symposium on Computer Architecture (ISCA-36), 2009.
Shane Ryoo, Christopher I. Rodrigues, Sara S. Baghsorkhi, Sam S. Stone, David B. Kirk, and Wen-mei W. Hwu
13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP), 2008.
G. Elliott and J. Anderson.
18th International Conference on Real-Time and Network Systems (RTNS), 2010.
S. Kato, K. Lakshmanan, A. Kumar, Y. Ishikawa, and R. Rajkumar.
32nd IEEE Real-Time Systems Symposium (RTSS), 2011.
R. Mangharam and A. A. Saba.
32nd IEEE Real-Time Systems Symposium (RTSS), 2011.
GPU Architectures
Fermi: 16 streaming multiprocessors
Kepler: 15 streaming multiprocessors
Previous work
Kostiantyn Berezovskyi, Konstantinos Bletsas, Björn Andersson
Euromicro Conference on Real-time Systems (ECRTS2012)
Exact
Upper bound on the
Previous work
Kostiantyn Berezovskyi, Konstantinos Bletsas, Stefan M. Petters
18th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA2013)
Probabilistic WCET (pWCET)
Estimating pWCET
How to make it safe?
Measurements
Statistical approach
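As a rough illustration of what a measurement-based pWCET estimate means, the sketch below computes an empirical exceedance probability from a set of timings and reads off the smallest observed value whose exceedance probability does not exceed a target. The timing values and the function names are made up for the example; they are not measurements or code from this work.

```python
def exceedance_probability(samples, threshold):
    """Fraction of measured execution times strictly above `threshold`."""
    return sum(1 for s in samples if s > threshold) / len(samples)


def empirical_bound(samples, target_probability):
    """Smallest observed time whose empirical exceedance probability is
    at most `target_probability`."""
    for value in sorted(samples):
        if exceedance_probability(samples, value) <= target_probability:
            return value
    # Unreachable for non-empty input: the maximum has exceedance probability 0.
    raise ValueError("samples must not be empty")


# Illustrative (invented) kernel execution times, in arbitrary units.
times = [10.0, 12.0, 11.0, 15.0, 11.5, 13.0, 10.5, 12.5, 11.0, 14.0]

p_over = exceedance_probability(times, 12.5)
bound = empirical_bound(times, 0.1)
print(p_over, bound)
```

An empirical estimate of this kind can never exceed the largest observation, so it says nothing about exceedance probabilities smaller than roughly one over the number of samples. That limitation is one reason a statistical model of the tail is used on top of the measurements.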
Measuring kernel timings
Hardware architecture
Streaming multiprocessor
Measurement technique
The case-study kernel
GeForce GTX 770 Graphics Card
8 streaming multiprocessors
Thread block: the group of threads that are processed by a single streaming multiprocessor.
Statistical approach
Extreme value theory (EVT)
Block Maxima
Peaks-Over-Threshold
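The two EVT sampling schemes named above differ in which measurements they keep as "extremes". A minimal sketch, assuming the measurements are a plain list of execution times (the values, block size, and threshold below are invented for illustration):

```python
def block_maxima(samples, block_size):
    """Split the trace into consecutive blocks and keep the maximum of each.
    A trailing incomplete block is dropped."""
    n_blocks = len(samples) // block_size
    return [
        max(samples[i * block_size:(i + 1) * block_size])
        for i in range(n_blocks)
    ]


def peaks_over_threshold(samples, threshold):
    """Keep the excess over `threshold` for every sample strictly above it."""
    return [s - threshold for s in samples if s > threshold]


# Illustrative (invented) kernel execution times, in arbitrary units.
times = [10.0, 12.0, 11.0, 15.0, 11.5, 13.0, 10.5, 12.5, 11.0, 14.0]

maxima = block_maxima(times, block_size=5)
excesses = peaks_over_threshold(times, threshold=12.5)
print(maxima, excesses)
```

In standard EVT practice, block maxima are fitted with a Generalised Extreme Value distribution and threshold excesses with a Generalised Pareto distribution. The block size and the threshold are tuning choices: too small a block or too low a threshold lets non-extreme values in, while too large a value leaves few points to fit.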
Independence, stationarity and extremal tools
Single thread block experiments
32 thread blocks experiments
Measurement extremogram
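An extremogram measures dependence between extreme values at a given lag, in the same way an autocorrelation function does for the whole trace. The sketch below uses a simplified sample estimator (the fraction of threshold exceedances that are followed by another exceedance `lag` steps later); the exact estimator and threshold used for the plots in this work are not given here, so treat the function and data as assumptions.

```python
def extremogram(samples, threshold, lag):
    """Simplified sample extremogram: estimate of
    P(X[t + lag] > threshold | X[t] > threshold)."""
    exceed = [s > threshold for s in samples]
    usable = len(exceed) - lag  # positions t for which t + lag is in range
    base = sum(exceed[:usable])
    if base == 0:
        return 0.0  # no exceedances to condition on
    joint = sum(1 for t in range(usable) if exceed[t] and exceed[t + lag])
    return joint / base


# Illustrative (invented) trace with clustered high values.
trace = [1, 5, 5, 1, 1, 5, 1, 5, 5, 1]

lag0 = extremogram(trace, threshold=3, lag=0)
lag1 = extremogram(trace, threshold=3, lag=1)
print(lag0, lag1)
```

Values close to zero for every non-zero lag suggest that extreme execution times do not cluster in time, which supports treating the extremes as independent when applying EVT.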
EVT estimates
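One way to turn block maxima into a pWCET estimate is to fit a Gumbel distribution and read off the quantile for a target exceedance probability. The sketch below uses a method-of-moments fit purely for illustration; which distribution family and which fitting procedure produced the estimates in this work is not stated here, and the input maxima are invented.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(maxima):
    """Method-of-moments Gumbel fit: returns (location, scale)."""
    scale = statistics.stdev(maxima) * math.sqrt(6.0) / math.pi
    location = statistics.mean(maxima) - EULER_GAMMA * scale
    return location, scale


def gumbel_quantile(location, scale, exceedance_probability):
    """Value exceeded with the given probability under the fitted Gumbel."""
    # log1p keeps precision when the exceedance probability is very small.
    return location - scale * math.log(-math.log1p(-exceedance_probability))


# Illustrative (invented) block maxima of kernel execution times.
maxima = [12.0, 15.0, 12.5, 14.0, 13.0]

location, scale = fit_gumbel_moments(maxima)
pwcet_1e3 = gumbel_quantile(location, scale, 1e-3)
pwcet_1e9 = gumbel_quantile(location, scale, 1e-9)
print(location, scale, pwcet_1e3, pwcet_1e9)
```

Lower target exceedance probabilities give larger estimates, and unlike an empirical quantile the fitted model can extrapolate beyond the largest measured value. The extrapolation is only as trustworthy as the fit and the independence and stationarity checks behind it.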
CDF representation of the distributions
Related work
Kostiantyn Berezovskyi, Konstantinos Bletsas, Björn Andersson
Euromicro Conference on Real-time Systems (ECRTS2012)
Kostiantyn Berezovskyi, Konstantinos Bletsas, Stefan M. Petters
18th IEEE International Conference on Emerging Technologies and Factory Automation (ETFA2013)
A. Betts and A. F. Donaldson
25th Euromicro Conference on Real-time Systems (ECRTS), 2013.
Vesa Hirvisalo
14th International Workshop on Worst-Case Execution Time Analysis, 2014
Future Work
Questions?
Thank You!