Parallel execution of multiple MacSim



SST + MacSim
Case Studies Using SST-MacSim
Genie Hsieh
Sandia National Labs
Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed
Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
SST-MacSim DEMO
• MacSim and DRAMSim2 integration
• Parallel execution of multiple MacSim
SST-MacSim: Two Modes
• Standalone
  - ./configure <options>
      ./configure --prefix=/home/myhsieh/local/sst --with-McPAT=/home/myhsieh/local \
        --with-hotspot=/home/myhsieh/local --with-m5=/home/myhsieh/m5-x86/
  - make; make install
• With DRAMSim2
  - Build the DRAMSim2 library: make libdramsim.so
  - ./configure <options>, adding --with-dramsim=DIR
      ./configure --prefix=/home/myhsieh/local/sst --with-McPAT=/home/myhsieh/local \
        --with-hotspot=/home/myhsieh/local --with-m5=/home/myhsieh/m5-x86/ \
        --with-dramsim=/home/mhsieh/DRAMSim2
  - make; make install
MacSim + DRAMSim2 Example
SST-MacSim
<component name="gpu0" type="macsimComponent">
  <params>
    <paramPath>params_hetero_1_6</paramPath>
    <tracePath>trace_file_list</tracePath>
    <outputPath>results</outputPath>
    <clock>1.4GHz</clock>
  </params>
  <link name="membus" port="bus" latency="1ns" />
</component>

DRAMSim2 (DDR2, DDR3)
<component name="mem0" type="DRAMSimC">
  <params>
    <clock> 1.5 GHz </clock>
    <megsOfMemory> 1024 </megsOfMemory>
    <systemini> system_GDDR5.ini </systemini>
    <deviceini> ini/GDDR5_hynix_1Gb_16B.ini </deviceini>
  </params>
  <link name="membus" port="bus" latency="1ns" />
</component>
DEMO
[Diagram: MacSim connected to DRAMSim2 through an SST Link]
DRAMSim2 Simulation Output
[myhsieh@s910654 bin]$ ./sst.x --sdl-file=test_dram.xml
SST: construct macsimComponent and setSSTComponent with ID 0
SST: construct DRAMSimC with ID 1
…
src/macsim.cc:588: (I=0 C=439930): elapsed time:7.4 seconds
Done
DRAM: Background Energy 17202960
DRAM: Burst Energy 9973920
DRAM: ACT/PRE Energy 21178560
DRAM: Refresh Energy 1472320
The output reports bus packets, transactions, the transaction queue, memory statistics, and power. Transaction queue entries look like:
1]T [Read] [0x45bbfa4]
2]T [Write] [0x55fbfa0] [5439E]
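The DRAM energy lines in the output above follow a regular "DRAM: <name> Energy <value>" pattern, so they are easy to collect for later comparison. The following is a minimal sketch, not part of SST or DRAMSim2: the function name parse_dram_energy and the regular expression are assumptions made for illustration, and the slide does not state the unit of the energy values, so none is attached here.

```python
import re

# Sample lines copied from the simulation output shown above.
SAMPLE_OUTPUT = """\
src/macsim.cc:588: (I=0 C=439930): elapsed time:7.4 seconds
Done
DRAM: Background Energy 17202960
DRAM: Burst Energy 9973920
DRAM: ACT/PRE Energy 21178560
DRAM: Refresh Energy 1472320
"""

# Matches lines of the form "DRAM: <name> Energy <integer>".
ENERGY_LINE = re.compile(r"^DRAM:\s+(?P<name>.+?)\s+Energy\s+(?P<value>\d+)\s*$")


def parse_dram_energy(text):
    """Return {energy component name: value} for every DRAM energy line.

    Hypothetical helper for post-processing; the unit is whatever
    DRAMSim2 printed, which the slide does not specify.
    """
    energies = {}
    for line in text.splitlines():
        match = ENERGY_LINE.match(line)
        if match:
            energies[match.group("name")] = int(match.group("value"))
    return energies


energies = parse_dram_energy(SAMPLE_OUTPUT)
total = sum(energies.values())
for name, value in energies.items():
    # Show each component and its share of the summed energy.
    print(f"{name:12s} {value:>10d}  ({value / total:.1%} of total)")
```

Lines that do not match the pattern, such as the elapsed-time line, are skipped, so the full console log can be passed in unchanged.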
MacSim Memory Experiments
MacSim + DDR3
<component name="mem0" type="DRAMSimC">
  <params>
    <systemini> system_DDR3.ini </systemini>
    <deviceini> ini/GDDR3.ini </deviceini>
  </params>
</component>
Output
**Core 1 Core_Total Finished: insts:205888 cycles:439929 seconds:7 -- 0.47 IPC (0.47 IPC)
(I=0 C=439930): finalize simulation
DRAM: Background Energy 17202960
DRAM: Burst Energy 9973920
DRAM: ACT/PRE Energy 21178560
MacSim + GDDR5
<component name="mem0" type="DRAMSimC">
  <params>
    <systemini> system_GDDR5.ini </systemini>
    <deviceini> ini/GDDR5.ini </deviceini>
  </params>
</component>
Output
**Core 1 Core_Total Finished: insts:205888 cycles:428507 seconds:8 -- 0.48 IPC (0.48 IPC)
(I=0 C=428508): finalize simulation
DRAM: Background Energy 1431920
DRAM: Burst Energy 40460
DRAM: ACT/PRE Energy 113280
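The IPC figures in the two outputs above can be checked directly from the reported instruction and cycle counts. The sketch below only redoes that arithmetic; the dictionary layout and the helper name ipc are illustrative choices, not MacSim code.

```python
# Instruction and cycle counts copied from the two runs above.
runs = {
    "DDR3": {"insts": 205888, "cycles": 439929},
    "GDDR5": {"insts": 205888, "cycles": 428507},
}


def ipc(insts, cycles):
    """Instructions per cycle."""
    return insts / cycles


for name, counts in runs.items():
    print(f"{name}: IPC = {ipc(**counts):.2f}")
# DDR3: IPC = 0.47
# GDDR5: IPC = 0.48

# Same instruction count in both runs, so the cycle ratio is the speedup.
cycle_ratio = runs["DDR3"]["cycles"] / runs["GDDR5"]["cycles"]
print(f"DDR3 / GDDR5 cycles = {cycle_ratio:.3f}")
# DDR3 / GDDR5 cycles = 1.027
```

Both runs execute the same 205888 instructions, so the GDDR5 configuration finishes in roughly 2.7% fewer cycles in this experiment.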
Parallel Execution of MacSim in SST
MacSim
SST-MacSim
Parallel execution of MacSim through SST Bus
[Diagram: three MacSim instances and DRAMSim2, each attached to the Bus by an SST Link]
Parallel Execution of Multiple MacSim
SST-MacSim (CPU)
<component name="cpu0" type="macsimComponent">
  <params>
    <paramPath>params_x86</paramPath>
    <tracePath>trace_file_list_cpu</tracePath>
    <clock>4GHz</clock>
  </params>
  <link name="cpu" port="bus" latency="1ns" />
</component>
SST-MacSim (GPU)
<component name="gpu0" type="macsimComponent">
  <params>
    <paramPath>params_gtx8800_v2</paramPath>
    <tracePath>trace_file_list_gpu</tracePath>
    <clock>1.4GHz</clock>
  </params>
  <link name="gpu" port="bus" latency="1ns" />
</component>
SST-Bus
<component name="bus0" type="bus">
  <params>
    <clock>1GHz</clock>
    <deviceList> cpu gpu mem</deviceList>
  </params>
  <link name="cpu" port="cpu" latency="1ns" />
  <link name="gpu" port="gpu" latency="1ns" />
  <link name="mem" port="mem" latency="1ns" />
</component>
SST-DRAMSim2 (Memory)
<component name="mem0" type="DRAMSimC">
  <params>
    <systemini> system_GDDR5.ini </systemini>
    <deviceini> ini/GDDR5_.ini </deviceini>
  </params>
  <link name="mem" port="bus" latency="1ns" />
</component>
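In the configuration above, each link name (cpu, gpu, mem) appears in exactly two components: once on a device and once on the bus. Assuming, as that pattern suggests, that a link connects the two components declaring the same name, a quick consistency check can catch a misspelled or dangling link before a run. The dictionary below is transcribed by hand from the four components, and the helper names are hypothetical.

```python
from collections import defaultdict

# port -> link name for each component, transcribed from the SDL above.
components = {
    "cpu0": {"bus": "cpu"},
    "gpu0": {"bus": "gpu"},
    "bus0": {"cpu": "cpu", "gpu": "gpu", "mem": "mem"},
    "mem0": {"bus": "mem"},
}


def endpoints_by_link(components):
    """Group (component, port) endpoints under the link name they declare."""
    links = defaultdict(list)
    for comp, ports in components.items():
        for port, link in ports.items():
            links[link].append((comp, port))
    return dict(links)


def unpaired_links(components):
    """Return links that do not have exactly two endpoints.

    Assumption: a well-formed link joins exactly two components.
    """
    return {
        link: ends
        for link, ends in endpoints_by_link(components).items()
        if len(ends) != 2
    }


for link, ends in endpoints_by_link(components).items():
    print(link, "connects", " <-> ".join(f"{c}:{p}" for c, p in ends))

problems = unpaired_links(components)
print("unpaired links:", problems if problems else "none")
```

Renaming the link on mem0 to, say, memory would leave both mem and memory with a single endpoint, and both would be reported.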
DEMO
Parallel Execution of Multiple MacSim
<component name="cpu0" type="macsimComponent" rank="0">
</component>
<component name="cpu1" type="macsimComponent" rank="1">
</component>
<component name="gpu0" type="macsimComponent" rank="2">
</component>
<component name="gpu1" type="macsimComponent" rank="3">
</component>
<component name="bus" type="bus" rank="4">
</component>
<component name="dram" type="DRAMSimC" rank="5">
</component>
mpirun -np 6 ./sst.x --sdl-file=macsim.xml
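The example above places six components on ranks 0 through 5 and launches six MPI processes, one per rank. A small sketch of keeping the two in sync is shown below; the helper mpirun_command is hypothetical, and the requirement that ranks be contiguous from 0 is an assumption made for this check rather than a documented SST rule.

```python
import shlex

# Component-to-rank assignment taken from the configuration above.
ranks = {"cpu0": 0, "cpu1": 1, "gpu0": 2, "gpu1": 3, "bus": 4, "dram": 5}


def mpirun_command(ranks, sdl_file, sst_binary="./sst.x"):
    """Build an mpirun argument list with one MPI process per distinct rank."""
    used = sorted(set(ranks.values()))
    # Assumption: ranks should cover 0..N-1 with no gaps.
    if used != list(range(len(used))):
        raise ValueError(f"ranks must be 0..N-1 without gaps, got {used}")
    return ["mpirun", "-np", str(len(used)), sst_binary, f"--sdl-file={sdl_file}"]


cmd = mpirun_command(ranks, "macsim.xml")
print(shlex.join(cmd))  # mpirun -np 6 ./sst.x --sdl-file=macsim.xml
```

Deriving the process count from the rank table avoids launching fewer processes than the highest rank used in the SDL file.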
Memory Experiments

                          1 CPU + 1 GPU, DDR3    2 CPUs + 2 GPUs, DDR3
DRAM: Background Energy   338800                 2569720
DRAM: Burst Energy        2380                   26180
DRAM: ACT/PRE Energy      7080                   77880
Build time                0.00 s                 0.00 s
Simulation time           124.04 s               174.06 s
Total time                124.04 s               174.06 s
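To see how the results scale when going from one CPU and one GPU to two of each, the reported numbers can be divided pairwise. This is only arithmetic on the values above; the key names are illustrative, and the pairing of the first set of numbers with the 1 CPU + 1 GPU run follows the order in which they appear on the slide.

```python
# Values copied from the two experiments above (energy units as printed by DRAMSim2).
one_pair = {"background": 338800, "burst": 2380, "act_pre": 7080, "sim_time_s": 124.04}
two_pairs = {"background": 2569720, "burst": 26180, "act_pre": 77880, "sim_time_s": 174.06}

# Ratio of the 2 CPUs + 2 GPUs run to the 1 CPU + 1 GPU run, per metric.
ratios = {key: two_pairs[key] / one_pair[key] for key in one_pair}

for key, value in ratios.items():
    print(f"{key:12s} x{value:.2f}")
# background   x7.58
# burst        x11.00
# act_pre      x11.00
# sim_time_s   x1.40
```

Simulation time grows by about 1.4x while the number of MacSim instances doubles.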
Parallel execution of MacSim through SST Iris Network
[Diagram: standalone MacSim (cores, caches, Iris NIC, Iris Router, DRAM) decoupled into SST components: SST/MacSim Terminal (core and cache), SST/Iris NIC, SST/Iris Router, and SST/MacSim DRAM]
[Diagram: 2x2 mesh of four routers (R); three MacSim terminals and one DRAM each attach to a router through a NIC, with components joined by SST Links]
Configure SST/MacSim
SST/MacSim Terminal
<component name="cpu" type="macsimComponent">
  <params>
    <paramPath>x86</paramPath>
    <terminalType>0</terminalType>
    <id>0</id>
    <mc_id>1</mc_id>
    <term_mclass>0</term_mclass>
  </params>
  <link name="cpu2nic" port="nic" />
</component>

SST/MacSim DRAM
<component name="mc" type="macsimComponent">
  <params>
    <paramPath>x86</paramPath>
    <terminalType>2</terminalType>
    <id>3</id>
    <mc_id>3</mc_id>
    <term_mclass>1</term_mclass>
  </params>
  <link name="mc2nic" port="nic" />
</component>

terminalType: 0 (core), 1 (cache), 2 (MC/DRAM)
term_mclass: 0 (request from core), 1 (response from DRAM)
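The numeric codes for terminalType and term_mclass are easier to read when given names. The sketch below mirrors the legend above as Python enums and decodes the two example components; the class and member names are invented for illustration and do not come from the MacSim sources.

```python
from enum import IntEnum


class TerminalType(IntEnum):
    """terminalType codes from the legend above (names are illustrative)."""
    CORE = 0
    CACHE = 1
    MC_DRAM = 2


class MessageClass(IntEnum):
    """term_mclass codes from the legend above (names are illustrative)."""
    REQUEST_FROM_CORE = 0
    RESPONSE_FROM_DRAM = 1


# Parameter values copied from the two example components.
terminals = {
    "cpu": {"terminalType": 0, "id": 0, "mc_id": 1, "term_mclass": 0},
    "mc": {"terminalType": 2, "id": 3, "mc_id": 3, "term_mclass": 1},
}


def describe(name, params):
    """Render one component's parameters with decoded code names."""
    kind = TerminalType(params["terminalType"]).name
    mclass = MessageClass(params["term_mclass"]).name
    return f"{name}: id={params['id']} mc_id={params['mc_id']} type={kind} mclass={mclass}"


for name, params in terminals.items():
    print(describe(name, params))
```

Constructing the enum from an undefined code, for example TerminalType(5), raises ValueError, which gives an early check on hand-edited parameter values.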
Parallel Execution of Multiple MacSim
SST-MacSim Terminal
<component name="cpu0" type="macsimComponent">
  <params>
    <paramPath>params_x86</paramPath>
    <terminalType>0</terminalType>
    <id>0</id>
    <mc_id>1</mc_id>
  </params>
  <link name="cpu2nic" port="nic" latency="1ns" />
</component>

SST-MacSim DRAM
<component name="mc" type="macsimComponent">
  <params>
    <paramPath>params_gtx8800_v2</paramPath>
    <terminalType>2</terminalType>
    <id>1</id>
    <mc_id>1</mc_id>
  </params>
  <link name="mc2nic" port="nic" latency="1ns" />
</component>
SST-Iris NIC
<component name="0.nic" type="iris.ninterface">
  <params>
    <id>0</id>
  </params>
  <link name="cpu2nic" port="cpu" latency="1ns" />
  <link name="0.nic2rtr" port="router" latency="1ns" />
</component>

SST-Iris NIC
<component name="1.nic" type="iris.ninterface">
  <params>
    <id>1</id>
  </params>
  <link name="mc2nic" port="cpu" latency="1ns" />
  <link name="1.nic2rtr" port="router" latency="1ns" />
</component>
SST-Iris Router
<component name="0.rtr" type="iris.router">
  <params>
    <id>0</id>
  </params>
  <link name="0.nic2rtr" port="bus" latency="1ns" />
  <link name="xr2r.0.0.1" port="xPos" />
  <link name="xr2r.0.0.0" port="xNeg" />
</component>

SST-Iris Router
<component name="1.rtr" type="iris.router">
  <params>
    <id>1</id>
  </params>
  <link name="1.nic2rtr" port="bus" latency="1ns" />
  <link name="xr2r.0.0.0" port="xPos" />
  <link name="xr2r.0.0.1" port="xNeg" />
</component>
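The six components above form a chain from the MacSim terminal to the memory controller through two NICs and two routers. Assuming that components sharing a link name are directly connected, the route can be recovered with a breadth-first search over the link names. The table below is transcribed by hand from the configuration, and the function names are hypothetical.

```python
from collections import defaultdict, deque

# Link names declared by each component, transcribed from the SDL above.
links = {
    "cpu0": ["cpu2nic"],
    "mc": ["mc2nic"],
    "0.nic": ["cpu2nic", "0.nic2rtr"],
    "1.nic": ["mc2nic", "1.nic2rtr"],
    "0.rtr": ["0.nic2rtr", "xr2r.0.0.1", "xr2r.0.0.0"],
    "1.rtr": ["1.nic2rtr", "xr2r.0.0.0", "xr2r.0.0.1"],
}


def build_graph(links):
    """Connect every pair of components that declare the same link name."""
    by_link = defaultdict(list)
    for comp, names in links.items():
        for name in names:
            by_link[name].append(comp)
    graph = defaultdict(set)
    for comps in by_link.values():
        for a in comps:
            for b in comps:
                if a != b:
                    graph[a].add(b)
    return graph


def shortest_path(graph, src, dst):
    """Breadth-first search; returns the component list or None if unreachable."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in sorted(graph[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path(build_graph(links), "cpu0", "mc")
print(" -> ".join(path))
# cpu0 -> 0.nic -> 0.rtr -> 1.rtr -> 1.nic -> mc
```

A request from cpu0 therefore crosses five links before reaching mc in this two-router configuration.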
DEMO