HAsim: Hardware Asim


HAsim
Joel Emer†‡, Michael Adler†, Artur Klauser†, Angshuman Parashar†, Michael Pellauer‡, Murali Vijayaraghavan‡
†VSSAD, Intel
‡CSAIL, MIT
Overview
• Goal
– Produce compelling evidence for architecture ideas
• Requirements
– Cycle accurate simulation
– Representative simulation length
– Software development (often)
• Current approach
– Mostly software simulation (1 kHz to 10 kHz)
• New approach
– Build a performance model in an FPGA
2007.05.14
Hasim
FPGA-based approaches
• Prototyping
– Build a logically isomorphic representation of the design
• Modeling
– Build a performance simulation in gates
• Hybrids
– Build something that is partially a prototype and partially a model
Recreate Asim in hardware
• Modularity
• Inter-module communication
• Functional/Timing Partitioning
• Modeling Utilities
Why modularity?
• Speed of model development
• Shared components between products
• Reuse across generations
• Encourages isomorphism to design
• Improved fidelity
• Facilitates speed/fidelity trade-offs
• Architectural experimentation
• Factorial development and evaluations
• Sharing
ASIM Module Hierarchy
[Figure: module hierarchy tree; node labels S, F, B, D, C, M, N, R, X, C, W]
ASIM Module Selection
[Figure: module hierarchy tree with nodes S, F, D, C, M, N, R, X, C, W and several boxes labeled B]
Module Selection
[Figure: module hierarchy tree with nodes S, F, D, C, M, N, R, X, C, W and several boxes labeled B]
Module Replacement
[Figure: module hierarchy tree with nodes S, F, D, C, M, N, R, X, C, W and several boxes labeled B]
(H)ASIM Module Hierarchy
Communication
[Figure: communication between modules; node labels C, F, N, D, R, X, C, N, W]
Named connections
[Figure: modules S and D linked by a named connection with endpoints A-out and A-in]
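The idea of endpoints that find each other by name can be sketched in software. This is a Python illustration, not HAsim code: the class `Connections` and its methods `mk_send`, `mk_recv`, and `unmatched` are hypothetical names chosen for the example, and the sketch assumes exactly one sender and one receiver per name.

```python
from collections import deque


class Connections:
    """Hypothetical registry that pairs send and receive endpoints by name."""

    def __init__(self):
        self.queues = {}        # connection name -> FIFO of messages
        self.senders = set()    # names that have a send endpoint
        self.receivers = set()  # names that have a receive endpoint

    def _queue(self, name):
        return self.queues.setdefault(name, deque())

    def mk_send(self, name):
        """Create the sending endpoint; returns a function that enqueues."""
        if name in self.senders:
            raise ValueError(f"duplicate sender for {name!r}")
        self.senders.add(name)
        return self._queue(name).append

    def mk_recv(self, name):
        """Create the receiving endpoint; returns a function that dequeues."""
        if name in self.receivers:
            raise ValueError(f"duplicate receiver for {name!r}")
        self.receivers.add(name)
        q = self._queue(name)
        return lambda: q.popleft() if q else None

    def unmatched(self):
        """Names that have only one of the two endpoints."""
        return sorted(self.senders ^ self.receivers)
```

Because the endpoints are matched by name, a module can be swapped for another one that declares the same names without rewiring its neighbors. Checking `unmatched()` after all modules are built is one way to catch a dangling connection early.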
Model and FPGA Cycles
[Figure: two timing diagrams of Module A and Module B connected by ports; the steps of each model cycle (A: 1.1, 1.2, 1.3, 2.1, 2.2; B: 1.1, 2.1, 2.2, 2.3) are plotted against FPGA cycles 1 to 8]
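The decoupling of model cycles from FPGA cycles can be illustrated with a small software sketch. It is written in Python rather than Bluespec, and every name in it (`Port`, `Module`, `simulate`, `fpga_cycles_per_model_cycle`) is hypothetical. It assumes that each module needs a fixed number of FPGA cycles per model cycle, that every port has a model latency of one cycle, and that a module may start a model cycle only when its input port holds a message.

```python
from collections import deque


class Port:
    """FIFO link with a model latency of one cycle (hypothetical sketch)."""

    def __init__(self):
        # Pre-load one empty message so the receiver can run model cycle 1.
        self.fifo = deque([None])

    def can_recv(self):
        return len(self.fifo) > 0

    def recv(self):
        return self.fifo.popleft()

    def send(self, msg):
        self.fifo.append(msg)


class Module:
    def __init__(self, name, fpga_cycles_per_model_cycle, inp, out):
        self.name = name
        self.cost = fpga_cycles_per_model_cycle
        self.inp, self.out = inp, out
        self.model_cycle = 0  # number of completed model cycles
        self.busy = 0         # FPGA cycles left in the current model cycle

    def fpga_tick(self):
        if self.busy == 0:
            if not self.inp.can_recv():
                return  # stall: input for the next model cycle is not ready
            self.inp.recv()
            self.busy = self.cost
        self.busy -= 1
        if self.busy == 0:
            self.model_cycle += 1
            self.out.send((self.name, self.model_cycle))


def simulate(fpga_cycles, cost_a=3, cost_b=1):
    """Run A and B for a number of FPGA cycles; return their model cycles."""
    a_to_b, b_to_a = Port(), Port()
    a = Module("A", cost_a, inp=b_to_a, out=a_to_b)
    b = Module("B", cost_b, inp=a_to_b, out=b_to_a)
    for _ in range(fpga_cycles):
        # Ticking A before B is a simplification of parallel hardware.
        a.fpga_tick()
        b.fpga_tick()
    return a.model_cycle, b.model_cycle
```

With the default costs, `simulate(9)` returns `(3, 4)`: the faster module B stalls until A delivers its message, so the two modules never drift more than one model cycle apart even though they consume different numbers of FPGA cycles.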
Functional/Timing Decomposition
• ISA semantics
• Platform semantics
• Micro-architecture
[Figure: the Timing Partition issues requests such as Fetch(PC) to the Functional Partition, which returns results such as an Instruction]
• Simplifies timing model
• Amortize functional model design effort over many models
• Can be pipelined for performance
• Can be FPGA-friendly design
• Can be split across hardware and software
Execute@execute phases
Fetch instruction
Speculatively execute instruction
Read memory*
Speculatively write memory* (locally visible)
Commit or Abort instruction
Write memory* (globally visible)
* Optional depending on instruction type
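The phase list above can be written as a small Python sketch that returns the phases a single instruction passes through. The function name `phases` and its parameters are hypothetical. The sketch also assumes that an aborted instruction never performs the globally visible write, which the slide implies but does not state.

```python
def phases(reads_memory=False, writes_memory=False, commits=True):
    """Return the ordered phases for one instruction (illustrative only)."""
    seq = ["fetch", "speculative execute"]
    if reads_memory:
        seq.append("read memory")
    if writes_memory:
        # Visible only to this instruction stream until commit.
        seq.append("speculative write (locally visible)")
    seq.append("commit" if commits else "abort")
    if writes_memory and commits:
        # Assumption: only committed stores become globally visible.
        seq.append("write memory (globally visible)")
    return seq
```

For example, `phases(writes_memory=True, commits=False)` ends with `"abort"` and contains no globally visible write.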
Execution in phases
[Figure: several overlapping instructions, each progressing through phases labeled F, D, X, R, C, W; one instruction ends in A]
Assertion: All data dependencies can be represented in these phases
HAsim: Partitioning Overview
[Figure: the Timing Partition and the Functional Partition; the functional partition contains the stages Token Gen, Fet, Dec, Exe, Mem, LCom, GCom, together with Memory State and Register State (RegFile)]
Common Infrastructure
• Modules
• Inter-module communication
• Statistics gathering
• Event logging
• Debug Tracing
• Simulation control
•…
Bluespec (Asim-style) module
module [HAsim_module] mkCache#() (Empty);
   // Requests arrive on "a2cache"; responses leave on "cache2a".
   Port#(Addr) req_port  <- mkRecvPort("a2cache");
   Port#(Bool) resp_port <- mkSendPort("cache2a");
   TagArray    tagarray  <- mkTagArray();

   // Each firing handles at most one request.
   rule cycle (True);
      Maybe#(Addr) mx = req_port.get();
      if (isValid(mx))
         resp_port.put(tagarray.lookup(validValue(mx)));
   endrule
endmodule
Bluespec (Asim-style) submodule
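module mkTagArray(TagArray);
   RegFile#(Bit#(12),Bit#(4)) tagArray <- mkRegFileFull(...);

   // Helper functions come before the method that uses them.
   // Address is assumed to be a 16-bit type, matching lookup's argument.
   function Bit#(4) getTag(Address x);
      return x[15:12];
   endfunction

   function Bit#(12) getIndex(Address x);
      return x[11:0];
   endfunction

   // Hit when the tag stored at the indexed entry equals the address tag.
   method Bool lookup(Bit#(16) a);
      return (tagArray.sub(getIndex(a)) == getTag(a));
   endmethod
endmodule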
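The address split used by the tag array (bits 15:12 as tag, bits 11:0 as index) can be checked with a short Python sketch. It is a software analogue, not a translation of the Bluespec. The `fill` method and the use of `None` for an empty entry are assumptions added for illustration, since the slide shows only `lookup`.

```python
INDEX_BITS = 12  # address bits [11:0]
TAG_BITS = 4     # address bits [15:12]


def get_index(addr):
    """Low 12 bits select the entry."""
    return addr & ((1 << INDEX_BITS) - 1)


def get_tag(addr):
    """Next 4 bits are compared against the stored tag."""
    return (addr >> INDEX_BITS) & ((1 << TAG_BITS) - 1)


class TagArray:
    def __init__(self):
        # None marks an entry that has never been filled (an assumption;
        # the register file on the slide has no valid bit).
        self.tags = [None] * (1 << INDEX_BITS)

    def lookup(self, addr):
        """True when the indexed entry holds this address's tag."""
        return self.tags[get_index(addr)] == get_tag(addr)

    def fill(self, addr):
        """Hypothetical helper: record this address's tag in its entry."""
        self.tags[get_index(addr)] = get_tag(addr)
```

For the 16-bit address `0xA123`, the index is `0x123` and the tag is `0xA`, so after `fill(0xA123)` a lookup of `0xB123` misses because it maps to the same entry with a different tag.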
Support functions - stats
[Figure: three Modules, each containing a Stat Counter, alongside a Stat Dumper]
module mkCache#(...) (Empty);
   ...
   cache_hits <- mkStat(...);
   ...
   hit = tagarray.lookup(...);
   if (hit)
      cache_hits.increment();
   ...
endmodule
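The arrangement of per-module counters and a central dumper can be sketched in Python. The class names `Stat` and `StatDumper` and the registration step are hypothetical. The slide does not show how counters reach the dumper, so the sketch simply assumes each counter registers itself when it is created.

```python
class StatDumper:
    """Hypothetical collector that reads out every registered counter."""

    def __init__(self):
        self.counters = []

    def register(self, counter):
        self.counters.append(counter)

    def dump(self):
        """Return a name -> value snapshot of all counters."""
        return {c.name: c.value for c in self.counters}


class Stat:
    """Hypothetical counter owned by one module."""

    def __init__(self, name, dumper):
        self.name = name
        self.value = 0
        dumper.register(self)  # assumption: registration at creation

    def increment(self):
        self.value += 1
```

A module only calls `increment()`; it never needs to know how or when the values are collected, which keeps the statistics code out of the module's timing logic.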
2Dreams
Support functions - events
[Figure: three Modules, each containing an Event Reg, alongside an Event Dumper]
module mkCache#(...) (Empty);
   ...
   cache_event <- mkEvent(...);
   ...
   hit = tagarray.lookup(...);
   cache_event.report(hit);
   ...
endmodule
Support functions - global controller
[Figure: three Modules, each containing a Controller, alongside a Global Controller]
module mkCache#(...) (Empty);
   ...
   ctrl <- mkCntrlr(...);
   ...
   // The rule fires only while the controller allows the model to run.
   rule cycle (ctrl.run());
      ...
   endrule
endmodule
FPGA-based prototype
Prototyping Catch-22…
Module Instantiation
[Figure: module hierarchy with nodes U, M, N and three instances each of F, D, C, R, X, C, W]
Factorial Coding/Experiments
[Figure: several model configurations built from modules labeled S, C, M, N, with variants labeled RC, SC, SM, RM]
HAsim: Current status - models
• Simple RISC functional model operating
– Simple RISC ISA
– Pipelined multi-phase instruction execution
– Supports speculative OOO design
• Physical Reg File and ROB
• Small physically addressed memory
• Fast speculative rewinds
• Instruction-per-cycle (APE) model
– Runs simple benchmarks on FPGA
• Five stage pipeline
– Supports branch mis-speculation
– Runs simple benchmarks (in software simulation)
• X86 functional model architecture under development
Connections Implement Ports
[Figure: a module tree with connections named foo, bar, and baz (labeled PM), mapped onto hardware modules with wrappers (also labeled PM)]
Implemented via connections.
Timing Model Resources (Fast)
OOO, branch prediction, three functional units, 32KB 2-way set associative ICache and DCache, iTLB, dTLB
• 2142 slices (15% of a 2VP30)
• 21 block RAMs (15% of a 2VP30)
Configurable cache model
• 32KB 4-way set associative cache with 16B cache-lines
– 165 slices (1% of a 2VP30)
– 17 block RAMs (12% of a 2VP30)
• 2MB 4-way set-associative cache with 64B cache-lines
– 140 slices (1% of a 2VP30)
– 40 block RAMs (29% of a 2VP30)
Current FPGAs (4VFX140)
• 142,128 slices
• 552 block RAMs
• 2 PowerPCs
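The geometry of the two cache configurations quoted above can be worked out with a few lines of Python. The function `cache_geometry` is a hypothetical helper, and the 32-bit address width is an assumption, since the slides do not state the width used by the model. The sketch makes no attempt to predict slice or block RAM usage.

```python
def cache_geometry(size_bytes, ways, line_bytes, addr_bits=32):
    """Derive line count, set count, and field widths for a cache."""
    lines = size_bytes // line_bytes
    sets = lines // ways
    offset_bits = line_bytes.bit_length() - 1  # log2 for powers of two
    index_bits = sets.bit_length() - 1
    tag_bits = addr_bits - index_bits - offset_bits
    return {
        "lines": lines,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


small = cache_geometry(32 * 1024, ways=4, line_bytes=16)
large = cache_geometry(2 * 1024 * 1024, ways=4, line_bytes=64)
```

Under these assumptions the 32KB configuration has 2048 lines in 512 sets with 19-bit tags, and the 2MB configuration has 32768 lines in 8192 sets with 13-bit tags.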