PPTX slides - DAC Virtual Resources

Using emulation for
RTL performance verification
June 4, 2014
DaeSeo Cha
Infrastructure Design Center
System LSI Division
Samsung Electronics Co., Ltd.
Current Performance Verification
Design flow: System Requirement → System Architecture Specification → RTL Integration → FPGA → Post-Silicon
• Architectural Performance Exploration
SystemC model, real-workload-aware performance analysis
• Architectural Performance Verification
SystemC model → Inaccuracy
• RTL Performance Verification
Subsystems/full chip using logic simulation → Slow
• RTL Performance Verification
Sub-system only → Capacity
• RTL Performance Verification
Full chip → Too late in development stage
New Approach for Performance Verification
Design flow: System Requirement → System Architecture Specification → RTL Integration → FPGA → Post-Silicon
• UVM Testbench
• Big capacity: Full chip
• Accurate: Cycle accuracy
• Fast: 100X+
• Early stage: RTL freeze
• Fast analysis: logs → Correlation/Compare → GUI Analysis Environment (PRISM)
* PRISM: Samsung in-house tool
Summary: Fast and Accurate Performance Verification
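
The Correlation/Compare step can be pictured as lining up the same metric from two logs and flagging the ones that disagree. The following is only a minimal sketch: the one-pair-per-line log format, the metric names, the sample values, the function names `parse_log` and `compare`, and the 5% tolerance are all assumptions made for illustration and do not describe PRISM.

```python
def parse_log(text):
    """Parse 'metric value' lines into a dict (hypothetical log format)."""
    metrics = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, value = line.split()
        metrics[name] = float(value)
    return metrics


def compare(ref, other, tolerance=0.05):
    """Return {metric: relative_error} for metrics that differ by more than tolerance."""
    mismatches = {}
    for name, ref_value in ref.items():
        if name not in other:
            continue  # only metrics present in both logs are compared
        if ref_value == 0:
            rel = 0.0 if other[name] == 0 else float("inf")
        else:
            rel = abs(other[name] - ref_value) / abs(ref_value)
        if rel > tolerance:
            mismatches[name] = rel
    return mismatches


# Made-up sample logs from two stages of the flow (illustrative values only).
emulation_log = """
# metric value
avg_read_latency_cycles 120
read_bandwidth_bytes_per_cycle 3.2
"""
fpga_log = """
avg_read_latency_cycles 150
read_bandwidth_bytes_per_cycle 3.25
"""

mismatches = compare(parse_log(emulation_log), parse_log(fpga_log))
print(mismatches)
```

Only metrics present in both logs are compared, so a metric that one stage cannot observe is skipped rather than reported as a mismatch.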
Performance Verification Platform
• Environment
- Reuse the existing UVM simulation environment without any modification
- Add PV (Performance Verification) components
• PV components
- Monitor: collects various performance metrics
- Traffic Generator: generates random traffic or replays an RTL IP's traffic
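
The two Traffic Generator modes (random, or replay of an RTL IP's recorded traffic) can be modeled in a few lines. This is a hedged sketch, not the deck's generator: the `AxiTxn` record, the `time R|W addr len` log format, the address window, and the 64-byte alignment are hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AxiTxn:
    """One simplified AXI burst request (hypothetical record layout)."""
    time: int       # issue time in cycles
    is_write: bool
    addr: int
    length: int     # number of beats in the burst (1..16)


def random_traffic(count, seed, addr_base=0x8000_0000, addr_range=0x1000_0000, max_gap=8):
    """Generate `count` transactions with random gaps, addresses and burst lengths."""
    rng = random.Random(seed)  # seeded, so a run can be reproduced
    time = 0
    txns = []
    for _ in range(count):
        time += rng.randint(1, max_gap)
        addr = addr_base + (rng.randrange(addr_range) & ~0x3F)  # 64-byte aligned
        txns.append(AxiTxn(time, rng.random() < 0.5, addr, rng.randint(1, 16)))
    return txns


def replay_traffic(log_lines):
    """Rebuild transactions from 'time R|W addr len' lines recorded from an RTL IP."""
    txns = []
    for line in log_lines:
        time, kind, addr, length = line.split()
        txns.append(AxiTxn(int(time), kind == "W", int(addr, 16), int(length)))
    return txns


generated = random_traffic(5, seed=1)
replayed = replay_traffic(["10 R 0x80000040 4", "12 W 0x80001000 8"])
print(len(generated), replayed[1])
```

Seeding the random mode keeps a failing scenario reproducible, while the replay mode preserves the timing and address pattern of the original IP.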
UVM Co-emulation Environment
• UVM Architecture for Co-emulation
(Diagram labels: sw_top, prim_top, tb_top, Incr_top, hw_top, Simulator, Emulator, UVM testbench, Test scenario, Virtual sequencer, Sequence, Register Model, Register predictor, REG2BUS adapter, AXI UVC, Bus, Interface, Interrupt, Module, DUT)
• Simulation environment
- Incremental elaboration with primary and incremental snapshots
- Test scenarios are built by combining the testbench and the design at full-chip level
• Emulation environment
- The DUT runs in the emulator; the incremental elaboration scheme is also used in the emulator
Performance Monitor – 1/2
• Performance Metrics
- Latency: Min/Max/Average, time-varying, accumulated, distributed
- Bandwidth: Min/Max/Average, time-varying, accumulated, distributed
- Utilization: Min/Max/Average, time-varying, accumulated, distributed
- Address pattern
- Response time
- Customized metrics such as an IP's internal signals (e.g. FIFO level)
• Implementation
- Synthesizable code for both simulation and emulation
- Collects performance metrics on the AXI interface
• Issue
- Run-time overhead in emulation
- Synchronization overhead between emulator and simulator
(Diagram labels: PM, Log file, PRISM. PM: performance monitor)
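
To make the listed metrics concrete, the sketch below derives latency, bandwidth, and utilization figures from a list of completed transactions, the way a post-processing step over the monitor's log might. The tuple layout `(start_cycle, end_cycle, nbytes)`, the function names, the bus width, the window size, and the sample transactions are assumptions for illustration; the deck does not describe the monitor's log format.

```python
from collections import Counter


def summarize(values):
    """Min/Max/Average of a list of numbers."""
    return {"min": min(values), "max": max(values), "avg": sum(values) / len(values)}


def analyze(txns, bus_bytes_per_cycle, window):
    """txns: (start_cycle, end_cycle, nbytes) tuples observed on one AXI port."""
    latencies = [end - start for start, end, _ in txns]
    total_bytes = sum(n for _, _, n in txns)
    span = max(end for _, end, _ in txns) - min(start for start, _, _ in txns)
    bandwidth = total_bytes / span  # accumulated bytes per cycle over the run
    # Time-varying view: bytes credited to the window in which a transaction completes.
    bytes_per_window = Counter()
    for _, end, n in txns:
        bytes_per_window[end // window] += n
    return {
        "latency": summarize(latencies),
        "latency_distribution": dict(Counter(latencies)),
        "bandwidth": bandwidth,
        "utilization": bandwidth / bus_bytes_per_cycle,
        "bytes_per_window": dict(bytes_per_window),
    }


# Made-up sample: four transactions on a 16-byte-per-cycle bus, 20-cycle windows.
sample = [(0, 10, 64), (5, 25, 64), (20, 30, 128), (40, 50, 64)]
report = analyze(sample, bus_bytes_per_cycle=16, window=20)
print(report)
```

Here utilization is taken as achieved bandwidth divided by the peak the bus could carry, which is one common definition; a monitor could equally count busy cycles on the data channel.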
Performance Monitor – 2/2
• Experiments
- PV results should be recorded in order
- Many experiments were done to reduce run-time overhead

Method          Description                                         Overhead
No PV Monitor   Baseline                                            398 (-)
$display        Sync with TB using $fdisplay()                      32,798 (81X)
GFIFO           Buffering monitored transactions;                   472 (1.12X)
                collecting process runs in background
(Diagram label: tbcall sync)

• GFIFO
- Transactions are collected in order, consistent with the SW simulation
- Monitor transactions are processed in parallel in SW → Improved performance
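Simulation Monitor:
bit a; bit [5:0] b; int c;
int fd; // log file descriptor; assumed to be opened elsewhere with $fopen
always @(clk) begin
  $fdisplay(fd, " %d %d %d", a, b, c);
end

GFIFO:
bit a; bit [5:0] b; int c;
int fd; // log file descriptor; assumed to be opened elsewhere with $fopen
function void my_mon(bit x1, bit [5:0] x2, int x3);
  $fdisplay(fd, "%d %d %d", x1, x2, x3);
endfunction
initial $ixc_ctrl("gfifo", "my_mon"); // mark my_mon for GFIFO buffering
always @(clk) begin my_mon(a, b, c); end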
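
The idea behind the GFIFO row (buffer the monitored transaction, let a background process collect it, keep the order) can be shown with a software analogy. This sketch uses a Python queue and thread purely to illustrate the buffering pattern; it is an assumption-laden model, not how the emulator implements GFIFO, and the `BufferedMonitor` class is a hypothetical name.

```python
import queue
import threading


class BufferedMonitor:
    """Queue monitor calls and format them on a background thread, preserving order."""

    _STOP = object()  # sentinel that tells the background thread to finish

    def __init__(self):
        self._fifo = queue.Queue()  # FIFO keeps records in call order
        self.lines = []             # stands in for the log file
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def my_mon(self, x1, x2, x3):
        # Producer side: enqueue and return at once; no formatting or I/O here.
        self._fifo.put((x1, x2, x3))

    def _drain(self):
        # Consumer side: runs concurrently with the producer.
        while True:
            item = self._fifo.get()
            if item is self._STOP:
                break
            self.lines.append("%d %d %d" % item)

    def close(self):
        # Flush everything queued so far, then stop the background thread.
        self._fifo.put(self._STOP)
        self._worker.join()


mon = BufferedMonitor()
for cycle in range(5):
    mon.my_mon(cycle & 1, cycle * 3, cycle * 100)
mon.close()
print(mon.lines)
```

The producer never waits for the log to be written, which is the property the table attributes to GFIFO, while the single consumer draining a FIFO keeps records in the order they were issued.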
Performance Analysis Environment
• PRISM (Performance Visualization System)
- Charts PV results in a GUI
- Performance issues are easy to find by viewing all PV results in a single GUI
Experimental Result
• Application
- Multimedia test scenarios such as video playback and camera recording
• Run-time speed
- More than 100x faster than simulation
• Bugs found
- Critical bugs and design weak points that would not have been detected during simulation-based verification
Conclusion
PV using emulator is a mainstream solution
• Very fast bring-up using UVM Co-emulation
- Reuses the UVM full-chip testbench without any modification
• PV in the early design development stage with cycle accuracy
- More than 100x faster than the simulation approach
• Efficient PV analysis with PRISM
Future Work
• Add more features to PRISM: correlation, smart PV report, etc.
• Develop an ACE PV Monitor to deal with cache coherency
• Deploy UVM Co-emulation for other test scenarios
Thank you