Project 2 description

ECE 111
Run-Length Encoding (RLE) Project
Run-Length Encoding
• A simple lossless compression algorithm
– See http://en.wikipedia.org/wiki/Run-length_encoding
• Example
– Input:
WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWW
– Output:
12W1B12W3B24W1B14W
(Note: each count is stored as an 8-bit binary number, not as ASCII digits; e.g., “12” is “00001100”.)
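The example above can be reproduced with a small software reference model. The Python sketch below is not the Verilog design itself; the function name rle_encode is hypothetical, and splitting runs longer than 255 bytes is an assumption made here so that every count fits in 8 bits (the slides do not say how such runs should be handled).

```python
def rle_encode(message: bytes) -> bytes:
    """Encode bytes as (count, value) pairs, one byte each."""
    out = bytearray()
    i = 0
    while i < len(message):
        value = message[i]
        run = 1
        # Extend the run while the next byte matches. The cap at 255 is an
        # assumption: it keeps the count within a single 8-bit byte.
        while (i + run < len(message)
               and message[i + run] == value
               and run < 255):
            run += 1
        out.append(run)    # 8-bit count
        out.append(value)  # the repeated byte
        i += run
    return bytes(out)


# The 67-byte example message from the slide.
message = b"W" * 12 + b"B" + b"W" * 12 + b"BBB" + b"W" * 24 + b"B" + b"W" * 14
encoded = rle_encode(message)
```

Here len(message) is 67 and len(encoded) is 14, matching the sizes quoted for message_size and rle_size.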
Module Interface
• message_size[31:0] is given in number of bytes (e.g., 67 bytes in the previous example)
• The size in bytes of the compressed output should be specified at output rle_size[31:0] (e.g., 14 bytes in the previous example)
[Figure: block diagram of the RLE Processor connected to the DPSRAM (stores message)]
– RLE Processor signals: clk, nreset, start, message_addr[31:0], message_size[31:0], rle_addr[31:0], rle_size[31:0], done
– DPSRAM interface: port_A_clk, port_A_addr[15:0], port_A_we, port_A_data_in[31:0], port_A_data_out[31:0]
DPSRAM Interface Behavior
• To read from the DPSRAM:
– Assert port_A_addr (e.g., 0x0000) and port_A_we = 0
– At the next clock cycle, read the data from port_A_data_out
• To write to the DPSRAM:
– Assert port_A_addr (e.g., 0x0004), port_A_data_in, and port_A_we = 1
– Wait one clock cycle for the write to complete
• Note: you can access addresses at word boundaries only
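The read and write rules above can be mimicked with a toy cycle-level model, which may help when reasoning about how many cycles each access costs. This Python sketch is only an illustration: the class name DpsramPortA, the treatment of port_A_addr as a byte address, and leaving data_out unchanged during a write are all assumptions, not details taken from the actual DPSRAM.

```python
class DpsramPortA:
    """Toy model of one DPSRAM port (hypothetical, for illustration only)."""

    def __init__(self, num_words: int = 16384):
        self.mem = [0] * num_words  # 32-bit words
        self.data_out = 0           # value visible on port_A_data_out

    def clock(self, addr: int, we: int, data_in: int = 0) -> None:
        """Apply the inputs for one rising clock edge."""
        if addr % 4 != 0:
            # Only word-boundary addresses are allowed (byte addressing assumed).
            raise ValueError("address must be a multiple of 4")
        word = addr // 4
        if we:
            self.mem[word] = data_in & 0xFFFFFFFF  # write completes this cycle
        else:
            self.data_out = self.mem[word]         # readable from the next cycle


ram = DpsramPortA()
ram.clock(addr=0x0004, we=1, data_in=0x41434241)  # write, one cycle
ram.clock(addr=0x0004, we=0)                      # read request
value = ram.data_out                              # data seen one cycle later
```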
Timing Diagram for DPSRAM
Big-Endian vs. Little-Endian
• The memory uses a little-endian representation.
• For the message “ABCABCABCABC”, little-endian would be (each word shown most significant byte first):
M[0] = “ACBA”;
M[1] = “BACB”;
M[2] = “CBAC”;
• Big-endian would be:
M[0] = “ABCA”;
M[1] = “BCAB”;
M[2] = “CABC”;
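The two layouts above can be checked with a few lines of Python. This is a sketch for illustration: pack_words and word_as_text are hypothetical helper names, and the zero padding of a partial last word is an assumption.

```python
def pack_words(message: bytes, byteorder: str = "little") -> list:
    """Pack a byte string into 32-bit words, 4 bytes per word."""
    # Pad with zeros up to a word boundary (assumed padding value).
    padded = message + b"\x00" * (-len(message) % 4)
    return [int.from_bytes(padded[i:i + 4], byteorder)
            for i in range(0, len(padded), 4)]


def word_as_text(word: int) -> str:
    """Show a 32-bit word as 4 ASCII characters, most significant byte first."""
    return word.to_bytes(4, "big").decode("ascii")


little = [word_as_text(w) for w in pack_words(b"ABCABCABCABC", "little")]
big = [word_as_text(w) for w in pack_words(b"ABCABCABCABC", "big")]
```

With these definitions, little is ["ACBA", "BACB", "CBAC"] and big is ["ABCA", "BCAB", "CABC"]. In the little-endian case the first message byte sits in the least significant byte of each word, so M[0] has the value 0x41434241.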
Some Difficulties
• Number of input or output bytes may not be a multiple of 4 (word boundary)
• The compression algorithm produces variable-length output
• It is possible for the “compressed” output to be longer than the original input.
– e.g., Input (12 bytes): “ABCABCABCABC”
– Output (24 bytes): “1A1B1C1A1B1C1A1B1C1A1B1C”
Some Difficulties (cont’d)
• If the size of the input message is not a multiple of 4, ignore the “extra bytes” in the last memory word (those beyond message_size)
• If the size of the compressed output message is not a multiple of 4, the “extra bytes” in the last word can be anything (e.g., all 0’s works)
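The word-boundary handling above comes down to two small calculations: how many words a given byte count occupies, and which bytes of the last word belong to the message. The Python sketch below illustrates both; words_for and unpack_message are hypothetical names, and little-endian word layout is assumed.

```python
def words_for(size_in_bytes: int) -> int:
    """Number of 32-bit words needed to hold size_in_bytes bytes."""
    return (size_in_bytes + 3) // 4


def unpack_message(words: list, message_size: int) -> bytes:
    """Recover the message bytes, ignoring extra bytes in the last word."""
    raw = b"".join(w.to_bytes(4, "little") for w in words)
    return raw[:message_size]


# A 6-byte message "ABCABC" stored in 2 words; the upper 2 bytes of the
# second word (0xDEAD here) are junk and must be ignored.
recovered = unpack_message([0x41434241, 0xDEAD4342], message_size=6)
```

Here recovered is b"ABCABC". By the same arithmetic, a 14-byte compressed output occupies words_for(14) = 4 words, leaving 2 bytes in the last word that may hold any value.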
Test Bench
• The testbench file is “rle_testbench.v”
• It reads the test cases from a file called “plaintext.dat”, which contains two test cases: the first has 48 bytes and the second has 51 bytes (be careful: only 3 bytes of the last memory word belong to the message)
• The testbench reports the number of cycles for both test cases for your “delay” computation
Two Design Objectives
• Minimum Delay
– Delay = clock period * number of cycles
– Clock period = 1/Fmax
– Use the Fmax result from the Slow 900mV 100C Model to report your clock period. (Do not use the Restricted Fmax.)
• Minimum Area*Delay product
– Area = #ALUTs + #Registers
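The two objectives above are simple arithmetic on the synthesis and simulation results. The Python sketch below works through them with made-up numbers (100 ALUTs, 50 registers, Fmax of 250 MHz, 200 cycles); the function name design_metrics is hypothetical, and expressing Area*Delay with the delay in seconds is an assumption about units.

```python
def design_metrics(aluts: int, registers: int, fmax_mhz: float, cycles: int):
    """Return (area, delay in ns, area*delay with delay in seconds)."""
    area = aluts + registers            # Area = #ALUTs + #Registers
    clock_period_ns = 1000.0 / fmax_mhz  # Clock period = 1/Fmax
    delay_ns = clock_period_ns * cycles  # Delay = clock period * cycles
    area_delay = area * delay_ns * 1e-9  # delay converted to seconds (assumed)
    return area, delay_ns, area_delay


# Made-up example values, not results from any real design.
area, delay_ns, area_delay = design_metrics(100, 50, 250.0, 200)
```

For these example inputs the clock period is 4 ns, so the delay is 800 ns, the area is 150, and Area*Delay is about 1.2 x 10^-4.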
Last Year’s Best Results
• Minimum Delay
– #ALUTs = 1019, #Registers = 165, Area = 1184
– Clock period = 4.19 ns, Clock cycles = 55
– Delay = 230 ns
– Area*Delay = 2.72 x 10^-4
• Minimum Area*Delay product
– #ALUTs = 132, #Registers = 98, Area = 230
– Clock period = 2.84 ns, Clock cycles = 210
– Delay = 595 ns
– Area*Delay = 1.37 x 10^-4
Passing Requirements
• Last Year’s Median Result:
– #ALUTs = 264, #Registers = 126, Area = 390
– Clock period = 4.30 ns, Clock cycles = 160
– Delay = 688 ns
– Area*Delay = 2.68 x 10^-4
• Passing Requirement This Quarter (within 20% of last year’s median result):
– Delay ≤ 825.60 ns
– Area*Delay ≤ 3.22 x 10^-4
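A quick way to check a design against both limits is sketched below in Python. The limits are the ones stated above (688 ns x 1.2 = 825.60 ns, and 2.68 x 10^-4 x 1.2 ≈ 3.22 x 10^-4); the function name meets_requirements is hypothetical, and Area*Delay is computed with the delay in seconds, which is an assumption consistent with the magnitudes quoted.

```python
MAX_DELAY_NS = 825.60      # passing limit on Delay, in ns
MAX_AREA_DELAY = 3.22e-4   # passing limit on Area*Delay


def meets_requirements(aluts: int, registers: int,
                       clock_period_ns: float, cycles: int) -> bool:
    """Return True if both passing requirements are satisfied."""
    area = aluts + registers
    delay_ns = clock_period_ns * cycles
    area_delay = area * delay_ns * 1e-9  # delay in seconds (assumed units)
    return delay_ns <= MAX_DELAY_NS and area_delay <= MAX_AREA_DELAY


# Last year's median result passes; the same design with 200 cycles
# (860 ns) would miss the delay limit.
median_ok = meets_requirements(264, 126, 4.30, 160)
slower_ok = meets_requirements(264, 126, 4.30, 200)
```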