An Encryption-Enabled Network Protocol Accelerator
Steffen Peter, Mario Zessack, Frank Vater, Goran Panic, Horst Frankenfeldt, and Michael Methfessel IHP Im Technologiepark 25 15236 Frankfurt (Oder) Germany
IHP Im Technologiepark 25 15236 Frankfurt (Oder) Germany www.ihp-microelectronics.com © 2008 - All rights reserved
Outline
• Motivation
• TCP
• General Hardware Design
• Cryptographic Accelerators
• Implementation
• Conclusions
Motivation
[Figure: wireless sensor network; labels: tiny sensor nodes, cluster head, Internet]
• standard TCP
• high data rates
• security
• low energy
Motivation
• Increasing amount of data
  - even in mobile and ubiquitous scenarios
  - need for good transport performance
• Low cost
  - Small silicon area
  - Energy efficient
• Need for security
  - Secrecy, Integrity, Reliability
• Support of standard protocols
  - TCP (transport)
  - AES (data encryption)
  - ECC/ECDSA (signature, key agreement)
Is dedicated hardware the solution?
TCP
• Standard transport protocol of the Internet
• Connection-based protocol
  - Three-way handshake
  - Complicated connection tear-down
• Basic data integrity mechanism
  - Checksum
• Error correction mechanisms
  - Fast retransmit, slow (re-)start, many others
• Flow control
  - Buffer and congestion control
• No actual security mechanisms
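The checksum named on this slide is the 16-bit ones' complement sum used by TCP and IP. A minimal Python sketch of that arithmetic is given here as an illustration. The function name `internet_checksum` is an assumption, not something from the slides, and a real TCP checksum also covers a pseudo-header that this sketch leaves out.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum (TCP pseudo-header omitted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with one zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


payload = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
checksum = internet_checksum(payload)
# A receiver recomputes over payload plus checksum and expects zero.
verified = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
```

The loop touches every payload byte once, which is why this step is a natural candidate for a unit that works while the data is being moved anyway.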
TCP – Profiling Results
[Figure: profiling results; panels: Transmit, Receive]
TCP Profiling – Implications
• Copying data consumes most time and energy
  - Reduce copy operations as much as possible
• Protocol handling needs merely 1/5th of the total computation
  - Is it worth hard-wiring the TCP state machines in hardware?
  - Trade-off: performance vs. flexibility
• Checksum is the most expensive computation
  - The obvious dedicated hardware unit
  - How to integrate in the data flow?
• Memory allocation needs more than 5 percent of time
  - Can a dedicated unit help here?
TCP Profiling – Our Answers
• One copy architecture
  - Data is copied directly from the peripheral handler to the right memory location (assigned by CPU)
  - During this one copy operation other operational blocks (checksum, encryption) listen on the bus and do their work
• MIPS CPU performs complicated (but low effort) TCP logic
  - Connection build-up/tear-down, error handling, congestion control
  - Software handling allows protocol variations and debugging
• Dedicated checksum block
  - Checksum block computes checksum during the one copy operation
• No dedicated memory manager unit
  - Hard-wired memory manager reduces time by 77 percent (from 5% to 1%)
  - BUT high hardware costs (300 flip-flops) and lack of flexibility
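The one-copy idea can be sketched in software as a single transfer loop with passive listeners. This is only a behavioural model under assumptions: the names `ChecksumSnooper` and `one_copy`, the list used as packet memory, and the 32-bit word width are all hypothetical and do not describe the actual bus interface of the chip.

```python
class ChecksumSnooper:
    """Hypothetical bus listener that accumulates a 16-bit ones' complement sum."""

    def __init__(self):
        self.total = 0

    def observe(self, word: int) -> None:
        # One 32-bit bus word carries two 16-bit checksum words.
        self.total += (word >> 16) + (word & 0xFFFF)

    def checksum(self) -> int:
        total = self.total
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)  # fold carries
        return ~total & 0xFFFF


def one_copy(words, memory, slot, listeners):
    """Write each word to its final location once; listeners see it in passing."""
    for i, word in enumerate(words):
        memory[slot + i] = word          # the only copy of the payload
        for listener in listeners:
            listener.observe(word)       # no second pass over the data


memory = [0] * 8                         # stand-in for the packet SRAM
snooper = ChecksumSnooper()
one_copy([0x0001F203, 0xF4F5F6F7], memory, slot=2, listeners=[snooper])
```

In this model an encryption block would be one more entry in `listeners`; the CPU only chooses `slot` and never reads the payload itself.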
General Design
• MIPS CPU handles complex protocols.
  - CPU never touches payload
• Internal 32 kByte SRAM stores packets.
• AMBA bus connects system components.
  - Standard bus system allows modular approach.
  - Peripheral bus (APB) connects GPIO, UART, SPI ports.
• System concept: interacting independent units.
  - Units exchange commands and status using register file.
  - General-purpose formalism for command/status syntax.
[Figure: block diagram; labels: Host interface handler (to Host), CPU, System Bus, SRAM, RF, Data I/O, Checksum (to MAC/PHY)]
General Design - Flow
[Figure: packet flow on the block diagram; step labels: Incoming packet, Basic header, Select memory slot for packet, Packet received and checksum OK?, Full header processing, Window update, Sleeps; block labels: Host, System Bus, SRAM, CPU, RF, Data I/O, Checksum, MAC/PHY]
Results (Performance and power consumption)
• Simulation results: split power among different entities.
  - Maximal data rate in pure software on MIPS is 20.7 Mbit/sec.
  - Hardware accelerators reduce load on CPU and save 50% of power.
  - Maximal data rate with hardware accelerators is 40 Mbit/sec.

Case  Rate (Mb/sec)  CPU active  CPU  AMBA bus  Reg file  Card bus  EPP/UA  Total power
SW    20.7           100%        60   14        7         4         4       89 mW
HW    20.7           15%         9    14        7         4         12      46 mW
HW    40.0           31%         18   14        7         4         12      55 mW
(Per-unit columns CPU through EPP/UA are in mW and add up to the total.)

• Measured power consumption is 2.5 times simulated power.
  - Measured power includes pads.
  - Consumed power varies for different production runs.
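One way to read the simulated figures is as energy per transmitted bit. The short calculation below uses only the rate and total-power values from the table on this slide; the function name `energy_per_bit_nj` is illustrative, and the result is a derived number, not one quoted by the authors.

```python
# (case, data rate in Mbit/s, total simulated power in mW) from the slide's table
cases = [("SW", 20.7, 89), ("HW", 20.7, 46), ("HW", 40.0, 55)]


def energy_per_bit_nj(rate_mbit_s: float, power_mw: float) -> float:
    # mW / (Mbit/s) = 1e-3 W / 1e6 bit/s = 1e-9 J/bit, i.e. nJ per bit
    return power_mw / rate_mbit_s


results = [(name, rate, energy_per_bit_nj(rate, power)) for name, rate, power in cases]
for name, rate, nj in results:
    print(f"{name} @ {rate} Mbit/s: {nj:.2f} nJ/bit")
```

Under these numbers the software case costs about 4.3 nJ/bit, while the accelerated cases come to about 2.2 nJ/bit at 20.7 Mbit/s and about 1.4 nJ/bit at 40 Mbit/s.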
Cryptographic Accelerators
• AES (Advanced Encryption Standard)
  - Symmetric block cipher
  - Suitable for low-power high-throughput data streams
  - Standardized in November 2001 (NIST, National Institute of Standards and Technology; USA)
  - Input data length: 128 bit; key length: 128, 192, 256 bit
  - Assumed to be secure for the next 70 years
• ECC (Elliptic Curve Cryptography)
  - Asymmetric cryptography
  - Suitable for signatures and key establishment
  - Key length 160-571 bit (NIST standard)
Advanced Encryption Standard (AES)
[Figure: AES data flow. Key: "SECRETKEY"; data: "IHP Microelectro"; 10 rounds of Calc key, XOR, S-Box, Shift row, Mix column; output data: 84 30 AE CB 97 38 AD 58 7A 0E 67 CF FE 43 17 80]
• Huge design space
  - Sharing S-Boxes reduces performance but leads to smaller designs
  - Pipelining and parallelism boost performance, but cost area and energy
AES - Results
• Throughput: ~52 Mbit/s @ 33 MHz (includes input and output of the data blocks)
• Size: 0.336 mm² in 0.25 µm CMOS (8,450 equivalent gates)
• 70 clock cycles per 128 bit data block for en-/decryption
• 72 times faster than software implementation on MIPS (33 MHz) and it requires 0.4% energy of the software solution

Company             Size (gates)              Clock cycles  Clock frequency  Throughput
Elliptic Semi: AES  8,000 + 1,400 bit memory  379           100 MHz          33.7 Mbit/s
IHP AES             8,450                     70            66 MHz           108 Mbit/s
IAIK Graz           8,930                     64            64 MHz           128 Mbit/s
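The relation between clock cycles per block, clock frequency and throughput can be checked with a one-line formula. The sketch below computes the raw core throughput only, assuming one 128-bit block per `cycles_per_block` cycles with no I/O overhead; the function name is illustrative. Figures on the slide that include data input and output are therefore expected to be lower than this bound.

```python
def raw_throughput_mbit_s(block_bits: int, cycles_per_block: int, clock_mhz: float) -> float:
    # bits per block * blocks per second, expressed in Mbit/s
    return block_bits * clock_mhz / cycles_per_block


ihp_33mhz = raw_throughput_mbit_s(128, 70, 33)        # about 60.3 Mbit/s
elliptic_semi = raw_throughput_mbit_s(128, 379, 100)  # about 33.8 Mbit/s
iaik_graz = raw_throughput_mbit_s(128, 64, 64)        # 128.0 Mbit/s
```

The 33 MHz bound of about 60 Mbit/s sits above the ~52 Mbit/s quoted with block I/O included, which is consistent with part of the time going to moving data in and out.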
Elliptic Curve Cryptography (ECC)
• Asymmetric cryptography
  - Basis for many key exchange and signature algorithms (ECDSA)
• Trapdoor: Elliptic Curve Point Multiplication
  - one can compute: Q = kP
  - it is infeasible to determine k for given Q and P
• Higher security with shorter key lengths
  - about 1/10th of RSA's key size
• Still, operations on Elliptic Curves are expensive
  - one 233 bit EC point multiplication needs: 1200 additions, 1500 multiplications, 800 squarings, 1 division (233 bit each in the finite field)
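The point multiplication Q = kP is built from repeated point doublings and additions. The sketch below shows the double-and-add method on a toy curve, y² = x³ + 2x + 2 over the prime field F_17 with base point (5, 1). This is an assumption made for readability: the design in the slides works on 233-bit binary-field curves, whose field arithmetic differs, and all names here are illustrative.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustration only, not a NIST curve).
P_MOD, A = 17, 2
G = (5, 1)  # base point; it generates a group of order 19


def point_add(p, q):
    """Add two curve points; None stands for the point at infinity."""
    if p is None:
        return q
    if q is None:
        return p
    (x1, y1), (x2, y2) = p, q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # p + (-p) = infinity
    if p == q:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD)  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)         # chord slope
    lam %= P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, p):
    """Compute k*p by scanning the bits of k from the most significant end."""
    result = None
    for bit in bin(k)[2:]:
        result = point_add(result, result)  # double for every bit
        if bit == "1":
            result = point_add(result, p)   # add when the bit is set
    return result
```

For example, `scalar_mult(2, G)` gives `(6, 3)`. Each doubling and addition expands into several field multiplications, squarings and one inversion, which is where the operation counts on the slide come from.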
ECC - Design
[Figure: utilization and area of the ECC design units; percentages shown: 50%, 15%, 5%, 95%, 70%, 20%]
ECC – Implementation Results
• Time for one ECPM (233 bit):
  - MIPS: 410 ms
  - HW: 0.4 ms
• Energy for one ECPM (233 bit):
  - MIPS: 16 mWs
  - HW: 0.03 mWs

Company               Size (kGates)  Time/Op (ms)  Technology
Elliptic Semi: B-233  71             6.68          0.13 µm
IBM B-163             117            0.19          0.13 µm
IHP B-233             72-42          0.08-0.35     0.25 µm
Energy (mWs): 0.02-0.04 for IHP B-233; 0.14 for one of the other two designs (its row is not recoverable from the slide).
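The gain of the hardware unit over the MIPS software follows directly from the time and energy figures quoted on this slide for one 233-bit ECPM. The variable names below are illustrative; the ratios are derived here and are not quoted by the authors.

```python
# Figures quoted on the slide for one 233-bit ECPM
mips_time_ms, hw_time_ms = 410, 0.4
mips_energy_mws, hw_energy_mws = 16, 0.03

speedup = mips_time_ms / hw_time_ms              # roughly 1000x faster
energy_ratio = mips_energy_mws / hw_energy_mws   # roughly 500x less energy
```

With these inputs the speed-up is close to three orders of magnitude, and the energy saving lies between two and three orders of magnitude.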
Implemented Chip (Design)
[Figure: chip block diagram. Blocks: CardBus master (to Linux/Windows host), MIPS processor core with I-Cache (16 kB), D-SPRAM (8 kB) and EJTAG (debug), AMBA AHB bus, memory controller (AHB slave) with Flash and SRAM, CPU control bus, EPP, UART, registers and control, internal SRAM (32 kB), checksum, ECC, data I/O, GPIO, serial 1+2]
Implemented Chip (Chip Photo)
Size: 7.3 x 7.4 mm (54 mm²)
Core: 44 mm²
Pads: 219 in QFP256 package
Transistors: 4.8 M in 0.25 μm
Packet SRAM: 32 kByte
Instruction cache: 16 kByte
Data scratchpad: 8 kByte
Implementation (Test Board)
Allows:
• Testing the implementation in practice
• Tests of interoperability
• Performance tests
• Energy measurements
Conclusions
• Profiling of TCP/IP code identified bottlenecks
  - TCP checksum and copying use 90% of power.
  - High data rate needs hardware accelerators.
• Chip is a hardware solution for TCP/IP handling
  - Takes care of middle protocol layers efficiently.
• AMBA-based bus as prototype for modular systems
  - Assemble different systems quickly.
  - Pre-tested components lead to reliability.
• Cryptographic components allow security at low cost
  - Designs for AES and ECC improve performance and energy consumption for security operations by three orders of magnitude.
• TCP chip creates basis for further developments
  - Extension to higher data rates (Gbit/sec).
  - Use as component of complex single-chip systems.
Thank You
Questions?