Reliable SW/HW Co-Design for
Wireless Communication System
Integrating the Spin Model
Checker and Celoxica's DK Suite
Stefanos Skoulaxinos
School of EPS – School of MACS
Heriot-Watt University, Edinburgh
Skoulaxinos
1
MAPLD2005/116
Roadmap
• SW-HW Co-Design, Rules and Dangers
• The Wireless Communication System – Long Range
Identification Tag (LRID)
• Expected System Survivability
• Reliability Enhancement Strategies
• Implementation: Targeted FPGA Platform
• Testing Procedure
• Analysis of Results and Reliability Estimation
• Work in progress: 3d Tag Location
SW-HW Co-Design: a trip from idealism to realism
Dangers
1 Irrational Abstraction: raising the design to a theoretical level that is impractical for the targeted application
2 Flawed synthesis process
Potential
1 Increased system readability and testability, fast code turnarounds, impressive productivity gains
2 Bridging the gap between software and hardware development methods and tools
3 Application of high-level reliability enhancement strategies
4 A higher level of abstraction raises the designer's vantage point, enabling more complex applications through a more testable development process
5 Possibility of monitoring and healing system defects (SW or HW) through a multilayered software architecture (Operating System). Lower levels of fault tolerance (TMR) can be synthesized automatically by the compiler.
LRID Tag - Overview
[Figure: Control Centre (User) and tag at an Inaccessible Location]
Requirements
• Tolerate environmental noise
• Self monitor and heal
• Increased levels of survivability
• Minimal power consumption at remote station
• Maximal processing accuracy at base station
LRID Tag – Main Operation
1 Event from user
2 Command Transmission by Base Station
3 Signal Present?
4 Command Reception by Remote Station
5 ID Transmission by Remote Station
6 ID Reception by Base Station
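The main operation above can be sketched as a toy simulation. This is only an illustration under stated assumptions: the command word `SEND_ID`, the function names and the in-memory "channel" are hypothetical and do not come from the slides.

```python
COMMAND = "SEND_ID"  # hypothetical command word, not taken from the slides


def remote_station(channel, tag_id, trace):
    """Steps 3-5: check for a signal, receive the command, answer with the ID."""
    trace.append("3 signal present? " + ("yes" if channel is not None else "no"))
    if channel is None:
        return None  # nothing on air: stay idle
    trace.append("4 command received by remote station")
    if channel != COMMAND:
        return None  # unknown command: do not answer
    trace.append("5 ID transmitted by remote station")
    return tag_id


def base_station(user_event, tag_id, trace):
    """Steps 1, 2 and 6 on the base-station side."""
    channel = None
    if user_event:
        trace.append("1 event from user")
        trace.append("2 command transmitted by base station")
        channel = COMMAND
    reply = remote_station(channel, tag_id, trace)
    if reply is not None:
        trace.append("6 ID received by base station")
    return reply


trace = []
received = base_station(True, 0x2A, trace)
for step in trace:
    print(step)
```

Without a user event, the remote station only performs the "signal present?" check and returns to idle, which is the behaviour one would expect from a low-power remote unit.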
LRID Tag – Task Overhead
[Figure: Base Station Execution Overhead. Base station functions vs. time (msecs, logarithmic axis from 0.1 to 1000). Functions shown: preamble, command core, command parity, forward sync]
[Figure: Remote Station Execution Overhead. Remote station functions vs. time (msecs, logarithmic axis from 1 to 1000). Functions shown: synch, synch to command delay, command core, command parity, with min and max values]
Software Reliability Enhancement Strategies
[Check-mark column "Applied to the __ Tag": the per-item marks are not recoverable from the slide text]
1 Fault Prevention
• High Quality Specification
• Design Diversity
• Modeling, Formal Verification
• Testing
• Structured Design Principles
2 Fault Tolerance
• Run Time monitoring (Watchdog Timers)
• Fault Location and Isolation
• SW/HW Redundancy
• N-Version Programming, Voting Schemes
Fault Prevention: Modeling
and Formal Verification
Description
• When aiming for high levels of reliability, it is essential to
understand the system in depth. Modeling provides an
alternative view of the design and thus contributes to this
understanding. Formal verification, which follows modeling, is an
exhaustive computer-based check covering all
possible event scenarios
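What "covering all possible event scenarios" means can be illustrated with a very small explicit-state search. The sketch below is not Spin and not the author's Promela model; the two-process model (base station, remote station, one-slot channel) and the optional "lost reply" fault are assumptions made purely for illustration.

```python
from collections import deque


def successors(state, lossy=False):
    """All states reachable in one step from (base, remote, channel)."""
    base, remote, chan = state
    nxt = []
    if base == "idle" and chan is None:
        nxt.append(("wait_id", remote, "CMD"))   # base sends a command
    if base == "wait_id" and chan == "ID":
        nxt.append(("idle", remote, None))       # base receives the ID
    if remote == "listen" and chan == "CMD":
        nxt.append((base, "reply", None))        # remote receives the command
    if remote == "reply" and chan is None:
        nxt.append((base, "listen", "ID"))       # remote sends its ID
        if lossy:
            nxt.append((base, "listen", None))   # assumed fault: reply is lost
    return nxt


def explore(initial, lossy=False):
    """Breadth-first search over every interleaving; report deadlocked states."""
    seen = {initial}
    queue = deque([initial])
    deadlocks = []
    while queue:
        state = queue.popleft()
        nxt = successors(state, lossy)
        if not nxt:
            deadlocks.append(state)  # no process can move: deadlock
        for s in nxt:
            if s not in seen:
                seen.add(s)
                queue.append(s)
    return seen, deadlocks


start = ("idle", "listen", None)
states_ok, deadlocks_ok = explore(start)
states_lossy, deadlocks_lossy = explore(start, lossy=True)
print(len(states_ok), deadlocks_ok)
print(len(states_lossy), deadlocks_lossy)
```

In the fault-free model no deadlock is reachable; once a lost reply is allowed, the search finds the state where the base station waits forever. A real model checker performs the same kind of search on far larger state spaces and with richer properties.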
Applied to the Tag
The Tag was modelled and verified in the Spin Model Checker.
Spin is considered one of the most efficient software verification
tools currently available. It is actively used in safety-critical NASA
applications, such as the Cassini mission to Saturn
and the Mars Pathfinder.
Fault Prevention: Structured Design
Description
• A set of guidelines to be followed by system
designers. They contribute to code readability and
testability, making fault-removal processes easier and
more effective
Applied to the Tag
1 The abstract operation implemented by the system is briefly outlined. A number of languages can be deployed in this phase (UML, CORE, YSM).
2 The core of the application is developed in Promela. Simulation under Spin is performed in this phase.
3 The design is examined exhaustively through formal verification. It is checked for deadlock conditions, responsiveness, assertion violations and mutual exclusion violations.
4 The Promela model is translated, with the aid of Bison and Flex, to a language compatible with the synthesis tools for FPGAs (Handel-C).
5 Synthesis is performed. The HDL source code is then imported into Xilinx ISE, and the configuration file is generated.
6 The targeted FPGA hardware is programmed and system testing takes place.
Fault Tolerance: Run-time
Monitoring
Description
• Software or hardware redundancy that monitors the run-time
operation of the main system. It is commonly used
in high-end safety-critical applications, including NASA
missions. In such complex systems, monitoring tends to
form multilayered architectures covering both software
and hardware fault scenarios
Applied to the Tag
• We have developed watchdog timers and Forward
Error Correction (FEC) architectures. We have taken
the proven watchdog timer scheme a step further by
introducing access points and a multilayered
implementation. We have developed FEC schemes to
counterbalance the expected noise in the transmission medium
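The slides do not say which FEC scheme was used on the tag. As a generic illustration of forward error correction, the sketch below uses a Hamming(7,4) code, which is an assumption chosen only because it is small: four data bits are sent as seven bits, and any single flipped bit is corrected at the receiver without retransmission.

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword [p1, p2, d1, p4, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4  # 1-based position of the bad bit, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


# Example: one bit is corrupted by channel noise and still decoded correctly.
sent = hamming74_encode([1, 0, 1, 1])
noisy = list(sent)
noisy[4] ^= 1
recovered = hamming74_decode(noisy)
print(sent, noisy, recovered)
```

The trade-off is the one any FEC scheme makes: extra transmitted bits (and therefore extra air time and power) in exchange for tolerance to a bounded amount of noise.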
Run Time Monitoring
Watchdog Timers
- Watchdog timers are monitoring architectures utilised to detect whether a system has deadlocked
- They can cover a wide range of faults, including software, hardware and real-time bugs
[Figure: the monitored system (main controller) repeatedly sends "Reset timer" signals to the watchdog timer (monitoring architecture) as proof of system liveness]
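Run Time Monitoring
Watchdog Timers
Example of Multi-layered Implementation
// Layer 2 (monitored by Watchdog layer2): access points (AP) are reported inside each function.
// Watchdog_layer1(), Watchdog_layer2() and the AP/reset declarations are not shown on the slide.
void function1(void)
{
    // some processing
    layer2_AP=0; layer2_reset=1;
    // some processing
    layer2_AP=1; layer2_reset=1;
}
void function2(void)
{
    // some processing
    layer2_AP=2; layer2_reset=1;
    // some processing
    layer2_AP=3; layer2_reset=1;
}
void function3(void)
{
    // some processing
    layer2_AP=4; layer2_reset=1;
    // some processing
    layer2_AP=5; layer2_reset=1;
}
// Layer 1 (monitored by Watchdog layer1): an access point is reported after each function returns.
void main_operation(void)
{
    function1();
    layer1_AP=0; layer1_reset=1;
    function2();
    layer1_AP=1; layer1_reset=1;
    function3();
    layer1_AP=2; layer1_reset=1;
}
// Both watchdog layers run in parallel with each other.
void run_time_monitoring(void)
{
    par
    {
        Watchdog_layer1();
        Watchdog_layer2();
    }
}
// Top level: the main operation and its monitor run in parallel.
void main(void)
{
    par // parallel notation
    {
        main_operation();
        run_time_monitoring();
    }
}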
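The behaviour of the two watchdog layers can be approximated in software. The sketch below is a sequential, tick-based simulation and not the Handel-C implementation: the timeout values, the `Watchdog` class and the rule that access points must arrive in ascending order are all assumptions, since the slide does not show the watchdog bodies.

```python
class Watchdog:
    """One watchdog layer: a tick counter plus an expected access-point order."""

    def __init__(self, name, timeout_ticks, n_access_points, log):
        self.name = name
        self.timeout = timeout_ticks
        self.n_ap = n_access_points
        self.log = log          # fault log, may be shared between layers
        self.counter = 0
        self.expected_ap = 0

    def kick(self, ap):
        """Mirrors `layerN_AP=ap; layerN_reset=1;` in the monitored code."""
        if ap != self.expected_ap:
            self.log.append(f"{self.name}: access point {ap}, expected {self.expected_ap}")
        self.expected_ap = (ap + 1) % self.n_ap
        self.counter = 0

    def tick(self):
        """One time step without proof of liveness."""
        self.counter += 1
        if self.counter > self.timeout:
            self.log.append(f"{self.name}: timeout")
            self.counter = 0    # stands in for the recovery action


def main_operation(layer1, layer2, stuck_function=None):
    """Run function1..function3; one of them may be made to stall."""
    for index in range(3):
        work_ticks = 50 if index == stuck_function else 2
        for half in range(2):
            for _ in range(work_ticks):       # "some processing"
                layer1.tick()
                layer2.tick()
            layer2.kick(2 * index + half)     # layer2_AP = 0..5
        layer1.kick(index)                    # layer1_AP = 0..2


healthy_log = []
main_operation(Watchdog("layer1", 20, 3, healthy_log),
               Watchdog("layer2", 5, 6, healthy_log))

faulty_log = []
main_operation(Watchdog("layer1", 20, 3, faulty_log),
               Watchdog("layer2", 5, 6, faulty_log), stuck_function=1)
print(healthy_log)
print(faulty_log[0])
```

With these assumed timeouts the inner layer reports the stall first, and the outer layer reports it later if the stall persists. This is one way to read the benefit of a multilayered scheme: faults are caught close to where they occur, with a coarser layer behind as backup.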
FPGA platforms utilized during Testing
1 Base Station: Xilinx Spartan IIE FPGA
- utilized to control data communication with the user PC, ID reception from the antenna and tag location computations, with all processes executed in parallel
- capable of correlating multiple IDs in a truly concurrent manner
- 100 MHz on-board oscillator
- can deploy 32 MB of on-board SDRAM
- the Spartan IIE board supports 3.3V and 2.5V I/O standards
2 Remote Station: Xilinx CoolRunner II CPLD
- optimized for very low power, high performance systems; ideal for wireless applications
- on-board low-power oscillator set at 32 kHz
- the board supports 1.8V and 3.3V I/O standards
Testing Procedure
controlled noise injection
1 Establish a suitable noise pattern
2 Inject noise starting with minimum duration
3 Increase noise duration progressively and check for system liveness
4 Log maximum noise the tag could withstand without failing
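The procedure above amounts to a simple search loop. The sketch below shows the control flow only; `toy_tag`, the pattern name and the millisecond values are hypothetical stand-ins for the real hardware test bench.

```python
def max_tolerated_noise(survives, pattern, start_ms, step_ms, limit_ms):
    """Increase the noise duration until the system fails; return the last survivor."""
    tolerated = None
    duration = start_ms                       # step 2: start with the minimum duration
    while duration <= limit_ms:
        if not survives(pattern, duration):   # step 3: check for system liveness
            break
        tolerated = duration
        duration += step_ms                   # step 3: increase duration progressively
    return tolerated                          # step 4: value to be logged


def toy_tag(pattern, duration_ms):
    """Hypothetical system under test: survives noise bursts of up to 35 ms."""
    return duration_ms <= 35


result = max_tolerated_noise(toy_tag, "burst", start_ms=5, step_ms=5, limit_ms=100)
print(result)
```

Running the same loop with fault tolerance disabled and enabled gives two comparable figures for how much noise the tag withstands.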
Analysis of Results – Reliability Estimation
Fault Tolerance Disabled: MTBF = 18 seconds
Fault Tolerance Enabled: MTBF = 50 seconds
Notes: Test results were analysed in the CASRE Reliability Estimation Tool (developed by JPL-NASA)
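To put the two MTBF figures (18 s and 50 s) side by side, the sketch below computes their ratio and a survival probability. The constant-failure-rate model R(t) = exp(-t / MTBF) is an assumption made here for illustration; it is not necessarily the model that CASRE fitted to the test data.

```python
import math


def reliability(t_seconds, mtbf_seconds):
    """Probability of running t seconds without failure, assuming a constant failure rate."""
    return math.exp(-t_seconds / mtbf_seconds)


MTBF_DISABLED = 18.0  # seconds, reported with fault tolerance disabled
MTBF_ENABLED = 50.0   # seconds, reported with fault tolerance enabled

improvement = MTBF_ENABLED / MTBF_DISABLED
print(f"MTBF improvement factor: {improvement:.2f}")
for t in (5, 18, 50):
    print(t, reliability(t, MTBF_DISABLED), reliability(t, MTBF_ENABLED))
```

Under this assumption the MTBF rises by a factor of roughly 2.8 when fault tolerance is enabled, and the probability of surviving any given noise exposure time is higher in the enabled case.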
Work in progress – 3D Tag Location
Operation
1 The user activates a tag location query in the front end API (shown above)
2 The API connects with the base station hardware (Xilinx Spartan IIE FPGA) and initiates
transmission to the remote stations
3 Selected remote stations respond by sending their unique ID sequence.
4 The time of arrival of the ID at three base station antennas is used by the FPGA to compute
precise x, y and z co-ordinates of the tag. The co-ordinates are sent back to the API,
where they are displayed in a 3D animated view.
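The slides do not give the location algorithm. One textbook way to turn three times of arrival into x, y and z is trilateration, sketched below under several assumptions: the one-way propagation time to each antenna is known (for example from a synchronised or round-trip measurement), the antennas are not collinear, and the tag lies on the positive side of the antenna plane, since three antennas alone leave a mirror ambiguity. The function names and antenna positions are hypothetical.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _add(a, b):
    return tuple(x + y for x, y in zip(a, b))


def _scale(a, k):
    return tuple(x * k for x in a)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def locate(antennas, arrival_times):
    """Trilateration: antenna positions (m) and one-way times (s) -> tag (x, y, z)."""
    p1, p2, p3 = antennas
    r1, r2, r3 = (C * t for t in arrival_times)   # times to distances
    d = math.sqrt(_dot(_sub(p2, p1), _sub(p2, p1)))
    ex = _scale(_sub(p2, p1), 1.0 / d)            # local x axis: p1 -> p2
    i = _dot(ex, _sub(p3, p1))
    ey_raw = _sub(_sub(p3, p1), _scale(ex, i))
    ey = _scale(ey_raw, 1.0 / math.sqrt(_dot(ey_raw, ey_raw)))
    ez = _cross(ex, ey)                           # normal to the antenna plane
    j = _dot(ey, _sub(p3, p1))
    x = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)
    y = (r1 ** 2 - r3 ** 2 + i ** 2 + j ** 2) / (2 * j) - (i / j) * x
    z = math.sqrt(max(r1 ** 2 - x ** 2 - y ** 2, 0.0))  # assumed: positive side
    return _add(p1, _add(_scale(ex, x), _add(_scale(ey, y), _scale(ez, z))))


# Hypothetical layout: three antennas on the ground, tag at (3, 4, 5) metres.
antennas = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
tag = (3.0, 4.0, 5.0)
times = [math.dist(tag, a) / C for a in antennas]
estimate = locate(antennas, times)
print(estimate)
```

At these distances the arrival times differ by only a few nanoseconds, which gives a sense of the timing resolution such a scheme needs from the base station hardware.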
Conclusions
With the assistance of hard- and soft-core processors embedded on
state-of-the-art programmable devices, FPGAs are beginning to move away
from solitary DSP operation. They can handle complex control
processing functions and form complete systems on chip. The
increased complexity of such applications is beginning to move
out of reach of traditional low-level design routes. SW/HW Co-Design
is evolving fast to overcome this design handicap.
Lessons learned at lower levels of implementation can form a solid
base for a multi-layered fault-tolerant architecture on a single FPGA
platform.
Acknowledgements
The presenter wishes to thank everyone who has contributed to the
conception (2002) and development of the research project: the
Dependable Systems Group and Microengineering Group at Heriot-Watt
University, as well as the Institute for System Level Integration (ISLI)
and the Scottish Embedded Software Centre (SESC) in Livingston.