A Genetic Representation for Evolutionary Fault Recovery


Autonomous FPGA Fault Handling through Competitive Runtime Reconfiguration
Ronald F. DeMara and Kening Zhang
University of Central Florida
1 July 2005
Fault-Handling Techniques for SRAM-based FPGAs
[Figure: taxonomy of reprogrammable device failures and fault-handling approaches]
Failure characteristics
– Duration: Transient (SEU) or Permanent (SEL, Oxide Breakdown, Electron Migration, LPD)
– Target: Device Configuration or Processing Datapath
Approaches named in the figure: Repetitive Readback [Wells00], TMR, BIST, STARS [Abramovici01], CED [McCluskey04], Evolutionary, Sussex [Vigander01], CRR [DeMara05]
Methods compared per approach: Detection, Isolation, Diagnosis, Recovery
Method labels in the figure: Bitwise Comparison; Majority Vote (conventional spatial redundancy); Invert Bit Value; Ignore Discrepancy; Supplementary Testbench; Duplex Output Comparison; Cartesian Intersection; Worst-case Clock Period Dilation; Replicate in Spare Resource; Fast Run-time Location; Select Spare Resource; (not addressed); unnecessary; Population-based GA using Extrinsic Fitness Evaluation; Evolutionary Algorithm using Intrinsic Fitness Evaluation
Previous Work
Detection Characteristics of FPGA Fault-Handling Schemes
Strategies:
1) Evolve redundancy into design before anticipated failure
2) Redesign after detection of failure
3) Combine desirable aspects of both strategies 1) + 2) …
Fault Detection
Resource Coverage
Approach
Fault Handling Method
Latency
Distinguish
Transients
Logic
TMR
Spatial voting
Negligible
No
Yes
Yes
No
Voting element
[Vigander01]
Spatial voting & offline
evolutionary
regeneration
Negligible
No
Yes
No
No
Voting element
[Lohn, Larchev,
DeMara03]
Offline evolutionary
regeneration
Negligible
No
Yes
Yes
No
Unnecessary
[Lach98]
Static-capability tile
reconfiguration
STARS
[Abramovici01]
[Keymeulen,
Stoica,
Zebulum00]
Competitive
Runtime
Reconfiguration
(CRR)
Roving Test Area
Granularity
Relies on independent fault detection mechanism
Up to 8.5M
erroneous outputs
Population-based fault
Design-time
insensitive design
prevention emphasis
Competing
configurations with
temporal voting and
online regeneration
InterComparator
connect
Fault Isolation
Negligible
Test pattern
transients
Yes
Yes
No
LUT function
No
Yes
Yes
No
Not addressed
at runtime
Yes
Unnecessary,
but can isolate
functional
components
Transients
are
attenuated
automatically
Yes
Yes
CRR Arrangement in SRAM FPGA
SRAM-based FPGA
Configurations in Population
• C = CL CR
• CL = subset of left-half configurations
• CR = subset of right-half configurations
• |CL|=|CR |= |C|/2
CONFIGURATION BIT STREAM
L
Half-Configuration
R
Half-Configuration
Discrepancy Operator
• Baseline Discrepancy Operator $\otimes$ is a dyadic operator with binary output:
  $C_i^L \otimes C_i^R = 0$ if $Z(C_i^L) = Z(C_i^R)$, and $1$ otherwise
• Z(Ci) is the FPGA data throughput output of configuration Ci
• Each half-configuration evaluates $\otimes$ using an embedded checker (XNOR gate) within each individual
[Figure labels: Function Logic L, Function Logic R, Discrepancy Check L, Discrepancy Check R, DATA OUTPUT, CONTROL]
FEEDBACK
OFF-CHIP EEPROM
( NOTE: a non-volatile memory is already required to boot any SRAM
FPGA from cold start ... this is not an additional chip )
INPUT DATA
• Any fault in checker lowers that individual’s
fitness so that individual is no longer preferred
and eventually undergoes repair
Reconfiguration Algorithm
WTA (Equivalence): $\otimes = \bigvee_{j} \left( C^{L}_{i,j} \;\mathrm{EOR}\; C^{R}_{i,j} \right)$
RS (Hamming Distance): $\otimes = \sum_{j} \left( C^{L}_{i,j} \;\mathrm{EOR}\; C^{R}_{i,j} \right)$
Terminology and Characteristics
Pristine Pool: $C_P$. Any $C_i \in C$ is a member of $C_P$ at generation $G$ if and only if $\sum_{K=1}^{G} \left( C_K^L \otimes C_K^R \right) = 0$
Suspect Pool: $C_S$. Any $C_i \in C$ is a member of $C_S$ at generation $G$ if and only if at least one of $C_K^L \otimes C_K^R \neq 0$ $(1 \le K \le G)$
Under Repair Pool: $C_U$. Any $C_i \in C$ is a member of $C_U$ at generation $G$ if and only if $\sum_{K=1}^{G} \left( C_K^L \otimes C_K^R \right)$ […] $1$
Refurbished Pool: $C_R$. After a Genetic Operator is applied, the newly generated individual is a member of $C_R$ at generation $G$ if and only if $\sum_{K=1}^{G} \left( C_K^L \otimes C_K^R \right)$ […] $0$
$E_D$ is the Discrepancy Count of $C_i$ and $E_C$ is the Correctness Count of $C_i$
Length of Fitness Evaluation Window: $E_W = W = E_D + E_C$
Fitness Metric: $f(C_i) = E_C / E_W$
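The discrepancy test and the fitness metric above can be sketched in a few lines of Python. This is only an illustration: the class name `FitnessWindow` and its fields are hypothetical, and each output Z is assumed to be an integer word.

```python
from dataclasses import dataclass


def discrepancy(z_left: int, z_right: int) -> int:
    """Baseline discrepancy: 0 if the two half-configuration outputs agree, 1 otherwise."""
    return 0 if z_left == z_right else 1


@dataclass
class FitnessWindow:
    """Hypothetical helper that accumulates outcomes over one evaluation window."""
    discrepancy_count: int = 0   # E_D
    correctness_count: int = 0   # E_C

    def record(self, z_left: int, z_right: int) -> None:
        # Each comparison lands in exactly one of the two counters.
        if discrepancy(z_left, z_right):
            self.discrepancy_count += 1
        else:
            self.correctness_count += 1

    @property
    def length(self) -> int:
        # Window length: E_D + E_C
        return self.discrepancy_count + self.correctness_count

    def fitness(self) -> float:
        # Fitness metric: E_C divided by the window length
        if self.length == 0:
            raise ValueError("no evaluations recorded")
        return self.correctness_count / self.length
```

For example, 6 discrepancies in a window of 600 comparisons give a fitness of 594/600 = 0.99.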
Sketch of CRR Approach
Premise: Recovery Complexity << Design Complexity
1. Initialization
 – Population P of functionally-identical yet physically-distinct configurations
 – Partition P into sub-populations that use supersets of physically-distinct resources, e.g. size |P|/2 to designate physical FPGA left-half or right-half resource utilization
2. Fitness Assessment: fitness assessment via pairwise discrepancy (temporal voting vs. spatial voting)
 – Discrepancy Operator ⊗ is some function of bitwise agreement between each half’s output
 – Four Fitness States defined for Configurations as {CP, CS, CU, CR} with transitions, respectively: Pristine → Suspect → Under Repair → Refurbished
 – Fitness Evaluation Window W determines comparison interval
3. Regeneration
 – Genetic Operators used to recover from fault based on Reintroduction Rate
 – Operators only applied once, then offspring returned to “service” without concern about increasing fitness
Configuration Health States
State transitions during lifetime of the i-th Half-Configuration
[Figure: state diagram with states primordial, pristine, suspect, under repair and refurbished. Transitions 1 to 11 are labeled by the comparison outcome (L=R or L≠R) and by fitness fi relative to the Operational Threshold fOT and the Repair Threshold fRT. EVOLUTION (partial repair, complete repair) is integral with COMPETITION.]
Procedural Flow under
Competitive Runtime Reconfiguration
[Figure: flowchart of the PRIMARY LOOP]
1. Initialization: Population partitioned into functionally-identical yet physically-distinct half-configurations
2. Selection: choose FPGA configuration(s) labeled L and R
3. Detection: apply functional inputs to compute FPGA outputs using L, R (L=R is discrepancy free)
4. Fitness Adjustment: update fitness of only L and R based on detection results
5. Is either L's or R's fitness < Repair Threshold? YES: invoke Genetic Operators only once and only on L or R. NO: continue
6. Adjust Controls: detection mode, overlap interval, ...
Integrates all fault handling stages using EC strategy
 – Detects faults by the occurrence of discrepancy
 – Isolates faults by accumulation of discrepancies
 – Failure-specific refurbishment using Genetic Operators: Intra-Module-Crossover, Inter-Module-Crossover, Intra-Module-Mutation
Realize online device refurbishment
 – Refurbished online without additional function or resource test vectors
 – Repair during the normal data throughput process
Selection Process
Selection of the L half-configuration:
 – Any Pristine individuals? YES: Select* one Pristine individual as L half-configuration
 – NO: Any Suspect individuals? YES: Select** one Suspect individual as L half-configuration
 – NO: Select*** one Refurbished individual as L half-configuration
Selection of the R half-configuration:
 – Choose random number X on [0..1]
 – X > Re-introduction rate (X > R)? YES: Select one Operational (Pristine*, Suspect**, or Refurbished***) individual as R half-configuration
 – NO: Select*** one Under Repair individual as R half-configuration
Then goto Detection process
* = selection that favors inventory rotation
** = selection based on fitness ranking that favors correctness
*** = selection based on fitness ranking that favors correctness with optional second-order metric such as routing delay (to automatically evolve better throughput performance at no additional cost)
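A minimal sketch of this selection logic, assuming each pool is a Python list already ordered by the relevant ranking (rotation or fitness), so "select" simply takes the first element. The function names, the pool dictionary layout, and the fallback used when no Under Repair individual exists are assumptions for illustration only.

```python
import random
from typing import Dict, List, Optional

# Assumed priority order for operational individuals.
OPERATIONAL_ORDER = ("pristine", "suspect", "refurbished")


def select_left(pools: Dict[str, List[str]]) -> Optional[str]:
    """Pick L: Pristine if any, else Suspect, else Refurbished."""
    for state in OPERATIONAL_ORDER:
        if pools.get(state):
            return pools[state][0]
    return None


def select_right(pools: Dict[str, List[str]], reintroduction_rate: float,
                 rng: random.Random) -> Optional[str]:
    """Pick R: an Operational individual if X > rate, else an Under Repair one."""
    x = rng.random()
    under_repair = pools.get("under_repair", [])
    # Assumption: fall back to an Operational individual when nothing is under repair.
    if x > reintroduction_rate or not under_repair:
        return select_left(pools)
    return under_repair[0]
```

With a re-introduction rate of 0.1, roughly one selection in ten would place an Under Repair individual in the R slot, which is how repaired candidates get re-evaluated on live data.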
Fitness Adjustment Procedure
Discrepancy?
 – YES: Decrease L's & R's fitness according to fitness down-adjustment process
   – Is the individual Pristine? YES: Mark individual as Suspect
   – Is its fitness < Repair Threshold (fL,R < fRT)? YES: Mark individual as Under Repair, then invoke Genetic Operators only once and only on L or R
 – NO: Increase L's & R's fitness according to fitness up-adjustment process
   – Is individual Under Repair? YES: Is its fitness > Operational Threshold (fL,R > fOT)? YES: Mark individual as Refurbished
Then adjust controls & goto Selection process
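The adjustment procedure can be read as a small state machine. The sketch below is one possible reading, not the authors' implementation: the threshold values, the fixed step size, and the `Individual` class are illustrative assumptions.

```python
from dataclasses import dataclass

REPAIR_THRESHOLD = 0.95       # f_RT, illustrative value only
OPERATIONAL_THRESHOLD = 0.98  # f_OT, illustrative value only
STEP = 0.01                   # hypothetical fixed up/down adjustment


@dataclass
class Individual:
    state: str = "pristine"
    fitness: float = 1.0


def adjust(ind: Individual, discrepancy: bool) -> bool:
    """Update one individual after a detection step.

    Returns True when genetic operators should be invoked on it.
    """
    if discrepancy:
        ind.fitness = max(0.0, ind.fitness - STEP)
        if ind.state == "pristine":
            ind.state = "suspect"
        if ind.fitness < REPAIR_THRESHOLD:
            ind.state = "under_repair"
            return True
        return False
    # No discrepancy: reward the individual and check for recovery.
    ind.fitness = min(1.0, ind.fitness + STEP)
    if ind.state == "under_repair" and ind.fitness > OPERATIONAL_THRESHOLD:
        ind.state = "refurbished"
    return False
```

In this reading the same call would be applied to both L and R after every comparison, since a single discrepancy does not reveal which half is at fault.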
Fitness Evaluation Window
• Fitness Evaluation Window: W
 – denotes number of iterations used to evaluate fitness before the state of an individual is determined
• Determination of W for 3x3 multiplier
 – 6 input pins articulating $2^6 = 64$ possible inputs
 – W should be selected so that all possible inputs appear
 – More formally, let rand(X) return some $x_i \in X$ at random. Seek $W : \left[ \bigcup_{i=1}^{W} \mathrm{rand}(X) \right] = X$ with high probability
• $x_K$ = distinct orderings of K inputs showing in D trials
• if D constant, can calculate $P_{k>1}$ successively
• probability $P_K$ of K inputs showing after D trials is the ratio $x_K / K^D$
  $\binom{K}{K} x_K + \binom{K}{K-1} x_{K-1} + \dots + \binom{K}{2} x_2 + \binom{K}{1} x_1 = K^D$
  $\binom{K}{K} P_K + \binom{K}{K-1} P_{K-1} + \dots + \binom{K}{2} P_2 + \binom{K}{1} P_1 = 1$
  $\sum_{m=1}^{K} \binom{K}{m} P_m = 1$
W Determination
When K=64:
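As a cross-check on the choice of W, the probability that every one of K equiprobable inputs appears at least once in D random trials can be computed directly. The sketch below uses the standard inclusion-exclusion closed form rather than the successive calculation outlined on the slide, and assumes inputs are drawn independently and uniformly; the function name is hypothetical.

```python
from fractions import Fraction
from math import comb


def prob_all_inputs_seen(k: int, d: int) -> float:
    """Probability that d uniform random draws from k inputs include every input."""
    # Inclusion-exclusion over the number m of inputs that never appear.
    total = sum((-1) ** m * comb(k, m) * Fraction(k - m, k) ** d
                for m in range(k + 1))
    return float(total)


if __name__ == "__main__":
    print(prob_all_inputs_seen(64, 600))
```

For K = 64 and D = 600 the result is close to 0.995, in line with the 99.5% coverage quoted for W = 600 elsewhere in the deck.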
Impact of Fault on Viable Individuals
• Existence of Positive Test Vector
 – Input $I_p$ comprises an articulating test iff $C_i(I_p) \otimes C_j(I_p) = 1$ for $j \neq i$
 – So if a discrepancy is detected then some $I_p$ exists which manifests the fault
• Minimal Case when $I_p$ is Unique
 – $I_p$ is unique if the fault is observable under exactly one input pattern
• Probability Mass Function for Encountering Minimal Case $I_p$
 – Consider W = 600 yielding 99.5% coverage for a module with input space X = 64
 – The number of input occurrences, $0 \le i \le 600$, that randomly encounter $I_p$ to identify the fault is governed by the probability mass function:
  $\mathrm{p.m.f.}(i) = \binom{W}{i} \dfrac{n^{i} \, (X-n)^{W-i}}{X^{W}}$
  where $W = 600$, $X = 64$, $n = 1$, $0 \le i \le 600$
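The distribution above is a binomial with success probability n/X. A short sketch for evaluating it with exact integer arithmetic follows; the function name and default arguments are illustrative, and inputs are assumed independent and uniform.

```python
from math import comb


def pmf_hits(i: int, w: int = 600, x: int = 64, n: int = 1) -> float:
    """Probability that exactly i of w random inputs hit one of the n articulating patterns."""
    # Integer numerator and denominator keep the computation exact until the final division.
    return comb(w, i) * n ** i * (x - n) ** (w - i) / x ** w


if __name__ == "__main__":
    mean = sum(i * pmf_hits(i) for i in range(601))
    print(f"P(no hit in 600 inputs) = {pmf_hits(0):.6f}, mean hits = {mean:.3f}")
```

The mean of this distribution is W·n/X = 600/64 = 9.375 occurrences of the articulating input per window.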
Integer Multiplier Case Study
• 3-bit x 3-bit unsigned multiplier automated design:
– Building blocks
 – Half-Adder: 18 templates created
 – Full-Adder: 24 templates created
 – Parallel-And: 1 template created
– Randomly select templates for instantiation in modules
GA parameters
 – Population size: 20 individuals
 – Crossover rate: 5%
 – Mutation rate: up to 80% per bit
GA operators
 – External-Module-Crossover
 – Internal-Module-Crossover
 – Internal-Module-Mutation
Experimental Evaluation
 – Xilinx Virtex II Pro on Avnet PCI board
Experiments Demonstrate …
• Objective fitness function replaced by the Consensus-based Evaluation Approach and Relative Fitness
• Elimination of additional test vectors
• Temporal Assessment process
Template Fault Coverage
[Figure: Half-Adder Template A and Half-Adder Template B]
Template A
– Gate3 is an AND gate
– Will lose correctness if a Stuck-At-Zero fault occurs in the second input line of Gate3
Template B
– Gate3 is a NOT gate and only uses the first input line
– Will work correctly even if the second input line is stuck at Zero or One
Regeneration Performance
Parameters:
Difference (vs. Hamming Distance)
Evaluation Window, Ew = 600
Suspect Threshold: S = 1-6/600=99%
Repair Threshold: R = 1-4/600 = 99.3%
Re-introduction rate: r = 0.1
Repairs evolved in-situ, in real-time, without additional test
vectors, while allowing device to remain partially online.
Discrepancy Mirror
• Mechanism for Checking-the-Checker (“golden element” problem)
• Makes checker part of configuration that competes for correctness [DeMara PDPTA-05]
[Figure: Discrepancy Mirror Circuit]
Fault Coverage
Component         | Fault Scenario 1 | Fault Scenario 2 | Fault Scenario 3    | Fault Scenario 4    | Fault-Free
Function Output A | Fault            | Correct          | Correct             | Correct             | Correct
Function Output B | Correct          | Fault            | Correct             | Correct             | Correct
XNOR A            | Disagree (0)     | Disagree (0)     | Fault: Disagree (0) | Agree (1)           | Agree (1)
XNOR B            | Disagree (0)     | Disagree (0)     | Agree (1)           | Fault: Disagree (0) | Agree (1)
Buffer A          | 0                | 0                | High-Z              | 0                   | 1
Buffer B          | 0                | 0                | 0                   | High-Z              | 1
Match Output      | 0                | 0                | 0                   | 0                   | 1
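The fault-coverage table can be checked mechanically. The sketch below transcribes the five scenarios and tests them against one assumed resolution rule, namely that the match line reads 1 only when both buffers actively drive 1. The gate-level wiring of the mirror is not given in this transcript, so that rule and all names here are assumptions.

```python
HIGH_Z = "Z"

# Four single-fault scenarios followed by the fault-free case, transcribed from the table.
SCENARIOS = {
    "output A faulty": {"xnor_a": 0, "xnor_b": 0, "buffer_a": 0, "buffer_b": 0, "match": 0},
    "output B faulty": {"xnor_a": 0, "xnor_b": 0, "buffer_a": 0, "buffer_b": 0, "match": 0},
    "XNOR A faulty": {"xnor_a": 0, "xnor_b": 1, "buffer_a": HIGH_Z, "buffer_b": 0, "match": 0},
    "XNOR B faulty": {"xnor_a": 1, "xnor_b": 0, "buffer_a": 0, "buffer_b": HIGH_Z, "match": 0},
    "fault-free": {"xnor_a": 1, "xnor_b": 1, "buffer_a": 1, "buffer_b": 1, "match": 1},
}


def match_output(buffer_a, buffer_b) -> int:
    """Assumed rule: match is 1 only when both buffers drive 1 (0 or High-Z reads as 0)."""
    return 1 if buffer_a == 1 and buffer_b == 1 else 0


def table_is_consistent() -> bool:
    """True when every transcribed row agrees with the assumed resolution rule."""
    return all(match_output(row["buffer_a"], row["buffer_b"]) == row["match"]
               for row in SCENARIOS.values())
```

Under that rule, a single fault in either function output or in either checker always pulls the match output to 0, which is the self-checking property the table illustrates.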
Influence of LUT utilization
Perpetually Articulating Inputs
with Equiprobable Distribution
• expected number of pairings grows sub-linearly in
number of resources
• utilization below 20% or above 80% implicates (or
exonerates) a smaller sub-set of resources
• at 50% utilization, the expected numbers of pairings for 1,000, 10,000, and 100,000 resources are 11.1, 14.9, and 17.6, respectively
Intermittently Articulating Inputs
with Equiprobable Distribution
• at 90% utilization mean value of
258 pairings are required to
isolate the faulty resource.
Future Work:
Development Board to Self-Contained FPGA (Xilinx Virtex-II Pro)
[Figure: roadmap over Year 1, Year 2 and Year 3, from the Avnet FPGA Development Board (Virtex-II Pro FPGA, Off Chip RAM, PCI Interface, Control hosted on PC, Input Data, Data Output, Bit file, Device Fault) toward CRR on a Chip on the Xilinx Virtex-II Pro (Functional CLBs, ICAP, Config Data, Reconfig Request, Control via on-chip Power PC, Configurations in On Chip RAM Blocks)]
Qualitative Analysis of CRR model
• Number of iterations and completeness of regeneration repair
• Percentage of time the device remains online despite physical resource
fault (availability)
Hardware Resource Management
• Optimization of hardware profile for Xilinx Virtex II Pro
Field Testing on SRAM-based FPGA in a Cubesat mission
Backup Slides
• On following pages …
Isolation: Block Duelling
• Algorithm based on group testing methods
• Successive intersection to assess health of resources
Each configuration k has a binary Usage Matrix $U_k[i,j]$, $1 \le i \le m$ and $1 \le j \le n$
 – m, n are the number of rows and columns of resources in the device
 – Elements $U_k[i,j] = 1$ are resources used in k
History Matrix $H[i,j]$, $1 \le i \le m$ and $1 \le j \le n$, initially all zero, exists in which:
 – entries represent the fitness of resources (i, j)
 – information regarding the fitness of resources over time is stored
A discrepant output will lead to an increase in the value of $H[i,j]$, $\forall\, U_k[i,j] = 1$, $k \in S$
 – All elements of H corresponding to resources used by a discrepant configuration will be incremented by one
 – At any point in time, H[i,j] will be a record of the outcomes of competitions
 – m successive intersections among […] are performed until |S| = 1
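The History Matrix update lends itself to a short sketch. The code below assumes usage and history matrices are nested Python lists with 0-based indices, and treats the resources with the highest count as the current suspects; the helper names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[int]]


def update_history(history: Matrix, usage: Matrix) -> None:
    """Increment H[i][j] for every resource used by a configuration with a discrepant output."""
    for i, row in enumerate(usage):
        for j, used in enumerate(row):
            if used:
                history[i][j] += 1


def most_suspect(history: Matrix) -> List[Tuple[int, int]]:
    """Resources with the highest count, i.e. shared by the most discrepant configurations."""
    peak = max(max(row) for row in history)
    return [(i, j) for i, row in enumerate(history)
            for j, value in enumerate(row) if value == peak]


if __name__ == "__main__":
    h = [[0] * 3 for _ in range(3)]
    u1 = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]
    u2 = [[0, 0, 1], [0, 1, 0], [0, 0, 0]]
    for u in (u1, u2):
        update_history(h, u)
    print(most_suspect(h))  # the only resource shared by both discrepant configurations
```

Two discrepant configurations that overlap in a single resource are enough to single it out; with larger overlaps, further pairings are needed before the suspect set shrinks to one element.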
Dueling Example
[Figure: History Matrix H[i,j] at t=0 (all entries zero), Usage Matrices U1 and U2, and History Matrix H[i,j] at t=2]
• H [i,j] changes after C1 and C2 are loaded
• U1 and U2 are corresponding Usage Matrices
• (3,3) is identified as the faulty resource
Isolation of a single faulty individual with 1-out-of-64 impact
[Figure: fitness of configuration k]
• Outliers are identified after W iterations have elapsed
• E.V. = (1/64)*600 = 9.375 discrepancies from the minimum-impact faulty individual
• Isolated individual’s f differs from the average DV by 3σ after 1 or more observation intervals of length W
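The expected value quoted above is simply the window length multiplied by the fraction of inputs that articulate the fault. A minimal helper, assuming equiprobable inputs and with hypothetical parameter names, makes the arithmetic reusable for other fault impacts:

```python
def expected_discrepancies(articulating_inputs: int,
                           input_space: int = 64,
                           window: int = 600) -> float:
    """Expected discrepancy count in one window when inputs are equiprobable."""
    return window * articulating_inputs / input_space


if __name__ == "__main__":
    print(expected_discrepancies(1))   # 1-out-of-64 impact
    print(expected_discrepancies(10))  # 10-out-of-64 impact
```

A fault with 1-out-of-64 impact yields 9.375 expected discrepancies per window of 600, and a 10-out-of-64 fault yields ten times as many.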
Isolation of a single faulty L individual
with 10-out-of-64 impact
• Compare with 1-out-of-64 fault impact
 – E.V. of (10/64)*600 = 93.75 discrepancies for faulty configuration
 – One isolation will be complete approx. once in every 93.75/5 = 19 Observation Intervals
 – Fault Isolation demonstrated in 100% of cases
Isolation of 8 faulty individuals L4&R4
with 1-out-of-64 impact
• Expected isolations do not occur approximately 40% of the time
 – Average discrepancy value of the population is higher
 – Outlier isolation is difficult
 – Multiple faulty individuals, discrepancies scattered
Online Dueling Evaluation
• Objective
 – Isolate faults by successive intersection between sets of FPGA resources used by configurations
 – Analyze complexity of Isolation process
• Variables
 – Total resources available: measured in number of LUTs
 – Number of Competing Configurations: number of initial “Seed” designs in CRR process
 – Degree of Articulation: some inputs may not manifest faults, even if faulty resource used by individual
 – Resource Utilization Factor: percentage of FPGA resources required by target application/design
 – Number of Iterations for Isolation: measure of complexity and time involved in isolating fault
Isolation of Faulty Resource at the
FPGA resource (LUT) granularity
• 50625 LUTs, comparable to the number of LUTs on a Xilinx Virtex II Pro FPGA
 – Xilinx Virtex II Pro has approximately:
   – 67 columns, 78 rows
   – 4 slices per CLB
   – 2 LUTs per slice
Isolation of Faulty Resource:
Effect of Articulation
• No direct, uniform relation between % Articulation and Number of Isolations!
• Performance best when Articulation (%) = 50% ± 10%
 – Each successive intersection provides maximal information
 – Greatest number of resources are intersected out of “suspect” pool
For further info … EH Website
http://cal.ucf.edu
Fast Reconfiguration for
Autonomously Reprogrammable Logic
• Motivation
– Dynamic reconfiguration required by application
– Exploit architectural & performance improvements fully
– Reconfiguration delay – a major performance barrier
• Previous Work
• Methodology
– Multilayer Runtime Reconfiguration Architecture (MRRA)
– Spatial Management
• Prototype Development
– Loosely-Coupled solution
– Timing Analysis
– System-On-Chip solution
Reconfiguration Demand during CRR
For a complete repair
– Approximately 2,000 generations ($G_{CR}$) may be required
– For each generation, the number of evaluations $O_{new}$ may be up to 100
– Yielding a Cumulative Number of Reconfigurations (CNR) of up to $G_{CR} \times O_{new} = 200{,}000$
– For each reconfiguration task: $L_i = T_{TAT}(i) + T_{DRT}(i) + T_E(i)$
– Therefore, the total delay is $L_{tot} = \sum_{i=1}^{CNR} L_i$
Even if reconfiguration delay alone is assumed to be on the order of tens or hundreds of milliseconds, $L_{tot} \ge 5.5$ hours at 100 ms per reconfiguration
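The bound can be reproduced with a few lines of arithmetic. Multiplying the quoted upper limits, 2,000 generations by 100 evaluations, gives 200,000 reconfigurations. The sketch below assumes a uniform per-reconfiguration delay, which is a simplification of the per-task sum; the function names are hypothetical.

```python
def cumulative_reconfigurations(generations: int = 2000,
                                evaluations_per_generation: int = 100) -> int:
    """Upper bound on the number of reconfigurations for one complete repair."""
    return generations * evaluations_per_generation


def total_delay_hours(cnr: int, delay_per_reconfiguration_s: float) -> float:
    """Total delay in hours, assuming every reconfiguration takes the same time."""
    return cnr * delay_per_reconfiguration_s / 3600.0


if __name__ == "__main__":
    cnr = cumulative_reconfigurations()
    print(cnr, round(total_delay_hours(cnr, 0.1), 2))
```

At an assumed 100 ms per reconfiguration, 200,000 reconfigurations take 20,000 seconds, or about 5.6 hours, which matches the stated lower bound of 5.5 hours.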
Previous Work - Tool Level
Approach | FPGA Supported | Bit Stream Reuse | On-chip System | System Coupling Degree | Potential Limitations
Moraes, Mesquita, Palma, Moller | Virtex XCV300 devices | No | N | Loose | Lack of Area Relocation Capability
Raghavan, Sutton | Xilinx Virtex devices | No | N | Loose | Cumbersome CAD flow
Blodget, McMillan | Virtex II devices | Partial | Y | Medium | Limited hardware speed and capacity. Lack of information for bit stream reuse
Previous Work - Algorithm Level
Approach | Method | Partial Reconfig | Spatial Relocation | Temporal Parallelism | Area shape | RunTime | Potential Limitations
Hauck, Li, Schwabe | Bit file compression | N/A | No | N/A | N/A | No | Full reconfiguration required
Shirazi, Luk, Cheung | Identifying common components | Yes | No | Yes | N/A | No | Design time work required
Mak, Young | Dynamic Partitioning | Yes | No | Yes | N/A | Yes | Only desirable for large designs
Ganesan, Vemuri | Pipelining | Yes | No | Yes | N/A | Yes | Limited pipeline depth
Compton, Li, Knol, Hauck | Relocation and Defragmentation with new FPGA architecture | Yes | Yes | No | Row-based | Yes | Special FPGA architecture required
Diessel, Middendorf, Schmeck, Schmidt | Task Remapped and Relocated | Yes | Yes | No | Rectangle | Yes | Overhead for remapping calculations
Herbert, Christoph, Macro | Partitioning and 2D Hashing | Yes | Yes | Yes | Rectangle | Yes | Rigid task modeling assumptions
Legend (row shading in the original): compression method, temporal method, spatial method
Multilayer Runtime Reconfiguration Architecture (MRRA)
[Figure: Fault-Repair Genetic Algorithm, Control System, Microprocessor, Reconfiguration Engine, System Bus, Virtex-II Pro FPGA, RAM]
• Develop MRRA fast reconfiguration paradigm for the CRR approach
• Validate with real hardware platform along with detailed performance analysis
• First general-purpose framework for a wide variety of applications requiring dynamic reconfiguration
• Extend existing theories on reconfiguration
Loosely Coupled Solution
[Figure: Avnet FPGA Development Board carrying the Virtex-II Pro FPGA and Off Chip RAM, connected through the PCI Interface to Control hosted on PC; signals shown: Input Data, Bit file, FPGA Output]
The entire system operates on a 32-bit basis.
The Virtex-II Pro is mounted on a development board, which can then be interfaced with a workstation running Xilinx EDK and ISE.
Result Assessment
• Establish fully functional framework of both prototypes
• Communication overhead, throughput and overall speed-up analysis
 – Communication overhead for the SOC solution is expected to drop to the order of microseconds or below, versus the order of milliseconds for the Loosely Coupled solution
 – Up to 5-fold speedup is expected compared to the Loosely Coupled solution
• Translation Complexity Analysis
 – The quantity of information that needs to be translated to generate the reconfiguration bitstream
 – Simplification from file level to bit level is expected
• Storage Complexity Analysis
 – The memory space required for the run-time algorithms
 – Decreased memory requirement is expected due to the translation complexity improvement
Project Milestones
[Figure: two Gantt charts with two-month ticks from Nov 2004 to Jul 2007]
SW Schedule milestones, in order:
 – Start
 – Evaluate CRR Parameters in 3x3 multiplier design
 – Design GUI of 3X3 multiplier
 – Build VHDL module and incorporate into the hardware prototype
 – FPGA-resident CRR
 – Performance analysis for prototype 1 on Quad Decoder circuit
 – Implement the SEC circuit design
 – Optimized Parameters for layered comb/seq designs
 – Regen. Final Report
HW Schedule milestones, in order:
 – Start
 – API & Scripts
 – SEC circuit
 – GA representation for the circuit
 – Performance analysis for prototype 1 on 3*3 multiplier
 – Performance analysis for prototype 1 on Quad Decoder circuit
 – OS for ICAP on SOC
 – Reconfig. Performance Report
 – SOC Final Report
Publications
Accepted Manuscripts
1. R. F. DeMara and K. Zhang, “Autonomous FPGA Fault Handling through Competitive Runtime
Reconfiguration,” to appear in NASA/DoD Conference on Evolvable Hardware (EH’05),
Washington D.C., U.S.A., June 29 – July 1, 2005.
2. H. Tan and R. F. DeMara, “A Device-Controlled Dynamic Configuration Framework Supporting
Heterogeneous Resource Management,” to appear in International Conference on
Engineering of Reconfigurable Systems and Algorithms (ERSA’05), Las Vegas, Nevada,
U.S.A, June 27 – 30, 2005.
3. R. F. DeMara and C. A. Sharma, “Self-Checking Fault Detection using Discrepancy Mirrors,” to
appear in International Conference on Parallel and Distributed Processing Techniques and
Applications (PDPTA’05), Las Vegas, Nevada, U.S.A, June 27 – 30, 2005.
Submitted Manuscripts
1. R. F. DeMara and K. Zhang, “Populational Fault Tolerance Analysis Under CRR Approach,”
submitted to International Conference on Evolvable Systems (ICES’05), Barcelona, Sept. 12
– 14, 2005.
2. R. F. DeMara and C. A. Sharma, “FPGA Fault Isolation and Refurbishment using Iterative
Pairing,” submitted to IFIP VLSI-SOC Conference, Perth, W. Australia, October 17 – 19, 2005.
Manuscripts In-preparation
1. R. F. DeMara and K. Zhang, “Autonomous Fault Occlusion through Competitive Runtime
Reconfiguration,” submission planned to IEEE Transactions on Evolutionary Computation.
2. R. F. DeMara and C. A. Sharma, “Multilayer Dynamic Reconfiguration Supporting
Heterogeneous FPGA Resource Management,” submission planned to IEEE Design and Test
of Computers.
Field Testing
Implementation of CRR on-board SRAM-based FPGA in a Cubesat mission
EHW Environments
• Evolvable Hardware (EHW) Environments enable experimental
methods to research soft computing intelligent search techniques
• EHW operates by repetitive reprogramming of real-world physical devices
using an iterative refinement process:
Two modes of Evolvable Hardware:
 – Extrinsic Evolution: Genetic Algorithm with simulation in the loop (software model; device “design-time” refinement; Done? then Build it)
 – Intrinsic Evolution: Genetic Algorithm with hardware in the loop (new approach to device “run-time” refinement; Autonomous Repair of failed devices)
Application: Stardust Satellite
• >100 FPGAs onboard
• hostile environment: radiation, thermal stress
• How to achieve reliability to avoid mission failure?
Genetic Algorithms (GAs)
Mechanism coarsely modeled after neo-Darwinism (natural selection +
genetics)
[Figure: GA cycle. start → population of candidate solutions → evaluate fitness of individuals (Fitness function) → Goal reached? → selection of parents → parents → crossover, mutation → offspring → replacement → population]
Genetic Mechanisms
• Guided trial-and-error search techniques using principles of Darwinian evolution
 – iterative selection, “survival of the fittest”
 – genetic operators: mutation, crossover, …
 – implementor must define fitness function
• GAs frequently use strings of 1s and 0s to represent candidate solutions
 – if 100101 is better than 010001 it will have more chance to breed and influence future population
• GAs “cast a net” over entire solution space to find regions of high fitness
• Can invoke Elitism Operator (E=1, E=2 …)
 – guarantees monotonically non-decreasing fitness of best individual over all generations
GA Success Stories
Commercial Applications:
 – Nextel: frequency allocation for cellular phone networks, $15M predicted savings in NY market
 – Pratt & Whitney: turbine engine design; engineer: 8 weeks; GA: 2 days w/3x improvement
 – International Truck: production scheduling improved by 90% in 5 plants
NASA: superior Jupiter trajectory optimization, antennas, FPGAs
Koza: 25 instances showing human-competitive performance such as analog circuit design, amplifiers, filters
Representing Candidate Solutions
 – An individual can be represented using discrete values (binary, integer, or any other system with a discrete set of values)
 – Example of Binary DNA Encoding:
[Figure: an Individual (Chromosome) drawn as a bit string with one GENE highlighted]
Genetic Operators
t
t +1
selection
reproduction
mutation
recombination
(crossover)
Crossover Operator
[Figure: Population, with two parents cut at the same position]
parents:   1 1 1 | 1 1 1 1   and   0 0 0 | 0 0 0 0
offspring: 1 1 1 0 0 0 0     and   0 0 0 1 1 1 1
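The crossover shown on this slide is a single-point crossover. A minimal sketch, assuming chromosomes are equal-length lists of bits and the cut position is supplied by the caller (the function name is hypothetical):

```python
from typing import List, Tuple


def one_point_crossover(a: List[int], b: List[int],
                        cut: int) -> Tuple[List[int], List[int]]:
    """Swap the tails of two equal-length bit strings after position `cut`."""
    if len(a) != len(b):
        raise ValueError("parents must have the same length")
    if not 0 <= cut <= len(a):
        raise ValueError("cut must lie within the chromosome")
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


if __name__ == "__main__":
    child1, child2 = one_point_crossover([1] * 7, [0] * 7, cut=3)
    print(child1, child2)
```

With the all-ones and all-zeros parents from the slide and a cut after the third bit, the offspring are 1110000 and 0001111.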