Transcript Final Presentation
Slide 1
Technion – Israel Institute of Technology
Faculty of Electrical Engineering
Venus: A Reliable & Reconfigurable Satellite Computer
Students:
Guy Derry
Gil Wiechman
Instructor:
Isaschar Walter
In cooperation with MOD
Winter-Spring 2003
Background
• Satellite computer systems must meet various demands
Endurance to cosmic radiation [1]
Power consumption limitations
Weight limitations
• Space systems demand reliability
Radiation significantly reduces components’ MTBF
Repair is not an option…
• The approach – Redundancy
Data traffic monitoring
Data storage monitoring
[1] According to publications by NASA & CCSDS
Project Goals
• Examine policies for managing redundant peripherals and select one.
• Implement the chosen algorithm on the Virtex II Pro FPGA board.
• Develop a working prototype of a satellite computer, implementing the peripheral device monitoring and operation algorithm.
Design Approach
[Diagram: Memory module (three Memory units, each with EDAC), Processor module (three Processors), Peripheral module (three Peripherals) and Monitor module (three Monitors), connected by the System Bus]
Common Solutions - Active
• One common approach is redundancy of the computer’s elements.
Low power consumption
Off units not vulnerable to radiation
No faulty performance
[Diagram: Processor and Active Monitor Unit on the System Bus, with one Active Device and two Standby Devices]
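The active approach above can be illustrated with a small sketch. It is not the project's implementation: the class name `ActiveMonitor`, the three-device default and the "switch to the next healthy standby" policy are assumptions made for this example. Only one device is powered at a time, matching the slide's point that switched-off units draw no power.

```python
from typing import List, Optional


class ActiveMonitor:
    """Hypothetical active monitor: one device powered, the rest on standby (off)."""

    def __init__(self, device_count: int = 3) -> None:
        # Device 0 starts as the active device; standby devices stay powered off.
        self.powered: List[bool] = [True] + [False] * (device_count - 1)
        self.failed: List[bool] = [False] * device_count
        self.active: Optional[int] = 0

    def report_failure(self) -> Optional[int]:
        """Mark the active device as failed and power up the next healthy standby.

        Returns the index of the new active device, or None if none is left.
        """
        if self.active is None:
            return None
        self.failed[self.active] = True
        self.powered[self.active] = False
        for index, bad in enumerate(self.failed):
            if not bad:
                self.powered[index] = True
                self.active = index
                return index
        self.active = None
        return None


if __name__ == "__main__":
    monitor = ActiveMonitor()
    print(monitor.report_failure())  # new active device index
    print(monitor.powered)
```

The switch-over takes time in a real system, which is the down time the passive approach avoids.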
Common Solutions - Passive
• Another common approach is “statistical redundancy”.
No memory, no down time required
Transparent to entire system, protocol independent
Errors corrected “on the fly”
[Diagram: Processor and Passive Monitor Unit on the System Bus, with three Active Devices]
Our Solution
• Venus combines both approaches - Active & Passive:
Passive monitoring based on TMR – Triple Modular Redundancy
Active monitoring for bus correctness
[Diagram: modules M1, M2 and M3 behind a TMR Monitor and interface on the System Bus, with a Bus Tester (master) and a Bus Tester (slave)]
Architectural Concept
[Diagram: Memory with EDAC, Processor (PPC405), I/O TMR and TMR modules, Bus Tester (master) and Bus Tester (slave), all on the System Bus]
TMR Module - Concept
[Diagram: I/O TMR Module with three I/O units and two TMR Monitors, between the System Bus and the I/O Interface]
TMR Module - Implementation
[Diagram: I/O TMR Module implementation with three PLB UARTs and two TMR Monitors, connected to the PLB, the I/O Interface and SWS]
I/O TMR Module
• “Dual TMR” – monitor per interface
• Efficient monitoring algorithm
• No memory involved – higher robustness
• No down time required
• Transparent to system, indifferent to protocols
• Allows use of “off-the-shelf” hardware modules
• Errors corrected “on-the-fly”
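The majority vote behind a TMR monitor can be sketched as follows. This is a minimal Python illustration, not the project's hardware description: the function names `tmr_vote` and `faulty_modules` and the 8-bit example values are assumptions. The vote is purely combinational (no stored state), which is in line with the "no memory involved" point above.

```python
from typing import List


def tmr_vote(m1: int, m2: int, m3: int) -> int:
    """Bitwise majority of three redundant module outputs."""
    return (m1 & m2) | (m1 & m3) | (m2 & m3)


def faulty_modules(m1: int, m2: int, m3: int) -> List[int]:
    """Return the 1-based indices of the modules that disagree with the vote."""
    voted = tmr_vote(m1, m2, m3)
    return [index for index, value in enumerate((m1, m2, m3), start=1)
            if value != voted]


if __name__ == "__main__":
    # Module 2 has one flipped bit; the vote masks the error on the fly.
    print(hex(tmr_vote(0x5A, 0x5B, 0x5A)))
    print(faulty_modules(0x5A, 0x5B, 0x5A))
```

Because the vote is taken per bit, a single faulty module is outvoted without the consumer of the data noticing, whatever protocol the data belongs to.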
[Diagram: I/O TMR Module with three I/O units and two TMR Monitors, between the System Bus and the I/O Interface]
Bus Tester - Concept
[Diagram: Bus Tester (master) and Bus Tester (slave) on the System Bus]
Bus Tester - Implementation
[Diagram: Bus Tester (master) and Bus Tester (slave) on the System Bus, with two Reset Timing Units and a Watchdog. Signals: User Reset, System Reset, Tester Reset, Tester Reset Request, Set Dog, Bark, Odd, Even, Bus Fail]
Bus Tester - Master
[State diagram: Init, Reset Bus, Count 1, Test Odd, Test Even, Count 2]
• A test is performed every 10 seconds (2^30 clock cycles)
• A test lasts less than 50 clock cycles
• Bus reset test is performed after each system reset
• Consecutive failures result in an interrupt
Bus Tester - Slave
[State diagram: Idle, Odd Req. ID, Respond Odd, Even Req. ID, Respond Even]
• Slave contains no memory, except the four flip-flops required for the state machine
• Fast, stable and reliable
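A possible reading of the slave's state machine is sketched below. The five state names come from the slide; the transition order (Idle, then request identification, then response, then back to Idle) and the `request` argument are assumptions inferred from those names, not a description of the actual design.

```python
from enum import Enum, auto
from typing import Optional


class State(Enum):
    IDLE = auto()
    ODD_REQ_ID = auto()
    RESPOND_ODD = auto()
    EVEN_REQ_ID = auto()
    RESPOND_EVEN = auto()


def next_state(state: State, request: Optional[str]) -> State:
    """Advance one step; `request` is 'odd', 'even' or None (no request seen)."""
    if state is State.IDLE:
        if request == "odd":
            return State.ODD_REQ_ID
        if request == "even":
            return State.EVEN_REQ_ID
        return State.IDLE
    if state is State.ODD_REQ_ID:
        return State.RESPOND_ODD
    if state is State.EVEN_REQ_ID:
        return State.RESPOND_EVEN
    # After responding, the slave returns to Idle and waits for the next test.
    return State.IDLE


if __name__ == "__main__":
    state = State.IDLE
    for request in ("odd", None, None, "even", None, None):
        state = next_state(state, request)
        print(state.name)
```

The only thing carried from step to step is the current state, which mirrors the slide's point that the slave holds no memory beyond its state flip-flops.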
Innovations
• Independent automatic bus testing –
Processor is free from executing check-ups.
• Transparent reliability management system –
Allowing use of standard software.
• Architectural modularity –
Achieved by generic monitor design.
• SoPC implementation of the reliable satellite computer system –
A breakthrough towards Micro-Satellites
• Versatile architecture accomplishing a Reconfigurable System
Project Status
• Various approaches to reliability management were examined
• A combined approach was chosen
• A working prototype was synthesized on the Virtex II Pro state-of-the-art FPGA
• Peripheral units & system bus monitoring were implemented
• Memory monitoring policies were examined
Future Work
• Development of memory monitoring module
• Use CAN bus & Firewire for distributed in-house communication
• Multiple bus / multiple processor integration
• Build & launch…
Thank you for your time!
Slide 4
טכניון – מכון טכנולוגי לישראל
הפקולטה להנדסת חשמל
Venus: A
Reliable & Reconfigurable
Satellite Computer
Students:
Guy Derry
Gil Wiechman
Instructor:
Isaschar Walter
In cooperation with MOD
Winter-Spring 2003
Background
• Satellite computer systems must meet various demands
Endurance to cosmic radiation1
Power consumption limitations
Weight limitations
• Space systems demand reliability
Radiation significantly reduced components’ MTBF
Repair is not an option…
• The approach – Redundancy
Data traffic monitoring
Data storage monitoring
1 According
to publications by NASA & CCSDS
Project Goals
• Examine policies of managing redundant
peripherals and select one.
• Implement the chosen algorithm on the
Virtex II Pro FPGA board
• Develop a working prototype of a satellite
computer, implementing the peripheral
device monitoring and operation algorithm.
Design Approach
Memory module
Memory
Memory
Memory
EDAC
EDAC
EDAC
Processor module
Peripheral module
Processor
Processor
Processor
System Bus
Peripheral
Peripheral
Peripheral
Monitor module
Monitor
Monitor
Monitor
Common Solutions - Active
• One common approach
is redundancy of the
computer’s elements.
System Bus
Low power consumption
Off units not vulnerable to
radiation
No faulty performance
Active
Device
Active
Monitor
Unit
Standby
Device
Standby
Device
Processor
Common Solutions - Passive
• Another common
approach is “statistical
redundancy”.
System Bus
No memory, no down time
required
Transparent to entire system,
protocol independent
Errors corrected “on the fly”
Active
Device
Passive
Monitor
Unit
Active
Device
Active
Device
Processor
Our Solution
• Venus combines both
approaches - Active & Passive:
Interface
TMR Module
Passive monitoring based
on TMR – Triple Module
Redundancy
M1 M2 M3
Active monitoring for bus
correctness
TMR
Monitor
System Bus
Bus
Tester
(master)
Bus
Tester
(slave)
Architectural Concept
Memory
EDAC
Processor
(PPC405)
I/O TMR
TMR
System Bus
Bus Tester
(master)
Bus Tester
(slave)
TMR Module - Concept
I/O TMR Module
I/O TMR
I/O
I/O
TMR
Monitor
TMR
Monitor
System Bus
I/O
I/O Interface
TMR Module - Implementation
I/O TMR Module
TMR
Monitor
TMR
Monitor
PLB
PLB
UART
PLB
UART
I/O Interface
PLB
UART
SWS
I/O TMR Module
•
•
•
•
•
•
•
“Dual TMR” – monitor per interface
Efficient monitoring algorithm
No memory involved – higher robustness
No down time required
Transparent to system, indifferent to protocols
Allows use of “off-the-shelf” hardware modules
Errors corrected “on-the-fly”
I/O Interface
I/O TMR Module
TMR
Monitor
I/O I/O I/O
System Bus
TMR
Monitor
Bus Tester - Concept
System Bus
Bus
Tester
(master)
Bus
Tester
(slave)
Bus Tester - Implementation
Reset
Timing
Unit
User Reset
System Reset
Tester Reset
Reset
Timing
Unit
System Bus
Bus
Tester
(master)
Bus
Tester
(slave)
Set Dog
Watchdog
Bark
Tester Reset Request
Odd
Even
Bus Fail
Bus Tester - Master
Init
Reset
Bus
Count
1
Test
Odd
Test
Even
Count
2
• A test is performed
every 10 seconds
(230 clock cycles)
• A test lasts less than
50 clock cycles
• Bus reset test is
performed after each
system reset
• Consecutive failures
result in an interrupt
Bus Tester - Slave
Odd
Respond
Odd
Req. ID
Idle
Even
Respond
Even
Req. ID
• Slave contains no
memory, except four
flip-flops required for
state machine
• Fast, stable and
reliable
Innovations
• Independent automatic bus testing: the processor is freed from executing check-ups.
• Transparent reliability management system: allows use of standard software.
• Architectural modularity: achieved by a generic monitor design.
• SoPC implementation of the reliable satellite computer system: a breakthrough towards micro-satellites.
• Versatile architecture accomplishing a reconfigurable system.
Project Status
• Various approaches to reliability management were examined
• A combined approach was chosen
• A working prototype was synthesized on the state-of-the-art Virtex II Pro FPGA
• Peripheral unit and system bus monitoring were implemented
• Memory monitoring policies were examined
Future Work
• Development of a memory monitoring module
• Use of CAN bus and FireWire for distributed in-house communication
• Multiple bus / multiple processor integration
• Build & launch…
Thank you for your time!
Slide 12
טכניון – מכון טכנולוגי לישראל
הפקולטה להנדסת חשמל
Venus: A
Reliable & Reconfigurable
Satellite Computer
Students:
Guy Derry
Gil Wiechman
Instructor:
Isaschar Walter
In cooperation with MOD
Winter-Spring 2003
Background
• Satellite computer systems must meet various demands
Endurance to cosmic radiation1
Power consumption limitations
Weight limitations
• Space systems demand reliability
Radiation significantly reduced components’ MTBF
Repair is not an option…
• The approach – Redundancy
Data traffic monitoring
Data storage monitoring
1 According
to publications by NASA & CCSDS
Project Goals
• Examine policies of managing redundant
peripherals and select one.
• Implement the chosen algorithm on the
Virtex II Pro FPGA board
• Develop a working prototype of a satellite
computer, implementing the peripheral
device monitoring and operation algorithm.
Design Approach
Memory module
Memory
Memory
Memory
EDAC
EDAC
EDAC
Processor module
Peripheral module
Processor
Processor
Processor
System Bus
Peripheral
Peripheral
Peripheral
Monitor module
Monitor
Monitor
Monitor
Common Solutions - Active
• One common approach
is redundancy of the
computer’s elements.
System Bus
Low power consumption
Off units not vulnerable to
radiation
No faulty performance
Active
Device
Active
Monitor
Unit
Standby
Device
Standby
Device
Processor
Common Solutions - Passive
• Another common
approach is “statistical
redundancy”.
System Bus
No memory, no down time
required
Transparent to entire system,
protocol independent
Errors corrected “on the fly”
Active
Device
Passive
Monitor
Unit
Active
Device
Active
Device
Processor
Our Solution
• Venus combines both approaches - Active & Passive:
    Passive monitoring based on TMR – Triple Modular Redundancy
    Active monitoring for bus correctness
[Diagram: Interface, TMR Module (M1, M2, M3, TMR Monitor), System Bus, Bus Tester (master), Bus Tester (slave)]
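The benefit of triple redundancy can be illustrated with the standard textbook reliability estimate. The sketch below assumes three identical modules that fail independently and a voter that never fails; these assumptions and the function name `tmr_reliability` are not taken from the slides.

```python
def tmr_reliability(r: float) -> float:
    """Probability that at least two of three independent modules work.

    r is the reliability of a single module (0.0 to 1.0). The voter itself
    is assumed never to fail, which is an idealization.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("module reliability must be between 0 and 1")
    # All three modules work (r^3), or exactly two work (3 * r^2 * (1 - r)).
    return r**3 + 3 * r**2 * (1 - r)


if __name__ == "__main__":
    for r in (0.5, 0.9, 0.99):
        print(f"single module {r:.2f} -> TMR {tmr_reliability(r):.6f}")
```

Under these assumptions, voting only helps when each module is more than 0.5 reliable; below that point the voted result is worse than a single module.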
Architectural Concept
[Diagram: Memory, EDAC, Processor (PPC405), I/O TMR, TMR, System Bus, Bus Tester (master), Bus Tester (slave)]
TMR Module - Concept
[Diagram: I/O TMR Module, I/O TMR, 3 × I/O, 2 × TMR Monitor, System Bus, I/O Interface]
TMR Module - Implementation
[Diagram: I/O TMR Module, 2 × TMR Monitor, 4 × PLB, 3 × UART, I/O Interface, SWS]
I/O TMR Module
• “Dual TMR” – monitor per interface
• Efficient monitoring algorithm
• No memory involved – higher robustness
• No down time required
• Transparent to system, indifferent to protocols
• Allows use of “off-the-shelf” hardware modules
• Errors corrected “on-the-fly”
[Diagram: I/O Interface, I/O TMR Module, 2 × TMR Monitor, 3 × I/O, System Bus]
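The slides do not spell out the monitoring algorithm. A minimal sketch of the usual TMR technique, a stateless bitwise majority vote over three replica outputs, might look like the following. The function name `tmr_vote` and the fault-mask convention are assumptions made for illustration.

```python
from typing import Tuple


def tmr_vote(a: int, b: int, c: int) -> Tuple[int, int]:
    """Return (voted_value, fault_mask) for three replica outputs.

    fault_mask has bit 0, 1 or 2 set when replica a, b or c respectively
    differs from the voted value (a hypothetical reporting convention).
    """
    # Each output bit takes the value held by at least two replicas.
    voted = (a & b) | (a & c) | (b & c)
    fault_mask = 0
    for index, value in enumerate((a, b, c)):
        if value != voted:
            fault_mask |= 1 << index
    return voted, fault_mask


if __name__ == "__main__":
    # The third replica has one flipped bit; the vote masks it.
    print(tmr_vote(0x41, 0x41, 0x43))
```

The vote depends only on the current inputs, so it needs no stored state and corrects a single faulty replica without pausing the data flow, in line with the "no memory" and "on-the-fly" points above.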
Bus Tester - Concept
[Diagram: System Bus, Bus Tester (master), Bus Tester (slave)]
Bus Tester - Implementation
[Diagram: 2 × Reset Timing Unit, System Bus, Bus Tester (master), Bus Tester (slave), Watchdog; signals: User Reset, System Reset, Tester Reset, Set Dog, Bark, Tester Reset Request, Odd, Even, Bus Fail]
Bus Tester - Master
[State diagram: Init, Reset Bus, Count 1, Test Odd, Test Even, Count 2]
• A test is performed every 10 seconds (2^30 clock cycles)
• A test lasts less than 50 clock cycles
• Bus reset test is performed after each system reset
• Consecutive failures result in an interrupt
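The periodic test and the failure interrupt can be sketched as a small cycle counter. This is a sketch under stated assumptions, not the project's design: the cycle count is read as 2^30 (the superscript appears to have been lost in extraction, and 2^30 cycles in 10 seconds would imply a clock of roughly 107 MHz), the failure threshold of two is hypothetical, and the class name `BusTestScheduler` is invented here. The actual transitions between the diagram's states are not given in the slides and are not modeled.

```python
from typing import Callable

TEST_PERIOD_CYCLES = 2**30  # assumed reading of the slide's cycle count
FAILURE_LIMIT = 2           # hypothetical; the slides say only "consecutive failures"


class BusTestScheduler:
    """Runs a bus test once per period and flags repeated failures."""

    def __init__(self, period: int = TEST_PERIOD_CYCLES,
                 failure_limit: int = FAILURE_LIMIT) -> None:
        self.period = period
        self.failure_limit = failure_limit
        self.cycles_until_test = period
        self.consecutive_failures = 0
        self.interrupt = False

    def tick(self, run_test: Callable[[], bool]) -> None:
        """Advance one clock cycle; run_test() returns True when the test passes."""
        self.cycles_until_test -= 1
        if self.cycles_until_test > 0:
            return
        self.cycles_until_test = self.period
        if run_test():
            # A passing test clears the failure streak.
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1
            if self.consecutive_failures >= self.failure_limit:
                self.interrupt = True
```

Because the counter runs on its own, the processor is involved only when the interrupt flag is raised.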
Bus Tester - Slave
[State diagram: Idle, Odd Req. ID, Odd Respond, Even Req. ID, Even Respond]
• Slave contains no memory, except four flip-flops required for state machine
• Fast, stable and reliable
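A slave of this kind can be sketched as a pure next-state function whose only stored value is the current state. The state names follow the diagram labels; the transition order, the request codes and the response patterns are all hypothetical, since the slides do not give them.

```python
from enum import Enum
from typing import Optional, Tuple


class SlaveState(Enum):
    IDLE = "Idle"
    ODD_REQ_ID = "Odd Req. ID"
    ODD_RESPOND = "Odd Respond"
    EVEN_REQ_ID = "Even Req. ID"
    EVEN_RESPOND = "Even Respond"


# Hypothetical values chosen for illustration only.
ODD_REQUEST = 0x1
EVEN_REQUEST = 0x2
ODD_PATTERN = 0x55555555
EVEN_PATTERN = 0xAAAAAAAA


def slave_step(state: SlaveState,
               request: Optional[int]) -> Tuple[SlaveState, Optional[int]]:
    """Return (next_state, response); response is None when the slave is silent."""
    if state is SlaveState.IDLE:
        if request == ODD_REQUEST:
            return SlaveState.ODD_REQ_ID, None
        if request == EVEN_REQUEST:
            return SlaveState.EVEN_REQ_ID, None
        return SlaveState.IDLE, None
    if state is SlaveState.ODD_REQ_ID:
        return SlaveState.ODD_RESPOND, None
    if state is SlaveState.EVEN_REQ_ID:
        return SlaveState.EVEN_RESPOND, None
    if state is SlaveState.ODD_RESPOND:
        # Drive the odd pattern for one step, then go back to waiting.
        return SlaveState.IDLE, ODD_PATTERN
    # Remaining case: EVEN_RESPOND.
    return SlaveState.IDLE, EVEN_PATTERN
```

Nothing besides the state is kept between steps, which mirrors the point that the slave holds no memory other than its state flip-flops.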
Innovations
• Independent automatic bus testing – Processor is free from executing check-ups.
• Transparent reliability management system – Allowing use of standard software.
• Architectural modularity – Achieved by generic monitor design.
• SoPC implementation of the reliable satellite computer system – A breakthrough towards Micro-Satellites
• Versatile architecture accomplishing a Reconfigurable System
Project Status
• Various approaches to reliability management were examined
• A combined approach was chosen
• A working prototype was synthesized on the state-of-the-art Virtex-II Pro FPGA
• Peripheral units & system bus monitoring were implemented
• Memory monitoring policies were examined
Future Work
• Development of memory monitoring module
• Use CAN bus & FireWire for distributed in-house communication
• Multiple bus / multiple processor integration
• Build & launch…
Thank you for your time!