Fault Tolerant Power Systems: Ph.D. Defense by Carsten Nesgaard

Fault Tolerant Power Systems
Ph.D. Defense
by
Carsten Nesgaard
Outline
• Introduction to Fault Tolerance
o Array-based Redundancy Management
o Digital Converter Control
o Hardware Redundancy and Load Sharing
o Partners for Advanced Transit and Highways (Thermal droop load sharing)
• Conclusion
Introduction to Fault Tolerance
Fault tolerance originated within software engineering in the late 1950s.
Fault tolerance associated with highly reliable systems is often
characterized as a means of minimizing down-time by:
• Single point failure free
→ acceptable performance
• Fail-safe
→ graceful degradation
• Fault isolation and detection
→ prevent fault propagation
→ majority voting
• Fault prediction
→ false triggering avoidance
• Load management
→ stress minimization
Fault tolerance in power electronics differs from the concept
used in, for example, control theory.
Array-based Redundancy Management
• Unidirectional power flow
• N + 2 redundancy
• Each converter comprises 5 blocks
• Each converter uses a front switch
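The N + 2 arrangement can be read as a k-out-of-n system: the load is served as long as at least N of the N + 2 installed converters still work. The following is a minimal sketch of that calculation, assuming independent converters with an identical survival probability p. The function name and the example numbers are illustrative and do not come from the slides.

```python
from math import comb


def k_out_of_n_survival(n_required: int, n_installed: int, p: float) -> float:
    """Probability that at least n_required of n_installed independent units survive.

    p is the survival probability of a single unit (assumed identical for all units).
    """
    return sum(
        comb(n_installed, k) * p**k * (1.0 - p) ** (n_installed - k)
        for k in range(n_required, n_installed + 1)
    )


if __name__ == "__main__":
    # Hypothetical example: N = 3 converters required, N + 2 = 5 installed.
    for p in (0.90, 0.99):
        print(p, k_out_of_n_survival(3, 5, p))
```

With two spare units the system survival probability stays well above that of a single converter, which is the motivation for the redundancy level chosen on the slide.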
[Figure: power flow from input to output through an N+2 array of converters; each converter consists of Block 1 to Block 5 in series.]
Array-based Redundancy Management
Switch configuration:
[Figure: switch configuration of one converter. Labelled elements: input, inrush control, on/off switch, power switch, filter, transformer, rectifier, S/C protection, current sharing, output, PWM controller and feedback, with Switch 1A to Switch 1E.]
Array-based Redundancy Management
Automatic reconfiguration based on system ’health’
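The benefit of reconfiguring on system health can be illustrated with a small health matrix, one row per converter and one column per block. This is a sketch under a strong assumption: the switches are treated as ideal and fully cross-connected, so any healthy block may feed any healthy block of the next stage. The real switch topology is not recoverable from the slides, and the function names and the fault pattern are hypothetical.

```python
from typing import List

Health = List[List[int]]  # health[converter][block], 1 = healthy, 0 = failed


def static_chains(health: Health) -> int:
    """Working converters when blocks cannot be shared (static configuration)."""
    return sum(1 for row in health if all(row))


def reconfigured_chains(health: Health) -> int:
    """Working power paths when any healthy block may feed any healthy block
    of the next stage (idealised, fully cross-connected switches)."""
    n_blocks = len(health[0])
    return min(sum(row[j] for row in health) for j in range(n_blocks))


if __name__ == "__main__":
    # Hypothetical fault pattern: three converters each lose a different block.
    health = [
        [1, 1, 1, 1, 1],
        [0, 1, 1, 1, 1],
        [1, 1, 0, 1, 1],
        [1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1],
    ]
    print("static:", static_chains(health))              # 2 complete converters
    print("reconfigured:", reconfigured_chains(health))  # 4 complete paths
```

In this pattern the static system is left with two working converters, while block-level reconfiguration can still assemble four complete power paths.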
[Figure: power system configuration. Converter 1 to Converter 5 are connected between input and output; each converter is divided into Block A to Block E joined by Switch 1A to Switch 1D. The failure signals from the blocks form a 5x5 array.]
Array-based Redundancy Management
Overall system configuration:
[Figure: overall system configuration. The Nial software takes the binary converter/block failure array (all ones when no block has failed) and produces the binary switch array that sets the configuration.]
Array-based Redundancy Management
Power system survival probability:
[Figure: survival probability P (0 to 1.0) versus failure rate in FIT (2·10^5 to 12·10^5) for the proposed configuration and the static configuration.]
Minimizing individual converter failure rates does
NOT imply maximum system survivability.
Array-based Redundancy Management
• Compared to a static system the ability to reconfigure
itself increases the reliability of any system
• The overall reliability of the proposed power system
approaches that of a static N + 3 redundant system
• Increased reliability with minimal increase in
parameters such as volume and mass
• Worst case reliability scenario equals that of the
traditional static power system
• Cost increases considerably due to the large
number of switches
• Individual block failure rates increase
Digital Converter Control
• Simple buck topology with measurements of input
voltage, input current, output voltage and output current
• Converter control (using two control laws) and thermal
monitoring by means of a low-cost microcontroller
[Figure: buck converter block diagram. 12 V input, power switch, filter, 5 V output (1 A max). A PIC16F877 microcontroller measures input voltage, input current, output voltage, output current and temperature, and sets the duty cycle.]
Digital Converter Control
Analytical redundancy:
[Figure: case temperature versus output current, no heatsink. Axes: temperature from TSense (0 to 160), output current (0 to 1.2).]
In the event of a fault in PWM mode, the above graph is used
to determine the converter state, minimizing the risk of
shutting down a well-functioning converter.
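The temperature check can be sketched as a comparison between the measured case temperature and the temperature expected at the present output current. This is a minimal illustration, not the algorithm used in the thesis: the reference curve values, the tolerance and the function names are all placeholders, and the curve is not read from the slide's graph.

```python
from bisect import bisect_right
from typing import Sequence

# Hypothetical reference curve (output current in A, case temperature in deg C).
# These numbers are placeholders, NOT values read from the slide's graph.
REF_CURRENT = (0.0, 0.4, 0.8, 1.2)
REF_TEMP = (25.0, 45.0, 80.0, 130.0)


def expected_temperature(i_out: float,
                         currents: Sequence[float] = REF_CURRENT,
                         temps: Sequence[float] = REF_TEMP) -> float:
    """Linearly interpolate the expected case temperature for a load current."""
    i_out = min(max(i_out, currents[0]), currents[-1])  # clamp to table range
    hi = min(bisect_right(currents, i_out), len(currents) - 1)
    lo = hi - 1
    frac = (i_out - currents[lo]) / (currents[hi] - currents[lo])
    return temps[lo] + frac * (temps[hi] - temps[lo])


def converter_state(i_out: float, t_case: float, tolerance: float = 15.0) -> str:
    """Return 'ok' if the measured temperature is within tolerance of the
    expected value, otherwise 'failed'."""
    deviation = abs(t_case - expected_temperature(i_out))
    return "ok" if deviation <= tolerance else "failed"


if __name__ == "__main__":
    print(converter_state(0.6, 65.0))   # close to the expected value
    print(converter_state(0.6, 120.0))  # far above the expected value
```

Because the temperature is an independent witness to the electrical measurements, a converter whose temperature matches the curve can be kept running even when one electrical reading looks suspicious.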
Digital Converter Control
Software data flow diagram:
Interrupt routine: responsible for correct converter control.
Main loop: responsible for temperature measurement, calculation of the correct control law
and choice of calculation method (look-up or real-time).
[Figure: software data flow diagram. Main: system init; measure input voltage; measure VOUT, VIN, IOUT, IIN and calculate power; control law; check temperature and deduce converter state (converter OK, converter failed); change in control law; shut down converter. Interrupts: timer interrupt (request sample); ADC interrupt (sample; converter control in 'real-time'; if n = 100, measure temperature). Decision branches are labelled 'within spec.' and 'outside spec.'.]
Digital Converter Control
Temperature distribution:
[Figure: printed circuit board divided into temperature zones TSurface, TSurface - 10°C and TSurface - 30°C. Component groups shown: 1 resistor, 4 diodes, 2 capacitors; 5 resistors, 2 ICs, 1 inductor; 1 resistor, 1 MOSFET; 8 resistors, 3 transistors, 4 capacitors; 1 diode, 4 capacitors.]
Probability of survival as a function of time:
$R(t) = e^{-\lambda t}$
Reliability data found in MIL-HDBK-217F (assumes constant $\lambda$)
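For a constant failure rate the survival probability follows directly from the exponential law above. The sketch below assumes the failure rate is given in FIT (failures per 10^9 hours), the unit used on the failure rate axes of the slides. The function names and the 4000 FIT example are illustrative only.

```python
import math

HOURS_PER_FIT_UNIT = 1e9  # 1 FIT = 1 failure per 10^9 device-hours


def survival_probability(fit: float, hours: float) -> float:
    """R(t) = exp(-lambda * t) for a constant failure rate given in FIT."""
    lam = fit / HOURS_PER_FIT_UNIT  # failures per hour
    return math.exp(-lam * hours)


def mtbf_hours(fit: float) -> float:
    """Mean time between failures, 1 / lambda, for a constant failure rate."""
    return HOURS_PER_FIT_UNIT / fit


if __name__ == "__main__":
    # Hypothetical example: 4000 FIT over one year of continuous operation.
    print(survival_probability(4000.0, 8760.0))
    print(mtbf_hours(4000.0))
```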
Digital Converter Control
Failure rates for the two configurations:
[Figure: failure rate λ in FIT (0 to 10000) versus temperature (20 to 120) for the analog configuration and the digital configuration.]
From a reliability point of view:
At temperatures below 120 °C an analog controller is preferable
At temperatures above 120 °C a digital controller is preferable
Digital Converter Control
Measurements:
[Figure: oscilloscope traces of gate-source voltage, output voltage and inductor current for the two control laws, PWM and PS.]
Digital Converter Control
Measurements and final system:
[Figure: initial system, efficiency (0 to 90) versus output current (0 to 1.2).]
[Figure: improved system, efficiency (70 to 82) versus output current (0.25 to 0.45) for PS and PWM control.]
Digital Converter Control
• Simple buck converter controlled by a low-cost PIC
microcontroller
• Introduction to the proposed techniques
• Analytical redundancy, change in control law and
thermal monitoring for increased reliability
• Measurements verified that the algorithm is capable of
performing the required tasks within the timing
limitations of the microcontroller
Hardware Redundancy and Load Sharing
Load sharing is utilized when applications call for:
• Modular structure – increased maintainability
• Simple power system realization
• Short time to market
• Increased reliability – redundancy and fault
tolerance
• High-current low-voltage applications
• Distributed networks
Hardware Redundancy and Load Sharing
[Figure: parallel DC/DC converters, each with power components, current measurement, temperature sensing (TSense) and load share control, connected through a load sharing bus between a common input and the load. Detail: high side current sensing (2.7 V to 20 V) across RMEAS, an op-amp stage with R1 to R4 supplied from +9 V and -9 V, and the LS controller acting on part of the PWM control.]
Hardware Redundancy and Load Sharing
[Figure: schematic of two parallel buck converters, Converter 1 and Converter 2, each carrying IOUT/2. Parts shown per converter: UC3843, IR2110, UC3902, MC3307, IRFP064 with RGate, PBYR3045 diodes, 48 µH inductor, 10 mΩ sense resistor, 100 µF and 470 µF capacitors, RFeedback; supplies +16 V and +5 V.]
Buck topology – simplicity of implementation
125 W converters – 5 V output at 25 A
5% output ripple voltage
4 ICs – lowers overall system reliability
2 freewheeling diodes and 1 MOSFET
L = 48 µH, COut = 200 µF, RSense = 10 mΩ
Hardware Redundancy and Load Sharing
Current distribution for the two techniques:
[Figure, two panels: individual converter current versus output current (0 to 30) for Converter 1 and Converter 2. Left panel: current sharing. Right panel: thermal load sharing.]
Hardware Redundancy and Load Sharing
Efficiency:
[Figure: efficiency (0.4 to 0.9) versus output current (0 to 30) for three methods: initial 'semi droop' sharing, current sharing and thermal load sharing. The point of lowest temperature is marked.]
The thermal load sharing efficiency follows the 'semi droop' method at low current levels
and the current sharing technique at heavier loads, but at a higher level.
Hardware Redundancy and Load Sharing
Power component loss distribution:
[Figure: bar charts of power losses in Converter 1 and Converter 2, comparing current sharing with thermal load sharing for each component: MOSFET switching, MOSFET conduction, diode, capacitor, sense resistor and inductor.]
Hardware Redundancy and Load Sharing
Temperature distribution for reliability assessment:
[Figure: temperature versus distance across the unit (heatsink, transistor, transformer, ICs, misc. components, PCB), marking TTransformer, TSurface, TIC, TEnd of PCB and TAmbient.]
λ = accumulated failure rate per unit
R = survivability
Q = unavailability
$$R(t) = 1 - \int_0^t f(\tau)\,d\tau = \int_t^\infty f(\tau)\,d\tau = \int_t^\infty \lambda e^{-\lambda \tau}\,d\tau = e^{-\lambda t}$$
$$Q(t) = \int_0^t f(\tau)\,d\tau = \int_0^t \lambda e^{-\lambda \tau}\,d\tau = 1 - e^{-\lambda t}$$
$$R_{\text{System}} = p_1 p_2 + q_1 p_2 + p_1 q_2$$
$$R_{\text{System}} = p^2 + 2 p q$$
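The system expression for two parallel converters can be checked numerically. The sketch assumes that p is the survival probability of a single converter, that q = 1 - p is its failure probability, and that the system survives as long as at least one converter works; the slide uses these symbols without defining them. The function names and the value 0.95 are hypothetical.

```python
def system_survival(p1: float, p2: float) -> float:
    """R_System = p1*p2 + q1*p2 + p1*q2 for two parallel converters."""
    q1, q2 = 1.0 - p1, 1.0 - p2
    return p1 * p2 + q1 * p2 + p1 * q2


def system_unavailability(p1: float, p2: float) -> float:
    """Probability that both converters have failed, q1*q2."""
    return (1.0 - p1) * (1.0 - p2)


if __name__ == "__main__":
    p = 0.95  # hypothetical single-converter survival probability
    print(system_survival(p, p))
    # Survival and unavailability are complementary, so this sums to 1.
    print(system_survival(p, p) + system_unavailability(p, p))
```

For identical converters the first function reduces to p² + 2pq, the second form given on the slide.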
Hardware Redundancy and Load Sharing
Annual system downtime:
• Current sharing: 10 min. 14 sec.
• Thermal load sharing: 6 min. 11 sec.
Change in unavailability (downtime):
$$\Delta Q = \frac{P_{\text{Converter1}} \cdot (P_{\text{Converter2}} - 1) - P_{\text{Converter2}} + 2 \cdot P_{\text{Thermal}} - P_{\text{Thermal}}^{2}}{(P_{\text{Converter1}} - 1) \cdot (P_{\text{Converter2}} - 1)} \cdot 100$$
Inserting values – an overall reduction of almost 40% can be
calculated.
Achieved by simply choosing a different load sharing technique.
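The quoted reduction can be reproduced from the two annual downtime figures alone. This is a small arithmetic sketch; the helper names and the 365-day year are assumptions, and the length of the year cancels in the percentage.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumed 365-day year


def downtime_seconds(minutes: int, seconds: int) -> int:
    """Convert a downtime given as minutes and seconds to seconds."""
    return 60 * minutes + seconds


def unavailability(downtime_s: float) -> float:
    """Fraction of the year during which the system is down."""
    return downtime_s / SECONDS_PER_YEAR


def reduction_percent(q_before: float, q_after: float) -> float:
    """Relative decrease in unavailability, in percent."""
    return (q_before - q_after) / q_before * 100.0


if __name__ == "__main__":
    q_current = unavailability(downtime_seconds(10, 14))  # current sharing
    q_thermal = unavailability(downtime_seconds(6, 11))   # thermal load sharing
    print(f"{reduction_percent(q_current, q_thermal):.1f} %")  # 39.6 %
```

The result, 39.6 %, agrees with the "almost 40%" stated on the slide.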
Hardware Redundancy and Load Sharing
• Two parallel-connected buck converters controlled by a
dedicated load share IC formed the basis for the
experimental verification.
• Theoretical evaluations of the experimental measurements
provided the explanation for the efficiency gain.
• Redistribution of the MOSFET transistor losses proved to be
the major contributor to the increased efficiency.
• Increased reliability – redundancy and fault tolerance
• Unequal thermal contact, differences in RDS(ON) and diode
parasitic deviations are some of the possible causes.
Partners for Advanced Transit and Highways
Research for improving the increasing transit
problems on US highways.
Outcome - Report on Power System Reliability
• Analyze critical subsystems
• Characterize failure modes
• Propose power system solution
• Laboratory implementation
• Stability assessment
• Present and document findings
• Networking
Partners for Advanced Transit and Highways
Key system components:
• Control computers
• Steering actuators
• Brake system
• Accelerometer
• Doppler radar and lidar
• Magnetometers and gyros
Partners for Advanced Transit and Highways
Power system specification:
• Single point failure free
• Overvoltage avoidance or monitoring
• Simple implementation
• Affordable system design
• Fail-safe state of operation
• Optimized reliability – minimization of down-time
Partners for Advanced Transit and Highways
Power system solution → Thermal droop load sharing
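Droop load sharing in general relies on each converter's output voltage falling as its load rises, so that parallel units settle at a common bus voltage with similar currents. The sketch below uses a plain linear droop model with an equivalent droop resistance. It illustrates the sharing principle only and is not the thermal implementation of the slides; the set points, the droop resistance and the function name are hypothetical. The model also ignores the fact that a real converter cannot sink current.

```python
from typing import List, Sequence


def droop_currents(v_set: Sequence[float], r_droop: float, i_load: float) -> List[float]:
    """Currents of parallel converters with V_i = v_set[i] - r_droop * I_i.

    All outputs share one bus voltage, and the currents sum to i_load.
    """
    n = len(v_set)
    # Summing I_i = (v_set[i] - v_bus) / r_droop over all converters
    # and setting the total equal to i_load gives the bus voltage.
    v_bus = (sum(v_set) - r_droop * i_load) / n
    return [(v - v_bus) / r_droop for v in v_set]


if __name__ == "__main__":
    # Hypothetical example: three converters with slightly different set points.
    currents = droop_currents([5.02, 5.00, 4.98], r_droop=0.03, i_load=15.0)
    print([round(i, 3) for i in currents])
```

A larger droop slope makes the currents less sensitive to set point mismatch, at the price of a larger change in output voltage over the load range.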
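[Figure: thermal droop circuit. A network of RF1, RF2, RS, RT and CT between VOUT and the converter TRIM pin forms a temperature-dependent feedback gain K(T) around the converter (VIN to VOUT).]
[Figure: output voltage versus temperature (40 to 160) at TAmbient = 40 °C, showing the nominal output voltage VOUT,nom (5.00), the ideal droop output voltage and the thermal droop output voltage between 5.15 and 4.85.]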
Partners for Advanced Transit and Highways
Overvoltage avoidance:
[Figure: overvoltage protection circuit. Labelled parts: Q1 (2N3904), Q2 (2N3906), 74F125 buffer, R1 to R4 (values shown: 2.2k, 2.2k, 2.2k, 5.6k), C1 to C3 (values shown: 10pF, 100pF, 1nF); signals on_off, trig, feedback, switch, buffer and output. The thermal droop load sharing network uses RF1 (13k), RF2 (3.9k), RS (4.3k) and thermistor RT (5.0k at 25 °C, constant 3950). A test circuit shorts individual resistors through a switch.]
[Figure: oscilloscope traces around time t1: output voltage with noise, buffer output activation voltage, feedback overvoltage, feedback returning to normal, retriggering of overvoltage.]
Overvoltage situation detected and prevented in 663 ns.
Retriggering attempts are ignored through the use of a 'one-time' latch.
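The thermistor values in the circuit (5.0k at 25 °C and a constant of 3950) suggest the common beta model for thermistor resistance. The sketch below rests on two assumptions that the slides do not confirm: that RT is an NTC thermistor, and that 3950 is its beta constant in kelvin. The function name is hypothetical.

```python
import math

KELVIN_OFFSET = 273.15


def ntc_resistance(temp_c: float, r25: float = 5000.0, beta: float = 3950.0) -> float:
    """Beta-model thermistor resistance in ohms at temp_c (deg C).

    r25 is the resistance at 25 deg C; beta is ASSUMED to be the 3950 figure
    from the slide, interpreted as a beta constant in kelvin.
    """
    t = temp_c + KELVIN_OFFSET
    t25 = 25.0 + KELVIN_OFFSET
    return r25 * math.exp(beta * (1.0 / t - 1.0 / t25))


if __name__ == "__main__":
    for temp in (25.0, 40.0, 80.0, 120.0):
        print(temp, round(ntc_resistance(temp), 1))
```

Under these assumptions the resistance falls steeply with temperature, which is the property a temperature-dependent feedback network needs in order to lower the output voltage as the converter heats up.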
Partners for Advanced Transit and Highways
Final system and measurements:
[Figure: droop voltages. Voltage droop (V, 4.5 to 5.5) versus individual converter current (A, 0 to 10) for Converter 1, Converter 2 and Converter 3.]
[Figure: current sharing. Individual converter current (A, 0 to 8) versus load current (A, 0 to 20) for Converter 1, Converter 2 and Converter 3.]
[Figure: droop resistor temperature above ambient (°C, 0 to 90) versus current through droop resistor (A, 0 to 10).]
Partners for Advanced Transit and Highways
Measurements:
[Figure: efficiencies. Efficiency (0 to 1) versus output power (W, 0 to 100) for the series resistor droop technique and the thermal droop technique; an output voltage notch is marked.]
[Figure: unavailability (0.00005 to 0.00035) versus years (1 to 5) for the series resistor droop technique and the thermal droop technique.]
Unavailability decrease: 75%
• Elimination of droop resistors
• Redistribution of currents
• Almost equal temperatures
Partners for Advanced Transit and Highways
• Alternative power system implementation – thermal droop
load sharing.
• Power system comprised of off-the-shelf units.
• Overvoltage protection and fail-safe operation.
• Performance evaluation of laboratory test setup – improved
efficiency and lower operating temperature.
• Statistical assessments provided the theoretical evidence
that the proposed technique improves the overall reliability.
• Positive impact on load sharing – elimination of dissipative
droop resistors.
Conclusion
• Several different techniques have been presented
and evaluated.
• Array-based Redundancy Management
• Digital Converter Control
• Hardware Redundancy and Load Sharing
• Thermal Droop Load Sharing
Improved reliability but still outperformed by analog control circuitry.
Simple microcontroller with lower failure rate
Improved efficiency and reliability
Relatively simple implementation
Improved efficiency and reliability
Very simple implementation
Acknowledgement
A special thanks to the following companies and
institutions: