Software and Safety: A Software Engineer's Perspective


Copyright © 2001 Praxis Critical Systems Limited

RTCA DO-178B:
Software Considerations in Airborne Systems
and Equipment Certification
• Industry-produced guidelines
– Consensus document for avionics software good practice
– Not a process document
– Not a standard
• Theoretically provides guidelines
• In practice:
– Endorsed by the FAA and JAA
– The “Bible” for civil avionics software certification
• Compliance established by DERs (Designated Engineering Representatives)
– Compliance here means process compliance
Copyright © 2001 Praxis Critical Systems Limited

Criticality Levels
• Levels assigned according to consequence of
failure
– Level A: catastrophic
– Level B: hazardous/severe-major
– Level C: major
– Level D: minor
– Level E: no effect
• DO-178B processes vary according to level
• Only A & B likely to be regarded as safety-critical
Copyright © 2001 Praxis Critical Systems Limited

Verification Activities
• For each development output, the standard requires either:
– Tool that produced that output to be qualified; or
– The output itself to be verified
• Since most tools in the object code generation
path are not qualified
– The object code itself must be verified
– Achieved by specified test coverage criteria
– Coverage can be of object code itself or of source code if
source-to-object traceability can be achieved
Copyright © 2001 Praxis Critical Systems Limited

Tool Qualification
• Software development tools
– Tools whose output forms part of the airborne software (e.g.
Compiler)
• Software verification tools
– Tools which cannot introduce errors but may fail to detect
them (e.g. Static analyser)
• Qualification of tools essentially requires tool to
be developed to same standard as system on
which it will be used
Copyright © 2001 Praxis Critical Systems Limited

Test Coverage Requirements
• Level A: MC/DC coverage
– Modified condition/decision coverage
• Level B: branch (decision) coverage
• Level C: statement coverage
• Level D/E: no specific requirements
Copyright © 2001 Praxis Critical Systems Limited

What Is MC/DC?
• Decision coverage
– each branch/path is executed
• Condition coverage
– each value affecting the decision is executed
• Modified condition/decision coverage
– each condition is shown independently to affect the decision

Example:

if A and B then
   X := X + 1;
elsif C or D then
   Y := Y + 1;
end if;

[Original slide: a truth table over the conditions of (A and B) and (C or D), comparing the number of test cases each coverage criterion requires for this example.]
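As a hedged illustration (not from the original slides; names invented), the three test vectors below achieve MC/DC for the single decision "A and B": each condition is toggled while the other is held at a value that allows it to determine the outcome, so N+1 vectors suffice for N conditions rather than the 2**N needed for multiple-condition coverage.

procedure MCDC_Demo is
   type Test_Vector is record
      A, B     : Boolean;
      Expected : Boolean;
   end record;

   --  Vectors 1 and 2 differ only in A and flip the outcome, showing A's
   --  independent effect; vectors 1 and 3 do the same for B.
   Vectors : constant array (1 .. 3) of Test_Vector :=
     (1 => (A => True,  B => True,  Expected => True),
      2 => (A => False, B => True,  Expected => False),
      3 => (A => True,  B => False, Expected => False));
begin
   for V in Vectors'Range loop
      pragma Assert ((Vectors (V).A and Vectors (V).B) = Vectors (V).Expected);
      null;
   end loop;
end MCDC_Demo;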
Copyright © 2001 Praxis Critical Systems Limited

Establishing Coverage
Copyright © 2001 Praxis Critical Systems Limited

Coverage Analysis
• Show that coverage required for level has been
achieved
• Confirm the data- and control-flow couplings
between components
• Coverage analysis can take place at source
level unless:
– The code is level A; and
– The object code is not directly traceable from the source
code
• e.g. An Ada run-time check
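As a hedged sketch (invented names, not taken from the slides) of why an Ada run-time check breaks direct source-to-object traceability: the single assignment below also causes the compiler to emit an implicit index check, and that check is object code with no source statement of its own to trace coverage back to.

package Traceability_Sketch is
   type Buffer is array (1 .. 10) of Integer;
   procedure Store (Table : in out Buffer;
                    Index : in     Positive;
                    Value : in     Integer);
end Traceability_Sketch;

package body Traceability_Sketch is
   procedure Store (Table : in out Buffer;
                    Index : in     Positive;
                    Value : in     Integer) is
   begin
      --  The compiler inserts an implicit Constraint_Error (index) check
      --  here: generated object code with no directly traceable source line.
      Table (Index) := Value;
   end Store;
end Traceability_Sketch;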
Copyright © 2001 Praxis Critical Systems Limited

Inadequate Coverage
• Shortcomings in requirements-based test cases
– Add more tests
• Inadequate requirements
– Modify requirements, add more tests as necessary
• Dead code (always semantically infeasible)
– Remove, reverify if necessary
• Deactivated code
– Show that code cannot be called in delivered configuration; or
– force test cases to execute it
Copyright © 2001 Praxis Critical Systems Limited

Coverage “Cheating”
Properties of a Tomato
function IsRed return Boolean is
begin
return True;
end IsRed;
function IsBig return Boolean is
begin
return False;
end IsBig;
function IsEdible return Boolean is
begin
return True;
end IsEdible;

3 tests required
Copyright © 2001 Praxis Critical Systems Limited

Coverage “Cheating”
Properties of a Tomato
type Properties is (Red, Big, Edible);

Tomatoes : constant array (Properties) of Boolean :=
   (Red    => True,
    Big    => False,
    Edible => True);

function GetProperty (P : Properties)
   return Boolean is
begin
   return Tomatoes (P);
end GetProperty;

1 test required
Copyright © 2001 Praxis Critical Systems Limited

DO-178B Critique
• Software correctness standard not a system safety
standard
– Cannot construct arguments to show how hazards are mitigated
– Can only state “whole box done to level A”
• Undue emphasis on test
– Little or no credit for worthwhile activity such as analysis and reviews
– Can create an environment where getting to test quickly is seen as the
prime aim
– MC/DC widely misinterpreted as “exhaustive” testing
– Deactivated code issues create problems for “good” languages like Ada
• The DER role is to establish process compliance, not system
safety
Copyright © 2001 Praxis Critical Systems Limited

Economic Arguments
• DO-178B gives little credit for anything except
testing
• The required testing is harder for Ada than C
• Therefore the best solution is to “hack in C”?
No!
• MC/DC testing is very expensive
– some estimates say x5 cost
– needs a test rig: a back-end, high-risk process
• Therefore processes that reduce errors going
into formal test are very worthwhile
Copyright © 2001 Praxis Critical Systems Limited

C130J Development Process - Principles
• Emphasis on correctness by construction (CbC)
– Focus on “doing it right first time”
– Verification-driven development
– Requirements of DO-178B kept in mind but not allowed to dominate
• “Formality” introduced early
– In specification
– In programming language
• Application domain rather than solution domain focus
– e.g. Identification of common paradigms
• Risk management
– e.g. “Thin-slice” prototyping
Copyright © 2001 Praxis Critical Systems Limited

Requirements and Specification
• CoRE (Consortium Requirements Engineering)
– Parnas tables to specify input/output mappings
• Close mapping between CoRE requirements
and code for easy tracing
• Functional test cases obtained from CoRE tables
Copyright © 2001 Praxis Critical Systems Limited

CoRE Elements
[Diagram: monitored variables in the environment are sensed by input devices and presented to the software as input data items (the IN relation); the software (SOFT) produces output data items that drive output devices and hence the controlled variables in the environment (the OUT relation).]
Copyright © 2001 Praxis Critical Systems Limited

REQ Template
MON              | MON              | CON
<value or range> | <value or range> | <f(MONs, TERMs) or value>
Copyright © 2001 Praxis Critical Systems Limited

CoRE REQ Example
[Table: a CoRE REQ table for the <engine> fuel control valve. Its rows relate the monitored variables <engine>_fuelcp_mon_xfer_class (from/off, manual to, auto to), <engine>_fuelcp_mon_auto_transfer_method, <engine>_fuelcp_mon_xfeed_class and comparisons against <engine>_fmcs_mon_fuel_level_tank_class to the controlled variable <engine>_fmcs_con_fuel_control_valve_class (open or shut), with "X" marking don't-care entries.]
Copyright © 2001 Praxis Critical Systems Limited

Implementation
• Close mapping from requirements: “one
procedure per table”
• Templates for common structures
– e.g. Obtaining raw data from the bus and providing an abstract
view of it for the rest of the system (“devices”)
• Coding in SPARK (150K SLOC)
• Static analysis as part of the coding process
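A hedged sketch (invented names and a much simplified table, not the actual C130J code) of the “one procedure per table” rule: each CoRE REQ table becomes a single SPARK subprogram whose branches mirror the table rows, keeping requirements-to-code tracing mechanical.

package Fuel_Control_Sketch is
   type Transfer_Mode is (Off, Manual_To, Auto_To, Auto_From);
   type Valve_Command is (Open, Shut);

   procedure Set_Valve (Mode      : in     Transfer_Mode;
                        Tank_Full : in     Boolean;
                        Valve     :    out Valve_Command);
   --# derives Valve from Mode, Tank_Full;
end Fuel_Control_Sketch;

package body Fuel_Control_Sketch is
   procedure Set_Valve (Mode      : in     Transfer_Mode;
                        Tank_Full : in     Boolean;
                        Valve     :    out Valve_Command) is
   begin
      if Mode = Manual_To then
         Valve := Open;               --  table row: manual transfer to tank
      elsif Mode = Auto_To and not Tank_Full then
         Valve := Open;               --  table row: auto transfer, tank not full
      else
         Valve := Shut;               --  all remaining rows
      end if;
   end Set_Valve;
end Fuel_Control_Sketch;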
Copyright © 2001 Praxis Critical Systems Limited

An Unexpected Benefit of SPARK
• SPARK requires all data interactions to be
described in annotations
• Some interactions were not clearly specified in
the CoRE tables
– e.g. Validity flags
• Coders could not ignore the problem
• Requirements/specification document became a
“live”, dynamic document as ambiguities were
resolved during coding
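A hedged sketch (invented names) of the kind of annotation that forced the issue: the derives clause makes the coder state that the output depends on the validity flag as well as the raw value, a dependency the CoRE tables had left implicit.

package Heading_Input is
   procedure Read_Heading (Raw      : in     Integer;
                           Valid    : in     Boolean;
                           Heading  :    out Integer;
                           Degraded :    out Boolean);
   --# derives Heading  from Raw, Valid &
   --#         Degraded from Valid;
end Heading_Input;

package body Heading_Input is
   procedure Read_Heading (Raw      : in     Integer;
                           Valid    : in     Boolean;
                           Heading  :    out Integer;
                           Degraded :    out Boolean) is
   begin
      if Valid then
         Heading  := Raw;    --  use the bus value only when flagged valid
         Degraded := False;
      else
         Heading  := 0;      --  hypothetical fallback value
         Degraded := True;
      end if;
   end Read_Heading;
end Heading_Input;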
Copyright © 2001 Praxis Critical Systems Limited

Integrated Formality
[Diagram: the CoRE specification drives both the test cases and, via the templates, the SPARK code, which is then subject to static analysis and testing.]
Copyright © 2001 Praxis Critical Systems Limited

The Value of Early Static Analysis
Example - an Aircrew Warning System
A specification for this function might be:

   mon_Event   con_Bell
   Warning     True
   Caution     True
   Advisory    False

An (incorrect) implementation might be:

type Alert is (Warning, Caution, Advisory);
function RingBell(Event : Alert) return Boolean
is
Result : Boolean;
begin
if Event = Warning then
Result := True;
elsif Event = Advisory then
Result := False;
end if;
return Result;
end RingBell;
Copyright © 2001 Praxis Critical Systems Limited

Example Contd. - Test Results
• There is a very high probability that this code would pass MC/DC
testing
• The code returns True correctly in the case of Event=Warning
and False correctly in the case of Event=Advisory
• In the case of Caution it returns a random value picked up from
memory; however, there is a very high probability that this random
value will be non-zero and will be interpreted as True, which is the
expected test result
• The test result may depend on the order the tests are run (testing
Advisory before Caution may suddenly cause False to be
returned for example)
• The results obtained during integration testing may differ from those
obtained during unit testing
Copyright © 2001 Praxis Critical Systems Limited

Example Contd. - Examiner Output
 13  function RingBell(Event : Alert) return Boolean
 14  is
 15     Result : Boolean;
 16  begin
 17     if Event = Warning then
 18        Result := True;
 19     elsif Event = Advisory then
 20        Result := False;
 21     end if;
 22     return Result;
            ^1
??? (  1)  Warning : Expression contains reference(s) to
           variable Result, which may be undefined.
 23  end RingBell;
??? (  2)  Warning : The undefined initial value of Result
           may be used in the derivation of the function value.
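A hedged sketch of one way the fault could be removed (not taken from the course material): a case statement over Alert forces every alert class, including Caution, to be handled explicitly, so Result can never be returned undefined.

function RingBell (Event : Alert) return Boolean is
begin
   case Event is
      when Warning | Caution => return True;   --  bell rings for both
      when Advisory          => return False;  --  no bell for advisories
   end case;
end RingBell;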
Copyright © 2001 Praxis Critical Systems Limited

The Lockheed C130J and C27J Experience
• The testing required by DO-178B was greatly
simplified:
– “Very few errors have been found in the software during
even the most rigorous levels of FAA testing, which is
being successfully conducted for less than a fifth of the
normal cost in industry”
• Savings in verification are especially valuable
because it is a large part of the overall cost
– “This level A system was developed at half of typical cost
of non-critical systems”
• Productivity gains:
– x4 on C130J compared to previous safety-critical projects
– x16 on C27J with re-use and increased process maturity
Copyright © 2001 Praxis Critical Systems Limited

C130J Software Certification
[Diagram: the C130J software was certified to DO-178B under the FAA for civil use and, with the RAF as lead customer, was also subjected to a UK MoD IV&V programme for military use, allowing comparisons between the two regimes.]
Copyright © 2001 Praxis Critical Systems Limited

UK MoD IV&V Programme
• The UK MoD commissioned Aerosystems
International to perform retrospective static
analysis of all the C130J critical software
• A variety of tools used: e.g.
– SPARK Examiner for “proof” of SPARK code against CoRE
– MALPAS for Ada
• All “anomalies” investigated and “sentenced”
by system safety experts
• Interesting comparisons obtained
Copyright © 2001 Praxis Critical Systems Limited

Aerosystems’ IV&V Conclusions
• Significant, safety-critical errors were found by
static analysis in code developed to DO-178B
Level A
• Proof of SPARK code was shown to be cheaper
than other forms of semantic analysis
performed
• SPARK code was found to have only 10% of the
residual errors of full Ada and Ada was found to
have only 10% of the residual errors of C
• No statistically significant difference in residual
error rate could be found between DO-178B
Level A, Level B and Level C code
Copyright © 2001 Praxis Critical Systems Limited

Language Metrics
[Bar chart: SLOC per anomaly (scale 0 to 300) found by the IV&V programme for each C130J system (HUD, IDP, ECBU, DADS, FOD, NIU, BAECS, GCAS, FMCS, BAU II), grouped by implementation language - SPARK Ada, Ada, LUCOL, C, Assembler, PLM - with the project average marked. About 1% of the anomalies have safety implications.]
Copyright © 2001 Praxis Critical Systems Limited

Resources
• RTCA-EUROCAE: Software Considerations in Airborne Systems and Equipment Certification. DO-178B/ED-12B. 1992.
• Croxford, Martin and Sutton, James: Breaking through the V&V Bottleneck. Lecture Notes in Computer Science, Volume 1031, Springer-Verlag, 1996.
• Software Productivity Consortium. www.software.org
• Parnas, David L: Inspection of Safety-Critical Software Using Program-Function Tables. IFIP Congress, Vol. 3, 1994, pp. 270-277.
• Sutton, James: Cost-Effective Approaches to Satisfy Safety-Critical Regulatory Requirements. Workshop Session, SIGAda 2000.
• German, Andy and Mooney, Gavin: Air Vehicle Static Code Analysis - Lessons Learnt. In Aspects of Safety Management, Redmill & Anderson (Eds), Springer, 2001. ISBN 1-85233-411-8.
Copyright © 2001 Praxis Critical Systems Limited

Programme
• Introduction
• What is High Integrity Software?
• Reliable Programming in Standard Languages
– Coffee
• Standards Overview
• DO178B and the Lockheed C130J
– Lunch
• Def Stan 00-55 and SHOLIS
• ITSEC, Common Criteria and Mondex
– Tea
• Compiler and Run-time Issues
• Conclusions
Copyright © 2001 Praxis Critical Systems Limited

Outline
• UK Defence Standards 00-55 and 00-56
– What are they?
– Who’s using them?
• Main Principles & Requirements
• Example Project - the SHOLIS
Copyright © 2001 Praxis Critical Systems Limited

Defence Standards 00-55 and 00-56
• “Requirements for Safety Related Software in
Defence Equipment”, UK Defence Standard 00-55
– Interim issue 1991
– Final issue 1997
• “Safety Management Requirements for Defence
Systems” UK Defence Standard 00-56, 1996
• Always used together
Copyright © 2001 Praxis Critical Systems Limited

Who’s using 00-55 and 00-56
• Primarily UK-related defence projects
• SHOLIS (more of which later…)
• Tornado ADV PS8 MMS
• Harrier II SMS
• Ultra SAWCS
• Lockheed C130J (via a mapping from DO-178B)
• Has (sadly) little influence in the USA
Copyright © 2001 Praxis Critical Systems Limited

UK MOD 00-56
• System-wide standard
• Defines the safety analysis process to be used
to determine safety requirements, including
SILs
• SILs are applied to components
• Promotes an active, managed approach to
safety and risk management
Copyright © 2001 Praxis Critical Systems Limited

Def. Stan. 00-56 Main Activities
[Process diagram showing the main Def Stan 00-56 activities: Safety Programme Plan, Hazard Identification and Refinement, Risk Estimation, Safety Compliance Assessment, Safety Verification, Corrective Action, Safety Requirements, Hazard Log, and Safety Case Construction.]
Copyright © 2001 Praxis Critical Systems Limited

UK Defence Standard 00-55
• Applied to software components of various SILs
• Defines process and techniques to be followed
to achieve level of rigour appropriate to SIL
• A Software Safety Case must be produced:
– to justify the suitability of the software development
process, tools and methods
– to present a well-organized and reasoned justification, based
on objective evidence, that the software does or will satisfy
the safety aspects of the Software Requirement
Copyright © 2001 Praxis Critical Systems Limited

00-55: Software Development Methods
Technique/Measure      SIL2   SIL4
Formal Methods          J1     M
Structured Design       J1     M
Static Analysis         J1     M
Dynamic Testing         M      M
Formal Verification     J1     M

M  = “Must be applied”
J1 = “Justification to be provided if not followed.”
Copyright © 2001 Praxis Critical Systems Limited

Comparison With DO-178B
• RTCA DO-178B
– MC/DC testing
– Comparatively little about process
– Level A to Level E
– Focus on software correctness
• Def Stan 00-55
– Heavy emphasis on formal methods & proof
– Still stringent testing requirements
– S1 to S4
– Focus on software safety
Copyright © 2001 Praxis Critical Systems Limited

00-55 and Independent Safety Audit
• 00-55 requires a project to have an Independent
Safety Auditor (ISA)
• At SIL4, this engineer must come from a company
independent of the developers.
• The ISA carries out:
– Audit - of actual process against plans
– Assessment - of product against safety requirements
• ISA has sign-off on the safety-case.
• Compare with role of DO-178B DER.
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS
• The Ship/Helicopter Operating Limits
Instrumentation System.
• Assesses and advises on the safety of
helicopter flying operations on board RN and
RFA vessels.
• Ship/Helicopter Operating Limits (SHOLs) for
each ship/helicopter/scenario in a database.
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS - The Problem
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS background
• Power Magnetics and Electronic Systems Ltd.
(now part of Ultra Electronics Group) - Prime
Contractor.
– System architecture and Hardware. (NES 620)
• Praxis Critical Systems Ltd. - all operational
software to Interim Def-Stan. 00-55 (1991)
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS block diagram
[Block diagram: DPU1 and DPU2 are connected to the Ship’s Data Bus and UPSs; the CLM selects between the DPUs and drives the FDDU and BDU displays, with the ABCB able to override the selection.]
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS LRUs
• DPU1, DPU2 - Data Processing Units.
• BDU - Bridge display unit
• FDDU - Flight-deck display unit
• CLM - Change-over logic module
• ABCB - Auxiliary Bridge Control Box
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS FDDU on test...
[Photograph: the SHOLIS FDDU under test. © Ultra Electronics 1999]
Copyright © 2001 Praxis Critical Systems Limited

Fault tolerance
• Fault tolerance by hot-standby.
• Only one DPU is ever actually controlling the
displays.
• CLM is high-integrity H/W and selects which
DPU is “selected”, but the ABCB switch can
override the choice.
• DPUs run identical software and receive same
inputs.
Copyright © 2001 Praxis Critical Systems Limited

Fault tolerance(2)
• Unselected DPU outputs to display, but display
simply ignores the data.
• DPUs are not synchronised in any way - both
run free on their own clock source.
• DPUs at either end of ship in case one becomes
damaged.
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS history
• Phase 1
– Prototype demonstrating concept, U/I and major S/C
functions. Single DPU and display. Completed 1995
• Phase 2
– Full system. Software to Interim Def-Stan. 00-55.
Commenced 1995.
• Trials
– Autumn 1998 onwards
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS software - challenges
• First system to Interim DS 00-55
– Many risks:
• Scale of Z specification
• Program refinement and proof effort
• Complex user interface
• Real-time, fault-tolerance and concurrency
• Independent V&V and Safety Authority
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS - Exceptions from Interim 00-55
• A very small number of exceptions from 00-55:
• No formal object code verification (beyond state of art?)
• No diverse proof checker (no such tool!)
• No “back-to-back” test against an “executable prototype”
(what does this mean?)
Copyright © 2001 Praxis Critical Systems Limited

Software process (simplified)
Requirements : English
SRS          : Z & English
SDS          : SPARK, Z, English
Code         : SPARK
Copyright © 2001 Praxis Critical Systems Limited

Documents
• Requirements - over 4000 statements, all
numbered.
• SRS - c. 300 pages of Z, English, and Maths.
• SDS - adds implementation detail: refined Z
where needed, scheduling, resource usage,
plus usual “internal documentation.”
Copyright © 2001 Praxis Critical Systems Limited

The code...
• Approx. 133,000 lines:
• 13,000 declarations
• 14,000 statements
• 54,000 SPARK flow annotations
• 20,000 SPARK proof annotations
• 32,000 comment or blank
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS Z specification - example
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS code - example - specification
------------------------------------------------------------------
-- Up10
--
-- Purpose:
--   Incrementally cycles the tens digit of the given integer.
--
-- Traceunit: ADS.InjP.Up10.SC
-- Traceto  : SDS.InjP.Up10.SC
------------------------------------------------------------------
function Up10 ( I : in InjPTypes.HeadingValTypeT )
  return BasicTypes.Natural32;
--# return Result =>
--#   (((I/10) mod 10) /= 9 -> Result = I+10) and
--#   (((I/10) mod 10)  = 9 -> Result = I-90) and
--#   Result in BasicTypes.Natural32;
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS - code example - body
-- Worst case serial chars : 0;
-- Worst case time (est)   : 4*LINES;
--
-- Traceunit: ADB.InjP.Up10.SC
-- Traceto  : SDS.InjP.Up10.SC
------------------------------------------------------------------
function Up10 ( I : in InjPTypes.HeadingValTypeT )
  return BasicTypes.Natural32
is
   R : BasicTypes.Natural32;
begin
   if ((I/10) mod 10) /= 9 then
      R := I+10;
   else
      R := I-90;
   end if;
   return R;
end Up10;
Copyright © 2001 Praxis Critical Systems Limited

SHOLIS code example - proof
• Up10 gives rise to 5 verification conditions - 3
for exception freedom (can you spot them all?!)
and 2 for partial correctness.
• The SPADE Simplifier proves 3 of them
automatically.
• The remaining 2 depend on knowing:
– Integer32’Base’Range
– ((I >= 0) and ((I/10) mod 10 = 9)) -> (I >= 90)
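A hedged sketch of the reasoning behind the second fact (Ada integer division truncates towards zero, so for I >= 0 it behaves as floor; the first fact, Integer32’Base’Range, is presumably what bounds the I+10 branch):

\[
I \ge 0 \;\wedge\; \Big(\big\lfloor \tfrac{I}{10} \big\rfloor \bmod 10\Big) = 9
\;\Rightarrow\; \big\lfloor \tfrac{I}{10} \big\rfloor \ge 9
\;\Rightarrow\; I \ge 90
\;\Rightarrow\; I - 90 \ge 0,
\]

so the result of the else branch stays within BasicTypes.Natural32.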
Copyright © 2001 Praxis Critical Systems Limited

Verification activities(1)
• SRS - review plus Z proof
• Consistency of global variables and constants, existence
of initial state, derivation of preconditions, safety
properties: (CADiZ and “formal partial proof”)
• Traceability
• SDS - review, more Z proof, traceability.
Copyright © 2001 Praxis Critical Systems Limited

Verification activities(2)
• Code
– Development team:
• SPARK Static analysis. Worst-case timing, stack use,
and I/O. Informal tests. Program proof.
– IV&V Team:
• Module and Integration Tests based on Z specifications
and requirements. 100% statement and MC/DC coverage.
(AdaTest plus custom tools)
Copyright © 2001 Praxis Critical Systems Limited

Verification activities(3)
• IV&V Team (cont…)
• System Validation Tests based on requirements.
• Acceptance test.
• Endurance test.
• Performance test.
• Review of everything…including code refinement, proof
obligations, proof scripts, proof rules, documents...
Copyright © 2001 Praxis Critical Systems Limited

Traceability
• Every requirement, SRS paragraph, SDS
paragraph, Z schema, SPARK fragment etc. is
identified using a consistent naming scheme.
• “Trace-units” map between levels.
• Automated tools check completeness and
coverage.
Copyright © 2001 Praxis Critical Systems Limited

Static analysis
• Timing - via comments giving WCET in terms of
statements executed and unit calls.
• PERL Script collects and evaluates.
• Execution time of “1 statement” obtained by experiment
+ safety margin.
• Actually quite useful! Not as good as a “real” WCET
tool, but the best we could do.
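A hedged sketch (invented names and figures, following the comment convention shown in the Up10 example) of how the WCET comments compose along the call tree before the Perl script sums them:

package Timing_Sketch is
   procedure Sample_Channel;
   procedure Sample_Both_Channels;
end Timing_Sketch;

package body Timing_Sketch is

   -- Worst case time (est) : 5;
   procedure Sample_Channel is
   begin
      null;  --  placeholder body; counted here as 5 "statements"
   end Sample_Channel;

   -- Worst case time (est) : 2*Sample_Channel + 3;
   procedure Sample_Both_Channels is
   begin
      Sample_Channel;   --  two calls, hence the 2*Sample_Channel term
      Sample_Channel;
   end Sample_Both_Channels;

end Timing_Sketch;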
Copyright © 2001 Praxis Critical Systems Limited

Static analysis(2)
• Display output
• Output to display is via a 38400 baud serial line. Limited
bandwidth.
• A problem is either the display not being updated rapidly enough, or
buffer overflow.
• Again, expressions are given in comments giving the worst-case
number of characters that can be queued by every procedure.
PERL script evaluates.
• Some units very carefully programmed to minimize output.
Copyright © 2001 Praxis Critical Systems Limited

Static analysis(3)
• Stack usage
– SPARK is non-recursive.
– Careful coding to avoid large temporary objects and heap
allocation.
– Static analysis of generated code yields call tree and worst-case stack usage.
Copyright © 2001 Praxis Critical Systems Limited

Project status: January 2001
• System has passed all System Validation,
Endurance, and Performance Tests.
• Acceptance Test with customer completed with
no problems.
• Sea trial during Autumn 1998.
• Some cosmetic changes requested, but no
reported faults, crashes, or unexpected
behaviour.
Copyright © 2001 Praxis Critical Systems Limited

Some metrics
• Faults
• Any deliverable, once handed to IV&V, goes under
“change control” - no change unless under a formal
fault-report or change-request.
• All fault reports classified - who found it, what project
phase, what impact etc. etc.
• Faults range from simple clerical errors to observable
system failures.
Copyright © 2001 Praxis Critical Systems Limited

Fault metrics
Project Phase                   Faults Found (%)   Effort (%)
Specification                          3               5
Z Proof                               12               3
Design, Code & Informal Test          28              19
Unit & Integration Test               21              26
System Validation Tests               32             9.5
Acceptance Test                        0             1.5
Code Proof                             4               4

Copyright © 2001 Praxis Critical Systems Limited

SHOLIS - Faults Found vs. Effort

[Chart: for each project phase (Specification, Z Proof, High-level Design, Code, Unit Test, Integration Test, Code Proof, System Validation, Acceptance, Other), the number of faults found (scale 0 to 100) and the efficiency of the phase (scale 0 to 0.6).]

Copyright © 2001 Praxis Critical Systems Limited