Transcript
Software in Practice
a series of four lectures on why software projects fail, and what you can do about it - with particular emphasis on safety-critical systems
Martyn Thomas
Founder: Praxis High Integrity Systems Ltd Visiting Professor of Software Engineering, Oxford University Computing Laboratory
Lecture 1:
What is the problem with software?
The state of practice
Scale
Complexity
What does testing tell us?
When I started in 1969 ...
IBM 360/65 Computing service for 1000s of users.
Now I have more computing power in my ‘phone.
The Software Crisis
First digital computer: Manchester, 1948
First commercial computer: LEO, 1951
We are still in the very early stages of software engineering ...
… like studying civil engineering when Archimedes was still alive!
NATO Software Engineering conferences in 1968 and 1969 to address the growing crisis in software dependability.
1972 Turing Award Lecture
“The vision is that, well before the 1970s have run to completion, we shall be able to design and implement the kind of systems that are now straining our programming ability at the expense of only a few percent in man-years of what they cost us now, and that besides that, these systems will be virtually free of bugs.”
E W Dijkstra
Software in the 21st Century
Fifty years on, yet still at the beginning.
We are planning drive-by-wire cars, guiding themselves on intelligent roads.
We are dreaming if we believe we can build such real-world systems safely, with today’s attitudes to software engineering.
We have still not achieved Dijkstra’s vision of thirty years ago!
Thirty years later…
… most computing system projects fail
Project cancellation
Major cost or time overrun
Much less functionality than planned
Inadequate security
Major usability problems
Excessive maintenance / upgrade costs
Serious in-service failure
I’ll talk about some specific failures in later lectures
most software projects fail
Cancelled before delivery: 31%
Exceeded timescales & costs, or greatly reduced functionality: 53%
On time and budget: 16%
Mean time overrun: 190%
Mean cost overrun: 222%
Mean functionality delivered: 60%
Large companies did much worse than smaller ones; more recent figures are better, but still poor.
Source: The Chaos Report (1995), http://www.standishgroup.com
most computing projects fail
Of 1027 projects, 130 (12.7%) succeeded. Of those 130:
2.3% were development projects
18.2% were maintenance projects
79.5% were data-conversion projects
Of the 500+ development projects in the sample, 3 (0.6%) succeeded.
Source: BCS Review 2001 page 62.
Why does it happen?
Because:
scale matters. Small processes don’t scale up
process matters. Most developers lack discipline
rigour matters. Most developers are afraid of mathematics
engineering is conservative, whereas the software industry is ruled by fashion (CAA licensing system; C vs Ada at Lockheed Martin; eXtreme this, Agile that ...)
Who can make things better? You!
Scale
How many valid paths are there through a 200-line module?
We have found around 750,000.
How big are modern systems?
Windows is ~100M LoC; Oracle talk about a “gigaLoC” code base.
How many paths is that? How many do you think they have tested?
What proportion will ever be executed?
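The path explosion behind these questions can be made concrete with a small dynamic program over a control-flow graph. This is a hypothetical illustration (the graph shape, node names and `count_paths` helper are invented for the sketch, not the 200-line module from the lecture):

```python
from functools import lru_cache

def count_paths(cfg, entry, exit_node):
    """Count distinct entry->exit paths in an acyclic control-flow graph.

    cfg: dict mapping each node to a list of its successor nodes.
    """
    @lru_cache(maxsize=None)
    def paths_from(node):
        if node == exit_node:
            return 1
        # Paths from a node = sum of paths from each successor.
        return sum(paths_from(succ) for succ in cfg.get(node, ()))

    return paths_from(entry)

# Build a toy module: 20 sequential, independent if/else branches.
# Each branch node b_i splits into t_i / f_i, which rejoin at b_{i+1}.
cfg = {}
for i in range(20):
    cfg[f"b{i}"] = [f"t{i}", f"f{i}"]
    nxt = f"b{i+1}" if i < 19 else "exit"
    cfg[f"t{i}"] = [nxt]
    cfg[f"f{i}"] = [nxt]

print(count_paths(cfg, "b0", "exit"))  # 2**20 = 1048576
```

Twenty independent branches in sequence already give over a million paths; a real 200-line module with loops and nested conditions grows far faster, which is why exhaustive path testing is hopeless at scale.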
A medium-scale system: En Route ATC at Swanwick
RS 6000 workstations
Control Room
Airspace
[Map: NERC sectorisation / equivalent LATCC sector names, published 20 June 2001; not for operational use]
A medium-sized system
114 controller workstations
20 supervisory/management positions
10 engineering positions
48-workstation simulator
2 × 15-workstation test systems
2.5 million lines of software
>500 processors
Operational data
1,667,381 flights in 2002.
Continuous operation, with one 3-hour failure (other flight delays were caused by NAS failures at West Drayton).
Challenges for the future
Current ATC safety depends on the controller’s ability to clear their sector with radio only.
Future traffic growth requires more than 10 aircraft on frequency; controllers would be overloaded.
So future ATC will depend on automatic systems, which must not fail.
Target? At least the avionics standard: 10^-8 pfh.
No current air traffic management systems are built to such standards.
This could be your job in 3 years’ time.
How can we be sure a system works?
Assurance: showing that a system works.
Much harder than just developing a system that works:
you need to generate evidence that it works.
What evidence is sufficient?
How safe or reliable is a system that has never failed?
What evidence does testing provide?
How can we do better?
How safe is a system that has never failed?
If it has run for n hours without failure, and if the operating conditions remain much the same, the best estimate for the probability of failure in the next n hours is 0.5.
To show that a system has a pfh of < 10^-4 with 50% confidence, we need about 14 months of fault-free testing (10,000 hours is 13.89 months).
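The arithmetic behind these figures can be sketched in a few lines, using the slide's rule of thumb that claiming pfh < p at 50% confidence takes roughly 1/p fault-free hours. (An exact exponential model would give ln 2 / p, about 6,900 hours here; the slide rounds to 1/p.) The function name and the 720 hours-per-month conversion are assumptions for illustration:

```python
# Rule of thumb from the slide: n failure-free hours support only a 50%
# estimate that the system survives the next n hours, so claiming
# pfh < p at ~50% confidence takes roughly 1/p hours of fault-free
# operation. (fault_free_hours_for and 720 h/month are illustrative.)

def fault_free_hours_for(pfh_bound):
    """Fault-free test hours needed to claim pfh < pfh_bound at ~50% confidence."""
    return 1.0 / pfh_bound

hours = fault_free_hours_for(1e-4)   # 10,000 hours
months = hours / 720                 # 30-day months: ~13.89
print(f"{hours:.0f} hours = {months:.2f} months")
```

Running the sketch reproduces the slide's figures: 10,000 hours, i.e. about 14 months of continuous fault-free operation, just for a modest 10^-4 bound at even odds.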
What evidence does testing provide?
“Testing shows the presence, not the absence, of bugs” - Dijkstra
We cannot test every path.
Testing individual operations or boundary conditions may find faults, but such tests provide no evidence of pfh.
Statistical testing, under operational conditions, provides evidence of pfh. But it takes a very long time.
Statistical testing
To show an MTBF of n hours, with 99% confidence, takes around 10n hours of testing with no faults found.
So avionics (10^-8 pfh) would need around 10^9 hours (>100,000 years).
With good prior evidence, e.g. from a strong process, a Bayesian approach may reduce this to ~10,000 years.
Actual testing is trivially short by comparison.
Summary
Developing reliable software is difficult because of the size and complexity of real-life systems.
The software industry is very young, amateurish and immature. Most significant projects overrun dramatically (and unnecessarily) or totally fail.
In future lectures, I will explore why some failures have occurred (Therac, Ariane, LAS, Taurus …) and talk about what you need to know if you are to become a professional amongst all these amateurs.