The Verifying Compiler: a Grand Challenge for Computing


The Verifying Compiler: a Grand Challenge for Computing Research

Tony Hoare, Leiden, 5 November 2003

Typical Grand Challenges

Prove Fermat’s last theorem (accomplished)
Put a man on the moon (accomplished)
Cure cancer within ten years (failed in 1970s)
Map the Human Genome (accomplished)
Map the Human Proteome (too difficult now)
Find the Higgs boson (in progress)
Find Gravity waves (in progress)
Unify the four forces of Physics (in progress)
Hilbert’s program for math foundations (abandoned 1930s)

In Computing Science

Prove that P is not equal to NP (open)
The Turing test (outstanding)
The verifying compiler (abandoned in 1970s)
A championship chess program (completed 1997)
A GO program at professional standard (too hard)
Machine translation English to Russian (failed in 1960s)

A Grand Challenge

• Is a fifteen-year project
• With world-wide participation,
• And clear test of success or failure.
• It offers fundamental and radical advance
• In basic Science or Engineering.

A Grand Challenge needs

• Maturity of the state of the art
• General support from the international scientific community
• Long-term commitment from the teams who engage in it
• Understanding from funding agencies

The Verifying Compiler

A verifying compiler uses automated mathematical and logical reasoning to check the correctness of the programs that it compiles. Correctness is specified by types, assertions, and other redundant annotations that are associated with the code of the program.
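As a rough illustration of what such annotations can look like, the sketch below attaches a type signature, a precondition, a loop invariant and a postcondition to a small function. The function name `integer_sqrt` and the choice of Python are assumptions made for this example only. Python checks these assertions at run time; a verifying compiler would aim to prove them before the program ever runs.

```python
def integer_sqrt(n: int) -> int:
    """Return the largest integer r such that r*r <= n."""
    # Precondition (annotation): the argument must be non-negative.
    assert n >= 0, "precondition: n >= 0"

    r = 0
    while (r + 1) * (r + 1) <= n:
        # Loop invariant (annotation): r*r never exceeds n.
        assert r * r <= n
        r += 1

    # Postcondition (annotation): r is the integer square root of n.
    assert r * r <= n < (r + 1) * (r + 1)
    return r


if __name__ == "__main__":
    print(integer_sqrt(10))
```

The annotations are redundant in the sense used above: they restate what the code is meant to do, so a checker can compare the two.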

Test of success

• On completion of the project, significant and representative samples of software products will be mechanically verified.
• Each sample will be suitable to replace existing software in routine use, and to serve as a basis for further software evolution.
• A prototype verifying compiler will be available as part of a software engineering toolset.

GC criteria

• Fundamental
• Historical
• Astonishing
• Idealistic
• Inspiring
• Beneficial
• Revolutionary
• Feasible
• Risky
• Rare

Fundamental

• How does a software system work?
• Annotation of interfaces explains how.
• Why does it work?
• The theory of programming explains why.
• A verifying compiler checks the correctness of the answers…
• And enables the engineer to exploit the basic science.

Historical

• The prestigious challenges are those which were formulated long ago; without concerted effort, they would be likely to stand for many years to come.
• The challenge of program verification goes back to Turing (1948), McCarthy (1962), Floyd (1967).

Idealistic

• The project does not duplicate commercially motivated evolution of existing products.
• Commercial tools follow market demand, and discover more and more faults; only academic research pursues ideals of purity, accuracy, completeness and correctness.

Astonishing

• It gives scope for engineering ambition to build something useful that was earlier thought impractical.
• It is amazing that computers can check the correctness of their own programs, using logical proof in the same way as mathematicians through the ages.

Testable

• The project has a clear measure of success or failure at the end; and ideally, at intermediate stages too.

A verifying compiler will certify total correctness of embedded software up to 10k lines, the safety of critical systems up to 100k lines, and the soundness and security of software up to a million lines. Many subtle bugs will be found and removed.

Inspiring

• The goals are generally comprehensible, and capture the imagination of the general public, as well as the esteem of scientists in other disciplines.
• The general public is well aware of the problem of software errors, and should welcome an attempt by computer scientists to solve a problem attributed to their own creation.

Beneficial

• The understanding and knowledge gained after completion of the project could bring scientific, economic or social benefits.
• Reduction in program errors could save $22-60 billion per year in the US (US Dept. of Commerce Planning Report 02-03, May 2002).

Revolutionary

• The project involves a paradigm shift in scientific research practices.
• At present large-scale long-term projects are rare among computer scientists. So is co-operation between theorists, tool-builders and tool users.

The team must include …

• Programming theorists
• Programming tool-set builders
• Compiler writers and optimisers
• Sympathetic users
• Open source code contributors
• Proof-tool builders, model checkers,…
• Teachers and students can help

Feasible

• The reasons for previous failure to meet the challenge are well understood and believable plans are under way to overcome them.
• Gigabytes and Gigacycles are now cheap
• Beneficiaries number in billions
• The state of the art is much advanced

State of the art

• Smart-card applications have been manually proved (e.g. Logica).
• Safety-critical systems have been developed from specification (e.g. Praxis).
• Commodity software includes many assertions (e.g. Microsoft Office).
• Open Source software is freely available for research as well as use (e.g. Apache).
• Programming theories cover O-O and concurrency (e.g. this conference).

Available Tools

• Assertion generators (e.g. DAIKON)
• Program analysers (e.g. PREfix, SPLINT)
• Abstract Syntax Tree compiler (e.g. PREfast)
• Verification Condition Generator (e.g. ESC)
• Program Development Environment (e.g. B)
• Theorem provers (e.g. simplify, HOL)
• Decision procedures (e.g. SAT, PVS)
• Model checkers (e.g. SPIN, FDR)

Risks

• Poor quality of legacy code/languages.
• Errors are just missing preconditions.
• Errors are exploited for functionality or compatibility reasons.
• Spec of external interfaces impractical.
• Build/configuration files can’t be proved.
• Multiple languages in a single application.
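One of the risks listed above, that reported errors turn out to be just missing preconditions, can be sketched as follows. The function names `average` and `average_checked` are hypothetical and chosen only for illustration. A checker would flag the first version for a possible division by zero; the second version removes the report by stating the precondition the author had in mind all along, without changing the computation.

```python
from typing import Sequence


def average(values: Sequence[float]) -> float:
    # No annotation: a checker must assume values may be empty,
    # so it reports a possible division by zero here.
    return sum(values) / len(values)


def average_checked(values: Sequence[float]) -> float:
    # The "error" was only an unstated precondition: callers are
    # expected to pass at least one value.
    assert len(values) > 0, "precondition: values must be non-empty"
    return sum(values) / len(values)


if __name__ == "__main__":
    print(average_checked([1.0, 2.0, 3.0]))
```

If most reported faults are of this kind, the effort shifts from repairing code to writing down the assumptions that callers already respect.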

Rare

• Requires maturity of the Science
• (but not too mature)
• Requires general support of the many
• Long-term commitment of the few
• Sympathy from funding agencies
• It is hard to start the bandwagon

Early decisions

• What language(s)?
• What compiler/loaders/run-time checkers?
• Which particular applications?
  • Smartcard
  • Embedded
  • Critical
  • Commodity
• What collaborators?

Timetable

• 2005 start of project
• 2010 smartcard software proved correct
• 2015 critical applications proved safe
• 2020 commodity software proved secure

Acknowledgements

• Jim Woodcock
• Greg Morrisett
• Jay Misra
• Peter O’Hearn
• Richard Bornat
• Carl Gunter
• and many others