[SystemSafety] Scalability of Formal Methods [was: Logic]

Peter Bernard Ladkin ladkin at rvs.uni-bielefeld.de
Wed Feb 26 22:33:42 CET 2014


I'm glad that Patrick brought up the practical complexity of rigorous verification. 

I spent some time this afternoon looking at the work to which David referred, the ClearSy application of Atelier B to the Flushing line signalling systems (he said Canarsie, but I only found details of Flushing). I was thinking of 200,000 lines of Ada source, formally verified with all proof obligations discharged. And although I acknowledge the power of math, I am sceptical that someone has shown mathematically that the Flushing line signalling systems could never ever go wrong. The question, as unfortunately so often with claims of formality, is what in fact has been shown: what has actually been shown not to go wrong, ever, and what remains that is "allowed" to go wrong and must be assessed by other methods?

The material that I found and read talked about a "system level" verification. What that seems to mean is that high-level safety requirements were formulated, a series of switching algorithms was formulated along with the kit that executes certain functions (trip switches, for example), and the switching algorithms were shown to conform to ("refine") the safety requirements. Good, necessary stuff! But, as Patrick pointed out and Martyn confirmed, it is far from demonstrating that the object code fulfils the safety requirements, let alone that the hardware operates correctly and that no switch ever gets frozen in an inappropriate position in winter.
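
To make concrete what "shown to refine the safety requirements" can amount to, here is a minimal sketch in Lean 4. It is entirely my own toy, with invented names (State, safe, step), and has nothing to do with the actual Atelier B models; it merely illustrates the shape of such a discharged obligation: an invariant (the safety requirement) and a proof that the switching algorithm preserves it.

    -- Toy state: two conflicting signals. The safety requirement is that
    -- they are never both green. (Illustrative only; not the real models.)
    structure State where
      signalA : Bool
      signalB : Bool

    def safe (s : State) : Prop := (s.signalA && s.signalB) = false

    -- A toy "switching algorithm": granting one signal always drops the other.
    def step (s : State) (grantA : Bool) : State :=
      match grantA with
      | true  => { signalA := true,  signalB := false }
      | false => { signalA := false, signalB := s.signalB }

    -- The proof obligation: every step of the algorithm preserves the requirement.
    theorem step_safe (s : State) (grantA : Bool) : safe (step s grantA) := by
      cases grantA <;> rfl

Note what such a theorem is silent about: the compiler, the object code, the hardware, the frozen switch.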

And, as far as I read, it is also some way from showing that the source code actually implements the switching algorithms. Especially since the expression of the algorithms involves lots of objects (all talked about in the same language), whereas the source code has to be written for each individual object/actor in the system (and is most likely single-threaded), so the functions have to be partitioned out. There is no word about how the functions for the individual actors are partitioned, or about whether this partitioning is correct. That, of course, is a crucial step in refining a many-actor system description into code for multiple actors.
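
Again as a toy, here is what a "partitioning is correct" obligation could look like in Lean 4 (invented names, nothing from the ClearSy material): the global algorithm over two actors must equal the recomposition of the per-actor functions carved out of it.

    -- A global algorithm over the states of two actors A and B.
    def globalStep (a b : Bool) : Bool × Bool := (!b, a)

    -- Per-actor code: each actor computes only its own next state.
    def actorA (b : Bool) : Bool := !b
    def actorB (a : Bool) : Bool := a

    -- The partitioning obligation: the per-actor pieces recompose
    -- to exactly the global algorithm.
    theorem partition_correct (a b : Bool) :
        (actorA b, actorB a) = globalStep a b := rfl

Trivial here, of course; the point is that for the real system this obligation exists, and as far as I can see it is not reported as discharged.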

In other words, the claims made about the system are too general for an expert to figure out what has actually been accomplished, that is, what the demonstrated properties of the final system are. I would think that what has been demonstrated is that the system specification fulfils the requirements specification, plus something more that is left unspecified.

What has not been shown is that the 200,000 lines of Ada source fulfil the requirements spec (or maybe that has been shown too; one can't tell). Nor that the x million words of object code fulfil the requirements spec (or maybe that too).

Fetzer is a red herring. His argument rests on an equivocation and admits of a trivial counterexample. Proved-correct is not a unary predicate of programs. What does "correct" mean, when speaking of programs? Answer: that it does what you want. So you have to have some idea of what you want, some specification. And then you may ask whether the program is correct wrt this specification. Or, as we may say, whether the program fulfils the specification. Correctness is (at least) a binary predicate, of which one argument is a program and another a specification of some kind, a saying-what-is-wanted. 

Fetzer claims to show that programs can never be proved correct. Here is a simple, trivial counterexample to that claim. Consider the specification "P OR (NOT P)", where "P" is any proposition you care to express. Constructivists, who may doubt excluded middle, may rather consider the double-negated law "NOT NOT (P OR (NOT P))", which is intuitionistically provable. Call this specification S. Take any program you like: Prog. There is a rigorous proof that Prog fulfils S. For the classical statement, the proof is a couple of lines long (think of it as the two lines of the truth table); for the constructivist statement, about double that.
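
For concreteness, here is the counterexample written out in Lean 4 (my formalisation, not Fetzer's): a specification is modelled as a predicate on programs, S is the trivial one, and any program Prog whatsoever fulfils it, classically in one line and constructively in two.

    -- "Prog fulfils S" is just "S Prog"; nothing about Prog is used.
    theorem prog_fulfils_S {Program : Type} (Prog : Program) (P : Prop) :
        (fun _ : Program => P ∨ ¬P) Prog :=
      Classical.em P

    -- Constructive variant: the double-negated law needs no classical axiom.
    theorem prog_fulfils_S' {Program : Type} (Prog : Program) (P : Prop) :
        (fun _ : Program => ¬¬(P ∨ ¬P)) Prog :=
      fun h => h (Or.inr (fun p => h (Or.inl p)))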

See? Logic helps. Here, in showing why somebody who is arguing that program verification should lose its research funding is canonically mistaken.

Not that the CACM would print such a refutation. The editor's comment when I sent it in was that Fetzer's article was somewhat old (a couple of years); did I have a more contemporary contribution to CACM? That's when I realised the marketing people had taken over and I let my membership lapse.

PBL

Prof. Peter Bernard Ladkin, University of Bielefeld and Causalis Limited

> On 26 Feb 2014, at 20:46, Martyn Thomas <martyn at thomas-associates.co.uk> wrote:
> 
> I suggest that we should consider the behaviour of the compiler, the
> libraries, other system components and (Fetzer's focus) the hardware as
> issues that can be addressed and assured separately from whether the
> source code of the software is an accurate refinement of the
> specification, and whether the specification describes a system that,
> with hindsight, many years later, we would agree was the system that
> should have been built.

