[SystemSafety] Scalability of Formal Methods [was: Logic]
David MENTRE
dmentre at linux-france.org
Wed Feb 26 15:13:54 CET 2014
Hello Peter,
Le 26/02/2014 12:30, Peter Bernard Ladkin a écrit :
> For example, David tells us that a couple hundred thousand lines of Ada source have been formally
> verified in 2006 using Atelier B. Now, if that was all SPARK source I would be prepared to believe
> it, but I'd want to know how many proof obligations were left undischarged.
Zero (0). Of course, a significant part (3.3% to 8.1% of proof
obligations, i.e. 1,400 to 2,200) is done manually (i.e. interactive
proofs) by a team of skilled engineers (internal or using subcontracts
to companies specialized in such proofs).
B Method is based on refinement: you start from a formal specification,
then you add details (algorithms, data structures) until you reach the
B0 level which corresponds to a basic programming language (IF, WHILE,
Boolean, integers, arrays, ...). At this level, you can directly
translate it in Ada or C. At each refinement step, proof obligations are
automatically generated to ensure the refined code corresponds to the
more abstract one.
For the translation to Ada or C part, usually a double translator is
used: two translators, developed by different teams with different
technologies, one checks the result is the same[1].
> If it isn't all SPARK
> source, then I become more wary. Also, my experience tells me that, of any industrial system with
> reliability requirements, 90%-95% of the source code is boilerplate. Boilerplate is functionally
> simple and easy formally to verify; furthermore, it is repeated all over. So 95% of your functional
> code is going to be easy formally to verify, if you are organised and put in the effort. But that 5%
> - now that could really do you in! It's theorem-proving and the math might do you in, even with the
> best prover technology.
I agree. That's why, when applying formal methods:
* You divide your system into formal and non-formal parts, in such a
way that the formal part ensures your important system properties
(safety, security, ...). The rule of the game is to have the smallest
formal part. You can put a lot of complicated code in non-formal part
(e.g. a radio transmission system) as long as your formal part checks it
(e.g. no message is lost, messages are received in order, ...);
* You design your software with proof in mind, e.g. using simpler
algorithms and data structures. The resulting software might not be
"optimal" but it does the job (e.g. functional and non-functional
objectives are fulfilled);
* The proof environment is tailored to the considered project to reach
90-97% of automatic proof. For B Method, the prover is based on proof
rules (about 1,400). The user can add new rules. For example, Siemens
added about 3,100 rules to Atelier B prover for its projects. Of course,
those rules might be incorrect, therefore they are proved themselves or
carefully reviewed[2]. Old SPARK (before SPARK 2014) is based on the
same approach if I'm correct;
* Your organize your development in such a way that you prove code
that is already mostly debugged, works, has been tested, has not changed
for some time, etc. Proof is done last.
Sincerely yours,
david
[1] Somebody like Nancy Leveson could wonder if such double translator
approach is really safe. I would say that such translators are rather
basic (no complex algorithm) and the specification are rather clear, so
we can have good arguments to say it is safe.
[2] Even after careful review by experts, some errors might slip in.
Later studies with formal methods (e.g. Coq proof assistant or Zenon
automated prover) have indeed found errors (e.g. 20 over 274 examined
rules). But those errors where minor (correct rule but incorrect manual
proof, or issues on non-freeness of rule variables) and never leaded to
a proof obligation that was wrongly proved correct.
More information about the systemsafety
mailing list