[SystemSafety] Bursting the anti formal methods bubble

Gareth Lock gareth at humaninthesystem.co.uk
Fri Oct 27 07:37:00 CEST 2017


Les,

Thank you for this, a very informative post.

Having spent last night on a webinar with 20+ thoracic surgeons across 
Europe, I’d like to build on your metaphor of surgeons and the skills 
needed for excellence. Some surgeons (and most people involved in safety 
in healthcare) realise that while a surgeon might have amazing technical 
skills, they need to work as part of an effective team in which 
knowledge is shared and leadership is situational and dynamic if patient 
safety is to be improved. Surgical techniques are becoming so 
complicated and complex that one person cannot solve the problems on 
their own, especially when emergent issues arise.

There is also a recognition that failure cannot be avoided, and there is 
both a societal and a cultural challenge in addressing this because 
surgeons are expected to be ‘perfect’. Consequently, failures at all 
levels need to be captured and analysed, and processes changed to 
introduce as much standardisation as possible and so reduce the 
cognitive overhead. Unfortunately this requires resources which are not 
currently funded at a level that can provide useful results. If you 
mentioned a Safety Management System to most healthcare professionals 
they would wonder what you were talking about!

Interestingly, one of the senior trainers last night commented that 
surgeons are no longer artists with their own way of keeping death at 
bay, but rather highly skilled technicians. Furthermore, the 
relationship between doctor and patient has moved from 1:1 to 1:many and 
many:1 when it comes to patient care, which adds ambiguity and 
uncertainty at all levels, and so faults and failures across the domain 
can emerge where none were expected.

Technical skills and attitude are not enough when dealing with 
complicated and complex systems. Developing effective non-technical 
skills, learning from experience and from post-event failure analysis (I 
am assuming pre-event analysis happens), and building resilience into 
the socio-technical system are therefore all essential. I doubt this is 
any different from the world of safety-critical systems as you describe 
it.

Regards

Gareth Lock
Director

M: +44 7966 483832
E: gareth at humaninthesystem.co.uk
W: http://www.humaninthesystem.co.uk
T: @HumaninSystem

Skype: gloc_1002
WhatsApp: +44 7966 483832

International speaker on human factors and non-technical skills
Published specialist on non-technical skills - 
https://www.humanfactors.academy/blog/sticky-published-articles

This email and any files transmitted with it are confidential and 
intended solely for the use of the individual or entity to whom they are 
addressed. If you have received this email in error please notify the 
sender by email or telephone and delete it from your system.

On 26 Oct 2017, at 23:28, Les Chambers wrote:

> Martyn
>
> I enjoyed this post. Thank you. I think you have spoken the truth. If 
> I could add one point:
>
> I think we need to use the surgeon metaphor in training people at the 
> pointy end of life-critical systems development. Surgeons try to avoid 
> after-incident reviews, given that many of them would be done over a 
> dead body. Their solution is to invest decades of training in the 
> operator to avoid the incident in the first place. In our case the 
> responsibility for this high-intensity training must fall on the 
> organisations developing software. As with surgeons, training must be 
> done at the coalface on a daily basis.
>
> There are two barriers to this approach:
>
> 1. Money. As the mantra goes - safety, reliability, security, 
> quality and integrity are free - but only in the long term (ask VW: the 
> return on investment of an Integrity Commissioner at board level would 
> have been eye-watering). In the short term they are extremely 
> expensive, and to the inexperienced chancer the related spending has no 
> obvious return on investment. It seems that people responsible for 
> allocating capital can only appreciate this after some life-transforming 
> event like a death or a massive security breach. Which leads me to my 
> second point.
>
>
>
> 2.  Attitude. Most of today's developers are too far removed both in 
> time and geography from the consequences of their mistakes. I've 
> concluded that, by an accident of history, I am a member of an almost 
> extinct club of people who have actually had a desk 30 feet away from 
> a machine that could destroy itself and/or me due to a defect in my 
> software. This experience focuses the mind and motivates flat-out 
> refusal to commit unsafe acts of any kind in software development or 
> other activities. The intensity of that motivation is a function of 
> experience - for example having worked with someone who got it wrong 
> and did kill someone. It's the difference in intensity between an 
> A-grade cyclist and a Tour de France rider.
>
>
>
> How then do you engender that level of intensity in someone who shines 
> a seat in a comfortable office light years away from the machine that 
> may explode, auger into the earth at the speed of sound or hand over 
> the personal details of millions of people? How then do you maintain 
> the integrity of a system built by many people with a large variation 
> of experience and attitude over many years? The answer is CONSTANT 
> INDOCTRINATION AND STRONG INCIDENT RESPONSE. I heard a good example of 
> this yesterday. Details are sketchy, but I believe that at some point 
> within the past five years Microsoft became aware of some 
> vulnerabilities in their core product line. They were so serious that 
> development management ordered a one-week operational pause during 
> which no one wrote any software and everyone spent the entire week 
> studying and discussing security issues. The U.S. Navy took the same 
> approach this year after the second collision incident with one of its 
> guided-missile destroyers. They have had a total of four incidents in 
> Asian waters this year. The operational pause focuses the mind and 
> reinforces the seriousness of the issue. Indoctrination imbues a value 
> system by constant repetition (I cite Lewis Carroll and The Hunting 
> of the Snark – see below).
>
>
>
> So I say to any Pilgrim reading this who has not suffered the real 
> fear of being killed by their own software, especially if you are a 
> manager: what are you doing to indoctrinate your people, and was your 
> response to the last incident forceful enough? If anything is going to 
> improve, it must start with you.
>
>
>
> Cheers
>
> Les
>
>
>
> The Hunting of the Snark
>
> BY LEWIS CARROLL
>
>
>
> "Just the place for a Snark!" the Bellman cried,
>
>    As he landed his crew with care;
>
> Supporting each man on the top of the tide
>
>    By a finger entwined in his hair.
>
>
>
> "Just the place for a Snark! I have said it twice:
>
>    That alone should encourage the crew.
>
> Just the place for a Snark! I have said it thrice:
>
>    What I tell you three times is true."
>
> ...
>
> My comment: effective indoctrination requires you to say it thrice, at 
> least once a week, for ever.
>
>
>
>
>
> From: systemsafety 
> [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf 
> Of Martyn Thomas
> Sent: Friday, October 27, 2017 1:28 AM
> To: systemsafety at lists.techfak.uni-bielefeld.de
> Subject: Re: [SystemSafety] Bursting the anti formal methods bubble
>
>
>
> Michael J Pont and C Michael Holloway,
>
> Thank you for your constructive comments. I would like to try to build 
> some progress from them.
>
> Firstly, let me explain that I understand the difficulties of 
> introducing formal methods into an existing commercial development 
> process. Readers on this list may not know that I have never been an 
> academic: I have worked in commercial software development and support 
> since 1969, and the company, Praxis, that I founded in 1983 was 
> only 40 people strong when we first started using VDM, and our 
> Critical Systems Division was fewer than 60 people when we won the 
> development of the air traffic management system CDIS and the navy 
> helicopter landing system SHOLIS, and bet the future of the company on 
> the success of these projects using Z and SPARK. More recently I 
> chaired the steering committee for a large EU project DEPLOY that 
> introduced the Event-B methods and Rodin toolset into Siemens 
> Transportation, Bosch, Space Systems Finland, SAP and elsewhere and 
> studied the difficulties encountered and the solutions found. I 
> co-edited the resulting book 
> (https://link.springer.com/book/10.1007%2F978-3-642-33170-1) and 
> contributed a chapter on "Introducing Formal Methods into Existing 
> Industrial Practices". I also worked on Daniel Jackson's study for the 
> National Academy of Sciences, Software for Dependable Systems: 
> Sufficient Evidence? (2007), and co-edited the final report 
> (https://www.nap.edu/catalog/11923/software-for-dependable-systems-sufficient-evidence). 
> That study consulted a lot of practitioners.
>
> So I'm not ignorant of the limitations and difficulties. Most systems 
> contain legacy components; some parts of most developments cannot be 
> fully formalised; we don't have a formal specification of the 
> behaviour of people or much of the real world in which we operate. A 
> very small fraction of the required proofs will have to be resolved by 
> inspection because of limitations in the prover or model checker. 
> Training is expensive and takes time. Many programmers are unfamiliar 
> with formal logic or operations on sets and states. These and many 
> other things are real barriers to change.
>
> But I have seen from experience that the use of formal methods can 
> be practical and cost-effective, and that despite these and other 
> limitations introducing FMs dramatically reduces software defects and 
> provides far stronger evidence for assurance than can be achieved in 
> any other way. I believe that there are strong empirical reasons why 
> this would be the case.
>
> We strive to be engineers, and engineering involves making the 
> right decisions and the right compromises; as an engineer and a 
> businessman, I have learnt the value of increased formality.
>
> If we are to make constructive progress, I would like us to agree two 
> things:
>
> 1. That the cybersecurity threat is a game-changer. When the only 
> failures that had to concern us were the randomly occurring failures 
> triggered by combinations of events in our operating environment, we 
> were able to make some important assumptions that simplified the task 
> of assuring fitness-for-purpose. We could assume that failures in 
> independent components would occur independently (many safety cases 
> depend on this assumption). We could assume that a failure that 
> occurred randomly in one car in a fleet of ten thousand would not 
> suddenly occur simultaneously in every car in the fleet. We could 
> assume that unlikely combinations of events remained unlikely. These 
> and other assumptions are not valid if our systems may face 
> well-resourced and technically sophisticated cyber attack. And the 
> attacker could be a low-level criminal or even a script kiddie in a 
> world where the agencies of nation states seek out vulnerabilities 
> that they can exploit, weaponise them with scripts, apps and user 
> guides, and then have them stolen and dumped on the Internet (as 
> happened with the EternalBlue exploit that was developed by the NSA, 
> stolen by the Shadow Brokers, and used in the WannaCry and Petya/NotPetya 
> ransomware attacks). It would be complacent to rely on an assumption 
> that no-one has both the motive and the means to mount a successful 
> cyber-attack.
> 2.	That the software development methods that are widely used by the 
> industries that develop safety-critical software cannot provide 
> adequate assurance against such attacks, and that therefore some way 
> must be found to move from assurance that is largely based on testing 
> towards assurance that is substantially supported by automated 
> analysis of our specifications, designs and code.
>
> If we can agree this starting point then we can move to discussing the 
> incremental steps that can improve our processes and our products, 
> step by step, person by person, component by component, subsystem by 
> subsystem and interface by interface. It will be a long journey and we 
> shall need to learn from each other how to overcome the many barriers, 
> but I genuinely believe that the current situation is unsustainable 
> and that our profession must come together to start making the 
> necessary progress.
>
> Regards
>
> Martyn
>
>
>
>
>
> On 26/10/2017 10:21, Michael J. Pont wrote:
>
>
> Peter,
>
> Let me re-phrase my questions (and address them to anyone on this list 
> who cares to express a view).
>
> Suppose Organisation X currently develops software for use in 
> safety-related embedded systems in compliance with one or more 
> international safety standards / guidelines (e.g. IEC 61508, ISO 26262 
> - or whatever).
>
> Organisation X does not currently use formal methods / tools.
>
> Organisation X wishes to improve confidence in the safety of the 
> systems that employ their software.
>
> What level of 'confidence improvement' could Organisation X expect to 
> see through use of tools that allow static analysis of code pre- and 
> post-conditions?   What evidence can be provided to support this?
>
> [Here I'm assuming something like SPARK or Frama-C tools.]
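>
> [For concreteness, a minimal, purely illustrative sketch of the kind of 
> contract such tools check - the function and its bounds are invented, 
> and I'm assuming Frama-C-style ACSL annotations on C code:
>
>    #include <limits.h>
>
>    /* The precondition rules out signed overflow, the postcondition 
>       states the intended result, and 'assigns \nothing' says the 
>       function has no side effects on global state. */
>    /*@ requires 0 <= n && x <= INT_MAX - n;
>        assigns \nothing;
>        ensures \result == x + n;
>    */
>    int add_offset(int x, int n) {
>        return x + n;
>    }
>
> A prover such as Frama-C's WP plug-in then tries to discharge the proof 
> obligation that every call satisfying the precondition yields the 
> postcondition, without running a single test.]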
>
> What other formal methods / tools could Organisation X consider (if 
> any) if they want to reduce the number of defects in their code by a 
> factor of 100x to 1000x?  What evidence can be provided to support 
> this level of defect reduction (from Organisation X's starting point)?
>
> What level of FM support should Organisation X consider if they want 
> to reduce the threat of civil lawsuits etc (raised in Martyn's 
> original email)?
>
> ---
>
> Some more general comments.
>
> There are people on this list who are formal-method 'fans' (some would 
> say 'fundamentalists').
>
> In some cases, I think such FM people 'need to get out more' - and 
> spend time with the development teams in real (often fairly small) 
> organisations that are struggling to create safety-related systems.   
> Such teams usually have very limited time and resources available: 
> this does not mean that they are populated by 'bad people' who want to 
> create poor-quality systems.
>
> Shouting at people because they are not using formal methods doesn't 
> really help ...   Offering suggestions about processes / tools that 
> may help organisations to become 'more formal' would - in my view - be 
> more productive.
>
> Simply my take (others may well see the world in a different way).
>
> Michael.
>
> ---
>
> Michael,
>
> On 2017-10-26 09:05 , Michael J. Pont wrote:
>
> What is your baseline - what do you mean by “typical industrial
> software methods”?  Do you – for example - mean products developed 
> in compliance with IEC 61508 (e.g. ‘SIL 3’)?  ISO 26262 (e.g.
> ‘ASIL D’)?
>
>
> In https://www-users.cs.york.ac.uk/tpk/nucfuture.pdf , John McD and 
> Tim Kelly talk about typical fault densities. They suggest that 1 
> error per KLOC is considered to be "world-class" for some areas of 
> safety-critical systems. The C130J software had about 23 errors per 
> KLOC and 1.4 safety-critical errors per KLOC.
>
> In contrast, Praxis, now Altran UK, has instrumented its code quality 
> for a long time and has achieved fault densities as low as 1 per 
> 25KLOC.
>
>
> What do you mean by FMs?   Would use of SPARK Pro qualify?  Use of 
> Frama-C?  Something else?
>
>
> The answer is trivially no, because you are asking about apples and 
> your examples are oranges.
>
> SPARK Pro and Frama-C are PDEs. An FM is first and foremost a method. 
> Both SPARK Pro and Frama-C use FMs; for example, SPARK Pro uses 
> Bergeretti-Carré information flow analysis integrally. The SPARK 
> Examiner uses Hoare Logic for static analysis, as well as other 
> methods.
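>
> (To illustrate - a toy sketch only, with an invented function and 
> written in Frama-C-style ACSL notation rather than SPARK: the tool 
> generates Hoare-logic proof obligations such as "the loop invariant is 
> established on entry, preserved by each iteration, and together with 
> the negated loop condition implies the postcondition", and then tries 
> to discharge them automatically.
>
>    /*@ requires n >= 0;
>        assigns \nothing;
>        ensures \result == n;
>    */
>    int count_up(int n) {
>        int i = 0;
>        /*@ loop invariant 0 <= i <= n;
>            loop assigns i;
>            loop variant n - i;
>        */
>        while (i < n) {
>            i++;
>        }
>        return i;
>    }
>
> The point is that the verdict comes from static analysis of the 
> annotated source, not from executing it.)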
>
>
> If you can provide evidence that use of a specific FM is capable of
> providing “two or three orders of magnitude fewer defects” in an 
> IEC 61508 / ISO 26262 project, I’d be interested to see this.
>
> That is a different ask from what Martyn said. Also, in a 
> safety-critical context just using one FM is not going to get you very 
> far. Using Z to write your requirements specifications, for example, 
> isn't going to get you very far if you then simply give your Z 
> requirements to designers who can't read Z. You need more stuff to 
> support that one technique, and you need your techniques to be joined 
> up.
>
> But I can give you some obvious examples from the past. First, use of 
> strong typing would have eliminated 80% of the networked-SW 
> vulnerabilities listed by US CERT in the 1990s. Strong data typing is 
> an FM introduced 49 years ago. Second, use of higher-level languages 
> and their compilers led to much more reliable code than writing it all 
> in assembler. HLL compilers are FMs (if you are surprised by that 
> assertion, a quick revisit of Aho, Hopcroft and Ullman will make it 
> clear).
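>
> (A tiny, invented illustration of one aspect of the strong-typing 
> point, sketched in C for familiarity - Ada-family languages enforce 
> such distinctions natively and far more thoroughly:
>
>    #include <stddef.h>
>
>    /* Distinct wrapper types for a byte count and an element count: the 
>       compiler rejects code that passes one where the other is 
>       expected, so the classic length mix-up is caught at compile time 
>       rather than surfacing as a run-time defect. */
>    typedef struct { size_t bytes; } byte_count;
>    typedef struct { size_t elems; } elem_count;
>
>    void copy_bytes(char *dst, const char *src, byte_count n) {
>        for (size_t i = 0; i < n.bytes; i++) {
>            dst[i] = src[i];
>        }
>    }
>
>    /* A call such as copy_bytes(dst, src, some_elem_count) does not 
>       compile; the programmer must resolve the confusion explicitly. */
>
> The type system is doing a small, fully automated proof about every use 
> of those quantities.)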
>
> PBL
>
> Prof. Peter Bernard Ladkin, Bielefeld, Germany MoreInCommon Je suis 
> Charlie
> Tel+msg +49 (0)521 880 7319  www.rvs-bi.de
>
>
>
>
>
>
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE

