[SystemSafety] Difference between software reliability and astrology

Wed Aug 21 11:08:18 CEST 2024

On 2024-08-20 22:13 , Prof. Dr. Peter Bernard Ladkin wrote:
>
> There are a couple of basic facts here. You are lacking any useful statistical evaluation 
> procedure for something like the Linux kernel. Furthermore, it wasn't developed according to any 
> of the standards, in any industry branch, for safety-related software. So you can't use it in any 
> of those applications. Nothing that has been said during discussion changes either of those two facts.

Here is something which occurred to me later that I should have written.

IEC 61508-1 includes two tables, Table 2 and Table 3, which specify target reliability requirements 
for safety functions(**). The specification, design and development of safety functions are the 
matters with which 61508 concerns itself.

Let us consider just safety functions for continuously-operating processes, those operating in 
high-demand mode (demand greater than once per year) or continuous mode (defined in 61508-4:2010 
3.1.16). 61508-1 Table 3 sets reliability requirements for high-demand/continuous safety functions. 
They are two-sided (which seems to me silly; this apparently won't change in the new edition). I 
give the one-sided values:

* SIL 1 must have a probability of failure per hour of less than 10^(-5).
* SIL 2 must have a probability of failure per hour of less than 10^(-6).
* SIL 3 must have a probability of failure per hour of less than 10^(-7).
* SIL 4 must have a probability of failure per hour of less than 10^(-8).
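
As a quick illustration, the one-sided values above can be put in a small lookup. This is only a sketch: the names `SIL_UPPER_BOUND_PER_HOUR` and `highest_sil_met` are mine, not from the standard, and the function says nothing about how one would justify the failure probability fed into it.

```python
# One-sided upper bounds on probability of failure per hour for
# high-demand/continuous-mode safety functions, as listed above.
# The dictionary and function names are illustrative assumptions.
SIL_UPPER_BOUND_PER_HOUR = {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}


def highest_sil_met(failure_prob_per_hour: float) -> int:
    """Return the highest SIL whose bound the given value is strictly below.

    Returns 0 if the value does not satisfy even the SIL 1 bound.
    """
    if failure_prob_per_hour < 0:
        raise ValueError("failure probability per hour cannot be negative")
    met = [
        sil
        for sil, bound in SIL_UPPER_BOUND_PER_HOUR.items()
        if failure_prob_per_hour < bound
    ]
    return max(met, default=0)
```

For example, `highest_sil_met(5e-7)` gives 2, because 5 x 10^(-7) is below the SIL 2 bound but not below the SIL 3 bound.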

So if you implement software for safety-related purposes (that is, to implement or partially 
implement a safety function), you must reasonably argue that your software achieves such a 
reliability level. (There is a distinction here between failure and dangerous failure which I shall 
leave aside.) Software which is suitable for implementing a safety function of SIL x is said to have 
"Systematic Capability" (SC) x. 61508-3 nominally says what development methods and measures are to 
be applied for software to be taken to have SC x.

So thoughts about "we need other methods .... to assess complex software" are beside the point. The 
Linux kernel was not developed according to 61508-3, so if you want to use it to implement a safety 
function with SIL x, you need to argue somehow that it has SC x and you can only do that by showing 
it has the required reliability given above. If you can't do that, you can't use it in implementing 
safety functions with SIL 1-4.
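
To give a rough feel for what "showing it has the required reliability" can involve, here is a minimal sketch of one textbook calculation. It assumes a constant failure rate (exponential model) and an operational profile that matches the intended use, both of which are strong assumptions for software. Under them, the failure-free operating time t needed to claim a rate below lambda with confidence C is t = -ln(1 - C) / lambda. The function name is hypothetical, and this is not put forward as the procedure 61508 prescribes.

```python
import math


def failure_free_hours_needed(target_rate_per_hour: float,
                              confidence: float = 0.95) -> float:
    """Failure-free operating hours needed to claim a failure rate below
    target_rate_per_hour at the given confidence level.

    Assumes a constant failure rate (exponential model) and zero observed
    failures; both are modelling assumptions, not facts about any system.
    """
    if target_rate_per_hour <= 0:
        raise ValueError("target rate must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return -math.log(1.0 - confidence) / target_rate_per_hour


if __name__ == "__main__":
    # Bounds per hour for SIL 1 to SIL 4, as given in the list above.
    for sil, bound in {1: 1e-5, 2: 1e-6, 3: 1e-7, 4: 1e-8}.items():
        hours = failure_free_hours_needed(bound)
        print(f"SIL {sil}: about {hours:.3g} failure-free hours")
```

At 95% confidence this comes to roughly three divided by the target rate, so on the order of 3 x 10^5 hours for the SIL 1 value and 3 x 10^8 hours for the SIL 4 value, all on unchanged software.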

Footnote (**):

Safety functions are specific actions/artefacts which must be introduced when a specific action or 
process A in a safety-related system does not constitute an acceptable risk (the notion of 
"acceptable risk" is fundamental, and is taken by 61508 to be set by "society", that is, outside of 
engineering judgement. HSE has guidance on it in "Reducing Risks, Protecting People", aka R2P2). The 
safety function SF(A) is a supplementary action/process that is invoked to reduce the risk of 
performing A to an acceptable level.

So, for example, if you have a pressure vessel containing chemical reactants, there is some risk 
that the pressure might become too great and the vessel fracture and release its contents into the 
environment (with concomitant harm). A safety function in this case could be a designed overpressure 
release pathway, say a safety valve releasing into a pipe leading to an overspill containment 
vessel. Such things occur ubiquitously in the process industries. There has been debate for thirty 
years as to whether such a conception is appropriate for all industries. I won't pursue this 
question here. IEC 61508 has been a fact of safety-related engineering life for 27 years now.

PBL

Prof. Dr. Peter Bernard Ladkin
Causalis Limited/Causalis IngenieurGmbH, Bielefeld, Germany
Tel: +49 (0)521 3 29 31 00