[SystemSafety] Statistical Assessment of SW ......
Les Chambers
les at chambers.com.au
Mon Jan 26 11:23:56 CET 2015
And then there's the issue of glue drift. Yes, you read that right: G L U E D R I F T.
This actually happened. A distributed control system consists of a bunker with supervisory computers and 300 remote terminal units (RTUs) performing the control tasks. Each RTU has a processor chip with a heat sink glued to it. Halfway through the production run the Shanghai factory gets a batch of defective glue. The RTUs are deployed and under normal operating conditions some of the heat sinks start falling off the processor chips and rattling around inside the card cages, causing random failures. Recordkeeping in the factory did not extend to tracking when the defective glue was introduced into the manufacturing process.
Model that, you seekers after determinism!
Les
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Matthew Squair
Sent: Saturday, January 24, 2015 9:12 AM
To: Peter Bishop
Cc: systemsafety at lists.techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Statistical Assessment of SW ......
With apologies to Peter Bishop, I meant to send this to the group but selected the wrong damn button*.
Another non-trivial hardware problem is how to ensure a shared concept of time in a distributed system in the presence of clock drift. Said drift can lead to quite different responses to inputs from redundant identical components, based on their hitting a time gate at slightly different moments.
*Likelihood of operator error 1x10^-3/D (WASH-1400 study, App. III).
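The clock drift point can be illustrated with a minimal sketch. It assumes two redundant channels running identical logic, each timing a gate against its own local clock. The drift values, the gate time and the function names are hypothetical, chosen only to show the effect.

```python
def local_time(true_time_s, drift_ppm):
    """Local clock reading after true_time_s seconds, for a clock that
    runs fast (positive drift) or slow (negative drift) by drift_ppm."""
    return true_time_s * (1.0 + drift_ppm * 1e-6)


def accepts(true_time_s, drift_ppm, gate_s):
    """Identical decision logic in every channel: accept the input only
    if it arrives before the time gate, as judged by the local clock."""
    return local_time(true_time_s, drift_ppm) < gate_s


GATE_S = 1000.0    # hypothetical gate, 1000 s after a common start
EVENT_S = 999.98   # true arrival time of the input, just before the gate

# Same software, same input, clocks drifting in opposite directions.
channel_a = accepts(EVENT_S, drift_ppm=+50.0, gate_s=GATE_S)
channel_b = accepts(EVENT_S, drift_ppm=-50.0, gate_s=GATE_S)

print("channel A accepts:", channel_a)
print("channel B accepts:", channel_b)
print("channels agree:", channel_a == channel_b)
```

With these numbers the fast clock believes the gate has already closed while the slow clock does not, so the two "identical" components respond differently to the same input.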
On Sat, Jan 24, 2015 at 1:50 AM, Peter Bishop <pgb at adelard.com> wrote:
Determinism is tricky if you include hardware (especially embedded system hardware).
One source of non-determinism is input measurement accuracy.
Multiple "correct" responses are possible if a deterministic threshold (like trip or no-trip) relies on a real world input value.
This can be a real problem when testing an embedded system.
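A small sketch of why this complicates testing, assuming a simple trip threshold and a symmetric sensor accuracy band. The threshold, the accuracy figure and the function names are hypothetical. The point is that a test oracle has to accept a set of responses rather than a single one when the true value lies close to the threshold.

```python
def trip(measured, threshold):
    """Deterministic trip logic acting on the measured value."""
    return measured >= threshold


def acceptable_responses(true_value, threshold, accuracy):
    """Set of trip decisions consistent with a measurement that may lie
    anywhere within +/- accuracy of the true value. Because trip() is
    monotone in the measurement, checking the two ends of the band is enough."""
    low = true_value - accuracy
    high = true_value + accuracy
    return {trip(low, threshold), trip(high, threshold)}


THRESHOLD = 100.0   # hypothetical trip setpoint
ACCURACY = 0.5      # hypothetical sensor accuracy, same units

near = acceptable_responses(99.8, THRESHOLD, ACCURACY)
far = acceptable_responses(90.0, THRESHOLD, ACCURACY)

print("true value 99.8:", near)   # both trip and no-trip are acceptable
print("true value 90.0:", far)    # only no-trip is acceptable
```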
Peter Bishop
RICQUE Bertrand (SAGEM DEFENSE SECURITE) wrote:
Does deterministic software exist?
If it is intrinsically deterministic, does a deterministic execution
of this SW on a given hardware exist?
Bertrand Ricque
Program Manager
Optronics and Defence Division
Sights Program
Mob : +33 6 87 47 84 64
Tel : +33 1 58 11 96 82
Bertrand.ricque at sagem.com
-----Original Message-----
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Peter Bernard Ladkin
Sent: Friday, January 23, 2015 7:43 AM
To: systemsafety at lists.techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Statistical Assessment of SW ......
On 2015-01-21 14:15 , jean-louis Boulanger wrote:
For software it's not possible to have statistical evidence. The
failure is 1 (yes, the software has faults and failures appear)
This argument came up again yesterday in a standards-committee
meeting. It is usually attributed to third party "engineers with whom
I work", because nobody quite seems to claim they hold the view
themselves when I'm in the room :-) ....
So it might be worthwhile to adduce the proof - again. It's real
short.
Suppose you have a piece of SW S which is deterministic. And S is
also not perfect, so it outputs right answers on some inputs and
wrong answers on others. And S reverts to an initial state with no
memory of its previous behavior each time it produces its output.
Suppose the distribution of inputs to S has a stochastic character.
That is, the input I is a random variable. Then the output outS(I),
which is a function of the input I, also has stochastic character. A
deterministic transformation of a random variable is itself a random
variable.
Let us transform outS(I) further, deterministically. Define
  CorrS(I) = 1 if outS(I) is correct
  CorrS(I) = 0 if outS(I) is incorrect
Then again CorrS(I) also has a stochastic nature and is a random
variable.
Thus, if the input to a piece of SW has stochastic nature, then so
does the correctness behavior of the SW.
QED.
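The argument can be sketched in a few lines, assuming a finite input space and a uniform input distribution purely for illustration. The specification, the planted fault and all names below are hypothetical. The software itself is fully deterministic and memoryless; only the choice of input is random.

```python
import random
from fractions import Fraction


def spec(i):
    """What the software is supposed to compute (hypothetical spec)."""
    return i * i


def out_s(i):
    """Deterministic, memoryless software S with a planted fault:
    it gives the wrong answer on every multiple of 7."""
    return i * i + 1 if i % 7 == 0 else i * i


def corr_s(i):
    """CorrS(I): 1 if outS(I) is correct, 0 otherwise."""
    return 1 if out_s(i) == spec(i) else 0


INPUTS = range(100)

# Exact probability of a correct output under a uniform input distribution.
p_correct = Fraction(sum(corr_s(i) for i in INPUTS), len(INPUTS))

# The same thing seen operationally: draw inputs at random, observe CorrS.
rng = random.Random(2015)
sample = [corr_s(rng.choice(INPUTS)) for _ in range(10_000)]
observed = sum(sample) / len(sample)

print("exact P(correct):", p_correct)
print("observed frequency:", observed)
```

Each individual run of `out_s` is repeatable, yet the sequence of observed correct and incorrect outputs behaves like a sequence of Bernoulli trials, which is all that statistical assessment needs.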
The only reasonable objection to this argument which I have heard is
to dispute whether inputs have a stochastic nature.
So, say you build a railway locomotive control system. The piece of
track the locomotive runs on has a fixed architecture, so the
argument would run that the behavior of the locomotive is more or
less determined within certain parameters (whether signal X is red or
green) and does not have a stochastic nature. But various parameters
such as the condition of the track, the nature of the load on the
locomotive, and other environmental conditions such as wind speed and
weather (icy track, or dry track, and when icy where the ice is) make
it practically all but impossible to predict the inputs to the
control system. Besides, at design time the design does not involve
designing to the specific route the locomotive will run on. The
designer is ignorant of the application. So the inputs to the control
system as known at design time have a stochastic nature if you are a
Bayesian.
I would like to remark here, again, on a couple of incoherences in
IEC 61508 and "derivative" standards.
Something which executes a safety function must consist of both HW
and SW, because SW alone cannot take action. A HW-SW element which
executes a safety function is assigned a reliability goal, which is
mostly encapsulated in the SIL. These reliability goals are the
safety requirements. A reliability goal is expressed in terms of
probability of function failure per demand, or per unit time. Suppose
that the correct functioning of the HW-SW element E is functionally
dependent on the correct functioning of its SW S (which for most
actuators it is). The standard requires one to demonstrate that the
reliability is attained (that the safety requirement is fulfilled).
How this is actually done must be something like the following.
We assume as above that the element E deterministically transforms
its inputs. We define the function CorrE as above. Given a
distribution of inputs Distr(I), then the probability that E
functions correctly is given by (Integral over Distr(I) of the
function CorrE(I)) divided by (Integral over Distr(I) of the constant
1).
Notice that the probability of correct functioning, the safety
requirement as laid down by IEC 61508, is dependent on Distr(I).
Change Distr(I) and one can usually expect the probability to change.
(For example, let Distr(I) be the Dirac Delta function on one
incorrect input. Then the probability that E functions correctly is
0.)
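A sketch of this dependence on Distr(I), assuming a finite input space so that the integrals become weighted sums. The element, its fault and the three distributions are hypothetical. The normalisation by the total weight plays the role of the integral of the constant 1.

```python
from fractions import Fraction


def corr_e(i):
    """CorrE(I) for a hypothetical element that fails on multiples of 7."""
    return 0 if i % 7 == 0 else 1


def p_correct(distr):
    """Probability of correct functioning under distr, a mapping from
    input to non-negative weight (weights need not sum to 1)."""
    total = sum(distr.values())
    return Fraction(sum(w * corr_e(i) for i, w in distr.items()), total)


# Three input distributions over the same, unchanged element.
uniform = {i: 1 for i in range(100)}
skewed = {i: (10 if i % 7 == 0 else 1) for i in range(100)}  # failing inputs 10x as likely
point_mass = {14: 1}  # discrete analogue of a Dirac delta on one incorrect input

print("uniform:   ", p_correct(uniform))
print("skewed:    ", p_correct(skewed))
print("point mass:", p_correct(point_mass))
```

The element is identical in all three cases; only the usage profile changes, and the probability of correct functioning moves from 17/20 to 17/47 to 0.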
Yet in IEC 61508, and everywhere else, Distr(I) is not mentioned. Not
once.
This is incoherent.
One could fix it, maybe, by just assuming the uniform distribution on
all inputs, by default. Or the normal distribution. There may be
reasons for this, but it is worth pointing out that Distr(I) in real
applications is almost never uniform or normal. If there is a
distribution D for which it can be argued that the real-world input
distribution "almost always approximates D" then one could choose D
as the default instead.
The second incoherence is as follows. If the SW does not attain the
safety requirement, then E does not attain the safety requirement,
under a certain plausible assumption, namely that if CorrS(I) = 0,
then CorrE(I) is almost always 0. (That is, the HW may sometimes
fortuitously compensate for incorrect SW behavior, but mostly not.)
Then in order for E to fulfil the safety requirement, it must be the
case that
  (Integral over Distr(I) of the function CorrS(I)) divided by
  (Integral over Distr(I) of the constant 1)
GEQ
  (Integral over Distr(I) of the function CorrE(I)) divided by
  (Integral over Distr(I) of the constant 1) - epsilon
(epsilon is there to instantiate the "almost" part of the
assumption).
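The inequality can be checked on a toy case, assuming a finite input space with a uniform distribution. The faults below are hypothetical: the SW fails on multiples of 7, the HW fortuitously masks one of those failures, and the HW fails by itself on two further inputs. Epsilon is taken to be the probability mass of inputs where the SW is wrong but the element is nevertheless right.

```python
from fractions import Fraction

INPUTS = range(100)


def corr_s(i):
    """CorrS(I): the SW is wrong on multiples of 7 (hypothetical fault)."""
    return 0 if i % 7 == 0 else 1


def corr_e(i):
    """CorrE(I): the element follows the SW, with two hypothetical deviations."""
    if i == 49:
        return 1   # HW fortuitously compensates for the SW fault here
    if i in (50, 51):
        return 0   # HW fails on its own although the SW was correct
    return corr_s(i)


def prob(indicator):
    """Probability that indicator(i) == 1 under the uniform distribution."""
    return Fraction(sum(indicator(i) for i in INPUTS), len(INPUTS))


p_s = prob(corr_s)
p_e = prob(corr_e)
# Mass of the "almost": SW incorrect, element correct anyway.
epsilon = prob(lambda i: 1 if corr_s(i) == 0 and corr_e(i) == 1 else 0)

print("P(CorrS = 1):", p_s)
print("P(CorrE = 1):", p_e)
print("epsilon:     ", epsilon)
print("P_S >= P_E - epsilon:", p_s >= p_e - epsilon)
```

With epsilon defined this way the bound holds for any CorrS, CorrE and input distribution, since the inputs on which E is correct are either inputs on which S is correct or part of the epsilon mass.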
So, since the safety requirement on E has a probabilistic calculation
as a component, so must the inherited safety requirement on S.
Yet there is no requirement in IEC 61508 to substantiate that
inherited safety requirement on S. The only condition on software
safety requirements is the techniques which are recommended to be
used during development of S.
In particular, if, like Jean-Louis, you don't think that the execution
of SW can have a stochastic nature, you are thereby committed to
the view that IEC 61508 and its derivatives are inherently incoherent.
It must be a difficult world to live in ......
PBL
Prof. Peter Bernard Ladkin, Faculty of Technology, University of Bielefeld, 33594 Bielefeld, Germany
Je suis Charlie
Tel+msg +49 (0)521 880 7319
www.rvs.uni-bielefeld.de
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE
--
Peter Bishop
Chief Scientist
Adelard LLP
Exmouth House, 3-11 Pine Street, London, EC1R 0JH
http://www.adelard.com
Recep: +44-(0)20-7832 5850
Direct: +44-(0)20-7832 5855
--
Matthew Squair
MIEAust CPEng
Mob: +61 488770655
Email: MattSquair at gmail.com
Website: www.criticaluncertainties.com