[SystemSafety] Statistical Assessment of SW ......

DREW Rae d.rae at griffith.edu.au
Mon Jan 26 14:38:55 CET 2015


Peter,
You are correct that the language I used implied that there would be
failures to count - that was imprecise. For the levels of reliability
typically claimed, the failure count would be zero, and there's nothing
wrong with determining an upper bound on the failure probability or rate
from a period with zero failure events. That doesn't remove the need for a
reliable means of counting failure events, though.
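
For concreteness, the textbook zero-failure calculation looks something like
this (a minimal sketch in Python, my own illustration rather than anything
taken from a standard or from Peter's references; it assumes independent,
identically distributed demands or operating time - exactly the kind of
auxiliary assumption I mean):

import math

def bernoulli_upper_bound(n_trials: int, confidence: float = 0.95) -> float:
    """Upper bound on per-demand failure probability p after n_trials
    failure-free Bernoulli trials: solve (1 - p)**n_trials = 1 - confidence."""
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_trials)

def exponential_upper_bound(hours: float, confidence: float = 0.95) -> float:
    """Upper bound on the failure rate (per hour) after failure-free operation
    for the given exposure, assuming an exponential/Poisson model."""
    alpha = 1.0 - confidence
    return -math.log(alpha) / hours

# e.g. 3000 failure-free demands -> p <= ~1.0e-03 per demand at 95% confidence
print(bernoulli_upper_bound(3000))
# e.g. 30000 failure-free hours -> lambda <= ~1.0e-04 per hour at 95% confidence
print(exponential_upper_bound(30000))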

My point is that the statistical models are internally valid for predicting
reliability, but they can't externally validate their own predictions.
There are any number of auxiliary assumptions necessary to claim that the
test results plus the statistical assessment actually yield reliable estimates, and
those assumptions can't be tested. It isn't just a case of "there's some
uncertainty" - it's a case of _unknown_ uncertainty, with a blind faith
that the uncertainty is small enough that the predictions have some
usefulness.





My safety podcast: disastercast.co.uk
My mobile (from October 6th): 0450 161 361

On 26 January 2015 at 13:21, Peter Bernard Ladkin <
ladkin at rvs.uni-bielefeld.de> wrote:

> On 2015-01-26 12:41 , DREW Rae wrote:
> > A simple thought experiment. Let's say someone claims to have a suitable
> > method of predicting combined hardware/software reliability.
> > On what basis could they ever support that claim?
>
> Um, using well-tried statistical methods associated with the Bernoulli,
> Poisson and exponential
> distributions, as taught in most basic statistics courses. (Such as
> http://www.math.uah.edu/stat/bernoulli/
> http://www.math.uah.edu/stat/poisson/  Bev put me on to
> these. They are pretty good! I used to refer to Feller, but Bev thought
> that was "ancient". It's not
> that ancient. Came out the year before I was born.)
>
> > I would argue that such a claim about a method intended for real-world
> > use is empirical in nature, and can only be validated empirically.
> > Unfortunately this requires an independent mechanism for counting the
> > failures, and that there be enough failures to perform a statistical
> > comparison of the prediction with reality.
>
> Methods of assessing reliability of SW are normally predicated on *no
> failures having occurred for a
> certain number of trials*. Providing that no failures have been observed,
> the conclusion that the
> failures have a specified low occurrence rate may be drawn with a
> specified level of confidence,
> dependent on the number of trials observed. I mean, this is just basic
> statistical methodology, is
> it not?
>
> That you are talking about failures, and counting failures, suggests to me
> that you're not au fait
> with the general approach to statistical assessment of SW reliability.
>
> > Conclusion: No method for predicting hardware/software reliability can
> > actually be shown to accurately predict hardware/software reliability.
> > All claims about hardware/software reliability are constructed using
> > methods that themselves haven't been adequately validated.
>
> Dear me! <PBL restrains himself for fear of being voted off his own list
> :-) >
>
> Any method of assessing the reliability of a system is going to rest on a
> considerable amount of
> uncertainty. Say you want to *prove* your system satisfies its spec. You
> say you have listed all the
> proof obligations of your SW? How reliable is the listing process? You say
> you've discharged all the
> proof obligations? How reliable is your proof checker? And so on. How can
> you deal with that
> uncertainty without using statistical methods at some point? I don't think
> you can.
>
> And then there is practicality. Even if everything you say were to be true
> (note this is a
> counterfactual conditional!), large amounts of safety-relevant SW are now
> sold in the marketplace on
> the basis that the user may rely on it doing its job ("yes, problems have
> arisen but these are the
> measures used to fix them and the problems haven't occurred since"). The
> validity of such assurances
> is low to marginal. Much of the thrust of our approach to the statistics
> is to try to encourage
> people to keep better records and to pay attention to appropriate
> inference rather than saying
> "these hundred clients have been using it and previous versions for a
> decade and a half and only one
> has been unhappy enough with the product to go to arbitration about it".
> And for the clients of such
> vendors to demand appropriate statistics rather than be content with such
> claims.
>
> PBL
>
> Prof. Peter Bernard Ladkin, Faculty of Technology, University of
> Bielefeld, 33594 Bielefeld, Germany
> Je suis Charlie
> Tel+msg +49 (0)521 880 7319  www.rvs.uni-bielefeld.de
>
>
>
>