[SystemSafety] Statistical Assessment of SW ......
Peter Bishop
pgb at adelard.com
Mon Jan 26 19:36:25 CET 2015
You have put your finger on the key problem.
Statistical testing and confidence limits are straightforward and
technically sound *provided the assumptions are correct*.
If the assumptions are wrong, the bound no longer applies.
The confidence bound would then depend on the (unknown) probability that
the assumptions are wrong. So in the worst case:

E[pfd] = P(assumptions wrong) * 1 + P(assumptions right) * confidence limit

Not great unless you are *really* sure your assumptions are valid.
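
For concreteness, a minimal Python sketch of that worst-case sum; the figures
for P(assumptions wrong) and the tested bound are purely illustrative, not
taken from any real assessment:

# Worst-case expected pfd when the testing assumptions may not hold.
# Both figures below are illustrative assumptions.
p_assump_wrong = 0.01   # probability that the testing assumptions are wrong
pfd_bound = 1e-4        # pfd bound claimed from testing, valid only if the assumptions hold

# If the assumptions are wrong we can claim nothing, so take pfd = 1 in the worst case.
worst_case_e_pfd = p_assump_wrong * 1.0 + (1.0 - p_assump_wrong) * pfd_bound
print(f"Worst-case E[pfd] = {worst_case_e_pfd:.3g}")   # about 0.01

Even a 1% doubt about the assumptions swamps a 10^-4 bound obtained from the
testing itself.
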
Peter Bishop
DREW Rae wrote:
> Peter,
> You are correct that the language I used implied that there would be
> failures to count - that was imprecise. For the levels of reliability
> typically claimed, the failure count would be zero, and there's nothing
> wrong with determining upper bounds on likelihood based on a period with
> zero failure events. That doesn't remove the need for a reliable counter
> of failure events, though.
>
> My point is that the statistical models are internally valid for
> predicting reliability, but they can't externally validate their own
> predictions. There are any number of auxiliary assumptions necessary to
> claim that the test results + statistical assessment are actually
> reliable estimates, and those assumptions can't be tested. It isn't just
> a case of "there's some uncertainty" - it's a case of _unknown_
> uncertainty, with a blind faith that the uncertainty is small enough
> that the predictions have some usefulness.
>
>
>
>
>
> My safety podcast: disastercast.co.uk
> My mobile (from October 6th): 0450 161 361
>
> On 26 January 2015 at 13:21, Peter Bernard Ladkin
> <ladkin at rvs.uni-bielefeld.de> wrote:
>
>     On 2015-01-26 12:41, DREW Rae wrote:
> > A simple thought experiment. Let's say someone claims to have a
> suitable method of predicting
> > combined hardware/software reliability.
> > On what basis could they ever support that claim?
>
> Um, using well-tried statistical methods associated with the
> Bernoulli, Poisson and exponential
> distributions, as taught in most basic statistics courses. (Such as
>     http://www.math.uah.edu/stat/bernoulli/ and
>     http://www.math.uah.edu/stat/poisson/. Bev put me on to
> these. They are pretty good! I used to refer to Feller, but Bev
> thought that was "ancient". It's not
> that ancient. Came out the year before I was born.)
>
> > I would argue that such a claim about a method
> > intended for real-world use is empirical in nature, and can only
> be validated empirically.
> > Unfortunately this requires an independent mechanism for counting
> the failures, and that there be
> > enough failures to perform a statistical comparison of the
> prediction with reality.
>
> Methods of assessing reliability of SW are normally predicated on
> *no failures having occurred for a
>     certain number of trials*. Provided that no failures have been
> observed, the conclusion that the
> failures have a specified low occurrence rate may be drawn with a
> specified level of confidence,
> dependent on the number of trials observed. I mean, this is just
> basic statistical methodology, is
> it not?
>
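For concreteness, the zero-failure bound described in the paragraph above can
be sketched in Python as follows; the pfd target and the 99% confidence level
are illustrative choices, not figures from the thread:

import math

def pfd_upper_bound(n_trials: int, confidence: float = 0.99) -> float:
    # Largest per-demand failure probability still consistent with n_trials
    # independent, failure-free demands at the given confidence level:
    # solve (1 - pfd)**n_trials = 1 - confidence for pfd.
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_trials)

def trials_needed(pfd_target: float, confidence: float = 0.99) -> int:
    # Failure-free demands needed to claim pfd <= pfd_target at the given confidence.
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - pfd_target))

print(trials_needed(1e-4))      # about 46,050 failure-free demands for 10^-4 at 99% confidence
print(pfd_upper_bound(46050))   # roughly 1e-4

The bound inherits every assumption behind it: independent demands, an
operational profile that matches real use, and a trustworthy oracle for
deciding that no demand in fact failed.
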
> That you are talking about failures, and counting failures, suggests
> to me that you're not au fait
> with the general approach to statistical assessment of SW reliability.
>
> > Conclusion: No method for predicting hardware/software
> reliability can actually be shown to
> > accurately predict hardware/software reliability. All claims
> about hardware/software reliability are
> > constructed using methods that themselves haven't been adequately
> validated.
>
> Dear me! <PBL restrains himself for fear of being voted off his own
> list :-) >
>
> Any method of assessing the reliability of a system is going to rest
> on a considerable amount of
> uncertainty. Say you want to *prove* your system satisfies its spec.
> You say you have listed all the
> proof obligations of your SW? How reliable is the listing process?
> You say you've discharged all the
> proof obligations? How reliable is your proof checker? And so on.
> How can you deal with that
> uncertainty without using statistical methods at some point? I don't
> think you can.
>
> And then there is practicality. Even if everything you say were to
> be true (note this is a
>     counterfactual conditional!), large amounts of safety-relevant SW are
> now sold in the marketplace on
> the basis that the user may rely on it doing its job ("yes, problems
> have arisen but these are the
> measures used to fix them and the problems haven't occurred since").
> The validity of such assurances
> is low to marginal. Much of the thrust of our approach to the
> statistics is to try to encourage
> people to keep better records and to pay attention to appropriate
> inference rather than saying
> "these hundred clients have been using it and previous versions for
> a decade and a half and only one
> has been unhappy enough with the product to go to arbitration about
> it". And for the clients of such
> vendors to demand appropriate statistics rather than be content with
> such claims.
>
> PBL
>
> Prof. Peter Bernard Ladkin, Faculty of Technology, University of
> Bielefeld, 33594 Bielefeld, Germany
> Je suis Charlie
>     Tel+msg +49 (0)521 880 7319
>     www.rvs.uni-bielefeld.de
>
>
>
>
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE
--
Peter Bishop
Chief Scientist
Adelard LLP
Exmouth House, 3-11 Pine Street, London, EC1R 0JH
http://www.adelard.com
Recep: +44-(0)20-7832 5850
Direct: +44-(0)20-7832 5855