[SystemSafety] Statistical Assessment of SW ......
Peter Bishop
pgb at adelard.com
Mon Jan 26 19:36:25 CET 2015
You have put your finger on the key problem.
Statistical testing and confidence limits are straightforward and
technically sound *provided the assumptions are correct*.
If the assumptions are wrong, the bound no longer applies.
The confidence bound would then depend on the (unknown) probability that
the assumptions are wrong. So in the worst case:

E[pfd] = P(assumptions wrong) * 1 + P(assumptions right) * confidence limit

Not great unless you are *really* sure your assumptions are valid.
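
For concreteness, a minimal Python sketch of that worst-case sum; the figures
for P(assumptions wrong) and the tested bound are purely illustrative, not
taken from any real assessment:

# Worst-case expected pfd when the testing assumptions may not hold.
# Both figures below are illustrative assumptions.
p_assump_wrong = 0.01   # probability that the testing assumptions are wrong
pfd_bound = 1e-4        # pfd bound claimed from testing, valid only if the assumptions hold

# If the assumptions are wrong we can claim nothing, so take pfd = 1 in the worst case.
worst_case_e_pfd = p_assump_wrong * 1.0 + (1.0 - p_assump_wrong) * pfd_bound
print(f"Worst-case E[pfd] = {worst_case_e_pfd:.3g}")   # about 0.01

Even a 1% doubt about the assumptions swamps a 10^-4 bound obtained from the
testing itself.
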
Peter Bishop
DREW Rae wrote:
> Peter,
> You are correct that the language I used implied that there would be
> failures to count - that was imprecise. For the levels of reliability
> typically claimed, the failure count would be zero, and there's nothing
> wrong with determining upper bounds on likelihood based on a period with
> zero failure events. That doesn't remove the need for a reliable counter
> of failure events, though.
>
> My point is that the statistical models are internally valid for
> predicting reliability, but they can't externally validate their own
> predictions. There are any number of auxiliary assumptions necessary to
> claim that the test results + statistical assessment are actually
> reliable estimates, and those assumptions can't be tested. It isn't just
> a case of "there's some uncertainty" - it's a case of _unknown_
> uncertainty, with a blind faith that the uncertainty is small enough
> that the predictions have some usefulness.
>
>
>
>
>
> My safety podcast: disastercast.co.uk
> My mobile (from October 6th): 0450 161 361
>
> On 26 January 2015 at 13:21, Peter Bernard Ladkin
> <ladkin at rvs.uni-bielefeld.de> wrote:
>
>     On 2015-01-26 12:41, DREW Rae wrote:
> > A simple thought experiment. Let's say someone claims to have a
> suitable method of predicting
> > combined hardware/software reliability.
> > On what basis could they ever support that claim?
>
> Um, using well-tried statistical methods associated with the
> Bernoulli, Poisson and exponential
> distributions, as taught in most basic statistics courses. (Such as
>     http://www.math.uah.edu/stat/bernoulli/ and
>     http://www.math.uah.edu/stat/poisson/. Bev put me on to
> these. They are pretty good! I used to refer to Feller, but Bev
> thought that was "ancient". It's not
> that ancient. Came out the year before I was born.)
>
> > I would argue that such a claim about a method
> > intended for real-world use is empirical in nature, and can only
> be validated empirically.
> > Unfortunately this requires an independent mechanism for counting
> the failures, and that there be
> > enough failures to perform a statistical comparison of the
> prediction with reality.
>
> Methods of assessing reliability of SW are normally predicated on
> *no failures having occurred for a
>     certain number of trials*. Provided that no failures have been
> observed, the conclusion that the
> failures have a specified low occurrence rate may be drawn with a
> specified level of confidence,
> dependent on the number of trials observed. I mean, this is just
> basic statistical methodology, is
> it not?
>
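For concreteness, the zero-failure bound described in the paragraph above can
be sketched in Python as follows; the pfd target and the 99% confidence level
are illustrative choices, not figures from the thread:

import math

def pfd_upper_bound(n_trials: int, confidence: float = 0.99) -> float:
    # Largest per-demand failure probability still consistent with n_trials
    # independent, failure-free demands at the given confidence level:
    # solve (1 - pfd)**n_trials = 1 - confidence for pfd.
    alpha = 1.0 - confidence
    return 1.0 - alpha ** (1.0 / n_trials)

def trials_needed(pfd_target: float, confidence: float = 0.99) -> int:
    # Failure-free demands needed to claim pfd <= pfd_target at the given confidence.
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(1.0 - pfd_target))

print(trials_needed(1e-4))      # about 46,050 failure-free demands for 10^-4 at 99% confidence
print(pfd_upper_bound(46050))   # roughly 1e-4

The bound inherits every assumption behind it: independent demands, an
operational profile that matches real use, and a trustworthy oracle for
deciding that no demand in fact failed.
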
> That you are talking about failures, and counting failures, suggests
> to me that you're not au fait
> with the general approach to statistical assessment of SW reliability.
>
> > Conclusion: No method for predicting hardware/software
> reliability can actually be shown to
> > accurately predict hardware/software reliability. All claims
> about hardware/software reliability are
> > constructed using methods that themselves haven't been adequately
> validated.
>
> Dear me! <PBL restrains himself for fear of being voted off his own
> list :-) >
>
> Any method of assessing the reliability of a system is going to rest
> on a considerable amount of
> uncertainty. Say you want to *prove* your system satisfies its spec.
> You say you have listed all the
> proof obligations of your SW? How reliable is the listing process?
> You say you've discharged all the
> proof obligations? How reliable is your proof checker? And so on.
> How can you deal with that
> uncertainty without using statistical methods at some point? I don't
> think you can.
>
> And then there is practicality. Even if everything you say were to
> be true (note this is a
>     counterfactual conditional!), large amounts of safety-relevant SW are
> now sold in the marketplace on
> the basis that the user may rely on it doing its job ("yes, problems
> have arisen but these are the
> measures used to fix them and the problems haven't occurred since").
> The validity of such assurances
> is low to marginal. Much of the thrust of our approach to the
> statistics is to try to encourage
> people to keep better records and to pay attention to appropriate
> inference rather than saying
> "these hundred clients have been using it and previous versions for
> a decade and a half and only one
> has been unhappy enough with the product to go to arbitration about
> it". And for the clients of such
> vendors to demand appropriate statistics rather than be content with
> such claims.
>
> PBL
>
> Prof. Peter Bernard Ladkin, Faculty of Technology, University of
> Bielefeld, 33594 Bielefeld, Germany
> Je suis Charlie
>     Tel+msg +49 (0)521 880 7319
>     www.rvs.uni-bielefeld.de
>
>
>
>
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE
--
Peter Bishop
Chief Scientist
Adelard LLP
Exmouth House, 3-11 Pine Street, London, EC1R 0JH
http://www.adelard.com
Recep: +44-(0)20-7832 5850
Direct: +44-(0)20-7832 5855