[SystemSafety] Difference between software reliability and astrology

Paul Sherwood paul.sherwood at codethink.co.uk
Wed Aug 21 20:45:26 CEST 2024


This is extremely helpful, Steve - thank you!

On 2024-08-21 17:59, Steve Tockey wrote:
> Paul Sherwood wrote:
> 
> _“Can you (or anyone on the list) help me understand how the
> committee arrived at 10^-5, 10^-6, 10^-7, 10^-8 as targets?”_
> 
> They come from FAA Advisory Circular AC 25.1309-1A System Design and
> Analysis
> (https://www.faa.gov/documentLibrary/media/Advisory_Circular/AC_25_1309-1A.pdf)
> 
> 
> Section 9.e on pages 13 & 14 (quote):
> 
> e. Qualitative Probability Terms. When using qualitative analyses to
> determine compliance with § 25.1309(b), the following descriptions of
> the probability terms used in this regulation and this AC have become
> commonly accepted as aids to engineering judgment:
> 
> (1) Probable failure conditions are those anticipated to occur one or
> more times during the entire operational life of each airplane.
> 
> (2) Improbable failure conditions are those not anticipated to occur
> during the entire operational life of a single random airplane.
> However, they may occur occasionally during the entire operational
> life of all airplanes of one type.
> 
> (3) Extremely Improbable failure conditions are those so unlikely that
> they are not anticipated to occur during the entire operational life
> of all airplanes of one type.
> 
> Section 10.b on pages 14 & 15 (quote):
> 
> b. Quantitative Probability Terms. When using quantitative analyses to
> help determine compliance with § 25.1309(b), the following
> descriptions of the probability terms used in this regulation and this
> AC have become commonly accepted as aids to engineering judgment. They
> are usually expressed in terms of acceptable numerical probability
> ranges for each flight-hour, based on a flight of mean duration for
> the airplane type. However, for a function which is used only during a
> specific flight operation; e.g., takeoff, landing, etc., the
> acceptable probability should be based on, and expressed in terms of,
> the flight operation's actual duration.
> 
> (1) Probable failure conditions are those having a probability greater
> than on the order of 1 X 10^-5.
> 
> (2) Improbable failure conditions are those having a probability on
> the order of 1 X 10^-5 or less, but greater than on the order of 1 X
> 10^-9.
> 
> (3) Extremely improbable failure conditions are those having a
> probability on the order of 1 X 10^-9 or less.
> 
> Assuming an average of 150k total flight hours over a 30 year service
> life for a single airplane gives an average of 5000 flight hours per
> year. If my calculations are correct, 1 X 10^-5 probability per flight
> hour would be on the order of once in 20 years of operational service
> for a single airplane. That would mean about 1.5 times over the entire
> service life of that airplane.
> 
> Again, assuming 150k total flight hours for a single airplane, 1 X
> 10^-7 would put it at somewhat less than once over the airplane’s
> entire service life.
> 
> Getting to 1 X 10^-9 seems to assume somewhere on the order of a
> couple of hundred airplanes of any one given type.
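
A quick way to check this kind of arithmetic is sketched below. It
assumes the per-flight-hour probability can be treated as a constant
rate, so that expected occurrences are simply probability times exposure
hours. The 150k hours and 30 years come from the message above; the
200-airplane fleet is only an illustrative reading of "a couple of
hundred", and all function names are hypothetical.

```python
# Assumptions taken from the message above: 150,000 flight hours over a
# 30-year service life. The names below are illustrative only.
LIFETIME_HOURS = 150_000
SERVICE_YEARS = 30
HOURS_PER_YEAR = LIFETIME_HOURS / SERVICE_YEARS  # 5000 flight hours per year


def expected_per_airframe(prob_per_hour):
    """Expected occurrences over one airplane's service life."""
    return prob_per_hour * LIFETIME_HOURS


def years_between_occurrences(prob_per_hour):
    """Mean years of service between occurrences on one airplane."""
    return 1.0 / (prob_per_hour * HOURS_PER_YEAR)


def expected_per_fleet(prob_per_hour, fleet_size):
    """Expected occurrences over the service life of a whole fleet."""
    return expected_per_airframe(prob_per_hour) * fleet_size


# Print the three thresholds named in AC 25.1309-1A section 10.b.
# The fleet size of 200 is an assumption, not a figure from the AC.
for p in (1e-5, 1e-7, 1e-9):
    print(f"{p:.0e}/h: {expected_per_airframe(p):.3g} per airframe, "
          f"{years_between_occurrences(p):.3g} years between occurrences, "
          f"{expected_per_fleet(p, 200):.3g} per 200-airplane fleet")
```

Changing LIFETIME_HOURS or the fleet size shows how sensitive the
"lifetime of one airplane" and "lifetime of the fleet" readings are to
those assumptions.
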
> 
> Cheers,
> 
> — steve
> 
> On Aug 21, 2024, at 8:28 AM, Prof. Dr. Peter Bernard Ladkin
> <ladkin at techfak.de> wrote:
> 
> On 2024-08-21 15:48 , Paul Sherwood wrote:
> 
>> On 2024-08-21 12:08, Prof. Dr. Peter Bernard Ladkin wrote:
>> 
>>> First, you are talking about using an operating system. An
>>> operating system is a continuously-running system, not a discrete
>>> on-demand function which returns an output value.
>> 
>> Hmmm. Let's break that apart...
> 
> Let's not. For the purposes of assessing reliability, it's not that
> relevant.
> 
>>> So its failure behaviour is not a Bernoulli process. You can drop
>>> the "Bernoulli" bit.
>> 
>> From a physical perspective, the behaviour of such a constructed
>> system appears continuous, but considering what the OS itself is
>> actually doing, every action is discrete.
> 
> So what? Suppose you have a sensor sampling at 400 Hz (typical for
> aircraft-dynamics sensors, for example). The piece of SW dealing with
> those readings (aka control system) is going to want to ascertain
> rates of change and other stuff, so it needs to keep a history of
> readings (over a short period of time). If you have history variables
> then you aren't memoryless. If you're not memoryless then you aren't a
> Bernoulli process, discrete or not.
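
The point about history variables can be illustrated with a minimal
sketch. It is not a model of any real control system: the class name,
the window length and the readings are all made up, and only the 400 Hz
sample rate comes from the message above. It shows that two instances
given the same current reading produce different outputs because their
stored histories differ.

```python
from collections import deque


class RateEstimator:
    """Hypothetical rate-of-change estimator that keeps a short history."""

    def __init__(self, sample_hz=400, window=4):
        self.dt = 1.0 / sample_hz            # seconds between samples
        self.history = deque(maxlen=window)  # the "history variables"

    def update(self, reading):
        """Store the reading and return the average slope over the window."""
        self.history.append(reading)
        if len(self.history) < 2:
            return 0.0
        span = (len(self.history) - 1) * self.dt
        return (self.history[-1] - self.history[0]) / span


# Two estimators with different pasts (readings are invented).
a = RateEstimator()
b = RateEstimator()
for r in (0.0, 0.0, 0.0):
    a.update(r)
for r in (3.0, 2.0, 1.0):
    b.update(r)

# The same current input gives different outputs: behaviour depends on
# earlier inputs, so successive steps are not independent trials.
rate_a = a.update(1.0)
rate_b = b.update(1.0)
print(rate_a, rate_b)
```
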
> 
>>> But keep in mind you can't be letting [the OS] fail. For SIL 4
>>> safety functions, it has to be running more than 100 million
>>> operating hours between failures on average. That is the
>>> constraint from 61508-1 Table 3, which is independent of any means
>>> of describing the failure behaviour.
>> 
>> Understood, but I wonder a bit about the numbers in the table. Can
>> you (or anyone on the list) help me understand how the committee
>> arrived at 10^-5, 10^-6, 10^-7, 10^-8 as targets?
> 
> (1) There is no theoretical reason why powers of 10 are chosen.
> 
> (2) They come from the aerospace regulations, and the "accepted means
> of compliance". The regs contain certain powers of ten for "hazardous
> condition" and "catastrophic condition" and sometimes other hazard
> classes ("minor" and "major") and the AMC nowadays interprets phrases
> such as "not expected to occur within the lifetime of the aircraft
> [fleet]" into probabilities expressed in powers of ten. The reason is
> likely that civil air transport was having continual and improving
> success with what in effect turns out to be its risk matrix, for half
> a century before 61508 came along.
> 
> PBL
> 
> Prof. i.R. Dr. Peter Bernard Ladkin, Bielefeld, Germany
> www.rvs-bi.de
> 
> _______________________________________________
> The System Safety Mailing List
> systemsafety at TechFak.Uni-Bielefeld.DE
> Manage your subscription:
> https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety

