[SystemSafety] Difference between software reliability and astrology

Thu Aug 22 10:42:11 CEST 2024

I'm not an expert, but I have been over this a couple of times with representatives from the FAA.

So here goes, for what it's worth.

The first thing to note is that the figure given is for electronic/electrical hardware and excludes software.

Guidance from the FAA is that software included in any FTA should be assigned a failure rate of zero. The rationale is that software failure rates cannot, in general, be reliably estimated, hence the reliance on DO-178.

The failure rate applies to a single aircraft of a single type and, as pointed out, is the sum of all possible failures in a system. The first step is to apportion failure rates to critical systems, e.g. engines, flight control, etc. The apportionment depends on things such as the number of engines and how many have to operate either to get the hull off the ground or to keep it in the air.

One of the interesting features is that some of the 3-way voting is very straightforward on larger aircraft. For example, on control surfaces three actuators are used such that any two can overpower the third, i.e. 3-way voting via brute force.
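
As a rough illustration of why a two-out-of-three arrangement helps, here is a minimal sketch of the arithmetic. It assumes the three actuators fail independently with the same per-hour probability, which ignores common-cause failures, and the value of p_channel is purely hypothetical (it is not a figure from the FAA or from this thread).

```python
def two_of_three_failure(p: float) -> float:
    """Probability that at least two of three independent channels fail.

    Assumes identical, independent channels (no common-cause failures).
    """
    # Exactly two fail (three ways to pick the survivor) plus all three fail.
    return 3 * p**2 * (1 - p) + p**3


# Hypothetical per-hour failure probability of a single actuator channel.
p_channel = 1e-4

print(f"single channel: {p_channel:.3e} per hour")
print(f"2-of-3 voted:   {two_of_three_failure(p_channel):.3e} per hour")
```

The voted figure scales roughly as 3p^2, so it only holds up if the independence assumption does.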

-----Original Message-----
From: systemsafety [mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of Phil Koopman
Sent: 22 August 2024 00:59
To: systemsafety at lists.techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Difference between software reliability and astrology

I was waiting for an aviation expert to jump in on this, but haven't 
seen it, so I will contribute some thoughts.

The interpretation of 1e-9 / hr for aviation is a bit subtle.

Indeed this number comes from FAA AC 25.1309-1A
https://www.faa.gov/regulations_policies/advisory_circulars/index.cfm/go/document.information/documentID/22680

Page 7: "Catastrophic failure conditions must be Extremely Improbable"

Page 15: "Extremely Improbable failure conditions are those having a 
probability on the order of 1 x 10^-9 or less"   (1e-9)

Page 14:  In the context of Qualitative Probability Terms: "Extremely 
Improbable failure conditions are those so unlikely that they are not 
anticipated to occur during the entire operational life of all airplanes 
of one type."

So my takeaway is that 1e-9 applies to all aircraft OF ONE TYPE and not 
all aircraft in the fleet. Keep in mind this was written in 1988 when 
the skies were a lot less crowded.  So someone did some back of envelope 
math on flight hours per day, number of aircraft of a popular type, and 
airframe lifetime and came up with this number.
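
A back-of-envelope calculation of that kind might look like the sketch below. Every input here (fleet size, flight hours per day, service life) is a hypothetical placeholder, not a figure taken from the AC or from anyone on this list.

```python
def expected_events(rate_per_hour: float,
                    fleet_size: int,
                    hours_per_day: float,
                    service_years: float) -> float:
    """Expected number of events over the whole life of one aircraft type."""
    # Total flight hours accumulated by every airframe of the type.
    fleet_hours = fleet_size * hours_per_day * 365 * service_years
    return rate_per_hour * fleet_hours


# Hypothetical inputs for one popular aircraft type.
events = expected_events(rate_per_hour=1e-9,
                         fleet_size=1000,
                         hours_per_day=10.0,
                         service_years=25.0)

print(f"expected events over the type's life: {events:.5f}")
```

With these made-up inputs the expected count stays below one, which matches the "not anticipated to occur" wording. A larger fleet or more hours per day pushes it up in direct proportion.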

Also note that this is for a "failure condition" and is not the 
acceptable failure level for the aircraft. I believe there is an 
assumption that perhaps 10 different failure conditions might all be 
possible, making the aircraft loss rate an order of magnitude worse per 
hour (but I might not be remembering the number 10 correctly -- I don't 
know if it is really written down anywhere).
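
To make the order-of-magnitude step concrete, here is a small sketch. It assumes the failure conditions are independent and each sits exactly at 1e-9 per hour. The count of 10 is the half-remembered number above, not a confirmed requirement.

```python
import math


def aircraft_rate_exact(p_condition: float, n_conditions: int) -> float:
    """Probability per hour that at least one of n independent conditions occurs."""
    # 1 - (1 - p)^n, computed via log1p/expm1 to keep precision for tiny p.
    return -math.expm1(n_conditions * math.log1p(-p_condition))


p_condition = 1e-9   # per-hour probability of one catastrophic failure condition
n_conditions = 10    # assumed number of such conditions (unconfirmed)

approx = n_conditions * p_condition   # simple sum, valid when p is tiny
exact = aircraft_rate_exact(p_condition, n_conditions)

print(f"sum approximation: {approx:.3e} per hour")
print(f"exact (independent): {exact:.3e} per hour")
```

At these magnitudes the simple sum and the exact expression are indistinguishable for practical purposes, so the order-of-magnitude reading is a fair summary.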

With modern aircraft there might be a lot more of any one type of 
aircraft, and they might fly more hours + more years. And there might be 
a lot more pieces of kit that can fail catastrophically. So this amounts 
to a legacy number that is not necessarily closely tied to current 
systems.  It might easily be a factor of 10-100 too permissive if the 
objective is to never have a failure on any aircraft of one type.

That having been said, at some point the number becomes so low that it 
is likely failure conditions not anticipated in design become the long 
pole in the tent.   If I remember correctly Concorde was designed to 
1e-10/hr and that is pretty much how things turned out.

Nonetheless when combined with improved safety management system 
approaches the industry seems to have been doing pretty well when they 
don't game the safety system.

Yes, the numbers for self-driving cars are challenging because of number 
of vehicles and hours. However on average a catastrophic mishap involves 
1-2 people rather than 100-200. So that becomes a complex discussion.  
I'll note that ISO 26262 characterizes things as improbable if they are 
unlikely to happen to any single vehicle -- not the fleet of vehicles.

The numbers for autonomous air taxis (1-2 person electric passenger air 
vehicles) are different, and my understanding is that they are 
controversial (some proposed numbers are more like cars than heavy jets).

If any aviation experts can improve this description I welcome it, 
because this topic comes up surprisingly frequently in various 
discussions I have. And while I've done a bit of work in aviation, I 
spent most of my time on automotive.

-- Phil Koopman

On 8/21/2024 7:26 PM, Derek M Jones wrote:
> Steve,
>
> Thanks for the numbers update.
>
>> 5 hours per day is way too low. Airplanes are very expensive, 
>> airlines are low profit margin businesses (which is why they are so 
>> interested in other, more highly profitable side business like credit 
>> cards), and airplanes only earn revenue when they are in the air.
>
> I was not sure whether there was a long tail of less
> frequently used aircraft.
>
>> So if you double or triple your numbers below to account for 10-15 
>> flight hours per day instead of the 5 you used, you get:
>>
>> — 1 X 10^-5 equates to 2.5 to 3.75 Abnormal procedures per day
>>
>> — 1 X 10^-7 equates to one Emergency procedure or Airplane damage 
>> every 30 to 45 days
>>
>> — 1 X 10^-9 equates to one Catastrophic Accident every 6 to 10 years
>
> To me 6-10 years is not Extremely improbable.
> Perhaps the reliability figures were chosen when there were an order
> of magnitude fewer aircraft.
>
> Multiplying these values by lots of orders of magnitude implies
> that self-driving car incidents are going to be routine.
>

-- 
Phil Koopman    m: 412-260-5955    <phil.koopman at hushmail.com>
