[SystemSafety] At least PBL is now talking to me again ...
Brent Kimberley
brent_kimberley at rogers.com
Sun Jul 12 02:13:09 CEST 2020
At the risk of dumbing things down, I have found that complex multi-disciplinary systems (e.g. operator-electro-mechanical-software-chemical...) tend to be unpredictable - even when each of the components was thoroughly vetted, certified and calibrated by experts in their prime.

The cause can be as simple as: a decision to use angles instead of quaternions (singularities); unrealistic design assumptions; delta requirements analysis instead of global requirements analysis (requirements conflicts); failure to optimize globally across multiple dimensions (mass, energy, momentum, time, jitter, sample frequency); changing physical systems without updating models & transforms; silent bill of materials changes; last-minute cable/geometry changes; silent depot changes; pressure to say yes/pressure to say no; bit errors/upsets; etc.

It's well and good to say the "O ring" was within spec and that the problem was elsewhere - in a crisis the question becomes: is / was the system in spec? This is where devices like flight data recorders and flight reporting can prove useful - provided they are used constructively & responsibly. At the very least, perhaps you could use the data to determine the failure power law of the fielded system - or of the fleet of systems.
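
To make the "angles instead of quaternions (singularities)" point concrete, here is a minimal sketch of the singularity in question. It assumes a yaw-pitch-roll (Z-Y-X) Euler convention, which is my choice for illustration and not something stated above; the function names are likewise hypothetical. At a pitch of 90 degrees the rotation depends only on the difference (roll - yaw), so two different angle triples describe the same attitude and one degree of freedom is lost.

```python
import math

def euler_zyx_to_matrix(yaw, pitch, roll):
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians.

    The Z-Y-X convention is an assumption made for this illustration.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]

def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 3x3 matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def attitude_gap(pitch_deg):
    """Compare two angle triples that share the same (roll - yaw) at a given pitch."""
    p = math.radians(pitch_deg)
    first = euler_zyx_to_matrix(math.radians(10), p, math.radians(30))
    second = euler_zyx_to_matrix(math.radians(40), p, math.radians(60))
    return max_abs_diff(first, second)

if __name__ == "__main__":
    # Away from the singularity the two triples are clearly different attitudes;
    # at 90 degrees pitch they collapse onto the same rotation.
    for pitch_deg in (30, 60, 89, 90):
        print(pitch_deg, attitude_gap(pitch_deg))
```

A quaternion (or rotation-matrix) state avoids this particular collapse, which is why the representation decision alone can change how a system behaves near vertical attitudes.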
On Friday, July 10, 2020, 5:04:23 p.m. EDT, Olwen Morgan <olwen at phaedsys.com> wrote:
OK. I suspect that we probably have differences of perspective rather
than hard technical disagreements on matters of important substance. I'm
still honestly puzzled by some of the things you are saying but at least
we're not trying to knock eight bells out of each other on what seeks to
be a civilised list. ... :-))
On 10/07/2020 20:20, Peter Bernard Ladkin wrote:
<snip>
> You are way off the usual aeronautical safety ball.
Sod the "aeronautical" safety ball. I believe, with due respect, that
I'm not actually way off the common-or-garden safety ball.
<snip>
> ... they go wrong in specific ways. ...
Individual things go wrong in specific ways but often disasters are a
chain of events whose sequential occurrence was never envisaged by
system designers. By analogy with Michael J's "long span" requirements,
one might call such things "long-span" failures.
<snip>
> "Stress testing" SW is not going to get you away from that fundamental
> situation.
<snip>
I beg to differ. One of the advantages of stress testing, at least in
the way I have understood and used it, is that it's an opportunity to
simulate low-probability/high-consequence chains of failures. If you
start designing stress tests from the question, "What could possibly go
wrong?" and are focussing on chains of events, you give yourself at
least a sporting chance of detecting circumstances in which your
short-span assumptions link together to engender a long-span failure and
consequent disaster.
And common system safety practices show you where to start on designing
stress tests. You simply aggregate fault trees and cause-consequence
graphs and traverse them systematically applying a stressing
boundary-value test derivation rationale. (I have actually used this
approach, based on graphs of internal signal processing paths, to test
an ADAHRS instrument for a light aircraft. After giving it a hefty
caning - far beyond the normal functional testing - I was unable to
induce long-span failure in the signal processing algorithms, which,
AFAI could see, were very robust - probably owing to well-designed
signal level and rate clipping and smoothing.)
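
As a rough illustration only (not the actual procedure or tooling used on the ADAHRS), the idea of traversing a signal-path graph and applying a stressing boundary-value rationale could be sketched as below. The graph, node names, input ranges and the eps/overshoot parameters are all hypothetical. The sketch enumerates source-to-sink paths and then stresses every source input at once, at and well beyond its nominal limits, so that combinations of individually "short-span" conditions get exercised together.

```python
from itertools import product

# Hypothetical signal-path graph: node -> list of downstream nodes.
GRAPH = {
    "gyro": ["rate_clip"],
    "accel": ["smooth"],
    "rate_clip": ["fusion"],
    "smooth": ["fusion"],
    "fusion": [],
}

# Hypothetical nominal input ranges for the source nodes (arbitrary units).
RANGES = {"gyro": (-300.0, 300.0), "accel": (-16.0, 16.0)}

def paths_from(graph, node):
    """Return every path from `node` to a sink (a node with no successors)."""
    successors = graph[node]
    if not successors:
        return [[node]]
    return [[node] + rest
            for nxt in successors
            for rest in paths_from(graph, nxt)]

def boundary_values(lo, hi, eps=1e-3, overshoot=10.0):
    """Values at, just inside, just outside and far outside the nominal range."""
    span = hi - lo
    return [lo - overshoot * span, lo - eps, lo, lo + eps,
            hi - eps, hi, hi + eps, hi + overshoot * span]

def stress_cases(ranges):
    """Yield one test case per combination of boundary values across all sources."""
    sources = sorted(ranges)
    grids = [boundary_values(*ranges[s]) for s in sources]
    for combo in product(*grids):
        yield dict(zip(sources, combo))

if __name__ == "__main__":
    for source in sorted(RANGES):
        print(source, paths_from(GRAPH, source))
    print(sum(1 for _ in stress_cases(RANGES)), "combined stress cases")
```

In practice each generated case would be fed to the unit under test and the outputs along each enumerated path checked against an oracle; the combinatorial growth means real graphs need pruning, which this sketch does not attempt.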
It might not be what most aviation safety practitioners currently do -
but it's not rocket science.
Olwen