[SystemSafety] The mindset for safety-critical systems design
Olwen Morgan
olwen.morgan at btinternet.com
Tue Sep 18 17:11:34 CEST 2018
The two cardinal principles of critical systems design are:
1. Whatever is not there cannot go wrong (so do not include any
functions that you do not need).
2. Whatever is there is less likely to go wrong the simpler it is.
Every Tom, Dick and Harry will quote you the second principle, but it is
much rarer to find people who recognise the first explicitly.
The outstanding example of the first principle is the Soyuz spacecraft.
The Russians worked out by simple arithmetic that the overall mass of a
crewed spacecraft is driven by how much of it has to be returned to
Earth safely. Hence much of the living accommodation in the Soyuz is in
the orbital module, which is jettisoned along with the service module
before re-entry, so that only the descent module has to return to Earth
intact; that module contains all and only the systems required to bring
the cosmonauts back safely. As a result, the Soyuz vehicle
for the circumlunar mission (what Apollo 8 first did) provided as much
living accommodation as the Apollo craft while the entire Soyuz craft,
including service module, descent module and orbital module, weighed
about the same as the Apollo command module on its own.
Much the same thinking is seen in the design of the Vostok craft in
which, to minimise launch weight, the cosmonaut descended by parachute
after ejecting from the capsule, which crash-landed, thereby removing
the need for the soft-landing retro-rockets used in the Soyuz. Also, as
I saw in a Soyuz craft at an exhibition in Washington DC, there are
bungee pockets around the walls of the capsule to keep hand-held devices
needed during the voyage from floating about in zero-g. Low-tech, yet
arguably the simplest possible piece of design for that purpose.
Now the poser: *Why do people readily recall the second principle (even
if only to pay it lip-service) yet often struggle, even when prompted,
to recall the first?*
It is failure to recognise the first principle that ends up producing
neonatal ventilators that run Windows 7 Embedded (see a previous
posting). Ventilation is a time-triggered cyclic process that does not
need an operating system to support it (a sketch of what I mean appears
below). If you need to provide data logging to non-volatile media, as
the said ventilator did, you still do not need Windows. Nor do you need
Windows for offline software-update functions or for driving a display.
And that is before counting the greater cost of chips capable of running
Windows compared with, say, dual-core lockstep microcontrollers that
would have supported a much safer software design.

In the firm in which I saw this, only two other engineers gave me any
impression that they understood the importance of the first principle.
Ironically, one of them decided that he could take it into account by
configuring into W7 Embedded only those functions that were needed if
Windows were there at all (one out of ten for effort). The other, who
thought using Windows was wrong altogether, was actually a hardware
engineer.
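By way of illustration, here is a minimal sketch of the kind of
time-triggered cyclic executive I have in mind: one periodic timer
interrupt plus a bare-metal main loop that dispatches each task at a
fixed period. It is only a sketch, written in C under assumed conditions
(a single-core microcontroller with a 1 ms timer tick), and the function
names (timer_init, control_task, logging_task, display_task) are
hypothetical placeholders, not code from the ventilator in question.

/*
 * Minimal time-triggered cyclic executive (illustrative sketch only).
 * Assumes a bare-metal, single-core microcontroller with one periodic
 * hardware timer configured to call timer_isr() every millisecond.
 * All hardware-facing functions below are hypothetical placeholders.
 */
#include <stdint.h>

#define MINOR_FRAME_MS    1u    /* timer tick period                           */
#define CONTROL_PERIOD_MS 10u   /* ventilation control loop every 10 ms        */
#define LOG_PERIOD_MS     100u  /* logging to non-volatile store every 100 ms  */
#define DISPLAY_PERIOD_MS 250u  /* front-panel display refresh every 250 ms    */

static volatile uint32_t tick_count;  /* incremented by the timer interrupt */

/* The only asynchronous event in the whole design. */
void timer_isr(void)
{
    tick_count++;
}

/* Placeholder: configure a hardware timer to call timer_isr() periodically. */
static void timer_init(uint32_t period_ms) { (void)period_ms; }

/* Placeholder task bodies. */
static void control_task(void) { /* sample pressure/flow, run control law, drive valve */ }
static void logging_task(void) { /* append latest state record to non-volatile memory  */ }
static void display_task(void) { /* refresh readouts on the front-panel display        */ }

int main(void)
{
    uint32_t last_tick = 0u;

    timer_init(MINOR_FRAME_MS);

    for (;;) {
        /* Wait for the next minor frame: no OS, no scheduler, no blocking calls. */
        while (tick_count == last_tick) {
            /* idle; the core could sleep here until the next interrupt */
        }
        last_tick = tick_count;

        /* Dispatch each task at its fixed period. Behaviour is fully
           determined by the frame counter, so timing is easy to verify. */
        if (last_tick % CONTROL_PERIOD_MS == 0u) control_task();
        if (last_tick % LOG_PERIOD_MS     == 0u) logging_task();
        if (last_tick % DISPLAY_PERIOD_MS == 0u) display_task();
    }
}

Everything that determines the scheduling behaviour is visible in those
few lines of main(), which is precisely what makes such designs
straightforward to analyse for worst-case timing.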
Anyone got any ideas why software engineers in particular get the second
principle but miss the first? It beats me. If you miss the first
principle, you'll never retrieve the situation even if you stick
rigorously to the second.
Olwen
PS: Fear not. My postings will tail off when I've stopped dumping my
more egregious examples of insanity for your perusal.