[SystemSafety] A Series of SW-related Sociotechnical System Failures

Les Chambers les at chambers.com.au
Wed Jan 21 00:39:52 CET 2015


Peter
I read all those words. Can you confirm my impressions:
1) you are suggesting that systems should be thoroughly specified before
they are built
2) further you are suggesting that the usability of a system, as specified,
should be validated for its target user community before and during its
construction; certainly before it is foisted on unwitting users. In short
one should respect one's users' right to simplicity, accuracy and ease of
use.
If the answer is yes to both these propositions I would watch my back if I
were you. You sound like a dangerous radical.

In general, I weep for the software industry. Instances such as the one you
have documented are all too common. 60 years on from the first introduction
of computing technology into human systems, people who allocate capital for
the construction of these systems are unwilling to pay for what we class as
best practice (I call it common sense).  Instead they prefer to pick up the
pieces of a failed system at much higher cost. The Obamacare debacle 
http://www.chambers.com.au/public_resources/case_study/obamacare/saving-obamacare-case-study.pdf
http://www.chambers.com.au/public_resources/case_study/obamacare/saving-obamacare-case-study-analysis.pdf

... and the Queensland Health payroll system are classic examples of this.
The developers often know perfectly well that the system will fail in
production (as was the case with both of these systems), yet it is deployed
anyway, usually because of a political commitment or an ill-advised
decision by a non-technical manager or politician.
Even NASA is not immune. I recently heard Chris Hadfield relate the story of
a software upgrade in the International Space Station. It failed. All the
lights went out, machinery wound down, and they had to rummage around in the
archives for some archaic software that could operate the station on limited
functionality. This was a case of migration to the boundary by people
eminently qualified to do so.

Why, even last Sunday afternoon Microsoft visited a similar scenario on me,
and I suspect millions of other Windows 8 users. For some time I've been
getting suggestions I should upgrade to Windows 8.1. On the weekend this
morphed from a suggestion to a command (the monopolistic arrogance of these
people). The message said the upgrade would occur in four hours and there
was no button to say no. I closed the message window and got on with my
work. Two hours later I went out leaving my machine switched on, forgetting
about the Microsoft command. When I returned that evening I discovered the
upgrade had been done without my authorisation. Attempting to use Microsoft
Word 2007 I discovered it was unstable. It then stopped working altogether.
There followed a three-hour investigation on Internet forums, wherein I
discovered that the bog-standard Windows 8.1 upgrade does not include
everything you need for Microsoft Office to keep operating. There are, in
fact, 20 additional "enhancements" relating to 64-bit processors that must
also be installed. Now my Microsoft Office suite is working, sort of.
PowerPoint is still looking for a networked printer that no longer exists
and will not allow me to select another printer.
In short, Microsoft issued a software upgrade that destroyed the integrity
of its own signature software products. In this case, for non-technical
users, there is no boundary to migrate to. It's just a bridge too far.
I am an optimist, though. At some point the people with the money will look
beyond the initial capital cost of software construction and take pity on
the poor users, and on the businesses and taxpayers who have to pay the
cost of ownership - some of them with their lives.
Les

-----Original Message-----
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of
Peter Bernard Ladkin
Sent: Tuesday, January 20, 2015 9:37 PM
To: The System Safety List
Subject: [SystemSafety] A Series of SW-related Sociotechnical System
Failures

There's a new paper on the RVS publications page at
http://www.rvs.uni-bielefeld.de/publications/Papers/LadkinSocioTechDB20150113.pdf

It's not about a safety-critical system. It's about my experience with a
WWW-based ticketing system.
However, the observations are very similar. People aren't thinking about -
or specifying - the system function, and they are in particular not
checking that the implemented system is (in this case, isn't) a refinement
of the system function as it should be. Operators are apparently adapting,
as Rasmussen says they do - Migration to the Boundary - but it's not clear
to me that they should be.

What managed to happen in this case is that a system with virtually 100%
reliability over years went down to 39% reliability in the last year and a
half. So much for computers helping!

Best practice in design and evaluation is the same, it seems to me, as in
critical systems. That should be good news, on the basis that we need to
keep on banging the same old drum. But it could be bad news if we are doing
so in a vacuum......

PBL

Prof. Peter Bernard Ladkin, Faculty of Technology, University of Bielefeld,
33594 Bielefeld, Germany
Je suis Charlie
Tel+msg +49 (0)521 880 7319  www.rvs.uni-bielefeld.de




_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE
