[SystemSafety] Logic
Les Chambers
les at chambers.com.au
Fri Feb 21 02:52:02 CET 2014
Steve
You said: " I have a real-world example of a variant on this theme."
All I can say is: Hallelujah brother. How do you clap on an email list?
clap, clap, clap, clap (I guess).
For those interested in case studies, there is another one from my past
along the same lines (model-based development). Sounds like Alice in
Wonderland, but it's actually true.
Story: the soul of a chemical reactor
For many years engineers have used software to enhance the performance of
their machines. My first experience of this was in chemical processing (1975
- 1985). Working in a multidisciplinary team I developed process control
software for chemical processing reactors. Put simply, we made synthetic
latex, plastics, chlorine, insulating foams ... by controlling reaction
kinetics with software. Our goal was not only to make high quality product
but also to maximise the yield from our reactants and to make the most
efficient use of resources: energy, water, labour and so on. Given that most
of our reactants were either a threat to human health or potentially
explosive, all this had to be achieved with safety.
The software became the brains (nay the soul) of the plant, elevating the
operator to a supervisor of automated chemical production. Software also
allowed us to optimise the physical design of the plant (pipes, pumps and
reactor vessels). Using smart control we could actually reduce the amount of
plant hardware required without compromising safety.
From a business perspective, these applications were a resounding success.
With much tighter, intelligent control we were able to produce more
consistent quality at higher yields. Planned outages became less frequent as
smart software predicted problems and took automatic corrective action
before they escalated into a plant shutdown.
As for the computers themselves, they were nothing but a tool in the service
of chemistry; software was a means to an end. The programmers were actually
chemical engineers on a mission. They couldn't care less about the beauty of
their code; they were more interested in beautiful on-spec products.
For me this was more than a job, it was a meaning of life, it was the best
fun I ever had in my life and, to this day, remains the most successful
suite of software applications I have ever witnessed.
Building Complex Things
This software success story was largely due to the adoption of the standard
engineering approach to building a complex thing:
. Develop a clear understanding of the problem - then document it. All
projects started with something we called "English language", a clear
statement of the operating discipline structured such that it could be
easily transformed into design models - something similar to what is now
called Requirements State Machine Language. The English language was usually
high quality because it was written by plant engineers intimately familiar
with plant operations and reaction kinetics. As the plant engineers who
wrote the control programs were typically not expert in the application of
computers to control systems, I worked in a central support group that
trained them in analysis methods, control theory and programming techniques.
. Apply the best science to the problem. The applicable sciences were
the mathematics of control theory, and basic chemistry. Control theory, in
existence for some years, was augmented with sampled-data systems theory to
produce computer-based control. The chemical process technology was well
established and documented by technology centres responsible for maintaining
corporate memory of "the way we make chemicals".
. Partition the problem. Chemical processing plants could be very
large and complex with thousands of sensors and final control elements.
However, the control problem could be simply partitioned into various unit
operations (reactor, a heat exchanger, a premix tank, scrubber, distillation
column). We applied the finite state machine model to each unit operation
and developed mechanisms for cooperation between unit operations based on
state. This approach has since become known as "model-based development" -
that is, create a model of the system that will support detailed validation
of proposed behaviour before you write a line of code.
. Simplify. Our application language was a simplified variant of
Fortran. It had no looping constructs (no do-whiles, no do-untils, certainly
no go-tos). It could be taught to any engineer with foundation programming
skills (some operators with no programming skills became programmers).
. Apply standard engineering practices - no exceptions. The control
requirements of all plants were analysed using the same analysis method. All
process control programs were organised in the same way. They could be
easily read and understood by anyone in the world who had received basic
training. These techniques were successfully rolled out in three regions of
the USA and several European plants. The Asia-Pacific rollout became my
responsibility - it wasn't hard: there was a strong engineering culture in
place before I arrived and, as this was brand-new technology, there were no
preconceived ideas to overcome.
. Reuse elements of past solutions where possible. Successful control
strategies that had been proven elsewhere were reused. Technology centres
were tasked with making sure innovations in process control were
communicated and reuse was maximised. Some programming exercises morphed
into comprehensive copy and paste, with the attendant cost savings. My
support group also took responsibility for "remembering neat ways of doing
things," documenting them and promulgating them - complex stuff like
feed-forward control using process modelling or simple stuff like "open the
downstream valve before you start the pump - you idiot!!!"
. Apply strict quality control. It was easy for a newbie programmer to
stray from our best-known practices, so the dos and don'ts of control system
design and coding were well established, documented and rigorously enforced
in requirements and code reviews. We affected a pseudo-Nazi regime in this
respect.
. Perform analysis and simulation. From the beginning it was possible
to test our programs via primitive simulations of plant conditions using
dummy inputs. After a while we began to experiment with running our control
algorithms against full-blown plant simulations. The effort required to
analyse, specify and develop these simulations was roughly equivalent to
that ploughed into the control program itself. The outcome was plant
start-ups that took a week instead of a month; a massive cost saving.
. Manage risk - ask what could kill us next. Every plant I worked on
presented many opportunities to screw up. The ramifications varied from
destruction of plant equipment, to causing sickness or death, to triggering
explosions that would not only destroy the chemical processing complex, but
also the surrounding neighbourhood. There was therefore a formal approach to
risk management. A team was tasked with identifying dangerous states of each
plant. Code was then written to sense these conditions, abort any existing
control actions and return the plant to a safe operating state.
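To make the state-machine idea concrete, here is a minimal sketch in Python (not the simplified Fortran variant described above). It models a single unit operation as a finite state machine, checks for a dangerous condition first on every scan, and is exercised with dummy inputs in the spirit of the primitive simulations mentioned. All state names, sensor names and limits are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    CHARGING = auto()
    REACTING = auto()
    SAFE_SHUTDOWN = auto()


@dataclass
class Reading:
    """One scan of dummy sensor inputs (names and units are hypothetical)."""
    level_pct: float              # reactor fill level, percent
    temp_c: float                 # reactor temperature, degrees C
    start_requested: bool = False


MAX_SAFE_TEMP_C = 120.0    # assumed limit, for illustration only
CHARGE_TARGET_PCT = 80.0   # assumed fill target, for illustration only


def is_dangerous(r: Reading) -> bool:
    """Stand-in for the dangerous states a risk team would identify."""
    return r.temp_c > MAX_SAFE_TEMP_C


def next_state(state: State, r: Reading) -> State:
    """Transition logic for one scan. Note there are no loops in here."""
    # The safety check runs first and overrides any normal control action.
    if is_dangerous(r):
        return State.SAFE_SHUTDOWN
    if state is State.IDLE and r.start_requested:
        return State.CHARGING
    if state is State.CHARGING and r.level_pct >= CHARGE_TARGET_PCT:
        return State.REACTING
    # No transition fired. SAFE_SHUTDOWN therefore latches until an
    # explicit reset, which this sketch leaves out.
    return state


def run(readings):
    """Drive the state machine with a sequence of dummy readings."""
    state = State.IDLE
    trace = [state]
    for r in readings:
        state = next_state(state, r)
        trace.append(state)
    return trace


dummy_inputs = [
    Reading(level_pct=0.0, temp_c=25.0, start_requested=True),
    Reading(level_pct=50.0, temp_c=30.0),
    Reading(level_pct=85.0, temp_c=40.0),
    Reading(level_pct=85.0, temp_c=130.0),   # over-temperature: must abort
    Reading(level_pct=85.0, temp_c=60.0),    # cooled down, but stays latched
]
trace = run(dummy_inputs)
print([s.name for s in trace])
```

Keeping the transition logic in one loop-free function is what lets a plant engineer read it against the English-language statement of the operating discipline, one transition at a time.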
Benefits of the Engineering Approach
There were many positive outcomes from our engineering approach, the most
telling of which was:
... in 10 years of working in this application domain - with at least 10
projects running concurrently at any point in time, I NEVER once heard of or
experienced a project failure.
I attribute this to:
. Customer focus. The control system software NEVER failed to meet the
needs of the processing plant - mainly due to the customer being embedded in
the project and the high levels of expertise brought to bear on requirements
definition.
. Process control. The software development process was simple, well
defined and rigorously enforced. We had no choice: we were always part of a
larger systems project with immovable deadlines.
. Analysis. Analysis methods were mandated and therefore made
consistent across all plants. You could write any program you liked as long
as it implemented the plant control system as a set of cooperating finite
state engines. Further, formal analysis of reaction kinetics and the
equipment under control, followed by generation of simulations,
significantly reduced the resources required for plant start-ups.
. Early validation with model-based development. The use of the finite
state machine model allowed us to validate the overall plant control
strategy long before code was written. This eliminated overruns due to
rework. We discovered that the most complex element of control was the logic
around state transitions. This logic could be clearly stated in English and
validated by engineers highly experienced in plant operations, but with no
programming skills. If you like, it allowed non-programmers to look deeper
into what the system was about to do and have more control over its behaviour.
. Quantification. Effort estimates were accurate. Using the state
machine as an estimating proxy we could predict how long it would take to
develop control software within a week regardless of who was performing the
work. This injected welcome predictability into our projects.
. Documentation. The statement of requirement became the plant
operating manual. Safety imperatives meant it was always kept up-to-date.
Prior to plant commissioning these requirements were subject to rigorous
review by process technology centres.
. Reuse of past solutions. A managed process for reusing innovative
control strategies enhanced quality and saved money.
. Concern for maintainability and safety. Maintainability and safety
were key issues in software development. Plants were constantly optimised
and had long operational lives. Explicit requirement statements easily
traceable to simple design archetypes (state machines) and implemented with
simple readable code allowed anyone with process knowledge and basic
programming training to enhance operations technology through software
without compromising safety. There was a standing joke that after about
three months from start up, you had to move the plant engineers who wrote
the program on because life got incredibly boring. Often these plants
started up as optimised as they were ever going to be. All the plant
engineer could do was "stick his fingers in the program" (read: over-optimise)
and screw it up. Better to move him on to another problem.
. Systems thinking. The software was always considered as a component
of a larger system (never an end in itself). The impact of software on the
chemical plant as a whole was assessed and substantial benefits flowed.
Introducing software into a chemical processing plant produced emergent
behaviour: high quality product for one, but by far the greatest benefit came
from the ability to trade off plant hardware for smarter software. For
example, before computer-control it was considered unsafe to mix certain
combinations of reactants in the same reactor. The problem was solved by
using premix reactors to create less volatile, intermediate products.
Computer control gave us tighter control of reactant ratios allowing us to
eliminate premix operations and charge heretofore dangerous chemical mixes
into the same reactor, at savings of hundreds of thousands of dollars.
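As an illustration of validating a state model before any code exists, the sketch below runs a few mechanical checks over a transition table in Python. It is an assumption-laden example, not the method we used (our validation was review by experienced plant engineers). The table, the state names and the three checks are all hypothetical.

```python
from collections import deque

# Hypothetical transition table for one unit operation:
# state -> {event: next state}
TRANSITIONS = {
    "IDLE": {"start": "CHARGING"},
    "CHARGING": {"charged": "REACTING", "abort": "SAFE"},
    "REACTING": {"done": "DRAINING", "abort": "SAFE"},
    "DRAINING": {"empty": "IDLE", "abort": "SAFE"},
    "SAFE": {"reset": "IDLE"},
}


def reachable(table, start):
    """Return every state reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for nxt in table.get(state, {}).values():
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def review(table, initial, safe):
    """List structural problems in the model; an empty list means none found."""
    problems = []
    defined = set(table)
    targets = {t for events in table.values() for t in events.values()}
    # 1. Every transition must lead to a state that is actually defined.
    for t in sorted(targets - defined):
        problems.append(f"undefined state: {t}")
    # 2. Every defined state must be reachable from the initial state.
    for s in sorted(defined - reachable(table, initial)):
        problems.append(f"unreachable state: {s}")
    # 3. From every state there must be some path to the safe state.
    for s in sorted(defined):
        if safe not in reachable(table, s):
            problems.append(f"no path to {safe} from: {s}")
    return problems


problems = review(TRANSITIONS, "IDLE", "SAFE")
print(problems)
```

Adding a state with no way in and no way out, for example `dict(TRANSITIONS, VENTING={})`, makes `review` report it as both unreachable and unable to reach the safe state. Checks like these only catch structural slips; whether the transitions are right for the chemistry still needs the people who know the plant.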
Cheers
Les
-----Original Message-----
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of
Steve Tockey
Sent: Friday, February 21, 2014 10:20 AM
To: Heath Raftery; systemsafety at lists.techfak.uni-bielefeld.de
Subject: Re: [SystemSafety] Logic
This may very well always be a challenge. People generally have a way too
short term outlook, particularly mid-level managers in big corporations. I
would like to (optimistically) extend Heath Raftery's example as follows
(by the way, I refuse to refer to person B as "Engineer B" because they
clearly aren't one):
Possibility A) Person B's implementation of doodad D is still little more
than just flashing LEDs and clicking relays when Engineer A's solution is
ready to go into production. Engineer A's production version works
essentially flawlessly.
Possibility B) Person B does provide a "production version" of doodad D;
however, that production system gives defective output on every 32nd use
and crashes--requiring a complete OS reboot--on every 64th use. Engineer
A's production version works essentially flawlessly.
But will the decision makers ever even notice??? Sadly, probably not.
I have a real-world example of a variant on this theme. Details are
changed to keep PBL and his organization out of trouble.
I worked for about 8 years at a company that makes very high-tech
transportation devices. I'll use cars as an analogy, but their devices are
at least an order of magnitude more complex than cars.
We start off with Car Product Line 1, which the company has been building
with, say, gasoline engines for almost 20 years. There's a comprehensive
suite of "automated test equipment" (ATE) software--embedded in a hardware
platform--for testing Product Line 1 cars as they are being manufactured.
All existing test programs were traditionally-built C code that ran on
HP/UX 9. Along comes the need to produce Product Line 1 cars, but with
diesel engines. The "engine simulator ATE" application for gasoline
engines is 25K SLOC. The most reasonable estimate is that engine sim ATE
for diesel engines will also be about 25K SLOC; however, code reuse is not
possible for reasons that can't be elaborated here. Nonetheless, at
typical programmer productivity rates and the project's allocated staffing
level, that's more than a year of development. The problem is that we're
already in July and the first diesel engine car will be coming through the
factory the following March. We only have 8 months, not 12. They simply
couldn't wait until the following July (or, realistically, much later
given typical software project schedule over-runs) for the diesel version
of the engine sim ATE software.
The project manager, Mike (his real name), had worked with me before on
some non-related projects and was aware of my involvement in model-based
development so he invited me to give the team of four a presentation on
the topic. The team was intrigued with the idea that we could
significantly accelerate delivery because that's exactly what was needed.
Everyone agreed to take the model-based development route. Estimates
derived from early modeling predicted a mid-January completion date for
the model-based project. We could get it done in about 6 months, well
ahead of the March need date.
A mid-level manager (to remain nameless), having experienced horrible
software project delays--due to necessary debugging--in the past, insisted
on having the initial code written by the end of November. This was
intended to allow adequate debugging time before the need date for the
first diesel car. Long story short, the requirements modeling took until
the middle of October. The design modeling took until early December. When
that mid-level manager visited the project in early December, he almost
went into orbit when he realized that the team had not yet written even
one line of production code and the project was already past the point
that he had mandated for "code complete". Mike almost lost his job right
then and there.
There was a slight underestimate in the project: code complete (13K SLOC)
and hardware integration were reached around January 21, not January 14
as predicted back in late July. We had done all of the testing that could
be done without an actual car and everything worked as expected. The
engine sim ATE system then sat there until mid March, waiting for the
first diesel car. When that first diesel car was ready to be tested, both
it and the ATE performed flawlessly.
A little more than a year earlier, the corporate executive management team
approved the engineering & development of Car Product Line 2. The schedule
from approval to Product Line 2 car #1 rolling off the assembly line was
2.5 years. The entire ATE software suite for Car Product Line 2 was
included in the schedule and needed all of the 2.5 years for development.
Unfortunately, that project's original software team wasted the first 1.25
years. When the executive management did a check of the Car Product Line 2
engineering & development critical path, they realized that the ATE
software team was still sitting back at the starting line. The team
members had authored a few interesting technical papers and played a lot
of computer games but had made zero progress on actually producing ATE
software. Most of that original team got reassigned to other projects and
a new team was brought in. This new team noticed that Car Product Line 1's
engine sim ATE was completed in about half the time that had been
predicted, and that's pretty much what they had: half the time. So I and
three of the four from Car Product Line 1 engine sim ATE were brought over
to get Car Product Line 2 ATE going.
There was a management mandate to "reuse as much of the Car Product Line 1
code as possible". Unfortunately, code re-use was simply not an option
because for some reason Car Product Line 2 had chosen C++ on HP/UX 10 for
implementation. We did reuse a little code, but only 83 SLOC. Long story
short, the entire ATE suite for Car Product Line 2 was delivered 3 days
ahead of the need date for car #1. Keep in mind that the full ATE suite
was a far bigger job: we had 30 developers and delivered 113K SLOC. 6-8
weeks after going live on the factory floor, we met with the ATE operators
to see how they liked it. Simply, they were amazed at how such a complex
piece of software could work so flawlessly from the very beginning. They
had set up a contest to see which operator could crash ATE and nobody had
been able to.
With such fully documented, high quality code the middle managers decided
they didn't need nearly as many software weenies to maintain the Product
Line 2 ATE code base. In their infinite wisdom they completely ignored the
fact that we had built a team that took a project in seriously deep doo
doo and made it successful. Rather than find another project that was in
deep doo doo and turn the extra people loose on that, the excess staff got
laid off (made redundant). The team's reward for doing a great job was
that most of them lost their job. Sigh...
Now, wind the clock forward about 12-15 years. I'm no longer working
at the manufacturing company. By this time they were about half way into
Car Product Line 3 engineering and development. Deja vu all over again as
they realized that the original ATE software team had wasted the first
half of the project schedule. Again, those people got re-assigned to other
projects and I got a panic call from the new ATE software team. "Aren't
you the guy who bailed out the Product Line 2 ATE software project?"
"Yes, why?" "We desperately need your help..."
But again, code reuse was simply not an option because the Product Line 3
ATE project had already committed to C#/.net. Nonetheless, ATE software
was ready well before Product Line 3 Car #1 was in a position to be
tested. And when tested, both car #1 and ATE software performed flawlessly.
One very important lesson that this company never learned was that one
major reason each of these projects was able to finish on time/early was
that we reduced the amount of rework to negligible levels. Software
projects at that company, like traditional software projects everywhere,
suffered from 50-60% rework ("debugging"). All of the model-based ATE
software projects featured peer review of the models that revealed and
allowed removal of the majority of the defects before a single line of
code was ever written. Rework on these projects was well under 10%,
probably closer to 5%.
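A rough back-of-envelope calculation shows how those rework figures line up with the "about half the time" result. The Python sketch below assumes that the percentages are fractions of total project effort and that first-pass (non-rework) effort is the same under both approaches. Both are simplifying assumptions, not measured data, and the function name is made up for this example.

```python
def total_effort(first_pass_effort, rework_fraction):
    """Total effort when rework_fraction of the *total* is rework.

    If total = first_pass + rework and rework = rework_fraction * total,
    then total = first_pass / (1 - rework_fraction).
    """
    if not 0.0 <= rework_fraction < 1.0:
        raise ValueError("rework_fraction must be in [0, 1)")
    return first_pass_effort / (1.0 - rework_fraction)


# Normalise first-pass effort to 1.0 for both approaches (an assumption).
traditional = total_effort(1.0, 0.55)   # midpoint of the 50-60% range
model_based = total_effort(1.0, 0.05)   # the "closer to 5%" figure

ratio = model_based / traditional
print(f"traditional: {traditional:.2f}x first-pass effort")
print(f"model-based: {model_based:.2f}x first-pass effort")
print(f"model-based / traditional: {ratio:.2f}")
```

Under these assumptions the model-based project needs a little under half the total effort of the traditional one, which is in the same ballpark as the schedules described above.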
The other very important lesson that the company never learned was that
the other major reason for finishing on time/early was
requirements model reuse. If you laid the three requirements models
side-by-side you would notice that 80-90% of the content was identical.
From a "what does it mean to test a car?" perspective, each version of ATE
was largely just a minor modification of the earlier version, thus saving
huge amounts of requirements development time.
In the end, my point is that the data is there. Projects have been done
this way and those projects have been successful. But the business has to
take the blinders off and understand what was done differently and why it
made a difference. They seem to be totally incapable of this. Insert
another sigh here...
I should add that what was done on these projects was not strictly "formal
methods" in the sense that's being hotly debated here. We didn't use Z,
VDM, or any of those formal languages. We didn't use theorem provers
either. We used UML (and pre-UML because of project timing) class diagrams
and state charts mostly, but we had a carefully defined and enforced (via
the model inspections) single interpretation of that modeling language.
Much like I mentioned in the Jeannette Wing "A Specifier's Introduction to
Formal Methods" paper earlier, the modelers were using a comfortable
surface syntax (UML) but there was a rigorous (albeit not exactly
formally-defined) semantic to those models.
I can only speculate on the scalability of formal methods based on my
experience. I suspect that they will scale just fine, provided that the
people doing the majority of the "methods" work can work in comfortable
surface syntaxes like UML and keep the Z, VDM, Larch, etc. stuff hidden
under the covers. If anyone wants to do theorem proof of some interesting
property, they are free to do so. Simply take the existing UML model and
translate it into the underlying formal language equivalent and run the
analysis on that. Every property proven about the formal representation
has to apply to the UML version because they are equivalent
semantics--they are just in a different syntax.
I don't have to speculate on the scalability of the ("semi-formal"?)
model-based development process. I've personally been involved on projects
that had more than 250 programmers working for about 5 years on code bases
up to about 5-10 million SLOC. The projects delivered on time (or early)
and the users were amazed by how few defects they encountered in actual
use. We just have to find a way to get the corporate decision makers to
notice...
-- steve
-----Original Message-----
From: Heath Raftery <hraftery at restech.net.au>
Organization: ResTech Pty Ltd
Date: Wednesday, February 19, 2014 3:35 PM
To: "systemsafety at lists.techfak.uni-bielefeld.de"
<systemsafety at lists.techfak.uni-bielefeld.de>
Subject: Re: [SystemSafety] Logic
On 19/02/2014 11:28 PM, Michael J. Pont wrote:
> It may - of course - be that the organisations I have closest contact with
> are atypical: they are, after all, a self-selecting group. However, while
> I'm sure that there are many organisations that have mature processes in
> place for the development of real-time embedded systems, I'm equally sure
> that this isn't the norm.
>
> If we assume - for the moment - that my model is correct, how do we ensure
> that the situation is different in 10 years time?
Great points. I'd suggest that changes to education focus, while very
important, won't be the necessary trigger. There needs to be a market
force. The scenario that plays out in my world goes like this:
1. Customer C requests doodad D to solve problem P.
2. Engineer A says right, no problem, we just need to articulate the
requirements and capture them in an unambiguous way. Formal methods can
help, I'll show you the way.
3. Engineer B says, no problem, in fact here's a prototype I whipped up.
We're almost there.
Engineer A studied embedded development at an excellent facility and has
sound knowledge of formal methods.
Engineer B taught herself programming and has been writing code since
before she could drive.
4. A's manager asks how D is coming along and A says fine, we're working
through the requirements.
5. B's manager asks how D is coming along and B says fine, look I've got
the LEDs flashing and the relays clicking.
Guess which engineer gets rewarded?
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE