[SystemSafety] CbyC and unit testing
Olwen Morgan
olwen at phaedsys.com
Fri Jul 3 16:20:29 CEST 2020
All,
I've been exploring this issue offline with others on the list but now
have to attend to a pressing commitment elsewhere. Because I think the
issue is very important, I'll summarise my objections to omission of
unit-testing (UT) here and leave others to shoot at them (especially Rod
Chapman, who knows a lot more about CbyC in iFacts than I do):
1. As I understand it, PBL's position is, in the symbolism of
(classical) modal logic: (Use-CbyC -> ¬[]Perform-UT). That is, if you
use CbyC development methods it is not necessary to do unit testing.
My position is: ¬[](Use-CbyC -> ¬[]Perform-UT). That is to say I do
not think it necessarily the case that use of CbyC renders UT unnecessary.
I'll leave any quibbles about alethic or epistemic interpretation
of modal logic to others.
2. I do not believe that experience of CbyC on the Altran iFacts Air
Traffic Control (ATC) project is sufficient to sustain the claim that
CbyC renders UT unnecessary. My reasons for this include:
(a) The aim of ATC is to maintain safe separation of aircraft. Yet
the Z definitions of the UK airspace used by iFacts specified it as a
set of non-convex volumes intersecting only at their boundaries. These
definitions did not even give a specific topology, still less define a
metric space. In fact the Z definitions did not even contain the
geographic coordinates of the vertices at sector boundaries. In these
circumstances, how can one claim that the notions of location and
distance between aircraft are even defined by the Z specifications? A
coder of a software unit that dealt with location and distance would,
presumably, have to introduce the relevant coordinates and metric
notions into the SPARK proof annotations? For if they were not given in
the Z functional specification, how else could they be introduced? And
if the coordinate system and distance metric were added by the coder for
the units that he was coding, how were these notions explicitly
traceable to the Z specifications? In fact, given the weakness of the Z
airspace specifications, how could incorrect location and metric
assumptions possibly contradict it? If the formal functional spec was
that weak, CbyC was correspondingly limited in the strength of the
assertions that it could prove about the code - and that is quite apart
from any issues of whether UT is necessary.
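For illustration only (this is my own Python sketch, nothing from iFacts; the coordinate frame, the degrees-to-nautical-miles conversion, the flat-earth projection and the 5 NM minimum are all assumptions I've pulled in purely for the example), here is how notions of location and distance that are absent from a formal spec end up being introduced by the coder:

```python
import math

# Every numerical notion below is a coder-introduced assumption. If the
# formal airspace spec defines neither coordinates nor a metric, none of
# this is traceable to it - which is exactly the problem.

NM_PER_DEG_LAT = 60.0  # assumed spherical-earth conversion to nautical miles

def separation_nm(a, b):
    # a, b: (lat_deg, lon_deg) aircraft positions.
    # Local flat-earth approximation - adequate only over short ranges,
    # itself another assumption the spec does not constrain.
    dlat = (b[0] - a[0]) * NM_PER_DEG_LAT
    mean_lat = math.radians((a[0] + b[0]) / 2.0)
    dlon = (b[1] - a[1]) * NM_PER_DEG_LAT * math.cos(mean_lat)
    return math.hypot(dlat, dlon)

def safely_separated(a, b, minimum_nm=5.0):
    # The 5 NM lateral minimum is an operational parameter that must come
    # from somewhere outside an underspecified geometric model.
    return separation_nm(a, b) >= minimum_nm
```

The point is not the arithmetic but the provenance: the conversion factor, the projection and the minimum all live in the code, not the Z.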
(b) The iFacts code was not all written in SPARK Ada. AFAI recall (Rod C
may correct me) there were about 100kloc of SPARK Ada and 80kloc of C.
The C bits were not, AFAI am aware, developed using a CbyC process. In
my recollection the most powerful static analysis that was performed on
them was MISRA C conformance checking (done by QAC with yours truly
examining the results). MISRA conformance was subject to a large number
of deviation concessions because the C code used X libraries, which are
not fully MISRA C compliant. If you have CbyC for only 56% of the code,
how can you claim that CbyC has rendered unit-testing unnecessary? The
closest you can get is to provide corresponding input and output
predicates for calls to C functions from Ada. But there is no white-box
proof that a called C routine actually behaves as its input-output
predicate specification claims. In what sense, then, could anyone
reasonably claim that CbyC rendered UT unnecessary for the C code?
(c) Location and distance calculation necessarily involves computation
with trigonometric functions. How was it assured that these produced
correct results close to bearings at which these functions take the
value zero? Did the Z spec take account of problems of stability in the
numerical calculations? Did they, for example, use notions of interval
arithmetic? Were the relevant verification conditions placed in the
SPARK annotations? If not, how did the coders provide for numerical
stability and make their provisions traceable to the Z spec?
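To see why this matters, here is a hypothetical Python sketch (again, mine, not anything from iFacts): two mathematically equivalent great-circle distance formulae behave very differently close to zero separation, which is precisely the regime that matters for separation monitoring:

```python
import math

R = 6_371_000.0  # assumed mean Earth radius in metres

def dist_law_of_cosines(lat1, lon1, lat2, lon2):
    # Spherical law of cosines (radians): mathematically correct, but the
    # cosine argument saturates at 1.0 for small separations, so the
    # distance information is rounded away before acos ever sees it.
    a = (math.sin(lat1) * math.sin(lat2)
         + math.cos(lat1) * math.cos(lat2) * math.cos(lon2 - lon1))
    return R * math.acos(min(1.0, a))

def dist_haversine(lat1, lon1, lat2, lon2):
    # Haversine form: algebraically equivalent, numerically stable near zero.
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2)
           * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * R * math.asin(math.sqrt(h))

dlon = 0.001 / R  # two positions about 1 mm apart on the equator
print(dist_law_of_cosines(0.0, 0.0, 0.0, dlon))  # 0.0 - distance vanishes
print(dist_haversine(0.0, 0.0, 0.0, dlon))       # ~0.001 m
```

A formal spec that merely says "compute the great-circle distance" validates both; only a numerical-stability requirement (or a unit test near the danger points) distinguishes them.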
(d) The iFacts system is highly concurrent. Did the CbyC provide for
verification of relevant concurrency properties? I didn't look in detail
at all of the Z specification, but I'd seriously question the capacity
of Z to capture the relevant requirements in its action-system style of
modelling (but, please, RC, correct me if I've got this wrong).
(e) AFAI gather from RC, CbyC in iFacts did only *partial* correctness
proofs. In such circumstances, UT of code with profiling instrumentation
may provide the first opportunity to detect non-termination.
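A crude illustration of that last point (hypothetical Python; a step-budget counter stands in for profiling instrumentation): a partial-correctness proof can establish that *if* the unit returns, its answer is right, while saying nothing about whether it returns at all - and an instrumented unit test catches exactly that gap:

```python
class FuelExhausted(Exception):
    """Raised when the unit under test exceeds its step budget."""

def search(xs, target, fuel=1_000_000):
    # Hypothetical unit: IF this returns, the returned index is correct
    # (that is all a partial-correctness proof would claim).
    i = 0
    while True:
        if fuel == 0:
            raise FuelExhausted("possible non-termination detected")
        fuel -= 1
        if xs[i % len(xs)] == target:
            return i % len(xs)
        i += 1  # when the target is absent, the loop never exits on its own

print(search([1, 2, 3], 2))  # 1: terminates when the target is present
try:
    search([1, 2, 3], 9, fuel=10_000)
except FuelExhausted as e:
    print(e)  # the instrumented test flags the missing-target case
```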
3. Upon what properties of the elements of the tool chain is CbyC relying?
(a) On compiler correctness? Was the compiler developed using CbyC? Does
the language standard even provide a basis for CbyC development of the
compiler? Does one know that the compiler has been tested under the
particular compile-time options that one's project is actually using (a
fearful bear trap with, for example, cross compilers for embedded C)?
How does one ensure, with SPARK Ada for example, that the compiler does
not do perverse things such as compiling iteration as recursion - which,
AFAI can make out from what RC says, the Ada RM may not actually
prohibit, again raising issues of traceability?
(b) On the IDE used? Has it undergone CbyC development? How thoroughly
has it been tested? How does it maintain correct configuration of the
tools to which it provides access? I've seen some truly horrendous
blunders in C tool configuration. IMHO, tool configuration is a grossly
neglected area of the application of formal methods.
(c) Unless the relied-upon properties of toolchain elements are made
explicit, it is hard to sustain the claim that CbyC is even being done
correctly, leave alone that it is obviating testing.
4. Is CbyC based on a formal spec that captures relevant non-functional
(NF) requirements? If not, then UT may be the earliest opportunity to
detect that they are not met. Examples include:
(a) speed of operation of a coded unit: profiling UT will determine this
very easily,
(b) satisfactory interoperation of a coded unit with other pre-existing
system elements: if not checkable strictly at the UT level, this can be
checked by minimal integration tests.
(c) usability of a coded unit: AFAI can see formal methods are quite a
long way off being practically helpful in getting HMI right. I'd *love*
to see a model of human cognition in Z! :-O
(d) security of a coded unit: I've yet to see a formal specification of
software that would allow one to prove that the security of a wired
connection is inferior to that of an optical connection because of the
possibility of EM emanation. If anyone here has, say in Z, a formulation
of Maxwell's equations and a quantum electrodynamical model of light
propagation in optical fibres, I'd really love to see it.
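On point (a), a minimal sketch of what such a profiling unit test might look like in Python (the unit, the harness and the time budget are all illustrative assumptions of mine):

```python
import time

def unit_under_test(n):
    # Stand-in for a coded unit whose execution speed is an NF requirement.
    return sum(i * i for i in range(n))

def best_time(fn, *args, repeats=5):
    # Minimal profiling harness: time several runs, keep the best, and let
    # the unit test compare it against a project-specific budget.
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - t0)
    return best

elapsed = best_time(unit_under_test, 100_000)
# In a real project the budget would be traceable to the NF requirement:
BUDGET_SECONDS = 0.5  # illustrative figure only
assert elapsed < BUDGET_SECONDS
print(f"best of 5 runs: {elapsed:.6f} s")
```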
To summarise the foregoing, I say:
(a) We cannot ignore the limited universe problem: Formal methods do not
allow us to prove anything about details from which they abstract. UT,
being the sordidly practical, unabstracted real thing, gives us the
earliest opportunity to catch problems that may arise from over-abstract
specifications, some aspects of which I have alluded to above.
(b) I would be prepared to concede that something like:
((Use-CbyC & P1 & ... & Pi ...) -> ¬[]Perform-UT) may be tenable
where the Pi are comprehension premises that address the problems of
tool chain integrity such as I have referred to above. I do not,
however, believe that:
(Use-CbyC -> ¬[]Perform-UT) is tenable as it stands. There are,
IMHO, too many implicit things that can go wrong for it to be a safe
claim. AFAI can see the only way that proposition is helpfully tenable
is if you can explicitly prove Use-CbyC -> Pi for each agreed Pi in my
conceded formulation.
(c) I've never yet encountered a systems engineer who does not believe
that the earlier you detect problems, the easier and cheaper they are to
correct. I think that there are circumstances in which, even if CbyC has
been used, UT is the earliest point in the life cycle at which problems
not detectable by CbyC can be detected by other means. Does one really
wish, by fiat, to throw this detection opportunity away?
Don't get me wrong. I'd like to see CbyC much more widely used. But I'd
also like to see its practitioners avoid the kind of ill-considered,
hyped-up claims with which the AI community has so often blown its own
arse off.
Hope this provides more food for thought,
Heading for afternoon tea,
Olwen