[SystemSafety] Subversive C programs for mockery of static analysis tools ... on a bad day ; -)

Thu Nov 29 20:50:46 CET 2018

On 29/11/2018 16:54, Derek M Jones wrote:
> Thierry,
>
>> Do I understand that you are building a database of "subversive" C 
>> pieces of code?
> Subversive is probably the wrong word to use, because it implies intent.
> This is certainly true for the International Obfuscated C Code Contest,
> https://www.ioccc.org/, but not more generally.
>
> Anyway, let's run with it.

Nothing so grand. Practical verification of C programs (or programs in 
any imperative language, for that matter) relies on sub-setting the 
usage to make it tractable to verifiers. AFAI am aware, language 
sub-setting currently underpins *all* industrial-strength tools to 
verify imperative programs. I have both personal and professional 
interests in this area. The personal interest is to see how hard/easy it 
is to write programs that meet sub-setting requirements yet lead 
verifiers to give false results (in other words, in spasmodic fits of 
mischief/subversion/ennui, I enjoy trying to break verifiers). The 
professional interest is subject to commercial confidentiality.

>> I am one of the authors of the SQALE method for measuring software 
>> (non-)quality and technical debt, independently of tool makers. We 
>> believe such a database is a great idea and help people like us 
>> trying to standardize the measurement of software, like any other 
>> technological item.

Well, if a metalanguage for providing unambiguous specifications of 
software measurements would help, you may as well use Mosses' SOS - at 
least as a starting point. Without unambiguous definitions of 
measurements, we can have no underlying system of metrology - and 
without that, standards of measurement cannot be made reliably 
realisable, let alone the measurements themselves be made both 
repeatable and reproducible.

>
> I take it the aim is to document cases where:
> Person A thinks code means x, while person B thinks it means y.

Nope. Just to find programs that meet subset requirements but still 
break verifiers.
>
> To prevent the database being overly huge, some likelihood and
> significance criteria are needed.
> For instance, a certain percentage of the population using the language
> have to have differing views.  Also the difference between x and y will
> have to be significant (however that is measured).

No danger of it's being huge. One can hardly mass-produce such programs. 
They're typically not easy cook up. I envisage a gradual accretion of 
programs that achieve some success in breaking verifiers. This is not 
aimed at discrediting the developers of verifiers but to provide them 
with challenging test cases. And even if they do not break verifiers, 
they may still be instructive to developers who remain unaware of the 
pitfalls that C lays up for the unenquiring.

>
> Obtaining data on population beliefs will be costly and difficult.
> And as for deciding whether the difference between x and y is likely to
> be significant, good luck with that.
One does not need data on population beliefs. That's not the aim at all. 
If I (and hopefully others) can produce tests that break verifiers, 
developers may be alerted to possible problems in their tools. If a 
verifier produces incorrect results on one of these tests, then that is 
a success. Verification is fallacious if it produces proofs that are 
contradicted by test results. Trying to write test programs that fool a 
verifier is simply the scientifically respectable attempt to falsify the 
proposition that the verifier is sound.
>
> All in all, a hopeless task.

The Buddha taught that both hope and despair are delusions. Better, IMO, 
to makes one's way happily and concede gracefully if one's efforts turn 
out to be misguided.

>
> A more practical approach is to specify what constructs developers can
> use.  For instance, the 10 most frequent uses of for-loops more than
> covers every situation.  It's much simpler to specify a list of
> constructs that are permitted, rather than the gazillion and one that
> are not permitted.

Not so. (Take a look at eCv from Eschertech. The set of non-permitted 
constructs is remarkably small.) True, different verifiers require 
different subsets but these can be concisely specified using a suitable 
syntactic metalanguage for language restriction. Using such a 
metalanguage, e.g. SYMELAR, it is quite straightforward to give precise 
(at least as precise as the most precise bits of the standard) 
specifications for C constructs whose avoidance may be needed to render 
code tractable to analysis by a verifier. Indeed, I did exactly that - 
and more - for C around a decade ago. It's document N0228 on the WG23 
web site. Surely you remember? The WG23 document that the UK panel 
(including you, AFAI recall) wouldn't submit as a UK contribution - only 
for the then international convenor and CSA have it submitted as a 
Canadian one?

>
> For C, somebody has written a book listing all the most frequent
> constructs :-)
> www.knosof.co.uk/cbook

Your web site says that this page was last updated in 2010. Is that correct?

At 1600+ pages, your book is over twice the length of the C11 standard 
and over three times the length of the latest C draft DIS. Is this 
intended to make it easier to use than the standards themselves? Indeed, 
should it remind us of Gauss' motto, "Pauca sed matura"?

How are you getting on finding an *independent* publisher for it?

Olwen