[SystemSafety] Subversive C programs for mockery of static analysis tools ... on a bad day ; -)
Olwen Morgan
olwen at phaedsys.com
Thu Nov 29 20:50:46 CET 2018
On 29/11/2018 16:54, Derek M Jones wrote:
> Thierry,
>
>> Do I understand that you are building a database of "subversive" C
>> pieces of code?
> Subversive is probably the wrong word to use, because it implies intent.
> This is certainly true for the International Obfuscated C Code Contest,
> https://www.ioccc.org/, but not more generally.
>
> Anyway, let's run with it.
Nothing so grand. Practical verification of C programs (or programs in
any imperative language, for that matter) relies on sub-setting the
usage to make it tractable to verifiers. AFAI am aware, language
sub-setting currently underpins *all* industrial-strength tools to
verify imperative programs. I have both personal and professional
interests in this area. The personal interest is to see how hard/easy it
is to write programs that meet sub-setting requirements yet lead
verifiers to give false results (in other words, in spasmodic fits of
mischief/subversion/ennui, I enjoy trying to break verifiers). The
professional interest is subject to commercial confidentiality.
>> I am one of the authors of the SQALE method for measuring software
>> (non-)quality and technical debt, independently of tool makers. We
>> believe such a database is a great idea and help people like us
>> trying to standardize the measurement of software, like any other
>> technological item.
Well, if a metalanguage for providing unambiguous specifications of
software measurements would help, you may as well use Mosses' SOS - at
least as a starting point. Without unambiguous definitions of
measurements, we can have no underlying system of metrology - and
without that, standards of measurement cannot be made reliably
realisable, let alone the measurements themselves be made both
repeatable and reproducible.
>
> I take it the aim is to document cases where:
> Person A thinks code means x, while person B thinks it means y.
Nope. Just to find programs that meet subset requirements but still
break verifiers.
>
> To prevent the database being overly huge, some likelihood and
> significance criteria are needed.
> For instance, a certain percentage of the population using the language
> have to have differing views. Also the difference between x and y will
> have to be significant (however that is measured).
No danger of it's being huge. One can hardly mass-produce such programs.
They're typically not easy cook up. I envisage a gradual accretion of
programs that achieve some success in breaking verifiers. This is not
aimed at discrediting the developers of verifiers but to provide them
with challenging test cases. And even if they do not break verifiers,
they may still be instructive to developers who remain unaware of the
pitfalls that C lays up for the unenquiring.
>
> Obtaining data on population beliefs will be costly and difficult.
> And as for deciding whether the difference between x and y is likely to
> be significant, good luck with that.
One does not need data on population beliefs. That's not the aim at all.
If I (and hopefully others) can produce tests that break verifiers,
developers may be alerted to possible problems in their tools. If a
verifier produces incorrect results on one of these tests, then that is
a success. Verification is fallacious if it produces proofs that are
contradicted by test results. Trying to write test programs that fool a
verifier is simply the scientifically respectable attempt to falsify the
proposition that the verifier is sound.
>
> All in all, a hopeless task.
The Buddha taught that both hope and despair are delusions. Better, IMO,
to makes one's way happily and concede gracefully if one's efforts turn
out to be misguided.
>
> A more practical approach is to specify what constructs developers can
> use. For instance, the 10 most frequent uses of for-loops more than
> covers every situation. It's much simpler to specify a list of
> constructs that are permitted, rather than the gazillion and one that
> are not permitted.
Not so. (Take a look at eCv from Eschertech. The set of non-permitted
constructs is remarkably small.) True, different verifiers require
different subsets but these can be concisely specified using a suitable
syntactic metalanguage for language restriction. Using such a
metalanguage, e.g. SYMELAR, it is quite straightforward to give precise
(at least as precise as the most precise bits of the standard)
specifications for C constructs whose avoidance may be needed to render
code tractable to analysis by a verifier. Indeed, I did exactly that -
and more - for C around a decade ago. It's document N0228 on the WG23
web site. Surely you remember? The WG23 document that the UK panel
(including you, AFAI recall) wouldn't submit as a UK contribution - only
for the then international convenor and CSA have it submitted as a
Canadian one?
>
> For C, somebody has written a book listing all the most frequent
> constructs :-)
> www.knosof.co.uk/cbook
Your web site says that this page was last updated in 2010. Is that correct?
At 1600+ pages, your book is over twice the length of the C11 standard
and over three times the length of the latest C draft DIS. Is this
intended to make it easier to use than the standards themselves? Indeed,
should it remind us of Gauss' motto, "Pauca sed matura"?
How are you getting on finding an *independent* publisher for it?
Olwen
More information about the systemsafety
mailing list