[SystemSafety] State of the art for "safe Linux"

Wed Aug 7 16:36:21 CEST 2024

Derek,

Thank you for your email - comments below...

On 2024-08-07 15:02, Derek M Jones wrote:
>> # State-Of-The-Art
> 
> The state of the art of software reliability analysis
> remains little better than opinions all the way down,
> backed up with inappropriate statistical techniques (e.g.,
> inhomogeneous Poisson processes) applied to tiny datasets.

You may be right - certainly so far I haven't found much in the way of 
published research to counter your statement.

> However, pointing out that the King is not wearing any clothes
> is not good for one's career.  So let's pretend.

I'm approaching the end of mine, so not worrying too much :)

>> Building on prior art,
> 
> You are hoping that nobody reads these references, a reasonable
> assumption.

No, I really wasn't. It's just that this was the best I could find, and 
I was hoping that folks here would be able to point me towards 
additional/better materials - which appears to be happening.

>> including L. Cucu-Grosjean et al (2012)[7],
> 
> The paper "Measurement-based probabilistic timing analysis for
> multi-path programs"
> http://people.site.ac.upc.edu/~equinone/docs/2012/ecrts_2012.pdf
> proposes a new method for what its title says.  No mention of Linux.

Agreed, but relevant because I keep coming across folks (including here) 
claiming deterministic behaviour is a) achievable and b) the best basis 
for safety of software.

>> S.Draskovic et al. (2021)[8]
> 
> This paper "Schedulability of probabilistic mixed-criticality systems"
> https://www.research-collection.ethz.ch/handle/20.500.11850/470954
> does not mention Linux.  It contains 30 pages of Definitions, Lemmas,
> Theorems and their proofs, followed by 10 pages of experiments that
> have a somewhat tenuous connection to the preceding mathematics.

I believe you... my own grasp of mathematics has decayed to the extent
that I no longer attempt anything more complex than Excel manipulations.

>> Mc Guire and Allende (2020)[9], and in
>> anticipation of Allende's 2022 PhD thesis [10], the authors applied statistical
> 
> Allende's thesis "Statistical Path Coverage for Non-Deterministic Complex
> Safety-Related Software Testing" is now available
> https://dspace.ub.uni-siegen.de/handle/ubsi/2239
> This is interesting research on estimating path coverage, that happened to
> use Linux as the vehicle for running the 28 line program (page 159) used
> to gather statement trace data.

And interesting for me in that it clearly supports the argument that 
(the chosen version of) Linux's timing was non-deterministic.

>> techniques, including Maximum Likelihood Estimation and Simple Good-Turing, on
>> a practical case study involving a Linux-based Autonomous Emergency Braking
>> system. They calculated a probability of software-related failure for their
>> example (1.42e−4), but noted that current safety standards "do not provide any
> 
> Allende's analysis makes various assumptions that the available data
> suggests don't apply to software reliability.  I'm happy to talk
> about this in another thread.

Yes please!

>> Some of the cited authors, e.g. Lukas Bulwahn, Nicholas Mc Guire and Jens
>> Petersohn, are expert practitioners who have dedicated a significant portion
>> of their careers to the work of deploying Linux in critical production
>> systems.
> 
> This same argument was used by those healing using leeches, bloodletting
> and crystals inside pyramids to back up their claims of efficacy.

Ouch. The truth hurts. It appears I've spent most of my career in
a "profession" that claims to be doing engineering but whose best
practitioners are often self-trained hobbyists.

> Where is the data?

Fair.

>> Allende et al. [4] drew the following conclusion:
> 
> My PhD thesis work is groundbreaking.

Sorry, I'm being dumb here. Do you mean that $PhD student always 
concludes that their work is groundbreaking?

>> Chen et al. (2023)[11]. Chen and colleagues explored the variability in
>> Linux path execution under various conditions, and they "demonstrated that
>> both system load and file system influence the path variability of the Linux
>> kernel."
> 
> My question is how this Linux variability compares with the variability
> that must also occur in other operating systems?

That's a good question for sure... probably at least a couple of PhDs in 
that :)

>> - No amount of test coverage will ever be enough to represent the full range
>>    of behaviours of modern software running on a multicore microprocessor.
> 
> For an experimental analysis of the likelihood that the execution
> of code containing a coding mistake will actually trigger faulty
> behavior, see
> "Compiler Fuzzing: How Much Does It Matter?"
> https://2019.splashcon.org/details/splash-2019-Artifacts/3/Compiler-Fuzzing-How-Much-Does-It-Matter-
> some discussion here
> https://shape-of-code.com/2020/01/27/how-useful-are-automatically-generated-compiler-tests/

Super, thank you very much!

>> Mc Guire, Bulwahn et al. have demonstrated in multiple research papers what is
>> already obvious to the expert software community, i.e. modern multicore systems
>> running multi-threaded software exhibit stochastic behaviours.
> 
> Not least because apparently identical processors have
> different performance characteristics.
> https://shape-of-code.com/2020/01/05/performance-variation-in-2386-identical-processors/

Again, thank you!

Best wishes
Paul