[SystemSafety] Autonomous Vehicle Safety
Les Chambers
les at chambers.com.au
Fri Dec 19 14:55:54 CET 2025
Lorenzo
Agreed.
Call me old-fashioned for my faith in the hard-won axioms/absolutes that 50
years of systems engineering has proven. It's useful to state the obvious every
now and then.
1. Claims of system safety based on operating experience are an example of
toxic groupthink in the new generation of vibe coders (and some misguided
establishment actors). Reality is uncomfortable. They avert their eyes. I have
never experienced a complex system that remains static. They are all going
through constant change. Any safety/reliability claim based on safe operating
hours is invalidated every time you change a line of code or a parameter in a
neural network.
Example: the Cloudflare global outage of 2 July 2019, where a single bad Web
Application Firewall rule (essentially a very small code/config change) took
down a huge chunk of the internet worldwide.
2. Testing finds bugs; it does not prove the absence of bugs. It's delusional
to expect that the cockroach you squash is the last bad actor in your kitchen.
3. Quality and Safety are built, not inspected, into a product. Maintaining
clear visibility of the behaviour of your product as it is built ...
progressive integration ... progressive testing ... independent verification
and validation ... not the YOLO release strategy I keep hearing about.
Full marks to some concerned professionals, for example, the Anthropic
Interpretability team led by Chris Olah.
... but AI observability is still a research project.
As the Teslas come at you on the two-lane road at relative speeds of 200+
km/hr.
!!!
Les
> Kalra et al reason about how one can make an argument that a vehicle is so
safe that its mass operation should be allowed. That's different from asking
whether a vehicle that has already been allowed to operate on a large scale for
a while has proved safe enough.
>
> In their introduction they ask "is it practical to assess autonomous vehicle
safety through test-driving?", and the assumption in their thought experiment
of "a fleet of 100 autonomous vehicles being test-driven 24 h a day" is
consistent with this scenario.
>
> To pick an industry with a less doubtful safety record: the FAA can see from
the statistics that most large airliners have had low accident rates in
retrospect, but the hard question is how to tell at certification time whether
THIS new type WILL exhibit that low an accident rate, or whether this specific
type is instead one of the rare ones that kill many passengers before design
defects are fixed.
>
> Regarding the other order-of-magnitude calculations that you propose: it may
be of interest that another RAND Corporation report advocated rapid adoption of
autonomous vehicles (despite the difficulty of proving them safe enough), to
reach sooner a posited future situation in which autonomous vehicles will have
made roads a lot safer. This leads to other discussion topics: what are
reasonable requirements, possible futures and how we predict them, the ethics
of transferring risk between future and current users.
>
> Best regards,
>
> Lorenzo
>
> > On 18 Dec 2025, at 17:12, Derek M Jones <derek at knosof.co.uk> wrote:
> >
> > Strigini, Lorenzo wrote:
> >
> >> Well, the authors (Kalra et al from RAND) make a straightforward argument
with classical statistical inference: assuming invariance in a system and the
way it is used (for a car: environment, pattern of driving etc.), what can you
infer about how unlikely accidents are, after the system has operated for a
while WITHOUT accidents? Whether they decided their conclusions before "picking
numbers" to support it, I do not know. But their conclusion is right: if you
use as your only evidence of safety just how much the vehicle has driven
without accident, you need to wait a long time before you can claim the level
of safety that can be acceptable for vehicles on public roads.
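The classical zero-failure bound behind this reasoning is easy to sketch. As a
rough illustration (the human fatality rate used here is an assumed
approximate figure, not taken from the paper): with zero accidents observed in
m miles, the 95% upper confidence bound on the per-mile accident rate is
-ln(0.05)/m, so demonstrating a rate below the human rate takes on the order
of hundreds of millions of failure-free miles.

```python
import math

# Illustrative sketch only: the classical zero-failure bound.
# With 0 accidents observed in m miles, the (1 - alpha) upper
# confidence bound on the per-mile accident rate is -ln(alpha) / m.

def miles_needed(target_rate, alpha=0.05):
    """Failure-free miles needed to bound the accident rate below target_rate."""
    return -math.log(alpha) / target_rate

# Assumed figure for illustration: roughly 1.09 fatalities per
# 100 million vehicle-miles for human drivers (approximate US rate).
human_rate = 1.09e-8

print(f"{miles_needed(human_rate):.3e}")  # on the order of 2.7e8 failure-free miles
```

Note this is the weakest possible claim (merely "no worse than human, with 95%
confidence"); claiming a substantially lower rate, as Kalra et al do, pushes
the required mileage into the billions.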
> >
> > The following author calculation is correct:
> > "This is approximately 8.8 billion miles. With a fleet of
> > 100 autonomous vehicles being test-driven 24 hours a day, 365
> > days a year at an average speed of 25 miles per hour, this would
> > take about 400 years."
> >
> > But, why only 100 autonomous vehicles?
> >
> > Let's say 100,000 vehicles driven for 1 hour a day, 250 days of
> > the year at an average speed of 25mph. Then 8.8 billion miles
> > would be traveled in 14 years.
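Both fleet-mileage calculations quoted above check out arithmetically. A
minimal sketch, using the fleet sizes and duty cycles stated in the thread:

```python
# Check the two fleet-mileage calculations quoted in the thread.

TARGET_MILES = 8.8e9  # failure-free miles required (Kalra et al's figure)

def years_to_accumulate(vehicles, hours_per_day, days_per_year, mph):
    """Years for a fleet to accumulate TARGET_MILES of driving."""
    miles_per_year = vehicles * hours_per_day * days_per_year * mph
    return TARGET_MILES / miles_per_year

# Kalra et al: 100 vehicles, 24 h/day, 365 days/year, 25 mph
print(round(years_to_accumulate(100, 24, 365, 25)))     # about 400 years

# The alternative above: 100,000 vehicles, 1 h/day, 250 days/year, 25 mph
print(round(years_to_accumulate(100_000, 1, 250, 25)))  # about 14 years
```

The disagreement is therefore not about the arithmetic but about which fleet
size and duty cycle is the realistic scenario.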
> >
> > Since 2020 Tesla have been selling over 1 million cars a year.
> > https://www.statista.com/statistics/502208/tesla-quarterly-vehicle-deliveries
> > How many miles are these cars driven autonomously?
> > I don't know.
> >
> > This 400-year calculation is just silly.
> >
> > Is 8.8 billion miles actually enough?
> >
> > 30% of accidents are alcohol-related, plus there are the
> > speed-related accidents. These won't happen with autonomous cars,
> > unless they are due to being hit by a human-driven car (most
> > fatalities involve two cars).
> > Then we have the autonomous specific fatalities, such as drivers not
> > paying attention because they expect stuff to just work.
> > https://crashstats.nhtsa.dot.gov/Api/Public/Publication/813643
> >
> > If most fatalities involve two cars, and at the moment most
> > cars are human driven, there is a 50% chance that the fault was
> > not due to the autonomous driving.
> >
> > I have lots of issues with Poisson models for software failures.
> > But that's another topic.
> >
> >> The argument was stated earlier and in more depth by Bev Littlewood and
me in 1993 (Validation of Ultrahigh Dependability for Software-Based Systems.
Communications of the ACM, 36(11), pp. 69-80. doi: 10.1145/163359.163373; also
https://openaccess.city.ac.uk/id/eprint/1251/ ; a minimal summary is at
https://openaccess.city.ac.uk/id/eprint/276/ ). We framed the problem as one of
Bayesian inference: how the probabilities that you assign to events (e.g.,
accidents) after seeing some relevant evidence (operation of the system) should
improve over the probabilities you assigned before seeing it.
> >> We used the example of commercial aviation: to claim high confidence in a
bound of 10^-9 probability of accident per hour, the limited amount of pre-
certification operation added almost nothing to whatever one claimed before
that amount of operation. We also argued that such claims were not plausible,
based on the other forms of evidence used to justify them.
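The point that a limited amount of operation adds almost nothing to a 10^-9
claim can be sketched numerically. This is an illustrative conjugate-prior
model, not the model of the cited paper: with an exponential (Gamma(1, b))
prior on the per-hour accident rate, t failure-free hours move the posterior
to Gamma(1, b + t), so P(rate < x) goes from 1 - exp(-b*x) to
1 - exp(-(b + t)*x).

```python
import math

# Illustrative only: Gamma(1, b) (i.e. exponential) prior on the
# per-hour accident rate lambda. After t failure-free hours the
# posterior is Gamma(1, b + t), giving
#     P(lambda < x) = 1 - exp(-(b + t) * x).

BOUND = 1e-9  # target: 10^-9 accidents per hour

def prob_below_bound(b, t=0.0, x=BOUND):
    return 1.0 - math.exp(-(b + t) * x)

# Pick a prior that already gives 50% belief in the 10^-9 bound:
b = math.log(2) / BOUND  # about 6.9e8 "virtual" failure-free hours

before = prob_below_bound(b)        # prior confidence: 0.50
after = prob_below_bound(b, t=1e4)  # after 10,000 failure-free test hours

print(before, after)  # the 10^4 test hours barely move the posterior
```

Because b must already encode the equivalent of hundreds of millions of
failure-free hours for the prior to take the 10^-9 bound seriously, any
feasible amount of pre-certification testing is negligible by comparison.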
> >>
> >> As Phil Koopman noted, the assumptions of invariance etc. are normally
wrong. Yet checking whether a claim would be satisfied at least in the best
possible conditions (invariance of everything, so that all evidence collected
is certainly relevant) helps one to understand how overoptimistic one's claims
are. I think this is useful, because we still see claims of having achieved
great levels of safety based on seriously inadequate amounts of evidence.
> >>
> >> My colleagues and I revisited the problem as stated by Kalra et al in this
paper: https://openaccess.city.ac.uk/id/eprint/24779/ , again not in terms of
ignoring everything about a car except how many miles it has driven without
accidents, but in terms of how much all this driving can add to whatever claims
you could make before it. We see this as more realistic reasoning than theirs:
nobody would develop a vehicle without abundant precautions to support
pre-operation confidence in its safety. In later work we have looked at
refining the reasoning, e.g. to take into account a change in the car or in its
use. Yet strong claims remain hard to prove, and confidence after seeing safe
operation depends heavily on what confidence you can have before it.
> >>
> >> Best regards,
> >>
> >> Lorenzo
> >>
> >>> to meet the reliability required to meet the numbers plugged
> >>> in by the authors.
> >>>
> >>> No attempt to show how the numbers they used connect to
> >>> human accident numbers.
> >>>
> >>> To my non-automotive-expert eye, it looks like they decided
> >>> on a conclusion and then picked numbers to validate it.
> >>>
> >>>> https://www.sciencedirect.com/science/article/abs/pii/S0965856416302129
> >>>
> >>> 2,049 citations on Google Scholar.
> >>>
> >>>> On the other hand, since the first author is at RAND, maybe there is a
report to download at no charge. Indeed so, through
> >>>>
> >>>> https://www.rand.org/pubs/research_reports/RR1478.html
> >>>
> >>> 6 citations, and no link to the paper.
> >>>
> >>> --
> >>> Derek M. Jones Evidence-based software engineering
> >>> blog:https://shape-of-code.com
> >>>
> >>> _______________________________________________
> >>> The System Safety Mailing List
> >>> systemsafety at TechFak.Uni-Bielefeld.DE
> >>> Manage your subscription: https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety
> >>
> >> __________________
> >> Prof Lorenzo Strigini
> >> Centre for Software Reliability
> >> City St George's, University of London
> >> Phone: +44 (0)20 7040 8245
> >> www.csr.city.ac.uk
> >>
> >
> > --
> > Derek M. Jones Evidence-based software engineering
> > blog:https://shape-of-code.com
> >
>
> __________________
> Prof Lorenzo Strigini
> Centre for Software Reliability
> City St George's, University of London
> Phone: +44 (0)20 7040 8245
> www.csr.city.ac.uk
>
--
Les Chambers
les at chambers.com.au
https://www.chambers.com.au
https://www.systemsengineeringblog.com
+61 (0)412 648 992