[SystemSafety] Autonomous Vehicle Safety

Strigini, Lorenzo lorenzo.strigini.1 at city.ac.uk
Thu Dec 18 20:51:15 CET 2025


Kalra et al. reason about how one can argue, before the fact, that a vehicle is safe enough for its mass operation to be allowed. That is different from asking whether a vehicle that has already been allowed to operate on a large scale for a while has since proved safe enough.

In their introduction they ask "is it practical to assess autonomous vehicle safety through test-driving?", and the assumption in their thought experiment of "a fleet of 100 autonomous vehicles being test-driven 24 h a day" is consistent with this scenario.

To pick an industry with a less doubtful safety record: the FAA can see from the statistics that, in retrospect, most large airliner types have had low accident rates. The hard question is how to tell at certification time whether THIS new type WILL exhibit that low an accident rate, or whether it is instead one of the rare types that kill many passengers before their design defects are fixed.

Regarding the other order-of-magnitude calculations that you propose: it may be of interest that another RAND Corporation report advocated rapid adoption of autonomous vehicles (despite the difficulty of proving them safe enough), so as to reach sooner a posited future in which autonomous vehicles will have made roads much safer. This leads to other discussion topics: what requirements are reasonable, what futures are possible and how we predict them, and the ethics of transferring risk between future and current users.

Best regards,

  Lorenzo

> On 18 Dec 2025, at 17:12, Derek M Jones <derek at knosof.co.uk> wrote:
> 
> Strigini, Lorenzo wrote:
> 
>> Well, the authors (Kalra et al. from RAND) make a straightforward argument with classical statistical inference: assuming invariance in a system and in the way it is used (for a car: environment, pattern of driving, etc.), what can you infer about how unlikely accidents are, after the system has operated for a while WITHOUT accidents? Whether they decided their conclusion before "picking numbers" to support it, I do not know. But their conclusion is right: if your only evidence of safety is how far the vehicle has driven without accident, you need to wait a very long time before you can claim a level of safety acceptable for vehicles on public roads.
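>> 
>> A minimal sketch of that classical calculation (my own illustration, with assumed inputs, not the authors' code): if accidents occur independently at a constant rate p per mile, the probability of N failure-free miles is about exp(-p*N), and requiring this to fall below 1-C gives the mileage needed for confidence C:
>> 
>> import math
>> 
>> def miles_needed(p_per_mile: float, confidence: float) -> float:
>>     # Failure-free miles needed for a classical claim, at the given
>>     # confidence, that the accident rate is below p_per_mile
>>     # (assumes a constant rate and independent miles).
>>     return -math.log(1.0 - confidence) / p_per_mile
>> 
>> # The authors' baseline: the US human-driver fatality rate of about
>> # 1.09 per 100 million miles, demonstrated at 95% confidence.
>> print(f"{miles_needed(1.09e-8, 0.95):.3g}")  # ~2.75e+08 miles
>> 
>> That reproduces their 275 million miles; if I recall correctly, the 8.8 billion figure comes from the more demanding variant of demonstrating, with statistical power, an improvement over the human rate.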
> 
> The following calculation by the authors is correct:
> "This is approximately 8.8 billion miles. With a fleet of
> 100 autonomous vehicles being test-driven 24 hours a day, 365
> days a year at an average speed of 25 miles per hour, this would
> take about 400 years."
> 
> But, why only 100 autonomous vehicles?
> 
> Let's say 100,000 vehicles driven for 1 hour a day, 250 days of
> the year at an average speed of 25mph.  Then 8.8 billion miles
> would be traveled in 14 years.
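> 
> For anyone who wants to vary the assumptions, here is the arithmetic
> behind both scenarios (my sketch; the figures are the ones above):
> 
> def years_to_cover(miles, fleet, hours_per_day, days_per_year, mph):
>     # Years for a fleet to accumulate the given mileage, assuming
>     # constant utilisation and average speed.
>     return miles / (fleet * hours_per_day * days_per_year * mph)
> 
> target = 8.8e9  # miles, from Kalra et al.
> print(years_to_cover(target, 100, 24, 365, 25))     # ~402 years
> print(years_to_cover(target, 100_000, 1, 250, 25))  # ~14 years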
> 
> Since 2022 Tesla have been selling over 1 million cars a year.
> https://www.statista.com/statistics/502208/tesla-quarterly-vehicle-deliveries
> How many miles are these cars driven autonomously?
> I don't know.
> 
> This 400 year calculation is just silly.
> 
> Is 8.8 billion miles actually enough?
> 
> About 30% of road fatalities are alcohol-related, plus there are the
> speed-related ones.  These won't happen with autonomous cars, unless
> the autonomous car is hit by a human-driven one (most fatalities
> involve two vehicles).
> Then we have the autonomy-specific fatalities, such as those caused
> by drivers not paying attention because they expect the automation
> to just work.
> https://crashstats.nhtsa.dot.gov/Api/Public/Publication/813643
> 
> If most fatalities involve two cars, and at the moment most
> cars are human-driven, there is roughly a 50% chance that the fault
> lay with the human driver rather than with the autonomous one.
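> 
> To make that attribution argument concrete (the numbers here are
> purely illustrative, not measured): if a fraction q of fatal crashes
> involve two vehicles, and in an AV/human two-vehicle crash either
> party is equally likely to be at fault, the share of an AV's fatal
> crashes attributable to the autonomous driving itself is roughly:
> 
> def av_fault_share(q_two_vehicle=0.6):  # q is an illustrative guess
>     # single-vehicle crashes: the AV is at fault;
>     # two-vehicle crashes: fault split 50/50 with the human driver.
>     return (1 - q_two_vehicle) * 1.0 + q_two_vehicle * 0.5
> 
> print(av_fault_share())  # 0.7 with the illustrative 60% figure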
> 
> I have lots of issues with Poisson models for software failures.
> But that's another topic.
> 
>> The argument was stated in more depth by Bev Littlewood and me in 1993 (Validation of Ultrahigh Dependability for Software-Based Systems, Communications of the ACM, 36(11), pp. 69-80, doi: 10.1145/163359.163373; also https://openaccess.city.ac.uk/id/eprint/1251/ ; a minimal summary is at https://openaccess.city.ac.uk/id/eprint/276/ ). We framed the problem as one of Bayesian inference: how the probabilities that you assign to events (e.g., accidents) after seeing some relevant evidence (operation of the system) should improve over the probabilities you assigned before seeing it.
>> We used the example of commercial aviation: to claim high confidence in a bound of 10^-9 probability of accident per hour, the limited amount of pre-certification operation added almost nothing to whatever one could claim before it. We also argued that such claims were not made plausible by the other forms of evidence used to justify them.
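>> 
>> The order of magnitude can be seen with the same classical bound as before (my illustration, not the Bayesian treatment in the paper):
>> 
>> import math
>> # Failure-free hours needed for a claim, at confidence c, that the
>> # accident rate is below `bound` per hour (constant-rate assumption).
>> hours_needed = lambda bound, c: -math.log(1.0 - c) / bound
>> print(f"{hours_needed(1e-9, 0.95):.2e}")  # ~3.00e+09 hours
>> # Pre-certification flight testing amounts to thousands of hours,
>> # many orders of magnitude short of this.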
>> 
>> As Phil Koopman noted, the assumptions of invariance etc. are normally wrong. Yet checking whether a claim would hold at least in the best possible conditions (invariance of everything, so that all the evidence collected is certainly relevant) helps one to understand how over-optimistic one's claims are. I think this is useful, because we still see claims of great levels of safety achieved on the basis of seriously inadequate amounts of evidence.
>> 
>> My colleagues and I revisited the problem as stated by Kalra et al. in this paper: https://openaccess.city.ac.uk/id/eprint/24779/ , again not in terms of ignoring everything about a car except how many miles it has driven without accidents, but in terms of how much all this driving can add to whatever claims one could make before it. We see this as more realistic reasoning than theirs: nobody would develop a vehicle without abundant precautions to support pre-operation confidence in its safety. In later work we have looked at refining the reasoning, e.g. to take into account a change in the car or in its use. Yet strong claims remain hard to prove, and confidence after seeing safe operation depends heavily on the confidence one could have before it.
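>> 
>> A minimal conjugate sketch of that prior-dependence (my illustration of the kind of reasoning, not the actual model in the paper): treat accidents as Poisson with an unknown per-mile rate, put a Gamma prior on the rate, and observe m failure-free miles; the posterior is again a Gamma, and the confidence one ends up with is driven largely by the prior.
>> 
>> from scipy.stats import gamma
>> 
>> def posterior_conf(a, b, m, target=1.09e-8):
>>     # P(rate < target | m failure-free miles) for a Gamma prior with
>>     # shape a and rate b; zero accidents in exposure m give a
>>     # Gamma(a, b + m) posterior.
>>     return gamma.cdf(target, a, scale=1.0 / (b + m))
>> 
>> m = 1e7  # ten million failure-free miles (illustrative)
>> print(posterior_conf(0.5, 1e5, m))  # weak prior:   ~0.36
>> print(posterior_conf(0.5, 1e9, m))  # strong prior: ~1.00
>> 
>> The same ten million safe miles leave one far short of the target under the weak prior, and make essentially no difference under the strong one: the conclusion rests on what could be claimed before operation.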
>> 
>> Best regards,
>> 
>>     Lorenzo
>> 
>>> to meet the reliability required to meet the numbers plugged
>>> in by the authors.
>>> 
>>> No attempt to show how the numbers they used connect to
>>> human accident numbers.
>>> 
>>> To my non-automotive-expert eye, it looks like they decided
>>> on a conclusion and then picked numbers to validate it.
>>> 
>>>> https://www.sciencedirect.com/science/article/abs/pii/S0965856416302129
>>> 
>>> 2,049 citations on Google Scholar.
>>> 
>>>> On the other hand, since the first author is at RAND, maybe there is a report to download at no charge. Indeed so, through
>>>> 
>>>> https://www.rand.org/pubs/research_reports/RR1478.html
>>> 
>>> 6 citations, and no link to the paper.
>>> 
>>> --
>>> Derek M. Jones           Evidence-based software engineering
>>> blog:https://shape-of-code.com
>>> 
>>> _______________________________________________
>>> The System Safety Mailing List
>>> systemsafety at TechFak.Uni-Bielefeld.DE
>>> Manage your subscription: https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety
>> 
>> __________________
>> Prof Lorenzo Strigini
>> Centre for Software Reliability
>> City St George’s, University of London
>> Phone: +44 (0)20 7040 8245
>> www.csr.city.ac.uk
>> 
>> _______________________________________________
>> The System Safety Mailing List
>> systemsafety at TechFak.Uni-Bielefeld.DE
>> Manage your subscription: https://lists.techfak.uni-bielefeld.de/mailman/listinfo/systemsafety
> 
> --
> Derek M. Jones           Evidence-based software engineering
> blog:https://shape-of-code.com
> 

__________________
Prof Lorenzo Strigini
Centre for Software Reliability
City St George’s, University of London
Phone: +44 (0)20 7040 8245
www.csr.city.ac.uk





