Debunking Toyota’s ‘killer firmware’ theory

Article by: David Cummings

An expert claims that a software bug caused a bit in a data structure in Toyota's software to flip erroneously from one to zero, leading to the accident.

« Previously: Why Toyota's software is not a killer
 

Another key element of the expert's accident theory is a fail-safe called the "Brake Echo Check," which is software that runs on a second processor called the monitor CPU. The Brake Echo Check is designed to behave as follows: If Task X died, and if the driver then stepped on the brake or released the brake, then about 200 milliseconds later the Brake Echo Check on the monitor CPU would detect an inconsistency resulting from the death of Task X on the main CPU, and would force the throttle to idle. About three seconds later it would stall the engine. When the throttle is at idle, braking will successfully stop the vehicle.
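The timing described above can be sketched as a small simulation. This is only an illustrative model, not Toyota's code: the class `BrakeEchoMonitor`, the helper `run`, and the idea that the monitor compares its own brake reading with a brake state "echoed" by Task X are all assumptions made for the example. Only the two delays (about 200 milliseconds and about three seconds) come from the description.

```python
from dataclasses import dataclass
from typing import Optional

DETECT_DELAY_MS = 200    # "about 200 milliseconds" in the text
STALL_DELAY_MS = 3000    # "about three seconds later" in the text


@dataclass
class BrakeEchoMonitor:
    """Hypothetical model of the monitor-CPU fail-safe (not Toyota's code)."""
    echoed_brake: bool = False            # last brake state echoed by the main CPU
    mismatch_since: Optional[int] = None  # time (ms) the current mismatch began
    idle_since: Optional[int] = None      # time (ms) the throttle was forced to idle
    throttle_forced_idle: bool = False
    engine_stalled: bool = False

    def step(self, now_ms: int, brake_pressed: bool, task_x_alive: bool) -> None:
        # Assumption: Task X refreshes the echo only while it is alive.
        if task_x_alive:
            self.echoed_brake = brake_pressed

        if brake_pressed == self.echoed_brake:
            self.mismatch_since = None    # consistent: nothing to detect
        elif self.mismatch_since is None:
            self.mismatch_since = now_ms  # a brake transition the echo missed

        # Force the throttle to idle once the mismatch has lasted long enough.
        if (not self.throttle_forced_idle
                and self.mismatch_since is not None
                and now_ms - self.mismatch_since >= DETECT_DELAY_MS):
            self.throttle_forced_idle = True
            self.idle_since = now_ms

        # Stall the engine about three seconds after forcing idle.
        if (self.throttle_forced_idle
                and now_ms - self.idle_since >= STALL_DELAY_MS):
            self.engine_stalled = True


def run(monitor, death_ms, brake_on_ms, end_ms, tick_ms=10):
    """Return (time throttle was forced to idle, time engine stalled); None if never."""
    idle_at = stall_at = None
    for now in range(0, end_ms + 1, tick_ms):
        monitor.step(now,
                     brake_pressed=now >= brake_on_ms,
                     task_x_alive=now < death_ms)
        if monitor.throttle_forced_idle and idle_at is None:
            idle_at = now
        if monitor.engine_stalled and stall_at is None:
            stall_at = now
    return idle_at, stall_at


# Task X dies at 100 ms; the driver brakes at 500 ms.
print(run(BrakeEchoMonitor(), death_ms=100, brake_on_ms=500, end_ms=5000))
```

In this model the fail-safe reacts only to a brake transition that occurs after Task X has died. If the brake state never changes after the task's death, no inconsistency arises and the monitor takes no action.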

According to the accident theory presented to the jury by the expert, the following three things had to happen together just prior to the accident:

  • The bit corresponding to Task X in the operating system data structure was somehow flipped from one to zero, resulting in the death of Task X.
  • At the time of this bit flip, the throttle angle variable maintained by Task X contained a large value corresponding to an open throttle. Because Task X never ran again, the throttle angle variable was stuck at this value and the throttle remained open.
  • The Brake Echo Check did not work for some reason. When the driver stepped on the brake, the Brake Echo Check did not correctly detect the inconsistency due to the death of Task X, and therefore it did not force the throttle to idle. Because the throttle remained open, the driver was unable to stop the vehicle by braking.
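The first two conditions can be illustrated with a toy model. Everything in it is hypothetical: the bit position `TASK_X_BIT`, the `Scheduler` class, and the angle values are invented for the example and do not describe Toyota's operating system. The sketch shows only the general mechanism the theory relies on: a task whose enable bit is cleared stops running, so a variable it maintains keeps its last value.

```python
TASK_X_BIT = 1 << 3   # hypothetical bit position for Task X


class Scheduler:
    """Toy stand-in for an OS that runs a task only while its bit is set."""

    def __init__(self):
        self.enabled_tasks = TASK_X_BIT   # bit is 1: Task X is scheduled
        self.throttle_angle = 0.0         # variable maintained by Task X

    def tick(self, commanded_angle):
        # Task X runs, and refreshes the throttle angle, only if its bit is set.
        if self.enabled_tasks & TASK_X_BIT:
            self.throttle_angle = commanded_angle
        return self.throttle_angle


s = Scheduler()
s.tick(60.0)                       # Task X alive: angle follows the command
s.enabled_tasks &= ~TASK_X_BIT     # the hypothetical flip from one to zero
stuck = s.tick(0.0)                # Task X never runs again
print(stuck)                       # the angle is still the last value written
```

The sketch covers only the claimed mechanism. It says nothing about whether such a flip could actually occur, which is the point in dispute.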

This theory is not credible as the likely explanation for the accident for at least the following reasons:

  • It requires two nearly simultaneous independent failures—the hypothetical bit flip and the hypothetical failure of the Brake Echo Check—on two different processors.
  • The expert provided no evidence that either failure occurred at the time of the accident or under any circumstances.
  • For the hypothetical bit flip, the expert merely speculated that it might occur under some circumstances due to problems he claimed to have identified in the software. No connection was established between any of those claimed problems and the specific bit in question.
  • For the hypothetical failure of the Brake Echo Check, the expert did not even speculate at trial why the Brake Echo Check would fail under any circumstances. Furthermore, all of the testing of the Brake Echo Check that he presented at trial showed it working exactly as designed. He also said that if the driver's foot was already on the brake when the hypothetical bit flip caused Task X to die, then the Brake Echo Check would not act to close the throttle because it only acts if there is a brake transition (brake on or brake off). As I show in my IEEE article, however, this is irrelevant because if the driver's foot was already on the brake when the hypothetical bit flip occurred, then the throttle would already be at idle and normal braking would stop the vehicle. The Brake Echo Check would not be needed.

The expert also presented an alternative theory involving the death of Task X that did not assume that the throttle angle variable contained a large value at the time of the hypothetical bit flip. As I show in the IEEE article, this alternative theory is also not credible as the likely explanation for the accident because it requires at least two hypothetical memory corruptions in two different parts of memory (a corruption of the operating system bit plus a corruption of the throttle angle variable) without any supporting evidence. In fact, in many scenarios, it requires yet a third simultaneous failure—a failure of the Brake Echo Check, as in the first theory.

Why should all of this be important to the embedded systems community? There are at least two reasons. First, the plaintiffs in this trial appear to have hit on an approach for embedded software trials that can produce a favourable verdict for the plaintiffs even if the evidence and technical analysis do not support such a verdict. Given its success in this trial, it seems likely that plaintiffs in future embedded software trials will employ the same approach. With increased awareness of this issue in the embedded systems community, verdicts in future embedded software trials will, I hope, be better supported by the evidence than was the case in this trial, and justice will be better served.

The second reason is that, in this era of science deniers, it is more important than ever that we in the engineering and scientific communities be extremely vigilant and scrupulous in all of our publicly expressed engineering or scientific opinions, lest those opinions become fodder for the deniers in their attempts to discredit science and scientists. By presenting engineering or scientific opinions at trial that are not supported by the evidence or by technical analysis, we run the risk of unwittingly providing ammunition for the science deniers.

Dr. David M. Cummings is the Executive Vice President of the Kelly Technology Group in Santa Barbara, CA. He has over 35 years of experience in the design and implementation of software systems, many of which are embedded systems. Nine of those years were spent at the Jet Propulsion Laboratory, where he designed and implemented flight software for the Mars Pathfinder spacecraft. He holds a bachelor's degree from Harvard University, and a master's degree and a Ph.D. from UCLA.

First published by Embedded.
