[SystemSafety] NTSB report on Boeing 787 APU battery fire at Boston Logan
Mike Ellims
michael.ellims at tesco.net
Thu Dec 4 16:44:49 CET 2014
Peter comments that "Huh? How could one not anticipate internal short
circuits? How could one not anticipate thermal runaway from an internal
short circuit? Answer: this was an assumption derived from a single
nail-penetration test."
So could the assumption have been validated as it should have been?
The simplest way to test this is to do a literature search.
Section 1.2 of the report states: "The 787 program began in April 2004, with
the 787's first flight in December 2009, certification in August 2011, and
first delivery in September 2011."
Searching on "lithium ion battery fire" and "lithium ion battery failure"
for papers (Google Scholar) published up to 2003 produces thousands of
results.
For example, from a paper published in 2003:
... their safety is still a major concern, and are less safe as the capacity
of the battery increases. In particular, one of the unsolved problems that
can occur during operation, abrupt overcharge to the voltage-supply limit
(12 V) owing to a defect or a malfunction in the protective devices[+] of
the cell, has not been prevented. Moreover, numerous battery accidents with
accompanying fires and explosions have been reported.[3] The main cause of
such disasters is that LiCoO2 cathodes can undergo a violent exothermic
reaction with the electrolyte during overcharge, which may result in the
cell shortcircuiting. In addition, lithium deposited on the graphite anode
accelerates the reaction, and results in a sharp rise in temperature.[4-6]
Furthermore, this process converts LiCoO2 into the strong oxidizing agent
Co2O3, which releases oxygen during overcharging. A combination of the
temperature increase and the internal short circuit of the cell eventually
results in an explosion of the cell. In spite of this, no fundamental
solution has been found.
Full paper at http://ep.snu.ac.kr/publication/pdf/2003%20Angew%2042_1618.pdf
-----Original Message-----
From: systemsafety-bounces at lists.techfak.uni-bielefeld.de
[mailto:systemsafety-bounces at lists.techfak.uni-bielefeld.de] On Behalf Of
Peter Bernard Ladkin
Sent: 04 December 2014 13:43
To: The System Safety List
Subject: [SystemSafety] NTSB report on Boeing 787 APU battery fire at Boston
Logan
Has been published at http://www.ntsb.gov/doclib/reports/2014/AIR1401.pdf
There was an NYT article yesterday:
http://www.nytimes.com/2014/12/02/business/report-on-boeing-787-dreamliner-batteries-assigns-some-blame-for-flaws.html
Just the summary of the NTSB report is astonishing in itself! Keep in mind
this is a 2.2kW-hr energy storage device (75 Amp-hours at nominal 29.6 V).
When they decide to go, they can get rid of all that energy in a relatively
short space of time. But apparently the regulator didn't think so. Before.
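As a quick check of the figure above, here is a minimal sketch of the arithmetic, using only the nominal capacity and voltage quoted in this message. The function and constant names are illustrative, and the conversion to megajoules is added only for scale.

```python
# Nominal figures as quoted in the message above.
CAPACITY_AH = 75.0         # amp-hours
NOMINAL_VOLTAGE_V = 29.6   # volts


def stored_energy_kwh(capacity_ah: float, voltage_v: float) -> float:
    """Nominal stored energy in kWh: amp-hours times volts gives watt-hours."""
    return capacity_ah * voltage_v / 1000.0


energy_kwh = stored_energy_kwh(CAPACITY_AH, NOMINAL_VOLTAGE_V)
energy_mj = energy_kwh * 3.6  # 1 kWh = 3.6 MJ

print(f"Nominal stored energy: {energy_kwh:.2f} kWh ({energy_mj:.1f} MJ)")
```

This gives roughly 2.2 kWh, or about 8 MJ, which agrees with the figure quoted.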
Pp viii-ix of the Report contains a summary of the conclusions. It makes
clear an astonishingly superficial grasp of the technology on the part of
Boeing and the FAA. The manufacturer's processes allowed FOD and improper
cell winding without having effective detection methods in place.
My remarks are in square brackets.
[begin quote NTSB]
The NTSB identified the following safety issues as a result of this incident
investigation:
* Cell internal short circuiting and the potential for thermal runaway of
one or more battery cells, fire, explosion, and flammable electrolyte
release. This incident involved an uncontrollable increase in temperature
and pressure (thermal runaway) of a single APU battery cell as a result of
an internal short circuit and the cascading thermal runaway of the other
seven cells within the battery. This type of failure was not expected .....
[Huh? How could one not anticipate internal short circuits? How could one
not anticipate thermal runaway from an internal short circuit? Answer: this
was an assumption derived from a single nail-penetration test.]
..... Boeing's analysis of the main and APU battery did not consider the
possibility that cascading thermal runaway of the battery could occur as a
result of a cell internal short circuit.
[This is an astonishing assertion, but appears to be well justified in the
NTSB analysis. How do you miss such an obvious phenomenon? I guess it's an
example of group think. The NTSB notes the lack of effective traceability in
the system safety assessment, pp73ff.]
* Cell manufacturing defects and oversight of cell manufacturing processes.
After the incident, the NTSB visited GS Yuasa's production facility ... NTSB
identified several concerns, including foreign object debris (FOD)
generation during cell welding operations and a postassembly inspection
process that could not reliably detect manufacturing defects, such as FOD
and perturbations (wrinkles) in the cell windings, which could lead to
internal short circuiting. In addition, the FAA's oversight of Boeing,
Boeing's oversight of Thales, and Thales' oversight of GS Yuasa did not
ensure that the cell manufacturing process was consistent with established
industry practices.
[That is, the manufacturer of extremely powerful Li-ion secondary batteries
was not using "established industry practices". Not only that, but its
quality-control processes were flawed.
And this after all those public claims about careful oversight. ]
* Thermal management of large-format lithium-ion batteries. Testing
performed during the investigation showed that localized heat generated
inside a 787 main and APU battery during maximum current discharging exposed
a cell to high-temperature conditions. Such conditions could lead to an
internal short circuit and cell thermal runaway. As a result, thermal
protections incorporated in large-format lithium-ion battery designs need to
account for all sources of heating in the battery during the most extreme
charge and discharge current conditions.
[Well, yes. That is or should be routine safety analysis and factor
mitigation. But both the manufacturer's FMEA and the FAA requirements of the
system safety assessment seem to have been lacking; see below.]
* Insufficient guidance for manufacturers to use in determining and
justifying key assumptions in safety assessments. Boeing's EPS safety
assessment for the 787 main and APU battery included an underlying
assumption that the effect of an internal short circuit within a cell would
be limited to venting of only that cell without fire. However, the
assessment did not explicitly discuss this key assumption or provide the
engineering rationale and justifications to support the assumption.
.....Boeing's assumption was incorrect.....
..... Boeing and FAA reviews of the EPS safety assessment did not reveal
that the assessment had not (1) considered the most severe effects of a cell
internal short circuit and (2) included requirements to mitigate related
risks.
* Insufficient guidance for FAA certification engineers to use during the
type certification process to ensure compliance with applicable
requirements. During the 787 certification process, the FAA did not
recognize that cascading thermal runaway of the battery could occur as a
result of a cell internal short circuit.
[This is *really* hard to fathom!]
[end quote]
The manufacturing-line defects were quite straightforward. It astonishes me
that the NTSB was able to observe "perturbations" in electrode/separator
strips being wound during their inspection - and that defects such as these were
not discovered by the manufacturer's quality control (CT scans of the results,
which were too coarse to detect the kind of FOD that might well have got in, or
even the perturbations the NTSB found), because these things don't appear to
be subtle.
The manufacturer's FMEA was apparently based upon in-service data of 14,000
cells of a similar design to the LVP65.
On p68 we read:
[begin quote]
Boeing and Thales performed preliminary and final EPS safety assessments,
which included fault tree analyses, FMEAs, and failure rate data provided by
GS Yuasa. These assessments considered internal short circuit failures but
were developed with the underlying assumption that the most severe effect of
an internal short circuit within a cell would be limited to venting of only
that cell without fire and propagation to other cells. Thus, the potential
for an internal short circuit to lead to multiple-cell or battery thermal
runaway with venting, electrolyte leakage, excessive heat, and fire was not
analyzed in the safety assessment.
[end quote]
In other words, the FMEA contained an inadequate "E" part - internal short
circuits leading to thermal runaway apparently didn't occur as an effect of
a failure. Why not?
The FMEA is talked about on pp49-51, in Section 1.7.3 System Safety
Assessment:
[begin quote]
Boeing's FMEA was based on information contained within GS Yuasa's FMEA,
which GS Yuasa developed with assistance from Boeing and Thales. GS Yuasa's
FMEA included a calculation of a representative failure rate for the LVP65
cell. This calculation was based on in-service data from about 14,000
existing large-scale industrial lithium-ion cells manufactured by GS Yuasa,
which had a similar design and manufacturing process as the LVP65 cell.
GS Yuasa's FMEA indicated that none of the industrial cells had experienced
any failures, including venting, electrolyte release, or rupture of a vent
disc. (GS Yuasa's FMEA did not include an analysis of usage and
environmental similarities between the industrial cells and the LVP65 cells
or a discussion of the hazardous effects of a lithium-ion cell failure,
including overheating or venting.)
[end quote]
So they did an FMEA using data from cells, none of which had failed. Looks
good so far! Perfect manufacturing! But then there was a Nov 2006 fire at
Securaplane, which makes the battery charging system (BCS). Investigation
put this down to a cell-internal short, and overcharging of at least one
other cell (Note 81, p43). There was also a thermal runaway in an APU
battery on July 7, 2009 (Note 82, p43). Both of these incidents vitiate the
assumptions made in the system safety assessment that thermal runaway was
not a possible effect, but apparently nobody at Boeing or the FAA noticed.
In other words, the SSA was not revisited as a result of these two
incidents.
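To put a rough number on what zero failures in about 14,000 cells can support, here is a minimal sketch. It assumes, generously, that the cells are independent and identical and that their service conditions match the 787's, which the quoted passage says the GS Yuasa FMEA did not analyze. The function name and the 95% confidence level are illustrative choices, not anything taken from the report.

```python
def zero_failure_upper_bound(n_units: int, confidence: float = 0.95) -> float:
    """One-sided upper bound on the per-unit failure probability p when
    0 failures are seen in n independent units: solve (1 - p)**n = 1 - confidence.
    """
    if n_units <= 0:
        raise ValueError("n_units must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    return 1.0 - (1.0 - confidence) ** (1.0 / n_units)


N_CELLS = 14_000  # "about 14,000" cells, per the quoted passage

bound = zero_failure_upper_bound(N_CELLS)
rule_of_three = 3.0 / N_CELLS  # common quick approximation to the 95% bound

print(f"95% upper bound on per-cell failure probability: {bound:.2e}")
print(f"Rule-of-three approximation: {rule_of_three:.2e}")
```

Even under these assumptions, the data only bound the per-cell failure probability at roughly 2 in 10,000. They say nothing about how severe the effect of a failure would be, which is the part of the analysis at issue here.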
Yes, a lack of joined-up thinking. In some sense, this was known to be a
problem with the heavily outsourced/subcontracted 787 project - one might
even guess that "ensuring joined-up thinking" is THE big challenge with such
efforts. Recall the cable-bundle mismatching that occurred on the A380,
which if I remember rightly was partly put down to different Airbus plants
using different versions of the CAD tool CATIA. But this lack of joined-up
thinking went beyond the manufacturer (on this project more a systems
integrator) to include in the 787 case the regulator as well!
A significant piece of information concerning aircraft safety assessment is
contained in Note 86, p44:
[begin quote]
The FAA did not consider the 787 battery to be a critical component because
the Seattle Aircraft Certification Office (which was responsible for the
airplane's certification) regarded the battery as a redundant system. ......
[end quote]
You are only "critical" according to airworthiness regulations if you're a
single point of failure, and you only get selected for top scrutiny if you
are manufacturing a "critical" component. There is an obvious argument here
for a notion of criticality referring to the severity of consequences of
(faulty or otherwise) behavior.
In any case, that won't help if the FMEA/FHA is faulty and doesn't indicate
any effect greater than a single smoky cell.
Once again, it seems that faulty safety assessment, in this case (again) an
obviously inadequate FMEA, played a significant role, despite the presence of
incidents contradicting the analysis.
(There are people here who have heard me say enough times that I haven't
seen an FMEA I can't fault.
There are plenty of other people on this list who can likely say that also. Now
it's the NTSB's turn to say it, even if discreetly.)
PBL
Prof. Peter Bernard Ladkin, Faculty of Technology, University of Bielefeld,
33594 Bielefeld, Germany
Tel+msg +49 (0)521 880 7319 www.rvs.uni-bielefeld.de
_______________________________________________
The System Safety Mailing List
systemsafety at TechFak.Uni-Bielefeld.DE