[SystemSafety] Safety and effective or not cybersecurity countermeasures

Bruce Hunter brucer.hunter at gmail.com
Thu Jun 6 09:51:52 CEST 2019


I do not think the responses to this topic so far have answered either
Peter’s or David’s original questions, so I’ll stick my neck out and add
to my comments on the previous critique of the pre-publication IEC TR 63069
in this forum.

The following are my views only and do not necessarily reflect the views
of IEC TC 65 or other WG20 members!

On 27/05/2019 at 09:15, Peter Bernard Ladkin wrote:

*[begin quote] [IEC TR 63069 Ed 1 Section 5 Guiding Principle 1: protection
of safety implementations]*

*Security countermeasures should effectively prevent or guard against
adverse impacts of threats to safety-related systems and their implemented
safety functions. Evaluations of safety functions should be based on the
assumption of effective (security) countermeasures.*

*[end quote]*

*There is nothing wrong with recommending that security countermeasures
should be effective (sentence 1). However, (sentence 2) there is a lot
wrong with *assuming effective cybersecurity countermeasures are in place*
while evaluating safety functions.*

Wording is often a compromise in consensus standards, and I agree that the
second sentence in IEC TR 63069 could have been phrased better to convey the
intended meaning.

My view is that WG20’s intent for Guiding Principle 1 was to reinforce the
accepted rule that *a system is not safe unless it is secure*, and to
address the difficult issue of proving that systems are secure enough to
protect the safety functions from possible cybersecurity threats.

In reality, it is not possible to assure the effectiveness of cybersecurity
to the same level of certainty as the integrity of the safety functions
it is protecting (SIL does not directly relate to SL). Information systems
security and cybersecurity risk assessments just do not work that way.

You are relying on all appropriate security countermeasures being in place
and verified against the impact of the worst-case safety function failure.
Security practitioners need to work with safety practitioners to be assured
that cybersecurity countermeasures have minimised the cybersecurity risk to
the safety system, allowing them to proceed with safety engineering on this
basis (hence the inclusion of "assumption").

While it would be good to integrate safety and security risk management, it
is not practical to do so, which leads to the guidance in Section 6.1 and
Figure 4, Safety and security interaction.

On 05/06/2019 at 09:15, David Mentré wrote:

*On the other side, it seems difficult to me to have effective safety
function without a minimum of effective cybersecurity countermeasures. *

*Taking as example a software based railway interlocking control device
with some networking function. If one cannot assume that through
appropriate countermeasures the device is immune to network attacks, then
the attacker could probably in the worst case overwrite the original
control software and do anything with the device, including producing
unsafe outputs like triggering train collision.*

*How would you produce a safe device without assuming effective
cybersecurity countermeasures make it immune to such network attacks?*

I agree with David’s position, and this is a good example to use in
explaining my view of WG20’s intent for the Guiding Principles. It is good
to remember that IEC TR 63069 does not itself contain functional safety or
cybersecurity requirements but relies on the relevant requirements in
IEC 61508 and IEC 62443.

*Guiding principle 1: protection of safety implementations*

"Security countermeasures should effectively prevent or guard against
adverse impacts of threats to safety-related systems and their implemented
safety functions. Evaluations of safety functions should be based on the
assumption of effective (security) countermeasures."

Railway interlock systems would have a SIL rating of at least 3 (failure
rate of < 10^-4). Systems I have seen, with axle counters, fail to a safe
(trouble) mode on faults in the single network wiring and switches. I am
not sure about multiple redundant networks.

In either case the network (a conduit in IEC 62443 terms) and the interlock
subsystems (zones) must be protected from cyber threats and other
influences. The issue is whether the probability of a threat exploiting all
possible vulnerabilities that lead to a dangerous failure is going to be <
10^-4. Assurance of effective cybersecurity countermeasures is limited by:

· The problem of predicting average successful exploitation rates (leading
to dangerous failure rates) for systems for which we are unsure of all
possible vulnerabilities and of the expertise of determined threat actors.

· The evolution and exploitation of zero-day vulnerabilities not considered
in the original threat and risk assessment, which then puts the protected
system at risk.

· The emergence of new cybersecurity threats or attack vectors.

· Safety fault-tree principles would suggest that, if we cannot establish
the probability of failure for an element that could cause a dangerous
failure, we should assume a probability of failure of “1” (certain) and
rely on other elements (see the sketch after this list).

· Vulnerability exploitations do not necessarily lead to dangerous failures,
but it must be assumed that they have the capability to do so. An example is
the cyber-attack on an emergency shutdown system in Saudi Arabia last year
(the TRITON/TRISIS incident).
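
To make the fault-tree point concrete, here is a minimal sketch in Python;
the numbers are illustrative assumptions of mine, not from any real
assessment. When the exploitation probability cannot be established, it is
set to 1, so credit is taken only for independent elements:

# Minimal fault-tree sketch: if an element's probability of causing a
# dangerous failure cannot be established, assume 1.0 (certain) and
# take credit only from other, independent elements.

def and_gate(*probs: float) -> float:
    """Probability that ALL independent basic events occur."""
    p = 1.0
    for x in probs:
        p *= x
    return p

p_cyber_exploit = 1.0    # unknown exploitation rate -> assumed certain
p_safeguard_fail = 1e-4  # independent, quantified safety element

# The dangerous top event needs both the successful exploit AND the
# failure of the independent safeguard.
print(and_gate(p_cyber_exploit, p_safeguard_fail))  # 0.0001

The point of the sketch is that no credit is taken for the countermeasures
themselves; the claimed risk reduction comes entirely from the quantified,
independent safety element.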

The security architecture is chosen (see IEC 62443-2-1 Annex A 3.3) to allow
physical or logical access only to authenticated and authorised parties.
This would typically bound the safety-related system in a single safety
zone, connected to other non-safety zones via conduits that restrict access
and control. Practically, this could involve a one-way switch (data diode)
before the conduit connecting to monitoring systems. It could also include
specific IP and MAC addressing or device certificates to prevent
counterfeit devices (but see Principle 3).
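
As a toy illustration of that zone-and-conduit bounding (the zone names and
rules below are my own, not taken from IEC 62443), traffic is permitted only
along declared conduits, and the conduit towards monitoring is one-way:

# Toy zone-and-conduit model; names and rules are illustrative only.
from dataclasses import dataclass

@dataclass(frozen=True)
class Conduit:
    src: str       # zone traffic may originate from
    dst: str       # zone traffic may flow to
    one_way: bool  # True: no return path (data-diode style)

CONDUITS = [Conduit("safety_zone", "monitoring_zone", one_way=True)]

def allowed(src: str, dst: str) -> bool:
    """Permit traffic only along a declared conduit direction."""
    for c in CONDUITS:
        if (src, dst) == (c.src, c.dst):
            return True
        if not c.one_way and (src, dst) == (c.dst, c.src):
            return True
    return False

print(allowed("safety_zone", "monitoring_zone"))   # True: telemetry out
print(allowed("monitoring_zone", "safety_zone"))   # False: no way back in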

The preferred approach is to have cybersecurity experts establish,
maintain, and assure a security architecture with a defence-in-depth
approach, based on a current threat and risk assessment. The emergence of
new zero-day vulnerabilities and attack vectors limits the trust that can
be placed in the ongoing effectiveness of countermeasures.

*Guiding principle 2: protection of security implementations*

Along with Guiding Principle 3, in my view this was meant to address the
issue that safety functions and their implementation could bypass the
cybersecurity countermeasures established to support Principle 1, and thus
accidentally put the system at increased risk of cyber-threats.

In the railway interlock example, features of the safety system could
accidentally provide a bypass to the cybersecurity countermeasures.
Examples are:

· Threat actors could use emergency keys, passwords, or patches to access
the interlock system and carry out malicious actions on it.

· Delaying security patches and updates to the interlock system, in order
to preserve its stability, could leave published vulnerabilities open for
exploitation and put the system at risk of cyber-attack.

*Guiding principle 3: compatibility of implementations*

Cybersecurity countermeasures and any updates to them may have an
accidental impact on the performance of safety functions; in my view,
addressing this is the main intent of this principle.

In the example of railway interlocks, inappropriate design of, or changes
to, the cybersecurity countermeasures could lead to failures of the safety
functions. For example:

· Patches to security software could create false-positive detections that
lead to the interlock safety function being blocked.

· Untested security changes to switch configuration could compromise
interlock functions.

· Security penetration testing on a live system could cause accidental
operation or blocking of interlock functions.

· Expiration or revocation of authentication certificates and passwords
could block access to the interlock system’s normal, trouble or emergency
operation (see the sketch after this list).
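
On that last point, here is a small sketch of the kind of renewal check
that forces credential renewal in a planned maintenance window rather than
discovering an expired certificate during an emergency; the margin and
dates are hypothetical:

# Sketch of a credential-expiry check; margin and dates are hypothetical.
from datetime import datetime, timedelta

RENEWAL_MARGIN = timedelta(days=30)  # illustrative lead time

def needs_renewal(not_after: datetime, now: datetime) -> bool:
    """True if the credential expires within the renewal margin."""
    return not_after - now <= RENEWAL_MARGIN

cert_expiry = datetime(2019, 7, 1)  # hypothetical certificate expiry
print(needs_renewal(cert_expiry, now=datetime(2019, 6, 6)))  # True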

*Technology imitates biology (and politics)? *

Some comments on this topic raised the analogy of the human body’s reaction
to infection and the autoimmune response, which I believe is a good one.
Sometimes false positives of protective systems and inappropriate
countermeasures result in bad outcomes.

Coincidentally, yesterday was the 30th anniversary of the Tiananmen Square
student deaths at the hands of the PLA, and it has been nearly 50 years
since the Kent State University student deaths at the hands of the Ohio
National Guard. Is there an analogy here as well?

*My recommendations on using the Guiding principles*

To summarise:

> I would recommend that Operational Technology security standards be used
as the basis for protecting safety systems, not Information Technology
standards; in particular IEC 62443, which is assumed (again :) by IEC TR
63069.

> IEC TR 63069 is a good starting point for considering cybersecurity
protection of safety-related systems.

> Critical infrastructure security standards and frameworks, including the
Australian Rail Industry Safety and Standards Board (RISSB) rail cyber
security standard (AS7770), reference IEC 62443 clauses, but implementation
should consider the IEC TR 63069 Guiding Principles.

> I am planning to write a paper on the practicalities of applying the
IEC TR 63069 Guiding Principles, similar to the tutorial I conducted at the
2018 Australian System Safety Conference. I hope this will be beneficial
to practitioners.

I hope this is of help and makes sense (my body’s immune system is working
hard on a flu virus attacking me :)
Bruce Hunter