Nancy Leveson: Safeware
CHAPTER 1: Risk in Modern Society
• Three Mile Island, 1979
• Bhopal, 1984
• Space Shuttle Challenger, 1986
• Chernobyl, 1986
• Therac-25, 1985-1987
IS NEW TECHNOLOGY MAKING OUR WORLD RISKIER?
In the United States, technological hazards account for 15 to 25 percent of human mortality and have significantly surpassed natural hazards in impact, cost, and general importance.
– [87 - Ned Franklin. The accident at Chernobyl. The Chemical Engineer, pages 17-22, November 1986] p.4
Flood damage in the United States, for example, has increased as expenditures on flood control have increased.
– [154 - Trevor Kletz. Myths of the Chemical Industry. The Institution of Chemical Engineers, Rugby, Warwickshire, United Kingdom, 1984] p.4
In fact, all hazards are affected by complex interactions among technological, ecological, sociopolitical, and cultural systems [30, 174, 339].
Attempts to control risk by treating it simply as a technical problem or only as a social issue are doomed to fail or to be less effective than possible.
1.1 Changing Attitudes toward Risk
• Not safe, only safer.
• Societies are recognizing human and workers' rights.
• The attitude that "workers are at the mercy of their employers in terms of safety" is no longer accepted.
• Complete abdication of personal responsibility, however, is not always wise.
In some instances -- such as the Bhopal accident -- the public has completely trusted others to plan for and respond effectively to an emergency, with tragic results.
– Many aspects of Bhopal guaranteed that an accident would occur.
• Emergency and evacuation planning, training, and equipment were inadequate.
• The public was not told of simple measures (such as closing their eyes and putting a wet cloth over their faces) that could have saved their lives.
In writing about the Bhopal tragedy, Bogard expresses this new attitude:

We are not safe from the risks posed by hazardous technologies, and any choice of technology carries with it possible worst-case scenarios that must be taken into account in any implementation decision. The public has the right to know precisely what these worst-case scenarios are and to participate in all decisions that directly or indirectly affect their future health and well-being. In many cases, we must accept the fact that the result of employing such criteria may be a decision to forego the implementation of a hazardous technology altogether [30, p. 109]. p.6
Is risk increasing in our modern society as a result of new technological achievements, or are we simply experiencing a new and unjustified form of Luddism? p.6
• [The Luddite disturbances occurred in Yorkshire between 1811 and 1816, when workers in the English woolen industry tried, through violence, to stem the increasing mechanization of the mills. Luddism has become a generic term describing opposition to technological innovation.]
1.2 Is Increased Concern Justified?
• Is technological risk increasing? It depends on the data used and its interpretation. [p.6]
• Harriss and colleagues argue that technological hazards, in terms of human mortality, were greater in the earlier, less fully managed stages of industrial development [112].
• Data cited from the National Safety Council show that occupational death and injury rates have declined steadily since the early part of this century [112].
The NSC concludes that technological hazard mortality is not currently rising.
– However, with the warning that:
• "The positive effects of technology have for some time reached their maximum effect on human mortality, while the hazards of technology continue partially unchecked, affecting particularly the chronic causes of death that currently account for 85 percent of mortality in the U.S.A." [112]
• On the other hand, examination of the technological accident rate, rather than the occupational death and injury rate, suggests that technological risk is increasing.
• 60 percent of all major industrial disasters from 1921 to 1989 occurred after 1975 [30].
Bogard (1989) argued that 12 of the 19 major industrial accidents in the twentieth century involving 100 or more deaths occurred after 1950.
– If small-scale accidents (transportation, dams, structural collapses) are also included, the evidence is more compelling.
– Example: in military aviation, the accident rate has declined, probably due to the emphasis on system safety.
– Past experience does not allow us to predict the future when the risk factors in the present and future differ from those in the past. Examining these changes will help us understand the problems we face.
1.3 Unique Risk Factors in Industrialized Society
• RISK = the combination of the likelihood of an accident and the severity of its potential consequences (one common numerical formalization is sketched after this list).
• Factors include: new hazards, increasing complexity, exposure, energy, automation, centralization, scale, and pace of technological change in systems.
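The slides treat this definition qualitatively; as a loose illustration only, the following hypothetical sketch uses the common "expected loss" formalization, in which risk is the product of likelihood and severity. All numbers are invented, not taken from Safeware.

```python
# Illustration only: risk formalized as expected loss
# (likelihood x severity). Figures are invented, not from Safeware.

def risk(likelihood_per_year: float, severity_cost: float) -> float:
    """Expected loss per year from a single hazard."""
    return likelihood_per_year * severity_cost

# A rare but catastrophic hazard can carry more risk than a
# frequent but minor one:
rare_catastrophe = risk(likelihood_per_year=1e-4, severity_cost=5e9)  # 500,000/yr
frequent_mishap = risk(likelihood_per_year=10.0, severity_cost=2e4)   # 200,000/yr
print(rare_catastrophe > frequent_mishap)  # True
```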
1.3.1 The Appearance of New Hazards
• Misuse and overuse of antibiotics have created resistant microbes.
• Children no longer work in mines or as chimney sweeps, but are exposed to man-made chemicals and pesticides in their food and to increased environmental pollution [57].
• Atomic energy has increased the potential for death and injury from radiation exposure.
Redundancy (duplication of components to protect against individual failures):
– is not an effective means of controlling all risks;
– is not effective against hazards that arise from interactions among components in today's increasingly complex and interactive systems;
– may in fact increase complexity to the point where the redundancy itself contributes to accidents. (A sketch of why replication does not help against shared design errors follows.)
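A minimal hypothetical sketch (not from the book) of this limitation: when all redundant channels run the same flawed logic, the majority vote simply ratifies the same wrong answer.

```python
# Hypothetical sketch: triple-modular redundancy (TMR) with majority
# voting. Replication guards against independent random failures, but
# if every channel shares the same design flaw, the voter confirms
# the same wrong answer three times.

def channel_is_safe(raw_reading: float) -> bool:
    """One redundant channel's check.
    SHARED DESIGN ERROR: negative readings (a failed, stuck sensor)
    were never anticipated and slip through as 'safe'."""
    return raw_reading < 100.0

def tmr_vote(raw_reading: float) -> bool:
    votes = [channel_is_safe(raw_reading) for _ in range(3)]  # identical replicas
    return sum(votes) >= 2  # 2-of-3 majority

print(tmr_vote(150.0))  # False: an overt overpressure reading is caught
print(tmr_vote(-1.0))   # True: all three replicas make the same mistake
```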
1.3.2 Increasing Complexity
• Perrow distinguishes between accidents caused by component failures and those he calls system accidents, which are caused by interactive complexity in the presence of tight coupling [259, Normal Accidents].
• High-technology systems are often made up of networks of closely related subsystems. Conditions leading to hazards emerge in the interfaces between subsystems, and disturbances progress from one component to another.
Increasing Complexity (cont.)
As an example of this increasingly common type of complexity, modern petrochemical plants often combine several separate chemical processes into one continuous production process, without the intermediate storage that would decouple the subsystems [274].
• “…analyses of major industrial accidents invariably reveal highly complex sequences of events leading up to accidents, rather than single component failures.” p.8
Increasing Complexity (cont.)
In the past, component failure was cited as the major factor in accidents; today, more accidents result from dangerous design characteristics and interactions among components [108]. p.8
Not only does functional complexity make the designer's task more difficult, but the complexity and scope of the projects require numerous people and teams to work together. The anonymity of team projects dilutes individual responsibility [172].
• Paradox: people are willing to spend money on complexity but not on simplicity [158, Kletz]. WHY IS THIS THE CASE?
CASE in Point: A British Chemical Plant (p.9)
• A pump and various pipelines had several uses, including:
– transferring methanol from a road tanker to storage,
– charging it to the plant, and
– moving recovered methanol back from the plant.
On this particular occasion, a tank truck was being emptied:
– The pump had been started from the control panel but had been stopped by means of a local button. The next job was to transfer some methanol from storage to the plant.
– The computer set the valves, but as the pump had been stopped manually, it had to be started manually.
– When the transfer was complete, the computer told the pump to stop, but because it had been started manually it did not stop, and a spillage occurred [157, p.225]. p.9
– A simpler design -- independent pipelines for the different functions, actually installed after the spill -- made errors much less likely and was no more expensive over the lifetime of the equipment. (A sketch of the control-logic flaw follows.)
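Read as a mode-confusion flaw, the incident comes down to the computer conditioning its stop command on its own belief about who started the pump. The following is a hypothetical sketch of that logic, not code from Kletz or Leveson:

```python
# Hypothetical sketch of the control-logic flaw in the methanol
# transfer: the computer only stops a pump it believes it started,
# so a pump started at the local panel is left running.

class PumpController:
    def __init__(self):
        self.pump_running = False
        self.started_by_computer = False  # the controller's belief

    def computer_start(self):
        self.pump_running = True
        self.started_by_computer = True

    def manual_start(self):       # local push-button, bypasses the computer
        self.pump_running = True  # ...so the belief flag stays False

    def finish_transfer(self):
        # FLAW: the stop command depends on the controller's belief
        # rather than on the pump's actual state.
        if self.started_by_computer:
            self.pump_running = False

controller = PumpController()
controller.manual_start()       # the pump is started manually, as in the incident
controller.finish_transfer()    # the computer "tells the pump to stop"
print(controller.pump_running)  # True -- the pump keeps running: spillage
```

The independent-pipelines redesign sidesteps this whole class of error, which is the simplicity point Kletz is making.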
Computers
• may encourage the introduction of unnecessary and dangerous complexity
• enable more interactive, tightly coupled, and error-prone designs to be built
• Kletz has noted: “Programmable electronic systems have not introduced new forms of error, but by increasing the complexity of the processes that can be controlled, they have increased the scope for the introduction of conventional errors” [158].
Conclusion
• If we accept Perrow’s argument that interactive complexity and coupling are a cause of serious accidents, then the introduction of computers to control dangerous systems may increase risk unless great care is taken to minimize complexity and coupling.
1.3.3 More People Exposed to Hazards
• Larger flight capacities
• Dangerous plant facilities closer to population centers
• More of the workforce in cities or within commuting distance
• Interdependencies and complexity cause ripple effects of hazards, magnifying potential consequences
1.3.4 High Energy Sources
• High energy sources increase risks.
1.3.5 Increasing Automation of Manual Operations
• Automation does not remove humans but tends to redefine their roles.
• Operators become concerned with maintenance, repair, and higher-level supervisory control and decision making [270].
• Operators are relegated to central control rooms.
Case in Point: 1977 NYC Blackout
• Indirect information
• The operator followed prescribed procedures
• But the electrical system was brought to a complete halt
• The operator could not know there were two relay failures:
– one leading to a high flow over a line normally carrying little or no current (this one would have alerted the operator)
– the other blocking flow over a line, making it appear normal
• The operator had no way of knowing that the zero reading masked a failure. (See the sketch after this slide.)
• Operators become the “scapegoat” of an automated system.
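As a hypothetical illustration (not from the book) of why the second failure was masked: an alarm that judges only the current reading cannot distinguish a normally idle line from a line that a failed relay has blocked, since both read zero.

```python
# Hypothetical sketch of the masked relay failure: an alarm based only
# on current readings cannot tell "line idle" from "line blocked by a
# failed relay" -- both read zero.

def alarm(line_current_amps: float, normally_idle: bool) -> bool:
    """Raise an alarm when the reading is abnormal for this line."""
    if normally_idle:
        return line_current_amps > 0.0  # an idle line suddenly carrying load
    return line_current_amps == 0.0     # a loaded line suddenly dead

# Failure 1: high flow over a normally idle line -> alarm raised.
print(alarm(line_current_amps=850.0, normally_idle=True))  # True

# Failure 2: a relay blocks flow over a normally idle line. The
# reading is zero -- exactly what "normal" looks like.
print(alarm(line_current_amps=0.0, normally_idle=True))    # False: masked
```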
Operators and Embedded Systems
• Embedded systems can mask the occurrence and subsequent development of a problem.
• When the malfunction is discovered, it may be more difficult to control.
• System states may be hidden or distorted.
• Such designs further limit operator options and hinder broad comprehension. p.11
Case: China Airlines, 1985
• A 747 suffered a slow loss of power in the right outer engine.
• The autopilot compensated, preventing the resulting yaw to the right.
• When the autopilot’s limit was reached, the crew had no time to determine the cause.
• The plane rolled and went into a vertical dive of 31,500 ft.
• The aircraft was severely damaged. (See the sketch below.)
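A hypothetical sketch (invented numbers, not from the accident report) of how automation can mask a slow failure: the autopilot trims against the growing asymmetric thrust, so nothing looks wrong until its control authority runs out, at which point the problem appears all at once.

```python
# Hypothetical sketch: an autopilot trims against a slowly growing
# thrust asymmetry, hiding the failure from the crew until its
# control authority saturates.

AUTOPILOT_LIMIT = 20.0  # maximum corrective trim (arbitrary units)

def fly(minutes: int) -> None:
    thrust_asymmetry = 0.0
    for t in range(minutes):
        thrust_asymmetry += 1.0  # the engine's power slowly decays
        correction = min(thrust_asymmetry, AUTOPILOT_LIMIT)
        residual_yaw = thrust_asymmetry - correction
        # While the correction covers the asymmetry, the crew sees a
        # normally flying aircraft (residual_yaw == 0).
        if residual_yaw > 0:
            print(f"minute {t}: autopilot saturated, "
                  f"uncorrected yaw = {residual_yaw:.1f} -> sudden upset")
            return
    print("no visible anomaly")

fly(30)  # invisible until minute 20, then the upset appears abruptly
```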
Multiple Goals May Lead to Conflicts
• 1970s attempts to combine energy-saving efforts with process plants led to complications.
• Safety and economy can conflict.
• Component interactions make system functions less transparent to designers and operators.
• The place where trouble is first recognized may not be where it started.
1.3.7 Increasing Scale and Centralization
• Bupp and Derian [1968] observed that manufacturers were taking orders for nuclear power plants six times the size of those in operation.
• Previously, extrapolations of 1-2 times were considered the outer boundary of acceptable risk.
• The plant involved in the 1975 Browns Ferry accident was 10 times the size of plants under construction in 1966.
Supertankers built without sound design and redundancy
• Mostert writes:
– The gigantic scale of vessels creates an abstract environment in which crews are far removed from direct experience of the sea’s unforgiving qualities and potentially hostile environment. Heavy automation undermines much of the old-fashioned vigilance and induces engineers to lose their occupational instincts -- qualities that in earlier days of shipping were an invaluable safety factor.
Increasing Pace of Technological Change
• The average time to convert a technical discovery into a commercial product was 30 years [1880-1919];
• now it is 5 years!
• The number of new products is increasing exponentially.
• Dangerous new substances appear, with economic pressures preventing adequate testing.
Pace of Technological Change (cont.)
• Learning by trial and error is not possible in modern times.
• Design and testing procedures must be right from the start.
• Christopher Hinton (1957), writing about nuclear power, pointed out that in other domains learning from failures was possible.
• Progress continues at a torrid pace; new standards and regulatory procedures are necessary.
1.4 How Safe Is Safe Enough?
• The goal is to understand and manage risk in order to eliminate accidents or reduce their consequences.
• Frola and Miller [88] claim that system safety investment has reduced losses where it has been applied rigorously in military and aerospace programs.
How Safe Is Safe Enough? (cont.)
• Safety conflicts with other goals such as performance and cost.
• Example: an industrial robot arm that cannot be stopped easily; slowing the arm reduces the chance that it hits something, but also reduces performance.
• Human interfaces designed for ease of use are often more troublesome from a safety standpoint.
Risk-Benefit Analysis
• Often viewed as the only way to make decisions about technology and risk.
• Requires the ability to:
– measure risk
– choose an appropriate level of risk
• Yet systems must be designed and built while knowledge of their risk is incomplete or nonexistent.
Risk Assessment (cont.)
• Risk is impossible to measure before the system is built.
• Predicted failure rates as low as one failure in 10,000 years cannot be validated; the data needed to confirm such numbers are missing.
• Risk assessment is hard to perform, and acceptable risk is hard to determine.
• Case of the Ford Pinto gas tank.
• Optimal risk involves a tradeoff that minimizes the sum of all undesirable consequences. (A small worked sketch follows.)
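As a loose illustration (invented numbers, not from the book) of the optimal-risk tradeoff: total undesirable consequences can be modeled as expected accident losses plus the cost of prevention, and the optimum is the spending level that minimizes their sum.

```python
# Illustration only (invented numbers): "optimal risk" as the point
# minimizing total undesirable consequences -- expected accident
# losses plus the cost of prevention.

def total_cost(prevention_spend: float) -> float:
    accident_likelihood = 1.0 / (1.0 + prevention_spend)  # spending reduces likelihood
    accident_cost = 100.0                                  # loss if the accident occurs
    return prevention_spend + accident_likelihood * accident_cost

# Search a grid of prevention budgets for the minimum total cost.
best = min(range(0, 101), key=total_cost)
print(best, round(total_cost(best), 1))  # spending 9 units minimizes the sum (19.0)
```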
Risky Systems and Perrow
• Perrow divides high-risk systems into three categories: