Quantitative Methods in Defense and National Security (2010)

Quantitative Methods in Defense
and National Security (2010)
Update from the Panel on Industrial Methods
for the Effective Test and Development of
Defense Systems
Mike Cohen, study director
5/25/2010
Previous Related Work by CNSTAT
Let me start with some recommendations from previous efforts that are
relevant to the current study:
1998: Statistics, Testing, and Defense Acquisition – New Approaches
and Methodological Improvements
Conclusion 2.2: The operational test and evaluation requirement, stated
in law, that the Director, Operational Test and Evaluation certify that
a system is operationally effective and suitable often cannot be
supported solely by the use of standard statistical measures of
confidence for complex defense systems with reasonable amounts
of testing resources.
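To make the resource problem concrete, here is a small worked sketch of my own (not from the report), using the standard zero-failure binomial demonstration argument: the number of failure-free operational trials needed to demonstrate a reliability R at confidence C grows quickly as the requirement tightens.

# Hypothetical illustration (not from the report): minimum number of
# zero-failure trials needed to demonstrate a mission reliability R
# at confidence level C, using the classical binomial argument
# P(0 failures | R) = R**n <= 1 - C.
import math

def zero_failure_trials(reliability: float, confidence: float) -> int:
    """Smallest n with reliability**n <= 1 - confidence."""
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))

for R in (0.80, 0.90, 0.95, 0.99):
    for C in (0.80, 0.90):
        print(f"R={R:.2f}, C={C:.0%}: {zero_failure_trials(R, C):4d} failure-free trials")

At 90 percent confidence, a 0.99 reliability requirement already implies on the order of 230 failure-free operational trials --- far more than a typical operational test can afford, which is the point of the conclusion.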
1998 Study (continued)
Conclusion 3.1: Major advances can be
realized by applying selected industrial
principles and practices in restructuring
the paradigm for operational testing and
the associated information gathering and
evaluation process in the development of
military systems.
1998 Study (continued)
Recommendation 2.1: DoD and the military services
should provide a role for operational test personnel in the
process of establishing verifiable, quantifiable, and
meaningful operational requirements. Although the
military operators have the final responsibility for
establishing operational requirements, the Operational
Requirements Document would benefit from consultation
with and input from test personnel, the Director,
Operational Test and Evaluation, and the operational test
agency in the originating service. This consultation will
ensure that requirements are stated in ways that
promote their assessment.
1998 Study (continued)
Recommendation 3.1: … The primary mandate of
the Director, Operational Test and Evaluation
should be to integrate operational testing into
the overall system development process to
provide as much information as possible as
soon as possible on operational effectiveness
and suitability. In this way, improvements to the
system and decisions about continuing system
development or passing to full-rate production
can be made in a timely and cost-efficient
manner.
1998 Study (continued)
Recommendation 3.3: The DoD and the military
services, using common financial resources,
should develop a centralized testing and
operational evaluation data archive for use in
test design and test evaluation.
Recommendation 3.4: All services should explore
the adoption of the use of small-scale testing
similar to the Army concept of force development
test and experimentation.
1998 Study (continued)
Recommendation 7.1: DoD and the military services should
give increased attention to their reliability, availability,
and maintainability data collection and analysis
procedures because deficiencies continue to be
responsible for many of the current field problems and
concerns about military readiness.
Recommendation 8.4: Service test agencies should be
required to collect data on system failures in the field that
are attributable to software. These should be recorded
and maintained in a central database that is accessible,
easy to use, and makes use of common terminology
across systems and services. This database should be
used to improve testing practices and to improve fielded
systems.
Innovations in Software Engineering for Defense Systems (2003)
Recommendation 1: Given the current lack of implementation of state-of-the-art methods in software engineering in the service test
agencies, initial steps should be taken to develop access to --- either
in-house or in a closely affiliated relationship --- state-of-the-art
software engineering expertise in the operational or developmental
service test agencies.
Recommendation 2: Each service’s operational or developmental test
agency should routinely collect and archive data on software
performance, including test performance data and data on field
performance. The data should include fault types, fault times, and
frequencies, turnaround rate, use scenarios, and root cause
analysis. Also, software acquisition contracts should include
requirements to collect such data.
Innovations in Software Engineering for Defense Systems (cont)
Recommendation 6: DoD needs to examine
the advantages and disadvantages of the
use of methods for obligating software
developers under contract to DoD to use
state-of-the-art methods for requirements
analysis and software testing, in particular,
and software engineering and
development more generally.
Testing of Defense Systems in an Evolutionary Acquisition Environment (2006)
Conclusion 1: In evolutionary acquisition, the entire spectrum of
testing activities should be viewed as a continuous
process of gathering, analyzing, and combining
information in order to make effective decisions. The
primary goal of test programs should be to experiment,
learn about the strengths and weaknesses of newly
added capabilities or (sub)systems, and use the results
to improve overall system performance. Furthermore,
data from previous stages of development, including field
data, should be used in design, development and testing
at future stages. Operational testing (testing for
verification) of systems still has an important role to play
in the evolutionary environment, although it may not be
realistic to carry out operational testing comprehensively
at each stage of the development process.
Testing of Defense Systems in an Evolutionary Acquisition Environment (cont)
Conclusion 2: Testing early in the
development stage should emphasize the
detection of design inadequacies and
failure modes. This will require testing in
more extreme conditions than those
typically required by either developmental
or operational testing, such as highly
accelerated stress environments.
Testing of Defense Systems in an Evolutionary Acquisition Environment (2006)
Conclusion 3: To have a reasonable likelihood of
fully implementing the paradigm of testing to
learn about and to improve systems prior to
production and deployment, the roles of DoD
and congressional oversight in the incentive
system in defense acquisition and testing must
be modified. In particular, incentives need to be
put in place to support the process of learning
and discovery of design inadequacies and
failure modes early and throughout system
development.
Testing of Defense Systems in an Evolutionary Acquisition Environment (2006)
Recommendation 3: The undersecretary of
defense (acquisition, technology, logistics)
should develop and implement policies,
procedures, and rules that require
contractors to share all relevant data on
system performance and the results of
modeling and simulation developed under
government contracts, including
information on their validity, to assist in
system evaluation and development.
Testing of Defense Systems in an Evolutionary Acquisition Environment (2006)
Recommendation 4: The undersecretary of defense (AT&L)
should require that all technologies to be included in a
formal acquisition program have demonstrated sufficient
technological maturity before the acquisition program is
approved or before the technology is inserted in a later
stage of development. The decision about the sufficiency
of technological maturity should be based on an
independent assessment from the director of defense
research and engineering or special reviews by the
director of operational test and evaluation (or other
designated individuals) of the technological maturity
assessments made during the analysis of alternatives
and during developmental test.
Testing of Defense Systems in an Evolutionary Acquisition Environment (2006)
Conclusion 5: The DoD testing community
should investigate alternative strategies for
testing complex defense systems to gain,
early in the development process, an
understanding of their potential operational
failure modes, limitations, and level of
performance.
Testing of Defense Systems in an Evolutionary Acquisition Environment (2006)
Recommendation 6: (a) To support the
implementation of evolutionary acquisition, DoD
should acquire, either through hiring in-house or
through consulting or contractual
agreements, greater access to expertise in the
following areas: (1) combining information from
various sources for efficient multistage design,
statistical modeling, and analysis; (2) software
engineering; and (3) physics-based and
operational-level modeling and simulation …
Problems
(1) We have been operating at a relatively high level --- maybe we can try to change what is done on a day-to-day basis.
(2) We have been too statistically oriented --- DoD system development needs to make use of better engineering practices.
(3) No matter what memoranda and guidance documents are written, you need to help people do their jobs better, since guidance often has little direct impact on practice.
Industrial Methods for the Effective Test and Development of Defense Systems (2010)
ROSTER
Vijay Nair, chair (Univ of Michigan)
Pete Adolph (consultant)
Peter Cherry (SAIC)
John Christie (LMI)
Tom Christie (consultant)
Blanton Godfrey (NC State Univ)
Raj Kawlra (Chrysler)
John Rolph (USC)
Elaine Weyuker (AT&T)
Marion Williams (IDA)
Alyson Wilson (Iowa State Univ)
Industrial Methods for the Effective Test and Development of Defense Systems
Charge: To plan and conduct a workshop
that will explore ways in which
developmental and operational testing,
modeling and simulation, and related
techniques can improve the development
and performance of defense systems.
Workshop --- four key talks by software and
hardware engineers from industry.
Three Key Questions - Question 1:
Finding Failure Modes Earlier
1. The earlier that failure modes and design flaws
are identified during the development of defense
systems, the less expensive are the design
changes that are needed for correction. The
workshop will explore what techniques are
applied in industrial settings to identify failure
modes earlier in system development and
whether such techniques are system dependent
or whether some generally applicable principles
and practices can be discovered.
QUESTION 1 (cont)
We will want to understand the extent to which it is
important to utilize operational realism in testing
schemes --- what realism needs to be present, what can
be safely simulated, and what can be safely ignored?
Also, what is meant by the envelope of operational
performance for a system and how far beyond that
envelope should one test to discover design flaws and
system limitations? Finally, how are accelerated testing
ideas utilized in industry; what are the advantages and
disadvantages (besides the fact that accelerated scenarios are
extrapolations from real use), and what are the pitfalls
and ways around them?
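As a concrete illustration of the accelerated-testing idea (my own sketch, with assumed numbers rather than anything presented at the workshop), the Arrhenius model is one common industrial way to translate test time at elevated stress into equivalent time at use conditions:

# Hypothetical illustration of one common industrial accelerated-testing
# calculation: an Arrhenius acceleration factor for a temperature-driven
# failure mechanism.  All numbers (activation energy, temperatures) are
# assumptions for the sketch.
import math

BOLTZMANN_EV = 8.617e-5  # eV per kelvin

def arrhenius_acceleration_factor(ea_ev: float, t_use_k: float, t_stress_k: float) -> float:
    """How much faster the failure mechanism runs at the stress temperature."""
    return math.exp((ea_ev / BOLTZMANN_EV) * (1.0 / t_use_k - 1.0 / t_stress_k))

af = arrhenius_acceleration_factor(ea_ev=0.7, t_use_k=313.0, t_stress_k=358.0)
print(f"Acceleration factor: {af:.1f}")   # roughly 26x for these assumed values

The pitfall flagged above is visible here: one hour at the stress temperature stands in for about 26 hours at use conditions, but only if the same failure mechanism dominates at both temperatures --- which is exactly the extrapolation risk.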
Question 2: Assessment of
Technological Maturity
2. The inclusion of hardware and software components that are not
technologically mature is often the cause of delays in system
development, cost increases, and reduced system performance
when fielded. …Therefore, one key issue is how to set requirements
and specifications for a new technology. A second key issue is how
much of testing should be of the components as isolated systems
and how much should be devoted to tests of the components within
the functioning of the parent system? The workshop will explore the
techniques that are applied in industrial settings to evaluate
components efficiently to determine whether they are or are not
sufficiently mature for use. This should include a discussion of how
the assessment of maturity of software components differs from
that of hardware components.
Question 3: Use of Information
from Disparate Sources
3. Field performance data can be extremely useful for identifying design
deficiencies for systems developed in stages. However, field use is
not always conducive to data collection, resulting in various forms of
incomplete data. How should feedback mechanisms supported by
field performance data operate to improve modeling and simulation
and developmental and operational test and evaluation practices?
Further, along with field performance data, system performance can be
gauged using results from developmental testing, operational
testing, modeling and simulation, and results from these same
sources for earlier stages of development and for closely related
systems or systems with identical components. However, these
disparate sources of information are from the operation of a system
in very different contexts and may also involve appreciably different
systems, and it is therefore a challenge to develop statistical models
for combining these data. … How can these disparate sources of
information be used to help improve operational test design? What
statistical models are useful for operational evaluation?
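One simple illustration of how such a combination might be structured (a sketch of my own, not a model endorsed by the panel) is a weighted beta-binomial update, in which each non-operational source is discounted before being folded into the prior for operational evaluation. The sources, counts, and weights below are all assumptions.

# Combine pass/fail information from disparate sources into a prior for
# operational evaluation, discounting each source by a relevance weight.
sources = [
    # (name, successes, trials, relevance weight in [0, 1])
    ("developmental test", 46, 50, 0.5),
    ("modeling & simulation", 190, 200, 0.2),
    ("related fielded system", 27, 30, 0.7),
]

alpha, beta = 1.0, 1.0   # start from a uniform Beta(1, 1) prior
for name, successes, trials, weight in sources:
    alpha += weight * successes
    beta += weight * (trials - successes)

# Combine with (assumed) operational test results: 18 successes in 20 trials.
ot_successes, ot_trials = 18, 20
alpha += ot_successes
beta += ot_trials - ot_successes

mean = alpha / (alpha + beta)
print(f"Posterior mean success probability: {mean:.3f} (Beta({alpha:.1f}, {beta:.1f}))")

The weights are exactly where the hard judgments about operational relevance enter; the sketch only shows the bookkeeping.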
Topics We Hope to Address:
1. Setting Requirements
Setting realistic, useful requirements
a. Use of high-level models to assess the
feasibility of a proposed system
b. Input from testers to assess testability
c. Input from ultimate users for early
design changes
d. What if requirements ‘need’ to be
changed mid-development? Need to be able
to say yes or no from an engineering
perspective
2. Assessment of Technological
Maturity
a. Reliability vs. effectiveness in assessing technological maturity --- need for greater focus on reliability
b. Greater use of alternative acquisition processes
(ACTD)
c. There are already procedures on the books --- are they followed? TRL assessments are defined --- unfortunately, not quantitatively, so it is easy to be fuzzy about them
d. Expertise greatly needed here --- is this idea
likely to be operationalized soon or not?
3. Finding Failure Modes and
Design Flaws Earlier
a. Focusing on what the developers found in their private testing --- continuity of learning --- and testing those components further in those situations to see whether adjustments worked
b. Asking developers to use an agile-like
development process --- asking to see the
performance of components, and then
subsystems, and then whole systems at
various stages of the development process for
independent testing.
3. Finding Failure Modes and
Design Flaws Earlier
Hardware Questions (we may not be able to answer all):
i. Testing at the ‘edge of the envelope’ --- how do we define
that? Is that always system specific?
ii. How much operational realism is needed to find failure
modes of various types? When can we test components
in isolation, when do we have to use typical users?
iii. What is the proper role of accelerated testing?
iv. What is the proper role of physics-based modeling and
simulation for finding failure modes early in
development?
v. Support feedback loops for system improvement
vi. Support feedback loops for test process improvement
3. Finding Failure Modes and
Design Flaws Earlier
Software Questions:
i. What is the role of regression testing based on earlier stages of development?
ii. How should we test more heavily in problematic areas of the software system?
iii. We CANNOT rely on OT (or maybe even current DT) to exercise software with nearly enough replications, so we need some type of automated representation of the remainder of the system to do fully automated testing (a sketch follows below).
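To illustrate item iii, here is a minimal, hypothetical sketch of fully automated testing against a software stand-in for the rest of the system; the component under test (route_message) and the stub are invented for the example.

import unittest
from unittest import mock

def route_message(msg: dict, radio) -> bool:
    """Toy component under test: forwards priority traffic over the radio link."""
    if msg.get("priority", 0) >= 3:
        return radio.send(msg["payload"])
    return False

class RegressionSuite(unittest.TestCase):
    def test_priority_traffic_is_forwarded(self):
        radio = mock.Mock()                  # automated stand-in for the rest of the system
        radio.send.return_value = True
        self.assertTrue(route_message({"priority": 5, "payload": b"x"}, radio))
        radio.send.assert_called_once()

    def test_low_priority_traffic_is_dropped(self):
        radio = mock.Mock()
        for priority in range(0, 3):         # cheap, repeatable replications
            self.assertFalse(route_message({"priority": priority, "payload": b"x"}, radio))
        radio.send.assert_not_called()

if __name__ == "__main__":
    unittest.main()

Because the replications are cheap and repeatable, a suite like this can be rerun at every stage of development as a regression test (item i).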
3. Finding Failure Modes and
Design Flaws Earlier
Integration Questions:
i. It may be hardest to find integration problems
early since it requires more of the system to be
developed.
ii. It may require more creative representation of
subsystems and associated expertise to
determine where the integration issues are likely
to reside and then test more there. (If you use a
heavier payload than a component was
previously required to accommodate, where
might problems show up?)
4. Improving System Reliability
a. Role of reliability growth models? (a sketch follows at the end of this topic)
b. Greater emphasis in DT – don’t pass into
OT until DT isn’t finding any reliability
problems?
c. Accelerated testing?
d. Component testing vs. subsystem testing vs. system testing
4. Improving System Reliability
(cont)
e. Some reliability problems will NOT show up in
DT unless you know what realism needs to be
represented. Testing needs to be looked at
comprehensively and collaboratively --- DT and
OT need to strategize
f. FMECA analysis --- how to build this into the
current process – wonderful graphics now at
Aberdeen
g. Retrieve defective parts and determine what
went wrong
h. Focus on life-cycle costs --- additional resources devoted to testing will easily pay for themselves
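As a sketch of item (a), here is one standard reliability growth calculation --- a Crow-AMSAA (power-law) fit to cumulative failure times from a time-truncated developmental test; the failure times are invented for illustration.

import math

failure_times = [35, 110, 180, 310, 490, 700, 950, 1250]  # assumed test hours
T = 1500.0                                                # total test time (hours)
n = len(failure_times)

beta_hat = n / sum(math.log(T / t) for t in failure_times)     # growth parameter
lambda_hat = n / T ** beta_hat                                  # scale parameter
instantaneous_mtbf = 1.0 / (lambda_hat * beta_hat * T ** (beta_hat - 1.0))

print(f"beta = {beta_hat:.2f}  (beta < 1 means reliability is growing)")
print(f"current MTBF estimate at {T:.0f} h: {instantaneous_mtbf:.0f} h")

A growth parameter below 1 indicates that corrective actions are reducing the failure intensity; tracking the current MTBF estimate against the requirement is one way to inform the kind of DT-to-OT decision raised in item (b).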
5. Process Changes
a. Stability in the PM position --- need for greater accountability at the
top
i) Civilian with near equal authority in a position that continues
throughout development, maybe chief engineer
ii) Deputy PM gets promoted to PM
b. Incongruent incentives --- PMs are not motivated to find every error
One possibility --- Independent comprehensive assessment of the
system status when the PM turns over. The PM is then judged more
objectively on how the system progressed under them, especially with regard to suitability design flaws.
c. Greater engineering expertise
Those managing development should be experienced in various
types of system development
d. Ensure that current critical processes are followed --- e.g., the TRL should be 7, but maybe a 5 will still get a pass, etc.
5. Process Changes (cont)
e. Greater collaboration between contractors, testers, and the ultimate
user. Place more constraints in contracts to require sharing of test
designs, test results, M&S development, M&S results, etc., etc. This
really does not have a downside. It needs to be done.
f. Everyone works from the same database, with everyone having easy
access, so that requirements, specifications, and later assertions of
reliability and effectiveness based on archived test results of various
kinds can be understood and debated
g. More realism about schedules and a view to doing it right the first time --- no free passes to next stages of development without proof.
In conclusion
Wish us luck --- we should be done in the
fall.
Thanks.