NASA OSMA SAS '02 Software Reliability Modeling: Traditional and Non-Parametric Dolores R. Wallace


NASA OSMA SAS '02
Software Reliability Modeling:
Traditional and Non-Parametric
Dolores R. Wallace
Victor Laing
SRS Information Services
Software Assurance Technology Center
http://satc.gsfc.nasa.gov/
dwallac, [email protected]
NASA OSMA SAS02
1
The Problem
• Critical NASA systems must execute successfully
for a specified time under specified conditions (reliability)
• Most systems rely on software
• Hence, a means to measure software reliability is
essential to determining readiness for operation
• Software reliability modeling provides one data
point for reliability measurement
Software Reliability Modeling
(SRM) – Traditional
• Captures hardware reliability engineering concepts
• Mathematically models behavior of a software
system from failure data to predict reliability growth
• Invokes curve-fitting techniques to determine values
of parameters used in the models
• Validates models against data using statistical analysis
• Using parametric values, predicts future
measurements, e.g.,
– Mean time to failure
– Total number of faults remaining
– Number of faults at time t
Synopsis
• FY01
– Identify mathematics of hardware reliability not
used in software
– Identify differences between hardware, software
affecting reliability measurement
– Identify possible improvements
• FY02
– Demonstrate practicality of SRM at GSFC
– Fault correction improvement – Schneidewind
– Non-parametric model - Laing
SRM: Data Collection
• Resistance to data collection
• Data content
– Accuracy of content
– Dates of failure, correction
– Calendar time not execution time
– Activities/ phase when failures occur
• Data manipulation
– Frequency counts
– Interval size and length
– Time between failure
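The data manipulation steps listed above (frequency counts per interval and time between failures) can be sketched in a few lines of Python. This is only an illustration and not the IntervalCounter tool itself: the function names, the 7-day interval size, and the failure dates are all assumptions made up for the example.

```python
from datetime import date


def weekly_failure_counts(failure_dates, start, n_weeks):
    """Count failures in consecutive 7-day intervals beginning at `start`.

    Dates that fall outside the n_weeks window are ignored.
    """
    counts = [0] * n_weeks
    for d in failure_dates:
        week = (d - start).days // 7
        if 0 <= week < n_weeks:
            counts[week] += 1
    return counts


def days_between_failures(failure_dates):
    """Return calendar days between consecutive failures (not execution time)."""
    ordered = sorted(failure_dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical failure dates, for illustration only.
failures = [date(2002, 1, 2), date(2002, 1, 4), date(2002, 1, 15), date(2002, 2, 1)]
start = date(2002, 1, 1)

counts = weekly_failure_counts(failures, start, n_weeks=5)
gaps = days_between_failures(failures)
print(counts)
print(gaps)
```

Changing the interval size changes the shape of the frequency counts, which is why interval size and length appear as separate data manipulation concerns.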
IntervalCounter
Sample had 35 weeks – simplified fault count
SMERFS^3 3-D OUTPUT
Practical Method
• SATC Services
– SATC executes models and prepares analysis
– SATC provides training and public domain tool
• Improvements
– Recommendations to projects for data collection
– IntervalCounter to simplify data manipulation
Fault Correction Adjustments
• Reliability growth occurs from fault correction
• Failure correction proportional to rate of failure
detection
• Adjusted model adds a delay dT (based on queuing
service) but keeps the same general form as the model
for faults detected at time T
• Process: use SMERFS Schneidewind model to get
parameters; apply to revised model via spreadsheet
• Results
– Show reliability growth due to fault correction
– Predict stopping rules for testing
SMERFS^3 – Excel Approach*
• Best approach: combine SMERFS^3 with Excel.
• A software reliability tool (SRT) provides model parameter estimation.
• Copy and paste parameters from the SRT into a
spreadsheet.
• Excel extends the capabilities of the SRT by allowing the user
to provide equations, statistical analysis, and plots.
* CASRE or another software reliability modeling tool may be used with Excel.
This is the recommended approach until the SRM tools incorporate this new model.
Non-parametric Reliability
Modeling
• Hardware
- Wears out over time
- Increasing failure rate
• Software
- Does not wear out over time
- Decreasing failure rate
Continued
• Hardware Reliability Modeling
- “Large” independent random sampling
- Model reliability
- Make predictions
• Software Reliability Modeling
- “Small” observed dependent sample (of size one?)
- Not based on independent random sampling
- Model reliability
- Make predictions?
Do we search for the silver bullet of SWR models?
Reliability Trending
• Hardware Reliability
[Figure: hardware reliability, from 0% (minimum) to 100% (maximum), plotted against time 0, 1, 2, 3, 4, …]
• Software Reliability
[Figure: software reliability, from 0% (minimum) to 100% (maximum), plotted against time 0, 1, 2, 3, 4, …]
Software Reliability Bounds
[Figure: reliability, from 0% (minimum) to 100% (maximum), plotted against time 0, 1, 2, 3, 4, …, showing the estimated model and the estimated bound]
Calculation of Estimated Models
and Bounds
• Dynamic Metrics
- Failure rate data
- Problem reports
• Static Code Metrics
- Traditional
- Source Lines of Code (SLOC)
- Cyclomatic Complexity (CC)
- Comment Percentage (CP)
- Object-Oriented
- Coupling Between Objects (CBO)
- Depth of Inheritance Tree (DIT)
- Weighted Methods per Class (WMC)
Combining Dynamic and
Static Metrics
• The Proportional Hazards Model (PHM)
$$R(t \mid z) = \{R_0(t)\}^{g(z)}$$
- Non-Parametric Component (Static)
- Parametric Component (Dynamic)
- where $z\beta = z_1\beta_1 + z_2\beta_2 + \dots + z_p\beta_p$, the $\beta_i$ are unknown
regression coefficients, and the $z_i$ are static code metrics data
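To make the proportional hazards form concrete, here is a minimal Python sketch that evaluates R(t|z) for one baseline reliability value and one vector of static metrics. The slide does not define g(z); the sketch assumes the common choice g(z) = exp(zβ). The function name, metric values, and coefficients are hypothetical and chosen only for illustration.

```python
import math


def phm_reliability(r0_t, z, beta):
    """Evaluate R(t|z) = R0(t) ** g(z), ASSUMING g(z) = exp(z . beta).

    r0_t : baseline reliability R0(t) at the time of interest, in [0, 1]
    z    : static code metric values (z1, ..., zp)
    beta : regression coefficients (beta1, ..., betap)
    """
    if not 0.0 <= r0_t <= 1.0:
        raise ValueError("baseline reliability must lie in [0, 1]")
    if len(z) != len(beta):
        raise ValueError("z and beta must have the same length")
    z_beta = sum(zi * bi for zi, bi in zip(z, beta))
    return r0_t ** math.exp(z_beta)


# Hypothetical inputs: two scaled static metrics and made-up coefficients.
baseline = 0.9
adjusted = phm_reliability(baseline, z=[1.0, 0.5], beta=[0.4, 0.2])
print(adjusted)
```

Under this assumed form, a positive zβ lowers reliability below the baseline and a negative zβ raises it, while zβ = 0 returns the baseline unchanged.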
Tool Schema
[Diagram: tool data flow]
• Input Data
- $z = (z_1, z_2, \dots, z_p)$
- Database
• Data Processing
- $R(t \mid z) = \{R_0(t)\}^{g(z)}$
- Weighted Average
• Output Data
- Observed Data
- Raw Data
- Estimated Model
- Estimated Bound
• Action
- Process Below Bounds: Corrective Action
- Process Above Bounds: No Corrective Action
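The action step of the schema can be sketched as a simple comparison between an observed reliability value and the estimated bound. The function name and the treatment of a value exactly on the bound are assumptions; the slide only distinguishes "below bounds" from "above bounds".

```python
def recommend_action(observed_reliability, estimated_bound):
    """Map an observation to the schema's two outcomes.

    Assumption: a value exactly equal to the bound is treated as acceptable.
    """
    if observed_reliability < estimated_bound:
        return "corrective action"
    return "no corrective action"


# Hypothetical values, for illustration only.
print(recommend_action(0.82, 0.90))
print(recommend_action(0.95, 0.90))
```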
SUMMARY
• Software reliability modeling
– Provides useful measurements for decisions
– Does not require expert knowledge of the math!
– Is relatively easy with use of software tools
• Fault correction improvement
– Adapts model to be more like software
– Demonstrates combined use of traditional SRM tools
with spreadsheet technology
• Non-parametric modeling
– New approach shows promise
– Prototype to be expanded
AIAA Recommended Steps
(specific to SRM)
• Characterizing the environment
• Determining test approach
• Selecting models
• Collecting data
• Estimating parameters
• Validating the models
• Performing analysis
Fault Correction Modeling
• Software reliability models focus on modeling and
predicting failure occurrence
– There has not been equal priority on modeling the fault
correction process.
• Fault correction modeling and prediction support efforts to
– predict whether reliability goals have been achieved
– develop stopping rules for testing
– formulate test strategies
– rationally allocate test resources
Equations: Prediction and Comparison
Worksheets
Time to Next Failure(s) Predicted at Time t:
$$T_F(t) = \frac{1}{\beta}\log\left[\frac{\alpha}{\alpha - \beta\,(X_{s,t} + F_t)}\right] - (t - s + 1)$$
Remaining Failures Predicted at Time t:
$$r(t) = \frac{\alpha}{\beta} - X_{s,t}$$
Cumulative Number of Failures Detected at Time T:
$$D(T) = \frac{\alpha}{\beta}\left[1 - \exp\bigl(-\beta\,(T - s + 1)\bigr)\right] + X_{s-1}$$
Cumulative Number of Failures Detected Over Life of Software $T_L$:
$$D(T_L) = \frac{\alpha}{\beta} + X_{s-1}$$
Equations developed by Dr. Norman Schneidewind, Naval
Postgraduate School, Monterey, CA
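As a spreadsheet-style illustration, the four prediction equations can be written as small Python functions that take the estimated parameters α and β as inputs. This is a sketch only: the function names and the sample parameter values are made up, and the reading of the symbols (s as the first interval used for estimation, X_{s,t} as failures observed in intervals s through t, X_{s-1} as failures observed before interval s, F_t as the number of additional failures being predicted) is an interpretation that the slide does not spell out.

```python
import math


def time_to_next_failures(alpha, beta, s, t, x_st, f_t):
    """T_F(t): predicted time until f_t more failures occur, measured from t."""
    denominator = alpha - beta * (x_st + f_t)
    if denominator <= 0:
        # The model predicts at most alpha/beta failures from interval s onward.
        raise ValueError("x_st + f_t must be less than alpha / beta")
    return math.log(alpha / denominator) / beta - (t - s + 1)


def remaining_failures(alpha, beta, x_st):
    """r(t): predicted failures remaining after x_st have been observed."""
    return alpha / beta - x_st


def cumulative_failures(alpha, beta, s, big_t, x_s_minus_1):
    """D(T): predicted cumulative failures detected by time T."""
    return (alpha / beta) * (1.0 - math.exp(-beta * (big_t - s + 1))) + x_s_minus_1


def lifetime_failures(alpha, beta, x_s_minus_1):
    """D(T_L): predicted cumulative failures over the life of the software."""
    return alpha / beta + x_s_minus_1


# Hypothetical parameter estimates, for illustration only.
alpha, beta, s = 10.0, 0.1, 1
print(time_to_next_failures(alpha, beta, s, t=10, x_st=63, f_t=1))
print(remaining_failures(alpha, beta, x_st=63))
print(cumulative_failures(alpha, beta, s, big_t=10, x_s_minus_1=0))
print(lifetime_failures(alpha, beta, x_s_minus_1=0))
```

In practice the parameters would come from a tool such as SMERFS^3 rather than being typed in by hand; the functions only reproduce the arithmetic that the worksheets perform.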