
Challenges in Modeling
COMPLEXITIES OF
MODELS
• Large State Space (e.g. Bedrock, Wireless
handoff)
– Model construction problem
– Model solution problem
• Model Stiffness.
Fast and slow rates acting together
– Failure And Recovery/Repair (HSP Markov model
in Bedrock)
– Performance and failure (Wireless handoff)
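As a rough illustration of what "fast and slow rates acting together" means numerically, the sketch below computes two common stiffness indicators for a set of transition rates: the ratio of the fastest to the slowest rate, and the product of the fastest rate and the mission time. The rate values and the function name are hypothetical, chosen only to show the orders of magnitude involved.

```python
def stiffness_indicators(rates, mission_time):
    """Return (fastest/slowest rate ratio, fastest rate * mission time)."""
    rates = list(rates)
    fastest = max(rates)
    slowest = min(rates)
    return fastest / slowest, fastest * mission_time


# Hypothetical rates per hour: component failure, repair, reconfiguration.
rates = {"failure": 1e-5, "repair": 0.5, "reconfiguration": 3600.0}
ratio, qt = stiffness_indicators(rates.values(), mission_time=1000.0)
print(f"rate ratio = {ratio:.1e}, q*t = {qt:.1e}")
```

Both indicators are many orders of magnitude above 1 here, which is the situation in which standard transient solvers become slow or inaccurate.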
COMPLEXITIES OF
MODELS
(Continued)
• Modeling Non-Exponential Distributions
(e.g. N+1 problem)
• Believability/Understandability/Usability
• What about software?
Potential Solutions
• Largeness
– Largeness Tolerance
– Largeness Avoidance
LARGENESS TOLERANCE
• Automated Model Construction
– Loops in the specification of CTMC (SHARPE)
– Stochastic Petri nets (SPNP, SHARPE)
– High level languages (SAVE, QNAP, ASSIST,
SDM)
– Fault-Tree + Recovery Info (HARP)
– Object-Oriented Approaches (TANGRAM)
LARGENESS TOLERANCE
(Continued)
• Efficient numerical solution techniques
– Sparse Storage
– Accurate and Efficient Solution Methods
We have Generated and Solved Models
with 1,000,000 states (has gone up
considerably recently)
Steady-State : NEAR-Optimal SOR
Transient: Modified Jensen's method
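A minimal sketch of the steady-state idea, assuming an irreducible CTMC whose transitions are stored sparsely as a dictionary of nonzero rates. It applies successive over-relaxation (SOR) to the balance equations; with the default relaxation factor omega = 1 it reduces to Gauss-Seidel. The function name, the fixed omega, and the three-state example are illustrative assumptions, not the near-optimal SOR scheme implemented in the tools.

```python
def sor_steady_state(n, transitions, omega=1.0, tol=1e-12, max_iter=10_000):
    """Steady-state vector of an irreducible CTMC.

    transitions: {(i, j): rate} for i != j, holding only nonzero rates.
    """
    out_rate = [0.0] * n                 # total exit rate of each state
    incoming = [[] for _ in range(n)]    # sparse column storage
    for (i, j), rate in transitions.items():
        out_rate[i] += rate
        incoming[j].append((i, rate))
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        delta = 0.0
        for j in range(n):
            inflow = sum(pi[i] * rate for i, rate in incoming[j])
            new = (1.0 - omega) * pi[j] + omega * inflow / out_rate[j]
            delta = max(delta, abs(new - pi[j]))
            pi[j] = new
        total = sum(pi)
        pi = [p / total for p in pi]     # renormalize after each sweep
        if delta < tol:
            break
    return pi


# Hypothetical two-unit system: state = number of failed units.
lam, mu = 0.001, 0.1
transitions = {(0, 1): 2 * lam, (1, 0): mu, (1, 2): lam, (2, 1): mu}
pi = sor_steady_state(3, transitions)
```

Only the nonzero rates are stored, so memory grows with the number of transitions rather than with the square of the number of states.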
MODEL SPECIFICATION
LANGUAGES
• Different languages can be used to specify a
single model type:
SAVE, QNAP, SPNP all appear very different;
underlying model type is Markov
• Same language can be used to specify different
model types: SPNP input language used for
Markovian SPN analytic-numeric solution or
non-Markovian SPN simulation solution
MODEL SPECIFICATION
LANGUAGES (Continued)
• Languages can be domain specific:
– Reliability: HARP, SDM
– Availability: SAVE
– Performance: RESQ, QNAP
• Language can be domain independent:
– SHARPE, SPNP
LARGENESS AVOIDANCE
• Non-State-Space methods
– Reliability block diagrams
– Fault-trees
– Product-Form Queuing Networks
• Approximate solutions
– State Truncation
SAVE, SPNP (Kantz and Trivedi: PNPM91)
Case Study: JPL REE System
Availability Modeling in
Spacecraft Architecture
LARGENESS AVOIDANCE
(Cont.)
• Stochastic Petri Nets (State-space-based
modeling)
• State truncation by introducing guard function
Guard g is defined as
If (mark(“…_dn”) >= K)
return (0);
else
return (1);
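The effect of such a guard on the size of the state space can be sketched as follows. The model here is a hypothetical one, not the JPL REE net: several component types, each with a "down" counter, where the failure transition of a type is disabled once K units of that type are down. All names and sizes are assumptions made for illustration.

```python
from collections import deque


def explore(n_per_type, n_types, k_limit):
    """Breadth-first search over markings (units down per component type)."""

    def guard(down_count):
        # Mirrors the guard: 0 (disabled) once K units of this type are down.
        return 0 if down_count >= k_limit else 1

    start = (0,) * n_types
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for t in range(n_types):
            successors = []
            if state[t] < n_per_type and guard(state[t]):      # failure
                successors.append(state[:t] + (state[t] + 1,) + state[t + 1:])
            if state[t] > 0:                                    # repair
                successors.append(state[:t] + (state[t] - 1,) + state[t + 1:])
            for nxt in successors:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return seen


full = explore(n_per_type=10, n_types=3, k_limit=10)   # guard never bites
truncated = explore(n_per_type=10, n_types=3, k_limit=2)
print(len(full), len(truncated))
```

The truncated model ignores markings with more than K units of a type down, which are normally very unlikely when repair is much faster than failure; the resulting error should still be bounded or checked.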
SPN MODELING
AVAILABILITY MEASURES
LARGENESS AVOIDANCE
• Approximate solutions
– Hierarchical Decomposition
and Fixed-Point Iteration among submodels:
• Heidelberger and Trivedi; IEEE-TC, 1983
(Queueing Models)
• Ciardo and Trivedi; PNPM91 (SPN Models)
• Tomek and Trivedi (Availability Models)
• Lanus, Liang & Trivedi: (Bedrock)
• Wireless handoff work: Ma, Han & Trivedi
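Fixed-point iteration among submodels can be sketched with a deliberately simplified cell model, loosely inspired by the wireless handoff setting but not taken from the cited papers. One submodel computes blocking from the total arrival rate (Erlang B); the other computes the handoff rate from the carried load; the handoff arrival rate is iterated until both agree. The homogeneous-cell assumption, the absence of guard channels, and all names and parameter values are assumptions.

```python
def erlang_b(channels, offered_load):
    """Erlang B blocking probability via the standard recursion."""
    b = 1.0
    for c in range(1, channels + 1):
        b = offered_load * b / (c + offered_load * b)
    return b


def handoff_fixed_point(channels, new_rate, completion_rate,
                        handoff_out_rate, tol=1e-10, max_iter=1000):
    """Iterate on the handoff arrival rate of a homogeneous cell."""
    handoff_in = 0.0
    release_rate = completion_rate + handoff_out_rate
    for iteration in range(1, max_iter + 1):
        total_rate = new_rate + handoff_in
        blocking = erlang_b(channels, total_rate / release_rate)
        busy = total_rate * (1.0 - blocking) / release_rate  # mean busy channels
        updated = handoff_out_rate * busy   # handoffs leaving = handoffs entering
        if abs(updated - handoff_in) < tol:
            return updated, blocking, iteration
        handoff_in = updated
    raise RuntimeError("fixed-point iteration did not converge")


rate_in, blocking, iterations = handoff_fixed_point(
    channels=10, new_rate=5.0, completion_rate=1.0, handoff_out_rate=0.5)
```

Each submodel stays small, and the coupling between them is carried by a single scalar that is passed back and forth until it stops changing.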
LARGENESS AVOIDANCE
(Continued)
• Approximate solutions
– Performability:
Multiprocessor example
– Fluid Approximation:
Mitra; Kulkarni; Ciardo; Nicol, and Trivedi;
FSPN
Difficulties in Modeling Using
MRMs
• Stiffness
Causes numerical difficulties in solution
– Stiffness Tolerance
Develop stiffness tolerant numerical
solution methods
– Stiffness Avoidance
Avoid generating stiff models through
decomposition
Potential Solutions
(Continued)
• Stiffness
– Stiffness Tolerance
– Stiffness Avoidance
• Modeling Non-Exponential Distributions
– Stage-type expansion, MRGP, NHCTMC, DES
STIFFNESS TOLERANCE
• Automatic Detection of Stiffness (HARP)
• Special Stable ODE Solver
Reibman and Trivedi (TR-BDF2)
Computers and Operations Research, 1988.
Malhotra and Trivedi (Pade, Implicit RK)
STIFFNESS TOLERANCE
(Continued)
• Uniformization for Stiff Markov Chains
Muppala and Trivedi
We can solve models with rate ratios of 10^8 or
higher
Implemented in SHARPE & SPNP
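For reference, the sketch below shows plain (textbook) uniformization for the transient solution of a small CTMC. It is not the stiffness-tolerant variant of Muppala and Trivedi: the number of terms grows roughly with q*t, which is exactly what makes the plain method expensive on stiff models. The function name, the 2% margin on the uniformization rate, and the two-state example are assumptions.

```python
import math


def uniformization(Q, p0, t, eps=1e-12, max_terms=100_000):
    """Transient state probabilities of a CTMC with generator Q at time t."""
    n = len(Q)
    q = 1.02 * max(-Q[i][i] for i in range(n))   # uniformization rate
    # DTMC obtained by uniformizing the CTMC: P = I + Q / q
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / q for j in range(n)]
         for i in range(n)]
    weight = math.exp(-q * t)   # Poisson(0; q*t); underflows if q*t is large
    vec = list(p0)              # p0 * P^k, starting with k = 0
    result = [weight * v for v in vec]
    covered = weight            # Poisson probability mass summed so far
    terms = 0
    while 1.0 - covered > eps and terms < max_terms:
        terms += 1
        vec = [sum(vec[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= q * t / terms
        covered += weight
        result = [r + weight * v for r, v in zip(result, vec)]
    return result, terms


# Hypothetical two-state availability model (state 0 = up, 1 = down).
lam, mu, t = 0.001, 0.1, 50.0
Q = [[-lam, lam], [mu, -mu]]
probs, terms = uniformization(Q, [1.0, 0.0], t)
closed_form = mu / (lam + mu) + lam / (lam + mu) * math.exp(-(lam + mu) * t)
```

For the two-state model the result can be checked against the closed-form point availability, which is what `closed_form` holds.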
STIFFNESS
AVOIDANCE
• Model-level decomposition
– Hierarchical Composition (SHARPE)
Composition of Submodel solutions without
generating a single one-level overall model
(Bedrock example)
– Fixed-Point Iteration (Wireless handoff example)
STIFFNESS
AVOIDANCE (Continued)
• Importance Sampling (simulation)
– Lewis, Goyal, Heidelberger, Shahabuddin, Geist,
Nicola
– Can also apply to analytic-numeric methods
(Heidelberger, Muppala, and Trivedi; Performance
93)
• Importance splitting (Simulation)
– Tuffin and Trivedi; Tools’ 00
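A toy example of importance sampling, unrelated to any specific tool: estimating the small probability P(X > 20) for an exponential random variable with rate 1. Samples are drawn from a biased exponential density under which the rare event is common, and each hit is weighted by the likelihood ratio. The choice of biased density, the sample size, and the seed are assumptions.

```python
import math
import random


def rare_tail_probability(threshold, samples, seed=0):
    """Estimate P(X > threshold) for X ~ Exp(1) by importance sampling."""
    rng = random.Random(seed)
    biased_rate = 1.0 / threshold   # biased density has mean = threshold
    total = 0.0
    for _ in range(samples):
        x = rng.expovariate(biased_rate)
        if x > threshold:
            # Likelihood ratio: original density over biased density at x.
            total += math.exp(-x) / (biased_rate * math.exp(-biased_rate * x))
    return total / samples


estimate = rare_tail_probability(threshold=20.0, samples=100_000)
exact = math.exp(-20.0)
print(f"estimate = {estimate:.3e}, exact = {exact:.3e}")
```

Naive simulation would need on the order of a billion samples to observe this event even a handful of times; the biased sampler hits it in roughly a third of its draws.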
Non-Exponential Behavior
• Non state space models: Fault Trees, Reliability
Graphs, RBDs; no problem
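The reason non-state-space models have "no problem" is that, assuming statistically independent components, they only need each component's reliability at time t, whatever its distribution. A minimal sketch with Weibull lifetimes in a series-parallel block diagram; the structure and the parameter values are hypothetical.

```python
import math


def weibull_reliability(t, shape, scale):
    """R(t) = exp(-(t / scale) ** shape)."""
    return math.exp(-((t / scale) ** shape))


def series(*reliabilities):
    """All blocks must work."""
    result = 1.0
    for r in reliabilities:
        result *= r
    return result


def parallel(*reliabilities):
    """At least one block must work."""
    unreliability = 1.0
    for r in reliabilities:
        unreliability *= 1.0 - r
    return 1.0 - unreliability


t = 1000.0  # hours
disk = weibull_reliability(t, shape=1.5, scale=5000.0)   # wear-out behavior
cpu = weibull_reliability(t, shape=1.0, scale=20000.0)   # exponential case
system = series(cpu, parallel(disk, disk))
```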
Non-Exponential Behavior
in State Space Models
NON-EXPONENTIAL
DISTRIBUTIONS
• Phase-Type Expansions
– N+1 example
• Non-Homogeneous Markov Chains
CARE III, HARP
Soft Rel model with imperfect repairs solved
using SHARPE
NON-EXPONENTIAL
DISTRIBUTIONS (Continued)
• Semi-Markov Chains
N+1 example
• Markov Regenerative Processes:
Choi, Logothetis, Kulkarni, Trivedi
• DSPN and MRSPN:
Choi, Kulkarni, Trivedi
• Discrete-Event Simulation
Now in SPNP (FSPN and Non-Markovian
SPN Simulation), RESQ, QNAP, Bones, SES
workbench
CASE STUDY: AT&T
• GSHARPE:
– A Preprocessor to SHARPE developed at
Bell Labs by a Duke Student.
– User can specify Weibull Failure times and
lognormal and other repair time
distributions.
– GSHARPE fits these to phase-type
distributions and generates a Markov model
for processing by SHARPE
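The fitting step can be illustrated with a generic two-moment match. This is not GSHARPE's actual algorithm, which the slides do not describe: the sketch picks an Erlang distribution when the squared coefficient of variation is at most 1, and a two-phase hyperexponential with balanced means otherwise. The function names and the Weibull parameters are assumptions.

```python
import math


def weibull_moments(shape, scale):
    """Mean and squared coefficient of variation of a Weibull distribution."""
    mean = scale * math.gamma(1.0 + 1.0 / shape)
    second = scale ** 2 * math.gamma(1.0 + 2.0 / shape)
    return mean, second / mean ** 2 - 1.0


def fit_phase_type(mean, cv2):
    """Two-moment phase-type fit (Erlang matches cv2 only approximately)."""
    if cv2 <= 0.0:
        raise ValueError("cv2 must be positive")
    if cv2 <= 1.0:
        stages = max(1, round(1.0 / cv2))
        return "erlang", stages, stages / mean      # k stages, rate k / mean
    # Hyperexponential with balanced means: p / mu1 == (1 - p) / mu2.
    p = 0.5 * (1.0 + math.sqrt((cv2 - 1.0) / (cv2 + 1.0)))
    return "hyperexp", (p, 1.0 - p), (2.0 * p / mean, 2.0 * (1.0 - p) / mean)


mean, cv2 = weibull_moments(shape=2.0, scale=1000.0)   # hypothetical failure time
kind, stages, rate = fit_phase_type(mean, cv2)
```

Each stage of the fitted distribution becomes an extra state in the expanded Markov model, which is why phase-type expansion trades distributional accuracy against state-space size.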
Potential Solutions
(Continued)
• Believability/Understandability/Usability
– GUI, many practical examples, short-courses, tools,
Boeing SDM project
• Incorporation in the design process
– VHDL → Availability Model,
– C Program → Perf. Model
– Ada Program → SPN Perf. Model (SPC)
• Connection between measurements & models
BELIEVABILITY
UNDERSTANDABILITY
• Integration of Measurements and Models
– Measurements Provide Parameters to Models
– Models Provide Guidelines For Measurements
– Models Validated Against Measurements
• Integration of Different Modeling Tools
– Boeing SDM project
BELIEVABILITY/
UNDERSTANDABILITY
(Continued)
• Many Case-Studies of Validations Needed
– VAXcluster Availability Model: Wein & Sathaye
– Hsueh, Iyer and Trivedi; IEEE-TC, Apr. 1988
– Lucent Validation of ESS; Veena Mendiratta
• Technology Transfer
– Short courses
– Development and Dissemination of Tools
(SHARPE, SPNP)
BELIEVABILITY/
UNDERSTANDABILITY
(Continued)
• Application of the Techniques and Tools
– Motorola
– Cisco
– 3Com
– HP
– Sun
CASE STUDY: BOEING
• An Integrated Reliability Environment
• A working prototype
• Developed a high-level modeling language
(SDM)
• Designed and implemented an intelligent
interpreter
CASE STUDY: BOEING
(Continued)
• Interpreter determines which solution method
is applicable
• Translator translates the SDM input file into an
input file of any of the engines down below
• Five different modeling engines are integrated:
– CAFTA, SETS, EHARP, SHARPE and
SPNP.
MODELING AND
MEASUREMENTS: INTERFACES
• Measurements supply Input Parameters to Models
(Model Calibration or Parameterization)
Confidence Intervals should be obtained
Boeing, Draper, Union Switch projects
• Model Sensitivity Analysis can suggest which
Parameters to Measure More Accurately: Blake,
Reibman and Trivedi: SIGMETRICS 1988; Fricks
and Trivedi: 1997
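One simple way to see which parameters deserve more accurate measurement is to rank them by scaled sensitivity, that is, the parameter value times the partial derivative of the output measure. The sketch below does this by central finite differences for a hypothetical two-component series availability model; it is not the method of the cited papers, and all names and values are assumptions.

```python
def series_availability(lam_cpu, mu_cpu, lam_disk, mu_disk):
    """Steady-state availability of two independent components in series."""
    return (mu_cpu / (lam_cpu + mu_cpu)) * (mu_disk / (lam_disk + mu_disk))


def scaled_sensitivity(model, params, name, rel_step=1e-6):
    """Parameter value times d(model)/d(parameter), by central differences."""
    step = params[name] * rel_step
    up = dict(params, **{name: params[name] + step})
    down = dict(params, **{name: params[name] - step})
    return params[name] * (model(**up) - model(**down)) / (2.0 * step)


# Hypothetical failure and repair rates per hour.
params = {"lam_cpu": 1e-4, "mu_cpu": 0.5, "lam_disk": 1e-3, "mu_disk": 0.1}
sens = {name: scaled_sensitivity(series_availability, params, name)
        for name in params}
ranking = sorted(params, key=lambda name: abs(sens[name]), reverse=True)
```

With these values the disk parameters dominate, suggesting that measurement effort is better spent on the disk failure and repair rates than on the CPU rates.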
MODEL CALIBRATION
What is λ ?
• Fault Model for Each Component
– Design, Manufacturing: Heisenbugs, Bohrbugs
– Operational: Permanent,
Intermittent, Transient
– Human
• Fault Arrival Processes
(PP, Weibull, NHPP)
• Failure Rates (Sources: MIL-STD)
MODEL CALIBRATION
(Continued)
What is c ?
• Field Data
• Fault/Error Injection (FIAT,MESSALINE)
• Analytic Coverage Model
What is μ ?
• Maintenance Model
Corrective; dispatch, travel, repair time,
dead on arrival, imperfect repair
Preventive
MODEL CALIBRATION
(Continued)
What is r ?
• Binary: Up & Down
• Capacity-Oriented:
Number of Operational Resources in Each State
• Performance-Oriented:
Evaluate Perf. in Each Degraded Level of Syst. Config.
1. Measurements
2. Simulation Model
3. Analytic Model -- SHARPE, SPNP
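How the choice of reward rates r changes the measure obtained from the same model can be sketched on a small birth-death availability model. The two-unit system, its rates, and the function names are hypothetical; binary rewards give steady-state availability, capacity-oriented rewards give the expected number of operational units.

```python
def birth_death_steady_state(forward_rates, backward_rates):
    """Steady state of a birth-death chain.

    forward_rates[i]: rate from state i to i + 1.
    backward_rates[i]: rate from state i + 1 to i.
    """
    weights = [1.0]
    for forward, backward in zip(forward_rates, backward_rates):
        weights.append(weights[-1] * forward / backward)
    total = sum(weights)
    return [w / total for w in weights]


def expected_reward(pi, rewards):
    """Expected steady-state reward rate: sum of r_i * pi_i."""
    return sum(p * r for p, r in zip(pi, rewards))


# State = number of failed units in a two-unit system (rates per hour).
lam, mu = 0.001, 0.1
pi = birth_death_steady_state([2 * lam, lam], [mu, mu])
availability = expected_reward(pi, [1, 1, 0])   # binary: up while any unit works
capacity = expected_reward(pi, [2, 1, 0])       # operational units per state
```

A performance-oriented reward would replace the capacity vector with, for example, the throughput measured or computed for each degraded configuration.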
VALIDATION & VERIFICATION
– Validation: Does the conceptual model faithfully
reflect the behavior of the system?
– Verification: Has the conceptual model been
correctly implemented?
MODEL VALIDATION
(Continued)
• Three-step process outlined by Naylor and
Finger
– Face validation: Discussion with the experts
– Input-Output validation: Compare results
obtained from model with those from
measurements
– Validation of model assumptions: Either
prove that the assumptions are correct or do
statistical testing
MODEL
ASSUMPTIONS/ERRORS
• Errors in Model Structure
– Missing or Extra Arcs
– Missing or Extra States
– Use Face Validation to avoid these errors.
• Errors Due to Non-Independence
• Distributional Errors
• Parametric Errors
MODEL ASSUMPTIONS/
ERRORS (Continued)
• Errors Due to Approximations
– Decomposition/Aggregation/Iteration
– State Truncation
• Numerical Solution Errors
– Discretization Errors
– Round-Off Errors
Model Verification
• Programming Errors
• Approximation errors: Tight bounds on the errors
due to approximations are desirable
• Numerical: Errors in numerical algorithms
should be bounded
What about software?
• Testing phase
– Software reliability estimation
• Black-box based approach
• Architecture-based approach
• Operational phase
– Fault tolerance coverage (c in Markov model)
– Countering software aging
• Symptom-based fault management
Conclusions:
• Availability evaluation is very important in
characterizing systems
• Evaluation can be performed through
measurements, simulation, or analytical modeling
• Model verification and validation should form an
integral part of the modeling process