COCOMO II Integrated with Crystal Ball Risk Analysis Software Clate Stansbury

COCOMO II Integrated with
Crystal Ball® Risk Analysis Software
Clate Stansbury
MCR, LLC
(703) 506-4600
Prepared for
19th International Forum on COCOMO Software Cost Modeling
University of Southern California
Los Angeles CA
26-29 October 2004
1
Contents
• Purpose: Describing Uncertainty
• Representing Uncertain Inputs
• Simulating Costs
• Correlating Inputs and Costs
• Summary
2
Estimators Must Describe Uncertainty
• Report Cost As a Statistical Quantity, Not a Point
– Cost of Any Incomplete Program Is Uncertain
– Estimator Must Report That Uncertainty as Part of His
or Her Delivered Estimate
• Cost-risk Analysis Allows Estimator to Report Cost
As a Probability Distribution, So Decision-maker Is
Made Aware of
– Expected Cost (Mean)
– 50th Percentile Cost (Median)
– 80th Percentile Cost
– Overrun Probability of Project Budget
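The four quantities listed above can all be read off a set of simulated total costs. The following is a minimal Python sketch, not part of the original slides: the function name `summarize_cost`, the budget figure, and the triangular stand-in for simulation output are all hypothetical.

```python
import random
import statistics

def summarize_cost(samples, budget):
    """Summarize simulated total costs as mean, median, 80th percentile, overrun odds."""
    ordered = sorted(samples)
    # With n=100, statistics.quantiles returns the 1st..99th percentile cut points.
    percentiles = statistics.quantiles(ordered, n=100, method="inclusive")
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p80": percentiles[79],
        "overrun_probability": sum(c > budget for c in ordered) / len(ordered),
    }

# Hypothetical stand-in for simulation output: 10,000 draws from one triangular cost.
# Note the stdlib argument order: triangular(low, high, mode).
rng = random.Random(2004)
samples = [rng.triangular(450.0, 800.0, 560.0) for _ in range(10_000)]
summary = summarize_cost(samples, budget=650.0)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reporting the overrun probability next to the percentiles lets the decision-maker see how much confidence a given budget actually buys.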
3
Representing Uncertain Inputs Using
Triangular Distributions
4
[Figure: Triangular Distribution of Element Cost, Reflecting Uncertainty in “Best” Estimate. Vertical axis: density; horizontal axis: cost ($). L = optimistic cost; M = best-estimate cost (mode = most likely); H = cost implication of technical, programmatic assessment.]
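For reference, the triangular density in this figure can be written out using the figure's own labels L, M, and H. This is the standard textbook form, added here as a sketch and assuming L < M < H:

```latex
f(x) =
\begin{cases}
\dfrac{2\,(x - L)}{(H - L)(M - L)}, & L \le x \le M, \\[2ex]
\dfrac{2\,(H - x)}{(H - L)(H - M)}, & M < x \le H, \\[2ex]
0, & \text{otherwise,}
\end{cases}
\qquad
\mathrm{E}[X] = \frac{L + M + H}{3},
\qquad
\mathrm{Var}(X) = \frac{L^2 + M^2 + H^2 - LM - LH - MH}{18}.
```

When H - M exceeds M - L, the mean (L + M + H)/3 lies above the mode M, so the most likely cost understates the expected cost.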
5
COCOMO Cost Drivers as Triangular
Distributions
• For Each COCOMO II Input …
– Input Request Interpreted as a Triangular Distribution
– User Estimates Optimistic, Most Likely, and Pessimistic Values
(which may not always be all different from each other)
[Figure: triangular probability density over cost, with the Optimistic, Most Likely (mode), and Pessimistic values marked.]
User provides three values for each COCOMO II input,
as though there were three separate projects.
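A minimal Python sketch of how one input's three values could be turned into random draws. The helper name `sample_driver` and the example values are hypothetical; the sketch also covers the case noted above where the three values are not all different.

```python
import random

def sample_driver(optimistic, most_likely, pessimistic, rng):
    """Draw one value of an input from its (O, M, P) triangular distribution."""
    # Order the bounds numerically; for some inputs the optimistic value is the larger one.
    low, high = min(optimistic, pessimistic), max(optimistic, pessimistic)
    if not low <= most_likely <= high:
        raise ValueError("most likely value must lie between the other two")
    if low == high:
        return float(most_likely)  # all three estimates coincide: no uncertainty
    # Stdlib argument order is triangular(low, high, mode).
    return rng.triangular(low, high, most_likely)

rng = random.Random(7)
# Hypothetical effort-multiplier estimates for one cost driver.
draws = [sample_driver(0.90, 1.00, 1.14, rng) for _ in range(5)]
print([round(d, 3) for d in draws])
```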
6
COCOMO Cost Drivers as Triangular
Distributions
[Figure: example cost-driver input; only the values 0.90 and 1.14 survive from the original slide.]
7
COCOMO Cost Drivers as Triangular
Distributions
Why triangular distribution?
• Triangular Distribution is Simple and Malleable
• Parameters (Optimistic, Most Likely, Pessimistic) Are
Easy to Define and Explain
• Could Have User Provide Parameters for Normal,
Lognormal, Exponential, Uniform, or Beta Distributions,
for Example, if More Is Known About the Distributions
• Good Topic for Further Research
8
Processing Uncertainty Using
Simulations
9
How to Process Triangular Distributions?
• How to Take the Product of Effort Multipliers When Each
EM Is a Triangular Distribution?
• How to Compute Rest of COCOMO II Algorithm?
• How to Sum Code Counts for All CSCIs?
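One way to answer all three questions is to sample every input and run the ordinary point calculation once per trial. The sketch below assumes the COCOMO II.2000 Post-Architecture form, PM = A × Size^E × ∏EM with E = B + 0.01 × ΣSF, A = 2.94, and B = 0.91. It also assumes, as a simplification, that CSCI sizes are summed and one set of multipliers applies to the whole project. All input values are hypothetical.

```python
import math
import random

A, B = 2.94, 0.91  # COCOMO II.2000 calibration constants

def draw(estimate, rng):
    """Sample one (optimistic, most_likely, pessimistic) triple given in ascending order."""
    optimistic, most_likely, pessimistic = estimate
    # Stdlib order is triangular(low, high, mode); equal bounds return that value.
    return rng.triangular(optimistic, pessimistic, most_likely)

def simulate_effort(csci_ksloc, scale_factors, effort_multipliers, trials, seed=0):
    """Return simulated effort in person-months, one value per trial."""
    rng = random.Random(seed)
    efforts = []
    for _ in range(trials):
        size = sum(draw(e, rng) for e in csci_ksloc)  # sum code counts over CSCIs
        exponent = B + 0.01 * sum(draw(e, rng) for e in scale_factors)
        em_product = math.prod(draw(e, rng) for e in effort_multipliers)
        efforts.append(A * size ** exponent * em_product)
    return efforts

# Hypothetical inputs, for illustration only.
csci_ksloc = [(8.0, 10.0, 15.0), (18.0, 20.0, 30.0)]
scale_factors = [(3.0, 3.7, 5.0), (2.0, 3.0, 4.1), (2.8, 4.2, 5.7),
                 (2.2, 3.3, 4.4), (3.1, 4.7, 6.2)]
effort_multipliers = [(0.90, 1.00, 1.14), (0.87, 1.00, 1.17), (1.00, 1.00, 1.00)]

efforts = sorted(simulate_effort(csci_ksloc, scale_factors, effort_multipliers, trials=5_000))
median = efforts[len(efforts) // 2]
p80 = efforts[int(0.8 * len(efforts))]
print(f"median {median:.1f} PM, 80th percentile {p80:.1f} PM")
```

Because each trial is just the point calculation, no closed-form rule for multiplying or summing triangular distributions is needed.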
10
Traditional “Roll-Up” Method (Too
Simple)
• Define “Best Estimate” of Each Cost Element
to be the Most Likely Cost of that Element
• List Cost Elements in a Work-Breakdown
Structure (WBS)
– Calculate “Best Estimate” of Cost for Each
Element
– Sum All Best Estimates
– Define Result to be “Best Estimate” of Total
Project Cost
• Unfortunately, This Approach Has Problems: the Sum of
Most Likely Element Costs Is Generally Not the Most
Likely Total Cost
11
Why “Roll-up” Doesn’t Work
[Figure: WBS-element triangular cost distributions, each with its most likely cost marked on a cost ($) axis, are merged into a total-cost normal distribution. The total-cost distribution marks both the roll-up of most likely WBS-element costs and the most likely total cost.]
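The gap between the roll-up and the simulated total can be checked numerically. The sketch below uses ten hypothetical, identical, right-skewed elements with independent costs; the function name `rollup_vs_simulation` and all values are illustrative assumptions.

```python
import random
import statistics

def rollup_vs_simulation(elements, trials=20_000, seed=1):
    """Compare the sum of most likely costs with simulated total cost.

    elements: list of (low, most_likely, high) cost triples, sampled independently.
    Returns (roll-up, mean simulated total, probability total <= roll-up).
    """
    rng = random.Random(seed)
    rollup = sum(mode for _, mode, _ in elements)
    totals = [
        sum(rng.triangular(low, high, mode) for low, mode, high in elements)
        for _ in range(trials)
    ]
    p_within = sum(t <= rollup for t in totals) / trials
    return rollup, statistics.fmean(totals), p_within

# Hypothetical WBS: ten elements with more room to overrun than to underrun.
elements = [(90.0, 100.0, 150.0)] * 10
rollup, mean_total, p_within = rollup_vs_simulation(elements)
print(f"roll-up of most likely costs: {rollup:.0f}")
print(f"mean simulated total cost:    {mean_total:.1f}")
print(f"P(total <= roll-up):          {p_within:.4f}")
```

With skewed elements like these, the roll-up sits far down the lower tail of the total-cost distribution, so a budget set at the roll-up is very likely to be overrun.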
12
What Information a Cost Estimate Should
Provide
Statistical Information Output About the Cost
– Probability Density (Frequency Distribution or
Histogram)
– S-curve (Cumulative Probability Distribution)
– Percentiles
– Min, Max, Mode, Mean
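A minimal sketch of how the S-curve and a percentile table could be derived from simulation trials. The function names and the stand-in samples are hypothetical, and linear interpolation between order statistics is one common convention, not necessarily the one Crystal Ball uses.

```python
import bisect
import random

def s_curve(samples):
    """Return a function giving the empirical probability that cost <= x."""
    ordered = sorted(samples)

    def cdf(x):
        return bisect.bisect_right(ordered, x) / len(ordered)

    return cdf

def percentile_table(samples, step=10):
    """Return (percent, value) rows at 0%, step%, ..., 100% by linear interpolation."""
    ordered = sorted(samples)
    rows = []
    for pct in range(0, 101, step):
        pos = pct / 100 * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        rows.append((pct, ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])))
    return rows

# Hypothetical stand-in for simulated total costs.
rng = random.Random(3)
costs = [rng.triangular(450.0, 800.0, 560.0) for _ in range(10_000)]
cdf = s_curve(costs)
print(f"P(cost <= 650) = {cdf(650.0):.3f}")
for pct, value in percentile_table(costs):
    print(f"{pct:>3}%  {value:8.2f}")
```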
13
What a Cost Estimate Should Look Like
(Crystal Ball Outputs)

Percentile    Value
0%            450.19
10%           516.81
20%           538.98
30%           557.85
40%           575.48
50%           592.72
60%           609.70
70%           629.19
80%           650.97
90%           683.01
100%          796.68

Statistic             Value
Trials                10,000
Mean                  596.40
Median                592.72
Mode                  ---
Standard Deviation    63.18
Range Minimum         450.19
Range Maximum         796.68

[Figure: “S-Curve”. Crystal Ball cumulative chart for Forecast A8 (10,000 trials, 71 outliers); horizontal axis from 462.43 to 761.35; cumulative probability from .000 to 1.000.]
[Figure: “Density Curve”. Crystal Ball frequency chart for Forecast A8 (10,000 trials, 71 outliers); horizontal axis from 462.43 to 761.35; probability from .000 to .020; frequency from 0 to 197.]
14
Cost-Risk Analysis Works by
Simulating System Cost
• In Engineering Work, Computer Simulation of System
Performance is Standard Practice, with Key Performance
Characteristics Modeled by Monte Carlo Analysis as
Random Variables, e.g.
– Data Throughput
– Time to Lock
– Time Between Data Receipt and Delivery
– Atmospheric Conditions
• Cost-Risk Analysis Enables the Cost Analyst to Conduct a
Computer Simulation of System Cost
– WBS-element Costs Are Modeled As Random Variables
– Total System Cost Distribution is Determined by Monte
Carlo Simulation
– Cost is Treated as a Performance Criterion
15
Crystal Ball Risk- Analysis Software
• Commercially Available Third-Party Software Add-on
to Excel, Marketed by Decisioneering, Inc., 2530 S.
Parker Road, Suite 220, Aurora, CO 80014, (800)
289-2550
• Inputs
– Parameters Defining WBS-Element Distributions
– Rank Correlations Among WBS-Element Cost Distributions
• Mathematics
– Monte-Carlo (Random) or Latin Hypercube (Stratified)
Statistical Sampling
– Virtually All Probability Distributions That Have Names Can
Be Used
– Suggests Adjustments to Inconsistent Input Correlation Matrix
• Outputs
– Percentiles and Other Statistics of Program Cost
– Cost Probability Density and Cumulative Distribution Graphics
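To illustrate the difference between the two sampling options, here is a minimal sketch of Latin Hypercube (stratified) sampling for a single triangular input. It is a generic textbook construction with hypothetical function names, not Crystal Ball's implementation: the probability range is split into n equal strata, one draw is taken inside each, and the draws are mapped through the inverse triangular CDF.

```python
import math
import random

def triangular_inverse_cdf(u, low, mode, high):
    """Map a probability u in [0, 1] to the matching triangular quantile (low < high)."""
    split = (mode - low) / (high - low)  # cumulative probability at the mode
    if u < split:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))

def latin_hypercube_triangular(low, mode, high, n, rng):
    """Return n stratified draws: exactly one from each of n equal-probability bands."""
    probabilities = [(k + rng.random()) / n for k in range(n)]
    rng.shuffle(probabilities)  # randomize the order in which bands are visited
    return [triangular_inverse_cdf(u, low, mode, high) for u in probabilities]

rng = random.Random(11)
draws = latin_hypercube_triangular(450.0, 560.0, 800.0, 100, rng)
print(f"min {min(draws):.1f}, max {max(draws):.1f}, mean {sum(draws) / len(draws):.1f}")
```

Plain Monte Carlo would use `rng.random()` directly for each probability. Stratifying guarantees that the tails are visited even with few trials, and with several inputs each one gets its own independently shuffled set of strata.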
16
How CB Simulations Work
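[Figure: on each trial (Trial 1, Trial 2, ..., Trial 5000), a value is drawn for each assumption cell (e.g., cell G5) and the total-cost forecast cell, =SUM($G$4:$G$8), is recalculated.]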
17
Representing Correlations Among Risks
18
Risks are Correlated
• Resolving One WBS Element’s Risk Issues by
Spending More Money on It Often Involves Increasing
Cost of Several Other Elements as Well
– For Example, Excessive Complexity in One CSCI Impacts Effort
Required to Develop Other CSCIs that Interface with It
– Schedule Slippage Due to Problems in One CSCI Leads to Cost Growth
and Schedule Slippage in Other Elements (“Standing Army Effect”)
– Hardware Problems Discovered Late in Program Often Have to Be
Circumvented by Making Expensive Last-minute Fixes to the Software
• As We Will Soon See, Inter-Element Correlation
Tends to Increase the Variance of the Total-Cost
Probability Distribution
• Numerical Values of Inter-WBS-Element Correlations
are Difficult to Estimate, but That’s Another Story
19
Maximum Possible Underestimation
of Total-Cost Sigma
• Percent Underestimated When Correlation Assumed
to be 0 Instead of r (n=# of WBS elements)
[Figure: Percent Underestimated (vertical axis, 0 to 100) versus Actual Correlation (horizontal axis, 0 to 1), with one curve each for n = 10, n = 30, n = 100, and n = 1000.]
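The curves can be reproduced from the variance formula on the backup slide “Correlation Matters”, assuming every element has the same variance and every pair the same correlation r. Under those assumptions, the fraction of sigma missed by taking r = 0 is 1 - 1/sqrt(1 + (n - 1) r). The function name below is hypothetical.

```python
import math

def percent_sigma_underestimated(n, r):
    """Percent by which total-cost sigma is understated if correlation r is taken as 0.

    Assumes n elements with equal variance and a common pairwise correlation r,
    so that Var(C) = n * sigma**2 * (1 + (n - 1) * r).
    """
    if n < 1 or not 0.0 <= r <= 1.0:
        raise ValueError("need n >= 1 and 0 <= r <= 1")
    return 100.0 * (1.0 - 1.0 / math.sqrt(1.0 + (n - 1) * r))

for n in (10, 30, 100, 1000):
    row = [round(percent_sigma_underestimated(n, r), 1) for r in (0.1, 0.2, 0.5, 1.0)]
    print(f"n = {n:>4}: {row}")
```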
20
Selection of Correlation Values
• “Ignoring” Correlation Issue is Equivalent to
Assuming that Risks are Uncorrelated, i.e., that All
Correlations are Zero
• Square of Correlation (namely, R²) Represents
Percentage of Variation in one WBS Element’s Cost
that is Attributable to Influence of Another’s
• Reasonable Choice of Nonzero Values Brings You
Closer to Truth
• Most Elements are, in Fact, Pairwise Correlated
• 0.2 is at “Knee” of Curve on Previous Chart, thereby
Providing Most of the Benefit at Least Commitment
Correlation    % Influenced
0.00           0%
0.10           1%
0.32           10%
0.50           25%
0.71           50%
21
Determining Correlations Among
COCOMO II Cost Drivers
• Default Correlations
– Correlations of Intra-CSCI Inputs to Default to 0.5
– Correlations of Inter-CSCI Efforts to Default to 0.2
• More Detailed Default Correlations?
– Higher Correlation Between RELY and DOCU?
– COCOMO II Security Extension Cost Driver Related to
Existing Cost Drivers
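A minimal sketch of how a common default correlation, such as the intra-CSCI value of 0.5, could be imposed on sampled inputs. It uses a one-factor Gaussian copula, chosen here only for brevity; this is an assumed stand-in and not the rank-correlation procedure used by Crystal Ball. Function names and input values are hypothetical.

```python
import math
import random
from statistics import NormalDist

def triangular_quantile(u, low, mode, high):
    """Inverse CDF of a triangular distribution with low < high."""
    if u < (mode - low) / (high - low):
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))

def correlated_trial(inputs, r, rng):
    """Draw one value per (low, mode, high) input with a common pairwise correlation.

    Each input's latent normal shares a common factor, giving pairwise
    correlation r (0 <= r <= 1) between the latent normals.
    """
    shared = rng.gauss(0.0, 1.0)
    values = []
    for low, mode, high in inputs:
        z = math.sqrt(r) * shared + math.sqrt(1.0 - r) * rng.gauss(0.0, 1.0)
        values.append(triangular_quantile(NormalDist().cdf(z), low, mode, high))
    return values

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))

rng = random.Random(19)
inputs = [(0.90, 1.00, 1.14), (0.85, 1.00, 1.30)]  # hypothetical cost-driver estimates
trials = [correlated_trial(inputs, 0.5, rng) for _ in range(20_000)]
first, second = zip(*trials)
print(f"sample correlation: {pearson(first, second):.2f}")
```

The correlation measured on the triangular outputs comes out close to, but typically slightly below, the value set on the latent normals.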
22
Summary
• Estimator Must Model Uncertainty
• Describe Uncertainty by Representing COCOMO
Inputs as Triangular Distributions
• Calculate Implications of Uncertainty by Using
Monte Carlo or Latin Hypercube Simulations to
Evaluate the COCOMO II Algorithm
• Consider Correlation Among CSCI Risks and Costs
• Professional Software, e.g., Crystal Ball, is
Available to do Computations
23
Acronyms
AA        Assessment and Assimilation
AT        Automatically Translated Code
CB        Crystal Ball
CM        Percent of Code Modified
COCOMO    Constructive Cost Model
CSCI      Computer Software Configuration Item
DM        Percent of Design Modified
EI        External Input
EIF       External Interface File
EO        External Output
EQ        External Inquiry
ILF       Internal Logical File
IM        Effort for Integration
KSLOC     Thousands of Source Lines of Code
MS        Microsoft
O,M,P     Optimistic, Most Likely, Pessimistic
SCED      Schedule Compression/Expansion Rating
SLOC      Source Lines of Code
SU        Software Understanding
UFP       Unadjusted Function Point
UNFM      Programmer Unfamiliarity Rating
USC       University of Southern California
WBS       Work Breakdown Structure
24
Backup Slides
25
Correlation Matters
• Suppose for Simplicity
– There are n Cost Elements $C_1, C_2, \ldots, C_n$
– Each $\mathrm{Var}(C_i) = \sigma^2$
– Each $\mathrm{Corr}(C_i, C_j) = r < 1$ for $i \ne j$
– Total Cost $C = \sum_{k=1}^{n} C_k$
• Then
$$
\mathrm{Var}(C) = \sum_{k=1}^{n} \mathrm{Var}(C_k) + 2r \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \sqrt{\mathrm{Var}(C_i)\,\mathrm{Var}(C_j)}
= n\sigma^2 + n(n-1)\,r\,\sigma^2
= n\sigma^2 \bigl(1 + (n-1)\,r\bigr)
$$

Correlation    Var(C)
0              $n\sigma^2$
r              $n\sigma^2 (1 + (n-1)\,r)$
1              $n^2\sigma^2$
26
Cost Estimate Frequency Chart
• Approximation of Cost-Probability Distribution
[Figure: Crystal Ball frequency chart for the forecast (1,500 trials, 1,494 displayed); horizontal axis: Effort, from 152.86 to 227.59; probability from .000 to .029; frequency from 0 to 44.]
27
Cost Estimate Cumulative-Probability
Function
• Probability of Cost Being Less Than x
[Figure: Crystal Ball cumulative chart for the forecast (1,500 trials, 1,494 displayed); horizontal axis: Effort, from 152.86 to 227.59; cumulative probability from .000 to 1.000; cumulative frequency from 0 to 1500.]
28
Cost Estimate Statistics
Statistical Information

Statistic                Value
Trials                   1500
Mean                     190.12
Median                   189.36
Mode                     ---
Standard Deviation       13.96
Variance                 195.01
Skewness                 0.15
Kurtosis                 2.84
Coeff. of Variability    0.07
Range Minimum            152.86
Range Maximum            237.42
Range Width              84.56
Mean Std. Error

Percentile    Effort
0%            152.86
5%            167.41
10%           171.84
15%           175.86
20%           178.55
25%           180.77
30%           182.66
35%           184.49
40%           185.99
45%           187.57
50%           189.36
55%           191.11
60%           193.12
65%           195.36
70%           197.51
75%           199.69
80%           202.34
85%           204.86
90%           208.09
95%           213.71
100%          237.42
29
Correlation Matrices Allow User to Adjust
Correlations
• One Matrix for Each CSCI Allows Estimator to Set
Correlations Among Cost Drivers for that CSCI
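A minimal sketch of such a per-CSCI matrix: it starts from a default correlation and lets the estimator override individual pairs. The function name, the override value, and the choice of drivers are hypothetical, and the sketch does not check that the resulting matrix is mathematically consistent (positive semidefinite).

```python
def correlation_matrix(drivers, default=0.5, overrides=None):
    """Build a symmetric correlation matrix (dict of dicts) for one CSCI's cost drivers."""
    overrides = overrides or {}
    matrix = {a: {b: (1.0 if a == b else default) for b in drivers} for a in drivers}
    for (a, b), value in overrides.items():
        if a == b or a not in matrix or b not in matrix:
            raise ValueError(f"invalid driver pair: {a}, {b}")
        if not -1.0 <= value <= 1.0:
            raise ValueError("correlation must lie in [-1, 1]")
        # Set both entries so the matrix stays symmetric.
        matrix[a][b] = matrix[b][a] = value
    return matrix

# Hypothetical adjustment: a higher correlation between RELY and DOCU.
drivers = ["RELY", "DOCU", "CPLX"]
matrix = correlation_matrix(drivers, default=0.5, overrides={("RELY", "DOCU"): 0.7})
for name in drivers:
    print(name, [matrix[name][other] for other in drivers])
```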
30