Cost Estimation
• “The most unsuccessful three years in the
education of cost estimators appears to be
fifth-grade arithmetic.”
» Norman R. Augustine
Goal
• The cost estimation community is working
to improve estimations so that
sophisticated organizations can produce
products within 5% of the estimated cost
(instead of 10%).
• The typical software organization struggles
to avoid estimates that are incorrect by
100%.
Goal
• What does “incorrect by 100%” mean?
Requirements for Estimations
• What are cost estimates used for?
• What are the important characteristics of a
cost estimate?
Requirements for Estimation
• Timely
– It is of limited value to provide the estimate after the
project is complete
– How early?
• Before implementation
• Before complete design
• Maybe before complete requirements
Requirements for Estimation
• Timely
– It is of limited value to provide the estimate after the
project is complete
• Accurate
– How accurate is enough?
• Enough for planning, bidding, scheduling
Scenario
• A typical software development shop. A
boss is speaking with the lead developer.
• (I need two volunteers to play these parts.)
Scenario
• What’s up with the boss?
• What does the boss think about the
employee?
Problem with Scenario 1
• The boss wasn’t asking for an estimate.
The boss was asking for a plan to hit the
target.
• Your boss may or may not know the
difference.
Scenario 2
• What’s the difference?
Definitions
• Estimate
• Target
• Commitment
• Plan
Estimate
• 1. A tentative evaluation or rough
calculation.
• 2. A preliminary calculation of the cost of a
project.
• 3. A judgment based upon one’s
impressions; opinion.
» American Heritage Dictionary, 2nd Ed. 1985
Targets
• “We need to have the prototype ready by
the end of the semester.”
• “These functions should be implemented
by August 30 when the contract ends.”
• “The cost of the project is limited to
$250,000, because that’s the maximum
budget.”
• “We have to ship 7.0 by second quarter
next year, because I have a reunion to
attend in July.”
Commitment
• Target: description of a desirable business
objective
• Commitment: a promise to deliver defined
functionality at a specific level of quality by
a certain date
• Plan: a sequence of steps to achieve a
goal (may include a schedule)
Terms
• Note that target, estimate, and
commitment are not the same concept,
and the dates given for these may differ.
Scenario 3
• Suppose I give an estimate of 90 days.
• What does this mean?
[Figure: “Usual”. Probability vs. schedule (or cost).]
Scenario 3
• Suppose I give an estimate of 90 days.
• What does this mean?
– This says there is a 100% probability of delivering on this schedule.
– In order for the number to have value, we need to know what the variance is. How likely are we to hit this estimate?
– (Usually, this is a target, not an estimate.)
Scenario 3: Bell curve
• What does it mean?
• Is this more accurate?
[Figure: “Common Assumption”. Probability vs. schedule (or cost).]
Scenario 3: Bell curve
• What assumptions does this make?
Scenario 3: Realistic
• What does this mean?
• Why is it shaped that way?
[Figure: “Realistic”. Probability vs. schedule (or cost).]
Quiz Answers
1. Surface temperature of the Sun: 10,000 F / 6,000 C
2. Latitude of Shanghai: 31 degrees North
3. Area of Asian continent: 17,139,000 square miles / 44,390,000 sq km
4. Birth year of Alexander the Great: 356 BC
5. Total value of U.S. currency in circulation in 2004 (in U.S. dollars): $719,900,000,000 ($720 billion)
6. Total volume of the Great Lakes: 1.8*10^23 U.S. gallons / 6.8*10^23 liters
7. Worldwide box office receipts for the movie Titanic as of 2006: $1.835 billion
8. Total length of the coastline of the Pacific Ocean: 84,300 miles / 135,663 km
9. Number of books published in U.S. since 1776: 22 million
10. Weight of heaviest blue whale on record: 380,000 pounds / 179,000 kg
Scores
• How many with 10 correct?
• 9?
• 8?
• 7? …
Math of the expected distribution
• If we have a 90% probability for any single answer, then:
– Probability of getting all 10 correct: .9^10 = 34.9%
– Probability of getting 9 correct: (.9^9*.1)*10 = 38.7%
– Probability of getting 8 correct: .9^8*.1^2*45 = 19.4%
• For any given combination, .9^8*.1^2. But there are 45 different ways to put two wrong in a list of 10.
– You can put the first wrong answer in any one of the first 9 places.
– If you put it in the first spot, there are 9 places to put the second.
– If you put it in the second spot, there are 8 places for the second.
– And so on.
• Write the sum x forwards and backwards, then add the two rows:
9+8+7…+1 = x
1+2+3…+9 = x
Each of the 9 columns adds to 10, so 9*10 = 2x, x = 45.
Math of the expected distribution
• Conclusion: with 90% confidence on each answer, you have a 93% chance of getting 8 or more correct.
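The arithmetic on this slide can be checked with a short script. This is a minimal sketch, assuming the 10 answers are independent and each range has a 90% chance of containing the true value; the function names are made up for illustration.

```python
from math import comb


def prob_exactly(k, n=10, p=0.9):
    """Probability that exactly k of n independent answers are correct."""
    # comb(n, k) counts the ways to choose which k answers are right,
    # e.g. comb(10, 8) == 45 ways to place two wrong answers.
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(k, n=10, p=0.9):
    """Probability that k or more of n independent answers are correct."""
    return sum(prob_exactly(j, n, p) for j in range(k, n + 1))


if __name__ == "__main__":
    for k in (10, 9, 8):
        print(f"{k} correct: {prob_exactly(k):.1%}")
    print(f"8 or more: {prob_at_least(8):.1%}")
```

Running it prints 34.9%, 38.7%, 19.4%, and 93.0%, matching the figures above.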
[Figure: “What we expect at 90% confidence”. % correct vs. questions correct (0 to 10). Series: 90% Confidence, Usual, Class. Annotations: Historical data; Us last year.]
Questions:
• Did you feel pressure to make your ranges
wider? Or narrower? (Why?)
• Where did the pressure come from?
• Is estimating the volume of the Great
Lakes anything like estimating software?
Questions:
• Did you feel pressure to make your ranges
wider? Or narrower? (Why?)
• Where did the pressure come from?
• Is estimating the volume of the Great
Lakes anything like estimating the impact
of new programming tools on productivity,
the productivity of an unidentified person,
or the cost of developing software with no
specification?
Accuracy and the cost of
inaccuracy
• What is the cost of overestimating?
– Parkinson’s Law: work expands to fill the time
available
– Goldratt’s “Student Syndrome”: people procrastinate
until the last moment to start
Accuracy and the cost of
inaccuracy
• What is the cost of underestimating?
– Reduced effectiveness of project plans
– Reduced chance of on-time completion
– Poor technical approaches: Not enough time
in requirements and design
– Destructive late project dynamics
• More status meetings
• Interim releases
• Fixing problems from workarounds
Cost Estimation 2
Recall cost estimation:
• Sophisticated organizations: within 10%
• Typical software organization: >100%
• Estimates need to be timely and accurate
• Estimate, Target, Commitment, Plan
• Costs associated with overestimates
• Costs associated with underestimates
How are we doing?
• KSLOC is 1,000 lines of source code
• MSLOC is 1,000,000 lines of source code
• With your partner, what does this graph say?
[Figure: “Project Outcomes by Project Size”. % complete vs. size (1 KSLOC, 10 KSLOC, 100 KSLOC, 1 MSLOC, 10 MSLOC). Series: Early, On Time, Late, Failed.]
Benefits of Accurate Estimates
• Improved status visibility
– Track progress by comparing actual to planned
– Ability to make a plan
• Higher quality
– Less stress on developers
– Schedule pressure can increase defect rate by 400% (Jones 1994)
• Better coordination with non-software functions
– Testing, documentation, marketing, training, support
– Better estimation: tighter coordination
• Better budgeting
– Obvious
• Increased credibility for team
– It is not unusual for:
• the team to estimate
• others (managers, marketers, sales staff) to turn it into an optimistic business target
• developers to overrun
• others to blame the team
• Early risk information
– If target and estimate don’t match, then there is an opportunity to:
• Fix the problem (reassign resources)
• Re-scope
• Cancel
Approaches to arriving at a
number
• Count
– If you want to know how many people are in the room, count them
– Usually not possible (e.g., how many people are on Earth?)
• Compute
– Find some approach
– E.g., count the number of jelly beans in 1” and use that to compute the total number in the jar
• Judge
– a.k.a. “guess”
Counting vs. Estimating
• Similarities:
– Both arrive at a number representing some
real value
– Both are subject to error
• Differences:
– Estimating implies imprecise knowledge
Proxy
• A value that is used to represent some
other value
• Example: Estimate the weight of the
people in the airplane from the number of
people in the airplane (requires that we
also know the average weight of people)
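The airplane example reduces to one multiplication: the count is the proxy, and the average converts it into the value we actually want. A minimal sketch follows; the function name, the 150 passengers, and the 80 kg average are hypothetical values chosen only for illustration.

```python
def estimate_total_weight(passenger_count, average_weight_kg):
    """Estimate total passenger weight from a head count (the proxy).

    The result is only as good as the assumed average weight.
    """
    if passenger_count < 0 or average_weight_kg < 0:
        raise ValueError("count and average weight must be non-negative")
    return passenger_count * average_weight_kg


if __name__ == "__main__":
    # Hypothetical inputs: 150 passengers at an assumed 80 kg average.
    print(estimate_total_weight(150, 80))
```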
1 minute drill
• What is it that we want to estimate in
software?
Things we want to estimate in
software
• Cost
• Resources
• Revenue
1 minute drill: What are proxies
for these?
• Cost
• Resources
• Revenue
Proxies in software estimation
• Cost
– Program size
– Program complexity
– Development time
• Resources
– Program size
– Program complexity
– Number of users
• Revenue
– Number of customers
Sources of uncertainty
• Inaccurate information about project
• Inaccurate information about ability of
project team
• Too much chaos in project
• Inaccuracies in estimation process
Group Review Estimates
• Individually:
– Read the assignment for the movie rental
program.
– Predict the cost (time) to develop the code.
• Put the time estimate on the card and turn
it in.
Group Review Estimate: Team
• In groups of 4: Compare estimates.
– Discuss the differences enough to understand
the sources of the differences.
– Work until you reach consensus on the high
and low ends of the estimation ranges
• You cannot just “average” the estimates.
• You must reach consensus on the
estimate. Discuss until you get buy-in from
the entire group.
• Turn in the results of this exercise.
Wideband Delphi
1. Estimators prepare initial estimates
2. The estimators meet with a coordinator
to discuss estimation issues
3. Estimators give their estimates to the
coordinator anonymously
4. The estimates are summarized on an
iteration form
5. Estimators meet to discuss differences
6. Estimators vote to accept the average. If
anyone votes “no”, return to step 2
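The loop in steps 3 to 6 can be sketched in code. This is only an illustration, assuming each round's anonymous estimates have already been collected; the function names, the 20% acceptance rule, and the sample numbers are hypothetical and not part of Wideband Delphi itself.

```python
from statistics import mean


def within_20_percent(average, estimates):
    """Hypothetical voting rule: accept if the average is within 20% of your own estimate."""
    return [abs(average - e) <= 0.2 * e for e in estimates]


def wideband_delphi(rounds, vote=within_20_percent):
    """Iterate over rounds of anonymous estimates until every vote is "yes".

    rounds: list of rounds, each a list of estimates (e.g. in days).
    Returns (accepted_average, rounds_used), or (None, rounds_used)
    if no round reached consensus.
    """
    for round_number, estimates in enumerate(rounds, start=1):
        # Step 4: summarize the estimates for this iteration.
        average = mean(estimates)
        # Step 6: everyone must accept the average, otherwise iterate again.
        if all(vote(average, estimates)):
            return average, round_number
    return None, len(rounds)


if __name__ == "__main__":
    # Hypothetical data: estimates converge after discussion.
    print(wideband_delphi([[10, 30, 50], [26, 30, 34]]))
```

In the sample data the first round is rejected because the estimates are far apart; the second round is accepted.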
Wideband Delphi
• Votes and estimates are anonymous
• Reduces political pressure
• Coordinator must prevent dominant
personalities from controlling discussions
• (Frequently, the most reserved person has
the best insights)
Results of Wideband Delphi
• Estimation error cut by 40% compared to
initial group average
• Accuracy improves in 80% of the cases
• Useful for early estimates, particularly with
unfamiliar systems
• Not so useful for detailed estimates
LOC, SLOC, KSLOC, MSLOC
• Lines of code
• Standard measure of size
• Often a measure of cost (i.e., time)
• What assumptions does this make?
Function Points
• Synthetic measure of program size used
to estimate size early in the project
• Easier (than lines of code) to calculate
from requirements
• Standards at the International Function
Point Users Group (IFPUG) www.ifpug.org
FP Rules: #FPs depends on:
• External Inputs
– Data entering the system
– Screens, forms, dialogs, controls
– User or other program adds, deletes, modifies data
– Any input that requires processing logic
• External Outputs
– Derived data leaving the system
– Screens, reports, dialog boxes, control signals generated for end user or other program
• External Queries
– Base data leaving the system
– I/O combinations in which an input results in a simple output
– Queries retrieve data with no formatting. Output is formatted
• Internal Logical Files
– Data maintained within the application
– Major logical groups of end-user data completely controlled by the program
– Might be a flat file, a database table, or a collection of other data
• External Interface Files
– Data maintained outside the application
– Files controlled by other programs
Function Points Complexity
• The complexity of each function point depends on:
– Record Element Types (RETs)
• a user-recognizable subgroup of data elements within an ILF or EIF
– Data Element Types (DETs)
• a unique user-recognizable, non-recursive field
– File Types Referenced (FTRs)
• a file type referenced by a transaction. An FTR must also be an internal logical file or external interface file
Complexity Table: External Inputs (EI)
Complexity Table: External Outputs (EO) & Inquiries (EQ)
Complexity Table: Internal Logical File (ILF) & External Interface File (EIF)
FP Rules: Complexity Multipliers
                            Low          Medium       High
                            Complexity   Complexity   Complexity
External Inputs             3            4            6
External Outputs            4            5            7
External Queries            3            4            6
Internal Logical Files      4            10           15
External Interface Files    5            7            10
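Applying the multipliers is a weighted sum: count the items of each type at each complexity level, multiply by the weight, and add. The sketch below is a minimal illustration; the weights are copied from this slide's table, while the function name and the sample counts are hypothetical. Note that IFPUG's own counting rules list 7 rather than 4 for a low-complexity internal logical file, so the weights are passed in as a parameter.

```python
# Weights as given in the slide's table (low, medium, high complexity).
WEIGHTS = {
    "external_input": {"low": 3, "medium": 4, "high": 6},
    "external_output": {"low": 4, "medium": 5, "high": 7},
    "external_query": {"low": 3, "medium": 4, "high": 6},
    "internal_logical_file": {"low": 4, "medium": 10, "high": 15},
    "external_interface_file": {"low": 5, "medium": 7, "high": 10},
}


def unadjusted_function_points(counts, weights=WEIGHTS):
    """Sum count * weight over every (item type, complexity) pair.

    counts maps (item_type, complexity) -> number of such items.
    """
    return sum(n * weights[kind][level] for (kind, level), n in counts.items())


if __name__ == "__main__":
    # Hypothetical system, for illustration only.
    sample = {
        ("external_input", "low"): 6,              # 6 * 3 = 18
        ("external_output", "medium"): 4,          # 4 * 5 = 20
        ("external_query", "low"): 3,              # 3 * 3 = 9
        ("internal_logical_file", "medium"): 2,    # 2 * 10 = 20
        ("external_interface_file", "high"): 1,    # 1 * 10 = 10
    }
    print(unadjusted_function_points(sample))  # 77
```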
Value Adjustment Factor
• It is based on 14 general system
characteristics that rate the functionality of
the application
– Performance, End-user efficiency,
Reusability…
• Each characteristic is assigned a degree
of influence on a scale from 0 to 5
• A formula is then used to account for
these characteristics
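The slide does not state the formula. The sketch below assumes the commonly cited IFPUG form, VAF = 0.65 + 0.01 * (sum of the 14 ratings), which scales the unadjusted count by between 0.65 and 1.35. The function names and the sample ratings are hypothetical.

```python
def value_adjustment_factor(ratings):
    """VAF = 0.65 + 0.01 * sum(ratings), assuming the commonly cited IFPUG form.

    ratings: 14 degree-of-influence scores, each from 0 to 5.
    """
    if len(ratings) != 14:
        raise ValueError("expected 14 general system characteristics")
    if any(not 0 <= r <= 5 for r in ratings):
        raise ValueError("each rating must be between 0 and 5")
    return 0.65 + 0.01 * sum(ratings)


def adjusted_function_points(unadjusted_fp, ratings):
    """Scale an unadjusted function point count by the VAF."""
    return unadjusted_fp * value_adjustment_factor(ratings)


if __name__ == "__main__":
    # Hypothetical: every characteristic rated 3 (average influence).
    print(value_adjustment_factor([3] * 14))
```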
LOC vs FP (Boehm 2000, Stutzke 2005)
Language    LOC per FP
Ada         50
C           128
C#          55
C++         55
Java        55
Assembly    213
Perl        20
VB          32
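A table like this is typically used to convert a function point count into a rough size in lines of code. A minimal sketch, using the ratios listed on this slide; the function name and the 100 FP example are hypothetical, and the result is a nominal figure rather than a prediction.

```python
# LOC per function point, as listed on the slide.
LOC_PER_FP = {
    "Ada": 50,
    "C": 128,
    "C#": 55,
    "C++": 55,
    "Java": 55,
    "Assembly": 213,
    "Perl": 20,
    "VB": 32,
}


def estimate_loc(function_points, language):
    """Convert a function point count to a nominal lines-of-code figure."""
    if language not in LOC_PER_FP:
        raise KeyError(f"no LOC-per-FP ratio listed for {language!r}")
    return function_points * LOC_PER_FP[language]


if __name__ == "__main__":
    # Hypothetical 100 FP system in three of the listed languages.
    for lang in ("Java", "C", "Perl"):
        print(lang, estimate_loc(100, lang))
```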
FP results
• Certified counters vary by 10%
• Untrained counters vary by much more
• The multipliers may or may not be useful
(some research indicates unadjusted FPs
are more closely correlated with effort)
• LOC per FP has, on average, a range of 3x
Function Point Counting
Exercise
• In your teams, compute the FPs for the
voting system and answer the questions at
the end of the exercise