ECE 355: Software Engineering
Project Cost Estimation
Instructor: Kostas Kontogiannis
Course Outline
• Introduction to software engineering
• Requirements Engineering
• Design Basics
• Traditional Design
• OO Design
• Design Patterns
• Software Architecture
• Design Documentation
• Verification & Validation
• Software Process & Project Management
• These slides are based on:
– Lecture slides by Ian Sommerville, see
http://www.comp.lancs.ac.uk/computing/resources/ser/
– ECE355 Lecture slides by Sagar Naik
Process/Project Management
• Project management involves a whole host of issues and
skills
– Effort estimation
– Staffing
– Defining and managing the process
– Scheduling activities
– Monitoring quality
– …
• Process management at the level of an organization
– A software development organization should define, implement
and constantly improve their
• Software processes
• Organizational structure
Overview - Software Process & Project Management
• Cost estimation & Staffing
• Project scheduling
• Software Life-Cycle Models
• Examples of Software processes
• Process improvement and Software metrics
Software cost estimation
• Predicting the resources
required for a software
development process
©Ian Sommerville 1995
Topics covered
• Productivity
• Estimation techniques
• Algorithmic cost modelling
• Project duration and staffing
Software cost components
• Effort costs (the dominant factor in most
projects)
– salaries of engineers involved in the project
– costs of building, heating, lighting
– costs of networking and communications
– costs of shared facilities (e.g., library, staff restaurant, etc.)
– costs of pensions, health insurance, etc.
• Other costs
– Hardware and software costs
– Travel and training costs
– …
©Ian Sommerville 1995 [modified]
Costing and pricing
• There is not a simple relationship between the
development cost and the price charged to the
customer
• Software pricing factors
– Market opportunity – low price to enter the market,
e.g., initially “free software”
– Cost estimation uncertainty
– Contractual terms
– Requirements volatility
– Financial health
– …
Programmer productivity
• A measure of the rate at which individual
engineers involved in software development
produce software and associated
documentation
– Not quality-oriented although quality assurance
is a factor in productivity assessment
• Measure useful functionality produced per time
unit & programmer
Productivity metrics
• Size related measures based on some output from
the software process. This may be lines of
delivered source code (SLOC), object code
instructions, etc.
– E.g., SLOC / person-month
• Function-related measures based on an estimate
of the functionality of the delivered software.
Function-points are the best known of this type of
measure
– E.g., FP / person-month
Lines of code
• What's a line of code?
– Many different ways to count lines (e.g., with
or without comments, counting statements
rather than lines, or counting lines in
automatically formatted code)
– Need to know the measurement method
before comparing SLOC numbers
• Assumes linear relationship between
system size and volume of documentation
Cross-language comparisons
• Problems of LOC-based comparisons
– The lower-level the language, the more productive the
programmer appears, because the same functionality
takes more lines of code
– The more verbose the programmer, the higher
the measured productivity
• Function points provide a more accurate
measure of productivity than LOC
System development times
                      Analysis  Design   Coding   Testing   Documentation
Assembly code         3 weeks   5 weeks  8 weeks  10 weeks  2 weeks
High-level language   3 weeks   5 weeks  4 weeks  6 weeks   2 weeks

                      Size        Effort    Productivity
Assembly code         5000 lines  28 weeks  714 lines/month
High-level language   1500 lines  20 weeks  300 lines/month
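The productivity column above can be reproduced from the size and effort columns. The following is a minimal sketch; the 4-weeks-per-month conversion is an assumption inferred from the slide's figures (5000 lines over 28 weeks giving 714 lines/month), and the function name is illustrative.

```python
# Assumption: one month counts as 4 weeks, which is what the slide's
# figures imply (28 weeks -> 7 months, 20 weeks -> 5 months).
WEEKS_PER_MONTH = 4


def productivity(lines: int, effort_weeks: float) -> float:
    """Return delivered lines of code per month of effort."""
    months = effort_weeks / WEEKS_PER_MONTH
    return lines / months


assembly = productivity(5000, 28)
high_level = productivity(1500, 20)

print(f"Assembly code:       {assembly:.0f} lines/month")
print(f"High-level language: {high_level:.0f} lines/month")
```

The assembly version scores more than twice as high even though it takes eight weeks longer to deliver the same functionality, which is the distortion the cross-language comparison warns about.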
The “Vicious Square”
[Figure: diagram relating Quality, Scope, Productivity, Development time, and Cost, with each link marked + or -]
Quality and productivity
• All metrics based on volume/unit time are
flawed because they do not take quality into
account
• Productivity may generally be increased at the
cost of quality
• It is not clear how productivity/quality metrics
are related
Productivity estimates
• Real-time embedded systems: 40-160 LOC/P-month
• Systems programs: 150-400 LOC/P-month
• Commercial applications: 200-800 LOC/P-month
The four variables
• The main four variables of a project
– Development cost
– Time
– Quality
– Scope
• Only three of these variables can be (more or less) freely adjusted
• Development cost, time and quality are bad control variables
– The number of developers can only be incrementally increased (negative
effects beyond the optimal count)
– Deadlines are often predetermined externally (e.g., market window,
important presentation)
– Low quality upsets customers and developers
• Scope is the only real control variable
Accuracy of Estimation
[Figure: range of cost estimates relative to the actual cost x (vertical axis: 0.25x, 0.5x, x, 2x, 4x) across the project phases Feasibility, Requirements, Design, Code, Delivery]
• x is the actual cost of the system
• Estimates on projects studied by Barry Boehm occupied the area between the curves
• As a project progresses, more information about the project becomes available and the accuracy of estimation can be increased over time
Estimation techniques
• Expert judgement
• Estimation by analogy
• Parkinson's Law
• Pricing to win
• Top-down estimation
• Bottom-up estimation
• Function point estimation
• Algorithmic cost modelling
Expert judgement
• One or more experts in both software
development and the application domain use
their experience to predict software costs.
Process iterates until some consensus is
reached.
• Advantages: Relatively cheap estimation
method. Can be accurate if experts have direct
experience of similar systems
• Disadvantages: Very inaccurate if there are no
experts!
Estimation by analogy
• The cost of a project is computed by comparing
the project to a similar project in the same
application domain
• Advantages: Accurate if project data available
• Disadvantages: Impossible if no comparable
project has been tackled. Needs systematically
maintained cost database
Parkinson's Law
• The project costs whatever resources are
available
• Advantages: No overspend
• Disadvantages: System is usually
unfinished
Pricing to win
• The project costs whatever the customer has to
spend on it
• Advantages: You get the contract
• Disadvantages: The probability that the
customer gets the system he or she wants is
small. Costs do not accurately reflect the work
required
Top-down estimation
• Estimation techniques may be applied top-down:
start at the system level and work out how
the system functionality is provided
• Takes into account costs such as integration,
configuration management and documentation
• Can underestimate the cost of solving difficult
low-level technical problems
Bottom-up estimation
• Start at the lowest system level. The cost of each
component is estimated individually. These costs
are summed to give the final cost estimate
• Accurate method if the system has been designed
in detail
• May underestimate costs of system level
activities such as integration and documentation
Function Points
• The idea of function point was first proposed by
Albrecht in 1979.
• The function point of a system is a measure of the
“functionality” of the system.
• Steps
– Counting the information domain – counting FPs
– Assessing complexity of the software – adjusting FPs
– Applying an empirical relationship to come up with
LOC or P-months based on the adjusted FPs
• This method cannot be performed automatically
Counting Function Points
• User inputs. Each user input that provides distinct
application oriented data to the software is counted.
• User outputs. Each user output that provides application
oriented information to the user is counted. Individual data
items within a report are not counted separately.
• User inquiries. This is an on-line input that results in the
generation of some response.
• Files. Each master file is counted.
• External interfaces. Each interface that is used to transmit
information to another system is counted.
Adjusting Function Points
Answer the following questions using a scale of [0-5]: 0 not
important; 5 absolutely essential. We call them influence
factors (Fi).
1. Does the system require reliable backup and recovery?
2. Are data communications required?
3. Are there distributed processing functions?
4. Is performance critical?
5. Will the system run in an existing, heavily utilized
operational environment?
6. Does the system require on-line data entry?
7. Does the on-line data entry require the input transaction to
be built over multiple screens or operations (user
efficiency)?
8. Are the master files updated on-line?
9. Are the inputs, outputs, files, or inquiries complex?
10. Is the internal processing complex?
11. Is the code designed to be reusable?
12. Is installation included in the design?
13. Is the system designed for multiple installations?
14. Is the application designed to facilitate change and ease of
use by the user?
Map FPs to LOC
• Use an empirical relationship
– Function point = count total x [0.65 + 0.01 x (sum of the 14 Fi)]
– Companies may want to refine their own version
• According to a 1989 study, implementing a function point
in a given programming language requires the following
number of lines of code
– Assembly: 320
– C: 128
– COBOL: 106
– C++: 64
– Visual Basic: 32
– SQL: 12
• See www.ifpug.org for more information on FP
Example: Your PBX project
• Total of FPs = 25
• F4 = 4, F10 = 4, other Fi’s are set to 0. Sum of all
Fi’s = 8.
• FP = 25 x (0.65 + 0.01 x 8) = 18.25
• Lines of code in C = 18.25 x 128 LOC = 2336
LOC
• In the past, students have implemented their
projects using about 2500 LOC.
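The PBX arithmetic above can be reproduced with a short script. This is a minimal sketch rather than a standard tool: the function names and the LOC_PER_FP dictionary are illustrative, and the lines-per-function-point value for C is the 1989 figure quoted on the earlier slide.

```python
# Lines of code per function point; only the C value quoted on the
# slides is needed for this example.
LOC_PER_FP = {"C": 128}


def adjusted_function_points(count_total, influence_factors):
    """Apply FP = count total x (0.65 + 0.01 x sum of the 14 Fi)."""
    if len(influence_factors) != 14:
        raise ValueError("expected exactly 14 influence factors")
    if any(not 0 <= f <= 5 for f in influence_factors):
        raise ValueError("each influence factor must lie in [0, 5]")
    return count_total * (0.65 + 0.01 * sum(influence_factors))


def estimated_loc(fp, language):
    """Convert adjusted function points to lines of code."""
    return fp * LOC_PER_FP[language]


# PBX example: F4 = 4 and F10 = 4, all other factors are 0.
factors = [0] * 14
factors[3] = 4  # F4: is performance critical?
factors[9] = 4  # F10: is the internal processing complex?

fp = adjusted_function_points(25, factors)
loc = estimated_loc(fp, "C")
print(f"FP = {fp:.2f}, LOC in C = {loc:.0f}")  # FP = 18.25, LOC in C = 2336
```

Because every influence factor lies between 0 and 5, the adjustment multiplier ranges from 0.65 to 1.35, so the questionnaire can move the raw count by at most 35% in either direction.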
Algorithmic cost modelling
• Cost is estimated as a mathematical function of
product, project and process attributes whose
values are estimated by project managers
• The function is derived from a study of
historical costing data
• Most commonly used product attribute for cost
estimation is LOC (code size)
• Most models are basically similar but with
different attribute values
The COCOMO model
• Developed at TRW, a US defence contractor
• Based on a cost database of more than 60
different projects
• Exists in three stages
– Basic - Gives a 'ball-park' estimate based on product
attributes
– Intermediate - Modifies basic estimate using project
and process attributes
– Advanced - Estimates project phases and parts
separately
Project classes
• Organic mode: small teams, familiar
environment, well-understood applications, no
difficult non-functional requirements (EASY)
• Semi-detached mode: project team may have a
mixture of experience, system may have more
significant non-functional constraints,
organization may have less familiarity with the
application (HARDER)
• Embedded mode: hardware/software systems, tight
constraints, unusual for team to have deep
application experience (HARD)
Basic COCOMO Formula
• Organic mode: $\mathrm{PM} = 2.4 \times \mathrm{KDSI}^{1.05}$
• Semi-detached mode: $\mathrm{PM} = 3 \times \mathrm{KDSI}^{1.12}$
• Embedded mode: $\mathrm{PM} = 3.6 \times \mathrm{KDSI}^{1.2}$
• KDSI = Kilo Delivered Source Instructions
Effort estimates
[Figure: effort in person-months (0 to 1000) plotted against size in KDSI (0 to 120), with curves labelled Simple, Intermediate, and Embedded]
COCOMO examples
• Organic mode project, 32KLOC
– $\mathrm{PM} = 2.4 \times 32^{1.05} \approx 91$ person-months
– $\mathrm{TDEV} = 2.5 \times 91^{0.38} \approx 14$ months
– $N = 91/14 = 6.5$ people
• Embedded mode project, 128KLOC
– $\mathrm{PM} = 3.6 \times 128^{1.2} \approx 1216$ person-months
– $\mathrm{TDEV} = 2.5 \times 1216^{0.32} \approx 24$ months
– $N = 1216/24 \approx 51$ people
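Both examples can be checked with a few lines of code. This is a sketch under the coefficients given on these slides (the schedule exponents are those listed under "Development time estimates"); the MODES table and the function name are illustrative, not part of any COCOMO tool.

```python
# Basic COCOMO coefficients (a, b, c, d) per project class:
#   effort   PM   = a * KDSI ** b   (person-months)
#   schedule TDEV = c * PM ** d     (months)
MODES = {
    "organic": (2.4, 1.05, 2.5, 0.38),
    "semi-detached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (3.6, 1.20, 2.5, 0.32),
}


def basic_cocomo(kdsi, mode):
    """Return (effort in person-months, schedule in months, average staff)."""
    a, b, c, d = MODES[mode]
    effort = a * kdsi ** b
    schedule = c * effort ** d
    staff = effort / schedule
    return effort, schedule, staff


for kdsi, mode in [(32, "organic"), (128, "embedded")]:
    pm, tdev, n = basic_cocomo(kdsi, mode)
    print(f"{mode}: PM = {pm:.0f}, TDEV = {tdev:.0f} months, N = {n:.1f}")
```

The script keeps full precision between steps, so its staffing figures differ slightly from the hand calculation, which divides the already rounded effort by the already rounded schedule.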
COCOMO assumptions
• Implicit productivity estimate
– Organic mode = 16 LOC/day
– Embedded mode = 4 LOC/day
• Time required is a function of total effort
NOT team size
• Not clear how to adapt model to personnel
availability
Intermediate COCOMO
• Takes basic COCOMO as starting point
• Identifies personnel, product, computer and
project attributes which affect cost
• Multiplies basic cost by attribute
multipliers which may increase or decrease
costs
Personnel attributes
• Personnel attributes
– Analyst capability
– Virtual machine experience
– Programmer capability
– Programming language experience
– Application experience
• Product attributes
– Reliability requirement
– Database size
– Product complexity
Computer attributes
• Computer attributes
– Execution time constraints
– Storage constraints
– Virtual machine volatility
– Computer turnaround time
• Project attributes
– Modern programming practices
– Software tools
– Required development schedule
Attribute choice
• These are attributes which were found to be
significant in one organization with a limited
size of project history database
• Other attributes may be more significant for
other projects
• Each organization must identify its own
attributes and associated multiplier values
Model tuning
• All numbers in the cost model are organization-
specific. The parameters of the model must
be modified to adapt it to local needs
• A statistically significant database of
detailed cost information is necessary
Predicted costs
[Figure: effort plotted against size (0 to 100), comparing predicted effort with a curve fitted to measured effort]
Example
• Embedded software system on microcomputer
hardware.
• Basic COCOMO predicts a 45 person-month
effort requirement
• Attributes = RELY (1.15), STOR (1.21), TIME
(1.10), TOOL (1.10)
• Intermediate COCOMO predicts
– 45*1.15*1.21*1.10*1.10 = 76 person-months.
• Total cost = 76*$7000 = $532,000
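The adjustment in this example is a plain product of the basic estimate and the attribute multipliers. A minimal sketch follows; the function name is illustrative, and treating $7000 as the cost per person-month is an assumption read off the slide's arithmetic.

```python
import math


def intermediate_effort(basic_pm, multipliers):
    """Scale the basic COCOMO effort by the product of all multipliers."""
    return basic_pm * math.prod(multipliers.values())


# Attribute multipliers from the example.
multipliers = {"RELY": 1.15, "STOR": 1.21, "TIME": 1.10, "TOOL": 1.10}

effort = intermediate_effort(45, multipliers)

# Assumption: $7000 is the cost of one person-month.
COST_PER_PERSON_MONTH = 7000
total_cost = round(effort) * COST_PER_PERSON_MONTH

print(f"Adjusted effort: {effort:.1f} person-months")
print(f"Total cost: ${total_cost:,}")
```

Four multipliers only slightly above 1.0 raise the estimate by roughly two thirds, which is why the choice and calibration of attributes matters so much.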
Development time estimates
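• Organic: $\mathrm{TDEV} = 2.5 \times \mathrm{PM}^{0.38}$
• Semi-detached: $\mathrm{TDEV} = 2.5 \times \mathrm{PM}^{0.35}$
• Embedded mode: $\mathrm{TDEV} = 2.5 \times \mathrm{PM}^{0.32}$
• Personnel requirement: $N = \mathrm{PM} / \mathrm{TDEV}$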
– This last formula needs to be adjusted (see
next slide)
Staffing requirements
• Staff required can’t be computed by dividing the
development effort by the required schedule
• The number of people working on a project
varies depending on the phase of the project
• The more people who work on the project, the
more total effort is usually required
• Very rapid build-up of people often correlates
with schedule slippage
• Adding more people to a delayed project will
delay it even more
Rayleigh manpower curves
[Figure: resources Rc plotted against t for three curves labelled k1, k2, k3]
$R_c = \frac{t}{k^2} e^{-t^2 / (2k^2)}$
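The shape of the curve is easy to explore numerically. The sketch below evaluates the formula at whole-month steps; the value k = 6 and the 24-month horizon are hypothetical, chosen only for illustration. Setting the derivative of the formula to zero shows that the curve peaks at t = k, which the sampled values confirm.

```python
import math


def rayleigh_staffing(t, k):
    """Evaluate Rc = (t / k^2) * exp(-t^2 / (2 k^2))."""
    return (t / k**2) * math.exp(-t**2 / (2 * k**2))


k = 6.0  # hypothetical shape parameter: the time of peak staffing
times = list(range(0, 25))  # hypothetical 24-month horizon
levels = [rayleigh_staffing(t, k) for t in times]

peak_time = times[levels.index(max(levels))]
print(f"Peak staffing at t = {peak_time}")

# Crude text plot of the build-up and tail-off.
for t, level in zip(times, levels):
    print(f"{t:2d} {'#' * round(level * 400)}")
```

A smaller k gives a steeper build-up and an earlier, higher peak; a larger k spreads the same curve over a longer schedule.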
Estimation methods - Summary
• Function points
– SRS -> LOC
– SRS -> PM
• COCOMO
– LOC -> PM
– May use FP as a front-end to COCOMO
• COCOMO II
– Refined version with different estimation models based
on
• Requirements (FP->PM),
• Early design (FP->PM), and
• Architecture (FP or LOC->PM)
Estimation methods - Summary
• Each method has strengths and weaknesses
• Estimation should be based on several methods
• If these do not return approximately the same
result, there is insufficient information available
• Some action should be taken to find out more in
order to make more accurate estimates
• Pricing to win is sometimes the only applicable
method