Software development effort estimation
View on Wikipedia

In software development, effort estimation is the process of predicting the most realistic amount of effort (expressed in terms of person-hours or money) required to develop or maintain software based on incomplete, uncertain and noisy input. Effort estimates may be used as input to project plans, iteration plans, budgets, investment analyses, pricing processes and bidding rounds.[1][2]
State-of-practice
Published surveys on estimation practice suggest that expert estimation is the dominant strategy when estimating software development effort.[3]
Typically, effort estimates are over-optimistic, and there is a strong over-confidence in their accuracy. The mean effort overrun seems to be about 30% and does not appear to be decreasing over time. For a review of effort estimation error surveys, see [4]. However, the measurement of estimation error is itself problematic; see Assessing the accuracy of estimates. The strong overconfidence in the accuracy of effort estimates is illustrated by the finding that, on average, when a software professional is 90% confident or "almost sure" that a minimum-maximum interval will include the actual effort, the observed frequency of including the actual effort is only 60–70%.[5]
Currently, the term "effort estimate" is used to denote different concepts, such as the most likely effort (modal value), the effort that corresponds to a 50% probability of not being exceeded (median), the planned effort, the budgeted effort, or the effort used to propose a bid or price to the client. This is believed to be unfortunate, because communication problems may occur and because the concepts serve different goals.[6][7]
History
Software researchers and practitioners have been addressing the problems of effort estimation for software development projects since at least the 1960s; see, e.g., work by Farr[8][9] and Nelson.[10]
Most of the research has focused on the construction of formal software effort estimation models. The early models were typically based on regression analysis or were mathematically derived from theories in other domains. Since then, a large number of model-building approaches have been evaluated, such as approaches founded on case-based reasoning, classification and regression trees, simulation, neural networks, Bayesian statistics, lexical analysis of requirement specifications, genetic programming, linear programming, economic production models, soft computing, fuzzy logic modeling, statistical bootstrapping, and combinations of two or more of these models. Perhaps the most common estimation methods today are the parametric estimation models COCOMO, SEER-SEM and SLIM. They have their basis in estimation research conducted in the 1970s and 1980s and have since been updated with new calibration data, the last major release being COCOMO II in the year 2000. The estimation approaches based on functionality-based size measures, e.g., function points, are also based on research conducted in the 1970s and 1980s, but have been re-calibrated with modified size measures and different counting approaches, such as use case points[11] or object points and COSMIC Function Points in the 1990s.
Estimation approaches
There are many ways of categorizing estimation approaches; see, for example, [12][13]. The top-level categories are the following:
- Expert estimation: The quantification step, i.e., the step where the estimate is produced, is based on judgmental processes.[14]
- Formal estimation model: The quantification step is based on mechanical processes, e.g., the use of a formula derived from historical data.
- Combination-based estimation: The quantification step is based on a judgmental and mechanical combination of estimates from different sources.
Below are examples of estimation approaches within each category.
| Estimation approach | Category | Examples of support of implementation of estimation approach |
|---|---|---|
| Analogy-based estimation | Formal estimation model | ANGEL, Weighted Micro Function Points |
| WBS-based (bottom up) estimation | Expert estimation | Project management software, company specific activity templates |
| Parametric models | Formal estimation model | COCOMO, SLIM, SEER-SEM, TruePlanning for Software |
| Size-based estimation models[15] | Formal estimation model | Function Point Analysis,[16] Use Case Analysis, Use Case Points, SSU (Software Size Unit), Story points-based estimation in Agile software development, Object Points |
| Group estimation | Expert estimation | Planning poker, Wideband delphi |
| Mechanical combination | Combination-based estimation | Average of an analogy-based and a Work breakdown structure-based effort estimate[17] |
| Judgmental combination | Combination-based estimation | Expert judgment based on estimates from a parametric model and group estimation |
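To make the categories above concrete, the following sketch illustrates in Python how an analogy-based estimate (formal model), a bottom-up WBS-based estimate (expert judgment), and a mechanical combination of the two might be computed. The historical project data, the work-breakdown figures, and the size-based similarity heuristic are invented for illustration and are not the behaviour of any specific tool such as ANGEL.

```python
# Illustrative sketch of three estimation approaches from the table above.
# All project data and the similarity heuristic are assumptions made for this example.

historical_projects = [
    # (size in function points, actual effort in person-hours)
    (120, 2100),
    (200, 3900),
    (310, 6400),
    (450, 10200),
]

def analogy_based_estimate(new_size_fp, history):
    """Formal estimation model: reuse the effort of the most similar past project,
    scaled linearly by size (a simple stand-in for analogy-based tools)."""
    closest_size, closest_effort = min(history, key=lambda p: abs(p[0] - new_size_fp))
    return closest_effort * new_size_fp / closest_size

def wbs_based_estimate(work_breakdown):
    """Expert estimation: sum judgment-based estimates for each WBS activity."""
    return sum(work_breakdown.values())

def mechanical_combination(estimates):
    """Combination-based estimation: simple average of independent estimates."""
    return sum(estimates) / len(estimates)

if __name__ == "__main__":
    new_project_size_fp = 260
    wbs = {  # person-hours per activity, as supplied by experts (hypothetical)
        "requirements": 600,
        "design": 1100,
        "implementation": 2200,
        "testing": 1300,
        "deployment": 300,
    }
    analogy = analogy_based_estimate(new_project_size_fp, historical_projects)
    bottom_up = wbs_based_estimate(wbs)
    combined = mechanical_combination([analogy, bottom_up])
    print(f"Analogy-based estimate: {analogy:.0f} person-hours")
    print(f"WBS-based estimate:     {bottom_up:.0f} person-hours")
    print(f"Combined estimate:      {combined:.0f} person-hours")
```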
Selection of estimation approaches
The evidence on differences in estimation accuracy between estimation approaches and models suggests that there is no "best approach" and that the relative accuracy of one approach or model compared to another depends strongly on the context.[18] This implies that different organizations benefit from different estimation approaches. Findings[19] that may support the selection of an estimation approach based on its expected accuracy include:
- Expert estimation is on average at least as accurate as model-based effort estimation. In particular, situations with unstable relationships and information of high importance not included in the model may suggest use of expert estimation. This assumes, of course, that experts with relevant experience are available.
- Formal estimation models not tailored to a particular organization's own context may be very inaccurate. Use of the organization's own historical data is consequently crucial if one cannot be sure that the estimation model's core relationships (e.g., formula parameters) are based on similar project contexts.
- Formal estimation models may be particularly useful in situations where the model is tailored to the organization's context (either through use of the organization's own historical data or because the model is derived from similar projects and contexts), and it is likely that the experts' estimates will be subject to a strong degree of wishful thinking.
The most robust finding, in many forecasting domains, is that the combination of estimates from independent sources, preferably applying different approaches, will on average improve estimation accuracy.[19][20][21]
It is important to be aware of the limitations of each traditional approach to measuring software development productivity.[22]
In addition, other factors such as ease of understanding and communicating the results of an approach, ease of use of an approach, and cost of introduction of an approach should be considered in a selection process.
Assessing the accuracy of estimates
The most common measure of the average estimation accuracy is the MMRE (Mean Magnitude of Relative Error), where the MRE of each estimate is defined as:
- MRE = |(actual effort) - (estimated effort)|/(actual effort)
This measure has been criticized,[23][24][25] and there are several alternative measures, such as more symmetric measures,[26] the Weighted Mean of Quartiles of relative errors (WMQ)[27] and the Mean Variation from Estimate (MVFE).[28]
MRE is not reliable if the individual items are skewed, so PRED(25) is often preferred as a measure of estimation accuracy: it measures the percentage of predicted values that are within 25 percent of the actual value.
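As an illustration of these accuracy measures, the following sketch computes MRE, MMRE and PRED(25) for a small set of actual/estimated effort pairs; the data is invented for the example.

```python
# Minimal sketch of the accuracy measures defined above, using invented data.

def mre(actual, estimated):
    """Magnitude of Relative Error for a single project."""
    return abs(actual - estimated) / actual

def mmre(pairs):
    """Mean Magnitude of Relative Error over a set of (actual, estimated) pairs."""
    return sum(mre(a, e) for a, e in pairs) / len(pairs)

def pred(pairs, threshold=0.25):
    """PRED(25) by default: fraction of estimates within 25% of the actual effort."""
    within = sum(1 for a, e in pairs if mre(a, e) <= threshold)
    return within / len(pairs)

projects = [  # (actual effort, estimated effort) in person-hours, hypothetical
    (1000, 900),
    (2500, 3200),
    (400, 380),
    (1800, 1200),
]

print(f"MMRE:     {mmre(projects):.2f}")   # 0.19 for this data
print(f"PRED(25): {pred(projects):.2f}")   # 0.50 for this data
```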
A high estimation error cannot automatically be interpreted as an indicator of low estimation ability. Alternative, competing or complementing, reasons include poor cost control in the project, high complexity of the development work, and more delivered functionality than originally estimated. A framework for improved use and interpretation of estimation error measurement is presented in [29].
Psychological issues
There are many psychological factors potentially explaining the strong tendency towards over-optimistic effort estimates. These factors are essential to consider even when using formal estimation models, because much of the input to these models is judgment-based. Factors that have been demonstrated to be important are wishful thinking, anchoring, planning fallacy and cognitive dissonance.[30]
- It's easy to estimate what is known.
- It's hard to estimate what is known to be unknown. (known unknowns)
- It's very hard to estimate what is not known to be unknown. (unknown unknowns)
Humor
The chronic underestimation of development effort has led to the coinage and popularity of numerous humorous adages, such as ironically referring to a task as a "small matter of programming" (when much effort is likely required), and citing laws about underestimation:
The first 90 percent of the code accounts for the first 90 percent of the development time. The remaining 10 percent of the code accounts for the other 90 percent of the development time.[31]
— Tom Cargill, Bell Labs
Hofstadter's Law: It always takes longer than you expect, even when you take into account Hofstadter's Law.
What one programmer can do in one month, two programmers can do in two months.
Comparison of development estimation software
| Software | Schedule estimate | Cost estimate | Cost Models | Input | Report Output Format | Supported Programming Languages | Platforms | Cost | License |
|---|---|---|---|---|---|---|---|---|---|
| AFCAA REVIC[33] | Yes | Yes | REVIC | KLOC, Scale Factors, Cost Drivers | proprietary, Text | Any | DOS | Free | Proprietary / Free for public distribution |
| Seer for Software | Yes | Yes | SEER-SEM | SLOC, Function points, use cases, bottoms-up, object, features | proprietary, Excel, Microsoft Project, IBM Rational, Oracle Crystal Ball | Any | Windows, Any (Web-based) | Commercial | Proprietary |
| SLIM[34] | Yes | Yes | SLIM | Size (SLOC, Function points, Use Cases, etc.), constraints (size, duration, effort, staff), scale factors, historical projects, historical trends | proprietary, Excel, Microsoft Project, Microsoft PowerPoint, IBM Rational, text, HTML | Any | Windows, Any (Web-based)[35] | Commercial | Proprietary |
| TruePlanning[36] | Yes | Yes | PRICE | Components, Structures, Activities, Cost drivers, Processes, Functional Software Size (Source Lines of Code (SLOC), Function Points, Use Case Conversion Points (UCCP), Predictive Object Points (POPs) etc.) | Excel, CAD | Any | Windows | Commercial | Proprietary |
References
[edit]- ^ "What We do and Don't Know about Software Development Effort Estimation".
- ^ "Cost Estimating And Assessment Guide GAO-09-3SP Best Practices for developing and managing Capital Program Costs" (PDF). US Government Accountability Office. 2009.
- ^ Jørgensen, M. (2004). "A Review of Studies on Expert Estimation of Software Development Effort". Journal of Systems and Software. 70 (1–2): 37–60. doi:10.1016/S0164-1212(02)00156-5.
- ^ Molokken, K. Jorgensen, M. (2003). "A review of software surveys on software effort estimation". 2003 International Symposium on Empirical Software Engineering, 2003. ISESE 2003. Proceedings. pp. 223–230. doi:10.1109/ISESE.2003.1237981. ISBN 978-0-7695-2002-5. S2CID 15471986.
{{cite book}}: CS1 maint: multiple names: authors list (link) - ^ Jørgensen, M. Teigen, K.H. Ribu, K. (2004). "Better sure than safe? Over-confidence in judgement based software development effort prediction intervals". Journal of Systems and Software. 70 (1–2): 79–93. doi:10.1016/S0164-1212(02)00160-7.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ Edwards, J.S. Moores (1994). "A conflict between the use of estimating and planning tools in the management of information systems". European Journal of Information Systems. 3 (2): 139–147. doi:10.1057/ejis.1994.14. S2CID 62582672.
- ^ Goodwin, P. (1998). Enhancing judgmental sales forecasting: The role of laboratory research. Forecasting with judgment. G. Wright and P. Goodwin. New York, John Wiley & Sons: 91-112. Hi
- ^ Farr, L. Nanus, B. "Factors that affect the cost of computer programming, volume I" (PDF). Archived from the original (PDF) on February 21, 2017.
{{cite web}}: CS1 maint: multiple names: authors list (link) - ^ Farr, L. Nanus, B. "Factors that affect the cost of computer programming, volume II" (PDF). Archived from the original (PDF) on July 28, 2018.
{{cite web}}: CS1 maint: multiple names: authors list (link) - ^ Nelson, E. A. (1966). Management Handbook for the Estimation of Computer Programming Costs. AD-A648750, Systems Development Corp.
- ^ Anda, B. Angelvik, E. Ribu, K. (2002). "Improving Estimation Practices by Applying Use Case Models". Product Focused Software Process Improvement. Lecture Notes in Computer Science. Vol. 2559. pp. 383–397. CiteSeerX 10.1.1.546.112. doi:10.1007/3-540-36209-6_32. ISBN 978-3-540-00234-5.
{{cite book}}: CS1 maint: multiple names: authors list (link) ISBN 9783540002345, 9783540362098. - ^ Briand, L. C. and Wieczorek, I. (2002). "Resource estimation in software engineering". Encyclopedia of software engineering. J. J. Marcinak. New York, John Wiley & Sons: 1160–1196.
- ^ Jørgensen, M. Shepperd, M. "A Systematic Review of Software Development Cost Estimation Studies".
{{cite web}}: CS1 maint: multiple names: authors list (link) - ^ "Custom Software Development Services – Custom App Development – Oxagile".
- ^ Hill Peter (ISBSG) – Estimation Workbook 2 – published by International Software Benchmarking Standards Group ISBSG - Estimation and Benchmarking Resource Centre Archived 2008-08-29 at the Wayback Machine
- ^ Morris Pam — Overview of Function Point Analysis Total Metrics - Function Point Resource Centre
- ^ Srinivasa Gopal and Meenakshi D'Souza. 2012. Improving estimation accuracy by using case based reasoning and a combined estimation approach. In Proceedings of the 5th India Software Engineering Conference (ISEC '12). ACM, New York, USA, 75-78. doi:10.1145/2134254.2134267
- ^ Shepperd, M. Kadoda, G. (2001). "Comparing software prediction techniques using simulation". IEEE Transactions on Software Engineering. 27 (11): 1014–1022. Bibcode:2001ITSEn..27.1014S. doi:10.1109/32.965341.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ a b Jørgensen, M. "Estimation of Software Development Work Effort:Evidence on Expert Judgment and Formal Models".
- ^ Winkler, R.L. (1989). "Combining forecasts: A philosophical basis and some current issues Manager". International Journal of Forecasting. 5 (4): 605–609. doi:10.1016/0169-2070(89)90018-6.
- ^ Blattberg, R.C. Hoch, S.J. (1990). "Database Models and Managerial Intuition: 50% Model + 50% Manager". Management Science. 36 (8): 887–899. doi:10.1287/mnsc.36.8.887. JSTOR 2632364.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ BlueOptima (2019-10-29). "Identifying Reliable, Objective Software Development Metrics".
- ^ Shepperd, M. Cartwright, M. Kadoda, G. (2000). "On Building Prediction Systems for Software Engineers". Empirical Software Engineering. 5 (3): 175–182. doi:10.1023/A:1026582314146. S2CID 1293988.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ Kitchenham, B., Pickard, L.M., MacDonell, S.G. Shepperd. "What accuracy statistics really measure".
{{cite web}}: CS1 maint: multiple names: authors list (link) - ^ Foss, T., Stensrud, E., Kitchenham, B., Myrtveit, I. (2003). "A Simulation Study of the Model Evaluation Criterion MMRE". IEEE Transactions on Software Engineering. 29 (11): 985–995. Bibcode:2003ITSEn..29..985F. CiteSeerX 10.1.1.101.5792. doi:10.1109/TSE.2003.1245300.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ Miyazaki, Y. Terakado, M. Ozaki, K. Nozaki, H. (1994). "Robust regression for developing software estimation models". Journal of Systems and Software. 27: 3–16. doi:10.1016/0164-1212(94)90110-4.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ Lo, B. Gao, X. "Assessing Software Cost Estimation Models: criteria for accuracy, consistency and regression".
{{cite web}}: CS1 maint: multiple names: authors list (link) - ^ Hughes, R.T. Cunliffe, A. Young-Martos, F. (1998). "Evaluating software development effort model-building techniquesfor application in a real-time telecommunications environment". IEE Proceedings - Software. 145: 29. doi:10.1049/ip-sen:19983370 (inactive 12 July 2025). Archived from the original on September 20, 2017.
{{cite journal}}: CS1 maint: DOI inactive as of July 2025 (link) CS1 maint: multiple names: authors list (link) - ^ Grimstad, S. Jørgensen, M. (2006). "A Framework for the Analysis of Software Cost Estimation Accuracy".
{{cite web}}: CS1 maint: multiple names: authors list (link) - ^ Jørgensen, M. Grimstad, S. (2008). "How to Avoid Impact from Irrelevant and Misleading Information When Estimating Software Development Effort". IEEE Software: 78–83.
{{cite journal}}: CS1 maint: multiple names: authors list (link) - ^ Bentley, Jon (1985). "Programming pearls" (fee required). Communications of the ACM. 28 (9): 896–901. doi:10.1145/4284.315122. ISSN 0001-0782. S2CID 5832776.
- ^ Gödel, Escher, Bach: An Eternal Golden Braid. 20th anniversary ed., 1999, p. 152. ISBN 0-465-02656-7.
- ^ AFCAA Revic 9.2 manual Revic memorial site
- ^ "SLIM Suite Overview". Qsm.com. Retrieved 2019-08-27.
- ^ "SLIM-WebServices". Qsm.com. Retrieved 2019-08-27.
- ^ TruePlanning Integrated Cost Models PRICE Systems site Archived 2015-11-05 at the Wayback Machine
Software development effort estimation
View on Grokipedia

Fundamentals
Definition and Scope
Software development effort estimation is the quantitative prediction of resources, typically expressed in person-hours, person-months, or equivalent units, required to complete software development tasks spanning from requirements analysis through deployment.[8] This process involves assessing the human labor needed to translate project specifications into a functional software product, serving as a foundational element for effective project budgeting and resource allocation.[9] Unlike broader project forecasting, it emphasizes the core work involved in building the software rather than ancillary activities.[10]

The scope of software development effort estimation primarily covers key lifecycle phases, including design, coding, integration, and testing, up to deployment, though some models and practices may include post-deployment maintenance.[8] It is distinguished from cost estimation, which adds overheads such as hardware procurement, training, and administrative expenses to the effort figure, and from schedule estimation, which converts effort into calendar time by accounting for team size and parallelism.[10] This focused boundary ensures estimates remain targeted on direct development labor, aiding precise planning without inflating for indirect costs or timelines.[9]

Central terminology in this domain includes effort, denoting the total work units expended by personnel; size metrics, such as lines of code (LOC) or function points (FP), which measure the software's functional scale to inform predictions; and productivity rates, representing output achieved per unit of effort, often derived from historical data to calibrate estimates.[11] These concepts provide a standardized framework for quantifying and comparing development demands across projects.[12]

This estimation practice emerged in the 1960s and 1970s amid growing software project complexity, as organizations sought reliable methods to anticipate resource needs beyond ad-hoc judgments.[13] Within broader project management, it underpins decision-making for scope control and risk mitigation.[9]

Importance and Challenges
Accurate software development effort estimation is essential for effective project budgeting, resource allocation, risk management, and communication with stakeholders, as it provides the foundation for realistic planning and decision-making throughout the project lifecycle.[14] Without reliable estimates, organizations face heightened risks of scope creep, inefficient resource use, and unmet expectations, ultimately impacting project viability and business outcomes.[15]

Poor effort estimation frequently results in substantial project overruns; for example, a 2012 McKinsey study found that large IT projects run an average of 45% over budget while delivering 56% less value than predicted.[16] According to the Standish Group's CHAOS reports through the 2020s, over 50% of software projects encounter budget or schedule challenges, underscoring the financial and operational consequences of inaccurate predictions.[17] These overruns not only strain organizational resources but also erode trust among teams and clients.

Effort estimation faces significant challenges due to inherent uncertainties, such as evolving or incomplete requirements, variability in team skills and productivity, rapid technological shifts, and rare but impactful Black Swan events like global disruptions.[18][19] Estimation uncertainty forms a spectrum, often illustrated by the cone of uncertainty, where early-stage predictions exhibit high variance (up to fourfold errors) that narrows as the project progresses and more details emerge.[20]

The role of effort estimation varies across development methodologies: in traditional waterfall approaches, it is predominantly upfront and comprehensive to define the entire project scope, whereas in iterative methodologies like Agile, it is ongoing, adaptive, and refined through sprints to accommodate changes. This distinction highlights how estimation practices must align with the project's structure to mitigate risks effectively.

Historical Development
Early Methods and Origins
The origins of software development effort estimation can be traced to broader practices in industrial engineering and project management from the early 20th century. Frederick Winslow Taylor's scientific management principles, introduced in the 1910s, emphasized systematic measurement and optimization of worker productivity through time studies and standardized tasks, laying foundational concepts for quantifying effort in complex endeavors.[21] These ideas influenced engineering fields like construction estimating, where parametric models based on historical data, material quantities, and labor rates were used to predict costs for building projects. By the 1950s, as software emerged as a distinct engineering discipline, early practitioners drew analogies from hardware development and construction to estimate programming effort, treating software production similarly to assembling physical systems with predictable labor inputs.

In the 1960s, the growing scale of software projects prompted more structured attempts at prediction, particularly within large-scale defense and space programs. A seminal early effort was the 1956 analysis by Herbert D. Benington on the SAGE air defense system, a massive software undertaking involving approximately 500,000 instructions, which documented effort distribution across phases roughly as one-third (≈33%) on planning and specification, one-sixth (≈17%) on coding, and one-half (50%) on testing and debugging, achieving approximately 64 delivered source instructions per person-month through iterative processes and large teams.[22] NASA's projects during this decade, such as those at Goddard Space Flight Center, similarly relied on rudimentary metrics like lines of code and complexity factors to forecast programming effort, often adapting hardware cost models amid the Apollo program's demands. Barry Boehm, working at organizations like RAND and later TRW, began exploring productivity models in the late 1960s, analyzing factors such as system resilience and team dynamics in defense software, which informed his later algorithmic approaches.[23]

These early methods were hampered by significant limitations, including heavy reliance on ad-hoc expert guesses due to scarce historical data and the nonlinear nature of software development. For instance, IBM's OS/360 operating system project in the mid-1960s, one of the largest non-military software efforts at the time, suffered severe overruns, with development costs escalating far beyond initial estimates of $125 million to over $500 million in direct research expenses, highlighting the challenges of scaling teams and managing conceptual work without reliable prediction tools.[24] Such experiences underscored the need for more empirical foundations, paving the way for formalized techniques in subsequent decades.

Evolution and Key Milestones
The evolution of software development effort estimation began to formalize in the late 1970s and 1980s, building on early empirical approaches to address the growing complexity of software projects. Lawrence H. Putnam introduced the Putnam Resource Allocation Model in 1978, which used the Norden-Rayleigh curve to predict staffing levels, development time, and effort distribution over a project's lifecycle, extending into practical applications during the 1980s for resource planning in large-scale systems. In 1979, Allan J. Albrecht developed Function Point Analysis (FPA) at IBM as a method to measure software size based on functional user requirements rather than lines of code, with formalization and wider adoption occurring throughout the 1980s through industry symposia and guidelines.[25][26] This period marked a shift toward size-based metrics, culminating in Barry Boehm's 1981 publication of the Constructive Cost Model (COCOMO), an empirical parametric model that estimated effort in person-months using lines of code and cost drivers, becoming a foundational benchmark for waterfall-based projects.[27]

The 1990s and early 2000s saw refinements to these models alongside the emergence of agile methodologies, adapting estimation to iterative and flexible development. Boehm led the development of COCOMO II in the late 1990s, released in 2000, which incorporated modern practices like object-oriented design and rapid prototyping by using source statements adjusted for reuse and introducing early design and post-architecture stages for more accurate predictions across project phases.[28] Concurrently, the rise of agile processes in the early 2000s introduced collaborative, judgment-based techniques; for instance, Planning Poker, formalized in 2002 within Extreme Programming frameworks, enabled teams to estimate relative effort using cards representing story points, fostering consensus and reducing bias in sprint planning.[29] These advancements reflected a broader transition from rigid, code-centric models to adaptable, team-oriented approaches amid increasing project diversity.

From the 2010s onward, effort estimation integrated data-driven and machine learning techniques, responding to big data availability and computational advances, while standards evolved to support global practices. Post-2015, artificial intelligence and machine learning applications, such as neural networks trained on historical datasets like NASA's COCOMO repository, improved prediction accuracy compared to traditional models, with studies demonstrating enhancements in metrics like mean magnitude of relative error.[30] The ISO/IEC 14143 standard for functional size measurement, initially published in 1995, received key updates in 2007 and subsequent parts through the 2010s, refining FPA definitions and conformance requirements to accommodate agile and distributed environments.[31] Since the 2010s, trends in outsourcing and cloud computing have continued to reshape estimation by introducing factors like distributed team coordination and scalable infrastructure costs; global software development increases effort variance due to communication overhead, with models adjusting for geographic dispersion, while cloud migration efforts are estimated using hybrid parametric approaches that factor in reconfiguration and data transfer, often reducing on-premises overhead but adding integration complexities.[32][33]

Core Concepts and Factors
Effort Metrics and Components
Software development effort is primarily quantified using time-based metrics such as person-hours, person-days, or person-months, which represent the total labor required from individuals working on the project.[34] These units allow for standardized comparisons across projects and are foundational in models like COCOMO, where effort is expressed in person-months to account for the cumulative work of the development team.[34] In agile contexts, secondary metrics like story points are employed as relative measures of effort, reflecting complexity, risk, and uncertainty without direct ties to time; story points are the most commonly used size metric in agile estimation practices.[35]

To derive these effort metrics, size proxies serve as inputs to estimate the scale of the software. Source lines of code (SLOC) measures the volume of implemented code, often in thousands (KLOC), and is used to predict effort based on historical productivity rates, though it is typically available only post-implementation.[36] Function points (FP), introduced by Albrecht in 1979, quantify functionality from the user's perspective by counting inputs, outputs, inquiries, files, and interfaces, enabling early estimation independent of technology.[25] Use case points (UCP), proposed by Karner in 1993, extend this by sizing based on actors and use case scenarios, weighted for complexity to forecast effort in use-case-driven projects.[37]

Effort components are typically decomposed by software development life cycle (SDLC) phases to allocate resources effectively. A representative breakdown, drawn from COCOMO II defaults for waterfall models, illustrates typical distributions: requirements and planning consume 7% (range: 2-15%), product and detailed design around 44% (17% + 27%), coding and unit testing 37%, integration and testing 19-31%, and transition or deployment 12% (0-20%).[34] These proportions vary by project type and methodology but highlight that upfront phases like requirements (often 10-20% in empirical studies) and design (approximately 20%) lay the foundation, while coding (around 30%) and testing (30%) dominate execution, with integration closing at about 10%.[38]

| Phase | Typical Effort Percentage | Notes |
|---|---|---|
| Requirements & Planning | 7% (2-15%) | Focuses on scope definition; empirical ranges from CSBSG data.[34] |
| Design (Product & Detailed) | 44% (17% + 27%) | Architectural and specification work; varies with complexity.[34] |
| Coding & Unit Testing | 37% | Implementation core; often 30% in balanced models.[34] |
| Integration & Testing | 19-31% | Verification and defect resolution; typically 30-40% including rework.[34] |
| Transition/Deployment | 12% (0-20%) | Rollout and handover; around 10% in many projects.[34] |
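A minimal sketch of how such a phase breakdown might be applied: given an estimated size in function points and an assumed historical productivity rate (both values invented for illustration), total effort is derived and then allocated across phases using nominal percentages loosely following the table above.

```python
# Illustrative only: the size, productivity rate, and phase shares are assumptions,
# not calibrated values from any particular model.

size_fp = 250                # estimated size in function points (hypothetical)
hours_per_fp = 8.0           # assumed historical productivity: person-hours per FP

phase_distribution = {       # nominal shares summing to 1.0, roughly following the table
    "requirements_planning": 0.07,
    "design": 0.25,
    "coding_unit_testing": 0.30,
    "integration_testing": 0.28,
    "transition_deployment": 0.10,
}

total_effort = size_fp * hours_per_fp   # total effort in person-hours

for phase, share in phase_distribution.items():
    print(f"{phase:>24}: {total_effort * share:7.0f} person-hours")
print(f"{'total':>24}: {total_effort:7.0f} person-hours")
```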
Influencing Variables
Influencing variables in software development effort estimation refer to the modifiable factors that adjust baseline estimates derived from core metrics like size or functionality, accounting for contextual nuances that can significantly alter required resources. These variables are typically categorized into project attributes, team factors, environmental elements, and requirements volatility, each contributing multipliers or buffers to refine predictions for accuracy. Seminal models like COCOMO II incorporate 17 cost drivers that multiply the nominal effort, enabling estimators to scale predictions based on empirical data from hundreds of projects.[41]

Project attributes, such as complexity and reusability, directly influence the inherent difficulty of the software product. Complexity encompasses aspects like computational demands, data management, and user interface intricacy; for instance, in the COCOMO II model, product complexity (CPLX) applies multipliers ranging from 0.73 for very low levels (simple operations) to 1.74 for extra high levels (highly intricate integrations), effectively increasing effort by up to 74% for complex systems compared to nominal cases. Reusability (RUSE) measures the design effort for components intended for reuse, with multipliers from 0.95 (low reusability needs) to 1.24 (extra high), adding up to 24% more effort when extensive reuse is required to ensure modularity and maintainability. Domain-specific examples highlight these impacts: embedded systems, often involving tight hardware-software coupling, demand higher effort than web applications due to stringent real-time constraints; in COCOMO II's embedded mode, the effort exponent rises to 1.20 versus 1.05 for organic (simple, team-familiar) web projects, potentially doubling effort for equivalent sizes like 100 thousand delivered source instructions (KDSI).[41]

Team factors, including experience and co-location, affect productivity and coordination efficiency. Analyst and applications experience (AEXP) in COCOMO II reduces effort for high proficiency, with multipliers dropping to 0.81 (19% savings) from 1.22 (22% increase) for very low experience, reflecting faster problem-solving by seasoned teams. Personnel capability (PCAP) similarly adjusts from 0.76 (very high, 24% savings) to 1.34 (very low), underscoring how skilled teams can cut effort by 24% through optimized coding and debugging. Co-location minimizes communication overhead; the multisite development (SITE) driver penalizes distributed teams with multipliers up to 1.22 for very low collocation (22% effort increase), as remote setups amplify coordination costs compared to on-site collaboration.[41]

Environmental factors, such as tools and standards, shape the development ecosystem's efficiency. Tool usage (TOOL) in COCOMO II boosts productivity with advanced automation, applying multipliers from 0.78 (very high tool support, 22% reduction) to 1.17 (very low, 17% increase), as integrated development environments streamline testing and integration. Standards adherence, often tied to process maturity, indirectly influences effort via personnel factors but can elevate it if rigid compliance (e.g., ISO 26262 for automotive software) demands additional documentation and reviews.
Risk factors like security requirements further amplify environmental demands; reliability (RELY), which includes security robustness, raises effort by 10% for high ratings (multiplier 1.10) to 26% for very high (1.26), with studies indicating an average of 20% of development effort attributed to security in most projects.[41][42]

Requirements volatility introduces uncertainty by necessitating rework, often requiring buffers to base estimates. In practice, high volatility, such as frequent changes in specifications, adds 10-30% to effort as a contingency, accounting for iterative revisions and testing; empirical studies confirm that adding new requirements mid-project can inflate change effort by 20% or more per volatility instance. COCOMO II addresses this via platform volatility (PVOL), with multipliers from 0.87 (very low) to 1.30 (high), though broader adjustments are common in agile contexts to mitigate schedule slips. Expert analyses identify volatility as a top-ranked factor (e.g., development type like enhancements), emphasizing its role in degrading prediction accuracy without explicit buffers.[41][43][44]

| Category | Example Driver (COCOMO II) | Multiplier Range | Effort Impact Example |
|---|---|---|---|
| Project Attributes | Complexity (CPLX) | 0.73 (very low) to 1.74 (extra high) | +74% for highly complex systems |
| Project Attributes | Reusability (RUSE) | 0.95 (low) to 1.24 (extra high) | +24% for high-reuse designs |
| Team Factors | Experience (AEXP) | 0.81 (very high) to 1.22 (very low) | -19% with expert teams |
| Team Factors | Co-location (SITE) | 0.81 (extra high) to 1.22 (very low) | +22% for distributed teams |
| Environmental | Tools (TOOL) | 0.78 (very high) to 1.17 (very low) | -22% with advanced tools |
| Environmental | Security/Reliability (RELY) | 0.82 (very low) to 1.26 (very high) | +26% for secure systems |
| Requirements Volatility | Platform Volatility (PVOL) | 0.87 (very low) to 1.30 (high) | +30% buffer for changes |
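To show how such cost drivers enter an estimate, the sketch below applies a COCOMO II-style post-architecture equation, Effort = A × Size^E × ΠEM_i. The chosen size, the scale exponent, and the selected multiplier ratings are illustrative assumptions rather than a calibrated model; only the multiplier values echo the ranges shown in the table above.

```python
# Sketch of a COCOMO II-style effort calculation with cost-driver multipliers.
# The constants, size, and selected ratings below are assumptions for illustration,
# not a calibrated estimate for any real project.

A = 2.94          # nominal productivity constant (assumed, in the spirit of COCOMO II)
E = 1.10          # assumed scale exponent for this example
size_ksloc = 50   # assumed size: 50 thousand source lines of code

effort_multipliers = {
    # driver: chosen rating multiplier (values within the ranges listed above)
    "CPLX (complexity, high)": 1.17,
    "RUSE (reusability, nominal)": 1.00,
    "AEXP (experience, high)": 0.88,
    "SITE (multisite, low collocation)": 1.09,
    "TOOL (tool support, high)": 0.90,
    "RELY (reliability/security, high)": 1.10,
    "PVOL (platform volatility, nominal)": 1.00,
}

def estimate_effort(size, a, e, multipliers):
    """Return nominal and adjusted effort (person-months); the adjustment is the
    product of all effort multipliers applied to the nominal value."""
    nominal = a * size ** e
    adjustment = 1.0
    for value in multipliers.values():
        adjustment *= value
    return nominal, nominal * adjustment

nominal_pm, adjusted_pm = estimate_effort(size_ksloc, A, E, effort_multipliers)
print(f"Nominal effort:  {nominal_pm:.1f} person-months")
print(f"Adjusted effort: {adjusted_pm:.1f} person-months")
```

In this made-up scenario the drivers nearly cancel out, illustrating why individually modest multipliers matter mainly when several unfavorable ratings compound.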
