Estimating to minimize loss
by Ken West
A ROBUST SET of estimates puts a project on a firm footing from Day 1, allowing the project manager to apply the right level of resources at the appropriate time. If the plan has been based on poor estimates, problems will occur during the execution of the project. An underestimated and under-resourced project is only superficially under control … and sooner or later that shortfall will come to light. An overestimated project is inefficient: it wastes resources that could be better utilized elsewhere. Any tool or method that improves our ability to make and communicate estimates will have a significant return on investment during the project.
The future is uncertain, and, in response to this uncertainty, the project manager must estimate the possible outcomes, select one that balances the risks of under- and overestimating the resource needs, and then plan based on that estimate. We will look at this process as a means of minimizing the risk of future loss. The viewpoint of “minimal loss” allows us to introduce and apply an estimating tool from Bayesian probability theory. In this article, we look at the process of cost estimation on a project.
A Sense of Loss
For the project manager, both under- and overestimates have a cost. If the effort needed exceeds budget, then there will be unplanned overtime, missed deadlines, and unhappy clients. Contractual penalties may even be imposed. On the other hand, if the resources needed are under budget, then the resources supplied will not be fully utilized. This is an opportunity cost as the resources could potentially be put to better use elsewhere.
The more that the actual outcome differs from the estimate, the more the cost of a poor estimate tends to increase. Indeed, we can say that the “loss” of a poor estimate is a function of the difference between the estimate and the actual outcome. This is the “loss function.”
An example should make this clear: Imagine a project where the only resource needed is effort. One hundred person-days of effort are estimated at a cost of $900 per day, giving a cost estimate of $90,000. The price is set at $100,000 so, if all goes according to plan, the profit will be $10,000 (a negative loss!). Imagine also that the resources on this project can be redirected to lower-value tasks if they are not needed, and the value of these alternate tasks is $800 per day. One example of a lower-value task is “polishing” a deliverable so that its quality is higher than what is required in the specifications. Another example is having a resource sitting idle for a short time waiting to be reallocated. (If this were the case in this example, the estimate of $800 is equivalent to assigning a probability of 8/9 that the resource will be reallocated within a day.)
Once this imaginary project is approved, resources are allocated and work commences. If the actual resource usage differs from the estimate, then some loss will be incurred. In this example, every extra person-day of effort will cost the project $900. On the other hand, every person-day in which a resource is unused has an opportunity cost of $100—we have unnecessarily tied up a resource for a day longer than required ($900) but are able to reallocate the resource to some other, lower-value work ($800). The “loss function” for this project is shown in Exhibit 1.
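The loss function just described can be sketched in a few lines of Python. This is a minimal illustration rather than anything from the original article: the function name `loss` and its parameter names are assumptions, and the default rates are the $900 and $100 figures from the example.

```python
def loss(actual_days, estimated_days, under_cost=900, over_cost=100):
    """Dollar loss when actual effort differs from the estimate.

    under_cost: cost per person-day when the estimate was too low.
    over_cost: opportunity cost per person-day when the estimate was too high.
    """
    variation = actual_days - estimated_days  # the "resource variation"
    if variation > 0:
        # Underestimate: every extra person-day must be funded in full.
        return under_cost * variation
    # Overestimate (or exact hit): only the opportunity cost is lost.
    return over_cost * -variation


print(loss(110, 100))  # 10 extra person-days  -> 9000
print(loss(90, 100))   # 10 unused person-days -> 1000
```

The two branches have different slopes, which is what makes this loss function asymmetric.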
Given this loss function, where underestimates are more costly than overestimates, it would be prudent to estimate on the high side, risking the opportunity cost of unused capacity rather than that of funding extra resources.
Bayesian Parameter Estimation Theory
The Bayesian approach to probability allows us to reason logically in the face of uncertainty [E.T. Jaynes, Probability Theory: The Logic of Science, 1996, http://bayes.wustl.edu/etj/prob.html]. It allows us to construct models of the world that can combine our prior experience with new information. For the purposes of this article, we need only consider a small portion of the theory: how to select the “best” value of a parameter from the range of possible values it might take. In our case, the “parameter” we want to estimate is the resource needs for a project.
The Bayesian approach to parameter estimation has been successfully applied in a number of fields; for instance, G.L. Bretthorst [Bayesian Spectrum Analysis and Parameter Estimation, 1988, Springer-Verlag] documents the remarkable accuracy achieved in spectrum analysis using this technique. We shouldn't expect the same precision that Bretthorst achieved because our “data” is not the result of a repeatable experiment, but rather an estimate of future possible outcomes—only one of which will ever happen! What the Bayesian approach offers us under these conditions is (1) a sound basis for reasoning about uncertainty that has proved itself in the fields of science, engineering, medicine, and economics, and (2) a link between the project's cost structure and the “best” estimate we can choose given that structure. The link is the loss function.
There are two key ideas in Bayesian parameter estimation: (1) that loss is suffered when the actual result differs from the estimate (as indicated by the loss function), and (2) that the “best” estimate is that which minimizes the risk of loss. Both these ideas are certainly appropriate to project estimates.
The “best” estimate depends on the shape of the loss function applied. The following results can be derived with some simple maths (remembering that “resource variation” is the difference between the actual resource usage and the resource estimate):
The median (50th percentile) is the best estimate if the loss function is proportional to the absolute value of the resource variation (that is, the loss is symmetric for over- and underestimates).
The mean (average) is the best estimate if the loss function is proportional to the square of the resource variation.
If the loss is proportional to the absolute value of the resource variation, but the cost of an underestimate is X times the cost of an equivalent overestimate (as in Exhibit 1), then the Nth percentile is the best estimate, where N = X/(X+1), expressed as a percentage. The median is a special case of this result when X = 1.
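The percentile result can be derived in a few lines. As a sketch, assume the actual resource need a has probability density p(a) and cumulative distribution F(a), and write c_o for the cost per unit of overestimate and c_u = X c_o for the cost per unit of underestimate. These symbols are introduced here for illustration and do not appear in the original article. The expected loss R(e) of choosing the estimate e, and the condition for its minimum, are then:

```latex
\begin{align*}
R(e) &= c_u \int_{e}^{\infty} (a - e)\, p(a)\, da
      + c_o \int_{-\infty}^{e} (e - a)\, p(a)\, da \\
\frac{dR}{de} &= -c_u \bigl(1 - F(e)\bigr) + c_o\, F(e) = 0 \\
F(e^{*}) &= \frac{c_u}{c_u + c_o} = \frac{X}{X + 1}
\end{align*}
```

The second derivative, (c_u + c_o) p(e), is never negative, so e* is a minimum. Setting X = 1 gives F(e*) = 1/2, which is the median.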
Given the loss function in Exhibit 1, our project manager would make use of the third result, and the best estimate will be the 90th percentile of the resource distribution (X = $900/$100 = 9, and 9/(9+1) = 90 percent).
Now that we know which statistic is the “best” to choose, the project manager must choose the resource estimate that matches this. Let's say that the estimating team determines that there is a 50 percent chance that the example project will require no more than 80 person-days of effort to complete. Similarly, there is an 80 percent chance that it will require no more than 90 person-days, and a 90 percent chance that it will require no more than 100. Given the shape of our loss function, we should estimate costs on the basis of 100 person-days of effort.
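The selection just described can be sketched as a small lookup. The function names `target_percentile` and `choose_estimate`, and the dictionary layout of the percentile table, are assumptions made for this illustration; the numbers are those of the example project.

```python
from fractions import Fraction


def target_percentile(under_cost, over_cost):
    """N = X / (X + 1) as a percentage, where X = under_cost / over_cost.

    Algebraically X / (X + 1) equals under_cost / (under_cost + over_cost).
    Fraction keeps the result exact; the costs are assumed to be integers.
    """
    return Fraction(100 * under_cost, under_cost + over_cost)


def choose_estimate(percentile_table, target):
    """Return the smallest estimate whose cumulative chance meets the target.

    percentile_table maps a percentage chance to the number of person-days
    that the project is expected not to exceed with that chance.
    """
    for chance in sorted(percentile_table):
        if chance >= target:
            return percentile_table[chance]
    raise ValueError("no estimate in the table reaches the target percentile")


table = {50: 80, 80: 90, 90: 100}      # the estimating team's figures
target = target_percentile(900, 100)   # 90th percentile
days = choose_estimate(table, target)  # 100 person-days
print(target, days, days * 900)        # 90 100 90000
```

With symmetric costs (`target_percentile(100, 100)`), the same lookup returns the median figure of 80 person-days.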
The theory has yielded an intuitively appealing result. If our loss function is strongly biased against underestimating, the theory will lead us to choose an estimate at the high end of our resource range. A lower percentile would be appropriate if we weighted over- and underestimates more evenly.
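This behavior can be checked by brute force. The sketch below assumes a purely hypothetical distribution of 99 equally likely outcomes (1 to 99 person-days), so that a value of v person-days sits at roughly the vth percentile, and searches for the estimate with the lowest total loss. The function names are illustrative, not taken from the article.

```python
def total_loss(estimate, outcomes, under_cost, over_cost):
    """Sum of losses over equally likely outcomes.

    For equally likely outcomes this is proportional to the expected loss,
    so it has the same minimizing estimate.
    """
    total = 0
    for actual in outcomes:
        if actual > estimate:
            total += under_cost * (actual - estimate)
        else:
            total += over_cost * (estimate - actual)
    return total


def best_estimate(outcomes, under_cost, over_cost):
    """Candidate estimate (drawn from the outcomes) with the lowest total loss."""
    return min(outcomes,
               key=lambda e: total_loss(e, outcomes, under_cost, over_cost))


outcomes = list(range(1, 100))  # hypothetical: 1..99 person-days, equally likely
for x in (1, 4, 9):             # X = underestimate cost / overestimate cost
    print(x, best_estimate(outcomes, under_cost=100 * x, over_cost=100))
# prints:
# 1 50
# 4 80
# 9 90
```

As X grows from 1 to 9, the minimizing estimate moves from the median toward the 90th percentile, in line with N = X/(X+1).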
Applying the Theory
The key outcome of our discussion so far is that the use of loss functions allows the estimation process to be aligned with the cost structure of the project. In this section, we'll see how an estimating process built on these principles could be implemented.
Let's list the process steps and then discuss them in detail.
Determine the loss function (dollars lost vs. resource variation).
Determine the estimating statistics required (median, mean, percentile, and so forth).
Obtain stakeholder agreement.
Communicate the estimating requirements to the estimating team.
Produce the resource estimates and gather the agreed statistics.
Set the project cost estimate in accordance with the loss functions.
Price the project.
Learn the lessons from project execution.
Determine the Loss Function. The process for calculating a loss function can be summarized simply: weigh the costs of an underestimate against those for an overestimate. However, this is easier said than done. Indeed, the shape of the loss function may itself need to be estimated.
If the project is underestimated, and the budget is not adequate for the work, then this is compensated for by applying extra effort, resources, and time to complete the project. The cost of doing this should be straightforward to calculate from the cost rates that will be used on the project.
If the project is overestimated, the cost of this is less tangible and hence more difficult to calculate. Wasted time has a cost in terms of time-to-market and inability to undertake other projects. Unused effort has a cost, but people don't tend to do nothing—they work on less valuable tasks. Unused materials have a cost, but not all such material is scrap—it can be stored for future use. Since unused resources are not totally lost, and the losses are less tangible, we would expect the loss function to attribute a smaller loss to an overestimate than to an equivalent underestimate.
Note that the business context has an impact on the loss function. For instance, if the project is the subject of a fixed-price contract, then the loss function will be strongly biased against underestimates. On a cost-reimbursable contract, this bias will be less pronounced.
Determine the Estimating Statistics. The shape of the loss function determines which statistics need to be estimated. From this point on, I'll assume that the project's loss function is not symmetric and has a shape similar to Exhibit 1. Thus, the Nth percentile is the “estimating statistic” that the estimating team needs to calculate. “N” is calculated using the formula given above.
If the project's loss function were symmetric, then the estimating statistic would be either the mean or median as appropriate. If the loss function is more complicated than the examples given, there is probably no simple formula to apply, and a more sophisticated estimating model would need to be developed with the estimation team.
Obtain Stakeholder Agreement. The purpose of this step is to ensure that the stakeholders’ understanding of the estimates matches the project's loss function. The project manager should ensure that the project stakeholders understand and endorse the estimating statistics: the Nth percentile represents the safety level that will be built into the estimates.
Communicate Estimating Requirements. The project manager should set clear expectations for the estimators. The concepts of loss functions and the Nth percentile for estimates should be discussed and agreed so that the estimators’ mental model of the estimation process matches that of the stakeholders.
Produce the Estimates of Resource Needs. The project manager should have the resource estimates prepared so that they supply the estimating statistics required.
Note that the use of loss functions is independent of the estimating techniques employed. Any technique can be used (whether estimating by analogy, by parametric means, or in a top-down or bottom-up manner), provided it can supply the estimating statistics needed to choose the “best” cost estimate. What we assume here is that the estimates are reviewed for validity.
Projects are subject to a variety of performance constraints. For instance, the project may be required to complete before a drop-dead date. There is no point using the loss function to choose between estimates that don't meet this drop-dead date. The appropriate action in that case is to rework the schedule, renegotiate the date, or abandon the project. In general, the constraints imposed on the project should be honored by the resource estimates before attempting to derive the “best” cost estimate.
Set the Cost Estimate. The project's cost estimate is calculated from the resource estimate by multiplying the quantities of resources needed by their unit costs.
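As a minimal sketch of this step, the calculation is a sum of quantity times unit cost over the estimated resources. The function name `cost_estimate` and the dictionary layout are assumptions made for illustration; the single entry reproduces the example project's 100 person-days at $900 per day.

```python
def cost_estimate(resources):
    """Sum quantity * unit cost over all estimated resources.

    resources maps a resource name to a (quantity, unit_cost) pair.
    """
    return sum(quantity * unit_cost
               for quantity, unit_cost in resources.values())


example = {"effort (person-days)": (100, 900)}
print(cost_estimate(example))  # 90000
```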
Price the Project. Now that we have a cost estimate that reflects our cost structure, the project can be priced in accordance with vendor, client, and market expectations. Note that these considerations might lead the project to be priced at a loss, particularly if the project is seen as strategic. Whatever the pricing decision, the project manager needs to monitor project performance against the cost estimates.
Lessons from Project Execution. As the project is executed, it may become clear that the loss function used is in some way inappropriate and needs to be revised. When adjusting the loss function, it is worth keeping to the shapes indicated above; otherwise, much more sophisticated models will be required, and the results presented here will not be valid.
It may be that the estimation team needs training in providing estimates that conform to the loss function being used. For instance, if estimates were being made at the 90th percentile, then one would expect 90 percent of estimates made to be greater than the corresponding actual result. Estimates and actual resource needs should be recorded so that estimators’ performance can be scored. The score is the percentage of estimates that exceed the actual outcome, less the percentile required (90 percent in this case). Ideally this score would be zero. Underestimators would have a negative score and would be encouraged to raise their resource estimates. Similarly, overestimators would have positive scores and be encouraged to lower their resource estimates.
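The scoring rule can be sketched as follows. The function name `estimator_score` and the sample history are hypothetical, invented only to show the arithmetic; each record pairs an estimate with the actual outcome.

```python
def estimator_score(records, required_percentile):
    """Percentage of estimates that exceeded the actual, less the target.

    records: list of (estimate, actual) pairs for one estimator.
    A negative score indicates a tendency to underestimate,
    a positive score a tendency to overestimate.
    """
    if not records:
        raise ValueError("at least one (estimate, actual) record is required")
    exceeded = sum(1 for estimate, actual in records if estimate > actual)
    return 100 * exceeded / len(records) - required_percentile


# Hypothetical history: 7 of these 10 estimates exceed the actual outcome.
history = [(100, 92), (80, 85), (120, 110), (60, 61), (90, 70),
           (75, 75), (110, 100), (95, 90), (70, 64), (100, 88)]
print(estimator_score(history, 90))  # -20.0
```

A score of -20 against a 90th-percentile target suggests that this estimator should raise future estimates.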
VIEWING ESTIMATES AS A WAY of minimizing loss gives the project manager the ability to improve communications both within the team and with the stakeholders. By improving the estimates and clearly communicating what they mean, the project manager has increased the likelihood of project success, from Day 1.
PM Network April 2001