Putting quality in project risk management, part 2
dealing with variation
by Lawrence P. Leach
Once you've isolated special- and common-cause variations, set your sights on improved task estimation and performance.
IN PART 1 OF THIS ARTICLE, I defined and demonstrated the need to understand and manage variation. I also described the differences between common-cause and special-cause variation, and the two management mistakes that frequently increase variation on projects: (1) ascribing a variation to a special cause when in fact it belongs to the system (common-cause); and (2) ascribing a variation or mistake to the system (common-cause) when in fact the cause was special.
A lack of understanding of common-cause and special-cause variation leads to management mistakes that degrade the performance of the project system. Critical chain project management (CCPM) improves on present practice by keeping the critical chain fixed throughout a project, and by sizing buffers and setting action criteria at levels that should trigger a management response only to special-cause variation.
Significant opportunity exists for future improvement to the science of project performance through applying control charts. In the meantime, project risk management must differentiate between the two causes of variation.
In his later years, W. Edwards Deming [The New Economics: For Industry, Government, Education, 1993, MIT Press] put increasing emphasis on the necessity of management to use a “system of profound knowledge.” He defines the system of profound knowledge to include appreciation for a system, understanding of variation, psychology, and a theory of knowledge. His theory of knowledge uses the scientific method.
Walter A. Shewhart was an acknowledged leader in the field of statistical process control and mentor to Deming. Deming's foreword to Shewhart's book, Statistical Method from the Viewpoint of Quality Control [1986, Dover Publications, reprint of 1936 original] notes that “some of the greatest contributions from control charts lie in areas that are only partially explored so far, such as application to supervision, management, and systems of measurement….” Shewhart notes that the use of process control provides a “means of directing action toward the process,” so that management can say, “If you do so and so, then such and such will happen.” Project managers must focus this predictive capability on the project schedule and cost.
Lawrence P. Leach manages the Project Management Office for American National Insurance Company (ANICO) in Galveston, Texas, USA. Prior to this he was founder and principal of Advanced Projects Institute (API), which applies business tools to help clients improve their work processes and management systems. With 30-plus years of experience as a project manager, he is a member of the Project Management Institute and the American Society of Quality Control.
Control Charts. Shewhart developed control charts in 1926 to help managers avoid confusing common-cause variation with special-cause variation, and vice versa. A control chart tracks the performance of an important variable over time. Initial construction of the control chart shows whether the process is in statistical control; that is, free of special causes of variation. If it is not free of such causes, management must take action to remove the special causes of variation before prediction is possible.
Once the process is in statistical control, the control chart provides limits that clearly differentiate between common-cause variation and special-cause variation. Management should take corrective action only on special-cause variation. Exhibit 6 [exhibit numbers are sequentially continued from Part 1] illustrates a typical control chart, with the upper and lower control limits. This chart uses the random data from the funnel experiment, described in Part 1 of this article. For a project, the variable on the ordinate might be the ratio of actual task cost or duration to the estimate.
The control chart shows the data as points, the average of the data, and an upper and lower control limit (UCL, LCL). Management should only react to points that demonstrate a special cause of variation. These are points above the UCL or below the LCL. None of the points in Exhibit 6 is out of control, because it is a random process. You may use additional criteria to help detect special-cause variation [for example, Victor E. Kane, Defect Prevention, 1989, ASQC Press].
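As an illustration of how such limits can be computed, the following is a minimal sketch of an individuals (XmR) chart, one common way to build a control chart from single observations. It is not necessarily the method used for Exhibit 6. The function names and the sample ratios of actual to estimated task duration are hypothetical; the constant 2.66 is the standard factor for moving ranges of two consecutive points.

```python
from statistics import mean


def individuals_chart_limits(values):
    """Return (center, lcl, ucl) for an individuals (XmR) control chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / 1.128, the standard factor for moving ranges of size two.
    half_width = 2.66 * mr_bar
    return center, center - half_width, center + half_width


def special_cause_points(values, lcl, ucl):
    """Return the indices of points above the UCL or below the LCL."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical baseline: ratio of actual to estimated duration for past tasks.
baseline = [1.0, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0, 1.05]
center, lcl, ucl = individuals_chart_limits(baseline)

# Hypothetical new tasks, judged against the baseline limits.
new_points = [1.02, 0.97, 1.9]
flags = special_cause_points(new_points, lcl, ucl)
print(f"center={center:.3f} LCL={lcl:.3f} UCL={ucl:.3f} flagged={flags}")
```

Only the third new point falls outside the limits, so only that task would justify a search for a special cause; the other two reflect ordinary common-cause variation.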
The PMBOK® Guide [1996, PMI] acknowledges control charts in Chapter 8, “Project Quality Management,” and even includes a figure (8-4) illustrating an example control chart of project schedule performance, but does not complete the understanding of variation. While the PMBOK® Guide makes reference to probabilistic models of project performance, actual project management practice frequently assumes that project plans are, or can be, deterministic. Indeed, the common practice of ascribing start and finish dates to each activity in a schedule with hundreds or even thousands of activities leads to misunderstanding the variability of task duration estimates and performance. (Early and late start and finish dates are not related to task duration variability but are instead a result of the schedule logic.)
The definition of critical path used in the PMBOK® Guide may contribute to treating common-cause variation as special-cause. The PMBOK® Guide definition of critical path notes that “the critical path will generally change from time to time as activities are completed ahead of or behind schedule.” This is usually due to common-cause variation. If there is value in defining the critical path, we should expect project management to do something when it changes. Taking action based on the changing critical path of the project, when the change is due to common-cause variation, is an instance of Mistake 1.
Critical Chain Project Management (CCPM). CCPM is a system to plan and manage projects based on all of Deming's elements of profound knowledge, including knowledge of variation. The critical chain approach greatly simplifies applying your understanding of variation to provide a kind of rough-and-ready solution to managing variation. There is reason to suspect that Eliyahu M. Goldratt, the developer of critical chain [Critical Chain, 1997, The North River Press], may have taken this approach because he believes that few people will implement statistical process control. The critical chain method provides simple mechanisms to reduce the frequency of the two mistakes previously described.
CCPM's foundation in the Theory of Constraints derives from understanding variation and dependent events. Although a critical chain plan may appear to be deterministic, it is understood and applied otherwise by CCPM practitioners. People used to looking at schedules with start and finish dates for each activity may mentally add them to a CCPM plan.
Avoiding task start and finish dates is one way CCPM reduces Mistake 1. Statistics allows you to make knowledge-based statements about groups of activities, such as those along a project network path, which you cannot make about individual tasks. The CCPM plan is a sequence of tasks with variable performance time. CCPM uses project path start dates, and the completion date of the project buffer (Exhibit 7). Project resources status their tasks by estimating the expected time left to complete. The project manager uses buffer signals as the only action criteria. CCPM practitioners view the schedule buffers as probability distributions, with a cumulative probability of 50 percent at the start of the buffer and 90+ percent at the end of the buffer.
Critical chain initially reduces the frequency of Mistake 1 (reacting to common-cause variation by changing the apparent critical path) by defining the critical chain that is not to change. The critical chain definition also reduces resource contention, a frequent cause of changes to the critical path. Although the PMBOK® Guide description of the critical path method encourages resource leveling, my informal research indicates this capability is used on less than 5 percent of project plans. The meaning of the critical path can be ambiguous with resource-leveled schedules and Monte Carlo simulations; but the critical chain is never ambiguous.
Schedule buffers, acting as control charts, reduce the frequency of both Mistake 1 and Mistake 2. Schedule buffers are time allocations in the project plan used to manage uncertainty. A project schedule buffer at the end of the critical chain, sized by using the estimated common-cause uncertainty of the tasks in the chain, enables predicting a high-confidence project completion date. Feeding buffers in the project paths that feed the critical chain, sized the same way, perform the same function for those chains.
Buffer thresholds are set for the project and feeding buffers. These thresholds determine what action management should take and when to initiate that action. The buffer action thresholds are the functional equivalent of the upper control limits in control charts.
Exhibit 6. The control chart shows the data, the mean of the data, and upper (UCL) and lower (LCL) control limits to direct action at the process.
Exhibit 7. Trending of buffer penetration makes it a control chart. The two zones at one-third and two-thirds direct action at the project due to special-cause variation.
Note that experienced project professionals frequently confuse feeding buffers with float or slack in a schedule. They are not the same. Float or slack is an accident of the logic of the plan, having nothing to do with the uncertainty of the duration estimates of the activities that comprise the chain. Feeding buffers are explicit allowances placed in the schedule, based on the common-cause uncertainty in the chain of activities that precede the buffer.
Feeding buffers greatly reduce the need to change the focus of the project team away from the critical chain. However, when a feeding chain has (or is anticipated to have) a significant (that is, special-cause) delay, the feeding buffer will trigger action. Feeding buffers solve the merging path problem addressed by many (usually in the context of PERT), including recent PM Network articles as well as the PMBOK® Guide approach based on Wideman's Project and Program Risk Management: A Guide to Managing Project Risks and Opportunities [1992, PMI].
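The merging path problem can be seen in a small Monte Carlo sketch. It assumes, purely for illustration, that each parallel path has an independent triangular duration distribution (8 to 16 days, most likely 10); the function name and all numbers are hypothetical. The merge point cannot start until the slowest path finishes, so its expected start is later than the expected finish of any single path.

```python
import random
from statistics import mean


def expected_merge_time(n_paths, trials=20000, seed=1):
    """Estimate the mean time at which n parallel paths have all finished."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    merge_times = []
    for _ in range(trials):
        # Assumed path duration: triangular(low=8, high=16, mode=10) days.
        path_finishes = [rng.triangular(8, 16, 10) for _ in range(n_paths)]
        # The merge waits for the latest of the feeding paths.
        merge_times.append(max(path_finishes))
    return mean(merge_times)


single_path = expected_merge_time(1)
three_paths = expected_merge_time(3)
print(f"one path: {single_path:.2f} days, three merging paths: {three_paths:.2f} days")
```

The gap between the two averages is the merge bias. A feeding buffer at the end of each feeding path absorbs this kind of delay so that it does not pass through to the critical chain.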
Some applications of critical chain also use a cost buffer to aid in cost control, sized using the same uncertainty information as the schedule buffers and incorporating action thresholds.
I characterize CCPM as a rough-and-ready approach to managing variation because actual process performance data (estimating process or task performance processes) is usually not used to size the buffers and decision thresholds. The most used buffer sizing method estimates the control action limits, and thus the variance, by using predictions of low-risk and average task performance durations, as described in my book [Critical Chain Project Management, 2000, Artech House]. In Critical Chain, Goldratt recommends a simpler approach: dividing the usual task duration estimates in half to estimate the 50 percent probable time and using half of the activity chain duration as the estimate for the schedule buffer. You could use the PERT three-time-estimate method to size the buffers. You could also use actual task performance data and Monte Carlo simulations to size buffers. The extent to which you use real data as inputs to the simulation and appropriate tools (such as control charts) to minimize Mistakes 1 and 2 will be a competitive edge and future trend in companies that perform projects.
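Two of these sizing approaches can be sketched briefly. The first function follows the halving rule described above, on the assumption that "half of the activity chain duration" means half of the chain built from the halved estimates. The second is an assumption about how low-risk and average estimates might be combined: it takes the square root of the sum of squared differences, which treats the tasks as independent so that their variances add. The article does not give this formula, and the function names and task durations are hypothetical.

```python
from math import sqrt


def half_chain_buffer(usual_estimates):
    """Halve each usual estimate; size the buffer at half the reduced chain."""
    reduced = [d / 2 for d in usual_estimates]  # assumed 50 percent durations
    chain_duration = sum(reduced)
    return reduced, chain_duration / 2


def root_sum_square_buffer(low_risk, average):
    """Size the buffer from per-task differences, assuming independent tasks."""
    if len(low_risk) != len(average):
        raise ValueError("need one low-risk and one average estimate per task")
    return sqrt(sum((lr - avg) ** 2 for lr, avg in zip(low_risk, average)))


# Hypothetical three-task chain, durations in days.
usual = [10, 20, 10]
reduced, simple_buffer = half_chain_buffer(usual)
rss_buffer = root_sum_square_buffer(low_risk=usual, average=reduced)
print(f"chain={sum(reduced)} half-chain buffer={simple_buffer} "
      f"root-sum-square buffer={rss_buffer:.1f}")
```

For a short chain the two rules give similar buffers. As the number of tasks grows, the half-chain buffer grows in proportion to chain length while the root-sum-square buffer grows more slowly.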
Effective CCPM tracks the trends of buffer penetration (Exhibit 7). Buffer penetration is the ongoing forecast of how much of the buffer will be used. You calculate buffer penetration by adding the estimate to complete the currently working task to the estimate for the tasks not yet started. The buffer-tracking chart indicates management action thresholds for planning and taking corrective action. These thresholds, comparable to the control chart UCL, reduce the frequency of Mistake 1 and provide reasonable assurance against Mistake 2. Buffer tracking preserves the time history of the data, an important advantage of control charts over significance test statistics (which Deming abhorred).
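A minimal sketch of the calculation follows. It assumes that penetration is expressed as the fraction of the buffer the current forecast would consume, and that the one-third and two-thirds zones of Exhibit 7 map to "plan" and "act" signals. The function names, the signal labels, and the sample numbers are hypothetical.

```python
def buffer_penetration(elapsed, remaining_current, not_started,
                       planned_chain, buffer_size):
    """Forecast the fraction of the buffer the chain will consume."""
    # Forecast finish: time already spent, plus the estimate to complete the
    # working task, plus the estimates for tasks that have not yet started.
    forecast_finish = elapsed + remaining_current + sum(not_started)
    return (forecast_finish - planned_chain) / buffer_size


def buffer_signal(penetration):
    """Map penetration to an action zone (thresholds assumed at 1/3 and 2/3)."""
    if penetration < 1 / 3:
        return "no action"
    if penetration < 2 / 3:
        return "plan"
    return "act"


# Hypothetical chain: 20 days planned, 10-day buffer, two 5-day tasks left.
history = []
for elapsed in (8, 12, 15):
    p = buffer_penetration(elapsed, remaining_current=4, not_started=[5, 5],
                           planned_chain=20, buffer_size=10)
    history.append((elapsed, p, buffer_signal(p)))

for elapsed, p, signal in history:
    print(f"day {elapsed}: penetration {p:.0%} -> {signal}")
```

Keeping the full history rather than only the latest value is what lets the buffer chart show a trend, in the same way a control chart does.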
Note that CCPM does not provide a method for dealing with discrete project risks. As described in Part 1 of this article, those are handled well by the existing approach to discrete (special-cause) risk management in the PMBOK® Guide and in Wideman's Project and Program Risk Management.
Recommendations. My reasoning in this article leads to these additional recommendations to reduce Mistakes 1 and 2:
■ Use the critical chain instead of the critical path to define project lead-time.
■ Do not change the critical chain during project execution.
■ Insert a project buffer at the end of the critical chain to absorb common-cause variation.
■ Insert feeding buffers in chains that feed the critical chain to protect the project critical chain from the merge bias.
■ Use buffer penetration as the project measurement, and take management action only when signaled to do so by buffer penetration.
■ Use project risk management for discrete risks only.
■ Continuously improve your project execution process through application of sound scientific methods, including the application of control charts for project time and cost prediction and control.
THE RECOMMENDATIONS I'VE PROVIDED will reduce project variation, giving you a competitive edge and greatly improving project success. To put even greater quality into your risk management, you should work to further reduce variation by improving the processes for task estimation and performance.
PM Network March 2001