The root cause projection technique – create useful strategies to mitigate risks
Congratulations! You made a healthy start of managing risks during project initiation and planning. You recognized the importance of doing a risk assessment, and so you took action, rallying your team, setting aside time and taking an exhaustive look at potential risks. Now the hard part—how do you move your team from a list of risks that say, “the sky is falling” toward meaningful risk response and control?
Once the stakeholders and key project members have been convinced to set aside time for a risk assessment, the problem with most assessments is keeping them focused on generating useful information and plans that will endure the ensuing weeks, months and years of the project. Too often a risk analysis is an obligation that, once fulfilled, is soon discarded as either ineffective or irrelevant to the constantly changing landscape of the project.
The most conscientious team may spend the energy identifying and documenting the risks, only to fail when devising meaningful plans to mitigate each risk. Like deer frozen in the headlights of the oncoming automobile, they sense danger but don't know how to get out of the way.
The team that DOES write mitigation plans does so by creating statements that amount to “we will try very hard to prevent this problem from occurring, and, if it does occur, we will work very hard to stay on budget and schedule. And, if that doesn't succeed, we will negotiate with the sponsor to eliminate portions of the project scope.”
The author has found that by adapting a technique used in the Operations and Safety analysis world, “Root Cause Analysis,” and applying it as a future-looking tool, it is possible to generate much more specific risk statements, which lead to very specific mitigation actions. This double hit offers to increase the effectiveness of every project team in the management of risks.
In this paper the author provides an overview of the traditional root cause analysis methodology, including several references to documentation and a tool. The paper then describes the adaptation of this analytical technique, called “Root Cause Projection,” which becomes a tool/technique (in the PMBOK® Guide sense) that applies to risk identification, quantification, analysis, and response planning.
Traditional Root Cause Analysis
The industrial safety industry has long practiced the disciplined investigation of accidents or incidents. All types of incidents occur daily, resulting in serious consequences of harm to human life, damage to the environment, and significant monetary loss.
Incident investigation consists of a careful review of the specific causative actions leading up to the accident and also considers the permitting conditions, which enabled the disaster to occur. The initial set of actions and conditions is determined by quickly surveying the facts and interviewing the parties involved. Further analysis is conducted by methodically pursuing each finding and understanding why it existed, leading to the “unplanned event.”
For example, consider this hypothetical chemical spill that results in blinding a worker. The immediate details of the accident are determined by interviewing the victim and witnesses. A timeline is constructed. First, the worker (our victim) was handling the toxic material. Next a maintenance person entered the area, bumping into the worker, causing him to drop the material. The resulting splash of liquid and vapors created the injury.
Was it an accident? Perhaps, but we cannot stop there because we intend to prevent the risk of future accidents by understanding the cause(s). In response to each fact of the case we begin by asking “why?” (see Exhibit 1). Why was the worker handling the material? Was it in accordance with an established process? Yes, but the worker was not wearing the proper safety equipment. Why not? Did the worker understand what was required? Was the worker properly trained? Yes, but working practices had become lax. Why? Isn't there periodic supervision and auditing of safety practices?
Why did the maintenance person enter the area where toxic materials were being handled? There was a warning light indicating the danger, but it was ignored. Why? The maintenance person had not worked in this building and was familiar with neither the dangers nor the safety procedures. Why?
Such an investigation proceeds until the “Why?” question can provide no further information. Normally this pursuit ends in several standard or recognizable classes of causes that can be categorized as “lack of training,” “lack of policy/procedure,” “disregard of established policy/procedure,” or “insufficient guidance and supervision.” Once these findings are established, appropriate corrective action is relatively easy to determine. This “accident,” the injury, could have been prevented by a combination of policy, training, supervision, and personal attention to detail. Actually any one of these might have preempted the event from occurring.
A Graphical Representation Technique…
The graphing technique used in Exhibit 1 follows several simple conventions. The resulting end state, the undesirable event, is shown in the top box. Each succeeding tier contains boxes that are states (or events) that represent a layer of discovery. There should be at least one event state or “cause” (marked with a “C”), and usually one or more states known as “permitting conditions” (marked with a “P”). It may be useful to represent other key facts or information (marked with an “I”).
For each of these states it is appropriate to explore the reasons for its occurrence or existence. If the state is a desired condition, e.g., “the worker is doing his job” then no corrective action (“N/C”) is needed. Otherwise, a new tier of causes is determined, in answer to the question “Why does this exist?” After some iteration to finer levels of detail it becomes apparent that no additional useful information can be generated and an opportunity for corrective action(s) should become evident. In the example, two significant legs of the analysis trace down to causes related to inadequate supervision of worker performance.
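The conventions above can be sketched as a small tree structure. This is only an illustration, not the notation of Exhibit 1 itself: the class name `State`, the `desired` flag, and the function `corrective_opportunities` are hypothetical, and the tree is a simplified reading of the chemical spill example in the text.

```python
from dataclasses import dataclass, field
from typing import List

# Marker letters follow the conventions described in the text:
# "C" = cause, "P" = permitting condition, "I" = other key information.
# The class and field names are assumptions made for this sketch.
@dataclass
class State:
    label: str
    marker: str = "C"
    desired: bool = False          # True: a desired condition, no corrective action ("N/C")
    causes: List["State"] = field(default_factory=list)

def corrective_opportunities(state):
    """Return the deepest undesired states under `state`, where 'Why?' yields nothing further."""
    if state.desired:
        return []                  # N/C: stop asking "Why?" on this leg
    found = [label for cause in state.causes
             for label in corrective_opportunities(cause)]
    # If no deeper undesired state exists, this state is the opportunity.
    return found or [state.label]

# Simplified version of the chemical spill example from the text.
injury = State("worker blinded by chemical splash", causes=[
    State("worker handling toxic material per established process", "I", desired=True),
    State("worker not wearing proper safety equipment", "P", causes=[
        State("working practices had become lax", "P", causes=[
            State("inadequate supervision and auditing of safety practices", "P"),
        ]),
    ]),
    State("maintenance person bumped into worker", "C", causes=[
        State("warning light ignored", "P", causes=[
            State("maintenance person unfamiliar with the building", "P"),
        ]),
    ]),
])

for label in corrective_opportunities(injury):
    print("corrective action opportunity:", label)
```

Walking the tree this way mirrors the repeated “Why?” questioning: each leg is followed until it reaches either a desired condition or a state with no deeper explanation.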
Notice that in the representation, a significant amount of information can be conveyed. More significantly, its value is found in pointing to specific low-level opportunities for corrective action. Breaking the “chain of events” by altering just one of the identified conditions may have prevented this unfortunate accident.
…Converted to a Forward-Looking Technique
Are you getting a glimpse of the value of applying this as both a risk analysis and a response planning technique? “Project risk is an uncertain event or condition that, if it occurs, has a positive or negative effect on a project objective. A risk has a cause (my emphasis), and if it occurs, a consequence (PMBOK® Guide, p.127).” The safety industry investigates root cause, not for the sake of analysis, but to manage the risk of incidents being repeated in the future. The discipline or “best practice” of “lessons-learned” lies at the heart of this management of risk. If we project the same technique, applied above to a historical event, into a future imagined or potential event, we may apply the same concepts in discovering the roots of a risk. By probing, we are able to generate information that leads to more complete understanding of the risk, and therefore, more focused opportunities to mitigate the risk.
Imagine any undesirable event that may threaten your project's success next month. If you sufficiently understand the causative and permitting factors of this future risk, you have the opportunity to define a very effective risk response strategy—one that doesn't say, “try harder” but surgically cuts at the root cause in a very cost-effective manner.
The technique can be applied simply as described above; however, I will describe several enhancements that take into account the future, or the unknown aspects of a risk. With these, a “Root Cause Projection” technique is defined that integrates into one whole each of the risk management processes, while also including the best practice of lessons-learned.
Applying the Root Cause Projection Technique
Naturally, the more experience we have, the more we practice risk management continually with an “unconscious competence.” We are unaware that we are doing it; we just make the right choices based on previous lessons-learned. Therefore to demonstrate the Root Cause Projection Technique, let's assume that your project is expecting the delivery of an important subsystem from a contractor next month. Your company, unfamiliar with the required technology, relied on an outside party to provide a core subsystem. They are due to deliver it in three weeks, and then you will jointly integrate the final product for delivery in three more weeks. The end date is very important for getting your new product into a competitive market.
Given all of the information currently baselined in your project plan (scope, schedule, budget, quality requirements), you pull the team aside to update your risk identification (PMBOK® Guide 11.2). The team's consensus is that there is a significant schedule risk. Start the Root Cause Projection at this point. You ask your team to imagine this hypothetical but likely situation: with one week remaining before final delivery, your project engineer enters your office ready to quit: “I've had it. We have several ‘show stopper’ technical problems, we're not getting cooperation from the contractor, and frankly the team is getting pretty tired.” The risk is that your project will not go to market next week, resulting in loss of a major income opportunity for the company.
Being proactive, you do not intend for that day ever to arrive, so you invoke the Root Cause Projection Technique (refer to Exhibit 2) as follows. Given the undesired end result stated above, you go to the white board. At the top you draw a box titled “11th hour crisis.” Despite all of your good planning, there is a significant concern that something will raise its ugly head too late in the game for you to respond in time.
Now ask your team the “WHY?” question. We have a good plan. What is it that is going to cause this “meltdown?” You listen and start to write down their concerns. Let's take a minute to imagine their answers. First—the contractor delivers late. Second—the contractor's system has major bugs. Third—our own people wear out and fail to give their best effort. Then you write down one of your own—WHY did my project engineer wait till the last week to bring this problem to me (communication breakdown)?
Next, pursue the analysis of each box. You may brainstorm this with the whole group or start to parcel out specific pieces for individuals to tackle on their own. Keep asking the “WHY” question. Q: Why will the contractor be late? A1: They didn't get our requirements in time. A2: They have other higher-priority jobs than ours. A3: They have problems purchasing some special material or obsolete component. Further pursuit of each of these results in a sub-tree of risk information (suggested by lower-tiered boxes in Exhibit 3).
Here is a difference between traditional root cause analysis and root cause projection. In the former, everything is historical fact. Each cause and permitting condition did exist, and the analysis consisted of a logical “ANDing” of each to understand their interdependence. When projecting to the future however, we are anticipating possible causes, and we need to acknowledge both logical “OR” as well as “AND” conditions. This becomes important later in designing appropriate risk responses.
To represent these differences I suggest using a form of shadowing to differentiate between INDEPENDENT states and DEPENDENT states. As shown in Exhibit 4, it became clear that the top four states were independent of one another. Any one of them COULD occur, and they are therefore “OR” states in a logical sense. Upon conducting further analysis of the “late delivery” state, the three possible scenarios were also found to be independent.
Integrating the PMBOK® Guide Risk Identification Process
The significance of a state being INDEPENDENT is that it is in effect a risk event on its own merits, meeting the same definition of project risk provided earlier. Therefore it needs to be analyzed for its probability of occurrence and a response strategy considered. Here an important benefit is identified. Whatever risk management you have done prior to this day, the identification of a single new risk, the “11th hour crisis,” has led to at least seven new risks being identified for tracking in your project risk database.
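The count of seven can be checked with a small sketch that walks the projection and collects every INDEPENDENT state. The class `RiskState` and the function `independent_risks` are hypothetical names for illustration; the labels paraphrase the team's answers from the example, and the independence flags reflect the description of Exhibit 4 given in the text.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RiskState:
    label: str
    independent: bool = True       # True: an "OR" state, a risk event in its own right
    children: List["RiskState"] = field(default_factory=list)

def independent_risks(state):
    """Collect every INDEPENDENT state below `state` (the top box itself is excluded)."""
    risks = []
    for child in state.children:
        if child.independent:
            risks.append(child.label)
        risks.extend(independent_risks(child))
    return risks

# The "11th hour crisis" example: four independent top-tier states,
# one of which expands into three further independent scenarios.
crisis = RiskState("11th hour crisis", children=[
    RiskState("contractor delivers late", children=[
        RiskState("contractor did not get our requirements in time"),
        RiskState("contractor has other higher-priority jobs"),
        RiskState("contractor cannot purchase special or obsolete components"),
    ]),
    RiskState("contractor's system has major bugs"),
    RiskState("our own people wear out"),
    RiskState("communication breakdown with the project engineer"),
])

new_risks = independent_risks(crisis)
print(len(new_risks))  # 7
```

Each label returned is a candidate entry for the project risk database, to be given its own probability estimate and response strategy.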
Integrating the PMBOK® Guide Risk Analysis Processes
The risk analysis proceeds first as in the historical root cause analysis. Each INDEPENDENT risk event must be fully explored with the repeated questioning of “Why?” Why will this event happen? What would cause this? What existing conditions will permit this? At the upper tiers of such an analysis there is a potential for additional independent, “OR” type events; however, the analysis should quickly narrow to dependent events only, because at some point you are transitioning from evaluating UNKNOWNS to stating the KNOWNS you are already living with. For example, in the “LATE CONTRACTOR DELIVERY” risk, it is a KNOWN that you delivered the requirements late, and you may reasonably expect there to be consequences. In the same example, you may not yet have a known “PARTS OBSOLESCENCE” problem, or know specifically which components are apt to become obsolete.
Why would you bother analyzing a KNOWN tree to any further level of understanding? It might not benefit you on your current project, but here is an excellent opportunity to begin documenting the LESSONS-LEARNED while the memories are fresh. Asking the “Why did this happen?” questions may lead your organization to making some focused improvements in its operating processes to prevent such risks from occurring in the future.
A more precise computation of probability of occurrence of a major risk event can be modeled by estimating the probability of the lower-tiered INDEPENDENT risk events and applying recognized mathematical rules to combine them into a single probability. This approach would be analogous to the Monte Carlo analysis of a schedule network, in which the results are dependent on estimates of minimum, maximum, and most likely values of task duration, for each of numerous nodes in the project network. The author proposes to explore this concept further in a future paper.
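As a minimal sketch of what such a combination might look like, the standard rules for statistically independent events can be applied: the probability that at least one of several events occurs is one minus the product of their complements, and the probability that all occur is the product of their probabilities. The paper does not specify which rules it has in mind, so this is an assumption, as are the function names and every probability value below.

```python
import math

def p_any(probs):
    """Probability that at least one of several independent events occurs ("OR")."""
    return 1.0 - math.prod(1.0 - p for p in probs)

def p_all(probs):
    """Probability that all of several independent events occur ("AND")."""
    return math.prod(probs)

# Hypothetical estimates for the three "late delivery" scenarios in the example.
late_delivery = p_any([0.20, 0.10, 0.05])

# Hypothetical estimates for the other three top-tier states,
# combined with the late-delivery result to give the top-level risk.
crisis = p_any([late_delivery, 0.15, 0.10, 0.10])

print(f"late delivery: {late_delivery:.3f}")
print(f"11th hour crisis: {crisis:.3f}")
```

The combined figure is only as good as the independence assumption and the lower-tier estimates; correlated risks would need a different treatment.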
In available root cause analysis tools such as Decision Systems, Inc.'s “Reason™” product, some use of quantitative analysis has been made. By counting the various states and branches in the model, a percentage computation is made to evaluate the effectiveness of alternative corrective actions. More will be said in the next section.
Integrating the PMBOK® Guide Risk Response Planning Process
The primary value of the Root Cause Projection Technique, as stated in the beginning, is the stepwise refinement of a general risk statement into specific, detailed risk event statements. By analogy, one does not just “build an airplane.” Instead the complex problem is broken down into finer pieces so that it can be understood, and then managed. Here is the first benefit: the project manager has an alternative to accepting proposed risk responses that amount to, “we're going to try harder to prevent this problem from occurring.” In our example, remember the original risk statement was “we may have a major, show-stopping problem the week before project delivery.” Imagine the proposed risk response plan to such a general concern. It may have involved allocating a nebulous schedule or budget reserve, plans to approve overtime bonuses for your staff, or “senior management phone calls to the customer.”
Now there are at least a dozen or more lower-tiered response plans that are focused at specific project WBS elements, and that can be handled by small teams within your project team. In the author's experience this is precisely how a large risk, “the system will fail the customer's at-sea operational test,” was successfully mitigated by a multi-pronged response plan.
Here is a second benefit to the project manager. The interdependence of these component risk events has already been illustrated. Now, the payoff comes because it is not necessary to design and pay for corrective actions for each sub-risk event in the total model. Assuming the interdependence has been modeled correctly, it can be shown the number of required corrective actions is less than the total number of branches, and in theory only one action is required per INDEPENDENT branch.
With this, the project manager has the option to compare proposed risk responses and choose one by weighing its cost against the extent of mitigation it provides.
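One way to picture this trade-off is a sketch that treats DEPENDENT states as an “AND” gate (removing any one condition breaks the chain) and INDEPENDENT states as an “OR” gate (each branch needs its own response). The class `Node`, the function `cheapest_response`, the lower-tier conditions, and all cost figures are hypothetical and are not taken from the paper's exhibits.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    gate: str = "LEAF"             # "OR" (independent), "AND" (dependent), or "LEAF"
    cost: float = 0.0              # hypothetical cost of removing this condition (leaves only)
    children: List["Node"] = field(default_factory=list)

def cheapest_response(node):
    """Return (total_cost, [conditions to remove]) that prevents `node` from occurring."""
    if node.gate == "LEAF":
        return node.cost, [node.label]
    plans = [cheapest_response(child) for child in node.children]
    if node.gate == "AND":
        # Every child must hold for the event to occur,
        # so removing the cheapest one is enough.
        return min(plans, key=lambda plan: plan[0])
    # "OR": any child alone can trigger the event, so each needs a response.
    total = sum(cost for cost, _ in plans)
    conditions = [c for _, chosen in plans for c in chosen]
    return total, conditions

# Two independent branches, each resting on two dependent conditions (all hypothetical).
crisis = Node("11th hour crisis", "OR", children=[
    Node("contractor delivers late", "AND", children=[
        Node("our job is low priority for the contractor", cost=5.0),
        Node("no interim delivery milestones", cost=2.0),
    ]),
    Node("contractor's system has major bugs", "AND", children=[
        Node("no joint integration testing before delivery", cost=3.0),
        Node("ambiguous interface requirements", cost=4.0),
    ]),
])

total_cost, conditions = cheapest_response(crisis)
print(total_cost, conditions)
```

In this sketch four lower-tier conditions are covered by two corrective actions, one per independent branch, which is the saving the text describes.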
Integrating the PMBOK® Guide Risk Monitoring and Control Process
The graphical model created during the risk analysis becomes a tool for conducting the monitoring and control portion of the risk process. Deployed on a large chart on the project team's status board, it serves as a visual reminder of the challenges that lie ahead.
As each sub-risk event is successfully put to rest, the chart serves as a “bogey-board” keeping the score. A satisfying green “X” through a risk path can be a visual reward that encourages the team to take on the next challenge. A red circle around a problematic risk helps to rally the team's focus until that risk is also surmounted.
As the project progresses, the events that were forecasted may transpire differently than expected. As with any written project documentation, the chart can be updated to reflect reality. As is often said, “Whatever is written down can be changed; whatever is not written down changes constantly and is guaranteed to bite you.”
Conclusion—Long-Term Benefits of Integrated Risk Management
The Root Cause Projection Technique need not be a tool held in reserve for the eleventh hour of a project. The project manager can conduct a project kickoff risk identification meeting using this as a device to stimulate creative thinking about a subject that often intimidates and frustrates team stakeholders. Furthermore, senior management, project managers, and sponsors can apply the technique in review of proposed risk management plans, providing constructive criticism and more objective analysis.
The effort spent identifying lower-level risk mitigations is not necessarily wasted, even when the project manager does not select them for this particular project. They can be collected in the risk database as a catalog of mitigation strategies for future use.
Furthermore, with the cataloguing of these mitigation strategies, the history of those actually employed can be documented so as to provide tangible lessons-learned information to other project teams in the project office organization.
REASON® is a root-cause analysis tool published by Decision Systems, Inc. of Longview, TX USA (http://www.rootcause.com)
Proceedings of the Project Management Institute Annual Seminars & Symposium
November 1–10, 2001 • Nashville, Tenn., USA