Introduction
Any activity is subject to some risk(s), whether it is getting up in the morning, driving to work, or attempting to develop a new product for sale to a customer. Many use the term risk, but what is a risk? How do we determine what the risks are for an activity? How do we decide what to do about risks that we identify? How do we track the risks that we identify? All these questions and many others arise as a program manager tries to employ a risk management process. This paper attempts to address these questions and others by identifying tools and techniques for:
1) Developing risk evaluation criteria specific to a product/program,
2) Holding a risk identification session using techniques proven successful on multiple commercial and military product development programs,
3) Evaluating the risks against the product/program specific risk evaluation criteria without being overwhelmed by analysis,
4) Applying risk response strategy(ies) and developing risk response plans to address a given risk item,
5) Monitoring and controlling the risk response plan execution process through the use of trigger events, Red/Yellow/Green (R/Y/G) summaries, etc. to minimize cost while maximizing impact.
Risk Management Process
Understanding Risks
The first step in applying any risk management process is understanding what a risk is. A Guide to the Project Management Body of Knowledge (PMBOK®), 2000 Edition defines a risk as an uncertain event or condition that, if it occurs, has a positive or negative effect on a project objective. Two points follow from this definition. First, a risk is not an event or occurrence that has already befallen a project; it is an event that might happen. Second, a risk can have a positive impact or a negative impact. Many tend to focus only on risks that will have a negative impact. A wise program manager seeks to identify both the positive and the negative.
Risks are composed of three elements: the risk event itself, the consequence or impact of the risk event occurring, and the likelihood or probability of the risk event occurring. Lacking a clearly defined risk event, it is impossible to completely understand the concern. A clearly defined consequence is vital to ensure all understand the ‘so what’ of a risk. Only by understanding the likelihood of a risk to some degree can a team know how important the risk is to the overall program outcome. At the same time, all must understand that the likelihood of a risk event must be a probability of less than 1.0. Team members often try to associate risk with something that has already occurred (i.e. likelihood = 1.0). An event that has already occurred is an issue, not a risk. A risk has the potential to occur; it has not actually occurred.
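The three elements and the less-than-1.0 rule can be sketched as a small data structure. This is an illustrative sketch only: the class name `Risk`, its field names, and the choice to also reject a likelihood of zero are assumptions, not part of the PMBOK® Guide definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """One risk item: an event, its consequence, and its likelihood."""
    event: str         # what might happen
    consequence: str   # the 'so what' if it does happen
    likelihood: float  # probability of occurrence

    def __post_init__(self):
        # A likelihood of 1.0 means the event has already occurred,
        # which makes it an issue rather than a risk.
        # Rejecting 0.0 as well is an assumption of this sketch.
        if not 0.0 < self.likelihood < 1.0:
            raise ValueError("likelihood must be above 0 and below 1.0")


example = Risk(
    event="the price of unobtanium increases by 20%",
    consequence="the total cost of the product increases by 10%",
    likelihood=0.3,  # hypothetical value
)
```

Attempting `Risk("supplier went bankrupt", "no source for part", 1.0)` raises a `ValueError`, which is a prompt to log the item as an issue instead.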
With a new team, it is often helpful to define three different ‘types’ of risks: known, unknown, and unknowable. These three risk types can be defined as:
- A known risk is one that is recognized by a number of people involved in a project and is evident early in the project planning activity
- An unknown risk is one known by a very limited group of people and is not recognized in the project planning activities
- An unknowable risk is one that is totally unexpected and virtually impossible to foresee (e.g. the blackout of 2003)
Identifying the ‘unknown’ risks is one of the major goals of a risk management process. The unknowable risks are just that: impossible to predict. Documenting the known risks and capturing as many of the unknown risks as possible reduces the number of surprises on a project, allowing time to address the ‘unknowables’.
The Risk Management Process
The PMBOK® Guide defines a risk management process as the “systematic process of identifying, analyzing, and responding to project risks”. The model for the risk management process is shown in Exhibit 1.
Although obviously technically correct, this model includes both qualitative and quantitative risk analysis and lacks any type of feedback loop, a vital part of any risk management process. A modified model of the risk management process is shown in Exhibit 2.
This model relies upon a qualitative risk assessment, an approach more likely to be used in the business world than a highly mathematical quantitative analysis. In addition, it adds the necessary feedback loop indicating that risk management is an iterative process.
The elements contained within this model are:
- Risk Management Planning—The initial work performed to identify the risk management approach to be used on the program and the program-specific assessment criteria.
- Risk Identification—The process of identifying the potential sources of risks both initially and on an ongoing basis.
- Qualitative Risk Assessment—The process of actually assessing the risks against the program assessment criteria.
- Risk Response Planning—The creative process of identifying the risk response strategy(ies) that will be used and the detailed risk response plan(s) (what will be done or not done) for each risk identified. This planning includes identifying the trigger event that will cause the risk response plan to be executed.
- Risk Monitoring and Control—The process of monitoring for a risk event occurrence, reassessing the risk (likelihood and consequence) and monitoring the performance of the risk response plan and reporting the results.
Risk Management Planning
Risk management planning is the key to establishing a common understanding of the project's key parameters/metrics, the sensitivity of those parameters, management's risk tolerance, as well as establishing the practical aspects of how the process will work and how the results will be documented and reported. The program manager and a small cadre of key project members can best perform this planning activity.
In the planning process, the key program evaluation criteria need to be agreed to and established. They can be categorized into various impact areas: business performance, product capability, schedule, etc. Once the key parameters are established, the impact of each should be developed. For instance, if a key parameter is weight, then the sensitivity of the product to weight should be established. Stated another way, if a risk event became a reality and increased the weight of the product by 1 kg, would that have a small, medium, or large impact on the product/program? Establishing the high, medium, and low impact values for each parameter provides a common viewpoint for all team members to use as they consider a risk.
Similarly, the overall risk sensitivity of the project should be established. This can be done by determining how the likelihood/probability of a risk event occurring is defined. Is the likelihood of an event considered high if it has a 50% chance of occurring? Low? Medium? The characterization of a likelihood as high/medium/low will vary widely across various industries, companies within an industry and even among various projects within a company.
The risk planning results can be documented in a simple matrix similar to that shown in Exhibit 3 below. This provides a one-page snapshot of the risk planning effort that will be used repeatedly throughout the risk management process.
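A criteria matrix of this kind can be sketched as a lookup of thresholds per parameter. The parameter names and every numeric threshold below are hypothetical placeholders; each program would substitute the values agreed during planning.

```python
# Hypothetical program-specific criteria. Each entry gives the upper bound
# for a 'low' rating and the upper bound for a 'medium' rating; anything
# above the second bound is rated 'high'. All numbers are illustrative.
CONSEQUENCE_CRITERIA = {
    "weight_increase_kg": (0.5, 2.0),
    "schedule_slip_weeks": (2, 8),
    "unit_cost_increase_pct": (1.0, 5.0),
}

# Hypothetical likelihood bands: up to 10% is low, up to 40% is medium.
LIKELIHOOD_CRITERIA = (0.10, 0.40)


def rate(value, bounds):
    """Map a numeric value to 'low', 'medium', or 'high'."""
    low_max, medium_max = bounds
    if value <= low_max:
        return "low"
    if value <= medium_max:
        return "medium"
    return "high"


# A 1 kg weight increase under these assumed thresholds:
print(rate(1.0, CONSEQUENCE_CRITERIA["weight_increase_kg"]))  # medium
# A 50% chance of occurrence under these assumed bands:
print(rate(0.5, LIKELIHOOD_CRITERIA))  # high
```

Keeping the thresholds in one table means every team member rates a 1 kg increase or a 50% chance the same way, which is the point of agreeing the criteria up front.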
Establishing how risks are traded on a likelihood/consequence basis is another part of the risk planning activity. A 3×3 matrix similar to that shown in Exhibit 4 can be used to capture the relative importance of various likelihood/consequence combinations. The agreed relative importance of various likelihood and consequence combinations is captured in this example by the Roman numerals shown in each circle. In this case, risk events with a medium likelihood but a high consequence are considered to be more important than a high likelihood but medium consequence event. How those trade-offs are made is unique to each project, company and industry.
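One way to capture such a priority matrix is a lookup keyed by the likelihood/consequence pair. The ranking below is a hypothetical assignment, with 1 taken to be the most important; the only ordering drawn from the text is that medium likelihood with high consequence outranks high likelihood with medium consequence.

```python
# (likelihood, consequence) -> priority rank, where 1 is most important.
# The specific ranks are assumptions made for this sketch.
PRIORITY = {
    ("high", "high"): 1,
    ("medium", "high"): 2,
    ("high", "medium"): 3,
    ("low", "high"): 4,
    ("medium", "medium"): 5,
    ("high", "low"): 6,
    ("low", "medium"): 7,
    ("medium", "low"): 8,
    ("low", "low"): 9,
}


def priority(likelihood, consequence):
    """Return the agreed priority rank for a likelihood/consequence pair."""
    return PRIORITY[(likelihood, consequence)]


# Medium likelihood / high consequence outranks high likelihood / medium
# consequence, as in the example described in the text.
print(priority("medium", "high") < priority("high", "medium"))  # True
```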
Risk Identification
Risk identification is the next step in the process and it forms the basis for all the future activities. This is the step where the hard work of drawing out concerns, frustrations and risks must occur. It is an activity worth spending considerable time to complete. During this step, the program manager has to work hard to control his/her emotions and let everyone have their say. Program managers may feel as if they are under attack or simply be overwhelmed by the magnitude of the risks being identified. One has to maintain the view that developing a plan to handle these risks now is easier than waiting until late in the program and having a totally unexpected ‘gotcha’ arise.
The appropriate timing for an initial risk identification session can be somewhat tricky to determine, but it should be held early: soon after the basic program requirements, milestone dates, etc. have been outlined, but before the budget and business case are baselined. Done properly, the risk identification session should identify areas that require additional effort, money, time, etc., and these will affect the business case/budget.
Risk identification sessions should be held as separate meetings with a cross-functional group of varying experience levels. By mixing up the functions and the experience levels, the widest array of ideas will be identified. These sessions will also help the team members understand how the various functions ‘fit’ together and the impact at the interfaces, and will help to develop a ‘common language’ for the project. This common language is vital since problems often develop simply because two individuals/groups do not understand the meaning of the other's words.
As part of establishing a common language, the process works best if all the risk items identified are documented in a common fashion such as: If something <good/bad> occurs, then the <TBD program objective> will have something <good/bad> happen. A risk event might read like: “If the price of unobtanium increases by 20%, then the total cost of the product will increase by 10%.” It is important to ensure the ‘so what’ of a risk event occurring is understood. Too often, people will just state a risk like “there is a risk the price of unobtanium will go up”. This is a first step, but all need to understand the resulting consequence. The consequence on the product may be a 0.0001% cost increase or a 10% change. The risk response plan will differ greatly depending upon the consequence.
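The if/then wording can be enforced with a small helper that refuses a risk stated without its consequence. The function name `risk_statement` and its behavior are assumptions of this sketch, not a prescribed tool.

```python
def risk_statement(event: str, consequence: str) -> str:
    """Build an 'If <event>, then <consequence>.' risk statement.

    Raises ValueError when either half is missing, so that a bare
    'there is a risk that...' cannot be recorded without its 'so what'.
    """
    event = event.strip().rstrip(".")
    consequence = consequence.strip().rstrip(".")
    if not event or not consequence:
        raise ValueError("a risk statement needs both an event and a consequence")
    return f"If {event}, then {consequence}."


print(risk_statement(
    "the price of unobtanium increases by 20%",
    "the total cost of the product will increase by 10%",
))
# If the price of unobtanium increases by 20%, then the total cost of the product will increase by 10%.
```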
The best approach is to start a risk identification meeting by defining ‘risk’ in very common words (e.g. “A risk is something that keeps you awake at night.” “A risk is what makes you nervous or uncomfortable about the project”). Do not just recite the PMBOK® Guide definition of risk. Running the risk identification meeting as a brainstorming session is an effective technique. Have each individual privately document the risks they identify on separate “Post-It” notes. After a period of ‘quiet time’, these should each be read aloud to the group, allowing the author to address any questions about their risk item. Allow one risk item per turn to keep all team members engaged, preventing one person from dominating the discussion. “Tickler lists”, lessons learned databases, individual interviews, etc. are also useful tools to help identify risks and often function as thought starters.
In all cases, it is vital to have an on-going risk identification process. This on-going process should include holding risk identification sessions at various points throughout the project, especially as a new project phase is begun. Providing a Risk ID form to all the team members is also helpful for gathering input between risk meetings.
Risk Assessment
The risk assessment step, i.e. evaluating the risk item against the Risk Assessment Criteria, is often initially done as part of the risk identification process. As part of the risk ID meeting, allow the identifier of the risk event to also characterize their risk by placing it on a 3' × 4' version of the Risk Priority Matrix (ref: Exhibit 4). Their assessment can often be ‘inflated’ (i.e. high likelihood, high consequence) relative to the broader project view, but it provides a starting point. Since everyone believes their job/function is most important, people tend to rank those risk events related to their job the highest. This ‘inflation’ is best addressed by repeating the assessment step, maybe with a subset of the team, at a subsequent session once all the risks have been identified. This session can also be used to ‘combine’ similar risks to eliminate redundancies. A program manager must be careful, however, not to unwittingly eliminate a risk as being redundant when in fact there is a nuance that is key.
Risk Event Response Planning
Risk response planning includes two major activities: identifying the risk response strategy(ies) to be applied and creating the plan to implement the strategy(ies). The possible response strategies include:
- Avoidance/elimination—pursuit of a completely different approach to the task thus eliminating the risk.
- Transfer—moving the risk elsewhere (to a supplier, to an insurer).
- Mitigation—developing a plan to reduce the consequence and/or the likelihood of a risk event occurring.
- Acceptance—allowing the risk to remain and dealing with the consequences if it happens.
A certain risk event may require multiple strategies to be applied to sufficiently reduce the likelihood, consequence or both. An approach might include splitting a risk and passing a portion to another company (transfer) and mitigating the remaining risk by reducing the likelihood. The appropriate risk response strategy may change over time due to changing business conditions, a technical breakthrough/failure, or a multitude of other reasons.
With the strategy(ies) selected, the plan to execute the strategy must be developed. This risk response plan should include a definition of what is to be done, by whom, what the schedule is, how the work will be funded, etc. Another key to the plan is establishing a ‘trigger’ for implementing the risk response plan. This trigger might be an event, the passage of a date, or simply the identification of the risk and an agreement to address it immediately. Regardless, this trigger is what causes the plan to be executed. A trigger can also be defined to document when the risk event is no longer a concern, such as the successful completion of a test, selection of a production source, etc.
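A response plan with an execution trigger and a closure trigger might be sketched as follows. The class, the field names, the status keys, and the example values (owner, date, 10% threshold) are all hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable


@dataclass
class ResponsePlan:
    action: str          # what is to be done
    owner: str           # by whom
    due: date            # schedule
    funding_source: str  # how the work will be funded
    trigger: Callable[[dict], bool]  # when to start executing the plan
    closure: Callable[[dict], bool]  # when the risk is no longer a concern


def plan_state(plan: ResponsePlan, status: dict) -> str:
    """Return 'closed', 'execute', or 'watch' for the current program status."""
    if plan.closure(status):
        return "closed"
    if plan.trigger(status):
        return "execute"
    return "watch"


plan = ResponsePlan(
    action="Qualify a second supplier",
    owner="Purchasing",
    due=date(2025, 6, 30),  # hypothetical date
    funding_source="management reserve",
    trigger=lambda s: s.get("price_increase_pct", 0) >= 10,
    closure=lambda s: s.get("production_source_selected", False),
)

print(plan_state(plan, {"price_increase_pct": 3}))            # watch
print(plan_state(plan, {"price_increase_pct": 12}))           # execute
print(plan_state(plan, {"production_source_selected": True})) # closed
```

Checking the closure condition first reflects the idea that a risk which is no longer a concern should not have its plan executed, even if the execution trigger has also fired.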
Risk Event Monitoring and Control
Once the risks have been identified and assessed and the risk response plans generated, the work associated with monitoring and controlling begins. Monitoring and controlling involves three activities: determining if a triggering event has occurred and initiating the response plan when appropriate; monitoring for changes in the environment that lead to changes in the likelihood/consequence of an event; and tracking to ensure the continuing viability of a response strategy/plan. All of this must be tracked and reported through some logical process.
Tracking and reporting on the risk management process can be accomplished using a relatively simple matrix similar to that shown in Exhibit 5. Such a matrix captures all the aspects associated with each risk event (risk item definition, likelihood, consequence, response strategy, response plan, trigger event, closure date, etc.). Recording the risk items within a worksheet program allows the risk item list to be readily sorted and/or filtered. The structure is also simple enough that others will be more inclined to use it. Depending upon the sophistication level of those involved in the tracking process, a more extensive database program can also be used.
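The sorting and filtering that a worksheet provides can be sketched with plain records. The column names and the three example rows are hypothetical and do not reproduce the layout of Exhibit 5.

```python
# Hypothetical risk register rows; one dict per risk item.
REGISTER = [
    {"id": 1, "item": "Unobtanium price increase", "likelihood": "medium",
     "consequence": "high", "strategy": "transfer", "owner": "Purchasing",
     "closed_on": None},
    {"id": 2, "item": "Test rig delivered late", "likelihood": "high",
     "consequence": "medium", "strategy": "mitigation", "owner": "Test",
     "closed_on": None},
    {"id": 3, "item": "Production source not selected", "likelihood": "low",
     "consequence": "high", "strategy": "acceptance", "owner": "Purchasing",
     "closed_on": "2024-03-01"},
]

LEVEL_ORDER = {"high": 0, "medium": 1, "low": 2}


def open_items(register):
    """Filter: keep only items with no closure date."""
    return [r for r in register if r["closed_on"] is None]


def by_severity(register):
    """Sort: highest consequence first, then highest likelihood."""
    return sorted(
        register,
        key=lambda r: (LEVEL_ORDER[r["consequence"]], LEVEL_ORDER[r["likelihood"]]),
    )


print([r["id"] for r in open_items(REGISTER)])   # [1, 2]
print([r["id"] for r in by_severity(REGISTER)])  # [1, 3, 2]
```

Sorting by consequence before likelihood is a design choice of this sketch; a program could equally sort by its agreed priority matrix.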
A risk program summary based upon the structure of the Risk Index Priority Matrix can be used for management reporting. It provides a simple assessment of overall program risk status, as shown in the left-hand portion of Exhibit 6. The number of open risk items in each category is noted in the respective matrix box. The relative importance of the various risk categories can be captured by applying a red/yellow/green (R/Y/G) assignment to the matrix. The assignment of which categories are red vs. yellow vs. green is subjective and will vary by company. This R/Y/G assignment facilitates even further summarization, as shown in the right-hand portion of Exhibit 6 Risk Assessment Summary.
The summary should also include an indication of how many items have not been assessed and how many have already been closed. The former emphasizes the on-going nature of the process while the latter indicates that progress is being made.
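The R/Y/G roll-up, including the counts of unassessed and closed items, can be sketched as below. The color assigned to each matrix cell is a hypothetical choice, since the text notes that this assignment is subjective and varies by company; the record fields are likewise assumptions.

```python
# Hypothetical colour assignment for each (likelihood, consequence) cell.
COLOR = {
    ("high", "high"): "red", ("medium", "high"): "red", ("high", "medium"): "red",
    ("low", "high"): "yellow", ("medium", "medium"): "yellow", ("high", "low"): "yellow",
    ("low", "medium"): "green", ("medium", "low"): "green", ("low", "low"): "green",
}


def summarize(risks):
    """Count open items by colour, plus unassessed and closed items."""
    summary = {"red": 0, "yellow": 0, "green": 0, "not assessed": 0, "closed": 0}
    for r in risks:
        if r["closed"]:
            summary["closed"] += 1
        elif r["likelihood"] is None or r["consequence"] is None:
            summary["not assessed"] += 1
        else:
            summary[COLOR[(r["likelihood"], r["consequence"])]] += 1
    return summary


risks = [
    {"likelihood": "high", "consequence": "high", "closed": False},
    {"likelihood": "medium", "consequence": "high", "closed": False},
    {"likelihood": "low", "consequence": "high", "closed": False},
    {"likelihood": "low", "consequence": "low", "closed": False},
    {"likelihood": None, "consequence": None, "closed": False},
    {"likelihood": "medium", "consequence": "high", "closed": True},
]

print(summarize(risks))
# {'red': 2, 'yellow': 1, 'green': 1, 'not assessed': 1, 'closed': 1}
```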
Based on the various categorizations of the risk items, different graphs and tabulations can be created. For example, a graph of the number of open risk items associated with the various product modules or by responsible function provides a view of the program. Similarly, the total number of risks that are open can be tracked as a function of time. Some example graphs are shown in Exhibit 7. These graphs provide a quick visual summary of the program's risk status and can be very useful in management reviews.
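The tabulations behind such graphs reduce to simple counts. The sketch below computes open risks per module and the number of risks open on a given day; the module names, dates, and field names are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical risk records with the dates each was opened and closed.
RISKS = [
    {"module": "module A", "opened": date(2024, 1, 10), "closed": None},
    {"module": "module A", "opened": date(2024, 1, 10), "closed": date(2024, 3, 1)},
    {"module": "module B", "opened": date(2024, 2, 5), "closed": None},
]


def open_by_module(risks):
    """Count currently open risks for each product module."""
    return Counter(r["module"] for r in risks if r["closed"] is None)


def open_on(risks, day):
    """Count risks that were open on the given day."""
    return sum(
        1 for r in risks
        if r["opened"] <= day and (r["closed"] is None or r["closed"] > day)
    )


print(dict(open_by_module(RISKS)))  # {'module A': 1, 'module B': 1}

# Open-risk count over time, suitable for plotting as a trend line.
for day in (date(2024, 2, 1), date(2024, 2, 10), date(2024, 3, 15)):
    print(day.isoformat(), open_on(RISKS, day))
# 2024-02-01 2
# 2024-02-10 3
# 2024-03-15 2
```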
Summary
Using a risk management process will improve the operation of a program by improving overall visibility, facilitating communication and providing an excellent basis for capturing lessons learned. A successful risk management process involves all the program participants, evaluates risks against established criteria, develops risk response plans in advance of the occurrence and has triggering events identified. By having documented plans and an on-going process, the number of ‘knowable’ risks is minimized, providing more capacity to respond to those that are truly ‘unknowable’. This will increase a program's chances for success and reduce the number of sleepless nights for the program manager.