Program execution agility
David Y Ratnaraj, PMP
In the IT industry, it is common to budget a minimum of 20% of development costs for yearly operations and maintenance (O&M) of the system. Of the budgeted O&M cost, industry data show that over 80% is expended on fixing known defects. Many of these defects cause IT system downtime, which cost US $26.5 billion in lost revenue in 2010 (Harris, 2011). Quality is a major issue in the IT industry.
This paper focuses on a multiyear IT project that delivered more than 680,000 source lines of code into production two weeks ahead of schedule, with very high quality. Since production release in 2011, the system has had zero downtime to date (over 46 months). A total of 99.8% of resolved incidents had zero post-delivery issues (first time right). Disciplined development and proactive, preventive maintenance have helped keep yearly O&M costs to ~4.2% of system development costs.
This paper will discuss the following:
- Best practices from agile, waterfall, and other approaches that were integrated into the team's high velocity development practices during the development phase as well as the operations and maintenance phase of the project.
- Optimizing project process for the highest quality with the lowest effort.
- Realistic task planning utilizing historic data and simple statistical analysis and tools.
- Quantitative status reviews and leading indicators.
- Qualitative status reviews and leading indicators.
- Team empowerment and team member accountability.
- Continuous process improvement.
With over 25 years of application development history, our software development process capability has grown through the adoption, customization, and integration of industry-leading practices, including the organization-level SEI Capability Maturity Model Integration (CMMI), the team-focused SEI Team Software Process (TSP), the individual-focused SEI Personal Software Process (PSP), and agile methodologies. Our development processes are continuously enhanced by the engineers who execute them, to improve quality, reduce risk, and shorten schedules (see Exhibit 1).
This paper focuses on a project for modernizing a large mission-critical system, executed under a firm fixed-price contract. Failure to meet schedule and quality goals would result in millions of dollars in yearly expenses for the customer. Despite our history and capability, we needed to execute the project with more agility to meet stringent schedule and cost constraints. The following are key factors that helped the project succeed.
Define project common language
Every project interacts with multiple internal and customer groups, with varying levels of interaction. Customer groups include end users, focus groups, development teams, test groups, management, and so forth. Internal groups include the development team, backend team, business logic team, user experience team, test group, process group, configuration group, network team, systems administration team, and so on. Invariably, there is ambiguity in the terminology used among the different teams: acronyms with varying meanings, differing definitions of task completion, terms that are only superficially understood, and so on. A lack of clarity leads to more assumptions, and assumptions are a leading cause of defects and of discontent among teams.
Additionally, most development teams (vendors) see the different customer groups as sources of input and, in some cases, as having differing goals. Generally, this leads to discontent among the groups and hampers project success. Changing the multi-team, customer–vendor relationship into a unified team, with multiple groups but common goals, fosters open communication, transparency, and cooperation toward project success. Keeping the customer engaged with frequent status reporting, issue resolution, and product demonstrations builds the customer relationship. However, understanding the customer and the customer's goals and using customer terminology are the key steps in changing the customer–vendor relationship into a unified team relationship.
Build and emphasize commitment
Studies have shown that teams progress through the four stages of forming, storming, norming, and performing (Tuckman, 1965). A team's efficiency and productivity are highest when it executes at the performing stage. To get the team to the performing stage quickly, all team members should have a common understanding of the project's goals and objectives, and take ownership of the project.
During the project launch session, the development team members meet with all internal and customer stakeholders. This gives all team members a common understanding and prioritization of customer goals, and of the business impact of the project's success or failure. The team takes ownership of the project by identifying the work products to be developed and defining the processes and work breakdown structure to be used, the in-process goals to be monitored, and the data to be gathered. The work breakdown structure details each task to be executed for the different types of work products and the expected output of each task. The project management roles are divided among the team members, who take on the different roles within the project (Humphrey, 2006).
Plan using experience rather than intuition
Every project is made up of a combination of knowns and unknowns. Current industry practice is to plan using intuition and retrofit the estimates into the expected schedule. Planning by intuition normally assumes peak performance capability rather than nominal performance capability. As a result, most projects start as schedule disasters on day one.
The accuracy of project plans increases when plans are made based on historic experiences and data. These estimates are made by domain experts and are continuously revised based on new lessons learned and new technologies incorporated. A plan based on data and well-defined processes gives the project the best option to be predictable throughout execution and the most realistic option to be successful.
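Planning from historic data can be as simple as fitting a regression to past estimates and outcomes. The sketch below is illustrative (in the spirit of PSP-style proxy-based estimation); the sizes and hours are invented, not project data.

```python
# Hypothetical sketch: projecting effort for a new component from
# historic data using simple least-squares linear regression.
# All numbers are illustrative assumptions, not from the project.

def linear_regression(xs, ys):
    """Least-squares fit y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Historic data: estimated component size (LOC) vs. actual effort (hours)
est_size = [120, 250, 400, 600, 900]
actual_hours = [10, 22, 31, 49, 70]

b0, b1 = linear_regression(est_size, actual_hours)
projected = b0 + b1 * 500  # projected hours for a new 500-LOC component
```

As new components complete, their actuals are folded back into the historic data, so the fit improves as the project learns.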
Plan for quality—quality cannot be added on
Quality in software development is often an afterthought. Most software projects assume that quality is ingrained in the development process, only to encounter major quality issues during the test and deployment phases. Projects without quality plans typically are on schedule through the start of the test phase and then fall behind due to endless cycles of testing and rework. These projects either end up canceled or are delivered with unpredictable quality because of a lack of budget or schedule. “When poor quality impacts schedule, schedule problems will end up as quality disasters” (Seshagiri, 2014, p. 32).
Every developer, as a human being, is fallible and may introduce defects. The development process accounts for this fact and identifies tasks where defects could potentially be introduced. Quality steps are added into the development process to remove any introduced defects as early as possible. Adding quality gates and accounting for anticipated defects in the development phases helps reduce the number of defects passing through and improves the quality of the development process. Quality data analysis and statistical process controls are used to help identify components that may require remedial actions to ensure high quality.
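One minimal form of the statistical process control mentioned above is a control limit on defect density: components whose in-process defect density exceeds the limit derived from a historical baseline become candidates for remedial action. The component names and densities below are assumptions for illustration.

```python
# Hypothetical sketch: flagging components for remedial action when
# their defect density exceeds an upper control limit (mean + 3 sigma)
# computed from a historical baseline. All figures are illustrative.
import statistics

# Historical baseline: defects per KLOC for past, healthy components
baseline = [1.2, 0.8, 1.1, 0.9, 1.0]
mean = statistics.mean(baseline)
ucl = mean + 3 * statistics.stdev(baseline)  # upper control limit

# Components currently in development: name -> defects per KLOC so far
in_progress = {"ui_login": 1.3, "db_audit": 5.9}

# Components above the limit may need remedial action (e.g., re-inspection)
flagged = [name for name, d in in_progress.items() if d > ucl]
```

Deriving the limits from a stable baseline, rather than from the current batch, keeps a single outlier from widening the limits and hiding itself.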
Execute the plan
“How does a project get to be a year late?…One day at a time” (Brooks, 1995). Project-level plans will guide the team to achieve its goal only if the plan is maintained periodically to reflect the current status and list the remaining tasks required to meet the final goal.
To understand current status and maintain the project-level plan, individual team members need to understand the status of their commitments on a daily basis. This requires individuals to track data as they work and to analyze their status daily. Tools capable of data gathering, analysis, and projection help individual developers understand their current status, make projections based on completed work, and define quantitative corrective actions required to complete their assigned work on schedule. An individual plan summary, shown in Exhibit 2, addresses the key information required at the individual level:
- What tasks do I need to work on?
- When should I close each task?
- What are the dependencies?
- Am I making sufficient progress for the week?
Corrective actions taken at the individual level help the individual stay on track and, in turn, help the project stay on track. The team is collectively responsible for supporting one another and ensuring that all team members are able to meet their individual commitments.
The project status dashboard, shown in Exhibit 3, addresses key information at the project level:
- What is the project's current status?
- What is the projected completion date?
- What is required for on-time completion?
- Is the team's actual effort distribution as planned?
The status clearly showed that the project was behind: at this stage, the project was 3.8% behind the baseline project plan. The team made a collective decision not to take corrective action on the project plan until the research, proof of concept, and pilot component tasks were complete. These tasks provided the data the team required to plan the remainder of the project. The corrective actions listed below were identified based on the data and current status:
- Processes were streamlined to minimize non-product engineering effort. Where sufficient information had not been available, generic effort buckets had been added to the plan, and these buckets had become catchalls. Analysis helped the team replace them with well-defined tasks and minimize the remaining effort buckets.
- The development process was revised for new component types. For example, design walkthroughs were added in addition to team inspections.
- The development process was revised for some of the standard component types. For example, unit test case inspection was decoupled from design inspection into a separate task.
- Development process task percentage distribution was revised for some of the standard component types. For example, percentage distribution for the design phase was made different for user interface, middle tier, and database components.
- Each team member committed to expending additional effort over a 12-week period instead of the team adding 1.5 additional resources. The 12-week period allowed each team member the flexibility to plan when his or her portion of the additional effort would be expended.
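The arithmetic behind the last corrective action can be sketched as follows. The team size and hours per week are assumptions for illustration, not figures from the project.

```python
# Illustrative arithmetic: absorbing the work of 1.5 additional
# full-time resources across the existing team over 12 weeks.
# team_size and hours_per_week are assumed values.

team_size = 12
weeks = 12
hours_per_week = 40

total_extra = 1.5 * weeks * hours_per_week      # total hours to absorb
extra_per_member = total_extra / team_size      # each member's share
per_member_per_week = extra_per_member / weeks  # if spread evenly
```

Because the commitment is stated as a total over 12 weeks, each member can front-load or back-load the extra hours around their own task dependencies.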
Project quality is monitored throughout the execution of the project. Poor quality at the component level can cause disasters at the project level. Compromising on quality steps during development—for example, to help with schedule—invariably leads to poor project-level quality and schedule unpredictability during the test and deployment phases. Individual developers are encouraged to strive to develop 100% of their components with zero post-unit test defects (component yield). Based on project data, two key corrective actions were executed:
- Personal review training and mentoring: This action resulted in a 17% reduction in defects found in team inspections, with a projected savings of 3.4% in project effort and ROI of 9.8/1.
- Unit test training: This action increased yield by 12.5%; 92.3% of defects introduced during development were removed during development (before integration or system test).
The biggest success factor is learning and adapting from experience (continuous improvement). Failing to learn from an experience should be considered a worse failure than a failed project. Continuous improvement happens not only at the project or organization level; it should be an integral part of every developer's growth.
Milestone and component postmortems provide valuable input into project and individual process changes. The data gathered are used at the project level to identify project process efficiency and quality improvements or to solidify and create new processes. At the individual level, with assistance from a coach, data are used to help identify current capability and improvement opportunities and to set improvement goals.
The project team adopted, adapted, and innovated on existing organization processes for this mission critical project. Executing the above-mentioned project success factors with discipline, the team delivered the project (680,000 source lines of code) ahead of schedule with high quality, resulting in an annual cost avoidance of $2.5 million.
With detailed planning and tracking, the project plan's earned value deviation ranged from -3.8 to +1.8 during the 157 weeks of the development phase. Exhibit 4 shows the planned versus actual earned value and the earned value deviation for the project.
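Earned value deviation of the kind tracked above can be computed with a simple rule: a task earns its planned value only when it is 100% complete, and deviation is cumulative earned value minus cumulative planned value. The task data below are hypothetical.

```python
# Illustrative sketch of earned value tracking: each task carries a
# planned value (per cent of total plan), a planned completion week,
# and an actual completion week (None if not yet complete).
# Task data are invented for illustration.

tasks = [
    # (planned_value, planned_week, completed_week)
    (2.0, 1, 1),
    (1.5, 1, 2),
    (3.0, 2, 2),
    (2.5, 2, None),  # not yet complete
]

def ev_deviation(tasks, current_week):
    planned = sum(pv for pv, pw, _ in tasks if pw <= current_week)
    earned = sum(pv for pv, _, cw in tasks
                 if cw is not None and cw <= current_week)
    return earned - planned  # negative means behind plan

deviation = ev_deviation(tasks, current_week=2)
```

Because value is earned only on full task completion, a one-day slip in any task shows up in the deviation immediately, which is what makes the tracking sensitive enough to detect a single day of schedule slip.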
Precise and accurate data tracking provided the ability to detect a one-day slip in schedule. This detailed insight helped the team deliver initial operational capability (IOC) two weeks ahead of schedule and the full operational capability (FOC) four weeks (2.5%) ahead of schedule, against the industry average of 27% behind. During the entire development phase, 90% of the milestones were delivered on or ahead of schedule. Optimized development processes allowed for 83.2% of effort to be expended toward product engineering tasks.
With quality process steps and continuous improvement activities, the product was delivered with defect density of 0.97 defects per 10,000 source lines of code. A total of 92.3% of components developed by the team had zero defects after unit test. This high quality was achieved with 34.9% cost of quality (effort in appraisal, failure and prevention tasks), against an industry average of >50%. Since production release in September 2011, the system has had zero downtime as a result of software defects.
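The reported defect density is straightforward to reproduce. The defect count below is inferred from the reported density and size, purely as an arithmetic check, not as a figure stated in the paper.

```python
# Arithmetic check of the reported defect density. The defect count
# of 66 is inferred (0.97 per 10 KSLOC across 680 KSLOC), not a
# figure reported directly by the project.

sloc = 680_000
defects = 66  # inferred, illustrative

density = defects / (sloc / 10_000)  # defects per 10,000 SLOC
```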
In the operations and maintenance (O&M) phase, over 89% of O&M spending goes toward system feature enhancements. Exhibit 5 shows the build quality of the maintenance releases. Of the 385 incidents resolved to date, a defect was identified in only one incident resolution.
The success factors listed above are proven best practices across various industries. The team's success is directly attributable to the team members' skills and disciplined execution of the success factors. Most organizations have well-defined processes in place, and processes executed with discipline produce predictable results. Unfortunately, disciplined execution of processes is the first thing to be sacrificed when projects face schedule or budget challenges. In the 2013 CHAOS Manifesto, the Standish Group reported that 61% of IT projects were not successful (The Standish Group, 2013). Quality issues cost US $26.5 billion in lost revenue (Harris, 2011). For your project to be successful, incorporate these success factors and execute with discipline!
“Success is understanding and adapting from experiences. Excellence is making success repeatable.”
Brooks, F. P. (1995). The mythical man-month: Essays on software engineering. Boston, MA: Addison-Wesley.
Harris, C. (2011). IT downtime costs $26.5 billion in lost revenue. InformationWeek.
Humphrey, W. S. (2006). TSP: Leading a development team. Upper Saddle River, NJ: Addison-Wesley.
Seshagiri, G. (2014, May/June). If it passes test, it must be OK: Common misconceptions and the immutable laws of software. CrossTalk, 31–35.
The Standish Group. (2013). CHAOS manifesto 2013: Think big, act small. The Standish Group International Inc. Retrieved from https://larlet.fr/static/david/stream/ChaosManifesto2013.pdf
Tuckman, B. (1965). Developmental sequence in small groups. Psychological Bulletin, 63(6), 384–399.
© 2015, David Y Ratnaraj
Originally published as a part of the 2015 PMI® Global Congress Proceedings – Orlando, Florida, USA