Testing approach to keep your "Big Bang" system implementation from becoming a bomb
What do you do when three years of development, by up to 25 people, comes together in ONE “big bang” system implementation? How do you plan? How do you make sure that the “little thing” you overlooked two and a half years ago, doesn’t become the nightmare that causes a crash and burn? This paper discusses the innovative approach and management process that “saved us.” While “risky” in a traditional sense, in our case, the results proved worth the risks.
Our approach was to take risks early—a testing approach that let the users get involved in all aspects of testing, see all our warts, and let them help find problems, long before the system was ready for formal testing. It spread the testing process out over a period of months, which forced us to carefully manage user expectations (especially doubts and fears). It also put a premium on our ability to track and manage corrections, such that users didn’t lose faith in our ability to write a functional system, or feel that the testing was a wasted process. It benefited us by helping the team be able to work rational hours when we hit that magical acceptance test and start-up period. In our case, we essentially had one week plus a long weekend “push” to finish testing and problem solving prior to acceptance, and then things proceeded smoothly through a 45-day acceptance test, and a 90-day warranty.
“Big Bang” Implementations
What is a “big bang” implementation? For purposes of this paper, I’m referring to those efforts that involve replacing or implementing a significant set of functionality, with a “hard” conversion and cutover point. In my definition, a “hard” cutover is the scenario where you use the old system until the cutover point, and then it gets turned off for the conversion. The next time you use a system it is in the new environment, and the old system is no longer available. No phasing, no gradual implementation, and if there is any type of “parallel” functionality, it is purely a safety net to prevent catastrophe.
In our project, the big bang was caused by a number of factors, the most significant of which were:
• Y2K transition, which would make the legacy systems nonfunctional, regardless of whether we were ready or not
• Integrated data that needed to converge simultaneously from three legacy systems
• Data interdependency between functional and program areas
• Large volumes of data and number of people using the system, which made any parallel processing impractical, due to issues with maintaining data integrity.
Potential for Crisis
When Compaq and the Commonwealth of Pennsylvania Department of Environmental Protection (DEP) planned the eFACTS Project (Environmental, Facility, Application, Compliance Tracking System; previously known as FIX), we realized that a traditional approach to system testing might lead to a crisis period that would jeopardize the overall project schedule and success. The project, developed over three years by up to 25 people, formally required one system test and implementation period at the end. Compaq and DEP agreed that testing that late in the process was asking for trouble. The team was doubtful we’d finish in time for the Y2K event. The original schedule, a very traditional waterfall approach, did not include any buffer or contingency time. If the project went off track, or any major problems occurred in testing, there wouldn’t be time to resolve them before January 1, 2000. In addition, the Pennsylvania Governor’s office was seriously concerned with our progress, and required every effort to ensure that the Commonwealth transitioned smoothly into the new millennium.
Response to Threat
The original project plan called for the team to develop Requirements for all three phases, then proceed to a Design phase for all three, and only then begin the Development phase. This meant that for over two years no coding would be completed: just documents. Staffing ramp-up would occur at Development, and the core Design Team would carry the load for an extended period of time.
The obvious answer was to accelerate any staffing and work that could reasonably be moved to occur sooner, rather than later. Since the project was already structured with three major sub-phases, the first step was to revise the plan such that as requirements were completed for each subphase, we would proceed into design and development, even if we couldn’t complete testing and implementation until the end. This led to a plan that brought developers, as well as the documentation and training team onto the effort sooner.
Reviewing the schedule that resulted from this acceleration, we discovered potential under-utilized time for the new team members as they completed one phase of development and waited for the next set of requirements. Since the goal was to make use of as much time as early in the schedule as possible, we looked for other activities that could be accomplished. The remaining major activities were testing and training. Training was quickly ruled out—we were certain that teaching 500 people functionality they wouldn’t use for over a year would be a waste of time.
That left testing. We had a core team of six user representatives responsible to work with us on the project. They were drawn from the DEP functional areas, and were our “subject matter experts.” The user representatives also were responsible for developing the acceptance test plans and their eventual execution. This was going to be a challenge to them, since their background was not in computer system development or application testing. We had held some preliminary discussions on developing test plans from specifications, and although we had set and agreed on schedules for the initial draft test plans, there was no evidence of progress. With over 100 modules that would need to be tested, this didn’t look promising.
We decided to address the lack of test plan progress and the developer utilization by releasing modules for early testing. This would give the user representatives a better basis for developing test plans, and provide work for the developers in correcting any problems that surfaced rather than waiting for the official test period. We didn’t change any plans for the official testing: we just expanded the unit, integration, and system testing team to include the user representatives.
In summary, the key components of the revised strategy were:
• “Rolling” design and development: as requirements were completed for a subphase, it would move directly into design, and then into development while the next design got underway. This was a significant departure from the plan to have all designs complete before any development was done.
• Early testing as work was developed. While final integration and security testing would need to be delayed until all work was complete, as soon as a module passed basic unit testing, it was “released” for the user representatives to work with and test.
Testing Throughout the Lifecycle
While this paper focuses on the testing as related to actual application code, our testing process followed the lifecycle. It began as the users worked with us to review requirements, and “tested” our understanding of their needs. It continued through screen and report design, with team “testing” of whether the design would meet their business needs. Testing continued through development, culminating in the formal acceptance test process. Specifically, our test approach included:
• User “hands-on” process testing during requirements and design
• QA checklists and tests for a module’s conformance to standards and requirements
• User testing of modules as soon as they passed unit testing
• Test data conversions and error reports of the full database
• User developed test plans
• Ongoing user testing of modules as they were integrated into the total system
• Overall system testing
• Statewide acceptance testing (over 3,000 tests)
• Comprehensive problem review, verification, correction, QA and closeout process.
Test Team Structure
Up until the replanning process, the user representatives worked independently, only coming together as a group when we called review meetings. Unfortunately, since the project was “part-time,” their deliverables, especially the acceptance test plans, were falling behind. To address these concerns, as well as the increased complexity with early testing, DEP made two significant changes in structure that were critical to our testing success:
1. The User Representatives were literally taken out of their normal work place and co-located as a “Tiger Team,” in a “Tiger Den,” so that it was clear to line management, as well as all concerned, that their time was now 80% dedicated to the project. For those that continued with other responsibilities, they were officially allocated Monday mornings and Friday afternoons to deal with them.
2. DEP identified and put in place a Team Leader, a.k.a. “Head Tigress” for the team. She brought systems development and testing background to the team, and taught the user representatives acceptance testing. It was her responsibility to track and manage the team’s performance for their various deliverables, with emphasis on the Acceptance Test Plans and testing feedback.
Training and Support
In our test process, the user representatives needed additional training and support, beyond that for a more traditional acceptance test. Since they were exposed to early software versions, they needed training in diagnosing a problem: whether it was a data issue, a test process issue, a specification issue (we gave them what they asked for, but not what they needed), or an application issue (not performing to specification). They needed to clearly understand their role in testing, the overall development and test process, and how to record appropriate information for developers to reproduce errors. They also needed enabling support: equipment, working space, connectivity to test systems and databases, and security access. Once testing was underway, the team needed ongoing support to address their usage questions.
Revised Plan Results Summary
Final acceptance testing and implementation remained at the project end (still a “big bang”), but there was a steadier workload in reaching that point, rather than a steep ramp. By releasing the subphase work as completed, the designers got feedback on their efforts (closed the loop), since they could learn whether the design met user expectations, and see any programming issues in trying to build to the design. User representatives were able to see how their needs were transformed into the application, provide timely feedback on whether those needs would be met, and learn the implications of the requirements they gave us for subsequent phases. The Project Managers had the ability to accept critical changes and make functionality tradeoffs, since calendar time was available for flexibility. Overall, we not only met the Y2K deadline: the system was implemented and the 90-day warranty was completed in early December 1999.
Scope Management for Extended Testing
One consequence of the early testing was an increase in change requests. As user representatives worked with the system, they generated ideas on improvements and wanted more functionality. Our approach to scope management was conscious and with a firm hand. All changes, no matter how trivial, were referred to the Project Managers. Either could reject a change, as not an urgent business need, not important enough to divert resources to address, or just too much effort given the overall project schedule. The general decision process was one of considering several key questions:
• How important is this?
• Is it critical to make sure the data is correct?
• What would be the consequences of postponing the change until maintenance?
• What is the overall impact to the schedule and are resources available?
• What other consequences will the change have—such as modifying completed training courses, re-work of other screens, etc.?
Changes that required data structure modifications, while complex, often fell into the category of critical for data integrity and data conversion, and not something that could be delayed. Changes of style, screen layout, and almost always of report layout, fell into the category of future enhancements. All changes were evaluated in terms of impact, with an eye to making tradeoffs rather than increasing scope and/or complexity. Given the overall schedule, and the issues involved with adding team members, the change control goal was to ensure delivery of a functionally sound system, with the least amount of change. One significant change was accommodated through a change order. Given the additional work magnitude (the change included conversion of an additional legacy system and its functionality), the team was expanded significantly and restructured to include the new work.
Benefits for Tiger Team
The early testing provided the Tiger Team with tangible benefits:
• Up until this point, they had been trying to write test plans working from specifications. Having screens and reports available made it much easier to write the level of test plan desired (keystroke by keystroke).
• User representatives were able to understand the workload that testing would place on the regions when we went to full-scale testing, and develop a realistic test schedule for acceptance.
• Team members learned what the modules would provide in production from a “hands-on,” experiential perspective.
• System was tested with actual data (we were working on a copy of the database also being used to test conversion), and a wider variety of scenarios than a typical acceptance test would have included.
Benefits for Development Team
The early testing provided the Development Team with related benefits as well:
• Support for testing the functionality of modules and their integration, as they became available
• Improved integration testing
• Test approaches and scenarios that matched user expectations
• Leveled the workload more evenly across the available calendar time
• Increased confidence in readiness for acceptance testing.
Potential Problem Tracking and Resolution
With this type of prolonged test, tracking and bringing to resolution all potential problems is crucial. The testers need to know that their work is valued and that problems will be addressed, and they need feedback when a module is performing to specification but the user misunderstood that specification. We started tracking with a process relying on shared documents for each module and, by the time we entered acceptance testing, had significantly refined and improved the process, using two coordinated Access databases.
Shared Document Tracking
During pre-acceptance testing, users reported their problems in Word documents, one for each module. Although the theory was that each user reviewed the open issues, added to the end, with date and other relevant information, and the technical team added notes and comments as problems were addressed, the practice showed more “creativity.” Significant issues that we encountered with the “running document” approach were:
• Undated comments, so it was hard to track the problem versus various versions of the forms that we thought had fixed the problem.
• Randomly placed additions to the Word document. The QA team had to “hunt” to make sure that all feedback was addressed, since they couldn’t reliably just look at the beginning or end of the document.
• Duplicate reports of the same problem, since users didn’t search each other’s feedback, and sometimes didn’t realize that problems were duplicates.
• Problems reported for different modules, since the documents were based on the “module test plan,” not the module being tested. This didn’t provide the developers one place to look for issues with any given module. For instance, in order to test the Oil and Gas Inventory Details, the user would navigate through the Facility Query Module. If there was a bug observed in the Facility Query Module, the user would report it in the Oil and Gas Inventory Details document, since that was the test plan they were considering.
• The documents were subject to version “wipe-outs.” This would occur when a user would copy the document to their PC, make edits, and then copy it back to the shared directory, maybe a day or two later. Unfortunately, others may have been updating the shared copy, and their updates would be lost.
Exhibit 1. Development Problem Log Database Structure
Problem Log Database for Development Team
The development team had planned to use a tracking database all along but, as with many plans, good intentions were not put into practice right away. Eventually, we got the development Microsoft Access database for problem tracking in place, and included in it the user test results from the various documents. Regardless of the user tracking method, the development team needed the structure to assign work, track results, and verify completion.
The problem log database developed to track reported problems, their status, and resolutions provided an integrated structure to make it easy to enter data, update it, search, and print reports for assorted reasons. The structure of the database is shown in Exhibit 1.
The key table with the details on all reported potential problems was the Problem Log. As shown in the exhibit, it included information such as the module, both a short name and problem description, information on when and who reported the issue, and status values, for both problem type and status of any fixes. Classification data was placed in related tables, to make it easy to ensure data integrity for reports and searching (value driven rather than free form input). The database and its structure provided the ability to:
• Track open problems, in total and by person
• Track problems that needed QA for release to the users
• Track problems that were resolved and needed resolution distribution to the users
• Track modules that weren’t even released, so that when users reported problems with them we could easily say “not yet”
• Generate assorted reports, both detail and summary for tracking and follow-up.
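The structure described above, a central problem log whose classification values live in related tables, can be sketched in a few lines. This is only an illustration under stated assumptions: the project used Microsoft Access, while the sketch substitutes Python's built-in `sqlite3` module, and every table name, column name, status code, and sample record is hypothetical rather than taken from Exhibit 1.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not copied from Exhibit 1.
# Classification values sit in their own tables so that the log can only
# hold known values (value driven rather than free-form input).
SCHEMA = """
CREATE TABLE problem_type (code TEXT PRIMARY KEY);
CREATE TABLE fix_status   (code TEXT PRIMARY KEY);
CREATE TABLE problem_log (
    id           INTEGER PRIMARY KEY,
    module       TEXT NOT NULL,
    short_name   TEXT NOT NULL,
    description  TEXT NOT NULL,
    reported_by  TEXT NOT NULL,
    reported_on  TEXT NOT NULL,
    assigned_to  TEXT,
    problem_type TEXT NOT NULL REFERENCES problem_type(code),
    fix_status   TEXT NOT NULL REFERENCES fix_status(code)
);
"""


def build_log():
    """Create an in-memory problem log with assumed classification values."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the lookup tables
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO problem_type (code) VALUES (?)",
                     [("application",), ("data",), ("specification",)])
    conn.executemany("INSERT INTO fix_status (code) VALUES (?)",
                     [("open",), ("needs_qa",), ("resolved",)])
    conn.commit()
    return conn


def report_problem(conn, module, short_name, description, reported_by,
                   reported_on, problem_type, assigned_to=None):
    """Add a newly reported problem; it always starts with status 'open'."""
    cur = conn.execute(
        "INSERT INTO problem_log (module, short_name, description, "
        "reported_by, reported_on, assigned_to, problem_type, fix_status) "
        "VALUES (?, ?, ?, ?, ?, ?, ?, 'open')",
        (module, short_name, description, reported_by, reported_on,
         assigned_to, problem_type))
    return cur.lastrowid


def set_fix_status(conn, problem_id, status):
    """Move a problem to another status from the fix_status table."""
    conn.execute("UPDATE problem_log SET fix_status = ? WHERE id = ?",
                 (status, problem_id))


def open_by_person(conn):
    """Count open problems per assignee, as (person, count) pairs."""
    rows = conn.execute(
        "SELECT COALESCE(assigned_to, 'unassigned') AS owner, COUNT(*) "
        "FROM problem_log WHERE fix_status = 'open' "
        "GROUP BY owner ORDER BY owner")
    return rows.fetchall()
```

Because the classification columns reference lookup tables, a misspelled status or problem type is rejected at entry time instead of silently skewing a report, which is the data-integrity benefit the paragraph above describes.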
One of the most valuable reports was the individual error log report, which was handed to a team member for resolution. Team members often noted test cases on them, and the reports were passed to QA for testing. As part of the final “push” through integration testing, we reprinted the entire set of problems reported to date, passed them out to the development team, and retested everything that had ever been fixed to make sure we hadn’t subsequently broken it. This process caught a few additional problems, and increased our confidence level entering acceptance testing that the system was solid and ready to go.
By tracking the detail in the problem log, including whom an issue was assigned to, its status, and QA dates, we were able to confidently report on progress to the users, and release updated versions with specific information on what problems they addressed. The results: users who were favorably impressed that all the problems they had previously reported were taken care of, and an acceptance testing process that was shortened from the original schedule to put the system in production 15 days early.
Acceptance Testing User Database
At the start of acceptance testing, the user team implemented a database to track their error reports. Although on the surface using the same database as development might have made sense, we realized that the problems to be tracked and the information about them varied by team:
• The user representatives needed to know which module’s test plan had a problem. Developers needed to know which module(s) caused the problem.
• The user representatives needed to track all failed tests, for whatever reasons. These included problems like bad data, errors in the test process itself, user execution errors (not following the process), and potential application errors. The developers were concerned with only the problems that the user representatives felt were application problems, as well as any found outside the official testing (such as by trainers or other members of the development and QA team).
• Closure meant different things depending on the perspective: For the user representatives, the problem was closed when the acceptance test passed. For the developers, closure with respect to additional work occurred when the module was re-released to the users for testing, with a separate status to track user representative concurrence with that closure. The developers needed to track different status details on the correction process: reported, verified, assigned, QA, ready for release, and released for testing.
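The two notions of closure can be made concrete with a small sketch. It is illustrative only: the status names come from the list above, but the strictly linear ordering, the class and method names, and the rule that users can concur only after re-release are assumptions made for the example, not a description of how either database was built.

```python
from dataclasses import dataclass
from enum import Enum


class DevStatus(Enum):
    """Developer-side correction statuses, in the order the paper lists them."""
    REPORTED = "reported"
    VERIFIED = "verified"
    ASSIGNED = "assigned"
    QA = "QA"
    READY_FOR_RELEASE = "ready for release"
    RELEASED_FOR_TESTING = "released for testing"


# Assumption: a problem moves through the statuses strictly in this order.
ORDER = list(DevStatus)


@dataclass
class Problem:
    short_name: str
    status: DevStatus = DevStatus.REPORTED
    # Tracked separately: user agreement is not the same as developer closure.
    user_concurred: bool = False

    def advance(self):
        """Move to the next developer status; fail if already at the end."""
        position = ORDER.index(self.status)
        if position == len(ORDER) - 1:
            raise ValueError("problem is already released for testing")
        self.status = ORDER[position + 1]
        return self.status

    def developer_closed(self):
        """For developers, work is closed once the fix is re-released."""
        return self.status is DevStatus.RELEASED_FOR_TESTING

    def record_user_concurrence(self):
        """For users, closure comes only when the retest passes."""
        if not self.developer_closed():
            raise ValueError("fix has not been released for testing yet")
        self.user_concurred = True
```

Keeping `user_concurred` apart from `status` lets each team report on its own definition of “done” without overwriting the other's.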
During acceptance testing, the databases were not linked. Errors were faxed to the development team, entered into the database, and filed. When releases were issued (standard time each day), a fax closure report was sent to the users. These reports were then signed and faxed back to the development team to indicate agreement when the users completed retesting. The Comptroller’s office and the contract required this level of documentation, so the process was set up to ensure that both the contract and coordination needs were met.
Risks Inherent With Early User Testing
With early user testing, we added significant perception risks to the project, and had to deal with the user representatives’:
• Lack of confidence in development, since they saw all our “warts”
• Concern that we weren’t listening, since problem fixes were scheduled in with the regular development, and not an immediate priority
• Skepticism on our readiness for the official acceptance testing, since some of the module updates weren’t released with all the bug corrections until we finished integration testing, particularly those with significant interdependencies
• Concern that they were “doing the development team’s work.”
We controlled the impact by both recognizing and discussing these concerns. We provided ongoing feedback on process, progress, and plans. In reality, it wasn’t until we started the formal acceptance testing, and users saw how smoothly it went for their testers, that they really believed the previous reassurances and that the time they invested was worth the return.
In addition to the perception issues, other issues that required management attention included:
• Users writing tests to match module behavior rather than specified functionality.
• Interdependency problems found during early integration testing that caused rework of specific test steps and expectations.
• Additional coordination and management with users: releasing versions for testing, testing fixes, communicating with users.
• Additional management of development team: balancing time to address problems with time to develop new modules. We solved this by “dedicating” both resources and time period between the two tasks, so that the focus for any given person and day was clear, not a constant “churn.”
• Expectation management with users: that problems may or may not be addressed as soon as discovered, and that this didn’t mean they didn’t count, just that the overall work needed to be managed.
• Dealing with unpredictable user reactions to problems: user representatives were unable to distinguish fundamental design issues from minor bugs, so overreaction was common in the early testing.
Would We Do It Again?
Yes, we would do this again, and are planning the same process for the current projects. This process brought everyone into a team, working toward the same goal. While acceptance testing is about trying to break the system before it goes into production, to the extent that everyone understood our goal was a high quality implementation, this process reinforced that:
• We’re in it together.
• User representatives were key team members, not just bystanders.
• User representatives own the results in the field.
Overall, our approach and project succeeded: in part due to the test process (an enabler), but fundamentally due to the hard work and dedication of the extended team. This testing approach was one aspect of the whole, but by itself, not the only reason for our success.
Proceedings of the Project Management Institute Annual Seminars & Symposium
September 7–16, 2000 • Houston, Texas, USA