Running IS maintenance as a project
Of all the exciting and challenging Information Technology (IT) projects you can manage during your career, setting up and managing IT maintenance projects probably would not rank high on your list. You may think that “the other guy” will handle maintenance. But what if you are “the other guy?”
I have been “the other guy” several times while working for a major utility. This last time I took a different approach to setting up and managing the maintenance of a suite of applications.
Many of us IT project managers can attest that applying the principles of the Project Management Institute's A Guide to the Project Management Body of Knowledge (PMBOK® Guide) to our software development projects has contributed to those projects' success. But ongoing system maintenance isn't a project; or is it? Does it have an end? Is it providing something unique year after year? While this could be a fun debate, this paper covers how to run IT maintenance as a project by applying or slightly customizing PMBOK® Guide principles to increase your success in delivering the needed support to the business. This paper is organized by the PMBOK® Guide Knowledge Areas.
Scope of Maintenance
Normally a project begins with a project charter. You may still have one from the development project, but that charter did not authorize you to spend resources to maintain the systems. Instead, IT and the business should develop a Service Level Agreement (SLA) identifying the services that the IT Maintenance Team will provide to the business. The SLA serves as a contract between the IT group and the business owner of the system, replacing the project charter. This document should address all the services to be provided by the IT Maintenance Team as well as services that will not be provided. The business's responsibilities in the arrangement must be spelled out so there is no confusion.
Exhibit 1. Steps to Create SLA and Transition Plan
Exhibit 1 outlines the steps to create an SLA and transition plan.
Writing the SLA
The first step is to define the scope. Then define the activities needed to deliver the maintenance scope. Once these activities are documented, estimate the time and cost of them. For the cost you will need to include any contracts for third-party support, licenses, or staff augmentation.
Next identify risks. Determine metrics that indicate key quality deliverables. Once the SLA is drafted, begin negotiation with your business counterparts. Getting their involvement early in drafting the SLA can help establish buy-in and correctly focus the document.
The following is a basic outline of an SLA. One size does not fit all, so use the outline as a starting point to develop ideas for your SLA.
• Time frame services will be provided
• Space for signatures of approval
• Purpose of agreement, for example:
The purpose of this document is to define the scope of services that the _____ business will contract with the _____ IT Team. Included in this document will be the breakdown of the specific services provided, the list of components included, the list of response times, metrics used to track the success of the service delivered, services not delivered under this agreement, and functions that the business will perform. The costs of these services will also be broken down.
• Overview of maintenance
• Constraints of maintenance
• Limitations the system currently has
• Method of migrating into production; possible after-hours restrictions
• Service activities included and excluded
• Response and escalation times based on different severity levels
• Maintenance team contact information and escalation
• System components included and excluded
Scope of Transition
For you hard-core project managers, here you go. You can do some real project management. The Transition Plan will have a definite scope. It is unique, with a start and end date, so it can count as a project or mini-project. If you are not a hard-core project manager, consider it another planning activity that you will want to do right.
The Transition Plan should identify all the tasks needed to take the current state of the system(s) with any development team members and transform them into the end-state that you want. The end-state is not the end of the project but the steady state of your team effectively and efficiently maintaining the system for the business as specified in the SLA.
This plan most likely should include:
• Review components to support
• Review any licensing or contracts that need to be continued
• Determine numbers and skill requirements
• Set up a coverage matrix mapping team members to components
• Obtain Staff
• Have new members meet with current experts
• Address any team member skill gaps
• Produce a document matrix listing existing and needed documents
• Obtain all needed existing documents
• Determine missing documents and plan for their writing
• Owners Matrix
• Map system components to business owners
• Tracking Database
• Set up to track all incidents (bugs, enhancements, data fixes)
• Time Tracking
• Set up method for tracking time your team members spend on specific tasks
• Procedures and Policies
• Document how maintenance should run
• Communication Plan
• Draft to address stakeholders
• Transition any production monitoring from development team to your team
• Customer Contact
• Set up your Help Desk or other customer contact system
• Communicate method to contact the Maintenance Team
• Set up method of monitoring.
This plan should have a clear completion date at which time the Maintenance Team will be fully up and running. An assessment of the current state of maintenance should be communicated to the stakeholders along with this plan.
One of the differences between a maintenance project and a “normal” development project is that a development project has deliverables that are usually products with delivery dates, whereas a maintenance project delivers a service. This difference translates into differences in the project's metrics (the way the project's success is measured). In the development project you set up metrics such as earned value to determine if the project is on track with deliverables as related to cost and schedule. The client wants an answer to the question, “As of today, is the project going to deliver what I need on schedule and on budget?”
For a maintenance project the client asks a different question, “As of today, has the Maintenance Team delivered the system availability defined in the SLA and satisfied the needs of our users while providing good value for the cost?” You as project manager need to be able to answer this question on an ongoing basis.
The measures or metrics should be agreed upon by both parties and documented in the SLA. These are the measures of success that your team will be judged against. The metrics should be chosen to balance the need to track them against the amount of time it takes to collect and report them.
The SLA should have defined all the metrics that the client wanted; but was the client articulate enough at the time the SLA was written to capture all their needs? Probably not. There may be other needs. So as a secondary source, contact the client to review what metrics are important to them, starting with the SLA list. Also, you may want to track metrics at a level below your reporting needs as an early warning that something the client will soon see is going off track. This way you may alleviate the potential problem before the SLA metrics indicate an actual problem.
Possible metrics could include:
• Number of incidents open and completed in the period (monthly)
• Breakdown of type of incidents
• Average time to close an incident
• Application availability in the period
• Average length of total outages
• Time and cost spent on maintenance work (see note #1)
• Time and cost spent on enhancement work (see note #1).
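As a rough illustration, several of the metrics listed above can be computed from a simple list of incidents and outages. The sketch below is only one possible approach; the record layout, the sample data, and the function name `monthly_metrics` are assumptions made for this example, and it assumes every incident in the list was opened during the reporting period.

```python
from datetime import datetime

# Hypothetical incident records for one month:
# (incident type, date opened, date closed or None if still open)
incidents = [
    ("bug", datetime(2002, 5, 1), datetime(2002, 5, 3)),
    ("enhancement", datetime(2002, 5, 6), datetime(2002, 5, 16)),
    ("data fix", datetime(2002, 5, 20), None),
]

# Hypothetical outage durations in the same month, in hours
outage_hours = [1.5, 0.5]
period_hours = 31 * 24  # hours in a 31-day month


def monthly_metrics(incidents, outage_hours, period_hours):
    """Summarize incident and availability metrics for one reporting period."""
    closed = [(opened, done) for _, opened, done in incidents if done is not None]
    days_to_close = [(done - opened).days for opened, done in closed]

    by_type = {}
    for kind, _, _ in incidents:
        by_type[kind] = by_type.get(kind, 0) + 1

    return {
        "opened": len(incidents),
        "completed": len(closed),
        "by_type": by_type,
        "avg_days_to_close": (
            sum(days_to_close) / len(days_to_close) if days_to_close else None
        ),
        "availability_pct": 100 * (period_hours - sum(outage_hours)) / period_hours,
        "avg_outage_hours": (
            sum(outage_hours) / len(outage_hours) if outage_hours else 0.0
        ),
    }


metrics = monthly_metrics(incidents, outage_hours, period_hours)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Keeping the calculation this small supports the balance described earlier: the metrics come straight from data the team already records, so reporting them costs little extra time.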
Throughout the business world there is a trend for companies to decrease the amount of money they spend on maintaining their computer systems. This is necessary to stay competitive in the marketplace. Companies spend big bucks on implementing new systems for competitive advantage, but they do not want to keep paying for incremental improvements that do not provide much business benefit.
Exhibit 2. Severity of Incidents
Exhibit 3. Owner Matrix
One effective way of handling this is to separate the amount of time spent on base maintenance and enhancements. Base maintenance includes any activity to keep the system running: bug fixes, backups, checking logs, and performance tuning. Enhancements are any changes to the system's functionality due to requirement changes. This way the business can see how much they are spending to keep the system running and how much for changes.
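The split between base maintenance and enhancements can be sketched as a small roll-up of time entries. This is a minimal example under stated assumptions: the activity names, the `CATEGORY` mapping, the flat hourly rate, and the function name `split_effort` are all hypothetical and would be replaced by whatever your time-tracking method records.

```python
# Hypothetical mapping of tracked activities to the two reporting buckets
CATEGORY = {
    "bug fix": "base",
    "backup": "base",
    "log check": "base",
    "performance tuning": "base",
    "enhancement": "enhancement",
}


def split_effort(entries, hourly_rate):
    """Total hours and cost per bucket from (activity, hours) time entries."""
    totals = {"base": 0.0, "enhancement": 0.0}
    for activity, hours in entries:
        # An unknown activity raises KeyError so it cannot be silently misfiled
        totals[CATEGORY[activity]] += hours
    return {
        bucket: {"hours": hours, "cost": hours * hourly_rate}
        for bucket, hours in totals.items()
    }


# Hypothetical time entries for one reporting period
entries = [
    ("bug fix", 12),
    ("backup", 4),
    ("enhancement", 20),
    ("performance tuning", 6),
]
report = split_effort(entries, hourly_rate=50)
print(report)
```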
Incidents (bugs, enhancements, data fixes) are the important variable in maintenance projects. Besides measuring the number opened and completed, you should also categorize them by severity. Exhibit 2 shows a suggested set of severity categories. You can expand the description of these categories to include the number of sites, departments, or functions impacted. Also, you may want to redefine them if a failure of your system would not prevent the business from functioning, which is how the Critical category is defined in Exhibit 2.
Reporting the metrics should be part of the communication plan, which will define the frequency of collection, the frequency of reporting, and to whom the metrics will be reported. Measuring and reporting metrics that the client cares about is important to developing and maintaining trust between you and the client. This information can help justify the effort estimated in the SLA or expose faulty SLA assumptions.
Let's cover three points on communication related to Maintenance Projects: (1) Communication Planning, (2) Application Owners, and (3) Tracking Database.
System maintenance is more of a service than a product with a delivery date. Because of this, communication is even more important than in new development projects. The stakeholders of a new development project want to know the status to be assured that the delivery dates will be met. But stakeholders of maintenance want more. They want:
• Immediate fixes of production defects
• To be assured that the maintenance team is diligently working on unresolved defects
• To know that the Service Level Agreement (SLA) services are being met through agreed-upon metrics
• Any approved enhancements tracked to resolution.
Project Communications Management is well defined in the PMBOK® Guide and should be fully utilized for maintenance projects. We will not repeat it here. But we will focus on two other communication subjects.
A key group of stakeholders to consider for communications is the owners of the system(s) or application(s) that your team maintains. It is a good model that the business, not IT, actually owns the system, because the business is the one who reaps the benefits from the system and ultimately knows how the applications support its business needs.
Enhancements to the applications should only start if the business wants the changes. But using the term “the business” can be confusing. A user can request (demand) a change, but does that user have the authority to make such a request? There should be a clear contact in the business who is empowered to approve these changes. An Owner Matrix should be developed and maintained to facilitate contacting the correct approver. Exhibit 3 is a simple matrix to track who the owners are and who the additional Subject Matter Experts (SMEs) are.
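If the Owner Matrix is kept in electronic form, checking whether a requester is the empowered approver becomes a simple lookup. The following is only a sketch: the component names, owner and SME labels, and the functions `approver_for` and `is_authorized` are hypothetical and do not reproduce the contents of Exhibit 3.

```python
# Hypothetical owner matrix: system component -> business owner and SMEs
OWNER_MATRIX = {
    "Billing": {"owner": "Owner A", "smes": ["SME 1", "SME 2"]},
    "Outage Reporting": {"owner": "Owner B", "smes": ["SME 3"]},
}


def approver_for(component):
    """Return the business owner empowered to approve changes to a component."""
    try:
        return OWNER_MATRIX[component]["owner"]
    except KeyError:
        raise LookupError(f"No owner recorded for component {component!r}") from None


def is_authorized(component, requester):
    """True only when the requester is the recorded owner of the component."""
    return approver_for(component) == requester


print(approver_for("Billing"))            # the owner to contact for approval
print(is_authorized("Billing", "SME 1"))  # an SME can advise but not approve
```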
Tracking all the questions, requests, bugs, enhancements, and migrations can be an overwhelming nightmare. Communicating the status of each type to stakeholders outside the team is a challenge, let alone ensuring that only tested fixes and enhancements are migrated to production. Each type can have different attributes, but setting up one database to track “everything” is the solution. The database or tool you use for tracking must be workable for the volume of data and the number of people accessing it, but beyond these, any tool should work. A spreadsheet most likely would be too limiting; a database system or even a test defect tracking system works well. What should be tracked?
• Calls with basic questions
• System admin changes
• Bugs with their fixes
• Enhancement requests
• Data fix requests.
What type of information should be kept for each type of incident?
• Incident number as a reference
• Type of incident (bug, enhancement, data fix, sys admin)
• System component
• Status (new, analysis, working, testing, complete, canceled)
• Person assigned to work the incident
• Short description
• Long description
• Working notes
• Date entered
• Date completed or canceled.
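The fields listed above map naturally onto a single record type. The sketch below shows one possible shape for such a record; the class name `Incident`, the field names, and the `close` helper are assumptions for illustration, while the allowed types and statuses are taken from the lists above.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

INCIDENT_TYPES = {"bug", "enhancement", "data fix", "sys admin"}
STATUSES = {"new", "analysis", "working", "testing", "complete", "canceled"}


@dataclass
class Incident:
    """One row in the tracking database (hypothetical layout)."""
    number: int
    kind: str
    component: str
    short_description: str
    long_description: str = ""
    status: str = "new"
    assigned_to: Optional[str] = None
    working_notes: List[str] = field(default_factory=list)
    date_entered: date = field(default_factory=date.today)
    date_closed: Optional[date] = None  # date completed or canceled

    def __post_init__(self):
        if self.kind not in INCIDENT_TYPES:
            raise ValueError(f"unknown incident type: {self.kind}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")

    def close(self, status, when):
        """Mark the incident complete or canceled and record the date."""
        if status not in {"complete", "canceled"}:
            raise ValueError("an incident closes as 'complete' or 'canceled'")
        self.status = status
        self.date_closed = when


# Example usage with made-up values
incident = Incident(
    number=101,
    kind="bug",
    component="Billing",
    short_description="Report total is wrong",
    assigned_to="Member 1",
    date_entered=date(2002, 5, 1),
)
incident.working_notes.append("Reproduced in test environment")
incident.close("complete", date(2002, 5, 3))
print(incident)
```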
Exhibit 4. Skills Matrix
With this information in one place, it will be easy for you, the project manager, to slice and dice the data to communicate status and answer questions on how your Maintenance Team is performing along with what improvements may be needed. You can even allow read access for your key users so they can look first-hand at current statuses of incidents in which they are most interested.
Project Risk Management is well defined in the PMBOK® Guide and should be fully utilized for maintenance projects. We will not repeat it here. But we will focus on three specific risk / mitigation techniques for maintenance projects.
One maintenance risk we identified was “not obtaining needed documents to support the systems.” We strove to acquire the necessary documents from the development teams before they disappeared but fell short of meeting this need. The Maintenance Team had to write the missing maintenance manuals and operation guides.
Managing software versions is critical to reducing the risk of deploying the wrong software into production. All the versions of the components should be identified. This is extremely important when there are multiple versions of the application running in production for different clients. Define how the application pieces are maintained and fit together. Include application, database, and hardware configurations. Also reference any software licenses utilized for the application.
Migration is the process of moving changes into production. A major precursor is thorough testing of the changes and business signoff on that testing. There are four processes that create the need for a migration: bug fixes, enhancements, performance tuning, and new releases.
The migration process involves extensive coordination and communication with the business stakeholders. The migration may involve an outage of the system that could adversely affect the business, so this should be minimized and performed after normal business hours if possible.
The following two techniques help to manage the distribution of work among the team members and establish expectations of team members. A team who understands what needs to be done and knows how to do it is an incredible asset to you, the project manager of a maintenance team. If you did not take over an existing team who is at this level, then it is your job to develop the team to this level.
Determine Skills Needed
The more applications (including application components and interfaces) you have to support, the more likely your team will need a variety of skills to provide proper maintenance. Producing a Skills Matrix (see Exhibit 4) is a simple way of keeping track of this.
List all the applications in the first column of the matrix. Title the remaining columns with the skills needed (programming language, database, and operating system). Then simply place an “X” in the box where the application needs that skill. This provides a quick view of what is needed to support your entire project and shows any training needs of your team. Any gaps in skills can be addressed by training.
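The Skills Matrix described here can also be built and checked programmatically. The following sketch makes several assumptions: the application names, the skills, the team members, and the functions `matrix_rows` and `skill_gaps` are invented for illustration and are not taken from Exhibit 4.

```python
# Hypothetical skills each application needs
skills_needed = {
    "Application 1": {"Java", "Oracle", "UNIX"},
    "Application 2": {"COBOL", "DB2"},
}

# Hypothetical skills each team member currently has
team_skills = {
    "Member 1": {"Java", "Oracle"},
    "Member 2": {"COBOL"},
}


def matrix_rows(skills_needed):
    """Build the matrix: one row per application, an 'X' where a skill is needed."""
    columns = sorted(set().union(*skills_needed.values()))
    rows = [["Application"] + columns]
    for app, skills in skills_needed.items():
        rows.append([app] + ["X" if skill in skills else "" for skill in columns])
    return rows


def skill_gaps(skills_needed, team_skills):
    """Skills some application needs that no team member has yet."""
    needed = set().union(*skills_needed.values())
    available = set().union(*team_skills.values())
    return sorted(needed - available)


rows = matrix_rows(skills_needed)
gaps = skill_gaps(skills_needed, team_skills)
for row in rows:
    print(" | ".join(cell.ljust(13) for cell in row))
print("Training needed for:", gaps)
```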
Divide Application Coverage
The more applications (including application components and interfaces) you have to support, the more difficult it is to keep track of who is supporting which application, especially when you first take over the project. It is equally difficult to track how many alternate team members you have as backups to the primary support person. The application coverage matrix (see Exhibit 5) helps by charting who is supporting which applications.
This example of an application coverage matrix shows some problems with our team's coverage. We are in good shape with Application 1, having a primary team member and backup able to support the application. But the same person supports the next three applications and interfaces with no one providing backup support. This team member may be overloaded. Lastly, team members 4 and 5 are not supporting anything. This example of an application coverage matrix gives you an idea where to start making changes if this is the starting point of a team that you are taking over.
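The same kinds of problems can be found automatically once the coverage matrix is recorded as data. The sketch below is loosely modeled on the situation just described, not on the actual contents of Exhibit 5; the application and member names, the `max_primary` threshold, and the function `review_coverage` are assumptions for illustration.

```python
# Hypothetical coverage matrix: application -> primary and backup support
coverage = {
    "Application 1": {"primary": "Member 1", "backup": "Member 2"},
    "Application 2": {"primary": "Member 3", "backup": None},
    "Application 3": {"primary": "Member 3", "backup": None},
    "Interface A": {"primary": "Member 3", "backup": None},
}
team = ["Member 1", "Member 2", "Member 3", "Member 4", "Member 5"]


def review_coverage(coverage, team, max_primary=2):
    """Flag applications without backup, overloaded members, and idle members."""
    no_backup = [app for app, who in coverage.items() if who["backup"] is None]

    primary_load = {}
    for who in coverage.values():
        primary_load[who["primary"]] = primary_load.get(who["primary"], 0) + 1
    overloaded = sorted(m for m, n in primary_load.items() if n > max_primary)

    assigned = {
        person
        for who in coverage.values()
        for person in (who["primary"], who["backup"])
        if person is not None
    }
    unassigned = [member for member in team if member not in assigned]

    return {"no_backup": no_backup, "overloaded": overloaded, "unassigned": unassigned}


findings = review_coverage(coverage, team)
print(findings)
```

The threshold for "overloaded" is a judgment call; here it is simply the number of applications for which one person is the primary support.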
You may or may not be involved with a yearly performance review cycle with your team members. There is an advantage if you are, since it will help focus the team members on what you define as important. A good starting point for the performance review cycle is to write project expectations for each member's role and responsibilities and review them with each member so they clearly know what is expected of them. The Coverage Matrix can be used to help write the project expectations.
IT system maintenance can be enhanced by applying or slightly customizing project management principles found in PMI's PMBOK® Guide. Creating a Service Level Agreement (SLA) provides a solid contract and takes the place of a Project Charter. The Transition Plan takes the current state of the system into fully functioning maintenance.
Both standard project communications management and standard project risk management apply fully to maintenance.
Exhibit 5. Application Coverage Matrix
Standard project communications management benefits from the definition of a System Owner Matrix and the creation of an all-inclusive tracking database. Good system documentation, configuration management, and migration management minimize risk.
We can't forget about the team. The creation of a Skills Matrix and an Application Coverage Matrix is essential to efficiently manage the dynamics of your human resources. Lastly, the ongoing quality of the Maintenance Team's deliverables can only be tracked with the establishment of clear metrics enabling you, the project manager, to answer the client's question, “Is the Maintenance Project delivering the system availability defined in the SLA, satisfying the needs of the users, and providing good value?”
Project Management Standards Committee. 2000. A Guide to the Project Management Body of Knowledge (PMBOK® Guide). Upper Darby, PA: Project Management Institute.
Proceedings of the Project Management Institute Annual Seminars & Symposium
October 3–10, 2002 · San Antonio, Texas, USA