Now would be a good time to make significant improvements in ITIL. The change of ownership is an opportunity which should not be missed. AXELOS has a free hand to make changes in ITIL. In the short term a major overhaul of ITIL would hurt existing business, but significant improvements would secure it in the long run.
The framework has been useful but it contains several areas which need to be improved. I can recognize three key areas:
1) The service lifecycle
2) The complex, overlapping “process” model
3) Service Desk – Incident – Event – Service Request – Problem Management
I plan to comment on all three and I will start from the bottom of that list. This is the most practical part of the series. Many practitioners have already successfully applied these ideas and this is probably the most important area which needs major changes. Anybody who has followed the endless debates and discussions around incident and problem management should see the need for clarification.
Incident management and the Service Desk are quite central concepts in ITIL. Originally the solution was good, but it must be understood in the light of the era in which it emerged. The IT help desk appeared roughly at the moment when PCs were introduced to the office environment. The technology was unstable and difficult to use. It was clear that users would have difficulties and would need fixes. The early ITIL authors understood this and created the concept of a single point of contact (SPOC), which would support all infrastructure and act as a single channel for all users’ needs.
The key process was Incident Management and the definition of an incident was deliberately vague. An incident was any event which was not part of normal operation and which might cause problems. The idea was that all user calls were treated as potential failures. The objective was to restore the service, which was usually done by advising the user or fixing the failure by rebooting the system or some other fast remedy.
This was a simple model, far from perfect, but it worked better than previous practices, where the user had to find the correct specialist to help them. There were people who fixed failures, but only if the failure was in their own system. When a user called, the specialist first checked if the failure was in their domain and if it was not, that was the end of the discussion.
The first versions of ITIL did not describe the activities of service operations. This meant that things like service monitoring and event correlation were out of scope. Advanced service organizations had their own processes to handle events and service failures, where the goal was to fix things before customers saw them.
Version 3 of ITIL added operations and event management. The definition of incident was changed slightly and the result has been more confusion. An incident is now an unplanned interruption to the service and it also covers events that are not visible to the customers. The new incident management is in effect fault management.
There are three key operational processes in any service environment. These are:
- order processing, which handles the normal flow of customer orders
- customer support, which handles all customer enquiries and problems
- fault management, which repairs and fixes all items which cause or may cause service failures and degradations
These three are separate processes with different goals and different requirements for the amount of information needed.
Order processing is centered on the service or product that the customer orders, where exact specifications are important. Order processing should be a simple and straightforward procedure in most cases. Orders must be documented.
Order processing should be a tool to manage several procedures simultaneously. For example, when a new customer needs a computer, that person needs many things which should become available at the same time: a new laptop, a mouse, a user ID, connectivity and suitable access rights. Using a single ticket to handle all these activities is not efficient. An automated procedure should create several tickets and make sure that all these tasks are executed in the right order.
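The idea of one order fanning out into several tickets that are worked in the right order can be sketched roughly as follows. This is only an illustration under assumptions: the task names, the dependencies between them, the order ID and the Ticket structure are all hypothetical and do not come from ITIL or from any particular ticketing tool.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task catalogue for the new-computer order. Each task maps to
# the set of tasks that must be finished before it can start (assumed values).
ONBOARDING_TASKS = {
    "user id": set(),
    "laptop": set(),
    "mouse": set(),
    "connectivity": {"laptop", "user id"},
    "access rights": {"user id"},
}


@dataclass
class Ticket:
    order_id: str
    task: str
    depends_on: set = field(default_factory=set)
    done: bool = False


def create_tickets(order_id, tasks):
    """Create one ticket per task, listed so that prerequisites come first."""
    ordered = TopologicalSorter(tasks).static_order()
    return [Ticket(order_id, task, set(tasks[task])) for task in ordered]


def ready_tickets(tickets):
    """Return the open tickets whose prerequisites are all completed."""
    finished = {t.task for t in tickets if t.done}
    return [t for t in tickets if not t.done and t.depends_on <= finished]


tickets = create_tickets("ORD-1", ONBOARDING_TASKS)
for ticket in ready_tickets(tickets):
    print("can start now:", ticket.task)
```

In this sketch the tasks without prerequisites can be started immediately and in parallel, while the connectivity and access rights tickets only become ready once the tickets they depend on are marked as done.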
Customer support is customer- or event-centric. The goal is to solve the customers’ problems quickly. The process starts from a customer report or query and it should end with a satisfied customer. Good communication skills and the ability to recognize and solve problems are essential to this activity. In most cases there is no actual fault behind the customer’s problem and there is no need to find out the root cause. When there is a fault which causes the customer’s problem, there are two separate goals to meet. One is to take care of the customer and the second is to fix the fault.
Customer support should be alert to opportunities to help customers by teaching better ways of doing things or offering new tools.
It may be a useful practice to record all support calls as tickets but it is not useful to try to connect all these tickets to configuration items. The need to document all interactions has been overemphasized. The ideal tool for customer support is the customer relationship management tool. Collecting and managing feedback is an important element of customer service but technically it is not different from customer support.
Fault management is centered on the failing configuration item. It starts with a service failure and ends with a restored service. Fault management is not about the customers who are suffering due to the failure. Customer support looks after the customers and tries to fix the situation in a customer centric way. Fault management requires technical skills and the ability to solve problems.
Fault management is the area where the ticketing tool is most useful. It is important to collect all configuration data as the underlying cause of the fault might be somewhere else. In ITIL terms fault management covers both incident management and reactive problem management. The goal of fault management is to restore the service; removing underlying causes is secondary. Sometimes it is not possible to solve the customer’s problem without solving the fault. In these cases the customer’s ticket stays open until the fault is resolved. Not all faults can be fixed, for several possible reasons. In these cases the fault ticket may stay open but the customer ticket must be solved. One solution is explaining to the customer that the fault will not be fixed and finding a way to bypass it.
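The relationship between the two kinds of tickets can be sketched as below. This is a minimal illustration, not a description of any existing tool: the class names, the fields, the closing rule in try_close and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaultTicket:
    """Hypothetical fault ticket, centered on the failing configuration item."""
    config_item: str
    restored: bool = False


@dataclass
class CustomerTicket:
    """Hypothetical customer ticket, centered on the customer's problem."""
    customer: str
    fault: Optional[FaultTicket] = None
    workaround: Optional[str] = None
    closed: bool = False


def try_close(ticket):
    """Close the customer ticket when the customer's problem is dealt with.

    Assumed rule: the ticket can close if no fault is involved, if the linked
    fault has been restored, or if the customer has an agreed workaround.
    """
    if ticket.fault is None or ticket.fault.restored or ticket.workaround:
        ticket.closed = True
    return ticket.closed


fault = FaultTicket(config_item="mail-server-01")
case = CustomerTicket(customer="alice", fault=fault)

print(try_close(case))   # the fault is still open and there is no workaround
case.workaround = "use webmail until the server is replaced"
print(try_close(case))   # the customer ticket closes; the fault ticket stays open
```

The point of keeping two separate records is that each one follows its own goal: the customer ticket ends with a customer who can work again, while the fault ticket ends only when the service is restored.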
In some cases all these events can be too complicated to be managed by a process and there must be a way to handle complex cases. Case management should be recognized as a separate activity which cannot be controlled with a process.
Behind the customer-facing processes, an IT service provider needs at least these two activities:
- Monitoring and control, which tries to recognize faults before they affect the service. ITIL Event Management is not a process but an activity within monitoring and control. The goal of monitoring is to prevent faults.
- Service improvement, which looks at all service failures and customer problems and tries to improve the service by removing the causes. There is no need for a specific Problem Management as fault management is responsible for solving all ongoing faults. Service improvement is not a process but a group of activities where teams need to understand customer priorities so that they can direct the improvement efforts to those areas where they can create the most value.
What this means to ITIL in practical terms:
1) Remove the terms “incident” and “problem”.
2) Define Request Management as order processing.
3) Redesign Incident Management as Fault Management.
4) Remove Problem Management and replace it with Service Improvement.
5) Add Customer Support as a process or activity to handle customers’ problems.
6) Remove Event Management as a process but keep it as an activity in Monitoring and Control.