The Hot Aisle Logo
Fresh Thinking on IT Operations for 100,000 Industry Executives

Here is the acid test – our customers often know that there is a service problem before we do. When that happens, as far as our customers are concerned, we suck, we are incompetent and they are smarter than us. Our IT Operations team may be publishing loads of great Key Performance Indicators (KPIs) that show improvements in average fix times, improvements in system availability, reductions in incidents etc. but all of that is without value if our customers think we suck.

So why do customers get the jump on our IT Operations Centre? Actually it is quite simple. Our customers experience the full end-to-end service as users of our systems. Our IT Operations people only experience a partial picture: the components of the service, as seen through the instrumentation that checks each one is available and functional. Our customers and IT Operations have different views of the world, and the customer view is more powerful, more meaningful and more immediate. Modern distributed systems are so complex that it is impossible for IT Operations to understand the impact of a component failure on the end-to-end service from component-level checks alone.

Customers think about services, not just bits of technology. They see things end-to-end, not in isolation, and even if all of the key components are available and functional, sometimes the end-to-end service just does not work. I am sure we have all had the experience of calling a helpdesk analyst and being told that it is all working for everyone else and they can’t see your problem. The supplier has probably spent millions on instrumenting the service you are complaining about, you have spent nothing, and your instrumentation is better! Yours is better because yours is the Customer Experience View.

I wrote about some of the history and the reasons that we monitor and manage IT in technical silos in an earlier article on this website: Why do IT Operations suck? It’s worth reading again.

So how do we get to a Customer Experience View? Actually it is not that difficult; it just takes a little extra effort. The first stage is to map out the end-to-end service that we are offering our customer (I call this process Tube Mapping, after London’s Underground train network). The diagram below shows an example for a Broadband Consumer service, showing Cycle Time and Right First Time metrics.

Tube Maps are built in workshops between IT and the Business Units, and they crystallise the dependencies between IT systems and applications and the business processes they support. It is quite common in companies that no one individual understands the whole end-to-end service and that, although the service is documented, there is no one place where it all comes together. Tube Mapping can be very enlightening, and Tube Maps are valuable in themselves in driving a clear understanding of what we are delivering to our customers.
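
As a rough illustration of why the end-to-end view matters, here is a minimal Python sketch of a Tube Map line carrying the two metrics mentioned above. Everything in it is an assumption made for the example: the station names and figures are invented, the steps are assumed to run one after another (so cycle times add up), and failures are assumed to be independent (so Right First Time rates multiply).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Station:
    """One step on a Tube Map (names and figures here are hypothetical)."""
    name: str
    cycle_time_hours: float   # average elapsed time spent in this step
    right_first_time: float   # fraction of items passing without rework, 0..1


def end_to_end_cycle_time(line: list[Station]) -> float:
    """Assumes steps run one after another, so the times simply add up."""
    return sum(s.cycle_time_hours for s in line)


def end_to_end_right_first_time(line: list[Station]) -> float:
    """Assumes failures at each step are independent, so the rates multiply."""
    result = 1.0
    for s in line:
        result *= s.right_first_time
    return result


# Hypothetical broadband order journey; the numbers are invented.
broadband_order = [
    Station("Take order", 0.5, 0.95),
    Station("Check line", 4.0, 0.90),
    Station("Provision service", 24.0, 0.80),
    Station("Activate and bill", 2.0, 0.98),
]

print(f"Cycle time: {end_to_end_cycle_time(broadband_order):.1f} hours")
print(f"Right First Time: {end_to_end_right_first_time(broadband_order):.1%}")
```

With these made-up numbers every individual step scores 80% or better, yet only about two thirds of orders get through the whole journey right first time. That is the kind of gap a component-by-component view tends to hide.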

The next stage is to start building end-to-end instrumentation for Customer Experience Monitoring. There are three types that we can use:

  • Realtime Active Dashboards
  • Synthetic Transactions
  • Built-in End-to-End Application Instrumentation

The first, Realtime Active Dashboards, can be built using the Netcool tools that IBM acquired. This toolset enables a federated approach to Customer Experience Monitoring: alerts and instrumentation from many sources are combined into a single end-to-end view of the service offered to customers. A set of RAG (Red, Amber, Green) indicators shows the impact of a failure at a low level in the service hierarchy on the customer-impacting service component. For example, a failed WebSphere server could cause the service view to be Red if there was no built-in resilience (thereby causing a service outage), or Amber if it was part of a protected cluster, thereby compromising service protection but not impacting the customer experience.

These Service Hierarchies are built from the Tube Maps, so they translate into a meaningful description of the business impact. For example, a hardware failure might impact the sell service, which in turn impacts the call centers because they can no longer take orders. By taking this approach the Command Center begins to understand the impact of outages on the business and our customers, and so becomes more responsive and better able to decide which components are most important to recover.
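
The RAG roll-up described above can be sketched in a few lines of Python. This is not how Netcool or any other product works internally; it is a minimal illustration under assumed rules: a lone component is Red when it fails, a protected cluster is Amber while at least one member is still working, and a service shows the worst status of anything it depends on. The class names and server names are hypothetical.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Rag(IntEnum):
    GREEN = 0
    AMBER = 1
    RED = 2


@dataclass
class Component:
    """A leaf in the service hierarchy, e.g. a single server."""
    name: str
    healthy: bool = True

    def status(self) -> Rag:
        return Rag.GREEN if self.healthy else Rag.RED


@dataclass
class Cluster:
    """A group of interchangeable components that protect each other."""
    name: str
    members: list = field(default_factory=list)

    def status(self) -> Rag:
        working = sum(1 for m in self.members if m.status() == Rag.GREEN)
        if working == len(self.members):
            return Rag.GREEN
        # Some members lost: protection is compromised but service continues.
        return Rag.AMBER if working > 0 else Rag.RED


@dataclass
class Service:
    """A customer-facing service; it is as bad as its worst dependency."""
    name: str
    children: list = field(default_factory=list)

    def status(self) -> Rag:
        return max((c.status() for c in self.children), default=Rag.GREEN)


# Hypothetical hierarchy for the sell service.
web1, web2 = Component("web-1"), Component("web-2")
orders_db = Component("orders-db")  # no resilience in this example
sell = Service("Sell", [Cluster("web-cluster", [web1, web2]), orders_db])

print(sell.status().name)   # GREEN
web1.healthy = False
print(sell.status().name)   # AMBER: cluster degraded, customers unaffected
orders_db.healthy = False
print(sell.status().name)   # RED: unprotected component down
```

The useful property is that the operator sees the status of the sell service rather than a list of server alarms, which is what makes the triage decision easier.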

Actually these Customer Experience Views are incredibly useful and can be shared with customers. Helpdesk call volumes tumble when the customer can see that there is a problem, what the impact of the problem is and that you know about it already. The tables are turned: we tell the customer about the problem, and we are competent.

The second, synthetic transactions, is an idea borrowed from end-to-end systems testing. Here we take the test harness that is applied to our applications prior to deployment into production and run it continuously to provide in-life testing. Synthetic transactions offer a unique view of the end-to-end service, being as close as it is possible to be to the real Customer Experience that we need to achieve. HP have a set of tools called Business Availability Center that work well in this space. The objective is to create software robots that create pseudo transactions and monitor the progress of those transactions through the end-to-end service. Slow performance can be detected and highlighted in the same way as a complete failure. Synthetic transactions are highly flexible, with a rich set of instrumentation output.

Here is what HP say about the product set:

HP Business Availability Center helps your organization: 

  • Make ITSM incident and problem management processes more efficient and business aligned
  • Measure business impact and risk from the end-user perspective
  • Manage business and operational service levels proactively
  • Accelerate problem isolation by automating standard operational processes
  • Manage complex business transactions across heterogeneous environments
  • Manage the complexity of composite applications and SOA
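
To make the synthetic transaction idea concrete, here is a minimal Python sketch of a software robot. It has nothing to do with how HP Business Availability Center is implemented; the function names, step names and threshold are all hypothetical, and the steps are stand-ins for real calls to a live service. The robot walks through the customer journey, times each step, and flags slow steps (Amber) as well as failed ones (Red).

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float
    status: str  # "GREEN", "AMBER" (slow) or "RED" (failed)


def run_synthetic_transaction(
    steps: list[tuple[str, Callable[[], None]]],
    slow_after_seconds: float,
    clock: Callable[[], float] = time.perf_counter,
) -> list[StepResult]:
    """Run each (name, action) step in order and time it.

    Stops at the first failing step, on the assumption that later steps
    in a customer journey depend on the earlier ones.
    """
    results = []
    for name, action in steps:
        started = clock()
        try:
            action()
            ok = True
        except Exception:
            ok = False
        elapsed = clock() - started
        if not ok:
            status = "RED"
        elif elapsed > slow_after_seconds:
            status = "AMBER"
        else:
            status = "GREEN"
        results.append(StepResult(name, ok, elapsed, status))
        if not ok:
            break
    return results


# Stand-in steps; a real robot would drive the live service instead.
def log_in() -> None:
    pass


def search_catalogue() -> None:
    time.sleep(0.05)  # simulate a slow response


def submit_order() -> None:
    raise RuntimeError("order service unavailable")


def confirm_order() -> None:
    pass


results = run_synthetic_transaction(
    [
        ("log in", log_in),
        ("search catalogue", search_catalogue),
        ("submit order", submit_order),
        ("confirm order", confirm_order),
    ],
    slow_after_seconds=0.02,
)
for r in results:
    print(f"{r.name:<18} {r.status:<5} {r.seconds:.3f}s")
```

Run on a schedule, a robot like this reports a slow or broken journey whether or not any individual component has raised an alarm, which is the point of the technique.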

The third is built-in end-to-end application instrumentation. This is much harder to retrofit to an existing application, particularly if it is not a self-build. However, it is the area where the most comprehensive Customer Experience Monitoring can be built. When constructing new applications, making in-life testing part of the functional requirements makes this whole area much easier to implement and instrument.
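
One way to picture built-in instrumentation is a small wrapper that the application itself puts around each step of a customer transaction. The Python sketch below is only an illustration under assumptions: the `instrumented` helper, the `EVENTS` list (standing in for whatever monitoring pipeline is really in use) and the `place_order` function are all hypothetical.

```python
import time
import uuid
from contextlib import contextmanager

EVENTS = []  # stand-in for a real monitoring pipeline


@contextmanager
def instrumented(transaction_id, step):
    """Record the outcome and duration of one step of a customer transaction."""
    started = time.perf_counter()
    outcome = "ok"
    try:
        yield
    except Exception:
        outcome = "failed"
        raise  # the application still sees the original error
    finally:
        EVENTS.append({
            "transaction_id": transaction_id,
            "step": step,
            "outcome": outcome,
            "seconds": time.perf_counter() - started,
        })


def place_order(basket):
    """Hypothetical business function with instrumentation built in."""
    transaction_id = str(uuid.uuid4())
    with instrumented(transaction_id, "validate_basket"):
        if not basket:
            raise ValueError("empty basket")
    with instrumented(transaction_id, "take_payment"):
        pass  # a real payment call would go here
    return transaction_id


order_id = place_order(["broadband-package"])
for event in EVENTS:
    print(event["step"], event["outcome"], f'{event["seconds"]:.6f}s')
```

Because every event carries the same transaction identifier, the steps of one real customer journey can be stitched back together end-to-end, which is hard to achieve from the outside.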

As I said in my previous article on Why do IT Operations Suck?:

We focus on service by organizing around key services and mapping those services using instrumentation working along the tube maps. We delight customers by caring about the same things they do: Service Protection and Service Recovery. Service Recovery works better because initial triage, where we work out where to focus our attentions, is enabled by understanding which issues are actually impacting customer experience.

Only by focussing our efforts on understanding what is important to the customer and dealing with the customer’s issues immediately can we get to the point where IT Operations no longer suck.

There Are 6 Responses So Far.

  1. It is nice to see somebody banging the same drum and nice to see the level of detail that has gone into providing this picture.

    From the business end, heuristic analysis of the customer transaction also provides a clear understanding of the transaction journey, and this may be something to consider when you start building synthetic transactions. It certainly helps to ensure that there are no gaps and no extra hops in the transaction that add annoying delays and frustration to the customer experience.

  2. Steve

    How long does Customer Experience Management take to implement?


  3. Yvonne,

    Experience shows that this is a permanent journey, although benefits can be had quite quickly. At BT the difference after a period of 18 months was quite staggering and was picked up by customers and management alike.


  4. John

    You make a very valid point. The task of tube mapping itself helps IT to be much clearer about the blockages and stupidity that are common in informal and poorly documented processes.


  5. Hey Steve,

    Long time no speak, how’s it going out there?

    Good blog! I’m glad you mentioned the importance of “Tube Maps” – as you know it’s something I’ve been hammering home to a certain organisation for over a year now! Finally we seem to have realised that understanding our IT estate is a matter of importance, and so company-wide initiatives have begun to go out and map it, which I hope to lead on. As well as helping end-to-end monitoring, these maps will also help lay the foundation for other ITIL-aligned services such as Change and Problem Management…

  6. Hiten

    You make an excellent point, and thanks for the compliment. Quite how IT Organizations have managed to get away without understanding the services they supply to customers for so long defeats me. Service views are one important part of this but, as you say, there is so much more. How can one focus on Cycle Time and Right First Time improvements without being able to calculate the end-to-end impact of changes?

    Glad to hear that you are doing well.


