The Heart Of The Matter…

When it comes to dashboards and Service Management (or any of the many aliases used for that term), no one ever seems to discuss the concept that lies beneath the best implementations.

Yes, we are trying to provide a view of technology in a business-oriented way. That is obvious, even if it is beyond the scope of so many products that claim to do it. But what is it really that Service Management needs to do to deliver that promise?

  • Is it about the visual representation? (It could be, but there is much work to be done before you can even start to think about what the view should look like.)
  • Is it about the data?
  • Is it more important to have real-time monitoring data, or application performance data?

The famous “it depends” rears its ugly head here, but then again, it’s not so much about how much data you have as what you do with the data you have.

All solutions face the same issue: large volumes of data to show, but a very small span of time and space to show it in.

Clearly, we cannot show all the data we have. To make matters worse, deciding what is meaningful and what is not cannot be reduced to a simple filter: you cannot just leave some bits off the display while keeping others. You are collecting that data for a reason, so discarding it at the point of delivery suggests cluelessness on an industrial scale, either because you collected data you did not need or because you discarded useful data.

So what are we to do with all this data? With a nod to the many variations that exist of the “DIKW” hierarchy, let me present my own simplified version.

The Modern Intelligence Hierarchy

We start with raw Data, evaluate it to produce Information, repeat the process at a higher level to get Knowledge, and keep repeating as many times as necessary until we arrive at Intelligence.

Intelligence Hierarchy

You can choose any word you like, but the word I prefer is “distill”.

The goal is not to discard meaningful data, while also recognizing that no visualization technique can encapsulate all the raw data that has been collected. (Wall Street types: this applies even to your vaunted “heat map”.)

The key to this is simple: for each iteration, the distillation process uses all of the data you chose to feed into it from the lower level, summarizing the results in a meaningful manner. Keep in mind the actual number of iterations is purely arbitrary, so don’t get caught in the trap of presuming some fixed number of iterations is either right or wrong. The point is to distill large volumes of data into small bits of intelligence. In the business world, this would also be referred to as “actionable intelligence”, because someone is going to take an action based on what is revealed.
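
To make the idea concrete, here is a minimal sketch in Python, with entirely hypothetical devices, services, and thresholds: every raw sample feeds the summary above it, each level is smaller and more meaningful than the one below, and nothing is thrown away; the audience simply determines which level is displayed.

```python
from statistics import mean

# Level 0 - Data: raw response-time samples (ms) per device (hypothetical)
raw = {
    "web-01": [120, 135, 110, 480],
    "web-02": [100, 105, 98, 97],
    "db-01":  [45, 50, 52, 300],
}

# Level 1 - Information: one summary per device, built from ALL of its samples
info = {dev: {"avg": round(mean(s)), "worst": max(s)} for dev, s in raw.items()}

# Level 2 - Knowledge: one status per service, built from ALL of its devices
services = {"checkout": ["web-01", "web-02"], "orders-db": ["db-01"]}
knowledge = {
    svc: "degraded" if any(info[d]["worst"] > 250 for d in devs) else "healthy"
    for svc, devs in services.items()
}

# Level 3 - Intelligence: one actionable statement for the business
intelligence = ("Customer checkout is at risk"
                if knowledge["checkout"] == "degraded"
                else "Customer checkout is operating normally")

# Nothing was discarded; the audience decides which level gets displayed
views = {"operations": info, "service owner": knowledge, "executive": intelligence}
print(views["executive"])
```
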

The good news is that “meaningful” is in the eye of the beholder, so the results of this distillation process are not required to fit any specific visualization, though there are better and worse ways of delivering it.

The distillation process uses all the data, summarizes it at every level, then allows you to determine, based on the audience, what is meaningful to display.

This is a perfectly fine theoretical discussion, but how do you put this concept into practice?

The phrase “event correlation” has been bandied about for years, but has only ever succeeded in one limited space: correlating network events to analyze and identify the root cause of a network problem. Correlation succeeds here because the rules for the topology of a network are well defined, and change is easily handled by running the refreshed network discovery data through the same rules again, as the sketch below illustrates.
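
Here is a toy sketch of why this works (hypothetical device names, not any particular vendor’s algorithm): the topology supplies the rules. An event is only a root cause if nothing upstream of that device has also failed, and when the network changes you simply rebuild the upstream map from discovery and apply the same rule again.

```python
# Upstream map derived from network discovery: device -> the device it sits behind
upstream = {
    "switch-a": "router-1",
    "server-x": "switch-a",
    "server-y": "switch-a",
}

events = ["server-x down", "server-y down", "switch-a down", "router-1 down"]


def root_causes(events, upstream):
    down = {e.split()[0] for e in events}
    causes = set()
    for dev in down:
        # a device is a root cause only if nothing above it is also down
        parent = upstream.get(dev)
        is_symptom = False
        while parent:
            if parent in down:
                is_symptom = True
                break
            parent = upstream.get(parent)
        if not is_symptom:
            causes.add(dev)
    return causes


print(root_causes(events, upstream))   # {'router-1'}
```
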

However, there are no such rules for the topology of an application or a service, so event correlation in those areas fails, not because it can’t work, but because it only works until something changes in the environment, at which time the rules have to be manually rewritten and/or re-validated. As we know, nothing in the Enterprise ever changes, right?

In the era of the cloud, the thinking is that application performance monitoring data is all that is necessary. Why bother with real-time monitoring of devices or containers when you already know the application is performing well?

In this cloud era, application performance data may be all you can get: depending on the contract you have with your cloud provider, you may not have access to any real-time device or container monitoring at all. It’s possible they don’t have it either, since many companies espouse the idea that there is no point in collecting data they can’t sell. The trap is set, however: sooner or later your application performance data will tell you that response time is in the unacceptable range, and without real-time monitoring data (or access to it), how will anyone be able to identify, much less correct, the problem?
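
To make the trap concrete, here is a minimal sketch with hypothetical numbers and thresholds: application-level data alone can detect the symptom, but there is nothing underneath it to distill into a cause.

```python
# Hypothetical APM response-time samples (ms) for a single transaction
apm_samples_ms = [180, 195, 2100, 2300, 2250]
SLA_MS = 500  # assumed acceptable-response threshold

if max(apm_samples_ms) > SLA_MS:
    print("ALERT: checkout response time outside the acceptable range")
    # Diagnosis stops here: with no device or container metrics
    # (CPU, memory, I/O, network) behind these numbers, there is
    # nothing further to distill the symptom into a cause.
```
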

There is a way to resolve all of this, because distillation of the raw data means you don’t have to discard or ignore useful data, yet you can still present a meaningful visualization that works for each audience without overwhelming them, which is, by definition, what a dashboard should do, isn’t it?

In my next entry, I’ll go into detail about the key to distillation, and how it can be done very easily.
