Friday, December 29, 2006

A Layperson’s Guide to IT Automation Market Speak

My company was a sponsor at the recent Gartner Data Center conference in Las Vegas (http://www.gartner.com/2_events/conferences/lsc25.jsp). Sure enough, the words “IT automation” were all-pervasive at the event. There seemed to be so many companies crawling out of the woodwork, all vying for total market dominance. It is energizing to see that automation has truly gone beyond being just a buzzword. However, when I spoke to many of the folks who had attended the show, one common frustration seemed to exist across the board: the amount of irrelevant, almost nonsensical, marketing chatter in this space.

Imagine you are a senior IT manager (pick your own fancy title) and you are interested in enhancing the level of operational maturity in your environment. (Nice!) Your organization is already following ITIL standards closely and you are keen on seeing what else is out there that might help you. So you attend the next IT/data center optimization show or even simply do a Google search and… boom! A ton of offerings with terms ranging from the simple (“lights out monitoring”) to the arcane (“autonomics”, “adaptive infrastructure”, “run book automation”) hit you between the eyes. (Where, oh where, were many of these companies and offerings just a year or two ago?) Almost all of them are flush with venture capital and slick marketing, no doubt, but confusing as heck! So you go to your favorite analyst’s website and guess what, it’s a total letdown. There is no single comprehensive coverage of all these offerings and in fact, each analyst seems to have a fetish for a certain buzzword or two. And pitifully, some of the big analysts are still caught up in old world IT agendas, blissfully unaware of the progress in automation; if anything, they are scrambling to come up to speed just as quickly as the rest of us mere mortals.

After much time and research, I noticed each automation vendor and offering is focused on a horizontal or vertical subset of IT. However, from their marketing collateral, it appears as though each one does pretty much everything under the sun, which obviously means a lot of redundancy. So what’s an IT manager to do? Who do they believe? Where does one offering end and the other start? Obviously, it’s not practical to try and evaluate each and every vendor and attempt to do a POC to determine fit…

Here is a guide I built to navigate my way through the different kinds of offerings out there and make the right recommendations for my clients. Hopefully you will agree that it is simple to grasp (and without the usual glib marketing speak!). Based on all the companies I witnessed during the conference, I was able to categorize companies and their offerings into six clean “buckets”.

1. The first category is caveman-style “script-based automation”. Although it is very widespread, it is the dumbest way to attempt to automate anything because (as I have mentioned in prior postings) raw scripts are difficult and time-consuming to write, maintain and deploy, especially across large server farms.

2. The second category is “simple repair automation” – the kind brought about by GUI monitoring and administrative tools such as BMC Patrol, Quest Central, Embarcadero dbArtisan / Performance Center and Oracle Enterprise Manager (OEM). Again, as I have mentioned in prior postings, these GUI tools are awesome when it comes to monitoring and performing one-off tasks, but ill-suited for carrying out even the simplest of repairs in a consistent way across multiple servers. The main problem with them is that they make specific assumptions about the underlying environment which may not be applicable to all environments. Since they have “canned repair logic”, they do not accommodate custom business rules or IT policies very well. Some of these tools do allow scripts to be executed when certain conditions are detected. However, this approach runs into most, if not all, of the problems associated with scripts mentioned above. As such, the so-called automation capabilities within these products are of limited practical utility.

3. The third category comprises “automated provisioning and configuration management” tools. While automating provisioning and configuration management is useful, these activities comprise a rather slim portion of the average IT administrator’s workload. There are several other tasks emanating from user requests, environmental changes (unanticipated changes), change control requests (anticipated changes), software releases and alerts that these tools are not meant to deal with. Such limited utility means customers are forced to introduce a bunch more tools in their environment (to automate tasks besides provisioning). Not ideal.

4. So a fourth category, aptly named “run book automation” (heavily “supported” by Gartner), has come into existence. These tools offer a framework for automating activities driven by custom business logic via a workflow GUI and pre-built integration into commercial monitoring and ITIL packages (especially incident management and ticketing). However, the underlying business logic code often has to be written by the IT administrators themselves. Thus, while this approach is superior to script-based automation by centralizing script logic onto a workflow GUI, it still suffers from the biggest deficiency of scripts, i.e., it requires an expert administrator to code the business logic into the workflow engine, thereby competing with regular day-to-day tasks for the administrator’s time. If the administrator doesn’t have the bandwidth to properly instrument and deploy these products, they end up as shelfware (which is more than likely, since administrator bandwidth is the very thing these tools are meant to free up, yet in the short run they consume more of it than they provide). Companies that are usually successful in leveraging run book automation are ones that have already attained significant operational maturity even prior to deploying these tools. Companies that are still struggling with daily fires are better off spending time to shore up their internal processes and gradually looking to the other categories here to attain efficiencies.

5. Given the drawbacks of run book automation, another category that is fast emerging as a more viable solution is “domain knowledge automation”. Products in this category offer a run book automation platform with built-in expertise for one or more horizontal areas in the IT stack, such as server administration, network administration, application administration and/or database administration. Via this approach, products eliminate much of the problem with script-based automation and, more importantly, reduce the need for administrators to code custom logic from scratch. StrataVia’s Data Palette™ is a good example of products in this category. Data Palette represents domain expertise in database automation by providing a library of standard operating procedures (SOPs) pertaining to common DBA tasks. The entire SOP code-base is exposed in an open-source manner such that administrators can tweak the business logic code to fit their environments, if the pre-existing SOPs aren’t suited for the task. Products in this category support several scripting environments so that the administrator does not have to learn a new language. In order to build domain knowledge and awareness, products in this category also need to have the capability to detect external events, along with decision automation prowess to deal with specific changes in event states. This capability is useful for triggering pre-configured SOPs at opportune times without human intervention.

6. The sixth and last category is self-managing software, or what is often referred to as “autonomics”. This is true automation nirvana. Products in this category are aware of themselves, their environments and the interplay thereof. They use this awareness to optimally install, configure, maintain, update and recover themselves. Much of this functionality can be built into products to make them autonomic from the ground up or layered in via an external service to make existing software products autonomic. For instance, Data Palette uses the latter approach to make databases autonomic. The good news is that most of the large vendors like IBM, Microsoft and HP are heading in this direction, even though they may label this functionality differently. (IBM coined the term autonomic computing, HP calls it adaptive infrastructure and Microsoft calls it the Dynamic Systems Initiative.) However, the bad news (not surprisingly) is that most of the large vendors focus on their own products and their partner ecosystems and, worse, take a low-level hardware or operating system level approach to autonomics. In other words, they are mostly about different components of the IT stack collaborating with each other which, in my opinion, just doesn’t work, because the level of collaboration required is impractical and doesn’t happen. Even if it does, it doesn’t keep up consistently past a release or two; there are just too many moving parts. Of the six categories I have listed, autonomics is probably in the most nascent stage.
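
To make the event-driven triggering described under category 5 a little more concrete, here is a minimal Python sketch of a library of SOPs keyed by event type. It is purely illustrative and says nothing about how Data Palette or any other product is actually implemented; the names Event, SopLibrary and add_datafile, as well as the event kinds and the database name, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Event:
    """An external event reported by some monitoring source (hypothetical shape)."""
    kind: str    # e.g. "tablespace_full"
    target: str  # e.g. the name of a database or server


# An SOP is modeled as a plain function that receives the event and
# returns a short description of the action it took.
Sop = Callable[[Event], str]


class SopLibrary:
    """Maps event kinds to pre-configured SOPs that administrators can override."""

    def __init__(self) -> None:
        self._sops: Dict[str, Sop] = {}

    def register(self, kind: str, sop: Sop) -> None:
        # Registering the same kind again replaces the earlier SOP, which is
        # how a site-specific tweak would override a shipped default.
        self._sops[kind] = sop

    def handle(self, event: Event) -> str:
        sop = self._sops.get(event.kind)
        if sop is None:
            # No SOP is configured for this event, so leave it for a human.
            return f"escalate: no SOP for {event.kind} on {event.target}"
        return sop(event)


def add_datafile(event: Event) -> str:
    # Placeholder for real repair logic; a real SOP would act on the target.
    return f"added datafile on {event.target}"


library = SopLibrary()
library.register("tablespace_full", add_datafile)

print(library.handle(Event("tablespace_full", "orders_db")))
print(library.handle(Event("listener_down", "orders_db")))
```

The point of the sketch is the split of responsibilities: the vendor ships the library and default SOPs, the administrator only overrides the entries that do not fit the environment, and anything without an SOP still falls back to a person.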

Hopefully this six-bucket categorization helps the average customer understand what’s actually out there, so that even if a vendor uses different marketing speak (which is more than likely), they can still read between the lines and place each product in the right bucket. Furthermore, it is my hope that these six buckets will let customers take a step back, see which category is most prudent for their organization to evaluate, and focus on offerings in that category, allowing them to move faster towards their preferred area of IT optimization.

Just one final word of advice to vendors – don’t even bother appeasing all the big analyst groups out there and rigging your marketing message. As long as your customers know where your offering fits in the above categories, it results in a more mature market for IT automation. Such a market has clean segmentation as its primary attribute, allowing different products to fit into appropriate segments, and allowing each to play nicely with others in complementary categories. In the end, this approach makes it a lot less painful for customers to find what they are looking for, thereby accelerating user evaluation and adoption. Vendors that do not gain this realization and hide their offerings behind a cloud of ambiguity are not only doing themselves a disservice, but also holding the market back by a decade.