Thursday, January 17, 2008

Who is to blame for Oracle patch application failures... Is that even a real question?

I just read a revealing blog entry by Jaikumar Vijayan about how two-thirds of DBAs fail to apply Oracle’s quarterly critical patch updates (CPUs) in a timely and accurate manner, and about the subsequent objections from some quarters that this shows DBAs in a bad light. Wow, I gotta say I sure don't agree with the latter viewpoint!

Honestly, I don’t think the article quite implies that DBAs are to blame for this. However, that is indeed my assertion. I have been in databases for over 16 years and, as a Systems DBA, I consider myself and others like me to be the primary custodians of our companies’ (and clients’) most important asset: data. To imply that a lack of corporate security standards in this area, or a lack of visible threats such as Slammer, is sufficient grounds for ignoring patches is hare-brained in my humble opinion. I consider it my job to come up to speed on the latest patch updates, document the decision to apply the patch (or not!), educate and work with the security auditors (internal and external), if any, and ensure the databases I’m responsible for are totally secure. Not doing that is akin to saying, “oh, it’s okay for me not to test my database backups unless some corporate mandate forces me to do so.” Applying database patches, like auditing backups, is one of the most basic job functions of a DBA. Claiming ignorance or being over-worked in other areas doesn’t count as an excuse, especially after your customers’ credit card data has been stolen by a hacker!

A comment on the above story by Slavich Markovich (one of the people quoted there) says that DBAs are not lazy and goes on to state that they just have too many things they are working on. Since when has a DBA not been “over-worked”? Way too often, that’s been part of the DBA’s job description – even though it doesn’t have to be that way. I know many fine DBAs who, regardless of their work-load, corporate politics or the prevailing state of ignorance, don’t let themselves be stopped from applying the relevant CPUs. They research the patch-set, educate their peers, security groups and auditors, evangelize the benefits to application managers, arrange to test it in non-production and coordinate the entire process with change control committees to ensure its success.

Markovich lists several items in his comment that supposedly deter DBAs from adhering to a regular patch cycle. I’m taking his points one by one and responding to each (my response follows the “=>”):
1. The need to test all applications using the database is a heavy burden => yeah? so is coordinating and testing backups, deal with it!

2. Oracle supports only the latest patchsets => Oracle’s patch-set frequency and support policies tend to vary from version to version. Rather than making bland, generic statements, log on to MetaLink, do the relevant research, see which patch-set applies to which databases in your environment and go make it happen!

3. The lack of application vendor certification of the CPUs => sure, certain CPUs sometimes impact application functionality, but the majority don’t. Regardless, that’s what testing is for – to ensure your application functionality is not negatively impacted by a patch-set. If the testing shows no adverse impact, do a limited deployment and then move to a full deployment of that patch-set. CPUs are released by Oracle almost every quarter, so don’t expect all third-party vendors to update their certification matrix quite so rapidly. And BTW, most application problems that are suspected to have been caused by an Oracle CPU can be traced back to an error in the patch application process (why bother reading all the patch-set release notes, right?).

4. The simple fact that it takes a huge amount of work to manually shut down the database and apply the patch in an organization running hundreds if not thousands of instances => if I had a nickel for every time I heard this… dude, ever heard of run book automation for databases? If you are still stuck dealing with manual task methods and scripts, you only have yourself to blame. (Do yourself a favor, check out the Data Palette automation platform.)

5. For production-critical databases you have to consider maintenance windows which might come once a year => ever heard of rolling upgrades/patches? If your environment isn’t clustered or lacks a standby environment, work with the application and change control committees to negotiate more reasonable maintenance windows or, even better, build a business case for a Data Guard implementation. Remember, to be a successful DBA in today’s complex IT environment, you can’t just be a command-line expert; you need to possess good communication skills, not to mention salesmanship. Use those skills to jockey for reliable service levels – which include a well-patched and secure database environment. Don’t just attempt to live within the constraints imposed on you.

6. The lack of understanding by some IT security personnel of the severity of the problem simply does not generate enough pressure in the organization => quit blaming others, you are responsible for those databases, not some arcane person with a fancy title that has “security” somewhere in there.

7. All in all, I know of companies that analyze and deploy CPUs as soon as three months after release but those companies are very few and usually have budgets in the millions for such things… => also, quit generalizing. There are many DBAs who work for small, private companies with minuscule budgets and take their Oracle CPUs very seriously, and vice versa.
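
For anyone who wants a starting point, the "test in non-production, then a limited deployment, then a full deployment" sequence from point 3 can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Database` record, the environment labels, the `pilot` flag and the wave names are all hypothetical, and nothing here talks to Oracle or applies a patch. It just orders an inventory into rollout waves that you can take to change control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Database:
    """One entry in a hypothetical database inventory."""
    name: str
    environment: str      # assumed labels: "dev", "test" or "prod"
    pilot: bool = False   # True if picked for the limited production rollout


def plan_rollout(databases):
    """Group database names into ordered waves for a staged patch rollout."""
    waves = {
        "1-non-production": [],
        "2-limited-production": [],
        "3-full-production": [],
    }
    # Sort by name so the plan is stable from one run to the next.
    for db in sorted(databases, key=lambda d: d.name):
        if db.environment != "prod":
            waves["1-non-production"].append(db.name)
        elif db.pilot:
            waves["2-limited-production"].append(db.name)
        else:
            waves["3-full-production"].append(db.name)
    return waves


# Made-up inventory, for illustration only.
inventory = [
    Database("hr_dev", "dev"),
    Database("hr_prod", "prod", pilot=True),
    Database("billing_prod", "prod"),
    Database("billing_test", "test"),
]

for wave, names in plan_rollout(inventory).items():
    print(wave, names)
```

The point of keeping the plan as plain data is that it doubles as documentation: the same structure can record which databases were patched in which wave, or why one was deliberately skipped.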

The truth is, a secure, stable and scalable database environment has very little to do with the size of the budget and everything to do with astute DBAs who think outside the box and take charge of their environment.


Anonymous said...

Useful view point. I'm going to print and share with my DBA manager. I joined our company a few months ago and have been struggling to get the rest of my team to take CPU patching seriously. They seem to ignore and get away with it. That would have been cause for termination at my previous job.

Sreekanth Chintala said...

just got introduced to the blog via Stratavia. Well known DBA being the co-founder tells a lot about the products. I will be taking a look at what the solutions are.
Sreekanth Chintala
Sr. Database Architect/Engineer
Dell. Inc.