Wednesday, January 28, 2009

Implementing a Simple Internal Database or Application Cloud - Part I

A “simple cloud”? That comes across as an oxymoron of sorts, since there’s nothing seemingly simple about cloud computing architectures. And what do DBAs and app admins have to do with the cloud, you ask? Well, cloud computing offers some exciting new opportunities for both Operations DBAs and Application DBAs – models that are relatively easy to implement and that bring immense value to IT end-users and customers.

The typical large data center environment has already embraced a variety of virtualization technologies at the server and storage levels. Add-on technologies offering automation and abstraction via service-oriented architecture (SOA) are now allowing these environments to extend those capabilities up the stack – towards private database and application sub-clouds. These developments seem most pronounced in the banking, financial services and managed services sectors. However, while working on Data Palette automation projects at Stratavia, every once in a while I come across IT leaders, architects and operations engineering DBAs in other industries as well who are beginning to envision how specific facets of private cloud architectures can enable them to serve their users and customers more effectively (while also absorbing the workload of colleagues who have left their companies amid the ongoing economic turmoil). I wanted to share here some of the progress in database and application administration with regard to cloud computing.

So, for those database and application admins who haven’t had a lot of exposure to cloud computing (which, BTW, is a common situation, since most IT admins and operations DBAs are dealing with boatloads of “real-world hands-on work” rather than participating in the next evolution of database deployments), let’s take a moment to understand what it is and what its benefits are. An “application” in this context refers to any enterprise-level app – both third-party (say, SAP or Oracle eBusiness Suite) and home-grown N-tier apps with a fairly large footprint. Those are the kinds of applications that get the maximum benefit from the cloud. Hence I use the term “data center asset”, or simply “asset”, to refer to any type of database or application. At times, however, I do resort to specific database terminology and examples, which can be extrapolated to other application and middleware types as well.

Essentially, a cloud architecture refers to a collection of data center assets (say, database instances, or just schemas, for more granularity) that are dynamically provisioned and managed throughout their lifecycle – based on pre-defined service levels. This lifecycle covers multiple areas, starting with deployment planning (e.g., capacity, configuration standards, etc.), provisioning (installation, configuration, patching and upgrades) and maintenance (space management, log file management, etc.), extending all the way to incident and problem management (fire-fighting, responding to brown-outs and black-outs) and service request management (e.g., data refreshes, app cloning, SQL/DDL release management, and so on). All of these facets are managed centrally, such that the entire asset pool can be viewed and controlled as one large asset (effectively virtualizing that asset type into a “cloud”).
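To make that lifecycle concrete, here’s a minimal sketch in Python of how the asset pool might be modeled. The names (LifecycleStage, Asset, the stage values) are purely illustrative assumptions on my part – not Data Palette’s actual interfaces:

    from dataclasses import dataclass
    from enum import Enum, auto

    class LifecycleStage(Enum):
        """Stages an asset moves through inside the cloud."""
        PLANNING = auto()         # capacity, configuration standards
        PROVISIONING = auto()     # install, configure, patch, upgrade
        MAINTENANCE = auto()      # space and log file management
        INCIDENT = auto()         # fire-fighting, brown-outs/black-outs
        SERVICE_REQUEST = auto()  # refreshes, cloning, SQL/DDL releases
        DEPROVISIONED = auto()    # archived and released

    @dataclass
    class Asset:
        """One entry in the centrally managed asset pool."""
        name: str
        asset_type: str           # e.g., "oracle", "sqlserver"
        service_level: str        # the pre-defined SLA tier
        stage: LifecycleStage = LifecycleStage.PLANNING

    # The "cloud" is then just the pool, managed as one large asset:
    pool = [Asset("ordersdb01", "oracle", "gold"),
            Asset("reportsdb02", "sqlserver", "silver")]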

Here’s a picture representing a fully baked database cloud implementation:

As I mentioned in a prior blog entry, multiple components have come together to enable a cloud architecture – but more on that later. For now, let’s look at the database/application-specific attributes of a cloud (you could read this as a list of requirements for a database cloud).

  • Self-service capabilities: Database instances or schemas need to be capable of being rapidly provisioned based on user specifications – by administrators, or by the users themselves (in selective situations where the administrators feel comfortable handing control directly to users). This provisioning can be done on existing or new servers (the term “OS images” is more appropriate, given that most of these “servers” would be virtual machines rather than bare metal) with the appropriate configuration, security and compliance levels. Schema changes or SQL/DDL releases can be rolled out on a schedule, or on-demand. The bulk of these releases, along with other service requests (such as refreshes, cloning, etc.), should be capable of being carried out by project teams directly – with the right credentials (think role-based access control; see the first sketch after this list).
  • Real-time infrastructure: I'm borrowing a term from Gartner (specifically, distinguished analyst Donna Scott's vocabulary) to describe this requirement. Basically, the assets need to be maintained in real time per specific deployment policies (such as development versus QA or Stage): tablespaces and datafiles created per specific naming/size conventions and filesystem/mount-point affinity (accommodating specific SAN or NAS devices, and different LUN properties and RAID levels for reporting/batch databases versus OLTP environments); data backed up at the requisite frequency per the right backup plan (full, incremental, etc.); resource usage metered; failover/DR occurring as needed; and finally, the asset archived and de-provisioned based either on a specific time-frame (specified at provisioning time) or on-demand – after the user or administrator indicates that the environment is no longer required (or after a specific period of inactivity). All of this needs to be subject to administrative/manual oversight and controls (think dashboards and reports, as well as the ability to interact with or override automated workflow behavior). The second sketch after this list shows what such policies might look like.
  • Asset type abstraction and reuse: One should be able to mix and match asset types. For instance, one can roll out an Oracle-only database farm or a SQL Server-only estate. Alternatively, one can span multiple database and application platforms, allowing the enterprise to better leverage its existing (heterogeneous) assets. The average resource consumer (i.e., the cloud customer) shouldn’t have to be concerned about which asset types or sub-types are included therein – unless they want to override the default decision mechanisms. The intra-cloud standard operating procedures take those physical nuances into account, thereby effectively virtualizing the asset type (see the third sketch below).
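To illustrate the self-service requirement, here’s a minimal sketch of a request handler gated by role-based access control. The roles, actions and ticket format are all assumptions for illustration – every shop would define its own:

    # Hypothetical role-to-permission mapping; purely illustrative.
    ROLE_PERMISSIONS = {
        "dba":       {"provision", "refresh", "clone", "ddl_release"},
        "developer": {"refresh", "ddl_release"},  # tasks admins delegate
        "analyst":   set(),                       # read-only consumer
    }

    def submit_request(role: str, action: str, spec: dict) -> dict:
        """Accept a self-service request only if the role allows it."""
        if action not in ROLE_PERMISSIONS.get(role, set()):
            raise PermissionError(f"role '{role}' may not request '{action}'")
        # In a real deployment this would be queued to a workflow engine;
        # here we simply return a ticket-like record.
        return {"action": action, "spec": spec, "status": "queued"}

    # A developer can request a test-environment refresh directly...
    ticket = submit_request("developer", "refresh", {"db": "ordersdb_qa"})
    # ...but provisioning a brand-new instance still requires the dba role.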
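For the real-time infrastructure requirement, the key idea is that policies – naming conventions, backup plans, inactivity limits – are declared once per deployment tier and enforced continuously. A sketch, with made-up policy values:

    # Hypothetical per-tier deployment policies (values are examples only).
    POLICIES = {
        "dev":  {"backup": "weekly full",         "raid": "RAID 5",
                 "idle_days_before_deprovision": 30},
        "qa":   {"backup": "nightly incremental", "raid": "RAID 5",
                 "idle_days_before_deprovision": 60},
        "prod": {"backup": "nightly full",        "raid": "RAID 10",
                 "idle_days_before_deprovision": None},  # never auto-reap
    }

    def maintenance_action(tier: str, idle_days: int) -> str:
        """Decide what the automated maintenance pass does with an asset."""
        policy = POLICIES[tier]
        limit = policy["idle_days_before_deprovision"]
        if limit is not None and idle_days >= limit:
            return "archive-and-deprovision"  # still subject to manual override
        return f"maintain (backup plan: {policy['backup']})"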
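And for asset type abstraction, the consumer simply asks for “a database” while the cloud’s standard operating procedures pick the platform-specific implementation behind the scenes. Again, a sketch with hypothetical provisioner classes, not anyone’s actual API:

    class OracleProvisioner:
        def provision(self, spec: dict) -> str:
            return f"provisioned Oracle schema '{spec['name']}'"

    class SqlServerProvisioner:
        def provision(self, spec: dict) -> str:
            return f"provisioned SQL Server database '{spec['name']}'"

    # The registry hides the physical nuances of each platform.
    PROVISIONERS = {"oracle": OracleProvisioner(),
                    "sqlserver": SqlServerProvisioner()}

    def provision(spec: dict, preferred_type=None) -> str:
        """Route to any available platform unless the consumer overrides."""
        asset_type = preferred_type or next(iter(PROVISIONERS))
        return PROVISIONERS[asset_type].provision(spec)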
The benefits of a database cloud include empowering users to carry out diverse activities in self-service mode, in a secure, role-based manner – which, in turn, enhances service levels. Activities such as having a database provisioned or a test environment refreshed can often take hours or even days. Those can be reduced to a fraction of their normal time – cutting latency, especially where there are hand-offs and task turnover across multiple IT teams. In addition, the resource-metering and self-managing capabilities of the cloud allow better resource utilization and avoid waste – improving performance levels, reducing outages and removing other sources of unpredictability from the equation.

A cloud, while viewed as bleeding edge by some organizations, is being viewed by larger organizations as critical – especially in the current economic situation. Rather than treating each individual database or application instance as a distinct asset and managing it per its individual requirements, a cloud model allows virtual asset consolidation – many assets can be treated as one, promoting unprecedented economies of scale in resource administration. So as companies continue to scale out data and assets, but cannot afford to correspondingly scale up administrative personnel, the cloud helps them achieve non-linear growth.

Hopefully, the attributes and benefits of a database or application cloud (and the tremendous underlying business case) are apparent by now. My next blog entry (or two) will focus on the requisite components and the underlying implementation methods that make this model a reality.

6 comments:

Anonymous said...

Neat concept. Something to think about and promote to my Director. When will part 2 of the article be out?

Unknown said...

Venkat,

What about security? That is probably the biggest deterrent to cloud computing. I think that concern is even more applicable to database clouds. How do you propose dealing with it?

Anonymous said...

Your database cloud concept covers a lot of ground and conveys much information. What kind of hardware do you suggest? Can it be commodity x86 servers running Windows or Linux?

Venkat Devraj said...

Shekhar - Thanks for your comment. Here I'm referring to private clouds only (not public cloud architectures) – built out and maintained within the company firewall. As long as companies take the usual precautions (the ones they take for any enterprise application rollout), the security concerns typically associated with public cloud offerings don't come into play.

Let me know if this doesn't address your question.

Cheers,
Venkat

Venkat Devraj said...

Part 2 of the blog should be out shortly (sorry, been a bit busy lately).

Venkat Devraj said...

Hello Anonymous,
Thanks for your question. Yes, low-cost/commodity x86 servers running Windows or Linux (or even both!) should work and, in most cases, are even encouraged – so that if any of these servers goes bad, you can unplug and toss it away and plug in a new machine to maintain availability, performance and scalability SLAs. Similarly, on the storage side, you can go with (relatively inexpensive) NAS devices (say, NetApp filers) or a combination of SAN and NAS farms for production versus non-production environments. You can even build links for overflow storage to public cloud infrastructures such as Amazon S3 or EMC Atmos. The cloud management software behind the scenes should manage the implementation for you. But more on that later...

Cheers,
Venkat