Looking at database clouds from both sides now

These three ingredients are virtualised physical resources, elasticity (the ability to scale resources for any service either up or down), and multi-tenancy, or the ability for a given service to support many clients without any client having any visibility into any of the others.

Since then, however, the term has come to be applied in two distinct ways: one having to do with how any Cloud service is architected internally, and the other having to do with how the service is perceived externally. This post considers how SAP HANA, Oracle Exadata, and IBM PureData address the internal or inside view on the one hand, or the external or outside view on the other.

I just spent a whirlwind five weeks attending four user conferences: those for TIBCO, Oracle, SAP, and IBM, all of which included the theme of database delivered as a Cloud service. Every variant of both ways of seeing Clouds, internally and externally, as they pertain to database technology, was on display at these conferences, and the contrasts were striking.

TIBCO's emphasis was on the real-time integration story, which highlights Big Data in motion: streaming data, complex event processing (CEP), and instant analytic decisioning. The other three focused on the database services needed to power such an environment. All three pushed a database Cloud story, yet the three were distinctly different.

One view of the Cloud holds that its internal architecture consists of vanilla server and storage resources made fungible by virtualisation, with all software services delivered on a multi-tenancy basis. SAP announced that HANA was being made ready, in a joint effort with Intel, for such a Cloud configuration, one that could support in-memory databases of up to 100 terabytes.

HANA is not just a database management system (DBMS), but also an application execution environment. In fact, much of its performance comes from executing application logic in the same physical environment as the database. SAP also says that HANA's architecture allows analytic and transactional applications to run in a fully optimised manner against the same data.

The downside of this scenario is that its design is best exploited by application software written in one of several approved scripting languages, so that it can be optimised and executed under HANA control. Thus, HANA is an integrated software combination of application execution environment and DBMS, into which both transactional and analytic applications may be integrated.

The software, including the DBMS, the application execution environment, and even the application itself, is in effect specialised for the HANA approach. This combination runs distributed across virtualised servers and storage in an Amazon Machine Image (AMI) environment. This model fits the inside view of Cloud.

Another view of the Cloud says that how it is physically deployed does not matter; what matters is that it is elastic, service-oriented, and supports multi-tenancy from an external perspective. Oracle offers specialised systems for database (Exadata), application execution (Exalogic), and analytics (Exalytics). 

Each system contains hardware specifically selected, optimised, and tuned for its role, and software that has been combined, optimised, and tweaked to make best use of the system in which it runs. Oracle argues that its approach to database multi-tenancy, which is internal to the database server, makes it much more powerful and easier to manage than alternatives that are offered at the application level.

The result is three specialised systems, each of which has elastic properties within the limits of its internal resources (though each may be expanded by daisy-chaining boxes using an InfiniBand connection). Used in combination, they can deliver the elastic scalability and service-oriented multi-tenancy of the outside Cloud view.
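The contrast Oracle draws between tenancy handled inside the database server and tenancy handled at the application level can be illustrated with a small sketch. This is not Oracle's implementation, nor any vendor's; it uses Python's built-in sqlite3 module purely as a stand-in, and the table, tenant names, and helper functions are all hypothetical. In the first half, tenants share one table and isolation depends on every query carrying a filter; in the second, each tenant has its own database, so isolation is structural.

```python
import sqlite3

# Application-level tenancy (illustrative): one shared table for all
# tenants. Isolation depends on every query filtering by tenant_id.
shared = sqlite3.connect(":memory:")
shared.execute("CREATE TABLE orders (tenant_id TEXT, item TEXT)")
shared.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("acme", "widget"), ("globex", "gadget")],
)


def app_level_orders(tenant_id):
    """Return one tenant's items from the shared table."""
    rows = shared.execute(
        "SELECT item FROM orders WHERE tenant_id = ? ORDER BY item",
        (tenant_id,),
    )
    return [item for (item,) in rows]


# Database-level tenancy (illustrative): each tenant gets a separate
# database, so a query cannot see another tenant's rows even when it
# has no filter at all.
tenant_dbs = {}


def tenant_db(tenant_id):
    """Return the tenant's own database, creating it on first use."""
    if tenant_id not in tenant_dbs:
        db = sqlite3.connect(":memory:")
        db.execute("CREATE TABLE orders (item TEXT)")
        tenant_dbs[tenant_id] = db
    return tenant_dbs[tenant_id]


tenant_db("acme").execute("INSERT INTO orders VALUES ('widget')")
tenant_db("globex").execute("INSERT INTO orders VALUES ('gadget')")


def db_level_orders(tenant_id):
    """Return one tenant's items from that tenant's own database."""
    rows = tenant_db(tenant_id).execute("SELECT item FROM orders ORDER BY item")
    return [item for (item,) in rows]
```

In the shared-table version, a single forgotten WHERE clause exposes every tenant's rows; in the per-tenant version, the separation is enforced below the application, which is the manageability argument in miniature.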

At its conference, IBM spoke of PureData, an approach that seems a sort of hybrid of the two. IBM PureData systems are the latest in a family of products called PureSystems. The others are called PureFlex (an elastic preconfigured Cloud infrastructure platform) and PureApplication (an elastic preconfigured application platform, including WebSphere technology, that comes with "patterns" to accelerate development and deployment of a range of applications).

PureData comes in three forms: one for transactions, one for analytics, and one for what IBM calls "operational analytics". PureData for Transactions is a configuration of DB2 pureScale (the shared disk clustered transaction-oriented flavour of DB2 LUW) with system and storage components configured and tuned to optimise throughput.

PureData for Analytics is basically a rebrand of Netezza. PureData for Operational Analytics is DB2 Extended Server Edition (the shared nothing clustered flavour of DB2 LUW focused on data warehousing), again with system and storage components configured and tuned to maximise performance. IBM offers multi-tenancy support through the Multi-Tenant Server, which handles tenancy for Web applications as well as the DB2 databases they use. 

In effect, PureData systems are built from software and hardware that you could buy and configure separately, but that IBM has preconfigured and optimised for you in the factory. Like Exadata, PureData systems require almost no setup and can be deployed in ways that exhibit elastic properties. The PureSystems products, including PureData, exhibit elements of both the inside and outside views of Cloud.

So, with SAP HANA, we get software at multiple levels that is largely specialised for the HANA architecture, but that can run on a wide range of off-the-shelf x86-based systems. With Oracle's Exadata, Exalogic, and Exalytics, we get software and hardware that are specialised to work with each other, integrated and optimised at the factory.

With IBM's PureData, we get software and hardware that are integrated and optimised at the factory, but with no unique "secret sauce". Does Cloud philosophy matter? Which side works best in practice, both in terms of performance and ongoing manageability: inside or outside? Or both? That is the ultimate question.

Posted by Carl Olofson, Research Vice President, IDC

Copyright © 2012 IDG Communications, Inc.
