Time for a New View of Data Management

Database management is in a crisis, one that's only partly recognized. The horrors of data integration may be well known, but they're only the tip of a much larger iceberg: schema complexity. Programmers, system architects, and database administrators, whether they focus on design or on operation, all find their jobs made immeasurably harder by the boggling complexity of relational schemas.

As schema diversity explodes, the pure relational model is collapsing under its own weight. We must replace it with a radically different view of data management, which I'm calling DBMS2, for database management system services. The key aspects of DBMS2 include the following:

Task-appropriate data managers. Just use whatever is cheapest and simplest for each set of applications. Possible choices include but are not limited to cheap online transaction processing (OLTP) DBMSs, high-end OLTP DBMSs, data warehouse appliances, XML-based document stores, highly distributed and/or small-footprint DBMSs, in-memory systems without their own persistent storage, and cross-corpus indexers without their own storage.

Drastic limitations on relational schema complexity. Relational schemas shouldn't go far beyond two simple models: master-detail for transactions, and hypercubes/star schemas for analytics. Anything inherently more complex is, with rare exceptions, better handled via the schema flexibility of XML. If you need to access data from a legacy application that violates these precepts, do so via XML-based Web services.

Both XML-based and relational information integration. Eventually, most DBMS2 data integration will be done via XML. But relational enterprise information integration will long have a role to play, such as connecting core OLTP and data warehouse systems.
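To make the "simple models" point concrete, here is a minimal sketch of a master-detail transaction schema, using Python's built-in sqlite3 module as a stand-in for a cheap OLTP DBMS. The table names, column names, and sample rows are hypothetical, chosen only for illustration; nothing here is tied to a particular product.

```python
import sqlite3

# Hypothetical master-detail schema: one master row per order,
# one detail row per line item. All names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer TEXT NOT NULL
    );
    CREATE TABLE order_lines (
        order_id INTEGER NOT NULL REFERENCES orders(order_id),
        line_no  INTEGER NOT NULL,
        sku      TEXT NOT NULL,
        quantity INTEGER NOT NULL,
        PRIMARY KEY (order_id, line_no)
    );
""")

# Made-up sample data: one order with two line items.
conn.execute("INSERT INTO orders VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO order_lines VALUES (?, ?, ?, ?)",
    [(1, 1, "WIDGET", 3), (1, 2, "GADGET", 5)],
)


def order_total_quantity(order_id):
    """Sum the detail-row quantities for one master row (0 if none)."""
    row = conn.execute(
        "SELECT SUM(quantity) FROM order_lines WHERE order_id = ?",
        (order_id,),
    ).fetchone()
    return row[0] or 0
```

Two tables and one foreign key cover the whole transaction. Under the precepts above, anything that doesn't fit this shape or a star schema would be a candidate for XML rather than for more tables.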

DBMS2 is the antithesis of much current database theory. Rather than fighting modularity, DBMS2 embraces it. Rather than gathering administrative tasks in one huge hairball, it spreads them across many simple systems. Above all, unlike the Oracle pipe dream of a grand unified enterprise relational database, DBMS2 is a pragmatic, realistic continuation of what every large enterprise is doing today.

The need and opportunity for DBMS2 are driven by two overlapping trends: platform change and schema explosion. For starters, DBMS2 depends on the increasing availability of XML and Web services technology. It will be years before XML-based data-manipulation languages are sufficiently robust to handle the requirements of DBMS2, but those developments will happen, and most big software vendors will provide strong support for them in a timely manner.

Beyond that, one of the biggest reasons for embracing DBMS2 is a flood of low-cost alternatives to traditional DBMSs. For most enterprises, relational OLTP is approaching commodity status. Microsoft SQL Server is following Oracle up the food chain, while MySQL (which is even slated for SAP certification in two to three years, or maybe less) nips at Microsoft's heels.

Even more important, there's been an explosion in ultracheap OLAP technologies, both in-memory and in appliance formats. Most of these have very simple indexing schemes -- some have no indexes at all -- which yields huge total-cost-of-ownership advantages in storage costs and administrative overhead alike.

The opportunity provided by these fledgling technologies might seem balanced by obvious risks. But before long, embracing them will be the only viable choice. The primary reason is schema explosion, on multiple fronts.

First, there's an explosion in profiles. CRM customer profiles (ideally with full Web site click-trail data), vendor profiles, security-oriented user profiles, you name it -- in almost all cases, the available information, and types of information, vary from one profilee to the next. Mobile/pervasive devices just worsen the problem, adding complexity in terms of location, availability and form factor. Centralized, pre-DBMS2 master data management will never succeed.
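As a rough illustration of why profiles resist a fixed relational layout, the sketch below stores two customer profiles as XML and lists which kinds of information each one carries. The element names, attribute names, and sample values are all invented for this example; it uses only the xml.etree.ElementTree module from Python's standard library.

```python
import xml.etree.ElementTree as ET

# Two hypothetical profiles. The second carries device and click-trail
# data that the first lacks, and lacks the e-mail address the first has.
PROFILES_XML = """
<profiles>
  <profile id="c1">
    <name>Alice</name>
    <email>alice@example.com</email>
  </profile>
  <profile id="c2">
    <name>Bob</name>
    <device form-factor="handheld" location="Boston"/>
    <click-trail><page>/pricing</page><page>/signup</page></click-trail>
  </profile>
</profiles>
"""


def fields_by_profile(xml_text):
    """Map each profile id to the sorted tags of its direct children."""
    root = ET.fromstring(xml_text.strip())
    return {
        profile.get("id"): sorted(child.tag for child in profile)
        for profile in root.findall("profile")
    }
```

Calling fields_by_profile(PROFILES_XML) shows that the two profiles share only the name field. A relational design would need either a wide table full of nulls or a new table for each type of information, while the XML version simply omits what a given profilee doesn't have.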

Second, text documents are becoming an ever bigger part of IT, be they complex forms and contracts, maintenance manuals, health records, Web marketing content or just e-mail. Documents are commonly unpredictable in structure, and sometimes in their authoring and editing metadata as well. And the ultimate solutions to making text search work will depend on further schema extension and variability, in a number of respects.

Finally, IT needs to be infused throughout with representations of trust. Security, compliance, missing data -- they all ultimately require some formalized hierarchy of trust. So do the multiple uncertainties of search engine results, document author reliability, planning forecasts and the like. The final resolution of these issues will require schema complexity beyond what relational systems can realistically handle.

Should you throw out Oracle and DB2? Hardly. But maybe you should reduce your reliance on them. The move to DBMS2 lets you exploit a variety of database technology advances from a variety of vendors. For specific product ideas, see my blog at www.computerworld.com/blogs/monash.

Curt A. Monash is a consultant in Acton, Mass., and also blogs regularly on Computerworld.com. You can reach him at curtmonash@monash.com.
