The Database Diet

Best practices in database archiving help maintain healthy disk-space capacity and prevent performance problems.

Like waistlines, databases almost always grow much larger than their owners ever imagined. Instead of too many carbs, it's a regular diet of stodgy and unnecessary transactional information that leads to database obesity. Serious health problems can result, such as disappearing disk space, poor performance and screaming users upset about slow access rates or queries timing out.

"Our overweight database was months away from crashing due to exceeding our production disk-space capacity," says Larry Cuda, global data archiving and migration project leader at Kennametal Inc. in Latrobe, Pa. "Management determined that we could no longer just keep throwing more disks at the problem."

His SAP database was swelling at a rate of 27GB per month until Kennametal pared it down using eCONtext from Ixos Software AG in Grasbrunn, Germany. Transactions that used to take six seconds now take one, and the company saves an estimated $700,000 annually in hardware acquisition costs alone, according to Cuda. The production database now holds steady at a trim 2TB, with another terabyte residing in rapid-access archives. The company runs its SAP ERP applications and its Oracle 8.1 database in a 64-bit HP-UX environment.

With so many competing production demands, and differing U.S. and international data retention regulations to consider, archiving database information is never a quick fix. Companies must decide what they should archive, how they should go about it, which tools are available and which best practices apply.

Losing Wait

According to Meta Group Inc., data is growing at a rate of 125% per year, yet up to 80% of this data remains inactive in production systems, where it cripples performance. "To compound this problem, many enterprises are in the midst of compliance initiatives that require the retention of more data for longer periods of time, as well as consolidation projects that result in significant data growth," says Charlie Garry, senior program director at Stamford, Conn.-based Meta Group.

A laundry list of regulations makes any archiving endeavor an extremely complex affair: The Sarbanes-Oxley Act, SEC Rule 17a, the Health Insurance Portability and Accountability Act and a host of other rules have transformed information management into a minefield of potential liability.

The legal ramifications of not having a way to archive information from databases can be grim. But there are also production reasons for formulating and activating an archiving strategy rapidly. Apart from running out of disk space as Kennametal experienced, companies report problems such as total system outages when the database requires too much processing, backup failures when there's too much data to back up in the available window, and transactions timing out as they search through millions of records.

At Southwest Gas Corp. in Las Vegas, inventory tables contained 5 million rows and a human resources table included 60 million rows. "The more data you have in production, the slower the database [runs]," says Luca Cotrone, a systems analyst at Southwest Gas. "Users complained of queries taking a long time."

Cotrone implemented Applimation Archiver from Applimation Inc. in New York for an Oracle8i database that was growing at a rate of 1GB per month. The database has now stabilized at about 100GB. Archiving of one general-ledger table, for example, saved 18GB. Searches are down from several minutes to a few seconds.

Unlike Kennametal, which sets policies for archive automation, Southwest Gas relies on manual archiving. Each month, a database administrator spends 30 minutes selecting files to archive. The decision is based on the age of the files in the inventory application database. For example, those that are older than 30 months could be moved from the production system to the less expensive Applimation data store. These files can be accessed by the user transparently from the original application.
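An age-based rule like the one described above is simple to express in code. The following Python sketch is illustrative only and is not how Applimation Archiver works internally; the record layout, the `last_updated` field and the function names are assumptions made for the example.

```python
from datetime import date


def months_between(earlier: date, later: date) -> int:
    """Return the number of whole calendar months from earlier to later."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    # If the day of the month has not been reached yet, the last month is incomplete.
    if later.day < earlier.day:
        months -= 1
    return months


def select_for_archive(records, today, cutoff_months=30):
    """Pick records whose age exceeds the cutoff (30 months, as in the article).

    Each record is assumed to be a dict with a hypothetical 'last_updated' date.
    """
    return [
        r for r in records
        if months_between(r["last_updated"], today) > cutoff_months
    ]
```

In practice, the selection would run as a query against the production tables rather than over records held in memory, and a DBA would still review the candidates before anything is moved.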

Tape Not the Answer

Running a bulging database is rarely a desirable option, and in most cases neither is purging onto tape -- once a common practice. With purging, recovery must be performed manually and is extremely time-consuming. "Once you purge Oracle, users no longer have access to the data," says Lois Hughes, a senior business systems analyst at Tektronix Inc., a test measurement and monitoring business in Beaverton, Ore. "International finance regulations also meant that legally, purging would have to be paralleled by archiving."

Since the company operates in 27 countries, decisions about what to archive in its 120GB database were very complex. Take the case of accounts receivable, just one of dozens of applications in operation: China requires retention of data for 15 years; Brazil, 10; Italy, seven; and the U.S. only three. On top of language and data-retention issues, the system also had to cope with different character sets for Asia.
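To see how per-country retention periods translate into a purge decision, consider this minimal Python sketch. The four retention figures come from the accounts receivable example above; the country codes, the function names and the assumption that the retention clock starts on the transaction date are illustrative, and real regulations are considerably more nuanced.

```python
from datetime import date

# Retention periods in years for accounts receivable data, as cited in the article.
# The two-letter country codes are an illustrative choice.
RETENTION_YEARS = {"CN": 15, "BR": 10, "IT": 7, "US": 3}


def earliest_purge_date(country: str, transaction_date: date) -> date:
    """Return the first date on which a record could leave the archive.

    Assumes the retention clock starts on the transaction date.
    """
    target_year = transaction_date.year + RETENTION_YEARS[country]
    try:
        return transaction_date.replace(year=target_year)
    except ValueError:
        # Feb. 29 in a leap year maps to Feb. 28 when the target year is not a leap year.
        return transaction_date.replace(year=target_year, day=28)


def may_purge(country: str, transaction_date: date, today: date) -> bool:
    """True once the retention period for the record's country has elapsed."""
    return today >= earliest_purge_date(country, transaction_date)
```

The same transaction date produces very different outcomes: a record from March 2001 would clear the U.S. threshold in March 2004 but would have to be kept until March 2016 under the Chinese rule.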

Tektronix archives transactional data every three months using LiveArchive from OuterBay Technologies Inc. in Campbell, Calif. First, information is recategorized -- reduced in priority within the existing Oracle instance -- then it's moved to a less expensive infrastructure. The users, however, are able to access all data from one screen, without headaches.

OuterBay is one of four primary contenders for a piece of the $1 billion archiving market. According to Gartner Inc., Princeton Softech Inc. in Princeton, N.J., leads the pack with more than 50% of the market. Like second-place OuterBay, it supports databases from IBM, Informix Corp., Sybase Inc., Microsoft Corp. and Oracle. Applimation focuses on Oracle, while Ixos Software deals exclusively with SAP AG and Siebel Systems Inc.

Archiver Beware

IT managers taking on archiving projects face their fair share of problems. Hughes reports several bugs in Oracle purging functions that had to be addressed, while Cotrone ran into trouble caused by differences between Oracle8i and 9i. His system runs on Oracle8i, but the archive database runs 9i in a Linux server instance within an IBM mainframe. Each successive evolution of Oracle and its associated applications appears to add more complexity that could scuttle a project.

For example, the Oracle 11i E-Business Suite adds 200 new modules and 17,500 tables to the application infrastructure. The same holds true for other database vendors.

"We couldn't export files from our 8i production database into the 9i archive, as there are certain tables you can't send across," says Cotrone. "Fortunately, our inventory application doesn't have these tables, so we were able to archive it while we complete a migration of everything else to 9i."

Kennametal's Cuda reports that he got his project under control only when he moved from a technology-focused view of archiving to a business process/legal approach and after he had plotted out all 223 data objects within his SAP database. This showed him the dependencies that existed among data types and highlighted exactly how to retire data to minimize risk. For example, invoices shouldn't be archived until the corresponding shipping and delivery documentation denotes a closed transaction. SAP, says Cuda, has mechanisms built in that prevent retirement of open transactions.
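The dependency rule Cuda describes, in which a document can be retired only after everything it depends on is closed, can be sketched in a few lines of Python. This is a hypothetical illustration, not a representation of SAP's built-in mechanism; the `Document` class, its fields and the `can_archive` function are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    kind: str                 # e.g. "invoice", "shipping", "delivery"
    closed: bool              # True once the business transaction is complete
    depends_on: list = field(default_factory=list)  # ids of related documents


def can_archive(doc_id, documents, _visiting=None):
    """Return True only if the document and everything it depends on are closed.

    documents maps doc_id to Document. A missing dependency counts as open.
    """
    visiting = _visiting if _visiting is not None else set()
    if doc_id in visiting:
        # Already checked (or being checked) further up the chain; avoids looping on cycles.
        return True
    doc = documents.get(doc_id)
    if doc is None or not doc.closed:
        return False
    visiting.add(doc_id)
    return all(can_archive(dep, documents, visiting) for dep in doc.depends_on)
```

A closed invoice that points to an open delivery document would be held back, while a closed financial document with no dependencies would pass immediately, which is why such documents make good early candidates.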

His advice for any archiving project is to first head for the easy pickings. "Financial documents are striking in that they have no dependencies," says Cuda. "Attacking such low-hanging fruit not only gives you significant data recovery, it also gives your team a sense of victory and [it] highlights to management and users that archiving is beneficial to the system."

ILM Revolution

Not surprisingly, online archiving has become a major element in vendor information life-cycle management (ILM) strategies. EMC Corp. in Hopkinton, Mass., has partnered with OuterBay to integrate LiveArchive with EMC's ControlCenter storage management tools as part of its ILM suite. Other vendors are following suit, and the trumpeting about ILM is reaching a fever pitch.

"[ILM] will result in the optimal management of information throughout its life, from creation and use to archiving and disposal," says Mark Lewis, executive vice president for open software at EMC. "It isn't just hype; it's a revolution." Behind the fanfare, EMC talks about a road map to achieving true ILM functionality.

"The ILM buzz is similar to that surrounding virtualization 18 months ago," says Steve Duplessie, an analyst at Enterprise Storage Group Inc. in Milford, Mass. He estimates that it will be at least another 18 months before ILM moves beyond the hype and shows some merit in the real world. Until then, it might be best to evaluate archiving tools on their own merits.

Copyright © 2004 IDG Communications, Inc.
