ILM: Putting the pieces together

Users look to fully integrate and automate information life-cycle management technologies

Information life-cycle management (ILM) holds the promise of automating storage from the moment data is created by business applications to the time it's deleted from digital tapes stored in underground vaults. But the technology may never realize its full potential without the automation of business rules and processes, or standards to tie heterogeneous storage-area networks (SAN) and network-attached storage together into a tight mesh of resources from which to draw, say users and analysts.

The limitations of ILM are now apparent even in its success stories. John Halamka, CIO at Boston-based CareGroup Inc., says the ILM architecture he built saved his hospital management company more than $1 million by allowing him to avoid the purchase of high-performance systems to store older X-ray images and other data that isn't critical but still needs to be readily accessible.

Halamka says only about 20% of the data in his Oracle database is active, and the rest can be relegated to Tier 2 storage. "The value of a payroll check today is very important, but a year after W-2s are issued -- well, we can move off to different storage," he says.

The CareGroup ILM architecture, which took four years to build, automatically migrates data through four levels of storage, each less expensive and slower-performing than the last.

Halamka's IT team wrote its own middleware to perform the data migration with radiology image management software from General Electric Co. and Veritas Software Corp.'s application management software, which monitors the performance of CareGroup's internal physician Web site, CareWeb.

Halamka's infrastructure consists of 200 Wintel servers from Hewlett-Packard Co. and 25 high-end HP rp8400s running Unix. Its 100TB of storage comprises 35TB of Symmetrix arrays, 35TB of Clariion CX600 arrays and 30TB of Celerra network-attached storage for file serving, all from EMC Corp. A Powderhorn tape library from Storage Technology Corp. is used for long-term archival storage. A pair of 112-port MDS 9509 switches from Cisco Systems Inc. provides a layer of abstraction between application servers and back-end storage, allowing capacity to be served up as if from a single pool instead of many arrays.

John Halamka, CIO at CareGroup Inc.

Image Credit: Webb Chappell

Halamka cobbled together his architecture and acknowledges that not every company could afford its $3 million price tag, which included software and hardware. But he says it was cheaper than the alternative.

The state of Massachusetts requires CareGroup to keep clinical records for up to 30 years. Halamka says he'd love to put the data on high-end Symmetrix arrays, "but just my capital budget restrictions alone are going to require me to identify what data is mission-critical and [needed] in milliseconds and what data is not so mission-critical so I can wait a few seconds to retrieve it." Halamka is still addressing e-mail backup and automating his tape-archival infrastructure, and he has yet to find a utility that will automatically migrate data in business applications onto his tiered storage architecture.

"I run PeopleSoft for all of my 12,000 employees' payroll. Wouldn't it be great to say after you cut the checks this week, 'Do we really need all the raw data that generated the checks? Let's move that off into archival storage,' " Halamka says. "Let's just say ILM isn't there yet."

Halamka and other IT managers are waiting for vendors to produce heterogeneous data-migration tools, which are only now beginning to emerge with the Storage Management Initiative Specification, known as SMI-S.

But pieces of ILM, which can relieve systems administrators of arduous manual processes and produce fast ROI, do exist today. Vendors such as Princeton Softech Inc., OuterBay Technologies Inc. and Applimation Inc. sell software that will remove unused or nonessential data from databases and migrate it to storage based on business policies. Using rules engines, the software can distinguish between open and closed business transactions.
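As a rough illustration of the open-versus-closed distinction described above, the following minimal sketch picks out records that a business policy might allow to leave the production database. The `Transaction` class, the `closed_on` field and the 365-day default are assumptions made for this example; they are not taken from any of the vendors' products.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Transaction:
    txn_id: str
    closed_on: Optional[date]  # None means the transaction is still open


def archive_candidates(txns: List[Transaction], today: date,
                       min_days_closed: int = 365) -> List[Transaction]:
    """Return transactions that have been closed for at least min_days_closed days.

    Open transactions are never selected, however old they are.
    """
    return [
        t for t in txns
        if t.closed_on is not None
        and (today - t.closed_on).days >= min_days_closed
    ]


sample = [
    Transaction("A-1", date(2003, 6, 1)),   # closed long ago
    Transaction("A-2", date(2004, 10, 1)),  # closed recently
    Transaction("A-3", None),               # still open
]
selected = archive_candidates(sample, today=date(2004, 12, 1))
print([t.txn_id for t in selected])
```

A real rules engine would evaluate many such conditions against application-specific tables; the point here is only that eligibility depends on business state (closed or not) as well as age.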

Policy-based Archiving

E-mail archiving tools from companies such as Veritas, KVS Inc., Zantaz Inc., Connected Corp., and EMC's Legato Software division use policies to automatically move e-mail and attachments out of applications such as Exchange, Outlook and Lotus Notes onto disk or tape, which can then be searched.

But what third-party ILM software still lacks is the automation of business processes, along with standards to tie heterogeneous SANs together.

Tape library manufacturer Advanced Digital Information Corp. in Redmond, Wash., plans to release an upgrade this month to its Pathlight VX appliance that will present a single management interface between EMC Clariion disk arrays and its own tape libraries as well as libraries from StorageTek. But that addresses only hierarchical storage management.

The industry must change how it looks at storage management and operations, say analysts and users.

"Let's work from the basis of business requirements and business practices," says Mike Peterson, president of Strategic Research Corp. in Santa Barbara, Calif., and program director of the Storage Networking Industry Association's (SNIA) Data Management Forum. "We have to instrument the infrastructure so it can be operated automatically. We have to move back to basics."

"It's hard for us to imagine thinking about this as a storage problem," says Kate Kristenson, vice president of information product support at Inovant LLC, Visa International Inc.'s IT organization. "You have to think of it as a systems integration problem. The operating systems have a role to play. The application packages have a role to play. The database has a role to play. The real challenge is how do you think through the application integration end to end."

Foster City, Calif.-based Visa spent 14 months choosing vendors when it built a 150TB relational database and a SAN to support it. Upgrades to the database went online in September. The database is the data store for 300 million transactions per day. Visa looked for vendors that would tailor their software to its ILM needs and take responsibility to grow along with the company, Kristenson says.

"We've been able to influence the development of the technology from our vendors. That has saved us on development time internally," she says.

Visa uses extract, transform and load tools from Lexington, Mass.-based Ab Initio Software Corp. to perform near-real-time analysis of data being stored in the relational database.

Data generated in that database is stored in several tiers -- from 120TB of high-performance Symmetrix and 70TB of IBM FAStT midrange disk arrays to tape libraries -- for real-time or near-real-time retrieval for six months.

Visa spent three years creating its relational database and tiered storage infrastructure. Prior to that, data was siloed by business unit and couldn't be viewed in real time.

"We're now moving to the point where we have all the fundamental data in one place, and we're now using it to drive new business decisions," says Joel Mittler, senior vice president of information services at Inovant.

After transactional data has been viewable in real time on disk for six months, it's automatically migrated to StorageTek tape libraries that hold up to 2 petabytes. Administrators can retrieve information in minutes from the libraries. After two and a half years, the tapes are shipped off-site, and retrieving that data can take a day or more.
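The age-based policy described here can be sketched in a few lines. This is only an illustration, not Visa's implementation: the tier names and the day counts are approximations chosen for the example, and measuring both thresholds from the record's creation date is an assumption, since the article does not say where the two-and-a-half-year clock starts.

```python
from datetime import date

# Thresholds approximated from the article (six months; two and a half years).
# Counting both from the creation date is an assumption of this sketch.
DISK_DAYS = 183
ONSITE_TAPE_DAYS = 913


def storage_tier(created: date, today: date) -> str:
    """Pick a tier for a record based only on its age in days."""
    age_days = (today - created).days
    if age_days < DISK_DAYS:
        return "disk"           # viewable in real time
    if age_days < ONSITE_TAPE_DAYS:
        return "tape-library"   # retrievable in minutes
    return "offsite-tape"       # retrieval can take a day or more


print(storage_tier(date(2004, 1, 1), date(2004, 3, 1)))
print(storage_tier(date(2004, 1, 1), date(2005, 1, 1)))
print(storage_tier(date(2000, 1, 1), date(2004, 1, 1)))
```

A scheduled job could call a function like this for each batch of records and move any whose computed tier differs from where they currently sit.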

Storage resource management (SRM) tools -- software that can report on devices, SAN utilization rates and where business applications access and store data on arrays -- are essential building blocks in ILM architectures. Without them, administrators are flying blind when it comes to the efficiency and utilization of their storage infrastructures, analysts say.

According to Bill North, an analyst at IDC in Framingham, Mass., SRM software will be the fastest growing storage software segment through 2008.

Currently, storage vendors such as EMC, IBM and Hitachi Data Systems Corp. have products that can perform policy-based migration of data, or ILM, but only within homogeneous islands of storage. Products are emerging, however, that can migrate data between competing vendors' machines. Even when the technology is available to tie the pieces of ILM together, users may be slow to embrace it.

Gary Theus, a field support supervisor at Thompson Hine LLP, a Cleveland-based international law firm with about 360 attorneys, is in the middle of deploying 6TB of Advanced Technology Attachment disk storage from Nexsan Technologies Inc. in Woodland Hills, Calif., as a second tier of storage below high-end systems from Hitachi. Theus says his company paid $30,000 for the 6TB of storage on an ATABeast array from Nexsan, whereas he paid $200,000 for 1.5TB of Hitachi storage.
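The gap between the two tiers is easier to see per terabyte. The short calculation below uses only the prices and capacities Theus quotes; the function and variable names are made up for the example.

```python
def cost_per_tb(price_usd: float, capacity_tb: float) -> float:
    """Purchase price divided by capacity, in dollars per terabyte."""
    return price_usd / capacity_tb


nexsan = cost_per_tb(30_000, 6)       # ATABeast second-tier array
hitachi = cost_per_tb(200_000, 1.5)   # high-end Hitachi storage
ratio = hitachi / nexsan

print(f"Nexsan: ${nexsan:,.0f}/TB, Hitachi: ${hitachi:,.0f}/TB, ratio: {ratio:.1f}x")
# prints: Nexsan: $5,000/TB, Hitachi: $133,333/TB, ratio: 26.7x
```

On these figures, the high-end storage cost roughly 27 times as much per terabyte as the ATA tier.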

But before Theus can begin thinking about ILM, he has to agree with the business side on how to classify data in order to determine the law firm's retention policies. He expects to tackle that project next year. (To see how other IT executives are handling that challenge, go to QuickLink 50394.)

Right now, as in many IT shops, Theus' policy for the deletion of records such as e-mails is "everything past five years."

"We've just not wrapped our arms around [ILM] yet," Theus says.


Copyright © 2004 IDG Communications, Inc.
