Backup grows up

Compared to hot areas like security or wireless, data backup and restore may have seemed like IT's forgotten child -- until now.

A perfect storm of disappearing backup windows (thanks to enormous data growth and nonstop business operations), large-scale catastrophes, increased litigation requiring electronic data discovery and federal regulations governing data retention has catapulted backup and recovery to IT's head table.

And, reflecting its newfound status, backup and recovery is taking on a more sophisticated, grown-up name: data protection, which encompasses backup, recovery, archiving, retrieval, disaster recovery and business continuity.

"This is a phenomenal time for storage, and particularly for data protection," says Arun Taneja, president of Taneja Group. According to IDC, the backup, archiving and replication software market will grow from $4.3 billion in 2003 to $6.58 billion by 2008, representing 54% of storage software expenditures.

While the term "data protection" covers a lot of ground, it's the first four areas -- backup, recovery, archiving and retrieval -- that are currently of highest interest, said Pete Gerr, senior analyst at Enterprise Strategy Group.

Companies now realize they must be able to recover specific pieces of data from financial records, e-mail, instant messaging logs and the like if that data is subpoenaed as evidence in a legal case, Gerr says.

The bottom line is backup, restoration and safe archiving of electronic data can no longer be a "hope it works" proposition.

Tape falls out of favor

"If the one e-mail that may keep the CEO out of court is the last file written to tape, it's going to take a very long time to find that file," Gerr said. Long recovery times mean high legal fees and electronic data discovery service provider costs, not to mention the spotlight it shines on poor records-management discipline, which can lead to further regulations.

Another problem with tape is that despite many advances in the technology, these systems just can't keep up with the volume of data that needs to be stored in ever-shrinking backup windows. According to a March 2005 survey conducted by Enterprise Strategy Group, roughly half of 163 respondents said their ability to back up and recover data in a timely fashion has been hurt by the limitations of their tape systems.

Start-ups and disk storage heavyweights now are weighing in with tape alternatives, including disk-to-disk backup, virtual tape libraries, content-addressable storage, continuous data-protection devices, new replication and snapshot schemes, data compression techniques and more.

With disk-to-disk devices and virtual tape libraries, backups can run within reasonable time frames, and more data can be kept online, which enables faster recoveries. Denton Central Appraisal District (DCAD), for instance, switched to a StoneFly Networks disk-based backup system and now can back up its 50 servers in the same amount of time it used to take to back up one.

That's 0.5TB to 2TB of backup per night, with a routine average of 400GB to 600GB of changed data written to backup disk daily, according to Brad Green, director of information services at DCAD, which serves the fastest-growing county in North Texas.

No wonder users have responded to these new backup technologies with great enthusiasm. Companies spent $1.7 billion on disk-based storage in 2003, according to Strategic Research. And according to the March Enterprise Strategy Group study, 18% of respondents have permanently replaced their tape libraries with disk-based alternatives, and another 58% would consider doing so. Of this latter group, 80% believe they will replace at least some of their tape libraries over the next 24 months.

"Disk storage is being used either as an exclusive method of backup or as an intermediate or staging area before going to tape," says Bill North, director of research for IDC's Storage Software service.

While disk backup has traditionally been seen as more expensive than tape, Gerr advises users to consider not just acquisition costs but also operational and administrative costs that tape requires, such as media management and tape swapping.

"Tape is much more labor-intensive than disk," he said. "So while disk is more costly to procure, the total cost of managing it is far less than the total cost of managing a tape environment."

But it doesn't disappear completely

When DCAD's storage needs grew fourfold in one year, it switched to disk-based backup in the form of a 4.2TB, $50,000 StoneFly IP-based storage-area network fronted by Commvault Systems' QINetix backup software. DCAD has since added another 5TB of disk.

But data is still archived on a Dell tape library -- at least for now.

Green's goal is to completely move away from tape. "You're resting the strength of your entire business on that millimeter-thin little tape," he says. "That just doesn't work for me." His plan is to implement a hot site and synchronize data between the two locations over a VPN using replication software from StoneFly, as well as NSI Software's Geocluster technology. "If my plan works, we'll be able to back up to disk offsite," he said.

Tape is still the least expensive means of long-term archival, Green said, adding he'd continue to use it for very long-term archival purposes. "But for me it's inherently flawed, too subject to failure and too slow," he said.

North agrees, saying that "trucks and grocery carts are still less expensive than the bandwidth required by replication." That's why companies such as Avamar, EMC (with its content-addressable storage system, Centera) and Data Domain are working on data reduction algorithms to compress or otherwise reduce the amount of data that needs to be stored during backups, thus reducing disk costs and minimizing what needs to be sent.

A consumer call center for a large New York bank is looking into software that stores incremental changes rather than blocks of data so that -- in the event that it moves to a disk-based archival strategy -- it will have less data to send over the wire. Officials at the call center were recently given the directive to move away from all physical transportation of media to protect confidential data -- which eventually will rule out tape even for off-site storage.

The call center took some preliminary steps in that direction when it recently solved its tape library-based backup woes with a RAID-based virtual tape library from Sepaton. Day-to-day backups now go to a Sepaton disk-based system, traveling to the IBM 3494 tape library only when it's time for archiving and offsite storage.

With the tape-based system, a full backup could take three days. By backing up data to a Sepaton virtual tape library, the bank can continue using its legacy Tivoli Storage Manager (TSM) backup software, and a full backup now takes just three hours. Further, buying the Sepaton virtual tape library instead of a new tape cabinet and additional drives represented a 50% cost savings.

The call center plans to take another step in favor of disk backup by purchasing a second Sepaton virtual tape library, locating it in an on-campus building and having it perform duplicate backups and restores using the replication capabilities of the TSM software. The call center will continue using tape for offsite storage until it hatches a cost-effective, offsite replication plan using a data-reduction algorithm.

The problem with many of these data-reduction algorithms is that because data is not kept in one intact file, there's a process associated with restructuring the data when you need to restore it, North says. "You wouldn't want to do that in a transaction database that processes thousands of orders an hour," he said. "It tends to be used for data that is infrequently accessed but where the time to retrieve it may be shorter than if it's offsite in a tape vault somewhere."
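The trade-off North describes can be illustrated with a minimal sketch of one common data-reduction approach, chunk-level deduplication. This is not the algorithm used by Avamar, EMC's Centera or Data Domain; the fixed chunk size, the use of SHA-256 and the ChunkStore name are all assumptions chosen for illustration. Each backup is stored as a "recipe" of chunk fingerprints, so identical chunks are kept only once, and a restore must reassemble the data from those chunks.

```python
import hashlib

# Assumption: tiny fixed-size chunks, chosen only to keep the example readable.
CHUNK_SIZE = 4


class ChunkStore:
    """Hypothetical deduplicating store: each unique chunk is kept once."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def backup(self, data):
        """Split data into chunks, store unseen ones, return the recipe."""
        recipe = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # A chunk already in the store costs no additional space.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        return recipe

    def restore(self, recipe):
        """Rebuild the original data by looking up every chunk in order."""
        return b"".join(self.chunks[fingerprint] for fingerprint in recipe)

    def stored_bytes(self):
        """Total bytes actually held in the store."""
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
monday = b"AAAABBBBCCCC"
tuesday = b"AAAABBBBDDDD"  # only the last chunk differs from Monday's data

monday_recipe = store.backup(monday)
tuesday_recipe = store.backup(tuesday)

raw_bytes = len(monday) + len(tuesday)
print(raw_bytes, store.stored_bytes())
print(store.restore(tuesday_recipe) == tuesday)
```

In this toy case the two backups total 24 bytes but the store holds only 16, because the two shared chunks are saved once. The restore step has to look up every chunk in the recipe, which is the reassembly cost that makes such schemes a poor fit for busy transactional systems.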

Tape takes up residence offsite

There are other reasons why companies still turn to tape for offsite storage. At the Chicago Mercantile Exchange, trading and clearing applications are replicated between two data centers for business-continuity purposes. Copan Systems virtual tape libraries are installed at both sites to resolve the problem of shrinking backup windows, and both are managed by Veritas NetBackup software. Two StorageTek tape silos take care of archiving.

Critical data does not just get backed up on Copan virtual tape libraries, however, said Joe Panfil, director of enterprise technology services at the Chicago Mercantile Exchange. "Probably our biggest challenge is creating tiers of data classification so that critical data still makes it to tape and gets carried out of here," he says. Less-critical data stays on disk for a few weeks and then gets written over.

Federal regulations require some data to be stored on media that cannot be erased, which eliminates many disk-based storage systems. EMC's Centera is an exception, and the exchange would consider that, Panfil said. "My belief is that tape eventually has to die, but it will be when regulators say there's been some media to replace it that's acceptable," Panfil said. "If we're forced to retain data for seven years, and it has to be external to [the Chicago Mercantile Exchange], tape or optical becomes the only way to do that."

Disk will eventually dominate

According to Taneja, this is only the first phase of backup's maturation, and while tape might lose its place in the backup environment, that will happen slowly. He recently completed a survey of 250 midsize and large companies, 95% of which said they were not yet ready to let go of tape.

"Customers don't want to change too many variables at one time," he says. "That's been their crutch for the last 25 years, and they don't want to lose it."

But in the next phase, which Taneja estimates is 12 to 18 months away, people will become more comfortable with disk-based backup and thus disk-to-disk replication over distance.

"At that point, people will say, 'Eureka -- why have tape at all?'" he said, because you've established your offsite archive on disk.

Mary Brandel is a freelance writer in Michigan. She can be reached at

This story, "Backup grows up," was originally published by Network World.

Copyright © 2005 IDG Communications, Inc.
