Why is the backup process stuck in the '80s?

Back in the 1980s, floppy disks were 5.25 inches wide, tape came on round reels, and disk drive capacity was measured in tens of megabytes, not the multiple terabytes we see today.

The rise of the Internet has caused data to grow almost exponentially each year, and government regulations now require some companies to retain all their transaction records for many years. All of that has made the job of data backup much more important, yet most organizations still see backup only as a necessary evil. Budgets for backup technology are typically limited until someone actually loses data and can't recover it. Then, all of a sudden, the person responsible for backup becomes the most important individual in the company.

The problem backup administrators face today is the notion that backup does nothing for the bottom line. Since it's not a customer-facing application, it tends to be neglected and given a minimal budget (the same goes for disaster recovery (DR), by the way!). The good news for backup administrators is that the technology for backing up data has advanced dramatically over the last few years.

Traditional backup software has advanced, networks have become faster, and tapes hold far more data, but data growth has outpaced all of those advances. The mismatch between data growth and backup capability is becoming apparent in many organizations as the window of time allocated to move data from disk drives to tape over the network grows ever tighter relative to the amount of data that must move.

The problem is that a backup is a physical copy of data, and physics is involved. The data needs to be moved from point A (Production) to point B (the copy of production). Another problem is that the backup job is typically a bulk process which only occurs during off hours when the movement of massive amounts of data causes the least impact to the application servers.
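To put rough numbers on that physics, the little Python sketch below (my own back-of-the-envelope illustration; the throughput figure is an assumption, not a measurement from any particular shop) simply divides the size of the data set by the sustained rate of the link carrying it:

def backup_window_hours(data_tb, throughput_mb_per_s):
    # Hours needed to copy data_tb terabytes at a sustained rate in MB/s.
    data_mb = data_tb * 1024 * 1024      # TB -> MB (binary units)
    return data_mb / throughput_mb_per_s / 3600

# 20 TB over a link that sustains ~100 MB/s (roughly what a well-behaved
# gigabit LAN delivers) needs about 58 hours -- far longer than any nightly window.
print(f"{backup_window_hours(20, 100):.1f} hours")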

The graphic below illustrates the flow of data from the application server to tape during a typical backup.

Traditional backup data flow

As you can see, there are a number of potential bottlenecks between the point where the data originates and where it needs to end up.

Backup Problems:

1) Feed speed: The primary disks may not be fast enough to perform normal operations and read all the data for backup at the same time.

2) Server bottleneck: The server may be busy, so the client agent running on the server may not have enough CPU time to operate effectively.

3) LAN bottleneck: The Ethernet LAN network may be slow or busy.

4) Tape subsystem: Backup jobs may be queued waiting for slow tape resources to become available.

5) Media requirements: If full backups are performed on databases to assure fast recovery, the same data gets stored every night, which requires a LOT of tapes.

6) Backup server: The backup server may be overwhelmed by the number of backup streams coming over the LAN.
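Taken together, these stages form a pipeline, and the job can run no faster than its slowest stage. The sketch below (illustrative Python with made-up MB/s figures, not measurements) shows how the effective rate falls out of the stage limits:

# A backup job runs only as fast as its slowest stage. The stage names mirror the
# list above; the MB/s figures are assumptions for illustration.
stage_throughput_mb_s = {
    "primary disk reads": 400,   # what the production array can spare for backup
    "client agent (CPU)": 250,   # what a busy application server can push
    "LAN": 100,                  # a shared gigabit Ethernet segment
    "tape drive": 120,           # one tape drive's sustained rate
    "backup server": 300,        # media server ingest limit
}
bottleneck = min(stage_throughput_mb_s, key=stage_throughput_mb_s.get)
print(f"Effective rate: {stage_throughput_mb_s[bottleneck]} MB/s, limited by the {bottleneck}")
# Relieving one stage helps only until the next-slowest stage becomes the new limit.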

So how can we solve these issues?

Problems 1-3: Most large organizations have hundreds of servers and terabytes of data to protect, which means hundreds of backup jobs need to run at the same time against a limited number of physical resources. One way to fix this is to move the backup data flow off the LAN and onto the storage area network (SAN). When SANs came along in the '90s, backup was (and still is) one of their killer apps.

Serverless, LAN-free backup

Moving the flow of backup data from the LAN to the SAN removes the server and LAN bottlenecks from the process (serverless and LAN-free backup), which makes backups faster.

Problem 4: Virtual tape libraries (VTLs) can provide many virtual tape drives, removing the constraint of sharing a limited number of physical tape resources. With a virtual tape solution, you can right-click and create a new tape drive at any time. If a new LTO-4 drive costs $5,000, then every time you right-click to create a new drive in a VTL, you not only create a new tape resource, you save $5,000 to boot!
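To illustrate the point (this is a toy model of my own, not any vendor's actual interface or pricing), provisioning a virtual drive is just a metadata operation, so standing up dozens of them costs nothing compared with buying the physical equivalents:

class VirtualTapeLibrary:
    def __init__(self):
        self.drives = []
    def create_drive(self):
        # Provisioning a virtual drive is a metadata operation, not a hardware purchase.
        self.drives.append(f"vdrive-{len(self.drives)}")

vtl = VirtualTapeLibrary()
concurrent_jobs = 40                 # backup streams that all want a drive tonight
for _ in range(concurrent_jobs):
    vtl.create_drive()
physical_drive_cost = 5_000          # assumed price of one physical LTO-4 drive
print(f"Drives provisioned: {len(vtl.drives)}, hardware avoided: ${concurrent_jobs * physical_drive_cost:,}")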

Virtual tape backup

Problem 5: Data deduplication helps solve the media problem. Simply add dedupe to the mix, and all of a sudden there is a lot less data to store. Dedupe also helps with replication of data by reducing the costs of WAN bandwidth required to move data to the DR site, which also eliminates tape shipping costs and risk.
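Under the hood, most dedupe works by hashing chunks of data and storing each unique chunk only once. The sketch below (a simplified illustration using fixed-size chunks; real products vary) shows why two nearly identical nightly fulls consume little more space than one:

import hashlib, os

CHUNK = 4096
store = {}                               # chunk hash -> chunk bytes, stored once

def dedupe_write(data):
    # Split the stream into fixed-size chunks and store only chunks not seen before.
    refs = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)       # duplicate chunks cost nothing extra
        refs.append(h)
    return refs

night1 = os.urandom(8 * 1024 * 1024)     # tonight's "full backup"
night2 = night1[:-CHUNK] + b"x" * CHUNK  # tomorrow's full: same data, one changed block
dedupe_write(night1)
dedupe_write(night2)
print(f"Logical chunks: {2 * len(night1) // CHUNK}, unique chunks stored: {len(store)}")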

Data deduplication

Problem 6: The last problem that remains is the backup server, and the fact that the backup process is still typically performed only once a day, which provides a less-than-optimal recovery point objective (RPO) for many applications. With large data sets, a recovery may also take many hours, which has a negative impact on the recovery time objective (RTO).
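A quick bit of arithmetic (with assumed numbers) shows why once-a-day backup hurts both objectives: the worst-case RPO is the full backup interval, and the RTO scales with the size of the data set divided by the restore rate:

backup_interval_h = 24               # one backup per night
worst_case_rpo_h = backup_interval_h # a failure just before the next run loses up to a day of changes

data_tb = 10
restore_mb_s = 150                   # assumed sustained restore rate from tape
rto_h = data_tb * 1024 * 1024 / restore_mb_s / 3600
print(f"Worst-case RPO: {worst_case_rpo_h} h; restore of {data_tb} TB: about {rto_h:.0f} h")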

After 30 years, there must be a better way!

The fix is to change the physics of the process. Instead of performing bulk data movement using a single batch process every night, simply protect all data writes all the time to provide continuous data protection (CDP).

CDP

Continuous data protection replaces the bulk data movement process by continually moving all data changes to a copy on another storage array, and then intelligently journaling the writes as they occur. Any errors or loss of data can be simply rolled back or recovered instantly from the journaled copy.

Snapshots can also be used on the copy to create backups which are consistent from the application's perspective. In essence, the continuous protection solution treats the underlying storage like a database, where the CDP journal acts as the transaction log and snapshot copies act as a virtual full copy of the data.
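To make the database analogy concrete, here is a minimal toy model (my own sketch, not how any particular vendor implements CDP): writes land in a journal, a snapshot acts as the virtual full, and recovery replays the journal up to any chosen point in time:

class CdpCopy:
    def __init__(self):
        self.snapshot = {}   # block -> data as of the last snapshot (the "virtual full")
        self.journal = []    # (timestamp, block, data) -- the "transaction log"

    def write(self, ts, block, data):
        # Mirror of a production write: journal it rather than overwrite history.
        self.journal.append((ts, block, data))

    def take_snapshot(self):
        # Fold the journal into the base copy to create an application-consistent full.
        for _, block, data in self.journal:
            self.snapshot[block] = data
        self.journal.clear()

    def recover_to(self, ts):
        # Roll forward from the snapshot, replaying only writes up to the chosen point in time.
        image = dict(self.snapshot)
        for t, block, data in self.journal:
            if t <= ts:
                image[block] = data
        return image

cdp = CdpCopy()
cdp.write(1.0, 0, b"good data")
cdp.write(2.0, 0, b"corrupted!")          # e.g. a bad application write at time 2.0
print(cdp.recover_to(1.5)[0])             # recovers b'good data' from just before the error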

Since snapshots and journals alone are not sufficient to protect from primary disk failure, in order to really be a backup, the solution MUST also provide a full replica copy "off-frame" from the primary storage array. CDP technology is now available from multiple vendors, but capabilities vary. I suggest doing a bakeoff between different vendor solutions to see which works best for you. If the CDP solution also includes data deduplication, money can be saved on off-site replication for DR. For more information on CDP, just search the Computerworld site for "CDP".

As companies continue to find new ways to save money, the new year holds promise for innovative solutions to the backup problem. As the process of backup and DR replication continue to converge, continuous data protection with multiple stage dedupe and replication should make an impact on the installed base of traditional backup solutions.

Christopher Poelker is the author of Storage Area Networks for Dummies, and he is currently the vice president of Enterprise Solutions at FalconStor Software.

Copyright © 2010 IDG Communications, Inc.
