Storage Tip: Archiving 101 - Active archiving vs. deep archiving (a Storage tip from David Hill)

What seems to be the problem? Archiving was a big theme at the recent Storage Networking World conference. Archiving is greatly misunderstood, however, and that misunderstanding can keep enterprises from implementing an archived disk pool of data. Failing to implement one means missing out on archiving's benefits: indirectly, a faster backup/restore process on the original production data pool, and directly, data-retention-managed storage that can help with compliance and governance policies.

What do you need to know? There are two common misunderstandings about archiving. The first is a failure to distinguish between the two archiving types -- active archiving and deep archiving. The second is a failure to distinguish between archiving and backup.

The original approach to archiving was to remove production data that no longer served a useful purpose online and move it to another piece of storage media -- typically tape. The archived data could be used for auditing or for historical analysis purposes, but the hope was that the data would never need to be seen again. That archiving type is "deep" archiving, and with deep archiving, restoring the data is time-consuming.

The new approach to archiving is "active" archiving. Fixed content data -- which is data that is unlikely to change, but still serves a useful business purpose -- can be moved from an active changeable production data pool to an active archive disk pool of storage. The difference between an active and a deep archive is that in an active archive, a user can still access data easily and transparently.

Why would you want to move the data if it can still be useful in the active changeable pool of data? Although the data is still useful, it no longer needs the power of the highest-performance disks, so it can simply be moved to lower-cost storage. That is where the cost savings from tiered storage come into play. And since the majority -- maybe even the vast majority -- of data is fixed content, the savings in the long run can be significant.
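The selection step can be sketched in a few lines of Python. This is only an illustration, not a description of any particular archiving product: it assumes that "time since last modification" is an acceptable stand-in for fixed content, and the names FileRecord, select_archive_candidates, and monthly_savings, the 180-day threshold, and the per-gigabyte costs are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical catalog entry for one file in the production pool."""
    path: str
    size_bytes: int
    last_modified: datetime


def select_archive_candidates(records, now, min_idle_days=180):
    """Return records not modified for at least min_idle_days.

    Assumption: data that has not changed for a long time is treated
    as fixed content. Real policies may use other criteria.
    """
    cutoff = now - timedelta(days=min_idle_days)
    return [r for r in records if r.last_modified <= cutoff]


def monthly_savings(candidates, primary_cost_per_gb, archive_cost_per_gb):
    """Estimate the monthly saving from moving candidates to the cheaper tier."""
    total_gb = sum(r.size_bytes for r in candidates) / 1024 ** 3
    return total_gb * (primary_cost_per_gb - archive_cost_per_gb)
```

Given a catalog of file records, select_archive_candidates returns the files that qualify for the archive tier, and monthly_savings multiplies their total size by the price gap between the two tiers.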

Flushing out data that is no longer critical to the active changeable production pool reduces the size of that pool -- perhaps dramatically. A smaller active changeable pool is easier to manage and performs better. For example, a full backup goes much faster because there is less data to back up. A restore can also go more quickly because there is less data to restore.
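A rough back-of-the-envelope calculation shows the effect on a full backup. The figures used in the usage note are made up for illustration, and the sketch assumes that backup time scales linearly with pool size at a constant throughput, which real backup windows only approximate.

```python
def pool_after_archiving(pool_gb, fixed_fraction):
    """Size of the active changeable pool once fixed content has been moved out."""
    if not 0.0 <= fixed_fraction <= 1.0:
        raise ValueError("fixed_fraction must be between 0 and 1")
    return pool_gb * (1.0 - fixed_fraction)


def full_backup_hours(pool_gb, throughput_gb_per_hour):
    """Hours for a full backup, assuming constant throughput (a simplification)."""
    if throughput_gb_per_hour <= 0:
        raise ValueError("throughput_gb_per_hour must be positive")
    return pool_gb / throughput_gb_per_hour
```

For a hypothetical 10,000 GB pool backed up at 500 GB per hour, a full backup takes 20 hours. If 75 percent of that pool is fixed content and moves to the active archive, the remaining 2,500 GB backs up in 5 hours.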

The key benefit of the new active archive disk pool of storage is that it can now be data-retention managed. You need retention-managed storage to determine what data needs to be preserved and for how long, and to put in place the proper data destruction policies. Only fixed content storage can be retention-managed. Retention management requires that an application different from the originating application have control over the data (as the creating application should not be allowed to change or delete the data). For example, a litigation hold means that the data has to be preserved -- no alteration or deletion of the data is allowed. The originating application therefore cannot be in control.
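The rule described above, in which data is destroyed only after its retention period ends and never while a litigation hold is in place, can be sketched as follows. This is a minimal illustration rather than how any real retention product works; ArchivedItem, RetentionManager, and their methods are hypothetical names. The manager stands in for the separate controlling application, so the originating application never deletes anything itself.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchivedItem:
    """Hypothetical record for one piece of fixed content in the archive."""
    item_id: str
    retain_until: date
    legal_hold: bool = False


class RetentionManager:
    """Stands in for the application that controls archived data."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item.item_id] = item

    def place_hold(self, item_id):
        self._items[item_id].legal_hold = True

    def release_hold(self, item_id):
        self._items[item_id].legal_hold = False

    def can_destroy(self, item_id, today):
        """Destruction is allowed only after retention ends and with no hold."""
        item = self._items[item_id]
        return not item.legal_hold and today >= item.retain_until

    def destroy_expired(self, today):
        """Apply the destruction policy and return the ids that were destroyed."""
        expired = [i for i in self._items if self.can_destroy(i, today)]
        for item_id in expired:
            del self._items[item_id]
        return sorted(expired)
```

An item whose retention date has passed is destroyed on the next policy run unless a hold has been placed on it, in which case it survives until the hold is released.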

Archiving versus backup

Archiving and backup are not the same. Archived data (no matter what flavor of archive holds it) is still production data. The archive represents the true and official copy of the data at the current time. A backup copy is a data protection copy. The two are separate and distinct. Note that an archive copy must have a data protection copy somewhere. That may be a backup copy or it could be a replica.

What can you do about it? It is important to note that while archiving and backup are different, they both act upon the active changeable pool of production data. Archiving migrates data from the active changeable production pool to the active archive production pool whereas backup software simply copies active changeable pool data to a piece of backup storage media.
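At the level of a single file, the difference between migrating and copying can be shown with Python's standard shutil module. The sketch assumes plain directories stand in for the archive pool and the backup media, and the function names archive_file and backup_file are hypothetical. A real active archive would also keep the data transparently reachable from its original location, which this sketch does not attempt.

```python
import shutil
from pathlib import Path


def archive_file(src: Path, archive_dir: Path) -> Path:
    """Migrate: the file leaves the production pool and lives in the archive."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    dest = archive_dir / src.name
    shutil.move(str(src), str(dest))
    return dest


def backup_file(src: Path, backup_dir: Path) -> Path:
    """Copy: the file stays in the production pool; a protection copy is made."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    dest = backup_dir / src.name
    shutil.copy2(str(src), str(dest))
    return dest
```

After backup_file the source still exists alongside its copy. After archive_file the source is gone from the production directory, and the archived file is now the official copy, which still needs its own backup or replica.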

One way to go about performing both functions is with an archiving and backup server, and backup/restore software companies are likely to get into the archiving business if they have not already done so. Or an enterprise can choose the traditional way of keeping archiving and backup completely separate and distinct.

Active archiving, which is what people really should mean when they talk about archiving, is going to continue to gain attention. If you are not already deeply involved in the active archiving of your data (and even if you are), pay close attention to how your organization can benefit from its use.

This story, "Storage Tip: Archiving 101 - Active archiving vs. deep archiving" was originally published by ITworld.


Copyright © 2007 IDG Communications, Inc.
