Here's a closer look at five techniques to reduce the volume of stored data.

Schulz says primary deduplication products could perform in preprocessing mode until a certain performance threshold is hit, then switch to postprocessing.
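
The switch Schulz describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: the class name `HybridDeduper`, the 5-millisecond latency threshold and the use of SHA-256 fingerprints are all hypothetical choices made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class HybridDeduper:
    # Hypothetical threshold: above this write latency, dedup is deferred.
    latency_threshold_ms: float = 5.0
    seen: set = field(default_factory=set)        # fingerprints of unique blocks
    deferred: list = field(default_factory=list)  # raw blocks awaiting postprocessing
    stored_blocks: int = 0                        # blocks currently occupying space

    def write(self, block: bytes, current_latency_ms: float) -> str:
        if current_latency_ms > self.latency_threshold_ms:
            # System is busy: store the block as-is and deduplicate later.
            self.deferred.append(block)
            self.stored_blocks += 1
            return "deferred"
        # System has headroom: deduplicate before the block is stored.
        return self._dedupe(block)

    def _dedupe(self, block: bytes) -> str:
        digest = hashlib.sha256(block).hexdigest()
        if digest in self.seen:
            return "duplicate"
        self.seen.add(digest)
        self.stored_blocks += 1
        return "stored"

    def postprocess(self) -> int:
        """Deduplicate deferred blocks; return how many were reclaimed."""
        reclaimed = 0
        for block in self.deferred:
            self.stored_blocks -= 1  # drop the raw copy written earlier
            if self._dedupe(block) == "duplicate":
                reclaimed += 1
        self.deferred.clear()
        return reclaimed


deduper = HybridDeduper()
deduper.write(b"block-A", current_latency_ms=1.0)   # deduplicated before storing
deduper.write(b"block-A", current_latency_ms=9.0)   # threshold exceeded, deferred
print(deduper.stored_blocks, deduper.postprocess(), deduper.stored_blocks)
```

The trade-off is visible in `stored_blocks`: deferring keeps writes fast but temporarily consumes extra capacity until `postprocess()` runs.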

Another option, policy-based deduplication, allows storage managers to choose which files should undergo deduplication, based on their size, importance or other criteria.
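
A policy of this kind amounts to a filter applied to each file before deduplication. The sketch below is a hypothetical example: the field names, the 64KB minimum size, the "critical" importance label and the list of skipped file extensions are assumptions made for illustration and do not come from any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    size_bytes: int
    importance: str  # hypothetical labels: "critical", "normal", "archive"


@dataclass(frozen=True)
class DedupPolicy:
    # All defaults are illustrative assumptions, not recommendations.
    min_size_bytes: int = 64 * 1024
    exempt_importance: tuple = ("critical",)
    skip_suffixes: tuple = (".jpg", ".zip", ".mp4")


def should_deduplicate(f: FileInfo, policy: DedupPolicy) -> bool:
    """Return True if the policy selects this file for deduplication."""
    if f.size_bytes < policy.min_size_bytes:
        return False  # too small to be worth the processing overhead
    if f.importance in policy.exempt_importance:
        return False  # leave performance-sensitive files untouched
    if f.path.lower().endswith(policy.skip_suffixes):
        return False  # "other criteria": here, file types chosen to be skipped
    return True


policy = DedupPolicy()
candidates = [
    FileInfo("mail/archive-2009.pst", 800 * 1024 * 1024, "archive"),
    FileInfo("db/orders.mdf", 2 * 1024 ** 3, "critical"),
    FileInfo("notes/todo.txt", 2 * 1024, "normal"),
]
for f in candidates:
    print(f.path, should_deduplicate(f, policy))
```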

SFL Data, which gathers, stores, indexes, searches and provides data for companies and law firms involved in litigation, has found a balance between performance and data reduction. It's deploying Ocarina Networks' 2400 Storage Optimizer for "near-online" storage of compressed and deduplicated files on a BlueArc Mercury 50 cluster that scales up to 2 petabytes of usable capacity, rehydrating those files as users require them.

"Rehydrating the files slows access time a bit, but it's far better than telling customers they have to wait two days" to access those files, says SFL's technical director, Ruth Townsend, noting that the company gets as much as 50% space savings through deduplication and file compression.

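To make "rehydrating" concrete, here is a minimal sketch of a store that deduplicates and compresses files on the way in and rebuilds them on demand. It is not a description of the Ocarina or BlueArc products; the `ChunkStore` class, the fixed 4KB chunk size and the choice of SHA-256 and zlib are assumptions made for the example.

```python
import hashlib
import zlib


class ChunkStore:
    """Toy store: fixed-size chunks, deduplicated by hash, compressed with zlib."""

    def __init__(self, chunk_size: int = 4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # fingerprint -> compressed chunk (stored once)
        self.files = {}   # file name -> ordered list of fingerprints

    def put(self, name: str, data: bytes) -> None:
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:
                # Only previously unseen chunks consume space.
                self.chunks[digest] = zlib.compress(chunk)
            recipe.append(digest)
        self.files[name] = recipe

    def rehydrate(self, name: str) -> bytes:
        """Rebuild the original file by decompressing its chunks in order."""
        return b"".join(zlib.decompress(self.chunks[d]) for d in self.files[name])

    def stored_bytes(self) -> int:
        return sum(len(c) for c in self.chunks.values())


store = ChunkStore()
document = b"A" * 8192 + b"B" * 4096
store.put("exhibit-1.txt", document)
store.put("exhibit-1-copy.txt", document)  # adds no new chunks
assert store.rehydrate("exhibit-1-copy.txt") == document
print(len(store.chunks), store.stored_bytes())
```

The cost Townsend mentions shows up in `rehydrate()`: every read has to look up and decompress each chunk before the file can be handed back.
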
2. Compression

Probably the best-known data reduction technology, compression is the process of finding repeated patterns of bytes and encoding them more compactly. It works well with databases, e-mail and files, but it's less effective for images, many of which are already stored in compressed formats. It's included in some storage systems, but you can also find stand-alone compression applications or appliances.
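
The difference is easy to demonstrate with Python's standard zlib module. In this sketch, a block of repeated SQL-like text stands in for database content, and random bytes stand in for data with no repeated patterns, which is roughly how an already-compressed image looks to a compressor. Both stand-ins are assumptions made for illustration.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


# Highly repetitive, database-like content (illustrative stand-in).
text_like = b"INSERT INTO orders VALUES (1, 'widget');\n" * 1000

# Random bytes: no repeated patterns for the compressor to exploit.
random_like = os.urandom(len(text_like))

print(f"repetitive data: {compression_ratio(text_like):.1f}:1")
print(f"random data:     {compression_ratio(random_like):.2f}:1")
```

The repetitive input shrinks dramatically, while the random input stays at about its original size or grows slightly.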

Schulz says real-time compression is suitable for online applications such as databases and online transaction processing, as long as it doesn't delay access or slow performance by requiring data to be decompressed before it's read or modified. The computing power within modern multicore processors also makes server-based compression an option for some environments, he adds.

Allen of i365 says the benefits of compression vary. It can reduce data by ratios of 6:1 or more for SQL databases, but for file servers the ratios are closer to 2:1. According to Fadi Albatal, vice president of marketing at FalconStor, compression is most effective on backup, secondary or tertiary storage, where it can reduce storage needs by ratios of 2:1 to 4:1 for "highly active" database or e-mail applications. When information management services firm Iron Mountain Inc. archives applications, compression and deduplication reduce storage by as much as 80%, says T.M. Ravi, Iron Mountain's chief marketing officer.
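
Vendors quote savings both as ratios and as percentages, so it helps to convert between the two: an N:1 ratio saves 1 - 1/N of the original space. The helper names below are hypothetical; the figures are the ones cited above.

```python
def ratio_to_savings(ratio: float) -> float:
    """Fraction of space saved by an N:1 reduction ratio."""
    return 1 - 1 / ratio


def savings_to_ratio(savings: float) -> float:
    """Reduction ratio (N in N:1) for a given fraction of space saved."""
    return 1 / (1 - savings)


for ratio in (2, 4, 6):
    print(f"{ratio}:1 saves {ratio_to_savings(ratio):.1%}")

# An 80% reduction corresponds to roughly a 5:1 ratio.
print(f"80% saved is about {savings_to_ratio(0.80):.1f}:1")
```

Read this way, going from 2:1 to 6:1 raises savings from 50% to about 83%, so higher ratios deliver progressively smaller gains in capacity.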

IBM focused attention on compression of primary storage with its acquisition of Storwize, whose appliance writes compressed files back to the NAS device on which they originated or to another tier of storage. Storwize is beta-testing a block-based appliance, says Doug Balog, vice president of IBM storage.

