Opinion: Data de-dup offers new storage management possibilities
The 20:1 compression rates once touted by vendors seem to have been greatly exaggerated
Computerworld - Last week's announcement by Network Appliance of its A-SIS data de-duplication technology for primary storage (see "NetApp Set to Launch De-duplication Tool") opens new storage management possibilities and raises the bar for the competition, but it also brings up some interesting questions.
Most data de-duplication efforts have focused on secondary data applications, particularly backup. Companies like Data Domain have been very successful in this area, and in addition to achieving impressive de-duplication numbers, they have been ratcheting up performance with each new generation of products. So it seems only natural to consider applying de-dup technology more broadly.
Interestingly, the announcement appears to make an effort to lower expectations about the rate of data reduction likely to be realized in primary storage, suggesting that the savings of 20:1 or more quoted by backup de-dup vendors are largely the result of storing multiple copies of unchanged data over time. NetApp warns that the reduction will be significantly lower in non-backup scenarios, perhaps 40% or less, and this is borne out by the users quoted in the article referenced above.
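To see why repeated backups and primary data produce such different numbers, here is a minimal Python sketch. It is not how A-SIS or any vendor's product works; it assumes simple fixed-size blocks identified by SHA-256 hashes, and the block size, helper names and synthetic data sets are all made up for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed block size, chosen only for illustration


def dedup_footprint(streams, chunk_size=CHUNK_SIZE):
    """Return (logical_bytes, stored_bytes) after block-level de-duplication.

    Every block is hashed; a block is counted as stored only the first
    time its hash is seen across all streams.
    """
    seen = set()
    logical = 0
    stored = 0
    for data in streams:
        for i in range(0, len(data), chunk_size):
            chunk = data[i:i + chunk_size]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                stored += len(chunk)
    return logical, stored


def make_block(n):
    """Build a synthetic block whose content depends only on n."""
    return n.to_bytes(4, "big") * (CHUNK_SIZE // 4)


# Backup-style workload: twenty full copies of the same unchanged data.
volume = b"".join(make_block(n) for n in range(100))
backup_logical, backup_stored = dedup_footprint([volume] * 20)

# Primary-style workload: one copy in which 40 of 100 blocks repeat others.
primary = b"".join(make_block(n % 60) for n in range(100))
primary_logical, primary_stored = dedup_footprint([primary])

print(f"backup:  {backup_logical // backup_stored}:1")
print(f"primary: {100 * (primary_logical - primary_stored) // primary_logical}% saved")
```

In this toy setup the backup case reaches 20:1 only because the same volume was written twenty times, while the single primary copy saves 40%, which works out to roughly 1.67:1.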
Certainly the extent of data reduction, as with any method of compression, depends heavily on the characteristics of the data involved. Based on these data points, however, the levels being achieved appear to be at or below those attained through more traditional data compression methods (gzip, OS filesystem-based compression, tape drive compression and so on).
This raises the question of why storage vendors, in general, have not leveraged traditional compression techniques for at least some categories of applications. Anyone who has enabled file-based compression on a laptop knows that it introduces overhead and hurts performance, but hardware-based compression, such as that already incorporated into tape drives and some VTLs, could potentially minimize or even eliminate this overhead in a storage system. One company, Storwize, offers a network-based appliance that compresses data in-stream on its way to the storage system and claims to actually increase performance because less data is ultimately written to disk. Otherwise, there appears to be a dearth of compression options for storage.
Let me emphasize that data de-duplication and compression are not mutually exclusive. While traditional compression cannot achieve the very high reduction numbers achievable in some circumstances via de-duplication, given the current emphasis on reducing the storage footprint, as well as operational power and cooling costs, making appropriate use of every tool in our arsenal makes sense.
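As a rough illustration of how the two techniques can stack, the Python sketch below first de-duplicates fixed-size blocks and then compresses each surviving unique block with zlib. The block size, the synthetic record data and the helper names are assumptions made for this example and do not describe any shipping product.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical block size, for illustration only


def make_block(n):
    """Build a text-like block that is internally repetitive (compressible)."""
    line = f"record {n:06d}: status=OK\n".encode()
    return (line * (CHUNK_SIZE // len(line) + 1))[:CHUNK_SIZE]


def footprint(data, chunk_size=CHUNK_SIZE):
    """Return (raw, deduped, deduped_and_compressed) sizes in bytes."""
    unique = {}
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        # Keep only the first copy of each distinct block.
        unique.setdefault(hashlib.sha256(chunk).digest(), chunk)
    deduped = sum(len(c) for c in unique.values())
    # Compress each unique block independently with zlib (DEFLATE).
    compressed = sum(len(zlib.compress(c)) for c in unique.values())
    return len(data), deduped, compressed


# 200 blocks, of which only 50 are distinct.
data = b"".join(make_block(n % 50) for n in range(200))
raw, deduped, both = footprint(data)

print(f"raw: {raw} bytes, after de-dup: {deduped}, after de-dup + zlib: {both}")
```

De-duplication removes the repeated blocks, and compression then shrinks what is left, so each step reduces the footprint further on data like this. How much each contributes on real data will vary widely.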