Opinion: Data de-dup offers new storage management possibilities
The 20:1 compression rates once touted by vendors seem to have been greatly exaggerated
Computerworld - Last week's announcement (see "NetApp Set to Launch De-duplication Tool") by Network Appliance of their A-SIS data de-duplication technology for primary storage opens new storage management possibilities, raises the bar for their competition, but also brings up some interesting questions.
Most data de-duplication efforts have focused on secondary data applications, particularly backup. Companies like Data Domain have been very successful in this area, and in addition to achieving impressive de-duplication numbers, they have been ratcheting up performance capabilities with each new generation of products. So it only seems natural to consider applying de-dup technology more broadly.
Interestingly, in the announcement, there appears to be an effort to lower expectations regarding the rate of data reduction likely to be realized in primary storage situations, suggesting that the 20:1 or more savings quoted by backup de-dup vendors are largely the result of multiple copies of unchanged data over time. They warn that the reduction will be significantly lower in non-backup scenarios, perhaps 40% or less, and this is borne out by the users quoted in the article referenced above.
Certainly the extent of data reduction, as with any method of compression, depends heavily on the characteristics of the data involved, but based on these data points, it appears that the levels being achieved are at or below those attained through more traditional data compression methods (e.g. gzip, OS filesystem-based compression, tape drives, etc.).
This raises the question of why storage vendors, in general, have not leveraged traditional compression techniques for at least some category of application. While anyone who has enabled file-based compression on their laptop knows that it introduces overhead and impacts performance, hardware-based compression, such as that already incorporated into tape drives and some VTLs, could potentially minimize or even eliminate this overhead in a storage system. One company, StorWiz, offers a network-based appliance that compresses data in-stream on its way to the storage system and claims to actually increase performance due to the reduced quantity of data ultimately being written to disk. Otherwise there appears to be a dearth of compression options for storage.
Let me emphasize that data de-duplication and compression are not mutually exclusive. While traditional compression cannot achieve the very high reduction numbers achievable in some circumstances via de-duplication, given the current emphasis on reducing the storage footprint, as well as operational power and cooling costs, making appropriate use of every tool in our arsenal makes sense.
Read more about Data Storage in Computerworld's Data Storage Topic Center.
- Data Warehouse Augmentation: The Queryable Data Store While organizations have, to date, been busy exploring and experimenting, they are now beginning to focus on using big data technologies to solve...
- Rebranded Quadmark revamps its IT solutions with Google Apps Switching to Google Apps halved Quadmark's IT admin costs while achieving 10% time savings per employee. The global consulting firm now spends 80%...
- CrashPlan PROe Security Because mobile laptops often are connected to unsecured networks, a very high standard of security is required to ensure privacy.
- Protecting Digitalized Assets in Healthcare Healthcare providers face an urgent, internal battle every day: security and compliance versus productivity and service. For most healthcare organizations, the fight is...
- Live Webcast LIVE EVENT: 5/7, The End of Data Protection As We Know It. Introducing a Next Generation Data Protection Architecture. Traditional backup is going away, but where does this leave end-users?
- LIVE EVENT: 5/7, The End of Data Protection As We Know It. Introducing a Next Generation Data Protection Architecture. Traditional backup is going away, but where does this leave end-users?
- Make or Break: New Auto Products Must Go To Market On Time This Webcast quantifies the value of time to market for the auto industry and highlights how Primavera Enterprise Portfolio Management can help organizations. All Data Storage White Papers | Webcasts