Opinion: Data de-dup offers new storage management possibilities
The 20:1 compression rates once touted by vendors seem to have been greatly exaggerated
Computerworld - Last week's announcement by Network Appliance of its A-SIS data de-duplication technology for primary storage (see "NetApp Set to Launch De-duplication Tool") opens new storage management possibilities and raises the bar for the competition, but it also brings up some interesting questions.
Most data de-duplication efforts have focused on secondary data applications, particularly backup. Companies like Data Domain have been very successful in this area, and in addition to achieving impressive de-duplication numbers, they have been ratcheting up performance capabilities with each new generation of products. So it only seems natural to consider applying de-dup technology more broadly.
Interestingly, the announcement appears to lower expectations regarding the rate of data reduction likely to be realized in primary storage situations, suggesting that the savings of 20:1 or more quoted by backup de-dup vendors are largely the result of storing multiple copies of unchanged data over time. NetApp warns that the reduction will be significantly lower in non-backup scenarios, perhaps 40% or less, and this is borne out by the users quoted in the article referenced above.
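To put the two figures on the same scale, the following is a minimal sketch, not any vendor's implementation. It assumes fixed-size 4 KB blocks identified by SHA-256 hashes and uses synthetic data; the function names and the scenario sizes (20 full backups, a volume with 40 duplicate blocks out of 100) are hypothetical, picked only to mirror the numbers in the text.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed block size, chosen for illustration


def make_blocks(count, label):
    """Build `count` distinct CHUNK_SIZE-byte blocks of synthetic data."""
    blocks = []
    for i in range(count):
        seed = hashlib.sha256(f"{label}-{i}".encode()).digest()  # 32 bytes
        blocks.append(seed * (CHUNK_SIZE // len(seed)))
    return blocks


def dedup_sizes(streams):
    """Return (logical_bytes, stored_bytes) when each unique block is kept once."""
    seen = set()
    logical = stored = 0
    for data in streams:
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            logical += len(chunk)
            digest = hashlib.sha256(chunk).digest()
            if digest not in seen:
                seen.add(digest)
                stored += len(chunk)
    return logical, stored


def ratio_and_percent(logical, stored):
    """Express the same saving as an N:1 ratio and as a percentage."""
    return logical / stored, 100.0 * (1 - stored / logical)


def backup_scenario(copies=20, blocks=100):
    """Assumed backup case: the same unchanged volume written `copies` times."""
    volume = b"".join(make_blocks(blocks, "vol"))
    return dedup_sizes([volume] * copies)


def primary_scenario(unique=60, duplicates=40):
    """Assumed primary case: one volume in which some blocks happen to repeat."""
    blocks = make_blocks(unique, "vol")
    volume = b"".join(blocks + blocks[:duplicates])
    return dedup_sizes([volume])


if __name__ == "__main__":
    for name, scenario in (("backup", backup_scenario), ("primary", primary_scenario)):
        ratio, percent = ratio_and_percent(*scenario())
        print(f"{name}: {ratio:.2f}:1, {percent:.1f}% saved")
```

Under these assumptions, twenty identical backups collapse to a single stored copy, which is a 20:1 ratio or a 95% saving, while a 40% saving on a single volume works out to a ratio of only about 1.7:1.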
Certainly the extent of data reduction, as with any method of compression, depends heavily on the characteristics of the data involved. Based on these data points, however, it appears that the levels being achieved are at or below those attained through more traditional data compression methods (e.g., gzip, OS filesystem-based compression and the hardware compression built into tape drives).
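How much the result depends on the data can be seen with a small experiment. This is a sketch under stated assumptions: it uses Python's zlib module (the same DEFLATE family as gzip) at an arbitrary compression level, and both sample inputs are synthetic stand-ins rather than real workloads.

```python
import random
import zlib


def compressed_fraction(data):
    """Compressed size as a fraction of the original size (lower is better)."""
    return len(zlib.compress(data, 6)) / len(data)


def repetitive_sample(size=100_000):
    """Highly redundant input: one log-style line repeated (synthetic)."""
    line = b"backup job completed status=OK\n"
    return (line * (size // len(line) + 1))[:size]


def random_sample(size=100_000, seed=0):
    """Pseudo-random bytes, standing in for already-compressed or encrypted data."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


if __name__ == "__main__":
    print(f"repetitive: {compressed_fraction(repetitive_sample()):.3f}")
    print(f"random:     {compressed_fraction(random_sample()):.3f}")
```

The repetitive sample shrinks to a small fraction of its size, while the random sample does not shrink at all, which is why any single quoted reduction figure should be read against the data it was measured on.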
This raises the question of why storage vendors, in general, have not leveraged traditional compression techniques for at least some categories of applications. While anyone who has enabled file-based compression on a laptop knows that it introduces overhead and impacts performance, hardware-based compression, such as that already incorporated into tape drives and some VTLs, could potentially minimize or even eliminate this overhead in a storage system. One company, StorWiz, offers a network-based appliance that compresses data in-stream on its way to the storage system and claims to actually increase performance because less data is ultimately written to disk. Otherwise, there appears to be a dearth of compression options for storage.
Let me emphasize that data de-duplication and compression are not mutually exclusive. While traditional compression cannot achieve the very high reduction numbers achievable in some circumstances via de-duplication, given the current emphasis on reducing the storage footprint, as well as operational power and cooling costs, making appropriate use of every tool in our arsenal makes sense.
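The way the two techniques can stack is sketched below. This is an illustration under assumptions, not a description of any shipping product: blocks are a hypothetical fixed 4 KB, duplicates are found by SHA-256 hash, each surviving unique block is then compressed with zlib, and the sample volume is synthetic.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # hypothetical fixed block size


def make_volume(unique_blocks=50, copies=2):
    """Synthetic volume: text-like blocks, each appearing `copies` times."""
    blocks = []
    for i in range(unique_blocks):
        line = f"record {i:06d} status=OK\n".encode()
        blocks.append((line * (CHUNK_SIZE // len(line) + 1))[:CHUNK_SIZE])
    return b"".join(blocks * copies)


def unique_chunks(data):
    """De-duplicate: keep the first occurrence of each distinct block."""
    seen = set()
    kept = []
    for off in range(0, len(data), CHUNK_SIZE):
        chunk = data[off:off + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).digest()
        if digest not in seen:
            seen.add(digest)
            kept.append(chunk)
    return kept


def footprint(data):
    """Return (raw, dedup_only, dedup_plus_compress) sizes in bytes."""
    kept = unique_chunks(data)
    dedup_only = sum(len(c) for c in kept)
    both = sum(len(zlib.compress(c)) for c in kept)
    return len(data), dedup_only, both


if __name__ == "__main__":
    raw, dedup_only, both = footprint(make_volume())
    print(f"raw={raw} dedup={dedup_only} dedup+compress={both}")
```

De-duplication removes the repeated blocks, and compression then shrinks what remains, so on data like this the combined footprint is smaller than either the raw or the de-duplicated size alone.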