Opinion: Data de-dup offers new storage management possibilities
The 20:1 compression rates once touted by vendors seem to have been greatly exaggerated
Computerworld - Last week's announcement by Network Appliance of its A-SIS data de-duplication technology for primary storage (see "NetApp Set to Launch De-duplication Tool") opens new storage management possibilities and raises the bar for the competition, but it also brings up some interesting questions.
Most data de-duplication efforts have focused on secondary data applications, particularly backup. Companies like Data Domain have been very successful in this area, and in addition to achieving impressive de-duplication numbers, they have been ratcheting up performance capabilities with each new generation of products. So it only seems natural to consider applying de-dup technology more broadly.
Interestingly, the announcement appears to be an effort to lower expectations for the rate of data reduction likely to be realized in primary storage. It suggests that the 20:1 or greater savings quoted by backup de-dup vendors are largely the result of storing multiple copies of unchanged data over time. NetApp warns that the reduction will be significantly lower in non-backup scenarios, perhaps 40% or less, and this is borne out by the users quoted in the article referenced above.
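To make the gap between backup and primary storage concrete, here is a minimal Python sketch. It is not NetApp's or any vendor's algorithm: the fixed 4 KB block size, the SHA-256 fingerprinting, the 20 nightly full backups with 1% of blocks changing per night, and the 30% duplicate rate on the primary volume are all assumptions chosen for illustration.

```python
import hashlib
import random

BLOCK_SIZE = 4096  # assumed fixed block size; real products may chunk differently


def random_block(rng):
    """Return one block of pseudo-random (effectively unique) data."""
    return rng.getrandbits(8 * BLOCK_SIZE).to_bytes(BLOCK_SIZE, "big")


def dedup_ratio(streams):
    """Logical bytes written divided by bytes kept after block-level de-dup."""
    seen = set()
    logical = stored = 0
    for data in streams:
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            logical += len(block)
            digest = hashlib.sha256(block).digest()
            if digest not in seen:  # first time this content is seen: store it
                seen.add(digest)
                stored += len(block)
    return logical / stored


rng = random.Random(42)

# Backup scenario (assumed): 20 nightly full backups of a 200-block volume,
# with 2 blocks (1%) rewritten between consecutive nights.
volume = [random_block(rng) for _ in range(200)]
backups = []
for night in range(20):
    if night > 0:
        for idx in rng.sample(range(200), 2):
            volume[idx] = random_block(rng)
    backups.append(b"".join(volume))
backup_ratio = dedup_ratio(backups)

# Primary scenario (assumed): a single 200-block volume in which 60 blocks
# are copies of the other 140.
unique = [random_block(rng) for _ in range(140)]
primary = unique + [rng.choice(unique) for _ in range(60)]
primary_ratio = dedup_ratio([b"".join(primary)])
primary_savings = 1 - 1 / primary_ratio

print(f"backup de-dup ratio:  {backup_ratio:.1f}:1")
print(f"primary de-dup ratio: {primary_ratio:.2f}:1 ({primary_savings:.0%} saved)")
```

Under these assumptions the backup set stores 238 unique blocks out of 4,000 written, a ratio of nearly 17:1, while the primary volume saves only 30%. Almost all of the backup figure comes from the nightly repetition of unchanged blocks, which is the point the announcement makes.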
Certainly the extent of data reduction, as with any method of compression, depends heavily on the characteristics of the data involved. But based on these data points, it appears that the levels being achieved are at or below those attained through more traditional data compression methods (e.g., gzip, OS filesystem-based compression or tape drive compression).
This raises the question of why storage vendors, in general, have not leveraged traditional compression techniques for at least some category of application. While anyone who has enabled file-based compression on their laptop knows that it introduces overhead and impacts performance, hardware-based compression, such as that already incorporated into tape drives and some VTLs, could potentially minimize or even eliminate this overhead in a storage system. One company, StorWiz, offers a network-based appliance that compresses data in-stream on its way to the storage system and claims to actually increase performance due to the reduced quantity of data ultimately being written to disk. Otherwise there appears to be a dearth of compression options for storage.
Let me emphasize that data de-duplication and compression are not mutually exclusive. Traditional compression cannot achieve the very high reduction numbers that de-duplication delivers in some circumstances. But given the current emphasis on reducing the storage footprint, as well as operational power and cooling costs, making appropriate use of every tool in our arsenal makes sense.
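As a rough illustration of how the two techniques can stack, the Python sketch below de-duplicates fixed-size blocks and then compresses each surviving block with zlib. The block size, the synthetic text records and the four-fold duplication are assumptions made up for this example, not measurements from any product.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size


def stored_bytes(data, dedup=False, compress=False):
    """Bytes kept after optionally de-duplicating and/or compressing blocks."""
    seen = set()
    total = 0
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        if dedup:
            digest = hashlib.sha256(block).digest()
            if digest in seen:  # duplicate block: keep only a reference
                continue
            seen.add(digest)
        total += len(zlib.compress(block)) if compress else len(block)
    return total


def text_block(n):
    """Build one block of repetitive, text-like data (hypothetical records)."""
    line = f"customer-{n:04d};status=active;region=emea\n".encode()
    return (line * (BLOCK_SIZE // len(line) + 1))[:BLOCK_SIZE]


# 50 distinct blocks, with the whole set stored four times over.
data = b"".join(text_block(n) for n in range(50)) * 4

raw = stored_bytes(data)
dedup_only = stored_bytes(data, dedup=True)
compress_only = stored_bytes(data, compress=True)
both = stored_bytes(data, dedup=True, compress=True)

for label, size in [("raw", raw), ("de-dup only", dedup_only),
                    ("compression only", compress_only), ("both", both)]:
    print(f"{label:>17}: {size:>7} bytes")
```

On this synthetic data, de-duplication removes the three redundant copies and compression then shrinks each remaining block, so the combined footprint is smaller than either technique achieves alone. Real data will behave differently, which is why the mix of tools should be matched to the workload.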
Jim Damoulakis is chief technology officer of GlassHouse Technologies Inc., a leading provider of independent storage services. He can be reached at jimd@glasshouse.com.