Content addressed storage systems may be at risk
The MD5 hashing algorithm sometimes used has a security flaw
Computerworld - Security experts are warning about a flawed hashing algorithm, MD5, used by some vendors for digital signatures to store data securely on increasingly popular content addressed storage systems. The warnings come as more companies unveil CAS systems to meet the need for disk-based backup of fixed data such as e-mails and medical images.
"It really is time for [the industry] to stop using MD5," said Dan Kaminsky, a security consultant at Avaya Inc. in Basking Ridge, N.J. "MD5 has been a deprecated hashing algorithm for almost a decade. The U.S. government agreed."
According to Kaminsky, MD5 has been decertified for secure operations by the National Institute of Standards and Technology since at least 1998. "The industry has clung to the algorithm, partially out of inertia, partially out of scarcity of computer power," he said.
There are currently three major CAS vendors: EMC Corp., Permabit Inc. in Cambridge, Mass., and Archivas Inc. in Waltham, Mass. Both EMC and Archivas use the MD5 hashing algorithm; Permabit does not.
Just this week, Storage Technology Corp. announced that it would OEM Permabit's technology for e-mail archival. And Sun Microsystems Inc. is currently developing its own CAS system, called Honeycomb, which has several beta testers; Sun plans to release it toward the end of the year.
Sun wouldn't say which algorithm it will use to store data.
Kaminsky published a report last month on the MD5 algorithm pointing out that an attack could be used to create two files with the same MD5 hash, one with "safe" data and one with "malicious" data. When both of those files are saved to the same system, a so-called collision can result, leading to data loss or dissemination of bad data, Kaminsky said.
CAS systems store metadata and data along with management policies to create an object that is quickly retrievable, no matter where it's stored on a disk subsystem. CAS also uses write-once, read-many (WORM) capability to ensure that once data is stored it cannot be overwritten, which satisfies several regulatory requirements. Hashing is a way to create a shorter fixed-length key or index that represents the original data stored in a device. A multidigit number, for example, could be a hash representing a person's longer name or a specific document.
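To make the collision risk concrete, here is a minimal Python sketch of a content-addressed, write-once store. It is an illustration under stated assumptions, not how any vendor's product works: the `ContentStore` class, its `put` and `get` methods, and the helper functions are hypothetical names invented for this example. Because crafting a real MD5 collision takes specialized tooling, the demo uses a deliberately weak 8-bit "toy" hash so that two different inputs with the same address can be found by brute force. SHA-256 appears only as one example of a stronger hash; the article does not say which algorithm any vendor uses in place of MD5.

```python
import hashlib
from typing import Callable, Dict, Tuple


def md5_address(data: bytes) -> str:
    """Fixed-length key derived from the content (128-bit MD5, hex-encoded)."""
    return hashlib.md5(data).hexdigest()


def sha256_address(data: bytes) -> str:
    """Same idea with SHA-256 (256-bit digest), shown as one stronger option."""
    return hashlib.sha256(data).hexdigest()


def toy_address(data: bytes) -> str:
    """Deliberately weak 8-bit hash, used only to make a collision easy to find."""
    return hashlib.md5(data).hexdigest()[:2]


class ContentStore:
    """Hypothetical write-once store: an object's address is the hash of its bytes."""

    def __init__(self, address_fn: Callable[[bytes], str]) -> None:
        self._address_fn = address_fn
        self._objects: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        address = self._address_fn(data)
        # WORM-style behavior: an existing address is never overwritten.
        # The store assumes that equal addresses mean equal content.
        self._objects.setdefault(address, data)
        return address

    def get(self, address: str) -> bytes:
        return self._objects[address]


def find_colliding_pair(address_fn: Callable[[bytes], str]) -> Tuple[bytes, bytes]:
    """Brute-force two different inputs that map to the same address."""
    seen: Dict[str, bytes] = {}
    counter = 0
    while True:
        candidate = f"document-{counter}".encode()
        address = address_fn(candidate)
        if address in seen:
            return seen[address], candidate
        seen[address] = candidate
        counter += 1


if __name__ == "__main__":
    store = ContentStore(toy_address)
    first, second = find_colliding_pair(toy_address)
    address_1 = store.put(first)
    address_2 = store.put(second)
    print(address_1 == address_2)             # True: both inputs share one address
    print(store.get(address_2) == second)     # False: the second object was silently dropped
```

In this sketch the second write is lost because the store treats a matching address as proof of matching content. That is the failure mode Kaminsky describes: whichever of two colliding files arrives first is the one every later reader gets back.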
The vast majority of CAS is being purchased by the financial services and medical industries to store data regulated by the U.S. Securities and Exchange Commission under Rule 17a-4 and Health Insurance Portability and Accountability Act regulations. EMC's $200,000 Centera CAS array uses


