Skip the navigation

Vendor disk failure rates: Myth or metric?

Disk problems contribute to 20% to 55% of storage subsystem failures

By Mary Brandel
April 4, 2008 12:00 PM ET

Computerworld - The statistics of mean time between failures (MTBF) and average failure rate (AFR) have gotten lots of attention lately in the storage world, especially with the release of three much-discussed studies devoted to the topic in the last year. And for good reason: Vendor-stated MTBFs have risen into the 1 million-to-1.5 million-hour range, equaling 114 to 170 years, a lifespan that no one is seeing in the real world.

Three studies over the past year on MTBF include the following:

"MTBF is a term that's in growing disrepute inside the industry because people don't understand what the numbers mean," says Robin Harris, an analyst at Data Mobility Group who also runs the StorageMojo blog. "Your average consumer and a lot of server administrators don't really get why vendors say a disk has a 1 million-hour MTBF, and yet it doesn't last that long."

Indeed, "how do these numbers help a person who wants to evaluate drives?" says Steve Smith, a former EMC Corp. employee and an independent management consultant in Bellevue, Wash. "I don't think they can.

Even storage system maker NetApp Inc. acknowledges in a response to an open letter on the StorageMojo blog that failure rates are several times higher than reported. "Most experienced storage array customers have learned to equate the accuracy of quoted drive-failure specs to the miles-per-gallon estimates reported by car manufacturers," the company says. "It's a classic case of 'Your mileage may vary' -- and often will -- if you deploy these disks in anything but the mildest of evaluation/demo lab environments."

Study results

The upshot of the recent studies can be summarized this way: Users and vendors live in very different worlds when it comes to disk reliability and failure rates.

Consider that MTBF is a figure that's reached through stress-testing and statistical extrapolation, Harris says. "When the vendor specs a 300,000-hour MTBF -- which is common for consumer-level SATA drives -- they're saying that for a large population of drives, half will fail in the first 300,000 hours of operation," he says on his blog. "MTBF, therefore, says nothing about how long any particular drive will last." In other words, MTBF does a very poor job communicating what the actual failure profile looks like, he says.

It's like providing the average woman's height in the U.S. but without showing the numbers used to derive that average, Smith says. "MTBF became the standard because it was perceived as a simpler answer to the question of reliability than showing the data of how they arrived at it," Smith says. "It's an honest-to-God simplification."



Additional Resources
Forrester Consulting - Optimizing Users and Applications in a Mobile World
WHITE PAPER
Solving application issues over the WAN requires careful consideration. Based on their independent research, Forrester Consulting offers recommendations on how to tackle application performance issues, insufficient bandwidth and the inability to quickly restore users in a disaster.

Read now.

Security KnowledgeVault
WHITE PAPER
Security is not an option. This KnowledgeVault Series offers professional advice how to be proactive in the fight against cybercrimes and multi-layered security threats; how to adopt a holistic approach to protecting and managing data; and how to hire a qualified security assessor. Make security your Number 1 priority.

Read now.

Cut Communications Costs Once and for All
WHITE PAPER
New IP-based communications systems are being deployed by small and midsized businesses at a rapid rate. Learn how these organizations are enabling faster responsiveness, creating better customer experiences, speeding office or mobile interactions, and dramatically reducing existing communications costs.

Read now.

Business Continuity White Papers
TechRepublic: Cloud Computing - Potential Value for Your Company?
Content provided by Google

Imagine a world without the hassle of licenses and hardware management - cloud computing makes this possible. Learn more about...
Forrester Analyst White Paper "The State of Enterprise Disaster Recovery Preparedness 2011"
This report outlines five trends that enterprises are architecting to better equip their DR solutions today including: secondary site configuration and separation, cloud...
Forrester Wave Report
Improvements in disaster recovery plans and broad business continuity strategies are top-of-mind concerns for leading enterprises today and recovery time is now measured...
Data Dedupe: It's not a question of if, rather where and when!
There is no such thing as a data or information recession! Data keeps growing in almost every economic climate. Fixed or shrinking budgets...
ESG: What's Changed with BC and DR?
To succeed and thrive, today's organizations must create network environments that enable them to continue operations or recover in the shortest possible time....
All Business Continuity White Papers
Business Continuity Webcasts
Customer Spotlight: How IPC The Hospitalist Company Implemented Oracle on VMware
Have you been looking to hear about customer's experiences with the new VMware vCenter Site Recovery Manager product? View this webcast to learn...
Virtualizing Microsoft and Oracle on VMware vSphere: Benefits and Best Practices
Virtualizing business-critical applications is an essential step in your journey to the cloud. Microsoft SQL Server, Exchange and SharePoint, and Oracle applications, are...
Optimizing Networks for the Cloud
Join guest speaker, Rohit Mehra, IDC Director of Enterprise Communications Infrastructure, to explore current trends, discuss best practices for optimizing Data Center and...
Apps QuickStart Series Part 2: Designing and Deploying SQL Server on VMware vSphere
Download this webcast to learn about the design considerations for virtualizing SQL workloads, performance and scalability information and high-availability options, as well as...
Apps QuickStart Series Part 1: Designing and Deploying Exchange 2010 on VMware vSphere
Download this webcast to learn the virtual hardware design considerations for Exchange 2010, deployment using the building block approach, options for high-availability and...
All Business Continuity Webcasts
Newsletter Sign-Up

Receive the latest news test, reviews and trends on your favorite technology topics

Choose a newsletter
  1. View all newsletters | Privacy Policy
IT Jobs