Disk drive failures 15 times what vendors say, study says
Drive vendors declined to be interviewed
Computerworld - Customers are replacing disk drives at rates far higher than those suggested by the estimated mean time to failure (MTTF) supplied by drive vendors, according to a study of about 100,000 drives conducted by Carnegie Mellon University.
The study, presented last month at the 5th USENIX Conference on File and Storage Technologies in San Jose, also shows no evidence that Fibre Channel (FC) drives are any more reliable than less expensive but slower performing Serial ATA (SATA) drives.
That surprising comparison of FC and SATA reliability could speed the trend away from FC to SATA drives for applications such as near-line storage and backup, where storage capacity and cost are more important than sheer performance, analysts said.
At the same conference, another study of more than 100,000 drives in data centers run by Google Inc. indicated that temperature seems to have little effect on drive reliability, even as vendors and customers struggle to keep temperatures down in their tightly packed data centers. Together, the results show how little information customers have for predicting the reliability of disk drives in actual operating conditions and for choosing among various drive types (see also "Hard data").
Real world vs. data sheets
The Carnegie Mellon study examined large production systems, including high-performance computing sites and Internet services sites running SCSI, FC and SATA drives. The data sheets for those drives listed MTTF of between 1 million and 1.5 million hours, which the study said should mean annual failure rates "of at most 0.88%." However, the study showed typical annual replacement rates of between 2% and 4%, "and up to 13% observed on some systems."
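The arithmetic behind those figures can be sketched as follows. This is a rough illustration, not the study's own code: it assumes the nominal annual failure rate is simply the number of hours in a year divided by the data-sheet MTTF, which is consistent with the "at most 0.88%" figure quoted above. The function and variable names are made up for the example.

```python
# Rough sketch: convert a data-sheet MTTF into a nominal annual failure rate.
# Assumption: AFR ~= hours in a year / MTTF (a simple ratio, not a full
# reliability model). Names here are illustrative only.

HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def afr_percent(mttf_hours: float) -> float:
    """Return the nominal annual failure rate, in percent, for a given MTTF."""
    return HOURS_PER_YEAR / mttf_hours * 100


# The two data-sheet MTTF values cited in the article.
for mttf in (1_000_000, 1_500_000):
    print(f"MTTF {mttf:,} hours -> nominal AFR {afr_percent(mttf):.2f}%")

# Compare the highest replacement rate reported (13%) with the nominal
# rate implied by a 1 million hour MTTF.
worst_case_ratio = 13.0 / afr_percent(1_000_000)
print(f"13% observed is about {worst_case_ratio:.1f} times the nominal rate")
```

Under that assumption, the 13% replacement rate seen on some systems works out to roughly 15 times the rate implied by a 1 million hour MTTF, which matches the headline figure.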
Garth Gibson, associate professor of computer science at Carnegie Mellon and co-author of the study, was careful to point out that the study didn't necessarily track actual drive failures, but cases in which a customer decided a drive had failed and needed replacement. He also said he has no vendor-specific failure information, and that his goal is not "choosing the best and the worst vendors" but to help them to improve drive design and testing.
He echoed storage vendors and analysts in pointing out that as many as half of the drives returned to vendors actually work fine, and that drives may fail for any number of reasons, such as a harsh environment at the customer site or intensive, random read/write operations that cause premature wear to the mechanical components in the drive.
Several drive vendors declined to be interviewed. "The conditions that surround true drive failures are complicated and require a detailed failure analysis to determine what the failure mechanisms were," said a spokesperson for Seagate Technology in Scotts Valley, Calif., in an e-mail. "It is important to not only understand the kind of drive being used, but the system or environment in which it was placed and its workload."