
Google: DRAM error rates vastly higher than previously thought

PCs will likely require error correction code in the future due to DRAM issues

By Lucas Mearian
October 8, 2009 03:51 PM ET

Computerworld - A study released this week by Google Inc. and the University of Toronto showed that data error rates on dynamic RAM (DRAM) modules are vastly higher than previously thought and may be responsible for more system shutdowns and service interruptions than had been assumed.

The study, which used tens of thousands of Google's servers, showed that about 8.2% of all dual in-line memory modules (DIMMs) are affected by correctable errors and that an average DIMM experiences about 3,700 correctable errors per year.

"Our first observation is that memory errors are not rare events. About a third of all machines in the fleet experience at least one memory error per year, and the average number of correctable errors per year is over 22,000," the report states.

"These numbers vary across platforms, with some platforms seeing nearly 50% of their machines affected by correctable errors, while in others only 12%-27% are affected."

The median number of errors per year on a Google server that had at least one error ranged from 25 to 611.

A memory error is marked by bits being read differently from how they were originally written. Memory errors can be caused by electrical or magnetic interference or by hardware corruption.
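As a rough illustration of that definition, the sketch below compares a value as written with the value later read back and lists which bit positions differ. The function name and the sample values are hypothetical and do not come from the study.

```python
def flipped_bits(written, read):
    """Return the bit positions (0 = least significant) where the value
    read back differs from the value originally written."""
    diff = written ^ read  # XOR leaves a 1 wherever the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# Hypothetical example: one byte written, then read back with bit 4 flipped.
written = 0b10110010
read = 0b10100010
print(flipped_bits(written, read))
```

An empty list means the value was read back exactly as it was written.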

Memory errors are classified as either soft or hard. Soft errors randomly corrupt bits but leave no physical damage and can be corrected. Hard errors corrupt bits (cells) within the DRAM through physical defects, so the same data errors recur. Soft errors are often caused by radiation or alpha particles, which naturally occur in organic materials, including the epoxy that DRAM chips come packed in. Hard errors are most often caused by chip contamination at the manufacturing facility, but they often don't show up in testing and surface only after the memory chip warms up after hours of use, according to Jim Handy, an analyst at Objective Analysis in Los Gatos, Calif.

The Google/University of Toronto study included memory from multiple vendors as well as multiple types of DRAM, such as DDR1, DDR2 and FB-DIMM.

The study covered a majority of servers in Google's data centers and was conducted over a period of two and a half years, from January 2006 to June 2008.

While the study focused on servers and stated that error rates are not climbing with the latest, denser generations of DRAM, the results show that PCs will eventually need error correction code (ECC) technology as memory chips become increasingly dense, Handy said.

ECC, implemented on dedicated chips, detects and corrects errors introduced during data storage or transmission.
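To give a sense of how an error-correcting code can both detect and repair a flipped bit, here is a minimal sketch of a textbook Hamming(7,4) code, which protects four data bits with three parity bits. It is an illustration only: the article does not say which code the studied hardware uses, server ECC memory typically relies on wider codes, and the function names are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.
    Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return (data_bits, error_position).
    error_position is 0 when no error was detected, else 1..7."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the bad bit
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the bad bit back
    return [c[2], c[4], c[5], c[6]], syndrome


# Simulate a single-bit soft error in a stored codeword.
stored = hamming74_encode([1, 0, 1, 1])
stored[4] ^= 1  # flip the bit at position 5
recovered, position = hamming74_decode(stored)
print(recovered, position)
```

A code of this kind repairs only a single flipped bit per codeword. Two flips in the same codeword would be miscorrected, which is why practical schemes add a further parity bit so that double-bit errors can at least be detected.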

Today, DRAM uses 50-nanometer lithography technology but is migrating to 40nm technology. The smaller the bits, the more susceptible they are to soft errors due to normal levels of radiation, Handy said.


