Gmail outage caused by overloaded servers

September 2, 2009 01:57 AM ET

IDG News Service - A worldwide outage of Google's Gmail online e-mail system on Tuesday was caused by a traffic jam on its servers, according to Google's official Gmail blog.

The trouble began after workers took some Gmail servers offline for routine upgrades. Recent changes meant to improve traffic flow on the request routers, the servers that direct Web queries to the appropriate Gmail server, then overloaded the system.

"As we now know, we had slightly underestimated the load which some recent changes placed on the request routers," Ben Treynor, site reliability czar, wrote on the Gmail blog. "At about 12:30 p.m. Pacific a few of the request routers became overloaded and in effect told the rest of the system 'stop sending us traffic, we're too slow!' This transferred the load onto the remaining request routers, causing a few more of them also to become overloaded, and within minutes nearly all of the request routers were overloaded."

The overload left people around the world unable to access Gmail for about 100 minutes, Treynor said, though he noted that IMAP/POP access and mail processing continued to work normally.
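
As a rough illustration of the cascade Treynor describes, the Python sketch below models a pool of request routers that split traffic evenly. It is a toy model, not Google's implementation: the function name, the capacity figures and the even-split rule are all assumptions made for the example. A router whose share exceeds its capacity stops accepting traffic, and the load is spread over the routers that remain.

```python
def surviving_routers(capacities, total_load):
    """Return how many routers are still serving once the cascade settles.

    Hypothetical toy model: total_load is split evenly across the active
    routers. Any router whose share exceeds its capacity refuses traffic,
    and the load is then re-split across whatever routers remain.
    """
    active = list(capacities)
    while active:
        share = total_load / len(active)
        # Keep only the routers that can handle their current share.
        still_ok = [c for c in active if c >= share]
        if len(still_ok) == len(active):
            break  # nobody dropped out this round; the pool is stable
        active = still_ok
    return len(active)


# Invented figures: eight routers with capacity 100, two with capacity 90.
pool = [100] * 8 + [90] * 2
print(surviving_routers(pool, 900))  # every router copes with a share of 90
print(surviving_routers(pool, 950))  # the two weakest drop out, then the rest
```

With these invented figures, a load of 900 leaves all ten routers serving. A load of 950 pushes the two weaker routers over their limit, which raises everyone else's share to 118.75 and takes out the remaining eight in the next round.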

Gmail engineers were alerted to the problem within seconds of the failures and, after identifying the cause, brought additional request routers online. Gmail is now more than 99.9 percent available to users, he said.

"We've turned our full attention to helping ensure this kind of event doesn't happen again," he wrote.

One fix the company plans is to have request routers slow down when overloaded instead of refusing to accept traffic. Treynor also said the request routers need sufficient failure isolation so that a problem in one data center doesn't affect servers in another.
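
A minimal sketch of why slowing down can help, assuming a simplified pool in which traffic is split evenly across routers. The function name and the capacity figures are hypothetical and nothing here is taken from Google's systems. A router that is over its limit keeps serving at its own capacity and delays the excess, so its share is never pushed onto its peers.

```python
def served_with_slowdown(capacities, total_load):
    """Return the load served per unit time when no router refuses traffic.

    Hypothetical toy model: each router receives an even share of
    total_load and serves at most its own capacity. Any excess is
    delayed (queued) at that router rather than redirected elsewhere.
    """
    share = total_load / len(capacities)
    return sum(min(capacity, share) for capacity in capacities)


# Invented figures: eight routers with capacity 100, two with capacity 90.
pool = [100] * 8 + [90] * 2
print(served_with_slowdown(pool, 950))
```

In this toy setup the eight stronger routers each serve their full share of 95 and the two weaker ones serve 90 each, so 940 of the 950 units are handled on time and only the remainder is delayed.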

The company will work over the next few weeks to make these changes and further improve reliability, he said.


Reprinted with permission from IDG.net
Story copyright 2009 International Data Group. All rights reserved.
