IDG News Service - Many Facebook users were unable to access the social networking site for up to two and a half hours on Thursday, the worst outage the website has had in over four years, Facebook said in a posting.
The problems were traced back to a change made by Facebook in one of its systems.
The change was made to a piece of data that was called upon whenever an error-checking routine found invalid data in Facebook's system. The piece of data was itself interpreted as invalid, which caused the system to try to replace it with the same piece of data, starting a feedback loop.
The loop resulted in hundreds of thousands of queries per second being sent to Facebook's database cluster, overwhelming the system.
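The mechanism can be illustrated with a minimal Python sketch. It is not Facebook's code: the `ConfigSystem` class, the `is_valid` rule, the `limit` key, and the read count are all hypothetical, chosen only to show how a replacement value that fails its own validity check turns every read into a database query.

```python
def is_valid(value):
    # Hypothetical rule: a configuration value must be a positive integer.
    return isinstance(value, int) and value > 0


class ConfigSystem:
    """Toy model: a cache in front of a persistent store (assumed design)."""

    def __init__(self, persistent_value, cached_value):
        self.persistent = {"limit": persistent_value}
        self.cache = {"limit": cached_value}
        self.db_queries = 0

    def read(self, key):
        value = self.cache.get(key)
        if is_valid(value):
            return value
        # The cached value looks invalid, so fetch a replacement from the
        # persistent store and overwrite the cache entry with it.
        self.db_queries += 1
        fresh = self.persistent[key]
        self.cache[key] = fresh
        return fresh


def simulate(persistent_value, reads=1000):
    # Start with a bad cached value and count how many reads reach the database.
    system = ConfigSystem(persistent_value=persistent_value, cached_value=-1)
    for _ in range(reads):
        system.read("limit")
    return system.db_queries


healthy = simulate(persistent_value=100)  # the replacement value is valid
broken = simulate(persistent_value=-1)    # the replacement value is itself invalid
print(healthy, broken)
```

In the healthy case only the first read reaches the database, because the cache is repaired. In the broken case the cache is "repaired" with the same invalid value, so all 1,000 reads reach the database. Multiplied across many clients, that pattern is consistent with the query volume Facebook described.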
The result for users was a "DNS error" message and no access to the site.
"The way to stop the feedback cycle was quite painful - we had to stop all traffic to this database cluster, which meant turning off the site," wrote Robert Johnson, director of software engineering at Facebook, in a post on the site. "Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site."
The problem hasn't been entirely fixed. Johnson said Facebook had to turn off the automated error-checking system to get the website back up and running, even though that system plays an integral role in protecting the website.
Facebook is now exploring new ways to handle the situation so it won't lead to another feedback loop.
"We apologize again for the site outage, and we want you to know that we take the performance and reliability of Facebook very seriously," he wrote.
It was the second day in a row that Facebook was inaccessible to some users. On Wednesday, Facebook blamed a third-party networking provider for making the site inaccessible to some.