IDG News Service - Many Facebook users were unable to access the social networking site for up to two and a half hours on Thursday, the worst outage the website has had in over four years, Facebook said in a posting.
The problems were traced back to a change made by Facebook in one of its systems.
The change was made to a piece of data that was called upon whenever an error-checking routine found invalid data in Facebook's system. The piece of data was itself interpreted as invalid, which caused the system to try to replace it with the same piece of data, and a feedback loop began.
The loop resulted in hundreds of thousands of queries per second being sent to Facebook's database cluster, overwhelming the system.
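The feedback loop described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Facebook's actual code: the `ToyCluster` class, the `is_valid` rule, and the idea of a cached copy sitting in front of a stored copy are all hypothetical stand-ins chosen to show why a replacement value that is itself invalid turns every read into a database query.

```python
from dataclasses import dataclass


@dataclass
class ToyCluster:
    """Hypothetical stand-in for a cached value backed by a database."""
    stored_value: int   # copy held in the database (assumed)
    cached_value: int   # copy that readers check first (assumed)
    db_queries: int = 0  # how many times the database was hit


def is_valid(value: int) -> bool:
    # Assumed validity rule for this sketch: negative values are invalid.
    return value >= 0


def read_with_repair(cluster: ToyCluster) -> int:
    """Read the value; if it looks invalid, refetch it from the database."""
    if not is_valid(cluster.cached_value):
        cluster.db_queries += 1
        # The "repair" overwrites the bad copy with whatever the database holds.
        cluster.cached_value = cluster.stored_value
    return cluster.cached_value


def simulate(stored_value: int, cached_value: int, reads: int) -> int:
    """Return the number of database queries caused by `reads` reads."""
    cluster = ToyCluster(stored_value, cached_value)
    for _ in range(reads):
        read_with_repair(cluster)
    return cluster.db_queries


# Healthy case: one query repairs the bad copy, later reads are free.
print(simulate(stored_value=5, cached_value=-1, reads=1000))   # 1
# Failure case: the replacement is also invalid, so every read hits the database.
print(simulate(stored_value=-1, cached_value=-1, reads=1000))  # 1000
```

In the failure case the repair never succeeds, so the query count grows with the number of reads. Multiplied across a large number of clients, that is the kind of load the article describes.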
The result for users was a "DNS error" message and no access to the site.
"The way to stop the feedback cycle was quite painful - we had to stop all traffic to this database cluster, which meant turning off the site," wrote Robert Johnson, director of software engineering at Facebook, in a post on the site. "Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site."
The problem hasn't been entirely fixed. Johnson said Facebook had to turn off the automated system to get the website back up and running. But that system does play an integral role in protecting the website.
Facebook is now exploring new ways to handle the situation so it won't lead to another feedback loop.
"We apologize again for the site outage, and we want you to know that we take the performance and reliability of Facebook very seriously," he wrote.
It was the second consecutive day that Facebook was down for some users. On Wednesday, Facebook blamed a third-party networking provider for making the site inaccessible to some.