IDG News Service - Many Facebook users were unable to access the social networking site for up to two and a half hours on Thursday, the worst outage the website has had in over four years, Facebook said in a posting.
The problems were traced back to a change made by Facebook in one of its systems.
The change was made to a piece of data that is called upon whenever an error-checking routine finds invalid data in Facebook's system. The new value was itself interpreted as invalid, so the system tried to replace it with the same piece of data, and a feedback loop began.
The loop resulted in hundreds of thousands of queries per second being sent to Facebook's database cluster, overwhelming the system.
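The failure mode described above can be illustrated with a minimal sketch. This is not Facebook's code: the validity rule, the `ConfigStore` and `Client` classes and the `simulate` function are all hypothetical names chosen for illustration. It assumes each client caches a value, checks it on every request and, when the check fails, asks the database for a replacement.

```python
def is_valid(value):
    # Hypothetical validity rule: the value must be a positive integer.
    return isinstance(value, int) and value > 0


class ConfigStore:
    """Stands in for the database cluster; counts the queries it receives."""

    def __init__(self, value):
        self.value = value
        self.queries = 0

    def fetch(self):
        self.queries += 1
        return self.value


class Client:
    """Caches the value and re-fetches it whenever the cached copy looks invalid."""

    def __init__(self, store):
        self.store = store
        self.cached = store.fetch()

    def handle_request(self):
        if not is_valid(self.cached):
            # The "fix" is to reload from the store. If the stored value is
            # itself invalid, this branch runs again on every later request.
            self.cached = self.store.fetch()
        return self.cached


def simulate(stored_value, clients=100, requests_per_client=10):
    """Return the total number of queries the store receives."""
    store = ConfigStore(stored_value)
    pool = [Client(store) for _ in range(clients)]
    for _ in range(requests_per_client):
        for client in pool:
            client.handle_request()
    return store.queries


print(simulate(5))   # valid stored value: only the initial fetches
print(simulate(-1))  # invalid stored value: every request hits the store
```

With a valid stored value the store sees only one query per client. With an invalid one, every request from every client turns into a query, because the replacement never passes the check. In this sketch the load grows with request volume and nothing inside the loop can end it, which is consistent with Johnson's account that the cycle only stopped once traffic to the cluster was cut off.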
The result for users was a "DNS error" message and no access to the site.
"The way to stop the feedback cycle was quite painful - we had to stop all traffic to this database cluster, which meant turning off the site," wrote Robert Johnson, director of software engineering at Facebook, in a post on the site. "Once the databases had recovered and the root cause had been fixed, we slowly allowed more people back onto the site."
The problem hasn't been entirely fixed. Johnson said Facebook had to turn off the automated error-checking system to get the website back up and running, even though that system plays an integral role in protecting the website.
Facebook is now exploring new ways to handle the situation so it won't lead to another feedback loop.
"We apologize again for the site outage, and we want you to know that we take the performance and reliability of Facebook very seriously," he wrote.
It was the second day in a row that Facebook was inaccessible to some users. On Wednesday, Facebook blamed a third-party networking provider for making the site inaccessible to some.