Amazon cloud outage staggers into Day 2

Many sites recovering, but Web 2.0 sites Reddit and Quora still affected

Amazon.com is well into the second day of trying to fix a cloud outage that has partially disabled or knocked out popular websites like Quora, Foursquare and Reddit.

The trouble started a little after 5 a.m. Eastern time on Thursday, when the company's Service Health Dashboard reported connectivity problems affecting its Relational Database Service across multiple zones in the eastern U.S. The service is used to manage relational databases in the cloud.

Server problems at Amazon's data center, which handles the company's EC2 Web hosting services, left websites, including popular Web 2.0 sites, staggering or disabled.

As of noon Eastern time on Friday, those sites had been affected for about 30 hours.

Earlier on Friday, at 5:41 a.m. Eastern time, Amazon reported that its engineers were making progress. At 9:18 a.m. it noted, "We're starting to see more meaningful progress in restoring volumes (many have been restored in the last few hours) and expect this progress to continue over the next few hours."

That was about 19 hours after Amazon reported Thursday afternoon that it was only a few hours away from having the problem solved.

Amazon updated users again at 11:49 a.m., saying that "many" customers have confirmed that their sites are recovering. "Our current estimate is that the majority of volumes will be recovered over the next five to six hours," the company reported.

Reddit reported at 10:30 a.m. that it was still running in emergency mode. Foursquare appeared to be up and running, while Quora was bouncing between read-only mode and failing to load at all, showing an "internal server error" message.

HootSuite was also having problems, reporting at one point that it was "back up" and then changing to "again offline."

Ezra Gottheil, an analyst at Technology Business Research, said the outage is a big problem for the disabled websites, but it's an even bigger problem for Amazon.

"It's a pretty big hit. It's big and it's public," Gottheil said. "When you're doing business on the Web, you don't want to have your doors closed -- ever. It's tough for the sites. Most users will check again later, but [Amazon will] lose a few."

Although Amazon has a reputation as one of the top players in the cloud sector, the disruption will be hard for the company's current customers to brush off -- and the publicity surrounding the outage could make it difficult for Amazon to attract new customers.

"This will give the other cloud vendors, especially the higher-end ones, a talking point that won't go away for years," Gottheil said.

Sharon Gaudin covers the Internet and Web 2.0, emerging technologies, and desktop and laptop chips for Computerworld. Follow Sharon on Twitter at @sgaudin or subscribe to Sharon's RSS feed. Her email address is sgaudin@computerworld.com.
