Who gets blame for Amazon outage?
Reliability of cloud services makes customers complacent; many don't plan for worst-case scenarios
Computerworld - Amazon.com has promised to provide a "detailed post-mortem" on the root causes of the prolonged outage of its cloud services in recent days. Users of the Amazon services, meanwhile, may also have to explain how they got caught up in the outage.
The ensuing conversations may be uncomfortable for both Amazon and its cloud customers -- perhaps even more so for users of the services.
Cloud services overall have been remarkably reliable, which may be fostering a dangerous complacency among customers who are putting too much trust in them. This is another old and familiar story of technology hubris, one that was famously illustrated by another tech marvel, the unsinkable Titanic.
In this case, it is IT managers who will have to explain to their users -- and to their companies' executives -- why they didn't have a lifeboat.
Amazon's partial outage, which began Thursday and seemed largely resolved today, was an exceptional event.
Data compiled by AppNeta on the uptime of 40 of the largest providers of cloud-based services, including Amazon, Google, Azure and Salesforce.com, shows how well cloud providers are delivering uninterrupted services. The performance management and network monitoring firm, known as Apparent Networks until this week, captures minute-by-minute uptime and other data from cloud providers used by its customers.
The overall industry yearly average of uptime for all the cloud services providers monitored by AppNeta is 99.948%, which is equal to 273 minutes of unavailability per year.
The worst providers clock in at 99.92%, or 420 minutes of unavailability each year.
The best providers are at 99.9994%, or three minutes of unavailability each year.
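The conversion from an uptime percentage to minutes of unavailability is simple arithmetic, and a short sketch makes it easy to check the figures above. The function name and the 365-day year are assumptions made for illustration; AppNeta's own method of computing these numbers is not described in its data.

```python
# Minutes in a year, assuming a 365-day (non-leap) year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_minutes_per_year(uptime_percent: float) -> float:
    """Convert a yearly uptime percentage into minutes of unavailability."""
    return (100.0 - uptime_percent) / 100.0 * MINUTES_PER_YEAR


# The three uptime figures quoted from the AppNeta data.
figures = [
    ("industry average", 99.948),
    ("worst providers", 99.92),
    ("best providers", 99.9994),
]

for label, uptime in figures:
    minutes = downtime_minutes_per_year(uptime)
    print(f"{label}: {uptime}% uptime is about {minutes:.0f} minutes per year")
```

Rounded to the nearest minute, the three results come out to 273, 420 and 3, which matches the quoted figures.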
The takeaway for cloud users looking at the AppNeta data is that the risk of an outage is generally very low.
But that's not how the world works.
For example, Ken Brill, founder of the Uptime Institute, which researches data center issues, points to Japan's Fukushima Nuclear Power Plant. For 40 years, there were no problems at the plant. Then an earthquake and tsunami that hit in March disabled the facility with catastrophic consequences.
Brill expects that a post-mortem on the nuclear plant will show at least 10 things that could have been done to help avoid the failure, reduce the magnitude of the damage, and make recovery easier or faster.
The Amazon post-mortem will likely show something similar, said Brill.
Despite the redundancies and backups built into the Amazon cloud, "you hit a combination of events for which the backups don't work," he said.
Users see the promise of cloud technology as a way to reduce costs and be greener, but "that [also] means concentrating processing in fewer, bigger places," said Brill. Thus, when something goes wrong, "it has a bigger impact."
