Amazon AWS EC2: Always on (uhh... except when it's not)

Many Amazon Web Services (AWS) Elastic Compute Cloud (EC2) Instances will reboot themselves over the next few days. Looks like there's a nasty Xen bug that's responsible, but Amazon is being cagey about the details.

In IT Blogwatch, bloggers get ready to run /usr/bin/uptime.

Your humble blogwatcher curated these bloggy bits for your entertainment.

Brandon Butler chronicles the lack of chronic uptime:

Amazon Web Services [says it] needs to reboot up to 10% of its cloud servers in the coming days, and it doesn’t have anything to do with...the Bash Bug, which some are dubbing Shellshock.

[It's because of] what it calls a “timely security and operational update” related to its open source Xen hypervisor. ... The updates must be done before October 1, when details of the Xen flaw are made public. ... Customers don’t necessarily have to do anything, but they should be prepared for their EC2 instances to go down for a few minutes.

And Mike Wheatley adds his angle:

Amazon says [it's] a “timely security and operational update.”

It should be noted that anyone who uses the Xen hypervisor is affected by this issue, which includes Rackspace. ... Amazon is in a race against time to fix it. All of the updates must be completed before...details of the Xen flaw will be made public with the update XSA-108 release.

So Amazon's Jeff Barr evangelizes thuswise:

[We've] started notifying some of our customers of a timely security and operational update we need to perform on...less than 10% of our EC2 fleet globally. ... Security and operational excellence are our top two priorities.

Following security best practices, the details...are embargoed until [Oct 1]. The issue...affects many Xen environments, and is not specific to AWS. ... The instances that need the update...will be unavailable for a few minutes while the patches are...applied and the host is being rebooted. ... Instances requiring a reboot will be staggered so that no two regions or availability zones are impacted at the same time [so] most customers should experience no significant issues with the reboots. ... We wouldn’t inconvenience our customers if it wasn’t important and time-critical.

Customers who aren’t sure if they are impacted should go to the “Events” page on the EC2 console.
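For anyone who would rather script that check than click through the console, here is a minimal sketch in Python. It assumes a response shaped like the EC2 DescribeInstanceStatus API (as returned by boto3's `describe_instance_status`); the function name `scheduled_reboots`, the `SAMPLE` data and the instance IDs are hypothetical and only there for illustration.

```python
from datetime import datetime, timezone

# EC2 event codes that mean the instance or its host will be rebooted.
REBOOT_CODES = {"instance-reboot", "system-reboot"}


def scheduled_reboots(response):
    """Return (instance_id, event_code, not_before) for each scheduled reboot.

    `response` is assumed to look like a DescribeInstanceStatus result.
    """
    found = []
    for status in response.get("InstanceStatuses", []):
        for event in status.get("Events", []):
            if event.get("Code") in REBOOT_CODES:
                found.append(
                    (status["InstanceId"], event["Code"], event.get("NotBefore"))
                )
    return found


# Hypothetical sample data (made-up IDs and times), not real AWS output.
SAMPLE = {
    "InstanceStatuses": [
        {
            "InstanceId": "i-0aaa1111",
            "Events": [
                {
                    "Code": "system-reboot",
                    "Description": "scheduled reboot",
                    "NotBefore": datetime(2014, 9, 27, 2, 0, tzinfo=timezone.utc),
                }
            ],
        },
        {"InstanceId": "i-0bbb2222", "Events": []},
        {"InstanceId": "i-0ccc3333"},
    ]
}

if __name__ == "__main__":
    # With AWS credentials configured, live data could be fetched with:
    #   import boto3
    #   response = boto3.client("ec2").describe_instance_status(
    #       IncludeAllInstances=True)
    for instance_id, code, not_before in scheduled_reboots(SAMPLE):
        print(instance_id, code, not_before)
```

The parsing is kept separate from the API call so it can be tried against canned data first; an empty result only means no reboot event is currently scheduled for the instances in that response.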

But Ben Kepes keeps talking to deep throats:

An industry insider, however, who has visibility over the broader impacts of this issue, suggested...the reboot goes further than this: ... "AWS is furiously rebooting pretty much every[thing to] get the patch in in the next 5 days."

The clock is ticking. ... One has to wonder how many virtual servers will remain at risk.

Why only some instances? Jason Verge verges on the informative:

The following instance types will not be affected: T1, T2, M2, R3 and HS1.
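As a rough illustration, that list can be turned into a quick filter over an inventory of instance types. This is only a sketch: the family names come straight from the quote above, the helper name `on_unaffected_list` is made up, and an instance type outside the list is not necessarily scheduled for a reboot (Amazon puts the figure at less than 10% of the fleet).

```python
# Instance families reported as unaffected in the quote above.
UNAFFECTED_FAMILIES = {"T1", "T2", "M2", "R3", "HS1"}


def on_unaffected_list(instance_type):
    """True if the type's family (the part before the dot) is on the list."""
    family = instance_type.split(".", 1)[0].upper()
    return family in UNAFFECTED_FAMILIES


if __name__ == "__main__":
    for itype in ["t2.micro", "m3.large", "hs1.8xlarge", "c3.xlarge"]:
        verdict = "unaffected" if on_unaffected_list(itype) else "check Events page"
        print(itype, "->", verdict)
```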

Meanwhile, Rich Mogull points and laughs:

Amazon uses a modified version of the Xen hypervisor. ... They do not support live migration [which] allows you to move a running virtual machine from one physical host to another without shutting it down.

Every time you shut an instance down and start it again you likely move to a new host server. That is just normal cloud automation at work. ... Simple reboots generally do not trigger a host migration because a reboot doesn’t actually shut down the entire instance – [it] just executes the operating system shutdown and reboot procedures, but the instance is never destroyed. ... This is why I am a massive fan of DevOps – its techniques provide extra resiliency for situations like this.
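The stop-then-start cycle Mogull describes (as opposed to a plain reboot) might be scripted roughly as follows. This is a sketch, not a recommendation: `stop_then_start` is a hypothetical helper, and `ec2` is assumed to be a boto3 EC2 client or anything with the same `stop_instances`, `start_instances` and `get_waiter` methods. Stopping only works for EBS-backed instances, and anything on instance-store disks is lost when the instance stops.

```python
def stop_then_start(ec2, instance_id):
    """Fully stop an instance, then start it again.

    Unlike a reboot, a stop/start releases the instance from its host,
    so it will likely (not certainly) come back on a different one.
    """
    ids = [instance_id]
    ec2.stop_instances(InstanceIds=ids)
    # Block until EC2 reports the instance as stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=ids)
    ec2.start_instances(InstanceIds=ids)
    # Block until the instance is running again.
    ec2.get_waiter("instance_running").wait(InstanceIds=ids)


# Example use, assuming boto3 is installed and credentials are configured
# (the instance ID below is made up):
#   import boto3
#   stop_then_start(boto3.client("ec2"), "i-0aaa1111")
```

Doing this on your own schedule, ahead of Amazon's maintenance window, lets you choose when the downtime lands, but there is no guarantee the new host is already patched, so the Events page is still worth checking afterwards.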


You have been reading IT Blogwatch by Richi Jennings, who curates the best bloggy bits, finest forums, and weirdest websites… so you don't have to. Catch the key commentary from around the Web every morning. Hatemail may be directed to @RiCHi. Opinions expressed may not represent those of Computerworld. Ask your doctor before reading. Your mileage may vary. E&OE.

Copyright © 2014 IDG Communications, Inc.
