6 Devastating Cloud Outages Over The Last 6 Months

Keep The Lights Flashing

The trend toward cloud computing is designed to make life easier and less expensive for consumers of cloud services. For providers of those services, not so much. At least, we can be pretty sure it's not easier all of the time. As Apple engineers work to unravel an iCloud outage impacting a variety of different services this morning, CRN takes a look at some high-profile cloud service outages over the last six months.

Google

A brief service interruption this week involving Google Drive, Google Docs and Gmail must have had engineers at the Mountain View, Calif.-based company saying "Not again!" The April 17 issue was short-lived and impacted only a handful of customers, unlike last month when keeping the services running proved to be a challenge.

Google gave new meaning to the term "Triple Play" last month when the company's Google Drive storage service suffered three outages in one week. The first service interruption, which was blamed on a software glitch and a subsequent cascade of other problems, began on March 18 and impacted about one-third of the customer base for approximately two-and-a-half hours. The incident was followed by a two-hour outage on March 19 and another, lengthier loss of service on March 20.

Amazon Web Services

An accidental data deletion was said to be the cause of a Christmas Eve outage at Amazon's data center in northern Virginia. The mistake, which was apparently made by a developer unaware that his commands were also controlling online systems, impacted the company's Elastic Load Balancing Service, which caused users to experience issues with APIs, as well as high latency. Related issues began spreading to other infrastructure, as the support team scrambled to keep the outage in check. Service was fully restored 14 hours later.

Amazon suffered an unrelated six-hour outage at the same facility in October of last year. A memory leak impacted the company's Elastic Block Store (EBS) servers, resulting in a rapid loss of performance.

Amazon.com

This particular Amazon outage impacted the company's heavily trafficked retail site for nearly an hour on Jan. 31. A spokesman later reported that the primary impact was limited to the homepage, and that service was quickly restored. Further information was not released, leading many industry watchers to speculate about whether the outage might have been caused by a denial-of-service attack. It's believed that the loss of service may have cost $5 million in lost revenue.

Windows Azure

An expired SSL certificate was the source of a worldwide outage of the Microsoft Azure cloud service. The disruption was discovered on the afternoon of Feb. 22, and service was restored the following day. Non-secured HTTP traffic was not affected. The Azure storage service appeared to be most dramatically impacted, but correspondents in the blogosphere pointed to other services as well. Some individuals reported that the outage even impacted their Xbox streaming capabilities. Failure to renew the SSL certificate is essentially a matter of human error. The incident provoked widespread speculation that the Redmond, Wash.-based company was under cyber-attack.
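Because a lapsed certificate comes down to a missed renewal date, the check that guards against it can be quite small. The following is a minimal sketch, not a description of Microsoft's tooling: the function names, the 30-day warning window and the sample timestamp are all assumptions made for illustration, since the article does not give the certificate's actual expiry time.

```python
import ssl
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the days from `now` until a certificate's notAfter time.

    `not_after` uses the text format found in certificates,
    e.g. "Feb 22 20:00:00 2013 GMT". The result is negative
    once the certificate has expired.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / SECONDS_PER_DAY


def needs_renewal(not_after: str, now: datetime, warn_days: int = 30) -> bool:
    """Flag a certificate that expires within `warn_days` (assumed window)."""
    return days_until_expiry(not_after, now) <= warn_days


if __name__ == "__main__":
    # Hypothetical expiry timestamp, chosen only for illustration.
    sample_not_after = "Feb 22 20:00:00 2013 GMT"
    today = datetime(2013, 2, 1, tzinfo=timezone.utc)
    print(needs_renewal(sample_not_after, today))
```

Run on a schedule against every certificate an operator owns, a check along these lines turns a silent expiry into an early warning.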

Hotmail And Outlook.com

Microsoft's move toward online services got a black eye in March of this year when Hotmail and Outlook.com went down for nearly 16 hours. The issue was confirmed late in the afternoon on March 12. What was originally anticipated to be a brief service interruption stretched into the following day. As Microsoft's engineers struggled with the mail issue, it was discovered that SkyDrive was also malfunctioning when files were added, edited or removed. However, this glitch was rectified within a few hours. Hotmail and Outlook.com service was not restored until midmorning on March 13. Microsoft later reported that overheated servers in the wake of a firmware update caused the problem.

Dropbox

Dropbox kicked off 2013 with the first major outage of the year. What initially appeared likely to be a short-term system hiccup on Jan. 10 actually turned into an outage of slightly more than 15 hours. The problem, which involved a synchronization issue between the client software and the servers, provoked the usual stir on the Internet as users checked to see if the outage was an isolated circumstance or something happening on a system-wide basis. Service was restored the following morning. Dropbox had previously been taken down during last year's Amazon Web Services crash in October.