A university network brought to its knees when someone inadvertently plugged two network cables into the wrong hub. An employee injured after an ill-timed entry into a data center. Overheated systems shut down after a worker changed a data center thermostat setting from Fahrenheit to Celsius.
These are just a few of the data center disasters that have been caused not by technological malfunctions or natural catastrophes, but by human error.
According to the Uptime Institute, a New York-based research and consulting organization that focuses on data center performance, human error causes roughly 70% of the problems that plague data centers today. The group analyzed 4,500 data center incidents, including 400 full downtime events, says Julian Kudritzki, a vice president at the Uptime Institute, which recently published a set of guidelines for operational sustainability of data centers.
"I'm not surprised," Kudritzki says of the findings. "The management of operations is your greatest vulnerability, but also is a significant opportunity to avoid downtime. The good news is people can be retrained."
Whether it's due to neglect, insufficient training, end-user interference, tight purse strings or simple mistakes, human error is unavoidable. And these days, thanks to the ever-increasing complexity of IT systems -- and the related problem of increasingly overworked data center staffers -- even the mishaps that can be avoided often aren't, says Charles King, an analyst at Pund-IT Inc.
"Whenever you mix high levels of complexity and overwork, the results are typically ugly," says King. And as companies become more reliant on technology to achieve their business goals, those mistakes become more critical and more costly.
Wrong worker, wrong cable
Take the example of the university data center switch that overloaded because an IT worker mistakenly plugged two network cables into a downstream hub. That happened about four years ago at the Indiana University School of Medicine in Indianapolis, according to Jeramy Jay Bowers, a security analyst at the school.
The problem arose out of less-than-optimal network design, says Bowers, who worked at the school as a system engineer at the time of the incident. The IT department for the school of medicine was split into two locations, with one room in the school of medicine building and another room at the neighboring university hospital -- not an ideal setup to begin with, says Bowers.
The department had run fiber -- a purple cable, to be exact -- from a switch in the first building to the second, running it up through the ceiling, through a set of doors and across to the hospital's administrative wing next door. That cable attached to a 12-port switch that sat in the hospital building's IT room, and staffers could easily disconnect from the school of medicine network and connect to the hospital network through a jack in the wall, Bowers explains.