Who do you blame when IT breaks?
Assessing fault in data center incidents may pit internal IT staff against their vendors
Computerworld - There's always a reason why things break in IT, and the powers-that-be can usually find someone to blame -- be it a data center operations staff member, an OEM, a systems integrator or a third-party service provider.
An offender often leaves clear fingerprints showing that a component was mislabeled or a process wasn't updated. In other cases, an incident may be the result of oversights by multiple parties.
But with the possible exception of a meteor strike, there's always someone to blame for a data center problem.
The majority are blamed on outside parties such as contractors or vendors, with a sizeable percentage of fault assigned to data center operations staff, according to data compiled by the Uptime Institute.
The findings of the Uptime Institute, which has been collecting incident data from its data center customers since 1994, may draw criticism as few internal IT operators or their vendors take blame easily.
Vendors may be blamed most because they are usually willing to take a bullet for a problem even if they feel the genesis is an internal operations oversight.
"The vendor gets caught up in a sensitive spot," said Ahmad Moshiri, director of power technical support at Emerson Network Power Liebert Services, because it doesn't want to put the client -- a facilities manager -- in a difficult position. "It's very touchy," he said.
Uptime Institute members -- data center managers from multiple industries -- agree to voluntarily report abnormal incidents. The institute has about 5,000 abnormal incidents in its database. Such incidents are defined as any event in which a piece of equipment or infrastructure component did not perform as expected.
The data compiled by Uptime found that 34% of the abnormal incidents in 2009 were attributed to operations staff, followed by 41% in 2010, and 40% last year.
External forces who work on the customer's data center or supply equipment to it, including manufacturers, vendors, factory representatives, installers, integrators, and other third parties, were responsible for 50% to 60% of the incidents reported in those years, according to Uptime.
Some 5% to 8% of the incidents each year were tied to things like sabotage, outside fires, other tenants in a shared facility and various odd anomalies.
About 10% of all the reported abnormal incidents resulted in an outage ranging from a system losing power to a data center going out.
The Uptime data shows that internal staff are responsible for a majority (60%) of those incidents, which can include outages and data loss incidents.
Although the internal staff gets the blame, "it's the design, manufacturing, installation processes that leave banana peels behind and the operators who slip and fall on them," said Hank Seader, managing principal research and education at Uptime.
To Seader's point about banana peels, David Filas, a data center engineer at healthcare provider Trinity Health, described a situation where a fire system vendor, performing routine maintenance on a fire suppression system in one data center, triggered an emergency power off (EPO).
Ordinarily, this would not have been a problem, but an error in the construction of the EPO circuit let the signal through, which resulted in an outage. It turned out that the EPO bypass circuit was not constructed to the as-built drawing when the center was built years earlier.
"The designs and actions of engineers, architects, and installation contractors can have latent effects on operations long after construction," said Filas.