Even a well-oiled machine needs squeaky wheels

This company has multiple remote offices, not all of which have IT staffers onsite -- but each one does require several servers to keep running, reports a pilot fish in the loop.

"Downtime can cost the company a significant amount of money," fish says. "After one location was down for a few days when a server failed and a new one had to be purchased and rebuilt from scratch, IT began rolling out clustered server environments with full redundancy and automatic failover of the virtual servers in the event of a hardware failure.

"This year we completed the deployment in all locations, and we sat back and gave each other high-fives for a job well done.

"Then one afternoon I was doing a quick check of connectivity between locations, and one of the servers at one location failed to respond to a ping. My first thought was that connectivity to that location was down, but another ping showed the other physical server responding.

"Turns out that the server had dropped from the cluster three weeks prior -- due to two hard drive failures -- and no one noticed. We're so used to our users telling us when there is an issue that we neglected to set up even basic monitoring of the hardware systems.

"Once we have the server rebuilt, we'll be working on that..."

Ping the Shark! Send me your true tale of IT life at sharky@computerworld.com. You'll get a stylish Shark shirt if I use it.
