Imperial College London limits app outages with end-user monitoring

Imperial College London can respond to IT outages more quickly after implementing an application performance monitoring and management tool from Quest Software.

Before it implemented Quest’s Foglight tool, the university’s only monitoring tools were Quest’s Big Brother network monitoring software and Microsoft Operations Manager (MOM) – now called System Center Operations Manager (SCOM) – which tracked its Unix and Windows boxes, respectively. It still uses the Microsoft software to cover some Microsoft applications.

With no application-level monitoring in place, the university relied mainly on end users to alert the IT team to application performance issues.

Now, the university uses Foglight to monitor its underlying IT infrastructure and more than 70 different applications, including Oracle E-Business Suite, Oracle Student System, Exchange and the college website.

“We wanted to know what the end user actually sees. We know if the system dies in about five or six minutes. We know the response times users are getting. We are on top of the problems before users ring in,” said Andy Lax, availability analyst at Imperial.

The university uses Foglight to run system checks and 70,000 simulated end-user checks each day on around 80 services, including the library system, timetabling systems and file servers. Two script failures in a row generate an alert in the system.
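The "two failures in a row" rule can be illustrated with a short Python sketch. This is not Foglight's configuration or API; the class name `ConsecutiveFailureAlert` and its `record` method are hypothetical, and only the threshold of two comes from the article.

```python
from dataclasses import dataclass


@dataclass
class ConsecutiveFailureAlert:
    """Hypothetical alert rule: fire after N failed checks in a row."""

    threshold: int = 2  # the article cites two consecutive script failures
    failures: int = 0   # current run of consecutive failures

    def record(self, check_passed: bool) -> bool:
        """Record one simulated check; return True when an alert should fire."""
        if check_passed:
            # Any successful check breaks the run of failures.
            self.failures = 0
            return False
        self.failures += 1
        # Fire once, at the moment the run reaches the threshold.
        return self.failures == self.threshold


rule = ConsecutiveFailureAlert()
# True = check passed, False = check failed.
results = [rule.record(ok) for ok in [True, False, True, False, False, False]]
print(results)
```

Requiring a second failure before alerting is a common way to filter out one-off glitches in a simulated check, at the cost of a short delay before a real outage is reported.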

Imperial has an IT team of 190 people, around 50 of whom are part of the desktop support team, supporting more than 20,000 staff and students.

The university’s service desk is open from 8am to 6pm, with an out-of-hours service provided by Northumberland University’s service desk from 6am to 8am and from 6pm to 11pm.

During these out-of-hours periods, Northumberland University takes any IT support calls, which are logged on Imperial’s systems and passed to the relevant Imperial IT staff to be dealt with. The Quest end-user monitoring software also automatically raises an alert about any problems on the university’s most critical IT systems – such as the Oracle portal or Exchange – during the out-of-hours period.

Imperial believes it has saved more than £371,000 in just over a year, based on the number of hours of downtime users were spared because staff could start fixing problems outside normal IT service desk hours. For example, if a problem took an hour to fix and an alert allowed staff to start working on it at 6am rather than 9am, the service was back three hours earlier and fewer users were affected.
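The arithmetic behind that example can be sketched in a few lines of Python. The function name `restored_at` is hypothetical, and the sketch covers only the single illustrative incident above; it does not reproduce how Imperial arrived at its £371,000 figure.

```python
def restored_at(start_hour: int, fix_hours: int) -> int:
    """Hour of day at which service is back, given when work starts."""
    return start_hour + fix_hours


# Figures from the example: a one-hour fix started at 6am versus 9am.
with_alert = restored_at(start_hour=6, fix_hours=1)     # back at 7am
without_alert = restored_at(start_hour=9, fix_hours=1)  # back at 10am

hours_gained = without_alert - with_alert
print(f"Service restored {hours_gained} hours earlier")
```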

Copyright © 2012 IDG Communications, Inc.
