VMware's vSphere virtualization platform is known for its reliability and offers built-in features like High Availability (HA) to ensure that physical server failures have only minor effects on end-user applications. But not everyone chooses to use these features and, as with any technology, bad things can happen even if you do use them. So, what do you do when the unexpected happens -- when a virtual machine is slow or down or when an ESXi server or vCenter is unresponsive?
vSphere troubleshooting overview
As part of my preparation for the VMware Certified Advanced Professional -- Data Center Administrator certification, and throughout the process of creating my (caution: shameless plug alert) vSphere Troubleshooting course, I have spent a lot of time troubleshooting vSphere. I have intentionally broken it and then tried to fix it, sometimes with success and sometimes without.
What to do when vSphere goes down
Anytime someone says something is "down," you need to start by getting more information. What is down, exactly? Has a physical server failed? Has the vCenter VM blue-screened, or have only its services stopped? Is the core network switch locked up? Has the SAN lost power? Are all VMs down, or just one?
Users don't know what's "down" -- nor should they care; that's your job. Because you understand the various pieces in play and can perform some simple tests, you should be able to quickly determine where the problem lies. Still, make sure that you test thoroughly before deciding what the cause is. More than once I've run a single quick test and (incorrectly) concluded that the problem was the server, for example, when it was actually the entire network.
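One way to avoid the single-quick-test trap is to probe several components in one pass and compare the results. The following is only a minimal sketch, assuming that a successful TCP connection is a good enough first signal; the `tcp_reachable` and `triage` helper names, and any host names you pass in, are hypothetical and not part of any VMware tooling.

```python
import socket


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def triage(targets, probe=tcp_reachable):
    """Probe every target, then summarize, instead of stopping at the first failure.

    targets maps a label to a (host, port) tuple. The labels and hosts are
    whatever you choose; nothing here is discovered from vCenter.
    """
    results = {name: probe(host, port) for name, (host, port) in targets.items()}
    down = [name for name, ok in results.items() if not ok]
    if not down:
        verdict = "all targets reachable; look inside the VM or application"
    elif len(down) == len(results):
        verdict = "nothing reachable; suspect the network or your own workstation first"
    else:
        verdict = "partial failure; investigate: " + ", ".join(down)
    return results, verdict
```

You might call `triage` with entries such as a vCenter server and an ESXi host on port 443 and a Windows VM on port 3389 (RDP). If every probe fails at once, that points at the network or the machine you are testing from rather than at any one server, which is exactly the mistake described above.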
Check the RAM utilization inside the specific VM that hosts the affected app by using Windows Task Manager. You will likely find that the app's process is eating up a lot of RAM. In this example, the VM has only 1GB of RAM and this process is using 462MB (roughly 45% of the total).
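The arithmetic behind that reading is easy to script if you copy the numbers out of Task Manager. This is a small sketch only; the `memory_share` and `top_consumers` helpers and the process names in the comments are made up for illustration, and the figures are typed in by hand rather than read from the guest OS.

```python
def memory_share(process_mb, vm_total_mb):
    """Return the percentage of the VM's RAM used by one process."""
    if vm_total_mb <= 0:
        raise ValueError("vm_total_mb must be positive")
    return 100.0 * process_mb / vm_total_mb


def top_consumers(processes, vm_total_mb, limit=3):
    """Rank processes by memory use, largest first.

    processes maps a process name to its memory use in MB, as read
    manually from Task Manager. Returns (name, mb, percent) tuples.
    """
    ranked = sorted(processes.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, mb, round(memory_share(mb, vm_total_mb), 1))
        for name, mb in ranked[:limit]
    ]


# The article's numbers: a 462MB process on a VM with 1GB (1024MB) of RAM.
print(round(memory_share(462, 1024), 1))  # 45.1
```

A single process holding close to half of a 1GB VM leaves little room for the operating system and everything else, which is why it stands out as the first suspect.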
If the memory is being used by an application that you don't want, such as a malicious application or a game that a user is running on a virtual desktop, you can kill the process or uninstall the application.
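If you would rather end the process from a script than from Task Manager, Windows ships the `taskkill` command. The sketch below assumes you already know the process ID (the PID column in Task Manager); the `build_taskkill` and `kill_process` wrappers are hypothetical, and `kill_process` defaults to a dry run so that nothing is terminated by accident.

```python
import subprocess


def build_taskkill(pid, force=True, include_children=False):
    """Build a Windows taskkill argument list for one process ID."""
    if not isinstance(pid, int) or pid <= 0:
        raise ValueError("pid must be a positive integer")
    cmd = ["taskkill", "/PID", str(pid)]
    if include_children:
        cmd.append("/T")  # also end child processes started by this one
    if force:
        cmd.append("/F")  # end the process forcefully
    return cmd


def kill_process(pid, dry_run=True, **options):
    """Return the command; run it only when dry_run is False."""
    cmd = build_taskkill(pid, **options)
    if not dry_run:
        # Raises CalledProcessError if taskkill reports a failure.
        subprocess.run(cmd, check=True)
    return cmd
```

For example, `kill_process(4242)` only returns the command it would run, while `kill_process(4242, dry_run=False)` executes it inside the guest. Killing a process frees its memory immediately, but if the application is unwanted it will likely return until it is uninstalled.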