When properly implemented and managed, virtualization means cost savings -- for server hardware, support and administration -- as well as easier server deployment and reduced energy consumption. But for these benefits to come fully to fruition, the virtualization layer of the stack has to be managed independently of either the application or the physical server.
Fortunately, that's not difficult if you do it right.
Sure, server provisioning and backup become easier in the virtual world. But monitoring, especially checking the servers' underlying physical resources, becomes even more important. You have to balance the types and numbers of virtualized applications among the physical servers with an eye to the best use of resources.
The consequences of getting it wrong can turn some -- or all -- of the virtualized applications running on a physical server into poorly performing slugs.
Successfully managing virtualized servers means understanding that virtualization introduces a new layer into the server software stack. And you don't manage the virtualization layer by managing the underlying physical server or the virtual machines or the VMs' applications.
Instead, virtualization must be monitored and managed separately from the operating system, application or the (physical) server layer. (Note: This assumes you're using hypervisor products like Xen or VMware. There are also virtualization products like Virtuozzo that virtualize on top of the operating system. That changes the picture somewhat, but the basic principle remains the same.)
Second, you have to keep in mind that while virtual machines appear independent to the applications running on them, they can and do interact through the physical server and its available resources. All the applications and -- to a lesser extent -- the VMs running on the physical server may act as if they're independent, which is part of the charm of virtualization. But at the bottom, they're all drawing on the same pool of resources from the physical server. A large part of successfully managing virtualized servers is making sure all the VMs have the resources they need when they need them.
The third point flows from the first two. While managing virtual servers is in many ways similar to managing physical servers, they are not the same thing and the analogy stretches only so far.
The good news
Still, despite all the caveats, users agree that virtualization is a win when it comes to managing servers.
"Virtual servers are a lot easier to manage," says Mike Carvallho, CTO of Radiator Express Warehouse Inc. The company, perhaps better known by its 1-800-Radiator moniker, is a chain of 200 franchised radiator shops in the U.S. and Canada. Carvallho manages nine Dell servers that run VMware and are divided between two sites, supporting 55 virtual servers.
Not only are provisioning, capacity management and recovery easier, but there are fewer physical boxes to deal with. "One of the things you immediately lose is all the hardware that was a nightmare to manage," Carvalho says. Among other things, the physical servers tend to be of varying vintages from different vendors, which complicates patches and change management and means keeping on hand a supply of spare parts for the older servers.
When switching to a virtualized environment, many shops take the opportunity to limit the number and types of physical servers, often upgrading to newer technology at the same time. Standardizing on fewer kinds of hardware makes it easier for data centers to keep their physical plant in sync.
The result of properly managed virtualization is a lower-cost, more efficient operation. "It all comes down to cost at the end of the day," says Chris Stucker, manager of systems, networks and services at Applied Extrusion Technology Inc. (AET), a maker of plastic films. Stucker says his server support costs have dropped significantly. "Many of those were older servers, and we were paying a premium for support based on that," he says. He could not provide specific savings statistics.
On a typical day, AET is running between 45 and 55 virtual servers on its three blade servers supporting company headquarters and three manufacturing plants in the U.S. and Canada.
However, this rosy picture applies only if the virtualized servers are properly managed. There are several places where managing virtualized servers can go wrong.
Nonlinearity -- when 2+2 = 5 (or 2, or 10, or ...)
Part of the difference between physical and virtual servers is nonlinearity. In the world of virtualization, two plus two sometimes makes three, or six.
"Nothing is really linear," says Derek Anderson, the lead developer for Enomaly Inc.'s Enomalism product, a tool for Xen's virtualization platform. "You get weird overlaps."
Different applications require different amounts of resources. Databases use a lot of RAM and processor cycles, Web servers make a lot of disk accesses, and so on.
Part of efficiently allocating resources depends on the timing of the loads. Anderson provides the following example, which he says comes from an actual case study involving a global financial services company.
A physical server running virtual servers to support users in England and Japan during the business day will probably be fine. The same server running virtual servers supporting users in New York and Florida is likely to have a problem -- because of their time zones. Japan and England are roughly nine hours out of sync, so their business days barely overlap. Florida and New York are both on Eastern Standard Time. In the first case, the load profiles tend to cancel. In the second, they bump into each other.
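To see why the timing matters, here's a minimal sketch in Python. The hourly demand figures are invented for illustration, not taken from Anderson's case study; the point is simply that pairing workloads whose peaks fall in different hours keeps the combined load under the host's capacity.

```python
# Hypothetical hourly CPU demand (percent of one host's capacity), indexed by UTC hour.
# All numbers are invented for illustration.
tokyo    = [70 if 0 <= h < 9 else 10 for h in range(24)]    # busy 09:00-18:00 JST (00:00-09:00 UTC)
london   = [70 if 9 <= h < 18 else 10 for h in range(24)]   # busy 09:00-18:00 local (roughly UTC)
new_york = [70 if 14 <= h < 23 else 10 for h in range(24)]  # busy 09:00-18:00 EST (14:00-23:00 UTC)
florida  = [70 if 14 <= h < 23 else 10 for h in range(24)]  # same window as New York

def worst_hour(a, b):
    """Peak combined demand for two virtual servers sharing one physical host."""
    return max(x + y for x, y in zip(a, b))

print(worst_hour(tokyo, london))      # 80  -> peaks barely overlap; the host copes
print(worst_hour(new_york, florida))  # 140 -> peaks coincide; the host is oversubscribed
```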
So, each virtualized server needs a share of the resources of a physical server. For maximum efficiency, the virtualized servers' resource needs should complement those of the other virtual servers on the same physical box. In general you want to distribute applications with similar resource demands over different physical servers.
The physical server is where the contention occurs. Most of the time you're only running one application per VM because things are cleaner and easier to manage that way.
Think of it as trying to fit a lot of different-size blocks (VMs or application stacks) into a series of same-size boxes (physical servers). If you try to put all the blocks of the same size into one box, you'll probably need more boxes than if you mix up the size of the blocks to make the best use of the space in the boxes.
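For the curious, the blocks-and-boxes idea can be sketched as a simple first-fit-decreasing packing pass. This is an illustration only, with made-up VM sizes; it's not how a real placement tool decides where VMs go.

```python
def first_fit_decreasing(vm_sizes, host_capacity=100):
    """Pack VM sizes (percent of one resource) onto as few identical hosts as
    possible, placing the largest VMs first. Returns the number of hosts used."""
    hosts = []  # remaining capacity on each host already opened
    for size in sorted(vm_sizes, reverse=True):
        for i, free in enumerate(hosts):
            if size <= free:
                hosts[i] -= size
                break
        else:
            hosts.append(host_capacity - size)  # no room anywhere: open a new host
    return len(hosts)

# Mixing large and small VMs on the same hosts packs tighter...
print(first_fit_decreasing([60, 60, 60, 40, 40, 40]))                          # 3 hosts
# ...than segregating VMs by size class, the way a traditional
# one-workload-per-host data center would group them.
print(first_fit_decreasing([60, 60, 60]) + first_fit_decreasing([40, 40, 40]))  # 3 + 2 = 5 hosts
```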
Managers should seek applications with complementary demands. This can be a problem because the natural tendency of someone brought up in the server environment of a data center is to think in terms of putting the same virtualized applications on the same servers. In the traditional server world, it's common to have one or more servers devoted to mail, another group of servers devoted to the Web, another to databases and so on.
But this is exactly the wrong thing to do in the virtual world.
That means you want to be careful about putting two applications that require a lot of memory, or storage, or any of the same resource, on the same server.
It may work when neither application is doing a lot of work, but as the load gets heavier, you're going to run into problems of resource contention and degraded performance.
Balancing resources is trickier than it sounds, because you can't simply take two virtualized applications, each of which uses 50% of a physical machine's resources, load them both on the same physical machine and expect things to go well.
Even two applications that look like a perfect fit may interfere in minor -- but critical -- ways. An application that normally has low RAM requirements may occasionally spike its RAM demand. If it's loaded on a physical box with an application that constantly needs a lot of RAM, you've not only got a problem, you've got one that's probably weirdly intermittent.
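The arithmetic behind that warning is worth spelling out: sizing on average utilization hides the peaks. A tiny, purely illustrative calculation:

```python
# Two VMs that each average 40% of the host's RAM look comfortable on paper,
# but one of them spikes. All figures are invented for illustration.
steady = {"average": 40, "peak": 45}  # constant, predictable RAM user
spiky  = {"average": 40, "peak": 90}  # usually modest, occasionally bursts

print(steady["average"] + spiky["average"])  # 80  -> the plan looks fine
print(steady["peak"] + spiky["peak"])        # 135 -> intermittent contention when the spike hits
```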
The bottom line is that you can't predict absolutely if two applications will virtualize well on the same physical box. The only way to know for sure is to test the combination. "You have to see what works well and what doesn't," Radiator Express' Carvalho says. "I would not take an unknown application whose features I have not thoroughly tested and expect all the features to work the first time out."
The good news is that virtualization makes it easy to run the virtualized servers on a test system and then migrate them over and go live if the combination works.
The testing procedure has a lot in common with testing a physical server supporting a single application. That is, you run test loads until you're sure that it works, can characterize the virtual server's performance and can confirm that the level of performance is acceptable.
With a virtualized server, you run the proposed combination of VMs and applications on a physical server and monitor the results using monitoring tools from a third party or the virtualization supplier, rather than a guest-level benchmark such as Iometer, which works through the operating system. Remember that in products like VMware and Xen, the operating system is also running on a virtual machine; you need something that can go through the hypervisor layer to the underlying hardware. In both cases, you use a suite of test data, preferably drawn from your own enterprise.
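The shape of such a test run can be sketched in a few lines of Python. Everything here is a placeholder: replay_workload and read_host_metrics stand in for your own load generator and for whatever hypervisor-level monitoring tool you actually use, and the thresholds are arbitrary.

```python
def acceptable(metrics, cpu_limit=80.0, ram_limit=85.0):
    """Pass/fail check against headroom targets (thresholds are illustrative)."""
    return metrics["cpu_percent"] < cpu_limit and metrics["ram_percent"] < ram_limit

def test_combination(vm_names, replay_workload, read_host_metrics, samples=60):
    """Drive the candidate VM combination with recorded enterprise data and watch the host."""
    replay_workload(vm_names)                                 # start the test load on each VM
    readings = [read_host_metrics() for _ in range(samples)]  # host-level, not guest-level
    worst = max(readings, key=lambda m: m["cpu_percent"])
    return acceptable(worst), worst
```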
One major difference comes when it's time to go live. Rather than installing the operating system and applications individually on the new server, you simply clone the whole stack over -- a process that usually takes a matter of minutes.
This concern for interactions doesn't end once the server is in production. Enomaly's Anderson cites the example of a DBMS and a Web server on the same physical platform. If the database can keep all its tables in RAM, it will have relatively little effect on a disk-intensive application like a Web server, Anderson says. But when a table gets large enough that the DBMS has to start paging it out to disk, it can have a big impact on the Web server -- and both applications' performance can suddenly go sideways.
Keeping it all together
While you don't want virtual servers with the same resource demands on the same physical box, you probably want to keep at least some of your physical servers in close physical and network proximity.
The reason: to ease switching virtual servers among physical servers should the need arise. If a group of physical servers shares network connections and other resources, the time to switch a virtual server from one physical server to another can be greatly reduced, as can the configuration effort.
"If you have a bunch of servers in a rack, you can turn off the virtual machine [on one server] and come up on a server on the same switch," says Anderson, "and the time it takes is literally the amount of time it takes to copy the hard drive."
On the other hand, he says, if the machine you're transferring to "is on the other side of the data center, plugged into different switches and different subnets, then the time to do the reconfiguration could be an extra five or 10 minutes. Or if there's other stuff, then that could take a lot longer.
"It's just in the way you laid out your data center," he adds.
Of course, there are limits to proximity, either physical or network. You don't want the servers to overload the network connections, for example, and you may want to have physical servers some distance away -- or even in a different state -- for disaster recovery. It's a balancing act.
Keeping track of hardware
This nonlinearity complicates another aspect of managing virtual servers. Administrators need to closely monitor resource demands on various physical servers.
This is not the same as the demands reported by the operating systems on the virtual machines. Administrators need to look at what's happening on the actual hardware as well. They need to keep an eye on trends to avoid sudden resource starvation as applications' resource profiles change.
What's more, tracking the physical hardware has to be done in detail because of the different loads the virtualized applications put on the different kinds of resources the server supplies. Because the various virtualized applications have different demands, things like RAM, processor cycles and I/O bandwidth have to be tracked separately.
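As a concrete illustration, here's a minimal host-level sampling loop, assuming a Linux-based virtualization host (a Xen dom0, say) where Python and the third-party psutil package are available. It records CPU, RAM and disk I/O as the hardware sees them, separately, which is the point; in practice you'd rely on the monitoring tools from your virtualization vendor or a third party rather than rolling your own.

```python
# Minimal host-level sampling sketch (not a guest-level monitor).
# Assumes a Linux host with the psutil package installed: pip install psutil
import psutil

def sample_host(interval_seconds=5):
    """Print CPU, memory and disk-I/O figures as the physical host sees them."""
    previous = psutil.disk_io_counters()
    while True:
        cpu = psutil.cpu_percent(interval=interval_seconds)  # host-wide CPU use, percent
        mem = psutil.virtual_memory().percent                # host-wide RAM use, percent
        current = psutil.disk_io_counters()
        read_mb = (current.read_bytes - previous.read_bytes) / 1_048_576
        write_mb = (current.write_bytes - previous.write_bytes) / 1_048_576
        previous = current
        print(f"cpu={cpu:5.1f}%  mem={mem:5.1f}%  "
              f"disk {read_mb:.1f} MB read / {write_mb:.1f} MB written")

if __name__ == "__main__":
    sample_host()
```

Logging those samples over time, rather than just printing them, is what lets you spot the trends mentioned above before a resource runs dry.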
What you can't do is try to track servers' physical resources by monitoring the operating systems on the virtual machines. You must go below the hypervisor layer to track a physical server's resources separately from a VM's. You probably still need to track reported resource use by the VMs, but that's a separate issue and done for things like performance analysis on the apps the VMs are running. (Note: Again, this assumes you're using hypervisor virtualization like VMware or Xen.)
According to Anderson, memory bandwidth is usually the most critical resource on a physical server, simply because it is a hard limit. Unlike nearly everything else, from storage to processing cycles, you can't increase the memory bandwidth built into the box. When that saturates, you've got to move some of the VMs to a new physical server.