Back to Basics - How hardware virtualization works
It is possible to find many explanations of hardware virtualization on the Internet and, of course, in computer science courses. Nonetheless, there remains a great deal of confusion regarding this increasingly popular technology.
This multi-part series attempts to provide an approachable explanation of hardware virtualization. Before diving in, however, we need to put hardware virtualization in the context of cloud computing and other forms of virtualization.
Virtualization is not a mathematical prerequisite for cloud computing; there are cloud providers who do serve up whole physical servers on demand. However, it is very common, for two reasons:
First, it is an economic requirement. Cloud installations without virtualization are like corporate IT shops prior to virtualization, where the average utilization of commodity and RISC/UNIX servers is about 12%. (While this seems insanely low, there is a lot of data supporting that number.) If a cloud provider could only hope for 12% utilization at best, even with all servers in use, it would have to charge a price above that of competitors who do use virtualization. That can be a valid business model, and it has advantages (like somewhat greater consistency in what's provided) and customers who prefer it, but the majority of vendors have opted for the lower-price route.
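To make the economics concrete, here is a back-of-the-envelope calculation. The dollar figure and the consolidated utilization level are illustrative assumptions, not data from the article; only the 12% figure comes from the text above.

```python
# Illustrative arithmetic (assumed numbers): how low utilization inflates
# the effective price of the compute a provider actually sells.
server_cost_per_hour = 1.00  # hypothetical all-in cost of one physical server

def effective_cost(utilization):
    """Cost per hour of *useful* compute when only a fraction is utilized."""
    return server_cost_per_hour / utilization

bare_metal = effective_cost(0.12)   # ~12% utilization without virtualization
virtualized = effective_cost(0.60)  # assumed consolidation via virtualization

print(f"without virtualization: ${bare_metal:.2f} per useful hour")
print(f"with virtualization:    ${virtualized:.2f} per useful hour")
print(f"ratio: {bare_metal / virtualized:.1f}x")
```

Under these assumed numbers, the non-virtualized provider pays roughly five times as much per hour of compute it can actually bill, which is the pricing gap the paragraph above describes.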
Second, it is a management requirement. One of the key things virtualization does is reduce a running computer system to a big bag of bits, which can then be treated like any other bag o' bits. Examples: It can be filed or archived; it can be restarted after being filed or archived; it can be moved to a different physical machine; and it can be used as a template to make clones, additional instances of the same running system, thus directly supporting one of the key features of cloud computing: elasticity, expansion on demand.
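The "bag of bits" idea can be sketched in a few lines of code. This is a conceptual toy, not a real hypervisor API: the class, method names, and guest names are all invented for illustration. The point is that once a machine's state is plain data, generic data operations become machine-management operations.

```python
# Conceptual sketch (hypothetical API): a running system reduced to data.
class VirtualMachine:
    def __init__(self, name, memory_image):
        self.name = name
        self.memory_image = bytearray(memory_image)  # the "bag of bits"

    def snapshot(self):
        """File or archive: capture the complete state as plain data."""
        return bytes(self.memory_image)

    @classmethod
    def restore(cls, name, snapshot):
        """Restart it later -- possibly on a different physical machine."""
        return cls(name, snapshot)

    def clone(self, new_name):
        """Use it as a template: stamp out another instance (elasticity)."""
        return VirtualMachine(new_name, self.memory_image)

vm = VirtualMachine("web01", b"\x00" * 16)   # hypothetical guest
saved = vm.snapshot()                        # archive it
moved = VirtualMachine.restore("web01", saved)  # bring it back elsewhere
extra = vm.clone("web02")                    # expand on demand
```

Each operation listed in the paragraph above (archive, restart, migrate, clone) maps onto an ordinary copy of the same underlying bytes.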
Notice that I claimed the above advantages for virtualization in general, not just the hardware virtualization that creates a virtual computer. Virtual computers, or "virtual machines," are used by Amazon AWS and other providers of Infrastructure as a Service (IaaS); they lease you your own complete virtual computers, on which you can load and run essentially anything you want.
In contrast, systems like Google App Engine and Microsoft Azure provide you with a complete, isolated, virtual programming platform - a Platform as a Service (PaaS). This removes some of the pain of use, like licensing, configuring, and maintaining your own copy of an operating system, possibly a database system, and so on. However, it restricts you to using their platform, with their choice of programming languages and services.
In addition, there are virtualization technologies that target a point intermediate between IaaS and PaaS, such as the containers implemented in Oracle Solaris, or the WPARs of IBM AIX. These provide independent virtual copies of the operating system within one actual instantiation of the operating system.
The advantages of virtualization apply to all the variations discussed above. And if you feel like stretching your brain, imagine using all of them at the same time. It's perfectly possible: .NET running within a container running on a virtual machine.
Here, however, I will only be discussing hardware virtualization, the implementation of virtual machines as done by VMware and many others. Also, within that area, I am only going to touch lightly on virtualization of input/output functions, primarily to keep this article a reasonable length.
In the next part of this series, I will discuss the techniques used to virtualize processors and memory.
This post originally appeared on Greg Pfister's blog, Perils of Parallel, and in the proceedings of Cloudviews - Cloud Computing Conference 2009, the 2nd Cloud Computing International Conference.