Back to Basics - How hardware virtualization works: Part 3
It is possible to find many explanations of hardware virtualization on the Internet and, of course, in computer science courses. Nonetheless, there remains a great deal of confusion regarding this increasingly popular technology. This multi-part series attempts to provide an approachable explanation of hardware virtualization. You can see Part 1 here and Part 2 here.

Translate, Trap and Map

Yet VMware and others did provide, and continue to provide, hardware virtualization on such older systems. How? By using a load-time binary scan and patch. (See figure below.) Whenever a section of memory was marked executable - making that marking was, thankfully, trap-able - the hypervisor would immediately scan the executable binary for troublesome instructions and replace each one with a trap instruction. In addition, of course, it augmented the bag 'o bits for that virtual machine with information saying what each of those traps was supposed to do originally.
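Here is a minimal sketch in C of how such a scan-and-patch pass might look. The opcode values, the one-byte-instruction simplification, and the patch_table structure are illustrative assumptions, not VMware's actual mechanism (a real implementation has to decode variable-length x86 instructions rather than compare single bytes):

```c
/* Sketch of load-time scan-and-patch: when a guest page is marked
 * executable, scan it for "troublesome" instructions and overwrite each
 * one with a trap, remembering the original so the hypervisor can emulate
 * it when the trap fires. Opcode values and structures are illustrative. */
#include <stdint.h>
#include <stddef.h>

#define TRAP_OPCODE      0xCC   /* x86 INT3: one-byte breakpoint/trap       */
#define SENSITIVE_OPCODE 0xFA   /* example: CLI, which the guest must not run */
#define MAX_PATCHES      1024

struct patch_entry {            /* side table: the "bag 'o bits" addition   */
    uint8_t *addr;              /* where the patch was applied              */
    uint8_t  original;          /* byte to emulate when the trap fires      */
};

static struct patch_entry patch_table[MAX_PATCHES];
static size_t patch_count;

/* Called by the hypervisor when a guest page becomes executable. */
void scan_and_patch(uint8_t *page, size_t len)
{
    for (size_t i = 0; i < len && patch_count < MAX_PATCHES; i++) {
        if (page[i] == SENSITIVE_OPCODE) {
            patch_table[patch_count].addr     = &page[i];
            patch_table[patch_count].original = page[i];
            patch_count++;
            page[i] = TRAP_OPCODE;  /* guest now traps into the hypervisor */
        }
    }
}
```

When one of the planted traps later fires, the hypervisor looks the faulting address up in the side table and emulates the original instruction on behalf of the guest.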
Now, many software companies are not fond of the idea of someone else modifying their shipped binaries, and can even get sticky about things like support if that is done. Also, my personal reaction is that this is a horrendous kluge. But it is a necessary kluge, needed to get around hardware deficiencies, and it has proven to work well in thousands, if not millions, of installations. Thankfully, it is not necessary on more recent hardware releases.

Paravirtualization

One way around overhead like that is to take a long, hard look at how prevalent you expect virtualization to be, and seriously ask: Is this operating system ever really going to run on bare metal? Or will it almost always run under a hypervisor? Some operating system development streams decided the answer to that question is: No bare metal. A hypervisor will always be there. Examples: Linux with the Xen hypervisor, IBM AIX, and of course the IBM mainframe operating system z/OS (no mainframe has been shipped without virtualization since the mid-1980s).

If that's the case, things can be more efficient. If you know a hypervisor is always really behind memory mapping, for example, provide an actual call to the hypervisor to do things that have substantial overhead. For example: Don't do your own memory mapping, just ask the hypervisor for a new page of memory when you need it. Don't set the real-time clock yourself, tell the hypervisor directly to do it. (See figure below.)
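A minimal sketch in C of what such direct hypervisor calls might look like from inside a guest OS. The hypercall numbers, the hvc() wrapper, and the register convention here are assumptions for illustration, not any real hypervisor's ABI (Xen, for instance, defines its own hypercall interface):

```c
/* Sketch of paravirtualized calls: instead of manipulating page tables or
 * the real-time clock itself (and trapping), the guest OS asks the
 * hypervisor directly. Call numbers and calling convention are illustrative. */
#include <stdint.h>

#define HCALL_ALLOC_PAGE  1   /* "give me a new page of memory"   */
#define HCALL_SET_CLOCK   2   /* "set the real-time clock for me" */

/* Enter the hypervisor directly; depending on the platform this could be
 * a VMCALL or a software interrupt reserved for hypercalls. */
static inline long hvc(long nr, long arg)
{
    long ret;
    __asm__ volatile("vmcall"
                     : "=a"(ret)
                     : "a"(nr), "b"(arg)
                     : "memory");
    return ret;
}

/* Guest OS helpers built on the hypercalls. */
void *guest_alloc_page(void)
{
    return (void *)hvc(HCALL_ALLOC_PAGE, 0);
}

void guest_set_clock(uint64_t seconds_since_epoch)
{
    hvc(HCALL_SET_CLOCK, (long)seconds_since_epoch);
}
```

The point of the design is that one deliberate call replaces what would otherwise be a series of traps and emulations, which is where the overhead savings come from.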
This technique has become known as paravirtualization, and can lower the overhead of virtualization significantly. A set of "para-APIs" invoking the hypervisor directly has even been standardized, and is available in Xen, VMware, and other hypervisors.

The concept of paravirtualization actually dates back to around 1973 and the VM operating system developed at the IBM Cambridge Science Center. They had the not-unreasonable notion that the right way to build a time-sharing system was to give every user his or her own virtual machine, a notion somewhat like today's virtual desktop systems. The operating system running in each of those VMs used paravirtualization, but it wasn't called that back in the Computer Jurassic. Virtualization is, in computer industry terms, a truly ancient art.

The next post covers the lowest-overhead technique used in virtualization, then input/output, and draws some conclusions.

This post originally appeared on Greg Pfister's blog, Perils of Parallel, and in the proceedings of Cloudviews - Cloud Computing Conference 2009, the 2nd Cloud Computing International Conference.