
Back to Basics - How hardware virtualization works: Part 3

Greg Pfister
Since retiring from his position as an IBM Distinguished Engineer, Greg Pfister has worked as an independent consultant as well as serving as research faculty at Colorado State University. Pfister is the author of "In Search of Clusters," and over his 30-year career, he has accrued over 30 patents in parallel computing and computer communications.

It is possible to find many explanations of hardware virtualization on the Internet and, of course, in computer science courses. Nonetheless, there remains a great deal of confusion regarding this increasingly popular technology.

This multi-part series attempts to provide an approachable explanation of hardware virtualization. You can see Part 1 here and Part 2 here.

Translate, Trap and Map
The basic trap and map technique described previously depends crucially on a hardware feature: the hardware must be able to trap on every instruction that could affect other virtual machines. Prior to the introduction of Intel's and AMD's specific additional hardware virtualization support, that was not true. For example, setting the real-time clock was, in fact, not a trappable instruction. It wasn't even restricted to supervisors. (Note, not all Intel processors have virtualization support today; this is apparently done to segment the market.)

Yet VMware and others did provide, and continue to provide, hardware virtualization on such older systems. How? By using a load-time binary scan and patch. (See figure below.) Whenever a section of memory was marked executable - making that marking was, thankfully, trappable - the hypervisor would immediately scan the executable binary for troublesome instructions and replace each one with a trap instruction. In addition, of course, it augmented the bag 'o bits for that virtual machine with information saying what each of those traps was supposed to do originally.
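The scan-and-patch idea above can be sketched in a few lines. This is a deliberately simplified illustration, not VMware's actual implementation: it pretends every instruction is one byte (real x86 instructions are variable-length, which makes the real scan much harder), and the opcode values are hypothetical stand-ins for "sensitive" instructions.

```python
# Hypothetical sketch of load-time binary scan-and-patch.
# Assumption: one byte per instruction, for clarity only.

TRAP_OPCODE = 0xCC                 # e.g. the x86 INT3 breakpoint opcode
SENSITIVE_OPCODES = {0xE4, 0xE6}   # illustrative stand-ins for untrappable instructions

def scan_and_patch(code: bytearray) -> dict:
    """Replace sensitive opcodes with traps; return offset -> original byte."""
    patch_table = {}
    for offset, byte in enumerate(code):
        if byte in SENSITIVE_OPCODES:
            patch_table[offset] = byte     # remember what was there originally
            code[offset] = TRAP_OPCODE     # force a trap at run time
    return patch_table

def handle_trap(offset: int, patch_table: dict) -> int:
    """On trap, look up what the guest originally meant to execute,
    so the hypervisor can emulate it against the virtual machine's state."""
    return patch_table[offset]
```

The `patch_table` here plays the role of the extra information added to the virtual machine's bag 'o bits: at trap time, the hypervisor consults it to find out what the guest was really trying to do.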

Image courtesy of Greg Pfister.

Now, many software companies are not fond of the idea of someone else modifying their shipped binaries, and can even get sticky about things like support if that is done. Also, my personal reaction is that this is a horrendous kluge. But it is a necessary kluge, needed to get around hardware deficiencies, and it has proven to work well in thousands, if not millions, of installations.

Thankfully, it is not necessary on more recent hardware releases.

Whether or not the hardware traps all the right things, there is still unavoidable overhead in hardware virtualization. For example, think back to my prior comments about dealing with virtual memory. You can imagine the complex hoops a hypervisor must repeatedly jump through when the operating system in a client machine is setting up its memory map at application startup, or adjusting the working sets of applications by manipulating its map of virtual memory.
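To make those hoops concrete, here is a toy sketch of the bookkeeping involved. All names are hypothetical. The point is that every time the guest operating system updates its own page tables, the hypervisor must trap and reflect the change into a "shadow" table that maps through to real host memory, since the guest's notion of physical memory is itself virtual:

```python
# Illustrative sketch of shadow page-table maintenance (hypothetical names).
# The guest maps virtual pages to what it thinks are physical frames;
# the hypervisor must translate those to actual host frames on every update.

class Hypervisor:
    def __init__(self):
        self.guest_to_host = {}   # guest-"physical" frame -> real host frame
        self.shadow_table = {}    # guest-virtual page -> host frame (what the MMU uses)
        self.next_host_frame = 0

    def assign_guest_frame(self, guest_frame):
        """Back a guest-'physical' frame with a real host frame."""
        self.guest_to_host[guest_frame] = self.next_host_frame
        self.next_host_frame += 1

    def on_guest_map(self, virt_page, guest_frame):
        """Trap handler for a guest page-table write: translate the guest's
        frame number to the real one and update the shadow table."""
        self.shadow_table[virt_page] = self.guest_to_host[guest_frame]
```

A guest setting up the address space of a new application performs many such page-table writes in a burst, and under trap-and-map each one costs a full trap into code like `on_guest_map` - which is exactly the overhead the next technique attacks.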

One way around overhead like that is to take a long, hard look at how prevalent you expect virtualization to be, and seriously ask: Is this operating system ever really going to run on bare metal? Or will it almost always run under a hypervisor?

Some operating system development streams decided the answer to that question is: No bare metal. A hypervisor will always be there. Examples: Linux with the Xen hypervisor, IBM AIX, and of course the IBM mainframe operating system z/OS (no mainframe has been shipped without virtualization since the mid-1980s).

If that's the case, things can be more efficient. If you know a hypervisor is always really behind memory mapping, for example, provide an actual call to the hypervisor to do things that have substantial overhead. For example: Don't do your own memory mapping, just ask the hypervisor for a new page of memory when you need it. Don't set the real-time clock yourself, tell the hypervisor directly to do it. (See figure below.)
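A minimal sketch of that idea, with entirely hypothetical call names (real hypercall interfaces, such as Xen's, differ in detail): the guest skips the privileged instruction and the resulting trap-decode-emulate dance, and simply asks the hypervisor for the service.

```python
# Hedged sketch of paravirtualization: the guest calls the hypervisor
# directly instead of touching (virtual) hardware and being trapped.
# All names are illustrative, not any real hypercall ABI.

class Hypervisor:
    def __init__(self):
        self.clock = 0
        self.free_pages = list(range(100))

    def hypercall_set_clock(self, value):
        self.clock = value            # one direct call; no trap, no emulation

    def hypercall_alloc_page(self):
        return self.free_pages.pop()  # hand the guest a page outright

class ParavirtGuest:
    def __init__(self, hv):
        self.hv = hv                  # the guest knows a hypervisor is there

    def set_time(self, value):
        # No privileged clock instruction, no trap: ask directly.
        self.hv.hypercall_set_clock(value)

    def get_page(self):
        # No page-table manipulation for the hypervisor to shadow: ask directly.
        return self.hv.hypercall_alloc_page()
```

Compare this with the shadow-table bookkeeping above: one explicit call replaces a trap, an instruction decode, and an emulation step, which is where paravirtualization's savings come from.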

Image courtesy of Greg Pfister.

This technique has become known as paravirtualization, and can lower the overhead of virtualization significantly. A set of "para-APIs" invoking the hypervisor directly has even been standardized, and is available in Xen, VMware, and other hypervisors.

The concept of paravirtualization actually dates back to around 1973 and the VM operating system developed at the IBM Cambridge Science Center. They had the not-unreasonable notion that the right way to build a time-sharing system was to give every user his or her own virtual machine, a notion somewhat like today's virtual desktop systems. The operating system running in each of those VMs used paravirtualization, but it wasn't called that back in the Computer Jurassic.

Virtualization is, in computer industry terms, a truly ancient art.

The next post covers the lowest-overhead technique used in virtualization, then input/output, and draws some conclusions.

This post originally appeared on Greg Pfister's blog, Perils of Parallel, and in the proceedings of Cloudviews - Cloud Computing Conference 2009, the 2nd Cloud Computing International Conference.
