August 18, 2019

The Morning Paper - Virtual Machines

I came across Adrian Colyer’s The Morning Paper this morning.

I’m overwhelmed at the number of academic papers he provides summaries for.

Going to “tagged” topics, I narrowed down my browsing to these areas:

To make this meta: I’ll give a summary of the paper summaries that I found most interesting.

Today I’ll just cover the Virtualization section, with an even shorter summary of each paper & some of my thoughts:

Formal Requirements for Virtualizable Third Generation Architectures 1974

This paper was published in 1974, and lays out the definition of a virtual machine (VM):

A virtual machine is taken to be an efficient, isolated duplicate of the real machine.

A VM is the environment created by what is called a “virtual machine monitor” (VMM).

A VMM provides a runtime environment for programs that is:

  1. efficient, because its (virtual) processor’s instructions are executed directly by the underlying computer’s processor. This is distinct from “traditional” emulators and software simulators which will never be as fast as their underlying computer because they don’t directly use the underlying computer’s processor.
  2. isolated, because the VMM stays in complete control of system resources: a program running in the VM cannot access resources not allocated to it, and the VMM can regain control over resources it has already allocated.
  3. identical in behavior to what the program would exhibit if run directly on the underlying computer.

Back in 1974, computers had 2 moving parts:

  1. a processor (which processes instructions fed to it by a software program)
  2. linear uniformly addressable memory

The processor could operate in 2 modes: “supervisor” or “user” mode.

A processor performs instructions in either mode, but user mode can only perform a subset of instructions that supervisor mode can.

As a program’s sequence of instructions is performed by the processor, the processor may be instructed to access an address in memory that is either beyond the limits of memory, or beyond the limits of the memory allocated to the VM.
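The two out-of-range cases can be sketched as a simple address check. This is only a minimal illustration, not the paper's formal model: the names `translate`, `base`, `bound`, `physical_size` and `MemoryTrap` are my own assumptions, standing in for "where the allocated region starts", "how long it is", and "how much memory the machine has".

```python
class MemoryTrap(Exception):
    """Raised when an access falls outside what the program may touch."""


def translate(virtual_addr, base, bound, physical_size):
    """Map a program's address to a physical one, trapping on violations.

    base and bound are hypothetical names for the start and the length
    of the memory region allocated to the running program.
    """
    # Case 1: the address is beyond the region allocated to the program.
    if virtual_addr < 0 or virtual_addr >= bound:
        raise MemoryTrap("beyond the limits of the allocated memory")

    physical_addr = base + virtual_addr

    # Case 2: the address is beyond the limits of the machine's memory.
    if physical_addr >= physical_size:
        raise MemoryTrap("beyond the limits of memory")

    return physical_addr
```

For example, `translate(10, base=100, bound=50, physical_size=1000)` returns `110`, while an address of `50` or more traps because the allocated region is only 50 cells long.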

Basically, a computer can be virtualized if it can prevent user mode from performing instructions that:

  1. access resources beyond its resource allowance
  2. modify its environment’s resource allowance

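The instructions that modify the resource allowance have to be a subset of the instructions that can only be performed in supervisor mode, or else a program in user mode could run amok. In the paper’s terms, the “sensitive” instructions must be a subset of the “privileged” ones.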
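That subset condition is easy to express as a set comparison. The sketch below is my own toy illustration: the instruction names and the `sensitive` / `traps_in_user_mode` flags are made up for the example and do not describe any real instruction set.

```python
def is_virtualizable(instructions):
    """Return True if every sensitive instruction traps in user mode."""
    # Instructions that touch or change the resource allowance.
    sensitive = {
        name for name, props in instructions.items() if props["sensitive"]
    }
    # Instructions that only supervisor mode may perform.
    privileged = {
        name for name, props in instructions.items()
        if props["traps_in_user_mode"]
    }
    return sensitive <= privileged  # subset test


# Hypothetical instruction sets, invented for this example.
GOOD_ISA = {
    "add": {"sensitive": False, "traps_in_user_mode": False},
    "load": {"sensitive": False, "traps_in_user_mode": False},
    "set_memory_bounds": {"sensitive": True, "traps_in_user_mode": True},
}

BAD_ISA = {
    "add": {"sensitive": False, "traps_in_user_mode": False},
    # Changes the resource allowance, yet user mode may run it freely.
    "set_memory_bounds": {"sensitive": True, "traps_in_user_mode": False},
}

print(is_virtualizable(GOOD_ISA))
print(is_virtualizable(BAD_ISA))
```

In the second instruction set, user mode can change its own memory bounds without the supervisor ever noticing, which is exactly the “run amok” case.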

The paper goes on to formally prove this, which I personally can’t be bothered to go read right now!

Interesting? ✅✅✅✅✅☑️ ☑️ ☑️ ☑️ ☑️

One VM to Rule Them All 2013

How to make a fast VM that isn’t complex

We present a new (Virtual Machine) approach and architecture, which enables implementing a wide range of languages within a common framework, reusing many components (especially the optimizing compiler).

The nitty gritty of how “high-performance” is achieved is by avoiding:

Overall, this topic didn’t interest me as much as I expected when I clicked the link.

Interesting? ✅☑️ ☑️ ☑️ ☑️ ☑️ ☑️ ☑️ ☑️ ☑️

My VM is lighter (and safer) than your container (2017)

Can we have the improved isolation of VMs, with the efficiency of containers?

This topic is near and dear to me because I have spent a fair amount of my career close to devops, “spinning” up and down VMs.

Creating an AWS EC2 instance can take minutes.

This paper demonstrates that it’s possible to boot a Xen-based VM in 4ms.

For comparison, fork/exec on Linux takes approximately 1ms. On the same system, Docker containers start in about 150ms.
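To put those numbers side by side, here is a quick back-of-the-envelope calculation. It only uses the approximate figures quoted above (1ms, 4ms, 150ms), so treat the ratios as rough; the variable names are my own.

```python
# Approximate startup times quoted above, in milliseconds.
startup_ms = {
    "fork/exec": 1.0,
    "Xen unikernel VM": 4.0,
    "Docker container": 150.0,
}

# How much slower the VM boot is than a plain fork/exec.
vm_vs_fork = startup_ms["Xen unikernel VM"] / startup_ms["fork/exec"]

# How much slower a Docker container start is than the VM boot.
docker_vs_vm = startup_ms["Docker container"] / startup_ms["Xen unikernel VM"]

print(f"VM boot is about {vm_vs_fork:g}x a fork/exec")
print(f"Docker start is about {docker_vs_vm:g}x the VM boot")
```

So, on these figures, the VM boots within a small multiple of a process fork, and roughly 37.5 times faster than the container starts.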

This is accomplished by running the authors’ VM as a “unikernel”.

They emphasize that boot time grows linearly with the size of the VM image, and since a unikernel VM image is radically smaller than a normal VM image, this is where a lot of the boot time savings come from.

How small is small? The example given is a simple TCP service that returns the current time: it is 480KB uncompressed, and runs in 3.6MB of RAM.

Not normally realistic, especially since I work at a JVM shop, but interesting to have proven.

As we keep creating VMs, the creation time increases noticeably (note the logarithmic scale): it takes 42s, 10s and 700ms to create the thousandth Debian, Tinyx, and unikernel guest, respectively.

Scaling up the number of VM instances on a server exposed interesting bottlenecks that caused super-linear growth in the boot time of each additional VM instance.

One source of super-linear growth was XenStore and interactions with it.

one fundamental problem with the XenStore is its centralized, filesystem-like API which is simply too slow for use during VM creation and boot, requiring tens of interrupts and privilege domain crossings

LightVM redesigns the Xen control plane with a lean driver called noxs (for ‘no XenStore’) that replaces the XenStore and allows direct communication between front-end and back-end drivers via shared memory.

Most interestingly, they highlight how this type of lightweight VM could be used to offer an AWS Lambda type of compute hosting service.

Interesting? ✅✅✅✅✅✅✅☑️ ☑️ ☑️

Deconstructing Xen

Unfortunately, one of the most widely-used hypervisors, Xen, is highly susceptible to attack because it employs a monolithic design (a single point of failure) and comprises a complex set of growing functionality including VM management, scheduling, instruction emulation, IPC (event channels), and memory management.

Covers a lot of extremely low-level security concerns with Xen.

Too much for me to process right now.

Interesting? ✅☑️ ☑️ ☑️ ☑️ ☑️ ☑️ ☑️ ☑️ ☑️