Notes about using the KVM API.

kvm-hello-world

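First, I recommend studying the kvm-hello-world project: run it and read through its code.
Overall it is fairly simple; the one line that puzzled me was ioctl(vm->fd, KVM_SET_TSS_ADDR, 0xfffbd000). After some searching, I found that Documentation/virtual/kvm/api.txt explains it as follows: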

Capability: KVM_CAP_SET_TSS_ADDR
Architectures: x86
Type: vm ioctl
Parameters: unsigned long tss_address (in)
Returns: 0 on success, -1 on error

This ioctl defines the physical address of a three-page region in the guest physical address space. The region must be within the first 4GB of the guest physical address space and must not conflict with any memory slot or any mmio address. The guest may malfunction if it accesses this memory region.

This ioctl is required on Intel-based hosts. This is needed on Intel hardware because of a quirk in the virtualization implementation (see the internals documentation when it pops into existence).
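
To make the constraints quoted above concrete, here is a minimal sketch in C. The names guest_slot, tss_region_ok and set_tss_addr are my own illustrative helpers, not part of the KVM API or of kvm-hello-world. The check only covers the "first 4GB" and "no overlap with a memory slot" rules; it does not know about MMIO ranges, which the caller would have to track separately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#define PAGE_SIZE_4K    0x1000ULL
#define TSS_REGION_SIZE (3 * PAGE_SIZE_4K)  /* three pages, per api.txt */
#define FOUR_GIB        (1ULL << 32)

/* Hypothetical description of one guest memory slot (not a KVM type). */
struct guest_slot {
    uint64_t guest_phys_addr;
    uint64_t size;
};

/* True if [tss, tss + 3 pages) lies below 4 GiB and overlaps no slot. */
bool tss_region_ok(uint64_t tss, const struct guest_slot *slots, size_t n)
{
    assert(slots != NULL || n == 0);
    if (tss > FOUR_GIB - TSS_REGION_SIZE)
        return false;
    for (size_t i = 0; i < n; i++) {
        uint64_t start = slots[i].guest_phys_addr;
        uint64_t end = start + slots[i].size;
        if (tss < end && start < tss + TSS_REGION_SIZE)
            return false;
    }
    return true;
}

/* Thin wrapper around the vm ioctl: 0 on success, -1 on error (errno set). */
int set_tss_addr(int vm_fd, unsigned long tss)
{
    return ioctl(vm_fd, KVM_SET_TSS_ADDR, tss);
}
```

With a single 2 MiB slot at guest physical address 0, the value 0xfffbd000 passes this check: it sits just below 4 GiB and far away from the slot.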

LWN: Using the KVM API

A good article, well worth a close read.

My notes:

  • Documentation/virtual/kvm/api.txt
  • fully functional sample program
  • KVM_EXIT_FAIL_ENTRY: shows up often, in particular when changing the initial conditions of the VM; it indicates that the underlying hardware virtualization mechanism (VT in this case) can’t start the VM because the initial conditions don’t match its requirements.
  • KVM_EXIT_INTERNAL_ERROR indicates an error from the Linux KVM subsystem rather than from the hardware.
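
As a rough sketch of how the two failure exits above might be reported after KVM_RUN returns: the helper name describe_failure_exit and the message formats are my own invention. The fields fail_entry.hardware_entry_failure_reason and internal.suberror come from struct kvm_run in linux/kvm.h. In a real program, run would be the structure mmap()ed from the vCPU file descriptor.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <linux/kvm.h>

/* Format a short diagnostic for the two failure exits.
 * Returns 1 if the exit was one of them, 0 otherwise. */
int describe_failure_exit(const struct kvm_run *run, char *buf, size_t len)
{
    assert(run != NULL && buf != NULL && len > 0);
    switch (run->exit_reason) {
    case KVM_EXIT_FAIL_ENTRY:
        /* The hardware refused to enter the guest. */
        snprintf(buf, len, "FAIL_ENTRY: hardware reason 0x%llx",
                 (unsigned long long)run->fail_entry.hardware_entry_failure_reason);
        return 1;
    case KVM_EXIT_INTERNAL_ERROR:
        /* The error comes from the KVM subsystem itself. */
        snprintf(buf, len, "INTERNAL_ERROR: suberror 0x%x",
                 (unsigned int)run->internal.suberror);
        return 1;
    default:
        snprintf(buf, len, "exit_reason %u", (unsigned int)run->exit_reason);
        return 0;
    }
}
```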

Additional KVM API features

Prospective implementers of memory-mapped I/O devices will want to look at the exit_reason KVM_EXIT_MMIO, as well as the KVM_CAP_COALESCED_MMIO extension to reduce vmexits, and the ioeventfd mechanism to process I/O asynchronously.
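
To illustrate the KVM_EXIT_MMIO path, here is a minimal sketch of servicing such an exit for a made-up device with a single 32-bit register. DEMO_REG_ADDR, struct demo_device and handle_mmio_exit are hypothetical names chosen for this note; only the mmio fields of struct kvm_run are real. On a read, the handler fills run->mmio.data, which KVM hands back to the guest on the next KVM_RUN.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <linux/kvm.h>

/* Hypothetical device: one 32-bit register at an arbitrary guest address. */
#define DEMO_REG_ADDR 0x10000000ULL

struct demo_device {
    uint32_t reg;
};

/* Service a KVM_EXIT_MMIO exit.
 * Returns 0 if handled, -1 if the access does not hit the demo register. */
int handle_mmio_exit(struct kvm_run *run, struct demo_device *dev)
{
    assert(run != NULL && dev != NULL);
    if (run->exit_reason != KVM_EXIT_MMIO)
        return -1;
    if (run->mmio.phys_addr != DEMO_REG_ADDR ||
        run->mmio.len != sizeof(dev->reg))
        return -1;
    if (run->mmio.is_write)
        memcpy(&dev->reg, run->mmio.data, sizeof(dev->reg));  /* guest -> device */
    else
        memcpy(run->mmio.data, &dev->reg, sizeof(dev->reg));  /* device -> guest */
    return 0;
}
```

Every access handled this way costs a vmexit and a round trip to user space, which is the overhead that coalesced MMIO and ioeventfd aim to reduce.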

For hardware interrupts, see the irqfd mechanism, using the KVM_CAP_IRQFD extension capability. This provides a file descriptor that can inject a hardware interrupt into the KVM virtual machine without stopping it first. A virtual machine may thus write to this from a separate event loop or device-handling thread, and threads running KVM_RUN for a virtual CPU will process that interrupt at the next available opportunity.
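
A minimal sketch of the irqfd flow described above, under a few assumptions: the helper names irqfd_create and irqfd_signal are mine, the gsi value is whatever interrupt line the VM has routed, and on x86 this generally presupposes that an in-kernel interrupt controller was already created (for example with KVM_CREATE_IRQCHIP). The struct kvm_irqfd layout and the KVM_IRQFD ioctl come from linux/kvm.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Create an eventfd and bind it to guest interrupt line `gsi`.
 * Returns the eventfd on success, -1 on error. */
int irqfd_create(int vm_fd, uint32_t gsi)
{
    struct kvm_irqfd irqfd;
    int efd = eventfd(0, EFD_CLOEXEC);

    if (efd < 0)
        return -1;
    memset(&irqfd, 0, sizeof(irqfd));
    irqfd.fd = (uint32_t)efd;
    irqfd.gsi = gsi;
    if (ioctl(vm_fd, KVM_IRQFD, &irqfd) < 0) {
        close(efd);
        return -1;
    }
    return efd;
}

/* Signal the eventfd from any thread; once the fd is bound with KVM_IRQFD,
 * this injects the interrupt. Returns 0 on success, -1 on error. */
int irqfd_signal(int efd)
{
    uint64_t one = 1;
    return write(efd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}
```

A device-handling thread would keep the descriptor returned by irqfd_create and call irqfd_signal whenever the emulated device needs attention; the thread sitting in KVM_RUN does not have to be interrupted by hand.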

x86 virtual machines will likely want to support CPUID and model-specific registers (MSRs), both of which have architecture-specific ioctl()s that minimize vmexits.
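
For CPUID, one common pattern is to ask KVM which leaves the host supports and pass them to the vCPU unchanged. The sketch below assumes an x86 build and uses two helper names of my own, cpuid2_alloc and vcpu_copy_supported_cpuid; kvm_fd is the /dev/kvm descriptor, vcpu_fd the vCPU descriptor, and max_entries is a guess at an upper bound (KVM_GET_SUPPORTED_CPUID fails if the array is too small).

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Allocate a zeroed kvm_cpuid2 with room for `nent` entries. Caller frees. */
struct kvm_cpuid2 *cpuid2_alloc(unsigned int nent)
{
    struct kvm_cpuid2 *c;

    assert(nent > 0);
    c = calloc(1, sizeof(*c) + nent * sizeof(struct kvm_cpuid_entry2));
    if (c != NULL)
        c->nent = nent;
    return c;
}

/* Copy the host-supported CPUID leaves to a vCPU. Returns 0 or -1. */
int vcpu_copy_supported_cpuid(int kvm_fd, int vcpu_fd, unsigned int max_entries)
{
    struct kvm_cpuid2 *c = cpuid2_alloc(max_entries);
    int ret = -1;

    if (c == NULL)
        return -1;
    /* System ioctl fills the entries; vCPU ioctl installs them. */
    if (ioctl(kvm_fd, KVM_GET_SUPPORTED_CPUID, c) == 0 &&
        ioctl(vcpu_fd, KVM_SET_CPUID2, c) == 0)
        ret = 0;
    free(c);
    return ret;
}
```

Once the table is installed, the guest's CPUID instructions are answered from it without a vmexit to user space.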

Applications of the KVM API

Other than learning, debugging a virtual machine implementation, or as a party trick, why use /dev/kvm directly?

Virtual machines like qemu-kvm or kvmtool typically emulate the standard hardware of the target architecture; for instance, a standard x86 PC. While they can support other devices and virtio hardware, if you want to emulate a completely different type of system that shares little more than the instruction set architecture, you might want to implement a new VM instead. And even within an existing virtual machine implementation, authors of a new class of virtio hardware device will want a clear understanding of the KVM API.

Efforts like novm and kvmtool use the KVM API to construct a lightweight VM, dedicated to running Linux rather than an arbitrary OS. More recently, the Clear Containers project uses kvmtool to run containers using hardware virtualization.

Alternatively, a VM need not run an OS at all. A KVM-based VM could instead implement a hardware-assisted sandbox with no virtual hardware devices and no OS, providing arbitrary virtual “hardware” devices as the API between the sandbox and the sandboxing VM.

While running a full virtual machine remains the primary use case for hardware virtualization, we’ve seen many innovative uses of the KVM API recently, and we can certainly expect more in the future.