KVM MMIO Emulation
This article summarizes how MMIO emulation works in KVM.
Prerequisite
Overview
In summary, MMIO emulation works as follows:
- QEMU declares a memory region (but does not allocate RAM for it or commit it to KVM).
- The guest's first access to the MMIO address causes an EPT violation VM-exit.
- KVM constructs the EPT page table and marks the page table entry with a special value (110b).
- Later guest accesses to this MMIO region are handled by the EPT misconfig VM-exit handler.
QEMU part
Taking the emulated e1000 NIC as an example, the MemoryRegion that the device registers when it initializes its MMIO is of type IO (not RAM).
static void
QEMU uses the function memory_region_init_io to declare an MMIO region. Here mr->ram is false, so no memory is actually allocated.
When QEMU calls kvm_set_phys_mem to register the guest's physical memory in KVM's data structures, it calls memory_region_is_ram to check whether the physical address range is backed by RAM. If it is not, the function returns immediately.
static void kvm_set_phys_mem(KVMMemoryListener *kml,
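The early return described above can be modeled with a small sketch. This is not QEMU's code: `struct toy_region`, `toy_region_is_ram` and `toy_set_phys_mem` are hypothetical stand-ins that only mirror the "not RAM, so return" decision.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for QEMU's MemoryRegion. */
struct toy_region {
    const char *name;
    bool ram;          /* true for RAM-backed regions, false for IO regions */
    uint64_t size;
};

/* Hypothetical stand-in for memory_region_is_ram(). */
static bool toy_region_is_ram(const struct toy_region *mr)
{
    return mr->ram;
}

/*
 * Hypothetical stand-in for the early return in kvm_set_phys_mem().
 * Returns the number of memory slots that would be registered (0 or 1).
 */
static int toy_set_phys_mem(const struct toy_region *mr)
{
    if (!toy_region_is_ram(mr)) {
        return 0;      /* IO region: nothing is handed to KVM */
    }
    return 1;          /* RAM region: one slot would be registered */
}
```

In this model an MMIO region such as the e1000 registers never gets a slot, which is why KVM later finds no backing page for the guest physical address.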
KVM part
In vmx_init, when EPT is enabled, it calls ept_set_mmio_spte_mask.
static void ept_set_mmio_spte_mask(void)
This sets shadow_mmio_mask.
When the guest accesses the MMIO address, the VM exits with an EPT violation and tdp_page_fault is called. __direct_map is then called to construct the EPT page table.
After a long call chain, mark_mmio_spte is finally called to set the spte with shadow_mmio_mask, which, as we already know, was set during VMX initialization.
__direct_map
The condition to call mark_mmio_spte is is_noslot_pfn.
static bool set_mmio_spte(struct kvm *kvm, u64 *sptep, gfn_t gfn,
As we know, QEMU doesn't commit the MMIO memory region, so pfn is KVM_PFN_NOSLOT and the spte is marked with shadow_mmio_mask.
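The decision can be sketched in a few lines of C. This is a toy model rather than the kernel's implementation: every `toy_`/`TOY_` name is hypothetical, the mask is assumed to be just the low bits 110b, and the no-slot sentinel is an arbitrary value chosen for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT  12
#define TOY_MMIO_MASK   0x6ULL          /* assumed mask: low bits 110b */
#define TOY_PFN_NOSLOT  (1ULL << 63)    /* assumed "no memslot" sentinel */

/* True when the lookup found no memslot backing the gfn. */
static bool toy_is_noslot_pfn(uint64_t pfn)
{
    return pfn == TOY_PFN_NOSLOT;
}

/* Store the MMIO marker plus the gfn in the page table entry. */
static void toy_mark_mmio_spte(uint64_t *sptep, uint64_t gfn)
{
    *sptep = TOY_MMIO_MASK | (gfn << TOY_PAGE_SHIFT);
}

/* Returns true if the entry was marked as MMIO, false otherwise. */
static bool toy_set_mmio_spte(uint64_t *sptep, uint64_t gfn, uint64_t pfn)
{
    if (toy_is_noslot_pfn(pfn)) {
        toy_mark_mmio_spte(sptep, gfn);
        return true;
    }
    return false;
}

/* In this toy model an entry is MMIO when its low three bits are 110b. */
static bool toy_is_mmio_spte(uint64_t spte)
{
    return (spte & 0x7ULL) == TOY_MMIO_MASK;
}
```

Keeping the gfn inside the marked entry is what lets a later fault handler recover the guest physical address without another memslot lookup.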
When the guest later accesses this MMIO page, its EPT page table entry is 110b, so the access causes an EPT misconfig VM exit: a page cannot be writable and executable without also being readable. The handler handle_ept_misconfig processes the MMIO case first, and this dispatches to the QEMU part.
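Why 110b is an illegal combination can be shown with a small check on the permission bits. The bit positions (read = bit 0, write = bit 1, execute = bit 2) follow the EPT entry layout; the function name `toy_ept_perm_is_misconfig` is hypothetical, and the sketch covers only the "writable but not readable" rule, not the other misconfiguration causes.

```c
#include <stdbool.h>
#include <stdint.h>

#define EPT_READ   (1ULL << 0)
#define EPT_WRITE  (1ULL << 1)
#define EPT_EXEC   (1ULL << 2)

/*
 * An entry that is writable but not readable is a misconfiguration.
 * This covers 010b (write-only) and 110b (write/execute).
 */
static bool toy_ept_perm_is_misconfig(uint64_t entry)
{
    return (entry & EPT_WRITE) && !(entry & EPT_READ);
}
```

Because the hardware never treats such an entry as valid, KVM can reuse the pattern as a cheap tag for MMIO pages.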
vcpu_run
x86_emulate_instruction
Finally, ioeventfd_write is called, which writes to the eventfd to send a notification event to QEMU.
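The notification primitive itself can be tried from user space on Linux. The sketch below is not the kernel's ioeventfd_write path; it only illustrates eventfd counter semantics with the standard `eventfd`, `write` and `read` calls, and the helper name `toy_eventfd_roundtrip` is hypothetical.

```c
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Signal an eventfd `signals` times, then read the accumulated counter.
 * Returns the counter value, or -1 on error (including the case where
 * nothing was signalled, because the fd is non-blocking).
 */
static int64_t toy_eventfd_roundtrip(unsigned int signals)
{
    uint64_t one = 1;
    uint64_t count = 0;
    int fd = eventfd(0, EFD_NONBLOCK);

    if (fd < 0) {
        return -1;
    }
    for (unsigned int i = 0; i < signals; i++) {
        /* Each write adds its 8-byte value to the counter. */
        if (write(fd, &one, sizeof(one)) != (ssize_t)sizeof(one)) {
            close(fd);
            return -1;
        }
    }
    /* A read returns the counter and resets it to zero. */
    if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count)) {
        close(fd);
        return -1;
    }
    close(fd);
    return (int64_t)count;
}
```

A listener such as QEMU would typically wait on the file descriptor in its event loop and read it once it becomes readable, so several signals can be coalesced into a single wakeup.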
References: