This article walks through the QEMU and KVM source code that handles a guest executing an `in` or `out` instruction.
1. PIO background
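Intel's I/O instructions let the processor access I/O ports, either to read data from a peripheral or to send data to it. Each instruction takes an operand that specifies a port address in the I/O space. There are two kinds of I/O instructions:
- Those that transfer a single item (byte, word, or doubleword) at the address specified by a register.
- Those that transfer a string of items (a string of bytes, words, or doublewords) located in memory. These are called "string I/O instructions" or "block I/O instructions".
The instructions are `IN`/`OUT` and `INS`/`OUTS`.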
2. PIO configuration in VMCS
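The SDM describes two relevant bits in the Primary Processor-Based VM-Execution Controls: "Unconditional I/O exiting" determines whether I/O instructions cause VM exits, and "Use I/O bitmaps", when set, restricts exits to the ports selected by the I/O bitmaps (in which case "Unconditional I/O exiting" is ignored).
KVM sets the "Unconditional I/O exiting" bit in the Primary Processor-Based VM-Execution Controls and leaves "Use I/O bitmaps" clear. As a result, every PIO instruction the guest executes causes a VM exit.
For details, see the patch "KVM: VMX: drop I/O permission bitmaps".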
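The effect of these two control bits can be sketched as a small decision function. This is only an illustration, not KVM code: `pio_causes_vmexit` is a hypothetical helper, it models I/O bitmaps A and B as one contiguous 8 KiB array, and it considers only single-byte accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions in the primary processor-based VM-execution controls. */
#define CTRL_UNCOND_IO_EXITING (1u << 24)
#define CTRL_USE_IO_BITMAPS    (1u << 25)

/*
 * Hypothetical helper: does a one-byte access to `port` cause a VM exit?
 * `io_bitmap` holds one bit per port (65536 bits = 8 KiB) and is only
 * consulted when "Use I/O bitmaps" is set.
 */
bool pio_causes_vmexit(uint32_t exec_ctrl, const uint8_t *io_bitmap,
                       uint16_t port)
{
    if (exec_ctrl & CTRL_USE_IO_BITMAPS) {
        /* With bitmaps enabled, "Unconditional I/O exiting" is ignored. */
        assert(io_bitmap != NULL);
        return ((io_bitmap[port / 8] >> (port % 8)) & 1u) != 0;
    }
    /* Without bitmaps, the unconditional bit decides for every port. */
    return (exec_ctrl & CTRL_UNCOND_IO_EXITING) != 0;
}
```

With KVM's configuration (`CTRL_UNCOND_IO_EXITING` set, `CTRL_USE_IO_BITMAPS` clear) the function returns true for every port, which matches the behavior described above.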
3. Warm-up
3.1 VM Exit Qualification for I/O Instructions
When the guest executes a PIO instruction, `vmx_handle_exit` is triggered and dispatches to `handle_io` based on `EXIT_REASON_IO_INSTRUCTION`.
`handle_io` parses the Exit Qualification as follows:

```c
static int handle_io(struct kvm_vcpu *vcpu)
{
    unsigned long exit_qualification;
    int size, in, string;
    unsigned port;

    exit_qualification = vmx_get_exit_qual(vcpu);
    string = (exit_qualification & 16) != 0;    /* bit 4: string instruction */
    ...
    port = exit_qualification >> 16;            /* bits 31:16: port number */
    size = (exit_qualification & 7) + 1;        /* bits 2:0: access size - 1 */
    in = (exit_qualification & 8) != 0;         /* bit 3: 1 = in, 0 = out */
    ...
    return kvm_fast_pio(vcpu, size, port, in);
}
```
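To make the bit layout concrete, here is a minimal standalone sketch of the same decoding. The `struct pio_exit` type and the `decode_io_exit_qual` function are hypothetical names introduced for illustration; only the bit positions come from the `handle_io` code above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields handle_io() extracts. */
struct pio_exit {
    int size;       /* access size in bytes: 1, 2 or 4 */
    bool in;        /* true for in, false for out */
    bool string;    /* true for ins/outs */
    uint16_t port;  /* I/O port number */
};

/* Decode an I/O-instruction exit qualification the way handle_io() does. */
struct pio_exit decode_io_exit_qual(uint64_t qual)
{
    struct pio_exit e;

    e.size = (int)(qual & 7) + 1;     /* bits 2:0 hold size - 1 */
    e.in = (qual & 8) != 0;           /* bit 3: direction */
    e.string = (qual & 16) != 0;      /* bit 4: string instruction */
    e.port = (uint16_t)(qual >> 16);  /* bits 31:16: port */
    return e;
}
```

For example, a one-byte `in` from port 0x60 would be reported as `0x600008`, and a two-byte `out` to port 0x3f8 as `0x3f80001`.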
3.2 misc
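- This article only covers the guest executing `in` or `out`; string I/O instructions are not discussed.
- This article does not cover the case where KVM itself emulates the I/O instruction; in other words, it assumes that `kernel_pio` returns a nonzero value.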
4. Handling flow for `out`
The KVM call chain is:

```c
kvm_fast_pio
    kvm_fast_pio_out
        emulator_pio_out_emulated
            emulator_pio_in_out
```

```c
int kvm_fast_pio_out(struct kvm_vcpu *vcpu, int size, unsigned short port)
{
    ...
    unsigned long val = kvm_rax_read(vcpu);
    int ret = emulator_pio_out_emulated(&vcpu->arch.emulate_ctxt,
                                        size, port, &val, 1);
    ...
    vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
    vcpu->arch.complete_userspace_io = complete_fast_pio_out;
    return 0;
}
```

The details of `complete_userspace_io` are covered later.
```c
int emulator_pio_out_emulated(struct x86_emulate_ctxt *ctxt,
                              int size, unsigned short port,
                              const void *val, unsigned int count)
{
    struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);

    memcpy(vcpu->arch.pio_data, val, size * count);
    return emulator_pio_in_out(vcpu, size, port, (void *)val, count, false);
}

int emulator_pio_in_out(struct kvm_vcpu *vcpu, int size,
                        unsigned short port, void *val,
                        unsigned int count, bool in)
{
    vcpu->arch.pio.port = port;
    vcpu->arch.pio.in = in;
    vcpu->arch.pio.count = count;
    vcpu->arch.pio.size = size;

    ...

    vcpu->run->exit_reason = KVM_EXIT_IO;
    vcpu->run->io.direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT;
    vcpu->run->io.size = size;
    vcpu->run->io.data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE;
    vcpu->run->io.count = count;
    vcpu->run->io.port = port;

    return 0;
}
```
Here `vcpu->run->io.data_offset` is set to 4096 (`KVM_PIO_PAGE_OFFSET * PAGE_SIZE`, with `KVM_PIO_PAGE_OFFSET` equal to 1 on x86), and `emulator_pio_out_emulated` has already copied the value the guest wrote to the port into `vcpu->arch.pio_data`. From userspace's point of view, `vcpu->arch.pio_data` sits one page after `kvm_run` in the region mapped from the vCPU fd. Both pages are allocated during vCPU initialization, as `kvm_vcpu_init` shows:

```c
int kvm_vcpu_init(struct kvm_vcpu *vcpu, struct kvm *kvm, unsigned id)
{
    ...
    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
    vcpu->run = page_address(page);
    ...
    kvm_arch_vcpu_init(vcpu);
    ...
}

int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
{
    ...
    page = alloc_page(GFP_KERNEL | __GFP_ZERO);
    vcpu->arch.pio_data = page_address(page);
    ...
}
```
Once KVM is done, control returns to QEMU, which executes the following code:

```c
int kvm_cpu_exec(CPUState *cpu)
{
    ...
    switch (run->exit_reason) {
    case KVM_EXIT_IO:
        DPRINTF("handle_io\n");
        kvm_handle_io(run->io.port, attrs,
                      (uint8_t *)run + run->io.data_offset,
                      run->io.direction,
                      run->io.size,
                      run->io.count);
        ret = 0;
        break;
    }
    ...
}
```
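The userspace side of this handshake can be sketched without a real KVM file descriptor. The example below is an assumption-laden mock, not QEMU's `kvm_handle_io`: `struct mock_io` mirrors the fields of `kvm_run.io`, `struct toy_dev` is a made-up one-register device, and the two-page buffer stands in for the mmap'ed vCPU region. What it does show is how the data pointer is derived from `data_offset` and how the direction selects a read or a write.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MOCK_PAGE_SIZE   4096u
#define MOCK_EXIT_IO_IN  0  /* same values as KVM_EXIT_IO_IN / KVM_EXIT_IO_OUT */
#define MOCK_EXIT_IO_OUT 1

/* Mock of the io member of struct kvm_run (hypothetical type name). */
struct mock_io {
    uint8_t direction;
    uint8_t size;
    uint16_t port;
    uint32_t count;
    uint64_t data_offset;
};

/* Toy device with a single register (purely illustrative). */
struct toy_dev {
    uint32_t reg;
};

/* Handle one I/O exit: `run_base` is the start of the mapped vCPU region. */
void handle_io_exit(void *run_base, const struct mock_io *io,
                    struct toy_dev *dev)
{
    /* The data lives data_offset bytes past the start of the mapping. */
    uint8_t *data = (uint8_t *)run_base + io->data_offset;

    assert(io->size <= sizeof(dev->reg));
    for (uint32_t i = 0; i < io->count; i++) {
        if (io->direction == MOCK_EXIT_IO_OUT) {
            dev->reg = 0;
            memcpy(&dev->reg, data, io->size);  /* guest -> device */
        } else {
            memcpy(data, &dev->reg, io->size);  /* device -> guest */
        }
        data += io->size;
    }
}
```

For an `out`, the handler consumes what KVM placed in the data page; for an `in`, it fills the data page so that KVM can later copy the value into the guest's RAX.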
Once QEMU is done, control returns to KVM:

```c
int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
{
    ...
    if (unlikely(vcpu->arch.complete_userspace_io)) {
        int (*cui)(struct kvm_vcpu *) = vcpu->arch.complete_userspace_io;
        vcpu->arch.complete_userspace_io = NULL;
        r = cui(vcpu);
        ...
    }
    ...
    vcpu_run(vcpu);
    ...
}
```
`kvm_fast_pio_out` has already set `complete_userspace_io` to `complete_fast_pio_out`:

```c
int complete_fast_pio_out(struct kvm_vcpu *vcpu)
{
    vcpu->arch.pio.count = 0;
    ...
    /* Mainly advances the guest RIP past the emulated instruction. */
    return kvm_skip_emulated_instruction(vcpu);
}
```
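The deferred-completion pattern used here (record a callback before exiting to userspace, then run it once on re-entry) can be sketched in isolation. Everything below is a toy model under stated assumptions: `struct toy_vcpu`, its `insn_len` field, and the `toy_*` functions are hypothetical stand-ins, and advancing `rip` by `insn_len` only approximates what `kvm_skip_emulated_instruction` does.

```c
#include <stddef.h>

/* Toy vCPU state (hypothetical; not the kernel's struct kvm_vcpu). */
struct toy_vcpu {
    unsigned long rip;
    unsigned long insn_len;   /* length of the trapped instruction */
    unsigned int pio_count;
    int (*complete_userspace_io)(struct toy_vcpu *);
};

/* Completion step: runs after userspace has handled the out. */
int toy_complete_fast_pio_out(struct toy_vcpu *v)
{
    v->pio_count = 0;
    v->rip += v->insn_len;    /* skip the emulated instruction */
    return 1;
}

/* Exit path: record the pending I/O and the completion callback. */
int toy_fast_pio_out(struct toy_vcpu *v)
{
    v->pio_count = 1;
    v->complete_userspace_io = toy_complete_fast_pio_out;
    return 0;                 /* 0 means: exit to userspace */
}

/* Re-entry path: run the pending callback exactly once. */
int toy_vcpu_run_entry(struct toy_vcpu *v)
{
    if (v->complete_userspace_io) {
        int (*cui)(struct toy_vcpu *) = v->complete_userspace_io;

        v->complete_userspace_io = NULL;
        int r = cui(v);
        if (r <= 0)
            return r;
    }
    return 1;                 /* ready to resume the guest */
}
```

Clearing the callback pointer before invoking it mirrors the kernel snippet above and guarantees the completion step cannot run twice for one exit.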
5. Handling flow for `in`
The KVM call chain is:

```c
kvm_fast_pio
    kvm_fast_pio_in
        emulator_pio_in_emulated
            emulator_pio_in_out
```

```c
int kvm_fast_pio_in(struct kvm_vcpu *vcpu, int size, unsigned short port)
{
    unsigned long val;
    ...
    emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt, size, port, &val, 1);
    ...
    vcpu->arch.pio.linear_rip = kvm_get_linear_rip(vcpu);
    vcpu->arch.complete_userspace_io = complete_fast_pio_in;
    return 0;
}
```
```c
int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
                             int size, unsigned short port, void *val,
                             unsigned int count)
{
    struct kvm_vcpu *vcpu = emul_to_vcpu(ctxt);
    int ret;

    if (vcpu->arch.pio.count)
        goto data_avail;

    memset(vcpu->arch.pio_data, 0, size * count);

    ret = emulator_pio_in_out(vcpu, size, port, val, count, true);
    if (ret) {
data_avail:
        memcpy(val, vcpu->arch.pio_data, size * count);
        vcpu->arch.pio.count = 0;
        return 1;
    }

    return 0;
}
```
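In `emulator_pio_in_emulated`, `vcpu->arch.pio.count` is still 0 at this point because no data is available yet (QEMU has to supply it), so `emulator_pio_in_out` is executed. We have already seen this function: it sets up the relevant fields of `kvm_run`, and QEMU then fills in the data.
Back in QEMU, the data is written into the PIO data area at `run + run->io.data_offset`.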
Back in KVM, `kvm_arch_vcpu_ioctl_run` invokes the `complete_fast_pio_in` callback.

```c
int complete_fast_pio_in(struct kvm_vcpu *vcpu)
{
    unsigned long val;

    BUG_ON(vcpu->arch.pio.count != 1);

    ...

    emulator_pio_in_emulated(&vcpu->arch.emulate_ctxt,
                             vcpu->arch.pio.size,
                             vcpu->arch.pio.port, &val, 1);
    kvm_rax_write(vcpu, val);

    return kvm_skip_emulated_instruction(vcpu);
}
```
In this final call to `emulator_pio_in_emulated`, `vcpu->arch.pio.count` is nonzero, which means the data is now available.
The code that runs inside `emulator_pio_in_emulated` is therefore:

```c
int emulator_pio_in_emulated(struct x86_emulate_ctxt *ctxt,
                             int size, unsigned short port, void *val,
                             unsigned int count)
{
    ...
    memcpy(val, vcpu->arch.pio_data, size * count);
    vcpu->arch.pio.count = 0;
    return 1;
}
```
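The two passes through `emulator_pio_in_emulated` can be summarized with a small state-machine sketch. It is a simplification under stated assumptions: `struct toy_pio` and `toy_pio_in_emulated` are hypothetical, the `goto` is replaced by an `if`, and only a single item (`count == 1`) is modeled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for vcpu->arch.pio plus vcpu->arch.pio_data. */
struct toy_pio {
    uint8_t data[8];     /* plays the role of the pio_data page */
    unsigned int count;  /* nonzero: a request is pending, data will follow */
    int size;
};

/*
 * Returns 0 when userspace must supply the data (first pass),
 * and 1 once the data has been copied into *val (second pass).
 */
int toy_pio_in_emulated(struct toy_pio *pio, int size, void *val)
{
    assert(size > 0 && (size_t)size <= sizeof(pio->data));

    if (pio->count == 0) {
        /* First pass: clear the buffer and record the pending request. */
        memset(pio->data, 0, (size_t)size);
        pio->size = size;
        pio->count = 1;
        return 0;
    }

    /* Second pass: userspace has filled pio->data in the meantime. */
    memcpy(val, pio->data, (size_t)size);
    pio->count = 0;
    return 1;
}
```

Between the two calls, the userspace handler writes the value read from the device into the data buffer; the second call then hands it back to the caller, which stores it in RAX.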