MSR management in QEMU/KVM
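This article uses the source code of QEMU v5.2.0 and Linux kernel v5.14, together with the descriptions in the Intel SDM, to introduce MSR management. It does not walk through every detail, but it points out the key items so that readers can use them as leads and dig deeper.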
1. Theoretical background
1.1 RDMSR and WRMSR instruction
- RDMSR: Read from Model Specific Register
  EDX:EAX ← MSR[ECX];
- WRMSR: Write to Model Specific Register
  MSR[ECX] ← EDX:EAX;
WRMSR is handled much like RDMSR, so to save space the rest of this article focuses on RDMSR.
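To make the EDX:EAX notation concrete, here is a minimal sketch of how a 64-bit MSR value maps onto the two 32-bit registers. The helper names are illustrative only and do not come from QEMU or KVM.

```c
#include <stdint.h>

/* WRMSR direction: EDX holds the high 32 bits, EAX the low 32 bits. */
static inline uint64_t msr_from_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* RDMSR direction: split the 64-bit MSR value into EDX (high) and EAX (low). */
static inline void msr_to_edx_eax(uint64_t value, uint32_t *edx, uint32_t *eax)
{
    *edx = (uint32_t)(value >> 32);
    *eax = (uint32_t)value;
}
```

ECX only selects which MSR is accessed; it never carries data.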
1.2 VM Exit
The RDMSR instruction causes a VM exit if any of the following are true:
- The “use MSR bitmaps” VM-execution control is 0.
- The value of ECX is not in the ranges 00000000H – 00001FFFH and C0000000H – C0001FFFH.
- The value of ECX is in the range 00000000H – 00001FFFH and bit n in read bitmap for low MSRs is 1, where n is the value of ECX.
- The value of ECX is in the range C0000000H – C0001FFFH and bit n in read bitmap for high MSRs is 1, where n is the value of ECX & 00001FFFH.
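The four conditions above can be folded into a single predicate. The following is a minimal sketch, assuming the two read bitmaps are passed in as separate 1-KByte byte arrays and that bit n is stored in byte n/8 at bit position n%8. The function and parameter names are illustrative, not KVM's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Test bit n of a bitmap stored as bytes, least significant bit first. */
static bool bitmap_test(const uint8_t *bitmap, uint32_t n)
{
    return (bitmap[n / 8] >> (n % 8)) & 1;
}

/* Returns true when RDMSR with the given ECX would cause a VM exit. */
bool rdmsr_causes_vm_exit(bool use_msr_bitmaps,
                          const uint8_t *read_low,   /* 00000000H - 00001FFFH */
                          const uint8_t *read_high,  /* C0000000H - C0001FFFH */
                          uint32_t ecx)
{
    if (!use_msr_bitmaps)
        return true;                      /* control is 0: always exit */
    if (ecx <= 0x1FFFu)
        return bitmap_test(read_low, ecx);
    if (ecx >= 0xC0000000u && ecx <= 0xC0001FFFu)
        return bitmap_test(read_high, ecx & 0x1FFFu);
    return true;                          /* outside both ranges: always exit */
}
```

Note that an MSR outside both ranges can never be passed through; the bitmaps simply have no bit for it.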
1.3 MSR bitmap
On processors that support the 1-setting of the “use MSR bitmaps” VM-execution control, the VM-execution control fields include the 64-bit physical address of four contiguous MSR bitmaps, which are each 1-KByte in size. This field does not exist on processors that do not support the 1-setting of that control. The four bitmaps are:
- Read bitmap for low MSRs (located at the MSR-bitmap address). This contains one bit for each MSR address in the range 00000000H to 00001FFFH. The bit determines whether an execution of RDMSR applied to that MSR causes a VM exit.
- Read bitmap for high MSRs (located at the MSR-bitmap address plus 1024). This contains one bit for each MSR address in the range C0000000H to C0001FFFH. The bit determines whether an execution of RDMSR applied to that MSR causes a VM exit.
- Write bitmap for low MSRs (located at the MSR-bitmap address plus 2048). This contains one bit for each MSR address in the range 00000000H to 00001FFFH. The bit determines whether an execution of WRMSR applied to that MSR causes a VM exit.
- Write bitmap for high MSRs (located at the MSR-bitmap address plus 3072). This contains one bit for each MSR address in the range C0000000H to C0001FFFH. The bit determines whether an execution of WRMSR applied to that MSR causes a VM exit.
A logical processor uses these bitmaps if and only if the “use MSR bitmaps” control is 1. If the bitmaps are used, an execution of RDMSR or WRMSR causes a VM exit if the value of RCX is in neither of the ranges covered by the bitmaps or if the appropriate bit in the MSR bitmaps (corresponding to the instruction and the RCX value) is 1.
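The layout of the four bitmaps reduces to simple offset arithmetic. The sketch below computes where the controlling bit for a given MSR and access type sits inside the 4-KByte page. The struct and function names are hypothetical, and the byte/bit split assumes the usual little-endian bit numbering.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct msr_bitmap_pos {
    size_t byte;    /* byte offset from the MSR-bitmap address */
    unsigned bit;   /* bit within that byte, 0 = least significant */
};

/*
 * Locate the bit that controls interception of `msr`.
 * Returns false when the MSR lies outside both covered ranges,
 * in which case the access always causes a VM exit.
 */
bool msr_bitmap_locate(uint32_t msr, bool is_write, struct msr_bitmap_pos *pos)
{
    size_t base = is_write ? 2048 : 0;  /* write bitmaps follow the read bitmaps */

    if (msr >= 0xC0000000u && msr <= 0xC0001FFFu) {
        base += 1024;                   /* high-MSR bitmap */
        msr &= 0x1FFFu;
    } else if (msr > 0x1FFFu) {
        return false;
    }
    pos->byte = base + msr / 8;
    pos->bit = msr % 8;
    return true;
}
```

For example, a read of IA32_EFER (C0000080H) is controlled by bit 0 of the byte at offset 1040, and a write by bit 0 of the byte at offset 3088.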
1.4 VM-Exit Controls for MSRs
1.5 VM-Entry Controls for MSRs
2. Basic VMX-related source code
2.1 MSR bitmap
2.1.1 Allocation and initialization
kvm_vm_ioctl(KVM_CREATE_VCPU)
// Allocate one page (4 KB) for the MSR bitmap and initialize its contents to all 1s
2.1.2 VMCS field
vmx_create_vcpu
2.2 passthrough MSR
vmx_disable_intercept_for_msr
void vmx_disable_intercept_for_msr(struct kvm_vcpu *vcpu, u32 msr, int type)
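Conceptually, passing an MSR through means clearing its bits in a bitmap that starts out as all 1s. The following is a simplified user-space model of that idea, not the kernel function itself: it takes a raw bitmap pointer instead of a vcpu, the helper names are hypothetical, and it omits the additional checks the real vmx_disable_intercept_for_msr performs. The MSR_TYPE_* values are assumed to match the kernel's read/write flags.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSR_TYPE_R      1
#define MSR_TYPE_W      2
#define MSR_TYPE_RW     3
#define MSR_BITMAP_SIZE 4096

/* All 1s: every covered MSR access is intercepted by default. */
void msr_bitmap_init(uint8_t *bitmap)
{
    memset(bitmap, 0xff, MSR_BITMAP_SIZE);
}

static void bitmap_clear(uint8_t *base, uint32_t n)
{
    base[n / 8] &= (uint8_t)~(1u << (n % 8));
}

/* Stop intercepting reads and/or writes of `msr`. */
void disable_intercept_for_msr(uint8_t *bitmap, uint32_t msr, int type)
{
    size_t high = 0;

    if (msr >= 0xC0000000u && msr <= 0xC0001FFFu) {
        high = 0x400;               /* high-MSR bitmaps sit 1 KB further in */
        msr &= 0x1FFFu;
    } else if (msr > 0x1FFFu) {
        return;                     /* not covered: always intercepted */
    }
    if (type & MSR_TYPE_R)
        bitmap_clear(bitmap + 0x000 + high, msr);
    if (type & MSR_TYPE_W)
        bitmap_clear(bitmap + 0x800 + high, msr);
}
```

Starting from all 1s is the safe default: an MSR is only exposed to the guest directly after an explicit call clears its bit.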
2.3 MSR area
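It is worth first reading the motivation given in the MSR area part of the post "Virtualization study notes: three context".
The following are keywords; readers can search for them in the KVM source code.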
VM_ENTRY_MSR_LOAD_COUNT
VM_EXIT_MSR_STORE_COUNT
struct vcpu_vmx {
3. How KVM handles MSR reads
struct msr_data {
host_initiated:
- true: the MSR access was initiated from the host side, i.e. QEMU issued the call
- false: the MSR access was initiated by the guest
3.1 VM exit when the guest executes the RDMSR instruction
kvm_emulate_rdmsr
vmx_get_msr handles read requests for a subset of special MSRs; kvm_get_msr_common handles read requests for ordinary MSRs.
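The two-level dispatch can be illustrated with a toy model. Everything prefixed with toy_ or TOY_ below is hypothetical and exists only for this sketch; the msr_data fields are assumed to mirror the kernel structure of the same name. The real kvm_emulate_rdmsr path does considerably more than this.

```c
#include <stdbool.h>
#include <stdint.h>

struct msr_data {
    bool host_initiated;
    uint32_t index;
    uint64_t data;
};

/* Hypothetical vCPU state for the model. */
struct toy_vcpu {
    uint64_t rax, rcx, rdx;
    uint64_t vendor_msr;    /* stands in for an MSR handled by vendor code */
    uint64_t common_msr;    /* stands in for an MSR handled by common code */
    bool gp_injected;
};

#define TOY_MSR_VENDOR 0x100u   /* hypothetical MSR indices */
#define TOY_MSR_COMMON 0x200u

/* Plays the role of kvm_get_msr_common: ordinary MSRs. */
static int toy_get_msr_common(struct toy_vcpu *v, struct msr_data *m)
{
    switch (m->index) {
    case TOY_MSR_COMMON:
        m->data = v->common_msr;
        return 0;
    default:
        return 1;               /* unknown MSR */
    }
}

/* Plays the role of vmx_get_msr: special cases first, then fall back. */
static int toy_vendor_get_msr(struct toy_vcpu *v, struct msr_data *m)
{
    switch (m->index) {
    case TOY_MSR_VENDOR:
        m->data = v->vendor_msr;
        return 0;
    default:
        return toy_get_msr_common(v, m);
    }
}

/* Plays the role of kvm_emulate_rdmsr: guest-initiated read after a VM exit. */
void toy_emulate_rdmsr(struct toy_vcpu *v)
{
    struct msr_data m = { .host_initiated = false, .index = (uint32_t)v->rcx };

    if (toy_vendor_get_msr(v, &m)) {
        v->gp_injected = true;  /* a failed read is reported to the guest as #GP */
        return;
    }
    v->rax = m.data & 0xffffffffu;
    v->rdx = m.data >> 32;
}
```

The vendor-specific handler sees every request first, so it can override or extend the common handling without the common code knowing about VMX.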
3.2 QEMU gets MSRs
kvm_arch_dev_ioctl
do_get_msr
4. IOCTL
4.1 KVM_GET_MSR_INDEX_LIST
KVM_GET_MSR_INDEX_LIST returns the guest MSRs that are supported. The list varies by kvm version and host processor, but does not change otherwise.
// QEMU
// KVM
4.2 KVM_GET_MSR_FEATURE_INDEX_LIST
KVM_GET_MSR_FEATURE_INDEX_LIST returns the list of MSRs that can be passed to the KVM_GET_MSRS system ioctl. This lets userspace probe host capabilities and processor features that are exposed via MSRs (e.g., VMX capabilities).
This list also varies by kvm version and host processor, but does not change otherwise.
// QEMU
// KVM
4.3 KVM_GET_MSRS
When used as a system ioctl:
Reads the values of MSR-based features that are available for the VM.
The list of msr-based features can be obtained using KVM_GET_MSR_FEATURE_INDEX_LIST in a system ioctl.
When used as a vcpu ioctl:
Reads model-specific registers from the vcpu. Supported msr indices can be obtained using KVM_GET_MSR_INDEX_LIST in a system ioctl.
struct kvm_msrs {
    __u32 nmsrs; /* number of msrs in entries */
    __u32 pad;
    struct kvm_msr_entry entries[0];
};
struct kvm_msr_entry {
    __u32 index;
    __u32 reserved;
    __u64 data;
};
Application code should set the nmsrs member (which indicates the size of the entries array) and the index member of each array entry. kvm will fill in the data member.
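Because entries is a trailing variable-length array, the request buffer has to be allocated as header plus nmsrs entries. The sketch below shows one way to build it. To stay self-contained it uses local mirrors of the two structures (real code would include <linux/kvm.h> and use struct kvm_msrs directly), and alloc_msr_request is a hypothetical helper name.

```c
#include <stdint.h>
#include <stdlib.h>

/* Local mirrors of the layouts quoted above. */
struct msr_entry {
    uint32_t index;
    uint32_t reserved;
    uint64_t data;
};

struct msrs_hdr {
    uint32_t nmsrs;     /* number of msrs in entries */
    uint32_t pad;
    struct msr_entry entries[];
};

/* Allocate a zeroed request and fill in nmsrs and each entry's index. */
struct msrs_hdr *alloc_msr_request(const uint32_t *indices, uint32_t n)
{
    struct msrs_hdr *req =
        calloc(1, sizeof(*req) + (size_t)n * sizeof(req->entries[0]));

    if (!req)
        return NULL;
    req->nmsrs = n;
    for (uint32_t i = 0; i < n; i++)
        req->entries[i].index = indices[i];
    return req;
}
```

With the real header, such a buffer would then be handed to the vcpu file descriptor with ioctl(vcpu_fd, KVM_GET_MSRS, req), after which the data members hold the values that were read. The same buffer shape, with data filled in by the caller, serves KVM_SET_MSRS.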
4.4 KVM_SET_MSRS
Writes model-specific registers to the vcpu.
Application code should set the nmsrs member (which indicates the size of the entries array), and the index and data members of each array entry.
References: