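This article takes an in-depth look at the irqfd mechanism, focusing on the KVM side. To keep the mechanism easy to follow, it covers only what was introduced by the patch "KVM: irqfd".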

1. Introduction

irqfd in KVM is implemented based on eventfd in Linux.

As its name suggests, an irqfd is a file descriptor bound to an interrupt in the virtual machine. The fd must be an eventfd. Delivery is one-way: the interrupt travels from the outside world into the guest.

With irqfd, triggering an interrupt that we have set up only requires writing to the corresponding eventfd. In userspace, a plain write() syscall suffices (libc also provides eventfd_write(), but that is merely a wrapper around write()). In the kernel, we can use eventfd_signal() instead.
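The userspace side of this can be sketched with nothing but eventfd itself. The helper below, signal_and_read(), is a hypothetical name used only for illustration: it signals an eventfd a number of times and then reads the counter back. In a real irqfd setup the consumer is KVM's wakeup callback rather than a read(), so this only demonstrates the eventfd counter semantics on Linux.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/*
 * Signal an eventfd `times` times, then read the counter back.
 * Returns the value seen by the reader, or 0 on error or if the
 * eventfd was never signalled.
 */
uint64_t signal_and_read(unsigned int times)
{
    eventfd_t value = 0;
    /* Counter starts at 0; non-blocking so an empty read fails with EAGAIN. */
    int efd = eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);

    if (efd < 0)
        return 0;

    for (unsigned int i = 0; i < times; i++) {
        /* Same effect as write(efd, &one, 8) with a 64-bit one == 1. */
        if (eventfd_write(efd, 1) < 0) {
            close(efd);
            return 0;
        }
    }

    /* Without EFD_SEMAPHORE, a read returns the accumulated count and resets it. */
    if (eventfd_read(efd, &value) < 0)
        value = 0;

    close(efd);
    return value;
}
```

Calling signal_and_read(3) performs three writes of 1 and a single read, which returns the accumulated count. This is why several signals that arrive before the consumer runs collapse into one wakeup.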

2. Overview

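irqfd is built on the eventfd mechanism. QEMU binds a GSI (global system interrupt number) to an eventfd and sends KVM a request to register the irqfd. On receiving the request, KVM adds a wait queue entry for the irqfd, which carries the GSI, to the wait queue of the eventfd. As soon as any process writes to that eventfd, the entry is woken up and its wakeup function (irqfd_wakeup) is called, which injects the interrupt into the guest (irqfd_inject).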
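From userspace, the registration step described above can be sketched as follows. This is a minimal illustration rather than QEMU's actual code: register_irqfd() is a hypothetical helper, vm_fd is assumed to be a VM file descriptor obtained via KVM_CREATE_VM, and the VM is assumed to have an in-kernel irqchip already. The struct kvm_irqfd layout and the KVM_IRQFD ioctl come from <linux/kvm.h>.

```c
#include <assert.h>
#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/eventfd.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Create an eventfd and ask KVM to bind it to `gsi` on the VM `vm_fd`.
 * Returns the eventfd on success, or -1 on failure.
 */
int register_irqfd(int vm_fd, uint32_t gsi)
{
    struct kvm_irqfd req;
    int efd = eventfd(0, EFD_CLOEXEC);

    if (efd < 0)
        return -1;

    memset(&req, 0, sizeof(req));   /* flags = 0 means "assign" */
    req.fd = (uint32_t)efd;         /* the eventfd to listen on */
    req.gsi = gsi;                  /* the interrupt to inject when it fires */

    if (ioctl(vm_fd, KVM_IRQFD, &req) < 0) {
        close(efd);
        return -1;
    }
    return efd;
}
```

After a successful call, writing to the returned eventfd is all that is needed to raise the interrupt in the guest. On the kernel side this ioctl ends up in kvm_irqfd_assign.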

3. Details

int kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
{
	struct kvm_kernel_irqfd *irqfd;
	...
	irqfd = kzalloc(sizeof(*irqfd), GFP_KERNEL_ACCOUNT);
	irqfd->kvm = kvm;
	irqfd->gsi = args->gsi;
	INIT_LIST_HEAD(&irqfd->list);
	INIT_WORK(&irqfd->inject, irqfd_inject);
	INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
	...
	irqfd->eventfd = eventfd;
	...
	/*
	 * Install our own custom wake-up handling so we are notified via
	 * a callback whenever someone signals the underlying eventfd
	 */
	init_waitqueue_func_entry(&irqfd->wait, irqfd_wakeup);
	...
	init_poll_funcptr(&irqfd->pt, irqfd_ptable_queue_proc);
	events = vfs_poll(f.file, &irqfd->pt);
	...
}

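struct kvm_kernel_irqfd contains two work_struct members, inject and shutdown, which are responsible for injecting the interrupt and shutting down the irqfd, respectively. Their handler functions are irqfd_inject and irqfd_shutdown.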

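kvm_irqfd_assign calls init_waitqueue_func_entry to register irqfd_wakeup as the function that runs when the irqfd's wait queue entry is woken. As a result, any write to the eventfd associated with this irqfd triggers irqfd_wakeup.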

kvm_irqfd_assign uses init_poll_funcptr to register irqfd_ptable_queue_proc as the queueing callback of the irqfd's poll table. irqfd_ptable_queue_proc adds the wait queue entry that belongs to this poll table to the eventfd's wait queue.

kvm_irqfd_assign then calls the eventfd's poll function, eventfd_poll, with irqfd->pt as the argument. eventfd_poll calls poll_wait, and poll_wait calls back irqfd_ptable_queue_proc, the function registered for the poll table earlier.
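The callback chain described above (poll, poll_wait, queue proc, then wakeup on signal) can be modeled in plain userspace C. Everything below is a toy model, not kernel code: all toy_* types and functions are hypothetical stand-ins for the kernel's wait queue, poll table, eventfd and irqfd, and "injection" is reduced to incrementing a counter.

```c
#include <assert.h>
#include <stddef.h>

#define toy_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct toy_wait_entry {                 /* stands in for a wait queue entry */
    void (*func)(struct toy_wait_entry *entry);
    struct toy_wait_entry *next;
};

struct toy_wait_head {                  /* stands in for a wait queue head */
    struct toy_wait_entry *first;
};

struct toy_poll_table {                 /* stands in for poll_table */
    void (*qproc)(struct toy_wait_head *wqh, struct toy_poll_table *pt);
};

struct toy_eventfd {
    struct toy_wait_head wqh;
    unsigned long count;
};

struct toy_irqfd {
    struct toy_wait_entry wait;
    struct toy_poll_table pt;
    int injected;                       /* number of simulated injections */
};

/* Role of irqfd_wakeup: runs when the eventfd is signalled. */
static void toy_irqfd_wakeup(struct toy_wait_entry *entry)
{
    struct toy_irqfd *irqfd = toy_container_of(entry, struct toy_irqfd, wait);
    irqfd->injected++;                  /* stands in for interrupt injection */
}

/* Role of irqfd_ptable_queue_proc: hook our entry onto the eventfd's queue. */
static void toy_ptable_queue_proc(struct toy_wait_head *wqh,
                                  struct toy_poll_table *pt)
{
    struct toy_irqfd *irqfd = toy_container_of(pt, struct toy_irqfd, pt);
    irqfd->wait.next = wqh->first;
    wqh->first = &irqfd->wait;
}

/* Role of poll_wait: call back into the poll table's queue proc. */
static void toy_poll_wait(struct toy_wait_head *wqh, struct toy_poll_table *pt)
{
    if (pt && pt->qproc)
        pt->qproc(wqh, pt);
}

/* Role of eventfd_poll: expose the eventfd's wait queue to the poller. */
static void toy_eventfd_poll(struct toy_eventfd *efd, struct toy_poll_table *pt)
{
    toy_poll_wait(&efd->wqh, pt);
}

/* Role of a write to the eventfd: bump the counter and wake all entries. */
static void toy_eventfd_signal(struct toy_eventfd *efd)
{
    efd->count++;
    for (struct toy_wait_entry *e = efd->wqh.first; e; e = e->next)
        e->func(e);
}

/* Mirrors the assign steps, then signals the eventfd `signals` times. */
int toy_assign_and_signal(int signals)
{
    struct toy_eventfd efd = { .wqh = { NULL }, .count = 0 };
    struct toy_irqfd irqfd = { .injected = 0 };

    irqfd.wait.func = toy_irqfd_wakeup;         /* init_waitqueue_func_entry */
    irqfd.wait.next = NULL;
    irqfd.pt.qproc = toy_ptable_queue_proc;     /* init_poll_funcptr */
    toy_eventfd_poll(&efd, &irqfd.pt);          /* the poll call in assign */

    for (int i = 0; i < signals; i++)
        toy_eventfd_signal(&efd);

    return irqfd.injected;
}
```

The point of the indirection is that the eventfd never needs to know about KVM: it only exposes its wait queue through poll, and the irqfd installs its own entry and callback there. In this model every signal produces one simulated injection.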


References:

  1. KVM: irqfd
  2. Linux Virtualization KVM-QEMU Analysis (12): ioeventfd and irqfd
  3. The irqfd mechanism in qemu-kvm
  4. KVM Irqfd Introduction