What is eventfd?

An “eventfd object” can be used as an event wait/notify mechanism by user-space applications, and by the kernel to notify user-space applications of events. It has been available since Linux 2.6.22. The object contains an unsigned 64-bit integer (uint64_t) counter that is maintained by the kernel. This counter is initialized with the value specified in the argument initval.

#include <sys/eventfd.h>
int eventfd(unsigned int initval, int flags);

This article focuses on how eventfd is used by the kernel to notify user-space applications.

Using eventfd in a kernel module

The following is taken from the Stack Overflow question "Writing to eventfd from kernel module".

Each open file on a system can be identified by the pid of one of the processes that opened it and the fd corresponding to that file (within that process’s context). So if my kernel module knows the pid and fd, it can look up the process’s struct task_struct *, from that its struct files_struct *, and finally, using the fd, obtain the pointer to the eventfd’s struct file *. Then, using this last pointer, it can write to the eventfd’s counter.

Here is the code for the userspace program and the kernel module that I wrote to demonstrate the concept (both now work):

Userspace C code (efd_us.c):

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>     // Definition of uint64_t
#include <inttypes.h>   // PRIu64 format macro
#include <sys/select.h> // select(), fd_set
#include <sys/eventfd.h>

int efd;            // Eventfd file descriptor
uint64_t eftd_ctr;  // Counter value read from the eventfd

int retval;         // for select()
fd_set rfds;        // for select()

ssize_t s;          // return value of read()

int main(void) {

    // Create eventfd
    efd = eventfd(0, 0);
    if (efd == -1) {
        printf("\nUnable to create eventfd! Exiting...\n");
        exit(EXIT_FAILURE);
    }

    // The kernel module needs both values to locate this eventfd
    printf("\nefd=%d pid=%d", efd, getpid());

    // Watch efd
    FD_ZERO(&rfds);
    FD_SET(efd, &rfds);

    printf("\nNow waiting on select()...");
    fflush(stdout);

    // Block until the eventfd counter becomes non-zero
    retval = select(efd + 1, &rfds, NULL, NULL, NULL);

    if (retval == -1) {
        printf("\nselect() error. Exiting...");
        exit(EXIT_FAILURE);
    } else if (retval > 0) {
        printf("\nselect() says data is available now.");
        printf("\nreturned from select(), now executing read()...");
        s = read(efd, &eftd_ctr, sizeof(uint64_t));
        if (s != sizeof(uint64_t)) {
            printf("\neventfd read error. Exiting...");
        } else {
            printf("\nReturned from read(), value read = %" PRIu64, eftd_ctr);
        }
    } else if (retval == 0) {
        printf("\nselect() says that no data was available");
    }

    printf("\nClosing eventfd. Exiting...");
    close(efd);
    printf("\n");
    exit(EXIT_SUCCESS);
}

Kernel Module C code (efd_lkm.c):

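#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/err.h>
#include <linux/pid.h>
#include <linux/sched.h>
#include <linux/fdtable.h>
#include <linux/rcupdate.h>
#include <linux/eventfd.h>

// Note: this targets the kernel APIs available when the answer was written.
// Newer kernels have changed some of them (fcheck_files() and the argument
// list of eventfd_signal(), for example), so check your kernel's headers.

//Received from userspace. Process ID and eventfd's File descriptor are enough to uniquely identify an eventfd object.
int pid;
int efd;

//Resolved references...
struct task_struct * userspace_task = NULL; //...to userspace program's task struct
struct file * efd_file = NULL;              //...to eventfd's file struct
struct eventfd_ctx * efd_ctx = NULL;        //...and finally to eventfd context

//Increment Counter by 1
static uint64_t plus_one = 1;

int init_module(void) {
    printk(KERN_ALERT "~~~Received from userspace: pid=%d efd=%d\n", pid, efd);

    // Keep the RCU read lock held across the lookups, until a reference
    // to the eventfd context has been taken.
    rcu_read_lock();

    userspace_task = pid_task(find_vpid(pid), PIDTYPE_PID);
    if (!userspace_task || !userspace_task->files) {
        rcu_read_unlock();
        printk(KERN_ALERT "~~~Could not resolve task or files struct for pid=%d\n", pid);
        return -ESRCH;
    }
    printk(KERN_ALERT "~~~Resolved pointer to the userspace program's task struct: %p\n", userspace_task);
    printk(KERN_ALERT "~~~Resolved pointer to the userspace program's files struct: %p\n", userspace_task->files);

    efd_file = fcheck_files(userspace_task->files, efd);
    if (!efd_file) {
        rcu_read_unlock();
        printk(KERN_ALERT "~~~fd %d is not open in pid=%d\n", efd, pid);
        return -EBADF;
    }
    printk(KERN_ALERT "~~~Resolved pointer to the userspace program's eventfd's file struct: %p\n", efd_file);

    // Takes a reference on the context. On failure this returns an
    // ERR_PTR() value, not NULL, so test it with IS_ERR().
    efd_ctx = eventfd_ctx_fileget(efd_file);
    rcu_read_unlock();
    if (IS_ERR(efd_ctx)) {
        printk(KERN_ALERT "~~~eventfd_ctx_fileget() failed. Bye.\n");
        return PTR_ERR(efd_ctx);
    }
    printk(KERN_ALERT "~~~Resolved pointer to the userspace program's eventfd's context: %p\n", efd_ctx);

    // Add plus_one to the counter and wake up any waiters
    eventfd_signal(efd_ctx, plus_one);

    printk(KERN_ALERT "~~~Incremented userspace program's eventfd's counter by 1\n");

    // Drop the reference taken by eventfd_ctx_fileget()
    eventfd_ctx_put(efd_ctx);

    return 0;
}


void cleanup_module(void) {
    printk(KERN_ALERT "~~~Module Exiting...\n");
}

MODULE_LICENSE("GPL");
module_param(pid, int, 0);
module_param(efd, int, 0);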

To run this, carry out the following steps:

  1. Compile the userspace program (efd_us.out) and the kernel module (efd_lkm.ko).
  2. Run the userspace program (./efd_us.out) and note the efd and pid values that it prints (e.g. “efd=3 pid=2803”). The userspace program will then wait indefinitely on select().
  3. Open a new terminal window and insert the kernel module passing the pid and efd as params: sudo insmod efd_lkm.ko pid=2803 efd=3
  4. Switch back to the userspace program window and you will see that the userspace program has broken out of select and exited.

The kernel functions used here can be looked up in LXR.


References:

  1. Writing to eventfd from kernel module (Stack Overflow)
  2. Linux Programmer’s Manual: EVENTFD(2)