Setting the NVMe driver's poll_queues module parameter enables polling on I/O queues.

Kernel source

//https://elixir.bootlin.com/linux/v6.0/source/drivers/nvme/host/pci.c
static unsigned int poll_queues;
module_param_cb(poll_queues, &io_queue_count_ops, &poll_queues, 0644);
MODULE_PARM_DESC(poll_queues, "Number of queues to use for polled IO.");
static int nvme_setup_io_queues(struct nvme_dev *dev)
{
	...
	dev->nr_poll_queues = poll_queues;
	...
}
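
Because the parameter is registered with mode 0644, the kernel exposes it under sysfs at /sys/module/nvme/parameters/poll_queues. The following is a minimal sketch of reading it from userspace; `read_module_param` is a hypothetical helper written for this illustration, not part of any kernel or library API. Note that the file holds the requested value, which is not necessarily the number of poll queues the driver ends up using.

```c
#include <stdio.h>

/*
 * Hypothetical helper: read an unsigned integer module parameter from a
 * sysfs file such as /sys/module/nvme/parameters/poll_queues.
 * Returns the value, or -1 if the file cannot be opened or parsed.
 */
static long read_module_param(const char *path)
{
	FILE *f = fopen(path, "r");
	unsigned int value;
	long result = -1;

	if (!f)
		return -1;
	/* The file contains a single decimal number followed by a newline. */
	if (fscanf(f, "%u", &value) == 1)
		result = (long)value;
	fclose(f);
	return result;
}
```

Calling `read_module_param("/sys/module/nvme/parameters/poll_queues")` on a machine with the nvme module loaded should return the value the module was loaded with; on a machine without it, the helper returns -1.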

Caveats

When there is only one I/O queue, interrupts still fire even after poll_queues is set to 1. This is in fact the expected behavior.

static int nvme_setup_irqs(struct nvme_dev *dev, unsigned int nr_io_queues)
{
	struct pci_dev *pdev = to_pci_dev(dev->dev);
	struct irq_affinity affd = {
		.pre_vectors	= 1,
		.calc_sets	= nvme_calc_irq_sets,
		.priv		= dev,
	};
	unsigned int irq_queues, poll_queues;

	/*
	 * Poll queues don't need interrupts, but we need at least one I/O queue
	 * left over for non-polled I/O.
	 */
	poll_queues = min(dev->nr_poll_queues, nr_io_queues - 1);
	dev->io_queues[HCTX_TYPE_POLL] = poll_queues;

	/*
	 * Initialize for the single interrupt case, will be updated in
	 * nvme_calc_irq_sets().
	 */
	dev->io_queues[HCTX_TYPE_DEFAULT] = 1;
	dev->io_queues[HCTX_TYPE_READ] = 0;

	/*
	 * We need interrupts for the admin queue and each non-polled I/O queue,
	 * but some Apple controllers require all queues to use the first
	 * vector.
	 */
	irq_queues = 1;
	if (!(dev->ctrl.quirks & NVME_QUIRK_SINGLE_VECTOR))
		irq_queues += (nr_io_queues - poll_queues);
	return pci_alloc_irq_vectors_affinity(pdev, 1, irq_queues,
			      PCI_IRQ_ALL_TYPES | PCI_IRQ_AFFINITY, &affd);
}

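As the code above shows, the driver always reserves at least one I/O queue that uses interrupts rather than polling.

Therefore, when there is only one I/O queue, setting the driver parameter poll_queues to 1 has no effect: the local poll_queues variable in nvme_setup_irqs evaluates to min(1, 1 - 1) = 0, so the sole I/O queue still uses interrupts rather than polling.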
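
To make the arithmetic concrete, here is a small userspace model of the two computations in nvme_setup_irqs. It is only a sketch: `struct queue_split`, `min_uint`, and `model_setup_irqs` are hypothetical names introduced for illustration, and the model assumes nr_io_queues is at least 1 (the subtraction would wrap around otherwise).

```c
#include <stdbool.h>

/* Hypothetical result type for this illustration; not a kernel structure. */
struct queue_split {
	unsigned int poll_queues;	/* I/O queues that will actually be polled */
	unsigned int irq_vectors;	/* max vectors requested: admin + non-polled I/O */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Userspace model of the arithmetic in nvme_setup_irqs() (Linux v6.0).
 * Assumption: nr_io_queues >= 1.
 */
static struct queue_split model_setup_irqs(unsigned int nr_poll_queues,
					   unsigned int nr_io_queues,
					   bool single_vector_quirk)
{
	struct queue_split s;

	/* At least one I/O queue is always left for interrupt-driven I/O. */
	s.poll_queues = min_uint(nr_poll_queues, nr_io_queues - 1);

	/* One vector for the admin queue, plus one per non-polled I/O queue. */
	s.irq_vectors = 1;
	if (!single_vector_quirk)
		s.irq_vectors += nr_io_queues - s.poll_queues;
	return s;
}
```

With nr_poll_queues = 1 and nr_io_queues = 1, the model yields 0 polled queues and 2 interrupt vectors (the admin queue plus the single I/O queue). With nr_poll_queues = 1 and nr_io_queues = 4, it yields 1 polled queue and 4 vectors.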