This article introduces the snoop control issue for CPU caches under VT-d.

Background

CPU cache coherency

Snooping targets the CPU cache. It guarantees that the CPU caches stay coherent, but the operation is relatively time-consuming.

No-snoop attribute in PCI-e request

In a PCIe request, the No Snoop Attribute bit controls the snoop behavior.
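As a rough illustration of where this bit lives, the sketch below reads and sets it in the first DWORD of a TLP header. It assumes the usual PCIe layout in which Attr[1:0] occupies bits 5:4 of header byte 2, with Attr[0] being No Snoop and Attr[1] being Relaxed Ordering. The macro and function names are hypothetical and do not come from any real driver or library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: Attr[1:0] sits in bits 5:4 of byte 2 of the first
 * TLP header DWORD. Attr[0] = No Snoop, Attr[1] = Relaxed Ordering.
 * Names below are illustrative only. */
#define TLP_ATTR_NO_SNOOP (1u << 4)
#define TLP_ATTR_RELAXED  (1u << 5)

/* Return true if the request asks the root complex to skip snooping. */
static bool tlp_is_no_snoop(const uint8_t hdr[4])
{
    return (hdr[2] & TLP_ATTR_NO_SNOOP) != 0;
}

/* Set or clear the No Snoop attribute without touching other fields. */
static void tlp_set_no_snoop(uint8_t hdr[4], bool on)
{
    if (on) {
        hdr[2] |= (uint8_t)TLP_ATTR_NO_SNOOP;
    } else {
        hdr[2] &= (uint8_t)~TLP_ATTR_NO_SNOOP;
    }
}
```

On real hardware the device itself generates the TLP, so software normally only permits or forbids the use of this attribute (for example through the Enable No Snoop bit in the PCIe Device Control register) rather than editing headers directly.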

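When a PCIe device performs a DMA read from memory and the transfer is very large, say 512 MB, cache coherency operations do not improve the efficiency of the DMA read; they reduce it. In the vast majority of cases the data accessed by such a DMA read does not hit in the cache, and the snoop itself is time-consuming, so the net effect is lower efficiency.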

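For this kind of case, a better approach is to first use software instructions to make the cache and memory consistent, set the "No Snoop Attribute" bit to 1, and then perform the DMA read. Likewise, using this method for a DMA write to a large data region also improves efficiency.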
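The "software instructions" step can be sketched as follows for x86: write back every cache line that covers the buffer, then fence, before the device is told to start its no-snoop DMA read. This is a minimal sketch, not a driver implementation. It assumes 64-byte cache lines, uses the SSE2 intrinsics `_mm_clflush` and `_mm_mfence`, and the function name `flush_for_no_snoop_dma` is hypothetical. On targets without SSE2 the loop only counts lines and flushes nothing.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h> /* _mm_clflush, _mm_mfence */
#define HAVE_CLFLUSH 1
#endif

#define CACHE_LINE_SIZE 64u /* assumption: 64-byte cache lines */

/* Flush every cache line overlapping [buf, buf + len) so that memory holds
 * the latest data before a no-snoop DMA read. Returns the number of lines
 * visited. */
size_t flush_for_no_snoop_dma(const void *buf, size_t len)
{
    if (len == 0) {
        return 0;
    }

    /* Round the start address down to a cache-line boundary. */
    uintptr_t start = (uintptr_t)buf & ~(uintptr_t)(CACHE_LINE_SIZE - 1u);
    uintptr_t end = (uintptr_t)buf + len;
    size_t lines = 0;

    for (uintptr_t p = start; p < end; p += CACHE_LINE_SIZE) {
#ifdef HAVE_CLFLUSH
        _mm_clflush((const void *)p); /* write back and invalidate the line */
#endif
        lines++;
    }

#ifdef HAVE_CLFLUSH
    _mm_mfence(); /* make sure the flushes complete before the DMA is started */
#endif
    return lines;
}
```

A caller would fill the buffer, call `flush_for_no_snoop_dma(buf, len)`, and only then ring the device doorbell. For a 512 MB buffer this is millions of flushes, so the approach pays off only when the snoop cost on the DMA path is higher than the one-time flush cost.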

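In addition, when software knows that a memory region will never be cached (for example, because the system has marked it uncacheable in advance), no snoop is needed.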

The following is another example in which snooping is unnecessary:

An example use case is a GPU that needs to “borrow” extra memory from the processor(s) for “spill” and “restore” traffic. Only the GPU will be accessing that memory, so it does not need to look in the processor caches to see if any of them has modified copies of the cache lines. The improvement in bandwidth due to the elimination of snooping can improve graphics frame rates.

snoop control in VT-d

If the VT-d hardware supports Snoop Control (SC), VT-d can be configured to ignore the “no-snoop attribute” in PCIe transactions.

The snoop behavior of a DMA operation is determined by the combination of:

  • Snoop Control capability of VT-d DMAR unit
  • The setting of the SNP field in the leaf PTE
  • No-snoop attribute in PCI-e request
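How these three inputs combine can be sketched as a small predicate. This reflects my reading of the VT-d behavior (a set SNP field forces snooping when SC is supported; otherwise the request's own no-snoop attribute decides). It is not copied from the ACRN documentation, and the function name is hypothetical.

```c
#include <stdbool.h>

/* Decide whether a DMA request ends up being snooped.
 *
 * sc_supported : the DMAR unit reports the Snoop Control capability
 * pte_snp      : the SNP field in the leaf PTE is set
 * req_no_snoop : the PCIe request carries the no-snoop attribute
 *
 * Assumption: without SC the SNP field is reserved (0) and has no effect,
 * so only the request attribute matters. */
bool dma_is_snooped(bool sc_supported, bool pte_snp, bool req_no_snoop)
{
    bool force_snoop = sc_supported && pte_snp;

    return force_snoop || !req_no_snoop;
}
```

Under this model the only combinations that bypass the CPU caches are those where the request is marked no-snoop and either the hardware lacks SC or the SNP field is clear.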

ACRN enables Snoop Control by default, by setting bit 11 of the leaf PTEs of the EPT tables, if all enabled VT-d DMAR units support Snoop Control. The MMU ignores bit 11 of an EPT leaf PTE, so this has no side effect on the MMU.

If any of the enabled VT-d DMAR units doesn’t support Snoop Control, bit 11 of the EPT leaf PTE is not set, because VT-d hardware implementations that do not support Snoop Control treat the field as reserved (0).
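The all-or-nothing rule described above can be sketched like this. It is an illustration rather than ACRN's actual code: the names are hypothetical, and the position of the SC flag (bit 7 of each unit's Extended Capability Register) is taken from my understanding of the VT-d specification.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DMAR_ECAP_SC (1ULL << 7)  /* assumed: SC bit in the Extended Capability Register */
#define EPT_LEAF_SNP (1ULL << 11) /* bit 11 of a leaf PTE, as described in the text */

/* Return true only if every enabled DMAR unit advertises Snoop Control.
 * ecap[] holds one Extended Capability Register value per enabled unit. */
bool all_dmar_units_support_sc(const uint64_t *ecap, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if ((ecap[i] & DMAR_ECAP_SC) == 0) {
            return false;
        }
    }
    return true;
}

/* Flag to OR into each EPT leaf PTE: bit 11 when SC is usable everywhere,
 * otherwise 0 so that units without SC see the field as reserved (0). */
uint64_t ept_leaf_snp_flag(const uint64_t *ecap, size_t n)
{
    return all_dmar_units_support_sc(ecap, n) ? EPT_LEAF_SNP : 0;
}
```

Making the decision once for all units keeps a single set of page tables valid for every DMAR unit that may walk them.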


References:

  1. Hypervisor high-level design VT-d
  2. non-snoop read and non-snoop write. meaning
  3. The PCIe bus transaction layer