This article covers the umonitor, umwait, and tpause instructions.

1. Introduction

The most authoritative description comes, of course, from the Intel SDM (Software Developer's Manual).

1.1 UMONITOR—User Level Set Up Monitor Address

The UMONITOR instruction arms address monitoring hardware using an address specified in the source register (the address range that the monitoring hardware checks for store operations can be determined by using the CPUID monitor leaf function, EAX=05H). A store to an address within the specified address range triggers the monitoring hardware. The state of the monitor hardware is used by UMWAIT.

UMONITOR sets up an address range for the monitor hardware using the content of source register as an effective address and puts the monitor hardware in armed state. A store to the specified address range will trigger the monitor hardware.
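Before using any of these instructions it helps to ask the CPU what it supports. The following is a minimal sketch, assuming GCC or Clang on x86 (for `<cpuid.h>`); the struct and function names are hypothetical. It reads the WAITPKG feature bit, CPUID.(EAX=07H,ECX=0):ECX[5], and the monitor-line sizes from the CPUID monitor leaf (EAX=05H) mentioned above.

```c
#include <cpuid.h>   /* __get_cpuid_count (GCC/Clang, x86 only) */

/* Hypothetical container for the values probed below. */
struct waitpkg_info {
    int supported;           /* CPUID.(EAX=07H,ECX=0):ECX[5], WAITPKG */
    unsigned smallest_line;  /* CPUID.05H:EAX[15:0], smallest monitor-line size in bytes */
    unsigned largest_line;   /* CPUID.05H:EBX[15:0], largest monitor-line size in bytes */
};

struct waitpkg_info probe_waitpkg(void)
{
    struct waitpkg_info info = {0, 0, 0};
    unsigned eax, ebx, ecx, edx;

    /* __get_cpuid_count returns 0 when the leaf is not available. */
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        info.supported = (int)((ecx >> 5) & 1u);

    if (__get_cpuid_count(5, 0, &eax, &ebx, &ecx, &edx)) {
        info.smallest_line = eax & 0xffffu;
        info.largest_line = ebx & 0xffffu;
    }
    return info;
}
```

A caller would check `probe_waitpkg().supported` once at startup and fall back to a plain spin loop when it is 0. Inside virtual machines the monitor leaf may report zero sizes, so the line sizes are best treated as a hint.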

1.2 UMWAIT—User Level Monitor Wait

A hint that allows the processor to stop instruction execution and enter an implementation-dependent optimized state until occurrence of a class of events.

This class of events includes:

  • the monitoring hardware is triggered
  • the time-stamp counter reaches or exceeds the implicit EDX:EAX 64-bit input value (if the monitoring hardware did not trigger beforehand)

UMWAIT instructs the processor to enter an implementation-dependent optimized state while monitoring a range of addresses. The optimized state may be either a light-weight power/performance optimized state or an improved power/performance optimized state. The selection between the two states is governed by bit[0] of the explicit source register operand.
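The usual pattern is: arm the monitor with UMONITOR, re-check the condition (a store may have landed before the monitor was armed), then execute UMWAIT with a TSC deadline. The sketch below assumes GCC or Clang on x86 and uses the `_umonitor` and `_umwait` intrinsics from `<x86intrin.h>`; the helper names and the `UMWAIT_C0x` macros are hypothetical. It checks CPUID first and falls back to a PAUSE spin loop when WAITPKG is not available.

```c
#include <stdint.h>
#include <cpuid.h>      /* __get_cpuid_count */
#include <x86intrin.h>  /* _umonitor, _umwait, __rdtsc, _mm_pause */

#define UMWAIT_C02 0u   /* bit[0] = 0: deeper C0.2 state */
#define UMWAIT_C01 1u   /* bit[0] = 1: lighter C0.1 state */

/* WAITPKG is reported in CPUID.(EAX=07H,ECX=0):ECX[5]. */
static int cpu_has_waitpkg(void)
{
    unsigned eax, ebx, ecx, edx;
    return __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && ((ecx >> 5) & 1u);
}

/* Arm the monitor, re-check the flag, then wait for a store or the deadline.
 * The target attribute lets this compile without -mwaitpkg. */
__attribute__((target("waitpkg")))
static void umwait_once(volatile uint32_t *flag, unsigned state, uint64_t deadline)
{
    _umonitor((void *)flag);
    if (*flag == 0)
        (void)_umwait(state, deadline); /* may return early; the caller loops */
}

/* Returns 1 once *flag becomes non-zero, 0 if timeout_cycles TSC ticks pass first. */
int wait_until_nonzero(volatile uint32_t *flag, unsigned state, uint64_t timeout_cycles)
{
    const int waitpkg = cpu_has_waitpkg();
    const uint64_t deadline = __rdtsc() + timeout_cycles;

    while (*flag == 0) {
        if (__rdtsc() >= deadline)
            return 0;
        if (waitpkg)
            umwait_once(flag, state, deadline);
        else
            _mm_pause();                /* classic spin-wait hint */
    }
    return 1;
}
```

Another thread would publish its result with an ordinary store to `*flag`; the waiter calls, for example, `wait_until_nonzero(&flag, UMWAIT_C01, 100000)`. The outer loop matters because UMWAIT can wake for reasons other than the monitored store.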

1.3 Timed PAUSE

Directs the processor to enter an implementation dependent optimized state until the TSC reaches the value in EDX:EAX.

TPAUSE instructs the processor to enter an implementation-dependent optimized state. There are two such optimized states to choose from: a light-weight power/performance optimized state and an improved power/performance optimized state. The selection between the two is governed by bit[0] of the explicit source register operand.
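TPAUSE needs no monitored address, which makes it a natural fit for short fixed delays. The following sketch, again assuming GCC or Clang on x86 and the `_tpause` intrinsic, waits for a given number of TSC ticks. `delay_cycles` and the `TPAUSE_C0x` macros are hypothetical names, and the PAUSE-based fallback is an addition for CPUs without WAITPKG.

```c
#include <stdint.h>
#include <cpuid.h>      /* __get_cpuid_count */
#include <x86intrin.h>  /* _tpause, __rdtsc, _mm_pause */

#define TPAUSE_C02 0u   /* bit[0] = 0: deeper C0.2 state */
#define TPAUSE_C01 1u   /* bit[0] = 1: lighter C0.1 state */

/* WAITPKG is reported in CPUID.(EAX=07H,ECX=0):ECX[5]. */
static int cpu_has_waitpkg(void)
{
    unsigned eax, ebx, ecx, edx;
    return __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && ((ecx >> 5) & 1u);
}

/* TPAUSE may return before the deadline (for example when an OS-imposed
 * time limit expires), so keep re-issuing it until the TSC catches up. */
__attribute__((target("waitpkg")))
static void tpause_until(uint64_t deadline, unsigned state)
{
    while (__rdtsc() < deadline)
        (void)_tpause(state, deadline);
}

/* Wait for at least `cycles` TSC ticks in the requested optimized state. */
void delay_cycles(uint64_t cycles, unsigned state)
{
    const uint64_t deadline = __rdtsc() + cycles;

    if (cpu_has_waitpkg()) {
        tpause_until(deadline, state);
        return;
    }
    while (__rdtsc() < deadline)
        _mm_pause();    /* fallback: ordinary spin loop */
}
```

For example, `delay_cycles(50000, TPAUSE_C01)` waits for roughly 50000 TSC ticks. Note that the deadline is an absolute TSC value, not a duration.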

2. Usage

2.1 spin-lock

Today, an application that needs to wait for a very short time has to use a spin loop. Spin loops consume more power and keep using execution resources, which can hurt the sibling thread on a core with Hyper-Threading (HT). The new umonitor, umwait, and tpause instructions offer a low-power way to wait that can also improve the HT sibling's performance and leave it any power headroom. These instructions can be used in both user space and kernel space.

A new MSR, IA32_UMWAIT_CONTROL, lets the kernel set a time limit, in TSC quanta, on how long the umwait and tpause instructions can wait before normal execution continues. This prevents user applications from waiting for a long time.

The processor supports two levels of optimized states: a light-weight power/performance optimized state (C0.1 state) or an improved power/performance optimized state (C0.2 state with deeper power saving and higher exit latency). It is conceivable that system administrators might not want to allow the system to go into C0.2 if, for example, it is handling workloads with realtime response requirements.
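To make the two knobs concrete, here is a small sketch of how the low 32 bits of IA32_UMWAIT_CONTROL can be packed and unpacked. It assumes the layout used by the Linux kernel: MSR address 0xE1, bit 0 set means C0.2 is disabled, and bits 31:2 hold the maximum wait time in TSC quanta (the low two bits of the limit are always zero, and a limit of zero means no limit). The struct and function names are hypothetical.

```c
#include <stdint.h>

#define MSR_IA32_UMWAIT_CONTROL  0xE1u          /* MSR address */
#define UMWAIT_CTRL_C02_DISABLE  UINT32_C(1)    /* bit 0: 1 = C0.2 not allowed */
#define UMWAIT_CTRL_TIME_MASK    (~UINT32_C(3)) /* bits 31:2: max time in TSC quanta */

/* Hypothetical decoded view of the MSR's low 32 bits. */
struct umwait_control {
    int c02_enabled;    /* 1 if user code may enter C0.2 */
    uint32_t max_time;  /* time limit in TSC quanta, 0 = no limit */
};

struct umwait_control decode_umwait_control(uint32_t msr_lo)
{
    struct umwait_control c;
    c.c02_enabled = (msr_lo & UMWAIT_CTRL_C02_DISABLE) ? 0 : 1;
    c.max_time = msr_lo & UMWAIT_CTRL_TIME_MASK;
    return c;
}

uint32_t encode_umwait_control(int c02_enabled, uint32_t max_time)
{
    /* The two low bits of max_time cannot be represented and are dropped. */
    return (max_time & UMWAIT_CTRL_TIME_MASK) |
           (c02_enabled ? 0u : UMWAIT_CTRL_C02_DISABLE);
}
```

On Linux the MSR is not normally written directly; the same two settings are exposed to administrators as the sysfs files `/sys/devices/system/cpu/umwait_control/enable_c02` and `/sys/devices/system/cpu/umwait_control/max_time`. A system with realtime requirements would write 0 to `enable_c02`.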

2.2 DPDK

Power-optimized RX for Ethernet devices

This patchset proposes a simple API for Ethernet drivers to cause the CPU to enter a power-optimized state while waiting for packets to arrive, along with a set of (hopefully generic) intrinsics that facilitate that. This is achieved through cooperation with the NIC driver, which lets us know the address of the next NIC RX (receive) ring packet descriptor and wait for writes to it.
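The idea can be illustrated without DPDK itself. The sketch below is not DPDK's API: the descriptor layout, the `DESC_DONE` bit, and all names are hypothetical stand-ins for what a real driver would provide. It assumes GCC or Clang on x86. The "driver" part computes the address of the next descriptor's status word, and the waiting part monitors that address until the done bit appears or a TSC timeout expires.

```c
#include <stdint.h>
#include <cpuid.h>      /* __get_cpuid_count */
#include <x86intrin.h>  /* _umonitor, _umwait, __rdtsc, _mm_pause */

#define RING_SIZE 8u
#define DESC_DONE UINT64_C(1)   /* hypothetical "packet written" bit */

/* Hypothetical RX descriptor and ring; real NIC layouts differ. */
struct rx_desc { uint64_t buf_addr; volatile uint64_t status; };
struct rx_ring { struct rx_desc desc[RING_SIZE]; unsigned next; };

static int cpu_has_waitpkg(void)
{
    unsigned eax, ebx, ecx, edx;
    return __get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx) && ((ecx >> 5) & 1u);
}

/* Arm the monitor on the status word, re-check it, then sleep in C0.2. */
__attribute__((target("waitpkg")))
static void sleep_on_status(volatile uint64_t *status, uint64_t deadline)
{
    _umonitor((void *)status);
    if ((*status & DESC_DONE) == 0)
        (void)_umwait(0u /* bit[0] = 0: C0.2 */, deadline);
}

/* Returns 1 when the next descriptor is done, 0 on timeout. */
int rx_ring_wait(struct rx_ring *ring, uint64_t timeout_cycles)
{
    /* The driver knows where the NIC will write next. */
    volatile uint64_t *status = &ring->desc[ring->next % RING_SIZE].status;
    const uint64_t deadline = __rdtsc() + timeout_cycles;
    const int waitpkg = cpu_has_waitpkg();

    while ((*status & DESC_DONE) == 0) {
        if (__rdtsc() >= deadline)
            return 0;
        if (waitpkg)
            sleep_on_status(status, deadline);
        else
            _mm_pause();    /* fallback: busy polling */
    }
    return 1;
}
```

An RX loop would call `rx_ring_wait` only after a poll came back empty, so that a busy queue never pays the wake-up latency.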


References:

  1. Short waits with umwait
  2. x86/umwait: Enable user wait instructions
  3. KVM: x86: Enable user wait instructions