This post records how to use the SRCU-related APIs in the Linux kernel.

Introduction

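  • Read-copy update (RCU) is a high-performance synchronization mechanism that can replace reader-writer locks in read-mostly workloads. RCU readers take no locks, so the read side is cheap, never blocks, and has a deterministic execution time. This design means an RCU writer cannot block readers, which makes the write side expensive: the writer must keep the old version of the protected data alive until no reader can still be accessing it, and only then reclaim it.
  • RCU requires that a reader inside a critical section neither sleeps nor blocks. Sleeping implies a context switch in which the task gives up its CPU, and classic RCU treats a context switch as evidence that the CPU has left its read-side critical section, so a reader that sleeps inside a critical section breaks grace-period detection.
  • Many scenarios nevertheless require that tasks be able to sleep. In a real-time system, for example, a high-priority task can preempt a low-priority one, which must then give up the CPU. If the low-priority task holds the RCU read lock at that point, it effectively sleeps inside the critical section, which breaks RCU grace-period detection.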

Sleepable RCU (SRCU), an RCU synchronization mechanism whose readers may sleep, was proposed to address this.

SRCU Implementation Strategy

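SRCU isolates grace-period detection within a subsystem, so that even if a reader sleeps for an arbitrarily long time, only the writers in that same subsystem are affected.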
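The isolation idea can be sketched with a toy userspace model. Everything here is an assumption made for illustration: `struct toy_domain` and the `toy_*` helpers are hypothetical names, not kernel APIs, and the model ignores the "started before the update" rule and all concurrency. It only shows that a reader stuck in one domain does not hold up writers in another.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for one SRCU subsystem; NOT the kernel's struct srcu_struct. */
struct toy_domain {
	int active_readers;	/* readers currently inside a critical section */
};

static void toy_read_lock(struct toy_domain *d)
{
	d->active_readers++;
}

static void toy_read_unlock(struct toy_domain *d)
{
	assert(d->active_readers > 0);	/* unlock must pair with a lock */
	d->active_readers--;
}

/* A writer has to wait only for readers of its own domain. */
static bool toy_writer_must_wait(const struct toy_domain *d)
{
	return d->active_readers > 0;
}
```

With two domains `a` and `b`, calling `toy_read_lock(&a)` makes `toy_writer_must_wait(&a)` true while `toy_writer_must_wait(&b)` stays false, no matter how long the reader in `a` sleeps.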

SRCU API and Usage

Usage information for the related APIs can be found in Documentation/RCU/lockdep.txt.

The SRCU API is shown below. The following sections describe how to use it.

int init_srcu_struct(struct srcu_struct *sp);
void cleanup_srcu_struct(struct srcu_struct *sp);
int srcu_read_lock(struct srcu_struct *sp);
void srcu_read_unlock(struct srcu_struct *sp, int idx);
void synchronize_srcu(struct srcu_struct *sp);

Each struct srcu_struct represents one logical SRCU subsystem.

Initialization and Cleanup

Each subsystem using SRCU must create a struct srcu_struct, either by declaring a variable of this type or by dynamically allocating the memory, for example, via kmalloc(). Once this structure is in place, it must be initialized via init_srcu_struct(), which returns zero for success or an error code for failure (for example, upon memory exhaustion).

If the struct srcu_struct is dynamically allocated, then cleanup_srcu_struct() must be called before it is freed. Similarly, if the struct srcu_struct is a variable declared within a Linux kernel module, then cleanup_srcu_struct() must be called before the module is unloaded. Either way, the caller must take care to ensure that all SRCU read-side critical sections have completed (and that no more will commence) before calling cleanup_srcu_struct().
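The ordering rules above (allocate, initialize, let readers drain, clean up, then free) can be walked through in a small userspace sketch. The names `struct toy_srcu`, `toy_init_srcu()`, `toy_cleanup_srcu()` and `toy_lifecycle()` are hypothetical stand-ins, and `malloc()`/`free()` stand in for `kmalloc()`/`kfree()`. Unlike the kernel's `cleanup_srcu_struct()`, which returns void, the toy cleanup returns an error code so that the "no readers left" rule is checkable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Toy stand-in for struct srcu_struct; NOT the kernel's definition. */
struct toy_srcu {
	int initialized;
	int active_readers;
};

static int toy_init_srcu(struct toy_srcu *sp)
{
	sp->initialized = 1;
	sp->active_readers = 0;
	return 0;	/* zero means success, as with init_srcu_struct() */
}

/* Assumption: report -EBUSY instead of warning when readers remain. */
static int toy_cleanup_srcu(struct toy_srcu *sp)
{
	assert(sp->initialized);
	if (sp->active_readers != 0)
		return -EBUSY;
	sp->initialized = 0;
	return 0;
}

static int toy_lifecycle(void)
{
	struct toy_srcu *sp = malloc(sizeof(*sp));	/* stands in for kmalloc() */
	int ret;

	if (!sp)
		return -ENOMEM;
	ret = toy_init_srcu(sp);
	if (ret)
		goto out_free;

	sp->active_readers++;			/* a reader is still inside */
	if (toy_cleanup_srcu(sp) != -EBUSY) {	/* cleanup must be refused */
		ret = -EINVAL;
		goto out_free;
	}
	sp->active_readers--;			/* the reader leaves */

	ret = toy_cleanup_srcu(sp);		/* now cleanup is legal */
out_free:
	free(sp);				/* release the memory last */
	return ret;
}
```

`toy_lifecycle()` returns 0 when the sequence behaves as described: cleanup is refused while a reader is active and succeeds once the reader has left.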

Read-Side Primitives

The read-side srcu_read_lock() and srcu_read_unlock() primitives are used as shown:

idx = srcu_read_lock(&ss);
/* read-side critical section. */
srcu_read_unlock(&ss, idx);

The ss variable is the struct srcu_struct whose initialization was described above, and the idx variable is an integer that in effect tells srcu_read_unlock() the grace period during which the corresponding srcu_read_lock() started.
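Why `idx` has to be handed back can be illustrated with a two-slot counter, in the spirit of the classic SRCU design. This is a single-threaded toy under stated assumptions: `struct toy_srcu`, `toy_read_lock()`, `toy_read_unlock()` and `toy_flip()` are hypothetical names, and the real implementation uses per-CPU counters and memory barriers that are omitted here.

```c
#include <assert.h>

/* Toy stand-in; the real struct srcu_struct keeps per-CPU counters. */
struct toy_srcu {
	unsigned long completed;	/* grace periods started so far */
	int c[2];			/* active readers per counter slot */
};

/* Enter a critical section and remember which slot was incremented. */
static int toy_read_lock(struct toy_srcu *sp)
{
	int idx = sp->completed & 0x1;

	sp->c[idx]++;
	return idx;
}

/* Leave the critical section by decrementing that same slot. */
static void toy_read_unlock(struct toy_srcu *sp, int idx)
{
	assert(sp->c[idx] > 0);
	sp->c[idx]--;
}

/* Updater side: send new readers to the other slot, return the old one. */
static int toy_flip(struct toy_srcu *sp)
{
	int old = sp->completed & 0x1;

	sp->completed++;
	return old;
}
```

A reader that locked before `toy_flip()` holds slot 0, while one that locks afterwards gets slot 1. The updater then only needs slot 0 to drain, which is why the unlock must be told which slot its lock used.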

Update-Side Primitives

The synchronize_srcu() primitive may be used as shown below:

list_del_rcu(p);
synchronize_srcu(&ss);
kfree(p);

As one might expect by analogy with Classic RCU, this primitive blocks until after the completion of all SRCU read-side critical sections that started before the synchronize_srcu() started, as shown in Table 1.

Here, CPU 1 need only wait for the completion of CPU 0’s SRCU read-side critical section. It need not wait for the completion of CPU 2’s SRCU read-side critical section, because CPU 2 did not start this critical section until after CPU 1 began executing synchronize_srcu(). Finally, CPU 1’s synchronize_srcu() need not wait for CPU 3’s SRCU read-side critical section, because CPU 3 is using s2 rather than s1 as its struct srcu_struct. CPU 3’s SRCU read-side critical section is thus related to a different set of grace periods than those of CPUs 0 and 2.
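The waiting rule in this scenario can be written as a small predicate over a logical timeline. The struct, the function name and all time values below are made up for illustration; the original timings are not reproduced here. The only properties taken from the text are that CPU 0's section starts before the `synchronize_srcu()`, CPU 2's starts after it, and CPU 3 uses s2.

```c
#include <assert.h>
#include <stdbool.h>

/* One read-side critical section on a made-up logical timeline. */
struct toy_section {
	int domain;	/* 1 for s1, 2 for s2 */
	int start;	/* logical time of srcu_read_lock() */
	int end;	/* logical time of srcu_read_unlock() */
};

/*
 * Must a synchronize_srcu() on `domain`, begun at `sync_start`, wait
 * for section r? Only if r uses the same srcu_struct, started earlier,
 * and has not finished yet.
 */
static bool toy_must_wait_for(const struct toy_section *r,
			      int domain, int sync_start)
{
	assert(r->start < r->end);
	return r->domain == domain &&
	       r->start < sync_start &&
	       r->end > sync_start;
}
```

For example, with CPU 0 as `{1, 0, 5}`, CPU 2 as `{1, 3, 8}`, CPU 3 as `{2, 1, 6}` and CPU 1 calling `synchronize_srcu(&s1)` at time 2, the predicate is true only for CPU 0's section.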

MISC API

  • synchronize_srcu_expedited

Wait for an SRCU grace period to elapse, but be more aggressive about spinning rather than blocking when waiting.

  • srcu_dereference_check
/**
 * srcu_dereference_check - fetch SRCU-protected pointer for later dereferencing
 * @p: the pointer to fetch and protect for later dereferencing
 * @ssp: pointer to the srcu_struct, which is used to check that we
 *       really are in an SRCU read-side critical section.
 * @c: condition to check for update-side use
 *
 * If PROVE_RCU is enabled, invoking this outside of an RCU read-side
 * critical section will result in an RCU-lockdep splat, unless @c evaluates
 * to 1. The @c argument will normally be a logical expression containing
 * lockdep_is_held() calls.
 */
#define srcu_dereference_check(p, ssp, c) \
	__rcu_dereference_check((p), (c) || srcu_read_lock_held(ssp), __rcu)

Both readers and updaters may invoke this macro.
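The reason both sides can use it follows from the `(c) || srcu_read_lock_held(ssp)` condition: a reader passes because it is inside a read-side critical section, and an updater passes because `c` reports that it holds the update-side lock. A minimal userspace sketch of that logic, with hypothetical names (`toy_dereference_check()`, `toy_splat_count`) and plain booleans standing in for the lockdep queries:

```c
#include <assert.h>
#include <stdbool.h>

/* Counts the complaints that PROVE_RCU would turn into lockdep splats. */
static int toy_splat_count;

/*
 * Toy model of the check: the access is legitimate if the caller is in
 * an SRCU read-side critical section OR the update-side condition c holds.
 */
static void *toy_dereference_check(void *p, bool in_read_section, bool c)
{
	if (!(c || in_read_section))
		toy_splat_count++;
	return p;	/* the pointer is fetched either way */
}
```

A reader would call it with `in_read_section` true and `c` false, an updater with `in_read_section` false and `c` true. Only a caller for which both are false is flagged.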

