This article collects material on system calls.

1. Basics

It is recommended to read ray's notes and to compile and run the programs in them.

ray’s notes

LINUX SYSTEM CALL TABLE FOR X86 64

2. syscall vs sysenter vs int 0x80

  • syscall is the default way of entering kernel mode on x86-64. This instruction is not available in 32 bit modes of operation on Intel processors.
  • sysenter is the instruction most frequently used to invoke system calls in 32-bit modes of operation. It is similar to syscall but somewhat harder to use; that extra complexity is the kernel's concern.
  • int 0x80 is a legacy way to invoke a system call and should be avoided.

What is better “int 0x80” or “syscall” in 32-bit code on Linux?

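The traditional int 0x80 is relatively slow. Intel introduced sysenter and AMD introduced syscall, the so-called fast system call instructions, and using them is faster.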
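As a rough illustration of the syscall instruction, here is a minimal sketch that calls getpid without going through the libc wrapper. The function name raw_getpid is hypothetical, and the inline assembly assumes GCC or Clang on x86-64 Linux; on any other target the sketch falls back to the libc syscall() function.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>   /* SYS_getpid */
#include <unistd.h>        /* getpid, syscall */

/* Hypothetical helper: invoke getpid with the x86-64 `syscall` instruction.
 * On other targets it falls back to the generic libc syscall() wrapper. */
static long raw_getpid(void)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                 /* the result comes back in rax */
        : "a"((long)SYS_getpid)     /* the system call number goes in rax */
        : "rcx", "r11", "memory"    /* the instruction overwrites rcx and r11 */
    );
    return ret;
#else
    return syscall(SYS_getpid);
#endif
}
```

The value returned by raw_getpid() should match what getpid() reports for the same process.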

3. vDSO (virtual dynamic shared object)

First, run a few commands to show the vDSO directly. The test environment is 64-bit Linux.

acrn@acrn:/proc$ cat /proc/self/maps | tail -2
7ffe0ce1b000-7ffe0ce1d000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
acrn@acrn:/proc$ cat /proc/self/maps | tail -2
7ffc9c356000-7ffc9c358000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Note that the vDSO area has moved, while the vsyscall page remains at the same location. The location of the vsyscall page is nailed down in the kernel ABI, but the vDSO area - like most other areas in the user-space memory layout - has its location randomized every time it is mapped.
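The same check can be made from inside a program. The sketch below scans /proc/self/maps for a named mapping such as [vdso]; the helper name find_mapping is hypothetical, and the code assumes a Linux system with /proc mounted.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look for the mapping called `name` (for example
 * "[vdso]") in /proc/self/maps. On success store its bounds in *start and
 * *end and return 1; return 0 if the file or the mapping is missing. */
static int find_mapping(const char *name, unsigned long *start,
                        unsigned long *end)
{
    char line[512];
    int found = 0;
    FILE *f = fopen("/proc/self/maps", "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        /* Each line starts with "start-end perms ..." and ends with the name. */
        if (strstr(line, name) != NULL &&
            sscanf(line, "%lx-%lx", start, end) == 2)
            found = 1;
    }
    fclose(f);
    return found;
}
```

Calling find_mapping("[vdso]", &start, &end) in two separate runs of the program should normally give different addresses, in line with the randomization described above.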

acrn@acrn:/proc$ ldd /bin/sh
linux-vdso.so.1 => (0x00007ffc03ffd000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2d40401000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2d409f3000)

linux-vdso.so.1 is a virtual shared object that doesn’t have any physical file on the disk; it’s a part of the kernel that’s exported into every program’s address space when it’s loaded.

For a more detailed description, see man vdso.

Wikipedia gives a good introduction to the vDSO.

3.1 Introduction

vDSO (virtual dynamic shared object) is a kernel mechanism for exporting a carefully selected set of kernel space routines to user space applications so that applications can call these kernel space routines in-process, without incurring the performance penalty of a mode switch from user mode to kernel mode that is inherent when calling these same kernel space routines by means of the system call interface.
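To get a feel for the mode-switch cost mentioned above, one can time the same call made through libc and made as an explicit system call. The sketch below is only a rough measurement under stated assumptions: it assumes a Linux libc whose clock_gettime() is served by the vDSO, which depends on the architecture, kernel, and clock source, and the function names are hypothetical.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <sys/syscall.h>   /* SYS_clock_gettime */
#include <time.h>          /* clock_gettime, CLOCK_MONOTONIC */
#include <unistd.h>        /* syscall */

/* Current monotonic time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Elapsed nanoseconds for n calls through libc, which may use the vDSO. */
static int64_t time_libc_calls(int n)
{
    struct timespec ts;
    int64_t start = now_ns();
    for (int i = 0; i < n; i++)
        clock_gettime(CLOCK_MONOTONIC, &ts);
    return now_ns() - start;
}

/* Elapsed nanoseconds for n calls that always enter the kernel. */
static int64_t time_raw_syscalls(int n)
{
    struct timespec ts;
    int64_t start = now_ns();
    for (int i = 0; i < n; i++)
        syscall(SYS_clock_gettime, CLOCK_MONOTONIC, &ts);
    return now_ns() - start;
}
```

Comparing time_libc_calls(1000000) with time_raw_syscalls(1000000) gives an estimate of the difference on a given machine. If the two numbers are close, the libc call is probably not being served by the vDSO on that system.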

3.2 Virtual dynamic shared object

vDSO uses standard mechanisms for linking and loading, i.e. the standard Executable and Linkable Format (ELF). vDSO is a memory area allocated in user space which exposes some kernel functionalities. vDSO is dynamically allocated, offers improved safety through address space layout randomization, and supports more than 4 system calls. Some C standard libraries, like glibc, may provide vDSO links so that if the kernel does not have vDSO support, a traditional syscall is made. vDSO helps to reduce the calling overhead on simple kernel routines, and it can also work as a way to select the best system-call method on some computer architectures such as IA-32.
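That the vDSO is an ordinary ELF image can be checked from user space. The following sketch assumes glibc 2.16 or later, which provides getauxval(); it reads the vDSO base address from the auxiliary vector entry AT_SYSINFO_EHDR and compares the first bytes with the ELF magic number. The helper names vdso_base and looks_like_elf are hypothetical.

```c
#include <elf.h>        /* ELFMAG, SELFMAG */
#include <stdint.h>     /* uintptr_t */
#include <string.h>     /* memcmp */
#include <sys/auxv.h>   /* getauxval, AT_SYSINFO_EHDR */

/* Hypothetical helper: base address of the vDSO image in this process,
 * or NULL if the kernel did not provide one. */
static const void *vdso_base(void)
{
    return (const void *)(uintptr_t)getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if the memory at `base` begins with the ELF magic bytes. */
static int looks_like_elf(const void *base)
{
    return base != NULL && memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

On a system with vDSO support, looks_like_elf(vdso_base()) is expected to return 1. A full symbol lookup would go on to parse the ELF dynamic section, which is beyond this sketch.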

3.3 Vsyscall

vDSO was developed to offer the vsyscall features while overcoming its limitations: a small amount of statically allocated memory, which allows only 4 system calls, and addresses that the application binary interface (ABI) fixes to the same values in every process, which compromises security. This security issue has been mitigated by emulating the virtual system calls, but the emulation introduces additional latency.

4. Extension

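The implementation of system calls in the Linux kernel, and of the vDSO in particular, is fairly complex, so this article does not cover it. Interested readers can consult the links cited here and study the implementation details alongside the kernel source and related material.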


References:

  1. On vsyscalls and the vDSO
  2. Where is linux-vdso.so.1 present on the file system
  3. What are vdso and vsyscall?
  4. VDSO与vsyscall
  5. Two frequently used system calls are ~77% slower on AWS EC2
  6. The Definitive Guide to Linux System Calls