Notes about Linux kernel Oops.

1. Introduction

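What is an Oops? Linguistically, "Oops" is an interjection (roughly 哎呦 in Chinese). When you cause a small accident or do something embarrassing, you say "Oops": "Oops, sorry, I really didn't mean to break your cup." That is all the word means.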

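What is an Oops in Linux kernel development? Essentially the same thing, except that the speaker is now Linux. When a serious problem occurs, the kernel apologizes to us: "Oops, sorry, I messed things up." When the kernel detects such a fault, which may or may not escalate to a kernel panic, it prints an Oops message showing the current register state, the stack contents, and the call trace, which helps us locate the error.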

Kernel bug reports often come with a stack dump like the one below:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 28102 at kernel/module.c:1108 module_put+0x57/0x70
Modules linked in: dvb_usb_gp8psk(-) dvb_usb dvb_core nvidia_drm(PO) nvidia_modeset(PO) snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep snd_hda_core snd_pcm snd_timer snd soundcore nvidia(PO) [last unloaded: rc_core]
CPU: 1 PID: 28102 Comm: rmmod Tainted: P WC O 4.8.4-build.1 #1
Hardware name: MSI MS-7309/MS-7309, BIOS V1.12 02/23/2009
00000000 c12ba080 00000000 00000000 c103ed6a c1616014 00000001 00006dc6
c1615862 00000454 c109e8a7 c109e8a7 00000009 ffffffff 00000000 f13f6a10
f5f5a600 c103ee33 00000009 00000000 00000000 c109e8a7 f80ca4d0 c109f617
Call Trace:
[<c12ba080>] ? dump_stack+0x44/0x64
[<c103ed6a>] ? __warn+0xfa/0x120
[<c109e8a7>] ? module_put+0x57/0x70
[<c109e8a7>] ? module_put+0x57/0x70
[<c103ee33>] ? warn_slowpath_null+0x23/0x30
[<c109e8a7>] ? module_put+0x57/0x70
[<f80ca4d0>] ? gp8psk_fe_set_frontend+0x460/0x460 [dvb_usb_gp8psk]
[<c109f617>] ? symbol_put_addr+0x27/0x50
[<f80bc9ca>] ? dvb_usb_adapter_frontend_exit+0x3a/0x70 [dvb_usb]
[<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
[<c13d03bc>] ? usb_disable_endpoint+0x7c/0xb0
[<f80bb48a>] ? dvb_usb_device_exit+0x2a/0x50 [dvb_usb]
[<c13d2882>] ? usb_unbind_interface+0x62/0x250
[<c136b514>] ? __pm_runtime_idle+0x44/0x70
[<c13620d8>] ? __device_release_driver+0x78/0x120
[<c1362907>] ? driver_detach+0x87/0x90
[<c1361c48>] ? bus_remove_driver+0x38/0x90
[<c13d1c18>] ? usb_deregister+0x58/0xb0
[<c109fbb0>] ? SyS_delete_module+0x130/0x1f0
[<c1055654>] ? task_work_run+0x64/0x80
[<c1000fa5>] ? exit_to_usermode_loop+0x85/0x90
[<c10013f0>] ? do_fast_syscall_32+0x80/0x130
[<c1549f43>] ? sysenter_past_esp+0x40/0x6a
---[ end trace 6ebc60ef3981792f ]---

Such stack traces provide enough information to identify the line in the kernel's source code where the bug happened. Depending on the severity of the issue, the trace may also contain the word Oops, as in this one:

BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [<c06969d4>] iret_exc+0x7d0/0xa59
*pdpt = 000000002258a001 *pde = 0000000000000000
Oops: 0002 [#1] PREEMPT SMP
...

Whether it is an Oops or some other sort of stack trace, the offending line is usually required to identify and handle the bug. Throughout these notes, "Oops" refers to all kinds of stack traces that need to be analyzed.
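
The frames in the traces above share a regular shape: an address in `[<...>]`, an optional `?` (the unwinder found the address on the stack but is not sure it belongs to the call chain), then `symbol+offset/size` and, for loadable modules, the module name in brackets. A minimal sketch of pulling those fields apart, assuming that layout; `FRAME_RE` and `parse_frame` are hypothetical names, not part of any kernel tool:

```python
import re

# Matches frames such as:
#   [<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"   # address in hex
    r"(?P<unreliable>\?\s+)?"         # '?' marks a frame the unwinder is unsure about
    r"(?P<symbol>[\w.]+)"             # function name
    r"\+0x(?P<offset>[0-9a-f]+)"      # offset into the function
    r"/0x(?P<size>[0-9a-f]+)"         # total size of the function
    r"(?:\s+\[(?P<module>\w+)\])?"    # owning module, if not built in
)

def parse_frame(line):
    """Return a dict describing one call-trace frame, or None if the line is not a frame."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "reliable": m.group("unreliable") is None,
        "symbol": m.group("symbol"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),
    }

frame = parse_frame("[<f80bb3bf>] ? dvb_usb_exit+0x2f/0xd0 [dvb_usb]")
print(frame["symbol"], hex(frame["offset"]), frame["module"])
```

Lines that are not frames, such as `Call Trace:`, yield `None`, so the function can be mapped over a whole log.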

2. decode_stacktrace

If the kernel is compiled with CONFIG_DEBUG_INFO, you can enhance the quality of the stack trace by using scripts/decode_stacktrace.sh.

2.1 Input

[    6.906437]  [<ffffffff811f0e90>] ? backtrace_test_irq_callback+0x20/0x20
[    6.907121]  [<ffffffff84388ce8>] dump_stack+0x52/0x7f
[    6.907640]  [<ffffffff811f0ec8>] backtrace_regression_test+0x38/0x110
[    6.908281]  [<ffffffff813596a0>] ? proc_create_data+0xa0/0xd0
[    6.908870]  [<ffffffff870a8040>] ? proc_modules_init+0x22/0x22
[    6.909480]  [<ffffffff810020c2>] do_one_initcall+0xc2/0x1e0
[...]

2.2 Output

[  635.148361]  dump_stack (lib/dump_stack.c:52)
[  635.149127]  warn_slowpath_common (kernel/panic.c:418)
[  635.150214]  warn_slowpath_null (kernel/panic.c:453)
[  635.151031]  _oalloc_pages_slowpath+0x6a/0x7d0
[  635.152171]  ? zone_watermark_ok (mm/page_alloc.c:1728)
[  635.152988]  ? get_page_from_freelist (mm/page_alloc.c:1939)
[  635.154766]  __alloc_pages_nodemask (mm/page_alloc.c:2766)

2.3 Usage

./decode_stacktrace.sh [vmlinux] [base path]

Here vmlinux is the vmlinux file to extract line numbers from, and base path is the path to the root of the build tree. For example:

./decode_stacktrace.sh vmlinux /home/sasha/linux/ < input.log > output.log
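
Conceptually, decoding a frame means turning `symbol+offset` back into an absolute address that a tool such as addr2line can look up in vmlinux. The sketch below shows only that arithmetic; it is a simplified illustration, not the script's actual implementation. The function names are hypothetical, and the two symbol addresses are not real `nm` output: they were derived from the sample input in 2.1 by subtracting each frame's offset from its address.

```python
def load_symbols(nm_output):
    """Build {symbol: start_address} from nm-style lines: '<hex addr> <type> <name>'."""
    table = {}
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _type, name = parts
            table[name] = int(addr, 16)
    return table

def resolve(frame, symbols):
    """Turn 'symbol+0xoff/0xsize' into an absolute address, or None if the symbol is unknown."""
    name, _, rest = frame.partition("+")
    offset = int(rest.split("/")[0], 16)
    base = symbols.get(name)
    return None if base is None else base + offset

# Start addresses derived from the sample input (frame address minus offset).
SAMPLE_NM = """\
ffffffff811f0e90 T backtrace_regression_test
ffffffff84388c96 T dump_stack
"""

symbols = load_symbols(SAMPLE_NM)
print(hex(resolve("backtrace_regression_test+0x38/0x110", symbols)))
# prints 0xffffffff811f0ec8, the address shown for that frame in 2.1
```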

3. Finding the bug location with gdb

A bug report is most useful when it points to the location of the bug in the kernel source. Using gdb is usually the easiest way to find it, but the kernel must have been built with debug info.

Given the vmlinux file, gdb can map the address reported in the Oops to the exact file and line number.

This works best on a kernel compiled with CONFIG_DEBUG_INFO, which can be enabled by running:

$ ./scripts/config -d COMPILE_TEST -e DEBUG_KERNEL -e DEBUG_INFO

On a kernel compiled with CONFIG_DEBUG_INFO, you can simply copy the EIP value from the OOPS:

EIP:    0060:[<c021e50e>]    Not tainted VLI

And use GDB to translate that to human-readable form:

$ gdb vmlinux
(gdb) l *0xc021e50e
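
When many reports need the same lookup, the copy-and-paste step can be scripted. A minimal sketch, assuming the `EIP:`/`IP:` line layout shown in these notes and a `vmlinux` in the current directory; `ip_from_oops` and `gdb_command` are hypothetical helpers, and the code only builds the command line without running gdb:

```python
import re
import shlex

def ip_from_oops(line):
    """Extract the faulting address from an 'EIP:', 'RIP:' or 'IP:' line, or return None."""
    m = re.search(r"\b[ER]?IP:.*?\[<([0-9a-f]+)>\]", line)
    return int(m.group(1), 16) if m else None

def gdb_command(addr, vmlinux="vmlinux"):
    """Build a non-interactive gdb invocation that lists the source around addr."""
    return ["gdb", "-batch", "-ex", f"list *{addr:#x}", vmlinux]

addr = ip_from_oops("EIP:    0060:[<c021e50e>]    Not tainted VLI")
print(shlex.join(gdb_command(addr)))
# prints: gdb -batch -ex 'list *0xc021e50e' vmlinux
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues around the `*`.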

4. More info

4.1 Oops: 0002 [#1]

0002 is the Oops error code, and #1 is the Oops counter: this is the first Oops since the system booted.
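
For an x86 page fault such as the NULL pointer dereference in the introduction, the error code is a bit mask printed in hexadecimal. A minimal sketch of decoding its low bits, assuming the x86 page-fault encoding (other architectures and other trap types use different encodings); the function name is hypothetical:

```python
def decode_page_fault_code(code):
    """Describe the low bits of an x86 page-fault error code."""
    return {
        "cause": "protection fault" if code & 0x1 else "page not present",  # bit 0
        "access": "write" if code & 0x2 else "read",                        # bit 1
        "mode": "user" if code & 0x4 else "kernel",                         # bit 2
        "reserved_bit": bool(code & 0x8),                                   # bit 3
        "instruction_fetch": bool(code & 0x10),                             # bit 4
    }

# The code in "Oops: 0002 [#1]" is printed in hexadecimal.
print(decode_page_fault_code(int("0002", 16)))
```

Under that encoding, 0002 reads as a kernel-mode write to a page that is not present, which is consistent with a NULL pointer dereference.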

4.2 Tainted information
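
The warning in the introduction contains `Tainted: P WC O`. Each letter records an event that makes the kernel state less trustworthy for debugging. A minimal sketch of decoding those letters, assuming the partial table below; it lists only a handful of flags, the descriptions are paraphrased, and `TAINT_FLAGS` and `decode_taint` are hypothetical names:

```python
# Partial table of taint letters; not exhaustive.
TAINT_FLAGS = {
    "P": "proprietary module was loaded",
    "F": "module was force-loaded",
    "D": "kernel has already died (Oops or BUG)",
    "W": "kernel issued a warning earlier",
    "C": "staging driver was loaded",
    "O": "out-of-tree module was loaded",
    "E": "unsigned module was loaded",
}

def decode_taint(flags):
    """Return (letter, meaning) pairs for every non-blank character in a Tainted string."""
    return [
        (ch, TAINT_FLAGS.get(ch, "not in this partial table"))
        for ch in flags
        if not ch.isspace()
    ]

for letter, meaning in decode_taint("P WC O"):
    print(letter, "-", meaning)
```

For that trace the flags fit the module list: the nvidia modules are marked `(PO)`, proprietary and out-of-tree.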

