This post collects notes on the memory allocation functions in the Linux kernel.

Kernel functions

The kmalloc family

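Memory allocated by kmalloc() lies in the kernel's direct (linear) mapping of physical memory and is also physically contiguous. Its virtual addresses differ from the underlying physical addresses only by a fixed offset, so converting between the two is simple.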
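The fixed-offset relationship can be illustrated with a small userspace sketch. It is only a model of what __pa() and __va() do for direct-mapped addresses: the names demo_virt_to_phys, demo_phys_to_virt and the value of DEMO_PAGE_OFFSET are assumptions made for illustration, since the real base of the direct mapping depends on the architecture and kernel configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical base of the direct mapping. The real PAGE_OFFSET is
 * architecture- and configuration-specific. */
#define DEMO_PAGE_OFFSET UINT64_C(0xffff888000000000)

/* Model of __pa(): direct-mapped virtual address -> physical address. */
static uint64_t demo_virt_to_phys(uint64_t vaddr)
{
    assert(vaddr >= DEMO_PAGE_OFFSET); /* only valid inside the direct map */
    return vaddr - DEMO_PAGE_OFFSET;
}

/* Model of __va(): physical address -> direct-mapped virtual address. */
static uint64_t demo_phys_to_virt(uint64_t paddr)
{
    return paddr + DEMO_PAGE_OFFSET;
}
```

Because the conversion is a single addition or subtraction, no page-table walk is needed for memory that comes from kmalloc().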

  • kmalloc
  • kzalloc
  • kcalloc
/**
 * kzalloc - allocate memory. The memory is set to zero.
 * @size: how many bytes of memory are required.
 * @flags: the type of memory to allocate (see kmalloc).
 */
static inline void *kzalloc(size_t size, gfp_t flags)
{
    return kmalloc(size, flags | __GFP_ZERO);
}

/**
 * kcalloc - allocate memory for an array. The memory is set to zero.
 * @n: number of elements.
 * @size: element size.
 * @flags: the type of memory to allocate (see kmalloc).
 */
static inline void *kcalloc(size_t n, size_t size, gfp_t flags)
{
    return kmalloc_array(n, size, flags | __GFP_ZERO);
}
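kcalloc() goes through kmalloc_array() rather than multiplying n by size itself, because kmalloc_array() rejects requests whose total size would overflow. A rough userspace analogue is sketched below. The names demo_zalloc and demo_calloc are hypothetical, and malloc() plus memset() stand in for kmalloc() with __GFP_ZERO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Analogue of kzalloc(): allocate size bytes and zero them. */
static void *demo_zalloc(size_t size)
{
    void *p = malloc(size);

    if (p)
        memset(p, 0, size);
    return p;
}

/* Analogue of kcalloc(): zeroed array allocation that refuses requests
 * whose total size n * size would wrap around size_t. */
static void *demo_calloc(size_t n, size_t size)
{
    if (size != 0 && n > SIZE_MAX / size)
        return NULL; /* multiplication would overflow */
    return demo_zalloc(n * size);
}
```

Without the overflow check, a huge n could wrap the product to a small value and the caller would receive a buffer far smaller than it expects.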

The vmalloc family

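Physically non-contiguous: the allocation is built by mapping physical page frames that need not be adjacent.
Virtually contiguous: the resulting virtual address range is contiguous.
Page-by-page mapping: alloc_page() obtains multiple, possibly non-contiguous, physical page frames from the buddy system. The kernel page tables are then modified to map these pages into a contiguous range of virtual address space (VMALLOC_START ~ VMALLOC_END).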
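The page-by-page mapping described above can be modeled in userspace as follows. This is a toy sketch, not kernel code: struct demo_vm_area, demo_vmalloc, demo_vfree and demo_byte_at are hypothetical names, malloc() stands in for alloc_page(), and an array of pointers stands in for the kernel page tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_PAGE_SIZE ((size_t)4096)

/* Toy "vmalloc area": page_table[i] is the frame backing virtual page i. */
struct demo_vm_area {
    size_t nr_pages;
    void **page_table;
};

/* Release every frame, then the table and the area itself. */
static void demo_vfree(struct demo_vm_area *area)
{
    if (!area)
        return;
    for (size_t i = 0; i < area->nr_pages; i++)
        free(area->page_table[i]); /* free(NULL) is a no-op */
    free(area->page_table);
    free(area);
}

/* Allocate enough separate frames to cover size bytes and record them
 * in consecutive page-table slots. */
static struct demo_vm_area *demo_vmalloc(size_t size)
{
    struct demo_vm_area *area;

    if (size == 0)
        return NULL;
    area = calloc(1, sizeof(*area));
    if (!area)
        return NULL;
    area->nr_pages = (size + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
    area->page_table = calloc(area->nr_pages, sizeof(void *));
    if (!area->page_table) {
        free(area);
        return NULL;
    }
    for (size_t i = 0; i < area->nr_pages; i++) {
        area->page_table[i] = malloc(DEMO_PAGE_SIZE); /* one "frame" */
        if (!area->page_table[i]) {
            demo_vfree(area);
            return NULL;
        }
    }
    return area;
}

/* Translate a contiguous "virtual" offset into the backing frame. */
static unsigned char *demo_byte_at(struct demo_vm_area *area, size_t offset)
{
    assert(offset < area->nr_pages * DEMO_PAGE_SIZE);
    return (unsigned char *)area->page_table[offset / DEMO_PAGE_SIZE]
           + offset % DEMO_PAGE_SIZE;
}
```

The caller sees one contiguous range of offsets, while each page of that range may live in an unrelated frame. Every access pays for a lookup, which mirrors why vmalloc memory is generally more expensive to set up and use than kmalloc memory.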

  • vmalloc
  • vzalloc
/**
 * vmalloc - allocate virtually contiguous memory
 * @size: allocation size
 * Allocate enough pages to cover @size from the page level
 * allocator and map them into contiguous kernel virtual space.
 *
 * For tight control over page level allocator and protection flags
 * use __vmalloc() instead.
 */
void *vmalloc(unsigned long size)
{
    return __vmalloc_node_flags(size, NUMA_NO_NODE,
                                GFP_KERNEL);
}

/**
 * vzalloc - allocate virtually contiguous memory with zero fill
 * @size: allocation size
 * Allocate enough pages to cover @size from the page level
 * allocator and map them into contiguous kernel virtual space.
 * The memory allocated is set to zero.
 *
 * For tight control over page level allocator and protection flags
 * use __vmalloc() instead.
 */
void *vzalloc(unsigned long size)
{
    return __vmalloc_node_flags(size, NUMA_NO_NODE,
                                GFP_KERNEL | __GFP_ZERO);
}

The kvzalloc family

  • kvzalloc
  • kvcalloc
static inline void *kvzalloc(size_t size, gfp_t flags)
{
    return kvmalloc(size, flags | __GFP_ZERO);
}

static inline void *kvmalloc(size_t size, gfp_t flags)
{
    return kvmalloc_node(size, flags, NUMA_NO_NODE);
}

/**
 * kvmalloc_node - attempt to allocate physically contiguous memory, but upon
 * failure, fall back to non-contiguous (vmalloc) allocation.
 * @size: size of the request.
 * @flags: gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL.
 * @node: numa node to allocate from
 *
 * Uses kmalloc to get the memory but if the allocation fails then falls back
 * to the vmalloc allocator. Use kvfree for freeing the memory.
 *
 * Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported.
 * __GFP_RETRY_MAYFAIL is supported, and it should be used only if kmalloc is
 * preferable to the vmalloc fallback, due to visible performance drawbacks.
 *
 * Please note that any use of gfp flags outside of GFP_KERNEL is careful to not
 * fall back to vmalloc.
 */
void *kvmalloc_node(size_t size, gfp_t flags, int node)
{
    gfp_t kmalloc_flags = flags;
    void *ret;

    /*
     * vmalloc uses GFP_KERNEL for some internal allocations (e.g page tables)
     * so the given set of flags has to be compatible.
     */
    if ((flags & GFP_KERNEL) != GFP_KERNEL)
        return kmalloc_node(size, flags, node);

    /*
     * We want to attempt a large physically contiguous block first because
     * it is less likely to fragment multiple larger blocks and therefore
     * contribute to a long term fragmentation less than vmalloc fallback.
     * However make sure that larger requests are not too disruptive - no
     * OOM killer and no allocation failure warnings as we have a fallback.
     */
    if (size > PAGE_SIZE) {
        kmalloc_flags |= __GFP_NOWARN;

        if (!(kmalloc_flags & __GFP_RETRY_MAYFAIL))
            kmalloc_flags |= __GFP_NORETRY;
    }

    ret = kmalloc_node(size, kmalloc_flags, node);

    /*
     * It doesn't really make sense to fallback to vmalloc for sub page
     * requests
     */
    if (ret || size <= PAGE_SIZE)
        return ret;

    return __vmalloc_node_flags_caller(size, node, flags,
                                       __builtin_return_address(0));
}
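The decision logic in kvmalloc_node() can be isolated into a small userspace model that is easy to experiment with. The sketch below only reproduces the flag handling and the fallback condition. The DEMO_* bit values, struct demo_plan and demo_kvmalloc_plan are assumptions made for illustration, and the real GFP_KERNEL is a combination of several bits rather than a single one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE ((size_t)4096)

/* Hypothetical bit values, chosen only for this sketch. */
#define DEMO_GFP_KERNEL        0x1u
#define DEMO_GFP_NOWARN        0x2u
#define DEMO_GFP_NORETRY       0x4u
#define DEMO_GFP_RETRY_MAYFAIL 0x8u

struct demo_plan {
    unsigned int kmalloc_flags; /* flags used for the kmalloc attempt */
    bool may_fall_back;         /* vmalloc is tried if kmalloc fails */
};

/* Mirror the policy of kvmalloc_node() without allocating anything. */
static struct demo_plan demo_kvmalloc_plan(size_t size, unsigned int flags)
{
    struct demo_plan plan = { flags, false };

    /* Flags that are not a superset of GFP_KERNEL: kmalloc only. */
    if ((flags & DEMO_GFP_KERNEL) != DEMO_GFP_KERNEL)
        return plan;

    /* Larger requests fail quietly and quickly, since a fallback exists. */
    if (size > DEMO_PAGE_SIZE) {
        plan.kmalloc_flags |= DEMO_GFP_NOWARN;
        if (!(plan.kmalloc_flags & DEMO_GFP_RETRY_MAYFAIL))
            plan.kmalloc_flags |= DEMO_GFP_NORETRY;
        plan.may_fall_back = true;
    }
    return plan;
}
```

In the model, a request of at most one page never falls back, and a request without GFP_KERNEL semantics never reaches vmalloc at all. As the kernel-doc comment above states, memory obtained through this family is released with kvfree().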

Obtaining the physical address

static phys_addr_t kvm_kaddr_to_phys(void *kaddr)
{
    if (!is_vmalloc_addr(kaddr)) {
        return __pa(kaddr);
    } else {
        return page_to_phys(vmalloc_to_page(kaddr)) +
               offset_in_page(kaddr);
    }
}

Note that a kernel virtual address may come from different memory regions, including the vmalloc region and the direct-mapped region. To tell them apart, call is_vmalloc_addr(). If the address belongs to the vmalloc region, vmalloc_to_page() returns the corresponding struct page; otherwise, virt_to_page() can be used to obtain the page (the function above takes the shortcut of calling __pa() directly in that case).
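The two branches of kvm_kaddr_to_phys() can be traced with a small userspace model. Everything prefixed with demo_ or DEMO_ is hypothetical: the region bases are example values, and a three-entry array of frame numbers stands in for the page-table lookup that vmalloc_to_page() performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  (UINT64_C(1) << DEMO_PAGE_SHIFT)

/* Hypothetical region bases; real values depend on the architecture. */
#define DEMO_PAGE_OFFSET   UINT64_C(0xffff888000000000)
#define DEMO_VMALLOC_START UINT64_C(0xffffc90000000000)
#define DEMO_VMALLOC_END   (DEMO_VMALLOC_START + 3 * DEMO_PAGE_SIZE)

/* Toy page table: frame number backing each page of the vmalloc range. */
static const uint64_t demo_vmalloc_pfn[3] = { 0x1234, 0x0042, 0x9000 };

static bool demo_is_vmalloc_addr(uint64_t vaddr)
{
    return vaddr >= DEMO_VMALLOC_START && vaddr < DEMO_VMALLOC_END;
}

static uint64_t demo_offset_in_page(uint64_t vaddr)
{
    return vaddr & (DEMO_PAGE_SIZE - 1);
}

static uint64_t demo_kaddr_to_phys(uint64_t vaddr)
{
    if (!demo_is_vmalloc_addr(vaddr)) {
        /* Direct mapping: a fixed offset, as with __pa(). */
        assert(vaddr >= DEMO_PAGE_OFFSET);
        return vaddr - DEMO_PAGE_OFFSET;
    }
    /* vmalloc range: look up the frame, then add the in-page offset. */
    uint64_t index = (vaddr - DEMO_VMALLOC_START) >> DEMO_PAGE_SHIFT;
    return (demo_vmalloc_pfn[index] << DEMO_PAGE_SHIFT)
           + demo_offset_in_page(vaddr);
}
```

The in-page offset has to be added back in the vmalloc branch because the lookup yields the start of a page frame, not the address of the byte itself.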


References:

  1. https://elixir.bootlin.com/linux/v4.19/
  2. What are the differences between kmalloc(), kcalloc(), vmalloc() and kzalloc()?
  3. Understanding the differences between the Linux kernel memory allocation functions kmalloc, kzalloc and vmalloc (all in one article)
  4. What distinguishes kmalloc, vmalloc and malloc, and how their implementations differ
  5. QZFS: QAT Accelerated Compression in File System for Application Agnostic and Cost Efficient Data Storage