Memory Addresses

Logical address

  • Included in the machine language instructions to specify the address of an operand or of an instruction. Each logical address consists of a segment and an offset that denotes the distance from the start of the segment to the actual address.

Linear address (also known as virtual address)

  • A single 32-bit unsigned integer that can be used to address up to 4 GB. Linear addresses are usually represented in hexadecimal notation; their values range from 0x00000000 to 0xffffffff.

Physical address

  • Used to address memory cells in memory chips. Physical addresses are represented as 32-bit or 36-bit unsigned integers.

The Memory Management Unit (MMU) transforms a logical address into a linear address by means of a hardware circuit called a segmentation unit; subsequently, a second hardware circuit called a paging unit transforms the linear address into a physical address.

Segmentation in Hardware

Starting with the 80286 model, Intel microprocessors perform address translation in two different ways called real mode and protected mode. Real mode exists mostly to maintain processor compatibility with older models and to allow the operating system to bootstrap.
详情参见分段机制解析

Paging in Hardware

The paging unit thinks of all RAM as partitioned into fixed-length page frames(sometimes referred to as physical pages). Each page frame contains a page — that is, the length of a page frame coincides with that of a page. A page frame is a constituent of main memory, and hence it is a storage area. It is important to distinguish a page from a page frame; the former is just a block of data, which may be stored in any page frame or on disk.

The data structures that map linear to physical addresses are called page tables; they are stored in main memory and must be properly initialized by the kernel before enabling the paging unit.

3.1 Regular Paging

3.2 Hardware Protection Scheme

While 80×86 processors allow four possible privilege levels to a segment, only two privilege levels are associated with pages and Page Tables.

User/Supervisor flag

  • 0, can be addressed only when the CPL is less than 3 (Kernel Mode)
  • 1, can always be addressed

Read/Write flag

  • 0, read-only
  • 1, can be read and written

3.3 The Physical Address Extension (PAE) Paging Mechanism

3.4 Hardware Cache

Hardware cache memories were introduced to reduce the speed mismatch between CPU and RAM. They are based on the well-known locality principle, which holds both for programs and data structures.

3.5 Translation Lookaside Buffers (TLB)

Besides general-purpose hardware caches, 80x86 processors include another cache called Translation Lookaside Buffers (TLB) to speed up linear address translation. When a linear address is used for the first time, the corresponding physical address is computed through slow accesses to the Page Tables in RAM. The physical address is then stored in a TLB entry so that further references to the same linear address can be quickly translated.

In a multiprocessor system, each CPU has its own TLB, called the local TLB of the CPU. When the cr3 control register of a CPU is modified, the hardware automatically invalidates all entries of the local TLB, because a new set of page tables is in use and the TLBs are pointing to old data.

Paging in Linux

Linux adopts a common paging model that fits both 32-bit and 64-bit architectures. Starting with version 2.6.11, a four-level paging model has been adopted:

  • Page Global Directory
  • Page Upper Directory
  • Page Middle Directory
  • Page Table

Each process has its own Page Global Directory and its own set of Page Tables. When a process switch occurs, Linux saves the cr3 control register in the descriptor of the process previously in execution and then loads cr3 with the value stored in the descriptor of the process to be executed next. Thus, when the new process resumes its execution on the CPU, the paging unit refers to the correct set of Page Tables.

4.1 Page Table Handling

4.2 Physical Memory Layout

As a general rule, the Linux kernel is installed in RAM starting from the physical address 0x00100000 — i.e., from the second megabyte. The reason that kernel isn’t loaded starting with the first availalbe megabyte of RAM includes:

  • Page frame 0 is used by BIOS to store the system hardware configuration detected during the Power-On Self-Test (POST); the BIOS of many laptops, moreover, writes data on this page frame even after the system is initialized.
  • Physical addresses ranging from 0x000a0000 to 0x000fffff are usually reserved to BIOS routines and to map the internal memory of ISA graphics cards.
  • Additional page frames within the first megabyte may be reserved by specific computer models.

4.3 Process Page Tables

The linear address space of a process is divided into two parts:

  • Linear addresses from 0x00000000 to 0xbfffffff can be addressed when the process runs in either User or Kernel Mode.
  • Linear addresses from 0xc0000000 to 0xffffffff can be addressed only when the process runs in Kernel Mode.

The PAGE_OFFSET macro yields the value 0xc0000000; this is the offset in the linear address space of a process where the kernel lives.

4.4 Handling the Hardware Cache and the TLB


参考资料:

  1. Linux Kernel: Memory Addressing