A physically indexed CPU cache is designed so that addresses in adjacent blocks of physical memory occupy different positions (“cache lines”) in the cache. The same does not hold for virtual memory: blocks that are adjacent in virtual memory but not in physical memory can end up occupying the same position in the cache. Coloring is a technique implemented in memory management software that solves this problem by selecting pages that do not contend with neighboring pages.
Physical memory pages are “colored” so that pages with different “colors” have different positions in CPU cache memory. When allocating sequential pages in virtual memory for processes, the kernel collects pages with different “colors” and maps them to the virtual memory. In this way, sequential pages in virtual memory do not contend for the same cache line.
Page coloring is a software technique that controls the mapping of physical memory pages to a processor’s cache blocks. Memory pages that map to the same cache blocks are assigned the same color (as illustrated by Figure 1).
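As a rough illustration of how a color can be derived, the sketch below computes the color of a physical page for a physically indexed, set-associative cache. The cache geometry (4 KiB pages, a 2 MiB 8-way cache) and the names `num_colors` and `page_color` are assumptions chosen for the example, not values taken from any particular processor or kernel.

```python
# Assumed geometry, for illustration only.
PAGE_SIZE = 4096                 # bytes per page
CACHE_SIZE = 2 * 1024 * 1024     # total cache capacity in bytes
ASSOCIATIVITY = 8                # number of ways


def num_colors(cache_size=CACHE_SIZE, associativity=ASSOCIATIVITY,
               page_size=PAGE_SIZE):
    """Number of distinct page colors: how many pages fit in one cache way."""
    way_size = cache_size // associativity
    return max(1, way_size // page_size)


def page_color(phys_addr, cache_size=CACHE_SIZE, associativity=ASSOCIATIVITY,
               page_size=PAGE_SIZE):
    """Color of the physical page that contains phys_addr."""
    frame_number = phys_addr // page_size
    return frame_number % num_colors(cache_size, associativity, page_size)


if __name__ == "__main__":
    colors = num_colors()
    print("colors:", colors)
    # Pages whose frame numbers differ by a multiple of the color count
    # map to the same cache blocks and therefore share a color.
    print(page_color(0), page_color(colors * PAGE_SIZE))
```

Under these assumptions one cache way holds 64 pages, so there are 64 colors, and two physical pages share a color exactly when their frame numbers are congruent modulo 64.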
Without specific hardware support to control cache sharing, the operating system’s only recourse in a physically addressed cache is to control the virtual-to-physical mappings used by individual processes. Traditional page coloring [Taylor 1990] attempts to ensure that contiguous pages in virtual memory are allocated to physical pages that will be spread across the cache. To accomplish this, contiguous pages of physical memory are given different colors, and contiguous virtual pages are guaranteed to be assigned distinct colors.
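The allocation policy described above can be sketched as a toy allocator that keeps one free list per color and picks the frame for each virtual page from the list whose color matches the virtual page number. The class `ColoredAllocator`, the helper `map_range`, and the fallback to the next color when a list is empty are all assumptions made for this illustration; a real kernel may behave differently, and the fallback makes the scheme best-effort rather than a strict guarantee.

```python
from collections import deque


class ColoredAllocator:
    """Toy physical frame allocator with one free list per color."""

    def __init__(self, num_frames, num_colors):
        self.num_colors = num_colors
        self.free = [deque() for _ in range(num_colors)]
        # Consecutive physical frames receive consecutive colors.
        for frame in range(num_frames):
            self.free[frame % num_colors].append(frame)

    def alloc(self, virtual_page):
        """Return a free frame, preferring the color that matches virtual_page."""
        preferred = virtual_page % self.num_colors
        for offset in range(self.num_colors):
            color = (preferred + offset) % self.num_colors
            if self.free[color]:
                return self.free[color].popleft()
        raise MemoryError("no free physical frames")


def map_range(allocator, first_vpage, count):
    """Map `count` contiguous virtual pages; returns {virtual page: frame}."""
    return {vp: allocator.alloc(vp)
            for vp in range(first_vpage, first_vpage + count)}


if __name__ == "__main__":
    allocator = ColoredAllocator(num_frames=16, num_colors=4)
    mapping = map_range(allocator, first_vpage=10, count=4)
    for vp, frame in mapping.items():
        print(f"virtual page {vp} -> frame {frame} (color {frame % 4})")
```

As long as every color still has free frames, any run of up to `num_colors` contiguous virtual pages receives frames of pairwise distinct colors.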
A virtual memory subsystem that lacks cache coloring is less deterministic with regard to cache performance, because differences in page allocation from one program run to the next can lead to large differences in program performance.
A different use of page coloring also allows the operating system to restrict a process’s accesses so that it utilizes only a subset of the cache. The shared cache space can thereby be partitioned among multiple simultaneously executing applications on a multi-core platform.
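A minimal sketch of this partitioning use follows, assuming each process is given a fixed set of permitted colors and only ever receives frames of those colors. The names `PartitionedAllocator`, `assign_colors`, and `alloc`, and the policy of choosing the permitted color with the most free frames, are hypothetical choices for the example.

```python
from collections import deque


class PartitionedAllocator:
    """Toy allocator that confines each process to a subset of page colors."""

    def __init__(self, num_frames, num_colors):
        self.num_colors = num_colors
        self.free = {c: deque() for c in range(num_colors)}
        for frame in range(num_frames):
            self.free[frame % num_colors].append(frame)
        self.allowed = {}  # process id -> sorted list of permitted colors

    def assign_colors(self, pid, colors):
        """Restrict process `pid` to the given colors (its cache partition)."""
        self.allowed[pid] = sorted(set(colors))

    def alloc(self, pid):
        """Return a free frame whose color belongs to the partition of `pid`."""
        # Choose the permitted color with the most free frames to spread load.
        color = max(self.allowed[pid], key=lambda c: len(self.free[c]))
        if not self.free[color]:
            raise MemoryError(f"partition of process {pid} is exhausted")
        return self.free[color].popleft()


if __name__ == "__main__":
    allocator = PartitionedAllocator(num_frames=16, num_colors=4)
    allocator.assign_colors("A", [0, 1])   # A may use half of the cache
    allocator.assign_colors("B", [2, 3])   # B may use the other half
    frames_a = [allocator.alloc("A") for _ in range(4)]
    frames_b = [allocator.alloc("B") for _ in range(4)]
    print("A colors:", sorted({f % 4 for f in frames_a}))
    print("B colors:", sorted({f % 4 for f in frames_b}))
```

Because the two color sets are disjoint, pages of process A and process B never map to the same cache blocks. The cost is that each process can use only the physical frames, and the share of the cache, that belong to its colors.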