Linux memory management overview
Paging and Virtual Memory
Linux presents each process with a virtual address space divided into fixed-size pages (typically 4 KiB on x86_64). The MMU translates virtual pages to physical frames using multi-level page tables; the CPU caches recent translations in the TLB so page size and TLB reach directly affect performance. Demand paging and page faults drive allocation and I/O for anonymous and file-backed pages.
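To make the translation step concrete, here is a minimal sketch of how a 48-bit x86_64 virtual address splits into four table indices plus a page offset, assuming 4 KiB pages and 4-level paging. The level names, the helper functions, and the TLB entry count of 1536 are illustrative assumptions, not values taken from any particular CPU.

```python
PAGE_SHIFT = 12   # 4 KiB pages: the low 12 bits are the byte offset in the page
INDEX_BITS = 9    # each table level holds 512 entries on x86_64
LEVELS = ("pml4", "pdpt", "pd", "pt")  # top-level table first


def split_virtual_address(vaddr: int) -> dict:
    """Split a 48-bit virtual address into per-level indices and page offset."""
    parts = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = PAGE_SHIFT + INDEX_BITS * (len(LEVELS) - 1)
    for name in LEVELS:
        parts[name] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        shift -= INDEX_BITS
    return parts


def tlb_reach_bytes(entries: int, page_size: int) -> int:
    """TLB reach: how much memory the TLB can map without a page walk."""
    return entries * page_size


if __name__ == "__main__":
    print(split_virtual_address(0x7F12_3456_789A))
    # Hypothetical TLB with 1536 entries, compared across two page sizes.
    print(tlb_reach_bytes(1536, 4096) // 2**20, "MiB with 4 KiB pages")
    print(tlb_reach_bytes(1536, 2 * 2**20) // 2**30, "GiB with 2 MiB pages")
```

The same entry count covers 512 times more memory with 2 MiB pages, which is why page size has such a direct effect on TLB reach.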
Kernel Allocators: Slab, SLUB, and SLOB
The kernel uses slab-family allocators to manage frequently allocated kernel objects. Slab/SLUB maintain caches of same-sized objects to limit fragmentation and reduce per-allocation overhead; they support object constructors and per-CPU caches, and the subsystems that own caches can register shrinkers that cooperate with the page reclaim path. SLOB was a minimal variant aimed at small-memory systems. Allocator behavior affects page movability and thus hugepage assembly.
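One way to see which object caches hold the most memory is to rank the entries in /proc/slabinfo by object count times object size. The sketch below assumes the version 2.1 slabinfo column layout; the sample rows are made-up numbers for illustration, and reading the real file usually requires root.

```python
# Illustrative sample in the /proc/slabinfo 2.1 layout (values are invented).
SAMPLE = """\
slabinfo - version: 2.1
# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
dentry            123456 130000    192   21    1 : tunables    0    0    0 : slabdata   6191   6191      0
ext4_inode_cache   40000  41000   1176   27    8 : tunables    0    0    0 : slabdata   1519   1519      0
kmalloc-64          9000  10240     64   64    1 : tunables    0    0    0 : slabdata    160    160      0
"""


def parse_slabinfo(text: str) -> list:
    """Return caches sorted by approximate object memory, largest first."""
    caches = []
    for line in text.splitlines():
        # Skip blank lines, the version banner, and the column-description line.
        if not line.strip() or line.startswith(("slabinfo", "#")):
            continue
        fields = line.split()
        num_objs, objsize = int(fields[2]), int(fields[3])
        caches.append({
            "name": fields[0],
            "active_objs": int(fields[1]),
            "num_objs": num_objs,
            "objsize": objsize,
            "bytes": num_objs * objsize,  # ignores per-slab padding and metadata
        })
    return sorted(caches, key=lambda c: c["bytes"], reverse=True)


if __name__ == "__main__":
    for cache in parse_slabinfo(SAMPLE):
        print(f'{cache["name"]:<20} {cache["bytes"] / 2**20:8.1f} MiB')
```

On a live system the sample string could be replaced with the contents of /proc/slabinfo; the byte totals are an estimate because slab padding is not counted.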
Transparent Hugepages
Transparent Hugepages (THP) attempt to back contiguous virtual ranges with larger physical pages (commonly 2 MiB) to reduce page-table entries and TLB misses. THP is opportunistic and relies on the kernel's ability to compact and migrate small pages; allocation failures and fragmentation can force fallbacks to 4 KiB pages. Kernel work on THP allocation heuristics aims to make huge pages cheaper to obtain.
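As a rough illustration, the sketch below reads the system-wide THP mode from sysfs and works out how many 4 KiB pages a single 2 MiB huge page replaces. The sysfs file marks the active mode with square brackets, for example "always [madvise] never"; the helper names are assumptions made for this example.

```python
import re
from pathlib import Path

# Standard sysfs location for the system-wide THP mode on Linux.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def selected_mode(text: str):
    """Return the bracketed (active) mode, or None if no mode is marked."""
    match = re.search(r"\[(\w+)\]", text)
    return match.group(1) if match else None


def base_pages_per_huge_page(huge_size: int = 2 * 1024 * 1024,
                             base_size: int = 4096) -> int:
    """Number of base pages (and leaf page-table entries) one huge page covers."""
    return huge_size // base_size


if __name__ == "__main__":
    if THP_ENABLED.exists():
        print("THP mode:", selected_mode(THP_ENABLED.read_text()))
    else:
        print("THP sysfs file not present on this system")
    print("4 KiB pages per 2 MiB huge page:", base_pages_per_huge_page())
```

When the mode is madvise, only regions that an application marks with madvise(MADV_HUGEPAGE) are candidates for huge pages.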
NUMA and Locality
On NUMA systems memory is partitioned into nodes with different access latencies and bandwidth. Linux exposes NUMA policies (local, interleave, bind) and APIs (libnuma) so applications or the allocator can place memory near the CPU that will access it. Remote accesses increase latency and consume interconnect bandwidth; NUMA-aware allocation and thread pinning are primary mitigations.
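A simple check on placement is the per-node numastat file under /sys/devices/system/node/, which counts page allocations satisfied locally versus from another node. The sketch below parses that format and computes the fraction of off-node allocations. The sample counters are invented, and the counters track where pages were allocated, not every memory access.

```python
# Illustrative contents of a per-node numastat file (values are invented).
SAMPLE = """\
numa_hit 1000
numa_miss 20
numa_foreign 15
interleave_hit 50
local_node 900
other_node 100
"""


def parse_numastat(text: str) -> dict:
    """Parse 'name value' lines into a dict of integer counters."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            counters[fields[0]] = int(fields[1])
    return counters


def remote_fraction(counters: dict) -> float:
    """Share of this node's allocations made by processes running on other nodes."""
    total = counters["local_node"] + counters["other_node"]
    return counters["other_node"] / total if total else 0.0


if __name__ == "__main__":
    stats = parse_numastat(SAMPLE)
    print(f"remote allocation fraction: {remote_fraction(stats):.1%}")
```

A rising fraction suggests threads are running away from their memory; pinning threads (for example with os.sched_setaffinity in Python) and applying a NUMA memory policy are the usual responses.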
Observability with vmstat and perf
vmstat reports high-level counters: page faults, page-ins/outs, free/active/inactive memory, and reclaim activity. These are useful for detecting swapping and reclaim pressure. perf exposes CPU-side metrics (TLB misses and page walks via hardware counters, major/minor faults via software events and tracepoints) and can correlate user/kernel stacks with memory events. Use perf record -e with a TLB or page-walk event to sample where misses occur, and perf stat or perf trace to quantify the cost of faults and migrations.
vmstat 1
procs -----------memory---------- ---swap-- -----io---- -system-- --cpu---
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id
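The column layout above can be turned into a small parser that flags swapping. This is a sketch under stated assumptions: it keys on the header row beginning with "r", the sample data rows are invented, and is_swapping is a hypothetical helper that treats any nonzero si or so value as swap activity.

```python
# Header lines as shown above; the two data rows are made-up values.
SAMPLE = """\
procs memory swap io system cpu
r b swpd free buff cache si so bi bo in cs us sy id
1 0 0 812344 10240 402112 0 0 5 12 150 300 3 1 96
2 1 20480 51200 2048 80000 120 340 900 1500 800 1600 10 15 75
"""


def parse_vmstat(text: str) -> list:
    """Parse vmstat output into one dict per sample, keyed by column name."""
    rows, columns = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("procs"):
            continue  # blank line or the group-header line
        if fields[0] == "r":
            columns = fields  # column-name line; vmstat repeats it periodically
            continue
        if columns and len(fields) == len(columns):
            rows.append(dict(zip(columns, map(int, fields))))
    return rows


def is_swapping(row: dict) -> bool:
    """True when pages are being swapped in (si) or out (so)."""
    return row["si"] > 0 or row["so"] > 0


if __name__ == "__main__":
    for row in parse_vmstat(SAMPLE):
        state = "swapping" if is_swapping(row) else "ok"
        print(f'free={row["free"]} si={row["si"]} so={row["so"]} -> {state}')
```

Because the parser takes its column names from the header row, it also copes with vmstat versions that print extra CPU columns.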
Practical signals and actions
- High minor faults with low disk I/O: often indicates COW activity or pages being faulted back in after reclaim without a disk read; inspect the per-process smaps files under /proc/ and the slab caches.
- High TLB misses in perf stat: consider THP, better working-set locality, or code/data layout changes.
- NUMA remote hits: bind threads and memory or interleave large allocations for throughput-sensitive workloads.
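To see the minor-fault signal from the list above in isolation, the following sketch touches every page of a fresh anonymous mapping and reports how many minor faults the process took, using the standard resource.getrusage counters. The function name and the 16 MiB size are arbitrary choices for illustration, and the fault count depends on the system.

```python
import mmap
import resource


def minor_faults() -> int:
    """Minor (no disk I/O) page faults taken so far by this process."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_minflt


def touch_anonymous(size: int) -> tuple:
    """Write one byte to each page of a new anonymous mapping.

    Returns (minor_fault_delta, pages_touched).
    """
    before = minor_faults()
    buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    pages = 0
    for offset in range(0, size, mmap.PAGESIZE):
        buf[offset:offset + 1] = b"\x01"  # first write triggers demand paging
        pages += 1
    delta = minor_faults() - before
    buf.close()
    return delta, pages


if __name__ == "__main__":
    faults, pages = touch_anonymous(16 * 1024 * 1024)
    print(f"touched {pages} pages, took {faults} minor faults")
```

With THP active for the mapping, the fault count can be far lower than the number of pages touched, since one fault may populate a whole huge page.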