==============================
Memory Layout on AArch64 Linux
==============================

Author: Catalin Marinas <catalin.marinas@arm.com>

This document describes the virtual memory layout used by the AArch64
Linux kernel. The architecture allows up to 4 levels of translation
tables with a 4KB page size and up to 3 levels with a 64KB page size.

AArch64 Linux uses either 3 levels or 4 levels of translation tables
with the 4KB page configuration, allowing 39-bit (512GB) or 48-bit
(256TB) virtual addresses, respectively, for both user and kernel. With
64KB pages, only 2 levels of translation tables are used, allowing
42-bit (4TB) virtual addresses, but the memory layout is the same.

ARMv8.2 adds optional support for Large Virtual Address space. This is
only available when running with a 64KB page size and expands the
number of descriptors in the first level of translation.

User addresses have bits 63:48 set to 0 while the kernel addresses have
the same bits set to 1. TTBRx selection is given by bit 63 of the
virtual address. The swapper_pg_dir contains only kernel (global)
mappings while the user pgd contains only user (non-global) mappings.
The swapper_pg_dir address is written to TTBR1 and never written to
TTBR0.
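
The relationship between page size, number of translation levels and
VA width can be checked with a little arithmetic. The following is a
minimal user-space sketch, not kernel code: the helper names
``va_bits`` and ``is_kernel_va`` are hypothetical, and it assumes
8-byte descriptors, so that a table filling one page resolves
``page_shift - 3`` bits per level. The result is only an upper bound;
the 52-bit configuration uses a narrower first level than this formula
would give.

.. code-block:: c

  #include <assert.h>
  #include <stdint.h>

  /*
   * Upper bound on the VA width for a given page size and level count,
   * assuming 8-byte descriptors (an assumption of this sketch): each
   * level resolves (page_shift - 3) bits on top of the in-page offset.
   */
  static unsigned int va_bits(unsigned int page_shift, unsigned int levels)
  {
          assert(page_shift > 3 && levels >= 1);
          return page_shift + levels * (page_shift - 3);
  }

  /*
   * Bit 63 of the virtual address selects the translation table base:
   * 0 selects TTBR0 (user), 1 selects TTBR1 (kernel).
   */
  static int is_kernel_va(uint64_t va)
  {
          return (int)(va >> 63);
  }

With these definitions, ``va_bits(12, 3)`` is 39, ``va_bits(12, 4)`` is
48 and ``va_bits(16, 2)`` is 42, matching the configurations listed
above.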


AArch64 Linux memory layout with 4KB pages + 4 levels (48-bit)::

  Start             End                 Size  Use
  -----------------------------------------------------------------------
  0000000000000000  0000ffffffffffff   256TB  user
  ffff000000000000  ffff7fffffffffff   128TB  kernel logical memory map
  ffff800000000000  ffff9fffffffffff    32TB  kasan shadow region
  ffffa00000000000  ffffa00007ffffff   128MB  bpf jit region
  ffffa00008000000  ffffa0000fffffff   128MB  modules
  ffffa00010000000  fffffdffbffeffff   ~93TB  vmalloc
  fffffdffbfff0000  fffffdfffe5f8fff  ~998MB  [guard region]
  fffffdfffe5f9000  fffffdfffe9fffff  4124KB  fixed mappings
  fffffdfffea00000  fffffdfffebfffff     2MB  [guard region]
  fffffdfffec00000  fffffdffffbfffff    16MB  PCI I/O space
  fffffdffffc00000  fffffdffffdfffff     2MB  [guard region]
  fffffdffffe00000  ffffffffffdfffff     2TB  vmemmap
  ffffffffffe00000  ffffffffffffffff     2MB  [guard region]


AArch64 Linux memory layout with 64KB pages + 3 levels (52-bit with HW support)::

  Start             End                 Size  Use
  -----------------------------------------------------------------------
  0000000000000000  000fffffffffffff     4PB  user
  fff0000000000000  fff7ffffffffffff     2PB  kernel logical memory map
  fff8000000000000  fffd9fffffffffff  1440TB  [gap]
  fffda00000000000  ffff9fffffffffff   512TB  kasan shadow region
  ffffa00000000000  ffffa00007ffffff   128MB  bpf jit region
  ffffa00008000000  ffffa0000fffffff   128MB  modules
  ffffa00010000000  fffff81ffffeffff   ~88TB  vmalloc
  fffff81fffff0000  fffffc1ffe58ffff    ~3TB  [guard region]
  fffffc1ffe590000  fffffc1ffe9fffff  4544KB  fixed mappings
  fffffc1ffea00000  fffffc1ffebfffff     2MB  [guard region]
  fffffc1ffec00000  fffffc1fffbfffff    16MB  PCI I/O space
  fffffc1fffc00000  fffffc1fffdfffff     2MB  [guard region]
  fffffc1fffe00000  ffffffffffdfffff  3968GB  vmemmap
  ffffffffffe00000  ffffffffffffffff     2MB  [guard region]


Translation table lookup with 4KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |         |         |         |         |
   |                 |         |         |         |         v
   |                 |         |         |         |   [11:0]  in-page offset
   |                 |         |         |         +-> [20:12] L3 index
   |                 |         |         +-----------> [29:21] L2 index
   |                 |         +---------------------> [38:30] L1 index
   |                 +-------------------------------> [47:39] L0 index
   +-------------------------------------------------> [63] TTBR0/1


Translation table lookup with 64KB pages::

  +--------+--------+--------+--------+--------+--------+--------+--------+
  |63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
  +--------+--------+--------+--------+--------+--------+--------+--------+
   |                 |    |               |         |
   |                 |    |               |         v
   |                 |    |               |   [15:0]  in-page offset
   |                 |    |               +----------> [28:16] L3 index
   |                 |    +--------------------------> [41:29] L2 index
   |                 +-------------------------------> [47:42] L1 index (48-bit)
   |                                                   [51:42] L1 index (52-bit)
   +-------------------------------------------------> [63] TTBR0/1


When using KVM without the Virtualization Host Extensions, the
hypervisor maps kernel pages in EL2 at a fixed (and potentially
random) offset from the linear mapping. See the kern_hyp_va macro and
kvm_update_va_mask function for more details. MMIO devices such as
GICv2 get mapped next to the HYP idmap page, as do vectors when
ARM64_HARDEN_EL2_VECTORS is selected for particular CPUs.

When using KVM with the Virtualization Host Extensions, no additional
mappings are created, since the host kernel runs directly in EL2.

52-bit VA support in the kernel
-------------------------------
If the ARMv8.2-LVA optional feature is present, and we are running
with a 64KB page size, then it is possible to use 52 bits of address
space for both userspace and kernel addresses.
However, any kernel binary that supports 52-bit must also be able to
fall back to 48-bit at early boot time if the hardware feature is not
present.

This fallback mechanism requires the kernel .text to be placed in the
higher addresses, so that its location is invariant to 48/52-bit VAs.
Because the kasan shadow is a fraction of the entire kernel VA space,
the end of the kasan shadow must also be in the higher half of the
kernel VA space for both 48/52-bit. (Switching from 48-bit to 52-bit,
the end of the kasan shadow is invariant and dependent on ~0UL,
whilst the start address will "grow" towards the lower addresses).

In order to optimise phys_to_virt and virt_to_phys, the PAGE_OFFSET
is kept constant at 0xFFF0000000000000 (corresponding to 52-bit).
This obviates the need for an extra variable read. The physvirt
offset and vmemmap offsets are computed at early boot to enable
this logic.

As a single binary will need to support both 48-bit and 52-bit VA
spaces, the VMEMMAP must be sized large enough for 52-bit VAs and
also must be sized large enough to accommodate a fixed PAGE_OFFSET.

Most code in the kernel should not need to consider VA_BITS. For
code that does need to know the VA size, the variables are
defined as follows:

VA_BITS         constant        the *maximum* VA space size

VA_BITS_MIN     constant        the *minimum* VA space size

vabits_actual   variable        the *actual* VA space size


Maximum and minimum sizes can be useful to ensure that buffers are
sized large enough or that addresses are positioned close enough for
the "worst" case.

52-bit userspace VAs
--------------------
To maintain compatibility with software that relies on the ARMv8.0
VA space maximum size of 48 bits, the kernel will, by default,
return virtual addresses to userspace from a 48-bit range.

Software can "opt-in" to receiving VAs from a 52-bit space by
specifying an mmap hint parameter that is larger than 48 bits.

For example:

.. code-block:: c

   maybe_high_address = mmap((void *)~0UL, size, prot, flags, ...);

It is also possible to build a debug kernel that returns addresses
from a 52-bit space by enabling the following kernel config options:

.. code-block:: sh

   CONFIG_EXPERT=y && CONFIG_ARM64_FORCE_52BIT=y

Note that this option is only intended for debugging applications
and should not be used in production.
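
The mmap hint opt-in described above can be sketched as a small
user-space helper. This is an illustration only: the function names
``map_with_high_hint`` and ``above_48bit`` are hypothetical, and the
choice of an anonymous private read/write mapping is an assumption of
the sketch. The hint is advisory, so the returned address may still lie
within the 48-bit range, for example when the hardware or kernel does
not support 52-bit VAs.

.. code-block:: c

  #define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
  #include <assert.h>
  #include <stddef.h>
  #include <stdint.h>
  #include <sys/mman.h>

  /* Non-zero if any address bit at or above bit 48 is set. */
  static int above_48bit(uint64_t addr)
  {
          return (addr >> 48) != 0;
  }

  /*
   * Map 'size' bytes of anonymous memory, passing a hint above the
   * 48-bit range to opt in to high addresses. Returns MAP_FAILED on
   * error, exactly as mmap() does.
   */
  static void *map_with_high_hint(size_t size)
  {
          assert(size > 0);
          return mmap((void *)~0UL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  }

A caller would compare the result against ``MAP_FAILED`` and could then
pass ``(uint64_t)(uintptr_t)ptr`` to ``above_48bit`` to see whether a
high address was actually returned.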