swiotlb.rst - OpenGrok cross reference for /Documentation/core-api/swiotlb.rst

1 .. SPDX-License-Identifier: GPL-2.0
7 swiotlb is a memory buffer allocator used by the Linux kernel DMA layer. It is
8 typically used when a device doing DMA can't directly access the target memory
9 buffer because of hardware limitations or other requirements. In such a case,
10 the DMA layer calls swiotlb to allocate a temporary memory buffer that conforms
11 to the limitations. The DMA is done to/from this temporary memory buffer, and
13 memory buffer. This approach is generically called "bounce buffering", and the
14 temporary memory buffer is called a "bounce buffer".
19 These APIs use the device DMA attributes and kernel-wide settings to determine
22 device, some devices in a system may use bounce buffering while others do not.
25 memory buffer, doing bounce buffering is slower than doing DMA directly to the
26 original memory buffer, and it consumes more CPU resources. So it is used only
30 ---------------
32 limitations. As physical memory sizes grew beyond 4 GiB, some devices could
33 only provide 32-bit DMA addresses. By allocating bounce buffer memory below
37 More recently, Confidential Computing (CoCo) VMs have the guest VM's memory
38 encrypted by default, and the memory is not accessible by the host hypervisor
40 directed to guest memory that is unencrypted. CoCo VMs set a kernel-wide option
41 to force all DMA I/O to use bounce buffers, and the bounce buffer memory is set
42 up as unencrypted. The host does DMA I/O to/from the bounce buffer memory, and
44 data to/from the original target memory buffer. The CPU copying bridges between
45 the unencrypted and the encrypted memory. This use of bounce buffers allows
46 device drivers to "just work" in a CoCo VM, with no modifications
47 needed to handle the memory encryption complexity.
51 "untrusted", the device should be given access only to the memory containing
52 the data being transferred. But if that memory occupies only part of an IOMMU
54 IOMMU access control is per-granule, the untrusted device can gain access to
60 ------------------
63 specified size in bytes and returns the physical address of the buffer. The
64 buffer memory is physically contiguous. The expectation is that the DMA layer
65 maps the physical memory address to a DMA address, and returns the DMA address
67 multiple memory buffer segments, a separate bounce buffer must be allocated for
73 updated the bounce buffer memory and DMA_ATTR_SKIP_CPU_SYNC is not set, the
75 buffer back to the original buffer. Then the bounce buffer memory is freed.
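The unmap rule described above can be sketched in user-space C. This is a minimal illustration, not the kernel's implementation: the `sketch_` names, the direction enum, and the use of `malloc`-style memory for the bounce buffer are all assumptions made for the example. Only the decision itself (copy back when the device may have written and CPU sync is not skipped, then free) comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's DMA direction values. */
enum sketch_dma_dir {
	SKETCH_TO_DEVICE,
	SKETCH_FROM_DEVICE,
	SKETCH_BIDIRECTIONAL
};

/* Hypothetical record of one bounce mapping. */
struct sketch_bounce {
	void *orig;              /* original buffer supplied by the driver */
	void *bounce;            /* temporary buffer the device accessed */
	size_t size;             /* size of both buffers in bytes */
	enum sketch_dma_dir dir; /* direction of the DMA transfer */
};

/*
 * Unmap: if the device may have updated the bounce buffer and the caller
 * did not ask to skip CPU sync, copy the data back to the original
 * buffer. The bounce buffer is freed in every case.
 */
static void sketch_unmap(struct sketch_bounce *b, bool skip_cpu_sync)
{
	bool device_wrote = b->dir == SKETCH_FROM_DEVICE ||
			    b->dir == SKETCH_BIDIRECTIONAL;

	if (device_wrote && !skip_cpu_sync)
		memcpy(b->orig, b->bounce, b->size);

	free(b->bounce);
	b->bounce = NULL;
}
```

In this sketch a transfer toward the device never triggers the copy-back, because the device did not modify the bounce buffer.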
85 ------------------------------
87 called by the corresponding DMA APIs which may run in contexts that cannot
88 block. Hence the default memory pool for swiotlb allocations must be
89 pre-allocated at boot time (but see Dynamic swiotlb below). Because swiotlb
90 allocations must be physically contiguous, the entire default memory pool is
93 The need to pre-allocate the default swiotlb pool creates a boot-time tradeoff.
95 always be satisfied, as the non-blocking requirement means requests can't wait
96 for space to become available. But a large pool potentially wastes memory, as
97 this pre-allocated memory is not available for other uses in the system. The
98 tradeoff is particularly acute in CoCo VMs that use bounce buffers for all DMA
99 I/O. These VMs use a heuristic to set the default pool size to ~6% of memory,
100 with a max of 1 GiB, which has the potential to be very wasteful of memory.
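To make the sizing tradeoff concrete, the heuristic can be sketched as a small C helper. This is an illustration only: the function name is hypothetical, "~6%" is taken as exactly 6/100, and any lower bound the kernel may apply is ignored.

```c
#include <stdint.h>

#define SKETCH_GIB (1ULL << 30)

/*
 * Hypothetical helper: default pool size for a CoCo VM, taken as about
 * 6% of guest memory and capped at 1 GiB.
 */
static uint64_t sketch_coco_pool_size(uint64_t total_mem_bytes)
{
	/* Divide first so that very large memory sizes cannot overflow. */
	uint64_t size = total_mem_bytes / 100 * 6;

	return size > SKETCH_GIB ? SKETCH_GIB : size;
}
```

Under these assumptions the cap takes effect for guests a little larger than 16 GiB, so a 64 GiB guest would set aside a full 1 GiB regardless of how much DMA I/O it actually performs.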
102 on the I/O patterns of the workload in the VM. The dynamic swiotlb feature
104 default memory pool size remains an open issue.
109 must be limited to that 256 KiB. This value is communicated to higher-level
111 higher-level code fails to account for this limit, it may make requests that
117 bounce buffer match the same bits in the address of the original buffer. When
118 min_align_mask is non-zero, it may produce an "alignment offset" in the address
120 This potential alignment offset is reflected in the value returned by
121 swiotlb_max_mapping_size(), which can show up in places like
124 swiotlb, max_sectors_kb will be 256 KiB. When min_align_mask is non-zero,
130 bounce buffer might start at a larger address if min_align_mask is non-zero.
131 Hence there may be pre-padding space that is allocated prior to the start of
133 alloc_align_mask boundary, potentially resulting in post-padding space. Any
134 pre-padding or post-padding space is not initialized by swiotlb code. The
136 devices. It is set to the granule size - 1 so that the bounce buffer is
140 ------------------------
141 Memory used for swiotlb bounce buffers is allocated from overall system memory
145 due to other conditions, such as running in a CoCo VM, as described above. If
146 CONFIG_SWIOTLB_DYNAMIC is enabled, additional pools may be allocated later in
148 memory. The default pool is allocated below the 4 GiB physical address line so
149 it works for devices that can only address 32 bits of physical memory (unless
150 architecture-specific code provides the SWIOTLB_ANY flag). In a CoCo VM, the
151 pool memory must be decrypted before swiotlb is used.
159 IO_TLB_SEGSIZE. Multiple smaller bounce buffers may co-exist in a single slot
163 entirely in a single area. Each area has its own spin lock that must be held to
164 manipulate the slots in that area. The division into areas avoids contending
165 for a single global spin lock when swiotlb is heavily used, such as in a CoCo
166 VM. The number of areas defaults to the number of CPUs in the system for
185 initial slots in each slot set might not meet the alloc_align_mask criterion.
190 change in the future, the initial pool allocation might need to be done with
194 ---------------
195 When CONFIG_SWIOTLB_DYNAMIC is enabled, swiotlb can do on-demand expansion of
196 the amount of memory available for allocation as bounce buffers. If a bounce
198 task is kicked off to allocate memory from general system memory and turn it
200 because the memory allocation may block, and as noted above, swiotlb requests
204 deleted when the bounce buffer is freed. Memory for this transient pool comes
205 from the general system memory atomic pool so that creation does not block.
206 Creating a transient pool has relatively high cost, particularly in a CoCo VM
207 where the memory must be decrypted, so it is done only as a stopgap until the
208 background task can add another non-transient pool.
210 Adding a dynamic pool has limitations. As with the default pool, the memory
212 (e.g., 4 MiB on a typical x86 system). Due to memory fragmentation, a max size
215 memory fragmentation, dynamically adding a pool might not succeed at all.
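The 4 MiB figure follows from simple arithmetic if the largest physically contiguous allocation is assumed to be a power-of-two number of pages. The sketch below uses a 4 KiB page size and a maximum order of 10; both are assumptions that match a typical x86 configuration, and the helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of the largest physically contiguous
 * allocation, assuming it consists of 2^max_order pages.
 */
static uint64_t sketch_max_contig_bytes(uint64_t page_size,
					unsigned int max_order)
{
	return page_size << max_order;
}
```

With these inputs, `sketch_max_contig_bytes(4096, 10)` evaluates to 4 MiB. Systems with a different page size or maximum order have a different upper bound on the size of a single dynamic pool.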
217 The number of areas in a dynamic pool may be different from the number of areas
218 in the default pool. Because the new pool size is typically a few MiB at most,
224 New pools added via dynamic swiotlb are linked together in a linear list.
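A linear list means that finding the pool that owns a given bounce buffer address is a walk over every pool. The following user-space sketch shows that lookup; the structure, its field names, and the plain singly linked list are assumptions for illustration and do not reproduce the kernel's data structures or its locking.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool descriptor covering the range [start, end). */
struct sketch_pool {
	uint64_t start;            /* first physical address in the pool */
	uint64_t end;              /* one past the last address */
	struct sketch_pool *next;  /* next pool in the list, or NULL */
};

/*
 * Walk the list and return the pool containing paddr, or NULL if no
 * pool covers that address. The cost grows with the number of pools.
 */
static struct sketch_pool *sketch_find_pool(struct sketch_pool *head,
					    uint64_t paddr)
{
	for (struct sketch_pool *p = head; p != NULL; p = p->next) {
		if (paddr >= p->start && paddr < p->end)
			return p;
	}
	return NULL;
}
```

Because each lookup may visit every pool, this design is cheapest when only a few pools exist.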
231 few CPUs. It allows the default swiotlb pool to be smaller so that memory is
236 ----------------------
238 io_tlb_area, and io_tlb_slot. io_tlb_mem describes a swiotlb memory allocator,
239 which includes the default memory pool and any dynamic or transient pools
240 linked to it. Limited statistics on swiotlb usage are kept per memory allocator
241 and are stored in this data structure. These statistics are available under
244 io_tlb_pool describes a memory pool, either the default pool, a dynamic pool,
246 the memory in the pool, a pointer to an array of io_tlb_area structures, and a
250 serialize access to slots in the area. The io_tlb_area array for a pool has an
251 entry for each area, and is accessed using a 0-based area index derived from the
255 io_tlb_slot describes an individual memory slot in the pool, with size
257 index computed from the bounce buffer address relative to the starting memory
266 memory buffer address obviously must be passed as an argument to
268 swiotlb data structures must save the original memory buffer address so that it
269 can be used when doing sync operations. This original address is saved in the
272 Second, the io_tlb_slot array must handle partial sync requests. In such cases,
274 buffer but an address somewhere in the middle of the bounce buffer, and the
276 swiotlb code must be able to calculate the corresponding original memory buffer
278 memory buffer address is populated into the struct io_tlb_slot for each slot
280 also recorded in each struct io_tlb_slot so a sanity check can be performed on
285 in struct io_tlb_slot records how many contiguous available slots exist starting
289 IO_TLB_SEGSIZE, which can appear in the first slot in a slot set, and indicates
293 "list" field is initialized to IO_TLB_SEGSIZE down to 1 for the slots in every
299 requirements, it may allocate pre-padding space across zero or more slots. But
304 The "pad_slots" value is recorded only in the first non-padding slot allocated
308 ----------------
310 memory separate from the default swiotlb pool, and that are dedicated for DMA
311 use by a particular device. Restricted pools provide a level of DMA memory