1 .. SPDX-License-Identifier: (GPL-2.0+ OR MIT)
30 meta-data. Typically one per client (DRM file-private), or one per
33 associated meta-data. The backing storage of a gpu_vma can either be
34 a GEM object or anonymous or page-cache pages mapped also into the CPU
40 is anonymous or page-cache pages as described above.
43 page-table entries point to that backing store.
47 the :doc:`dma-buf doc </driver-api/dma-buf>`.
53 allows deadlock-safe locking of multiple dma_resvs in arbitrary
55 :doc:`dma-buf doc </driver-api/dma-buf>`.
62 long-running mode.
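The dma_resv locking of several objects can, for example, be done with the
``drm_exec`` helper, which wraps the underlying ww_mutex transaction and
transparently rolls back and retries on contention. A minimal sketch is shown
below; ``first_obj`` and ``second_obj`` are placeholders, and the exact
``drm_exec_init()`` arguments differ between kernel versions:

.. code-block:: C

   struct drm_exec exec;
   int err;

   drm_exec_init(&exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
   drm_exec_until_all_locked(&exec) {
           // Locking order is arbitrary; contention rolls back all locks
           // taken so far and restarts the loop.
           err = drm_exec_lock_obj(&exec, first_obj);
           drm_exec_retry_on_contention(&exec);
           if (err)
                   break;

           err = drm_exec_lock_obj(&exec, second_obj);
           drm_exec_retry_on_contention(&exec);
           if (err)
                   break;
   }

   // All dma_resvs remain held until drm_exec_fini() drops them.
   drm_exec_fini(&exec);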
79 * The ``gpu_vm->lock`` (optionally an rwsem). Protects the gpu_vm's
88 ``mm/mmu_notifier.c`` as a "Collision-retry read-side/write-side
90 write-sides to hold it at once...". The read side critical section
96 * The ``gpu_vm->resv`` lock. Protects the gpu_vm's list of gpu_vmas needing
101 * The ``gpu_vm->userptr_notifier_lock``. This is an rwsem that is
104 * The ``gem_object->gpuva_lock`` This lock protects the GEM object's
109 to be able to update the gpu_vm evicted- and external object
121 is protected by the ``gem_object->gpuva_lock``, which is typically the
128 over the gpu_vm_bo and gpu_vma lists to avoid locking-order violations.
137 over the gpu_vm_bo's list of gpu_vmas, the ``gem_object->gpuva_lock`` must
139 disappear without notice since those are not reference-counted. A
148 execution using this VM, unmap all gpu_vmas and release page-table memory.
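To make it easier to see which lock protects which list, a purely illustrative
sketch of the objects involved could look like below. The struct and member
names follow the terminology of this document only; they are not taken from
drm_gpuvm or from any particular driver.

.. code-block:: C

   struct gpu_vm {
           struct rw_semaphore lock;            // The outer "gpu_vm->lock".
           struct dma_resv *resv;               // Protects the evicted- and
                                                // external object lists.
           struct rw_semaphore userptr_notifier_lock;
           spinlock_t list_lock;                // Optional alternative list
                                                // protection, see below.
           struct list_head evict_list;         // gpu_vm_bos needing revalidation.
           struct list_head extobj_list;        // gpu_vm_bos of external objects.
           struct list_head rebind_list;        // gpu_vmas needing rebinding.
   };

   struct gem_object {
           struct dma_resv *resv;
           struct mutex gpuva_lock;             // "gem_object->gpuva_lock"; in
                                                // practice often the dma_resv.
           struct list_head gpu_vm_bo_list;     // The object's gpu_vm_bos.
   };

   struct gpu_vm_bo {
           struct gpu_vm *vm;
           struct gem_object *obj;
           struct list_head evict_link;         // Link into gpu_vm->evict_list.
           struct list_head gpu_vma_list;       // The gpu_vm_bo's gpu_vmas, also
                                                // protected by obj->gpuva_lock.
   };

   struct gpu_vma {
           struct gpu_vm *vm;
           struct gpu_vm_bo *vm_bo;             // NULL for userptr gpu_vmas.
           struct list_head rebind_link;        // Link into gpu_vm->rebind_list.
           struct mmu_interval_notifier userptr_interval;   // Userptr only.
           unsigned long saved_seq;             // Userptr notifier seqno snapshot.
   };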
154 pseudo-code. In particular, the dma_resv deadlock avoidance algorithm
161 gpu_vmas set up pointing to them. Typically, each gpu command buffer
162 submission is therefore preceded with a re-validation section:
164 .. code-block:: C
166 dma_resv_lock(gpu_vm->resv);
169 for_each_gpu_vm_bo_on_evict_list(&gpu_vm->evict_list, &gpu_vm_bo) {
170 validate_gem_bo(&gpu_vm_bo->gem_bo);
177 move_gpu_vma_to_rebind_list(&gpu_vma, &gpu_vm->rebind_list);
180 for_each_gpu_vma_on_rebind_list(&gpu_vm->rebind_list, &gpu_vma) {
186 add_dependencies(&gpu_job, &gpu_vm->resv);
189 add_dma_fence(job_dma_fence, &gpu_vm->resv);
190 dma_resv_unlock(gpu_vm->resv);
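Filled in end-to-end, and using the pseudo-helpers of this document
(``rebind_gpu_vma()``, ``remove_gpu_vma_from_rebind_list()`` and
``for_each_gpu_vma_of_gpu_vm_bo()`` are assumed names, not kernel APIs), the
revalidation sequence for a gpu_vm with local objects only might read:

.. code-block:: C

   dma_resv_lock(gpu_vm->resv);

   // Validation section starts here.
   for_each_gpu_vm_bo_on_evict_list(&gpu_vm->evict_list, &gpu_vm_bo) {
           validate_gem_bo(&gpu_vm_bo->gem_bo);

           // The gpu_vm_bo's list of gpu_vmas is protected by the GEM
           // object's dma_resv, which for local objects is the gpu_vm's
           // dma_resv and hence already held here.
           for_each_gpu_vma_of_gpu_vm_bo(&gpu_vm_bo, &gpu_vma)
                   move_gpu_vma_to_rebind_list(&gpu_vma, &gpu_vm->rebind_list);
   }

   for_each_gpu_vma_on_rebind_list(&gpu_vm->rebind_list, &gpu_vma) {
           rebind_gpu_vma(&gpu_vma);
           remove_gpu_vma_from_rebind_list(&gpu_vma);
   }
   // Validation section ends here, and job submission starts.

   add_dependencies(&gpu_job, &gpu_vm->resv);
   job_dma_fence = gpu_submit(&gpu_job);

   add_dma_fence(job_dma_fence, &gpu_vm->resv);
   dma_resv_unlock(gpu_vm->resv);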
202 .. code-block:: C
206 dma_resv_lock(obj->resv);
208 add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list);
210 add_dependencies(&eviction_job, &obj->resv);
212 add_dma_fence(job_dma_fence, &obj->resv);
214 dma_resv_unlock(obj->resv);
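Filled in, the eviction path for such a local object might read as below;
``get_object_to_evict()``, ``put_object()`` and ``gpu_submit()`` are assumed
pseudo-helpers in the style of this document:

.. code-block:: C

   obj = get_object_to_evict();

   // For a local object, obj->resv == gpu_vm->resv, so taking it also
   // protects the gpu_vm's evict list.
   dma_resv_lock(obj->resv);

   // Mark the object's gpu_vm_bo as needing revalidation before the
   // next GPU job submission.
   add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list);

   add_dependencies(&eviction_job, &obj->resv);
   job_dma_fence = gpu_submit(&eviction_job);
   add_dma_fence(job_dma_fence, &obj->resv);

   dma_resv_unlock(obj->resv);
   put_object(obj);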
218 dma_resv lock such that ``obj->resv == gpu_vm->resv``.
220 which is protected by ``gpu_vm->resv``. During eviction all local
245 per-gpu_vm list which is protected by the gpu_vm's dma_resv lock or
270 .. code-block:: C
272 dma_resv_lock(gpu_vm->resv);
274 // External object list is protected by the gpu_vm->resv lock.
276 dma_resv_lock(gpu_vm_bo.gem_obj->resv);
278 add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list);
281 for_each_gpu_vm_bo_on_evict_list(&gpu_vm->evict_list, &gpu_vm_bo) {
282 validate_gem_bo(&gpu_vm_bo->gem_bo);
285 move_gpu_vma_to_rebind_list(&gpu_vma, &gpu_vm->rebind_list);
288 for_each_gpu_vma_on_rebind_list(&gpu_vm->rebind_list, &gpu_vma) {
293 add_dependencies(&gpu_job, &gpu_vm->resv);
296 add_dma_fence(job_dma_fence, &gpu_vm->resv);
298 add_dma_fence(job_dma_fence, &obj->resv);
301 And the corresponding shared-object-aware eviction would look like:
303 .. code-block:: C
307 dma_resv_lock(obj->resv);
310 add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm->evict_list);
314 add_dependencies(&eviction_job, &obj->resv);
316 add_dma_fence(job_dma_fence, &obj->resv);
318 dma_resv_unlock(obj->resv);
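For a shared (external) object, the eviction path has to visit every gpu_vm_bo
of the object, and the gpu_vm's dma_resv is no longer implicitly held. A
sketch, with ``for_each_gpu_vm_bo_of_obj()`` as an assumed pseudo-helper and
field names following the illustrative structs above:

.. code-block:: C

   obj = get_object_to_evict();

   dma_resv_lock(obj->resv);

   // The object may be bound into several gpu_vms; put every affected
   // gpu_vm_bo on its gpu_vm's evict list. Only obj->resv is held here,
   // and for an external object that is not the gpu_vm's dma_resv, so
   // the evict list needs separate protection for this to be safe
   // (for example the spinlock-protected list discussed below).
   for_each_gpu_vm_bo_of_obj(&obj, &gpu_vm_bo)
           add_gpu_vm_bo_to_evict_list(&gpu_vm_bo, &gpu_vm_bo->vm->evict_list);

   add_dependencies(&eviction_job, &obj->resv);
   job_dma_fence = gpu_submit(&eviction_job);
   add_dma_fence(job_dma_fence, &obj->resv);

   dma_resv_unlock(obj->resv);
   put_object(obj);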
343 spin_lock(&gpu_vm->list_lock);
345 struct list_head *entry = list_first_entry_or_null(&gpu_vm->list, head);
350 list_move_tail(&entry->head, &still_in_list);
352 spin_unlock(&gpu_vm->list_lock);
356 spin_lock(&gpu_vm->list_lock);
360 list_splice_tail(&still_in_list, &gpu_vm->list);
361 spin_unlock(&gpu_vm->list_lock);
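A filled-in version of this restartable, spinlock-protected iteration might
look as follows. ``struct object``, ``list_entry_get_unless_zero()``,
``list_entry_put()`` and ``process()`` are placeholders for the driver's item
type, reference counting and per-item work:

.. code-block:: C

   struct list_head still_in_list;

   INIT_LIST_HEAD(&still_in_list);

   spin_lock(&gpu_vm->list_lock);
   do {
           struct object *entry =
                   list_first_entry_or_null(&gpu_vm->list, struct object, head);

           if (!entry)
                   break;

           // Always move the entry to the private list first, so the
           // loop makes forward progress even if we skip the entry.
           list_move_tail(&entry->head, &still_in_list);

           // Skip entries that are already being destroyed.
           if (!list_entry_get_unless_zero(entry))
                   continue;

           spin_unlock(&gpu_vm->list_lock);

           process(entry);

           spin_lock(&gpu_vm->list_lock);
           list_entry_put(entry);
   } while (true);

   // Give the temporarily pulled-off entries back to the gpu_vm's list.
   list_splice_tail(&still_in_list, &gpu_vm->list);
   spin_unlock(&gpu_vm->list_lock);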
373 items are temporarily pulled off the list while iterating, and it is
375 also be considered protected by the ``gpu_vm->list_lock``, and it is
388 GPU virtual address range, directly maps a CPU mm range of anonymous-
389 or file page-cache pages.
392 creates a Denial-Of-Service vector since a single user-space process
394 desirable. (For special use-cases, and assuming proper accounting, pinning might
398 pages, dirty them if they are not mapped read-only to the GPU, and
414 :ref:`the pin_user_pages() documentation <mmu-notifier-registration-case>`.
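At userptr gpu_vma creation time, the driver would typically register an mmu
interval notifier for the CPU virtual address range being mapped. A minimal
sketch, where ``userptr_start``, ``userptr_size``, the ops table and the
``gpu_vma_userptr_invalidate()`` callback (sketched further below) are assumed
driver-side names:

.. code-block:: C

   static const struct mmu_interval_notifier_ops gpu_vma_userptr_notifier_ops = {
           .invalidate = gpu_vma_userptr_invalidate,
   };

   // At bind time, while setting up the userptr gpu_vma:
   err = mmu_interval_notifier_insert(&gpu_vma->userptr_interval, current->mm,
                                      gpu_vma->userptr_start, gpu_vma->userptr_size,
                                      &gpu_vma_userptr_notifier_ops);
   if (err)
           return err;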
421 outer lock, which in our example below is the ``gpu_vm->lock``.
426 .. code-block:: C
431 down_write(&gpu_vm->lock);
436 seq = mmu_interval_read_begin(&gpu_vma->userptr_interval);
437 if (seq != gpu_vma->saved_seq) {
439 dma_resv_lock(gpu_vm->resv);
441 dma_resv_unlock(gpu_vm->resv);
442 gpu_vma->saved_seq = seq;
452 add_dependencies(&gpu_job, &gpu_vm->resv);
453 down_read(&gpu_vm->userptr_notifier_lock);
454 if (mmu_interval_read_retry(&gpu_vma->userptr_interval, gpu_vma->saved_seq)) {
455 up_read(&gpu_vm->userptr_notifier_lock);
461 add_dma_fence(job_dma_fence, &gpu_vm->resv);
464 add_dma_fence(job_dma_fence, &obj->resv);
467 up_read(&gpu_vm->userptr_notifier_lock);
468 up_write(&gpu_vm->lock);
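A complete submission path for userptr gpu_vmas, filled in around the lines
above, might be sketched like this. ``for_each_invalidated_userptr_gpu_vma()``,
``pin_userptr_pages()``, ``rebind_gpu_vma()`` and ``for_each_external_obj()``
are assumed pseudo-helpers, and locking plus validation of evicted GEM objects
is elided since it follows the earlier examples:

.. code-block:: C

   retry:
   down_write(&gpu_vm->lock);

   // Pinning pages may sleep and may fault back into user-space, so it
   // must happen outside of the dma_resv locks and the notifier lock.
   for_each_invalidated_userptr_gpu_vma(&gpu_vma, &gpu_vm)
           pin_userptr_pages(&gpu_vma);

   // Rebind any gpu_vma whose pages changed since the last submission.
   for_each_invalidated_userptr_gpu_vma(&gpu_vma, &gpu_vm) {
           seq = mmu_interval_read_begin(&gpu_vma->userptr_interval);
           if (seq != gpu_vma->saved_seq) {
                   dma_resv_lock(gpu_vm->resv);
                   rebind_gpu_vma(&gpu_vma);
                   dma_resv_unlock(gpu_vm->resv);
                   gpu_vma->saved_seq = seq;
           }
   }

   add_dependencies(&gpu_job, &gpu_vm->resv);

   // Catch invalidations racing with the rebind above; if one happened,
   // back off and restart the whole section. A real driver would check
   // all invalidated userptr gpu_vmas here.
   down_read(&gpu_vm->userptr_notifier_lock);
   if (mmu_interval_read_retry(&gpu_vma->userptr_interval, gpu_vma->saved_seq)) {
           up_read(&gpu_vm->userptr_notifier_lock);
           up_write(&gpu_vm->lock);
           goto retry;
   }

   job_dma_fence = gpu_submit(&gpu_job);

   add_dma_fence(job_dma_fence, &gpu_vm->resv);
   for_each_external_obj(&gpu_vm, &obj)
           add_dma_fence(job_dma_fence, &obj->resv);

   up_read(&gpu_vm->userptr_notifier_lock);
   up_write(&gpu_vm->lock);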
478 take any dma_resv lock nor the gpu_vm->lock from within it.
481 .. code-block:: C
486 // and backs off or we wait for the dma-fence:
488 down_write(&gpu_vm->userptr_notifier_lock);
490 up_write(&gpu_vm->userptr_notifier_lock);
499 dma_resv_wait_timeout(gpu_vm->resv, DMA_RESV_USAGE_BOOKKEEP,
506 page-binding before a new GPU submission can succeed.
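Filled in, and keeping the simplified calling convention of this document's
pseudo-code, the invalidation callback might be sketched as below;
``unmap_gpu_vma_userptr()`` is an assumed helper that zaps the gpu_vma's
page-table entries:

.. code-block:: C

   bool gpu_vma_userptr_invalidate(userptr_interval, cur_seq)
   {
           // Make sure the exec function either sees the new sequence value
           // and backs off or we wait for the dma-fence:

           down_write(&gpu_vm->userptr_notifier_lock);
           mmu_interval_set_seq(userptr_interval, cur_seq);
           up_write(&gpu_vm->userptr_notifier_lock);

           // No new GPU job can be submitted against the stale pages now;
           // wait for already submitted work before unmapping.
           dma_resv_wait_timeout(gpu_vm->resv, DMA_RESV_USAGE_BOOKKEEP,
                                 false, MAX_SCHEDULE_TIMEOUT);
           unmap_gpu_vma_userptr(&gpu_vma);

           return true;
   }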
517 function. This list will then lend itself very well to the spinlock
522 ``gpu_vm->lock`` or the ``gpu_vm->resv`` lock. Note that the
523 ``gpu_vm->lock`` still needs to be taken while iterating to ensure the list is
537 requires that the ``gpu_vm->lock`` and the ``gem_object->gpuva_lock``
540 the ``gpu_vm->resv`` or the GEM object's dma_resv, that the gpu_vmas
543 outer ``gpu_vm->lock`` is held, since otherwise when iterating over
547 Locking for recoverable page-fault page-table updates
551 recoverable page-faults:
565 ``userptr_seqlock`` as well as the ``gpu_vm->userptr_notifier_lock``
569 when populating the page-tables for any gpu_vma pointing to the GEM
570 object, will similarly ensure we are race-free.
573 under a dma-fence with these locks released, the zapping will need to
574 wait for that dma-fence to signal under the relevant lock before
575 starting to modify the page-table.
578 page-table structure in a way that frees up page-table memory
580 typically focuses only on zeroing page-table or page-directory entries
581 and flushing TLB, whereas freeing of page-table memory is deferred to
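As a rough illustration of that split between zapping and unbinding, the two
operations might be sketched as below; ``zero_page_table_entries()``,
``gpu_tlb_flush_range()`` and ``free_page_table_memory()`` are placeholders
rather than kernel APIs, and the zap function is assumed to be entered with
the relevant invalidation lock already held:

.. code-block:: C

   // Zapping, e.g. from the mmu interval notifier or a shrinker: only
   // clears page-table entries and flushes the TLB.
   static void zap_gpu_vma(struct gpu_vma *gpu_vma)
   {
           // If a previous page-table update is still running under a
           // dma-fence, wait for it before modifying the page-table.
           dma_resv_wait_timeout(gpu_vma->vm->resv, DMA_RESV_USAGE_BOOKKEEP,
                                 false, MAX_SCHEDULE_TIMEOUT);

           zero_page_table_entries(gpu_vma);
           gpu_tlb_flush_range(gpu_vma);
   }

   // Unbinding runs with the full gpu_vm locking in place (or as a fenced
   // job) and is where page-table memory is actually freed.
   static void unbind_gpu_vma(struct gpu_vma *gpu_vma)
   {
           zap_gpu_vma(gpu_vma);
           free_page_table_memory(gpu_vma);
   }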