torch.cuda
===================================
.. automodule:: torch.cuda
.. currentmodule:: torch.cuda

.. autosummary::
    :toctree: generated
    :nosignatures:

    StreamContext
    can_device_access_peer
    current_blas_handle
    current_device
    current_stream
    cudart
    default_stream
    device
    device_count
    device_of
    get_arch_list
    get_device_capability
    get_device_name
    get_device_properties
    get_gencode_flags
    get_sync_debug_mode
    init
    ipc_collect
    is_available
    is_initialized
    memory_usage
    set_device
    set_stream
    set_sync_debug_mode
    stream
    synchronize
    utilization
    temperature
    power_draw
    clock_rate
    OutOfMemoryError

Random Number Generator
-------------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    get_rng_state
    get_rng_state_all
    set_rng_state
    set_rng_state_all
    manual_seed
    manual_seed_all
    seed
    seed_all
    initial_seed


Communication collectives
-------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    comm.broadcast
    comm.broadcast_coalesced
    comm.reduce_add
    comm.scatter
    comm.gather

Streams and events
------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    Stream
    ExternalStream
    Event

Graphs (beta)
-------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    is_current_stream_capturing
    graph_pool_handle
    CUDAGraph
    graph
    make_graphed_callables

.. _cuda-memory-management-api:

Memory management
-----------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    empty_cache
    list_gpu_processes
    mem_get_info
    memory_stats
    memory_summary
    memory_snapshot
    memory_allocated
    max_memory_allocated
    reset_max_memory_allocated
    memory_reserved
    max_memory_reserved
    set_per_process_memory_fraction
    memory_cached
    max_memory_cached
    reset_max_memory_cached
    reset_peak_memory_stats
    caching_allocator_alloc
    caching_allocator_delete
    get_allocator_backend
    CUDAPluggableAllocator
    change_current_allocator
    MemPool
    MemPoolContext

.. autoclass:: torch.cuda.use_mem_pool

.. FIXME The following doesn't seem to exist. Is it supposed to?
   https://github.com/pytorch/pytorch/issues/27785
   .. autofunction:: reset_max_memory_reserved

NVIDIA Tools Extension (NVTX)
-----------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    nvtx.mark
    nvtx.range_push
    nvtx.range_pop
    nvtx.range

Jiterator (beta)
-----------------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    jiterator._create_jit_fn
    jiterator._create_multi_output_jit_fn

TunableOp
---------

Some operations can be implemented using more than one library or more than
one technique. For example, a GEMM can be implemented for CUDA or ROCm using
either the cublas/cublasLt or the hipblas/hipblasLt libraries, respectively.
How does one know which implementation is the fastest and should be chosen?
That is the question TunableOp answers: certain operators have been
implemented using multiple strategies as Tunable Operators, and at runtime
all strategies are profiled so that the fastest one is selected for all
subsequent operations.

See the :doc:`documentation <cuda.tunable>` for information on how to use it.
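
As a minimal sketch of the main on/off switch (the :doc:`cuda.tunable` page
is the authoritative reference, and the same switch is also exposed through
the ``PYTORCH_TUNABLEOP_ENABLED=1`` environment variable):

.. code-block:: python

    import torch

    # Turn TunableOp on: tunable operators (e.g. GEMMs) reached after this
    # point are profiled once across their available strategies, and the
    # fastest solution is reused for subsequent identical calls.
    torch.cuda.tunable.enable(True)

    a = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
    b = torch.randn(1024, 1024, device="cuda", dtype=torch.float16)
    c = a @ b  # first call tunes; later calls use the selected solution

    torch.cuda.tunable.enable(False)  # switch the tuning machinery off again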
.. toctree::
    :hidden:

    cuda.tunable


Stream Sanitizer (prototype)
----------------------------

CUDA Sanitizer is a prototype tool for detecting synchronization errors between streams in PyTorch.
See the :doc:`documentation <cuda._sanitizer>` for information on how to use it;
a minimal enabling sketch appears at the end of this page.

.. toctree::
    :hidden:

    cuda._sanitizer


.. This module needs to be documented. Adding here in the meantime
.. for tracking purposes
.. py:module:: torch.cuda.comm
.. py:module:: torch.cuda.error
.. py:module:: torch.cuda.gds
.. py:module:: torch.cuda.graphs
.. py:module:: torch.cuda.jiterator
.. py:module:: torch.cuda.memory
.. py:module:: torch.cuda.nccl
.. py:module:: torch.cuda.nvtx
.. py:module:: torch.cuda.profiler
.. py:module:: torch.cuda.random
.. py:module:: torch.cuda.sparse
.. py:module:: torch.cuda.streams
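
As referenced from the Stream Sanitizer section above, a minimal sketch of
enabling the sanitizer programmatically (the :doc:`cuda._sanitizer` page is
the authoritative reference; setting ``TORCH_CUDA_SANITIZER=1`` in the
environment has the same effect, and the tensor operations below are purely
illustrative):

.. code-block:: python

    import torch
    import torch.cuda._sanitizer as csan

    # Must be enabled before the streams and kernels under test are used.
    csan.enable_cuda_sanitizer()

    # An unsynchronized access to one tensor from two streams -- the kind
    # of hazard the sanitizer is designed to report.
    a = torch.zeros(8192, device="cuda")
    with torch.cuda.stream(torch.cuda.Stream()):
        a.add_(1.0)  # runs on a side stream
    a.mul_(2.0)      # default-stream access without synchronization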