torch.cuda
===================================
.. automodule:: torch.cuda
.. currentmodule:: torch.cuda

.. autosummary::
    :toctree: generated
    :nosignatures:

    StreamContext
    can_device_access_peer
    current_blas_handle
    current_device
    current_stream
    cudart
    default_stream
    device
    device_count
    device_of
    get_arch_list
    get_device_capability
    get_device_name
    get_device_properties
    get_gencode_flags
    get_sync_debug_mode
    init
    ipc_collect
    is_available
    is_initialized
    memory_usage
    set_device
    set_stream
    set_sync_debug_mode
    stream
    synchronize
    utilization
    temperature
    power_draw
    clock_rate
    OutOfMemoryError

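A minimal sketch of querying and selecting devices with the helpers listed
above (the printed values are illustrative):

.. code-block:: python

    import torch

    if torch.cuda.is_available():
        print(torch.cuda.device_count())        # number of visible GPUs
        print(torch.cuda.current_device())      # index of the current device
        print(torch.cuda.get_device_name(0))    # e.g. "NVIDIA A100-SXM4-40GB"

        torch.cuda.set_device(0)                # make device 0 current
        with torch.cuda.device(0):              # or scope the selection
            x = torch.ones(4, device="cuda")
        torch.cuda.synchronize()                # wait for queued kernels to finish
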
Random Number Generator
-------------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    get_rng_state
    get_rng_state_all
    set_rng_state
    set_rng_state_all
    manual_seed
    manual_seed_all
    seed
    seed_all
    initial_seed

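A short sketch of seeding every device and restoring a saved RNG state:

.. code-block:: python

    import torch

    torch.cuda.manual_seed_all(42)             # seed every visible GPU
    states = torch.cuda.get_rng_state_all()    # snapshot the per-device RNG states

    a = torch.randn(3, device="cuda")
    torch.cuda.set_rng_state_all(states)       # roll the generators back
    b = torch.randn(3, device="cuda")
    assert torch.equal(a, b)                   # same draw after restoring the state
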

Communication collectives
-------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    comm.broadcast
    comm.broadcast_coalesced
    comm.reduce_add
    comm.scatter
    comm.gather

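A minimal multi-GPU sketch of the collectives above; it does nothing unless at
least two devices are visible:

.. code-block:: python

    import torch

    if torch.cuda.device_count() >= 2:
        t = torch.arange(8.0, device="cuda:0")
        copies = torch.cuda.comm.broadcast(t, devices=[0, 1])  # one copy per device
        chunks = torch.cuda.comm.scatter(t, devices=[0, 1])    # split along dim 0
        whole = torch.cuda.comm.gather(chunks, destination=0)  # reassemble on cuda:0
        summed = torch.cuda.comm.reduce_add(copies)            # elementwise sum of the copies
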
Streams and events
------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    Stream
    ExternalStream
    Event

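A minimal sketch of running work on a side stream and timing it with events:

.. code-block:: python

    import torch

    side = torch.cuda.Stream()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)

    a = torch.randn(1024, 1024, device="cuda")
    side.wait_stream(torch.cuda.current_stream())  # side stream waits for a to be ready
    with torch.cuda.stream(side):                  # queue work on the side stream
        start.record()
        b = a @ a
        end.record()
    torch.cuda.synchronize()                       # block until all queued work is done
    print(start.elapsed_time(end), "ms")           # device time between the two events
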
Graphs (beta)
-------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    is_current_stream_capturing
    graph_pool_handle
    CUDAGraph
    graph
    make_graphed_callables

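A minimal capture-and-replay sketch with ``CUDAGraph`` (inputs and outputs
must live in fixed storage across replays):

.. code-block:: python

    import torch

    static_in = torch.zeros(64, device="cuda")

    # Warm up on a side stream so capture does not record one-time setup work.
    s = torch.cuda.Stream()
    s.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(s):
        static_out = static_in * 2
    torch.cuda.current_stream().wait_stream(s)

    g = torch.cuda.CUDAGraph()
    with torch.cuda.graph(g):              # capture the kernels into the graph
        static_out = static_in * 2

    static_in.copy_(torch.arange(64.0, device="cuda"))
    g.replay()                             # re-run the captured kernels on new data
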
.. _cuda-memory-management-api:

Memory management
-----------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    empty_cache
    list_gpu_processes
    mem_get_info
    memory_stats
    memory_summary
    memory_snapshot
    memory_allocated
    max_memory_allocated
    reset_max_memory_allocated
    memory_reserved
    max_memory_reserved
    set_per_process_memory_fraction
    memory_cached
    max_memory_cached
    reset_max_memory_cached
    reset_peak_memory_stats
    caching_allocator_alloc
    caching_allocator_delete
    get_allocator_backend
    CUDAPluggableAllocator
    change_current_allocator
    MemPool
    MemPoolContext

.. autoclass:: torch.cuda.use_mem_pool

.. FIXME The following doesn't seem to exist. Is it supposed to?
   https://github.com/pytorch/pytorch/issues/27785
   .. autofunction:: reset_max_memory_reserved

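A minimal sketch of inspecting and resetting the allocator statistics listed above:

.. code-block:: python

    import torch

    x = torch.empty(1024, 1024, device="cuda")
    print(torch.cuda.memory_allocated())      # bytes currently occupied by tensors
    print(torch.cuda.memory_reserved())       # bytes held by the caching allocator
    print(torch.cuda.max_memory_allocated())  # peak usage since the last reset

    del x
    torch.cuda.empty_cache()                  # release unused cached blocks to the driver
    torch.cuda.reset_peak_memory_stats()      # start tracking a fresh peak
    print(torch.cuda.memory_summary())        # human-readable report of the stats above
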
NVIDIA Tools Extension (NVTX)
-----------------------------

.. autosummary::
    :toctree: generated
    :nosignatures:

    nvtx.mark
    nvtx.range_push
    nvtx.range_pop
    nvtx.range

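A minimal sketch of annotating regions so they appear on an NVIDIA profiler
timeline (e.g. Nsight Systems):

.. code-block:: python

    import torch

    torch.cuda.nvtx.mark("start of step")      # instantaneous marker
    torch.cuda.nvtx.range_push("forward")      # open a named range
    y = torch.randn(8, device="cuda", requires_grad=True).sum()
    torch.cuda.nvtx.range_pop()                # close the range

    with torch.cuda.nvtx.range("backward"):    # the same thing as a context manager
        y.backward()
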
Jiterator (beta)
-----------------------------
.. autosummary::
    :toctree: generated
    :nosignatures:

    jiterator._create_jit_fn
    jiterator._create_multi_output_jit_fn

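A minimal sketch of compiling an elementwise CUDA kernel from a C++ code
string with the Jiterator (the kernel is built on first call):

.. code-block:: python

    import torch

    code = "template <typename T> T my_op(T x, T y, T alpha) { return -x + alpha * y; }"
    jitted = torch.cuda.jiterator._create_jit_fn(code, alpha=1.0)

    a = torch.rand(3, device="cuda")
    b = torch.rand(3, device="cuda")
    out = jitted(a, b)                 # elementwise, equivalent to -a + 1.0 * b
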
TunableOp
---------

Some operations can be implemented with more than one library or more than
one technique. A GEMM, for example, can be implemented for CUDA with the
cublas/cublasLt libraries or for ROCm with the hipblas/hipblasLt libraries.
How does one know which implementation is fastest and should be chosen? That
is what TunableOp provides: certain operators are implemented with multiple
strategies as Tunable Operators, and at runtime every strategy is profiled so
that the fastest one can be selected for all subsequent operations.

See the :doc:`documentation <cuda.tunable>` for information on how to use it.

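A minimal sketch of turning tuning on from Python, assuming the
``torch.cuda.tunable`` interface described on that page; setting
``PYTORCH_TUNABLEOP_ENABLED=1`` in the environment has the same effect:

.. code-block:: python

    import torch

    torch.cuda.tunable.enable(True)          # turn TunableOp on for this process
    torch.cuda.tunable.tuning_enable(True)   # allow new solutions to be profiled

    a = torch.randn(1024, 1024, device="cuda")
    b = torch.randn(1024, 1024, device="cuda")
    c = a @ b                                # first call for this GEMM shape triggers tuning

    torch.cuda.tunable.write_file()          # persist the selected solutions for later runs
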
.. toctree::
    :hidden:

    cuda.tunable


Stream Sanitizer (prototype)
----------------------------

CUDA Sanitizer is a prototype tool for detecting synchronization errors between streams in PyTorch.
See the :doc:`documentation <cuda._sanitizer>` for information on how to use it.

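A minimal sketch, assuming the ``torch.cuda._sanitizer`` module documented on
that page; the tool can also be enabled by running a script with
``TORCH_CUDA_SANITIZER=1`` in the environment:

.. code-block:: python

    import torch
    import torch.cuda._sanitizer as csan

    csan.enable_cuda_sanitizer()      # start checking kernel launches in this process

    # The kind of error the sanitizer reports: a tensor produced on the default
    # stream is consumed on another stream without any synchronization.
    a = torch.randn(16, device="cuda")
    s = torch.cuda.Stream()
    with torch.cuda.stream(s):
        b = a * 2
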
.. toctree::
    :hidden:

    cuda._sanitizer


.. These modules need to be documented. Adding them here in the meantime
.. for tracking purposes
.. py:module:: torch.cuda.comm
.. py:module:: torch.cuda.error
.. py:module:: torch.cuda.gds
.. py:module:: torch.cuda.graphs
.. py:module:: torch.cuda.jiterator
.. py:module:: torch.cuda.memory
.. py:module:: torch.cuda.nccl
.. py:module:: torch.cuda.nvtx
.. py:module:: torch.cuda.profiler
.. py:module:: torch.cuda.random
.. py:module:: torch.cuda.sparse
.. py:module:: torch.cuda.streams