Lines Matching refs:CUDA
21 This document assumes a basic familiarity with CUDA and the PTX
22 assembly language. Information about the CUDA Driver API and the PTX assembly
23 language can be found in the `CUDA documentation
100 copy data to it by name with the CUDA Driver API.
117 generated PTX compatible with the CUDA Driver API.
119 Example: 32-bit PTX for CUDA Driver API: ``nvptx-nvidia-cuda``
121 Example: 64-bit PTX for CUDA Driver API: ``nvptx64-nvidia-cuda``
223 map in the following way to CUDA builtins:
226 CUDA Builtin    PTX Special Register Intrinsic
252 instruction, equivalent to the ``__syncthreads()`` call in CUDA.
267 The CUDA Toolkit comes with an LLVM bitcode library called ``libdevice`` that
270 The library can be found under ``nvvm/libdevice/`` in the CUDA Toolkit and
367 The most common way to execute PTX assembly on a GPU device is to use the CUDA
398 For full examples of executing PTX assembly, please see the `CUDA Samples
555 Intrinsic    CUDA Equivalent
622 a real GPU device? The CUDA Driver API provides a convenient mechanism for
630 You can also use the ``ptxas`` tool provided by the CUDA Toolkit to offline
632 binaries can be loaded by the CUDA Driver API in the same way as PTX. This
657 // CUDA initialization
664 std::cout << "Using CUDA Device [0]: " << name << "\n";
759 You will need to link with the CUDA driver and specify the path to cuda.h.
766 system location by the driver, not the CUDA Toolkit.
773 Using CUDA Device [0]: GeForce GTX 680
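
The matches at 119 and 121 name the two target triples used when generating PTX for the CUDA Driver API. As a minimal sketch of that choice, the helper below picks the triple from the pointer width. The function name ``nvptxTripleFor`` is hypothetical and is not part of LLVM or the CUDA Toolkit; only the two triple strings come from the matched lines.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption for illustration, not an LLVM or CUDA API):
// returns the target triple for PTX intended for the CUDA Driver API.
inline std::string nvptxTripleFor(unsigned pointerBits) {
  switch (pointerBits) {
  case 32:
    return "nvptx-nvidia-cuda";    // 32-bit PTX
  case 64:
    return "nvptx64-nvidia-cuda";  // 64-bit PTX
  default:
    // Only the 32-bit and 64-bit triples appear in the matched lines.
    throw std::invalid_argument("pointerBits must be 32 or 64");
  }
}
```

Calling ``nvptxTripleFor(64)`` yields ``nvptx64-nvidia-cuda``; any other width than 32 or 64 throws, so an unsupported width is reported rather than silently mapped to one of the two triples.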