Searched refs:ptxas (Results 1 – 23 of 23) sorted by relevance

/external/llvm-project/llvm/test/CodeGen/NVPTX/
fp-contract.ll
8 ;; add.f32 otherwise. Without an explicit rounding mode on add.f32, ptxas
11 ;; for all adds to prevent ptxas from fusion the ops.
28 ;; to prevent ptxas from fusing this with anything else.
param-align.ll
28 ;;; ptxas for sm_50+
bug26185-2.ll
6 ; emit the necessary cvt.* instructions to implement the extension and let ptxas
/external/llvm/test/CodeGen/NVPTX/
fp-contract.ll
8 ;; add.f32 otherwise. Without an explicit rounding mode on add.f32, ptxas
11 ;; for all adds to prevent ptxas from fusion the ops.
28 ;; to prevent ptxas from fusing this with anything else.
bug26185-2.ll
6 ; emit the necessary cvt.* instructions to implement the extension and let ptxas
/external/llvm-project/openmp/tools/analyzer/
analyzer.py
42 ptxas = '\n'.join([line.split(':')[1].strip() for line in stderr.split('\n') if re.search(r"^ptxas…
51 parseKernelUsages(ptxas, usage)
/external/tensorflow/tensorflow/stream_executor/gpu/
asm_compiler.cc
52 tensorflow::SubProcess ptxas; in WarnIfBadPtxasVersion() local
53 ptxas.SetProgram(ptxas_path, {ptxas_path, "--version"}); in WarnIfBadPtxasVersion()
54 ptxas.SetChannelAction(tensorflow::CHAN_STDOUT, tensorflow::ACTION_PIPE); in WarnIfBadPtxasVersion()
55 if (!ptxas.Start()) { in WarnIfBadPtxasVersion()
61 int exit_code = ptxas.Communicate(/*stdin_input=*/nullptr, &out, in WarnIfBadPtxasVersion()
/external/llvm/docs/
CompileCudaWithLLVM.rst
164 * ``off``: never emit fma operations, and prevent ptxas from fusing multiply
167 across statements (C11 semantics). Prevent ptxas from fusing other
170 statements. Doesn't prevent ptxas from fusing additional multiplies and
NVPTXUsage.rst
413 ptxas complains of undefined function: __nvvm_reflect
638 You can also use the ``ptxas`` tool provided by the CUDA Toolkit to offline
/external/llvm-project/llvm/docs/
CompileCudaWithLLVM.rst
110 * ``off``: never emit fma operations, and prevent ptxas from fusing multiply
113 across statements (C11 semantics). Prevent ptxas from fusing other
116 statements. Doesn't prevent ptxas from fusing additional multiplies and
247 * Optionally, invoke ``ptxas``, the PTX assembler, to generate a file,
270 * Invoke ``ptxas`` to generate a SASS file, ``S_arch``. Note that, unlike
NVPTXUsage.rst
405 ptxas complains of undefined function: __nvvm_reflect
630 You can also use the ``ptxas`` tool provided by the CUDA Toolkit to offline
/external/tensorflow/third_party/nccl/
build_defs.bzl.tpl
53 "ptxas-options=" + maxrregcount,
57 "-Xcuda-ptxas",
/external/llvm-project/clang/lib/Driver/ToolChains/
Cuda.cpp
153 if (llvm::ErrorOr<std::string> ptxas = in CudaInstallationDetector() local
156 llvm::sys::fs::real_path(*ptxas, ptxasAbsolutePath); in CudaInstallationDetector()
/external/llvm-project/openmp/libomptarget/deviceRTLs/nvptx/
CMakeLists.txt
99 set(CUDA_DEBUG -DOMPTARGET_NVPTX_DEBUG=-1 -g --ptxas-options=-v)
/external/tensorflow/tensorflow/compiler/xla/
xla.proto
187 // If set to true XLA:GPU invokes `ptxas` with -O0 (default is -O3).
/external/llvm-project/clang/docs/
ClangCommandLineReference.rst
75 .. option:: -Xcuda-ptxas <arg>
77 Pass <arg> to the ptxas assembler
165 Enable device-side debug info generation. Disables ptxas optimizations.
1145 .. option:: --ptxas-path=<arg>
1147 Path to ptxas (used for compiling CUDA code)
/external/clang/include/clang/Driver/
Options.td
347 def Xcuda_ptxas : Separate<["-"], "Xcuda-ptxas">,
348 HelpText<"Pass <arg> to the ptxas assembler">, MetaVarName<"<arg>">;
414 HelpText<"Enable device-side debug info generation. Disables ptxas optimizations.">;
/external/llvm-project/clang/include/clang/Driver/
Options.td
563 def Xcuda_ptxas : Separate<["-"], "Xcuda-ptxas">,
564 HelpText<"Pass <arg> to the ptxas assembler">, MetaVarName<"<arg>">;
683 HelpText<"Enable device-side debug info generation. Disables ptxas optimizations.">;
692 def ptxas_path_EQ : Joined<["--"], "ptxas-path=">, Group<i_Group>,
693 HelpText<"Path to ptxas (used for compiling CUDA code)">;
/external/llvm-project/llvm/lib/Target/NVPTX/
NVPTXInstrInfo.td
258 // just like the non ".rn" op, but prevents ptxas from creating FMAs.
828 // ptxas does not have hex representation for fp16, so we can't use
/external/swiftshader/third_party/llvm-10.0/llvm/lib/Target/NVPTX/
NVPTXInstrInfo.td
258 // just like the non ".rn" op, but prevents ptxas from creating FMAs.
828 // ptxas does not have hex representation for fp16, so we can't use
/external/tensorflow/tensorflow/compiler/xla/tests/
BUILD
2627 # Disabled in OSS until nvidia publicly releases a fixed ptxas.
/external/tensorflow/
RELEASE.md
3330 As a result, these versions of `ptxas` miscompile most XLA programs which use
3480 * GPU back-end now uses `ptxas` to compile generated PTX.
3505 As a result, these versions of `ptxas` miscompile most XLA programs which use
/external/llvm/lib/Target/NVPTX/
NVPTXIntrinsics.td
70 // The last two parameters to shfl can be regs or imms. ptxas is smart
807 // patterns, but ptxas does not like these since .s16 is not compatible with