/external/llvm-project/llvm/test/CodeGen/NVPTX/ |
D | fp-contract.ll | 8 ;; add.f32 otherwise. Without an explicit rounding mode on add.f32, ptxas 11 ;; for all adds to prevent ptxas from fusion the ops. 28 ;; to prevent ptxas from fusing this with anything else.
|
D | param-align.ll | 28 ;;; ptxas for sm_50+
|
D | bug26185-2.ll | 6 ; emit the necessary cvt.* instructions to implement the extension and let ptxas
|
/external/llvm/test/CodeGen/NVPTX/ |
D | fp-contract.ll | 8 ;; add.f32 otherwise. Without an explicit rounding mode on add.f32, ptxas 11 ;; for all adds to prevent ptxas from fusion the ops. 28 ;; to prevent ptxas from fusing this with anything else.
|
D | bug26185-2.ll | 6 ; emit the necessary cvt.* instructions to implement the extension and let ptxas
|
/external/llvm-project/openmp/tools/analyzer/ |
D | analyzer.py | 42 …ptxas = '\n'.join([line.split(':')[1].strip() for line in stderr.split('\n') if re.search(r"^ptxas… 51 parseKernelUsages(ptxas, usage)
|
/external/tensorflow/tensorflow/stream_executor/gpu/ |
D | asm_compiler.cc | 52 tensorflow::SubProcess ptxas; in WarnIfBadPtxasVersion() local 53 ptxas.SetProgram(ptxas_path, {ptxas_path, "--version"}); in WarnIfBadPtxasVersion() 54 ptxas.SetChannelAction(tensorflow::CHAN_STDOUT, tensorflow::ACTION_PIPE); in WarnIfBadPtxasVersion() 55 if (!ptxas.Start()) { in WarnIfBadPtxasVersion() 61 int exit_code = ptxas.Communicate(/*stdin_input=*/nullptr, &out, in WarnIfBadPtxasVersion()
|
/external/llvm/docs/ |
D | CompileCudaWithLLVM.rst | 164 * ``off``: never emit fma operations, and prevent ptxas from fusing multiply 167 across statements (C11 semantics). Prevent ptxas from fusing other 170 statements. Doesn't prevent ptxas from fusing additional multiplies and
|
D | NVPTXUsage.rst | 413 ptxas complains of undefined function: __nvvm_reflect 638 You can also use the ``ptxas`` tool provided by the CUDA Toolkit to offline
|
/external/llvm-project/llvm/docs/ |
D | CompileCudaWithLLVM.rst | 110 * ``off``: never emit fma operations, and prevent ptxas from fusing multiply 113 across statements (C11 semantics). Prevent ptxas from fusing other 116 statements. Doesn't prevent ptxas from fusing additional multiplies and 247 * Optionally, invoke ``ptxas``, the PTX assembler, to generate a file, 270 * Invoke ``ptxas`` to generate a SASS file, ``S_arch``. Note that, unlike
|
D | NVPTXUsage.rst | 405 ptxas complains of undefined function: __nvvm_reflect 630 You can also use the ``ptxas`` tool provided by the CUDA Toolkit to offline
|
/external/tensorflow/third_party/nccl/ |
D | build_defs.bzl.tpl | 53 "ptxas-options=" + maxrregcount, 57 "-Xcuda-ptxas",
|
/external/llvm-project/clang/lib/Driver/ToolChains/ |
D | Cuda.cpp | 153 if (llvm::ErrorOr<std::string> ptxas = in CudaInstallationDetector() local 156 llvm::sys::fs::real_path(*ptxas, ptxasAbsolutePath); in CudaInstallationDetector()
|
/external/llvm-project/openmp/libomptarget/deviceRTLs/nvptx/ |
D | CMakeLists.txt | 99 set(CUDA_DEBUG -DOMPTARGET_NVPTX_DEBUG=-1 -g --ptxas-options=-v)
|
/external/tensorflow/tensorflow/compiler/xla/ |
D | xla.proto | 187 // If set to true XLA:GPU invokes `ptxas` with -O0 (default is -O3).
|
/external/llvm-project/clang/docs/ |
D | ClangCommandLineReference.rst | 75 .. option:: -Xcuda-ptxas <arg> 77 Pass <arg> to the ptxas assembler 165 Enable device-side debug info generation. Disables ptxas optimizations. 1145 .. option:: --ptxas-path=<arg> 1147 Path to ptxas (used for compiling CUDA code)
|
/external/clang/include/clang/Driver/ |
D | Options.td | 347 def Xcuda_ptxas : Separate<["-"], "Xcuda-ptxas">, 348 HelpText<"Pass <arg> to the ptxas assembler">, MetaVarName<"<arg>">; 414 HelpText<"Enable device-side debug info generation. Disables ptxas optimizations.">;
|
/external/llvm-project/clang/include/clang/Driver/ |
D | Options.td | 563 def Xcuda_ptxas : Separate<["-"], "Xcuda-ptxas">, 564 HelpText<"Pass <arg> to the ptxas assembler">, MetaVarName<"<arg>">; 683 HelpText<"Enable device-side debug info generation. Disables ptxas optimizations.">; 692 def ptxas_path_EQ : Joined<["--"], "ptxas-path=">, Group<i_Group>, 693 HelpText<"Path to ptxas (used for compiling CUDA code)">;
|
/external/llvm-project/llvm/lib/Target/NVPTX/ |
D | NVPTXInstrInfo.td | 258 // just like the non ".rn" op, but prevents ptxas from creating FMAs. 828 // ptxas does not have hex representation for fp16, so we can't use
|
/external/swiftshader/third_party/llvm-10.0/llvm/lib/Target/NVPTX/ |
D | NVPTXInstrInfo.td | 258 // just like the non ".rn" op, but prevents ptxas from creating FMAs. 828 // ptxas does not have hex representation for fp16, so we can't use
|
/external/tensorflow/tensorflow/compiler/xla/tests/ |
D | BUILD | 2627 # Disabled in OSS until nvidia publicly releases a fixed ptxas.
|
/external/tensorflow/ |
D | RELEASE.md | 3330 As a result, these versions of `ptxas` miscompile most XLA programs which use 3480 * GPU back-end now uses `ptxas` to compile generated PTX. 3505 As a result, these versions of `ptxas` miscompile most XLA programs which use
|
/external/llvm/lib/Target/NVPTX/ |
D | NVPTXIntrinsics.td | 70 // The last two parameters to shfl can be regs or imms. ptxas is smart 807 // patterns, but ptxas does not like these since .s16 is not compatible with
|