| /external/pytorch/benchmarks/dynamo/ |
| D | expected_ci_speedup_inductor_torchbench_cpu.csv | 2 #timm_vision_transformer,inductor,float32,static,default,1.039510755 3 phlippe_densenet,inductor,float32,static,default,1.46474287 4 basic_gnn_edgecnn,inductor,float32,dynamic,default,1.30092957 5 llama_v2_7b_16h,inductor,float32,dynamic,default,1.23234331 6 resnet50,inductor,float32,dynamic,default,1.67742767 7 #timm_efficientnet,inductor,float32,static,cpp, 8 mobilenet_v3_large,inductor,float32,static,cpp,2.63311706 9 timm_resnest,inductor,float32,dynamic,cpp,1.7321529 10 functorch_maml_omniglot,inductor,float32,dynamic,cpp,1.17617472 11 #hf_GPT2,inductor,float32,dynamic,cpp, [all …]
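The CSV rows above follow the format `model,backend,dtype,shape_mode,wrapper,expected_speedup`, where a leading `#` comments an entry out (its speedup column may be empty). A minimal sketch of a parser for this format, using only sample rows copied from the file above (`load_expected_speedups` is a hypothetical helper, not part of the repo):

```python
import csv
import io

# Sample rows copied verbatim from expected_ci_speedup_inductor_torchbench_cpu.csv;
# lines starting with "#" are commented-out entries.
SAMPLE = """\
#timm_vision_transformer,inductor,float32,static,default,1.039510755
phlippe_densenet,inductor,float32,static,default,1.46474287
basic_gnn_edgecnn,inductor,float32,dynamic,default,1.30092957
mobilenet_v3_large,inductor,float32,static,cpp,2.63311706
"""

def load_expected_speedups(text):
    """Parse model,backend,dtype,shape_mode,wrapper,speedup rows, skipping comments."""
    expected = {}
    for row in csv.reader(io.StringIO(text)):
        if not row or row[0].startswith("#"):
            continue
        model, backend, dtype, mode, wrapper, speedup = row
        expected[(model, backend, dtype, mode, wrapper)] = float(speedup)
    return expected

speedups = load_expected_speedups(SAMPLE)
print(speedups[("phlippe_densenet", "inductor", "float32", "static", "default")])
```

The keyed-tuple layout makes it easy for a CI check to look up the expected speedup for one (model, backend, dtype, mode, wrapper) combination.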
|
| D | README.md | 27 The [inductor-perf-test-nightly.yml](https://github.com/pytorch/pytorch/actions/workflows/inductor-… 30 … of the [workflow](https://github.com/pytorch/pytorch/actions/workflows/inductor-perf-test-nightly… 67 - `--backend=inductor`: selects TorchInductor as the compiler backend to measure. Many more are av… 74 - `--export-aot-inductor`: benchmarks ahead-of-time compilation mode. 83 ./benchmarks/dynamo/torchbench.py --performance --training --amp --backend=inductor --output=torchb… 84 ./benchmarks/dynamo/torchbench.py --performance --inference --bfloat16 --backend=inductor --output=… 86 ./benchmarks/dynamo/huggingface.py --performance --training --amp --backend=inductor --output=huggi… 87 ./benchmarks/dynamo/huggingface.py --performance --inference --bfloat16 --backend=inductor --output… 89 ./benchmarks/dynamo/timm_models.py --performance --training --amp --backend=inductor --output=timm_… 90 ./benchmarks/dynamo/timm_models.py --performance --inference --bfloat16 --backend=inductor --output…
|
| /external/pytorch/.github/workflows/ |
| D | inductor-cu124.yml | 1 name: inductor-cu124 6 - ciflow/inductor-cu124/* 21 linux-focal-cuda12_4-py3_10-gcc9-inductor-build: 22 # Should be synced with the one in inductor.yml, but this doesn't run inductor_timm 26 sync-tag: linux-focal-cuda12_4-py3_10-gcc9-inductor-build 28 docker-image-name: pytorch-linux-focal-cuda12.4-cudnn9-py3-gcc9-inductor-benchmarks 32 { config: "inductor", shard: 1, num_shards: 2, runner: "linux.g5.4xlarge.nvidia.gpu" }, 33 { config: "inductor", shard: 2, num_shards: 2, runner: "linux.g5.4xlarge.nvidia.gpu" }, 53 linux-focal-cuda12_4-py3_10-gcc9-inductor-test: 56 needs: linux-focal-cuda12_4-py3_10-gcc9-inductor-build [all …]
|
| D | inductor.yml | 1 name: inductor 9 - ciflow/inductor/* 30 linux-focal-cuda12_1-py3_10-gcc9-inductor-build: 36 docker-image-name: pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-inductor-benchmarks 41 …{ config: "inductor", shard: 1, num_shards: 2, runner: "${{ needs.get-label-type.outputs.label-typ… 42 …{ config: "inductor", shard: 2, num_shards: 2, runner: "${{ needs.get-label-type.outputs.label-typ… 64 linux-focal-cuda12_1-py3_10-gcc9-inductor-test: 67 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-build 70 … docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.docker-image }} 71 test-matrix: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.test-matrix }} [all …]
|
| D | inductor-micro-benchmark-x86.yml | 1 name: inductor-micro-benchmark-x86 8 - ciflow/inductor-micro-benchmark-cpu-x86/* 19 linux-jammy-cpu-py3_9-gcc11-inductor-build: 20 name: linux-jammy-cpu-py3.9-gcc11-inductor 24 docker-image-name: pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks 28 …{ config: "inductor-micro-benchmark-cpu-x86", shard: 1, num_shards: 1, runner: "linux.24xl.spr-met… 31 linux-jammy-cpu-py3_9-gcc11-inductor-micro-benchmark-test: 32 name: linux-jammy-cpu-py3.9-gcc11-inductor 34 needs: linux-jammy-cpu-py3_9-gcc11-inductor-build 37 docker-image: ${{ needs.linux-jammy-cpu-py3_9-gcc11-inductor-build.outputs.docker-image }} [all …]
|
| D | inductor-rocm.yml | 1 name: inductor-rocm 15 - ciflow/inductor-rocm/* 25 linux-focal-rocm6_2-py3_10-inductor-build: 26 name: rocm6.2-py3.10-inductor 33 { config: "inductor", shard: 1, num_shards: 2, runner: "linux.rocm.gpu.2" }, 34 { config: "inductor", shard: 2, num_shards: 2, runner: "linux.rocm.gpu.2" }, 37 linux-focal-rocm6_2-py3_10-inductor-test: 41 name: rocm6.2-py3.10-inductor 43 needs: linux-focal-rocm6_2-py3_10-inductor-build 46 docker-image: ${{ needs.linux-focal-rocm6_2-py3_10-inductor-build.outputs.docker-image }} [all …]
|
| D | inductor-perf-test-nightly-x86.yml | 1 name: inductor-perf-nightly-x86 51 linux-jammy-cpu-py3_9-gcc11-inductor-build: 52 name: linux-jammy-cpu-py3.9-gcc11-inductor 56 docker-image-name: pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks 77 linux-jammy-cpu-py3_9-gcc11-inductor-test-nightly: 78 name: linux-jammy-cpu-py3.9-gcc11-inductor 80 needs: linux-jammy-cpu-py3_9-gcc11-inductor-build 85 docker-image: ${{ needs.linux-jammy-cpu-py3_9-gcc11-inductor-build.outputs.docker-image }} 86 test-matrix: ${{ needs.linux-jammy-cpu-py3_9-gcc11-inductor-build.outputs.test-matrix }} 93 linux-jammy-cpu-py3_9-gcc11-inductor-test: [all …]
|
| D | inductor-micro-benchmark.yml | 1 name: inductor-micro-benchmark 8 - ciflow/inductor-micro-benchmark/* 19 linux-focal-cuda12_1-py3_10-gcc9-inductor-micro-benchmark-build: 24 docker-image-name: pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-inductor-benchmarks 28 { config: "inductor-micro-benchmark", shard: 1, num_shards: 1, runner: "linux.gcp.a100" }, 31 linux-focal-cuda12_1-py3_10-gcc9-inductor-micro-benchmark-test: 34 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-micro-benchmark-build 37 …docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-micro-benchmark-build.outputs.do… 38 …test-matrix: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-micro-benchmark-build.outputs.tes…
|
| D | inductor-perf-test-nightly.yml | 1 name: inductor-A100-perf-nightly 69 linux-focal-cuda12_1-py3_10-gcc9-inductor-build: 74 docker-image-name: pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-inductor-benchmarks 95 linux-focal-cuda12_1-py3_10-gcc9-inductor-test-nightly: 98 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-build 103 … docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.docker-image }} 104 test-matrix: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.test-matrix }} 110 linux-focal-cuda12_1-py3_10-gcc9-inductor-test-weekly: 113 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-build 118 … docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.docker-image }} [all …]
|
| D | inductor-perf-compare.yml | 1 name: inductor-A100-perf-compare 6 - ciflow/inductor-perf-compare/* 16 linux-focal-cuda12_1-py3_10-gcc9-inductor-build: 21 docker-image-name: pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-inductor-benchmarks 33 linux-focal-cuda12_1-py3_10-gcc9-inductor-test: 36 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-build 39 … docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.docker-image }} 40 test-matrix: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.test-matrix }}
|
| D | inductor-perf-test-nightly-aarch64.yml | 1 name: inductor-perf-nightly-aarch64 53 linux-jammy-aarch64-py3_10-inductor-build: 54 name: linux-jammy-aarch64-py3.10-inductor 59 docker-image-name: pytorch-linux-jammy-aarch64-py3.10-gcc11-inductor-benchmarks 104 linux-jammy-aarch64-py3_10-inductor-test-nightly: 105 name: linux-jammy-aarch64-py3.10-inductor 107 needs: linux-jammy-aarch64-py3_10-inductor-build 114 docker-image: ${{ needs.linux-jammy-aarch64-py3_10-inductor-build.outputs.docker-image }} 115 test-matrix: ${{ needs.linux-jammy-aarch64-py3_10-inductor-build.outputs.test-matrix }} 122 linux-jammy-aarch64-py3_10-inductor-test: [all …]
|
| D | inductor-perf-test-nightly-a10g.yml | 1 name: inductor-perf-nightly-A10g 71 linux-focal-cuda12_1-py3_10-gcc9-inductor-build: 76 docker-image-name: pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-inductor-benchmarks 97 linux-focal-cuda12_1-py3_10-gcc9-inductor-test-nightly: 100 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-build 105 … docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.docker-image }} 106 test-matrix: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.test-matrix }} 112 linux-focal-cuda12_1-py3_10-gcc9-inductor-test: 115 needs: linux-focal-cuda12_1-py3_10-gcc9-inductor-build 120 … docker-image: ${{ needs.linux-focal-cuda12_1-py3_10-gcc9-inductor-build.outputs.docker-image }} [all …]
|
| /external/pytorch/test/inductor/ |
| D | test_decompose_mem_bound_mm.py | 1 # Owner(s): ["module: inductor"] 88 torch._logging.set_logs(inductor=logging.DEBUG) 104 counters["inductor"]["decompose_bmm"], 115 counters["inductor"]["decompose_bmm"], 126 torch._logging.set_logs(inductor=logging.DEBUG) 142 counters["inductor"]["decompose_addmm"], 147 counters["inductor"]["decompose_mm"], 150 decompose_mm_fwd = counters["inductor"]["decompose_mm"] 159 counters["inductor"]["decompose_mm"] - decompose_mm_fwd, 173 torch._logging.set_logs(inductor=logging.DEBUG) [all …]
|
| D | test_codecache.py | 1 # Owner(s): ["module: inductor"] 148 self.assertEqual(counters["inductor"]["fxgraph_cache_miss"], 1) 149 self.assertEqual(counters["inductor"]["fxgraph_cache_hit"], 0) 150 self.assertEqual(counters["inductor"]["fxgraph_lookup_write_file"], 0) 158 self.assertEqual(counters["inductor"]["fxgraph_cache_miss"], 1) 159 self.assertEqual(counters["inductor"]["fxgraph_cache_hit"], 1) 160 self.assertEqual(counters["inductor"]["fxgraph_lookup_write_file"], 1) 224 self.assertGreater(counters["inductor"]["fxgraph_cache_miss"], 0) 225 self.assertEqual(counters["inductor"]["fxgraph_cache_hit"], 0) 232 self.assertEqual(counters["inductor"]["fxgraph_cache_miss"], 0) [all …]
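The `test_codecache.py` matches above assert on `counters["inductor"]["fxgraph_cache_hit"]` / `fxgraph_cache_miss`; in PyTorch this counters object is a nested counter structure in `torch._dynamo.utils`. A stdlib-only stand-in sketching the same hit/miss bookkeeping (the cache and `lookup` helper here are illustrative, not the real FX graph cache):

```python
import collections

# Stand-in for the counters used by the tests above: a defaultdict of
# Counter, so counters["inductor"]["fxgraph_cache_hit"] starts at 0.
counters = collections.defaultdict(collections.Counter)

_fake_cache = {}

def lookup(key, compile_fn):
    """Return a cached artifact, counting hits and misses like the tests check."""
    if key in _fake_cache:
        counters["inductor"]["fxgraph_cache_hit"] += 1
        return _fake_cache[key]
    counters["inductor"]["fxgraph_cache_miss"] += 1
    _fake_cache[key] = compile_fn()
    return _fake_cache[key]

lookup("graph0", lambda: "compiled-artifact")  # first call: miss, compiles
lookup("graph0", lambda: "compiled-artifact")  # second call: hit, no compile
print(counters["inductor"]["fxgraph_cache_miss"], counters["inductor"]["fxgraph_cache_hit"])  # 1 1
```

This mirrors the test pattern: run once and expect a miss, run again with a warm cache and expect a hit.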
|
| D | test_select_algorithm.py | 1 # Owner(s): ["module: inductor"] 67 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 84 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 101 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 113 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 127 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 140 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 0) 153 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 165 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) 179 self.assertEqual(counters["inductor"]["select_algorithm_autotune"], 1) [all …]
|
| D | test_standalone_compile.py | 1 # Owner(s): ["module: inductor"] 3 from torch import _dynamo as dynamo, _inductor as inductor 45 mod_opt = inductor.compile(symbolic_trace(mod), [inp]) 53 mod_opt = inductor.compile(symbolic_trace(mod), [inp]) 61 mod_opt = inductor.compile(symbolic_trace(mod), [inp]) 69 mod_opt = inductor.compile(make_fx(mod)(inp), [inp]) 78 mod_opt = inductor.compile(mod, [inp]) 87 mod_opt = inductor.compile(gm, [inp]) 96 mod_opt = inductor.compile(gm, [inp]) 109 mod_opt = inductor.compile(mod, inp)
|
| /external/pytorch/benchmarks/dynamo/microbenchmarks/ |
| D | matmul_relu.py | 11 @torch._dynamo.optimize("inductor", nopython=True) 58 time_with_torch_timer(inductor_mm, (a, b), string_id="inductor mm") 68 inductor mm mean: 0.0653 ms 72 inductor mm mean: 0.0252 ms 76 inductor mm mean: 0.0274 ms 80 inductor mm mean: 0.0244 ms 84 inductor mm mean: 0.0290 ms 88 inductor mm mean: 0.0319 ms 92 inductor mm mean: 0.0255 ms 96 inductor mm mean: 0.5090 ms [all …]
|
| D | inductor_mm.py | 17 @torch._dynamo.optimize("inductor", nopython=True) 22 @torch._dynamo.optimize("inductor", nopython=True) 36 print("shape; torch mm; triton mm; inductor aten mm; inductor triton mm") 65 print("shape; torch mm; triton mm; inductor aten mm; inductor triton mm") 115 shape; torch mm; triton mm; inductor aten mm; inductor triton mm 127 shape; torch mm; triton mm; inductor aten mm; inductor triton mm
|
| /external/pytorch/torch/csrc/inductor/aoti_eager/ |
| D | kernel_holder.h | 9 #include <torch/csrc/inductor/aoti_eager/kernel_meta_info.h> 10 #include <torch/csrc/inductor/aoti_runner/model_container_runner.h> 15 namespace torch::inductor { 46 // The AOTIPythonKernelHolder class uses the AOT Inductor to generate a kernel 52 // Inductor is called again to generate the kernel library. 94 // Invoke python utility function on the Inductor side to produce AOTI kernel 96 // Inductor utility function - 102 // Invoke python utility function on the Inductor side to load AOTI kernel for 104 // Inductor utility function - torch._inductor.utils.load_aoti_eager_cache 111 } // namespace torch::inductor
|
| /external/pytorch/torch/_logging/ |
| D | _registrations.py | 20 register_log("inductor", ["torch._inductor", "torch._inductor.cudagraph_trees"]) 24 "Logs information from wrapping inductor generated code with cudagraphs.", 85 …generated by AOTDispatch, after partitioning. Useful to understand what's being given to Inductor", 99 …nerated by post grad passes. Useful to understand what's being given to Inductor after post grad p… 139 "Prints the code that Inductor generates (either Triton or C++)", 145 "Prints the code that Inductor generates (on a per-kernel basis)", 151 "Inductor scheduler information. Useful if working on Inductor fusion algo", 158 "Detailed Inductor fusion decisions. More detailed than 'schedule'", 168 "Detailed Inductor compute/comm overlap decisions", 188 "Detailed Inductor benchmarking information.",
|
| /external/pytorch/.github/ |
| D | pytorch-probot.yml | 8 - ciflow/inductor 9 - ciflow/inductor-rocm 10 - ciflow/inductor-perf-compare 11 - ciflow/inductor-micro-benchmark 12 - ciflow/inductor-micro-benchmark-cpu-x86 13 - ciflow/inductor-cu124
|
| /external/pytorch/test/distributed/_composable/ |
| D | test_replicate_with_compiler.py | 12 from torch import _inductor as inductor, nn 66 return inductor.compile(gm_, example_inputs_) 76 A version of MultiProcessTestCase that derives from the Inductor TestCase 77 to handle isolation of the inductor cache dir. 219 @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch") 226 @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch") 233 @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch") 247 @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch") 259 # TODO: figure out why we need to disable Inductor to avoid test errors. 264 @unittest.skipIf(not has_triton(), "Inductor+gpu needs triton and recent GPU arch") [all …]
|
| /external/pytorch/.github/scripts/ |
| D | drci_mocks.json.gz | |
| /external/pytorch/torch/utils/_sympy/ |
| D | symbol.py | 26 # Inductor: The intermediates in inner_fn tmp0, one generated per ops call. 30 # Inductor: Placeholder variable that is later replaced with TMP 32 # Inductor: Some size expressions are replaced with a precomputed size ps0 36 # Inductor: An indexing variable i0 in loops IR which ranges over non-reduced 39 # Inductor: A reduction indexing r0 variable in loops IR which ranges over 42 # Inductor: In templated kernels torch._inductor.kernel, we have a hook to 47 # Inductor: iteration domain for blockIdx.x/blockIdx.y 50 # Inductor: this is used solely for dynamic_reshape_indexer
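The `symbol.py` comments above describe a prefix scheme: each kind of Inductor symbol gets a short prefix, yielding numbered names like `tmp0`, `ps0`, `i0`, `r0`. A small illustrative sketch of that scheme (the enum members and `make_symbol` helper here are assumptions for illustration, not the exact `torch.utils._sympy.symbol` API):

```python
import enum

class SymKind(enum.Enum):
    TMP = "tmp"                # intermediates in inner_fn, one generated per ops call
    PRECOMPUTED_SIZE = "ps"    # size expressions replaced with a precomputed size
    INDEX = "i"                # indexing variable ranging over non-reduced dims
    RINDEX = "r"               # reduction indexing variable in loops IR

def make_symbol(kind, idx):
    """Stamp out a prefixed symbol name, e.g. make_symbol(SymKind.TMP, 0) -> 'tmp0'."""
    return f"{kind.value}{idx}"

print(make_symbol(SymKind.TMP, 0), make_symbol(SymKind.RINDEX, 0))  # tmp0 r0
```

Distinct prefixes let later passes recover a symbol's role (temporary, precomputed size, loop index, reduction index) from its name alone.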
|
| /external/pytorch/.ci/pytorch/ |
| D | test.sh | 305 --exclude-inductor-tests \ 317 python test/run_test.py -i inductor/test_torchinductor.py -k test_multi_gpu --verbose 318 python test/run_test.py -i inductor/test_aot_inductor.py -k test_non_default_cuda_device --verbose 319 python test/run_test.py -i inductor/test_aot_inductor.py -k test_replicate_on_devices --verbose 349 python test/run_test.py --inductor \ 354 …# Do not add --inductor for the following inductor unit tests, otherwise we will fail because of n… 356 …--include inductor/test_torchinductor inductor/test_torchinductor_opinfo inductor/test_aot_inducto… 377 echo "Testing Inductor cpp wrapper mode with TORCHINDUCTOR_ABI_COMPATIBLE=1" 379 PYTORCH_TESTING_DEVICE_ONLY_FOR="" python test/run_test.py --include inductor/test_cpu_cpp_wrapper 380 python test/run_test.py --include inductor/test_cuda_cpp_wrapper [all …]
|