
# Performance benchmarks for MLIR based code generation

These benchmarks compare the performance of TensorFlow -> LLVM code generation
with Eigen. They are built on the Google Benchmark library and can be
integrated with performance monitoring tools.

## Running benchmarks

```
bazel run -c opt --cpu=haswell \
  :cwise_op_tanh_benchmark -- --benchmarks="f32/10k"
```
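
To run several benchmarks in one go, the command above can be wrapped in a small loop. This is only a sketch: the `exp` and `log` target names are assumptions modeled on `:cwise_op_tanh_benchmark`, the `f32/10k` filter may not match every target, and `DRY_RUN`, `FILTER`, `TARGETS`, and `run_benchmarks` are hypothetical names introduced here. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical knobs, not part of the benchmarks themselves.
DRY_RUN="${DRY_RUN:-1}"        # 1 = print the commands without running them
FILTER="${FILTER:-f32/10k}"    # passed through to --benchmarks

# Assumed target names, modeled on :cwise_op_tanh_benchmark.
TARGETS=(cwise_op_tanh_benchmark cwise_op_exp_benchmark cwise_op_log_benchmark)

run_benchmarks() {
  local target
  for target in "${TARGETS[@]}"; do
    # Same invocation as the single-benchmark command, one target at a time.
    local cmd=(bazel run -c opt --cpu=haswell ":${target}" -- "--benchmarks=${FILTER}")
    echo "${cmd[*]}"
    if [ "$DRY_RUN" != "1" ]; then
      "${cmd[@]}"
    fi
  done
}

run_benchmarks
```

Run it from this directory with `DRY_RUN=0` to actually invoke Bazel.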

## Using perf and pprof with these benchmarks

1. Record a perf profile

```
perf record -k 1 -o /tmp/perf.data --         \
  bazel run -c opt --cpu=haswell --copt=-gmlt \
  :cwise_op_tanh_benchmark -- --benchmarks="f32/10k"
```

2. Inject data from the JIT-compiled functions

```
perf inject -j -v -i /tmp/perf.data -o /tmp/perf.data.jit
```

3. Report perf data

```
perf report -i /tmp/perf.data.jit
```

or

```
pprof -flame -nodecount=10000 /tmp/perf.data.jit
```
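
The three steps above can be chained in one script. The sketch below is not part of the benchmark suite: `TARGET`, `FILTER`, `PERF_DATA`, `DRY_RUN`, `run`, and `profile` are hypothetical names introduced here, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical knobs, overridable from the environment.
TARGET="${TARGET:-:cwise_op_tanh_benchmark}"
FILTER="${FILTER:-f32/10k}"
PERF_DATA="${PERF_DATA:-/tmp/perf.data}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands without running them

# Print a command, then execute it unless DRY_RUN=1.
run() {
  printf '%s\n' "$*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

profile() {
  # Step 1: record a profile of the benchmark run.
  # -k is perf record's --clockid option; the value 1 is kept from the steps above.
  # --copt is the Bazel flag that passes an option to the compiler.
  run perf record -k 1 -o "$PERF_DATA" -- \
    bazel run -c opt --cpu=haswell --copt=-gmlt \
    "$TARGET" -- "--benchmarks=$FILTER"

  # Step 2: inject data from the JIT-compiled functions into a new file.
  run perf inject -j -v -i "$PERF_DATA" -o "$PERF_DATA.jit"

  # Step 3: report on the injected profile.
  run perf report -i "$PERF_DATA.jit"
}

profile
```

Because `set -e` is active, a failing step stops the script before the later steps read a missing or incomplete profile.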

<!-- BEGIN GOOGLE-INTERNAL -->
## Running benchmarks using perflab and benchy

1. go/benchy
2. go/perflab

```
benchy                                                                        \
  --reference=${reference} --cpu=haswell --runs=20 --benchmarks=all           \
  --perflab --borg_constraints="platform_family_genus_cpu=indus-skylake-2000" \
  third_party/tensorflow/compiler/mlir/tfrt/benchmarks:cwise_op_tanh_benchmark
```

As of Q1 2021, `indus-skylake-2000` is the machine of the day, and roughly 60%
of fleet cycles are executed on Skylakes.

Reference can be:

1. A CL number, to test against another pending change.
2. `srcfs`, to test against the g3 head.
3. Another client number, to test local changes without exporting them.

<!-- END GOOGLE-INTERNAL -->