# Overview of performance test suite, with steps for manual runs:

For the design of the tests, see https://grpc.io/docs/guides/benchmarking.

For scripts related to the GKE-based performance test suite (in development),
see [gRPC OSS benchmarks](#grpc-oss-benchmarks).

## Pre-reqs for running these manually:

In general, the benchmark workers and driver build scripts expect
[linux_performance_worker_init.sh](../../gce/linux_performance_worker_init.sh)
to have been run already.

### To run benchmarks locally:

- From the grpc repo root, start the
  [run_performance_tests.py](../run_performance_tests.py) runner script.

### On remote machines, to start the driver and workers manually:

The [run_performance_tests.py](../run_performance_tests.py) top-level runner
script can also be used with remote machines, but for tasks such as profiling
the server, it can be useful to start the workers manually.

1. You'll need a "driver" machine and separate "worker" machines. For example,
   you might use one GCE "driver" machine and 3 other GCE "worker" machines in
   the same zone.

2. Connect to each worker machine and start up a benchmark worker with a
   "driver_port".

   - For example, to start the grpc-go benchmark worker:
     [grpc-go worker main.go](https://github.com/grpc/grpc-go/blob/master/benchmark/worker/main.go)
     --driver_port <driver_port>

#### Commands to start workers in different languages:

- Note that these commands are what the top-level
  [run_performance_tests.py](../run_performance_tests.py) script uses to build
  and run different workers, through the
  [build_performance.sh](./build_performance.sh) script and the "run worker"
  scripts (such as [run_worker_java.sh](./run_worker_java.sh)).

##### Running benchmark workers for C-core wrapped languages (C++, Python, C#, Node, Ruby):

- These are simpler, since they all live in the main grpc repo.
```
$ cd <grpc_repo_root>
$ tools/run_tests/performance/build_performance.sh
$ tools/run_tests/performance/run_worker_<language>.sh
```

- Note that there is one "run_worker" script per language, e.g.,
  [run_worker_csharp.sh](./run_worker_csharp.sh) for C#.

##### Running benchmark workers for gRPC-Java:

- You'll need the [grpc-java](https://github.com/grpc/grpc-java) repo.

```
$ cd <grpc-java-repo>
$ ./gradlew -PskipCodegen=true -PskipAndroid=true :grpc-benchmarks:installDist
$ benchmarks/build/install/grpc-benchmarks/bin/benchmark_worker --driver_port <driver_port>
```

##### Running benchmark workers for gRPC-Go:

- You'll need the [grpc-go repo](https://github.com/grpc/grpc-go).

```
$ cd <grpc-go-repo>/benchmark/worker && go install
$ # if profiling, it might be helpful to turn off inlining by building with "-gcflags=-l"
$ $GOPATH/bin/worker --driver_port <driver_port>
```

#### Build the driver:

- Connect to the driver machine (if using a remote driver) and, from the grpc
  repo root, run:

```
$ tools/run_tests/performance/build_performance.sh
```

#### Run the driver:

1. Get the scenario JSON for the scenario to run. Note that scenario JSON
   configs are generated from [scenario_config.py](./scenario_config.py). The
   [driver](../../../test/cpp/qps/qps_json_driver.cc) takes a list of these
   configs as a JSON string of the form `{scenario: <json_list_of_scenarios>}`
   in its `--scenarios_json` command argument. One quick way to get a valid
   JSON string to pass to the driver is to run
   [run_performance_tests.py](./run_performance_tests.py) locally and copy the
   logged scenario JSON command arg.

2. From the grpc repo root:

- Set the `QPS_WORKERS` environment variable to a comma-separated list of
  worker machines.
  Note that the driver will start the "benchmark server" on the
  first entry in the list, and the rest will be told to run as clients against
  the benchmark server.

Example running and profiling of the go benchmark server:

```
$ export QPS_WORKERS=<host1>:10000,<host2>:10000,<host3>:10000
$ bins/opt/qps_json_driver --scenarios_json='<scenario_json_scenario_config_string>'
```

### Example profiling commands

While running the benchmark, a profiler can be attached to the server.

Example to count syscalls in the grpc-go server during a benchmark:

- Connect to the server machine and run:

```
$ netstat -tulpn | grep <driver_port> # to get the pid of the worker
$ perf stat -p <worker_pid> -e syscalls:sys_enter_write # stop after the test completes
```

Example memory profile of the grpc-go server, with `go tool pprof`:

- After a run is done on the server, see its alloc profile with:

```
$ go tool pprof --text --alloc_space http://localhost:<pprof_port>/debug/heap
```

### Configuration environment variables:

- QPS_WORKER_CHANNEL_CONNECT_TIMEOUT

  Consuming process: qps_worker

  Type: integer (number of seconds)

  This can be used to configure the amount of time that benchmark clients wait
  for channels to the benchmark server to become ready. This is useful in
  certain benchmark environments in which the server can take a long time to
  become ready. Note: if setting this to a high value, then the scenario config
  under test should probably also have a large "warmup_seconds".

- QPS_WORKERS

  Consuming process: qps_json_driver

  Type: comma-separated list of host:port

  Set this to a comma-separated list of QPS worker processes/machines.
  Each
  scenario in a scenario config specifies a certain number of servers,
  `num_servers`, and the driver will start "benchmark servers" on the first
  `num_servers` `host:port` pairs in the comma-separated list. The rest will be
  told to run as clients against the benchmark servers.

## gRPC OSS benchmarks

The scripts in this section generate LoadTest configurations for the GKE-based
gRPC OSS benchmarks framework. This framework is stored in a separate
repository, [grpc/test-infra](https://github.com/grpc/test-infra).

### Generating scenarios

The benchmarks framework uses the same test scenarios as the legacy one. The
script [scenario_config_exporter.py](./scenario_config_exporter.py) can be used
to export these scenarios to files, and also to count and analyze existing
scenarios.

The language(s) and category of the scenarios are of particular importance to
the tests. Continuous runs will typically run tests in the `scalable` category.

The following example counts scenarios in the `scalable` category:

```
$ ./tools/run_tests/performance/scenario_config_exporter.py --count_scenarios --category=scalable
Scenario count for all languages (category: scalable):
Count  Language         Client   Server   Categories
   77  c++                                scalable
   19  python_asyncio                     scalable
   16  java                               scalable
   12  go                                 scalable
   12  node                      node     scalable
   12  node_purejs               node     scalable
    9  csharp                             scalable
    7  python                             scalable
    5  ruby                               scalable
    4  csharp                    c++      scalable
    4  php7                      c++      scalable
    4  php7_protobuf_c           c++      scalable
    3  python_asyncio            c++      scalable
    2  ruby                      c++      scalable
    2  python                    c++      scalable
    1  csharp           c++               scalable

  189  total scenarios (category: scalable)
```

Client and server languages are only set for cross-language scenarios, where
the client or server language does not match the scenario language.
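Conceptually, the count shown above is a grouping of scenarios by language, cross-language client, cross-language server, and category. The sketch below illustrates that grouping with hypothetical hand-written records; it does not use the real `scenario_config.py` data or the exporter's actual code:

```python
from collections import Counter

# Hypothetical scenario records, illustrating the fields the count is based on.
# An absent client/server language means it matches the scenario language.
scenarios = [
    {"language": "c++", "category": "scalable"},
    {"language": "c++", "category": "scalable"},
    {"language": "csharp", "server_language": "c++", "category": "scalable"},
    {"language": "go", "category": "sweep"},
]

def count_scenarios(scenarios, category):
    # Group by (language, client language, server language), keeping only
    # scenarios in the requested category; "" means same-language.
    return Counter(
        (s["language"], s.get("client_language", ""), s.get("server_language", ""))
        for s in scenarios
        if s["category"] == category
    )

counts = count_scenarios(scenarios, "scalable")
print(counts[("c++", "", "")])        # 2
print(counts[("csharp", "", "c++")])  # 1
```

The "sweep" go scenario is excluded from the count because it is not in the requested category, mirroring the `--category=scalable` filter above.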
### Generating load test configurations

The benchmarks framework uses LoadTest resources configured by YAML files. Each
LoadTest resource specifies a driver, a server, and one or more clients to run
the test. Each test runs one scenario. The scenario configuration is embedded in
the LoadTest configuration. Example configurations for various languages can be
found here:

https://github.com/grpc/test-infra/tree/master/config/samples

The script [loadtest_config.py](./loadtest_config.py) generates LoadTest
configurations for tests running a set of scenarios. The configurations are
written in multipart YAML format, either to a file or to stdout. Each
configuration contains a single embedded scenario.

The LoadTest configurations are generated from a template. Any configuration can
be used as a template, as long as it contains the languages required by the set
of scenarios we intend to run (for instance, if we are generating configurations
to run go scenarios, the template must contain a go client and a go server; if
we are generating configurations for cross-language scenarios that need a go
client and a C++ server, the template must also contain a C++ server; and the
same for all other languages).

The LoadTests specified in the script output all have unique names and can be
run by applying the test to a cluster running the LoadTest controller with
`kubectl apply`:

```
$ kubectl apply -f loadtest_config.yaml
```

A basic template for generating tests in various languages can be found here:
[loadtest_template_basic_all_languages.yaml](./templates/loadtest_template_basic_all_languages.yaml).
The following example generates configurations for Go and Java tests using this
template, including tests against C++ clients and servers, and running each test
twice:

```
$ ./tools/run_tests/performance/loadtest_config.py -l go -l java \
    -t ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \
    -s client_pool=workers-8core -s server_pool=workers-8core \
    -s big_query_table=grpc-testing.e2e_benchmarks.experimental_results \
    -s timeout_seconds=3600 --category=scalable \
    -d --allow_client_language=c++ --allow_server_language=c++ \
    --runs_per_test=2 -o ./loadtest.yaml
```

The script `loadtest_config.py` takes the following options:

- `-l`, `--language`<br> Language to benchmark. May be repeated.
- `-t`, `--template`<br> Template file. A template is a configuration file that
  may contain multiple client and server configurations, and may also include
  substitution keys.
- `-p`, `--prefix`<br> Test names consist of a prefix joined to a uuid with a
  dash. Test names are stored in `metadata.name`. The prefix is also added as
  the `prefix` label in `metadata.labels`. The prefix defaults to the user name
  if not set.
- `-u`, `--uniquifier_element`<br> Uniquifier elements may be passed to the test
  to make the test name unique. This option may be repeated to add multiple
  elements. The uniquifier elements (plus a date string and a run index, if
  applicable) are joined with a dash to form a _uniquifier_. The test name uuid
  is derived from the scenario name and the uniquifier. The uniquifier is also
  added as the `uniquifier` annotation in `metadata.annotations`.
- `-d`<br> This option is a shorthand for the addition of a date string as a
  uniquifier element.
- `-a`, `--annotation`<br> Metadata annotation to be stored in
  `metadata.annotations`, in the form key=value. May be repeated.
- `-r`, `--regex`<br> Regex to select scenarios to run. Each scenario is
  embedded in a LoadTest configuration containing a client and server of the
  language(s) required for the test. Defaults to `.*`, i.e., select all
  scenarios.
- `--category`<br> Select scenarios of a specified _category_, or of all
  categories. Defaults to `all`. Continuous runs typically run tests in the
  `scalable` category.
- `--allow_client_language`<br> Allows cross-language scenarios where the client
  is of a specified language, different from the scenario language. This is
  typically `c++`. This flag may be repeated.
- `--allow_server_language`<br> Allows cross-language scenarios where the server
  is of a specified language, different from the scenario language. This is
  typically `node` or `c++`. This flag may be repeated.
- `--runs_per_test`<br> This option specifies that each test should be repeated
  `n` times, where `n` is the value of the flag. If `n` > 1, the index of each
  test run is added as a uniquifier element for that run.
- `-o`, `--output`<br> Output file name. The LoadTest configurations are added
  to this file, in multipart YAML format. Output is streamed to `sys.stdout` if
  not set.

The script adds labels and annotations to the metadata of each LoadTest
configuration.

The following labels are added to `metadata.labels`:

- `language`<br> The language of the LoadTest scenario.
- `prefix`<br> The prefix used in `metadata.name`.

The following annotations are added to `metadata.annotations`:

- `scenario`<br> The name of the LoadTest scenario.
- `uniquifier`<br> The uniquifier used to generate the LoadTest name, including
  the run index if applicable.

[Labels](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/)
can be used in selectors in resource queries.
Adding the prefix, in particular,
allows the user (or an automation script) to select the resources started from a
given run of the config generator.

[Annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/)
contain additional information that is available to the user (or an automation
script) but is not indexed and cannot be used to select objects. Scenario name
and uniquifier are added to provide the elements of the LoadTest name uuid in
human-readable form. Additional annotations may be added later for automation.

### Concatenating load test configurations

The LoadTest configuration generator can process multiple languages at a time,
assuming that they are supported by the template. The convenience script
[loadtest_concat_yaml.py](./loadtest_concat_yaml.py) is provided to concatenate
several YAML files into one, so configurations generated by multiple generator
invocations can be concatenated into one and run with a single command. The
script can be invoked as follows:

```
$ loadtest_concat_yaml.py -i infile1.yaml infile2.yaml -o outfile.yaml
```

### Generating configuration templates

The script [loadtest_template.py](./loadtest_template.py) generates a load test
configuration template from a set of load test configurations. The source files
may be load test configurations or load test configuration templates. The
generated template supports all languages supported in any of the input
configurations or templates.
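Conceptually, generating a template that supports every input language amounts to collecting the client and server entries of all input configurations and keeping one per language. The sketch below illustrates that idea under the assumption that client and server entries carry a `language` field, as in the sample configurations; it is not the real `loadtest_template.py` logic:

```python
def merge_specs(configs):
    """Keep one client and one server per language across all input configs."""
    clients, servers = {}, {}
    for config in configs:
        for client in config["spec"].get("clients", []):
            clients.setdefault(client["language"], client)
        for server in config["spec"].get("servers", []):
            servers.setdefault(server["language"], server)
    return {"spec": {"clients": list(clients.values()),
                     "servers": list(servers.values())}}

# Two illustrative single-language configurations.
go_config = {"spec": {"clients": [{"language": "go"}],
                      "servers": [{"language": "go"}]}}
java_config = {"spec": {"clients": [{"language": "java"}],
                        "servers": [{"language": "java"}]}}

merged = merge_specs([go_config, java_config])
print(sorted(c["language"] for c in merged["spec"]["clients"]))  # ['go', 'java']
```

The merged result can then serve as a template for generating tests in either language, which is why any sufficiently rich configuration can itself be used as a template.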
The example template in
[loadtest_template_basic_all_languages.yaml](./templates/loadtest_template_basic_all_languages.yaml)
was generated from the example configurations in
[grpc/test-infra](https://github.com/grpc/test-infra) by the following command:

```
$ ./tools/run_tests/performance/loadtest_template.py \
    -i ../test-infra/config/samples/*.yaml \
    --inject_client_pool --inject_server_pool --inject_big_query_table \
    --inject_timeout_seconds \
    -o ./tools/run_tests/performance/templates/loadtest_template_basic_all_languages.yaml \
    --name basic_all_languages
```

The script `loadtest_template.py` takes the following options:

- `-i`, `--inputs`<br> Space-separated list of the names of input files
  containing LoadTest configurations. May be repeated.
- `-o`, `--output`<br> Output file name. Outputs to `sys.stdout` if not set.
- `--inject_client_pool`<br> If this option is set, the pool attribute of all
  clients in `spec.clients` is set to `${client_pool}`, for later substitution.
- `--inject_server_pool`<br> If this option is set, the pool attribute of all
  servers in `spec.servers` is set to `${server_pool}`, for later substitution.
- `--inject_big_query_table`<br> If this option is set,
  `spec.results.bigQueryTable` is set to `${big_query_table}`.
- `--inject_timeout_seconds`<br> If this option is set, `spec.timeoutSeconds` is
  set to `${timeout_seconds}`.
- `--inject_ttl_seconds`<br> If this option is set, `spec.ttlSeconds` is set to
  `${ttl_seconds}`.
- `-n`, `--name`<br> Name to be set in `metadata.name`.
- `-a`, `--annotation`<br> Metadata annotation to be stored in
  `metadata.annotations`, in the form key=value. May be repeated.

The four options that inject substitution keys are the most useful for template
When running tests on different node pools, it becomes necessary to set 371the pool, and usually also to store the data on a different table. When running 372as part of a larger collection of tests, it may also be necessary to adjust test 373timeout and time-to-live, to ensure that all tests have time to complete. 374 375The template name is replaced again by `loadtest_config.py`, and so is set only 376as a human-readable memo. 377 378Annotations, on the other hand, are passed on to the test configurations, and 379may be set to values or to substitution keys in themselves, allowing future 380automation scripts to process the tests generated from these configurations in 381different ways. 382