# Benchmark tl;dr:

1. The time for JNI calls is negligible, comparable to a main memory load, so
   unless you are doing JNI in a big loop, you can ignore its cost.
1. `AttachCurrentThread()` costs about the same as a trivial JNI call.
1. Sending a few parameters (< 50 bytes) in a JNI call might only add 25-50%
   extra latency.
1. For sending a lot of data, primitive arrays are the most performant way,
   even if Java has to convert a List to an array and unbox first.
1. If you need to traverse a Java container from C++ in place rather than
   copying it over, traversing Java arrays is 2x faster than doing the same for
   Java Lists, but it is still much faster to convert all the way to a C++
   vector.
1. Converting a Java List to a C++ vector before iterating is ~10x faster than
   traversing the Java List directly and 5x faster than traversing a Java array
   directly (for arrays ~= 10,000 long). This includes the time for type
   conversion.
1. Direct access to primitive arrays is incredibly fast (e.g. `ByteArrayView` or
   direct JNI API calls from `<jni.h>`).
1. Non-ASCII UTF-16 strings are the fastest to convert to Java strings, followed
   by ASCII strings in general, and finally the slowest is non-ASCII UTF-16 to
   UTF-8 conversion (because optimizations).
1. Integer boxing and unboxing can be expensive if done through JNI
   and is very cheap when done on Java's side.


# How to run the benchmarks

1. Add a dep from a java target onto `//third_party/jni_zero/benchmarks:benchmark_java`.
1. Add a dep from a native target onto `//third_party/jni_zero/benchmarks:benchmark_native_side`.
1. Add a call from java to `org.jni_zero.benchmark.Benchmark.runBenchmark()`.

## How to run the generated classes benchmark

The generated classes benchmark uses a large number of generated classes, so they are not committed into the repo.

1. Run the script `generated/generate.py` and it will create the generated files in its own directory.
1. Set `_enable_generated_benchmark = true` in `BUILD.gn`.
1. Add a call to `BenchmarkJni.get().runGeneratedClassesBenchmark()` from `org.jni_zero.benchmark.Benchmark.runBenchmark()`.

# Benchmark Detailed Results:

The numbers here are not exact since it's hard to control for things like garbage
collection and how busy the phone was at the time. Trivial benchmarks show the
most variance.

## Trivial calls without parameters or return values

### Java -> C++

```java
// Java
BenchmarkJni.get().callMe();
```

```c++
// C++
static void JNI_Benchmark_CallMe(JNIEnv* env) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per Call  | 30 ns       | 130 ns             |


### C++ -> Java

```c++
// C++
Java_Benchmark_callMe(env);
```

```java
// Java
@CalledByNative
static void callMe() {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per Call  | 50 ns       | 380 ns             |

### AttachCurrentThread()

```c++
// C++
AttachCurrentThread();
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per Call  | 80 ns       | 360 ns             |

## Sending Primitive containers (primitive arrays or collections with autoboxed primitives).

### Java int[10000] -> C++ jintArray and reading the jintArray's element memory directly via the GetIntArrayElements JNI API (vs copying it to a vector).

This is the fastest possible way of sending data since there is no conversion needed and no extra copies into a vector.

```java
// Java
int[] intArray = new int[10000];
BenchmarkJni.get().sendLargeIntArray(intArray);
```
```c++
// C++
void JNI_Benchmark_SendLargeIntArray(
    JNIEnv* env,
    const jni_zero::JavaParamRef<jintArray>& j_array) {
  size_t array_size = static_cast<size_t>(env->GetArrayLength(j_array.obj()));
  jint* array = env->GetIntArrayElements(j_array.obj(), nullptr);
  for (size_t i = 0; i < array_size; i++) {
    count += array[i];
  }
  env->ReleaseIntArrayElements(j_array.obj(), array, 0);
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 int array | 19,000 ns | 23,000 ns |
| Time per int (amortized) | 1.9 ns | 2.3 ns |


### Java int[10000] -> C++ std::vector<int> using @JniType conversions.

```java
// Java
int[] intArray = new int[10000];
BenchmarkJni.get().sendLargeIntArrayConverted(intArray);
```
```c++
// C++
void JNI_Benchmark_SendLargeIntArrayConverted(
    JNIEnv* env,
    std::vector<int32_t>& array) {
  for (size_t i = 0; i < array.size(); i++) {
    count += array[i];
  }
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 int array | 27,000 ns | 66,000 ns |
| Time per int (amortized) | 2.7 ns | 6.6 ns |


### C++ std::vector<int>(10000) -> Java int[] using @JniType conversions.

```c++
// C++
std::vector<int> array(10000);
Java_Benchmark_receiveLargeIntArray(env, array);
```

```java
// Java
@CalledByNative
static void receiveLargeIntArray(@JniType("std::vector<int32_t>") int[] array) {
    for (int i = 0; i < array.length; i++) {
        count += array[i];
    }
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 int array | 42,700 ns | 150,000 ns |
| Time per int (amortized) | 4.3 ns | 15 ns |


### Converting an ArrayList<Integer>(10000) to an int[] via stream().mapToInt() and then to a std::vector<int> (all conversion time counted).


```java
// Java
List<Integer> integerList = new ArrayList(10000);
int[] streamedIntArray =
        integerList.stream().mapToInt((integer) -> integer.intValue()).toArray();
BenchmarkJni.get().sendLargeIntArrayConverted(streamedIntArray);
```

```c++
// C++
static void JNI_Benchmark_SendLargeIntArrayConverted(
    JNIEnv* env,
    std::vector<int32_t>& array) {
  for (size_t i = 0; i < array.size(); i++) {
    count += array[i];
  }
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 Integer List | 145,000 ns | 1,200,000 ns |
| Time per Integer (amortized) | 14.5 ns | 120 ns |


### Traversing a Java Integer[10000] array from C++ using GetObjectArrayElement and doing the unboxing manually.


```java
// Java
Integer[] integerArray = new Integer[10000];
BenchmarkJni.get().sendLargeObjectArray(integerArray);
```

```c++
// C++
void JNI_Benchmark_SendLargeObjectArray(
    JNIEnv* env,
    const JavaParamRef<jobjectArray>& j_array) {
  size_t array_size = static_cast<size_t>(env->GetArrayLength(j_array.obj()));
  for (size_t i = 0; i < array_size; i++) {
    count += JNI_Integer::Java_Integer_intValue(
        env, JavaParamRef(env, env->GetObjectArrayElement(j_array.obj(), i)));
  }
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 Integer array | 800,000 ns | 6,000,000 ns |
| Time per Integer (amortized) | 80 ns | 600 ns |


### Traversing a Java List<Integer>(10000) from C++ using List.get() and doing the unboxing manually.

```java
// Java
List<Integer> integerList = new ArrayList(10000);
BenchmarkJni.get().sendLargeObjectList(integerList);
```

```c++
// C++
static void JNI_Benchmark_SendLargeObjectList(
    JNIEnv* env,
    const JavaParamRef<jobject>& j_list) {
  size_t array_size = static_cast<size_t>(CollectionSize(env, j_list));
  for (size_t i = 0; i < array_size; i++) {
    count += JNI_Integer::Java_Integer_intValue(env, ListGet(env, j_list, i));
  }
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 Integer List | 1,500,000 ns | 11,700,000 ns |
| Time per Integer (amortized) | 150 ns | 1,170 ns |


## Sending naked integers as parameters (not in a container). {.numbered}

### Sending 10000 ints from Java -> C++ one at a time (each call sends a single int as a parameter).


```java
// Java
for (int i = 0; i < 10000; i++) {
    BenchmarkJni.get().sendSingleInt(i);
}
```

```c++
// C++
static void JNI_Benchmark_SendSingleInt(JNIEnv* env, jint param) {
  count += param;
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 ints | 300,000 ns | 1,400,000 ns |
| Time per int | 40 ns | 140 ns |


### Sending 10000 ints from C++ -> Java one at a time (each call sends a single int as a parameter).


```c++
// C++
for (int i = 0; i < 10000; i++) {
  Java_Benchmark_receiveSingleInt(env, i);
}
```

```java
// Java
@CalledByNative
static void receiveSingleInt(int param) {
    count += param;
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 ints | 452,000 ns | 3,400,000 ns |
| Time per int | 45.2 ns | 340 ns |

### Sending 100000 ints 10 at a time from Java -> C++

```java
// Java
// 10000 calls x 10 ints per call = 100000 ints.
for (int i = 0; i < 10000; i++) {
    BenchmarkJni.get()
            .send10Ints(i, i, i, i, i, i, i, i, i, i);
}
```

```c++
// C++
static void JNI_Benchmark_Send10Ints(JNIEnv* env,
                                     jint a,
                                     jint b,
                                     jint c,
                                     jint d,
                                     jint e,
                                     jint f,
                                     jint g,
                                     jint h,
                                     jint i,
                                     jint j) {
  count += a + b + c + d + e + f + g + h + i + j;
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10 ints | 60 ns | 170 ns |
| Time per int | 6 ns | 17 ns |


### Sending 100000 ints 10 at a time from C++ -> Java

```c++
// C++
for (int i = 0; i < 10000; i++) {
  Java_Benchmark_receive10Ints(env, i, i, i, i, i, i, i, i, i, i);
}
```

```java
// Java
@CalledByNative
static void receive10Ints(
        int a, int b, int c, int d, int e, int f, int g, int h, int i, int j) {
    count += a + b + c + d + e + f + g + h + i + j;
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10 ints | 100 ns | 550 ns |
| Time per int | 10 ns | 55 ns |

### Sending 10000 Integers from Java -> C++ ints converted using @JniType one at a time (each call sends a single Integer as a parameter).

```java
// Java
for (int i = 0; i < 10000; i++) {
    BenchmarkJni.get().sendSingleInteger(i);
}
```

```c++
// C++
static void JNI_Benchmark_SendSingleInteger(
    JNIEnv* env,
    const JavaParamRef<jobject>& param) {
  count += JNI_Integer::Java_Integer_intValue(env, param);
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 Integers | 1,100,000 ns | 6,500,000 ns |
| Time per Integer | 110 ns | 650 ns |


### Sending 10000 ints from C++ -> Java Integers converted using @JniType one at a time (each call sends a single int as a parameter).


```c++
// C++
for (int i = 0; i < 10000; i++) {
  Java_Benchmark_receiveSingleInteger(env, i);
}
```

```java
// Java
@CalledByNative
static void receiveSingleInteger(@JniType("int32_t") Integer param) {
    count += param;
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10000 Integers | 1,500,000 ns | 10,000,000 ns |
| Time per Integer | 150 ns | 1,000 ns |


### Sending 100000 Integers 10 at a time from Java -> C++

```java
// Java
Integer a = 1;
for (int k = 0; k < 10000; k++) {
    BenchmarkJni.get().send10Integers(a, a, a, a, a, a, a, a, a, a);
}
```

```c++
// C++
static void JNI_Benchmark_Send10Integers(JNIEnv* env,
                                         const JavaParamRef<jobject>& a,
                                         const JavaParamRef<jobject>& b,
                                         const JavaParamRef<jobject>& c,
                                         const JavaParamRef<jobject>& d,
                                         const JavaParamRef<jobject>& e,
                                         const JavaParamRef<jobject>& f,
                                         const JavaParamRef<jobject>& g,
                                         const JavaParamRef<jobject>& h,
                                         const JavaParamRef<jobject>& i,
                                         const JavaParamRef<jobject>& j) {
  count += JNI_Integer::Java_Integer_intValue(env, a);
  count += JNI_Integer::Java_Integer_intValue(env, b);
  count += JNI_Integer::Java_Integer_intValue(env, c);
  count += JNI_Integer::Java_Integer_intValue(env, d);
  count += JNI_Integer::Java_Integer_intValue(env, e);
  count += JNI_Integer::Java_Integer_intValue(env, f);
  count += JNI_Integer::Java_Integer_intValue(env, g);
  count += JNI_Integer::Java_Integer_intValue(env, h);
  count += JNI_Integer::Java_Integer_intValue(env, i);
  count += JNI_Integer::Java_Integer_intValue(env, j);
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10 Integers | 800 ns | 4,500 ns |
| Time per Integer | 80 ns | 450 ns |


### Sending 100000 Integers 10 at a time from C++ -> Java

```c++
// C++
for (int i = 0; i < 10000; i++) {
  Java_Benchmark_receive10IntegersConverted(
      env, i, i, i, i, i, i, i, i, i, i);
}
```

```java
// Java
@CalledByNative
static void receive10IntegersConverted(
        @JniType("int") Integer a,
        @JniType("int") Integer b,
        @JniType("int") Integer c,
        @JniType("int") Integer d,
        @JniType("int") Integer e,
        @JniType("int") Integer f,
        @JniType("int") Integer g,
        @JniType("int") Integer h,
        @JniType("int") Integer i,
        @JniType("int") Integer j) {
    count += a + b + c + d + e + f + g + h + i + j;
}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 10 Integers | 1,400 ns | 7,000 ns |
| Time per Integer | 140 ns | 700 ns |


## Sending Strings

```java
// Java Strings init.
StringBuilder sb = new StringBuilder();
for (int i = 0; i < 1000; i++) {
    sb.append('a');
}
String asciiString = sb.toString();
sb = new StringBuilder();
for (int i = 0; i < 1000; i++) {
    sb.append('ق');
}
String nonAsciiString = sb.toString();
```
```c++
// C++ strings init.
std::string u8_ascii_string = "";
std::string u8_non_ascii_string = "";
std::u16string u16_ascii_string = u"";
std::u16string u16_non_ascii_string = u"";
for (int i = 0; i < 1000; i++) {
  u8_ascii_string += "a";
  u8_non_ascii_string += "ق";
  u16_ascii_string += u"a";
  u16_non_ascii_string += u"ق";
}
```

### Sending a 1000 long ASCII String from Java to C++ std::string

```java
// Java
BenchmarkJni.get().sendAsciiStringConvertedToU8(asciiString);
```
```c++
// C++
static void JNI_Benchmark_SendAsciiStringConvertedToU8(JNIEnv* env,
                                                       std::string& param) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 1,000 ns | 4,000 ns |
| Time per character | 1 ns | 4 ns |

### Sending a 1000 long ASCII String from Java to C++ std::u16string

```java
// Java
BenchmarkJni.get().sendAsciiStringConvertedToU16(asciiString);
```
```c++
// C++
static void JNI_Benchmark_SendAsciiStringConvertedToU16(JNIEnv* env,
                                                        std::u16string& param) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 600 ns | 1,700 ns |
| Time per character | 0.6 ns | 1.7 ns |

### Sending a 1000 long non-ASCII String from Java to C++ std::string

```java
// Java
BenchmarkJni.get().sendNonAsciiStringConvertedToU8(nonAsciiString);
```
```c++
// C++
static void JNI_Benchmark_SendNonAsciiStringConvertedToU8(JNIEnv* env,
                                                          std::string& param) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 4,000 ns | 16,000 ns |
| Time per character | 4 ns | 16 ns |

### Sending a 1000 long non-ASCII String from Java to C++ std::u16string

```java
// Java
BenchmarkJni.get().sendNonAsciiStringConvertedToU16(nonAsciiString);
```
```c++
// C++
static void JNI_Benchmark_SendNonAsciiStringConvertedToU16(
    JNIEnv* env,
    std::u16string& param) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 200 ns | 1,400 ns |
| Time per character | 0.2 ns | 1.4 ns |

### Sending a 1000 long ASCII std::string from C++ to Java

```c++
// C++
Java_Benchmark_receiveU8String(env, u8_ascii_string);
```
```java
// Java
@CalledByNative
static void receiveU8String(@JniType("std::string") String s) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 3,000 ns | 11,500 ns |
| Time per character | 3 ns | 11.5 ns |

### Sending a 1000 long ASCII std::u16string from C++ to Java

```c++
// C++
Java_Benchmark_receiveU16String(env, u16_ascii_string);
```
```java
// Java
@CalledByNative
static void receiveU16String(@JniType("std::u16string") String s) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 2,000 ns | 8,400 ns |
| Time per character | 2 ns | 8.4 ns |

### Sending a 1000 long non-ASCII std::string from C++ to Java

```c++
// C++
Java_Benchmark_receiveU8String(env, u8_non_ascii_string);
```
```java
// Java
@CalledByNative
static void receiveU8String(@JniType("std::string") String s) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 5,500 ns | 25,000 ns |
| Time per character | 5.5 ns | 25 ns |

### Sending a 1000 long non-ASCII std::u16string from C++ to Java

```c++
// C++
Java_Benchmark_receiveU16String(env, u16_non_ascii_string);
```
```java
// Java
@CalledByNative
static void receiveU16String(@JniType("std::u16string") String s) {}
```


|                | Pixel 7A    | Samsung Galaxy A13 |
| -------------- | :---------: | :----------------: |
| Time per 1000 characters | 1,000 ns | 3,200 ns |
| Time per character | 1 ns | 3.2 ns |
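
## Sketch: how per-call and amortized numbers can be derived

The results above report a time per batch and, for containers, an amortized time per element, which appears to be the batch time divided by the element count (for example 19,000 ns / 10000 ints = 1.9 ns). The timing harness itself is not shown in this document, so the following is only a minimal sketch of one way to collect such numbers, assuming a plain `std::chrono::steady_clock` loop. `MeanNanosPerCall` and `AmortizedNanos` are hypothetical helper names used for illustration and are not part of jni_zero.

```c++
// C++
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of jni_zero): runs `fn` `iterations` times
// and returns the mean wall-clock time per call in nanoseconds.
template <typename Fn>
double MeanNanosPerCall(Fn&& fn, int iterations) {
  assert(iterations > 0);
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; i++) {
    fn();
  }
  const auto end = std::chrono::steady_clock::now();
  const auto total_ns =
      std::chrono::duration_cast<std::chrono::nanoseconds>(end - start)
          .count();
  return static_cast<double>(total_ns) / iterations;
}

// Hypothetical helper: spreads the cost of one batch (for example one call
// that sends a 10000-element array) over its elements.
double AmortizedNanos(double batch_ns, size_t element_count) {
  assert(element_count > 0);
  return batch_ns / static_cast<double>(element_count);
}
```

In a real run, `fn` would wrap one of the calls shown above (for example a lambda that calls `Java_Benchmark_callMe(env)`). Averaging over many iterations helps smooth out the garbage collection and device-load noise mentioned at the top of the results, although it does not remove it.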