Platform: NVIDIA CUDA
  Device: GeForce RTX 2080 Ti
    Driver version  : 415.27 (Linux x64)
    Compute units   : 68
    Clock frequency : 1650 MHz

    Global memory bandwidth (GBPS)
      float   : 506.69
      float2  : 532.16
      float4  : 548.03
      float8  : 556.57
      float16 : 492.17

    Single-precision compute (GFLOPS)
      float   : 16909.53
      float2  : 16894.22
      float4  : 16866.23
      float8  : 16798.47
      float16 : 16672.67

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 529.92
      double2  : 529.30
      double4  : 527.99
      double8  : 525.44
      double16 : 519.98

    Integer compute (GIOPS)
      int   : 15480.90
      int2  : 15398.06
      int4  : 15411.76
      int8  : 15226.44
      int16 : 15304.72

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 9.61
      enqueueReadBuffer          : 8.49
      enqueueMapBuffer(for read) : 10.79
        memcpy from mapped ptr   : 11.37
      enqueueUnmap(after write)  : 12.27
        memcpy to mapped ptr     : 11.84

    Kernel launch latency : 3.81 us