Platform: Portable Computing Language
  Device: TITAN V
    Driver version  : 3.0-rc2 (Linux x64)
    Compute units   : 80
    Clock frequency : 1455 MHz

    Global memory bandwidth (GBPS)
      float   : 559.25
      float2  : 582.61
      float4  : 605.03
      float8  : 563.36
      float16 : 466.89

    Single-precision compute (GFLOPS)
      float   : 13650.31
      float2  : 13676.21
      float4  : 13654.21
      float8  : 13605.12
      float16 : 13497.45

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 6851.16
      double2  : 6839.41
      double4  : 6821.27
      double8  : 6802.04
      double16 : 6750.90

    Integer compute (GIOPS)
      int   : 13622.78
      int2  : 13650.92
      int4  : 13649.84
      int8  : 13652.66
      int16 : 13646.51

    Integer compute Fast 24bit (GIOPS)
      int   : 13622.20
      int2  : 13650.49
      int4  : 13648.90
      int8  : 13653.82
      int16 : 13646.95

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 9.75
      enqueueReadBuffer               : 9.60
      enqueueWriteBuffer non-blocking : 9.76
      enqueueReadBuffer non-blocking  : 9.61
      enqueueMapBuffer(for read)      : 121195.77
        memcpy from mapped ptr        : 9.11
      enqueueUnmap(after write)       : 8.42
        memcpy to mapped ptr          : 9.45

    Kernel launch latency : 27.61 us

  Device: TITAN V
    Driver version  : 3.0-rc2 (Linux x64)
    Compute units   : 80
    Clock frequency : 1455 MHz

    Global memory bandwidth (GBPS)
      float   : 559.64
      float2  : 582.45
      float4  : 604.99
      float8  : 562.63
      float16 : 469.28

    Single-precision compute (GFLOPS)
      float   : 13646.88
      float2  : 13671.93
      float4  : 13649.91
      float8  : 13599.42
      float16 : 13493.17

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 6843.55
      double2  : 6834.11
      double4  : 6820.15
      double8  : 6797.91
      double16 : 6743.83

    Integer compute (GIOPS)
      int   : 13618.75
      int2  : 13618.60
      int4  : 13639.15
      int8  : 13639.72
      int16 : 13633.81

    Integer compute Fast 24bit (GIOPS)
      int   : 13612.85
      int2  : 13642.47
      int4  : 13639.65
      int8  : 13641.46
      int16 : 13634.24

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 8.94
      enqueueReadBuffer               : 10.66
      enqueueWriteBuffer non-blocking : 8.94
      enqueueReadBuffer non-blocking  : 10.69
      enqueueMapBuffer(for read)      : 13.16
        memcpy from mapped ptr        : 8.71
      enqueueUnmap(after write)       : 12.39
        memcpy to mapped ptr          : 8.97

    Kernel launch latency : 193.03 us