Platform: NVIDIA CUDA
  Device: TITAN V
    Driver version  : 430.14 (Linux x64)
    Compute units   : 80
    Clock frequency : 1455 MHz

    Global memory bandwidth (GBPS)
      float   : 561.37
      float2  : 591.37
      float4  : 607.71
      float8  : 516.49
      float16 : 466.27

    Single-precision compute (GFLOPS)
      float   : 13651.32
      float2  : 13688.23
      float4  : 13648.46
      float8  : 13606.27
      float16 : 13502.08

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 6858.92
      double2  : 6846.90
      double4  : 6822.64
      double8  : 6797.12
      double16 : 6737.34

    Integer compute (GIOPS)
      int   : 13622.13
      int2  : 13661.56
      int4  : 13666.12
      int8  : 13663.23
      int16 : 13640.81

    Integer compute Fast 24bit (GIOPS)
      int   : 13622.35
      int2  : 13662.14
      int4  : 13666.63
      int8  : 13658.38
      int16 : 13647.09

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 6.09
      enqueueReadBuffer               : 6.45
      enqueueWriteBuffer non-blocking : 4.58
      enqueueReadBuffer non-blocking  : 4.93
      enqueueMapBuffer(for read)      : 6.05
        memcpy from mapped ptr        : 9.09
      enqueueUnmap(after write)       : 6.26
        memcpy to mapped ptr          : 9.42

    Kernel launch latency : 6.51 us