Platform: NVIDIA CUDA
  Device: GeForce GTX 980
    Driver version  : 343.19 (Linux x64)
    Compute units   : 16
    Clock frequency : 1266 MHz

    Global memory bandwidth (GBPS)
      float   : 165.83
      float2  : 171.18
      float4  : 174.77
      float8  : 157.93
      float16 : 173.39

    Single-precision compute (GFLOPS)
      float   : 4542.28
      float2  : 4674.48
      float4  : 4891.81
      float8  : 4920.55
      float16 : 4910.66

    Double-precision compute (GFLOPS)
      double   : 162.44
      double2  : 161.56
      double4  : 159.78
      double8  : 161.25
      double16 : 160.20

    Integer compute (GIOPS)
      int   : 1365.83
      int2  : 1366.54
      int4  : 1432.30
      int8  : 1424.95
      int16 : 1446.37

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 9.91
      enqueueReadBuffer          : 6.95
      enqueueMapBuffer(for read) : 11.49
        memcpy from mapped ptr   : 6.62
      enqueueUnmap(after write)  : 12.72
        memcpy to mapped ptr     : 6.58

    Kernel launch latency : 3.69 us