Platform: Intel(R) OpenCL
  Device: Intel(R) Many Integrated Core Acceleration Card
    Driver version  : 1.2 (Linux x64)
    Compute units   : 240
    Clock frequency : 1333 MHz

    Global memory bandwidth (GBPS)
      float   : 68.00
      float2  : 49.67
      float4  : 89.31
      float8  : 101.30
      float16 : 1.84

    Single-precision compute (GFLOPS)
      float   : 2130.58
      float2  : 2262.12
      float4  : 2255.07
      float8  : 2247.39
      float16 : 2213.89

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 1157.62
      double2  : 1156.44
      double4  : 1153.55
      double8  : 1146.15
      double16 : 354.26

    Integer compute (GIOPS)
      int   : 1158.72
      int2  : 1161.01
      int4  : 1158.69
      int8  : 1158.58
      int16 : 1147.02

    Integer compute Fast 24bit (GIOPS)
      int   : 1158.82
      int2  : 1160.21
      int4  : 1158.02
      int8  : 1150.32
      int16 : 1136.05

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 2.25
      enqueueReadBuffer               : 5.62
      enqueueWriteBuffer non-blocking : 4.92
      enqueueReadBuffer non-blocking  : 5.57
      enqueueMapBuffer(for read)      : 137.80
        memcpy from mapped ptr        : 3.71
      enqueueUnmap(after write)       : 6.55
        memcpy to mapped ptr          : 3.43

    Kernel launch latency : 79.60 us


Platform: NVIDIA CUDA

Platform: AMD Accelerated Parallel Processing