Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version  : 3212.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1725 MHz

    Global memory bandwidth (GBPS)
      float   : 767.51
      float2  : 748.01
      float4  : 675.73
      float8  : 717.26
      float16 : 585.05

    Single-precision compute (GFLOPS)
      float   : 12713.40
      float2  : 12396.88
      float4  : 12340.44
      float8  : 12001.62
      float16 : 11861.35

    Half-precision compute (GFLOPS)
      half   : 6434.60
      half2  : 23781.07
      half4  : 23540.66
      half8  : 23181.37
      half16 : 22714.81

    Double-precision compute (GFLOPS)
      double   : 6084.21
      double2  : 6160.23
      double4  : 5970.83
      double8  : 5964.05
      double16 : 5833.33

    Integer compute (GIOPS)
      int   : 4241.74
      int2  : 4223.93
      int4  : 4227.38
      int8  : 4198.92
      int16 : 4162.48

    Integer compute Fast 24bit (GIOPS)
      int   : 11717.45
      int2  : 11599.73
      int4  : 11107.29
      int8  : 11331.84
      int16 : 11263.35

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.68
      enqueueReadBuffer               : 15.39
      enqueueWriteBuffer non-blocking : 15.61
      enqueueReadBuffer non-blocking  : 11.47
      enqueueMapBuffer(for read)      : 85048.86
        memcpy from mapped ptr        : 15.67
      enqueueUnmap(after write)       : 182764.56
        memcpy to mapped ptr          : 16.07

    Kernel launch latency : 10.55 us