Platform: ARM Platform
  Device: Mali-T628
    Driver version  : 1.2 (Linux ARM)
    Compute units   : 4
    Clock frequency : 600 MHz

    Global memory bandwidth (GBPS)
      float   : 4.14
      float2  : 5.19
      float4  : 6.85
      float8  : 6.25
      float16 : 5.29

    Single-precision compute (GFLOPS)
      float   : 17.20
      float2  : 5.94
      float4  : 5.83
      float8  : 33.96
      float16 : 7.20

    Double-precision compute (GFLOPS)
      double   : 9.01
      double2  : 1.74
      double4  : 16.80
      double8  : 16.84
      double16 : 16.75

    Integer compute (GIOPS)
      int   : 5.29
      int2  : 6.07
      int4  : 6.18
      int8  : 7.38
      int16 : 34.14

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 4.70
      enqueueReadBuffer          : 2.87
      enqueueMapBuffer(for read) : 441.72
        memcpy from mapped ptr   : 2.21
      enqueueUnmap(after write)  : 634.30
        memcpy to mapped ptr     : 2.31

    Kernel launch latency : 233.30 us