Platform: AMD Accelerated Parallel Processing
  Device: gfx1012:xnack- (RX 5500XT)
    Driver version  : 3361.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 11
    Clock frequency : 1900 MHz

    Global memory bandwidth (GBPS)
      float   : 190.36
      float2  : 182.03
      float4  : 171.64
      float8  : 157.88
      float16 : 154.64

    Single-precision compute (GFLOPS)
      float   : 5046.67
      float2  : 4936.51
      float4  : 4887.78
      float8  : 4871.37
      float16 : 4796.57

    Half-precision compute (GFLOPS)
      half   : 2544.83
      half2  : 9875.69
      half4  : 9771.45
      half8  : 9731.20
      half16 : 9533.27

    Double-precision compute (GFLOPS)
      double   : 323.84
      double2  : 323.33
      double4  : 322.61
      double8  : 321.09
      double16 : 318.06

    Integer compute (GIOPS)
      int   : 1025.25
      int2  : 1025.22
      int4  : 1021.86
      int8  : 1018.56
      int16 : 1012.24

    Integer compute Fast 24bit (GIOPS)
      int   : 4738.49
      int2  : 4805.52
      int4  : 4799.40
      int8  : 4682.88
      int16 : 4766.09

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.64
      enqueueReadBuffer               : 15.34
      enqueueWriteBuffer non-blocking : 15.70
      enqueueReadBuffer non-blocking  : 15.44
      enqueueMapBuffer(for read)      : 613566.81
        memcpy from mapped ptr        : 15.37
      enqueueUnmap(after write)       : 1227133.62
        memcpy to mapped ptr          : 15.70

    Kernel launch latency : 12.77 us