Platform: Portable Computing Language
  Device: pthread-AMD EPYC 7763 64-Core Processor
    Driver version  : 3.0-rc2 (Linux x64)
    Compute units   : 128
    Clock frequency : 2450 MHz

    Global memory bandwidth (GBPS)
      float   : 30.71
      float2  : 30.89
      float4  : 28.91
      float8  : 33.49
      float16 : 27.35

    Single-precision compute (GFLOPS)
      float   : 88.22
      float2  : 165.52
      float4  : 344.33
      float8  : 636.12
      float16 : 159.04

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 87.14
      double2  : 170.55
      double4  : 312.85
      double8  : 80.29
      double16 : 105.41

    Integer compute (GIOPS)
      int   : 199.11
      int2  : 391.47
      int4  : 765.45
      int8  : 1513.98
      int16 : 2490.43

    Integer compute Fast 24bit (GIOPS)
      int   : 131.65
      int2  : 190.44
      int4  : 372.82
      int8  : 659.86
      int16 : 153.00

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 19.15
      enqueueReadBuffer               : 15.29
      enqueueWriteBuffer non-blocking : 15.87
      enqueueReadBuffer non-blocking  : 19.75
      enqueueMapBuffer(for read)      : 5067.21
        memcpy from mapped ptr        : 14.73
      enqueueUnmap(after write)       : 4620.23
        memcpy to mapped ptr          : 20.45

    Kernel launch latency : 106.67 us