Platform: AMD Accelerated Parallel Processing
  Device: gfx906
    Driver version  : 3212.0 (HSA1.1,LC) (Linux x64)
    Compute units   : 60
    Clock frequency : 1725 MHz

    Global memory bandwidth (GBPS)
      float   : 767.51
      float2  : 748.01
      float4  : 675.73
      float8  : 717.26
      float16 : 585.05

    Single-precision compute (GFLOPS)
      float   : 12713.40
      float2  : 12396.88
      float4  : 12340.44
      float8  : 12001.62
      float16 : 11861.35

    Half-precision compute (GFLOPS)
      half   : 6434.60
      half2  : 23781.07
      half4  : 23540.66
      half8  : 23181.37
      half16 : 22714.81

    Double-precision compute (GFLOPS)
      double   : 6084.21
      double2  : 6160.23
      double4  : 5970.83
      double8  : 5964.05
      double16 : 5833.33

    Integer compute (GIOPS)
      int   : 4241.74
      int2  : 4223.93
      int4  : 4227.38
      int8  : 4198.92
      int16 : 4162.48

    Integer compute Fast 24bit (GIOPS)
      int   : 11717.45
      int2  : 11599.73
      int4  : 11107.29
      int8  : 11331.84
      int16 : 11263.35

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer              : 15.68
      enqueueReadBuffer               : 15.39
      enqueueWriteBuffer non-blocking : 15.61
      enqueueReadBuffer non-blocking  : 11.47
      enqueueMapBuffer(for read)      : 85048.86
        memcpy from mapped ptr        : 15.67
      enqueueUnmap(after write)       : 182764.56
        memcpy to mapped ptr          : 16.07

    Kernel launch latency : 10.55 us