Platform: NVIDIA CUDA
  Device: Tesla T4
    Driver version  : 560.35.03 (Linux x64)
    Compute units   : 40
    Clock frequency : 1590 MHz

    Global memory bandwidth (GBPS)
      float   : 235.00
      float2  : 247.01
      float4  : 253.11
      float8  : 263.44
      float16 : 252.38

    Single-precision compute (GFLOPS)
      float   : 8030.45
      float2  : 8034.32
      float4  : 7985.38
      float8  : 7848.48
      float16 : 7651.69

    No half precision support! Skipped

    Double-precision compute (GFLOPS)
      double   : 256.45
      double2  : 256.03
      double4  : 253.74
      double8  : 252.76
      double16 : 251.68

    Integer compute (GIOPS)
      int   : 5802.79
      int2  : 5715.24
      int4  : 5742.30
      int8  : 5863.19
      int16 : 5711.99

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 4.73
      enqueueReadBuffer          : 4.78
      enqueueMapBuffer(for read) : 8.73
        memcpy from mapped ptr   : 5.39
      enqueueUnmap(after write)  : 12.17
        memcpy to mapped ptr     : 5.39

    Kernel launch latency : 5.82 us