Profiling
=========

OpenSWR contains built-in profiling, which can be enabled at build time
to provide insight for performance tuning.

To enable it, uncomment the following line in ``rasterizer/core/knobs.h`` and rebuild: ::

  //#define KNOB_ENABLE_RDTSC

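When the build is scripted, the edit can be automated. The following Python sketch removes the leading ``//`` from that one line. The helper names ``enable_rdtsc_knob`` and ``enable_in_file`` are hypothetical, and the path to ``knobs.h`` is left to the caller because it depends on the source tree in use.

```python
import re
from pathlib import Path

# Matches the commented-out knob line, keeping its indentation (group 1)
# and the #define itself (group 2).
_COMMENTED = re.compile(
    r"^([ \t]*)//[ \t]*(#define[ \t]+KNOB_ENABLE_RDTSC\b[^\n]*)$",
    re.MULTILINE,
)


def enable_rdtsc_knob(text):
    """Return (new_text, count) with the knob line uncommented."""
    return _COMMENTED.subn(r"\1\2", text)


def enable_in_file(path):
    """Uncomment the knob in the given knobs.h; return True if it changed."""
    p = Path(path)
    new_text, count = enable_rdtsc_knob(p.read_text())
    if count:
        p.write_text(new_text)
    return count > 0
```

A line that is already uncommented is left alone, so running the helper twice is harmless.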
Running an application then creates an ``rdtsc.txt`` file in the
current working directory. This file contains profile information
captured between the frames set by the ``KNOB_BUCKETS_START_FRAME`` and
``KNOB_BUCKETS_END_FRAME`` knobs (see the knobs section).

The file contains one section per thread, each with a hierarchical
breakdown of the time spent in the various operations.
For example: ::

 Thread 0 (API)
  %Tot   %Par  Cycles     CPE        NumEvent   CPE2       NumEvent2  Bucket
   0.00   0.00 28370      2837       10         0          0          APIClearRenderTarget
   0.00  41.23 11698      1169       10         0          0          |-> APIDrawWakeAllThreads
   0.00  18.34 5202       520        10         0          0          |-> APIGetDrawContext
  98.72  98.72 12413773688 29957      414380     0          0          APIDraw
   0.36   0.36 44689364   107        414380     0          0          |-> APIDrawWakeAllThreads
  96.36  97.62 12117951562 9747       1243140    0          0          |-> APIGetDrawContext
   0.00   0.00 19904      995        20         0          0          APIStoreTiles
   0.00   7.88 1568       78         20         0          0          |-> APIDrawWakeAllThreads
   0.00  25.28 5032       251        20         0          0          |-> APIGetDrawContext
   1.28   1.28 161344902  64         2486370    0          0          APIGetDrawContext
   0.00   0.00 50368      2518       20         0          0          APISync
   0.00   2.70 1360       68         20         0          0          |-> APIDrawWakeAllThreads
   0.00  65.27 32876      1643       20         0          0          |-> APIGetDrawContext


 Thread 1 (WORKER)
  %Tot   %Par  Cycles     CPE        NumEvent   CPE2       NumEvent2  Bucket
  83.92  83.92 13198987522 96411      136902     0          0          FEProcessDraw
  24.91  29.69 3918184840 167        23410158   0          0          |-> FEFetchShader
  11.17  13.31 1756972646 75         23410158   0          0          |-> FEVertexShader
   8.89  10.59 1397902996 59         23410161   0          0          |-> FEPAAssemble
  19.06  22.71 2997794710 384        7803387    0          0          |-> FEClipTriangles
  11.67  61.21 1834958176 235        7803387    0          0              |-> FEBinTriangles
   0.00   0.00 0          0          187258     0          0                  |-> FECullZeroAreaAndBackface
   0.00   0.00 0          0          60051033   0          0                  |-> FECullBetweenCenters
   0.11   0.11 17217556   2869592    6          0          0          FEProcessStoreTiles
  15.97  15.97 2511392576 73665      34092      0          0          WorkerWorkOnFifoBE
  14.04  87.95 2208687340 9187       240408     0          0          |-> WorkerFoundWork
   0.06   0.43 9390536    13263      708        0          0              |-> BELoadTiles
   0.00   0.01 293020     182        1609       0          0              |-> BEClear
  12.63  89.94 1986508990 949        2093014    0          0              |-> BERasterizeTriangle
   2.37  18.75 372374596  177        2093014    0          0                  |-> BETriangleSetup
   0.42   3.35 66539016   31         2093014    0          0                  |-> BEStepSetup
   0.00   0.00 0          0          21766      0          0                  |-> BETrivialReject
   1.05   8.33 165410662  79         2071248    0          0                  |-> BERasterizePartial
   6.06  48.02 953847796  1260       756783     0          0                  |-> BEPixelBackend
   0.20   3.30 31521202   41         756783     0          0                      |-> BESetup
   0.16   2.69 25624304   33         756783     0          0                      |-> BEBarycentric
   0.18   2.92 27884986   36         756783     0          0                      |-> BEEarlyDepthTest
   0.19   3.20 30564174   41         744058     0          0                      |-> BEPixelShader
   0.26   4.30 41058646   55         744058     0          0                      |-> BEOutputMerger
   1.27  20.94 199750822  32         6054264    0          0                      |-> BEEndTile
   0.33   2.34 51758160   23687      2185       0          0              |-> BEStoreTiles
   0.20  60.22 31169500   28807      1082       0          0                  |-> B8G8R8A8_UNORM
   0.00   0.00 302752     302752     1          0          0          WorkerWaitForThreadEvent


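The column headings are not defined in this section. The sample figures are consistent with the following reading: ``CPE`` is ``Cycles`` divided by ``NumEvent`` and rounded down (28370 / 10 = 2837), ``%Par`` is a bucket's cycles as a share of its parent's (11698 / 28370 = 41.23%), and ``%Tot`` is its share of the thread's total. The Python sketch below parses a report on that basis. Its layout rules are inferred from the sample rather than from the OpenSWR sources: it assumes seven numeric columns in 10-character fields and ``|->`` markers indented four more spaces per nesting level. The names ``Bucket``, ``parse_report``, ``cpe_matches`` and ``hotspots`` are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import List

# Assumed layout (inferred from the sample): each numeric column is
# left-justified in a 10-character field followed by one space.
FIELD_WIDTH = 10
ROW = re.compile(
    r"^\s*([\d.]+)\s+([\d.]+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"(\s+)(\|-> )?(\S+)\s*$"
)
THREAD = re.compile(r"^\s*Thread\s+(\d+)\s+\((\w+)\)\s*$")


@dataclass
class Bucket:
    thread: str       # e.g. "1 (WORKER)"
    name: str         # bucket name without the "|-> " marker
    depth: int        # 0 for top-level buckets
    pct_total: float  # %Tot column
    pct_parent: float # %Par column
    cycles: int
    cpe: int
    num_events: int


def parse_report(text: str) -> List[Bucket]:
    """Parse rdtsc.txt-style text into a flat list of Bucket records."""
    buckets = []
    thread = ""
    for line in text.splitlines():
        m = THREAD.match(line)
        if m:
            thread = f"{m.group(1)} ({m.group(2)})"
            continue
        m = ROW.match(line)
        if not m:
            continue  # column headings and blank lines
        tot, par, cycles, cpe, events, _cpe2, events2, gap, arrow, name = m.groups()
        # Spaces that merely pad the last numeric field are not indentation.
        pad = max(FIELD_WIDTH - len(events2), 0) + 1
        indent = max(len(gap) - pad, 0)
        depth = indent // 4 + (1 if arrow else 0)
        buckets.append(Bucket(thread, name, depth, float(tot), float(par),
                              int(cycles), int(cpe), int(events)))
    return buckets


def cpe_matches(b: Bucket) -> bool:
    """True when CPE equals Cycles // NumEvent (rows with no events pass)."""
    return b.num_events == 0 or b.cpe == b.cycles // b.num_events


def hotspots(buckets: List[Bucket], thread: str, limit: int = 3) -> List[Bucket]:
    """Top-level buckets of one thread, largest %Tot first."""
    top = [b for b in buckets if b.thread == thread and b.depth == 0]
    return sorted(top, key=lambda b: b.pct_total, reverse=True)[:limit]
```

For instance, ``hotspots(parse_report(open("rdtsc.txt").read()), "1 (WORKER)")`` lists the costliest top-level buckets of the worker thread. On the sample above, these begin with ``FEProcessDraw`` and ``WorkerWorkOnFifoBE``. Keeping the records flat with an explicit ``depth`` makes filtering and sorting simple; a tree can be rebuilt from the depths if parent totals are needed.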