Name

    AMD_performance_monitor

Name Strings

    GL_AMD_performance_monitor

Contributors

    Dan Ginsburg
    Aaftab Munshi
    Dave Oldcorn
    Maurice Ribble
    Jonathan Zarge

Contact

    Dan Ginsburg (dan.ginsburg 'at' amd.com)

Status

    ???

Version

    Last Modified Date: 11/29/2007

Number

    OpenGL Extension #360
    OpenGL ES Extension #50

Dependencies

    None

Overview

    This extension enables the capture and reporting of performance monitors.
    Performance monitors contain groups of counters which hold arbitrary counted
    data.  Typically, the counters report values from performance-related
    counters in the underlying hardware.  The extension is general enough to
    allow the implementation to choose which counters to expose and to pick the
    data type and range of each counter.  The extension also allows counting to
    start and end on arbitrary boundaries during rendering.

Issues

    1.  Should this be an EGL or OpenGL/OpenGL ES extension?

        Decision - Make this an OpenGL/OpenGL ES extension.

        Reason - We would like to expose this extension in both OpenGL and
        OpenGL ES, which makes EGL an unsuitable choice.  Further, support for
        EGL is not a requirement and there are platforms that support OpenGL ES
        but not EGL, making it difficult to make this an EGL extension.

    2.  Should the API support multipassing?

        Decision - No.

        Reason - Multipassing should be left to the application, because
        supporting it here would make the API unnecessarily complicated.  A
        major issue is that, depending on which counters are to be sampled, the
        number of passes and the counters selected in each pass can be
        difficult to determine.  It is much easier to give a list of counters
        categorized by groups with specific information on the number of
        counters that can be selected from each group.

    3.  Should we define a 64-bit data type for UNSIGNED_INT64_AMD?

        Decision - No.

        Reason - While counters can be returned as 64-bit unsigned integers, the
        data is passed back to the application through a void*.  Therefore,
        there is no need in this extension to define a 64-bit data type (e.g.,
        GLuint64).  It will be up to the application to declare a native 64-bit
        unsigned integer and cast the returned data to that type.
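
    The application-side handling described in Issue 3 can be sketched as
    follows.  This is a minimal sketch, assuming a C99 compiler where uint64_t
    from <stdint.h> serves as the native 64-bit type.  The helper name
    readCounterU64 is hypothetical and not part of the extension.  It uses
    memcpy rather than a pointer cast so that the read does not depend on the
    alignment of the value inside the buffer.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the extension): copies a 64-bit
 * counter value out of the untyped buffer handed back by the GL. */
static uint64_t readCounterU64(const void *data)
{
    uint64_t value;

    memcpy(&value, data, sizeof value);
    return value;
}
```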


New Procedures and Functions

    void GetPerfMonitorGroupsAMD(int *numGroups, sizei groupsSize,
                                 uint *groups)

    void GetPerfMonitorCountersAMD(uint group, int *numCounters,
                                   int *maxActiveCounters, sizei countersSize,
                                   uint *counters)

    void GetPerfMonitorGroupStringAMD(uint group, sizei bufSize, sizei *length,
                                      char *groupString)

    void GetPerfMonitorCounterStringAMD(uint group, uint counter, sizei bufSize,
                                        sizei *length, char *counterString)

    void GetPerfMonitorCounterInfoAMD(uint group, uint counter,
                                      enum pname, void *data)

    void GenPerfMonitorsAMD(sizei n, uint *monitors)

    void DeletePerfMonitorsAMD(sizei n, uint *monitors)

    void SelectPerfMonitorCountersAMD(uint monitor, boolean enable,
                                      uint group, int numCounters,
                                      uint *counterList)

    void BeginPerfMonitorAMD(uint monitor)

    void EndPerfMonitorAMD(uint monitor)

    void GetPerfMonitorCounterDataAMD(uint monitor, enum pname, sizei dataSize,
                                      uint *data, int *bytesWritten)


New Tokens

    Accepted by the <pname> parameter of GetPerfMonitorCounterInfoAMD

        COUNTER_TYPE_AMD                           0x8BC0
        COUNTER_RANGE_AMD                          0x8BC1

    Returned as a valid value in <data> parameter of
    GetPerfMonitorCounterInfoAMD if <pname> = COUNTER_TYPE_AMD

        UNSIGNED_INT                               0x1405
        FLOAT                                      0x1406
        UNSIGNED_INT64_AMD                         0x8BC2
        PERCENTAGE_AMD                             0x8BC3

    Accepted by the <pname> parameter of GetPerfMonitorCounterDataAMD

        PERFMON_RESULT_AVAILABLE_AMD               0x8BC4
        PERFMON_RESULT_SIZE_AMD                    0x8BC5
        PERFMON_RESULT_AMD                         0x8BC6

Addition to the GL specification

    Add a new section called Performance Monitoring

    A performance monitor consists of a number of hardware and software counters
    that can be sampled by the GPU and reported back to the application.
    Performance counters are organized as a single hierarchy where counters are
    categorized into groups.  Each group has a list of counters that belong to
    the group and can be sampled, and a maximum number of counters that can be
    sampled at the same time.

    The command

        void GetPerfMonitorGroupsAMD(int *numGroups, sizei groupsSize,
                                     uint *groups);

    returns the number of available groups in <numGroups>, if <numGroups> is
    not NULL.  If <groupsSize> is not 0 and <groups> is not NULL, then the list
    of available groups is returned in <groups>.  The number of entries that
    will be returned in <groups> is determined by <groupsSize>.  If
    <groupsSize> is 0, no information is copied.  Each group is identified by a
    unique unsigned int identifier.
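
    The count-then-fill calling pattern implied by this description can be
    sketched as follows.  This is a minimal sketch, not part of the extension:
    queryAllGroups and fakeGetGroups are hypothetical names, and the
    function-pointer type mirrors the GetPerfMonitorGroupsAMD parameter list
    with plain C types so that the example compiles without GL headers.  The
    fake driver exists only to make the sketch runnable on its own.

```c
#include <stdlib.h>

/* Same parameter list as GetPerfMonitorGroupsAMD, written with plain C
 * types (an assumption made so no GL headers are needed). */
typedef void (*GetGroupsFn)(int *numGroups, int groupsSize,
                            unsigned int *groups);

/* Hypothetical helper: returns a malloc'ed array of group IDs and stores
 * the number of entries in *count.  Returns NULL if no groups are
 * reported or the allocation fails.  The caller frees the array. */
static unsigned int *queryAllGroups(GetGroupsFn getGroups, int *count)
{
    unsigned int *groups;
    int n = 0;

    *count = 0;
    getGroups(&n, 0, NULL);              /* first call: count only */
    if (n <= 0)
        return NULL;

    groups = malloc((size_t)n * sizeof *groups);
    if (groups == NULL)
        return NULL;

    getGroups(NULL, n, groups);          /* second call: fill the list */
    *count = n;
    return groups;
}

/* Stand-in for a driver, for illustration only: reports three groups. */
static void fakeGetGroups(int *numGroups, int groupsSize,
                          unsigned int *groups)
{
    static const unsigned int ids[3] = { 10u, 20u, 30u };
    int i;

    if (numGroups != NULL)
        *numGroups = 3;
    if (groupsSize > 0 && groups != NULL)
        for (i = 0; i < groupsSize && i < 3; i++)
            groups[i] = ids[i];
}
```

    In a real program the GL entry point would be called directly in place of
    the function pointer; the pointer is used here only to keep the sketch
    independent of a GL context.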

    The command

        void GetPerfMonitorCountersAMD(uint group, int *numCounters,
                                       int *maxActiveCounters,
                                       sizei countersSize,
                                       uint *counters);

    returns the following information for the group identified by <group>: the
    number of available counters in <numCounters>, the maximum number of
    counters that can be active at any time in <maxActiveCounters>, and the
    list of counters in <counters>.  The number of entries that can be returned
    in <counters> is determined by <countersSize>.  If <countersSize> is 0, no
    information is copied.  Each counter in a group is identified by a unique
    unsigned int identifier.  If <group> does not reference a valid group ID,
    an INVALID_VALUE error is generated.


    The command

        void GetPerfMonitorGroupStringAMD(uint group, sizei bufSize,
                                          sizei *length, char *groupString)


    returns the string that describes the group name identified by <group> in
    <groupString>.  The actual number of characters written to <groupString>,
    excluding the null terminator, is returned in <length>.  If <length> is
    NULL, then no length is returned.  The maximum number of characters that
    may be written into <groupString>, including the null terminator, is
    specified by <bufSize>.  If <bufSize> is 0 and <groupString> is NULL, the
    number of characters that would be required to hold the group string,
    excluding the null terminator, is returned in <length>.  If <group>
    does not reference a valid group ID, an INVALID_VALUE error is generated.
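
    Because <length> excludes the null terminator while <bufSize> includes it,
    a caller sizing its own buffer has to allocate one extra character.  The
    following is a minimal sketch of that length-then-fetch pattern, not part
    of the extension: queryGroupName and fakeGetGroupString are hypothetical
    names, and the function-pointer type mirrors the
    GetPerfMonitorGroupStringAMD parameter list with plain C types so that the
    example compiles without GL headers.

```c
#include <stdlib.h>
#include <string.h>

/* Same parameter list as GetPerfMonitorGroupStringAMD, written with
 * plain C types (an assumption made so no GL headers are needed). */
typedef void (*GetGroupStringFn)(unsigned int group, int bufSize,
                                 int *length, char *groupString);

/* Hypothetical helper: returns a malloc'ed, null-terminated copy of the
 * group name, or NULL on failure.  The caller frees the string. */
static char *queryGroupName(GetGroupStringFn getString, unsigned int group)
{
    int length = 0;
    char *name;

    getString(group, 0, &length, NULL);   /* length excludes the '\0' */
    if (length < 0)
        return NULL;

    name = malloc((size_t)length + 1);    /* +1 for the terminator */
    if (name == NULL)
        return NULL;

    getString(group, length + 1, NULL, name);
    return name;
}

/* Stand-in for a driver, for illustration only: every group is "HW". */
static void fakeGetGroupString(unsigned int group, int bufSize,
                               int *length, char *groupString)
{
    static const char name[] = "HW";
    int full = (int)(sizeof name - 1);
    int written;

    (void)group;
    if (bufSize > 0 && groupString != NULL) {
        written = full < bufSize - 1 ? full : bufSize - 1;
        memcpy(groupString, name, (size_t)written);
        groupString[written] = '\0';
        if (length != NULL)
            *length = written;
    } else if (length != NULL) {
        *length = full;                   /* size query only */
    }
}
```

    The same pattern applies to GetPerfMonitorCounterStringAMD, which takes an
    additional <counter> argument.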


    The command

        void GetPerfMonitorCounterStringAMD(uint group, uint counter,
                                            sizei bufSize, sizei *length,
                                            char *counterString);


    returns the string that describes the counter name identified by <group>
    and <counter> in <counterString>.  The actual number of characters written
    to <counterString>, excluding the null terminator, is returned in <length>.
    If <length> is NULL, then no length is returned.  The maximum number of
    characters that may be written into <counterString>, including the null
    terminator, is specified by <bufSize>.  If <bufSize> is 0 and
    <counterString> is NULL, the number of characters that would be required to
    hold the counter string, excluding the null terminator, is returned in
    <length>.  If <group> does not reference a valid group ID, or <counter>
    does not reference a valid counter within the group ID, an INVALID_VALUE
    error is generated.

    The command

        void GetPerfMonitorCounterInfoAMD(uint group, uint counter,
                                          enum pname, void *data);

    returns the following information about a counter.  For a <counter>
    belonging to <group>, we can query the counter type and counter range.  If
    <pname> is COUNTER_TYPE_AMD, then <data> returns the type.  Valid type
    values returned are UNSIGNED_INT, UNSIGNED_INT64_AMD, PERCENTAGE_AMD, FLOAT.
    If type value returned is PERCENTAGE_AMD, then this describes a float
    value that is in the range [0.0 .. 100.0].  If <pname> is COUNTER_RANGE_AMD,
    <data> returns two values representing a minimum and a maximum. The
    counter's type is used to determine the format in which the range values
    are returned.  If <group> does not reference a valid group ID, or <counter>
    does not reference a valid counter within the group ID, an INVALID_VALUE
    error is generated.
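
    Since the format of the COUNTER_RANGE_AMD result depends on the counter
    type, an application has to decode <data> per type.  The following is a
    minimal sketch, not part of the extension.  It assumes the two range
    values are stored back to back, minimum first, each in the counter's own
    type; the text above does not spell out the packing, so treat that layout
    as an assumption.  decodeCounterRange and the TYPE_* names are
    hypothetical; the TYPE_* values are the token values listed under New
    Tokens.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the values are the tokens from New Tokens. */
enum {
    TYPE_UNSIGNED_INT       = 0x1405,
    TYPE_FLOAT              = 0x1406,
    TYPE_UNSIGNED_INT64_AMD = 0x8BC2,
    TYPE_PERCENTAGE_AMD     = 0x8BC3
};

/* Hypothetical helper: converts the two range values in <data> to
 * doubles.  Assumes min and max are packed consecutively in the
 * counter's own type.  Returns 0 on success, -1 for an unknown type. */
static int decodeCounterRange(unsigned int type, const void *data,
                              double *minOut, double *maxOut)
{
    switch (type) {
    case TYPE_UNSIGNED_INT: {
        uint32_t v[2];
        memcpy(v, data, sizeof v);
        *minOut = (double)v[0];
        *maxOut = (double)v[1];
        return 0;
    }
    case TYPE_UNSIGNED_INT64_AMD: {
        uint64_t v[2];
        memcpy(v, data, sizeof v);
        *minOut = (double)v[0];   /* may round for very large values */
        *maxOut = (double)v[1];
        return 0;
    }
    case TYPE_FLOAT:
    case TYPE_PERCENTAGE_AMD: {   /* percentages are floats */
        float v[2];
        memcpy(v, data, sizeof v);
        *minOut = (double)v[0];
        *maxOut = (double)v[1];
        return 0;
    }
    default:
        return -1;
    }
}
```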


    The command

        void GenPerfMonitorsAMD(sizei n, uint *monitors)

    returns a list of monitors.  These monitors can then be used to select
    groups/counters to be sampled, to start multiple monitoring sessions and to
    return counter information sampled by the GPU.  At creation time, the
    performance monitor object has all counters disabled.  The value of the
    PERFMON_RESULT_AVAILABLE_AMD, PERFMON_RESULT_AMD, and
    PERFMON_RESULT_SIZE_AMD queries will all initially be 0.

    The command

        void DeletePerfMonitorsAMD(sizei n, uint *monitors)

    is used to delete the list of monitors created by a previous call to
    GenPerfMonitorsAMD.  If a monitor ID in the list <monitors> does not
    reference a previously generated performance monitor, an INVALID_VALUE
    error is generated.

    The command

        void SelectPerfMonitorCountersAMD(uint monitor, boolean enable,
                                          uint group, int numCounters,
                                          uint *counterList);

    is used to enable or disable a list of counters from a group to be monitored
    as identified by <monitor>.  The <enable> argument determines whether the
    counters should be enabled or disabled.  <group> specifies the group
    ID under which counters will be enabled or disabled.  The <numCounters>
    argument gives the number of counters to be selected from the list
    <counterList>.  If <monitor> is not a valid monitor created by
    GenPerfMonitorsAMD, then an INVALID_VALUE error will be generated.  If
    <group> is not a valid group, an INVALID_VALUE error will be generated.  If
    <numCounters> is less than 0, an INVALID_VALUE error will be generated.

    When SelectPerfMonitorCountersAMD is called on a monitor, any outstanding
    results for that monitor become invalidated and the result queries
    PERFMON_RESULT_SIZE_AMD and PERFMON_RESULT_AVAILABLE_AMD are reset to 0.

    The command

        void BeginPerfMonitorAMD(uint monitor);

    is used to start a monitor session.  Note that BeginPerfMonitorAMD calls
    cannot be nested.  In addition, it is quite possible that, given the list
    of groups and the counters enabled per group for a monitor, the
    implementation may not be able to sample the necessary counters and so the
    monitor session will fail.  In such a case, an INVALID_OPERATION error will
    be generated.

    While BeginPerfMonitorAMD does mark the beginning of performance counter
    collection, the counters do not begin collecting immediately.  Rather, the
    counters begin collection when BeginPerfMonitorAMD is processed by
    the hardware.  That is, the API is asynchronous, and performance counter
    collection does not begin until the graphics hardware processes the
    BeginPerfMonitorAMD command.

    The command

        void EndPerfMonitorAMD(uint monitor);

    ends a monitor session started by BeginPerfMonitorAMD.  If a performance
    monitor is not currently started, an INVALID_OPERATION error will be
    generated.

    Note that there is an implied overhead to collecting performance counters
    that may or may not distort performance depending on the implementation.
    For example, some counters may require a pipeline flush thereby causing a
    change in the performance of the application.  Further, the frequency at
    which an application samples may distort the accuracy of counters which are
    variant (e.g., non-deterministic based on the input).  While the effects
    of sampling frequency are implementation dependent, general guidance can
    be given that sampling at a high frequency may distort both performance
    of the application and the accuracy of variant counters.

    The command

        void GetPerfMonitorCounterDataAMD(uint monitor, enum pname,
                                          sizei dataSize,
                                          uint *data, int *bytesWritten);

    is used to return counter values that have been sampled for a monitor
    session.  If <pname> is PERFMON_RESULT_AVAILABLE_AMD, then <data> will
    indicate whether the result is available or not.  If <pname> is
    PERFMON_RESULT_SIZE_AMD, <data> will contain the actual size of all counter
    results being sampled.  If <pname> is PERFMON_RESULT_AMD, <data> will
    contain the results.  For each counter of a group that was selected to be
    sampled, the information is returned as group ID, followed by counter ID,
    followed by counter value.  The size of the counter value returned will
    depend on the counter value type.  The argument <dataSize> specifies the
    number of bytes available in the <data> buffer for writing.  If
    <bytesWritten> is not NULL, it gives the number of bytes written into the
    <data> buffer.  It is an INVALID_OPERATION error for <data> to be NULL.  If
    <pname> is PERFMON_RESULT_AMD and <dataSize> is less than the number of
    bytes required to store the results as reported by a
    PERFMON_RESULT_SIZE_AMD query, then results will be written only up to the
    number of bytes specified by <dataSize>.
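
    The record layout described above (group ID, counter ID, then a value
    whose size depends on the counter type) can be walked as in the following
    minimal sketch, which is not part of the extension.  It assumes 4-byte IDs
    and 4-byte values for every type except UNSIGNED_INT64_AMD, which is taken
    to be 8 bytes.  decodeResults, CounterRecord, CounterTypeFn, fakeTypeOf
    and the TYPE_* names are hypothetical; in a real program the type lookup
    would be a COUNTER_TYPE_AMD query through GetPerfMonitorCounterInfoAMD.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names; the values are the tokens from New Tokens. */
enum {
    TYPE_UNSIGNED_INT       = 0x1405,
    TYPE_FLOAT              = 0x1406,
    TYPE_UNSIGNED_INT64_AMD = 0x8BC2,
    TYPE_PERCENTAGE_AMD     = 0x8BC3
};

typedef struct {
    uint32_t group;
    uint32_t counter;
    double   value;      /* widened for convenience; may round 64-bit */
} CounterRecord;

/* Stand-in for a COUNTER_TYPE_AMD query. */
typedef unsigned int (*CounterTypeFn)(uint32_t group, uint32_t counter);

/* Decodes up to maxRecords records from <data>.  Returns the number of
 * records decoded, or -1 on a truncated record or an unknown type. */
static int decodeResults(const void *data, size_t bytesWritten,
                         CounterTypeFn typeOf,
                         CounterRecord *out, int maxRecords)
{
    const unsigned char *p = data;
    size_t offset = 0;
    int n = 0;

    while (offset < bytesWritten && n < maxRecords) {
        uint32_t ids[2];
        unsigned int type;

        if (bytesWritten - offset < sizeof ids)
            return -1;
        memcpy(ids, p + offset, sizeof ids);
        offset += sizeof ids;
        type = typeOf(ids[0], ids[1]);

        if (type == TYPE_UNSIGNED_INT64_AMD) {
            uint64_t v;
            if (bytesWritten - offset < sizeof v)
                return -1;
            memcpy(&v, p + offset, sizeof v);
            out[n].value = (double)v;
            offset += sizeof v;
        } else if (type == TYPE_FLOAT || type == TYPE_PERCENTAGE_AMD) {
            float v;
            if (bytesWritten - offset < sizeof v)
                return -1;
            memcpy(&v, p + offset, sizeof v);
            out[n].value = (double)v;
            offset += sizeof v;
        } else if (type == TYPE_UNSIGNED_INT) {
            uint32_t v;
            if (bytesWritten - offset < sizeof v)
                return -1;
            memcpy(&v, p + offset, sizeof v);
            out[n].value = (double)v;
            offset += sizeof v;
        } else {
            return -1;
        }

        out[n].group = ids[0];
        out[n].counter = ids[1];
        n++;
    }
    return n;
}

/* Illustration only: group 1 holds floats, everything else is 64-bit. */
static unsigned int fakeTypeOf(uint32_t group, uint32_t counter)
{
    (void)counter;
    return group == 1u ? TYPE_FLOAT : TYPE_UNSIGNED_INT64_AMD;
}
```

    Checking the remaining byte count before every read keeps the walk safe
    when <dataSize> was smaller than the full result and the last record is
    therefore cut short.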

    If no BeginPerfMonitorAMD/EndPerfMonitorAMD has been issued for a monitor,
    then the result of querying for PERFMON_RESULT_AVAILABLE_AMD and
    PERFMON_RESULT_SIZE_AMD will be 0.  When SelectPerfMonitorCountersAMD is
    called on a monitor, the results stored for the monitor become invalidated
    and the PERFMON_RESULT_AVAILABLE_AMD and PERFMON_RESULT_SIZE_AMD queries
    should behave as if no BeginPerfMonitorAMD/EndPerfMonitorAMD has been
    issued for the monitor.

Errors

    INVALID_OPERATION error will be generated if BeginPerfMonitorAMD is unable
    to begin monitoring with the currently selected counters.

    INVALID_OPERATION error will be generated if BeginPerfMonitorAMD is called
    when a performance monitor is already active.

    INVALID_OPERATION error will be generated if EndPerfMonitorAMD is called
    when a performance monitor is not currently started.

    INVALID_VALUE error will be generated if the <group> parameter to
    GetPerfMonitorCountersAMD, GetPerfMonitorGroupStringAMD,
    GetPerfMonitorCounterStringAMD, GetPerfMonitorCounterInfoAMD, or
    SelectPerfMonitorCountersAMD does not reference a valid group ID.

    INVALID_VALUE error will be generated if the <counter> parameter to
    GetPerfMonitorCounterStringAMD or GetPerfMonitorCounterInfoAMD does not
    reference a valid counter ID in the group specified by <group>.

    INVALID_VALUE error will be generated if any of the monitor IDs
    in the <monitors> parameter to DeletePerfMonitorsAMD do not reference
    a valid generated monitor ID.

    INVALID_VALUE error will be generated if the <monitor> parameter to
    SelectPerfMonitorCountersAMD does not reference a monitor created by
    GenPerfMonitorsAMD.

    INVALID_VALUE error will be generated if the <numCounters> parameter to
    SelectPerfMonitorCountersAMD is less than 0.



New State

Sample Usage

    // Assumes the platform's GL headers, including the
    // AMD_performance_monitor entry points, are already included.
    #include <stdlib.h>     // malloc, free
    #include <string.h>     // strcmp

    typedef struct
    {
        GLuint      *counterList;
        int         numCounters;
        int         maxActiveCounters;
    } CounterInfo;

    // Query all group IDs and, for each group, its counter IDs.
    // The caller owns the returned arrays.
    void
    getGroupAndCounterList(GLuint **groupsList, int *numGroups,
                           CounterInfo **counterInfo)
    {
        GLint          n;
        GLuint        *groups;
        CounterInfo   *counters;

        // first call gets the count, second call fills the list
        glGetPerfMonitorGroupsAMD(&n, 0, NULL);
        groups = (GLuint*) malloc(n * sizeof(GLuint));
        glGetPerfMonitorGroupsAMD(NULL, n, groups);
        *numGroups = n;

        *groupsList = groups;
        counters = (CounterInfo*) malloc(sizeof(CounterInfo) * n);
        for (int i = 0 ; i < n; i++ )
        {
            glGetPerfMonitorCountersAMD(groups[i], &counters[i].numCounters,
                                     &counters[i].maxActiveCounters, 0, NULL);

            counters[i].counterList = (GLuint*)malloc(counters[i].numCounters *
                                                      sizeof(GLuint));

            glGetPerfMonitorCountersAMD(groups[i], NULL, NULL,
                                        counters[i].numCounters,
                                        counters[i].counterList);
        }

        *counterInfo = counters;
    }

    static int  countersInitialized = 0;

    // Look up a group ID and counter ID by name.
    // Returns 0 on success, -1 if the group or counter is not found.
    int
    getCounterByName(char *groupName, char *counterName, GLuint *groupID,
                     GLuint *counterID)
    {
        // static so that the lists queried on the first call are still
        // valid on later calls
        static int          numGroups;
        static GLuint       *groups;
        static CounterInfo  *counters;
        int                 i = 0;

        if (!countersInitialized)
        {
            getGroupAndCounterList(&groups, &numGroups, &counters);
            countersInitialized = 1;
        }

        for ( i = 0; i < numGroups; i++ )
        {
           char curGroupName[256];
           glGetPerfMonitorGroupStringAMD(groups[i], 256, NULL, curGroupName);
           if (strcmp(groupName, curGroupName) == 0)
           {
               *groupID = groups[i];
               break;
           }
        }

        if ( i == numGroups )
            return -1;           // error - could not find the group name

        for ( int j = 0; j < counters[i].numCounters; j++ )
        {
            char curCounterName[256];

            glGetPerfMonitorCounterStringAMD(groups[i],
                                             counters[i].counterList[j],
                                             256, NULL, curCounterName);
            if (strcmp(counterName, curCounterName) == 0)
            {
                *counterID = counters[i].counterList[j];
                return 0;
            }
        }

        return -1;           // error - could not find the counter name
    }

    void
    drawFrameWithCounters(void)
    {
        GLuint group[2];
        GLuint counter[2];
        GLuint monitor;
        GLuint *counterData;

        // Get group/counter IDs by name.  Note that normally the
        // group and counter names need to be queried because each
        // implementation of this extension on different hardware
        // could define different names and groups.  The names used
        // here only demonstrate the API.
        getCounterByName("HW", "Hardware Busy", &group[0],
                         &counter[0]);
        getCounterByName("API", "Draw Calls", &group[1],
                         &counter[1]);

        // create perf monitor ID
        glGenPerfMonitorsAMD(1, &monitor);

        // enable the counters
        glSelectPerfMonitorCountersAMD(monitor, GL_TRUE, group[0], 1,
                                       &counter[0]);
        glSelectPerfMonitorCountersAMD(monitor, GL_TRUE, group[1], 1,
                                       &counter[1]);

        glBeginPerfMonitorAMD(monitor);

        // RENDER FRAME HERE
        // ...

        glEndPerfMonitorAMD(monitor);

        // read the counters (collection is asynchronous, so a real
        // application would first check GL_PERFMON_RESULT_AVAILABLE_AMD)
        GLuint resultSize;
        glGetPerfMonitorCounterDataAMD(monitor, GL_PERFMON_RESULT_SIZE_AMD,
                                       sizeof(GLuint), &resultSize, NULL);

        counterData = (GLuint*) malloc(resultSize);

        GLint bytesWritten;
        glGetPerfMonitorCounterDataAMD(monitor, GL_PERFMON_RESULT_AMD,
                                       resultSize, counterData, &bytesWritten);

        // display or log counter info
        GLsizei wordCount = 0;

        while ( (4 * wordCount) < bytesWritten )
        {
            GLuint groupId = counterData[wordCount];
            GLuint counterId = counterData[wordCount + 1];

            // Determine the counter type
            GLuint counterType;
            glGetPerfMonitorCounterInfoAMD(groupId, counterId,
                                           GL_COUNTER_TYPE_AMD, &counterType);

            if ( counterType == GL_UNSIGNED_INT64_AMD )
            {
                // group ID + counter ID + two words of value
                unsigned long long counterResult =
                           *(unsigned long long*)(&counterData[wordCount + 2]);

                // Print counter result

                wordCount += 4;
            }
            else if ( counterType == GL_FLOAT ||
                      counterType == GL_PERCENTAGE_AMD )
            {
                // percentages are floats in the range [0.0 .. 100.0]
                float counterResult = *(float*)(&counterData[wordCount + 2]);

                // Print counter result

                wordCount += 3;
            }
            else
            {
                // GL_UNSIGNED_INT
                GLuint counterResult = counterData[wordCount + 2];

                // Print counter result

                wordCount += 3;
            }
        }

        // release the result buffer and the monitor
        free(counterData);
        glDeletePerfMonitorsAMD(1, &monitor);
    }

Revision History
    11/29/2007 - dginsburg
       + Clarified the default state of a performance monitor object on creation

    11/09/2007 - dginsbur
       + Clarify what happens if SelectPerfMonitorCountersAMD is called on
         a monitor with outstanding query results.
       + Rename counterSize to countersSize
       + Remove some ';' typos

    06/13/2007 - dginsbur
       + Add language on the asynchronous nature of the API and
         counter accuracy/performance distortion.
       + Add myself as the contact
       + Remove INVALID_OPERATION error when countersList is NULL
       + Clarify 64-bit issue
       + Make PERCENTAGE_AMD counters float rather than uint
       + Clarify accuracy distortion on variant counters only
       + Tweak to overview language

    06/09/2007 - dginsbur
       + Fill in errors section and make many more errors explicit
       + Fix the example code so it compiles

    06/08/2007 - dginsbur
       + Modified GetPerfMonitorGroupString and GetPerfMonitorCounterString to
         be more client/server friendly.
       + Modified example.
       + Renamed parameters/variables to follow GL conventions.
       + Modified several 'int' param types to 'sizei'
       + Modified counters type from 'int' to 'uint'
       + Renamed argument 'cb' and 'cbret'
       + Better documented GetPerfMonitorCounterData
       + Add AMD adornment in many places that were missing it

    06/07/2007 - dginsbur
       + Cleanup formatting, remove tabs, make fit in proper page width
       + Add FLOAT and UNSIGNED_INT to list of COUNTER_TYPEs
       + Fix some bugs in the example code
       + Rewrite introduction
       + Clarified Issue 1 reasoning
       + Added Issue 3 regarding use of 64-bit data types
       + Added revision history

    03/21/2007 - Initial version written.  Written by amunshi.