Name

    ARB_timer_query

Name Strings

    GL_ARB_timer_query

Contact

    Piers Daniell, NVIDIA Corporation (pdaniell 'at' nvidia.com)

Contributors

    Axel Mamode, Sony
    Brian Paul, Tungsten Graphics
    Bruce Merry, ARM
    James Jones, NVIDIA Corporation
    Pat Brown, NVIDIA
    Remi Arnaud, Sony

Notice

    Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL Working Group. For
    extensions which have been promoted to a core Specification, fixes will
    first appear in the latest version of that core Specification, and will
    eventually be backported to the extension document. This policy is
    described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete. Approved by the ARB at the 2010/01/22 F2F meeting.
    Approved by the Khronos Board of Promoters on March 10, 2010.

Version

    Last Modified Date: August 9, 2014
    Revision: 13

Number

    ARB Extension #85

Dependencies

    This extension is written against the OpenGL 3.2 specification.

Overview

    Applications can benefit from accurate timing information in a number
    of different ways. During application development, timing information
    can help identify application or driver bottlenecks. At run time,
    applications can use timing information to dynamically adjust the
    amount of detail in a scene to achieve constant frame rates. OpenGL
    implementations have historically provided little to no useful timing
    information. Applications can get some idea of timing by reading timers
    on the CPU, but these timers are not synchronized with the graphics
    rendering pipeline. Reading a CPU timer does not guarantee the
    completion of a potentially large amount of graphics work accumulated
    before the timer is read, and will thus produce wildly inaccurate
    results.
    glFinish() can be used to determine when previous rendering commands
    have been completed, but will idle the graphics pipeline and adversely
    affect application performance.

    This extension provides a query mechanism that can be used to determine
    the amount of time it takes to fully complete a set of GL commands,
    without stalling the rendering pipeline. It uses the query object
    mechanisms first introduced in the occlusion query extension, which
    allow time intervals to be polled asynchronously by the application.

IP Status

    No known IP claims.

New Procedures and Functions

    void QueryCounter(uint id, enum target);
    void GetQueryObjecti64v(uint id, enum pname, int64 *params);
    void GetQueryObjectui64v(uint id, enum pname, uint64 *params);

New Tokens

    Accepted by the <target> parameter of BeginQuery, EndQuery, and
    GetQueryiv:

        TIME_ELAPSED                                   0x88BF

    Accepted by the <target> parameter of GetQueryiv and QueryCounter.
    Accepted by the <value> parameter of GetBooleanv, GetIntegerv,
    GetInteger64v, GetFloatv, and GetDoublev:

        TIMESTAMP                                      0x8E28

Additions to Chapter 2 of the OpenGL 3.2 (Core Profile) Specification
(OpenGL Operation)

    (Modify table 2.1, Correspondence of command suffix letters to GL
    argument types, p. 14) Add one new type and suffix:

        Letter  Corresponding GL Type
        ------  ---------------------
        ui64    uint64

    (Modify Section 2.14, Asynchronous Queries, p. 89)

    Asynchronous queries provide a mechanism to return information about
    the processing of a sequence of GL commands. There are three query
    types supported by the GL. Transform feedback queries (see section
    2.16) return information on the number of vertices and primitives
    processed by the GL and written to one or more buffer objects.
    Occlusion queries (see section 4.1.6) count the number of fragments or
    samples that pass the depth test. Timer queries (section 5.4) record
    the amount of time needed to fully process these commands or the
    current time of the GL.

Additions to Chapter 3 of the OpenGL 3.2 Specification (Rasterization)

    None.
Additions to Chapter 4 of the OpenGL 3.2 Specification (Per-Fragment
Operations and the Framebuffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 Specification (Special Functions)

    (Add new Section 5.4, Timer Queries, p. 246)

    Timer queries use query objects to track the amount of time needed to
    fully complete a set of GL commands, or to determine the current time
    of the GL.

    When BeginQuery and EndQuery are called with a <target> of
    TIME_ELAPSED, the GL prepares to start and stop the timer used for
    timer queries. The timer is started or stopped when the effects from
    all previous commands on the GL client and server state and the
    framebuffer have been fully realized. The BeginQuery and EndQuery
    commands may return before the timer is actually started or stopped.
    When the timer query timer is finally stopped, the elapsed time (in
    nanoseconds) is written to the corresponding query object as the query
    result value, and the query result for that object is marked as
    available.

    If the elapsed time overflows the number of bits, <n>, available to
    hold elapsed time, its value becomes undefined. It is recommended, but
    not required, that implementations handle this overflow case by
    saturating at 2^n - 1.

    A timer query object is created with the command

        void QueryCounter(uint id, enum target);

    <target> must be TIMESTAMP. If <id> is an unused query object name, the
    name is marked as used and associated with a new query object of type
    TIMESTAMP. Otherwise <id> must be the name of an existing query object
    of that type.

    When QueryCounter is called, the GL records the current time into the
    corresponding query object. The time is recorded after all previous
    commands on the GL client and server state and the framebuffer have
    been fully realized. When the time is recorded, the query result for
    that object is marked available. QueryCounter timer queries can be
    used within a BeginQuery / EndQuery block where the <target> is
    TIME_ELAPSED and it does not affect the result of that query object.
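    As an aside on the overflow rule for TIME_ELAPSED results stated
    earlier in this section, the following C sketch illustrates what
    saturating at 2^n - 1 looks like and how large an interval an n-bit
    nanosecond counter can hold. The function names max_interval_ns and
    saturate_elapsed_ns are hypothetical and are not part of the GL API;
    the sketch only illustrates the recommended (not required) behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Largest interval, in nanoseconds, that an n-bit counter can hold,
 * i.e. 2^n - 1. Hypothetical helper, not a GL entry point. */
uint64_t max_interval_ns(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so handle it apart. */
    return bits == 64 ? UINT64_MAX : (((uint64_t)1 << bits) - 1);
}

/* Saturate an elapsed time at 2^n - 1, the overflow handling that the
 * text recommends but does not require of implementations. */
uint64_t saturate_elapsed_ns(uint64_t elapsed_ns, unsigned bits)
{
    uint64_t max = max_interval_ns(bits);
    return elapsed_ns > max ? max : elapsed_ns;
}
```

    For instance, max_interval_ns(30) is 1073741823 ns (just over one
    second) and max_interval_ns(32) is 4294967295 ns (roughly 4.29
    seconds), so a 5-second interval passed through
    saturate_elapsed_ns(..., 32) would be reported as 4294967295.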
    ** core profile only
    QueryCounter fails and an INVALID_OPERATION error is generated if <id>
    is not a name returned from a previous call to GenQueries, or if such
    a name has since been deleted with DeleteQueries.
    ** end core profile only

    If <id> is already in use within a BeginQuery / EndQuery block, or if
    <id> is the name of an existing query object whose type does not match
    <target>, an INVALID_OPERATION error is generated.

    The current time of the GL may be queried by calling GetIntegerv or
    GetInteger64v with the symbolic constant TIMESTAMP. This will return
    the GL time after all previous commands have reached the GL server but
    have not yet necessarily executed. By using a combination of this
    synchronous get command and the asynchronous timestamp query object
    target, applications can measure the latency between when commands
    reach the GL server and when they are realized in the framebuffer.

Additions to Chapter 6 of the OpenGL 3.2 Specification (State and State
Requests)

    (Modify Section 6.1.6, Asynchronous Queries, p. 255)

    Section 6.1.6, Asynchronous Queries

    The command

        boolean IsQuery(uint id);

    returns TRUE if <id> is the name of a query object. If <id> is zero,
    or if <id> is a non-zero value that is not the name of a query object,
    IsQuery returns FALSE.

    Information about a query target can be queried with the command

        void GetQueryiv(enum target, enum pname, int *params);

    <target> identifies the query target and can be SAMPLES_PASSED for
    occlusion queries, PRIMITIVES_GENERATED and
    TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN for primitive queries, or
    TIME_ELAPSED or TIMESTAMP for timer queries.

    If <pname> is CURRENT_QUERY, the name of the currently active query
    for <target>, or zero if no query is active, will be placed in
    <params>.

    If <pname> is QUERY_COUNTER_BITS, the implementation-dependent number
    of bits used to hold the query result for <target> will be placed in
    <params>. The number of query counter bits may be zero, in which case
    the counter contains no useful information.
    For primitive queries (PRIMITIVES_GENERATED and
    TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN) if the number of bits is
    non-zero, the minimum number of bits allowed is 32.

    For occlusion queries (SAMPLES_PASSED), if the number of bits is
    non-zero, the minimum number of bits allowed is a function of the
    implementation's maximum viewport dimensions (MAX_VIEWPORT_DIMS). The
    counter must be able to represent at least two overdraws for every
    pixel in the viewport. The formula to compute the allowable minimum
    value (where <n> is the minimum number of bits) is:

        n = min(32, ceil(log_2(maxViewportWidth * maxViewportHeight * 2))).

    For timer queries (TIME_ELAPSED and TIMESTAMP), if the number of bits
    is non-zero, the minimum number of bits allowed is 30, which will
    allow at least 1 second of timing.

    The state of a query object can be queried with the commands

        void GetQueryObjectiv(uint id, enum pname, int *params);
        void GetQueryObjectuiv(uint id, enum pname, uint *params);
        void GetQueryObjecti64v(uint id, enum pname, int64 *params);
        void GetQueryObjectui64v(uint id, enum pname, uint64 *params);

    If <id> is not the name of a query object, or if the query object
    named by <id> is currently active, then an INVALID_OPERATION error is
    generated.

    If <pname> is QUERY_RESULT, then the query object's result value is
    returned as a single integer in <params>. If the value is so large in
    magnitude that it cannot be represented with the requested type, then
    the nearest value representable using the requested type is returned.
    If the number of query counter bits for target is zero, then the
    result is returned as a single integer with the value zero.

    There may be an indeterminate delay before the above query returns. If
    <pname> is QUERY_RESULT_AVAILABLE, FALSE is returned if such a delay
    would be required; otherwise TRUE is returned. It must always be true
    that if any query object returns a result available of TRUE, all
    queries of the same type issued prior to that query must also return
    TRUE.
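    As a rough numerical check of the counter-width formula and the
    "nearest value representable" rule above, the following C sketch
    evaluates the minimum occlusion counter width for a given maximum
    viewport and clamps a 64-bit result to a 32-bit unsigned type. The
    helper names ceil_log2, min_occlusion_bits and clamp_to_u32 are
    hypothetical and not part of the GL API; an actual implementation
    performs the clamping internally.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n such that 2^n >= x, for x >= 1. Hypothetical helper;
 * integer arithmetic avoids floating-point rounding in log2(). */
unsigned ceil_log2(uint64_t x)
{
    unsigned n = 0;
    assert(x >= 1);
    while (n < 64 && ((uint64_t)1 << n) < x)
        n++;
    return n;
}

/* n = min(32, ceil(log_2(maxViewportWidth * maxViewportHeight * 2))) */
unsigned min_occlusion_bits(uint64_t max_width, uint64_t max_height)
{
    unsigned n = ceil_log2(max_width * max_height * 2);
    return n < 32 ? n : 32;
}

/* Mimics what GetQueryObjectuiv is described as doing when the result
 * does not fit the requested type: return the nearest representable value. */
uint32_t clamp_to_u32(uint64_t value)
{
    return value > UINT32_MAX ? UINT32_MAX : (uint32_t)value;
}
```

    With an assumed maximum viewport of 16384 x 16384, the product
    16384 * 16384 * 2 equals 2^29, so min_occlusion_bits returns 29. A
    timer result of 5000000000 ns read through a 32-bit entry point would
    be clamped to 4294967295, which is why the examples in this document
    read timer results with the 64-bit entry points.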
    Querying the state for any given query object forces that query to
    complete within a finite amount of time.

    If multiple queries are issued using the same object name prior to
    calling GetQueryObject[u]i[64]v, the result and availability
    information returned will always be from the last query issued. The
    results from any queries before the last one will be lost if they are
    not retrieved before starting a new query on the same <target> and
    <id>.

Interactions with NV_present_video and NV_video_capture

    The GL timer recorded by this extension is the same timer as that used
    by the NV_present_video and NV_video_capture extensions. This allows
    the timer to be used with any of these extensions interchangeably.

Interactions with the Compatibility Profile

    In the compatibility profile, query objects support
    application-provided names, and the language requiring an error if
    <id> is not a name returned from GenQueries is removed. This is noted
    in the body text above.

Errors

    The error INVALID_ENUM is generated if BeginQuery or EndQuery is
    called where <target> is not SAMPLES_PASSED,
    TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN or TIME_ELAPSED.

    The error INVALID_ENUM is generated if GetQueryiv is called where
    <target> is not SAMPLES_PASSED, TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN,
    TIME_ELAPSED or TIMESTAMP.

    The error INVALID_ENUM is generated if QueryCounter is called where
    <target> is not TIMESTAMP.

    The error INVALID_OPERATION is generated if QueryCounter is called on
    a query object that is already in use inside a BeginQuery/EndQuery.

    The error INVALID_OPERATION is generated if QueryCounter is called on
    a query object whose type is not TIMESTAMP.

    (in the core profile only) The error INVALID_OPERATION is generated if
    QueryCounter is called where <id> is not a name returned from a
    previous call to GenQueries, or if such a name has since been deleted
    with DeleteQueries.

    The error INVALID_OPERATION is generated if GetQueryObjecti64v or
    GetQueryObjectui64v is called where <id> is not the name of a query
    object.
    The error INVALID_OPERATION is generated if GetQueryObjecti64v or
    GetQueryObjectui64v is called where <id> is the name of a currently
    active query object.

    The error INVALID_ENUM is generated if GetQueryObjecti64v or
    GetQueryObjectui64v is called where <pname> is not QUERY_RESULT or
    QUERY_RESULT_AVAILABLE.

New State

    None.

Examples

    (1) Here is some rough sample code that demonstrates the intended usage
        of this extension.

        GLuint queries[N];
        GLint available = 0;
        // timer queries can contain more than 32 bits of data, so always
        // query them using the 64 bit types to avoid overflow
        GLuint64 timeElapsed = 0;

        // Create the query objects.
        glGenQueries(N, queries);

        // Start query 1
        glBeginQuery(GL_TIME_ELAPSED, queries[0]);

        // Draw object 1
        ....

        // End query 1
        glEndQuery(GL_TIME_ELAPSED);

        ...

        // Start query N
        glBeginQuery(GL_TIME_ELAPSED, queries[N-1]);

        // Draw object N
        ....

        // End query N
        glEndQuery(GL_TIME_ELAPSED);

        // Wait for all results to become available
        while (!available) {
            glGetQueryObjectiv(queries[N-1], GL_QUERY_RESULT_AVAILABLE,
                               &available);
        }

        for (int i = 0; i < N; i++) {
            // See how much time the rendering of object i took in
            // nanoseconds.
            glGetQueryObjectui64v(queries[i], GL_QUERY_RESULT,
                                  &timeElapsed);

            // Do something useful with the time. Note that care should be
            // taken to use all significant bits of the result, not just
            // the least significant 32 bits.
            AdjustObjectLODBasedOnDrawTime(i, timeElapsed);
        }

        This example is sub-optimal in that it stalls at the end of every
        frame to wait for query results. Ideally, the collection of results
        would be delayed one frame to minimize the amount of time spent
        waiting for the GPU to finish rendering.

    (2) This example is basically the same as the example above but uses
        QueryCounter instead.

        GLuint queries[N+1];
        GLint available = 0;
        // timer queries can contain more than 32 bits of data, so always
        // query them using the 64 bit types to avoid overflow
        GLuint64 timeStart, timeEnd, timeElapsed = 0;

        // Create the query objects.
        glGenQueries(N+1, queries);

        // Query current timestamp 1
        glQueryCounter(queries[0], GL_TIMESTAMP);

        // Draw object 1
        ....

        // Query current timestamp N
        glQueryCounter(queries[N-1], GL_TIMESTAMP);

        // Draw object N
        ....

        // Query current timestamp N+1
        glQueryCounter(queries[N], GL_TIMESTAMP);

        // Wait for all results to become available
        while (!available) {
            glGetQueryObjectiv(queries[N], GL_QUERY_RESULT_AVAILABLE,
                               &available);
        }

        for (int i = 0; i < N; i++) {
            // See how much time the rendering of object i took in
            // nanoseconds.
            glGetQueryObjectui64v(queries[i], GL_QUERY_RESULT, &timeStart);
            glGetQueryObjectui64v(queries[i+1], GL_QUERY_RESULT, &timeEnd);
            timeElapsed = timeEnd - timeStart;

            // Do something useful with the time. Note that care should be
            // taken to use all significant bits of the result, not just
            // the least significant 32 bits.
            AdjustObjectLODBasedOnDrawTime(i, timeElapsed);
        }

    (3) This example demonstrates how to measure the latency between GL
        commands reaching the server and being realized in the framebuffer.

        /* Submit a frame of rendering commands */
        while (!doneRendering) {
            ...
            glDrawElements(...);
        }

        /*
         * Measure rendering latency:
         *
         * Some commands may have already been submitted to hardware,
         * and some of those may have already completed. The goal is
         * to measure the time it takes for the remaining commands to
         * complete, thereby measuring how far behind the app the GPU
         * is lagging, but without synchronizing the GPU with the CPU.
         */

        /* Queue a query to find out when the frame finishes on the GL */
        glQueryCounter(endFrameQuery, GL_TIMESTAMP);

        /* Get the current GL time without stalling the GL
         * (flushTime is a GLint64) */
        glGetInteger64v(GL_TIMESTAMP, &flushTime);

        /* Finish the frame, submitting outstanding commands to the GL */
        SwapBuffers();

        /* Render another frame */

        /*
         * Later, compare the query result of <endFrameQuery>
         * and <flushTime> to measure the latency of the frame
         */

Issues from EXT_timer_query

    (1) What time interval is being measured?
    RESOLVED: The timer starts when all commands prior to BeginQuery() have
    been fully executed. At that point, everything that should be drawn by
    those commands has been written to the framebuffer. The timer stops
    when all commands prior to EndQuery() have been fully executed.

    (2) What unit of time will time intervals be returned in?

    RESOLVED: Nanoseconds (10^-9 seconds). This unit of measurement allows
    for reasonably accurate timing of even small blocks of rendering
    commands. The granularity of the timer is implementation-dependent. A
    32-bit query counter can express intervals of up to approximately 4
    seconds.

    (3) What should be the minimum number of counter bits for timer
    queries?

    RESOLVED: 30 bits, which will allow timing sections that take up to 1
    second to render.

    (4) How are counter results of more than 32 bits returned?

    RESOLVED: Via two new datatypes, int64EXT and uint64EXT, and their
    corresponding GetQueryObject entry points. These types hold integer
    values and have a minimum bit width of 64.

    UPDATE: This resolution was relevant for EXT_timer_query and OpenGL
    2.0. OpenGL 3.2 now has int64 and uint64 datatypes as part of the core
    spec.

    (5) Should the extension measure total time elapsed between the full
    completion of the BeginQuery and EndQuery commands, or just time spent
    in the graphics library?

    RESOLVED: This extension will measure the total time elapsed between
    the full completion of these commands. Future extensions may implement
    a query to determine time elapsed at different stages of the graphics
    pipeline.

    (6) This extension introduces a second query type supported by
    BeginQuery/EndQuery. Can multiple query types be active
    simultaneously?

    RESOLVED: Yes; an application may perform an occlusion query and a
    timer query simultaneously. An application can not perform multiple
    occlusion queries or multiple timer queries simultaneously. An
    application also can not use the same query object for an occlusion
    query and a timer query simultaneously.
    (7) Do query objects have a query type permanently associated with
    them?

    RESOLVED: No. A single query object can be used to perform different
    types of queries, but not at the same time.

    Having a fixed type for each query object simplifies some aspects of
    the implementation -- not having to deal with queries with different
    result sizes, for example. It would also mean that BeginQuery() with a
    query object of the "wrong" type would result in an INVALID_OPERATION
    error.

    UPDATE: This resolution was relevant for EXT_timer_query and OpenGL
    2.0. Since EXT_transform_feedback has since been incorporated into the
    core, the resolution is that BeginQuery will generate error
    INVALID_OPERATION if <id> represents a query object of a different
    type.

    (8) How predictable/repeatable are the results returned by the timer
    query?

    RESOLVED: In general, the amount of time needed to render the same
    primitives should be fairly constant. But there may be many other
    system issues (e.g., context switching on the CPU and GPU, virtual
    memory page faults, memory cache behavior on the CPU and GPU) that can
    cause times to vary wildly.

    Note that modern GPUs are generally highly pipelined, and may be
    processing different primitives in different pipeline stages
    simultaneously. In this extension, the timers start and stop when the
    BeginQuery/EndQuery commands reach the bottom of the rendering
    pipeline. What that means is that by the time the timer starts, the GL
    driver on the CPU may have started work on GL commands issued after
    BeginQuery, and the higher pipeline stages (e.g., vertex
    transformation) may have started as well.

    (9) What should the new 64 bit integer type be called?

    RESOLVED: The new types will be called GLint64EXT/GLuint64EXT. The new
    command suffixes will be i64 and ui64. These names clearly convey the
    minimum size of the types. These types are similar to the C99 standard
    type int_least64_t, but we use names similar to the C99 optional type
    int64_t for simplicity.
    UPDATE: This resolution was relevant for EXT_timer_query and OpenGL
    2.0. OpenGL 3.2 now has int64 and uint64 datatypes as part of the core
    spec. The i64 suffix already exists in OpenGL 3.2 and the ui64 suffix
    has been added as part of this extension.

Issues

    (10) What about tile-based implementations? The effects of a command
    are not complete until the frame is completely rendered. Timing
    recorded before the frame is complete may not be what developers
    expect. Also the amount of time needed to render the same primitives
    is not consistent, which conflicts with issue (8) above. The time
    depends on how early or late in the scene it is placed.

    RESOLVED: The current language supports tile-based rendering okay as
    it is written. Developers are warned that using timers on tile-based
    implementations may not produce results they expect since rendering is
    not done in a linear order. Timing results are calculated when the
    frame is completed and may depend on how early or late in the scene it
    is placed.

    (11) Can the GL implementation use different clocks to implement the
    TIME_ELAPSED and TIMESTAMP queries?

    RESOLVED: Yes, the implementation can use different internal clocks to
    implement TIME_ELAPSED and TIMESTAMP. If different clocks are used it
    is possible there is a slight discrepancy when comparing queries made
    from TIME_ELAPSED and TIMESTAMP; they may have slight differences when
    both are used to measure the same sequence. However, this is unlikely
    to affect real applications since comparing the two queries is not
    expected to be useful.

    (12) Why do BeginQuery and QueryCounter have the same arguments in the
    opposite order?

    RESOLVED: Due to an unfortunate oversight, which cannot be fixed at
    this point.

Revision History

    Rev.  Date          Author     Changes
    ----  ------------  --------   ----------------------------------------
     13   Aug 9, 2014   Jon Leech  Fix typo in example 3 (bug 12552).
     12   Jul 11, 2013  Jon Leech  Change type of queries[] in sample code
                                   to GLuint (public bug 432).
     11   Apr 13, 2012  Jon Leech  Clean up error language, add error for
                                   query objects which are not of type
                                   TIMESTAMP, and add issue 12 (Khronos
                                   internal bug 7662)
     10   June 3, 2011  dkoch      Add INVALID_OPERATION error when calling
                                   QueryCounter with a non-generated <id>
                                   in the core profile (Khronos internal
                                   bug 7662).
      9   Dec 18, 2009  pdaniell   Remove ambiguous language about
                                   "interuptions to the GL". Rename
                                   CURRENT_TIME to TIMESTAMP.
      8   Dec 10, 2009  Jon Leech  Improve description of QueryCounter
                                   command.
      7   Dec 10, 2009  Jon Leech  Replace non-ASCII punctuation.
      6   Dec 07, 2009  pdaniell   Remove ARB suffix from new tokens for
                                   core.
      5   Oct 29, 2009  pdaniell   TIMESTAMP_ARB renamed to
                                   CURRENT_TIME_ARB. Issue (11) raised
                                   about using different clocks to
                                   implement CURRENT_TIME and TIME_ELAPSED
                                   queries. Add example (3) for calculating
                                   the GL latency.
      4   Oct 23, 2009  pdaniell   Add support for TIMESTAMP_ARB as a
                                   <pname> to Get* to allow synchronous
                                   time query.
      3   Oct 15, 2009  pdaniell   Resolved Issue (10). Added Interactions
                                   with NV_present_video and
                                   NV_video_capture section.
      2   Oct 15, 2009  pdaniell   Clarified some of the old
                                   EXT_timer_query Issues wrt OpenGL 3.2.
                                   Added specification for the
                                   TIMESTAMP_ARB time. Added new Issue for
                                   tile-based implementations. Issue 3
                                   resolution added to the spec.
      1   Oct 13, 2009  pdaniell   Initial revision based on
                                   EXT_timer_query