Name

    ARB_timer_query

Name Strings

    GL_ARB_timer_query

Contact

    Piers Daniell, NVIDIA Corporation (pdaniell 'at' nvidia.com)

Contributors

    Axel Mamode, Sony
    Brian Paul, Tungsten Graphics
    Bruce Merry, ARM
    James Jones, NVIDIA Corporation
    Pat Brown, NVIDIA
    Remi Arnaud, Sony

Notice

    Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL Working Group. For
    extensions which have been promoted to a core Specification, fixes will
    first appear in the latest version of that core Specification, and will
    eventually be backported to the extension document. This policy is
    described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete. Approved by the ARB at the 2010/01/22 F2F meeting.
    Approved by the Khronos Board of Promoters on March 10, 2010.

Version

    Last Modified Date: August 9, 2014
    Revision: 13

Number

    ARB Extension #85

Dependencies

    This extension is written against the OpenGL 3.2 specification.

Overview

    Applications can benefit from accurate timing information in a number of
    different ways. During application development, timing information can
    help identify application or driver bottlenecks. At run time,
    applications can use timing information to dynamically adjust the amount
    of detail in a scene to achieve constant frame rates. OpenGL
    implementations have historically provided little to no useful timing
    information. Applications can get some idea of timing by reading timers
    on the CPU, but these timers are not synchronized with the graphics
    rendering pipeline.
    Reading a CPU timer does not guarantee the completion
    of a potentially large amount of graphics work accumulated before the
    timer is read, and will thus produce wildly inaccurate results.
    glFinish() can be used to determine when previous rendering commands have
    been completed, but will idle the graphics pipeline and adversely affect
    application performance.

    This extension provides a query mechanism that can be used to determine
    the amount of time it takes to fully complete a set of GL commands
    without stalling the rendering pipeline. It uses the query object
    mechanisms first introduced in the occlusion query extension, which allow
    time intervals to be polled asynchronously by the application.

IP Status

    No known IP claims.

New Procedures and Functions

    void QueryCounter(uint id, enum target);

    void GetQueryObjecti64v(uint id, enum pname, int64 *params);
    void GetQueryObjectui64v(uint id, enum pname, uint64 *params);

New Tokens

    Accepted by the <target> parameter of BeginQuery, EndQuery, and
    GetQueryiv:

        TIME_ELAPSED                                   0x88BF

    Accepted by the <target> parameter of GetQueryiv and QueryCounter.
    Accepted by the <value> parameter of GetBooleanv, GetIntegerv,
    GetInteger64v, GetFloatv, and GetDoublev:

        TIMESTAMP                                      0x8E28

Additions to Chapter 2 of the OpenGL 3.2 (Core Profile) Specification
(OpenGL Operation)

    (Modify table 2.1, Correspondence of command suffix letters to GL argument
    types, p. 14) Add one new type and suffix:

    Letter Corresponding GL Type
    ------ ---------------------
    ui64   uint64

    (Modify Section 2.14, Asynchronous Queries, p. 89)

    Asynchronous queries provide a mechanism to return information about the
    processing of a sequence of GL commands. There are three query types
    supported by the GL.
    Transform feedback queries (see section 2.16) return
    information on the number of vertices and primitives processed by the GL
    and written to one or more buffer objects. Occlusion queries (see section
    4.1.6) count the number of fragments or samples that pass the depth test.
    Timer queries (section 5.4) record the amount of time needed to fully
    process these commands or the current time of the GL.

Additions to Chapter 3 of the OpenGL 3.2 Specification (Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 Specification (Per-Fragment
Operations and the Framebuffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 Specification (Special Functions)

    (Add new Section 5.4, Timer Queries, p. 246)

    Timer queries use query objects to track the amount of time needed to
    fully complete a set of GL commands, or to determine the current time
    of the GL.

    When BeginQuery and EndQuery are called with a <target> of
    TIME_ELAPSED, the GL prepares to start and stop the timer used for
    timer queries. The timer is started or stopped when the effects from all
    previous commands on the GL client and server state and the framebuffer
    have been fully realized. The BeginQuery and EndQuery commands may return
    before the timer is actually started or stopped. When the timer query
    timer is finally stopped, the elapsed time (in nanoseconds) is written to
    the corresponding query object as the query result value, and the query
    result for that object is marked as available.

    If the elapsed time overflows the number of bits, <n>, available to hold
    elapsed time, its value becomes undefined. It is recommended, but not
    required, that implementations handle this overflow case by saturating at
    2^n - 1.

    A timer query object is created with the command

        void QueryCounter(uint id, enum target);

    <target> must be TIMESTAMP.
    If <id> is an unused query object name, the
    name is marked as used and associated with a new query object of type
    TIMESTAMP. Otherwise <id> must be the name of an existing query object
    of that type.

    When QueryCounter is called, the GL records the current time into
    the corresponding query object. The time is recorded after all previous
    commands on the GL client and server state and the framebuffer have been
    fully realized. When the time is recorded, the query result for that
    object is marked available. QueryCounter timer queries can be used
    within a BeginQuery / EndQuery block where the <target> is TIME_ELAPSED,
    and doing so does not affect the result of that query object.

** core profile only
    QueryCounter fails and an INVALID_OPERATION error is generated if <id>
    is not a name returned from a previous call to GenQueries, or if such a
    name has since been deleted with DeleteQueries.
** end core profile only

    If <id> is already in use within a BeginQuery / EndQuery block, or if
    <id> is the name of an existing query object whose type does not match
    <target>, an INVALID_OPERATION error is generated.

    The current time of the GL may be queried by calling GetIntegerv or
    GetInteger64v with the symbolic constant TIMESTAMP. This will return the
    GL time after all previous commands have reached the GL server but have
    not yet necessarily executed. By using a combination of this synchronous
    get command and the asynchronous timestamp query object target,
    applications can measure the latency between when commands reach the GL
    server and when they are realized in the framebuffer.

Additions to Chapter 6 of the OpenGL 3.2 Specification (State and State
Requests)

    (Modify Section 6.1.6, Asynchronous Queries, p.
    255)

    Section 6.1.6, Asynchronous Queries

    The command

        boolean IsQuery(uint id);

    returns TRUE if <id> is the name of a query object. If <id> is zero, or if
    <id> is a non-zero value that is not the name of a query object, IsQuery
    returns FALSE.

    Information about a query target can be queried with the command

        void GetQueryiv(enum target, enum pname, int *params);

    <target> identifies the query target and can be SAMPLES_PASSED for
    occlusion queries, PRIMITIVES_GENERATED and
    TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN for primitive queries, or
    TIME_ELAPSED or TIMESTAMP for timer queries.

    If <pname> is CURRENT_QUERY, the name of the currently active query for
    <target>, or zero if no query is active, will be placed in <params>.

    If <pname> is QUERY_COUNTER_BITS, the implementation-dependent number of
    bits used to hold the query result for <target> will be placed in
    <params>. The number of query counter bits may be zero, in which case
    the counter contains no useful information.

    For primitive queries (PRIMITIVES_GENERATED and
    TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN) if the number of bits is non-zero,
    the minimum number of bits allowed is 32.

    For occlusion queries (SAMPLES_PASSED), if the number of bits is
    non-zero, the minimum number of bits allowed is a function of the
    implementation's maximum viewport dimensions (MAX_VIEWPORT_DIMS). The
    counter must be able to represent at least two overdraws for every pixel
    in the viewport. The formula to compute the allowable minimum value
    (where <n> is the minimum number of bits) is:

        n = min(32, ceil(log_2(maxViewportWidth * maxViewportHeight * 2))).

    For timer queries (TIME_ELAPSED and TIMESTAMP), if the number
    of bits is non-zero, the minimum number of bits allowed is 30 which
    will allow at least 1 second of timing.
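
    The arithmetic behind the 30-bit minimum can be checked with a short
    sketch. It assumes only that results are counted in nanoseconds, as
    stated in Section 5.4. The helper names max_counter_value and
    max_interval_seconds are hypothetical and are not part of the GL API.

```c
#include <assert.h>
#include <stdint.h>

/* Largest value an unsigned counter with `bits` bits can hold (2^bits - 1).
 * Hypothetical helper for illustration only. */
uint64_t max_counter_value(unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    /* Shifting a 64-bit value by 64 is undefined in C, so the full-width
     * case is handled separately. */
    return (bits == 64) ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/* The same limit expressed in seconds, given nanosecond units. */
double max_interval_seconds(unsigned bits)
{
    return (double)max_counter_value(bits) / 1e9;
}
```

    With 30 bits the counter tops out just above one second, and with 32
    bits just above four seconds, which is why an application that may time
    longer intervals would want to check QUERY_COUNTER_BITS first.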

    The state of a query object can be queried with the commands

        void GetQueryObjectiv(uint id, enum pname, int *params);
        void GetQueryObjectuiv(uint id, enum pname, uint *params);
        void GetQueryObjecti64v(uint id, enum pname, int64 *params);
        void GetQueryObjectui64v(uint id, enum pname, uint64 *params);

    If <id> is not the name of a query object, or if the query object named
    by <id> is currently active, then an INVALID_OPERATION error is
    generated.

    If <pname> is QUERY_RESULT, then the query object's result
    value is returned as a single integer in <params>. If the value is so
    large in magnitude that it cannot be represented with the requested type,
    then the nearest value representable using the requested type is
    returned. If the number of query counter bits for target is zero, then
    the result is returned as a single integer with the value zero.

    There may be an indeterminate delay before the above query returns. If
    <pname> is QUERY_RESULT_AVAILABLE, FALSE is returned if such a delay
    would be required; otherwise TRUE is returned. It must always be true
    that if any query object returns a result available of TRUE, all queries
    of the same type issued prior to that query must also return TRUE.

    Querying the state for any given query object forces that
    query to complete within a finite amount of time.

    If multiple queries are issued using the same object name prior to
    calling GetQueryObject[u]i[64]v, the result and availability information
    returned will always be from the last query issued. The results from any
    queries before the last one will be lost if they are not retrieved before
    starting a new query on the same <target> and <id>.

Interactions with NV_present_video and NV_video_capture

    The GL timer recorded by this extension is the same timer as that used
    by the NV_present_video and NV_video_capture extensions.
    This allows
    the timer to be used with any of these extensions interchangeably.

Interactions with the Compatibility Profile

    In the compatibility profile, query objects support application-provided
    names, and the language requiring an error if <id> is not a name
    returned from GenQueries is removed. This is noted in the body text
    above.

Errors

    The error INVALID_ENUM is generated if BeginQuery or EndQuery is called
    where <target> is not SAMPLES_PASSED,
    TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN or TIME_ELAPSED.

    The error INVALID_ENUM is generated if GetQueryiv is called where
    <target> is not SAMPLES_PASSED, TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN,
    TIME_ELAPSED or TIMESTAMP.

    The error INVALID_ENUM is generated if QueryCounter is called where
    <target> is not TIMESTAMP.

    The error INVALID_OPERATION is generated if QueryCounter is called
    on a query object that is already in use inside a BeginQuery/EndQuery.

    The error INVALID_OPERATION is generated if QueryCounter is called on
    a query object whose type is not TIMESTAMP.

    (in the core profile only)
    The error INVALID_OPERATION is generated if QueryCounter is called
    where <id> is not a name returned from a previous call to GenQueries,
    or if such a name has since been deleted with DeleteQueries.

    The error INVALID_OPERATION is generated if GetQueryObjecti64v or
    GetQueryObjectui64v is called where <id> is not the name of a query
    object.

    The error INVALID_OPERATION is generated if GetQueryObjecti64v or
    GetQueryObjectui64v is called where <id> is the name of a currently
    active query object.

    The error INVALID_ENUM is generated if GetQueryObjecti64v or
    GetQueryObjectui64v is called where <pname> is not QUERY_RESULT or
    QUERY_RESULT_AVAILABLE.

New State

    None.
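
    The QueryCounter error rules listed under Errors can be summarized as a
    toy model. This is a sketch only, not driver code: every TOY_ name and
    the toy_query structure are hypothetical, and the order in which the
    checks are made is an assumption, since this specification does not
    state which error takes precedence when several conditions hold.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Token values as listed under New Tokens; the error codes use the
 * standard GL values. All TOY_ names are hypothetical. */
enum {
    TOY_NO_ERROR          = 0,
    TOY_INVALID_ENUM      = 0x0500,
    TOY_INVALID_OPERATION = 0x0502,
    TOY_TIME_ELAPSED      = 0x88BF,
    TOY_TIMESTAMP         = 0x8E28
};

/* Minimal stand-in for the state the error rules depend on. */
struct toy_query {
    bool generated; /* name came from GenQueries and was not deleted */
    bool active;    /* currently inside a BeginQuery/EndQuery block */
    int  type;      /* 0 until first use, else the target it was used with */
};

/* Returns the error QueryCounter would generate in this model. */
int toy_query_counter_error(const struct toy_query *q, int target,
                            bool core_profile)
{
    assert(q != NULL);
    if (target != TOY_TIMESTAMP)
        return TOY_INVALID_ENUM;       /* <target> is not TIMESTAMP */
    if (core_profile && !q->generated)
        return TOY_INVALID_OPERATION;  /* core profile: name not generated */
    if (q->active)
        return TOY_INVALID_OPERATION;  /* in use inside Begin/EndQuery */
    if (q->type != 0 && q->type != TOY_TIMESTAMP)
        return TOY_INVALID_OPERATION;  /* existing object of another type */
    return TOY_NO_ERROR;
}
```

    In this model a freshly generated name succeeds, while a name
    previously used with TIME_ELAPSED produces INVALID_OPERATION.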

Examples

    (1) Here is some rough sample code that demonstrates the intended usage
        of this extension.

        GLuint queries[N];
        GLint available = 0;
        int i;
        // timer queries can contain more than 32 bits of data, so always
        // query them using the 64 bit types to avoid overflow
        GLuint64 timeElapsed = 0;

        // Create N query objects.
        glGenQueries(N, queries);

        // Start query 1
        glBeginQuery(GL_TIME_ELAPSED, queries[0]);

        // Draw object 1
        ....

        // End query 1
        glEndQuery(GL_TIME_ELAPSED);

        ...

        // Start query N
        glBeginQuery(GL_TIME_ELAPSED, queries[N-1]);

        // Draw object N
        ....

        // End query N
        glEndQuery(GL_TIME_ELAPSED);

        // Wait for all results to become available
        while (!available) {
            glGetQueryObjectiv(queries[N-1], GL_QUERY_RESULT_AVAILABLE, &available);
        }

        for (i = 0; i < N; i++) {
            // See how much time the rendering of object i took in nanoseconds.
            glGetQueryObjectui64v(queries[i], GL_QUERY_RESULT, &timeElapsed);

            // Do something useful with the time. Note that care should be
            // taken to use all significant bits of the result, not just the
            // least significant 32 bits.
            AdjustObjectLODBasedOnDrawTime(i, timeElapsed);
        }

        This example is sub-optimal in that it stalls at the end of every
        frame to wait for query results. Ideally, the collection of results
        would be delayed one frame to minimize the amount of time spent
        waiting for the GPU to finish rendering.

    (2) This example is basically the same as the example above but uses
        QueryCounter instead.

        GLuint queries[N+1];
        GLint available = 0;
        int i;
        // timer queries can contain more than 32 bits of data, so always
        // query them using the 64 bit types to avoid overflow
        GLuint64 timeStart, timeEnd, timeElapsed = 0;

        // Create N+1 query objects.
        glGenQueries(N+1, queries);

        // Query current timestamp 1
        glQueryCounter(queries[0], GL_TIMESTAMP);

        // Draw object 1
        ....

        // Query current timestamp N
        glQueryCounter(queries[N-1], GL_TIMESTAMP);

        // Draw object N
        ....

        // Query current timestamp N+1
        glQueryCounter(queries[N], GL_TIMESTAMP);

        // Wait for all results to become available
        while (!available) {
            glGetQueryObjectiv(queries[N], GL_QUERY_RESULT_AVAILABLE, &available);
        }

        for (i = 0; i < N; i++) {
            // See how much time the rendering of object i took in nanoseconds.
            glGetQueryObjectui64v(queries[i], GL_QUERY_RESULT, &timeStart);
            glGetQueryObjectui64v(queries[i+1], GL_QUERY_RESULT, &timeEnd);
            timeElapsed = timeEnd - timeStart;

            // Do something useful with the time. Note that care should be
            // taken to use all significant bits of the result, not just the
            // least significant 32 bits.
            AdjustObjectLODBasedOnDrawTime(i, timeElapsed);
        }

    (3) This example demonstrates how to measure the latency between GL
        commands reaching the server and being realized in the framebuffer.

        /* Submit a frame of rendering commands */
        while (!doneRendering) {
            ...
            glDrawElements(...);
        }

        /*
         * Measure rendering latency:
         *
         * Some commands may have already been submitted to hardware,
         * and some of those may have already completed. The goal is
         * to measure the time it takes for the remaining commands to
         * complete, thereby measuring how far behind the app the GPU
         * is lagging, but without synchronizing the GPU with the CPU.
         */

        /* Queue a query to find out when the frame finishes on the GL */
        glQueryCounter(endFrameQuery, GL_TIMESTAMP);

        /* Get the current GL time without stalling the GL
         * (flushTime is a GLint64) */
        glGetInteger64v(GL_TIMESTAMP, &flushTime);

        /* Finish the frame, submitting outstanding commands to the GL */
        SwapBuffers();

        /* Render another frame */

        /*
         * Later, compare the query result of <endFrameQuery>
         * and <flushTime> to measure the latency of the frame
         */


Issues from EXT_timer_query

    (1) What time interval is being measured?

    RESOLVED: The timer starts when all commands prior to BeginQuery() have
    been fully executed. At that point, everything that should be drawn by
    those commands has been written to the framebuffer. The timer stops
    when all commands prior to EndQuery() have been fully executed.

    (2) What unit of time will time intervals be returned in?

    RESOLVED: Nanoseconds (10^-9 seconds). This unit of measurement allows
    for reasonably accurate timing of even small blocks of rendering
    commands. The granularity of the timer is implementation-dependent. A
    32-bit query counter can express intervals of up to approximately 4
    seconds.

    (3) What should be the minimum number of counter bits for timer queries?

    RESOLVED: 30 bits, which will allow timing sections that take up to 1
    second to render.

    (4) How are counter results of more than 32 bits returned?

    RESOLVED: Via two new datatypes, int64EXT and uint64EXT, and their
    corresponding GetQueryObject entry points. These types hold integer
    values and have a minimum bit width of 64.

    UPDATE: This resolution was relevant for EXT_timer_query and OpenGL 2.0.
    OpenGL 3.2 now has int64 and uint64 datatypes as part of the core spec.
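
    Issue (4) is the reason the examples read results through the 64-bit
    entry points. Section 6.1.6 says that a QUERY_RESULT too large for the
    requested type is returned as the nearest representable value. The
    sketch below illustrates what that rule implies for a 32-bit read of a
    nanosecond result. The function names are hypothetical, and this models
    the rule as written rather than any particular implementation.

```c
#include <stdint.h>

/* Nearest uint32 to a 64-bit nanosecond result, as a GetQueryObjectuiv
 * style read would have to report it. Hypothetical helper. */
uint32_t clamp_result_to_u32(uint64_t ns)
{
    return (ns > UINT32_MAX) ? UINT32_MAX : (uint32_t)ns;
}

/* Nearest int32 to a 64-bit nanosecond result, as a GetQueryObjectiv
 * style read would have to report it. Hypothetical helper. */
int32_t clamp_result_to_i32(uint64_t ns)
{
    return (ns > (uint64_t)INT32_MAX) ? INT32_MAX : (int32_t)ns;
}
```

    Under this rule a 5 second interval (5,000,000,000 ns) read as a uint
    reports 4,294,967,295, so the true duration is lost unless
    GetQueryObjecti64v or GetQueryObjectui64v is used.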

    (5) Should the extension measure total time elapsed between the full
        completion of the BeginQuery and EndQuery commands, or just time
        spent in the graphics library?

    RESOLVED: This extension will measure the total time elapsed between
    the full completion of these commands. Future extensions may implement
    a query to determine time elapsed at different stages of the graphics
    pipeline.

    (6) This extension introduces a second query type supported by
        BeginQuery/EndQuery. Can multiple query types be active
        simultaneously?

    RESOLVED: Yes; an application may perform an occlusion query and a
    timer query simultaneously. An application cannot perform multiple
    occlusion queries or multiple timer queries simultaneously. An
    application also cannot use the same query object for an occlusion
    query and a timer query simultaneously.

    (7) Do query objects have a query type permanently associated with them?

    RESOLVED: No. A single query object can be used to perform different
    types of queries, but not at the same time.

    Having a fixed type for each query object simplifies some aspects of the
    implementation -- not having to deal with queries with different result
    sizes, for example. It would also mean that BeginQuery() with a query
    object of the "wrong" type would result in an INVALID_OPERATION error.

    UPDATE: This resolution was relevant for EXT_timer_query and OpenGL 2.0.
    Since EXT_transform_feedback has since been incorporated into the core,
    the resolution is that BeginQuery will generate error INVALID_OPERATION
    if <id> represents a query object of a different type.

    (8) How predictable/repeatable are the results returned by the timer
        query?

    RESOLVED: In general, the amount of time needed to render the same
    primitives should be fairly constant.
    But there may be many other
    system issues (e.g., context switching on the CPU and GPU, virtual
    memory page faults, memory cache behavior on the CPU and GPU) that can
    cause times to vary wildly.

    Note that modern GPUs are generally highly pipelined, and may be
    processing different primitives in different pipeline stages
    simultaneously. In this extension, the timers start and stop when the
    BeginQuery/EndQuery commands reach the bottom of the rendering pipeline.
    What that means is that by the time the timer starts, the GL driver on
    the CPU may have started work on GL commands issued after BeginQuery,
    and the higher pipeline stages (e.g., vertex transformation) may have
    started as well.

    (9) What should the new 64 bit integer type be called?

    RESOLVED: The new types will be called GLint64EXT/GLuint64EXT. The new
    command suffixes will be i64 and ui64. These names clearly convey the
    minimum size of the types. These types are similar to the C99 standard
    type int_least64_t, but we use names similar to the C99 optional type
    int64_t for simplicity.

    UPDATE: This resolution was relevant for EXT_timer_query and OpenGL 2.0.
    OpenGL 3.2 now has int64 and uint64 datatypes as part of the core spec.
    The i64 suffix already exists in OpenGL 3.2 and the ui64 suffix has been
    added as part of this extension.

Issues

    (10) What about tile-based implementations? The effects of a command are
         not complete until the frame is completely rendered. Timing recorded
         before the frame is complete may not be what developers expect. Also
         the amount of time needed to render the same primitives is not
         consistent, which conflicts with issue (8) above. The time depends on
         how early or late in the scene it is placed.

    RESOLVED: The current language supports tile-based rendering as it
    is written.
    Developers are warned that using timers on tile-based
    implementations may not produce the results they expect, since rendering
    is not done in a linear order. Timing results are calculated when the
    frame is completed and may depend on how early or late in the scene the
    query is placed.

    (11) Can the GL implementation use different clocks to implement the
         TIME_ELAPSED and TIMESTAMP queries?

    RESOLVED: Yes, the implementation can use different internal clocks to
    implement TIME_ELAPSED and TIMESTAMP. If different clocks are
    used it is possible there is a slight discrepancy when comparing queries
    made from TIME_ELAPSED and TIMESTAMP; they may have slight
    differences when both are used to measure the same sequence. However, this
    is unlikely to affect real applications since comparing the two queries is
    not expected to be useful.

    (12) Why do BeginQuery and QueryCounter have the same arguments in the
         opposite order?

    RESOLVED: Due to an unfortunate oversight, which cannot be fixed at
    this point.


Revision History

    Rev.  Date          Author     Changes
    ----  ------------  ---------  -------------------------------------------
    13    Aug 9, 2014   Jon Leech  Fix typo in example 3 (bug 12552).

    12    Jul 11, 2013  Jon Leech  Change type of queries[] in sample code to
                                   GLuint (public bug 432).

    11    Apr 13, 2012  Jon Leech  Clean up error language, add error for
                                   query objects which are not of type
                                   TIMESTAMP, and add issue 12 (Khronos
                                   internal bug 7662)

    10    June 3, 2011  dkoch      Add INVALID_OPERATION error when calling
                                   QueryCounter with a non-generated <id> in
                                   the core profile (Khronos internal bug 7662).

    9     Dec 18, 2009  pdaniell   Remove ambiguous language about "interuptions
                                   to the GL". Rename CURRENT_TIME to TIMESTAMP.

    8     Dec 10, 2009  Jon Leech  Improve description of QueryCounter command.

    7     Dec 10, 2009  Jon Leech  Replace non-ASCII punctuation.

    6     Dec 07, 2009  pdaniell   Remove ARB suffix from new tokens for core.

    5     Oct 29, 2009  pdaniell   TIMESTAMP_ARB renamed to CURRENT_TIME_ARB.
                                   Issue (11) raised about using different
                                   clocks to implement CURRENT_TIME and
                                   TIME_ELAPSED queries. Add example (3) for
                                   calculating the GL latency.

    4     Oct 23, 2009  pdaniell   Add support for TIMESTAMP_ARB as a <value>
                                   to Get* to allow synchronous time query.

    3     Oct 15, 2009  pdaniell   Resolved Issue (10). Added Interactions
                                   with NV_present_video and NV_video_capture
                                   section.

    2     Oct 15, 2009  pdaniell   Clarified some of the old EXT_timer_query
                                   Issues wrt OpenGL 3.2. Added specification
                                   for the TIMESTAMP_ARB time. Added new Issue
                                   for tile-based implementations. Issue 3
                                   resolution added to the spec.

    1     Oct 13, 2009  pdaniell   Initial revision based on EXT_timer_query