Name

    NV_command_list

Name Strings

    GL_NV_command_list

Contact

    Pierre Boudier, NVIDIA (pboudier 'at' nvidia.com)
    Christoph Kubisch, NVIDIA (ckubisch 'at' nvidia.com)
    Tristan Lorach, NVIDIA (tlorach 'at' nvidia.com)

Contributors

    Jeff Bolz, NVIDIA
    Corentin Wallez, NVIDIA
    Markus Tavenrath, NVIDIA
    Mark Kilgard, NVIDIA
    Joseph Emmons, NVIDIA
    Thomas Ludwig, MAXON

Status

    Shipping with NVIDIA driver release 347.88 (March 2015)

Version

    Last Modified Date: November 3, 2015
    Revision: 6

Number

    OpenGL Extension #477

Dependencies

    This extension interacts with NV_vertex_buffer_unified_memory.

    This extension interacts with NV_uniform_buffer_unified_memory.

    This extension interacts with NV_parameter_buffer_object.

    This extension interacts with ARB_robust_buffer_access_behavior.

    This extension interacts with NV_bindless_texture and ARB_bindless_texture.

    This extension interacts with NV_shader_buffer_load.

    This extension interacts with ARB_shader_draw_parameters.

    The extension is written against the OpenGL 4.4 Specification,
    Compatibility Profile.

Overview

    This extension adds a few new features designed to provide very low
    overhead batching and replay of rendering commands and state changes:

    - A state object, which stores a pre-validated representation of the
      state of (almost) the entire pipeline.

    - A more flexible and extensible MultiDrawIndirect (MDI) type of
      mechanism, using a token-based command stream, which allows
      applications to set up binding state and emit draw calls.

    - A set of functions to execute a list of the token-based command
      streams with state object changes interleaved with the streams.

    - Command lists enabling compilation and reuse of sequences of command
      streams and state object changes.

    Because state objects reflect the state of the entire pipeline, it is
    expected that they can be pre-validated and executed efficiently. It is
    also expected that when state objects are combined into a command list,
    the command list can diff consecutive state objects to produce a
    reduced/optimized set of state changes specific to that transition.

    The token-based command stream can also be stored in regular buffer objects
    and therefore be modified by the server itself. This allows more
    complex work creation than the original MDI approach, which was limited
    to emitting draw calls only.

New Procedures and Functions

    void CreateStatesNV(sizei n, uint *states);
    void DeleteStatesNV(sizei n, const uint *states);
    boolean IsStateNV(uint state);

    void StateCaptureNV(uint state, enum mode);

    uint GetCommandHeaderNV(enum tokenID, uint size);
    ushort GetStageIndexNV(enum shadertype);

    void DrawCommandsNV(enum primitiveMode, uint buffer, const intptr* indirects,
                        const sizei* sizes, uint count);
    void DrawCommandsAddressNV(enum primitiveMode, const uint64* indirects,
                               const sizei* sizes, uint count);

    void DrawCommandsStatesNV(uint buffer, const intptr* indirects, const sizei* sizes,
                              const uint* states, const uint* fbos, uint count);
    void DrawCommandsStatesAddressNV(const uint64* indirects, const sizei* sizes,
                                     const uint* states, const uint* fbos, uint count);

    void CreateCommandListsNV(sizei n, uint *lists);
    void DeleteCommandListsNV(sizei n, const uint *lists);
    boolean IsCommandListNV(uint list);

    void ListDrawCommandsStatesClientNV(uint list, uint segment, const void** indirects,
                                        const sizei* sizes, const uint* states,
                                        const uint* fbos, uint count);

    void CommandListSegmentsNV(uint list, uint segments);
    void CompileCommandListNV(uint list);
    void CallCommandListNV(uint list);

New Tokens

    Used in DrawCommandsStates buffer formats, in
    GetCommandHeaderNV to return the header:


        TERMINATE_SEQUENCE_COMMAND_NV                   0x0000
        NOP_COMMAND_NV                                  0x0001
        DRAW_ELEMENTS_COMMAND_NV                        0x0002
        DRAW_ARRAYS_COMMAND_NV                          0x0003
        DRAW_ELEMENTS_STRIP_COMMAND_NV                  0x0004
        DRAW_ARRAYS_STRIP_COMMAND_NV                    0x0005
        DRAW_ELEMENTS_INSTANCED_COMMAND_NV              0x0006
        DRAW_ARRAYS_INSTANCED_COMMAND_NV                0x0007
        ELEMENT_ADDRESS_COMMAND_NV                      0x0008
        ATTRIBUTE_ADDRESS_COMMAND_NV                    0x0009
        UNIFORM_ADDRESS_COMMAND_NV                      0x000a
        BLEND_COLOR_COMMAND_NV                          0x000b
        STENCIL_REF_COMMAND_NV                          0x000c
        LINE_WIDTH_COMMAND_NV                           0x000d
        POLYGON_OFFSET_COMMAND_NV                       0x000e
        ALPHA_REF_COMMAND_NV                            0x000f
        VIEWPORT_COMMAND_NV                             0x0010
        SCISSOR_COMMAND_NV                              0x0011
        FRONT_FACE_COMMAND_NV                           0x0012


Additions to Chapter 5 of the OpenGL 4.4 (Compatibility) Specification
(Shared Objects and Multiple Contexts)

    Add state objects and command lists to the set of objects that cannot be
    shared between contexts.

Additions to Chapter 7 of the OpenGL 4.4 (Compatibility) Specification
(Programs and Shaders)

    Modify Section 7.12.2, Shader Memory Access Synchronization

    (modify list of barrier bits)

    * COMMAND_BARRIER_BIT: Command data sourced from buffer objects by
      Draw*Indirect, DispatchComputeIndirect and DrawCommands*NV commands
      after the barrier will reflect data written by shaders prior to the
      barrier. The buffer objects affected by this bit are derived from the
      DRAW_INDIRECT_BUFFER and DISPATCH_INDIRECT_BUFFER bindings, or
      from the arguments passed to DrawCommands*NV.

Additions to Chapter 10 of the OpenGL 4.4 (Compatibility) Specification
(Drawing Commands)

Add a new Section 10.X (Indirect Draw Commands With State Changes)

Add a new subsection 10.X.1 (State Objects)

    The current state of the rendering pipeline can be captured into a state
    object for later reuse with a new set of drawing commands.
    The name space for state objects is the unsigned integers, with zero
    reserved. The command:

        void CreateStatesNV(sizei n, uint *states);

    returns <n> previously unused state object names in <states>, and creates
    a state object in the initial state for each name.

    State objects are deleted by calling

        void DeleteStatesNV(sizei n, const uint *states);

    <states> contains <n> names of state objects to be deleted. Once a state
    object is deleted it has no contents and its name is again unused. Unused
    names in <states> are silently ignored, as is the value zero.

    All the states that can be set via DrawCommandsStatesNV (as defined in
    Section 10.X.2) are excluded from the captured state and will be inherited
    from the most recent commands or GL context state. Binding state is, however,
    never inherited from GL context, only from commands.


    The command

        void StateCaptureNV(uint state, enum basicmode);

    captures the current state of the rendering pipeline into the object
    indicated by <state>. <basicmode> indicates the basic Begin mode that this
    state object must be used with; see Table 10.X.1.2 for compatibility
    between primitive modes and basic modes.

    Table 10.X.1.2 (Primitive mode compatibility)

        basic primitive mode          | compatible primitive mode
        ---------------------------------------------------------------------
        POINTS                        | POINTS
        LINES                         | LINES
                                      | LINE_STRIP
                                      | LINE_LOOP
        TRIANGLES                     | TRIANGLES
                                      | TRIANGLE_STRIP
                                      | TRIANGLE_FAN
        QUADS                         | QUADS
                                      | QUAD_STRIP
        PATCHES                       | PATCHES
        LINES_ADJACENCY               | LINES_ADJACENCY
                                      | LINE_STRIP_ADJACENCY
        TRIANGLES_ADJACENCY           | TRIANGLES_ADJACENCY
                                      | TRIANGLE_STRIP_ADJACENCY

    This rendering state includes:

    - Vertex attribute enable state, formats, types, relative offsets and strides.

    - Primitive state such as primitive restart, patch parameters, and
      provoking vertex.

    - Immediate vertex attribute values as provided by glVertexAttrib* or
      glVertexAttribI*.

    - All active program binaries except compute (either from the active
      program pipeline or from UseProgram) with their current subroutine
      configuration.

    - Rasterization, multisample fragment operation, depth, stencil, and
      blending state.

    - Rasterization state such as stippling and polygon modes and offsets.

    - Viewport, scissor, and depth range state.

    - Framebuffer attachment configuration: attachment state including attachment
      formats, drawbuffer state, and target/layer information, but not including
      actual attachments or sizes of attachments (these are stored separately).

    - Framebuffer attachment textures (but not their residency state).

    It does NOT include:

    - Bound vertex buffers or vertex unified addresses, or their offsets,
      or bound index buffers/addresses.

    - Other program-related bindings, such as shader storage buffers, atomic
      counter buffers, texture and sampler bindings.

    - Default-block uniform values from active programs.

    - Blending constant color, front and back stencil reference values, alpha
      test threshold.

    - Polygon offset values.

    - Viewport and scissor rectangle for viewport index zero.

    Essentially all state that can be manipulated by the commands listed in
    10.X.2 (Drawing with Commands) is excluded from the state capture.

    INVALID_ENUM is generated if <basicmode> is not a basic primitive mode, as
    listed in Table 10.X.1.2.
    INVALID_OPERATION is generated if the default framebuffer is bound as either
    draw or read buffer.
    INVALID_OPERATION is generated if transform feedback is enabled.
    INVALID_OPERATION is generated if occlusion query is enabled.
    INVALID_OPERATION is generated if the current active program or program pipeline
    makes use of SHADER_STORAGE_BUFFER, ATOMIC_COUNTER_BUFFER or has uniforms defined
    in the default uniform-block, or uniforms inheriting from fixed function state
    (gl_ModelView etc.).
    INVALID_OPERATION is generated if the current active program or program pipeline
    uses uniform blocks that did not have the "commandBindableNV" flag set (see
    "Modifications to the OpenGL Shading Language Specification" section).
    INVALID_OPERATION is generated if neither program, nor program pipeline
    objects are actively used.

Add a new subsection 10.X.2 (Drawing with Commands)

        void DrawCommandsNV(enum mode, uint buffer, const intptr* indirects,
                            const sizei* sizes, uint count);
        void DrawCommandsAddressNV(enum mode, const uint64* indirects,
                                   const sizei* sizes, uint count);

    These commands accept arrays of buffer addresses (either an array of
    offsets <indirects> into a buffer named by <buffer>, or an array of GPU
    addresses <indirects>), and an array of sequence lengths in <sizes>.
    All arrays have <count> entries.
    The current binding state of vertex, element and uniform buffers is not
    used; these bindings must be set via commands within the buffer. Other
    state is inherited from the current OpenGL context.

    INVALID_ENUM is generated if <mode> is not an accepted value.
    INVALID_VALUE is generated if <buffer> is not a valid buffer object.
    INVALID_OPERATION is generated if a geometry shader is active and <mode> is
    incompatible with the input primitive type of the geometry shader in the currently
    installed program object.
    INVALID_OPERATION is generated if the default (zero) framebuffer object is
    currently bound as DRAW_FRAMEBUFFER; a non-zero framebuffer object is required.
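
    As an illustration of the parameter arrays described above, the following
    C sketch lays out several command sequences back to back in one buffer
    and derives the <indirects> offsets that go with a given <sizes> array.
    It is not part of the GL: the helper name layout_sequences is
    hypothetical, and ptrdiff_t and int merely stand in for the GL types
    intptr and sizei.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the extension.
 * Given the byte size of each command sequence, compute the byte offset
 * of each sequence when all of them are stored back to back, starting at
 * byte offset `base` inside one buffer object.
 * Returns the total number of bytes occupied by all sequences. */
static size_t layout_sequences(const int *sizes, unsigned count,
                               ptrdiff_t base, ptrdiff_t *indirects)
{
    size_t total = 0;
    for (unsigned i = 0; i < count; i++) {
        assert(sizes[i] >= 0);                 /* sizes are byte counts */
        indirects[i] = base + (ptrdiff_t)total; /* offset of sequence i */
        total += (size_t)sizes[i];
    }
    return total;
}
```

    For sequence sizes {16, 32, 8} and a base offset of 64, this yields the
    offsets {64, 80, 112}. Those two arrays, together with a <count> of 3,
    have the shape that DrawCommandsNV expects for <sizes> and <indirects>.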

    DrawCommandsNV and DrawCommandsAddressNV are equivalent to:

        Save current GL state;
        enum indexType = UNSIGNED_SHORT;
        for (uint i = 0; i < count; i++) {
          uint64 address = address computed from <buffer>+<indirects>[i];

          indexType = DrawCommandSequenceNV(<mode>, indexType, address, sizes[i]);
        }
        Restore current GL state;

    The command:

        enum DrawCommandSequenceNV(enum mode, enum indexType, void *address, sizei size);

    does not exist in the GL, but is used to describe functionality in the rest
    of this section.

    DrawCommandSequenceNV is a flexible and extensible command that executes
    simple state changes and draw commands based on a tokenized format. The
    loop above illustrates that the state changes from one invocation will
    influence the next. All rendering is performed as if the client states for
    VERTEX_ATTRIB_ARRAY_UNIFIED_NV, ELEMENT_ARRAY_UNIFIED_NV and
    UNIFORM_BUFFER_UNIFIED_NV are enabled.

    It is defined by the following pseudo code, tokens, and structures:


    Table 10.X.2 (Token values and command structure names)

        tokenID                             | Command
        ---------------------------------------------------------------------
        TERMINATE_SEQUENCE_COMMAND_NV       | TerminateSequenceCommandNV
        NOP_COMMAND_NV                      | NOPCommandNV
        DRAW_ELEMENTS_COMMAND_NV            | DrawElementsCommandNV
        DRAW_ARRAYS_COMMAND_NV              | DrawArraysCommandNV
        DRAW_ELEMENTS_STRIP_COMMAND_NV      | DrawElementsCommandNV
        DRAW_ARRAYS_STRIP_COMMAND_NV        | DrawArraysCommandNV
        DRAW_ELEMENTS_INSTANCED_COMMAND_NV  | DrawElementsInstancedCommandNV
        DRAW_ARRAYS_INSTANCED_COMMAND_NV    | DrawArraysInstancedCommandNV
        ELEMENT_ADDRESS_COMMAND_NV          | ElementAddressCommandNV
        ATTRIBUTE_ADDRESS_COMMAND_NV        | AttributeAddressCommandNV
        UNIFORM_ADDRESS_COMMAND_NV          | UniformAddressCommandNV
        BLEND_COLOR_COMMAND_NV              | BlendColorCommandNV
        STENCIL_REF_COMMAND_NV              | StencilRefCommandNV
        LINE_WIDTH_COMMAND_NV               | LineWidthCommandNV
        POLYGON_OFFSET_COMMAND_NV           | PolygonOffsetCommandNV
        ALPHA_REF_COMMAND_NV                | AlphaRefCommandNV
        VIEWPORT_COMMAND_NV                 | ViewportCommandNV
        SCISSOR_COMMAND_NV                  | ScissorCommandNV
        FRONT_FACE_COMMAND_NV               | FrontFaceCommandNV


    Tight packing is used for all structures.

      typedef struct {
        uint  header;
      } TerminateSequenceCommandNV;

      typedef struct {
        uint  header;
      } NOPCommandNV;

      typedef struct {
        uint  header;
        uint  count;
        uint  firstIndex;
        uint  baseVertex;
      } DrawElementsCommandNV;

      typedef struct {
        uint  header;
        uint  count;
        uint  first;
      } DrawArraysCommandNV;

      typedef struct {
        uint  header;
        uint  mode;
        uint  count;
        uint  instanceCount;
        uint  firstIndex;
        uint  baseVertex;
        uint  baseInstance;
      } DrawElementsInstancedCommandNV;

      typedef struct {
        uint  header;
        uint  mode;
        uint  count;
        uint  instanceCount;
        uint  first;
        uint  baseInstance;
      } DrawArraysInstancedCommandNV;

      typedef struct {
        uint  header;
        uint  addressLo;
        uint  addressHi;
        uint  typeSizeInByte;
      } ElementAddressCommandNV;

      typedef struct {
        uint  header;
        uint  index;
        uint  addressLo;
        uint  addressHi;
      } AttributeAddressCommandNV;

      typedef struct {
        uint    header;
        ushort  index;
        ushort  stage;
        uint    addressLo;
        uint    addressHi;
      } UniformAddressCommandNV;

      typedef struct {
        uint  header;
        float red;
        float green;
        float blue;
        float alpha;
      } BlendColorCommandNV;

      typedef struct {
        uint  header;
        uint  frontStencilRef;
        uint  backStencilRef;
      } StencilRefCommandNV;

      typedef struct {
        uint  header;
        float lineWidth;
      } LineWidthCommandNV;

      typedef struct {
        uint  header;
        float scale;
        float bias;
      } PolygonOffsetCommandNV;

      typedef struct {
        uint  header;
        float alphaRef;
      } AlphaRefCommandNV;

      typedef struct {
        uint  header;
        uint  x;
        uint  y;
        uint  width;
        uint  height;
      } ViewportCommandNV; // only ViewportIndex 0

      typedef struct {
        uint  header;
        uint  x;
        uint  y;
        uint  width;
        uint  height;
      } ScissorCommandNV; // only ViewportIndex 0

      typedef struct {
        uint  header;
        uint  frontFace; // 0 for CW, 1 for CCW
      } FrontFaceCommandNV;

    enum DrawCommandSequenceNV(enum mode, enum indexType, void *address, sizei size)
    {
      enum modeStrip;
      if      (mode == TRIANGLES)           modeStrip = TRIANGLE_STRIP;
      else if (mode == LINES)               modeStrip = LINE_STRIP;
      else if (mode == LINES_ADJACENCY)     modeStrip = LINE_STRIP_ADJACENCY;
      else if (mode == TRIANGLES_ADJACENCY) modeStrip = TRIANGLE_STRIP_ADJACENCY;
      else if (mode == QUADS)               modeStrip = QUAD_STRIP;
      else                                  modeStrip = mode;

      enum modeSpecial;
      if      (mode == LINES)               modeSpecial = LINE_LOOP;
      else if (mode == TRIANGLES)           modeSpecial = TRIANGLE_FAN;
      else                                  modeSpecial = mode;

      void *current = address;

      while (current != (ubyte *)address + size) {
        uint header = *(uint*)current;

        switch (GetTokenType(header)) {
        case TERMINATE_SEQUENCE_COMMAND_NV:
          {
            return indexType;
          }
          break;
        case NOP_COMMAND_NV:

          break;
        case DRAW_ELEMENTS_COMMAND_NV:
          {
            DrawElementsCommandNV* cmd = (DrawElementsCommandNV*)current;
            DrawElementsBaseVertex(mode, cmd->count, indexType,
                                   (void*)(cmd->firstIndex * sizeofindextype),
                                   cmd->baseVertex);
          }
          break;
        case DRAW_ARRAYS_COMMAND_NV:
          {
            DrawArraysCommandNV* cmd = (DrawArraysCommandNV*)current;
            DrawArrays(mode, cmd->first, cmd->count);
          }
          break;
        case DRAW_ELEMENTS_STRIP_COMMAND_NV:
          {
            DrawElementsCommandNV* cmd = (DrawElementsCommandNV*)current;
            DrawElementsBaseVertex(modeStrip, cmd->count, indexType,
                                   (void*)(cmd->firstIndex * sizeofindextype),
                                   cmd->baseVertex);
          }
          break;
        case DRAW_ARRAYS_STRIP_COMMAND_NV:
          {
            DrawArraysCommandNV* cmd = (DrawArraysCommandNV*)current;
            DrawArrays(modeStrip, cmd->first, cmd->count);
          }
          break;
        case DRAW_ELEMENTS_INSTANCED_COMMAND_NV:
          {
            // undefined behavior if (cmd->mode != mode && cmd->mode != modeStrip && cmd->mode != modeSpecial)

            DrawElementsInstancedCommandNV* cmd = (DrawElementsInstancedCommandNV*)current;
            DrawElementsIndirect(cmd->mode, indexType, &cmd->count);
          }
          break;
        case DRAW_ARRAYS_INSTANCED_COMMAND_NV:
          {
            // undefined behavior if (cmd->mode != mode && cmd->mode != modeStrip && cmd->mode != modeSpecial)

            DrawArraysInstancedCommandNV* cmd = (DrawArraysInstancedCommandNV*)current;
            DrawArraysIndirect(cmd->mode, &cmd->count);
          }
          break;
        case ELEMENT_ADDRESS_COMMAND_NV:
          {
            ElementAddressCommandNV* cmd = (ElementAddressCommandNV*)current;
            switch (cmd->typeSizeInByte) {
            case 1: indexType = UNSIGNED_BYTE;  break;
            case 2: indexType = UNSIGNED_SHORT; break;
            case 4: indexType = UNSIGNED_INT;   break;
            }
            BufferAddressRangeNV(ELEMENT_ARRAY_ADDRESS_NV, 0,
                                 uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32),
                                 0x7FFFFFFF);
          }
          break;
        case ATTRIBUTE_ADDRESS_COMMAND_NV:
          {
            AttributeAddressCommandNV* cmd = (AttributeAddressCommandNV*)current;
            BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV, cmd->index,
                                 uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32),
                                 0x7FFFFFFF);
          }
          break;
        case UNIFORM_ADDRESS_COMMAND_NV:
          {
            UniformAddressCommandNV* cmd = (UniformAddressCommandNV*)current;
            BufferAddressRangeNV(UNIFORM_BUFFER_ADDRESS_NV, cmd->index,
                                 uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32),
                                 0x10000);
          }
          break;
        case BLEND_COLOR_COMMAND_NV:
          {
            BlendColorCommandNV* cmd = (BlendColorCommandNV*)current;
            BlendColor(cmd->red, cmd->green, cmd->blue, cmd->alpha);
          }
          break;
        case STENCIL_REF_COMMAND_NV:
          {
            StencilRefCommandNV* cmd = (StencilRefCommandNV*)current;
            StencilFuncSeparate(FRONT, asIs, cmd->frontStencilRef, asIs);
            StencilFuncSeparate(BACK,  asIs, cmd->backStencilRef,  asIs);
          }
          break;
        case LINE_WIDTH_COMMAND_NV:
          {
            LineWidthCommandNV* cmd = (LineWidthCommandNV*)current;
            LineWidth(cmd->lineWidth);
          }
          break;
        case POLYGON_OFFSET_COMMAND_NV:
          {
            PolygonOffsetCommandNV* cmd = (PolygonOffsetCommandNV*)current;
            PolygonOffset(cmd->scale, cmd->bias);
          }
          break;
        case ALPHA_REF_COMMAND_NV:
          {
            AlphaRefCommandNV* cmd = (AlphaRefCommandNV*)current;
            AlphaFunc(asIs, cmd->alphaRef);
          }
          break;
        case VIEWPORT_COMMAND_NV:
          {
            ViewportCommandNV* cmd = (ViewportCommandNV*)current;
            Viewport(cmd->x, cmd->y, cmd->width, cmd->height);
          }
          break;
        case SCISSOR_COMMAND_NV:
          {
            ScissorCommandNV* cmd = (ScissorCommandNV*)current;
            Scissor(cmd->x, cmd->y, cmd->width, cmd->height);
          }
          break;
        case FRONT_FACE_COMMAND_NV:
          {
            FrontFaceCommandNV* cmd = (FrontFaceCommandNV*)current;
            FrontFace(cmd->frontFace ? CW : CCW);
          }
          break;
        }

        current = (ubyte *)current + GetTokenSize(header);
      }

      return indexType;
    }

    The commands called by DrawCommandSequenceNV may not generate their
    usual errors. Providing erroneous data as parameters, or generating
    state that would normally create errors when executed by the server,
    can produce undefined results and may cause program termination.
    The residency of all resources referenced directly (buffer addresses inside tokens)
    or indirectly (texture handles inside uniform buffer objects) must be managed
    explicitly.


    (XXX should we add something similar to CheckFramebufferStatus?
    for debugging, that tests the content in software and throws error + offset
    into buffer triggering the error)

    All BufferAddressRangeNV calls issued by DrawCommandSequenceNV are
    effective regardless of whether their corresponding client state is
    enabled.


        uint GetCommandHeaderNV(enum tokenID, uint size)

    Returns the encoded 32-bit header value for a given command; the returned
    value is implementation specific.
    The <size> is only provided as a basic consistency check, since the size of
    each structure is fixed and no padding is allowed. The value is the sum of the
    header and the command specific structure.
    INVALID_ENUM is generated if <tokenID> is not one of the values listed in
    Table 10.X.2.
    INVALID_VALUE is generated if <size> does not match the fixed
    size of a command defined by the spec.

        ushort GetStageIndexNV(enum shadertype)

    Returns the 16-bit value for a specific shader stage; the returned value
    is implementation specific. The value is to be used with the stage field
    within UniformAddressCommandNV tokens.

Add a new subsection 10.X.3 (Drawing with Commands and State Objects)

    State objects may be used in rendering with the commands:

        void DrawCommandsStatesNV(uint buffer, const intptr* indirects, const sizei* sizes,
                                  const uint* states, const uint* fbos, uint count);
        void DrawCommandsStatesAddressNV(const uint64* indirects, const sizei* sizes,
                                         const uint* states, const uint* fbos, uint count);

    These commands accept arrays of buffer addresses (either an array of
    offsets <indirects> into a buffer named by <buffer>, or an array of GPU
    addresses <indirects>), an array of sequence lengths in <sizes>, and an
    array of state object names in <states>, of which all names must be non-zero.
    Framebuffer object names are stored in <fbos> and can
    be either zero or non-zero. All arrays have <count> entries.
    The residency of textures used as attachments inside the state object's
    captured fbo or the passed fbo must be managed explicitly.

    INVALID_VALUE is generated if one entry of <states> is zero.
    INVALID_OPERATION is generated if the fbo configuration from <fbos>
    mismatches the configuration inside the corresponding state object
    from <states>.

    DrawCommandsStatesNV and DrawCommandsStatesAddressNV are equivalent to:

        Save current GL state;
        enum indexType = UNSIGNED_SHORT;
        for (uint i = 0; i < count; i++) {
          fbo         = LookupFbo(fbos[i]);
          stateObject = LookupStateObject(states[i]);

          if (i == 0) {
            Set full state captured by stateObject;
          }
          else {
            Set difference of state going from <states>[i-1] to current stateObject;
          }

          if (fbo == 0) {
            BindFramebuffer(FRAMEBUFFER, stateObject.fbo.name);
          }
          else if (stateObject.fbo.configuration == fbo.configuration) {
            // The configuration excludes attachment textures and size information, however
            // includes attached texture formats and other state (see StateCaptureNV).

            BindFramebuffer(FRAMEBUFFER, fbo.name);
          }
          else {
            // Only compatible fbo states can be used.

            generate ERROR INVALID_OPERATION;
            return;
          }

          enum mode = primitive mode from stateObject

          uint64 address = address computed from <buffer>+<indirects>[i];

          indexType = DrawCommandSequenceNV(mode, indexType, address, sizes[i]);
        }
        Restore current GL state;

    where LookupFbo and LookupStateObject return the driver's internal fbo
    and stateObject objects, stateObject.fbo is the driver's fbo state
    object, and fbo.configuration and fbo.name are the current configuration
    of an fbo and the fbo's name, respectively.

Add a new section 10.X.4 (Command Lists)

    A list of DrawCommandsStates* commands may be compiled into a command
    list, for further optimization and efficient reuse.
    The name space for command lists is the unsigned integers, with zero
    reserved. The command:

        void CreateCommandListsNV(sizei n, uint *lists);

    returns <n> previously unused command list names in <lists>, and creates
    a command list in the initial state for each name.

    Command lists are deleted by calling

        void DeleteCommandListsNV(sizei n, const uint *lists);

    <lists> contains <n> names of command lists to be deleted. Once a command
    list is deleted it has no contents and its name is again unused. Unused
    names in <lists> are silently ignored, as is the value zero.

    The command

        void CommandListSegmentsNV(uint list, uint segments);

    indicates that <list> will have <segments> number of segments, each
    of which is a list of command sequences that it enqueues. This must be
    called before any commands are enqueued. In the initial state, a command
    list has a single segment.

    A command list's initial state allows it to enqueue commands, but not to
    be executed. The following command can be enqueued:

        void ListDrawCommandsStatesClientNV(uint list, uint segment, const void** indirects,
                                            const sizei* sizes, const uint* states,
                                            const uint* fbos, uint count);

    A list has multiple segments and each segment enqueues an ordered list of
    command sequences. This command enqueues the equivalent of the DrawCommandsStatesNV
    commands into the list indicated by <list> on the segment indicated by <segment>
    except that the sequence data is copied from the sequences pointed to by the <indirects>
    pointer. The <indirects> pointer should point to a list of size <count> of pointers,
    each of which should point to a command sequence.

    The pre-validated state from <states> is saved into the command list, rather
    than a reference to the state object (i.e. the state objects or fbos could be
    deleted and the command list would be unaffected).
    This includes native GPU addresses for all textures indirectly referenced
    through the fbos passed or state objects' fbos attachments, therefore a
    recompile of the command list is required if such referenced textures
    change their allocation (for example due to resizing), as well as explicit
    management of the residency of the textures prior to CallCommandListNV.

    ListDrawCommandsStatesClientNV performs a by-value copy of the
    indirect data based on the provided client-side pointers. In this case
    the content is fully immutable, while the buffer-based versions can
    change the content of the buffers at any later time.

    The command

        void CompileCommandListNV(uint list);

    makes the list indicated by <list> switch from allowing collection of
    commands to allowing its execution. At this time, the implementation may
    generate optimized commands to transition between states as efficiently
    as possible. Lists may be executed with the command

        void CallCommandListNV(uint list);

    This executes the command list indicated by <list>, which operates as if
    the DrawCommandsStates* commands were replayed in the order they were
    enqueued on each segment, starting from segment zero and proceeding to the
    maximum segment. All buffer or texture resources' residency must be
    managed explicitly, including texture attachments of the effective
    fbos during list enqueuing.


Modifications to the OpenGL Shading Language Specification, Version 4.40

    Including the following line in a shader can be used to control the
    language features described in this extension:

        #extension GL_NV_command_list : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

        #define GL_NV_command_list 1


    Modify Section 4.4.5, "Uniform and Shader Storage Block Layout Qualifiers"

    (modify first paragraph, p.78) Layout qualifiers can be used for uniform
    and shader storage blocks, but not for non-block uniform declarations.
    The layout qualifier identifiers (and shared keyword) for uniform and
    shader storage blocks are

        layout-qualifier-id
            shared
            packed
            std140
            std430
            row_major
            column_major
            binding = integer-constant-expression
            offset = integer-constant-expression
            align = integer-constant-expression
            commandBindableNV

    (add paragraph prior to "When multiple arguments", p. 80)
    The commandBindableNV qualifier enables the associated uniform block
    to be updated via UniformAddressCommandNVs when executing
    DrawCommandsStatesNV. When commandBindableNV is enabled, the <binding>
    identifier must be provided for each block; its value corresponds to
    the index field of a UniformAddressCommandNV.
    A link-time error is generated if an index is greater than or equal to
    MAX_PROGRAM_PARAMETER_BUFFER_BINDINGS_NV.
    Changing the binding point through the OpenGL API may not influence this
    associated index value and may cause UniformAddressCommandNVs to have
    undefined behavior.

Dependencies on OpenGL 4.4 (Core Profile)

    If only the core profile of OpenGL 4.4 is supported, references to
    functionality deprecated by OpenGL 3.0 (built-in input/output/uniform variables
    corresponding to fixed-function vertex attributes, fixed-function
    vertex and fragment processing) should be removed and/or replaced with
    functionality supported in the core profile. In such an environment, the
    QUADS primitive type is not supported by the StateCaptureNV function.
    StateCaptureNV will also ignore all references to deprecated state such as
    line stippling. The ALPHA_REF_COMMAND_NV token may not be used;
    GetCommandHeaderNV will generate an error if this token enum is passed.

Interactions with NV_shader_buffer_load

    The GPU addresses used in ELEMENT_ADDRESS_COMMAND_NV,
    ATTRIBUTE_ADDRESS_COMMAND_NV and UNIFORM_ADDRESS_COMMAND_NV
    can be queried via the API provided by NV_shader_buffer_load.
    Furthermore, the same API must be used to ensure residency of such
    buffers when draw commands using such addresses are issued.

Interactions with NV_bindless_texture or ARB_bindless_texture

    Residency of fbo attachment textures referenced in state objects
    or command lists must be managed explicitly using the API provided
    by either of these extensions.

Interactions with NV_parameter_buffer_object

    The UNIFORM_ADDRESS_COMMAND_NV described in (Drawing with Commands) will
    affect the PROGRAM_PARAMETER_BUFFER of the target stage defined within the
    command token.

Interactions with ARB_robust_buffer_access_behavior

    The buffer setups performed by ELEMENT_ADDRESS_COMMAND_NV,
    ATTRIBUTE_ADDRESS_COMMAND_NV and UNIFORM_ADDRESS_COMMAND_NV
    do not provide the required buffer ranges for robust buffer
    access. Therefore draw calls executed under this type of
    buffer setup will not respect the robust buffer access rules.

Interactions with ARB_shader_draw_parameters

    The drawing operations performed through this extension will not support
    setting of the built-in GLSL values that were added by
    ARB_shader_draw_parameters (gl_BaseInstanceARB, gl_BaseVertexARB, gl_DrawIDARB).
    Accessing these variables will result in undefined values.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Errors


New State

    None.

Issues

    1) What motivates the design?

        The primary goal is to be able to reuse pre-validated command buffers. Other
        APIs and proposals have addressed this with various incarnations of command
        lists or state objects, but a recurring problem is that interactions between
        various stages of the pipeline prevent this prevalidation and reuse. These
        interactions are often hardware-specific (and differ from vendor to vendor
        or even generation to generation) and new interactions are introduced by
        new features that were not imagined when the prevalidation scheme was
        proposed.

        We attempt to address this by having a monolithic state object that
        encompasses (almost) the entire state of the pipeline. This should provide
        enough information for all implementations to do any needed
        cross-validation. We try to create these in a way that minimizes the new API
        footprint - since we want ALL state (including any added in the future), we
        just capture it from the current state of the context.

        We expect that a captured state object will be represented as a list of
        commands to send to the GPU. While that list of commands may be fairly
        large, it is also well-suited to filtering redundant changes when switching
        from one state object to another (filtering may occur on the GPU, or by
        some processing on the CPU). We anticipate that filtering will be applied
        when compiling a command list, but it is likely that some (perhaps less
        aggressive) filtering will also occur in unlisted DrawCommandsStates
        commands.

    2) Should binding state be captured?

        Binding state should not be captured, for multiple reasons.

        The memory management performed by the driver as part of legacy command
        execution is expensive and not well-suited for the prevalidation of
        commands. This can be replaced by explicit bindless memory management
        APIs (e.g. Make*Resident).
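
        As a rough illustration of what replaces a binding, the sketch below
        shows only the final step of the bindless path: writing a 64-bit GPU
        address into a uniform address token as two 32-bit halves. It assumes
        the buffer was already made resident and its address queried through
        NV_shader_buffer_load (e.g. MakeBufferResidentNV and
        GetBufferParameterui64vNV with BUFFER_GPU_ADDRESS_NV). The names
        UniformAddressToken, make_uniform_address_token and token_gpu_address
        are hypothetical, and the field layout is assumed to mirror
        UniformAddressCommandNV as defined in this specification. The header
        and stage values would normally come from GetCommandHeaderNV and
        GetStageIndexNV; the caller passes placeholders here so the code does
        not need a GL context.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of UniformAddressCommandNV (layout is an assumption;
 * check it against the structure definitions in this specification). */
typedef struct {
    uint32_t header;    /* normally the result of GetCommandHeaderNV */
    uint16_t index;     /* binding value of the commandBindableNV block */
    uint16_t stage;     /* normally the result of GetStageIndexNV */
    uint32_t addressLo; /* low 32 bits of the buffer's GPU address */
    uint32_t addressHi; /* high 32 bits of the buffer's GPU address */
} UniformAddressToken;

/* Reassemble the 64-bit GPU address stored in a token. */
uint64_t token_gpu_address(const UniformAddressToken *t)
{
    return ((uint64_t)t->addressHi << 32) | (uint64_t)t->addressLo;
}

/* Fill a token from a header, binding index, stage index and GPU address.
 * No GL call is made; all inputs are supplied by the caller. */
UniformAddressToken make_uniform_address_token(uint32_t header,
                                               uint16_t index,
                                               uint16_t stage,
                                               uint64_t gpu_address)
{
    UniformAddressToken t;

    /* The token stream is read as raw memory, so padding is not acceptable. */
    assert(sizeof(UniformAddressToken) == 16);

    t.header    = header;
    t.index     = index;
    t.stage     = stage;
    t.addressLo = (uint32_t)(gpu_address & 0xFFFFFFFFu);
    t.addressHi = (uint32_t)(gpu_address >> 32);

    /* Round-trip check: the split halves must rebuild the same address. */
    assert(token_gpu_address(&t) == gpu_address);
    return t;
}
```

        In an application, the resulting token would be copied into the token
        buffer ahead of the draw tokens that depend on it. The buffer must
        remain resident for as long as commands referencing its address may
        execute.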

        Resource bindings also require behind-the-scenes management of internal
        GPU structures like texture handles. Again, this can be replaced by the
        bindless APIs.

    3) What FBO state should be captured?

        We definitely want to capture enough information to be able to do any
        state-based recompiles of the fragment shader, which would include
        drawbuffer state and format state. However, it is not desirable to have
        all properties of the FBO be captured, e.g. if attachment width/height
        were captured then state objects could become invalid if the window shape
        changed.

        RESOLVED: state objects reference the FBO configuration, but passing
        other compatible FBOs during rendering is possible. Furthermore, the
        VIEWPORT_COMMAND_NV allows setting the appropriate viewport state.

    4) Can UBOs be accessed? How?

        RESOLVED: We want to encourage the "first level of the scene graph" information read
        by shaders to be accessed with fast UBO memory accesses.
        UNIFORM_ADDRESS_COMMAND_NV provides this mechanism.

    5) What about Compute?

        Compute does not have the same complex state interactions that the graphics
        pipeline has, so it is not included in this extension.

    6) What dynamic state should be allowed?

        There are some state values which are pretty much raw integer/floating
        point data, where requiring a unique state object for each value would
        drastically bloat the number of state objects needed and break batching.
        We allow for a few such values to be set in the token command buffer
        rather than in the state object. The current list is motivated by similar
        state in other APIs, and may not be complete.

    7) What are the "segments" in command lists?

        These are multiple "starting points" for appending commands to the list,
        which are ultimately replayed in order by segments.
        This may be useful to build a multipass rendering algorithm with only a
        single traversal of the scene graph.

    8) When are state objects consumed into the list?

        This could either occur as the command is appended to the list, or during
        CompileCommandListNV.

        RESOLVED: At ListDrawCommandsStatesClientNV time.

    9) Do we want to have multiple modes in the same dispatch?

        RESOLVED: yes, state objects with different modes can be used, allowing
        fast transitioning between those. Furthermore, it is possible to mix
        LINES/LINE_STRIP/LINE_LOOP or TRIANGLES/TRIANGLE_STRIP/TRIANGLE_FAN and others
        using the same state object, as long as their base primitive mode is the same.

    10) Do we want to allow mixing DrawArrays and DrawElements in the same
        dispatch?

        RESOLVED: yes.

    11) What happens if the token buffer is modified while it is being dispatched?

        RESOLVED: there is no guarantee of coherency, so the behavior is undefined.

    12) I would like to change states in the middle; how do I do this?

        RESOLVED: you can select a new state object or state tokens, but you cannot change
        state in the indirect buffer itself.

    13) Is the token buffer multithread safe; does it scale?

        RESOLVED: yes. It is trivial to allocate a token buffer per thread and then submit
        them sequentially in the main thread. Since the implementation is not involved
        when the application writes to them, the only thread safety requirements are in
        the application itself.
        Command lists and state objects are, however, currently not shareable across
        contexts, though as rendering is much more efficient now, the main dispatching
        thread can spend the time on preparing state objects prior to drawing.
        The cost of glStateCaptureNV is no worse than a classic API draw call, and,
        exploiting temporal coherence, not too many states would be "new" from frame
        to frame; cached states can be reused instead.

    14) Can I reuse a token buffer multiple times?

        RESOLVED: yes.

    15) Should we use a fixed length decoding or at the very least a size in the header?

        RESOLVED: fixed length is used. As a basic consistency check, the size is also
        passed to header generation.
        The NOP command can be used to pad structures to custom sizes.

    16) Can I do buffer updates in a single DrawCommands call?

        RESOLVED: no.
        Updating memory in general requires synchronization, and having lots of
        updates inside a single DrawCommands would become a performance bottleneck.

    17) I want to implement some occlusion scheme and skip some of the draws; how do I do this?

        RESOLVED: this extension does not offer a conditional render facility, but this can be
        implemented by using NOP or preferably TERMINATE_SEQUENCE commands in the stream.

    18) I want to implement some level of detail scheme; is that possible?

        RESOLVED: you can use NOP or TERMINATE_SEQUENCE to skip the levels of detail
        that you don't want to draw.

    19) Why can't I just get a token to change the state, and avoid specifying lists of
        state and indirect buffers?

        RESOLVED: Getting a token to specify a state switch implies that the application would
        have access to a virtual address of state changes. This could open a security
        issue, since part of the validation may involve complex programming sequences.

    20) Instead of void**, which means all commands must be stored in one buffer, could GLuint64** be used
        when EnableClientState(DRAW_INDIRECT_UNIFIED_NV) is set? This would allow managing different command
        buffers independently.

        RESOLVED: a separate Address command was added.

    21) How big can each indirect command list's buffer size be?

        RESOLVED: no limit required.

    22) How is the "index" within UniformAddressCommandNV retrieved, or is it the GL binding point?

        RESOLVED: added the commandBindableNV layout qualifier in GLSL for uniform blocks
        to ensure a fixed binding unit. Also added a stage value to the command.

    23) In what condition is state that was modified by tokens left after the dispatch call?

        RESOLVED: state is reset.

    24) What does working with this extension look like?

        You will find related samples at https://github.com/nvpro-samples

    25) How can I use textures, images, shader storage or atomic counter buffers in combination with state objects?

        Textures and images are covered via NV/ARB_bindless_texture; you can store their handles inside uniform buffers.
        Shader storage and atomic counter buffers are currently not directly exposed; however, NV_gpu_shader5 allows
        storing pointers to such buffers inside uniform buffers as well. Atomic counters can be replaced by regular
        atomic increments.

        Alternatively, use DrawCommandsNV or DrawCommandsAddressNV, which do support any GLSL programs with these
        resource bindings, as well as default-block uniforms.


Revision History

    Rev.  Date        Author    Changes
    ----  ----------  --------  -----------------------------------------
    6     11/3/2015   ckubisch  Rephrase what state objects capture and what not
    5     8/17/2015   ckubisch  correct errors for DrawCommandsNV and DrawCommandsAddressNV
                                rendering to default framebuffer is not allowed. Clarify
                                which state is inherited (updated Issue 25).
    4     6/18/2015   ckubisch  Add missing interaction with ARB_shader_draw_parameters
    3     5/27/2015   jemmons   Multiple minor fixes and clarifications
    2     4/16/2015   pboudier  Fix incorrect type (size_t is now sizei) in ListDrawCommandsStatesClientNV
    1                 pboudier  concept
                      jbolz     base spec
                      ckubisch  detailed spec
                      mjk       Internal revisions