Name

    NV_command_list

Name Strings

    GL_NV_command_list

Contact

    Pierre Boudier, NVIDIA (pboudier 'at' nvidia.com)
    Christoph Kubisch, NVIDIA (ckubisch 'at' nvidia.com)
    Tristan Lorach, NVIDIA (tlorach 'at' nvidia.com)

Contributors

    Jeff Bolz, NVIDIA
    Corentin Wallez, NVIDIA
    Markus Tavenrath, NVIDIA
    Mark Kilgard, NVIDIA
    Joseph Emmons, NVIDIA
    Thomas Ludwig, MAXON

Status

    Shipping with NVIDIA driver release 347.88 (March 2015)

Version

    Last Modified Date: November 3, 2015
    Revision: 6

Number

    OpenGL Extension #477

Dependencies

    This extension interacts with NV_vertex_buffer_unified_memory.

    This extension interacts with NV_uniform_buffer_unified_memory.

    This extension interacts with NV_parameter_buffer_object.

    This extension interacts with ARB_robust_buffer_access_behavior

    This extension interacts with NV_bindless_texture and ARB_bindless_texture

    This extension interacts with NV_shader_buffer_load

    This extension interacts with ARB_shader_draw_parameters

    The extension is written against the OpenGL 4.4 Specification,
    Compatibility Profile.

Overview

    This extension adds a few new features designed to provide very low
    overhead batching and replay of rendering commands and state changes:

    - A state object, which stores a pre-validated representation of the
      state of (almost) the entire pipeline.

    - A more flexible and extensible MultiDrawIndirect (MDI) type of mechanism, using
      a token-based command stream, which allows applications to set up binding
      state and emit draw calls.

    - A set of functions to execute a list of the token-based command streams with state object
      changes interleaved with the streams.

    - Command lists enabling compilation and reuse of sequences of command
      streams and state object changes.

    Because state objects reflect the state of the entire pipeline, it is
    expected that they can be pre-validated and executed efficiently. It is
    also expected that when state objects are combined into a command list,
    the command list can diff consecutive state objects to produce a
    reduced/optimized set of state changes specific to that transition.

    The token-based command stream can also be stored in regular buffer objects
    and therefore be modified by the server itself. This allows more
    complex work creation than the original MDI approach, which was limited
    to emitting draw calls only.

New Procedures and Functions

    void CreateStatesNV(sizei n, uint *states);
    void DeleteStatesNV(sizei n, const uint *states);
    boolean IsStateNV(uint state);

    void StateCaptureNV(uint state, enum mode);

    uint   GetCommandHeaderNV(enum tokenID, uint size);
    ushort GetStageIndexNV(enum shadertype);

    void DrawCommandsNV(enum primitiveMode, uint buffer, const intptr* indirects, const sizei* sizes,
                        uint count);
    void DrawCommandsAddressNV(enum primitiveMode, const uint64* indirects, const sizei* sizes,
                               uint count);

    void DrawCommandsStatesNV(uint buffer, const intptr* indirects, const sizei* sizes,
                              const uint* states, const uint* fbos, uint count);
    void DrawCommandsStatesAddressNV(const uint64* indirects, const sizei* sizes,
                                     const uint* states, const uint* fbos, uint count);

    void CreateCommandListsNV(sizei n, uint *lists);
    void DeleteCommandListsNV(sizei n, const uint *lists);
    boolean IsCommandListNV(uint list);

    void ListDrawCommandsStatesClientNV(uint list, uint segment, const void** indirects,
                                        const sizei* sizes, const uint* states, const uint* fbos, uint count);

    void CommandListSegmentsNV(uint list, uint segments);
    void CompileCommandListNV(uint list);
    void CallCommandListNV(uint list);

New Tokens

    Used in the buffer formats consumed by DrawCommands*NV, and accepted as
    <tokenID> by GetCommandHeaderNV to return the header:

      TERMINATE_SEQUENCE_COMMAND_NV                      0x0000
      NOP_COMMAND_NV                                     0x0001
      DRAW_ELEMENTS_COMMAND_NV                           0x0002
      DRAW_ARRAYS_COMMAND_NV                             0x0003
      DRAW_ELEMENTS_STRIP_COMMAND_NV                     0x0004
      DRAW_ARRAYS_STRIP_COMMAND_NV                       0x0005
      DRAW_ELEMENTS_INSTANCED_COMMAND_NV                 0x0006
      DRAW_ARRAYS_INSTANCED_COMMAND_NV                   0x0007
      ELEMENT_ADDRESS_COMMAND_NV                         0x0008
      ATTRIBUTE_ADDRESS_COMMAND_NV                       0x0009
      UNIFORM_ADDRESS_COMMAND_NV                         0x000a
      BLEND_COLOR_COMMAND_NV                             0x000b
      STENCIL_REF_COMMAND_NV                             0x000c
      LINE_WIDTH_COMMAND_NV                              0x000d
      POLYGON_OFFSET_COMMAND_NV                          0x000e
      ALPHA_REF_COMMAND_NV                               0x000f
      VIEWPORT_COMMAND_NV                                0x0010
      SCISSOR_COMMAND_NV                                 0x0011
      FRONT_FACE_COMMAND_NV                              0x0012

Additions to Chapter 5 of the OpenGL 4.4 (Compatibility) Specification
(Shared Objects and Multiple Contexts)

    Add state objects and command lists to the set of objects that cannot be
    shared between contexts.

Additions to Chapter 7 of the OpenGL 4.4 (Compatibility) Specification
(Programs and Shaders)

    Modify Section 7.12.2, Shader Memory Access Synchronization

    (modify list of barrier bits)

    * COMMAND_BARRIER_BIT: Command data sourced from buffer objects by
      Draw*Indirect, DispatchComputeIndirect and DrawCommands*NV commands
      after the barrier will reflect data written by shaders prior to the
      barrier. The buffer objects affected by this bit are derived from the
      DRAW_INDIRECT_BUFFER and DISPATCH_INDIRECT_BUFFER bindings, or
      from the arguments passed to DrawCommands*NV.

Additions to Chapter 10 of the OpenGL 4.4 (Compatibility) Specification
(Drawing Commands)

Add a new Section 10.X (Indirect Draw Commands With State Changes)

Add a new subsection 10.X.1 (State Objects)

    The current state of the rendering pipeline can be captured into a state
    object for later reuse with a new set of drawing commands. The name space
    for state objects is the unsigned integers, with zero reserved. The
    command:

        void CreateStatesNV(sizei n, uint *states);

    returns <n> previously unused state object names in <states>, and creates
    a state object in the initial state for each name.

    State objects are deleted by calling

        void DeleteStatesNV(sizei n, const uint *states);

    <states> contains <n> names of state objects to be deleted. Once a state
    object is deleted it has no contents and its name is again unused. Unused
    names in <states> are silently ignored, as is the value zero.

    All the states that can be set via DrawCommandsStatesNV (as defined in
    Section 10.X.2) are excluded from the captured state and will be inherited
    from the most recent commands or GL context state. Binding state is, however,
    never inherited from the GL context, only from commands.

    The command

        void StateCaptureNV(uint state, enum basicmode);

    captures the current state of the rendering pipeline into the object
    indicated by <state>. <basicmode> indicates the basic Begin mode that this
    state object must be used with; see Table 10.X.1.2 for compatibility
    between primitive modes and basic modes.

        Table 10.X.1.2 (Primitive mode compatibility)

        basic primitive mode        | compatible primitive mode
        ---------------------------------------------------------------------
        POINTS                      | POINTS
        LINES                       | LINES
                                    | LINE_STRIP
                                    | LINE_LOOP
        TRIANGLES                   | TRIANGLES
                                    | TRIANGLE_STRIP
                                    | TRIANGLE_FAN
        QUADS                       | QUADS
                                    | QUAD_STRIP
        PATCHES                     | PATCHES
        LINES_ADJACENCY             | LINES_ADJACENCY
                                    | LINE_STRIP_ADJACENCY
        TRIANGLES_ADJACENCY         | TRIANGLES_ADJACENCY
                                    | TRIANGLE_STRIP_ADJACENCY

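    The compatibility rule in Table 10.X.1.2 can be illustrated with a small
    lookup. This is only a sketch: basic_mode_of and mode_compatible are
    hypothetical helper names that are not part of this extension, and the
    primitive mode constants are redefined locally (with their standard OpenGL
    values) so that the example compiles without GL headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard OpenGL primitive mode values, redefined locally so the sketch
   compiles without GL headers. */
enum {
    POINTS = 0x0000, LINES = 0x0001, LINE_LOOP = 0x0002, LINE_STRIP = 0x0003,
    TRIANGLES = 0x0004, TRIANGLE_STRIP = 0x0005, TRIANGLE_FAN = 0x0006,
    QUADS = 0x0007, QUAD_STRIP = 0x0008,
    LINES_ADJACENCY = 0x000A, LINE_STRIP_ADJACENCY = 0x000B,
    TRIANGLES_ADJACENCY = 0x000C, TRIANGLE_STRIP_ADJACENCY = 0x000D,
    PATCHES = 0x000E
};

/* Hypothetical helper: maps a primitive mode to its basic mode according to
   Table 10.X.1.2. Returns -1 for values that are not in the table. */
static int basic_mode_of(int mode)
{
    switch (mode) {
    case POINTS:                   return POINTS;
    case LINES:
    case LINE_STRIP:
    case LINE_LOOP:                return LINES;
    case TRIANGLES:
    case TRIANGLE_STRIP:
    case TRIANGLE_FAN:             return TRIANGLES;
    case QUADS:
    case QUAD_STRIP:               return QUADS;
    case PATCHES:                  return PATCHES;
    case LINES_ADJACENCY:
    case LINE_STRIP_ADJACENCY:     return LINES_ADJACENCY;
    case TRIANGLES_ADJACENCY:
    case TRIANGLE_STRIP_ADJACENCY: return TRIANGLES_ADJACENCY;
    default:                       return -1;
    }
}

/* Hypothetical helper: true if <mode> may be drawn with a state object that
   was captured with <basicmode>. */
static bool mode_compatible(int basicmode, int mode)
{
    int basic = basic_mode_of(mode);
    assert(basic >= -1);
    return basic != -1 && basic == basicmode;
}
```

    For example, mode_compatible(TRIANGLES, TRIANGLE_FAN) is true, while
    mode_compatible(LINES, TRIANGLE_STRIP) is false.
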
    This rendering state includes:

    - Vertex attribute enable state, formats, types, relative offsets and strides.

    - Primitive state such as primitive restart, patch parameters, and provoking vertex.

    - Immediate vertex attribute values as provided by glVertexAttrib* or
      glVertexAttribI*.

    - All active program binaries except compute (either from the active
      program pipeline or from UseProgram) with their current subroutine
      configuration.

    - Rasterization, multisample fragment operation, depth, stencil, and
      blending state.

    - Rasterization state such as stippling and polygon modes and offsets.

    - Viewport, scissor, and depth range state.

    - Framebuffer attachment configuration: attachment state including attachment
      formats, drawbuffer state, and target/layer information, but not including
      actual attachments or sizes of attachments (these are stored separately).

    - Framebuffer attachment textures (but not their residency state).

    It does NOT include:

    - Bound vertex buffers or vertex unified addresses, or their offsets,
      or bound index buffers/addresses.

    - Other program-related bindings, such as shader storage buffers, atomic counter buffers, texture
      and sampler bindings.

    - Default-block uniform values from active programs.

    - Blending constant color, front and back stencil reference values, alpha test threshold.

    - Polygon offset values.

    - Viewport and scissor rectangle for viewport index zero.

    Essentially all state that can be manipulated by the commands listed in 10.X.2 (Drawing with Commands)
    is excluded from the state capture.

    INVALID_ENUM is generated if <basicmode> is not a basic primitive mode, as listed
    in Table 10.X.1.2.
    INVALID_OPERATION is generated if the default framebuffer is bound as either the draw or read framebuffer.
    INVALID_OPERATION is generated if transform feedback is active.
    INVALID_OPERATION is generated if an occlusion query is active.
    INVALID_OPERATION is generated if the current active program or program pipeline
    makes use of SHADER_STORAGE_BUFFER or ATOMIC_COUNTER_BUFFER, has uniforms defined
    in the default uniform block, or has uniforms inheriting from fixed-function state
    (gl_ModelView etc.).
    INVALID_OPERATION is generated if the current active program or program pipeline
    uses uniform blocks that did not have the "commandBindableNV" flag set (see
    "Modifications to the OpenGL Shading Language Specification" section).
    INVALID_OPERATION is generated if neither a program nor a program pipeline
    object is actively used.

Add a new subsection 10.X.2 (Drawing with Commands)

        void DrawCommandsNV(enum mode, uint buffer, const intptr* indirects, const sizei* sizes,
                            uint count);
        void DrawCommandsAddressNV(enum mode, const uint64* indirects, const sizei* sizes,
                                   uint count);

    These commands accept arrays of buffer addresses (either an array of
    offsets <indirects> into a buffer named by <buffer>, or an array of GPU
    addresses <indirects>), and an array of sequence lengths in <sizes>.
    All arrays have <count> entries.
    The current binding state of vertex, element and uniform buffers has no
    effect; these bindings must be set via commands within the buffer. All other
    state is inherited from the current OpenGL context.

    INVALID_ENUM is generated if <mode> is not an accepted value.
    INVALID_VALUE is generated if <buffer> is not a valid buffer object.
    INVALID_OPERATION is generated if a geometry shader is active and <mode> is
    incompatible with the input primitive type of the geometry shader in the currently
    installed program object.
    INVALID_OPERATION is generated if the default (zero) framebuffer object is
    currently bound as DRAW_FRAMEBUFFER; a non-zero framebuffer object is required.

    DrawCommandsNV and DrawCommandsAddressNV are equivalent to:

        Save current GL state;
        enum indexType = UNSIGNED_SHORT;
        for (uint i = 0; i < count; i++) {
            uint64 address = address computed from <buffer>+<indirects>[i];

            indexType = DrawCommandSequenceNV(<mode>, indexType, address, sizes[i]);
        }
        Restore current GL state;

    The command:

        enum DrawCommandSequenceNV(enum mode, enum indexType, void *address, sizei size);

    does not exist in the GL, but is used to describe functionality in the rest
    of this section.

    DrawCommandSequenceNV is a flexible and extensible command that executes
    simple state changes and draw commands based on a tokenized format. The
    loop above illustrates that the state changes from one invocation will
    influence the next. All rendering is performed as if the client states for
    VERTEX_ATTRIB_ARRAY_UNIFIED_NV, ELEMENT_ARRAY_UNIFIED_NV and
    UNIFORM_BUFFER_UNIFIED_NV are enabled.

    It is defined by the following pseudo code, tokens, and structures:

    Table 10.X.2 (Token values and command structure names)

      tokenID                               | Command
      ---------------------------------------------------------------------
        TERMINATE_SEQUENCE_COMMAND_NV       | TerminateSequenceCommandNV
        NOP_COMMAND_NV                      | NOPCommandNV
        DRAW_ELEMENTS_COMMAND_NV            | DrawElementsCommandNV
        DRAW_ARRAYS_COMMAND_NV              | DrawArraysCommandNV
        DRAW_ELEMENTS_STRIP_COMMAND_NV      | DrawElementsCommandNV
        DRAW_ARRAYS_STRIP_COMMAND_NV        | DrawArraysCommandNV
        DRAW_ELEMENTS_INSTANCED_COMMAND_NV  | DrawElementsInstancedCommandNV
        DRAW_ARRAYS_INSTANCED_COMMAND_NV    | DrawArraysInstancedCommandNV
        ELEMENT_ADDRESS_COMMAND_NV          | ElementAddressCommandNV
        ATTRIBUTE_ADDRESS_COMMAND_NV        | AttributeAddressCommandNV
        UNIFORM_ADDRESS_COMMAND_NV          | UniformAddressCommandNV
        BLEND_COLOR_COMMAND_NV              | BlendColorCommandNV
        STENCIL_REF_COMMAND_NV              | StencilRefCommandNV
        LINE_WIDTH_COMMAND_NV               | LineWidthCommandNV
        POLYGON_OFFSET_COMMAND_NV           | PolygonOffsetCommandNV
        ALPHA_REF_COMMAND_NV                | AlphaRefCommandNV
        VIEWPORT_COMMAND_NV                 | ViewportCommandNV
        SCISSOR_COMMAND_NV                  | ScissorCommandNV
        FRONT_FACE_COMMAND_NV               | FrontFaceCommandNV

        Tight packing is used for all structures.

        typedef struct {
          uint  header;
        } TerminateSequenceCommandNV;

        typedef struct {
          uint  header;
        } NOPCommandNV;

        typedef struct {
          uint  header;
          uint  count;
          uint  firstIndex;
          uint  baseVertex;
        } DrawElementsCommandNV;

        typedef struct {
          uint  header;
          uint  count;
          uint  first;
        } DrawArraysCommandNV;

        typedef struct {
          uint  header;
          uint  mode;
          uint  count;
          uint  instanceCount;
          uint  firstIndex;
          uint  baseVertex;
          uint  baseInstance;
        } DrawElementsInstancedCommandNV;

        typedef struct {
          uint  header;
          uint  mode;
          uint  count;
          uint  instanceCount;
          uint  first;
          uint  baseInstance;
        } DrawArraysInstancedCommandNV;

        typedef struct {
          uint  header;
          uint  addressLo;
          uint  addressHi;
          uint  typeSizeInByte;
        } ElementAddressCommandNV;

        typedef struct {
          uint  header;
          uint  index;
          uint  addressLo;
          uint  addressHi;
        } AttributeAddressCommandNV;

        typedef struct {
          uint    header;
          ushort  index;
          ushort  stage;
          uint    addressLo;
          uint    addressHi;
        } UniformAddressCommandNV;

        typedef struct {
          uint  header;
          float red;
          float green;
          float blue;
          float alpha;
        } BlendColorCommandNV;

        typedef struct {
          uint  header;
          uint  frontStencilRef;
          uint  backStencilRef;
        } StencilRefCommandNV;

        typedef struct {
          uint  header;
          float lineWidth;
        } LineWidthCommandNV;

        typedef struct {
          uint  header;
          float scale;
          float bias;
        } PolygonOffsetCommandNV;

        typedef struct {
          uint  header;
          float alphaRef;
        } AlphaRefCommandNV;

        typedef struct {
          uint  header;
          uint  x;
          uint  y;
          uint  width;
          uint  height;
        } ViewportCommandNV; // only ViewportIndex 0

        typedef struct {
          uint  header;
          uint  x;
          uint  y;
          uint  width;
          uint  height;
        } ScissorCommandNV; // only ViewportIndex 0

        typedef struct {
          uint  header;
          uint  frontFace; // 0 for CW, 1 for CCW
        } FrontFaceCommandNV;

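    As an illustration of the tight packing rule and of the split 64-bit
    address fields, the following sketch mirrors two of the structures above
    with fixed-width C types. It assumes that uint maps to a 32-bit and ushort
    to a 16-bit unsigned integer; set_uniform_address and get_uniform_address
    are hypothetical helpers, not part of this extension.

```c
#include <assert.h>
#include <stdint.h>

/* Fixed-width mirrors of two command structures from this section. */
typedef struct {
    uint32_t header;
    uint32_t count;
    uint32_t firstIndex;
    uint32_t baseVertex;
} DrawElementsCommandNV;

typedef struct {
    uint32_t header;
    uint16_t index;
    uint16_t stage;
    uint32_t addressLo;
    uint32_t addressHi;
} UniformAddressCommandNV;

/* Tight packing: each size is the plain sum of the member sizes. */
_Static_assert(sizeof(DrawElementsCommandNV) == 16, "unexpected padding");
_Static_assert(sizeof(UniformAddressCommandNV) == 16, "unexpected padding");

/* Hypothetical helper: stores a 64-bit GPU address in the two 32-bit fields. */
static void set_uniform_address(UniformAddressCommandNV *cmd, uint64_t address)
{
    assert(cmd != 0);
    cmd->addressLo = (uint32_t)(address & 0xFFFFFFFFu);
    cmd->addressHi = (uint32_t)(address >> 32);
}

/* Hypothetical helper: reassembles the address as
   uint64(addressLo) | (uint64(addressHi) << 32). */
static uint64_t get_uniform_address(const UniformAddressCommandNV *cmd)
{
    return (uint64_t)cmd->addressLo | ((uint64_t)cmd->addressHi << 32);
}
```

    The static assertions fail at compile time if a compiler were to insert
    padding, which would make the in-memory layout differ from the token format.
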
        enum DrawCommandSequenceNV(enum mode, enum indexType, void *address, sizei size)
        {
          enum modeStrip;
          if      (mode == TRIANGLES)            modeStrip = TRIANGLE_STRIP;
          else if (mode == LINES)                modeStrip = LINE_STRIP;
          else if (mode == LINES_ADJACENCY)      modeStrip = LINE_STRIP_ADJACENCY;
          else if (mode == TRIANGLES_ADJACENCY)  modeStrip = TRIANGLE_STRIP_ADJACENCY;
          else if (mode == QUADS)                modeStrip = QUAD_STRIP;
          else    modeStrip = mode;

          enum modeSpecial;
          if      (mode == LINES)      modeSpecial = LINE_LOOP;
          else if (mode == TRIANGLES)  modeSpecial = TRIANGLE_FAN;
          else    modeSpecial = mode;

          void *current = address;

          while (current != (ubyte *)address + size) {
            uint    header  = *(uint*)current;

            switch( GetTokenType(header)){
            case TERMINATE_SEQUENCE_COMMAND_NV:
              {
                return indexType;
              }
              break;
            case NOP_COMMAND_NV:

              break;
            case DRAW_ELEMENTS_COMMAND_NV:
              {
                DrawElementsCommandNV* cmd = (DrawElementsCommandNV*)current;
                DrawElementsBaseVertex(mode, cmd->count, indexType, (void*)(cmd->firstIndex * sizeofindextype), cmd->baseVertex);
              }
              break;
            case DRAW_ARRAYS_COMMAND_NV:
              {
                DrawArraysCommandNV* cmd = (DrawArraysCommandNV*)current;
                DrawArrays(mode, cmd->first, cmd->count);
              }
              break;
            case DRAW_ELEMENTS_STRIP_COMMAND_NV:
              {
                DrawElementsCommandNV* cmd = (DrawElementsCommandNV*)current;
                DrawElementsBaseVertex(modeStrip, cmd->count, indexType, (void*)(cmd->firstIndex * sizeofindextype), cmd->baseVertex);
              }
              break;
            case DRAW_ARRAYS_STRIP_COMMAND_NV:
              {
                DrawArraysCommandNV* cmd = (DrawArraysCommandNV*)current;
                DrawArrays(modeStrip, cmd->first, cmd->count);
              }
              break;
            case DRAW_ELEMENTS_INSTANCED_COMMAND_NV:
              {
                DrawElementsInstancedCommandNV* cmd = (DrawElementsInstancedCommandNV*)current;

                // undefined behavior if (cmd->mode != mode && cmd->mode != modeStrip && cmd->mode != modeSpecial)
                DrawElementsIndirect(cmd->mode, indexType, &cmd->count);
              }
              break;
            case DRAW_ARRAYS_INSTANCED_COMMAND_NV:
              {
                DrawArraysInstancedCommandNV* cmd = (DrawArraysInstancedCommandNV*)current;

                // undefined behavior if (cmd->mode != mode && cmd->mode != modeStrip && cmd->mode != modeSpecial)
                DrawArraysIndirect(cmd->mode, &cmd->count);
              }
              break;
            case ELEMENT_ADDRESS_COMMAND_NV:
              {
                ElementAddressCommandNV* cmd = (ElementAddressCommandNV*)current;
                switch(cmd->typeSizeInByte){
                  case 1: indexType = UNSIGNED_BYTE;  break;
                  case 2: indexType = UNSIGNED_SHORT; break;
                  case 4: indexType = UNSIGNED_INT;   break;
                }
                BufferAddressRangeNV(ELEMENT_ARRAY_ADDRESS_NV, 0, uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32), 0x7FFFFFFF);
              }
              break;
            case ATTRIBUTE_ADDRESS_COMMAND_NV:
              {
                AttributeAddressCommandNV* cmd = (AttributeAddressCommandNV*)current;
                BufferAddressRangeNV(VERTEX_ATTRIB_ARRAY_ADDRESS_NV, cmd->index, uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32), 0x7FFFFFFF);
              }
              break;
            case UNIFORM_ADDRESS_COMMAND_NV:
              {
                UniformAddressCommandNV* cmd = (UniformAddressCommandNV*)current;
                BufferAddressRangeNV(UNIFORM_BUFFER_ADDRESS_NV, cmd->index, uint64(cmd->addressLo) | (uint64(cmd->addressHi)<<32), 0x10000);
              }
              break;
            case BLEND_COLOR_COMMAND_NV:
              {
                BlendColorCommandNV* cmd = (BlendColorCommandNV*)current;
                BlendColor(cmd->red,cmd->green,cmd->blue,cmd->alpha);
              }
              break;
            case STENCIL_REF_COMMAND_NV:
              {
                StencilRefCommandNV* cmd = (StencilRefCommandNV*)current;
                StencilFuncSeparate(FRONT, asIs, cmd->frontStencilRef, asIs);
                StencilFuncSeparate(BACK,  asIs, cmd->backStencilRef,  asIs);
              }
              break;
            case LINE_WIDTH_COMMAND_NV:
              {
                LineWidthCommandNV* cmd = (LineWidthCommandNV*)current;
                LineWidth(cmd->lineWidth);
              }
              break;
            case POLYGON_OFFSET_COMMAND_NV:
              {
                PolygonOffsetCommandNV* cmd = (PolygonOffsetCommandNV*)current;
                PolygonOffset(cmd->scale,cmd->bias);
              }
              break;
            case ALPHA_REF_COMMAND_NV:
              {
                AlphaRefCommandNV* cmd = (AlphaRefCommandNV*)current;
                AlphaFunc(asIs, cmd->alphaRef);
              }
              break;
            case VIEWPORT_COMMAND_NV:
              {
                ViewportCommandNV* cmd = (ViewportCommandNV*)current;
                Viewport(cmd->x,cmd->y,cmd->width,cmd->height);
              }
              break;
            case SCISSOR_COMMAND_NV:
              {
                ScissorCommandNV* cmd = (ScissorCommandNV*)current;
                Scissor(cmd->x,cmd->y,cmd->width,cmd->height);
              }
              break;
            case FRONT_FACE_COMMAND_NV:
              {
                // frontFace is 0 for CW and 1 for CCW (see FrontFaceCommandNV)
                FrontFaceCommandNV* cmd = (FrontFaceCommandNV*)current;
                FrontFace(cmd->frontFace ? CCW : CW);
              }
              break;
            }

            current = (ubyte *)current + GetTokenSize(header);
          }

          return indexType;
        }

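    The token walk performed by DrawCommandSequenceNV can be sketched on the
    client side without a GL context. Real headers are implementation specific
    and must be obtained from GetCommandHeaderNV, so this sketch replaces them
    with a made-up encoding (fake_header, fake_token_type and fake_token_size
    are assumptions for illustration only, as are build_example_stream and
    count_draws). It only counts draw tokens instead of issuing GL calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    TERMINATE_SEQUENCE_COMMAND_NV = 0x0000,
    NOP_COMMAND_NV                = 0x0001,
    DRAW_ARRAYS_COMMAND_NV        = 0x0003,
    LINE_WIDTH_COMMAND_NV         = 0x000d
};

/* ASSUMPTION: stand-in for GetCommandHeaderNV. A real implementation returns
   an opaque value; here the byte size is packed above the token ID. */
static uint32_t fake_header(uint32_t tokenID, uint32_t size)
{
    return (size << 16) | tokenID;
}
static uint32_t fake_token_type(uint32_t header) { return header & 0xFFFFu; }
static uint32_t fake_token_size(uint32_t header) { return header >> 16; }

/* Writes LineWidth, DrawArrays, NOP, DrawArrays, Terminate into <words> and
   returns the stream size in bytes. */
static size_t build_example_stream(uint32_t words[10])
{
    float width = 2.0f;
    size_t n = 0;
    assert(sizeof width == sizeof words[0]);
    words[n++] = fake_header(LINE_WIDTH_COMMAND_NV, 8);
    memcpy(&words[n++], &width, sizeof width);          /* lineWidth */
    words[n++] = fake_header(DRAW_ARRAYS_COMMAND_NV, 12);
    words[n++] = 3;                                     /* count */
    words[n++] = 0;                                     /* first */
    words[n++] = fake_header(NOP_COMMAND_NV, 4);
    words[n++] = fake_header(DRAW_ARRAYS_COMMAND_NV, 12);
    words[n++] = 3;                                     /* count */
    words[n++] = 3;                                     /* first */
    words[n++] = fake_header(TERMINATE_SEQUENCE_COMMAND_NV, 4);
    return n * sizeof words[0];
}

/* Walks <size> bytes at <stream>, advancing by each token's size, and returns
   the number of draw tokens seen before the end or a terminate token. */
static int count_draws(const void *stream, size_t size)
{
    const unsigned char *current = stream;
    const unsigned char *end = current + size;
    int draws = 0;

    while (current != end) {
        uint32_t header;
        memcpy(&header, current, sizeof header);
        switch (fake_token_type(header)) {
        case TERMINATE_SEQUENCE_COMMAND_NV: return draws;
        case DRAW_ARRAYS_COMMAND_NV:        draws++; break;
        default:                            break;
        }
        current += fake_token_size(header);
    }
    return draws;
}
```

    Like the pseudo code above, the walker trusts the stream: a token with a
    wrong size would make it run past the end of the data.
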
    The commands called by DrawCommandSequenceNV do not necessarily generate
    the errors they normally would. Providing erroneous data as parameters,
    or generating state that would normally create errors when executed
    by the server, can produce undefined results and may cause program
    termination.
    The residency of all resources referenced directly (buffer addresses inside tokens)
    or indirectly (texture handles inside uniform buffer objects) must be managed
    explicitly.

    (XXX should we add something similar to CheckFramebufferStatus? for
     debugging, that tests the content in software and throws error + offset into buffer
     triggering the error)

    All BufferAddressRangeNV calls issued by DrawCommandSequenceNV take
    effect regardless of whether the corresponding client state is enabled.

    uint GetCommandHeaderNV(enum tokenID, uint size)

    Returns the encoded 32-bit header value for a given command; the returned
    value is implementation specific.
    The <size> is only provided as a basic consistency check, since the size of each
    structure is fixed and no padding is allowed. The value is the combined size of
    the header and the command-specific structure members.
    INVALID_ENUM is generated if <tokenID> is not one of the values listed in Table 10.X.2.
    INVALID_VALUE is generated if the <size> does not match the fixed
    size of a command defined by the spec.

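    The consistency check on <size> can be modelled with a table of the fixed
    structure sizes. This is a sketch rather than driver code: the sizes are
    derived from the structure definitions in this section under the assumption
    that sizes are counted in bytes with 4-byte uint and float members, and
    check_command_size and check_result are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Byte size of each command structure, indexed by tokenID (0x0000..0x0012). */
static const uint32_t command_sizes[] = {
     4, /* TERMINATE_SEQUENCE_COMMAND_NV      */
     4, /* NOP_COMMAND_NV                     */
    16, /* DRAW_ELEMENTS_COMMAND_NV           */
    12, /* DRAW_ARRAYS_COMMAND_NV             */
    16, /* DRAW_ELEMENTS_STRIP_COMMAND_NV     */
    12, /* DRAW_ARRAYS_STRIP_COMMAND_NV       */
    28, /* DRAW_ELEMENTS_INSTANCED_COMMAND_NV */
    24, /* DRAW_ARRAYS_INSTANCED_COMMAND_NV   */
    16, /* ELEMENT_ADDRESS_COMMAND_NV         */
    16, /* ATTRIBUTE_ADDRESS_COMMAND_NV       */
    16, /* UNIFORM_ADDRESS_COMMAND_NV         */
    20, /* BLEND_COLOR_COMMAND_NV             */
    12, /* STENCIL_REF_COMMAND_NV             */
     8, /* LINE_WIDTH_COMMAND_NV              */
    12, /* POLYGON_OFFSET_COMMAND_NV          */
     8, /* ALPHA_REF_COMMAND_NV               */
    20, /* VIEWPORT_COMMAND_NV                */
    20, /* SCISSOR_COMMAND_NV                 */
     8, /* FRONT_FACE_COMMAND_NV              */
};

typedef enum { CHECK_OK, CHECK_INVALID_ENUM, CHECK_INVALID_VALUE } check_result;

/* Hypothetical model of the checks described for GetCommandHeaderNV. */
static check_result check_command_size(uint32_t tokenID, uint32_t size)
{
    const uint32_t token_count = sizeof command_sizes / sizeof command_sizes[0];
    assert(token_count == 0x13);
    if (tokenID >= token_count)
        return CHECK_INVALID_ENUM;   /* unknown token */
    return size == command_sizes[tokenID] ? CHECK_OK : CHECK_INVALID_VALUE;
}
```

    In application code the <size> argument would typically be written as
    sizeof of the matching structure, which keeps the two in sync.
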
    ushort GetStageIndexNV(enum shadertype)

    Returns the 16-bit value for a specific shader stage; the returned value
    is implementation specific. The value is to be used with the stage field
    within UniformAddressCommandNV tokens.

Add a new subsection 10.X.3 (Drawing with Commands and State Objects)

    State objects may be used in rendering with the commands:

        void DrawCommandsStatesNV(uint buffer, const intptr* indirects, const sizei* sizes,
                                  const uint* states, const uint* fbos, uint count);
        void DrawCommandsStatesAddressNV(const uint64* indirects, const sizei* sizes,
                                         const uint* states, const uint* fbos, uint count);

    These commands accept arrays of buffer addresses (either an array of
    offsets <indirects> into a buffer named by <buffer>, or an array of GPU
    addresses <indirects>), an array of sequence lengths in <sizes>, and an
    array of state object names in <states>, all of which must be non-zero.
    Framebuffer object names are stored in <fbos> and can
    be either zero or non-zero. All arrays have <count> entries.
    The residency of textures used as attachments inside the state object's
    captured fbo or the passed fbo must be managed explicitly.

    INVALID_VALUE is generated if any entry of <states> is zero.
    INVALID_OPERATION is generated if the fbo configuration from <fbos>
    mismatches the configuration inside the corresponding state object
    from <states>.

    DrawCommandsStatesNV and DrawCommandsStatesAddressNV are equivalent to:

        Save current GL state;
        enum indexType = UNSIGNED_SHORT;
        for (uint i = 0; i < count; i++) {
            fbo         = LookupFbo(fbos[i]);
            stateObject = LookupStateObject(states[i]);

            if ( i == 0){
              Set full state captured by stateObject;
            }
            else {
              Set difference of state going from <states>[i-1] to current stateObject;
            }

            if ( fbo == 0) {
              BindFramebuffer(FRAMEBUFFER, stateObject.fbo.name);
            }
            else if ( stateObject.fbo.configuration == fbo.configuration ){
              // The configuration excludes attachment textures and size information, however
              // includes attached texture formats and other state (see StateCaptureNV).

              BindFramebuffer(FRAMEBUFFER, fbo.name);
            }
            else {
              // Only compatible fbo states can be used.

              generate ERROR INVALID_OPERATION;
              return;
            }

            enum mode = primitive mode from stateObject;

            uint64 address = address computed from <buffer>+<indirects>[i];

            indexType = DrawCommandSequenceNV(mode, indexType, address, sizes[i]);
        }
        Restore current GL state;

    where LookupFbo and LookupStateObject return the driver's internal fbo
    and state object, stateObject.fbo is the fbo state captured in the state
    object, and fbo.configuration and fbo.name are the current configuration
    and the name of an fbo, respectively.

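    The framebuffer selection in the loop above can be isolated into a small
    model. The Fbo and StateObject types below are hypothetical, heavily
    simplified stand-ins for driver-internal objects, and the configuration is
    reduced to a single opaque number for the sake of the sketch; select_fbo is
    likewise not part of this extension.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for driver-internal objects. */
typedef struct {
    uint32_t name;           /* framebuffer object name */
    uint32_t configuration;  /* opaque id for formats, draw buffers, targets */
} Fbo;

typedef struct {
    Fbo fbo;                 /* framebuffer state captured by StateCaptureNV */
} StateObject;

/* Mirrors the three branches of the pseudo code. <passed> is NULL when the
   corresponding entry of <fbos> is zero. Returns true and writes the name to
   bind, or returns false where the pseudo code generates INVALID_OPERATION. */
static bool select_fbo(const StateObject *state, const Fbo *passed,
                       uint32_t *bind_name)
{
    assert(state != NULL && bind_name != NULL);
    if (passed == NULL) {
        *bind_name = state->fbo.name;      /* fall back to the captured fbo */
        return true;
    }
    if (state->fbo.configuration == passed->configuration) {
        *bind_name = passed->name;         /* compatible replacement fbo */
        return true;
    }
    return false;                          /* incompatible configuration */
}
```

    This shows why a state object captured against one framebuffer can be
    replayed against another, as long as the two share the same configuration.
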
724Add a new section 10.X.4 (Command Lists)
725
726    A list of DrawCommandsStates* commands may be compiled into a command
727    list, for further optimization and efficient reuse. The name space for
728    command lists is the unsigned integers, with zero reserved. The command:
729
730        void CreateCommandListsNV(sizei n, uint *lists);
731
732    returns <n> previously unused command list names in <lists>, and creates
733    a command list in the initial state for each name.
734
735    Command lists are deleted by calling
736
737        void DeleteCommandListsNV(sizei n, const uint *lists);
738
739    <lists> contains <n> names of command lists to be deleted. Once a command
740    list is deleted it has no contents and its name is again unused. Unused
741    names in <lists> are silently ignored, as is the value zero.
742
743    The command
744
745        void CommandListSegmentsNV(uint list, uint segments);
746
747    indicates that <list> will have <segments> number of segments, each
748    of which is a list of command sequences that it enqueues. This must be
749    called before any commands are enqueued. In the initial state, a command
750    list has a single segment.
751
752    A command list's initial state allows it to enqueue commands, but not to
753    be executed. The following command can be enqueued:
754
755        void ListDrawCommandsStatesClientNV(uint list, uint segment, const void** indirects,
756                                              const sizei* sizes, const uint* states, const uint* fbos,
757                                              uint count);
758
759    A list has multiple segments and each segment enqueues an ordered list of
760    command sequences. This command enqueues the equivalent of the DrawCommandsStatesNV
761    commands into the list indicated by <list> on the segment indicated by <segment>
762    except that the sequence data is copied from the sequences pointed to by the <indirects>
763    pointer. The <indirects> pointer should point to a list of size <count> of pointers,
764    each of which should point to a command sequence.
765
    The pre-validated state from <states> is saved into the command list, rather
    than a reference to the state object (i.e. the state objects or fbos could be
    deleted and the command list would be unaffected). This includes native
    GPU addresses for all textures indirectly referenced through the fbos
    passed or through the fbo attachments of the state objects. A recompile of
    the command list is therefore required if such referenced textures change
    their allocation (for example due to resizing), and the residency of the
    textures must be managed explicitly prior to CallCommandListNV.
774
775    ListDrawCommandsStatesClientNV performs a by-value copy of the
776    indirect data based on the provided client-side pointers. In this case
777    the content is fully immutable, while the buffer-based versions can
778    change the content of the buffers at any later time.
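
    The by-value copy described above can be pictured with a small CPU-side
    sketch. It is not driver code: CopiedSequence, copy_sequences and
    free_sequences are hypothetical names introduced only for illustration,
    and the parameter types merely follow the <indirects> and <sizes>
    arguments of ListDrawCommandsStatesClientNV.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one enqueued sequence; not a driver structure. */
typedef struct {
    unsigned char *data; /* private copy of the client-side token data */
    size_t         size; /* size in bytes, taken from sizes[i] */
} CopiedSequence;

/* Releases the first <count> copies and the array itself. */
static void free_sequences(CopiedSequence *seqs, unsigned count)
{
    if (!seqs) return;
    for (unsigned i = 0; i < count; ++i)
        free(seqs[i].data);
    free(seqs);
}

/* Copies <count> client-side sequences by value. indirects[i] points to
 * sizes[i] bytes of token data. Returns NULL on allocation failure. */
static CopiedSequence *copy_sequences(const void **indirects,
                                      const int *sizes, unsigned count)
{
    CopiedSequence *out = calloc(count ? count : 1, sizeof *out);
    if (!out) return NULL;
    for (unsigned i = 0; i < count; ++i) {
        assert(sizes[i] >= 0);
        size_t n = (size_t)sizes[i];
        out[i].data = malloc(n ? n : 1);
        if (!out[i].data) {
            free_sequences(out, i);
            return NULL;
        }
        memcpy(out[i].data, indirects[i], n);
        out[i].size = n;
    }
    return out;
}
```

    In this model, writing to the application's arrays after copy_sequences
    returns has no effect on the copies, which is the sense in which the
    enqueued content is immutable.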
779
780    The command
781
782        void CompileCommandListNV(uint list);
783
    makes the list indicated by <list> switch from allowing collection of
785    commands to allowing its execution. At this time, the implementation may
786    generate optimized commands to transition between states as efficiently
787    as possible. Lists may be executed with the command
788
789        void CallCommandListNV(uint list);
790
791    This executes the command list indicated by <list>, which operates as if
792    the DrawCommandsStates* commands were replayed in the order they were
793    enqueued on each segment, starting from segment zero and proceeding to the
794    maximum segment. All buffer or texture resources' residency must be
795    managed explicitly, including texture attachments of the effective
796    fbos during list enqueuing.
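
    The life cycle and replay order described in this section can be
    sketched as a small CPU-side model. This is an illustration under
    assumptions, not an implementation: ModelList and the model_* functions
    are hypothetical, integer ids stand in for enqueued command sequences,
    and rejecting an enqueue after compilation is this model's reading of
    the text rather than a stated error condition.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_SEGMENTS = 8, MAX_PER_SEGMENT = 16 };

/* Hypothetical CPU-side model of a command list; not a driver structure. */
typedef struct {
    int      ids[MAX_SEGMENTS][MAX_PER_SEGMENT]; /* enqueued sequence ids */
    size_t   counts[MAX_SEGMENTS];               /* sequences per segment */
    unsigned segments;                           /* 1 in the initial state */
    int      compiled;                           /* 0: collecting, 1: executable */
} ModelList;

/* Models the initial state: one segment, collecting, not executable. */
static void model_create(ModelList *l)
{
    assert(l != NULL);
    memset(l, 0, sizeof *l);
    l->segments = 1;
}

/* Models CommandListSegmentsNV: only accepted before anything is enqueued. */
static int model_set_segments(ModelList *l, unsigned segments)
{
    if (l->compiled || segments == 0 || segments > MAX_SEGMENTS) return 0;
    for (unsigned s = 0; s < l->segments; ++s)
        if (l->counts[s] != 0) return 0;
    l->segments = segments;
    return 1;
}

/* Models enqueuing one sequence on <segment>. */
static int model_enqueue(ModelList *l, unsigned segment, int id)
{
    if (l->compiled || segment >= l->segments) return 0;
    if (l->counts[segment] == MAX_PER_SEGMENT) return 0;
    l->ids[segment][l->counts[segment]++] = id;
    return 1;
}

/* Models CompileCommandListNV: switch from collecting to executable. */
static void model_compile(ModelList *l) { l->compiled = 1; }

/* Models CallCommandListNV: writes the replay order to <out>.
 * Returns the number of sequences replayed, or -1 if the list has not
 * been compiled or <out> is too small. */
static int model_call(const ModelList *l, int *out, size_t cap)
{
    size_t n = 0;
    if (!l->compiled) return -1;
    for (unsigned s = 0; s < l->segments; ++s)      /* segment zero first */
        for (size_t i = 0; i < l->counts[s]; ++i) { /* enqueue order within */
            if (n == cap) return -1;
            out[n++] = l->ids[s][i];
        }
    return (int)n;
}
```

    With two segments, enqueuing id 10 on segment 1, then id 20 on segment
    0, then id 11 on segment 1 replays as 20, 10, 11 in this model: order
    is by segment first and by enqueue order within each segment.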
797
798
799Modifications to the OpenGL Shading Language Specification, Version 4.40
800
801    Including the following line in a shader can be used to control the
802    language features described in this extension:
803
804      #extension GL_NV_command_list : <behavior>
805
806    where <behavior> is as specified in section 3.3.
807
808    New preprocessor #defines are added to the OpenGL Shading Language:
809
810      #define GL_NV_command_list          1
811
812
813    Modify Section 4.4.5, "Uniform and Shader Storage Block Layout Qualifiers"
814
815    (modify first paragraph, p.78) Layout qualifiers can be used for uniform
816    and shader storage blocks, but not for non-block uniform declarations.
817    The layout qualifier identifiers (and shared keyword) for uniform and
818    shader storage blocks are
819
820      layout-qualifier-id
821        shared
822        packed
823        std140
824        std430
825        row_major
826        column_major
827        binding = integer-constant-expression
828        offset  = integer-constant-expression
829        align   = integer-constant-expression
830        commandBindableNV
831
    (add paragraph prior to "When multiple arguments", p. 80)
    The commandBindableNV qualifier enables the associated uniform block
    to be updated via UniformAddressCommandNVs when executing
    DrawCommandsStatesNV. When commandBindableNV is used, the <binding>
    identifier must be provided for each block, and only its value
    corresponds to the index field of a UniformAddressCommandNV.
    A link-time error is generated if an index is greater than or equal to
    MAX_PROGRAM_PARAMETER_BUFFER_BINDINGS_NV.
    Changing the binding point through the OpenGL API may not influence this
    associated index value and may cause UniformAddressCommandNVs to have
    undefined behavior.
843
844Dependencies on OpenGL 4.4 (Core Profile)
845
846    If only the core profile of OpenGL 4.4 is supported, references to
847    functionality deprecated by OpenGL 3.0 (built-in input/output/uniform variables
848    corresponding to fixed-function vertex attributes, fixed-function
849    vertex and fragment processing) should be removed and/or replaced with
850    functionality supported in the core profile.  In such an environment, the
851    QUADS primitive type is not supported by the StateCaptureNV function. StateCaptureNV will
852    also ignore all references to deprecated state such as line stippling.
853    The ALPHA_REF_COMMAND_NV is not allowed to be used, therefore GetCommandHeaderNV will
854    return an error if the token enum is passed.
855
856Interactions with NV_shader_buffer_load
857
858    The GPU addresses used in ELEMENT_ADDRESS_COMMAND_NV,
859    ATTRIBUTE_ADDRESS_COMMAND_NV and UNIFORM_ADDRESS_COMMAND_NV
860    can be queried via the API provided in this extension. Furthermore
861    the same API must be used to ensure residency of such buffers
862    when draw commands using such addresses are issued.
863
864Interactions with NV_bindless_texture or ARB_bindless_texture
865
866    Residency of fbo attachment textures referenced in state objects
867    or command lists must be managed explicitly using the API provided
868    by either of these extensions.
869
870Interactions with NV_parameter_buffer_object
871
    The UNIFORM_ADDRESS_COMMAND_NV described in (Drawing with Commands) will affect
    the PROGRAM_PARAMETER_BUFFER of the target stage defined within the command
    token.
875
876Interactions with ARB_robust_buffer_access_behavior
877
878    The buffer setups performed by ELEMENT_ADDRESS_COMMAND_NV,
879    ATTRIBUTE_ADDRESS_COMMAND_NV and UNIFORM_ADDRESS_COMMAND_NV
880    do not provide the required buffer ranges for robust buffer
881    access. Therefore draw calls executed under this type of
882    buffer setup will not respect the robust buffer access rules.
883
884Interactions with ARB_shader_draw_parameters
885
886    The drawing operations performed through this extension will not support
887    setting of the built-in GLSL values that were added by
888    ARB_shader_draw_parameters (gl_BaseInstanceARB, gl_BaseVertexARB, gl_DrawIDARB).
889    Accessing these variables will result in undefined values.
890
891Additions to the AGL/GLX/WGL Specifications
892
893    None.
894
895GLX Protocol
896
897    None.
898
899Errors
900
901
902New State
903
904    None.
905
906Issues
907
908    1) What motivates the design?
909
910    The primary goal is to be able to reuse pre-validated command buffers. Other
911    APIs and proposals have addressed this with various incarnations of command
912    lists or state objects, but a recurring problem is that interactions between
913    various stages of the pipeline prevent this prevalidation and reuse. These
914    interactions are often hardware-specific (and differ from vendor to vendor
915    or even generation to generation) and new interactions are introduced by
916    new features that were not imagined when the prevalidation scheme was
917    proposed.
918
919    We attempt to address this by having a monolithic state object that
920    encompasses (almost) the entire state of the pipeline. This should provide
921    enough information for all implementations to do any needed cross-
922    validation. We try to create these in a way that minimizes the new API
923    footprint - since we want ALL state (including any added in the future), we
924    just capture it from the current state of the context.
925
926    We expect that a captured state object will be represented as a list of
927    commands to send to the GPU. While that list of commands may be fairly
928    large, it is also well-suited to filtering redundant changes when switching
929    from one state object to another (filtering may occur on the GPU, or by
930    some processing on the CPU). We anticipate that filtering will be applied
931    when compiling a command list, but it is likely that some (perhaps less
932    aggressive) filtering will also occur in unlisted DrawCommandsStates
933    commands.
934
935    2) Should binding state be captured?
936
937    Binding state should not be captured, for multiple reasons.
938
939    The memory management performed by the driver as part of legacy command
940    execution is expensive and not well-suited for the prevalidation of
941    commands. This can be replaced by explicit bindless memory management
942    APIs (e.g. Make*Resident).
943
944    Resource bindings also require behind-the-scenes management of internal
945    GPU structures like texture handles. Again, this can be replaced by the
946    bindless APIs.
947
948    3) What FBO state should be captured?
949
950    We definitely want to capture enough information to be able to do any
951    state-based recompiles of the fragment shader, which would include
952    drawbuffer state and format state. However, it is not desirable to have
953    all properties of the FBO be captured, e.g. if attachment width/height
954    were captured then state objects could become invalid if the window shape
    changed.
956
957    RESOLVED: state objects reference the FBO configuration, but passing
958    other compatible FBOs during rendering is possible. Furthermore the
959    VIEWPORT_COMMAND_NV allows setting the appropriate viewport state.
960
961    4) Can UBOs be accessed? How?
962
963    RESOLVED: We want to encourage the "first level of the scene graph" information read
964    by shaders to be accessed with fast UBO memory accesses.
965    UNIFORM_ADDRESS_COMMAND_NV provides this mechanism.
966
967    5) What about Compute?
968
969    Compute does not have the same complex state interactions that the graphics
970    pipeline has, so it is not included in this extension.
971
972    6) What dynamic state should be allowed?
973
974    There are some state values which are pretty much raw integer/floating
975    point data, where requiring a unique state object for each value would
976    drastically bloat the number of state objects needed and break batching.
977    We allow for a few such values to be set in the token command buffer
978    rather than in the state object. The current list is motivated by similar
979    state in other APIs, and may not be complete.
980
981    7) What are the "segments" in command lists?
982
983    These are multiple "starting points" for appending commands to the list,
984    which are ultimately replayed in order by segments. This may be useful to
985    build a multipass rendering algorithm with only a single traversal of the
986    scene graph.
987
988    8) When are state objects consumed into the list?
989
990    This could either occur as the command is appended to the list, or during
991    CompileCommandListNV.
992
993    RESOLVED: At ListDrawCommandsStatesClientNV time.
994
995    9) Do we want to have multiple modes in the same dispatch ?
996
997    RESOLVED: yes, state-objects with different modes can be used, allowing
998    fast transitioning between those. Furthermore, it is possible to mix
999    LINES/LINE_STRIP/LINE_LOOP or TRIANGLES/TRIANGLE_STRIP/TRIANGLE_FAN and others
1000    using the same state object, as long as their base primitive mode is the same.
1001
1002    10) Do we want to allow mixing DrawArrays and DrawElements in the same
1003    dispatch ?
1004
1005    RESOLVED: yes.
1006
1007    11) What happens if the token buffer is modified while it is being dispatched ?
1008
1009    RESOLVED: there is no guarantee of coherency, so undefined behavior.
1010
1011    12) I would like to change states in the middle; how do I do this ?
1012
1013    RESOLVED: you can select a new state object or state tokens, but you cannot change
1014    state in the indirect buffer itself.
1015
1016    13) Is the token buffer multithread safe; does it scale ?
1017
    RESOLVED: Yes. It is trivial to allocate a token buffer per thread, and then submit
    them in the main thread sequentially. Since the implementation is not involved
    when the application writes to them, the only thread safety requirements are in
    the application itself.
    Command lists and state objects are, however, currently not shareable across
    contexts. As rendering is much more efficient now, the main dispatching thread can
    spend the time on preparing state objects prior to drawing. The cost of glStateCaptureNV
    is no worse than a classic API draw call, and because of temporal coherence few
    states are "new" from frame to frame; cached states can be reused instead.
1027
1028    14) Can I reuse token buffer multiple times ?
1029
1030    RESOLVED: yes.
1031
1032    15) Should we use a fixed length decoding or at the very least a size in the header ?
1033
    RESOLVED: fixed length is used. As a basic consistency check, the size is also passed to header generation.
1035    The NOP command can be used to pad structures to custom sizes.
1036
1037    16) Can I do buffer updates in a single DrawCommands call ?
1038
1039    RESOLVED: NO.
1040    Updating memory in general requires synchronization, and having lots of
1041    updates inside a single DrawCommands would become a performance bottleneck.
1042
1043    17) I want to implement some occlusion scheme and skip some of the draws; how do I do this ?
1044
1045    RESOLVED: this extension does not offer a conditional render facility, but this can be
1046    implemented by using NOP or preferably TERMINATE_SEQUENCE commands in the stream.
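
    A minimal sketch of this approach, using a mock token stream on the CPU.
    The header constants below are hypothetical placeholders (real headers
    are opaque values returned by GetCommandHeaderNV), the three-word draw
    layout (header, count, first) is an assumption about the draw-arrays
    token, and count_draws is only a stand-in for the consumer of the
    stream.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header values; real ones come from GetCommandHeaderNV. */
#define HDR_TERMINATE   0xA0u
#define HDR_NOP         0xA1u
#define HDR_DRAW_ARRAYS 0xA3u

/* Size of a command in 32-bit words, determined by its header alone
 * (fixed-length decoding). Returns 0 for an unknown header. */
static size_t command_words(uint32_t header)
{
    switch (header) {
    case HDR_TERMINATE:   return 1;
    case HDR_NOP:         return 1;
    case HDR_DRAW_ARRAYS: return 3; /* header, count, first (assumed) */
    default:              return 0;
    }
}

/* Skips one command by overwriting every word it occupies with a NOP. */
static void skip_command(uint32_t *stream, size_t offset)
{
    size_t n = command_words(stream[offset]);
    assert(n != 0);
    for (size_t i = 0; i < n; ++i)
        stream[offset + i] = HDR_NOP;
}

/* Ends the sequence at <offset>; later words are never decoded. */
static void terminate_at(uint32_t *stream, size_t offset)
{
    stream[offset] = HDR_TERMINATE;
}

/* Mock consumer: counts the draws that would still be executed.
 * Returns -1 if the stream is malformed. */
static int count_draws(const uint32_t *stream, size_t words)
{
    int draws = 0;
    size_t i = 0;
    while (i < words) {
        uint32_t h = stream[i];
        size_t n = command_words(h);
        if (n == 0 || i + n > words) return -1;
        if (h == HDR_TERMINATE) break;
        if (h == HDR_DRAW_ARRAYS) ++draws;
        i += n;
    }
    return draws;
}
```

    Overwriting with NOPs removes a single draw and keeps the rest of the
    sequence intact, whereas a terminate token drops everything after it,
    which is cheaper when the whole tail of a sequence is occluded.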
1047
1048    18) I want to implement some level of detail scheme; is that possible ?
1049
    RESOLVED: you can use NOP or TERMINATE_SEQUENCE to skip the levels of detail that you don't want to draw.
1051
1052    19) Why can't I just get a token to change the state, and avoid specifying lists of
1053    state and indirect buffers ?
1054
    RESOLVED: Getting a token to specify a state switch implies that the application would
    have access to a virtual address of state changes. This would potentially open a security
    issue, since part of the validation may involve complex sequences of programming.
1058
1059    20) Instead of void** which means all commands must be stored in one buffer, could GLuint64** be used
1060    when EnableClientState(DRAW_INDIRECT_UNIFIED_NV) is set? This would allow managing different command
1061    buffers independently.
1062
1063    RESOLVED: separate Address command added
1064
1065    21) How big can each indirect command list's buffer size be?
1066
1067    RESOLVED: no limit required.
1068
1069    22) How to retrieve the "index" within UniformAddressCommandNV, or is that the GL binding point?
1070
1071    RESOLVED: added commandBindableNV layout qualifier in GLSL for uniform blocks to ensure fixed binding unit.
1072    Also added stage value to command.
1073
1074    23) In what condition is the state left, that is modified by tokens, after the dispatch call?
1075
1076    RESOLVED: state is reset.
1077
    24) What does working with this extension look like?
1079
1080    You will find related samples at https://github.com/nvpro-samples
1081
1082    25) How can I use textures, images, shader storage or atomic counter buffers in combination with state objects?
1083
1084    Textures and images are covered via NV/ARB_bindless_texture, you can store their handles inside uniform buffers.
1085    Shader storage and atomic counter buffers are currently not directly exposed, however NV_gpu_shader5 allows
1086    storing pointers to such buffers inside uniform buffers as well. Atomic counters can be replaced by regular
1087    atomic increments.
1088
    Alternatively, use DrawCommandsNV or DrawCommandsAddressNV, which do support any GLSL programs with these
    resource bindings, as well as default-block uniforms.
1091
1092
1093Revision History
1094
1095    Rev.    Date      Author    Changes
1096    ----  --------    --------  -----------------------------------------
1097     6    11/3/2015   ckubisch  Rephrase what stateobjects capture and what not
1098     5    8/17/2015   ckubisch  correct errors for DrawCommandsNV and DrawCommandsAddressNV
1099                                rendering to default framebuffer is not allowed. Clarify
1100                                which state is inherited (updated Issue 25).
1101     4    6/18/2015   ckubisch  Add missing interaction with ARB_shader_draw_parameters
1102     3    5/27/2015   jemmons   Multiple minor fixes and clarifications
1103     2    4/16/2015   pboudier  Fix incorrect type (size_t is now sizei) in ListDrawCommandsStatesClientNV
1104     1                pboudier  concept
1105                      jbolz     base spec
1106                      ckubisch  detailed spec
1107                      mjk       Internal revisions
1108