Name

    NV_mesh_shader

Name String

    GL_NV_mesh_shader

Contact

    Christoph Kubisch, NVIDIA (ckubisch 'at' nvidia.com)
    Pat Brown, NVIDIA (pbrown 'at' nvidia.com)

Contributors

    Yury Uralsky, NVIDIA
    Tyson Smith, NVIDIA
    Pyarelal Knowles, NVIDIA

Status

    Shipping

Version

    Last Modified Date:     September 5, 2019
    NVIDIA Revision:        5

Number

    OpenGL Extension #527
    OpenGL ES Extension #312

Dependencies

    This extension is written against the OpenGL 4.5 Specification
    (Compatibility Profile), dated June 29, 2017.

    OpenGL 4.5 or OpenGL ES 3.2 is required.

    This extension requires support for the OpenGL Shading Language (GLSL)
    extension "NV_mesh_shader", which can be found at the Khronos Group GitHub
    site here:

        https://github.com/KhronosGroup/GLSL

    This extension interacts with ARB_indirect_parameters and OpenGL 4.6.

    This extension interacts with NV_command_list.

    This extension interacts with ARB_draw_indirect and
    NV_vertex_buffer_unified_memory.

    This extension interacts with OVR_multiview.


Overview

    This extension provides a new mechanism allowing applications to use two
    new programmable shader types -- the task and mesh shader -- to generate
    collections of geometric primitives to be processed by fixed-function
    primitive assembly and rasterization logic. When task and mesh shaders
    are used for drawing, they replace the standard programmable vertex
    processing pipeline, including vertex array attribute fetching, vertex
    shader processing, tessellation, and geometry shader processing.

New Procedures and Functions

    void DrawMeshTasksNV(uint first, uint count);

    void DrawMeshTasksIndirectNV(intptr indirect);

    void MultiDrawMeshTasksIndirectNV(intptr indirect,
                                      sizei drawcount,
                                      sizei stride);

    void MultiDrawMeshTasksIndirectCountNV(intptr indirect,
                                           intptr drawcount,
                                           sizei maxdrawcount,
                                           sizei stride);

New Tokens

    Accepted by the <type> parameter of CreateShader and returned by the
    <params> parameter of GetShaderiv:

        MESH_SHADER_NV                                      0x9559
        TASK_SHADER_NV                                      0x955A

    Accepted by the <pname> parameter of GetIntegerv, GetBooleanv, GetFloatv,
    GetDoublev and GetInteger64v:

        MAX_MESH_UNIFORM_BLOCKS_NV                          0x8E60
        MAX_MESH_TEXTURE_IMAGE_UNITS_NV                     0x8E61
        MAX_MESH_IMAGE_UNIFORMS_NV                          0x8E62
        MAX_MESH_UNIFORM_COMPONENTS_NV                      0x8E63
        MAX_MESH_ATOMIC_COUNTER_BUFFERS_NV                  0x8E64
        MAX_MESH_ATOMIC_COUNTERS_NV                         0x8E65
        MAX_MESH_SHADER_STORAGE_BLOCKS_NV                   0x8E66
        MAX_COMBINED_MESH_UNIFORM_COMPONENTS_NV             0x8E67

        MAX_TASK_UNIFORM_BLOCKS_NV                          0x8E68
        MAX_TASK_TEXTURE_IMAGE_UNITS_NV                     0x8E69
        MAX_TASK_IMAGE_UNIFORMS_NV                          0x8E6A
        MAX_TASK_UNIFORM_COMPONENTS_NV                      0x8E6B
        MAX_TASK_ATOMIC_COUNTER_BUFFERS_NV                  0x8E6C
        MAX_TASK_ATOMIC_COUNTERS_NV                         0x8E6D
        MAX_TASK_SHADER_STORAGE_BLOCKS_NV                   0x8E6E
        MAX_COMBINED_TASK_UNIFORM_COMPONENTS_NV             0x8E6F

        MAX_MESH_WORK_GROUP_INVOCATIONS_NV                  0x95A2
        MAX_TASK_WORK_GROUP_INVOCATIONS_NV                  0x95A3

        MAX_MESH_TOTAL_MEMORY_SIZE_NV                       0x9536
        MAX_TASK_TOTAL_MEMORY_SIZE_NV                       0x9537

        MAX_MESH_OUTPUT_VERTICES_NV                         0x9538
        MAX_MESH_OUTPUT_PRIMITIVES_NV                       0x9539

        MAX_TASK_OUTPUT_COUNT_NV                            0x953A

        MAX_DRAW_MESH_TASKS_COUNT_NV                        0x953D

        MAX_MESH_VIEWS_NV                                   0x9557

        MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV               0x92DF
        MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV            0x9543


    Accepted by the <pname> parameter of GetIntegeri_v, GetBooleani_v,
    GetFloati_v, GetDoublei_v and GetInteger64i_v:

        MAX_MESH_WORK_GROUP_SIZE_NV                         0x953B
        MAX_TASK_WORK_GROUP_SIZE_NV                         0x953C


    Accepted by the <pname> parameter of GetProgramiv:

        MESH_WORK_GROUP_SIZE_NV                             0x953E
        TASK_WORK_GROUP_SIZE_NV                             0x953F

        MESH_VERTICES_OUT_NV                                0x9579
        MESH_PRIMITIVES_OUT_NV                              0x957A
        MESH_OUTPUT_TYPE_NV                                 0x957B

    Accepted by the <pname> parameter of GetActiveUniformBlockiv:

        UNIFORM_BLOCK_REFERENCED_BY_MESH_SHADER_NV          0x959C
        UNIFORM_BLOCK_REFERENCED_BY_TASK_SHADER_NV          0x959D

    Accepted by the <pname> parameter of GetActiveAtomicCounterBufferiv:

        ATOMIC_COUNTER_BUFFER_REFERENCED_BY_MESH_SHADER_NV  0x959E
        ATOMIC_COUNTER_BUFFER_REFERENCED_BY_TASK_SHADER_NV  0x959F

    Accepted in the <props> array of GetProgramResourceiv:

        REFERENCED_BY_MESH_SHADER_NV                        0x95A0
        REFERENCED_BY_TASK_SHADER_NV                        0x95A1

    Accepted by the <programInterface> parameter of GetProgramInterfaceiv,
    GetProgramResourceIndex, GetProgramResourceName, GetProgramResourceiv,
    GetProgramResourceLocation, and GetProgramResourceLocationIndex:

        MESH_SUBROUTINE_NV                                  0x957C
        TASK_SUBROUTINE_NV                                  0x957D

        MESH_SUBROUTINE_UNIFORM_NV                          0x957E
        TASK_SUBROUTINE_UNIFORM_NV                          0x957F

    Accepted by the <stages> parameter of UseProgramStages:

        MESH_SHADER_BIT_NV                                  0x00000040
        TASK_SHADER_BIT_NV                                  0x00000080

Modifications to the OpenGL 4.5 Specification (Compatibility Profile)

    Modify Chapter 3, Dataflow Model, p. 33

    (insert at the end of the section after Figure 3.1, p. 35)

    Figure 3.2 shows a block diagram of the alternate mesh processing pipeline
    of GL. This pipeline produces a set of output primitives similar to the
    primitives produced by the conventional GL vertex processing pipeline.

    Work on the mesh pipeline is initiated by the application drawing a
    set of mesh tasks via an API command.
    If an optional task shader is active, each task triggers the execution
    of a task shader work group that will generate a new set of tasks upon
    completion. Each of these spawned tasks, or each of the original drawn
    tasks if no task shader is present, triggers the execution of a mesh
    shader work group that produces an output mesh with a variable number
    of primitives assembled from vertices in the output mesh. The primitives
    from these output meshes are processed by the rasterization, fragment
    shader, per-fragment-operations, and framebuffer pipeline stages in the
    same manner as primitives produced from draw calls sent to the
    conventional vertex processing pipeline depicted in Figure 3.1.

        Conventional   From Application
           Vertex              |
          Pipeline             v
                        Draw Mesh Tasks <----- Draw Indirect Buffer
          (Fig 3.1)            |
              |            +---+-----+
              |            |         |
              |            |         |
              |            |    Task Shader ---+
              |            |         |         |
              |            |         v         |
              |            |  Task Generation  |     Image Load/Store
              |            |         |         |     Atomic Counter
              |            +---+-----+         |<--> Shader Storage
              |                |               |     Texture Fetch
              |                v               |     Uniform Block
              |          Mesh Shader ----------+
              |                |               |
              +------------->  +               |
                               |               |
                               v               |
                         Rasterization         |
                               |               |
                               v               |
                         Fragment Shader ------+
                               |
                               v
                    Per-Fragment Operations
                               |
                               v
                          Framebuffer

      Figure 3.2, GL Mesh Processing Pipeline


    Modify Chapter 7, Programs and Shaders, p. 84

    (Change the sentence starting with "Shader stages including vertex shaders")

    Shader stages including vertex shaders, tessellation control shaders,
    tessellation evaluation shaders, geometry shaders, mesh shaders, task
    shaders, fragment shaders, and compute shaders can be created, compiled, and
    linked into program objects.

    (replace the sentence starting with "A single program
    object can contain all of these shaders, or any subset thereof.")

    Mesh and task shaders affect the assembly of primitives from
    groups of shader invocations (see chapter X).
    A single program object cannot mix mesh and task shader stages
    with vertex, tessellation or geometry shader stages. Furthermore,
    a task shader stage cannot be combined with a fragment shader stage
    when the mesh shader stage is omitted. Other combinations as well
    as their subsets are possible.

    Modify Section 7.1, Shader Objects, p. 85

    (add the following entries to table 7.1)

      type             | Shader Stage
      =================|===============
      TASK_SHADER_NV   | Task shader
      MESH_SHADER_NV   | Mesh shader

    Modify Section 7.3, Program Objects, p. 89

    (add to the list of reasons why LinkProgram can fail, p. 92)

    * <program> contains objects to form either a mesh or task shader (see
      chapter X), and
      - the program also contains objects to form vertex, tessellation
        control, tessellation evaluation, or geometry shaders.

    * <program> contains objects to form a task shader (see chapter X), and
      - the program is not separable and contains no objects to form a mesh
        shader.
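
    The two LinkProgram failure conditions above can be restated as a small
    predicate over a set of stage bits. The following is only a sketch under
    stated assumptions: the helper names mesh_stage_mix_can_link and
    mesh_stage_rules_self_check are hypothetical and not part of this
    extension, the stage set is modelled with the UseProgramStages bit values
    purely for convenience, and no other reason for a link failure is
    modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Stage bits. The first five are the core OpenGL values; the last two are
 * the values this extension assigns to MESH_SHADER_BIT_NV and
 * TASK_SHADER_BIT_NV. */
enum {
    VERTEX_SHADER_BIT          = 0x00000001,
    FRAGMENT_SHADER_BIT        = 0x00000002,
    GEOMETRY_SHADER_BIT        = 0x00000004,
    TESS_CONTROL_SHADER_BIT    = 0x00000008,
    TESS_EVALUATION_SHADER_BIT = 0x00000010,
    MESH_SHADER_BIT_NV         = 0x00000040,
    TASK_SHADER_BIT_NV         = 0x00000080
};

/* Hypothetical helper: returns false when the set of attached stages would
 * trip one of the two link failures added by this extension. */
bool mesh_stage_mix_can_link(unsigned stages, bool separable)
{
    const unsigned conventional = VERTEX_SHADER_BIT | TESS_CONTROL_SHADER_BIT |
                                  TESS_EVALUATION_SHADER_BIT |
                                  GEOMETRY_SHADER_BIT;
    const unsigned mesh_pipeline = MESH_SHADER_BIT_NV | TASK_SHADER_BIT_NV;

    /* Rule 1: mesh or task stages cannot be mixed with vertex,
     * tessellation, or geometry stages. */
    if ((stages & mesh_pipeline) != 0 && (stages & conventional) != 0)
        return false;

    /* Rule 2: a task shader without a mesh shader links only when the
     * program is separable. */
    if ((stages & TASK_SHADER_BIT_NV) != 0 &&
        (stages & MESH_SHADER_BIT_NV) == 0 && !separable)
        return false;

    return true;
}

/* Usage example: a few stage sets and the expected outcome. */
void mesh_stage_rules_self_check(void)
{
    assert(mesh_stage_mix_can_link(
        TASK_SHADER_BIT_NV | MESH_SHADER_BIT_NV | FRAGMENT_SHADER_BIT, false));
    assert(!mesh_stage_mix_can_link(
        VERTEX_SHADER_BIT | MESH_SHADER_BIT_NV, false));
    assert(!mesh_stage_mix_can_link(TASK_SHADER_BIT_NV, false));
    assert(mesh_stage_mix_can_link(TASK_SHADER_BIT_NV, true));
}
```

    A real implementation reports these conditions through the link status
    and info log of the program object rather than through a predicate like
    this one.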

    Modify Section 7.3.1 Program Interfaces, p. 96

    (add to the list starting with VERTEX_SUBROUTINE, after GEOMETRY_SUBROUTINE)

    TASK_SUBROUTINE_NV, MESH_SUBROUTINE_NV,

    (add to the list starting with VERTEX_SUBROUTINE_UNIFORM, after
    GEOMETRY_SUBROUTINE_UNIFORM)

    TASK_SUBROUTINE_UNIFORM_NV, MESH_SUBROUTINE_UNIFORM_NV,

    (add to the list of errors for GetProgramInterfaceiv, p. 102,
    after GEOMETRY_SUBROUTINE_UNIFORM)

    TASK_SUBROUTINE_UNIFORM_NV, MESH_SUBROUTINE_UNIFORM_NV,

    (modify entries for table 7.2 for GetProgramResourceiv, p. 105)

      Property                          | Supported Interfaces
      ==================================|=================================
      ARRAY_SIZE                        | ..., TASK_SUBROUTINE_UNIFORM_NV,
                                        |      MESH_SUBROUTINE_UNIFORM_NV
      ----------------------------------|-----------------------------
      NUM_COMPATIBLE_SUBROUTINES,       | ..., TASK_SUBROUTINE_UNIFORM_NV,
      COMPATIBLE_SUBROUTINES            |      MESH_SUBROUTINE_UNIFORM_NV
      ----------------------------------|-----------------------------
      LOCATION                          |
      ----------------------------------|-----------------------------
      REFERENCED_BY_VERTEX_SHADER, ...  | ATOMIC_COUNTER_BUFFER, ...
      REFERENCED_BY_TASK_SHADER_NV,     |
      REFERENCED_BY_MESH_SHADER_NV      |
      ----------------------------------|-----------------------------

    (add to list of the sentence starting with "For the properties
    REFERENCED_BY_VERTEX_SHADER", after REFERENCED_BY_GEOMETRY_SHADER, p. 108)

    REFERENCED_BY_TASK_SHADER_NV, REFERENCED_BY_MESH_SHADER_NV

    (for the description of GetProgramResourceLocation and
    GetProgramResourceLocationIndex, add to the list of the sentence
    starting with "For GetProgramResourceLocation, programInterface must
    be one of UNIFORM,", after GEOMETRY_SUBROUTINE_UNIFORM, p. 114)

    TASK_SUBROUTINE_UNIFORM_NV, MESH_SUBROUTINE_UNIFORM_NV,

    Modify Section 7.4, Program Pipeline Objects, p. 115

    (modify the first paragraph, p. 118, to add new shader stage bits for mesh
    and task shaders)

    The bits set in <stages> indicate the program stages for which the program
    object named by <program> becomes current. These stages may include
    compute, vertex, tessellation control, tessellation evaluation, geometry,
    fragment, mesh, and task shaders, indicated respectively by
    COMPUTE_SHADER_BIT, VERTEX_SHADER_BIT, TESS_CONTROL_SHADER_BIT,
    TESS_EVALUATION_SHADER_BIT, GEOMETRY_SHADER_BIT, FRAGMENT_SHADER_BIT,
    MESH_SHADER_BIT_NV, and TASK_SHADER_BIT_NV. The constant
    ALL_SHADER_BITS indicates <program> is to be made current for all shader
    stages.

    (modify the first error in "Errors" for UseProgramStages, p. 118 to allow
    the use of mesh and task shader bits)

    An INVALID_VALUE error is generated if <stages> is not the special value
    ALL_SHADER_BITS, and has any bits set other than VERTEX_SHADER_BIT,
    COMPUTE_SHADER_BIT, TESS_CONTROL_SHADER_BIT, TESS_EVALUATION_SHADER_BIT,
    GEOMETRY_SHADER_BIT, FRAGMENT_SHADER_BIT, MESH_SHADER_BIT_NV, and
    TASK_SHADER_BIT_NV.


    Modify Section 7.6, Uniform Variables, p. 125

    (add entries to table 7.4, p. 126)

      Shader Stage         | pname for querying default uniform
                           | block storage, in components
      =====================|=====================================
      Task (see chapter X) | MAX_TASK_UNIFORM_COMPONENTS_NV
      Mesh (see chapter X) | MAX_MESH_UNIFORM_COMPONENTS_NV

    (add entries to table 7.5, p. 127)

      Shader Stage         | pname for querying combined uniform
                           | block storage, in components
      =====================|========================================
      Task (see chapter X) | MAX_COMBINED_TASK_UNIFORM_COMPONENTS_NV
      Mesh (see chapter X) | MAX_COMBINED_MESH_UNIFORM_COMPONENTS_NV

    (add entries to table 7.7, p. 131)

      pname                                      | prop
      ===========================================|=============================
      UNIFORM_BLOCK_REFERENCED_BY_TASK_SHADER_NV | REFERENCED_BY_TASK_SHADER_NV
      UNIFORM_BLOCK_REFERENCED_BY_MESH_SHADER_NV | REFERENCED_BY_MESH_SHADER_NV

    (add entries to table 7.8, p. 132)

      pname                                      | prop
      ===========================================|=============================
      ATOMIC_COUNTER_BUFFER_REFERENCED_-         | REFERENCED_BY_TASK_SHADER_NV
        BY_TASK_SHADER_NV                        |
      -------------------------------------------|-----------------------------
      ATOMIC_COUNTER_BUFFER_REFERENCED_-         | REFERENCED_BY_MESH_SHADER_NV
        BY_MESH_SHADER_NV                        |

    (modify the sentence starting with "The limits for vertex" in 7.6.2
    Uniform Blocks, p. 136)

    ... geometry, task, mesh, fragment...
    MAX_GEOMETRY_UNIFORM_BLOCKS, MAX_TASK_UNIFORM_BLOCKS_NV,
    MAX_MESH_UNIFORM_BLOCKS_NV, MAX_FRAGMENT_UNIFORM_BLOCKS...

    (modify the sentence starting with "The limits for vertex", in
    7.7 Atomic Counter Buffers, p. 141)

    ... geometry, task, mesh, fragment...
    MAX_GEOMETRY_ATOMIC_COUNTER_BUFFERS, MAX_TASK_ATOMIC_COUNTER_BUFFERS_NV,
    MAX_MESH_ATOMIC_COUNTER_BUFFERS_NV, MAX_FRAGMENT_ATOMIC_COUNTER_BUFFERS, ...


    Modify Section 7.8 Shader Buffer Variables and Shader Storage Blocks, p. 142

    (modify the sentences starting with "The limits for vertex", p. 143)

    ... geometry, task, mesh, fragment...
    MAX_GEOMETRY_SHADER_STORAGE_BLOCKS, MAX_TASK_SHADER_STORAGE_BLOCKS_NV,
    MAX_MESH_SHADER_STORAGE_BLOCKS_NV, MAX_FRAGMENT_SHADER_STORAGE_BLOCKS,...

    Modify Section 7.9 Subroutine Uniform Variables, p. 144

    (modify table 7.9, p. 145)

      Interface           | Shader Type
      ====================|===============
      TASK_SUBROUTINE_NV  | TASK_SHADER_NV
      MESH_SUBROUTINE_NV  | MESH_SHADER_NV

    (modify table 7.10, p. 146)

      Interface                   | Shader Type
      ============================|===============
      TASK_SUBROUTINE_UNIFORM_NV  | TASK_SHADER_NV
      MESH_SUBROUTINE_UNIFORM_NV  | MESH_SHADER_NV


    Modify Section 7.13 Shader, Program, and Program Pipeline Queries, p. 157

    (add to the list of queries for GetProgramiv, p. 157)

    If <pname> is TASK_WORK_GROUP_SIZE_NV, an array of three integers
    containing the local work group size of the task shader
    (see chapter X), as specified by its input layout qualifier(s), is returned.
    If <pname> is MESH_WORK_GROUP_SIZE_NV, an array of three integers
    containing the local work group size of the mesh shader
    (see chapter X), as specified by its input layout qualifier(s), is returned.
    If <pname> is MESH_VERTICES_OUT_NV, the maximum number of vertices the
    mesh shader (see chapter X) will output is returned.
    If <pname> is MESH_PRIMITIVES_OUT_NV, the maximum number of primitives
    the mesh shader (see chapter X) will output is returned.
    If <pname> is MESH_OUTPUT_TYPE_NV, the mesh shader output type,
    which must be one of POINTS, LINES or TRIANGLES, is returned.

    (add to the list of errors for GetProgramiv, p. 159)

    An INVALID_OPERATION error is generated if TASK_WORK_GROUP_SIZE_NV
    is queried for a program which has not been linked successfully,
    or which does not contain objects to form a task shader.
    An INVALID_OPERATION error is generated if MESH_VERTICES_OUT_NV,
    MESH_PRIMITIVES_OUT_NV, MESH_OUTPUT_TYPE_NV, or MESH_WORK_GROUP_SIZE_NV
    is queried for a program which has not been linked
    successfully, or which does not contain objects to form a mesh shader.


    Add new language extending the edits to Section 9.2.8 (Attaching Textures
    to a Framebuffer) from the OVR_multiview extension that describe how
    various drawing commands are processed when multiview rendering is
    enabled:

    When multiview rendering is enabled, the DrawMeshTasks* commands (section
    X.6) will not spawn separate task and mesh shader invocations for each
    view. Instead, the primitives produced by each mesh shader local work
    group will be processed separately for each view. For per-vertex and
    per-primitive mesh shader outputs not qualified with "perviewNV", the
    single value written for each vertex or primitive will be used for the
    output when processing each view. For mesh shader outputs qualified with
    "perviewNV", the output is arrayed and the mesh shader is responsible for
    writing separate values for each view. When processing output primitives
    for a view numbered <V>, outputs qualified with "perviewNV" will assume
    the values for array element <V>.


    Modify Section 10.3.11 Indirect Commands in Buffer Objects, p. 400

    (after "and to DispatchComputeIndirect (see section 19)" add)

    and to DrawMeshTasksIndirectNV, MultiDrawMeshTasksIndirectNV,
    MultiDrawMeshTasksIndirectCountNV (see chapter X)

    (add the following entries to table 10.7)

      Indirect Command Name               | Indirect Buffer target
      ====================================|========================
      DrawMeshTasksIndirectNV             | DRAW_INDIRECT_BUFFER
      MultiDrawMeshTasksIndirectNV        | DRAW_INDIRECT_BUFFER
      MultiDrawMeshTasksIndirectCountNV   | DRAW_INDIRECT_BUFFER


    Modify Section 11.1.3 Shader Execution, p. 437

    (add after the first paragraph in section 11.1.3, p. 437)

    If there is an active program object present for the task or
    mesh shader stages, the executable code for these
    active programs is used to process incoming work groups (see
    chapter X).

    (add to the list of constants, 11.1.3.5 Texture Access, p. 441)

    * MAX_TASK_TEXTURE_IMAGE_UNITS_NV (for task shaders)

    * MAX_MESH_TEXTURE_IMAGE_UNITS_NV (for mesh shaders)

    (add to the list of constants, 11.1.3.6 Atomic Counter Access, p. 443)

    * MAX_TASK_ATOMIC_COUNTERS_NV (for task shaders)

    * MAX_MESH_ATOMIC_COUNTERS_NV (for mesh shaders)

    (add to the list of constants, 11.1.3.7 Image Access, p. 444)

    * MAX_TASK_IMAGE_UNIFORMS_NV (for task shaders)

    * MAX_MESH_IMAGE_UNIFORMS_NV (for mesh shaders)

    (add to the list of constants, 11.1.3.8 Shader Storage Buffer Access,
    p. 444)

    * MAX_TASK_SHADER_STORAGE_BLOCKS_NV (for task shaders)

    * MAX_MESH_SHADER_STORAGE_BLOCKS_NV (for mesh shaders)

    (modify the sentence of 11.3.10 Shader Outputs, p. 445)

    A vertex and mesh shader can write to ...



    Insert a new chapter X before Chapter 13, Fixed-Function Vertex
    Post-Processing, p. 505

    Chapter X, Programmable Mesh Processing

    In addition to the programmable vertex processing pipeline described in
    Chapters 10 and 11 [[compatibility profile only: and the fixed-function
    vertex processing pipeline in Chapter 12]], applications may use the mesh
    pipeline to generate primitives for rasterization. The mesh pipeline
    generates a collection of meshes using the programmable task and mesh
    shaders. Task and mesh shaders are created as described in section 7.1
    using a type parameter of TASK_SHADER_NV and MESH_SHADER_NV, respectively.
    They are attached to and used in program objects as described in section
    7.3.

    Mesh and task shader workloads are formed from groups of work items called
    work groups and processed by the executable code for a mesh or task shader
    program. A work group is a collection of shader invocations that execute
    the same code, potentially in parallel.
    An invocation within a work group
    may share data with other members of the same work group through shared
    variables (see section 4.3.8, "Shared Variables", of the OpenGL Shading
    Language Specification) and issue memory and control barriers to
    synchronize with other members of the same work group.

    X.1 Task Shader Variables

    Task shaders can access uniform variables belonging to the current
    program object. Limits on uniform storage and methods for manipulating
    uniforms are described in section 7.6.

    There is a limit to the total amount of memory consumed by output
    variables in a single task shader work group. This limit, expressed in
    basic machine units, may be queried by calling GetIntegerv with the value
    MAX_TASK_TOTAL_MEMORY_SIZE_NV.

    X.2 Task Shader Outputs

    Each task shader work group can define how many mesh work groups
    should be generated by writing to gl_TaskCountNV. The maximum
    number can be queried by GetIntegerv using MAX_TASK_OUTPUT_COUNT_NV.

    Furthermore, the task work group can output data (qualified with "taskNV")
    that can be accessed by the generated mesh work groups.

    X.3 Mesh Shader Variables

    Mesh shaders can access uniform variables belonging to the current
    program object. Limits on uniform storage and methods for manipulating
    uniforms are described in section 7.6.

    There is a limit to the total size of all variables declared as shared
    as well as output attributes in a single mesh stage. This limit, expressed
    in basic machine units, may be queried as the value of
    MAX_MESH_TOTAL_MEMORY_SIZE_NV.

    X.4 Mesh Shader Inputs

    When each mesh shader work group runs, its invocations have access to
    built-in variables describing the work group and invocation and also the
    task shader outputs (qualified with "taskNV") written by the task shader
    that generated the work group.
    When no task shader is active, the mesh shader
    has no access to task shader outputs.

    X.5 Mesh Shader Outputs

    When each mesh shader work group completes, it emits an output mesh
    consisting of

    * a primitive count, written to the built-in output gl_PrimitiveCountNV;

    * a collection of vertex attributes, where each vertex in the mesh has a
      set of built-in and user-defined per-vertex output variables and blocks;

    * a collection of per-primitive attributes, where each of the
      gl_PrimitiveCountNV primitives in the mesh has a set of built-in and
      user-defined per-primitive output variables and blocks; and

    * an array of vertex index values written to the built-in output array
      gl_PrimitiveIndicesNV, where each output primitive has a set of one,
      two, or three indices that identify the output vertices in the mesh used
      to form the primitive.

    This data is used to generate primitives of one of three types. The
    supported output primitive types are points (POINTS), lines (LINES), and
    triangles (TRIANGLES). The vertices output by the mesh shader are assembled
    into points, lines, or triangles based on the output primitive type in the
    DrawElements manner described in section 10.4, with the
    gl_PrimitiveIndicesNV array content serving as index values, and the
    local vertex attribute arrays as vertex arrays.

    The output arrays are sized according to the values provided at compile
    time ("max_vertices" and "max_primitives"), which must be below their
    respective maxima. These maxima can be queried by calling GetIntegerv
    with MAX_MESH_OUTPUT_VERTICES_NV and MAX_MESH_OUTPUT_PRIMITIVES_NV.

    The output attributes are allocated at an implementation-dependent
    granularity that can be queried via MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV
    and MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV.
    The total amount of memory
    consumed for per-vertex and per-primitive output variables must not exceed
    an implementation-dependent total memory limit that can be queried by
    calling GetIntegerv with the enum MAX_MESH_TOTAL_MEMORY_SIZE_NV. The
    memory consumed by the gl_PrimitiveIndicesNV[] array does not count
    against this limit.

    X.6 Mesh Tasks Drawing Commands

    One or more work groups are launched by calling

      void DrawMeshTasksNV(uint first, uint count);

    If there is an active program object for the task shader stage,
    <count> work groups are processed by the active program for the task
    shader stage. If there is no active program object for the task shader
    stage, <count> work groups are instead processed by the active
    program for the mesh shader stage. The active program for both shader
    stages will be determined in the same manner as the active program for other
    pipeline stages, as described in section 7.3. While the individual shader
    invocations within a work group are executed as a unit, work groups are
    executed completely independently and in unspecified order.
    The x component of gl_WorkGroupID of the first active stage will be within
    the range [<first>, <first> + <count> - 1]. The y and z components of
    gl_WorkGroupID within all stages will be set to zero.

    The maximum number of task or mesh shader work groups that
    may be dispatched at one time may be determined by calling GetIntegerv
    with <pname> set to MAX_DRAW_MESH_TASKS_COUNT_NV.

    The local work size in each dimension is specified at compile time using
    an input layout qualifier in one or more of the task or mesh shaders
    attached to the program; see the OpenGL Shading Language Specification for
    more information.
    After the program has been linked, the local work group
    size of the task or mesh shader may be queried by calling GetProgramiv
    with <pname> set to TASK_WORK_GROUP_SIZE_NV or MESH_WORK_GROUP_SIZE_NV, as
    described in section 7.13.

    The maximum size of a task or mesh shader local work group may be
    determined by calling GetIntegeri_v with <target> set to
    MAX_TASK_WORK_GROUP_SIZE_NV or MAX_MESH_WORK_GROUP_SIZE_NV, and <index>
    set to 0, 1, or 2 to retrieve the maximum work size in the X, Y and Z
    dimension, respectively. Furthermore, the maximum number of invocations
    in a single local work group (i.e., the product of the three dimensions)
    may be determined by calling GetIntegerv with <pname> set to
    MAX_TASK_WORK_GROUP_INVOCATIONS_NV or MAX_MESH_WORK_GROUP_INVOCATIONS_NV.

    Errors

      An INVALID_OPERATION error is generated if there is no active
      program for the mesh shader stage.

      An INVALID_VALUE error is generated if <count> exceeds
      MAX_DRAW_MESH_TASKS_COUNT_NV.


    If there is an active program on the task shader stage, each task shader
    work group writes a task count to the built-in task shader output
    gl_TaskCountNV. If this count is non-zero upon completion of the task
    shader, then gl_TaskCountNV work groups are generated and processed by the
    active program for the mesh shader stage. If this count is zero, no work
    groups are generated. If the count is greater than MAX_TASK_OUTPUT_COUNT_NV,
    the number of mesh shader work groups generated is undefined.
    The built-in variables available to the generated mesh shader work groups
    are identical to those that would be generated if DrawMeshTasksNV were
    called with no task shader active and with a <count> of gl_TaskCountNV.
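
    Because DrawMeshTasksNV rejects a <count> above
    MAX_DRAW_MESH_TASKS_COUNT_NV, an application with more work groups than
    that has to issue several draws. The following is a minimal sketch of one
    way to do so, not a prescribed method. The names draw_mesh_tasks_chunked,
    draw_fn, draw_log and record_draw are hypothetical; the callback stands
    in for DrawMeshTasksNV, and <max_count> is assumed to be the value the
    application obtained from GetIntegerv. Advancing <first> by the size of
    each chunk keeps the x component of gl_WorkGroupID within the same
    overall range as a single large draw would have produced.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback standing in for DrawMeshTasksNV(first, count). */
typedef void (*draw_fn)(uint32_t first, uint32_t count, void *user);

/* Splits [first, first + count - 1] into draws of at most max_count work
 * groups each. Returns the number of draws issued. */
uint32_t draw_mesh_tasks_chunked(uint32_t first, uint32_t count,
                                 uint32_t max_count, draw_fn draw, void *user)
{
    uint32_t draws = 0;

    assert(max_count > 0);
    while (count > 0) {
        uint32_t n = count < max_count ? count : max_count;
        draw(first, n, user);
        first += n;   /* the next chunk continues the work group ID range */
        count -= n;
        ++draws;
    }
    return draws;
}

/* Recorder used only to illustrate the helper without a GL context. */
struct draw_log {
    uint32_t calls;     /* number of draws seen */
    uint32_t total;     /* sum of all counts */
    uint32_t largest;   /* largest single count */
    uint32_t last_id;   /* highest work group ID covered by the last draw */
};

void record_draw(uint32_t first, uint32_t count, void *user)
{
    struct draw_log *log = (struct draw_log *)user;

    log->calls += 1;
    log->total += count;
    if (count > log->largest)
        log->largest = count;
    log->last_id = first + count - 1;
}
```

    In an application the callback would simply forward to DrawMeshTasksNV.
    Note that work groups remain independent and unordered within each draw,
    exactly as for a single call.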

    The primitives of the mesh are then processed by the pipeline stages
    described in subsequent chapters in the same manner as primitives produced
    by the conventional vertex processing pipeline described in previous
    chapters.

    The command

      void DrawMeshTasksIndirectNV(intptr indirect);

      typedef struct {
        uint count;
        uint first;
      } DrawMeshTasksIndirectCommandNV;

    is equivalent to calling DrawMeshTasksNV with the parameters sourced from
    a DrawMeshTasksIndirectCommandNV struct stored in the buffer currently
    bound to the DRAW_INDIRECT_BUFFER binding at an offset, in basic machine
    units, specified by <indirect>. If the <count> read from the indirect
    draw buffer is greater than MAX_DRAW_MESH_TASKS_COUNT_NV, then the results
    of this command are undefined.

    Errors

      An INVALID_OPERATION error is generated if there is no active program
      for the mesh shader stage.

      An INVALID_VALUE error is generated if <indirect> is negative or is
      not a multiple of the size, in basic machine units, of uint.

      An INVALID_OPERATION error is generated if the command would source
      data beyond the end of the buffer object.

      An INVALID_OPERATION error is generated if zero is bound to the
      DRAW_INDIRECT_BUFFER binding.

    The command

      void MultiDrawMeshTasksIndirectNV(intptr indirect,
                                        sizei drawcount,
                                        sizei stride);

    behaves identically to DrawMeshTasksIndirectNV, except that <indirect> is
    treated as an array of <drawcount> DrawMeshTasksIndirectCommandNV
    structures. <indirect> contains the offset of the first element of the
    array within the buffer currently bound to the DRAW_INDIRECT_BUFFER
    binding. <stride> specifies the distance, in basic machine units, between
    the elements of the array. If <stride> is zero, the array elements are
    treated as tightly packed.
    <stride> must be a multiple of four, otherwise
    an INVALID_VALUE error is generated.

    <drawcount> must be positive, otherwise an INVALID_VALUE error will be
    generated.

    Errors

      In addition to errors that would be generated by
      DrawMeshTasksIndirectNV:

      An INVALID_VALUE error is generated if <stride> is neither zero nor a
      multiple of four.

      An INVALID_VALUE error is generated if <stride> is non-zero and less
      than the size of DrawMeshTasksIndirectCommandNV.

      An INVALID_VALUE error is generated if <drawcount> is not positive.

    The command

      void MultiDrawMeshTasksIndirectCountNV(intptr indirect,
                                             intptr drawcount,
                                             sizei maxdrawcount,
                                             sizei stride);

    behaves similarly to MultiDrawMeshTasksIndirectNV, except that <drawcount>
    defines an offset (in bytes) into the buffer object bound to the
    PARAMETER_BUFFER_ARB binding point at which a single <sizei> typed value
    is stored, which contains the draw count. <maxdrawcount> specifies the
    maximum number of draws that are expected to be stored in the buffer.
    If the value stored at <drawcount> into the buffer is greater than
    <maxdrawcount>, the implementation stops processing draws after
    <maxdrawcount> parameter sets.

    Errors

      In addition to errors that would be generated by
      MultiDrawMeshTasksIndirectNV:

      An INVALID_OPERATION error is generated if no buffer is bound to the
      PARAMETER_BUFFER_ARB binding point.

      An INVALID_VALUE error is generated if <drawcount> (the offset of the
      memory holding the actual draw count) is not a multiple of four.

      An INVALID_OPERATION error is generated if reading a sizei typed value
      from the buffer bound to the PARAMETER_BUFFER_ARB target at the offset
      specified by <drawcount> would result in an out-of-bounds access.
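
    The argument rules for MultiDrawMeshTasksIndirectNV can be collected into
    one client-side check. The following is only a sketch under stated
    assumptions, not how an implementation validates the call: the names
    args_status and check_multi_draw_args are hypothetical, uint is assumed
    to be 32 bits wide, the order in which the conditions are tested is an
    assumption, and the errors for a missing buffer binding or a missing mesh
    shader program are not modelled. The struct mirrors the layout given
    above for DrawMeshTasksIndirectCommandNV.

```c
#include <assert.h>
#include <stdint.h>

/* Layout of one indirect command as given in the text above. */
typedef struct {
    uint32_t count;
    uint32_t first;
} DrawMeshTasksIndirectCommandNV;

/* Hypothetical result codes standing in for the GL error categories. */
typedef enum {
    ARGS_OK,
    ARGS_INVALID_VALUE,
    ARGS_INVALID_OPERATION
} args_status;

/* Hypothetical helper: checks offset, draw count, and stride against the
 * documented rules, then checks that the last command lies inside a buffer
 * of buffer_size basic machine units. */
args_status check_multi_draw_args(int64_t indirect, int32_t drawcount,
                                  int32_t stride, uint64_t buffer_size)
{
    const int64_t cmd_size = (int64_t)sizeof(DrawMeshTasksIndirectCommandNV);

    /* Two tightly packed 32-bit values. */
    assert(cmd_size == 8);

    /* <indirect> must be non-negative and a multiple of sizeof(uint). */
    if (indirect < 0 || indirect % (int64_t)sizeof(uint32_t) != 0)
        return ARGS_INVALID_VALUE;

    /* <stride> must be zero, or a multiple of four no smaller than one
     * command. */
    if (stride != 0 && (stride % 4 != 0 || stride < cmd_size))
        return ARGS_INVALID_VALUE;

    /* <drawcount> must be positive. */
    if (drawcount <= 0)
        return ARGS_INVALID_VALUE;

    /* A zero stride means the commands are tightly packed. */
    const int64_t step = stride != 0 ? stride : cmd_size;
    const int64_t end = indirect + (int64_t)(drawcount - 1) * step + cmd_size;

    /* Sourcing data beyond the end of the buffer is an error. */
    if ((uint64_t)end > buffer_size)
        return ARGS_INVALID_OPERATION;

    return ARGS_OK;
}
```

    For example, four tightly packed commands starting at offset 0 need a
    buffer of at least 32 basic machine units, while three commands at a
    stride of 16 starting at offset 8 end at offset 48.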


New Implementation Dependent State

    Add to Table 23.43, "Program Object State"

    +----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
    | Get Value                                          | Type      | Get Command             | Initial Value | Description                                            | Sec.    |
    +----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
    | TASK_WORK_GROUP_SIZE_NV                            | 3 x Z+    | GetProgramiv            | { 0, ... }    | Local work size of a linked task stage                 | 7.13    |
    | MESH_WORK_GROUP_SIZE_NV                            | 3 x Z+    | GetProgramiv            | { 0, ... }    | Local work size of a linked mesh stage                 | 7.13    |
    | MESH_VERTICES_OUT_NV                               | Z+        | GetProgramiv            | 0             | max_vertices size of a linked mesh stage               | 7.13    |
    | MESH_PRIMITIVES_OUT_NV                             | Z+        | GetProgramiv            | 0             | max_primitives size of a linked mesh stage             | 7.13    |
    | MESH_OUTPUT_TYPE_NV                                | Z+        | GetProgramiv            | POINTS        | Primitive output type of a linked mesh stage           | 7.13    |
    | UNIFORM_BLOCK_REFERENCED_BY_TASK_SHADER_NV         | B         | GetActiveUniformBlockiv | FALSE         | True if uniform block is referenced by the task stage  | 7.6.2   |
    | UNIFORM_BLOCK_REFERENCED_BY_MESH_SHADER_NV         | B         | GetActiveUniformBlockiv | FALSE         | True if uniform block is referenced by the mesh stage  | 7.6.2   |
    | ATOMIC_COUNTER_BUFFER_REFERENCED_BY_TASK_SHADER_NV | B         | GetActiveAtomicCounter- | FALSE         | AACB has a counter used by task shaders                | 7.7     |
    |                                                    |           | Bufferiv                |               |                                                        |         |
    | ATOMIC_COUNTER_BUFFER_REFERENCED_BY_MESH_SHADER_NV | B         | GetActiveAtomicCounter- | FALSE         | AACB has a counter used by mesh shaders                | 7.7     |
    |                                                    |           | Bufferiv                |               |                                                        |         |
    +----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+

    Add to Table 23.53, "Program Object Resource State"

    +----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
    | Get Value                                          | Type      | Get Command             | Initial Value | Description                                            | Sec.    |
    +----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+
    | REFERENCED_BY_TASK_SHADER_NV                       | Z+        | GetProgramResourceiv    | -             | Active resource used by task shader                    | 7.3.1   |
    | REFERENCED_BY_MESH_SHADER_NV                       | Z+        | GetProgramResourceiv    | -             | Active resource used by mesh shader                    | 7.3.1   |
    +----------------------------------------------------+-----------+-------------------------+---------------+--------------------------------------------------------+---------+

    Add to Table 23.67, "Implementation Dependent Values"

    +------------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+--------+
    | Get Value                                | Type      | Get Command   | Minimum Value       | Description                                                           | Sec.   |
    +------------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+--------+
    | MAX_DRAW_MESH_TASKS_COUNT_NV             | Z+        | GetIntegerv   | 2^16 - 1            | Maximum number of work groups that may be drawn by a single           | X.6    |
    |                                          |           |               |                     | draw mesh tasks command                                               |        |
    | MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV    | Z+        | GetIntegerv   | -                   | Per-vertex output allocation granularity for mesh shaders             | X.3    |
    | MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV | Z+        | GetIntegerv   | -                   | Per-primitive output allocation granularity for mesh shaders          | X.3    |
    +------------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+--------+

    Insert Table 23.75, "Implementation Dependent Task Shader Limits"

    +-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
    | Get Value                               | Type      | Get Command   | Minimum Value       | Description                                                           | Sec.     |
    +-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
    | MAX_TASK_WORK_GROUP_SIZE_NV             | 3 x Z+    | GetIntegeri_v | 32 (x), 1 (y,z)     | Maximum local size of a task work group (per dimension)               | X.6      |
    | MAX_TASK_WORK_GROUP_INVOCATIONS_NV      | Z+        | GetIntegerv   | 32                  | Maximum total task shader invocations in a single local work group    | X.6      |
    | MAX_TASK_UNIFORM_BLOCKS_NV              | Z+        | GetIntegerv   | 12                  | Maximum number of uniform blocks per task program                     | 7.6.2    |
    | MAX_TASK_TEXTURE_IMAGE_UNITS_NV         | Z+        | GetIntegerv   | 16                  | Maximum number of texture image units accessible by a task program    | 11.1.3.5 |
    | MAX_TASK_ATOMIC_COUNTER_BUFFERS_NV      | Z+        | GetIntegerv   | 8                   | Number of atomic counter buffers accessed by a task program           | 7.7      |
    | MAX_TASK_ATOMIC_COUNTERS_NV             | Z+        | GetIntegerv   | 8                   | Number of atomic counters accessed by a task program                  | 11.1.3.6 |
    | MAX_TASK_IMAGE_UNIFORMS_NV              | Z+        | GetIntegerv   | 8                   | Number of image variables in task program                             | 11.1.3.7 |
    | MAX_TASK_SHADER_STORAGE_BLOCKS_NV       | Z+        | GetIntegerv   | 12                  | Maximum number of storage buffer blocks per task program              | 7.8      |
    | MAX_TASK_UNIFORM_COMPONENTS_NV          | Z+        | GetIntegerv   | 512                 | Number of components for task shader uniform variables                | 7.6      |
    | MAX_COMBINED_TASK_UNIFORM_COMPONENTS_NV | Z+        | GetIntegerv   | *                   | Number of words for task shader uniform variables in all uniform      | 7.6      |
    |                                         |           |               |                     | blocks, including the default                                         |          |
    | MAX_TASK_TOTAL_MEMORY_SIZE_NV           | Z+        | GetIntegerv   | 16384               | Maximum total storage size of all variables declared as <shared> and  | X.1      |
    |                                         |           |               |                     | <out> in all task shaders linked into a single program object         |          |
    | MAX_TASK_OUTPUT_COUNT_NV                | Z+        | GetIntegerv   | 65535               | Maximum number of child mesh work groups a single task shader         | X.2      |
    |                                         |           |               |                     | work group can emit                                                   |          |
    +-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+

    Insert Table 23.76, "Implementation Dependent Mesh Shader Limits",
    renumber subsequent tables.

    +-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
    | Get Value                               | Type      | Get Command   | Minimum Value       | Description                                                           | Sec.     |
    +-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+
    | MAX_MESH_WORK_GROUP_SIZE_NV             | 3 x Z+    | GetIntegeri_v | 32 (x), 1 (y,z)     | Maximum local size of a mesh work group (per dimension)               | X.6      |
    | MAX_MESH_WORK_GROUP_INVOCATIONS_NV      | Z+        | GetIntegerv   | 32                  | Maximum total mesh shader invocations in a single local work group    | X.6      |
    | MAX_MESH_UNIFORM_BLOCKS_NV              | Z+        | GetIntegerv   | 12                  | Maximum number of uniform blocks per mesh program                     | 7.6.2    |
    | MAX_MESH_TEXTURE_IMAGE_UNITS_NV         | Z+        | GetIntegerv   | 16                  | Maximum number of texture image units accessible by a mesh shader     | 11.1.3.5 |
    | MAX_MESH_ATOMIC_COUNTER_BUFFERS_NV      | Z+        | GetIntegerv   | 8                   | Number of atomic counter buffers accessed by a mesh shader            | 7.7      |
    | MAX_MESH_ATOMIC_COUNTERS_NV             | Z+        | GetIntegerv   | 8                   | Number of atomic counters accessed by a mesh shader                   | 11.1.3.6 |
    | MAX_MESH_IMAGE_UNIFORMS_NV              | Z+        | GetIntegerv   | 8                   | Number of image variables in mesh shaders                             | 11.1.3.7 |
    | MAX_MESH_SHADER_STORAGE_BLOCKS_NV       | Z+        | GetIntegerv   | 12                  | Maximum number of storage buffer blocks per mesh program              | 7.8      |
    | MAX_MESH_UNIFORM_COMPONENTS_NV          | Z+        | GetIntegerv   | 512                 | Number of components for mesh shader uniform variables                | 7.6      |
    | MAX_COMBINED_MESH_UNIFORM_COMPONENTS_NV | Z+        | GetIntegerv   | *                   | Number of words for mesh shader uniform variables in all uniform      | 7.6      |
    |                                         |           |               |                     | blocks, including the default                                         |          |
    | MAX_MESH_TOTAL_MEMORY_SIZE_NV           | Z+        | GetIntegerv   | 16384               | Maximum total storage size of all variables declared as <shared> and  | X.3      |
    |                                         |           |               |                     | <out> in all mesh shaders linked into a single program object         |          |
    | MAX_MESH_OUTPUT_PRIMITIVES_NV           | Z+        | GetIntegerv   | 256                 | Maximum number of primitives a single mesh work group can emit        | X.5      |
    | MAX_MESH_OUTPUT_VERTICES_NV             | Z+        | GetIntegerv   | 256                 | Maximum number of vertices a single mesh work group can emit          | X.5      |
    | MAX_MESH_VIEWS_NV                       | Z+        | GetIntegerv   | 1                   | Maximum number of multi-view views that can be used in a mesh shader  |          |
    +-----------------------------------------+-----------+---------------+---------------------+-----------------------------------------------------------------------+----------+


Interactions with ARB_indirect_parameters and OpenGL 4.6

    If neither ARB_indirect_parameters nor OpenGL 4.6 is supported, remove the
    MultiDrawMeshTasksIndirectCountNV function.

Interactions with NV_command_list

    Modify the subsection 10.X.1 State Objects

    (add after the first paragraph of the description of the StateCaptureNV
    command)

      When programs with active mesh or task stages are used, the
      base primitive mode must be set to GL_POINTS.

    (add to the list of errors)

      INVALID_OPERATION is generated if <basicmode> is not GL_POINTS
      when the mesh or task shaders are active.

    Modify subsection 10.X.2 Drawing with Commands

    (add a new paragraph before "None of the commands called by")

      When mesh or task shaders are active, the DRAW_ARRAYS_COMMAND_NV
      must be used to draw mesh tasks.
      The fields of the DrawArraysCommandNV will be interpreted as follows:

        DrawMeshTasksNV(cmd->first, cmd->count);

Interactions with ARB_draw_indirect and NV_vertex_buffer_unified_memory

    When the ARB_draw_indirect and NV_vertex_buffer_unified_memory extensions
    are supported, applications can enable DRAW_INDIRECT_UNIFIED_NV to specify
    that indirect draw data are sourced from a pre-programmed memory range. For
    such implementations, we add a paragraph to spec language for
    DrawMeshTasksIndirectNV, also inherited by MultiDrawMeshTasksIndirectNV and
    MultiDrawMeshTasksIndirectCountNV:

      While DRAW_INDIRECT_UNIFIED_NV is enabled, DrawMeshTasksIndirectNV
      sources its arguments from the address specified by the command
      BufferAddressRange where <pname> is DRAW_INDIRECT_ADDRESS_NV and
      <index> is zero, added to the <indirect> parameter. If the draw
      indirect address range does not belong to a buffer object that is
      resident at the time of the Draw, undefined results, possibly
      including program termination, may occur.

    Additionally, the errors specified for DRAW_INDIRECT_BUFFER accesses for
    DrawMeshTasksIndirectNV are modified as follows:

      An INVALID_OPERATION error is generated if DRAW_INDIRECT_UNIFIED_NV is
      disabled and zero is bound to the DRAW_INDIRECT_BUFFER binding.

      An INVALID_OPERATION error is generated if DRAW_INDIRECT_UNIFIED_NV is
      disabled and the command would source data beyond the end of the
      DRAW_INDIRECT_BUFFER binding.

      An INVALID_OPERATION error is generated if DRAW_INDIRECT_UNIFIED_NV is
      enabled and the command would source data beyond the end of the
      DRAW_INDIRECT_ADDRESS_NV buffer address range.
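
    Non-normative sketch: the three INVALID_OPERATION conditions above
    reduce to one bounds check whose limit depends on whether
    DRAW_INDIRECT_UNIFIED_NV is enabled. The function and parameter names
    below are hypothetical; a real implementation would read this state
    from the GL context, not from arguments.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns 1 when the rules above would raise INVALID_OPERATION for a read
 * of command_size bytes at byte offset `indirect`, and 0 otherwise.
 * buffer_size applies when unified mode is disabled; address_range_length
 * applies when it is enabled.
 */
static int indirect_source_is_invalid(int unified_enabled,
                                      int buffer_bound,
                                      uint64_t buffer_size,
                                      uint64_t address_range_length,
                                      uint64_t indirect,
                                      uint64_t command_size)
{
    uint64_t limit;

    assert(command_size > 0);
    if (!unified_enabled && !buffer_bound)
        return 1; /* zero is bound to DRAW_INDIRECT_BUFFER */

    limit = unified_enabled ? address_range_length : buffer_size;

    /* Written to avoid overflow in indirect + command_size. */
    return indirect > limit || limit - indirect < command_size;
}
```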


Interactions with OVR_multiview

    Modify the new section "9.2.2.2 (Multiview Images)"

    (insert a new entry to the list following
    "In this mode there are several restrictions:")

      - in mesh shaders only the appropriate per-view outputs are
      used.

Interactions with OpenGL ES 3.2

    If implemented in OpenGL ES, remove all references to
    MESH_SUBROUTINE_NV, TASK_SUBROUTINE_NV, MESH_SUBROUTINE_UNIFORM_NV,
    TASK_SUBROUTINE_UNIFORM_NV,
    ATOMIC_COUNTER_BUFFER_REFERENCED_BY_MESH_SHADER_NV,
    ATOMIC_COUNTER_BUFFER_REFERENCED_BY_TASK_SHADER_NV, GetDoublev, GetDoublei_v
    and MultiDrawMeshTasksIndirectCountNV.

    Modify Section 7.3, Program Objects, p. 71 ES 3.2

    (replace the reason why LinkProgram can fail with "program contains objects
    to form either a vertex shader or fragment shader", p. 73 ES 3.2)

      * <program> contains objects to form either a vertex shader or fragment
        shader but not a mesh shader, and

        - <program> is not separable, and does not contain objects to form both a
          vertex shader and fragment shader.

    (add to the list of reasons why LinkProgram can fail, p. 74 ES 3.2)

      * program contains objects to form either a mesh or task shader (see
        chapter X) but no fragment shader.

Issues

    (1) Should we use a new command to specify work to be processed by task
        and mesh shaders?

      RESOLVED: Yes. Using a separate draw call helps to clearly
      differentiate task and mesh shader processing from the existing vertex
      processing performed by the standard OpenGL vertex processing pipeline
      with its vertex, tessellation, and geometry shaders.

    (2) What name should we use for the draw calls that spawn task and mesh
        shaders?

      RESOLVED: For basic draws, we use the following command:

        void DrawMeshTasksNV(uint first, uint count);

      The <first> and <count> parameters specify a range of mesh task
      numbers to be processed by the task and/or mesh shaders.

      Since the programming model of mesh and task shaders is very similar to
      that of compute shaders, we considered using an interface similar to
      DispatchCompute(), such as:

        void DrawWorkGroupsNV(uint num_groups_x, uint num_groups_y,
                              uint num_groups_z);

      We ultimately decided to not use such a generic name. It might be
      useful in the future to give compute shaders the ability to spawn
      "draws", and it's not clear that the programming model for
      such a design would look anything like mesh and task shaders.

      The existing graphics draw calls DrawArrays() and DrawElements()
      directly or indirectly refer to elements of a vertex array. Since the
      programming model here spawns generic work that ultimately produces a
      set of (likely connected) output primitives, we use the word "mesh" to
      refer to the output of this pipeline and "tasks" to refer to the fact
      that the draw call is spawning generic work groups to produce these
      "meshes".

      NOTE: In order to minimize divergence from the programming model for
      compute shaders, mesh shaders use the same three-dimensional local work
      group concept used by compute shaders. However, the hardware used for
      task and mesh shaders is more limited and supports only one-dimensional
      work groups. We decided to only use one "dimension" in the draw call to
      keep the API simple and reflect the limitation.

    (3) Should we be able to dispatch a range of work groups that doesn't
        start at zero?

      RESOLVED: Yes.
      When porting application code from using regular vertex
      processing to mesh shader processing, the use of an implicit offset via
      the <first> parameter should be helpful as it is in standard DrawArrays
      calls. We think it's likely that applications will store information
      about tasks to process in a single array with global task numbers. In
      this case, the draw call with an offset allows applications to specify a
      range of this array of tasks to process.

    (4) Should we support separable program objects with mesh and task
        shaders, where one program provides a task shader and a second
        program provides a mesh shader that interfaces with it?

      RESOLVED: Yes. Supporting separable program objects is not difficult
      and may be useful in some cases. For example, one might use a single
      task shader that could be used for common processing of different types
      of geometry (e.g., evaluating visibility via a bounding box) while
      using different mesh shaders to generate different types of primitives.

    (5) Should we have queryable limits on the total amount of output memory
        consumed by mesh or task shaders?

      RESOLVED: Yes. We have implementation-dependent limits on the total
      amount of output memory consumed by mesh and task shaders that can be
      queried using MAX_MESH_TOTAL_MEMORY_SIZE_NV and
      MAX_TASK_TOTAL_MEMORY_SIZE_NV. For each per-vertex or per-primitive
      output attribute in a mesh shader, memory is allocated separately for
      each vertex or primitive allocated by the shader. The total number of
      vertices or primitives used for this allocation is determined by taking
      the maximum vertex and primitive counts declared in the mesh shader and
      padding to implementation-dependent granularities that can be queried
      using MESH_OUTPUT_PER_VERTEX_GRANULARITY_NV and
      MESH_OUTPUT_PER_PRIMITIVE_GRANULARITY_NV.
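
      Non-normative sketch: the padding described in issue (5) amounts to
      rounding each declared count up to the next multiple of the queried
      granularity. The helper names and the per-vertex and per-primitive
      byte sizes below are hypothetical inputs. The exact accounting
      against MAX_MESH_TOTAL_MEMORY_SIZE_NV is defined elsewhere in this
      specification and is not reproduced here; the sketch illustrates
      only the effect of the padding.

```c
#include <assert.h>
#include <stdint.h>

/* Rounds count up to the next multiple of granularity (granularity > 0). */
static uint32_t pad_to_granularity(uint32_t count, uint32_t granularity)
{
    assert(granularity > 0);
    return (count + granularity - 1) / granularity * granularity;
}

/*
 * Rough size of the per-vertex and per-primitive outputs after padding.
 * bytes_per_vertex and bytes_per_primitive are assumed totals over all
 * output attributes of one vertex or one primitive.
 */
static uint64_t padded_output_bytes(uint32_t max_vertices,
                                    uint32_t max_primitives,
                                    uint32_t vertex_granularity,
                                    uint32_t primitive_granularity,
                                    uint32_t bytes_per_vertex,
                                    uint32_t bytes_per_primitive)
{
    uint64_t vertices   = pad_to_granularity(max_vertices, vertex_granularity);
    uint64_t primitives = pad_to_granularity(max_primitives, primitive_granularity);

    return vertices * bytes_per_vertex + primitives * bytes_per_primitive;
}
```

      With a granularity of 32, a shader declaring 126 primitives is
      charged for 128, so declaring counts that are already multiples of
      the granularity wastes no output memory.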

    (6) Should we have a MultiDrawMeshTasksIndirectNV command, to draw
        multiple sets of mesh tasks in one call?

      RESOLVED: Yes, we support "multi-draw" APIs for consistency with
      the standard vertex pipeline. When using these APIs, each individual
      "draw" has its own structure stored in a buffer object. If mesh or task
      shaders need to determine which draw is being processed, the built-in
      gl_DrawIDARB can be used for that purpose.

    (7) Do we support transform feedback with mesh shaders?

      RESOLVED: No. In the initial implementation of this extension, the
      hardware doesn't support it.

    (8) When using multi-view (OVR_multiview), how do we broadcast the
        primitive to multiple layers or viewports?

      RESOLVED: When the OVR_multiview extension is enabled in a vertex
      shader, the layout qualifier:

        layout(num_views = 2) in;

      indicates that the vertex shader should be run separately for two views,
      where the shader can use the built-in input gl_ViewIDOVR to determine
      which view is being processed. A separate set of primitives is
      generated for each view, and each is rasterized into a separate
      framebuffer layer.

      When the "num_views" layout qualifier for the OVR_multiview extension is
      enabled in a mesh shader, the semantics are slightly different. Instead
      of running a separate mesh shader invocation for each view, a single
      invocation is generated to process all views. The view count from the
      layout qualifier indicates the size of the extra array dimension for
      "arrayed" per-vertex and per-primitive outputs qualified with
      "perviewNV". The set of primitives generated by the mesh shader will be
      broadcast separately to each view. For per-vertex or per-primitive
      outputs not qualified with "perviewNV", the single value written by the
      mesh shader for each vertex/primitive will be used for each view.
      For outputs qualified with "perviewNV", each view will use a separate
      value from the corresponding "arrayed" output.

    (9) Should we support NV_gpu_program5-style assembly programs for mesh
        and task shaders?

      RESOLVED: No. We do provide a GLSL extension, also called
      "GL_NV_mesh_shader".

    Also, please refer to issues in the GLSL extension specification.

Revision History

    Revision 5 (pdaniell)
    - Fix minimum implementation limit of MAX_DRAW_MESH_TASKS_COUNT_NV.

    Revision 4 (pknowles)
    - Add ES interactions.

    Revision 3, January 14, 2019 (pbrown)
    - Fix a typo in language prohibiting use of a task shader without a mesh
      shader.

    Revision 2, September 17, 2018 (pbrown)
    - Prepare specification for publication.

    Revision 1 (ckubisch)
    - Internal revisions.