Name

    ARB_shader_storage_buffer_object

Name Strings

    GL_ARB_shader_storage_buffer_object

Contact

    Pat Brown, NVIDIA (pbrown 'at' nvidia.com)

Contributors

    Jeff Bolz, NVIDIA
    Piers Daniell, NVIDIA
    Christophe Riccio, AMD
    Graham Sellers, AMD
    Bruce Merry
    John Kessenich

Notice

    Copyright (c) 2012-2014 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL Working Group. For
    extensions which have been promoted to a core Specification, fixes will
    first appear in the latest version of that core Specification, and will
    eventually be backported to the extension document. This policy is
    described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete.
    Approved by the ARB on 2012/06/12.

Version

    Last Modified Date:         April 28, 2014
    Revision:                   16

Number

    ARB Extension #137

Dependencies

    OpenGL 4.0 (either core or compatibility profile) is required.

    OpenGL 4.3 or ARB_program_interface_query is required.

    This extension is written against the OpenGL 4.2 (Compatibility Profile)
    Specification.

    This extension interacts with OpenGL 4.3 and ARB_compute_shader.

    This extension interacts with OpenGL 4.3 and ARB_program_interface_query.

    This extension interacts with NV_bindless_texture.

Overview

    This extension provides the ability for OpenGL shaders to perform random
    access reads, writes, and atomic memory operations on variables stored in
    a buffer object.  Application shader code can declare sets of variables
    (referred to as "buffer variables") arranged into interface blocks in a
    manner similar to that done with uniform blocks in OpenGL 3.1.  In both
    cases, the values of the variables declared in a given interface block are
    taken from a buffer object bound to a binding point associated with the
    block.  Buffer objects used in this extension are referred to as "shader
    storage buffers".

    While the capability provided by this extension is similar to that
    provided by OpenGL 3.1 and ARB_uniform_buffer_object, there are several
    significant differences.  Most importantly, shader code is allowed to
    write to shader storage buffers, while uniform buffers are always
    read-only.  Shader storage buffers have a separate set of binding points,
    with different counts and size limits.  The maximum usable size for shader
    storage buffers is implementation-dependent, but its minimum value is
    substantially larger than the minimum for uniform buffers.

    The ability to write to buffer objects creates the potential for multiple
    independent shader invocations to read and write the same underlying
    memory.  The same issue exists with the ARB_shader_image_load_store
    extension provided in OpenGL 4.2, which can write to texture objects and
    buffers.  In both cases, the specification makes few guarantees related to
    the relative order of memory reads and writes performed by the shader
    invocations.  For ARB_shader_image_load_store, the OpenGL API and shading
    language do provide some control over memory transactions; those
    mechanisms also affect reads and writes of shader storage buffers.  In the
    OpenGL API, the glMemoryBarrier() call can be used to ensure that certain
    memory operations related to commands issued prior to the barrier complete
    before other operations related to commands issued after the barrier.
    Additionally, the shading language provides the memoryBarrier() function
    to control the relative order of memory accesses within individual shader
    invocations and provides various memory qualifiers controlling how the
    memory corresponding to individual variables is accessed.


New Procedures and Functions

    void ShaderStorageBlockBinding(uint program, uint storageBlockIndex,
                                   uint storageBlockBinding);

New Tokens

    Accepted by the <target> parameters of BindBuffer, BufferData,
    BufferSubData, MapBuffer, UnmapBuffer, GetBufferSubData, and
    GetBufferPointerv:

        SHADER_STORAGE_BUFFER                           0x90D2

    Accepted by the <pname> parameter of GetIntegerv, GetIntegeri_v,
    GetBooleanv, GetInteger64v, GetFloatv, GetDoublev, GetBooleani_v,
    GetFloati_v, GetDoublei_v, and GetInteger64i_v:

        SHADER_STORAGE_BUFFER_BINDING                   0x90D3

    Accepted by the <pname> parameter of GetIntegeri_v, GetBooleani_v,
    GetFloati_v, GetDoublei_v, and GetInteger64i_v:

        SHADER_STORAGE_BUFFER_START                     0x90D4
        SHADER_STORAGE_BUFFER_SIZE                      0x90D5

    Accepted by the <pname> parameter of GetIntegerv, GetBooleanv,
    GetInteger64v, GetFloatv, and GetDoublev:

        MAX_VERTEX_SHADER_STORAGE_BLOCKS                0x90D6
        MAX_GEOMETRY_SHADER_STORAGE_BLOCKS              0x90D7
        MAX_TESS_CONTROL_SHADER_STORAGE_BLOCKS          0x90D8
        MAX_TESS_EVALUATION_SHADER_STORAGE_BLOCKS       0x90D9
        MAX_FRAGMENT_SHADER_STORAGE_BLOCKS              0x90DA
        MAX_COMPUTE_SHADER_STORAGE_BLOCKS               0x90DB
        MAX_COMBINED_SHADER_STORAGE_BLOCKS              0x90DC
        MAX_SHADER_STORAGE_BUFFER_BINDINGS              0x90DD
        MAX_SHADER_STORAGE_BLOCK_SIZE                   0x90DE
        SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT          0x90DF

    Accepted in the <barriers> bitfield in glMemoryBarrier:

        SHADER_STORAGE_BARRIER_BIT                      0x2000

    Also, add a new alias for the existing token
    MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS:

        MAX_COMBINED_SHADER_OUTPUT_RESOURCES            0x8F39 (alias)

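    Non-normative illustration.  The per-stage limit tokens listed above are
    not numbered in pipeline order: the geometry limit (0x90D7) precedes the
    two tessellation limits.  The sketch below maps a pipeline stage to the
    <pname> of its block limit, using only the values from this section.  The
    SSBO_ and STAGE_ names and the function ssbo_max_blocks_pname are
    hypothetical; a real C header would expose the tokens with a GL_ prefix.

```c
#include <assert.h>

/* Values copied from the "New Tokens" list. The SSBO_ names are
   hypothetical stand-ins for the GL_-prefixed names of a real header. */
enum {
    SSBO_MAX_VERTEX_BLOCKS          = 0x90D6,
    SSBO_MAX_GEOMETRY_BLOCKS        = 0x90D7,
    SSBO_MAX_TESS_CONTROL_BLOCKS    = 0x90D8,
    SSBO_MAX_TESS_EVALUATION_BLOCKS = 0x90D9,
    SSBO_MAX_FRAGMENT_BLOCKS        = 0x90DA,
    SSBO_MAX_COMPUTE_BLOCKS         = 0x90DB
};

/* Hypothetical stage index, listed in pipeline order. */
typedef enum {
    STAGE_VERTEX,
    STAGE_TESS_CONTROL,
    STAGE_TESS_EVALUATION,
    STAGE_GEOMETRY,
    STAGE_FRAGMENT,
    STAGE_COMPUTE,
    STAGE_COUNT
} ssbo_stage;

/* Returns the <pname> to pass to GetIntegerv for a stage's block limit. */
unsigned ssbo_max_blocks_pname(ssbo_stage stage)
{
    static const unsigned pnames[STAGE_COUNT] = {
        SSBO_MAX_VERTEX_BLOCKS,
        SSBO_MAX_TESS_CONTROL_BLOCKS,
        SSBO_MAX_TESS_EVALUATION_BLOCKS,
        SSBO_MAX_GEOMETRY_BLOCKS,
        SSBO_MAX_FRAGMENT_BLOCKS,
        SSBO_MAX_COMPUTE_BLOCKS
    };
    assert((unsigned)stage < STAGE_COUNT);
    return pnames[stage];
}
```
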
Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile) Specification
(OpenGL Operation)

    Modify Section 2.9, Buffer Objects, p. 56

    (Add to Table 2.9, p. 57)

        Target Name            Purpose               Described in section(s)
        ---------------------  --------------------  -----------------------
        SHADER_STORAGE_BUFFER  read-write storage    2.14.X
                               for shaders

    (modify next-to-last paragraph, p. 58) target must be one of
    ATOMIC_COUNTER_BUFFER, SHADER_STORAGE_BUFFER, TRANSFORM_FEEDBACK_BUFFER,
    UNIFORM_BUFFER. ...


    Modify Section 2.14.7, Uniform Variables, p. 113

    (modify Table 2.16, pp. 122-125)

    Add a new column labeled "Buffer".  Include dots for all the types on
    p. 122 (including BOOL types not supported for "Attrib" and "Xfb").  Add
    dots for the "DOUBLE_MAT*" rows on p. 123.  Add no dots for any image or
    sampler types.

    In the description of the table (p. 125), add a new sentence:  Types whose
    "Buffer" column is marked may be declared as buffer variables (see
    section 2.14.X).


    Modify unnumbered "Standard Uniform Block Layout" section, p. 132

    (insert a new paragraph at the end of the section, at the bottom of
    p. 133) Shader storage blocks (section 2.14.X) also support the "std140"
    layout qualifier, as well as a "std430" layout qualifier not supported for
    uniform blocks.  When using the "std430" storage layout, shader storage
    blocks will be laid out in buffer storage identically to uniform and
    shader storage blocks using the "std140" layout, except that the base
    alignment of arrays of scalars and vectors in rule (4) and of structures
    in rule (9) are not rounded up to a multiple of the base alignment of a
    vec4.


    Add new section immediately before Section 2.14.8, Subroutine Uniform
    Variables (p. 135)

    2.14.X, Shader Buffer Variables

    Shaders can declare named /buffer variables/, as described in the OpenGL
    Shading Language Specification.  Sets of buffer variables are grouped into
    interface blocks called /shader storage blocks/.  The values of each
    buffer variable in a shader storage block are read from or written to the
    data store of a buffer object bound to the binding point associated with
    the block.  The values of active buffer variables may be changed by
    executing shaders that assign values to them or perform atomic memory
    operations on them, by modifying the contents of the bound buffer object's
    data store with commands such as BufferSubData, by binding a new buffer
    object to the binding point associated with the block, or by changing the
    binding point associated with the block.

    Buffer variables in shader storage blocks are represented in memory in the
    same way as uniforms stored in uniform blocks, as described in the
    "Uniform Buffer Object Storage" subsection of Section 2.14.7.  When a
    program is linked successfully, each active buffer variable is assigned an
    offset relative to the base of the buffer object binding associated with
    its shader storage block.  For buffer variables declared as arrays and
    matrices, strides between array elements or matrix columns or rows will
    also be assigned.  Offsets and strides of buffer variables will be
    assigned in an implementation-dependent manner unless the shader storage
    block is declared using the "std140" or "std430" storage layout
    qualifiers.  For "std140" and "std430" shader storage blocks, offsets will
    be assigned using the method described in the "Standard Uniform Block
    Layout" subsection of Section 2.14.7.  If a program is re-linked, existing
    buffer variable offsets and strides are invalidated, and a new set of
    active variables, offsets, and strides will be generated.
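
    Non-normative illustration.  The difference between the "std140" and
    "std430" array rules described above can be sketched in C as follows.
    The sketch covers only arrays of scalars and vectors (rule 4) and assumes
    the usual base alignments of N, 2N, and 4N bytes for one-, two-, and
    three- or four-component types, where N is the scalar size.  The names
    block_layout, vector_base_alignment, and scalar_vector_array_stride are
    hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LAYOUT_STD140, LAYOUT_STD430 } block_layout;

/* Base alignment in bytes of a scalar or vector with 1..4 components of
   scalar_size bytes each (4 for float/int/uint, 8 for double). */
size_t vector_base_alignment(size_t scalar_size, unsigned components)
{
    assert(components >= 1 && components <= 4);
    if (components == 1)
        return scalar_size;          /* scalar: N */
    if (components == 2)
        return 2 * scalar_size;      /* two components: 2N */
    return 4 * scalar_size;          /* three or four components: 4N */
}

/* Array stride for an array of scalars or vectors. Under std140 the
   element alignment is rounded up to that of a vec4 (16 bytes); under
   std430 it is used as is. */
size_t scalar_vector_array_stride(block_layout layout, size_t scalar_size,
                                  unsigned components)
{
    size_t align = vector_base_alignment(scalar_size, components);
    if (layout == LAYOUT_STD140 && align < 16)
        align = 16;
    return align;
}
```

    Under these assumptions a float array has a 16-byte stride in a "std140"
    block and a 4-byte stride in a "std430" block, while a vec3 array has a
    16-byte stride in both.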

    The total amount of buffer object storage that can be accessed in any
    shader storage block is subject to an implementation-dependent limit.  The
    maximum amount of available space, in basic machine units, can be queried
    by calling GetIntegerv with the constant MAX_SHADER_STORAGE_BLOCK_SIZE.
    If the amount of storage required for any shader storage block exceeds
    this limit, a program will fail to link.

    If the number of active shader storage blocks referenced by the shaders in
    a program exceeds implementation-dependent limits, the program will fail
    to link.  The limits for vertex, tessellation control, tessellation
    evaluation, geometry, fragment, and compute shaders can be obtained by
    calling GetIntegerv with pname values of MAX_VERTEX_SHADER_STORAGE_BLOCKS,
    MAX_TESS_CONTROL_SHADER_STORAGE_BLOCKS,
    MAX_TESS_EVALUATION_SHADER_STORAGE_BLOCKS,
    MAX_GEOMETRY_SHADER_STORAGE_BLOCKS, MAX_FRAGMENT_SHADER_STORAGE_BLOCKS,
    and MAX_COMPUTE_SHADER_STORAGE_BLOCKS, respectively.  Additionally, a
    program will fail to link if the sum of the number of active shader
    storage blocks referenced by each shader stage in a program exceeds the
    value of the implementation-dependent limit
    MAX_COMBINED_SHADER_STORAGE_BLOCKS.  If a shader storage block in a
    program is referenced by multiple shaders, each such reference counts
    separately against this combined limit.

    When a named shader storage block is declared by multiple shaders in a
    program, it must be declared identically in each shader.  The buffer
    variables within the block must be declared with the same names, types,
    qualification, and declaration order.  If a program contains multiple
    shaders with different declarations for the same named shader storage
    block, the program will fail to link.
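
    Non-normative illustration.  The per-stage and combined counting rules
    above can be sketched as a pure C check.  The function name, the stage
    ordering of the arrays, and the limit values used in any caller are
    hypothetical; real limits come from GetIntegerv.  The key point is that
    the combined count is the plain sum of the per-stage counts, so a block
    referenced by two stages is counted twice.

```c
#include <stdbool.h>

enum { SSBO_NUM_STAGES = 6 };

/* used[s]          : active shader storage blocks referenced by stage s
   max_per_stage[s] : the matching MAX_*_SHADER_STORAGE_BLOCKS value
   max_combined     : the MAX_COMBINED_SHADER_STORAGE_BLOCKS value
   Returns true if the counts alone would not cause a link failure. */
bool ssbo_block_counts_within_limits(const int used[SSBO_NUM_STAGES],
                                     const int max_per_stage[SSBO_NUM_STAGES],
                                     int max_combined)
{
    int combined = 0;
    for (int s = 0; s < SSBO_NUM_STAGES; ++s) {
        if (used[s] > max_per_stage[s])
            return false;            /* per-stage limit exceeded */
        combined += used[s];         /* every reference counts separately */
    }
    return combined <= max_combined;
}
```

    For example, a single block referenced by both the vertex and fragment
    shaders gives per-stage counts of one each and a combined count of two.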

    Regions of buffer objects are bound as storage for shader storage blocks
    by calling one of the commands BindBufferRange or BindBufferBase (see
    section 2.9.1) with target set to SHADER_STORAGE_BUFFER.  In addition to
    the general errors described in section 2.9.1, BindBufferRange will
    generate an INVALID_VALUE error if <index> is greater than or equal to the
    value of MAX_SHADER_STORAGE_BUFFER_BINDINGS, or if <offset> is not a
    multiple of the implementation-dependent alignment requirement (the value
    of SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT).

    Each of a program's active shader storage blocks has a corresponding
    shader storage buffer object binding point.  When a program object is
    linked, the shader storage buffer object binding point assigned to each of
    its active shader storage blocks is reset to the value specified by the
    corresponding "binding" layout qualifier, if present, or zero otherwise.
    After a program is linked, the command

        void ShaderStorageBlockBinding(uint program, uint storageBlockIndex,
                                       uint storageBlockBinding);

    changes the binding point of the active shader storage block with an
    assigned index of <storageBlockIndex> in program object <program>.  The
    error INVALID_VALUE is generated if <storageBlockIndex> is not an active
    shader storage block index in <program>, or if <storageBlockBinding> is
    greater than or equal to the value of MAX_SHADER_STORAGE_BUFFER_BINDINGS.
    If successful, ShaderStorageBlockBinding specifies that <program> will use
    the data store of the buffer object bound to the binding point
    <storageBlockBinding> to read and write the values of the buffer
    variables in the shader storage block identified by <storageBlockIndex>.
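
    Non-normative illustration.  The INVALID_VALUE conditions specific to
    shader storage buffers can be sketched as two pure C checks.  Only the
    errors named in the paragraphs above are modeled; the general errors of
    section 2.9.1 are not.  All names are hypothetical, and the sketch
    assumes that active block indices run from zero to one less than the
    number of active blocks.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { SSBO_NO_ERROR, SSBO_INVALID_VALUE } ssbo_error;

/* Models the extra BindBufferRange checks for SHADER_STORAGE_BUFFER.
   max_bindings     : value of MAX_SHADER_STORAGE_BUFFER_BINDINGS
   offset_alignment : value of SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT */
ssbo_error ssbo_check_bind_range(uint32_t index, int64_t offset,
                                 uint32_t max_bindings,
                                 int64_t offset_alignment)
{
    assert(offset_alignment > 0);
    if (index >= max_bindings)
        return SSBO_INVALID_VALUE;
    if (offset % offset_alignment != 0)
        return SSBO_INVALID_VALUE;
    return SSBO_NO_ERROR;
}

/* Models the ShaderStorageBlockBinding checks. Assumption: a block index
   is active exactly when it is less than active_block_count. */
ssbo_error ssbo_check_block_binding(uint32_t block_index,
                                    uint32_t active_block_count,
                                    uint32_t binding,
                                    uint32_t max_bindings)
{
    if (block_index >= active_block_count)
        return SSBO_INVALID_VALUE;
    if (binding >= max_bindings)
        return SSBO_INVALID_VALUE;
    return SSBO_NO_ERROR;
}
```

    With a hypothetical alignment of 256 and 8 binding points, an offset of
    256 at index 0 passes, while an offset of 100 or an index of 8 does not.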

    When executing shaders that access shader storage blocks, the binding
    point corresponding to each active shader storage block must be populated
    with a buffer object with a size no smaller than the minimum required size
    of the shader storage block (the value of BUFFER_SIZE for the appropriate
    SHADER_STORAGE_BUFFER resource).  For binding points populated by
    BindBufferRange, the size in question is the value of the <size> parameter
    or the size of the buffer minus the value of the <offset> parameter,
    whichever is smaller.  If any active shader storage block is not backed by
    a sufficiently large buffer object, the results of shader execution are
    undefined, and may result in GL interruption or termination.  Shaders may
    be executed to process the primitives and vertices specified between Begin
    and End, or by vertex array commands (see section 2.8).  Shaders may also
    be executed as a result of DrawPixels, Bitmap, or RasterPos* commands.


    Modify Section 2.14.12, Shader Execution (p. 145)

    (add new sub-section before "Shader Inputs", p. 151)

    Shader Storage Buffer Access

    Shaders have the ability to read and write to buffer memory via buffer
    variables in shader storage blocks.  The maximum numbers of shader storage
    blocks available to shaders are the values of the implementation-dependent
    constants

      * MAX_VERTEX_SHADER_STORAGE_BLOCKS (for vertex shaders),

      * MAX_TESS_CONTROL_SHADER_STORAGE_BLOCKS (for tessellation control
        shaders),

      * MAX_TESS_EVALUATION_SHADER_STORAGE_BLOCKS (for tessellation evaluation
        shaders),

      * MAX_GEOMETRY_SHADER_STORAGE_BLOCKS (for geometry shaders),

      * MAX_FRAGMENT_SHADER_STORAGE_BLOCKS (for fragment shaders), and

      * MAX_COMPUTE_SHADER_STORAGE_BLOCKS (for compute shaders).

    All active shaders combined cannot use more than the value of
    MAX_COMBINED_SHADER_STORAGE_BLOCKS shader storage blocks.  If more than one
    pipeline stage accesses the same shader storage block, each such access
    counts separately against this combined limit.


    (add to the list of bullets in the "Validation" section on p. 153)

      * The sum of the number of active shader storage blocks used by the
        current program objects exceeds the combined limit on the number of
        active shader storage blocks (MAX_COMBINED_SHADER_STORAGE_BLOCKS).


    Modify Section 2.14.13, Shader Memory Access (p. 153)

    (modify last paragraph, p. 153) Shaders may perform random-access reads
    and writes to texture or buffer object memory by using built-in image
    load, store, and atomic functions operating on shader image variables, or
    by reading from, assigning to, or performing atomic memory operations on
    shader buffer variables, as described in the OpenGL Shading Language
    Specification.  The ability to perform such random-access reads and writes
    in systems that may be highly pipelined results in ordering and
    synchronization issues discussed in the sections below.

    (add to list of MemoryBarrier <barriers> bullets, p. 158)

      * SHADER_STORAGE_BARRIER_BIT:  Memory accesses using shader buffer
        variables issued after the barrier will reflect data written by
        shaders prior to the barrier.  Additionally, assignments to and atomic
        operations performed on shader buffer variables after the barrier will
        not execute until all memory accesses (e.g., loads, stores, texture
        fetches, vertex fetches) initiated prior to the barrier complete.


Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile) Specification
(Rasterization)

    Modify Section 3.10.22, Texture Image Loads and Stores (p. 358)

    (modify first paragraph, p. 367) Implementations may support a limited
    combined number of image units, shader storage blocks, and active fragment
    shader outputs (see section 4.2.1).
    A link error will be generated if the sum of the number of active image
    uniforms used in all shaders, the number of active shader storage blocks,
    and the number of active fragment shader outputs exceeds the
    implementation-dependent value of MAX_COMBINED_SHADER_OUTPUT_RESOURCES.


Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile) Specification
(State and State Requests)

    Modify Section 6.1.15, Buffer Object Queries (p. 490)

    (add to end of section)

    To query which buffer objects are bound to the array of shader storage
    buffer binding points and will be used as the storage for active shader
    storage blocks, call GetIntegeri_v with <param> set to
    SHADER_STORAGE_BUFFER_BINDING.  <index> must be in the range zero to the
    value of MAX_SHADER_STORAGE_BUFFER_BINDINGS-1.  The name of the buffer
    object bound to <index> is returned in <values>.  If no buffer object is
    bound for <index>, zero is returned in <values>.

    To query the starting offset or size of the range of each buffer object
    binding used for shader storage buffers, call GetInteger64i_v with <param>
    set to SHADER_STORAGE_BUFFER_START or SHADER_STORAGE_BUFFER_SIZE
    respectively.  <index> must be in the range zero to the value of
    MAX_SHADER_STORAGE_BUFFER_BINDINGS-1.  If the parameter (starting offset
    or size) was not specified when the buffer object was bound (e.g. if
    bound with BindBufferBase), or if no buffer object is bound to <index>,
    zero is returned.


Additions to Appendix A of the OpenGL 4.2 (Compatibility Profile) Specification
(Invariance)

    Modify Section A.1, Repeatability (p. 583)

    (modify last sentence of the first paragraph, p. 583) ... This
    repeatability requirement doesn't apply when using shaders containing side
    effects (image stores, image atomic operations, atomic counter operations,
    buffer variable stores, buffer variable atomic operations), because these
    memory operations are not guaranteed to be processed in a defined order.

    Modify Section A.3, Invariance (p. 584)

    (modify first sentence of the paragraph after Rule 5, p. 586) If a
    sequence of GL commands specifies primitives to be rendered with shaders
    containing side effects (image stores, image atomic operations, atomic
    counter operations, buffer variable stores, buffer variable atomic
    operations), invariance rules are relaxed. ...

    (modify first paragraph, p. 587) When any sequence of GL commands triggers
    shader invocations that perform image stores, image atomic operations,
    atomic counter operations, buffer variable stores, or buffer variable
    atomic operations and subsequent GL commands read the memory written by
    those shader invocations, these operations must be explicitly
    synchronized.  For more details, see Section 2.14.13, Shader Memory
    Access.


Additions to Appendix D of the OpenGL 4.2 (Compatibility Profile) Specification
(Shared Objects and Multiple Contexts)

    Modify Section D.3, Propagating State Changes, p. 611

    (modify second bullet, p. 612)

      * Rendering commands that trigger shader invocations, where the shader
        performs image stores, image atomic operations, atomic counter
        operations, buffer variable stores, or buffer variable atomic
        operations.
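
    Non-normative illustration.  Section 2.14.X defines the usable size of a
    BindBufferRange binding as the smaller of <size> and the buffer size
    minus <offset>, and the shading language additions that follow derive the
    run-time length of an unsized trailing array from the size of the backing
    buffer.  The C sketch below combines the two.  The length formula (whole
    elements that fit after the array's offset, clamped at zero) is an
    assumption made for illustration; this document only states that the
    length is determined at run time from the buffer size.  All names are
    hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Usable size of a binding made with BindBufferRange: the smaller of
   <size> and (buffer size - <offset>). */
int64_t ssbo_effective_binding_size(int64_t buffer_size, int64_t offset,
                                    int64_t size)
{
    int64_t remaining = buffer_size - offset;
    return size < remaining ? size : remaining;
}

/* Assumed run-time length of an unsized last member: the number of whole
   elements that fit between the array's offset within the block and the
   end of the binding, never less than zero. */
int64_t ssbo_runtime_array_length(int64_t binding_size, int64_t member_offset,
                                  int64_t element_stride)
{
    assert(element_stride > 0);
    int64_t bytes = binding_size - member_offset;
    return bytes > 0 ? bytes / element_stride : 0;
}
```

    As a usage example, take a block holding one uint followed by an unsized
    vec4 array, with the array assumed to start at byte 16 with a 16-byte
    stride.  Binding 512 bytes at offset 256 of a 1024-byte buffer gives a
    usable size of 512 bytes and, under the assumed formula, a length of 31.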

Additions to the OpenGL Shading Language 4.20 Specification

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_ARB_shader_storage_buffer_object : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_ARB_shader_storage_buffer_object       1


    Modify Section 3.6, Keywords (p. 15)

    (add to list of keywords)

      buffer


    Modify Section 4.1.9, Arrays (p. 29)

    (modify first paragraph of the section, p. 29, adding an exception
     allowing general indexing of the last array of a shader storage block)
    ... Except for the last declared member of a shader storage block
    (section 4.3.X), the size of an array must be declared before it is
    indexed with anything other than an integral constant expression.  The
    size of an array must be declared before passing it as an argument to a
    function. ...

    (modify last paragraph, p. 30) ... This returns a type int.  If an array
    has been explicitly sized, the value returned by the length method is
    a constant expression.  If an array has not been explicitly
    sized and is not the last declared member of a shader storage block, the
    value returned by the length method is not a constant
    expression and will be determined when a program is linked.  If an array
    has not been explicitly sized and is the last declared member of a shader
    storage block, the value returned will not be a constant expression and
    will be determined at run time based on
    the size of the buffer object providing storage for the block.  For such
    arrays, the value returned by the length method will be undefined if the
    array is contained in an array of shader storage blocks that is indexed
    with a non-constant expression less than zero or greater than or equal
    to the number of blocks in the array.

    (add a new paragraph to end of the section, at the bottom of p. 30) In a
    shader storage block, the last member may be declared without an explicit
    size.  In this case, the effective array size is inferred at run-time from
    the size of the data store backing the interface block.  Such unsized
    arrays may be indexed with general integer expressions, but may not be
    passed as an argument to a function or indexed with a negative constant
    expression.


    Modify Section 4.3, Storage Qualifiers (p. 36)

      Storage
      Qualifier   Meaning
      ----------  ---------------------------------------------------
      buffer      value is stored in a buffer object, and can be read
                  or written by shader invocations and the OpenGL API


    Modify Section 4.3.3, Constant Expressions (p. 38)

    (modify first bullet, p. 39, clarifying that the length() method only
    produces constant expressions on explicitly sized objects, since we now
    allow it on implicitly sized or unsized arrays)

    * valid use of the length() method on an explicitly sized object, whether
      or not the object itself is constant (implicitly sized or unsized arrays
      do not return a constant expression)


    Insert after Section 4.3.5, Uniform (p. 40)

    4.3.X, Buffer Variables

    The <buffer> qualifier is used to declare global variables whose values
    are stored in the data store of a buffer object bound through the OpenGL
    API.  Buffer variables can be read and written, with the underlying
    storage shared among all active shader invocations.  Buffer variable
    memory reads and writes within a single shader invocation are processed in
    order.  However, the order of reads and writes performed in one invocation
    relative to those performed by another invocation is largely undefined.
    Buffer variables may be qualified with memory qualifiers affecting how the
    underlying memory is accessed, as described in Section 4.10.

    The "buffer" qualifier can be used with any of the basic data types, or
    when declaring a variable whose type is a structure, or an array of any of
    these.

    Buffer variables may only be declared inside interface blocks (Section
    4.3.7), which are referred to as shader storage blocks.  It is illegal to
    declare buffer variables at global scope (outside a block).  Buffer
    variables cannot have initializers.

    There are implementation-dependent limits on the number of the shader
    storage blocks used for each type of shader, the combined number of shader
    storage blocks used for a program, and the amount of storage required by
    each individual shader storage block.  If any of these limits are
    exceeded, it will cause a compile-time or link-time error.

    If multiple shaders are linked together, then they will share a single
    global buffer variable name space, including within a language as well as
    across languages.  Hence, the types of buffer variables with the same name
    must match across all shaders that are linked into a single program.


    Modify Section 4.3.7, Interface Blocks (p. 43)

    (modify first paragraph) Input, output, uniform, and buffer variable
    declarations can be grouped into named interface blocks ... A uniform
    block is backed by the application with a buffer object.  A block of
    buffer variables, called a shader storage block, is also backed by the
    application with a buffer object. ...

    (modify second paragraph) An interface block is started by an in, out,
    uniform, or buffer keyword, followed by ...

    (add "buffer" to the grammar rules)

      interface-qualifier:
        in
        out
        uniform
        buffer

    (modify first paragraph, p. 44) Types and declarators are the same as for
    other input, output, uniform, and buffer variable declarations...

    (modify third paragraph, p. 44) If no optional qualifier is used in a
    member-declaration, the qualification of the variable is just in, out,
    uniform, or buffer as determined by <interface-qualifier>. ... Input
    variables, output variables, uniform variables, and buffer variables can
    only be in in blocks, out blocks, uniform blocks, and shader storage
    blocks, respectively.  Repeating the "in", "out", "uniform", or "buffer"
    interface qualifier for a member's storage qualifier is optional. ...

    (modify fourth paragraph, p. 44) For this section, define an interface to
    be one of these:

      * All the uniforms of a program.  This spans all compilation units linked
        together within one program.

      * All the buffer variables of a program.

      * The boundary between adjacent programmable pipeline stages: ...

    (modify next-to-last paragraph, p. 45) For uniform or shader storage
    blocks declared as an array, each individual array element corresponds to
    a separate buffer object bind range, backing one instance of the block.  As
    the array size indicates the number of buffer objects needed, uniform and
    shader storage block array declarations must specify an array size.  A
    uniform or shader storage block array can only be indexed with a
    dynamically uniform integral expression, otherwise results are undefined.

    (modify last paragraph of the section, p. 46) There are
    implementation-dependent limits on the number of uniform blocks and the
    number of shader storage blocks that can be used per stage.  If either
    limit is exceeded, it will cause a link error.


    Modify Section 4.4.1.2, Geometry Shader Inputs (p. 49)

    (modify example at the top of p. 51, since it's now legal to take the
    length of implicitly sized arrays)

      // code sequence within one shader...
      in vec4 Color1[];     // legal, size still unknown
      in vec4 Color2[2];    // legal, size is 2
      in vec4 Color3[3];    // illegal, input sizes are inconsistent
      layout(lines) in;     // legal for Color2, input size is 2, matching Color2
      in vec4 Color4[3];    // illegal, contradicts layout of lines
      layout(lines) in;     // legal, matches other layout() declaration
      layout(triangles) in; // illegal, does not match earlier layout()
                            // declaration


    Modify Section 4.4.3, Uniform Block Layout Qualifiers (p. 57).  Rename
    section title to "Uniform and Shader Storage Block Layout Qualifiers".

    (modify first paragraph) Layout qualifiers can be used for uniform and
    shader storage blocks, but not for non-block uniform declarations.  The
    layout qualifier identifiers for uniform and shader storage blocks are

      layout-qualifier-id
        shared
        packed
        std140
        std430
        row_major
        column_major
        binding = integer-constant

    (modify last paragraph, p. 57) Uniform and shader storage block layout
    qualifiers can be declared for global scope, on a single uniform or shader
    storage block, or on a single block member declaration.

    (modify first paragraph, p. 58) Default layouts are established (except
    for binding) at global scope for uniform blocks as

      layout(layout-qualifier-id-list) uniform;

    and for shader storage blocks as

      layout(layout-qualifier-id-list) buffer;

    ... The result becomes the new default qualification scoped to subsequent
    uniform or shader storage block definitions.

    (modify third paragraph, p. 58) The initial state of compilation is as if
    the following were declared:

      layout(shared, column_major) uniform;
      layout(shared, column_major) buffer;

    (modify fourth paragraph, p. 58) Uniform and shader storage blocks can be
    declared with optional layout qualifiers, and so can their individual
    member declarations.  Such block layout qualification is scoped only to the
    content of the block.  As with global layout declarations, block layout
    qualification first inherits from the current default qualification and
    then overrides it.  Similarly, individual member layout qualification is
    scoped just to the member declaration, and inherits from and overrides the
    block's qualification.

    (modify the fifth paragraph, p. 58) The shared qualifier overrides only
    the std140, std430, and packed qualifiers; other qualifiers are
    inherited.  The compiler/linker will ensure that multiple programs and
    programmable stages containing this definition will share the same memory
    layout for this block, as long as all arrays are declared with explicit
    sizes and all matrices have matching row_major and/or column_major
    qualifications (which may come from a declaration outside the block
    definition). ...

    (modify sixth paragraph, p. 58) The packed qualifier overrides only std140,
    std430, and shared; other qualifiers are inherited. ... Attempts to share
    a packed uniform or shader storage block across programs or stages will
    generally fail. ...

    (modify seventh paragraph, p. 58) The std140 and std430 qualifiers
    override only the packed, shared, std140, and std430 qualifiers; other
    qualifiers are inherited.  The std430 qualifier is supported only for
    shader storage blocks; a shader using the std430 qualifier on a uniform
    block will fail to compile. ...

    (modify eighth paragraph, p. 58) Layout qualifiers on member declarations
    cannot use the shared, packed, std140, or std430 qualifiers. ...

    (modify last paragraph, p. 58) The <binding> identifier specifies the
    buffer binding point corresponding to the uniform or shader storage block,
    which will be used to obtain the values of the member variables of the
    block.  It is an error to specify the binding identifier for the global
    scope or for block member declarations.  Any uniform or shader storage
    block declared without a binding identifier is initially assigned to block
    binding point zero.  After a program is linked, the binding points used
    for uniform and shader storage blocks declared with or without a binding
    identifier can be updated by the OpenGL API.

    (modify second paragraph, p. 59) If the <binding> identifier is used with
    a uniform or shader storage block instanced as an array then the first
    element of the array takes the specified block binding and each subsequent
    element takes the next consecutive block binding point.

    (modify third paragraph, p. 59) If the binding point for any uniform or
    shader storage block instance is less than zero or greater than or equal
    to the implementation-dependent maximum number of bindings for the block
    type (uniform or shader storage), a compilation error will occur.  When
    the binding identifier is used with a uniform or shader storage block
    instanced as an array of size <N>, all elements of the array from
    <binding> through <binding>+<N>-1 must be within this range.


    Modify Section 4.10, Memory Qualifiers (p. 71)

    (modify first paragraph of section, p. 71, removing the "Only" from "Only
    variables") Variables declared as image types (the basic opaque types with
    "image" in their keyword) can be qualified with a memory qualifier.

    (add to the end of the third paragraph, p. 73) ... It is an error to
    qualify an image variable with both "readonly" and "writeonly".

    (insert after third paragraph, p. 73) The memory qualifiers "coherent",
    "volatile", "restrict", "readonly", and "writeonly" may be used in the
    declaration of buffer variables (i.e., members of shader storage blocks).
    When a buffer variable is declared with a memory qualifier, the behavior
    specified for memory accesses involving image variables described above
    applies identically to memory accesses involving that buffer variable.  It
    is an error to assign to a buffer variable qualified with "readonly" or to
    read from a buffer variable qualified with "writeonly".

    Additionally, memory qualifiers may also be used in the declaration of
    shader storage blocks.  When a block declaration is qualified with a
    memory qualifier, it is as if all of its members were declared with the
    same memory qualifier.  For example, the block declaration

      coherent buffer Block {
        readonly vec4 member1;
        vec4 member2;
      };

    is equivalent to

      buffer Block {
        coherent readonly vec4 member1;
        coherent vec4 member2;
      };

    Memory qualifiers are only supported in the declarations of image
    variables, buffer variables, and shader storage blocks; it is an error to
    use such qualifiers in any other declaration.


    Modify Section 5.5, Vector and Scalar Components and Length, p. 79

    (modify last paragraph of section, p. 81) ... The type returned by
    .length() on a vector is int, and the value returned is considered a
    constant expression.


    Modify Section 5.6, Matrix Components, p. 81

    (modify last paragraph of section, p. 81) ... The type returned by
    .length() on a matrix is int, and the value returned is considered a
    constant expression.


    Modify Section 5.9, Expressions, p. 83

    (insert after 4th bullet of section, p.
83, correcting the oversight that 781 .length() can also be used on vectors and matrices) 782 783 * an expression of vector or matrix type with the length method applied 784 785 786 Insert new section after Section 8.10, Atomic Counter Functions (p. 149) 787 788 8.X Atomic Memory Functions 789 790 Atomic memory functions perform atomic operations on an individual signed 791 or unsigned integer found in buffer object or shared variable storage. 792 All of the atomic memory operations read a value from memory, compute a 793 new value using one of the operations described below, write the new value 794 to memory, and return the original value read. The contents of the memory 795 being updated by the atomic operation are guaranteed not to be modified by 796 any other assignment or atomic memory function in any shader invocation 797 between the time the original value is read and the time the new value is 798 written. 799 800 Atomic memory functions are supported only for a limited set of variables. 801 A shader will fail to compile if the value passed to the <mem> argument of 802 an atomic memory function does not correspond to a buffer or shared 803 variable. It is acceptable to pass an element of an array or a single 804 component of a vector to the <mem> argument of an atomic memory function, 805 as long as the underlying array or vector is a buffer or shared variable. 806 807 Functions: 808 809 uint atomicAdd(inout uint mem, uint data); 810 int atomicAdd(inout int mem, int data); 811 812 Computes a new value by adding the value of <data> to the contents 813 of <mem>. 814 815 uint atomicMin(inout uint mem, uint data); 816 int atomicMin(inout int mem, int data); 817 818 Computes a new value by taking the minimum of the value of <data> 819 and the contents of <mem>. 820 821 uint atomicMax(inout uint mem, uint data); 822 int atomicMax(inout int mem, int data); 823 824 Computes a new value by taking the maximum of the value of <data> 825 and the contents of <mem>. 
826 827 uint atomicAnd(inout uint mem, uint data); 828 int atomicAnd(inout int mem, int data); 829 830 Computes a new value by performing a bit-wise and of the value of 831 <data> and the contents of <mem>. 832 833 uint atomicOr(inout uint mem, uint data); 834 int atomicOr(inout int mem, int data); 835 836 Computes a new value by performing a bit-wise or of the value of 837 <data> and the contents of <mem>. 838 839 uint atomicXor(inout uint mem, uint data); 840 int atomicXor(inout int mem, int data); 841 842 Computes a new value by performing a bit-wise exclusive or of the 843 value of <data> and the contents of <mem>. 844 845 uint atomicExchange(inout uint mem, uint data); 846 int atomicExchange(inout int mem, int data); 847 848 Computes a new value by simply copying the value of <data>. 849 850 uint atomicCompSwap(inout uint mem, uint compare, uint data); 851 int atomicCompSwap(inout int mem, int compare, int data); 852 853 Compares the value of <compare> and the contents of <mem>. If the 854 values are equal, the new value is given by <data>; otherwise, it is 855 taken from the original contents of <mem>. 856 857Additions to the AGL/EGL/GLX/WGL Specifications 858 859 None 860 861GLX Protocol 862 863 TBD 864 865Dependencies on OpenGL 4.3 and ARB_compute_shader: 866 867 If OpenGL 4.3 and ARB_compute_shader are not supported, any references to 868 uses of shader storage blocks in compute shaders, as well as the enumerant 869 MAX_COMPUTE_SHADER_STORAGE_BLOCKS, should be removed. Additionally, this 870 extension provides GLSL atomic memory functions that can be used with 871 buffer variables (from this extension) and shared variables (from 872 ARB_compute_shader). If ARB_compute_shader is not supported, references 873 to shared variables should be removed from the language describing these 874 functions. 875 876 Note that no "#extension" directive is necessary to use atomic memory 877 functions on shared variables in compute shaders. 
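    The read-modify-write behavior of the atomic memory functions in Section
    8.X can be illustrated on the CPU. The following is a minimal sketch in
    C11, not GLSL and not part of this extension; the helper names
    glsl_atomic_add and glsl_atomic_comp_swap are hypothetical and exist only
    for this illustration. Each helper returns the original contents of <mem>,
    as the GLSL functions are specified to do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU-side stand-in for GLSL atomicAdd(): adds <data> to
 * <mem> and returns the value that <mem> held before the addition. */
static uint32_t glsl_atomic_add(_Atomic uint32_t *mem, uint32_t data)
{
    assert(mem != NULL);
    return atomic_fetch_add(mem, data);
}

/* Hypothetical stand-in for GLSL atomicCompSwap(): stores <data> only if
 * <mem> equals <compare>; returns the original contents either way. */
static uint32_t glsl_atomic_comp_swap(_Atomic uint32_t *mem,
                                      uint32_t compare, uint32_t data)
{
    uint32_t expected = compare;
    assert(mem != NULL);
    /* On failure, <expected> is overwritten with the current contents of
     * <mem>.  On success it still holds <compare>, which equals the
     * original contents.  In both cases it is the value to return. */
    atomic_compare_exchange_strong(mem, &expected, data);
    return expected;
}
```

    For example, with <mem> initially 5, glsl_atomic_add(&mem, 3) returns 5
    and leaves 8 in <mem>; a following glsl_atomic_comp_swap(&mem, 7, 1)
    returns 8 and leaves <mem> unchanged because the comparison fails.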

Dependencies on OpenGL 4.3 and ARB_program_interface_query

    If neither OpenGL 4.3 nor ARB_program_interface_query is supported, it
    wouldn't be possible to use GLSL query APIs to enumerate active buffer
    variables and shader storage blocks used by a program. We require that
    OpenGL 4.3 or ARB_program_interface_query be supported; this shouldn't be
    a problem for any implementations of this extension.

Dependencies on NV_bindless_texture

    If NV_bindless_texture is supported (and enabled via the #extension
    directive), the restriction that image and sampler variables must be
    uniform variables not in blocks is lifted. In this case, image and
    sampler variables may be members in shader storage blocks.

    If an image variable is declared as a member of a shader storage block,
    the memory qualifiers on such variable declarations apply to the memory
    holding the block member and *not* the memory referenced by the image. If
    it is necessary to apply a memory qualifier to the memory referenced by an
    image variable found inside a shader storage block, it's possible to embed
    the image variable declaration in a structure and then embed the structure
    in a block. In the following example:

      struct S {
        readonly image2D x;
      };
      buffer Block {
        S m;
      };

    "readonly" is considered to apply to the memory pointed to by the image
    variable <x>. In this example:

      buffer Block {
        readonly image2D m;
      };

    "readonly" is considered to apply to the memory holding the image handle.
    It would be illegal to write to <m>, but it would be legal to write to
    the texture memory pointed to by <m> (i.e., you can pass <m> to
    imageStore).

Errors

    INVALID_VALUE is generated by BindBufferRange if <target> is
    SHADER_STORAGE_BUFFER and <index> is greater than or equal to the value of
    MAX_SHADER_STORAGE_BUFFER_BINDINGS.

    INVALID_VALUE is generated by BindBufferRange if <target> is
    SHADER_STORAGE_BUFFER and <offset> is not a multiple of the value of
    SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT.

    INVALID_VALUE is generated by ShaderStorageBlockBinding if
    <storageBlockIndex> is not an active shader storage block index of
    <program>.

    INVALID_VALUE is generated by ShaderStorageBlockBinding if
    <storageBlockBinding> is greater than or equal to the value of
    MAX_SHADER_STORAGE_BUFFER_BINDINGS.

New State

    Add new table, labeled "Shader Storage Buffer State", after Table 6.58
    (Atomic Counter State), p. 562:

                                                          Initial
    Get Value                      Type  Get Command      Value    Description                 Sec.
    -----------------------------  ----  ---------------  -------  --------------------------  ------
    SHADER_STORAGE_BUFFER_BINDING  Z+    GetIntegerv      0        Current value of generic    2.14.X
                                                                   shader storage buffer
                                                                   binding
    SHADER_STORAGE_BUFFER_BINDING  n*Z+  GetIntegeri_v    0        Buffer object bound         2.14.X
                                                                   to each shader storage
                                                                   buffer binding point
    SHADER_STORAGE_BUFFER_START    n*Z+  GetInteger64i_v  0        Start offset of             2.14.X
                                                                   binding range for each
                                                                   shader storage buffer
    SHADER_STORAGE_BUFFER_SIZE     n*Z+  GetInteger64i_v  0        Size of binding range for   2.14.X
                                                                   each shader storage buffer

New Implementation Dependent State

    Add to Table 6.66, Implementation Dependent Vertex Shader Limits, p. 570

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_VERTEX_SHADER_STORAGE_BLOCKS  Z+    GetIntegerv    0              Number of shader storage      2.14.X
                                                                          blocks accessed by a
                                                                          vertex shader

    Add to Table 6.67, Implementation Dependent Tessellation Shader Limits, p. 571

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_TESS_CONTROL_SHADER_          Z+    GetIntegerv    0              Number of shader storage      2.14.X
    STORAGE_BLOCKS                                                        blocks accessed by a
                                                                          tess. control shader
    MAX_TESS_EVALUATION_SHADER_       Z+    GetIntegerv    0              Number of shader storage      2.14.X
    STORAGE_BLOCKS                                                        blocks accessed by a
                                                                          tess. evaluation shader

    Add to Table 6.68, Implementation Dependent Geometry Shader Limits, p. 572

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_GEOMETRY_SHADER_STORAGE_      Z+    GetIntegerv    0              Number of shader storage      2.14.X
    BLOCKS                                                                blocks accessed by a
                                                                          geometry shader

    Add to Table 6.69, Implementation Dependent Fragment Shader Limits, p. 573

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_FRAGMENT_SHADER_STORAGE_      Z+    GetIntegerv    8              Number of shader storage      2.14.X
    BLOCKS                                                                blocks accessed by a
                                                                          fragment shader

    Add to new table in ARB_compute_shader, Implementation Dependent Compute Shader Limits

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_COMPUTE_SHADER_STORAGE_       Z+    GetIntegerv    8              Number of shader storage      2.14.X
    BLOCKS                                                                blocks accessed by a
                                                                          compute shader

    Add to Table 6.70, Implementation Dependent Aggregate Shader Limits, p. 574

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_COMBINED_SHADER_STORAGE_      Z+    GetIntegerv    8              Number of shader storage      2.14.X
    BLOCKS                                                                blocks accessed by a
                                                                          program
    MAX_SHADER_STORAGE_BLOCK_SIZE     Z+    GetInteger64v  2^24           Maximum size in basic         2.14.X
                                                                          machine units of a shader
                                                                          storage block
    SHADER_STORAGE_BUFFER_OFFSET_     Z+    GetIntegerv    256            Minimum required alignment    2.14.X
    ALIGNMENT                                                             for shader storage buffer
                                                                          binding offsets
    MAX_SHADER_STORAGE_BUFFER_        Z+    GetIntegerv    8              Maximum number of shader      2.14.X
    BINDINGS                                                              storage buffer bindings
                                                                          in the context

    Modify Table 6.71, Implementation Dependent Aggregate Shader Limits (cont.), p. 575

    Get Value                         Type  Get Command    Minimum Value  Description                   Sec.
    --------------------------------  ----  -------------  -------------  ----------------------------  -------
    MAX_COMBINED_SHADER_OUTPUT_       Z+    GetIntegerv    8              limit on active image         3.10.22
    RESOURCES                                                             units, shader storage
                                                                          blocks, and fragment outputs

    (The only change here is a rename of the token formerly called
    MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS.)

Sample Code

    The following example code records a list of fragment (x,y) coordinates
    and colors in rasterized primitives into a buffer object. Fragment shader
    code would include:

      #extension GL_ARB_shader_storage_buffer_object : require

      // Use an atomic counter to keep a running count of the number of
      // fragments recorded in the shader storage buffer.
      layout(binding=0, offset=0) uniform atomic_uint fragmentCounter;

      // Keep a uniform with the number of fragments that can be recorded in
      // the buffer.
      uniform uint maxFragmentCount;

      // Structure with the per-fragment information to record.
      struct FragmentData {
        ivec2 position;
        vec4  color;
      };

      // Shader storage block holding an array <fragments> declared without
      // a fixed size. Application code should determine how many fragments
      // it wants to record and allocate a buffer appropriately. With the
      // "std140" layout, each FragmentData record will take 32B. With the
      // "shared" and "packed" layouts, the stride of the array is
      // implementation-dependent. The "binding=2" layout qualifier says
      // that the block <Fragments> should be associated with shader storage
      // buffer binding point #2.
      layout(std140, binding=2) buffer Fragments {
        FragmentData fragments[];
      };

      in vec4 color;

      void main()
      {
        uint fragmentNumber = atomicCounterIncrement(fragmentCounter);
        if (fragmentNumber < maxFragmentCount) {
          fragments[fragmentNumber].position = ivec2(gl_FragCoord.xy);
          fragments[fragmentNumber].color = color;
        }
      }

    In application code

      #define NFRAGMENTS      100000
      #define FRAGMENT_SIZE   32        // known due to "std140" usage

      GLuint fragmentBuffer, counterBuffer;

      // Generate, bind, and specify the data store to hold fragments. The
      // NULL pointer in BufferData says that the initial buffer contents are
      // undefined. They will be filled in by the fragment shader code.
      glGenBuffers(1, &fragmentBuffer);
      glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 2, fragmentBuffer);
      glBufferData(GL_SHADER_STORAGE_BUFFER, NFRAGMENTS*FRAGMENT_SIZE,
                   NULL, GL_DYNAMIC_DRAW);

      // Generate, bind, and specify the data store for the atomic counter.
      glGenBuffers(1, &counterBuffer);
      glBindBufferBase(GL_ATOMIC_COUNTER_BUFFER, 0, counterBuffer);
      glBufferData(GL_ATOMIC_COUNTER_BUFFER, sizeof(GLuint), NULL,
                   GL_DYNAMIC_DRAW);

      // Reset the atomic counter to zero, then draw stuff. This will record
      // values into the shader storage buffer as fragments are generated.
      GLuint zero = 0;
      glBufferSubData(GL_ATOMIC_COUNTER_BUFFER, 0, sizeof(GLuint), &zero);
      glUseProgram(program);
      glDrawElements(GL_TRIANGLES, ...);

      // You could inspect the contents with a call such as:
      void *ptr = glMapBuffer(GL_SHADER_STORAGE_BUFFER, GL_READ_ONLY);
      ...
      glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);

      // You could also use the storage buffer contents for vertex pulling.
      // The glMemoryBarrier() command ensures that the data writes to the
      // storage buffer complete prior to vertex pulling. Note that
      // glVertexAttribIPointer() takes no <normalized> argument.
      glMemoryBarrier(GL_VERTEX_ATTRIB_ARRAY_BARRIER_BIT);
      glBindBuffer(GL_ARRAY_BUFFER, fragmentBuffer);
      glVertexAttribIPointer(0, 2, GL_INT, FRAGMENT_SIZE,
                             (void*)0);
      glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, FRAGMENT_SIZE,
                            (void*)16);
      glEnableVertexAttribArray(0);
      glEnableVertexAttribArray(1);
      glDrawArrays(GL_POINTS, ...);

Conformance Tests

    TBD

Issues

    (1) The main goal of this extension is to allow C-style GLSL shader code
        to write to buffer objects without using roundabout hacks like
        creating buffer textures and using shader image loads and stores.
        What other approaches could we take to achieving the same thing?

      RESOLVED: We are using "shader storage blocks" as an abstraction
      similar to uniform blocks, except that we allow shaders to write to
      "shader storage blocks". Other options considered include:

      - Use uniform blocks, but with a special layout qualifier (e.g.,
        "writeonly" or "readwrite") that implies different semantics and
        implementation-dependent limits. This would avoid the need for a new
        storage qualifier in the shading language, and could also avoid adding
        new GL APIs to enumerate active buffer variables and shader storage
        blocks.
However, it would have the disadvantage of shoehorning two 1143 features, which might be implemented very differently in hardware, 1144 into a single abstraction. 1145 1146 - Use C-style pointer syntax as in NV_shader_buffer_store, but treat the 1147 pointers as referring to a buffer binding rather than a specific GPU 1148 address. In this approach, pointers might be required to be uniform. 1149 (In NV_shader_buffer_store, pointers are just data. They can be 1150 passed as uniforms, uniform block members, shader inputs/outputs, 1151 reconstructed from texture data, or however the application wants to 1152 pass them.) 1153 1154 (2) When using shader storage blocks to append records to a buffer, the 1155 storage is provided by a buffer object. There doesn't seem to be any 1156 reason why the shader really needs to know the "length" of the buffer. 1157 It might therefore want to declare global storage blocks containing 1158 unsized arrays. Should we allow this? If so, how does that interact 1159 with bounds checking? What does it mean for the ".length()" method in 1160 GLSL? What would happen if you tried to pass such an array as a 1161 function parameter? What does it mean for a possible introspection 1162 API allowing applications to query how big the block needs to be? 1163 1164 RESOLVED: We will support shader storage blocks whose last member is an 1165 unsized array. For this unsized array, the effective size will be 1166 determined at run-time from the size of the data store. Such unsized 1167 arrays can be indexed with general integer expressions (other than 1168 negative constant expressions, which are generally forbidden for array 1169 indexing in GLSL). The ".length()" method is not supported, nor is 1170 passing the array as a function argument. 
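      As a rough illustration of sizing the data store for a trailing unsized
      array, the sketch below computes how many whole array elements fit in
      a bound range. The text above says only that the effective size is
      determined from the size of the data store; the exact formula used
      here, and the helper name unsized_array_capacity, are assumptions made
      for this example. The array's byte offset and stride would come from
      the layout rules (e.g., "std140") or from a program interface query.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of whole elements of a trailing unsized
 * array that fit in a bound range of <range_size> bytes, given the byte
 * offset of the array within the block and the array stride. */
static size_t unsized_array_capacity(size_t range_size, size_t array_offset,
                                     size_t array_stride)
{
    assert(array_stride > 0);
    if (range_size <= array_offset) {
        return 0;   /* the range ends before the array begins */
    }
    return (range_size - array_offset) / array_stride;
}
```

      With the values from the Sample Code section (a 100000 * 32 byte
      buffer, array at offset 0, stride 32), this yields 100000 elements.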
1171 1172 When using the ARB_program_interface_query extension to enumerate the 1173 set of active buffer variables, only the first element of arrays (sized 1174 or unsized) will be enumerated; the array size and offsets for array 1175 elements other than the first can be determined by querying the 1176 TOP_LEVEL_ARRAY_SIZE and TOP_LEVEL_ARRAY_STRIDE properties of the buffer 1177 variable. 1178 1179 The bounds checking rules for unsized arrays at the end of shader 1180 storage blocks are the same as for uniform blocks. If the array is 1181 accessed using an index pointing at memory beyond the end of the buffer 1182 object associated with the shader storage blocks, the results are 1183 undefined and can lead to program termination; see also issue (7). 1184 1185 Other options considered here included having the shader declare an 1186 array with a dummy size that's either unrealistically small (1 or 2) or 1187 unrealistically large, and providing guarantees like: 1188 1189 - (small) if the last element of the storage block is an array, we 1190 have defined behavior for indexed accesses off the end of the array, 1191 as long as the effective offset is contained within the buffer; or 1192 1193 - (large) if the buffer is too small for the large declared array, 1194 we have defined behavior for accesses to array elements as long as 1195 the effective offset is contained within the buffer. 1196 1197 Note that it wouldn't be possible for the application to determine the 1198 stride of an array of structures if it were declared with a size of 1. 1199 For a size of 2 or larger, you could use 1200 1201 offset(array[1].member) - offset(array[0].member) 1202 1203 for "shared" layouts at least, but that's not possible if there is no 1204 "array[1].member". 1205 1206 (3) Do we allow arrays of shader storage blocks? 

      RESOLVED: Yes; we already allow arrays of uniform blocks, where each
      block instance has an identical layout but is backed by a separate
      buffer object. It seems like we should do this here for consistency.

      If we had overloaded the existing uniform block APIs (e.g., by applying
      a "readwrite" layout qualifier to uniform blocks), it would be really
      weird if we disallowed arrays of writeable uniform blocks since we
      already allow it for regular (read-only) uniform blocks.

    (4) We have typically provided some sort of "introspection" API where
        application code written with no explicit knowledge of the shaders
        used can discover properties of active variables. Should we provide
        some here? If so, any pitfalls?

      RESOLVED: Yes, we will provide an introspection API, but not as part of
      this extension. Instead, we require support for the
      ARB_program_interface_query extension, which provides a generic
      mechanism for enumerating the set of active resources for a number of
      "interfaces". This API includes interfaces for all active shader
      storage blocks as well as all active buffer variables. Supporting
      enumeration of these new resources was one of the primary motivations
      for the generic ARB_program_interface_query extension; however, that
      extension also added enumeration support for other resources that
      previously had no enumeration API.

      The enumeration of buffer variables follows slightly different rules
      than other variables; in particular, only the first element of each
      member declared as an array is enumerated. The previous enumeration
      rules would have awful consequences when applied to large arrays of
      structures in shader storage blocks. For example, under those rules
      the following declaration would report 80K active variables (two per
      array element), starting with "records[0].position" and ending with
      "records[39999].texcoord". Ouch!

        struct FragmentData {
          vec4 position;
          vec2 texcoord;
        };
        buffer FragmentInfo {
          FragmentData records[40000];
        };

      Regular uniforms and UBOs also have exactly the same problem; the
      primary difference is that current implementation limits on uniform
      storage provide a bound on how bad this could get. Even those limits
      might not actually bound the GetActiveUniform* badness, as the spec
      doesn't require a program to link successfully for GetActiveUniform* to
      enumerate uniforms.

    (5) Uniform blocks already have a well-established usage model, for which
        implementations may have dedicated support as well as limits that
        reflect this usage model. If we were to overload uniform blocks, some
        new uses might not meet this limit and usage model. Is that a
        problem?

      RESOLVED: Yes, it could have been a problem if we had overloaded
      uniform blocks. Implementations may be able to distinguish between
      different types of uniform blocks, which might be implemented
      differently. One might be able to distinguish based on the size of the
      block as well as the layout qualifier (i.e., "readwrite" might be
      "different" than "readonly").

      Note that if an implementation wants to use the size of the block as a
      factor for determining how the block is accessed, this would introduce a
      new wrinkle into the unsized array use case above. That might not be a
      huge deal; implementations could make a worst-case assumption and treat
      the effective size of an unsized array as resulting in a maximum-size
      buffer object.

      Note that this consideration applies equally to purely read-only uniform
      storage. For example, implementations might have a limit on the size of
      uniform blocks that can be accessed by shaders with accelerated hardware
      support.
      However, applications might well want to store large data sets
      in buffer objects and access them using random-access reads in shader
      code. OpenGL 4.2's mechanisms allow data to be pulled from buffer
      objects for vertex shaders using vertex buffers (but only using the
      vertex/instance number as an index). Data can also be read from a
      texture buffer object via texelFetch(), but that doesn't allow for more
      complex data structures (as noted in the "write" example above).
      It would be desirable to have a mechanism to allow random access reads
      to "large" buffer objects, even if the implementation and performance
      characteristics are different from regular UBO usage.

      NVIDIA's NV_shader_buffer_load extension fills this need by allowing the
      use of read-only pointers. That extension has been supported for a
      longer time and is supported on more platforms than the
      NV_shader_buffer_store mentioned above.

    (6) The maximum size of uniform blocks on typical OpenGL 3/4
        implementations is 64KB. Is this good enough for shader storage
        buffers, or do we need a higher limit?

      RESOLVED: 64K is not good enough; a higher limit is required. The
      current specification requires implementations to support shader storage
      blocks of at least 2^24 bytes (16MB). Implementations may support
      larger sizes; the maximum size can be determined by querying
      MAX_SHADER_STORAGE_BLOCK_SIZE. Because implementations may choose to
      support block sizes >= 2^31 bytes, applications should query the maximum
      size with GetInteger64v().

    (7) How are write accesses to shader storage blocks bounds-checked?

      RESOLVED: For shader storage blocks, we use the same language found in
      the current OpenGL 4.2 specification for uniform blocks, which
      guarantees no bounds-checking:

        | If any active uniform block is not backed by a sufficiently
        | large buffer object, the results of shader execution are
        | undefined, and may result in GL interruption or termination.

      It would be desirable to have a "robustness" feature that provides
      more solid guarantees when accessing outside the bounds of a buffer
      object range. However, such a feature is not present in the existing
      ARB_robustness specification and is considered orthogonal to the
      functionality being added here.

      If we were to add bounds-checking here or in the future, there may still
      be issues of how bounds-checking would be performed, with multiple use
      cases. For example, some existing UBO hardware might include hardware
      bounds checking (e.g., return zeroes if accessing off the end of a
      buffer object), but that support might not be extended to cover writes
      or even some other read-only use cases.

      If shader-based bounds checking is required, using code inserted by the
      compiler, we'd have to figure out how to specify it. In particular,
      we'd have to figure out what granularity the check should be done at.
      At the byte/word level? Using the first index of the array?

        struct FragmentData {
          vec4 position;
          vec2 texcoord;
        };
        layout(writeonly,binding=2) uniform FragmentInfo {
          FragmentData records[40000];
        };

      In the example above, let's assume that the structure was tightly
      packed, where each element of <records> requires exactly 24 bytes -- 16
      for <position> and 8 for <texcoord>. If we bound a 32-byte buffer, what
      would happen to reads/writes of records[1].position? Are reads/writes
      of the x/y components guaranteed to work, with "out-of-bounds" behavior
      on z/w? What about a 31-byte buffer -- do you read/write partial data
      for records[1].position.y? What about a 40-byte buffer, which contains
      sufficient storage for all of records[1].position? Is it guaranteed to
      work, or should we allow implementations to treat accesses to array
      elements as out of bounds unless the buffer contains storage for the
      entire element (including records[1].texcoord in this case)?

    (8) Should we provide new "packing" layout qualifiers to augment the
        existing vec4-centric "std140" rule for uniform blocks?

      RESOLVED: Yes, add a new "std430" layout that provides for tighter
      packing of arrays and structures. With "std140", the base alignment of
      arrays of scalars and vectors and of structures is always a multiple of
      the base alignment of a vec4 (16B), which means that the stride of an
      array of type "float", "int", or "uint" is 16B instead of 4B. With
      "std430", such arrays will now be tightly packed.

      Note that in the "std430" packing, arrays of vec3s are still not tightly
      packed; vec3 types still require a 16B alignment as in "std140".

      Note that the "std430" layout is supported only for shader storage
      blocks, and not for uniform blocks.

    (9) Should we allow memory qualifiers ("coherent", "volatile", "restrict",
        "readonly", and "writeonly") to apply to entire shader storage blocks?
        To individual shader storage block members?

      RESOLVED: We allow memory qualifiers to apply to both shader storage
      blocks and block members (buffer variables). When a memory qualifier is
      applied to a block declaration, it is considered to apply to all block
      members.

      Note that the extension NV_bindless_texture allows image variables
      (which accept memory qualifiers) to be declared as members of shader
      storage blocks (which also accept memory qualifiers). This spec adds an
      interaction that says that if this case occurs, the qualifier is
      considered to apply to the image handle stored in the block, and not to
      the memory referenced by the image.

    (10) Should we allow mutable assignments of storage blocks to binding
         points?

      RESOLVED: Yes, allow them in a manner similar to uniform blocks.
      OpenGL 4.2's atomic counter buffer feature requires the "binding=N"
      layout in atomic counter declarations and doesn't let you change the
      binding used post-link. However, we decided to use the same behavior as
      uniform blocks, since the functionality seems so similar.

    (11) Is this extension/feature really needed? Isn't it possible to do
         something similar in unextended OpenGL 4.2?

      RESOLVED: Yes, it's possible to achieve similar functionality in
      unextended OpenGL 4.2, but something cleaner is clearly desirable.

      One of the intended uses of OpenGL 4.2's atomic counter feature
      (ARB_shader_atomic_counters) is to allow shader invocations to write
      values generated by shaders into a buffer object, using the atomic
      counters to reserve a unique slot number in an array of outputs. The
      array itself is accessed by associating the buffer object with a buffer
      texture (ARB_texture_buffer_object) and writing to that texture using
      shader image stores (ARB_shader_image_load_store). There are a number
      of unfortunate limitations of this approach:

        * Buffers written to using image stores must have a 1- to 4-component
          texture format associated with them. It's not possible to write out
          an array of structures, though one can use multiple buffers with
          each buffer holding a separate member.

        * The image store function takes a canonical vec4/ivec4/uvec4 value to
          write, regardless of the value stored. If you're only storing a
          float or a vec2, you need to use a constructor (or a swizzle hack)
          to generate a vec4 in which the extra components are ignored.

        * The image store function takes signed integer coordinates (like the
          texelFetch built-ins). However, the atomic counter returns an
          unsigned value, and GLSL doesn't support implicit conversions from
          unsigned to signed.

        * Image stores to buffers require the use of a buffer texture, even
          though we don't ever use it as a texture.

      The solution offered here is far more direct -- shader code simply
      declares the format of the buffer object as an interface block and can
      read and write the buffer using normal shader code.

    (12) Are there other extensions providing similar functionality?

      RESOLVED: Yes. The NVIDIA extension NV_shader_buffer_store also
      provides a mechanism where buffer objects can be written to with regular
      shader code. Using that extension, an application is able to query a
      GPU address of a buffer, make that buffer resident, and then access the
      buffer in GLSL code using the queried GPU address as a pointer.
      Applications using NV_shader_buffer_store are required to ensure that
      pointers are valid; no automatic bounds checking is provided.

      This proposed extension is intended to provide GLSL functionality
      similar to what you can get with NV_shader_buffer_store, but without
      general pointers. Instead, this extension uses bindings, with shader
      code effectively extracting a pointer from the bound buffer.

    (13) Do we need some sort of limit on the combined sum of actively used
         shader storage blocks and other resources, similar to what we had for
         image units in OpenGL 4.2
         (MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS)?

      RESOLVED: Yes. For this extension, we just add shader storage blocks
      to the set of resources that have a combined limit and also create a new
      general token name (MAX_COMBINED_SHADER_OUTPUT_RESOURCES) that is an
      alias of the old combined limit token.

      Some OpenGL 4.2 and 4.3 implementations need to share a single set of
      internal hardware resources to handle fragment shader outputs, image
      loads and stores (from OpenGL 4.2 and ARB_shader_image_load_store), as
      well as shader storage buffers. We specify that a link error will occur
      if a program requires more of these internal resources than are
      available. It is expected that implementations without a need for a
      combined limit will expose a limit greater than or equal to the sum of
      the individual limits for each shader stage and resource type.

      This link error has interaction problems with the
      ARB_separate_shader_objects extension and OpenGL 4.1. When linking a
      separable program, the linker will not know anything about the usage of
      fragment shader outputs, image units, and shader storage blocks from
      other programs that could be in use at the same time as the program
      being linked. This makes it seemingly impossible to enforce a combined
      limit. In practice, this is unlikely to be a problem because the
      implementations needing to enforce this combined limit will support the
      use of image uniforms and shader storage blocks only in fragment and
      compute shaders, and those two stages can't run concurrently.

    (14) Are accesses to shader storage buffers coherent with other accesses
         to the same underlying resource (e.g., image loads/stores, texture
         fetches)? In the same shader invocation? In different shader
         invocations?

      RESOLVED: No; we don't guarantee coherent accesses between shader
      resources of different types.
      Spec language corresponding to this issue will be proposed outside this
      extension.

    (15) Do we really need to have a combined limit on the sum of the number
         of active shader storage blocks for each program stage
         (MAX_COMBINED_SHADER_STORAGE_BLOCKS)?

      RESOLVED: We include such a limit, following the precedent of providing
      a combined limit for each new resource with per-stage limits. It's not
      clear that this combined limit is needed by any current implementation,
      though we envision an implementation that could have a set of physical
      resources shared between shader stages without providing a full set of
      resources for every stage.

      Some implementations do need a combined limit on the number of fragment
      shader outputs, image uniforms, and shader storage blocks, which is
      handled by the separate MAX_COMBINED_SHADER_OUTPUT_RESOURCES limit
      discussed in issue (13).

    (16) How does an application determine the required buffer object size for
         a shader storage block whose last member is an unsized array?

      RESOLVED: The ARB_program_interface_query extension includes a property
      BUFFER_SIZE that can be queried for active shader storage blocks. For
      blocks where all members have known storage requirements, the value of
      this property gives the minimum buffer size required to back the shader
      storage block.

      For shader storage blocks ending in an unsized array, the BUFFER_SIZE
      property returns the minimum buffer size needed to store a single
      element in the unsized array. The actual storage requirements are a
      function of the number of elements the application wants to store in the
      buffer object.
      If an application needs to store N elements in the unsized array, the
      required size can be derived by

        minimum_size = buffer_size + (N-1) * top_level_stride

      where <buffer_size> is the value of the BUFFER_SIZE property of the
      shader storage block, and <top_level_stride> is the value of the
      TOP_LEVEL_ARRAY_STRIDE property for the unsized array.

      Note that when using the "std140" layout qualifier, applications can
      determine the layout of shader storage blocks without any queries by
      following the layout rules documented in the API specification.

    (17) Should we provide GLSL constants for the implementation-dependent
         limits in this specification (e.g., gl_MaxVertexShaderStorageBlocks)?

      RESOLVED: No. It's not clear that these constants are of any real
      value, and they've been specified inconsistently. In particular, we
      have a bunch of constants for atomic counters, atomic counter buffers,
      and image units/uniforms, but we don't have any such constants for
      uniform blocks (ARB_uniform_buffer_object).

    (18) Other than the last member of a shader storage block, should we allow
         block members declared without an explicit size?

      RESOLVED: Yes, for consistency with the rest of GLSL. GLSL in general
      allows for arrays declared without a size. Such arrays are implicitly
      sized by the compiler based on usage. For example, if a shader includes
      code such as:

        uniform int array[];                    // no explicit size
        ...
        expression = array[2] * array[9];       // only references to <array>

      the array is likely to be implicitly sized to 10 elements, since it
      needs to provide storage for array[9]. These implicitly sized arrays
      are also permitted in interface blocks, such as uniform blocks.

      When an implicitly sized array is declared in shader code, there are
      limitations on how the array can be used.
      Such arrays may not be passed to functions in their entirety or used
      with the ".length()" method. Additionally, the array may only be indexed
      with integer constant expressions.

      If the last member of a shader storage block is declared as an array
      without an explicit size, it will be considered to be an explicitly
      unsized array whose size will be inferred at run-time based on the
      provided buffer object. Such arrays can be indexed with arbitrary
      expressions, but cannot be passed as function arguments or be used with
      the ".length()" method.

      Note that shaders should avoid using implicitly sized arrays in uniform
      or shader storage blocks that use the "shared" or "std140" layout
      qualifier. In this case, the size will be inferred by the compiler
      based on shader code and might not be computed identically for multiple
      programs using the same block.

    (19) Should the ".length()" method be supported for unsized arrays at the
         end of a shader storage block? If not, how can shader code determine
         the effective size of an unsized array?

      RESOLVED: In previous versions of GLSL, the ".length()" method is not
      supported for arrays without a declared size, which means that its value
      is known at compile time. As a result, the value returned by
      ".length()" is considered a constant expression.

      In this extension, we allow unsized arrays at the end of shader storage
      blocks, and allow the ".length()" method to be used to determine the
      size of such arrays based on the size of the provided buffer object.
      The array size can be derived by reversing the process described in
      issue (16):

        array.length() =
          max((buffer_object_size - offset_of_array) / stride_of_array, 0)

      Given that we will support the ".length()" method on unsized arrays, we
      will also support it on implicitly sized arrays for consistency.
      For such arrays, the array size will be determined at link time but
      will not be considered a constant expression.


Revision History

    Revision 16, April 28, 2014 (pbrown)
      - Fix typo in description of MAX_COMBINED_SHADER_STORAGE_BLOCKS.

    Revision 15, September 23, 2013 (Jon Leech)
      - Fix typo ShaderStorageBinding -> ShaderStorageBlockBinding in the
        description of that command (Bug 10715).

    Revision 14, September 6, 2013 (Jon Leech)
      - Fix typo SHADER_STORAGE_BLOCK -> SHADER_STORAGE_BUFFER in the
        description of ShaderStorageBlockBinding (Bug 10795).

    Revision 13, June 1, 2012 (pbrown)
      - Mark issues (8) and (9) as resolved.

    Revision 12, May 31, 2012 (pbrown)
      - Modify spec to allow the "std430" layout qualifier only on shader
        storage blocks, not uniform blocks (bug 8992).

    Revision 11, May 14, 2012 (pbrown)
      - Further clarify the interaction with ARB_compute_shader on atomic
        memory functions; add a clarification that no #extension directive is
        needed to use these functions on shared memory variables in compute
        shaders.

    Revision 10, May 8, 2012 (pbrown)
      - Add explicit language specifying that the value returned by the
        .length() method for unsized arrays is undefined when the array is in
        an array of blocks dereferenced with an out-of-bounds index.

    Revision 9, May 7, 2012 (pbrown)
      - Allow the use of the .length() method on unsized and implicitly sized
        arrays. For unsized arrays in shader storage blocks, .length() will
        be computed from the size of the associated buffer object. For
        implicitly sized arrays, .length() will be determined at link time.

    Revision 8, May 3, 2012 (pbrown)
      - Add a "std430" layout qualifier supporting more tightly packed arrays
        and structures relative to "std140" for issue (8).
      - Add support for memory qualifiers on shader storage block declarations
        for issue (9); also add more explicit language on how these qualifiers
        work on buffer variables.
      - Add spec language making it illegal to use "readonly" and "writeonly"
        memory qualifiers on the same declaration.
      - Remove built-in constants for shader storage block implementation
        limits, as described in issue (17).
      - Mark various spec issues as resolved per the Khronos F2F.
      - Add interaction with NV_bindless_texture, describing the behavior of
        memory qualifiers on image variables inside shader storage blocks.

    Revision 7, April 25, 2012 (pbrown)
      - Remove the GLSL spec language generally disallowing unsized arrays in
        interface blocks (bug 8837). We have supported implicitly sized
        arrays in blocks in previous versions of GLSL and decided to retain
        backward compatibility.
      - Added a warning in the description of the "shared" layout qualifier
        indicating that such blocks might not be shareable between programs if
        they contain implicitly-sized array members.
      - Minor typo/wording fixes.
      - Fixed token table to describe all the general query functions
        (e.g., GetIntegerv, GetInteger64v) where certain tokens can be used.
      - Update the spec to require dynamically uniform indexing on arrays of
        shader storage blocks.
      - Added issues (18) and (19).

    Revision 6, April 16, 2012 (pbrown)
      - Tentatively add built-in constants for implementation limits on shader
        storage blocks, as well as new issue (17) on the topic.

    Revision 5, April 13, 2012 (pbrown)
      - Add missing #extension and #define built-in documentation for the GLSL
        part of the extension.
      - Add GLSL spec language documenting support for unsized arrays at the
        end of shader storage blocks.
      - Add GLSL spec language generally disallowing unsized arrays in
        interface blocks, including input/output blocks, uniform blocks, and
        shader storage buffers (bug 8837). This borrows from similar language
        where unsized arrays are not permitted in structures.
      - Extend the tables describing API tokens enumerating GLSL types to
        indicate the set of types that can be used for buffer variables.
      - Add sample code.
      - Update language for several issues, and mark them as resolved.
      - Add an issue indicating how an application can determine the required
        size of a shader storage buffer when using unsized arrays.

    Revision 4, April 12, 2012 (pbrown)
      - Remove the enumeration APIs for buffer variables and shader storage
        blocks; these resources can only be enumerated using the new APIs
        provided by the ARB_program_interface_query extension.
      - Add an interaction with ARB_program_interface_query, and have this
        spec require that extension to ensure that the queries are available.
      - Add a new interaction with ARB_compute_shader; the atomic memory
        functions provided in this extension for buffer variables can also be
        used for shared variables in compute shaders. Also add new compute
        shader limit for active storage blocks.
      - Add values for new enumerants in this extension.
      - Fix up the "New Procedures and Functions" and "New Tokens" sections.
      - Assign enumerant values for all tokens.
      - Add a new token MAX_COMBINED_SHADER_OUTPUT_RESOURCES that's an alias
        for MAX_COMBINED_IMAGE_UNITS_AND_FRAGMENT_OUTPUTS. That combined
        limit now needs to apply to fragment outputs, image units, and shader
        storage blocks.
      - General cleanup of API specification language for shader storage
        blocks.
      - Add documentation of per-stage and combined limits in "Shader
        Execution" spec language, and a validation error for exceeding
        combined limits with separate program objects.
      - Add new edits to Appendix A and Appendix D.
      - Add appropriate text to the Dependencies, New Errors, New State, and
        New Implementation-Dependent State sections.
      - Add some new issues; update issue (13).

    Revision 3, January 23, 2012 (pbrown)
      - Add actual spec language in place of the previous "here's our options"
        overview. Clean up the overview and issues section to reflect the
        general approach chosen in the initial feature discussion.
      - Note: Lists of new enumerants, functions, state, and errors have not
        been built yet.

    Revision 2, January 3, 2012 (pbrown)
      - Move issues from overview to separate section in preparation for
        further edits; no other changes.

    Revision 1, October 26, 2011 (pbrown)
      - Initial sketch/proposal, containing only an introduction and issues
        list.