Name

    NV_shader_storage_buffer_object

Name Strings

    GL_NV_shader_storage_buffer_object

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Status

    Complete

Version

    Last Modified Date:         February 11, 2014
    NVIDIA Revision:            2

Number

    422

Dependencies

    OpenGL 4.0 (Core or Compatibility Profile) is required.

    This extension is written against the OpenGL 4.2 Specification
    (Compatibility Profile).

    NV_gpu_program4 and NV_gpu_program5 are required.

    ARB_shader_storage_buffer_object is required.

    This specification interacts with NV_shader_atomic_float.

Overview

    This extension provides assembly language support for shader storage
    buffers (from the ARB_shader_storage_buffer_object extension) for all
    program types supported by NV_gpu_program5, including compute programs
    added by the NV_compute_program5 extension.

    Assembly programs using this extension can read from and write to the
    memory of buffer objects bound to the binding points provided by
    ARB_shader_storage_buffer_object.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 4.2 (Compatibility Profile) Specification
(OpenGL Operation)

    (All modifications are relative to Section 2.X, GPU Programs, from the
    NV_gpu_program4 specification, as modified by NV_gpu_program5.)

    Modify Section 2.X.2, Program Grammar

    (add the following grammar rules to the NV_gpu_program5 base grammar for
    shader storage buffers)

    <MemInstruction>        ::= <ATOMBop_instruction>
                              | <LDBop_instruction>
                              | <STBop_instruction>

    <ATOMBop_instruction>   ::= "ATOMB" <opModifiers> <instResult> ","
                                <instOperandV> "," <storageUseV>

    <LDBop_instruction>     ::= "LDB" <opModifiers> <instResult> ","
                                <storageUseV>

    <STBop_instruction>     ::= "STB" <opModifiers> <instOperandV> ","
                                <storageUseV>

    <namingStatement>       ::= <STORAGE_statement>

    <STORAGE_statement>     ::= "STORAGE" <establishName> <storageSingleInit>
                              | "STORAGE" <establishName> <optArraySize>
                                <storageMultipleInit>

    <storageSingleInit>     ::= "=" <storageUseDS>

    <storageMultipleInit>   ::= "=" "{" <storageItemList> "}"

    <storageItemList>       ::= <storageUseDM>
                              | <storageUseDM> "," <storageItemList>

    <programSingleItem>     ::= <progStorLenParams>

    <programMultipleItem>   ::= <progStorLenParams>

    <progStorLenParams>     ::= "program" "." "storagelen" <arrayMemAbs>
                              | "program" "." "storagelen" <arrayRange>

    <progStorLenParam>      ::= "program" "." "storagelen" <arrayMemAbs>

    <storageUseV>           ::= <storageVarName> <optArrayMem>
                              | <storageVarName> <arrayMem> <optArrayMem>

    <storageUseDS>          ::= <storageBinding> <arrayMemAbs>

    <storageUseDM>          ::= <storageBinding> <arrayMemAbs>
                              | <storageBinding> <arrayRange>
                              | <storageBinding>

    <storageBinding>        ::= "program" "." "storage" <arrayMemAbs>
                              | "program" "." "storage" <arrayRange>


    Modify Section 2.X.3.3, Program Parameters

    Shader Storage Buffer Property Bindings

      Binding                    Components  Underlying State
      -------------------------  ----------  -------------------------------
      program.storagelen[a]      (x,-,-,-)   program storage buffer a,
                                             variable-size array length
      program.storagelen[a..b]   (x,-,-,-)   program storage buffer a..b,
                                             variable-size array length

    If a program parameter binding matches "program.storagelen[a]", the "x"
    component of the program parameter variable is filled with the unsized
    array length associated with the shader storage buffer binding point <a>.
    If the program is generated internally by the implementation when
    compiling a GLSL shader, this length is the number of elements that can be
    stored in an unsized array at the end of the associated GLSL shader
    storage block.  If there is no shader storage block associated with shader
    storage buffer binding point <a>, or if the associated block does not end
    with an unsized array, the "x" component will hold the integer value zero.
    If the program is an assembly program specified via ProgramStringARB or
    LoadProgramNV, the "x" component will hold the integer value zero.  The
    "y", "z", and "w" components of the variable are undefined.

    Additionally, for program parameter array bindings,
    "program.storagelen[a..b]" is equivalent to specifying unsized array
    lengths for storage buffer binding points <a> through <b>, in order.  A
    program using any of these bindings will fail to load if <a> is greater
    than <b>.


    Add New Section 2.X.3.Y, Shader Storage Buffers, after Section 2.X.3.6,
    Program Parameter Buffers

    Shader storage buffers are arrays of basic machine units from which data
    can be read or written using the LDB and STB instructions.  Shader storage
    buffers also support atomic memory operations using the ATOMB instruction.
    The GL provides an implementation-dependent number of shader storage
    buffer binding points to which buffer objects can be attached.  Shader
    storage buffer contents can be changed by updating the contents of a
    bound buffer object, by changing the buffer object attached to a binding
    point, or by using the ATOMB or STB instructions in a shader to modify the
    contents of buffer object memory.

    Shader storage buffer bindings are established by calling BindBufferBase,
    BindBufferOffsetEXT, or BindBufferRange with a target of
    SHADER_STORAGE_BUFFER, as documented in the
    ARB_shader_storage_buffer_object extension.  The number of shader storage
    buffer binding points is given by the value of the constant
    MAX_SHADER_STORAGE_BUFFER_BINDINGS.  There is a limit on the maximum
    number of basic machine units in a buffer object that can be accessed
    using any single shader storage buffer binding point, given by the
    implementation-dependent constant MAX_SHADER_STORAGE_BLOCK_SIZE.  Buffer
    objects larger than this size may be used, but the results of accessing
    portions of the buffer object beyond the limit are undefined.

    Shader storage buffer variables may only be used as operands in the ATOMB,
    LDB, and STB instructions; they may not be used as results or operands in
    general instructions.  Shader storage buffer variables must be declared
    explicitly via the <STORAGE_statement> grammar rule.  Shader storage
    buffer bindings cannot be used directly in executable instructions.

    Shader storage buffer variables may be declared as arrays, but all
    bindings assigned to the array must use the same binding point(s) and must
    increase consecutively.

    In explicit shader storage variable declarations, the bindings in Table
    X.2 starting with "program.storage[a..b]" may be used, indicating that the
    variable spans multiple buffer binding points.
    Such variables must be accessed as arrays, with the first index
    specifying an offset into the range of buffer object binding points.  A
    buffer index of zero identifies binding point <a>; an index of <b>-<a>
    identifies binding point <b>.  If such a variable is declared as an array,
    a second index must be provided to identify the individual array element.
    A program will fail to compile if such bindings are used when <a> or <b>
    is negative or greater than or equal to the number of buffer binding
    points supported for the program type, or if <a> is greater than <b>.

      Binding                        Components  Underlying State
      -----------------------------  ----------  -----------------------------
      program.storage[a][c]          (x,x,x,x)   shader storage buffer a,
                                                 element c
      program.storage[a][c..d]       (x,x,x,x)   shader storage buffer a,
                                                 elements c through d
      program.storage[a]             (x,x,x,x)   shader storage buffer a,
                                                 all elements
      program.storage[a..b][c]       (x,x,x,x)   shader storage buffers a
                                                 through b, element c
      program.storage[a..b][c..d]    (x,x,x,x)   shader storage buffers a
                                                 through b, elements c
                                                 through d
      program.storage[a..b]          (x,x,x,x)   shader storage buffers a
                                                 through b, all elements

      Table X.2:  Shader Storage Bindings.  <a> and <b> indicate buffer
      numbers, <c> and <d> indicate individual elements.

    If a shader storage buffer binding matches "program.storage[a][c]", the
    shader storage buffer variable is associated with element <c> of the
    buffer object bound to binding point <a>.  If no buffer object is bound to
    binding point <a>, or the bound buffer object is not large enough to hold
    an element <c>, reads from the binding return zero and writes to the
    binding have no effect.  The binding point <a> must be a nonnegative
    integer constant.
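
    The unbound and out-of-range behavior described above can be pictured
    with a small host-side model.  This is only a sketch:  StorageBinding and
    its methods are hypothetical names invented for illustration, and a
    Python list stands in for buffer object memory with one list entry per
    element.

```python
# Illustrative model only; StorageBinding is a hypothetical name, not part
# of this extension or of the OpenGL API.

class StorageBinding:
    """Models the buffer object (if any) attached to one binding point."""

    def __init__(self, elements=None):
        # None models "no buffer object bound to the binding point".
        self.elements = elements

    def _in_range(self, c):
        # True only if a buffer is bound and it is large enough for element c.
        return self.elements is not None and 0 <= c < len(self.elements)

    def load(self, c):
        """Read element c; unbound or out-of-range reads return zero."""
        return self.elements[c] if self._in_range(c) else 0

    def store(self, c, value):
        """Write element c; unbound or out-of-range writes have no effect."""
        if self._in_range(c):
            self.elements[c] = value
```

    In this model, a load through an unbound StorageBinding() yields zero and
    a store past the end of a bound buffer leaves its contents unchanged.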

    For shader storage array declarations, "program.storage[a][c..d]" is
    equivalent to specifying elements <c> through <d> of the buffer object
    bound to binding point <a> in order.

    For shader storage array declarations, "program.storage[a]" is equivalent
    to specifying the entire buffer -- elements 0 through <n>-1, where <n> is
    either the size of the array (if declared) or the implementation-dependent
    maximum shader storage buffer object size limit (if no size is declared).

    When bindings beginning with "program.storage[a..b]" are used in a
    variable declaration, they behave identically to the corresponding
    bindings beginning with "program.storage[a]", except that the variable is
    filled with a separate set of values for each buffer binding point from
    <a> to <b> inclusive.


    Modify Section 2.X.4, Program Execution Environment

    (add to the opcode table)

      Instr-      Modifiers
      uction  F I C S H D  Out Inputs    Description
      ------  - - - - - -  --- --------  --------------------------------
      ATOMB   - - X - - -  s   v,su      atomic transaction to storage buffer
      LDB     - - X X - F  v   su        load from storage buffer
      STB     - - - - - -  -   v,su      store to storage buffer


    Modify Section 2.X.4.5, Program Memory Access, from NV_gpu_program5

    (modify first paragraph)

    Programs may load from or store to buffer object memory via the ATOM
    (atomic global memory operation), ATOMB (atomic storage buffer memory
    operation), LDB (load from storage buffer), LDC (load constant), LOAD
    (global load), STB (store to storage buffer), and STORE (global store)
    instructions.
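
    The array forms of the "program.storage" bindings described in Section
    2.X.3.Y determine which buffer elements these instructions can reach.
    The sketch below expands a binding of the form
    "program.storage[a..b][c..d]" into one element list per binding point.
    The helper expand_binding is hypothetical, the check that <c> does not
    exceed <d> is an assumption made for the example, and the check against
    the number of supported binding points is omitted.

```python
# Illustrative sketch; expand_binding is a hypothetical helper, not an API
# defined by this extension.

def expand_binding(buffers, elements):
    """Expand an array binding into per-buffer element lists.

    buffers  -- (a, b) inclusive range of binding points; pass (a, a) for
                the single-buffer form program.storage[a][c..d]
    elements -- (c, d) inclusive range of elements within each buffer

    Returns a list indexed by buffer offset; entry k describes binding
    point a + k as (binding_point, [c, ..., d]).
    """
    a, b = buffers
    c, d = elements
    if a < 0 or a > b:
        raise ValueError("binding point range must satisfy 0 <= a <= b")
    if c < 0 or c > d:
        # Assumption for this sketch; the text above only constrains a and b.
        raise ValueError("element range must satisfy 0 <= c <= d")
    return [(point, list(range(c, d + 1))) for point in range(a, b + 1)]
```

    For example, expand_binding((1, 2), (0, 1)) describes a variable with a
    separate pair of elements for each of binding points 1 and 2.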


    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
    as extended by NV_gpu_program5:)

    + Shader Storage Buffer Operations (NV_shader_storage_buffer)

    If a program (of any type, including compute programs) specifies the
    "NV_shader_storage_buffer" option, it may use the "ATOMB", "LDB", and
    "STB" opcodes to perform atomic memory operations, loads, and stores to
    shader storage buffers, and "STORAGE" to declare shader storage buffer
    variables.  If the option is not specified, a program will fail to load if
    it contains "ATOMB", "LDB", or "STB" opcodes, or "STORAGE" declarations.


    (add the following subsection to section 2.X.8, Program Instruction Set.)

    Section 2.X.8.Z, ATOMB:  Atomic Memory Operation (Storage Buffer Memory)

    The ATOMB instruction performs an atomic memory operation by reading from
    shader storage buffer memory specified by the second operand, computing a
    new value based on the value read from memory and the first (vector)
    operand, and then writing the result back to the same memory address.  The
    memory transaction is atomic, guaranteeing that no other write to the
    memory accessed will occur between the time it is read and written by the
    ATOMB instruction.  The result of the ATOMB instruction is the scalar
    value read from memory.  The second operand used for the ATOMB instruction
    must correspond to a shader storage variable declared using the "STORAGE"
    statement; a program will fail to load if any other type of operand is
    used for the second operand of an ATOMB instruction.

    The ATOMB instruction has two required instruction modifiers.  The atomic
    modifier specifies the type of operation to be performed.  The storage
    modifier specifies the size and data type of the operand read from memory
    and the base data type of the operation used to compute the value to be
    written to memory.

      atomic     storage
      modifier   modifiers           operation
      --------   ------------------  --------------------------------------
       ADD       U32, S32, U64, F32  compute a sum
       MIN       U32, S32            compute minimum
       MAX       U32, S32            compute maximum
       IWRAP     U32                 increment memory, wrapping at operand
       DWRAP     U32                 decrement memory, wrapping at operand
       AND       U32, S32            compute bit-wise AND
       OR        U32, S32            compute bit-wise OR
       XOR       U32, S32            compute bit-wise XOR
       EXCH      U32, S32, U64, F32  exchange memory with operand
       CSWAP     U32, S32, U64       compare-and-swap

      Table X.Y, Supported atomic and storage modifiers for the ATOMB
      instruction.

    Not all storage modifiers are supported by ATOMB, and the set of modifiers
    allowed for any given instruction depends on the atomic modifier
    specified.  Table X.Y enumerates the set of atomic modifiers supported by
    the ATOMB instruction, and the storage modifiers allowed for each.

      tmp0 = VectorLoad(op0);
      result = BufferMemoryLoad(op1, storageModifier);
      switch (atomicModifier) {
      case ADD:
        writeval = tmp0.x + result;
        break;
      case MIN:
        writeval = min(tmp0.x, result);
        break;
      case MAX:
        writeval = max(tmp0.x, result);
        break;
      case IWRAP:
        writeval = (result >= tmp0.x) ? 0 : result+1;
        break;
      case DWRAP:
        writeval = (result == 0 || result > tmp0.x) ? tmp0.x : result-1;
        break;
      case AND:
        writeval = tmp0.x & result;
        break;
      case OR:
        writeval = tmp0.x | result;
        break;
      case XOR:
        writeval = tmp0.x ^ result;
        break;
      case EXCH:
        writeval = tmp0.x;  // store the operand; old value is returned
        break;
      case CSWAP:
        if (result == tmp0.x) {
          writeval = tmp0.y;
        } else {
          return result;  // no memory store
        }
        break;
      }
      BufferMemoryStore(op1, writeval, storageModifier);

    ATOMB performs a scalar atomic operation.  The <y>, <z>, and <w>
    components of the result vector are undefined.
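
    The atomic operations above can be exercised with a host-side reference
    model.  The following is a minimal sketch for the "U32" storage modifier
    only.  The function atomb_u32 and the list-backed memory are hypothetical
    stand-ins, EXCH is assumed to store the operand as the table's
    description suggests, and ADD is assumed to wrap modulo 2^32.  The model
    is single-threaded, so it illustrates the value computed and returned by
    each operation rather than the atomicity guarantee itself.

```python
# Illustrative reference model; atomb_u32 is a hypothetical name. Real
# hardware performs the read-modify-write as one indivisible transaction.

MASK32 = 0xFFFFFFFF


def atomb_u32(memory, index, op, x, y=0):
    """Apply one ATOMB.<op>.U32 operation to memory[index].

    x and y model the first two components of the vector operand.
    Returns the value read from memory before the update.
    """
    result = memory[index] & MASK32
    x &= MASK32
    if op == "ADD":
        writeval = (x + result) & MASK32  # assumed 32-bit wraparound
    elif op == "MIN":
        writeval = min(x, result)
    elif op == "MAX":
        writeval = max(x, result)
    elif op == "IWRAP":
        writeval = 0 if result >= x else result + 1
    elif op == "DWRAP":
        writeval = x if (result == 0 or result > x) else result - 1
    elif op == "AND":
        writeval = x & result
    elif op == "OR":
        writeval = x | result
    elif op == "XOR":
        writeval = x ^ result
    elif op == "EXCH":
        writeval = x  # assumed: memory receives the operand
    elif op == "CSWAP":
        if result != x:
            return result  # comparison failed: no memory store
        writeval = y & MASK32
    else:
        raise ValueError("unsupported atomic modifier: " + op)
    memory[index] = writeval
    return result
```

    For instance, with memory = [7], the call atomb_u32(memory, 0, "IWRAP", 7)
    returns the old value 7 and leaves 0 in memory, since the stored value
    had reached the wrap limit given by the operand.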

    ATOMB supports no base data type modifiers, but requires exactly one
    storage modifier.  The base data types of the result vector and the first
    (vector) operand are derived from the storage modifier.


    Section 2.X.8.Z, LDB:  Load from Storage Buffer Memory

    The LDB instruction generates a result vector by fetching data from shader
    storage buffer memory identified by the first operand, as described in
    Section 2.X.4.5.  The single operand for the LDB instruction must
    correspond to a shader storage buffer variable declared using the
    "STORAGE" statement; a program will fail to load if any other type of
    operand is used in an LDB instruction.

      result = BufferMemoryLoad(op0, storageModifier);

    LDB supports no base data type modifiers, but requires exactly one storage
    modifier.  The base data type of the result vector is derived from the
    storage modifier.


    Section 2.X.8.Z, STB:  Store to Storage Buffer Memory

    The STB instruction writes the contents of the first vector operand to
    shader storage buffer memory identified by the second operand, as
    described in Section 2.X.4.5.  This instruction generates no result.  The
    second operand for the STB instruction must correspond to a shader
    storage buffer variable declared using the "STORAGE" statement; a program
    will fail to load if any other type of operand is used in an STB
    instruction.

      tmp0 = VectorLoad(op0);
      BufferMemoryStore(op1, tmp0, storageModifier);

    STB supports no base data type modifiers, but requires exactly one storage
    modifier.  The base data type of the vector components of the first
    operand is derived from the storage modifier.


Additions to Chapter 3 of the OpenGL 4.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 4.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 4.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 4.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Dependencies on NV_shader_atomic_float

    If NV_shader_atomic_float is not supported, the ADD and EXCH atomic
    operations in the ATOMB instruction do not support the "F32" storage
    modifier.

Errors

    None.

New State

    None, but shares buffer bindings and other state with the
    ARB_shader_storage_buffer_object extension.

New Implementation Dependent State

    None, but shares implementation-dependent state with the
    ARB_shader_storage_buffer_object extension.

Issues

    None.

Revision History

    Revision 2, February 11, 2014 (pbrown)
      - Fix typos in the descriptions of the "program.storage" bindings.

    Revision 1, May 9, 2012 (pbrown)
      - Internal spec development.