Name

    NV_gpu_shader5

Name Strings

    GL_NV_gpu_shader5

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Contributors

    Barthold Lichtenbelt, NVIDIA
    Chris Dodd, NVIDIA
    Eric Werness, NVIDIA
    Greg Roth, NVIDIA
    Jeff Bolz, NVIDIA
    Piers Daniell, NVIDIA
    Daniel Rakos, AMD
    Mathias Heyer, NVIDIA

Status

    Shipping.

Version

    Last Modified Date:         03/07/2017
    NVIDIA Revision:            11

Number

    OpenGL Extension #389
    OpenGL ES Extension #260

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile)
    Specification.

    This extension is written against version 1.50 (revision 09) of the OpenGL
    Shading Language Specification.

    If implemented in OpenGL, OpenGL 3.2 and GLSL 1.50 are required.

    If implemented in OpenGL, ARB_gpu_shader5 is required.

    This extension interacts with ARB_gpu_shader5.

    This extension interacts with ARB_gpu_shader_fp64.

    This extension interacts with ARB_tessellation_shader.

    This extension interacts with NV_shader_buffer_load.

    This extension interacts with EXT_direct_state_access.

    This extension interacts with EXT_vertex_attrib_64bit and
    NV_vertex_attrib_integer_64bit.

    This extension interacts with OpenGL ES 3.1 (dated October 29th 2014).

    This extension interacts with OpenGL ES Shading Language 3.1 (revision 3).

    If implemented in OpenGL ES, OpenGL ES 3.1 and GLSL ES 3.10 are required.

    If implemented in OpenGL ES, OES/EXT_gpu_shader5 and
    EXT_shader_implicit_conversions are required.

    This extension interacts with OES/EXT_tessellation_shader.

    This extension interacts with OES/EXT_geometry_shader.

Overview

    This extension provides a set of new features to the OpenGL Shading
    Language and related APIs to support capabilities of new GPUs.
    Shaders using the new functionality provided by this extension should
    enable this functionality via the construct

      #extension GL_NV_gpu_shader5 : require     (or enable)

    This extension was developed concurrently with the ARB_gpu_shader5
    extension, and provides a superset of the features provided there. The
    features common to both extensions are documented in the ARB_gpu_shader5
    specification; this document describes only the additional language
    features not available via ARB_gpu_shader5. A shader that enables this
    extension via an #extension directive also implicitly enables the common
    capabilities provided by ARB_gpu_shader5.

    In addition to the capabilities of ARB_gpu_shader5, this extension
    provides a variety of new features for all shader types, including:

      * support for a full set of 8-, 16-, 32-, and 64-bit scalar and vector
        data types, including uniform API, uniform buffer object, and shader
        input and output support;

      * the ability to aggregate samplers into arrays, index these arrays with
        arbitrary expressions, and not require that non-constant indices be
        uniform across all shader invocations;

      * new built-in functions to pack and unpack 64-bit integer types into a
        two-component 32-bit integer vector;

      * new built-in functions to pack and unpack 32-bit unsigned integer
        types into a two-component 16-bit floating-point vector;

      * new built-in functions to convert double-precision floating-point
        values to or from their 64-bit integer bit encodings;

      * new built-in functions to compute the composite of a set of boolean
        conditions across a group of shader threads;

      * vector relational functions supporting comparisons of vectors of 8-,
        16-, and 64-bit integer types or 16-bit floating-point types; and

      * extending texel offset support to allow loading texel offsets from
        regular integer operands computed at run-time, except for lookups with
        gradients (textureGrad*).

    This extension also provides additional support for processing patch
    primitives (introduced by ARB_tessellation_shader).
    ARB_tessellation_shader requires the use of a tessellation evaluation
    shader when processing patches, which means that patches will never
    survive past the tessellation pipeline stage. This extension lifts that
    restriction, and allows patches to proceed further in the pipeline and be
    used

      * as input to a geometry shader, using a new "patches" layout qualifier;

      * as input to transform feedback;

      * by fixed-function rasterization stages, in which case the patches are
        drawn as independent points.

    Additionally, it allows geometry shaders to read per-patch attributes
    written by a tessellation control shader using input variables declared
    with "patch in".


New Procedures and Functions

    void Uniform1i64NV(int location, int64EXT x);
    void Uniform2i64NV(int location, int64EXT x, int64EXT y);
    void Uniform3i64NV(int location, int64EXT x, int64EXT y, int64EXT z);
    void Uniform4i64NV(int location, int64EXT x, int64EXT y, int64EXT z,
                       int64EXT w);
    void Uniform1i64vNV(int location, sizei count, const int64EXT *value);
    void Uniform2i64vNV(int location, sizei count, const int64EXT *value);
    void Uniform3i64vNV(int location, sizei count, const int64EXT *value);
    void Uniform4i64vNV(int location, sizei count, const int64EXT *value);

    void Uniform1ui64NV(int location, uint64EXT x);
    void Uniform2ui64NV(int location, uint64EXT x, uint64EXT y);
    void Uniform3ui64NV(int location, uint64EXT x, uint64EXT y, uint64EXT z);
    void Uniform4ui64NV(int location, uint64EXT x, uint64EXT y, uint64EXT z,
                        uint64EXT w);
    void Uniform1ui64vNV(int location, sizei count, const uint64EXT *value);
    void Uniform2ui64vNV(int location, sizei count, const uint64EXT *value);
    void Uniform3ui64vNV(int location, sizei count,
                         const uint64EXT *value);
    void Uniform4ui64vNV(int location, sizei count, const uint64EXT *value);

    void GetUniformi64vNV(uint program, int location, int64EXT *params);


    (The following function is also provided by NV_shader_buffer_load.)

    void GetUniformui64vNV(uint program, int location, uint64EXT *params);


    (All of the following ProgramUniform* functions are supported if and only
     if implemented in OpenGL ES or EXT_direct_state_access is supported.)

    void ProgramUniform1i64NV(uint program, int location, int64EXT x);
    void ProgramUniform2i64NV(uint program, int location, int64EXT x,
                              int64EXT y);
    void ProgramUniform3i64NV(uint program, int location, int64EXT x,
                              int64EXT y, int64EXT z);
    void ProgramUniform4i64NV(uint program, int location, int64EXT x,
                              int64EXT y, int64EXT z, int64EXT w);
    void ProgramUniform1i64vNV(uint program, int location, sizei count,
                               const int64EXT *value);
    void ProgramUniform2i64vNV(uint program, int location, sizei count,
                               const int64EXT *value);
    void ProgramUniform3i64vNV(uint program, int location, sizei count,
                               const int64EXT *value);
    void ProgramUniform4i64vNV(uint program, int location, sizei count,
                               const int64EXT *value);

    void ProgramUniform1ui64NV(uint program, int location, uint64EXT x);
    void ProgramUniform2ui64NV(uint program, int location, uint64EXT x,
                               uint64EXT y);
    void ProgramUniform3ui64NV(uint program, int location, uint64EXT x,
                               uint64EXT y, uint64EXT z);
    void ProgramUniform4ui64NV(uint program, int location, uint64EXT x,
                               uint64EXT y, uint64EXT z, uint64EXT w);
    void ProgramUniform1ui64vNV(uint program, int location, sizei count,
                                const uint64EXT *value);
    void ProgramUniform2ui64vNV(uint program, int location, sizei count,
                                const uint64EXT *value);
    void ProgramUniform3ui64vNV(uint program, int location, sizei count,
                                const uint64EXT *value);
    void ProgramUniform4ui64vNV(uint
                                program, int location, sizei count,
                                const uint64EXT *value);


New Tokens

    Returned by the <type> parameter of GetActiveAttrib, GetActiveUniform, and
    GetTransformFeedbackVarying:

        INT64_NV                                        0x140E
        UNSIGNED_INT64_NV                               0x140F

        INT8_NV                                         0x8FE0
        INT8_VEC2_NV                                    0x8FE1
        INT8_VEC3_NV                                    0x8FE2
        INT8_VEC4_NV                                    0x8FE3
        INT16_NV                                        0x8FE4
        INT16_VEC2_NV                                   0x8FE5
        INT16_VEC3_NV                                   0x8FE6
        INT16_VEC4_NV                                   0x8FE7
        INT64_VEC2_NV                                   0x8FE9
        INT64_VEC3_NV                                   0x8FEA
        INT64_VEC4_NV                                   0x8FEB
        UNSIGNED_INT8_NV                                0x8FEC
        UNSIGNED_INT8_VEC2_NV                           0x8FED
        UNSIGNED_INT8_VEC3_NV                           0x8FEE
        UNSIGNED_INT8_VEC4_NV                           0x8FEF
        UNSIGNED_INT16_NV                               0x8FF0
        UNSIGNED_INT16_VEC2_NV                          0x8FF1
        UNSIGNED_INT16_VEC3_NV                          0x8FF2
        UNSIGNED_INT16_VEC4_NV                          0x8FF3
        UNSIGNED_INT64_VEC2_NV                          0x8FF5
        UNSIGNED_INT64_VEC3_NV                          0x8FF6
        UNSIGNED_INT64_VEC4_NV                          0x8FF7
        FLOAT16_NV                                      0x8FF8
        FLOAT16_VEC2_NV                                 0x8FF9
        FLOAT16_VEC3_NV                                 0x8FFA
        FLOAT16_VEC4_NV                                 0x8FFB

    (If ARB_tessellation_shader is supported, the following enum is accepted
     by a new primitive.)

    Accepted by the <primitiveMode> parameter of BeginTransformFeedback:

        PATCHES



Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)

    Modify Section 2.6.1, Begin and End, p. 22

    (Extend language describing PATCHES introduced by ARB_tessellation_shader.
    In particular, add the following to the end of the description of the
    primitive type.)

    If a patch primitive is drawn, each patch is drawn separately as a
    collection of points, with each patch vertex defining a separate point.
    Extra vertices from an incomplete patch are never drawn.


    Modify Section 2.14.3, Vertex Attributes, p. 86

    (modify the second paragraph, p. 87) ... exceeds MAX_VERTEX_ATTRIBS.
    For the purposes of this comparison, attribute variables of the types
    i64vec3, u64vec3, i64vec4, and u64vec4 count as consuming twice as many
    attributes as equivalent single-precision types.


    (extend the list of types in the first paragraph, p. 88)
    ... UNSIGNED_INT_VEC3, UNSIGNED_INT_VEC4, INT8_NV, INT8_VEC2_NV,
    INT8_VEC3_NV, INT8_VEC4_NV, INT16_NV, INT16_VEC2_NV, INT16_VEC3_NV,
    INT16_VEC4_NV, INT64_NV, INT64_VEC2_NV, INT64_VEC3_NV, INT64_VEC4_NV,
    UNSIGNED_INT8_NV, UNSIGNED_INT8_VEC2_NV, UNSIGNED_INT8_VEC3_NV,
    UNSIGNED_INT8_VEC4_NV, UNSIGNED_INT16_NV, UNSIGNED_INT16_VEC2_NV,
    UNSIGNED_INT16_VEC3_NV, UNSIGNED_INT16_VEC4_NV, UNSIGNED_INT64_NV,
    UNSIGNED_INT64_VEC2_NV, UNSIGNED_INT64_VEC3_NV, UNSIGNED_INT64_VEC4_NV,
    FLOAT16_NV, FLOAT16_VEC2_NV, FLOAT16_VEC3_NV, or FLOAT16_VEC4_NV.


    Modify Section 2.14.4, Uniform Variables, p. 89

    (modify third paragraph, p. 90) ... uniform variable storage for a vertex
    shader. A scalar or vector uniform with 64-bit integer components
    will consume no more than 2<n> components, where <n> is 1 for scalars, and
    the component count for vectors. A link error is generated ...

    (add to Table 2.13, p.
    96)

      Type Name Token           Keyword
      --------------------      ----------------
      INT8_NV                   int8_t
      INT8_VEC2_NV              i8vec2
      INT8_VEC3_NV              i8vec3
      INT8_VEC4_NV              i8vec4
      INT16_NV                  int16_t
      INT16_VEC2_NV             i16vec2
      INT16_VEC3_NV             i16vec3
      INT16_VEC4_NV             i16vec4
      INT64_NV                  int64_t
      INT64_VEC2_NV             i64vec2
      INT64_VEC3_NV             i64vec3
      INT64_VEC4_NV             i64vec4
      UNSIGNED_INT8_NV          uint8_t
      UNSIGNED_INT8_VEC2_NV     u8vec2
      UNSIGNED_INT8_VEC3_NV     u8vec3
      UNSIGNED_INT8_VEC4_NV     u8vec4
      UNSIGNED_INT16_NV         uint16_t
      UNSIGNED_INT16_VEC2_NV    u16vec2
      UNSIGNED_INT16_VEC3_NV    u16vec3
      UNSIGNED_INT16_VEC4_NV    u16vec4
      UNSIGNED_INT64_NV         uint64_t
      UNSIGNED_INT64_VEC2_NV    u64vec2
      UNSIGNED_INT64_VEC3_NV    u64vec3
      UNSIGNED_INT64_VEC4_NV    u64vec4
      FLOAT16_NV                float16_t
      FLOAT16_VEC2_NV           f16vec2
      FLOAT16_VEC3_NV           f16vec3
      FLOAT16_VEC4_NV           f16vec4

    (modify list of commands at the bottom of p. 99)

      void Uniform{1,2,3,4}{i64,ui64}NV(int location, T value);
      void Uniform{1,2,3,4}{i64,ui64}vNV(int location, sizei count,
                                         const T *value);

    (insert after fourth paragraph, p. 100) The Uniform*i64{v}NV and
    Uniform*ui64{v}NV commands will load <count> sets of one to four 64-bit
    signed or unsigned integer values into a uniform location defined as a
    64-bit signed or unsigned integer scalar or vector type.


    (modify "Uniform Buffer Object Storage", p. 102, adding bullets after
    the last "Members of type", and modifying the subsequent bullet)

     * Members of type int8_t, int16_t, and int64_t are extracted from a
       buffer object by reading a single byte, short, or int64-typed value at
       the specified offset.

     * Members of type uint8_t, uint16_t, and uint64_t are extracted from a
       buffer object by reading a single ubyte, ushort, or uint64-typed value
       at the specified offset.

     * Members of type float16_t are extracted from a buffer object by reading
       a single half-typed value at the specified offset.

     * Vectors with N elements with basic data types of bool, int, uint,
       float, double, int8_t, int16_t, int64_t, uint8_t, uint16_t, uint64_t,
       or float16_t are extracted as N values in consecutive memory locations
       beginning at the specified offset, with components stored in order with
       the first (X) component at the lowest offset. The GL data type used for
       component extraction is derived according to the rules for scalar
       members above.


    Modify Section 2.14.6, Varying Variables, p. 106

    (modify third paragraph, p. 107) ... For the purposes of counting input
    and output components consumed by a shader, variables declared as vectors,
    matrices, and arrays will all consume multiple components. Each component
    of variables declared as 64-bit integer scalars or vectors will be
    counted as consuming two components.

    (add after the bulleted list, p. 108) For the purposes of counting the
    total number of components to capture, each component of outputs declared
    as 64-bit integer scalars or vectors will be counted as consuming two
    components.


    Modify Section 2.15.1, Geometry Shader Input Primitives, p. 118

    (add new qualifier at the end of the section, p. 120)

    Patches (patches)

    Geometry shaders that operate on patches are valid for the PATCHES
    primitive type. The number of vertices available to each program
    invocation is equal to the vertex count of the variable-size patch, with
    vertices presented to the geometry shader in the order specified in the
    patch.


    Modify Section 2.15.4, Geometry Shader Execution Environment, p. 121

    (add to the end of "Geometry Shader Inputs", p. 123)

    Geometry shaders also support built-in and user-defined per-primitive
    inputs.
    The following built-in inputs, not replicated per-vertex and not
    contained in gl_in[], are supported:

      * The variable gl_PatchVerticesIn is filled with the number of
        vertices in the input primitive.

      * The variables gl_TessLevelOuter[] and gl_TessLevelInner[] are arrays
        holding outer and inner tessellation levels of an input patch. If a
        tessellation control shader is active, the tessellation levels will be
        taken from the corresponding outputs of the tessellation control
        shader. Otherwise, the default levels provided as patch parameters
        are used. Tessellation level values loaded in these variables will be
        prior to the clamping and rounding operations performed by the
        primitive generator as described in Section 2.X.2 of
        ARB_tessellation_shader. For triangular tessellation,
        gl_TessLevelOuter[3] and gl_TessLevelInner[1] will be undefined. For
        isoline tessellation, gl_TessLevelOuter[2], gl_TessLevelOuter[3], and
        both values in gl_TessLevelInner[] are undefined.

    Additionally, a geometry shader with an input primitive type of "patches"
    may declare per-patch input variables using the qualifier "patch in".
    Unlike per-vertex inputs, per-patch inputs do not correspond to any
    specific vertex in the input primitive, and are not indexed by vertex
    number. Per-patch inputs declared as arrays have multiple values for the
    input patch; similarly declared per-vertex inputs would indicate a single
    value for each vertex in the output patch. User-defined per-patch input
    variables are filled with corresponding per-patch output values written by
    the tessellation control shader. If no tessellation control shader is
    active, all such variables are undefined.

    Per-patch input variables and the built-in inputs "gl_PatchVerticesIn",
    "gl_TessLevelOuter[]", and "gl_TessLevelInner[]" are supported only for
    geometry shaders with an input primitive type of "patches".
    A program will fail to link if any such variable is used in a geometry
    shader with an input primitive type other than "patches".


    Modify Section 2.19, Transform Feedback, p. 130

    (add to Table 2.14, p. 131)

      Transform Feedback
      primitiveMode               allowed render primitive modes
      ----------------------      ---------------------------------
      PATCHES                     PATCHES


    (modify first paragraph, p. 131) ... <primitiveMode> is one of TRIANGLES,
    LINES, POINTS, or PATCHES and specifies the type of primitives that will
    be recorded into the buffer objects bound for transform feedback (see
    below). ...

    (modify last paragraph, p. 131 and first paragraph, p. 132, adding patch
    support, and dealing with capture of 8- and 16-bit components)

    When an individual point, line, triangle, or patch primitive reaches the
    transform feedback stage ... When capturing line, triangle, and patch
    primitives, all attributes ... For multi-component varying variables or
    varying array elements, the individual components are written in order.
    For variables with 8- or 16-bit fixed- or floating-point components,
    individual components will be converted to and stored as equivalent values
    of type "int", "uint", or "float". The value for any attribute specified
    ...

    (modify next-to-last paragraph, p. 132) ... is not incremented. If
    transform feedback receives a primitive that fits in the remaining space
    after such an overflow occurs, that primitive may or may not be recorded.
    Primitives that fail to fit in the remaining space are never recorded.


Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

    Modify Section 6.1.15, Shader and Program Queries, p. 332

    (add to the first list of commands, p. 337)

        void GetUniformi64vNV(uint program, int location, int64EXT *params);
        void GetUniformui64vNV(uint program, int location, uint64EXT *params);


Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_gpu_shader5 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_gpu_shader5         1

    If the features of this extension are enabled by an #extension directive,
    shading language features documented in the ARB_gpu_shader5 extension will
    also be provided.


    Modify Section 3.6, Keywords, p.
    15

    (add the following to the list of reserved keywords)

      int8_t        i8vec2        i8vec3        i8vec4
      int16_t       i16vec2       i16vec3       i16vec4
      int32_t       i32vec2       i32vec3       i32vec4
      int64_t       i64vec2       i64vec3       i64vec4
      uint8_t       u8vec2        u8vec3        u8vec4
      uint16_t      u16vec2       u16vec3       u16vec4
      uint32_t      u32vec2       u32vec3       u32vec4
      uint64_t      u64vec2       u64vec3       u64vec4
      float16_t     f16vec2       f16vec3       f16vec4
      float32_t     f32vec2       f32vec3       f32vec4
      float64_t     f64vec2       f64vec3       f64vec4

    (note: the "float64_t" and "f64vec*" types are available if and only if
    ARB_gpu_shader_fp64 is also supported)


    Modify Section 4.1, Basic Types, p. 18

    (add to the basic "Transparent Types" table, p. 18)

      Types       Meaning
      --------    ----------------------------------------------------------
      int8_t      an 8-bit signed integer
      i8vec2      a two-component signed integer vector (8-bit components)
      i8vec3      a three-component signed integer vector (8-bit components)
      i8vec4      a four-component signed integer vector (8-bit components)

      int16_t     a 16-bit signed integer
      i16vec2     a two-component signed integer vector (16-bit components)
      i16vec3     a three-component signed integer vector (16-bit components)
      i16vec4     a four-component signed integer vector (16-bit components)

      int32_t     a 32-bit signed integer
      i32vec2     a two-component signed integer vector (32-bit components)
      i32vec3     a three-component signed integer vector (32-bit components)
      i32vec4     a four-component signed integer vector (32-bit components)

      int64_t     a 64-bit signed integer
      i64vec2     a two-component signed integer vector (64-bit components)
      i64vec3     a three-component signed integer vector (64-bit components)
      i64vec4     a four-component signed integer vector (64-bit components)

      uint8_t     an 8-bit unsigned integer
      u8vec2      a two-component unsigned integer vector (8-bit components)
      u8vec3      a three-component unsigned integer vector (8-bit components)
      u8vec4      a
                  four-component unsigned integer vector (8-bit components)

      uint16_t    a 16-bit unsigned integer
      u16vec2     a two-component unsigned integer vector (16-bit components)
      u16vec3     a three-component unsigned integer vector (16-bit components)
      u16vec4     a four-component unsigned integer vector (16-bit components)

      uint32_t    a 32-bit unsigned integer
      u32vec2     a two-component unsigned integer vector (32-bit components)
      u32vec3     a three-component unsigned integer vector (32-bit components)
      u32vec4     a four-component unsigned integer vector (32-bit components)

      uint64_t    a 64-bit unsigned integer
      u64vec2     a two-component unsigned integer vector (64-bit components)
      u64vec3     a three-component unsigned integer vector (64-bit components)
      u64vec4     a four-component unsigned integer vector (64-bit components)

      float16_t   a single 16-bit floating-point value
      f16vec2     a two-component floating-point vector (16-bit components)
      f16vec3     a three-component floating-point vector (16-bit components)
      f16vec4     a four-component floating-point vector (16-bit components)

      float32_t   a single 32-bit floating-point value
      f32vec2     a two-component floating-point vector (32-bit components)
      f32vec3     a three-component floating-point vector (32-bit components)
      f32vec4     a four-component floating-point vector (32-bit components)

      float64_t   a single 64-bit floating-point value
      f64vec2     a two-component floating-point vector (64-bit components)
      f64vec3     a three-component floating-point vector (64-bit components)
      f64vec4     a four-component floating-point vector (64-bit components)


    Modify Section 4.1.3, Integers, p. 20

    (add after the first paragraph of the section, p. 20)

    Variables with the types "int8_t", "int16_t", and "int64_t" represent
    signed integer values with exactly 8, 16, or 64 bits, respectively.
    Variables with the types "uint8_t", "uint16_t", and "uint64_t" represent
    unsigned integer values with exactly 8, 16, or 64 bits, respectively.
    Variables with the types "int32_t" and "uint32_t" represent signed and
    unsigned integer values with 32 bits, and are equivalent to "int" and
    "uint" types, respectively.


    (modify the grammar, p. 21, adding "L" and "UL" suffixes)

      integer-suffix: one of

        u U l L ul UL

    (modify next-to-last paragraph, p. 21) ... When the suffix "u" or "U" is
    present, the literal has type <uint>. When the suffix "l" or "L" is
    present, the literal has type <int64_t>. When the suffix "ul" or "UL" is
    present, the literal has type <uint64_t>. Otherwise, the type is
    <int>. ...


    Modify Section 4.1.4, Floats, p. 22

    (insert after second paragraph, p. 22)

    Variables of type "float16_t" represent floating-point values using
    exactly 16 bits and are stored using the 16-bit floating-point
    representation described in the OpenGL Specification. Variables of type
    "float32_t" and "float64_t" represent floating-point values with 32 or 64
    bits, and are equivalent to "float" and "double" types, respectively.


    Modify Section 4.1.7, Samplers, p. 23

    (modify 1st paragraph of the section, deleting the restriction requiring
    constant indexing of sampler arrays) ... Samplers may be aggregated into
    arrays within a shader (using square brackets [ ]) and can be indexed with
    general integer expressions. The results of accessing a sampler array
    with an out-of-bounds index are undefined. ...

    (remove the additional restriction added by ARB_gpu_shader5 making a
    similar edit requiring uniform indexing across shader invocations for
    defined results. NV_gpu_shader5 has no such limitation.)


    Modify Section 4.1.10, Implicit Conversions, p.
    27

    (modify table of implicit conversions)

                                Can be implicitly
      Type of expression        converted to
      --------------------      -----------------------------------------
      int                       uint, int64_t, uint64_t, float, double(*)
      ivec2                     uvec2, i64vec2, u64vec2, vec2, dvec2(*)
      ivec3                     uvec3, i64vec3, u64vec3, vec3, dvec3(*)
      ivec4                     uvec4, i64vec4, u64vec4, vec4, dvec4(*)

      int8_t   int16_t          int, int64_t, uint, uint64_t, float, double(*)
      i8vec2   i16vec2          ivec2, i64vec2, uvec2, u64vec2, vec2, dvec2(*)
      i8vec3   i16vec3          ivec3, i64vec3, uvec3, u64vec3, vec3, dvec3(*)
      i8vec4   i16vec4          ivec4, i64vec4, uvec4, u64vec4, vec4, dvec4(*)

      int64_t                   uint64_t, double(*)
      i64vec2                   u64vec2, dvec2(*)
      i64vec3                   u64vec3, dvec3(*)
      i64vec4                   u64vec4, dvec4(*)

      uint                      uint64_t, float, double(*)
      uvec2                     u64vec2, vec2, dvec2(*)
      uvec3                     u64vec3, vec3, dvec3(*)
      uvec4                     u64vec4, vec4, dvec4(*)

      uint8_t  uint16_t         uint, uint64_t, float, double(*)
      u8vec2   u16vec2          uvec2, u64vec2, vec2, dvec2(*)
      u8vec3   u16vec3          uvec3, u64vec3, vec3, dvec3(*)
      u8vec4   u16vec4          uvec4, u64vec4, vec4, dvec4(*)

      uint64_t                  double(*)
      u64vec2                   dvec2(*)
      u64vec3                   dvec3(*)
      u64vec4                   dvec4(*)

      float                     double(*)
      vec2                      dvec2(*)
      vec3                      dvec3(*)
      vec4                      dvec4(*)

      float16_t                 float, double(*)
      f16vec2                   vec2, dvec2(*)
      f16vec3                   vec3, dvec3(*)
      f16vec4                   vec4, dvec4(*)

      (*) if ARB_gpu_shader_fp64 is supported

    (Note: Expressions of type "int32_t", "uint32_t", "float32_t", and
    "float64_t" are treated as identical to those of type "int", "uint",
    "float", and "double", respectively. Implicit conversions to and from
    these explicitly-sized types are allowed whenever conversions involving
    the equivalent base type are allowed.)
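
    (Non-normative illustration.) The scalar rows of the table above can be
    modeled as a small lookup, which may help when checking whether a given
    conversion is permitted. The following C sketch is not part of the
    extension: the enum ScalarType, its enumerators, and the function
    can_implicitly_convert are hypothetical names introduced here only for
    illustration, and the <has_fp64> parameter models the (*) footnote.
    Vector rows follow the same pattern component-wise and are omitted.

```c
#include <stdbool.h>

/* Hypothetical enumeration of the scalar types in the table. */
typedef enum {
    T_INT8, T_INT16, T_INT, T_INT64,
    T_UINT8, T_UINT16, T_UINT, T_UINT64,
    T_FLOAT16, T_FLOAT, T_DOUBLE
} ScalarType;

/* Returns true if the table lists `to` as an implicit conversion target
 * for an expression of type `from`.  An identical type is an exact match,
 * not a conversion, so it returns false. */
static bool can_implicitly_convert(ScalarType from, ScalarType to,
                                   bool has_fp64)
{
    /* Every row except "double" itself lists double(*), which exists
     * only when ARB_gpu_shader_fp64 is supported. */
    if (to == T_DOUBLE)
        return has_fp64 && from != T_DOUBLE;

    switch (from) {
    case T_INT8:
    case T_INT16:
        return to == T_INT || to == T_INT64 ||
               to == T_UINT || to == T_UINT64 || to == T_FLOAT;
    case T_INT:
        return to == T_UINT || to == T_INT64 ||
               to == T_UINT64 || to == T_FLOAT;
    case T_INT64:
        return to == T_UINT64;
    case T_UINT8:
    case T_UINT16:
        return to == T_UINT || to == T_UINT64 || to == T_FLOAT;
    case T_UINT:
        return to == T_UINT64 || to == T_FLOAT;
    case T_FLOAT16:
        return to == T_FLOAT;
    default:
        /* uint64_t and float convert only to double (handled above);
         * double has no implicit conversions. */
        return false;
    }
}
```

    For example, under this model can_implicitly_convert(T_INT8, T_UINT64,
    false) is true, while can_implicitly_convert(T_INT8, T_INT16, true) is
    false, since the table lists no conversion from 8-bit to 16-bit types.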


    (modify second paragraph of the section) No implicit conversions are
    provided to convert from unsigned to signed integer types, from
    floating-point to integer types, from higher-precision to lower-precision
    types, from 8-bit to 16-bit types, or between matrix types. There are no
    implicit array or structure conversions.

    (insert before the final paragraph of the section, p. 27) When performing
    implicit conversion for binary operators, there may be multiple data types
    to which the two operands can be converted. For example, when adding an
    int8_t value to a uint16_t value, both values can be implicitly converted
    to uint, uint64_t, float, and double. In such cases, a floating-point
    type is chosen if either operand has a floating-point type. Otherwise, an
    unsigned integer type is chosen if either operand has an unsigned integer
    type. Otherwise, a signed integer type is chosen. If operands can be
    converted to both 32- and 64-bit versions of the chosen base data type,
    the 32-bit version is used.


    Modify Section 4.3.4, Inputs, p. 31

    (modify third paragraph of section, p. 31, allowing explicitly-sized
    types) ... Vertex shader input variables can only be signed and unsigned
    integers, floats, doubles, explicitly-sized integers and floating-point
    values, vectors of any of these types, and matrices. ...

    (modify edits done in ARB_tessellation_shader adding support for "patch
    in", allowing for geometry shaders as well) Additionally, tessellation
    evaluation and geometry shaders support per-patch input variables declared
    with the "patch in" qualifier. Per-patch input ...


    (modify third paragraph, p. 32) ...
    Fragment inputs can only be signed and
    unsigned integers, floats, doubles, explicitly-sized integers and
    floating-point values, vectors of any of these types, matrices, or arrays
    or structures of these. Fragment inputs declared as signed or unsigned
    integers, doubles, 64-bit floating-point values, including vectors,
    matrices, or arrays derived from those types, must be qualified as "flat".


    Modify Section 4.3.6, Outputs, p. 33

    (modify third paragraph of the section, p. 33) ... They can only be signed
    and unsigned integers, floats, doubles, explicitly-sized integers and
    floating-point values, vectors of any of these types, matrices, or arrays
    or structures of these.

    (modify last paragraph, p. 33) ... Fragment outputs can only be signed
    and unsigned integers, floats, explicitly-sized integers and
    floating-point values with 32 or fewer bits, vectors of any of these
    types, or arrays of these. Doubles, 64-bit integers or floating-point
    values, vectors or arrays of those types, matrices, and structures cannot
    be output. ...


    Modify Section 4.3.8.1, Input Layout Qualifiers, p. 37

    (add to the list of qualifiers for geometry shaders, p. 37)

      layout-qualifier-id:
        ...
        triangles_adjacency
        patches

    (modify the "size of input arrays" table, p. 38)

      Layout          Size of Input Arrays
      ------------    --------------------
      patches         gl_MaxPatchVertices

    (add paragraph below that table, p. 38)

    When using the input primitive type "patches", the geometry shader is used
    to process a set of patches with vertex counts that may vary from patch to
    patch. For the purposes of input array sizing, patches are treated as
    having a vertex count fixed at the implementation-dependent maximum patch
    size, gl_MaxPatchVertices.
    If a shader reads an input corresponding to a
    vertex not found in the patch being processed, the values read are
    undefined.


    Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49

    (add after first list of constructor examples)

    Similar constructors are provided to convert to and from explicitly-sized
    scalar data types, as well:

      float(uint8_t)      // converts an 8-bit uint value to a float
      int64_t(double)     // converts a double value to a 64-bit int
      float64_t(int16_t)  // converts a 16-bit int value to a 64-bit float
      uint16_t(bool)      // converts a Boolean value to a 16-bit uint

    (replace final two paragraphs, p. 49, and the first paragraph, p. 50,
    using more general language)

    When constructors are used to convert any floating-point type to any
    integer type, the fractional part of the floating-point value is dropped.
    It is undefined to convert a negative floating point value to an unsigned
    integer type.

    When a constructor is used to convert any integer or floating-point type
    to bool, 0 and 0.0 are converted to false, and non-zero values are
    converted to true. When a constructor is used to convert a bool to any
    integer or floating-point type, false is converted to 0 or 0.0, and true
    is converted to 1 or 1.0.

    Constructors converting between signed and unsigned integers with the same
    bit count always preserve the bit pattern of the input. This will change
    the value of the argument if its most significant bit is set, converting a
    negative signed integer to a large unsigned integer, or vice versa.


    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for expressions with
    64-bit integer types)

    Expressions in the shading language are built from the following:

    * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
      types, and all matrix types.

    ...

    * The arithmetic binary operators add (+), subtract (-), multiply (*), and
      divide (/) operate on 32-bit integer, 64-bit integer, and floating-point
      scalars, vectors, and matrices. If the fundamental types of the
      operands do not match, the conversions from Section 4.1.10 "Implicit
      Conversions" are applied to produce matching types. ...

    * The operator modulus (%) operates on 32- and 64-bit integer scalars or
      vectors. If the fundamental types of the operands do not match, the
      conversions from Section 4.1.10 "Implicit Conversions" are applied to
      produce matching types. ...

    * The arithmetic unary operators negate (-), post- and pre-increment and
      decrement (-- and ++) operate on 32-bit integer, 64-bit integer, and
      floating-point values (including vectors and matrices). ...

    * The relational operators greater than (>), less than (<), greater than
      or equal (>=), and less than or equal (<=) operate only on scalar 32-bit
      integer, 64-bit integer, and floating-point expressions. The result is
      scalar Boolean. The fundamental type of the two operands must match,
      either as specified, or after one of the implicit type conversions
      specified in Section 4.1.10. ...

    * The equality operators equal (==), and not equal (!=) operate only on
      scalar 32-bit integer, 64-bit integer, and floating-point expressions.
      The result is scalar Boolean. The fundamental type of the two operands
      must match, either as specified, or after one of the implicit type
      conversions specified in Section 4.1.10. ...


    Modify Section 6.1, Function Definitions, p. 63

    (ARB_gpu_shader5 adds a set of rules for defining whether implicit
    conversions for one matching function definition are better or worse than
    those for another. These comparisons are done argument by argument.
    Extend the edits made by ARB_gpu_shader5 to add several new rules for
    comparing implicit conversions for a single argument, corresponding to the
    new data types introduced by this extension.)

    To determine whether the conversion for a single argument in one match is
    better than that for another match, the following rules are applied, in
    order:

      1. An exact match is better than a match involving any implicit
         conversion.

      2. A match involving a conversion from a signed integer, unsigned
         integer, or floating-point type to a similar type having a larger
         number of bits is better than a match involving any other implicit
         conversion. The set of conversions qualifying under this rule are:

           source types        destination types
           -----------------   -----------------
           int8_t, int16_t     int, int64_t
           int                 int64_t
           uint8_t, uint16_t   uint, uint64_t
           uint                uint64_t
           float16_t           float
           float               double

      3. A match involving one conversion in rule 2 is better than a match
         involving another conversion in rule 2 if:

           (a) both conversions start with the same type and the first
               conversion is to a type with a smaller number of bits (e.g.,
               converting from int16_t to int is preferred to converting
               int16_t to int64_t), or

           (b) both conversions end with the same type and the first
               conversion is from a type with a larger number of bits (e.g.,
               converting an "out" parameter from int16_t to int is preferred
               to converting from int8_t to int).

      4. A match involving an implicit conversion from any integer type to
         float is better than a match involving an implicit conversion from
         any integer type to double.


    Modify Section 7.1, Vertex and Geometry Shader Special Variables, p. 69

    (NOTE: These edits are written against the re-organized section in the
    ARB_tessellation_shader specification.)

    (add to the list of built-in inputs for geometry shaders) In the geometry
    language, built-in input and output variables are intrinsically declared
    as:

      in int gl_PatchVerticesIn;
      patch in float gl_TessLevelOuter[4];
      patch in float gl_TessLevelInner[2];

    ...

    The input variable gl_PatchVerticesIn behaves as in the identically-named
    tessellation control and evaluation shader inputs.

    The input variables gl_TessLevelOuter[] and gl_TessLevelInner[] behave as
    in the identically-named tessellation evaluation shader inputs.


    Modify Chapter 8, Built-in Functions, p. 81

    (add to description of generic types, last paragraph of p. 69) ... Where
    the input arguments (and corresponding output) can be int64_t, i64vec2,
    i64vec3, or i64vec4, <genI64Type> is used as the argument. Where the
    input arguments (and corresponding output) can be uint64_t, u64vec2,
    u64vec3, or u64vec4, <genU64Type> is used as the argument.


    Modify Section 8.3, Common Functions, p. 84

    (add support for 64-bit integer packing and unpacking functions)

    Syntax:

      int64_t  packInt2x32(ivec2 v);
      uint64_t packUint2x32(uvec2 v);

      ivec2  unpackInt2x32(int64_t v);
      uvec2  unpackUint2x32(uint64_t v);

    The functions packInt2x32() and packUint2x32() return a signed or unsigned
    64-bit integer obtained by packing the components of a two-component
    signed or unsigned integer vector, respectively. The first vector
    component specifies the 32 least significant bits; the second component
    specifies the 32 most significant bits.

    The functions unpackInt2x32() and unpackUint2x32() return a signed or
    unsigned integer vector built from a 64-bit signed or unsigned integer
    scalar, respectively. The first component of the vector contains the 32
    least significant bits of the input; the second component contains the 32
    most significant bits.
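
    The bit layout described above can be illustrated with a small host-side
    reference model. This is only a sketch in C, not GLSL and not part of the
    extension: the struct types uvec2_model and ivec2_model and the function
    names below are hypothetical stand-ins for the GLSL vector types and
    built-ins, chosen for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical host-side stand-ins for GLSL's uvec2 and ivec2. */
typedef struct { uint32_t x, y; } uvec2_model;
typedef struct { int32_t x, y; } ivec2_model;

/* Models packUint2x32(): x supplies bits 0..31, y supplies bits 32..63. */
uint64_t pack_uint_2x32(uvec2_model v)
{
    return ((uint64_t)v.y << 32) | (uint64_t)v.x;
}

/* Models unpackUint2x32(): x receives bits 0..31, y receives bits 32..63. */
uvec2_model unpack_uint_2x32(uint64_t v)
{
    uvec2_model r;
    r.x = (uint32_t)(v & 0xFFFFFFFFu);
    r.y = (uint32_t)(v >> 32);
    return r;
}

/* Models packInt2x32(): same bit layout, with the 64-bit pattern
 * reinterpreted as a signed value. */
int64_t pack_int_2x32(ivec2_model v)
{
    uvec2_model u;
    uint64_t bits;
    int64_t result;
    u.x = (uint32_t)v.x;   /* signed-to-unsigned conversion keeps the bits */
    u.y = (uint32_t)v.y;
    bits = pack_uint_2x32(u);
    memcpy(&result, &bits, sizeof result);
    return result;
}

/* Models unpackInt2x32(): splits the bit pattern, then reinterprets each
 * 32-bit half as a signed value. */
ivec2_model unpack_int_2x32(int64_t v)
{
    uint64_t bits;
    uvec2_model u;
    ivec2_model r;
    memcpy(&bits, &v, sizeof bits);
    u = unpack_uint_2x32(bits);
    memcpy(&r.x, &u.x, sizeof r.x);
    memcpy(&r.y, &u.y, sizeof r.y);
    return r;
}
```

    In this model, packing the pair (0x89ABCDEF, 0x01234567) yields
    0x0123456789ABCDEF, and unpacking that value returns the original pair.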


    (add support for 16-bit floating-point packing and unpacking functions)

    Syntax:

      uint      packFloat2x16(f16vec2 v);
      f16vec2   unpackFloat2x16(uint v);

    The function packFloat2x16() returns an unsigned integer obtained by
    interpreting the components of a two-component 16-bit floating-point
    vector as integers according to the OpenGL Specification, and then packing
    the two 16-bit integers into a 32-bit unsigned integer. The first vector
    component specifies the 16 least significant bits of the result; the
    second component specifies the 16 most significant bits.

    The function unpackFloat2x16() returns a two-component vector with 16-bit
    floating-point components obtained by unpacking a 32-bit unsigned integer
    into a pair of 16-bit values, and interpreting those values as 16-bit
    floating-point numbers according to the OpenGL Specification. The first
    component of the vector is obtained from the 16 least significant bits of
    the input; the second component is obtained from the 16 most significant
    bits.


    (add functions to get/set the bit encoding for floating-point values)

    64-bit floating-point data types in the OpenGL shading language are
    specified to be encoded according to the IEEE specification for
    double-precision floating-point values. The functions below allow shaders
    to convert double-precision floating-point values to and from 64-bit
    signed or unsigned integers representing their encoding.

    To obtain signed or unsigned integer values holding the encoding of a
    floating-point value, use:

      genI64Type doubleBitsToInt64(genDType value);
      genU64Type doubleBitsToUint64(genDType value);

    Conversions are done on a component-by-component basis.
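
    As a rough illustration of what "encoding" means here, the scalar
    reinterpretation and its inverse can be modeled on the host. This is a
    sketch in C, not GLSL; the function names are hypothetical, and it assumes
    the host's double uses the IEEE 754 64-bit format with the same byte order
    as its 64-bit integers.

```c
#include <stdint.h>
#include <string.h>

/* Models the scalar case of doubleBitsToUint64(): copies the 64-bit
 * encoding of the double without changing any bits.
 * Assumes double is an IEEE 754 binary64 value on the host. */
uint64_t double_bits_to_uint64(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Models the inverse reinterpretation: builds a double from a 64-bit
 * encoding, again without changing any bits. */
double uint64_bits_to_double(uint64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

    Under those assumptions, 1.0 maps to 0x3FF0000000000000, and converting
    that encoding back gives 1.0 again. These are bit reinterpretations, not
    numeric conversions such as int64_t(double).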

    To obtain a floating-point value corresponding to a signed or unsigned
    integer encoding, use:

      genDType int64BitsToDouble(genI64Type value);
      genDType uint64BitsToDouble(genU64Type value);


    (add functions to evaluate predicates over groups of threads)

    Syntax:

      bool anyThreadNV(bool value);
      bool allThreadsNV(bool value);
      bool allThreadsEqualNV(bool value);

    Implementations of the OpenGL Shading Language may, but are not required
    to, run multiple shader threads for a single stage as a SIMD thread group,
    where individual execution threads are assigned to thread groups in an
    undefined, implementation-dependent order. Algorithms may benefit from
    being able to evaluate a composite of boolean values over all active
    threads in the thread group.

    The function anyThreadNV() returns true if and only if <value> is true for
    at least one active thread in the group. The function allThreadsNV()
    returns true if and only if <value> is true for all active threads in the
    group. The function allThreadsEqualNV() returns true if <value> is the
    same for all active threads in the group; the result of
    allThreadsEqualNV() will be true if and only if anyThreadNV() and
    allThreadsNV() would return the same value.

    Since these functions depend on the values of <value> in an undefined
    group of threads, the value returned by these functions is largely
    undefined. However, anyThreadNV() is guaranteed to return true if <value>
    is true, and allThreadsNV() is guaranteed to return false if <value> is
    false.

    Since implementations are generally not required to combine threads into
    groups, simply returning <value> for anyThreadNV() and allThreadsNV() and
    returning true for allThreadsEqualNV() is a legal implementation of these
    functions.


    Modify Section 8.6, Vector Relational Functions, p. 90

    (modify the first paragraph, p. 90, adding support for relational
    functions operating on explicitly-sized types)

    Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or
    reserved) to operate on scalars and produce scalar Boolean results. For
    vector results, use the following built-in functions. In the definitions
    below, the following terms are used as placeholders for all vector types
    for a given fundamental data type:

      placeholder   fundamental types
      -----------   ------------------------------------------------
      bvec          bvec2, bvec3, bvec4

      ivec          ivec2, ivec3, ivec4, i8vec2, i8vec3, i8vec4,
                    i16vec2, i16vec3, i16vec4, i64vec2, i64vec3, i64vec4

      uvec          uvec2, uvec3, uvec4, u8vec2, u8vec3, u8vec4,
                    u16vec2, u16vec3, u16vec4, u64vec2, u64vec3, u64vec4

      vec           vec2, vec3, vec4, dvec2(*), dvec3(*), dvec4(*),
                    f16vec2, f16vec3, f16vec4

      (*) only if ARB_gpu_shader_fp64 is supported

    In all cases, the sizes of the input and return vectors for any
    particular call must match.


    Modify Section 8.7, Texture Lookup Functions, p. 91

    (modify text for textureOffset() functions, p. 94, allowing non-constant
    offsets)

    Do a texture lookup as in texture but with offset added to the (u,v,w)
    texel coordinates before looking up each texel. The value <offset> need
    not be constant; however, a limited range of offset values are supported.
    If any component of <offset> is less than MIN_PROGRAM_TEXEL_OFFSET_EXT or
    greater than MAX_PROGRAM_TEXEL_OFFSET_EXT, the offset applied to the
    texture coordinates is undefined. Note that offset does not apply to the
    layer coordinate for texture arrays. This is explained in detail in
    section 3.9.9 of the OpenGL Specification (Version 3.2, Compatibility
    Profile), where offset is (delta_u, delta_v, delta_w). Note that texel
    offsets are also not supported for cube maps.

    (Note: This lifting of the constant offset restriction also applies to
    texelFetchOffset, p. 95, textureProjOffset, p. 95, textureLodOffset,
    p. 96, textureProjLodOffset, p. 96.)


    (modify the description of the textureGradOffset() functions, p. 97,
    preserving the restriction on constant offsets)

    Do a texture lookup with both explicit gradient and offset, as described
    in textureGrad and textureOffset. For these functions, the offset value
    must be a constant expression. A limited range of offset values are
    supported; the minimum and maximum offset values are
    implementation-dependent and given by MIN_PROGRAM_TEXEL_OFFSET and
    MAX_PROGRAM_TEXEL_OFFSET, respectively.


    (modify the description of the textureProjGradOffset() functions,
    p. 98, preserving the restriction on constant offsets)

    Do a texture lookup projectively and with explicit gradient as described
    in textureProjGrad, as well as with offset, as described in textureOffset.
    For these functions, the offset value must be a constant expression. A
    limited range of offset values are supported; the minimum and maximum
    offset values are implementation-dependent and given by
    MIN_PROGRAM_TEXEL_OFFSET and MAX_PROGRAM_TEXEL_OFFSET, respectively.

    (modify the description of the textureGatherOffsets() functions,
    added in ARB_gpu_shader5, to remove the restriction on constant offsets)

    The textureGatherOffsets() functions operate identically ...
    selecting the texel T_i0_j0 of that footprint. The specified values in
    <offsets> need not be constant. A limited range of ...

    Modify Section 9, Shading Language Grammar, p. 92

    !!! TBD !!!


GLX Protocol

    TBD

Interactions with OpenGL ES 3.1

    If implemented in OpenGL ES, NV_gpu_shader5 acts as a superset
    of functionality provided by OES_gpu_shader5.

    A shader that enables this extension
    via an #extension directive also implicitly enables the common
    capabilities provided by OES_gpu_shader5.

    Replace references to ARB_gpu_shader5 with OES_gpu_shader5 and
    EXT_shader_implicit_conversions (as appropriate).
    Replace references to ARB_geometry_shader with OES/EXT_geometry_shader.
    Replace references to ARB_tessellation_shader with OES/EXT_tessellation_shader.

    Replace references to int64EXT and uint64EXT with int64 and uint64,
    respectively.

    The specification should be edited as follows to include new
    ProgramUniform* functions.

    (modify the ProgramUniform* language)

    The following commands:

        ....
        void ProgramUniform{1,2,3,4}{i64,ui64}NV
             (uint program, int location, T value);
        void ProgramUniform{1,2,3,4}{i64,ui64}vNV
             (uint program, int location, const T *value);

    operate identically to the corresponding command where "Program" is
    deleted from the name (and extension suffixes are dropped or updated
    appropriately) except, rather than updating the currently active program
    object, these "Program" commands update the program object named by the
    <program> parameter. ...

    Changes to Section 2.6.1 "Begin and End" don't apply.

    Disregard the introduction of 64-bit integer or floating-point vertex
    attribute types.

Interactions with OpenGL ES Shading Language 3.10, revision 3

    If implemented in GLSL ES, NV_gpu_shader5 acts as a superset
    of functionality provided by OES_gpu_shader5 and
    EXT_shader_implicit_conversions.

    A shader that enables this extension via an #extension directive
    also implicitly enables the common capabilities provided by
    OES_gpu_shader5 and EXT_shader_implicit_conversions.

    Replace references to ARB_tessellation_shader with OES/EXT_tessellation_shader.

    Implicit conversions between GLSL ES types are introduced by
    EXT_shader_implicit_conversions instead of ARB_gpu_shader5.

    Disregard the notion of 'double' types as vertex shader inputs.

    Section 4.1.7.2 "Images"
      Remove the third sentence, which restricts
      access to arrays of images to constant integral expressions.

      This essentially leaves it to the 'dynamically uniform integral
      expressions' default as OES_gpu_shader5 introduced.

    Modify Section 4.3.9 "Interface Blocks", as modified by OES_gpu_shader5

      NV_gpu_shader5 also lifts OES_gpu_shader5 restrictions with
      regard to indexing into arrays of uniform blocks and shader
      storage blocks.

      Change sentence
      "All indices used to index a shader storage block array must be
      constant integral expressions. A uniform block array can only
      be indexed with a dynamically uniform integral expression,
      otherwise results are undefined." into

      "Arbitrary indices may be used to index a uniform block array;
      integral constant expressions are not required. If the index
      used to access an array of uniform blocks is out-of-bounds,
      the results of the access are undefined."

      Indexing into arrays of shader storage blocks defaults to
      'dynamically uniform integral expressions'.

    Changes to Section 4.3.9, p.48 "Interface Blocks"

      Replace the sentence
      "All indices used to index a shader storage block array must be
      constant integral expressions. A uniform block array can only
      be indexed with a dynamically uniform integral expression,
      otherwise results are undefined."
      with
      "Arbitrary indices may be used to index a uniform block array;
      integral constant expressions are not required. If the index
      used to access an array of uniform blocks is out-of-bounds, the
      results of the access are undefined."

    4.4.1.1 "Compute Shader Inputs" change

      "layout-qualifier-id:
          local_size_x = integer-constant
          local_size_y = integer-constant
          local_size_z = integer-constant" into

      "layout-qualifier-id:
          local_size_x = integer-constant-expression
          local_size_y = integer-constant-expression
          local_size_z = integer-constant-expression"

    Section 4.4.1.gs "Geometry Shader Inputs" change

      "<layout-qualifier-id>
          ...
          invocations = integer-constant" into

      "<layout-qualifier-id>
          ...
          invocations = integer-constant-expression"

    Section 4.4.2 "Output Layout Qualifiers" change

      "layout-qualifier-id:
          location = integer-constant" into

      "layout-qualifier-id:
          location = integer-constant-expression"

    Section 4.4.2.ts "Tessellation Control Outputs" change

      "layout-qualifier-id
          vertices = integer-constant" into

      "layout-qualifier-id:
          vertices = integer-constant-expression"

    Section 4.4.3 "Uniform Variable Layout Qualifiers" change

      "layout-qualifier-id:
          location = integer-constant" into

      "layout-qualifier-id:
          location = integer-constant-expression"

    Section 4.4.4 "Uniform and Shader Storage Block Layout Qualifiers" change

      "layout-qualifier-id:
          ...
          binding = integer-constant" into

      "layout-qualifier-id:
          ...
          binding = integer-constant-expression"

    Section 4.4.5 "Opaque Uniform Layout Qualifiers" change

      "layout-qualifier-id:
          binding = integer-constant" into

      "layout-qualifier-id:
          binding = integer-constant-expression"

      Change sentence
      "A link-time error will result if two shaders in a program
      specify different integer-constant bindings for the same
      opaque-uniform name." into

      "A link-time error will result if two shaders in a program
      specify different bindings for the same opaque-uniform
      name."

    Section 4.4.6 "Atomic Counter Layout Qualifiers" change

      "layout-qualifier-id:
          binding = integer-constant
          offset = integer-constant" into

      "layout-qualifier-id:
          binding = integer-constant-expression
          offset = integer-constant-expression"

    Section 4.4.7 "Format Layout Qualifiers" change

      "layout-qualifier-id:
          ...
          binding = integer-constant" into

      "layout-qualifier-id:
          ...
          binding = integer-constant-expression"

    Section 4.7.3 "Precision Qualifiers"

      After "Literal constants do not have precision qualifiers." add
      "Neither do explicitly sized types such as int8_t, uint32_t,
      float16_t etc."

Dependencies on OES_gpu_shader5

    In addition to allowing arbitrary indexing of arrays of samplers, this
    extension also lifts OES_gpu_shader5 restrictions for indexing
    arrays of images and shader storage blocks. Additionally, it allows
    usage of 'integer-constant-expressions' for layout qualifiers that
    formerly took 'integer-constant'.

    In Section 'Overview': change the bullet point

    "* the ability to aggregate samplers into arrays...."

    to

    "* the ability to index into arrays of samplers, uniforms and shader
       storage blocks with arbitrary expressions, and not require that
       non-constant indices be uniform across all shader invocations."

    "* the ability to index into arrays of images using dynamically
       uniform integers."

    "* the ability to use 'integer-constant-expressions' in place of
       'integer-constant' for layout qualifiers."

Dependencies on OES/EXT_tessellation_shader and OpenGL ES 3.2

    If implemented in OpenGL ES 3.1 or earlier and
    OES/EXT_tessellation_shader is not supported, language introduced by
    this extension describing processing patches in geometry shaders,
    transform feedback, and rasterization should be removed.

    If implemented in OpenGL ES 3.2 or implemented in
    OpenGL ES 3.1 and OES/EXT_tessellation_shader is supported:

    It is legal to send patches past the tessellation stage -- the
    following language from OES/EXT_tessellation_shader is removed:

      Patch primitives are not supported by pipeline stages below the
      tessellation evaluation shader.

    It is legal to use a tessellation control shader without a tessellation
    evaluation shader.

    Remove from the bullet list describing reasons for link failure below the
    LinkProgram command on p. 70 (as modified by OES/EXT_tessellation_shader):

      * the program is not separable and contains no object to form a
        tessellation evaluation shader; or

    Modify section 11.1.2.1, "Output Variables" on p. 262 (as modified
    by the OES/EXT_geometry_shader extension):

      Into the paragraph starting with
      "Each program object can specify a set of output variables from one
      shader to be recorded in transform feedback mode..."

      Insert after the tessellation evaluation shader bullet point:
        * tessellation control shader


    Modify section 11.1.3.11, "Validation" to replace the bullet point
    starting with "One but not both of the tessellation..." on p. 271

      * the tessellation evaluation but not tessellation control stage
        has an active program with corresponding executable shader.


    Modify section 11.1ts, "Tessellation"

      Replace
      "Tessellation is considered active if and only if the active
      program object or program pipeline object includes both a
      tessellation control shader and a tessellation evaluation shader."
      with
      "Tessellation is considered active if and only if the active
      program object or program pipeline object includes a tessellation
      control shader."

      Replace
      "An INVALID_OPERATION error is generated by any command that
      transfers vertices to the GL if the current program state has one
      but not both of a tessellation control shader and tessellation
      evaluation shader."
      with
      "An INVALID_OPERATION error is generated by any command that
      transfers vertices to the GL if the current program state has a
      tessellation evaluation shader but not a tessellation control
      shader."

    Modify section 12.1.2 "Transform Feedback Primitive Capture"

    Replace the second paragraph of the section on p. 274 (as modified
    by OES/EXT_tessellation_shader):

    The data captured in transform feedback mode depends on the active
    programs on each of the shader stages. If a program is active for the
    geometry shader stage, transform feedback captures the vertices of each
    primitive emitted by the geometry shader. Otherwise, if a program is
    active for the tessellation evaluation shader stage, transform feedback
    captures each primitive produced by the tessellation primitive generator,
    whose vertices are processed by the tessellation evaluation shader.
    Otherwise, if a program is active for the tessellation control shader stage,
    transform feedback captures each output patch of that stage.
    Otherwise, transform feedback captures each primitive processed by the
    vertex shader.

    Modify the second paragraph following ResumeTransformFeedback on p. 277
    (as modified by OES/EXT_tessellation_shader):

    When transform feedback is active and not paused ... If a tessellation
    or geometry shader is active, the type of primitive emitted
    by that shader is used instead of the <mode> parameter passed to drawing
    commands for the purposes of this error check. If tessellation
    and geometry shaders are both active, the output primitive
    type of the geometry shader will be used for the purposes of this error.
    Any primitive type may be used while transform feedback is paused.


    Modify section 13.3, "Points"

      After
      "The point size is determined by the last active stage before the
      rasterizer:"

      Add a new bullet point to the list, between the
      tessellation evaluation shader and the vertex shader:

      * the tessellation control shader, if active and no tessellation
        evaluation shader is active;

Dependencies on OES/EXT_geometry_shader

    If implemented in GLSL ES and OES/EXT_geometry_shader is not supported,
    disregard all changes to geometry shader related functionality.

Dependencies on ARB_gpu_shader5

    This extension also incorporates all the changes to the OpenGL Shading
    Language made by ARB_gpu_shader5; enabling this extension by a #extension
    directive in shader code also enables all features of ARB_gpu_shader5 as
    though the shader code has also declared

      #extension GL_ARB_gpu_shader5 : enable

    The converse is not true; implementations supporting both extensions
    should not provide the shading language features in this extension if
    shader code #extension directives enable only ARB_gpu_shader5.

    This specification and ARB_gpu_shader5 both lift the restriction in GLSL
    1.50 requiring that indexing in arrays of samplers must be done with
    constant expressions. However, ARB_gpu_shader5 specifies that results are
    undefined if the indices would diverge if multiple shader invocations are
    run in lockstep. This extension does not impose the non-divergent
    indexing requirement.

Dependencies on ARB_gpu_shader_fp64

    This extension and ARB_gpu_shader_fp64 both provide support for shading
    language variables with 64-bit components. If both extensions are
    supported, the various edits describing this new support should be
    combined.

    If ARB_gpu_shader_fp64 is not supported, the following edits should be
    removed:

      * language adding the data types "float64_t", "f64vec2", "f64vec3", and
        "f64vec4";

      * language allowing implicit conversions of various types to double,
        dvec2, dvec3, or dvec4; and

      * the built-in functions doubleBitsToInt64(), doubleBitsToUint64(),
        int64BitsToDouble(), and uint64BitsToDouble().

Dependencies on ARB_tessellation_shader

    If ARB_tessellation_shader is not supported, language introduced by this
    extension describing processing patches in geometry shaders, transform
    feedback, and rasterization should be removed.

    If this extension and ARB_tessellation_shader are supported, it is legal
    to send patches past the tessellation stage -- the following language from
    ARB_tessellation_shader is removed:

      Patch primitives are not supported by pipeline stages below the
      tessellation evaluation shader. If there is no active program object or
      the active program object does not contain a tessellation evaluation
      shader, the error INVALID_OPERATION is generated by Begin (or vertex
      array commands that implicitly call Begin) if the primitive mode is
      PATCHES.

Dependencies on NV_shader_buffer_load

    If NV_shader_buffer_load is supported, that specification should be edited
    as follows, to allow pointers to dereference the new data types added by
    this extension.

    Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load.

    (add rules for loads of variables having the new data types from this
    extension to the list of bullets following "When a shader dereferences a
    pointer variable")

    - Data of type "int8_t", "int16_t", "int32_t", and "int64_t" are read
      from or written to memory as a single 8-, 16-, 32-, or 64-bit signed
      integer value at the specified GPU address.

    - Data of type "uint8_t", "uint16_t", "uint32_t", and "uint64_t" are read
      from or written to memory as a single 8-, 16-, 32-, or 64-bit unsigned
      integer value at the specified GPU address.

    - Data of type "float16_t", "float32_t", and "float64_t" are read from or
      written to memory as a single 16-, 32-, or 64-bit floating-point value
      at the specified GPU address.

Dependencies on EXT_direct_state_access

    If EXT_direct_state_access is supported, that specification should be
    edited as follows to include new ProgramUniform* functions.

    (modify the ProgramUniform* language)

    The following commands:

        ....
        void ProgramUniform{1,2,3,4}{i64,ui64}NV
             (uint program, int location, T value);
        void ProgramUniform{1,2,3,4}{i64,ui64}vNV
             (uint program, int location, const T *value);

    operate identically to the corresponding command where "Program" is
    deleted from the name (and extension suffixes are dropped or updated
    appropriately) except, rather than updating the currently active program
    object, these "Program" commands update the program object named by the
    <program> parameter. ...

Dependencies on EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit

    The EXT_vertex_attrib_64bit extension provides the ability to specify
    64-bit floating-point vertex attributes in a GLSL vertex shader and to
    specify the values of these attributes via the OpenGL API. To
    successfully compile vertex shaders with fp64 input variables, it is
    necessary to include

      #extension GL_EXT_vertex_attrib_64bit : enable

    in the shader text.

    However, this extension is considered to enable 64-bit
    floating-point and integer inputs. Provided EXT_vertex_attrib_64bit
    and NV_vertex_attrib_integer_64bit are supported, including the
    following code in a vertex shader

      #extension GL_NV_gpu_shader5 : enable

    will enable 64-bit floating-point or integer input variables whose
    values would be specified using the OpenGL API mechanisms found in
    the EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit
    extensions.


Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) What implicit conversions are supported by this extension on top of
        those provided by related extensions?

      RESOLVED: ARB_gpu_shader5 and ARB_gpu_shader_fp64 provide new implicit
      conversions from "int" to "uint", and from "int", "uint", and "float" to
      "double".

      This extension provides integer types of multiple sizes and supports
      implicit conversions from small integer types to 32- or 64-bit integer
      types of the same signedness, as well as to float and double. It also
      provides floating-point types of multiple sizes and supports implicit
      conversions from smaller to larger types. Additionally, it supports
      conversion from 64-bit integer types to double.

    (2) How do these implicit conversions impact binary operators?

      RESOLVED: For binary operators, we prefer converting to a common type
      that is as close as possible in size and type to the original
      expression.

    (3) How do these implicit conversions impact function overloading rules?

      RESOLVED: We extend the preference rules in ARB_gpu_shader5 to account
      for the new data types, adding rules to:

        * favor new "promotions" in integer/floating point types (previously,
          the only promotion was float-to-double)

        * for promotions, favor conversion to the type closer in size (e.g.,
          prefer converting from int16_t to int over converting to int64_t)

    (4) What should be done to distinguish between 32- and 64-bit integer
        constants?

      RESOLVED: We will use "L" and "UL" to identify signed and unsigned
      64-bit integer constants; the use of "L" matches a similar ("long")
      suffix in the C programming language. C leaves the size of integer
      types implementation-dependent, and many implementations require an "LL"
      suffix to declare 64-bit integer constants. With our size definitions,
      "L" will be considered sufficient to make an integer constant 64-bit.

    (5) Should we provide support for vertex attributes with 64-bit
        components, and if so, how should the support be provided in the
        OpenGL API?

      RESOLVED: Yes, this seems like useful functionality, particularly for
      applications wanting to provide double-precision or 64-bit integer data
      to shaders performing computations on such types. We provide
      VertexAttribL* entry points for 64-bit components in the separate
      EXT_vertex_attrib_64bit and NV_vertex_attrib_integer_64bit extensions,
      which should be supported on all implementations supporting this
      extension.

    (6) Should we allow vertex attributes with 8- or 16-bit components in the
        shading language, and if so, how does it interact with the OpenGL API?

      RESOLVED: Yes, but we will use existing APIs to specify such
      attributes, which already typically allow 8- and 16-bit components on
      the API side. Vertex attribute components (other than 64-bit ones)
      specified by the API will be converted from the type specified in the
      vertex attribute commands to the component type of the attribute. For
      floating-point values, that may involve 16-to-32 bit conversion or vice
      versa. For integer types, that may involve dropping all but the least
      significant bits of attribute components.

    (7) Should we support uniforms with double or 64-bit attribute types, and
        if so, how? Should we support uniforms with <32-bit components, and
        if so, how?

      RESOLVED: We will support uniforms of all component types, either in a
      buffer object (via OpenGL 3.1 or ARB_uniform_buffer_object) or in
      storage associated with the program.

      When uniforms are stored in a buffer object, they are stored using their
      native data types according to the pre-existing packing and layout
      rules. Those rules were already written to be able to accommodate both
      the larger and smaller new data types.

      Uniforms stored in program objects are loaded with Uniform* APIs.
      There are no pre-existing uniform APIs accepting doubles or other "long"
      types, so there was no clear need to add an extra "L" to the name to
      distinguish from other APIs like we do with VertexAttribL* APIs.

      Uniforms with 8- and 16-bit components are loaded with the "larger"
      Uniform*{i,ui,f} APIs; it didn't seem worth it to add numerous entry
      points to the APIs to handle all those new types.

    (8) How do the uniform loading commands introduced by this extension
        interact with similar commands added by NV_shader_buffer_load?

      RESOLVED: NV_shader_buffer_load provided the command Uniformui64NV to
      load pointer uniforms with a single 64-bit unsigned integer. This
      extension provides vectors of 64-bit unsigned integers, so we needed
      Uniform{2,3,4}ui64NV commands. We chose to provide a Uniform1ui64NV
      command, which will be functionally equivalent to Uniformui64NV.

    (9) How will transform feedback work for capturing variables with double
        or 64-bit components? Should we support transform feedback on
        variables with components with fewer than 32 bits?

      RESOLVED: Transform feedback will support variables with any component
      size. Components with fewer than 32 bits are converted to their
      equivalent 32-bit types.

      For doubles and variables with 64-bit components, each component
      captured will count as a 64-bit value and occupy two components for the
      purpose of component counting rules. This could be a problem for the
      SEPARATE_ATTRIBS mode, since the minimum component limit is four, which
      would not be sufficient to capture a dvec3 or dvec4. However,
      implementations supporting this extension should also be able to support
      ARB_transform_feedback3, which extends INTERLEAVED_ATTRIBS mode to
      capture vertex attribute values interleaved into multiple buffers.
      That functionality effectively obsoletes the SEPARATE_ATTRIBS mode,
      since it is a functional superset.

      We considered support for capturing 8- and 16-bit values directly, which
      had a number of problems. First, full byte addressing might impose both
      alignment issues (e.g., capturing a uint8_t followed by a float might
      misalign the float) and additional hardware implementation burdens. One
      other option would be to pack multiple values into a 32-bit integer
      (e.g., f16vec2 would be packed with .x in the LSBs and .y in the MSBs).
      This could work, even with word addressing, but would require padding
      for odd sizes (e.g., f16vec3 padded to two words, with the second word
      holding only .z). It would also have endianness issues; packed values
      would look like arrays of the corresponding smaller type on
      little-endian systems, but not on big-endian ones.

    (10) What precision will be used for computation, storage, and inter-stage
         transfer of 8- and 16-bit component data types?

      RESOLVED: The components may be considered to occupy a full 32 bits for
      the purposes of input/output component count limits. 8- and 16-bit
      values should, however, be passed at that precision.

    (11) Is the new support for non-constant texel offsets completely
         orthogonal?

      RESOLVED: No. Non-constant offsets are not supported for the existing
      functions textureGradOffset() and textureProjGradOffset().

    (12) Should we provide functions like intBitsToFloat() that operate on
         16-bit floating-point values?

      RESOLVED: Not in this extension.
      Such conversions can be performed using the following code:

        // Pack <v> into the low 16 bits of a uint, then keep those bits.
        uint16_t float16BitsToUint16(float16_t v)
        {
          return uint16_t(packFloat2x16(f16vec2(v, 0)));
        }

        // Unpack the low 16 bits of <v> as the .x component.
        float16_t uint16BitsToFloat16(uint16_t v)
        {
          return unpackFloat2x16(uint(v)).x;
        }

    (13) Should we provide distinct sized types for 32-bit integers and
         floats, and 64-bit floats? Should we provide those types as aliases
         for existing unsized types? Or should we provide no such types at
         all?

      RESOLVED: We will provide sized versions of these types, which are
      defined as completely equivalent to unsized types according to the
      following table:

        unsized type    sized types
        -------------   ---------------
        int             int32_t
        uint            uint32_t
        float           float32_t
        double          float64_t

      Vector types with sized and unsized components have equivalent
      relationships.

      Note that the nominally "unsized" data types in the GLSL 1.30 spec are
      actually sized. The specification explicitly defines signed and
      unsigned integers (int, uint) to be 32-bit values. It also defines
      floating-point values to "match the IEEE single precision floating-point
      definition for precision and dynamic range", which are also 32-bit
      values.

      This type equivalence has minor implications on function overloading:

        * You can't declare separate versions of a function with an "int"
          argument in one version and an "int32_t" argument in another.

        * Because there is no implicit conversion between equivalent types, we
          will get an exact match if an argument is declared with one type
          (e.g., "int") in the caller and a textually different but equivalent
          type ("int32_t") in the function.

      Note that the type equivalence also applies to API data type queries.
      For example, the type INT will be returned for a variable declared as
      "int32_t".
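
      The type equivalence described in issue (13) can be modeled outside the
      shading language. The following C sketch is illustrative only: the
      names type_alias, canonical_type(), and overloads_conflict() are
      hypothetical and are not part of OpenGL or GLSL. It maps each sized
      scalar alias from the table above to its unsized type, and treats two
      single-argument overloads as conflicting when their argument types
      resolve to the same type.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table (illustration only): sized name -> unsized type. */
struct type_alias {
    const char *sized;
    const char *unsized;
};

static const struct type_alias k_aliases[] = {
    { "int32_t",   "int"    },
    { "uint32_t",  "uint"   },
    { "float32_t", "float"  },
    { "float64_t", "double" },
};

/* Return the unsized spelling of a scalar type name.  Names without an
 * equivalent unsized type (e.g. "int16_t") are returned unchanged. */
static const char *canonical_type(const char *name)
{
    for (size_t i = 0; i < sizeof k_aliases / sizeof k_aliases[0]; ++i) {
        if (strcmp(name, k_aliases[i].sized) == 0)
            return k_aliases[i].unsized;
    }
    return name;
}

/* Two single-argument overloads cannot coexist when their argument types
 * are equivalent, i.e. resolve to the same canonical type. */
static int overloads_conflict(const char *arg_a, const char *arg_b)
{
    return strcmp(canonical_type(arg_a), canonical_type(arg_b)) == 0;
}
```

      Under this model, canonical_type("int32_t") yields "int", so an overload
      taking "int" conflicts with one taking "int32_t", while "int64_t"
      remains a distinct type.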

    (14) What are functions like anyThreadNV() and allThreadsNV() good for?

      RESOLVED: If an implementation performs SIMD thread execution,
      divergent branching may result in reduced performance if the "if" and
      "else" blocks of an "if" statement are executed sequentially. For
      example, an algorithm may have both a "fast path" that performs a
      computation quickly for a subset of all cases and a "slow path" that
      performs a computation more slowly but correctly for all cases. When
      performing SIMD execution, code like the following:

        if (condition) {
          result = do_fast_path(...);
        } else {
          result = do_slow_path(...);
        }

      may end up executing *both* the fast and slow paths for a SIMD thread
      group if <condition> diverges, and may execute more slowly than simply
      executing the slow path unconditionally. These functions allow code
      like:

        if (allThreadsNV(condition)) {
          result = do_fast_path(...);
        } else {
          result = do_slow_path(...);
        }

      that executes the fast path if and only if it can be used for *all*
      threads in the group. For thread groups where <condition> diverges,
      this algorithm would unconditionally run the slow path, but would never
      run both in sequence.

      There may be other cases where "voting" across shader invocations may be
      useful. Note that we provide no control over how shader invocations may
      be packed within a SIMD thread group, unlike various "compute" APIs
      (CUDA, OpenCL).

    (15) Can the 64-bit uniform APIs be used to load values for uniforms of
         type "bool", "bvec2", "bvec3", or "bvec4"?

      RESOLVED: No. OpenGL 2.0 and beyond did allow "bool" variables to be
      set with Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that
      support to Uniform*ui* for orthogonality. But it seems pointless to
      extend this capability forward to 64-bit Uniform APIs as well.
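
      The trade-off described in issue (14) can be modeled on the CPU. The
      following C sketch is a simplified model, not a description of any
      particular hardware: it assumes a SIMD group runs both sides of a
      branch in sequence whenever the condition diverges. All names
      (all_threads, any_thread, passes_plain_branch, voted_path) are
      hypothetical stand-ins chosen for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum path { PATH_FAST, PATH_SLOW };

/* Model of allThreadsNV(): true only if the condition holds for every
 * thread in the group. */
static bool all_threads(const bool *cond, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (!cond[i])
            return false;
    }
    return true;
}

/* Model of anyThreadNV(): true if the condition holds for at least one
 * thread in the group. */
static bool any_thread(const bool *cond, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (cond[i])
            return true;
    }
    return false;
}

/* Assumed cost model for a plain "if (condition)": a group whose threads
 * disagree runs both paths in sequence (2 passes), otherwise one. */
static int passes_plain_branch(const bool *cond, size_t n)
{
    bool diverges = any_thread(cond, n) && !all_threads(cond, n);
    return diverges ? 2 : 1;
}

/* With "if (allThreadsNV(condition))" the branch condition is the same
 * for every thread, so the whole group takes exactly one path. */
static enum path voted_path(const bool *cond, size_t n)
{
    return all_threads(cond, n) ? PATH_FAST : PATH_SLOW;
}
```

      In this model a group with conditions {true, false, true, true} costs
      two passes with the plain branch, while the voted branch sends the
      whole group down the slow path in a single pass.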

    (19) The ARB_tessellation_shader extension adds support for patch
         primitives that might survive to the transform feedback stage. How
         are such primitives captured?

      RESOLVED: If patch primitives survive to the transform feedback stage,
      they are recorded on a patch-by-patch basis. Incomplete patches are not
      recorded. As with other primitive types, if the transform feedback
      buffers do not contain enough space to capture an entire patch, no
      vertices are recorded.

      Note that the only way to get patch primitives all the way to transform
      feedback is to have tessellation evaluation and geometry shaders
      disabled; the output streams from both of those shader stages are
      collections of points, lines, or triangles.

    (20) Previous transform feedback allowed capturing only fixed-size
         primitives; this extension supports variable-sized patches. What
         interactions does this functionality have with transform feedback
         buffer overflow?

      RESOLVED: With fixed-size point, line, or triangle primitives, once any
      primitive fails to be recorded due to insufficient space, all subsequent
      primitives would also fail. With variable-size patch primitives, the
      transform feedback stage might first receive a large patch that doesn't
      fit, followed by a smaller patch that could squeeze into the remaining
      space.

      To allow for different types of implementation of this extension without
      requiring special-case handling of this corner case, we've chosen to
      leave this behavior undefined -- the smaller patch may or may not be
      recorded.


Revision History

    Rev.
          Date      Author    Changes
    ----  --------  --------  -----------------------------------------
     11   03/07/17  mheyer    Update OpenGL ES interactions to clarify
                              that using a tessellation control shader
                              without a tessellation evaluation shader
                              is legal, and PATCHES can be sent past the
                              tessellation stage.

     10   04/16/16  mheyer    Add OpenGL ES interactions (written before
                              revision 9, but not published)

      9   02/19/16  pbrown    Clarify that non-constant offset vectors are
                              supported in textureGatherOffsets().

      8   09/11/14  pbrown    Fix incorrect implicit conversions, which
                              follow the general pattern of little->big
                              and int->uint->float. Thanks to Daniel
                              Rakos, author of similar functionality in
                              the AMD_gpu_shader_int64 spec.

      7   11/08/10  pbrown    Fix typos in description of packFloat2x16 and
                              unpackFloat2x16.

      6   03/23/10  pbrown    Update overview, dependencies, remove references
                              to old extension names. Extend the function
                              overloading prioritization rules from
                              ARB_gpu_shader5 to account for new data types.
                              Major overhaul of the issues section to match
                              the refactoring done to produce ARB specs.

      5   03/08/10  pbrown    Add interaction with EXT_vertex_attrib_64bit and
                              NV_vertex_attrib_integer_64bit; enabling this
                              extension automatically enables 64-bit floating-
                              point and integer vertex inputs.

      4   03/01/10  pbrown    Fix prototype for GetUniformui64vNV.

      3   01/14/10  pbrown    Fix with updated enum assignments.

      2   12/08/09  pbrown    Add explicit component counting rules for
                              64-bit integer attributes similar to those
                              in the ARB_gpu_shader_fp64 spec.

      1             pbrown    Internal revisions.