Name

    MESA_shader_integer_functions

Name Strings

    GL_MESA_shader_integer_functions

Contact

    Ian Romanick <ian.d.romanick@intel.com>

Contributors

    All the contributors of GL_ARB_gpu_shader5

Status

    Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later

Version

    Version 3, March 31, 2017

Number

    OpenGL Extension #495

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile)
    Specification.

    This extension is written against Version 1.50 (Revision 09) of the OpenGL
    Shading Language Specification.

    GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.

    This extension interacts with ARB_gpu_shader5.

    This extension interacts with ARB_gpu_shader_fp64.

    This extension interacts with NV_gpu_shader5.

Overview

    GL_ARB_gpu_shader5 extends GLSL in a number of useful ways. Much of this
    added functionality requires significant hardware support. There are many
    aspects, however, that can be easily implemented on any GPU with "real"
    integer support (as opposed to simulating integers using floating point
    calculations).

    This extension provides a set of new features to the OpenGL Shading
    Language to support capabilities of these GPUs, extending the
    capabilities of version 1.30 of the OpenGL Shading Language and version
    3.00 of the OpenGL ES Shading Language.
    Shaders using the new functionality provided by this extension should
    enable this functionality via the construct

      #extension GL_MESA_shader_integer_functions : require (or enable)

    This extension provides a variety of new features for all shader types,
    including:

      * support for implicitly converting signed integer types to unsigned
        types, as well as more general implicit conversion and function
        overloading infrastructure to support new data types introduced by
        other extensions;

      * new built-in functions supporting:

        * splitting a floating-point number into a significand and exponent
          (frexp), or building a floating-point number from a significand and
          exponent (ldexp);

        * integer bitfield manipulation, including functions to find the
          position of the most or least significant set bit, count the number
          of one bits, and bitfield insertion, extraction, and reversal;

        * extended integer precision math, including add with carry, subtract
          with borrow, and extended multiplication.

    The resulting extension is a strict subset of GL_ARB_gpu_shader5.

IP Status

    No known IP claims.

New Procedures and Functions

    None

New Tokens

    None

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_MESA_shader_integer_functions : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_MESA_shader_integer_functions        1


    Modify Section 4.1.10, Implicit Conversions, p. 27

    (modify table of implicit conversions)

                                Can be implicitly
        Type of expression        converted to
        ---------------------   -----------------
        int                     uint, float
        ivec2                   uvec2, vec2
        ivec3                   uvec3, vec3
        ivec4                   uvec4, vec4

        uint                    float
        uvec2                   vec2
        uvec3                   vec3
        uvec4                   vec4

    (modify second paragraph of the section) No implicit conversions are
    provided to convert from unsigned to signed integer types or from
    floating-point to integer types. There are no implicit array or structure
    conversions.

    (insert before the final paragraph of the section) When performing
    implicit conversion for binary operators, there may be multiple data types
    to which the two operands can be converted. For example, when adding an
    int value to a uint value, both values can be implicitly converted to uint
    and float. In such cases, a floating-point type is chosen if either
    operand has a floating-point type. Otherwise, an unsigned integer type is
    chosen if either operand has an unsigned integer type. Otherwise, a
    signed integer type is chosen.


    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for implicit conversion
    between signed and unsigned types)

    Expressions in the shading language are built from the following:

    * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
      types, and all matrix types.

    ...

    * The operator modulus (%) operates on signed or unsigned integer scalars
      or vectors. If the fundamental types of the operands do not match, the
      conversions from Section 4.1.10 "Implicit Conversions" are applied to
      produce matching types. ...


    Modify Section 6.1, Function Definitions, p. 63

    (modify description of overloading, beginning at the top of p. 64)

    Function names can be overloaded. The same function name can be used for
    multiple functions, as long as the parameter types differ. If a function
    name is declared twice with the same parameter types, then the return
    types and all qualifiers must also match, and it is the same function
    being declared. For example,

      vec4 f(in vec4 x, out vec4 y);        // (A)
      vec4 f(in vec4 x, out uvec4 y);       // (B) okay, different argument type
      vec4 f(in ivec4 x, out uvec4 y);      // (C) okay, different argument type

      int  f(in vec4 x, out ivec4 y);       // error, only return type differs
      vec4 f(in vec4 x, in vec4 y);         // error, only qualifier differs
      vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs

    When function calls are resolved, an exact type match for all the
    arguments is sought. If an exact match is found, all other functions are
    ignored, and the exact match is used. If no exact match is found, then
    the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
    applied to find a match. Mismatched types on input parameters (in or
    inout or default) must have a conversion from the calling argument type
    to the formal parameter type.
    Mismatched types on output parameters (out or inout) must have a
    conversion from the formal parameter type to the calling argument type.

    If implicit conversions can be used to find more than one matching
    function, a single best-matching function is sought. To determine a best
    match, the conversions between calling argument and formal parameter
    types are compared for each function argument and pair of matching
    functions. After these comparisons are performed, each pair of matching
    functions is compared. A function definition A is considered a better
    match than function definition B if:

      * for at least one function argument, the conversion for that argument
        in A is better than the corresponding conversion in B; and

      * there is no function argument for which the conversion in B is better
        than the corresponding conversion in A.

    If a single function definition is considered a better match than every
    other matching function definition, it will be used. Otherwise, a
    semantic error occurs and the shader will fail to compile.

    To determine whether the conversion for a single argument in one match is
    better than that for another match, the following rules are applied, in
    order:

      1. An exact match is better than a match involving any implicit
         conversion.

      2. A match involving an implicit conversion from float to double is
         better than a match involving any other implicit conversion.

      3. A match involving an implicit conversion from either int or uint to
         float is better than a match involving an implicit conversion from
         either int or uint to double.

    If none of the rules above apply to a particular pair of conversions,
    neither conversion is considered better than the other.

    For the function prototypes (A), (B), and (C) above, the following
    examples show how the rules apply to different sets of calling argument
    types:

      f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
      f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
      f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
                            //   (C) not relevant, can't convert vec4 to
                            //   ivec4. (A) better than (B) for 2nd
                            //   argument (rule 2), same on first argument.
      f(ivec4, vec4);       // NOT matched. All three match by implicit
                            //   conversion. (C) is better than (A) and (B)
                            //   on the first argument. (A) is better than
                            //   (B) and (C) on the second argument.


    Modify Section 8.3, Common Functions, p. 84

    (add support for single-precision frexp and ldexp functions)

    Syntax:

      genType frexp(genType x, out genIType exp);
      genType ldexp(genType x, in genIType exp);

    The function frexp() splits each single-precision floating-point number in
    <x> into a binary significand, a floating-point number in the range [0.5,
    1.0), and an integral exponent of two, such that:

      x = significand * 2 ^ exponent

    The significand is returned by the function; the exponent is returned in
    the parameter <exp>. For a floating-point value of zero, the significand
    and exponent are both zero. For a floating-point value that is an
    infinity or is not a number, the results of frexp() are undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value returned by the function and the value
    written to <exp> are vectors with the same number of components as <x>.
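
    The frexp() contract above can be illustrated with a short GLSL sketch.
    It is only an illustration, not part of the extension: the helper names
    floorLog2 and splitAndRebuild are hypothetical, the inputs are assumed
    to be finite (and, for floorLog2, positive and normalized), and ldexp()
    is used through the prototype listed in the Syntax block above.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

// Hypothetical helper: floor(log2(x)) for finite, normalized x > 0.
// frexp() reports x = significand * 2^e with significand in [0.5, 1.0),
// which means 2^(e-1) <= x < 2^e.
int floorLog2(float x)
{
    int e;
    frexp(x, e);        // the significand is not needed here
    return e - 1;       // x = 8.0 gives significand 0.5, e = 4, result 3
}

// Hypothetical helper: splits x into significand and exponent, then
// rebuilds it with ldexp(). For finite x the result is expected to be x.
float splitAndRebuild(float x)
{
    int e;
    float significand = frexp(x, e);
    return ldexp(significand, e);
}
```

    The vector overloads work the same way per component, for example
    frexp(vec4, out ivec4).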

    The function ldexp() builds a single-precision floating-point number from
    each significand component in <x> and the corresponding integral exponent
    of two in <exp>, returning:

      significand * 2 ^ exponent

    If this product is too large to be represented as a single-precision
    floating-point value, the result is considered undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value passed in <exp> and returned by the
    function are vectors with the same number of components as <x>.


    (add support for new integer built-in functions)

    Syntax:

      genIType bitfieldExtract(genIType value, int offset, int bits);
      genUType bitfieldExtract(genUType value, int offset, int bits);

      genIType bitfieldInsert(genIType base, genIType insert, int offset,
                              int bits);
      genUType bitfieldInsert(genUType base, genUType insert, int offset,
                              int bits);

      genIType bitfieldReverse(genIType value);
      genUType bitfieldReverse(genUType value);

      genIType bitCount(genIType value);
      genIType bitCount(genUType value);

      genIType findLSB(genIType value);
      genIType findLSB(genUType value);

      genIType findMSB(genIType value);
      genIType findMSB(genUType value);

    The function bitfieldExtract() extracts bits <offset> through
    <offset>+<bits>-1 from each component in <value>, returning them in the
    least significant bits of the corresponding component of the result. For
    unsigned data types, the most significant bits of the result will be set
    to zero. For signed data types, the most significant bits will be set to
    the value of bit <offset>+<bits>-1. If <bits> is zero, the result will be
    zero. The result will be undefined if <offset> or <bits> is negative, or
    if the sum of <offset> and <bits> is greater than the number of bits used
    to store the operand.
    Note that for vector versions of bitfieldExtract(), a single pair of
    <offset> and <bits> values is shared for all components.

    The function bitfieldInsert() inserts the <bits> least significant bits of
    each component of <insert> into the corresponding component of <base>.
    The result will have bits numbered <offset> through <offset>+<bits>-1
    taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
    directly from the corresponding bits of <base>. If <bits> is zero, the
    result will simply be <base>. The result will be undefined if <offset> or
    <bits> is negative, or if the sum of <offset> and <bits> is greater than
    the number of bits used to store the operand. Note that for vector
    versions of bitfieldInsert(), a single pair of <offset> and <bits> values
    is shared for all components.

    The function bitfieldReverse() reverses the bits of <value>. The bit
    numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
    <value>, where <bits> is the total number of bits used to represent
    <value>.

    The function bitCount() returns the number of one bits in the binary
    representation of <value>.

    The function findLSB() returns the bit number of the least significant one
    bit in the binary representation of <value>. If <value> is zero, -1 will
    be returned.

    The function findMSB() returns the bit number of the most significant bit
    in the binary representation of <value>. For positive integers, the
    result will be the bit number of the most significant one bit. For
    negative integers, the result will be the bit number of the most
    significant zero bit. For a <value> of zero or negative one, -1 will be
    returned.
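
    As an informal illustration of the bitfield and bit-query functions
    described above, the following GLSL sketch shows a few typical uses.
    The packed 0xRRGGBBAA layout and the helper names (getGreen, setGreen,
    signedNibble, bitQueries) are assumptions made for the example and are
    not defined by this extension.

```glsl
#version 130
#extension GL_MESA_shader_integer_functions : require

// Hypothetical packed color layout for illustration: 0xRRGGBBAA.
uint getGreen(uint rgba)
{
    // Bits 16..23 are returned in the low 8 bits; the upper bits are zero
    // because the operand is unsigned.
    return bitfieldExtract(rgba, 16, 8);        // 0x11223344u -> 0x22u
}

uint setGreen(uint rgba, uint green)
{
    // Only the low 8 bits of <green> are inserted; every other bit of the
    // result comes from <rgba>.
    return bitfieldInsert(rgba, green, 16, 8);  // (0x11223344u, 0xFFu) -> 0x11FF3344u
}

int signedNibble(int word)
{
    // For signed operands the result is sign-extended from bit
    // <offset>+<bits>-1, here bit 15.
    return bitfieldExtract(word, 12, 4);        // 0x0000F000 -> -1
}

void bitQueries(out int ones, out int lowest, out int highest)
{
    uint v = 0x50u;             // binary 0101 0000
    ones    = bitCount(v);      // 2
    lowest  = findLSB(v);       // 4
    highest = findMSB(v);       // 6
}
```

    Keeping <offset> and <bits> as constants, as in this sketch, also makes
    it easy to confirm that their sum stays within the 32 bits of the
    operand, outside of which the results are undefined.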


    (support for unsigned integer add/subtract with carry-out)

    Syntax:

      genUType uaddCarry(genUType x, genUType y, out genUType carry);
      genUType usubBorrow(genUType x, genUType y, out genUType borrow);

    The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
    <y>, returning the sum modulo 2^32. The value <carry> is set to zero if
    the sum was less than 2^32, or one otherwise.

    The function usubBorrow() subtracts the 32-bit unsigned integer or vector
    <y> from <x>, returning the difference if non-negative, or 2^32 plus the
    difference otherwise. The value <borrow> is set to zero if x >= y, or
    one otherwise.


    (support for signed and unsigned multiplies, with 32-bit inputs and a
    64-bit result spanning two 32-bit outputs)

    Syntax:

      void umulExtended(genUType x, genUType y, out genUType msb,
                        out genUType lsb);
      void imulExtended(genIType x, genIType y, out genIType msb,
                        out genIType lsb);

    The functions umulExtended() and imulExtended() multiply 32-bit unsigned
    or signed integers or vectors <x> and <y>, producing a 64-bit result. The
    32 least significant bits are returned in <lsb>; the 32 most significant
    bits are returned in <msb>.


GLX Protocol

    None.

Dependencies on ARB_gpu_shader_fp64

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language. If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If ARB_gpu_shader_fp64 or a similar extension introducing new data types
    is not supported, the function overloading rule in the GLSL specification
    preferring promotion of an input parameter from a smaller type to a
    larger type is never applicable, as all data types are of the same size.
    That rule and the example referring to "double" should be removed.


Dependencies on NV_gpu_shader5

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language. If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If NV_gpu_shader5 is supported, integer data types are supported with four
    different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
    types are supported with three different precisions (16-, 32-, and
    64-bit). The extension adds the following rule for output parameters,
    which is similar to the one present in this extension for input
    parameters:

      5. If the formal parameters in both matches are output parameters, a
         conversion from a type with a larger number of bits per component is
         better than a conversion from a type with a smaller number of bits
         per component. For example, a conversion from an "int16_t" formal
         parameter type to "int" is better than one from an "int8_t" formal
         parameter type to "int".

    Such a rule is not provided in this extension because there is no
    combination of types in this extension and ARB_gpu_shader_fp64 where this
    rule has any effect.


Errors

    None


New State

    None

New Implementation Dependent State

    None

Issues

    (1) What should this extension be called?

      UNRESOLVED. This extension borrows from GL_ARB_gpu_shader5, so creating
      some sort of a play on that name would be viable. However, nothing in
      this extension should require SM5 hardware, so such a name would be a
      little misleading and weird.

      Since the primary purpose is to add integer related functions from
      GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
      for now.

    (2) Why is some of the formatting in this extension weird?

      RESOLVED: This extension is formatted to minimize the differences (as
      reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
      specification.

    (3) Should ldexp and frexp be included?

      RESOLVED: Yes. Few GPUs have native instructions to implement these
      functions. These are generally implemented using existing GLSL built-in
      functions and the other functions provided by this extension.

    (4) Should umulExtended and imulExtended be included?

      RESOLVED: Yes. These functions should be implementable on any GPU that
      can support the rest of this extension, but the implementation may be
      complex. The implementation on a GPU that only supports 32bit x 32bit =
      32bit multiplication would be quite expensive. However, many GPUs
      (including OpenGL 4.0 GPUs that already support this function) have a
      32bit x 16bit = 48bit multiplier. The implementation there is only
      trivially more expensive than regular 32bit multiplication.

    (5) Should the pack and unpack functions be included?

      RESOLVED: No. These functions are already available via
      GL_ARB_shading_language_packing.

    (6) Should the "BitsTo" functions be included?

      RESOLVED: No. These functions are already available via
      GL_ARB_shader_bit_encoding.

Revision History

    Rev.     Date      Author     Changes
    ----  -----------  ---------  -----------------------------------------
     3    31-Mar-2017  Jon Leech  Add ES support (OpenGL-Registry/issues/3)
     2     7-Jul-2016  idr        Fix typo in #extension line
     1    20-Jun-2016  idr        Initial version based on GL_ARB_gpu_shader5.