Name

    MESA_shader_integer_functions

Name Strings

    GL_MESA_shader_integer_functions

Contact

    Ian Romanick <ian.d.romanick@intel.com>

Contributors

    All the contributors of GL_ARB_gpu_shader5

Status

    Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later

Version

    Version 2, July 7, 2016

Number

    TBD

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile)
    Specification.

    This extension is written against Version 1.50 (Revision 09) of the OpenGL
    Shading Language Specification.

    GLSL 1.30 is required.

    This extension interacts with ARB_gpu_shader5.

    This extension interacts with ARB_gpu_shader_fp64.

    This extension interacts with NV_gpu_shader5.

Overview

    GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
    added functionality requires significant hardware support.  There are many
    aspects, however, that can be easily implemented on any GPU with "real"
    integer support (as opposed to simulating integers using floating point
    calculations).

    This extension provides a set of new features to the OpenGL Shading
    Language to support capabilities of these GPUs, extending the capabilities
    of version 1.30 of the OpenGL Shading Language.  Shaders
    using the new functionality provided by this extension should enable this
    functionality via the construct

      #extension GL_MESA_shader_integer_functions : require   (or enable)

    This extension provides a variety of new features for all shader types,
    including:

      * support for implicitly converting signed integer types to unsigned
        types, as well as more general implicit conversion and function
        overloading infrastructure to support new data types introduced by
        other extensions;

      * new built-in functions supporting:

        * splitting a floating-point number into a significand and exponent
          (frexp), or building a floating-point number from a significand and
          exponent (ldexp);

        * integer bitfield manipulation, including functions to find the
          position of the most or least significant set bit, count the number
          of one bits, and bitfield insertion, extraction, and reversal;

        * extended integer precision math, including add with carry, subtract
          with borrow, and extended multiplication.

    The resulting extension is a strict subset of GL_ARB_gpu_shader5.

IP Status

    No known IP claims.

New Procedures and Functions

    None

New Tokens

    None

Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

    None.

Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_MESA_shader_integer_functions : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_MESA_shader_integer_functions        1


    Modify Section 4.1.10, Implicit Conversions, p. 27

    (modify table of implicit conversions)

                                Can be implicitly
        Type of expression        converted to
        ---------------------   -----------------
        int                     uint, float
        ivec2                   uvec2, vec2
        ivec3                   uvec3, vec3
        ivec4                   uvec4, vec4

        uint                    float
        uvec2                   vec2
        uvec3                   vec3
        uvec4                   vec4

    (modify second paragraph of the section) No implicit conversions are
    provided to convert from unsigned to signed integer types or from
    floating-point to integer types.  There are no implicit array or structure
    conversions.

    (insert before the final paragraph of the section) When performing
    implicit conversion for binary operators, there may be multiple data types
    to which the two operands can be converted.  For example, when adding an
    int value to a uint value, both values can be implicitly converted to uint
    and float.  In such cases, a floating-point type is chosen if either
    operand has a floating-point type.  Otherwise, an unsigned integer type is
    chosen if either operand has an unsigned integer type.  Otherwise, a
    signed integer type is chosen.
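
    As an illustration only, the selection rule in the preceding paragraph
    can be modeled in C.  The enum and function names below are hypothetical
    and are not part of this extension; only the base type is modeled, not
    the vector size.

```c
/* Hypothetical model of the three base types involved. */
typedef enum { BASE_INT, BASE_UINT, BASE_FLOAT } base_type;

/* Choose the common base type for the operands of a binary operator:
 * float wins over uint, and uint wins over int. */
static base_type common_base_type(base_type a, base_type b)
{
    if (a == BASE_FLOAT || b == BASE_FLOAT)
        return BASE_FLOAT;
    if (a == BASE_UINT || b == BASE_UINT)
        return BASE_UINT;
    return BASE_INT;
}
```

    Under this model, adding an int to a uint selects uint, matching the
    example in the text.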


    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for implicit conversion
    between signed and unsigned types)

    Expressions in the shading language are built from the following:

    * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
      types, and all matrix types.

    ...

    * The operator modulus (%) operates on signed or unsigned integer scalars
      or vectors.  If the fundamental types of the operands do not match, the
      conversions from Section 4.1.10 "Implicit Conversions" are applied to
      produce matching types.  ...


    Modify Section 6.1, Function Definitions, p. 63

    (modify description of overloading, beginning at the top of p. 64)

     Function names can be overloaded.  The same function name can be used for
     multiple functions, as long as the parameter types differ.  If a function
     name is declared twice with the same parameter types, then the return
     types and all qualifiers must also match, and it is the same function
     being declared.  For example,

       vec4 f(in vec4 x, out vec4  y);   // (A)
       vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
       vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type

       int  f(in vec4 x, out ivec4 y);  // error, only return type differs
       vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
       vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs

     When function calls are resolved, an exact type match for all the
     arguments is sought.  If an exact match is found, all other functions are
     ignored, and the exact match is used.  If no exact match is found, then
     the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
     applied to find a match.  Mismatched types on input parameters (in or
     inout or default) must have a conversion from the calling argument type
     to the formal parameter type.  Mismatched types on output parameters (out
     or inout) must have a conversion from the formal parameter type to the
     calling argument type.

     If implicit conversions can be used to find more than one matching
     function, a single best-matching function is sought.  To determine a best
     match, the conversions between calling argument and formal parameter
     types are compared for each function argument and pair of matching
     functions.  After these comparisons are performed, each pair of matching
     functions is compared.  A function definition A is considered a better
     match than function definition B if:

       * for at least one function argument, the conversion for that argument
         in A is better than the corresponding conversion in B; and

       * there is no function argument for which the conversion in B is better
         than the corresponding conversion in A.

     If a single function definition is considered a better match than every
     other matching function definition, it will be used.  Otherwise, a
     semantic error occurs and the shader will fail to compile.

     To determine whether the conversion for a single argument in one match is
     better than that for another match, the following rules are applied, in
     order:

       1. An exact match is better than a match involving any implicit
          conversion.

       2. A match involving an implicit conversion from float to double is
          better than a match involving any other implicit conversion.

       3. A match involving an implicit conversion from either int or uint to
          float is better than a match involving an implicit conversion from
          either int or uint to double.

     If none of the rules above apply to a particular pair of conversions,
     neither conversion is considered better than the other.

     For the function prototypes (A), (B), and (C) above, the following
     examples show how the rules apply to different sets of calling argument
     types:

       f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
       f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
       f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
                             //   (C) not relevant, can't convert vec4 to
                             //   ivec4.  (A) better than (B) for 2nd
                             //   argument (rule 2), same on first argument.
       f(ivec4, vec4);       // NOT matched.  All three match by implicit
                             //   conversion.  (C) is better than (A) and (B)
                             //   on the first argument.  (A) is better than
                             //   (B) and (C).
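
     The pairwise comparison described above can be sketched in C.  This is
     an illustrative model under stated assumptions, not the algorithm used
     by any particular compiler: the conv_kind classification and both
     function names are hypothetical, and each candidate is represented only
     by the kind of conversion each of its arguments needs.

```c
#include <stdbool.h>

/* Hypothetical classification of the conversion needed by one argument. */
typedef enum {
    CONV_EXACT,            /* no conversion needed */
    CONV_FLOAT_TO_DOUBLE,
    CONV_INT_TO_FLOAT,     /* int or uint to float */
    CONV_INT_TO_DOUBLE,    /* int or uint to double */
    CONV_OTHER             /* any other implicit conversion, e.g. int to uint */
} conv_kind;

/* Rules 1 to 3, applied in order: true if conversion a is better than b. */
static bool conv_better(conv_kind a, conv_kind b)
{
    if (a == b)
        return false;
    if (a == CONV_EXACT)                 /* rule 1 */
        return true;
    if (b == CONV_EXACT)
        return false;
    if (a == CONV_FLOAT_TO_DOUBLE)       /* rule 2 */
        return true;
    if (b == CONV_FLOAT_TO_DOUBLE)
        return false;
    /* rule 3; for every remaining pair neither conversion is better */
    return a == CONV_INT_TO_FLOAT && b == CONV_INT_TO_DOUBLE;
}

/* Candidate a is a better match than candidate b if it is better for at
 * least one argument and b is better for none. */
static bool better_match(const conv_kind *a, const conv_kind *b, int nargs)
{
    bool a_better_somewhere = false;
    for (int i = 0; i < nargs; i++) {
        if (conv_better(b[i], a[i]))
            return false;
        if (conv_better(a[i], b[i]))
            a_better_somewhere = true;
    }
    return a_better_somewhere;
}
```

     A call is ambiguous under this model when no single candidate satisfies
     better_match() against every other candidate.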


    Modify Section 8.3, Common Functions, p. 84

    (add support for single-precision frexp and ldexp functions)

    Syntax:

      genType frexp(genType x, out genIType exp);
      genType ldexp(genType x, in genIType exp);

    The function frexp() splits each single-precision floating-point number in
    <x> into a binary significand, a floating-point number in the range [0.5,
    1.0), and an integral exponent of two, such that:

      x = significand * 2 ^ exponent

    The significand is returned by the function; the exponent is returned in
    the parameter <exp>.  For a floating-point value of zero, the significand
    and exponent are both zero.  For a floating-point value that is an
    infinity or is not a number, the results of frexp() are undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value returned by the function and the value
    written to <exp> are vectors with the same number of components as <x>.

    The function ldexp() builds a single-precision floating-point number from
    each significand component in <x> and the corresponding integral exponent
    of two in <exp>, returning:

      significand * 2 ^ exponent

    If this product is too large to be represented as a single-precision
    floating-point value, the result is considered undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value passed in <exp> and returned by the
    function are vectors with the same number of components as <x>.
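
    A minimal scalar sketch of these semantics in C follows, for
    illustration only.  The names frexp_model and ldexp_model are
    hypothetical.  The model scales by powers of two in a loop rather than
    manipulating the bit pattern, returns infinities and NaNs unchanged (the
    specification leaves those cases undefined), and does not model rounding
    of results that underflow.

```c
/* Split x into a significand in [0.5, 1.0) and a power-of-two exponent. */
static float frexp_model(float x, int *exponent)
{
    /* Zero yields (0, 0).  x - x is nonzero only for infinities and NaNs,
     * which are passed through because the spec leaves them undefined. */
    if (x == 0.0f || x - x != 0.0f) {
        *exponent = 0;
        return x;
    }
    float m = (x < 0.0f) ? -x : x;
    int e = 0;
    while (m >= 1.0f) { m *= 0.5f; e++; }   /* scaling by 2 is exact here */
    while (m < 0.5f)  { m *= 2.0f; e--; }
    *exponent = e;
    return (x < 0.0f) ? -m : m;
}

/* Build significand * 2^exponent. */
static float ldexp_model(float x, int exponent)
{
    /* Clamp so the loops stay short: beyond +/-300 every finite nonzero
     * float has already overflowed or underflowed. */
    if (exponent > 300)  exponent = 300;
    if (exponent < -300) exponent = -300;
    for (; exponent > 0; exponent--) x *= 2.0f;
    for (; exponent < 0; exponent++) x *= 0.5f;
    return x;
}
```

    For example, frexp_model(8.0f, &e) returns 0.5 with e set to 4, and
    ldexp_model(0.5f, 4) rebuilds 8.0.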


    (add support for new integer built-in functions)

    Syntax:

      genIType bitfieldExtract(genIType value, int offset, int bits);
      genUType bitfieldExtract(genUType value, int offset, int bits);

      genIType bitfieldInsert(genIType base, genIType insert, int offset,
                              int bits);
      genUType bitfieldInsert(genUType base, genUType insert, int offset,
                              int bits);

      genIType bitfieldReverse(genIType value);
      genUType bitfieldReverse(genUType value);

      genIType bitCount(genIType value);
      genIType bitCount(genUType value);

      genIType findLSB(genIType value);
      genIType findLSB(genUType value);

      genIType findMSB(genIType value);
      genIType findMSB(genUType value);

    The function bitfieldExtract() extracts bits <offset> through
    <offset>+<bits>-1 from each component in <value>, returning them in the
    least significant bits of the corresponding component of the result.  For
    unsigned data types, the most significant bits of the result will be set
    to zero.  For signed data types, the most significant bits will be set to
    the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
    zero.  The result will be undefined if <offset> or <bits> is negative, or
    if the sum of <offset> and <bits> is greater than the number of bits used
    to store the operand.  Note that for vector versions of bitfieldExtract(),
    a single pair of <offset> and <bits> values is shared for all components.
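
    The following C sketch models the scalar 32-bit behavior of
    bitfieldExtract() for illustration.  The function names are
    hypothetical, and the inputs are assumed to be within the defined range
    (non-negative <offset> and <bits>, with a sum of at most 32).

```c
#include <stdint.h>

/* Unsigned variant: the upper bits of the result are zero. */
static uint32_t bitfield_extract_u(uint32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0;
    /* bits == 32 is handled separately because shifting a 32-bit value
     * by 32 is undefined in C. */
    uint32_t mask = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    return (value >> offset) & mask;
}

/* Signed variant: the upper bits replicate bit offset+bits-1 of value. */
static int32_t bitfield_extract_i(int32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0;
    uint32_t field = bitfield_extract_u((uint32_t)value, offset, bits);
    uint32_t sign = 1u << (bits - 1);
    /* Sign-extension identity: (field ^ sign) - sign, modulo 2^32. */
    return (int32_t)((field ^ sign) - sign);
}
```

    Extracting the 4-bit field at offset 8 from 0x00000F00 gives 15 with
    the unsigned variant and -1 with the signed variant.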

    The function bitfieldInsert() inserts the <bits> least significant bits of
    each component of <insert> into the corresponding component of <base>.
    The result will have bits numbered <offset> through <offset>+<bits>-1
    taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
    directly from the corresponding bits of <base>.  If <bits> is zero, the
    result will simply be <base>.  The result will be undefined if <offset> or
    <bits> is negative, or if the sum of <offset> and <bits> is greater than
    the number of bits used to store the operand.  Note that for vector
    versions of bitfieldInsert(), a single pair of <offset> and <bits> values
    is shared for all components.
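
    A scalar 32-bit model of bitfieldInsert() in C, for illustration.  The
    function name is hypothetical and the inputs are assumed to be within
    the defined range.  The signed variant has the same bit-level result.

```c
#include <stdint.h>

static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                int offset, int bits)
{
    if (bits == 0)
        return base;
    /* Mask of <bits> ones, moved to the destination position. */
    uint32_t field = (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
    uint32_t mask = field << offset;
    /* Keep base outside the mask, take the shifted insert inside it. */
    return (base & ~mask) | ((insert << offset) & mask);
}
```

    For instance, inserting 8 zero bits at offset 8 into 0xFFFFFFFF yields
    0xFFFF00FF.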

    The function bitfieldReverse() reverses the bits of <value>.  The bit
    numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
    <value>, where <bits> is the total number of bits used to represent
    <value>.
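
    The definition above translates directly into a loop.  The C function
    below is a hypothetical scalar model for 32-bit values, written for
    clarity rather than speed.

```c
#include <stdint.h>

static uint32_t bitfield_reverse(uint32_t value)
{
    uint32_t result = 0;
    for (int n = 0; n < 32; n++) {
        /* Bit n of the result is taken from bit 31-n of value. */
        result |= ((value >> (31 - n)) & 1u) << n;
    }
    return result;
}
```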

    The function bitCount() returns the number of one bits in the binary
    representation of <value>.

    The function findLSB() returns the bit number of the least significant one
    bit in the binary representation of <value>.  If <value> is zero, -1 will
    be returned.
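
    The two functions above can be modeled for scalar 32-bit values as
    follows.  This is an illustrative C sketch with hypothetical names;
    signed inputs are assumed to be reinterpreted as their bit pattern.

```c
#include <stdint.h>

/* Number of one bits in value. */
static int bit_count(uint32_t value)
{
    int count = 0;
    while (value != 0) {
        value &= value - 1u;   /* clears the lowest set bit */
        count++;
    }
    return count;
}

/* Bit number of the least significant one bit, or -1 for zero. */
static int find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;
    int n = 0;
    while ((value & 1u) == 0) {
        value >>= 1;
        n++;
    }
    return n;
}
```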

    The function findMSB() returns the bit number of the most significant bit
    in the binary representation of <value>.  For positive integers, the
    result will be the bit number of the most significant one bit.  For
    negative integers, the result will be the bit number of the most
    significant zero bit.  For a <value> of zero or negative one, -1 will be
    returned.
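
    A scalar C model of findMSB(), for illustration only.  The names are
    hypothetical.  The signed variant inverts negative inputs so that the
    most significant zero bit can be found with the unsigned search.

```c
#include <stdint.h>

/* Bit number of the most significant one bit, or -1 for zero. */
static int find_msb_u(uint32_t value)
{
    int n = -1;
    while (value != 0) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Signed variant: for negative values, look for the most significant
 * zero bit by searching the inverted bit pattern. */
static int find_msb_i(int32_t value)
{
    uint32_t bits = (uint32_t)value;
    if (value < 0)
        bits = ~bits;
    return find_msb_u(bits);
}
```

    Both zero and negative one produce an all-zero search pattern, which is
    why both return -1.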


    (support for unsigned integer add/subtract with carry-out)

    Syntax:

      genUType uaddCarry(genUType x, genUType y, out genUType carry);
      genUType usubBorrow(genUType x, genUType y, out genUType borrow);

    The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
    <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
    the sum was less than 2^32, or one otherwise.

    The function usubBorrow() subtracts the 32-bit unsigned integer or vector
    <y> from <x>, returning the difference if non-negative or 2^32 plus the
    difference otherwise.  The value <borrow> is set to zero if x >= y, or
    one otherwise.
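
    The sketch below models the scalar behavior in C and shows one typical
    use, a 64-bit addition built from 32-bit words.  All names are
    hypothetical, and add64 is an example of how the carry might be
    consumed rather than something defined by this extension.

```c
#include <stdint.h>

static uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;          /* wraps modulo 2^32 */
    *carry = (sum < x) ? 1u : 0u;  /* the sum wrapped exactly when sum < x */
    return sum;
}

static uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x >= y) ? 0u : 1u;
    return x - y;                  /* wraps to 2^32 + (x - y) when x < y */
}

/* 64-bit addition on (hi, lo) pairs of 32-bit words. */
static void add64(uint32_t a_hi, uint32_t a_lo, uint32_t b_hi, uint32_t b_lo,
                  uint32_t *r_hi, uint32_t *r_lo)
{
    uint32_t carry;
    *r_lo = uadd_carry(a_lo, b_lo, &carry);
    *r_hi = a_hi + b_hi + carry;
}
```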


    (support for signed and unsigned multiplies, with 32-bit inputs and a
     64-bit result spanning two 32-bit outputs)

    Syntax:

      void umulExtended(genUType x, genUType y, out genUType msb,
                        out genUType lsb);
      void imulExtended(genIType x, genIType y, out genIType msb,
                        out genIType lsb);

    The functions umulExtended() and imulExtended() multiply 32-bit unsigned
    or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
    32 least significant bits are returned in <lsb>; the 32 most significant
    bits are returned in <msb>.
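
    Where a 64-bit integer type is available, the scalar behavior can be
    modeled directly, as in this illustrative C sketch.  The function names
    are hypothetical, and the signed variant assumes a two's complement
    representation.

```c
#include <stdint.h>

static void umul_extended(uint32_t x, uint32_t y,
                          uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

static void imul_extended(int32_t x, int32_t y, int32_t *msb, int32_t *lsb)
{
    int64_t product = (int64_t)x * (int64_t)y;
    uint64_t bits = (uint64_t)product;   /* two's complement bit pattern */
    *msb = (int32_t)(uint32_t)(bits >> 32);
    *lsb = (int32_t)(uint32_t)bits;
}
```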


GLX Protocol

    None.

Dependencies on ARB_gpu_shader_fp64

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language.  If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If ARB_gpu_shader_fp64 or a similar extension introducing new data types
    is not supported, the function overloading rule in the GLSL specification
    preferring promotion of an input parameter from a smaller type to a larger
    type is never applicable, as all data types are of the same size.  That
    rule and the example referring to "double" should be removed.


Dependencies on NV_gpu_shader5

    This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
    of implicit conversions supported in the OpenGL Shading Language.  If more
    than one of these extensions is supported, an expression of one type may
    be converted to another type if that conversion is allowed by any of these
    specifications.

    If NV_gpu_shader5 is supported, integer data types are supported with four
    different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
    types are supported with three different precisions (16-, 32-, and
    64-bit).  The extension adds the following rule for output parameters,
    which is similar to the one present in this extension for input
    parameters:

       5. If the formal parameters in both matches are output parameters, a
          conversion from a type with a larger number of bits per component is
          better than a conversion from a type with a smaller number of bits
          per component.  For example, a conversion from an "int16_t" formal
          parameter type to "int"  is better than one from an "int8_t" formal
          parameter type to "int".

    Such a rule is not provided in this extension because there is no
    combination of types in this extension and ARB_gpu_shader_fp64 where this
    rule has any effect.


Errors

    None


New State

    None

New Implementation Dependent State

    None

Issues

    (1) What should this extension be called?

      UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
      some sort of a play on that name would be viable.  However, nothing in
      this extension should require SM5 hardware, so such a name would be a
      little misleading and weird.

      Since the primary purpose is to add integer related functions from
      GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
      for now.

    (2) Why is some of the formatting in this extension weird?

      RESOLVED: This extension is formatted to minimize the differences (as
      reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
      specification.

    (3) Should ldexp and frexp be included?

      RESOLVED: Yes.  Few GPUs have native instructions to implement these
      functions.  These are generally implemented using existing GLSL built-in
      functions and the other functions provided by this extension.

    (4) Should umulExtended and imulExtended be included?

      RESOLVED: Yes.  These functions should be implementable on any GPU that
      can support the rest of this extension, but the implementation may be
      complex.  The implementation on a GPU that only supports 32bit x 32bit =
      32bit multiplication would be quite expensive.  However, many GPUs
      (including OpenGL 4.0 GPUs that already support this function) have a
      32bit x 16bit = 48bit multiplier.  The implementation there is only
      trivially more expensive than regular 32bit multiplication.
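
      To give a sense of the cost on hardware limited to 32bit x 32bit =
      32bit multiplication, the C sketch below shows one possible
      decomposition into 16-bit halves.  It is an assumption made for
      illustration, not a description of how any driver implements
      umulExtended(), and the function name is hypothetical.

```c
#include <stdint.h>

/* 32 x 32 -> 64 multiply in which every intermediate fits in 32 bits. */
static void umul_extended_32(uint32_t x, uint32_t y,
                             uint32_t *msb, uint32_t *lsb)
{
    uint32_t x_lo = x & 0xFFFFu, x_hi = x >> 16;
    uint32_t y_lo = y & 0xFFFFu, y_hi = y >> 16;

    /* Four 16 x 16 -> 32 partial products. */
    uint32_t ll = x_lo * y_lo;
    uint32_t lh = x_lo * y_hi;
    uint32_t hl = x_hi * y_lo;
    uint32_t hh = x_hi * y_hi;

    /* Terms weighted by 2^16; the sum is below 2^18, so it cannot wrap. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (ll & 0xFFFFu) | (mid << 16);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

      This version needs four multiplies plus several shifts, masks, and
      adds for each scalar product.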

    (5) Should the pack and unpack functions be included?

      RESOLVED: No.  These functions are already available via
      GL_ARB_shading_language_packing.

    (6) Should the "BitsTo" functions be included?

      RESOLVED: No.  These functions are already available via
      GL_ARB_shader_bit_encoding.

Revision History

    Rev.      Date     Author    Changes
    ----  -----------  --------  -----------------------------------------
     2     7-Jul-2016  idr       Fix typo in #extension line
     1    20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.
