1Name
2
3    ARB_gpu_shader_fp64
4
5Name Strings
6
7    GL_ARB_gpu_shader_fp64
8
9Contact
10
11    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)
12
13Contributors
14
15    Barthold Lichtenbelt, NVIDIA
16    Bill Licea-Kane, AMD
17    Bruce Merry, ARM
18    Chris Dodd, NVIDIA
19    Eric Werness, NVIDIA
20    Graham Sellers, AMD
21    Greg Roth, NVIDIA
22    Jeff Bolz, NVIDIA
23    Nick Haemel, AMD
24    Pierre Boudier, AMD
25    Piers Daniell, NVIDIA
26
27Notice
28
29    Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
30        http://www.khronos.org/registry/speccopyright.html
31
32Specification Update Policy
33
34    Khronos-approved extension specifications are updated in response to
35    issues and bugs prioritized by the Khronos OpenGL Working Group. For
36    extensions which have been promoted to a core Specification, fixes will
37    first appear in the latest version of that core Specification, and will
38    eventually be backported to the extension document. This policy is
39    described in more detail at
40        https://www.khronos.org/registry/OpenGL/docs/update_policy.php
41
42Status
43
44    Complete. Approved by the ARB at the 2010/01/22 F2F meeting.
45    Approved by the Khronos Board of Promoters on March 10, 2010.
46
47Version
48
49    Last Modified Date:         August 27, 2012
50    NVIDIA Revision:            11
51
52Number
53
54    ARB Extension #89
55
56Dependencies
57
58    This extension is written against the OpenGL 3.2 (Compatibility Profile)
59    Specification.
60
61    This extension is written against version 1.50 (revision 09) of the OpenGL
62    Shading Language Specification.
63
64    OpenGL 3.2 and GLSL 1.50 are required.
65
66    This extension interacts with EXT_direct_state_access.
67
68    This extension interacts with NV_shader_buffer_load.
69
70Overview
71
72    This extension allows GLSL shaders to use double-precision floating-point
73    data types, including vectors and matrices of doubles.  Doubles may be
74    used as inputs, outputs, and uniforms.
75
76    The shading language supports various arithmetic and comparison operators
77    on double-precision scalar, vector, and matrix types, and provides a set
78    of built-in functions including:
79
80      * square roots and inverse square roots;
81
82      * fused floating-point multiply-add operations;
83
84      * splitting a floating-point number into a significand and exponent
85        (frexp), or building a floating-point number from a significand and
86        exponent (ldexp);
87
88      * absolute value, sign tests, various functions to round to an integer
89        value, modulus, minimum, maximum, clamping, blending two values, step
90        functions, and testing for infinity and NaN values;
91
92      * packing and unpacking doubles into a pair of 32-bit unsigned integers;
93
94      * matrix component-wise multiplication, and computation of outer
95        products, transposes, determinants, and inverses; and
96
97      * vector relational functions.
98
99    Double-precision versions of angle, trigonometry, and exponential
100    functions are not supported.
101
102    Implicit conversions are supported from integer and single-precision
103    floating-point values to doubles, and this extension uses the relaxed
104    function overloading rules specified by the ARB_gpu_shader5 extension to
105    resolve ambiguities.
106
107    This extension provides API functions for specifying double-precision
108    uniforms in the default uniform block, including functions similar to the
109    uniform functions added by EXT_direct_state_access (if supported).
110
111    This extension provides an "LF" suffix for specifying double-precision
112    constants.  Floating-point constants without a suffix in GLSL are treated
113    as single-precision values for backward compatibility with versions not
114    supporting doubles; similar constants are treated as double-precision
115    values in the "C" programming language.
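
    The cost of omitting the suffix can be illustrated in C, where the
    defaults are reversed: an unsuffixed constant is a double and the "f"
    suffix requests single precision.  The sketch below is illustrative only
    and not part of this extension; the helper name literal_rounding_error
    is hypothetical.  It measures the error introduced when 0.1 is first
    rounded to single precision, which is what an unsuffixed 0.1 in GLSL
    would undergo before being converted to a double.

```c
#include <assert.h>

/* Hypothetical helper: error from routing the constant 0.1 through
   single precision before widening it to double. */
double literal_rounding_error(void)
{
    assert(sizeof(double) > sizeof(float));

    /* volatile keeps the compiler from folding the comparison away */
    volatile float  as_float  = 0.1f; /* C "f" suffix, like GLSL 0.1   */
    volatile double as_double = 0.1;  /* C unsuffixed, like GLSL 0.1LF */

    double diff = (double)as_float - as_double;
    return diff < 0.0 ? -diff : diff;
}
```

    On IEEE 754 hardware the returned error is on the order of 1.5e-9,
    far larger than the spacing of doubles near 0.1.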
116
117    This extension does not support interpolation of double-precision values;
118    doubles used as fragment shader inputs must be qualified as "flat".
119    Additionally, this extension does not allow vertex attributes with 64-bit
120    components.  That support is added separately by EXT_vertex_attrib_64bit.
121
122IP Status
123
124    No known IP claims.
125
126New Procedures and Functions
127
128    void Uniform1d(int location, double x);
129    void Uniform2d(int location, double x, double y);
130    void Uniform3d(int location, double x, double y, double z);
131    void Uniform4d(int location, double x, double y, double z, double w);
132    void Uniform1dv(int location, sizei count, const double *value);
133    void Uniform2dv(int location, sizei count, const double *value);
134    void Uniform3dv(int location, sizei count, const double *value);
135    void Uniform4dv(int location, sizei count, const double *value);
136
137    void UniformMatrix2dv(int location, sizei count, boolean transpose,
138                          const double *value);
139    void UniformMatrix3dv(int location, sizei count, boolean transpose,
140                          const double *value);
141    void UniformMatrix4dv(int location, sizei count, boolean transpose,
142                          const double *value);
143    void UniformMatrix2x3dv(int location, sizei count, boolean transpose,
144                            const double *value);
145    void UniformMatrix2x4dv(int location, sizei count, boolean transpose,
146                            const double *value);
147    void UniformMatrix3x2dv(int location, sizei count, boolean transpose,
148                            const double *value);
149    void UniformMatrix3x4dv(int location, sizei count, boolean transpose,
150                            const double *value);
151    void UniformMatrix4x2dv(int location, sizei count, boolean transpose,
152                            const double *value);
153    void UniformMatrix4x3dv(int location, sizei count, boolean transpose,
154                            const double *value);
155
156    void GetUniformdv(uint program, int location, double *params);
157
158    (All of the following ProgramUniform* functions are supported if and only
159     if EXT_direct_state_access is supported.)
160
161    void ProgramUniform1dEXT(uint program, int location, double x);
162    void ProgramUniform2dEXT(uint program, int location, double x, double y);
163    void ProgramUniform3dEXT(uint program, int location, double x, double y,
164                             double z);
165    void ProgramUniform4dEXT(uint program, int location, double x, double y,
166                             double z, double w);
167    void ProgramUniform1dvEXT(uint program, int location, sizei count,
168                              const double *value);
169    void ProgramUniform2dvEXT(uint program, int location, sizei count,
170                              const double *value);
171    void ProgramUniform3dvEXT(uint program, int location, sizei count,
172                              const double *value);
173    void ProgramUniform4dvEXT(uint program, int location, sizei count,
174                              const double *value);
175
176    void ProgramUniformMatrix2dvEXT(uint program, int location, sizei count,
177                                    boolean transpose, const double *value);
178    void ProgramUniformMatrix3dvEXT(uint program, int location, sizei count,
179                                    boolean transpose, const double *value);
180    void ProgramUniformMatrix4dvEXT(uint program, int location, sizei count,
181                                    boolean transpose, const double *value);
182    void ProgramUniformMatrix2x3dvEXT(uint program, int location, sizei count,
183                                      boolean transpose, const double *value);
184    void ProgramUniformMatrix2x4dvEXT(uint program, int location, sizei count,
185                                      boolean transpose, const double *value);
186    void ProgramUniformMatrix3x2dvEXT(uint program, int location, sizei count,
187                                      boolean transpose, const double *value);
188    void ProgramUniformMatrix3x4dvEXT(uint program, int location, sizei count,
189                                      boolean transpose, const double *value);
190    void ProgramUniformMatrix4x2dvEXT(uint program, int location, sizei count,
191                                      boolean transpose, const double *value);
192    void ProgramUniformMatrix4x3dvEXT(uint program, int location, sizei count,
193                                      boolean transpose, const double *value);
194
195New Tokens
196
197    Returned in the <type> parameter of GetActiveUniform, and
198    GetTransformFeedbackVarying:
199
200        DOUBLE
201        DOUBLE_VEC2                                     0x8FFC
202        DOUBLE_VEC3                                     0x8FFD
203        DOUBLE_VEC4                                     0x8FFE
204        DOUBLE_MAT2                                     0x8F46
205        DOUBLE_MAT3                                     0x8F47
206        DOUBLE_MAT4                                     0x8F48
207        DOUBLE_MAT2x3                                   0x8F49
208        DOUBLE_MAT2x4                                   0x8F4A
209        DOUBLE_MAT3x2                                   0x8F4B
210        DOUBLE_MAT3x4                                   0x8F4C
211        DOUBLE_MAT4x2                                   0x8F4D
212        DOUBLE_MAT4x3                                   0x8F4E
213
214
215Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
216(OpenGL Operation)
217
218    Modify Section 2.14.4, Uniform Variables, p. 89
219
220    (modify third paragraph, p. 90) ... uniform variable storage for a vertex
221    shader.  A uniform matrix with single- or double-precision components will
222    consume no more than 4 * min(r,c) or 8 * min(r,c) uniform components,
223    respectively.  A scalar or vector uniform with double-precision components
224    will consume no more than 2<n> components, where <n> is 1 for scalars, and
225    the component count for vectors.  A link error is generated ...
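
    The component bounds in the paragraph above can be written out as a
    small C sketch.  The function names are hypothetical and the code only
    restates the arithmetic; an implementation is free to consume fewer
    components than these upper bounds.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound on uniform components consumed by a matrix with the given
   number of columns and rows: 4 * min(r,c) for single precision,
   8 * min(r,c) for double precision. */
int matrix_uniform_bound(int cols, int rows, bool is_double)
{
    assert(cols >= 2 && cols <= 4 && rows >= 2 && rows <= 4);
    int smaller = cols < rows ? cols : rows;
    return (is_double ? 8 : 4) * smaller;
}

/* Upper bound for a double scalar (n = 1) or a dvecN (n = 2..4): 2n. */
int double_vector_uniform_bound(int n)
{
    assert(n >= 1 && n <= 4);
    return 2 * n;
}
```

    For example, a dmat2x4 is bounded by 16 components and a dvec3 by 6.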
226
227    (add to Table 2.13, p. 96)
228
229      Type Name Token           Keyword
230      --------------------      ----------------
231      DOUBLE                    double
232      DOUBLE_VEC2               dvec2
233      DOUBLE_VEC3               dvec3
234      DOUBLE_VEC4               dvec4
235      DOUBLE_MAT2               dmat2
236      DOUBLE_MAT3               dmat3
237      DOUBLE_MAT4               dmat4
238      DOUBLE_MAT2x3             dmat2x3
239      DOUBLE_MAT2x4             dmat2x4
240      DOUBLE_MAT3x2             dmat3x2
241      DOUBLE_MAT3x4             dmat3x4
242      DOUBLE_MAT4x2             dmat4x2
243      DOUBLE_MAT4x3             dmat4x3
244
245    (modify list of commands at the bottom of p. 99)
246
247      void Uniform{1,2,3,4}d(int location, T value);
248      void Uniform{1,2,3,4}dv(int location, sizei count, T value);
249      void UniformMatrix{2,3,4}dv
250           (int location, sizei count, boolean transpose,
251            const double *value);
252      void UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv
253           (int location, sizei count, boolean transpose,
254            const double *value);
255
256    (insert after fourth paragraph, p. 100) The Uniform*d{v} commands will
257    load <count> sets of one to four double-precision floating-point values
258    into a uniform location defined as a double, a double vector, or an array
259    of double scalars or vectors.
260
261    (modify fifth paragraph, p. 100) The UniformMatrix{2,3,4}fv and
262    UniformMatrix{2,3,4}dv commands will load <count> 2x2, 3x3, or 4x4
263    matrices (corresponding to 2, 3, or 4 in the command name) of single- or
264    double-precision floating-point values, respectively, into ...
265
266    (replace second bullet on the middle of p. 101, regarding
267     INVALID_OPERATION errors in Uniform* commands)
268
269     * if the type of the uniform declared in the shader does not match the
270       component type and count indicated in the Uniform* command name (where
271       a boolean uniform component type is considered to match any of the
272       Uniform*i{v}, Uniform*ui{v}, or Uniform*f{v} commands),
273
274    (modify sixth paragraph, p. 100) The UniformMatrix{2x3,3x2,2x4,
275    4x2,3x4,4x3}fv and UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv commands will
276    load <count> 2x3, 3x2, 2x4, 4x2, 3x4, or 4x3 matrices (corresponding to
277    the numbers in the command name) of single- or double-precision
278    floating-point values, respectively, into ...
279
280    (modify "Uniform Buffer Object Storage", p. 102, adding a bullet after the
281     last "Members of type", and modifying the subsequent bullet)
282
283     * Members of type double are extracted from a buffer object by reading a
284       single double-typed value at the specified offset.
285
286     * Vectors with N elements with basic data types of bool, int, uint,
287       float, or double are extracted as N values in consecutive memory
288       locations beginning at the specified offset, with components stored in
289       order with the first (X) component at the lowest offset. The GL data
290       type used for component extraction is derived according to the rules
291       for scalar members above.
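
    The extraction rule for double-precision vectors can be sketched in C
    as follows.  The helper read_dvec is hypothetical and simply models a
    buffer object as an array of bytes in client memory; it is not an API
    added by this extension.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: read an n-component double vector whose first (X)
   component starts at `offset` bytes into `buffer`. */
void read_dvec(const unsigned char *buffer, size_t offset, int n, double *out)
{
    assert(n >= 1 && n <= 4);
    for (int i = 0; i < n; ++i) {
        /* Component i sits i * sizeof(double) bytes past the base offset.
           memcpy avoids unaligned or aliasing-violating loads. */
        memcpy(&out[i], buffer + offset + (size_t)i * sizeof(double),
               sizeof(double));
    }
}
```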
292
293
294    Modify Section 2.14.6, Varying Variables, p. 106
295
296    (modify third paragraph, p. 107) ... For the purposes of counting input
297    and output components consumed by a shader, variables declared as vectors,
298    matrices, and arrays will all consume multiple components.  Each component
299    of variables declared as double-precision floating-point scalars, vectors,
300    or matrices may be counted as consuming two components.
301
302    (add after the bulleted list, p. 108) For the purposes of counting the
303    total number of components to capture, each component of outputs declared
304    as double-precision floating-point scalars, vectors, or matrices may be
305    counted as consuming two components.
306
307
308    Modify Section 2.19, Transform Feedback, p. 130
309
310    (add to end of first paragraph, p. 132) ...  The results of appending a
311    varying variable to a transform feedback buffer are undefined if any
312    component of that variable would be written at an offset not aligned to
313    the size of the component.
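
    An application can check for this condition before linking.  The C
    sketch below assumes that captured varyings are written back to back
    into a single buffer starting at offset zero with no padding; the
    Varying structure and function name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t component_size;  /* 4 for float/int/uint, 8 for double */
    size_t component_count; /* e.g. 3 for vec3 or dvec3 */
} Varying;

/* Returns the index of the first varying that would have a component
   written at an offset not aligned to the component size, or -1 if all
   components are aligned. */
int first_misaligned_varying(const Varying *v, size_t n)
{
    size_t offset = 0; /* assumed: tightly packed, starting at offset 0 */
    for (size_t i = 0; i < n; ++i) {
        assert(v[i].component_size == 4 || v[i].component_size == 8);
        for (size_t c = 0; c < v[i].component_count; ++c) {
            if (offset % v[i].component_size != 0)
                return (int)i;
            offset += v[i].component_size;
        }
    }
    return -1;
}
```

    Under these assumptions, capturing a float followed by a dvec2 places
    the first double at byte offset 4, which is misaligned; reordering the
    two varyings avoids the problem.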
314
315
316Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
317(Rasterization)
318
319    None.
320
321Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
322(Per-Fragment Operations and the Frame Buffer)
323
324    None.
325
326Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
327(Special Functions)
328
329    None.
330
331Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
332(State and State Requests)
333
334    Modify Section 6.1.15, Shader and Program Queries, p. 332
335
336    (add to the first list of commands, p. 337)
337
338      void GetUniformdv(uint program, int location, double *params);
339
340
341Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
342Specification (Invariance)
343
344    None.
345
346Additions to the AGL/GLX/WGL Specifications
347
348    None.
349
350Modifications to The OpenGL Shading Language Specification, Version 1.50
351(Revision 09)
352
353    Including the following line in a shader can be used to control the
354    language features described in this extension:
355
356      #extension GL_ARB_gpu_shader_fp64 : <behavior>
357
358    where <behavior> is as specified in section 3.3.
359
360    New preprocessor #defines are added to the OpenGL Shading Language:
361
362      #define GL_ARB_gpu_shader_fp64    1
363
364
365    Modify Section 3.6, Keywords, p. 14
366
367    (add the following to the list of keywords, p. 14)
368
369    double              dvec2           dvec3           dvec4
370
371    dmat2               dmat3           dmat4
372    dmat2x2             dmat2x3         dmat2x4
373    dmat3x2             dmat3x3         dmat3x4
374    dmat4x2             dmat4x3         dmat4x4
375
376    (remove "double", "dvec2", "dvec3", and "dvec4" from the list of
377    keywords reserved for future use, p. 15)
378
379
380    Modify Section 4.1, Basic Types, p. 17
381
382    (add to the basic "Transparent Types" table, pp. 17-18)
383
384      Types       Meaning
385      --------    ----------------------------------------------------------
386      double      a single double-precision floating-point scalar
387      dvec2       a two-component double-precision floating-point vector
388      dvec3       a three-component double-precision floating-point vector
389      dvec4       a four-component double-precision floating-point vector
390
391      dmat2       a 2x2 double-precision floating-point matrix
392      dmat3       a 3x3 double-precision floating-point matrix
393      dmat4       a 4x4 double-precision floating-point matrix
394      dmat2x2     same as dmat2
395      dmat2x3     a double-precision matrix with 2 columns and 3 rows
396      dmat2x4     a double-precision matrix with 2 columns and 4 rows
397      dmat3x2     a double-precision matrix with 3 columns and 2 rows
398      dmat3x3     same as dmat3
399      dmat3x4     a double-precision matrix with 3 columns and 4 rows
400      dmat4x2     a double-precision matrix with 4 columns and 2 rows
401      dmat4x3     a double-precision matrix with 4 columns and 3 rows
402      dmat4x4     same as dmat4
403
404
405    Modify Section 4.1.4, Floats, p. 22
406
407    (modify two paragraphs of the section, adding support for doubles)
408
409    Single- and double-precision floating-point values are available for use
410    in a variety of scalar calculations.  Floating-point variables are defined
411    as in the following example:
412
413      float a, b = 1.5;
414      double c, d = 2.0LF;
415
416    As an input value to one of the processing units, a single- or
417    double-precision floating-point variable is expected to match the IEEE
418    floating-point definition for precision and dynamic range of the
419    corresponding type.  It is not required that the precision of internal
420    processing for operands of type "float" match the IEEE floating-point
421    specification for floating-point operations, but the minimum guidelines
422    for precision established by the OpenGL specification must be met.
423    Treatment of conditions such as divide by 0 may lead to an unspecified
424    result, but in no case should such a condition lead to the interruption or
425    termination of processing.
426
427    (modify the grammar, p. 22, adding the "lf" and "LF" suffixes)
428
429      floating-suffix:  one of
430
431        f F lf LF
432
433    (modify last paragraph, p. 22) ...  including before a suffix.  When the
434    suffix "lf" or "LF" is present, the literal has type <double>.  Otherwise,
435    the literal has type <float>.  A leading unary ...
436
437
438    Modify Section 4.1.6, Matrices, p. 23
439
440    (modify the first paragraph of the section)
441
442    The OpenGL Shading Language has built-in types for 2×2, 2×3, 2×4, 3×2,
443    3×3, 3×4, 4×2, 4×3, and 4×4 matrices of single- and double-precision
444    floating-point numbers.  Matrix types beginning with "mat" have
445    single-precision components; matrix types beginning with "dmat" have
446    double-precision components.  The first number in the type is the number
447    of columns, the second is the number of rows. Example matrix declarations:
448
449      mat2 mat2D;
450      mat3 optMatrix;
451      mat4 view, projection;
452      mat4x4 view; // an alternate way of declaring a mat4
453      mat3x2 m; // a matrix with 3 columns and 2 rows
454      dmat4 highPrecisionMVP;
455      dmat2x4 skinnyAndTallWithBigComponents;
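
    The columns-first naming matters when filling the array passed to the
    UniformMatrix*dv commands.  The C sketch below assumes column-major
    storage, as used when <transpose> is FALSE; the function name
    dmat_index is hypothetical.

```c
#include <assert.h>

/* Index of element (col, row) in a flat column-major array holding a
   matrix declared as dmat<cols>x<rows>. */
int dmat_index(int cols, int rows, int col, int row)
{
    assert(col >= 0 && col < cols && row >= 0 && row < rows);
    return col * rows + row; /* each column occupies `rows` consecutive slots */
}
```

    For a dmat2x4 (2 columns, 4 rows), the element in column 1, row 2 is
    found at index 6 of the 8-element array.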
456
457    ...
458
459    Modify Section 4.1.10, Implicit Conversions, p. 27
460
461    (modify table of implicit conversions)
462
463                                Can be implicitly
464        Type of expression        converted to
465        ---------------------   -------------------
466        int                     uint(*), float, double
467        ivec2                   uvec2(*), vec2, dvec2
468        ivec3                   uvec3(*), vec3, dvec3
469        ivec4                   uvec4(*), vec4, dvec4
470
471        uint                    float, double
472        uvec2                   vec2, dvec2
473        uvec3                   vec3, dvec3
474        uvec4                   vec4, dvec4
475
476        float                   double
477        vec2                    dvec2
478        vec3                    dvec3
479        vec4                    dvec4
480
481        mat2                    dmat2
482        mat3                    dmat3
483        mat4                    dmat4
484        mat2x3                  dmat2x3
485        mat2x4                  dmat2x4
486        mat3x2                  dmat3x2
487        mat3x4                  dmat3x4
488        mat4x2                  dmat4x2
489        mat4x3                  dmat4x3
490
491        (*) if ARB_gpu_shader5 or NV_gpu_shader5 is supported
492
493    (modify second paragraph of the section) No implicit conversions are
494    provided to convert from unsigned to signed integer types, from
495    floating-point to integer types, or from higher-precision to
496    lower-precision types.  There are no implicit array or structure
497    conversions.
498
501    (insert before the final paragraph of the section, p. 27) When performing
502    implicit conversion for binary operators, there may be multiple data types
503    to which the two operands can be converted.  For example, when adding an
504    int value to a uint value, both values can be implicitly converted to
505    uint, float, and double.  In such cases, a floating-point type is chosen
506    if either operand has a floating-point type.  Otherwise, an unsigned
507    integer type is chosen if either operand has an unsigned integer type.
508    Otherwise, a signed integer type is chosen.  If operands can be implicitly
509    converted to multiple data types deriving from the same base data type,
510    the type with the smallest component size is used.
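
    The selection rule above can be modeled for scalar base types with the
    following C sketch.  It is not part of the shading language; the
    BaseType enumeration and function names are hypothetical, and the
    int-to-uint conversion is assumed to be available (that is,
    ARB_gpu_shader5 or NV_gpu_shader5 is supported).

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { T_INT, T_UINT, T_FLOAT, T_DOUBLE } BaseType;

/* Implicit conversions from the table in this section, plus identity. */
static bool converts_to(BaseType from, BaseType to)
{
    if (from == to) return true;
    switch (from) {
    case T_INT:   return true; /* uint (assumed), float, double */
    case T_UINT:  return to == T_FLOAT || to == T_DOUBLE;
    case T_FLOAT: return to == T_DOUBLE;
    default:      return false; /* double converts to nothing else */
    }
}

/* Type to which both operands of a binary operator are converted. */
BaseType common_type(BaseType a, BaseType b)
{
    bool any_float    = a >= T_FLOAT || b >= T_FLOAT;
    bool any_unsigned = a == T_UINT || b == T_UINT;

    if (any_float) {
        /* Smallest component size first: try float before double. */
        static const BaseType floats[] = { T_FLOAT, T_DOUBLE };
        for (int i = 0; i < 2; ++i)
            if (converts_to(a, floats[i]) && converts_to(b, floats[i]))
                return floats[i];
    }
    if (any_unsigned) {
        assert(converts_to(a, T_UINT) && converts_to(b, T_UINT));
        return T_UINT;
    }
    return T_INT;
}
```

    With this model, int + uint yields uint, int + float yields float, and
    float + double yields double, since a double cannot be implicitly
    converted to float.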
511
512
513    Modify Section 4.3.4, Inputs, p. 31
514
515    (modify third paragraph of the section, p. 31) ... Vertex shader inputs
516    can only be single-precision floating-point scalars, vectors, or matrices,
517    or signed and unsigned integers and integer vectors.  Vertex shader inputs
518    can also form arrays of these types, but not structures.
519
520    (modify third paragraph, p. 32, allowing doubles as inputs and disallowing
521    them as non-flat fragment inputs) ... Fragment inputs can only be signed
522    and unsigned integers and integer vectors, float, floating-point vectors,
523    double, double-precision vectors, single- or double-precision matrices, or
524    arrays or structures of these. Fragment shader inputs that are signed or
525    unsigned integers, integer vectors, doubles, double-precision vectors, or
526    double-precision matrices must be qualified with the interpolation
527    qualifier flat.
528
529
530    Modify Section 4.3.6, Outputs, p. 33
531
532    (modify third paragraph of the section, p. 33) They can only be float,
533    double, single- or double-precision floating-point vectors or matrices,
534    signed or unsigned integers or integer vectors, or arrays or structures of
535    any of these.
536
537    (modify last paragraph, p. 33) ... Fragment outputs can only be float,
538    single-precision floating-point vectors, signed or unsigned integers or
539    integer vectors, or arrays of these. ...
540
541
542    Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49
543
544    (add double to the first list of constructor examples)
545
546    Converting between scalar types is done as the following prototypes
547    indicate:
548
549      int(uint)     // converts an unsigned integer value to a signed integer
550      int(float)    // converts a float value to a signed integer
551      int(double)   // converts a double value to a signed integer
552      int(bool)     // converts a Boolean value to a signed integer
553      uint(int)     // converts a signed integer value to an unsigned integer
554      uint(float)   // converts a float value to an unsigned integer
555      uint(double)  // converts a double value to an unsigned integer
556      uint(bool)    // converts a Boolean value to an unsigned integer
557      float(int)    // converts a signed integer value to a float
558      float(uint)   // converts an unsigned integer value to a float
559      float(double) // converts a double value to a float
560      float(bool)   // converts a Boolean value to a float
561      double(int)   // converts a signed integer value to a double
562      double(uint)  // converts an unsigned integer value to a double
563      double(float) // converts a float value to a double
564      double(bool)  // converts a Boolean value to a double
565      bool(int)     // converts a signed integer value to a Boolean
566      bool(uint)    // converts an unsigned integer value to a Boolean
567      bool(float)   // converts a float value to a Boolean
568      bool(double)  // converts a double value to a Boolean
569
570    (modify second paragraph of the section, p. 49) When constructors are used
571    to convert any floating-point type to an integer, the fractional part of
572    the floating-point value is dropped. ...
573
574    (modify third paragraph of the section, p. 49) When a constructor is used
575    to convert any integer or floating-point type to bool, 0 and 0.0 are
576    converted to false, and non-zero values are converted to true.  When a
577    constructor is used to convert a bool to any integer or floating-point
578    type, false is converted to 0 or 0.0, and true is converted to 1 or 1.0.
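
    C casts follow the same rules for in-range values, so the constructor
    semantics above can be sketched in C.  The function names are
    hypothetical, and the range check assumes a 32-bit int; C leaves
    out-of-range conversions undefined.

```c
#include <assert.h>
#include <stdbool.h>

/* Models int(double): the fractional part is dropped, so the value is
   truncated toward zero. */
int int_from_double(double x)
{
    assert(x > -2147483649.0 && x < 2147483648.0); /* assumes 32-bit int */
    return (int)x;
}

/* Models bool(double): 0.0 becomes false, any non-zero value true. */
bool bool_from_double(double x)
{
    return x != 0.0;
}

/* Models double(bool): false becomes 0.0, true becomes 1.0. */
double double_from_bool(bool b)
{
    return b ? 1.0 : 0.0;
}
```

    Note that truncation toward zero means that -2.9 converts to -2, not
    -3.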
579
580
581    Modify Section 5.4.2, Vector and Matrix Constructors, p. 50
582
583    (modify the last paragraph, p. 50) If the basic type (bool, int, uint,
584    float, or double) of a parameter to a constructor does not match the basic
585    type of the object being constructed, the scalar construction rules
586    (above) are used to convert the parameters.
587
588
589    (add to the first group of examples, p. 52)
590
591      dmat2(dvec2, dvec2)
592      dmat3(dvec3, dvec3, dvec3)
593      dmat4(dvec4, dvec4, dvec4, dvec4)
594      dmat2x4(dvec3, double,   // first column
595              double, dvec3)   // second column
596
597
598    Modify Section 5.9, Expressions, p. 57
599
600    (modify bulleted list as follows, adding support for double-precision
601    floating-point types)
602
603    Expressions in the shading language are built from the following:
604
605    * Constants of type bool, int, uint, float, double, all vector types and
606      all matrix types.
607
608    ...
609
610    * The arithmetic binary operators add (+), subtract (-), multiply (*), and
611      divide (/) operate on integer, single-precision floating-point, and
612      double-precision floating-point scalars, vectors, and matrices.  If the
613      fundamental types (integer, single-precision floating-point,
614      double-precision floating-point) of the operands do not match, the
615      conversions from Section 4.1.10 "Implicit Conversions" are applied to
616      produce matching types.  ...
617
618    * The arithmetic unary operators negate (-), post- and pre-increment and
619      decrement (-- and ++) operate on integer, single-precision
620      floating-point, or double-precision floating-point values (including
621      vectors and matrices). ...
622
623    * The relational operators greater than (>), less than (<), greater than
624      or equal (>=), and less than or equal (<=) operate only on scalar integer,
625      single-precision floating-point, or double-precision floating-point
626      expressions.  The result is scalar Boolean.  The fundamental type of the
627      two operands must match, either as specified, or after one of the
628      implicit type conversions specified in Section 4.1.10.  ...
629
630      ...
631
632
633    Modify Chapter 8, Built-in Functions, p. 81
634
635    (add to description of generic types, last paragraph of p. 81) ... Where
636    the input arguments (and corresponding output) can be double, dvec2,
637    dvec3, or dvec4, <genDType> is used as the argument.  ... Similarly, <mat>
638    is used for any matrix basic type with single-precision components and
639    <dmat> is used for any matrix basic type with double-precision components.
640
641
642    Modify Section 8.2, Exponential Functions, p. 83
643
644    (add overloads for double-precision square roots)
645
646      genDType sqrt(genDType x);
647      genDType inversesqrt(genDType x);
648
649
650    Modify Section 8.3, Common Functions, p. 84
651
652    (add support for double-precision floating-point multiply-add)
653
654    Syntax:
655
656      genDType fma(genDType a, genDType b, genDType c);
657
658    The function fma() performs a fused double-precision floating-point
659    multiply-add to compute the value a*b+c.  The results of fma() may not be
660    identical to evaluating the expression (a*b)+c, because the computation
661    may be performed in a single operation with intermediate precision
662    different from that used to compute a non-fma() expression.
663
664    The results of fma() are guaranteed to be invariant given fixed inputs
665    <a>, <b>, and <c>, as though the result were taken from a variable
666    declared as "precise".
667
668
669    (add support for double-precision frexp and ldexp functions)
670
671    Syntax:
672
673      genDType frexp(genDType x, out genIType exp);
674      genDType ldexp(genDType x, in genIType exp);
675
676    The function frexp() splits each double-precision floating-point number in
677    <x> into its binary significand, a floating-point number in the range
678    [0.5, 1.0), and an integral exponent of two, such that:
679
680      x = significand * 2 ^ exponent
681
682    The significand is returned by the function; the exponent is returned in
683    the parameter <exp>.  For a floating-point value of zero, the significand
684    and exponent are both zero.  For a floating-point value that is an
685    infinity or is not a number, the results of frexp() are undefined.
686
687    If the input <x> is a vector, this operation is performed in a
688    component-wise manner; the value returned by the function and the value
689    written to <exp> are vectors with the same number of components as <x>.
690
691    The function ldexp() builds a double-precision floating-point number from
692    each significand component in <x> and the corresponding integral exponent
693    of two in <exp>, returning:
694
695      significand * 2 ^ exponent
696
697    If this product is too large to be represented as a double-precision
698    floating-point value, the result is considered undefined.
699
700    If the input <x> is a vector, this operation is performed in a
701    component-wise manner; the value passed in <exp> and returned by the
702    function are vectors with the same number of components as <x>.
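
    The C library functions frexp() and ldexp() use the same convention
    for finite scalar inputs and can serve as a reference when testing
    shader results.  The sketch below is illustrative only; the function
    name split_and_rebuild is hypothetical.  Unlike the description above,
    the C frexp() returns a negative significand for negative inputs.

```c
#include <assert.h>
#include <math.h>

/* Splits x into significand and exponent, then rebuilds and returns it. */
double split_and_rebuild(double x, double *significand, int *exponent)
{
    assert(isfinite(x)); /* infinities and NaNs give undefined results */
    *significand = frexp(x, exponent);      /* x = significand * 2^exponent */
    return ldexp(*significand, *exponent);  /* inverse of the split */
}
```

    For example, 8.0 splits into a significand of 0.5 and an exponent of
    4, and 0.0 splits into a significand and exponent of zero.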
703
704
705    (add overloads for double-precision functions)
706
707      genDType abs(genDType x);
708      genDType sign(genDType x);
709      genDType floor(genDType x);
710      genDType trunc(genDType x);
711      genDType round(genDType x);
712      genDType roundEven(genDType x);
713      genDType ceil(genDType x);
714      genDType fract(genDType x);
715      genDType mod(genDType x, double y);
716      genDType mod(genDType x, genDType y);
717      genDType modf(genDType x, out genDType i);
718      genDType min(genDType x, genDType y);
719      genDType min(genDType x, double y);
720      genDType max(genDType x, genDType y);
721      genDType max(genDType x, double y);
722      genDType clamp(genDType x, genDType minVal, genDType maxVal);
723      genDType clamp(genDType x, double minVal, double maxVal);
724      genDType mix(genDType x, genDType y, genDType a);
725      genDType mix(genDType x, genDType y, double a);
726      genDType mix(genDType x, genDType y, genBType a);
727      genDType step(genDType edge, genDType x);
728      genDType step(double edge, genDType x);
729      genDType smoothstep(genDType edge0, genDType edge1, genDType x);
730      genDType smoothstep(double edge0, double edge1, genDType x);
731      genBType isnan(genDType x);
732      genBType isinf(genDType x);
733
734
735    (add support for 64-bit floating-point packing and unpacking functions)
736
737    Syntax:
738
739      double   packDouble2x32(uvec2 v);
740      uvec2    unpackDouble2x32(double v);
741
742    The function packDouble2x32() returns a double obtained by packing the
743    components of a two-component unsigned integer vector into a 64-bit value
744    and interpreting its bits according to the IEEE double-precision
745    floating-point representation.  The first vector component specifies the
746    32 least significant bits; the second component specifies the 32 most
747    significant bits.
748
749    The function unpackDouble2x32() returns a two-component unsigned integer
750    vector obtained by interpreting a double using the 64-bit IEEE
751    double-precision floating-point representation and unpacking into two
752    32-bit halves.  The first component of the vector contains the 32 least
753    significant bits of the double; the second component contains the 32 most
754    significant bits.
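
    The bit layout described above can be sketched on the host side in C.
    This is a minimal illustration, assuming the host "double" uses the IEEE
    754 64-bit format with the same byte order as its integers; the type
    uvec2_words and both function names are hypothetical stand-ins for the
    GLSL uvec2 type and built-ins.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a GLSL uvec2: x holds the 32 least significant
 * bits, y holds the 32 most significant bits. */
typedef struct { uint32_t x, y; } uvec2_words;

/* Sketch of packDouble2x32(): combine the two words into one 64-bit
 * pattern and reinterpret that pattern as a double. */
double pack_double_2x32(uvec2_words v)
{
    uint64_t bits = ((uint64_t)v.y << 32) | (uint64_t)v.x;
    double d;
    memcpy(&d, &bits, sizeof d);   /* bit-for-bit copy, no conversion */
    return d;
}

/* Sketch of unpackDouble2x32(): reinterpret the double as a 64-bit pattern
 * and split it into low and high 32-bit halves. */
uvec2_words unpack_double_2x32(double d)
{
    uint64_t bits;
    uvec2_words v;
    memcpy(&bits, &d, sizeof bits);
    v.x = (uint32_t)(bits & 0xFFFFFFFFu);
    v.y = (uint32_t)(bits >> 32);
    return v;
}
```

    Under those assumptions, packing the words (0x00000000, 0x3FF00000)
    yields 1.0, since 0x3FF0000000000000 is the IEEE encoding of 1.0.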
755
756
757    Modify Section 8.4, Geometric Functions, p. 87
758
759    (add double-precision equivalents for existing geometric functions)
760
761      double length(genDType x);
762      double distance(genDType p0, genDType p1);
763      double dot(genDType x, genDType y);
764      dvec3 cross(dvec3 x, dvec3 y);
765      genDType normalize(genDType x);
766      genDType faceforward(genDType N, genDType I, genDType Nref);
767      genDType reflect(genDType I, genDType N);
768      genDType refract(genDType I, genDType N, double eta);
769
770
771    Modify Section 8.5, Matrix Functions, p. 89
772
773    (add double-precision equivalents for existing matrix functions)
774
775      dmat matrixCompMult(dmat x, dmat y);
776      dmat2 outerProduct(dvec2 c, dvec2 r);
777      dmat3 outerProduct(dvec3 c, dvec3 r);
778      dmat4 outerProduct(dvec4 c, dvec4 r);
779      dmat2x3 outerProduct(dvec3 c, dvec2 r);
780      dmat3x2 outerProduct(dvec2 c, dvec3 r);
781      dmat2x4 outerProduct(dvec4 c, dvec2 r);
782      dmat4x2 outerProduct(dvec2 c, dvec4 r);
783      dmat3x4 outerProduct(dvec4 c, dvec3 r);
784      dmat4x3 outerProduct(dvec3 c, dvec4 r);
785      dmat2 transpose(dmat2 m);
786      dmat3 transpose(dmat3 m);
787      dmat4 transpose(dmat4 m);
788      dmat2x3 transpose(dmat3x2 m);
789      dmat3x2 transpose(dmat2x3 m);
790      dmat2x4 transpose(dmat4x2 m);
791      dmat4x2 transpose(dmat2x4 m);
792      dmat3x4 transpose(dmat4x3 m);
793      dmat4x3 transpose(dmat3x4 m);
794      double determinant(dmat2 m);
795      double determinant(dmat3 m);
796      double determinant(dmat4 m);
797      dmat2 inverse(dmat2 m);
798      dmat3 inverse(dmat3 m);
799      dmat4 inverse(dmat4 m);
800
801
802    Modify Section 8.6, Vector Relational Functions, p. 90
803
804    (modify the first paragraph, p. 90, adding support for relational
805    functions operating on double precision types)
806
807    Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or
808    reserved) to operate on scalars and produce scalar Boolean results.  For
809    vector results, use the following built-in functions.  In the definitions
810    below, the following terms are used as placeholders for all vector types
811    for a given fundamental data type.  In all cases, the sizes of the input
812    and return vectors for any particular call must match.
813
814        placeholder     fundamental types
815        -----------     ------------------------------------------------
816        bvec            bvec2, bvec3, bvec4
817
818        ivec            ivec2, ivec3, ivec4
819
820        uvec            uvec2, uvec3, uvec4
821
822        vec             vec2, vec3, vec4, dvec2, dvec3, dvec4
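
    As a rough illustration of the component-wise behavior of these built-ins
    when applied to double-precision vectors, the following C sketch mimics
    what lessThan() would produce for two dvec3 operands.  The function name
    less_than_dvec3 and the array-based representation are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical analogue of lessThan(dvec3, dvec3): the comparison is done
 * per component and the result has one Boolean per component. */
void less_than_dvec3(const double x[3], const double y[3], bool result[3])
{
    for (int i = 0; i < 3; ++i)
        result[i] = x[i] < y[i];
}
```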
823
824
825    Modify Section 9, Shading Language Grammar, p. 92
826
827    !!! TBD !!!
828
829
830GLX Protocol
831
832    !!! TBD
833
834Dependencies on ARB_gpu_shader5
835
836    If ARB_gpu_shader5 is not supported, the changes to the function
837    overloading rules in the OpenGL Shading Language Specification provided
838    there should be included in this extension.
839
840Dependencies on NV_gpu_shader5
841
842    This extension and NV_gpu_shader5 both provide support for shading
843    language variables with 64-bit components.  If both extensions are
844    supported, the various edits describing this new support should be
845    combined.
846
847Dependencies on EXT_direct_state_access
848
849    If EXT_direct_state_access is not supported, references to the
850    ProgramUniform*d*EXT functions should be removed.
851
852    If EXT_direct_state_access is supported, that specification should be
853    edited as follows:
854
855    (modify the ProgramUniform* language)
856
857    The following commands:
858
859        ....
860        void ProgramUniform{1,2,3,4}dEXT(uint program, int location, T value);
861        void ProgramUniform{1,2,3,4}dvEXT (uint program, int location,
862                                          const T *value);
863        void ProgramUniformMatrix{2,3,4}dvEXT
864             (uint program, int location, sizei count, boolean transpose,
865              const double *value);
866        void ProgramUniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dvEXT
867             (uint program, int location, sizei count, boolean transpose,
868              const double *value);
869
870    operate identically to the corresponding command where "Program" is
871    deleted from the name (and extension suffixes are dropped or updated
872    appropriately) except, rather than updating the currently active program
873    object, these "Program" commands update the program object named by the
874    <program> parameter.  ...
875
876Dependencies on NV_shader_buffer_load
877
878    If NV_shader_buffer_load is supported, that specification should be edited
879    as follows:
880
881    Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load.
882
883    (add rules for loads of variables having the new data types from this
884    extension to the list of bullets following "When a shader dereferences a
885    pointer variable")
886
887      - Data of type "double" are read from or written to memory as one
888        double-typed value at the specified GPU address.
889
890
891Errors
892
893    None.
894
895New State
896
897    None.
898
899New Implementation Dependent State
900
901    None.
902
903Issues
904
905    (1) How do double-precision types interact with the rules for storing
906    uniforms in a buffer object?
907
908      RESOLVED:  The rules were already written with data types larger and
909      smaller than those in the original GLSL in mind.  Single precision
910      floats typically take four bytes; doubles take eight bytes.  The larger
911      storage requirement for doubles means a larger alignment requirement;
912      doubles still need to be size-aligned.
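
      A minimal sketch of what size alignment means in practice, assuming the
      "std140" layout rules of ARB_uniform_buffer_object (a scalar of N bytes
      aligns to N, a two-component vector to 2N, and a three- or
      four-component vector to 4N).  The helper names base_alignment and
      align_up are hypothetical.

```c
#include <stddef.h>

/* Base alignment in bytes of a scalar or vector under the assumed std140
 * rules.  component_size is 4 for float and 8 for double. */
size_t base_alignment(size_t component_size, unsigned components)
{
    if (components == 1)
        return component_size;
    if (components == 2)
        return 2 * component_size;
    return 4 * component_size;      /* three- and four-component vectors */
}

/* Round offset up to the next multiple of alignment. */
size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) / alignment * alignment;
}
```

      With these rules a "double" placed after a single "float" at offset 0
      lands at offset 8 rather than 4, and a dvec3 aligns to 32 bytes.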
913
914    (2) Should double-precision vertex shader inputs be supported?
915
916      RESOLVED:  Not in this extension.  Such support will be added by the
917      EXT_vertex_attrib_64bit extension.
918
919    (3) Should double-precision fragment shader outputs be supported?
920
921      RESOLVED:  Not in this extension.  Note that we don't have
922      double-precision framebuffer formats to accept such values.
923
924    (4) Should transform feedback be able to capture double-precision
925    components?
926
927      RESOLVED:  Yes.  However, undefined behavior will occur unless all
928      components are captured to size-aligned offsets.
929
930      If any variable captured in transform feedback has double-precision
931      components, the practical requirements for defined behavior are:
932
933        (a) the offset of the base of a buffer object must be a multiple of
934            eight bytes;
935
936        (b) the amount of data captured per vertex must be a multiple of eight
937            bytes; and
938
939        (c) each double-precision variable captured must be aligned to a
940            multiple of eight bytes relative to the beginning of a vertex.
941
942      If capturing a mix of single- and double-precision components, it might
943      be necessary to use the "gl_SkipComponents1" variable from
944      ARB_transform_feedback3 to force proper alignment.
945
946      We considered the possibility of adding error checks to throw errors in
947      cases where undefined behavior might occur, but chose not to include
948      such errors.  For OpenGL 3.0-style transform feedback, cases (b) and (c)
949      are solely a function of the variables captured and could be detected when a
950      program object is linked.  (Such an error would be more problematic for
951      transform feedback via NV_transform_feedback, where the set of variables
952      captured can be updated without relinking.)  For case (a), the
953      requirement of OpenGL 3.0 is that transform feedback buffer offsets must
954      be a multiple of 4 bytes; enforcing a stricter 8-byte alignment would
955      require either a backward-incompatible change or a Begin-time error that
956      checks the offset of transform feedback buffers against the current
957      program.
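
      Requirements (a) through (c) can be checked by an application before it
      begins transform feedback.  The following C sketch assumes interleaved
      capture with variables packed tightly in the order listed, four bytes
      per single-precision component and eight bytes per double-precision
      component.  The captured_var type and the function name are
      hypothetical; a skipped component such as "gl_SkipComponents1" can be
      modeled as a one-component, non-double entry.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one captured variable. */
typedef struct {
    unsigned components;   /* number of components captured */
    bool     is_double;    /* true if the components are double-precision */
} captured_var;

/* Returns true if the layout meets requirements (a), (b), and (c).  This
 * sketch checks only those three requirements and nothing else. */
bool capture_layout_is_aligned(size_t buffer_offset,
                               const captured_var *vars, size_t count)
{
    size_t stride = 0;            /* bytes captured per vertex so far */
    bool any_double = false;
    bool doubles_aligned = true;

    for (size_t i = 0; i < count; ++i) {
        if (vars[i].is_double) {
            any_double = true;
            if (stride % 8 != 0)
                doubles_aligned = false;          /* violates (c) */
            stride += 8u * vars[i].components;
        } else {
            stride += 4u * vars[i].components;
        }
    }

    if (!any_double)
        return true;   /* the requirements apply only when doubles are captured */

    return buffer_offset % 8 == 0    /* (a) buffer base offset */
        && stride % 8 == 0           /* (b) data captured per vertex */
        && doubles_aligned;          /* (c) each double within the vertex */
}
```

      For instance, capturing one float followed by one double fails check
      (c), while inserting one skipped component between them moves the
      double to byte 8 and makes the per-vertex size 16 bytes.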
958
959    (5) Should we have double-precision matrix types?  We didn't add integer
960        matrices, but integer matrix math is fairly uncommon.
961
962      RESOLVED:  Yes, we will support all matrix sizes in double-precision.
963      We will also provide double-precision equivalents for all matrix
964      operators and built-in matrix functions.
965
966    (6) What should be done to distinguish between single- and
967        double-precision floating-point constants?
968
969      RESOLVED:  We will use "LF" to identify double-precision floating-point
970      constants.  Here, we depart from the C standard.  In C, floating-point
971      constants without a suffix are implicitly double-precision and require a
972      "F" suffix to specify a single-precision constant.  However, GLSL has
973      historically provided no support for double precision.  Changing to C
974      rules would materially affect the behavior of pre-existing shaders that
975      add an #extension line for this extension, since constants with no
976      suffix have meant "float" up to now.  Additionally, such a change would
977      likely have required that we introduce implicit conversions from double
978      to float; otherwise, assigning a constant with no suffix to a float
979      would result in a compile-time error.
980
981    (7) Should we require IEEE 754-compliant behavior for NaNs and
982        infinities?  Denorms?
983
984      RESOLVED:  Following historical precedent in the GLSL and OpenGL APIs
985      not defining special-case floating-point behavior, we chose not to do so
986      in this extension.
987
988    (8) Should we provide double-precision versions of all the built-ins that
989        take a <genType>, which are currently defined to be floats and
990        floating-point vectors?
991
992      RESOLVED:  We provide double-precision versions of most of the built-in
993      functions supported by GLSL.  We opted not to provide double-precision
994      functions for special trigonometry, exponential, derivative, and noise
995      functions.
996
997    (9) Are double-precision "varyings" (values passed between shader stages)
998        supported by this extension?  If so, is double-precision interpolation
999        supported?
1000
1001      RESOLVED:  Double-precision shader inputs and outputs are supported,
1002      except for vertex shader inputs and fragment shader outputs.
1003      Additionally, double-precision vertex shader inputs are provided by the
1004      separate extension EXT_vertex_attrib_64bit.  No known extension provides
1005      double-precision fragment outputs, but that doesn't seem important since
1006      OpenGL provides no pixel/texture formats with double-precision
1007      components that could reasonably receive such outputs.
1008
1009      Interpolation is not supported in this extension for double-precision
1010      floating-point components.  As with integer types in OpenGL 3.0,
1011      double-precision floating-point fragment shader inputs must be qualified
1012      as "flat".
1013
1014      Note that this extension reformulates the spec language requiring "flat"
1015      qualifiers, in addition to adding doubles to the list of "flat" types.
1016      In GLSL 1.30, the spec applies these requirements to vertex shader
1017      outputs but imposes no requirement on fragment inputs.  We move this
1018      requirement to fragment inputs, since vertex shader outputs may be
1019      passed to tessellation or geometry shaders without interpolation, and
1020      thus without the need for qualification by "flat".
1021
1022    (15) Can the 64-bit uniform APIs be used to load values for uniforms of
1023         type "bool", "bvec2", "bvec3", or "bvec4"?
1024
1025      RESOLVED:  No.  OpenGL 2.0 and beyond did allow "bool" variables to be
1026      set with Uniform*i* and Uniform*f* APIs, and OpenGL 3.0 extended that
1027      support to Uniform*ui* for orthogonality.  But it seems pointless to
1028      extend this capability forward to 64-bit Uniform APIs as well.
1029
1030    (19) Should we support any implicit conversion of matrix types, now that
1031         we have both "mat4" and "dmat4"?
1032
1033      RESOLVED:  No.  It doesn't seem worth the trouble.
1034
1035
1036
1037Revision History
1038
1039    Rev.    Date    Author    Changes
1040    ----  --------  --------  -----------------------------------------
1041    11    08/27/12  pbrown    Clarify that Uniform*d can not be used to load
1042                              uniforms with boolean types (bug 9345); import
1043                              issue (15) on the topic from NV_gpu_shader5.
1044
1045    10    03/23/10  pbrown    Update issues section to include fp64 issues
1046                              that were left behind in NV_gpu_shader5 when the
1047                              specs were refactored.
1048
1049     9    02/02/10  pbrown    Specify that capturing any component at an
1050                              offset that is not size-aligned results in
1051                              undefined behavior (bug 5863).
1052
1053     8    01/29/10  pbrown    Remove shading language and API support for
1054                              double-precision vertex attributes; moved to the
1055                              EXT_vertex_attrib_64bit specification (bug
1056                              5953).  Added clarification disallowing
1057                              double-precision fragment shader outputs.
1058
1059     7    01/29/10  pbrown    Delete accidental modifications to the language
1060                              for equal and not equal operators (bug 5904),
1061                              which already supported all types.
1062
1063     6    01/15/10  pbrown    Modify the spec rules for counting attributes,
1064                              input and output components, and components
1065                              to capture in transform feedback to permit,
1066                              but not require, double-precision values to
1067                              require twice as many resources as single-
1068                              precision equivalents (bug 5855).
1069
1070     5    01/14/10  pbrown    Minor updates from spec reviews.
1071
1072     4    12/10/09  pbrown    Functionality updates from spec review:
1073                              Allow implicit conversion from mat*->dmat*.
1074                              Rename fmad and [un]packFloat2x32 to fma
1075                              and [un]packDouble2x32.  Add overlooked
1076                              fp64 versions of geometric functions.
1077
1078     3    12/10/09  pbrown    Convert from EXT to ARB.
1079
1080     2    12/08/09  pbrown    Miscellaneous fixes from spec review:  Clarified
1081                              input/output component counting rules, where
1082                              each fp64 value counts double.  General typo
1083                              fixes and language clarifications.
1084
1085     1              pbrown    Internal revisions.
1086