Name

    ARB_gpu_shader_fp64

Name Strings

    GL_ARB_gpu_shader_fp64

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Contributors

    Barthold Lichtenbelt, NVIDIA
    Bill Licea-Kane, AMD
    Bruce Merry, ARM
    Chris Dodd, NVIDIA
    Eric Werness, NVIDIA
    Graham Sellers, AMD
    Greg Roth, NVIDIA
    Jeff Bolz, NVIDIA
    Nick Haemel, AMD
    Pierre Boudier, AMD
    Piers Daniell, NVIDIA

Notice

    Copyright (c) 2010-2013 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL Working Group. For
    extensions which have been promoted to a core Specification, fixes will
    first appear in the latest version of that core Specification, and will
    eventually be backported to the extension document. This policy is
    described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete. Approved by the ARB at the 2010/01/22 F2F meeting.
    Approved by the Khronos Board of Promoters on March 10, 2010.

Version

    Last Modified Date:         August 27, 2012
    NVIDIA Revision:            11

Number

    ARB Extension #89

Dependencies

    This extension is written against the OpenGL 3.2 (Compatibility Profile)
    Specification.

    This extension is written against version 1.50 (revision 09) of the OpenGL
    Shading Language Specification.

    OpenGL 3.2 and GLSL 1.50 are required.

    This extension interacts with EXT_direct_state_access.

    This extension interacts with NV_shader_buffer_load.

Overview

    This extension allows GLSL shaders to use double-precision floating-point
    data types, including vectors and matrices of doubles.  Doubles may be
    used as inputs, outputs, and uniforms.

    The shading language supports various arithmetic and comparison operators
    on double-precision scalar, vector, and matrix types, and provides a set
    of built-in functions including:

      * square roots and inverse square roots;

      * fused floating-point multiply-add operations;

      * splitting a floating-point number into a significand and exponent
        (frexp), or building a floating-point number from a significand and
        exponent (ldexp);

      * absolute value, sign tests, various functions to round to an integer
        value, modulus, minimum, maximum, clamping, blending two values, step
        functions, and testing for infinity and NaN values;

      * packing and unpacking doubles into a pair of 32-bit unsigned integers;

      * matrix component-wise multiplication, and computation of outer
        products, transposes, determinants, and inverses; and

      * vector relational functions.

    Double-precision versions of angle, trigonometry, and exponential
    functions are not supported.

    Implicit conversions are supported from integer and single-precision
    floating-point values to doubles, and this extension uses the relaxed
    function overloading rules specified by the ARB_gpu_shader5 extension to
    resolve ambiguities.

    This extension provides API functions for specifying double-precision
    uniforms in the default uniform block, including functions similar to the
    uniform functions added by EXT_direct_state_access (if supported).

    This extension provides an "LF" suffix for specifying double-precision
    constants.  Floating-point constants without a suffix in GLSL are treated
    as single-precision values for backward compatibility with versions not
    supporting doubles; similar constants are treated as double-precision
    values in the "C" programming language.

    This extension does not support interpolation of double-precision values;
    doubles used as fragment shader inputs must be qualified as "flat".
    Additionally, this extension does not allow vertex attributes with 64-bit
    components.  That support is added separately by EXT_vertex_attrib_64bit.

IP Status

    No known IP claims.

New Procedures and Functions

    void Uniform1d(int location, double x);
    void Uniform2d(int location, double x, double y);
    void Uniform3d(int location, double x, double y, double z);
    void Uniform4d(int location, double x, double y, double z, double w);
    void Uniform1dv(int location, sizei count, const double *value);
    void Uniform2dv(int location, sizei count, const double *value);
    void Uniform3dv(int location, sizei count, const double *value);
    void Uniform4dv(int location, sizei count, const double *value);

    void UniformMatrix2dv(int location, sizei count, boolean transpose,
                          const double *value);
    void UniformMatrix3dv(int location, sizei count, boolean transpose,
                          const double *value);
    void UniformMatrix4dv(int location, sizei count, boolean transpose,
                          const double *value);
    void UniformMatrix2x3dv(int location, sizei count, boolean transpose,
                            const double *value);
    void UniformMatrix2x4dv(int location, sizei count, boolean transpose,
                            const double *value);
    void UniformMatrix3x2dv(int location, sizei count, boolean transpose,
                            const double *value);
    void UniformMatrix3x4dv(int location, sizei count, boolean transpose,
                            const double *value);
    void UniformMatrix4x2dv(int location, sizei count, boolean transpose,
                            const double *value);
    void UniformMatrix4x3dv(int location, sizei count, boolean transpose,
                            const double *value);

    void GetUniformdv(uint program, int location, double *params);

    (All of the following ProgramUniform* functions are supported if and only
     if EXT_direct_state_access is supported.)

    void ProgramUniform1dEXT(uint program, int location, double x);
    void ProgramUniform2dEXT(uint program, int location, double x, double y);
    void ProgramUniform3dEXT(uint program, int location, double x, double y,
                             double z);
    void ProgramUniform4dEXT(uint program, int location, double x, double y,
                             double z, double w);
    void ProgramUniform1dvEXT(uint program, int location, sizei count,
                              const double *value);
    void ProgramUniform2dvEXT(uint program, int location, sizei count,
                              const double *value);
    void ProgramUniform3dvEXT(uint program, int location, sizei count,
                              const double *value);
    void ProgramUniform4dvEXT(uint program, int location, sizei count,
                              const double *value);

    void ProgramUniformMatrix2dvEXT(uint program, int location, sizei count,
                                    boolean transpose, const double *value);
    void ProgramUniformMatrix3dvEXT(uint program, int location, sizei count,
                                    boolean transpose, const double *value);
    void ProgramUniformMatrix4dvEXT(uint program, int location, sizei count,
                                    boolean transpose, const double *value);
    void ProgramUniformMatrix2x3dvEXT(uint program, int location, sizei count,
                                      boolean transpose, const double *value);
    void ProgramUniformMatrix2x4dvEXT(uint program, int location, sizei count,
                                      boolean transpose, const double *value);
    void ProgramUniformMatrix3x2dvEXT(uint program, int location, sizei count,
                                      boolean transpose, const double *value);
    void ProgramUniformMatrix3x4dvEXT(uint program, int location, sizei count,
                                      boolean transpose, const double *value);
    void ProgramUniformMatrix4x2dvEXT(uint program, int location, sizei count,
                                      boolean transpose, const double *value);
    void ProgramUniformMatrix4x3dvEXT(uint program, int location, sizei count,
                                      boolean transpose, const double *value);

New Tokens

    Returned in the <type> parameter of GetActiveUniform, and
    GetTransformFeedbackVarying:

        DOUBLE                                          0x140A
        DOUBLE_VEC2                                     0x8FFC
        DOUBLE_VEC3                                     0x8FFD
        DOUBLE_VEC4                                     0x8FFE
        DOUBLE_MAT2                                     0x8F46
        DOUBLE_MAT3                                     0x8F47
        DOUBLE_MAT4                                     0x8F48
        DOUBLE_MAT2x3                                   0x8F49
        DOUBLE_MAT2x4                                   0x8F4A
        DOUBLE_MAT3x2                                   0x8F4B
        DOUBLE_MAT3x4                                   0x8F4C
        DOUBLE_MAT4x2                                   0x8F4D
        DOUBLE_MAT4x3                                   0x8F4E


Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
(OpenGL Operation)

    Modify Section 2.14.4, Uniform Variables, p. 89

    (modify third paragraph, p. 90) ... uniform variable storage for a vertex
    shader.  A uniform matrix with single- or double-precision components will
    consume no more than 4 * min(r,c) or 8 * min(r,c) uniform components,
    respectively.  A scalar or vector uniform with double-precision components
    will consume no more than 2<n> components, where <n> is 1 for scalars, and
    the component count for vectors.  A link error is generated ...

    (add to Table 2.13, p. 96)

      Type Name Token       Keyword
      --------------------  ----------------
      DOUBLE                double
      DOUBLE_VEC2           dvec2
      DOUBLE_VEC3           dvec3
      DOUBLE_VEC4           dvec4
      DOUBLE_MAT2           dmat2
      DOUBLE_MAT3           dmat3
      DOUBLE_MAT4           dmat4
      DOUBLE_MAT2x3         dmat2x3
      DOUBLE_MAT2x4         dmat2x4
      DOUBLE_MAT3x2         dmat3x2
      DOUBLE_MAT3x4         dmat3x4
      DOUBLE_MAT4x2         dmat4x2
      DOUBLE_MAT4x3         dmat4x3

    (modify list of commands at the bottom of p. 99)

      void Uniform{1,2,3,4}d(int location, T value);
      void Uniform{1,2,3,4}dv(int location, sizei count, const T *value);
      void UniformMatrix{2,3,4}dv
             (int location, sizei count, boolean transpose,
              const double *value);
      void UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv
             (int location, sizei count, boolean transpose,
              const double *value);

    (insert after fourth paragraph, p. 100) The Uniform*d{v} commands will
    load <count> sets of one to four double-precision floating-point values
    into a uniform location defined as a double, a double vector, or an array
    of double scalars or vectors.
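
    The component limits stated above for Section 2.14.4 can be sketched in C
    as follows.  This is only an illustration of the arithmetic; the helper
    names are hypothetical and are not part of this extension or of any GL
    API, and an implementation is free to consume fewer components than
    these upper bounds.

```c
#include <assert.h>

/* Upper bound on uniform components consumed by a double-precision
 * scalar or vector: 2 * n, where n is 1 for scalars and the component
 * count for vectors.  Hypothetical helper for illustration only. */
int max_double_vector_uniform_components(int n)
{
    assert(n >= 1 && n <= 4);
    return 2 * n;
}

/* Upper bound for a uniform matrix with `cols` columns and `rows` rows,
 * as worded in the text above: 4 * min(r,c) for single-precision
 * components, 8 * min(r,c) for double-precision components.
 * Hypothetical helper for illustration only. */
int max_matrix_uniform_components(int cols, int rows, int is_double)
{
    assert(cols >= 2 && cols <= 4 && rows >= 2 && rows <= 4);
    int m = cols < rows ? cols : rows;
    return (is_double ? 8 : 4) * m;
}
```

    For example, under these rules a dvec3 uniform is bounded by 6 components
    and a dmat2x4 uniform by 16 components.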

    (modify fifth paragraph, p. 100) The UniformMatrix{2,3,4}fv and
    UniformMatrix{2,3,4}dv commands will load <count> 2x2, 3x3, or 4x4
    matrices (corresponding to 2, 3, or 4 in the command name) of single- or
    double-precision floating-point values, respectively, into ...

    (replace second bullet on the middle of p. 101, regarding
    INVALID_OPERATION errors in Uniform* commands)

    * if the type of the uniform declared in the shader does not match the
      component type and count indicated in the Uniform* command name (where
      a boolean uniform component type is considered to match any of the
      Uniform*i{v}, Uniform*ui{v}, or Uniform*f{v} commands),

    (modify sixth paragraph, p. 100) The UniformMatrix{2x3,3x2,2x4,
    4x2,3x4,4x3}fv and UniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dv commands will
    load <count> 2x3, 3x2, 2x4, 4x2, 3x4, or 4x3 matrices (corresponding to
    the numbers in the command name) of single- or double-precision
    floating-point values, respectively, into ...

    (modify "Uniform Buffer Object Storage", p. 102, adding a bullet after the
    last "Members of type", and modifying the subsequent bullet)

    * Members of type double are extracted from a buffer object by reading a
      single double-typed value at the specified offset.

    * Vectors with N elements with basic data types of bool, int, uint,
      float, or double are extracted as N values in consecutive memory
      locations beginning at the specified offset, with components stored in
      order with the first (X) component at the lowest offset. The GL data
      type used for component extraction is derived according to the rules
      for scalar members above.


    Modify Section 2.14.6, Varying Variables, p. 106

    (modify third paragraph, p. 107) ... For the purposes of counting input
    and output components consumed by a shader, variables declared as vectors,
    matrices, and arrays will all consume multiple components.
    Each component of variables declared as double-precision floating-point
    scalars, vectors, or matrices may be counted as consuming two components.

    (add after the bulleted list, p. 108) For the purposes of counting the
    total number of components to capture, each component of outputs declared
    as double-precision floating-point scalars, vectors, or matrices may be
    counted as consuming two components.


    Modify Section 2.19, Transform Feedback, p. 130

    (add to end of first paragraph, p. 132) ... The results of appending a
    varying variable to a transform feedback buffer are undefined if any
    component of that variable would be written at an offset not aligned to
    the size of the component.


Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
(Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
(Per-Fragment Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
(Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
(State and State Requests)

    Modify Section 6.1.15, Shader and Program Queries, p. 332

    (add to the first list of commands, p. 337)

        void GetUniformdv(uint program, int location, double *params);


Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Modifications to The OpenGL Shading Language Specification, Version 1.50
(Revision 09)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_ARB_gpu_shader_fp64 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_ARB_gpu_shader_fp64        1


    Modify Section 3.6, Keywords, p. 14

    (add the following to the list of keywords, p. 14)

      double  dvec2   dvec3   dvec4

      dmat2   dmat3   dmat4
      dmat2x2 dmat2x3 dmat2x4
      dmat3x2 dmat3x3 dmat3x4
      dmat4x2 dmat4x3 dmat4x4

    (remove "double", "dvec2", "dvec3", and "dvec4" from the list of
    keywords reserved for future use, p. 15)


    Modify Section 4.1, Basic Types, p. 17

    (add to the basic "Transparent Types" table, pp. 17-18)

      Types       Meaning
      --------    ----------------------------------------------------------
      double      a single double-precision floating-point scalar
      dvec2       a two-component double-precision floating-point vector
      dvec3       a three-component double-precision floating-point vector
      dvec4       a four-component double-precision floating-point vector

      dmat2       a 2x2 double-precision floating-point matrix
      dmat3       a 3x3 double-precision floating-point matrix
      dmat4       a 4x4 double-precision floating-point matrix
      dmat2x2     same as dmat2
      dmat2x3     a double-precision matrix with 2 columns and 3 rows
      dmat2x4     a double-precision matrix with 2 columns and 4 rows
      dmat3x2     a double-precision matrix with 3 columns and 2 rows
      dmat3x3     same as dmat3
      dmat3x4     a double-precision matrix with 3 columns and 4 rows
      dmat4x2     a double-precision matrix with 4 columns and 2 rows
      dmat4x3     a double-precision matrix with 4 columns and 3 rows
      dmat4x4     same as dmat4


    Modify Section 4.1.4, Floats, p. 22

    (modify two paragraphs of the section, adding support for doubles)

    Single- and double-precision floating-point values are available for use
    in a variety of scalar calculations.
    Floating-point variables are defined as in the following example:

      float a, b = 1.5;
      double c, d = 2.0LF;

    As an input value to one of the processing units, a single- or
    double-precision floating-point variable is expected to match the IEEE
    floating-point definition for precision and dynamic range of the
    corresponding type.  It is not required that the precision of internal
    processing for operands of type "float" match the IEEE floating-point
    specification for floating-point operations, but the minimum guidelines
    for precision established by the OpenGL specification must be met.
    Treatment of conditions such as divide by 0 may lead to an unspecified
    result, but in no case should such a condition lead to the interruption or
    termination of processing.

    (modify the grammar, p. 22, adding the "lf" and "LF" suffixes)

      floating-suffix: one of

        f F lf LF

    (modify last paragraph, p. 22) ... including before a suffix.  When the
    suffix "lf" or "LF" is present, the literal has type <double>.  Otherwise,
    the literal has type <float>.  A leading unary ...


    Modify Section 4.1.6, Matrices, p. 23

    (modify the first paragraph of the section)

    The OpenGL Shading Language has built-in types for 2×2, 2×3, 2×4, 3×2,
    3×3, 3×4, 4×2, 4×3, and 4×4 matrices of single- and double-precision
    floating-point numbers.  Matrix types beginning with "mat" have
    single-precision components; matrix types beginning with "dmat" have
    double-precision components.  The first number in the type is the number
    of columns, the second is the number of rows.  Example matrix declarations:

        mat2 mat2D;
        mat3 optMatrix;
        mat4 view, projection;
        mat4x4 view;    // an alternate way of declaring a mat4
        mat3x2 m;       // a matrix with 3 columns and 2 rows
        dmat4 highPrecisionMVP;
        dmat2x4 skinnyAndTallWithBigComponents;

    ...

    Modify Section 4.1.10, Implicit Conversions, p. 27

    (modify table of implicit conversions)

                                Can be implicitly
        Type of expression        converted to
        ---------------------   -------------------
        int                     uint(*), float, double
        ivec2                   uvec2(*), vec2, dvec2
        ivec3                   uvec3(*), vec3, dvec3
        ivec4                   uvec4(*), vec4, dvec4

        uint                    float, double
        uvec2                   vec2, dvec2
        uvec3                   vec3, dvec3
        uvec4                   vec4, dvec4

        float                   double
        vec2                    dvec2
        vec3                    dvec3
        vec4                    dvec4

        mat2                    dmat2
        mat3                    dmat3
        mat4                    dmat4
        mat2x3                  dmat2x3
        mat2x4                  dmat2x4
        mat3x2                  dmat3x2
        mat3x4                  dmat3x4
        mat4x2                  dmat4x2
        mat4x3                  dmat4x3

        (*) if ARB_gpu_shader5 or NV_gpu_shader5 is supported

    (modify second paragraph of the section) No implicit conversions are
    provided to convert from unsigned to signed integer types, from
    floating-point to integer types, or from higher-precision to
    lower-precision types.  There are no implicit array or structure
    conversions.

    (insert before the final paragraph of the section, p. 27) When performing
    implicit conversion for binary operators, there may be multiple data types
    to which the two operands can be converted.  For example, when adding an
    int value to a uint value, both values can be implicitly converted to
    uint, float, and double.  In such cases, a floating-point type is chosen
    if either operand has a floating-point type.  Otherwise, an unsigned
    integer type is chosen if either operand has an unsigned integer type.
    Otherwise, a signed integer type is chosen.  If operands can be implicitly
    converted to multiple data types deriving from the same base data type,
    the type with the smallest component size is used.


    Modify Section 4.3.4, Inputs, p. 31

    (modify third paragraph of the section, p. 31) ...
    Vertex shader inputs can only be single-precision floating-point scalars,
    vectors, or matrices, or signed and unsigned integers and integer vectors.
    Vertex shader inputs can also form arrays of these types, but not
    structures.

    (modify third paragraph, p. 32, allowing doubles as inputs and disallowing
    as non-flat fragment inputs) ... Fragment inputs can only be signed and
    unsigned integers and integer vectors, float, floating-point vectors,
    double, double-precision vectors, single- or double-precision matrices, or
    arrays or structures of these.  Fragment shader inputs that are signed or
    unsigned integers, integer vectors, doubles, double-precision vectors, or
    double-precision matrices must be qualified with the interpolation
    qualifier flat.


    Modify Section 4.3.6, Outputs, p. 33

    (modify third paragraph of the section, p. 33) They can only be float,
    double, single- or double-precision floating-point vectors or matrices,
    signed or unsigned integers or integer vectors, or arrays or structures of
    any of these.

    (modify last paragraph, p. 33) ... Fragment outputs can only be float,
    single-precision floating-point vectors, signed or unsigned integers or
    integer vectors, or arrays of these. ...


    Modify Section 5.4.1, Conversion and Scalar Constructors, p. 49

    (add double to the first list of constructor examples)

    Converting between scalar types is done as the following prototypes
    indicate:

      int(uint)      // converts an unsigned integer value to a signed integer
      int(float)     // converts a float value to a signed integer
      int(double)    // converts a double value to a signed integer
      int(bool)      // converts a Boolean value to a signed integer
      uint(int)      // converts a signed integer value to an unsigned integer
      uint(float)    // converts a float value to an unsigned integer
      uint(double)   // converts a double value to an unsigned integer
      uint(bool)     // converts a Boolean value to an unsigned integer
      float(int)     // converts a signed integer value to a float
      float(uint)    // converts an unsigned integer value to a float
      float(double)  // converts a double value to a float
      float(bool)    // converts a Boolean value to a float
      double(int)    // converts a signed integer value to a double
      double(uint)   // converts an unsigned integer value to a double
      double(float)  // converts a float value to a double
      double(bool)   // converts a Boolean value to a double
      bool(int)      // converts a signed integer value to a Boolean
      bool(uint)     // converts an unsigned integer value to a Boolean
      bool(float)    // converts a float value to a Boolean
      bool(double)   // converts a double value to a Boolean

    (modify second paragraph of the section, p. 49) When constructors are used
    to convert any floating-point type to an integer, the fractional part of
    the floating-point value is dropped.  ...

    (modify third paragraph of the section, p. 49) When a constructor is used
    to convert any integer or floating-point type to bool, 0 and 0.0 are
    converted to false, and non-zero values are converted to true.  When a
    constructor is used to convert a bool to any integer or floating-point
    type, false is converted to 0 or 0.0, and true is converted to 1 or 1.0.
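
    The scalar conversion rules above for doubles can be mimicked on the CPU
    with the following C sketch.  The function names are hypothetical and
    exist only for this illustration.  The sketch assumes the double value
    lies within the range of a C int, since the text above does not say what
    int(double) yields outside that range.

```c
#include <assert.h>
#include <stdbool.h>

/* Mimics int(double): the fractional part is dropped (truncation toward
 * zero).  Assumption: x is inside the range of a 32-bit int; the range
 * check also rejects NaN. */
int int_from_double(double x)
{
    assert(x > -2147483649.0 && x < 2147483648.0);
    return (int)x;
}

/* Mimics bool(double): 0.0 converts to false, non-zero values to true. */
bool bool_from_double(double x)
{
    return x != 0.0;
}

/* Mimics double(bool): false converts to 0.0, true converts to 1.0. */
double double_from_bool(bool b)
{
    return b ? 1.0 : 0.0;
}
```

    For instance, int_from_double(-2.75) yields -2, because the fraction is
    dropped rather than rounded to the nearest integer.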


    Modify Section 5.4.2, Vector and Matrix Constructors, p. 50

    (modify the last paragraph, p. 50) If the basic type (bool, int, uint,
    float, or double) of a parameter to a constructor does not match the basic
    type of the object being constructed, the scalar construction rules
    (above) are used to convert the parameters.


    (add to the first group of examples, p. 52)

      dmat2(dvec2, dvec2)
      dmat3(dvec3, dvec3, dvec3)
      dmat4(dvec4, dvec4, dvec4, dvec4)
      dmat2x4(dvec3, double,            // first column
              double, dvec3)            // second column


    Modify Section 5.9, Expressions, p. 57

    (modify bulleted list as follows, adding support for double-precision
    floating-point types)

    Expressions in the shading language are built from the following:

    * Constants of type bool, int, uint, float, double, all vector types and
      all matrix types.

    ...

    * The arithmetic binary operators add (+), subtract (-), multiply (*), and
      divide (/) operate on integer, single-precision floating-point, and
      double-precision floating-point scalars, vectors, and matrices.  If the
      fundamental type (integer, single-precision floating-point,
      double-precision floating-point) of the operands do not match, the
      conversions from Section 4.1.10 "Implicit Conversions" are applied to
      produce matching types.  ...

    * The arithmetic unary operators negate (-), post- and pre-increment and
      decrement (-- and ++) operate on integer, single-precision
      floating-point, or double-precision floating-point values (including
      vectors and matrices).  ...

    * The relational operators greater than (>), less than (<), greater than
      or equal (>=), and less than or equal (<=) operate only on scalar
      integer, single-precision floating-point, or double-precision
      floating-point expressions.  The result is scalar Boolean.
      The fundamental type of the two operands must match, either as
      specified, or after one of the implicit type conversions specified in
      Section 4.1.10.  ...

    ...


    Modify Chapter 8, Built-in Functions, p. 81

    (add to description of generic types, last paragraph of p. 81) ... Where
    the input arguments (and corresponding output) can be double, dvec2,
    dvec3, or dvec4, <genDType> is used as the argument. ... Similarly, <mat>
    is used for any matrix basic type with single-precision components and
    <dmat> is used for any matrix basic type with double-precision components.


    Modify Section 8.2, Exponential Functions, p. 83

    (add overloads for double-precision square roots)

      genDType sqrt(genDType x);
      genDType inversesqrt(genDType x);


    Modify Section 8.3, Common Functions, p. 84

    (add support for double-precision floating-point multiply-add)

    Syntax:

      genDType fma(genDType a, genDType b, genDType c);

    The function fma() performs a fused double-precision floating-point
    multiply-add to compute the value a*b+c.  The results of fma() may not be
    identical to evaluating the expression (a*b)+c, because the computation
    may be performed in a single operation with intermediate precision
    different from that used to compute a non-fma() expression.

    The results of fma() are guaranteed to be invariant given fixed inputs
    <a>, <b>, and <c>, as though the result were taken from a variable
    declared as "precise".


    (add support for double-precision frexp and ldexp functions)

    Syntax:

      genDType frexp(genDType x, out genIType exp);
      genDType ldexp(genDType x, in genIType exp);

    The function frexp() splits each double-precision floating-point number in
    <x> into its binary significand, a floating-point number in the range
    [0.5, 1.0), and an integral exponent of two, such that:

      x = significand * 2 ^ exponent

    The significand is returned by the function; the exponent is returned in
    the parameter <exp>.  For a floating-point value of zero, the significand
    and exponent are both zero.  For a floating-point value that is an
    infinity or is not a number, the results of frexp() are undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value returned by the function and the value
    written to <exp> are vectors with the same number of components as <x>.

    The function ldexp() builds a double-precision floating-point number from
    each significand component in <x> and the corresponding integral exponent
    of two in <exp>, returning:

      significand * 2 ^ exponent

    If this product is too large to be represented as a double-precision
    floating-point value, the result is considered undefined.

    If the input <x> is a vector, this operation is performed in a
    component-wise manner; the value passed in <exp> and returned by the
    function are vectors with the same number of components as <x>.
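
    The split performed by frexp() on one double component can be sketched in
    C by editing the exponent field of the IEEE representation directly.
    This is a minimal illustration, not the method any implementation is
    known to use.  It assumes the host stores double as IEEE 754 binary64
    and that the input is zero or a normal finite value; subnormal inputs
    are not handled, and infinities and NaNs are excluded because their
    results are undefined anyway.  The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Splits x into a significand with magnitude in [0.5, 1.0) and a power
 * of two, so that x == significand * 2^(*exp).
 * Assumptions: IEEE 754 binary64 doubles; x is zero or normal. */
double frexp_sketch(double x, int *exp)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);

    if (x == 0.0) {              /* zero: significand and exponent are zero */
        *exp = 0;
        return x;
    }

    int biased = (int)((bits >> 52) & 0x7FF);
    assert(biased != 0 && biased != 0x7FF);   /* reject subnormal, inf, NaN */

    /* A biased exponent of 1022 encodes 2^-1, which places the
     * significand in [0.5, 1.0); the removed scale goes into *exp. */
    *exp = biased - 1022;
    bits = (bits & ~(UINT64_C(0x7FF) << 52)) | (UINT64_C(1022) << 52);

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

    For example, 8.0 splits into a significand of 0.5 and an exponent of 4,
    and -3.0 splits into -0.75 and 2.  Multiplying the significand by two
    raised to the exponent, as ldexp() does, recovers the original value.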


    (add overloads for double-precision functions)

      genDType abs(genDType x);
      genDType sign(genDType x);
      genDType floor(genDType x);
      genDType trunc(genDType x);
      genDType round(genDType x);
      genDType roundEven(genDType x);
      genDType ceil(genDType x);
      genDType fract(genDType x);
      genDType mod(genDType x, double y);
      genDType mod(genDType x, genDType y);
      genDType modf(genDType x, out genDType i);
      genDType min(genDType x, genDType y);
      genDType min(genDType x, double y);
      genDType max(genDType x, genDType y);
      genDType max(genDType x, double y);
      genDType clamp(genDType x, genDType minVal, genDType maxVal);
      genDType clamp(genDType x, double minVal, double maxVal);
      genDType mix(genDType x, genDType y, genDType a);
      genDType mix(genDType x, genDType y, double a);
      genDType mix(genDType x, genDType y, genBType a);
      genDType step(genDType edge, genDType x);
      genDType step(double edge, genDType x);
      genDType smoothstep(genDType edge0, genDType edge1, genDType x);
      genDType smoothstep(double edge0, double edge1, genDType x);
      genBType isnan(genDType x);
      genBType isinf(genDType x);


    (add support for 64-bit floating-point packing and unpacking functions)

    Syntax:

      double packDouble2x32(uvec2 v);
      uvec2 unpackDouble2x32(double v);

    The function packDouble2x32() returns a double obtained by packing the
    components of a two-component unsigned integer vector into a 64-bit value
    and interpreting its bits according to the IEEE double-precision
    floating-point representation.  The first vector component specifies the
    32 least significant bits; the second component specifies the 32 most
    significant bits.
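
    The bit layout used by packDouble2x32() can be reproduced on the CPU with
    the following C sketch, which also includes the inverse split so that a
    round trip can be checked.  The function names are hypothetical, and the
    sketch assumes the host represents double as IEEE 754 binary64 with the
    same byte order as its 64-bit integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Builds a double from two 32-bit halves: `lo` supplies the 32 least
 * significant bits and `hi` the 32 most significant bits. */
double pack_double_2x32(uint32_t lo, uint32_t hi)
{
    assert(sizeof(double) == sizeof(uint64_t));
    uint64_t bits = ((uint64_t)hi << 32) | (uint64_t)lo;
    double d;
    memcpy(&d, &bits, sizeof d);   /* reinterpret the bits, no conversion */
    return d;
}

/* Inverse operation: splits a double into its low and high 32-bit halves. */
void unpack_double_2x32(double d, uint32_t *lo, uint32_t *hi)
{
    assert(sizeof(double) == sizeof(uint64_t));
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    *lo = (uint32_t)(bits & 0xFFFFFFFFu);
    *hi = (uint32_t)(bits >> 32);
}
```

    For example, packing lo = 0 and hi = 0x3FF00000 produces 1.0, whose
    binary64 encoding is 0x3FF0000000000000.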

    The function unpackDouble2x32() returns a two-component unsigned integer
    vector obtained by interpreting a double using the 64-bit IEEE
    double-precision floating-point representation and unpacking into two
    32-bit halves.  The first component of the vector contains the 32 least
    significant bits of the double; the second component contains the 32 most
    significant bits.


    Modify Section 8.4, Geometric Functions, p. 87

    (add double-precision equivalents for existing geometric functions)

      double length(genDType x);
      double distance(genDType p0, genDType p1);
      double dot(genDType x, genDType y);
      dvec3 cross(dvec3 x, dvec3 y);
      genDType normalize(genDType x);
      genDType faceforward(genDType N, genDType I, genDType Nref);
      genDType reflect(genDType I, genDType N);
      genDType refract(genDType I, genDType N, double eta);


    Modify Section 8.5, Matrix Functions, p. 89

    (add double-precision equivalents for existing matrix functions)

      dmat matrixCompMult(dmat x, dmat y);
      dmat2 outerProduct(dvec2 c, dvec2 r);
      dmat3 outerProduct(dvec3 c, dvec3 r);
      dmat4 outerProduct(dvec4 c, dvec4 r);
      dmat2x3 outerProduct(dvec3 c, dvec2 r);
      dmat3x2 outerProduct(dvec2 c, dvec3 r);
      dmat2x4 outerProduct(dvec4 c, dvec2 r);
      dmat4x2 outerProduct(dvec2 c, dvec4 r);
      dmat3x4 outerProduct(dvec4 c, dvec3 r);
      dmat4x3 outerProduct(dvec3 c, dvec4 r);
      dmat2 transpose(dmat2 m);
      dmat3 transpose(dmat3 m);
      dmat4 transpose(dmat4 m);
      dmat2x3 transpose(dmat3x2 m);
      dmat3x2 transpose(dmat2x3 m);
      dmat2x4 transpose(dmat4x2 m);
      dmat4x2 transpose(dmat2x4 m);
      dmat3x4 transpose(dmat4x3 m);
      dmat4x3 transpose(dmat3x4 m);
      double determinant(dmat2 m);
      double determinant(dmat3 m);
      double determinant(dmat4 m);
      dmat2 inverse(dmat2 m);
      dmat3 inverse(dmat3 m);
      dmat4 inverse(dmat4 m);


    Modify Section 8.6, Vector Relational Functions, p. 90

    (modify the first paragraph, p. 90, adding support for relational
    functions operating on double-precision types)

    Relational and equality operators (<, <=, >, >=, ==, !=) are defined (or
    reserved) to operate on scalars and produce scalar Boolean results.  For
    vector results, use the following built-in functions.  In the definitions
    below, the following terms are used as placeholders for all vector types
    for a given fundamental data type.  In all cases, the sizes of the input
    and return vectors for any particular call must match.

      placeholder    fundamental types
      -----------    ------------------------------------------------
      bvec           bvec2, bvec3, bvec4

      ivec           ivec2, ivec3, ivec4

      uvec           uvec2, uvec3, uvec4

      vec            vec2, vec3, vec4, dvec2, dvec3, dvec4


    Modify Section 9, Shading Language Grammar, p. 92

    !!! TBD !!!


GLX Protocol

    !!! TBD

Dependencies on ARB_gpu_shader5

    If ARB_gpu_shader5 is not supported, the changes to the function
    overloading rules in the OpenGL Shading Language Specification provided
    there should be included in this extension.

Dependencies on NV_gpu_shader5

    This extension and NV_gpu_shader5 both provide support for shading
    language variables with 64-bit components.  If both extensions are
    supported, the various edits describing this new support should be
    combined.

Dependencies on EXT_direct_state_access

    If EXT_direct_state_access is not supported, references to the
    ProgramUniform*d*EXT functions should be removed.

    If EXT_direct_state_access is supported, that specification should be
    edited as follows:

    (modify the ProgramUniform* language)

    The following commands:

        ....
        void ProgramUniform{1,2,3,4}dEXT(uint program, int location, T value);
        void ProgramUniform{1,2,3,4}dvEXT(uint program, int location,
                                          const T *value);
        void ProgramUniformMatrix{2,3,4}dvEXT
            (uint program, int location, sizei count, boolean transpose,
             const double *value);
        void ProgramUniformMatrix{2x3,3x2,2x4,4x2,3x4,4x3}dvEXT
            (uint program, int location, sizei count, boolean transpose,
             const double *value);

    operate identically to the corresponding command where "Program" is
    deleted from the name (and extension suffixes are dropped or updated
    appropriately) except, rather than updating the currently active program
    object, these "Program" commands update the program object named by the
    <program> parameter. ...

Dependencies on NV_shader_buffer_load

    If NV_shader_buffer_load is supported, that specification should be edited
    as follows:

    Modify "Section 2.20.X, Shader Memory Access" from NV_shader_buffer_load.

    (add rules for loads of variables having the new data types from this
    extension to the list of bullets following "When a shader dereferences a
    pointer variable")

    - Data of type "double" are read from or written to memory as one
      double-typed value at the specified GPU address.


Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) How do double-precision types interact with the rules for storing
        uniforms in a buffer object?

      RESOLVED: The rules were already written with data types larger and
      smaller than those in the original GLSL in mind. Single-precision
      floats typically take four bytes; doubles take eight bytes. The larger
      storage requirement for doubles means a larger alignment requirement;
      doubles still need to be size-aligned.

    (2) Should double-precision vertex shader inputs be supported?

      RESOLVED: Not in this extension. Such support will be added by the
      EXT_vertex_attrib_64bit extension.

    (3) Should double-precision fragment shader outputs be supported?

      RESOLVED: Not in this extension. Note that we don't have
      double-precision framebuffer formats to accept such values.

    (4) Should transform feedback be able to capture double-precision
        components?

      RESOLVED: Yes. However, undefined behavior will occur unless all
      components are captured to size-aligned offsets.

      If any variable captured in transform feedback has double-precision
      components, the practical requirements for defined behavior are:

        (a) the offset of the base of a buffer object must be a multiple of
            eight bytes;

        (b) the amount of data captured per vertex must be a multiple of eight
            bytes; and

        (c) each double-precision variable captured must be aligned to a
            multiple of eight bytes relative to the beginning of a vertex.

      If capturing a mix of single- and double-precision components, it might
      be necessary to use the "gl_SkipComponents1" variable from
      ARB_transform_feedback3 to force proper alignment.

      We considered the possibility of adding error checks to throw errors in
      cases where undefined behavior might occur, but chose not to include
      such errors. For OpenGL 3.0-style transform feedback, cases (b) and (c)
      are solely a function of the variables captured and could be detected
      when a program object is linked. (Such an error would be more
      problematic for transform feedback via NV_transform_feedback, where the
      set of variables captured can be updated without relinking.)
      For case (a), the requirement of OpenGL 3.0 is that transform feedback
      buffer offsets must be a multiple of 4 bytes; enforcing a stricter
      8-byte alignment would require either a backward-incompatible change or
      a Begin-time error to check the offset of transform feedback buffers
      against the current program.

    (5) Should we have double-precision matrix types? We didn't add integer
        matrices, but integer matrix math is fairly uncommon.

      RESOLVED: Yes, we will support all matrix sizes in double-precision.
      We will also provide double-precision equivalents for all matrix
      operators and built-in matrix functions.

    (6) What should be done to distinguish between single- and
        double-precision floating-point constants?

      RESOLVED: We will use "LF" to identify double-precision floating-point
      constants. Here, we depart from the C standard. In C, floating-point
      constants without a suffix are implicitly double-precision and require
      an "F" suffix to specify a single-precision constant. However, GLSL has
      historically provided no support for double precision. Changing to C
      rules would materially affect the behavior of pre-existing shaders that
      add an #extension line for this extension, since constants with no
      suffix have meant "float" up to now. Additionally, such a change would
      likely have required that we introduce implicit conversions from double
      to float; otherwise, assigning a constant with no suffix to a float
      would result in a compile-time error.

    (7) Should we require IEEE 754-compliant behavior for NaNs and
        infinities? Denorms?

      RESOLVED: Following historical precedent in the GLSL and OpenGL APIs
      not defining special-case floating-point behavior, we chose not to do so
      in this extension.
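
      Illustration for issue (4): the alignment conditions (a) through (c)
      can be checked by an application before it sets up a capture. The C
      sketch below is not part of this extension. The CapturedVar structure
      and the xfb_layout_is_double_safe() function are hypothetical names
      introduced only for this example, and the caller is assumed to already
      know the byte offset of each captured variable within a vertex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one captured variable (not a GL type). */
typedef struct {
    size_t offset;     /* byte offset from the beginning of a vertex */
    bool   is_double;  /* true if it has double-precision components */
} CapturedVar;

/* Returns true if the layout satisfies conditions (a)-(c) of issue (4),
 * or if no double-precision variable is captured at all. */
static bool xfb_layout_is_double_safe(size_t buffer_offset,
                                      size_t bytes_per_vertex,
                                      const CapturedVar *vars, size_t count)
{
    assert(vars != NULL || count == 0);

    bool any_double = false;
    for (size_t i = 0; i < count; i++) {
        if (vars[i].is_double) {
            any_double = true;
            /* (c) each double must be 8-byte aligned within the vertex */
            if (vars[i].offset % 8 != 0)
                return false;
        }
    }
    if (!any_double)
        return true;  /* the 8-byte rules only apply when doubles are captured */

    /* (a) buffer base offset and (b) per-vertex size must be multiples of 8 */
    return buffer_offset % 8 == 0 && bytes_per_vertex % 8 == 0;
}
```

      For example, a float at offset 0 followed directly by a double at
      offset 4 (12 bytes per vertex) fails the check. Capturing
      "gl_SkipComponents1" after the float moves the double to offset 8 and
      brings the per-vertex size to 16 bytes, which passes.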

    (8) Should we provide double-precision versions of all the built-ins that
        take a <genType>, which are currently defined to be floats and
        floating-point vectors?

      RESOLVED: We provide double-precision versions of most of the built-in
      functions supported by GLSL. We opted not to provide double-precision
      functions for special trigonometry, exponential, derivative, and noise
      functions.

    (9) Are double-precision "varyings" (values passed between shader stages)
        supported by this extension? If so, is double-precision interpolation
        supported?

      RESOLVED: Double-precision shader inputs and outputs are supported,
      except for vertex shader inputs and fragment shader outputs.
      Additionally, double-precision vertex shader inputs are provided by the
      separate extension EXT_vertex_attrib_64bit. No known extension provides
      double-precision fragment outputs, but that doesn't seem important since
      OpenGL provides no pixel/texture formats with double-precision
      components that could reasonably receive such outputs.

      Interpolation is not supported in this extension for double-precision
      floating-point components. As with integer types in OpenGL 3.0,
      double-precision floating-point fragment shader inputs must be qualified
      as "flat".

      Note that this extension reformulates the spec language requiring "flat"
      qualifiers, in addition to adding doubles to the list of "flat" types.
      In GLSL 1.30, the spec applies these requirements to vertex shader
      outputs but imposes no requirement on fragment inputs. We move this
      requirement to fragment inputs, since vertex shader outputs may be
      passed to tessellation or geometry shaders without interpolation, and
      thus without the need for qualification by "flat".

    (15) Can the 64-bit uniform APIs be used to load values for uniforms of
         type "bool", "bvec2", "bvec3", or "bvec4"?

      RESOLVED: No.
      OpenGL 2.0 and beyond did allow "bool" variables to be set with
      Uniform*i* and Uniform*f APIs, and OpenGL 3.0 extended that support to
      Uniform*ui* for orthogonality. But it seems pointless to extend this
      capability forward to 64-bit Uniform APIs as well.

    (19) Should we support any implicit conversion of matrix types, now that
         we have both "mat4" and "dmat4"?

      RESOLVED: No. It doesn't seem worth the trouble.



Revision History

    Rev.    Date    Author    Changes
    ----  --------  --------  -----------------------------------------
     11   08/27/12  pbrown    Clarify that Uniform*d cannot be used to load
                              uniforms with boolean types (bug 9345); import
                              issue (15) on the topic from NV_gpu_shader5.

     10   03/23/10  pbrown    Update issues section to include fp64 issues
                              that were left behind in NV_gpu_shader5 when the
                              specs were refactored.

      9   02/02/10  pbrown    Specify that capturing any component at an
                              offset that is not size-aligned results in
                              undefined behavior (bug 5863).

      8   01/29/10  pbrown    Remove shading language and API support for
                              double-precision vertex attributes; moved to the
                              EXT_vertex_attrib_64bit specification (bug
                              5953). Added clarification disallowing
                              double-precision fragment shader outputs.

      7   01/29/10  pbrown    Delete accidental modifications to the language
                              for equal and not equal operators (bug 5904),
                              which already supported all types.

      6   01/15/10  pbrown    Modify the spec rules for counting attributes,
                              input and output components, and components
                              to capture in transform feedback to permit,
                              but not require, double-precision values to
                              require twice as many resources as single-
                              precision equivalents (bug 5855).

      5   01/14/10  pbrown    Minor updates from spec reviews.

      4   12/10/09  pbrown    Functionality updates from spec review:
                              Allow implicit conversion from mat*->dmat*.
                              Rename fmad and [un]packFloat2x32 to fma
                              and [un]packDouble2x32. Add overlooked
                              fp64 versions of geometric functions.

      3   12/10/09  pbrown    Convert from EXT to ARB.

      2   12/08/09  pbrown    Miscellaneous fixes from spec review: Clarified
                              input/output component counting rules, where
                              each fp64 value counts double. General typo
                              fixes and language clarifications.

      1             pbrown    Internal revisions.