Name

    ATI_text_fragment_shader

Name Strings

    GL_ATI_text_fragment_shader

Contributors

    Bob Beretta, Apple Computer
    Dan Ginsburg, AMD
    Evan Hart, NVIDIA
    Benj Lipchak, AMD
    James McCombe, Apple Computer
    Jason Mitchell

    and contributors to the ARB_vertex_program working group,
    the product of which provided the API for program specification
    and object management.

Contact

    Benj Lipchak, AMD (benj.lipchak 'at' amd.com)
    Jeremy Sandmel, Apple Computer (jsandmel 'at' apple.com)

Status

    Shipping on MacOS X, version 10.2

Version

    Last Modified Date: November 4, 2006
    Author Revision: 1.0.11 (based on 1.5 of ATI_fragment_shader)

Number

    269

Dependencies

    ARB_multitexture is required by this extension.

    ARB_shadow interacts with this extension.

    ARB_vertex_program is referred to for documentation on the
    program management API, but not specifically required as long
    as the entry points are exported by this extension.

    ATI_fragment_shader is the architectural basis for this extension,
    but is not specifically required by this extension.

    The extension is written against the OpenGL 1.2.1 Specification.

Overview

    The ATI_fragment_shader extension exposes a powerful fragment
    processing model that provides a very general means of expressing
    fragment color blending and dependent texture address modification.
    The processing is termed a fragment shader or fragment program and
    is specified using a register-based model in which there are fixed
    numbers of instructions, texture lookups, read/write registers, and
    constants.

    ATI_fragment_shader provides a unified instruction set
    for operating on address or color data and eliminates the
    distinction between the two. That extension provides all the
    interfaces necessary to fully expose this programmable fragment
    processor in GL.

    ATI_text_fragment_shader is a redefinition of the
    ATI_fragment_shader functionality, using a slightly different
    interface. The intent of creating ATI_text_fragment_shader is to
    take a step towards treating fragment programs similarly to other
    programmable parts of the GL rendering pipeline, specifically
    vertex programs. This new interface is intended to appear
    similar to the ARB_vertex_program API, within the limits of the
    feature set exposed by the original ATI_fragment_shader extension.

    The most significant differences between the two extensions are:

    (1) ATI_fragment_shader provides a procedural function call
        interface to specify the fragment program, whereas
        ATI_text_fragment_shader uses a textual string to specify
        the program. The fundamental syntax and constructs of the
        program "language" remain the same.

    (2) The program object management portions of the interface,
        namely the routines used to create, bind, and delete program
        objects and set program constants, are managed
        using the framework defined by ARB_vertex_program.

    (3) ATI_fragment_shader refers to the description of the
        programmable fragment processing as a "fragment shader".
        In keeping with the desire to treat all programmable parts
        of the pipeline consistently, ATI_text_fragment_shader refers
        to these as "fragment programs". The name of the extension is
        left as ATI_text_fragment_shader instead of
        ATI_text_fragment_program in order to indicate the underlying
        similarity between the APIs of the two extensions, and to
        differentiate it from any other potential extensions that
        may be able to move even further in the direction of treating
        fragment programs as just another programmable area of the
        GL pipeline.

    Although ATI_fragment_shader was originally conceived as a
    device-independent extension that would expose the capabilities of
    future generations of hardware, changing trends in programmable
    hardware have affected the lifespan of this extension. For this
    reason you will now find a fixed set of features and resources
    exposed, and the queries to determine this set have been deprecated
    in ATI_fragment_shader. Further, in ATI_text_fragment_shader,
    most of these resource limits are fixed by the text grammar and
    the queries have been removed altogether.

Issues

    None


New Procedures and Functions

    None.

    NOTE: Though this extension introduces no new procedures and
    functions, it relies on the program object management API from the
    pending ARB_vertex_program extension with the introduction of
    a new program target and program specification syntax.
    See the ARB_vertex_program specification for full details on the
    use of these procedures and functions.

        ProgramStringARB
        BindProgramARB
        DeleteProgramsARB
        GenProgramsARB
        ProgramEnvParameter4{d,dv,f,fv}ARB
        ProgramLocalParameter4{d,dv,f,fv}ARB
        GetProgramEnvParameter{dv,fv}ARB
        GetProgramLocalParameter{dv,fv}ARB
        GetProgramivARB
        GetProgramStringARB
        IsProgramARB

New Tokens

    Accepted by the <cap> parameter of Disable, Enable, and IsEnabled,
    and by the <pname> parameter of GetBooleanv, GetIntegerv, GetFloatv,
    and GetDoublev, and by the <target> parameter of ProgramStringARB,
    BindProgramARB, ProgramEnvParameter4{d,dv,f,fv}ARB,
    ProgramLocalParameter4{d,dv,f,fv}ARB,
    GetProgramEnvParameter{dv,fv}ARB, GetProgramLocalParameter{dv,fv}ARB,
    GetProgramivARB, GetProgramfvATI, and GetProgramStringARB:

        TEXT_FRAGMENT_SHADER_ATI                    0x8200

Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL
Operation)

    None


Additions to Chapter 3 of the OpenGL 1.2.1 Specification (Rasterization)

    Add New Section 3.10, (p. 154) (subsequent sections get incremented)

    3.10 Fragment Programs

    The texture application and texture environments may optionally be
    replaced by an application-supplied program referred to here as a
    fragment program. In this case, subsequent processing is still
    applied normally, including fog, color sum, and antialiasing
    application.

    The framework for specifying and managing fragment programs is
    the one defined in section 5.7 of ARB_vertex_program. For fragment
    programs, TEXT_FRAGMENT_SHADER_ATI is used as the <target> for these
    program management entrypoints.

    A fragment program is similar in concept to a vertex program,
    described in section 2.14 of ARB_vertex_program, except that its
    processing is performed at a later stage in the GL pipeline. Where
    a vertex program takes the current values of the vertex components
    as its inputs, a fragment program takes the fragments and their
    associated data, produced by rasterization, as inputs. Likewise,
    while a vertex program outputs a homogeneous position and a set of
    attributes, a fragment program outputs a color.

    3.10.1 Fragment Program Grammar and Semantic Restrictions

    Fragment programs are specified as a string of ASCII characters
    encoding the programs. When a program is loaded by a call to
    ProgramStringARB (section 5.7.1), with a target of
    TEXT_FRAGMENT_SHADER_ATI, the program string is parsed into
    a set of tokens possibly separated by whitespace. Spaces, tabs,
    newlines, carriage returns, and comments are considered whitespace.
    Comments begin with the character "#" and are terminated by a
    newline, a carriage return, or the end of the program array.

    The Backus-Naur Form (BNF) grammar below specifies the syntactically
    valid sequences for fragment programs. The set of valid tokens can
    be inferred from the grammar. The token "" represents an empty
    string and is used to indicate optional rules. A program is invalid
    if it contains any undefined tokens or characters.

    A text fragment shader program is required to begin with the header
    string "!!ATIfs1.0", without any preceding whitespace. This string
    identifies the subsequent program text as a text fragment shader
    program (version 1.0) that should be parsed according to the
    following grammar and semantic rules. Program string parsing begins
    with the character immediately following the header string.

    <program>                   ::= <optionalConstDeclareBlock>
                                    <optionalPrelimPassBlock>
                                    <outputPassBlock>

    <optionalConstDeclareBlock> ::= ""
                                  | "StartConstants" ";"
                                    <constDeclareSequence>
                                    "EndConstants" ";"

    <constDeclareSequence>      ::= <constDeclareSequence> <constDeclareStatement>
                                  | ""

    <constDeclareStatement>     ::= "CONSTANT" <programConstantName> "=" <constBinding> ";"

    <constBinding>              ::= <progEnvParam>
                                  | <progLocalParam>
                                  | <literalConstBinding>

    <progEnvParam>              ::= "program" "." "env"
                                    "[" <progEnvParamNum> "]"

    <progEnvParamNum>           ::= <integer> from 0 to 7

    <progLocalParam>            ::= "program" "." "local"
                                    "[" <progLocalParamNum> "]"

    <progLocalParamNum>         ::= <integer> from 0 to 7

    <literalConstBinding>       ::= "{" <normalizedFloat> "}"
                                  | "{" <normalizedFloat> ","
                                        <normalizedFloat> "}"
                                  | "{" <normalizedFloat> ","
                                        <normalizedFloat> ","
                                        <normalizedFloat> "}"
                                  | "{" <normalizedFloat> ","
                                        <normalizedFloat> ","
                                        <normalizedFloat> ","
                                        <normalizedFloat> "}"

    <optionalPrelimPassBlock>   ::= ""
                                  | "StartPrelimPass" ";"
                                    <initRegSequence>
                                    <aluSequence>
                                    "EndPass" ";"

    <outputPassBlock>           ::= ""
                                  | "StartOutputPass" ";"
                                    <initRegSequence>
                                    <aluSequence>
                                    "EndPass" ";"

    <initRegSequence>           ::= <initRegSequence> <initRegStatement>
                                  | ""

    <initRegStatement>          ::= <initRegOp> <initRegDst> <initRegSrc> ";"

    <initRegOp>                 ::= "PassTexCoord"
                                  | "SampleMap"

    <initRegDst>                ::= <regName>

    <initRegSrc>                ::= <regName> <threeTupleSelect>
                                  | <texCoordName> <threeTupleSelect>

    <aluSequence>               ::= <aluSequence> <aluStatement>
                                  | ""

    <aluStatement>              ::= <unaryOp> <unaryOpArgs> ";"
                                  | <binaryOp> <binaryOpArgs> ";"
                                  | <ternaryOp> <ternaryOpArgs> ";"

    <unaryOpArgs>               ::= <dstInfo> <argInfo>
    <binaryOpArgs>              ::= <dstInfo> <argInfo> "," <argInfo>
    <ternaryOpArgs>             ::= <dstInfo> <argInfo> "," <argInfo> "," <argInfo>

    <dstInfo>                   ::= <dstName> <optionalDstMask> <optionalDstMod>

    <optionalDstMask>           ::= ""
                                  | "." "r"
                                  | "." "g"
                                  | "." "rg"
                                  | "." "b"
                                  | "." "rb"
                                  | "." "gb"
                                  | "." "rgb"
                                  | "." "a"
                                  | "." "ra"
                                  | "." "ga"
                                  | "." "rga"
                                  | "." "ba"
                                  | "." "rba"
                                  | "." "gba"
                                  | "." "rgba"

    <optionalDstMod>            ::= <dstModSetting> <optionalSaturate>

    <dstModSetting>             ::= ""
                                  | "." "2x"
                                  | "." "4x"
                                  | "." "8x"
                                  | "." "half"
                                  | "." "quarter"
                                  | "." "eighth"

    <optionalSaturate>          ::= ""
                                  | "." "sat"

    <dstName>                   ::= <regName>

    <argInfo>                   ::= <argName> <optionalArgReplicate> <optionalArgMod>

    <argName>                   ::= <regName>
                                  | <programConstantName>
                                  | <fixedConstantName>
                                  | <colorName>

    <optionalArgReplicate>      ::= ""
                                  | "." "r"
                                  | "." "g"
                                  | "." "b"
                                  | "." "a"

    <optionalArgMod>            ::= ""
                                  | <optionalNegate> <optional2Times> <optionalBias> <optionalComplement>

    <optionalNegate>            ::= ""
                                  | "." "neg"

    <optional2Times>            ::= ""
                                  | "." "2x"

    <optionalBias>              ::= ""
                                  | "." "bias"

    <optionalComplement>        ::= ""
                                  | "." "comp"

    <texCoordName>              ::= "t0"
                                  | "t1"
                                  | "t2"
                                  | "t3"
                                  | "t4"
                                  | "t5"

    <threeTupleSelect>          ::= "." "str"
                                  | "." "stq"
                                  | "." "str_dr"
                                  | "." "stq_dq"

    <regName>                   ::= "r0"
                                  | "r1"
                                  | "r2"
                                  | "r3"
                                  | "r4"
                                  | "r5"

    <programConstantName>       ::= "c0"
                                  | "c1"
                                  | "c2"
                                  | "c3"
                                  | "c4"
                                  | "c5"
                                  | "c6"
                                  | "c7"

    <fixedConstantName>         ::= "0"
                                  | "1"

    <colorName>                 ::= "color0"
                                  | "color1"

    <unaryOp>                   ::= "MOV"

    <binaryOp>                  ::= "ADD"
                                  | "MUL"
                                  | "SUB"
                                  | "DOT3"
                                  | "DOT4"

    <ternaryOp>                 ::= "MAD"
                                  | "LERP"
                                  | "CND"
                                  | "CND0"
                                  | "DOT2ADD"

    The <integer> rule matches an integer constant. The integer
    consists of a sequence of one or more digits ("0" through "9").

    The <normalizedFloat> rule matches a floating-point constant in the
    range of 0.0 to 1.0, inclusive.

    If TEXT_FRAGMENT_SHADER_ATI is enabled, but the currently bound
    program is invalid, the results of drawing commands are undefined.
    A program may be invalid because it specifically violates the
    syntax of the above grammar or because the specified program
    violates one of the additional semantic restrictions summarized
    below, with details following:

    Summary of semantic restrictions:
    ---------------------------------
    1.  All "cX" constants used by a program must be declared in a
        constant block, and program constants can be bound at most once.
    2.  If an instruction refers to "cX" constants as arguments, at most
        2 different constants can be used in a single instruction.
    3.  "color0" and "color1" may be used only in the output pass.
    4.  A preliminary pass must contain at least one ALU operation.
    5.  A maximum of 8 pairs or implicit pairs of color and alpha
        instructions (not including "PassTexCoord" and "SampleMap") can
        be used in a single pass.
    6.  A given destination register can only be written by a SampleMap
        or PassTexCoord instruction once in a given pass.
    7.  The second argument to "PassTexCoord" and "SampleMap" cannot be
        an "rX" register in the first pass.
    8.  Once a texture coordinate source is specified with a particular
        choice for coordinate selection (i.e., "str" or "stq"), the
        program may not refer to that same texture coordinate with a
        different choice later on. The exception is that a different
        projection can be specified (i.e., using both "t2.str" and
        "t2.str_dr" on the same texture coordinate set is legal, but
        using "t2.str" and "t2.stq" is not).
    9.  The second argument to "PassTexCoord" and "SampleMap" in the
        output pass cannot be a register that uses "stq" or "stq_dq"
        as a component choice selection.
    10. Alpha destination masks for DOT2ADD, DOT3, and DOT4 instructions
        can only be specified in combination with color destination masks.
    11. If a DOT4 is specified to not write the alpha channel of its
        destination, then it is illegal to specify the next instruction
        to write *only* the alpha channel of its destination.
    12. A program cannot issue an instruction which requires the
        use of the alpha component of a "color1" (secondary color)
        parameter.
    13. A program may not refer to a register number greater than
        the number of supported texture units.
    14. A program may not refer to a texture coordinate set greater
        than the number of supported texture units.

    The details of the above restrictions and usage guidelines are given
    below:

    There are three types of data that can be in a fragment program:
    registers, constants, and interpolators. The 6 "rX" registers
    can be used as source or destination in any instruction.
    The final result of the program is whatever value is in
    the register "r0". This value will be the final color of the
    output fragment passed by the programmable fragment processing
    unit to subsequent non-programmable fragment processing.

    There are 8 constant registers available, "c0" through
    "c7". To use these constants, a program must include a
    constant declaration block which indicates how the constants are
    to be bound. Constants can be bound to program local parameters,
    program environment parameters, or literal string constants. Program
    locals represent per-program storage, while program environment
    parameters are global to all programs. See the ARB_vertex_program
    documentation for details on the use of
    ProgramLocalParameter4{d,dv,f,fv}ARB, and
    ProgramEnvParameter4{d,dv,f,fv}ARB to set these bound constants.
    Constants can also be bound to a constant floating point vector
    within the program text itself, such as "{ 1.0, 0.0, 0.2, 0.5 }".

    "cX" constants can be used as source in any instruction,
    but at most 2 different constants may be used as source arguments
    in any single instruction.

    Additionally, the primary and secondary color interpolators are
    available as source in any instruction, but only in
    the last pass of the program (i.e., the only pass of a one-pass
    program or the second pass of a two-pass program).

    Either one or two passes may be specified in a program. The
    passes can be thought of as an optional preliminary
    pass and a required final output pass.
    The passes are
    delineated by the occurrence of the "StartPrelimPass" and "EndPass"
    tokens for the optional preliminary pass, and the
    "StartOutputPass" and "EndPass" tokens for the output pass. Note
    that in a two-pass shader, the preliminary pass must contain
    at least one match for the <aluStatement> rule in the grammar.
    Put another way, the preliminary pass cannot consist solely of
    PassTexCoord and SampleMap operations.

    Each pass may use up to 8 pairs of instructions for a total of at
    most 16 pairs in the shader. A pair consists of one color
    instruction followed immediately by one alpha instruction.
    In ATI_fragment_shader, color and alpha instructions were specified
    independently through the use of ColorFragmentOp and AlphaFragmentOp.
    In ATI_text_fragment_shader, color instructions are identified by the
    use of the "r", "g", or "b" write masks on the destination register
    of the instruction. Alpha instructions are identified by the use of
    the "a" write mask. If the "a" mask and at least one of "r", "g",
    or "b" masks are used, or if no mask is used at all, the
    instruction is considered to be an implicit pair that will apply
    the same operation to the color and the alpha channels.

    For instance, the following would be considered color operations:

        "DOT3 r2.rgb, r0, r3;"
        "MUL r1.g, r0, r2;"

    The following would be considered alpha operations:

        "MOV r2.a, r0;"
        "MUL r1.a, r0, r2;"

    The following would each be considered an implicit pair of color
    and alpha operations (i.e., three example pairs are given below):

        "DOT3 r2, r0, r3;"
        "MUL r4.ba, r0, r2;"
        "MUL r1.rgba, r0, r2;"

    Therefore, the following examples indicate legal pairs of
    instructions, each of which would count against the limit of 8
    instruction pairs per pass.

        # pair #1
        "DOT3 r2.rgb, r0, r3;"
        "MUL r1.a, r0, r2;"

        # pair #2
        "SUB r4.r, r0, r3;"
        "MUL r5.a, r0, r2;"

        # (implicit) pair #3
        "SUB r4.rgba, r0, r3;"

        # (implicit) pair #4
        "ADD r4.ba, r0, r3;"

        # (implicit) pair #5
        "DOT4 r5, r2, r3;"

    The color and alpha instructions of a pair are executed in
    parallel: the result of the color instruction cannot affect the
    source arguments of the alpha instruction. In other words,
    if an alpha instruction refers to a temporary register ("rX") that
    was written by its paired color instruction, then the value of
    that register used by the alpha instruction will be the value
    before the color instruction was executed.

    For instance, consider the following color alpha pairing:

        "SUB r4.rgb, r0, r3;"
        "MUL r5.a, r4, r2;"     # MUL instruction will use the value
                                # in r4 that r4 had before SUB
                                # instruction was issued.

    Both a color and an alpha instruction need not be specified for
    every pair; the necessary color or alpha no-op is automatically
    inserted by the GL to complete each instruction pair.

    Note that a given register can only be used as a destination
    at most once during the <initRegSequence> of each pass. In other
    words, a program may not initialize the same register twice in
    one pass using the PassTexCoord or SampleMap instructions. Writing
    to the same register by the <aluSequence> instructions is quite
    legal, however.

    The first instructions specified in each pass of a program are "free"
    instructions in that they don't count against the 8 instruction
    pairs available in each pass. They are routing instructions that
    specify from where the contents of the registers come. They are
    specified with the "SampleMap" and "PassTexCoord" tokens.
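
    As an illustration of the pair-counting rules described above, the
    following C sketch classifies an ALU statement by its destination
    write mask and counts the pairs a sequence of statements would
    consume. It is a model for reasoning only and not part of this
    extension: the names OpKind, classify_mask, and count_pairs are
    hypothetical, the mask is assumed to be supplied already separated
    from the register name and any ".2x"/".sat" style modifier, and a
    color instruction is assumed to pair with an alpha instruction only
    when the alpha instruction follows it immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Kind of ALU statement, decided purely by the destination write mask. */
typedef enum { OP_COLOR, OP_ALPHA, OP_IMPLICIT_PAIR } OpKind;

/* mask is the text after the '.', e.g. "rgb", "a", "ba"; "" means no mask. */
OpKind classify_mask(const char *mask)
{
    assert(mask != NULL);
    int has_color = strpbrk(mask, "rgb") != NULL;
    int has_alpha = strchr(mask, 'a') != NULL;

    /* No mask, or alpha combined with any color channel: implicit pair. */
    if (mask[0] == '\0' || (has_color && has_alpha))
        return OP_IMPLICIT_PAIR;
    return has_alpha ? OP_ALPHA : OP_COLOR;
}

/* Count the instruction pairs used by n ALU statements, given their masks.
 * A lone color or alpha statement still costs one pair, because the
 * missing half is filled with a no-op. */
int count_pairs(const char *const masks[], int n)
{
    int pairs = 0;
    int i = 0;
    while (i < n) {
        OpKind kind = classify_mask(masks[i]);
        if (kind == OP_COLOR && i + 1 < n &&
            classify_mask(masks[i + 1]) == OP_ALPHA)
            i += 2;   /* explicit color + alpha pair */
        else
            i += 1;   /* implicit pair, or single op padded with a no-op */
        pairs++;
    }
    return pairs;
}
```

    Applied to the five example pairs listed earlier in this section
    (masks "rgb", "a", "r", "a", "rgba", "ba", and no mask), count_pairs
    reports 5 pairs, which can then be compared against the per-pass
    limit of 8.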

    The token sequence

        "PassTexCoord <initRegDst> <initRegSrc>;"

    specifies that the value present in <initRegSrc> is passed directly
    into the contents of <initRegDst> (one of the registers "rX").
    This value is then available for use as a source argument to
    subsequent color and alpha instructions following in the same pass.
    <initRegSrc> may either be the texture coordinates on a texture unit
    ("tX"), or in the case of a two-pass program's second pass, it may
    be the value of a register set in the first pass ("rX").

    Note that in order to preserve the contents of a register from the
    first pass to the second, there must be a "PassTexCoord"
    instruction in the setup for the second pass that assigns that
    register to itself. For example:

        "StartOutputPass;"
        "PassTexCoord r1, r1.str;"
        etc.

    will preserve the first 3 components of "r1" for use in the
    second pass.

    The token sequence

        "SampleMap <initRegDst> <initRegSrc>;"

    specifies that the value present in the texture data bound on the
    unit associated with <initRegDst> will be written to that register.
    A value for <initRegDst> of "rX" means that the actively bound
    texture on texture unit X will be sampled, and the result written to
    "rX". The <initRegSrc> parameter specifies which texture coordinate
    interpolator is used to sample the map. A value of "rX" for
    <initRegSrc> in the second pass of a two-pass program will do
    dependent texture read sampling using the value in register X.
    Otherwise, specifying "tX" will sample the map using the texture
    coordinates on unit X.

    Only the first 3 components of <initRegSrc> are used in
    "PassTexCoord" and "SampleMap". As such, it is necessary to
    identify which 3 components are to be used.
    To do so, one can append
    a component selection operator to the end of the <initRegSrc>.
    This parameter was called a swizzle in ATI_fragment_shader and is
    referred to by the <threeTupleSelect> token in the
    ATI_text_fragment_shader grammar. This parameter is used to select
    which of the 4 original components of the source register or
    texture coordinates will be mapped to the 3 available positions,
    and whether or not a projection (division by the r or q component)
    will occur.

    Table 3.20 shows the <swizzle> modes:


                  Coordinates Used for 1D or      Coordinates Used for
    Swizzle       2D SampleMap and PassTexCoord   3D or cubemap SampleMap
    -------       -----------------------------   -----------------------
    "str"         (s, t, r, undefined)            (s, t, r, undefined)
    "stq"         (s, t, q, undefined)            (s, t, q, undefined)
    "str_dr"      (s/r, t/r, 1/r, undefined)      (undefined)
    "stq_dq"      (s/q, t/q, 1/q, undefined)      (undefined)

    Table 3.20  Coordinate swizzles

    For example, a fragment program could specify

        "PassTexCoord r1, r1.str;"
    or
        "SampleMap r1, t2.stq_dq;"

    Each texture coordinate source ("tX") used as an <initRegSrc> can
    only draw upon "str" or "stq" components throughout the program.
    For example, if "t2" is used in a SampleMap as "t2.str", it
    cannot be used again later as "t2.stq". The projection, however,
    may vary. That is, it would be okay to later use "t2.str_dr".

    Additionally, when the <initRegSrc> is a register (in the second
    pass of a two-pass program), only "str" and "str_dr" are allowed.
    Note that if this is a PassTexCoord, the fourth component (alpha
    channel if the register contains RGBA) is not passed along and the
    fourth component of <initRegDst> becomes undefined.

    The color and alpha instructions are divided into unary, binary, and
    ternary instructions depending upon the number of arguments
    each instruction requires.
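
    Before turning to the individual instruction forms, the coordinate
    swizzles of Table 3.20 can be illustrated with a small C sketch. It
    models only the 1D/2D SampleMap and PassTexCoord column of the
    table; the names Swizzle and apply_swizzle are hypothetical, the
    input is assumed to be laid out as (s, t, r, q), and the divisor is
    assumed to be non-zero. The fourth output component is undefined in
    the table and is therefore not produced.

```c
#include <assert.h>
#include <stddef.h>

/* The four <threeTupleSelect> choices. */
typedef enum { SWZ_STR, SWZ_STQ, SWZ_STR_DR, SWZ_STQ_DQ } Swizzle;

/* in  = (s, t, r, q) of the source register or texture coordinate set.
 * out = the three coordinates selected by the swizzle (1D/2D column of
 *       Table 3.20). For the projective forms the caller must ensure
 *       that the divisor (r or q) is non-zero. */
void apply_swizzle(Swizzle swz, const float in[4], float out[3])
{
    assert(in != NULL && out != NULL);
    const float s = in[0], t = in[1], r = in[2], q = in[3];

    switch (swz) {
    case SWZ_STR:     /* (s, t, r) */
        out[0] = s;     out[1] = t;     out[2] = r;
        break;
    case SWZ_STQ:     /* (s, t, q) */
        out[0] = s;     out[1] = t;     out[2] = q;
        break;
    case SWZ_STR_DR:  /* (s/r, t/r, 1/r) */
        out[0] = s / r; out[1] = t / r; out[2] = 1.0f / r;
        break;
    case SWZ_STQ_DQ:  /* (s/q, t/q, 1/q) */
        out[0] = s / q; out[1] = t / q; out[2] = 1.0f / q;
        break;
    }
}
```

    With an input of (2, 4, 8, 16), "str_dr" selects (0.25, 0.5, 0.125)
    and "stq_dq" selects (0.125, 0.25, 0.0625).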

    Unary instructions have the form:
        <op> <dst>, <a1>;

    Unary instructions include:
        "MOV"

    Binary instructions have the form:
        <op> <dst>, <a1>, <a2>;

    Binary instructions include:
        "ADD"
        "MUL"
        "SUB"
        "DOT3"
        "DOT4"

    Ternary instructions have the form:
        <op> <dst>, <a1>, <a2>, <a3>;

    Ternary instructions include:
        "MAD"
        "LERP"
        "CND"
        "CND0"
        "DOT2ADD"

    Table 3.21 shows the effect of each <op>.
    R(d), G(d), B(d), and A(d) are the destination component
    values and a1, a2, and a3 represent the source arguments to the
    instruction.


    Op              Result
    --              ------
    "ADD"           R(d) = R(a1) + R(a2)
                    G(d) = G(a1) + G(a2)
                    B(d) = B(a1) + B(a2)
                    A(d) = A(a1) + A(a2)

    "SUB"           R(d) = R(a1) - R(a2)
                    G(d) = G(a1) - G(a2)
                    B(d) = B(a1) - B(a2)
                    A(d) = A(a1) - A(a2)

    "MUL"           R(d) = R(a1) * R(a2)
                    G(d) = G(a1) * G(a2)
                    B(d) = B(a1) * B(a2)
                    A(d) = A(a1) * A(a2)

    "MAD"           R(d) = R(a1) * R(a2) + R(a3)
                    G(d) = G(a1) * G(a2) + G(a3)
                    B(d) = B(a1) * B(a2) + B(a3)
                    A(d) = A(a1) * A(a2) + A(a3)

    "LERP" ***      R(d) = R(a1) * R(a2) + (1 - R(a1)) * R(a3)
                    G(d) = G(a1) * G(a2) + (1 - G(a1)) * G(a3)
                    B(d) = B(a1) * B(a2) + (1 - B(a1)) * B(a3)
                    A(d) = A(a1) * A(a2) + (1 - A(a1)) * A(a3)

    "MOV"           R(d) = R(a1)
                    G(d) = G(a1)
                    B(d) = B(a1)
                    A(d) = A(a1)

    "CND"           R(d) = (R(a3) > 0.5) ? R(a1) : R(a2)
                    G(d) = (G(a3) > 0.5) ? G(a1) : G(a2)
                    B(d) = (B(a3) > 0.5) ? B(a1) : B(a2)
                    A(d) = (A(a3) > 0.5) ? A(a1) : A(a2)

    "CND0"          R(d) = (R(a3) >= 0) ? R(a1) : R(a2)
                    G(d) = (G(a3) >= 0) ? G(a1) : G(a2)
                    B(d) = (B(a3) >= 0) ? B(a1) : B(a2)
                    A(d) = (A(a3) >= 0) ? A(a1) : A(a2)

    "DOT2ADD" *     R(d) = G(d) = B(d) = A(d) = R(a1) * R(a2) +
                                                G(a1) * G(a2) +
                                                B(a3)

    "DOT3" *        R(d) = G(d) = B(d) = A(d) = R(a1) * R(a2) +
                                                G(a1) * G(a2) +
                                                B(a1) * B(a2)

    "DOT4" * **     R(d) = G(d) = B(d) = A(d) = R(a1) * R(a2) +
                                                G(a1) * G(a2) +
                                                B(a1) * B(a2) +
                                                A(a1) * A(a2)

    Table 3.21  Color and Alpha Fragment Shader Instructions

    Special Notes:
        *   - DOT2ADD/DOT3/DOT4 can use an alpha destination mask
              only in combination with a color destination mask.
              That is, it is illegal to use only a ".a" mask specifier
              on the destination register of these instructions.
        **  - If a DOT4 is specified with a destination mask that
              does not include alpha (e.g. ".r", ".rb", ".g", etc.)
              then the immediately following instruction must write
              at least one color channel and cannot use the
              alpha-only destination mask specifier ".a".
        *** - The blend factor (a1) of "LERP" must be in the range
              [0,1] or the results are undefined.

    The <dst> parameter specifies to which register ("rX") the
    result of the instruction is written.

    Each <dst> parameter can optionally have a mask appended to the
    "rX" name, as in "r1.r", or "r3.gb". The mask parameter
    specifies which of the color components in <dst> will be written.
    If there is no mask specified, everything is written, or any of the
    masks "r", "g", "b", and/or "a" can be added to enable writing the
    output red, green, blue, and/or alpha channels, respectively. The
    masks must be specified in "rgba" order.

    Further, each <dst> parameter can optionally have appended a
    modification parameter, as in "r3.2x" or "r3.half". These can
    be combined with the mask parameter as in "r4.rg.8x". The result
    of an instruction can be modulated by appending *one* of the
    following: "2x", "4x", "8x", "half", "quarter", or "eighth".
    These are all mutually exclusive.
    However, you can optionally add
    "sat", which clamps the result after any modulation occurs.

    Table 3.22 shows the result of each modification.


    Modifier        Result
    --------        ------
    ""              d = d
    "2x"            d = 2 * d
    "4x"            d = 4 * d
    "8x"            d = 8 * d
    "half"          d = d / 2
    "quarter"       d = d / 4
    "eighth"        d = d / 8
    "sat"           d = clamp(d) to range [0, 1]

    Table 3.22  Result of destination modification


    Note that the internal precision of the fragment program allows
    values in the range [-8, 8].

    The <a1>, <a2>, and <a3> parameters specify the source arguments.
    The source can come from "rX", "cX", "0", "1", "color0", or "color1",
    where "color0" is the primary fragment color and "color1" is the
    secondary fragment color. Note that in a two-pass program, "color0"
    and "color1" cannot be used in the first pass of the program.

    Each source argument can be given a single optional replication
    parameter that specifies the replication of each component.

    Table 3.23 shows the result of each source replication modifier.


    Replication     Result
    -----------     ------
    ""              R(s) = R(s)
                    G(s) = G(s)
                    B(s) = B(s)
                    A(s) = A(s)

    "r"             R(s) = R(s)
                    G(s) = R(s)
                    B(s) = R(s)
                    A(s) = R(s)

    "g"             R(s) = G(s)
                    G(s) = G(s)
                    B(s) = G(s)
                    A(s) = G(s)

    "b"             R(s) = B(s)
                    G(s) = B(s)
                    B(s) = B(s)
                    A(s) = B(s)

    "a"             R(s) = A(s)
                    G(s) = A(s)
                    B(s) = A(s)
                    A(s) = A(s)

    Table 3.23  Result of source replication


    Note that the GL secondary color is specified to contain red,
    green, and blue components only. It is therefore illegal to specify
    a program which requires the use of the alpha component of the
    "color1" parameter. This means that using "color1.a" source argument
    replication would be prohibited.
    Additionally, issuing an alpha
    operation using the alpha component of "color1", either implicitly
    or explicitly, would also be prohibited.

    For instance, the following statements would all be illegal:

        "MOV r0, color1;         # implicit alpha op in pair              "
        "MOV r0.ra, color1;      # explicit alpha op in pair              "
        "MOV r0.a, color1;       # explicit single alpha op               "
        "MOV r0.rgb, color1.a;   # can't replicate non-existent alpha channel "

    On the other hand, both of these are legal:

        "MOV r0.rgb, color1;     # explicit color op, no alpha op specified "
        "MOV r0, color1.g;       # non-alpha component replicated on src    "

    Each argument can also be given an optional modification parameter
    that specifies modifiers to each component. Any or all of the
    following can be specified: "neg", "comp", "bias", "2x".

    Table 3.24 shows the result of each source modifier.


    Modifier        Result
    --------        ------
    ""              s = s
    "neg"           s = -s
    "comp"          s = 1 - s
    "bias"          s = s - 0.5
    "2x"            s = 2 * s

    Table 3.24  Result of source modification


    If multiple source modifiers are applied, the order of operations is
    "comp", "bias", "2x", then "neg".
    The following equation
    shows the order of operations if all modifiers were to be applied:

        s = -(2 * ((1.0 - s) - 0.5))

    In order to set the constants that can be used by program
    instructions, the following entry points (identical to those in
    the pending ARB_vertex_program extension) are used:

        void ProgramLocalParameter4dARB(enum target, uint index,
                                        double x, double y,
                                        double z, double w);
        void ProgramLocalParameter4dvARB(enum target, uint index,
                                         const double *params);
        void ProgramLocalParameter4fARB(enum target, uint index,
                                        float x, float y, float z, float w);
        void ProgramLocalParameter4fvARB(enum target, uint index,
                                         const float *params);
        void ProgramEnvParameter4dARB(enum target, uint index,
                                      double x, double y,
                                      double z, double w);
        void ProgramEnvParameter4dvARB(enum target, uint index,
                                       const double *params);
        void ProgramEnvParameter4fARB(enum target, uint index,
                                      float x, float y, float z, float w);
        void ProgramEnvParameter4fvARB(enum target, uint index,
                                       const float *params);

    The <target> must be TEXT_FRAGMENT_SHADER_ATI. The <index> specifies
    the number of the parameter to update. For ATI_text_fragment_shader,
    <index> is limited to the range 0 to 7. Note that this does *not*
    necessarily correspond to the "X" in the constant named "cX",
    but rather to the parameter index (env or local) to which "cX" is
    bound in the constant declaration block at the beginning of the
    program.
    For instance, if constant "c1" is bound as follows:

      "StartConstants;                      "
      "  CONSTANT c1 = program.local[3];    "
      "EndConstants;                        "

    then, to set constant "c1" to the vector value
    { 0.4, 0.0, 0.5, 0.25 }, the application could call

      glProgramLocalParameter4dARB(TEXT_FRAGMENT_SHADER_ATI, // target
                                   3,                        // index
                                   0.4,                      // x
                                   0.0,                      // y
                                   0.5,                      // z
                                   0.25);                    // w

    The <params> pointer must reference four floating-point values in
    the range [0, 1] to set the components of the constant.  Similarly,
    the <x>, <y>, <z>, and <w> parameters must also be in the range
    [0, 1].  Constant registers loaded with floating-point values
    outside of this range will have undefined values.

    Note that binding a program constant to a literal string constant
    within the program text is roughly analogous to
    ATI_fragment_shader's use of the call to SetFragmentShaderConstant
    within a BeginFragmentShader/EndFragmentShader pair.  That is, the
    constant value cannot be changed without respecifying the program
    and the program constant value is local to the program.

    Binding a program constant to a program environment parameter is
    roughly analogous to ATI_fragment_shader's use of a call to
    SetFragmentShaderConstant outside of a BeginFragmentShader /
    EndFragmentShader pair.  That is, the program constant's value can
    be changed without redefining the program and the program constant
    value is global to all programs with a binding to that specific
    program environment parameter.

    Binding a program constant to a program local parameter has no
    direct analogue in ATI_fragment_shader as it represents a way
    to specify a program parameter which is local to a given
    fragment program object, but allows the parameter's value to
    be changed without redefining the fragment program itself.
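
    As an illustration only (not part of this extension), two of the
    rules above can be sketched in host-side C: the fixed order in which
    source modifiers are applied ("comp", "bias", "2x", then "neg"), and
    the requirement that constant components lie in [0, 1].  The flag
    names and both function names are hypothetical and exist only for
    this sketch; a real application would still upload constants through
    the ProgramLocalParameter / ProgramEnvParameter entry points.

```c
#include <stdbool.h>

/* Hypothetical flag names for this sketch. The extension itself only
 * defines the textual modifiers "comp", "bias", "2x" and "neg". */
enum {
    SRC_MOD_COMP = 1 << 0,
    SRC_MOD_BIAS = 1 << 1,
    SRC_MOD_2X   = 1 << 2,
    SRC_MOD_NEG  = 1 << 3
};

/* Apply the selected modifiers to one source component in the order
 * given by the text: comp, then bias, then 2x, then neg. */
float apply_source_modifiers(float s, unsigned mods)
{
    if (mods & SRC_MOD_COMP) s = 1.0f - s;
    if (mods & SRC_MOD_BIAS) s = s - 0.5f;
    if (mods & SRC_MOD_2X)   s = 2.0f * s;
    if (mods & SRC_MOD_NEG)  s = -s;
    return s;
}

/* Return true when all four components lie in [0, 1]. Values outside
 * this range leave the constant register undefined, so an application
 * may want to check before uploading. */
bool constant_in_range(const float v[4])
{
    for (int i = 0; i < 4; ++i) {
        if (!(v[i] >= 0.0f && v[i] <= 1.0f)) {
            return false;
        }
    }
    return true;
}
```

    With all four modifiers set, apply_source_modifiers computes
    -(2 * ((1.0 - s) - 0.5)), matching the equation given for the
    combined case.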

Additions to Chapter 4 of the OpenGL 1.2.1 Specification (Per-Fragment
Operations and the Framebuffer)

    None


Additions to Chapter 5 of the OpenGL 1.2.1 Specification (Special
Functions)

    None

Additions to Chapter 6 of the OpenGL 1.2.1 Specification (State and
State Requests)

    None


Additions to Appendix A of the OpenGL 1.2.1 Specification (Invariance)

    None


Additions to the AGL/GLX/WGL Specifications

    None

Interactions with ARB_shadow

    The texture comparison introduced by ARB_shadow can be expressed in
    terms of a fragment shader, and in fact uses the same internal
    resources on some implementations.  Therefore, if fragment shader
    mode is enabled, the GL behaves as if TEXTURE_COMPARE_MODE_ARB is
    NONE.

Errors



New State

                                                   Initial
    Get Value                 Type  Get Command    Value    Description             Sec.    Attribute
    ---------                 ----  -----------    -------  -----------             ------  ---------
    TEXT_FRAGMENT_SHADER_ATI  B     IsEnabled      False    Fragment shader enable  3.8.11  enable

    Table X.6.  New Accessible State Introduced by ATI_text_fragment_shader.


    Get Value  Type    Get Command  Initial Value  Description          Sec     Attribute
    ---------  ------  -----------  -------------  -------------------  ------  ---------
    -          6xR4    -            undefined      temporary registers  3.8.11  -

    Table X.9.  Fragment Shader Per-fragment Execution State.  All per-fragment
    execution state registers are uninitialized at the beginning of program
    execution.


New Implementation Dependent State

    None


Deprecated Functionality

    The original ATI_fragment_shader spec included some deprecated
    functionality for determining implementation-dependent constants
    and limits.
    Since that functionality was deprecated to the point where those
    queries are specified to return fixed values, and most of the
    limits are specified by the fragment program grammar, those queries
    are not included in the ATI_text_fragment_shader extension.

Sample Usage

-----------------------------------------------------

    # The following program shows how to perform a simple modulation
    # between the interpolated color and a single texture:
    !!ATIfs1.0

    StartOutputPass;
      SampleMap r0, t0.stq_dq;   #sample the texture

      MUL r0, r0, color0;        #perform the modulation
    EndPass;

-----------------------------------------------------

    # The following program shows how to use the constant
    # declaration block in a fragment program.
    !!ATIfs1.0

    StartConstants;
      CONSTANT c0 = program.env[0];
      CONSTANT c1 = program.local[3];
      CONSTANT c2 = { 1.0, 0.0, 0.5, 0.75 };
    EndConstants;

    StartOutputPass;
      MUL r2, c1, c0;   # multiply global param by local param
      ADD r0, c2, r0;   # add constant param and put result in r0
    EndPass;

-----------------------------------------------------

    # The following is an example that performs bumped
    # cubic environment mapping:
    !!ATIfs1.0

    StartPrelimPass;
      PassTexCoord r0, t0.str;   # 1st row of 3x3 basis matrix
      PassTexCoord r1, t1.str;   # 2nd row of 3x3 basis matrix
      PassTexCoord r2, t2.str;   # 3rd row of 3x3 basis matrix
      PassTexCoord r3, t3.str;   # Eye vector
      SampleMap    r4, t5.str;   # Sample normal map

      # Three dot products transform from tangent space to cube map space
      DOT3 r0.r, r0, r4;
      DOT3 r0.g, r1, r4;
      DOT3 r0.b, r2, r4;

      DOT3 r2.2x, r0, r3;        # 2 * (N dot Eye)
      MUL  r2, r0, r2;           # 2 * N * (N dot Eye)
      DOT3 r1, r0, r0;           # N dot N
      MAD  r1, r3.neg, r1, r2;   # 2 * N * (N dot Eye) - Eye * (N dot N)
    EndPass;

    StartOutputPass;
      SampleMap r0, r0.str;      # Sample diffuse cubic env map
      SampleMap r1, r1.str;      # Sample specular cubic env map
      SampleMap r2, t5.str;      # Sample the base map (gloss in a)

      MUL r0, r0, r2;            # diffuse * base
      MAD r0, r0, r2.a, r1;      # (diffuse * base) + (spec * gloss)
    EndPass;

-----------------------------------------------------

    # Chrome shader from ATIRadeon8500_PointLight_Shader demo
    !!ATIfs1.0

    StartPrelimPass;
      # get the outputs from the vertex shader
      PassTexCoord r1, t1.str;   # N
      PassTexCoord r2, t2.str;   # light to vtx vector in light space
      PassTexCoord r3, t3.str;   # H
      SampleMap    r4, t4.str;   # L (sample cubemap normalizer)

      DOT3 r4, r1, r4.2x.bias;   # reg4 = N.L
      DOT3 r1, r1, r3;           # reg1 = N.H
      DOT3 r1.g, r3, r3;         # reg1(green) = H.H (aka |H|^2)
      DOT3 r2, r2, r2;           # reg2 = |light to vertex|^2
    EndPass;

    StartOutputPass;
      SampleMap    r0, t5.str;   # sample env map using eye vector
      SampleMap    r2, r2.str;   # sample atten map
      SampleMap    r3, r1.str;   # sample spec NHHH map = (N.H)^256
      PassTexCoord r4, r4.str;   # pass N.L through

      # this ensures a pixel is only lit if facing the light
      # (since the spec exp makes negative N.H positive
      # we must do this)
      MUL r3, r3, r4;            # reg3 = ((N.H)^256 * N.L)

      MUL r3, r0, r3;            # reg3 = spec * env map
      MUL r4, r0, r4;            # reg4 = diff * env map
      ADD r0, r3, r4;            # reg0 = ((spec * env map) + diff * env map)
      MUL r0.sat, r0, r2.r;      # apply point light attenuation
    EndPass;

-----------------------------------------------------

    # Rusty shader from ATIRadeon8500_PointLight_Shader demo
    !!ATIfs1.0

    StartPrelimPass;
      # get the outputs from the vertex shader
      SampleMap    r1, t0.str;   # N from bump map
      PassTexCoord r2, t2.str;   # light to vertex vector in light space
      PassTexCoord r3, t3.str;   # H
      SampleMap    r4, t4.str;   # L (sample cubemap normalizer)
      SampleMap    r5, t0.str;   # specular map (provides our k term for computing N.H^k)


      DOT3 r4, r1.2x.bias, r4.2x.bias;   # reg4 = N.L
      DOT3 r1, r1.2x.bias, r3;           # reg1 = N.H
      MUL  r1, r1, r1;                   # reg1 = N.H * N.H = (N.H)^2
      DOT3 r1.b, r3, r3;                 # reg1(blue) = H.H = |H|^2
      MUL  r1.g.half, r1.b, r5;          # reg1(green) = |H|^2 * 0.5 * k
      DOT3 r2, r2, r2;                   # reg2 = |light to vertex|^2
    EndPass;

    StartOutputPass;
      SampleMap r0, t0.str;      # base map
      SampleMap r2, r2.str;      # attenuation

      # note the swizzle (str_DR): because we divide by R we get the following:
      #   <(N.H)^2, |H|^2 * 0.5 * k> / |H|^2 = <(N.H)^2/|H|^2, 0.5 * k>
      # note that N.H^2 / |H|^2 effectively takes care of the denormalized H term
      # and reduces to N.H^2; also note that raising this to the (0.5*k) power
      # results in (N.H)^k ... it's a little tricky but it works and now you
      # get per pixel specular lighting with per pixel k exponents!
      SampleMap    r3, r1.str_dr;   # (N.H)^k
      PassTexCoord r4, r4.str;      # N.L

      # reg3 = (N.H)^k * (N.L)
      # this ensures a pixel is only lit if facing the light
      # (since the specular exponent makes negative N.H look positive,
      # we must do this)
      MUL r3, r3, r4;

      MUL r3, r0, r3;            # reg3 = specular * basemap
      MUL r4, r0, r4;            # reg4 = diffuse * basemap
      ADD r0, r3, r4;            # reg0 = specular + diffuse
      MUL r0.sat, r0, r2.r;      # apply attenuation
    EndPass;

-----------------------------------------------------

Revision History

    Date: 11/4/2006
    Revision: 1.0.11
       - Updated contact info after ATI/AMD merger.

    Date: 9/5/2002
    Revision: 1.0.10
       - final version for submission to registry
       - clarified contact/contributor info
       - fixed a misplaced ".half" typo in the rusty shader example
         code

    Date: 8/9/2002
    Revision: 1.0.9
       - fixed a typo which referred to "color1" and "color2" instead of
         "color0" and "color1"
       - clarified semantic restrictions surrounding DOT2ADD/DOT3/DOT4
         to make them slightly less restrictive and more closely aligned
         with underlying hardware implementation and original
         ATI_fragment_shader restrictions.

    Date: 7/9/2002
    Revision: 1.0.8
       - fixed a typo where constant declarations were missing the
         "CONSTANT" keyword

    Date: 6/26/2002
    Revision: 1.0.7
       - clarified additional semantic constraints regarding
         pass delimiters (prelim pass must have at least one ALU op)
       - fixed a typo in the rusty shader example code
       - clarified error conditions involving the use of the alpha
         component of the secondary color parameter.

    Date: 6/23/2002
    Revision: 1.0.6
       - Very minor spec bug fixes:
          - removed _ATI from several instructions
          - fixed up some wrong line wrappings
          - formally listed the <optionalDstMask> options to disallow
            masks of the form ".r.g.b.a." which were never really legal,
            but were allowed by the grammar as specified before.
          - cleaned up the list of GL functions which accept
            TEXT_FRAGMENT_SHADER_ATI as an enumerant.
          - formally defined the value of TEXT_FRAGMENT_SHADER_ATI
          - fixed 2 typos in the "simple modulation" sample shader
            (SampleMap uses the r# to choose the texture unit,
            and make sure to use the stq_dq source selector)
          - removed an ambiguous "1.0" from the example code on
            how to set a constant bound to a program local parameter.

    Date: 6/6/2002
    Revision: 1.0.5
       - Apple would now like the program local/env syntax added
         in version 1.0.3 added back in to fit better into their
         "pipeline program" based architecture and program token
         stream syntax.  Adding back in the changes introduced
         in version 1.0.3.
       - Fixed a typo in the description of constant binding syntax
         where text referred to "c4" but the sample code referred
         to "c1".  "c1" is correct.
       - Synced spec with version 1.4 and 1.5 of ATI_fragment_shader
           1.5: Added interaction with ARB_shadow.
           1.4: Specified that LERP's blend factor must be in the range
                [0,1].

    Date: 5/31/2002
    Revision: 1.0.4
       - To get the equivalent functionality to ATI_fragment_shader,
         we only need inline constants and program env parameters.
         So, based on some feedback from Apple, for simplicity, we remove
         the usage of program locals that was added in 1.0.3.
         We keep program env parameters however.

    Date: 5/31/2002
    Revision: 1.0.3
       - added in the ability to declare constants as
         program local/env parameters as in ARB_vertex_program
       - added in the ability to declare constants as textual
         string constants.
       - above changes required additional "constant declaration"
         block before the preliminary pass block, in order to
         specify the program constant bindings.
       - changed some tokens in the grammar to add the word
         optional (they were already optional, just changed the name).
       - fixed a reference in the text where "2x" was called "scale"
       - fixed a bug in the grammar where it was possible to specify the
         "." of a <dstMask> (now <optionalDstMask>) without specifying
         the "r","g","b", or "a" mask values.

    Date: 5/26/2002
    Revision: 1.0.2
       - some spec language English grammatical fixes
       - cleaned up description in usage guidelines to refer
         to tokens named in the text grammar
       - reordered semantic restriction summary to correspond to
         order of explanations in following section.
       - clarify that ATI_text_fragment_shader does not
         replace color sum stage (neither did ATI_fragment_shader).
       - pulled "!!ATIfs1.0" header token from grammar and simply
         required it to identify the subsequent language as
         was done for ARB_vertex_program, version 24.
       - clarified that the restriction on using a destination
         register once in a single pass applies only to
         the PassTexCoord and SampleMap instructions.

    Date: 5/23/2002
    Revision: 1.0.1
       - added back in the concept of color/alpha pairing that was
         removed in the first pass at the extension grammar.
         This lets color and alpha instructions be co-issued and
         gives a program the opportunity to do different operations
         on color and alpha components.
         This feature of ATI_fragment_shader should not have been removed
         in the original ATI_text_fragment_shader spec.
       - add commas between instruction arguments in the grammar and
         examples
       - clean up some white space issues
       - add a couple of references to TEXT_FRAGMENT_SHADER_ATI as
         the target of the entry points shared with ARB_vertex_program.

    Date: 5/22/2002
    Revision: 1.0
       - first fully specified version
       - based on the 1.3 version of ATI_fragment_shader specification