Name

    NV_vertex_program3

Name Strings

    GL_NV_vertex_program3

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Status

    Shipping.

Version

    Last Modified Date:      10/12/2009
    NVIDIA Revision:         7

Number

    306

Dependencies

    ARB_vertex_program is required.
    NV_vertex_program2_option is required.
    This extension interacts with ARB_fragment_program_shadow.

Overview

    This extension, like the NV_vertex_program2_option extension,
    provides additional vertex program functionality to extend the
    standard ARB_vertex_program language and execution environment.
    ARB programs wishing to use this added functionality need only add:

        OPTION NV_vertex_program3;

    to the beginning of their vertex programs.

    New functionality provided by this extension, above and beyond that
    already provided by the NV_vertex_program2_option extension, includes:

        * texture lookups in vertex programs,

        * ability to push and pop address registers on the stack,

        * address register-relative addressing for vertex attribute and
          result arrays, and

        * a second four-component condition code.

Issues

    Should we provide a separate "!!VP3.0" program type, like the
    "!!VP2.0" type defined in NV_vertex_program2?

      RESOLVED:  No.  Since ARB_vertex_program has been fully defined
      (it wasn't in the !!VP2.0 time-frame), we will simply define
      language extensions to !!ARBvp1.0 that expose new functionality.
      The NV_vertex_program2_option specification followed this same
      pattern for the NV3X family (GeForce FX, Quadro FX).

    Should this be called "NV_vertex_program3_option"?

      RESOLVED:  No.  The similar extension to !!ARBvp1.0 called
      "NV_vertex_program2_option" got that name only because the simpler
      "NV_vertex_program2" name had already been used.

    Is there a limit on the number of texture units that can be accessed
    by a vertex program?

      RESOLVED:  Yes.
      The limit may be lower than the total number of texture image units
      available and is given by the implementation-dependent constant
      MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB.  Any program that attempts to use
      more unique texture image units than this limit will fail to load.
      Programs can use any texture image unit number, as long as they don't
      use too many simultaneously.  As an example, the GeForce 6 series of
      GPUs provides 16 texture image units accessible to vertex programs,
      but no more than four can be used simultaneously.  It is not an error
      to use texture image units 12-15 in a program.

      This limitation is identical to the one in the ARB_vertex_shader
      extension -- both extensions use the same enum to query the number of
      available image units.  Violating this limit in GLSL results in a link
      error.

    Is there a restriction on the texture targets that can be accessed by a
    vertex program?

      RESOLVED:  Yes -- for any texture image unit, vertex and fragment
      processing cannot use different targets.  If they do, an
      INVALID_OPERATION error is generated at Begin-time.  This resolution
      is consistent with the resolution of the same issue in the
      ARB_vertex_shader extension and OpenGL 2.0.

    Since vertices don't have screen space partial derivatives, how is
    the LOD used for texture accesses defined?

      RESOLVED:  The TXL instruction allows a program to explicitly
      set an LOD; the LOD for all other texture instructions is zero.
      The texture LOD biases specified in the texture object and
      environment do apply to all vertex texture lookups.


New Procedures and Functions

    None.

New Tokens

    Accepted by the <pname> parameter of GetBooleanv, GetIntegerv,
    GetFloatv, and GetDoublev:

        MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB              0x8B4C

Additions to Chapter 2 of the OpenGL 1.4 Specification (OpenGL Operation)

    Modify Section 2.14.2, Vertex Program Grammar and Restrictions

    (mostly add to existing grammar rules, as extended by
    NV_vertex_program2_option)

    <optionName>            ::= "NV_vertex_program3"

    <instruction>           ::= <TexInstruction>

    <ALUInstruction>        ::= <ASTACKop_instruction>

    <TexInstruction>        ::= <TEXop_instruction>

    <ASTACKop_instruction>  ::= <PUSHAop> <instOperandAddrVNS>
                              | <POPAop> <instResultAddr>

    <PUSHAop>               ::= "PUSHA"

    <POPAop>                ::= "POPA"

    <TEXop_instruction>     ::= <TEXop> <instResult> "," <instOperandV> ","
                                <texTarget>

    <TEXop>                 ::= "TEX"
                              | "TXP"
                              | "TXB"
                              | "TXL"

    <texTarget>             ::= <texImageUnit> "," <texTargetType>

    <texImageUnit>          ::= "texture" <optTexImageUnitNum>

    <optTexImageUnitNum>    ::= /* empty */
                              | "[" <texImageUnitNum> "]"

    <texImageUnitNum>       ::= <integer>
                                /*[0,MAX_TEXTURE_IMAGE_UNITS_ARB-1]*/

    <texTargetType>         ::= "1D"
                              | "2D"
                              | "3D"
                              | "CUBE"
                              | "RECT"

    <attribVtxBasic>        ::= "texcoord" "[" <arrayMemRel> "]"
                              | "attrib" "[" <arrayMemRel> "]"

    <resultVtxBasic>        ::= "texcoord" "[" <arrayMemRel> "]"

    <ccMaskRule>            ::= "EQ0"
                              | "GE0"
                              | "GT0"
                              | "LE0"
                              | "LT0"
                              | "NE0"
                              | "TR0"
                              | "FL0"
                              | "EQ1"
                              | "GE1"
                              | "GT1"
                              | "LE1"
                              | "LT1"
                              | "NE1"
                              | "TR1"
                              | "FL1"

    (modify description of reserved identifiers)

    ...
    The following strings are reserved keywords and may not be used
    as identifiers:

        ABS, ADD, ADDRESS, ALIAS, ARA, ARL, ARR, ATTRIB, BRA, CAL, COS,
        DP3, DP4, DPH, DST, END, EX2, EXP, FLR, FRC, LG2, LIT, LOG, MAD,
        MAX, MIN, MOV, MUL, OPTION, OUTPUT, PARAM, POPA, POW, PUSHA, RCC,
        RCP, RET, RSQ, SEQ, SFL, SGE, SGT, SIN, SLE, SLT, SNE, SUB, SSG,
        STR, SWZ, TEMP, TEX, TXB, TXL, TXP, XPD, program, result, state,
        and vertex.

    Modify Section 2.14.3.1, Vertex Attributes

    (add new bindings to binding table)

      Vertex Attribute Binding  Components  Underlying State
      ------------------------  ----------  --------------------------------
      ...
      vertex.texcoord[A+n]      (s,t,r,q)   indexed texture coordinate
      vertex.attrib[A+n]        (x,y,z,w)   indexed generic vertex attribute

    If a vertex attribute binding matches "vertex.texcoord[A+n]", where
    "A" is a component of an address register (Section 2.14.3.5), a
    texture coordinate number <c> is computed by adding the current
    value of the address register component and <n>.  The "x", "y",
    "z", and "w" components of the vertex attribute variable are
    filled with the "s", "t", "r", and "q" components, respectively,
    of the vertex texture coordinates for texture unit <c>.  If <c>
    is negative or greater than or equal to MAX_TEXTURE_COORDS_ARB,
    the vertex attribute variable is undefined.

    If a vertex attribute binding matches "vertex.attrib[A+n]", where
    "A" is a component of an address register (Section 2.14.3.5), a
    vertex attribute number <a> is computed by adding the current value
    of the address register component and <n>.  The "x", "y", "z", and
    "w" components of the vertex attribute variable are filled with the
    "x", "y", "z", and "w" components, respectively, of generic vertex
    attribute <a>.  If <a> is negative or greater than or equal to
    MAX_VERTEX_ATTRIBS_ARB, the vertex attribute variable is undefined.
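
    The index computation described above for "vertex.attrib[A+n]" can be
    sketched as a small Python model.  This is an illustration only, not
    part of the specification: the function name resolve_attrib_index, the
    use of None to stand for an undefined variable, and the limit of 16 are
    all assumptions made for the example.

```python
# Hypothetical model of address-register-relative attribute addressing.
# ASSUMED_MAX_VERTEX_ATTRIBS is an example value only; a real application
# would query MAX_VERTEX_ATTRIBS_ARB from the implementation.
ASSUMED_MAX_VERTEX_ATTRIBS = 16


def resolve_attrib_index(addr_component, n, limit=ASSUMED_MAX_VERTEX_ATTRIBS):
    """Return the attribute number <a> = A + n, or None when undefined.

    addr_component is the current value of the address register
    component "A"; n is the constant offset from the binding.
    """
    a = addr_component + n
    if a < 0 or a >= limit:
        # The spec leaves the attribute variable undefined in this case;
        # None is used here purely as a marker for "undefined".
        return None
    return a
```

    With these assumptions, resolve_attrib_index(3, 2) selects generic
    attribute 5, while an index that is negative or reaches the limit is
    reported as undefined.  The same arithmetic applies to
    "vertex.texcoord[A+n]" with MAX_TEXTURE_COORDS_ARB as the limit.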

    Modify Section 2.14.3.4, Vertex Program Results

    (add new binding to binding table)

      Binding                        Components  Description
      -----------------------------  ----------  ----------------------------
      ...
      result.texcoord[A+n]           (s,t,r,q)   indexed texture coordinate

    If a result variable binding matches "result.texcoord[A+n]", where "A"
    is a component of an address register (Section 2.14.3.5), a texture
    coordinate number <c> is computed by adding the current value of
    the address register component and <n>.  Updates to the "x", "y",
    "z", and "w" components of the result variable set the "s", "t",
    "r" and "q" components, respectively, of the transformed vertex's
    texture coordinates for texture unit <c>.  If <c> is negative or
    greater than or equal to MAX_TEXTURE_COORDS_ARB, the effects of
    updates to the result variable are undefined and may overwrite
    other program results.

    Modify Section 2.14.3.X, Condition Code Registers (added in
    NV_vertex_program2_option)

    The vertex program condition code registers are two four-component
    vectors, called CC0 and CC1.  Each component of these registers is one
    of four enumerated values:  GT (greater than), EQ (equal), LT (less
    than), or UN (unordered).  The condition code registers can be used
    to mask writes to registers and to evaluate conditional branches.

    Most vertex program instructions can optionally update one of the
    two condition code registers.  When a vertex program instruction
    updates a condition code register, a condition code component is set
    to LT if the corresponding component of the result is less than zero,
    EQ if it is equal to zero, GT if it is greater than zero, and UN if
    it is NaN (not a number).

    The condition code registers are initialized to vectors of EQ values
    each time a vertex program executes.
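
    The update rule for the two condition code registers can be sketched
    in Python as follows.  This is a minimal model for illustration, not
    an implementation: the class name ConditionCodes, the string encoding
    of LT/EQ/GT/UN, and the update method are assumptions introduced here.

```python
import math


def generate_cc(value):
    """Map one result component to a condition code value."""
    if math.isnan(value):
        return "UN"
    if value < 0:
        return "LT"
    if value == 0:  # true for both -0.0 and +0.0
        return "EQ"
    return "GT"


class ConditionCodes:
    """Hypothetical model of CC0 and CC1 for one program invocation."""

    def __init__(self):
        # Both registers start as vectors of EQ each time a program runs.
        self.regs = [["EQ"] * 4, ["EQ"] * 4]

    def update(self, which, result):
        """Update register CC<which> from a 4-component result vector.

        Write masks are ignored in this sketch; every component is updated.
        """
        self.regs[which] = [generate_cc(v) for v in result]
```

    In this model, an instruction carrying the "C1" suffix would call
    update(1, result) and leave CC0 untouched; "C" and "C0" would call
    update(0, result).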

    Modify Section 2.14.3.7, Vertex Program Resource Limits

    (add new paragraph to end of section) In addition to the previous limits,
    the number of unique texture image units that can be accessed
    simultaneously by a vertex program is limited.  The limit is given by the
    implementation-dependent constant MAX_VERTEX_TEXTURE_IMAGE_UNITS_ARB, and
    may be lower than the total number of texture image units provided.  If
    the number of texture image units referenced by a vertex program exceeds
    this limit, the program will fail to load.

    Modify Section 2.14.4, Vertex Program Execution Environment

    (modify Begin-time error language for vertex program execution to cover
    invalid texture uses)

    If vertex program mode is enabled and the currently bound program object
    does not contain a valid vertex program, the error INVALID_OPERATION will
    be generated by Begin, RasterPos, and any command that implicitly calls
    Begin (e.g., DrawArrays).

    If vertex program mode is enabled and the currently bound program object
    accesses a texture image unit, the texture target used must be consistent
    with the target (if any) used for fragment processing.  If vertex and
    fragment processing require the use of different texture targets on the
    same texture image unit, the error INVALID_OPERATION will be generated by
    Begin, RasterPos, and any command that implicitly calls Begin.

    (modify instruction table) There are forty-eight vertex program
    instructions.  Vertex program instructions may have up to eight
    variants, including a suffix of "C" or "C0" to allow an update of
    condition code register zero (section 2.14.3.X), a suffix of "C1"
    to allow an update of condition code register one, and a suffix of
    "_SAT" to clamp the result vector components to the range [0,1].
    For example, the eight forms of the "ADD" instruction are "ADD",
    "ADDC", "ADDC0", "ADDC1", "ADD_SAT", "ADDC_SAT", "ADDC0_SAT", and
    "ADDC1_SAT".  The instructions and their respective input and output
    parameters are summarized in Table X.5.

                  Modifiers
      Instruction C S  Inputs  Output  Description
      ----------- - -  ------  ------  --------------------------------
      ABS         X X  v       v       absolute value
      ADD         X X  v,v     v       add
      ARA         X -  a       a       address register add
      ARL         X -  s       a       address register load
      ARR         X -  v       a       address register load (round)
      BRA         - -  c       -       branch
      CAL         - -  c       -       subroutine call
      COS         X X  s       ssss    cosine
      DP3         X X  v,v     ssss    3-component dot product
      DP4         X X  v,v     ssss    4-component dot product
      DPH         X X  v,v     ssss    homogeneous dot product
      DST         X X  v,v     v       distance vector
      EX2         X X  s       ssss    exponential base 2
      EXP         X X  s       v       exponential base 2 (approximate)
      FLR         X X  v       v       floor
      FRC         X X  v       v       fraction
      LG2         X X  s       ssss    logarithm base 2
      LIT         X X  v       v       compute light coefficients
      LOG         X X  s       v       logarithm base 2 (approximate)
      MAD         X X  v,v,v   v       multiply and add
      MAX         X X  v,v     v       maximum
      MIN         X X  v,v     v       minimum
      MOV         X X  v       v       move
      MUL         X X  v,v     v       multiply
      POPA        - -  -       a       pop address register
      POW         X X  s,s     ssss    exponentiate
      PUSHA       - -  a       -       push address register
      RCC         X X  s       ssss    reciprocal (clamped)
      RCP         X X  s       ssss    reciprocal
      RET         - -  c       -       subroutine return
      RSQ         X X  s       ssss    reciprocal square root
      SEQ         X X  v,v     v       set on equal
      SFL         X X  v,v     v       set on false
      SGE         X X  v,v     v       set on greater than or equal
      SGT         X X  v,v     v       set on greater than
      SIN         X X  s       ssss    sine
      SLE         X X  v,v     v       set on less than or equal
      SLT         X X  v,v     v       set on less than
      SNE         X X  v,v     v       set on not equal
      SSG         X X  v       v       set sign
      STR         X X  v,v     v       set on true
      SUB         X X  v,v     v       subtract
      SWZ         X X  v       v       extended swizzle
      TEX         X X  v       v       texture lookup
      TXB         X X  v       v       texture lookup with LOD bias
      TXL         X X  v       v       texture lookup with explicit LOD
      TXP         X X  v       v       projective texture lookup
      XPD         X X  v,v     v       cross product

      Table X.5:  Summary of vertex program instructions.  The columns
      "C" and "S" indicate whether the "C", "C0", and "C1" condition code
      update modifiers, and the "_SAT" saturation modifiers, respectively,
      are supported for the opcode.  "v" indicates a floating-point vector
      input or output, "s" indicates a floating-point scalar input,
      "ssss" indicates a scalar output replicated across a 4-component
      result vector, "a" indicates a vector address register, and "c"
      indicates a condition code test.

    Rewrite Section 2.14.4.3, Vertex Program Destination Register Update

    A vertex program instruction can optionally clamp the results of
    a floating-point result vector to the range [0,1].  The components
    of the result vector are clamped to [0,1] if the saturation suffix
    "_SAT" is present in the instruction.

    Most vertex program instructions write a 4-component result vector to
    a single temporary or vertex result register.  Writes to individual
    components of the destination register are controlled by individual
    component write masks specified as part of the instruction.

    The component write mask is specified by the <optionalMask> rule
    found in the <maskedDstReg> rule.  If the optional mask is "",
    all components are enabled.  Otherwise, the optional mask names
    the individual components to enable.  The characters "x", "y",
    "z", and "w" match the x, y, z, and w components respectively.
    For example, an optional mask of ".xzw" indicates that the x, z,
    and w components should be enabled for writing but the y component
    should not.  The grammar requires that the destination register mask
    components must be listed in "xyzw" order.  The condition code write
    mask is specified by the <ccMask> rule found in the <instResultCC>
    and <instResultAddrCC> rules.
    If a condition code mask is present, the selected condition
    code register is loaded and swizzled according to the swizzle
    codes specified by <swizzleSuffix>.  Each component of the swizzled
    condition code is tested according to the rule given by <ccMaskRule>.
    <ccMaskRule> may have the values "EQ", "NE", "LT", "GE", "LE", or "GT",
    which mean to enable writes if the corresponding condition code field
    evaluates to equal, not equal, less than, greater than or equal, less
    than or equal, or greater than, respectively.  Comparisons involving
    condition codes of "UN" (unordered) evaluate to true for "NE" and
    false otherwise.  For example, if the condition code is (GT,LT,EQ,GT)
    and the condition code mask is "(NE.zyxw)", the swizzle operation
    will load (EQ,LT,GT,GT) and the mask will thus enable writes on
    the y, z, and w components.  In addition, "TR" always enables writes
    and "FL" always disables writes, regardless of the condition code.
    If the condition code mask is empty, it is treated as "(TR)".

    Each component of the destination register is updated with the result
    of the vertex program instruction if and only if the component is
    enabled for writes by both the component write mask and the condition
    code write mask.  Otherwise, the component of the destination register
    remains unchanged.

    A vertex program instruction can also optionally update the condition
    code register.  The condition code is updated if the condition
    code register update suffix "C" is present in the instruction.
    The instruction "ADDC" will update the condition code; the otherwise
    equivalent instruction "ADD" will not.  If condition code updates
    are enabled, each component of the destination register enabled
    for writes is compared to zero.  The corresponding component of
    the condition code is set to "LT", "EQ", or "GT", if the written
    component is less than, equal to, or greater than zero, respectively.
    Condition code components are set to "UN" if the written component is
    NaN (not a number).  Values of -0.0 and +0.0 both evaluate to "EQ".
    If a component of the destination register is not enabled for writes,
    the corresponding condition code component is also unchanged.

    In the following example code,

        # R1=(-2, 0, 2, NaN)              R0                  CC
        MOVC R0, R1;               # ( -2,   0,   2, NaN) (LT,EQ,GT,UN)
        MOVC R0.xyz, R1.yzwx;      # (  0,   2, NaN, NaN) (EQ,GT,UN,UN)
        MOVC R0 (NE), R1.zywx;     # (  0,   0, NaN,  -2) (EQ,EQ,UN,LT)

    the first instruction writes (-2,0,2,NaN) to R0 and updates the
    condition code to (LT,EQ,GT,UN).  In the second instruction, only the
    "x", "y", and "z" components of R0 and the condition code are updated,
    so R0 ends up with (0,2,NaN,NaN) and the condition code ends up with
    (EQ,GT,UN,UN).  In the third instruction, the condition code mask
    disables writes to the x component (its condition code field is "EQ"),
    so R0 ends up with (0,0,NaN,-2) and the condition code ends up with
    (EQ,EQ,UN,LT).

    The following pseudocode illustrates the process of writing a
    result vector to the destination register.  In the pseudocode,
    "instrSaturate" is TRUE if and only if result saturation is
    enabled, and "instrMask" refers to the component write mask given by
    the <optWriteMask> rule.  "ccMaskRule" refers to the condition code
    mask rule given by <ccMask> and "updatecc" is TRUE if and only if
    condition code updates are enabled.  "result", "destination", and "cc"
    refer to the result vector, the register selected by <dstRegister>
    and the condition code, respectively.  Condition codes do not exist
    in the VP1 execution environment.

      boolean TestCC(CondCode field) {
          switch (ccMaskRule) {
          case "EQ":  return (field == "EQ");
          case "NE":  return (field != "EQ");
          case "LT":  return (field == "LT");
          case "GE":  return (field == "GT" || field == "EQ");
          case "LE":  return (field == "LT" || field == "EQ");
          case "GT":  return (field == "GT");
          case "TR":  return TRUE;
          case "FL":  return FALSE;
          case "":    return TRUE;
          }
      }

      enum GenerateCC(float value) {
        // NaN never compares equal to anything, so test it explicitly.
        if (isnan(value)) {
          return UN;
        } else if (value < 0) {
          return LT;
        } else if (value == 0) {
          return EQ;
        } else {
          return GT;
        }
      }

      void UpdateDestination(floatVec destination, floatVec result)
      {
          floatVec merged;
          ccVec mergedCC;

          // Clamp result components to [0,1] if requested in the instruction.
          if (instrSaturate) {
              if (result.x < 0)      result.x = 0;
              else if (result.x > 1) result.x = 1;
              if (result.y < 0)      result.y = 0;
              else if (result.y > 1) result.y = 1;
              if (result.z < 0)      result.z = 0;
              else if (result.z > 1) result.z = 1;
              if (result.w < 0)      result.w = 0;
              else if (result.w > 1) result.w = 1;
          }

          // Merge the converted result into the destination register, under
          // control of the compile- and run-time write masks.
          merged = destination;
          mergedCC = cc;
          if (instrMask.x && TestCC(cc.c***)) {
              merged.x = result.x;
              if (updatecc) mergedCC.x = GenerateCC(result.x);
          }
          if (instrMask.y && TestCC(cc.*c**)) {
              merged.y = result.y;
              if (updatecc) mergedCC.y = GenerateCC(result.y);
          }
          if (instrMask.z && TestCC(cc.**c*)) {
              merged.z = result.z;
              if (updatecc) mergedCC.z = GenerateCC(result.z);
          }
          if (instrMask.w && TestCC(cc.***c)) {
              merged.w = result.w;
              if (updatecc) mergedCC.w = GenerateCC(result.w);
          }

          // Write out the new destination register and condition code.
          destination = merged;
          cc = mergedCC;
      }

    While this rule describes floating-point results, the same logic
    applies to the integer results generated by the ARA, ARL, and ARR
    instructions.

    Add to Section 2.14.4.5, Vertex Program Options

    Section 2.14.4.5.3, NV_vertex_program3 Program Option

    If a vertex program specifies the "NV_vertex_program3" option, the
    ARB_vertex_program grammar and execution environment are extended
    to take advantage of all the features of the "NV_vertex_program2"
    option, plus the following features:

      * several new instructions:

        * POPA  -- pop address register off stack
        * PUSHA -- push address register onto stack
        * TEX   -- texture lookup
        * TXB   -- texture lookup w/LOD bias
        * TXL   -- texture lookup w/explicit LOD
        * TXP   -- projective texture lookup

      * address register-relative addressing for vertex texture
        coordinate and generic attribute arrays,

      * address register-relative addressing for vertex texture
        coordinate result array, and

      * a second four-component condition code.


    Modify Section 2.14.5.34, RET:  Subroutine Call Return

    The RET instruction conditionally returns from a subroutine initiated
    by a CAL instruction by popping an instruction reference off the
    top of the call stack and transferring control to the referenced
    instruction.
    The following pseudocode describes the operation of
    the instruction:

      if (TestCC(cc.c***) || TestCC(cc.*c**) ||
          TestCC(cc.**c*) || TestCC(cc.***c)) {
        if (callStackDepth <= 0) {
          // terminate vertex program normally
        } else {
          callStackDepth--;
          if (callStack[callStackDepth] is an instruction reference) {
            instruction = callStack[callStackDepth];
          } else {
            // terminate vertex program abnormally
          }
        }

        // continue execution at <instruction>
      } else {
        // do nothing
      }

    In the pseudocode, <callStackDepth> is the depth of the call stack,
    <callStack> is an array holding the call stack, and <instruction> is
    a reference to an instruction previously pushed onto the call stack.

    If the call stack is empty when RET executes, the vertex program
    terminates normally.

    The vertex program terminates abnormally if the entry at the top of the
    call stack is not an instruction reference pushed by CAL.  When a vertex
    program terminates abnormally, all of the vertex program results are
    undefined.

    Add to Section 2.14.5, Vertex Program Instruction Set

    Section 2.14.5.43, POPA:  Pop Address Register Stack

    The POPA instruction generates an integer result vector by popping
    an entry off of the call stack.

      if (callStackDepth <= 0) {
        terminate vertex program;
      } else {
        callStackDepth--;
        if (callStack[callStackDepth] is an address register) {
          iresult = callStack[callStackDepth];
        } else {
          terminate vertex program;
        }
      }

    POPA does not support non-default write masks; a program will fail to load
    if it includes a component write mask other than ".xyzw" or a condition
    code write mask test other than "TR".

    In the pseudocode, <callStackDepth> is the current depth of the call
    stack and <callStack> is an array holding the call stack.
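
    Because RET and POPA share one call stack with CAL and PUSHA, each
    entry effectively carries a type.  The following Python sketch models
    that behavior for illustration only.  The class names, the tagged-tuple
    representation, the default depth of 4, and the assumption that CAL
    overflows the same way PUSHA does are not taken from this
    specification; the real depth limit is MAX_PROGRAM_CALL_DEPTH_NV.

```python
class AbnormalTermination(Exception):
    """Raised where the program would terminate abnormally."""


class CallStack:
    """Hypothetical model of the stack shared by CAL/RET and PUSHA/POPA."""

    def __init__(self, max_depth=4):  # assumed example depth
        self.max_depth = max_depth
        self.entries = []  # list of (kind, value) pairs

    def _push(self, kind, value):
        if len(self.entries) >= self.max_depth:
            raise AbnormalTermination("call stack full")
        self.entries.append((kind, value))

    def pusha(self, address_register):
        self._push("addr", tuple(address_register))

    def cal(self, return_instruction):
        # Assumption: CAL overflow is handled like PUSHA overflow.
        self._push("instr", return_instruction)

    def popa(self):
        if not self.entries:
            raise AbnormalTermination("POPA on empty call stack")
        kind, value = self.entries.pop()
        if kind != "addr":
            raise AbnormalTermination("POPA found a non-address entry")
        return value

    def ret(self):
        if not self.entries:
            return None  # empty stack: program terminates normally
        kind, value = self.entries.pop()
        if kind != "instr":
            raise AbnormalTermination("RET found a non-instruction entry")
        return value
```

    In this model, a subroutine that executes PUSHA must execute a matching
    POPA before RET; otherwise RET finds an address register on top of the
    stack and the program terminates abnormally.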

    The vertex program terminates abnormally if it executes a POPA instruction
    when the call stack is empty, or when the entry at the top of the call
    stack is not an address register pushed by PUSHA.  When a vertex program
    terminates abnormally, all of the vertex program results are undefined.

    Section 2.14.5.44, PUSHA:  Push Address Register Stack

    The PUSHA instruction pushes the address register operand onto the
    call stack, which is also used for subroutine calls.  The PUSHA
    instruction does not generate a result vector.

      tmp = AddrVectorLoad(op0);
      if (callStackDepth >= MAX_PROGRAM_CALL_DEPTH_NV) {
        terminate vertex program;
      } else {
        callStack[callStackDepth] = tmp;
        callStackDepth++;
      }

    In the pseudocode, <callStackDepth> is the current depth of the call
    stack and <callStack> is an array holding the call stack.

    The vertex program terminates abnormally if it executes a PUSHA
    instruction when the call stack is full.  When a vertex program terminates
    abnormally, all of the vertex program results are undefined.

    Component swizzling is not supported when the operand is loaded.

    Section 2.14.5.45, TEX:  Texture Lookup

    The TEX instruction uses the single vector operand to perform a
    lookup in the specified texture map, yielding a 4-component result
    vector containing filtered texel values.  The (s,t,r,q) coordinates
    used for the texture lookup are (x,y,z,1), where x, y, and z are
    components of the vector operand.

      tmp = VectorLoad(op0);
      result = TextureSample(tmp.x, tmp.y, tmp.z, 1.0, 0.0, unit, target);

    where <unit> and <target> are the texture image unit number and
    target type, matching the <texImageUnitNum> and <texTargetType>
    grammar rules.

    The resulting sample is mapped to RGBA as described in Table 3.21,
    and the R, G, B, and A values are written to the x, y, z, and w
    components, respectively, of the result vector.

    Since partial derivatives of the texture coordinates are not defined,
    the base LOD value for vertex texture lookups is defined to be
    zero.  The value of lambda' used in equation 3.16 will be simply
    clamp(texobj_bias + texunit_bias).

    Section 2.14.5.46, TXB:  Texture Lookup (With LOD Bias)

    The TXB instruction uses the single vector operand to perform a
    lookup in the specified texture map, yielding a 4-component result
    vector containing filtered texel values.  The (s,t,r,q) coordinates
    used for the texture lookup are (x,y,z,1), where x, y, and z are
    components of the vector operand.  The w component of the operand
    is used as an additional LOD bias.

      tmp = VectorLoad(op0);
      result = TextureSample(tmp.x, tmp.y, tmp.z, 1.0, tmp.w, unit, target);

    where <unit> and <target> are the texture image unit number and
    target type, matching the <texImageUnitNum> and <texTargetType>
    grammar rules.

    The resulting sample is mapped to RGBA as described in Table 3.21,
    and the R, G, B, and A values are written to the x, y, z, and w
    components, respectively, of the result vector.

    Since partial derivatives of the texture coordinates are not defined,
    the base LOD value for vertex texture lookups is defined to be
    zero.  The value of lambda' used in equation 3.16 will be simply
    clamp(texobj_bias + texunit_bias + tmp.w).

    Since the base LOD value is zero, the TXB instruction is completely
    equivalent to the TXL instruction, where the w component contains
    an explicit base LOD value.
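
    The lambda' expressions given above for TEX and TXB can be written out
    as a short Python sketch.  This is illustrative only: the function
    names are invented for the example, and the clamp range of
    [-max_bias, max_bias], with max_bias standing for the implementation's
    maximum texture LOD bias, is an assumption based on the core OpenGL
    LOD bias rules rather than something stated in this section.

```python
def clamp(value, lo, hi):
    """Clamp value to the closed range [lo, hi]."""
    return max(lo, min(hi, value))


def vertex_lod_tex(texobj_bias, texunit_bias, max_bias):
    """lambda' for TEX: base LOD is zero, so only the biases remain."""
    return clamp(texobj_bias + texunit_bias, -max_bias, max_bias)


def vertex_lod_txb(texobj_bias, texunit_bias, operand_w, max_bias):
    """lambda' for TXB: operand w is added as a bias before clamping."""
    return clamp(texobj_bias + texunit_bias + operand_w, -max_bias, max_bias)
```

    For example, with both stored biases at zero and a max_bias of 2.0,
    vertex_lod_txb(0.0, 0.0, 1.0, 2.0) yields an LOD of 1.0, whereas
    vertex_lod_tex(0.0, 0.0, 2.0) always samples at LOD zero.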

    Section 2.14.5.47, TXL:  Texture Lookup (With Explicit LOD)

    The TXL instruction uses the single vector operand to perform a
    lookup in the specified texture map, yielding a 4-component result
    vector containing filtered texel values.  The (s,t,r,q) coordinates
    used for the texture lookup are (x,y,z,1), where x, y, and z are
    components of the vector operand.  The w component of the operand
    is used as the base LOD for the texture lookup.

      tmp = VectorLoad(op0);
      result = TextureSampleLOD(tmp.x, tmp.y, tmp.z, 1.0, tmp.w, unit, target);

    where <unit> and <target> are the texture image unit number and
    target type, matching the <texImageUnitNum> and <texTargetType>
    grammar rules.

    The resulting sample is mapped to RGBA as described in Table 3.21,
    and the R, G, B, and A values are written to the x, y, z, and w
    components, respectively, of the result vector.

    The value of lambda' used in equation 3.16 will be simply tmp.w +
    clamp(texobj_bias + texunit_bias), where tmp.w is the base LOD.

    Section 2.14.5.48, TXP:  Texture Lookup (Projective)

    The TXP instruction uses the single vector operand to perform a
    lookup in the specified texture map, yielding a 4-component result
    vector containing filtered texel values.  The (s,t,r,q) coordinates
    used for the texture lookup are (x,y,z,w), where x, y, z, and w are
    the four components of the vector operand.

      tmp = VectorLoad(op0);
      result = TextureSample(tmp.x, tmp.y, tmp.z, tmp.w, 0.0, unit, target);

    where <unit> and <target> are the texture image unit number and
    target type, matching the <texImageUnitNum> and <texTargetType>
    grammar rules.

    The resulting sample is mapped to RGBA as described in Table 3.21,
    and the R, G, B, and A values are written to the x, y, z, and w
    components, respectively, of the result vector.

    Since partial derivatives of the texture coordinates are not defined,
    the base LOD value for vertex texture lookups is defined to be
    zero.  The value of lambda' used in equation 3.16 will be simply
    clamp(texobj_bias + texunit_bias).

Additions to Chapter 3 of the OpenGL 1.4 Specification (Rasterization)

    None.

Additions to Chapter 4 of the OpenGL 1.4 Specification (Per-Fragment
Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 1.4 Specification (Special Functions)

    None.

Additions to Chapter 6 of the OpenGL 1.4 Specification (State and
State Requests)

    None.

Additions to Appendix A of the OpenGL 1.4 Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Dependencies on ARB_vertex_program

    ARB_vertex_program is required.

    This specification and NV_vertex_program2_option are based on a
    modified version of the grammar published in the ARB_vertex_program
    specification.  This modified grammar includes a few structural
    changes to better accommodate new functionality from this and
    other extensions, but should be functionally equivalent to the
    ARB_vertex_program grammar.  See NV_vertex_program2_option for
    details on the base grammar.

Dependencies on NV_vertex_program2_option

    NV_vertex_program2_option is required.

    If the NV_vertex_program3 program option is specified, all
    the functionality described in both this extension and the
    NV_vertex_program2_option specification is available.

Dependencies on ARB_fragment_program_shadow

    If this extension and ARB_fragment_program_shadow are both supported,
    vertex programs may include the option statement:

      OPTION ARB_fragment_program_shadow;

    which enables the use of SHADOW1D, SHADOW2D, and SHADOWRECT texture
    targets in texture lookup instructions, as described in the
    ARB_fragment_program_shadow specification.

    NVIDIA NOTE:  Drivers prior to September 2006 do not support the use of
    this option, and will not accept texture lookups with SHADOW1D, SHADOW2D,
    and SHADOWRECT targets.  Shadow mapping in vertex programs will result in
    software fallbacks on GeForce 6 and GeForce 7 series GPUs, but may be done
    in hardware on future GPUs.

Errors

    None.

New State

    None.

New Implementation Dependent State:

                                            Minimum
    Get Value            Type  Get Command  Value    Description                   Section   Attr.
    -------------------  ----  -----------  -------  ----------------------------  --------  -----
    MAX_VERTEX_TEXTURE_  Z+    GetIntegerv  1        Number of separate texture    2.14.3.7  -
      IMAGE_UNITS_ARB                                image units that can be
                                                     accessed by a vertex program

Revision History

    Rev.  Date      Author    Changes
    ----  --------  --------  --------------------------------------------
    7     10/12/09  pbrown    Update grammar/documentation of PUSHA/POPA to
                              reflect the implementation.  <instResultAddr> is
                              used for POPA with some semantic checks.  Note
                              that some driver versions erroneously allowed
                              conditional write masks on POPA.  Also clarify
                              that ARB_fragment_program_shadow includes
                              support for "SHADOWRECT".

    6     09/27/06  pbrown    Document that ARB_fragment_program_shadow is
                              allowed, to enable the use of "SHADOW1D" and
                              "SHADOW2D" targets for texture lookups.

    5     11/07/05  pbrown    Fix PUSHA documentation to specify the right
                              constant name used for overflow testing.

    4     09/01/05  pbrown    Fix spec language to document that a vertex
                              program will fail to compile if it uses "too
                              many" textures -- previously only documented
                              in the issues section.

    3     08/25/05  pbrown    Document that using a different texture target
                              than fragment processing on the same texture
                              unit results in an INVALID_OPERATION error at
                              Begin time.  This is consistent with GLSL
                              language in the ARB_shader_objects and OpenGL
                              2.0 specifications.  The implementation has
                              always done this, but it was overlooked in
                              the spec language.

    2     06/23/04  pbrown    Documented that vertex results are undefined
                              when a vertex program terminates abnormally
                              (e.g., PUSHA/POPA stack overflow/underflow).
                              Documented error in RET if the top of the call
                              stack contains a value written by PUSHA.

    1     --------  pbrown    Initial pre-release revisions.