Name

    NV_vertex_program1_1

Name Strings

    GL_NV_vertex_program1_1

Contact

    Mark J. Kilgard, NVIDIA Corporation (mjk 'at' nvidia.com)

Contributors

    Pat Brown
    Erik Lindholm
    Steve Glanville
    Erik Faye-Lund

Notice

    Copyright NVIDIA Corporation, 2001, 2002.

IP Status

    NVIDIA Proprietary.

Status

    Version 1.0

Version

    NVIDIA Date: March 4, 2014
    Version:     8

Number

    266

Dependencies

    Written based on the wording of the OpenGL 1.2.1 specification and
    requires OpenGL 1.2.1.

    Assumes support for the NV_vertex_program extension.

Overview

    This extension adds four new vertex program instructions (DPH,
    RCC, SUB, and ABS).

    This extension also supports a position-invariant vertex program
    option.  A vertex program is position-invariant when it generates
    the _exact_ same homogeneous position and window space position
    for a vertex as conventional OpenGL transformation (ignoring vertex
    blending and weighting).

    By default, vertex programs are _not_ guaranteed to be
    position-invariant because there is no guarantee made that the way
    a vertex program might compute its homogeneous position is exactly
    identical to the way conventional OpenGL transformation computes
    its homogeneous positions.  In a position-invariant vertex program,
    the homogeneous position (HPOS) is not output by the program.
    Instead, the OpenGL implementation is expected to compute the HPOS
    for position-invariant vertex programs in a manner exactly identical
    to how the homogeneous position and window position are computed
    for a vertex by conventional OpenGL transformation.  In this way
    position-invariant vertex programs guarantee correct multi-pass
    rendering semantics in cases where multiple passes are rendered and
    the second and subsequent passes use a GL_EQUAL depth test.

Issues

    How should options to the vertex program semantics be handled?

      RESOLUTION:  A VP1.1 vertex program can contain a sequence
      of options.  This extension provides a single option
      ("NV_position_invariant").  Specifying an option changes the
      way the program's subsequent instruction sequence is parsed,
      may add new semantic checks, and modifies the semantics by which
      the vertex program is executed.

    Should this extension provide SUB and ABS instructions even though
    the functionality can be accomplished with ADD and MAX?

      RESOLUTION:  Yes.  Although SUB and ABS provide no functionality
      that could not be accomplished in VP1.0 with ADD and MAX idioms,
      they make vertex programs more understandable.

    Should the optionalSign in a VP1.1 program accept both "-" and "+"?

      RESOLUTION:  Yes.  The "+" does not negate its operand but is
      available for symmetry.

    Is relative addressing available to position-invariant version 1.1
    vertex programs?

      RESOLUTION:  No.  This reflects a hardware restriction.

    Should something be said about the relative performance of
    position-invariant vertex programs and conventional vertex programs?

      RESOLUTION:  For architectural reasons, position-invariant vertex
      programs may be _slightly_ faster than conventional vertex programs.
      This is true in the GeForce3 architecture.  If your vertex program
      transforms the object-space position to clip-space with four DP4
      instructions using the tracked GL_MODELVIEW_PROJECTION_NV matrix,
      consider using position-invariant vertex programs.  Do not expect a
      measurable performance improvement unless vertex program processing
      is your bottleneck and your vertex program is relatively short.

    Should position-invariant vertex programs have a lower limit on the
    maximum instructions?

      RESOLUTION:  Yes.  The driver takes care to transform the position
      with the same instructions that conventional transformation uses,
      and this requires a few vertex program instructions.
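
    As a non-normative illustration of the SUB and ABS issue above,
    the following C sketch emulates the ADD and MAX idioms on the CPU
    and compares them with direct subtraction and absolute value.
    The vec4 type and every function name here are hypothetical and
    exist only for this example.  Swizzling and write masks are
    omitted, so each function behaves as if the full write mask were
    enabled.

```c
/* Hypothetical 4-component register type, used only for this sketch. */
typedef struct { float x, y, z, w; } vec4;

/* Per-operand negation, as selected by a "-" optionalSign. */
vec4 negate(vec4 a) {
    vec4 r = { -a.x, -a.y, -a.z, -a.w };
    return r;
}

float max_component(float a, float b) { return (a > b) ? a : b; }

/* VP1.0 ADD: component-wise sum. */
vec4 op_add(vec4 a, vec4 b) {
    vec4 r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}

/* VP1.0 MAX: component-wise maximum. */
vec4 op_max(vec4 a, vec4 b) {
    vec4 r = { max_component(a.x, b.x), max_component(a.y, b.y),
               max_component(a.z, b.z), max_component(a.w, b.w) };
    return r;
}

/* VP1.1 SUB: source0 minus source1, component-wise. */
vec4 op_sub(vec4 a, vec4 b) {
    vec4 r = { a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w };
    return r;
}

/* VP1.1 ABS: component-wise absolute value. */
vec4 op_abs(vec4 a) {
    vec4 r = { (a.x >= 0) ? a.x : -a.x, (a.y >= 0) ? a.y : -a.y,
               (a.z >= 0) ? a.z : -a.z, (a.w >= 0) ? a.w : -a.w };
    return r;
}

/* VP1.0 idiom for SUB:  ADD d, a, -b; */
vec4 sub_idiom(vec4 a, vec4 b) { return op_add(a, negate(b)); }

/* VP1.0 idiom for ABS:  MAX d, a, -a; */
vec4 abs_idiom(vec4 a) { return op_max(a, negate(a)); }

/* Component-wise equality.  Note that == treats +0 and -0 as equal. */
int vec4_equal(vec4 a, vec4 b) {
    return a.x == b.x && a.y == b.y && a.z == b.z && a.w == b.w;
}
```

    For ordinary finite inputs, vec4_equal(op_sub(a, b), sub_idiom(a, b))
    and vec4_equal(op_abs(a), abs_idiom(a)) are both expected to hold.
    The sketch makes no claim about NaN inputs or the sign of zero.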

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL Operation)

    2.14.1.9  Vertex Program Register Accesses

    Replace the first two sentences and update Table X.4:

    "There are 21 vertex program instructions.  The instructions and their
    respective input and output parameters are summarized in Table X.4."

                              Output
         Inputs               (vector or
Opcode   (scalar or vector)   replicated scalar)   Operation
------   ------------------   ------------------   --------------------------
ARL      s                    address register     address register load
MOV      v                    v                    move
MUL      v,v                  v                    multiply
ADD      v,v                  v                    add
MAD      v,v,v                v                    multiply and add
RCP      s                    ssss                 reciprocal
RSQ      s                    ssss                 reciprocal square root
DP3      v,v                  ssss                 3-component dot product
DP4      v,v                  ssss                 4-component dot product
DST      v,v                  v                    distance vector
MIN      v,v                  v                    minimum
MAX      v,v                  v                    maximum
SLT      v,v                  v                    set on less than
SGE      v,v                  v                    set on greater than or equal
EXP      s                    v                    exponential base 2
LOG      s                    v                    logarithm base 2
LIT      v                    v                    light coefficients
DPH      v,v                  ssss                 homogeneous dot product
RCC      s                    ssss                 reciprocal clamped
SUB      v,v                  v                    subtract
ABS      v                    v                    absolute value

Table X.4:  Summary of vertex program instructions.  "v" indicates a
vector input or output, "s" indicates a scalar input, and "ssss" indicates
a scalar output replicated across a 4-component vector.

    Add four new sections describing the DPH, RCC, SUB, and ABS
    instructions.

    "2.14.1.10.18  DPH: Homogeneous Dot Product

    The DPH instruction computes the four-component dot product of the
    two source vectors, treating the W component of the first source
    vector as 1.0, and assigns the result to the destination register.

        t.x = source0.c***;
        t.y = source0.*c**;
        t.z = source0.**c*;
        if (negate0) {
           t.x = -t.x;
           t.y = -t.y;
           t.z = -t.z;
        }
        u.x = source1.c***;
        u.y = source1.*c**;
        u.z = source1.**c*;
        u.w = source1.***c;
        if (negate1) {
           u.x = -u.x;
           u.y = -u.y;
           u.z = -u.z;
           u.w = -u.w;
        }
        v.x = t.x * u.x + t.y * u.y + t.z * u.z + u.w;
        if (xmask) destination.x = v.x;
        if (ymask) destination.y = v.x;
        if (zmask) destination.z = v.x;
        if (wmask) destination.w = v.x;

    2.14.1.10.19  RCC: Reciprocal Clamped

    The RCC instruction inverts the value of the source scalar, clamps
    the result as described below, and stores the clamped result into
    the destination register.  The reciprocal of exactly 1.0 must be
    exactly 1.0.

    Additionally (before clamping) the reciprocal of negative infinity
    gives [-0.0, -0.0, -0.0, -0.0]; the reciprocal of negative zero gives
    [-Inf, -Inf, -Inf, -Inf]; the reciprocal of positive zero gives
    [+Inf, +Inf, +Inf, +Inf]; and the reciprocal of positive infinity
    gives [0.0, 0.0, 0.0, 0.0].

        t.x = source0.c;
        if (negate0) {
           t.x = -t.x;
        }
        if (t.x == 1.0f) {
           u.x = 1.0f;
        } else {
           u.x = 1.0f / t.x;
        }
        if (Positive(u.x)) {
           if (u.x > 1.84467e+019) {
              u.x = 1.84467e+019;   // the IEEE 32-bit binary value 0x5F800000
           } else if (u.x < 5.42101e-020) {
              u.x = 5.42101e-020;   // the IEEE 32-bit binary value 0x1F800000
           }
        } else {
           if (u.x < -1.84467e+019) {
              u.x = -1.84467e+019;  // the IEEE 32-bit binary value 0xDF800000
           } else if (u.x > -5.42101e-020) {
              u.x = -5.42101e-020;  // the IEEE 32-bit binary value 0x9F800000
           }
        }
        if (xmask) destination.x = u.x;
        if (ymask) destination.y = u.x;
        if (zmask) destination.z = u.x;
        if (wmask) destination.w = u.x;

    where Positive(x) is true for +0 and other positive values and false
    for -0 and other negative values; and

        | u.x - IEEE(1.0f/t.x) | < 1.0f/(2^22)

    for 1.0f <= t.x <= 2.0f.  The intent of this precision requirement is
    that this amount of relative precision apply over all values of t.x."

    2.14.1.10.20  SUB: Subtract

    The SUB instruction subtracts the second source vector from the
    first source vector, component-wise, and stores the result into
    the destination register.

        t.x = source0.c***;
        t.y = source0.*c**;
        t.z = source0.**c*;
        t.w = source0.***c;
        if (negate0) {
           t.x = -t.x;
           t.y = -t.y;
           t.z = -t.z;
           t.w = -t.w;
        }
        u.x = source1.c***;
        u.y = source1.*c**;
        u.z = source1.**c*;
        u.w = source1.***c;
        if (negate1) {
           u.x = -u.x;
           u.y = -u.y;
           u.z = -u.z;
           u.w = -u.w;
        }
        if (xmask) destination.x = t.x - u.x;
        if (ymask) destination.y = t.y - u.y;
        if (zmask) destination.z = t.z - u.z;
        if (wmask) destination.w = t.w - u.w;

    2.14.1.10.21  ABS: Absolute Value

    The ABS instruction assigns the component-wise absolute value of a
    source vector into the destination register.

        t.x = source0.c***;
        t.y = source0.*c**;
        t.z = source0.**c*;
        t.w = source0.***c;
        if (xmask) destination.x = (t.x >= 0) ? t.x : -t.x;
        if (ymask) destination.y = (t.y >= 0) ? t.y : -t.y;
        if (zmask) destination.z = (t.z >= 0) ? t.z : -t.z;
        if (wmask) destination.w = (t.w >= 0) ? t.w : -t.w;

    Insert sections 2.14.A and 2.14.B after section 2.14.4

    "2.14.A  Version 1.1 Vertex Programs

    Version 1.1 vertex programs provide support for the DPH, RCC, SUB,
    and ABS instructions (see sections 2.14.1.10.18 through 2.14.1.10.21).

    Version 1.1 vertex programs are loaded with the LoadProgramNV command
    (see section 2.14.1.7).  The target must be VERTEX_PROGRAM_NV to
    load a version 1.1 vertex program.  The initial "!!VP1.1" token
    designates that the program should be parsed and treated as a
    version 1.1 vertex program.

    Version 1.1 programs must conform to a grammar that expands on the
    grammar for version 1.0 vertex programs.  The version 1.1 vertex
    program grammar for syntactically valid sequences is the same as the
    grammar defined in section 2.14.1.7 with the following modified rules:

    <program>              ::= "!!VP1.1" <optionSequence> <instructionSequence> "END"

    <optionSequence>       ::= <optionSequence> <option>
                             | ""

    <option>               ::= "OPTION" "NV_position_invariant" ";"

    <VECTORop>             ::= "MOV"
                             | "LIT"
                             | "ABS"

    <SCALARop>             ::= "RCP"
                             | "RSQ"
                             | "EXP"
                             | "LOG"
                             | "RCC"

    <BINop>                ::= "MUL"
                             | "ADD"
                             | "DP3"
                             | "DP4"
                             | "DST"
                             | "MIN"
                             | "MAX"
                             | "SLT"
                             | "SGE"
                             | "DPH"
                             | "SUB"

    <optionalSign>         ::= "-"
                             | "+"
                             | ""

    Except for supporting the additional DPH, RCC, SUB, and ABS
    instructions, version 1.1 vertex programs with no options specified
    behave in the same manner as version 1.0 vertex programs.
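
    The following non-normative C sketch shows what program strings
    matching the grammar above might look like.  It assumes that
    c[0] through c[3] hold the rows of the tracked modelview-projection
    matrix and that c[4] holds an application-defined color offset.
    Both are assumptions of this example, not requirements of the
    extension.  The helper starts_with_vp11 and the array names are
    hypothetical.

```c
#include <string.h>

/* A version 1.1 program with no options.  It writes HPOS itself, using
 * DPH so that the W component of the object-space position is treated
 * as 1.0.  Assumes c[0]..c[3] hold the modelview-projection rows. */
const char vp11_conventional[] =
    "!!VP1.1\n"
    "DPH o[HPOS].x, v[OPOS], c[0];\n"
    "DPH o[HPOS].y, v[OPOS], c[1];\n"
    "DPH o[HPOS].z, v[OPOS], c[2];\n"
    "DPH o[HPOS].w, v[OPOS], c[3];\n"
    "SUB R0, v[COL0], c[4];\n"
    "ABS o[COL0], R0;\n"
    "END";

/* The same color computation with the position-invariant option.
 * The program does not write HPOS; the implementation computes it. */
const char vp11_invariant[] =
    "!!VP1.1\n"
    "OPTION NV_position_invariant;\n"
    "SUB R0, v[COL0], c[4];\n"
    "ABS o[COL0], R0;\n"
    "END";

/* Hypothetical helper: nonzero when src begins with the "!!VP1.1"
 * token.  This checks only the header, not the rest of the grammar. */
int starts_with_vp11(const char *src) {
    return strncmp(src, "!!VP1.1", 7) == 0;
}
```

    With a current OpenGL context that supports this extension, either
    string would be passed to LoadProgramNV with the VERTEX_PROGRAM_NV
    target, its length, and a program id chosen by the application.
    That call is not part of the sketch because it cannot run without
    a context.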

    2.14.B  Position-invariant Vertex Program Option

    By default, vertex programs are _not_ guaranteed to be
    position-invariant because there is no guarantee made that the
    way a vertex program might compute its homogeneous position is
    exactly identical to the way conventional OpenGL transformation
    computes its homogeneous positions.  However, in a position-invariant
    vertex program, the homogeneous position (HPOS) is not output by
    the program.  Instead, the OpenGL implementation is expected to
    compute the HPOS for position-invariant vertex programs in a manner
    exactly identical to how the homogeneous position and window position
    are computed for a vertex by conventional OpenGL transformation
    (assuming vertex weighting and vertex blending are disabled).  In this
    way position-invariant vertex programs guarantee correct multi-pass
    rendering semantics in cases where multiple passes are rendered with
    conventional OpenGL transformation and position-invariant vertex
    programs and the second and subsequent passes use an EQUAL depth test.

    If an <option> with the identifier "NV_position_invariant" is
    encountered during the parsing of the program, the specified program
    is presumed to be position-invariant.

    When a position-invariant vertex program is specified, the
    <vertexResultRegName> rule is replaced with the following rule
    (that does not provide "HPOS"):

    <vertexResultRegName>  ::= "COL0"
                             | "COL1"
                             | "BFC0"
                             | "BFC1"
                             | "FOGC"
                             | "PSIZ"
                             | "TEX0"
                             | "TEX1"
                             | "TEX2"
                             | "TEX3"
                             | "TEX4"
                             | "TEX5"
                             | "TEX6"
                             | "TEX7"

    While position-invariant version 1.1 vertex programs provide
    position-invariance, such programs do not provide support for
    relative program parameter addressing.
    The <relProgParamReg> rule for version 1.1 position-invariant vertex
    programs is replaced by (eliminating the relative addressing cases):

    <relProgParamReg>      ::= "c" "[" <addrReg> "]"

    Note that while the ARL instruction is still available to
    position-invariant version 1.1 vertex programs, it provides no
    meaningful functionality without support for relative addressing.

    The semantic restriction for vertex program instruction length is
    changed in the case of position-invariant vertex programs to the
    following:  A position-invariant vertex program fails to load if it
    contains more than 124 instructions.

    "

Additions to Chapter 4 of the OpenGL 1.2.1 Specification (Per-Fragment
Operations and the Framebuffer)

    None

Additions to Chapter 5 of the OpenGL 1.2.1 Specification (Special Functions)

    None

Additions to Chapter 6 of the OpenGL 1.2.1 Specification (State and
State Requests)

    None

Additions to the AGL/GLX/WGL Specifications

    None

GLX Protocol

    None

Errors

    None

New State

    None

Revision History

    Rev.  Date      Author     Changes
    ----  --------  ---------  ----------------------------------------
    8     03/04/14  mjk        RCC decimal value corrections
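
    The RCC clamp values noted in revision 8 can be cross-checked
    against the bit patterns given in section 2.14.1.10.19 with a
    non-normative C sketch such as the one below.  It assumes that
    float is an IEEE 754 single-precision type and that division by
    zero yields an infinity rather than trapping.  The names
    float_from_bits and rcc_sketch are hypothetical.  The sketch takes
    a scalar that has already been swizzled and negated, and it does
    not model the precision requirement or the write mask.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret an IEEE 754 single-precision bit pattern as a float. */
float float_from_bits(uint32_t bits) {
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Clamped reciprocal of t, following the RCC pseudocode.  The clamp
 * limits are built from the bit patterns rather than decimal literals.
 * Behavior for NaN inputs is not addressed by this sketch. */
float rcc_sketch(float t) {
    const float largest  = float_from_bits(0x5F800000u);  /* 2^64  */
    const float smallest = float_from_bits(0x1F800000u);  /* 2^-64 */
    float u = (t == 1.0f) ? 1.0f : 1.0f / t;

    if (!signbit(u)) {            /* Positive(u): +0 and positive values */
        if (u > largest) {
            u = largest;
        } else if (u < smallest) {
            u = smallest;
        }
    } else {                      /* -0 and negative values */
        if (u < -largest) {
            u = -largest;
        } else if (u > -smallest) {
            u = -smallest;
        }
    }
    return u;
}
```

    Under these assumptions, 0x5F800000 decodes to exactly 2^64
    (approximately 1.84467e+19) and 0x1F800000 to exactly 2^-64
    (approximately 5.42101e-20), so rcc_sketch(0.0f) returns 2^64 and
    rcc_sketch(INFINITY) returns 2^-64.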