Name

    NV_vertex_program1_1

Name Strings

    GL_NV_vertex_program1_1

Contact

    Mark J. Kilgard, NVIDIA Corporation (mjk 'at' nvidia.com)

Contributors

    Pat Brown
    Erik Lindholm
    Steve Glanville
    Erik Faye-Lund

Notice

    Copyright NVIDIA Corporation, 2001, 2002.

IP Status

    NVIDIA Proprietary.

Status

    Version 1.0

Version

    NVIDIA Date: March 4, 2014
    Version:     8

Number

    266

Dependencies

    Written based on the wording of the OpenGL 1.2.1 specification and
    requires OpenGL 1.2.1.

    Assumes support for the NV_vertex_program extension.

Overview

    This extension adds four new vertex program instructions (DPH,
    RCC, SUB, and ABS).

    This extension also supports a position-invariant vertex program
    option.  A vertex program is position-invariant when it generates
    the _exact_ same homogeneous position and window space position
    for a vertex as conventional OpenGL transformation (ignoring vertex
    blending and weighting).

    By default, vertex programs are _not_ guaranteed to be
    position-invariant because there is no guarantee made that the way
    a vertex program might compute its homogeneous position is exactly
    identical to the way conventional OpenGL transformation computes
    its homogeneous positions.  In a position-invariant vertex program,
    the homogeneous position (HPOS) is not output by the program.
    Instead, the OpenGL implementation is expected to compute the HPOS
    for position-invariant vertex programs in a manner exactly identical
    to how the homogeneous position and window position are computed
    for a vertex by conventional OpenGL transformation.  In this way
    position-invariant vertex programs guarantee correct multi-pass
    rendering semantics in cases where multiple passes are rendered and
    the second and subsequent passes use a GL_EQUAL depth test.
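
    The need for _exact_ agreement can be illustrated without any GL
    calls.  The sketch below is a hypothetical illustration only: the
    function names are invented here, and it merely shows that summing
    the same three floating-point terms in two different orders can give
    different 32-bit results, which is enough to make an equality-based
    depth comparison fail between passes.

```c
/* Hypothetical illustration, not part of the extension: the same three
   terms summed in two orders.  The volatile intermediates force each
   partial sum to be rounded to a 32-bit float before the next step. */
static float sum_in_order(float a, float b, float c)
{
    volatile float partial = a + b;     /* (a + b) is rounded first */
    volatile float total = partial + c;
    return total;
}

static float sum_reordered(float a, float b, float c)
{
    volatile float partial = a + c;     /* (a + c) is rounded first */
    volatile float total = partial + b;
    return total;
}

/* Mimics an equality-based depth comparison between two passes. */
static int depth_equal(float first_pass, float later_pass)
{
    return first_pass == later_pass;
}
```

    With the inputs 1e8f, 1.0f, and -1e8f, the first ordering loses the
    1.0f term to rounding while the second keeps it, so depth_equal()
    reports a mismatch even though both orderings are mathematically
    the same sum.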

Issues

    How should options to the vertex program semantics be handled?

      RESOLUTION:  A VP1.1 vertex program can contain a sequence
      of options.  This extension provides a single option
      ("NV_position_invariant").  Specifying an option changes the
      way the program's subsequent instruction sequence is parsed,
      may add new semantic checks, and modifies the semantics by which
      the vertex program is executed.

    Should this extension provide SUB and ABS instructions even though
    the functionality can be accomplished with ADD and MAX?

      RESOLUTION:  Yes.  Although SUB and ABS provide no functionality
      that could not be accomplished in VP1.0 with ADD and MAX idioms,
      they make vertex programs more understandable.

    Should the optionalSign in a VP1.1 program accept both "-" and "+"?

      RESOLUTION:  Yes.  The "+" does not negate its operand but is
      available for symmetry.

    Is relative addressing available to position-invariant version 1.1
    vertex programs?

      RESOLUTION:  No.  This reflects a hardware restriction.

    Should something be said about the relative performance of
    position-invariant vertex programs and conventional vertex programs?

      RESOLUTION:  For architectural reasons, position-invariant vertex
      programs may be _slightly_ faster than conventional vertex programs.
      This is true in the GeForce3 architecture.  If your vertex program
      transforms the object-space position to clip-space with four DP4
      instructions using the tracked GL_MODELVIEW_PROJECTION_NV matrix,
      consider using position-invariant vertex programs.  Do not expect a
      measurable performance improvement unless vertex program processing
      is your bottleneck and your vertex program is relatively short.

    Should position-invariant vertex programs have a lower limit on the
    maximum number of instructions?

      RESOLUTION:  Yes.  The driver takes care to use the same
      instructions for position transformation that conventional
      transformation uses, and this requires a few vertex program
      instructions.
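
    As a rough sketch of the trade-off discussed in the performance
    issue above, the two program strings below contrast a conventional
    program that transforms the position with four DP4 instructions
    against a position-invariant program that leaves the position to
    the implementation.  The placement of the tracked matrix in c[0]
    through c[3], the color pass-through, and the helper writes_hpos()
    are assumptions made for this example, not requirements of this
    extension.

```c
#include <string.h>

/* Conventional program: transforms the position explicitly.  Assumes the
   application tracks GL_MODELVIEW_PROJECTION_NV in c[0]..c[3]. */
static const char vp10_explicit_transform[] =
    "!!VP1.0\n"
    "DP4 o[HPOS].x, c[0], v[OPOS];\n"
    "DP4 o[HPOS].y, c[1], v[OPOS];\n"
    "DP4 o[HPOS].z, c[2], v[OPOS];\n"
    "DP4 o[HPOS].w, c[3], v[OPOS];\n"
    "MOV o[COL0], v[COL0];\n"
    "END";

/* Position-invariant program: HPOS is never written by the program. */
static const char vp11_position_invariant[] =
    "!!VP1.1\n"
    "OPTION NV_position_invariant;\n"
    "MOV o[COL0], v[COL0];\n"
    "END";

/* Hypothetical helper: returns 1 if the program text mentions o[HPOS]. */
static int writes_hpos(const char *program)
{
    return strstr(program, "o[HPOS]") != NULL;
}
```

    The position-invariant version is four instructions shorter, which
    is where any speedup for short programs would come from.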

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL Operation)

    2.14.1.9  Vertex Program Register Accesses

    Replace the first two sentences and update Table X.4:

    "There are 21 vertex program instructions.  The instructions and their
    respective input and output parameters are summarized in Table X.4."

                             Output
         Inputs              (vector or
Opcode   (scalar or vector)  replicated scalar)   Operation
------   ------------------  ------------------   --------------------------
 ARL     s                   address register     address register load
 MOV     v                   v                    move
 MUL     v,v                 v                    multiply
 ADD     v,v                 v                    add
 MAD     v,v,v               v                    multiply and add
 RCP     s                   ssss                 reciprocal
 RSQ     s                   ssss                 reciprocal square root
 DP3     v,v                 ssss                 3-component dot product
 DP4     v,v                 ssss                 4-component dot product
 DST     v,v                 v                    distance vector
 MIN     v,v                 v                    minimum
 MAX     v,v                 v                    maximum
 SLT     v,v                 v                    set on less than
 SGE     v,v                 v                    set on greater than or equal
 EXP     s                   v                    exponential base 2
 LOG     s                   v                    logarithm base 2
 LIT     v                   v                    light coefficients
 DPH     v,v                 ssss                 homogeneous dot product
 RCC     s                   ssss                 reciprocal clamped
 SUB     v,v                 v                    subtract
 ABS     v                   v                    absolute value

Table X.4:  Summary of vertex program instructions.  "v" indicates a
vector input or output, "s" indicates a scalar input, and "ssss" indicates
a scalar output replicated across a 4-component vector.

    Add four new sections describing the DPH, RCC, SUB, and ABS
    instructions.

    "2.14.1.10.18  DPH: Homogeneous Dot Product

    The DPH instruction computes the four-component dot product of the
    two source vectors, with the W component of the first source vector
    assumed to be 1.0, and assigns the result to the destination register.

        t.x = source0.c***;
        t.y = source0.*c**;
        t.z = source0.**c*;
        if (negate0) {
          t.x = -t.x;
          t.y = -t.y;
          t.z = -t.z;
        }
        u.x = source1.c***;
        u.y = source1.*c**;
        u.z = source1.**c*;
        u.w = source1.***c;
        if (negate1) {
          u.x = -u.x;
          u.y = -u.y;
          u.z = -u.z;
          u.w = -u.w;
        }
        v.x = t.x * u.x + t.y * u.y + t.z * u.z + u.w;
        if (xmask) destination.x = v.x;
        if (ymask) destination.y = v.x;
        if (zmask) destination.z = v.x;
        if (wmask) destination.w = v.x;
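
    The DPH arithmetic above can be sketched in plain C as follows.
    This is a minimal illustration, not a reference implementation: the
    vec4 type and the dph() name are invented for the example, and
    swizzling, negation, and write masks are omitted.

```c
/* Hypothetical 4-component register type used only for this sketch. */
typedef struct { float x, y, z, w; } vec4;

/* Homogeneous dot product: the W component of the first operand is
   ignored and treated as 1.0, so only b.w contributes to the last term.
   The scalar result is replicated across all four components. */
static vec4 dph(vec4 a, vec4 b)
{
    float s = a.x * b.x + a.y * b.y + a.z * b.z + b.w;
    vec4 r = { s, s, s, s };
    return r;
}
```

    For example, dph() applied to (1, 2, 3, 99) and (4, 5, 6, 7) yields
    4 + 10 + 18 + 7 = 39 in every component; the 99 in the first
    operand's W component has no effect.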

    2.14.1.10.19  RCC: Reciprocal Clamped

    The RCC instruction computes the reciprocal of the source scalar,
    clamps the result as described below, and stores the clamped result
    into the destination register.  The reciprocal of exactly 1.0 must be
    exactly 1.0.

    Additionally (before clamping) the reciprocal of negative infinity
    gives [-0.0, -0.0, -0.0, -0.0]; the reciprocal of negative zero gives
    [-Inf, -Inf, -Inf, -Inf]; the reciprocal of positive zero gives
    [+Inf, +Inf, +Inf, +Inf]; and the reciprocal of positive infinity
    gives [0.0, 0.0, 0.0, 0.0].

        t.x = source0.c;
        if (negate0) {
          t.x = -t.x;
        }
        if (t.x == 1.0f) {
          u.x = 1.0f;
        } else {
          u.x = 1.0f / t.x;
        }
        if (Positive(u.x)) {
          if (u.x > 1.84467e+019) {
            u.x = 1.84467e+019;   // the IEEE 32-bit binary value 0x5F800000
          } else if (u.x < 5.42101e-020) {
            u.x = 5.42101e-020;   // the IEEE 32-bit binary value 0x1F800000
          }
        } else {
          if (u.x < -1.84467e+019) {
            u.x = -1.84467e+019;  // the IEEE 32-bit binary value 0xDF800000
          } else if (u.x > -5.42101e-020) {
            u.x = -5.42101e-020;  // the IEEE 32-bit binary value 0x9F800000
          }
        }
        if (xmask) destination.x = u.x;
        if (ymask) destination.y = u.x;
        if (zmask) destination.z = u.x;
        if (wmask) destination.w = u.x;

    where Positive(x) is true for +0 and other positive values and false
    for -0 and other negative values; and

        | u.x - IEEE(1.0f/t.x) | < 1.0f/(2^22)

    for 1.0f <= t.x <= 2.0f.  The intent of this precision requirement is
    that this amount of relative precision apply over all values of t.x."
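
    The clamping behavior can be sketched in C as shown below.  This is
    a hedged illustration that assumes IEEE 754 single-precision floats
    and uses the host's division rather than the hardware reciprocal;
    the names float_from_bits() and rcc() are invented for the example.
    The clamp bounds are built from the bit patterns quoted above, which
    correspond to 2^64 and 2^-64.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterprets a 32-bit pattern as an IEEE 754 single-precision float. */
static float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Sketch of the scalar part of RCC: reciprocal, then clamp the magnitude
   into [2^-64, 2^64] while preserving the sign (including signed zero). */
static float rcc(float t)
{
    const float largest  = float_from_bits(0x5F800000u);  /* 2^64  */
    const float smallest = float_from_bits(0x1F800000u);  /* 2^-64 */
    float u = (t == 1.0f) ? 1.0f : 1.0f / t;

    if (!signbit(u)) {            /* Positive(u): +0 and positive values */
        if (u > largest)       u = largest;
        else if (u < smallest) u = smallest;
    } else {                      /* -0 and negative values */
        if (u < -largest)       u = -largest;
        else if (u > -smallest) u = -smallest;
    }
    return u;
}
```

    Under these assumptions rcc(0.0f) returns 2^64 rather than +Inf,
    and rcc(INFINITY) returns 2^-64 rather than 0.0, so a later multiply
    by the result cannot produce an infinity or a NaN from these inputs
    alone.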

    2.14.1.10.20  SUB: Subtract

    The SUB instruction subtracts the second source vector from the
    first source vector, component-wise, and stores the result into the
    destination register.

        t.x = source0.c***;
        t.y = source0.*c**;
        t.z = source0.**c*;
        t.w = source0.***c;
        if (negate0) {
          t.x = -t.x;
          t.y = -t.y;
          t.z = -t.z;
          t.w = -t.w;
        }
        u.x = source1.c***;
        u.y = source1.*c**;
        u.z = source1.**c*;
        u.w = source1.***c;
        if (negate1) {
          u.x = -u.x;
          u.y = -u.y;
          u.z = -u.z;
          u.w = -u.w;
        }
        if (xmask) destination.x = t.x - u.x;
        if (ymask) destination.y = t.y - u.y;
        if (zmask) destination.z = t.z - u.z;
        if (wmask) destination.w = t.w - u.w;

    2.14.1.10.21  ABS: Absolute Value

    The ABS instruction assigns the component-wise absolute value of a
    source vector into the destination register.

        t.x = source0.c***;
        t.y = source0.*c**;
        t.z = source0.**c*;
        t.w = source0.***c;
        if (xmask) destination.x = (t.x >= 0) ? t.x : -t.x;
        if (ymask) destination.y = (t.y >= 0) ? t.y : -t.y;
        if (zmask) destination.z = (t.z >= 0) ? t.z : -t.z;
        if (wmask) destination.w = (t.w >= 0) ? t.w : -t.w;
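
    The Issues section notes that SUB and ABS can be expressed with ADD
    and MAX idioms.  The following C sketch checks that equivalence on
    sample values.  It is an illustration under assumptions: the vp_reg
    type and all function names are hypothetical, the MAX semantics
    ((t >= u) ? t : u per component) are assumed from the base
    NV_vertex_program extension, and swizzles and write masks are
    omitted.

```c
/* Hypothetical 4-component register type used only for this sketch. */
typedef struct { float x, y, z, w; } vp_reg;

static vp_reg vp_negate(vp_reg a)
{
    vp_reg r = { -a.x, -a.y, -a.z, -a.w };
    return r;
}

static vp_reg vp_add(vp_reg a, vp_reg b)
{
    vp_reg r = { a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w };
    return r;
}

/* Assumed MAX semantics: (t >= u) ? t : u for each component. */
static vp_reg vp_max(vp_reg a, vp_reg b)
{
    vp_reg r = { a.x >= b.x ? a.x : b.x, a.y >= b.y ? a.y : b.y,
                 a.z >= b.z ? a.z : b.z, a.w >= b.w ? a.w : b.w };
    return r;
}

/* SUB as specified above: first source minus second source. */
static vp_reg vp_sub(vp_reg a, vp_reg b)
{
    vp_reg r = { a.x - b.x, a.y - b.y, a.z - b.z, a.w - b.w };
    return r;
}

/* ABS as specified above: (t >= 0) ? t : -t for each component. */
static vp_reg vp_abs(vp_reg a)
{
    vp_reg r = { a.x >= 0 ? a.x : -a.x, a.y >= 0 ? a.y : -a.y,
                 a.z >= 0 ? a.z : -a.z, a.w >= 0 ? a.w : -a.w };
    return r;
}

static int vp_equal(vp_reg a, vp_reg b)
{
    return a.x == b.x && a.y == b.y && a.z == b.z && a.w == b.w;
}
```

    In vertex program terms, vp_add(a, vp_negate(b)) corresponds to
    "ADD R0, R1, -R2;" and vp_max(a, vp_negate(a)) corresponds to
    "MAX R0, R1, -R1;", which SUB and ABS express more directly.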

    Insert sections 2.14.A and 2.14.B after section 2.14.4

    "2.14.A  Version 1.1 Vertex Programs

    Version 1.1 vertex programs provide support for the DPH, RCC, SUB,
    and ABS instructions (see sections 2.14.1.10.18 through 2.14.1.10.21).

    Version 1.1 vertex programs are loaded with the LoadProgramNV command
    (see section 2.14.1.7).  The target must be VERTEX_PROGRAM_NV to
    load a version 1.1 vertex program.  The initial "!!VP1.1" token
    designates that the program should be parsed and treated as a
    version 1.1 vertex program.

    Version 1.1 programs must conform to a grammar that extends the
    grammar for version 1.0 vertex programs.  The version 1.1 vertex
    program grammar for syntactically valid sequences is the same as the
    grammar defined in section 2.14.1.7 with the following modified rules:

    <program>              ::= "!!VP1.1" <optionSequence> <instructionSequence> "END"

    <optionSequence>       ::= <optionSequence> <option>
                             | ""

    <option>               ::= "OPTION" "NV_position_invariant" ";"

    <VECTORop>             ::= "MOV"
                             | "LIT"
                             | "ABS"

    <SCALARop>             ::= "RCP"
                             | "RSQ"
                             | "EXP"
                             | "LOG"
                             | "RCC"

    <BINop>                ::= "MUL"
                             | "ADD"
                             | "DP3"
                             | "DP4"
                             | "DST"
                             | "MIN"
                             | "MAX"
                             | "SLT"
                             | "SGE"
                             | "DPH"
                             | "SUB"

    <optionalSign>         ::= "-"
                             | "+"
                             | ""

    Except for supporting the additional DPH, RCC, SUB, and ABS
    instructions, version 1.1 vertex programs with no options specified
    otherwise behave in the same manner as version 1.0 vertex programs.
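
    One way to picture the modified opcode rules is as a lookup from
    opcode name to operand class.  The sketch below is an assumption-
    laden illustration, not a parser: the op_class enum and function
    names are hypothetical, and the handling of MAD (three vector
    inputs) and ARL is inferred from Table X.4 rather than from the
    rules listed above.

```c
#include <string.h>

/* Hypothetical operand classes mirroring the grammar's opcode rules. */
typedef enum { OP_UNKNOWN, OP_ARL, OP_VECTOR, OP_SCALAR, OP_BIN, OP_TRI } op_class;

/* Returns 1 if name appears in a NULL-terminated list of opcode names. */
static int in_list(const char *name, const char *const *list)
{
    for (; *list != NULL; ++list)
        if (strcmp(name, *list) == 0)
            return 1;
    return 0;
}

static op_class classify_opcode(const char *name)
{
    static const char *const vector_ops[] = { "MOV", "LIT", "ABS", NULL };
    static const char *const scalar_ops[] = { "RCP", "RSQ", "EXP", "LOG",
                                              "RCC", NULL };
    static const char *const bin_ops[] = { "MUL", "ADD", "DP3", "DP4", "DST",
                                           "MIN", "MAX", "SLT", "SGE", "DPH",
                                           "SUB", NULL };

    if (in_list(name, vector_ops)) return OP_VECTOR;
    if (in_list(name, scalar_ops)) return OP_SCALAR;
    if (in_list(name, bin_ops))    return OP_BIN;
    if (strcmp(name, "MAD") == 0)  return OP_TRI;  /* assumed from Table X.4 */
    if (strcmp(name, "ARL") == 0)  return OP_ARL;  /* assumed from Table X.4 */
    return OP_UNKNOWN;
}
```

    The four opcodes new in version 1.1 land in existing classes (ABS
    with the vector operations, RCC with the scalar operations, DPH and
    SUB with the binary operations), which is why only these three
    opcode rules needed to change.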

    2.14.B  Position-invariant Vertex Program Option

    By default, vertex programs are _not_ guaranteed to be
    position-invariant because there is no guarantee made that the
    way a vertex program might compute its homogeneous position is
    exactly identical to the way conventional OpenGL transformation
    computes its homogeneous positions.  However, in a position-invariant
    vertex program, the homogeneous position (HPOS) is not output by
    the program.  Instead, the OpenGL implementation is expected to
    compute the HPOS for position-invariant vertex programs in a manner
    exactly identical to how the homogeneous position and window position
    are computed for a vertex by conventional OpenGL transformation
    (assuming vertex weighting and vertex blending are disabled).  In this
    way position-invariant vertex programs guarantee correct multi-pass
    rendering semantics in cases where multiple passes are rendered with
    conventional OpenGL transformation and position-invariant vertex
    programs and the second and subsequent passes use an EQUAL depth test.

    If an <option> with the identifier "NV_position_invariant" is
    encountered during the parsing of the program, the specified program
    is presumed to be position-invariant.

    When a position-invariant vertex program is specified, the
    <vertexResultRegName> rule is replaced with the following rule
    (that does not provide "HPOS"):

    <vertexResultRegName>  ::= "COL0"
                             | "COL1"
                             | "BFC0"
                             | "BFC1"
                             | "FOGC"
                             | "PSIZ"
                             | "TEX0"
                             | "TEX1"
                             | "TEX2"
                             | "TEX3"
                             | "TEX4"
                             | "TEX5"
                             | "TEX6"
                             | "TEX7"

    While position-invariant version 1.1 vertex programs provide
    position-invariance, such programs do not provide support for
    relative program parameter addressing.  The <relProgParamReg> rule
    for version 1.1 position-invariant vertex programs is replaced by
    (eliminating the relative addressing cases):

    <relProgParamReg>      ::= "c" "[" <addrReg> "]"

    Note that while the ARL instruction is still available to
    position-invariant version 1.1 vertex programs, it provides no
    meaningful functionality without support for relative addressing.

    The semantic restriction for vertex program instruction length is
    changed in the case of position-invariant vertex programs to the
    following: A position-invariant vertex program fails to load if it
    contains more than 124 instructions.

    "
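
    The restrictions in section 2.14.B can be approximated by a few
    textual checks, sketched below.  This is a rough illustration and
    not how a driver validates programs: it does no real parsing, the
    names check_vp11(), count_char(), and count_occurrences() are
    hypothetical, and the 128-instruction limit for programs without the
    option is assumed from the base NV_vertex_program extension.

```c
#include <string.h>

enum { VP_OK = 0, VP_BAD_HEADER, VP_WRITES_HPOS, VP_TOO_LONG };

/* Counts occurrences of a single character in text. */
static int count_char(const char *text, char c)
{
    int n = 0;
    for (; *text != '\0'; ++text)
        if (*text == c)
            ++n;
    return n;
}

/* Counts non-overlapping occurrences of needle in text. */
static int count_occurrences(const char *text, const char *needle)
{
    int n = 0;
    const char *p = strstr(text, needle);
    while (p != NULL) {
        ++n;
        p = strstr(p + strlen(needle), needle);
    }
    return n;
}

/* Textual approximation of the checks described in section 2.14.B. */
static int check_vp11(const char *program)
{
    if (strncmp(program, "!!VP1.1", 7) != 0)
        return VP_BAD_HEADER;

    int invariant = strstr(program, "NV_position_invariant") != NULL;
    /* Every statement ends in ';'; OPTION statements are not instructions. */
    int instructions = count_char(program, ';')
                     - count_occurrences(program, "OPTION");

    if (invariant && strstr(program, "HPOS") != NULL)
        return VP_WRITES_HPOS;   /* HPOS is not a valid result register */
    if (instructions > (invariant ? 124 : 128))   /* 128 assumed, see lead-in */
        return VP_TOO_LONG;
    return VP_OK;
}
```

    A position-invariant program with 124 instructions passes these
    checks, while one with 125 instructions, or one that names HPOS as
    a result register, is rejected.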

Additions to Chapter 4 of the OpenGL 1.2.1 Specification (Per-Fragment
Operations and the Framebuffer)

    None

Additions to Chapter 5 of the OpenGL 1.2.1 Specification (Special Functions)

    None

Additions to Chapter 6 of the OpenGL 1.2.1 Specification (State and
State Requests)

    None

Additions to the AGL/GLX/WGL Specifications

    None

GLX Protocol

    None

Errors

    None

New State

    None

Revision History

    Rev.    Date    Author     Changes
    ----  -------- ---------  ----------------------------------------
      8   03/04/14 mjk        RCC decimal value corrections