extensions/NV/NV_fragment_program_option.txt

Name

    NV_fragment_program_option

Name Strings

    GL_NV_fragment_program_option

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Status

    Shipping.

Version

    Last Modified:      05/27/2005
    NVIDIA Revision:    4

Number

    303

Dependencies

    ARB_fragment_program is required.

Overview

    This extension provides additional fragment program functionality
    to extend the standard ARB_fragment_program language and execution
    environment.  ARB programs wishing to use this added functionality
    need only add:

        OPTION NV_fragment_program;

    to the beginning of their fragment programs.

    The functionality provided by this extension, which is roughly
    equivalent to that provided by the NV_fragment_program extension,
    includes:

      * increased control over precision in arithmetic computations and
        storage,

      * data-dependent conditional writemasks,

      * an absolute value operator on scalar and swizzled operand loads,

      * instructions to compute partial derivatives, and perform texture
        lookups using specified partial derivatives,

      * fully orthogonal "set on" instructions,

      * instructions to compute a reflection vector and perform a 2D
        coordinate transform, and

      * instructions to pack and unpack multiple quantities into a single
        component.

Issues

    Why is this a separate extension, rather than just an additional
    feature of NV_fragment_program?

      RESOLVED:  The NV_fragment_program specification was complete
      (with a published implementation) prior to the completion of
      ARB_fragment_program.  Future NVIDIA fragment program extensions
      should contain extensions to the ARB_fragment_program execution
      environment as a standard feature.

    Should a similar option be provided to expose ARB_fragment_program
    features not found in NV_fragment_program (e.g., state bindings,
    certain "macro" instructions) under the NV_fragment_program
    interface?

      RESOLVED:  No.  Why not just write an ARB program?

    The ARB_fragment_program spec has a minor grammar bug that requires
    that inline scalar constants used as scalar operands include a
    component selector.  In other words, you have to say "11.0.x" to
    use the constant "11.0".  What should we do here?

      RESOLVED:  The NV_fragment_program_option grammar will correct
      this problem, which should be fixed in future revisions to the
      ARB language.

New Procedures and Functions

    None.

New Tokens

    None.

Additions to Chapter 2 of the OpenGL 1.2.1 Specification (OpenGL Operation)

    None.

Additions to Chapter 3 of the OpenGL 1.2.1 Specification (Rasterization)

    Modify Section 3.11.2 of ARB_fragment_program (Fragment Program
    Grammar and Restrictions):

    (mostly add to existing grammar rules, modify a few existing grammar
    rules -- changes marked with "***")

    <optionName>            ::= "NV_fragment_program"

    <TexInstruction>        ::= <TXDop_instruction>

    <VECTORop>              ::= "DDX"
                              | "DDY"
                              | "PK2H"
                              | "PK2US"
                              | "PK4B"
                              | "PK4UB"

    <SCALARop>              ::= "UP2H"
                              | "UP2US"
                              | "UP4B"
                              | "UP4UB"

    <BINop>                 ::= "RFL"
                              | "SEQ"
                              | "SFL"
                              | "SGT"
                              | "SLE"
                              | "SNE"
                              | "STR"

    <TRIop>                 ::= "X2D"

    <TXDop_instruction>     ::= <TXDop> <instResult> "," <instOperandV> ","
                                <instOperandV> "," <instOperandV> ","
                                <texTarget>

    <TXDop>                 ::= "TXD"

    <killCond>              ::= <ccTest>

    <instOperandV>          ::= <instOperandAbsV>

    <instOperandAbsV>       ::= <optSign> "|" <instOperandBaseV> "|"

    <instOperandS>          ::= <instOperandAbsS>

    <instOperandAbsS>       ::= <optSign> "|" <instOperandBaseS> "|"

    <instResult>            ::= <instResultCC>

    <instResultCC>          ::= <instResultBase> <ccMask>

    <TEMP_statement>        ::= <varSize> "TEMP" <varNameList>

    <OUTPUT_statement>      ::= <varSize> "OUTPUT" <establishName> "="
                                  <resultUseD>

    <varSize>               ::= "SHORT"
                              | "LONG"

    <paramUseV>             ::= <constantScalar>
                                  (*** instead of <constantScalar>
                                       <swizzleSuffix>)

    <paramUseS>             ::= <constantScalar>
                                  (*** instead of <constantScalar>
                                       <scalarSuffix>)

    <ccMask>                ::= "(" <ccTest> ")"

    <ccTest>                ::= <ccMaskRule> <swizzleSuffix>

    <ccMaskRule>            ::= "EQ"
                              | "GE"
                              | "GT"
                              | "LE"
                              | "LT"
                              | "NE"
                              | "TR"
                              | "FL"

    (modify language describing reserved keywords) The following strings
    are reserved keywords and may not be used as identifiers:

        ALIAS, ATTRIB, END, OPTION, OUTPUT, PARAM, TEMP, fragment,
        program, result, state, and texture.

    Additionally, all the instruction names (and variants) listed in
    Table X.5 are reserved.

    Modify Section 3.11.3.3, Fragment Program Temporaries

    (replace second paragraph) Fragment program temporary variables
    can be declared explicitly using the <TEMP_statement> grammar
    rule.  Each such statement can declare one or more temporaries.
    Temporary declaration can optionally specify a variable size,
    using the <varSize> grammar rule.  Variables declared as "SHORT"
    will be represented with at least 16 bits per component (5 bits of
    exponent, 10 bits of mantissa).  Variables declared as "LONG" will be
    represented with at least 32 bits per component (8 bits of exponent,
    23 bits of mantissa).  Fragment program temporary variables cannot
    be declared implicitly.

    Modify Section 3.11.3.4, Fragment Program Results

    (replace second paragraph) Fragment program result variables
    can be declared explicitly using the <OUTPUT_statement> grammar
    rule, or implicitly using the <resultBinding> grammar rule in an
    executable instruction.  Explicit result variable declaration can
    optionally specify a variable size, using the <varSize> grammar rule.
    Variables declared as "SHORT" will be represented with at least 16
    bits per component (5 bits of exponent, 10 bits of mantissa).
    Variables declared as "LONG" will be represented with at least
    32 bits per component (8 bits of exponent, 23 bits of mantissa).
    Each fragment program result variable is bound to a fragment attribute
    used in subsequent back-end processing.  The set of fragment program
    result variable bindings is given in Table X.3.

    (add to the end of the section) A fragment program will fail to load
    if it contains instructions writing to variables bound to the same
    result, but declared with different sizes.

    Add New Section 3.11.3.X, Condition Code Register (insert after
    Section 3.11.3.4, Fragment Program Results)

    The fragment program condition code register is a single
    four-component vector.  Each component of this register is one of four
    enumerated values: GT (greater than), EQ (equal), LT (less than),
    or UN (unordered).  The condition code register can be used to mask
    writes to registers and to evaluate conditional branches.

    Most fragment program instructions can optionally update the condition
    code register.  When a fragment program instruction updates the
    condition code register, a condition code component is set to LT if
    the corresponding component of the result is less than zero, EQ if it
    is equal to zero, GT if it is greater than zero, and UN if it is NaN
    (not a number).

    The condition code register is initialized to a vector of EQ values
    each time a fragment program executes.

    Modify Section 3.11.4, Fragment Program Execution Environment

    (modify instruction table) There are fifty-two fragment program
    instructions.  Fragment program instructions may have up to sixteen
    variants, including a suffix of "R", "H", or "X" to specify arithmetic
    precision (section 3.11.4.X), a suffix of "C" to allow an update
    of the condition code register (section 3.11.3.X), and a suffix of
    "_SAT" to clamp the result vector components to the range [0,1]
    (section 3.11.4.3).  For example, the sixteen forms of the "ADD"
    instruction are "ADD", "ADDR", "ADDH", "ADDX", "ADDC", "ADDRC",
    "ADDHC", "ADDXC", "ADD_SAT", "ADDR_SAT", "ADDH_SAT", "ADDX_SAT",
    "ADDC_SAT", "ADDRC_SAT", "ADDHC_SAT", and "ADDXC_SAT".  The
    instructions and their respective input and output parameters are
    summarized in Table X.5.

               Modifiers
      Instr.   R H X C S  Inputs  Output   Description
      -------  - - - - -  ------  ------   --------------------------------
      ABS      X X X X X  v       v        absolute value
      ADD      X X X X X  v,v     v        add
      CMP      - - - - X  v,v,v   v        compare
      COS      X X - X X  s       ssss     cosine with reduction to [-PI,PI]
      DDX      X X - X X  v       v        partial derivative relative to X
      DDY      X X - X X  v       v        partial derivative relative to Y
      DP3      X X X X X  v,v     ssss     3-component dot product
      DP4      X X X X X  v,v     ssss     4-component dot product
      DPH      X X X X X  v,v     ssss     homogeneous dot product
      DST      X X - X X  v,v     v        distance vector
      EX2      X X - X X  s       ssss     exponential base 2
      FLR      X X X X X  v       v        floor
      FRC      X X X X X  v       v        fraction
      KIL      - - - - -  v or c  -        kill fragment
      LG2      X X - X X  s       ssss     logarithm base 2
      LIT      X X - X X  v       v        compute light coefficients
      LRP      X X X X X  v,v,v   v        linear interpolation
      MAD      X X X X X  v,v,v   v        multiply and add
      MAX      X X X X X  v,v     v        maximum
      MIN      X X X X X  v,v     v        minimum
      MOV      X X X X X  v       v        move
      MUL      X X X X X  v,v     v        multiply
      PK2H     - - - - -  v       ssss     pack two 16-bit floats
      PK2US    - - - - -  v       ssss     pack two unsigned 16-bit scalars
      PK4B     - - - - -  v       ssss     pack four signed 8-bit scalars
      PK4UB    - - - - -  v       ssss     pack four unsigned 8-bit scalars
      POW      X X - X X  s,s     ssss     exponentiate
      RCP      X X - X X  s       ssss     reciprocal
      RFL      X X - X X  v,v     v        reflection vector
      RSQ      X X - X X  s       ssss     reciprocal square root
      SCS      - - - - X  s       ss--     sine/cosine without reduction
      SEQ      X X X X X  v,v     v        set on equal
      SFL      X X X X X  v,v     v        set on false
      SGE      X X X X X  v,v     v        set on greater than or equal
      SGT      X X X X X  v,v     v        set on greater than
      SIN      X X - X X  s       ssss     sine with reduction to [-PI,PI]
      SLE      X X X X X  v,v     v        set on less than or equal
      SLT      X X X X X  v,v     v        set on less than
      SNE      X X X X X  v,v     v        set on not equal
      STR      X X X X X  v,v     v        set on true
      SUB      X X X X X  v,v     v        subtract
      SWZ      - - - - X  v       v        extended swizzle
      TEX      - - - X X  v       v        texture sample
      TXB      - - - X X  v       v        texture sample with bias
      TXD      - - - X X  v,v,v   v        texture sample w/partials
      TXP      - - - X X  v       v        texture sample with projection
      UP2H     - - - X X  s       v        unpack two 16-bit floats
      UP2US    - - - X X  s       v        unpack two unsigned 16-bit scalars
      UP4B     - - - X X  s       v        unpack four signed 8-bit scalars
      UP4UB    - - - X X  s       v        unpack four unsigned 8-bit scalars
      X2D      X X - X X  v,v,v   v        2D coordinate transformation
      XPD      - - - - X  v,v     v        cross product

      Table X.5:  Summary of fragment program instructions.  The columns
      "R", "H", "X", "C", and "S" indicate whether the "R", "H", or "X"
      precision modifiers, the C condition code update modifier, and the
      "_SAT" saturation modifier, respectively, are supported for the
      opcode.  In the input/output columns, "v" indicates a floating-point
      vector input or output, "s" indicates a floating-point scalar
      input, "ssss" indicates a scalar output replicated across a
      4-component result vector, "ss--" indicates two scalar outputs in
      the first two components, and "c" indicates a condition code test.
      Instructions described as "texture sample" also specify a texture
      image unit identifier and a texture target.
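
    The sixteen variants listed for "ADD" follow a regular pattern:
    one of four precision choices (none, "R", "H", "X"), an optional
    "C", and an optional "_SAT".  The C sketch below generates them
    in the order given above.  The function name and the index
    encoding are illustrative only, and the sketch applies only to
    opcodes for which Table X.5 marks every modifier as supported.

```c
#include <assert.h>
#include <stdio.h>

/* Writes variant number idx (0..15) of opcode into out.
 * Bits 0-1 of idx select the precision suffix, bit 2 selects "C",
 * and bit 3 selects "_SAT" (hypothetical encoding for illustration). */
void OpcodeVariant(const char *opcode, int idx, char *out, size_t outSize)
{
    static const char *precision[4] = { "", "R", "H", "X" };

    assert(idx >= 0 && idx < 16);
    snprintf(out, outSize, "%s%s%s%s",
             opcode,
             precision[idx & 3],
             (idx & 4) ? "C" : "",
             (idx & 8) ? "_SAT" : "");
}
```

    Calling OpcodeVariant("ADD", i, ...) for i = 0 through 15 yields
    "ADD", "ADDR", "ADDH", "ADDX", "ADDC", and so on through
    "ADDXC_SAT".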

    Modify Section 3.11.4.1, Fragment Program Operands

    (add prior to the discussion of negation) A component-wise absolute
    value operation can optionally be performed on the operand if the operand
    is surrounded with two "|" characters.  For example, "|src|" indicates
    that a component-wise absolute value operation should be performed on
    the variable named "src".  In terms of the grammar, this operation
    is performed if the <instOperandV> or <instOperandS> grammar rules
    match <instOperandAbsV> or <instOperandAbsS>, respectively.

    (modify operand load pseudo-code) The following pseudo-code spells
    out the operand generation process.  In the example, "float" is a
    floating-point scalar type, while "floatVec" is a four-component
    vector.  "source" refers to the register used for the operand,
    matching the <srcReg> rule.  "abs" is TRUE if an absolute value
    operation should be performed on the operand (<instOperandAbsV> or
    <instOperandAbsS> rules).  "negate" is TRUE if the <optionalSign> rule
    in <scalarSrcReg> or <swizzleSrcReg> matches "-" and FALSE otherwise.
    The ".c***", ".*c**", ".**c*", ".***c" modifiers refer to the x,
    y, z, and w components obtained by the swizzle operation; the ".c"
    modifier refers to the single component selected for a scalar load.

      floatVec VectorLoad(floatVec source)
      {
          floatVec operand;

          operand.x = source.c***;
          operand.y = source.*c**;
          operand.z = source.**c*;
          operand.w = source.***c;
          if (abs) {
             operand.x = abs(operand.x);
             operand.y = abs(operand.y);
             operand.z = abs(operand.z);
             operand.w = abs(operand.w);
          }
          if (negate) {
             operand.x = -operand.x;
             operand.y = -operand.y;
             operand.z = -operand.z;
             operand.w = -operand.w;
          }

          return operand;
      }

      float ScalarLoad(floatVec source)
      {
          float operand;

          operand = source.c;
          if (abs) {
            operand = abs(operand);
          }
          if (negate) {
            operand = -operand;
          }

          return operand;
      }
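
    For readers who want to experiment with the operand load rules,
    the pseudocode above can be sketched in C roughly as follows.
    The swizzle is passed as an array of component indices and the
    modifiers as flags; these parameter conventions, and the helper
    names Component and ApplyModifiers, are assumptions made for the
    sketch and do not appear in the grammar.

```c
#include <assert.h>
#include <math.h>

typedef struct { float x, y, z, w; } floatVec;

/* Returns component i of v (0 = x, 1 = y, 2 = z, 3 = w). */
static float Component(floatVec v, int i)
{
    assert(i >= 0 && i < 4);
    switch (i) {
    case 0:  return v.x;
    case 1:  return v.y;
    case 2:  return v.z;
    default: return v.w;
    }
}

/* Absolute value is applied before negation, so "-|src|" is never positive. */
static float ApplyModifiers(float value, int useAbs, int negate)
{
    if (useAbs) value = fabsf(value);
    if (negate) value = -value;
    return value;
}

/* swizzle[i] names the source component loaded into result component i. */
floatVec VectorLoad(floatVec source, const int swizzle[4], int useAbs, int negate)
{
    floatVec operand;
    operand.x = ApplyModifiers(Component(source, swizzle[0]), useAbs, negate);
    operand.y = ApplyModifiers(Component(source, swizzle[1]), useAbs, negate);
    operand.z = ApplyModifiers(Component(source, swizzle[2]), useAbs, negate);
    operand.w = ApplyModifiers(Component(source, swizzle[3]), useAbs, negate);
    return operand;
}

/* select names the single source component used for a scalar operand. */
float ScalarLoad(floatVec source, int select, int useAbs, int negate)
{
    return ApplyModifiers(Component(source, select), useAbs, negate);
}
```

    With source = (1,-2,3,-4), the operand "-|src.zyxw|" corresponds
    to VectorLoad(source, {2,1,0,3}, 1, 1) and loads (-3,-2,-1,-4).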

    Add New Section 3.11.4.X, Fragment Program Operation Precision
    (insert after Section 3.11.4.2, Fragment Program Parameter Arrays)

    Fragment program implementations may be able to perform instructions
    with different levels of arithmetic precision.  The "R", "H", and
    "X" opcode precision modifiers (Section 3.11.4) specify the minimum
    precision used to perform arithmetic operations.  Instructions with
    an "R" precision modifier will be carried out at no less than
    IEEE single-precision floating-point (8 bits of exponent, 23 bits
    of mantissa).  Instructions with an "H" precision modifier will
    be carried out at no less than 16-bit floating-point precision (5
    bits of exponent, 10 bits of mantissa).  Instructions with an "X"
    precision modifier will be carried out at no less than signed 12-bit
    fixed-point precision (two's complement with 10 fraction bits).

    If the result of a computation overflows the range of numbers
    supported by the instruction precision, the result will be +/-INF
    (infinity) for "R" and "H" precision, or -2048/1024 or +2047/1024 for
    "X" precision.

    If no precision modifier is specified, the instruction will be carried
    out with at least as much precision as the destination variable.
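
    As an illustration of the "X" precision range and its overflow
    behavior, the following C sketch converts a value to signed
    12-bit fixed point with 10 fraction bits and back.  Rounding to
    nearest and the treatment of NaN are assumptions of the sketch;
    this section states only the minimum precision and the overflow
    values.  The name QuantizeX is hypothetical.

```c
#include <assert.h>

/* Returns value as it would be represented in signed 12-bit fixed point
 * with 10 fraction bits, clamping overflow to -2048/1024 or +2047/1024. */
float QuantizeX(float value)
{
    float scaled;
    int fixed;

    if (value != value) return 0.0f;   /* NaN: arbitrary choice, not from the text */
    scaled = value * 1024.0f;
    if (scaled <= -2048.0f) {
        fixed = -2048;                 /* overflow toward negative values */
    } else if (scaled >= 2047.0f) {
        fixed = 2047;                  /* overflow toward positive values */
    } else {
        /* round to nearest (assumed rounding mode) */
        fixed = (int)(scaled < 0.0f ? scaled - 0.5f : scaled + 0.5f);
    }
    assert(fixed >= -2048 && fixed <= 2047);
    return (float)fixed / 1024.0f;
}
```

    Under these assumptions, QuantizeX(3.0) yields 2047/1024 and
    QuantizeX(-5.0) yields -2.0, while values such as 0.25 that are
    multiples of 1/1024 pass through unchanged.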

    Rewrite Section 3.11.4.3,  Fragment Program Destination Register
    Update

    Most fragment program instructions write a 4-component result vector
    to a single temporary or fragment result register.  Writes to
    individual components of the destination register are controlled
    by individual component write masks specified as part of the
    instruction.

    The component write mask is specified by the <optionalMask> rule
    found in the <maskedDstReg> rule.  If the optional mask is "",
    all components are enabled.  Otherwise, the optional mask names
    the individual components to enable.  The characters "x", "y",
    "z", and "w" match the x, y, z, and w components, respectively.
    For example, an optional mask of ".xzw" indicates that the x, z,
    and w components should be enabled for writing but the y component
    should not.  The grammar requires that the destination register mask
    components must be listed in "xyzw" order.

    The condition code write mask is specified by the <ccMask> rule found
    in the <instResultCC> rule.  The condition code register is loaded and
    swizzled according to the swizzle codes specified by <swizzleSuffix>.
    Each component of the swizzled condition code is tested according to
    the rule given by <ccMaskRule>.  <ccMaskRule> may have the values
    "EQ", "NE", "LT", "GE", "LE", or "GT", which mean to enable writes
    if the corresponding condition code field evaluates to equal,
    not equal, less than, greater than or equal, less than or equal,
    or greater than, respectively.  Comparisons involving condition
    codes of "UN" (unordered) evaluate to true for "NE" and false
    otherwise.  For example, if the condition code is (GT,LT,EQ,GT)
    and the condition code mask is "(NE.zyxw)", the swizzle operation
    will load (EQ,LT,GT,GT) and the mask will thus enable writes on
    the y, z, and w components.  In addition, "TR" always enables writes
    and "FL" always disables writes, regardless of the condition code.
    If the condition code mask is empty, it is treated as "(TR)".

    Each component of the destination register is updated with the result
    of the fragment program instruction if and only if the component is
    enabled for writes by both the component write mask and the condition
    code write mask.  Otherwise, the component of the destination register
    remains unchanged.

    A fragment program instruction can also optionally update the
    condition code register.  The condition code is updated if
    the condition code register update suffix "C" is present in the
    instruction.  The instruction "ADDC" will update the condition code;
    the otherwise equivalent instruction "ADD" will not.  If condition
    code updates are enabled, each component of the destination register
    enabled for writes is compared to zero.  The corresponding component
    of the condition code is set to "LT", "EQ", or "GT", if the written
    component is less than, equal to, or greater than zero, respectively.
    Condition code components are set to "UN" if the written component is
    NaN (not a number).  Values of -0.0 and +0.0 both evaluate to "EQ".
    If a component of the destination register is not enabled for writes,
    the corresponding condition code component is also unchanged.

    In the following example code,

        # R1=(-2, 0, 2, NaN)              R0                  CC
        MOVC R0, R1;               # ( -2,  0,   2, NaN) (LT,EQ,GT,UN)
        MOVC R0.xyz, R1.yzwx;      # (  0,  2, NaN, NaN) (EQ,GT,UN,UN)
        MOVC R0 (NE), R1.zywx;     # (  0,  0, NaN,  -2) (EQ,EQ,UN,LT)

    the first instruction writes (-2,0,2,NaN) to R0 and updates the
    condition code to (LT,EQ,GT,UN).  In the second instruction, only the
    "x", "y", and "z" components of R0 and the condition code are updated,
    so R0 ends up with (0,2,NaN,NaN) and the condition code ends up with
    (EQ,GT,UN,UN).  In the third instruction, the condition code mask
    disables writes to the x component (its condition code field is "EQ"),
    so R0 ends up with (0,0,NaN,-2) and the condition code ends up with
    (EQ,EQ,UN,LT).

    The following pseudocode illustrates the process of writing a result
    vector to the destination register.  In the pseudocode, "instrMask"
    refers to the component write mask given by the <optionalMask>
    rule.  "ccMaskRule" refers to the condition code mask rule given
    by <ccMask> and "updatecc" is TRUE if and only if condition code
    updates are enabled.  "result", "destination", and "cc" refer to
    the result vector, the register selected by <dstRegister> and the
    condition code, respectively.  Condition codes do not exist in the
    VP1 execution environment.

      boolean TestCC(CondCode field) {
          switch (ccMaskRule) {
          case "EQ":  return (field == "EQ");
          case "NE":  return (field != "EQ");
          case "LT":  return (field == "LT");
          case "GE":  return (field == "GT" || field == "EQ");
          case "LE":  return (field == "LT" || field == "EQ");
          case "GT":  return (field == "GT");
          case "TR":  return TRUE;
          case "FL":  return FALSE;
          case "":    return TRUE;
          }
      }

      enum GenerateCC(float value) {
        if (isnan(value)) {
          return UN;
        } else if (value < 0) {
          return LT;
        } else if (value == 0) {
          return EQ;
        } else {
          return GT;
        }
      }

      void UpdateDestination(floatVec destination, floatVec result)
      {
          floatVec merged;
          ccVec    mergedCC;

          // Merge the converted result into the destination register, under
          // control of the compile- and run-time write masks.
          merged = destination;
          mergedCC = cc;
          if (instrMask.x && TestCC(cc.c***)) {
              merged.x = result.x;
              if (updatecc) mergedCC.x = GenerateCC(result.x);
          }
          if (instrMask.y && TestCC(cc.*c**)) {
              merged.y = result.y;
              if (updatecc) mergedCC.y = GenerateCC(result.y);
          }
          if (instrMask.z && TestCC(cc.**c*)) {
              merged.z = result.z;
              if (updatecc) mergedCC.z = GenerateCC(result.z);
          }
          if (instrMask.w && TestCC(cc.***c)) {
              merged.w = result.w;
              if (updatecc) mergedCC.w = GenerateCC(result.w);
          }

          // Write out the new destination register and condition code.
          destination = merged;
          cc = mergedCC;
      }
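
    The destination update can also be exercised outside a GL
    implementation.  The C sketch below follows the pseudocode above;
    the State structure, the bit encoding of instrMask (bit 0 = x
    through bit 3 = w), and the ccSwizzle index array are choices
    made for the sketch and are not defined by this specification.

```c
#include <assert.h>

typedef enum { CC_EQ, CC_GT, CC_LT, CC_UN } CondCode;
typedef enum { RULE_EQ, RULE_GE, RULE_GT, RULE_LE,
               RULE_LT, RULE_NE, RULE_TR, RULE_FL } CCMaskRule;

typedef struct {
    float    reg[4];   /* destination register, x..w */
    CondCode cc[4];    /* condition code register, x..w */
} State;

static int TestCC(CCMaskRule rule, CondCode field)
{
    switch (rule) {
    case RULE_EQ: return field == CC_EQ;
    case RULE_NE: return field != CC_EQ;   /* also true for UN */
    case RULE_LT: return field == CC_LT;
    case RULE_GE: return field == CC_GT || field == CC_EQ;
    case RULE_LE: return field == CC_LT || field == CC_EQ;
    case RULE_GT: return field == CC_GT;
    case RULE_FL: return 0;
    default:      return 1;                /* RULE_TR, or an empty mask */
    }
}

static CondCode GenerateCC(float value)
{
    if (value != value) return CC_UN;      /* only NaN compares unequal to itself */
    if (value < 0.0f)   return CC_LT;
    if (value == 0.0f)  return CC_EQ;      /* covers -0.0 and +0.0 */
    return CC_GT;
}

/* ccSwizzle[i] names the condition code component tested for component i. */
void UpdateDestination(State *s, const float result[4], unsigned instrMask,
                       CCMaskRule rule, const int ccSwizzle[4], int updatecc)
{
    CondCode oldCC[4];
    int i;

    /* All tests use the condition code as it was before the instruction. */
    for (i = 0; i < 4; i++) oldCC[i] = s->cc[i];

    for (i = 0; i < 4; i++) {
        assert(ccSwizzle[i] >= 0 && ccSwizzle[i] < 4);
        if (((instrMask >> i) & 1u) && TestCC(rule, oldCC[ccSwizzle[i]])) {
            s->reg[i] = result[i];
            if (updatecc) s->cc[i] = GenerateCC(result[i]);
        }
    }
}
```

    Applying this sketch to the three MOVC instructions of the
    example above, with the source swizzles already applied to the
    result vectors, leaves the register at (0,0,NaN,-2) and the
    condition code at (EQ,EQ,UN,LT).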

    Add to Section 3.11.4.5 of ARB_fragment_program (Fragment Program
    Options):

    Section 3.11.4.5.3, NV_fragment_program Option

    If a fragment program specifies the "NV_fragment_program" option,
    the grammar will be extended to support the features found in the
    NV_fragment_program extension not present in the ARB_fragment_program
    extension, including:

      * the availability of the following instructions:

          - DDX (partial derivative relative to X),
          - DDY (partial derivative relative to Y),
          - PK2H (pack as two half floats),
          - PK2US (pack as two unsigned shorts),
          - PK4B (pack as four signed bytes),
          - PK4UB (pack as four unsigned bytes),
          - RFL (reflection vector),
          - SEQ (set on equal to),
          - SFL (set on false),
          - SGT (set on greater than),
          - SLE (set on less than or equal to),
          - SNE (set on not equal to),
          - STR (set on true),
          - TXD (texture lookup with computed partial derivatives),
          - UP2H (unpack two half floats),
          - UP2US (unpack two unsigned shorts),
          - UP4B (unpack four signed bytes),
          - UP4UB (unpack four unsigned bytes), and
          - X2D (2D coordinate transformation),

      * opcode precision suffixes "R", "H", and "X", to specify
        the precision of arithmetic operations ("R" specifies 32-bit
        floating-point computations, "H" specifies 16-bit floating-point
        computations, and "X" specifies 12-bit signed fixed-point
        computations with 10 fraction bits),

      * the availability of the "SHORT" and "LONG" variable precision
        keywords to control the size of a variable's components,

      * a four-component condition code register to hold the sign of
        result vector components (useful for comparisons),

      * a condition code update opcode suffix "C", where the results of
        the instruction are used to update the condition code register,

      * a condition code write mask operator, where the condition code
        register is swizzled and tested, and the test results are used
        to mask register writes,

      * an absolute value operator on scalar and swizzled source inputs.

    The added functionality is identical to that provided by the
    NV_fragment_program extension specification.

    Modify Section 3.11.5,  Fragment Program ALU Instruction Set

    Section 3.11.5.30,  DDX:  Derivative Relative to X

    The DDX instruction computes approximate partial derivatives of the
    four components of the single operand with respect to the X window
    coordinate to yield a result vector.  The partial derivatives are
    evaluated at the center of the pixel.

      f = VectorLoad(op0);
      result = ComputePartialX(f);

    Note that the partial derivatives obtained by this instruction are
    approximate, and derivative-of-derivative instruction sequences may
    not yield accurate second derivatives.

    Section 3.11.5.31,  DDY:  Derivative Relative to Y

    The DDY instruction computes approximate partial derivatives of the
    four components of the single operand with respect to the Y window
    coordinate to yield a result vector.  The partial derivatives are
    evaluated at the center of the pixel.

      f = VectorLoad(op0);
      result = ComputePartialY(f);

    Note that the partial derivatives obtained by this instruction are
    approximate, and derivative-of-derivative instruction sequences may
    not yield accurate second derivatives.
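
    This specification does not say how ComputePartialX and
    ComputePartialY obtain their approximations.  Purely as an
    illustration, the C sketch below uses a forward difference
    between neighboring pixel centers, which is one possible
    approach and should not be read as the method any implementation
    uses.  The PixelFunc callback and the Ramp test function are
    hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-fragment quantity evaluated at a window position. */
typedef float (*PixelFunc)(float x, float y);

/* Forward difference to the pixel one step to the right (assumed method). */
float ComputePartialX(PixelFunc f, float x, float y)
{
    assert(f != NULL);
    return f(x + 1.0f, y) - f(x, y);
}

/* Forward difference to the pixel one step up (assumed method). */
float ComputePartialY(PixelFunc f, float x, float y)
{
    assert(f != NULL);
    return f(x, y + 1.0f) - f(x, y);
}

/* Example quantity that varies linearly across the window. */
float Ramp(float x, float y)
{
    return 3.0f * x + 2.0f * y;
}
```

    For a quantity that is linear in the window coordinates, such as
    Ramp, the differences equal the true partial derivatives (3 and
    2).  For nonlinear quantities they are only approximations,
    which is consistent with the note above about second
    derivatives.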

    Section 3.11.5.32,  PK2H:  Pack Two 16-bit Floats

    The PK2H instruction converts the "x" and "y" components of
    the single operand into 16-bit floating-point format, packs the
    bit representation of these two floats into a 32-bit value, and
    replicates that value to all four components of the result vector.
    The PK2H instruction can be reversed by the UP2H instruction below.

      tmp0 = VectorLoad(op0);
      /* result obtained by combining raw bits of tmp0.x, tmp0.y */
      result.x = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
      result.y = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
      result.z = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);
      result.w = RawBits(tmp0.x) | (RawBits(tmp0.y) << 16);

    A fragment program will fail to load if it contains a PK2H instruction
    that writes its results to a variable declared as "SHORT".
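
    The RawBits operation in the pseudocode above hides the float to
    16-bit conversion.  The C sketch below shows one way to perform
    it, assuming the 16-bit format uses 1 sign bit, 5 exponent bits,
    and 10 mantissa bits with round-to-nearest-even.  The rounding
    mode and the handling of denormals are assumptions; this section
    does not specify them.  The helper name FloatToHalfBits is
    hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Converts a 32-bit float to a 16-bit (1/5/10) float bit pattern. */
static uint16_t FloatToHalfBits(float f)
{
    uint32_t bits, sign, mant, half, rem;
    int32_t exp;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    sign = (bits >> 16) & 0x8000u;
    mant = bits & 0x007FFFFFu;
    exp  = (int32_t)((bits >> 23) & 0xFFu);

    if (exp == 255)                       /* Inf or NaN */
        return (uint16_t)(sign | 0x7C00u | (mant ? 0x0200u : 0u));
    exp = exp - 127 + 15;                 /* rebias the exponent */
    if (exp >= 31)                        /* too large: +/-Inf */
        return (uint16_t)(sign | 0x7C00u);
    if (exp <= 0) {                       /* 16-bit denormal or zero */
        uint32_t shift, mid;
        if (exp < -10) return (uint16_t)sign;
        mant |= 0x00800000u;              /* make the implicit 1 explicit */
        shift = (uint32_t)(14 - exp);
        half  = mant >> shift;
        rem   = mant & ((1u << shift) - 1u);
        mid   = 1u << (shift - 1u);
        if (rem > mid || (rem == mid && (half & 1u))) half++;
        return (uint16_t)(sign | half);
    }
    half = sign | ((uint32_t)exp << 10) | (mant >> 13);
    rem  = mant & 0x1FFFu;
    /* Round to nearest even; a carry into the exponent is correct. */
    if (rem > 0x1000u || (rem == 0x1000u && (half & 1u))) half++;
    return (uint16_t)half;
}

/* Packs x into the low 16 bits and y into the high 16 bits. */
uint32_t PK2H(float x, float y)
{
    return (uint32_t)FloatToHalfBits(x) | ((uint32_t)FloatToHalfBits(y) << 16);
}
```

    Under these assumptions, PK2H(1.0, -2.0) produces the 32-bit
    pattern 0xC0003C00, which the instruction would replicate to all
    four result components.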

    Section 3.11.5.33,  PK2US:  Pack Two Unsigned 16-bit Scalars

    The PK2US instruction converts the "x" and "y" components of the
    single operand into a packed pair of 16-bit unsigned scalars.
    The scalars are represented in a bit pattern where all '0' bits
    correspond to 0.0 and all '1' bits correspond to 1.0.  The bit
    representations of the two converted components are packed into a
    32-bit value, and that value is replicated to all four components
    of the result vector.  The PK2US instruction can be reversed by the
    UP2US instruction below.

      tmp0 = VectorLoad(op0);
      if (tmp0.x < 0.0) tmp0.x = 0.0;
      if (tmp0.x > 1.0) tmp0.x = 1.0;
      if (tmp0.y < 0.0) tmp0.y = 0.0;
      if (tmp0.y > 1.0) tmp0.y = 1.0;
      us.x = round(65535.0 * tmp0.x);  /* us is a ushort vector */
      us.y = round(65535.0 * tmp0.y);
      /* result obtained by combining raw bits of us. */
      result.x = ((us.x) | (us.y << 16));
      result.y = ((us.x) | (us.y << 16));
      result.z = ((us.x) | (us.y << 16));
      result.w = ((us.x) | (us.y << 16));

    A fragment program will fail to load if it contains a PK2US instruction
    that writes its results to a variable declared as "SHORT".
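
    A runnable C version of the PK2US packing step might look like
    the sketch below.  It returns the packed 32-bit pattern once
    rather than replicating it.  Rounding by adding 0.5 before
    truncation and mapping NaN to 0 are assumptions of the sketch;
    the pseudocode above says only "round" and does not address NaN.

```c
#include <assert.h>
#include <stdint.h>

/* Clamps v to [0,1]; NaN is mapped to 0 so the later cast is well defined
 * (assumption, not taken from the text). */
static float Clamp01(float v)
{
    if (!(v > 0.0f)) return 0.0f;
    if (v > 1.0f)    return 1.0f;
    return v;
}

/* Packs x into bits 0..15 and y into bits 16..31 as unsigned 16-bit values. */
uint32_t PK2US(float x, float y)
{
    /* Adding 0.5 and truncating rounds to nearest for non-negative values. */
    uint32_t usx = (uint32_t)(65535.0f * Clamp01(x) + 0.5f);
    uint32_t usy = (uint32_t)(65535.0f * Clamp01(y) + 0.5f);

    assert(usx <= 0xFFFFu && usy <= 0xFFFFu);
    return usx | (usy << 16);
}
```

    For instance, PK2US(1.0, 0.0) gives 0x0000FFFF, and out-of-range
    inputs such as (2.0, -1.0) clamp to the same pattern.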

    Section 3.11.5.34,  PK4B:  Pack Four Signed 8-bit Scalars

    The PK4B instruction converts the four components of the single
    operand into 8-bit signed quantities.  The signed quantities
    are represented in a bit pattern where all '0' bits correspond
    to -128/127 and all '1' bits correspond to +127/127.  The bit
    representations of the four converted components are packed into a
    32-bit value, and that value is replicated to all four components
    of the result vector.  The PK4B instruction can be reversed by the
    UP4B instruction below.

      tmp0 = VectorLoad(op0);
      if (tmp0.x < -128/127) tmp0.x = -128/127;
      if (tmp0.y < -128/127) tmp0.y = -128/127;
      if (tmp0.z < -128/127) tmp0.z = -128/127;
      if (tmp0.w < -128/127) tmp0.w = -128/127;
      if (tmp0.x > +127/127) tmp0.x = +127/127;
      if (tmp0.y > +127/127) tmp0.y = +127/127;
      if (tmp0.z > +127/127) tmp0.z = +127/127;
      if (tmp0.w > +127/127) tmp0.w = +127/127;
      ub.x = round(127.0 * tmp0.x + 128.0);  /* ub is a ubyte vector */
      ub.y = round(127.0 * tmp0.y + 128.0);
      ub.z = round(127.0 * tmp0.z + 128.0);
      ub.w = round(127.0 * tmp0.w + 128.0);
      /* result obtained by combining raw bits of ub. */
      result.x = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));
      result.y = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));
      result.z = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));
      result.w = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));

    A fragment program will fail to load if it contains a PK4B instruction
    that writes its results to a variable declared as "SHORT".
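
    The biased encoding used by PK4B (0 for -128/127, 128 for 0.0,
    255 for +127/127) can be checked with a small C sketch such as
    the one below.  As with the pseudocode, the byte for each
    component is 127 * value + 128, rounded.  The rounding method
    and the mapping of NaN to the lower clamp are assumptions of the
    sketch, and the helper name SignedToByte is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Converts one component to its biased 8-bit encoding. */
static uint32_t SignedToByte(float v)
{
    const float lowest = -128.0f / 127.0f;
    uint32_t byte;

    if (!(v > lowest)) v = lowest;   /* also maps NaN to the lower clamp (assumption) */
    if (v > 1.0f)      v = 1.0f;     /* +127/127 */
    /* Add 0.5 and truncate to round the non-negative biased value. */
    byte = (uint32_t)(127.0f * v + 128.0f + 0.5f);
    assert(byte <= 255u);
    return byte;
}

/* Packs x, y, z, w into bits 0..7, 8..15, 16..23, and 24..31. */
uint32_t PK4B(float x, float y, float z, float w)
{
    return SignedToByte(x) | (SignedToByte(y) << 8) |
           (SignedToByte(z) << 16) | (SignedToByte(w) << 24);
}
```

    An all-zero operand therefore packs to 0x80808080 rather than to
    zero, which is worth keeping in mind when inspecting packed
    values.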

    Section 3.11.5.35,  PK4UB:  Pack Four Unsigned 8-bit Scalars

    The PK4UB instruction converts the four components of the single
    operand into a packed grouping of 8-bit unsigned scalars.  The scalars
    are represented in a bit pattern where all '0' bits correspond to
    0.0 and all '1' bits correspond to 1.0.  The bit representations
    of the four converted components are packed into a 32-bit value, and
    that value is replicated to all four components of the result vector.
    The PK4UB instruction can be reversed by the UP4UB instruction below.

      tmp0 = VectorLoad(op0);
      if (tmp0.x < 0.0) tmp0.x = 0.0;
      if (tmp0.x > 1.0) tmp0.x = 1.0;
      if (tmp0.y < 0.0) tmp0.y = 0.0;
      if (tmp0.y > 1.0) tmp0.y = 1.0;
      if (tmp0.z < 0.0) tmp0.z = 0.0;
      if (tmp0.z > 1.0) tmp0.z = 1.0;
      if (tmp0.w < 0.0) tmp0.w = 0.0;
      if (tmp0.w > 1.0) tmp0.w = 1.0;
      ub.x = round(255.0 * tmp0.x);  /* ub is a ubyte vector */
      ub.y = round(255.0 * tmp0.y);
      ub.z = round(255.0 * tmp0.z);
      ub.w = round(255.0 * tmp0.w);
      /* result obtained by combining raw bits of ub. */
      result.x = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));
      result.y = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));
      result.z = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));
      result.w = ((ub.x) | (ub.y << 8) | (ub.z << 16) | (ub.w << 24));

    A fragment program will fail to load if it contains a PK4UB
    instruction that writes its results to a variable declared as
    "SHORT".

    Section 3.11.5.36,  RFL:  Reflection Vector

    The RFL instruction computes the reflection of the second vector
    operand (the "direction" vector) about the vector specified by the
    first vector operand (the "axis" vector).  Both operands are treated
    as 3D vectors (the w components are ignored).  The result vector is
    another 3D vector (the "reflected direction" vector).  The length
    of the result vector, ignoring rounding errors, should equal that
    of the second operand.

      axis = VectorLoad(op0);
      direction = VectorLoad(op1);
      tmp.w = (axis.x * axis.x + axis.y * axis.y +
               axis.z * axis.z);
      tmp.x = (axis.x * direction.x + axis.y * direction.y +
               axis.z * direction.z);
      tmp.x = 2.0 * tmp.x;
      tmp.x = tmp.x / tmp.w;
      result.x = tmp.x * axis.x - direction.x;
      result.y = tmp.x * axis.y - direction.y;
      result.z = tmp.x * axis.z - direction.z;

    A fragment program will fail to load if the w component of the result
    is enabled in the component write mask.
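
    A minimal C sketch of the same computation follows; the vec3 type
    and the function name are hypothetical.  Because the pseudocode
    divides by the squared length of the axis, the axis need not be
    normalized, but an all-zero axis leads to a division by zero that
    the sketch does not guard against.

```c
typedef struct { float x, y, z; } vec3;

/* Hypothetical helper mirroring the RFL pseudocode: reflect dir about
 * axis.  The w components are ignored, so only three are modeled. */
vec3 reflect_about_axis(vec3 axis, vec3 dir)
{
    float len2 = axis.x * axis.x + axis.y * axis.y + axis.z * axis.z;
    float dot  = axis.x * dir.x + axis.y * dir.y + axis.z * dir.z;
    float k    = 2.0f * dot / len2;
    vec3 r = { k * axis.x - dir.x, k * axis.y - dir.y, k * axis.z - dir.z };
    return r;
}
```

    For example, reflecting (1, 0, 1) about the axis (0, 0, 2) gives
    (-1, 0, 1), which has the same length as the input direction.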

    Section 3.11.5.37,  SEQ:  Set on Equal

    The SEQ instruction performs a component-wise comparison of the
    two operands.  Each component of the result vector is 1.0 if the
    corresponding component of the first operand is equal to that of
    the second, and 0.0 otherwise.

      tmp0 = VectorLoad(op0);
      tmp1 = VectorLoad(op1);
      result.x = (tmp0.x == tmp1.x) ? 1.0 : 0.0;
      result.y = (tmp0.y == tmp1.y) ? 1.0 : 0.0;
      result.z = (tmp0.z == tmp1.z) ? 1.0 : 0.0;
      result.w = (tmp0.w == tmp1.w) ? 1.0 : 0.0;

    Section 3.11.5.38,  SFL:  Set on False

    The SFL instruction is a degenerate case of the other "Set on"
    instructions that sets all components of the result vector to 0.0.

      result.x = 0.0;
      result.y = 0.0;
      result.z = 0.0;
      result.w = 0.0;

    Section 3.11.5.39,  SGT:  Set on Greater Than

    The SGT instruction performs a component-wise comparison of the
    two operands.  Each component of the result vector is 1.0 if the
    corresponding component of the first operand is greater than that
    of the second, and 0.0 otherwise.

      tmp0 = VectorLoad(op0);
      tmp1 = VectorLoad(op1);
      result.x = (tmp0.x > tmp1.x) ? 1.0 : 0.0;
      result.y = (tmp0.y > tmp1.y) ? 1.0 : 0.0;
      result.z = (tmp0.z > tmp1.z) ? 1.0 : 0.0;
      result.w = (tmp0.w > tmp1.w) ? 1.0 : 0.0;

    Section 3.11.5.40,  SLE:  Set on Less Than or Equal

    The SLE instruction performs a component-wise comparison of the
    two operands.  Each component of the result vector is 1.0 if the
    corresponding component of the first operand is less than or equal
    to that of the second, and 0.0 otherwise.

      tmp0 = VectorLoad(op0);
      tmp1 = VectorLoad(op1);
      result.x = (tmp0.x <= tmp1.x) ? 1.0 : 0.0;
      result.y = (tmp0.y <= tmp1.y) ? 1.0 : 0.0;
      result.z = (tmp0.z <= tmp1.z) ? 1.0 : 0.0;
      result.w = (tmp0.w <= tmp1.w) ? 1.0 : 0.0;

    Section 3.11.5.41,  SNE:  Set on Not Equal

    The SNE instruction performs a component-wise comparison of the
    two operands.  Each component of the result vector is 1.0 if the
    corresponding component of the first operand is not equal to that
    of the second, and 0.0 otherwise.

      tmp0 = VectorLoad(op0);
      tmp1 = VectorLoad(op1);
      result.x = (tmp0.x != tmp1.x) ? 1.0 : 0.0;
      result.y = (tmp0.y != tmp1.y) ? 1.0 : 0.0;
      result.z = (tmp0.z != tmp1.z) ? 1.0 : 0.0;
      result.w = (tmp0.w != tmp1.w) ? 1.0 : 0.0;

    Section 3.11.5.42,  STR:  Set on True

    The STR instruction is a degenerate case of the other "Set on"
    instructions that sets all components of the result vector to 1.0.

      result.x = 1.0;
      result.y = 1.0;
      result.z = 1.0;
      result.w = 1.0;
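
    The six "set on" instructions described above (SEQ, SFL, SGT, SLE,
    SNE, and STR) differ only in the comparison applied to each
    component.  The following C sketch models them with one function;
    the enum and function names are hypothetical, and the comparisons
    follow C floating-point semantics, which is an assumption about how
    special values such as NaN behave.

```c
typedef enum { SET_EQ, SET_FL, SET_GT, SET_LE, SET_NE, SET_TR } set_op;

/* Scalar form of one "set on" test: 1.0 if it passes, 0.0 otherwise. */
static float set_on(set_op op, float a, float b)
{
    switch (op) {
    case SET_EQ: return (a == b) ? 1.0f : 0.0f;
    case SET_FL: return 0.0f;                    /* SFL: always false */
    case SET_GT: return (a >  b) ? 1.0f : 0.0f;
    case SET_LE: return (a <= b) ? 1.0f : 0.0f;
    case SET_NE: return (a != b) ? 1.0f : 0.0f;
    case SET_TR: return 1.0f;                    /* STR: always true */
    }
    return 0.0f;  /* not reached for valid set_op values */
}

/* Apply the test independently to each of the four components. */
void set_on_vec(set_op op, const float a[4], const float b[4], float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = set_on(op, a[i], b[i]);
}
```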

    Section 3.11.5.43,  UP2H:  Unpack Two 16-Bit Floats

    The UP2H instruction unpacks two 16-bit floats stored together in
    a 32-bit scalar operand.  The first 16-bit float (stored in the 16
    least significant bits) is written into the "x" and "z" components
    of the result vector; the second is written into the "y" and "w"
    components of the result vector.

    This operation undoes the type conversion and packing performed by
    the PK2H instruction.

      tmp = ScalarLoad(op0);
      result.x = (fp16) (RawBits(tmp) & 0xFFFF);
      result.y = (fp16) ((RawBits(tmp) >> 16) & 0xFFFF);
      result.z = (fp16) (RawBits(tmp) & 0xFFFF);
      result.w = (fp16) ((RawBits(tmp) >> 16) & 0xFFFF);

    A fragment program will fail to load if it contains a UP2H instruction
    whose operand is a variable declared as "SHORT".
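
    The "(fp16)" cast in the pseudocode reinterprets 16 raw bits as a
    half-precision float.  The C sketch below spells that step out.  It
    assumes the common layout of 1 sign bit, 5 exponent bits (bias 15),
    and 10 mantissa bits, with denormals, infinities, and NaNs; the
    function names are hypothetical, and the operand is passed as its
    raw 32-bit pattern rather than as a float.

```c
#include <math.h>
#include <stdint.h>

/* Decode one 16-bit float, assuming a 1/5/10 sign/exponent/mantissa
 * layout with an exponent bias of 15. */
float half_to_float(uint16_t h)
{
    int sign = (h >> 15) & 1;
    int exp  = (h >> 10) & 0x1F;
    int man  = h & 0x3FF;
    float v;
    if (exp == 0)
        v = ldexpf((float)man, -24);               /* zero or denormal */
    else if (exp == 31)
        v = man ? NAN : INFINITY;
    else
        v = ldexpf((float)(man + 1024), exp - 25); /* normal number */
    return sign ? -v : v;
}

/* Hypothetical helper mirroring the UP2H pseudocode. */
void unpack2h(uint32_t bits, float out[4])
{
    float lo = half_to_float((uint16_t)(bits & 0xFFFFu));
    float hi = half_to_float((uint16_t)(bits >> 16));
    out[0] = lo;  out[1] = hi;  out[2] = lo;  out[3] = hi;
}
```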

    Section 3.11.5.44,  UP2US:  Unpack Two Unsigned 16-Bit Scalars

    The UP2US instruction unpacks two 16-bit unsigned values packed
    together in a 32-bit scalar operand.  The unsigned quantities are
    encoded where a bit pattern of all '0' bits corresponds to 0.0 and
    a pattern of all '1' bits corresponds to 1.0.  The "x" and "z"
    components of the result vector are obtained from the 16 least
    significant bits of the operand; the "y" and "w" components are
    obtained from the 16 most significant bits.

    This operation undoes the type conversion and packing performed by
    the PK2US instruction.

      tmp = ScalarLoad(op0);
      result.x = ((RawBits(tmp) >> 0)  & 0xFFFF) / 65535.0;
      result.y = ((RawBits(tmp) >> 16) & 0xFFFF) / 65535.0;
      result.z = ((RawBits(tmp) >> 0)  & 0xFFFF) / 65535.0;
      result.w = ((RawBits(tmp) >> 16) & 0xFFFF) / 65535.0;

    A fragment program will fail to load if it contains a UP2US instruction
    whose operand is a variable declared as "SHORT".
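
    A C sketch of the UP2US pseudocode, with the hypothetical name
    unpack2us and the operand passed as its raw 32-bit pattern:

```c
#include <stdint.h>

/* Hypothetical helper mirroring the UP2US pseudocode: the low 16 bits
 * feed x and z, the high 16 bits feed y and w. */
void unpack2us(uint32_t bits, float out[4])
{
    float lo = (float)(bits & 0xFFFFu) / 65535.0f;
    float hi = (float)(bits >> 16) / 65535.0f;
    out[0] = lo;  out[1] = hi;  out[2] = lo;  out[3] = hi;
}
```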

    Section 3.11.5.45,  UP4B:  Unpack Four Signed 8-Bit Values

    The UP4B instruction unpacks four 8-bit signed values packed together
    in a 32-bit scalar operand.  The signed quantities are encoded where
    a bit pattern of all '0' bits corresponds to -128/127 and a pattern
    of all '1' bits corresponds to +127/127.  The "x" component of the
    result vector is the converted value corresponding to the 8 least
    significant bits of the operand; the "w" component corresponds to
    the 8 most significant bits.

    This operation undoes the type conversion and packing performed by
    the PK4B instruction.

      tmp = ScalarLoad(op0);
      result.x = (((RawBits(tmp) >> 0) & 0xFF) - 128) / 127.0;
      result.y = (((RawBits(tmp) >> 8) & 0xFF) - 128) / 127.0;
      result.z = (((RawBits(tmp) >> 16) & 0xFF) - 128) / 127.0;
      result.w = (((RawBits(tmp) >> 24) & 0xFF) - 128) / 127.0;

    A fragment program will fail to load if it contains a UP4B instruction
    whose operand is a variable declared as "SHORT".
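
    A C sketch of the UP4B pseudocode follows.  The name unpack4b is
    hypothetical, and the operand is passed as its raw 32-bit pattern.

```c
#include <stdint.h>

/* Hypothetical helper mirroring the UP4B pseudocode: each byte is
 * biased by -128 and scaled by 1/127, so byte 0x00 maps to -128/127,
 * byte 0x80 to 0.0, and byte 0xFF to +127/127. */
void unpack4b(uint32_t bits, float out[4])
{
    for (int i = 0; i < 4; i++) {
        int ub = (int)((bits >> (8 * i)) & 0xFFu);
        out[i] = (float)(ub - 128) / 127.0f;
    }
}
```

    The lowest byte value decodes to slightly less than -1.0, which is
    why PK4B clamps to -128/127 rather than to -1.0.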

    Section 3.11.5.46,  UP4UB:  Unpack Four Unsigned 8-Bit Scalars

    The UP4UB instruction unpacks four 8-bit unsigned values packed
    together in a 32-bit scalar operand.  The unsigned quantities are
    encoded where a bit pattern of all '0' bits corresponds to 0.0 and a
    pattern of all '1' bits corresponds to 1.0.  The "x" component of the
    result vector is obtained from the 8 least significant bits of the
    operand; the "w" component is obtained from the 8 most significant
    bits.

    This operation undoes the type conversion and packing performed by
    the PK4UB instruction.

      tmp = ScalarLoad(op0);
      result.x = ((RawBits(tmp) >> 0)  & 0xFF) / 255.0;
      result.y = ((RawBits(tmp) >> 8)  & 0xFF) / 255.0;
      result.z = ((RawBits(tmp) >> 16) & 0xFF) / 255.0;
      result.w = ((RawBits(tmp) >> 24) & 0xFF) / 255.0;

    A fragment program will fail to load if it contains a UP4UB
    instruction whose operand is a variable declared as "SHORT".
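
    A C sketch of the UP4UB pseudocode, with the hypothetical name
    unpack4ub and the operand passed as its raw 32-bit pattern:

```c
#include <stdint.h>

/* Hypothetical helper mirroring the UP4UB pseudocode: byte i of the
 * operand, counted from the least significant end, becomes component i
 * of the result, scaled from 0..255 to 0.0..1.0. */
void unpack4ub(uint32_t bits, float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (float)((bits >> (8 * i)) & 0xFFu) / 255.0f;
}
```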

    Section 3.11.5.47,  X2D:  2D Coordinate Transformation

    The X2D instruction multiplies the 2D offset vector specified by the
    "x" and "y" components of the second vector operand by the 2x2 matrix
    specified by the four components of the third vector operand, and adds
    the transformed offset vector to the 2D vector specified by the "x"
    and "y" components of the first vector operand.  The first component
    of the sum is written to the "x" and "z" components of the result;
    the second component is written to the "y" and "w" components of
    the result.

      tmp0 = VectorLoad(op0);
      tmp1 = VectorLoad(op1);
      tmp2 = VectorLoad(op2);
      result.x = tmp0.x + tmp1.x * tmp2.x + tmp1.y * tmp2.y;
      result.y = tmp0.y + tmp1.x * tmp2.z + tmp1.y * tmp2.w;
      result.z = tmp0.x + tmp1.x * tmp2.x + tmp1.y * tmp2.y;
      result.w = tmp0.y + tmp1.x * tmp2.z + tmp1.y * tmp2.w;
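
    Read as a matrix operation, the pseudocode treats the third operand
    as a 2x2 matrix stored row by row: (x, y) is the first row and
    (z, w) the second.  A C sketch with the hypothetical name x2d:

```c
/* Hypothetical helper mirroring the X2D pseudocode.
 * base = op0, offset = op1, m = op2 as rows (m[0] m[1]; m[2] m[3]). */
void x2d(const float base[4], const float offset[4], const float m[4],
         float out[4])
{
    float u = base[0] + offset[0] * m[0] + offset[1] * m[1];
    float v = base[1] + offset[0] * m[2] + offset[1] * m[3];
    out[0] = u;  out[1] = v;  out[2] = u;  out[3] = v;
}
```

    For example, with base (1, 2), offset (3, 4), and matrix
    (0, 1, -1, 0), the result is (5, -1, 5, -1).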

    Modify Section 3.11.6.4, KIL: Kill fragment

    Rather than mapping a coordinate set to a color, this function
    prevents a fragment from receiving any future processing.  If any
    component of its source vector is negative, the processing of this
    fragment will be discontinued and no further outputs to this fragment
    will occur.  Subsequent stages of the GL pipeline will be skipped
    for this fragment.

    A KIL instruction may be specified using either a vector operand
    or a condition code test.  If a vector operand is specified, the
    following is performed:

      tmp = VectorLoad(op0);
      if ((tmp.x < 0) || (tmp.y < 0) ||
          (tmp.z < 0) || (tmp.w < 0))
      {
          exit;
      }

    If a condition code is specified, the following is performed:

      if (TestCC(rc.c***) || TestCC(rc.*c**) ||
          TestCC(rc.**c*) || TestCC(rc.***c))
      {
          exit;
      }
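
    The vector-operand test can be sketched in C as a predicate; the
    name kil_vector is hypothetical.  The condition-code form is not
    modeled here because it depends on condition code state that this
    sketch does not represent.

```c
#include <stdbool.h>

/* Returns true when the vector-operand form of KIL would discard the
 * fragment, i.e. when any component is less than zero. */
bool kil_vector(const float v[4])
{
    return v[0] < 0.0f || v[1] < 0.0f || v[2] < 0.0f || v[3] < 0.0f;
}
```

    A component equal to zero does not trigger the kill, since the test
    is a strict "less than".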


    Add Section 3.11.6.5, TXD: Texture Lookup with Derivatives

    The TXD instruction takes the first three components of its first
    vector operand and maps them to s, t, and r.  These coordinates are
    used to sample from the specified texture target on the specified
    texture image unit in a manner consistent with its parameters.

    The level of detail is computed as specified in section 3.8.
    In this calculation, ds/dx, dt/dx, and dr/dx are given by the x,
    y, and z components, respectively, of the second vector operand.
    ds/dy, dt/dy, and dr/dy are given by the x, y, and z components of
    the third vector operand.

    The resulting sample is mapped to RGBA as described in table 3.21
    and written to the result vector.

      tmp = VectorLoad(op0);
      result = TextureSample(tmp.x, tmp.y, tmp.z, 0.0, op1, op2);

Additions to Chapter 4 of the OpenGL 1.2.1 Specification (Per-Fragment
Operations and the Frame Buffer)

    None.

Additions to Chapter 5 of the OpenGL 1.2.1 Specification (Special
Functions)

    None.

Additions to Chapter 6 of the OpenGL 1.2.1 Specification (State and
State Requests)

    None.

Additions to Appendix A of the OpenGL 1.2.1 Specification (Invariance)

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

Dependencies on ARB_fragment_program

    This specification is based on a modified version of the grammar
    published in the ARB_fragment_program specification.  This modified
    grammar (see below) includes a few structural changes to better
    accommodate new functionality from this and other extensions,
    but should be functionally equivalent to the ARB_fragment_program
    grammar.

    <program>               ::= <optionSequence> <statementSequence> "END"

    <optionSequence>        ::= <optionSequence> <option>
                              | /* empty */

    <option>                ::= "OPTION" <optionName> ";"

    <optionName>            ::= "ARB_fog_exp"
                              | "ARB_fog_exp2"
                              | "ARB_fog_linear"
                              | "ARB_precision_hint_fastest"
                              | "ARB_precision_hint_nicest"

    <statementSequence>     ::= <statement> <statementSequence>
                              | /* empty */

    <statement>             ::= <instruction> ";"
                              | <namingStatement> ";"

    <instruction>           ::= <ALUInstruction>
                              | <TexInstruction>

    <ALUInstruction>        ::= <VECTORop_instruction>
                              | <SCALARop_instruction>
                              | <BINSCop_instruction>
                              | <BINop_instruction>
                              | <TRIop_instruction>
                              | <SWZop_instruction>

    <TexInstruction>        ::= <TEXop_instruction>
                              | <KILop_instruction>

    <VECTORop_instruction>  ::= <VECTORop> <instResult> "," <instOperandV>

    <VECTORop>              ::= "ABS"
                              | "FLR"
                              | "FRC"
                              | "LIT"
                              | "MOV"

    <SCALARop_instruction>  ::= <SCALARop> <instResult> "," <instOperandS>

    <SCALARop>              ::= "COS"
                              | "EX2"
                              | "LG2"
                              | "RCP"
                              | "RSQ"
                              | "SCS"
                              | "SIN"

    <BINSCop_instruction>   ::= <BINSCop> <instResult> "," <instOperandS> ","
                                <instOperandS>

    <BINSCop>               ::= "POW"

    <BINop_instruction>     ::= <BINop> <instResult> "," <instOperandV> ","
                                <instOperandV>

    <BINop>                 ::= "ADD"
                              | "DP3"
                              | "DP4"
                              | "DPH"
                              | "DST"
                              | "MAX"
                              | "MIN"
                              | "MUL"
                              | "SGE"
                              | "SLT"
                              | "SUB"
                              | "XPD"

    <TRIop_instruction>     ::= <TRIop> <instResult> "," <instOperandV> ","
                                <instOperandV> "," <instOperandV>

    <TRIop>                 ::= "CMP"
                              | "MAD"
                              | "LRP"

    <SWZop_instruction>     ::= <SWZop> <instResult> "," <instOperandVNS> ","
                                <extendedSwizzle>

    <SWZop>                 ::= "SWZ"

    <TEXop_instruction>     ::= <TEXop> <instResult> "," <instOperandV> ","
                                <texTarget>

    <TEXop>                 ::= "TEX"
                              | "TXP"
                              | "TXB"

    <KILop_instruction>     ::= <KILop> <killCond>

    <KILop>                 ::= "KIL"

    <texTarget>             ::= <texImageUnit> "," <texTargetType>

    <texImageUnit>          ::= "texture" <optTexImageUnitNum>

    <optTexImageUnitNum>    ::= /* empty */
                              | "[" <texImageUnitNum> "]"

    <texImageUnitNum>       ::= <integer>
                                /*[0,MAX_TEXTURE_IMAGE_UNITS_ARB-1]*/

    <texTargetType>         ::= "1D"
                              | "2D"
                              | "3D"
                              | "CUBE"
                              | "RECT"

    <killCond>              ::= <instOperandV>

    <instOperandV>          ::= <instOperandBaseV>

    <instOperandBaseV>      ::= <optSign> <attribUseV>
                              | <optSign> <tempUseV>
                              | <optSign> <paramUseV>

    <instOperandS>          ::= <instOperandBaseS>

    <instOperandBaseS>      ::= <optSign> <attribUseS>
                              | <optSign> <tempUseS>
                              | <optSign> <paramUseS>

    <instOperandVNS>        ::= <attribUseVNS>
                              | <tempUseVNS>
                              | <paramUseVNS>

    <instResult>            ::= <instResultBase>

    <instResultBase>        ::= <tempUseW>
                              | <resultUseW>

    <namingStatement>       ::= <ATTRIB_statement>
                              | <PARAM_statement>
                              | <TEMP_statement>
                              | <OUTPUT_statement>
                              | <ALIAS_statement>

    <ATTRIB_statement>      ::= "ATTRIB" <establishName> "=" <attribUseD>

    <PARAM_statement>       ::= <PARAM_singleStmt>
                              | <PARAM_multipleStmt>

    <PARAM_singleStmt>      ::= "PARAM" <establishName> <paramSingleInit>

    <PARAM_multipleStmt>    ::= "PARAM" <establishName> "[" <optArraySize> "]"
                                <paramMultipleInit>

    <optArraySize>          ::= /* empty */
                              | <integer> /* [1,MAX_PROGRAM_PARAMETERS_ARB]*/

    <paramSingleInit>       ::= "=" <paramUseDB>

    <paramMultipleInit>     ::= "=" "{" <paramMultInitList> "}"

    <paramMultInitList>     ::= <paramUseDM>
                              | <paramUseDM> "," <paramMultInitList>

    <TEMP_statement>        ::= "TEMP" <varNameList>

    <OUTPUT_statement>      ::= "OUTPUT" <establishName> "=" <resultUseD>

    <ALIAS_statement>       ::= "ALIAS" <establishName> "=" <establishedName>

    <establishedName>       ::= <tempVarName>
                              | <addrVarName>
                              | <attribVarName>
                              | <paramArrayVarName>
                              | <paramSingleVarName>
                              | <resultVarName>

    <varNameList>           ::= <establishName>
                              | <establishName> "," <varNameList>

    <establishName>         ::= <identifier>

    <attribUseV>            ::= <attribBasic> <swizzleSuffix>
                              | <attribVarName> <swizzleSuffix>
                              | <attribColor> <swizzleSuffix>
                              | <attribColor> "." <colorType> <swizzleSuffix>

    <attribUseS>            ::= <attribBasic> <scalarSuffix>
                              | <attribVarName> <scalarSuffix>
                              | <attribColor> <scalarSuffix>
                              | <attribColor> "." <colorType> <scalarSuffix>

    <attribUseVNS>          ::= <attribBasic>
                              | <attribVarName>
                              | <attribColor>
                              | <attribColor> "." <colorType>

    <attribUseD>            ::= <attribBasic>
                              | <attribColor>
                              | <attribColor> "." <colorType>

    <attribBasic>           ::= "fragment" "." <attribFragBasic>

    <attribFragBasic>       ::= "texcoord" <optTexCoordNum>
                              | "fogcoord"
                              | "position"

    <attribColor>           ::= "fragment" "." "color"

    <paramUseV>             ::= <paramSingleVarName> <swizzleSuffix>
                              | <paramArrayVarName> "[" <arrayMem> "]"
                                <swizzleSuffix>
                              | <stateSingleItem> <swizzleSuffix>
                              | <programSingleItem> <swizzleSuffix>
                              | <constantVector> <swizzleSuffix>
                              | <constantScalar> <swizzleSuffix>

    <paramUseS>             ::= <paramSingleVarName> <scalarSuffix>
                              | <paramArrayVarName> "[" <arrayMem> "]"
                                <scalarSuffix>
                              | <stateSingleItem> <scalarSuffix>
                              | <programSingleItem> <scalarSuffix>
                              | <constantVector> <scalarSuffix>
                              | <constantScalar> <scalarSuffix>

    <paramUseVNS>           ::= <paramSingleVarName>
                              | <paramArrayVarName> "[" <arrayMem> "]"
                              | <stateSingleItem>
                              | <programSingleItem>
                              | <constantVector>
                              | <constantScalar>

    <paramUseDB>            ::= <stateSingleItem>
                              | <programSingleItem>
                              | <constantVector>
                              | <signedConstantScalar>

    <paramUseDM>            ::= <stateMultipleItem>
                              | <programMultipleItem>
                              | <constantVector>
                              | <signedConstantScalar>

    <stateMultipleItem>     ::= <stateSingleItem>
                              | "state" "." <stateMatrixRows>

    <stateSingleItem>       ::= "state" "." <stateMaterialItem>
                              | "state" "." <stateLightItem>
                              | "state" "." <stateLightModelItem>
                              | "state" "." <stateLightProdItem>
                              | "state" "." <stateFogItem>
                              | "state" "." <stateMatrixRow>
                              | "state" "." <stateTexEnvItem>
                              | "state" "." <stateDepthItem>

    <stateMaterialItem>     ::= "material" "." <stateMatProperty>
                              | "material" "." <faceType> "."
                                <stateMatProperty>

    <stateMatProperty>      ::= "ambient"
                              | "diffuse"
                              | "specular"
                              | "emission"
                              | "shininess"

    <stateLightItem>        ::= "light" "[" <stateLightNumber> "]" "."
                                <stateLightProperty>

    <stateLightProperty>    ::= "ambient"
                              | "diffuse"
                              | "specular"
                              | "position"
                              | "attenuation"
                              | "spot" "." <stateSpotProperty>
                              | "half"

    <stateSpotProperty>     ::= "direction"

    <stateLightModelItem>   ::= "lightmodel" <stateLModProperty>

    <stateLModProperty>     ::= "." "ambient"
                              | "." "scenecolor"
                              | "." <faceType> "." "scenecolor"

    <stateLightProdItem>    ::= "lightprod" "[" <stateLightNumber> "]" "."
                                <stateLProdProperty>
                              | "lightprod" "[" <stateLightNumber> "]" "."
                                <faceType> "." <stateLProdProperty>

    <stateLProdProperty>    ::= "ambient"
                              | "diffuse"
                              | "specular"

    <stateLightNumber>      ::= <integer> /* [0,MAX_LIGHTS-1] */

    <stateFogItem>          ::= "fog" "." <stateFogProperty>

    <stateFogProperty>      ::= "color"
                              | "params"

    <stateMatrixRows>       ::= <stateMatrixItem>
                              | <stateMatrixItem> "." <stateMatModifier>
                              | <stateMatrixItem> "." "row" "["
                                <stateMatrixRowNum> ".." <stateMatrixRowNum>
                                "]"
                              | <stateMatrixItem> "." <stateMatModifier> "."
                                "row" "[" <stateMatrixRowNum> ".."
                                <stateMatrixRowNum> "]"

    <stateMatrixRow>        ::= <stateMatrixItem> "." "row" "["
                                <stateMatrixRowNum> "]"
                              | <stateMatrixItem> "." <stateMatModifier> "."
                                "row" "[" <stateMatrixRowNum> "]"

    <stateMatrixItem>       ::= "matrix" "." <stateMatrixName>

    <stateMatModifier>      ::= "inverse"
                              | "transpose"
                              | "invtrans"

    <stateMatrixName>       ::= "modelview" <stateOptModMatNum>
                              | "projection"
                              | "mvp"
                              | "texture" <optTexCoordNum>
                              | "palette" "[" <statePaletteMatNum> "]"
                              | "program" "[" <stateProgramMatNum> "]"

    <stateMatrixRowNum>     ::= <integer> /* [0,3] */

    <stateOptModMatNum>     ::= /* empty */
                              | "[" <stateModMatNum> "]"

    <stateModMatNum>        ::= <integer> /*[0,MAX_VERTEX_UNITS_ARB-1]*/

    <statePaletteMatNum>    ::= <integer> /*[0,MAX_PALETTE_MATRICES_ARB-1]*/

    <stateProgramMatNum>    ::= <integer> /*[0,MAX_PROGRAM_MATRICES_ARB-1]*/

    <stateTexEnvItem>       ::= "texenv" <optLegacyTexUnitNum> "."
                                <stateTexEnvProperty>

    <stateTexEnvProperty>   ::= "color"

    <stateDepthItem>        ::= "depth" "." <stateDepthProperty>

    <stateDepthProperty>    ::= "range"

    <programSingleItem>     ::= <progEnvParam>
                              | <progLocalParam>

    <programMultipleItem>   ::= <progEnvParams>
                              | <progLocalParams>

    <progEnvParams>         ::= "program" "." "env" "[" <progEnvParamNums> "]"

    <progEnvParamNums>      ::= <progEnvParamNum>
                              | <progEnvParamNum> ".." <progEnvParamNum>

    <progEnvParam>          ::= "program" "." "env" "[" <progEnvParamNum> "]"

    <progLocalParams>       ::= "program" "." "local" "[" <progLocalParamNums>
                                "]"

    <progLocalParamNums>    ::= <progLocalParamNum>
                              | <progLocalParamNum> ".." <progLocalParamNum>

    <progLocalParam>        ::= "program" "." "local" "[" <progLocalParamNum>
                                "]"

    <progEnvParamNum>       ::= <integer>
                                /*[0,MAX_PROGRAM_ENV_PARAMETERS_ARB-1]*/

    <progLocalParamNum>     ::= <integer>
                                /*[0,MAX_PROGRAM_LOCAL_PARAMETERS_ARB-1]*/

    <constantVector>        ::= "{" <constantVectorList> "}"

    <constantVectorList>    ::= <signedConstantScalar>
                              | <signedConstantScalar> ","
                                <signedConstantScalar>
                              | <signedConstantScalar> ","
                                <signedConstantScalar> ","
                                <signedConstantScalar>
                              | <signedConstantScalar> ","
                                <signedConstantScalar> ","
                                <signedConstantScalar> ","
                                <signedConstantScalar>

    <signedConstantScalar>  ::= <optSign> <constantScalar>

    <constantScalar>        ::= <floatConstant>

    <floatConstant>         ::= <float>

    <tempUseV>              ::= <tempVarName> <swizzleSuffix>

    <tempUseS>              ::= <tempVarName> <scalarSuffix>

    <tempUseVNS>            ::= <tempVarName>

    <tempUseW>              ::= <tempVarName> <optWriteMask>

    <resultUseW>            ::= <resultBasic> <optWriteMask>
                              | <resultVarName> <optWriteMask>

    <resultUseD>            ::= <resultBasic>

    <resultBasic>           ::= "result" "." <resultFragBasic>

    <resultFragBasic>       ::= "color" <resultOptColorNum>
                              | "depth"

    <resultOptColorNum>     ::= /* empty */

    <arrayMem>              ::= <arrayMemAbs>

    <arrayMemAbs>           ::= <integer>

    <optWriteMask>          ::= /* empty */
                              | <xyzwMask>
                              | <rgbaMask>

    <xyzwMask>              ::= "." "x"
                              | "." "y"
                              | "." "xy"
                              | "." "z"
                              | "." "xz"
                              | "." "yz"
                              | "." "xyz"
                              | "." "w"
                              | "." "xw"
                              | "." "yw"
                              | "." "xyw"
                              | "." "zw"
                              | "." "xzw"
                              | "." "yzw"
                              | "." "xyzw"

    <rgbaMask>              ::= "." "r"
                              | "." "g"
                              | "." "rg"
                              | "." "b"
                              | "." "rb"
                              | "." "gb"
                              | "." "rgb"
                              | "." "a"
                              | "." "ra"
                              | "." "ga"
                              | "." "rga"
                              | "." "ba"
                              | "." "rba"
                              | "." "gba"
                              | "." "rgba"

    <swizzleSuffix>         ::= /* empty */
                              | "." <component>
                              | "." <xyzwComponent> <xyzwComponent>
                                <xyzwComponent> <xyzwComponent>
                              | "." <rgbaComponent> <rgbaComponent>
                                <rgbaComponent> <rgbaComponent>

    <extendedSwizzle>       ::= <extSwizComp> "," <extSwizComp> ","
                                <extSwizComp> "," <extSwizComp>

    <extSwizComp>           ::= <optSign> <xyzwExtSwizSel>
                              | <optSign> <rgbaExtSwizSel>

    <xyzwExtSwizSel>        ::= "0"
                              | "1"
                              | <xyzwComponent>

    <rgbaExtSwizSel>        ::= <rgbaComponent>

    <scalarSuffix>          ::= "." <component>

    <component>             ::= <xyzwComponent>
                              | <rgbaComponent>

    <xyzwComponent>         ::= "x"
                              | "y"
                              | "z"
                              | "w"

    <rgbaComponent>         ::= "r"
                              | "g"
                              | "b"
                              | "a"

    <optSign>               ::= /* empty */
                              | "-"
                              | "+"

    <faceType>              ::= "front"
                              | "back"

    <colorType>             ::= "primary"
                              | "secondary"

    <optTexCoordNum>        ::= /* empty */
                              | "[" <texCoordNum> "]"

    <texCoordNum>           ::= <integer> /*[0,MAX_TEXTURE_COORDS_ARB-1]*/

    <optLegacyTexUnitNum>   ::= /* empty */
                              | "[" <legacyTexUnitNum> "]"

    <legacyTexUnitNum>      ::= <integer> /*[0,MAX_TEXTURE_UNITS-1]*/

    The <integer>, <float>, and <identifier> grammar rules match
    integer constants, floating point constants, and identifier names
    as described in the ARB_vertex_program specification.  The <float>
    grammar rule here is identical to the <floatConstant> grammar rule
    in ARB_vertex_program.

    The grammar rules <tempVarName>, <addrVarName>, <attribVarName>,
    <paramArrayVarName>, <paramSingleVarName>, and <resultVarName> refer
    to the names of temporary, address register, attribute, program
    parameter array, program parameter, and result variables,
    respectively, declared in the program text.

GLX Protocol

    None.

Errors

    None.

New State

    None.

Revision History

    Rev.  Date      Author   Changes
    ----  --------  -------  --------------------------------------------
    4     05/27/05  pbrown   Removed required NV_fragment_program dependency;
                             that extension actually isn't needed although the
                             functionality it provides obviously is.

    3     07/08/04  pbrown   Fixed entries for KIL and RFL in the opcode
                             table.

    2     05/16/04  pbrown   Documented terminals in modified fragment program
                             grammar.

    1     --------  pbrown   Internal pre-release revisions.