Name

    INTEL_shader_atomic_float_minmax

Name Strings

    GL_INTEL_shader_atomic_float_minmax

Contact

    Ian Romanick (ian . d . romanick 'at' intel . com)

Contributors


Status

    In progress

Version

    Last Modified Date: 06/22/2018
    Revision: 4

Number

    TBD

Dependencies

    OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
    ARB_compute_shader is required.

    This extension is written against version 4.60 of the OpenGL Shading
    Language Specification.

Overview

    This extension provides GLSL built-in functions allowing shaders to
    perform atomic read-modify-write operations to floating-point buffer
    variables and shared variables.  Minimum, maximum, exchange, and
    compare-and-swap are enabled.

New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.60

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_INTEL_shader_atomic_float_minmax : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_INTEL_shader_atomic_float_minmax   1

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Modify Section 8.11, "Atomic Memory Functions"

    (add a new row after the existing "atomicMin" table row, p. 179)

        float atomicMin(inout float mem, float data)

        Computes a new value by taking the minimum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered smaller.  If one of these is an IEEE quiet NaN
        (i.e., a NaN with the most-significant bit of the mantissa set), it is
        always considered larger.  If both are IEEE quiet NaNs or both are
        IEEE signaling NaNs, the result of the comparison is undefined.

    (add a new row after the existing "atomicMax" table row, p. 179)

        float atomicMax(inout float mem, float data)

        Computes a new value by taking the maximum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered larger.  If one of these is an IEEE quiet NaN (i.e.,
        a NaN with the most-significant bit of the mantissa set), it is always
        considered smaller.  If both are IEEE quiet NaNs or both are IEEE
        signaling NaNs, the result of the comparison is undefined.

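    The NaN ordering described for atomicMin and atomicMax can be sketched
    as a CPU-side reference model in C.  This is only an illustration of the
    rules as worded above, not the hardware or driver implementation.  The
    names model_minmax_bits, is_nan_bits, is_qnan_bits, is_snan_bits, and
    bits_to_float are hypothetical, and the model works on raw binary32 bit
    patterns so that signaling NaNs are never quieted by the host CPU.  Where
    the text leaves the result undefined (two NaNs of the same kind), the
    sketch arbitrarily keeps the contents of mem.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers: classify an IEEE 754 binary32 bit pattern. */
static int is_nan_bits(uint32_t b)
{
    return (b & 0x7F800000u) == 0x7F800000u && (b & 0x007FFFFFu) != 0;
}

/* Quiet NaN: most-significant mantissa bit set. */
static int is_qnan_bits(uint32_t b)
{
    return is_nan_bits(b) && (b & 0x00400000u) != 0;
}

/* Signaling NaN: most-significant mantissa bit cleared. */
static int is_snan_bits(uint32_t b)
{
    return is_nan_bits(b) && (b & 0x00400000u) == 0;
}

static float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Models the value that would be stored to mem.
 * want_max == 0 models atomicMin, want_max != 0 models atomicMax.
 * Signed zeros are not ordered here; that is a separate question. */
static uint32_t model_minmax_bits(uint32_t mem, uint32_t data, int want_max)
{
    /* A signaling NaN is "smaller" for min and "larger" for max,
     * so it is selected in both cases. */
    if (is_snan_bits(mem))
        return mem;               /* both sNaN: undefined, keep mem */
    if (is_snan_bits(data))
        return data;

    /* A quiet NaN is "larger" for min and "smaller" for max,
     * so the other operand is selected in both cases. */
    if (is_qnan_bits(mem) && is_qnan_bits(data))
        return mem;               /* both qNaN: undefined, keep mem */
    if (is_qnan_bits(mem))
        return data;
    if (is_qnan_bits(data))
        return mem;

    /* Neither operand is NaN: ordinary ordered comparison. */
    float a = bits_to_float(mem);
    float b = bits_to_float(data);
    if (want_max)
        return (b > a) ? data : mem;
    return (b < a) ? data : mem;
}
```

    For example, with mem holding the quiet NaN pattern 0x7FC00000 and data
    holding 1.0 (0x3F800000), the model stores 1.0 for both min and max,
    while a signaling NaN such as 0x7F800001 in either operand is stored
    unchanged.
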
    (add to "atomicExchange" table cell, p. 180)

        float atomicExchange(inout float mem, float data)

    (add to "atomicCompSwap" table cell, p. 180)

        float atomicCompSwap(inout float mem, float compare, float data)

Interactions with OpenGL 4.6 and ARB_gl_spirv

    If OpenGL 4.6 or ARB_gl_spirv is supported, then
    SPV_INTEL_shader_atomic_float_minmax must also be supported.

    The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
    OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.

Issues

    1) Why call this extension INTEL_shader_atomic_float_minmax?

    RESOLVED: Several other extensions already set the precedent of
    VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
    that enable floating-point atomic operations.  Using that as a base for
    the name seems logical.

    There already exists NV_shader_atomic_float, but the two extensions have
    nearly zero overlap in functionality.  NV_shader_atomic_float adds
    atomicAdd and image atomic operations that currently shipping Intel GPUs
    do not support.  Calling this extension INTEL_shader_atomic_float would
    likely have been confusing.

    Adding something to describe the actual functions added by this extension
    seemed reasonable.  INTEL_shader_atomic_float_compare was considered, but
    that name was deemed to be not properly descriptive.  Calling this
    extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
    out.

    2) What atomic operations should we support for floating-point targets?

    RESOLVED.  Exchange, min, max, and compare-swap make sense, and these are
    all supported by the hardware.  Future extensions may add other functions.

    For buffer variables and shared variables it is not possible to bit-cast
    the memory location in GLSL, so existing integer operations, such as
    atomicOr, cannot be used.  However, the underlying hardware implementation
    can do this by treating the memory as an integer.  It would be possible to
    implement atomicNegate using this technique with atomicXor.  It is unclear
    whether this provides any actual utility.

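    The atomicNegate idea mentioned above can be illustrated on the CPU with
    C11 atomics.  This is a minimal sketch, assuming the float is stored as
    its raw 32-bit pattern; atomic_negate_model, float_to_bits, and
    bits_to_float are hypothetical names, and atomicNegate itself is not a
    function added by this extension.  Flipping the sign bit with an atomic
    XOR negates the value without ever interpreting it as a float.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

static uint32_t float_to_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

static float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Hypothetical atomicNegate: XOR the sign bit (bit 31) of the stored
 * pattern.  Returns the previous contents, in the same way the GLSL
 * atomic memory functions return the original value. */
static float atomic_negate_model(_Atomic uint32_t *mem)
{
    uint32_t old_bits = atomic_fetch_xor(mem, 0x80000000u);
    return bits_to_float(old_bits);
}
```

    Applied to a cell holding 2.5, the call returns 2.5 and leaves -2.5 in
    memory; a second call restores the original value.
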
    3) What should be said about the NaN behavior?

    RESOLVED.  There are several aspects of NaN behavior that should be
    documented in this extension.  However, some of this behavior varies based
    on NaN concepts that do not exist in the GLSL specification.

    * atomicCompSwap performs the comparison using the floating-point
      equality operator (==).  That is, if either 'mem' or 'compare' is NaN,
      the comparison result is always false.

    * atomicMin and atomicMax implement the IEEE specification with respect to
      NaN.  IEEE considers two different kinds of NaN: signaling NaN and quiet
      NaN.  A quiet NaN has the most significant bit of the mantissa set, and
      a signaling NaN does not.  This concept does not exist in SPIR-V,
      Vulkan, or OpenGL.  Let qNaN denote a quiet NaN and sNaN denote a
      signaling NaN.  atomicMin and atomicMax specifically implement

      - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
      - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
      - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
        fmax(qNaN, sNaN) = sNaN
      - fmin(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmin(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.

    Further details are in the Skylake Programmer's Reference Manuals,
    available at
    https://01.org/linuxgraphics/documentation/hardware-specification-prms.

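    The atomicCompSwap rule above (compare with ==, so NaN never matches)
    can be sketched with a C11 compare-exchange loop.  This is a CPU-side
    model under stated assumptions, not the hardware behavior:
    atomic_comp_swap_model, float_to_bits, and bits_to_float are
    hypothetical names, and the float lives in memory as its raw 32-bit
    pattern.  Note that a plain bitwise compare-exchange would differ from
    this model, because identical NaN patterns compare equal bitwise but not
    with ==.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

static uint32_t float_to_bits(float f)
{
    uint32_t b;
    memcpy(&b, &f, sizeof b);
    return b;
}

static float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

/* Models atomicCompSwap(mem, compare, data) with a floating-point ==
 * comparison.  Returns the original contents of mem. */
static float atomic_comp_swap_model(_Atomic uint32_t *mem,
                                    float compare, float data)
{
    uint32_t old_bits = atomic_load(mem);
    for (;;) {
        float old_value = bits_to_float(old_bits);

        /* False whenever either side is NaN, so no store happens. */
        if (!(old_value == compare))
            return old_value;

        /* On failure old_bits is refreshed with the current contents
         * and the floating-point comparison is repeated. */
        if (atomic_compare_exchange_weak(mem, &old_bits,
                                         float_to_bits(data)))
            return old_value;
    }
}
```

    In this model a cell holding a NaN is never replaced, even when
    'compare' carries the identical bit pattern.  A 'compare' of +0.0 would
    also match a stored -0.0, which follows from == but is an inference
    rather than something the text above states.
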
    4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
    arguments?

    RESOLVED.  atomicMin should store -0.0, and atomicMax should store +0.0.
    Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
    stored.  This behavior may change in later GPUs.

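    The intended signed-zero results can be sketched in C as follows.  This
    models the RESOLVED behavior (atomicMin stores -0.0, atomicMax stores
    +0.0), not what shipping Skylake GPUs do.  The names min_bits_model,
    max_bits_model, sign_bit, and bits_to_float are hypothetical, and NaN
    operands are deliberately out of scope for this sketch.

```c
#include <stdint.h>
#include <string.h>

static float bits_to_float(uint32_t b)
{
    float f;
    memcpy(&f, &b, sizeof f);
    return f;
}

static int sign_bit(uint32_t b)
{
    return (int)(b >> 31);
}

/* Value stored by a min that orders -0.0 below +0.0.
 * Assumes neither operand is NaN. */
static uint32_t min_bits_model(uint32_t mem, uint32_t data)
{
    float a = bits_to_float(mem);
    float b = bits_to_float(data);
    if (b < a)
        return data;
    /* a == b with differing signs only happens for the zero pair. */
    if (a == b && sign_bit(data) && !sign_bit(mem))
        return data;              /* -0.0 replaces +0.0 */
    return mem;
}

/* Value stored by a max that orders +0.0 above -0.0.
 * Assumes neither operand is NaN. */
static uint32_t max_bits_model(uint32_t mem, uint32_t data)
{
    float a = bits_to_float(mem);
    float b = bits_to_float(data);
    if (b > a)
        return data;
    if (a == b && !sign_bit(data) && sign_bit(mem))
        return data;              /* +0.0 replaces -0.0 */
    return mem;
}
```

    An ordinary < or > comparison alone cannot produce these results,
    because -0.0 and +0.0 compare equal; the sign bit has to be inspected
    explicitly.
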
Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      1  04/19/2018  idr       Initial version
      2  05/05/2018  idr       Describe interactions with the capabilities
                               added by SPV_INTEL_shader_atomic_float_minmax.
      3  05/29/2018  idr       Remove mention of 64-bit float support.
      4  06/22/2018  idr       Resolve issue #2.
                               Add issue #3 (regarding NaN behavior).
                               Add issue #4 (regarding atomicMin(-0, +0)).