Name

    INTEL_shader_atomic_float_minmax

Name Strings

    GL_INTEL_shader_atomic_float_minmax

Contact

    Ian Romanick (ian . d . romanick 'at' intel . com)

Contributors


Status

    In progress

Version

    Last Modified Date: 06/22/2018
    Revision: 4

Number

    TBD

Dependencies

    OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
    ARB_compute_shader is required.

    This extension is written against version 4.60 of the OpenGL Shading
    Language Specification.

Overview

    This extension provides GLSL built-in functions allowing shaders to
    perform atomic read-modify-write operations to floating-point buffer
    variables and shared variables.  Minimum, maximum, exchange, and
    compare-and-swap are enabled.

New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.60

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_INTEL_shader_atomic_float_minmax : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_INTEL_shader_atomic_float_minmax   1

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Modify Section 8.11, "Atomic Memory Functions"

    (add a new row after the existing "atomicMin" table row, p. 179)

        float atomicMin(inout float mem, float data)

        Computes a new value by taking the minimum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered smaller.
        If one of these is an IEEE quiet NaN
        (i.e., a NaN with the most-significant bit of the mantissa set), it is
        always considered larger.  If both are IEEE quiet NaNs or both are
        IEEE signaling NaNs, the result of the comparison is undefined.

    (add a new row after the existing "atomicMax" table row, p. 179)

        float atomicMax(inout float mem, float data)

        Computes a new value by taking the maximum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered larger.  If one of these is an IEEE quiet NaN (i.e.,
        a NaN with the most-significant bit of the mantissa set), it is always
        considered smaller.  If both are IEEE quiet NaNs or both are IEEE
        signaling NaNs, the result of the comparison is undefined.

    (add to "atomicExchange" table cell, p. 180)

        float atomicExchange(inout float mem, float data)

    (add to "atomicCompSwap" table cell, p. 180)

        float atomicCompSwap(inout float mem, float compare, float data)

Interactions with OpenGL 4.6 and ARB_gl_spirv

    If OpenGL 4.6 or ARB_gl_spirv is supported, then
    SPV_INTEL_shader_atomic_float_minmax must also be supported.

    The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
    OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.

Issues

    1) Why call this extension INTEL_shader_atomic_float_minmax?

    RESOLVED: Several other extensions already set the precedent of
    VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
    that enable floating-point atomic operations.  Using that as a base for
    the name seems logical.

    There already exists NV_shader_atomic_float, but the two extensions have
    nearly zero overlap in functionality.
    NV_shader_atomic_float adds
    atomicAdd and image atomic operations that currently shipping Intel GPUs
    do not support.  Calling this extension INTEL_shader_atomic_float would
    likely have been confusing.

    Adding something to describe the actual functions added by this extension
    seemed reasonable.  INTEL_shader_atomic_float_compare was considered, but
    that name was deemed to be not properly descriptive.  Calling this
    extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
    out.

    2) What atomic operations should we support for floating-point targets?

    RESOLVED.  Exchange, min, max, and compare-swap make sense, and these are
    all supported by the hardware.  Future extensions may add other functions.

    For buffer variables and shared variables it is not possible to bit-cast
    the memory location in GLSL, so existing integer operations, such as
    atomicOr, cannot be used.  However, the underlying hardware implementation
    can do this by treating the memory as an integer.  It would be possible to
    implement atomicNegate using this technique with atomicXor.  It is unclear
    whether this provides any actual utility.

    3) What should be said about the NaN behavior?

    RESOLVED.  There are several aspects of NaN behavior that should be
    documented in this extension.  However, some of this behavior varies based
    on NaN concepts that do not exist in the GLSL specification.

    * atomicCompSwap performs the comparison as the floating-point equality
      operator (==).  That is, if either 'mem' or 'compare' is NaN, the
      comparison result is always false.

    * atomicMin and atomicMax implement the IEEE specification with respect to
      NaN.  IEEE considers two different kinds of NaN: signaling NaN and quiet
      NaN.  A quiet NaN has the most significant bit of the mantissa set, and
      a signaling NaN does not.  This concept does not exist in SPIR-V,
      Vulkan, or OpenGL.
      Let qNaN denote a quiet NaN and sNaN denote a
      signaling NaN.  atomicMin and atomicMax specifically implement

      - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
      - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
      - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
        fmax(qNaN, sNaN) = sNaN
      - fmin(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmin(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.

    Further details are available in the Skylake Programmer's Reference
    Manuals available at
    https://01.org/linuxgraphics/documentation/hardware-specification-prms.

    4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
    arguments?

    RESOLVED.  atomicMin should store -0.0, and atomicMax should store +0.0.
    Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
    stored.  This behavior may change in later GPUs.

Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      1  04/19/2018  idr       Initial version
      2  05/05/2018  idr       Describe interactions with the capabilities
                               added by SPV_INTEL_shader_atomic_float_minmax.
      3  05/29/2018  idr       Remove mention of 64-bit float support.
      4  06/22/2018  idr       Resolve issue #2.
                               Add issue #3 (regarding NaN behavior).
                               Add issue #4 (regarding atomicMin(-0, +0)).
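
Sample Code (non-normative)

    The NaN rules listed under issue 3 and the signed-zero resolution of
    issue 4 can be sketched as a host-side C model.  This is only an
    illustration under stated assumptions, not the hardware implementation
    and not part of the GLSL API: the function names (is_snan_bits,
    model_minmax_bits, and so on) are hypothetical, values are handled as
    raw 32-bit patterns so that a signaling NaN is never quieted by the
    host, and where this specification leaves the stored argument undefined
    (two NaNs of the same kind) the model arbitrarily picks one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers; none of these names come from the extension. */

/* Reinterpret a 32-bit pattern as an IEEE single-precision float. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* NaN: exponent all ones and a non-zero mantissa. */
static bool is_nan_bits(uint32_t u)
{
    return (u & 0x7f800000u) == 0x7f800000u && (u & 0x007fffffu) != 0;
}

/* Quiet NaN: most significant mantissa bit set. */
static bool is_qnan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) != 0;
}

/* Signaling NaN: most significant mantissa bit cleared. */
static bool is_snan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) == 0;
}

/*
 * Returns the bit pattern the model would store to mem for atomicMin
 * (is_max == false) or atomicMax (is_max == true).
 */
static uint32_t model_minmax_bits(uint32_t mem, uint32_t data, bool is_max)
{
    /* A signaling NaN always wins.  For two sNaNs the stored argument is
     * undefined by the specification; this model keeps mem. */
    if (is_snan_bits(mem))
        return mem;
    if (is_snan_bits(data))
        return data;

    /* A quiet NaN always loses.  For two qNaNs the stored argument is
     * undefined by the specification; this model stores data. */
    if (is_qnan_bits(mem))
        return data;
    if (is_qnan_bits(data))
        return mem;

    float a = bits_to_float(mem);
    float b = bits_to_float(data);

    /* Equal values are either bit-identical or a (+0.0, -0.0) pair.
     * Per issue 4, min should store -0.0 and max should store +0.0
     * (shipping Skylake GPUs are documented to differ). */
    if (a == b)
        return is_max ? (mem & data) : (mem | data);

    if (is_max)
        return a > b ? mem : data;
    return a < b ? mem : data;
}

/* A few spot checks of the rules above. */
static void model_spot_checks(void)
{
    const uint32_t qnan = 0x7fc00000u, snan = 0x7fa00000u;
    const uint32_t one = 0x3f800000u;

    assert(model_minmax_bits(qnan, one, false) == one);   /* fmin(qNaN, x) = x    */
    assert(model_minmax_bits(one, snan, true) == snan);   /* fmax(x, sNaN) = sNaN */
    assert(model_minmax_bits(qnan, snan, false) == snan); /* sNaN beats qNaN      */
}
```

    Working on bit patterns rather than float arguments is a deliberate
    choice in this sketch: some host ABIs may quiet a signaling NaN when it
    passes through a floating-point register, which would hide the very
    distinction the model is meant to show.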