Name

    AMD_shader_ballot

Name Strings

    GL_AMD_shader_ballot

Contact

    Qun Lin, AMD (quentin.lin 'at' amd.com)

Contributors

    Qun Lin, AMD
    Graham Sellers, AMD
    Daniel Rakos, AMD
    Rex Xu, AMD
    Dominik Witczak, AMD

Status

    Shipping

Version

    Last Modified Date: 03/28/2018
    Author Revision: 5

Number

    ???

Dependencies

    This extension is written against the OpenGL Shading Language
    Specification, Version 4.50.

    This extension requires ARB_shader_group_vote and ARB_shader_ballot.

    This extension interacts with ARB_gpu_shader_int64.

    This extension interacts with AMD_gpu_shader_half_float.

    This extension interacts with AMD_gpu_shader_int16.

Overview

    The extensions ARB_shader_group_vote and ARB_shader_ballot introduced the
    concept of sub-groups and a set of operations that allow data exchange
    across shader invocations within a sub-group.

    This extension further extends the capabilities of these extensions with
    additional sub-group operations.

IP Status

    None.

New Procedures and Functions

    None.

New Tokens

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader can be used to control the
    language features described in this extension:

        #extension GL_AMD_shader_ballot : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

        #define GL_AMD_shader_ballot 1

Additions to Chapter 8 of the OpenGL Shading Language (GLSL) Specification,
version 4.30 (Built-in functions)

    Add Section 8.18, Shader Invocation Group Functions

    The <min>, <max> and <add> group invocation functions combine the value
    <v> across all active shader invocations in the sub-group using one of
    three group operations, as described in the following table:

    Group Operation  Description
    ---------------  ---------------------------------------------------------
    Reduce           A reduction that combines the values of <v> across the
                     sub-group into a single result.

    InclusiveScan    A binary operation with an identity <I> and <n> (where
                     <n> is the size of the sub-group) elements { a[0], a[1],
                     .., a[n-1] } resulting in { a[0], (a[0] op a[1]), ..,
                     (a[0] op a[1] op .. op a[n-1]) }. <op> can be any of
                     <min>, <max>, <add>.

    ExclusiveScan    A binary operation with an identity <I> and <n> (where
                     <n> is the size of the sub-group) elements { a[0], a[1],
                     .., a[n-1] } resulting in { I, a[0], (a[0] op a[1]), ..,
                     (a[0] op a[1] op .. op a[n-2]) }. <op> can be any of
                     <min>, <max>, <add>.
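
    The three group operations can be illustrated with a small reference
    model. The sketch below is not part of the extension and is not GLSL: it
    is plain Python in which a list stands in for the values of <v> held by
    the invocations of one sub-group, in invocation-index order. It assumes
    that every invocation in the list is active and that the list is not
    empty. The names reduce_group, inclusive_scan and exclusive_scan are
    hypothetical helpers introduced only for this illustration.

```python
import operator


def reduce_group(values, op):
    """Reduce: combine all values with op; every invocation gets the result."""
    result = values[0]  # assumes at least one active invocation
    for v in values[1:]:
        result = op(result, v)
    return [result for _ in values]


def inclusive_scan(values, op):
    """InclusiveScan: element i is a[0] op a[1] op .. op a[i]."""
    out = []
    for v in values:
        out.append(v if not out else op(out[-1], v))
    return out


def exclusive_scan(values, op, identity):
    """ExclusiveScan: element 0 is the identity I, element i is
    a[0] op .. op a[i-1]."""
    out = []
    running = identity
    for v in values:
        out.append(running)
        running = op(running, v)
    return out


if __name__ == "__main__":
    a = [3, 1, 4, 1, 5]  # hypothetical per-invocation values of <v>
    print(reduce_group(a, operator.add))       # [14, 14, 14, 14, 14]
    print(inclusive_scan(a, operator.add))     # [3, 4, 8, 9, 14]
    print(exclusive_scan(a, operator.add, 0))  # [0, 3, 4, 8, 9]
    print(inclusive_scan(a, min))              # [3, 1, 1, 1, 1]
```

    In this model the identity only affects the first element of an
    exclusive scan, which is why it must be chosen per function and data
    type.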

    The identity <I> in the group operations <InclusiveScan> and <ExclusiveScan>
    is given by the following table:

    Function  Data Type                            Identity
    --------  -----------------------------------  ----------
    Min       32-bit signed integer                INT_MAX
              64-bit signed integer                INT64_MAX
              32-bit unsigned integer              UINT_MAX
              64-bit unsigned integer              UINT64_MAX
              16-bit/32-bit/64-bit floating-point  +INF

    Max       32-bit signed integer                INT_MIN
              64-bit signed integer                INT64_MIN
              32-bit/64-bit unsigned integer       0
              16-bit/32-bit/64-bit floating-point  -INF

    Add       32-bit/64-bit signed integer         0
              32-bit/64-bit unsigned integer       0
              16-bit/32-bit/64-bit floating-point  0

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsAMD(genType v)                 | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsAMD(genIType v)               | invocations in the sub-group with <Reduce> group          |
    | genUType minInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType minInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsNonUniformAMD(genType v)       | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsNonUniformAMD(genIType v)     | invocations in the sub-group with <Reduce> group          |
    | genUType minInvocationsNonUniformAMD(genUType v)     | operation. These functions can be used in non-uniform     |
    | genDType minInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsInclusiveScanAMD(genType v)    | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsInclusiveScanAMD(genIType v)  | invocations in the sub-group with <InclusiveScan> group   |
    | genUType minInvocationsInclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType minInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsInclusiveScanNonUniformAMD(    | Returns the minimum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <InclusiveScan> group   |
    | genIType minInvocationsInclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType minInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType minInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsExclusiveScanAMD(genType v)    | Returns the minimum value of <v> across all active shader |
    | genIType minInvocationsExclusiveScanAMD(genIType v)  | invocations in the sub-group with <ExclusiveScan> group   |
    | genUType minInvocationsExclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType minInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType minInvocationsExclusiveScanNonUniformAMD(    | Returns the minimum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <ExclusiveScan> group   |
    | genIType minInvocationsExclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType minInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType minInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsAMD(genType v)                 | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsAMD(genIType v)               | invocations in the sub-group with <Reduce> group          |
    | genUType maxInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType maxInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsNonUniformAMD(genType v)       | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsNonUniformAMD(genIType v)     | invocations in the sub-group with <Reduce> group          |
    | genUType maxInvocationsNonUniformAMD(genUType v)     | operation. These functions can be used in non-uniform     |
    | genDType maxInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsInclusiveScanAMD(genType v)    | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsInclusiveScanAMD(genIType v)  | invocations in the sub-group with <InclusiveScan> group   |
    | genUType maxInvocationsInclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType maxInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsInclusiveScanNonUniformAMD(    | Returns the maximum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <InclusiveScan> group   |
    | genIType maxInvocationsInclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType maxInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType maxInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsExclusiveScanAMD(genType v)    | Returns the maximum value of <v> across all active shader |
    | genIType maxInvocationsExclusiveScanAMD(genIType v)  | invocations in the sub-group with <ExclusiveScan> group   |
    | genUType maxInvocationsExclusiveScanAMD(genUType v)  | operation. These functions must be used in uniform        |
    | genDType maxInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType maxInvocationsExclusiveScanNonUniformAMD(    | Returns the maximum value of <v> across all active shader |
    |     genType v)                                       | invocations in the sub-group with <ExclusiveScan> group   |
    | genIType maxInvocationsExclusiveScanNonUniformAMD(   | operation. These functions can be used in non-uniform     |
    |     genIType v)                                      | control flow. These functions operate component-wise.     |
    | genUType maxInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genUType v)                                      |                                                           |
    | genDType maxInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsAMD(genType v)                 | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsAMD(genIType v)               | shader invocations in the sub-group with <Reduce> group   |
    | genUType addInvocationsAMD(genUType v)               | operation. These functions must be used in uniform        |
    | genDType addInvocationsAMD(genDType v)               | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsNonUniformAMD(genType v)       | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsNonUniformAMD(genIType v)     | shader invocations in the sub-group with <Reduce> group   |
    | genUType addInvocationsNonUniformAMD(genUType v)     | operation. These functions can be used in non-uniform     |
    | genDType addInvocationsNonUniformAMD(genDType v)     | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsInclusiveScanAMD(genType v)    | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsInclusiveScanAMD(genIType v)  | shader invocations in the sub-group with <InclusiveScan>  |
    | genUType addInvocationsInclusiveScanAMD(genUType v)  | group operation. These functions must be used in uniform  |
    | genDType addInvocationsInclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsInclusiveScanNonUniformAMD(    | Returns the sum of the value of <v> across all active     |
    |     genType v)                                       | shader invocations in the sub-group with <InclusiveScan>  |
    | genIType addInvocationsInclusiveScanNonUniformAMD(   | group operation. These functions can be used in           |
    |     genIType v)                                      | non-uniform control flow. These functions operate         |
    | genUType addInvocationsInclusiveScanNonUniformAMD(   | component-wise.                                           |
    |     genUType v)                                      |                                                           |
    | genDType addInvocationsInclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsExclusiveScanAMD(genType v)    | Returns the sum of the value of <v> across all active     |
    | genIType addInvocationsExclusiveScanAMD(genIType v)  | shader invocations in the sub-group with <ExclusiveScan>  |
    | genUType addInvocationsExclusiveScanAMD(genUType v)  | group operation. These functions must be used in uniform  |
    | genDType addInvocationsExclusiveScanAMD(genDType v)  | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType addInvocationsExclusiveScanNonUniformAMD(    | Returns the sum of the value of <v> across all active     |
    |     genType v)                                       | shader invocations in the sub-group with <ExclusiveScan>  |
    | genIType addInvocationsExclusiveScanNonUniformAMD(   | group operation. These functions can be used in           |
    |     genIType v)                                      | non-uniform control flow. These functions operate         |
    | genUType addInvocationsExclusiveScanNonUniformAMD(   | component-wise.                                           |
    |     genUType v)                                      |                                                           |
    | genDType addInvocationsExclusiveScanNonUniformAMD(   |                                                           |
    |     genDType v)                                      |                                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType swizzleInvocationsAMD(                       | Swizzles data within a group of 4 consecutive invocations |
    |     genType data, uvec4 offset)                      | of the sub-group based on <offset> as described below:    |
    | genIType swizzleInvocationsAMD(                      |                                                           |
    |     genIType data, uvec4 offset)                     |   for (i = 0; i < gl_SubGroupSizeARB; i += 4) {           |
    | genUType swizzleInvocationsAMD(                      |       dataOut[i+0] = isActive[i+offset.x] ?               |
    |     genUType data, uvec4 offset)                     |                      dataIn[i+offset.x] : 0;              |
    |                                                      |       dataOut[i+1] = isActive[i+offset.y] ?               |
    |                                                      |                      dataIn[i+offset.y] : 0;              |
    |                                                      |       dataOut[i+2] = isActive[i+offset.z] ?               |
    |                                                      |                      dataIn[i+offset.z] : 0;              |
    |                                                      |       dataOut[i+3] = isActive[i+offset.w] ?               |
    |                                                      |                      dataIn[i+offset.w] : 0;              |
    |                                                      |   }                                                       |
    |                                                      |                                                           |
    |                                                      | Where:                                                    |
    |                                                      | - isActive[i] tells whether the invocation with the index |
    |                                                      |   <i> is currently active in the sub-group.               |
    |                                                      | - dataIn[i] is the value of <data> for invocation index   |
    |                                                      |   <i>.                                                    |
    |                                                      | - dataOut[i] is the return value of the function for      |
    |                                                      |   invocation index <i>.                                   |
    |                                                      |                                                           |
    |                                                      | Each component of <offset> must be a constant integer     |
    |                                                      | expression with a value in the range [0, 3].              |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType swizzleInvocationsMaskedAMD(                 | Swizzles data within a group of 32 consecutive            |
    |     genType data, uvec3 mask)                        | invocations with a limited mask as described below:       |
    | genIType swizzleInvocationsMaskedAMD(                |                                                           |
    |     genIType data, uvec3 mask)                       |   for (i = 0; i < gl_SubGroupSizeARB; i++) {              |
    | genUType swizzleInvocationsMaskedAMD(                |       j = (((i & 0x1f) & mask.x) | mask.y) ^ mask.z;      |
    |     genUType data, uvec3 mask)                       |       j |= (i & 0x20); // which group of 32               |
    |                                                      |       dataOut[i] = isActive[j] ? dataIn[j] : 0;           |
    |                                                      |   }                                                       |
    |                                                      |                                                           |
    |                                                      | Where:                                                    |
    |                                                      | - isActive[i] tells whether the invocation with the index |
    |                                                      |   <i> is currently active in the sub-group.               |
    |                                                      | - dataIn[i] is the value of <data> for invocation index   |
    |                                                      |   <i>.                                                    |
    |                                                      | - dataOut[i] is the return value of the function for      |
    |                                                      |   invocation index <i>.                                   |
    |                                                      |                                                           |
    |                                                      | Each component of <mask> must be a constant integer       |
    |                                                      | expression with a value in the range [0, 31].             |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genType writeInvocationAMD(                          | Returns <inputValue> for all active invocations in the    |
    |     genType inputValue,                              | sub-group except for the invocation whose invocation      |
    |     genType writeValue,                              | index within the sub-group is <invocationIndex> for which |
    |     uint invocationIndex)                            | <writeValue> is returned as described below:              |
    | genIType writeInvocationAMD(                         |                                                           |
    |     genIType inputValue,                             |   for (i = 0; i < gl_SubGroupSizeARB; i++) {              |
    |     genIType writeValue,                             |       out[i] = (i == invocationIndex) ?                   |
    |     uint invocationIndex)                            |                writeValue : inputValue;                   |
    | genUType writeInvocationAMD(                         |   }                                                       |
    |     genUType inputValue,                             |                                                           |
    |     genUType writeValue,                             | Where out[i] is the return value of the function for      |
    |     uint invocationIndex)                            | invocation index <i>.                                     |
    |                                                      |                                                           |
    |                                                      | <writeValue> and <invocationIndex> must be dynamically    |
    |                                                      | uniform within the sub-group, otherwise the return value  |
    |                                                      | of the function is undefined.                             |
    +------------------------------------------------------+-----------------------------------------------------------+

Dependencies on ARB_gpu_shader_int64

    If the shader enables ARB_gpu_shader_int64, this extension adds additional
    shader invocation group functions.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsAMD(genI64Type v)           | Returns the minimum value of <v> across all active shader |
    | genU64Type minInvocationsAMD(genU64Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsNonUniformAMD(genI64Type v) | Returns the minimum value of <v> across all active shader |
    | genU64Type minInvocationsNonUniformAMD(genU64Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type minInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type minInvocationsInclusiveScanNonUniformAMD( | operation. These functions can be used in non-uniform     |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type minInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type minInvocationsExclusiveScanNonUniformAMD( | operation. These functions can be used in non-uniform     |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsAMD(genI64Type v)           | Returns the maximum value of <v> across all active shader |
    | genU64Type maxInvocationsAMD(genU64Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsNonUniformAMD(genI64Type v) | Returns the maximum value of <v> across all active shader |
    | genU64Type maxInvocationsNonUniformAMD(genU64Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type maxInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU64Type maxInvocationsInclusiveScanNonUniformAMD( | operation. These functions can be used in non-uniform     |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type maxInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genI64Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU64Type maxInvocationsExclusiveScanNonUniformAMD( | operation. These functions can be used in non-uniform     |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsAMD(genI64Type v)           | Returns the sum of the value of <v> across all active     |
    | genU64Type addInvocationsAMD(genU64Type v)           | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsNonUniformAMD(genI64Type v) | Returns the sum of the value of <v> across all active     |
    | genU64Type addInvocationsNonUniformAMD(genU64Type v) | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    | genU64Type addInvocationsInclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    | genU64Type addInvocationsInclusiveScanNonUniformAMD( | group operation. These functions can be used in           |
    |     genU64Type v)                                    | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU64Type addInvocationsExclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |     genU64Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI64Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genI64Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU64Type addInvocationsExclusiveScanNonUniformAMD( | group operation. These functions can be used in           |
    |     genU64Type v)                                    | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | uint mbcntAMD(uint64_t mask)                         | Returns the number of bits set in the bitwise AND of      |
    |                                                      | gl_SubGroupLtMaskARB and <mask>, as described below:      |
    |                                                      |                                                           |
    |                                                      |   bitCount(gl_SubGroupLtMaskARB & mask).                  |
    +------------------------------------------------------+-----------------------------------------------------------+

Dependencies on AMD_gpu_shader_half_float

    If the shader enables AMD_gpu_shader_half_float, this extension adds
    additional shader invocation group functions.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsAMD(genF16Type v)           | Returns the minimum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsNonUniformAMD(genF16Type v) | Returns the minimum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsAMD(genF16Type v)           | Returns the maximum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsNonUniformAMD(genF16Type v) | Returns the maximum value of <v> across all active shader |
    |                                                      | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genF16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsAMD(genF16Type v)           | Returns the sum of the value of <v> across all active     |
    |                                                      | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsNonUniformAMD(genF16Type v) | Returns the sum of the value of <v> across all active     |
    |                                                      | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions can be used in non-uniform     |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    |                                                      | group operation. These functions must be used in uniform  |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    |                                                      | group operation. These functions can be used in           |
    |                                                      | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    |                                                      | group operation. These functions must be used in uniform  |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genF16Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genF16Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    |                                                      | group operation. These functions can be used in           |
    |                                                      | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.
    +------------------------------------------------------+-----------------------------------------------------------+

Dependencies on AMD_gpu_shader_int16

    If the shader enables AMD_gpu_shader_int16, this extension adds
    additional shader invocation group functions.

    Add Section 8.18, Shader Invocation Group Functions

    +------------------------------------------------------+-----------------------------------------------------------+
    | Syntax                                               | Description                                               |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type minInvocationsAMD(genI16Type v)           | Returns the minimum value of <v> across all active shader |
    | genU16Type minInvocationsAMD(genU16Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type minInvocationsNonUniformAMD(genI16Type v) | Returns the minimum value of <v> across all active shader |
    | genU16Type minInvocationsNonUniformAMD(genU16Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type minInvocationsInclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU16Type minInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type minInvocationsInclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU16Type minInvocationsInclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type minInvocationsExclusiveScanAMD(           | Returns the minimum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU16Type minInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type minInvocationsExclusiveScanNonUniformAMD( | Returns the minimum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU16Type minInvocationsExclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type maxInvocationsAMD(genI16Type v)           | Returns the maximum value of <v> across all active shader |
    | genU16Type maxInvocationsAMD(genU16Type v)           | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type maxInvocationsNonUniformAMD(genI16Type v) | Returns the maximum value of <v> across all active shader |
    | genU16Type maxInvocationsNonUniformAMD(genU16Type v) | invocations in the sub-group with <Reduce> group          |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type maxInvocationsInclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU16Type maxInvocationsInclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type maxInvocationsInclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <InclusiveScan> group   |
    | genU16Type maxInvocationsInclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type maxInvocationsExclusiveScanAMD(           | Returns the maximum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU16Type maxInvocationsExclusiveScanAMD(           | operation. These functions must be used in uniform        |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type maxInvocationsExclusiveScanNonUniformAMD( | Returns the maximum value of <v> across all active shader |
    |     genI16Type v)                                    | invocations in the sub-group with <ExclusiveScan> group   |
    | genU16Type maxInvocationsExclusiveScanNonUniformAMD( | operation. These functions could be used in non-uniform   |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type addInvocationsAMD(genI16Type v)           | Returns the sum of the value of <v> across all active     |
    | genU16Type addInvocationsAMD(genU16Type v)           | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions must be used in uniform        |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type addInvocationsNonUniformAMD(genI16Type v) | Returns the sum of the value of <v> across all active     |
    | genU16Type addInvocationsNonUniformAMD(genU16Type v) | shader invocations in the sub-group with <Reduce> group   |
    |                                                      | operation. These functions could be used in non-uniform   |
    |                                                      | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type addInvocationsInclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genI16Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    | genU16Type addInvocationsInclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type addInvocationsInclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genI16Type v)                                    | shader invocations in the sub-group with <InclusiveScan>  |
    | genU16Type addInvocationsInclusiveScanNonUniformAMD( | group operation. These functions could be used in         |
    |     genU16Type v)                                    | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type addInvocationsExclusiveScanAMD(           | Returns the sum of the value of <v> across all active     |
    |     genI16Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU16Type addInvocationsExclusiveScanAMD(           | group operation. These functions must be used in uniform  |
    |     genU16Type v)                                    | control flow. These functions operate component-wise.     |
    +------------------------------------------------------+-----------------------------------------------------------+
    | genI16Type addInvocationsExclusiveScanNonUniformAMD( | Returns the sum of the value of <v> across all active     |
    |     genI16Type v)                                    | shader invocations in the sub-group with <ExclusiveScan>  |
    | genU16Type addInvocationsExclusiveScanNonUniformAMD( | group operation. These functions could be used in         |
    |     genU16Type v)                                    | non-uniform control flow. These functions operate         |
    |                                                      | component-wise.                                           |
    +------------------------------------------------------+-----------------------------------------------------------+

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Errors

    None.

Issues


Revision History

    Rev.  Date        Author    Changes
    ----  ----------  --------  --------------------------------------------------
    5     03/28/2018  rexu      Add interactions with AMD_gpu_shader_int16. New
                                group invocation functions are added to support
                                16-bit integer type in group operations.

    4     10/19/2016  rexu      Add interactions with ARB_gpu_shader_int64 and
                                AMD_gpu_shader_half_float. New group invocation
                                functions are added to support 64-bit integer
                                type and 16-bit/64-bit floating-point type
                                in group operations. Clarify that <mask> in
                                swizzleInvocationsMaskedAMD() should be a constant
                                integer expression with a value in the range
                                [0, 31].

    3     08/16/2016  rexu      Clarify that minInvocationsAMD, maxInvocationsAMD,
                                addInvocationsAMD, along with their non-uniform
                                versions, operate component-wise rather than on
                                whole vectors.

    2     08/11/2016  rexu      Add non-uniform versions of minInvocationsAMD,
                                maxInvocationsAMD, and addInvocationsAMD.
                                Support those operations in non-uniform control
                                flow.

    1     04/21/2016  qlin      Internal revisions.