Name

    ARB_shader_ballot

Name Strings

    GL_ARB_shader_ballot

Contact

    Timothy Lottes (timothy.lottes 'at' amd.com)

Contributors

    Timothy Lottes, AMD
    Graham Sellers, AMD
    Daniel Rakos, AMD
    Jeannot Breton, NVIDIA
    Pat Brown, NVIDIA
    Eric Werness, NVIDIA
    Mark Kilgard, NVIDIA
    Jeff Bolz, NVIDIA

Notice

    Copyright (c) 2015 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL Working Group. For
    extensions which have been promoted to a core Specification, fixes will
    first appear in the latest version of that core Specification, and will
    eventually be backported to the extension document. This policy is
    described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete. Approved by the ARB on June 26, 2015.
    Ratified by the Khronos Board of Promoters on August 7, 2015.

Version

    Last Modified Date: 03/18/2017
    Revision: 8

Number

    ARB Extension #183

Dependencies

    This extension is written against Revision 5 of version 4.50 of the
    OpenGL Shading Language Specification, dated January 30, 2015.

    This extension requires GL_ARB_gpu_shader_int64.

Overview

    This extension allows a group of invocations which execute in lockstep
    to perform limited forms of cross-invocation communication, either via
    a group broadcast of an invocation value or via a broadcast of a bitarray
    representing a predicate value from each invocation in the group.

New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_ARB_shader_ballot : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_ARB_shader_ballot               1

Additions to Chapter 7 of the OpenGL Shading Language Specification
(Built-in Variables)

    Modify Section 7.4, Built-In Uniform State, p. 133

    (Add to the list of built-in uniform variable declarations)

        uniform uint  gl_SubGroupSizeARB;

    (Add this paragraph at the end of this section)

    A sub-group is a collection of invocations which execute in lockstep.
    The variable <gl_SubGroupSizeARB> is the maximum number of invocations
    in a sub-group. The maximum <gl_SubGroupSizeARB> supported in this
    extension is 64.

    Modify Section 7.1, Built-In Language Variables, p. 110

    (Add to the list of built-in variables for the compute, vertex, geometry,
     tessellation control, tessellation evaluation and fragment languages)

        in uint     gl_SubGroupInvocationARB;
        in uint64_t gl_SubGroupEqMaskARB;
        in uint64_t gl_SubGroupGeMaskARB;
        in uint64_t gl_SubGroupGtMaskARB;
        in uint64_t gl_SubGroupLeMaskARB;
        in uint64_t gl_SubGroupLtMaskARB;

    (Add these paragraphs at the end of this section)

    The variable <gl_SubGroupInvocationARB> holds the index of the invocation
    within the sub-group. This variable is in the range 0 to
    <gl_SubGroupSizeARB>-1, where <gl_SubGroupSizeARB> is the total number of
    invocations in a sub-group.

    The <gl_SubGroup??MaskARB> variables provide a bitmask for all invocations,
    with one bit per invocation starting with the least significant bit,
    according to the following table:

        variable               equation for bit values
        --------------------   ------------------------------------
        gl_SubGroupEqMaskARB   bit index == gl_SubGroupInvocationARB
        gl_SubGroupGeMaskARB   bit index >= gl_SubGroupInvocationARB
        gl_SubGroupGtMaskARB   bit index >  gl_SubGroupInvocationARB
        gl_SubGroupLeMaskARB   bit index <= gl_SubGroupInvocationARB
        gl_SubGroupLtMaskARB   bit index <  gl_SubGroupInvocationARB

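    The mask table can be modelled on the CPU, as in the following minimal
    sketch in C. It is not shader code and not part of the extension: the
    type SubGroupMasks and the function subgroup_masks() are hypothetical
    names introduced for illustration. The sketch applies each comparison to
    all 64 bit positions and does not clear bits beyond the actual sub-group
    size; that is an assumption, since the table does not say how those bits
    behave.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical CPU-side model of the gl_SubGroup??MaskARB built-ins. */
typedef struct {
    uint64_t eq, ge, gt, le, lt;
} SubGroupMasks;

/* Builds the five masks for the invocation at index `invocation`.
 * Bit 0 is the least significant bit. Assumption: the comparison is
 * applied to all 64 bit positions, regardless of the sub-group size. */
SubGroupMasks subgroup_masks(unsigned invocation)
{
    SubGroupMasks m;

    assert(invocation < 64u);
    m.eq = UINT64_C(1) << invocation; /* bit index == invocation */
    m.lt = m.eq - 1u;                 /* bit index <  invocation */
    m.le = m.lt | m.eq;               /* bit index <= invocation */
    m.gt = ~m.le;                     /* bit index >  invocation */
    m.ge = ~m.lt;                     /* bit index >= invocation */
    return m;
}
```

    For invocation index 3, this model yields an Eq mask of 0x8, an Lt mask
    of 0x7 and an Le mask of 0xF; the Gt and Ge masks are the bitwise
    complements of the Le and Lt masks respectively.
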
Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Add Section 8.18, Shader Invocation Group Functions

    Syntax:

        uint64_t ballotARB(bool value);

    The function ballotARB() returns a bitfield containing the result of
    evaluating the expression <value> in all active invocations in the
    sub-group. An active invocation is one that is executing the ballotARB()
    call. The sub-group may have inactive invocations, for example, due to
    exit of the shader or divergent branching. Sub-groups of up to 64
    invocations may be represented by the return value of ballotARB(). Bits
    for each invocation are packed starting with the least significant bit.
    If <value> evaluates to true for an active invocation, then the
    corresponding bit is set to one in the result; otherwise it is zero. Bits
    corresponding to invocations that are not active or that do not exist in
    the sub-group (because, for example, they are at bit positions beyond the
    sub-group size) are set to zero. The following trivial assumptions can be
    made:

        * ballotARB(true) returns a bitfield where the corresponding bits are
          set for all active invocations in the sub-group.

        * ballotARB(false) returns zero.

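    The ballotARB() rules above can be sketched as a CPU-side model in C.
    This is an illustration under stated assumptions, not the way any
    implementation computes the result: the function ballot_model(), the
    per-invocation predicate array and the <active> bitmask (bit i set when
    invocation i is executing the call) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of ballotARB() for one sub-group.
 * value[i]      : predicate evaluated by invocation i
 * subgroup_size : number of invocations that exist (at most 64)
 * active        : bit i set if invocation i is executing the call */
uint64_t ballot_model(const bool value[], unsigned subgroup_size,
                      uint64_t active)
{
    uint64_t result = 0;

    assert(subgroup_size <= 64u);
    for (unsigned i = 0; i < subgroup_size; ++i) {
        bool is_active = ((active >> i) & 1u) != 0;

        /* Only active invocations whose predicate is true set a bit. */
        if (is_active && value[i])
            result |= UINT64_C(1) << i;
    }
    /* Bits for inactive or non-existent invocations remain zero. */
    return result;
}
```

    With predicates {true, false, true, true} and an active mask of 0xB
    (invocation 2 inactive), this model returns 0x9. Passing true for every
    invocation returns the active mask itself, and passing false for every
    invocation returns zero, matching the two assumptions listed above.
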
    Syntax:

        genType readInvocationARB(genType value, uint invocationIndex);
        genIType readInvocationARB(genIType value, uint invocationIndex);
        genUType readInvocationARB(genUType value, uint invocationIndex);

        genType readFirstInvocationARB(genType value);
        genIType readFirstInvocationARB(genIType value);
        genUType readFirstInvocationARB(genUType value);

    The function readInvocationARB() returns the <value> from a given
    <invocationIndex> to all active invocations in the sub-group.
    The <invocationIndex> must be the same for all active invocations
    in the sub-group; otherwise, results are undefined.

    The function readFirstInvocationARB() returns the <value> from the first
    active invocation to all active invocations in the sub-group.

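    The two broadcast functions can be modelled on the CPU in the same
    spirit. The sketch below is in C, covers only a scalar float <value>,
    and uses hypothetical names (read_invocation_model(),
    read_first_invocation_model(), the <active> bitmask). It assumes that
    the "first" active invocation is the one with the lowest index, and it
    does not model what happens when <invocationIndex> refers to an inactive
    invocation; the text above does not settle either point.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of readInvocationARB(): every active invocation
 * receives the value held by the invocation at `invocation_index`. */
float read_invocation_model(const float value[], unsigned subgroup_size,
                            unsigned invocation_index)
{
    assert(invocation_index < subgroup_size);
    return value[invocation_index];
}

/* Hypothetical model of readFirstInvocationARB(). Assumption: "first"
 * means the active invocation with the lowest index. Bit i of `active`
 * is set if invocation i is executing the call. */
float read_first_invocation_model(const float value[],
                                  unsigned subgroup_size, uint64_t active)
{
    assert(subgroup_size <= 64u);
    for (unsigned i = 0; i < subgroup_size; ++i) {
        if ((active >> i) & 1u)
            return value[i];
    }
    /* At least one invocation must be active to make the call. */
    assert(!"no active invocation in the sub-group");
    return 0.0f;
}
```

    With values {1.5, 2.5, 3.5, 4.5}, reading invocation 2 yields 3.5 in
    this model, and reading the first invocation with an active mask of 0xC
    (only invocations 2 and 3 active) also yields 3.5.
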
Issues

    1) How are the values of gl_SubGroup??MaskARB defined?

    RESOLVED.  Earlier versions of this specification defined a bitmask
    such as "LtMask" ("less than mask") as having bits set if
    "gl_SubGroupInvocationARB <  bit index".  However, this was reversed
    from the definition in GL_NV_shader_thread_group that these built-ins
    were derived from, and also mismatched a recent Vulkan/SPIR-V extension.

    Fortunately, all known implementations of this extension had implemented
    the "wrong" behavior (matching the sense of the original built-ins in
    GL_NV_shader_thread_group), so the best thing to do is change the
    definition in the spec.

Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      8  03/18/2017  jbolz     Reversed the sense of the comparison in the
                               definition of gl_SubGroup??MaskARB.
      7  08/25/2015  nhenning  Add ARB suffix on documentation for
                               readInvocation and readFirstInvocation
                               functions.
      6  07/31/2015  pdaniell  Add ARB suffix on the readInvocation and
                               readFirstInvocation functions.
      5  07/30/2015  pdaniell  Update the function definition syntax to use
                               our standard gen*Type conventions.
      4  06/23/2015  tlottes   More precise spec language.
      3  06/22/2015  tlottes   Deferred GPU processor another spec.
                               Cleaned up spec language.
      2  04/20/2015  tlottes   Updated spec language.
      1  03/09/2015  tlottes   Initial revision based on AMD_gcn_shader and
                               NV_shader_thread_group.
