1Name
2
3    ARB_compute_variable_group_size
4
5Name Strings
6
7    GL_ARB_compute_variable_group_size
8
9Contact
10
11    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)
12
13Contributors
14
15    Slawomir Grajewski, Intel Corporation
16    Jeannot Breton, NVIDIA
17    Daniel Koch, NVIDIA
18
19Notice
20
21    Copyright (c) 2013 The Khronos Group Inc. Copyright terms at
22        http://www.khronos.org/registry/speccopyright.html
23
24Specification Update Policy
25
26    Khronos-approved extension specifications are updated in response to
27    issues and bugs prioritized by the Khronos OpenGL Working Group. For
28    extensions which have been promoted to a core Specification, fixes will
29    first appear in the latest version of that core Specification, and will
30    eventually be backported to the extension document. This policy is
31    described in more detail at
32        https://www.khronos.org/registry/OpenGL/docs/update_policy.php
33
34Status
35
36    Complete. Approved by the ARB on June 3, 2013.
37    Ratified by the Khronos Board of Promoters on July 19, 2013.
38
39Version
40
41    Last Modified Date:         December 10, 2018
42    Revision:                   9
43
44Number
45
46    ARB Extension #153
47
48Dependencies
49
50    This extension is written against the OpenGL 4.3 (Compatibility Profile)
51    Specification, dated August 6, 2012.
52
53    This extension is written against the OpenGL Shading Language
54    Specification, Version 4.30, Revision 7, dated September 24, 2012.
55
56    OpenGL 4.3 or ARB_compute_shader is required.
57
58    This extension interacts with NV_compute_program5.
59
60Overview
61
62    This extension allows applications to write generic compute shaders that
63    operate on workgroups with arbitrary dimensions.  Instead of specifying a
64    fixed workgroup size in the compute shader, an application can declare a
65    variable workgroup size using the /local_size_variable/ layout qualifier.
66    Such compute shaders are dispatched with the new command
67    DispatchComputeGroupSizeARB, which specifies both a workgroup size and a
68    workgroup count.
69
70    In this extension, compute shaders with fixed group sizes must be
71    dispatched by DispatchCompute or DispatchComputeIndirect.  Compute
72    shaders with variable group sizes must be dispatched via
73    DispatchComputeGroupSizeARB.  No support is provided in this extension for
74    indirect dispatch of compute shaders with a variable group size.
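
    As an illustration of how an application might choose the six arguments
    of DispatchComputeGroupSizeARB, the following C sketch rounds a problem
    size up to a whole number of workgroups.  The DispatchArgs type and both
    helper functions are hypothetical names used only for this example; they
    are not part of the extension.

```c
#include <stdint.h>

/* Hypothetical container for the six DispatchComputeGroupSizeARB arguments. */
typedef struct {
    uint32_t num_groups[3];
    uint32_t group_size[3];
} DispatchArgs;

/* Smallest number of groups of `group_size` invocations covering `items`.
 * Assumes group_size is non-zero (zero is an INVALID_VALUE error anyway). */
static uint32_t groups_for(uint32_t items, uint32_t group_size)
{
    return items / group_size + (items % group_size != 0);
}

/* Build dispatch arguments for an items[0] x items[1] x items[2] problem. */
static DispatchArgs make_dispatch_args(const uint32_t items[3],
                                       const uint32_t group_size[3])
{
    DispatchArgs args;
    for (int i = 0; i < 3; ++i) {
        args.group_size[i] = group_size[i];
        args.num_groups[i] = groups_for(items[i], group_size[i]);
    }
    return args;
}
```

    The resulting values would be passed as <num_groups_x/y/z> and
    <group_size_x/y/z>.  Because the group count is rounded up, the shader
    is assumed to bounds-check gl_GlobalInvocationID against the real
    problem size.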
75
76New Procedures and Functions
77
78    void DispatchComputeGroupSizeARB(uint num_groups_x, uint num_groups_y,
79                                     uint num_groups_z, uint group_size_x,
80                                     uint group_size_y, uint group_size_z);
81
82New Tokens
83
84    Accepted by the <pname> parameter of GetIntegerv, GetBooleanv, GetFloatv,
85    GetDoublev and GetInteger64v:
86
87        MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB      0x9344
88        MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB         0x90EB (see note)
89
90    Accepted by the <pname> parameter of GetIntegeri_v, GetBooleani_v,
91    GetFloati_v, GetDoublei_v and GetInteger64i_v:
92
93        MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB             0x9345
94        MAX_COMPUTE_FIXED_GROUP_SIZE_ARB                0x91BF (see note)
95
96    Note:  MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB and
97    MAX_COMPUTE_FIXED_GROUP_SIZE_ARB are aliases for the OpenGL 4.3 core enums
98    MAX_COMPUTE_WORK_GROUP_INVOCATIONS and MAX_COMPUTE_WORK_GROUP_SIZE,
99    respectively.
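
    For headers that predate this extension, the tokens above could be
    declared as in the following C sketch.  The enum values are copied from
    the list above with the usual GL_ prefix; the two selector functions are
    hypothetical helpers, added only to show which limit applies to which
    kind of shader.

```c
/* Token values as listed under "New Tokens". */
#ifndef GL_MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB
#define GL_MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB 0x9344
#endif
#ifndef GL_MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB
#define GL_MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB    0x90EB
#endif
#ifndef GL_MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB
#define GL_MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB        0x9345
#endif
#ifndef GL_MAX_COMPUTE_FIXED_GROUP_SIZE_ARB
#define GL_MAX_COMPUTE_FIXED_GROUP_SIZE_ARB           0x91BF
#endif

/* Hypothetical helper: token for the per-dimension size limit
 * (queried with GetIntegeri_v, index 0, 1, or 2). */
static unsigned group_size_limit_token(int variable_size)
{
    return variable_size ? GL_MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB
                         : GL_MAX_COMPUTE_FIXED_GROUP_SIZE_ARB;
}

/* Hypothetical helper: token for the total invocation limit
 * (queried with GetIntegerv). */
static unsigned group_invocations_limit_token(int variable_size)
{
    return variable_size ? GL_MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB
                         : GL_MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB;
}
```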
100
101
102Modifications to the OpenGL 4.3 (Compatibility Profile) Specification
103
104    Modify Chapter 19, Compute Shaders, p. 585
105
106    (modify second paragraph, p. 585)
107
108    ... One or more workgroups is launched by calling
109
110      void DispatchCompute(uint num_groups_x, uint num_groups_y,
111                           uint num_groups_z)
112
113    or
114
115      void DispatchComputeGroupSizeARB(uint num_groups_x, uint num_groups_y,
116                                       uint num_groups_z, uint group_size_x,
117                                       uint group_size_y, uint group_size_z);
118
119    (modify second paragraph, p. 586)
120
121    For DispatchCompute, the workgroup size in each dimension must be
122    specified at compile time in the active program for the compute shader
123    stage.  The workgroup size is specified using an input layout qualifier
124    ...
125
126    (insert after second paragraph, p. 586)
127
128    For DispatchComputeGroupSizeARB, the workgroup size must be specified as
129    variable in the active program for the compute shader stage.  The group
130    size used to execute the compute shader is taken from the <group_size_x>,
131    <group_size_y>, and <group_size_z> parameters.  For the purposes of the
132    COMPUTE_WORK_GROUP_SIZE query, a program without a workgroup size
133    specified at compile time will be considered to have a size of zero in
134    each dimension.
135
136    (modify the third paragraph, p. 586)
137
138    The maximum size of a workgroup may be determined by calling
139    GetIntegeri_v with <index> set to 0, 1, or 2 to retrieve the maximum work
140    size in the X, Y and Z dimension, respectively.  <target> should be set to
141    MAX_COMPUTE_FIXED_GROUP_SIZE_ARB for compute shaders with fixed group
142    sizes or MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB for compute shaders with
143    variable local group sizes.  Furthermore, the maximum number of
144    invocations in a single workgroup (i.e., the product of the three
145    dimensions) may be determined by calling GetIntegerv with <pname> set to
146    MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB for compute shaders with fixed
147    group sizes or MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB for compute
148    shaders with variable group sizes.
149
150    (insert after the first INVALID_OPERATION error in the first error block,
151     shared between DispatchCompute and DispatchComputeGroupSizeARB, p. 586)
152
153    An INVALID_OPERATION error is generated by DispatchCompute if the active
154    program for the compute shader stage has a variable workgroup
155    size.
156
157    An INVALID_OPERATION error is generated by DispatchComputeGroupSizeARB if
158    the active program for the compute shader stage has a fixed workgroup
159    size.
160
161    (insert at the end of the first error block, shared between
162     DispatchCompute and DispatchComputeGroupSizeARB, p. 586)
163
164    An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if any
165    of <group_size_x>, <group_size_y>, or <group_size_z> is less than or equal
166    to zero or greater than the maximum workgroup size for compute
167    shaders with variable group size (MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB) in
168    the corresponding dimension.
169
170    An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if the
171    product of <group_size_x>, <group_size_y>, and <group_size_z> exceeds the
172    implementation-dependent maximum workgroup invocation count for
173    compute shaders with variable group size
174    (MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB).
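
    The two INVALID_VALUE conditions above can be mirrored on the
    application side before dispatching.  The following C sketch is one
    possible pre-check, not the behavior of any particular implementation;
    the function name is hypothetical, and the limits are assumed to have
    been queried beforehand with GetIntegeri_v and GetIntegerv.

```c
#include <stdint.h>

#ifndef GL_NO_ERROR
#define GL_NO_ERROR      0
#endif
#ifndef GL_INVALID_VALUE
#define GL_INVALID_VALUE 0x0501
#endif

/* Returns GL_INVALID_VALUE if `size` would be rejected by
 * DispatchComputeGroupSizeARB, given the queried limits:
 *   max_size        <- MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB (x, y, z)
 *   max_invocations <- MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB */
static unsigned check_variable_group_size(const uint32_t size[3],
                                          const uint32_t max_size[3],
                                          uint32_t max_invocations)
{
    uint64_t product = 1;
    for (int i = 0; i < 3; ++i) {
        /* Each dimension must be non-zero and within its own limit. */
        if (size[i] == 0 || size[i] > max_size[i])
            return GL_INVALID_VALUE;
        /* Check the running product at each step so it cannot overflow. */
        product *= size[i];
        if (product > max_invocations)
            return GL_INVALID_VALUE;
    }
    return GL_NO_ERROR;
}
```

    With the minimum limits required by this extension (512/512/64 and 512
    invocations), a group of 8x8x8 passes this check, while 16x8x8 fails on
    the invocation count even though each dimension is individually legal.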
175
176    (insert at the end of the first error block, for DispatchComputeIndirect,
177     p. 587)
178
179    An INVALID_OPERATION error is generated if the active program for the
180    compute shader stage has a variable workgroup size.
181
182
183Modifications to the OpenGL Shading Language Specification, Version 4.30
184
185    Including the following line in a shader can be used to control the
186    language features described in this extension:
187
188      #extension GL_ARB_compute_variable_group_size : <behavior>
189
190    where <behavior> is as specified in section 3.3.
191
192    New preprocessor #defines are added to the OpenGL Shading Language:
193
194      #define GL_ARB_compute_variable_group_size        1
195
196
197    Modify Section 4.4.1.4, Compute Shader Inputs (p. 59)
198
199    (add to list of layout qualifiers for compute shader inputs, p. 59)
200
201      layout-qualifier-id
202        local_size_x = integer-constant
203        local_size_y = integer-constant
204        local_size_z = integer-constant
205        local_size_variable
206
207    (modify the last paragraph, p. 59)
208
209    The local_size_x, local_size_y, and local_size_z qualifiers are used to
210    declare a fixed local group size for the kernel in the first, second...
211
212    (modify the second to last paragraph in the section)
213
214    If the fixed local group size of the shader in any dimension...
215    ... If multiple compute shaders attached to a single program object declare
216    a fixed local group size, the declarations must be identical; otherwise a
217    link-time error results.
218
219    (insert before the last paragraph of the section, p. 60)
220
221    The *local_size_variable* qualifier is used to declare that
222    the local group size of the shader is variable, and will be specified
223    using arguments to OpenGL API compute dispatch commands.  If a compute
224    shader including a *local_size_variable* qualifier also declares a
225    fixed local group size using the *local_size_x*, *local_size_y*, or
226    *local_size_z* qualifiers, a compile-time error results.  If one compute
227    shader attached to a program declares a variable local group size and a
228    second compute shader attached to the same program declares a fixed
229    local group size, a link-time error results.
230
231    (modify last paragraph of the section, p. 60, which specified link errors
232     if *local_size* layout qualifiers were omitted)
233
234    Furthermore, if a program object contains any compute shaders, at least
235    one must contain an input layout qualifier specifying a fixed or variable
236    local group size for the program, or a link-time error will occur.
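
    A minimal shader using the new qualifier might look like the source
    string in the following C sketch.  The buffer block, the "count"
    uniform, and the variable name kVariableSizeShader are illustrative
    assumptions; only the #extension line and the *local_size_variable*
    qualifier come from this specification.

```c
/* GLSL source for a compute shader with a variable local group size.
 * It declares no local_size_x/y/z, since mixing those with
 * local_size_variable is a compile-time error. */
static const char *const kVariableSizeShader =
    "#version 430\n"
    "#extension GL_ARB_compute_variable_group_size : require\n"
    "layout(local_size_variable) in;\n"
    "layout(std430, binding = 0) buffer Data { float values[]; };\n"
    "uniform uint count;\n"
    "void main() {\n"
    "    uint i = gl_GlobalInvocationID.x;\n"
    "    if (i < count) values[i] *= 2.0;\n"
    "}\n";
```

    A program linked from this source would have to be dispatched with
    DispatchComputeGroupSizeARB; DispatchCompute and
    DispatchComputeIndirect would generate INVALID_OPERATION.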
237
238
239    Modify Section 7.1, Built-In Language Variables, p. 110
240
241    (add to list of compute built-ins, p. 110)
242
243      in    uvec3 gl_NumWorkGroups;     // already exists in 4.30
244      const uvec3 gl_WorkGroupSize;     // already exists in 4.30
245      in    uvec3 gl_LocalGroupSizeARB; // new!
246
247    (modify third paragraph, p. 113)
248
249    The built-in constant gl_WorkGroupSize is a compute-shader constant ...
250    It is a compile-time error to use gl_WorkGroupSize in a shader that does
251    not declare a fixed local group size, or before that shader has declared
252    a fixed local group size, using local_size_x, local_size_y, and
253    local_size_z.   ...
254
255    (insert after third paragraph, p. 113)
256
257    The built-in variable /gl_LocalGroupSizeARB/ is a compute-shader input
258    variable containing the workgroup size for the current compute-
259    shader workgroup.  For compute shaders with a fixed local group size (using
260    *local_size_x*, *local_size_y*, or *local_size_z* layout qualifiers), its
261    value will be the same as the constant /gl_WorkGroupSize/.  For compute
262    shaders with a variable local group size (using *local_size_variable*),
263    the value of /gl_LocalGroupSizeARB/ will be the workgroup
264    size specified in the OpenGL API command dispatching the current
265    compute shader work.
266
267    (modify next-to-last paragraph, p. 113)
268
269    The built-in variable gl_LocalInvocationID ...  The possible values for
270    this variable range across the workgroup size, i.e., (0,0,0) to
271    (gl_LocalGroupSizeARB.x - 1, gl_LocalGroupSizeARB.y - 1,
272    gl_LocalGroupSizeARB.z - 1).
273
274    (modify last paragraph, p. 113)
275
276    The built-in variable gl_GlobalInvocationID ...  This is computed as:
277
278      gl_GlobalInvocationID = gl_WorkGroupID * gl_LocalGroupSizeARB +
279                              gl_LocalInvocationID;
280
281
282    (modify first paragraph, p. 114)
283
284    The built-in variable gl_LocalInvocationIndex ...  This is computed as:
285
286      gl_LocalInvocationIndex =
287        gl_LocalInvocationID.z * (gl_LocalGroupSizeARB.x *
288                                  gl_LocalGroupSizeARB.y) +
289        gl_LocalInvocationID.y * gl_LocalGroupSizeARB.x +
290        gl_LocalInvocationID.x;
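
    The two formulas above can be checked on the CPU.  The following C
    sketch models them directly; the UVec3 type and the function names are
    hypothetical stand-ins for the GLSL built-ins and are not part of any
    API.

```c
#include <stdint.h>

/* Stand-in for a GLSL uvec3. */
typedef struct { uint32_t x, y, z; } UVec3;

/* gl_GlobalInvocationID =
 *     gl_WorkGroupID * gl_LocalGroupSizeARB + gl_LocalInvocationID */
static UVec3 global_invocation_id(UVec3 group_id, UVec3 local_size,
                                  UVec3 local_id)
{
    UVec3 g = {
        group_id.x * local_size.x + local_id.x,
        group_id.y * local_size.y + local_id.y,
        group_id.z * local_size.z + local_id.z
    };
    return g;
}

/* gl_LocalInvocationIndex: row-major flattening of gl_LocalInvocationID. */
static uint32_t local_invocation_index(UVec3 local_size, UVec3 local_id)
{
    return local_id.z * (local_size.x * local_size.y) +
           local_id.y * local_size.x +
           local_id.x;
}
```

    For a group size of 8x4x2, the last invocation (7,3,1) has local index
    1*32 + 3*8 + 7 = 63, one less than the 64 invocations in the group.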
291
292
293Additions to the AGL/EGL/GLX/WGL Specifications
294
295    None
296
297GLX Protocol
298
299    TBD
300
301Dependencies on NV_compute_program5
302
303    If NV_compute_program5 is supported, variable workgroup sizes are
304    supported for assembly programs.  Make the following edits to the
305    NV_compute_program5 specification:
306
307    (modify the NV_compute_program5 edits to Section 2.X.3.2, Program
308     Attribute Variables)
309
310    If a compute attribute binding matches "invocation.groupsize", the "x",
311    "y", and "z" components of the invocation attribute variable are filled
312    with the "x", "y", and "z" dimensions, respectively, of the workgroup,
313    as specified by the GROUP_SIZE declaration for programs with fixed-size
314    workgroups or through the OpenGL API for programs with variable-size
315    workgroups.  The "w" component of the attribute is undefined.
316
317    (add to section 2.X.6 of the NV_gpu_program4/5 spec, Program Options)
318
319    + Compute Shader Variable Group Size (ARB_compute_variable_group_size)
320
321    If a program specifies the "ARB_compute_variable_group_size" option, it
322    supports variable-size workgroups.  Compute programs with a variable
323    workgroup size must be dispatched with DispatchComputeGroupSizeARB.  Compute
324    programs with a fixed workgroup size must be dispatched with
325    DispatchCompute or DispatchComputeIndirect.
326
327    (modify Section 2.X.7.Y, Compute Program Declarations)
328
329    - Shader Thread Group Size (GROUP_SIZE)
330
331    The GROUP_SIZE statement declares the number of shader threads in a one-,
332    two-, or three-dimensional workgroup.  The statement must have one
333    to three unsigned integer arguments.  Each argument must be less than or
334    equal to the value of the implementation-dependent limit
335    MAX_COMPUTE_LOCAL_WORK_SIZE for its corresponding dimension (X, Y, or Z).
336    If the ARB_compute_variable_group_size option is specified, no fixed group
337    size should be specified and a program will fail to load if it includes
338    any GROUP_SIZE declaration.  If the ARB_compute_variable_group_size option
339    is not specified, a program will fail to load unless it contains exactly
340    one GROUP_SIZE declaration.
341
342Errors
343
344    An INVALID_OPERATION error is generated by DispatchCompute or
345    DispatchComputeIndirect if the active program for the compute shader stage
346    has a variable workgroup size.
347
348    An INVALID_OPERATION error is generated by DispatchComputeGroupSizeARB if
349    the active program for the compute shader stage has a fixed workgroup
350    size.
351
352    An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if any
353    of <group_size_x>, <group_size_y>, or <group_size_z> is less than or equal
354    to zero or greater than the maximum workgroup size for compute
355    shaders with variable group size (MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB) in
356    the corresponding dimension.
357
358    An INVALID_VALUE error is generated by DispatchComputeGroupSizeARB if the
359    product of <group_size_x>, <group_size_y>, and <group_size_z> exceeds the
360    implementation-dependent maximum workgroup invocation count for
361    compute shaders with variable group size
362    (MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB).
363
364New State
365
366    None.
367
368New Implementation Dependent State
369
370    Add to Table 23.73 (Implementation Dependent Compute Shader Limits),
371    p. 716
372
373                                                    Minimum
374    Get Value                  Type  Get Command     Value     Description                    Sec.
375    -------------------------  ----  -------------  ---------  ----------------------------   ------
376    MAX_COMPUTE_VARIABLE_      3xZ+  GetIntegeri_v  512 (x,y)  maximum local group size for   19
377      WORK_GROUP_SIZE_ARB                           64 (z)     compute shaders with variable
378                                                               group size (per dimension)
379    MAX_COMPUTE_VARIABLE_      Z+    GetIntegerv    512        maximum number of invocations  19
380      WORK_GROUP_                                              in a group for compute shaders
381      INVOCATIONS_ARB                                          with variable group size
382
383    In table 23.73, rename entries for "MAX_COMPUTE_WORK_GROUP_SIZE" and
384    "MAX_COMPUTE_WORK_GROUP_INVOCATIONS" to use the labels
385    "MAX_COMPUTE_FIXED_GROUP_SIZE_ARB" and
386    "MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB", respectively.  Also modify the
387    description of these entries to refer to "compute shaders with fixed group
388    size".
389
390Issues
391
392    (1) If a compute shader declares a workgroup size, can it be dispatched
393        using OpenGL APIs accepting an explicit workgroup size as part of the
394        command?  If so, what happens?
395
396      RESOLVED:  No.  Attempting to do so will generate an INVALID_OPERATION
397      error.
398
399      Since the fixed workgroup size may affect the compilation of the shader
400      and the value of certain built-in constants, having the OpenGL API
401      override the workgroup size baked into the compute shader seems
402      suspect.  We could conceivably allow an explicit workgroup size in the
403      OpenGL API and require that it match the workgroup size baked into the
404      compute shader, but doing so seems to be of limited value.
405
406    (2) If a compute shader doesn't declare a workgroup size, can it be
407        dispatched using OpenGL APIs that do not accept an explicit workgroup
408        size as part of the command?  If so, what happens?
409
410      RESOLVED:  No.  Attempting to do so will generate an INVALID_OPERATION
411      error.
412
413      We could theoretically treat this case as allowing OpenGL
414      implementations to pick a workgroup size that "works well" on a
415      particular piece of hardware.  However, that wouldn't resolve the
416      question of what the "num_groups" arguments to DispatchCompute would
417      mean if the group size were implementation-dependent.  One could
418      interpret the "num_groups" arguments as specifying the number of
419      *invocations* in each dimension, as though the group size were 1x1x1.
420      But it's just easier to make this condition an error, as we do for APIs
421      attempting to override the group size of a compute shader.
422
423    (3) What new GLSL built-ins should we provide to expose the group size
424        specified in the OpenGL API?
425
426      RESOLVED:  We will provide a new built-in variable exposing the group
427      size specified in the API.  The name choice is potentially tricky, since
428      we now have two different "workgroup size" variables -- a previously
429      existing constant for the fixed workgroup size and now a second input
430      for the variable workgroup size specified in the API.  We choose the
431      name "gl_LocalGroupSizeARB" here, which seems to fit reasonably well with
432      existing inputs such as "gl_LocalInvocationID".
433
434      If we had provided this functionality in the original compute shader
435      extension, maybe we could have only had "gl_LocalGroupSizeARB"?
436      However, the constant "gl_WorkGroupSize" would still be useful for
437      sizing built-in arrays for shaders with a fixed workgroup size.  For
438      example, a shader might want to declare a shared variable with one
439      instance per workgroup invocation, such as:
440
441        shared float shared_values[gl_WorkGroupSize.x * gl_WorkGroupSize.y *
442                                   gl_WorkGroupSize.z];
443
444      Such declarations would be illegal using the input
445      "gl_LocalGroupSizeARB".
446
447    (4) Do we need to modify the behavior of existing GLSL built-ins for
448        compute shaders without an explicit workgroup size?
449
450      RESOLVED:  No, not really.
451
452      The constant gl_WorkGroupSize seems like it would be affected by
453      omitting an explicit workgroup size.  However, it is already an error
454      to use gl_WorkGroupSize in a shader before a workgroup size layout
455      qualifier is declared.  That would make its use illegal in shaders where
456      workgroup size layout qualifiers are not declared at all.
457
458      We do need to make minor modifications to the language describing other
459      built-in inputs such as gl_LocalInvocationIndex, that are today defined
460      to be a function of the constant gl_WorkGroupSize.  We modify these
461      definitions to use the input gl_LocalGroupSizeARB instead.
462
463    (5) Should we provide a function (e.g.,
464        DispatchComputeIndirectGroupSizeARB) that takes both a workgroup
465        count and a workgroup size from indirect dispatch buffers?  If so,
466        what do we do if the workgroup size is not positive or exceeds
467        implementation-dependent limits?
468
469      RESOLVED:  No, let's leave this out of this extension.
470
471    (6) Is it necessary for compute shaders to include a "#extension"
472        directive to enable this extension in order to link successfully
473        without a fixed workgroup size?
474
475      RESOLVED:  Yes, compute shaders will have to use the
476      "local_size_variable" layout qualifier to declare a variable workgroup
477      size, and an "#extension" directive is required to be able to use that
478      layout qualifier.
479
480      In unextended OpenGL 4.3, we get a link error if no shaders in the
481      program exercise an existing language feature (declaring the fixed
482      workgroup size).  We could have simply removed this error, but the general
483      rule for "#extension" is that a user should be able to determine if a
484      shader were legal or not simply by examining the source code.
485
486      Note that it is necessary to use "#extension" to use the new built-in
487      input (gl_LocalGroupSizeARB) provided by this extension.
488
489    (7) Do we need different implementation-dependent limits for dynamic group
490        sizes?
491
492      RESOLVED:  Yes, some implementations of this extension may require lower
493      limits for variable local group sizes.  We add new tokens
494      MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB and
495      MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB to query these limits.
496      Implementations must support variable group dimensions of 512/512/64,
497      with at least 512 invocations per group.  The minimum limits for fixed
498      group sizes in unextended OpenGL 4.3 are 1024/1024/64 with at least 1024
499      invocations per group.
500
501    (8) Do we need an explicit query to determine if a program with a compute
502        shader has a fixed or variable local group size?
503
504      RESOLVED:  No.  The existing COMPUTE_WORK_GROUP_SIZE query will return
505      zero when using a shader with a variable local group size, and will
506      always return non-zero values for shaders with a fixed group size.
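
      The resolution above suggests a simple application-side test.  In the
      following C sketch, the function name is hypothetical and the three
      values are assumed to have been read with
      GetProgramiv(program, COMPUTE_WORK_GROUP_SIZE, ...).

```c
#include <stdbool.h>

/* Classify a linked compute program from its COMPUTE_WORK_GROUP_SIZE
 * query result: a variable local group size is reported as zero in each
 * dimension, while a fixed group size is always non-zero. */
static bool has_variable_group_size(const int work_group_size[3])
{
    return work_group_size[0] == 0 &&
           work_group_size[1] == 0 &&
           work_group_size[2] == 0;
}
```

      An application could use the result to choose between
      DispatchComputeGroupSizeARB (variable) and DispatchCompute or
      DispatchComputeIndirect (fixed).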
507
508Revision History
509
510    Revision 9, December 10, 2018 (Jon Leech)
511      - Use 'workgroup' consistently throughout (Bug 11723, internal API
512        issue 87).
513
514    Revision 8, May 30, 2013 (pbrown)
515      - Fix a typo in the MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB description;
516        that limit applies only to shaders with variable group sizes.
517
518    Revision 7, May 30, 2013 (pbrown)
519      - Mark issue (8) as resolved.
520
521    Revision 6, May 12, 2013 (JohnK)
522      - Editorial things:
523         - be more consistent/broader with "fixed local group size" language
524           (vs. variable), and related, also bringing in another paragraph from
525           the core spec.
526         - move spec. more toward using bold layout qualifier ids everywhere
527         - few minor typos, other tiny changes
528
529    Revision 5, May 8, 2013
530      - Assign enum values for new tokens.
531      - Add interaction with NV_compute_program5 assembly programs.
532
533    Revision 4, May 7, 2013
534      - Add new implementation limits MAX_COMPUTE_VARIABLE_GROUP_SIZE_ARB and
535        MAX_COMPUTE_VARIABLE_GROUP_INVOCATIONS_ARB for compute shaders with
536        variable group sizes, with minimum values of 512/512/64 and 512,
537        respectively.
538      - Add new tokens MAX_COMPUTE_FIXED_GROUP_SIZE_ARB and
539        MAX_COMPUTE_FIXED_GROUP_INVOCATIONS_ARB for compute shaders with fixed
540        group sizes, which are aliased to existing OpenGL 4.3 tokens
541        (MAX_COMPUTE_WORK_GROUP_SIZE and MAX_COMPUTE_WORK_GROUP_INVOCATIONS).
542
543    Revision 3, May 4, 2013
544      - Add ARB suffixes for the new entry point (DispatchComputeGroupSizeARB)
545        and GLSL built-in variable (gl_LocalGroupSizeARB).
546      - Add a missing INVALID_OPERATION error to DispatchComputeIndirect,
547        which requires a compute shader with a fixed local group size.
548      - Add new issue (8) about querying if a program with a compute shader
549        has a fixed or variable group size.
550
551    Revision 2, May 3, 2013
552      - Modify the spec to accept an explicit layout qualifier
553        /local_size_variable/ to specify a compute shader with a variable
554        local group size instead of inferring it from the lack of fixed-size
555        layout qualifiers.
556      - Modify some spec language to refer to the existing and new types of
557        compute shaders as having a fixed and variable local group size,
558        respectively.
559      - Mark various issues as resolved based on work group discussions.
560      - Add new issue (7) about different implementation-dependent size limits
561        for compute shaders with variable-size work groups.
562
563    Revision 1, January 20, 2013
564      - Initial revision.
565