Name

    NV_shader_atomic_float64

Name Strings

    GL_NV_shader_atomic_float64

Contact

    Kedarnath Thangudu, NVIDIA Corporation (kthangudu 'at' nvidia.com)

Contributors

    Pat Brown, NVIDIA

Status

    Shipping in NVIDIA release 367.XX drivers and up.

Version

    Last Modified Date:         October 15, 2014
    NVIDIA Revision:            1

Number

    OpenGL Extension #488

Dependencies

    This extension is written against the OpenGL 4.5 (Compatibility Profile)
    Specification.

    This extension is written against version 4.50 (revision 3) of the OpenGL
    Shading Language Specification.

    This extension requires ARB_gpu_shader_fp64 or NV_gpu_program_fp64.

    This extension interacts with NV_shader_buffer_store, NV_gpu_shader5,
    ARB_shader_storage_buffer_object, and ARB_compute_shader.

    This extension interacts with NV_gpu_program5.

Overview

    This extension provides GLSL built-in functions and assembly opcodes
    allowing shaders to perform atomic read-modify-write operations on
    double-precision floating-point values in buffer or shared memory.  The
    set of atomic operations provided by this extension is limited to adds
    and exchanges.  Providing atomic add support allows shaders to atomically
    accumulate the sum of double-precision floating-point values into buffer
    memory across multiple (possibly concurrent) shader invocations.

    This extension provides GLSL support for atomics targeting double-precision
    floating-point pointers (if NV_gpu_shader5 is supported).  Assembly opcodes
    for these operations are also provided if NV_gpu_program5 is supported.
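
    As an illustration of the accumulation use case, the following compute
    shader is a minimal sketch, not part of the extension language.  The
    block names "Samples" and "Result", the binding points, and the
    work-group size are assumptions chosen for the example.  Each invocation
    adds one input value into a single double-precision total held in a
    shader storage buffer.

```glsl
#version 450
#extension GL_NV_shader_atomic_float64 : require

layout(local_size_x = 64) in;

// Hypothetical input buffer holding the values to be summed.
layout(std430, binding = 0) readonly buffer Samples {
    double samples[];
};

// Hypothetical output buffer; the application is assumed to clear
// "total" to zero before dispatching.
layout(std430, binding = 1) buffer Result {
    double total;
};

void main()
{
    uint i = gl_GlobalInvocationID.x;

    // Guard against the last work group running past the end of the input.
    if (i < uint(samples.length())) {
        // Atomic read-modify-write; the return value (the previous
        // contents of "total") is not needed here.
        atomicAdd(total, samples[i]);
    }
}
```

    Because floating-point addition is not associative and the order in which
    concurrent invocations perform their adds is not fixed, the final total
    may differ in its low-order bits from one run to the next.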

New Procedures and Functions

    None.

New Tokens

    None.

Additions to the OpenGL 4.5 (Compatibility Profile) Specification

    None.

Additions to the AGL/GLX/WGL Specifications

    None.

GLX Protocol

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.50
(revision 3)

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_NV_shader_atomic_float64 : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_NV_shader_atomic_float64         1
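
    The directive and the #define can be combined so that one shader source
    compiles with or without the extension.  The sketch below assumes that
    the implementation defines GL_NV_shader_atomic_float64 whenever it
    supports the extension, which is the usual convention for GLSL extension
    macros.  The "Result" block and the fallback path are hypothetical.

```glsl
#version 450

// Enable the extension only where the compiler advertises it.
#ifdef GL_NV_shader_atomic_float64
#extension GL_NV_shader_atomic_float64 : enable
#endif

layout(local_size_x = 1) in;

// Hypothetical buffer holding a single running total.
layout(std430, binding = 0) buffer Result {
    double total;
};

void main()
{
#ifdef GL_NV_shader_atomic_float64
    // Safe with any number of concurrent invocations.
    atomicAdd(total, 1.0lf);
#else
    // Fallback without atomics: only correct if the application
    // dispatches exactly one invocation, since this update can race.
    total += 1.0lf;
#endif
}
```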

    Modify Section 8.11, Atomic Memory Functions (p. 172)

    (add to "atomicAdd" table cell, p. 173)

      double atomicAdd(coherent inout double mem, double data)

    (add to "atomicExchange" table cell, p. 173)

      double atomicExchange(coherent inout double mem, double data)
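
    The two overloads above can be used together.  The following sketch
    assumes that they accept shared variables as well as buffer variables,
    as the existing atomic memory functions in Section 8.11 do.  The
    invocations of a work group accumulate into a shared double, and one
    invocation then fetches and resets that partial sum with atomicExchange
    before adding it to a buffer total.  All block and variable names are
    hypothetical.

```glsl
#version 450
#extension GL_NV_shader_atomic_float64 : require

layout(local_size_x = 128) in;

// Hypothetical input and output buffers.
layout(std430, binding = 0) readonly buffer Samples {
    double samples[];
};
layout(std430, binding = 1) buffer Result {
    double total;
};

// Per-work-group partial sum.
shared double groupSum;

void main()
{
    // One invocation initializes the shared accumulator.
    if (gl_LocalInvocationIndex == 0u) {
        groupSum = 0.0lf;
    }
    memoryBarrierShared();
    barrier();

    uint i = gl_GlobalInvocationID.x;
    if (i < uint(samples.length())) {
        atomicAdd(groupSum, samples[i]);
    }
    memoryBarrierShared();
    barrier();

    // One invocation reads the partial sum, resets it to zero, and
    // contributes it to the global total with a single buffer atomic.
    if (gl_LocalInvocationIndex == 0u) {
        double partial = atomicExchange(groupSum, 0.0lf);
        atomicAdd(total, partial);
    }
}
```

    Compared with issuing one buffer atomic per invocation, this arrangement
    issues one per work group, which may reduce contention on "total".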


Dependencies on NV_shader_buffer_store, NV_gpu_shader5,
ARB_shader_storage_buffer_object, and ARB_compute_shader

    If NV_shader_buffer_store and NV_gpu_shader5 are supported, the following
    functions should be added to the "Section 8.Y, Shader Memory Functions"
    language in the NV_shader_buffer_store specification:

      double atomicAdd(double *address, double data);
      double atomicExchange(double *address, double data);

    If ARB_shader_storage_buffer_object or ARB_compute_shader is supported,
    make similar edits to the functions documented in the
    ARB_shader_storage_buffer_object extension.

    These functions are available if and only if GL_NV_shader_atomic_float64 is
    enabled via the "#extension" directive.
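
    A sketch of the pointer-based overloads follows.  It assumes that
    pointer types are enabled through GL_NV_gpu_shader5 and
    GL_NV_shader_buffer_load; whether any further #extension directive is
    needed for stores is not stated here and is left as an assumption.  The
    uniform names are hypothetical, and the application is assumed to have
    made both buffers resident (the output buffer with read-write access)
    and to have supplied their GPU addresses as uniform values.

```glsl
#version 450
#extension GL_NV_gpu_shader5 : require
#extension GL_NV_shader_buffer_load : require
#extension GL_NV_shader_atomic_float64 : require

layout(local_size_x = 64) in;

// Hypothetical pointer uniforms holding GPU addresses provided by the
// application.
uniform double *samples;      // input values
uniform double *total;        // single accumulator, cleared beforehand
uniform uint    sampleCount;  // number of valid elements in "samples"

void main()
{
    uint i = gl_GlobalInvocationID.x;

    if (i < sampleCount) {
        // Pointer overload: double atomicAdd(double *address, double data)
        atomicAdd(total, samples[i]);
    }
}
```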

Dependencies on NV_gpu_program5

    If NV_gpu_program5 is supported and "OPTION NV_shader_atomic_float64" is
    specified in an assembly program, "F64" should be allowed as a storage
    modifier to the ATOM instruction for the atomic operations "ADD" and
    "EXCH".

    (Add to "Section 2.X.6, Program Options" of the NV_gpu_program4 extension,
    as extended by NV_gpu_program5:)

      + Double-precision Floating-Point Atomic Operations (NV_shader_atomic_float64)

      If a program specifies the "NV_shader_atomic_float64" option, it may use
      the "F64" storage modifier with the "ATOM" opcode to perform atomic
      double-precision floating-point add or exchange operations.

    (Add to the table in "Section 2.X.8.Z, ATOM" in NV_gpu_program5:)

      atomic     storage
      modifier   modifiers                 operation
      --------   -----------------------   ---------------------------------
       ADD       U32, S32, U64, F32, F64   compute a sum
                 F16X2, F16X4
       ...
       EXCH      U32, S32, U64, F32, F64   exchange memory with operand
                 F16X2, F16X4

    Note:
      Storage modifier U64 is provided by NV_shader_atomic_int64.
      Storage modifier F32 is provided by NV_shader_atomic_float.
      Storage modifiers F16X2 and F16X4 are provided by
      NV_shader_atomic_fp16_vector.

Errors

    None.

New State

    None.

New Implementation Dependent State

    None.

Issues

    (1) What double-precision floating-point targets are supported for
    atomic operations?

      RESOLVED: This extension only supports atomic operations on
      double-precision floating-point buffer memory.  Atomic operations on
      double-precision texture memory are not supported since OpenGL provides
      no pixel/texture formats with double-precision components.

    (2) What atomic operations should we support for double-precision
    floating-point targets?

      RESOLVED:  Double-precision floating-point atomic addition is the main
      functionality targeted by this extension.  We provide exchanges because
      the operation needs no special hardware support.

      We chose not to provide support for bitwise operations (AND/OR/XOR);
      it's possible to support these by casting a pointer or aliasing an image
      if required.  Minimum, maximum, and compare-and-swap make sense, but the
      underlying atomic hardware targeted by this extension does not support
      floating-point comparisons.
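
      The aliasing approach mentioned for bitwise operations might look like
      the sketch below.  It is an assumption-laden example, not language from
      this extension: it relies on 64-bit integer types and "UL" literals
      from NV_gpu_shader5 and on a 64-bit atomicAnd from
      NV_shader_atomic_int64 that accepts buffer variables.  The application
      is assumed to have filled the buffer with doubles; the shader declares
      the same storage as 64-bit unsigned integers and clears each sign bit,
      which for IEEE 754 doubles yields the absolute value.

```glsl
#version 450
#extension GL_NV_gpu_shader5 : require          // uint64_t, UL literals
#extension GL_NV_shader_atomic_int64 : require  // 64-bit integer atomics

layout(local_size_x = 64) in;

// Hypothetical block: the buffer bound here holds doubles written by the
// application, viewed by this shader as raw 64-bit patterns.
layout(std430, binding = 0) buffer ValuesAsBits {
    uint64_t bits[];
};

void main()
{
    uint i = gl_GlobalInvocationID.x;

    if (i < uint(bits.length())) {
        // Clear bit 63 (the IEEE 754 sign bit) atomically.
        atomicAnd(bits[i], 0x7FFFFFFFFFFFFFFFUL);
    }
}
```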

Revision History

    Revision 1
      - Internal revisions.