Name

    NVX_gpu_multicast2

Name Strings

    GL_NVX_gpu_multicast2

Contact

    Joshua Schnarr, NVIDIA Corporation (jschnarr 'at' nvidia.com)
    Ingo Esser, NVIDIA Corporation (iesser 'at' nvidia.com)

Contributors

    Robert Menzel, NVIDIA
    Ralf Biermann, NVIDIA

Status

    Complete.

Version

    Last Modified Date: July 23, 2019
    Author Revision: 8

Number

    OpenGL Extension #543

Dependencies

    This extension is written against the OpenGL 4.6 specification
    (Compatibility Profile), dated October 24, 2016.

    This extension requires NV_gpu_multicast.

    This extension requires EXT_device_group.

    This extension requires NV_viewport_array.

    This extension requires NV_clip_space_w_scaling.

    This extension requires NVX_progress_fence.

Overview

    This extension provides additional mechanisms that influence multicast rendering,
    that is, simultaneous rendering to multiple GPUs.

New Procedures and Functions

    uint AsyncCopyImageSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
        uint srcGpu, bitfield dstGpuMask,
        uint srcName, enum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
        uint dstName, enum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
        sizei srcWidth, sizei srcHeight, sizei srcDepth,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    sync AsyncCopyBufferSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
        uint readGpu, bitfield writeGpuMask,
        uint readBuffer, uint writeBuffer,
        intptr readOffset, intptr writeOffset, sizeiptr size,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    void UploadGpuMaskNVX(bitfield mask);

    void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);

    void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);

    void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);

New Tokens

    Accepted by the <pname> parameter of GetIntegerv and GetInteger64v:

        UPLOAD_GPU_MASK_NVX                        0x954A

Additions to Chapter 20 (Multicast Rendering) added to the OpenGL 4.5 (Compatibility Profile)
Specification by NV_gpu_multicast

    Additions to Section 20.1 (Controlling Individual GPUs)

    Texture data uploads using the functions TexImage1D, TexImage2D, TexImage3D,
    TexSubImage1D, TexSubImage2D and TexSubImage3D are restricted to a specific set of GPUs with

      void UploadGpuMaskNVX(bitfield mask);

    This command also restricts buffer object data uploads using the functions BufferStorage,
    NamedBufferStorage, BufferSubData and NamedBufferSubData to the specified set of GPUs.

    Furthermore, this command restricts buffer object clears using the functions ClearBufferData,
    ClearNamedBufferData, ClearBufferSubData and ClearNamedBufferSubData to the same set of GPUs.

    The following errors apply to UploadGpuMaskNVX:

    INVALID_VALUE is generated
    * if <mask> is zero,
    * if <mask> is greater than or equal to 2^n, where n is equal to MULTICAST_GPUS_NV.

    If the command does not generate an error, UPLOAD_GPU_MASK_NVX is set to <mask>.

    The default value of UPLOAD_GPU_MASK_NVX is (2^n)-1, which selects all n GPUs.

    If a function restricted by UploadGpuMaskNVX operates on textures or buffer objects
    with GPU-shared storage type (as opposed to per-GPU storage), UPLOAD_GPU_MASK_NVX is ignored.
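
    The two INVALID_VALUE rules and the default value for UploadGpuMaskNVX can be
    checked on the application side before the call is made. The following is a
    minimal sketch in plain C, not driver code. The helper names
    default_upload_gpu_mask and upload_gpu_mask_is_valid are hypothetical, and
    gpu_count stands for the value queried from MULTICAST_GPUS_NV, assumed to be
    between 1 and 32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mask with the low gpu_count bits set, i.e. (2^n)-1.
   ASSUMPTION: 1 <= gpu_count <= 32. */
uint32_t default_upload_gpu_mask(uint32_t gpu_count)
{
    return (gpu_count >= 32u) ? UINT32_MAX
                              : ((UINT32_C(1) << gpu_count) - 1u);
}

/* Hypothetical helper: true if <mask> passes both INVALID_VALUE rules. */
bool upload_gpu_mask_is_valid(uint32_t mask, uint32_t gpu_count)
{
    if (mask == 0u)
        return false;                                   /* rule 1: mask is zero */
    return mask <= default_upload_gpu_mask(gpu_count);  /* rule 2: mask < 2^n   */
}
```

    With two GPUs, for example, the masks 1, 2 and 3 pass the check, while 0 and 4
    would be rejected.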

    Modify Section 20.2 (Multi-GPU Buffer Storage)

    Append the following paragraphs:

    To initiate a copy of buffer data without waiting for it to complete, use the following command:

    void AsyncCopyBufferSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *fenceValueArray,
        uint readGpu, bitfield writeGpuMask,
        uint readBuffer, uint writeBuffer,
        intptr readOffset, intptr writeOffset, sizeiptr size,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    This command behaves equivalently to MulticastCopyBufferSubDataNV, except that it may be
    performed concurrently with commands submitted in the future.
    Fence semaphore objects created with CreateProgressFenceNVX are used to synchronize one or
    more copies.
    The <waitSemaphoreArray> parameter points to an array of <waitSemaphoreCount> semaphore objects.
    The copy waits until every fence semaphore in <waitSemaphoreArray> has reached or exceeded
    its corresponding fence value in <fenceValueArray> before starting the transfer.
    After the copy, a signal operation is written for each of the <signalSemaphoreCount> semaphores
    in <signalSemaphoreArray>, using the corresponding fence value in <signalValueArray>.
    To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
    for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.
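
    The wait and signal rules above can be modelled in plain C. This is only a
    sketch of the ordering semantics, not of how an implementation realizes them.
    The names async_copy_may_start and async_copy_signal are hypothetical, and the
    semaphore_values array stands in for semaphore state that an application
    cannot read directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model: semaphore_values[i] is the current value of the i-th
   wait semaphore, fence_values[i] the value it must reach or exceed. */
bool async_copy_may_start(const uint64_t *semaphore_values,
                          const uint64_t *fence_values,
                          size_t wait_count)
{
    for (size_t i = 0; i < wait_count; ++i) {
        if (semaphore_values[i] < fence_values[i])
            return false;   /* this semaphore has not reached its fence value */
    }
    return true;            /* every wait is satisfied (in this model, also
                               when wait_count is zero) */
}

/* Illustrative model: after the copy, each signal semaphore is set to its
   corresponding value from signal_values. */
void async_copy_signal(uint64_t *semaphore_values,
                       const uint64_t *signal_values,
                       size_t signal_count)
{
    for (size_t i = 0; i < signal_count; ++i)
        semaphore_values[i] = signal_values[i];
}
```

    Because the comparison is "reach or exceed", a chain of dependent copies can
    use increasing fence values on one semaphore: the first copy signals 1, the
    second waits for 1 and signals 2, and so on.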

    Modify Section 20.3.1 (Copying Image Data Between GPUs)

    Insert the following paragraphs above the line starting "To copy pixel values":

    To initiate a copy of texel data without waiting for it to complete, use the following command:

    void AsyncCopyImageSubDataNVX(
        sizei waitSemaphoreCount, const uint *waitSemaphoreArray, const uint64 *waitValueArray,
        uint srcGpu, bitfield dstGpuMask,
        uint srcName, enum srcTarget, int srcLevel, int srcX, int srcY, int srcZ,
        uint dstName, enum dstTarget, int dstLevel, int dstX, int dstY, int dstZ,
        sizei srcWidth, sizei srcHeight, sizei srcDepth,
        sizei signalSemaphoreCount, const uint *signalSemaphoreArray, const uint64 *signalValueArray);

    This command behaves equivalently to MulticastCopyImageSubDataNV, except that it may be
    performed concurrently with commands submitted in the future.
    Fence semaphore objects created with CreateProgressFenceNVX are used to synchronize one or
    more copies.
    The <waitSemaphoreArray> parameter points to an array of <waitSemaphoreCount> semaphore objects.
    The copy waits until every fence semaphore in <waitSemaphoreArray> has reached or exceeded
    its corresponding fence value in <waitValueArray> before starting the transfer.
    After the copy, a signal operation is written for each of the <signalSemaphoreCount> semaphores
    in <signalSemaphoreArray>, using the corresponding fence value in <signalValueArray>.
    To wait for the copy to complete, use WaitSemaphoreui64NVX or ClientWaitSemaphoreui64NVX to wait
    for the semaphores in <signalSemaphoreArray> to be signalled with the fence values in <signalValueArray>.
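
    The copy takes its source as a GPU index (<srcGpu>) but its destinations as a
    bitfield (<dstGpuMask>). A common case, one GPU distributing an image to all
    others, needs a mask with every bit set except the source's. The sketch below
    assumes that bit i of the mask refers to GPU i; broadcast_dst_mask is a
    hypothetical helper name, and gpu_count stands for MULTICAST_GPUS_NV.

```c
#include <stdint.h>

/* Hypothetical helper: mask naming every GPU except src_gpu.
   ASSUMPTIONS: bit i of the mask refers to GPU i,
                1 <= gpu_count <= 32 and src_gpu < gpu_count. */
uint32_t broadcast_dst_mask(uint32_t src_gpu, uint32_t gpu_count)
{
    uint32_t all = (gpu_count >= 32u) ? UINT32_MAX
                                      : ((UINT32_C(1) << gpu_count) - 1u);
    return all & ~(UINT32_C(1) << src_gpu);   /* clear the source GPU's bit */
}
```

    On a single-GPU configuration the result is zero, so an application would
    typically skip the copy in that case.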

Additions to Chapter 13 (Fixed-Function Vertex Post-Processing) of the OpenGL 4.5 (Compatibility
Profile) Specification

    Modify Section 13.6 (Coordinate transformations)

    Viewport transformation parameters for multiple viewports are specified using

        void MulticastViewportArrayvNVX(uint gpu, uint first, sizei count, const float *v);

    where the array of viewport parameters can be set individually for each multicast GPU.

    A set of scissor rectangles, each applied to the corresponding viewport, is specified
    using

        void MulticastScissorArrayvNVX(uint gpu, uint first, sizei count, const int *v);

    where the rectangle parameters can be set individually for each multicast GPU.
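
    As an illustration of per-GPU viewport parameters, the sketch below computes
    one viewport per GPU so that each GPU covers a vertical strip of the render
    target. It assumes that <v> uses the same layout as ViewportArrayv, that is,
    four floats (x, y, width, height) per viewport; this extension text does not
    restate the layout. strip_viewport_for_gpu is a hypothetical helper name.

```c
/* Hypothetical helper: fill v[0..3] with (x, y, width, height) for the strip
   that GPU <gpu> of <gpu_count> covers in a fb_width x fb_height target.
   ASSUMPTION: same 4-floats-per-viewport layout as ViewportArrayv. */
void strip_viewport_for_gpu(unsigned gpu, unsigned gpu_count,
                            float fb_width, float fb_height, float v[4])
{
    float strip = fb_width / (float)gpu_count;
    v[0] = strip * (float)gpu;  /* x: left edge of this GPU's strip */
    v[1] = 0.0f;                /* y: bottom edge */
    v[2] = strip;               /* width of one strip */
    v[3] = fb_height;           /* full height */
}
```

    The resulting array would then be passed once per GPU, for example as
    MulticastViewportArrayvNVX(gpu, 0, 1, v).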

    If VIEWPORT_POSITION_W_SCALE_NV is enabled, the w coordinates for each
    primitive sent to a given viewport will be scaled as a function of
    its x and y coordinates using the following equation:

        w' = xcoeff * x + ycoeff * y + w;

    The coefficients for "x" and "y" used in the above equation depend on the
    viewport index and can be set individually for each multicast GPU by the command

        void MulticastViewportPositionWScaleNVX(uint gpu, uint index, float xcoeff, float ycoeff);

    An INVALID_VALUE error is generated if <gpu> is greater than or equal to MULTICAST_GPUS_NV.
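
    The w scaling equation can be written out directly in C to check chosen
    coefficients by hand. This is a transcription of the formula only, not of
    the fixed-function hardware path; scale_clip_w is a hypothetical name.

```c
/* Direct transcription of  w' = xcoeff * x + ycoeff * y + w. */
float scale_clip_w(float x, float y, float w, float xcoeff, float ycoeff)
{
    return xcoeff * x + ycoeff * y + w;
}
```

    With both coefficients set to zero the function returns w unchanged.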

Additions to the OpenGL Shading Language Specification, Version 4.50

    Including the following line in a shader makes the built-in variable gl_DeviceIndex
    available, which can be used to distinguish the multicast GPUs:

        #extension GL_EXT_device_group : enable

    On each multicast GPU, gl_DeviceIndex holds a device index that is unique to that GPU.
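
    A shader using gl_DeviceIndex might look like the fragment shader below,
    stored as a C string so that it can be handed to ShaderSource. It is a sketch
    that has not been compiled against a driver; it assumes gl_DeviceIndex is an
    integer and that one GPU has device index 0. kDeviceTintFragmentShader is a
    hypothetical name.

```c
/* GLSL source kept as a C string.
   ASSUMPTIONS: gl_DeviceIndex is an integer; one GPU has device index 0. */
static const char *const kDeviceTintFragmentShader =
    "#version 450\n"
    "#extension GL_EXT_device_group : enable\n"
    "layout(location = 0) out vec4 color;\n"
    "void main()\n"
    "{\n"
    "    // Red on the GPU with index 0, green on every other GPU.\n"
    "    color = (gl_DeviceIndex == 0) ? vec4(1.0, 0.0, 0.0, 1.0)\n"
    "                                  : vec4(0.0, 1.0, 0.0, 1.0);\n"
    "}\n";
```

    Tinting the output per GPU in this way is a simple visual check that each GPU
    sees a different index.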

Errors

    Relaxation of INVALID_ENUM errors
    ---------------------------------
    GetIntegerv and GetInteger64v now accept new tokens as
    described in the "New Tokens" section.

New State

    Additions to Table 23.6 Buffer Object State

                                            Initial
    Get Value            Type  Get Command  Value    Description            Sec.  Attribute
    -------------------  ----  -----------  -------  ---------------------  ----  ---------
    UPLOAD_GPU_MASK_NVX  Z+    GetIntegerv  *        Mask of GPUs that      20.1  -
                                                     restricts buffer data
                                                     writes
    * See section 20.1

New Implementation Dependent State

    None.

Sample Code

    None.

Issues

    None.

Revision History

    Rev.  Date      Author     Changes
    ----  --------  ---------  -----------------------------------------------
     1    09/20/17  jschnarr   initial draft
     2    02/23/18  rbiermann  updated draft with new functions
     3    05/23/18  rbiermann  updated draft with new ViewportArray and AsyncCopy functions
     4    06/08/18  rbiermann  added NVX_progress_fence for synchronization objects
     5    08/15/18  rbiermann  updated draft with gl_DeviceIndex
     6    04/16/19  rbiermann  updated draft with UploadGpuMaskNVX
     7    07/19/19  rbiermann  updated draft with modifications of UploadGpuMaskNVX section
     8    07/23/19  rbiermann  updated draft with support of Clear(Named)Buffer(Sub)Data by UploadGpuMaskNVX