Name

    AMD_gpu_association

Name Strings

    WGL_AMD_gpu_association

Contact

    Nick Haemel, AMD (nick.haemel 'at' amd.com)

Status

    Complete

Version

    Last Modified Date: March 03, 2009
    Author Revision: 1.0

    Based on: WGL_ARB_make_current_read specification
              Date: 3/15/2000 Version: 1.1

              EXT_framebuffer_object specification
              Date: 2/13/2007 Revision #119

Number

    361

Dependencies

    OpenGL 1.5 is required.

    WGL_ARB_extensions_string is required.

    GL_EXT_framebuffer_object is required.

    This extension interacts with WGL_ARB_make_current_read.

    This extension interacts with GL_EXT_framebuffer_blit.

    This extension interacts with WGL_ARB_create_context.

Overview

    There is currently no way for applications to efficiently use GPU
    resources in systems that contain more than one GPU. Vendors have
    provided methods that attempt to split the workload for an
    application among the available GPU resources. This has proven to be
    very inefficient because most applications were never written with
    these sorts of optimizations in mind.

    This extension provides a mechanism for applications to explicitly
    use the GPU resources on a given system individually. By providing
    this functionality, a driver allows applications to make appropriate
    decisions regarding where and when to distribute rendering tasks.

    The set of GPUs available on a system can be queried by calling
    wglGetGPUIDsAMD. The current GPU assigned to a specific context
    can be determined by calling wglGetContextGPUIDAMD. Each GPU in a
    system may have different performance characteristics in addition
    to supporting a different version of OpenGL. The specifics of each
    GPU can be obtained by calling wglGetGPUInfoAMD. This will allow
    applications to pick the most appropriate GPU for each rendering
    task.
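
    The enumeration step can be sketched as follows. This is a minimal
    illustration, not driver code: it relies on the behavior documented
    later in this specification that a call with <maxCount> set to zero
    returns the total GPU count. The UINT typedef is a stand-in for the
    Windows type, and GetGPUIDsFn, query_gpu_ids and fake_get_gpu_ids
    are hypothetical names introduced here. fake_get_gpu_ids only
    imitates the entry point so the sketch can run without a driver; a
    real program would obtain the function pointer through
    wglGetProcAddress.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int UINT;   /* stand-in for the Windows UINT type */

/* Pointer type matching the wglGetGPUIDsAMD prototype in this spec. */
typedef UINT (*GetGPUIDsFn)(UINT maxCount, UINT *ids);

/* Hypothetical stand-in for the driver entry point: reports two GPUs
   with IDs 1 and 2. It is not part of the extension. */
UINT fake_get_gpu_ids(UINT maxCount, UINT *ids)
{
    static const UINT known[] = { 1, 2 };
    const UINT total = 2;

    if (ids != NULL) {
        for (UINT i = 0; i < maxCount && i < total; ++i)
            ids[i] = known[i];
    }
    return total;   /* total GPU count, regardless of maxCount */
}

/* Returns a malloc'ed array of GPU IDs and stores its length in
   *outCount. Returns NULL (with *outCount == 0) when no GPUs are
   reported or when a call fails. The caller frees the array. */
UINT *query_gpu_ids(GetGPUIDsFn getGPUIDs, UINT *outCount)
{
    assert(getGPUIDs != NULL && outCount != NULL);
    *outCount = 0;

    /* First call: ask only for the total count. */
    UINT total = getGPUIDs(0, NULL);
    if (total == 0)
        return NULL;

    UINT *ids = calloc(total, sizeof *ids);
    if (ids == NULL)
        return NULL;

    /* Second call: fill the array with up to 'total' IDs. */
    UINT reported = getGPUIDs(total, ids);
    if (reported == 0) {
        free(ids);
        return NULL;
    }

    /* Only 'total' slots were provided, so never report more. */
    *outCount = reported < total ? reported : total;
    return ids;
}
```

    Passing the entry point as a function pointer keeps the helper
    independent of how the pointer was obtained, which also makes it
    easy to exercise with a stub such as fake_get_gpu_ids.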

    Once all necessary GPU information has been obtained, a context tied
    to a specific GPU can be created with wglCreateAssociatedContextAMD.
    These associated contexts can be made current with
    wglMakeAssociatedContextCurrentAMD and deleted with
    wglDeleteAssociatedContextAMD. Only one context, whether
    GPU-associated or non-associated, can be current per thread at any
    one time.

    To provide an accelerated path for blitting data from one context
    to another, the new blit function wglBlitContextFramebufferAMD has
    been added.

New Procedures and Functions

    UINT wglGetGPUIDsAMD(UINT maxCount, UINT *ids);

    INT wglGetGPUInfoAMD(UINT id, INT property, GLenum dataType,
                         UINT size, void *data);

    UINT wglGetContextGPUIDAMD(HGLRC hglrc);

    HGLRC wglCreateAssociatedContextAMD(UINT id);

    HGLRC wglCreateAssociatedContextAttribsAMD(UINT id, HGLRC hShareContext,
                                               const int *attribList);

    BOOL wglDeleteAssociatedContextAMD(HGLRC hglrc);

    BOOL wglMakeAssociatedContextCurrentAMD(HGLRC hglrc);

    HGLRC wglGetCurrentAssociatedContextAMD(void);

    VOID wglBlitContextFramebufferAMD(HGLRC dstCtx, GLint srcX0, GLint srcY0,
                                      GLint srcX1, GLint srcY1, GLint dstX0,
                                      GLint dstY0, GLint dstX1, GLint dstY1,
                                      GLbitfield mask, GLenum filter);

New Tokens

    Accepted by the <property> parameter of wglGetGPUInfoAMD:

        WGL_GPU_VENDOR_AMD                         0x1F00
        WGL_GPU_RENDERER_STRING_AMD                0x1F01
        WGL_GPU_OPENGL_VERSION_STRING_AMD          0x1F02
        WGL_GPU_FASTEST_TARGET_GPUS_AMD            0x21A2
        WGL_GPU_RAM_AMD                            0x21A3
        WGL_GPU_CLOCK_AMD                          0x21A4
        WGL_GPU_NUM_PIPES_AMD                      0x21A5
        WGL_GPU_NUM_SIMD_AMD                       0x21A6
        WGL_GPU_NUM_RB_AMD                         0x21A7
        WGL_GPU_NUM_SPI_AMD                        0x21A8

    Accepted by the <dataType> argument of wglGetGPUInfoAMD:

        GL_UNSIGNED_BYTE
        GL_UNSIGNED_INT
        GL_INT
        GL_FLOAT

    Accepted by the <mask> argument of wglBlitContextFramebufferAMD:

        GL_COLOR_BUFFER_BIT
        GL_DEPTH_BUFFER_BIT
        GL_STENCIL_BUFFER_BIT

Additions to the GLX Specification

    This specification is written for WGL.

GLX Protocol

    This specification is written for WGL.

Additions to the WGL Specification

    GPU Associated Contexts

    When multiple GPUs are present, a context can be created for
    off-screen rendering that is associated with a specific GPU.
    This will allow applications to achieve an app-specific
    distributed GPU utilization.

    The IDs for available GPUs can be queried with the command:

        UINT wglGetGPUIDsAMD(UINT maxCount, UINT *ids);

    where <maxCount> is the maximum number of IDs that can be returned
    and <ids> is the array of returned IDs. If the function succeeds,
    the return value is the total number of GPUs available. The
    value 0 is returned if no GPUs are available or if the call has
    failed. The array <ids> passed into the function will be
    populated with the smaller of <maxCount> and the total GPU count
    available. The ID 0 is reserved and will not be returned as a
    valid GPU ID. If the array <ids> is NULL, the function will
    only return the total number of GPUs. <ids> will be tightly packed
    with no 0 values between valid IDs.

    Calling wglGetGPUIDsAMD once with <maxCount> set to zero returns
    the total available GPU count, which can be used to allocate an
    appropriately sized ID array before calling wglGetGPUIDsAMD
    again to query the full set of supported GPUs.

    Each GPU in a system may have different properties, performance
    characteristics and different supported OpenGL versions. To
    determine which GPU is best suited for a specific task, the
    following function may be used:

        INT wglGetGPUInfoAMD(UINT id, INT property, GLenum dataType,
                             UINT size, void *data);

    <id> is a GPU ID obtained from calling wglGetGPUIDsAMD. The GPU ID
    must be a valid GPU ID.
    The function will fail and return -1 if <id> is an invalid GPU ID.
    <property> is the information being queried. <dataType> may be
    GL_UNSIGNED_INT, GL_INT, GL_FLOAT, or GL_UNSIGNED_BYTE and signals
    what data type is to be returned. <size> signals the size of the
    data buffer passed into wglGetGPUInfoAMD, expressed as the number of
    elements of type <dataType>. <data> is the buffer which will be
    filled with the requested information. For a string, <size> is the
    number of characters allocated, including the NULL terminator. For
    arrays of type GL_UNSIGNED_INT, GL_INT, and GL_FLOAT, <size> is the
    number of array elements. If the function succeeds, the number of
    values written will be returned. If the number of values written is
    equal to <size>, the query should be repeated with a larger <data>
    buffer. Strings should be queried using the GL_UNSIGNED_BYTE type,
    are UTF-8 encoded and will be NULL terminated. If the function
    fails, -1 will be returned.

    <property> defines the GPU property to be queried, and may be one of
    WGL_GPU_OPENGL_VERSION_STRING_AMD, WGL_GPU_RENDERER_STRING_AMD,
    WGL_GPU_FASTEST_TARGET_GPUS_AMD, WGL_GPU_RAM_AMD, WGL_GPU_CLOCK_AMD,
    WGL_GPU_NUM_PIPES_AMD, WGL_GPU_NUM_SIMD_AMD, WGL_GPU_NUM_RB_AMD, or
    WGL_GPU_NUM_SPI_AMD.

    If <size> is not sufficient to hold the entire value for a particular
    property, the number of values returned will equal <size>. If
    <dataType> is inappropriate for <property>, for instance GL_INT for a
    property which is a string, the function will fail and -1 will be
    returned.

    Querying WGL_GPU_OPENGL_VERSION_STRING_AMD returns the highest supported
    OpenGL version string and WGL_GPU_RENDERER_STRING_AMD returns the name
    of the GPU. <dataType> must be GL_UNSIGNED_BYTE for both of these
    properties.
    Querying WGL_GPU_FASTEST_TARGET_GPUS_AMD returns an array
    of the IDs of GPUs with the fastest data blit rates when using
    wglBlitContextFramebufferAMD. This list is ordered fastest
    first. This provides a performance hint about which contexts and GPUs
    are capable of transferring data between each other the quickest.
    Querying WGL_GPU_RAM_AMD returns the amount of RAM available to the
    GPU in MB. Querying WGL_GPU_CLOCK_AMD returns the GPU clock speed in
    MHz. Querying WGL_GPU_NUM_PIPES_AMD returns the number of 3D pipes.
    Querying WGL_GPU_NUM_SIMD_AMD returns the number of SIMD ALU units in
    each shader pipe. Querying WGL_GPU_NUM_RB_AMD returns the number of
    render backends. Querying WGL_GPU_NUM_SPI_AMD returns the number of
    shader parameter interpolators. If the <property> being queried is not
    applicable for the GPU specified by <id>, the value 0 will be returned.

    Unassociated contexts are created by calling wglCreateContext.
    Although these contexts are unassociated, their use will still be
    tied to a single GPU in most cases. For this reason it is advantageous
    to be able to query the GPU an existing unassociated context resides
    on. If multiple GPUs are available, it would be undesirable
    to use one for rendering to visible surfaces and then choose the
    same one for off-screen rendering. Use the following command to
    determine which GPU a context is attached to:

        UINT wglGetContextGPUIDAMD(HGLRC hglrc);

    <hglrc> is the context for which the GPU ID will be returned. If the
    context is invalid or if an error has occurred, wglGetContextGPUIDAMD
    will return 0.

    To create an associated context, use:

        HGLRC wglCreateAssociatedContextAMD(UINT id);

    <id> must be a valid GPU ID and cannot be 0. If a context was
    successfully created, the handle will be returned by
    wglCreateAssociatedContextAMD. If a context could not be created, NULL
    will be returned.
    In that case, error information can be obtained by calling
    GetLastError. Upon successful creation, no pixel format is tied to
    an associated context.

    To create an associated context and request a specific GL version, use:

        HGLRC wglCreateAssociatedContextAttribsAMD(UINT id,
                        HGLRC hShareContext, const int *attribList);

    All capabilities and limitations of wglCreateContextAttribsARB apply
    to wglCreateAssociatedContextAttribsAMD. Additionally, <id> must be
    a valid GPU ID and cannot be 0. If a context was successfully created,
    the handle will be returned by wglCreateAssociatedContextAttribsAMD.
    If a context could not be created, NULL will be returned. In this
    case, error information can be obtained by calling GetLastError. Upon
    successful creation, no pixel format is tied to an associated context.

    <hShareContext> must either be NULL or that of an associated context
    created with the same GPU ID as <id>. If <hShareContext> was
    created using a different ID, wglCreateAssociatedContextAttribsAMD
    will fail and return NULL.

    A context must be deleted once it is no longer needed. Use the
    following call to delete an associated context:

        BOOL wglDeleteAssociatedContextAMD(HGLRC hglrc);

    If the function succeeds, TRUE will be returned, otherwise FALSE is
    returned. <hglrc> must be a valid associated context created by
    calling wglCreateAssociatedContextAMD. If an unassociated context,
    created by calling wglCreateContext, is passed into <hglrc>, the
    function will fail. An associated context cannot be deleted by calling
    wglDeleteContext. If an associated context is passed into
    wglDeleteContext, the result is undefined.
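
    The create and delete rules above can be sketched as a small pair
    of helpers that always release an associated context through
    wglDeleteAssociatedContextAMD and never through wglDeleteContext.
    This is an illustration under stated assumptions, not part of the
    extension: the UINT, BOOL and HGLRC typedefs are stand-ins for the
    Windows types, GpuAssociationApi, AssociatedContext and the
    associated_context_* functions are hypothetical names, and the
    fake_* functions only imitate the driver so the sketch can run
    without one.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int UINT;   /* stand-ins for the Windows types */
typedef int BOOL;
typedef void *HGLRC;

/* Pointer types matching the prototypes in this spec. */
typedef HGLRC (*CreateAssociatedContextFn)(UINT id);
typedef BOOL  (*DeleteAssociatedContextFn)(HGLRC hglrc);

/* Hypothetical bundle of entry points obtained by the application. */
typedef struct {
    CreateAssociatedContextFn create;
    DeleteAssociatedContextFn destroy;
} GpuAssociationApi;

/* Hypothetical record pairing a context handle with its GPU ID. */
typedef struct {
    HGLRC handle;
    UINT  gpuId;
} AssociatedContext;

/* Hypothetical driver stand-ins that track live contexts. */
int fake_live_contexts = 0;
static int fake_slot;

HGLRC fake_create(UINT id)
{
    (void)id;
    ++fake_live_contexts;
    return &fake_slot;
}

BOOL fake_delete(HGLRC hglrc)
{
    if (hglrc != (void *)&fake_slot)
        return 0;            /* not a handle this stub created */
    --fake_live_contexts;
    return 1;
}

/* Creates an associated context on gpuId. Returns 1 on success, 0 on
   failure. ID 0 is reserved, so it is rejected before calling the
   driver. On Windows a failed create can be followed by GetLastError. */
int associated_context_open(const GpuAssociationApi *api, UINT gpuId,
                            AssociatedContext *out)
{
    assert(api != NULL && out != NULL);
    if (gpuId == 0)
        return 0;

    HGLRC handle = api->create(gpuId);
    if (handle == NULL)
        return 0;

    out->handle = handle;
    out->gpuId = gpuId;
    return 1;
}

/* Deletes the context through the associated delete entry point and
   clears the record. Returns 1 on success, 0 if the driver refused. */
int associated_context_close(const GpuAssociationApi *api,
                             AssociatedContext *ctx)
{
    assert(api != NULL && ctx != NULL);
    if (ctx->handle == NULL)
        return 1;            /* nothing to delete */
    if (!api->destroy(ctx->handle))
        return 0;

    ctx->handle = NULL;
    ctx->gpuId = 0;
    return 1;
}
```

    Keeping the GPU ID next to the handle lets an application verify,
    before sharing, that two contexts were created with the same ID.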

    To render using an associated context, it must be made the current
    context for a thread:

        BOOL wglMakeAssociatedContextCurrentAMD(HGLRC hglrc);

    <hglrc> is a context handle created by calling
    wglCreateAssociatedContextAMD. If <hglrc> was created using
    wglCreateContext, the call will fail and FALSE will be returned. If
    <hglrc> is not a valid context and not NULL, the call will fail and
    FALSE will be returned. If the call succeeds, TRUE will be returned.
    To detach the current associated context, pass NULL as <hglrc>.

    Only one type of context can be current to a thread at a time. If an
    unassociated context is current to an hdc when
    wglMakeAssociatedContextCurrentAMD is called with a valid <hglrc>, it
    is as if wglMakeCurrent is called first with an hglrc of NULL. If an
    associated context is current and wglMakeCurrent is called with a
    valid context, it is as if wglMakeAssociatedContextCurrentAMD is
    called with an hglrc of NULL.

    The current associated context can be queried by calling:

        HGLRC wglGetCurrentAssociatedContextAMD(void);

    The current associated context is returned on a successful call to
    this function. If no associated context is current, NULL is returned.
    If an unassociated context is current, NULL will be returned.

    Associated contexts can be shared just as unassociated contexts can by
    calling wglShareLists. Associated contexts can only be shared with
    other contexts that were created with the same GPU ID. Associated
    contexts cannot be shared with non-associated contexts. FALSE will be
    returned if either context is not valid or is not an associated
    context associated with the same GPU.

    An associated context cannot be passed as a parameter to
    wglCopyContext. If an associated context is passed into wglCopyContext,
    the result is undefined.
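
    The make-current and query rules above suggest a scoped pattern:
    remember the associated context that is current, make another one
    current for a block of work, then restore the first (or detach by
    passing NULL if none was current). The following is a sketch under
    stated assumptions. The BOOL and HGLRC typedefs are stand-ins for
    the Windows types, with_associated_context and record_current are
    hypothetical names, and the fake_* functions imitate the driver for
    one thread only. The sketch does not restore an unassociated
    context that was current beforehand, since making an associated
    context current implicitly releases it.

```c
#include <assert.h>
#include <stddef.h>

typedef int BOOL;            /* stand-ins for the Windows types */
typedef void *HGLRC;

/* Pointer types matching the prototypes in this spec. */
typedef BOOL  (*MakeAssociatedContextCurrentFn)(HGLRC hglrc);
typedef HGLRC (*GetCurrentAssociatedContextFn)(void);

/* Hypothetical single-thread stand-ins for the driver entry points. */
HGLRC fake_current = NULL;

BOOL fake_make_current(HGLRC hglrc)
{
    fake_current = hglrc;    /* NULL detaches the current context */
    return 1;
}

HGLRC fake_get_current(void)
{
    return fake_current;
}

/* Hypothetical work callback: records which context is current. */
void record_current(void *user)
{
    *(HGLRC *)user = fake_get_current();
}

/* Makes ctx current, runs work(user), then restores whichever
   associated context was current before (NULL detaches).
   Returns 1 on success, 0 if either make-current call fails. */
int with_associated_context(MakeAssociatedContextCurrentFn makeCurrent,
                            GetCurrentAssociatedContextFn getCurrent,
                            HGLRC ctx, void (*work)(void *), void *user)
{
    assert(makeCurrent != NULL && getCurrent != NULL && work != NULL);

    /* NULL here means no associated context was current. */
    HGLRC previous = getCurrent();

    if (!makeCurrent(ctx))
        return 0;

    work(user);

    return makeCurrent(previous) ? 1 : 0;
}
```

    Because proc addresses may differ between contexts, any GL entry
    points used inside the callback should be the ones queried while
    the associated context was current.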

    The addresses returned from wglGetProcAddress are only valid for the
    current context. It may be invalid to use proc addresses obtained from
    a traditional context with an associated context. Furthermore, the
    OpenGL version and extensions supported on an associated context may
    differ. Each context should be treated separately; proc addresses
    should be queried for each context after it is created.

    Calls to wglSwapBuffers and wglSwapLayerBuffers when an associated
    context is current will return FALSE and will have no effect.

    There is no way to use pBuffers with associated contexts.

    Overlays and underlays are not supported with associated contexts.

    The same associated context is used for both write and read operations.

    To facilitate high performance data communication between multiple
    contexts, a new function is necessary to blit data from one context
    to another.

        VOID wglBlitContextFramebufferAMD(HGLRC dstCtx, GLint srcX0, GLint srcY0,
                                          GLint srcX1, GLint srcY1, GLint dstX0,
                                          GLint dstY0, GLint dstX1, GLint dstY1,
                                          GLbitfield mask, GLenum filter);

    <dstCtx> is the context handle for the write context. <mask> is the
    bitwise OR of a number of values indicating which buffers are to be
    copied. The values are GL_COLOR_BUFFER_BIT, GL_DEPTH_BUFFER_BIT, and
    GL_STENCIL_BUFFER_BIT, which are described in section 4.2.3. The
    pixels corresponding to these buffers are copied from the source
    rectangle, bound by the locations (srcX0, srcY0) and (srcX1, srcY1),
    to the destination rectangle, bound by the locations (dstX0, dstY0)
    and (dstX1, dstY1). The lower bounds of the rectangle are inclusive,
    while the upper bounds are exclusive.

    The source context is the current GL context. Specifying the current
    GL context as the <dstCtx> will result in the error
    GL_INVALID_OPERATION being generated.
    If <dstCtx> is invalid, the error GL_INVALID_OPERATION will be
    generated. If no context is current at the time of this call, the
    error GL_INVALID_OPERATION will be generated. These errors may be
    queried by calling glGetError. The target framebuffer will be the
    framebuffer bound to GL_DRAW_FRAMEBUFFER_EXT in the context <dstCtx>.
    The source framebuffer will be the framebuffer bound to
    GL_READ_FRAMEBUFFER_EXT in the currently bound context.

    The restrictions that apply to the source and destination rectangles
    specified with <srcX0>, <srcY0>, <srcX1>, <srcY1>, <dstX0>, <dstY0>,
    <dstX1>, and <dstY1> are the same as those that apply for
    glBlitFramebufferEXT. The same error conditions exist as for
    glBlitFramebufferEXT.

    When called, this function will execute immediately in the currently
    bound context. It is up to the caller to maintain appropriate
    synchronization between the current context and <dstCtx> to ensure
    rendering to the appropriate surfaces has completed on the current
    and <dstCtx> contexts.

Additions to Chapter 4 of the OpenGL 1.5 Specification (Per-Fragment
Operations and the Frame Buffer)

    Modify the beginning of section 4.4.1 as follows:

    When an associated context is bound, its default state is invalid
    for rendering. Because there is no attached window, there is no
    default framebuffer surface to render to. An app-created
    framebuffer object must be bound for rendering to be valid. If the
    object bound to GL_FRAMEBUFFER_BINDING_EXT is 0, it is as if the
    framebuffer is incomplete, and a
    GL_INVALID_FRAMEBUFFER_OPERATION_EXT error will be generated
    when rendering is attempted.

New State

    None

Interactions with GL_EXT_framebuffer_blit

    If the framebuffer blit extension is not supported, all language
    referring to glBlitFramebufferEXT and wglBlitContextFramebufferAMD
    is removed.

Interactions with GL_EXT_framebuffer_object

    If WGL_AMD_gpu_association is supported, any context created with it
    will also support EXT_framebuffer_object.

Interactions with WGL_ARB_make_current_read

    If the make current read extension is supported, it is invalid to pass
    an associated context handle as a parameter to
    wglMakeContextCurrentARB. If an associated context is passed into
    wglMakeContextCurrentARB, the result is undefined.

Interactions with WGL_ARB_create_context

    If wglCreateContextAttribsARB is not supported, all language
    referring to wglCreateAssociatedContextAttribsAMD is removed.

Issues

1. Is a different DC necessary in addition to an associated context?

   - Resolved. It seems unnecessary. An associated DC would be a virtual
     DC with no real meaning. Using an associated DC would require apps to
     create windows and set pixel formats that are meaningless.

2. Should the list of IDs returned by wglGetGPUIDsAMD be ordered in some
   way? By fastest GPU?

   - Resolved. There is no need to create a restriction. The GPU info can
     be queried.

3. What happens when the GPUs support different versions of OpenGL?
   Do we allow this? Do we need the

   - Resolved. It is the application's responsibility to use each GPU
     appropriately based on the supported version of OpenGL.

4. What should the relation between wglMakeAssociatedContextCurrentAMD and
   wglMakeCurrent be? Should it be legal to have an associated context and a
   normal context current to the same thread?

   - Resolved.
     It seems feasible to have an associated context and a normal
     one both current, but for simplicity, only one context of either
     type will be allowed per thread.

5. Will a call to wglShareLists with contexts that were not created through
   wglCreateContext make it through to the driver? If not, will a new
   shareLists call be necessary?

   - Resolved. This is not an issue.

6. Is GPUClock a good parameter for the GPUInfo structure? How should
   relative GPU performance be presented?

   - Resolved. This is sufficient. One alternative would be to test execution
     of some amount of geometry rendering. But applications are better
     positioned to perform this based on their rendering needs.

7. Is wglBlitContextFramebufferAMD the best way to transfer data from one
   context to another? Or would it be better to create a read and write
   context bind point?

   - Resolved. Both methods would work well. Adding a new function avoids
     the additional state that would have to be tracked for a global read and
     write bind point. In addition, using global (wgl) bind points may
     introduce mutability issues when multiple threads are being used.
     Only one thread could use the interface at a time.

8. Should the GPU ID be part of the pixel format? That would allow
   apps to search for a format that worked on different GPUs.

   - Resolved. Using the pixel format would provide a non-intrusive solution,
     but would require the app to use a DC that is not available through
     current interfaces. In addition, the app would have to create a
     dummy window.

9. Are there any problems calling wglShareLists with contexts not created
   by Windows?

   - Resolved. This is not an issue.

10. What should happen in an associated context when the default FBO is
    bound? The two options seem to be: 1. Throw an error and do not render,
    2. Discard all rendering as failing the pixel ownership test.

    - The first option seems more logical.

Revision History

    03/03/2009 - Initial version written. Written by nickh