Name

    NV_shading_rate_image

Name Strings

    GL_NV_shading_rate_image

Contact

    Pat Brown, NVIDIA Corporation (pbrown 'at' nvidia.com)

Contributors

    Daniel Koch, NVIDIA
    Mark Kilgard, NVIDIA
    Jeff Bolz, NVIDIA
    Mathias Schott, NVIDIA
    Pyarelal Knowles, NVIDIA

Status

    Shipping

Version

    Last Modified:      March 16, 2020
    Revision:           3

Number

    OpenGL Extension #531
    OpenGL ES Extension #315

Dependencies

    This extension is written against the OpenGL 4.5 Specification
    (Compatibility Profile), dated October 24, 2016.

    OpenGL 4.5 or OpenGL ES 3.2 is required.

    This extension requires support for the OpenGL Shading Language (GLSL)
    extension "NV_shading_rate_image", which can be found at the Khronos Group
    GitHub site here:

        https://github.com/KhronosGroup/GLSL

    This extension interacts trivially with ARB_sample_locations and
    NV_sample_locations.

    This extension interacts with NV_scissor_exclusive.

    This extension interacts with NV_conservative_raster.

    This extension interacts with NV_conservative_raster_underestimation.

    This extension interacts with EXT_raster_multisample.

    NV_framebuffer_mixed_samples is required.

    If implemented in OpenGL ES, at least one of NV_viewport_array or
    OES_viewport_array is required.

Overview

    By default, OpenGL runs a fragment shader once for each pixel covered by a
    primitive being rasterized. When using multisampling, the outputs of that
    fragment shader are broadcast to each covered sample of the fragment's
    pixel. When using multisampling, applications can also request that the
    fragment shader be run once per color sample (when using the "sample"
    qualifier on one or more active fragment shader inputs), or run a fixed
    number of times per pixel using the SAMPLE_SHADING enable and the
    MinSampleShading frequency value.
    In all of these approaches, the number of fragment shader invocations
    per pixel is fixed, based on API state.

    This extension allows applications to bind and enable a shading rate image
    that can be used to vary the number of fragment shader invocations across
    the framebuffer. This can be useful for applications like eye tracking
    for virtual reality, where the portion of the framebuffer that the user is
    looking at directly can be processed at high frequency, while distant
    corners of the image can be processed at lower frequency. The shading
    rate image is an immutable-format two-dimensional or two-dimensional array
    texture that uses a format of R8UI. Each texel represents a fixed-size
    rectangle in the framebuffer, covering 16x16 pixels in the initial
    implementation of this extension. When rasterizing a primitive covering
    one of these rectangles, the OpenGL implementation reads the texel in the
    bound shading rate image and looks up the fetched value in a palette of
    shading rates. The shading rate used can vary from (finest) 16 fragment
    shader invocations per pixel to (coarsest) one fragment shader invocation
    for each 4x4 block of pixels.

    When this extension is advertised by an OpenGL implementation, the
    implementation must also support the GLSL extension
    "GL_NV_shading_rate_image" (documented separately), which provides new
    built-in variables that allow fragment shaders to determine the effective
    shading rate used for each fragment. Additionally, the GLSL extension also
    provides new layout qualifiers allowing the interlock functionality provided
    by ARB_fragment_shader_interlock to guarantee mutual exclusion across an
    entire fragment when the shading rate specifies multiple pixels per fragment
    shader invocation.

    Note that this extension requires the use of a framebuffer object; the
    shading rate image and related state are ignored when rendering to the
    default framebuffer.
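    To make the texel footprint described in this overview concrete, the
    following minimal sketch computes how many shading rate image texels
    are needed along one axis to cover a framebuffer. The function name is
    hypothetical and not part of this extension; the footprint is passed in
    as a parameter because it is implementation-dependent (16 in the
    initial implementation) and would normally be queried through
    SHADING_RATE_IMAGE_TEXEL_WIDTH_NV and SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV.

```c
#include <assert.h>

/* Hypothetical helper: number of shading rate image texels needed along
 * one axis so that every framebuffer pixel on that axis falls inside some
 * texel's footprint.  Rounds up, so a partially covered last texel still
 * gets its own entry. */
static int shading_rate_image_extent(int framebuffer_extent, int texel_extent)
{
    assert(framebuffer_extent > 0 && texel_extent > 0);
    return (framebuffer_extent + texel_extent - 1) / texel_extent;
}
```

    Under these assumptions, a 1920x1080 framebuffer with a 16x16 footprint
    would use a 120x68 shading rate image; the last row of texels is only
    partially covered by the framebuffer.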

New Procedures and Functions

      void BindShadingRateImageNV(uint texture);
      void ShadingRateImagePaletteNV(uint viewport, uint first, sizei count,
                                     const enum *rates);
      void GetShadingRateImagePaletteNV(uint viewport, uint entry,
                                        enum *rate);
      void ShadingRateImageBarrierNV(boolean synchronize);
      void ShadingRateSampleOrderNV(enum order);
      void ShadingRateSampleOrderCustomNV(enum rate, uint samples,
                                          const int *locations);
      void GetShadingRateSampleLocationivNV(enum rate, uint samples,
                                            uint index, int *location);

New Tokens

    Accepted by the <cap> parameter of Enable, Disable, and IsEnabled, by the
    <target> parameter of Enablei, Disablei, IsEnabledi, EnableIndexedEXT,
    DisableIndexedEXT, and IsEnabledIndexedEXT, and by the <pname> parameter
    of GetBooleanv, GetIntegerv, GetInteger64v, GetFloatv, GetDoublev,
    GetDoubleIndexedv, GetBooleani_v, GetIntegeri_v, GetInteger64i_v,
    GetFloati_v, GetDoublei_v, GetBooleanIndexedvEXT, GetIntegerIndexedvEXT,
    and GetFloatIndexedvEXT:

        SHADING_RATE_IMAGE_NV                           0x9563

    Accepted in the <rates> parameter of ShadingRateImagePaletteNV and the
    <rate> parameter of ShadingRateSampleOrderCustomNV and
    GetShadingRateSampleLocationivNV; returned in the <rate> parameter of
    GetShadingRateImagePaletteNV:

        SHADING_RATE_NO_INVOCATIONS_NV                  0x9564
        SHADING_RATE_1_INVOCATION_PER_PIXEL_NV          0x9565
        SHADING_RATE_1_INVOCATION_PER_1X2_PIXELS_NV     0x9566
        SHADING_RATE_1_INVOCATION_PER_2X1_PIXELS_NV     0x9567
        SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV     0x9568
        SHADING_RATE_1_INVOCATION_PER_2X4_PIXELS_NV     0x9569
        SHADING_RATE_1_INVOCATION_PER_4X2_PIXELS_NV     0x956A
        SHADING_RATE_1_INVOCATION_PER_4X4_PIXELS_NV     0x956B
        SHADING_RATE_2_INVOCATIONS_PER_PIXEL_NV         0x956C
        SHADING_RATE_4_INVOCATIONS_PER_PIXEL_NV         0x956D
        SHADING_RATE_8_INVOCATIONS_PER_PIXEL_NV         0x956E
        SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV        0x956F

    Accepted by the <pname> parameter of GetBooleanv, GetDoublev,
    GetIntegerv, and GetFloatv:

        SHADING_RATE_IMAGE_BINDING_NV                   0x955B
        SHADING_RATE_IMAGE_TEXEL_WIDTH_NV               0x955C
        SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV              0x955D
        SHADING_RATE_IMAGE_PALETTE_SIZE_NV              0x955E
        MAX_COARSE_FRAGMENT_SAMPLES_NV                  0x955F

    Accepted by the <order> parameter of ShadingRateSampleOrderNV:

        SHADING_RATE_SAMPLE_ORDER_DEFAULT_NV            0x95AE
        SHADING_RATE_SAMPLE_ORDER_PIXEL_MAJOR_NV        0x95AF
        SHADING_RATE_SAMPLE_ORDER_SAMPLE_MAJOR_NV       0x95B0


Modifications to the OpenGL 4.5 Specification (Compatibility Profile)

    Modify Section 14.3.1, Multisampling, p. 532

    (add to the end of the section)

    When using a shading rate image (Section 14.4.1), rasterization may
    produce fragments covering multiple pixels, where each pixel is treated as
    a sample. If SHADING_RATE_IMAGE_NV is enabled for any viewport,
    primitives will be processed with multisample rasterization rules,
    regardless of the MULTISAMPLE enable or the value of SAMPLE_BUFFERS. If
    the framebuffer has no multisample buffers, each pixel is treated as
    having a single sample located at the pixel center.


    Delete Section 14.3.1.1, Sample Shading, p. 532. The functionality in
    this section is moved to the new Section 14.4, "Shading Rate Control".


    Add new section before Section 14.4, Points, p. 533

    Section 14.4, Shading Rate Control

    By default, each fragment processed by programmable fragment processing
    (chapter 15) [[compatibility only: or fixed-function fragment processing
    (chapter 16)]] corresponds to a single pixel with a single (x,y)
    coordinate. When using multisampling, implementations are permitted to run
    separate fragment shader invocations for each sample, but often only run a
    single invocation for all samples of the fragment.
    We will refer to the density of fragment shader invocations in a
    particular framebuffer region as the _shading rate_. Applications can
    use the shading rate to increase the size of fragments to cover multiple
    pixels and reduce the amount of fragment shader work. Applications can
    also use the shading rate to explicitly control the minimum number of
    fragment shader invocations when multisampling.


    Section 14.4.1, Shading Rate Image

    Applications can specify the use of a shading rate that varies by (x,y)
    location using a _shading rate image_. Use of a shading rate image is
    enabled or disabled for all viewports using Enable or Disable with target
    SHADING_RATE_IMAGE_NV. Use of a shading rate image is enabled or disabled
    for a specific viewport using Enablei or Disablei with the constant
    SHADING_RATE_IMAGE_NV and the index of the selected viewport. The shading
    rate image may only be used with a framebuffer object. When rendering to
    the default framebuffer, the shading rate image operations in this section
    are disabled.

    The shading rate image is a texture that can be bound with the command

      void BindShadingRateImageNV(uint texture);

    This command unbinds the current shading rate image, if any. If <texture>
    is zero, no new texture is bound. If <texture> is non-zero, it must be
    the name of an existing immutable-format texture with a target of
    TEXTURE_2D or TEXTURE_2D_ARRAY with a format of R8UI. If <texture> has
    multiple mipmap levels, only the base level will be used as the shading
    rate image.

      Errors

        INVALID_VALUE is generated if <texture> is not zero and is not the
        name of an existing texture object.

        INVALID_OPERATION is generated if <texture> is not an immutable-format
        texture, has a format other than R8UI, or has a texture target other
        than TEXTURE_2D or TEXTURE_2D_ARRAY.
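
    The binding rules above can be summarized as a small validation
    routine. This is only a sketch of the error conditions, not driver
    code: the TextureDesc type and the function name are hypothetical, a
    NULL descriptor stands in for "not the name of an existing texture
    object", and the enum values are the standard OpenGL values repeated so
    that the example compiles without GL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Standard OpenGL enum values, repeated so no GL header is required. */
enum {
    GL_NO_ERROR          = 0,
    GL_INVALID_VALUE     = 0x0501,
    GL_INVALID_OPERATION = 0x0502,
    GL_TEXTURE_2D        = 0x0DE1,
    GL_TEXTURE_2D_ARRAY  = 0x8C1A,
    GL_R8UI              = 0x8232
};

/* Hypothetical description of a texture object; not a real GL type. */
typedef struct {
    bool immutable_format;   /* created with immutable storage */
    int  target;             /* e.g. GL_TEXTURE_2D */
    int  internal_format;    /* e.g. GL_R8UI */
} TextureDesc;

/* Returns the error that binding <name> as the shading rate image would
 * generate.  tex == NULL models a non-zero name with no texture object. */
static int check_shading_rate_image_bind(unsigned name, const TextureDesc *tex)
{
    if (name == 0)
        return GL_NO_ERROR;              /* unbind only; nothing to check */
    if (tex == NULL)
        return GL_INVALID_VALUE;         /* not an existing texture */
    if (!tex->immutable_format ||
        tex->internal_format != GL_R8UI ||
        (tex->target != GL_TEXTURE_2D && tex->target != GL_TEXTURE_2D_ARRAY))
        return GL_INVALID_OPERATION;
    return GL_NO_ERROR;
}
```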

    When rasterizing a primitive covering pixel (x,y) with a shading rate
    image having a target of TEXTURE_2D, a two-dimensional texel coordinate
    (u,v) is generated, where:

      u = floor(x / SHADING_RATE_IMAGE_TEXEL_WIDTH_NV)
      v = floor(y / SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV)

    and where SHADING_RATE_IMAGE_TEXEL_WIDTH_NV and
    SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV are the width and height of the
    implementation-dependent footprint of each shading rate image texel in the
    framebuffer. If the bound shading rate image has a target of
    TEXTURE_2D_ARRAY, a three-dimensional texture coordinate (u,v,w) is
    generated, where u and v are computed as above. The coordinate w is set
    to the layer L of the framebuffer being rendered to if L is less than the
    number of layers in the shading rate image, or zero otherwise.

    If a texel with coordinates (u,v) or (u,v,w) exists in the bound shading
    rate image, the value of the 8-bit R component of the texel is used as the
    shading rate index. If the (u,v) or (u,v,w) coordinate is outside the
    extent of the shading rate image, or if no shading rate image is bound,
    zero will be used as the shading rate index.

    A shading rate index is mapped to a _base shading rate_ using a lookup
    table called the shading rate image palette. There is a separate palette
    for each viewport. The number of entries in each palette is given by the
    implementation-dependent constant SHADING_RATE_IMAGE_PALETTE_SIZE_NV. The
    base shading rate for an (x,y) coordinate with a shading rate index of <i>
    will be given by palette entry <i>. If the shading rate index is greater
    than or equal to the palette size, the results of the palette lookup are
    undefined.
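
    The texel addressing rules above can be sketched on the CPU as follows.
    This is an illustrative model only: ShadingRateImageModel and
    shading_rate_index are hypothetical names, the image is assumed to be
    stored layer by layer and row by row, and <texel_w> and <texel_h> stand
    in for the values of SHADING_RATE_IMAGE_TEXEL_WIDTH_NV and
    SHADING_RATE_IMAGE_TEXEL_HEIGHT_NV. The returned index would then be
    looked up in the palette of the viewport being rendered to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU-side copy of a shading rate image; not an API type. */
typedef struct {
    int width;              /* texels per row */
    int height;             /* rows per layer */
    int layers;             /* 1 for a TEXTURE_2D image */
    const uint8_t *texels;  /* R8UI values, layer by layer, row by row */
} ShadingRateImageModel;

/* Shading rate index for framebuffer pixel (x, y) when rendering to
 * framebuffer layer `layer`. */
static unsigned shading_rate_index(const ShadingRateImageModel *img,
                                   int x, int y, int layer,
                                   int texel_w, int texel_h)
{
    assert(x >= 0 && y >= 0 && layer >= 0 && texel_w > 0 && texel_h > 0);
    if (img == NULL)
        return 0;                        /* no shading rate image bound */

    int u = x / texel_w;                 /* equals floor() since x >= 0 */
    int v = y / texel_h;
    int w = (layer < img->layers) ? layer : 0;

    if (u >= img->width || v >= img->height)
        return 0;                        /* outside the image extent */

    return img->texels[((size_t)w * img->height + v) * img->width + u];
}
```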

    Shading rate image palettes are updated using the command

      void ShadingRateImagePaletteNV(uint viewport, uint first, sizei count,
                                     const enum *rates);

    <viewport> specifies the number of the viewport whose palette should be
    updated. <rates> is an array of <count> shading rate enums and is used to
    update entries <first> through <first> + <count> - 1 in the palette. The
    set of shading rate values accepted in <rates> is given in Table X.1. The
    default value for all palette entries is
    SHADING_RATE_1_INVOCATION_PER_PIXEL_NV.

        Shading Rate                                 Size  Invocations
        -------------------------------------------  ----- -----------
        SHADING_RATE_NO_INVOCATIONS_NV                 -       0
        SHADING_RATE_1_INVOCATION_PER_PIXEL_NV        1x1      1
        SHADING_RATE_1_INVOCATION_PER_1X2_PIXELS_NV   1x2      1
        SHADING_RATE_1_INVOCATION_PER_2X1_PIXELS_NV   2x1      1
        SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV   2x2      1
        SHADING_RATE_1_INVOCATION_PER_2X4_PIXELS_NV   2x4      1
        SHADING_RATE_1_INVOCATION_PER_4X2_PIXELS_NV   4x2      1
        SHADING_RATE_1_INVOCATION_PER_4X4_PIXELS_NV   4x4      1
        SHADING_RATE_2_INVOCATIONS_PER_PIXEL_NV       1x1      2
        SHADING_RATE_4_INVOCATIONS_PER_PIXEL_NV       1x1      4
        SHADING_RATE_8_INVOCATIONS_PER_PIXEL_NV       1x1      8
        SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV      1x1      16

        Table X.1: Shading rates accepted by ShadingRateImagePaletteNV. An
        entry of "<W>x<H>" in the "Size" column indicates that the shading
        rate results in fragments with a width and height (in pixels) of <W>
        and <H>, respectively. The entry in the "Invocations" column
        specifies the number of fragment shader invocations that should be
        generated for each fragment.

      Errors

        INVALID_VALUE is generated if <viewport> is greater than or equal to
        MAX_VIEWPORTS or if <first> plus <count> is greater than
        SHADING_RATE_IMAGE_PALETTE_SIZE_NV.

        INVALID_ENUM is generated if any entry in <rates> is not a valid
        shading rate.
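
    Table X.1 and the palette update rules can be modeled as shown below.
    This is a sketch under stated assumptions rather than a description of
    any implementation: PaletteModel, RateInfo, decode_rate, and
    palette_update are hypothetical names, the palette size and viewport
    count are placeholder constants (real values are
    implementation-dependent and must be queried), a negative <count> is
    assumed to produce INVALID_VALUE, and the order in which the two errors
    are checked is an assumption. The token values are those listed under
    "New Tokens".

```c
#include <assert.h>
#include <stdbool.h>

enum {
    GL_NO_ERROR      = 0,
    GL_INVALID_ENUM  = 0x0500,
    GL_INVALID_VALUE = 0x0501,
    /* First and last shading rate tokens; the others lie in between. */
    SHADING_RATE_NO_INVOCATIONS_NV           = 0x9564,
    SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV = 0x956F,
    /* Placeholder limits for this sketch only. */
    MODEL_PALETTE_SIZE  = 16,
    MODEL_MAX_VIEWPORTS = 16
};

typedef struct { int width, height, invocations; } RateInfo;

/* Table X.1 in token order, starting at SHADING_RATE_NO_INVOCATIONS_NV. */
static const RateInfo rate_table[12] = {
    {0, 0, 0},                                      /* no invocations   */
    {1, 1, 1}, {1, 2, 1}, {2, 1, 1}, {2, 2, 1},     /* 1x1 .. 2x2       */
    {2, 4, 1}, {4, 2, 1}, {4, 4, 1},                /* 2x4 .. 4x4       */
    {1, 1, 2}, {1, 1, 4}, {1, 1, 8}, {1, 1, 16}     /* supersampled 1x1 */
};

/* Decodes a shading rate token; returns false for values not in the table. */
static bool decode_rate(int rate, RateInfo *out)
{
    if (rate < SHADING_RATE_NO_INVOCATIONS_NV ||
        rate > SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV)
        return false;
    *out = rate_table[rate - SHADING_RATE_NO_INVOCATIONS_NV];
    return true;
}

typedef struct {
    int entries[MODEL_MAX_VIEWPORTS][MODEL_PALETTE_SIZE];
} PaletteModel;

/* Models ShadingRateImagePaletteNV: validates everything first, so a call
 * that generates an error leaves the palette unchanged. */
static int palette_update(PaletteModel *p, unsigned viewport, unsigned first,
                          int count, const int *rates)
{
    if (viewport >= MODEL_MAX_VIEWPORTS || count < 0 ||
        first > MODEL_PALETTE_SIZE ||
        (unsigned)count > MODEL_PALETTE_SIZE - first)
        return GL_INVALID_VALUE;
    for (int i = 0; i < count; i++) {
        RateInfo unused;
        if (!decode_rate(rates[i], &unused))
            return GL_INVALID_ENUM;
    }
    for (int i = 0; i < count; i++)
        p->entries[viewport][first + i] = rates[i];
    return GL_NO_ERROR;
}
```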

    Individual entries in the shading rate palette can be queried using the
    command:

      void GetShadingRateImagePaletteNV(uint viewport, uint entry,
                                        enum *rate);

    where <viewport> specifies the viewport of the palette to query and
    <entry> specifies the palette entry number. A single enum from Table X.1
    is returned in <rate>.

      Errors

        INVALID_VALUE is generated if <viewport> is greater than or equal to
        MAX_VIEWPORTS or if <entry> is greater than or equal to
        SHADING_RATE_IMAGE_PALETTE_SIZE_NV.

    If the shading rate image is enabled, a base shading rate will be obtained
    as described above. If the shading rate image is disabled, the base
    shading rate will be SHADING_RATE_1_INVOCATION_PER_PIXEL_NV. In either
    case, the shading rate will be adjusted as described in the following
    sections.

    The rasterization hardware that reads from the shading rate image may
    cache texels it reads for maximum performance. If the shading rate image
    is updated using commands such as TexSubImage2D, image stores in shaders,
    or by framebuffer writes performed when the shading rate image is bound to
    a framebuffer object, this cache may retain out-of-date texture data.
    Calling

      void ShadingRateImageBarrierNV(boolean synchronize);

    with <synchronize> set to TRUE ensures that rendering commands submitted
    after the barrier don't access old shading rate image data updated
    directly (TexSubImage2D) or indirectly (rendering, image stores) by
    commands submitted before the barrier. If <synchronize> is set to FALSE,
    ShadingRateImageBarrierNV doesn't wait on the completion of commands
    submitted before the barrier. If an application has ensured that all
    prior commands updating the shading rate image have completed using sync
    objects or another mechanism, <synchronize> can be safely set to FALSE.
    Otherwise, the lack of synchronization may cause subsequent rendering
    commands to source the shading rate image before prior updates have
    completed.


    Section 14.4.2, Sample Shading

    When the shading rate image is disabled, sample shading can be used to
    specify a minimum number of fragment shader invocations to generate for
    each fragment. When the shading rate image is enabled, sample shading can
    be used to adjust the shading rate to increase the number of fragment
    shader invocations generated for each primitive. Sample shading is
    controlled by calling Enable or Disable with target SAMPLE_SHADING. If
    MULTISAMPLE or SAMPLE_SHADING is disabled, sample shading has no effect.

    When sample shading is active, an integer sample shading factor is derived
    based on the value provided in the command:

      void MinSampleShading(float value);

    When the shading rate image is disabled, a <value> of 0.0 specifies that
    the minimum number of fragment shader invocations for the shading rate be
    executed and a <value> of 1.0 specifies that a fragment shader should be
    run on each shadeable sample with separate values per sample. When the
    shading rate image is enabled, <value> is used to derive a sample shading
    rate that can adjust the shading rate. <value> is not clamped to [0.0,
    1.0]; values larger than 1.0 can be used to force larger adjustments to
    the shading rate.

    The sample shading factor is computed from <value> in an
    implementation-dependent manner but must be greater than or equal to:

      factor = max(ceil(value * max_shaded_samples), 1)

    In this computation, <max_shaded_samples> is the maximum number of
    fragment shader invocations per fragment, and is equal to:

    - the number of color samples, if the framebuffer has color attachments;

    - the number of depth/stencil samples, if the framebuffer has
      depth/stencil attachments but no color attachments; or

    - the value of FRAMEBUFFER_DEFAULT_SAMPLES if the framebuffer has no
      attachments.

    If the framebuffer has non-multisample attachments, the maximum number of
    shaded samples per pixel is always one.


    Section 14.4.3, Shading Rate Adjustment

    Once a base shading rate has been established, it is adjusted to produce a
    final shading rate.

    First, if the base shading rate specifies multiple pixels for a fragment,
    the shading rate is adjusted in an implementation-dependent manner to
    limit the total number of coverage samples for the "coarse" fragment.
    After adjustment, the maximum number of samples will not exceed the
    implementation-dependent maximum MAX_COARSE_FRAGMENT_SAMPLES_NV. However,
    implementations are permitted to clamp to a lower number of coverage
    samples if required. Table X.2 describes the clamping performed in the
    initial implementation of this extension.

                     Coverage Samples per Pixel
        Base rate      2      4      8      16
        ---------    -----  -----  -----  -----
           1x2         -      -      -     1x1
           2x1         -      -     1x1    1x1
           2x2         -      -     1x2    1x1
           2x4         -     2x2    1x2    1x1
           4x2        2x2    2x2    1x2    1x1
           4x4        2x4    2x2    1x2    1x1

        Table X.2, Coarse shading rate adjustment for total coverage sample
        count for the initial implementation of this extension, where
        MAX_COARSE_FRAGMENT_SAMPLES_NV is 16.
        The entries in the "2", "4", "8",
        and "16" columns indicate the fragment size for the adjusted shading
        rate.

    If sample shading is enabled and the sample shading factor is greater than
    one, the base shading rate is further adjusted to result in more shader
    invocations per pixel. Table X.3 describes how the shading rate is
    adjusted in the initial implementation of this extension.

                               Sample Shading Factor
        Base rate        2          4          8          16
        ----------   ---------  ---------  ---------  ---------
        1x1 / 1      1x1 / 2    1x1 / 4    1x1 / 8    1x1 / 16
        1x2 / 1      1x1 / 1    1x1 / 2    1x1 / 4    1x1 / 8
        2x1 / 1      1x1 / 1    1x1 / 2    1x1 / 4    1x1 / 8
        2x2 / 1      1x2 / 1    1x1 / 1    1x1 / 2    1x1 / 4
        2x4 / 1      2x2 / 1    1x2 / 1    1x1 / 1    1x1 / 2
        4x2 / 1      2x2 / 1    2x1 / 1    1x1 / 1    1x1 / 2
        4x4 / 1      2x4 / 1    2x2 / 1    1x2 / 1    1x1 / 1
        1x1 / 2      1x1 / 4    1x1 / 8    1x1 / 16   1x1 / 16
        1x1 / 4      1x1 / 8    1x1 / 16   1x1 / 16   1x1 / 16
        1x1 / 8      1x1 / 16   1x1 / 16   1x1 / 16   1x1 / 16
        1x1 / 16     1x1 / 16   1x1 / 16   1x1 / 16   1x1 / 16

        Table X.3, Shading rate adjustment based on the sample shading factor in
        the initial implementation of this extension. All rates in this table
        are of the form "<W>x<H> / <I>", indicating a fragment size of <W>x<H>
        pixels with <I> invocations per fragment.

    If RASTER_MULTISAMPLE_EXT is enabled and the shading rate indicates
    multiple fragment shader invocations per pixel, implementations are
    permitted to adjust the shading rate to reduce the number of invocations
    per pixel. In this case, implementations are not required to support more
    than one invocation per pixel.

    If the active fragment shader uses any inputs that are qualified with
    "sample" (unique values per sample), including the built-ins "gl_SampleID"
    and "gl_SamplePosition", the shader code is written to expect a separate
    shader invocation for each shaded sample.
    For such fragment shaders, the
    shading rate is set to the maximum number of shader invocations per pixel
    (SHADING_RATE_16_INVOCATIONS_PER_PIXEL_NV). This adjustment effectively
    disables the shading rate image.

    Finally, if the shading rate indicates multiple fragment shader
    invocations per sample, the total number of invocations per fragment in
    the shading rate is clamped to the maximum number of shaded samples per
    pixel described in section 14.4.2.


    Section 14.4.4, Shading Rate Application

    If the palette indicates a shading rate of SHADING_RATE_NO_INVOCATIONS_NV
    for pixel (x,y), no fragments will be generated for that pixel.

    When the final shading rate for pixel (x,y) results in fragments with a
    width and height of <W> and <H>, where either <W> or <H> is greater than
    one, a single fragment will be produced for that pixel that also includes
    all other pixels covered by the same primitive whose coordinates (x',y')
    satisfy:

      floor(x / W) == floor(x' / W), and
      floor(y / H) == floor(y' / H).

    This combined fragment is considered to have multiple coverage samples;
    the total number of samples in this fragment is given by

      samples = A * B * S

    where <A> and <B> are the width and height of the combined fragment, in
    pixels, and <S> is the number of coverage samples per pixel in the draw
    framebuffer. The set of coverage samples in the fragment is the union of
    the per-pixel coverage samples in each of the fragment's pixels. The
    location and order of coverage samples within each pixel in the combined
    fragment are the same as the location and order used for single-pixel
    fragments. Each coverage sample in the set of pixels belonging to the
    combined fragment is assigned a unique sample number in the range
    [0,<samples>-1].
    When rendering to a framebuffer object, the order of coverage
    samples can be specified for each combination of fragment size and
    coverage sample count. When using the default framebuffer, the coverage
    samples are ordered in an implementation-dependent manner. The command

      void ShadingRateSampleOrderNV(enum order);

    sets the coverage sample order for all valid combinations of shading rate
    and per-pixel sample coverage count. If <order> is
    SHADING_RATE_SAMPLE_ORDER_DEFAULT_NV, coverage samples are ordered in an
    implementation-dependent default order. If <order> is
    SHADING_RATE_SAMPLE_ORDER_PIXEL_MAJOR_NV, coverage samples in the combined
    fragment will be ordered sequentially, sorted first by pixel coordinate
    (in row-major order) and then by per-pixel coverage sample number. If
    <order> is SHADING_RATE_SAMPLE_ORDER_SAMPLE_MAJOR_NV, coverage samples in
    the combined fragment will be ordered sequentially, sorted first by
    per-pixel coverage sample number and then by pixel coordinate (in
    row-major order).

    When processing a fragment using an ordering specified by
    SHADING_RATE_SAMPLE_ORDER_PIXEL_MAJOR_NV, sample <cs> in the combined
    fragment will be assigned to coverage sample <ps> of pixel (px,py)
    specified by:

      px = fx + (floor(cs / fsc) % fw)
      py = fy + floor(cs / (fsc * fw))
      ps = cs % fsc

    where the lower-leftmost pixel in the fragment has coordinates (fx,fy),
    the fragment width and height are <fw> and <fh>, respectively, and there
    are <fsc> coverage samples per pixel.
    When processing a fragment with an
    ordering specified by SHADING_RATE_SAMPLE_ORDER_SAMPLE_MAJOR_NV, sample
    <cs> in the combined fragment will be assigned using:

      px = fx + (cs % fw)
      py = fy + (floor(cs / fw) % fh)
      ps = floor(cs / (fw * fh))

    Additionally, the command

      void ShadingRateSampleOrderCustomNV(enum rate, uint samples,
                                          const int *locations);

    specifies the order of coverage samples for fragments using a shading rate
    of <rate> with <samples> coverage samples per pixel. <rate> must be one
    of the shading rates specified in Table X.1 and must specify a shading
    rate with more than one pixel per fragment. <locations> specifies an
    array of N (x,y,s) tuples, where N is the product of the fragment width
    indicated by <rate>, the fragment height indicated by <rate>, and
    <samples>. For each (x,y,s) tuple specified in <locations>, <x> must be
    in the range [0,fw-1], <y> must be in the range [0,fh-1], and <s> must be
    in the range [0,fsc-1]. No two tuples in <locations> may have the same
    values.

    When using a sample order specified by ShadingRateSampleOrderCustomNV,
    sample <cs> in the combined fragment will be assigned using:

      px = fx + locations[3 * cs + 0]
      py = fy + locations[3 * cs + 1]
      ps = locations[3 * cs + 2]

    where all terms in these equations are defined as in the equations
    specified for ShadingRateSampleOrderNV and are consistent with a shading
    rate of <rate> and a per-pixel sample count of <samples>.

      Errors

        * INVALID_ENUM is generated if <rate> is not one of the enums in Table
          X.1.

        * INVALID_OPERATION is generated if <rate> does not specify a
          shading rate palette entry that specifies fragments with more than
          one pixel.

        * INVALID_VALUE is generated if <samples> is not 1, 2, 4, or 8.

        * INVALID_OPERATION is generated if the product of the fragment width
          indicated by <rate>, the fragment height indicated by <rate>, and
          <samples> is greater than MAX_COARSE_FRAGMENT_SAMPLES_NV.

        * INVALID_VALUE is generated if any (x,y,s) tuple in <locations> has
          negative values of <x>, <y>, or <s>, has an <x> value greater than or
          equal to the width of fragments using <rate>, has a <y> value greater
          than or equal to the height of fragments using <rate>, or has an <s>
          value greater than or equal to <samples>.

        * INVALID_OPERATION is generated if any pair of (x,y,s) tuples in
          <locations> have identical values.

    In the initial state, the order of coverage samples in combined fragments
    is implementation-dependent, but will be identical to the order obtained
    by passing SHADING_RATE_SAMPLE_ORDER_DEFAULT_NV to
    ShadingRateSampleOrderNV.

    The command

      void GetShadingRateSampleLocationivNV(enum rate, uint samples,
                                            uint index, int *location);

    can be used to determine the specific pixel and sample number for each
    numbered sample in a single- or multi-pixel fragment when the final
    shading rate is <rate> and uses <samples> coverage samples per pixel.
    <index> specifies a sample number in the fragment. Three integers are
    returned in <location>, and are interpreted in the same manner as each
    (x,y,s) tuple passed to ShadingRateSampleOrderCustomNV. The command
    GetMultisamplefv can be used to determine the location of the identified
    sample <s> within a combined fragment pixel identified by (x,y).

      Errors

        INVALID_OPERATION is generated if <rate> is
        SHADING_RATE_NO_INVOCATIONS_NV.

        INVALID_VALUE is generated if <index> is greater than or equal to the
        number of coverage samples in the draw framebuffer in a combined pixel
        for a shading rate given by <rate>.
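
    The pixel grouping and sample ordering rules of this section can be
    checked with a small CPU-side sketch. All names below are hypothetical
    and not part of the API. The functions transcribe the pixel-major and
    sample-major equations directly, using integer division in place of
    floor() for non-negative operands. The validity check for custom
    orderings covers only the range and uniqueness rules, collapses them
    into one boolean rather than distinguishing INVALID_VALUE from
    INVALID_OPERATION, and uses an arbitrary cap of 64 samples that stands
    in for MAX_COARSE_FRAGMENT_SAMPLES_NV.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { int px, py, ps; } SampleLocation;

/* Lower-left pixel (fx, fy) of the w x h combined fragment containing
 * pixel (x, y): all pixels sharing floor(x / w) and floor(y / h). */
static void fragment_origin(int x, int y, int w, int h, int *fx, int *fy)
{
    assert(x >= 0 && y >= 0 && w > 0 && h > 0);
    *fx = (x / w) * w;
    *fy = (y / h) * h;
}

/* Pixel-major order: all samples of one pixel, then the next pixel. */
static SampleLocation pixel_major(int cs, int fx, int fy,
                                  int fw, int fh, int fsc)
{
    assert(cs >= 0 && cs < fw * fh * fsc);
    SampleLocation loc;
    loc.px = fx + (cs / fsc) % fw;
    loc.py = fy + cs / (fsc * fw);
    loc.ps = cs % fsc;
    return loc;
}

/* Sample-major order: sample 0 of every pixel, then sample 1, and so on. */
static SampleLocation sample_major(int cs, int fx, int fy,
                                   int fw, int fh, int fsc)
{
    assert(cs >= 0 && cs < fw * fh * fsc);
    SampleLocation loc;
    loc.px = fx + cs % fw;
    loc.py = fy + (cs / fw) % fh;
    loc.ps = cs / (fw * fh);
    return loc;
}

/* True if <locations> holds fw*fh*fsc in-range, pairwise distinct
 * (x,y,s) tuples. */
static bool custom_order_is_valid(const int *locations,
                                  int fw, int fh, int fsc)
{
    enum { MODEL_MAX_SAMPLES = 64 };     /* arbitrary cap for this sketch */
    bool seen[MODEL_MAX_SAMPLES] = { false };

    if (fw <= 0 || fh <= 0 || fsc <= 0 || fw * fh * fsc > MODEL_MAX_SAMPLES)
        return false;
    for (int cs = 0; cs < fw * fh * fsc; cs++) {
        int x = locations[3 * cs + 0];
        int y = locations[3 * cs + 1];
        int s = locations[3 * cs + 2];
        if (x < 0 || x >= fw || y < 0 || y >= fh || s < 0 || s >= fsc)
            return false;                /* tuple out of range */
        int slot = (y * fw + x) * fsc + s;
        if (seen[slot])
            return false;                /* duplicate tuple */
        seen[slot] = true;
    }
    return true;
}
```

    For a 2x2 fragment with four coverage samples per pixel, sample 6 maps
    to sample 2 of the second pixel in the bottom row under pixel-major
    ordering, but to sample 1 of the first pixel in the top row under
    sample-major ordering.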

    When the final shading rate for pixel (x,y) specifies single-pixel
    fragments, a single fragment with S samples numbered in the range
    [0,<S>-1] will be generated when (x,y) is covered.

    If the final shading rate for the fragment containing pixel (x,y) produces
    fragments covering multiple pixels, a single fragment shader invocation
    will be generated for the combined fragment. When using fragments with
    multiple pixels per fragment, fragment shader outputs (e.g., color values
    and gl_FragDepth) will be broadcast to all covered pixels/samples of the
    fragment. If a "discard" is used in a fragment shader, none of the
    pixels/samples of the fragment will be updated.

    If the final shading rate for pixel (x,y) indicates <N> fragment shader
    invocations per fragment, <N> separate fragment shader invocations will be
    generated for the single-pixel fragment. Each coverage sample in the
    fragment is assigned to one of the <N> fragment shader invocations in an
    implementation-dependent manner.

    If sample shading is enabled and the final shading rate results in
    multiple fragment shader invocations per pixel, each fragment shader
    invocation for a pixel will have a separate set of interpolated input
    values. If sample shading is disabled, interpolated fragment shader
    inputs not qualified with "centroid" may have the same value for each
    invocation.


    Modify Section 14.6.X, Conservative Rasterization from the
    NV_conservative_raster extension specification

    (add to the end of the section)

    When the shading rate results in fragments covering more than one pixel,
    coverage evaluation for conservative rasterization will be performed
    independently for each pixel. In such a case, a pixel considered not to
    be covered by a conservatively rasterized primitive will still be
    considered uncovered even if a neighboring pixel in the same fragment is
    covered.


    Modify Section 14.9.2, Scissor Test

    (add to the end of the section)

    When the shading rate results in fragments covering more than one pixel,
    the scissor tests are performed separately for each pixel in the fragment.
    If a pixel covered by a fragment fails either the scissor or exclusive
    scissor test, that pixel is treated as though it was not covered by the
    primitive. If all pixels covered by a fragment are either not covered by
    the primitive being rasterized or fail either scissor test, the fragment
    is discarded.


    Modify Section 14.9.3, Multisample Fragment Operations (p. 562)

    (modify the end of the first paragraph to indicate that sample mask
    operations are performed when using the shading rate image, which can
    produce coarse fragments where each pixel is considered a "sample")

    ... This step is skipped if MULTISAMPLE is disabled or if the value of
    SAMPLE_BUFFERS is not one, unless SHADING_RATE_IMAGE_NV is enabled for one
    or more viewports.

    (add to the end of the section)

    When the shading rate results in fragments covering more than one pixel,
    each fragment will have a composite coverage mask that includes separate
    coverage bits for each sample in each pixel covered by the fragment. This
    composite coverage mask will be used by the GLSL built-in input variable
    gl_SampleMaskIn[] and updated according to the built-in output variable
    gl_SampleMask[]. Each bit number in this composite mask maps to a
    specific pixel and sample number within that pixel.

    When building the composite coverage mask for a fragment, rasterization
    logic evaluates separate per-pixel coverage masks and then modifies each
    per-pixel mask as described in this section. After that, it assembles the
    composite mask by applying the mapping of composite mask bits to
    pixels/samples, which can be queried using GetShadingRateSampleLocationivNV.
    When using the output sample mask gl_SampleMask[] to determine which
    samples should be updated by subsequent per-fragment operations, a set of
    separate per-pixel output masks is extracted by reversing the mapping used
    to generate the composite sample mask.


    Modify Section 15.1, Fragment Shader Variables (p. 566)

    (modify fourth paragraph, p. 567, specifying how "centroid" works for
    multi-pixel fragments)

    When interpolating input variables, the default screen-space location at
    which these variables are sampled is defined in previous rasterization
    sections. The default location may be overridden by interpolation
    qualifiers. When interpolating variables declared using "centroid in",
    the variable is sampled at a location inside the area of the fragment that
    is covered by the primitive generating the fragment. ...


    Modify Section 15.2.2, Shader Inputs (p. 566), as edited by
    NV_conservative_raster_underestimation

    (add to new paragraph on gl_FragFullyCoveredNV)

    When CONSERVATIVE_RASTERIZATION_NV or CONSERVATIVE_RASTERIZATION2_NV is
    enabled, the built-in read-only variable gl_FragFullyCoveredNV is set to
    true if the fragment is fully covered by the generating primitive, and
    false otherwise. When the shading rate results in fragments covering more
    than one pixel, gl_FragFullyCoveredNV will be true if and only if all
    pixels covered by the fragment are fully covered by the primitive being
    rasterized.


    Modify Section 17.3, Per-Fragment Operations (p. 587)

    (insert a new paragraph after the first paragraph of the section)

    If the fragment covers multiple pixels, the operations described in this
    section are performed independently for each pixel covered by the
    fragment.
    The set of samples covered by each pixel is determined by
    extracting the portion of the fragment's composite coverage that applies
    to that pixel, as described in section 14.9.3.


Dependencies on ARB_sample_locations and NV_sample_locations

    If ARB_sample_locations or NV_sample_locations is supported, applications
    can enable programmable sample locations instead of the default sample
    locations, and also configure sample locations that may vary from pixel to
    pixel.

    When using "coarse" shading rates covering multiple pixels, the coarse
    fragment is considered to include the samples of all the pixels it
    contains. Each sample of each pixel in the coarse fragment is mapped to
    exactly one sample in the coarse fragment. The location of each sample in
    the coarse fragment is determined by mapping the sample to a pixel (px,py)
    and a sample <s> within the identified pixel. The exact location of that
    identified sample is the same as it would be for one-pixel fragments. If
    programmable sample locations are enabled, those locations will be used.
    If the sample location pixel grid is enabled, those locations will depend
    on the (x,y) coordinate of the containing pixel.

Dependencies on NV_scissor_exclusive

    If NV_scissor_exclusive is not supported, remove references to the
    exclusive scissor test in section 14.9.2.

Dependencies on NV_sample_mask_override_coverage

    If NV_sample_mask_override_coverage is supported, applications are able to
    use the sample mask to enable coverage for samples not covered by the
    primitive being rasterized. When this extension is used in conjunction
    with a shading rate where fragments cover multiple pixels, it's possible
    for the sample mask override to enable coverage for pixels that would
    normally be discarded.
    For example, this can enable coverage in pixels
    that are not covered by the primitive being rasterized or that fail the
    scissor test.

Dependencies on NV_conservative_raster

    If NV_conservative_raster is supported, conservative rasterization
    evaluates coverage per pixel, even when using a shading rate that
    specifies multiple pixels per fragment.

    If NV_conservative_raster is not supported, remove edits to the "Section
    14.6.X" section from that extension.

Dependencies on NV_conservative_raster_underestimation

    If NV_conservative_raster_underestimation is supported, and conservative
    rasterization is enabled with a shading rate that specifies multiple
    pixels per fragment, gl_FragFullyCoveredNV will be true if and only if all
    pixels covered by the fragment are fully covered by the primitive being
    rasterized.

    If NV_conservative_raster_underestimation is not supported, remove edits
    to Section 15.2.2 related to gl_FragFullyCoveredNV.

Dependencies on EXT_raster_multisample

    If EXT_raster_multisample is not supported, remove the language allowing
    implementations to reduce the number of fragment shader invocations
    per pixel if RASTER_MULTISAMPLE_EXT is enabled.

Interactions with NV_viewport_array or OES_viewport_array

    If NV_viewport_array is supported, references to MAX_VIEWPORTS and
    GetFloati_v apply to MAX_VIEWPORTS_NV and GetFloati_vNV respectively.

    If OES_viewport_array is supported, references to MAX_VIEWPORTS and
    GetFloati_v apply to MAX_VIEWPORTS_OES and GetFloati_vOES respectively.

Interactions with OpenGL ES 3.2

    If implemented in OpenGL ES, remove all references to GetDoublev,
    GetDoublei_v, EnableIndexedEXT, DisableIndexedEXT, IsEnabledIndexedEXT,
    GetBooleanIndexedvEXT, GetIntegerIndexedvEXT, GetFloatIndexedvEXT and
    GetDoubleIndexedv.

    If implemented in OpenGL ES, remove all references to the MULTISAMPLE enable
    state.

Additions to the AGL/GLX/WGL Specifications

    None

Errors

    See the "Errors" sections for individual commands above.

New State

    Get Value              Get Command      Type   Initial Value     Description                  Sec.    Attribute
    ---------              ---------------  ----   -------------     -----------                  ----    ---------
    SHADING_RATE_IMAGE_NV  IsEnabledi       16+ x  FALSE             Use shading rate image to    14.4.1  enable
                                            B                        determine shading rate for
                                                                     a given viewport
    SHADING_RATE_IMAGE_    GetIntegerv      Z      0                 Texture object bound for     14.4.1  none
      BINDING_NV                                                     use as a shading rate image
    <none>                 GetShadingRate-  16+ x  SHADING_RATE_1_-  Shading rate palette         14.4.1  none
                           ImagePaletteNV   16+ x  INVOCATION_PER_-  entries
                                            Z12    PIXEL_NV
    <none>                 GetShadingRate-  many   n/a               Locations of individual      14.4.3  none
                           SampleLocation-  3xZ+                     samples in "coarse"
                                                                     fragments

New Implementation Dependent State

                                                  Minimum
    Get Value              Type   Get Command     Value    Description                    Sec.
    ---------              -----  --------------- -------  ------------------------       ------
    SHADING_RATE_IMAGE_    Z+     GetIntegerv     1        Width (in pixels) covered by   14.4.1
      TEXEL_WIDTH_NV                                       each shading rate image texel
    SHADING_RATE_IMAGE_    Z+     GetIntegerv     1        Height (in pixels) covered by  14.4.1
      TEXEL_HEIGHT_NV                                      each shading rate image texel
    SHADING_RATE_IMAGE_    Z+     GetIntegerv     16       Number of entries in each      14.4.1
      PALETTE_SIZE_NV                                      viewport's shading rate
                                                           palette
    MAX_COARSE_FRAGMENT_   Z+     GetIntegerv     1        Maximum number of samples in   14.4.3
      SAMPLES_NV                                           "coarse" fragments

Issues

    (1) How should we name this extension?

      RESOLVED: We are calling this extension NV_shading_rate_image. We use
      the term "shading rate" to indicate the variable number of fragment
      shader invocations that will be spawned for a particular neighborhood of
      covered pixels.
      The extension can support shading rates running one
      invocation for multiple pixels and/or multiple invocations for a single
      pixel. We use "image" in the extension name because we allow
      applications to control the shading rate using an image, where each
      pixel specifies a shading rate for a portion of the framebuffer.

      We considered a name like "NV_variable_rate_shading", but decided that
      name didn't sufficiently distinguish this extension (where the
      shading rate varies across the framebuffer at once) from an extension
      where an API is provided to change the shading rate for the entire
      framebuffer. For example, the MinSampleShadingARB() API in
      ARB_sample_shading allows an application to run one thread per pixel
      (0.0) for some draw calls and one thread per sample (1.0) for others.

    (2) Should this extension support only off-screen (FBO) rendering or can
        it also support on-screen rendering?

      RESOLVED: This extension only supports rendering to a framebuffer
      object; the feature is disabled when rendering to the default
      framebuffer. In some window system environments, the default
      framebuffer may be a subset of a larger framebuffer allocation
      corresponding to the full screen. Because the initial hardware
      implementation of this extension always uses (x,y) coordinates relative
      to the framebuffer allocation to determine the shading rate, the shading
      rate would depend on the location of a window on the screen and change
      as the window moves. While some window systems may have separate
      default framebuffer allocations for each window, we've chosen to
      disallow use of the shading rate image with the default framebuffer
      globally instead of adding a "Can I use the shading rate image with a
      default framebuffer?" query.

    (3) How does this feature work with per-sample shading?

      RESOLVED: When using per-sample shading, an application is expecting a
      fragment shader to run with a separate invocation per sample. The
      shading rate image might allow for a "coarsening" that would break such
      shaders. We've chosen to override the shading rate (effectively
      disabling the shading rate image) when per-sample shading is used.

    (4) Should BindShadingRateImageNV take any arguments to bind a subset of
        a complex texture (e.g., a specific layer of an array texture or a
        non-base mipmap level)?

      RESOLVED: No. Applications can use texture views to create textures
      that refer to the desired subset of a more complex texture, if required.

    (5) Does a shading rate image need to be bound in order to use the shading
        rate feature?

      RESOLVED: No. The behavior where there is no texture bound when
      SHADING_RATE_IMAGE_NV is enabled is explicitly defined to behave as if a
      lookup was performed and returned zero. If an application wants to use
      a constant rate other than SHADING_RATE_1_INVOCATION_PER_PIXEL_NV, it
      can enable SHADING_RATE_IMAGE_NV, ensure no image is bound, and define
      the entries for index zero in the relevant palette(s) to contain the
      desired shading rate. This technique can be used to emulate 16x
      multisampling on implementations that don't support it by binding larger
      4x multisample textures to the framebuffer and then setting a shading
      rate of SHADING_RATE_1_INVOCATION_PER_2X2_PIXELS_NV.

    (6) How is the FRAGMENT_SHADER_INVOCATIONS_ARB query (from
        ARB_pipeline_statistics_query) handled with fragments covering
        multiple pixels?

      RESOLVED: The fragment shader invocation for each multi-pixel fragment
      is counted exactly once.

    (7) How do we handle the combination of variable-rate shading (including
        multiple invocations per pixel) and target-independent rasterization
        (i.e., RASTER_MULTISAMPLE_EXT)?

      RESOLVED: In EXT_raster_multisample, the specification allows
      implementations to run a single fragment shader invocation for each
      pixel, even if sample shading would normally call for multiple
      invocations per pixel:

        If RASTER_MULTISAMPLE_EXT is enabled, the number of unique samples to
        process is implementation-dependent and need not be more than one.

      The shading rates in this extension calling for multiple fragment shader
      invocations per pixel behave similarly to sample shading, so we extend
      the allowance to this extension as well. If the shading rate in a
      region of the framebuffer calls for multiple fragment shader invocations
      per pixel, implementations are permitted to modify the shading rate and
      need not support more than one invocation per pixel.

    (8) Both the shading rate image and the framebuffer attachments can be
        layered or non-layered. Do they have to match?

      RESOLVED: No. When using a shading rate image with a target of
      TEXTURE_2D with a layered framebuffer, all layers in the framebuffer
      will use the same two-dimensional shading rate image. When using a
      shading rate image with a target of TEXTURE_2D_ARRAY with a non-layered
      framebuffer, layer zero of the shading rate image will be used, except
      perhaps in the (undefined behavior) case where a shader writes a
      non-zero value to gl_Layer.

    (9) When using shading rates that specify "coarse" fragments covering
        multiple pixels, we will generate a combined coverage mask that
        combines the coverage masks of all pixels covered by the fragment. By
        default, these masks are combined in an implementation-dependent
        order. Should we provide a mechanism allowing applications to query
        or specify an exact order?

      RESOLVED: Yes, this feature is useful for cases where most of the
      fragment shader can be evaluated once for an entire coarse fragment, but
      where some per-pixel computations are also required. For example, a
      per-pixel alpha test may want to kill all the samples for some pixels in
      a coarse fragment. This sort of test can be implemented using an output
      sample mask, but such a shader would need to know which bit in the mask
      corresponds to each sample in the coarse fragment. The command
      ShadingRateSampleOrderNV allows applications to specify simple orderings
      for all combinations, while ShadingRateSampleOrderCustomNV allows for
      completely customized orders for each combination.

    (10) How do centroid-sampled variables work with fragments larger than one
         pixel?

      RESOLVED: For single-pixel fragments, attributes declared with
      "centroid" are sampled at an implementation-dependent location in the
      intersection of the area of the primitive being rasterized and the area
      of the pixel that corresponds to the fragment. With multi-pixel
      fragments, we follow a similar pattern, using the intersection of the
      primitive and the *set* of pixels corresponding to the fragment.

      One important thing to keep in mind when using such "coarse" shading
      rates is that fragment attributes are sampled at the center of the
      fragment by default, regardless of the set of pixels/samples covered by
      the fragment. For fragments with a size of 4x4 pixels, this center
      location will be more than two pixels (1.5 * sqrt(2)) away from the
      center of the pixels at the corners of the fragment. When rendering a
      primitive that covers only a small part of a coarse fragment,
      interpolating a color outside the primitive can produce overly bright or
      dark color values if the color values have a large gradient.
      To deal
      with this, an application can use centroid sampling on attributes where
      "extrapolation" artifacts can lead to overly bright or dark pixels.
      Note that this same problem also exists for multisampling with
      single-pixel fragments, but is less severe because it only affects
      certain samples of a pixel and such bright/dark samples may be averaged
      with other samples that don't have a similar problem.

    (11) How does this feature interact with multisampling?

      RESOLVED: The shading rate image can produce "coarse" fragments larger
      than one pixel, which we want to behave a lot like regular multisampling.
      One can consider each coarse fragment to be a lot like a "pixel", where
      the individual pixels covered by the fragment are treated as "samples".

      When the shading rate image is enabled, we override several rules
      related to multisampling:

        (a) Multisample rasterization rules apply, even if we don't have
            multisample buffers or if MULTISAMPLE is disabled.

        (b) Coverage for the pixels comprising a coarse fragment is combined
            into a single aggregate coverage mask that can be read using the
            fragment shader input "gl_SampleMaskIn[]".

        (c) Coverage for pixels comprising a coarse fragment can be modified
            using the fragment shader output "gl_SampleMask[]", which is also
            interpreted as an aggregate coverage mask.

      Note that (a) means that point and line primitives may be rasterized
      differently depending on whether the shading rate image is enabled or
      disabled.

      Also, please refer to issues in the GLSL extension specification.

Revision History

    Revision 3 (pbrown), March 16, 2020
      - Fix cut-and-paste error in "New Procedures and Functions" incorrectly
        listing ShadingRateSampleOrderNV as a second instance of
        ShadingRateImageBarrier.

    Revision 2 (pknowles)
      - ES interactions.

    Revision 1 (pbrown)
      - Internal revisions.
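
    As an informal, non-normative illustration of the composite coverage mask
    described in the edits to Section 14.9.3 and in issue (9), the following
    C sketch packs per-pixel coverage masks into one composite mask and then
    reverses the mapping to recover a per-pixel mask. The helper names
    (composite_bit, build_composite_mask, extract_pixel_mask) are
    hypothetical, and the pixel-major bit order is an assumption chosen for
    simplicity. The real order is implementation-dependent by default and
    can be specified or queried through the commands this extension provides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of the extension API.  The pixel-major
 * bit order used here is an ASSUMPTION for illustration only. */

/* Composite bit index for sample <s> of pixel <p>, with <spp> samples
 * per pixel. */
static unsigned composite_bit(unsigned p, unsigned s, unsigned spp)
{
    return p * spp + s;
}

/* Pack <pixels> per-pixel coverage masks into one composite mask. */
static uint32_t build_composite_mask(const uint32_t *pixel_masks,
                                     unsigned pixels, unsigned spp)
{
    uint32_t composite = 0;

    /* This sketch only handles fragments that fit in one 32-bit word. */
    assert(pixels * spp <= 32);

    for (unsigned p = 0; p < pixels; ++p) {
        for (unsigned s = 0; s < spp; ++s) {
            if (pixel_masks[p] & (1u << s))
                composite |= 1u << composite_bit(p, s, spp);
        }
    }
    return composite;
}

/* Reverse the mapping: recover the coverage mask of pixel <p>. */
static uint32_t extract_pixel_mask(uint32_t composite, unsigned p,
                                   unsigned spp)
{
    uint32_t mask = 0;

    for (unsigned s = 0; s < spp; ++s) {
        if (composite & (1u << composite_bit(p, s, spp)))
            mask |= 1u << s;
    }
    return mask;
}
```

    Under the assumed order, a 2x2-pixel fragment with two samples per pixel
    and per-pixel masks {0x3, 0x0, 0x1, 0x2} yields the composite mask 0x93.
    Clearing bits 4 and 5 of that mask removes all coverage for pixel 2,
    which is the kind of per-pixel kill that issue (9) describes.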