Name

    KHR_texture_compression_astc_hdr

Name Strings

    GL_KHR_texture_compression_astc_hdr
    GL_KHR_texture_compression_astc_ldr

Contact

    Sean Ellis (sean.ellis 'at' arm.com)
    Jon Leech (oddhack 'at' sonic.net)

Contributors

    Sean Ellis, ARM
    Jorn Nystad, ARM
    Tom Olson, ARM
    Andy Pomianowski, AMD
    Cass Everitt, NVIDIA
    Walter Donovan, NVIDIA
    Robert Simpson, Qualcomm
    Maurice Ribble, Qualcomm
    Larry Seiler, Intel
    Daniel Koch, NVIDIA
    Anthony Wood, Imagination Technologies
    Jon Leech
    Andrew Garrard, Samsung

IP Status

    No known issues.

Notice

    Copyright (c) 2012-2016 The Khronos Group Inc. Copyright terms at
        http://www.khronos.org/registry/speccopyright.html

Specification Update Policy

    Khronos-approved extension specifications are updated in response to
    issues and bugs prioritized by the Khronos OpenGL and OpenGL ES Working
    Groups. For extensions which have been promoted to a core Specification,
    fixes will first appear in the latest version of that core Specification,
    and will eventually be backported to the extension document. This policy
    is described in more detail at
        https://www.khronos.org/registry/OpenGL/docs/update_policy.php

Status

    Complete.
    Approved by the ARB on 2012/06/18.
    Approved by the OpenGL ES WG on 2012/06/15.
    Ratified by the Khronos Board of Promoters on 2012/07/27 (LDR profile).
    Ratified by the Khronos Board of Promoters on 2013/09/27 (HDR profile).

Version

    Version 8, June 8, 2017

Number

    ARB Extension #118
    OpenGL ES Extension #117

Dependencies

    Written based on the wording of the OpenGL ES 3.1 (April 29, 2015)
    Specification.

    May be implemented against any version of OpenGL or OpenGL ES supporting
    compressed textures.

    Some of the functionality of these extensions is not supported if the
    underlying implementation does not support cube map array textures.

Overview

    Adaptive Scalable Texture Compression (ASTC) is a new texture
    compression technology that offers unprecedented flexibility, while
    producing results better than or comparable to existing texture
    compression formats at all bit rates. It includes support for 2D and
    slice-based 3D textures, with low and high dynamic range, at bit rates
    from below 1 bit/pixel up to 8 bits/pixel in fine steps.

    The goal of these extensions is to support the full 2D profile of the
    ASTC texture compression specification, and allow construction of 3D
    textures from multiple compressed 2D slices.

    ASTC-compressed textures are handled in OpenGL ES and OpenGL by adding
    new supported formats to the existing commands for defining and updating
    compressed textures, and defining the interaction of the ASTC formats
    with each texture target.

New Procedures and Functions

    None

New Tokens

    Accepted by the <format> parameter of CompressedTexSubImage2D and
    CompressedTexSubImage3D, and by the <internalformat> parameter of
    CompressedTexImage2D, CompressedTexImage3D, TexStorage2D,
    TextureStorage2D, TexStorage3D, and TextureStorage3D:

        COMPRESSED_RGBA_ASTC_4x4_KHR            0x93B0
        COMPRESSED_RGBA_ASTC_5x4_KHR            0x93B1
        COMPRESSED_RGBA_ASTC_5x5_KHR            0x93B2
        COMPRESSED_RGBA_ASTC_6x5_KHR            0x93B3
        COMPRESSED_RGBA_ASTC_6x6_KHR            0x93B4
        COMPRESSED_RGBA_ASTC_8x5_KHR            0x93B5
        COMPRESSED_RGBA_ASTC_8x6_KHR            0x93B6
        COMPRESSED_RGBA_ASTC_8x8_KHR            0x93B7
        COMPRESSED_RGBA_ASTC_10x5_KHR           0x93B8
        COMPRESSED_RGBA_ASTC_10x6_KHR           0x93B9
        COMPRESSED_RGBA_ASTC_10x8_KHR           0x93BA
        COMPRESSED_RGBA_ASTC_10x10_KHR          0x93BB
        COMPRESSED_RGBA_ASTC_12x10_KHR          0x93BC
        COMPRESSED_RGBA_ASTC_12x12_KHR          0x93BD

        COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR    0x93D0
        COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR    0x93D1
        COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR    0x93D2
        COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR    0x93D3
        COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR    0x93D4
        COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR    0x93D5
        COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR    0x93D6
        COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR    0x93D7
        COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR   0x93D8
        COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR   0x93D9
        COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR   0x93DA
        COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR  0x93DB
        COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR  0x93DC
        COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR  0x93DD

    If extension "EXT_texture_storage" is supported, these tokens are also
    accepted by TexStorage2DEXT, TextureStorage2DEXT, TexStorage3DEXT and
    TextureStorage3DEXT.

Additions to Chapter 8 of the OpenGL ES 3.1 Specification (Textures and Samplers)

    Add to Section 8.7 Compressed Texture Images:

    Modify table 8.19 (Compressed internal formats) to add all the ASTC
    format tokens in the New Tokens section. The "Base Internal Format"
    column is RGBA for all ASTC formats.

    Add a new column "Block Width x Height", which is 4x4 for all non-ASTC
    formats in the table, and matches the size in the token name for ASTC
    formats (e.g. COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR has a block size of
    10 x 8).

    Add a second new column "3D Tex." which is empty for all non-ASTC
    formats. If only the LDR profile is supported by the implementation,
    this column is also empty for all ASTC formats. If both the LDR and HDR
    profiles are supported, this column is checked for all ASTC formats.

    Add a third new column "Cube Map Array Tex." which is empty for all
    non-ASTC formats, and checked for all ASTC formats.

    Append to the table caption:

    "The "Block Size" column specifies the compressed block size of the
    format. Modifying compressed images along aligned block boundaries is
    possible, as described in this section. The "3D Tex." and "Cube Map
    Array Tex." columns determine if 3D images composed of compressed 2D
    slices, and cube map array textures respectively can be specified using
    CompressedTexImage3D."

    Append to the paragraph at the bottom of p. 168:

    "If <internalformat> is one of the specific ... supports only
    two-dimensional images. However, if the "3D Tex." column of table 8.19
    is checked, CompressedTexImage3D will accept a three-dimensional image
    specified as an array of compressed data consisting of multiple rows of
    compressed blocks laid out as described in section 8.5."

    Modify the second and third errors in the Errors section for
    CompressedTexImage[23]D on p. 169, and add a new error:

    "An INVALID_VALUE error is generated by

      * CompressedTexImage2D if <target> is
        one of the cube map face targets from table 8.21, and
      * CompressedTexImage3D if <target> is TEXTURE_CUBE_MAP_ARRAY,

    and <width> and <height> are not equal.

    An INVALID_OPERATION error is generated by CompressedTexImage3D if
    <internalformat> is one of the formats in table 8.19 and <target> is
    not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, or TEXTURE_3D.

    An INVALID_OPERATION error is generated by CompressedTexImage3D if
    <target> is TEXTURE_CUBE_MAP_ARRAY and the "Cube Map Array"
    column of table 8.19 is *not* checked, or if <target> is
    TEXTURE_3D and the "3D Tex." column of table 8.19 is *not* checked"

    Modify the fifth and sixth paragraphs on p. 170:

    "Since these specific compressed formats are easily edited along texel
    block boundaries, the limitations on subimage location and size are
    relaxed for CompressedTexSubImage2D and CompressedTexSubImage3D.

    The block width and height vary for different formats, as described in
    table 8.19.
    The contents of any block of texels of a compressed texture
    image in these specific compressed formats that does not intersect the
    area being modified are preserved during CompressedTexSubImage* calls."

    Modify the second error in the Errors section for
    CompressedTexSubImage[23]D on p. 170, and add a new error:

    "An INVALID_OPERATION error is generated by CompressedTexSubImage3D if
    <format> is one of the formats in table 8.19 and <target> is not
    TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, or TEXTURE_3D.

    An INVALID_OPERATION error is generated by CompressedTexSubImage3D if
    <target> is TEXTURE_CUBE_MAP_ARRAY and the "Cube Map Array" column of
    table 8.19 is *not* checked, or if <target> is TEXTURE_3D and the "3D
    Tex." column of table 8.19 is *not* checked"

    Modify the final error in the same section, on p. 171:

    "An INVALID_OPERATION error is generated if <format> is one of the
    formats in table 8.19 and any of the following conditions occurs. The
    block width and height refer to the values in the corresponding column
    of the table.

      * <width> is not a multiple of the format's block width, and <width> +
        <xoffset> is not equal to the value of TEXTURE_WIDTH.
      * <height> is not a multiple of the format's block height, and <height>
        + <yoffset> is not equal to the value of TEXTURE_HEIGHT.
      * <xoffset> or <yoffset> is not a multiple of the block width or
        height, respectively."

    Modify table 8.24 (sRGB texture internal formats) to add all of the
    COMPRESSED_SRGB8_ALPHA8_ASTC_*_KHR formats defined above.

Additions to Appendix C of the OpenGL ES 3.1 Specification (Compressed
Texture Image Formats)

    Add a new sub-section on ASTC image formats, as follows:

    "C.2 ASTC Compressed Texture Image Formats
    =========================================

    C.2.1 What is ASTC?
    ---------------------

    ASTC stands for Adaptive Scalable Texture Compression.
    The ASTC formats form a family of related compressed texture image
    formats. They are all derived from a common set of definitions.

    ASTC textures may be encoded using either high or low dynamic range,
    corresponding to the "HDR profile" and "LDR profile". Support for the
    HDR profile is indicated by the "GL_KHR_texture_compression_astc_hdr"
    extension string, and support for the LDR profile is indicated by the
    "GL_KHR_texture_compression_astc_ldr" extension string.

    The LDR profile supports two-dimensional images for texture targets
    TEXTURE_2D, TEXTURE_2D_ARRAY, the six texture cube map face targets, and
    TEXTURE_CUBE_MAP_ARRAY. These images may optionally be specified using
    the sRGB color space for the RGB channels.

    The HDR profile is a superset of the LDR profile, and also supports
    texture target TEXTURE_3D for images made up of multiple two-dimensional
    slices of compressed data. HDR images may be a mix of low and high
    dynamic range data. If the HDR profile is supported, the LDR profile and
    its extension string must also be supported.

    ASTC textures may be encoded as 1, 2, 3 or 4 components, but they are
    all decoded into RGBA.

    Different ASTC formats have different block sizes, specified as part of
    the name of the format token passed to CompressedTexImage2D and its
    related functions, and in table 8.19.

    Additional ASTC formats (the "Full profile") exist which support 3D data
    specified as compressed 3D blocks. However, such formats are not defined
    by either the LDR or HDR profiles, and are not described in this
    specification.

    C.2.2 Design Goals
    --------------------

    The design goals for the format are as follows:

     * Random access. This is a must for any texture compression format.
     * Bit exact decode.
       This is a must for conformance testing and
       reproducibility.
     * Suitable for mobile use. The format should be suitable for both
       desktop and mobile GPU environments. It should be low bandwidth
       and low in area.
     * Flexible choice of bit rate. Current formats only offer a few bit
       rates, leaving content developers with only coarse control over
       the size/quality tradeoff.
     * Scalable and long-lived. The format should support existing R, RG,
       RGB and RGBA image types, and also have high "headroom", allowing
       continuing use for several years and the ability to innovate in
       encoders. Part of this is the choice to include HDR and 3D.
     * Feature orthogonality. The choices for the various features of the
       format are all orthogonal to each other. This has three effects:
       first, it allows a large, flexible configuration space; second,
       it makes that space easier to understand; and third, it makes
       verification easier.
     * Best in class at given bit rate. It should beat or match the current
       best in class for peak signal-to-noise ratio (PSNR) at all bit rates.
     * Fast decode. Texel throughput for a cached texture should be one
       texel decode per clock cycle per decoder. Parallel decoding of several
       texels from the same block should be possible at incremental cost.
     * Low bandwidth. The encoding scheme should ensure that memory access
       is kept to a minimum, cache reuse is high and memory bandwidth for
       the format is low.
     * Low area. It must occupy comparable die size to competing formats.

    C.2.3 Basic Concepts
    ----------------------

    ASTC is a block-based lossy compression format. The compressed image
    is divided into a number of blocks of uniform size, which makes it
    possible to quickly determine which block a given texel resides in.

    Each block has a fixed memory footprint of 128 bits, but these bits
    can represent varying numbers of texels (the block "footprint").
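
    As an informal illustration of why a uniform block size makes lookup
    cheap, the following Python sketch maps a 2D texel coordinate to the
    block that contains it and to the texel's position inside that block.
    The function name locate_texel and the constant BLOCK_BYTES are
    hypothetical and are not part of this specification.

```python
BLOCK_BYTES = 128 // 8  # every block occupies 128 bits, whatever its footprint


def locate_texel(x, y, block_w, block_h):
    """Return ((block_x, block_y), (s, t)) for texel (x, y).

    (block_x, block_y) is the block containing the texel and (s, t) is the
    texel's position inside that block's footprint.
    """
    if x < 0 or y < 0:
        raise ValueError("texel coordinates must be non-negative")
    if block_w <= 0 or block_h <= 0:
        raise ValueError("block dimensions must be positive")
    return (x // block_w, y // block_h), (x % block_w, y % block_h)


# Texel (25, 7) in a texture using a 10x8 footprint.
print(locate_texel(25, 7, 10, 8))
```

    Only integer division and remainder are needed, so the lookup does
    not depend on the footprint being a power of two.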

    Block footprint sizes are not confined to powers-of-two, and are
    also not confined to be square. They may be 2D, in which case the
    block dimensions range from 4 to 12 texels, or 3D, in which case
    the block dimensions range from 3 to 6 texels.

    Decoding one texel requires only the data from a single block. This
    simplifies cache design, reduces bandwidth and improves encoder throughput.

    C.2.4 Block Encoding
    ----------------------

    To understand how the blocks are stored and decoded, it is useful to start
    with a simple example, and then introduce additional features.

    The simplest block encoding starts by defining two color "endpoints". The
    endpoints define two colors, and a number of additional colors are generated
    by interpolating between them. We can define these colors using 1, 2, 3,
    or 4 components (usually corresponding to R, RG, RGB and RGBA textures),
    and using low or high dynamic range.

    We then store a color interpolant weight for each texel in the image, which
    specifies how to calculate the color to use. From this, a weighted average
    of the two endpoint colors is used to generate the intermediate color,
    which is the returned color for this texel.

    There are several different ways of specifying the endpoint colors, and the
    weights, but once they have been defined, calculation of the texel colors
    proceeds identically for all of them. Each block is free to choose whichever
    encoding scheme best represents its color endpoints, within the constraint
    that all the data fits within the 128 bit block.

    For blocks which have a large number of texels (e.g. a 12x12 block), there is
    not enough space to explicitly store a weight for every texel. In this case,
    a sparser grid with fewer weights is stored, and interpolation is used to
    determine the effective weight to be used for each texel position.
    This allows
    very low bit rates to be used with acceptable quality. This can also be used
    to more efficiently encode blocks with low detail, or with strong vertical
    or horizontal features.

    For blocks which have a mixture of disparate colors, a single line in the
    color space is not a good fit to the colors of the pixels in the original
    image. It is therefore possible to partition the texels into multiple sets,
    the pixels within each set having similar colors. For each of these
    "partitions", we specify separate endpoint pairs, and choose which pair of
    endpoints to use for a particular texel by looking up the partition index
    from a partitioning pattern table. In ASTC, this partition table is actually
    implemented as a function.

    The endpoint encoding for each partition is independent.

    For blocks which have uncorrelated channels - for example an image with a
    transparency mask, or an image used as a normal map - it may be necessary
    to specify two weights for each texel. Interpolation between the components
    of the endpoint colors can then proceed independently for each "plane" of
    the image. The assignment of channels to planes is selectable.

    Since each of the above options is independent, it is possible to specify any
    combination of channels, endpoint color encoding, weight encoding,
    interpolation, multiple partitions and single or dual planes.

    Since these values are specified per block, it is important that they are
    represented with the minimum possible number of bits. As a result, these
    values are packed together in ways which can be difficult to read, but
    which are nevertheless highly amenable to hardware decode.

    All of the values used as weights and color endpoint values can be specified
    with a variable number of bits.
    The encoding scheme used allows a fine-grained
    tradeoff between weight bits and color endpoint bits using "integer
    sequence encoding". This can pack adjacent values together, allowing us to
    use fractional numbers of bits per value.

    Finally, a block may be just a single color. This is a so-called "void
    extent block" and has a special coding which also allows it to identify
    nearby regions of single color. This may be used to short-circuit fetching of
    what would be identical blocks, and further reduce memory bandwidth.

    C.2.5 LDR and HDR Modes
    -------------------------

    The decoding process for LDR content can be simplified if it is known in
    advance that sRGB output is required. This selection is therefore included
    as part of the global configuration.

    The two modes differ in various ways.

    -----------------------------------------------------------------------------
    Operation           LDR Mode                     HDR Mode
    -----------------------------------------------------------------------------
    Returned value      Vector of FP16 values,       Vector of FP16 values
                        or Vector of UNORM8 values.

    sRGB compatible     Yes                          No

    LDR endpoint        16 bits, or                  16 bits
    decoding precision  8 bits for sRGB

    HDR endpoint mode   Error color                  As decoded
    results

    Error results       Error color                  Vector of NaNs (0xFFFF)
    -----------------------------------------------------------------------------
    Table C.2.1 - Differences Between LDR and HDR Modes

    The error color is opaque fully-saturated magenta
    (R,G,B,A = 0xFF, 0x00, 0xFF, 0xFF). This has been chosen as it is much more
    noticeable than black or white, and occurs far less often in valid images.

    For linear RGB decode, the error color may be either opaque fully-saturated
    magenta (R,G,B,A = 1.0, 0.0, 1.0, 1.0) or a vector of four NaNs
    (R,G,B,A = NaN, NaN, NaN, NaN).
    In the latter case, the recommended NaN
    value returned is 0xFFFF.

    The error color is returned as an informative response to invalid
    conditions, including invalid block encodings or use of reserved endpoint
    modes.

    Future, forward-compatible extensions to KHR_texture_compression_astc
    may define valid interpretations of these conditions, which will decode to
    some other color. Therefore, encoders and applications must not rely on
    invalid encodings as a way of generating the error color.

    C.2.6 Configuration Summary
    -----------------------------

    The global configuration data for the format is as follows:

    * Block dimension (always 2D for both LDR and HDR profiles)
    * Block footprint size
    * sRGB output enabled or not

    The data specified per block is as follows:

    * Texel weight grid size
    * Texel weight range
    * Texel weight values
    * Number of partitions
    * Partition pattern index
    * Color endpoint modes (includes LDR or HDR selection)
    * Color endpoint data
    * Number of planes
    * Plane-to-channel assignment

    C.2.7 Decode Procedure
    ------------------------

    To decode one texel:

      Find block containing texel
      Read block mode
      If void-extent block, store void extent and immediately return single
      color (optimization)

      For each plane in image
        If block mode requires infill
          Find and decode stored weights adjacent to texel, unquantize and
          interpolate
        Else
          Find and decode weight for texel, and unquantize

      Read number of partitions
      If number of partitions > 1
        Read partition table pattern index
        Look up partition number from pattern

      Read color endpoint mode and endpoint data for selected partition
      Unquantize color endpoints
      Interpolate color endpoints using weight (or weights in dual-plane mode)
      Return interpolated color

    C.2.8 Block Determination and Bit Rates
    -----------------------------------------

    The block footprint is a global setting for any given texture, and is
    therefore not encoded in the individual blocks.

    For 2D textures, the block footprint's width and height are selectable
    from a number of predefined sizes, namely 4, 5, 6, 8, 10 and 12 pixels.

    For square and nearly-square blocks, this gives the following bit rates:

    -------------------------------------
        Footprint
    Width   Height  Bit Rate  Increment
    -------------------------------------
    4       4       8.00      125%
    5       4       6.40      125%
    5       5       5.12      120%
    6       5       4.27      120%
    6       6       3.56      111%
    8       5       3.20      120%
    8       6       2.67      104%
    10      5       2.56      120%
    10      6       2.13      107%
    8       8       2.00      125%
    10      8       1.60      125%
    10      10      1.28      120%
    12      10      1.07      120%
    12      12      0.89
    -------------------------------------
    Table C.2.2 - 2D Footprint and Bit Rates

    The block footprint is shown as <width>x<height> in the format name. For
    example, the format COMPRESSED_RGBA_ASTC_8x6_KHR specifies an image with
    a block width of 8 texels, and a block height of 6 texels.

    The "Increment" column indicates the ratio of bit rate against the next
    lower available rate. A consistent value in this column indicates an even
    spread of bit rates.

    The HDR profile supports only those block footprints listed in Table
    C.2.2. Other block sizes are not supported.

    For images which are not an integer multiple of the block size, additional
    texels are added to the edges with maximum X and Y. These texels may be
    any color, as they will not be accessed.

    Although these are not all powers of two, it is possible to calculate block
    addresses and pixel addresses within the block, for legal image sizes,
    without undue complexity.
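
    The "Bit Rate" column of Table C.2.2 follows directly from the fixed
    128-bit block size: it is 128 divided by the number of texels in the
    footprint. The Python sketch below reproduces that column. The names
    FOOTPRINTS and bit_rate are illustrative only.

```python
# The 2D footprints listed in Table C.2.2, as (width, height) in texels.
FOOTPRINTS = [
    (4, 4), (5, 4), (5, 5), (6, 5), (6, 6), (8, 5), (8, 6),
    (10, 5), (10, 6), (8, 8), (10, 8), (10, 10), (12, 10), (12, 12),
]

BLOCK_BITS = 128  # fixed size of every compressed block


def bit_rate(width, height):
    """Bits per texel for a block footprint of width x height texels."""
    return BLOCK_BITS / (width * height)


for w, h in FOOTPRINTS:
    print(f"{w:>2} x {h:<2}  {bit_rate(w, h):.2f} bits/texel")
```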

    Given a 2D image which is W x H pixels in size, with block size
    w x h, the size of the image in blocks is:

        Bw = ceiling(W/w)
        Bh = ceiling(H/h)

    For a 3D image, each 2D slice is a single texel thick, so that for an
    image which is W x H x D pixels in size, with block size w x h, the size
    of the image in blocks is:

        Bw = ceiling(W/w)
        Bh = ceiling(H/h)
        Bd = D

    C.2.9 Block Layout
    --------------------

    Each block in the image is stored as a single 128-bit block in memory. These
    blocks are laid out in raster order, starting with the block at (0,0,0), then
    ordered sequentially by X, Y and finally Z (if present). They are aligned to
    128-bit boundaries in memory.

    The bits in the block are labeled in little-endian order - the byte at the
    lowest address contains bits 0..7. Bit 0 is the least significant bit in the
    byte.

    Each block has the same basic layout, as shown in figure C.1.

    127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
    --------------------------------------------------------------
    | Texel Weight Data (variable width)         Fill direction ->
    --------------------------------------------------------------

    111 110 109 108 107 106 105 104 103 102 101 100 99  98  97  96
    --------------------------------------------------------------
                         Texel Weight Data
    --------------------------------------------------------------

    95  94  93  92  91  90  89  88  87  86  85  84  83  82  81  80
    --------------------------------------------------------------
                         Texel Weight Data
    --------------------------------------------------------------

    79  78  77  76  75  74  73  72  71  70  69  68  67  66  65  64
    --------------------------------------------------------------
                         Texel Weight Data
    --------------------------------------------------------------

    63  62  61  60  59  58  57  56  55  54  53  52  51  50  49  48
    --------------------------------------------------------------
                          :            More config data          :
    --------------------------------------------------------------

    47  46  45  44  43  42  41  40  39  38  37  36  35  34  33  32
    --------------------------------------------------------------
               <-Fill direction          Color Endpoint Data
    --------------------------------------------------------------

    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
    --------------------------------------------------------------
                          :   Extra configuration data
    --------------------------------------------------------------

    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    --------------------------------------------------------------
       Extra   |  Part |                Block mode                 |
    --------------------------------------------------------------

    Figure C.1 - Block Layout Overview

    Dotted partition lines indicate that the split position is not fixed.

    The "Block mode" field specifies how the Texel Weight Data is encoded.

    The "Part" field specifies the number of partitions, minus one. If dual
    plane mode is enabled, the number of partitions must be 3 or fewer.
    If 4 partitions are specified in dual plane mode, the error value is
    returned for all texels in the block.
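
    As a non-normative illustration, the Python sketch below extracts the
    two low-order fields of a block. It assumes the bit positions implied
    by the figures and by the 11-column block mode table: "Block mode" in
    bits [10:0] and "Part" in bits [12:11]. The function name
    read_low_fields is hypothetical.

```python
def read_low_fields(block):
    """Return (block_mode, partition_count) from a 16-byte ASTC block.

    Assumed field positions (read from the block layout figures):
      bits [10:0]  - Block mode
      bits [12:11] - Part, which stores the number of partitions minus one
    """
    if len(block) != 16:
        raise ValueError("an ASTC block is exactly 128 bits (16 bytes)")
    # Bits are labeled little-endian: the byte at the lowest address
    # holds bits 0..7, so the whole block reads as one 128-bit integer.
    value = int.from_bytes(block, "little")
    block_mode = value & 0x7FF
    part = (value >> 11) & 0x3
    return block_mode, part + 1


# Build a block with block mode 0x153 and three partitions (Part = 2).
example = (0x153 | (2 << 11)).to_bytes(16, "little")
print(read_low_fields(example))
```

    A complete decoder would first check for the void-extent encoding,
    because the Part field has no meaning in that case.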

    The size and layout of the extra configuration data depends on the
    number of partitions, and the number of planes in the image, as shown in
    figures C.2 and C.3 (only the bottom 32 bits are shown):

    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
    --------------------------------------------------------------
                       <- Color endpoint data                  |CEM
    --------------------------------------------------------------

    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    --------------------------------------------------------------
        CEM    | 0   0 |                Block Mode                 |
    --------------------------------------------------------------

    Figure C.2 - Single-partition Block Layout

    CEM is the color endpoint mode field, which determines how the Color
    Endpoint Data is encoded.

    If dual-plane mode is active, the color component selector bits appear
    directly below the weight bits.

    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
    --------------------------------------------------------------
               |          CEM          |      Partition Index
    --------------------------------------------------------------

    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    --------------------------------------------------------------
      Partition Index  |                Block Mode                 |
    --------------------------------------------------------------

    Figure C.3 - Multi-partition Block Layout

    The Partition Index field specifies which partition layout to use. CEM is
    the first 6 bits of color endpoint mode information for the various
    partitions. For modes which require more than 6 bits of CEM data, the
    additional bits appear at a variable position directly beneath the texel
    weight data.

    If dual-plane mode is active, the color component selector bits then appear
    directly below the additional CEM bits.

    The final special case is that if bits [8:0] of the block are "111111100",
    then the block is a void-extent block, which has a separate encoding
    described in section C.2.23.

    C.2.10 Block Mode
    ------------------

    The Block Mode field specifies the width, height and depth of the grid of
    weights, what range of values they use, and whether dual weight planes are
    present. Since some of these are not represented using powers of two (there
    are 12 possible weight widths, for example), and not all combinations are
    allowed, this is not a simple bit packing. However, it can be unpacked
    quickly in hardware.

    The weight ranges are encoded using a 3 bit value R, which is interpreted
    together with a precision bit H, as follows:

             Low Precision Range (H=0)           High Precision Range (H=1)
    R    Weight Range  Trits  Quints  Bits   Weight Range  Trits  Quints  Bits
    -------------------------------------------------------------------------
    000  Invalid                             Invalid
    001  Invalid                             Invalid
    010  0..1                         1      0..9                 1       1
    011  0..2          1                     0..11         1              2
    100  0..3                         2      0..15                        4
    101  0..4                 1              0..19                1       2
    110  0..5          1              1      0..23         1              3
    111  0..7                         3      0..31                        5
    -------------------------------------------------------------------------
    Table C.2.7 - Weight Range Encodings

    Each weight value is encoded using the specified number of Trits, Quints
    and Bits. The details of this encoding can be found in Section C.2.12 -
    Integer Sequence Encoding.
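
    Table C.2.7 can be expressed as a simple lookup, as in the following
    Python sketch. The dictionary WEIGHT_RANGES and the function
    weight_range are hypothetical names. The self-check at the end relies
    on the observation that each range holds 3^trits * 5^quints * 2^bits
    distinct values.

```python
# (H, R) -> (maximum weight value, trits, quints, bits), from Table C.2.7.
WEIGHT_RANGES = {
    (0, 0b010): (1, 0, 0, 1),
    (0, 0b011): (2, 1, 0, 0),
    (0, 0b100): (3, 0, 0, 2),
    (0, 0b101): (4, 0, 1, 0),
    (0, 0b110): (5, 1, 0, 1),
    (0, 0b111): (7, 0, 0, 3),
    (1, 0b010): (9, 0, 1, 1),
    (1, 0b011): (11, 1, 0, 2),
    (1, 0b100): (15, 0, 0, 4),
    (1, 0b101): (19, 0, 1, 2),
    (1, 0b110): (23, 1, 0, 3),
    (1, 0b111): (31, 0, 0, 5),
}


def weight_range(r, h):
    """Return (max_value, trits, quints, bits) for range field R and bit H."""
    entry = WEIGHT_RANGES.get((h, r))
    if entry is None:
        raise ValueError("R values 000 and 001 are invalid")
    return entry


# Consistency check: each range size equals 3^trits * 5^quints * 2^bits.
for (h, r), (max_value, trits, quints, bits) in WEIGHT_RANGES.items():
    assert max_value + 1 == (3 ** trits) * (5 ** quints) * (2 ** bits)

print(weight_range(0b110, 1))
```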

    For 2D blocks, the Block Mode field is laid out as follows:

    -------------------------------------------------------------------------
    10  9   8   7   6   5   4   3   2   1   0   Width Height Notes
    -------------------------------------------------------------------------
    D   H     B       A     R0  0   0   R2  R1  B+4   A+2
    D   H     B       A     R0  0   1   R2  R1  B+8   A+2
    D   H     B       A     R0  1   0   R2  R1  A+2   B+8
    D   H   0   B     A     R0  1   1   R2  R1  A+2   B+6
    D   H   1   B     A     R0  1   1   R2  R1  B+2   A+2
    D   H   0   0     A     R0  R2  R1  0   0   12    A+2
    D   H   0   1     A     R0  R2  R1  0   0   A+2   12
    D   H   1   1   0   0   R0  R2  R1  0   0   6     10
    D   H   1   1   0   1   R0  R2  R1  0   0   10    6
      B     1   0     A     R0  R2  R1  0   0   A+6   B+6    D=0, H=0
    x   x   1   1   1   1   1   1   1   0   0   -     -      Void-extent
    x   x   1   1   1   x   x   x   x   0   0   -     -      Reserved*
    x   x   x   x   x   x   x   0   0   0   0   -     -      Reserved
    -------------------------------------------------------------------------
    Table C.2.8 - 2D Block Mode Layout

    Note that, due to the encoding of the R field, as described in
    Table C.2.7, bits R2 and R1 cannot both be zero, which disambiguates
    the first five rows from the rest of the table.

    Bit positions with a value of x are ignored for purposes of determining
    if a block is a void-extent block or reserved, but may have defined
    encodings for specific void-extent blocks.

    The penultimate row of the table is reserved only if bits [5:2] are not
    all 1. If they are all 1, the block mode instead encodes a void-extent
    block (as shown in the previous row).

    The D bit is set to indicate dual-plane mode. In this mode, the maximum
    allowed number of partitions is 3.

    The size of the grid in each dimension must be less than or equal to
    the corresponding dimension of the block footprint.
    If the grid size is greater than the footprint dimension in any axis,
    then this is an illegal block encoding and all texels will decode to
    the error color.

    C.2.11 Color Endpoint Mode
    ---------------------------

    In single-partition mode, the Color Endpoint Mode (CEM) field stores one
    of 16 possible values. Each of these specifies how many raw data values
    are encoded, and how to convert these raw values into two RGBA color
    endpoints. They can be summarized as follows:

    ---------------------------------------------
    CEM   Description                       Class
    ---------------------------------------------
    0     LDR Luminance, direct             0
    1     LDR Luminance, base+offset        0
    2     HDR Luminance, large range        0
    3     HDR Luminance, small range        0
    4     LDR Luminance+Alpha, direct       1
    5     LDR Luminance+Alpha, base+offset  1
    6     LDR RGB, base+scale               1
    7     HDR RGB, base+scale               1
    8     LDR RGB, direct                   2
    9     LDR RGB, base+offset              2
    10    LDR RGB, base+scale plus two A    2
    11    HDR RGB, direct                   2
    12    LDR RGBA, direct                  3
    13    LDR RGBA, base+offset             3
    14    HDR RGB, direct + LDR Alpha       3
    15    HDR RGB, direct + HDR Alpha       3
    ---------------------------------------------
    Table C.2.10 - Color Endpoint Modes.
    [[ If the HDR profile is not implemented, remove from table C.2.10
       all rows whose description starts with "HDR", and add to the
       caption: ]]
    Modes not described in the CEM column are reserved for HDR modes, and
    will generate errors in an unextended OpenGL ES implementation.

    In multi-partition mode, the CEM field is of variable width, from 6 to 14
    bits. The lowest 2 bits of the CEM field specify how the endpoint mode
    for each partition is calculated:

    ----------------------------------------------------
    Value  Meaning
    ----------------------------------------------------
    00     All color endpoint pairs are of the same type.
            A full 4-bit CEM is stored in block bits [28:25]
            and is used for all partitions.
    01      All endpoint pairs are of class 0 or 1.
    10      All endpoint pairs are of class 1 or 2.
    11      All endpoint pairs are of class 2 or 3.
    ----------------------------------------------------
    Table C.2.11 - Multi-Partition Color Endpoint Modes

    If the CEM selector value in bits [24:23] is not 00,
    then data layout is as follows:

    ---------------------------------------------------
    Part         n   m   l   k   j   i   h   g
    ------------------------------------------
      2  ... Weight : M1   :                ...
    ------------------------------------------
      3  ... Weight : M2   : M1   :M0 :     ...
    ------------------------------------------
      4  ... Weight : M3   : M2   : M1   : M0   : ...
    ------------------------------------------

    Part   28  27  26  25  24  23
    ----------------------
      2  | M0    |C1 |C0 | CEM  |
    ----------------------
      3  |M0 |C2 |C1 |C0 | CEM  |
    ----------------------
      4  |C3 |C2 |C1 |C0 | CEM  |
    ----------------------
    ---------------------------------------------------
    Figure C.4 - Multi-Partition Color Endpoint Modes

    In this view, each partition i has two fields. C<i> is the class
    selector bit, choosing between the two possible CEM classes (0 indicates
    the lower of the two classes), and M<i> is a two-bit field specifying
    the low bits of the color endpoint mode within that class. The
    additional bits appear at a variable bit position, immediately below the
    texel weight data.

    The ranges used for the data values are not explicitly specified.
    Instead, they are derived from the number of available bits remaining
    after the configuration data and weight data have been specified.

    Details of the decoding procedure for Color Endpoints can be found in
    section C.2.13.
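
    As an illustration of how Tables C.2.10 and C.2.11 combine, the sketch
    below derives the 4-bit CEM for one partition from the 2-bit selector,
    the class bit C<i> and the two-bit field M<i>. It assumes the class
    occupies the top two bits of the CEM (as the Class column of Table
    C.2.10 suggests). The function name and parameter names are hypothetical
    and are not part of this specification.

```c
#include <assert.h>

/* Sketch only: cem_for_partition is a hypothetical helper name.
 * selector : 2-bit value from block bits [24:23]; must be 1..3 here,
 *            since 0 means a single shared CEM in bits [28:25].
 * c        : class selector bit C<i> (0 = lower of the two classes).
 * m        : two-bit field M<i>, the low bits of the CEM. */
static int cem_for_partition(int selector, int c, int m)
{
    assert(selector >= 1 && selector <= 3);
    /* Selector 01 -> classes 0/1, 10 -> classes 1/2, 11 -> classes 2/3. */
    int cem_class = (selector - 1) + (c & 1);
    /* Assumption: CEM = 4 * class + M, matching Table C.2.10. */
    return (cem_class << 2) | (m & 3);
}
```

    For example, selector 10 with C<i> = 1 and M<i> = 0 would select
    class 2, mode 8 (LDR RGB, direct).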

    C.2.12  Integer Sequence Encoding
    ---------------------------------

    Both the weight data and the endpoint color data are variable width, and
    are specified using a sequence of integer values. The range of each
    value in a sequence (e.g. a color weight) is constrained.

    Since it is often the case that the most efficient range for these
    values is not a power of two, each value sequence is encoded using a
    technique known as "integer sequence encoding". This allows efficient,
    hardware-friendly packing and unpacking of values with non-power-of-two
    ranges.

    In a sequence, each value has an identical range. The range is specified
    in one of the following forms:

    Value range       MSB encoding    LSB encoding  Value        Block  Packed
                                                                        block size
    ----------------  --------------  ------------  -----------  -----  ----------
    0 .. 2^n-1        -               n bit value   m            1      n
                                      m (n <= 8)
    0 .. (3 * 2^n)-1  Base-3 "trit"   n bit value   t * 2^n + m  5      8 + 5*n
                      value t         m (n <= 6)
    0 .. (5 * 2^n)-1  Base-5 "quint"  n bit value   q * 2^n + m  3      7 + 3*n
                      value q         m (n <= 5)
    -------------------------------------------
    Table C.2.13 - Encoding for Different Ranges

    Since 3^5 is 243, it is possible to pack five trits into 8 bits (which
    has 256 possible values), so a trit can effectively be encoded as 1.6
    bits. Similarly, since 5^3 is 125, it is possible to pack three quints
    into 7 bits (which has 128 possible values), so a quint can be encoded
    as 2.33 bits.

    The encoding scheme packs the trits or quints, and then interleaves the n
    additional bits in positions that satisfy the requirements of an
    arbitrary length stream. This makes it possible to correctly specify
    lists of values whose length is not an integer multiple of 3 or 5 values.
    It also makes it possible to easily select a value at an arbitrary
    position within the stream.
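
    The block sizes in Table C.2.13 can be expressed as a small helper.
    This is a minimal sketch; the enum and function names are hypothetical
    and are used here only for illustration.

```c
#include <assert.h>

/* Hypothetical names for the three forms listed in Table C.2.13. */
enum ise_form { ISE_BITS, ISE_TRITS, ISE_QUINTS };

/* Number of values that share one packed block. */
static int ise_values_per_block(enum ise_form form)
{
    switch (form) {
    case ISE_TRITS:  return 5;   /* five trits share 8 bits   */
    case ISE_QUINTS: return 3;   /* three quints share 7 bits */
    default:         return 1;   /* plain n-bit values        */
    }
}

/* Size in bits of one packed block; n is the number of LSB bits
 * stored per value. */
static int ise_block_bits(enum ise_form form, int n)
{
    assert(n >= 0 && n <= 8);
    switch (form) {
    case ISE_TRITS:  return 8 + 5 * n;
    case ISE_QUINTS: return 7 + 3 * n;
    default:         return n;
    }
}
```

    With n = 4, a trit block occupies 28 bits for 5 values and a quint
    block occupies 19 bits for 3 values.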

    If there are insufficient bits in the stream to fill the final block, then
    unused (higher order) bits are assumed to be 0 when decoding.

    To decode the bits for value number i in a sequence of bits b, both
    indexed from 0, perform the following:

    If the range is encoded as n bits per value, then the value is bits
    b[i*n+n-1:i*n] - a simple multiplexing operation.

    If the range is encoded using a trit, then each block contains 5 values
    (v0 to v4), each of which contains a trit (t0 to t4) and a corresponding
    LSB value (m0 to m4). The first bit of the packed block is bit
    floor(i/5)*(8+5*n). The bits in the block are packed as follows
    (in this example, n is 4):

     27  26  25  24  23  22  21  20  19  18  17  16
    -------------------------------------------------
    |T7 |      m4       |T6  T5 |      m3       |T4 |
    -------------------------------------------------

     15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    -----------------------------------------------------------------
    |      m2       |T3  T2 |      m1       |T1  T0 |      m0       |
    -----------------------------------------------------------------

    Figure C.5 - Trit-based Packing

    The five trits t0 to t4 are obtained by bit manipulations of the 8 bits
    T[7:0] as follows:

        if T[4:2] = 111
            C = { T[7:5], T[1:0] }; t4 = t3 = 2
        else
            C = T[4:0]
            if T[6:5] = 11
                t4 = 2; t3 = T[7]
            else
                t4 = T[7]; t3 = T[6:5]

        if C[1:0] = 11
            t2 = 2; t1 = C[4]; t0 = { C[3], C[2]&~C[3] }
        else if C[3:2] = 11
            t2 = 2; t1 = 2; t0 = C[1:0]
        else
            t2 = C[4]; t1 = C[3:2]; t0 = { C[1], C[0]&~C[1] }

    If the range is encoded using a quint, then each block contains 3 values
    (v0 to v2), each of which contains a quint (q0 to q2) and a corresponding
    LSB value (m0 to m2). The first bit of the packed block is bit
    floor(i/3)*(7+3*n).
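
    The position rules stated so far (plain bits, trit blocks and quint
    blocks) can be sketched as follows. The function names are
    hypothetical, and the bit stream is assumed, for illustration only, to
    fit in a single 64-bit integer with stream bit 0 in the least
    significant bit.

```c
#include <assert.h>
#include <stdint.h>

/* First bit of the packed block that holds value i (trit encoding). */
static int trit_block_start(int i, int n)
{
    assert(i >= 0 && n >= 0 && n <= 6);
    return (i / 5) * (8 + 5 * n);
}

/* First bit of the packed block that holds value i (quint encoding). */
static int quint_block_start(int i, int n)
{
    assert(i >= 0 && n >= 0 && n <= 5);
    return (i / 3) * (7 + 3 * n);
}

/* Plain n-bit encoding: value i is bits b[i*n+n-1 : i*n].
 * Assumption: the whole stream fits in a uint64_t. */
static unsigned bits_value(uint64_t b, int i, int n)
{
    assert(n >= 1 && n <= 8 && i >= 0 && i * n + n <= 64);
    return (unsigned)((b >> (i * n)) & ((1u << n) - 1u));
}
```

    For instance, with n = 4 the trit block holding value 7 starts at
    bit 28, and the quint block holding value 5 starts at bit 19.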

    The bits in the block are packed as follows (in this example, n is 4):

     18  17  16
    -----------
    |Q6  Q5 | m2
    -----------
     15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
    -----------------------------------------------------------------
         m2     |Q4  Q3 |      m1       |Q2  Q1  Q0 |      m0       |
    -----------------------------------------------------------------

    Figure C.6 - Quint-based Packing

    The three quints q0 to q2 are obtained by bit manipulations of the 7 bits
    Q[6:0] as follows:

        if Q[2:1] = 11 and Q[6:5] = 00
            q2 = { Q[0], Q[4]&~Q[0], Q[3]&~Q[0] }; q1 = q0 = 4
        else
            if Q[2:1] = 11
                q2 = 4; C = { Q[4:3], ~Q[6:5], Q[0] }
            else
                q2 = Q[6:5]; C = Q[4:0]

            if C[2:0] = 101
                q1 = 4; q0 = C[4:3]
            else
                q1 = C[4:3]; q0 = C[2:0]

    Both these procedures ensure a valid decoding for every possible packed
    value (all 256 values of T and all 128 values of Q), even though a few
    are duplicates. They can also be implemented efficiently in software
    using small tables.

    Encoding methods are not specified here, although table-based mechanisms
    work well.

    C.2.13  Endpoint Unquantization
    -------------------------------

    Each color endpoint is specified as a sequence of integers in a given
    range. These values are packed using integer sequence encoding, as a
    stream of bits stored from just above the configuration data, and
    growing upwards.

    Once unpacked, the values must be unquantized from their storage range,
    returning them to a standard range of 0..255.

    For bit-only representations, this is simple bit replication from the
    most significant bit of the value.

    For trit or quint-based representations, this involves a set of bit
    manipulations and adjustments to avoid the expense of full-width
    multipliers. This procedure ensures correct scaling, but scrambles
    the order of the decoded values relative to the encoded values.
    This must be compensated for using a table in the encoder.
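
    For the bit-only case mentioned above, bit replication to the 0..255
    range might look like the following sketch. The function name is
    hypothetical; n is the number of stored bits per value.

```c
#include <assert.h>

/* Expand an n-bit value (1 <= n <= 8) to 8 bits by repeating its bit
 * pattern from the most significant bit downwards. */
static unsigned unquant_color_bits(unsigned v, int n)
{
    assert(n >= 1 && n <= 8 && v < (1u << n));
    unsigned r = 0;
    /* Place successive copies of the pattern below one another; the
     * last copy may be truncated, hence the right shift. */
    for (int shift = 8 - n; shift > -n; shift -= n) {
        r |= (shift >= 0) ? (v << shift) : (v >> -shift);
    }
    return r & 0xFFu;
}
```

    A 3-bit value of binary 101 becomes binary 10110110 (0xB6), and the
    maximum n-bit value always maps to 0xFF.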

    The initial inputs to the procedure are denoted A (9 bits), B (9 bits),
    C (9 bits) and D (3 bits) and are decoded using the range as follows:

    ---------------------------------------------------------------
    Range    T  Q  B  Bits    A          B          C    D
    ---------------------------------------------------------------
    0..5     1     1  a       aaaaaaaaa  000000000  204  Trit value
    0..9        1  1  a       aaaaaaaaa  000000000  113  Quint value
    0..11    1     2  ba      aaaaaaaaa  b000b0bb0  93   Trit value
    0..19       1  2  ba      aaaaaaaaa  b0000bb00  54   Quint value
    0..23    1     3  cba     aaaaaaaaa  cb000cbcb  44   Trit value
    0..39       1  3  cba     aaaaaaaaa  cb0000cbc  26   Quint value
    0..47    1     4  dcba    aaaaaaaaa  dcb000dcb  22   Trit value
    0..79       1  4  dcba    aaaaaaaaa  dcb0000dc  13   Quint value
    0..95    1     5  edcba   aaaaaaaaa  edcb000ed  11   Trit value
    0..159      1  5  edcba   aaaaaaaaa  edcb0000e  6    Quint value
    0..191   1     6  fedcba  aaaaaaaaa  fedcb000f  5    Trit value
    ---------------------------------------------------------------
    Table C.2.16 - Color Unquantization Parameters

    These are then processed as follows:

        T = D * C + B;
        T = T ^ A;
        T = (A & 0x80) | (T >> 2);

    Note that the multiply in the first line is nearly trivial as it only
    needs to multiply by 0, 1, 2, 3 or 4.

    C.2.14  LDR Endpoint Decoding
    -----------------------------
    The decoding method used depends on the Color Endpoint Mode (CEM) field,
    which specifies how many values are used to represent the endpoint.

    The CEM field also specifies how to take the n unquantized color endpoint
    values v0 to v[n-1] and convert them into two RGBA color endpoints e0
    and e1.

    The HDR modes are more complex and do not fit neatly into this section.
    They are documented in the following section.

    The methods can be summarized as follows.

    -------------------------------------------------
    CEM  Range  Description                   n
    -------------------------------------------------
    0    LDR    Luminance, direct             2
    1    LDR    Luminance, base+offset        2
    2    HDR    Luminance, large range        2
    3    HDR    Luminance, small range        2
    4    LDR    Luminance+Alpha, direct       4
    5    LDR    Luminance+Alpha, base+offset  4
    6    LDR    RGB, base+scale               4
    7    HDR    RGB, base+scale               4
    8    LDR    RGB, direct                   6
    9    LDR    RGB, base+offset              6
    10   LDR    RGB, base+scale plus two A    6
    11   HDR    RGB                           6
    12   LDR    RGBA, direct                  8
    13   LDR    RGBA, base+offset             8
    14   HDR    RGB + LDR Alpha               8
    15   HDR    RGB + HDR Alpha               8
    -------------------------------------------------
    Table C.2.17 - Color Endpoint Modes
    [[ If the HDR profile is not implemented, remove from table C.2.17
       all rows whose description starts with "HDR", and add to the
       caption: ]]
    Modes not described are reserved, as described in table C.2.10.

    [[ HDR profile only ]]
    Mode 14 is special in that the alpha values are interpolated linearly,
    but the color components are interpolated logarithmically. This is the
    only endpoint format with mixed-mode operation, and will return the
    error value if encountered in LDR mode.
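
    The n and Range columns of Table C.2.17 follow a regular pattern that
    can be captured in two small helpers. This is a sketch only; the
    function names are hypothetical.

```c
#include <assert.h>

/* Number of unquantized values consumed by a CEM: 2, 4, 6 or 8,
 * stepping up once per group of four modes. */
static int cem_value_count(int cem)
{
    assert(cem >= 0 && cem <= 15);
    return 2 * ((cem >> 2) + 1);
}

/* Returns 1 for the modes marked HDR in Table C.2.17, 0 otherwise. */
static int cem_is_hdr(int cem)
{
    assert(cem >= 0 && cem <= 15);
    /* One bit per mode: modes 2, 3, 7, 11, 14 and 15 are HDR. */
    const unsigned hdr_mask = (1u << 2) | (1u << 3) | (1u << 7) |
                              (1u << 11) | (1u << 14) | (1u << 15);
    return (int)((hdr_mask >> cem) & 1u);
}
```

    An LDR-only decoder could use a check like cem_is_hdr() to decide when
    to return the error value.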

    Decode the different LDR endpoint modes as follows:

    Mode 0  LDR Luminance, direct

        e0=(v0,v0,v0,0xFF); e1=(v1,v1,v1,0xFF);

    Mode 1  LDR Luminance, base+offset

        L0 = (v0>>2)|(v1&0xC0); L1=L0+(v1&0x3F);
        if (L1>0xFF) { L1=0xFF; }
        e0=(L0,L0,L0,0xFF); e1=(L1,L1,L1,0xFF);

    Mode 4  LDR Luminance+Alpha, direct

        e0=(v0,v0,v0,v2);
        e1=(v1,v1,v1,v3);

    Mode 5  LDR Luminance+Alpha, base+offset

        bit_transfer_signed(v1,v0); bit_transfer_signed(v3,v2);
        e0=(v0,v0,v0,v2); e1=(v0+v1,v0+v1,v0+v1,v2+v3);
        clamp_unorm8(e0); clamp_unorm8(e1);

    Mode 6  LDR RGB, base+scale

        e0=(v0*v3>>8,v1*v3>>8,v2*v3>>8, 0xFF);
        e1=(v0,v1,v2,0xFF);

    Mode 8  LDR RGB, direct

        s0= v0+v2+v4; s1= v1+v3+v5;
        if (s1>=s0){e0=(v0,v2,v4,0xFF);
                    e1=(v1,v3,v5,0xFF); }
        else { e0=blue_contract(v1,v3,v5,0xFF);
               e1=blue_contract(v0,v2,v4,0xFF); }

    Mode 9  LDR RGB, base+offset

        bit_transfer_signed(v1,v0);
        bit_transfer_signed(v3,v2);
        bit_transfer_signed(v5,v4);
        if(v1+v3+v5 >= 0)
            { e0=(v0,v2,v4,0xFF); e1=(v0+v1,v2+v3,v4+v5,0xFF); }
        else
            { e0=blue_contract(v0+v1,v2+v3,v4+v5,0xFF);
              e1=blue_contract(v0,v2,v4,0xFF); }
        clamp_unorm8(e0); clamp_unorm8(e1);

    Mode 10 LDR RGB, base+scale plus two A

        e0=(v0*v3>>8,v1*v3>>8,v2*v3>>8, v4);
        e1=(v0,v1,v2, v5);

    Mode 12 LDR RGBA, direct

        s0= v0+v2+v4; s1= v1+v3+v5;
        if (s1>=s0){e0=(v0,v2,v4,v6);
                    e1=(v1,v3,v5,v7); }
        else { e0=blue_contract(v1,v3,v5,v7);
               e1=blue_contract(v0,v2,v4,v6); }

    Mode 13 LDR RGBA, base+offset

        bit_transfer_signed(v1,v0);
        bit_transfer_signed(v3,v2);
        bit_transfer_signed(v5,v4);
        bit_transfer_signed(v7,v6);
        if(v1+v3+v5>=0) { e0=(v0,v2,v4,v6);
                          e1=(v0+v1,v2+v3,v4+v5,v6+v7); }
        else            { e0=blue_contract(v0+v1,v2+v3,v4+v5,v6+v7);
                          e1=blue_contract(v0,v2,v4,v6); }
        clamp_unorm8(e0); clamp_unorm8(e1);

    The bit_transfer_signed procedure transfers a bit from one value (a)
    to another (b). Initially, both a and b are in the range 0..255.
    After calling this procedure, a's range becomes -32..31, and b remains
    in the range 0..255. Note that, as is often the case, this is easier to
    express in hardware than in C:

        void bit_transfer_signed(int& a, int& b)
        {
            b >>= 1;
            b |= a & 0x80;
            a >>= 1;
            a &= 0x3F;
            if( (a&0x20)!=0 ) a-=0x40;
        }

    The blue_contract procedure is used to give additional precision to
    RGB colors near grey:

        color blue_contract( int r, int g, int b, int a )
        {
            color c;
            c.r = (r+b) >> 1;
            c.g = (g+b) >> 1;
            c.b = b;
            c.a = a;
            return c;
        }

    The clamp_unorm8 procedure is used to clamp a color into the UNORM8
    range. The color is passed by reference so that the caller's endpoint
    is modified in place:

        void clamp_unorm8(color& c)
        {
            if(c.r < 0) {c.r=0;} else if(c.r > 255) {c.r=255;}
            if(c.g < 0) {c.g=0;} else if(c.g > 255) {c.g=255;}
            if(c.b < 0) {c.b=0;} else if(c.b > 255) {c.b=255;}
            if(c.a < 0) {c.a=0;} else if(c.a > 255) {c.a=255;}
        }

    [[ If the HDR profile is not implemented, do not include section
       C.2.15 ]]

    C.2.15  HDR Endpoint Decoding
    -----------------------------

    For HDR endpoint modes, color values are represented in a 12-bit
    pseudo-logarithmic representation.

    HDR Endpoint Mode 2

    Mode 2 represents luminance-only data with a large range. It encodes
    using two values (v0, v1).
    The complete decoding procedure is as follows:

        if(v1 >= v0)
        {
            y0 = (v0 << 4);
            y1 = (v1 << 4);
        }
        else
        {
            y0 = (v1 << 4) + 8;
            y1 = (v0 << 4) - 8;
        }
        // Construct RGBA result (0x780 is 1.0f)
        e0 = (y0, y0, y0, 0x780);
        e1 = (y1, y1, y1, 0x780);

    HDR Endpoint Mode 3

    Mode 3 represents luminance-only data with a small range. It packs the
    bits for a base luminance value, together with an offset, into two values
    (v0, v1):

    Value   7   6   5   4   3   2   1   0
    -----  ------------------------------
    v0     |M |          L[6:0]          |
           ------------------------------
    v1     |    X[3:0]   |    d[3:0]     |
           ------------------------------

    Table C.2.18 - HDR Mode 3 Value Layout

    The bit field marked as X allocates different bits to L or d depending
    on the value of the mode bit M.

    The complete decoding procedure is as follows:

        // Check mode bit and extract.
        if((v0&0x80) !=0)
        {
            y0 = ((v1 & 0xE0) << 4) | ((v0 & 0x7F) << 2);
            d  = (v1 & 0x1F) << 2;
        }
        else
        {
            y0 = ((v1 & 0xF0) << 4) | ((v0 & 0x7F) << 1);
            d  = (v1 & 0x0F) << 1;
        }

        // Add delta and clamp
        y1 = y0 + d;
        if(y1 > 0xFFF) { y1 = 0xFFF; }

        // Construct RGBA result (0x780 is 1.0f)
        e0 = (y0, y0, y0, 0x780);
        e1 = (y1, y1, y1, 0x780);

    HDR Endpoint Mode 7

    Mode 7 packs the bits for a base RGB value, a scale factor, and some
    mode bits into the four values (v0, v1, v2, v3):

    Value   7   6   5   4   3   2   1   0
    -----  ------------------------------
    v0     |M[3:2] |       R[5:0]        |
    -----  ------------------------------
    v1     |M1 |X0 |X1 |     G[4:0]      |
    -----  ------------------------------
    v2     |M0 |X2 |X3 |     B[4:0]      |
    -----  ------------------------------
    v3     |X4 |X5 |X6 |     S[4:0]      |
    -----  ------------------------------
    Table C.2.19 - HDR Mode 7 Value Layout

    The mode bits M0 to M3 are a
    packed representation of an endpoint bit mode, together with the major
    component index. For modes 0 to 4, the component (red, green, or blue)
    with the largest magnitude is identified, and the values swizzled to
    ensure that it is decoded from the red channel.

    The endpoint bit mode is used to determine the number of bits assigned
    to each component of the endpoint, and the destination of each of the
    extra bits X0 to X6, as follows:

    ------------------------------------------------------
          Number of bits    Destination of extra bits
    Mode  R   G   B   S     X0  X1  X2  X3   X4  X5   X6
    ------------------------------------------------------
    0     11  5   5   7     R9  R8  R7  R10  R6  S6   S5
    1     11  6   6   5     R8  G5  R7  B5   R6  R10  R9
    2     10  5   5   8     R9  R8  R7  R6   S7  S6   S5
    3     9   6   6   7     R8  G5  R7  B5   R6  S6   S5
    4     8   7   7   6     G6  G5  B6  B5   R6  R7   S5
    5     7   7   7   7     G6  G5  B6  B5   R6  S6   S5
    ------------------------------------------------------
    Table C.2.20 - Endpoint Bit Mode

    As noted before, this appears complex when expressed in C, but much
    easier to achieve in hardware - bit masking, extraction, shifting
    and assignment usually ends up as a single wire or multiplexer.

    The complete decoding procedure is as follows:

        // Extract mode bits and unpack to major component and mode.
        int modeval = ((v0&0xC0)>>6) | ((v1&0x80)>>5) | ((v2&0x80)>>4);

        int majcomp;
        int mode;

        if( (modeval & 0xC ) != 0xC )
        {
            majcomp = modeval >> 2; mode = modeval & 3;
        }
        else if( modeval != 0xF )
        {
            majcomp = modeval & 3;  mode = 4;
        }
        else
        {
            majcomp = 0; mode = 5;
        }

        // Extract low-order bits of r, g, b, and s.
        int red   = v0 & 0x3f;
        int green = v1 & 0x1f;
        int blue  = v2 & 0x1f;
        int scale = v3 & 0x1f;

        // Extract high-order bits, which may be assigned depending on mode
        int x0 = (v1 >> 6) & 1; int x1 = (v1 >> 5) & 1;
        int x2 = (v2 >> 6) & 1; int x3 = (v2 >> 5) & 1;
        int x4 = (v3 >> 7) & 1; int x5 = (v3 >> 6) & 1;
        int x6 = (v3 >> 5) & 1;

        // Now move the high-order xs into the right place.
        int ohm = 1 << mode;
        if( ohm & 0x30 ) green |= x0 << 6;
        if( ohm & 0x3A ) green |= x1 << 5;
        if( ohm & 0x30 ) blue  |= x2 << 6;
        if( ohm & 0x3A ) blue  |= x3 << 5;
        if( ohm & 0x3D ) scale |= x6 << 5;
        if( ohm & 0x2D ) scale |= x5 << 6;
        if( ohm & 0x04 ) scale |= x4 << 7;
        if( ohm & 0x3B ) red   |= x4 << 6;
        if( ohm & 0x04 ) red   |= x3 << 6;
        if( ohm & 0x10 ) red   |= x5 << 7;
        if( ohm & 0x0F ) red   |= x2 << 7;
        if( ohm & 0x05 ) red   |= x1 << 8;
        if( ohm & 0x0A ) red   |= x0 << 8;
        if( ohm & 0x05 ) red   |= x0 << 9;
        if( ohm & 0x02 ) red   |= x6 << 9;
        if( ohm & 0x01 ) red   |= x3 << 10;
        if( ohm & 0x02 ) red   |= x5 << 10;

        // Shift the bits to the top of the 12-bit result.
        static const int shamts[6] = { 1,1,2,3,4,5 };
        int shamt = shamts[mode];
        red <<= shamt; green <<= shamt; blue <<= shamt; scale <<= shamt;

        // Minor components are stored as differences
        if( mode != 5 ) { green = red - green; blue = red - blue; }

        // Swizzle major component into place
        if( majcomp == 1 ) swap( red, green );
        if( majcomp == 2 ) swap( red, blue );

        // Clamp output values, set alpha to 1.0
        e1.r = clamp( red, 0, 0xFFF );
        e1.g = clamp( green, 0, 0xFFF );
        e1.b = clamp( blue, 0, 0xFFF );
        e1.alpha = 0x780;

        e0.r = clamp( red - scale, 0, 0xFFF );
        e0.g = clamp( green - scale, 0, 0xFFF );
        e0.b = clamp( blue - scale, 0, 0xFFF );
        e0.alpha = 0x780;

    HDR Endpoint Mode 11

    Mode 11 specifies two RGB values, which it calculates from a number of
    bitfields (a, b0, b1, c, d0 and d1) which are packed together with some
    mode bits into the six values (v0, v1, v2, v3, v4, v5):

    Value   7   6   5   4   3   2   1   0
    -----  ------------------------------
    v0     |           a[7:0]            |
    -----  ------------------------------
    v1     |m0 |a8 |       c[5:0]        |
    -----  ------------------------------
    v2     |m1 |X0 |       b0[5:0]       |
    -----  ------------------------------
    v3     |m2 |X1 |       b1[5:0]       |
    -----  ------------------------------
    v4     |mj0|X2 |X4 |     d0[4:0]     |
    -----  ------------------------------
    v5     |mj1|X3 |X5 |     d1[4:0]     |
    -----  ------------------------------
    Table C.2.21 - HDR Mode 11 Value Layout

    If the major component bits mj[1:0] are both 1, then the RGB values
    are specified directly:

    Value   7   6   5   4   3   2   1   0
    -----  ------------------------------
    v0     |          R0[11:4]           |
    -----  ------------------------------
    v1     |          R1[11:4]           |
    -----  ------------------------------
    v2     |          G0[11:4]           |
    -----  ------------------------------
    v3     |          G1[11:4]           |
    -----  ------------------------------
    v4     | 1 |        B0[11:5]         |
    -----  ------------------------------
    v5     | 1 |        B1[11:5]         |
    -----  ------------------------------
    Table C.2.22 - HDR Mode 11 Direct Value Layout

    The mode bits m[2:0] specify the bit allocation for the different
    values, and the destinations of the extra bits X0 to X5:

    -------------------------------------------------------------------------
          Number of bits    Destination of extra bits
    Mode  a   b   c   d     X0     X1     X2     X3     X4     X5
    -------------------------------------------------------------------------
    0     9   7   6   7     b0[6]  b1[6]  d0[6]  d1[6]  d0[5]  d1[5]
    1     9   8   6   6     b0[6]  b1[6]  b0[7]  b1[7]  d0[5]  d1[5]
    2     10  6   7   7     a[9]   c[6]   d0[6]  d1[6]  d0[5]  d1[5]
    3     10  7   7   6     b0[6]  b1[6]  a[9]   c[6]   d0[5]  d1[5]
    4     11  8   6   5     b0[6]  b1[6]  b0[7]  b1[7]  a[9]   a[10]
    5     11  6   7   6     a[9]   a[10]  c[7]   c[6]   d0[5]  d1[5]
    6     12  7   7   5     b0[6]  b1[6]  a[11]  c[6]   a[9]   a[10]
    7     12  6   7   6     a[9]   a[10]  a[11]  c[6]   d0[5]  d1[5]
    -------------------------------------------------------------------------
    Table C.2.23 - Endpoint Bit Mode

    The complete decoding procedure is as follows:

        // Find major component
        int majcomp = ((v4 & 0x80) >> 7) | ((v5 & 0x80) >> 6);

        // Deal with simple case first
        if( majcomp == 3 )
        {
            e0 = (v0 << 4, v2 << 4, (v4 & 0x7f) << 5, 0x780);
            e1 = (v1 << 4, v3 << 4, (v5 & 0x7f) << 5, 0x780);
            return;
        }

        // Decode mode, parameters.
        int mode = ((v1&0x80)>>7) | ((v2&0x80)>>6) | ((v3&0x80)>>5);
        int va  = v0 | ((v1 & 0x40) << 2);
        int vb0 = v2 & 0x3f;
        int vb1 = v3 & 0x3f;
        int vc  = v1 & 0x3f;
        int vd0 = v4 & 0x7f;
        int vd1 = v5 & 0x7f;

        // Assign top bits of vd0, vd1.
        static const int dbitstab[8] = {7,6,7,6,5,6,5,6};
        vd0 = signextend( vd0, dbitstab[mode] );
        vd1 = signextend( vd1, dbitstab[mode] );

        // Extract and place extra bits
        int x0 = (v2 >> 6) & 1;
        int x1 = (v3 >> 6) & 1;
        int x2 = (v4 >> 6) & 1;
        int x3 = (v5 >> 6) & 1;
        int x4 = (v4 >> 5) & 1;
        int x5 = (v5 >> 5) & 1;

        int ohm = 1 << mode;
        if( ohm & 0xA4 ) va  |= x0 << 9;
        if( ohm & 0x08 ) va  |= x2 << 9;
        if( ohm & 0x50 ) va  |= x4 << 9;
        if( ohm & 0x50 ) va  |= x5 << 10;
        if( ohm & 0xA0 ) va  |= x1 << 10;
        if( ohm & 0xC0 ) va  |= x2 << 11;
        if( ohm & 0x04 ) vc  |= x1 << 6;
        if( ohm & 0xE8 ) vc  |= x3 << 6;
        if( ohm & 0x20 ) vc  |= x2 << 7;
        if( ohm & 0x5B ) vb0 |= x0 << 6;
        if( ohm & 0x5B ) vb1 |= x1 << 6;
        if( ohm & 0x12 ) vb0 |= x2 << 7;
        if( ohm & 0x12 ) vb1 |= x3 << 7;

        // Now shift up so that major component is at top of 12-bit value.
        // (The shift is derived from the 3-bit mode decoded above.)
        int shamt = (mode >> 1) ^ 3;
        va <<= shamt; vb0 <<= shamt; vb1 <<= shamt;
        vc <<= shamt; vd0 <<= shamt; vd1 <<= shamt;

        e1.r = clamp( va, 0, 0xFFF );
        e1.g = clamp( va - vb0, 0, 0xFFF );
        e1.b = clamp( va - vb1, 0, 0xFFF );
        e1.alpha = 0x780;

        e0.r = clamp( va - vc, 0, 0xFFF );
        e0.g = clamp( va - vb0 - vc - vd0, 0, 0xFFF );
        e0.b = clamp( va - vb1 - vc - vd1, 0, 0xFFF );
        e0.alpha = 0x780;

        if( majcomp == 1 )      { swap( e0.r, e0.g ); swap( e1.r, e1.g ); }
        else if( majcomp == 2 ) { swap( e0.r, e0.b ); swap( e1.r, e1.b ); }

    HDR Endpoint Mode 14

    Mode 14 specifies two RGBA values, using the eight values (v0, v1, v2,
    v3, v4, v5, v6, v7).
    First, the RGB values are decoded from (v0..v5) using the method from
    Mode 11, then the alpha values are filled in from v6 and v7:

        // Decode RGB as for mode 11
        (e0,e1) = decode_mode_11(v0,v1,v2,v3,v4,v5)

        // Now fill in the alphas
        e0.alpha = v6;
        e1.alpha = v7;

    Note that in this mode, the alpha values are interpreted (and
    interpolated) as 8-bit unsigned normalized values, as in the LDR modes.
    This is the only mode that exhibits this behaviour.

    HDR Endpoint Mode 15

    Mode 15 specifies two RGBA values, using the eight values (v0, v1, v2,
    v3, v4, v5, v6, v7). First, the RGB values are decoded from (v0..v5)
    using the method from Mode 11. The alpha values are stored in values
    v6 and v7 as a mode and two values which are interpreted according
    to the mode:

    Value   7   6   5   4   3   2   1   0
    -----  ------------------------------
    v6     |M0 |         A[6:0]          |
    -----  ------------------------------
    v7     |M1 |         B[6:0]          |
    -----  ------------------------------
    Table C.2.24 - HDR Mode 15 Alpha Value Layout

    The alpha values are decoded from v6 and v7 as follows:

        // Decode RGB as for mode 11
        (e0,e1) = decode_mode_11(v0,v1,v2,v3,v4,v5)

        // Extract mode bits
        mode = ((v6 >> 7) & 1) | ((v7 >> 6) & 2);
        v6 &= 0x7F;
        v7 &= 0x7F;

        if(mode==3)
        {
            // Directly specify alphas
            e0.alpha = v6 << 5;
            e1.alpha = v7 << 5;
        }
        else
        {
            // Transfer bits from v7 to v6 and sign extend v7.
            v6 |= (v7 << (mode+1)) & 0x780;
            v7 &= (0x3F >> mode);
            v7 ^= 0x20 >> mode;
            v7 -= 0x20 >> mode;
            v6 <<= (4-mode);
            v7 <<= (4-mode);

            // Add delta and clamp
            v7 += v6;
            v7 = clamp(v7, 0, 0xFFF);
            e0.alpha = v6;
            e1.alpha = v7;
        }

    Note that in this mode, the alpha values are interpreted (and
    interpolated) as 12-bit HDR values, and are interpolated as
    for any other HDR component.

    C.2.16  Weight Decoding
    -----------------------
    The weight information is stored as a stream of bits, growing downwards
    from the most significant bit in the block. Bit n in the stream is thus
    bit 127-n in the block.

    For each location in the weight grid, a value (in the specified range)
    is packed into the stream. These are ordered in a raster pattern
    starting from location (0,0,0), with the X dimension increasing fastest,
    and the Z dimension increasing slowest. If dual-plane mode is selected,
    both weights are emitted together for each location, plane 0 first,
    then plane 1.

    C.2.17  Weight Unquantization
    -----------------------------

    Each weight plane is specified as a sequence of integers in a given
    range. These values are packed using integer sequence encoding.

    Once unpacked, the values must be unquantized from their storage
    range, returning them to a standard range of 0..64. The procedure
    for doing so is similar to the color endpoint unquantization.

    First, we unquantize the actual stored weight values to the range 0..63.

    For bit-only representations, this is simple bit replication from the
    most significant bit of the value.

    For trit or quint-based representations, this involves a set of bit
    manipulations and adjustments to avoid the expense of full-width
    multipliers.
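
    For the bit-only weight case, the whole path from a stored value to the
    final 0..64 range can be sketched as below. The function name is
    hypothetical. The last step, which adds 1 to any result above 32, is
    the range expansion this section applies to all weight types.

```c
#include <assert.h>

/* Unquantize an n-bit stored weight (1 <= n <= 5) to the range 0..64. */
static int unquant_weight_bits(int v, int n)
{
    assert(n >= 1 && n <= 5 && v >= 0 && v < (1 << n));
    int t = 0;
    /* Replicate the n-bit pattern from the top of a 6-bit result. */
    for (int filled = 0; filled < 6; filled += n) {
        int shift = 6 - n - filled;
        t |= (shift >= 0) ? (v << shift) : (v >> -shift);
    }
    /* Expand 0..63 to 0..64 so that 64 can be used as the divisor. */
    if (t > 32) {
        t += 1;
    }
    return t;
}
```

    Under this sketch the four 2-bit weights map to 0, 21, 43 and 64.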

    For representations with no additional bits, the results are as follows:

    Range    0   1   2   3   4
    --------------------------
    0..2     0   32  63  -   -
    0..4     0   16  32  47  63
    --------------------------
    Table C.2.25 - Weight Unquantization Values

    For other values, we calculate the initial inputs to a bit manipulation
    procedure. These are denoted A (7 bits), B (7 bits), C (7 bits), and
    D (3 bits) and are decoded using the range as follows:

    Range    T  Q  B  Bits  A        B        C   D
    -------------------------------------------------------
    0..5     1     1  a     aaaaaaa  0000000  50  Trit value
    0..9        1  1  a     aaaaaaa  0000000  28  Quint value
    0..11    1     2  ba    aaaaaaa  b000b0b  23  Trit value
    0..19       1  2  ba    aaaaaaa  b0000b0  13  Quint value
    0..23    1     3  cba   aaaaaaa  cb000cb  11  Trit value
    -------------------------------------------------------
    Table C.2.26 - Weight Unquantization Parameters

    These are then processed as follows:

        T = D * C + B;
        T = T ^ A;
        T = (A & 0x20) | (T >> 2);

    Note that the multiply in the first line is nearly trivial as it only
    needs to multiply by 0, 1, 2, 3 or 4.

    As a final step, for all types of value, the range is expanded from
    0..63 up to 0..64 as follows:

        if (T > 32) { T += 1; }

    This allows the implementation to use 64 as a divisor during
    interpolation, which is much easier than using 63.

    C.2.18  Weight Infill
    ---------------------

    After unquantization, the weights are subject to weight selection and
    infill. The infill method is used to calculate the weight for a texel
    position, based on the weights in the stored weight grid array (which
    may be a different size).

    The procedure below must be followed exactly, to ensure bit exact
    results.

    The block size is specified as two dimensions along the s and t
    axes (Bs, Bt).
    Texel coordinates within the block (s,t) can have values
    from 0 to one less than the block dimension in that axis.

    For each block dimension, we compute scale factors (Ds, Dt)

        Ds = floor( (1024 + floor(Bs/2)) / (Bs-1) );
        Dt = floor( (1024 + floor(Bt/2)) / (Bt-1) );

    Since the block dimensions are constrained, these are easily looked up
    in a table. These scale factors are then used to scale the (s,t)
    coordinates to a homogeneous coordinate (cs, ct):

        cs = Ds * s;
        ct = Dt * t;

    This homogeneous coordinate (cs, ct) is then scaled again to give
    a coordinate (gs, gt) in the weight-grid space. The weight-grid is
    of size (N, M), as specified in the block mode field:

        gs = (cs*(N-1)+32) >> 6;
        gt = (ct*(M-1)+32) >> 6;

    The resulting coordinates may be in the range 0..176. These are
    interpreted as 4:4 unsigned fixed point numbers in the range 0.0 .. 11.0.

    If we label the integral parts of these (js, jt) and the fractional
    parts (fs, ft), then:

        js = gs >> 4; fs = gs & 0x0F;
        jt = gt >> 4; ft = gt & 0x0F;

    These values are then used to bilinearly interpolate between the stored
    weights.

        v0 = js + jt*N;
        p00 = decode_weight(v0);
        p01 = decode_weight(v0 + 1);
        p10 = decode_weight(v0 + N);
        p11 = decode_weight(v0 + N + 1);

    The function decode_weight(n) decodes the nth weight in the stored weight
    stream. The values p00 to p11 are the weights at the corner of the square
    in which the texel position resides.
    These are then weighted using the fractional position to produce the
    effective weight i as follows:

        w11 = (fs*ft+8) >> 4;
        w10 = ft - w11;
        w01 = fs - w11;
        w00 = 16 - fs - ft + w11;
        i = (p00*w00 + p01*w01 + p10*w10 + p11*w11 + 8) >> 4;

    C.2.19  Weight Application
    --------------------------
    Once the effective weight i for the texel has been calculated, the color
    endpoints are interpolated and expanded.

    For LDR endpoint modes, each color component C is calculated from the
    corresponding 8-bit endpoint components C0 and C1 as follows:

    If sRGB conversion is not enabled, or for the alpha channel in any case,
    C0 and C1 are first expanded to 16 bits by bit replication:

        C0 = (C0 << 8) | C0; C1 = (C1 << 8) | C1;

    If sRGB conversion is enabled, C0 and C1 for the R, G, and B channels
    are expanded to 16 bits differently, as follows:

        C0 = (C0 << 8) | 0x80; C1 = (C1 << 8) | 0x80;

    C0 and C1 are then interpolated to produce a UNORM16 result C:

        C = floor( (C0*(64-i) + C1*i + 32)/64 )

    If sRGB conversion is enabled, the top 8 bits of the interpolation
    result for the R, G and B channels are passed to the external sRGB
    conversion block. Otherwise, if C = 65535, then the final result is
    1.0 (0x3C00); otherwise, C is divided by 65536 and the
    infinite-precision result of the division is converted to FP16 with
    round-to-zero semantics.

    For HDR endpoint modes, color values are represented in a 12-bit
    pseudo-logarithmic representation, and interpolation occurs in a
    piecewise-approximate logarithmic manner as follows:

    In LDR mode, the error result is returned.

    In HDR mode, the color components from each endpoint, C0 and C1, are
    initially shifted left 4 bits to become 16-bit integer values and these
    are interpolated in the same way as LDR.
    The 16-bit value C is then decomposed into the top five bits, E, and
    the bottom 11 bits M, which are then processed and recombined with E
    to form the final value Cf:

        C = floor( (C0*(64-i) + C1*i + 32)/64 )
        E = (C&0xF800) >> 11; M = C&0x7FF;
        if (M < 512) { Mt = 3*M; }
        else if (M >= 1536) { Mt = 5*M - 2048; }
        else { Mt = 4*M - 512; }
        Cf = (E<<10) + (Mt>>3)

    This interpolation is a considerably closer approximation to a
    logarithmic space than simple 16-bit interpolation.

    This final value Cf is interpreted as an IEEE FP16 value. If the result
    is +Inf or NaN, it is converted to the bit pattern 0x7BFF, which is the
    largest representable finite value.

    C.2.20 Dual-Plane Decoding
    ---------------------------
    If dual-plane mode is disabled, all of the endpoint components are
    interpolated using the same weight value.

    If dual-plane mode is enabled, two weights are stored with each texel.
    One component is then selected to use the second weight for interpolation,
    instead of the first weight. The first weight is then used for all other
    components.

    The component to treat specially is indicated using the 2-bit Color
    Component Selector (CCS) field as follows:

        Value   Weight 0   Weight 1
        --------------------------
          0       GBA         R
          1       RBA         G
          2       RGA         B
          3       RGB         A
        --------------------------
        Table C.2.28 - Dual Plane Color Component Selector Values

    The CCS bits are stored at a variable position directly below the weight
    bits and any additional CEM bits.

    C.2.21 Partition Pattern Generation
    ------------------------------------

    When multiple partitions are active, each texel position is assigned a
    partition index. This partition index is calculated using a seed (the
    partition pattern index), the texel's x,y,z position within the block,
    and the number of partitions.
    An additional argument, small_block, is set to 1 if the number of
    texels in the block is less than 31, otherwise it is set to 0.

    This function is specified in terms of x, y and z in order to support
    3D textures. For 2D textures and texture slices, z will always be 0.

    The full partition selection algorithm is as follows:

        int select_partition(int seed, int x, int y, int z,
                             int partitioncount, int small_block)
        {
            if( small_block ){ x <<= 1; y <<= 1; z <<= 1; }
            seed += (partitioncount-1) * 1024;
            uint32_t rnum = hash52(seed);
            uint8_t seed1  = rnum & 0xF;
            uint8_t seed2  = (rnum >> 4) & 0xF;
            uint8_t seed3  = (rnum >> 8) & 0xF;
            uint8_t seed4  = (rnum >> 12) & 0xF;
            uint8_t seed5  = (rnum >> 16) & 0xF;
            uint8_t seed6  = (rnum >> 20) & 0xF;
            uint8_t seed7  = (rnum >> 24) & 0xF;
            uint8_t seed8  = (rnum >> 28) & 0xF;
            uint8_t seed9  = (rnum >> 18) & 0xF;
            uint8_t seed10 = (rnum >> 22) & 0xF;
            uint8_t seed11 = (rnum >> 26) & 0xF;
            uint8_t seed12 = ((rnum >> 30) | (rnum << 2)) & 0xF;

            seed1 *= seed1;   seed2 *= seed2;
            seed3 *= seed3;   seed4 *= seed4;
            seed5 *= seed5;   seed6 *= seed6;
            seed7 *= seed7;   seed8 *= seed8;
            seed9 *= seed9;   seed10 *= seed10;
            seed11 *= seed11; seed12 *= seed12;

            int sh1, sh2, sh3;
            if( seed & 1 )
                { sh1 = (seed&2 ? 4:5); sh2 = (partitioncount==3 ? 6:5); }
            else
                { sh1 = (partitioncount==3 ? 6:5); sh2 = (seed&2 ? 4:5); }
            sh3 = (seed & 0x10) ? sh1 : sh2;

            seed1 >>= sh1; seed2  >>= sh2; seed3  >>= sh1; seed4  >>= sh2;
            seed5 >>= sh1; seed6  >>= sh2; seed7  >>= sh1; seed8  >>= sh2;
            seed9 >>= sh3; seed10 >>= sh3; seed11 >>= sh3; seed12 >>= sh3;

            int a = seed1*x + seed2*y + seed11*z + (rnum >> 14);
            int b = seed3*x + seed4*y + seed12*z + (rnum >> 10);
            int c = seed5*x + seed6*y + seed9 *z + (rnum >> 6);
            int d = seed7*x + seed8*y + seed10*z + (rnum >> 2);

            a &= 0x3F; b &= 0x3F; c &= 0x3F; d &= 0x3F;

            if( partitioncount < 4 ) d = 0;
            if( partitioncount < 3 ) c = 0;

            if( a >= b && a >= c && a >= d ) return 0;
            else if( b >= c && b >= d ) return 1;
            else if( c >= d ) return 2;
            else return 3;
        }

    As has been observed before, the bit selections are much easier to
    express in hardware than in C.

    The seed is expanded using a hash function hash52, which is defined as
    follows:

        uint32_t hash52( uint32_t p )
        {
            p ^= p >> 15; p -= p << 17; p += p << 7; p += p << 4;
            p ^= p >> 5;  p += p << 16; p ^= p >> 7; p ^= p >> 3;
            p ^= p << 6;  p ^= p >> 17;
            return p;
        }

    This assumes that all operations act on 32-bit values.

    C.2.22 Data Size Determination
    -------------------------------

    The size of the data used to represent color endpoints is not
    explicitly specified.
    Instead, it is determined from the block mode and
    number of partitions as follows:

        config_bits = 17;
        if(num_partitions>1)
        {
            if(single_CEM)
                config_bits = 29;
            else
                config_bits = 25 + 3*num_partitions;
        }

        num_weights = M * N * Q; // size of weight grid

        if(dual_plane)
        {
            config_bits += 2;
            num_weights *= 2;
        }

        weight_bits = ceil(num_weights*8*trits_in_weight_range/5) +
                      ceil(num_weights*7*quints_in_weight_range/3) +
                      num_weights*bits_in_weight_range;

        remaining_bits = 128 - config_bits - weight_bits;

        num_CEM_pairs = base_CEM_class+1 + count_bits(extra_CEM_bits);

    The CEM value range is then looked up from a table indexed by remaining
    bits and num_CEM_pairs. This table is initialized such that the range
    is as large as possible, consistent with the constraint that the number
    of bits required to encode num_CEM_pairs pairs of values is not more
    than the number of remaining bits.

    An equivalent iterative algorithm would be:

        num_CEM_values = num_CEM_pairs*2;

        for(range = each possible CEM range in descending order of size)
        {
            CEM_bits = ceil(num_CEM_values*8*trits_in_CEM_range/5) +
                       ceil(num_CEM_values*7*quints_in_CEM_range/3) +
                       num_CEM_values*bits_in_CEM_range;

            if(CEM_bits <= remaining_bits)
                break;
        }
        return range;

    In cases where this procedure results in unallocated bits, these bits
    are not read by the decoding process and can have any value.

    C.2.23 Void-Extent Blocks
    --------------------------

    A void-extent block is a block encoded with a single color. It also
    specifies some additional information about the extent of the
    single-color area beyond this block, which can optionally be used by a
    decoder to reduce or prevent redundant block fetches.
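    As an illustration of the iterative range selection described in
    C.2.22, the following is a minimal sketch in C, not normative text. The
    names cem_range, cem_ranges, ise_bits and select_cem_range are
    hypothetical and chosen for this example only. The table assumes the
    candidate ranges run from 0..255 down to 0..5, with the trit, quint and
    bit makeup of each range following the integer sequence encoding rules
    described elsewhere in this specification. A return value of -1 is this
    sketch's way of signalling that even the range 0..5 does not fit, which
    corresponds to an illegal encoding.

```c
#include <assert.h>

/* Hypothetical table row: the largest value in the range, and how each
   encoded value is built from trits, quints and plain bits. */
struct cem_range { int max_value; int trits; int quints; int bits; };

/* Candidate ranges in descending order of size (assumed: 0..255 to 0..5). */
static const struct cem_range cem_ranges[] = {
    {255, 0, 0, 8}, {191, 1, 0, 6}, {159, 0, 1, 5}, {127, 0, 0, 7},
    { 95, 1, 0, 5}, { 79, 0, 1, 4}, { 63, 0, 0, 6}, { 47, 1, 0, 4},
    { 39, 0, 1, 3}, { 31, 0, 0, 5}, { 23, 1, 0, 3}, { 19, 0, 1, 2},
    { 15, 0, 0, 4}, { 11, 1, 0, 2}, {  9, 0, 1, 1}, {  7, 0, 0, 3},
    {  5, 1, 0, 1}
};

/* Integer ceiling of a/b for non-negative a and positive b. */
static int ceil_div(int a, int b) { return (a + b - 1) / b; }

/* Bits needed to encode num_values values in the given range, using the
   same formula as CEM_bits in the text. */
static int ise_bits(int num_values, const struct cem_range *r)
{
    return ceil_div(num_values * 8 * r->trits, 5) +
           ceil_div(num_values * 7 * r->quints, 3) +
           num_values * r->bits;
}

/* Returns the largest value of the widest range that fits in
   remaining_bits, or -1 if no supported range fits. */
int select_cem_range(int num_cem_pairs, int remaining_bits)
{
    int num_values = num_cem_pairs * 2;
    int count = (int)(sizeof cem_ranges / sizeof cem_ranges[0]);
    int k;

    /* At most 18 color integers are permitted, i.e. 9 pairs. */
    assert(num_cem_pairs >= 1 && num_cem_pairs <= 9);

    for (k = 0; k < count; k++) {
        if (ise_bits(num_values, &cem_ranges[k]) <= remaining_bits)
            return cem_ranges[k].max_value;
    }
    return -1;
}
```

    For example, with three endpoint pairs and 40 remaining bits, the first
    range that fits is 0..95 (one trit plus five bits per value), which
    needs exactly 40 bits for six values.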

    The layout of a 2D Void-Extent block is as follows:

         127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
        -----------------------------------------------------------------
        |                    Block color A component                    |
        -----------------------------------------------------------------

         111 110 109 108 107 106 105 104 103 102 101 100  99  98  97  96
        -----------------------------------------------------------------
        |                    Block color B component                    |
        -----------------------------------------------------------------

          95  94  93  92  91  90  89  88  87  86  85  84  83  82  81  80
        -----------------------------------------------------------------
        |                    Block color G component                    |
        -----------------------------------------------------------------

          79  78  77  76  75  74  73  72  71  70  69  68  67  66  65  64
        -----------------------------------------------------------------
        |                    Block color R component                    |
        -----------------------------------------------------------------

          63  62  61  60  59  58  57  56  55  54  53  52  51  50  49  48
        -----------------------------------------------------------------
        |         Void-extent maximum T coordinate          |   Min T   |
        -----------------------------------------------------------------

          47  46  45  44  43  42  41  40  39  38  37  36  35  34  33  32
        -----------------------------------------------------------------
            Void-extent minimum T coordinate    |   Void-extent max S   |
        -----------------------------------------------------------------

          31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
        -----------------------------------------------------------------
           Void-extent max S coord  | Void-extent minimum S coordinate  |
        -----------------------------------------------------------------

          15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
        -----------------------------------------------------------------
           Min S coord  | 1 | 1 | D | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0 |
        -----------------------------------------------------------------
        -------------------------------------------------
        Figure C.7 - 2D Void-Extent Block Layout Overview

    Bit 9 is the Dynamic Range flag, which indicates the format in which
    colors are stored. A 0 value indicates LDR, in which case the color
    components are stored as UNORM16 values. A 1 indicates HDR, in which
    case the color components are stored as FP16 values.

    The reason for the storage of UNORM16 values in the LDR case is due
    to the possibility that the value will need to be passed on to sRGB
    conversion. By storing the color value in the format which comes out
    of the interpolator, before the conversion to FP16, we avoid having
    to have separate versions for sRGB and linear modes.

    If a void-extent block with HDR values is decoded in LDR mode, then
    the result will be the error color, opaque magenta, for all texels
    within the block.

    In the HDR case, if the color component values are infinity or NaN, this
    will result in undefined behavior. As usual, this must not lead to GL
    interruption or termination.

    Bits 10 and 11 are reserved and must be 1.

    The minimum and maximum coordinate values are treated as unsigned
    integers and then normalized into the range 0..1 (by dividing by 2^13-1
    or 2^9-1, for 2D and 3D respectively). The maximum values for each
    dimension must be greater than the corresponding minimum values,
    unless they are all all-1s.

    If all the coordinates are all-1s, then the void extent is ignored,
    and the block is simply a constant-color block.

    The existence of single-color blocks with void extents must not produce
    results different from those obtained if these single-color blocks are
    defined without void-extents. Any situation in which the results would
    differ is invalid. Results from invalid void extents are undefined.
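    The field positions in Figure C.7 and the validity rules above can be
    illustrated with a small C sketch. This is not normative text. It
    assumes the caller has already assembled the 128-bit block into two
    64-bit words, lo holding bits 63..0 and hi holding bits 127..64; how
    bytes are packed into those words is outside this sketch. The names
    ve_status, void_extent_2d and parse_void_extent_2d are hypothetical,
    and the test of bits 8..0 against 0x1FC is simply the fixed bit pattern
    read off the figure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
enum ve_status { VE_NOT_VOID_EXTENT, VE_ILLEGAL, VE_OK };

/* Hypothetical container for the decoded fields of Figure C.7. */
struct void_extent_2d {
    bool hdr;          /* bit 9: 0 = UNORM16 colors, 1 = FP16 colors */
    bool has_extent;   /* false when every coordinate field is all-1s */
    uint16_t min_s, max_s, min_t, max_t;   /* 13-bit raw coordinates */
    uint16_t r, g, b, a;                   /* raw 16-bit color words */
};

/* lo = block bits 63..0, hi = block bits 127..64 (assumed packing). */
enum ve_status parse_void_extent_2d(uint64_t lo, uint64_t hi,
                                    struct void_extent_2d *out)
{
    assert(out);

    /* Bits 8..0 carry the fixed pattern 1 1111 1100 in the figure. */
    if ((lo & 0x1FF) != 0x1FC)
        return VE_NOT_VOID_EXTENT;

    /* Bits 10 and 11 are reserved and must both be 1. */
    if (((lo >> 10) & 0x3) != 0x3)
        return VE_ILLEGAL;

    out->hdr   = ((lo >> 9) & 1) != 0;
    out->min_s = (uint16_t)((lo >> 12) & 0x1FFF);   /* bits 24..12 */
    out->max_s = (uint16_t)((lo >> 25) & 0x1FFF);   /* bits 37..25 */
    out->min_t = (uint16_t)((lo >> 38) & 0x1FFF);   /* bits 50..38 */
    out->max_t = (uint16_t)((lo >> 51) & 0x1FFF);   /* bits 63..51 */

    out->r = (uint16_t)(hi & 0xFFFF);               /* bits 79..64   */
    out->g = (uint16_t)((hi >> 16) & 0xFFFF);       /* bits 95..80   */
    out->b = (uint16_t)((hi >> 32) & 0xFFFF);       /* bits 111..96  */
    out->a = (uint16_t)((hi >> 48) & 0xFFFF);       /* bits 127..112 */

    /* All 52 coordinate bits set: constant-color block, extent ignored. */
    out->has_extent = (lo >> 12) != 0xFFFFFFFFFFFFFULL;

    /* Otherwise each maximum must exceed the matching minimum. */
    if (out->has_extent &&
        (out->min_s >= out->max_s || out->min_t >= out->max_t))
        return VE_ILLEGAL;

    return VE_OK;
}
```

    A decoder built along these lines would still need to return the error
    color when out->hdr is set and the texture is being decoded in LDR
    mode; that check is left to the caller here.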

    If a void-extent appears in a MIPmap level other than the most detailed
    one, then the extent will apply to all of the more detailed levels too.
    This allows decoders to avoid sampling more detailed MIPmaps.

    If the more detailed MIPmap level is not a constant color in this region,
    then the block may be marked as constant color, but without a void extent,
    as detailed above.

    If a void-extent extends to the edge of a texture, then filtered texture
    colors may not be the same color as that specified in the block, due to
    texture border colors, wrapping, or cube face wrapping.

    Care must be taken when updating or extracting partial image data that
    void-extents in the image do not become invalid.

    C.2.24 Illegal Encodings
    -------------------------

    In ASTC, there is a variety of ways to encode an illegal block. Decoders
    are required to recognize all illegal blocks and emit the standard error
    color value upon encountering an illegal block.

    Here is a comprehensive list of situations that represent illegal block
    encodings:

    * The block mode specified is one of the modes explicitly listed
      as Reserved.
    * A 2D void-extent block that has any of the reserved bits not
      set to 1.
    * A block mode has been specified that would require more than
      64 weights total.
    * A block mode has been specified that would require more than
      96 bits for integer sequence encoding of the weight grid.
    * A block mode has been specified that would require fewer than
      24 bits for integer sequence encoding of the weight grid.
    * The size of the weight grid exceeds the size of the block footprint
      in any dimension.
    * Color endpoint modes have been specified such that the color
      integer sequence encoding would require more than 18 integers.
    * The number of bits available for color endpoint encoding after all
      the other fields have been counted is less than ceil(13C/5) where C
      is the number of color endpoint integers (this would restrict color
      integers to a range smaller than 0..5, which is not supported).
    * Dual weight mode is enabled for a block with 4 partitions.
    * Void-Extent blocks where the low coordinate for some texture axis
      is greater than or equal to the high coordinate.

    Note also that, in LDR mode, a block which has both HDR and LDR endpoint
    modes assigned to different partitions is not an error block. Only those
    texels which belong to the HDR partition will result in the error color.
    Texels belonging to an LDR partition will be decoded as normal.

    C.2.25 LDR PROFILE SUPPORT
    ---------------------------

    Implementations of the LDR Profile must satisfy the following requirements:

    * All textures with valid encodings for LDR Profile must decode
      identically using either an LDR Profile, HDR Profile, or Full Profile
      decoder.
    * All features included only in the HDR Profile or Full Profile must be
      treated as reserved in the LDR Profile, and return the error color on
      decoding.
    * Any sequence of API calls valid for the LDR Profile must also be valid
      for the HDR Profile or Full Profile and return identical results when
      given a texture encoded for the LDR Profile.

    The feature subset for the LDR profile is:

    * 2D textures only, including 2D, 2D array, cube map face,
      and cube map array texture targets.
    * Only those block sizes listed in Table C.2.2 are supported.
    * LDR operation mode only.
    * Only LDR endpoint formats must be supported, namely formats
      0, 1, 4, 5, 6, 8, 9, 10, 12, 13.
    * Decoding from an HDR endpoint results in the error color.
    * Interpolation returns UNORM8 results when used in conjunction
      with sRGB.
    * LDR void extent blocks must be supported, but void extents
      may not be checked."

    If only the LDR profile is supported, read this extension by striking
    all descriptions of HDR modes and decoding algorithms. The extension
    documents how to modify the document for some particularly tricky cases,
    but the general rule is as described in this paragraph.

Interactions with immutable-format texture images

    ASTC texture formats are supported by immutable-format textures only if
    such textures are supported by the underlying implementation (e.g.
    OpenGL 4.1 or later, OpenGL ES 3.0 or later, or earlier versions
    supporting the GL_EXT_texture_storage extension). Otherwise, remove all
    references to the Tex*Storage* commands from this specification.

Interactions with texture cube map arrays

    ASTC textures are supported for the TEXTURE_CUBE_MAP_ARRAY target only
    when cube map arrays are supported by the underlying implementation
    (e.g. OpenGL 4.0 or later, or an OpenGL or OpenGL ES version supporting
    an extension defining cube map arrays). Otherwise, remove all references
    to texture cube map arrays from this specification.

Interactions with OpenGL (all versions)

    ASTC is not supported for 1D textures and texture rectangles, and does
    not support non-zero borders.

    Add the following error conditions to CompressedTexImage*D:

        "An INVALID_ENUM error is generated by CompressedTexImage1D if
        <internalformat> is one of the ASTC formats.

        An INVALID_OPERATION error is generated by CompressedTexImage2D
        and CompressedTexImage3D if <internalformat> is one of the ASTC
        formats and <border> is non-zero."

    Add the following error conditions to CompressedTexSubImage*D:

        "An INVALID_ENUM error is generated by CompressedTex*SubImage1D
        if the internal format of the texture is one of the ASTC formats.

        An INVALID_OPERATION error is generated by CompressedTex*SubImage2D
        if the internal format of the texture is one of the ASTC formats
        and <border> is non-zero."

    Add the following error conditions to TexStorage1D and TextureStorage1D:

        "An INVALID_ENUM error is generated by TexStorage1D and TextureStorage1D
        if <format> is one of the ASTC formats."

    Add the following error conditions to TexStorage2D and TextureStorage2D
    for versions of OpenGL that support texture rectangles:

        "An INVALID_OPERATION error is generated by TexStorage2D and
        TextureStorage2D if <format> is one of the ASTC formats and <target>
        is TEXTURE_RECTANGLE."

Interactions with OpenGL 4.2

    OpenGL 4.2 supports the feature that compressed textures can be
    compressed online, by passing the compressed texture format enum as
    the internal format when uploading a texture using TexImage1D,
    TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
    Specification, subsection Encoding of Special Internal Formats).

    Due to the complexity of the ASTC compression algorithm, it is not
    usually suitable for online use, and therefore ASTC support will be
    limited to pre-compressed textures only. Where on-device compression
    is required, a domain-specific limited compressor will typically
    be used, and this is therefore not suitable for implementation in
    the driver.

    In particular, the ASTC format specifiers will not be added to
    Table 3.14, and thus will not be accepted by the TexImage*D
    functions, and will not be returned by the (already deprecated)
    COMPRESSED_TEXTURE_FORMATS query.

Issues

    1) Three-dimensional block ASTC formats (i.e. formats whose block depth
       is greater than one) are not supported by these extensions.

    2) The first release of the extension was not clear about the
       restrictions of the LDR profile and did not document interactions
       with cube map array textures.

       RESOLVED. This extension has been rewritten to be based on OpenGL ES
       3.1, to clearly document LDR restrictions, and to add cube map array
       texture interactions.

Revision History

    Revision 8, June 8, 2017 - Added missing interactions with OpenGL.

    Revision 7, July 14, 2016 - Clarified definition of 2D void-extent
    blocks.

    Revision 6, March 8, 2016 - Clarified that sRGB transform is not
    applied to Alpha channel.

    Revision 5, September 15, 2015 - fix typo in third paragraph of section
    8.7.

    Revision 4, June 24, 2015 - minor cleanup from feedback. Move Issues and
    Interactions sections to the end of the document. Merge some language
    from OpenGL ES specification edits and rename some tables to figures,
    due to how they're generated in the core specifications. Include a
    description of the "Cube Map Array Texture" column added to table 3.19
    and expand the description of how to read this document when supporting
    only the LDR profile (Bug 13921).

    Revision 3, May 28, 2015 - rebase extension on OpenGL ES 3.1. Clarify
    texture formats and targets supported by LDR and HDR profiles. Add cube
    map array targets and an Interactions section defining when they are
    supported. Add an Interactions section for immutable-format textures
    (Bug 13921).

    Revision 2, April 28, 2015 - added CompressedTex{Sub,}Image3D to
    commands accepting ASTC format tokens in the New Tokens section (Bug
    10183).