1Name
2
3    KHR_texture_compression_astc_hdr
4
5Name Strings
6
7    GL_KHR_texture_compression_astc_hdr
8    GL_KHR_texture_compression_astc_ldr
9
10Contact
11
12    Sean Ellis (sean.ellis 'at' arm.com)
13    Jon Leech (oddhack 'at' sonic.net)
14
15Contributors
16
17    Sean Ellis, ARM
18    Jorn Nystad, ARM
19    Tom Olson, ARM
20    Andy Pomianowski, AMD
21    Cass Everitt, NVIDIA
22    Walter Donovan, NVIDIA
23    Robert Simpson, Qualcomm
24    Maurice Ribble, Qualcomm
25    Larry Seiler, Intel
26    Daniel Koch, NVIDIA
27    Anthony Wood, Imagination Technologies
28    Jon Leech
29    Andrew Garrard, Samsung
30
31IP Status
32
33    No known issues.
34
35Notice
36
37    Copyright (c) 2012-2016 The Khronos Group Inc. Copyright terms at
38            http://www.khronos.org/registry/speccopyright.html
39
40Specification Update Policy
41
42    Khronos-approved extension specifications are updated in response to
43    issues and bugs prioritized by the Khronos OpenGL and OpenGL ES Working Groups. For
44    extensions which have been promoted to a core Specification, fixes will
45    first appear in the latest version of that core Specification, and will
46    eventually be backported to the extension document. This policy is
47    described in more detail at
48        https://www.khronos.org/registry/OpenGL/docs/update_policy.php
49
50Status
51
52    Complete.
53    Approved by the ARB on 2012/06/18.
54    Approved by the OpenGL ES WG on 2012/06/15.
55    Ratified by the Khronos Board of Promoters on 2012/07/27 (LDR profile).
56    Ratified by the Khronos Board of Promoters on 2013/09/27 (HDR profile).
57
58Version
59
60    Version 8, June 8, 2017
61
62Number
63
64    ARB Extension #118
65    OpenGL ES Extension #117
66
67Dependencies
68
69    Written based on the wording of the OpenGL ES 3.1 (April 29, 2015)
70    Specification
71
72    May be implemented against any version of OpenGL or OpenGL ES supporting
73    compressed textures.
74
75    Some of the functionality of these extensions is not supported if the
76    underlying implementation does not support cube map array textures.
77
78
79Overview
80
81    Adaptive Scalable Texture Compression (ASTC) is a new texture
82    compression technology that offers unprecedented flexibility, while
83    producing results better than or comparable to existing texture
84    compression schemes at all bit rates. It includes support for 2D and
85    slice-based 3D textures, with low and high dynamic range, at bitrates
86    from below 1 bit/pixel up to 8 bits/pixel in fine steps.
87
88    The goal of these extensions is to support the full 2D profile of the
89    ASTC texture compression specification, and allow construction of 3D
90    textures from multiple compressed 2D slices.
91
92    ASTC-compressed textures are handled in OpenGL ES and OpenGL by adding
93    new supported formats to the existing commands for defining and updating
94    compressed textures, and defining the interaction of the ASTC formats
95    with each texture target.
96
97New Procedures and Functions
98
99    None
100
101New Tokens
102
103    Accepted by the <format> parameter of CompressedTexSubImage2D and
104    CompressedTexSubImage3D, and by the <internalformat> parameter of
105    CompressedTexImage2D, CompressedTexImage3D, TexStorage2D,
106    TextureStorage2D, TexStorage3D, and TextureStorage3D:
107
108    COMPRESSED_RGBA_ASTC_4x4_KHR            0x93B0
109    COMPRESSED_RGBA_ASTC_5x4_KHR            0x93B1
110    COMPRESSED_RGBA_ASTC_5x5_KHR            0x93B2
111    COMPRESSED_RGBA_ASTC_6x5_KHR            0x93B3
112    COMPRESSED_RGBA_ASTC_6x6_KHR            0x93B4
113    COMPRESSED_RGBA_ASTC_8x5_KHR            0x93B5
114    COMPRESSED_RGBA_ASTC_8x6_KHR            0x93B6
115    COMPRESSED_RGBA_ASTC_8x8_KHR            0x93B7
116    COMPRESSED_RGBA_ASTC_10x5_KHR           0x93B8
117    COMPRESSED_RGBA_ASTC_10x6_KHR           0x93B9
118    COMPRESSED_RGBA_ASTC_10x8_KHR           0x93BA
119    COMPRESSED_RGBA_ASTC_10x10_KHR          0x93BB
120    COMPRESSED_RGBA_ASTC_12x10_KHR          0x93BC
121    COMPRESSED_RGBA_ASTC_12x12_KHR          0x93BD
122
123    COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR    0x93D0
124    COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR    0x93D1
125    COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR    0x93D2
126    COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR    0x93D3
127    COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR    0x93D4
128    COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR    0x93D5
129    COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR    0x93D6
130    COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR    0x93D7
131    COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR   0x93D8
132    COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR   0x93D9
133    COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR   0x93DA
134    COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR  0x93DB
135    COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR  0x93DC
136    COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR  0x93DD
137
138    If extension "EXT_texture_storage" is supported, these tokens are also
139    accepted by TexStorage2DEXT, TextureStorage2DEXT, TexStorage3DEXT and
140    TextureStorage3DEXT.
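
    The token values above follow a regular pattern: the fourteen block
    footprints are numbered consecutively from a base value, and each sRGB
    token is 0x20 above its linear counterpart. The following Python sketch
    is illustrative only and is not part of the specification; the names
    ASTC_FOOTPRINTS, astc_tokens and srgb_variant are hypothetical helpers.

```python
# Block footprints (width, height) in the order the tokens are listed above.
ASTC_FOOTPRINTS = [
    (4, 4), (5, 4), (5, 5), (6, 5), (6, 6), (8, 5), (8, 6),
    (8, 8), (10, 5), (10, 6), (10, 8), (10, 10), (12, 10), (12, 12),
]

RGBA_BASE = 0x93B0  # COMPRESSED_RGBA_ASTC_4x4_KHR
SRGB_BASE = 0x93D0  # COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR


def astc_tokens():
    """Return {token_name: (value, block_width, block_height)} for all 28 tokens."""
    tokens = {}
    for i, (w, h) in enumerate(ASTC_FOOTPRINTS):
        tokens[f"COMPRESSED_RGBA_ASTC_{w}x{h}_KHR"] = (RGBA_BASE + i, w, h)
        tokens[f"COMPRESSED_SRGB8_ALPHA8_ASTC_{w}x{h}_KHR"] = (SRGB_BASE + i, w, h)
    return tokens


def srgb_variant(rgba_value):
    """Map a linear RGBA ASTC token value to its sRGB counterpart."""
    if not RGBA_BASE <= rgba_value < RGBA_BASE + len(ASTC_FOOTPRINTS):
        raise ValueError("not a COMPRESSED_RGBA_ASTC_*_KHR value")
    return rgba_value + (SRGB_BASE - RGBA_BASE)
```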
141
142Additions to Chapter 8 of the OpenGL ES 3.1 Specification (Textures and Samplers)
143
144    Add to Section 8.7 Compressed Texture Images:
145
146    Modify table 8.19 (Compressed internal formats) to add all the ASTC
147    format tokens in the New Tokens section. The "Base Internal Format"
148    column is RGBA for all ASTC formats.
149
150    Add a new column "Block Width x Height", which is 4x4 for all non-ASTC
151    formats in the table, and matches the size in the token name for ASTC
152    formats (e.g. COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR has a block size of
153    10 x 8).
154
155    Add a second new column "3D Tex." which is empty for all non-ASTC
156    formats. If only the LDR profile is supported by the implementation,
157    this column is also empty for all ASTC formats. If both the LDR and HDR
158    profiles are supported, this column is checked for all ASTC formats.
159
160    Add a third new column "Cube Map Array Tex." which is empty for all
161    non-ASTC formats, and checked for all ASTC formats.
162
163    Append to the table caption:
164
165   "The "Block Size" column specifies the compressed block size of the
166    format. Modifying compressed images along aligned block boundaries is
167    possible, as described in this section. The "3D Tex." and "Cube Map
168    Array Tex." columns determine if 3D images composed of compressed 2D
169    slices, and cube map array textures respectively can be specified using
170    CompressedTexImage3D."
171
172    Append to the paragraph at the bottom of p. 168:
173
174   "If <internalformat> is one of the specific ... supports only
175    two-dimensional images. However, if the "3D Tex." column of table 8.19
176    is checked, CompressedTexImage3D will accept a three-dimensional image
177    specified as an array of compressed data consisting of multiple rows of
178    compressed blocks laid out as described in section 8.5."
179
180    Modify the second and third errors in the Errors section for
181    CompressedTexImage[23]D on p. 169, and add a new error:
182
183   "An INVALID_VALUE error is generated by
184
185      * CompressedTexImage2D if <target> is
186        one of the cube map face targets from table 8.21, and
187      * CompressedTexImage3D if <target> is TEXTURE_CUBE_MAP_ARRAY,
188
189    and <width> and <height> are not equal.
190
191    An INVALID_OPERATION error is generated by CompressedTexImage3D if
192    <internalformat> is one of the formats in table 8.19 and <target> is
193    not TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, or TEXTURE_3D.
194
195    An INVALID_OPERATION error is generated by CompressedTexImage3D if
196    <target> is TEXTURE_CUBE_MAP_ARRAY and the "Cube Map Array"
197    column of table 8.19 is *not* checked, or if <target> is
198    TEXTURE_3D and the "3D Tex." column of table 8.19 is *not* checked."
199
200    Modify the fifth and sixth paragraphs on p. 170:
201
202   "Since these specific compressed formats are easily edited along texel
203    block boundaries, the limitations on subimage location and size are
204    relaxed for CompressedTexSubImage2D and CompressedTexSubImage3D.
205
206    The block width and height vary for different formats, as described in
207    table 8.19. The contents of any block of texels of a compressed texture
208    image in these specific compressed formats that does not intersect the
209    area being modified are preserved during CompressedTexSubImage* calls."
210
211    Modify the second error in the Errors section for
212    CompressedTexSubImage[23]D on p. 170, and add a new error:
213
214   "An INVALID_OPERATION error is generated by CompressedTexSubImage3D if
215    <format> is one of the formats in table 8.19 and <target> is not
216    TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY, or TEXTURE_3D.
217
218    An INVALID_OPERATION error is generated by CompressedTexSubImage3D if
219    <target> is TEXTURE_CUBE_MAP_ARRAY and the "Cube Map Array" column of
220    table 8.19 is *not* checked, or if <target> is TEXTURE_3D and the "3D
221    Tex." column of table 8.19 is *not* checked."
222
223    Modify the final error in the same section, on p. 171:
224
225   "An INVALID_OPERATION error is generated if <format> is one of the formats
226    in table 8.19 and any of the following conditions occurs. The block
227    width and height refer to the values in the corresponding column of the
228    table.
229
230      * <width> is not a multiple of the format's block width, and <width> +
231        <xoffset> is not equal to the value of TEXTURE_WIDTH.
232      * <height> is not a multiple of the format's block height, and <height>
233        + <yoffset> is not equal to the value of TEXTURE_HEIGHT.
234      * <xoffset> or <yoffset> is not a multiple of the block width or
235        height, respectively."
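
    The three conditions above can be sketched as a single predicate. This
    is an illustrative Python sketch under the assumption that the caller
    already knows the texture level size and the format's block size; the
    function name subimage_region_is_valid and its parameters are
    hypothetical and are not GL API.

```python
def subimage_region_is_valid(xoffset, yoffset, width, height,
                             tex_width, tex_height, block_w, block_h):
    """Return False where the text above would generate INVALID_OPERATION."""
    # Offsets must sit on block boundaries.
    if xoffset % block_w != 0 or yoffset % block_h != 0:
        return False
    # A partial block in X is allowed only at the right edge of the texture.
    if width % block_w != 0 and xoffset + width != tex_width:
        return False
    # A partial block in Y is allowed only at the bottom edge of the texture.
    if height % block_h != 0 and yoffset + height != tex_height:
        return False
    return True
```

    For example, with an 8x6 block footprint and a 20x20 texture level, a
    region at offset (8, 6) of size 12x14 is accepted because it reaches
    both texture edges, while a region at offset (4, 0) is rejected.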
236
237    Modify table 8.24 (sRGB texture internal formats) to add all of the
238    COMPRESSED_SRGB8_ALPHA8_ASTC_*_KHR formats defined above.
239
240Additions to Appendix C of the OpenGL ES 3.1 Specification (Compressed
241Texture Image Formats)
242
243    Add a new sub-section on ASTC image formats, as follows:
244
245   "C.2 ASTC Compressed Texture Image Formats
246    =========================================
247
248    C.2.1   What is ASTC?
249    ---------------------
250
251    ASTC stands for Adaptive Scalable Texture Compression.
252    The ASTC formats form a family of related compressed texture image
253    formats. They are all derived from a common set of definitions.
254
255    ASTC textures may be encoded using either high or low dynamic range,
256    corresponding to the "HDR profile" and "LDR profile". Support for the
257    HDR profile is indicated by the "GL_KHR_texture_compression_astc_hdr"
258    extension string, and support for the LDR profile is indicated by the
259    "GL_KHR_texture_compression_astc_ldr" extension string.
260
261    The LDR profile supports two-dimensional images for texture targets
262    TEXTURE_2D, TEXTURE_2D_ARRAY, the six texture cube map face targets, and
263    TEXTURE_CUBE_MAP_ARRAY. These images may optionally be specified using
264    the sRGB color space for the RGB channels.
265
266    The HDR profile is a superset of the LDR profile, and also supports
267    texture target TEXTURE_3D for images made up of multiple two-dimensional
268    slices of compressed data. HDR images may be a mix of low and high
269    dynamic range data. If the HDR profile is supported, the LDR profile and
270    its extension string must also be supported.
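
    As an informal illustration of the profile rules above, the following
    Python sketch maps a list of extension strings to the texture targets
    that may carry ASTC data. The function name astc_targets and the use of
    plain strings for targets are assumptions made for this example. It
    does not account for implementations that lack cube map array textures.

```python
LDR_EXTENSION = "GL_KHR_texture_compression_astc_ldr"
HDR_EXTENSION = "GL_KHR_texture_compression_astc_hdr"

CUBE_FACE_TARGETS = (
    "TEXTURE_CUBE_MAP_POSITIVE_X", "TEXTURE_CUBE_MAP_NEGATIVE_X",
    "TEXTURE_CUBE_MAP_POSITIVE_Y", "TEXTURE_CUBE_MAP_NEGATIVE_Y",
    "TEXTURE_CUBE_MAP_POSITIVE_Z", "TEXTURE_CUBE_MAP_NEGATIVE_Z",
)

LDR_TARGETS = ("TEXTURE_2D", "TEXTURE_2D_ARRAY",
               "TEXTURE_CUBE_MAP_ARRAY") + CUBE_FACE_TARGETS


def astc_targets(extension_strings):
    """Return the set of texture targets usable with ASTC formats."""
    extensions = set(extension_strings)
    hdr = HDR_EXTENSION in extensions
    # The HDR profile is a superset of the LDR profile.
    ldr = hdr or LDR_EXTENSION in extensions
    targets = set()
    if ldr:
        targets.update(LDR_TARGETS)
    if hdr:
        # Only the HDR profile adds 3D textures built from 2D slices.
        targets.add("TEXTURE_3D")
    return targets
```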
271
272    ASTC textures may be encoded as 1, 2, 3 or 4 components, but they are
273    all decoded into RGBA.
274
275    Different ASTC formats have different block sizes, specified as part of
276    the name of the format token passed to CompressedTexImage2D and its related
277    functions, and in table 8.19.
278
279    Additional ASTC formats (the "Full profile") exist which support 3D data
280    specified as compressed 3D blocks. However, such formats are not defined
281    by either the LDR or HDR profiles, and are not described in this
282    specification.
283
284    C.2.2   Design Goals
285    --------------------
286
287    The design goals for the format are as follows:
288
289    * Random access. This is a must for any texture compression format.
290    * Bit exact decode. This is a must for conformance testing and
291      reproducibility.
292    * Suitable for mobile use. The format should be suitable for both
293      desktop and mobile GPU environments. It should be low bandwidth
294      and low in area.
295    * Flexible choice of bit rate. Current formats only offer a few bit
296      rates, leaving content developers with only coarse control over
297      the size/quality tradeoff.
298    * Scalable and long-lived. The format should support existing R, RG,
299      RGB and RGBA image types, and also have high "headroom", allowing
300      continuing use for several years and the ability to innovate in
301      encoders. Part of this is the choice to include HDR and 3D.
302    * Feature orthogonality. The choices for the various features of the
303      format are all orthogonal to each other. This has three effects:
304      first, it allows a large, flexible configuration space; second,
305      it makes that space easier to understand; and third, it makes
306      verification easier.
307    * Best in class at given bit rate. It should beat or match the current
308      best in class for peak signal-to-noise ratio (PSNR) at all bit rates.
309    * Fast decode. Texel throughput for a cached texture should be one
310      texel decode per clock cycle per decoder. Parallel decoding of several
311      texels from the same block should be possible at incremental cost.
312    * Low bandwidth. The encoding scheme should ensure that memory access
313      is kept to a minimum, cache reuse is high and memory bandwidth for
314      the format is low.
315    * Low area. It must occupy comparable die size to competing formats.
316
317    C.2.3   Basic Concepts
318    ----------------------
319
320    ASTC is a block-based lossy compression format. The compressed image
321    is divided into a number of blocks of uniform size, which makes it
322    possible to quickly determine which block a given texel resides in.
323
324    Each block has a fixed memory footprint of 128 bits, but these bits
325    can represent varying numbers of texels (the block "footprint").
326
327    Block footprint sizes are not confined to powers-of-two, and are
328    also not confined to be square. They may be 2D, in which case the
329    block dimensions range from 4 to 12 texels, or 3D, in which case
330    the block dimensions range from 3 to 6 texels.
331
332    Decoding one texel requires only the data from a single block. This
333    simplifies cache design, reduces bandwidth and improves encoder throughput.
334
335    C.2.4   Block Encoding
336    ----------------------
337
338    To understand how the blocks are stored and decoded, it is useful to start
339    with a simple example, and then introduce additional features.
340
341    The simplest block encoding starts by defining two color "endpoints". The
342    endpoints define two colors, and a number of additional colors are generated
343    by interpolating between them. We can define these colors using 1, 2, 3,
344    or 4 components (usually corresponding to R, RG, RGB and RGBA textures),
345    and using low or high dynamic range.
346
347    We then store a color interpolant weight for each texel in the image, which
348    specifies how to calculate the color to use. From this, a weighted average
349    of the two endpoint colors is used to generate the intermediate color,
350    which is the returned color for this texel.
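
    The idea of the weighted average can be sketched as follows. This is a
    simplified floating-point illustration only: it assumes a weight
    normalized to the range 0.0 to 1.0 and does not reproduce the bit-exact
    integer arithmetic that a conforming decoder must use. The function
    name interpolate_color is hypothetical.

```python
def interpolate_color(endpoint0, endpoint1, weight):
    """Blend two endpoint colors; weight 0.0 gives endpoint0, 1.0 gives endpoint1."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0.0, 1.0]")
    if len(endpoint0) != len(endpoint1):
        raise ValueError("endpoints must have the same number of components")
    # Each component is interpolated with the same per-texel weight.
    return tuple((1.0 - weight) * c0 + weight * c1
                 for c0, c1 in zip(endpoint0, endpoint1))
```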
351
352    There are several different ways of specifying the endpoint colors, and the
353    weights, but once they have been defined, calculation of the texel colors
354    proceeds identically for all of them. Each block is free to choose whichever
355    encoding scheme best represents its color endpoints, within the constraint
356    that all the data fits within the 128 bit block.
357
358    For blocks which have a large number of texels (e.g. a 12x12 block), there is
359    not enough space to explicitly store a weight for every texel. In this case,
360    a sparser grid with fewer weights is stored, and interpolation is used to
361    determine the effective weight to be used for each texel position. This allows
362    very low bit rates to be used with acceptable quality. This can also be used
363    to more efficiently encode blocks with low detail, or with strong vertical
364    or horizontal features.
365
366    For blocks which have a mixture of disparate colors, a single line in the
367    color space is not a good fit to the colors of the pixels in the original
368    image. It is therefore possible to partition the texels into multiple sets,
369    the pixels within each set having similar colors. For each of these
370    "partitions", we specify separate endpoint pairs, and choose which pair of
371    endpoints to use for a particular texel by looking up the partition index
372    from a partitioning pattern table. In ASTC, this partition table is actually
373    implemented as a function.
374
375    The endpoint encoding for each partition is independent.
376
377    For blocks which have uncorrelated channels - for example an image with a
378    transparency mask, or an image used as a normal map - it may be necessary
379    to specify two weights for each texel. Interpolation between the components
380    of the endpoint colors can then proceed independently for each "plane" of
381    the image. The assignment of channels to planes is selectable.
382
383    Since each of the above options is independent, it is possible to specify any
384    combination of channels, endpoint color encoding, weight encoding,
385    interpolation, multiple partitions and single or dual planes.
386
387    Since these values are specified per block, it is important that they are
388    represented with the minimum possible number of bits. As a result, these
389    values are packed together in ways which can be difficult to read, but
390    which are nevertheless highly amenable to hardware decode.
391
392    All of the values used as weights and color endpoint values can be specified
393    with a variable number of bits. The encoding scheme used allows a fine-
394    grained tradeoff between weight bits and color endpoint bits using "integer
395    sequence encoding". This can pack adjacent values together, allowing us to
396    use fractional numbers of bits per value.
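
    To see why packing adjacent values together yields fractional bits per
    value, consider values with three possible states. Stored one at a
    time, each needs 2 bits; stored as a group of five, the 243 possible
    combinations fit in 8 bits, or 1.6 bits per value. The Python sketch
    below shows generic mixed-radix packing to illustrate the arithmetic.
    It is not the ASTC bit layout, and the function names are hypothetical.

```python
def bits_needed(base, count):
    """Bits required to store `count` digits of the given base as one integer."""
    return (base ** count - 1).bit_length()


def pack_digits(digits, base):
    """Pack digits (least significant first) into a single integer."""
    value = 0
    for d in reversed(digits):
        if not 0 <= d < base:
            raise ValueError("digit out of range")
        value = value * base + d
    return value


def unpack_digits(value, base, count):
    """Inverse of pack_digits: recover `count` digits, least significant first."""
    digits = []
    for _ in range(count):
        value, d = divmod(value, base)
        digits.append(d)
    return digits
```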
397
398    Finally, a block may be just a single color. This is a so-called "void
399    extent block" and has a special coding which also allows it to identify
400    nearby regions of single color. This may be used to short-circuit fetching of
401    what would be identical blocks, and further reduce memory bandwidth.
402
403    C.2.5   LDR and HDR Modes
404    -------------------------
405
406    The decoding process for LDR content can be simplified if it is known in
407    advance that sRGB output is required. This selection is therefore included
408    as part of the global configuration.
409
410    The two modes differ in various ways.
411
412    -----------------------------------------------------------------------------
413    Operation           LDR Mode                    HDR Mode
414    -----------------------------------------------------------------------------
415    Returned value      Vector of FP16 values,      Vector of FP16 values
416                        or Vector of UNORM8 values.
417
418    sRGB compatible     Yes                         No
419
420    LDR endpoint        16 bits, or                 16 bits
421    decoding precision  8 bits for sRGB
422
423    HDR endpoint mode   Error color                 As decoded
424    results
425
426    Error results       Error color                 Vector of NaNs (0xFFFF)
427    -----------------------------------------------------------------------------
428          Table C.2.1 - Differences Between LDR and HDR Modes
429
430    The error color is opaque fully-saturated magenta
431    (R,G,B,A = 0xFF, 0x00, 0xFF, 0xFF). This has been chosen as it is much more
432    noticeable than black or white, and occurs far less often in valid images.
433
434    For linear RGB decode, the error color may be either opaque fully-saturated
435    magenta (R,G,B,A = 1.0, 0.0, 1.0, 1.0) or a vector of four NaNs
436    (R,G,B,A = NaN, NaN, NaN, NaN). In the latter case, the recommended NaN
437    value returned is 0xFFFF.
438
439    The error color is returned as an informative response to invalid
440    conditions, including invalid block encodings or use of reserved endpoint
441    modes.
442
443    Future, forward-compatible extensions to KHR_texture_compression_astc
444    may define valid interpretations of these conditions, which will decode to
445    some other color. Therefore, encoders and applications must not rely on
446    invalid encodings as a way of generating the error color.
447
448    C.2.6   Configuration Summary
449    -----------------------------
450
451    The global configuration data for the format is as follows:
452
453    *   Block dimension (always 2D for both LDR and HDR profiles)
454    *   Block footprint size
455    *   sRGB output enabled or not
456
457    The data specified per block is as follows:
458
459    *   Texel weight grid size
460    *   Texel weight range
461    *   Texel weight values
462    *   Number of partitions
463    *   Partition pattern index
464    *   Color endpoint modes (includes LDR or HDR selection)
465    *   Color endpoint data
466    *   Number of planes
467    *   Plane-to-channel assignment
468
469    C.2.7   Decode Procedure
470    ------------------------
471
472    To decode one texel:
473
474    Find block containing texel
475    Read block mode
476    If void-extent block, store void extent and immediately return single
477        color (optimization)
478
479    For each plane in image
480      If block mode requires infill
481        Find and decode stored weights adjacent to texel, unquantize and
482            interpolate
483      Else
484        Find and decode weight for texel, and unquantize
485
486    Read number of partitions
487    If number of partitions > 1
488      Read partition table pattern index
489      Look up partition number from pattern
490
491    Read color endpoint mode and endpoint data for selected partition
492    Unquantize color endpoints
493    Interpolate color endpoints using weight (or weights in dual-plane mode)
494    Return interpolated color
495
496    C.2.8   Block Determination and Bit Rates
       -----------------------------------------

497    The block footprint is a global setting for any given texture, and is
498    therefore not encoded in the individual blocks.
499
500    For 2D textures, the block footprint's width and height are selectable
501    from a number of predefined sizes, namely 4, 5, 6, 8, 10 and 12 pixels.
502
503    For square and nearly-square blocks, this gives the following bit rates:
504
505        -------------------------------------
506         Footprint
507        Width Height    Bit Rate    Increment
508        -------------------------------------
509        4     4         8.00        125%
510        5     4         6.40        125%
511        5     5         5.12        120%
512        6     5         4.27        120%
513        6     6         3.56        111%
514        8     5         3.20        120%
515        8     6         2.67        104%
516        10    5         2.56        120%
517        10    6         2.13        107%
518        8     8         2.00        125%
519        10    8         1.60        125%
520        10    10        1.28        120%
521        12    10        1.07        120%
522        12    12        0.89
523        -------------------------------------
524        Table C.2.2 - 2D Footprint and Bit Rates
525
526    The block footprint is shown as <width>x<height> in the format name. For
527    example, the format COMPRESSED_RGBA_ASTC_8x6_KHR specifies an image with
528    a block width of 8 texels, and a block height of 6 texels.
529
530    The "Increment" column indicates the ratio of bit rate against the next
531    lower available rate. A consistent value in this column indicates an even
532    spread of bit rates.
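
    The bit rates in Table C.2.2 follow directly from the fixed 128-bit
    block size divided by the number of texels in the footprint. A minimal
    Python sketch, with the hypothetical helper name bit_rate:

```python
BLOCK_BITS = 128  # every ASTC block occupies 128 bits


def bit_rate(block_w, block_h):
    """Bits per texel for a 2D block footprint of block_w x block_h texels."""
    if block_w <= 0 or block_h <= 0:
        raise ValueError("block dimensions must be positive")
    return BLOCK_BITS / (block_w * block_h)
```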
533
534    The HDR profile supports only those block footprints listed in Table
535    C.2.2. Other block sizes are not supported.
536
537    For images which are not an integer multiple of the block size, additional
538    texels are added to the edges with maximum X and Y. These texels may be
539    any color, as they will not be accessed.
540
541    Although these are not all powers of two, it is possible to calculate block
542    addresses and pixel addresses within the block, for legal image sizes,
543    without undue complexity.
544
545    Given a 2D image which is W x H pixels in size, with block size
546    w x h, the size of the image in blocks is:
547
548        Bw = ceiling(W/w)
549        Bh = ceiling(H/h)
550
551    For a 3D image, each 2D slice is a single texel thick, so that for an
552    image which is W x H x D pixels in size, with block size w x h, the size
553    of the image in blocks is:
554
555        Bw = ceiling(W/w)
556        Bh = ceiling(H/h)
557        Bd = D
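
    The formulas above also give the size of the compressed data, since
    each block occupies 128 bits (16 bytes). The following Python sketch is
    illustrative; the function name astc_image_size and its return layout
    are assumptions made for this example.

```python
BLOCK_BYTES = 16  # 128 bits per block


def astc_image_size(width, height, block_w, block_h, depth=1):
    """Return (Bw, Bh, Bd, total_bytes) for a W x H x D image."""
    bw = -(-width // block_w)    # ceiling(W/w) using integer arithmetic
    bh = -(-height // block_h)   # ceiling(H/h)
    bd = depth                   # each 2D slice is one texel thick
    return bw, bh, bd, bw * bh * bd * BLOCK_BYTES
```

    For example, a 100 x 50 image with a 12x10 footprint needs 9 x 5
    blocks, or 720 bytes.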
558
559    C.2.9   Block Layout
560    --------------------
561
562    Each block in the image is stored as a single 128-bit block in memory. These
563    blocks are laid out in raster order, starting with the block at (0,0,0), then
564    ordered sequentially by X, Y and finally Z (if present). They are aligned to
565    128-bit boundaries in memory.
566
567    The bits in the block are labeled in little-endian order - the byte at the
568    lowest address contains bits 0..7. Bit 0 is the least significant bit in the
569    byte.
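
    The raster ordering and little-endian bit labeling can be sketched as
    follows. This Python example assumes the compressed data is held in a
    bytes object; block_offset and block_bits are hypothetical helper
    names.

```python
BLOCK_BYTES = 16  # 128 bits per block


def block_offset(bx, by, bz, blocks_w, blocks_h):
    """Byte offset of block (bx, by, bz): ordered by X, then Y, then Z."""
    return ((bz * blocks_h + by) * blocks_w + bx) * BLOCK_BYTES


def block_bits(block, lowest_bit, count):
    """Extract `count` bits starting at bit `lowest_bit` of a 16-byte block."""
    if len(block) != BLOCK_BYTES:
        raise ValueError("an ASTC block is exactly 16 bytes")
    # The byte at the lowest address holds bits 0..7, so the block is a
    # little-endian 128-bit integer.
    value = int.from_bytes(block, "little")
    return (value >> lowest_bit) & ((1 << count) - 1)
```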
570
571    Each block has the same basic layout, as shown in figure C.1.
572
573    127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
574     --------------------------------------------------------------
575    | Texel Weight Data (variable width)        Fill direction ->
576     --------------------------------------------------------------
577
578    111 110 109 108 107 106 105 104 103 102 101 100 99  98  97  96
579     --------------------------------------------------------------
580                            Texel Weight Data
581     --------------------------------------------------------------
582
583    95  94  93  92  91  90  89  88  87  86  85  84  83  82  81  80
584     --------------------------------------------------------------
585                            Texel Weight Data
586     --------------------------------------------------------------
587
588    79  78  77  76  75  74  73  72  71  70  69  68  67  66  65  64
589     --------------------------------------------------------------
590                            Texel Weight Data
591     --------------------------------------------------------------
592
593    63  62  61  60  59  58  57  56  55  54  53  52  51  50  49  48
594     --------------------------------------------------------------
595                       :    More config data   :
596     --------------------------------------------------------------
597
598    47  46  45  44  43  42  41  40  39  38  37  36  35  34  33  32
599     --------------------------------------------------------------
600          <-Fill direction              Color  Endpoint Data
601     --------------------------------------------------------------
602
603    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
604     --------------------------------------------------------------
605                   :     Extra configuration data
606     --------------------------------------------------------------
607
608    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
609     --------------------------------------------------------------
610       Extra  | Part  | Block mode                                 |
611     --------------------------------------------------------------
612
613        Figure C.1 - Block Layout Overview
614
615    Dotted partition lines indicate that the split position is not fixed.
616
617    The "Block mode" field specifies how the Texel Weight Data is encoded.
618
619    The "Part" field specifies the number of partitions, minus one. If dual
620    plane mode is enabled, the number of partitions must be 3 or fewer.
621    If 4 partitions are specified, the error value is returned for all
622    texels in the block.
623
624    The size and layout of the extra configuration data depends on the
625    number of partitions, and the number of planes in the image, as shown in
626    figures C.2 and C.3 (only the bottom 32 bits are shown):
627
628    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
629    --------------------------------------------------------------
630    <- Color endpoint data                                    |CEM
631    --------------------------------------------------------------
632
633    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
634    --------------------------------------------------------------
635      CEM     | 0   0 |              Block Mode                   |
636    --------------------------------------------------------------
637
638        Figure C.2 - Single-partition Block Layout
639
640    CEM is the color endpoint mode field, which determines how the Color
641    Endpoint Data is encoded.
642
643    If dual-plane mode is active, the color component selector bits appear
644    directly below the weight bits.
645
646    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
647    --------------------------------------------------------------
648              |         CEM           |     Partition Index
649    --------------------------------------------------------------
650
651    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
652    --------------------------------------------------------------
653      Partition Index |              Block Mode                   |
654    --------------------------------------------------------------
655
656        Figure C.3 - Multi-partition Block Layout
657
658    The Partition Index field specifies which partition layout to use. CEM is
659    the first 6 bits of color endpoint mode information for the various
660    partitions. For modes which require more than 6 bits of CEM data, the
661    additional bits appear at a variable position directly beneath the texel
662    weight data.
663
664    If dual-plane mode is active, the color component selector bits then appear
665    directly below the additional CEM bits.
666
667    The final special case is that if bits [8:0] of the block are "111111100",
668    then the block is a void-extent block, which has a separate encoding
669    described in section C.2.22.
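
    As an illustration, the void-extent test and the "Part" field can be
    read from a block as shown below. This Python sketch assumes the "Part"
    field occupies bits 12:11, as read off figure C.1 (the Block Mode field
    occupying bits 10:0); the function names are hypothetical.

```python
VOID_EXTENT_PATTERN = 0b111111100  # bits [8:0] of a void-extent block


def is_void_extent(block):
    """True if bits [8:0] of the 16-byte block mark a void-extent block."""
    if len(block) != 16:
        raise ValueError("an ASTC block is exactly 16 bytes")
    value = int.from_bytes(block, "little")
    return (value & 0x1FF) == VOID_EXTENT_PATTERN


def partition_count(block):
    """Number of partitions: the 2-bit "Part" field (assumed bits 12:11) plus one.

    Only meaningful for blocks that are not void-extent blocks.
    """
    if len(block) != 16:
        raise ValueError("an ASTC block is exactly 16 bytes")
    value = int.from_bytes(block, "little")
    return ((value >> 11) & 0x3) + 1
```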
670
671    C.2.10  Block Mode
672    ------------------
673
674    The Block Mode field specifies the width, height and depth of the grid of
675    weights, what range of values they use, and whether dual weight planes are
676    present. Since some of these are not represented using powers of two (there
677    are 12 possible weight widths, for example), and not all combinations are
678    allowed, this is not a simple bit packing. However, it can be unpacked
679    quickly in hardware.
680
681    The weight ranges are encoded using a 3 bit value R, which is interpreted
682    together with a precision bit H, as follows:
683
684        Low Precision Range (H=0)           High Precision Range (H=1)
685    R   Weight Range  Trits  Quints  Bits   Weight Range  Trits  Quints  Bits
686    -------------------------------------------------------------------------
687    000 Invalid                             Invalid
688    001 Invalid                             Invalid
689    010 0..1                          1     0..9                   1      1
690    011 0..2            1                   0..11           1             2
691    100 0..3                          2     0..15                         4
692    101 0..4                   1            0..19                  1      2
693    110 0..5            1             1     0..23           1             3
694    111 0..7                          3     0..31                         5
695    -------------------------------------------------------------------------
696    Table C.2.7 - Weight Range Encodings
697
698    Each weight value is encoded using the specified number of Trits, Quints
699    and Bits. The details of this encoding can be found in Section C.2.12 -
700    Integer Sequence Encoding.
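
Table C.2.7 can be read as a direct lookup indexed by H and R. The following is a minimal C sketch of that lookup, not normative text; the type weight_range and the function astc_weight_range are illustrative names that do not appear in this specification, and the max field is derived here as (3^Trits * 5^Quints * 2^Bits) - 1.

```c
/* One entry of Table C.2.7. */
typedef struct {
    int trits;   /* 1 if the weight uses a trit */
    int quints;  /* 1 if the weight uses a quint */
    int bits;    /* number of plain bits per weight */
    int max;     /* largest weight value, or -1 for an invalid encoding */
} weight_range;

/* r is the 3-bit R value (R2 R1 R0); h is the precision bit H. */
weight_range astc_weight_range(unsigned r, unsigned h)
{
    static const weight_range table[2][8] = {
        /* H = 0: low precision ranges */
        { {0,0,0,-1}, {0,0,0,-1}, {0,0,1,1},  {1,0,0,2},
          {0,0,2,3},  {0,1,0,4},  {1,0,1,5},  {0,0,3,7} },
        /* H = 1: high precision ranges */
        { {0,0,0,-1}, {0,0,0,-1}, {0,1,1,9},  {1,0,2,11},
          {0,0,4,15}, {0,1,2,19}, {1,0,3,23}, {0,0,5,31} }
    };
    return table[h & 1][r & 7];
}
```

For example, R=110 with H=1 selects the range 0..23, stored as one trit plus three bits.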
701
702    For 2D blocks, the Block Mode field is laid out as follows:
703
704    -------------------------------------------------------------------------
705    10  9   8   7   6   5   4   3   2   1   0   Width Height Notes
706    -------------------------------------------------------------------------
707    D   H     B       A     R0  0   0   R2  R1  B+4   A+2
708    D   H     B       A     R0  0   1   R2  R1  B+8   A+2
709    D   H     B       A     R0  1   0   R2  R1  A+2   B+8
710    D   H   0   B     A     R0  1   1   R2  R1  A+2   B+6
711    D   H   1   B     A     R0  1   1   R2  R1  B+2   A+2
712    D   H   0   0     A     R0  R2  R1  0   0   12    A+2
713    D   H   0   1     A     R0  R2  R1  0   0   A+2   12
714    D   H   1   1   0   0   R0  R2  R1  0   0   6     10
715    D   H   1   1   0   1   R0  R2  R1  0   0   10    6
716      B     1   0     A     R0  R2  R1  0   0   A+6   B+6   D=0, H=0
717    x   x   1   1   1   1   1   1   1   0   0   -     -     Void-extent
718    x   x   1   1   1   x   x   x   x   0   0   -     -     Reserved*
719    x   x   x   x   x   x   x   0   0   0   0   -     -     Reserved
720    -------------------------------------------------------------------------
721    Table C.2.8 - 2D Block Mode Layout
722
723    Note that, due to the encoding of the R field shown in Table C.2.7,
724    bits R2 and R1 cannot both be zero, which disambiguates
725    the first five rows from the rest of the table.
726
727    Bit positions with a value of x are ignored for purposes of determining
728    if a block is a void-extent block or reserved, but may have defined
729    encodings for specific void-extent blocks.
730
731    The penultimate row of the table is reserved only if bits [5:2] are not
732    all 1. If they are all 1, the block is instead a void-extent block (as
733    shown in the previous row).
734
735    The D bit is set to indicate dual-plane mode. In this mode, the maximum
736    allowed number of partitions is 3.
737
742    The size of the grid in each dimension must be less than or equal to
743    the corresponding dimension of the block footprint. If the grid size
744    is greater than the footprint dimension in any axis, then this is an
745    illegal block encoding and all texels will decode to the error color.
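
The rows of Table C.2.8 can be unpacked with a few shifts and masks. The sketch below is one possible C reading of the table and is not normative; block_mode_2d and decode_block_mode_2d are illustrative names. It takes the 11-bit Block Mode field and does not perform the footprint check described above, which needs the block dimensions.

```c
typedef struct {
    int valid;          /* 0 for reserved and void-extent encodings */
    int width, height;  /* weight grid size */
    int dual;           /* D bit */
    int high;           /* H bit */
    int r;              /* 3-bit R value (R2 R1 R0) */
} block_mode_2d;

block_mode_2d decode_block_mode_2d(unsigned bm)
{
    block_mode_2d m = { 0, 0, 0, 0, 0, 0 };
    int a  = (int)((bm >> 5) & 3);   /* A: bits [6:5] */
    int b  = (int)((bm >> 7) & 3);   /* bits [8:7] */
    int r0 = (int)((bm >> 4) & 1);   /* R0 is bit 4 in every row */
    int w, h;

    m.dual = (int)((bm >> 10) & 1);
    m.high = (int)((bm >> 9) & 1);

    if ((bm & 3) != 0) {
        /* First five rows: R2 R1 are bits [1:0]. */
        m.r = (int)((bm & 3) << 1) | r0;
        switch ((bm >> 2) & 3) {
        case 0:  w = b + 4; h = a + 2; break;
        case 1:  w = b + 8; h = a + 2; break;
        case 2:  w = a + 2; h = b + 8; break;
        default: /* bit 8 selects the row, bit 7 is a one-bit B */
            if (b & 2) { w = (b & 1) + 2; h = a + 2; }
            else       { w = a + 2; h = (b & 1) + 6; }
            break;
        }
    } else {
        /* Remaining rows: R2 R1 are bits [3:2]. */
        if ((bm & 0xC) == 0) return m;          /* last row: reserved */
        m.r = (int)(((bm >> 2) & 3) << 1) | r0;
        switch (b) {
        case 0:  w = 12; h = a + 2; break;
        case 1:  w = a + 2; h = 12; break;
        case 2:  /* B is bits [10:9]; D and H are implicitly 0 */
            w = a + 6; h = (int)((bm >> 9) & 3) + 6;
            m.dual = 0; m.high = 0;
            break;
        default:
            if (a & 2) return m;                /* void-extent or reserved */
            if (a & 1) { w = 10; h = 6; } else { w = 6; h = 10; }
            break;
        }
    }
    m.width = w; m.height = h; m.valid = 1;
    return m;
}
```

When valid is 0 the remaining fields carry no meaning. A caller would still compare width and height against the block footprint before using the result.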
746
747    C.2.11  Color Endpoint Mode
748    ---------------------------
749
750    In single-partition mode, the Color Endpoint Mode (CEM) field stores one
751    of 16 possible values. Each of these specifies how many raw data values
752    are encoded, and how to convert these raw values into two RGBA color
753    endpoints. They can be summarized as follows:
754
755    ---------------------------------------------
756    CEM Description                         Class
757    ---------------------------------------------
758    0   LDR Luminance, direct               0
759    1   LDR Luminance, base+offset          0
760    2   HDR Luminance, large range          0
761    3   HDR Luminance, small range          0
762    4   LDR Luminance+Alpha, direct         1
763    5   LDR Luminance+Alpha, base+offset    1
764    6   LDR RGB, base+scale                 1
765    7   HDR RGB, base+scale                 1
766    8   LDR RGB, direct                     2
767    9   LDR RGB, base+offset                2
768    10  LDR RGB, base+scale plus two A      2
769    11  HDR RGB, direct                     2
770    12  LDR RGBA, direct                    3
771    13  LDR RGBA, base+offset               3
772    14  HDR RGB, direct + LDR Alpha         3
773    15  HDR RGB, direct + HDR Alpha         3
774    ---------------------------------------------
775    Table C.2.10 - Color Endpoint Modes.
776      [[ If the HDR profile is not implemented, remove from table C.2.10
777         all rows whose description starts with "HDR", and add to the
778         caption: ]]
779    Modes not described in the CEM column are reserved for HDR modes, and
780    will generate errors in an unextended OpenGL ES implementation.
781
782    In multi-partition mode, the CEM field is of variable width, from 6 to 14
783    bits. The lowest 2 bits of the CEM field specify how the endpoint mode
784    for each partition is calculated:
785
786    ----------------------------------------------------
787    Value   Meaning
788    ----------------------------------------------------
789    00  All color endpoint pairs are of the same type.
790        A full 4-bit CEM is stored in block bits [28:25]
791        and is used for all partitions.
792    01  All endpoint pairs are of class 0 or 1.
793    10  All endpoint pairs are of class 1 or 2.
794    11  All endpoint pairs are of class 2 or 3.
795    ----------------------------------------------------
796    Table C.2.11 - Multi-Partition Color Endpoint Modes
797
798    If the CEM selector value in bits [24:23] is not 00,
799    then data layout is as follows:
800
801    ---------------------------------------------------
802    Part            n   m   l   k   j   i   h   g
803            ------------------------------------------
804    2   ... Weight :  M1   :                         ...
805            ------------------------------------------
806    3   ... Weight :  M2   :  M1   :M0 :             ...
807            ------------------------------------------
808    4   ... Weight :  M3   :  M2   :  M1   :  M0   : ...
809            ------------------------------------------
810
811    Part    28  27  26  25  24  23
812            ----------------------
813    2      |  M0   |C1 |C0 | CEM  |
814            ----------------------
815    3      |M0 |C2 |C1 |C0 | CEM  |
816            ----------------------
817    4      |C3 |C2 |C1 |C0 | CEM  |
818            ----------------------
819    ---------------------------------------------------
820    Figure C.4 - Multi-Partition Color Endpoint Modes
821
822    In this view, each partition i has two fields. C<i> is the class
823    selector bit, choosing between the two possible CEM classes (0 indicates
824    the lower of the two classes), and M<i> is a two-bit field specifying
825    the low bits of the color endpoint mode within that class. The
826    additional bits appear at a variable bit position, immediately below the
827    texel weight data.
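
As an illustration of Figure C.4, the per-partition CEM can be recovered by treating the 6-bit CEM field and the additional bits as one little-endian bit string. This is a sketch under stated assumptions, not normative text: the function name astc_partition_cem is illustrative, the additional bits are assumed to have been extracted already with their lowest bit in bit 0, and the final CEM is formed as (class * 4) + M<i>, following the Class column of Table C.2.10.

```c
/*
 * cem6:       the 6-bit CEM field from block bits [28:23]
 * extra:      the additional CEM bits from below the weight data
 *             (2, 5 or 8 bits for 2, 3 or 4 partitions)
 * partitions: number of partitions (2, 3 or 4)
 * part:       index of the partition of interest
 * Returns the 4-bit color endpoint mode for that partition.
 */
int astc_partition_cem(unsigned cem6, unsigned extra, int partitions, int part)
{
    unsigned all = (cem6 & 0x3F) | (extra << 6);
    unsigned selector = all & 3;
    unsigned c, m;

    if (selector == 0)
        return (int)((all >> 2) & 0xF);   /* one CEM shared by all partitions */

    c = (all >> (2 + part)) & 1;                    /* class selector C<i> */
    m = (all >> (2 + partitions + 2 * part)) & 3;   /* mode bits M<i> */
    return (int)((((selector - 1) + c) << 2) | m);
}
```

For two partitions with selector 01, C0=1, M0=2, C1=0 and M1=3, partition 0 decodes to CEM 6 and partition 1 to CEM 3.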
828
829    The ranges used for the data values are not explicitly specified.
830    Instead, they are derived from the number of available bits remaining
831    after the configuration data and weight data have been specified.
832
833    Details of the decoding procedure for Color Endpoints can be found in
834    section C.2.13.
835
836    C.2.12  Integer Sequence Encoding
837    ---------------------------------
838
839    Both the weight data and the endpoint color data are variable width, and
840    are specified using a sequence of integer values. The range of each
841    value in a sequence (e.g. a color weight) is constrained.
842
843    Since it is often the case that the most efficient range for these
844    values is not a power of two, each value sequence is encoded using a
845    technique known as "integer sequence encoding". This allows efficient,
846    hardware-friendly packing and unpacking of values with non-power-of-two
847    ranges.
848
849    In a sequence, each value has an identical range. The range is specified
850    in one of the following forms:
851
852    Value range      MSB encoding   LSB encoding Value       Block Packed
853                                                                   block size
854    -----------      ------------   ------------ ----------- ----- ----------
855    0 .. 2^n-1       -              n bit value  m           1     n
856                                    m (n <= 8)
857    0 .. (3 * 2^n)-1 Base-3 "trit"  n bit value  t * 2^n + m 5     8 + 5*n
858                     value t        m (n <= 6)
859    0 .. (5 * 2^n)-1 Base-5 "quint" n bit value  q * 2^n + m 3     7 + 3*n
860                     value q        m (n <= 5)
861    -------------------------------------------
862    Table C.2.13 - Encoding for Different Ranges
863
864    Since 3^5 is 243, it is possible to pack five trits into 8 bits (which has
865    256 possible values), so a trit can effectively be encoded as 1.6 bits.
866    Similarly, since 5^3 is 125, it is possible to pack three quints into
867    7 bits (which has 128 possible values), so a quint can be encoded as
868    2.33 bits.
869
870    The encoding scheme packs the trits or quints, and then interleaves the n
871    additional bits in positions that satisfy the requirements of an
872    arbitrary length stream. This makes it possible to correctly specify
873    lists of values whose length is not an integer multiple of 3 or 5 values.
874    It also allows random access to any individual value within the stream.
875
876    If there are insufficient bits in the stream to fill the final block, then
877    unused (higher order) bits are assumed to be 0 when decoding.
878
879    To decode the bits for value number i in a sequence of bits b, both
880    indexed from 0, perform the following:
881
882    If the range is encoded as n bits per value, then the value is bits
883    b[i*n+n-1:i*n] - a simple multiplexing operation.
884
885    If the range is encoded using a trit, then each block contains 5 values
886    (v0 to v4), each of which contains a trit (t0 to t4) and a corresponding
887    LSB value (m0 to m4). The first bit of the packed block is bit
888    floor(i/5)*(8+5*n). The bits in the block are packed as follows
889    (in this example, n is 4):
890
891                    27  26  25  24  23  22  21  20  19  18  17  16
892                    -----------------------------------------------
893                   |T7 |     m4        |T6  T5 |     m3        |T4 |
894                    -----------------------------------------------
895
896    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
897     --------------------------------------------------------------
898    |    m2        |T3  T2 |      m1       |T1  T0 |      m0       |
899     --------------------------------------------------------------
900
901    Figure C.5 - Trit-based Packing
902
903    The five trits t0 to t4 are obtained by bit manipulations of the 8 bits
904    T[7:0] as follows:
905
906        if T[4:2] = 111
907            C = { T[7:5], T[1:0] }; t4 = t3 = 2
908        else
909            C = T[4:0]
910            if T[6:5] = 11
911                t4 = 2; t3 = T[7]
912            else
913                t4 = T[7]; t3 = T[6:5]
914
915        if C[1:0] = 11
916            t2 = 2; t1 = C[4]; t0 = { C[3], C[2]&~C[3] }
917        else if C[3:2] = 11
918            t2 = 2; t1 = 2; t0 = C[1:0]
919        else
920            t2 = C[4]; t1 = C[3:2]; t0 = { C[1], C[0]&~C[1] }
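
The trit unpacking above translates almost line for line into C. The following is a minimal sketch rather than a normative procedure; astc_unpack_trit is an illustrative name, and the caller is assumed to have gathered the eight bits T[7:0] from the positions shown in Figure C.5.

```c
/* Returns trit k (0..4) unpacked from the packed bits T[7:0]. */
int astc_unpack_trit(unsigned T, int k)
{
    int t[5];
    unsigned C, hi, lo;

    if (((T >> 2) & 7) == 7) {                /* T[4:2] = 111 */
        C = (((T >> 5) & 7) << 2) | (T & 3);  /* { T[7:5], T[1:0] } */
        t[4] = 2; t[3] = 2;
    } else {
        C = T & 0x1F;                         /* T[4:0] */
        if (((T >> 5) & 3) == 3) {            /* T[6:5] = 11 */
            t[4] = 2; t[3] = (int)((T >> 7) & 1);
        } else {
            t[4] = (int)((T >> 7) & 1); t[3] = (int)((T >> 5) & 3);
        }
    }

    if ((C & 3) == 3) {                       /* C[1:0] = 11 */
        hi = (C >> 3) & 1; lo = (C >> 2) & 1;
        t[2] = 2; t[1] = (int)((C >> 4) & 1);
        t[0] = (int)((hi << 1) | (lo & ~hi)); /* { C[3], C[2]&~C[3] } */
    } else if (((C >> 2) & 3) == 3) {         /* C[3:2] = 11 */
        t[2] = 2; t[1] = 2; t[0] = (int)(C & 3);
    } else {
        hi = (C >> 1) & 1; lo = C & 1;
        t[2] = (int)((C >> 4) & 1); t[1] = (int)((C >> 2) & 3);
        t[0] = (int)((hi << 1) | (lo & ~hi)); /* { C[1], C[0]&~C[1] } */
    }
    return t[k];
}
```

Every one of the 256 possible inputs yields trits in the range 0..2. A software decoder would more typically precompute all 256 results into a table.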
921
922    If the range is encoded using a quint, then each block contains 3 values
923    (v0 to v2), each of which contains a quint (q0 to q2) and a corresponding
924    LSB value (m0 to m2). The first bit of the packed block is bit
925    floor(i/3)*(7+3*n).
926
927    The bits in the block are packed as follows (in this example, n is 4):
928
929                                                        18  17  16
930                                                        -----------
931                                                       |Q6  Q5 | m2
932                                                        -----------
933    15  14  13  12  11  10  9   8   7   6   5   4   3   2   1   0
934    ---------------------------------------------------------------
935      m2       |Q4  Q3 |     m1        |Q2  Q1  Q0 |      m0       |
936    ---------------------------------------------------------------
937
938    Figure C.6 - Quint-based Packing
939
940    The three quints q0 to q2 are obtained by bit manipulations of the 7 bits
941    Q[6:0] as follows:
942
943        if Q[2:1] = 11 and Q[6:5] = 00
944            q2 = { Q[0], Q[4]&~Q[0], Q[3]&~Q[0] }; q1 = q0 = 4
945        else
946            if Q[2:1] = 11
947                q2 = 4; C = { Q[4:3], ~Q[6:5], Q[0] }
948            else
949                q2 = Q[6:5]; C = Q[4:0]
950
951            if C[2:0] = 101
952                q1 = 4; q0 = C[4:3]
953            else
954                q1 = C[4:3];    q0 = C[2:0]
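
The quint unpacking can be sketched in C in the same way. This is illustrative only; astc_unpack_quint is not a name used by this specification, and the seven bits Q[6:0] are assumed to have been gathered from the positions shown in Figure C.6.

```c
/* Returns quint k (0..2) unpacked from the packed bits Q[6:0]. */
int astc_unpack_quint(unsigned Q, int k)
{
    int q[3];
    unsigned C, b0;

    if (((Q >> 1) & 3) == 3 && ((Q >> 5) & 3) == 0) {
        /* Q[2:1] = 11 and Q[6:5] = 00 */
        b0 = Q & 1;
        q[2] = (int)((b0 << 2)
                     | ((((Q >> 4) & 1) & ~b0) << 1)
                     | (((Q >> 3) & 1) & ~b0));
        q[1] = 4; q[0] = 4;
    } else {
        if (((Q >> 1) & 3) == 3) {
            /* C = { Q[4:3], ~Q[6:5], Q[0] } */
            q[2] = 4;
            C = (((Q >> 3) & 3) << 3) | ((~(Q >> 5) & 3) << 1) | (Q & 1);
        } else {
            q[2] = (int)((Q >> 5) & 3);
            C = Q & 0x1F;
        }
        if ((C & 7) == 5) {                   /* C[2:0] = 101 */
            q[1] = 4; q[0] = (int)((C >> 3) & 3);
        } else {
            q[1] = (int)((C >> 3) & 3); q[0] = (int)(C & 7);
        }
    }
    return q[k];
}
```

For example, Q = 0000110 decodes to q0 = q1 = 4 and q2 = 0.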
955
956    Both these procedures ensure a valid decoding for all possible inputs
957    (256 values of T, 128 values of Q), even though a few are duplicates.
958    They can also be implemented efficiently in software using small tables.
959
960    Encoding methods are not specified here, although table-based mechanisms
961    work well.
962
963    C.2.13  Endpoint Unquantization
964    -------------------------------
965
966    Each color endpoint is specified as a sequence of integers in a given
967    range. These values are packed using integer sequence encoding, as a
968    stream of bits stored from just above the configuration data, and
969    growing upwards.
970
971    Once unpacked, the values must be unquantized from their storage range,
972    returning them to a standard range of 0..255.
973
974    For bit-only representations, this is simple bit replication from the
975    most significant bit of the value.
976
977    For trit or quint-based representations, this involves a set of bit
978    manipulations and adjustments to avoid the expense of full-width
979    multipliers. This procedure ensures correct scaling, but scrambles
980    the order of the decoded values relative to the encoded values.
981    This must be compensated for using a table in the encoder.
982
983    The initial inputs to the procedure are denoted A (9 bits), B (9 bits),
984    C (9 bits) and D (3 bits) and are decoded using the range as follows:
985
986    ---------------------------------------------------------------
987    Range   T Q B   Bits    A           B           C   D
988    ---------------------------------------------------------------
989    0..5    1   1   a       aaaaaaaaa   000000000   204 Trit value
990    0..9      1 1   a       aaaaaaaaa   000000000   113 Quint value
991    0..11   1   2   ba      aaaaaaaaa   b000b0bb0   93  Trit value
992    0..19     1 2   ba      aaaaaaaaa   b0000bb00   54  Quint value
993    0..23   1   3   cba     aaaaaaaaa   cb000cbcb   44  Trit value
994    0..39     1 3   cba     aaaaaaaaa   cb0000cbc   26  Quint value
995    0..47   1   4   dcba    aaaaaaaaa   dcb000dcb   22  Trit value
996    0..79     1 4   dcba    aaaaaaaaa   dcb0000dc   13  Quint value
997    0..95   1   5   edcba   aaaaaaaaa   edcb000ed   11  Trit value
998    0..159    1 5   edcba   aaaaaaaaa   edcb0000e   6   Quint value
999    0..191  1   6   fedcba  aaaaaaaaa   fedcb000f   5   Trit value
1000    ---------------------------------------------------------------
1001    Table C.2.16 - Color Unquantization Parameters
1002
1003    These are then processed as follows:
1004
1005        T = D * C + B;
1006        T = T ^ A;
1007        T = (A & 0x80) | (T >> 2);
1008
1009    Note that the multiply in the first line is nearly trivial as it only
1010    needs to multiply by 0, 1, 2, 3 or 4.
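
The following C sketch works through the unquantization for two simple cases: bit-only values, and the range 0..5 (one trit plus one bit) from the first row of Table C.2.16. It is illustrative rather than normative, and the function names are not part of this specification. The other table rows differ only in how B and C are formed.

```c
/* Bit replication of an n-bit value (1 <= n <= 8) into 8 bits. */
int astc_unquant_bits(unsigned v, int n)
{
    unsigned top = v << (8 - n);   /* align the value's MSB to bit 7 */
    unsigned out = 0;
    int s;
    for (s = 0; s < 8; s += n)
        out |= top >> s;           /* repeat the pattern below itself */
    return (int)(out & 0xFF);
}

/* The three processing steps, given A, B, C and D from Table C.2.16. */
int astc_unquant_abcd(unsigned A, unsigned B, unsigned C, unsigned D)
{
    unsigned T = D * C + B;
    T = T ^ A;
    return (int)((A & 0x80) | (T >> 2));
}

/* Range 0..5: A replicates bit a nine times, B is zero, C is 204. */
int astc_unquant_range_0_5(unsigned trit, unsigned a)
{
    unsigned A = (a & 1) ? 0x1FF : 0;
    return astc_unquant_abcd(A, 0, 204, trit);
}
```

For the range 0..5 the six (trit, a) pairs produce 0, 51, 102, 153, 204 and 255, but not in encoded order: (0, 1) yields 255 while (2, 1) yields 153. This is the scrambling that the encoder must compensate for.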
1011
1012    C.2.14  LDR Endpoint Decoding
1013    -----------------------------
1014    The decoding method used depends on the Color Endpoint Mode (CEM) field,
1015    which specifies how many values are used to represent the endpoint.
1016
1017    The CEM field also specifies how to take the n unquantized color endpoint
1018    values v0 to v[n-1] and convert them into two RGBA color endpoints e0
1019    and e1.
1020
1021    The HDR modes are more complex and do not fit neatly into this section.
1022    They are documented in the following section.
1023
1024    The methods can be summarized as follows.
1025
1026    -------------------------------------------------
1027    CEM Description                                 n
1028    -------------------------------------------------
1029    0   LDR Luminance, direct                       2
1030    1   LDR Luminance, base+offset                  2
1031    2   HDR Luminance, large range                  2
1032    3   HDR Luminance, small range                  2
1033    4   LDR Luminance+Alpha, direct                 4
1034    5   LDR Luminance+Alpha, base+offset            4
1035    6   LDR RGB, base+scale                         4
1036    7   HDR RGB, base+scale                         4
1037    8   LDR RGB, direct                             6
1038    9   LDR RGB, base+offset                        6
1039    10  LDR RGB, base+scale plus two A              6
1040    11  HDR RGB                                     6
1041    12  LDR RGBA, direct                            8
1042    13  LDR RGBA, base+offset                       8
1043    14  HDR RGB + LDR Alpha                         8
1044    15  HDR RGB + HDR Alpha                         8
1045    -------------------------------------------------
1046    Table C.2.17 - Color Endpoint Modes
1047      [[ If the HDR profile is not implemented, remove from table C.2.17
1048         all rows whose description starts with "HDR", and add to the
1049         caption: ]]
1050    Modes not described are reserved, as described in table C.2.10.
1051
1052      [[ HDR profile only ]]
1053    Mode 14 is special in that the alpha values are interpolated linearly,
1054    but the color components are interpolated logarithmically. This is the
1055    only endpoint format with mixed-mode operation, and will return the
1056    error value if encountered in LDR mode.
1057
1058    Decode the different LDR endpoint modes as follows:
1059
1060    Mode 0  LDR Luminance, direct
1061
1062        e0=(v0,v0,v0,0xFF); e1=(v1,v1,v1,0xFF);
1063
1064    Mode 1  LDR Luminance, base+offset
1065
1066        L0 = (v0>>2)|(v1&0xC0); L1=L0+(v1&0x3F);
1067        if (L1>0xFF) { L1=0xFF; }
1068        e0=(L0,L0,L0,0xFF); e1=(L1,L1,L1,0xFF);
1069
1070    Mode 4  LDR Luminance+Alpha, direct
1071
1072        e0=(v0,v0,v0,v2);
1073        e1=(v1,v1,v1,v3);
1074
1075    Mode 5  LDR Luminance+Alpha, base+offset
1076
1077        bit_transfer_signed(v1,v0); bit_transfer_signed(v3,v2);
1078        e0=(v0,v0,v0,v2); e1=(v0+v1,v0+v1,v0+v1,v2+v3);
1079        clamp_unorm8(e0); clamp_unorm8(e1);
1080
1081    Mode 6  LDR RGB, base+scale
1082
1083        e0=(v0*v3>>8,v1*v3>>8,v2*v3>>8, 0xFF);
1084        e1=(v0,v1,v2,0xFF);
1085
1086    Mode 8  LDR RGB, direct
1087
1088        s0= v0+v2+v4; s1= v1+v3+v5;
1089        if (s1>=s0){e0=(v0,v2,v4,0xFF);
1090                    e1=(v1,v3,v5,0xFF); }
1091        else { e0=blue_contract(v1,v3,v5,0xFF);
1092               e1=blue_contract(v0,v2,v4,0xFF); }
1093
1094    Mode 9  LDR RGB, base+offset
1095
1096        bit_transfer_signed(v1,v0);
1097        bit_transfer_signed(v3,v2);
1098        bit_transfer_signed(v5,v4);
1099        if(v1+v3+v5 >= 0)
1100        { e0=(v0,v2,v4,0xFF); e1=(v0+v1,v2+v3,v4+v5,0xFF); }
1101        else
1102        { e0=blue_contract(v0+v1,v2+v3,v4+v5,0xFF);
1103          e1=blue_contract(v0,v2,v4,0xFF); }
1104        clamp_unorm8(e0); clamp_unorm8(e1);
1105
1106    Mode 10 LDR RGB, base+scale plus two A
1107
1108        e0=(v0*v3>>8,v1*v3>>8,v2*v3>>8, v4);
1109        e1=(v0,v1,v2, v5);
1110
1111    Mode 12 LDR RGBA, direct
1112
1113        s0= v0+v2+v4; s1= v1+v3+v5;
1114        if (s1>=s0){e0=(v0,v2,v4,v6);
1115                    e1=(v1,v3,v5,v7); }
1116        else { e0=blue_contract(v1,v3,v5,v7);
1117               e1=blue_contract(v0,v2,v4,v6); }
1118
1119    Mode 13 LDR RGBA, base+offset
1120
1121        bit_transfer_signed(v1,v0);
1122        bit_transfer_signed(v3,v2);
1123        bit_transfer_signed(v5,v4);
1124        bit_transfer_signed(v7,v6);
1125        if(v1+v3+v5>=0) { e0=(v0,v2,v4,v6);
1126               e1=(v0+v1,v2+v3,v4+v5,v6+v7); }
1127        else { e0=blue_contract(v0+v1,v2+v3,v4+v5,v6+v7);
1128               e1=blue_contract(v0,v2,v4,v6); }
1129        clamp_unorm8(e0); clamp_unorm8(e1);
1130
1131    The bit_transfer_signed procedure transfers a bit from one value (a)
1132    to another (b). Initially, both a and b are in the range 0..255.
1133    After calling this procedure, a's range becomes -32..31, and b remains
1134    in the range 0..255. Note that, as is often the case, this is easier to
1135    express in hardware than in C:
1136
1137        void bit_transfer_signed(int& a, int& b)
1138        {
1139            b >>= 1;
1140            b |= a & 0x80;
1141            a >>= 1;
1142            a &= 0x3F;
1143            if( (a&0x20)!=0 ) a-=0x40;
1144        }
1145
1146    The blue_contract procedure is used to give additional precision to
1147    RGB colors near grey:
1148
1149        color blue_contract( int r, int g, int b, int a )
1150        {
1151            color c;
1152            c.r = (r+b) >> 1;
1153            c.g = (g+b) >> 1;
1154            c.b = b;
1155            c.a = a;
1156            return c;
1157        }
1158
1159    The clamp_unorm8 procedure is used to clamp a color into the UNORM8 range:
1160
1161        void clamp_unorm8(color& c)
1162        {
1163            if(c.r < 0) {c.r=0;} else if(c.r > 255) {c.r=255;}
1164            if(c.g < 0) {c.g=0;} else if(c.g > 255) {c.g=255;}
1165            if(c.b < 0) {c.b=0;} else if(c.b > 255) {c.b=255;}
1166            if(c.a < 0) {c.a=0;} else if(c.a > 255) {c.a=255;}
1167        }
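
To show how these helper procedures combine, the sketch below writes Mode 9 (LDR RGB, base+offset) as self-contained C. It is illustrative, not normative. Pointers stand in for the reference parameters used above, clamp_unorm8 returns its result by value, and the names endpoint_pair, clamp_0_255 and decode_ldr_rgb_base_offset are introduced here only for the example. As in the procedures above, right-shifting a negative intermediate value is assumed to be an arithmetic shift.

```c
typedef struct { int r, g, b, a; } color;
typedef struct { color e0, e1; } endpoint_pair;

void bit_transfer_signed(int *a, int *b)
{
    *b >>= 1;
    *b |= *a & 0x80;
    *a >>= 1;
    *a &= 0x3F;
    if ((*a & 0x20) != 0) *a -= 0x40;   /* sign-extend a to -32..31 */
}

color blue_contract(int r, int g, int b, int a)
{
    color c = { (r + b) >> 1, (g + b) >> 1, b, a };
    return c;
}

int clamp_0_255(int x)
{
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

color clamp_unorm8(color c)
{
    c.r = clamp_0_255(c.r); c.g = clamp_0_255(c.g);
    c.b = clamp_0_255(c.b); c.a = clamp_0_255(c.a);
    return c;
}

/* v0..v5 are the six unquantized values, each in the range 0..255. */
endpoint_pair decode_ldr_rgb_base_offset(int v0, int v1, int v2,
                                         int v3, int v4, int v5)
{
    endpoint_pair p;
    bit_transfer_signed(&v1, &v0);
    bit_transfer_signed(&v3, &v2);
    bit_transfer_signed(&v5, &v4);
    if (v1 + v3 + v5 >= 0) {
        color base = { v0, v2, v4, 0xFF };
        color sum  = { v0 + v1, v2 + v3, v4 + v5, 0xFF };
        p.e0 = base; p.e1 = sum;
    } else {
        p.e0 = blue_contract(v0 + v1, v2 + v3, v4 + v5, 0xFF);
        p.e1 = blue_contract(v0, v2, v4, 0xFF);
    }
    p.e0 = clamp_unorm8(p.e0);
    p.e1 = clamp_unorm8(p.e1);
    return p;
}
```

With inputs (100, 20, 200, 4, 40, 6) the offsets are all positive, so e0 is (50, 100, 20, 255) and e1 is (60, 102, 23, 255). With v1 = 126 and zero for v3 and v5, the offset sum is negative and the blue-contracted branch is taken.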
1168
1169      [[ If the HDR profile is not implemented, do not include section
1170         C.2.15 ]]
1171
1172    C.2.15  HDR Endpoint Decoding
1173    -----------------------------
1174
1175    For HDR endpoint modes, color values are represented in a 12-bit
1176    pseudo-logarithmic representation.
1177
1178    HDR Endpoint Mode 2
1179
1180    Mode 2 represents luminance-only data with a large range. It encodes
1181    using two values (v0, v1). The complete decoding procedure is as follows:
1182
1183        if(v1 >= v0)
1184        {
1185            y0 = (v0 << 4);
1186            y1 = (v1 << 4);
1187        }
1188        else
1189        {
1190            y0 = (v1 << 4) + 8;
1191            y1 = (v0 << 4) - 8;
1192        }
1193        // Construct RGBA result (0x780 is 1.0f)
1194        e0 = (y0, y0, y0, 0x780);
1195        e1 = (y1, y1, y1, 0x780);
1196
1197    HDR Endpoint Mode 3
1198
1199    Mode 3 represents luminance-only data with a small range. It packs the
1200    bits for a base luminance value, together with an offset, into two values
1201    (v0, v1):
1202
1203    Value   7   6   5   4   3   2   1   0
1204    -----   ------------------------------
1205    v0     |M  |         L[6:0]           |
1206            ------------------------------
1207    v1     |    X[3:0]     |   d[3:0]     |
1208            ------------------------------
1209
1210    Table C.2.18 - HDR Mode 3 Value Layout
1211
1212    The bit field marked as X allocates different bits to L or d depending
1213    on the value of the mode bit M.
1214
1215    The complete decoding procedure is as follows:
1216
1217        // Check mode bit and extract.
1218        if((v0&0x80) !=0)
1219        {
1220            y0 = ((v1 & 0xE0) << 4) | ((v0 & 0x7F) << 2);
1221            d  =  (v1 & 0x1F) << 2;
1222        }
1223        else
1224        {
1225            y0 = ((v1 & 0xF0) << 4) | ((v0 & 0x7F) << 1);
1226            d  =  (v1 & 0x0F) << 1;
1227        }
1228
1229        // Add delta and clamp
1230        y1 = y0 + d;
1231        if(y1 > 0xFFF) { y1 = 0xFFF; }
1232
1233        // Construct RGBA result (0x780 is 1.0f)
1234        e0 = (y0, y0, y0, 0x780);
1235        e1 = (y1, y1, y1, 0x780);
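
The two luminance-only procedures can be exercised with the C sketch below. It is illustrative rather than normative: hdr_luma_pair and the two function names are introduced only for the example, and only the 12-bit luminance values y0 and y1 are returned, since the RGBA construction is the same in both modes.

```c
typedef struct { int y0, y1; } hdr_luma_pair;

/* Mode 2: HDR Luminance, large range. */
hdr_luma_pair hdr_luminance_large_range(int v0, int v1)
{
    hdr_luma_pair p;
    if (v1 >= v0) {
        p.y0 = v0 << 4;
        p.y1 = v1 << 4;
    } else {
        p.y0 = (v1 << 4) + 8;
        p.y1 = (v0 << 4) - 8;
    }
    return p;
}

/* Mode 3: HDR Luminance, small range. */
hdr_luma_pair hdr_luminance_small_range(int v0, int v1)
{
    hdr_luma_pair p;
    int d;
    if ((v0 & 0x80) != 0) {   /* M = 1: X donates 3 bits to L, 1 to d */
        p.y0 = ((v1 & 0xE0) << 4) | ((v0 & 0x7F) << 2);
        d    =  (v1 & 0x1F) << 2;
    } else {                  /* M = 0: X donates all 4 bits to L */
        p.y0 = ((v1 & 0xF0) << 4) | ((v0 & 0x7F) << 1);
        d    =  (v1 & 0x0F) << 1;
    }
    p.y1 = p.y0 + d;
    if (p.y1 > 0xFFF) p.y1 = 0xFFF;
    return p;
}
```

For example, Mode 2 with (v0, v1) = (32, 16) takes the swapped branch and yields y0 = 264 and y1 = 504, while Mode 3 with (0x85, 0x63) yields y0 = 0x614 and y1 = 0x620.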
1236
1237    HDR Endpoint Mode 7
1238
1239    Mode 7 packs the bits for a base RGB value, a scale factor, and some
1240    mode bits into the four values (v0, v1, v2, v3):
1241
1242    Value   7   6   5   4   3   2   1   0
1243    -----   ------------------------------
1244    v0     |M[1:0] |       R[5:0]         |
1245    -----   ------------------------------
1246    v1     |M2 |X0 |X1 |      G[4:0]      |
1247    -----   ------------------------------
1248    v2     |M3 |X2 |X3 |      B[4:0]      |
1249    -----   ------------------------------
1250    v3     |X4 |X5 |X6 |      S[4:0]      |
1251    -----   ------------------------------
1252    Table C.2.19 - HDR Mode 7 Value Layout
1253
1254    The mode bits M0 to M3 are a packed representation of an endpoint bit
1255    mode, together with the major component index. For modes 0 to 4, the
1256    component (red, green, or blue) with the largest magnitude is identified,
1257    and the values swizzled to ensure that it is decoded from the red channel.
1258
1259    The endpoint bit mode is used to determine the number of bits assigned
1260    to each component of the endpoint, and the destination of each of the
1261    extra bits X0 to X6, as follows:
1262
1263    ------------------------------------------------------
1264            Number of bits      Destination of extra bits
1265    Mode    R   G   B   S       X0  X1  X2  X3  X4  X5  X6
1266    ------------------------------------------------------
1267    0       11  5   5   7       R9  R8  R7  R10 R6  S6  S5
1268    1       11  6   6   5       R8  G5  R7  B5  R6  R10 R9
1269    2       10  5   5   8       R9  R8  R7  R6  S7  S6  S5
1270    3       9   6   6   7       R8  G5  R7  B5  R6  S6  S5
1271    4       8   7   7   6       G6  G5  B6  B5  R6  R7  S5
1272    5       7   7   7   7       G6  G5  B6  B5  R6  S6  S5
1273    ------------------------------------------------------
1274    Table C.2.20 - Endpoint Bit Mode
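
A quick consistency check on Table C.2.20: the four values v0 to v3 hold 32 bits, of which 4 are mode bits, so each row must allocate exactly 28 bits to R, G, B and S. The C sketch below encodes the table and checks that budget; it also derives the left shift needed to place R at the top of a 12-bit result. The names hdr7_bits, hdr7_total_bits and hdr7_shift are illustrative and not part of this specification.

```c
/* Rows of Table C.2.20: bits assigned to R, G, B and S in each mode. */
static const int hdr7_bits[6][4] = {
    { 11, 5, 5, 7 }, { 11, 6, 6, 5 }, { 10, 5, 5, 8 },
    {  9, 6, 6, 7 }, {  8, 7, 7, 6 }, {  7, 7, 7, 7 }
};

/* Total payload bits used by a mode; expected to be 32 - 4 = 28. */
int hdr7_total_bits(int mode)
{
    return hdr7_bits[mode][0] + hdr7_bits[mode][1]
         + hdr7_bits[mode][2] + hdr7_bits[mode][3];
}

/* Shift that moves the R field to the top of a 12-bit value. */
int hdr7_shift(int mode)
{
    return 12 - hdr7_bits[mode][0];
}
```

The shifts come out as 1, 1, 2, 3, 4 and 5 for modes 0 to 5.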
1275
1276    As noted before, this appears complex when expressed in C, but is much
1277    easier to achieve in hardware - bit masking, extraction, shifting
1278    and assignment usually end up as a single wire or multiplexer.
1279
1280    The complete decoding procedure is as follows:
1281
1282        // Extract mode bits and unpack to major component and mode.
1283        int modeval = ((v0&0xC0)>>6) | ((v1&0x80)>>5) | ((v2&0x80)>>4);
1284
1285        int majcomp;
1286        int mode;
1287
1288        if( (modeval & 0xC ) != 0xC )
1289        {
1290            majcomp = modeval >> 2; mode = modeval & 3;
1291        }
1292        else if( modeval != 0xF )
1293        {
1294            majcomp = modeval & 3;  mode = 4;
1295        }
1296        else
1297        {
1298            majcomp = 0; mode = 5;
1299        }
1300
1301        // Extract low-order bits of r, g, b, and s.
1302        int red   = v0 & 0x3f;
1303        int green = v1 & 0x1f;
1304        int blue  = v2 & 0x1f;
1305        int scale = v3 & 0x1f;
1306
1307        // Extract high-order bits, which may be assigned depending on mode
1308        int x0 = (v1 >> 6) & 1; int x1 = (v1 >> 5) & 1;
1309        int x2 = (v2 >> 6) & 1; int x3 = (v2 >> 5) & 1;
1310        int x4 = (v3 >> 7) & 1; int x5 = (v3 >> 6) & 1;
1311        int x6 = (v3 >> 5) & 1;
1312
1313        // Now move the high-order xs into the right place.
1314        int ohm = 1 << mode;
1315        if( ohm & 0x30 ) green |= x0 << 6;
1316        if( ohm & 0x3A ) green |= x1 << 5;
1317        if( ohm & 0x30 ) blue |= x2 << 6;
1318        if( ohm & 0x3A ) blue |= x3 << 5;
1319        if( ohm & 0x3D ) scale |= x6 << 5;
1320        if( ohm & 0x2D ) scale |= x5 << 6;
1321        if( ohm & 0x04 ) scale |= x4 << 7;
1322        if( ohm & 0x3B ) red |= x4 << 6;
1323        if( ohm & 0x04 ) red |= x3 << 6;
1324        if( ohm & 0x10 ) red |= x5 << 7;
1325        if( ohm & 0x0F ) red |= x2 << 7;
1326        if( ohm & 0x05 ) red |= x1 << 8;
1327        if( ohm & 0x0A ) red |= x0 << 8;
1328        if( ohm & 0x05 ) red |= x0 << 9;
1329        if( ohm & 0x02 ) red |= x6 << 9;
1330        if( ohm & 0x01 ) red |= x3 << 10;
1331        if( ohm & 0x02 ) red |= x5 << 10;
1332
1333        // Shift the bits to the top of the 12-bit result.
1334        static const int shamts[6] = { 1,1,2,3,4,5 };
1335        int shamt = shamts[mode];
1336        red <<= shamt; green <<= shamt; blue <<= shamt; scale <<= shamt;
1337
1338        // Minor components are stored as differences
1339        if( mode != 5 ) { green = red - green; blue = red - blue; }
1340
1341        // Swizzle major component into place
1342        if( majcomp == 1 ) swap( red, green );
1343        if( majcomp == 2 ) swap( red, blue );
1344
1345        // Clamp output values, set alpha to 1.0
1346        e1.r = clamp( red, 0, 0xFFF );
1347        e1.g = clamp( green, 0, 0xFFF );
1348        e1.b = clamp( blue, 0, 0xFFF );
1349        e1.alpha = 0x780;
1350
1351        e0.r = clamp( red - scale, 0, 0xFFF );
1352        e0.g = clamp( green - scale, 0, 0xFFF );
1353        e0.b = clamp( blue - scale, 0, 0xFFF );
1354        e0.alpha = 0x780;
1355
1356    HDR Endpoint Mode 11
1357
1358    Mode 11 specifies two RGB values, which it calculates from a number of
1359    bitfields (a, b0, b1, c, d0 and d1) which are packed together with some
1360    mode bits into the six values (v0, v1, v2, v3, v4, v5):
1361
1362    Value   7   6   5   4   3   2   1   0
1363    -----   ------------------------------
1364    v0     |            a[7:0]            |
1365    -----   ------------------------------
1366    v1     |m0 |a8 |      c[5:0]          |
1367    -----   ------------------------------
1368    v2     |m1 |X0 |     b0[5:0]          |
1369    -----   ------------------------------
1370    v3     |m2 |X1 |     b1[5:0]          |
1371    -----   ------------------------------
1372    v4     |mj0|X2 |X4 |     d0[4:0]      |
1373    -----   ------------------------------
1374    v5     |mj1|X3 |X5 |     d1[4:0]      |
1375    -----   ------------------------------
1376    Table C.2.21 - HDR Mode 11 Value Layout
1377
1378    If the major component bits mj[1:0] are both 1, then the RGB values
1379    are specified directly:
1380
1381    Value   7   6   5   4   3   2   1   0
1382    -----   ------------------------------
1383    v0     |          R0[11:4]            |
1384    -----   ------------------------------
1385    v1     |          R1[11:4]            |
1386    -----   ------------------------------
1387    v2     |          G0[11:4]            |
1388    -----   ------------------------------
1389    v3     |          G1[11:4]            |
1390    -----   ------------------------------
1391    v4     | 1 |        B0[11:5]          |
1392    -----   ------------------------------
1393    v5     | 1 |        B1[11:5]          |
1394    -----   ------------------------------
1395    Table C.2.22 - HDR Mode 11 Direct Value Layout
1396
1397    The mode bits m[2:0] specify the bit allocation for the different
1398    values, and the destinations of the extra bits X0 to X5:
1399
1400    -------------------------------------------------------------------------
1401            Number of bits      Destination of extra bits
1402    Mode    a   b   c   d       X0      X1      X2      X3      X4      X5
1403    -------------------------------------------------------------------------
1404    0       9   7   6   7       b0[6]   b1[6]   d0[6]   d1[6]   d0[5]   d1[5]
1405    1       9   8   6   6       b0[6]   b1[6]   b0[7]   b1[7]   d0[5]   d1[5]
1406    2       10  6   7   7       a[9]    c[6]    d0[6]   d1[6]   d0[5]   d1[5]
1407    3       10  7   7   6       b0[6]   b1[6]   a[9]    c[6]    d0[5]   d1[5]
1408    4       11  8   6   5       b0[6]   b1[6]   b0[7]   b1[7]   a[9]    a[10]
    5       11  6   8   6       a[9]    a[10]   c[7]    c[6]    d0[5]   d1[5]
1410    6       12  7   7   5       b0[6]   b1[6]   a[11]   c[6]    a[9]    a[10]
1411    7       12  6   7   6       a[9]    a[10]   a[11]   c[6]    d0[5]   d1[5]
1412    -------------------------------------------------------------------------
1413    Table C.2.23 - Endpoint Bit Mode
1414
1415    The complete decoding procedure is as follows:
1416
1417        // Find major component
1418        int majcomp = ((v4 & 0x80) >> 7) | ((v5 & 0x80) >> 6);
1419
1420        // Deal with simple case first
1421        if( majcomp == 3 )
1422        {
1423            e0 = (v0 << 4, v2 << 4, (v4 & 0x7f) << 5, 0x780);
1424            e1 = (v1 << 4, v3 << 4, (v5 & 0x7f) << 5, 0x780);
1425            return;
1426        }
1427
1428        // Decode mode, parameters.
1429        int mode = ((v1&0x80)>>7) | ((v2&0x80)>>6) | ((v3&0x80)>>5);
1430        int va  = v0 | ((v1 & 0x40) << 2);
1431        int vb0 = v2 & 0x3f;
1432        int vb1 = v3 & 0x3f;
1433        int vc  = v1 & 0x3f;
1434        int vd0 = v4 & 0x7f;
1435        int vd1 = v5 & 0x7f;
1436
1437        // Assign top bits of vd0, vd1.
1438        static const int dbitstab[8] = {7,6,7,6,5,6,5,6};
1439        vd0 = signextend( vd0, dbitstab[mode] );
1440        vd1 = signextend( vd1, dbitstab[mode] );
1441
1442        // Extract and place extra bits
1443        int x0 = (v2 >> 6) & 1;
1444        int x1 = (v3 >> 6) & 1;
1445        int x2 = (v4 >> 6) & 1;
1446        int x3 = (v5 >> 6) & 1;
1447        int x4 = (v4 >> 5) & 1;
1448        int x5 = (v5 >> 5) & 1;
1449
1450        int ohm = 1 << mode;
1451        if( ohm & 0xA4 ) va |= x0 << 9;
1452        if( ohm & 0x08 ) va |= x2 << 9;
1453        if( ohm & 0x50 ) va |= x4 << 9;
1454        if( ohm & 0x50 ) va |= x5 << 10;
1455        if( ohm & 0xA0 ) va |= x1 << 10;
1456        if( ohm & 0xC0 ) va |= x2 << 11;
1457        if( ohm & 0x04 ) vc |= x1 << 6;
1458        if( ohm & 0xE8 ) vc |= x3 << 6;
1459        if( ohm & 0x20 ) vc |= x2 << 7;
1460        if( ohm & 0x5B ) vb0 |= x0 << 6;
1461        if( ohm & 0x5B ) vb1 |= x1 << 6;
1462        if( ohm & 0x12 ) vb0 |= x2 << 7;
1463        if( ohm & 0x12 ) vb1 |= x3 << 7;
1464
1465        // Now shift up so that major component is at top of 12-bit value
        int shamt = (mode >> 1) ^ 3;
1467        va <<= shamt; vb0 <<= shamt; vb1 <<= shamt;
1468        vc <<= shamt; vd0 <<= shamt; vd1 <<= shamt;
1469
1470        e1.r = clamp( va, 0, 0xFFF );
1471        e1.g = clamp( va - vb0, 0, 0xFFF );
1472        e1.b = clamp( va - vb1, 0, 0xFFF );
1473        e1.alpha = 0x780;
1474
1475        e0.r = clamp( va - vc, 0, 0xFFF );
1476        e0.g = clamp( va - vb0 - vc - vd0, 0, 0xFFF );
1477        e0.b = clamp( va - vb1 - vc - vd1, 0, 0xFFF );
1478        e0.alpha = 0x780;
1479
1480        if( majcomp == 1 )      { swap( e0.r, e0.g ); swap( e1.r, e1.g ); }
1481        else if( majcomp == 2 ) { swap( e0.r, e0.b ); swap( e1.r, e1.b ); }
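
    The procedure above calls two helpers, signextend and clamp, which are
    not defined in this section. The following is a minimal sketch of what
    they are assumed to do: signextend(value, bits) treats the low <bits>
    bits of <value> as a two's-complement number and discards any higher
    bits (which matters because vd0 and vd1 are masked with 0x7f even in
    modes where d has fewer than 7 bits), and clamp limits a value to a
    closed range. The exact definitions are an assumption of this example.

```c
/* Sketch only: helper semantics are assumed, not quoted from the text. */

/* Interpret the low `bits` bits of `value` as a two's-complement number. */
static int signextend(int value, int bits)
{
    int sign = 1 << (bits - 1);          /* position of the sign bit      */
    value &= (sign << 1) - 1;            /* keep only the low `bits` bits */
    return (value ^ sign) - sign;        /* fold the sign bit downwards   */
}

/* Limit `value` to the closed range lo..hi. */
static int clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}
```

    The xor-and-subtract form avoids shifting negative values, which is
    not portable in C.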
1482
1483    HDR Endpoint Mode 14
1484
1485    Mode 14 specifies two RGBA values, using the eight values (v0, v1, v2,
1486    v3, v4, v5, v6, v7). First, the RGB values are decoded from (v0..v5)
1487    using the method from Mode 11, then the alpha values are filled in
1488    from v6 and v7:
1489
1490        // Decode RGB as for mode 11
1491        (e0,e1) = decode_mode_11(v0,v1,v2,v3,v4,v5)
1492
1493        // Now fill in the alphas
1494        e0.alpha = v6;
1495        e1.alpha = v7;
1496
1497    Note that in this mode, the alpha values are interpreted (and
1498    interpolated) as 8-bit unsigned normalized values, as in the LDR modes.
1499    This is the only mode that exhibits this behaviour.
1500
1501    HDR Endpoint Mode 15
1502
1503    Mode 15 specifies two RGBA values, using the eight values (v0, v1, v2,
1504    v3, v4, v5, v6, v7). First, the RGB values are decoded from (v0..v5)
1505    using the method from Mode 11. The alpha values are stored in values
1506    v6 and v7 as a mode and two values which are interpreted according
1507    to the mode:
1508
1509    Value   7   6   5   4   3   2   1   0
1510    -----   ------------------------------
1511    v6     |M0 |        A[6:0]            |
1512    -----   ------------------------------
1513    v7     |M1 |        B[6:0]            |
1514    -----   ------------------------------
1515    Table C.2.24 - HDR Mode 15 Alpha Value Layout
1516
1517    The alpha values are decoded from v6 and v7 as follows:
1518
1519        // Decode RGB as for mode 11
1520        (e0,e1) = decode_mode_11(v0,v1,v2,v3,v4,v5)
1521
1522        // Extract mode bits
1523        mode = ((v6 >> 7) & 1) | ((v7 >> 6) & 2);
1524        v6 &= 0x7F;
1525        v7 &= 0x7F;
1526
1527        if(mode==3)
1528        {
1529            // Directly specify alphas
1530            e0.alpha = v6 << 5;
1531            e1.alpha = v7 << 5;
1532        }
1533        else
1534        {
1535            // Transfer bits from v7 to v6 and sign extend v7.
            v6 |= (v7 << (mode+1)) & 0x780;
1537            v7 &= (0x3F >> mode);
1538            v7 ^= 0x20 >> mode;
1539            v7 -= 0x20 >> mode;
1540            v6 <<= (4-mode);
1541            v7 <<= (4-mode);
1542
1543            // Add delta and clamp
1544            v7 += v6;
1545            v7 = clamp(v7, 0, 0xFFF);
1546            e0.alpha = v6;
1547            e1.alpha = v7;
1548        }
1549
    Note that in this mode, the alpha values are interpreted as 12-bit HDR
    values and are interpolated as for any other HDR component.
1553
1554    C.2.16  Weight Decoding
1555    -----------------------
1556    The weight information is stored as a stream of bits, growing downwards
1557    from the most significant bit in the block. Bit n in the stream is thus
1558    bit 127-n in the block.
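
    The bit ordering above can be sketched as follows. The sketch assumes
    that the 128-bit block is held as 16 bytes with block bit 0 in the
    least significant bit of byte 0; the function name weight_stream_bit
    is illustrative and is not defined by this specification.

```c
#include <stdint.h>

/* Return bit n of the weight stream, i.e. bit 127-n of the block.
   Assumes block[0] bit 0 is block bit 0 (little-endian byte order). */
static int weight_stream_bit(const uint8_t block[16], int n)
{
    int pos = 127 - n;                    /* position within the block */
    return (block[pos >> 3] >> (pos & 7)) & 1;
}
```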
1559
1560    For each location in the weight grid, a value (in the specified range)
1561    is packed into the stream. These are ordered in a raster pattern
1562    starting from location (0,0,0), with the X dimension increasing fastest,
1563    and the Z dimension increasing slowest. If dual-plane mode is selected,
1564    both weights are emitted together for each location, plane 0 first,
1565    then plane 1.
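
    The raster ordering above fixes the position of every weight in the
    packed sequence. A small sketch, assuming a weight grid N locations
    wide and M locations high; the function name weight_stream_index and
    its argument order are hypothetical.

```c
/* Index of the weight for grid location (x, y, z) in the packed sequence.
   N and M are the grid width and height; `plane` is 0 or 1 and
   `dual_plane` is nonzero when two weights are stored per location. */
static int weight_stream_index(int x, int y, int z, int N, int M,
                               int dual_plane, int plane)
{
    int location = x + N * (y + M * z);   /* X fastest, Z slowest */
    return dual_plane ? 2 * location + plane : location;
}
```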
1566
1567    C.2.17  Weight Unquantization
1568    -----------------------------
1569
1570    Each weight plane is specified as a sequence of integers in a given
1571    range. These values are packed using integer sequence encoding.
1572
1573    Once unpacked, the values must be unquantized from their storage
1574    range, returning them to a standard range of 0..64. The procedure
1575    for doing so is similar to the color endpoint unquantization.
1576
1577    First, we unquantize the actual stored weight values to the range 0..63.
1578
1579    For bit-only representations, this is simple bit replication from the
1580    most significant bit of the value.
1581
1582    For trit or quint-based representations, this involves a set of bit
1583    manipulations and adjustments to avoid the expense of full-width
1584    multipliers.
1585
1586    For representations with no additional bits, the results are as follows:
1587
1588    Range   0   1   2   3   4
1589    --------------------------
1590    0..2    0   32  63  -   -
1591    0..4    0   16  32  47  63
1592    --------------------------
1593    Table C.2.25 - Weight Unquantization Values
1594
1595    For other values, we calculate the initial inputs to a bit manipulation
1596    procedure. These are denoted A (7 bits), B (7 bits), C (7 bits), and
1597    D (3 bits) and are decoded using the range as follows:
1598
1599    Range   T Q B   Bits    A       B       C   D
1600    -------------------------------------------------------
1601    0..5    1   1   a       aaaaaaa 0000000 50  Trit value
1602    0..9      1 1   a       aaaaaaa 0000000 28  Quint value
1603    0..11   1   2   ba      aaaaaaa b000b0b 23  Trit value
1604    0..19     1 2   ba      aaaaaaa b0000b0 13  Quint value
1605    0..23   1   3   cba     aaaaaaa cb000cb 11  Trit value
1606    -------------------------------------------------------
1607    Table C.2.26 - Weight Unquantization Parameters
1608
1609    These are then processed as follows:
1610
1611        T = D * C + B;
1612        T = T ^ A;
1613        T = (A & 0x20) | (T >> 2);
1614
1615    Note that the multiply in the first line is nearly trivial as it only
1616    needs to multiply by 0, 1, 2, 3 or 4.
1617
1618    As a final step, for all types of value, the range is expanded from
1619    0..63 up to 0..64 as follows:
1620
1621        if (T > 32) { T += 1; }
1622
    This allows the implementation to use 64 as a divisor during
    interpolation, which is much easier than using 63.
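
    The trit and quint cases of Table C.2.26 can be condensed into a single
    routine. The following is a sketch rather than a normative definition:
    the function name and argument layout are assumptions of this example,
    with <d> standing for the trit or quint value and <m> for the plain
    bits (bit 0 of <m> being "a", bit 1 "b" and bit 2 "c").

```c
/* Unquantize one weight stored as a trit or quint plus `nbits` plain bits
   (nbits is 1, 2 or 3 for the ranges in Table C.2.26).
   d = trit or quint value, m = value of the plain bits (bit 0 is "a"). */
static int unquantize_weight_tq(int is_quint, int nbits, int d, int m)
{
    /* Multiplier C, indexed by [is_quint][nbits]; 0 marks unused entries. */
    static const int Ctab[2][4] = { {0, 50, 23, 11}, {0, 28, 13, 0} };
    int a = m & 1, b = (m >> 1) & 1, c = (m >> 2) & 1;
    int A = a ? 0x7F : 0;                       /* aaaaaaa            */
    int B = 0;
    if (nbits == 2)                             /* b0000b0 or b000b0b */
        B = b ? (is_quint ? 0x42 : 0x45) : 0;
    if (nbits == 3)                             /* cb000cb            */
        B = (c ? 0x42 : 0) | (b ? 0x21 : 0);

    int T = d * Ctab[is_quint][nbits] + B;
    T ^= A;
    T = (A & 0x20) | (T >> 2);                  /* now in 0..63       */
    if (T > 32) T += 1;                         /* expand to 0..64    */
    return T;
}
```

    For the range 0..5 (one trit, one bit) this yields 0, 12 and 25 for
    trit values 0, 1 and 2 with a = 0, and 64, 52 and 39 with a = 1.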
1625
1626    C.2.18  Weight Infill
1627    ---------------------
1628
1629    After unquantization, the weights are subject to weight selection and
1630    infill. The infill method is used to calculate the weight for a texel
1631    position, based on the weights in the stored weight grid array (which
1632    may be a different size).
1633
1634    The procedure below must be followed exactly, to ensure bit exact
1635    results.
1636
1637    The block size is specified as two dimensions along the s and t
1638    axes (Bs, Bt). Texel coordinates within the block (s,t) can have values
1639    from 0 to one less than the block dimension in that axis.
1640
    For each block dimension, we compute scale factors (Ds, Dt):
1642
1643        Ds = floor( (1024 + floor(Bs/2)) / (Bs-1) );
1644        Dt = floor( (1024 + floor(Bt/2)) / (Bt-1) );
1645
1646    Since the block dimensions are constrained, these are easily looked up
1647    in a table. These scale factors are then used to scale the (s,t)
1648    coordinates to a homogeneous coordinate (cs, ct):
1649
1650        cs = Ds * s;
1651        ct = Dt * t;
1652
    This homogeneous coordinate (cs, ct) is then scaled again to give
    a coordinate (gs, gt) in the weight-grid space. The weight grid is
    of size (N, M), as specified in the block mode field:
1656
1657        gs = (cs*(N-1)+32) >> 6;
1658        gt = (ct*(M-1)+32) >> 6;
1659
    The resulting coordinates may be in the range 0..176. These are
    interpreted as 4:4 unsigned fixed point numbers in the range 0.0 .. 11.0.
1662
1663    If we label the integral parts of these (js, jt) and the fractional
1664    parts (fs, ft), then:
1665
1666        js = gs >> 4; fs = gs & 0x0F;
1667        jt = gt >> 4; ft = gt & 0x0F;
1668
1669    These values are then used to bilinearly interpolate between the stored
1670    weights.
1671
1672        v0 = js + jt*N;
1673        p00 = decode_weight(v0);
1674        p01 = decode_weight(v0 + 1);
1675        p10 = decode_weight(v0 + N);
1676        p11 = decode_weight(v0 + N + 1);
1677
    The function decode_weight(n) decodes the nth weight in the stored weight
    stream. The values p00 to p11 are the weights at the corners of the square
    in which the texel position resides. These are then weighted using the
    fractional position to produce the effective weight i as follows:
1682
1683        w11 = (fs*ft+8) >> 4;
1684        w10 = ft - w11;
1685        w01 = fs - w11;
1686        w00 = 16 - fs - ft + w11;
1687        i = (p00*w00 + p01*w01 + p10*w10 + p11*w11 + 8) >> 4;
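
    Putting the infill steps together for the 2D case gives the sketch
    below. It assumes the weights have already been unquantized into an
    array in raster order, which stands in for decode_weight(n); the
    function name infill_weight is hypothetical. Neighbours that would lie
    past the end of the array are read as 0, which is an assumption of this
    sketch; for the block and grid sizes checked here their interpolation
    factor is 0 whenever that happens.

```c
/* Effective weight for texel (s, t) of a Bs x Bt block, given an N x M grid
   of already-unquantized weights (0..64) in raster order. 2D blocks only. */
static int infill_weight(int s, int t, int Bs, int Bt,
                         int N, int M, const int *grid)
{
    int Ds = (1024 + Bs / 2) / (Bs - 1);          /* scale factors        */
    int Dt = (1024 + Bt / 2) / (Bt - 1);
    int gs = (Ds * s * (N - 1) + 32) >> 6;        /* 4:4 fixed point      */
    int gt = (Dt * t * (M - 1) + 32) >> 6;
    int js = gs >> 4, fs = gs & 0x0F;
    int jt = gt >> 4, ft = gt & 0x0F;

    int v0 = js + jt * N;
    int count = N * M;
    /* Assumption: out-of-grid neighbours read as 0 (their factor is 0). */
    int p00 = grid[v0];
    int p01 = (v0 + 1     < count) ? grid[v0 + 1]     : 0;
    int p10 = (v0 + N     < count) ? grid[v0 + N]     : 0;
    int p11 = (v0 + N + 1 < count) ? grid[v0 + N + 1] : 0;

    int w11 = (fs * ft + 8) >> 4;
    int w10 = ft - w11;
    int w01 = fs - w11;
    int w00 = 16 - fs - ft + w11;
    return (p00 * w00 + p01 * w01 + p10 * w10 + p11 * w11 + 8) >> 4;
}
```

    When the weight grid has the same size as the block (for example a
    4x4 grid in a 4x4 block), every fractional part is 0 and the function
    returns the stored weight for that texel unchanged.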
1688
1689    C.2.19  Weight Application
1690    --------------------------
1691    Once the effective weight i for the texel has been calculated, the color
1692    endpoints are interpolated and expanded.
1693
1694    For LDR endpoint modes, each color component C is calculated from the
1695    corresponding 8-bit endpoint components C0 and C1 as follows:
1696
1697    If sRGB conversion is not enabled, or for the alpha channel in any case,
1698    C0 and C1 are first expanded to 16 bits by bit replication:
1699
1700        C0 = (C0 << 8) | C0;        C1 = (C1 << 8) | C1;
1701
1702    If sRGB conversion is enabled, C0 and C1 for the R, G, and B channels
1703    are expanded to 16 bits differently, as follows:
1704
1705        C0 = (C0 << 8) | 0x80;  C1 = (C1 << 8) | 0x80;
1706
1707    C0 and C1 are then interpolated to produce a UNORM16 result C:
1708
1709        C = floor( (C0*(64-i) + C1*i + 32)/64 )
1710
    If sRGB conversion is enabled, the top 8 bits of the interpolation
    result for the R, G and B channels are passed to the external sRGB
    conversion block. Otherwise, if C = 65535, the final result is 1.0
    (0x3C00); for any other value, C is divided by 65536 and the
    infinite-precision result of the division is converted to FP16 with
    round-to-zero semantics.
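
    The LDR expansion and interpolation steps can be sketched as below.
    The function name ldr_interpolate and the <srgb> flag are assumptions
    of this example; the flag should be set only for the R, G and B
    channels when sRGB conversion is enabled. The final conversion to FP16
    or to the sRGB block is not shown.

```c
#include <stdint.h>

/* Interpolate two 8-bit LDR endpoint components with weight i (0..64).
   Returns the 16-bit result C. `srgb` selects the sRGB expansion, which
   applies to R, G and B only; alpha always uses bit replication. */
static uint16_t ldr_interpolate(uint8_t c0, uint8_t c1, int i, int srgb)
{
    uint32_t e0 = srgb ? (((uint32_t)c0 << 8) | 0x80u)
                       : (((uint32_t)c0 << 8) | c0);
    uint32_t e1 = srgb ? (((uint32_t)c1 << 8) | 0x80u)
                       : (((uint32_t)c1 << 8) | c1);
    return (uint16_t)((e0 * (uint32_t)(64 - i) + e1 * (uint32_t)i + 32u) / 64u);
}
```

    With endpoints 0 and 255 and no sRGB conversion, weights 0, 32 and 64
    give 0, 32768 and 65535 respectively.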
1717
1718    For HDR endpoint modes, color values are represented in a 12-bit
1719    pseudo-logarithmic representation, and interpolation occurs in a
1720    piecewise-approximate logarithmic manner as follows:
1721
1722    In LDR mode, the error result is returned.
1723
1724    In HDR mode, the color components from each endpoint, C0 and C1, are
1725    initially shifted left 4 bits to become 16-bit integer values and these
1726    are interpolated in the same way as LDR. The 16-bit value C is then
1727    decomposed into the top five bits, E, and the bottom 11 bits M, which
1728    are then processed and recombined with E to form the final value Cf:
1729
1730        C = floor( (C0*(64-i) + C1*i + 32)/64 )
1731        E = (C&0xF800) >> 11; M = C&0x7FF;
1732        if (M < 512) { Mt = 3*M; }
1733        else if (M >= 1536) { Mt = 5*M - 2048; }
1734        else { Mt = 4*M - 512; }
1735        Cf = (E<<10) + (Mt>>3)
1736
1737    This interpolation is a considerably closer approximation to a
1738    logarithmic space than simple 16-bit interpolation.
1739
1740    This final value Cf is interpreted as an IEEE FP16 value. If the result
1741    is +Inf or NaN, it is converted to the bit pattern 0x7BFF, which is the
1742    largest representable finite value.
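
    The HDR path can be sketched in the same way. The function name
    hdr_interpolate is an assumption of this example; it takes two 12-bit
    endpoint components and returns the FP16 bit pattern, including the
    replacement of +Inf and NaN by 0x7BFF.

```c
#include <stdint.h>

/* Interpolate two 12-bit HDR endpoint components with weight i (0..64)
   and return the FP16 bit pattern of the result. */
static uint16_t hdr_interpolate(uint16_t c0, uint16_t c1, int i)
{
    uint32_t e0 = (uint32_t)c0 << 4;      /* 12-bit -> 16-bit */
    uint32_t e1 = (uint32_t)c1 << 4;
    uint32_t C  = (e0 * (uint32_t)(64 - i) + e1 * (uint32_t)i + 32u) / 64u;

    uint32_t E = (C & 0xF800u) >> 11;     /* top five bits    */
    uint32_t M = C & 0x7FFu;              /* bottom 11 bits   */
    uint32_t Mt;
    if (M < 512)        Mt = 3 * M;
    else if (M >= 1536) Mt = 5 * M - 2048;
    else                Mt = 4 * M - 512;

    uint32_t Cf = (E << 10) + (Mt >> 3);
    /* Exponent field 31 means +Inf or NaN in FP16: use the largest finite. */
    if (((Cf >> 10) & 0x1Fu) == 0x1Fu) Cf = 0x7BFFu;
    return (uint16_t)Cf;
}
```

    For example, two endpoints of 0x780 (the default HDR alpha) produce
    0x3C00, which is 1.0 in FP16, for any weight.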
1743
1744    C.2.20  Dual-Plane Decoding
1745    ---------------------------
    If dual-plane mode is disabled, all of the endpoint components are
    interpolated using the same weight value.
1748
1749    If dual-plane mode is enabled, two weights are stored with each texel.
1750    One component is then selected to use the second weight for interpolation,
1751    instead of the first weight. The first weight is then used for all other
1752    components.
1753
1754    The component to treat specially is indicated using the 2-bit Color
1755    Component Selector (CCS) field as follows:
1756
1757    Value   Weight 0  Weight 1
1758    --------------------------
1759    0         GBA        R
1760    1         RBA        G
1761    2         RGA        B
1762    3         RGB        A
1763    --------------------------
1764    Table C.2.28 - Dual Plane Color Component Selector Values
1765
1766    The CCS bits are stored at a variable position directly below the weight
1767    bits and any additional CEM bits.
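
    The selection in Table C.2.28 amounts to a single comparison. A sketch,
    assuming components are numbered 0 = R, 1 = G, 2 = B, 3 = A (a
    convention of this example); the function name is hypothetical.

```c
/* Component indices used by this sketch: 0 = R, 1 = G, 2 = B, 3 = A. */

/* Pick the weight for one component. `ccs` is the 2-bit Color Component
   Selector; w0 and w1 are the plane 0 and plane 1 weights for the texel. */
static int weight_for_component(int component, int dual_plane, int ccs,
                                int w0, int w1)
{
    return (dual_plane && component == ccs) ? w1 : w0;
}
```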
1768
1769    C.2.21  Partition Pattern Generation
1770    ------------------------------------
1771
1772    When multiple partitions are active, each texel position is assigned a
1773    partition index. This partition index is calculated using a seed (the
1774    partition pattern index), the texel's x,y,z position within the block,
1775    and the number of partitions. An additional argument, small_block, is
1776    set to 1 if the number of texels in the block is less than 31,
1777    otherwise it is set to 0.
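
    As a small sketch of the small_block rule, assuming block dimensions
    Bs x Bt x Br with Br = 1 for 2D blocks (the function name is
    hypothetical):

```c
/* small_block argument for a block of Bs x Bt x Br texels
   (Br = 1 for 2D blocks). */
static int is_small_block(int Bs, int Bt, int Br)
{
    return (Bs * Bt * Br) < 31 ? 1 : 0;
}
```

    A 6x5 block (30 texels) is therefore a small block, while a 6x6 block
    (36 texels) is not.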
1778
1779    This function is specified in terms of x, y and z in order to support
1780    3D textures. For 2D textures and texture slices, z will always be 0.
1781
1782    The full partition selection algorithm is as follows:
1783
1784        int select_partition(int seed, int x, int y, int z,
1785                             int partitioncount, int small_block)
1786        {
1787            if( small_block ){ x <<= 1; y <<= 1; z <<= 1; }
1788            seed += (partitioncount-1) * 1024;
1789            uint32_t rnum = hash52(seed);
1790            uint8_t seed1  =  rnum        & 0xF;
1791            uint8_t seed2  = (rnum >>  4) & 0xF;
1792            uint8_t seed3  = (rnum >>  8) & 0xF;
1793            uint8_t seed4  = (rnum >> 12) & 0xF;
1794            uint8_t seed5  = (rnum >> 16) & 0xF;
1795            uint8_t seed6  = (rnum >> 20) & 0xF;
1796            uint8_t seed7  = (rnum >> 24) & 0xF;
1797            uint8_t seed8  = (rnum >> 28) & 0xF;
1798            uint8_t seed9  = (rnum >> 18) & 0xF;
1799            uint8_t seed10 = (rnum >> 22) & 0xF;
1800            uint8_t seed11 = (rnum >> 26) & 0xF;
1801            uint8_t seed12 = ((rnum >> 30) | (rnum << 2)) & 0xF;
1802
1803            seed1 *= seed1;     seed2 *= seed2;
1804            seed3 *= seed3;     seed4 *= seed4;
1805            seed5 *= seed5;     seed6 *= seed6;
1806            seed7 *= seed7;     seed8 *= seed8;
1807            seed9 *= seed9;     seed10 *= seed10;
1808            seed11 *= seed11;   seed12 *= seed12;
1809
1810            int sh1, sh2, sh3;
1811            if( seed & 1 )
1812                { sh1 = (seed&2 ? 4:5); sh2 = (partitioncount==3 ? 6:5); }
1813            else
1814                { sh1 = (partitioncount==3 ? 6:5); sh2 = (seed&2 ? 4:5); }
1815            sh3 = (seed & 0x10) ? sh1 : sh2;
1816
1817            seed1 >>= sh1; seed2  >>= sh2; seed3  >>= sh1; seed4  >>= sh2;
1818            seed5 >>= sh1; seed6  >>= sh2; seed7  >>= sh1; seed8  >>= sh2;
1819            seed9 >>= sh3; seed10 >>= sh3; seed11 >>= sh3; seed12 >>= sh3;
1820
1821            int a = seed1*x + seed2*y + seed11*z + (rnum >> 14);
1822            int b = seed3*x + seed4*y + seed12*z + (rnum >> 10);
1823            int c = seed5*x + seed6*y + seed9 *z + (rnum >>  6);
1824            int d = seed7*x + seed8*y + seed10*z + (rnum >>  2);
1825
1826            a &= 0x3F; b &= 0x3F; c &= 0x3F; d &= 0x3F;
1827
1828            if( partitioncount < 4 ) d = 0;
1829            if( partitioncount < 3 ) c = 0;
1830
1831            if( a >= b && a >= c && a >= d ) return 0;
1832            else if( b >= c && b >= d ) return 1;
1833            else if( c >= d ) return 2;
1834            else return 3;
1835        }
1836
1837    As has been observed before, the bit selections are much easier to
1838    express in hardware than in C.
1839
1840    The seed is expanded using a hash function hash52, which is defined as
1841    follows:
1842
1843        uint32_t hash52( uint32_t p )
1844        {
1845            p ^= p >> 15;  p -= p << 17;  p += p << 7; p += p <<  4;
1846            p ^= p >>  5;  p += p << 16;  p ^= p >> 7; p ^= p >> 3;
1847            p ^= p <<  6;  p ^= p >> 17;
1848            return p;
1849        }
1850
    This assumes that all operations act on 32-bit values.
1852
1853    C.2.22  Data Size Determination
1854    -------------------------------
1855
1856    The size of the data used to represent color endpoints is not
1857    explicitly specified. Instead, it is determined from the block mode and
1858    number of partitions as follows:
1859
        config_bits = 17;
        if(num_partitions>1)
        {
            if(single_CEM)
                config_bits = 29;
            else
                config_bits = 25 + 3*num_partitions;
        }

        num_weights = M * N * Q; // size of weight grid

        if(dual_plane)
        {
            config_bits += 2;
            num_weights *= 2;
        }
1872
1873        weight_bits = ceil(num_weights*8*trits_in_weight_range/5) +
1874                      ceil(num_weights*7*quints_in_weight_range/3) +
1875                      num_weights*bits_in_weight_range;
1876
1877        remaining_bits = 128 - config_bits - weight_bits;
1878
1879        num_CEM_pairs = base_CEM_class+1 + count_bits(extra_CEM_bits);
1880
1881    The CEM value range is then looked up from a table indexed by remaining
1882    bits and num_CEM_pairs. This table is initialized such that the range
1883    is as large as possible, consistent with the constraint that the number
1884    of bits required to encode num_CEM_pairs pairs of values is not more
1885    than the number of remaining bits.
1886
1887    An equivalent iterative algorithm would be:
1888
1889        num_CEM_values = num_CEM_pairs*2;
1890
1891        for(range = each possible CEM range in descending order of size)
1892        {
1893            CEM_bits = ceil(num_CEM_values*8*trits_in_CEM_range/5) +
1894                       ceil(num_CEM_values*7*quints_in_CEM_range/3) +
1895                       num_CEM_values*bits_in_CEM_range;
1896
1897            if(CEM_bits <= remaining_bits)
1898                break;
1899        }
1900        return range;
1901
1902    In cases where this procedure results in unallocated bits, these bits
1903    are not read by the decoding process and can have any value.
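
    The bit counts and the iterative range search can be sketched in C as
    follows. The candidate list is an assumption of this example: it lists
    the integer sequence encoding ranges from 0..255 down to 0..5 as
    (maximum value, trits, quints, bits), and should be checked against the
    range table given earlier in this appendix. The names ise_bits and
    cem_range_max are hypothetical.

```c
/* Bits needed to store n values whose range uses `trits` trits (0 or 1),
   `quints` quints (0 or 1) and `bits` plain bits per value. */
static int ise_bits(int n, int trits, int quints, int bits)
{
    return (n * 8 * trits + 4) / 5      /* ceil(n*8*trits/5)  */
         + (n * 7 * quints + 2) / 3     /* ceil(n*7*quints/3) */
         + n * bits;
}

struct ise_range { int max_value, trits, quints, bits; };

/* Largest CEM range whose encoding of 2*num_CEM_pairs values fits in
   remaining_bits; returns the range maximum, or -1 if none fits. */
static int cem_range_max(int num_CEM_pairs, int remaining_bits)
{
    /* Assumed candidate ranges, in descending order, down to 0..5. */
    static const struct ise_range ranges[] = {
        {255,0,0,8}, {191,1,0,6}, {159,0,1,5}, {127,0,0,7}, {95,1,0,5},
        {79,0,1,4},  {63,0,0,6},  {47,1,0,4},  {39,0,1,3},  {31,0,0,5},
        {23,1,0,3},  {19,0,1,2},  {15,0,0,4},  {11,1,0,2},  {9,0,1,1},
        {7,0,0,3},   {5,1,0,1}
    };
    int n = num_CEM_pairs * 2;
    for (unsigned k = 0; k < sizeof ranges / sizeof ranges[0]; k++) {
        const struct ise_range *r = &ranges[k];
        if (ise_bits(n, r->trits, r->quints, r->bits) <= remaining_bits)
            return r->max_value;
    }
    return -1;
}
```

    As a worked case: a single-partition block with a 4x4 grid of 3-bit
    weights has config_bits = 17 and weight_bits = 48, leaving 63 bits.
    Four CEM pairs (8 values) need 64 bits at range 0..255, which does not
    fit, but 61 bits at range 0..191, so that range is selected.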
1904
1905    C.2.23  Void-Extent Blocks
1906    --------------------------
1907
1908    A void-extent block is a block encoded with a single color. It also
1909    specifies some additional information about the extent of the single-
1910    color area beyond this block, which can optionally be used by a
1911    decoder to reduce or prevent redundant block fetches.
1912
1913    The layout of a 2D Void-Extent block is as follows:
1914
1915    127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112
1916     ---------------------------------------------------------------
1917    |                 Block color A component                       |
1918     ---------------------------------------------------------------
1919
1920    111 110 109 108 107 106 105 104 103 102 101 100 99  98  97  96
1921    ----------------------------------------------------------------
1922    |                 Block color B component                       |
1923    ----------------------------------------------------------------
1924
1925    95  94  93  92  91  90  89  88  87  86  85  84  83  82  81  80
1926    ----------------------------------------------------------------
1927    |                 Block color G component                       |
1928    ----------------------------------------------------------------
1929    79  78  77  76  75  74  73  72  71  70  69  68  67  66  65  64
1930    ----------------------------------------------------------------
1931    |                 Block color R component                       |
1932    ----------------------------------------------------------------
1933
1934    63  62  61  60  59  58  57  56  55  54  53  52  51  50  49  48
1935    ----------------------------------------------------------------
1936    |    Void-extent maximum T coordinate              |    Min T   |
1937    ----------------------------------------------------------------
1938
1939    47  46  45  44  43  42  41  40  39  38  37  36  35  34  33  32
1940    ----------------------------------------------------------------
1941    Void-extent minimum T coordinate       |   Void-extent max S    |
1942    ----------------------------------------------------------------
1943
1944    31  30  29  28  27  26  25  24  23  22  21  20  19  18  17  16
1945    ----------------------------------------------------------------
1946    Void-extent max S coord    |  Void-extent minimum S coordinate  |
1947    ----------------------------------------------------------------
1948    15  14  13  12  11  10   9   8   7   6   5   4   3   2   1   0
1949    ----------------------------------------------------------------
1950    Min S coord    | 1 | 1 | D | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 0  |
1951    ----------------------------------------------------------------
1952    -------------------------------------------------
1953    Figure C.7 - 2D Void-Extent Block Layout Overview
1954
1955    Bit 9 is the Dynamic Range flag, which indicates the format in which
1956    colors are stored. A 0 value indicates LDR, in which case the color
1957    components are stored as UNORM16 values. A 1 indicates HDR, in which
1958    case the color components are stored as FP16 values.
1959
    UNORM16 values are stored in the LDR case because the value may need
    to be passed on to sRGB conversion. By storing the color value in the
    format which comes out of the interpolator, before the conversion to
    FP16, we avoid having to have separate versions for sRGB and linear
    modes.
1965
1966    If a void-extent block with HDR values is decoded in LDR mode, then
1967    the result will be the error color, opaque magenta, for all texels
1968    within the block.
1969
1970    In the HDR case, if the color component values are infinity or NaN, this
1971    will result in undefined behavior. As usual, this must not lead to GL
1972    interruption or termination.
1973
1974    Bits 10 and 11 are reserved and must be 1.
1975
1976    The minimum and maximum coordinate values are treated as unsigned
1977    integers and then normalized into the range 0..1 (by dividing by 2^13-1
1978    or 2^9-1, for 2D and 3D respectively). The maximum values for each
1979    dimension must be greater than the corresponding minimum values,
1980    unless they are all all-1s.
1981
1982    If all the coordinates are all-1s, then the void extent is ignored,
1983    and the block is simply a constant-color block.
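
    The field positions in Figure C.7 and the constraints above can be
    sketched as an unpacking routine. This is an illustration only: the
    struct, the function name, and the choice to pass the block as two
    64-bit halves (<lo> holding block bits 0..63 and <hi> bits 64..127) are
    assumptions of this example. Infinity and NaN checks for HDR colors are
    not shown.

```c
#include <stdint.h>

/* Fields of a 2D void-extent block (names are this sketch's own). */
struct void_extent_2d {
    int      hdr;                          /* Dynamic Range flag, bit 9   */
    unsigned min_s, max_s, min_t, max_t;   /* 13-bit extent coordinates   */
    uint16_t r, g, b, a;                   /* UNORM16 (LDR) or FP16 (HDR) */
    int      has_extent;                   /* 0 if coordinates are all-1s */
};

/* Returns 1 on success, 0 if the bits do not match the layout or the
   extent constraints are violated. */
static int parse_void_extent_2d(uint64_t lo, uint64_t hi,
                                struct void_extent_2d *out)
{
    if ((lo & 0x1FFu) != 0x1FCu)   return 0;   /* bits 8..0 are 111111100 */
    if (((lo >> 10) & 3u) != 3u)   return 0;   /* reserved bits 10 and 11 */

    out->hdr   = (int)((lo >> 9) & 1u);
    out->min_s = (unsigned)((lo >> 12) & 0x1FFFu);
    out->max_s = (unsigned)((lo >> 25) & 0x1FFFu);
    out->min_t = (unsigned)((lo >> 38) & 0x1FFFu);
    out->max_t = (unsigned)((lo >> 51) & 0x1FFFu);
    out->r = (uint16_t)(hi & 0xFFFFu);
    out->g = (uint16_t)((hi >> 16) & 0xFFFFu);
    out->b = (uint16_t)((hi >> 32) & 0xFFFFu);
    out->a = (uint16_t)((hi >> 48) & 0xFFFFu);

    int all_ones = out->min_s == 0x1FFFu && out->max_s == 0x1FFFu &&
                   out->min_t == 0x1FFFu && out->max_t == 0x1FFFu;
    out->has_extent = !all_ones;

    /* Each maximum must exceed its minimum unless all coordinates are all-1s. */
    if (!all_ones && (out->min_s >= out->max_s || out->min_t >= out->max_t))
        return 0;
    return 1;
}
```

    A constant-color LDR block with no extent has its low 64 bits equal to
    0xFFFFFFFFFFFFFDFC under this layout.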
1984
1985    The existence of single-color blocks with void extents must not produce
1986    results different from those obtained if these single-color blocks are
1987    defined without void-extents. Any situation in which the results would
1988    differ is invalid. Results from invalid void extents are undefined.
1989
1990    If a void-extent appears in a MIPmap level other than the most detailed
1991    one, then the extent will apply to all of the more detailed levels too.
1992    This allows decoders to avoid sampling more detailed MIPmaps.
1993
1994    If the more detailed MIPmap level is not a constant color in this region,
1995    then the block may be marked as constant color, but without a void extent,
1996    as detailed above.
1997
1998    If a void-extent extends to the edge of a texture, then filtered texture
1999    colors may not be the same color as that specified in the block, due to
2000    texture border colors, wrapping, or cube face wrapping.
2001
2002    Care must be taken when updating or extracting partial image data that
2003    void-extents in the image do not become invalid.
2004
2005    C.2.24  Illegal Encodings
2006    -------------------------
2007
2008    In ASTC, there is a variety of ways to encode an illegal block. Decoders
2009    are required to recognize all illegal blocks and emit the standard error
2010    color value upon encountering an illegal block.
2011
2012    Here is a comprehensive list of situations that represent illegal block
2013    encodings:
2014
2015    *   The block mode specified is one of the modes explicitly listed
2016        as Reserved.
2017    *   A 2D void-extent block that has any of the reserved bits not
2018        set to 1.
2019    *   A block mode has been specified that would require more than
2020        64 weights total.
2021    *   A block mode has been specified that would require more than
2022        96 bits for integer sequence encoding of the weight grid.
    *   A block mode has been specified that would require fewer than
2024        24 bits for integer sequence encoding of the weight grid.
2025    *   The size of the weight grid exceeds the size of the block footprint
2026        in any dimension.
2027    *   Color endpoint modes have been specified such that the color
2028        integer sequence encoding would require more than 18 integers.
2029    *   The number of bits available for color endpoint encoding after all
2030        the other fields have been counted is less than ceil(13C/5) where C
2031        is the number of color endpoint integers (this would restrict color
2032        integers to a range smaller than 0..5, which is not supported).
2033    *   Dual weight mode is enabled for a block with 4 partitions.
2034    *   Void-Extent blocks where the low coordinate for some texture axis
2035        is greater than or equal to the high coordinate.
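
    The size-related conditions in the list above can be gathered into one
    predicate. This sketch covers only those conditions; reserved block
    modes, weight grids larger than the block footprint and void-extent
    checks are handled elsewhere. The function name and its parameters are
    assumptions of this example.

```c
/* Returns 1 if the block passes the size-related checks listed above,
   0 if it must decode to the error color. Other checks are not covered. */
static int astc_size_checks_ok(int num_weights, int weight_bits,
                               int num_color_ints, int color_bits,
                               int num_partitions, int dual_plane)
{
    if (num_weights > 64)                      return 0;
    if (weight_bits > 96 || weight_bits < 24)  return 0;
    if (num_color_ints > 18)                   return 0;
    if (color_bits < (13 * num_color_ints + 4) / 5)   /* ceil(13C/5) */
        return 0;
    if (dual_plane && num_partitions == 4)     return 0;
    return 1;
}
```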
2036
2037    Note also that, in LDR mode, a block which has both HDR and LDR endpoint
2038    modes assigned to different partitions is not an error block. Only those
2039    texels which belong to the HDR partition will result in the error color.
2040    Texels belonging to a LDR partition will be decoded as normal.
2041
    C.2.25  LDR Profile Support
2043    ---------------------------
2044
2045    Implementations of the LDR Profile must satisfy the following requirements:
2046
2047    *   All textures with valid encodings for LDR Profile must decode
2048        identically using either a LDR Profile, HDR Profile, or Full Profile
2049        decoder.
2050    *   All features included only in the HDR Profile or Full Profile must be
2051        treated as reserved in the LDR Profile, and return the error color on
2052        decoding.
2053    *   Any sequence of API calls valid for the LDR Profile must also be valid
2054        for the HDR Profile or Full Profile and return identical results when
2055        given a texture encoded for the LDR Profile.
2056
2057    The feature subset for the LDR profile is:
2058
2059    *   2D textures only, including 2D, 2D array, cube map face,
2060        and cube map array texture targets.
2061    *   Only those block sizes listed in Table C.2.2 are supported.
2062    *   LDR operation mode only.
2063    *   Only LDR endpoint formats must be supported, namely formats
2064        0, 1, 4, 5, 6, 8, 9, 10, 12, 13.
2065    *   Decoding from a HDR endpoint results in the error color.
2066    *   Interpolation returns UNORM8 results when used in conjunction
2067        with sRGB.
2068    *   LDR void extent blocks must be supported, but void extents
2069        may not be checked."
2070
2071    If only the LDR profile is supported, read this extension by striking
2072    all descriptions of HDR modes and decoding algorithms. The extension
2073    documents how to modify the document for some particularly tricky cases,
2074    but the general rule is as described in this paragraph.
2075
2076Interactions with immutable-format texture images
2077
2078    ASTC texture formats are supported by immutable-format textures only if
2079    such textures are supported by the underlying implementation (e.g.
    OpenGL 4.2 or later, OpenGL ES 3.0 or later, or earlier versions
2081    supporting the GL_EXT_texture_storage extension). Otherwise, remove all
2082    references to the Tex*Storage* commands from this specification.
2083
2084Interactions with texture cube map arrays
2085
2086    ASTC textures are supported for the TEXTURE_CUBE_MAP_ARRAY target only
2087    when cube map arrays are supported by the underlying implementation
2088    (e.g. OpenGL 4.0 or later, or an OpenGL or OpenGL ES version supporting
2089    an extension defining cube map arrays). Otherwise, remove all references
2090    to texture cube map arrays from this specification.
2091
2092Interactions with OpenGL (all versions)
2093
2094    ASTC is not supported for 1D textures and texture rectangles, and does
2095    not support non-zero borders.
2096
2097    Add the following error conditions to CompressedTexImage*D:
2098
2099    "An INVALID_ENUM error is generated by CompressedTexImage1D if
2100    <internalformat> is one of the ASTC formats.
2101
2102    An INVALID_OPERATION error is generated by CompressedTexImage2D
2103    and CompressedTexImage3D if <internalformat> is one of the ASTC
2104    formats and <border> is non-zero."
2105
2106    Add the following error conditions to CompressedTexSubImage*D:
2107
2108    "An INVALID_ENUM error is generated by CompressedTex*SubImage1D
2109    if the internal format of the texture is one of the ASTC formats.
2110
2111    An INVALID_OPERATION error is generated by CompressedTex*SubImage2D
2112    if the internal format of the texture is one of the ASTC formats
2113    and <border> is non-zero."
2114
2115    Add the following error conditions to TexStorage1D and TextureStorage1D:
2116
2117    "An INVALID_ENUM error is generated by TexStorage1D and TextureStorage1D
2118    if <format> is one of the ASTC formats."
2119
2120    Add the following error conditions to TexStorage2D and TextureStorage2D
2121    for versions of OpenGL that support texture rectangles:
2122
2123    "An INVALID_OPERATION error is generated by TexStorage2D and
2124    TextureStorage2D if <format> is one of the ASTC formats and <target>
2125    is TEXTURE_RECTANGLE."
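
    The error conditions in this section can be summarised as a pure
    function, which may be useful when writing conformance-style tests.
    This is a sketch under stated assumptions: the function names are
    hypothetical, no GL context is involved, and only the conditions
    quoted in this section are modelled (a GL_NO_ERROR result means that
    none of them applies, not that the real call would succeed). The enum
    values are those published in the Khronos headers.

```c
#include <stdbool.h>

/* Enum values as published in the Khronos GL headers. */
#define GL_NO_ERROR           0
#define GL_INVALID_ENUM       0x0500
#define GL_INVALID_OPERATION  0x0502
#define GL_TEXTURE_2D         0x0DE1
#define GL_TEXTURE_RECTANGLE  0x84F5

/* ASTC tokens occupy two contiguous ranges:
 * COMPRESSED_RGBA_ASTC_4x4_KHR .. 12x12_KHR           (0x93B0..0x93BD)
 * COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR .. 12x12_KHR   (0x93D0..0x93DD) */
static bool is_astc_format(unsigned format)
{
    return (format >= 0x93B0u && format <= 0x93BDu) ||
           (format >= 0x93D0u && format <= 0x93DDu);
}

/* Hypothetical: error expected from CompressedTexImage<dims>D. */
static unsigned astc_error_compressed_tex_image(int dims, unsigned internalformat,
                                                int border)
{
    if (!is_astc_format(internalformat))
        return GL_NO_ERROR;              /* rules below do not apply */
    if (dims == 1)
        return GL_INVALID_ENUM;          /* no ASTC for 1D textures */
    if (border != 0)
        return GL_INVALID_OPERATION;     /* no non-zero borders */
    return GL_NO_ERROR;
}

/* Hypothetical: error expected from TexStorage1D / TextureStorage1D. */
static unsigned astc_error_tex_storage_1d(unsigned format)
{
    return is_astc_format(format) ? GL_INVALID_ENUM : GL_NO_ERROR;
}

/* Hypothetical: error expected from TexStorage2D / TextureStorage2D. */
static unsigned astc_error_tex_storage_2d(unsigned target, unsigned format)
{
    if (is_astc_format(format) && target == GL_TEXTURE_RECTANGLE)
        return GL_INVALID_OPERATION;     /* no ASTC texture rectangles */
    return GL_NO_ERROR;
}
```

    A test harness might compare the value returned by GetError after the
    real call against the value predicted by these helpers.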
2126
2127Interactions with OpenGL 4.2
2128
2129    OpenGL 4.2 supports the feature that compressed textures can be
2130    compressed online, by passing the compressed texture format enum as
2131    the internal format when uploading a texture using TexImage1D,
2132    TexImage2D or TexImage3D (see Section 3.9.3, Texture Image
2133    Specification, subsection Encoding of Special Internal Formats).
2134
2135    Due to the complexity of the ASTC compression algorithm, it is not
2136    usually suitable for online use, and therefore ASTC support will be
2137    limited to pre-compressed textures only. Where on-device compression
2138    is required, a domain-specific limited compressor will typically
2139    be used; such a compressor is tied to the application and is
2140    therefore not suitable for implementation in the driver.
2141
2142    In particular, the ASTC format specifiers will not be added to
2143    Table 3.14, and thus will not be accepted by the TexImage*D
2144    functions, and will not be returned by the (already deprecated)
2145    COMPRESSED_TEXTURE_FORMATS query.
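
    Because only pre-compressed data is accepted, an application supplies
    the block data itself through CompressedTexImage*D and must pass a
    matching <imageSize>. The following sketch computes that size for one
    2D image. It relies on every ASTC block occupying 128 bits (16 bytes)
    regardless of its footprint; the function name and parameters are
    hypothetical, and the block footprint is assumed to be known from the
    chosen format token (for example 8 x 8 for an 8x8 format).

```c
#include <stddef.h>

/* Hypothetical helper: number of bytes of ASTC data for a 2D image of
 * width x height texels, using blocks of block_w x block_h texels.
 * Partial blocks at the right and bottom edges are stored in full. */
static size_t astc_2d_image_size(size_t width, size_t height,
                                 size_t block_w, size_t block_h)
{
    const size_t bytes_per_block = 16;   /* every ASTC block is 128 bits */

    if (block_w == 0 || block_h == 0)
        return 0;                        /* invalid footprint */

    /* Round up to a whole number of blocks in each dimension. */
    size_t blocks_x = (width + block_w - 1) / block_w;
    size_t blocks_y = (height + block_h - 1) / block_h;

    return blocks_x * blocks_y * bytes_per_block;
}
```

    For a 256 x 256 image with a 4x4 footprint this gives 64 * 64 * 16 =
    65536 bytes; a 10 x 10 image with an 8x8 footprint needs 2 * 2 blocks,
    or 64 bytes.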
2146
2147Issues
2148
2149 1) Three-dimensional block ASTC formats (i.e. formats whose block depth
2150    is greater than one) are not supported by these extensions.
2151
2152 2) The first release of the extension was not clear about the
2153    restrictions of the LDR profile and did not document interactions
2154    with cube map array textures.
2155
2156    RESOLVED. This extension has been rewritten to be based on OpenGL ES
2157    3.1, to clearly document LDR restrictions, and to add cube map array
2158    texture interactions.
2159
2160Revision History
2161
2162    Revision 8, June 8, 2017 - Added missing interactions with OpenGL.
2163
2164    Revision 7, July 14, 2016 - Clarified definition of 2D void-extent
2165    blocks.
2166
2167    Revision 6, March 8, 2016 - Clarified that sRGB transform is not
2168    applied to Alpha channel.
2169
2170    Revision 5, September 15, 2015 - fix typo in third paragraph of section
2171    8.7.
2172
2173    Revision 4, June 24, 2015 - minor cleanup from feedback. Move Issues and
2174    Interactions sections to the end of the document. Merge some language
2175    from OpenGL ES specification edits and rename some tables to figures,
2176    due to how they're generated in the core specifications. Include a
2177    description of the "Cube Map Array Texture" column added to table 3.19
2178    and expand the description of how to read this document when supporting
2179    only the LDR profile (Bug 13921).
2180
2181    Revision 3, May 28, 2015 - rebase extension on OpenGL ES 3.1. Clarify
2182    texture formats and targets supported by LDR and HDR profiles. Add cube
2183    map array targets and an Interactions section defining when they are
2184    supported. Add an Interactions section for immutable-format textures
2185    (Bug 13921).
2186
2187    Revision 2, April 28, 2015 - added CompressedTex{Sub,}Image3D to
2188    commands accepting ASTC format tokens in the New Tokens section (Bug
2189    10183).
2190