# Transport Security State Generator

This directory contains the code for the transport security state generator, a
tool that generates a C++ file based on preload data in
[transport_security_state_static.json](/net/http/transport_security_state_static.json).
This JSON file contains the domain security policy configurations for all
preloaded domains.

[TOC]

## Domain Security Policies

Website owners can set a number of security policies for their domains, usually
by sending configuration in an HTTP header. Chromium supports preloading for some
of these security policies so that users benefit from these policies regardless
of their browsing history. Website owners can request preloading for their
domains. Chromium supports preloading for the following domain security
policies:

* [HTTP Strict Transport Security (HSTS)](https://tools.ietf.org/html/rfc6797)
* [Public Key Pinning Extension for HTTP](https://tools.ietf.org/html/rfc7469)

Chromium and most other browsers ship the preloaded configurations inside their
binary. Chromium uses a custom data structure for this.

### I want to preload a website

Please follow the instructions at [hstspreload.org](https://hstspreload.org/).

## I want to use the preload list for another project

Please contact [the list maintainers](https://hstspreload.org/#contact) before
you do.

## The Preload Generator

The transport security state generator is executed during the build process (it
may execute multiple times depending on the targets you're building) and
generates data structures that are compiled into the binary. You can find the
generated output in
`[build-folder]/gen/net/http/transport_security_state_static*.h`.

### Usage

Make sure you have built the `transport_security_state_generator` target.

`transport_security_state_generator <json-file> <pins-file> <template-file> <output-file> [--v=1]`

* **json-file**: JSON file containing all preload configurations (e.g.
  `net/http/transport_security_state_static.json`)
* **pins-file**: file containing the public key information for the pinsets
  referenced from **json-file** (e.g.
  `net/http/transport_security_state_static.pins`)
* **template-file**: contains the global structure of the header file with
  placeholders for the generated data (e.g.
  `net/http/transport_security_state_static.template`)
* **output-file**: file to write the output to
* **--v**: verbosity level

## The Preload Format

The preload data is stored in the Chromium binary as a trie encoded in a byte
array (`net::TransportSecurityStateSource::preloaded_data`). The hostnames are
stored in their canonicalized form and compressed using Huffman coding. The
generic decoder for preloaded Huffman encoded trie data is `PreloadDecoder` and
lives in `net/extras/preload_data/decoder.cc`. The HSTS specific implementation
is `DecodeHSTSPreload` and lives in `net/http/transport_security_state.cc`.

### Huffman Coding

A Huffman coding is calculated for all characters used in the trie (characters
in hostnames and the `end of table` and `terminal` values). The Huffman tree
can be rebuilt from the `net::TransportSecurityStateSource::huffman_tree`
array.

The (internal) nodes of the tree are encoded as pairs of uint8_t values. The
last node in the array is the root of the tree. Each node is two uint8_t values,
the first is "left" and the second is "right". If a uint8_t value has the MSB
set it is a leaf value and the 7 least significant bits represent an ASCII
character (from the range 0-127, the tree does not support extended ASCII). If
the MSB is not set it is a pointer to the n'th node in the array.
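
The node layout described above can be walked with a few lines of C++. The
following is a minimal sketch, not Chromium's `PreloadDecoder`: the names
`DecodeHuffmanChar` and `DecodeHuffmanString` are hypothetical, the input bits
are passed as a `std::vector<bool>` for simplicity, and it assumes that a 0 bit
selects the "left" value of a node and a 1 bit selects the "right" value, which
this document does not state explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a Chromium API). Decodes one character from `bits`,
// starting at `*pos`, by walking `tree` from the root. Returns false if the
// bits run out or a pointer refers to a node outside the array.
// Assumption: bit 0 selects "left" (first byte), bit 1 selects "right".
bool DecodeHuffmanChar(const std::vector<std::uint8_t>& tree,
                       const std::vector<bool>& bits,
                       std::size_t* pos,
                       char* out) {
  // The array is a sequence of (left, right) pairs, so it must be non-empty
  // and have an even length.
  assert(!tree.empty() && tree.size() % 2 == 0);
  const std::size_t node_count = tree.size() / 2;
  std::size_t node = node_count - 1;  // The last pair is the root.

  while (*pos < bits.size()) {
    const bool go_right = bits[*pos];
    ++*pos;
    const std::uint8_t value = tree[node * 2 + (go_right ? 1 : 0)];
    if (value & 0x80) {
      // MSB set: leaf. The low 7 bits are the ASCII character.
      *out = static_cast<char>(value & 0x7F);
      return true;
    }
    // MSB clear: the value is the index of the next node.
    if (value >= node_count) return false;
    node = value;
  }
  return false;  // Ran out of bits before reaching a leaf.
}

// Hypothetical helper: decodes characters until the bits are exhausted or a
// character cannot be decoded completely.
std::string DecodeHuffmanString(const std::vector<std::uint8_t>& tree,
                                const std::vector<bool>& bits) {
  std::string result;
  std::size_t pos = 0;
  char c = 0;
  while (DecodeHuffmanChar(tree, bits, &pos, &c)) result.push_back(c);
  return result;
}
```

With the example array used in this section, the bits 0, 1, 0 lead from the
root to node 1, then to node 0, and end at its left value 0xE1, which decodes
to `a`.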

For example, the following uint8_t array

`0xE1, 0xE2, 0xE3, 0x0, 0xE4, 0xE5, 0x1, 0x2`

represents 9 elements:

* the implicit root node (node 3)
* 3 internal nodes: 0x0 (node 0), 0x1 (node 1), and 0x2 (node 2)
* 5 leaf values: 0xE1, 0xE2, 0xE3, 0xE4, and 0xE5 (which all have the most
  significant bit set)

When decoded this results in the following Huffman tree:

```
             root (node 3)
            /             \
     node 1                 node 2
   /       \               /      \
0xE3 (c)    node 0     0xE4 (d)    0xE5 (e)
           /      \
       0xE1 (a)    0xE2 (b)
```


### The Trie Encoding

The byte array containing the trie is made up of a set of nodes represented by
dispatch tables. Each dispatch table contains a (possibly empty) shared prefix,
a value, and zero or more pointers to child dispatch tables. The node value
is an encoded entry and the associated hostname can be found by going up the
trie.

The trie contains the hostnames in reverse and the hostnames are terminated by a
`terminal value`.

The dispatch table for the root node starts at bit position
`net::TransportSecurityStateSource::root_position`.

The binary format for the trie is defined by the following
[ABNF](https://tools.ietf.org/html/rfc5234).

```abnf
trie               = 1*dispatch-table

dispatch-table     = prefix-part         ; a common prefix for the node and its children
                     1*value-part        ; 1 or more values or pointers to children
                     end-of-table-value  ; signals the end of the table

prefix-part        = prefix-length       ; a prefix code encoding of the number
                                         ; of characters in the prefix
                     prefix-characters   ; the actual prefix characters
prefix-length      = 1*BIT               ; See net::extras::PreloadDecoder::DecodeSize for the format
value-part         = huffman-character node-value
                                         ; table with the node value and pointers to children

node-value         = node-entry          ; preload entry for the hostname at this node
                   / node-pointer        ; a bit offset pointing to another dispatch
                                         ; table

node-entry         = preloaded-entry     ; encoded preload configuration for one
                                         ; hostname (see section below)
node-pointer       = long-bit-offset
                   / short-bit-offset

long-bit-offset    = %b1                 ; 1 bit indicates long form will follow
                     4BIT                ; 4 bit number indicating bit length of the offset
                     8*22BIT             ; offset encoded as an n bit number (see above)
                                         ; where n is the offset length (see above) + 8
short-bit-offset   = %b0                 ; 0 bit indicates short form will follow
                     7BIT                ; offset as a 7 bit number

terminal-value     = huffman-character   ; ASCII value 0x00 encoded using Huffman
end-of-table-value = huffman-character   ; ASCII value 0x7F encoded using Huffman

prefix-characters  = *huffman-character
huffman-character  = 1*BIT
```

### The Preloaded Entry Encoding

The entries are encoded using a variable length encoding. Each entry is made up
of one part for each supported policy. The length of these parts depends on the
actual configuration; some fields are omitted in some cases.

The binary format for an entry is defined by the following ABNF.

```abnf
preloaded-entry    = BIT                   ; simple entry flag
                     [hsts-part hpkp-part]
                                           ; policy specific parts are only
                                           ; present when the simple entry flag
                                           ; is set to 0 and omitted otherwise

hsts-part          = include-subdomains    ; HSTS includeSubdomains flag
                     BIT                   ; whether to force HTTPS

hpkp-part          = BIT                   ; whether to enable pinning
                     [pinset-id]           ; only present when pinning is enabled
                     [include-subdomains]  ; HPKP includeSubdomains flag, only
                                           ; present when pinning is enabled and
                                           ; HSTS includeSubdomains is not used

pinset-id          = array-index

report-uri-id      = array-index
include-subdomains = BIT
array-index        = 4BIT                  ; a 4 bit number
```

The **array-index** values are indices in the associated arrays:

* `net::TransportSecurityStateSource::pinsets` for **pinset-id**

#### Simple entries

The majority of entries on the preload list are submitted through
[hstspreload.org](https://hstspreload.org) and share the same policy
configuration (HSTS + includeSubdomains only). To save space, these entries
(called **simple entries**) use a shorter encoding where the first bit (simple
entry flag) is set to 1 and the rest of the configuration is omitted.

### Tests

The generator code has its own unit tests in the
`net/tools/transport_security_state_generator` folder.

The encoder and decoder for the preload format live in different places and are
tested by end-to-end tests (`TransportSecurityStateTest.DecodePreload*`) in
`net/http/transport_security_state_unittest.cc`. The tests use their own
preload lists; the data structures for these lists are generated in the same way
as for the official Chromium list.

All these tests are part of the `net_unittests` target.

#### Writing tests that depend on static transport security state

Tests in `net_unittests` (except for `TransportSecurityStateStaticTest`) should
not depend on the real preload list. If you are writing tests that require a
static transport security state use
`transport_security_state_static_unittest_default.json` instead. Tests can
override the active preload list by calling
`SetTransportSecurityStateSourceForTesting`.

## See also

* <https://hstspreload.org/>
* <https://www.chromium.org/hsts>