This document explains the strategy that was used so far in starting the
migration to PSA Crypto and mentions future perspectives and open questions.

Goals
=====

Several benefits are expected from migrating to PSA Crypto:

G1. Use PSA Crypto drivers when available.
G2. Allow isolation of long-term secrets (for example, private keys).
G3. Allow isolation of short-term secrets (for example, TLS session keys).
G4. Have a clean, unified API for Crypto (retire the legacy API).
G5. Code size: compile out our implementation when a driver is available.

As of Mbed TLS 3.2, most of (G1) and all of (G2) are implemented when
`MBEDTLS_USE_PSA_CRYPTO` is enabled. For (G2) to take effect, the application
needs to be changed to use new APIs. For a more detailed account of what's
implemented, see `docs/use-psa-crypto.md`, where new APIs are about (G2), and
internal changes implement (G1).

Generally speaking, the numbering above doesn't mean that each goal requires
the preceding ones to be completed.

Compile-time options
====================

We currently have three compile-time options that are relevant to the migration:

- `MBEDTLS_PSA_CRYPTO_C` - enabled by default, controls the presence of the PSA
  Crypto APIs.
- `MBEDTLS_USE_PSA_CRYPTO` - disabled by default (enabled in "full" config),
  controls usage of PSA Crypto APIs to perform operations in X.509 and TLS
  (G1 above), as well as the availability of some new APIs (G2 above).
- `MBEDTLS_PSA_CRYPTO_CONFIG` - disabled by default, supports builds with
  drivers and without the corresponding software implementation (G5 above).
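
As an illustration, a user configuration that turns on all three options might
contain the fragment below. This is a sketch, not a recommended configuration:
the third option is spelled `MBEDTLS_PSA_CRYPTO_CONFIG` as it is later in this
document, and the two `#error` checks are illustrative restatements of
constraints discussed in this section, not code copied from the library.

```c
/* Sketch of a configuration fragment (for example in mbedtls_config.h).
 * The #error checks are illustrative only, not taken from the library. */
#define MBEDTLS_PSA_CRYPTO_C        /* PSA Crypto APIs are present */
#define MBEDTLS_USE_PSA_CRYPTO      /* X.509 and TLS call PSA (G1), new APIs (G2) */
#define MBEDTLS_PSA_CRYPTO_CONFIG   /* allow builds without software fallback (G5) */

/* USE_PSA_CRYPTO makes TLS, X.509 and PK depend on the PSA Crypto APIs. */
#if defined(MBEDTLS_USE_PSA_CRYPTO) && !defined(MBEDTLS_PSA_CRYPTO_C)
#error "MBEDTLS_USE_PSA_CRYPTO requires MBEDTLS_PSA_CRYPTO_C"
#endif

/* As of Mbed TLS 3.2 the two options cannot be combined. */
#if defined(MBEDTLS_USE_PSA_CRYPTO) && defined(MBEDTLS_ECP_RESTARTABLE)
#error "MBEDTLS_USE_PSA_CRYPTO is incompatible with MBEDTLS_ECP_RESTARTABLE"
#endif
```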

The reasons why `MBEDTLS_USE_PSA_CRYPTO` is optional and disabled by default
are:
- it's incompatible with `MBEDTLS_ECP_RESTARTABLE`;
- to avoid a hard/default dependency of TLS, X.509 and PK on
  `MBEDTLS_PSA_CRYPTO_C`, for backward compatibility reasons:
  - When `MBEDTLS_PSA_CRYPTO_C` is enabled and used, applications need to call
    `psa_crypto_init()` before TLS/X.509 uses PSA functions. (This prevents us
    from even enabling the option by default.)
  - `MBEDTLS_PSA_CRYPTO_C` has a hard dependency on
    `MBEDTLS_ENTROPY_C || MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` but it's
    currently possible to compile TLS and X.509 without either of these
    options. Also, we can't just auto-enable `MBEDTLS_ENTROPY_C` as it doesn't
    build out of the box on all platforms, and even less
    `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` as it requires a user-provided RNG
    function.

The downside of this approach is that until we are able to make
`MBEDTLS_USE_PSA_CRYPTO` non-optional (always enabled), we have to maintain
two versions of some parts of the code: one using PSA, the other using the
legacy APIs. However, see next section for strategies that can lower that
cost. The rest of this section explains the reasons for the
incompatibilities mentioned above.

At the time of writing (early 2022) it is unclear what could be done about the
backward compatibility issues, and in particular if the cost of implementing
solutions to these problems would be higher or lower than the cost of
maintaining dual code paths until the next major version. (Note: these
solutions would probably also solve other problems at the same time.)

### `MBEDTLS_ECP_RESTARTABLE`

Currently this option controls not only the presence of restartable APIs in
the crypto library, but also their use in the TLS and X.509 layers.
Since PSA Crypto does not support restartable operations, there's a clear
conflict: the TLS and X.509 layers can't both use only PSA APIs and get
restartable behaviour.

Supporting this in PSA is on our roadmap and currently planned for end of
2022, see <https://github.com/orgs/Mbed-TLS/projects/1#column-18883250>.

It will then require follow-up work to make use of the new PSA API in
PK/X.509/TLS in all places where we currently allow restartable operations.

### Backward compatibility issues with making `MBEDTLS_USE_PSA_CRYPTO` always on

1. Existing applications may not be calling `psa_crypto_init()` before using
   TLS, X.509 or PK. We can try to work around that by calling (the relevant
   part of) it ourselves under the hood as needed, but that would likely require
   splitting init between the parts that can fail and the parts that can't (see
   <https://github.com/ARM-software/psa-crypto-api/pull/536> for that).
2. It's currently not possible to enable `MBEDTLS_PSA_CRYPTO_C` in
   configurations that don't have `MBEDTLS_ENTROPY_C`, and we can't just
   auto-enable the latter, as it won't build or work out of the box on all
   platforms. There are two kinds of things we'd need to do if we want to work
   around that:
   1. Make it possible to enable the parts of PSA Crypto that don't require an
      RNG (typically, public key operations, symmetric crypto, some key
      management functions (destroy etc)) in configurations that don't have
      `ENTROPY_C`. This requires going through the PSA code base to adjust
      dependencies. Risk: there may be annoying dependencies, some of which may
      be surprising.
   2. For operations that require an RNG, provide an alternative function
      accepting an explicit `f_rng` parameter (see #5238), that would be
      available in entropy-less builds.
      (Then code using those functions still needs
      to have one version using it, for entropy-less builds, and one version
      using the standard function, for driver support in builds with entropy.)

See <https://github.com/Mbed-TLS/mbedtls/issues/5156>.

Taking advantage of the existing abstraction layers - or not
============================================================

The Crypto library in Mbed TLS currently has 3 abstraction layers that offer
algorithm-agnostic APIs for a class of algorithms:

- MD for message digests aka hashes (including HMAC)
- Cipher for symmetric ciphers (including AEAD)
- PK for asymmetric (aka public-key) cryptography (excluding key exchange)

Note: key exchange (FFDH, ECDH) is not covered by an abstraction layer.

These abstraction layers typically provide, in addition to the API for crypto
operations, types and numerical identifiers for algorithms (for
example `mbedtls_cipher_mode_t` and its values). The
current strategy is to keep using those identifiers in most of the code, in
particular in existing structures and public APIs, even when
`MBEDTLS_USE_PSA_CRYPTO` is enabled. (This is not an issue for G1, G2, G3
above, and is only potentially relevant for G4.)

There are multiple strategies that can be used regarding the place of those
layers in the migration to PSA.

Silently call to PSA from the abstraction layer
-----------------------------------------------

- Provide a new definition (conditionally on `USE_PSA_CRYPTO`) of wrapper
  functions in the abstraction layer, that call PSA instead of the legacy
  crypto API.
- Upside: changes contained to a single place, no need to change TLS or X.509
  code anywhere.
- Downside: tricky to implement if the PSA implementation is currently done on
  top of that layer (dependency loop).

This strategy is currently (early 2022) used for all operations in the PK
layer.
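
The "silent" strategy can be pictured with the minimal, self-contained sketch
below. Every name in it is hypothetical (`DEMO_USE_PSA_CRYPTO` plays the role
of `MBEDTLS_USE_PSA_CRYPTO`, and a toy checksum stands in for a real crypto
operation); it only illustrates the shape of the conditional wrapper, not how
the PK layer is actually implemented.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical switch playing the role of MBEDTLS_USE_PSA_CRYPTO. */
#define DEMO_USE_PSA_CRYPTO

/* Counts calls to the PSA-style stand-in, so the chosen path is observable. */
int demo_psa_calls = 0;

/* Stand-in for a legacy primitive: a simple additive checksum. */
uint32_t demo_legacy_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    return sum;
}

/* Stand-in for a PSA-style primitive: returns a status, result via pointer.
 * It does not call the legacy function, so there is no dependency loop. */
int demo_psa_checksum(const uint8_t *data, size_t len, uint32_t *out)
{
    uint32_t sum = 0;
    if (out == NULL || (data == NULL && len != 0)) {
        return -1;
    }
    for (size_t i = 0; i < len; i++) {
        sum += data[i];
    }
    *out = sum;
    demo_psa_calls++;
    return 0;
}

/* Wrapper in the abstraction layer: callers never see which backend runs. */
int demo_layer_checksum(const uint8_t *data, size_t len, uint32_t *out)
{
#if defined(DEMO_USE_PSA_CRYPTO)
    return demo_psa_checksum(data, len, out);
#else
    *out = demo_legacy_checksum(data, len);
    return 0;
#endif
}
```

Code in the upper layers only ever calls `demo_layer_checksum()`, which is why
this approach needs no change outside the abstraction layer.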

This strategy is not very well suited to the Cipher layer, as the PSA
implementation is currently done on top of that layer.

This strategy will probably be used for some time for the PK layer, while we
figure out what the future of that layer is: parts of it (parse/write, ECDSA
signatures in the format that X.509 & TLS want) are not covered by PSA, so
they will need to keep existing in some way. (Also, the PK layer is a good
place for dispatching to either PSA or `mbedtls_xxx_restartable` while that
part is not covered by PSA yet, if we decide to do that.)

Replace calls for each operation
--------------------------------

- For every operation that's done through this layer in TLS or X.509, just
  replace the function call with calls to PSA (conditionally on
  `USE_PSA_CRYPTO`).
- Upside: conceptually simple, and if the PSA implementation is currently done
  on top of that layer, avoids concerns about dependency loops.
- Upside: opens the door to building TLS/X.509 without that layer, saving some
  code size.
- Downside: TLS/X.509 code has to be changed for each operation.

This strategy is currently (early 2022) used for the MD layer and the Cipher
layer.

Opt-in use of PSA from the abstraction layer
--------------------------------------------

- Provide a new way to set up a context that causes operations on that context
  to be done via PSA.
- Upside: changes mostly contained in one place, TLS/X.509 code only needs to
  be changed when setting up the context, but not when using it. In
  particular, no changes to/duplication of existing public APIs that expect a
  key to be passed as a context of this layer (e.g. `mbedtls_pk_context`).
- Upside: avoids dependency loop when PSA is implemented on top of that layer.
- Downside: when the context is typically set up by the application, requires
  changes in application code.

This strategy is not useful when no context is used, for example with the
one-shot function `mbedtls_md()`.

There are two variants of this strategy: one where using the new setup
function also allows for key isolation (the key is only held by PSA,
supporting both G1 and G2 in that area), and one without isolation (the key is
still stored outside of PSA most of the time, supporting only G1).

This strategy, with support for key isolation, is currently (early 2022) used for
private-key operations in the PK layer - see `mbedtls_pk_setup_opaque()`. This
allows use of PSA-held private ECDSA keys in TLS and X.509 with no change to
the TLS/X.509 code, but a contained change in the application.

This strategy, without key isolation, was also previously used (until 3.1
included) in the Cipher layer - see `mbedtls_cipher_setup_psa()`. This allowed
use of PSA for cipher operations in TLS with no change to the application
code, and a contained change in TLS code. (It only supported a subset of
ciphers.)

Note: for private key operations in the PK layer, both the "silent" and the
"opt-in" strategy can apply, and can complement each other, as one provides
support for key isolation, but at the (unavoidable) cost of changes in
application code, while the other requires no application change to get
support for drivers, but fails to provide isolation support.
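
The "opt-in" strategy with key isolation can be sketched as follows. This is a
toy model under stated assumptions: all `demo_` names are hypothetical, the
"signature" is a single XOR rather than real cryptography, and a static table
stands in for the PSA key store. The point is only that the choice is made
once, at setup time, and that code using the context does not change.

```c
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not Mbed TLS or PSA APIs. */
typedef uint32_t demo_key_id_t;  /* plays the role of a PSA key identifier */

typedef enum { DEMO_KEY_NONE, DEMO_KEY_LEGACY, DEMO_KEY_OPAQUE } demo_key_kind_t;

typedef struct {
    demo_key_kind_t kind;
    const uint8_t *legacy_key;   /* key material visible to the application */
    size_t legacy_key_len;
    demo_key_id_t key_id;        /* identifier only: the key stays in the store */
} demo_pk_context;

/* Toy key store: callers only ever handle identifiers 1..N. */
static const uint8_t demo_store_keys[] = { 0x5A, 0xC3 };

int demo_store_sign(demo_key_id_t id, uint8_t msg, uint8_t *sig)
{
    if (id == 0 || id > sizeof(demo_store_keys)) {
        return -1;
    }
    *sig = (uint8_t) (msg ^ demo_store_keys[id - 1]);  /* not real crypto */
    return 0;
}

/* Classic setup: the context references key material directly. */
int demo_pk_setup_legacy(demo_pk_context *ctx, const uint8_t *key, size_t len)
{
    if (ctx == NULL || key == NULL || len == 0) {
        return -1;
    }
    ctx->kind = DEMO_KEY_LEGACY;
    ctx->legacy_key = key;
    ctx->legacy_key_len = len;
    ctx->key_id = 0;
    return 0;
}

/* Opt-in setup: the context only records an identifier. */
int demo_pk_setup_opaque(demo_pk_context *ctx, demo_key_id_t key_id)
{
    if (ctx == NULL || key_id == 0) {
        return -1;
    }
    ctx->kind = DEMO_KEY_OPAQUE;
    ctx->legacy_key = NULL;
    ctx->legacy_key_len = 0;
    ctx->key_id = key_id;
    return 0;
}

/* Callers use the same function whichever way the context was set up. */
int demo_pk_sign(const demo_pk_context *ctx, uint8_t msg, uint8_t *sig)
{
    switch (ctx->kind) {
        case DEMO_KEY_OPAQUE:
            return demo_store_sign(ctx->key_id, msg, sig);
        case DEMO_KEY_LEGACY:
            *sig = (uint8_t) (msg ^ ctx->legacy_key[0]);
            return 0;
        default:
            return -1;
    }
}
```

In this model the application changes one call (the setup function), while
every caller of `demo_pk_sign()` is untouched, which mirrors the trade-off
described above.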

Summary
-------

Strategies currently (early 2022) used with each abstraction layer:

- PK (for G1): silently call PSA
- PK (for G2): opt-in use of PSA (new key type)
- Cipher (G1): replace calls at each call site
- MD (G1): replace calls at each call site


Supporting builds with drivers without the software implementation
==================================================================

This section presents a plan towards G5: save code size by compiling out our
software implementation when a driver is available.

Additionally, we want to save code size by compiling out the
abstraction layers that we are not using when `MBEDTLS_USE_PSA_CRYPTO` is
enabled (see previous section): MD and Cipher.

Let's expand a bit on the definition of the goal: in such a configuration
(driver used, software implementation and abstraction layer compiled out),
we want:

a. the library to build in a reasonably-complete configuration,
b. with all tests passing,
c. and no more tests skipped than the same configuration with software
   implementation.

Criterion (c) ensures not only test coverage, but that driver-based builds are
at feature parity with software-based builds.

We can roughly divide the work needed to get there in the following steps:

0. Have a working driver interface for the algorithms we want to replace.
1. Have users of these algorithms call to PSA, not the legacy API, for all
   operations. (This is G1, and for PK, X.509 and TLS this is controlled by
   `MBEDTLS_USE_PSA_CRYPTO`.) This needs to be done in the library and tests.
2. Have users of these algorithms not depend on the legacy API for information
   management (getting a size for a given algorithm, etc.)
3. Adapt compile-time guards used to query availability of a given algorithm;
   this needs to be done in the library (for crypto operations and data) and
   tests.

Note: the first two steps enable use of drivers, but not by themselves removal
of the software implementation.

Note: the fact that step 1 is not achieved for all of libmbedcrypto (see
below) is the reason why criterion (a) says "a reasonably-complete
configuration", to allow working around internal crypto dependencies when
working on other parts such as X.509 and TLS - for example, a configuration
without RSA PKCS#1 v2.1 still allows reasonable use of X.509 and TLS.

Note: this is a conceptual division that will sometimes translate to how the
work is divided into PRs, sometimes not. For example, in situations where it's
not possible to achieve good test coverage at the end of step 1 or step 2, it
is preferable to group with the next step(s) in the same PR until good test
coverage can be reached.

**Status as of Mbed TLS 3.2:**

- Step 0 is achieved for most algorithms, with only a few gaps remaining.
- Step 1 is achieved for most of PK, X.509, and TLS when
  `MBEDTLS_USE_PSA_CRYPTO` is enabled with only a few gaps remaining (see
  docs/use-psa-crypto.md).
- Step 1 is not achieved for a lot of the crypto library including the PSA
  core. For example, `entropy.c` calls the legacy API
  `mbedtls_sha256` (or `mbedtls_sha512` optionally); `hmac_drbg.c` calls the
  legacy API `mbedtls_md` and `ctr_drbg.c` calls the legacy API `mbedtls_aes`;
  the PSA core depends on the entropy module and at least one of the DRBG
  modules (unless `MBEDTLS_PSA_CRYPTO_EXTERNAL_RNG` is used). Further, several
  crypto modules have similar issues, for example RSA PKCS#1 v2.1 calls
  `mbedtls_md` directly.
- Step 2 is achieved for most of X.509 and TLS (same gaps as step 1) when
  `MBEDTLS_USE_PSA_CRYPTO` is enabled - this was done in tasks such as #5795,
  #5796, #5797. It is being done in PK and RSA PKCS#1 v1.5 by PR #6065.
- Step 3 was mostly not started at all before 3.2; it is being done for PK by
  PR #6065.

**Strategy for step 1:**

Regarding PK, X.509, and TLS, this is mostly achieved with only a few gaps.
(The strategy was outlined in the previous section.)

Regarding libmbedcrypto, outside of the RNG subsystem, for modules that
currently depend on other legacy crypto modules, this can be achieved without
backwards compatibility issues, by using the software implementation if
available, and "falling back" to PSA only if it's not. The compile-time
dependency changes from the current one (say, `MD_C` or `AES_C`) to "the
previous dependency OR PSA Crypto with needed algorithms". When building
without software implementation, users need to call `psa_crypto_init()` before
calling any function from these modules. This condition does not constitute a
break of backwards compatibility, as it was previously impossible to build in
those configurations, and in configurations where the build was possible,
application code keeps working unchanged. A work-in-progress example of
applying this strategy, for RSA PKCS#1 v2.1, is here:
<https://github.com/Mbed-TLS/mbedtls/pull/6141>

There is a problem with the modules used for the PSA RNG, as currently the RNG
is initialized before drivers and the key store. This part will need further
study, but in the meantime we can proceed with everything that's not the
entropy module or one of the DRBG modules, and that does not depend on one of
those modules.

**Strategy for step 2:**

The most satisfying situation here is when we can just use the PSA Crypto API
for information management as well. However sometimes it may not be
convenient, for example in parts of the code that accept old-style identifiers
(such as `mbedtls_md_type_t`) in their API and can't assume PSA to be
compiled in (such as `rsa.c`).
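
To make "information management" concrete, here is a minimal sketch of the
kind of helper involved: a lookup that answers a size query from an old-style
identifier without calling either the legacy API or PSA. The enum and function
names are hypothetical (`demo_md_type_t` merely plays the role of
`mbedtls_md_type_t`); only the digest sizes themselves are standard facts.

```c
#include <stddef.h>

/* Hypothetical old-style identifier, standing in for mbedtls_md_type_t. */
typedef enum {
    DEMO_MD_NONE = 0,
    DEMO_MD_SHA1,
    DEMO_MD_SHA256,
    DEMO_MD_SHA384,
    DEMO_MD_SHA512
} demo_md_type_t;

/* Return the digest size in bytes, or 0 for an unknown identifier.
 * No crypto implementation (legacy or PSA) is needed to answer this. */
size_t demo_hash_size(demo_md_type_t md)
{
    switch (md) {
        case DEMO_MD_SHA1:
            return 20;
        case DEMO_MD_SHA256:
            return 32;
        case DEMO_MD_SHA384:
            return 48;
        case DEMO_MD_SHA512:
            return 64;
        default:
            return 0;
    }
}
```

A module such as an RSA padding routine could size its buffers with a helper
of this shape regardless of which implementation ends up computing the hash.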

It is suggested that, as a temporary solution until we clean this up
later when removing the legacy API including its identifiers (G4), we may
occasionally use ad-hoc internal functions, such as the ones introduced by PR
#6065 in `library/hash_info.[ch]`.

An alternative would be to have two different code paths depending on whether
`MBEDTLS_PSA_CRYPTO_C` is defined or not. However this is not great for
readability or testability.

**Strategy for step 3:**

There are currently two (complementary) ways for crypto-using code to check if a
particular algorithm is supported: using `MBEDTLS_xxx` macros, and using
`PSA_WANT_xxx` macros. For example, PSA-based code that wants to use SHA-256
will check for `PSA_WANT_ALG_SHA_256`, while legacy-based code that wants to
use SHA-256 will check for `MBEDTLS_SHA256_C` if using the `mbedtls_sha256`
API, or for `MBEDTLS_MD_C && MBEDTLS_SHA256_C` if using the `mbedtls_md` API.

Code that obeys `MBEDTLS_USE_PSA_CRYPTO` will want to use one of the two
dependencies above depending on whether `MBEDTLS_USE_PSA_CRYPTO` is defined:
if it is, the code wants the algorithm available in PSA, otherwise, it wants it
available via the legacy API(s) it is using (MD and/or low-level).

The strategy for steps 1 and 2 above will introduce new situations: code that
currently computes hashes using MD (resp. a low-level hash module) will gain
the ability to "fall back" to using PSA if the legacy dependency isn't
available. Data related to a certain hash (OID, sizes, translations) should
only be included in the build if it is possible to use that hash in some way.

In order to cater to these new needs, new families of macros are introduced in
`legacy_or_psa.h`, see its documentation for details.
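
The two kinds of combined dependency described above can be sketched as
follows. The `DEMO_` macro and function names are hypothetical and are not the
names used in `legacy_or_psa.h`; the three `#define`s at the top simulate one
particular build (PSA provides SHA-256, no legacy SHA-256 module,
`MBEDTLS_USE_PSA_CRYPTO` off) purely for demonstration.

```c
/* Simulated build configuration, for demonstration only:
 * PSA Crypto is present and offers SHA-256 (for example through a driver);
 * MBEDTLS_SHA256_C, MBEDTLS_MD_C and MBEDTLS_USE_PSA_CRYPTO are not set. */
#define MBEDTLS_PSA_CRYPTO_C
#define PSA_WANT_ALG_SHA_256 1

/* Hypothetical macro for code that obeys MBEDTLS_USE_PSA_CRYPTO:
 * PSA if the option is set, the MD layer otherwise. */
#if (defined(MBEDTLS_USE_PSA_CRYPTO) && defined(PSA_WANT_ALG_SHA_256)) || \
    (!defined(MBEDTLS_USE_PSA_CRYPTO) && \
     defined(MBEDTLS_MD_C) && defined(MBEDTLS_SHA256_C))
#define DEMO_HAS_SHA_256_BASED_ON_USE_PSA
#endif

/* Hypothetical macro for code that decides on availability:
 * the low-level module if present, otherwise fall back to PSA. */
#if defined(MBEDTLS_SHA256_C) || \
    (defined(MBEDTLS_PSA_CRYPTO_C) && defined(PSA_WANT_ALG_SHA_256))
#define DEMO_HAS_SHA_256_BASED_ON_AVAILABILITY
#endif

/* Report whether code obeying MBEDTLS_USE_PSA_CRYPTO can use SHA-256. */
int demo_sha256_for_use_psa_code(void)
{
#if defined(DEMO_HAS_SHA_256_BASED_ON_USE_PSA)
    return 1;
#else
    return 0;
#endif
}

/* Report whether availability-based code can use SHA-256. */
int demo_sha256_for_availability_code(void)
{
#if defined(DEMO_HAS_SHA_256_BASED_ON_AVAILABILITY)
    return 1;
#else
    return 0;
#endif
}
```

In the simulated build the first function returns 0 and the second returns 1,
which shows why a single existing macro cannot express both situations.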

It should be noted that there are currently:
- too many different ways of computing a hash (low-level, MD, PSA);
- too many different ways to configure the library that influence which of
  these ways is available and will be used (`MBEDTLS_USE_PSA_CRYPTO`,
  `MBEDTLS_PSA_CRYPTO_CONFIG`, `mbedtls_config.h` + `psa/crypto_config.h`).

As a result, we need more families of dependency macros than we'd like to.
This is a temporary situation until we move to a place where everything is
based on PSA Crypto. In the meantime, long and explicit names were chosen for
the new macros in the hope of avoiding confusion.

Note: the new macros supplement but do not replace the existing macros:
- code that always uses PSA Crypto (for example, code specific to TLS 1.3)
  should use `PSA_WANT_xxx`;
- code that always uses the legacy API (for example, crypto modules that have
  not undergone step 1 yet) should use `MBEDTLS_xxx_C`;
- code that may use one of the two APIs, either based on
  `MBEDTLS_USE_PSA_CRYPTO` (X.509, TLS 1.2, shared between TLS 1.2 and 1.3),
  or based on availability (crypto modules after step 1), should use one of
  the new macros from `legacy_or_psa.h`.

Executing step 3 will mostly consist of using the right dependency macros in
the right places (once the previous steps are done).

**Note on testing**

Since supporting driver-only builds is not about adding features, but about
supporting existing features in new types of builds, testing will not involve
adding cases to the test suites, but instead adding new components in `all.sh`
that build and run tests in newly-supported configurations. For example, if
we're making some part of the library work with hashes provided only by
drivers when `MBEDTLS_USE_PSA_CRYPTO` is defined, there should be a place in
`all.sh` that builds and runs tests in such a configuration.

There is however a risk, especially in step 3 where we change how dependencies
are expressed (sometimes in bulk), of getting things wrong in a way that would
result in more tests being skipped, which is easy to miss. Care must be
taken to ensure this does not happen. The following criteria can be used:

1. The sets of tests skipped in the default config and the full config must be
   the same before and after the PR that implements step 3. This is tested
   manually for each PR that changes dependency declarations by using the script
   `outcome-analysis.sh` in the present directory.
2. The set of tests skipped in the driver-only build is the same as in an
   equivalent software-based configuration. This is tested automatically by the
   CI in the "Results analysis" stage, by running
   `tests/scripts/analyze_outcomes.py`. See the
   `analyze_driver_vs_reference_xxx` actions in the script and the comments above
   their declaration for how to do that locally.


Migrating away from the legacy API
==================================

This section briefly introduces questions and possible plans towards G4,
mainly as they relate to choices in previous stages.

The role of the PK/Cipher/MD APIs in user migration
---------------------------------------------------

We're currently taking advantage of the existing PK layer in order
to reduce the number of places where library code needs to be changed. It's
only natural to consider using the same strategy (with the PK, MD and Cipher
layers) for facilitating migration of application code.

Note: a necessary first step for that would be to make sure PSA is no longer
implemented on top of the concerned layers.

### Zero-cost compatibility layer?

The most favourable case is if we can have a zero-cost abstraction (no
runtime, RAM usage or code size penalty), for example just a bunch of
`#define`s, essentially mapping `mbedtls_` APIs to their `psa_` equivalent.

Unfortunately that's unlikely to fully work. For example, the MD layer uses the
same context type for hashes and HMACs, while the PSA API (rightfully) has
distinct operation types. Similarly, the Cipher layer uses the same context
type for unauthenticated and AEAD ciphers, which again the PSA API
distinguishes.

It is unclear how much value, if any, a zero-cost compatibility layer that's
incomplete (for example, for MD covering only hashes, or for Cipher covering
only AEAD) or differs significantly from the existing API (for example,
introducing new context types) would provide to users.

### Low-cost compatibility layers?

Another possibility is to keep most or all of the existing API for the PK, MD
and Cipher layers, implemented on top of PSA, aiming for the lowest possible
cost. For example, `mbedtls_md_context_t` would be defined as a (tagged) union
of `psa_hash_operation_t` and `psa_mac_operation_t`, then `mbedtls_md_setup()`
would initialize the correct part, and the rest of the functions would be simple
wrappers around PSA functions. This would vastly reduce the complexity of the
layers compared to the existing ones (no need to dispatch through function
pointers, just call the corresponding PSA API).

Since this would still represent a non-zero cost, not only in terms of code
size, but also in terms of maintenance (testing, etc.) this would probably
be a temporary solution: for example keep the compatibility layers in 4.0 (and
make them optional), but remove them in 5.0.

Again, this provides the most value to users if we can manage to keep the
existing API unchanged.
There might be conflicts between this goal and that of
reducing the cost, and judgment calls may need to be made.

Note: when it comes to holding public keys in the PK layer, depending on how
the rest of the code is structured, it may be worth holding the key data in
memory controlled by the PK layer as opposed to a PSA key slot, moving it to a
slot only when needed (see current `ecdsa_verify_wrap` when
`MBEDTLS_USE_PSA_CRYPTO` is defined). For example, when parsing a large
number, N, of X.509 certificates (for example the list of trusted roots), it
might be undesirable to use N PSA key slots for their public keys as long as
the certs are loaded. OTOH, this could also be addressed by merging the "X.509
parsing on-demand" (#2478), and then the public key data would be held as
bytes in the X.509 CRT structure, and only moved to a PK context / PSA slot
when it's actually used.

Note: the PK layer actually consists of two relatively distinct parts: crypto
operations, which will be covered by PSA, and parsing/writing (exporting)
from/to various formats, which is currently not fully covered by the PSA
Crypto API.

### Algorithm identifiers and other identifiers

It should be easy to provide the user with a bunch of `#define`s for algorithm
identifiers, for example `#define MBEDTLS_MD_SHA256 PSA_ALG_SHA_256`; most of
those would be in the MD, Cipher and PK compatibility layers mentioned above,
but there might be some in other modules that may be worth considering, for
example identifiers for elliptic curves.

### Lower layers

Generally speaking, we would retire all of the low-level, non-generic modules,
such as AES, SHA-256, RSA, DHM, ECDH, ECP, bignum, etc, without providing
compatibility APIs for them. People would be encouraged to switch to the PSA
API.
(The compatibility implementation of the existing PK, MD, Cipher APIs
would mostly benefit people who already used those generic APIs rather than
the low-level, alg-specific ones.)

### APIs in TLS and X.509

Public APIs in TLS and X.509 may be affected by the migration in at least two
ways:

1. APIs that rely on a legacy `mbedtls_` crypto type: for example
   `mbedtls_ssl_conf_own_cert()` to configure a (certificate and the
   associated) private key. Currently the private key is passed as a
   `mbedtls_pk_context` object, which would probably change to a `psa_key_id_t`.
   Since some users would probably still be using the compatibility PK layer, it
   would need a way to easily extract the PSA key ID from the PK context.

2. APIs that accept a list of identifiers: for example
   `mbedtls_ssl_conf_curves()` taking a list of `mbedtls_ecp_group_id`s. This
   could be changed to accept a list of pairs (`psa_ecc_family_t`, size) but we
   should probably take this opportunity to move to an identifier independent
   from the underlying crypto implementation and use TLS-specific identifiers
   instead (based on IANA values or custom enums), as is currently done in the
   new `mbedtls_ssl_conf_groups()` API (see #4859).

Testing
-------

A question that needs careful consideration when we come around to removing
the low-level crypto APIs and making PK, MD and Cipher optional compatibility
layers is how to preserve testing quality. A lot of the existing test
cases use the low-level crypto APIs; we would need to either keep using that
API for tests, or manually migrate tests to the PSA Crypto API. Perhaps a
combination of both, perhaps evolving gradually over time.