---
title: Transform Fallback
---

# Transform Fallback

We need to describe the presumed lookup fallback for transforms more clearly:

## Code equivalence

- A lone script code or long script name is equivalent to the BCP 47 syntax: Latn = Latin = und-Latn.
- "und" from BCP 47 is treated the same as the special code "any" in transform IDs.
- In the unlikely event of a collision between a special transform code (any, hex, fullwidth, etc.) and a BCP 47 language code, we have to figure out what to do. Initial suggestion: add "\_ZZ" to the language code.
- For the special codes, we should probably switch to aliases that have a low probability of collision, e.g. always longer than 3 letters.

## Language tag fallback

If the source or target is a Unicode language ID, then the usual locale fallback is followed, with some additions. For example, the chain for az\_Arab\_IR is:

1. az\_Arab\_IR
2. az\_Arab
3. az\_IR
4. az
5. Arab
6. Cyrl

The fallback additions are:

- We also fall back through the country (03). This is along the lines we've otherwise discussed for BCP 47 support, and we should clarify it in the spec.
- Once the language is reached, we fall back to script: first the specified script if there is one (05), then the likely script for the language (06, if different from 05).

## Laddered fallback

The source, target, and variant use "laddered" fallback. That is, in pseudocode:

a. for variant in variant-chain

b. for target in target-chain

c. for source in source-chain

 transform = lookup source-target/variant

 if transform != null return transform

..

For example, here is the chain for ru\_RU-el\_GR/BGN. I'm spacing out the source, target, and variant for clarity.

1. ru\_RU - el\_GR /BGN
2. ru - el\_GR /BGN
3. Cyrl - el\_GR /BGN
4. ru\_RU - el /BGN
5. ru - el /BGN
6. Cyrl - el /BGN
7. ru\_RU - Grek /BGN
8. ru - Grek /BGN
9. Cyrl - Grek /BGN
10. ru\_RU - el\_GR
11. ru - el\_GR
12. Cyrl - el\_GR
13. ru\_RU - el
14. ru - el
15. Cyrl - el
16. ru\_RU - Grek
17. ru - Grek
18. Cyrl - Grek

**Comments:**

1. The above is not how the ICU code works. That code actually discards the variant if the exact match is not found, so lines 02-09 are not queried at all. I think that is definitely a mistake.
2. Personally, I think the above chain might not be optimal; it would be better to have BGN be stronger than a country difference, but not as strong as a script difference. However, in conversations with Markus, I was convinced that a simple story for how it works is probably best, and the above is simpler to explain and easier to implement.

## Model Requirements

We have the implicit requirement that no variant is populated unless there is also a no-variant version. We need to make sure that this is maintained by the build tools and/or tests. That is, if we have fa-Latn/BGN, we should have fa-Latn as well.

The other piece of this is that we should name all the no-variant versions, so that people can be explicit about the variant even if we change the default later on. The upshot is that the no-variant version should always just be an alias to one of the variant versions. Operationally, that means the following actions:

Case 1. Only fa-Latn/BGN exists. Add an alias from fa-Latn to fa-Latn/BGN.

Case 2. Only foo-Latn exists. Rename it to foo-Latn/SOMETHING, and then do Case 1.
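
To make the rules above concrete, here is a minimal Python sketch of the language tag fallback, the laddered lookup, and the no-variant check from the model requirements. It is an illustration under stated assumptions, not the ICU or CLDR implementation: the function names (`locale_chain`, `laddered_ids`, `find_transform`, `missing_no_variant`), the `LIKELY_SCRIPT` table, and the representation of the registry as a set of ID strings are all hypothetical. It also assumes that any 4-letter subtag is a script and any 2- or 3-character trailing subtag is a region, and it does not handle "und"/"any" or the special codes.

```python
from typing import Dict, Iterator, List, Optional, Set

# Hypothetical likely-script table, for illustration only; real data
# would come from CLDR likely-subtags.
LIKELY_SCRIPT: Dict[str, str] = {"ru": "Cyrl", "el": "Grek"}


def locale_chain(tag: str, likely: Dict[str, str] = LIKELY_SCRIPT) -> List[str]:
    """Fallback chain for lang[_Script][_REGION], or for a lone script code."""
    parts = tag.split("_")
    if len(parts) == 1 and len(parts[0]) == 4:
        return [tag]  # lone script code: nothing further to fall back to
    lang = parts[0]
    script = next((p for p in parts[1:] if len(p) == 4), None)
    region = next((p for p in parts[1:] if len(p) in (2, 3)), None)
    candidates = [
        f"{lang}_{script}_{region}" if script and region else None,
        f"{lang}_{script}" if script else None,
        f"{lang}_{region}" if region else None,  # fall back through the country
        lang,
        script,            # the specified script, if any
        likely.get(lang),  # then the likely script for the language
    ]
    chain: List[str] = []
    for c in candidates:
        if c and c not in chain:  # skip missing pieces and duplicates
            chain.append(c)
    return chain


def laddered_ids(source: str, target: str, variant: Optional[str] = None,
                 likely: Dict[str, str] = LIKELY_SCRIPT) -> Iterator[str]:
    """Yield candidate IDs: variant outermost, then target, then source."""
    variants = [variant, None] if variant else [None]
    for v in variants:
        for t in locale_chain(target, likely):
            for s in locale_chain(source, likely):
                yield f"{s}-{t}/{v}" if v else f"{s}-{t}"


def find_transform(registry: Set[str], source: str, target: str,
                   variant: Optional[str] = None) -> Optional[str]:
    """Return the first registered ID along the laddered chain, or None."""
    for transform_id in laddered_ids(source, target, variant):
        if transform_id in registry:
            return transform_id
    return None


def missing_no_variant(registry: Set[str]) -> List[str]:
    """List IDs that have a variant form but no no-variant form."""
    bases = {i.split("/")[0] for i in registry if "/" in i}
    return sorted(b for b in bases if b not in registry)
```

Under these assumptions, `list(laddered_ids("ru_RU", "el_GR", "BGN"))` produces the 18 candidates in the same order as the example chain, and `missing_no_variant` could serve as the basis for a build-time test that flags Case 1 entries.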