• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1---
2title: BCP 47 Changes (DRAFT)
3---
4
5# BCP 47 Changes (DRAFT)
6
7With the new release of the new version of [BCP 47](http://www.inter-locale.com/ID/draft-ietf-ltru-4646bis-18.html), there are various changes we need to make in Unicode CLDR and LDML. Already in CLDR 1.7 we have made modifications anticipating the release: see [BCP 47 Tag Conversion](http://unicode.org/reports/tr35/#BCP_47_Tag_Conversion) in the spec (and the orginal [design proposal](https://cldr.unicode.org/development/development-process/design-proposals/bcp47-syntax-mapping)), but more changes need to be made.
8
9## Formula
10
11We need to take another look at which languages we show in the survey tool for translation, because the new version is [very large](http://tools.ietf.org/html/draft-ietf-ltru-4645bis), around 7,000 languages. Showing all of those languages in the Survey tool would neither be good for the usability of the tool for most translators, nor for tool performance, so we need some formula for picking which languages to show by default.
12
13For feedback on this document, please file a Reply under [http://www.unicode.org/cldr/bugs/locale-bugs?findid=1977](http://www.unicode.org/cldr/bugs/locale-bugs?findid=1977). For discussion of issues, please send email to [cldr-users@unicode.org](mailto:cldr-users@unicode.org).
14
15**Draft Formula**
16
17A. We show a language code X for translation if any of the following conditions are true:
18
191. X is a qualified language\*\*, **and** has at least **100K** speakers, **and** at least one of the following is true:
20	1. X is has official status\* in any country
21	2. X exceeds a threshold population† of literate users worldwide: **10M**
22	3. X exceeds a threshold population† in some country Z: **1M** ***and*** **1/3** of Z's population†.
232. X has ***non-draft*** minimal language coverage‡ in CLDR itself.
243. *Only for translation in locale Y:* X is a qualified language\*\* that already has a translation in CLDR data in Y.
254. X is an exception explicitly approved by the committee, either in root, or in some language Y.
26	1. Current examples: Latin, Sanskrit
27
28If a translator finds that X is needed for translation in language Y, then a bug can be filed. If we find the volume is high, we may need to add is some way for a translator to add a language in the survey tool.
29
30B. We show a script code S for translation if and only if it is one of the scripts used by one of the languages shown.
31
32**Notes**
33
34\*\* qualified language: excluding collection (except for macrolanguages with predominant forms), ancient, historic, and extinct languages: see [Scope](http://www.sil.org/iso639-3/scope.asp) and [Types](http://www.sil.org/iso639-3/types.asp). Some could be added as exceptions as needed.
35
36‡ minimal coverage - see [Coverage Levels](http://www.unicode.org/reports/tr35/#Coverage_Levels) - at a non-draft level.
37
38\* *official status* means official, de facto official, official regional, or de facto official regional.
39
40† *population* means literate 14-day active users (well, theoretically - we can only get an approximation of that), based on  [CLDR figures](http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html). Our concern is with written language, not spoken, and so we don't focus on variants that don't have much written usage; moreover, the population figures we want to focus on are the literate population. For this reason and others, we don't rely on the Ethnologue figures. See also [Picking the Right Language Code](https://cldr.unicode.org/index/cldr-spec/picking-the-right-language-code).
41
42
43**Please review the generated lists in** [**Filtered Scripts and Languages**](https://cldr.unicode.org/development/development-process/design-proposals/bcp-47-changes-draft)**.** A spreadsheet with some details is on<http://spreadsheets.google.com/pub?key=rORMJfeNEUR37PlS8HIa_rQ>. The first column is the language, 2rd is the world population of the language (literate), and the remaining columns are the reasons (data for 1.1, 1.2, 1.3 from the above).
44
45Known issues:
46
47- need to add Norwegian [no], resolve tl vs fil, ...
48- Tokelau has no speakers (bug filed)
49
50### Survey Tool Changes
51
52The above would only require a small tool change: the main change is that the approved list from #1 and #2 would be in CODE\_FALLBACK, and nothing else would. Languages would get #3 cases by virtue of there being a translated tag already in the language, even if Root doesn't have anything (because it is not in CODE\_FALLBACK). Thus if the locale doesn't already contain a translation for, say, Ancient Greek, it would not show up in the survey tool.
53
54We would add the lists to the supplemental metadata for access by the tools. The Coverage tool and spec also need to be aligned with the above.
55
56## Other Changes
57
58We also need to make other changes to the spec in regards to the new version of BCP 47. In particular, those [macrolanguages](http://www.sil.org/iso639-3/macrolanguages.asp) with an encompassed language that is a "predominant form", CLDR treats the predominant form and the macrolanguage as aliases. See [Locale Field Definitions](http://unicode.org/reports/tr35/#Locale_Field_Definitions) in the spec. We need to flesh that table out to include all of the [macrolanguages](http://www.sil.org/iso639-3/macrolanguages.asp) that are in the [Included Languages](https://cldr.unicode.org/development/development-process/design-proposals/bcp-47-changes-draft), such as Azerbaijani. Here is a start at that (but still just draft). The first part of this list is from a draft of BCP 47bis. The last three are codes that are in the current (2006) version of BCP 47.
59
60Macrolanguage Table
61
62| Macrolanguage | Encompassed Language | Comments |
63|---|---|---|
64| Arabic ' ar ' | Standard Arabic ' arb ' |  |
65| Konkani ' kok ' | Konkani ( individual language) ' knn ' |  |
66| Malay ' ms ' | Standard Malay ' zsm ' |  |
67| Swahili ' sw ' | Swahili ( individual language) ' swh ' |  |
68| Uzbek ' uz ' | Northern Uzbek ' uzn ' |  |
69| Chinese ' zh ' | Mandarin Chinese ' cmn ' |  |
70| Norwegian ' no ' | Norwegian Bokmal ' nob ' = nb | To regularize, we may want to switch in CLDR from nb as the 'norm' to no. |
71| Serbo-croatian ' sh ' |  | *This is a complex situation, and we'll probably leave as is.* |
72| Kurdish 'ku' | Northern Kurdish 'kmr'? | We probably want to change the default content locale to ku-Latn |
73| Akan ' ak ' | Twi ' tw ' and Fanti ' fat' | This appears to be a mistake in ISO 639. See: ISO 636 Deprecation Requests . |
74| Persian  fas (fa) | Western Farsi pes and prs Dari | This appears to be a mistake in ISO 639. See: ISO 636 Deprecation Requests . |
75
76These would also go into the \<alias> element of the supplemental metadata. We may add more such aliases over time, as we find new predominant forms. Note that we still need to offer both aliases for translation in many cases. For example, we want to show both 'no' and 'nb'.
77
78## Lenient Parsing
79
80There are many circumstances where we get less than perfect language identifiers coming in. I think we should have some guidelines as to how to do this. Here are the possibilities:
81
821. case / hyphen insensitivity
832. map valid non-canonical forms to their canonical equivalents (zh-cmn, cmn => zh)
843. map certain common invalid forms to their canonical equivalents:
85	1. UK => GB
86	2. eng => en // and other illegal 3-letter 639 codes that correspond to 2-letter codes
87	3. 840 => US // other numeric region codes that correspond to 2-letter codes
884. map away extlangs. Formally, en-yue is valid (this slipped by us in doing BCP 47), and canonicalizes in BCP 47 to yue, the same as zh-yue does. In any event, the simplest thing for us to do is if there is a syntactic extlang:
89	1. Verify that the base language and extlang are both valid language subtags
90	2. Remove the base language
91	3. This avoids having to store which languages are also extlangs, and what their prefixes are.
92
93People have to do #1. We should recommend #2, and make it easy to support #3.
94
95See demo at [http://unicode.org/cldr/utility/languageid.jsp](http://unicode.org/cldr/utility/languageid.jsp)
96
97Also, we should consider modifying the canonical form of language identifiers so as to have lowercase variants (with the exception of some set of grandfathered codes). The following are generated by GenerateMaximalLocales, plus 7 hand modifications for the last line.
98
99## Filtered Scripts and Languages
100
101The following script/language names would be included (/excluded) from default translation. For the method used to get this list, see [Formula](https://cldr.unicode.org/development/development-process/design-proposals/bcp-47-changes-draft).
102
103The languages are listed in the format Abkhazian [ab]-OR, where [xx] is the code, and "OR" is the abbreviated "best" status in some territory: **U**nknown, **O**fficial **R**egional, **O**fficial **M**inority, **D**e facto official, **O**fficial.
104
105### Included Script Names: 41+
106
107- Arabic [Arab], Armenian [Armn]
108- Bengali [Beng]
109- Cyrillic [Cyrl]
110- Devanagari [Deva]
111- Ethiopic [Ethi]
112- Georgian [Geor], Greek [Grek], Gujarati [Gujr], Gurmukhi [Guru]
113- Hebrew [Hebr], Han [Hani]
114	- Simplified Han [Hans], Traditional Han [Hant], Bopomofo [Bopo]
115- Japanese [Jpan]
116	- Hiragana [Hira], Katakana [Kana]
117- Kannada [Knda], Khmer [Khmr], Korean [Kore]
118	- Hangul [Hang]
119- Lao [Laoo], Latin [Latn]
120- Malayalam [Mlym], Mongolian [Mong], Myanmar [Mymr]
121- Oriya [Orya]
122- Sinhala [Sinh]
123- Tamil [Taml], Telugu [Telu], Thaana [Thaa], Thai [Thai], Tibetan [Tibt]
124- Special codes:
125	- Common [Zyyy], Symbols [Zsym], Unwritten [Zxxx], Unknown or Invalid Script [Zzzz]
126	- Braille [Brai]
127- *Possibly also in the future:*
128	- Tifinagh [Tfng], Yi [Yiii],
129	- Unified Canadian Aboriginal Syllabics [Cans],
130
131### Excluded Script Names:
132
133- Avestan [Avst]
134- Balinese [Bali], Batak [Batk], Blissymbols [Blis], Book Pahlavi [Phlv], Brahmi [Brah], Buginese [Bugi], Buhid [Buhd]
135- Carian [Cari], Chakma [Cakm], Cham, Cherokee [Cher], Cirth [Cirt], Coptic [Copt], Cypriot [Cprt]
136- Deseret [Dsrt]
137- Eastern Syriac [Syrn], Egyptian demotic [Egyd], Egyptian hieratic [Egyh], Egyptian hieroglyphs [Egyp], Estrangelo Syriac [Syre]
138- Fraktur Latin [Latf]
139- Gaelic Latin [Latg], Georgian Khutsuri [Geok], Glagolitic [Glag], Gothic [Goth]
140- Hanunoo [Hano]
141- Imperial Aramaic [Armi], Indus [Inds], Inherited [Qaai], Inscriptional Pahlavi [Phli], Inscriptional Parthian [Prti]
142- Javanese [Java]
143- Kaithi [Kthi], Katakana or Hiragana [Hrkt], Kayah Li [Kali], Kharoshthi [Khar]
144- Lanna [Lana], Lepcha [Lepc], Limbu [Limb], Linear A [Lina], Linear B [Linb], Lisu, Lycian [Lyci], Lydian [Lydi]
145- Mandaean [Mand], Manichaean [Mani], Mathematical Notation [Zmth], Mayan hieroglyphs [Maya], Meitei Mayek [Mtei], Meroitic [Mero], Moon
146- N’Ko [Nkoo], New Tai Lue [Talu], Nkgb
147- Ogham [Ogam], Ol Chiki [Olck], Old Church Slavonic Cyrillic [Cyrs], Old Hungarian [Hung], Old Italic [Ital], Old Permic [Perm], Old Persian [Xpeo], Orkhon [Orkh], Osmanya [Osma]
148- Pahawh Hmong [Hmng], Phags-pa [Phag], Phoenician [Phnx], Pollard Phonetic [Plrd], Psalter Pahlavi [Phlp]
149- Rejang [Rjng], Rongorongo [Roro], Runic [Runr]
150- Samaritan [Samr], Sarati [Sara], Saurashtra [Saur], Shavian [Shaw], SignWriting [Sgnw], Sumero-Akkadian Cuneiform [Xsux], Sundanese [Sund], Syloti Nagri [Sylo], Syriac [Syrc]
151- Tagalog [Tglg], Tagbanwa [Tagb], Tai Le [Tale], Tai Viet [Tavt], Tengwar [Teng], Tifinagh [Tfng]
152- Ugaritic [Ugar]
153- Vai [Vaii], Visible Speech [Visp]
154- Western Syriac [Syrj]
155- Yi [Yiii]
156- Inherited [Zinh]
157
158### Included Languages: 202
159
160- Abkhazian [ab]-OR, Adyghe [ady]-OR, Afrikaans [af]-O, Akan [ak]-U, Albanian [sq]-O, Amharic [am]-O, Arabic [ar]-O, Armenian [hy]-O, Assamese [as]-O, Asturian [ast]-OR, Avaric [av]-OR, Awadhi [awa]-U, Aymara [ay]-O, Azerbaijani [az]-O
161- Bambara [bm]-U, Bashkir [ba]-OR, Basque [eu]-OR, Belarusian [be]-O, Bengali [bn]-O, Bhojpuri [bho]-U, Bislama [bi]-O, Bosnian [bs]-O, Bulgarian [bg]-O, Burmese [my]-O
162- Catalan [ca]-O, Cebuano [ceb]-OR, Chamorro [ch]-O, Chechen [ce]-OR, Chinese [zh]-O, Chuukese [chk]-O, Croatian [hr]-O, Czech [cs]-O
163- Danish [da]-O, Divehi [dv]-O, Dutch [nl]-O, Dzongkha [dz]-O
164- Efik [efi]-O, English [en]-O, Erzya [myv]-OR, Estonian [et]-O, Ewe [ee]-OR
165- Faroese [fo]-O, Fijian [fj]-O, Filipino [fil]-O, Finnish [fi]-O, French [fr]-O
166- Ga [gaa]-OR, Gagauz [gag]-OR, Galician [gl]-OR, Georgian [ka]-O, German [de]-O, Gilbertese [gil]-O, Greek [el]-O, Guarani [gn]-O, Gujarati [gu]-O
167- Haitian [ht]-O, Hausa [ha]-O, Hawaiian [haw]-OR, Hebrew [he]-O, Hiligaynon [hil]-OR, Hindi [hi]-O, Hiri Motu [ho]-O, Hungarian [hu]-O
168- Icelandic [is]-O, Igbo [ig]-O, Iloko [ilo]-OR, Indonesian [id]-O, Ingush [inh]-OR, Inuktitut [iu]-OR, Irish [ga]-O, Italian [it]-O
169- Japanese [ja]-O, Javanese [jv]-U
170- Kabardian [kbd]-OR, Kalaallisut [kl]-O, Kannada [kn]-O, Karachay-Balkar [krc]-OR, Kashmiri [ks]-O, Kazakh [kk]-O, Khasi [kha]-OR, Khmer [km]-O, Kinyarwanda [rw]-O, Kirghiz [ky]-O, Komi-Permyak [koi]-OR, Komi-Zyrian [kpv]-OR, Konkani [kok]-OR, Korean [ko]-O, Kosraean [kos]-O, Krio [kri]-U, Kumyk [kum]-OR, Kurdish [ku]-OR
171- Lahnda [lah]-U, Lak [lbe]-OR, Lao [lo]-O, Latin [la]-DO, Latvian [lv]-O, Lezghian [lez]-OR, Lingala [ln]-O, Lithuanian [lt]-O, Luxembourgish [lb]-O
172- Macedonian [mk]-O, Madurese [mad]-U, Maguindanao [mdh]-OR, Maithili [mai]-OR, Malagasy [mg]-O, Malay [ms]-O, Malayalam [ml]-O, Maltese [mt]-O, Maore Comorian [swb]-O, Maori [mi]-O, Marathi [mr]-O, Marshallese [mh]-O, Moksha [mdf]-OR, Mongolian [mn]-O, Mossi [mos]-U
173- Nauru [na]-O, Nepali [ne]-O, Niuean [niu]-O, Northeastern Thai [tts]-U, Northern Sami [se]-OR, Northern Sotho [nso]-O, Norwegian Bokmål [nb]-O, Norwegian Nynorsk [nn]-O, Nyanja [ny]-O
174- Oriya [or]-O, Oromo [om]-U, Ossetic [os]-OR
175- Palauan [pau]-O, Pangasinan [pag]-OR, Papiamento [pap]-DO, Pashto [ps]-O, Persian [fa]-O, Plains Cree [crk]-OR, Pohnpeian [pon]-O, Polish [pl]-O, Portuguese [pt]-O, Punjabi [pa]-O
176- Quechua [qu]-O
177- Rhaeto-Romance [rm]-O, Romanian [ro]-O, Rundi [rn]-O, Russian [ru]-O
178- Samoan [sm]-O, Sango [sg]-O, Sanskrit [sa]-O, Santali [sat]-OR, Scottish Gaelic [gd]-OR, Serbian [sr]-O, Shona [sn]-U, Sindhi [sd]-O, Sinhala [si]-O, Slovak [sk]-O, Slovenian [sl]-O, Somali [so]-O, Southern Sotho [st]-O, Spanish [es]-O, Sundanese [su]-O, Swahili [sw]-O, Swati [ss]-O, Swedish [sv]-O, Swiss German [gsw]-U
179- Tagalog [tl]-OR, Tahitian [ty]-O, Tajik [tg]-O, Tamil [ta]-O, Tatar [tt]-OR, Tausug [tsg]-OR, Telugu [te]-O, Tetum [tet]-O, Thai [th]-O, Tibetan [bo]-OR, Tigrinya [ti]-DO, Tok Pisin [tpi]-O, Tokelau [tkl]-O, Tonga [to]-O, Tsonga [ts]-O, Tswana [tn]-O, Turkish [tr]-O, Turkmen [tk]-O, Tuvalu [tvl]-O, Tuvinian [tyv]-OR, Twi [tw]-OR
180- Udmurt [udm]-OR, Uighur [ug]-OR, Ukrainian [uk]-O, Ulithian [uli]-O, Unknown or Invalid Language [und]-S, Urdu [ur]-O, Uzbek [uz]-O
181- Venda [ve]-O, Vietnamese [vi]-O
182- Waray [war]-OR, Welsh [cy]-OR, Western Frisian [fy]-OR, Wolof [wo]-O, Woods Cree [cwd]-OR
183- Xhosa [xh]-O
184- Yakut [sah]-OR, Yapese [yap]-O, Yoruba [yo]-O
185- Zhuang [za]-OR, Zulu [zu]-O
186
187### Excluded Languages: 299
188
189- Achinese [ace]-U, Acoli [ach]-U, Adangme [ada]-U, Afar [aa]-U, Afrihili [afh]-U, Afro-Asiatic Language [afa]-U, Ainu [ain]-U, Akkadian [akk]-U, Aleut [ale]-U, Algonquian Language [alg]-U, Altaic Language [tut]-U, Ancient Egyptian [egy]-U, Ancient Greek [grc]-U, Angika [anp]-U, Apache Language [apa]-U, Aragonese [an]-U, Aramaic [arc]-U, Arapaho [arp]-U, Araucanian [arn]-U, Arawak [arw]-U, Aromanian [rup]-U, Artificial Language [art]-U, Athapascan Language [ath]-U, Atsam [cch]-U, Australian Language [aus]-U, Austronesian Language [map]-U, Avestan [ae]-U
190- Balinese [ban]-U, Baltic Language [bat]-U, Baluchi [bal]-U, Bamileke Language [bai]-U, Banda [bad]-U, Bantu [bnt]-U, Basa [bas]-U, Batak [btk]-U, Beja [bej]-U, Bemba [bem]-U, Berber [ber]-U, Bihari [bh]-U, Bikol [bik]-U, Bini [bin]-U, Blin [byn]-U, Blissymbols [zbl]-U, Braj [bra]-U, Breton [br]-U, Buginese [bug]-U, Buriat [bua]-U
191- Caddo [cad]-U, Carib [car]-U, Caucasian Language [cau]-U, Celtic Language [cel]-U, Central American Indian Language [cai]-U, Chagatai [chg]-U, Chamic Language [cmc]-U, Cherokee [chr]-U, Cheyenne [chy]-U, Chibcha [chb]-U, Chinook Jargon [chn]-U, Chipewyan [chp]-U, Choctaw [cho]-U, Church Slavic [cu]-U, Chuvash [cv]-U, Classical Newari [nwc]-U, Classical Syriac [syc]-U, Coptic [cop]-U, Cornish [kw]-U, Corsican [co]-U, Cree [cr]-U, Creek [mus]-U, Creole or Pidgin [crp]-U, Crimean Turkish [crh]-U, Cushitic Language [cus]-U
192- Dakota [dak]-U, Dargwa [dar]-U, Dayak [day]-U, Delaware [del]-U, Dinka [din]-U, Dogri [doi]-U, Dogrib [dgr]-U, Dravidian Language [dra]-U, Duala [dua]-U, Dyula [dyu]-U
193- Eastern Frisian [frs]-U, Ekajuk [eka]-U, Elamite [elx]-U, English-based Creole or Pidgin [cpe]-U, Esperanto [eo]-U, Ewondo [ewo]-U
194- Fang [fan]-U, Fanti [fat]-U, Finno-Ugrian Language [fiu]-U, Fon [fon]-U, French-based Creole or Pidgin [cpf]-U, Friulian [fur]-U, Fulah [ff]-U
195- Ganda [lg]-U, Gayo [gay]-U, Gbaya [gba]-U, Geez [gez]-U, Germanic Language [gem]-U, Gondi [gon]-U, Gorontalo [gor]-U, Gothic [got]-U, Grebo [grb]-U, Gwichʼin [gwi]-U
196- Haida [hai]-U, Herero [hz]-U, Himachali [him]-U, Hittite [hit]-U, Hmong [hmn]-U, Hupa [hup]-U
197- Iban [iba]-U, Ido [io]-U, Ijo [ijo]-U, Inari Sami [smn]-U, Indic Language [inc]-U, Indo-European Language [ine]-U, Interlingua [ia]-U, Interlingue [ie]-U, Inupiaq [ik]-U, Iranian Language [ira]-U, Iroquoian Language [iro]-U
198- Jju [kaj]-U, Judeo-Arabic [jrb]-U, Judeo-Persian [jpr]-U
199- Kabyle [kab]-U, Kachin [kac]-U, Kalmyk [xal]-U, Kamba [kam]-U, Kanuri [kr]-U, Kara-Kalpak [kaa]-U, Karelian [krl]-U, Karen [kar]-U, Kashubian [csb]-U, Kawi [kaw]-U, Khoisan Language [khi]-U, Khotanese [kho]-U, Kikuyu [ki]-U, Kimbundu [kmb]-U, Klingon [tlh]-U, Komi [kv]-U, Kongo [kg]-U, Koro [kfo]-U, Kpelle [kpe]-U, Kru [kro]-U, Kuanyama [kj]-U, Kurukh [kru]-U, Kutenai [kut]-U
200- Ladino [lad]-U, Lamba [lam]-U, Limburgish [li]-U, Lojban [jbo]-U, Low German [nds]-U, Lower Sorbian [dsb]-U, Lozi [loz]-U, Luba-Katanga [lu]-U, Luba-Lulua [lua]-U, Luiseno [lui]-U, Lule Sami [smj]-U, Lunda [lun]-U, Luo [luo]-U, Lushai [lus]-U
201- Magahi [mag]-U, Makasar [mak]-U, Manchu [mnc]-U, Mandar [mdr]-U, Mandingo [man]-U, Manipuri [mni]-U, Manobo Language [mno]-U, Manx [gv]-U, Mari [chm]-U, Marwari [mwr]-U, Masai [mas]-U, Mayan Language [myn]-U, Mende [men]-U, Micmac [mic]-U, Middle Dutch [dum]-U, Middle English [enm]-U, Middle French [frm]-U, Middle High German [gmh]-U, Middle Irish [mga]-U, Minangkabau [min]-U, Mirandese [mwl]-U, Miscellaneous Language [mis]-S, Mohawk [moh]-U, Mon-Khmer Language [mkh]-U, Mongo [lol]-U, Multiple Languages [mul]-S, Munda Language [mun]-U
202- N’Ko [nqo]-U, Nahuatl [nah]-U, Navajo [nv]-U, Ndonga [ng]-U, Neapolitan [nap]-U, Newari [new]-U, Nias [nia]-U, Niger-Kordofanian Language [nic]-U, Nilo-Saharan Language [ssa]-U, No linguistic content [zxx]-S, Nogai [nog]-U, North American Indian Language [nai]-U, North Ndebele [nd]-U, Northern Frisian [frr]-U, Norwegian [no]-U, Nubian Language [nub]-U, Nyamwezi [nym]-U, Nyankole [nyn]-U, Nyasa Tonga [tog]-U, Nyoro [nyo]-U, Nzima [nzi]-U
203- Occitan [oc]-U, Ojibwa [oj]-U, Old English [ang]-U, Old French [fro]-U, Old High German [goh]-U, Old Irish [sga]-U, Old Norse [non]-U, Old Persian [peo]-U, Old Provençal [pro]-U, Osage [osa]-U, Otomian Language [oto]-U, Ottoman Turkish [ota]-U
204- Pahlavi [pal]-U, Pali [pi]-U, Pampanga [pam]-U, Papuan Language [paa]-U, Philippine Language [phi]-U, Phoenician [phn]-U, Portuguese-based Creole or Pidgin [cpp]-U, Prakrit Language [pra]-U
205- Rajasthani [raj]-U, Rapanui [rap]-U, Rarotongan [rar]-U, Romance Language [roa]-U, Romany [rom]-U, Root [root]-U
206- Salishan Language [sal]-U, Samaritan Aramaic [sam]-U, Sami Language [smi]-U, Sandawe [sad]-U, Sardinian [sc]-U, Sasak [sas]-U, Scots [sco]-U, Selkup [sel]-U, Semitic Language [sem]-U, Serer [srr]-U, Shan [shn]-U, Sichuan Yi [ii]-U, Sicilian [scn]-U, Sidamo [sid]-U, Sign Language [sgn]-U, Siksika [bla]-U, Sino-Tibetan Language [sit]-U, Siouan Language [sio]-U, Skolt Sami [sms]-U, Slave [den]-U, Slavic Language [sla]-U, Sogdien [sog]-U, Songhai [son]-U, Soninke [snk]-U, Sorbian Language [wen]-U, South American Indian Language [sai]-U, South Ndebele [nr]-U, Southern Altai [alt]-U, Southern Sami [sma]-U, Sranan Tongo [srn]-U, Sukuma [suk]-U, Sumerian [sux]-U, Susu [sus]-U, Syriac [syr]-U
207- Tai Language [tai]-U, Tamashek [tmh]-U, Tereno [ter]-U, Tigre [tig]-U, Timne [tem]-U, Tiv [tiv]-U, Tlingit [tli]-U, Tsimshian [tsi]-U, Tumbuka [tum]-U, Tupi Language [tup]-U, Tyap [kcg]-U
208- Ugaritic [uga]-U, Umbundu [umb]-U, Upper Sorbian [hsb]-U
209- Vai [vai]-U, Volapük [vo]-U, Votic [vot]-U
210- Wakashan Language [wak]-U, Walamo [wal]-U, Walloon [wa]-U, Washo [was]-U
211- Yao [yao]-U, Yiddish [yi]-U, Yupik Language [ypk]-U
212- Zande [znd]-U, Zapotec [zap]-U, Zaza [zza]-U, Zenaga [zen]-U, Zuni [zun]-U
213
214