1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" 2"http://www.w3.org/TR/html4/loose.dtd"> 3<html> 4 5<head> 6<meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 7<meta http-equiv="Content-Language" content="en-us"> 8<link rel="stylesheet" href="http://www.unicode.org/reports/reports.css" 9 type="text/css"> 10<title>UTS #35: Unicode LDML: Supplemental</title> 11<style type="text/css"> 12<!-- 13.dtd { 14 font-family: monospace; 15 font-size: 90%; 16 background-color: #CCCCFF; 17 border-style: dotted; 18 border-width: 1px; 19} 20 21.xmlExample { 22 font-family: monospace; 23 font-size: 80% 24} 25 26.blockedInherited { 27 font-style: italic; 28 font-weight: bold; 29 border-style: dashed; 30 border-width: 1px; 31 background-color: #FF0000 32} 33 34.inherited { 35 font-weight: bold; 36 border-style: dashed; 37 border-width: 1px; 38 background-color: #00FF00 39} 40 41.element { 42 font-weight: bold; 43 color: red; 44} 45 46.attribute { 47 font-weight: bold; 48 color: maroon; 49} 50 51.attributeValue { 52 font-weight: bold; 53 color: blue; 54} 55 56li, p { 57 margin-top: 0.5em; 58 margin-bottom: 0.5em 59} 60 61h2, h3, h4, table { 62 margin-top: 1.5em; 63 margin-bottom: 0.5em; 64} 65--> 66</style> 67</head> 68 69<body> 70 71 <table class="header" width="100%"> 72 <tr> 73 <td class="icon"><a href="http://unicode.org"> <img 74 alt="[Unicode]" src="http://unicode.org/webscripts/logo60s2.gif" 75 width="34" height="33" 76 style="vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a> 77 <a class="bar" href="http://www.unicode.org/reports/">Technical 78 Reports</a></td> 79 </tr> 80 <tr> 81 <td class="gray"> </td> 82 </tr> 83 </table> 84 <div class="body"> 85 <h2 style="text-align: center"> 86 Unicode Technical 87 Standard #35 88 </h2> 89 <h1> 90 Unicode Locale Data Markup Language (LDML)<br>Part 6: 91 Supplemental 92 </h1> 93 94 <!-- At least the first row of this header table 
should be identical across the parts of this UTS. --> 95 <table border="1" cellpadding="2" cellspacing="0" class="wide"> 96 <tr> 97 <td>Version</td> 98 <td>34</td> 99 </tr> 100 <tr> 101 <td>Editors</td> 102 <td>Steven Loomis (<a href="mailto:srl@icu-project.org">srl@icu-project.org</a>) 103 and <a href="tr35.html#Acknowledgments">other CLDR committee 104 members</a></td> 105 </tr> 106 </table> 107 108 <p> 109 For the full header, summary, and status, see <a href="tr35.html"> 110 Part 1: Core</a> 111 </p> 112 113 <h3> 114 <i>Summary</i> 115 </h3> 116 <p> 117 This document describes parts of an XML format (<i>vocabulary</i>) 118 for the exchange of structured locale data. This format is used in 119 the <a href="http://cldr.unicode.org/">Unicode Common Locale Data 120 Repository</a>. 121 </p> 122 123 <p> 124 This is a partial document, describing only those parts of the LDML 125 that are relevant for supplemental data. For the other parts of the 126 LDML see the <a href="tr35.html">main LDML document</a> and the links 127 above. 128 </p> 129 130 <h3> 131 <i>Status</i> 132 </h3> 133 134 <!-- NOT YET APPROVED 135 <p> 136 <i class="changed">This is a<b><font color="#ff3333"> 137 draft </font></b>document which may be updated, replaced, or superseded by 138 other documents at any time. Publication does not imply endorsement 139 by the Unicode Consortium. This is not a stable document; it is 140 inappropriate to cite this document as other than a work in 141 progress. 142 </i> 143 </p> 144 END NOT YET APPROVED --> 145 <!-- APPROVED --> 146 <p> 147 <i>This document has been reviewed by Unicode members and other 148 interested parties, and has been approved for publication by the 149 Unicode Consortium. 
This is a stable document and may be used as 150 reference material or cited as a normative reference by other 151 specifications.</i> 152 </p> 153 <!-- END APPROVED --> 154 155 <blockquote> 156 <p> 157 <i><b>A Unicode Technical Standard (UTS)</b> is an independent 158 specification. Conformance to the Unicode Standard does not imply 159 conformance to any UTS.</i> 160 </p> 161 </blockquote> 162 <p> 163 <i>Please submit corrigenda and other comments with the CLDR bug 164 reporting form [<a href="tr35.html#Bugs">Bugs</a>]. Related 165 information that is useful in understanding this document is found 166 in the <a href="tr35.html#References">References</a>. For the latest 167 version of the Unicode Standard see [<a href="tr35.html#Unicode">Unicode</a>]. 168 For a list of current Unicode Technical Reports see [<a 169 href="tr35.html#Reports">Reports</a>]. For more information about 170 versions of the Unicode Standard, see [<a href="tr35.html#Versions">Versions</a>]. 171 </i> 172 </p> 173 174 <!-- This section of Parts should be identical in all of the parts of this UTS. --> 175 <h2> 176 <a name="Parts" href="#Parts">Parts</a> 177 </h2> 178 <p>The LDML specification is divided into the following parts:</p> 179 <ul class="toc"> 180 <li>Part 1: <a href="tr35.html#Contents">Core</a> (languages, 181 locales, basic structure) 182 </li> 183 <li>Part 2: <a href="tr35-general.html#Contents">General</a> 184 (display names & transforms, etc.) 
185 </li> 186 <li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a> 187 (number & currency formatting) 188 </li> 189 <li>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date, 190 time, time zone formatting) 191 </li> 192 <li>Part 5: <a href="tr35-collation.html#Contents">Collation</a> 193 (sorting, searching, grouping) 194 </li> 195 <li>Part 6: <a href="tr35-info.html#Contents">Supplemental</a> 196 (supplemental data) 197 </li> 198 <li>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a> 199 (keyboard mappings) 200 </li> 201 </ul> 202 203 <h2> 204 <a name="Contents" href="#Contents">Contents of Part 6, 205 Supplemental</a> 206 </h2> 207 <!-- START Generated TOC: CheckHtmlFiles --> 208 <ul class="toc"> 209 <li>1 <a href="#Supplemental_Data">Introduction Supplemental 210 Data</a></li> 211 <li>2 <a href="#Territory_Data">Territory Data</a> 212 <ul class="toc"> 213 <li>2.1 <a href="#Supplemental_Territory_Containment">Supplemental 214 Territory Containment</a></li> 215 <li>2.2 <a href="#Subdivision_Containment">Subdivision 216 Containment</a></li> 217 <li>2.3 <a href="#Supplemental_Territory_Information">Supplemental 218 Territory Information</a></li> 219 <li>2.4 <a href="#Territory_Based_Preferences">Territory-Based 220 Preferences</a> 221 <ul class="toc"> 222 <li>2.4.1 <a href="#Preferred_Units_For_Usage">Preferred 223 Units for Specific Usages</a> 224 <ul class="toc"> 225 <li>Table: <a href="#Unit_Preference_Categories">Unit 226 Preference Categories</a></li> 227 </ul> 228 </li> 229 </ul> 230 </li> 231 <li>2.5 <a href="#rgScope"><rgScope>: Scope of the 232 “rg” Locale Key</a></li> 233 </ul> 234 </li> 235 <li>3 <a href="#Supplemental_Language_Data">Supplemental 236 Language Data</a> 237 <ul class="toc"><li>3.1 <a 238 href="#Supplemental_Language_Grouping">Supplemental Language Grouping</a></li></ul></li> 239 240 <li>4 <a href="#Supplemental_Code_Mapping">Supplemental Code 241 Mapping</a></li> 242 <li>5 <a href="#Telephone_Code_Data">Telephone 
Code Data</a> (Deprecated)</li> 243 <li>6 <a href="#Postal_Code_Validation">Postal Code 244 Validation (Deprecated)</a></li> 245 <li>7 <a href="#Supplemental_Character_Fallback_Data">Supplemental 246 Character Fallback Data</a></li> 247 <li>8 <a href="#Coverage_Levels">Coverage Levels</a> 248 <ul class="toc"> 249 <li>8.1 <a href="#Coverage_Level_Definitions">Definitions</a></li> 250 <li>8.2 <a href="#Coverage_Level_Data_Requirements">Data 251 Requirements</a></li> 252 <li>8.3 <a href="#Coverage_Level_Default_Values">Default 253 Values</a></li> 254 </ul> 255 </li> 256 <li>9 <a href="#Appendix_Supplemental_Metadata">Supplemental 257 Metadata</a> 258 <ul class="toc"> 259 <li>9.1 <a href="#Supplemental_Alias_Information">Supplemental 260 Alias Information</a> 261 <ul class="toc"> 262 <li>Table: <a href="#Alias_Attribute_Values">Alias 263 Attribute Values</a></li> 264 </ul> 265 </li> 266 <li>9.2 <a href="#Supplemental_Deprecated_Information">Supplemental 267 Deprecated Information (Deprecated)</a> 268 </li> 269 <li>9.3 <a href="#Default_Content">Default Content</a></li> 270 </ul> 271 </li> 272 <li>10 <a href="#Metadata_Elements">Locale Metadata Elements</a></li> 273 <li>11 <a href="#Version_Information">Version Information</a></li> 274 <li>12 <a href="#Parent_Locales">Parent Locales</a></li> 275 </ul> 276 <!-- END Generated TOC: CheckHtmlFiles --> 277 <h2> 278 1 Introduction <a name="Supplemental_Data" href="#Supplemental_Data">Supplemental 279 Data</a> 280 </h2> 281 282 <p> 283 The following represents the format for additional supplemental 284 information. This is information that is important for 285 internationalization and proper use of CLDR, but is not contained in 286 the locale hierarchy. It is not localizable, nor is it overridden by 287 locale data. The current CLDR data can be viewed in the <a 288 href="http://www.unicode.org/cldr/data/charts/supplemental/index.html">Supplemental 289 Charts</a>. 
</p>
<p class="dtd">
<!ELEMENT supplementalData (version, generation?, cldrVersion?,
currencyData?, territoryContainment?, subdivisionContainment?,
languageData?, territoryInfo?, postalCodeData?, calendarData?,
calendarPreferenceData?, weekData?, timeData?, measurementData?, unitPreferenceData?, timezoneData?,
characters?, transforms?, metadata?, codeMappings?, parentLocales?,
likelySubtags?, metazoneInfo?, plurals?, telephoneCodeData?,
numberingSystems?, bcp47KeywordMappings?, gender?, references?,
languageMatching?, dayPeriodRuleSet*, metaZones?, primaryZones?,
windowsZones?, coverageLevels?, idValidity?,
rgScope?) >
</p>
<p>
The data in CLDR is presently split into multiple files:
supplementalData.xml, supplementalMetadata.xml, characters.xml,
likelySubtags.xml, ordinals.xml, plurals.xml, telephoneCodeData.xml,
genderList.xml, plus transforms (see <i>Part 2 Section 10 <a
href="tr35-general.html#Transforms">Transforms</a>
</i>and<i> Part 2 Section 10.3 <a
href="tr35-general.html#Transform_Rules_Syntax">Transform Rule
Syntax</a></i>). The split is just for convenience: logically, they are
treated as though they were a single file. Future versions of CLDR
may split the data in a different fashion. Do not depend on any
specific XML filename or path for supplemental data.
</p>

<p>
Note that <a href="#Metadata_Elements">Chapter 10</a> presents
information about metadata that is maintained on a per-locale basis.
It is included in this section because it is not intended to be used
as part of the locale itself.
</p>

<h2>
2 <a name="Territory_Data" href="#Territory_Data">Territory Data</a>
</h2>

<h3>
2.1 <a name="Supplemental_Territory_Containment"
href="#Supplemental_Territory_Containment">Supplemental
Territory Containment</a>
</h3>
<p class="dtd">
<!ELEMENT territoryContainment ( group* ) ><br>
<!ELEMENT group EMPTY ><br>
<!ATTLIST group type NMTOKEN #REQUIRED ><br>
<!ATTLIST group contains NMTOKENS #IMPLIED ><br>
<!ATTLIST group grouping ( true | false ) #IMPLIED ><br>
<!ATTLIST group status ( deprecated | grouping ) #IMPLIED >
</p>
<p>
The following data shows groupings of countries (regions). The data
is based on [<a href="tr35.html#UNM49">UNM49</a>]. There is one
special code, <code>QO</code>, which is used for outlying areas of
Oceania that are typically uninhabited. The territory containment
forms a tree with the following levels:
</p>
<p align="center">World</p>
<p align="center">Continent</p>
<p align="center">Subcontinent</p>
<p align="center">Country</p>
<p>
Excluding groupings, in this tree:<br>
</p>
<ul>
<li>All non-overlapping regions form a strict tree rooted at
World</li>
<li>All leaf nodes (countries) are at depth 4. Some of
these “country” regions are actually parts of other countries, such
as Hong Kong (part of China). Such relationships are not part of the
containment data.</li>
</ul>
<p>
For a chart showing the relationships (plus the included timezones),
see the <a
href="http://www.unicode.org/cldr/charts/latest/supplemental/territory_containment_un_m_49.html">Territory
Containment Chart</a>. The XML structure has the following form.
371 </p> 372 <pre><territoryContainment></pre> 373 <blockquote> 374 <pre><group type="001" contains="002 009 019 142 150"/> <!--World --> 375<group type="011" contains="BF BJ CI CV GH GM GN GW LR ML MR NE NG SH SL SN TG"/> <!--Western Africa --> 376<group type="013" contains="BZ CR GT HN MX NI PA SV"/> <!--Central America --> 377<group type="014" contains="BI DJ ER ET KE KM MG MU MW MZ RE RW SC SO TZ UG YT ZM ZW"/> <!--Eastern Africa --> 378<group type="142" contains="030 035 062 145"/> <!--Asia --> 379<group type="145" contains="AE AM AZ BH CY GE IL IQ JO KW LB OM PS QA SA SY TR YE"/> <!--Western Asia --> 380<group type="015" contains="DZ EG EH LY MA SD TN"/> <!--Northern Africa --> 381...</pre> 382 </blockquote> 383 <p>There are groupings that don't follow this regular structure, 384 such as:</p> 385 <pre><group type="003" contains="013 021 029" grouping="true"/> <!--North America --></pre> 386 <p> 387 These are marked with the attribute <span class="attribute">grouping</span>="<span 388 class="attributeValue">true</span>". 389 </p> 390 <p> 391 When groupings have been deprecated but kept around for backwards 392 compatibility, they are marked with the attribute <span 393 class="attribute">status</span>="<span class="attributeValue">deprecated</span>", 394 like this: 395 </p> 396 <pre><group type="029" contains="AN" status="deprecated"/> <!--Caribbean --></pre> 397 <p> 398 When the containment relationship itself is a grouping, it is marked 399 with the attribute <span class="attribute">status</span>="<span 400 class="attributeValue">grouping</span>", like this: 401 </p> 402 <pre><group type="150" contains="EU" status="grouping"/> <!--Europe --></pre> 403 <p>That is, the type value isn’t a grouping, but if you filter out 404 groupings you can drop this containment. 
In the example above, EU is 405 a grouping, and contained in 150.</p> 406 <h3> 407 2.2 <a name="Subdivision_Containment" href="#Subdivision_Containment">Subdivision 408 Containment</a> 409 </h3> 410 <p class="dtd"> 411 <!ELEMENT subdivisionContainment ( subgroup* ) ><br> 412 <br> 413 <!ELEMENT subgroup EMPTY ><br> 414 <!ATTLIST subgroup type NMTOKEN #REQUIRED ><br> 415 <!ATTLIST subgroup contains NMTOKENS #IMPLIED > 416 </p> 417 <p>The subdivision containment data is similar to the territory 418 containment. It is based on ISO 3166-2 data, but may diverge from it 419 in the future.</p> 420 <p class="xmlExample"> 421 <subgroup type="BD" contains="bda bdb bdc bdd bde bdf bdg bdh"/><br> 422 <subgroup type="bda" contains="bd02 bd06 bd07 bd25 bd50 bd51"/> 423 </p> 424 <p> 425 The <strong>type</strong> is a 426 <code><a href="tr35.html#unicode_region_subtag">unicode_region_subtag</a></code> 427 (territory) identifier for the top level of containment, 428 or a <code><a href="tr35.html#unicode_subdivision_subtag">unicode_subdivision_id</a></code> 429 for lower levels of containment when there are multiple levels. 430 The <strong>contains</strong> value is a space-delimited list of one or more 431 <code><a href="tr35.html#unicode_subdivision_subtag">unicode_subdivision_id</a></code> 432 values. 433 In the example above, subdivision bda contains 434 other subdivisions bd02, bd06, bd07, bd25, bd50, bd51. 
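</p>
<p>As a rough illustration of how the <strong>type</strong> and
<strong>contains</strong> attributes fit together, the following is a
minimal Python sketch that inverts the containment data into a
child-to-parent map and walks it upward. The function names
<code>subdivision_parents</code> and <code>containment_chain</code>,
and the inline <code>SAMPLE</code> string, are assumptions made for
this illustration; they are not part of CLDR or of any CLDR library.</p>
<pre>
```python
import xml.etree.ElementTree as ET

# Sample copied from the example above; real data lives in the CLDR
# supplemental files, whose names and paths may change between releases.
SAMPLE = """
<subdivisionContainment>
  <subgroup type="BD" contains="bda bdb bdc bdd bde bdf bdg bdh"/>
  <subgroup type="bda" contains="bd02 bd06 bd07 bd25 bd50 bd51"/>
</subdivisionContainment>
"""


def subdivision_parents(xml_text):
    """Map each contained subdivision id to the type of its containing subgroup."""
    root = ET.fromstring(xml_text)
    parents = {}
    for subgroup in root.iter("subgroup"):
        container = subgroup.get("type")
        # 'contains' is a space-delimited list of subdivision ids.
        for child in subgroup.get("contains", "").split():
            parents[child] = container
    return parents


def containment_chain(subdivision, parents):
    """Return the id followed by its successive containers, innermost first."""
    chain = [subdivision]
    # The second condition guards against cycles in malformed data.
    while chain[-1] in parents and parents[chain[-1]] not in chain:
        chain.append(parents[chain[-1]])
    return chain


parents = subdivision_parents(SAMPLE)
print(containment_chain("bd02", parents))  # ['bd02', 'bda', 'BD']
```
</pre>
<p>
With the example data, the chain for bd02 runs from the subdivision
itself through bda up to the territory BD.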
</p>
<p>Note: Formerly (in CLDR 28 through 30):</p>
<ul>
<li>The <strong>type</strong> attribute could only contain a
<code>unicode_region_subtag</code>;</li>
<li>The <strong>contains</strong> attribute contained
<code>unicode_subdivision_suffix</code> values; these are not unique
across multiple territories, so...</li>
<li>For lower containment levels, a now-deprecated
<strong>subtype</strong> attribute was used to specify the parent
<code>unicode_subdivision_suffix</code>.</li>
</ul>
<h3>
2.3 <a name="Supplemental_Territory_Information"
href="#Supplemental_Territory_Information">Supplemental
Territory Information</a>
</h3>

<p class="dtd">
<!ELEMENT territory ( languagePopulation* ) ><br>
<!ATTLIST territory type NMTOKEN #REQUIRED ><br>
<!ATTLIST territory gdp NMTOKEN #REQUIRED ><br>
<!ATTLIST territory literacyPercent NMTOKEN #REQUIRED ><br>
<!ATTLIST territory population NMTOKEN #REQUIRED ><br>
<br>
<!ELEMENT languagePopulation EMPTY ><br>
<!ATTLIST languagePopulation type NMTOKEN #REQUIRED ><br>
<!ATTLIST languagePopulation literacyPercent NMTOKEN #IMPLIED ><br>
<!ATTLIST languagePopulation writingPercent NMTOKEN #IMPLIED ><br>
<!ATTLIST languagePopulation populationPercent NMTOKEN #REQUIRED ><br>
<!ATTLIST languagePopulation officialStatus (de_facto_official | official | official_regional | official_minority) #IMPLIED >
</p>
<p>
This data provides testing information for language and territory
populations.
The main goal is to provide approximate figures for the 476 literate, functional population for each language in each territory: 477 that is, the population that is able to read and write each language, 478 and is comfortable enough to use it with computers. For a chart of 479 this data, see <a 480 href='http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html'>Territory-Language 481 Information</a>. 482 </p> 483 <p> 484 <em>Example</em> 485 </p> 486 <pre style='font-size: 70%'><territory type="AO" gdp="175500000000" literacyPercent="70.4" population="19088100"> <!--Angola--> 487 <languagePopulation type="pt" populationPercent="67" officialStatus="official"/> <!--Portuguese--> 488 <languagePopulation type="umb" populationPercent="29"/> <!--Umbundu--> 489 <languagePopulation type="kmb" writingPercent="10" populationPercent="25" references="R1034"/> <!--Kimbundu--> 490 <languagePopulation type="ln" populationPercent="0.67" references="R1010"/> <!--Lingala--> 491</territory></pre> 492 <p> 493 Note that reliable information is difficult to obtain; the 494 information in CLDR is an estimate culled from different sources, 495 including the World Bank, CIA Factbook, and others. The GDP and 496 country literacy figures are taken from the World Bank where 497 available, otherwise supplemented by FactBook data and other sources. 498 The GDP figures are “PPP (constant 2000 international $)”. Much of 499 the per-language data is taken from the Ethnologue, but is 500 supplemented and processed using many other sources, including 501 per-country census data. (The focus of the Ethnologue is native 502 speakers, which includes people who are not literate, and excludes 503 people who are functional second-language users.) Some references are 504 marked in the XML files, with attributes such as 505 <code>references="R1010"</code> . 
506 </p> 507 <p> 508 The percentages may add up to more than 100% due to multilingual 509 populations, or may be less than 100% due to illiteracy or because 510 the data has not yet been gathered or processed. Languages with 511 smaller populations might not be included. 512 </p> 513 <p>The following describes the meaning of some of these terms—as 514 used in CLDR—in more detail.</p> 515 <p> 516 <a name="literacy_percent" href="#literacy_percent">literacy percent 517 for the territory</a> — an estimate of the percentage of the 518 country’s population that is functionally literate. 519 </p> 520 <p> 521 <a name="language_population_percent" 522 href="#language_population_percent">language population percent</a> — 523 an estimate of the number of people who are functional in that 524 language in that country, including both first and second language 525 speakers. The level of fluency is that necessary to use a UI on a 526 computer, smartphone, or similar devices, rather than complete 527 fluency. 528 </p> 529 <p> 530 <a name="literacy_percent_for_langPop" href="#literacy_percent_for_langPop">literacy 531 percent for language population</a> — Within the 532 set of people who are functional in the corresponding language (as specified 533 by <a href="#language_population_percent">language population percent</a>), 534 this is an estimate of the percentage of those people who are functionally 535 literate in that language, that is, who are <em>capable</em> of reading or 536 writing in that language, even if they do not regularly use it for reading 537 or writing. If not specified, this defaults to the 538 <a href="#literacy_percent">literacy percent for the territory</a>. 
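</p>
<p>To make the relationship between these figures concrete, the
following is a minimal Python sketch that estimates the literate,
functional population per language for one territory. It multiplies
the territory population by the language population percent and by
the literacy percent for the language population, falling back to the
territory literacy percent when the latter is absent. The function
name <code>literate_populations</code> and the inline
<code>SAMPLE</code> are assumptions made for this illustration, and
the multiplication is one reading of the definitions above rather
than a procedure mandated by CLDR.</p>
<pre>
```python
import xml.etree.ElementTree as ET

# Sample copied from the Angola example earlier in this section.
SAMPLE = """
<territory type="AO" gdp="175500000000" literacyPercent="70.4" population="19088100">
  <languagePopulation type="pt" populationPercent="67" officialStatus="official"/>
  <languagePopulation type="umb" populationPercent="29"/>
  <languagePopulation type="kmb" writingPercent="10" populationPercent="25" references="R1034"/>
  <languagePopulation type="ln" populationPercent="0.67" references="R1010"/>
</territory>
"""


def literate_populations(territory_xml):
    """Estimate the literate, functional population for each language."""
    territory = ET.fromstring(territory_xml)
    population = float(territory.get("population"))
    territory_literacy = float(territory.get("literacyPercent"))
    result = {}
    for lang in territory.iter("languagePopulation"):
        # People who are functional in the language.
        functional = population * float(lang.get("populationPercent")) / 100.0
        # Per-language literacy defaults to the territory-wide figure.
        literacy = float(lang.get("literacyPercent", territory_literacy))
        result[lang.get("type")] = functional * literacy / 100.0
    return result


for code, estimate in literate_populations(SAMPLE).items():
    print(code, round(estimate))
```
</pre>
<p>
In the Angola example none of the languages carries its own
literacyPercent, so the territory value of 70.4 is applied to each of
them.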
539 </p> 540 <p> 541 <a name="writing_percent" href="#writing_percent">writing percent</a> 542 — Within the 543 set of people who are functional in the corresponding language (as specified 544 by <a href="#language_population_percent">language population percent</a>), 545 this is an estimate of the percentage of those people who regularly 546 read or write a significant amount in that language. Ideally, the regularity 547 would be measured as “7-day actives”. If it is known that the language is not 548 widely or commonly written, but there are no solid figures, the value is 549 typically given 1%-5%.</p> 550 <p> 551 For a language such as Swiss German, which is typically not written, even 552 though nearly the whole native Germanophone population <em>could </em>write 553 in Swiss German, the <a href="#literacy_percent_for_langPop">literacy percent 554 for language population</a> is high, but the <a href="#writing_percent">writing 555 percent</a> is low. 556 </p> 557 <p> 558 <a name="official_language" href="#official_language">official 559 language</a> — as used in CLDR, a language that can generally be used in 560 all communications with a central government. That is, people can 561 expect that essentially all communication from the government is 562 available in that language (ballots, information pamphlets, legal 563 documents, …) and that they can use that language in any 564 communication to the central government (petitions, forms, filing 565 lawsuits,…). 566 </p> 567 <p> 568 Official languages for a country in this sense are not necessarily 569 the same as those with official legal status in the country. For 570 example, Irish is declared to be an official language in Ireland, but 571 English has no such formal status in the United States. Languages 572 such as the latter are called <em>de facto</em> official languages. 
573 As another example, German has legal status in Italy, but cannot be 574 used in all communications with the central government, and is thus 575 not an official language <em>of Italy</em> for CLDR purposes. It is, 576 however, an <em>official regional language</em>. Other languages are 577 declared to be official, but can’t actually be used for all 578 communication with any major governmental entity in the country. 579 There is no intention to mark such nominally official languages as 580 “official” in the CLDR data. 581 </p> 582 <p> 583 <a name="official_regional_language" 584 href="#official_regional_language">official regional language</a> — 585 a language that is official (<em>de jure</em> or <em>de facto</em>) 586 in a major region within a country, but does not qualify as an 587 official language of the country as a whole. For example, it can be 588 used in an official petition to a provincial government, but not the 589 central government. The term “major” is meant to distinguish from 590 smaller-scale usage, such as for a town or village. 591 </p> 592 593 <h3> 594 2.4 <a name="Territory_Based_Preferences" 595 href="#Territory_Based_Preferences">Territory-Based Preferences</a> 596 </h3> 597 <p> 598 The default preference for several locale items is based solely on a 599 <a href="tr35.html#unicode_region_subtag">unicode_region_subtag</a>, 600 which may either be specified as part of a <a 601 href="tr35.html#unicode_language_id">unicode_language_id</a>, 602 inferred from other locale ID elements using the <a 603 href="tr35.html#Likely_Subtags">Likely Subtags</a> mechanism, or 604 provided explicitly using an “rg” <a href="tr35.html#RegionOverride">Region 605 Override</a> locale key. For more information on this process see <a 606 href="tr35.html#Locale_Inheritance">Locale Inheritance and 607 Matching</a>. 
The specific items that are handled in this way are:
</p>
<ul>
<li>Default calendar (see <a
href="tr35-dates.html#Calendar_Preference_Data">Calendar
Preference Data</a>)
</li>
<li>Default week conventions (first day of week and weekend
days; see <a href="tr35-dates.html#Week_Data">Week Data</a>)
</li>
<li>Default hour cycle (see <a href="tr35-dates.html#Time_Data">Time
Data</a>)
</li>
<li>Default currency (see <a
href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
Currency Data</a>)
</li>
<li>Default measurement system and paper size (see <a
href="tr35-general.html#Measurement_System_Data">Measurement
System Data</a>)
</li>
<li>Default units for specific usage (see <a
href="#Preferred_Units_For_Usage">Preferred Units for Specific
Usages</a>, below)
</li>
</ul>

<h4>
2.4.1 <a name="Preferred_Units_For_Usage"
href="#Preferred_Units_For_Usage">Preferred Units for Specific
Usages</a>
</h4>
<p>This data is intended to map from a particular usage (for
example, measuring the height of a person or the fuel consumption
of an automobile) to the unit or combination of units typically used
for that usage in a given region. Considerations for such a mapping
include:</p>
<ul>
<li>The list of possible usages is large and open-ended. The intent
here is to start with a small set for which there is an urgent need,
and expand as necessary.</li>
<li>Even for a given usage such as measuring a road distance,
there are multiple ranges in use.
For example, one set of units may 650 be used for indicating the distance to the next city (kilometers or 651 miles), while another may be used for indicating the distance to the 652 next exit (meters, yards, or feet).</li> 653 <li>There are also differences between more formal usage 654 (official signage, medical records) and more informal usage 655 (conversation, texting).</li> 656 <li>For some usages, the measurement may be expressed using a 657 sequence of units, such as “1 meter, 78 centimeters” or “12 stone, 2 658 pounds”.</li> 659 </ul> 660 <p>The DTD structure is as follows:</p> 661 <p class="dtd"> 662 <!ELEMENT unitPreferenceData ( 663 unitPreferences* ) ><br> <br> <!ELEMENT 664 unitPreferences ( unitPreference* ) ><br> <!ATTLIST 665 unitPreferences category NMTOKEN #REQUIRED ><br> 666 <!ATTLIST unitPreferences usage NMTOKENS #REQUIRED ><br> 667 <!ATTLIST unitPreferences scope (small) #IMPLIED ><br> <br> 668 <!ELEMENT unitPreference ( #PCDATA ) ><br> <!ATTLIST 669 unitPreference regions NMTOKENS #REQUIRED ><br> 670 </p> 671 <p>An example of data using this structure is as 672 follows:</p> 673 <pre> 674 <unitPreferenceData> 675 ... 676 <unitPreferences category="length" usage="person"> 677 <unitPreference regions="001">centimeter</unitPreference> 678 <unitPreference regions="BR CN DE DK MX NL NO PL PT RU" alt="informal">meter centimeter</unitPreference> 679 <unitPreference regions="AT BE DZ EG ES FR HK ID IL IT JO MY SA SE TR VN">meter centimeter</unitPreference> 680 <unitPreference regions="CA GB IN US" alt="informal">foot inch</unitPreference> 681 <unitPreference regions="US">inch</unitPreference> 682 </unitPreferences> 683 <unitPreferences category="length" usage="person" scope="small"> 684 <unitPreference regions="001">centimeter</unitPreference> 685 <unitPreference regions="CA GB IN" alt="informal">inch</unitPreference> 686 <unitPreference regions="US">inch</unitPreference> 687 </unitPreferences> 688 ... 
  </unitPreferenceData>
</pre>
<p>There are several things to note:</p>
<ul>
<li>The <unitPreferences> <em>category</em> attribute
values match <unit> element <em>type</em> attribute values,
as listed in <a href="tr35-general.html#Unit_Elements">Unit
Elements</a>.
</li>
<li>The <unitPreferences> <em>usage</em> attribute values
are specific to this data; current values are listed in a table at
the end of this section.
</li>
<li>The <unitPreferences> element may have a <em>scope="small"</em>
attribute to indicate that it is intended for the smaller range of
values for that usage, such as measuring the height or weight of an
infant versus that of an adult, or measuring the road distance to
the next exit versus that to the next city.
</li>
<li>Each <unitPreferences> element must contain one
<unitPreference> element with attribute <em>regions="001"</em>;
this specifies the worldwide default unit or unit sequence for the
usage and scope specified by the <unitPreferences> element.
There may be additional <unitPreference> elements which
specify a different unit or unit sequence for specific regions and
possibly for a different degree of formality.
</li>
<li>The <unitPreference> element may have an <em>alt="informal"</em>
attribute to indicate that the specified unit or unit sequence is
preferred in more informal usage.
</li>
<li>The value of the <unitPreference> element is a
sequence of one or more space-separated unit names taken from the
<unit> element <em>unit</em> attribute values for the relevant
type, as listed in <a href="tr35-general.html#Unit_Elements">Unit
Elements</a>.
725 </li> 726 </ul> 727 <p>For a given combination of category, usage, 728 scope and formality, the intended procedure for looking up the unit 729 or unit combination to use for a given region is as follows:</p> 730 <ul> 731 <li>Get the appropriate <unitPreferences> element for the 732 desired <em>category</em> and <em>usage</em>: If scope=small is 733 desired and a <unitPreferences> element with <em>scope="small"</em> 734 exists for the desired <em>category</em> and <em>usage</em>, use it. 735 Otherwise, use a <unitPreferences> element for the desired <em>category</em> 736 and <em>usage</em> that has no <em>scope</em> attribute. In the 737 selected <unitPreferences> element, pick a 738 <unitPreference> element using the following steps. 739 </li> 740 <li>If informal usage is preferred, look for a 741 <unitPreference> element with <em>alt="informal"</em> whose <em>regions</em> 742 attribute includes the given region. If found, use the specified 743 unit [sequence]. 744 </li> 745 <li>Look for a <unitPreference> element whose <em>regions</em> 746 attribute includes the given region. If found, use the specified 747 unit [sequence]. 748 </li> 749 <li>Look for a <unitPreference> element with <em>alt="informal"</em> 750 whose <em>regions</em> attribute is "001". If found, use the 751 specified unit [sequence]. 752 </li> 753 <li>Look for a <unitPreference> element whose <em>regions</em> 754 attribute is "001". If found, use the specified unit [sequence]. 
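</li>
</ul>
<p>The lookup steps above can be sketched in Python roughly as
follows. This is an illustration under stated assumptions, not a
reference implementation: the function name
<code>preferred_units</code> and its parameters are hypothetical, the
two steps that mention <em>alt="informal"</em> are assumed to apply
only when informal usage is preferred, and the remaining steps are
assumed to consider only elements without an <em>alt</em> attribute.</p>
<pre>
```python
import xml.etree.ElementTree as ET

# Sample copied from the example data earlier in this section.
SAMPLE = """
<unitPreferenceData>
  <unitPreferences category="length" usage="person">
    <unitPreference regions="001">centimeter</unitPreference>
    <unitPreference regions="BR CN DE DK MX NL NO PL PT RU" alt="informal">meter centimeter</unitPreference>
    <unitPreference regions="AT BE DZ EG ES FR HK ID IL IT JO MY SA SE TR VN">meter centimeter</unitPreference>
    <unitPreference regions="CA GB IN US" alt="informal">foot inch</unitPreference>
    <unitPreference regions="US">inch</unitPreference>
  </unitPreferences>
  <unitPreferences category="length" usage="person" scope="small">
    <unitPreference regions="001">centimeter</unitPreference>
    <unitPreference regions="CA GB IN" alt="informal">inch</unitPreference>
    <unitPreference regions="US">inch</unitPreference>
  </unitPreferences>
</unitPreferenceData>
"""


def preferred_units(root, category, usage, region, small=False, informal=False):
    """Return the unit sequence as a list of unit names, or None if nothing matches."""
    # Step 1: select the <unitPreferences> element for category and usage,
    # preferring scope="small" when requested and available.
    candidates = [
        e for e in root.iter("unitPreferences")
        if e.get("category") == category and usage in e.get("usage", "").split()
    ]
    chosen = None
    if small:
        chosen = next((e for e in candidates if e.get("scope") == "small"), None)
    if chosen is None:
        chosen = next((e for e in candidates if e.get("scope") is None), None)
    if chosen is None:
        return None

    def find(target_region, alt):
        # Match on the alt value (None means "no alt attribute") and on
        # literal membership in the space-separated regions list.
        for pref in chosen.iter("unitPreference"):
            if pref.get("alt") == alt and target_region in pref.get("regions", "").split():
                return (pref.text or "").split()
        return None

    # Remaining steps, in the order listed above. The informal steps are
    # skipped when informal usage is not preferred (an assumption).
    attempts = []
    if informal:
        attempts.append((region, "informal"))
    attempts.append((region, None))
    if informal:
        attempts.append(("001", "informal"))
    attempts.append(("001", None))
    for target_region, alt in attempts:
        units = find(target_region, alt)
        if units is not None:
            return units
    return None


root = ET.fromstring(SAMPLE)
print(preferred_units(root, "length", "person", "US", informal=True))  # ['foot', 'inch']
print(preferred_units(root, "length", "person", "US"))                 # ['inch']
```
</pre>
<ul>
<li>With the sample data shown earlier in this section, such a sketch
selects <em>foot inch</em> for region US when informal usage is
preferred and <em>inch</em> otherwise; a region that appears in no
<em>regions</em> list falls back to the "001" entry.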
755 </li> 756 </ul> 757 <p>CLDR 29 contains usage mapping data for the 758 following combinations of category, usage, and scope:</p> 759 <table border="1" cellpadding="4" cellspacing="0"> 760 <caption> 761 <a name="Unit_Preference_Categories" 762 href="#Unit_Preference_Categories">Unit Preference Categories</a> 763 </caption> 764 <tr> 765 <td><strong>Category</strong></td> 766 <td><strong>Usage</strong></td> 767 <td><strong>Sample Value</strong></td> 768 </tr> 769 <tr> 770 <td><em>area</em></td> 771 <td>land-agricult</td> 772 <td>hectare</td> 773 </tr> 774 <tr> 775 <td><em>area</em></td> 776 <td>land-commercl</td> 777 <td>hectare</td> 778 </tr> 779 <tr> 780 <td><em>area</em></td> 781 <td>land-residntl</td> 782 <td>hectare</td> 783 </tr> 784 <tr> 785 <td><em>concentr</em></td> 786 <td>blood-glucose</td> 787 <td>milligram-per-deciliter</td> 788 </tr> 789 <tr> 790 <td><em>consumption</em></td> 791 <td>vehicle-fuel</td> 792 <td>liter-per-100kilometers</td> 793 </tr> 794 <tr> 795 <td><em>duration</em></td> 796 <td>music-track</td> 797 <td>minute second</td> 798 </tr> 799 <tr> 800 <td><em>duration</em></td> 801 <td>person-age</td> 802 <td>year-person month-person</td> 803 </tr> 804 <tr> 805 <td><em>duration</em></td> 806 <td>tv-program</td> 807 <td>minute second</td> 808 </tr> 809 <tr> 810 <td><em>energy</em></td> 811 <td>food</td> 812 <td>foodcalorie</td> 813 </tr> 814 <tr> 815 <td><em>energy</em></td> 816 <td>person-usage</td> 817 <td>kilocalorie</td> 818 </tr> 819 <tr> 820 <td><em>length</em></td> 821 <td>person</td> 822 <td>centimeter</td> 823 </tr> 824 <tr> 825 <td><em>length</em></td> 826 <td>person, scope=small</td> 827 <td>centimeter</td> 828 </tr> 829 <tr> 830 <td><em>length</em></td> 831 <td>rainfall</td> 832 <td>millimeter</td> 833 </tr> 834 <tr> 835 <td><em>length</em></td> 836 <td>road</td> 837 <td>kilometer</td> 838 </tr> 839 <tr> 840 <td><em>length</em></td> 841 <td>road, scope=small</td> 842 <td>meter</td> 843 </tr> 844 <tr> 845 <td><em>length</em></td> 
        <td>snowfall</td>
        <td>centimeter</td>
      </tr>
      <tr>
        <td><em>length</em></td>
        <td>vehicle</td>
        <td>meter</td>
      </tr>
      <tr>
        <td><em>length</em></td>
        <td>visiblty</td>
        <td>kilometer</td>
      </tr>
      <tr>
        <td><em>length</em></td>
        <td>visiblty, scope=small</td>
        <td>meter</td>
      </tr>
      <tr>
        <td><em>mass</em></td>
        <td>person</td>
        <td>kilogram</td>
      </tr>
      <tr>
        <td><em>mass</em></td>
        <td>person, scope=small</td>
        <td>gram</td>
      </tr>
      <tr>
        <td><em>pressure</em></td>
        <td>baromtrc</td>
        <td>hectopascal</td>
      </tr>
      <tr>
        <td><em>speed</em></td>
        <td>road-travel</td>
        <td>kilometer-per-hour</td>
      </tr>
      <tr>
        <td><em>speed</em></td>
        <td>wind</td>
        <td>kilometer-per-hour</td>
      </tr>
      <tr>
        <td><em>temperature</em></td>
        <td>person</td>
        <td>celsius</td>
      </tr>
      <tr>
        <td><em>temperature</em></td>
        <td>weather</td>
        <td>celsius</td>
      </tr>
      <tr>
        <td><em>volume</em></td>
        <td>vehicle-fuel</td>
        <td>liter</td>
      </tr>
    </table>

    <h3>
      2.5 <a name="rgScope" href="#rgScope"><rgScope>: Scope of
        the “rg” Locale Key</a>
    </h3>
    <p>
      The supplemental <rgScope> element specifies the data paths for
      which the region used for data lookup is determined by the value of
      any “rg” key present in the locale identifier (see <a
        href="tr35.html#RegionOverride">Region Override</a>). If no “rg” key
      is present, the region used for lookup is determined as usual: from
      the unicode_region_subtag if present, else inferred from the
      unicode_language_subtag.
      The DTD structure is as follows:
    </p>
    <p class="dtd">
      <!ELEMENT rgScope ( rgPath* ) ><br>
      <br> <!ELEMENT rgPath EMPTY ><br> <!ATTLIST
      rgPath path CDATA #REQUIRED ><br>
    </p>
    <p>The <rgScope> element contains a list of
      <rgPath> elements, each of which specifies a data path for which
      any “rg” key determines the region for lookup. For example:</p>
    <pre>
<rgScope>
    <rgPath path="//supplementalData/currencyData/fractions/info[@iso4217='#'][@digits='*'][@rounding='*'][@cashDigits='*'][@cashRounding='*']" draft="provisional" />
    <rgPath path="//supplementalData/currencyData/fractions/info[@iso4217='#'][@digits='*'][@rounding='*'][@cashRounding='*']" draft="provisional" />
    <rgPath path="//supplementalData/currencyData/fractions/info[@iso4217='#'][@digits='*'][@rounding='*']" draft="provisional" />
    <rgPath path="//supplementalData/calendarPreferenceData/calendarPreference[@territories='#'][@ordering='*']" draft="provisional" />
    ...
    <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*'][@scope='*']/unitPreference[@regions='#'][@alt='*']" draft="provisional" />
    <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*'][@scope='*']/unitPreference[@regions='#']" draft="provisional" />
    <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*']/unitPreference[@regions='#'][@alt='*']" draft="provisional" />
    <rgPath path="//supplementalData/unitPreferenceData/unitPreferences[@category='*'][@usage='*']/unitPreference[@regions='#']" draft="provisional" />
</rgScope>
</pre>
    <p>The exact format of the path is provisional in
      CLDR 29, but as currently shown:</p>
    <ul>
      <li>An attribute value of '*' indicates that the path applies
        regardless of the value of the attribute.</li>
      <li>Each path must have exactly one attribute whose value is
        marked here as '#'; in actual data
        items with this path, the
        corresponding value is a list of region codes. It is the region
        codes in this list that are compared with the region specified by
        the “rg” key to determine which data item to use for this path.</li>
    </ul>

    <h2>
      3 <a name="Supplemental_Language_Data"
        href="#Supplemental_Language_Data">Supplemental Language Data</a>
    </h2>

    <p class="dtd">
      <!ELEMENT languageData ( language* ) ><br> <!ELEMENT
      language EMPTY ><br> <!ATTLIST language type NMTOKEN
      #REQUIRED ><br> <!ATTLIST language scripts NMTOKENS
      #IMPLIED ><br> <!ATTLIST language territories NMTOKENS
      #IMPLIED ><br> <!ATTLIST language variants NMTOKENS
      #IMPLIED ><br> <!ATTLIST language alt NMTOKENS #IMPLIED
      ><br>
    </p>
    <p>
      The language data is used for consistency checking and testing. It
      provides a list of which languages are used with which scripts and in
      which countries. To a large extent, however, the territory list has
      been superseded by the data in <em>Section 2.2 <a
        href="#Supplemental_Territory_Information">Supplemental
          Territory Information</a></em>.
    </p>
    <pre>    <languageData>
        <language type="af" scripts="Latn" territories="ZA"/>
        <language type="am" scripts="Ethi" territories="ET"/>
        <language type="ar" scripts="Arab" territories="AE BH DZ EG IN IQ JO KW LB
LY MA OM PS QA SA SD SY TN YE"/>
        ...</pre>
    <p>If the language is not a modern language, or the script is not
      a modern script, or the language is not a major language of the
      territory, then the alt attribute is set to secondary.</p>
    <pre>        <language type="fr" scripts="Latn" territories="IT US" alt="secondary" />
        ...</pre>
    <h2>3.1 <a name="Supplemental_Language_Grouping"
        href="#Supplemental_Language_Grouping">Supplemental Language Grouping</a> </h2>

    <p><!ELEMENT languageGroups ( languageGroup* ) ><br>
      <!ELEMENT languageGroup ( #PCDATA ) > <br>
      <!ATTLIST languageGroup parent NMTOKEN #REQUIRED ></p>
    <p>The language groups supply language containment. For example, the following indicates that fiu is the Unicode language code for a language group that contains chm, et, fi, etc.</p>
    <code><languageGroup parent="<strong>fiu</strong>">chm et <strong>fi</strong> fit fkv hu izh kca koi krl kv liv mdf mns mrj myv smi udm vep vot vro</languageGroup></code>
    <p>The vast majority of the languageGroup data is extracted from Wikidata, but may be overridden in some cases. The Wikidata information is more fine-grained, but makes use of language groups that don't have ISO or Unicode language codes. Those language groups are omitted from the data.
      For example, Wikidata has the following child-parent chain, of which only the first and last elements are present in the language groups.</p>
    <table>
      <tr><td>Name</td><td>Wikidata Code</td><td>Language Code</td></tr>
      <tr><td>Finnish</td>
        <td><a href="https://www.wikidata.org/wiki/Q1412">Q1412</a></td>
        <td>fi</td></tr>
      <tr><td>Finnic languages</td><td><a href="https://www.wikidata.org/wiki/Q33328">Q33328</a></td><td></td></tr>
      <tr><td>Finno-Samic languages</td><td><a href="https://www.wikidata.org/wiki/Q163652">Q163652</a></td><td></td></tr>
      <tr><td>Finno-Volgaic languages</td><td><a href="https://www.wikidata.org/wiki/Q161236">Q161236</a></td><td></td></tr>
      <tr><td>Finno-Permic languages</td><td><a href="https://www.wikidata.org/wiki/Q161240">Q161240</a></td><td></td></tr>
      <tr><td>Finno-Ugric languages</td><td><a href="https://www.wikidata.org/wiki/Q79890">Q79890</a></td><td>fiu</td></tr>

    </table><br>
    <h2>
      4 <a name="Supplemental_Code_Mapping"
        href="#Supplemental_Code_Mapping">Supplemental Code Mapping</a>
    </h2>

    <p class="dtd"><!ELEMENT codeMappings (languageCodes*,
      territoryCodes*, currencyCodes*) ></p>
    <p class="dtd">
      <!ELEMENT languageCodes EMPTY ><br> <!ATTLIST
      languageCodes type NMTOKEN #REQUIRED><br> <!ATTLIST
      languageCodes alpha3 NMTOKEN #REQUIRED>
    </p>
    <p class="dtd">
      <!ELEMENT territoryCodes EMPTY ><br> <!ATTLIST
      territoryCodes type NMTOKEN #REQUIRED><br> <!ATTLIST
      territoryCodes numeric NMTOKEN #REQUIRED><br> <!ATTLIST
      territoryCodes alpha3 NMTOKEN #REQUIRED><br> <!ATTLIST
      territoryCodes fips10 NMTOKEN #IMPLIED><br> <!ATTLIST
      territoryCodes internet NMTOKENS #IMPLIED> [deprecated]
    </p>
    <p class="dtd">
      <!ELEMENT currencyCodes EMPTY ><br> <!ATTLIST
      currencyCodes type NMTOKEN #REQUIRED> <br> <!ATTLIST
      currencyCodes numeric NMTOKEN #REQUIRED>
    </p>
    <p>
      The code mapping information provides mappings between the subtags
      used in the CLDR locale IDs (from BCP 47) and other coding systems or
      related information. The language code mappings are provided only for
      codes that have two letters in BCP 47, and map them to their ISO
      three-letter equivalents. The territory codes provide mappings to
      numeric codes (UN M.49
      [<a href="tr35.html#UNM49">UNM49</a>] codes, equivalent to ISO
      numeric codes), ISO three-letter codes, FIPS 10 codes, and the
      internet top-level domain codes.
    </p>
    <p>The alphabetic codes are only provided where different from the
      type. For example:</p>
    <pre><territoryCodes type="AA" numeric="958" alpha3="AAA"/>
<territoryCodes type="AD" numeric="020" alpha3="AND" fips10="AN"/>
<territoryCodes type="AE" numeric="784" alpha3="ARE"/>
...
<territoryCodes type="GB" numeric="826" alpha3="GBR" fips10="UK"/>
...
<territoryCodes type="QU" numeric="967" alpha3="QUU" internet="EU"/>
...
<territoryCodes type="XK" numeric="983" alpha3="XKK"/>
...</pre>
    <p>Where there is no corresponding code, private-use
      codes are sometimes used, such as the numeric code for XK.</p>
    <p>
      The currencyCodes are mappings from three-letter currency codes to
      numeric values (ISO 4217 <a
        href="http://www.currency-iso.org/en/home/tables/table-a1.html">Current
        currency & funds code list</a>). The mapping currently covers only
      current codes and does not include historic currencies. For example:
    </p>
    <pre>
<currencyCodes type="AED" numeric="784"/>
<currencyCodes type="AFN" numeric="971"/>
...
<currencyCodes type="EUR" numeric="978"/>
...
<currencyCodes type="ZAR" numeric="710"/>
<currencyCodes type="ZMW" numeric="967"/>
</pre>
    <h2>
      5 <a name="Telephone_Code_Data" href="#Telephone_Code_Data">Telephone
        Code Data</a> (Deprecated)
    </h2>
    <p>Deprecated in CLDR v34, and data removed.</p>

    <p class="dtd">
      <!ELEMENT telephoneCodeData ( codesByTerritory* ) ><br> <br>
      <!ELEMENT codesByTerritory ( telephoneCountryCode+ ) ><br>
      <!ATTLIST codesByTerritory territory NMTOKEN #REQUIRED ><br>
      <br> <!ELEMENT telephoneCountryCode EMPTY ><br>
      <!ATTLIST telephoneCountryCode code NMTOKEN #REQUIRED ><br>
      <!ATTLIST telephoneCountryCode from NMTOKEN #IMPLIED ><br>
      <!ATTLIST telephoneCountryCode to NMTOKEN #IMPLIED >
    </p>
    <p>
      This data specifies the mapping between ITU telephone country codes [<a
        href="tr35.html#ITUE164">ITUE164</a>] and CLDR-style territory codes
      (ISO 3166 2-letter codes or non-corresponding UN M.49 [<a
        href="tr35.html#UNM49">UNM49</a>] 3-digit codes). There are several
      things to note:
    </p>
    <ul>
      <li>A given telephone country code may map to multiple CLDR
        territory codes; +1 (North American Numbering Plan) covers the US and
        Canada, as well as many islands in the Caribbean and some in the
        Pacific.</li>
      <li>Some telephone country codes are for global services (for
        example, some satellite services), and thus correspond to territory
        code 001.</li>
      <li>The mappings change over time (territories move from one
        telephone code to another). These changes are usually planned
        several years in advance, and there may be a period during which
        either telephone code can be used to reach the territory.
        While the
        CLDR telephone code data is not intended to include past changes, it
        is intended to incorporate known information on planned future
        changes, using "from" and "to" date attributes
        to indicate when mappings are valid.</li>
    </ul>
    <p>A subset of the telephone code data might look like the
      following (showing a past mapping change to illustrate the from and
      to attributes):</p>
    <pre><codesByTerritory territory="001">
    <telephoneCountryCode code="800"/> <!-- International Freephone Service -->
    <telephoneCountryCode code="808"/> <!-- International Shared Cost Services (ISCS) -->
    <telephoneCountryCode code="870"/> <!-- Inmarsat Single Number Access Service (SNAC) -->
</codesByTerritory>
<codesByTerritory territory="AS"> <!-- American Samoa -->
    <telephoneCountryCode code="1" from="2004-10-02"/> <!-- +1 684 in North American Numbering Plan -->
    <telephoneCountryCode code="684" to="2005-04-02"/> <!-- +684 now a spare code -->
</codesByTerritory>
<codesByTerritory territory="CA">
    <telephoneCountryCode code="1"/> <!-- North American Numbering Plan -->
</codesByTerritory></pre>

    <h2>
      6 <a name="Postal_Code_Validation" href="#Postal_Code_Validation">Postal
        Code Validation (Deprecated)</a>
    </h2>
    <p>Deprecated in v27.
      Please see other services that are kept up
      to date, such as:</p>
    <ul>

      <li><a href="http://i18napis.appspot.com/address/data/US">http://i18napis.appspot.com/address/data/US</a></li>
      <li><a href="http://i18napis.appspot.com/address/data/CH">http://i18napis.appspot.com/address/data/CH</a></li>
      <li>...</li>
    </ul>
    <p class="dtd">
      <!ELEMENT postalCodeData (postCodeRegex*) ><br>
      <!ELEMENT postCodeRegex (#PCDATA) ><br> <!ATTLIST
      postCodeRegex territoryId NMTOKEN #REQUIRED><br>
    </p>
    <p>The postal code regex information can be used to validate
      postal codes used in different countries. In some cases, the regex is
      quite simple, such as for Germany:</p>
    <pre><postCodeRegex territoryId="DE" >\d{5}</postCodeRegex></pre>
    <p>The US code is slightly more complicated, since there is an
      optional portion:</p>
    <pre><postCodeRegex territoryId="US" >\d{5}([ \-]\d{4})?</postCodeRegex></pre>
    <p>The most complicated regex is currently the one for the UK.</p>

    <h2>
      7 <a name="Supplemental_Character_Fallback_Data"
        href="#Supplemental_Character_Fallback_Data">Supplemental
        Character Fallback Data</a>
    </h2>
    <p class="dtd">
      <!ELEMENT characters ( character-fallback*) ><br> <br>
      <!ELEMENT character-fallback ( character* ) ><br>
      <!ELEMENT character (substitute*) ><br> <!ATTLIST
      character value CDATA #REQUIRED ><br> <br> <!ELEMENT
      substitute (#PCDATA) >
    </p>
    <p>The characters element provides a way for non-Unicode systems,
      or systems that only support a subset of Unicode characters, to
      transform CLDR data. It gives a list of characters with alternative
      values that can be used if the main value is not available.
      For
      example:</p>
    <pre><characters>
    <character-fallback>
        <character value = "ß">
            <substitute>ss</substitute>
        </character>
        <character value = "Ø">
            <substitute>Ö</substitute>
            <substitute>O</substitute>
        </character>
        <character value = "<span style="font-size: 150%">₧</span>">
            <substitute>Pts</substitute>
        </character>
        <character value = "<span style="font-size: 150%">₣</span>">
            <substitute>Fr.</substitute>
        </character>
    </character-fallback>
</characters></pre>
    <p>The ordering of the substitute elements indicates the
      preference among them.
      That is, this data provides recommended fallbacks for use when a
      charset or supported repertoire does not contain a desired character.
      There may be more than one possible fallback. The recommended usage is
      that, when a character <i>value</i> is not in the desired repertoire,
      the following candidates are tried in order, and the first one that is
      wholly in the desired repertoire is used:</p>
    <ul>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em"><code>toNFC</code>(<i>value</i>)</li>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em">other
        canonically equivalent sequences, if there are any</li>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em">the explicit
        <i>substitutes</i> value (in order)
      </li>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em"><code>toNFKC</code>(<i>value</i>)</li>
    </ul>



    <h2>
      8 <a name="Coverage_Levels" href="#Coverage_Levels">Coverage
        Levels</a>
    </h2>
    <p>The following describes the coverage levels used for the
      current version of CLDR. This list will change between releases of
      CLDR.
      Each level adds to what is in the lower level.</p>
    <table border="1" cellpadding="0" cellspacing="1">
      <!-- nocaption -->
      <tr>
        <th nowrap><div align="right">Level</div></th>
        <th colspan="2">Description</th>
      </tr>
      <tr>
        <td nowrap><div align="right">0</div></td>
        <td>undetermined</td>
        <td>Does not meet any of the following levels.</td>
      </tr>
      <tr>
        <td nowrap><div align="right">10</div></td>
        <td>core</td>
        <td>The CLDR "core" data, which is defined as the basic
          information about the language and writing system that is required
          before other information can be added using the CLDR survey tool.
          See <a href="http://cldr.unicode.org/index/cldr-spec/minimaldata">http://cldr.unicode.org/index/cldr-spec/minimaldata</a>
        </td>
      </tr>
      <tr>
        <td nowrap><div align="right">40</div></td>
        <td>basic</td>
        <td>The minimum amount of locale data deemed necessary to
          create a "viable" locale in CLDR. Contains names for the languages,
          scripts, and territories associated with the language, numbering
          systems used in those languages, date and number formats, plus a
          few key values such as the values in Section 3.1 <a
          href="tr35.html#Unknown_or_Invalid_Identifiers">Unknown or
            Invalid Identifiers</a>. Also contains data associated with the most prominent languages
          and countries.</td>
      </tr>
      <tr>
        <td nowrap><div align="right">60</div></td>
        <td>moderate</td>
        <td>Contains more types of data and more language and territory
          names than the basic level.
          If the language is associated with an
          EU country, then the moderate level attempts to complete the data
          as it pertains to all EU member countries.</td>
      </tr>
      <tr>
        <td nowrap><div align="right">80</div></td>
        <td>modern</td>
        <td>Contains all fields in normal modern use, including all
          country names, and currencies in use.</td>
      </tr>
      <tr>
        <td nowrap><div align="right">100</div></td>
        <td>comprehensive</td>
        <td>Contains complete localizations (or valid inheritance) for
          every possible field.</td>
      </tr>
    </table>
    <p>
      Levels 40 through 80 are based on the definitions and specifications
      listed in <strong>8.1-8.4</strong>. However, these principles are
      continually being refined by the CLDR technical committee, and so do
      not completely reflect the data that is actually used for coverage
      determination, which is under the XPath <strong>//supplementalData/coverageLevels</strong>.
      For a view of the trunk version of this data,
      see <a
        href="http://unicode.org/repos/cldr/tags/latest/common/supplemental/coverageLevels.xml">coverageLevels.xml</a>.
      (As described in the <a href="tr35-info.html#Supplemental_Data">introduction
        to Supplemental Data</a>, the specific XML filename may change.)
    </p>
    <p class="dtd">
      <!ELEMENT coverageLevels ( approvalRequirements,
      coverageVariable*, coverageLevel* ) ><br> <!ELEMENT
      coverageLevel EMPTY ><br> <!ATTLIST coverageLevel
      inLanguage CDATA #IMPLIED ><br> <!ATTLIST coverageLevel
      inScript CDATA #IMPLIED ><br> <!ATTLIST coverageLevel
      inTerritory CDATA #IMPLIED ><br> <!ATTLIST coverageLevel
      value CDATA #REQUIRED ><br> <!ATTLIST coverageLevel match
      CDATA #REQUIRED >
    </p>
    <p>Here is an example coverageLevel line.</p>
    <pre><coverageLevel<br>    value="30"
    inLanguage="(de|fi)" <br>    match="localeDisplayNames/types/type[@type='phonebook'][@key='collation']"/></pre>
    <p>
      The coverageLevel elements are read in order, and the first match
      results in a coverage level value. The element matches based on the <span
        class="attribute">inLanguage</span>, <span class="attribute">inScript</span>,
      <span class="attribute">inTerritory</span>, and <span
        class="attribute">match</span> attribute values, which are regular
      expressions. In the example above, a match occurs if the
      language is de or fi, and if the path is a locale display name for
      collation=phonebook.
    </p>
    <p>
      The <span class="attribute">match</span> attribute value logically
      has "//ldml/" prefixed before it is applied. In addition,
      the "[@" is automatically quoted. Otherwise standard
      Perl/Java style regular expression syntax is used.
    </p>
    <p class="dtd">
      <!ELEMENT coverageVariable EMPTY ><br> <!ATTLIST
      coverageVariable key CDATA #REQUIRED ><br> <!ATTLIST
      coverageVariable value CDATA #REQUIRED >
    </p>
    <p>The coverageVariable element allows us to create variables for
      certain regular expressions that are used frequently in the
      coverageLevel definitions above.
      Each coverage variable must contain a
      key / value pair of attributes, whose value can then be
      substituted into a coverageLevel definition.</p>
    <p>Here is an example coverageLevel line using
      coverageVariable substitution.</p>

    <pre><coverageVariable key="%dayTypes" value="(sun|mon|tue|wed|thu|fri|sat)"/><br>
<coverageVariable key="%wideAbbr" value="(wide|abbreviated)"/><br>
<coverageLevel value="20" match="dates/calendars/calendar[@type='gregorian']/days/dayContext[@type='format']/dayWidth[@type='%wideAbbr']/day[@type='%dayTypes']"/></pre>
    <p>In this example, the coverage variables %dayTypes and %wideAbbr
      are used to substitute their respective values into the match
      expression. This allows us to reuse the same variable for other
      coverageLevel matches that use the same regular expression fragment.</p>
    <p class="dtd">
      <br> <!ELEMENT approvalRequirements ( approvalRequirement* )
      ><br> <!ELEMENT approvalRequirement EMPTY ><br>
      <!ATTLIST approvalRequirement votes CDATA #REQUIRED><br>
      <!ATTLIST approvalRequirement locales CDATA #REQUIRED><br>
      <!ATTLIST approvalRequirement paths CDATA #REQUIRED><br>
    </p>
    <p>The approvalRequirements element specifies the number of survey
      tool votes required for approval, based on locale, path, or
      both. Certain locales require a higher voting threshold (usually 8
      votes instead of 4), in order to promote greater stability in the
      data.
      Furthermore, certain very high-visibility
      fields, such as number formats, require a CLDR TC member's
      vote for approval.</p>

    <p>Here is an example of the approvalRequirements section.</p>

    <pre><approvalRequirements><br>    <!-- "high bar" items -->
    <approvalRequirement votes="20" locales="*" paths="//ldml/numbers/symbols[^/]++/(decimal|group)"/>
    <!-- established locales - http://cldr.unicode.org/index/process#TOC-Draft-Status-of-Optimal-Field-Value -->
    <approvalRequirement votes="8" locales="ar ca cs da de el es fi fr he hi hr hu it ja ko nb nl pl pt pt_PT ro ru sk sl sr sv th tr uk vi zh zh_Hant" paths=""/>
    <!-- all other items -->
    <approvalRequirement votes="4" locales="*" paths=""/><br></approvalRequirements> </pre>
    <p>This section specifies that a TC vote (20 votes) is required
      for decimal and grouping separators. Furthermore it specifies that
      any field in the established locales list (ar, ca, cs, and so on)
      requires 8 votes, and that all other locales require 4 votes only.</p>
    <p>
      For more information on the CLDR voting process, see <a
        href="http://cldr.unicode.org/index/process">http://cldr.unicode.org/index/process</a>.
    </p>

    <h3>
      8.1 <a name="Coverage_Level_Definitions"
        href="#Coverage_Level_Definitions">Definitions</a>
    </h3>
    <ul>
      <li><i>Target-Language</i> is the language under consideration.</li>
      <li><i>Target-Territories</i> is the list of territories found
        by looking up <i>Target-Language</i> in the <languageData>
        elements in <a href="tr35-info.html#Supplemental_Language_Data">Supplemental
          Language Data</a>.</li>
      <li><i>Language-List</i> is <i>Target-Language</i>, plus
        <ul>
          <li><b>basic: </b>Chinese, English, French, German, Italian,
            Japanese, Portuguese, Russian, Spanish, Unknown (de, en, es, fr,
            it, ja, pt, ru, zh, und)</li>
          <li><b>moderate: </b>basic + Arabic, Hindi, Korean,
            Indonesian, Dutch, Bengali, Turkish, Thai, Polish (ar, hi, ko, in,
            nl, bn, tr, th, pl).
            If <i>Target-Language</i> is an EU language, add the remaining official
            EU languages, currently: Danish, Greek, Finnish, Swedish, Czech,
            Estonian, Latvian, Lithuanian, Hungarian, Maltese, Slovak, Slovene
            (da, el, fi, sv, cs, et, lv, lt, hu, mt, sk, sl)</li>
          <li><b>modern:</b> all languages that are official or major
            commercial languages of modern territories</li>
        </ul></li>
      <li><i>Target-Scripts</i> is the list of scripts in which <i>Target-Language</i>
        can be customarily written (found by looking up <i>Target-Language</i>
        in the <languageData> elements in <a
        href="tr35-info.html#Supplemental_Language_Data">Supplemental
          Language Data</a>), plus Unknown (Zzzz).</li>
      <li><i>Script-List</i> is the <i>Target-Scripts</i> plus the
        major scripts used for multiple languages
        <ul>
          <li>Latin, Simplified Chinese, Traditional Chinese, Cyrillic,
            Arabic (Latn, Hans, Hant, Cyrl, Arab)</li>
        </ul></li>
      <li><i>Territory-List</i> is the list of territories formed by
        taking the <i>Target-Territories</i> and adding:
        <ul>
          <li><b>basic: </b>Brazil, China, France, Germany, India,
            Italy, Japan, Russia, United Kingdom, United States, Unknown (BR,
            CN, DE, GB, FR, IN, IT, JP, RU, US, ZZ)</li>
          <li><b>moderate: </b>basic + Spain, Canada, Korea, Mexico,
            Australia, Netherlands, Switzerland, Belgium, Sweden, Turkey,
            Austria, Indonesia, Saudi Arabia, Norway, Denmark, Poland, South
            Africa, Greece, Finland, Ireland, Portugal, Thailand, Hong Kong
            SAR China, Taiwan (ES, CA, KR, MX, AU, NL, CH, BE, SE, TR, AT, ID,
            SA, NO, DK, PL, ZA, GR, FI, IE, PT, TH, HK, TW).
            If <i>Target-Language</i> is an EU language, add the remaining
            EU member countries: Luxembourg, Czech Republic, Hungary, Estonia,
            Lithuania, Latvia, Slovenia, Slovakia, Malta (LU, CZ, HU, EE, LT,
            LV, SI, SK, MT).</li>
          <li><b>modern:</b> all current ISO 3166 territories, plus the
            UN M.49 [<a href="tr35.html#UNM49">UNM49</a>] regions in <a
            href="tr35-info.html#Supplemental_Territory_Containment">Supplemental
              Territory Containment</a>.</li>
        </ul></li>
      <li><i>Currency-List</i> is the list of current official
        currencies used in any of the territories in <i>Territory-List</i>,
        found by looking at the region elements in <a
        href="tr35-info.html#Supplemental_Territory_Containment">Supplemental
          Territory Containment</a>, plus Unknown (XXX).</li>
      <li><i>Calendar-List</i> is the set of calendars in customary
        use in any of <i>Target-Territories</i>, plus Gregorian.</li>
      <li><em>Number-System-List</em> is the set of number systems in
        customary use in the language.</li>
    </ul>
    <h3>
      8.2 <a name="Coverage_Level_Data_Requirements"
        href="#Coverage_Level_Data_Requirements">Data Requirements</a>
    </h3>
    <p>The data required to qualify for a level is then the
      following.</p>
    <ol>
      <li>localeDisplayNames
        <ol>
          <li><i>languages: </i>localized names for all languages in <i>Language-List.</i></li>
          <li><i>scripts:</i> localized names for all scripts in <i>Script-List</i>.</li>
          <li><i>territories:</i> localized names for all territories in
            <i>Territory-List</i>.</li>
          <li><i>variants, keys, types:</i> localized names for any in
            use in <i>Target-Territories</i>; for example, a translation for
            PHONEBOOK in a German locale.</li>
        </ol>
      </li>
      <li>dates: all of the following for each calendar in <i>Calendar-List</i>.
        <ol>
          <li>calendars: localized names</li>
          <li>month names, day names, era names, and quarter names
            <ul>
              <li>context=format and width=narrow, wide, & abbreviated</li>
              <li>plus context=standAlone and width=narrow, wide, &
                abbreviated, <i>if the grammatical forms of these are
                  different than for context=format.</i>
              </li>
            </ul>
          </li>
          <li>week: minDays, firstDay, weekendStart, weekendEnd
            <ul>
              <li>if some of these vary in territories in <i>Territory-List</i>,
                include territory locales for those that do.
              </li>
            </ul>
          </li>
          <li>am, pm, eraNames, eraAbbr</li>
          <li>dateFormat, timeFormat: full, long, medium, short</li>
          <li>
            <p>intervalFormatFallback</p>
          </li>
        </ol>
      </li>
      <li>numbers: symbols, decimalFormats, scientificFormats,
        percentFormats, currencyFormats for each number system in <em>Number-System-List</em>.
      </li>
      <li>currencies: displayNames and symbol for all currencies in <i>Currency-List</i>,
        for all plural forms
      </li>
      <li>transforms: (moderate and above) transliteration between
        Latin and each other script in <i>Target-Scripts.</i>
      </li>
    </ol>
    <h3>
      8.3 <a name="Coverage_Level_Default_Values"
        href="#Coverage_Level_Default_Values">Default Values</a>
    </h3>
    <p>
      Items should <i>only</i> be included if they are not the same as the
      default, which is:
    </p>
    <ul>
      <li>what is in root, if there is something defined there.</li>
      <li>for timezone IDs: the name computed according to <i><a
          href="tr35.html#Time_Zone_Fallback">Appendix J: Time Zone
            Display Names</a></i></li>
      <li>for collation sequence, the UCA DUCET (Default Unicode
        Collation Element Table), as modified by CLDR.
        <ul>
          <li>however, in that case the locale must be added to the
            validSubLocale list in <a
            href="http://unicode.org/cldr/data/common/collation/root.xml">collation/root.xml</a>.
          </li>
        </ul>
      </li>
      <li>for currency symbol, language, territory, script names,
        variants, keys, types, the internal code identifiers, for example,
        <ul>
          <li>currencies: EUR, USD, JPY, ...</li>
          <li>languages: en, ja, ru, ...</li>
          <li>territories: GB, JP, FR, ...</li>
          <li>scripts: Latn, Thai, ...</li>
          <li>variants: PHONEBOOK,...</li>
        </ul>
      </li>
    </ul>
    <!-- end section 8 -->


    <!-- begin section 9 supplemental metadata -->
    <h2>
      9 <a name="Appendix_Supplemental_Metadata"
        href="#Appendix_Supplemental_Metadata">Supplemental Metadata</a>
    </h2>

    <p>
      Note that this section discusses the
      <code><metadata></code>
      element within the
      <code><supplementalData></code>
      element. For the per-locale metadata used in tests and the Survey
      Tool, see <a href="#Metadata_Elements">10: Locale Metadata
        Element</a>.
    </p>


    <p>The supplemental metadata contains information about the CLDR
      file itself, used to test validity and provide information for locale
      inheritance.
      A number of these elements are described in:</p>
    <ul class="toc">
      <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix I:
        <a href="tr35.html#Inheritance_and_Validity">Inheritance and
          Validity</a>
      </li>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix K:
        <a href="tr35.html#Valid_Attribute_Values">Valid Attribute
          Values</a>
      </li>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix L:
        <a href="tr35.html#Canonical_Form">Canonical Form</a>
      </li>
      <li style="margin-top: 0.5em; margin-bottom: 0.5em">Appendix M:
        <a href="#Coverage_Levels">Coverage Levels</a>
      </li>
    </ul>
    <h3>
      9.1 <a name="Supplemental_Alias_Information"
        href="#Supplemental_Alias_Information">Supplemental Alias
        Information</a>
    </h3>

    <p class="dtd">
      <!ELEMENT alias
      (languageAlias*,scriptAlias*,territoryAlias*,subdivisionAlias*,variantAlias*,zoneAlias*)
      ><br> <br> <em>The following are common attributes
        for subelements of <alias>:</em><br> <!ELEMENT *Alias EMPTY
      ><br> <!ATTLIST *Alias type NMTOKEN #IMPLIED ><br>
      <!ATTLIST *Alias replacement NMTOKEN #IMPLIED ><br>
      <!ATTLIST *Alias reason ( deprecated | overlong ) #IMPLIED> <br>
      <br> <em>The languageAlias has additional reasons</em><br>
      <!ATTLIST languageAlias reason ( deprecated | overlong |
      macrolanguage | legacy | bibliographic ) #IMPLIED>
    </p>
    <p>
      This element provides information as to parts of locale IDs that
      should be substituted when accessing CLDR data. This logical
      substitution should be done to both the locale ID, and to any lookup
      for display names of languages, territories, and so on.
The
replacement for the language and territory types is more complicated:
see <em>Part 1: <a href="tr35.html#Contents">Core</a>, Section
3.3.1 <a href="tr35.html#BCP_47_Language_Tag_Conversion">BCP 47
Language Tag Conversion</a></em> for details.
</p>
<pre>&lt;alias&gt;
    &lt;languageAlias type="in" replacement="id"/&gt;
    &lt;languageAlias type="sh" replacement="sr"/&gt;
    &lt;languageAlias type="sh_YU" replacement="sr_Latn_YU"/&gt;
...
    &lt;territoryAlias type="BU" replacement="MM"/&gt;
...
&lt;/alias&gt;</pre>
<p>Attribute values for the *Alias elements include the following:</p>
<table>
<caption>
<a name="Alias_Attribute_Values" href="#Alias_Attribute_Values">Alias
Attribute Values</a>
</caption>
<tr>
<th scope="col">Attribute</th>
<th scope="col">Value</th>
<th scope="col">Description</th>
</tr>
<tr>
<td>type</td>
<td>NMTOKEN</td>
<td>The code to be replaced</td>
</tr>
<tr>
<td>replacement</td>
<td>NMTOKEN</td>
<td>The code(s) to replace it, space-delimited.</td>
</tr>
<tr>
<td rowspan="5">reason</td>
<td>deprecated</td>
<td>The code in type is deprecated, such as 'iw' by 'he', or
'CS' by 'RS ME'.</td>
</tr>
<tr>
<td>overlong</td>
<td>The code in type is too long, such as 'eng' by 'en' or
'USA' or '840' by 'US'</td>
</tr>
<tr>
<td>macrolanguage</td>
<td>The code in type is an encompassed language that is replaced
by a macrolanguage, such as '<a
href="http://www-01.sil.org/iso639-3/documentation.asp?id=arb">arb</a>'
by 'ar'.
</td>
</tr>
<tr>
<td>legacy</td>
<td>The code in type is a legacy code that is replaced by
another code for compatibility with established legacy usage, such
as 'sh' by 'sr_Latn'</td>
</tr>
<tr>
<td>bibliographic</td>
<td>The code in type is a <a
href="http://www.loc.gov/standards/iso639-2/langhome.html">bibliographic
code</a>, which is replaced by a terminology code, such as 'alb' by
'sq'.
</td>
</tr>
</table>
<h3>
9.2 <a name="Supplemental_Deprecated_Information"
href="#Supplemental_Deprecated_Information">Supplemental
Deprecated Information (Deprecated)</a>
</h3>
<pre class="dtd">&lt;!ELEMENT deprecated ( deprecatedItems* ) &gt;
&lt;!ATTLIST deprecated draft ( approved | contributed | provisional | unconfirmed | true | false ) #IMPLIED &gt; &lt;!-- true and false are deprecated. --&gt;

&lt;!ELEMENT deprecatedItems EMPTY &gt;
&lt;!ATTLIST deprecatedItems type ( standard | supplemental | ldml | supplementalData | ldmlBCP47 ) #IMPLIED &gt; &lt;!-- standard | supplemental are deprecated --&gt;
&lt;!ATTLIST deprecatedItems elements NMTOKENS #IMPLIED &gt;
&lt;!ATTLIST deprecatedItems attributes NMTOKENS #IMPLIED &gt;
&lt;!ATTLIST deprecatedItems values CDATA #IMPLIED &gt;</pre>
<p>The deprecated items element was used to indicate elements,
attributes, and attribute values that are deprecated. This means that
the items are valid, but that their usage is strongly discouraged.
This element and its subelements have been deprecated
in favor of <a href="tr35.html#DTD_Annotations">DTD Annotations</a>.</p>

<p>Where particular values are deprecated (such as territory codes
like SU for Soviet Union), the names for such codes may be removed
from the common/main translated data after some period of time.
However, supplemental information for deprecated codes is typically
retained, such as containment, likely subtags, and usage of older
currency codes. The English name may also be retained for debugging
purposes.</p>
<h3>
9.3 <a name="Default_Content" href="#Default_Content">Default
Content</a>
</h3>
<pre class="dtd">&lt;!ELEMENT defaultContent EMPTY &gt;
&lt;!ATTLIST defaultContent locales NMTOKENS #IMPLIED &gt;</pre>
<p>
In CLDR, locales without territory information (or where needed,
script information) provide data appropriate for what is called the <i>default
content locale</i>. For example, the <i>en</i> locale contains data
appropriate for <i>en-US</i>, while the <i>zh</i> locale contains
content for <i>zh-Hans-CN</i>, and the <i>zh-Hant</i> locale contains
content for <i>zh-Hant-TW</i>. The default content locales themselves
thus inherit all of their contents, and are empty.
</p>
<p>
The choice of content is typically based on the largest literate
population of the possible choices. Thus if an implementation only
provides the base language (such as <i>en</i>), it will still get a
complete and consistent set of data appropriate for a locale which is
reasonably likely to be the one meant. Where other information is
available, such as independent country information, that information
can always be used to pick a different locale (such as <i>en-CA</i>
for a website targeted at Canadian users).
</p>
<p>
If an implementation is to use a different default locale, then the
data needs to be <i>pivoted</i>: all of the data from CLDR for
the current default locale is pushed out to the locales that inherit
from it, and then the new default content locale's data is moved into
the base. There are tools in CLDR to perform this operation.
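</p>
<p>
As a rough illustration of the default content relationship described
above, the following Python sketch treats a locale as an (empty) default
content locale when it appears in the <code>locales</code> attribute of
<code>&lt;defaultContent&gt;</code>, and maps it to the ancestor locale
that actually carries its data. The sample fragment, its locale list
(written with underscores), and the function names are assumptions made
for this example, not actual CLDR data or tooling, and parent lookup is
simplified to truncating the last subtag.
</p>

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration; the real CLDR list is much longer.
SAMPLE = '<defaultContent locales="en_US zh_Hans zh_Hans_CN zh_Hant_TW"/>'


def load_default_content(xml_text):
    """Return the set of locale IDs listed in the locales attribute."""
    element = ET.fromstring(xml_text)
    return set(element.get("locales", "").split())


def data_locale(locale, default_content):
    """Truncate subtags while the locale is listed as default content.

    Assumption: the parent of a locale is found by dropping its last
    underscore-separated subtag, which ignores explicit parent overrides.
    """
    while locale in default_content and "_" in locale:
        locale = locale.rsplit("_", 1)[0]
    return locale


defaults = load_default_content(SAMPLE)
for requested in ("en_US", "zh_Hans_CN", "zh_Hant_TW", "en_CA"):
    print(requested, "->", data_locale(requested, defaults))
```

<p>
Under these assumptions, <i>en_US</i> resolves to <i>en</i> and
<i>zh_Hans_CN</i> resolves to <i>zh</i>, while <i>en_CA</i> is left
unchanged because it is not listed as default content and carries its
own data.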
</p>
<p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see <strong><em>Section 4.2.6 <a
href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a></em></strong>.</p>
<!-- end section 9 supp metadata -->


<!-- begin section 10 the metadata element -->
<h2>
10 <a name="Metadata_Elements" href="#Metadata_Elements">Locale
Metadata Element</a>
</h2>

<p>
Note: This section refers to the per-locale
<code>&lt;metadata&gt;</code>
element, containing metadata about a particular locale. This is in
contrast to the <a href="#Appendix_Supplemental_Metadata"><em>Supplemental</em>
Metadata</a>, which is in the supplemental tree and is not specific to a
locale.
</p>


<p class="dtd">
&lt;!ELEMENT metadata ( alias | ( casingData?, special* ) ) &gt;<br>
&lt;!ELEMENT casingData ( alias | ( casingItem*, special* ) ) &gt;<br>
&lt;!ELEMENT casingItem ( #PCDATA ) &gt;<br> &lt;!ATTLIST
casingItem type CDATA #REQUIRED &gt;<br> &lt;!ATTLIST casingItem
override (true | false) #IMPLIED &gt;<br> &lt;!ATTLIST
casingItem forceError (true | false) #IMPLIED &gt;<br>
</p>
<p>The &lt;metadata&gt; element contains metadata about the locale
for use by the Survey Tool or other tools in checking locale data;
this data is not intended for export as part of the locale itself.</p>
<p>The &lt;casingItem&gt; element specifies the capitalization
intended for the majority of the data in a given category within the
locale. This allows warnings to be issued to translators so
that anything deviating from that capitalization is carefully
reviewed.
Its type attribute has one of the values used for the
&lt;contextTransformUsage&gt; element above, with the exception of
the special value "all"; the element's value is one of the following:</p>
<ul>
<li>lowercase</li>
<li>titlecase</li>
</ul>
<p>The &lt;casingItem&gt; data is generated by a tool based on the
data available in CLDR. In cases where the generated casing
information is incorrect and needs to be manually edited, the
override attribute is set to "true" so that the tool will not
override the manual edits. When the casing information is known to be
both correct and something that should apply to all elements of the
specified type in a given locale, the forceError attribute may be set
to "true" to force an error instead of a warning for items that do
not match the casing information.</p>
<!-- end section Info-A metadata element -->

<!-- begin section 11 Version Information -->
<h2>
11 <a name="Version_Information" href="#Version_Information">Version
Information</a>
</h2>


<p class="dtd">
&lt;!ELEMENT version EMPTY &gt;<br> &lt;!ATTLIST version
cldrVersion CDATA #FIXED "27" &gt;<br> &lt;!ATTLIST version
unicodeVersion CDATA #FIXED "7.0.0" &gt;<br>
</p>
<p>
The cldrVersion attribute defines the CLDR version for this
data, as published on <a
href="http://cldr.unicode.org/index/downloads">CLDR
Releases/Downloads</a>.
</p>
<p>The unicodeVersion attribute defines the version of the
Unicode standard that is used to interpret data. Specifically, some
data elements such as exemplar characters are expressed in terms of
UnicodeSets.
Since UnicodeSets can be expressed in terms of Unicode
properties, their meaning depends on the Unicode version from which
property values are derived.</p>
<!-- end section Version Information metadata element -->

<h2>
12 <a name="Parent_Locales" href="#Parent_Locales">Parent Locales</a>
</h2>
<p>
The parentLocales data is supplemental data, but is described in
detail in the <a href="tr35.html#Parent_Locales">core
specification section 4.1.3</a>.
</p>

<hr>
<p class="copyright">
Copyright © 2001–2018 Unicode, Inc. All
Rights Reserved. The Unicode Consortium makes no expressed or implied
warranty of any kind, and assumes no liability for errors or
omissions. No liability is assumed for incidental and consequential
damages in connection with or arising out of the use of the
information or programs contained or accompanying this technical
report. The Unicode <a href="http://unicode.org/copyright.html">Terms
of Use</a> apply.
</p>
<p class="copyright">Unicode and the Unicode logo are trademarks
of Unicode, Inc., and are registered in some jurisdictions.</p>
</div>

</body>

</html>