<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css"
 href="http://unicode.org/cldr/apps/surveytool.css">
<title>Help Text file for Supplemental Charts</title>
<style type="text/css">
<!--
DIV.chat {
 PADDING-RIGHT: 2px;
 PADDING-LEFT: 2px;
 PADDING-BOTTOM: 4px;
 PADDING-TOP: 4px
}

DIV.in {
 HEIGHT: 1px;
 TEXT-ALIGN: left
}

DIV.1st {
 PADDING-TOP: 4px
}
-->
</style>
</head>
<body>
<h1 align="center">Chart Messages</h1>
<p>
This is a help-text file for use with the survey tool. You can add a
new row, where the <i>key</i> is a key that the program knows about,
and the <i>Text to Insert</i> is what you want to show up as help
text, or modify existing text. <b>The software that interprets
this expects a particular format, so don't make arbitrary changes
(see the end). </b>
</p>
<table id="table1" style="border-collapse: collapse;" border="1"
 cellpadding="4" cellspacing="0" width="100%">
<tbody>
<tr>
<th>Key</th>
<th>Text to Insert</th>
</tr>
<tr>
<td>territory_language_information</td>
<td>The main goal for CLDR language data is to provide
approximate figures for the literate, functional population for
each language in each territory: that is, the population that is
able to read and write each language, and is comfortable enough to
use it with computers.
<p>The GDP and Literacy figures are taken from the World Bank
where available, otherwise supplemented by FactBook data and other
sources. The GDP figures are "PPP (constant 2000 international
$)". Much of the per-language data is taken from the Ethnologue,
but is supplemented and processed using many other sources,
including per-country census data.
(The focus of the Ethnologue is
native speakers, which includes people who are not literate, and
excludes people who are functional second-language users.)</p>
<p>
The literacy rate may be discounted to reflect the actual usage of
the written form in normal daily life. Thus languages that are
typically not written, such as Swiss German, will be given a low
literacy rate, even though the whole population <i>could</i> write
in Swiss German.
</p>
<p>The percentages may add up to more than 100% due to
multilingual populations, or may be less than 100% due to
illiteracy or because the data has not yet been gathered or
processed. Languages with a small population may be omitted.</p>
<p>Official status is supplied where available, formatted as
{O}. Hovering with the mouse shows a short description.</p>
<ul>
<li><b>Likely languages and scripts:</b> To see (and verify)
the likely languages and scripts for this subtag, click on the
country code.</li>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with the <i>bug</i>
or <i>add new</i> links, below.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
supplementalData.xml</a> (see the &lt;territoryInfo&gt;,
&lt;calendarData&gt;, &lt;weekData&gt;, and
&lt;measurementData&gt; elements)</li>
</ul>
</td>
</tr>
<tr>
<td>language_territory_information</td>
<td>
<p align="left">
The language data is provided for localization testing, and is
under development for CLDR 1.5. For information on the meaning of
the different values, see <a
href="territory_language_information.html">Territory-Language
Information</a>.
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, or want to add a new territory for a language, see the <i>add
new</i> links below.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
supplementalData.xml</a> (see the &lt;territoryInfo&gt; element)</li>
</ul>
</td>
</tr>
<tr>
<td>detailed_territory_currency_information</td>
<td>
<p align="left">
The following table shows when currencies were in use in different
countries. See also <a href="#format_info">Decimal Digits and
Rounding</a>. The digits column shows the number of digits to use; if
there is special rounding (such as for CH), it is shown in
parentheses. The Countries column shows which countries the
currency is <font face="Lucida Sans Unicode">—</font> <i>or
has been</i> <font face="Lucida Sans Unicode">—</font> used in,
officially.
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
supplementalData.xml</a> (see the &lt;currencyData&gt; element)</li>
</ul>
</td>
</tr>
<tr>
<td>languages_and_scripts</td>
<td>This table shows some information about the scripts
commonly used with different languages. This information is not
complete, and is being enhanced over time. The table is sorted by
language; for the same information sorted by script, see <a
name="scripts_and_languages" href="scripts_and_languages.html">Scripts
and Languages</a>.
The following conventions are used in the table:
<table id="table2" style="margin: 1em; border-collapse: collapse;"
 border="1">
<tbody>
<tr>
<th align="left">Column</th>
<th align="left">Comment</th>
</tr>
<tr>
<td>Language</td>
<td>Where there isn't any information in Unicode CLDR as to
which languages are written in a given script, the language
code is given as <i>Unknown or Invalid Language</i> ("und").
</td>
</tr>
<tr>
<td>ML</td>
<td>The modern language column shows "O" if the language is
not in customary modern use (currently following ISO 639-3
Types: Ancient, Extinct, Historical, or Constructed).</td>
</tr>
<tr>
<td>P</td>
<td>The Primary column shows "N" if the language is neither
an official nor a de facto official language of some country.
For more information, see <a
name="language_territory_information"
href="http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html">Language-Territory
Information</a>.
</td>
</tr>
<tr>
<td>Script</td>
<td>Where there isn't any information in Unicode CLDR as to
which script is used by a language, the script code is given as
<i>Unknown or Invalid Script</i> ("Zzzz").
</td>
</tr>
<tr>
<td>MS</td>
<td>The modern script column shows "N" if the script is not
in customary modern use.</td>
</tr>
</tbody>
</table>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
supplementalData.xml</a> (see the &lt;languageData&gt; element)</li>
</ul>
</td>
</tr>
<tr>
<td>scripts_and_languages</td>
<td>This table shows some information about the scripts
commonly used with different languages. This information is not
complete, and is being enhanced over time. The table is sorted by
script; for the same information sorted by language, see <a
name="languages_and_scripts"
href="http://www.unicode.org/cldr/data/charts/supplemental/languages_and_scripts.html">Languages
and Scripts</a>. The following conventions are used in the table:
<table id="table3" style="margin: 1em; border-collapse: collapse;"
 border="1">
<tbody>
<tr>
<th align="left">Column</th>
<th align="left">Comment</th>
</tr>
<tr>
<td>Language</td>
<td>Where there isn't any information in Unicode CLDR as to
which languages are written in a given script, the language
code is given as <i>Unknown or Invalid Language</i> ("und").
</td>
</tr>
<tr>
<td>ML</td>
<td>The modern language column shows "O" if the language is
not in customary modern use (currently following ISO 639-3
Types: Ancient, Extinct, Historical, or Constructed).</td>
</tr>
<tr>
<td>P</td>
<td>The Primary column shows "N" if the language
combination is neither an official nor a de facto official
language of some country.
For more information, see <a
name="language_territory_information0"
href="http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html">Language-Territory
Information</a>.
</td>
</tr>
<tr>
<td>Script</td>
<td>Where there isn't any information in Unicode CLDR as to
which script is used by a language, the script code is given as
<i>Unknown or Invalid Script</i> ("Zzzz").
</td>
</tr>
<tr>
<td>MS</td>
<td>The modern script column shows "N" if the script is not
in customary modern use.</td>
</tr>
</tbody>
</table>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
supplementalData.xml</a> (see the &lt;languageData&gt; element)</li>
</ul>
</td>
</tr>
<tr>
<td>territory_containment_un_m_49</td>
<td>
<p align="left">
The <b>Territory Containment</b> table shows the organization of
territories and regions according to <a
href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">UN
M.49</a>, starting with the World. (CLDR supplements this table with
the QO code for outlying areas that would not otherwise be
included.) The last column lists the timezone IDs for each country.
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.
However, such reports should be limited to cases
where the information here deviates from <a
href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">UN
M.49</a>.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
supplementalData.xml</a> (see the &lt;territoryContainment&gt; and
&lt;timezoneData&gt; elements)</li>
</ul>
</td>
</tr>
<tr>
<td>zone_tzid</td>
<td>
<p align="left">
The <b>Zone-Tzid</b> table shows the mapping from Windows timezone
IDs to the standard TZIDs.
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.</li>
<li><b>XML Source:</b> under &lt;mapTimezones&gt; in <a
href="http://unicode.org/cldr/data/common/supplemental/metaZones.xml">metaZones.xml</a>
and <a
href="http://unicode.org/cldr/data/common/supplemental/windowsZones.xml">windowsZones.xml</a></li>
</ul>
</td>
</tr>
<tr>
<td>character_fallback_substitutions</td>
<td>The <b>Character Fallback Substitutions</b> table shows
recommended fallbacks for use when a charset or supported
repertoire does not contain a desired character, using the data
from <a
href="http://unicode.org/cldr/data/common/supplemental/characters.xml">characters.xml</a>.
There is more than one possible fallback. The recommended usage is
that, when a character <i>value</i> is not in the desired repertoire,
the following candidates are tried in order, and the first one that is
wholly in the desired repertoire is used.
<ul>
<li><code>toNFC</code>(<i>value</i>)</li>
<li>other canonically equivalent sequences, if there are any</li>
<li>the explicit <i>substitutes</i> value from <a
href="http://unicode.org/cldr/data/common/supplemental/characters.xml">characters.xml</a>
(in order)
</li>
<li><code>toNFKC</code>(<i>value</i>)</li>
</ul>
<p>
The <b>Explicit</b>, <b>NFC</b>, and <b>NFKC</b> <i>substitutes</i>
are shown in the chart by different colors. Note that the
character fallbacks do lose information, and should not be used
where there is a viable alternative, such as HTML escapes.
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/characters.xml">characters.xml</a>
</li>
</ul>
</td>
</tr>
<tr>
<td>aliases</td>
<td>
<p align="left">
<b>Aliases</b> show how to map deprecated codes or aliases onto
the ones that should be used to access CLDR data. Most other
metadata is not shown in tables; the source data should be
consulted.
Codes are shown in brackets before or after the English
name, e.g. "Vanuatu [VU]".
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">
bug report</a>.</li>
<li><b>XML Source:</b> <a
href="http://unicode.org/cldr/data/common/supplemental/supplementalMetadata.xml">
supplementalMetadata.xml</a> (see the &lt;alias&gt; element)</li>
</ul>
</td>
</tr>
<tr>
<td>likely_subtags</td>
<td>There are a number of situations where it is useful to be
able to find the most likely language, script, or region, if that
information is otherwise missing. For example:
<ul>
<li><span>Given the language "zh" and the region "TW",
what is the most likely script?</span></li>
<li><span>Given the script "Thai", what is the most
likely language or region?</span></li>
<li><span>Given the region "TW", what is the most likely
language and script?</span></li>
</ul>
<p>
<span>Conversely, given a locale, it is useful to find out
which fields (language, script, or region) may be superfluous, in
the sense that they contain the likely tags. For example,
"en_Latn" can be simplified down to "en" since "Latn" is the
likely script for "en"; "ja_Japn_JP" can be simplified down to
"ja".</span>
</p>
<p>
<span>The <i>likelySubtag</i> supplemental data provides
default information for computing these values. This data is
based on the default content data, the population data, and the
suppress-script data in [<a
href="http://unicode.org/draft/reports/tr35/tr35.html#BCP47">BCP47</a>].
It is heuristically derived, and may change over time. The chart
shows how the data "fills in" the missing fields in the <span
class="source">source values</span> to get the <span
class="target">target values</span>.
</span>
</p>
<ul>
<li><b>Reporting Defects:</b> If you find errors or omissions
in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">bug
report</a>.</li>
</ul>
</td>
</tr>
<tr>
<td>language_plural_rules</td>
<td>
<p>
Languages vary in how they handle plurals of nouns or unit
expressions ("hours", "meters", and so on). Some languages have
two forms, like English; some languages have only a single form;
and some languages have multiple forms (see <a href="#sl">Slovenian</a>
below). They also vary between cardinals (such as 1, 2, or 3) and
ordinals (such as 1st, 2nd, or 3rd), and in ranges of cardinals
(such as "1-2", used in expressions like "1-2 meters long"). CLDR
uses short, mnemonic tags for these plural categories. For more
information on these categories, see <a
href="http://cldr.unicode.org/index/cldr-spec/plural-rules">Plural
Rules</a>.
</p>
<p></p>
<ul>
<li><b>Examples:</b> The symbol ~ (as in "1.7~2.1") has a
special meaning: it is a range of numbers that includes the end
points (1.7 and 2.1), and everything between that has exactly the
same number of decimals as the end points (thus also 1.8, 1.9,
and 2.0, but not 2 or 1.91 or 1.90). The samples are generated mechanically, and
are not comprehensive: “0, 2~19, 101~119, …” could show up as the less-complete
“0, 2~16, 101 …”.</li>
<li><b>Reporting Defects:</b> If you find errors or
omissions in this data, please report the information with a <a
target="_blank" href="http://unicode.org/cldr/trac/newticket">bug
report</a>. Before doing so, read "Reporting Defects" on <a
href="http://cldr.unicode.org/index/cldr-spec/plural-rules">Plural
Rules</a>.</li>
</ul>
</td>
</tr>
<tr>
<td>error_locale_header|error_index_header</td>
<td>
<p>
Please review and correct them.
Note that errors in <span
style="font-style: italic;">sublocales</span> are often fixed by
fixing the main locale.<br> <br>
</p>
<div style="margin-left: 40px;">
<span style="font-style: italic;">This list is only
generated daily, and so may not reflect fixes you have made until
tomorrow. (There were production problems in integrating it fully
into the Survey tool. However, it should let you see the problems
and make sure that they get taken care of.)</span>
</div>
<p>
The table below gives a count for each of the following kinds of
items. The focus is on correcting the problems, and getting enough
votes for "minimal approval" (status=<span
style="font-style: italic; font-weight: bold;">contributed</span>
-- high enough to get incorporated into most implementations).
</p>
<ul>
<li><span style="font-weight: bold;">Disputed:</span> Of
those voting on an item, if enough switched their vote the item
could have minimal approval.</li>
<li><span style="font-weight: bold;">Conflicted:</span> For
this many items, the organization is losing a vote because of
conflicts within the organization.</li>
<li><span style="font-weight: bold;">Error:</span> The item
has a serious error and must be corrected.</li>
<li><span style="font-weight: bold;">Warning:</span> The item
has a significant problem that should be corrected.</li>
<li><span style="font-weight: bold;">Missing Coverage:</span>
These items should be translated but are missing.</li>
<li><span style="font-weight: bold;">Missing Votes:</span>
These items have translations, but not enough votes for "minimal
approval".</li>
</ul>
</td>
</tr>
</tbody>
</table>
<p>The text to insert can be fairly arbitrary HTML.
The software
that reads this table will search the first column (e.g. between
&lt;td&gt; and &lt;/td&gt;) and return the contents of the second
column.</p>
<p>
<b>WARNING</b>
</p>
<ul>
<li><b><i>It uses a very dumb parser, so make sure that
table elements are matched, e.g. &lt;td&gt; with &lt;/td&gt;, and
also that &lt;tr&gt;, &lt;/tr&gt;, &lt;table&gt;, and
&lt;/table&gt; are on separate lines.</i></b></li>
<li><b><i>The regular expression for the key must match
the whole path, so if it is an interior substring, remember to add
.* on both ends.</i></b></li>
</ul>
</body>
</html>