1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
2<html>
3<head>
4<meta http-equiv="Content-Language" content="en-us">
5<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
6<link rel="stylesheet" type="text/css"
7	href="https://unicode.org/cldr/apps/surveytool.css">
8<title>Help Text file for Supplemental Charts</title>
9<style type="text/css">
10<!--
11DIV.chat {
12	PADDING-RIGHT: 2px;
13	PADDING-LEFT: 2px;
14	PADDING-BOTTOM: 4px;
15	PADDING-TOP: 4px
16}
17
18DIV.in {
19	HEIGHT: 1px;
20	TEXT-ALIGN: left
21}
22
23DIV.1st {
24	PADDING-TOP: 4px
25}
26-->
27</style>
28</head>
29<body>
30	<h1 align="center">Chart Messages</h1>
31	<p>
32		This is a help-text file for use with the survey tool and charts. You can add a
33		new row, where the <i>key</i> is a key that the program knows about,
34		and the <i>Text to Insert</i> is what you want to show up as help
35		text, or modify existing text. <b>The software that interprets
36			this expects a particular format, so don't make arbitrary changes
37			(see the end). </b>
38	</p>
39	<table id="table1" style="border-collapse: collapse;" border="1"
40		cellpadding="4" cellspacing="0" width="100%">
41		<tbody>
42			<tr>
43				<th>Key</th>
44				<th>Text to Insert</th>
45			</tr>
46			<tr>
47				<td>territory_language_information</td>
48				<td>The main goal for CLDR language data is to provide
49					approximate figures for the literate, functional population for
50					each language in each territory: that is, the population that is
51					able to read and write each language, and is comfortable enough to
52					use it with computers.
53					<p>The GDP and Literacy figures are taken from the World Bank
54						where available, otherwise supplemented by FactBook data and other
55						sources. The GDP figures are "PPP (constant 2000 international
56						$)". Much of the per-language data is taken from the Ethnologue,
57						but is supplemented and processed using many other sources,
58						including per-country census data. (The focus of the Ethnologue is
59						native speakers, which includes people who are not literate, and
						excludes people who are functional second-language users.)</p>
61					<p>
62						The literacy rate may be discounted to reflect the actual usage of
63						the written form in normal daily life. Thus languages that are
64						typically not written, such as Swiss German, will be given a low
65						literacy rate, even though the whole population <i>could</i> write
66						in Swiss German.
67					</p>
68					<p>The percentages may add up to more than 100% due to
69						multilingual populations, or may be less than 100% due to
70						illiteracy or because the data has not yet been gathered or
71						processed. Languages with a small population may be omitted.</p>
72					<p>Official status is supplied where available, formatted as
73						{O}. Hovering with the mouse shows a short description.</p>
74					<ul>
						<li><b>Likely languages and scripts:</b> To see (and verify)
76							the likely languages and scripts for this subtag, click on the
77							country code.</li>
78						<li><b>Reporting Defects:</b> If you find errors or omissions
79							in this data, please report the information with the <i>bug</i>
80							or <i>add new</i> links, below.</li>
81						<li><b>XML Source:</b> <a
82							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalData.xml">
83								supplementalData.xml</a> (see the &lt;territoryInfo&gt;,
84							&lt;calendarData&gt;, &lt;weekData&gt;, and
85							&lt;measurementData&gt; elements)</li>
86					</ul>
87				</td>
88			</tr>
89			<tr>
90				<td>language_territory_information</td>
91				<td>
92					<p align="left">
93						For information on the meaning of
94						the different values, see <a
95							href="territory_language_information.html">Territory-Language
96							Information</a>.
97					</p>
98					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, or want to add a new territory for a language, use the <i>add
								new</i> links below.</li>
102						<li><b>XML Source:</b> <a
103							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalData.xml">
104								supplementalData.xml</a> (see the &lt;territoryInfo&gt; element)</li>
105					</ul>
106				</td>
107			</tr>
108			<tr>
109				<td>detailed_territory_currency_information</td>
110				<td>
111					<p align="left">
112						The following table shows when currencies were in use in different
113						countries. See also <a href="#format_info">Decimal Digits and
114							Rounding</a>. The digits column shows the number of digits to use; if
115						there is special rounding (such as for CH), that is in
116						parentheses. The Countries column shows which countries the
117						currency is <font face="Lucida Sans Unicode">—</font> <i>or
118							has been</i> <font face="Lucida Sans Unicode">—</font> used in,
119						officially.
120					</p>
121					<ul>
122						<li><b>Reporting Defects:</b> If you find errors or omissions
123							in this data, please report the information with a <a
124							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
125								bug report</a>.</li>
126						<li><b>XML Source:</b> <a
127							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalData.xml">
128								supplementalData.xml</a> (see the &lt;currencyData&gt; element)</li>
129					</ul>
130				</td>
131			</tr>
132			<tr>
133				<td>languages_and_scripts</td>
134				<td>This table shows some information about the scripts
135					commonly used with different languages. This information is not
136					complete, and is being enhanced over time. The table is sorted by
137					language; for the same information sorted by script, see <a
138					name="scripts_and_languages" href="scripts_and_languages.html">Scripts
139						and Languages</a>. The following conventions are used in the table:
140					<table id="table2" style="margin: 1em; border-collapse: collapse;"
141						border="1">
142						<tbody>
143							<tr>
144								<th align="left">Column</th>
145								<th align="left">Comment</th>
146							</tr>
147							<tr>
148								<td>Language</td>
149								<td>Where there isn't any information in Unicode CLDR as to
150									which languages are written in a given script, the language
151									code is given as <i>Unknown or Invalid Language</i> ("und").
152								</td>
153							</tr>
154							<tr>
155								<td>ML</td>
156								<td>The modern language column shows "O" if the language is
157									not in customary modern use (currently following ISO 639-3
158									Types: Ancient, Extinct, Historical, or Constructed).</td>
159							</tr>
160							<tr>
161								<td>P</td>
162								<td>The Primary column shows "N" if the language is neither
									an official nor a de facto official language of some country.
164									For more information, see <a
165									name="language_territory_information"
166									href="http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html">Language-Territory
167										Information</a>.
168								</td>
169							</tr>
170							<tr>
171								<td>Script</td>
172								<td>Where there isn't any information in Unicode CLDR as to
173									which script is used by a language, the script code is given as
174									<i>Unknown or Invalid Script</i> ("Zzzz").
175								</td>
176							</tr>
177							<tr>
178								<td>MS</td>
179								<td>The modern script column shows "N" if the script is not
180									in customary modern use.</td>
181							</tr>
182						</tbody>
183					</table>
184					<ul>
185						<li><b>Reporting Defects:</b> If you find errors or omissions
186							in this data, please report the information with a <a
187							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
188								bug report</a>.</li>
189						<li><b>XML Source:</b> <a
190							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalData.xml">
191								supplementalData.xml</a> (see the &lt;languageData&gt; element)</li>
192					</ul>
193				</td>
194			</tr>
195			<tr>
196				<td>scripts_and_languages</td>
197				<td>This table shows some information about the scripts
198					commonly used with different languages. This information is not
199					complete, and is being enhanced over time. The table is sorted by
200					script; for the same information sorted by language, see <a
201					name="languages_and_scripts"
202					href="http://www.unicode.org/cldr/data/charts/supplemental/languages_and_scripts.html">Languages
203						and Scripts</a>. The following conventions are used in the table:
204					<table id="table3" style="margin: 1em; border-collapse: collapse;"
205						border="1">
206						<tbody>
207							<tr>
208								<th align="left">Column</th>
209								<th align="left">Comment</th>
210							</tr>
211							<tr>
212								<td>Language</td>
213								<td>Where there isn't any information in Unicode CLDR as to
214									which languages are written in a given script, the language
215									code is given as <i>Unknown or Invalid Language</i> ("und").
216								</td>
217							</tr>
218							<tr>
219								<td>ML</td>
220								<td>The modern language column shows "O" if the language is
221									not in customary modern use (currently following ISO 639-3
222									Types: Ancient, Extinct, Historical, or Constructed).</td>
223							</tr>
224							<tr>
225								<td>P</td>
226								<td>The Primary column shows "N" if the language
									combination is neither an official nor a de facto official
228									language of some country. For more information, see <a
229									name="language_territory_information0"
230									href="http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html">Language-Territory
231										Information</a>.
232								</td>
233							</tr>
234							<tr>
235								<td>Script</td>
236								<td>Where there isn't any information in Unicode CLDR as to
237									which script is used by a language, the script code is given as
238									<i>Unknown or Invalid Script</i> ("Zzzz").
239								</td>
240							</tr>
241							<tr>
242								<td>MS</td>
243								<td>The modern script column shows "N" if the script is not
244									in customary modern use.</td>
245							</tr>
246						</tbody>
247					</table>
248					<ul>
249						<li><b>Reporting Defects:</b> If you find errors or omissions
250							in this data, please report the information with a <a
251							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
252								bug report</a>.</li>
253						<li><b>XML Source:</b> <a
254							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalData.xml">
255								supplementalData.xml</a> (see the &lt;languageData&gt; element)</li>
256					</ul>
257				</td>
258			</tr>
259			<tr>
260				<td>territory_containment_un_m_49</td>
261				<td>
262					<p align="left">
263						The <b>Territory Containment</b> table shows the organization of
264						territories and regions according to <a
265							href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">UN
266							M.49</a>, starting with the World. (CLDR supplements this table with
267						the QO code for outlying areas that would not otherwise be
						included.) The last column lists the timezone IDs for each
						country.
270					</p>
271					<ul>
272						<li><b>Reporting Defects:</b> If you find errors or omissions
273							in this data, please report the information with a <a
274							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
275								bug report</a>. However, such reports should be limited to cases
276							where the information here deviates from <a
277							href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">UN
278								M.49</a>.</li>
279						<li><b>XML Source:</b> <a
280							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalData.xml">
281								supplementalData.xml</a> (see the &lt;territoryContainment&gt; and
282							&lt;timezoneData&gt; elements)</li>
283					</ul>
284				</td>
285			</tr>
286			<tr>
287				<td>zone_tzid</td>
288				<td>
289					<p align="left">
290						The <b>Zone-Tzid</b> table shows the mapping from Windows timezone
291						IDs to the standard TZIDs.
292					</p>
293					<ul>
294						<li><b>Reporting Defects:</b> If you find errors or omissions
295							in this data, please report the information with a <a
296							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
297								bug report</a>.</li>
						<li><b>XML Source:</b> under &lt;mapTimezones&gt; in <a
299							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/metaZones.xml">metaZones.xml</a>
300							and <a
301							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/windowsZones.xml">windowsZones.xml</a></li>
302					</ul>
303				</td>
304			</tr>
305			<tr>
306				<td>character_fallback_substitutions</td>
307				<td>The <b>Character Fallback Substitutions</b> table shows
308					recommended fallbacks for use when a charset or supported
309					repertoire does not contain a desired character, using the data
310					from <a
311					href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/characters.xml">characters.xml</a>.
312					There is more than one possible fallback: the recommended usage is
313					that when a character <i>value</i> is not in the desired repertoire
314					the following process is used, whereby the first value that is
315					wholly in the desired repertoire is used.
316					<ul>
317						<li><code>toNFC</code>(<i>value</i>)</li>
318						<li>other canonically equivalent sequences, if there are any</li>
319						<li>the explicit <i>substitutes</i> value from <a
320							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/characters.xml">characters.xml</a>
321							(in order)
322						</li>
323						<li><code>toNFKC</code>(<i>value</i>)</li>
324					</ul>
325					<p>
326						The <b>Explicit</b>, <b>NFC</b>, and <b>NFKC</b> <i>substitutes</i>
327						are shown in the chart by different colors. Note that the
328						character fallbacks do lose information, and should not be used
329						where there is a viable alternative, such as HTML escapes.
330					</p>
331					<ul>
332						<li><b>Reporting Defects:</b> If you find errors or omissions
333							in this data, please report the information with a <a
334							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
335								bug report</a>.</li>
336						<li><b>XML Source:</b> <a
337							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/characters.xml">characters.xml</a>&nbsp;
338						</li>
339					</ul>
340				</td>
341			</tr>
342			<tr>
343				<td>aliases</td>
344				<td>
345					<p align="left">
346						<b>Aliases</b> show how to map deprecated codes or aliases onto
347						the ones that should be used to access CLDR data. Most other
348						metadata is not shown in tables; the source data should be
349						consulted. Codes are shown in brackets before or after the English
						name, e.g. "Vanuatu [VU]".
351					</p>
352					<ul>
353						<li><b>Reporting Defects:</b> If you find errors or omissions
354							in this data, please report the information with a <a
355							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">
356								bug report</a>.</li>
357						<li><b>XML Source:</b> <a
358							href="https://github.com/unicode-org/cldr/tree/main/common/supplemental/supplementalMetadata.xml">
359								supplementalMetadata.xml</a> (see the &lt;alias&gt; element)</li>
360					</ul>
361				</td>
362			</tr>
363			<tr>
364				<td>likely_subtags</td>
365				<td>There are a number of situations where it is useful to be
366					able to find the most likely language, script, or region, if that
367					information is otherwise missing. For example:
368					<ul>
369						<li><span>Given the language "zh" and the region "TW",
370								what is the most likely script?</span></li>
371						<li><span>Given the script "Thai" what is the most
372								likely language or region?</span></li>
373						<li><span>Given the region TW, what is the most likely
374								language and script?</span></li>
375					</ul>
376					<p>
377						<span>Conversely, given a locale, it is useful to find out
378							which fields (language, script, or region) may be superfluous, in
379							the sense that they contain the likely tags. For example,
380							"en_Latn" can be simplified down to "en" since "Latn" is the
381							likely script for "en"; "ja_Japn_JP" can be simplified down to
382							"ja".</span>
383					</p>
384					<p>
385						<span>The <i>likelySubtag</i> supplemental data provides
386							default information for computing these values. This data is
							based on the default content data, the population data, and the
							suppress-script data in [<a
389							href="http://unicode.org/draft/reports/tr35/tr35.html#BCP47">BCP47</a>].
390							It is heuristically derived, and may change over time. The chart
391							shows how the data "fills in" the missing fields in the <span
392							class="source">source values</span> to get the <span
393							class="target">target values</span>.
394						</span>
395					</p>
396					<ul>
397						<li><b>Reporting Defects:</b> If you find errors or omissions
398							in this data, please report the information with a <a
399							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">bug
400								report</a>.</li>
401					</ul>
402				</td>
403			</tr>
404			<tr>
405				<td>language_plural_rules</td>
406				<td>
407					<p>
408						Languages vary in how they handle plurals of nouns or unit
409						expressions ("hours", "meters", and so on). Some languages have
410						two forms, like English; some languages have only a single form;
411						and some languages have multiple forms (see <a href="#sl">Slovenian</a>
412						below). They also vary between cardinals (such as 1, 2, or 3) and
413						ordinals (such as 1st, 2nd, or 3rd), and in ranges of cardinals
414						(such as "1-2", used in expressions like "1-2 meters long"). CLDR
415						uses short, mnemonic tags for these plural categories. For more
416						information on these categories, see <a
417							href="http://cldr.unicode.org/index/cldr-spec/plural-rules" target='spec'>Plural
418							Rules</a>.
419					</p>
420					<ul>
421						<li><b>Examples:</b> The symbol ~ (as in "1.7~2.1") has a
422							special meaning: it is a range of numbers that includes the end
423							points (1.7 and 2.1), and everything between that has exactly the
424							same number of decimals as the end points (thus also 1.8, 1.9,
425							and 2.0, but not 2 or 1.91 or 1.90). The samples are generated mechanically, and
426							are not comprehensive: “0, 2~19, 101~119, …” could show up as the less-complete
427							“0, 2~16, 101 …”.</li>
428						<li><strong>Rules:</strong> The plural categories are computed based on machine-readable rules,
429						using the syntax described in <a href="http://unicode.org/reports/tr35/tr35-numbers.html#Language_Plural_Rules" target='spec'>Language Plural Rules</a>.
						In particular, they use special variables and relations defined in <a href="http://unicode.org/reports/tr35/tr35-numbers.html#Operands" target='spec'>Plural Rule Operands</a>
431						and following.</li>
432						<li><b>Reporting Defects:</b> When you find errors or
433							omissions in this data, please report the information with a <a
434							target="_blank" href="http://cldr.unicode.org/index/bug-reports#TOC-Filing-a-Ticket">bug
435								report</a>. But first read &quot;Reporting Defects&quot; on <a
436							href="http://cldr.unicode.org/index/cldr-spec/plural-rules" target='spec'>Plural
437								Rules</a>.</li>
438					</ul>
439				</td>
440			</tr>
441			<tr>
442				<td>error_locale_header|error_index_header</td>
443				<td>
444					<p>
445						Please review and correct them. Note that errors in <span
446							style="font-style: italic;">sublocales</span> are often fixed by
447						fixing the main locale.<br> <br>
448					</p>
449					<div style="margin-left: 40px;">
						<span style="font-style: italic;">This list is generated
							only once a day, so it may not reflect fixes you have made until
452							tomorrow. (There were production problems in integrating it fully
453							into the Survey tool. However, it should let you see the problems
454							and make sure that they get taken care of.)</span>
455					</div>
456					<p>
457						The table below gives a count for each of the following kinds of
458						items. The focus is on correcting the problems, and getting enough
459						votes for "minimal approval" (status=<span
460							style="font-style: italic; font-weight: bold;">contributed</span>
461						-- high enough to get incorporated into most implementations).
462					</p>
463					<ul>
464						<li><span style="font-weight: bold;">Disputed:</span> Of
465							those voting on an item, if enough switched their vote the item
466							could have minimal approval.</li>
467						<li><span style="font-weight: bold;">Conflicted:</span> For
468							this many items, the organization is losing a vote because of
469							conflicts within the organization.</li>
470						<li><span style="font-weight: bold;">Error:</span> The item
471							has a serious error and must be corrected.</li>
472						<li><span style="font-weight: bold;">Warning:</span> The item
473							has a significant problem that should be corrected.</li>
474						<li><span style="font-weight: bold;">Missing Coverage:</span>
475							These items should be translated but are missing.</li>
476						<li><span style="font-weight: bold;">Missing Votes:</span>
477							These items have translations, but not enough votes for "minimal
478							approval".</li>
479					</ul>
480				</td>
481			</tr>
482		</tbody>
483	</table>
484	<p>The text to insert can be fairly arbitrary HTML. The software
		that reads this table will search the first column (e.g. between
486		&lt;td&gt; and &lt;/td&gt;) and return the contents of the second
487		column.</p>
488	<p>
489		<b>WARNING</b>
490	</p>
491	<ul>
492		<li><b><i>It uses a very dumb parser, so make sure that
					table elements are matched, e.g. &lt;td&gt; with &lt;/td&gt;, and
494					also that &lt;tr&gt;, &lt;/tr&gt;, &lt;table&gt;, and
495					&lt;/table&gt; are on separate lines.</i></b></li>
496		<li><b><i>The regular expression for the key must match
497					the whole path, so if it is an interior substring, remember to add
498					.* on both ends.</i></b></li>
499	</ul>
500</body>
501</html>
502