<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<link rel="stylesheet" type="text/css"
	href="http://unicode.org/cldr/apps/surveytool.css">
<title>Help Text file for Supplemental Charts</title>
<style type="text/css">
<!--
DIV.chat {
	PADDING-RIGHT: 2px;
	PADDING-LEFT: 2px;
	PADDING-BOTTOM: 4px;
	PADDING-TOP: 4px
}

DIV.in {
	HEIGHT: 1px;
	TEXT-ALIGN: left
}

DIV.1st {
	PADDING-TOP: 4px
}
-->
</style>
</head>
<body>
	<h1 align="center">Chart Messages</h1>
	<p>
		This is a help-text file for use with the survey tool. You can add a
		new row, where the <i>key</i> is a key that the program knows about,
		and the <i>Text to Insert</i> is what you want to show up as help
		text, or modify existing text. <b>The software that interprets
			this expects a particular format, so don't make arbitrary changes
			(see the end). </b>
	</p>
	<table id="table1" style="border-collapse: collapse;" border="1"
		cellpadding="4" cellspacing="0" width="100%">
		<tbody>
			<tr>
				<th>Key</th>
				<th>Text to Insert</th>
			</tr>
			<tr>
				<td>territory_language_information</td>
				<td>The main goal for CLDR language data is to provide
					approximate figures for the literate, functional population for
					each language in each territory: that is, the population that is
					able to read and write each language, and is comfortable enough to
					use it with computers.
					<p>The GDP and Literacy figures are taken from the World Bank
						where available, otherwise supplemented by FactBook data and other
						sources. The GDP figures are "PPP (constant 2000 international
						$)". Much of the per-language data is taken from the Ethnologue,
						but is supplemented and processed using many other sources,
						including per-country census data. (The focus of the Ethnologue is
						native speakers, which includes people who are not literate, and
						excludes people who are functional second-language users.)</p>
					<p>
						The literacy rate may be discounted to reflect the actual usage of
						the written form in normal daily life. Thus languages that are
						typically not written, such as Swiss German, will be given a low
						literacy rate, even though the whole population <i>could</i> write
						in Swiss German.
					</p>
					<p>The percentages may add up to more than 100% due to
						multilingual populations, or may be less than 100% due to
						illiteracy or because the data has not yet been gathered or
						processed. Languages with a small population may be omitted.</p>
					<p>Official status is supplied where available, formatted as
						{O}. Hovering with the mouse shows a short description.</p>
					<ul>
						<li><b>Likely languages and scripts:</b> To see (and verify)
							the likely languages and scripts for this subtag, click on the
							country code.</li>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with the <i>bug</i>
							or <i>add new</i> links, below.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
								supplementalData.xml</a> (see the &lt;territoryInfo&gt;,
							&lt;calendarData&gt;, &lt;weekData&gt;, and
							&lt;measurementData&gt; elements)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>language_territory_information</td>
				<td>
					<p align="left">
						The language data is provided for localization testing, and is
						under development for CLDR 1.5. For information on the meaning of
						the different values, see <a
							href="territory_language_information.html">Territory-Language
							Information</a>.
					</p>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, or want to add a new territory for a language, use
							the <i>add new</i> links below.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
								supplementalData.xml</a> (see the &lt;territoryInfo&gt; element)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>detailed_territory_currency_information</td>
				<td>
					<p align="left">
						The following table shows when currencies were in use in different
						countries. See also <a href="#format_info">Decimal Digits and
							Rounding</a>. The digits column shows the number of digits to use; if
						there is special rounding (such as for CH), that is in
						parentheses. The Countries column shows which countries the
						currency is <font face="Lucida Sans Unicode">—</font> <i>or
							has been</i> <font face="Lucida Sans Unicode">—</font> used in,
						officially.
					</p>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
								supplementalData.xml</a> (see the &lt;currencyData&gt; element)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>languages_and_scripts</td>
				<td>This table shows some information about the scripts
					commonly used with different languages. This information is not
					complete, and is being enhanced over time. The table is sorted by
					language; for the same information sorted by script, see <a
					name="scripts_and_languages" href="scripts_and_languages.html">Scripts
						and Languages</a>. The following conventions are used in the table:
					<table id="table2" style="margin: 1em; border-collapse: collapse;"
						border="1">
						<tbody>
							<tr>
								<th align="left">Column</th>
								<th align="left">Comment</th>
							</tr>
							<tr>
								<td>Language</td>
								<td>Where there isn't any information in Unicode CLDR as to
									which languages are written in a given script, the language
									code is given as <i>Unknown or Invalid Language</i> ("und").
								</td>
							</tr>
							<tr>
								<td>ML</td>
								<td>The modern language column shows "O" if the language is
									not in customary modern use (currently following ISO 639-3
									Types: Ancient, Extinct, Historical, or Constructed).</td>
							</tr>
							<tr>
								<td>P</td>
								<td>The Primary column shows "N" if the language is neither
									an official nor a de facto official language of some country.
									For more information, see <a
									name="language_territory_information"
									href="http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html">Language-Territory
										Information</a>.
								</td>
							</tr>
							<tr>
								<td>Script</td>
								<td>Where there isn't any information in Unicode CLDR as to
									which script is used by a language, the script code is given as
									<i>Unknown or Invalid Script</i> ("Zzzz").
								</td>
							</tr>
							<tr>
								<td>MS</td>
								<td>The modern script column shows "N" if the script is not
									in customary modern use.</td>
							</tr>
						</tbody>
					</table>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
								supplementalData.xml</a> (see the &lt;languageData&gt; element)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>scripts_and_languages</td>
				<td>This table shows some information about the scripts
					commonly used with different languages. This information is not
					complete, and is being enhanced over time. The table is sorted by
					script; for the same information sorted by language, see <a
					name="languages_and_scripts"
					href="http://www.unicode.org/cldr/data/charts/supplemental/languages_and_scripts.html">Languages
						and Scripts</a>. The following conventions are used in the table:
					<table id="table3" style="margin: 1em; border-collapse: collapse;"
						border="1">
						<tbody>
							<tr>
								<th align="left">Column</th>
								<th align="left">Comment</th>
							</tr>
							<tr>
								<td>Language</td>
								<td>Where there isn't any information in Unicode CLDR as to
									which languages are written in a given script, the language
									code is given as <i>Unknown or Invalid Language</i> ("und").
								</td>
							</tr>
							<tr>
								<td>ML</td>
								<td>The modern language column shows "O" if the language is
									not in customary modern use (currently following ISO 639-3
									Types: Ancient, Extinct, Historical, or Constructed).</td>
							</tr>
							<tr>
								<td>P</td>
								<td>The Primary column shows "N" if the language
									combination is neither an official nor a de facto official
									language of some country. For more information, see <a
									name="language_territory_information0"
									href="http://www.unicode.org/cldr/data/charts/supplemental/language_territory_information.html">Language-Territory
										Information</a>.
								</td>
							</tr>
							<tr>
								<td>Script</td>
								<td>Where there isn't any information in Unicode CLDR as to
									which script is used by a language, the script code is given as
									<i>Unknown or Invalid Script</i> ("Zzzz").
								</td>
							</tr>
							<tr>
								<td>MS</td>
								<td>The modern script column shows "N" if the script is not
									in customary modern use.</td>
							</tr>
						</tbody>
					</table>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
								supplementalData.xml</a> (see the &lt;languageData&gt; element)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>territory_containment_un_m_49</td>
				<td>
					<p align="left">
						The <b>Territory Containment</b> table shows the organization of
						territories and regions according to <a
							href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">UN
							M.49</a>, starting with the World. (CLDR supplements this table with
						the QO code for outlying areas that would not otherwise be
						included.) The last column lists the timezone IDs for that
						country.
					</p>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>. However, such reports should be limited to cases
							where the information here deviates from <a
							href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">UN
								M.49</a>.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalData.xml">
								supplementalData.xml</a> (see the &lt;territoryContainment&gt; and
							&lt;timezoneData&gt; elements)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>zone_tzid</td>
				<td>
					<p align="left">
						The <b>Zone-Tzid</b> table shows the mapping from Windows timezone
						IDs to the standard TZIDs.
					</p>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>.</li>
						<li><b>XML Source:</b> under &lt;mapTimezones&gt; in <a
							href="http://unicode.org/cldr/data/common/supplemental/metaZones.xml">metaZones.xml</a>
							and <a
							href="http://unicode.org/cldr/data/common/supplemental/windowsZones.xml">windowsZones.xml</a></li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>character_fallback_substitutions</td>
				<td>The <b>Character Fallback Substitutions</b> table shows
					recommended fallbacks for use when a charset or supported
					repertoire does not contain a desired character, using the data
					from <a
					href="http://unicode.org/cldr/data/common/supplemental/characters.xml">characters.xml</a>.
					There may be more than one possible fallback. The recommended
					usage is that, when a character <i>value</i> is not in the desired
					repertoire, the following candidates are tried in order, and the
					first one that is wholly in the desired repertoire is used.
					<ul>
						<li><code>toNFC</code>(<i>value</i>)</li>
						<li>other canonically equivalent sequences, if there are any</li>
						<li>the explicit <i>substitutes</i> value from <a
							href="http://unicode.org/cldr/data/common/supplemental/characters.xml">characters.xml</a>
							(in order)
						</li>
						<li><code>toNFKC</code>(<i>value</i>)</li>
					</ul>
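					<p>The candidate order above can be sketched in Python as
						follows. This is only an illustration, not the CLDR
						implementation: the function name <code>fallback</code>, its
						parameters, and the use of NFD as the only "other canonically
						equivalent sequence" are assumptions made for the example.</p>

```python
import unicodedata


def fallback(value, repertoire, substitutes=()):
    """Return the first candidate wholly inside `repertoire`, or None.

    `repertoire` is a set of supported characters and `substitutes` is the
    ordered list of explicit substitutes for `value` (both hypothetical inputs).
    """
    candidates = [unicodedata.normalize("NFC", value)]
    # Assumption: NFD stands in for "other canonically equivalent sequences".
    candidates.append(unicodedata.normalize("NFD", value))
    # Explicit substitutes are tried in the order given.
    candidates.extend(substitutes)
    candidates.append(unicodedata.normalize("NFKC", value))
    for candidate in candidates:
        if all(ch in repertoire for ch in candidate):
            return candidate
    return None


ASCII = {chr(i) for i in range(128)}
LATIN1 = {chr(i) for i in range(256)}

print(fallback("\ufb01", ASCII))                         # ligature fi, via NFKC
print(fallback("\u00a9", ASCII, substitutes=["(C)"]))    # explicit substitute
print(fallback("e\u0301", LATIN1))                       # composed by NFC
```

					<p>Returning <code>None</code> when no candidate fits leaves
						the caller free to choose another strategy, such as an HTML
						escape.</p>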
					<p>
						The <b>Explicit</b>, <b>NFC</b>, and <b>NFKC</b> <i>substitutes</i>
						are shown in the chart by different colors. Note that the
						character fallbacks do lose information, and should not be used
						where there is a viable alternative, such as HTML escapes.
					</p>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/characters.xml">characters.xml</a>&nbsp;
						</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>aliases</td>
				<td>
					<p align="left">
						<b>Aliases</b> show how to map deprecated codes or aliases onto
						the ones that should be used to access CLDR data. Most other
						metadata is not shown in tables; the source data should be
						consulted. Codes are shown in brackets before or after the English
						name, e.g. "Vanuatu [VU]".
					</p>
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">
								bug report</a>.</li>
						<li><b>XML Source:</b> <a
							href="http://unicode.org/cldr/data/common/supplemental/supplementalMetadata.xml">
								supplementalMetadata.xml</a> (see the &lt;alias&gt; element)</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>likely_subtags</td>
				<td>There are a number of situations where it is useful to be
					able to find the most likely language, script, or region, if that
					information is otherwise missing. For example:
					<ul>
						<li><span>Given the language "zh" and the region "TW",
								what is the most likely script?</span></li>
						<li><span>Given the script "Thai", what is the most
								likely language or region?</span></li>
						<li><span>Given the region "TW", what is the most likely
								language and script?</span></li>
					</ul>
					<p>
						<span>Conversely, given a locale, it is useful to find out
							which fields (language, script, or region) may be superfluous, in
							the sense that they contain the likely tags. For example,
							"en_Latn" can be simplified down to "en" since "Latn" is the
							likely script for "en"; "ja_Jpan_JP" can be simplified down to
							"ja".</span>
					</p>
					<p>
						<span>The <i>likelySubtag</i> supplemental data provides
							default information for computing these values. This data is
							based on the default content data, the population data, and the
							suppress-script data in [<a
							href="http://unicode.org/draft/reports/tr35/tr35.html#BCP47">BCP47</a>].
							It is heuristically derived, and may change over time. The chart
							shows how the data "fills in" the missing fields in the <span
							class="source">source values</span> to get the <span
							class="target">target values</span>.
						</span>
					</p>
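					<p>As a rough illustration of "filling in" and simplifying,
						the sketch below uses a tiny hand-picked table. The table
						contents, the function names <code>add_likely</code> and
						<code>minimize</code>, and the simplified lookup order are all
						assumptions made for the example; the real data and procedure
						are defined by the supplemental data and the specification.</p>

```python
# Illustrative subset only; not the actual likelySubtag data.
LIKELY = {
    "zh_TW": ("zh", "Hant", "TW"),
    "und_Thai": ("th", "Thai", "TH"),
    "und_TW": ("zh", "Hant", "TW"),
    "en": ("en", "Latn", "US"),
    "ja": ("ja", "Jpan", "JP"),
}


def add_likely(lang="und", script="", region=""):
    """Fill in missing fields, trying the most specific key first."""
    keys = [
        "_".join(p for p in (lang, script, region) if p),
        "_".join(p for p in (lang, region) if p),
        "_".join(p for p in (lang, script) if p),
        lang,
        "_".join(p for p in ("und", script) if p),
    ]
    for key in keys:
        if key in LIKELY:
            l, s, r = LIKELY[key]
            # Keep every field the caller supplied; fill only the gaps.
            return (l if lang == "und" else lang, script or s, region or r)
    return (lang, script, region)


def minimize(lang, script="", region=""):
    """Drop fields that add_likely() would restore anyway."""
    full = add_likely(lang, script, region)
    for trial in ((lang, "", ""), (lang, "", region), (lang, script, "")):
        if add_likely(*trial) == full:
            return trial
    return (lang, script, region)


print(add_likely("zh", region="TW"))   # most likely script for zh + TW
print(add_likely(script="Thai"))       # most likely language and region
print(minimize("ja", "Jpan", "JP"))    # superfluous fields removed
```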
					<ul>
						<li><b>Reporting Defects:</b> If you find errors or omissions
							in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">bug
								report</a>.</li>
					</ul>
				</td>
			</tr>
			<tr>
				<td>language_plural_rules</td>
				<td>
					<p>
						Languages vary in how they handle plurals of nouns or unit
						expressions ("hours", "meters", and so on). Some languages have
						two forms, like English; some languages have only a single form;
						and some languages have multiple forms (see <a href="#sl">Slovenian</a>
						below). They also vary between cardinals (such as 1, 2, or 3) and
						ordinals (such as 1st, 2nd, or 3rd), and in ranges of cardinals
						(such as "1-2", used in expressions like "1-2 meters long"). CLDR
						uses short, mnemonic tags for these plural categories. For more
						information on these categories, see <a
							href="http://cldr.unicode.org/index/cldr-spec/plural-rules">Plural
							Rules</a>.
					</p>
					<p></p>
					<ul>
						<li><b>Examples:</b> The symbol ~ (as in "1.7~2.1") has a
							special meaning: it is a range of numbers that includes the end
							points (1.7 and 2.1), and everything between that has exactly the
							same number of decimals as the end points (thus also 1.8, 1.9,
							and 2.0, but not 2 or 1.91 or 1.90). The samples are generated
							mechanically, and are not comprehensive: “0, 2~19, 101~119, …”
							could show up as the less-complete “0, 2~16, 101 …”.</li>
						<li><b>Reporting Defects:</b> When you find errors or
							omissions in this data, please report the information with a <a
							target="_blank" href="http://unicode.org/cldr/trac/newticket">bug
								report</a>. But first read &quot;Reporting Defects&quot; on <a
							href="http://cldr.unicode.org/index/cldr-spec/plural-rules">Plural
								Rules</a>.</li>
					</ul>
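					<p>The meaning of the ~ symbol in the samples can be
						illustrated with a short sketch. The function name
						<code>expand_range</code> is hypothetical, and the sketch
						assumes both end points are written with the same number of
						decimals, as in the samples described above.</p>

```python
from decimal import Decimal


def expand_range(sample):
    """Expand a sample such as "1.7~2.1" into the values it stands for."""
    start_text, _, end_text = sample.partition("~")
    if not end_text:
        # A single value, not a range.
        return [start_text]
    start, end = Decimal(start_text), Decimal(end_text)
    # One unit in the last decimal place of the end points (0.1 for "1.7").
    step = Decimal(1).scaleb(start.as_tuple().exponent)
    count = int((end - start) / step)
    # Decimal arithmetic keeps the number of decimals, so 2.0 stays "2.0".
    return [str(start + i * step) for i in range(count + 1)]


print(expand_range("1.7~2.1"))
print(expand_range("2~19"))
```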
				</td>
			</tr>
			<tr>
				<td>error_locale_header|error_index_header</td>
				<td>
					<p>
						Please review and correct them. Note that errors in <span
							style="font-style: italic;">sublocales</span> are often fixed by
						fixing the main locale.<br> <br>
					</p>
					<div style="margin-left: 40px;">
						<span style="font-style: italic;">This list is generated
							only once a day, so it may not reflect fixes you have made until
							tomorrow. (There were production problems in integrating it fully
							into the Survey tool. However, it should let you see the problems
							and make sure that they get taken care of.)</span>
					</div>
					<p>
						The table below gives a count for each of the following kinds of
						items. The focus is on correcting the problems, and getting enough
						votes for "minimal approval" (status=<span
							style="font-style: italic; font-weight: bold;">contributed</span>
						-- high enough to get incorporated into most implementations).
					</p>
					<ul>
						<li><span style="font-weight: bold;">Disputed:</span> Of
							those voting on an item, if enough switched their vote the item
							could have minimal approval.</li>
						<li><span style="font-weight: bold;">Conflicted:</span> For
							this many items, the organization is losing a vote because of
							conflicts within the organization.</li>
						<li><span style="font-weight: bold;">Error:</span> The item
							has a serious error and must be corrected.</li>
						<li><span style="font-weight: bold;">Warning:</span> The item
							has a significant problem that should be corrected.</li>
						<li><span style="font-weight: bold;">Missing Coverage:</span>
							These items should be translated but are missing.</li>
						<li><span style="font-weight: bold;">Missing Votes:</span>
							These items have translations, but not enough votes for "minimal
							approval".</li>
					</ul>
				</td>
			</tr>
		</tbody>
	</table>
	<p>The text to insert can be fairly arbitrary HTML. The software
		that reads this table will search the first column (e.g. between
		&lt;td&gt; and &lt;/td&gt;) and return the contents of the second
		column.</p>
	<p>
		<b>WARNING</b>
	</p>
	<ul>
		<li><b><i>It uses a very dumb parser, so make sure that
					table elements are matched, e.g. &lt;td&gt; with &lt;/td&gt;, and
					also that &lt;tr&gt;, &lt;/tr&gt;, &lt;table&gt;, and
					&lt;/table&gt; are on separate lines.</i></b></li>
		<li><b><i>The regular expression for the key must match
					the whole path, so if it is an interior substring, remember to add
					.* on both ends.</i></b></li>
	</ul>
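	<p>The whole-path matching rule can be illustrated with a small
		sketch. It is not the actual reader: the rows, the function name
		<code>help_text</code>, and the sample paths are hypothetical, and the
		rows are assumed to have been extracted already as (key, text) pairs.</p>

```python
import re

# Hypothetical rows: (key regular expression, text to insert).
ROWS = [
    ("zone_tzid", "Help text for the Zone-Tzid chart."),
    ("error_locale_header|error_index_header", "Please review and correct them."),
    (".*plural.*", "Help text for any path that mentions plurals."),
]


def help_text(path, rows=ROWS):
    """Return the text of the first row whose key matches the WHOLE path."""
    for pattern, text in rows:
        if re.fullmatch(pattern, path):
            return text
    return None


print(help_text("error_index_header"))     # matched through the alternation
print(help_text("language_plural_rules"))  # matched because of .* on both ends
print(help_text("my_zone_tzid_page"))      # None: "zone_tzid" alone is interior
```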
</body>
</html>