1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
2"http://www.w3.org/TR/html4/loose.dtd">
3<html>
4
5<head>
6<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
7<meta http-equiv="Content-Language" content="en-us">
8<link rel="stylesheet" href="http://unicode.org/reports/reports.css"
9	type="text/css">
10<title>UTS #35: Unicode Locale Data Markup Language</title>
11<style type="text/css">
12<!--
13.dtd {
14	font-family: monospace;
15	font-size: 90%;
16	background-color: #CCCCFF;
17	border-style: dotted;
18	border-width: 1px;
19}
20
21.xmlExample {
22	font-family: monospace;
23	font-size: 80%
24}
25
26.blockedInherited {
27	font-style: italic;
28	font-weight: bold;
29	border-style: dashed;
30	border-width: 1px;
31	background-color: #FF0000
32}
33
34.inherited {
35	font-weight: bold;
36	border-style: dashed;
37	border-width: 1px;
38	background-color: #00FF00
39}
40
41.element {
42	font-weight: bold;
43	color: red;
44}
45
46.attribute {
47	font-weight: bold;
48	color: maroon;
49}
50
51.attributeValue {
52	font-weight: bold;
53	color: blue;
54}
55
56li, p {
57	margin-top: 0.5em;
58	margin-bottom: 0.5em
59}
60
61h2, h3, h4, h5, table {
62	margin-top: 1.5em;
63	margin-bottom: 0.5em;
64}
65
66h5 {
67	font-size: medium;
68	font-style: italic
69}
70-->
71</style>
72</head>
73
74<body>
75
76	<table class="header" width="100%">
77		<tr>
78			<td class="icon"><a href="http://unicode.org"> <img
79					alt="[Unicode]" src="http://unicode.org/webscripts/logo60s2.gif"
80					width="34" height="33"
81					style="vertical-align: middle; border-left-width: 0px; border-bottom-width: 0px; border-right-width: 0px; border-top-width: 0px;"></a>&nbsp;
82				<a class="bar" href="http://www.unicode.org/reports/">Technical
83					Reports</a></td>
84		</tr>
85		<tr>
86			<td class="gray">&nbsp;</td>
87		</tr>
88	</table>
89	<div class="body">
90		<h2 style="text-align: center">
91			Unicode Technical Standard #35
92		</h2>
93		<h1 style="text-align: center">Unicode Locale Data Markup Language (LDML)</h1>
94
95		<!-- At least the first row of this header table should be identical across the parts of this UTS. -->
96		<table border="1" cellpadding="2" cellspacing="0" class="wide">
97			<tr>
98				<td>Version</td>
99				<td>34</td>
100			</tr>
101			<tr>
102				<td>Editors</td>
103				<td><a
104					href="https://plus.google.com/114199149796022210033?rel=author">
105						Mark Davis</a> (<a href="mailto:markdavis@google.com">markdavis@google.com</a>)
106					and <a href="tr35.html#Acknowledgments">other CLDR committee
107						members</a></td>
108			</tr>
109			<tr>
110				<td>Date</td>
111				<td>2018-10-10</td>
112			</tr>
113			<tr>
114				<!-- This link must be made live when posting the final version but is disabled during proposed update stage. -->
115				<td>This Version</td>
116				<td>
117				<a href="http://www.unicode.org/reports/tr35/tr35-53/tr35.html">
118				http://www.unicode.org/reports/tr35/tr35-53/tr35.html</a></td>
119			</tr>
120			<tr>
121				<td>Previous Version</td>
122				<td>
123				<a href="http://www.unicode.org/reports/tr35/tr35-51/tr35.html">http://www.unicode.org/reports/tr35/tr35-51/tr35.html</a></td>
124			</tr>
125			<tr>
126				<td>Latest Version</td>
127				<td><a href="http://www.unicode.org/reports/tr35/">http://www.unicode.org/reports/tr35/</a></td>
128			</tr>
129			<tr>
130				<td>Corrigenda</td>
131				<td><a href="http://unicode.org/cldr/corrigenda.html">http://unicode.org/cldr/corrigenda.html</a></td>
132			</tr>
133			<tr>
134				<td>Latest Proposed Update</td>
135				<td><a href="http://www.unicode.org/reports/tr35/proposed.html">http://www.unicode.org/reports/tr35/proposed.html</a></td>
136			</tr>
137			<tr>
138				<td>Namespace</td>
139				<td><a href="http://cldr.unicode.org/">http://cldr.unicode.org/</a></td>
140			</tr>
141			<tr>
142				<td>DTDs</td>
143				<td><a href="http://unicode.org/cldr/dtd/34/">
144				http://unicode.org/cldr/dtd/34/</a></td>
145			</tr>
146			<tr>
147				<td>Revision</td>
148				<td><a href="#Modifications">53</a></td>
149			</tr>
150		</table>
151		<h3>
152			<i>Summary</i>
153		</h3>
154		<p>
155			This document describes an XML format (<i>vocabulary</i>) for the
156			exchange of structured locale data. This format is used in the <a
157				href="http://cldr.unicode.org/">Unicode Common Locale Data
158				Repository</a>.
159		</p>
160
161		<h3>
162			<i>Status</i>
163		</h3>
164
165		<!-- NOT YET APPROVED
166		<p>
167				<i class="changed">This is a<b><font color="#ff3333">
168				draft </font></b>document which may be updated, replaced, or superseded by
169				other documents at any time. Publication does not imply endorsement
170				by the Unicode Consortium. This is not a stable document; it is
171				inappropriate to cite this document as other than a work in
172				progress.
173			</i>
174		</p>
175		 END NOT YET APPROVED -->
176		<!-- APPROVED -->
177		<p>
178			<i>This document has been reviewed by Unicode members and other
179				interested parties, and has been approved for publication by the
180				Unicode Consortium. This is a stable document and may be used as
181				reference material or cited as a normative reference by other
182				specifications.</i>
183		</p>
184		<!-- END APPROVED -->
185
186		<blockquote>
187			<p>
188				<i><b>A Unicode Technical Standard (UTS)</b> is an independent
189					specification. Conformance to the Unicode Standard does not imply
190					conformance to any UTS.</i>
191			</p>
192		</blockquote>
193		<p>
194			<i>Please submit corrigenda and other comments with the CLDR bug
195				reporting form [<a href="http://cldr.unicode.org/index/bug-reports">Bugs</a>].
196				Related information that is useful in understanding this document is
197				found in the <a href="#References">References</a>. For the latest
198				version of the Unicode Standard see [<a
199				href="http://www.unicode.org/versions/latest/">Unicode</a>]. For a
200				list of current Unicode Technical Reports see [<a
201				href="http://www.unicode.org/reports/">Reports</a>]. For more
202				information about versions of the Unicode Standard, see [<a
203				href="http://www.unicode.org/versions/">Versions</a>].
204			</i>
205		</p>
206
207		<!-- This section of Parts should be identical in all of the parts of this UTS. -->
208		<h2>
209			<a name="Parts" href="#Parts">Parts</a>
210		</h2>
211		<p>The LDML specification is divided into the following parts:</p>
212		<ul class="toc">
213			<li>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
214				locales, basic structure)
215			</li>
216			<li>Part 2: <a href="tr35-general.html#Contents">General</a>
217				(display names &amp; transforms, etc.)
218			</li>
219			<li>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
220				(number &amp; currency formatting)
221			</li>
222			<li>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date,
223				time, time zone formatting)
224			</li>
225			<li>Part 5: <a href="tr35-collation.html#Contents">Collation</a>
226				(sorting, searching, grouping)
227			</li>
228			<li>Part 6: <a href="tr35-info.html#Contents">Supplemental</a>
229				(supplemental data)
230			</li>
231			<li>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a>
232				(keyboard mappings)
233			</li>
234		</ul>
235
236		<h2>
237			<a name="Contents" href="#Contents">Contents of Part 1, Core</a>
238		</h2>
239		<!-- START Generated TOC: CheckHtmlFiles -->
240		<ul class="toc">
241			<li>1 <a href="#Introduction">Introduction</a>
242				<ul class="toc">
243					<li>1.1 <a href="#Conformance">Conformance</a></li>
244				</ul>
245			</li>
246			<li>2 <a href="#Locale">What is a Locale?</a></li>
247			<li>3 <a href="#Identifiers">Unicode Language and Locale
248					Identifiers</a>
249				<ul class="toc">
250					<li>3.1 <a href="#Unicode_language_identifier">Unicode
251							Language Identifier</a></li>
252					<li>3.2 <a href="#Unicode_locale_identifier">Unicode
253							Locale Identifier</a></li>
254					<li>3.3 <a href="#BCP_47_Conformance">BCP 47 Conformance</a>
255						<ul class="toc">
256							<li>3.3.1 <a href="#BCP_47_Language_Tag_Conversion">BCP
257									47 Language Tag Conversion</a></li>
258						</ul>
259					</li>
260					<li>3.4 <a href="#Field_Definitions">Language Identifier
261							Field Definitions</a>
262						<ul class="toc">
263							<li>Table: <a href="#Language_Locale_Field_Definitions">Language
264									Identifier Field Definitions</a></li>
265						</ul>
266					</li>
267					<li>3.5 <a href="#Special_Codes">Special Codes</a>
268						<ul class="toc">
269							<li>3.5.1 <a href="#Unknown_or_Invalid_Identifiers">Unknown
270									or Invalid Identifiers</a></li>
271							<li>3.5.2 <a href="#Numeric_Codes">Numeric Codes</a></li>
272							<li>3.5.3 <a href="#Private_Use">Private Use Codes</a>
273								<ul class="toc">
274									<li>Table: <a href="#Private_Use_CLDR">Private Use
275											Codes in CLDR</a></li>
276								</ul>
277							</li>
278						</ul>
279					</li>
280					<li>3.6 <a href="#Locale_Extension_Key_and_Type_Data">Unicode
281							BCP 47 U Extension</a>
282						<ul class="toc">
283							<li>3.6.1 <a href="#Key_And_Type_Definitions_">Key And
284									Type Definitions</a>
285								<ul class="toc">
286									<li>Table: <a href="#Key_Type_Definitions">Key/Type
287											Definitions</a></li>
288								</ul>
289							</li>
290							<li>3.6.2 <a href="#Numbering System Data">Numbering
291									System Data</a></li>
292							<li>3.6.3 <a href="#Time_Zone_Identifiers">Time Zone
293									Identifiers</a></li>
294							<li>3.6.4 <a href="#Unicode_Locale_Extension_Data_Files">U
295									Extension Data Files</a>
296							</li>
297							<li>3.6.5 <a href="#Unicode_Subdivision_Codes">Subdivision
298									Codes</a>
299								<ul class="toc">
300									<li>3.6.5.1 <a href="#Validity">Validity</a></li>
301								</ul>
302							</li>
303						</ul>
304					</li>
305					<li>3.7 <a href="#t_Extension">Unicode BCP 47 T Extension</a>
306						<ul class="toc">
307							<li>3.7.1 <a href="#Transformed_Content_Data_File">T
308									Extension Data Files</a></li>
309						</ul>
310					</li>
311					<li>3.8 <a href="#Compatibility_with_Older_Identifiers">Compatibility
312							with Older Identifiers</a>
313						<ul class="toc">
314							<li>3.8.1 <a href="#Old_Locale_Extension_Syntax">Old
315									Locale Extension Syntax</a>
316								<ul class="toc">
317									<li>Table: <a href="#Locale_Extension_Mappings">Locale
318											Extension Mappings</a></li>
319								</ul>
320							</li>
321							<li>3.8.2 <a href="#Legacy_Variants">Legacy Variants</a>
322								<ul class="toc">
323									<li>Table: <a href="#Legacy_Variant_Mappings">Legacy
324											Variant Mappings</a></li>
325								</ul>
326							</li>
327							<li>3.8.3 <a href="#Relation_to_OpenI18n">Relation to
328									OpenI18n</a></li>
329						</ul>
330					</li>
331					<li>3.9 <a href="#Transmitting_Locale_Information">Transmitting
332							Locale Information</a>
333						<ul class="toc">
334							<li>3.9.1 <a href="#Message_Formatting_and_Exceptions">Message
335									Formatting and Exceptions</a></li>
336						</ul>
337					</li>
338					<li>3.10 <a href="#Language_and_Locale_IDs">Unicode
339							Language and Locale IDs</a>
340						<ul class="toc">
341							<li>3.10.1 <a href="#Written_Language">Written Language</a></li>
342						  <li>3.10.2 <a href="#Hybrid_Locale">Hybrid Locale Identifiers</a></li>
343						</ul>
344					</li>
345					<li>3.11 <a href="#Validity_Data">Validity Data</a></li>
346				</ul>
347			</li>
348			<li>4 <a href="#Locale_Inheritance">Locale Inheritance and
349					Matching</a>
350				<ul class="toc">
351					<li>4.1 <a href="#Lookup">Lookup</a>
352						<ul class="toc">
353							<li>4.1.1 <a href="#Bundle_vs_Item_Lookup">Bundle vs
354									Item Lookup</a>
355								<ul class="toc">
356									<li>Table: <a href="#Lookup-Differences">Lookup
357											Differences</a></li>
358								</ul>
359							</li>
360							<li>4.1.2 <a href="#Multiple_Inheritance">Lateral
361									Inheritance</a>
362								<ul class="toc">
363									<li>Table: <a href="#Count_Fallback_normal">Count
364											Fallback: normal</a></li>
365									<li>Table: <a href="#Count_Fallback_currency">Count
366											Fallback: currency</a></li>
367								</ul>
368							</li>
369							<li>4.1.3 <a href="#Parent_Locales">Parent Locales</a></li>
370						</ul>
371					</li>
372					<li>4.2 <a href="#Inheritance_and_Validity">Inheritance
373							and Validity</a>
374						<ul class="toc">
375							<li>4.2.1 <a href="#Definitions">Definitions</a></li>
376							<li>4.2.2 <a href="#Resolved_Data_File">Resolved Data
377									File</a></li>
378							<li>4.2.3 <a href="#Valid_Data">Valid Data</a></li>
379							<li>4.2.4 <a href="#Checking_for_Draft_Status">Checking
380									for Draft Status</a></li>
381							<li>4.2.5 <a href="#Keyword_and_Default_Resolution">Keyword
382									and Default Resolution</a></li>
383							<li>4.2.6 <a
384				href="#Inheritance_vs_Related">Inheritance vs Related Information</a></li>
385						</ul>
386					</li>
387					<li>4.3 <a href="#Likely_Subtags">Likely Subtags</a></li>
388					<li>4.4 <a href="#LanguageMatching">Language Matching</a>
389					  <ul>
390					    <li>4.4.1 <a href="#EnhancedLanguageMatching">Enhanced Language Matching</a></li>
391				      </ul>
392					</li>
393				</ul>
394			</li>
395			<li>5 <a href="#XML_Format">XML Format</a>
396				<ul class="toc">
397					<li>5.1 <a href="#Common_Elements">Common Elements</a>
398						<ul class="toc">
399							<li>5.1.1 <a href="#special">Element special</a>
400								<ul class="toc">
401									<li>5.1.1.1 <a href="#Sample_Special_Elements">Sample
402											Special Elements</a></li>
403								</ul>
404							</li>
405							<li>5.1.2 <a href="#Alias_Elements">Element alias</a>
406								<ul class="toc">
407									<li>Table: <a href="#Inheritance_with_source_locale_">Inheritance
408											with source=&quot;locale&quot;</a></li>
409								</ul>
410							</li>
411							<li>5.1.3 <a href="#Element_displayName">Element
412									displayName</a></li>
413							<li>5.1.4 <a href="#Escaping_Characters">Escaping
414									Characters</a></li>
415						</ul>
416					</li>
417					<li>5.2 <a href="#Common_Attributes">Common Attributes</a>
418						<ul class="toc">
419							<li>5.2.1 <a href="#Attribute_type">Attribute type</a></li>
420							<li>5.2.2 <a href="#Attribute_draft">Attribute draft</a></li>
421							<li>5.2.3 <a href="#alt_attribute">Attribute alt</a></li>
422						</ul>
423					</li>
424					<li>5.3 <a href="#Common_Structures">Common Structures</a>
425						<ul class="toc">
426							<li>5.3.1 <a href="#Date_Ranges">Date and Date Ranges</a></li>
427							<li>5.3.2 <a href="#Text_Directionality">Text
428									Directionality</a></li>
429							<li>5.3.3 <a href="#Unicode_Sets">Unicode Sets</a>
430								<ul class="toc">
431									<li>5.3.3.1 <a href="#Lists_of_Code_Points">Lists of
432											Code Points</a></li>
433									<li>5.3.3.2 <a href="#Unicode_Properties">Unicode
434											Properties</a></li>
435									<li>5.3.3.3 <a href="#Boolean_Operations">Boolean
436											Operations</a></li>
437									<li>5.3.3.4 <a href="#UnicodeSet_Examples">UnicodeSet
438											Examples</a></li>
439								</ul>
440							</li>
441							<li>5.3.4 <a href="#String_Range">String Range</a></li>
442						</ul>
443					</li>
444					<li>5.4 <a href="#Identity_Elements">Identity Elements</a></li>
445					<li>5.5 <a href="#Valid_Attribute_Values">Valid Attribute
446							Values</a></li>
447					<li>5.6 <a href="#Canonical_Form">Canonical Form</a>
448						<ul class="toc">
449							<li>5.6.1 <a href="#Content">Content</a></li>
450							<li>5.6.2 <a href="#Ordering">Ordering</a></li>
451							<li>5.6.3 <a href="#Comments">Comments</a></li>
452						</ul>
453					</li>
454                    	<li>5.7 <a href="#DTD_Annotations">DTD Annotations</a></li>
455
456				</ul>
457			</li>
458			<li>6 <a href="#Property_Data">Property Data</a>
459				<ul class="toc">
460					<li>6.1 <a href="#Script_Metadata">Script Metadata</a></li>
461					<li>6.2 <a href="#Extended_Pictographic">Extended Pictographic</a></li>
462					<li>6.3 <a href="#Labels.txt">Labels.txt</a></li>
463				</ul>
464			</li>
465			<li>7 <a href="#Format_Parse_Issues">Issues in Formatting
466					and Parsing</a>
467				<ul class="toc">
468					<li>7.1 <a href="#Lenient_Parsing">Lenient Parsing</a>
469						<ul class="toc">
470							<li>7.1.1 <a href="#Motivation">Motivation</a></li>
471							<li>7.1.2 <a href="#Loose_Matching">Loose Matching</a></li>
472						</ul>
473					</li>
474					<li>7.2 <a href="#Invalid_Patterns">Handling Invalid
475							Patterns</a></li>
476				</ul>
477			</li>
478			<li>Annex A <a href="#Deprecated_Structure">Deprecated Structure</a>
479				<ul class="toc">
480					<li>A.1 <a href="#Fallback_Elements">Element fallback</a></li>
481					<li>A.2 <a href="#BCP47_Keyword_Mapping">BCP 47 Keyword
482							Mapping</a></li>
483					<li>A.3 <a href="#Choice_Patterns">Choice Patterns</a></li>
484					<li>A.4 <a href="#Element_default">Element default</a></li>
485					<li>A.5 <a href="#Deprecated_Common_Attributes">Deprecated
486							Common Attributes</a>
487						<ul>
488							<li>A.5.1 <a href="#Attribute_standard">Attribute
489									standard</a></li>
490							<li>A.5.2 <a href="#Attribute_draft_nonLeaf">Attribute
491									draft in non-leaf elements</a></li>
492						</ul>
493					</li>
494					<li>A.6 <a href="#Element_base">Element base</a></li>
495					<li>A.7 <a href="#Element_rules">Element rules</a></li>
496					<li>A.8 <a href="#Deprecated_subelements_of_dates">Deprecated
497							subelements of &lt;dates&gt;</a></li>
498					<li>A.9 <a href="#Deprecated_subelements_of_calendars">Deprecated
499							subelements of &lt;calendars&gt;</a></li>
500					<li>A.10 <a href="#Deprecated_subelements_of_timeZoneNames">Deprecated
501							subelements of &lt;timeZoneNames&gt;</a></li>
502					<li>A.11 <a href="#Deprecated_subelements_of_zone_metazone">Deprecated
503							subelements of &lt;zone&gt; and &lt;metazone&gt;</a></li>
504					<li>A.12 <a
505						href="#Renamed_attribute_values_for_contextTransformUsage">Renamed
506							attribute values for &lt;contextTransformUsage&gt; element</a></li>
507					<li>A.13 <a href="#Deprecated_subelements_of_segmentations">Deprecated
508							subelements of &lt;segmentations&gt;</a></li>
509					<li>A.14 <a href="#Element_cp">Element cp</a></li>
510					<li>A.15 <a href="#validSubLocales">Attribute
511							validSubLocales</a></li>
512					<li>A.16 <a href="#postCodeElements">Elements
513							postalCodeData, postCodeRegex</a></li>
514					<li>A.17 <a href="#telephoneCodeData">Element
515							telephoneCodeData</a></li>
516				</ul>
517			</li>
518			<li>Annex B <a href="#Links_to_Other_Parts">Links to Other Parts</a>
519				<ul class="toc">
520					<li>Table: <a href="#Part_2_Links">Part 2 Links: General
521							(display names &amp; transforms, etc.)</a></li>
522					<li>Table: <a href="#Part_3_Links">Part 3 Links: Numbers
523							(number &amp; currency formatting)</a></li>
524					<li>Table: <a href="#Part_4_Links">Part 4 Links: Dates
525							(date, time, time zone formatting)</a></li>
526					<li>Table: <a href="#Part_5_Links">Part 5 Links: Collation
527							(sorting, searching, grouping)</a></li>
528					<li>Table: <a href="#Part_6_Links">Part 6 Links:
529							Supplemental (supplemental data)</a></li>
530					<li>Table: <a href="#Part_7_Links">Part 7 Links: Keyboards
531							(keyboard mappings)</a></li>
532				</ul>
533			</li>
534			<li><a href="#References">References</a></li>
535			<li><a href="#Acknowledgments">Acknowledgments</a></li>
536			<li><a href="#Modifications">Modifications</a></li>
537		</ul>
538		<!-- END Generated TOC: CheckHtmlFiles -->
539		<h2>
540			<a name="Introduction" href="#Introduction">1 Introduction</a>
541		</h2>
542		<p>Not long ago, computer systems were like separate worlds,
543			isolated from one another. The internet and related events have
544			changed all that. A single system can be built of many different
545			components, hardware and software, all needing to work together. Many
546			different technologies have been important in bridging the gaps; in
547			the internationalization arena, Unicode has provided a lingua franca
548			for communicating textual data. However, there remain differences in
549			the locale data used by different systems.</p>
550		<p>The best practice for internationalization is to store and
551			communicate language-neutral data, and format that data for the
552			client. This formatting can take place on any of a number of the
553			components in a system; a server might format data based on the
554			user&#39;s locale, or it could be that a client machine does the
555			formatting. The same goes for parsing data, and locale-sensitive
556			analysis of data.</p>
557		<p>
558			But there remain significant differences across systems and
559			applications in the locale-sensitive data used for such formatting,
560			parsing, and analysis. Many of those differences are simply
561			gratuitous; all within acceptable limits for human beings, but
562			yielding different results. In many other cases there are outright
563			errors. Whatever the cause, the differences can cause discrepancies
564			to creep into a heterogeneous system. This is especially serious in
			the case of collation (sort order), where different collation rules cause
			not only ordering differences, but also different query results.
567			That is, with a query of customers with names between &quot;Abbot,
568			Cosmo&quot; and &quot;Arnold, James&quot;, if different systems have
569			different sort orders, different lists will be returned. (For
570			comparisons across systems formatted as HTML tables, see [<a
571				href="#Comparisons">Comparisons</a>].)
572		</p>
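		<p>The effect of differing sort orders on such a range query can be
			sketched as follows. This is only an illustration: the names, the
			<code>range_query</code> helper, and the two comparison keys are
			hypothetical stand-ins for two systems, not collation rules taken
			from CLDR.</p>

```python
# Hypothetical customer names; both "systems" below are illustrative stand-ins.
names = [
    "Abbot, Cosmo", "abbott, Lou", "Adams, Ann",
    "Arnold, James", "arnold, Ben", "Baker, Tom",
]


def range_query(items, low, high, key):
    """Return the items whose sort key lies between those of low and high, inclusive."""
    lo, hi = key(low), key(high)
    return sorted((s for s in items if lo <= key(s) <= hi), key=key)


# System A compares raw code points, so lowercase names sort after all uppercase ones.
system_a = range_query(names, "Abbot, Cosmo", "Arnold, James", key=lambda s: s)

# System B ignores case before comparing.
system_b = range_query(names, "Abbot, Cosmo", "Arnold, James", key=str.casefold)

print(system_a)
print(system_b)
```

		<p>Under these assumptions the same query returns three names on one
			system and five on the other, even though both hold identical data.</p>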
573		<blockquote>
574			<p class="note">
575				<b>Note:</b> There are many different equally valid ways in which
576				data can be judged to be &quot;correct&quot; for a particular
577				locale. The goal for the common locale data is to make it as
578				consistent as possible with existing locale data, and acceptable to
579				users in that locale.
580			</p>
581		</blockquote>
582		<p>This document specifies an XML format for the communication of
583			locale data: the Unicode Locale Data Markup Language (LDML). This
584			provides a common format for systems to interchange locale data so
585			that they can get the same results in the services provided by
586			internationalization libraries. It also provides a standard format
587			that can allow users to customize the behavior of a system. With it,
588			for example, collation (sorting) rules can be exchanged, allowing two
589			implementations to exchange a specification of tailored collation
590			rules. Using the same specification, the two implementations will
591			achieve the same results in comparing strings. Unicode LDML can also
592			be used to let a user encapsulate specialized sorting behavior for a
593			specific domain, or create a customized locale for a minority
594			language. Unicode LDML is also used in the Unicode Common Locale Data
595			Repository (CLDR). CLDR uses an open process for reconciling
596			differences between the locale data used on different systems and
			validating the data, to produce a useful, common, consistent
			base of locale data.</p>
599		<p>
600			For more information, see the Common Locale Data Repository project
601			page [<a href="#localeProject">LocaleProject</a>].
602		</p>
603		<p>As LDML is an interchange format, it was designed for ease of
604			maintenance and simplicity of transformation into other formats,
605			above efficiency of run-time lookup and use. Implementations should
606			consider converting LDML data into a more compact format prior to
607			use.</p>
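		<p>As a rough sketch of such a conversion, the following flattens a
			small LDML-style fragment into a dictionary keyed by path. The
			fragment is hand-written and simplified, and the
			<code>flatten</code> function and its path syntax are assumptions
			made for this illustration; an actual implementation would choose
			its own compact representation.</p>

```python
import xml.etree.ElementTree as ET

# A small, hand-written, simplified LDML-style fragment used only for illustration.
SAMPLE = """
<ldml>
  <identity><language type="en"/></identity>
  <localeDisplayNames>
    <languages>
      <language type="de">German</language>
      <language type="fr">French</language>
    </languages>
  </localeDisplayNames>
</ldml>
"""


def flatten(xml_text):
    """Flatten every text-bearing leaf element into a {path: text} dictionary."""
    table = {}

    def walk(element, prefix):
        step = element.tag
        if "type" in element.attrib:
            # Distinguish sibling elements by their type attribute.
            step += "[" + element.attrib["type"] + "]"
        path = prefix + "/" + step
        text = (element.text or "").strip()
        if len(element) == 0 and text:
            table[path] = text
        for child in element:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return table


compact = flatten(SAMPLE)
print(compact["/ldml/localeDisplayNames/languages/language[de]"])
```

		<p>A flat table like this trades the readability of the XML for
			constant-time lookup, which is the kind of trade-off the paragraph
			above has in mind.</p>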
608		<h3>
609			<a name="Conformance" href="#Conformance">1.1 Conformance</a>
610		</h3>
611		<p>There are many ways to use the Unicode LDML format and the data
612			in CLDR, and the Unicode Consortium does not restrict the ways in
613			which the format or data are used. However, an implementation may
614			also claim conformance to LDML or to CLDR, as follows:</p>
615		<p>&nbsp;</p>
616		<p>
617			<i><b>UAX35-C1.</b> </i>An implementation that claims conformance to
618			this specification shall:
619		</p>
620		<ol>
621			<li>Identify the sections of the specification that it conforms
622				to.
623				<ul>
624					<li>For example, an implementation might claim conformance to
625						all LDML features except for <i>transforms</i> and <i>segments</i>.
626					</li>
627				</ul>
628			</li>
629			<li>Interpret the relevant elements and attributes of LDML
630				documents in accordance with the descriptions in those sections.
631				<ul>
632					<li>For example, an implementation that claims conformance to
633						the date format patterns must interpret the characters in such
634						patterns according to <a
635						href="tr35-dates.html#Date_Field_Symbol_Table">Date Field
636							Symbol Table</a>.
637					</li>
638				</ul>
639			</li>
			<li>Declare which types of CLDR data it uses.
641				<ul>
642					<li>For example, an implementation might declare that it only
643						uses language names, and those with a <i>draft</i> status of <i>contributed</i>
644						or <i>approved</i>.
645					</li>
646				</ul>
647			</li>
648		</ol>
649		<p>
650			<i><b>UAX35-C2.</b> </i>An implementation that claims conformance to
651			Unicode locale or language identifiers shall:
652		</p>
653		<ol>
			<li>Specify whether Unicode locale extensions are allowed.</li>
655			<li>Specify the canonical form used for identifiers in terms of
656				casing and field separator characters.</li>
657		</ol>
658		<p>External specifications may also reference particular
659			components of Unicode locale or language identifiers, such as:</p>
660		<blockquote>
661			<p>
662				<i>Field X can contain any Unicode region subtag values as given
663					in Unicode Technical Standard #35: Unicode Locale Data Markup
664					Language (LDML), excluding grouping codes.</i>
665			</p>
666		</blockquote>
667		<h2>
668			<a name="Locale" href="#Locale">2 What is a Locale?</a>
669		</h2>
670		<p>Before diving into the XML structure, it is helpful to describe
671			the model behind the structure. People do not have to subscribe to
672			this model to use data in LDML, but they do need to understand it so
673			that the data can be correctly translated into whatever model their
674			implementation uses.</p>
675		<p>
676			The first issue is basic: <i>what is a locale?</i> In this model, a
677			locale is an identifier (id) that refers to a set of user preferences
678			that tend to be shared across significant swaths of the world.
679			Traditionally, the data associated with this id provides support for
680			formatting and parsing of dates, times, numbers, and currencies; for
681			measurement units, for sort-order (collation), plus translated names
682			for time zones, languages, countries, and scripts. The data can also
683			include support for text boundaries (character, word, line, and
684			sentence), text transformations (including transliterations), and
685			other services.
686		</p>
687		<p>Locale data is not cast in stone: the data used on
688			someone&#39;s machine generally may reflect the US format, for
			example, but preferences can typically be set to override particular
690			items, such as setting the date format for 2002.03.15, or using
691			metric or Imperial measurement units. In the abstract, locales are
692			simply one of many sets of preferences that, say, a website may want
693			to remember for a particular user. Depending on the application, it
694			may want to also remember the user&#39;s time zone, preferred
695			currency, preferred character set, smoker/non-smoker preference, meal
696			preference (vegetarian, kosher, and so on), music preference,
697			religion, party affiliation, favorite charity, and so on.</p>
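		<p>One way to picture this layering is as a lookup that consults user
			overrides first and falls back to the locale data. The sketch below
			is only an illustration: the keys are hypothetical, and the
			patterns use Python <code>strftime</code> syntax rather than LDML
			date patterns.</p>

```python
from collections import ChainMap
from datetime import date

# Illustrative defaults for a US-style locale; keys and values are hypothetical.
locale_defaults = {"date_pattern": "%m/%d/%Y", "measurement_system": "US"}

# The user overrides only the date format; everything else falls through.
user_overrides = {"date_pattern": "%Y.%m.%d"}

# Lookups consult the overrides first, then the locale defaults.
effective = ChainMap(user_overrides, locale_defaults)

formatted = date(2002, 3, 15).strftime(effective["date_pattern"])
print(formatted, effective["measurement_system"])
```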
698		<p>Locale data in a system may also change over time: country
			boundaries change; governments (and currencies) come and go;
700			committees impose new standards; bugs are found and fixed in the
701			source data; and so on. Thus the data needs to be versioned for
702			stability over time.</p>
703		<p>
704			In general terms, the locale id is a parameter that is supplied to a
705			particular service (date formatting, sorting, spell-checking, and so
706			on). The format in this document does not attempt to represent all
707			the data that could conceivably be used by all possible services.
708			Instead, it collects together data that is in common use in systems
709			and internationalization libraries for basic services. The main
710			difference among locales is in terms of language; there may also be
711			some differences according to different countries or regions.
712			However, the line between <i>locales</i> and <i>languages</i>, as
			commonly used in the industry, is rather fuzzy. Note also that the
714			vast majority of the locale data in CLDR is in fact language data;
715			all non-linguistic data is separated out into a separate tree. For
716			more information, see <i><a href="#Language_and_Locale_IDs">Section
717					3.10 Language and Locale IDs</a></i>.
718		</p>
719		<p>
720			We will speak of data as being &quot;in locale X&quot;. That does not
721			imply that a locale <i>is</i> a collection of data; it is simply
722			shorthand for &quot;the set of data associated with the locale id
723			X&quot;. Each individual piece of data is called a <i>resource </i>or
724			<i>field</i>, and a tag indicating the key of the resource is called
725			a <i>resource tag.</i>
726		</p>
727		<h2>
728			<a name="Identifiers" href="#Identifiers"></a><a
729				name="Unicode_Language_and_Locale_Identifiers"
730				href="#Unicode_Language_and_Locale_Identifiers"> 3 Unicode
731				Language and Locale Identifiers</a>
732		</h2>
733		<p>
734			Unicode LDML uses stable identifiers based on [<a href="#BCP47">BCP47</a>]
735			for distinguishing among languages, locales, regions, currencies,
736			time zones, transforms, and so on. There are many systems for
737			identifiers for these entities. The Unicode LDML identifiers may not
738			match the identifiers used on a particular target system. If so, some
739			process of identifier translation may be required when using LDML
740			data.
741		</p>
742		<p>
743			The BCP 47 extensions (-u- and -t-) are described in <em>Section
744				3.6 <a href="#u_Extension">Unicode BCP 47 U Extension</a>
745			</em> and <em>Section 3.7 <a href="#BCP47_T_Extension">Unicode
746					BCP 47 T Extension</a></em>.
747		</p>
748		<h3>
749			<i><a name="Unicode_language_identifier"
750				href="#Unicode_language_identifier">3.1 Unicode Language
751					Identifier</a></i>
752		</h3>
753		<p>
754			A <i>Unicode language identifier</i> has the following structure
755			(provided in either EBNF (Perl-based) or ABNF [<a href="#RFC5234">RFC5234</a>]).
756			The following table defines syntactically well-formed identifiers:
757			they are not necessarily valid identifiers. For additional validity
758			criteria, see the links on the right.
759		</p>
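		<p>As an informal illustration, the grammar in the table that follows
			can be approximated with a regular expression. This is a sketch
			only: it checks well-formedness, not validity, and the definitions
			used for <code>sep</code> (hyphen or underscore) and
			<code>unicode_variant_subtag</code> are assumptions here, since
			they are defined in later rows of the table. The names
			<code>LANGUAGE_ID</code> and <code>is_well_formed</code> are
			hypothetical.</p>

```python
import re

# Building blocks mirroring the EBNF productions.
SEP = r"[-_]"  # assumed separator definition
LANGUAGE = r"(?:[A-Za-z]{2,3}|[A-Za-z]{5,8})"
SCRIPT = r"[A-Za-z]{4}"
REGION = r"(?:[A-Za-z]{2}|[0-9]{3})"
VARIANT = r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})"  # assumed variant definition

# "root", or (language [sep script] | script) [sep region] *(sep variant)
LANGUAGE_ID = re.compile(
    rf"(?:root|(?:{LANGUAGE}(?:{SEP}{SCRIPT})?|{SCRIPT})"
    rf"(?:{SEP}{REGION})?(?:{SEP}{VARIANT})*)"
)


def is_well_formed(tag):
    """Return True if tag matches the unicode_language_id syntax as a whole."""
    return LANGUAGE_ID.fullmatch(tag) is not None


for candidate in ["en-Latn-US", "root", "zh_Hant_TW", "sl-rozaj-1994", "en--US", "e"]:
    print(candidate, is_well_formed(candidate))
```

		<p>A string accepted by this check may still be invalid, for example
			if it uses a subtag that is not listed in the validity data.</p>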
760		<table>
761			<tr>
762				<th>&nbsp;</th>
763				<th><div align="center">EBNF</div></th>
764				<th><div align="center">ABNF</div></th>
765				<th><div align="center">Validity / Comments</div></th>
766			</tr>
767			<tr>
768				<td><code>
769						<a href="#unicode_language_id" name="unicode_language_id">unicode_language_id</a>
770					</code></td>
771				<td><code>
772						= &quot;root&quot;<br>
773						| (unicode_language_subtag <br>     (sep
774						unicode_script_subtag)? <br>   | unicode_script_subtag)<br>
775						  (sep unicode_region_subtag)? <br>
776						  (sep
777						unicode_variant_subtag)* ;
778					</code></td>
779				<td><code>
780						= &quot;root&quot;<br>
781/ (unicode_language_subtag <br>     [sep
782						unicode_script_subtag] <br>   / unicode_script_subtag)<br>
783						  [sep unicode_region_subtag] <br>
784						  *(sep
785						unicode_variant_subtag)
					</code></td><td>"root" is treated as a special <code>unicode_language_subtag</code></td></tr>
787			<tr>
788				<td><code>
789						<a href="#unicode_language_subtag" name="unicode_language_subtag">unicode_language_subtag</a>
790					</code></td>
791				<td><code> = alpha{2,3} | alpha{5,8}; </code></td>
792				<td><code> = 2*3ALPHA / 5*8ALPHA </code></td>
793				<td><code>
794						<a href='#unicode_language_subtag_validity'>validity</a><br>
795						<a href='http://unicode.org/cldr/latest/common/validity/language.xml'>latest-data</a>
796					</code></td>
797			</tr>
798			<tr>
799				<td><code>
800						<a href="#unicode_script_subtag" name="unicode_script_subtag">unicode_script_subtag</a>
801					</code></td>
802				<td><code>= alpha{4} ;</code></td>
803				<td><code>= 4ALPHA</code></td>
804				<td><code>
805						<a href='#unicode_script_subtag_validity'>validity</a><br>
806						<a href='http://unicode.org/cldr/latest/common/validity/script.xml'>latest-data</a>
807					</code></td>
808			</tr>
809			<tr>
810				<td><code>
811						<a href="#unicode_region_subtag" name="unicode_region_subtag">unicode_region_subtag</a>
812					</code></td>
813				<td><code>= (alpha{2} | digit{3}) ;</code></td>
814				<td><code>= 2ALPHA / 3DIGIT</code></td>
815				<td><code>
						<a href='#unicode_region_subtag_validity'>validity</a><br>
817						<a href='http://unicode.org/cldr/latest/common/validity/region.xml'>latest-data</a>
818					</code></td>
819			</tr>
820			<tr>
821				<td><code>
822						<a href="#unicode_variant_subtag" name="unicode_variant_subtag">unicode_variant_subtag</a>
823					</code></td>
824				<td><code>
825						= (alphanum{5,8} <br> | digit alphanum{3}) ;
826					</code></td>
827				<td><code>
828						= 5*8alphanum<br>/ (DIGIT 3alphanum)
829					</code></td>
830				<td><code>
						<a href='#unicode_variant_subtag_validity'>validity</a><br>
832						<a href='http://unicode.org/cldr/latest/common/validity/variant.xml'>latest-data</a>
833					</code></td>
834			</tr>
835			<tr>
836				<td><code>sep</code></td>
837				<td><code>= [-_] ;</code></td>
838				<td><code>= "-" / "_"</code></td>
839			</tr>
840			<tr>
841				<td><code>digit</code></td>
842				<td><code>= [0-9] ;</code></td>
843				<td><code>&nbsp;</code></td>
844			</tr>
845			<tr>
846				<td><code>alpha</code></td>
847				<td><code>= [A-Z a-z] ;</code></td>
848				<td><code>&nbsp;</code></td>
849			</tr>
850			<tr>
851				<td><code>alphanum</code></td>
852				<td><code>= [0-9 A-Z a-z] ;</code></td>
853				<td><code>= ALPHA / DIGIT</code></td>
854			</tr>
855		</table>
856		<p>
			The semantics of the various subtags are explained in <em>Section
858				3.4 <a href="#Field_Definitions">Language Identifier Field
859					Definitions</a>
860			</em>; there are also direct links from
861			<code>
862				<a href="#unicode_language_subtag">unicode_language_subtag</a>
863			</code>
864			, etc. While theoretically the
865			<code>
866				<a href="#unicode_language_subtag">unicode_language_subtag</a>
867			</code>
868			may have more than 3 letters through the IANA registration process,
869			in practice that has not occurred. The
870			<code>
871				<a href="#unicode_language_subtag">unicode_language_subtag</a>
872			</code>
873			&quot;und&quot; may be omitted when there is a
874			<code>
875				<a href="#unicode_script_subtag">unicode_script_subtag</a>
876			</code>
877			; for that reason
878			<code>
879				<a href="#unicode_language_subtag">unicode_language_subtag</a>
880			</code>
			values with 4 letters are not permitted. However,
			<code>
				<a href="#unicode_language_id">unicode_language_id</a>
			</code>
			values that begin with a script subtag are not intended for general
			interchange, because they are not valid BCP 47 tags. Instead, they are
			intended for certain protocols such as the identification of
			transliterators or font ScriptLangTag values.
889		</p>
890		<p>For example, &quot;en-US&quot; (American English),
891			&quot;en_GB&quot; (British English), &quot;es-419&quot; (Latin
892			American Spanish), and &quot;uz-Cyrl&quot; (Uzbek in Cyrillic) are
893			all valid Unicode language identifiers.</p>
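		<p>The grammar in the table above can be approximated with a regular
			expression. The following is a minimal sketch in Python, assuming the
			EBNF column is taken literally; the names
			<code>UNICODE_LANGUAGE_ID</code> and
			<code>is_well_formed_language_id</code> are hypothetical and are not
			defined by LDML. It checks well-formedness only, not validity against
			the CLDR validity data.</p>

```python
import re

# Building blocks taken from the EBNF column of the table.
ALPHA = "[A-Za-z]"
ALNUM = "[0-9A-Za-z]"
SEP = "[-_]"

LANGUAGE = f"(?:{ALPHA}{{2,3}}|{ALPHA}{{5,8}})"    # alpha{2,3} | alpha{5,8}
SCRIPT = f"{ALPHA}{{4}}"                           # alpha{4}
REGION = f"(?:{ALPHA}{{2}}|[0-9]{{3}})"            # alpha{2} | digit{3}
VARIANT = f"(?:{ALNUM}{{5,8}}|[0-9]{ALNUM}{{3}})"  # alphanum{5,8} | digit alphanum{3}

# "root" | (language (sep script)? | script) (sep region)? (sep variant)*
UNICODE_LANGUAGE_ID = re.compile(
    f"(?:root|(?:{LANGUAGE}(?:{SEP}{SCRIPT})?|{SCRIPT})"
    f"(?:{SEP}{REGION})?(?:{SEP}{VARIANT})*)"
)


def is_well_formed_language_id(tag: str) -> bool:
    """Return True if tag matches the unicode_language_id syntax (hypothetical helper)."""
    return UNICODE_LANGUAGE_ID.fullmatch(tag) is not None
```

		<p>Under these assumptions, the four example identifiers above are all
			accepted, while a string such as &quot;e&quot; or &quot;en--US&quot;
			is rejected.</p>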
894		<h3>
895			<i><a name="Unicode_locale_identifier"
896				href="#Unicode_locale_identifier">3.2 Unicode Locale Identifier</a></i>
897		</h3>
898		<p>
899			A <i>Unicode locale identifier</i> is composed of a Unicode language
900			identifier plus (optional) locale extensions. It has the
901			following structure. The semantics of the U and T extensions are
902			explained in <em>Section 3.6 <a href="#u_Extension">Unicode
903					BCP 47 U Extension</a>
904			</em> and <em>Section 3.7 <a href="#BCP47_T_Extension">Unicode
905			BCP 47 T Extension</a></em>. Other extensions and private use extensions are supported for pass-through. The following table defines syntactically
906			        <em>well-formed</em> identifiers: they are not necessarily <em>valid</em> identifiers.
907		For additional validity criteria, see the links on the right. </p>
908		<table border="0">
909			<tr>
910				<th>&nbsp;</th>
911				<th><div align="center">EBNF</div></th>
912				<th><div align="center">ABNF</div></th>
913				<th><div align="center">Validity</div></th>
914			</tr>
915			<tr>
916				<td><code>
917						<a href="#unicode_locale_id" name="unicode_locale_id">unicode_locale_id</a>
918					</code></td>
919				<td><code>
920						= unicode_language_id<br>
921						   extensions*<br>
922						  
923pu_extensions? ; </code></td>
924				<td><code>
925						= unicode_language_id<br>
						&nbsp;&nbsp;*extensions<br>
						&nbsp;&nbsp;[pu_extensions] </code></td>
929			</tr>
930		  <tr>
931				<td><code>
932				<a href="#extensions" name="extensions">extensions</a>
933					</code></td>
934				<td><code>
935						= unicode_locale_extensions <br>
936				| transformed_extensions <br>
937				| other_extensions ;</code></td>
938				<td><code>= unicode_locale_extensions <br>
939/ transformed_extensions <br>
940/ other_extensions</code></td>
941			</tr>
942			<tr>
943				<td><code>
944						<a href="#unicode_locale_extensions"
945							name="unicode_locale_extensions">unicode_locale_extensions</a>
946					</code></td>
947				<td><code>
948						= sep [uU]<br>   ((sep keyword)+ <br>   |(sep attribute)+
949						(sep keyword)*) ;
950					</code></td>
951				<td><code>
952						= sep &quot;u&quot; <br>   (1*(sep keyword) <br>   / 1*(sep
953						attribute) *(sep keyword))
954					</code></td>
955			</tr>
956			<tr>
957				<td><code>
958						<a href="#transformed_extensions" name="transformed_extensions">transformed_extensions</a>
959					</code></td>
960				<td><code>
961					= sep [tT] <br>   ((sep tlang (sep tfield)*) <br>
962					  | (sep tfield)+) ; </code></td>
963				<td><code>
964						= sep &quot;t&quot; <br>   ((sep tlang
965					*(sep tfield)) <br>   / 1*(sep tfield)) </code></td>
966			</tr>
967			<tr>
968				<td><code><a href="#pu_extensions" name="pu_extensions">pu_extensions</a></code></td>
				<td><code>= sep [xX] <br>
				&nbsp;&nbsp;(sep alphanum{1,8})+ ;</code></td>
				<td><code>= sep &quot;x&quot; <br>
				&nbsp;&nbsp;1*(sep 1*8alphanum)</code></td>
975			</tr>
976		  <tr>
977				<td><code><a href="#other_extensions" name="other_extensions">other_extensions</a></code></td>
				<td><code>= sep [alphanum-[tTuUxX]]<br>
				&nbsp;&nbsp;(sep alphanum{2,8})+ ;</code></td>
				<td><code>= sep (DIGIT<br>
				&nbsp;&nbsp;/ %x61-73<br>
				&nbsp;&nbsp;/ %x76-77<br>
				&nbsp;&nbsp;/ %x79-7A)<br>
				&nbsp;&nbsp;1*(sep 2*8alphanum)</code></td>
989			</tr>
990			<tr>
991				<td><code>keyword</code></td>
992				<td><code>= key (sep type)? ;</code></td>
993				<td><code>= key [sep type]</code></td>
994			</tr>
995		  <tr>
996				<td><code>key</code></td>
997				<td><code>
998						= alphanum alpha ;
999					</code></td>
1000				<td><code>
1001						= alphanum ALPHA
1002					</code></td>
1003				<td><code>
1004						<a href="#Key_Type_Definitions">validity</a><br>
1005				<a
1006							href='http://unicode.org/cldr/latest/common/bcp47'>latest-data</a>
1007					</code></td>
1008			</tr>
1009			<tr>
1010				<td><code>type</code></td>
1011				<td><code>
1012						= alphanum{3,8}<br>  (sep alphanum{3,8})* ;
1013					</code></td>
1014				<td><code>
1015						= 3*8alphanum<br>  *(sep 3*8alphanum)
1016					</code></td>
1017				<td><code>
1018						<a href="#Key_Type_Definitions">validity</a><br>
1019				<a
1020							href='http://unicode.org/cldr/latest/common/bcp47'>latest-data</a>
1021					</code></td>
1022			</tr>
1023			<tr>
1024				<td><code>attribute</code></td>
1025				<td><code>= alphanum{3,8} ;</code></td>
1026				<td><code>= 3*8alphanum</code></td>
1027			</tr>
1028			<tr>
1029				<td><code>
1030						<a name="unicode_subdivision_id" href="#unicode_subdivision_id">unicode_subdivision_id</a><a
1031							name="unicode_subdivision_subtag"></a><a
1032							name="subdivision_attribute"></a>
1033					</code></td>
1034				<td><code>
1035						= <a href="#unicode_region_subtag">unicode_region_subtag</a> unicode_subdivision_suffix ;
1036					</code></td>
1037				<td><code>
1038						= <a href="#unicode_region_subtag">unicode_region_subtag</a> unicode_subdivision_suffix
1039					</code></td>
1040				<td><code>
1041						<a href='#unicode_subdivision_subtag_validity'>validity</a><br>
1042						<a
1043							href='http://unicode.org/cldr/latest/common/validity/subdivision.xml'>latest-data</a>
1044					</code></td>
1045
1046			</tr>
1047			<tr>
1048				<td><code>unicode_subdivision_suffix</code></td>
				<td><code>= alphanum{1,4} ;</code></td>
1050				<td><code>= 1*4alphanum</code></td>
1051			</tr>
1052			<tr>
1053				<td><code>
1054						<a name="unicode_measure_unit" href="#unicode_measure_unit">unicode_measure_unit</a>
1055					</code></td>
1056				<td><code>
1057						= alphanum{3,8}<br>   (sep alphanum{3,8})* ;
1058					</code></td>
1059				<td><code>
1060						= 3*8alphanum<br>   *(sep 3*8alphanum)
1061					</code></td>
1062				<td><code>
1063						<a href='#Validity_Data'>validity</a><br>
1064				<a
1065							href='http://unicode.org/cldr/latest/common/validity/unit.xml'>latest-data</a>
1066					</code></td>
1067			</tr>
1068			<tr>
1069				<td><code>tlang</code></td>
1070				<td><code>
1071					= unicode_language_subtag<br>   (sep unicode_script_subtag)?<br>   (sep unicode_region_subtag)?<br>   (sep unicode_variant_subtag)* ; </code></td>
1072				<td><code>
1073						= unicode_language_subtag <br>   [sep unicode_script_subtag] <br>   [sep unicode_region_subtag] <br>
1074						 
1075					*(sep unicode_variant_subtag) </code></td>
1076			</tr>
1077			<tr>
1078				<td><code>tfield</code></td>
1079				<td><code>
1080						= tkey tvalue;
1081					</code></td>
1082				<td><code>
1083						= tkey tvalue
1084					</code></td>
1085				<td><code>
1086						<a href="#BCP47_T_Extension">validity</a><br>
1087				<a
1088							href='http://unicode.org/cldr/latest/common/bcp47'>latest-data</a>
1089					</code></td>
1090
1091			</tr>
1092			<tr>
1093				<td><code>
1094						tkey
1095					</code></td>
1096				<td><code>
1097						= alpha digit ;
1098					</code></td>
1099				<td><code>= ALPHA DIGIT</code></td>
1100			</tr>
1101			<tr>
1102				<td><code>
1103						tvalue
1104					</code></td>
1105				<td><code>= (sep alphanum{3,8})+ ;</code></td>
1106				<td><code>= 1*(sep 3*8alphanum)</code></td>
1107			</tr>
1108		</table>
1109
1110		<p>
1111			For historical reasons, this is called a Unicode locale identifier.
1112			However, it really functions (with few exceptions) as a <span
1113				class="st">language</span> identifier, and accesses <span class="st">language</span>-based
1114			data. Except where it would be unclear, this document uses the term
1115			&quot;locale&quot; data loosely to encompass both types of data: for
1116			more information, see <i><a href="#Language_and_Locale_IDs">Section
1117					3.10 Language and Locale IDs</a></i>.
1118		</p>
1119		<p></p>
1120		<p>As of the release of this specification, there were no other_extensions defined. The other_extensions are present in the syntax to allow implementations to preserve that information. There cannot be more than one extension with the same singleton (-u-, -t-, ...). The private use extension must come after all other extensions.
1121		</p>
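		<p>The constraint on repeated singletons can be checked mechanically.
			The following is a minimal sketch in Python, assuming a syntactically
			well-formed identifier; the function names are hypothetical. Because
			every subtag after &quot;x&quot; belongs to the private use
			extension, scanning stops at the first &quot;x&quot;.</p>

```python
import re


def extension_singletons(locale_id: str) -> list:
    """List the extension singletons of an identifier, in order of appearance."""
    singletons = []
    for subtag in re.split("[-_]", locale_id.lower()):
        if len(subtag) == 1:
            singletons.append(subtag)
            if subtag == "x":
                # Everything after "x" is private use, including 1-letter subtags.
                break
    return singletons


def has_unique_singletons(locale_id: str) -> bool:
    """Return True if no extension singleton occurs more than once."""
    singletons = extension_singletons(locale_id)
    return len(singletons) == len(set(singletons))
```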
1122		<p>As for terminology, the term <i>code</i> may also be used instead of
1123			&quot;subtag&quot;, and &quot;territory&quot; instead of
1124			&quot;region&quot;. The primary language subtag is also called the <i>base
1125				language code</i>. For example, the base language code for
1126			&quot;en-US&quot; (American English) is &quot;en&quot; (English). The
1127			<i>type</i> may also be referred to as a <i>value</i> or <i>key-value</i>.
1128		</p>
1129		<p>
1130			The identifiers can vary in case and in the separator characters. The
1131			&quot;-&quot; and &quot;_&quot; separators are treated as equivalent, although &quot;-&quot; is preferred.</p>
1132	  <p>All identifier field values are case-insensitive. Although case
1133			distinctions do not carry any special meaning, an implementation of
1134			LDML should use the casing recommendations in [<a href="#BCP47">BCP47</a>],
1135			especially when a Unicode locale identifier is used for locale data
1136		exchange in software protocols.</p>
1137	  <p>The canonical form of a <code><a href="#unicode_locale_id">unicode_locale_id</a></code> has:</p>
1138	  <ul>
	    <li>a language subtag (identifiers beginning with a script subtag are for specialized use only)</li>
1140	    <li>any script subtag  in title case (eg, Hant)</li>
1141	    <li>any region subtag  in uppercase (eg, DE)</li>
1142        <li>all other subtags  in lowercase (eg, en)</li>
1143	    <li>any variants in alphabetical order (eg, en-fonipa-scouse, not en-scouse-fonipa)</li>
1144	    <li>any extensions in alphabetical order by their singleton (eg, en-t-xxx-u-yyy, not en-u-yyy-t-xxx)</li>
1145      </ul>
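		<p>The casing and ordering rules listed above can be sketched as
			follows. This is a minimal illustration in Python, assuming a
			syntactically well-formed ASCII identifier; the function
			<code>canonical_form</code> and its helpers are hypothetical. It
			applies only the rules in the list (it does not replace aliases or
			map &quot;root&quot; to &quot;und&quot;), and it keeps a private use
			extension last.</p>

```python
import re


def _is_script(subtag):
    return len(subtag) == 4 and subtag.isalpha()


def _is_region(subtag):
    return (len(subtag) == 2 and subtag.isalpha()) or (
        len(subtag) == 3 and subtag.isdigit()
    )


def canonical_form(locale_id):
    """Apply the canonical casing, separator, and ordering rules (sketch)."""
    subtags = re.split("[-_]", locale_id.lower())  # lowercase is the default
    first_singleton = next(
        (i for i, s in enumerate(subtags) if len(s) == 1), len(subtags)
    )
    remaining = subtags[:first_singleton]  # the unicode_language_id part
    rest = subtags[first_singleton:]       # extensions and private use

    result = []
    if remaining and (remaining[0] == "root" or not _is_script(remaining[0])):
        result.append(remaining.pop(0))          # language subtag
    if remaining and _is_script(remaining[0]):
        result.append(remaining.pop(0).title())  # script in title case
    if remaining and _is_region(remaining[0]):
        result.append(remaining.pop(0).upper())  # region in uppercase
    result.extend(sorted(remaining))             # variants, alphabetical

    extensions, private_use = [], []
    for index, subtag in enumerate(rest):
        if subtag == "x":
            private_use = rest[index:]           # private use stays last
            break
        if len(subtag) == 1:
            extensions.append([subtag])          # start of a new extension
        else:
            extensions[-1].append(subtag)
    extensions.sort(key=lambda ext: ext[0])      # order by singleton
    for ext in extensions:
        result.extend(ext)
    result.extend(private_use)
    return "-".join(result)
```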
1146		<p>
1147		<b>Note:</b>		    The current version of CLDR data uses some non-preferred forms for backward compatibility. This might be changed in future CLDR releases.</p>
1148		<ul>
1149		  <li>It uses uppercase letters for
1150		    variant subtags, while the preferred forms are all lowercase.</li>
1151		  <li>It uses &quot;_&quot; as the separator, while the preferred form of the separator is  "-".</li>
1152		  <li>It uses &quot;root&quot;, while the preferred form is &quot;und&quot;.</li>
1153      </ul>
1154		<h3>
1155			<a name="BCP_47_Conformance" href="#BCP_47_Conformance">3.3 BCP
1156				47 Conformance</a>
1157		</h3>
1158		<p>
1159			Unicode language and locale identifiers inherit the design and the
1160			repertoire of subtags from [<a href="#BCP47">BCP47</a>] Language
1161			Tags. There are some extensions and restrictions made for the use of
1162			the Unicode locale identifier in CLDR:
1163		</p>
1164		<ul>
1165			<li>It does not allow for the full syntax of [<a href="#BCP47">BCP47</a>]:
1166              <ul>
1167		  <li>No extlang subtags are allowed (as in the BCP 47 canonical form, see BCP 47 <a href="https://tools.ietf.org/html/bcp47#section-4.5">Section 4.5</a> and <a href="https://tools.ietf.org/html/bcp47#section-3.1.7" target="_blank" >Section 3.1.7</a>)</li>
1168		  <li>No irregular  BCP 47 grandfathered tags are allowed (these are all deprecated in BCP 47)</li>
1169		  <li>A tag must not start with the subtag &quot;x&quot;: thus a <em>privateuse</em> (eg x-abc) can only be after a language subtag, like &quot;und&quot;</li>
1170		</ul>
1171			</li>
1172			<li>It allows for certain semantic additions and constraints:
1173				<ul>
1174					<li>Certain codes that are private-use in BCP-47 and ISO are given semantics by LDML</li>
1175					<li>Each macrolanguage has an identified  primary encompassed language, which  is treated as an alias for the macrolanguage, and thus is replaced when canonicalizing (as allowed by BCP 47, see <a href="https://tools.ietf.org/html/bcp47#section-4.1.2">Section 4.1.2</a>)</li>
1176				</ul>
1177			</li>
1178					<li>It allows certain syntax for backwards compatibility  (not BCP 47-compatible):
1179                      <ul>
1180                        <li>The "_" character for field separator characters, as well as the "-" used in [<a href="#BCP47">BCP47</a>]
1181                          (however, the canonical form is with &quot;-&quot;)</li>
1182                        <li>The subtag "root" to indicate the generic locale used as the parent
1183                          of all languages in the CLDR data model				      (&quot;und&quot; can be used instead)</li>
1184                        <li>The language tag may begin with a script subtag rather than a language subtag. This is specialized use only, and  not required for CLDR conformance.</li>
1185                      </ul>
1186		  </li>
1187		</ul>
1188		<p>There are thus two subtypes of Unicode locale identifiers:</p>
1189		<ul>
1190		  <li>the term <em>Unicode CLDR locale identifier</em> applies where the backwards compatibility syntax is used.</li>
1191		  <li>the term <em>Unicode BCP 47 locale identifier</em> applies otherwise. A <em>Unicode BCP 47 locale identifier</em> is  also a valid BCP 47 language tag.</li>
1192      </ul>
1193		<h4>
1194			<a name="BCP_47_Language_Tag_Conversion"
1195				href="#BCP_47_Language_Tag_Conversion">3.3.1 BCP 47 Language Tag
1196				Conversion</a>
1197		</h4>
		<p>The different identifiers can be converted to one another as described in this section.</p>
1200		<h5>
1201			<a name="Language_Tag_to_Locale_Identifier"
1202				href="#Language_Tag_to_Locale_Identifier">BCP 47 Language Tag to Unicode BCP 47 Locale Identifier</a>
1203		</h5>
1204		<p>A valid [<a href="#BCP47">BCP47</a>] language tag can be converted
1205			to a valid Unicode BCP 47 locale identifier by performing the
1206		following transformation. </p>
1207		<ol>
1208			<li>Canonicalize the language tag (afterwards, there will be no
1209		  extlang subtags).</li>
1210			<li>If the BCP 47 primary language subtag matches the <i>type</i>
1211				attribute of a <i>languageAlias</i> element in <a
1212				href="tr35-info.html#Supplemental_Data">Supplemental Data</a>,
1213				replace the language subtag with the <i>replacement</i> value.
1214				<ol>
1215					<li>If there are additional subtags in the <i>replacement</i>
1216						value, add them to the result, but only if there is no
1217						corresponding subtag already in the tag.
1218					</li>
1219				</ol>
1220			</li>
1221			<li>If the BCP 47 region subtag matches the <i>type</i>
1222				attribute of a <i>territoryAlias</i> element in <a
1223				href="tr35-info.html#Supplemental_Data">Supplemental Data</a>,
				replace the region subtag with the <i>replacement</i> value, as
1225				follows:
1226				<ol>
1227					<li>If there is a single territory in the replacement, use it.</li>
1228					<li>If there are multiple territories:
1229						<ol>
1230							<li>Look up the most likely territory for the base language
1231								code (and script, if there is one).</li>
1232							<li>If that likely territory is in the list, use it.</li>
1233							<li>Otherwise, use the first territory in the list.</li>
1234						</ol>
1235					</li>
1236				</ol>
1237			</li>
1238		  <li>If the tag is one of the five deprecated grandfathered tags (cel-gaulish, i-default, i-enochian, i-mingo, zh-min) remaining after step #1,  prefix by &quot;und-x-&quot;.</li>
1239		  <li>If the first subtag is &quot;x&quot;, prefix by &quot;und-&quot;.</li>
1240	  </ol>
1241		<p>The result is  a Unicode BCP 47 locale identifier,  in canonical form. It is both a BCP 47 language tag and a Unicode locale identifier.	Because the process maps from all BCP 47 language tags into a subset of BCP 47 language tags, the format changes are not reversible, much as a lowercase transformation of the string “McGowan” is not reversible.</p>
1242		<br>
1243		<p><em>Examples</em></p>
1244		<table>
1245			<tr>
1246			  <th style='width:10em'>BCP 47 language tag</th>
1247			  <th style='width:10em'>Unicode BCP 47 locale identifier</th>
1248				<th>Comments</th>
1249			</tr>
1250			<tr>
1251				<td><code>en-US</code></td>
1252				<td><code>en-US</code></td>
1253				<td>no changes</td>
1254			</tr>
1255			<tr>
1256			  <td><code>iw-FX</code></td>
1257			  <td><code>he-FR</code></td>
1258			  <td>BCP 47 canonicalization [1]</td>
1259		  </tr>
1260			<tr>
1261			  <td><code>cmn-TW</code></td>
1262			  <td><code>zh-TW</code></td>
1263			  <td>language alias [2]</td>
1264		  </tr>
1265			<tr>
1266				<td><code>zh-cmn-TW</code></td>
1267				<td><code>zh-TW</code></td>
1268				<td>BCP 47 canonicalization [1], then language alias [2]</td>
1269			</tr>
1270			<tr>
1271				<td><code>sr-CS</code></td>
1272				<td><code>sr-RS</code></td>
1273				<td>territory alias [3]</td>
1274			</tr>
1275			<tr>
1276				<td><code>sh</code></td>
1277				<td><code>sr-Latn</code></td>
1278				<td>multiple replacement subtags [2.1]</td>
1279			</tr>
1280			<tr>
1281				<td><code>sh-Cyrl</code></td>
1282				<td><code>sr-Cyrl</code></td>
				<td>language replaced, but the replacement script subtag is not added because a script subtag is already present [2.1]</td>
1284			</tr>
1285			<tr>
1286				<td><code>hy-SU</code></td>
1287				<td><code>hy-AM</code></td>
1288				<td>multiple territory values [3.2]<br> <code>&lt;territoryAlias
1289						type=&quot;SU&quot; replacement=&quot;RU AM AZ BY EE GE KZ KG LV
1290						LT MD TJ TM UA UZ&quot; …/&gt;</code></td>
1291			</tr>
1292          <tr>
1293	          <td><code>i-enochian</code></td>
1294	          <td><code>und-x-i-enochian</code></td>
1295	          <td>prefix any grandfathered tags with &quot;und-x-&quot; [4]</td>
1296          </tr>
1297	      <tr>
1298			  <td><code>x-abc</code></td>
1299			  <td><code>und-x-abc</code></td>
1300			  <td>prefix with &quot;und-&quot;, so that there is always a base language subtag [5]</td>
1301		  </tr>
1302	  </table>
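	  <p>Steps 2 through 5 of the transformation can be sketched as follows.
	    This is a minimal illustration in Python, assuming the input has
	    already been canonicalized per BCP 47 (step 1) and uses &quot;-&quot;
	    separators. The alias and likely-territory tables are small
	    illustrative stand-ins rather than actual CLDR data (only the
	    &quot;SU&quot; replacement list is taken from the example above), the
	    likely territory is looked up by language only, and the function name
	    is hypothetical. The grandfathered and private-use checks are done
	    first here for simplicity, which gives the same result.</p>

```python
# Illustrative stand-ins for languageAlias / territoryAlias / likely subtags.
LANGUAGE_ALIAS = {"cmn": "zh", "sh": "sr-Latn"}
TERRITORY_ALIAS = {
    "CS": ["RS", "ME"],
    "SU": "RU AM AZ BY EE GE KZ KG LV LT MD TJ TM UA UZ".split(),
}
LIKELY_TERRITORY = {"hy": "AM", "sr": "RS"}
GRANDFATHERED = {"cel-gaulish", "i-default", "i-enochian", "i-mingo", "zh-min"}


def to_unicode_bcp47_locale_id(tag):
    """Convert a canonicalized BCP 47 tag to a Unicode BCP 47 locale id (sketch)."""
    if tag.lower() in GRANDFATHERED:       # step 4
        return "und-x-" + tag
    subtags = tag.split("-")
    if subtags[0].lower() == "x":          # step 5
        return "und-" + tag

    language = subtags.pop(0).lower()
    script = region = ""
    if subtags and len(subtags[0]) == 4 and subtags[0].isalpha():
        script = subtags.pop(0).title()
    if subtags and (
        (len(subtags[0]) == 2 and subtags[0].isalpha())
        or (len(subtags[0]) == 3 and subtags[0].isdigit())
    ):
        region = subtags.pop(0).upper()

    if language in LANGUAGE_ALIAS:         # step 2
        replacement = LANGUAGE_ALIAS[language].split("-")
        language = replacement[0]
        for extra in replacement[1:]:      # step 2.1: only fill empty slots
            if len(extra) == 4 and not script:
                script = extra
            elif len(extra) != 4 and not region:
                region = extra

    if region in TERRITORY_ALIAS:          # step 3
        candidates = TERRITORY_ALIAS[region]
        likely = LIKELY_TERRITORY.get(language)
        region = likely if likely in candidates else candidates[0]

    parts = [language, script, region, *subtags]
    return "-".join(part for part in parts if part)
```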
1303	  <p>&nbsp;</p>
1304	  		<h5>
1305			<a name="Unicode_Locale_Identifier_CLDR_to_BCP_47"
1306				href="#Unicode_Locale_Identifier_CLDR_to_BCP_47">Unicode Locale Identifier: CLDR to BCP 47</a>
1307		</h5>
1308
1309	  <p>A Unicode CLDR locale identifier can be converted to a valid [<a
1310				href="#BCP47">BCP47</a>] language tag (which is also a Unicode BCP 47 locale identifier) by performing the following
1311	  transformation. </p>
1312      <ol>
1313        <li>Replace the "_" separators with "-"</li>
1314        <li>Replace the special language identifier "root"  with the BCP
          47 primary language subtag "und"</li>
1316        <li>Add an initial &quot;und&quot; primary language subtag if the first subtag is a script.</li>
1317      </ol>
1318      <p><em>Examples:</em></p>
1319      <table>
1320        <tr>
1321          <th style='width:10em'>Unicode CLDR locale identifier</th>
1322          <th style='width:10em'>BCP 47 language tag</th>
1323          <th>Comments</th>
1324        </tr>
1325        <tr>
1326          <td><code>en_US</code></td>
1327          <td><code>en-US</code></td>
1328          <td>change separator [1]</td>
1329        </tr>
1330        <tr>
1331          <td><code>de_DE_u_co_phonebk</code></td>
1332          <td><code>de-DE-u-co-phonebk</code></td>
1333          <td>change separator [1]</td>
1334        </tr>
1335        <tr>
1336          <td><code>root</code></td>
1337          <td><code>und</code></td>
1338          <td>change to &quot;und&quot; [2]</td>
1339        </tr>
1340        <tr>
1341          <td><code>root_u_cu_usd</code></td>
1342          <td><code>und-u-cu-usd</code></td>
1343          <td>change to &quot;und&quot; [1, 2]</td>
1344        </tr>
1345        <tr>
1346          <td><code>Latn_DE</code></td>
1347          <td><code>und-Latn-DE</code></td>
1348          <td>add &quot;und&quot; [1, 3]</td>
1349        </tr>
1350      </table><br>
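      <p>The three steps above can be sketched as follows. This is a minimal
        illustration in Python, assuming a well-formed Unicode CLDR locale
        identifier; the function name <code>cldr_to_bcp47</code> is
        hypothetical.</p>

```python
def cldr_to_bcp47(locale_id):
    """Convert a Unicode CLDR locale identifier to a BCP 47 language tag (sketch)."""
    subtags = locale_id.replace("_", "-").split("-")      # step 1
    if subtags[0].lower() == "root":                      # step 2
        subtags[0] = "und"
    elif len(subtags[0]) == 4 and subtags[0].isalpha():   # step 3: leading script
        subtags.insert(0, "und")
    return "-".join(subtags)
```

      <p>The check for &quot;root&quot; comes first because &quot;root&quot;
        is itself four letters long and would otherwise be mistaken for a
        script subtag.</p>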
1351      <p></p>
1352      	  		<h5>
1353			<a name="Unicode_Locale_Identifier_BCP_47_to_CLDR"
1354				href="#Unicode_Locale_Identifier_BCP_47_to_CLDR">Unicode Locale Identifier: BCP 47 to  CLDR</a>
1355		</h5>
1356
1357	  <p>A Unicode BCP 47 locale identifier can be transformed into a Unicode CLDR locale identifier by performing the following transformation.</p>
1358        <ol>
          <li>Change the separator to &quot;_&quot;.</li>
          <li>Replace the primary language subtag "und" with "root"
            if no script, region, or variant subtags are present.</li>
1362        </ol>
1363	  <p><em>Examples:</em></p>
1364		<table>
1365		  <tr>
1366		    <th style='width:10em'>BCP 47 language tag</th>
1367		    <th style='width:10em'>Unicode CLDR locale identifier</th>
1368		    <th>Comments</th>
1369	      </tr>
1370		  <tr>
1371		    <td><code>en-US</code></td>
1372		    <td><code>en_US</code></td>
1373		    <td>changes separator [1]</td>
1374	      </tr>
1375		  <tr>
1376		    <td><code>und</code></td>
1377		    <td><code>root</code></td>
		    <td>changes to &quot;root&quot;, because no script, region, or variant subtag is
	        present  [2]</td>
1380	      </tr>
1381		  <tr>
1382		    <td><code>und-US</code></td>
1383			  <td><code>und_US</code></td>
1384		    <td>no change to &quot;und&quot;, because a region subtag is present [1]</td>
1385	      </tr>
1386		  <tr>
		    <td nowrap><code>und-u-cu-usd</code></td>
		    <td nowrap><code>root_u_cu_usd</code></td>
		    <td>changes to &quot;root&quot;, because no script, region, or variant subtag is
		      present [1, 2]</td>
1391	      </tr>
1392		</table>
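		<p>The two steps above can be sketched as follows. This is a minimal
			illustration in Python, assuming a well-formed Unicode BCP 47 locale
			identifier; the function name <code>bcp47_to_cldr</code> is
			hypothetical. A subtag of length one directly after
			&quot;und&quot; is an extension singleton, so in that case no
			script, region, or variant subtag is present.</p>

```python
def bcp47_to_cldr(tag):
    """Convert a Unicode BCP 47 locale identifier to the CLDR form (sketch)."""
    subtags = tag.split("-")
    # Step 2: "und" becomes "root" only when it stands alone or is followed
    # directly by an extension singleton (no script, region, or variant).
    if subtags[0].lower() == "und" and (
        len(subtags) == 1 or len(subtags[1]) == 1
    ):
        subtags[0] = "root"
    return "_".join(subtags)  # step 1: change the separator
```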
1393		<h3>
1394			<a name="Field_Definitions" href="#Field_Definitions">3.4
1395				Language Identifier Field Definitions </a>
1396		</h3>
1397		<p>
1398			Unicode language and locale identifier field values are provided in
1399			the following table. Note that some private-use BCP 47 field values
1400			are given specific meanings in CLDR. While field values are based on
1401			[<a href="#BCP47">BCP47</a>] subtag values, their validity status in
1402			CLDR is specified by means of machine-readable files in the <a
1403				href='http://unicode.org/repos/cldr/tags/latest/common/validity/'>common/validity/</a>
1404			subdirectory, such as language.xml. For the format of those files and
1405			more information, see <em><a href='#Validity_Data'>Section
1406					3.11 Validity Data</a></em>.
1407		</p>
1408		<table>
1409			<caption>
1410				<a name="Language_Locale_Field_Definitions"
1411					href="#Language_Locale_Field_Definitions">Language Identifier
1412					Field Definitions </a>
1413			</caption>
1414			<tr>
1415				<th>Field</th>
1416				<th>Valid values</th>
1417			</tr>
1418			<tr>
1419				<td><a href="#unicode_language_subtag_validity"
1420					name="unicode_language_subtag_validity">unicode_language_subtag</a>
1421					<p>
1422						(also known as a <i>Unicode base language code)</i>
1423					</p></td>
1424				<td>Subtags in the language.xml file (see <em>Section 3.11
1425						<a href="#Validity_Data">Validity Data</a>
1426				</em>). These are based on [<a href="#BCP47">BCP47</a>] subtag values
1427					marked as <b>Type: language</b>
1428					<p>ISO 639-3 introduces the notion of
1429						&quot;macrolanguages&quot;, where certain ISO 639-1 or ISO 639-2
1430						codes are given broad semantics, and additional codes are given
1431						for the narrower semantics. For backwards compatibility, Unicode
1432						language identifiers retain use of the narrower semantics for
1433						these codes. For example:</p>
1434					<table border="1" cellspacing="0" cellpadding="2"
1435						style="margin: 0.5em">
1436						<tr>
1437							<th>For</th>
1438							<th>Use</th>
1439							<th><i>Not</i></th>
1440						</tr>
1441						<tr>
1442							<td>Standard Chinese (Mandarin)</td>
1443							<td><code>zh</code></td>
1444							<td><code>cmn</code></td>
1445						</tr>
1446						<tr>
1447							<td>Standard Arabic</td>
1448							<td><code>ar</code></td>
1449							<td><code>arb</code></td>
1450						</tr>
1451						<tr>
1452							<td>Standard Malay</td>
1453							<td><code>ms</code></td>
1454							<td><code>zsm</code></td>
1455						</tr>
1456						<tr>
1457							<td>Standard Swahili</td>
1458							<td><code>sw</code></td>
1459							<td><code>swh</code></td>
1460						</tr>
1461						<tr>
1462							<td>Standard Uzbek</td>
1463							<td><code>uz</code></td>
1464							<td><code>uzn</code></td>
1465						</tr>
1466						<tr>
1467							<td>Standard Konkani</td>
1468							<td><code>kok</code></td>
1469							<td><code>knn</code></td>
1470						</tr>
1471						<tr>
1472							<td>Northern Kurdish</td>
1473							<td><code>ku</code></td>
1474							<td><code>kmr</code></td>
1475						</tr>
1476					</table>
1477					<p>
1478						If a language subtag matches the type attribute of a languageAlias
1479						element, then the replacement value is used instead. For example,
1480						because "swh" occurs in
1481						<tt>&lt;languageAlias type="swh" replacement="sw"/&gt;</tt>
1482						, "sw" must be used instead of "swh". Thus Unicode language
1483						identifiers use &quot;ar-EG&quot; for Standard Arabic (Egypt), not
1484						&quot;arb-EG&quot;; they use &quot;zh-TW&quot; for Mandarin
1485						Chinese (Taiwan), not &quot;cmn-TW&quot;.
1486					</p>
1487					<p>
1488						The private use codes listed as <strong>excluded</strong>
1489						in <em>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></em>
1490						will never be given specific semantics in Unicode identifiers, and
1491					are thus safe for use for other purposes by other applications. </p>
1492					<p>The CLDR provides data for normalizing language/locale
1493						codes, including mapping overlong codes like &quot;eng-840&quot;
1494						or &quot;eng-USA&quot; to the correct code &quot;en-US&quot;;
1495						see the
1496						<strong><a href="https://www.unicode.org/cldr/charts/latest/supplemental/aliases.html">Aliases</a></strong>
1497						Chart.</p>
1498					<p>The following are special language subtags:</p>
1499                    <table class="simple" border="1" cellspacing="0" cellpadding="2">
1500                      <tr>
1501                        <td>&nbsp;</td>
1502                        <td><strong>Name</strong></td>
1503                        <td><strong>Comment</strong></td>
1504                      </tr>
1505                      <tr>
1506                        <td><code>mis</code></td>
1507                        <td>Uncoded languages</td>
1508                        <td>The content is in a language that doesn't yet have an ISO 639 code.</td>
1509                      </tr>
1510                      <tr>
1511                        <td><code>mul</code></td>
1512                        <td>Multiple languages</td>
1513                        <td>The content contains  more than one language or text that is simultaneously in multiple languages (such as brand names).</td>
1514                      </tr>
1515                      <tr>
1516                        <td><code>zxx</code></td>
1517                        <td>No linguistic content</td>
                        <td>The content is not in any particular language (such as images, symbols, etc.)</td>
1519                      </tr>
1520                    </table></td>
1521			</tr>
1522			<tr>
1523				<td><a href="#unicode_script_subtag_validity"
1524					name="unicode_script_subtag_validity">unicode_script_subtag</a>
1525					<p>
1526						(also known as a <i>Unicode script code</i>)
1527					</p></td>
1528				<td>Subtags in the script.xml file (see <em>Section 3.11 <a
1529						href="#Validity_Data">Validity Data</a></em>). These are based on [<a
1530					href="#BCP47">BCP47</a>] subtag values marked as <b>Type:
1531						script</b>
1532					<p>In most cases the script is not necessary, since the
1533						language is customarily written in only a single script. Examples
1534						of cases where it is used are:</p>
1535					<table border="1" cellspacing="0" cellpadding="2"
1536						style="margin: 0.5em">
1537						<tr>
1538							<td><code>az_Arab</code></td>
1539							<td>Azerbaijani in Arabic script</td>
1540						</tr>
1541						<tr>
1542							<td><code>az_Cyrl</code></td>
1543							<td>Azerbaijani in Cyrillic script</td>
1544						</tr>
1545						<tr>
1546							<td><code>az_Latn</code></td>
1547							<td>Azerbaijani in Latin script</td>
1548						</tr>
1549						<tr>
1550							<td><code>zh_Hans</code></td>
1551							<td>Chinese, in simplified script (=zh, zh-Hans, zh-CN,
1552								zh-Hans-CN)</td>
1553						</tr>
1554						<tr>
1555							<td><code>zh_Hant</code></td>
1556							<td>Chinese, in traditional script</td>
1557						</tr>
1558					</table>
1559					<p>
1560						Unicode identifiers give specific semantics to certain Unicode Script values. For more information, see also [<a
1561							href="http://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]:
1562					</p>
1563					<table cellspacing="0" cellpadding="2" border="1"
1564						style="margin: 0.5em">
1565						<tr>
1566						  <td><code>Qaag</code></td>
1567						  <td>Zawgyi</td>
1568						  <td colspan="2">Qaag is a special script code for identifying the non-standard use of Myanmar characters for display with the Zawgyi font. The purpose of the code is to enable migration to standard, interoperable use of Unicode by providing an identifier for Zawgyi for tagging text, applications, input methods, font tables, transformations, and other mechanisms used for migration.</td>
1569					  </tr>
1570						<tr>
1571							<td><code>Qaai</code></td>
1572							<td>Inherited</td>
1573							<td colspan="2"><strong>deprecated</strong>: the <em>canonicalized</em>
1574								form is Zinh</td>
1575						</tr>
1576					  <tr>
1577							<td><code>Zinh</code></td>
1578							<td>Inherited</td>
1579							<td colspan="2">&nbsp;</td>
1580						</tr>
1581						<tr>
1582							<td><code>Zsye</code></td>
1583							<td>Emoji Style</td>
1584							<td colspan="2">Prefer emoji style for characters that have both text
1585								and emoji styles available.</td>
1586						</tr>
1587						<tr>
1588							<td><code>Zsym</code></td>
1589							<td>Text Style</td>
1590							<td colspan="2">Prefer text style for characters that have both text and
1591								emoji styles available.</td>
1592						</tr>
1593						<tr>
1594						  <td rowspan="7"><code>Zxxx</code></td>
1595						  <td rowspan="7">Unwritten</td>
1596						  <td colspan="2">Indicates spoken or otherwise unwritten content. For example:</td>
1597					  </tr>
1598						<tr>
1599						  <th>Sample(s)</th>
1600						  <th>Description</th>
1601				      </tr>
1602						<tr>
1603						  <td>uz</td>
1604						  <td>either written or spoken content</td>
1605				      </tr>
1606						<tr>
1607						  <td>uz-Latn <em>or</em> uz-Arab</td>
1608						  <td>written-only content (particular script)</td>
1609				      </tr>
1610						<tr>
1611						  <td>uz-Zyyy</td>
1612						  <td>written-only content (unspecified script)</td>
1613				      </tr>
1614						<tr>
1615						  <td>uz-Zxxx</td>
1616						  <td>spoken-only content</td>
1617				      </tr>
1618						<tr>
1619						  <td>uz-Latn, uz-Zxxx</td>
1620						  <td>both specific written and spoken content (using a <em>language list</em>)</td>
1621				      </tr>
1622						<tr>
1623							<td><code>Zyyy</code></td>
1624							<td>Common</td>
1625							<td colspan="2">&nbsp;</td>
1626						</tr>
1627						<tr>
1628							<td><code>Zzzz</code></td>
1629							<td>Unknown</td>
1630							<td colspan="2">&nbsp;</td>
1631						</tr>
1632					</table>
1633					<p>The private use subtags listed as <strong>excluded</strong> in <em>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></em> will never be given
1634						specific semantics in Unicode identifiers, and are thus safe for
1635			  use for other purposes by other applications.</p></td>
1636			</tr>
1637			<tr>
1638				<td><a href="#unicode_region_subtag_validity"
1639					name="unicode_region_subtag_validity">unicode_region_subtag</a>
1640					<p>
1641						(also known as a <i>Unicode region code</i>, or a <i>Unicode
1642							territory code</i>)
1643					</p></td>
1644				<td>Subtags in the region.xml file (see<em> Section 3.11 <a
1645						href="#Validity_Data">Validity Data</a></em>). These are based on [<a
1646					href="#BCP47">BCP47</a>] subtag values marked as <b>Type:
1647						region</b>
1648					<p>Unicode identifiers give specific semantics to the following
1649						subtags:</p>
1650					<table border="1" cellspacing="0" cellpadding="2">
1651						<tr>
1652							<td>&nbsp;</td>
1653							<td><strong>Name</strong></td>
1654							<td><strong>Comment</strong></td>
1655							<td><strong> ISO 3166-1 status</strong></td>
1656						</tr>
1657						<tr>
1658							<td><code>QO</code></td>
1659							<td>Outlying Oceania</td>
1660							<td>countries in Oceania [009] that do not have a <a
1661								href="http://www.unicode.org/cldr/charts/latest/supplemental/territory_containment_un_m_49.html">subcontinent</a>.
1662							</td>
1663							<td>private use</td>
1664						</tr>
1665						<tr>
1666							<td><code>QU</code></td>
1667							<td>European Union</td>
1668							<td><strong>deprecated</strong>: the <em>canonicalized</em>
1669								form is EU</td>
1670							<td>private use</td>
1671						</tr>
1672						<tr>
1673							<td><code>UK</code></td>
1674							<td>United Kingdom</td>
1675							<td><strong>deprecated</strong>: the <em>canonicalized</em>
1676								form is GB</td>
1677							<td>exceptionally reserved</td>
1678						</tr>
1679						<tr>
1680						  <td><code>XA</code></td>
1681						  <td>Pseudo-Accents</td>
1682						  <td>special code indicating a derived testing locale with English text that has added accents and is lengthened</td>
1683						  <td>private use</td>
1684					  </tr>
1685						<tr>
1686						  <td><code>XB</code></td>
1687						  <td>Pseudo-Bidi</td>
1688						  <td>special code indicating a derived testing locale with forced RTL English</td>
1689						  <td>private use</td>
1690					  </tr>
1691						<tr>
1692							<td><code>XK</code></td>
1693							<td>Kosovo</td>
1694							<td>industry practice</td>
1695							<td>private use</td>
1696						</tr>
1697						<tr>
1698							<td><code>ZZ</code></td>
1699							<td>Unknown or Invalid Territory</td>
1700							<td>used in APIs or as replacement for invalid code</td>
1701							<td>private use</td>
1702						</tr>
1703					</table>
1704					<p>The private use subtags listed as <strong>excluded</strong> in <em>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></em> will normally never be
1705						given specific semantics in Unicode identifiers, and are thus safe
1706						for use for other purposes by other applications. However, LDML
1707						may follow widespread industry practice in the use of some of
1708						these codes, such as for XK.</p>
1709					<p>The CLDR provides data for normalizing territory/region
1710						codes, including mapping overlong codes like &quot;eng-840&quot;
1711						or &quot;eng-USA&quot; to the correct code &quot;en-US&quot;.</p>
1712					<p>Special Codes:</p>
1713					<ul>
1714						<li>The territory code 'UK' has a special status in ISO, and
1715							is used for the domain name instead of GB. It is thus recognized
1716							by CLDR as being an alternate (unnormalized) form of 'GB'.</li>
1717						<li>The territory code '001' (the World) is used to indicate
1718							a standardized form, such as &quot;ar-001&quot; for Modern
1719							Standard Arabic.</li>
1720					</ul></td>
1721			</tr>
1722			<tr>
1723				<td><a href="#unicode_variant_subtag_validity"
1724					name="unicode_variant_subtag_validity">unicode_variant_subtag</a>
1725					<p>
1726						(also known as a <i>Unicode language variant code</i>)
1727					</p></td>
1728				<td>Subtags in the variant.xml file (see<em> Section 3.11
1729						<a href="#Validity_Data">Validity Data</a>
1730				</em>). These are based on [<a href="#BCP47">BCP47</a>] subtag values
1731					marked as <b>Type: variant</b>
1732					<p>
1733						CLDR provides data for normalizing variant codes. About handling
1734						of the "POSIX" variant see <i>Section 3.8.2, <a
1735							href="#Legacy_Variants">Legacy Variants</a></i>.
1736					</p></td>
1737			</tr>
1738		</table>
1739		<p>
1740			<i>Examples:</i>
1741		</p>
1742		<blockquote>
1743			<pre>en
1744fr_BE
1745zh-Hant-HK</pre>
1746		</blockquote>
1747		<p>
1748			<em>Deprecated</em> codes—such as QU above—are valid, but strongly
1749			discouraged.
1750		</p>
1751		<p>
1752			A locale that only has a language subtag (and optionally a script
1753			subtag) is called a <i>language locale</i>; one with both language
1754			and territory subtags is called a <i>territory locale</i> (or <i>country
1755				locale</i>).
1756		</p>
1757		<h3>
1758			<a name="Special_Codes" href="#Special_Codes">3.5 Special Codes</a>
1759		</h3>
1760
1761		<h4>
1762			<a name="Unknown_or_Invalid_Identifiers"
1763				href="#Unknown_or_Invalid_Identifiers">3.5.1 Unknown or Invalid
1764				Identifiers</a>
1765		</h4>
1766		<p>The following identifiers are used to indicate an unknown or
1767			invalid code in Unicode language and locale identifiers. For Unicode
1768			identifiers, the region code uses a private use ISO 3166 code, and
1769			the time zone code uses an additional code; the others are defined by the
1770			relevant standards. When these codes are used in APIs connected with
1771			Unicode identifiers, the meaning is that either there was no
1772			identifier available, or that at some point an input identifier value
1773			was determined to be invalid or ill-formed.</p>
1774		<table border="1" cellspacing="0" cellpadding="4"
1775			style="margin-top: 0.5em; margin-bottom: 0.5em" id="table4">
1776			<tr>
1777				<th>Code Type</th>
1778				<th>Value</th>
1779				<th>Description in Referenced Standards</th>
1780			</tr>
1781			<tr>
1782				<td>Language</td>
1783				<td><code>und</code></td>
1784				<td>Undetermined language, also used for “root”</td>
1785			</tr>
1786			<tr>
1787				<td>Script</td>
1788				<td><code>Zzzz</code></td>
1789				<td>Code for uncoded script, Unknown [<a
1790					href="http://www.unicode.org/reports/tr41/#UAX24">UAX24</a>]
1791				</td>
1792			</tr>
1793			<tr>
1794				<td>Region&nbsp;&nbsp;</td>
1795				<td><code>ZZ</code></td>
1796				<td>Unknown or Invalid Territory</td>
1797			</tr>
1798			<tr>
1799				<td>Currency</td>
1800				<td><code>XXX</code></td>
1801				<td>The codes assigned for transactions where no currency is
1802					involved</td>
1803			</tr>
1804			<tr>
1805				<td>Time Zone</td>
1806				<td><code>unk</code></td>
1807				<td>Unknown or Invalid Time Zone</td>
1808			</tr>
1809			<tr>
1810				<td>Subdivision</td>
1811				<td><em>&lt;region&gt;</em>zzzz</td>
1812				<td>Unknown or Invalid Subdivision</td>
1813			</tr>
1814		</table>
1815		<p>When only the script or region is known, a locale ID will
1816			use &quot;und&quot; as the language subtag portion. Thus the locale
1817			tag &quot;und_Grek&quot; represents the Greek script;
1818			&quot;und_US&quot; represents the US territory.</p>
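		<p>The rule above can be illustrated with a minimal Python sketch. The
			function name <code>make_locale_id</code> and its parameters are
			hypothetical and are not part of LDML; the sketch only shows
			&quot;und&quot; filling in for a missing language subtag.</p>

```python
def make_locale_id(language=None, script=None, region=None, sep="_"):
    """Join the known subtags, using "und" when the language is unknown.

    Hypothetical helper for illustration only; it performs no validation
    of the subtags it is given.
    """
    parts = [language or "und"]
    if script:
        parts.append(script)
    if region:
        parts.append(region)
    return sep.join(parts)


greek_script_only = make_locale_id(script="Grek")   # "und_Grek"
us_territory_only = make_locale_id(region="US")     # "und_US"
```

		<p>Passing <code>sep="-"</code> would give the hyphenated form, which
			is the separator used in a BCP 47 language tag.</p>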
1819		<h4>
1820			<a name="Numeric_Codes" href="#Numeric_Codes">3.5.2 Numeric Codes</a>
1821		</h4>
1822		<p>For region codes, ISO and the UN establish a mapping to
1823			three-letter codes and numeric codes. However, this does not extend
1824			to the private use codes, which are the codes 900-999 (total: 100),
1825			and AAA-AAZ, QMA-QZZ, XAA-XZZ, and ZZA-ZZZ (total: 1092). Unicode identifiers
1826			supply a standard mapping to these: for the numeric codes, the mapping uses
1827			the top of the numeric private use range; for the 3-letter codes, it
1828			doubles the final letter. These are the resulting mappings for all of
1829			the private use region codes:</p>
1830		<table border="1" cellspacing="0" cellpadding="4"
1831			style="margin-top: 0.5em; margin-bottom: 0.5em" id="table19">
1832			<tr>
1833				<th>Region</th>
1834				<th>UN/ISO Numeric</th>
1835				<th>ISO 3-Letter</th>
1836			</tr>
1837			<tr>
1838				<td><code>AA</code></td>
1839				<td><code>958</code></td>
1840				<td><code>AAA</code></td>
1841			</tr>
1842			<tr>
1843				<td><code>QM..QZ</code></td>
1844				<td><code>959..972</code></td>
1845				<td><code>QMM..QZZ</code></td>
1846			</tr>
1847			<tr>
1848				<td><code>XA..XZ</code></td>
1849				<td><code>973..998</code></td>
1850				<td><code>XAA..XZZ</code></td>
1851			</tr>
1852			<tr>
1853				<td><code>ZZ</code></td>
1854				<td><code>999</code></td>
1855				<td><code>ZZZ</code></td>
1856			</tr>
1857		</table>
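		<p>As a rough illustration, the table above can be reproduced with a
			short Python sketch. The function name
			<code>private_use_region_codes</code> is hypothetical; the sketch
			assumes only what the table states, namely that the numeric codes are
			consecutive and end at 999, and that the 3-letter code repeats the
			final letter of the 2-letter code.</p>

```python
import string


def private_use_region_codes():
    """Map each private use region code to its (numeric, 3-letter) forms.

    Illustrative sketch only: the list of regions is taken from the table
    above (AA, QM..QZ, XA..XZ, ZZ).
    """
    letters = string.ascii_uppercase
    regions = (
        ["AA"]
        + ["Q" + c for c in letters[letters.index("M"):]]  # QM..QZ
        + ["X" + c for c in letters]                        # XA..XZ
        + ["ZZ"]
    )
    # Numeric codes occupy the top of the private use range, ending at 999.
    first_numeric = 999 - len(regions) + 1
    return {
        region: (str(first_numeric + i), region + region[-1])
        for i, region in enumerate(regions)
    }


codes = private_use_region_codes()
```

		<p>With this sketch, <code>codes["AA"]</code> is
			<code>("958", "AAA")</code> and <code>codes["ZZ"]</code> is
			<code>("999", "ZZZ")</code>, matching the first and last rows of the
			table.</p>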
1858		<p>For script codes, ISO 15924 supplies a mapping (however, the
1859			numeric codes are not in common use):</p>
1860		<table border="1" cellspacing="0" cellpadding="4"
1861			style="margin-top: 0.5em; margin-bottom: 0.5em" id="table21">
1862			<tr>
1863				<th>Script</th>
1864				<th>Numeric</th>
1865			</tr>
1866			<tr>
1867				<td><code>Qaaa..Qabx</code></td>
1868				<td><code>900..949</code></td>
1869			</tr>
1870		</table>
1871		<br>
1872		<h4>
1873			3.5.3 <a name="Private_Use" href="#Private_Use">Private Use Codes</a>
1874		</h4>
1875		<p>Private use codes fall into three groups.</p>
1876		<ul>
1877			<li><strong>defined:</strong> those that are given particular
1878				semantics currently in CLDR</li>
1879			<li><strong>reserved:</strong> those that may be given
1880				particular semantics in future versions of CLDR</li>
1881			<li><strong>excluded:</strong> those that will never be given
1882				particular CLDR semantics in the future, and thus can normally be
1883				used by applications without worrying about collisions. However,
1884				CLDR may follow widespread industry practice in the use of some of
1885				these codes, such as for XA, XB, and XK.</li>
1886		</ul>
1887		<table>
1888			<caption>
1889				<a name="Private_Use_CLDR" href="#Private_Use_CLDR">Private Use
1890					Codes in CLDR</a>
1891			</caption>
1892			<tr>
1893				<th>category</th>
1894				<th>status</th>
1895				<th>codes</th>
1896			</tr>
1897			<tr>
1898				<td rowspan="3">base language</td>
1899				<td>defined</td>
1900				<td>none</td>
1901			</tr>
1902			<tr>
1903				<td>reserved</td>
1904				<td>qaa..qfy</td>
1905			</tr>
1906			<tr>
1907				<td>excluded</td>
1908				<td>qfz..qtz</td>
1909			</tr>
1910			<tr>
1911				<td rowspan="3">script</td>
1912				<td>defined</td>
1913				<td>Qaai (obsolete), Qaag</td>
1914			</tr>
1915			<tr>
1916				<td>reserved</td>
1917				<td>Qaaa..Qaaf, Qaah, Qaaj..Qaap</td>
1918			</tr>
1919			<tr>
1920				<td>excluded</td>
1921				<td>Qaaq..Qabx</td>
1922			</tr>
1923			<tr>
1924				<td rowspan="3">region</td>
1925				<td>defined</td>
1926				<td>QO, QU, UK, XA, XB, XK, ZZ</td>
1927			</tr>
1928			<tr>
1929				<td>reserved</td>
1930				<td>AA, QM..QN, QP..QT, QV..QZ</td>
1931			</tr>
1932			<tr>
1933				<td>excluded</td>
1934				<td>XC..XJ, XL..XZ</td>
1935			</tr>
1936			<tr>
1937				<td rowspan="3">timezone</td>
1938				<td>defined</td>
1939				<td>IANA: Etc/Unknown<br>
1940					bcp47: as listed in bcp47/timezone.xml
1941				</td>
1942			</tr>
1943			<tr>
1944				<td>reserved</td>
1945				<td>bcp47: all non-5 letter codes not starting with x</td>
1946			</tr>
1947			<tr>
1948				<td>excluded</td>
1949				<td>bcp47: all non-5 letter codes starting with x</td>
1950			</tr>
1951		</table>
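		<p>The region rows of the table above could be encoded as a simple
			lookup, sketched here in Python. The names <code>DEFINED</code>,
			<code>RESERVED</code>, <code>EXCLUDED</code>, and
			<code>region_private_use_status</code> are hypothetical; the code sets
			are copied from this version of the table and would need to be updated
			if the table changes.</p>

```python
def _code_range(first, last):
    """Two-letter codes from first to last that share the same first letter."""
    return {first[0] + chr(c) for c in range(ord(first[1]), ord(last[1]) + 1)}


# Region rows of the "Private Use Codes in CLDR" table.
DEFINED = {"QO", "QU", "UK", "XA", "XB", "XK", "ZZ"}
RESERVED = (
    {"AA"}
    | _code_range("QM", "QN")
    | _code_range("QP", "QT")
    | _code_range("QV", "QZ")
)
EXCLUDED = _code_range("XC", "XJ") | _code_range("XL", "XZ")


def region_private_use_status(code):
    """Return "defined", "reserved", "excluded", or None if not listed."""
    code = code.upper()
    if code in DEFINED:
        return "defined"
    if code in RESERVED:
        return "reserved"
    if code in EXCLUDED:
        return "excluded"
    return None
```

		<p>An application choosing a code for its own purposes would, under
			this sketch, look for codes whose status is
			<code>"excluded"</code>.</p>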
1952		<p>
1953			See also <em>Section 3.5.1 <a
1954				href="#Unknown_or_Invalid_Identifiers">Unknown or Invalid
1955					Identifiers</a></em>.
1956		</p>
1957		<p></p>
1958		<h3>
1959			<a name="Locale_Extension_Key_and_Type_Data"></a><a
1960				name="u_Extension" href="#u_Extension">3.6 Unicode BCP 47 U
1961				Extension</a>
1962		</h3>
1963		<p>
1964			[<a href="#BCP47">BCP47</a>] Language Tags provides a mechanism for
1965			extending language tags for use in various applications by extension
1966			subtags. Each extension subtag is identified by a single alphanumeric
1967			character subtag assigned by IANA.
1968		</p>
1969		<p>
1970			The Unicode Consortium has registered and is the maintaining
1971			authority for two BCP 47 language tag extensions: the extension 'u'
1972			for Unicode locale extension [<a href="#RFC6067">RFC6067</a>] and
1973			extension 't' for transformed content [<a href="#RFC6497">RFC6497</a>].
1974			The Unicode BCP 47 extension data defines the complete list of valid
1975			subtags.
1976		</p>
1977
1978		<p>
1979			These subtags are all in lowercase (the canonical casing for
1980			these subtags); however, subtags are case-insensitive, and casing does
1981			not carry any specific meaning. All subtags within the Unicode
1982			extensions consist of two to eight alphanumeric characters and
1983			meet the rule
1984			<code>extension</code>
1985			in [<a href="#BCP47">BCP47</a>].
1986		</p>
1987		<p>
1988			<strong>The -u- Extension.</strong> The syntax of 'u' extension
1989			subtags is defined by the rule
1990			<code>unicode_locale_extensions</code>
1991			in <a href="#Unicode_locale_identifier">Section 3.2 Unicode
1992				locale identifier</a>, except that the subtag separator
1993			<code>sep</code>
1994			must always be a hyphen '-' when the extension is used as part of a BCP
1995			47 language tag.
1996		</p>
1997		<p>
1998			A 'u' extension may contain multiple
1999			<code>attribute</code>s
2000			or
2001			<code>keyword</code>s,
2002			as defined in <a href="#Unicode_locale_identifier">Section 3.2
2003				Unicode locale identifier</a>. Although the order of
2004			<code>attribute</code>s
2005			or
2006			<code>keyword</code>s
2007			does not matter, this specification defines the canonical form as
2008			follows:
2009		</p>
2010		<ul>
2011			<li>All attributes are sorted in alphabetical order.</li>
2012			<li>All keywords are sorted by alphabetical order of keys.</li>
2013			<li>All keywords are in lowercase.</li>
2014			<li>All keys and types use the canonical form (from the name
2015				attribute; see <a href="#Unicode_Locale_Extension_Data_Files">Section
2016					3.6.4 U Extension Data Files</a>).
2017			</li>
2018			<li>Type value "true" is removed.</li>
2019		</ul>
2020		<p>For example, the canonical form of 'u' extension
2021			"u-foo-bar-nu-thai-ca-buddhist-kk-true" is
2022			"u-bar-foo-ca-buddhist-kk-nu-thai". The attributes "foo" and "bar" in
2023			this example are provided only for illustration; no attribute subtags
2024			are defined by the current CLDR specification.</p>
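		<p>The ordering and "true"-removal rules above can be sketched in
			Python as follows. This is a minimal illustration, not a full
			implementation: the function name
			<code>canonicalize_u_extension</code> is hypothetical, the input is
			assumed to be well-formed, and the replacement of key and type aliases
			by their canonical names is omitted because it needs the U extension
			data files.</p>

```python
def canonicalize_u_extension(ext):
    """Return a canonical ordering of a well-formed 'u' extension (sketch).

    Covers lowercasing, sorting of attributes, sorting of keywords by key,
    and removal of the type value "true". Alias replacement is NOT done.
    """
    subtags = ext.lower().replace("_", "-").split("-")
    if subtags[0] != "u":
        raise ValueError("not a 'u' extension: %r" % ext)

    attributes = []
    keywords = {}        # key -> list of type subtags, in input order
    current_key = None
    for subtag in subtags[1:]:
        if len(subtag) == 2:
            # A two-character subtag is a key and starts a new keyword.
            current_key = subtag
            keywords.setdefault(current_key, [])
        elif current_key is None:
            # Longer subtags before the first key are attributes.
            attributes.append(subtag)
        else:
            # Longer subtags after a key belong to that key's type.
            keywords[current_key].append(subtag)

    parts = ["u"] + sorted(attributes)
    for key in sorted(keywords):
        parts.append(key)
        if keywords[key] != ["true"]:   # the type value "true" is removed
            parts.extend(keywords[key])
    return "-".join(parts)


canonical = canonicalize_u_extension("u-foo-bar-nu-thai-ca-buddhist-kk-true")
# canonical == "u-bar-foo-ca-buddhist-kk-nu-thai"
```

		<p>The order of the subtags inside a multi-subtag type, such as
			"islamic-civil", is kept as given, since only attributes and keys are
			sorted.</p>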
2025		<p>
2026			<em>See also <a
2027				href="http://cldr.unicode.org/index/bcp47-extension"> Unicode
2028					Extensions for BCP 47</a> on the CLDR site.
2029			</em>
2030		</p>
2031		<h4>
2032			<a href="#Key_And_Type_Definitions_" name="Key_And_Type_Definitions_">3.6.1
2033				Key And Type Definitions</a>
2034		</h4>
2035		<p>The following chart contains a set of U extension key values
2036			that are currently available, with a description or sampling of the U
2037			extension type values. Each category is associated with an XML file
2038			in the bcp47 directory.</p>
2039		<p>
2040			For the complete list of valid keys and types defined for Unicode
2041			locale extensions, see <a href="#Unicode_Locale_Extension_Data_Files">Section
2042				3.6.4 U Extension Data Files</a>. For information on the process for
2043			adding new <i>key</i>/<i>type</i>, see [<a href="#localeProject">LocaleProject</a>].
2044		</p>
2045		<p>
2046			Most type values are represented by a single subtag in the current
2047			version of CLDR. There are exceptions, such as types used for the keys
2048			"ca" (calendar) and "kr" (collation reordering). If the type is not
2049			included, then the type value "true" is assumed. Note that the
2050			default for a key with a possible &quot;true&quot; value is often
2051			&quot;false&quot;, but not always. Note also that
2052			"true"/"True" is not a valid script code, since <a
2053				href="http://www.unicode.org/iso15924/codelists.html">the ISO
2054				15924 Registration Authority has exceptionally reserved it</a>, which
2055			means that it will not be assigned for any purpose.
2056		</p>
2057		<p>The BCP 47 form for keys and types is the canonical form, and
2058			recommended. Other aliases are included for backwards compatibility.
2059	  </p>
2060		<table>
2061			<caption>
2062				<a name="Key_Type_Definitions" href="#Key_Type_Definitions">Key/Type
2063					Definitions</a>
2064			</caption>
2065			<tr>
2066				<th>key<br> (old key name)
2067				</th>
2068				<th>key description</th>
2069				<th>example type<br> (old type name)
2070				</th>
2071				<th>type description</th>
2072			</tr>
2073			<tr>
2074				<td colspan="4"><strong>A <a
2075						href="#UnicodeCalendarIdentifier" name="UnicodeCalendarIdentifier">Unicode
2076							Calendar Identifier</a> defines a type of calendar. The valid values
2077						are those <em>name</em> attribute values in the <em>type</em>
2078						elements of key name="ca" in bcp47/<a target="_blank"
2079						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
2080			</tr>
2081			<tr>
2082				<td rowspan="10">"ca"<br> (calendar)
2083				</td>
2084				<td rowspan="10">Calendar algorithm<br> <br> <i>(For
2085						information on the calendar algorithms associated with the data
2086						used with these, see [<a href="#Calendars">Calendars</a>].)
2087				</i></td>
2088				<td>"buddhist"</td>
2089				<td>Thai Buddhist calendar (same as Gregorian except for the
2090					year)</td>
2091			</tr>
2092			<tr>
2093				<td>"chinese"</td>
2094				<td>Traditional Chinese calendar</td>
2095			</tr>
2096			<tr>
2097				<td colspan="2">…</td>
2098			</tr>
2099			<tr>
2100				<td>"gregory"<br> (gregorian)
2101				</td>
2102				<td>Gregorian calendar</td>
2103			</tr>
2104			<tr>
2105				<td colspan="2">…</td>
2106			</tr>
2107			<tr>
2108				<td>"islamic"</td>
2109				<td>Islamic calendar</td>
2110			</tr>
2111			<tr>
2112				<td>"islamic-civil"</td>
2113				<td>Islamic calendar, tabular (intercalary years
2114					[2,5,7,10,13,16,18,21,24,26,29] - civil epoch)</td>
2115			</tr>
2116			<tr>
2117				<td>"islamic-umalqura"</td>
2118				<td>Islamic calendar, Umm al-Qura</td>
2119			</tr>
2120			<tr>
2121				<td colspan="2">…</td>
2122			</tr>
2123			<tr>
2124				<td colspan="2"><b>Note:</b> <i>Some calendar types are
2125						represented by two subtags. In such cases, the first subtag
2126						specifies a generic calendar type and the second subtag specifies
2127						a calendar algorithm variant. The CLDR uses generic calendar types
2128						(single subtag types) for tagging data when calendar algorithm
2129						variations within a generic calendar type are irrelevant. For
2130						example, type "islamic" is used for specifying Islamic calendar
2131						formatting data for all Islamic calendar types, including
2132						"islamic-civil" and "islamic-umalqura".</i></td>
2133			</tr>
2134
2135			<tr>
2136				<td colspan="4"><strong>A <a
2137						href="#UnicodeCurrencyFormatIdentifier"
2138						name="UnicodeCurrencyFormatIdentifier">Unicode Currency Format
2139							Identifier</a> defines a style for currency formatting. The valid
2140						values are those <em>name</em> attribute values in the <em>type</em>
2141						elements of key name="cf" in bcp47/<a target="_blank"
2142						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/currency.xml">currency.xml</a></strong>.</td>
2143			</tr>
2144			<tr>
2145				<td rowspan="2">"cf"</td>
2146				<td rowspan="2">Currency Format style</td>
2147				<td>"standard"</td>
2148				<td>Negative numbers use the minusSign symbol (the default).</td>
2149			</tr>
2150			<tr>
2151				<td>"account"</td>
2152				<td>Negative numbers use parentheses or equivalent.</td>
2153			</tr>
2154
2155			<tr>
2156				<td colspan="4"><strong>A <a
2157						href="#UnicodeCollationIdentifier"
2158						name="UnicodeCollationIdentifier">Unicode Collation Identifier</a>
2159						defines a type of collation (sort order). The valid values are
2160						those <em>name</em> attribute values in the <em>type</em> elements
2161						of bcp47/<a target="_blank"
2162						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/collation.xml">collation.xml</a></strong>.</td>
2163			</tr>
2164			<tr>
2165				<td colspan="4"><i>For information on each collation
2166						setting parameter, from <strong>ka</strong> to <strong>vt</strong>,
2167						see <a href="tr35-collation.html#Setting_Options">Setting
2168							Options</a>
2169				</i></td>
2170			</tr>
2171			<tr>
2172				<td rowspan="9">"co"<br> (collation)
2173				</td>
2174				<td rowspan="9">Collation type</td>
2175				<td>"standard"</td>
2176				<td>The default ordering for each language. For root it is
2177					based on the [<a href="#DUCET">DUCET</a>] (Default Unicode
2178					Collation Element Table): see <em><a
2179						href="tr35-collation.html#Root_Collation">Root Collation</a></em>. Each
2180					other locale is based on that, except for appropriate modifications
2181					to certain characters for that language.
2182				</td>
2183			</tr>
2184
2185			<tr>
2186				<td>"search"</td>
2187				<td>A special collation type dedicated to string search; it is
2188					not used to determine the relative order of two strings, but only
2189					to determine whether they should be considered equivalent for the
2190					specified strength, using the string search matching rules
2191					appropriate for the language. Compared to the normal collator for
2192					the language, this may add or remove primary equivalences, may make
2193					additional characters ignorable or change secondary equivalences,
2194					and may modify contractions to allow matching within them,
2195					depending on the desired behavior. For example, in Czech, the
2196					distinction between ‘a’ and ‘á’ is secondary for normal collation,
2197					but primary for search; a search for ‘a’ should never match ‘á’ and
2198					vice versa. A search collator is normally used with strength set to
2199					PRIMARY or SECONDARY (should be SECONDARY if using “asymmetric”
2200					search as described in the [<a
2201					href="http://www.unicode.org/reports/tr41/#UTS10">UCA</a>] section
2202					Asymmetric Search). The search collator in root supplies matching
2203					rules that are appropriate for most languages (and which are
2204					different than the root collation behavior); language-specific
2205					search collators may be provided to override the matching rules for
2206					a given language as necessary.
2207				</td>
2208			</tr>
2209			<tr>
2210				<td colspan="2"><p>
2211						Other keywords provide additional choices for certain locales; <i>they
2212							only have effect in certain locales.</i>
2213					</p></td>
2214			</tr>
2215			<tr>
2216				<td colspan="2">…</td>
2217			</tr>
2218			<tr>
2219				<td>"phonetic"</td>
2220				<td>Requests a phonetic variant if available, where text is
2221					sorted based on pronunciation. It may interleave different scripts,
2222					if multiple scripts are in common use.</td>
2223			</tr>
2224			<tr>
2225				<td>"pinyin"</td>
2226				<td>Pinyin ordering for Latin and for CJK characters; that is,
2227					an ordering for CJK characters based on a character-by-character
2228					transliteration into pinyin (used in Chinese).</td>
2229			</tr>
2230			<tr>
2231				<td>"reformed"</td>
2232				<td>Reformed collation (such as in Swedish)</td>
2233			</tr>
2234			<tr>
2235				<td>"searchjl"</td>
2236				<td>Special collation type for a modified string search in
2237					which a pattern consisting of a sequence of Hangul initial
2238					consonants (jamo lead consonants) will match a sequence of Hangul
2239					syllable characters whose initial consonants match the pattern. The
2240					jamo lead consonants can be represented using conjoining or
2241					compatibility jamo. This search collator is best used at SECONDARY
2242					strength with an "asymmetric" search as described in the [<a
2243					href="http://www.unicode.org/reports/tr41/#UTS10">UCA</a>] section
2244					Asymmetric Search and obtained, for example, using ICU4C's usearch
2245					facility with attribute USEARCH_ELEMENT_COMPARISON set to value
2246					USEARCH_PATTERN_BASE_WEIGHT_IS_WILDCARD; this ensures that a full
2247					Hangul syllable in the search pattern will only match the same
2248					syllable in the searched text (instead of matching any syllable
2249					with the same initial consonant), while a Hangul initial consonant
2250					in the search pattern will match any Hangul syllable in the
2251					searched text with the same initial consonant.
2252				</td>
2253			</tr>
2254			<tr>
2255				<td colspan="2">…</td>
2256			</tr>
2257
2258			<tr>
2259				<td colspan="4"><strong>A <a
2260						href="#UnicodeCurrencyIdentifier" name="UnicodeCurrencyIdentifier">Unicode
2261							Currency Identifier</a> defines a type of currency. The valid values
2262						are those <em>name</em> attribute values in the <em>type</em>
2263						elements of key name="cu" in bcp47/<a target="_blank"
2264						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/currency.xml">currency.xml</a>.
2265				</strong></td>
2266			</tr>
2267			<tr>
2268				<td>"cu"<br> (currency)
2269				</td>
2270				<td>Currency type</td>
2271				<td><i>ISO 4217 code,</i>
2272					<p>
2273						<i>plus others in common use</i>
2274					</p></td>
2275				<td><p>
2276						Codes consisting of 3 ASCII letters that are or have been valid in
2277						ISO 4217, plus certain additional codes that are or have been in
2278						common use. The list of countries and time periods associated with
2279						each currency value is available in <a
2280							href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
2281							Currency Data</a>, plus the default number of decimals.
2282					</p>
2283					<p>
2284						The XXX code is given a broader interpretation as <em>Unknown
2285							or Invalid Currency</em>.
2286					</p></td>
2287			</tr>
2288
2289			<tr>
2290				<td colspan="4"><strong>A <a
2291						href="#UnicodeEmojiPresentationStyleIdentifier" name="UnicodeEmojiPresentationStyleIdentifier">Unicode
2292							Emoji Presentation Style Identifier</a> specifies a request for
2293						the preferred emoji presentation style. This can be used as part of
2294						the value for an HTML lang attribute, for example
2295						<code>&lt;html lang=&quot;sr-Latn-u-em-emoji&quot;&gt;</code>.
2296						The valid values are those <em>name</em> attribute values
2297						in the <em>type</em> elements of key name="em" in bcp47/<a
2298						target="_blank"
2299						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/variant.xml">variant.xml</a></strong>.</td>
2300			</tr>
2301			<tr>
2302				<td rowspan="3">"em"</td>
2303				<td rowspan="3">Emoji presentation style</td>
2304				<td>"emoji"</td>
2305				<td>Use an emoji presentation for emoji characters if possible.</td>
2306			</tr>
2307			<tr>
2308				<td>"text"</td>
2309				<td>Use a text presentation for emoji characters if possible.</td>
2310			</tr>
2311			<tr>
2312				<td>"default"</td>
2313				<td>Use the default presentation for emoji characters as specified in UTR #51 Section 4,
2314					<a href="http://www.unicode.org/reports/tr51/#Presentation_Style">Presentation Style</a>.</td>
2315			</tr>
2316
2317			<tr>
2318				<td colspan="4"><strong>A <a
2319						href="#UnicodeFirstDayIdentifier" name="UnicodeFirstDayIdentifier">Unicode
2320							First Day Identifier</a> defines the preferred first day of the week
2321						for calendar display. Specifying "fw" in a locale identifier
2322						overrides the default value specified by supplemental week data
2323						(see Part 4 Dates, section 4.3 <a href="tr35-dates.html#Week_Data">Week
2324							Data</a>). The valid values are those <em>name</em> attribute values
2325						in the <em>type</em> elements of key name="fw" in bcp47/<a
2326						target="_blank"
2327						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
2328			</tr>
2329			<tr>
2330				<td rowspan="4">"fw"</td>
2331				<td rowspan="4">First day of week</td>
2332				<td>"sun"</td>
2333				<td>Sunday</td>
2334			</tr>
2335			<tr>
2336				<td>"mon"</td>
2337				<td>Monday</td>
2338			</tr>
2339			<tr>
2340				<td colspan="2">…</td>
2341			</tr>
2342			<tr>
2343				<td>"sat"</td>
2344				<td>Saturday</td>
2345			</tr>
2346
2347			<tr>
2348				<td colspan="4"><strong>A <a
2349						href="#UnicodeHourCycleIdentifier"
2350						name="UnicodeHourCycleIdentifier">Unicode Hour Cycle
2351							Identifier</a> defines the preferred time cycle. Specifying "hc" in a
						locale identifier overrides the default value specified by
2353						supplemental time data (see Part 4 Dates, section 4.4 <a
2354						href="tr35-dates.html#Time_Data">Time Data</a>). The valid values
2355						are those <em>name</em> attribute values in the <em>type</em>
2356						elements of key name="hc" in bcp47/<a target="_blank"
2357						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/calendar.xml">calendar.xml</a></strong>.</td>
2358			</tr>
2359			<tr>
2360				<td rowspan="4">"hc"</td>
2361				<td rowspan="4">Hour cycle</td>
2362				<td>"h12"</td>
2363				<td>Hour system using 1–12; corresponds to 'h' in patterns</td>
2364			</tr>
2365			<tr>
2366				<td>"h23"</td>
2367				<td>Hour system using 0–23; corresponds to 'H' in patterns</td>
2368			</tr>
2369			<tr>
2370				<td>"h11"</td>
2371				<td>Hour system using 0–11; corresponds to 'K' in patterns</td>
2372			</tr>
2373			<tr>
2374				<td>"h24"</td>
				<td>Hour system using 1–24; corresponds to 'k' in patterns</td>
2376			</tr>
2377
2378			<tr>
2379				<td colspan="4"><strong>A <a
2380						href="#UnicodeLineBreakStyleIdentifier"
2381						name="UnicodeLineBreakStyleIdentifier">Unicode Line Break
2382							Style Identifier</a> defines a preferred line break style
2383						corresponding to the CSS level 3 <a
2384						href="https://drafts.csswg.org/css-text/#line-break-property">line-break
2385							option</a>. Specifying "lb" in a locale identifier overrides the
						locale’s default style (which may correspond to "normal" or
2387						"strict"). The valid values are those <em>name</em> attribute
2388						values in the <em>type</em> elements of key name="lb" in bcp47/<a
2389						target="_blank"
2390						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
2391			</tr>
2392			<tr>
2393				<td rowspan="3">"lb"</td>
2394				<td rowspan="3">Line break style</td>
2395				<td>"strict"</td>
2396				<td>CSS level 3 line-break=strict, e.g. treat CJ as NS</td>
2397			</tr>
2398			<tr>
2399				<td>"normal"</td>
2400				<td>CSS level 3 line-break=normal, e.g. treat CJ as ID, break
2401					before hyphens for ja,zh</td>
2402			</tr>
2403			<tr>
2404				<td>"loose"</td>
				<td>CSS level 3 line-break=loose</td>
2406			</tr>
2407
2408			<tr>
2409				<td colspan="4"><strong>A <a
2410						href="#UnicodeLineBreakWordIdentifier"
2411						name="UnicodeLineBreakWordIdentifier">Unicode Line Break Word
2412							Identifier</a> defines preferred line break word handling behavior
2413						corresponding to the CSS level 3 <a
2414						href="https://drafts.csswg.org/css-text/#word-break-property">word-break
2415							option</a>. The valid values are those <em>name</em> attribute values
2416						in the <em>type</em> elements of key name="lw" in bcp47/<a
2417						target="_blank"
2418						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
2419			</tr>
2420			<tr>
2421				<td rowspan="3">"lw"</td>
2422				<td rowspan="3">Line break word handling</td>
2423				<td>"normal"</td>
2424				<td>CSS level 3 word-break=normal, normal script/language
2425					behavior for midword breaks</td>
2426			</tr>
2427			<tr>
2428				<td>"breakall"</td>
2429				<td>CSS level 3 word-break=break-all, allow midword breaks
2430					unless forbidden by lb setting</td>
2431			</tr>
2432			<tr>
2433				<td>"keepall"</td>
2434				<td>CSS level 3 word-break=keep-all, prohibit midword breaks
2435					except for dictionary breaks</td>
2436			</tr>
2437
2438			<tr>
2439				<td colspan="4"><strong>A <a
2440						href="#UnicodeMeasurementSystemIdentifier"
2441						name="UnicodeMeasurementSystemIdentifier">Unicode Measurement
2442							System Identifier</a> defines a preferred measurement system.
2443						Specifying "ms" in a locale identifier overrides the default value
2444						specified by supplemental measurement system data (see Part 2
2445						General, section 5 <a
2446						href="tr35-general.html#Measurement_System_Data">Measurement
2447							System Data</a>). The valid values are those <em>name</em> attribute
2448						values in the <em>type</em> elements of key name="ms" in bcp47/<a
2449						target="_blank"
2450						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/measure.xml">measure.xml</a></strong>.</td>
2451			</tr>
2452			<tr>
2453				<td rowspan="3">"ms"</td>
2454				<td rowspan="3">Measurement system</td>
2455				<td>"metric"</td>
2456				<td>Metric System</td>
2457			</tr>
2458			<tr>
2459				<td>"ussystem"</td>
2460				<td>US System of measurement: feet, pints, etc.; pints are 16oz</td>
2461			</tr>
2462			<tr>
2463				<td>"uksystem"</td>
2464				<td>UK System of measurement: feet, pints, etc.; pints are 20oz</td>
2465			</tr>
2466
2467			<tr>
2468				<td colspan="4"><strong>A <a
2469						href="#UnicodeNumberSystemIdentifier"
2470						name="UnicodeNumberSystemIdentifier">Unicode Number System
2471							Identifier</a> defines a type of number system. The valid values are
2472						those <em>name</em> attribute values in the <em>type</em> elements
2473						of bcp47/<a target="_blank"
2474						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/number.xml">number.xml</a>.
2475				</strong></td>
2476			</tr>
2477			<tr>
2478				<td rowspan="7">"nu"<br> (numbers)
2479				</td>
2480				<td rowspan="7">Numbering system</td>
2481				<td><i>Unicode script subtag</i></td>
2482				<td><p>
2483						Four-letter types indicating the primary numbering system for the
2484						corresponding script represented in Unicode. Unless otherwise
2485						specified, it is a decimal numbering system using digits
2486						[:GeneralCategory=Nd:]. For example, &quot;latn&quot; refers to
2487						the ASCII / Western digits 0-9, while &quot;taml&quot; is an
						algorithmic (non-decimal) numbering system. (The code "tamldec"
						indicates the "modern Tamil decimal digits".)<br>
2490					</p>
2491					<p class="note">
2492						For more information, see <a
2493							href="tr35-numbers.html#Numbering_Systems">Numbering Systems</a>.
2494					</p></td>
2495			</tr>
2496			<tr>
2497				<td>"arabext"</td>
2498				<td>Extended Arabic-Indic digits ("arab" means the base
2499					Arabic-Indic digits)</td>
2500			</tr>
2501			<tr>
2502				<td>"armnlow"</td>
2503				<td>Armenian lowercase numerals</td>
2504			</tr>
2505			<tr>
2506				<td colspan="2">…</td>
2507			</tr>
2508			<tr>
2509				<td>"roman"</td>
2510				<td>Roman numerals</td>
2511			</tr>
2512			<tr>
2513				<td>"romanlow"</td>
2514				<td>Roman lowercase numerals</td>
2515			</tr>
2516			<tr>
2517				<td>"tamldec"</td>
2518				<td>Modern Tamil decimal digits</td>
2519			</tr>
2520
2521			<tr>
2522				<td colspan="4"><strong>A <a href="#RegionOverride"
2523						name="RegionOverride">Region Override</a> specifies an alternate
2524						region to use for obtaining certain region-specific default values
2525						(those specified by the <a href="tr35-info.html#rgScope">&lt;rgScope&gt;</a>
2526						element), instead of using the region specified by the <a
2527						href="#unicode_region_subtag">unicode_region_subtag</a> in the
2528						Unicode Language Identifier (or inferred from the <a
2529						href="#unicode_language_subtag">unicode_language_subtag</a>).
2530				</strong></td>
2531			</tr>
2532			<tr>
2533				<td rowspan="2">"rg"</td>
2534				<td rowspan="2">Region Override</td>
2535				<td>&quot;uszzzz&quot;<br> <br></td>
2536				<td rowspan="2">The value is a <a href="#unicode_region_subtag">unicode_region_subtag</a>
2537					for a regular region (not a macroregion), suffixed by "ZZZZ" (case
2538					is not significant). For example, “en-GB-u-rg-uszzzz” represents a
2539					locale for British English but with region-specific defaults set to
2540					US for items such as default currency, default calendar and week
2541					data, default time cycle, and default measurement system and unit
2542					preferences.
2543				</td>
2544			</tr>
2545			<tr>
2546				<td>…</td>
2547			</tr>
2548
2549			<tr>
2550				<td colspan="4"><strong>A <a
2551						name="unicode_subdivision_subtag_validity"></a><a
2552						href="#UnicodeSubdivisionIdentifier"
2553						name="UnicodeSubdivisionIdentifier">Unicode Subdivision
2554							Identifier</a> defines a regional subdivision used for locales. The
2555						valid values are based on the <em>subdivisionContainment</em>
2556						element as described in <em>Section <a
2557							href="#Unicode_Subdivision_Codes">3.6.5 Subdivision Codes</a></em>.
2558				</strong></td>
2559			</tr>
2560			<tr>
2561				<td rowspan="2">"sd"</td>
2562				<td rowspan="2">Regional Subdivision</td>
2563				<td>&quot;gbsct&quot;<br> <br></td>
2564				<td rowspan="2">A <a href="#unicode_subdivision_id">unicode_subdivision_id</a>, which is
					a <a href="#unicode_region_subtag">unicode_region_subtag</a> concatenated
2566					with a unicode_subdivision_suffix.<br> For example, <em>gbsct</em> is “gb”+“sct” (where sct
2567						represents the subdivision code for Scotland). Thus
2568					“en-GB-u-sd-gbsct” represents the language variant “English as used
2569					in Scotland”. And both “en-u-sd-usca” and “en-US-u-sd-usca”
2570					represent “English as used in California”. See
2571						<strong><em><a href="#Unicode_Subdivision_Codes">3.6.5
2572									Subdivision Codes</a></em></strong>.
2573				</td>
2574			</tr>
2575			<tr>
2576				<td>…</td>
2577			</tr>
2578
2579			<tr>
2580				<td colspan="4"><strong>A <a
2581						href="#UnicodeSentenceBreakSuppressionsIdentifier"
2582						name="UnicodeSentenceBreakSuppressionsIdentifier">Unicode
2583							Sentence Break Suppressions Identifier</a> defines a set of data to
2584						be used for suppressing certain sentence breaks that would
						otherwise be found by UAX #29 rules. The valid values are those <em>name</em>
2586						attribute values in the <em>type</em> elements of key name="ss" in
2587						bcp47/<a target="_blank"
2588						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/segmentation.xml">segmentation.xml</a></strong>.</td>
2589			</tr>
2590			<tr>
2591				<td rowspan="2">"ss"</td>
2592				<td rowspan="2">Sentence break suppressions</td>
2593				<td>"none"</td>
2594				<td>Don’t use sentence break suppressions data (the default).</td>
2595			</tr>
2596			<tr>
2597				<td>"standard"</td>
2598				<td>Use sentence break suppressions data of type "standard"</td>
2599			</tr>
2600
2601			<tr>
2602				<td colspan="4"><strong>A <a
2603						href="#UnicodeTimezoneIdentifier" name="UnicodeTimezoneIdentifier">Unicode
2604							Timezone Identifier</a> defines a timezone. The valid values are
2605						those name attribute values in the <em>type</em> elements of
2606						bcp47/<a target="_blank"
2607						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/timezone.xml">timezone.xml</a>.
2608				</strong></td>
2609			</tr>
2610			<tr>
2611				<td>"tz"<br> (timezone)
2612				</td>
2613				<td>Time zone</td>
2614				<td><i>Unicode short time zone IDs</i></td>
2615				<td><p>
2616						Short identifiers defined in terms of a TZ time zone database [<a
							href="#Olson">Olson</a>] identifier in the
						common/bcp47/timezone.xml file, plus a few extra values.
2619					</p>
2620					<p>
						For more information, see <a href="#Time_Zone_Identifiers">Section
							3.6.3 Time Zone Identifiers</a>.
2623					</p>
2624					<p>CLDR provides data for normalizing timezone codes.</p></td>
2625			</tr>
2626			<tr>
2627				<td colspan="4"><strong>A <a
2628						href="#UnicodeVariantIdentifier" name="UnicodeVariantIdentifier">Unicode
2629							Variant Identifier</a> defines a special variant used for locales.
2630						The valid values are those name attribute values in the <em>type</em>
2631						elements of bcp47/<a target="_blank"
2632						href="http://www.unicode.org/repos/cldr/tags/latest/common/bcp47/variant.xml">variant.xml</a>.
2633				</strong></td>
2634			</tr>
2635			<tr>
2636				<td>"va"</td>
2637				<td>Common variant type</td>
2638				<td>"posix"</td>
				<td>POSIX style locale variant. For handling of the "POSIX"
					variant, see <i>Section 3.8.2, <a href="#Legacy_Variants">Legacy
2641							Variants</a></i>.
2642				</td>
2643			</tr>
2644		</table>
2645		<p>
2646			For more information on the allowed keys and types, see the specific
2647			elements below, and <a href="#Unicode_Locale_Extension_Data_Files">Section
2648				3.6.4 U Extension Data Files</a>.
2649		</p>
		<p>Additional keys or types might be added in future versions.
			Implementations of LDML should be robust enough to handle any
			syntactically valid key or type values.</p>
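		<p>The following is a minimal sketch, not a definitive implementation, of a
			parser that accepts any syntactically valid key or type in the 'u'
			extension, whether or not it is known. It assumes a key is two
			characters (alphanumeric followed by a letter) and that attributes and
			type subtags are three to eight alphanumeric characters. The function
			name <code>parse_u_extension</code> is hypothetical, and attributes
			that precede the first key are ignored.</p>

```python
import re

_KEY = re.compile(r"[0-9a-z][a-z]")          # assumed key syntax: alphanum alpha
_TYPE_SUBTAG = re.compile(r"[0-9a-z]{3,8}")  # assumed attribute / type subtag syntax


def parse_u_extension(locale_id: str) -> dict:
    """Return {key: [type subtags]} for the 'u' extension of a locale identifier.

    Unknown keys and types are kept as long as they are syntactically valid.
    """
    subtags = locale_id.lower().split("-")
    # Locate the 'u' singleton; index 0 is the language subtag, and anything
    # after a private-use 'x' singleton is not interpreted.
    start = None
    for i, subtag in enumerate(subtags[1:], start=1):
        if subtag == "x":
            break
        if subtag == "u":
            start = i + 1
            break
    result = {}
    if start is None:
        return result
    key = None
    for subtag in subtags[start:]:
        if len(subtag) == 1:           # the next singleton ends the 'u' extension
            break
        if _KEY.fullmatch(subtag):
            key = subtag
            result[key] = []
        elif _TYPE_SUBTAG.fullmatch(subtag):
            if key is not None:        # subtags before the first key are attributes
                result[key].append(subtag)
        else:
            raise ValueError(f"malformed subtag: {subtag!r}")
    return result
```

		<p>Under these assumptions, "en-GB-u-rg-uszzzz" yields the key "rg" with
			the single type subtag "uszzzz", and a key this sketch has never seen
			(such as the made-up "zz") is still returned rather than rejected.</p>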
2653		<h4>
2654			<a href="#Numbering System Data" name="Numbering System Data">3.6.2
2655				Numbering System Data </a>
2656		</h4>
2657		<p>
2658			LDML supports multiple numbering systems. The identifiers for those
2659			numbering systems are defined in the file <strong>bcp47/number.xml</strong>.
2660			For example, for the 'trunk' version of the data see <a
2661				href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/number.xml">bcp47/number.xml</a>.<br>
2662		</p>
2663		<p>
2664			Details about those numbering systems are defined in <strong>supplemental/numberingSystems.xml</strong>.
2665			For example, for the 'trunk' version of the data see <a
2666				href="http://unicode.org/repos/cldr/tags/latest/common/supplemental/numberingSystems.xml">supplemental/numberingSystems.xml</a>.<br>
2667		</p>
2668		<p>
2669			LDML makes certain stability guarantees on this data: <br>
2670		</p>
2671		<ol>
2672			<li>Like other BCP 47 identifiers, once a numeric identifier is
2673				added to <strong>bcp47/number.xml</strong> or <strong>numberingSystems.xml</strong>,
2674				it will never be removed from either of those files.
2675			</li>
2676			<li>If an identifier has type="numeric" in numberingSystems.xml,
2677				then
2678				<ol>
2679					<li>It is a decimal, positional numbering system with an
2680						attribute digits=X, where X is a string with the 10 digits in
2681						order used by the numbering system.</li>
2682					<li>The values of the type and digits will never change.</li>
2683				</ol>
2684			</li>
2685		</ol>
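		<p>Because the digits attribute of a type="numeric" system is a stable
			string of 10 digits in order, formatting an integer reduces to a
			per-digit lookup. The sketch below illustrates this under stated
			assumptions: the function name <code>format_with_digits</code> is
			hypothetical, only non-negative integers are handled, and the
			Arabic-Indic digit string is built from U+0660..U+0669 rather than
			read from numberingSystems.xml.</p>

```python
def format_with_digits(value: int, digits: str) -> str:
    """Render a non-negative integer using the 10 digits of a decimal numbering system."""
    if len(digits) != 10:
        raise ValueError("a decimal numbering system needs exactly 10 digits")
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # Each ASCII digit of the value indexes into the digits string.
    return "".join(digits[int(ch)] for ch in str(value))


# Arabic-Indic digits U+0660..U+0669, built here instead of being read from CLDR data.
ARAB_DIGITS = "".join(chr(0x0660 + i) for i in range(10))
LATN_DIGITS = "0123456789"
```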
2686		<h4>
2687			<a href="#Time_Zone_Identifiers" name="Time_Zone_Identifiers">3.6.3
			for use in the Unicode locale extension. The short identifiers
2689		</h4>
2690		<p>
2691			LDML inherits time zone IDs from the tz database [<a href="#Olson">Olson</a>].
2692			Because these IDs from the tz database do not satisfy the BCP 47
2693			language subtag syntax requirements, CLDR defines short identifiers
2694			for the use in the Unicode locale extension. The short identifiers
2695			are defined in the file <strong>common/bcp47/timezone.xml</strong>.
2696		</p>
2697		<p>
2698			The short identifiers use UN/LOCODE [<a href="#LOCODE">LOCODE</a>]
2699			(excluding a space character) codes where possible. For example, the
2700			short identifier for "America/Los_Angeles" is "uslax" (the LOCODE for
2701			Los Angeles, US is "US LAX"). Identifiers of length not equal to 5
2702			are used where there is no corresponding UN/LOCODE, such as
2703			"usnavajo" for "America/Shiprock", or "utcw01" for "Etc/GMT+1", so
2704			that they do not overlap with future UN/LOCODE.
2705		</p>
2706		<p>Although the first two letters of a short identifier may match
2707			an ISO 3166 two-letter country code, a user should not assume that
2708			the time zone belongs to the country. The first two letters in an
			identifier of length not equal to 5 have no meaning. Also, the
2710			identifiers are stabilized, meaning that they will not change no
2711			matter what changes happen in the base standard. So if Hawaii leaves
2712			the US and joins Canada as a new province, the short time zone
2713			identifier "ushnl" would not change in CLDR even if the UN/LOCODE
2714			changes to "cahnl" or something else.</p>
2715		<p>There is a special code "unk" for an Unknown or Invalid time
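			Although the short time zone identifiers are guaranteed to be stable,
			the preferred IDs in the tz database (such as those found in the <strong>zone.tab</strong>
			file) might change from time to time. For example, "Asia/Calcutta" was
			replaced with "Asia/Kolkata" and moved to the <strong>backward</strong>
			file in the tz database. Because CLDR contains locale data that uses a
			time zone ID from the tz database as the key, stability of the IDs is critical.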
2722			Although the short time zone identifiers are guaranteed to be stable,
2723			the preferred IDs in the tz database (as those found in <strong>zone.tab</strong>
			To maintain the stability of "long" IDs (for those inherited from the
			tz database), a special rule is applied to the <i>alias</i> attribute in
			the &lt;type&gt; element for "tz": the first "long" ID is the CLDR
			canonical "long" time zone ID.
2728		</p>
2729		<p>
2730			To maintain the stability of "long" IDs (for those inherited from the
2731			tz database), a special rule applied to the <i>alias</i> attribute in
2732			the &lt;type&gt; element for "tz" - the first "long" ID is the CLDR
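			The &lt;type&gt; element above defines the short time zone ID "inccu"
			(for use in the Unicode locale extension), the corresponding <em>CLDR
				canonical "long" ID</em> "Asia/Calcutta", and an alias "Asia/Kolkata".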
2736		<blockquote>&lt;type name="inccu" alias="Asia/Calcutta
2737			Asia/Kolkata" description="Kolkata, India"/&gt;</blockquote>
2738		<p>
2739			Above &lt;type&gt; element defines the short time zone ID "inccu"
2740			(for the use in the Unicode locale extension), corresponding <em>CLDR
2741				canonical "long" ID</em> "Asia/Culcutta", and an alias "Asia/Kolkata".
2742		</p>
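		<p>The rule that the first "long" ID in the alias attribute is the CLDR
			canonical one can be sketched as follows. This is an illustration
			only: the function name <code>describe_tz_type</code> and the returned
			dictionary layout are assumptions, and the input is the single
			&lt;type&gt; element quoted above rather than the full timezone.xml
			file.</p>

```python
import xml.etree.ElementTree as ET


def describe_tz_type(type_xml: str) -> dict:
    """Split a "tz" <type> element into its short ID, canonical long ID, and other long IDs."""
    elem = ET.fromstring(type_xml)
    # The alias attribute holds space-separated "long" IDs; the first one is canonical.
    long_ids = elem.get("alias", "").split()
    return {
        "short_id": elem.get("name"),
        "canonical_long_id": long_ids[0] if long_ids else None,
        "other_long_ids": long_ids[1:],
    }


EXAMPLE = '<type name="inccu" alias="Asia/Calcutta Asia/Kolkata" description="Kolkata, India"/>'
```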
2743		<h4>
2744			<a href="#Unicode_Locale_Extension_Data_Files"
2745				name="Unicode_Locale_Extension_Data_Files">3.6.4 U Extension
2746				Data Files</a>
2747		</h4>
2748		<p>
			The 'u' extension data is stored in multiple XML files located under the
			common/bcp47 directory in CLDR. Each file contains the locale
2751			extension key/type values and their backward compatibility mappings
2752			appropriate for a particular domain. <a
2753				href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/collation.xml">common/bcp47/collation.xml</a>
2754			contains key/type values for collation, including optional collation
2755			parameters and valid type values for each key.
2756		</p>
2757		<p>
2758			The 't' extension data is stored in <a
2759				href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform.xml">common/bcp47/transform.xml</a>.
2760		</p>
2761		<p class="dtd">&lt;!ELEMENT keyword ( key* )&gt;</p>
2762		<p class="dtd">
2763			&lt;!ELEMENT key ( type* )&gt;<br> &lt;!ATTLIST key extension
2764			NMTOKEN #IMPLIED&gt;<br> &lt;!ATTLIST key name NMTOKEN
2765			#REQUIRED&gt;<br> &lt;!ATTLIST key description CDATA
2766			#IMPLIED&gt;<br> &lt;!ATTLIST key deprecated ( true | false )
2767			"false"&gt;<br> &lt;!ATTLIST key preferred NMTOKEN #IMPLIED&gt;<br>
2768			&lt;!ATTLIST key alias NMTOKEN #IMPLIED&gt;<br> &lt;!ATTLIST key valueType (single | multiple
2769				| incremental | any) #IMPLIED &gt;<br> &lt;!ATTLIST key since
2770			CDATA #IMPLIED&gt;
2771		</p>
2772		<p class="dtd">
2773			&lt;!ELEMENT type EMPTY&gt;<br> &lt;!ATTLIST type name NMTOKEN
2774			#REQUIRED&gt;<br> &lt;!ATTLIST type description CDATA
2775			#IMPLIED&gt;<br> &lt;!ATTLIST type deprecated ( true | false )
2776			"false"&gt;<br> &lt;!ATTLIST type preferred NMTOKEN #IMPLIED&gt;<br>
2777			&lt;!ATTLIST type alias CDATA #IMPLIED&gt;<br> &lt;!ATTLIST type
2778			since CDATA #IMPLIED&gt;
2779		</p>
2780		<p class="dtd">
2781			&lt;!ELEMENT attribute EMPTY&gt;<br> &lt;!ATTLIST attribute name
2782			NMTOKEN #REQUIRED&gt;<br> &lt;!ATTLIST attribute description
2783			CDATA #IMPLIED&gt;<br> &lt;!ATTLIST attribute deprecated ( true
2784			| false ) "false"&gt;<br> &lt;!ATTLIST attribute preferred
2785			NMTOKEN #IMPLIED&gt;<br> &lt;!ATTLIST attribute since CDATA
2786			#IMPLIED&gt;
2787		</p>
		<p>The extension attribute in the &lt;key&gt; element specifies the
2789			BCP 47 language tag extension type. The default value of the
2790			extension attribute is "u" (Unicode locale extension). The
2791			&lt;type&gt; element is only applicable to the enclosing &lt;key&gt;.
2792		</p>
2793		<p>
2794			In the Unicode locale extension 'u' and
2795				't' data files, the common attributes for the &lt;key&gt;,
2796			&lt;type&gt; and &lt;attribute&gt; elements are as follows:
2797		</p>
2798		<dl>
2799			<dt>
2800				<b>name</b>
2801			</dt>
2802			<dd>
2803				<p>
2804					The key or type name used by Unicode locale extension with <a
						href="#Unicode_locale_identifier">'u' extension syntax</a> or the 't' extension syntax. When <i>alias</i>
2806					below is absent, this name can be also used with the old style <a
2807						href="#Old_Locale_Extension_Syntax"> "@key=type" syntax</a>.
2808				</p>
2809				<p>
2810					Most type names are <strong>literal type names</strong>, which
2811					match exactly the same value. All of these have at least one
2812					lowercase letter, such as &quot;buddhist&quot;. There are a small
2813					number of <strong>indirect type names</strong>, such as
2814					&quot;RG_KEY_VALUE&quot;. These have no lowercase letters. The
2815					interpretation of each one is listed below.
2816				</p>
2817				<h5>
2818					<a name="CODEPOINTS" href="#CODEPOINTS">CODEPOINTS</a>
2819				</h5>
2820				<p>
2821					The type name <strong>"CODEPOINTS"</strong> is reserved for a
2822					variable representing Unicode code point(s). The syntax is:
2823				</p>
2824				<table border="0">
2825					<tr>
2826						<th>&nbsp;</th>
2827						<th><div align="center">EBNF</div></th>
2828						<th><div align="center">ABNF</div></th>
2829					</tr>
2830					<tr>
2831						<td><pre>codepoints</pre></td>
						<td><pre>= codepoint (sep codepoint)*</pre></td>
2833						<td><pre>= codepoint *(sep codepoint)</pre></td>
2834					</tr>
2835					<tr>
2836						<td><pre>codepoint</pre></td>
2837						<td><pre>= [0-9 A-F a-f]{4,6}</pre></td>
2838						<td><pre>= 4*6HEXDIG</pre></td>
2839					</tr>
2840				</table>
2841				<p>In addition, no codepoint may exceed 10FFFF. For example,
2842					"00A0", "300b", "10D40C" and "00C1-00E1" are valid, but "A0",
2843					"U060C" and "110000" are not.</p>
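				<p>A minimal sketch of a validity check for CODEPOINTS values,
					following the syntax and the 10FFFF limit stated above. The
					function name <code>is_valid_codepoints</code> is hypothetical, and
					the separator is assumed to be a hyphen, as in the "00C1-00E1"
					example.</p>

```python
import re

# One code point: 4 to 6 hexadecimal digits, as in the syntax table.
_CODEPOINT = re.compile(r"[0-9A-Fa-f]{4,6}")


def is_valid_codepoints(value: str, sep: str = "-") -> bool:
    """Return True if value is one or more separated code points, none above 10FFFF."""
    return all(
        _CODEPOINT.fullmatch(part) is not None and int(part, 16) <= 0x10FFFF
        for part in value.split(sep)
    )
```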
2844				<p>In the current version of CLDR, the type "CODEPOINTS" is only
2845					used for the deprecated locale extension key "vt" (variableTop).
2846					The subtags forming the type for "vt" represent an arbitrary string
					of characters. There is no formal limit on the number of
					characters, although in practice anything above 1 will be rare, and
					anything longer than 4 might be useless. Repetition is allowed, for
					example, 0061-0061 ("aa") is a valid type value for "vt", since the
					sequence may be a collating element. Order is vital: 0061-0062
					("ab") is different from 0062-0061 ("ba"). Note that for
2853					variableTop any character sequence must be a contraction which
2854					yields exactly one primary weight.</p>
2855				<p>For example,</p>
2856				<blockquote>
2857					<p>
2858						<strong>en-u-vt-00A4</strong> : this indicates English, with any
2859						characters sorting at or below &quot; ¤&quot; (at a primary level)
2860						considered Variable.
2861					</p>
2862				</blockquote>
2863				<p>
2864					By default in UCA, variable characters are ignored in sorting at a
2865					primary, secondary, and tertiary level. But in CLDR, they are not
2866					ignorable by default. For more information, see <a
2867						href="tr35-collation.html#Setting_Options">Collation: Section
2868						3.3 <em>Setting Options</em>
2869					</a>.
2870				</p>
2871
2872				<h5>
2873					<a name="REORDER_CODE" href="#REORDER_CODE">REORDER_CODE</a>
2874				</h5>
2875				<p>
2876					The type name <strong>"REORDER_CODE"</strong> is reserved for
2877					reordering block names (e.g. "latn", "digit" and "others") defined
2878					in the <i><a href="tr35-collation.html#Root_Collation">Root
2879							Collation</a></i>. The type "REORDER_CODE" is used for locale extension
2880					key "kr" (colReorder). The value of type for "kr" is represented by
2881					one or more reordering block names such as "latn-digit". For more
2882					information, see <a href="tr35-collation.html#Script_Reordering">Collation:
2883						Section 3.12 <em>Collation Reordering</em>
2884					</a>.
2885				</p>
2886				<h5>
2887					<a name="RG_KEY_VALUE" href="#RG_KEY_VALUE">RG_KEY_VALUE</a>
2888				</h5>
2889				<p>
2890					The type name <strong>"RG_KEY_VALUE"</strong> is reserved for
2891					region codes in the format required by the "rg" key; this is a
2892					region code from the idValidity data in common/validity/region.xml
2893					(with certain exclusions, listed below) followed by "zzzz". The
2894					excluded region codes are those with idStatus='unknown' and
2895					'macroregion'; region codes with idStatus='deprecated' should not
2896					be generated, and those with idStatus='private_use' are only to be
2897					used with prior agreement. Thus the value for the "rg" key will
2898					normally be a region code with idStatus='regular' followed by
2899					"zzzz"; this set of values is the same as the subdivision codes
2900					with idStatus='unknown' from the idValidity data in
2901					common/validity/subdivision.xml.
2902				</p>
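				<p>As a rough sketch of the rule above, a value for the "rg" key can
					be checked by stripping the "zzzz" suffix and looking the remainder
					up in the set of region codes with idStatus='regular'. The function
					name <code>is_valid_rg_value</code> is hypothetical, and the caller
					is assumed to have loaded the regular region codes from
					common/validity/region.xml; the two-entry set in the code is only a
					stand-in for that data.</p>

```python
import re

# A region subtag is assumed to be two letters or three digits.
_REGION = re.compile(r"[a-z]{2}|[0-9]{3}")

# Stand-in for the idStatus='regular' codes from common/validity/region.xml.
SAMPLE_REGULAR_REGIONS = {"US", "GB"}


def is_valid_rg_value(value: str, regular_regions) -> bool:
    """Return True if value is a regular region code followed by "zzzz" (any case)."""
    lowered = value.lower()
    if not lowered.endswith("zzzz"):
        return False
    region = lowered[:-4]
    return _REGION.fullmatch(region) is not None and region.upper() in regular_regions
```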
2903				<h5>
2904					<a name="SUBDIVISION_CODE" href="#SUBDIVISION_CODE">SUBDIVISION_CODE</a>
2905				</h5>
2906				<p>
2907					The type name <strong>"SUBDIVISION_CODE"</strong> is reserved for
2908					subdivision codes in the format required by the "sd" key; this is a
2909					subdivision code from the idValidity data in
2910					common/validity/subdivision.xml, excluding those with
2911					idStatus='unknown'. Codes with idStatus='deprecated' should not be
2912					generated, and those with idStatus='private_use' are only to be
2913					used with prior agreement.
2914				</p>
2915				<h5>
2916					<a name="PRIVATE_USE" href="#PRIVATE_USE">PRIVATE_USE</a>
2917				</h5>
2918				<p>
2919					The type name <strong>"PRIVATE_USE"</strong> is reserved for
2920					private use types. A valid type value is composed of one or more
2921					subtags separated by hyphens and each subtag consists of three to
2922					eight ASCII alphanumeric characters. In the current version of
2923					CLDR, <strong>"PRIVATE_USE"</strong> is only used for transform
2924					extension "x0".
2925				</p>
2926
2927			</dd>
2928			<dt>
2929				<b>valueType</b>
2930			</dt>
2931			<dd>
2932				<p>The valueType attribute indicates how many
2933					subtags are valid for a given key:</p>
2934				<table class='simple' width="100%" border="1">
2935					<tbody>
2936						<tr>
2937							<th>single</th>
2938							<td>Either exactly one type value, or no type value (but only if the value of &quot;true&quot; would be valid). This is the default
2939								if no valueType attribute is present.</td>
2940						</tr>
2941						<tr>
2942							<th>incremental</th>
2943							<td>Multiple type values are allowed, but only if a prefix
2944								is also present, and the sequence is explicitly listed. Each
2945								successive type value indicates a refinement of its prefix. For
2946								example:<br> &lt;key name=&quot;ca&quot;
2947								description=&quot;Calendar algorithm key&quot;<strong>
2948									valueType=&quot;incremental&quot;</strong>&gt; <br>&nbsp;&nbsp;&lt;type
2949								name=&quot;islamic&quot; description=&quot;Islamic
2950								calendar&quot;/&gt;<br> &nbsp;&nbsp;&lt;type
2951								name=&quot;islamic-umalqura&quot; description=&quot;Islamic
2952								calendar, Umm al-Qura&quot;/&gt;<br> Thus <em>ca-islamic-umalqura</em>
2953								is valid. However, <em>ca-gregory-japanese</em> is not valid,
2954								because &quot;gregory-japanese&quot; is not listed as a type.
2955							</td>
2956						</tr>
2957						<tr>
2958							<th>multiple</th>
2959							<td>Multiple type values are allowed, but each may only
2960								occur once. For example:<br>&lt;key name=&quot;kr&quot;
2961								description=&quot;Collation reorder codes&quot; <strong>valueType=&quot;multiple&quot;</strong>&gt;<br>
2962								&nbsp;&nbsp;&lt;type name=&quot;REORDER_CODE&quot; …/&gt;
2963							</td>
2964						</tr>
2965						<tr>
2966							<th>any</th>
2967							<td>Any number of type values are allowed, with none of the
2968								above restrictions. For example:<br> &lt;key
2969								extension=&quot;t&quot; name=&quot;x0&quot;<strong> </strong>description=&quot;Private
2970								use transform type key.&quot;<strong>
2971									valueType=&quot;any&quot;</strong>&gt;<br> &nbsp;&nbsp;&lt;type
2972								name=&quot;PRIVATE_USE&quot; …/&gt;
2973							</td>
2974						</tr>
2975					</tbody>
2976				</table>
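				<p>The four valueType settings can be read as four different checks
					on the subtags that follow a key. The sketch below is one possible
					reading, not a normative algorithm: the function name
					<code>is_valid_type_value</code> and the <code>is_listed</code>
					callback (which answers whether a type name is listed for the key)
					are assumptions, subtags are assumed to be already lowercased, and
					an empty value is accepted for "any" only because the table places
					no restriction on it.</p>

```python
from typing import Callable, Sequence


def is_valid_type_value(
    subtags: Sequence[str],
    value_type: str,
    is_listed: Callable[[str], bool],
) -> bool:
    """Check the type subtags of one key against its valueType."""
    if value_type == "single":
        if not subtags:
            # No type value is allowed only if "true" would be valid.
            return is_listed("true")
        return len(subtags) == 1 and is_listed(subtags[0])
    if value_type == "incremental":
        # Every prefix, and the full sequence, must be explicitly listed.
        return bool(subtags) and all(
            is_listed("-".join(subtags[:n])) for n in range(1, len(subtags) + 1)
        )
    if value_type == "multiple":
        # Each type value may occur only once.
        return (
            bool(subtags)
            and len(set(subtags)) == len(subtags)
            and all(is_listed(s) for s in subtags)
        )
    if value_type == "any":
        return all(is_listed(s) for s in subtags)
    raise ValueError(f"unknown valueType: {value_type!r}")
```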
2977			</dd>
2978			<dt>
2979				<b>description</b>
2980			</dt>
2981			<dd>
2982				<p>
2983					The description of the key, type or attribute element. There is
					also some informative text about certain keys and types in
2985					Section 3.5 <a href="#Key_And_Type_Definitions_">Key And Type
2986						Definitions</a>.
2987				</p>
2988			</dd>
2989			<dt>
2990				<b>deprecated</b>
2991			</dt>
2992			<dd>
2993				<p>The deprecation status of the key, type or attribute element.
					The value "true" indicates the element is deprecated and no longer
					used in this version of CLDR. The default value is "false".</p>
2996			</dd>
2997			<dt>
2998				<b>preferred</b>
2999			</dt>
3000			<dd>
3001				<p>The preferred value of the deprecated key, type or attribute
3002					element. When a key, type or attribute element is deprecated, this
3003					attribute is used for specifying a new canonical form if available.</p>
3004			</dd>
3005			<dt>
3006				<b>alias</b> (Not applicable to &lt;attribute&gt;)
3007			</dt>
3008			<dd>
3009				<p>The BCP 47 form is the canonical form, and recommended. Other
3010			  aliases are included only for backwards compatibility.</p>
3011			</dd>
3012			<dd>
3013				<em>Example:</em>
3014			</dd>
3015			<dd>
3016				<p>
3017					&lt;type name="phonebk" <strong>alias="phonebook"</strong>
3018					description="Phonebook style ordering (such as in German)"/&gt;<br>
3019				</p>
3020				The preferred term, and the only one to be used in BCP 47, is the
3021				name: in this example, &quot;phonebk&quot;.<br>
3022			</dd>
3023			<dd>
3024				<p>
3025					The alias is a key or type name used by Unicode locale extensions
3026					with the old <a href="#Old_Locale_Extension_Syntax">"@key=type"
3027						syntax</a>. The attribute value for type may contain multiple names
3028					delimited by ASCII space characters. Of those aliases, the first
3029					name is the preferred value.
3030				</p>
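				<p>The relationship between <i>name</i> and <i>alias</i> can be
					sketched as a small converter from the old "@key=type" syntax to
					the BCP 47 form. This is a hedged illustration only: the
					<code>KEYS</code> table is a hand-written excerpt built from
					entries quoted in this section, not the real data files; the
					function name <code>old_to_bcp47</code> is hypothetical; and the
					sketch assumes that old-style keywords are separated by ";", that
					matching is case-insensitive, and that keys are emitted in
					alphabetical order.</p>

```python
# Hypothetical excerpt of bcp47 data, using entries quoted in this section:
# key name -> (key alias or None, {type name: [type aliases]})
KEYS = {
    "co": ("collation", {"phonebk": ["phonebook"], "pinyin": []}),
    "ka": ("colAlternate", {"noignore": ["non-ignorable"], "shifted": []}),
}


def _old_names(name, aliases):
    """Names usable in the old syntax: the aliases if any exist, else the name itself."""
    return [a.lower() for a in aliases] if aliases else [name.lower()]


def old_to_bcp47(old_id: str) -> str:
    """Convert e.g. 'de@collation=phonebook' to 'de-u-co-phonebk'."""
    base, _, keywords = old_id.partition("@")
    pairs = []
    for keyword in filter(None, keywords.split(";")):
        old_key, _, old_type = keyword.partition("=")
        for key, (key_alias, types) in KEYS.items():
            if old_key.lower() in _old_names(key, [key_alias] if key_alias else []):
                break
        else:
            raise ValueError(f"unknown key: {old_key!r}")
        for type_name, type_aliases in types.items():
            if old_type.lower() in _old_names(type_name, type_aliases):
                break
        else:
            raise ValueError(f"unknown type for {key!r}: {old_type!r}")
        pairs.append((key, type_name))
    if not pairs:
        return base.replace("_", "-")
    extension = "-".join(f"{key}-{type_name}" for key, type_name in sorted(pairs))
    return f"{base.replace('_', '-')}-u-{extension}"
```

				<p>In this sketch a type that has an alias is accepted in the old
					syntax only under that alias, so "en@colAlternate=noignore" is
					rejected while "en@colAlternate=non-ignorable" is converted.</p>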
3031			</dd>
3032			<dt>
3033				<b>since</b>
3034			</dt>
3035			<dd>The version of CLDR in which this key or type was
3036				introduced. Absence of this attribute value implies the key or type
3037				was available in CLDR 1.7.2.</dd>
3038		</dl>
3039		<p>
3040			<em>Note: There are no values defined for the locale extension
3041				attribute in the current CLDR release. </em>
3042		</p>
3043		<p>For example,</p>
3044		<pre>
3045&lt;key name="co" alias="collation" description="Collation type key"&gt;
3046  &lt;type name="pinyin" description="Pinyin ordering for Latin and for CJK characters (used in Chinese)"/&gt;
3047&lt;/key&gt;
3048
3049&lt;key name="ka" alias="colAlternate" description="Collation parameter key for alternate handling"&gt;
3050  &lt;type name="noignore" alias="non-ignorable" description="Variable collation elements are not reset to ignorable"/&gt;
3051  &lt;type name="shifted" description="Variable collation elements are reset to zero at levels one through three"/&gt;
3052&lt;/key&gt;
3053
3054&lt;key name="tz" alias="timezone"&gt;
3055  ...
3056  &lt;type name="aumel" alias="Australia/Melbourne Australia/Victoria" description="Melbourne, Australia"/&gt;
3057  &lt;type name="aumqi" alias="Antarctica/Macquarie" description="Macquarie Island Station, Macquarie Island" since="1.8.1"/&gt;
3058  ...
3059&lt;/key&gt;
3060    </pre>
3061		The data above indicates:
3062		<ul>
3063			<li>type "pinyin" is valid for key "co", thus "u-co-pinyin" is a
3064				valid Unicode locale extension.</li>
3065			<li>type "pinyin" is not valid for key "ka", thus "u-ka-pinyin"
3066				is not a valid Unicode locale extension.</li>
3067			<li>type "pinyin" has no <i>alias</i>, so "zh@collation=pinyin"
3068				is a valid Unicode locale identifier according to the old syntax.
3069			</li>
3070			<li>type "noignore" has an alias attribute, so
3071				"en@colAlternate=noignore" is not a valid Unicode locale identifier
3072				according to the old syntax.</li>
3073			<li>type "aumel" is valid for key "tz", supported by CLDR 1.7.2
3074				(default value) or later versions.</li>
3075			<li>type "aumqi" is valid for key "tz", supported by CLDR 1.8.1
3076				or later versions.</li>
3077		</ul>
3078		<p>It is strongly recommended that all API methods accept all
3079			possible aliases for keywords and types, but generate the canonical
3080			form. For example, &quot;ar-u-ca-islamicc&quot; would be equivalent
3081			to &quot;ar-u-ca-islamic-civil&quot; on input, but the latter should
3082			be output. The one exception is where an alias would only be
3083			well-formed with the old syntax, such as &quot;gregorian&quot; (for
3084			&quot;gregory&quot;).</p>
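		<p>The recommendation above can be sketched as follows. This is a minimal
			illustration, not a reference implementation: the XML is a trimmed,
			hypothetical excerpt modeled on the example data in this section, and
			the function names <code>build_tables</code> and <code>canonicalize</code>
			are assumptions introduced only for this sketch.</p>

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical excerpt modeled on the example data in this section.
SAMPLE = """
<keyword>
  <key name="co" alias="collation">
    <type name="pinyin"/>
    <type name="phonebk" alias="phonebook"/>
  </key>
  <key name="ka" alias="colAlternate">
    <type name="noignore" alias="non-ignorable"/>
    <type name="shifted"/>
  </key>
  <key name="tz" alias="timezone">
    <type name="aumel" alias="Australia/Melbourne Australia/Victoria"/>
  </key>
</keyword>
"""

def build_tables(xml_text):
    """Map lowercased names and aliases to canonical key and type names."""
    key_map, type_map = {}, {}
    for key in ET.fromstring(xml_text).iter("key"):
        canonical_key = key.get("name")
        # The alias attribute may hold several names separated by spaces.
        for name in [canonical_key] + key.get("alias", "").split():
            key_map[name.lower()] = canonical_key
        types = type_map.setdefault(canonical_key, {})
        for type_elem in key.iter("type"):
            canonical_type = type_elem.get("name")
            for name in [canonical_type] + type_elem.get("alias", "").split():
                types[name.lower()] = canonical_type
    return key_map, type_map

def canonicalize(key, value, key_map, type_map):
    """Accept a canonical name or an alias; return the canonical (key, type) pair."""
    canonical_key = key_map.get(key.lower())
    if canonical_key is None:
        raise ValueError(f"unknown key: {key!r}")
    canonical_type = type_map[canonical_key].get(value.lower())
    if canonical_type is None:
        raise ValueError(f"type {value!r} is not valid for key {canonical_key!r}")
    return canonical_key, canonical_type

KEY_MAP, TYPE_MAP = build_tables(SAMPLE)
print(canonicalize("collation", "phonebook", KEY_MAP, TYPE_MAP))
```

		<p>The sketch does not distinguish aliases that are only well-formed in
			the old syntax; an implementation following the exception described
			above would accept those only when parsing old-syntax identifiers.</p>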
3085		<h4>
3086			<a href="#Unicode_Subdivision_Codes" name="Unicode_Subdivision_Codes">3.6.5
3087				Subdivision Codes</a>
3088		</h4>
3089		<p>
3090			The subdivision codes designate a
3091				subdivision of a country or region. They are called various names,
3092				such as a <em>state</em> in the United States, or a <em>province</em>
3093				in Canada. The codes in CLDR
3094			are based on ISO 3166-2 subdivision codes. The
3095				ISO codes have a region code followed by a hyphen, then a suffix
3096				consisting of 1..3 ASCII letters or digits.
3097		</p>
3098		<p>
3099			The CLDR codes are designed to work in a
3100				<a href='#unicode_locale_id'>unicode_locale_id</a> (BCP47), and are
3101				thus all lowercase, with no hyphen.
3102			For example, the following are valid, and mean “English as used in
3103			California, USA”.
3104		</p>
3105		<ul>
3106			<li>en-u-sd-<strong>usca</strong></li>
3107			<li>en-US-u-sd-<strong>usca</strong></li>
3108		</ul>
3109		<p>CLDR has additional subdivision codes. These
3110			may start with a 3-digit region code or use a suffix of 4 ASCII
3111			letters or digits, so they will not collide with the ISO codes.
3112			Subdivision codes for unknown values are the region code plus
3113			&quot;zzzz&quot;, such as &quot;uszzzz&quot; for an unknown
3114			subdivision of the US. Other codes may be added for stability.</p>
3115		<p>
3116			Like BCP 47, CLDR requires stable codes, which are not guaranteed for
3117			ISO 3166-2 (nor have the ISO 3166-2
3118				codes been stable in the past). If an ISO 3166-2 code is removed, it
3119			remains valid (though marked as deprecated) in CLDR. If an ISO 3166-2
3120			code is reused (for the same region), then CLDR will define a new
3121			equivalent code using one of these 4-character suffixes.
3122	  </p>
3123		<h5>
3124			<a name="Validity" href="#Validity">3.6.5.1 Validity</a>
3125		</h5>
3126		<p>
3127			A <a href="#unicode_subdivision_id">unicode_subdivision_id</a>
3128			is only valid when it is present in the
3129				subdivision.xml file as described in <em>Section 3.11 <a
3130					href="#Validity_Data">Validity Data</a></em>.
3131			The data is in a compressed form, and thus needs to be expanded
3132			before such a test is made.
3133		</p>
3134		<p>
3135			<em> Examples:<br>
3136			</em>
3137		</p>
3138		<ul>
3139			<li><strong>usca</strong> is valid — there is an <strong>id</strong>
3140				element <code>&lt;id type="subdivision"…&gt;… usca
3141					…&lt;/id&gt;</code></li>
3142			<li><strong>ussct</strong> is invalid — there is no <strong>id</strong>
3143				element <code>&lt;id type="subdivision"…&gt;… ussct
3144					…&lt;/id&gt;</code></li>
3145		</ul>
3146		<p>If a <a href='#unicode_locale_id'>unicode_locale_id</a> contains both a <a
3147				href="#unicode_region_subtag">unicode_region_subtag</a> and a <a
3148				href="#unicode_subdivision_id">unicode_subdivision_id</a>, it is only valid if the <a
3149				href="#unicode_subdivision_id">unicode_subdivision_id</a> starts with the <a
3150				href="#unicode_region_subtag">unicode_region_subtag</a> (case-insensitively).<br>
3151		</p>
3152		<p>It is  recommended that a <a href='#unicode_locale_id'>unicode_locale_id</a> contain a <a
3153				href="#unicode_region_subtag">unicode_region_subtag</a> if it contains a <a
3154				href="#unicode_subdivision_id">unicode_subdivision_id</a> and the region would not be added by adding likely subtags. That produces better behavior if the <a
3155				href="#unicode_subdivision_id">unicode_subdivision_id</a> is ignored by an implementation or if the language tag is truncated.		</p>
3156		<p>
3157			Examples:<br>
3158		</p>
3159		<ul>
3160			<li>en-<strong>US</strong>-u-sd-<strong>us</strong>ca
3161				is valid — the region &quot;US&quot; matches
3162			the first part of "usca"</li>
3163			<li>en-u-sd-<strong>us</strong>ca is valid — it still works after adding likely subtags.</li>
3164			<li>en-<strong>CA</strong>-u-sd-<strong>gb</strong>sct is
3165				invalid — the region &quot;CA&quot; does not match the first part of &quot;gbsct&quot;. An implementation should  disregard the subdivision id (or return an error).</li>
3166			<li>en-u-sd-<strong>gb</strong>sct is valid but not recommended — an implementation that ignores the <a
3167				href="#unicode_subdivision_id">unicode_subdivision_id</a> can get the wrong fallback behavior, or could add likely subtags and get the invalid en<strong>-Latn-US</strong>-u-sd-<strong>gb</strong>sct</li>
3168		</ul>
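		<p>The two validity conditions above (presence in the validity data, and
			agreement with the region subtag) can be sketched as follows. This is a
			simplified illustration under stated assumptions: <code>VALID_SUBDIVISIONS</code>
			is a tiny hypothetical stand-in for the expanded subdivision.xml data,
			<code>check_subdivision</code> is a name introduced here, and the parsing
			handles only simple tags with a single 'u' extension.</p>

```python
import re

# Hypothetical stand-in for the expanded validity data from subdivision.xml.
VALID_SUBDIVISIONS = {"usca", "gbsct", "uszzzz"}

# A region subtag is two letters or three digits.
REGION = re.compile(r"[a-z]{2}|[0-9]{3}")

def check_subdivision(locale_id):
    """Return 'valid', 'invalid', or 'no subdivision' for a simple locale id."""
    subtags = locale_id.lower().split("-")
    if "u" not in subtags:
        return "no subdivision"
    u_index = subtags.index("u")
    base, ext = subtags[:u_index], subtags[u_index + 1:]
    # The "sd" key must be followed by a value.
    if "sd" not in ext[:-1]:
        return "no subdivision"
    subdivision = ext[ext.index("sd") + 1]
    if subdivision not in VALID_SUBDIVISIONS:
        return "invalid"
    # Skip the language subtag, then look for an explicit region subtag.
    region = next((s for s in base[1:] if REGION.fullmatch(s)), None)
    if region is not None and not subdivision.startswith(region):
        return "invalid"
    return "valid"

for tag in ("en-US-u-sd-usca", "en-u-sd-usca", "en-CA-u-sd-gbsct", "en-u-sd-gbsct"):
    print(tag, check_subdivision(tag))
```

		<p>The sketch reports only validity; it does not flag the "valid but not
			recommended" case, which would additionally require likely-subtags data.</p>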
3169		<p>
3170			In version 28.0, the subdivisions in the
3171			validity files used the ISO format, uppercase with a hyphen separating two
3172			components, instead of the BCP 47 format.
3173	  </p>
3174		<h3>
3175			<a name="t_Extension"></a><a name="BCP47_T_Extension"
3176				href="#BCP47_T_Extension">3.7 Unicode BCP 47 T Extension</a>
3177		</h3>
3178		<p>
3179			The Unicode Consortium has registered and is the maintaining
3180			authority for two BCP 47 language tag extensions: the extension 'u'
3181			for Unicode locale extension [<a href="#RFC6067">RFC6067</a>] and
3182			extension 't' for transformed content [<a href="#RFC6497">RFC6497</a>].
3183			The Unicode BCP 47 extension data defines the complete list of valid
3184			subtags.
3185		While the title of the RFC is &ldquo;Transformed Content&rdquo;, the abstract makes it clear that the scope is broader than the term "transformed" might indicate to a casual reader: “including content that has been transliterated, transcribed, or
3186        translated, or <em>in some other way influenced by the source. It also provides for additional information used for identification.</em>”</p>
3187		<p>
3188			<strong>The -t- Extension.</strong> The syntax of 't' extension
3189			subtags is defined by the rule
3190			<code>unicode_locale_extensions</code>
3191			in <a href="#Unicode_locale_identifier"><em>Section 3.2
3192					Unicode locale identifier</em></a>, except the separator of subtags
3193			<code>sep</code>
3194			must always be a hyphen '-' when the extension is used as part of a BCP
3195			47 language tag. For information about the registration process,
3196			meaning, and usage of the 't' extension, see [<a href="#RFC6497">RFC6497</a>].
3197		</p>
3198		<p>
3199			These subtags are all in lowercase (that is the canonical casing for
3200			these subtags); however, subtags are case-insensitive and casing does
3201			not carry any specific meaning. All subtags within the Unicode
3202			extensions are alphanumeric strings of two to eight characters that
3203			meet the rule
3204			<code>extension</code>
3205			in the [<a href="#BCP47">BCP47</a>].</p>
3206	  <p>The following keys are defined for the -t- extension:</p>
3207		<table class='simple'>
3208		  <tbody>
3209		    <tr>
3210		      <th>Keys</th>
3211		      <th>Description</th>
3212		      <th>Values in latest release</th>
3213	        </tr>
3214		    <tr>
3215		      <td>m0</td>
3216		      <td><strong>Transform extension mechanism:</strong> to reference an authority or rules for a type of transformation</td>
3217		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform.xml">​transform.xml</a></td>
3218	        </tr>
3219		    <tr>
3220		      <td nowrap>s0, d0 </td>
3221		      <td><strong>Transform source/destination:</strong> for non-languages/scripts, such as fullwidth-halfwidth conversion.</td>
3222		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform-destination.xml">​transform-destination.xml</a></td>
3223	        </tr>
3224		    <tr>
3225		      <td>i0</td>
3226		      <td><strong>Input Method Engine transform:</strong> Used to indicate an input method transformation, such as one used by
3227a client-side input method. The first subfield in a sequence would
3228typically be a 'platform' or vendor designation.</td>
3229		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_ime.xml">​transform_ime.xml</a></td>
3230	        </tr>
3231		    <tr>
3232		      <td>k0</td>
3233		      <td><strong>Keyboard transform:</strong> Used to indicate a keyboard transformation, such as one used by a client-side virtual keyboard. The first subfield in a sequence would typically be a 'platform' designation, representing the platform that the keyboard is intended for. The keyboard might or might not correspond to a keyboard mapping shipped by the vendor for the platform. One or more subsequent fields may occur, but are only added where needed to distinguish from others.</td>
3234		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_keyboard.xml">​transform_keyboard.xml</a></td>
3235	        </tr>
3236		    <tr>
3237		      <td>t0</td>
3238		      <td><strong>Machine Translation:</strong> Used to indicate content that has been machine translated, or a request for a particular type of machine translation of content. The first subfield in a sequence would typically be a 'platform' or vendor designation.</td>
3239		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_mt.xml">​transform_mt.xml</a></td>
3240	        </tr>
3241		    <tr>
3242		      <td nowrap>h0</td>
3243		      <td><strong>Hybrid Locale Identifiers:</strong> h0 with the value 'hybrid' indicates that the -t- value is a language that is mixed into the main  language tag to form a hybrid.  		For more information, and examples, see <em>Section 3.10.2 <a href="#Hybrid_Locale">Hybrid Locale Identifiers</a>.</em></td>
3244		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_hybrid.xml">​transform_hybrid.xml</a></td>
3245	        </tr>
3246			    <tr>
3247		      <td>x0</td>
3248		      <td><strong>Private use transform</strong></td>
3249		      <td><a href="http://unicode.org/repos/cldr/tags/latest/common/bcp47/transform_private_use.xml">​transform_private_use.xml</a></td>
3250	        </tr>
3251      </tbody>
3252	  </table>
3253		<h4>
3254			<a href="#Transformed_Content_Data_File"
3255				name="Transformed_Content_Data_File">3.7.1 T Extension Data
3256				Files</a>
3257		</h4>
3258		<p>The overall structure of the data files is similar to that of the U
3259			Extension, with the following exceptions.</p>
3260		<p>In the transformed content 't' data file, the name attribute in
3261			a &lt;key&gt; element defines a valid field separator subtag. The
3262			name attribute in an enclosed &lt;type&gt; element defines a valid
3263			field subtag for the field separator subtag. For example:</p>
3264		<pre>
3265&lt;key extension="t" name="m0"
3266    description="Transform extension mechanism"&gt;
3267	&lt;type name="ungegn"
3268		description="United Nations Group of Experts on Geographical Names"
3269      since="21"/&gt;
3270&lt;/key&gt;
3271</pre>
3272		The data above indicates:
3273		<ul>
3274			<li>"m0" is a valid field separator for the transformed content
3275				extension 't'.</li>
3276			<li>field subtag "ungegn" is valid for field separator "m0".</li>
3277			<li>field subtag "ungegn" was introduced in CLDR 21.</li>
3278		</ul>
3279		<p>The attributes are:</p>
3280		<dl>
3281			<dt>
3282				<b>name</b>
3283			</dt>
3284			<dd>
3285				The name of the mechanism, limited to 3-8 characters (or sequences
3286				of them). Any indirect type names are
3287					listed in 3.6.4 <a href="#Unicode_Locale_Extension_Data_Files">U
3288						Extension Data Files</a>.
3289		  </dd>
3290			<dt>
3291				<b>description</b>
3292			</dt>
3293			<dd>A description of the name, with all and only that
3294				information necessary to distinguish one name from
3295				others with which it might be confused. Descriptions are not
3296				intended to provide general background information.</dd>
3297			<dt>
3298				<b>since</b>
3299			</dt>
3300			<dd>Indicates the first version of CLDR where the name appears.
3301				(Required for new items.)</dd>
3302			<dt>&nbsp;</dt>
3303			<dt>
3304				<b>alias</b>
3305			</dt>
3306			<dd>
3307				Alternative name, not limited in number of characters. Aliases are
3308				intended for compatibility, not to provide all possible alternate
3309				names or designations. <em>(Optional)</em>
3310			</dd>
3311		</dl>
3312		<p>
3313			For information about the registration process, meaning, and usage of
3314			the 't' extension, see [<a href="#RFC6497">RFC6497</a>].
3315		</p>
3316		<h3>
3317			<a name="Compatibility_with_Older_Identifiers"
3318				href="#Compatibility_with_Older_Identifiers">3.8 Compatibility
3319				with Older Identifiers</a>
3320		</h3>
3321		<p>LDML versions before 1.7.2 used a slightly different syntax for
3322			variant subtags and locale extensions. Implementations of LDML may
3323			provide backward-compatible identifier support as described in
3324			the following sections.</p>
3325
3326		<h4>
3327			<a name="Old_Locale_Extension_Syntax"
3328				href="#Old_Locale_Extension_Syntax">3.8.1 Old Locale Extension
3329				Syntax </a>
3330		</h4>
3331		<p>The LDML 1.7 and older specifications used a different syntax for
3332			representing Unicode locale extensions. The previous definition of
3333			Unicode locale extensions had the following structure:</p>
3334		<table border="0">
3335			<tr>
3336				<th>&nbsp;</th>
3337				<th><div align="center">EBNF</div></th>
3338				<th><div align="center">ABNF</div></th>
3339			</tr>
3340			<tr>
3341				<td>old_unicode_locale_extensions</td>
3342				<td><pre>= "@" old_key "=" old_type
3343 (";" old_key "=" old_type)*</pre></td>
3344				<td><pre>= "@" old_key "=" old_type
3345*(";" old_key "=" old_type)</pre></td>
3346			</tr>
3347		</table>
3348		<p>The new specification mandates keys to be two alphanumeric
3349			characters and types to be three to eight alphanumeric characters. As
3350			a result, new codes were assigned to all existing keys and some
3351			types. For example, a new key "co" replaced the previous key
3352			"collation", a new type "phonebk" replaced the previous type
3353			"phonebook". However, the existing collation type "big5han" already
3354			satisfied the new requirement, so no new type code was assigned to
3355			the type. All new keys and types introduced after LDML 1.7 satisfy
3356			the new requirement, so they do not have aliases dedicated for the
3357			old syntax, except time zone types. The conversion between old types
3358			and new types can be done regardless of key, with one known exception
3359			(old type "traditional" is mapped to new type "trad" for collation
3360			and "traditio" for numbering system), and this relationship will be
3361			maintained in the future versions unless otherwise noted.</p>
3362		<p>
3363			The new specification introduced a new field
3364			<code>attribute</code>
3365			in addition to key/type pairs in the Unicode locale extension. When
3366			it is necessary to map a new Unicode locale identifier with
3367			<code>attribute</code>
3368			field to a well-formed old locale identifier, a special key name <i>attribute</i>
3369			whose value is the entire sequence of
3370			<code>attribute</code>
3371			subtags in the new identifier is used. For example, a new identifier
3372			<code>ja-u-xxx-yyy-ca-japanese</code>
3373			is mapped to an old identifier
3374			<code>ja@attribute=xxx-yyy;calendar=japanese</code>
3375			.
3376		</p>
3377		<p>The chart below shows some example mappings between the new
3378			syntax and the old syntax.</p>
3379
3380		<table>
3381			<caption>
3382				<a name="Locale_Extension_Mappings"
3383					href="#Locale_Extension_Mappings">Locale Extension Mappings</a>
3384			</caption>
3385			<tr>
3386				<th>Old (LDML 1.7 or older)</th>
3387				<th>New</th>
3388			</tr>
3389			<tr>
3390				<td>de_DE@collation=phonebook</td>
3391				<td>de_DE_u_co_phonebk</td>
3392			</tr>
3393			<tr>
3394				<td>zh_Hant_TW@collation=big5han</td>
3395				<td>zh_Hant_TW_u_co_big5han</td>
3396			</tr>
3397			<tr>
3398				<td>th_TH@calendar=gregorian;numbers=thai</td>
3399				<td>th_TH_u_ca_gregory_nu_thai</td>
3400			</tr>
3401			<tr>
3402				<td>en_US_POSIX@timezone=America/Los_Angeles</td>
3403				<td>en_US_u_tz_uslax_va_posix</td>
3404			</tr>
3405		</table>
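		<p>The mappings in the chart can be sketched as a small conversion routine.
			This is an illustration only: the lookup tables <code>OLD_KEYS</code> and
			<code>OLD_TYPES</code> are hypothetical excerpts covering just the rows
			above (real data comes from the alias attributes in the BCP 47 data
			files), the function name <code>old_to_new</code> is an assumption, and
			the output uses the hyphen separator of a BCP 47 language tag. The
			key-dependent "traditional" exception and legacy variants such as POSIX
			are not handled.</p>

```python
# Hypothetical excerpts of the alias data, keyed by lowercased old names.
OLD_KEYS = {"collation": "co", "calendar": "ca", "numbers": "nu", "timezone": "tz"}
OLD_TYPES = {
    "phonebook": "phonebk",
    "gregorian": "gregory",
    "america/los_angeles": "uslax",
}

def old_to_new(locale_id):
    """Convert 'base@key=type;key=type' to 'base-u-key-type-key-type'."""
    base, _, keywords = locale_id.partition("@")
    subtags = base.replace("_", "-").split("-")
    if not keywords:
        return "-".join(subtags)
    pairs = []
    for item in keywords.split(";"):
        old_key, sep, old_type = item.partition("=")
        new_key = OLD_KEYS.get(old_key.lower())
        if not sep or new_key is None:
            # Signal an error rather than guessing, as recommended in this section.
            raise ValueError(f"cannot convert keyword: {item!r}")
        # Types that already satisfy the new syntax map to themselves.
        new_type = OLD_TYPES.get(old_type.lower(), old_type.lower())
        pairs.append((new_key, new_type))
    # Emit keywords in alphabetical order of key.
    for new_key, new_type in sorted(pairs):
        subtags += [new_key, new_type]
    return "-".join(subtags[:len(base.split("_"))] + ["u"] + subtags[len(base.split("_")):])

print(old_to_new("th_TH@calendar=gregorian;numbers=thai"))
```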
3406
3407		<p>Where the old API is supplied the BCP 47 language code, or vice
3408			versa, the recommendation is to:</p>
3409		<ol>
3410			<li>Have all methods that take the old syntax also take the new
3411				syntax, interpreted correctly. For example,
3412				&quot;zh-TW-u-co-pinyin&quot; and &quot;zh_TW@collation=pinyin&quot;
3413				would both be interpreted as meaning the same.</li>
3414			<li>Have all methods (both for old and new syntax) accept all
3415				possible aliases for keywords and types. For example,
3416				&quot;ar-u-ca-islamicc&quot; would be equivalent to
3417				&quot;ar-u-ca-islamic-civil&quot;.
3418				<ul>
3419					<li>The one exception is where an alias would only be
3420						well-formed with the old syntax, such as &quot;gregorian&quot;
3421						(for &quot;gregory&quot;).</li>
3422				</ul>
3423			</li>
3424			<li>Where an API cannot successfully accept the alternate
3425				syntax, throw an exception (or otherwise indicate an error) so that
3426				people can detect that they are using the wrong method (or wrong
3427				input).</li>
3428			<li>Provide a method that tests a purported locale ID string to
3429				determine its status:
3430				<ol>
3431					<li><strong>well-formed</strong> - syntactically correct</li>
3432					<li><strong>valid</strong> - well-formed and only uses
3433						registered language subtags, extensions, keywords, types...</li>
3434					<li><strong>canonical</strong> - valid and no deprecated codes
3435						or structure.</li>
3436				</ol>
3437			</li>
3438		</ol>
3439
3440		<h4>
3441			<a name="Legacy_Variants" href="#Legacy_Variants">3.8.2 Legacy
3442				Variants </a>
3443		</h4>
3444		<p>
3445			Older LDML specifications allowed codes other than registered [<a
3446				href="#BCP47">BCP47</a>] variant subtags to be used in Unicode language
3447			and locale identifiers for representing variations of locale data.
3448			Unicode locale identifiers including such variant codes can be
3449			converted to the new [<a href="#BCP47">BCP47</a>] compatible
3450			identifiers by following the descriptions below:
3451		</p>
3452		<table>
3453			<caption>
3454				<a name="Legacy_Variant_Mappings" href="#Legacy_Variant_Mappings">Legacy
3455					Variant Mappings</a>
3456			</caption>
3457			<tr>
3458				<th>Variant Code</th>
3459				<th>Description</th>
3460			</tr>
3461
3462			<tr>
3463				<td>AALAND</td>
3464				<td>Åland, variant of "sv" Swedish used in Finland. Use "sv_AX"
3465					to indicate this.</td>
3466			</tr>
3467
3468			<tr>
3469				<td>BOKMAL</td>
3470				<td>Bokmål, variant of "no" Norwegian. Use primary language
3471					subtag "nb" to indicate this.</td>
3472			</tr>
3473
3474			<tr>
3475				<td>NYNORSK</td>
3476				<td>Nynorsk, variant of "no" Norwegian. Use primary language
3477					subtag "nn" to indicate this.</td>
3478			</tr>
3479
3480			<tr>
3481				<td>POSIX</td>
3482				<td>POSIX variation of locale data. Use Unicode locale
3483					extension "-u-va-posix" to indicate this.</td>
3484			</tr>
3485
3486			<tr>
3487				<td>POLYTONI</td>
3488				<td>Polytonic, variant of "el" Greek. Use [<a href="#BCP47">BCP47</a>]
3489					variant subtag "polyton" to indicate this.
3490				</td>
3491			</tr>
3492
3493			<tr>
3494				<td>SAAHO</td>
3495				<td>The Saaho variant of Afar. Use primary language subtag
3496					"ssy" to indicate this.</td>
3497			</tr>
3498		</table>
3499		<p>
3500			When converting to old syntax, the Unicode locale extension
3501			"-u-va-posix" should be converted to the "POSIX" variant, <i>not</i>
3502			to old extension syntax like "@va=posix". This is an exception: The
3503			other mappings above should not be reversed.
3504		</p>
3505
3506		<p>Examples:</p>
3507		<ul>
3508			<li>en_US_POSIX ↔ en-US-u-va-posix</li>
3509			<li>en_US_POSIX@colNumeric=yes ↔ en-US-u-kn-va-posix</li>
3510			<li>en-US-POSIX-u-kn-true → en-US-u-kn-va-posix</li>
3511			<li>en-US-POSIX-u-kn-va-posix → en-US-u-kn-va-posix</li>
3512		</ul>
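		<p>A minimal sketch of the forward conversion described in the Legacy
			Variant Mappings table, assuming the identifier has no existing
			extension or "@" keywords and begins with a language subtag. The
			function name <code>convert_legacy_variant</code> and the table name
			<code>LANGUAGE_REPLACEMENTS</code> are introduced here for illustration;
			the AALAND mapping is omitted because it changes the region rather
			than a single subtag.</p>

```python
# Legacy variants that are expressed by replacing the primary language subtag.
LANGUAGE_REPLACEMENTS = {"BOKMAL": "nb", "NYNORSK": "nn", "SAAHO": "ssy"}

def convert_legacy_variant(locale_id):
    """Rewrite legacy variant codes into BCP 47 compatible subtags."""
    result, suffix = [], []
    for subtag in locale_id.replace("_", "-").split("-"):
        code = subtag.upper()
        if code in LANGUAGE_REPLACEMENTS and result:
            # Replace the language subtag and drop the legacy variant.
            result[0] = LANGUAGE_REPLACEMENTS[code]
        elif code == "POLYTONI":
            result.append("polyton")
        elif code == "POSIX":
            # POSIX becomes a Unicode locale extension at the end of the tag.
            suffix = ["u", "va", "posix"]
        else:
            result.append(subtag)
    return "-".join(result + suffix)

print(convert_legacy_variant("en_US_POSIX"))
```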
3513
3514		<h4>
3515			<a name="Relation_to_OpenI18n" href="#Relation_to_OpenI18n">3.8.3
3516				Relation to OpenI18n</a>
3517		</h4>
3518		<p>
3519			The locale id format generally follows the description in the <i>OpenI18N
3520				Locale Naming Guideline</i> [<a href="#NamingGuideline">NamingGuideline</a>],
3521			with some enhancements. The main differences from those
3522			guidelines are that the locale id:
3523		</p>
3524		<ol type="a">
3525			<li style="margin-top: 0.5em; margin-bottom: 0.5em">does not
3526				include a charset (since the data in LDML format always provides a
3527				representation of all Unicode characters. The repository is stored
3528				in UTF-8, although that can be transcoded to other encodings as
3529				well.),</li>
3530			<li style="margin-top: 0.5em; margin-bottom: 0.5em">adds the
3531				ability to have a variant, as in Java</li>
3532			<li style="margin-top: 0.5em; margin-bottom: 0.5em">adds the
3533				ability to discriminate the written language by script (or script
3534				variant).</li>
3535			<li style="margin-top: 0.5em; margin-bottom: 0.5em">is a
3536				superset of [<a href="#BCP47">BCP47</a>] codes.
3537			</li>
3538		</ol>
3539		<h3>
3540			<a name="Transmitting_Locale_Information"
3541				href="#Transmitting_Locale_Information">3.9 Transmitting Locale
3542				Information</a>
3543		</h3>
3544		<p>
3545			In a world of on-demand software components, with arbitrary
3546			connections between those components, it is important to get a sense
3547			of where localization should be done, and how to transmit enough
3548			information so that it can be done at that appropriate place.
3549			End-users need to get messages localized to their languages, messages
3550			that not only contain a translation of text, but also contain
3551			variables such as date, time, number formats, and currencies
3552			formatted according to the users&#39; conventions. The strategy for
3553			doing the so-called <i>JIT localization </i>is made up of two parts:
3554		</p>
3555		<ol>
3556			<li>Store and transmit <i>neutral-format</i> data wherever
3557				possible.
3558				<ul>
3559					<li>Neutral-format data is data that is kept in a standard
3560						format, no matter what the local user&#39;s environment is.
3561						Neutral-format is also (loosely) called <i>binary data</i>, even
3562						though it actually could be represented in many different ways,
3563						including a textual representation such as in XML.
3564					</li>
3565					<li>Such data should use accepted standards where possible,
3566						such as for currency codes.</li>
3567					<li>Textual data should also be in a uniform character set
3568						(Unicode/10646) to avoid possible data corruption problems when
3569						converting between encodings.</li>
3570				</ul>
3571			</li>
3572			<li>Localize that data as &quot;<i>close</i>&quot; to the
3573				end-user as possible.
3574			</li>
3575		</ol>
3576		<p>There are a number of advantages to this strategy. The longer
3577			the data is kept in a neutral format, the more flexible the entire
3578			system is. On a practical level, if transmitted data is
3579			neutral-format, then it is much easier to manipulate the data, debug
3580			the processing of the data, and maintain the software connections
3581			between components.</p>
3582		<p>Once data has been localized into a given language, it can be
3583			quite difficult to programmatically convert that data into another
3584			format, if required. This is especially true if the data contains a
3585			mixture of translated text and formatted variables. Once information
3586			has been localized into, say, Romanian, it is much more difficult to
3587			localize that data into, say, French. Parsing is more difficult than
3588			formatting, and may run up against different ambiguities in
3589			interpreting text that has been localized, even if the original
3590			translated message text is available (which it may not be).</p>
3591		<p>Moreover, the closer we are to the end-user, the more we know about
3592			that user&#39;s preferred formats. If we format dates, for example,
3593			at the user&#39;s machine, then it can easily take into account any
3594			customizations that the user has specified. If the formatting is done
3595			elsewhere, either we have to transmit whatever user customizations
3596			are in play, or we only transmit the user&#39;s locale code, which
3597			may only approximate the desired format. Thus the closer the
3598			localization is to the end user, the less we need to ship all of the
3599			user&#39;s preferences around to all the places that localization
3600			could possibly need to be done.</p>
3601		<p>Even though localization should be done as close to the
3602			end-user as possible, there will be cases where different components
3603			need to be aware of whatever settings are appropriate for doing the
3604			localization. Thus information such as a locale code or time zone
3605			needs to be communicated between different components.</p>
3606		<h4>
3607			<a name="Message_Formatting_and_Exceptions"
3608				href="#Message_Formatting_and_Exceptions">3.9.1 Message
3609				Formatting and Exceptions</a>
3610		</h4>
3611		<p>
3612			Windows (<a
3613				href="http://msdn.microsoft.com/en-us/library/ms679351.aspx">FormatMessage</a>,
3614			<a href="http://msdn.microsoft.com/en-us/library/aa331875.aspx">String.Format</a>),
3615			Java (<a
3616				href="http://docs.oracle.com/javase/7/docs/api/java/text/MessageFormat.html">MessageFormat</a>)
3617			and ICU (<a
3618				href="http://www.icu-project.org/apiref/icu4c/classMessageFormat.html">MessageFormat</a>,
3619			<a href="http://www.icu-project.org/apiref/icu4c/umsg_8h.html">umsg</a>)
3620			all provide methods of formatting variables (dates, times, etc) and
3621			inserting them at arbitrary positions in a string. This avoids the
3622			manual string concatenation that causes severe problems for
3623			localization. The question is, where to do this? It is especially
3624			important since the original code site that originates a particular
3625			message may be far down in the bowels of a component, and passed up
3626			to the top of the component with an exception. So we will take that
3627			case as representative of this class of issues.
3628		</p>
3629		<p>There are circumstances where the message can be communicated
3630			with a language-neutral code, such as a numeric error code or
3631			mnemonic string key, that is understood outside of the component. If
3632			there are arguments that need to accompany that message, such as a
3633			number of files or a datetime, those need to accompany the numeric
3634			code so that when the localization is finally done at some point, the full
3635			information can be presented to the end-user. This is the best case
3636			for localization.</p>
3637		<p>More often, the exact messages that could originate from within
3638			the component are not known outside of the component itself; or at
3639			least they may not be known by the component that is finally
3640			displaying text to the user. In such a case, the information as to
3641			the user&#39;s locale needs to be communicated in some way to the
3642			component that is doing the localization. That locale information
3643			does not necessarily need to be communicated deep within the
3644			component; ideally, any exceptions should bundle up some
3645			language-neutral message ID, plus the arguments needed to format the
3646			message (for example, datetime), but not do the localization at the
3647			throw site. This approach has the advantages noted above for JIT
3648			localization.</p>
3649		<p>In addition, exceptions are often caught at a higher level;
3650			they do not end up being displayed to any end-user at all. By
3651			avoiding the localization at the throw site, one avoids the cost of doing
3652			formatting, when that formatting is not really necessary. In fact, in
3653			many running programs most of the exceptions that are thrown at a low
3654			level never end up being presented to an end-user, so this can have
3655			considerable performance benefits.</p>
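		<p>The approach described in this section can be sketched as an exception
			that carries only a language-neutral message ID and its raw arguments,
			with formatting deferred to the component that displays text. All names
			here (<code>ComponentError</code>, <code>MESSAGES</code>, <code>localize</code>,
			and the message ID "files.missing") are hypothetical, and
			<code>string.Template</code> stands in for a real message formatter,
			which would also handle plurals and locale-specific number formats.</p>

```python
from string import Template

class ComponentError(Exception):
    """Carries a language-neutral message ID and raw arguments; formats nothing."""
    def __init__(self, message_id, **params):
        super().__init__(message_id)
        self.message_id = message_id
        self.params = params

# Hypothetical message catalog, available only where text is displayed.
MESSAGES = {
    "en": {"files.missing": "$count files could not be found."},
    "fr": {"files.missing": "$count fichiers sont introuvables."},
}

def localize(error, locale):
    """Format the message at the point of display, using the user's locale."""
    template = MESSAGES[locale][error.message_id]
    return Template(template).substitute(error.params)

def low_level_operation():
    # The throw site does no localization and pays no formatting cost.
    raise ComponentError("files.missing", count=3)

try:
    low_level_operation()
except ComponentError as error:
    print(localize(error, "fr"))
```

		<p>If the exception is caught and handled at a higher level without being
			shown to anyone, <code>localize</code> is never called and no formatting
			work is done.</p>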
3656		<h3>
3657			<a name="Language_and_Locale_IDs" href="#Language_and_Locale_IDs">3.10
3658				Unicode Language and Locale IDs</a>
3659		</h3>
3660		<p>People have very slippery notions of what distinguishes a
3661			language code versus a locale code. The problem is that both are
3662			somewhat nebulous concepts.</p>
3663		<p>
3664			In practice, many people use [<a href="#BCP47">BCP47</a>] codes to
3665			mean locale codes instead of strictly language codes. It is easy to
3666			see why this came about; because [<a href="#BCP47">BCP47</a>]
3667			includes an explicit region (territory) code, for most people it was
3668			sufficient for use as a locale code as well. For example, when
3669			typical web software receives a [<a href="#BCP47">BCP47</a>] code,
3670			it will use it as a locale code. Other typical software will do the
3671			same: in practice, language codes and locale codes are treated
3672			interchangeably. Some people recommend distinguishing on the basis of
3673			&quot;-&quot; versus &quot;_&quot; (for example, <i>zh-TW</i> for
3674			language code, <i>zh_TW</i> for locale code), but in practice that
3675			does not work because of the free variation out in the world in the
3676			use of these separators. Notice that Windows, for example, uses
3677			&quot;-&quot; as a separator in its locale codes. So pragmatically
3678			one is forced to treat &quot;-&quot; and &quot;_&quot; as equivalent
3679			when interpreting either one on input.
3680		</p>
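<p>As a rough illustration of treating the two separators as equivalent on input, a comparison might first map both to a single form. The function names are hypothetical, and this sketch does no validation or canonicalization beyond separators and case.</p>

```python
def normalize_separators(locale_id):
    """Treat "_" and "-" as equivalent by mapping both to "-"."""
    return locale_id.replace("_", "-")


def same_locale(a, b):
    """Compare two ids, ignoring separator choice and letter case."""
    return normalize_separators(a).lower() == normalize_separators(b).lower()
```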
3681		<p>
3682			Another reason for the conflation of these codes is that <i>very</i>
3683			little data in most systems is distinguished by region alone;
3684			currency codes and measurement systems being some of the few.
3685			Sometimes date or number formats are mentioned as regional, but that
3686			really does not make much sense. If people see the sentence &quot;You
3687			will have to adjust the value to १,२३४.५६७ from ૭૧,૨૩૪.૫૬&quot;
3688			(using Indic digits), they would say that sentence is simply not
3689			English. Number format is far more closely associated with language
3690			than it is with region. The same is true for date formats: people
3691			would never expect to see intermixed a date in the format
3692			&quot;2003年4月1日&quot; (using Kanji) in text purporting to be purely
3693			English. There are regional differences in date and number format —
3694			differences which can be important — but those are different in kind
3695			than other language differences between regions.
3696		</p>
3697		<p>
3698			As far as we are concerned — <i>as a completely practical matter</i>
3699			— two languages are different if they require substantially different
3700			localized resources. Distinctions according to spoken form are
3701			important in some contexts, but the written form is by far and away
3702			the most important issue for data interchange. Unfortunately, this is
3703			not the principle used in [<a href="#ISO639">ISO639</a>], which has
3704			the fairly unproductive notion (for data interchange) that only
3705			spoken language matters (it is also not completely consistent about
3706			this, however).
3707		</p>
3708		<p>
3709			[<a href="#BCP47">BCP47</a>] <i><b>can</b></i> express a difference
3710			if the use of written languages happens to correspond to region
3711			boundaries expressed as [<a href="#ISO3166">ISO3166</a>] region
3712			codes, and has recently added codes that allow it to express some
3713			important cases that are not distinguished by [<a href="#ISO3166">ISO3166</a>]
3714			codes. These written languages include simplified and traditional
3715			Chinese (both used in Hong Kong S.A.R.); Serbian in Latin script;
3716			Azerbaijani in Arabic script; and so on.
3717		</p>
3718		<p>
3719			Notice also that <i>currency codes</i> are different than <i>currency
3720				localizations</i>. The currency localizations should largely be in the
3721			language-based resource bundles, not in the territory-based resource
3722			bundles. Thus, the resource bundle <i>en</i> contains the localized
3723			mappings in English for a range of different currency codes: USD →
3724			US$, RUR → Rub, AUD → $A and so on. Of course, some currency symbols
3725			are used for more than one currency, and in such cases
3726			specializations appear in the territory-based bundles. Continuing the
3727			example, <i>en_US</i> would have USD → $, while <i>en_AU</i> would
3728			have AUD → $. (In protocols, the currency codes should always
3729			accompany any currency amounts; otherwise the data is ambiguous, and
3730			software is forced to use the user&#39;s territory to guess at the
3731			currency. For some informal discussion of this, see <a
3732				href="http://source.icu-project.org/repos/icu/icuhtml/trunk/design/jit_localization.html">JIT
3733				Localization</a>.)
3734		</p>
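<p>The split between language-based and territory-based bundles can be sketched as a two-level lookup. The table below contains only the mappings mentioned in the text; the function name and the final fallback to the bare currency code are assumptions made for illustration.</p>

```python
# Illustrative data only, taken from the examples in the text above.
CURRENCY_SYMBOLS = {
    "en": {"USD": "US$", "RUR": "Rub", "AUD": "$A"},
    "en_US": {"USD": "$"},
    "en_AU": {"AUD": "$"},
}


def currency_symbol(locale_id, code):
    """Look in the territory bundle first, then in the language bundle."""
    language = locale_id.split("_")[0]
    for bundle in (locale_id, language):
        symbol = CURRENCY_SYMBOLS.get(bundle, {}).get(code)
        if symbol is not None:
            return symbol
    return code  # assumption: fall back to the currency code itself
```

<p>With this data, <code>en_US</code> shows US dollars as "$" but Australian dollars as "$A", while <code>en_AU</code> does the reverse.</p>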
3735		<h4>
3736			<a name="Written_Language" href="#Written_Language">3.10.1
3737				Written Language</a>
3738		</h4>
3739		<p>
3740			Criteria for what makes a written language should be purely
3741			pragmatic; <i>what would copy-editors say? </i>If one gave them text
3742			like the following, they would respond that it is far from acceptable
3743			English for publication, and ask for it to be redone:
3744		</p>
3745		<ol>
3746			<li type="A">&quot;Theatre Center News: The date of the last
3747				version of this document was 2003年3月20日. A copy can be obtained for
3748				$50,0 or 1.234,57 грн. We would like to acknowledge contributions by
3749				the following authors (in alphabetical order): Alaa Ghoneim, Behdad
3750				Esfahbod, Ahmed Talaat, Eric Mader, Asmus Freytag, Avery Bishop, and
3751				Doug Felt.&quot;</li>
3752		</ol>
3753		<p>So one would change it to either B or C below, depending on
3754			which orthographic variant of English was the target for the
3755			publication:</p>
3756		<ol type="A" start="2">
3757			<li>&quot;Theater Center News: The date of the last version of
3758				this document was 3/20/2003. A copy can be obtained for $50.00 or
3759				1,234.57 Ukrainian Hryvni. We would like to acknowledge
3760				contributions by the following authors (in alphabetical order): Alaa
3761				Ghoneim, Ahmed Talaat, Asmus Freytag, Avery Bishop, Behdad Esfahbod,
3762				Doug Felt, Eric Mader.&quot;</li>
3763			<li>&quot;Theatre Centre News: The date of the last version of
3764				this document was 20/3/2003. A copy can be obtained for $50.00 or
3765				1,234.57 Ukrainian Hryvni. We would like to acknowledge
3766				contributions by the following authors (in alphabetical order): Alaa
3767				Ghoneim, Ahmed Talaat, Asmus Freytag, Avery Bishop, Behdad Esfahbod,
3768				Doug Felt, Eric Mader.&quot;</li>
3769		</ol>
3770		<p>
3771			Clearly there are many acceptable variations on this text. For
3772			example, copy editors might still quibble with the use of first
3773			versus last name sorting in the list, but clearly the first list was
3774			<i>not</i> acceptable English alphabetical order. And in quoting a
3775			name, like &quot;Theatre Centre News&quot;, one may leave it in the
3776			source orthography even if it differs from the publication target
3777			orthography. And so on. However, just as clearly, there are limits on
3778			what is acceptable English, and &quot;2003年3月20日&quot;, for example,
3779			is <i>not</i>.
3780		</p>
3781		<p>Note that the language of locale data may differ from the
3782			language of localized software or web sites, when those latter are
3783			not localized into the user&#39;s preferred language. In such cases,
3784			the kind of incongruous juxtapositions described above may well
3785			appear, but this situation is usually preferable to forcing
3786			unfamiliar date or number formats on the user as well.</p>
3787	  <h4>
3788			<a name="Hybrid_Locale" href="#Hybrid_Locale">3.10.2
3789		Hybrid Locale Identifiers</a>
3790		</h4>
3791        <p>Hybrid locales have intermixed content from 2 (or more) languages, often with one language's grammatical structure applied to words in another. These are commonly referred to with portmanteau words such as <em>Franglais, <a href="https://en.oxforddictionaries.com/definition/spanglish">Spanglish</a> </em>or<em> Denglish</em>. Hybrid locales do <em>not</em> refer to text that simply contains two languages: a book of parallel text containing English and French, such as the following, is not Franglais:</p>
3792      <table style='margin-left:2em; margin-right:2em'>
3793          <tbody>
3794            <tr>
3795              <td width='50%' style='font-family:serif'>On the 24th of May, 1863, my uncle, Professor Liedenbrock, rushed into his little house, No. 19 Königstrasse, one of the oldest streets in the oldest portion of the city of Hamburg…</td>
3796              <td style='font-family:serif'>Le 24 mai 1863, un dimanche, mon oncle, le professeur Lidenbrock, revint précipitamment vers sa petite maison située au numéro 19 de Königstrasse, l’une des plus anciennes rues du vieux quartier de Hambourg…</td>
3797            </tr>
3798          </tbody>
3799        </table>
3800        <p>While text in a document can be tagged as partly in one language and partly in another, that is not the same as having a hybrid locale. There is a difference between having a Spanglish document, and a Spanish document that has some passages quoted in English. Fine-grained tagging doesn't handle grammatical combinations like Denglisch “<a href="http://www.duden.de/rechtschreibung/downloaden">gedownloadet</a>”, which is neither English nor German; the same holds for the Franglais “<a href='http://www.le-dictionnaire.com/definition.php?mot=downloader'>downloadé</a>”. More importantly, it doesn’t work for the very common use case for a <a href="#unicode_locale_id">unicode_locale_id</a>: <i>locale selection</i>. </p>
3801      <p>Locales are used to communicate requests for localized content and internationalization services. When people pick a language from a menu, internally they are picking a locale (en-GB, es-419, etc.). To allow an application to support Spanglish or Hinglish locale selection, <a href="#unicode_locale_id">unicode_locale_id</a>s can represent hybrid locales using the T extension key-value 'h0-hybrid'. (For more information on the T extension, see <em>Section 3.7 <a href="#t_Extension">Unicode BCP 47 T Extension</a>.</em>)
3802      </p>
3803      <p>Examples:</p>
3804      <table class='simple'>
3805          <tbody>
3806            <tr>
3807              <td>hi-t-<u>en-h0-hybrid</u></td>
3808              <td>Hinglish</td>
3809              <td>Hindi-English hybrid locale</td>
3810            </tr>
3811            <tr>
3812              <td>ta-t-<u>en-h0-hybrid</u></td>
3813              <td>Tanglish</td>
3814              <td>Tamil-English hybrid locale</td>
3815            </tr>
3816            <tr>
3817              <td>bn-t-<u>en-h0-hybrid</u></td>
3818              <td>Banglish</td>
3819              <td>Bangla-English hybrid locale</td>
3820            </tr>
3821             <tr><td colspan="3">…</td></tr>
3822            <tr>
3823              <td>en-t-<u>hi-h0-hybrid</u></td>
3824              <td>Hinglish</td>
3825              <td>English-Hindi hybrid locale</td>
3826            </tr>
3827            <tr>
3828              <td>en-t-<u>zh-h0-hybrid</u></td>
3829              <td>Chinglish</td>
3830              <td>English-Chinese hybrid locale</td>
3831            </tr>
3832			<tr><td colspan="3">…</td></tr>
3833        </tbody>
3834        </table>
3835        <blockquote>
3836          <p><em>Note: The <a href="#unicode_language_id">unicode_language_id</a> should be the language used as the ‘scaffold’: the fallback locale for internationalization services, and typically the language used for more of the core vocabulary/structure in the content. Thus Hinglish should be represented as hi-t-en-h0-hybrid where Hindi is the scaffold, and as en-t-hi-h0-hybrid where English is.</em></p>
3837        </blockquote>
3838      <p>The value of -t- is a full <em><a href="#unicode_language_id">unicode_language_id</a></em>, and can contain subtags for script or region where it is important to include them, as in the following. It may be useful in order to emphasize the script, even where it is the default script for the language, if it is not the same as the script of the main language tag.</p>
3839      <table class='simple'>
3840          <tbody>
3841            <tr>
3842              <td>ru-t-<u>en-latn-gb-h0-hybrid</u></td>
3843              <td>Runglish</td>
3844              <td>Russian with an admixture of British English in Latin script</td>
3845            </tr>
3846            <tr>
3847              <td>ru-t-<u>en-cyrl-gb-h0-hybrid</u></td>
3848              <td>Runglish</td>
3849              <td>Russian with an admixture of British English in Cyrillic script</td>
3850            </tr>
3851          </tbody>
3852        </table>
3853      <p>Should there ever be a strong need for hybrids of more than two languages or for other purposes such as hybrid languages as the source of translated content, additional structure could be added.</p>
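<p>As an informal sketch, an application could recognize the hybrid form by splitting the identifier at the T extension and at the 'h0' key. The function name is hypothetical, and this is deliberately simplified: it is not a full BCP 47 parser and assumes no other extensions or T-extension fields appear before 'h0'.</p>

```python
def parse_hybrid(locale_id):
    """Return (scaffold, admixed) for an id like "hi-t-en-h0-hybrid", else None."""
    subtags = locale_id.lower().replace("_", "-").split("-")
    if "t" not in subtags:
        return None
    t_index = subtags.index("t")
    scaffold = "-".join(subtags[:t_index])
    rest = subtags[t_index + 1:]
    # The T-extension language tag runs up to the field key "h0".
    if "h0" not in rest:
        return None
    h0_index = rest.index("h0")
    if rest[h0_index + 1:h0_index + 2] != ["hybrid"]:
        return None
    admixed = "-".join(rest[:h0_index])
    return (scaffold, admixed) if scaffold and admixed else None
```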
3854		<h3>
3855			<a name="Validity_Data" href="#Validity_Data">3.11 Validity Data</a>
3856		</h3>
3857		<p class='dtd'>
3858			&lt;!ELEMENT idValidity (id*) &gt;<br> &lt;!ELEMENT id ( #PCDATA
3859			) &gt;<br> &lt;!ATTLIST id type NMTOKEN #REQUIRED &gt; <br>
3860			&lt;!ATTLIST id idStatus NMTOKEN #REQUIRED &gt;
3861		</p>
3862		<p>
3863			The directory <a
3864				href='http://unicode.org/repos/cldr/tags/latest/common/validity/'>common/validity</a>
3865			contains machine-readable data for validating the language, region,
3866			script, and variant subtags, as well as currency, subdivisions and
3867			measure units. Each file contains a number of subtags with the
3868			following <strong>idStatus</strong> values:
3869		</p>
3870		<ul>
3871			<li><strong>regular</strong> — the standard codes used for the
3872				specific type of subtag</li>
3873			<li><strong>special</strong> — certain
3874				exceptional language codes like 'mul'<em> (languages only)</em></li>
3875			<li><strong>unknown</strong> — the code used to indicate the
3876				&quot;unknown&quot;, &quot;undetermined&quot; or &quot;invalid&quot;
3877				values. For more information, see <em>Section 3.5.1 <a
3878					href="#Unknown_or_Invalid_Identifiers">Unknown or Invalid
3879						Identifiers</a></em>.</li>
3880			<li><strong>macroregion</strong> — the standard codes that are
3881				macroregions<em> (for regions only).</em>
3882				<ul>
3883					<li>Note that some two-letter region codes are macroregions,
3884						and (in the future) some three-digit codes may be regular codes.</li>
3885					<li>For details as to which regions are contained within which
3886						macroregions, see the <strong>&lt;containment&gt;</strong> element
3887						of the supplemental data.
3888					</li>
3889				</ul></li>
3890			<li><strong>deprecated</strong> — codes that should not be used.
3891				The <strong>&lt;alias&gt;</strong> element in the supplementalMeta
3892				file contains more information about these codes, and which codes
3893				should be used instead.</li>
3894			<li><strong>private_use</strong> — codes that, for CLDR, are
3895				considered private use. Note that some BCP 47 private-use codes have
3896				defined CLDR semantics, and are considered regular codes. For more
3897				information, see <em>Section 3.5.3 <a href="#Private_Use">Private
3898						Use Codes</a>.
3899			</em></li>
3900		</ul>
3901		<p>
3902			The list of subtags for each idStatus uses a compact format as a
3903			space-delimited list of StringRanges, as defined in <em>Section
3904				<a href="#String_Range">5.3.4 String Range</a>.
3905			</em> The separator for each StringRange is a &quot;~&quot;.
3906		</p>
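<p>A possible way to expand such a list is sketched below. It reflects one reading of the String Range definition in Section 5.3.4, in which the end string replaces the last characters of the start string position by position; the function names are hypothetical and the sample input is made up for illustration.</p>

```python
from itertools import product


def expand_string_range(token, sep="~"):
    """Expand one StringRange such as "ab~d" into ["ab", "ac", "ad"]."""
    if sep not in token:
        return [token]
    start, end = token.split(sep, 1)
    if not 0 < len(end) <= len(start):
        raise ValueError("invalid string range: " + token)
    split_at = len(start) - len(end)
    prefix, suffix = start[:split_at], start[split_at:]
    columns = []
    for low, high in zip(suffix, end):
        if ord(high) < ord(low):
            raise ValueError("invalid string range: " + token)
        columns.append([chr(c) for c in range(ord(low), ord(high) + 1)])
    # Combine every choice at each position, keeping the shared prefix.
    return [prefix + "".join(combo) for combo in product(*columns)]


def expand_list(text):
    """Expand a space-delimited list of StringRanges."""
    result = []
    for token in text.split():
        result.extend(expand_string_range(token))
    return result
```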
3907		<p>Each measure unit is a sequence of subtags, such as
3908			“angle-arc-minute”. The first subtag provides a general “category” of
3909			the unit.</p>
3910		<p>
3911			In version 28.0, the subdivisions in the
3912			validity files used the ISO format, uppercase with a hyphen separating two
3913			components, instead of the BCP 47 format.
3914	  </p>
3915		<h2>
3916			<a name="Locale_Inheritance" href="#Locale_Inheritance">4 Locale
3917				Inheritance and Matching</a>
3918		</h2>
3919		<p>
3920			The XML format relies on an inheritance model, whereby the resources
3921			are collected into <i>bundles</i>, and the bundles organized into a
3922			tree. Data for the many Spanish locales does not need to be
3923			duplicated across all of the countries having Spanish as a national
3924			language. Instead, common data is collected in the Spanish language
3925			locale, and territory locales only need to supply differences. The
3926			parent of all of the language locales is a generic locale known as <i>root</i>.
3927			Wherever possible, the resources in the root are language &amp;
3928			territory neutral. For example, the collation (sorting) order in the
3929			root is based on the [<a href="#DUCET">DUCET</a>] (see <em><a
3930				href="tr35-collation.html#Root_Collation">Root Collation</a></em>). Since
3931			English language collation has the same ordering as the root locale,
3932			the &#39;en&#39; locale data does not need to supply any collation
3933			data, nor do the &#39;en_US&#39;, &#39;en_GB&#39; or any of the
3934			various other locales that use English.
3935		</p>
3936		<p>Given a particular locale id &quot;en_US_someVariant&quot;, the
3937			search chain for a particular resource is the following.</p>
3938		<blockquote>
3939			<pre>en_US_someVariant
3940en_US
3941en
3942root</pre>
3943		</blockquote>
3944		<p>
3945			<em>The inheritance is often not simple truncation, as will be
3946				seen later in this section.</em>
3947		</p>
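<p>For the simple case shown above, the search chain can be produced by truncating the identifier one subtag at a time. This sketch covers only plain truncation; it does not model the cases where the parent is determined by other data. The function name is hypothetical.</p>

```python
def truncation_chain(locale_id):
    """Return the search chain obtained by dropping trailing subtags."""
    parts = locale_id.split("_")
    chain = ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]
    chain.append("root")
    return chain
```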
3948		<p>If a type and key are supplied in the locale id, then logically
3949			the chain from that id to the root is searched for a resource tag
3950			with a given type, all the way up to root. If no resource is found
3951			with that tag and type, then the chain is searched again without the
3952			type.</p>
3953		<p>
3954			Thus the data for any given locale will only contain resources that
3955			are different from the parent locale. For example, most territory
3956			locales will inherit the bulk of their data from the language locale:
3957			&quot;en&quot; will contain the bulk of the data; &quot;en_IE&quot;
3958			will only contain a few items like currency. All data that is
3959			inherited from a parent is presumed to be valid, just as valid as if
3960			it were physically present in the file. This provides for much
3961			smaller resource bundles, and much simpler (and less error-prone)
3962			maintenance. At the script or region level, the &quot;primary&quot;
3963			child locale will be empty, since its parent will contain all of the
3964			appropriate resources for it. For more information see <i>CLDR
3965				Information : Section 9.3 <a href="tr35-info.html#Default_Content">Default
3966					Content</a>.
3967			</i>
3968		</p>
3969
3970		<p>
3971			Certain data items depend only on the region specified in a locale id
3972			(by a <a
3973				href="#unicode_region_subtag_validity">unicode_region_subtag</a> or
3974				an “rg” <a href="#RegionOverride">Region Override</a> key),
3975			and are obtained from supplemental data rather than through locale
3976			resources. For example:
3977		</p>
3978		<ul>
3979			<li>The currency for the specified region (see <a
3980				href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
3981					Currency Data</a>)
3982			</li>
3983			<li>The measurement system for the specified region (see <a
3984				href="tr35-general.html#Measurement_System_Data">Measurement
3985					System Data</a>)
3986			</li>
3987			<li>The week conventions for the specified region (see <a
3988				href="tr35-dates.html#Week_Data">Week Data</a>)
3989			</li>
3990		</ul>
3991		<p>
3992			(For more information on the specific
3993				items handled this way, see <a
3994				href="tr35-info.html#Territory_Based_Preferences">Territory-Based
3995					Preferences</a>.)
3996			These items will be correct for the specified region regardless of
3997			whether a locale bundle actually exists with the same combination of
3998			language and region as in the locale id. For example, suppose data is
3999			requested for the locale id "fr_US" and there is no bundle for that
4000			combination. Data obtained via locale inheritance, such as currency
4001			patterns and currency symbols, will be obtained from the parent
4002			locale "fr". However, currency amounts would be formatted by default
4003			using US dollars, just displayed in the manner governed by the locale
4004			"fr". When a locale id does not specify a region, the region-specific
4005			items such as those above are obtained from the likely region for the
4006			locale (obtained via <a href="#Likely_Subtags">Likely Subtags</a>).</p>
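<p>The "fr_US" example can be sketched as follows. The two small tables stand in for the Likely Subtags and supplemental currency data and contain only a few entries for illustration; the function names are hypothetical, and the “rg” key is ignored to keep the sketch short.</p>

```python
# Tiny stand-ins for Likely Subtags and supplemental currency data.
LIKELY_REGION = {"fr": "FR", "en": "US"}
REGION_CURRENCY = {"US": "USD", "FR": "EUR"}


def region_of(locale_id):
    """Use the explicit region subtag if present, else the likely region."""
    subtags = locale_id.split("_")
    for subtag in subtags[1:]:
        two_letter = len(subtag) == 2 and subtag.isalpha()
        three_digit = len(subtag) == 3 and subtag.isdigit()
        if two_letter or three_digit:
            return subtag.upper()
    return LIKELY_REGION.get(subtags[0])


def default_currency(locale_id):
    """Region-based item: independent of whether a bundle exists."""
    return REGION_CURRENCY.get(region_of(locale_id))
```

<p>Here "fr_US" yields USD even though no "fr_US" bundle is assumed to exist, while the currency patterns and symbols would still come from "fr" through locale inheritance.</p>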
4007		<p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see Section 4.2.6 <a
4008				href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a>.</p>
4009		<h3>
4010			<a href="#Lookup" name="Lookup">4.1 Lookup</a>
4011		</h3>
4012
4013		<p>If a language has more than one script in customary modern use,
4014			then the CLDR file structure in common/main follows the following
4015			model:</p>
4016		<blockquote>
4017			<p>
4018				lang<br> lang_script<br> lang_script_region<br>
4019				lang_region<i> (aliases to lang_script_region)</i>
4020			</p>
4021		</blockquote>
4022		<h4>
4023			<a href="#Bundle_vs_Item_Lookup" name="Bundle_vs_Item_Lookup">4.1.1
4024				Bundle vs Item Lookup</a>
4025		</h4>
4026		<p>
4027			There are actually two different kinds of inheritance fallback: <em>resource&nbsp;bundle&nbsp;lookup</em>
4028			and <em>resource&nbsp;item&nbsp;lookup</em>. For the former, a
4029			process is looking to find the first, best resource bundle it can;
4030			for the latter, it is fallback&nbsp;within&nbsp;bundles on individual
4031			items, like the translated name for the region &quot;CN&quot; in
4032			Breton.
4033		</p>
4034		<p>
4035			These are closely related, but distinct, processes. They are
4036			illustrated in the table <a href="#Lookup-Differences">Lookup
4037				Differences</a>, where &quot;key&quot; stands for zero or more key/type
4038			pairs. Logically speaking, when looking up an item for a given
4039			locale, you first do a resource bundle lookup to find the best bundle
4040			for the locale, then you do an inherited item lookup starting with
4041			that resource bundle.
4042		</p>
4043		<p>
4044			The table <a href="#Lookup-Differences">Lookup Differences</a> uses
4045			the naïve resource bundle lookup for illustration. More sophisticated
4046			systems will get far better results for resource bundle lookup if
4047			they use the algorithm described in <em>Section 4.4 <a
4048				href="#LanguageMatching">Language Matching</a></em>. That algorithm takes
4049			into account both the user’s desired locale(s) and the application’s
4050			supported locales, in order to get the best match.
4051		</p>
4052		<p>
4053			If the naïve resource bundle lookup is used, the desired locale needs
4054			to be canonicalized using 4.3 <a href="#Likely_Subtags">Likely
4055				Subtags</a> and the supplemental alias information, so that locales that
4056			CLDR considers identical are treated as such. Thus eng-Latn-GB should
4057			be mapped to en-GB, and cmn-TW mapped to zh-Hant-TW.
4058		</p>
4059		<p>For the purposes of CLDR, everything with the &lt;ldml&gt; dtd
4060			is treated logically as if it is one resource bundle, even if the
4061			implementation separates data into separate physical resource
4062			bundles. For example, suppose that there is a main XML file for Nama
4063			(naq), but there are no &lt;unit&gt; elements for it because the
4064			units are all inherited from root. If the &lt;unit&gt; elements are
4065			separated into a separate data tree for modularity in the
4066			implementation, the Nama &lt;unit&gt; resource bundle would be empty.
4067			However, for purposes of resource-bundle lookup the resource bundle
4068			lookup still stops at naq.xml.</p>
4069
4070		<div id="iqaw2" style="margin-top: 0px; margin-bottom: 0px;">
4071			<table class='simple' id="a1bn" border="1" cellpadding="3" cellspacing="0">
4072				<caption>
4073					<a href="#Lookup-Differences" name="Lookup-Differences">Lookup
4074						Differences</a>
4075				</caption>
4076				<tbody id="iqaw3">
4077					<tr id="x40y0">
4078						<th id="x40y1" style="vertical-align: top;" nowrap>Lookup
4079							Type</th>
4080						<th id="x40y3" style="vertical-align: top;" nowrap>Example</th>
4081						<th id="x40y5" style="vertical-align: top;">Comments</th>
4082					</tr>
4083					<tr id="iqaw4">
4084						<td id="iqaw5" style="vertical-align: top;" nowrap>
4085							<p id="rkc40">
4086								<strong>Resource bundle</strong> lookup
4087							</p>
4088						</td>
4089						<td id="iqaw7" style="vertical-align: top;" nowrap>
4090							<p>se-FI&nbsp;→</p>
4091							<p>se&nbsp; →</p>
4092							<p>
4093								<em>default-locale*&nbsp;&nbsp;→</em>
4094							</p>
4095							<p>root</p>
4096						</td>
4097						<td id="rkc41" style="vertical-align: top;">
4098							<p>* The default-locale may have its own inheritance chain;
4099								for example, it may be &quot;en-GB&nbsp;→&nbsp;en&quot;. In that
4100								case, the chain is expanded by inserting the chain, resulting
4101								in:</p>
4102							<blockquote>
4103								<p>se-FI →</p>
4104								<p>se →</p>
4105								<p>fi →</p>
4106								<p>
4107									<em>en-GB →</em>
4108								</p>
4109								<p>
4110									<em>en →</em>
4111								</p>
4112								<p>root</p>
4113							</blockquote>
4114						</td>
4115					</tr>
4116					<tr id="iqaw9">
4117						<td id="iqaw10" style="vertical-align: top;" nowrap>
4118							<p>
4119								<strong>Inherited item</strong> lookup
4120							</p>
4121						</td>
4122						<td id="iqaw12" style="vertical-align: top;" nowrap>
4123							<p>se-FI+key&nbsp;→</p>
4124							<p>se+key →</p>
4125							<p>
4126								<em>root_alias*+key&nbsp;</em>
4127							</p>
4128							<p>→&nbsp;root+key</p>
4129						</td>
4130						<td id="rkc43" style="vertical-align: top;">
4131							<p>* If there is a root_alias to another key or locale, then
4132								insert that entire chain. For example, suppose that months for
4133								another calendar system have a root alias to Gregorian months.
4134								In that case, the root alias would change the key, and retry
4135								from se-FI downward. This can happen multiple times.</p>
4136							<blockquote>
4137								<p>se-FI+key&nbsp;→</p>
4138								<p>se+key →</p>
4139								<p>root_alias*+key →</p>
4140								<p>
4141									<em>se-FI+key2&nbsp;→</em>
4142								</p>
4143								<p>
4144									<em>se+key2 →</em>
4145								</p>
4146								<p>root_alias*+key2 →</p>
4147								<p>root+key2</p>
4148							</blockquote>
4149						</td>
4150					</tr>
4151				</tbody>
4152			</table>
4153		</div>
4154		<p>Both the resource bundle inheritance and the inherited item
4155			inheritance use the parentLocale data, where available, instead of
4156			simple truncation.</p>
4157		<p>The fallback is a bit different for these two cases; internal
4158			aliases and keys are not involved in the bundle lookup, and the
4159			default locale is not involved in the item lookup. If the
4160			default-locale were used in the resource-item lookup, then strange
4161			results would occur. For example, suppose that the default locale is
4162			Swedish, and there is a Nama locale but no specific inherited item
4163			for collation. If the default-locale were used in resource-item
4164			lookup, it would produce odd and unexpected results for Nama sorting.
4165		</p>
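<p>The difference between the two lookups can be sketched with the naïve, truncation-based chain. The data below is hypothetical and mirrors the Nama example: a Swedish default locale, a Nama bundle with no collation item, and no Akan bundle at all. Root aliases and parentLocale data are left out of this sketch.</p>

```python
# Hypothetical bundles; "naq" exists but supplies no collation item.
DATA = {
    "root": {"collation": "root order"},
    "sv": {"collation": "Swedish order"},
    "naq": {},
}


def parent_chain(locale_id):
    """Naive truncation chain, without the final root entry."""
    parts = locale_id.split("_")
    return ["_".join(parts[:n]) for n in range(len(parts), 0, -1)]


def bundle_lookup(requested, default_locale, use_default=True):
    """Find the first existing bundle; may pass through the default locale."""
    candidates = parent_chain(requested)
    if use_default:
        candidates += parent_chain(default_locale)
    candidates.append("root")
    for candidate in candidates:
        if candidate in DATA:
            return candidate


def item_lookup(bundle, key):
    """Inherit an item from the bundle's parents; never uses the default locale."""
    for candidate in parent_chain(bundle) + ["root"]:
        value = DATA.get(candidate, {}).get(key)
        if value is not None:
            return value
    return None
```

<p>With this data, the bundle for "naq_NA" is "naq", and its collation item resolves to the root order rather than the Swedish one. Passing <code>use_default=False</code> models services whose bundle lookup skips the default locale.</p>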
4166		<p>The default locale is not even always used in resource bundle
4167			inheritance. For the following services, the fallback is always
4168			directly to the root locale rather than through default locale.</p>
4169		<ul>
4170			<li>collation</li>
4171			<li>break iteration</li>
4172			<li>case mapping</li>
4173			<li>transliteration
4174				<ul>
4175					<li>The lookup for transliteration is yet more complicated
4176						because of the interplay of source and target locales: see <em>Part
4177							2 General, Section 10.1 <a
4178							href="http://www.unicode.org/reports/tr35/tr35-general.html#Inheritance">Inheritance.</a>
4179					</em>
4180					</li>
4181				</ul>
4182			</li>
4183		</ul>
4184		<p>
4185			Thus if there is no Akan locale, for example, asking for a collation
4186			for Akan should produce the root collation, <em>not the Swedish
4187				collation.</em>
4188		</p>
4189		<p>The inherited item lookup must remain stable, because the
4190			resources are built with a certain fallback in mind; changing the
4191			core fallback order can render the bundle structure incoherent.</p>
4192		<p>
4193			Resource bundle lookup, on the other hand, is more flexible; changes
4194			in the view of the &quot;best&quot; match between the input request
4195			and the output bundle are more tolerable when they represent overall
4196			improvements for users. For more information, see <i> <a
4197				href="#Fallback_Elements">A.1 Element fallback</a></i>.
4198		</p>
4199		<p>
4200			Where the LDML inheritance relationship does not match a target
4201			system, such as POSIX, the data logically should be fully resolved in
4202			converting to a format for use by that system, by adding <i>all</i>
4203			inherited data to each locale data set.
4204		</p>
4205		<p>
4206			For a more complete description of how inheritance applies to data,
4207			and the use of keywords, see <i><a
4208				href="#Inheritance_and_Validity">Section 4.2 Inheritance </a></i>.
4209		</p>
4210		<p>
4211			The locale data does not contain general character properties that
4212			are derived from the <i>Unicode Character Database</i> [<a
4213				href="http://unicode.org/reports/tr41/#UAX44">UAX44</a>]. That data
4214			being common across locales, it is not duplicated in the bundles.
4215			Constructing a POSIX locale from the CLDR data requires use of UCD
4216			data. In addition, POSIX locales may also specify the character
4217			encoding, which requires the data to be transformed into that target
4218			encoding.
4219		</p>
4220		<p>
4221			<b>Warning: </b>If a locale has a different script than its parent
4222			(for example, sr_Latn), then special attention must be paid to make
4223			sure that all inheritance is covered. For example, auxiliary exemplar
4224			characters may need to be empty (&quot;[]&quot;) to block
4225			inheritance.
4226		</p>
4227		<p>
4228			<strong>Empty Override:</strong> There is one special value reserved
4229			in LDML to indicate that a child locale is to have no value for a
4230			path, even if the parent locale has a value for that path. That value
4231			is &quot;∅∅∅&quot;. For example, if there is no phrase for &quot;two
4232			days ago&quot; in a language, that can be indicated with:
4233		</p>
4234		<pre>&lt;field type="day"&gt;
4235  &lt;relative type="-2"&gt;∅∅∅&lt;/relative&gt;
4236&lt;/field&gt;</pre>
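<p>One way an item lookup might honor the empty override is sketched below. The locale ids, path string, and parent value are placeholders invented for illustration; only the "∅∅∅" marker comes from the text above.</p>

```python
NO_VALUE = "\u2205\u2205\u2205"  # the reserved value "∅∅∅"


def resolve(chain, path, data):
    """Walk the inheritance chain, stopping at an explicit empty override."""
    for locale_id in chain:
        value = data.get(locale_id, {}).get(path)
        if value == NO_VALUE:
            return None  # explicitly blocked: do not consult parents
        if value is not None:
            return value
    return None


# Placeholder data: the parent has a phrase, the child blocks it.
EXAMPLE = {
    "xx": {"day/relative[-2]": "parent phrase"},
    "xx_YY": {"day/relative[-2]": NO_VALUE},
}
```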
4237		<h4>
4238			<a name="Multiple_Inheritance"></a><a name="Lateral_Inheritance"
4239				href="#Lateral_Inheritance">4.1.2 Lateral Inheritance </a>
4240		</h4>
4241		<p>
4242			In clearly specified instances, resources may inherit from within the
4243			same locale. For example, currency format symbols inherit from the
4244			number format symbols; the Buddhist calendar inherits from the
4245			Gregorian calendar. This <i>only</i> happens where documented in this
4246			specification. In these special cases, the inheritance functions as
4247			normal, up to the root. If the data is not found along that path,
4248			then a second search is made, logically changing the
4249			element/attribute to the alternate values.
4250		</p>
4251		<p>
4252			For example, for the locale &quot;en_US&quot; the month data in
4253			&lt;calendar type=&quot;<span style="color: blue">buddhist</span>&quot;&gt;
4254			inherits first from &lt;calendar type=&quot;<span
4255				style="color: blue">buddhist</span>&quot;&gt; in &quot;en&quot;,
4256			then in &quot;root&quot;. If not found there, then it inherits from
4257			&lt;calendar type=&quot;<span style="color: blue">gregorian</span>&quot;&gt;
4258			in &quot;en_US&quot;, then &quot;en&quot;, then in &quot;root&quot;.
4259		</p>
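<p>The two-pass search in this example can be sketched as follows. The <code>LATERAL</code> table and the sample data are assumptions for illustration; only the Buddhist-to-Gregorian relationship is taken from the text.</p>

```python
# Lateral inheritance documented in the text: buddhist falls back to gregorian.
LATERAL = {"buddhist": "gregorian"}


def lookup_months(locale_chain, calendar, data):
    """Search the whole locale chain first, then retry with the lateral calendar."""
    while calendar is not None:
        for locale_id in locale_chain:
            value = data.get(locale_id, {}).get(calendar)
            if value is not None:
                return value
        calendar = LATERAL.get(calendar)
    return None


# Illustrative data: no locale in the chain has Buddhist month names.
MONTHS = {
    "en": {"gregorian": ["January", "February"]},
    "root": {"gregorian": ["M01", "M02"]},
}
```

<p>Note that the first pass goes all the way to root before the calendar is switched, so a Buddhist entry in root would win over a Gregorian entry in "en_US".</p>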
4260		<p>There is one special case, for items with a &quot;count&quot;
4261			parameter (used to select a plural form). In that case, the
4262			inheritance works as follows:</p>
4263		<p>If there is no value for a path, and that path has a
4264			[@count=&quot;x&quot;] attribute and value, then:</p>
4265		<ol>
4266			<li>If &quot;x&quot; is anything but &quot;other&quot;, it falls
4267				back to [@count=&quot;other&quot;], within the same locale.</li>
4268			<li>In the special case of currencies, if the
4269				[@count=&quot;other&quot;] value is missing, it falls back to the
4270				path that is completely missing the count item.</li>
4271			<li>If there is no value within the same locale, the same
4272				process is used in the parent locale, and so on.</li>
4273		</ol>
4274		<p>
4275			<em>Examples:</em>
4276		</p>
4277		<table class='simple' border="1" cellpadding="3" cellspacing="0" id="a1bn3">
4278			<caption>
4279				<a name="Count_Fallback_normal" href="#Count_Fallback_normal">Count
4280					Fallback: normal</a>
4281			</caption>
4282			<tbody>
4283				<tr>
4284					<th nowrap style="vertical-align: top;">Locale</th>
4285					<th nowrap style="vertical-align: top;">Path</th>
4286				</tr>
4287				<tr>
4288					<td nowrap style="vertical-align: top;">fr-CA</td>
4289					<td nowrap id="iqaw" style="vertical-align: top;"><code>
4290							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong>
4291						</code></td>
4292				</tr>
4293				<tr>
4294					<td nowrap style="vertical-align: top;">fr-CA</td>
4295					<td nowrap id="iqaw16" style="vertical-align: top;"><code>
4296							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong>
4297						</code></td>
4298				</tr>
4299				<tr>
4300					<td nowrap style="vertical-align: top;">fr</td>
4301					<td nowrap id="iqaw19" style="vertical-align: top;"><code>
4302							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong>
4303						</code></td>
4304				</tr>
4305				<tr>
4306					<td nowrap style="vertical-align: top;">fr</td>
4307					<td nowrap id="iqaw18" style="vertical-align: top;"><code>
4308							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong>
4309						</code></td>
4310				</tr>
4311				<tr>
4312					<td nowrap style="vertical-align: top;">root</td>
4313					<td nowrap id="iqaw21" style="vertical-align: top;"><code>
4314							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="x"]</strong>
4315						</code></td>
4316				</tr>
4317				<tr>
4318					<td nowrap style="vertical-align: top;">root</td>
4319					<td nowrap id="iqaw20" style="vertical-align: top;"><code>
4320							//ldml/units/unitLength[@type="<strong>narrow</strong>"]/unit[@type="mass-gram"]/unitPattern<strong>[@count="other"]</strong>
4321						</code></td>
4322				</tr>
4323			</tbody>
4324		</table>
4325		<p>Note that there may be an alias in root that changes the path
4326			and starts again from the requested locale, such as:</p>
4327		<p>
4328			<code>
4329				&lt;unitLength type=&quot;<strong>narrow</strong>&quot;&gt;<br>
4330				   &lt;alias source=&quot;locale&quot;
4331				path=&quot;../unitLength[@type='<strong>short</strong>']&quot;/&gt;<br>
4332				&lt;/unitLength&gt;
4333			</code>
4334		</p>
4335		<table class='simple' border="1" cellpadding="3" cellspacing="0" id="a1bn2">
4336			<caption>
4337				<a name="Count_Fallback_currency" href="#Count_Fallback_currency">Count
4338					Fallback: currency</a>
4339			</caption>
4340			<tbody>
4341				<tr>
4342					<th nowrap style="vertical-align: top;">Locale</th>
4343					<th nowrap style="vertical-align: top;">Path</th>
4344				</tr>
4345				<tr>
4346					<td nowrap style="vertical-align: top;">fr-CA</td>
4347					<td nowrap id="iqaw11" style="vertical-align: top;"><code>
4348							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong>
4349						</code></td>
4350				</tr>
4351				<tr>
4352					<td nowrap style="vertical-align: top;">fr-CA</td>
4353					<td nowrap id="iqaw6" style="vertical-align: top;"><code>
4354							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong>
4355						</code></td>
4356				</tr>
4357				<tr>
4358					<td nowrap style="vertical-align: top;">fr-CA</td>
4359					<td nowrap id="iqaw8" style="vertical-align: top;"><code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
4360				</tr>
4361				<tr>
4362					<td nowrap style="vertical-align: top;">fr</td>
4363					<td nowrap id="iqaw15" style="vertical-align: top;"><code>
4364							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong>
4365						</code></td>
4366				</tr>
4367				<tr>
4368					<td nowrap style="vertical-align: top;">fr</td>
4369					<td nowrap id="iqaw14" style="vertical-align: top;"><code>
4370							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong>
4371						</code></td>
4372				</tr>
4373				<tr>
4374					<td nowrap style="vertical-align: top;">fr</td>
4375					<td nowrap id="iqaw13" style="vertical-align: top;"><code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
4376				</tr>
4377				<tr>
4378					<td nowrap style="vertical-align: top;">root</td>
4379					<td nowrap id="iqaw25" style="vertical-align: top;"><code>
4380							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="x"]</strong>
4381						</code></td>
4382				</tr>
4383				<tr>
4384					<td nowrap style="vertical-align: top;">root</td>
4385					<td nowrap id="iqaw24" style="vertical-align: top;"><code>
4386							//ldml/numbers/currencies/currency[@type="CAD"]/displayName<strong>[@count="other"]</strong>
4387						</code></td>
4388				</tr>
4389				<tr>
4390					<td nowrap style="vertical-align: top;">root</td>
4391					<td nowrap id="iqaw23" style="vertical-align: top;"><code>//ldml/numbers/currencies/currency[@type="CAD"]/displayName</code></td>
4392				</tr>
4393			</tbody>
4394		</table>
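		<p>The search orders in the two tables above can be generated
			mechanically. The following Python sketch is illustrative only: the
			function names and the <code>is_currency</code> flag are assumptions,
			and it does not model aliases in root (such as the
			narrow-to-short alias shown earlier).</p>

```python
import re
from typing import List, Tuple

# Matches the [@count="..."] predicate of a path.
COUNT_ATTR = re.compile(r'\[@count="([^"]*)"\]')


def count_fallback_paths(path: str, is_currency: bool = False) -> List[str]:
    """Paths to try within a single locale, in order."""
    match = COUNT_ATTR.search(path)
    if match is None:
        return [path]
    candidates = [path]
    if match.group(1) != "other":
        # Rule 1: fall back to count="other" in the same locale.
        candidates.append(COUNT_ATTR.sub('[@count="other"]', path))
    if is_currency:
        # Rule 2: currencies also try the path without any count.
        candidates.append(COUNT_ATTR.sub("", path))
    return candidates


def search_order(chain: List[str], path: str,
                 is_currency: bool = False) -> List[Tuple[str, str]]:
    """Rule 3: repeat the same per-locale process in each parent locale."""
    return [(locale, candidate)
            for locale in chain
            for candidate in count_fallback_paths(path, is_currency)]
```

		<p>Calling <code>search_order(["fr-CA", "fr", "root"], path)</code>
			with a unit path yields six entries, and with
			<code>is_currency=True</code> and a currency path it yields nine,
			matching the rows of the two tables.</p>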
4395		<br>
4396		<h4>
4397			<a name="Parent_Locales" href="#Parent_Locales">4.1.3 Parent
4398				Locales</a>
4399		</h4>
4400		<p class="dtd">
4401			&lt;!ELEMENT parentLocales ( parentLocale* ) &gt;<br>
4402			&lt;!ELEMENT parentLocale EMPTY &gt;<br> &lt;!ATTLIST
4403			parentLocale parent NMTOKEN #REQUIRED
4404			&gt;<br> &lt;!ATTLIST parentLocale locales NMTOKENS #REQUIRED &gt;
4405		</p>
4406		<p>In some cases, the normal truncation inheritance does not
4407			function well. This happens when:</p>
4408		<ol>
4409			<li>The child locale is of a different script. In this case,
4410				mixing elements from the parent into the child data results in a
4411				mishmash.</li>
4412			<li>A large number of child locales behave similarly, and
4413				differently from the truncation parent.</li>
4414		</ol>
4415		<p>
4416			The <span class="element">parentLocale</span> element is used to
4417			override the normal inheritance when accessing CLDR data.
4418		</p>
4419		<p>For case 1, the children are script locales, and the parent is
4420			&quot;root&quot;. For example:</p>
4421		<pre> &lt;parentLocale parent=&quot;root&quot; locales=&quot;az_Cyrl ha_Arab … zh_Hant&quot;/&gt;</pre>
4422		<p>For case 2, the children and parent share the same primary
4423			language, but the region is changed. For example:</p>
4424		<pre> &lt;parentLocale parent=&quot;es_419&quot; locales=&quot;es_AR es_BO … es_UY es_VE&quot;/&gt;</pre>
4425		<p>Collation data, however, is an exception. Since collation rules
4426			do not truly inherit data from the parent, the parentLocale element
4427			is not necessary and not used for collation. Thus, for a locale like
4428			zh_Hant in the example above, the parentLocale element would dictate
4429			the parent as &quot;root&quot; when referring to main locale data,
4430			but for collation data, the parent locale would still be
4431			&quot;zh&quot;, even though the parentLocale element is present for
4432			that locale.</p>
4433		<p>
4434			Since parentLocale information is not localizable on a per locale
4435			basis, the parentLocale information is contained in CLDR’s <a
4436				href="tr35-info.html">supplemental data.</a>
4437		</p>
4438		<p>
4439			When a <span class="element">parentLocale</span> element is used to
4440			override normal inheritance, the following invariants must always be
4441			true:
4442		</p>
4443		<ol>
4444			<li>If X is the parentLocale of Y, then either X is the root
4445				locale, or X has the same base language code as Y. For example, the
4446				parent of &quot;en&quot; cannot be &quot;fr&quot;, and the parent of
4447				&quot;en_YY&quot; cannot be &quot;fr&quot; or &quot;fr_XX&quot;.</li>
4448			<li>If X is the parentLocale of Y, Y must not be a base language
4449				locale. For example, the parent of &quot;en&quot; cannot be
4450				&quot;en_XX&quot;.</li>
4451			<li>There can never be cycles, such as: X parent of Y ... parent
4452				of X.</li>
4453		</ol>
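		<p>As a rough illustration of how an implementation might apply
			<span class="element">parentLocale</span> data, the Python sketch
			below computes a parent with and without the collation exception and
			checks the three invariants. The table
			<code>PARENT_OVERRIDES</code> contains only the locales spelled out
			in the examples above (the elided entries are not filled in), and
			all function names are hypothetical.</p>

```python
from typing import Dict, List, Optional

# Only the locales named explicitly in the examples; not the full CLDR list.
PARENT_OVERRIDES: Dict[str, str] = {
    "az_Cyrl": "root", "ha_Arab": "root", "zh_Hant": "root",
    "es_AR": "es_419", "es_BO": "es_419", "es_UY": "es_419", "es_VE": "es_419",
}


def truncation_parent(locale: str) -> Optional[str]:
    """Normal inheritance: drop the last subtag; a bare language goes to root."""
    if locale == "root":
        return None
    head, sep, _ = locale.rpartition("_")
    return head if sep else "root"


def parent(locale: str, collation: bool = False) -> Optional[str]:
    """Parent for main data, or for collation data if collation=True."""
    if not collation and locale in PARENT_OVERRIDES:
        return PARENT_OVERRIDES[locale]
    return truncation_parent(locale)


def check_invariants(overrides: Dict[str, str]) -> List[str]:
    """Return a list of human-readable invariant violations (empty if none)."""
    problems = []
    for child, par in overrides.items():
        if par != "root" and par.split("_")[0] != child.split("_")[0]:
            problems.append(f"{child}: parent {par} has a different base language")
        if "_" not in child:
            problems.append(f"{child}: a base language locale cannot have a parentLocale")
        seen = {child}
        current: Optional[str] = par
        while current is not None and current != "root":
            if current in seen:
                problems.append(f"{child}: cycle through {current}")
                break
            seen.add(current)
            current = overrides.get(current, truncation_parent(current))
    return problems
```

		<p>Here <code>parent("zh_Hant")</code> gives <code>"root"</code>,
			while <code>parent("zh_Hant", collation=True)</code> gives
			<code>"zh"</code>, reflecting the collation exception described
			above.</p>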
4454		<h3>
4455			<a name="Inheritance_and_Validity" href="#Inheritance_and_Validity">4.2
4456				Inheritance and Validity</a>
4457		</h3>
4458		<p>The following describes in more detail how to determine the
4459			exact inheritance of elements, and the validity of a given element in
4460			LDML.</p>
4461		<h4>
4462			<a name="Definitions" href="#Definitions">4.2.1 Definitions</a>
4463		</h4>
4464		<p>
4465			<i>Blocking</i> elements are those whose subelements do not inherit
4466			from parent locales. For example, a &lt;collation&gt; element is a
4467			blocking element: everything in a &lt;collation&gt; element is
4468			treated as a single lump of data, as far as inheritance is concerned.
4469			For more information, see <a href="#Valid_Attribute_Values">Section
4470				5.5 Valid Attribute Values</a>.
4471		</p>
4472		<p>
4473			Attributes that serve to distinguish multiple elements at the same
4474			level are called <i>distinguishing</i> attributes. For example, the <i>type</i>
4475			attribute distinguishes different elements in lists of translations,
4476			such as:
4477		</p>
4478		<pre>&lt;language type=&quot;aa&quot;&gt;Afar&lt;/language&gt;
4479&lt;language type=&quot;ab&quot;&gt;Abkhazian&lt;/language&gt;</pre>
4480		<p>
4481			Distinguishing attributes affect inheritance; two elements with
4482			different distinguishing attributes are treated as different for
4483			purposes of inheritance. For more information, see <a
4484				href="#Valid_Attribute_Values">Section 5.5 Valid Attribute
4485				Values</a>. Other attributes are called nondistinguishing (or
4486			informational) attributes. These carry separate information, and do
4487			not affect inheritance.
4488		</p>
4489		<p>
4490			For any element in an XML file, <i>an element chain</i> is a resolved
4491			[<a href="#XPath">XPath</a>] leading from the root to an element,
4492			with attributes on each element in alphabetical order. So in, say, <a
4493				href="http://unicode.org/cldr/data/common/main/el.xml">http://unicode.org/cldr/data/common/main/el.xml</a>
4494			we may have:
4495		</p>
4496		<pre>&lt;ldml&gt;
4497  &lt;identity&gt;
4498    &lt;version number=&quot;1.1&quot; /&gt;
4499    &lt;language type=&quot;el&quot; /&gt;
4500  &lt;/identity&gt;
4501  &lt;localeDisplayNames&gt;
4502    &lt;languages&gt;
4503      &lt;language type=&quot;ar&quot;&gt;Αραβικά&lt;/language&gt;
4504...</pre>
4505		<p>Which gives the following element chains (among others):</p>
4506		<ul>
4507			<li>//ldml/identity/version[@number=&quot;1.1&quot;]</li>
4508			<li>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]</li>
4509		</ul>
4510		<p>
4511			An element chain A is an <i>extension</i> of an element chain B if B
4512			is equivalent to an initial portion of A. For example, #2 below is an
4513			extension of #1. (Equivalent, depending on the tree, may not be
4514			&quot;identical to&quot;. See below for an example.)
4515		</p>
4516		<ol>
4517			<li>//ldml/localeDisplayNames</li>
4518			<li>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]</li>
4519		</ol>
4520		<p>
4521			An LDML file can be thought of as an ordered list of <i>element
4522				pairs</i>: &lt;element chain, data&gt;, where the element chains are all
4523			the chains for the end-nodes. (This works because of restrictions on
4524			the structure of LDML, including that it does not allow mixed
			content.) The ordering is the order in which the element chains are
			found in the file, and is thus determined by the DTD.
4527	  </p>
4528		<p>For example, some of those pairs would be the following. Notice
4529			that the first has the null string as element contents.</p>
4530		<ul>
4531			<li><b>&lt;</b>//ldml/identity/version[@number=&quot;1.1&quot;]<b>,
4532			</b>&quot;&quot;<b>&gt;</b></li>
4533			<li><b>&lt;</b>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]<b>,
4534			</b>&quot;Αραβικά&quot;<b>&gt;</b></li>
4535		</ul>
4536		<blockquote>
4537			<p>
4538				<b>Note: </b>There are two exceptions to this:
4539			</p>
4540			<ol>
4541				<li>Blocking nodes and their contents are treated as a single
4542					end node.</li>
4543				<li>In terms of computing inheritance, the element pair
4544					consists of the element chain plus all distinguishing attributes;
4545					the value consists of the value (if any) plus any nondistinguishing
4546					attributes.</li>
4547			</ol>
4548			<blockquote>
4549				<p>Thus instead of the element pair being (a) below, it is (b):</p>
4550				<ol type="a">
4551					<li><b>&lt;</b>//ldml/dates/calendars/calendar[@type=&#39;gregorian&#39;]/week/weekendStart[@day=&#39;sun&#39;][@time=&#39;00:00&#39;]<b>,</b><br>
4552						<b>&quot;&quot;&gt;</b></li>
4553					<li><b>&lt;</b>//ldml/dates/calendars/calendar[@type=&#39;gregorian&#39;]/week/weekendStart<b>,</b><br>
4554						[@day=&#39;sun&#39;][@time=&#39;00:00&#39;]<b>&gt;</b></li>
4555				</ol>
4556			</blockquote>
4557		</blockquote>
4558		<p>
4559			Two LDML element chains are <i>equivalent</i> when they would be
4560			identical if all attributes and their values were removed — except
4561			for distinguishing attributes. Thus the following are equivalent:
4562		</p>
4563		<ul>
4564			<li><code>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;]</code></li>
4565			<li><code>//ldml/localeDisplayNames/languages/language[@type=&quot;ar&quot;][@draft=&quot;unconfirmed&quot;]</code></li>
4566		</ul>
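		<p>Equivalence and extension of element chains can be approximated
			with simple string processing. The sketch below is a simplification
			under stated assumptions: the set <code>DISTINGUISHING</code> is a
			small illustrative subset rather than the real list, attributes are
			assumed to be in alphabetical order already, and attribute values
			are assumed not to contain &quot;/&quot;.</p>

```python
import re

# Assumption: an illustrative subset of distinguishing attribute names.
DISTINGUISHING = {"type", "count", "alt"}

# One predicate such as [@type="ar"] or [@type='ar'].
ATTRIBUTE = re.compile(r"""\[@([\w-]+)=(["'])(.*?)\2\]""")


def normalize(chain: str) -> str:
    """Drop nondistinguishing attributes and use double quotes throughout."""
    def keep(match):
        name, value = match.group(1), match.group(3)
        return f'[@{name}="{value}"]' if name in DISTINGUISHING else ""
    return ATTRIBUTE.sub(keep, chain)


def equivalent(a: str, b: str) -> bool:
    """True if the chains match once nondistinguishing attributes are removed."""
    return normalize(a) == normalize(b)


def is_extension(a: str, b: str) -> bool:
    """True if b is equivalent to an initial portion of a (or to all of a)."""
    steps_a = normalize(a).strip("/").split("/")
    steps_b = normalize(b).strip("/").split("/")
    return steps_a[:len(steps_b)] == steps_b
```

		<p>With these definitions the two <code>language[@type="ar"]</code>
			chains listed above compare as equivalent, because
			<code>draft</code> is not treated as distinguishing.</p>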
4567		<p>
			For any locale ID, a <i>locale chain</i> is an ordered list starting
4569			with the root and leading down to the ID. For example:
4570		</p>
4571		<blockquote>
4572			<p>&lt;root, de, de_DE, de_DE_xxx&gt;</p>
4573		</blockquote>
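		<p>A locale chain built purely by truncation can be computed as in
			the following Python sketch. The function name
			<code>locale_chain</code> is hypothetical, and the sketch
			deliberately ignores any <span class="element">parentLocale</span>
			overrides.</p>

```python
from typing import List


def locale_chain(locale_id: str) -> List[str]:
    """Return the chain from root down to locale_id, using truncation only."""
    if locale_id == "root":
        return ["root"]
    parts = locale_id.split("_")
    # Each prefix of the subtag list is one link in the chain.
    return ["root"] + ["_".join(parts[:i]) for i in range(1, len(parts) + 1)]
```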
4574		<h4>
4575			<a name="Resolved_Data_File" href="#Resolved_Data_File">4.2.2
4576				Resolved Data File</a>
4577		</h4>
		<p>To produce a fully resolved locale data file from CLDR for a
4579			locale ID L, you start with L, and successively add unique items from
4580			the parent locales until you get up to root. More formally, this can
4581			be expressed as the following procedure.</p>
4582		<ol>
4583			<li>Let Result be initially L.</li>
4584			<li>For each Li in the locale chain for L, starting at L and
4585				going up to root:
4586				<ol>
4587					<li>Let Temp be a copy of the pairs in the LDML file for Li</li>
4588					<li>Replace each alias in Temp by the resolved list of pairs
4589						it points to.
4590						<ol>
4591							<li>The resolved list of pairs is obtained by recursively
4592								applying this procedure.</li>
4593							<li>That alias now blocks any inheritance from the parent.
4594								(See <i><a href="#Common_Elements">Section 5.1 Common
4595										Elements</a></i> for an example.)
4596							</li>
4597						</ol>
4598					</li>
4599					<li>For each element pair P in Temp:
4600						<ol>
4601							<li>If P does not contain a blocking element, and Result
4602								does not have an element pair Q with an equivalent element
4603								chain, add P to Result.</li>
4604						</ol>
4605					</li>
4606				</ol>
4607			</li>
4608		</ol>
4609		<p>
4610			<b>Notes:</b>
4611		</p>
4612		<ul>
4613			<li>When adding an element pair to a result, it has to go in the
4614				right order for it to be valid according to the DTD.</li>
4615			<li>The identity element and its children are unaffected by
4616				resolution.</li>
4617			<li>The LDML data must be constructed so as to avoid circularity
4618				in step 2.2.</li>
4619		</ul>
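		<p>The core of the procedure above can be sketched in Python as
			follows. This is a simplified illustration under several
			assumptions: each file is modeled as a list of
			(element chain, value) pairs, alias replacement (step 2.2) is
			omitted, equivalence defaults to plain string equality, blocking
			elements are identified by a caller-supplied set of element names,
			and new pairs are simply appended rather than placed in DTD order.
			The name <code>resolve</code> and its parameters are hypothetical.</p>

```python
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (element chain, value)


def resolve(chain_to_root: List[str],
            files: Dict[str, List[Pair]],
            equivalent: Callable[[str, str], bool] = lambda a, b: a == b,
            blocking: Iterable[str] = ()) -> List[Pair]:
    """Merge pairs from L up to root; chain_to_root[0] is L itself."""
    blocking = set(blocking)
    # Step 1: Result starts as the pairs of L.
    result = list(files.get(chain_to_root[0], []))
    # Step 2: visit each locale from L up to root.
    for locale in chain_to_root:
        temp = list(files.get(locale, []))  # step 2.1 (alias step 2.2 omitted)
        for element_chain, value in temp:   # step 2.3
            steps = element_chain.strip("/").split("/")
            # Following step 2.3.1 literally: skip pairs inside blocking elements.
            if any(step.split("[")[0] in blocking for step in steps):
                continue
            if not any(equivalent(element_chain, q) for q, _ in result):
                result.append((element_chain, value))
    return result
```

		<p>A more faithful version would pass in an equivalence test that
			ignores nondistinguishing attributes and would insert each pair at
			the position required by the DTD.</p>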
4620		<h4>
4621			<a name="Valid_Data" href="#Valid_Data">4.2.3 Valid Data</a>
4622		</h4>
4623		<p>
4624			The attribute <i>draft=&quot;x&quot; </i>in LDML means that the data
4625			has not been approved by the subcommittee. (For more information, see
			<a href="http://cldr.unicode.org/index/process">Process</a>.)
4627			However, some data that is not explicitly marked as <i>draft </i>may
4628			be implicitly <i>draft</i>, either because it inherits it from a
4629			parent, or from an enclosing element.
4630		</p>
4631		<p>
4632			<b>Example 2. </b>Suppose that new locale data is added for af
4633			(Afrikaans). To indicate that all of the data is <i>unconfirmed</i>,
4634			the attribute can be added to the top level.
4635		</p>
4636		<p>
4637			<code>
4638				&lt;ldml version=&quot;1.1&quot; draft=&quot;unconfirmed&quot;&gt;<br>
4639				&nbsp;&lt;identity&gt;<br> &nbsp; &lt;version
4640				number=&quot;1.1&quot; /&gt; <br> &nbsp; &lt;language
4641				type=&quot;af&quot; /&gt; <br> &nbsp;&lt;/identity&gt;<br>
4642				&nbsp;&lt;characters&gt;...&lt;/characters&gt;<br>
4643				&nbsp;&lt;localeDisplayNames&gt;...&lt;/localeDisplayNames&gt;<br>
4644				&lt;/ldml&gt;
4645			</code>
4646		</p>
4647		<p>
			Any data can be added to that file, and all of it will have the status draft=<i>unconfirmed</i>.
			Once an item is vetted (<i>whether it is inherited or explicitly
				in the file</i>), its status can be changed to <i>approved</i>. This
			can be done by leaving draft=&quot;unconfirmed&quot; on the
			enclosing element and marking the child with
			draft=&quot;approved&quot;, such as:
4654		</p>
4655		<p>
4656			<code>
4657				&lt;ldml version=&quot;1.1&quot; draft=&quot;unconfirmed&quot;&gt;<br>
4658				&nbsp;&lt;identity&gt;<br> &nbsp; &lt;version
4659				number=&quot;1.1&quot; /&gt; <br> &nbsp; &lt;language
4660				type=&quot;af&quot; /&gt; <br> &nbsp;&lt;/identity&gt;<br>
4661				&nbsp;&lt;characters
4662				draft=&quot;approved&quot;&gt;...&lt;/characters&gt;<br>
4663				&nbsp;&lt;localeDisplayNames&gt;...&lt;/localeDisplayNames&gt;<br>
4664				&nbsp;&lt;dates/&gt;<br> &nbsp;&lt;numbers/&gt;<br>
4665				&nbsp;&lt;collations/&gt;<br> &lt;/ldml&gt;
4666			</code>
4667		</p>
4668		<p>
4669			However, normally the draft attributes should be canonicalized, which
4670			means they are pushed down to leaf nodes as described in <i><a
4671				href="#Canonical_Form">Section 5.6 Canonical Form</a></i>. If an LDML
			file has draft attributes that are not on leaf nodes, the file
4673			should be interpreted as if it were the canonicalized version of that
4674			file.
4675		</p>
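		<p>One way to picture this interpretation is that the effective
			draft status of a node is the nearest draft attribute on the node
			itself or on an enclosing element. The Python sketch below
			illustrates that reading; the representation of an element chain as
			a list of (name, attributes) steps and the name
			<code>effective_draft</code> are assumptions made for the example.</p>

```python
from typing import Dict, List, Optional, Tuple

Step = Tuple[str, Dict[str, str]]  # (element name, attributes)


def effective_draft(steps: List[Step]) -> Optional[str]:
    """Walk from the innermost element outward; return the first draft value."""
    for _name, attributes in reversed(steps):
        if "draft" in attributes:
            return attributes["draft"]
    return None  # no draft attribute on the element or any enclosing element
```

		<p>For the Afrikaans file shown above, the chain through
			<code>characters</code> resolves to <i>approved</i>, while the
			chain through <code>localeDisplayNames</code> still resolves to
			<i>unconfirmed</i>.</p>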
4676		<p>More formally, here is how to determine whether data for an
4677			element chain E is implicitly or explicitly draft, given a locale L.
			Items 1, 2, and 4 are simply formalizations of what is in LDML
4679			already. Item 3 adds the new element.</p>
4680		<h4>
4681			<a name="Checking_for_Draft_Status" href="#Checking_for_Draft_Status">4.2.4
4682				Checking for Draft Status</a>
4683		</h4>
4684		<ol>
4685			<li><b>Parent Locale Inheritance</b>
4686				<ol>
4687					<li>Walk through the locale chain until you find a locale ID
4688						L&#39; with a data file D. (L&#39; may equal L).</li>
4689					<li>Produce the fully resolved data file D&#39; for D.</li>
4690					<li>In D&#39;, find the first element pair whose element chain
4691						E&#39; is either equivalent to or an extension of E.</li>
4692					<li>If there is no such E&#39;, return <i>true</i></li>
4693					<li>If E&#39; is not equivalent to E, truncate E&#39; to the
4694						length of E.</li>
4695				</ol></li>
4696			<li><b>Enclosing Element Inheritance</b>
4697				<ol>
4698					<li>Walk through the elements in E&#39;, from back to front.
4699						<ol>
4700							<li>If you ever encounter draft=<i>x</i>, return <i>x</i></li>
4701						</ol>
4702					</li>
4703					<li>If L&#39; = L, return <i>false</i></li>
4704				</ol></li>
4705			<li><b>Missing File Inheritance</b>
4706				<ol>
4707					<li>Otherwise, walk again through the elements in E&#39;, from
4708						back to front.
4709						<ol>
4710							<li>If you encounter a validSubLocales attribute
4711								(deprecated):
4712								<ol>
4713									<li>If L is in the attribute value, return <i>false</i></li>
4714									<li>Otherwise return <i>true</i></li>
4715								</ol>
4716							</li>
4717						</ol>
4718					</li>
4719				</ol></li>
4720			<li><b>Otherwise</b>
4721				<ol>
4722					<li>Return <i>true</i></li>
4723				</ol></li>
4724		</ol>
		<p>The validSubLocales attribute in the most specific locale file
			(the one farthest from root) &quot;wins&quot; through the full
			resolution step (data from more specific files replacing data from
			less specific ones).</p>
4729		<h4>
4730			<a name="Keyword_and_Default_Resolution"
4731				href="#Keyword_and_Default_Resolution">4.2.5 Keyword and Default
4732				Resolution</a>
4733		</h4>
4734		<p>When accessing data based on keywords, the following process is
4735			used. Consider the following example:</p>
4736		<ul>
4737			<li>The locale &#39;de&#39; has collation types A, B, C, and no
4738				&lt;default&gt; element</li>
4739			<li>The locale &#39;de_CH&#39; has &lt;default
4740				type=&#39;B&#39;&gt;</li>
4741		</ul>
4742		<p>Here are the searches for various combinations.</p>
4743		<table class='simple' border="1" cellpadding="0" cellspacing="0">
4744			<tr>
4745				<td><strong>User Input</strong></td>
4746				<td><strong>Lookup in Locale</strong></td>
4747				<td><strong>For</strong></td>
4748				<td><strong>Comment</strong></td>
4749			</tr>
4750			<tr>
4751				<td rowspan="3">de_CH<br> <em>no keyword</em></td>
4752				<td>de_CH</td>
4753				<td>default collation type</td>
4754				<td>finds &quot;B&quot;</td>
4755			</tr>
4756			<tr>
4757				<td>de_CH</td>
4758				<td>collation type=B</td>
4759				<td>not found</td>
4760			</tr>
4761			<tr>
4762				<td>de</td>
4763				<td>collation type=B</td>
4764				<td><em>found</em></td>
4765			</tr>
4766			<tr>
4767				<td rowspan="4">de<br> <em>no keyword</em></td>
4768				<td>de</td>
4769				<td>default collation type</td>
4770				<td>not found</td>
4771			</tr>
4772			<tr>
4773				<td>root</td>
4774				<td>default collation type</td>
4775				<td>finds &quot;standard&quot;</td>
4776			</tr>
4777			<tr>
4778				<td>de</td>
4779				<td>collation type=standard</td>
4780				<td>not found</td>
4781			</tr>
4782			<tr>
4783				<td>root</td>
4784				<td>collation type=standard</td>
4785				<td><i>found</i></td>
4786			</tr>
4787			<tr>
4788				<td>de_u_co_A</td>
4789				<td>de</td>
4790				<td>collation type=A</td>
4791				<td><i>found</i></td>
4792			</tr>
4793			<tr>
4794				<td rowspan="2">de_u_co_standard</td>
4795				<td>de</td>
4796				<td>collation type=standard</td>
4797				<td>not found</td>
4798			</tr>
4799			<tr>
4800				<td>root</td>
4801				<td>collation type=standard</td>
4802				<td><i>found</i></td>
4803			</tr>
4804			<tr>
4805				<td rowspan="6">de_u_co_foobar</td>
4806				<td>de</td>
4807				<td>collation type=foobar</td>
4808				<td>not found</td>
4809			</tr>
4810			<tr>
4811				<td>root</td>
4812				<td>collation type=foobar</td>
4813				<td>not found, starts looking for default</td>
4814			</tr>
4815			<tr>
4816				<td>de</td>
4817				<td>default collation type</td>
4818				<td>not found</td>
4819			</tr>
4820			<tr>
4821				<td>root</td>
4822				<td>default collation type</td>
4823				<td>finds &quot;standard&quot;</td>
4824			</tr>
4825			<tr>
4826				<td>de</td>
4827				<td>collation type=standard</td>
4828				<td>not found</td>
4829			</tr>
4830			<tr>
4831				<td>root</td>
4832				<td>collation type=standard</td>
4833				<td><i>found</i></td>
4834			</tr>
4835		</table>
4836		<p>Examples of &quot;search&quot; collator lookup; 'de' has a
4837			language-specific version, but 'en' does not:</p>
4838		<table class='simple' border="1" cellpadding="0" cellspacing="0">
4839			<tr>
4840				<td><strong>User Input</strong></td>
4841				<td><strong>Lookup in Locale</strong></td>
4842				<td><strong>For</strong></td>
4843				<td><strong>Comment</strong></td>
4844			</tr>
4845			<tr>
4846				<td rowspan="2">de_CH_u_co_search</td>
4847				<td>de_CH</td>
4848				<td>collation type=search</td>
4849				<td>not found</td>
4850			</tr>
4851			<tr>
4852				<td>de</td>
4853				<td>collation type=search</td>
4854				<td><i>found</i></td>
4855			</tr>
4856			<tr>
4857				<td rowspan="3">en_US_u_co_search</td>
4858				<td>en_US</td>
4859				<td>collation type=search</td>
4860				<td>not found</td>
4861			</tr>
4862			<tr>
4863				<td>en</td>
4864				<td>collation type=search</td>
4865				<td>not found</td>
4866			</tr>
4867			<tr>
4868				<td>root</td>
4869				<td>collation type=search</td>
4870				<td><i>found</i></td>
4871			</tr>
4872		</table>
4873		<p>Examples of lookup for Chinese collation types. Note:</p>
4874		<ul>
4875			<li>All of the Chinese-specific collation types are provided in
4876				the 'zh' locale</li>
4877			<li>For 'zh' the &lt;default&gt; element specifies
4878				&quot;pinyin&quot;; for 'zh_Hant' the &lt;default&gt; element
				specifies &quot;stroke&quot;. However, any of the available Chinese
4880				collation types can be explicitly requested for any Chinese locale.</li>
4881		</ul>
4882		<table class='simple' border="1" cellpadding="0" cellspacing="0">
4883			<tr>
4884				<td><strong>User Input</strong></td>
4885				<td><strong>Lookup in Locale</strong></td>
4886				<td><strong>For</strong></td>
4887				<td><strong>Comment</strong></td>
4888			</tr>
4889			<tr>
4890				<td rowspan="3">zh_Hant<br> <em>no keyword</em></td>
4891				<td>zh_Hant</td>
4892				<td>default collation type</td>
4893				<td>finds &quot;stroke&quot;</td>
4894			</tr>
4895			<tr>
4896				<td>zh_Hant</td>
4897				<td>collation type=stroke</td>
4898				<td>not found</td>
4899			</tr>
4900			<tr>
4901				<td>zh</td>
4902				<td>collation type=stroke</td>
4903				<td><i>found</i></td>
4904			</tr>
4905			<tr>
4906				<td rowspan="3">zh_Hant_HK_u_co_pinyin</td>
4907				<td>zh_Hant_HK</td>
4908				<td>collation type=pinyin</td>
4909				<td>not found</td>
4910			</tr>
4911			<tr>
4912				<td>zh_Hant</td>
4913				<td>collation type=pinyin</td>
4914				<td>not found</td>
4915			</tr>
4916			<tr>
4917				<td>zh</td>
4918				<td>collation type=pinyin</td>
4919				<td><i>found</i></td>
4920			</tr>
4921			<tr>
4922				<td rowspan="2">zh<br> <em>no keyword</em></td>
4923				<td>zh</td>
4924				<td>default collation type</td>
4925				<td>finds &quot;pinyin&quot;</td>
4926			</tr>
4927			<tr>
4928				<td>zh</td>
4929				<td>collation type=pinyin</td>
4930				<td><i>found</i></td>
4931			</tr>
4932		</table>
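		<p>The lookup sequences in the tables above follow a common pattern,
			which the Python sketch below tries to capture. It is an
			illustration only: the data layout (a dictionary with optional
			<code>"default"</code> and <code>"types"</code> entries per
			locale), the function name <code>collation_lookup</code>, and the
			trace format are assumptions. The caller supplies the locale chain
			from the requested locale up to root, using the collation parent
			(so <code>zh_Hant</code> is followed by <code>zh</code>).</p>

```python
from typing import Dict, List, Optional, Tuple

Trace = List[Tuple[str, str, str]]  # (locale, what was looked for, outcome)


def collation_lookup(chain: List[str], data: Dict[str, dict],
                     requested: Optional[str] = None):
    """Return (locale where found, collation type, trace of lookups)."""
    trace: Trace = []

    def find_type(ctype: str) -> Optional[str]:
        for locale in chain:
            found = ctype in data.get(locale, {}).get("types", ())
            trace.append((locale, f"collation type={ctype}",
                          "found" if found else "not found"))
            if found:
                return locale
        return None

    def find_default() -> Optional[str]:
        for locale in chain:
            default = data.get(locale, {}).get("default")
            trace.append((locale, "default collation type",
                          f'finds "{default}"' if default else "not found"))
            if default:
                return default
        return None

    # An explicit keyword is tried first along the whole chain.
    if requested is not None:
        locale = find_type(requested)
        if locale is not None:
            return locale, requested, trace
    # Otherwise (or on failure) resolve the default, then look that type up.
    default = find_default()
    locale = find_type(default) if default is not None else None
    return locale, default, trace
```

		<p>With data modeled on the first table (root has default
			&quot;standard&quot;, 'de' has types A, B, C, and 'de_CH' has
			default B), a request for <code>de_u_co_foobar</code> produces the
			same six lookups listed in that table.</p>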
4933		<blockquote>
4934			<p>
				<b>Note: </b>It is an invariant that the default in root for a given
				element must always be a value that exists in root. So you
				cannot have the following in root:
4938			</p>
4939		</blockquote>
4940		<p>
4941			<code>
4942				&lt;someElements&gt;<br> &nbsp; &lt;default
4943				type=&#39;a&#39;/&gt;<br> &nbsp; &lt;someElement
4944				type=&#39;b&#39;&gt;...&lt;/someElement&gt;<br> &nbsp;
4945				&lt;someElement type=&#39;c&#39;&gt;...&lt;/someElement&gt;<br>
4946				<b>&nbsp; &lt;!-- no &#39;a&#39; --&gt;</b><br>
4947				&lt;/someElements&gt;
4948			</code>
4949		</p>
4950		<p>For identifiers, such as language codes, script codes, region
4951			codes, variant codes, types, keywords, currency symbols or currency
			display names, the default value is the identifier itself whenever
4953			no value is found in the root. Thus if there is no display name for
4954			the region code &#39;QA&#39; in root, then the display name is simply
4955			&#39;QA&#39;.	  </p>
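		<p>A display-name lookup with this last-resort behavior might look
			like the Python sketch below. The function name
			<code>display_name</code> and the shape of the
			<code>names_by_locale</code> table are assumptions for
			illustration.</p>

```python
from typing import Dict, List


def display_name(code: str, names_by_locale: Dict[str, Dict[str, str]],
                 chain_to_root: List[str]) -> str:
    """Search from the requested locale up to root; fall back to the code."""
    for locale in chain_to_root:
        name = names_by_locale.get(locale, {}).get(code)
        if name is not None:
            return name
    # Nothing found, even in root: the identifier is its own display name.
    return code
```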
4956
4957		<h4>
4958		  <a name="Inheritance_vs_Related" href="#Inheritance_vs_Related">4.2.6 Inheritance vs Related Information</a>
4959		</h4>
4960	    <p>There are related types of data and processing that are easy to confuse:</p>
4961		  <table class='simple'>
4962		    <tr>
4963		      <td rowspan="4"><p><strong>Inheritance</strong></p></td>
4964		      <td colspan="2">Part of the internal mechanism used by CLDR to organize and manage locale data.
		        This is used to share common resources, ease maintenance, and provide the best fallback behavior in the absence of data. <em>Should not be used for locale matching or likely subtags.</em></td>
4966	        </tr>
4967		    <tr>
4968		      <td><em>Example:</em></td>
4969		      <td>parent(en_AU) ⇒ en_001<br>
4970	          parent(en_001) ⇒ en<br>
4971	          parent(en) ⇒ root</td>
4972	        </tr>
4973		    <tr>
4974		      <td><em>Data: </em></td>
4975		      <td>supplementalData.xml &lt;parentLocale&gt;</td>
4976	        </tr>
4977		    <tr>
4978		      <td><em>Spec:</em></td>
4979		      <td><strong>Section <a href="#Inheritance_and_Validity">4.2 Inheritance and Validity</a></strong></td>
4980	        </tr>
4981		    <tr>
4982		      <td rowspan="4"><strong>DefaultContent</strong></td>
4983		      <td colspan="2">Part of the internal mechanism used by CLDR to manage locale data. A particular sublocale is designated the defaultContent for a parent, so that the parent exhibits consistent behavior.  <em>Should not be used for locale matching or likely subtags.</em></td>
4984	        </tr>
4985		    <tr>
4986		      <td><em>Example:</em></td>
4987		      <td>addLikelySubtags(sr-ME) ⇒ sr-Latn-ME, minimize(de-Latn-DE) ⇒ de</td>
4988	        </tr>
4989		    <tr>
4990		      <td><em>Data: </em></td>
4991		      <td>supplementalMetadata.xml &lt;defaultContent&gt;</td>
4992	        </tr>
4993		    <tr>
4994		      <td><em>Spec:</em></td>
4995		      <td><strong>Part 6: Section 9.3 <a  href="tr35-info.html#Default_Content">Default Content</a>
4996		      </strong></td>
4997	        </tr>
4998   		    <tr>
4999		      <td rowspan="4"><strong>LikelySubtags</strong></td>
5000		      <td colspan="2">Provides most likely full subtag (script and region) in the absence of other information. A core component of LocaleMatching.</td>
5001	        </tr>
5002		    <tr>
5003		      <td><em>Example:</em></td>
5004		      <td>addLikelySubtags(zh) ⇒ zh-Hans-CN<br>
5005		        addLikelySubtags(zh-TW) ⇒ zh-Hant-TW <br>
		        minimize(zh-Hant, favorRegion) ⇒ zh-TW</td>
5007	        </tr>
5008		    <tr>
5009		      <td><em>Data: </em></td>
5010		      <td>likelySubtags.xml &lt;likelySubtags&gt;</td>
5011	        </tr>
5012		    <tr>
5013		      <td><em>Spec:</em></td>
5014		      <td><strong>Section <a href="#Likely_Subtags">4.3 Likely
5015			  Subtags</a></strong></td>
5016	        </tr>
5017		    <tr>
5018		      <td rowspan="4"><strong>LocaleMatching</strong></td>
5019		      <td colspan="2">Provides the   best match for the user’s language(s) among an application’s supported languages.  </td>
5020	        </tr>
5021		    <tr>
5022		      <td><em>Example:</em></td>
5023		      <td>bestLocale(userLangs=&lt;en, fr&gt;, appLangs=&lt;fr-CA, ru&gt;) ⇒ fr-CA</td>
5024	        </tr>
5025		    <tr>
5026		      <td><em>Data: </em></td>
5027		      <td>languageInfo.xml &lt;languageMatching&gt;</td>
5028	        </tr>
5029		    <tr>
5030		      <td><em>Spec:</em></td>
5031		      <td><strong>Section
5032              <a href="#LanguageMatching">4.4 Language Matching</a></strong></td>
5033	        </tr>
5034        </table>
5035
5036
5037		<h3>
5038		  <a name="Likely_Subtags" href="#Likely_Subtags">4.3 Likely
5039				Subtags</a>
5040		</h3>
5041		<p class="dtd">
5042			&lt;!ELEMENT likelySubtag EMPTY &gt;<br> &lt;!ATTLIST
5043			likelySubtag from NMTOKEN #REQUIRED&gt;<br> &lt;!ATTLIST
5044			likelySubtag to NMTOKEN #REQUIRED&gt;
5045		</p>
5046		<p>There are a number of situations where it is useful to be able
5047			to find the most likely language, script, or region. For example,
5048			given the language &quot;zh&quot; and the region &quot;TW&quot;, what
5049			is the most likely script? Given the script &quot;Thai&quot; what is
5050			the most likely language or region? Given the region TW, what is the
5051			most likely language and script?</p>
5052		<p>Conversely, given a locale, it is useful to find out which
5053			fields (language, script, or region) may be superfluous, in the sense
5054			that they contain the likely tags. For example, &quot;en_Latn&quot;
5055			can be simplified down to &quot;en&quot; since &quot;Latn&quot; is
5056			the likely script for &quot;en&quot;; &quot;ja_Jpan_JP&quot; can be
5057			simplified down to &quot;ja&quot;.</p>
5058		<p>
5059			The <i>likelySubtag</i> supplemental data provides default
5060			information for computing these values. This data is based on the
			default content data, the population data, and the
5062			suppress-script data in [<a href="#BCP47">BCP47</a>]. It is
5063			heuristically derived, and may change over time.
5064		</p>
5065	  <p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see <strong><em>Section 4.2.6 <a
5066				href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a></em></strong>.</p>
5067	  <p>
5068			To look up data in the table, see if a locale matches one of the <b>from</b>
5069			attribute values. If so, fetch the corresponding <b>to</b> attribute
5070			value. For example, the Chinese data looks like the following:
5071		</p>
5072		<blockquote>
5073			<p class="example">
5074				&lt;likelySubtag from=&quot;zh&quot; to=&quot;zh_Hans_CN&quot;/&gt;<br>
5075				&lt;likelySubtag from=&quot;zh_HK&quot;
5076				to=&quot;zh_Hant_HK&quot;/&gt;<br> &lt;likelySubtag
5077				from=&quot;zh_Hani&quot; to=&quot;zh_Hani_CN&quot;/&gt;<br>
5078				&lt;likelySubtag from=&quot;zh_Hant&quot;
5079				to=&quot;zh_Hant_TW&quot;/&gt;<br> &lt;likelySubtag
5080				from=&quot;zh_MO&quot; to=&quot;zh_Hant_MO&quot;/&gt;<br>
5081				&lt;likelySubtag from=&quot;zh_TW&quot;
5082				to=&quot;zh_Hant_TW&quot;/&gt;
5083			</p>
5084		</blockquote>
5085		<p>So looking up &quot;zh_TW&quot; returns &quot;zh_Hant_TW&quot;,
5086			while looking up &quot;zh&quot; returns &quot;zh_Hans_CN&quot;.</p>
5087		<p>In more detail, the data is designed to be used in the
5088			following operations.</p>
5089		<p>
			Note that as of CLDR v24, any field present in the 'from' field is
			also present in the 'to' field, so an input field will not change in
			the &quot;Add Likely Subtags&quot; operation. The data and operations can
5093			also be used with language tags using [<a href="#BCP47">BCP47</a>]
5094			syntax, with the appropriate changes. In addition, certain common
5095			'denormalized' language subtags such as 'iw' (for 'he') may occur in
5096			both the 'from' and 'to' fields. This allows for implementations that
5097			use those denormalized subtags to use the data with only minor
5098			changes to the operations.
5099		</p>
5100		<p>&nbsp;</p>
5101		<p>
5102			<i><b>Add Likely Subtags: </b></i><em>Given a source locale X,
5103				to return a locale Y where the empty subtags have been filled in by
5104				the most likely subtags.</em> This is written as X ⇒ Y (&quot;X maximizes
5105			to Y&quot;).
5106		</p>
5107		<p>
5108			A subtag is called <em>empty</em> if it is a missing script or region
5109			subtag, or it is a base language subtag with the value
5110			&quot;und&quot;. In the description below, a subscript on a subtag <em>x</em>
5111			indicates which tag it is from: <em>x<sub>s</sub></em> is in the
			source, <em>x<sub>m</sub></em> is in a match, and <em>x<sub>r</sub></em>
5113			is in the final result.
5114		</p>
5115		<p>This operation is performed in the following way.</p>
5116		<ol>
5117			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><strong>Canonicalize.</strong>
5118				<ol>
5119					<li>Make sure the input locale is in canonical form: uses the
5120						right separator, and has the right casing.</li>
5121					<li style="margin-top: 0.5em; margin-bottom: 0.5em">Replace
5122						any deprecated subtags with their canonical values using the
5123						&lt;alias&gt; data in supplemental metadata. Use the first value
5124						in the replacement list, if it exists. Language tag replacements
5125						may have multiple parts, such as &quot;sh&quot; ➞
						&quot;sr_Latn&quot; or &quot;mo&quot; ➞ &quot;ro_MD&quot;. In such a
						case, the original script and/or region is retained if there is
						one. Thus &quot;sh_Arab_AQ&quot; ➞ &quot;sr_Arab_AQ&quot;, not
5129						&quot;sr_Latn_AQ&quot;.</li>
5130					<li>If the tag is grandfathered (see &lt;variable
5131						id=&quot;$grandfathered&quot; type=&quot;choice&quot;&gt; in the
5132						supplemental data), then return it.</li>
5133					<li>Remove the script code &#39;Zzzz&#39; and the region code
5134						&#39;ZZ&#39; if they occur.</li>
5135					<li>Get the components of the cleaned-up source tag <em>(language<sub>s</sub>,
5136							script<sub>s</sub>,
5137					</em>and<em> region<sub>s</sub></em>), plus any variants and extensions.
5138					</li>
5139				</ol></li>
5140			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><strong>Lookup.
5141			</strong>Lookup each of the following in order, and stop on the first match:
5142				<ol>
5143					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>s</sub>_script<sub>s</sub>_region<sub>s</sub></em></li>
5144
5145					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>s</sub>_region<sub>s</sub></em></li>
5146
5147					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>s</sub>_script<sub>s</sub></em></li>
5148					<li style="margin-top: 0.5em; margin-bottom: 0.5em"><em><em>language<sub>s</sub></em></em></li>
5149					<li>und<em>_script<sub>s</sub></em></li>
5150				</ol></li>
5151			<li><strong>Return</strong>
5152				<ol>
					<li>If there is no match, either return
5154						<ol>
5155							<li>an error value, or</li>
5156							<li>the match for &quot;und&quot; (in APIs where a valid
5157								language tag is required).</li>
5158						</ol>
5159					</li>
5160					<li>Otherwise there is a match = <span
5161						style="margin-top: 0.5em; margin-bottom: 0.5em"><em>language<sub>m</sub>_script<sub>m</sub>_region<sub>m</sub></em></span></li>
5162					<li>Let x<sub>r</sub> = x<sub>s</sub> if x<sub>s</sub> is not
5163						empty, and x<sub>m</sub> otherwise.
5164					</li>
					<li><span style="margin-top: 0.5em; margin-bottom: 0.5em">Return
5166							the language tag composed of <em>language<sub>r</sub> _
5167								script<sub>r</sub> _ region<sub>r</sub></em> + variants + extensions
5168					</span>.
5169					</li>
5170				</ol></li>
5171		</ol>
5172		<p>The lookup can be optimized. For example, if any of the tags in
5173			Step 2 are the same as previous ones in that list, they do not need
5174			to be tested.</p>
5175		<p>
			<i>Example:</i>
5177		</p>
5178		<ul>
5179			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5180				<p>Input is ZH-ZZZZ-SG.</p>
5181			</li>
5182			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5183				<p>Normalize to zh_SG.</p>
5184			</li>
5185			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5186				<p>Lookup in table. No match.</p>
5187			</li>
5188			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5189				<p>Lookup zh, and get the match (zh_Hans_CN). Substitute SG, and
5190					return zh_Hans_SG.</p>
5191			</li>
5192		</ul>
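		<p>The Add Likely Subtags steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the table is only the small subset of data quoted in this section, the helper names <code>parse</code> and <code>add_likely_subtags</code> are hypothetical, and the sketch assumes that alias replacement, grandfathered tags, variants, and extensions are handled elsewhere.</p>

```python
# Illustrative subset of the likelySubtags data quoted in this section.
LIKELY = {
    "zh": "zh_Hans_CN",
    "zh_HK": "zh_Hant_HK",
    "zh_Hani": "zh_Hani_CN",
    "zh_Hant": "zh_Hant_TW",
    "zh_MO": "zh_Hant_MO",
    "zh_TW": "zh_Hant_TW",
    "und_TW": "zh_Hant_TW",
}


def parse(tag):
    """Return (language, script, region) with separator and casing normalized.

    Variants and extensions are ignored in this sketch.
    """
    parts = tag.replace("-", "_").split("_")
    language, script, region = parts[0].lower(), "", ""
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()
        elif (len(part) == 2 and part.isalpha()) or (len(part) == 3 and part.isdigit()):
            region = part.upper()
    # Step 1: remove the "unknown" script and region codes.
    if script == "Zzzz":
        script = ""
    if region == "ZZ":
        region = ""
    return language, script, region


def add_likely_subtags(tag, table=LIKELY):
    """Return the maximized tag, or None as the error value when nothing matches."""
    lang, script, region = parse(tag)
    # Step 2: candidate keys in lookup order, skipping duplicates.
    candidates = [(lang, script, region), (lang, "", region), (lang, script, ""), (lang, "", "")]
    if script:
        candidates.append(("und", script, ""))
    keys = []
    for candidate in candidates:
        key = "_".join(p for p in candidate if p)
        if key not in keys:
            keys.append(key)
    for key in keys:
        if key in table:
            m_lang, m_script, m_region = parse(table[key])
            # Step 3: keep non-empty source subtags, fill the rest from the match.
            result = (m_lang if lang == "und" else lang, script or m_script, region or m_region)
            return "_".join(p for p in result if p)
    return None


print(add_likely_subtags("ZH-ZZZZ-SG"))
print(add_likely_subtags("und_TW"))
```

		<p>The sketch returns <code>None</code> when nothing matches; an API that must always return a valid tag could instead fall back to the entry for &quot;und&quot;, as described in the Return step.</p>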
5193		<p>To find the most likely language for a country, or language for
5194			a script, use &quot;und&quot; as the language subtag. For example,
5195			looking up &quot;und_TW&quot; returns zh_Hant_TW.</p>
5196		<p>A goal of the algorithm is that if X ⇒ Y, and X' results from
			replacing an empty subtag in X by the corresponding subtag in Y,
5198			then X' ⇒ Y. For example, if und_AF ⇒ fa_Arab_AF, then:</p>
5199		<ul>
5200			<li>fa_Arab_AF ⇒ fa_Arab_AF</li>
5201			<li>und_Arab_AF ⇒ fa_Arab_AF</li>
5202			<li>fa_AF ⇒ fa_Arab_AF</li>
5203		</ul>
5204		<p>There are a small number of exceptions to this goal in the
5205			current data, where X ∈ {und_Bopo, und_Brai, und_Cakm, und_Limb,
5206			und_Shaw}.</p>
5207		<p>
5208			<b><i>Remove</i></b><i><b> Likely Subtags: </b>Given a locale,
5209				remove any fields that Add Likely Subtags would add.</i>
5210		</p>
5211		<p>The reverse operation removes fields that would be added by the
5212			first operation.</p>
5213		<ol>
5214			<li style="margin-top: 0.5em; margin-bottom: 0.5em">First get
5215				max = AddLikelySubtags(inputLocale). If an error is signaled, return
5216				it.</li>
5217			<li style="margin-top: 0.5em; margin-bottom: 0.5em">Remove the
5218				variants from max.</li>
5219			<li style="margin-top: 0.5em; margin-bottom: 0.5em">Then for <i>trial</i>
5220				in {language, language _ region, language _ script}
5221				<ul>
5222					<li style="margin-top: 0.5em; margin-bottom: 0.5em">If
5223						AddLikelySubtags(<i>trial</i>) = max, then return <i>trial</i> +
5224						variants.
5225					</li>
5226				</ul>
5227			</li>
5228			<li style="margin-top: 0.5em; margin-bottom: 0.5em">If you do
5229				not get a match, return max + variants.</li>
5230		</ol>
5231		<p>Example:</p>
5232		<ul>
5233			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5234				<p>Input is zh_Hant. Maximize to get zh_Hant_TW.</p>
5235			</li>
5236			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5237				<p>zh =&gt; zh_Hans_CN. No match, so continue.</p>
5238			</li>
5239			<li style="margin-top: 0.5em; margin-bottom: 0.5em">
5240				<p>zh_TW =&gt; zh_Hant_TW. Matches, so return zh_TW.</p>
5241			</li>
5242		</ul>
5243		<p>A variant of this favors the script over the region, thus using
5244			{language, language_script, language_region} in the above. If that
5245			variant is used, then the result in this example would be zh_Hant
5246			instead of zh_TW.		</p>
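		<p>A rough sketch of Remove Likely Subtags, including the script-favoring variant, might look like this. The names <code>maximize</code>, <code>minimize</code>, and <code>favor_script</code> are hypothetical; the table holds only the three entries needed for the example; and the sketch assumes canonical input with no variants, extensions, or &quot;und&quot; handling.</p>

```python
# Only the entries needed for the zh_Hant example in this section.
LIKELY = {"zh": "zh_Hans_CN", "zh_Hant": "zh_Hant_TW", "zh_TW": "zh_Hant_TW"}


def split(tag):
    """Return (language, script, region) from a canonical tag such as zh_Hant_TW."""
    parts = tag.split("_")
    lang, script, region = parts[0], "", ""
    for part in parts[1:]:
        if len(part) == 4:
            script = part
        else:
            region = part
    return lang, script, region


def join(*parts):
    """Join the non-empty subtags with underscores."""
    return "_".join(p for p in parts if p)


def maximize(tag):
    """Simplified Add Likely Subtags; returns None when there is no match."""
    lang, script, region = split(tag)
    for key in (join(lang, script, region), join(lang, region), join(lang, script), lang):
        if key in LIKELY:
            _, m_script, m_region = split(LIKELY[key])
            return join(lang, script or m_script, region or m_region)
    return None


def minimize(tag, favor_script=False):
    """Remove the subtags that maximize() would add back."""
    max_tag = maximize(tag)
    if max_tag is None:
        return None  # propagate the error value
    lang, script, region = split(max_tag)
    trials = [lang, join(lang, region), join(lang, script)]
    if favor_script:
        trials = [lang, join(lang, script), join(lang, region)]
    for trial in trials:
        if maximize(trial) == max_tag:
            return trial
    return max_tag


print(minimize("zh_Hant"))
print(minimize("zh_Hant", favor_script=True))
```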
5247		<h3>
5248			<a name="LanguageMatching" href="#LanguageMatching">4.4 Language
5249				Matching</a>
5250		</h3>
5251		<p class="dtd">
5252			&lt;!ELEMENT languageMatching ( languageMatches* ) &gt;<br>
5253			&lt;!ELEMENT languageMatches ( paradigmLocales*, matchVariable*, languageMatch* ) &gt;<br>
5254		&lt;!ATTLIST languageMatches type NMTOKEN #REQUIRED &gt;</p>
5255		<p class="dtd">&lt;!ELEMENT languageMatch EMPTY &gt;<br> &lt;!ATTLIST
5256		  languageMatch desired CDATA #REQUIRED &gt;<br> &lt;!ATTLIST
5257		  languageMatch supported CDATA #REQUIRED &gt;<br> &lt;!ATTLIST
5258		  languageMatch percent NMTOKEN #REQUIRED &gt;<br>
5259          &lt;!ATTLIST languageMatch distance NMTOKEN #IMPLIED &gt;<br>
5260           &lt;!ATTLIST languageMatch oneway ( true | false ) #IMPLIED &gt;</p>
5263		<p class="dtd">&lt;!ELEMENT paradigmLocales EMPTY &gt;<br>
5264		  &lt;!ATTLIST paradigmLocales locales NMTOKENS #REQUIRED &gt;
5265	    </p>
5266		<p>
5267			Implementers are often faced with the issue of how to match the
5268			user's requested languages with their product's supported languages.
5269			For example, suppose that a product supports {ja-JP, de, zh-TW}. If
5270			the user understands written American English, German, French, Swiss
5271			German, and Italian, then <strong>de</strong> would be the best
5272			match; if s/he understands only Chinese (zh), then zh-TW would be the
5273			best match.
5274		</p>
5275		<p>The standard truncation-fallback algorithm does not work well
5276			when faced with the complexities of natural language. The language
5277			matching data is designed to fill that gap. Stated in those terms,
5278			language matching can have the effect of a more complex fallback,
5279			such as:</p>
5280		<p>
5281			sr-Cyrl-RS<br> sr-Cyrl<br> sr-Latn-RS<br> sr-Latn<br>
5282			sr<br> hr-Latn<br> hr
5283		</p>
5284		<p>Language matching is used to find the best supported locale ID
5285			given a requested list of languages. The requested list could come
			from different sources, such as the user's list of preferred
5287			languages in the OS Settings, or from a browser Accept-Language list.
5288			For example, if my native tongue is English, I can understand Swiss
5289			German and German, my French is rusty but usable, and Italian basic,
5290			ideally an implementation would allow me to select {gsw, de, fr} as
5291			my preferred list of languages, skipping Italian because my
5292			comprehension is not good enough for arbitrary content.</p>
5293		<p>Language Matching can also be used to get fallback data elements. In
5294		  many cases, there may not be full data for a particular locale. For
5295		  example, for a Breton speaker, the best fallback if data is
5296		  unavailable might be French. That is, suppose we have found a Breton
		  bundle, but it does not contain a translation for the key &quot;CN&quot;
		  (for the country China). It is best to return &quot;Chine&quot;,
		  rather than falling back to the value in a default language such as Russian
		  and getting &quot;Китай&quot;.&nbsp; The language matching data can be
5301		  used to get the closest fallback locales (of those supported) to a
5302		  given language.
5303</p>
5304	  <p>For the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching, see <strong><em>Section 4.2.6 <a
5305				href="tr35.html#Inheritance_vs_Related">Inheritance vs Related Information</a></em></strong>.</p>		<p>
			When such fallback is used for inherited item lookup, the normal
			order of inheritance applies, except that
5308			before using any data from <strong>root</strong>, the data for the
5309			fallback locales would be used if available. Language matching does
5310			not interact with the fallback of resources <em>within the
5311				locale-parent chain</em>. For example, suppose that we are looking for
5312			the value for a particular path <strong>P</strong> in <strong>nb-NO</strong>.
5313			In the absence of aliases, normally the following lookup is used.
5314		</p>
5315		<blockquote>
5316			<p>
5317				<strong>nb-NO</strong> → <strong>nb</strong> → <strong>root</strong>
5318			</p>
5319		</blockquote>
5320		<p>
5321			That is, we first look in <strong>nb-NO</strong>. If there is no
5322			value for <strong>P</strong> there, then we look in <strong>nb</strong>.
5323			If there is no value for <strong>P</strong> there, we return the
5324			value for <strong>P</strong> in root (or a code value, if there is
5325			nothing there). Remember that if there is an alias element along this
5326			path, then the lookup may restart with a different path in <strong>nb-NO</strong>
5327			(or another locale).
5328		</p>
5329		<p>
5330			However, suppose that <strong>nb-NO</strong> has the fallback values
5331			<strong>[nn da sv en]</strong>, derived from language matching. In
			that case, an implementation <em>may</em> progressively look up each
5333			of the listed locales, with the appropriate substitutions, returning
5334			the first value that is not found in <strong>root</strong>. This
5335			follows roughly the following pseudocode:
5336		</p>
5337		<ul>
5338			<li>value = lookup(P, nb-NO); if (locationFound != root) return
5339				value;</li>
5340			<li>value = lookup(P, nn-NO); if (locationFound != root) return
5341				value;</li>
5342			<li>value = lookup(P, da-NO); if (locationFound != root) return
5343				value;</li>
5344			<li>value = lookup(P, sv-NO); if (locationFound != root) return
5345				value;</li>
5346			<li>value = lookup(P, en-NO); return value;</li>
5347		</ul>
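		<p>The pseudocode above could be made concrete along the following lines. This is only a sketch under simplifying assumptions: resource bundles are modeled as nested dictionaries, the locale-parent chain is plain truncation with no aliases, and the function names (<code>parent_chain</code>, <code>lookup</code>, <code>lookup_with_fallback</code>) are hypothetical.</p>

```python
ROOT = "root"


def parent_chain(locale):
    """nb-NO -> ['nb-NO', 'nb', 'root'] by simple truncation (aliases ignored)."""
    parts = locale.split("-")
    chain = ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]
    return chain + [ROOT]


def lookup(path, locale, bundles):
    """Return (value, location_found), following the locale-parent chain."""
    for location in parent_chain(locale):
        value = bundles.get(location, {}).get(path)
        if value is not None:
            return value, location
    # Nothing found anywhere: return the path itself as a "code value".
    return path, ROOT


def lookup_with_fallback(path, locale, fallbacks, bundles):
    """Try the locale, then each fallback language with the same region."""
    language, _, region = locale.partition("-")
    languages = [language] + list(fallbacks)
    for index, candidate_language in enumerate(languages):
        candidate = f"{candidate_language}-{region}" if region else candidate_language
        value, found = lookup(path, candidate, bundles)
        # Accept the first value not taken from root; the last candidate
        # is accepted unconditionally.
        if found != ROOT or index == len(languages) - 1:
            return value


bundles = {
    "root": {"P": "root value"},
    "da": {"P": "Danish value"},
}
print(lookup_with_fallback("P", "nb-NO", ["nn", "da", "sv", "en"], bundles))
```

		<p>Because the fallback list is passed in once and never recomputed for the fallback locales themselves, the sketch is not recursive, matching the behavior described here.</p>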
5348		<p>
5349			The locales in the fallback list are not used recursively. For
5350			example, for the lookup of a path in nb-NO, if <strong>fr</strong>
5351			were a fallback value for <strong>da</strong>, it would not matter
5352			for the above process. Only the original language matters.
5353		</p>
5354		<p>The language matching data is intended to be used according to
5355			the following algorithm. This is a logical description, and can be
5356			optimized for production in many ways. In this algorithm, the
5357			languageMatching data is interpreted as an ordered list.</p>
5358		<p>The language matching algorithm takes a list of a user’s
5359			desired languages, and a list of the application’s supported
5360			languages.</p>
5361		<ul>
5362			<li>Set the best weighted distance BWD to ∞</li>
5363			<li>Set the best desired language BD to null</li>
5364			<li>For each desired language D
5365				<ul>
5366					<li>Compute a discount factor F, based on the position in the
5367						list.
5368						<ul>
5369							<li>This discount factor is up to the implementation, but is
5370								typically a positive value that increases according to how far D
5371								is from the start of the desired language list.</li>
5372						</ul>
5373					</li>
5374					<li>For each supported language S
5375						<ul>
5376							<li>Find the matching distance MD as described below.</li>
							<li>Compute the weighted distance WD as F + MD</li>
							<li>If WD &lt; BWD
5379								<ul>
5380									<li>BWD = WD</li>
5381									<li>BD = D</li>
5382								</ul>
5383							</li>
5384						</ul>
5385					</li>
5386				</ul>
5387			</li>
5388			<li>If the BWD is less than a threshold, return BD.
5389				<ul>
5390					<li>The threshold is implementation-defined, typically set to
5391						greater than a default region difference, and less than a default
5392						script difference.</li>
5393				</ul>
5394			</li>
5395			<li>Otherwise return a default supported language (like
5396				English).</li>
5397		</ul>
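		<p>The outer loop of the algorithm can be sketched as below. Several things here are assumptions made for illustration: the function and parameter names, the linear discount <code>position * discount_step</code>, the toy distance function, and the choice to return the best supported language together with the best desired language.</p>

```python
import math


def best_match(desired, supported, distance, threshold, default, discount_step):
    """Return (best desired, best supported), or (None, default) if nothing is close enough."""
    best_wd = math.inf
    best_desired = None
    best_supported = None
    for position, d in enumerate(desired):
        # Discount factor F: grows with the position in the desired list.
        f = position * discount_step
        for s in supported:
            wd = f + distance(d, s)
            if wd < best_wd:
                best_wd, best_desired, best_supported = wd, d, s
    if best_wd < threshold:
        return best_desired, best_supported
    return None, default


def toy_distance(d, s):
    """Stand-in for the matching distance MD; not the real CLDR computation."""
    if d == s:
        return 0
    if d.split("-")[0] == s.split("-")[0]:
        return 4   # same base language, different region
    return 99      # different base language


print(best_match(["en", "fr"], ["fr-CA", "ru"], toy_distance,
                 threshold=40, default="ru", discount_step=5))
```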
5398		<p>To find the matching distance MD between any two languages,
5399			perform the following steps.</p>
5400		<ol>
5401			<li>Maximize each language using Section 4.3 <a
5402				href="#Likely_Subtags">Likely Subtags</a>.
5403				<ul>
5404					<li>und is a special case: see below.</li>
5405				</ul>
5406			</li>
5407			<li>Set the match-distance MD to 0</li>
5408			<li>For each subtag in the list, starting from the end: region,
5409				script, base-language
5410				<ol>
5411					<li>If respective subtags in each language tag are identical,
5412						remove the subtag from each (logically) and continue.</li>
5413					<li>Traverse the languageMatching data until a match is found.
5414						<ul>
5415							<li>* matches any field.</li>
5416							<li>If the oneway flag is false, then the match is
5417								symmetric.</li>
5418						</ul>
5419					</li>
5420					<li>Add 100 minus the <strong>percent</strong> attribute value
5421						to MD.
5422					</li>
5423					<li>Remove the subtag from each (logically)</li>
5424				</ol>
5425			</li>
5426			<li>Return MD</li>
5427		</ol>
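		<p>The matching-distance steps might be sketched as follows. The sketch assumes both tags have already been maximized to language-script-region form, treats a rule without a oneway flag as symmetric, and uses a small rule list: the wildcard defaults and the Spanish rule are quoted in this section, while the nn/nb rule is inferred from the worked example here. The function names are hypothetical.</p>

```python
# (desired, supported, percent, oneway), interpreted as an ordered list.
RULES = [
    ("nn", "nb", 96, False),
    ("es-*-ES", "es-*-ES", 100, False),
    ("es-*-ES", "es-*-*", 93, False),
    ("*", "*", 1, False),          # default for different languages
    ("*-*", "*-*", 20, False),     # default for different scripts
    ("*-*-*", "*-*-*", 96, False), # default for different regions
]


def fields_match(pattern, fields):
    """True if every pattern field is '*' or equals the corresponding field."""
    return all(p == "*" or p == f for p, f in zip(pattern, fields))


def rule_percent(desired, supported, rules):
    """Percent of the first rule matching the remaining fields of both tags."""
    for rule_desired, rule_supported, percent, oneway in rules:
        pd, ps = rule_desired.split("-"), rule_supported.split("-")
        if len(pd) != len(desired) or len(ps) != len(supported):
            continue
        if fields_match(pd, desired) and fields_match(ps, supported):
            return percent
        if not oneway and fields_match(pd, supported) and fields_match(ps, desired):
            return percent
    raise LookupError("no rule matched; the data should end with wildcard defaults")


def matching_distance(desired_tag, supported_tag, rules=RULES):
    """Sum 100 - percent over region, script, and base language, in that order."""
    d, s = desired_tag.split("-"), supported_tag.split("-")
    md = 0
    while d:
        if d[-1] != s[-1]:
            md += 100 - rule_percent(d, s, rules)
        # Logically remove the last subtag from each tag.
        d, s = d[:-1], s[:-1]
    return md


print(matching_distance("nn-Latn-DE", "nb-Latn-FR"))
```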
5428		<p>
5429			It is typically useful to set the discount factor between successive
5430			elements of the desired languages list to be slightly greater than
5431			the default region difference. That avoids the following problem:<br>
5432		</p>
5433		<p>
5434			<em>Supported languages:</em> "de, fr, ja"<br>
5435		</p>
5436		<p>
5437			<em>User's desired languages:</em> "de-AT, fr"
5438		</p>
5439		<p>This user would expect to get "de", not "fr". In practice, when
5440			a user selects a list of preferred languages, they don't include all
			the regional variants ahead of their second base language. Although
			the user's list of desired languages does not really tell us the priority
			ranking among their languages, normally the fall-off between the
			user's languages is substantially greater than that between regional variants.
			Unless F is greater than the distance between de-AT and de-DE,
			the user’s second-choice language would be returned.</p>
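		<p>The arithmetic behind this recommendation can be checked with a few lines. The sketch assumes a linear discount (position times a step), a default region difference of 4 (100 minus 96), and a hypothetical helper <code>winner</code>; none of these are mandated by the data.</p>

```python
REGION_DIFFERENCE = 4  # 100 - 96, the default distance between regions


def winner(discount_step):
    """Which supported language wins for desired 'de-AT, fr' and supported 'de, fr, ja'?"""
    # First choice de-AT against de: no discount, one region difference.
    wd_de = 0 * discount_step + REGION_DIFFERENCE
    # Second choice fr against fr: one discount step, exact match.
    wd_fr = 1 * discount_step + 0
    # A strict "less than" test in the main loop keeps the earlier candidate on ties.
    return "de" if wd_de <= wd_fr else "fr"


print(winner(5))  # step slightly greater than the region difference
print(winner(1))  # step smaller than the region difference
```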
5447		<p>The base language subtag &quot;und&quot; is a special case.
5448			Suppose we have the following situation:</p>
5449		<ul>
5450			<li>desired languages: {und, it}</li>
5451			<li>supported languages: {en, it}</li>
5452			<li>resulting language: en<br>
5453			</li>
5454		</ul>
		<p>The result is en rather than it because maximizing 'und' with likely
			subtags yields en-Latn-US, which then matches the supported language en
			exactly. However, 'und' has a special function in BCP 47;
			it stands in for 'no supplied base language'. To prevent this from
			happening, if the desired base language is und, the language matcher
	  should not apply likely subtags to it. </p>
		<p>Example:</p>
		<p>Suppose that nn-DE and nb-FR are being compared.
			They are first maximized to nn-Latn-DE and nb-Latn-FR, respectively.
			The list is searched. The first match is with &quot;*-*-*&quot;, for
			a match of 96%, which adds a distance of 4. The languages are truncated to nn-Latn and nb-Latn;
			the scripts are identical, so they add nothing, and the tags are truncated
			to nn and nb. The first match is also for a value of 96%, which adds another 4, so the
			total distance is 8 and the result is 92%.</p>
		<p>Note that language matching is orthogonal to how closely
5467			two languages are related linguistically. For example, Breton is more
5468			closely related to Welsh than to French, but French is the better
5469			match (because it is more likely that a Breton reader will understand
5470			French than Welsh). This also illustrates that the matches are often
5471			asymmetric: it is not likely that a French reader will understand
5472			Breton.</p>
5473		<p>The &quot;*&quot; acts as a wild card, as shown in the
5474			following example:</p>
5475		<p class="example">
5476			&lt;languageMatch desired=&quot;es-*-ES&quot;
5477			supported=&quot;es-*-ES&quot; percent=&quot;100&quot;/&gt;<br>
5478			&lt;!-- Latin American Spanishes are closer to each other.
5479			Approximate by having es-ES be further from everything else.--&gt;
5480		</p>
5481		<p>&nbsp;</p>
5482		<p class="example">&lt;languageMatch desired=&quot;es-*-ES&quot;
5483			supported=&quot;es-*-*&quot; percent=&quot;93&quot;/&gt;</p>
5484		<p class="example">
5485			<br> &lt;languageMatch desired=&quot;*&quot;
5486			supported=&quot;*&quot; percent=&quot;1&quot;/&gt;<br> &lt;!--
5487			[Default value - must be at end!] Normally there is no comprehension
5488			of different languages.--&gt;
5489		</p>
5490		<p class="example">
5491			<br> &lt;languageMatch desired=&quot;*-*&quot;
5492			supported=&quot;*-*&quot; percent=&quot;20&quot;/&gt;<br>
5493			&lt;!-- [Default value - must be at end!] Normally there is little
5494			comprehension of different scripts.--&gt;
5495		</p>
5496		<p class="example">
5497			<br> &lt;languageMatch desired=&quot;*-*-*&quot;
5498			supported=&quot;*-*-*&quot; percent=&quot;96&quot;/&gt;<br>
5499			&lt;!-- [Default value - must be at end!] Normally there are small
5500			differences across regions.--&gt;
5501		</p>
5502		<p>When the language+region is not matched, and there is otherwise
5503			no reason to pick among the supported regions for that language, then
5504			some measure of geographic &quot;closeness&quot; can be used. The
5505			results may be more understandable by users. Looking for en-SK, for
			example, should fall back to something within Europe (e.g. en-GB) in
			preference to something far away and unrelated (e.g. en-SG). Such a
5508			closeness metric does not need to be exact; a small amount of data
5509			can be used to give an approximate distance between any two regions.
5510			However, any such data must be used carefully; although Hong Kong is
5511			closer to India than to the UK, it is unlikely that en-IN would be a
5512			better match to en-HK than en-GB would.</p>
5513
5514		<h4><a name="EnhancedLanguageMatching" href="#EnhancedLanguageMatching">4.4.1 Enhanced Language Matching</a></h4>
5515		<p>The enhanced format for language matching adds  structure to enable better matching of languages. It is distinguished by having a suffix &quot;_new&quot; on the type, as in the example below. The extended structure allows matching to  take into account broad similarities that would give better results. For example, for English the regions that are or inherit from US (AS|GU|MH|MP|PR|UM|VI|US) form a &ldquo;cluster&rdquo;. Each region in that cluster should be closer to each other than to any other region. And a region outside the cluster should be closer to another region outside that cluster than to one inside. We get this issue with the &ldquo;world languages&rdquo; like English, Spanish, Portuguese, Arabic, etc.</p>
5516		<p><em>Example:</em></p>
5517		<pre> &lt;languageMatches type=&quot;written_new&quot;&gt;<br>	&lt;paradigmLocales locales=&quot;en en-GB es es-419 pt-BR pt-PT&quot;/&gt;<br>	&lt;matchVariable id=&quot;$enUS&quot; value=&quot;AS+GU+MH+MP+PR+UM+US+VI&quot;/&gt;<br>	&lt;matchVariable id=&quot;$cnsar&quot; value=&quot;HK+MO&quot;/&gt;<br>	&lt;matchVariable id=&quot;$americas&quot; value=&quot;019&quot;/&gt;<br>	&lt;matchVariable id=&quot;$maghreb&quot; value=&quot;MA+DZ+TN+LY+MR+EH&quot;/&gt;<br>	&lt;languageMatch desired=&quot;no&quot; supported=&quot;nb&quot; distance=&quot;1&quot;/&gt;&lt;!-- no ⇒ nb --&gt;<br>…
5518	&lt;languageMatch desired=&quot;ar_*_$maghreb&quot; supported=&quot;ar_*_$maghreb&quot; distance=&quot;4&quot;/&gt;
5519		&lt;!-- ar; *; $maghreb ⇒ ar; *; $maghreb --&gt;
5520	&lt;languageMatch desired=&quot;ar_*_$!maghreb&quot;	supported=&quot;ar_*_$!maghreb&quot;	distance=&quot;4&quot;/&gt;
5521		&lt;!-- ar; *; $!maghreb ⇒ ar; *; $!maghreb --&gt;<br>…</pre>
<p>The <strong>matchVariable</strong> allows a rule to match multiple regions, as illustrated by <strong>$maghreb</strong>. The syntax is simple: it allows + for <em>union</em> and - for <em>set difference</em>, but no precedence. So A+B-A+D is interpreted as (((A+B)-A)+D), not as (A+B)-(A+D). The variable <strong>id</strong> has a value of the form [$][a-zA-Z0-9]+. If $X is defined, then $!X automatically means all those regions that are not in $X. </p>
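<p>A small sketch of the left-to-right evaluation follows. The function names <code>evaluate</code> and <code>complement</code> are hypothetical, operands are treated as opaque codes (macroregion expansion is not done here), and the universe of regions needed for $!X is supplied by the caller.</p>

```python
import re


def evaluate(expression):
    """Evaluate a matchVariable value such as 'A+B-A+D' strictly left to right."""
    tokens = re.findall(r"[+-]|[^+-]+", expression)
    result = {tokens[0]}
    # Tokens alternate operand, operator, operand, ...
    for operator, operand in zip(tokens[1::2], tokens[2::2]):
        if operator == "+":
            result |= {operand}   # union
        else:
            result -= {operand}   # set difference
    return result


def complement(regions, all_regions):
    """The meaning of $!X: every region that is not in $X."""
    return set(all_regions) - set(regions)


print(sorted(evaluate("A+B-A+D")))
print(sorted(complement(evaluate("HK+MO"), ["CN", "HK", "MO", "TW"])))
```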
<p dir="ltr">When the set is interpreted, then macroregions are (logically) transformed into a list of their contents, so &ldquo;053+GB&rdquo; → &ldquo;AU+GB+NF+NZ&rdquo;. This is done recursively, so 009 → &ldquo;053+054+057+061+QO&rdquo; → &ldquo;AU+NF+NZ+FJ+NC+PG+SB+VU...&rdquo;. Note that we use 019 for all of the Americas in the variables above, because en-US should be in the same cluster as es-419 and its contents. </p>
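<p dir="ltr">The recursive expansion could be sketched like this. The containment table holds only the single mapping used in the example in this section and is passed in as a parameter; a real implementation would presumably draw on the full territory containment data. The names <code>expand</code> and <code>expand_all</code> are hypothetical.</p>

```python
# Illustrative containment data: only the mapping used in the example above.
CONTAINS = {"053": ["AU", "NF", "NZ"]}


def expand(code, contains=CONTAINS):
    """Recursively replace a macroregion code by the regions it contains."""
    if code not in contains:
        return {code}  # an ordinary region, or a code with no listed contents
    regions = set()
    for child in contains[code]:
        regions |= expand(child, contains)
    return regions


def expand_all(codes, contains=CONTAINS):
    """Expand every code in a list and return the union."""
    regions = set()
    for code in codes:
        regions |= expand(code, contains)
    return regions


print(sorted(expand_all(["053", "GB"])))
```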
<p>In the rules, the percent value (100..0) is replaced by a <strong>distance</strong> value, which is 100 minus the percent value (0..100).</p>
5525<p dir="ltr">These new variables and rules divide up the world into clusters, where items in the same clusters (for specific languages) get the normal regional difference, and items in different clusters get different weights.</p>
5527<p dir="ltr">Each cluster can have one or more associated <strong>paradigmLocales</strong>. These are locales that are preferred within a cluster. So when matching desired=[en-SA] against [en-GU en en-IN en-GB], the value en-GB is returned. Both of {en-GU en} are in a different cluster. While {en-IN en-GB} are in the same cluster, and the same distance from en-SA, the preference is given to en-GB because it is in the paradigm locales. It would be possible to express this in rules, but using this mechanism handles these very common cases without bulking up the tables.<br>
5528</p>
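<p dir="ltr">One way to picture the tie-breaking role of paradigm locales is the sketch below. The distances are made-up numbers chosen only so that en-IN and en-GB tie and the other cluster is farther away; the function name <code>pick</code> is hypothetical, and the paradigm list is the one from the example data above.</p>

```python
PARADIGM_LOCALES = ["en", "en-GB", "es", "es-419", "pt-BR", "pt-PT"]


def pick(candidates, paradigm_locales=PARADIGM_LOCALES):
    """Choose from (supported locale, distance) pairs, preferring paradigm locales on ties."""
    best = min(distance for _, distance in candidates)
    tied = [locale for locale, distance in candidates if distance == best]
    for locale in tied:
        if locale in paradigm_locales:
            return locale
    return tied[0]


# Hypothetical distances from the desired locale en-SA.
candidates = [("en-GU", 5), ("en", 5), ("en-IN", 4), ("en-GB", 4)]
print(pick(candidates))
```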
5529<p dir="ltr">The <strong>paradigmLocales</strong>  also allow matching to macroregions. For example, desired=[es-419] should match to {es-MX} more closely than to {es}, and vice versa: {es-MX} should match more closely to {es-419} than to {es}. But es-MX should match more closely to es-419 than to any of the other es-419 sublocales. In general, in the absence of other distance data, there is a &lsquo;paradigm&rsquo; in each cluster that the others should match more closely to: en(-US), en-GB, es(-ES), es-419, ru(-RU)... </p>
5530
5531		<h2>
5532			<a name="XML_Format" href="#XML_Format">5 XML Format</a>
5533		</h2>
5534		<p>There are two kinds of data that can be expressed in LDML:
5535			language-dependent data and supplementary data. In either case, data
5536			can be split across multiple files, which can be in multiple
5537			directory trees.</p>
5538		<p>For example, the language-dependent data for Japanese in CLDR
5539			is present in the following files:</p>
5540		<ul>
5541			<li>common/collation/ja.xml</li>
5542			<li>common/main/ja.xml</li>
5543			<li>common/rbnf/ja.xml</li>
5544			<li>common/segmentations/ja.xml</li>
5545		</ul>
5546		<p>Data for cased languages such as French are in files like:</p>
5547		<ul>
5548			<li>common/casing/fr.xml</li>
5549		</ul>
5550		<p>The status of the data is the same, whether or not data is
5551			split. That is, for the purpose of validation and lookup, all of the
5552			data for the above ja.xml files is treated as if it was in a single
5553			file. These files have the &lt;ldml&gt; root element and use
5554			ldml.dtd. The file name must match the identity element. For example,
5555			the &lt;ldml&gt; file pa_Arab_PK.xml must contain the following
5556			elements:</p>
5557		<pre>
5558			<strong>&lt;ldml&gt;</strong><br> 	&lt;identity&gt;<br> 		…<br> 		<strong>&lt;language type=&quot;pa&quot;/&gt;<br> 		&lt;script type=&quot;Arab&quot;/&gt;<br> 		&lt;territory type=&quot;PK&quot;/&gt;</strong><br> 	&lt;/identity&gt;
5559…</pre>
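		<p>A check that a file name agrees with its identity element might look roughly like the following. It is a sketch only: the helper names are hypothetical, the sample document is a stripped-down stand-in for a real CLDR file, and only the language, script, territory, and variant elements are considered.</p>

```python
import xml.etree.ElementTree as ET

SAMPLE = """<ldml>
  <identity>
    <language type="pa"/>
    <script type="Arab"/>
    <territory type="PK"/>
  </identity>
</ldml>"""


def identity_locale(xml_text):
    """Build a locale ID such as pa_Arab_PK from the identity element."""
    root = ET.fromstring(xml_text)
    identity = root.find("identity")
    if root.tag != "ldml" or identity is None:
        raise ValueError("not an ldml file with an identity element")
    parts = []
    for name in ("language", "script", "territory", "variant"):
        element = identity.find(name)
        if element is not None:
            parts.append(element.get("type"))
    return "_".join(parts)


def file_name_matches(file_name, xml_text):
    """True if the file name is the identity locale plus '.xml'."""
    return file_name == identity_locale(xml_text) + ".xml"


print(file_name_matches("pa_Arab_PK.xml", SAMPLE))
print(file_name_matches("pa.xml", SAMPLE))
```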
5560		<p>Supplemental data can have different root elements, currently:
5561			ldmlBCP47, supplementalData, keyboard, and platform. Keyboard and
5562			platform files are considered distinct. The ldmlBCP47 files and
5563			supplementalData files that have the same root are all logically part
5564			of the same file; they are simply split into separate files for
5565			convenience. Implementations may split the files in different ways,
5566			also for their convenience. The files in /properties are also
5567			supplemental data files, but are structured like UCD properties.</p>
5568
		<p>For example, supplemental data relating to Japan or the
			Japanese writing system is in:</p>
5571		<ul>
5572			<li>common/supplemental/ (in many files, such as
5573				supplementalData.xml)</li>
5574			<li>common/transforms/Hiragana-Katakana.xml</li>
5575			<li>common/transforms/Hiragana-Latin.xml</li>
5576			<li>common/properties/scriptMetadata.txt</li>
5577			<li>common/bcp47/calendar.xml</li>
5578			<li>uca/allkeys_CLDR.txt (sorting)</li>
5579			<li>/keyboards/chromeos/ja-t-k0-chromeos.xml</li>
5580			<li>...</li>
5581		</ul>
5582		<p>Like the &lt;ldml&gt; files, the keyboard file names must match
5583			internal data: in particular, the locale attribute on the keyboard
5584			element must have a value that corresponds to the file name, such as
5585			&lt;keyboard locale=&quot;af-t-k0-android&quot;&gt; for the file
5586			af-t-k0-android.xml.</p>
5587		<p>
5588			The following sections describe the structure of the XML format for
5589			language-dependent data. The more precise syntax is in the ldml.dtd
5590			file<i>; however, the DTD does not describe all the constraints
5591				on the structure.</i>
5592		</p>
5593		<p>To start with, the root element is &lt;ldml&gt;, with the
5594			following DTD entry:</p>
5595		<p class='dtd'>
5596			&lt;!ELEMENT ldml
5597			(identity,(alias|(fallback*,localeDisplayNames?,layout?,contextTransforms?,characters?,<br>
5598			delimiters?,measurement?,dates?,numbers?,units?,listPatterns?,collations?,posix?,<br>
5599			segmentations?,rbnf?,annotations?,metadata?,references?,special*)))&gt;
5600		</p>
5601
5602		<p>The XML structure is stable over releases. Elements and
5603			attributes may be deprecated: they are retained in the DTD but their
5604			usage is strongly discouraged. In most cases, an alternate structure
5605			is provided for expressing the information. There is only one
5606			exception: newer DTDs cannot be used with version 1.1 files, without
5607			some modification.</p>
5608		<p>In general, all translatable text in this format is in element
5609			contents, while attributes are reserved for types and non-translated
5610			information (such as numbers or dates). The reason that attributes
5611			are not used for translatable text is that spaces are not preserved,
5612			and we cannot predict where spaces may be significant in translated
5613			material.</p>
5614		<p>
5615			There are two kinds of elements in LDML: <i>rule</i> elements and <i>structure</i>
5616			elements. For structure elements, there are restrictions to allow for
5617			effective inheritance and processing:
5618		</p>
5619		<ol>
5620			<li>There is no &quot;mixed&quot; content: if an element has
5621				textual content, then it cannot contain any elements.</li>
5622			<li>The [<a href="#XPath">XPath</a>] leading to the content is
5623				unique; no two different pieces of textual content have the same [<a
5624				href="#XPath">XPath</a>].
5625			</li>
5626		</ol>
5627		<p>
5628			Rule elements do not have this restriction, but also do not inherit,
5629			except as an entire block. The rule elements are listed in
5630			serialElements in the supplemental metadata. See also <i><a
5631				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
5632					Validity</a></i>. For more technical details, see <a
5633				href="http://cldr.unicode.org/development/updating-dtds">Updating-DTDs</a>.
5634		</p>
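		<p>The first restriction on structure elements (no mixed content)
			can be checked mechanically. The following Python sketch is one
			possible check, not a normative algorithm: the function name
			<code>mixed_content_elements</code> is hypothetical, and the sketch
			assumes that whitespace-only text between child elements is
			insignificant.</p>

```python
import xml.etree.ElementTree as ET


def mixed_content_elements(root):
    """Return the tags of elements that contain both child elements and non-blank text."""
    mixed = []
    for element in root.iter():
        children = list(element)
        if not children:
            continue  # a leaf may freely carry textual content
        # Text can appear before the first child (element.text)
        # or after any child (child.tail).
        texts = [element.text] + [child.tail for child in children]
        if any(text and text.strip() for text in texts):
            mixed.append(element.tag)
    return mixed


print(mixed_content_elements(ET.fromstring("<a>text<b>x</b></a>")))
print(mixed_content_elements(ET.fromstring("<a>\n  <b>x</b>\n</a>")))
```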
5635		<p>
5636			Note that the data in examples given below is purely illustrative,
5637			and does not match any particular language. For a more detailed
5638			example of this format, see [<a href="#LDML">Example</a>]. There is
			also a DTD for this format, but <i>remember that the DTD alone is
				not sufficient to understand the semantics, the constraints,
				nor the interrelationships between the different elements and
				attributes</i>. You may wish to have copies of each of these to hand as
			you proceed through the rest of this document.
5644		</p>
5645		<p>In particular, all elements allow for draft versions to coexist
5646			in the file at the same time. Thus most elements are marked in the
5647			DTD as allowing multiple instances. However, unless an element is
5648			listed as a serialElement, or has a distinguishing attribute, it can
5649			only occur once as a subelement of a given element. Thus, for
5650			example, the following is illegal even though allowed by the DTD:</p>
5651		<p>
5652			&lt;languages&gt;<br> &nbsp; &lt;language
5653			type=&quot;aa&quot;&gt;...&lt;/language&gt;<br> &nbsp;
			&lt;language type=&quot;aa&quot;&gt;..&lt;/language&gt;<br>
			&lt;/languages&gt;
5655		</p>
		<p>There must be only one instance of these per parent, unless
			there are other distinguishing attributes (such as an alt attribute).</p>
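		<p>The uniqueness constraint can be sketched as a check on sibling
			elements. The Python code below is illustrative only: it assumes
			that <code>type</code> and <code>alt</code> are the only
			distinguishing attributes, and it ignores the exception for
			serialElements. The function name <code>duplicate_children</code>
			is hypothetical.</p>

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: only these attributes are distinguishing.
DISTINGUISHING = ("type", "alt")


def duplicate_children(parent):
    """Return the keys (tag plus distinguishing attribute values) seen more than once."""
    seen = set()
    duplicates = []
    for child in parent:
        key = (child.tag,) + tuple(child.get(name) for name in DISTINGUISHING)
        if key in seen:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# The illegal example from the text: two <language type="aa"> siblings.
bad = ET.fromstring(
    '<languages>'
    '<language type="aa">...</language>'
    '<language type="aa">..</language>'
    '</languages>'
)
# Legal: the second sibling is distinguished by an alt attribute.
ok = ET.fromstring(
    '<languages>'
    '<language type="aa">...</language>'
    '<language type="aa" alt="variant">..</language>'
    '</languages>'
)
print(duplicate_children(bad))
print(duplicate_children(ok))
```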
5658		<p>In general, LDML data should be in NFC format. However, certain
5659			elements may need to contain characters that are not in NFC,
5660			including exemplars, transforms, segmentations, and
5661			p/s/t/i/pc/sc/tc/ic rules in collation. These elements must not be
5662			normalized (either to NFC or NFD), or their meaning may be changed.
5663			Thus LDML documents must not be normalized as a whole. To prevent
5664			problems with normalization, no element value can start with a
5665			combining slash (U+0338 COMBINING LONG SOLIDUS OVERLAY).</p>
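		<p>A minimal Python sketch of the two checks described above
			follows. It uses the standard <code>unicodedata</code> module; the
			function name <code>check_value</code> is hypothetical. The NFC
			check should only be applied to elements that are not among the
			exceptions listed above (exemplars, transforms, segmentations, and
			collation rules).</p>

```python
import unicodedata

COMBINING_SLASH = "\u0338"  # COMBINING LONG SOLIDUS OVERLAY


def check_value(text: str) -> list:
    """Return a list of problems found in one element value."""
    problems = []
    if text.startswith(COMBINING_SLASH):
        problems.append("starts with U+0338")
    # A string is in NFC exactly when NFC normalization leaves it unchanged.
    if unicodedata.normalize("NFC", text) != text:
        problems.append("not in NFC")
    return problems


print(check_value("\u00e9"))      # precomposed e with acute
print(check_value("e\u0301"))     # e followed by combining acute accent
print(check_value("\u0338abc"))   # leading combining slash
```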
5666		<p>
			Lists, such as <span class="attribute">singleCountries</span>, are
			space-delimited: their items are separated by one or more
			XML whitespace characters. Examples of such lists include:
5670		</p>
5671		<ul>
5672			<li>singleCountries</li>
5673			<li>preferenceOrdering</li>
5674			<li>references</li>
5675		</ul>
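		<p>Splitting such a list can be sketched in Python as follows. The
			function name <code>split_list</code> is hypothetical. The sketch
			splits only on the four XML whitespace characters (space, tab,
			carriage return, and line feed), rather than on every character
			that Unicode treats as whitespace.</p>

```python
import re

# XML whitespace consists of space, tab, carriage return, and line feed.
XML_WHITESPACE = re.compile(r"[ \t\r\n]+")


def split_list(value: str) -> list:
    """Split a space-delimited attribute or element value into its items."""
    return [item for item in XML_WHITESPACE.split(value) if item]


print(split_list("AT  BE\n\tCA"))
print(split_list("   "))
```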
5676		<h3>
5677			<a name="Common_Elements" href="#Common_Elements">5.1 Common
5678				Elements</a>
5679		</h3>
5680		<p>At any level in any element, two special elements are allowed.</p>
5681		<h4>
5682			<a name="special" href="#special">5.1.1 Element special</a>
5683		</h4>
5684		<p>
5685			This element is designed to allow for arbitrary additional annotation
5686			and data that is product-specific. It has one required attribute <span
5687				class="attribute">xmlns</span>, which specifies the XML <a
5688				href="http://www.w3.org/TR/REC-xml-names/">namespace</a> of the
			special data. For example, the following uses the version 1.0 POSIX
			special element.
5691		</p>
5692		<pre>&lt;!DOCTYPE ldml SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.0/ldml.dtd</span>&quot; [
5693    &lt;!ENTITY % posix SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.0/ldmlPOSIX.dtd</span>&quot;&gt;
5694<span style="color: blue">%posix;</span>
5695]&gt;
5696&lt;ldml&gt;
5697...
5698&lt;special xmlns:posix=&quot;<span style="color: blue">http://www.opengroup.org/regproducts/xu.htm</span>&quot;&gt;
5699        <span style="color: green">&lt;!-- old abbreviations for pre-GUI days --&gt;</span>
5700        &lt;posix:messages&gt;
5701            &lt;posix:yesstr&gt;<span style="color: blue">Yes</span>&lt;/posix:yesstr&gt;
5702            &lt;posix:nostr&gt;<span style="color: blue">No</span>&lt;/posix:nostr&gt;
5703            &lt;posix:yesexpr&gt;<span style="color: blue">^[Yy].*</span>&lt;/posix:yesexpr&gt;
5704            &lt;posix:noexpr&gt;<span style="color: blue">^[Nn].*</span>&lt;/posix:noexpr&gt;
5705        &lt;/posix:messages&gt;
5706    &lt;/special&gt;
5707&lt;/ldml&gt;
5708</pre>
5709		<h5>
5710			<a name="Sample_Special_Elements" href="#Sample_Special_Elements">5.1.1.1
5711				Sample Special Elements</a>
5712		</h5>
5713		<p>
5714			The elements in this section are <i><b>not</b></i> part of the Locale
5715			Data Markup Language 1.0 specification. Instead, they are special
5716			elements used for application-specific data to be stored in the
			Common Locale Repository. They may change or be removed in future
			versions of this document, and are present here more as examples of
5719			how to extend the format. (Some of these items may move into a future
5720			version of the Locale Data Markup Language specification.)
5721		</p>
5722		<ul>
5723			<li><a href="http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</a></li>
5724			<li><a href="http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd">http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</a></li>
5725		</ul>
5726		<p>The above examples are old versions: consult the documentation
5727			for the specific application to see which should be used.</p>
5728		<p>These DTDs use namespaces and the special element. To include
5729			one or more, use the following pattern to import the special DTDs
5730			that are used in the file:</p>
5731		<pre>&lt;?xml version=&quot;<span style="color: blue">1.0</span>&quot; encoding=&quot;<span
5732				style="color: blue">UTF-8</span>&quot; ?&gt;
5733&lt;!DOCTYPE ldml SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldml.dtd</span>&quot; [
5734    &lt;!ENTITY % <span style="color: blue">icu</span> SYSTEM &quot;<span
5735				style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>&quot;&gt;
5736    &lt;!ENTITY % <span style="color: blue">openOffice</span> SYSTEM &quot;<span
5737				style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlOpenOffice.dtd</span>&quot;&gt;
5738<span style="color: blue">%icu;
5739%openOffice;
5740</span>]&gt;</pre>
5741		<p>Thus to include just the ICU DTD, one uses:</p>
5742		<pre>&lt;?xml version=&quot;<span style="color: blue">1.0</span>&quot; encoding=&quot;<span
5743				style="color: blue">UTF-8</span>&quot; ?&gt;
5744&lt;!DOCTYPE ldml SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldml.dtd</span>&quot; [
5745    &lt;!ENTITY % icu SYSTEM &quot;<span style="color: blue">http://unicode.org/cldr/dtd/1.1/ldmlICU.dtd</span>&quot;&gt;
5746<span style="color: blue">%icu;
5747</span>]&gt;</pre>
5748		<blockquote>
5749			<p>
5750				<b>Note: </b>A previous version of this document contained a special
5751				element for <a
5752					href="http://www.open-std.org/jtc1/sc22/wg20/docs/n897-14652w25.pdf">ISO
5753					TR 14652</a> compatibility data. That element has been withdrawn,
				pending further investigation, since 14652 is a Type 1
5755				TR: &quot;when the required support cannot be obtained for the
5756				publication of an International Standard, despite repeated
5757				effort&quot;. See the ballot comments on <a
5758					href="http://www.open-std.org/jtc1/sc22/wg20/docs/n948-J1N6769-14652.pdf">14652
5759					Comments</a> for details on the 14652 defects. For example, most of
5760				these patterns make little provision for substantial changes in
5761				format when elements are empty, so are not particularly useful in
5762				practice. Compare, for example, the mail-merge capabilities of
5763				production software such as Microsoft Word or OpenOffice.
5764			</p>
5765			<p>
5766				<b>Note: </b>While the CLDR specification guarantees backwards
5767				compatibility, the definition of specials is up to other
5768				organizations. Any assurance of backwards compatibility is up to
5769				those organizations.
5770			</p>
5771		</blockquote>
5772		<p>
5773			A number of the elements above can have extra information for <a
5774				name="OpenOffice" href="#OpenOffice">openoffice.org</a>, such as the
5775			following example:
5776		</p>
5777		<pre>    &lt;special xmlns:openOffice=&quot;<span
5778				style="color: blue">http://www.openoffice.org</span>&quot;&gt;
5779        &lt;openOffice:search&gt;
5780            &lt;openOffice:searchOptions&gt;
5781                &lt;openOffice:transliterationModules&gt;<span
5782				style="color: blue">IGNORE_CASE</span>&lt;/openOffice:transliterationModules&gt;
5783            &lt;/openOffice:searchOptions&gt;
5784        &lt;/openOffice:search&gt;
5785    &lt;/special&gt;
5786</pre>
5787		<h4>
5788			<a name="Alias_Elements" href="#Alias_Elements">5.1.2 Element
5789				alias</a>
5790		</h4>
5791		<p class="dtd">
5792			&lt;!ELEMENT alias (special*) &gt;<br> &lt;!ATTLIST alias source
5793			NMTOKEN #REQUIRED &gt;<br> &lt;!ATTLIST alias path CDATA
5794			#IMPLIED&gt;
5795		</p>
5796		<p>The contents of any element in root can be replaced by an
5797			alias, which points to the path where the data can be found.</p>
5798		<p>Aliases will only ever appear in root with the form
5799			//ldml/.../alias[@source=&quot;locale&quot;][@path=&quot;...&quot;].</p>
5800		<p>Consider the following example in root:</p>
		<pre>
&lt;calendar type=&quot;gregorian&quot;&gt;
  &lt;months&gt;
    &lt;default choice=&quot;format&quot;/&gt;
    &lt;monthContext type=&quot;format&quot;&gt;
      &lt;default choice=&quot;wide&quot;/&gt;
      &lt;monthWidth type=&quot;abbreviated&quot;&gt;
        <strong>&lt;alias source=&quot;locale&quot; path=&quot;../monthWidth[@type='wide']&quot;/&gt;</strong>
      &lt;/monthWidth&gt;</pre>
5803		<p>
5804			If the locale &quot;de_DE&quot; is being accessed for a month name
5805			for format/abbreviated, then a resource bundle at &quot;de_DE&quot;
			will be searched for a resource element at that path. If not
5807			found there, then the resource bundle at &quot;de&quot; will be
5808			searched, and so on. When the alias is found in root, then the search
5809			is restarted, but searching for format/<strong>wide</strong> element
5810			instead of format/abbreviated.
5811		</p>
5812		<p>
5813			If the <b>path</b> attribute is present, then its value is an [<a
5814				href="#XPath">XPath</a>] that points to a different node in the
5815			tree. For example:
5816		</p>
5817		<pre>&lt;alias source=&quot;locale&quot; path=&quot;../monthWidth[@type=&#39;wide&#39;]&quot;/&gt;</pre>
5818		<p>
5819			The default value if the path is not present is the same position in
5820			the tree. All of the attributes in the [<a href="#XPath">XPath</a>]
			must be <i>distinguishing</i> attributes. For more details, see <a
5822				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
5823				Validity</a>.
5824		</p>
5825		<p>
5826			There is a special value for the source attribute, the constant <b>source=&quot;locale&quot;</b>.
5827			This special value is equivalent to the locale being resolved. For
5828			example, consider the following example, where locale data for
5829			&#39;de&#39; is being resolved:
5830		</p>
5831		<div align="center">
5832			<center>
5833				<table border="1" cellpadding="0" cellspacing="1">
5834					<caption>
5835						<a name="Inheritance_with_source_locale_"
5836							href="#Inheritance_with_source_locale_">Inheritance with
5837							source=&quot;locale&quot;</a>
5838					</caption>
5839					<tr>
5840						<th>Root</th>
5841						<th>de</th>
5842						<th bgcolor="#C0C0C0">Resolved</th>
5843					</tr>
5844					<tr>
5845						<td><code>
5846								&lt;x&gt;<br> &nbsp; &lt;a&gt;1&lt;/a&gt;<br> &nbsp;
5847								&lt;b&gt;2&lt;/b&gt;<br> &nbsp; &lt;c&gt;3&lt;/c&gt;<br>
5848								<br> &lt;/x&gt;
5849							</code></td>
5850						<td><code>
5851								&lt;x&gt;<br> &nbsp;&lt;a&gt;11&lt;/a&gt;<br>
5852								&nbsp;&lt;b&gt;12&lt;/b&gt;<br> <br>
5853								&nbsp;&lt;d&gt;14&lt;/d&gt;<br> &lt;/x&gt;
5854							</code></td>
5855						<td bgcolor="#C0C0C0"><code>
5856								&lt;x&gt;<br> &nbsp;&lt;a&gt;11&lt;/a&gt;<br>
5857								&nbsp;&lt;b&gt;12&lt;/b&gt;<br> &nbsp;<span
5858									style="background-color: #FFFF00"><span
5859									class="inherited"><span style="font-weight: 400;">&lt;c&gt;3&lt;/c&gt;</span></span></span><br>
5860								&nbsp;&lt;d&gt;14&lt;/d&gt;<br> &lt;/x&gt;
5861							</code></td>
5862					</tr>
5863					<tr>
5864						<td><code>
								&lt;y&gt;<br> &nbsp;&lt;alias source=&quot;locale&quot;
								path=&quot;../x&quot;/&gt;<br> &lt;/y&gt;
5867							</code></td>
5868						<td><code>
5869								&lt;y&gt;<br> <br> &nbsp;&lt;b&gt;22&lt;/b&gt;<br>
5870								<br> <br> &nbsp;&lt;e&gt;25&lt;/e&gt;<br>
5871								&lt;/y&gt;
5872							</code></td>
5873						<td bgcolor="#C0C0C0"><code>
5874								&lt;y&gt;<br> &nbsp;<span style="background-color: #FFFF00"><span
5875									class="inherited"><span style="font-weight: 400;">&lt;a&gt;11&lt;/a&gt;</span></span></span><br>
5876								&nbsp;&lt;b&gt;22&lt;/b&gt;<br> &nbsp;<span
5877									style="background-color: #FFFF00"><span
5878									class="inherited"><span style="font-weight: 400;">&lt;c&gt;3&lt;/c&gt;</span></span></span><br>
5879								&nbsp;<span style="background-color: #FFFF00"><span
5880									class="inherited"><span style="font-weight: 400;">&lt;d&gt;14&lt;/d&gt;</span></span></span><br>
5881								&nbsp;&lt;e&gt;25&lt;/e&gt;<br> &lt;/y&gt;
5882							</code></td>
5883					</tr>
5884				</table>
5885			</center>
5886		</div>
		<p>The first row shows the inheritance within the &lt;x&gt;
			element, whereby &lt;c&gt; is inherited from root. The second row shows
			the inheritance within the &lt;y&gt; element, whereby &lt;a&gt;,
			&lt;c&gt;, and &lt;d&gt; are also inherited by way of root, but through
			the alias there. The alias in root is logically replaced not by the
			elements in root itself, but by elements in the &#39;target&#39;
			locale.</p>
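		<p>The resolution shown in the table above can be reproduced with a
			small Python sketch. The data model is a simplification made for
			illustration: each locale is a flat dictionary keyed by hypothetical
			short paths such as <code>x/a</code>, and the root alias is a
			mapping from &lt;y&gt; to &lt;x&gt;. None of these names come from
			CLDR.</p>

```python
# Parent chain used for inheritance: de inherits from root.
PARENT = {"de": "root", "root": None}

# Leaf values from the table, keyed by "element/child".
DATA = {
    "root": {"x/a": "1", "x/b": "2", "x/c": "3"},
    "de": {"x/a": "11", "x/b": "12", "x/d": "14", "y/b": "22", "y/e": "25"},
}

# Aliases occur only in root; with source="locale" the target is looked up
# in the locale being resolved, not in root.
ROOT_ALIASES = {"y": "x"}


def lookup(locale, path):
    """Resolve one leaf path for a locale via inheritance and root aliases."""
    current = locale
    while current is not None:
        if path in DATA[current]:
            return DATA[current][path]
        current = PARENT[current]
    # Reached root without a value: follow the alias, restarting the
    # search at the originally requested locale.
    element, _, leaf = path.partition("/")
    if element in ROOT_ALIASES:
        return lookup(locale, ROOT_ALIASES[element] + "/" + leaf)
    return None


resolved_y = {leaf: lookup("de", "y/" + leaf) for leaf in "abcde"}
print(resolved_y)
```

		<p>The values for <code>a</code>, <code>c</code>, and <code>d</code>
			come from the &lt;x&gt; data visible to &#39;de&#39;, which is why
			<code>a</code> resolves to 11 rather than to the root value 1.</p>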
5894		<p>
5895			For more details on data resolution, see <a
5896				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
5897				Validity</a>.
5898		</p>
5899		<p>
5900			Aliases must be resolved recursively. An alias may point to another
5901			path that results in another alias being found, and so on. For
5902			example, looking up Thai buddhist abbreviated months for the locale <strong>xx-YY</strong>
5903			may result in the following chain of aliases being followed:
5904		</p>
5905		<blockquote>
5906			<p>../../calendar[@type=&quot;buddhist&quot;]/months/monthContext[@type=&quot;format&quot;]/monthWidth[@type=&quot;abbreviated&quot;]
5907			</p>
5908			<p>xx-YY → xx → root // finds alias that changes path to:</p>
5909			<p>../../calendar[@type=&quot;gregorian&quot;]/months/monthContext[@type=&quot;format&quot;]/monthWidth[@type=&quot;abbreviated&quot;]
5910			</p>
5911			<p>xx-YY → xx → root // finds alias that changes path to:</p>
5912			<p>../../calendar[@type=&quot;gregorian&quot;]/months/monthContext[@type=&quot;format&quot;]/monthWidth[@type=&quot;wide&quot;]
5913			</p>
5914			<p>xx-YY → xx // finds value here</p>
5915		</blockquote>
5916		<p>It is an error to have a circular chain of aliases. That is, a
5917			collection of LDML XML documents must not have situations where a
5918			sequence of alias lookups (including inheritance and lateral
5919			inheritance) can be followed indefinitely without terminating.</p>
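		<p>Following a chain of aliases while detecting circular chains can
			be sketched in Python as below. The function name
			<code>follow_aliases</code> and the short keys such as
			<code>buddhist/abbreviated</code> are hypothetical stand-ins for the
			full paths shown above, and locale inheritance is deliberately left
			out of the sketch.</p>

```python
def follow_aliases(path, aliases):
    """Follow alias links from path until a non-aliased path is reached.

    Raises ValueError if the chain of aliases is circular.
    """
    seen = []
    while path in aliases:
        if path in seen:
            raise ValueError("circular alias chain: " + " -> ".join(seen + [path]))
        seen.append(path)
        path = aliases[path]
    return path


# Stand-in for the buddhist -> gregorian abbreviated -> gregorian wide chain.
ALIASES = {
    "buddhist/abbreviated": "gregorian/abbreviated",
    "gregorian/abbreviated": "gregorian/wide",
}
print(follow_aliases("buddhist/abbreviated", ALIASES))
```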
5920		<h4>
5921			<a name="Element_displayName" href="#Element_displayName">5.1.3
5922				Element displayName</a>
5923		</h4>
5924		<p>Many elements can have a display name. This is a translated
5925			name that can be presented to users when discussing the particular
5926			service. For example, a number format, used to format numbers using
			the conventions of that locale, can have a translated name for
5928			presentation in GUIs.</p>
5929		<pre>  &lt;numberFormat&gt;
5930    &lt;displayName&gt;<span style="color: blue">Prozentformat</span>&lt;/displayName&gt;
5931...
  &lt;/numberFormat&gt;</pre>
5933		<p>
			Where present, the display names must be unique; that is, two
			distinct codes must not have the same display name. (There is
5936			one exception to this: in time zones, where parsing results would
5937			give the same GMT offset, the standard and daylight display names can
5938			be the same across different time zone IDs.) Any translations should
5939			follow customary practice for the locale in question. For more
5940			information, see [<a href="#DataFormats">Data Formats</a>].
5941		</p>
5942		<h4>
5943			<a name="Escaping_Characters" href="#Escaping_Characters">5.1.4
5944				Escaping Characters</a>
5945		</h4>
5946		<p>Unfortunately, XML does not have the capability to contain all
5947			Unicode code points. Due to this, in certain instances extra syntax
5948			is required to represent those code points that cannot be otherwise
5949			represented in element content. The escaping syntax is only defined
5950			on a few types of elements, such as in collation or exemplar sets,
5951			and uses the appropriate syntax for that type.</p>
5952		<p>The element &lt;cp&gt;, which was formerly used for this
5953			purpose, has been deprecated.</p>
5954
5955		<h3>
5956			<a name="Common_Attributes" href="#Common_Attributes">5.2 Common
5957				Attributes</a>
5958		</h3>
5959		<h4>
5960			<a name="Attribute_type" href="#Attribute_type">5.2.1 Attribute
5961				type</a>
5962		</h4>
5963		<p>
5964			The attribute <i>type</i> is also used to indicate an alternate
5965			resource that can be selected with a matching type=option in the
5966			locale id modifiers, or be referenced by a default element. For
5967			example:
5968		</p>
5969		<pre>&lt;ldml&gt;
5970  ...
5971  &lt;currencies&gt;
5972    &lt;currency&gt;<span style="color: blue">...</span>&lt;/currency&gt;
5973    &lt;currency type=&quot;<span style="color: blue">preEuro</span>&quot;&gt;<span
5974				style="color: blue">...</span>&lt;/currency&gt;
5975  &lt;/currencies&gt;
5976&lt;/ldml&gt;</pre>
5977		<h4>
5978			<a name="Attribute_draft" href="#Attribute_draft">5.2.2 Attribute
5979				draft</a>
5980		</h4>
5981		<p>
5982			If this attribute is present, it indicates the status of all the data
5983			in this element and any subelements (unless they have a contrary <i>draft</i>
5984			value), as per the following:
5985		</p>
5986		<ul>
5987			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>approved:</i>
5988				fully approved by the technical committee (equals the CLDR 1.3 value
5989				of <i>false</i>, or an absent <i>draft</i> attribute). This does not
5990				mean that the data is guaranteed to be error-free—this is the best
5991				judgment of the committee.</li>
5992			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>contributed</i>:
5993				partially approved by the technical committee.</li>
5994			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>provisional</i>:
5995				partially confirmed. Implementations may choose to accept the
5996				provisional data, especially if there is no translated alternative.</li>
5997			<li style="margin-top: 0.5em; margin-bottom: 0.5em"><i>unconfirmed</i>:
5998				no confirmation available.</li>
5999		</ul>
6000		<p>
6001			For more information on precisely how these values are computed for
6002			any given release, see <a
6003				href="http://cldr.unicode.org/index/process#TOC-Data-Submission-and-Vetting-Process">Data
6004				Submission and Vetting Process</a> on the CLDR website.
6005		</p>
6006		<p>
6007			The draft attribute should only occur on &quot;leaf&quot; elements, and is deprecated elsewhere. For a more
6008			formal description of how elements are inherited, and what their
6009			draft status is, see <i><a href="#Inheritance_and_Validity">Section
6010					4.2 Inheritance and Validity</a></i>.
6011		</p>
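		<p>One way an implementation might filter data by draft status is
			sketched below in Python. The ordering of the four values from
			least to most confirmed is inferred from the descriptions above,
			and the names <code>draft_status</code> and <code>acceptable</code>
			are hypothetical. Values other than the four listed, apart from the
			CLDR 1.3 value <i>false</i>, are not handled.</p>

```python
# Ordered from least to most confirmed (an inference from the descriptions).
DRAFT_ORDER = ["unconfirmed", "provisional", "contributed", "approved"]


def draft_status(attributes: dict) -> str:
    """Return the draft status implied by an element's attributes."""
    value = attributes.get("draft", "approved")  # absent means approved
    if value == "false":  # CLDR 1.3 spelling of approved
        return "approved"
    return value


def acceptable(attributes: dict, minimum: str = "contributed") -> bool:
    """Return True if the element's status is at least the minimum level."""
    return DRAFT_ORDER.index(draft_status(attributes)) >= DRAFT_ORDER.index(minimum)


print(acceptable({}))
print(acceptable({"draft": "provisional"}))
print(acceptable({"draft": "provisional"}, minimum="provisional"))
```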
6012		<h4>
6013			<a name="alt_attribute" href="#alt_attribute">5.2.3 Attribute alt</a>
6014		</h4>
6015		<p>
			This attribute labels an alternative value for an element. The value
			is a <i>descriptor</i> that indicates what kind of alternative it is, and
			takes one of the following forms:
6019		</p>
6020		<ul>
6021			<li><i>variantname</i> meaning that the value is a variant of
6022				the normal value, and may be used in its place in certain
6023				circumstances. If a variant value is absent for a particular locale,
6024				the normal value is used. The variant mechanism should only be used
6025				when such a fallback is acceptable.</li>
6026			<li><span style="color: blue">proposed</span>, optionally
6027				followed by a number, indicating that the value is a proposed
6028				replacement for an existing value.</li>
6029			<li><i>variantname</i><span style="color: blue">-proposed</span>,
6030				optionally followed by a number, indicating that the value is a
6031				proposed replacement variant value.</li>
6032		</ul>
6033		<p>
6034			&quot;<span style="color: blue">proposed</span>&quot; should only be
6035			present if the draft status is not &quot;approved&quot;. It indicates
6036			that the data is proposed replacement data that has been added
6037			provisionally until the differences between it and the other data can
6038			be vetted. For example, suppose that the translation for September
6039			for some language is &quot;Settembru&quot;, and a bug report is filed
			stating that it should be &quot;Settembro&quot;. The new data can be
6041			entered in, but marked as <i>alt=&quot;proposed&quot;</i> until it is
6042			vetted.
6043		</p>
6044		<pre>...
6045&lt;month type=&quot;9&quot;&gt;Settembru&lt;/month&gt;
6046&lt;month type=&quot;9&quot; draft=&quot;unconfirmed&quot; alt=&quot;proposed&quot;&gt;Settembro&lt;/month&gt;
6047&lt;month type=&quot;10&quot;&gt;...</pre>
6048		<p>Now assume another bug report comes in, saying that the correct
6049			form is actually &quot;Settembre&quot;. Another alternative can be
6050			added:</p>
6051		<pre>...
6052&lt;month type=&quot;9&quot; draft=&quot;unconfirmed&quot; alt=&quot;proposed2&quot;&gt;Settembre&lt;/month&gt;
6053...</pre>
6054		<p>
6055			The values for <i>variantname</i> at this time include &quot;<span
6056				style="color: blue">variant</span>&quot;, &quot;<span
6057				style="color: blue">list</span>&quot;, &quot;<span
6058				style="color: blue">email</span>&quot;, &quot;<span
6059				style="color: blue">www</span>&quot;, &quot;<span
6060				class="attributeValue">short</span>&quot;, and &quot;<span
6061				style="color: blue">secondary</span>&quot;.
6062		</p>
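		<p>The three forms of the alt value can be separated with a short
			Python sketch. The function name <code>parse_alt</code> and the
			shape of the returned dictionary are assumptions made for the
			example. Any value that does not end in &quot;proposed&quot; plus
			optional digits is treated as a plain variant name.</p>

```python
import re

# Matches "proposed", "proposed2", "short-proposed", "short-proposed3", ...
PROPOSED = re.compile(r"(?:(?P<variant>.+)-)?proposed(?P<number>[0-9]*)")


def parse_alt(value: str) -> dict:
    """Split an alt value into its variant name and proposed marker."""
    match = PROPOSED.fullmatch(value)
    if match:
        number = match.group("number")
        return {
            "variant": match.group("variant"),
            "proposed": True,
            "number": int(number) if number else None,
        }
    return {"variant": value, "proposed": False, "number": None}


print(parse_alt("proposed2"))
print(parse_alt("short-proposed"))
print(parse_alt("variant"))
```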
6063		<p>
6064			For a more complete description of how draft applies to data, see <i><a
6065				href="#Inheritance_and_Validity">Section 4.2 Inheritance and
6066					Validity</a></i>.
6067		</p>
6068		<p class="element2">
6069			Attribute <a name="references_attribute" href="#references_attribute">references</a>
6070		</p>
		<p>The value of this attribute is a token representing a reference
			for the information in the element, including standards that it may
			conform to. The token identifies an entry in the &lt;references&gt;
			element. (In older versions of CLDR, the value
			of the attribute was freeform text. That format is deprecated.)</p>
6075		<p>
6076			<i>Example:</i>
6077		</p>
6078		<p class="example">&lt;territory type=&quot;UM&quot;
6079			references=&quot;R222&quot;&gt;USAs yttre öar&lt;/territory&gt;</p>
6080		<p>The reference element may be inherited. Thus, for example, R222
6081			may be used in sv_SE.xml even though it is not defined there, if it
6082			is defined in sv.xml.</p>
6083		<p>&lt;... allow=&quot;verbatim&quot; ...&gt; (deprecated)</p>
6084		<p>This attribute was originally intended for use in marking
6085			display names whose capitalization differed from what was indicated
6086			by the now-deprecated &lt;inText&gt; element (perhaps, for example,
6087			because the names included a proper noun). It was never supported in
			the DTD and is not needed for use with the new
6089			&lt;contextTransforms&gt; element.</p>
6090		<h3>
6091			<a name="Common_Structures" href="#Common_Structures">5.3 Common
6092				Structures</a>
6093		</h3>
6094		<h4>
6095			<a name="Date_Ranges" href="#Date_Ranges">5.3.1 Date and Date
6096				Ranges</a>
6097		</h4>
6098		<p>
			When attributes specify date ranges, this is usually done with the
6100			attributes <i>from</i> and <i>to</i>. The <i>from</i> attribute
6101			specifies the starting point, and the <i>to</i> attribute specifies
6102			the end point. The deprecated <i>time</i> attribute was formerly used
6103			to specify time with the deprecated weekEndStart and weekEndEnd
6104			elements, which were themselves inherently <i>from</i> or <i>to</i>.
6105		</p>
6106		<p>
6107			The data format is a restricted ISO 8601 format, restricted to the
6108			fields <i>year, month, day, hour, minute, </i>and<i> second</i> in
6109			that order, with &quot;-&quot; used as a separator between date
6110			fields, a space used as the separator between the date and the time
6111			fields, and &quot;:&quot; used as a separator between the time
6112			fields. If the minute or minute and second are absent, they are
6113			interpreted as zero. If the hour is also missing, then it is
6114			interpreted based on whether the attribute is <i>from</i> or <i>to</i>.
6115		</p>
6116		<ul>
6117			<li>
6118				<p class="note">
6119					<i>from</i> defaults to &quot;00:00:00&quot; (midnight at the start
6120					of the day).
6121				</p>
6122			</li>
6123			<li>
6124				<p class="note">
6125					<i>to </i>defaults to &quot;24:00:00&quot; (midnight at the end of
6126					the day).
6127				</p>
6128			</li>
6129		</ul>
6130		<p class="note">
6131			That is, Friday at 24:00:00 is the same time as Saturday at 00:00:00.
6132			Thus when the hour is missing, the <i>from and to</i> are interpreted
6133			inclusively: the range includes all of the day mentioned.
6134		</p>
6135		<p class="note">For example, the following are equivalent:</p>
6136		<table style="margin-top: 0.5em; margin-bottom: 0.5em" id="table25">
6137			<tr>
6138				<td>&lt;usesMetazone from=&quot;1991-10-27&quot;
6139					to=&quot;2006-04-02&quot; .../&gt;</td>
6140			</tr>
6141			<tr>
6142				<td>&lt;usesMetazone from=&quot;1991-10-27 00:00:00&quot;
6143					to=&quot;2006-04-02 24:00:00&quot; .../&gt;</td>
6144			</tr>
6145			<tr>
6146				<td>&lt;usesMetazone from=&quot;1991-10-<font color="#FF0000"><b>26
6147							24</b></font>:00:00&quot; to=&quot;2006-04-<font color="#FF0000"><b>03
6148							00</b></font>:00:00&quot; .../&gt;
6149				</td>
6150			</tr>
6151		</table>
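		<p>The defaulting rules for <i>from</i> and <i>to</i> can be checked
			with a Python sketch. The function name <code>parse_boundary</code>
			is hypothetical, and the sketch assumes that year, month, and day
			are always present. Because Python&#39;s <code>datetime</code>
			cannot represent 24:00:00 directly, the time is added to the date as
			an offset, so 24:00:00 becomes 00:00:00 on the following day.</p>

```python
from datetime import datetime, timedelta


def parse_boundary(value: str, is_to: bool = False) -> datetime:
    """Parse 'YYYY-MM-DD[ HH[:MM[:SS]]]' as a from or to boundary."""
    date_part, _, time_part = value.partition(" ")
    year, month, day = (int(field) for field in date_part.split("-"))
    if time_part:
        fields = [int(field) for field in time_part.split(":")]
        fields += [0] * (3 - len(fields))  # absent minute or second is zero
    else:
        # Missing hour: from starts the day, to ends it.
        fields = [24, 0, 0] if is_to else [0, 0, 0]
    hour, minute, second = fields
    return datetime(year, month, day) + timedelta(
        hours=hour, minutes=minute, seconds=second
    )


# The three equivalent spellings of the from value in the table above.
print(parse_boundary("1991-10-27"))
print(parse_boundary("1991-10-27 00:00:00"))
print(parse_boundary("1991-10-26 24:00:00"))
```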
6152
6153		<p>
			If the <i>from</i> attribute is missing, the range is assumed to start as far
			backwards in time as there is data for; if the <i>to</i> attribute is
			missing, then the range extends from this point onwards, with no known end point.
6157		</p>
		<p>The dates and times are specified in local time, unless
			otherwise noted. (In particular, the metazone values are in UTC, also
			known as GMT.)</p>
6161		<h4>
6162			<a name="Text_Directionality" href="#Text_Directionality">5.3.2
6163				Text Directionality</a>
6164		</h4>
6165		<p>The content of certain elements, such as date or number
6166			formats, may consist of several sub-elements with an inherent order
6167			(for example, the year, month, and day for dates). In some cases, the
6168			order of these sub-elements may be changed depending on the
6169			bidirectional context in which the element is embedded.</p>
6170		<p>For example, short date formats in languages such as Arabic may
6171			contain neutral or weak characters at the beginning or end of the
6172			element content. In such a case, the overall order of the
6173			sub-elements may change depending on the surrounding text.</p>
6174		<p>Element content whose display may be affected in this way
6175			should include an explicit direction mark, such as U+200E
6176			LEFT-TO-RIGHT MARK or U+200F RIGHT-TO-LEFT MARK, at the beginning or
6177			end of the element content, or both.</p>
6178		<h4>
6179			<a name="Unicode_Sets" href="#Unicode_Sets">5.3.3 Unicode Sets</a>
6180		</h4>
6181		<p>
6182			Some attribute values or element contents use <em>UnicodeSet</em>
6183			notation. A UnicodeSet represents a finite set of Unicode code points
6184			and strings, and is defined by lists of code points and strings,
6185			Unicode property sets, and set operators, all bounded by square
6186			brackets. In this context, a code point means a string consisting of
6187			exactly one code point.
6188		</p>
6189		<p>
			A UnicodeSet implements the semantics in <i>UTS
			#18: Unicode Regular Expressions</i> [<a
				href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>] Levels 1 &amp; 2 that are relevant to determining sets of characters. Note, however, that it may deviate from the syntax provided in [<a
				href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>], which is illustrative rather than a requirement. There is one exception to the supported semantics: Section <a href="http://unicode.org/reports/tr18/#RL2.6">RL2.6</a> <em>Wildcards in Property Values</em>. That feature can be supported in clients such as ICU by implementing a “hook”, as is done in the <a href="https://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{name=/APPLE/}">online UnicodeSet utilities</a>.</p>
6194		<p>A UnicodeSet may be cited in specifications
6195				outside of the domain of LDML. In such a case, the specification may
6196				specify a subset of the syntax provided here.</p>
6197	  <p>The following provides EBNF syntax for a UnicodeSet:</p>
6198	  <div align='center'>
6199		<table class='simple'>
6200<tr>
6201  <th>Symbol</th>
6202  <th>Expression</th>
6203  <th>Examples</th>
6204</tr>
6205<tr><th>root</th>
6206	<td><code>= prop <br>| '[-]' <br>| '[' [\-\^]? s seq+ ']'</code></td>
6207	<td>\p{x=y},<br>
6208	  [abc]</td>
6209</tr>
6210<tr><th>seq</th>
6211	<td><code>= root (s [\&amp;\-] s root)* s <br>| range s</code></td>
6212	<td>[abc]-[cde], a	  <br></td>
6213</tr>
6214<tr><th>range</th>
6215	<td><code>= char ('-' char)? <br>| '{' (s char)+ s '}'</code></td>
6216	<td>a, a-c, {abc}</td>
6217</tr>
6218<tr><th>prop</th>
6219	<td><code>= '\\' [pP] '{' propName ([≠=] s value1+)? '}' <br>| '[:' '^'? propName ([≠=] s value2+)? ':]'</code></td>
6220	<td>\p{x=y}, [:x=y:]<br></td>
6221</tr>
6222<tr><th>propName</th>
6223	<td><code>= s [A-Za-z0-9] [A-Za-z0-9_\x20]* s</code></td>
6224	<td>General_Category,<br>
6225	  General Category</td>
6226</tr>
6227<tr><th>value1</th>
6228	<td><code>= [^\}] <br>
6229	  | '\\' quoted </code></td>
6230	<td>Lm,<br>
6231	  \n,<br>
6232	  \}</td>
6233</tr>
6234<tr><th>value2</th>
6235	<td><code>= [^:] <br>
6236	  | '\\' quoted</code></td>
6237	<td>Lm,<br>
6238      \n,<br>
6239      \:</td>
6240</tr>
6241<tr><th>char</th>
6242	<td><code>= [^\&amp; \- \[ \[ \] \\ \} \{ [:Pat_WS:]] <br>
6243	  | '\\' quoted</code></td>
6244	<td>a, b, c, \n</td>
6245</tr>
6246<tr><th>quoted</th>
6247<td><code>= 'u' (hex{4} | bracketedHex) <br>
	| 'x' (hex{2} | bracketedHex) <br>  | 'U00' ('0' hex{5} | '10' hex{4}) <br>| 'N{' propName '}' <br>| [\u0000-\U0010FFFF]</code></td>
6249<td>&nbsp;</td>
6250</tr>
6251<tr><th>bracketedHex</th>
6252	<td><code>= '{' s hexCodePoint (s hexCodePoint)* s '}'</code></td>
6253	<td>{61 2019 62}</td>
6254</tr>
6255<tr><th>hexCodePoint</th>
6256	<td><code>= hex{1,5} | '10' hex{4}</code></td>
6257	<td>&nbsp;</td>
6258</tr>
6259<tr><th>hex</th>
6260	<td><code>= [0-9A-Fa-f]</code></td>
6261	<td>&nbsp;</td>
6262</tr>
6263<tr><th>s</th>
6264	<td><code>= [:Pattern_White_Space:]*</code></td>
6265	<td>optional whitespace</td>
6266</tr>
6267	</table>
6268</div>
6269		<p>Some constraints on UnicodeSet syntax are not captured by this EBNF. Notably, property names and values are restricted to those supported by the implementation.</p>
6270		<p>The syntax characters are listed in the table below:</p>
6271		<table>
6272		  <tbody>
6273		    <tr>
6274		      <th>Char</th>
6275		      <th>Hex</th>
6276		      <th>Name</th>
6277		      <th>Usage</th>
6278	        </tr>
6279		    <tr>
6280		      <td>$</td>
6281		      <td>U+0024</td>
6282		      <td>DOLLAR SIGN</td>
6283		      <td>Equivalent of \uFFFF (This is for implementations that return \uFFFF when accessing before the first or after the last character)</td>
6284	        </tr>
6285		    <tr>
6286		      <td>&amp;</td>
6287		      <td>U+0026</td>
6288		      <td>AMPERSAND</td>
6289		      <td>Intersecting UnicodeSets</td>
6290	        </tr>
6291			    <tr>
6292		      <td>-</td>
6293		      <td>U+002D</td>
6294		      <td>HYPHEN-MINUS</td>
6295			      <td>Ranges of characters; also set difference.</td>
6296        </tr>
6297	    <tr>
6298		      <td>:</td>
6299		      <td>U+003A</td>
6300		      <td>COLON</td>
6301		      <td>POSIX-style property syntax</td>
6302	        </tr>
6303		    <tr>
6304		      <td>[</td>
6305		      <td>U+005B</td>
6306		      <td>LEFT SQUARE BRACKET</td>
6307		      <td>Grouping; POSIX property syntax</td>
6308	        </tr>
6309		    <tr>
6310		      <td>]</td>
6311		      <td>U+005D</td>
6312		      <td>RIGHT SQUARE BRACKET</td>
6313		      <td>Grouping; POSIX property syntax</td>
6314	        </tr>
6315		    <tr>
6316		      <td>\</td>
6317		      <td>U+005C</td>
6318		      <td>REVERSE SOLIDUS</td>
6319		      <td>Escaping</td>
6320	        </tr>
6321		    <tr>
6322		      <td>^</td>
6323		      <td>U+005E</td>
6324		      <td>CIRCUMFLEX ACCENT</td>
		      <td>POSIX negation syntax</td>
6326	        </tr>
6327		    <tr>
6328		      <td>{</td>
6329		      <td>U+007B</td>
6330		      <td>LEFT CURLY BRACKET</td>
6331		      <td>Strings in set; Perl property syntax</td>
6332	        </tr>
6333		    <tr>
6334		      <td>}</td>
6335		      <td>U+007D</td>
6336		      <td>RIGHT CURLY BRACKET</td>
6337			      <td>Strings in set; Perl property syntax</td>
6338        </tr>
6339		    <tr>
6340		      <td>&nbsp;</td>
6341		      <td>U+0020 U+0009..U+000D U+0085<br>
6342	          U+200E U+200F<br>
6343	          U+2028 U+2029</td>
6344		      <td>ASCII whitespace,<br>
6345	          LRM, RLM,<br>
6346	          LINE/PARAGRAPH SEPARATOR</td>
6347		      <td>Ignored except when escaped</td>
6348	        </tr>
6349	      </tbody>
6350	  </table>
6351	  <br>
6352		<h5>
6353			<a href="#Lists_of_Code_Points" name="Lists_of_Code_Points">5.3.3.1
6354				Lists of Code Points</a>
6355		</h5>
6356		<p>
6357			Lists are a sequence of strings that may include ranges, which are
6358			indicated by a &#39;-&#39; between two code points, as in
6359			&quot;a-z&quot;. The sequence<em> start-end</em> specifies the range
6360			of all code points from the start to end, inclusive, in Unicode
6361			order. For example, <b>[a c d-f m]</b> is equivalent to <b>[a c d
6362				e f m]</b>. Whitespace can be freely used for clarity, as <b>[a c
6363				d-f m]</b> means the same as <b>[acd-fm]</b>.
6364		</p>
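		<p>As a rough illustration of the range expansion described above, the following Python sketch expands the body of a simple list into a set of code points. The name <code>expand_simple_list</code> is hypothetical. The sketch assumes the body contains only literal code points, '-' ranges, and ignorable whitespace; escapes, strings in braces, and properties are not handled, and rejecting a descending range is an assumption of the sketch rather than something stated here.</p>

```python
# Pattern_White_Space code points, as listed in the syntax character table.
PATTERN_WHITE_SPACE = set(
    "\u0009\u000A\u000B\u000C\u000D\u0020\u0085\u200E\u200F\u2028\u2029"
)


def expand_simple_list(body: str) -> set:
    """Expand e.g. 'a c d-f m' into {'a', 'c', 'd', 'e', 'f', 'm'}."""
    # Whitespace is ignored, so drop it before looking for ranges.
    chars = [ch for ch in body if ch not in PATTERN_WHITE_SPACE]
    result = set()
    i = 0
    while i != len(chars):
        start = chars[i]
        if start == "-":
            raise ValueError("range is missing its start code point")
        if len(chars) > i + 1 and chars[i + 1] == "-":
            if len(chars) == i + 2:
                raise ValueError("range is missing its end code point")
            end = chars[i + 2]
            if ord(start) > ord(end):
                # Assumption of this sketch: descending ranges are rejected.
                raise ValueError("range end precedes range start")
            # start-end covers every code point from start to end, inclusive.
            result.update(chr(cp) for cp in range(ord(start), ord(end) + 1))
            i += 3
        else:
            result.add(start)
            i += 1
    return result
```

		<p>Under these assumptions, <code>expand_simple_list("a c d-f m")</code> and <code>expand_simple_list("acd-fm")</code> produce the same set.</p>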
6365		<p>
6366			A string with multiple code points is represented in a list by being
6367			surrounded by curly braces, such as in <strong>[a-z {ch}]</strong>.
6368			It can be used with the range notation, as described in <em>Section
6369				<a href="#String_Range">5.3.4 String Range</a>
6370			</em>. There is an additional restriction on string ranges in a
6371			UnicodeSet: the number of codepoints in the first string of the range
6372			must be identical to the number in the second. Thus [{ab}-{c}] and
6373			[{ab}-c] are invalid.
6374		</p>
6375		<p>In UnicodeSets, there are two ways to quote syntax code points:
6376		</p>
6377		<p>
6378			<a name="Backslash_Escapes"></a>Outside of single quotes, certain
6379			backslashed code point sequences can be used to quote code points:
6380		</p>
6381	  <table class='simple'>
6382			<tr>
6383			  <td>\x{h...h}<br>
6384	          \u{h...h}</td>
			  <td>list of one or more code points, each written as 1-6 hex digits ([0-9A-Fa-f]), separated by spaces</td>
6386	    </tr>
6387			<tr>
6388			  <td>\xhh</td>
6389			  <td>1-2 hex digits</td>
6390	    </tr>
6391			<tr>
6392				<td>\uhhhh</td>
6393				<td>Exactly 4 hex digits</td>
6394			</tr>
6395			<tr>
6396				<td>\Uhhhhhhhh</td>
6397				<td>Exactly 8 hex digits</td>
6398			</tr>
6399			<tr>
6400				<td>\a</td>
6401				<td>U+0007 (BEL / ALERT)</td>
6402			</tr>
6403			<tr>
6404				<td>\b</td>
6405				<td>U+0008 (BACKSPACE)</td>
6406			</tr>
6407			<tr>
6408				<td>\t</td>
6409				<td>U+0009 (TAB / CHARACTER TABULATION)</td>
6410			</tr>
6411			<tr>
6412				<td>\n</td>
6413				<td>U+000A (LINE FEED)</td>
6414			</tr>
6415			<tr>
6416				<td>\v</td>
6417				<td>U+000B (LINE TABULATION)</td>
6418			</tr>
6419			<tr>
6420				<td>\f</td>
6421				<td>U+000C (FORM FEED)</td>
6422			</tr>
6423			<tr>
6424				<td>\r</td>
6425				<td>U+000D (CARRIAGE RETURN)</td>
6426			</tr>
6427			<tr>
6428				<td>\\</td>
6429				<td>U+005C (BACKSLASH / REVERSE SOLIDUS)</td>
6430			</tr>
6431			<tr>
6432				<td>\N{name}</td>
6433				<td>The Unicode code point named &quot;name&quot;.</td>
6434			</tr>
6435			<tr>
6436				<td>\p{…},\P{…}</td>
6437				<td>Unicode property (see below)</td>
6438			</tr>
6439		</table><br>
6440	  <p>Anything else following a backslash is mapped to itself, except
6441			the property syntax described below, or in an environment where it is
6442			defined to have some special meaning.		</p>
6443	  <p>
6444		  Any code point formed as the result of a backslash escape loses any
6445			special meaning and is treated as a literal. In particular, note that
6446			\x, \u and \U escapes create literal code points. (In contrast, Java
6447			treats Unicode escapes as just a way to represent arbitrary code
6448			points in an ASCII source file, and any resulting code points are <i><b>not</b></i>
6449		  tagged as literals.)
6450		</p>
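		<p>The escapes in the table above could be resolved along the following lines. This is a minimal Python sketch, not the behavior of any particular implementation: the name <code>unescape</code> is hypothetical, <code>\N{name}</code> and the property syntax are deliberately left out, and the result is a plain string, so the "literal" status of escaped code points described above is not tracked.</p>

```python
import re

# Single-letter escapes from the table above.
_SIMPLE = {
    "a": "\a", "b": "\b", "t": "\t", "n": "\n",
    "v": "\v", "f": "\f", "r": "\r", "\\": "\\",
}

# Groups: 1 = bracketed hex list, 2 = \uhhhh, 3 = \Uhhhhhhhh, 4 = \xhh,
# 5 = any other code point following the backslash.
_ESCAPE = re.compile(
    r"\\(?:"
    r"[xu]\{([0-9A-Fa-f ]+)\}"
    r"|u([0-9A-Fa-f]{4})"
    r"|U([0-9A-Fa-f]{8})"
    r"|x([0-9A-Fa-f]{1,2})"
    r"|(.)"
    r")",
    re.DOTALL,
)


def unescape(text: str) -> str:
    """Resolve backslash escapes; \\N{...} and \\p{...} are not handled."""

    def replace(match):
        hex_list, u4, u8, x2, other = match.groups()
        if hex_list is not None:
            # \x{h...h} or \u{h...h}: one or more code points.
            return "".join(chr(int(h, 16)) for h in hex_list.split())
        for digits in (u4, u8, x2):
            if digits is not None:
                return chr(int(digits, 16))
        # Anything else following a backslash maps to itself.
        return _SIMPLE.get(other, other)

    return _ESCAPE.sub(replace, text)
```

		<p>For example, under these assumptions <code>unescape(r"x\u{61 2019 62}y")</code> yields the five code points x, a, U+2019, b, y.</p>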
6451		<p>
			Unicode property sets are defined as described in <i>UTS
6453				#18: Unicode Regular Expressions</i> [<a
6454				href="http://www.unicode.org/reports/tr41/#UTS18">UTS18</a>], Level
6455			1 and RL2.5, including the syntax where given. For an example of a
6456			concrete implementation of this, see [<a href="#ICUUnicodeSet">ICUUnicodeSet</a>].
6457		</p>
6458		<h5>
6459			<a href="#Unicode_Properties" name="Unicode_Properties">5.3.3.2
6460				Unicode Properties</a>
6461		</h5>
6462
6463		<p>
			Briefly, Unicode property sets are specified by any Unicode property
			and a value of that property, such as <b>[:General_Category=Letter:]</b>
			for Unicode letters, or <b>\p{uppercase}</b> for the set of uppercase
			letters in Unicode. The property names are defined by the
6468			PropertyAliases.txt file and the property values by the
6469			PropertyValueAliases.txt file. For more information, see [<a
6470				href="http://unicode.org/reports/tr41/#UAX44">UAX44</a>]. The syntax
6471			for specifying the property sets is an extension of either POSIX or
6472			Perl syntax, by the addition of &quot;=&lt;value&gt;&quot;. For
6473			example, you can match letters by using the POSIX-style syntax:
6474		</p>
6475		<p>
6476			<b>[:General_Category=Letter:]</b>
6477		</p>
6478		<p>or by using the Perl-style syntax</p>
6479		<p>
6480			<b>\p{General_Category=Letter}</b>.
6481		</p>
6482		<p>
6483			Property names and values are case-insensitive, and whitespace,
6484			&quot;-&quot;, and &quot;_&quot; are ignored. The property name can
6485			be omitted for the <strong>General_Category</strong> and <strong>Script</strong>
6486			properties, but is required for other properties. If the property
6487			value is omitted, it is assumed to represent a boolean property with
6488			the value &quot;true&quot;. Thus <b>[:Letter:]</b> is equivalent to <b>[:General_Category=Letter:]</b>,
6489			and <b>[:Wh-ite-s pa_ce:]</b> is equivalent to <b>[:Whitespace=true:]</b>.
6490		</p>
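		<p>The loose matching of property names and values described above can be approximated by normalizing both sides before comparison. The following Python sketch uses a hypothetical helper name, <code>loose_key</code>, and treats whitespace as whatever <code>str.isspace</code> accepts, which is an assumption of the sketch.</p>

```python
def loose_key(name: str) -> str:
    """Normalize a property name or value for loose comparison.

    Case is ignored, and whitespace, '-', and '_' are dropped.
    """
    return "".join(
        ch for ch in name.lower() if not (ch.isspace() or ch in "-_")
    )
```

		<p>With this helper, <code>loose_key("Wh-ite-s pa_ce")</code> and <code>loose_key("Whitespace")</code> compare equal, as do <code>loose_key("General_Category")</code> and <code>loose_key("general category")</code>.</p>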
6491		<p>
6492			The table below shows the two kinds of syntax: POSIX and Perl style.
6493			Also, the table shows the &quot;Negative&quot; version, which is a
6494			property that excludes all code points of a given kind. For example,
6495			<b>[:^Letter:]</b> matches all code points that are not <b>[:Letter:]</b>.
6496		</p>
6497		<table>
6498			<tr>
6499				<th>&nbsp;</th>
6500				<th>Positive</th>
6501				<th>Negative</th>
6502			</tr>
6503			<tr>
6504				<td>POSIX-style Syntax</td>
6505				<td>[:type=value:]</td>
6506				<td>[:^type=value:]</td>
6507			</tr>
6508			<tr>
6509				<td>Perl-style Syntax</td>
6510				<td>\p{type=value}</td>
6511				<td>\P{type=value}</td>
6512			</tr>
6513		</table>
6514		<h5>
6515			<a href="#Boolean_Operations" name="Boolean_Operations">5.3.3.3
6516				Boolean Operations</a>
6517		</h5>
6518
6519		<p>The low-level lists or properties then can be freely combined
6520			with the normal set operations (union, inverse, difference, and
6521			intersection):</p>
6522		<ul>
6523			<li>To union two sets, simply concatenate them. For example, <b>[[:letter:]
6524					[:number:]]</b></li>
6525			<li>To intersect two sets, use the &#39;&amp;&#39; operator. For
6526				example, <b>[[:letter:] &amp; [a-z]] </b>
6527			</li>
6528			<li>To take the set-difference of two sets, use the &#39;-&#39;
6529				operator. For example, <b>[[:letter:] - [a-z]]</b>
6530			</li>
6531			<li>To invert a set, place a &#39;^&#39; immediately after the
6532				opening &#39;[&#39;. For example, <b>[^a-z]</b>. In any other
6533				location, the &#39;^&#39; does not have a special meaning. The
6534				inversion [^X] is equivalent to [[\x{0}-\x{10FFFF}]-[X]]. Thus
6535				multi-code point strings are discarded.
6536			</li>
6537			<li>Symmetric difference (~) is not supported.</li>
6538		</ul>
6539		<p>
6540			The binary operators &#39;&amp;&#39;, &#39;-&#39;, and the implicit
6541			union have equal precedence and bind left-to-right. Thus <b>[[:letter:]-[a-z]-[\u0100-\u01FF]]</b>
6542			is equal to <b>[[[:letter:]-[a-z]]-[\u0100-\u01FF]]</b>. Another
6543			example is the set <b>[[ace][bdf] - [abc][def]]</b>, which is not the
6544			empty set, but instead equal to <b>[[[[ace] [bdf]] - [abc]]
6545				[def]]</b>, which equals <b>[[[abcdef] - [abc]] [def]]</b>, which equals
6546			<b>[[def] [def]]</b>, which equals <b>[def]</b>.
6547		</p>
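		<p>The left-to-right evaluation can be checked with ordinary set operations. The Python sketch below is only an illustration: the function <code>combine</code> is hypothetical, the operands are already-parsed sets of single code points, and an empty string stands for the implicit union.</p>

```python
def combine(operands, operators):
    """Apply the operators strictly left to right, with equal precedence.

    operators[i] sits between operands[i] and operands[i + 1];
    "" is the implicit union, "&" intersection, "-" set difference.
    """
    if len(operators) != len(operands) - 1:
        raise ValueError("need exactly one operator between each pair of sets")
    result = set(operands[0])
    for operator, operand in zip(operators, operands[1:]):
        if operator == "":
            result = result.union(operand)
        elif operator == "&":
            result = result.intersection(operand)
        elif operator == "-":
            result = result.difference(operand)
        else:
            raise ValueError("unsupported operator: " + repr(operator))
    return result


# [[ace][bdf] - [abc][def]] evaluated left to right.
example = combine(
    [set("ace"), set("bdf"), set("abc"), set("def")],
    ["", "-", ""],
)
```

		<p>Here <code>example</code> is the set of d, e, and f, matching the step-by-step derivation above rather than the empty set.</p>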
6548		<p>
6549			<strong>One caution:</strong> the &#39;&amp;&#39; and &#39;-&#39;
6550			operators operate between sets. That is, they must be immediately
6551			preceded and immediately followed by a set. For example, the pattern
6552			<b>[[:Lu:]-A]</b> is illegal, since it is interpreted as the set <b>[:Lu:]</b>
6553			followed by the incomplete range <b>-A</b>. To specify the set of
6554			upper case letters except for &#39;A&#39;, enclose the &#39;A&#39; in
6555			brackets: <b>[[:Lu:]-[A]]</b>.
6556		</p>
6557		<h5>
6558			<a href="#UnicodeSet_Examples" name="UnicodeSet_Examples">5.3.3.4
6559				UnicodeSet Examples</a>
6560		</h5>
6561		<p>The following table summarizes the syntax that can be used.</p>
6562		<table style="margin-top: 0.5em; margin-bottom: 0.5em" id="table18">
6563			<tr>
6564				<th>Example</th>
6565				<th>Description</th>
6566			</tr>
6567			<tr>
6568				<td nowrap>[a]</td>
6569				<td>The set containing &#39;a&#39; alone</td>
6570			</tr>
6571			<tr>
6572				<td nowrap>[a-z]</td>
6573				<td>The set containing &#39;a&#39; through &#39;z&#39; and all
6574					letters in between, in Unicode order.<br> Thus it is the same
6575					as [\u0061-\u007A].
6576				</td>
6577			</tr>
6578			<tr>
6579				<td nowrap>[^a-z]</td>
6580				<td>The set containing all code points but &#39;a&#39; through
6581					&#39;z&#39;.<br> Thus it is the same as [\u0000-\u0060
6582					\u007B-\x{10FFFF}].
6583				</td>
6584			</tr>
6585			<tr>
6586				<td nowrap>[[pat1][pat2]]</td>
6587				<td>The union of sets specified by pat1 and pat2</td>
6588			</tr>
6589			<tr>
6590				<td nowrap>[[pat1]&amp;[pat2]]</td>
6591				<td>The intersection of sets specified by pat1 and pat2</td>
6592			</tr>
6593			<tr>
6594				<td nowrap>[[pat1]-[pat2]]</td>
6595				<td>The asymmetric difference of sets specified by pat1 and
6596					pat2</td>
6597			</tr>
6598			<tr>
6599				<td nowrap>[a {ab} {ac}]</td>
6600				<td>The code point &#39;a&#39; and the multi-code point strings
6601					&quot;ab&quot; and &quot;ac&quot;</td>
6602			</tr>
6603			<tr>
6604			  <td nowrap>[x\u{61 2019 62}y]</td>
			  <td>Equivalent to [x\u0061\u2019\u0062y] (= [xa’by])</td>
6606		  </tr>
6607			<tr>
6608				<td nowrap>[{ax}-{bz}]</td>
6609				<td>The set containing [{ax} {ay} {az} {bx} {by} {bz}], using
6610					the range syntax to get all the strings from {ax} to {bz} as
6611					described in <em>Section <a href="#String_Range">5.3.4
6612							String Range</a></em>.
6613				</td>
6614			</tr>
6615			<tr>
6616				<td nowrap>[:Lu:]</td>
6617				<td>The set of code points with a given property value, as
6618					defined by PropertyValueAliases.txt. In this case, these are the
6619					Unicode upper case letters. The long form for this is <b>[:General_Category=Uppercase_Letter:]</b>.
6620				</td>
6621			</tr>
6622			<tr>
6623				<td nowrap>[:L:]</td>
6624				<td>The set of code points belonging to all Unicode categories
6625					starting with &#39;L&#39;, that is, <b>[[:Lu:][:Ll:][:Lt:][:Lm:][:Lo:]]</b>.
6626					The long form for this is <b>[:General_Category=Letter:]</b>.
6627				</td>
6628			</tr>
6629		</table>
6630		<br>
6631		<h4>
6632			<a name="String_Range" href="#String_Range">5.3.4 String Range</a>
6633		</h4>
6634		<p>A String Range is a compact format for specifying a list of
6635			strings.</p>
6636		<p>
6637			<strong>Syntax:<br>
6638			</strong>
6639		</p>
6640		<blockquote>
6641			<p>
6642				X <em>sep</em> Y<br>
6643			</p>
6644		</blockquote>
6645		<p>The separator and the format of strings X, Y may vary depending
6646			on the domain. For example,</p>
6647		<ul>
6648			<li>for the validity files the separator is ~,</li>
6649			<li>for UnicodeSet the separator is
6650				-, and any multi-codepoint string is
6651				enclosed in {…}.
6652			</li>
6653		</ul>
6654		<p>
6655			<strong>Validity: <br>
6656			</strong>
6657		</p>
6658		<blockquote>
6659			<p>
6660				A string range X <em>sep</em> Y is valid iff len(X) ≥ len(Y) &gt; 0,
6661				where len(X) is the length of X in code points.
6662			</p>
6663			<p>
6664				<em>There may be additional, domain-specific requirements for
6665					validity of the expansion of the string range.</em>
6666			</p>
6667		</blockquote>
6668		<p>
6669			<strong>Interpretation:<br>
6670			</strong>
6671		</p>
6672		<ol>
6673			<li>Break X into P and S, where len(S) = len(Y)
6674				<ul>
6675					<li>Note that P will be an empty string if the lengths of X
6676						and Y are equal.</li>
6677				</ul>
6678			</li>
6679			<li>Form the combinations of all P+(s₀..y₀)+(s₁..y₁)+...(sₙ..yₙ)
6680				<ul>
6681					<li>s₀ is the first code point in S, etc.</li>
6682				</ul>
6683			</li>
6684		</ol>
6685		<p>
6686			<strong>Examples:</strong>
6687		</p>
6688		<table>
6689			<tbody>
6690				<tr>
6691					<td>ab-ad</td>
6692					<td>→</td>
6693					<td>ab ac ad</td>
6694				</tr>
6695				<tr>
6696					<td>ab-d</td>
6697					<td>→</td>
6698					<td>ab ac ad</td>
6699				</tr>
6700				<tr>
6701					<td>ab-cd</td>
6702					<td>→</td>
6703					<td>ab ac ad bb bc bd cb cc cd</td>
6704				</tr>
6705				<tr>
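					<td>&#x1F466;&#x1F3FB;-&#x1F466;&#x1F3FF;</td>
					<td>→</td>
					<td>&#x1F466;&#x1F3FB; &#x1F466;&#x1F3FC; &#x1F466;&#x1F3FD; &#x1F466;&#x1F3FE; &#x1F466;&#x1F3FF;</td>
				</tr>
				<tr>
					<td>&#x1F466;&#x1F3FB;-&#x1F3FF;</td>
					<td>→</td>
					<td>&#x1F466;&#x1F3FB; &#x1F466;&#x1F3FC; &#x1F466;&#x1F3FD; &#x1F466;&#x1F3FE; &#x1F466;&#x1F3FF;</td>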
6714				</tr>
6715			</tbody>
6716		</table>
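		<p>The validity check and the two interpretation steps above can be sketched in Python as follows. The function name <code>expand_string_range</code> is hypothetical, parsing of the separator and of any {…} wrapping is assumed to have happened already, and raising an error when a start code point is greater than its end is an assumption of the sketch, not a rule stated in this section.</p>

```python
from itertools import product


def expand_string_range(x: str, y: str) -> list:
    """Expand the string range X sep Y into the list of strings it denotes."""
    # Validity: len(X) >= len(Y) > 0, measured in code points.
    if not (len(x) >= len(y) > 0):
        raise ValueError("invalid string range")
    # Step 1: break X into prefix P and suffix S, where len(S) == len(Y).
    split_at = len(x) - len(y)
    prefix, suffix = x[:split_at], x[split_at:]
    # Step 2: build one code point range per position, then combine them.
    position_ranges = []
    for start, end in zip(suffix, y):
        if ord(start) > ord(end):
            # Assumption of this sketch: descending positions are rejected.
            raise ValueError("range end precedes range start")
        position_ranges.append(
            [chr(cp) for cp in range(ord(start), ord(end) + 1)]
        )
    return [prefix + "".join(combo) for combo in product(*position_ranges)]
```

		<p>Under these assumptions, <code>expand_string_range("ab", "d")</code> and <code>expand_string_range("ab", "ad")</code> both give ab, ac, ad, and <code>expand_string_range("ab", "cd")</code> gives the nine strings shown in the table.</p>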
6717		<br>
6718		<h3>
6719			<a name="Identity_Elements" href="#Identity_Elements">5.4
6720				Identity Elements</a>
6721		</h3>
6722		<p class="dtd">&lt;!ELEMENT identity (alias | (version,
6723			generation?, language, script?, territory?, variant?, special*) )
6724			&gt;</p>
6725		<p>The identity element contains information identifying the
6726			target locale for this data, and general information about the
6727			version of this data.</p>
6728		<p class="element2">
6729			&lt;version number=&quot;<u>$</u>Revision: 1.227 <u>$</u>&quot;&gt;
6730		</p>
6731		<p>The version element provides, in an attribute, the version of
6732			this file.&nbsp; The contents of the element can contain textual
6733			notes about the changes between this version and the last. For
6734			example:</p>
6735		<blockquote>
6736			<pre>&lt;version number=&quot;<span style="color: blue">1.1</span>&quot;&gt;<span
6737					style="color: blue">Various notes and changes in version 1.1</span>&lt;/version&gt;</pre>
6738			<p>This is not to be confused with the version attribute on the
6739				ldml element, which tracks the dtd version.</p>
6740		</blockquote>
6741		<p class="element2">
6742			&lt;generation date=&quot;<u>$</u>Date: 2007/07/17 23:41:16 <u>$</u>&quot;
6743			/&gt;
6744		</p>
6745		<p>The generation element is now deprecated. It was used to
6746			contain the last modified date for the data. This could be in two
6747			formats: ISO 8601 format, or CVS format (illustrated by the example
6748			above).</p>
6749		<p class="element2">
6750			&lt;language type=&quot;<span style="color: blue">en</span>&quot;/&gt;
6751		</p>
6752		<p>The language code is the primary part of the specification of
6753			the locale id, with values as described above.</p>
6754		<p class="element2">
6755			&lt;script type=&quot;<span style="color: blue">Latn</span>&quot;
6756			/&gt;
6757		</p>
6758		<p>The script code may be used in the identification of written
6759			languages, with values described above.</p>
6760		<p class="element2">
6761			&lt;territory type=&quot;<span style="color: blue">US</span>&quot;/&gt;
6762		</p>
6763		<p>The territory code is a common part of the specification of the
6764			locale id, with values as described above.</p>
6765		<p class="element2">
6766			&lt;variant type=&quot;<span class="attributeValue">NYNORSK</span>&quot;/&gt;
6767		</p>
6768		<p>The variant code is the tertiary part of the specification of
6769			the locale id, with values as described above.</p>
6770
6771		<p>
6772			When combined according to the rules described in <i> <a
6773				href="#Unicode_Language_and_Locale_Identifiers">Section 3,
6774					Unicode Language and Locale Identifiers</a></i>, the language element,
6775			along with any of the optional script, territory, and variant
6776			elements, must identify a known, stable locale identifier. Otherwise,
6777			it is an error.
6778		</p>
6779		<h3>
6780			<a name="Valid_Attribute_Values" href="#Valid_Attribute_Values">5.5
6781				Valid Attribute Values</a>
6782		</h3>
		<p>The valid attribute values, as well as other validity
			information, are contained in the supplementalMetadata.xml file. (Some,
6785			but not all, of this information could have been represented in XML
6786			Schema or a DTD.) Most of this is primarily for internal tool use.</p>
6787
6788		<p>The &lt;elementOrder&gt; and &lt;attributeOrder&gt; elements
6789			are now deprecated, since the information regarding element and
6790			attribute ordering is now contained in the DTD.</p>
6791		<p>
6792			<i>The suppress elements are those that are suppressed in
6793				canonicalization.</i>
6794		</p>
6795		<p>
6796			<i>The serialElements are those that do not inherit, and may have
				ordering.</i>
6798		</p>
6799		<blockquote>
6800			<pre>&lt;serialElements&gt;attributeValues base comment extend first_non_ignorable first_primary_ignorable
6801first_secondary_ignorable first_tertiary_ignorable first_trailing first_variable i ic languagePopulation
6802last_non_ignorable last_primary_ignorable last_secondary_ignorable last_tertiary_ignorable last_trailing
6803last_variable optimize p pc reset rules s sc settings suppress_contractions t tRule tc variable x
6804&lt;/serialElements&gt;</pre>
6805		</blockquote>
6806		<p>
6807			<i>The validity elements give the possible attribute values. They
6808				are in the format of a series of variables, followed by
6809				attributeValues. </i>
6810		</p>
6811		<blockquote>
6812			<pre>&lt;variable id=&quot;$calendar&quot; type=&quot;choice&quot;&gt;
6813buddhist coptic ethiopic ethiopic-amete-alem chinese gregorian hebrew indian islamic islamic-civil
6814japanese arabic civil-arabic thai-buddhist persian roc&lt;/variable&gt;</pre>
6815		</blockquote>
6816		<p>The types indicate the style of match:</p>
6817		<ul>
6818			<li>choice: for a list of possible values</li>
6819			<li>regex: for a regular expression match</li>
6820			<li>notDoneYet: for items without matching criteria</li>
6821			<li>locale: for locale IDs</li>
6822			<li>list: for a space-delimited list of values</li>
6823			<li>path: for a valid [<a href="#XPath">XPath</a>]
6824			</li>
6825		</ul>
6826		<p>If the attribute order=&quot;given&quot; is supplied, it
6827			indicates the order of elements when canonicalizing (see below).</p>
6828		<p>The variable values are intended for internal testing, and the
6829			definition and usage may change between releases. They do not
6830			necessarily include all valid elements. For example, for primary
6831			language codes, they include the subset that occur in CLDR locale
6832			data. They are intended for a particular version of CLDR, and may
6833			omit codes that were present in earlier versions, such as deprecated
6834			codes.</p>
6835		<p>The &lt;deprecated&gt; element lists elements, attributes, and
6836			attribute values that are deprecated. If any deprecatedItems element
6837			contains more than one attribute, then only the listed combinations
6838			are deprecated. Thus the following means not that the draft attribute
6839			is deprecated, but that the true and false values for that attribute
6840			are:</p>
6841		<blockquote>
6842			<pre>&lt;deprecatedItems attributes=&quot;draft&quot; values=&quot;true false&quot;/&gt; </pre>
6843		</blockquote>
6844		<p>
6845			Similarly, the following means that the <i>type</i> attribute is
6846			deprecated, but only for the listed elements:
6847		</p>
6848		<blockquote>
6849			<pre>&lt;deprecatedItems elements=&quot;abbreviationFallback default ... preferenceOrdering&quot; attributes=&quot;type&quot;/&gt; </pre>
6850		</blockquote>
6851		<p class="dtd">
6852			&lt;!ELEMENT blockingItems EMPTY &gt;<br> &lt;!ATTLIST
6853			blockingItems elements NMTOKENS #IMPLIED &gt;
6854		</p>
6855		<p>
6856			The blockingItems were used to indicate which elements (and their child elements)
6857			do not inherit. For example, because supplementalData is a blocking
6858			item, all paths containing the element <span class="element">supplementalData</span>
6859			do not inherit. However, <strong>the &lt;blockingItems&gt; element is now deprecated,</strong>
6860			having been replaced by the annotations in the DTD and the DTDData classes in CLDR tooling.
6861		</p>
6862		<pre class="dtd">&lt;!ELEMENT distinguishingItems EMPTY &gt;
6863&lt;!ATTLIST distinguishingItems exclude ( true | false ) #IMPLIED &gt;
6864&lt;!ATTLIST distinguishingItems elements NMTOKENS #IMPLIED &gt;
6865&lt;!ATTLIST distinguishingItems attributes NMTOKENS #IMPLIED &gt;</pre>
6866		<p>
6867			The distinguishing items were used to indicate which combinations of elements and
6868			attributes (in unblocked environments) are <i>distinguishing</i> in
6869			performing inheritance. For example, the attribute type is
6870			distinguishing <i>except</i> in combination with certain elements,
6871			such as in the following. However, <strong>the &lt;distinguishingItems&gt; element is now deprecated,</strong>
6872			having been replaced by the annotations in the DTD and the DTDData classes in CLDR tooling.
6873		</p>
6874		<pre>&lt;distinguishingItems
6875  exclude=&quot;true&quot;
6876  elements=&quot;default measurementSystem mapping abbreviationFallback preferenceOrdering&quot;
6877  attributes=&quot;type&quot;/&gt;
6878</pre>
6879		<h3>
6880			<a name="Canonical_Form" href="#Canonical_Form">5.6 Canonical
6881				Form</a>
6882		</h3>
6883		<p>The following are restrictions on the format of LDML files to
6884			allow for easier parsing and comparison of files.</p>
6885		<p>Peer elements have consistent order. That is, if the DTD or
6886			this specification requires the following order in an element foo:</p>
6887		<pre>&lt;foo&gt;
6888  &lt;pattern&gt;
6889  &lt;somethingElse&gt;
6890&lt;/foo&gt;</pre>
6891		<p>It can never require the reverse order in a different element
6892			bar.</p>
		<pre>&lt;bar&gt;
  &lt;somethingElse&gt;
  &lt;pattern&gt;
&lt;/bar&gt;</pre>
6897		<p>Note that there was one case that had to be corrected in order
6898			to make this true. For that reason, pattern occurs twice under
6899			currency:</p>
6900		<pre class="dtd">&lt;!ELEMENT currency (alias | (pattern*, displayName?, symbol?, pattern*,
6901decimal?, group?, special*)) &gt;</pre>
6902		<p>
6903			<a href="http://www.w3.org/TR/REC-xml/">XML</a> files can have a wide
6904			variation in textual form, while representing precisely the same
			data. Putting the LDML files in the repository into a canonical
			form allows us to use the simple diff tools that are widely used (including
			in CVS) to detect differences when vetting changes, without those tools
			being confused. This is not a requirement on other uses of LDML; it is
			simply a way to manage repository data more easily.
6910		</p>
6911		<h4>
6912			<a name="Content" href="#Content">5.6.1 Content</a>
6913		</h4>
6914		<ol>
6915			<li>All start elements are on their own line, indented by <i>depth</i>
6916				tabs.
6917			</li>
6918			<li>All end elements (except for leaf nodes) are on their own
6919				line, indented by <i>depth</i> tabs.
6920			</li>
6921			<li>Any leaf node with empty content is in the form
6922				&lt;foo/&gt;.</li>
6923			<li>There are no blank lines except within comments or content.</li>
6924			<li>Spaces are used within a start element. There are no extra
6925				spaces within elements.
6926				<ul>
6927					<li><code>&lt;version number=&quot;1.2&quot;/&gt;</code>, not
6928						<code>&lt;version&nbsp; number = &quot;1.2&quot; /&gt;</code></li>
6929					<li><code>&lt;/identity&gt;</code>, not <code>&lt;/identity
6930							&gt;</code></li>
6931				</ul>
6932			</li>
6933			<li>All attribute values use double quote (&quot;), not single
6934				(&#39;).</li>
6935			<li>There are no CDATA sections, and no escapes except those
6936				absolutely required.
6937				<ul>
6938					<li>no &amp;apos; since it is not necessary</li>
6939					<li>no &#39;&amp;#x61;&#39;, it would be just &#39;a&#39;</li>
6940				</ul>
6941			</li>
6942			<li>All attributes with defaulted values are suppressed.</li>
6943			<li>The draft and alt=&quot;proposed.*&quot; attributes are only
6944				on leaf elements.</li>
6945			<li>The tzid are canonicalized in the following way:
6946				<ol>
					<li type="a">All tzids as of CLDR 1.1 (2004.06.08) in
6948						zone.tab are canonical.</li>
6949					<li>After that point, the first time a tzid is introduced,
6950						that is the canonical form.</li>
6951				</ol>
6952				<p>
6953					That is, new IDs are added, but existing ones keep the original
6954					form. The <i>TZ</i> timezone database keeps a set of equivalences
6955					in the &quot;backward&quot; file. These are used to map other tzids
6956					to the canonical form. For example, when
6957					<code>America/Argentina/Catamarca</code>
					was introduced as the new name for the previous
					<code>America/Catamarca</code>, a link was added in the backward file.
6961				</p>
6962				<p>
6963					<code>Link America/Argentina/Catamarca America/Catamarca</code>
6964				</p>
6965			</li>
6966		</ol>
6967		<p>
6968			<i>Example:</i>
6969		</p>
		<pre>&lt;ldml draft=&quot;unconfirmed&quot;&gt;
6971	&lt;identity&gt;
6972		&lt;version number=&quot;1.2&quot;/&gt;
6973		&lt;language type=&quot;en&quot;/&gt;
6974		&lt;territory type=&quot;AS&quot;/&gt;
6975	&lt;/identity&gt;
6976	&lt;numbers&gt;
6977		&lt;currencyFormats&gt;
6978			&lt;currencyFormatLength&gt;
6979				&lt;currencyFormat&gt;
6980					&lt;pattern&gt;¤#,##0.00;(¤#,##0.00)&lt;/pattern&gt;
6981				&lt;/currencyFormat&gt;
6982			&lt;/currencyFormatLength&gt;
6983		&lt;/currencyFormats&gt;
6984	&lt;/numbers&gt;
6985&lt;/ldml&gt;</pre>
6986		<h4>
6987			<a name="Ordering" href="#Ordering">5.6.2 Ordering</a>
6988		</h4>
6989		<p>An element is ordered first by the element name, and then if
6990			the element names are identical, by the sorted set of attribute-value
			pairs. For the latter, compare the first pair in each (in sorted
			order by attribute pair). If they are identical, go to the second pair,
			and so on.</p>
6994		<p>Elements and attributes are ordered according to their order in
6995			the respective DTDs. Attribute value comparison is a bit more
6996			complicated, and may depend on the attribute and type. This is
6997			currently done with specific ordering tables.</p>
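		<p>The comparison described above can be pictured as a sort key. The Python sketch below is illustrative only: the two order lists are hypothetical stand-ins for the order of declarations in a DTD, elements are modeled as (name, attributes) pairs, and attribute values are compared as plain strings, whereas the actual comparison uses specific ordering tables.</p>

```python
# Hypothetical DTD orders, for illustration only; real orders come from the DTDs.
ELEMENT_ORDER = ["language", "script", "territory", "variant"]
ATTRIBUTE_ORDER = ["type", "alt", "draft"]


def canonical_sort_key(element):
    """Order by element name first, then by the sorted attribute-value pairs."""
    name, attributes = element
    # Sort the pairs by the attribute's position in the DTD order.
    pairs = sorted(
        (ATTRIBUTE_ORDER.index(attr), value) for attr, value in attributes.items()
    )
    # Tuples and lists compare pair by pair, moving on when pairs are identical.
    return (ELEMENT_ORDER.index(name), pairs)


elements = [
    ("territory", {"type": "US"}),
    ("language", {"type": "en", "alt": "short"}),
    ("language", {"type": "en"}),
    ("language", {"type": "de"}),
]
ordered = sorted(elements, key=canonical_sort_key)
```

		<p>In this sketch the three language elements sort before the territory element, and among them the one with type "de" comes first, followed by type "en" without and then with the alt attribute.</p>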
6998		<p>
6999			Any future additions to the DTD must be structured so as to allow
7000			compatibility with this ordering. See also <a
7001				href="#Valid_Attribute_Values">Section 5.5 Valid Attribute
7002				Values.</a>
7003		</p>
7004
7005		<h4>
7006			<a name="Comments" href="#Comments">5.6.3 Comments</a>
7007		</h4>
7008		<ol>
7009			<li>Comments are of the form &lt;!-- <i>stuff</i> --&gt;.
7010			</li>
7011			<li>They are logically attached to a node. There are 4 kinds:
7012				<ol>
					<li>Inline comments always appear after a leaf node, at the end
						of the same line. These are a single line.</li>
7015					<li>Preblock comments always precede the attachment node, and
7016						are indented on the same level.</li>
7017					<li>Postblock comments always follow the attachment node, and
7018						are indented on the same level.</li>
7019					<li>Final comment, after &lt;/ldml&gt;</li>
7020				</ol>
7021			</li>
7022			<li>Multiline comments (except the final comment) have each line
7023				after the first indented to one deeper level.</li>
7024		</ol>
7025		<p>
7026			<b>Examples:</b>
7027		</p>
7028		<pre>&lt;eraAbbr&gt;
7029	&lt;era type=&quot;0&quot;&gt;BC&lt;/era&gt; &lt;!-- might add alternate BDE in the future --&gt;
7030...
7031&lt;timeZoneNames&gt;
7032	&lt;!-- Note: zones that do not use daylight time need further work --&gt;
7033	&lt;zone type=&quot;America/Los_Angeles&quot;&gt;
7034	...
7035	&lt;!-- Note: the following is known to be sparse,
7036		and needs to be improved in the future --&gt;
7037	&lt;zone type=&quot;Asia/Jerusalem&quot;&gt;</pre>
7038
7039    <h3>
7040			<a name="DTD_Annotations" href="#DTD_Annotations">5.7 DTD Annotations</a>
7041	  </h3>
7042				<p>The information in a standard DTD is insufficient for use in CLDR. To make up for that, DTD annotations are added. These are of the form<br>
7043				&lt;!--@...--&gt;<br>
7044				and are included below the !ELEMENT or !ATTLIST line that they apply to. The current annotations are:</p>
7045				<table>
7046                <tr><th>Type</th><th>Description</th></tr>
7047                <tr>
7048                  <td>&lt;!--@VALUE--&gt;</td>
7049                  <td>The attribute is not distinguishing, and is treated like an element value</td></tr>
7050                <tr>
7051                  <td>&lt;!--@METADATA--&gt;</td>
7052                  <td>The attribute is a “comment” on the data, like the draft status. It is not typically used in implementations.</td>
7053                </tr>
7054                <tr>
7055                  <td>&lt;!--@ORDERED--&gt;</td>
7056                  <td>The element's children are ordered, and do not inherit.</td>
7057                </tr>
7058                <tr>
7059                  <td>&lt;!--@DEPRECATED--&gt;</td>
7060                  <td>The element or attribute is deprecated, and should not be used.</td>
7061                </tr>
7062                <tr>
7063                  <td>&lt;!--@DEPRECATED: attribute-value1, attribute-value2--&gt;</td>
7064                  <td>The attribute values are deprecated, and should not be used. Spaces
7065                  	between tokens are not significant.</td>
7066                </tr>
7067                </table>
7068
				<p>There is additional information in the attributeValueValidity.xml
					file that is used internally for testing. For example, the following
					line indicates that the 'type' attribute of the 'currency' element in
					the ldml DTD must have values from the bcp47 'cu' type.</p>
7073				<p class='example'> &lt;attributeValues dtds='ldml' elements='currency'
7074					attributes='type'&gt;$_bcp47_cu&lt;/attributeValues&gt;</p>
				<p>The element values may be literals, regular expressions, or variables
					(some of which are set programmatically according to other CLDR data,
					such as the above). However, the information at this point does not
					cover all attribute values, is used only for testing, and should not
					be used in implementations, since the structure may change without
					notice.</p>
7081
7082		<h2>
7083			<a name="Property_Data" href="#Property_Data">6 Property Data</a>
7084		</h2>
7085		<p>Some data in CLDR does not use an XML format, but rather a
7086			semicolon-delimited format derived from that of the Unicode Character
7087			Database. That is because the data is more likely to be parsed by
7088			implementations that already parse UCD data. Those files are present
7089			in the common/properties directory.</p>
7090		<p>Each file has a header that explains the format and usage of
7091			the data.</p>
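		<p>As a rough illustration of the general shape of such files, the following Python sketch reads semicolon-delimited records in the style of UCD property files. It assumes the usual UCD conventions (<code>#</code> starts a comment, fields are separated by semicolons, surrounding whitespace is not significant). The function name <code>parse_property_lines</code> and the sample record are hypothetical; the header of each data file remains the authority on its actual fields.</p>

```python
from typing import Iterable, Iterator, List


def parse_property_lines(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield the trimmed fields of each data line, skipping comments and blanks."""
    for raw in lines:
        # Assumption: '#' begins a comment that runs to the end of the line.
        data = raw.split("#", 1)[0].strip()
        if not data:
            continue
        yield [field.strip() for field in data.split(";")]


# Hypothetical sample; real files define their own fields in the header.
sample = [
    "# Format: key ; value1 ; value2",
    "",
    "Abcd ; 1 ; example  # trailing comment",
]
records = list(parse_property_lines(sample))
```

		<p>Here <code>records</code> holds one list of three fields for the single data line; the comment line and the blank line are skipped.</p>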
7092		<h3><a name="Script_Metadata" href="#Script_Metadata">6.1 Script Metadata</a></h3>
7093		<p><code>scriptMetadata.txt</code>: </p>
		<p>This file provides general information about scripts that may be useful to implementations processing text. The information is the best currently available, and may change between versions of CLDR. The format is similar to that of a Unicode Character Database property file, and is documented in the header of the data file.</p>
7095		<h3><a name="Extended_Pictographic" href="#Extended_Pictographic">6.2 Extended Pictographic</a>        </h3>
7096		<p><code>ExtendedPictographic.txt</code></p>
7097	  <p>This file was used to define the ExtendedPictographic data used for “future-proofing” emoji behavior, especially in segmentation. As of Emoji version 11.0, the set of Extended_Pictographic is incorporated into the emoji data files found at <a href="https://unicode.org/Public/emoji/">unicode.org/Public/emoji/</a>.</p>
7098
7115	  <h3><a name="Labels.txt" href="#Labels.txt">6.3 Labels.txt</a>        </h3>
7116		<p><code>labels.txt</code>: </p>
		  <p>This file provides general information about associations of labels to characters that may be useful to implementations of character-picking applications. The information is the best currently available, and may change between versions of CLDR. The format is similar to that of a Unicode Character Database property file, and is documented in the header of the data file.</p>
7118		  <p>Initially, the contents are focused on emoji, but may be expanded in the future to other types of characters. Note that a character may have multiple labels.</p>
7119
7120        <h2>
7121			<a name="Format_Parse_Issues" href="#Format_Parse_Issues">7
7122				Issues in Formatting and Parsing</a>
7123		</h2>
7124		<h3>
7125			<a name="Lenient_Parsing" href="#Lenient_Parsing">7.1 Lenient Parsing</a>
7126		</h3>
7127		<h4>
7128			<a name="Motivation" href="#Motivation">7.1.1 Motivation</a>
7129		</h4>
7130		<p>User input is frequently messy. Attempting to parse it by
7131			matching it exactly against a pattern is likely to be unsuccessful,
7132			even when the meaning of the input is clear to a human being. For
7133			example, for a date pattern of &quot;MM/dd/yy&quot;, the input
7134			&quot;June 1, 2006&quot; will fail.</p>
7135		<p>The goal of lenient parsing is to accept user input whenever it
7136			is possible to decipher what the user intended. Doing so requires
7137			using patterns as data to guide the parsing process, rather than an
7138			exact template that must be matched. This informative section
7139			suggests some heuristics that may be useful for lenient parsing of
7140			dates, times, and numbers.</p>
7141		<h4>
7142			<a name="Loose_Matching" href="#Loose_Matching">7.1.2 Loose Matching</a>
7143		</h4>
7144		<p>Loose matching ignores attributes of the strings being compared
7145			that are not important to matching. It involves the following steps:</p>
7146		<ul>
7147			<li>Remove &quot;.&quot; from currency symbols and other fields
7148				used for matching, and also from the input string unless:
7149				<ul>
7150					<li>&quot;.&quot; is in the decimal set, and</li>
7151					<li>its position in the input string is immediately before a
7152						decimal digit</li>
7153				</ul>
7154			</li>
7155			<li>Ignore all format characters: in particular, ignore any
7156				RLM, LRM or ALM used to control BIDI formatting.</li>
7157			<li>Ignore all characters in [:Zs:] unless they occur between
7158				letters. (In the heuristics below, even those between letters are
7159				ignored except to delimit fields)</li>
7160			<li>Map all characters in [:Dash:] to U+002D HYPHEN-MINUS</li>
7161			<li>Use the data in the &lt;character-fallback&gt; element to
7162				map equivalent characters (for example, curly to straight
7163				apostrophes). Other apostrophe-like characters should also be
7164				treated as equivalent, especially if the character actually used in
7165				a format may be unavailable on some keyboards. For example:
7166				<ul>
7167					<li>U+02BB MODIFIER LETTER TURNED COMMA (ʻ) might be typed
7168						instead as U+2018 LEFT SINGLE QUOTATION MARK (‘).</li>
7169					<li>U+02BC MODIFIER LETTER APOSTROPHE (ʼ) might be typed
7170						instead as U+2019 RIGHT SINGLE QUOTATION MARK (’), U+0027
7171						APOSTROPHE, etc.</li>
7172					<li>U+05F3 HEBREW PUNCTUATION GERESH (‎׳) might be typed
7173						instead as U+0027 APOSTROPHE.</li>
7174				</ul>
7175			</li>
7176			<li>Apply mappings particular to the domain (i.e., for dates or
7177				for numbers, discussed in more detail below)</li>
7178			<li>Apply case folding (possibly including language-specific
7179				mappings such as Turkish i)</li>
7180			<li>Normalize to NFKC; thus <i>no-break space</i> will map to <i>
7181					space</i>; half-width <i>katakana</i> will map to full-width.
7182			</li>
7183		</ul>
7184		<p>Loose matching involves (logically) applying the above
7185			transform to both the input text and to each of the field elements
7186			used in matching, before applying the specific heuristics below. For
7187			example, if the input number text is &quot; - NA f. 1,000.00&quot;,
7188			then it is mapped to &quot;-naf1,000.00&quot; before processing. The
7189			currency signs are also transformed, so &quot;NA f.&quot; is
7190			converted to &quot;naf&quot; for purposes of matching. As with other
7191			Unicode algorithms, this is a logical statement of the process;
7192			actual implementations can optimize, such as by applying the
7193			transform incrementally during matching.</p>
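		<p>The loose-matching transform can be sketched in Python roughly as follows. This is an informal illustration under several assumptions, not a reference implementation: the Dash property is approximated by general category Pd plus U+2212 MINUS SIGN, the apostrophe table stands in for real character-fallback data, domain-specific mappings are omitted, and all [:Zs:] characters are dropped (as the heuristics do, and as the currency example above requires). The names <code>loose_transform</code> and <code>APOSTROPHE_LIKE</code> are hypothetical.</p>

```python
import unicodedata

# Assumption: a tiny stand-in for character-fallback data, mapping
# apostrophe-like characters to U+0027 APOSTROPHE.
APOSTROPHE_LIKE = {
    "\u2018": "'", "\u2019": "'", "\u02BB": "'", "\u02BC": "'", "\u05F3": "'",
}


def loose_transform(text: str, decimal_set: frozenset = frozenset(".")) -> str:
    """Apply a simplified version of the loose-matching steps to text."""
    kept = []
    for index, ch in enumerate(text):
        category = unicodedata.category(ch)
        # Ignore format characters (RLM, LRM, ALM, ...) and space separators.
        if category in ("Cf", "Zs"):
            continue
        if ch == ".":
            following = text[index + 1] if index + 1 != len(text) else ""
            # Keep "." only as a decimal separator directly before a digit.
            if not (ch in decimal_set and following.isdecimal()):
                continue
        elif category == "Pd" or ch == "\u2212":
            ch = "-"  # approximation of mapping [:Dash:] to HYPHEN-MINUS
        else:
            ch = APOSTROPHE_LIKE.get(ch, ch)
        kept.append(ch)
    # Case folding first, then NFKC, following the order of the list above.
    return unicodedata.normalize("NFKC", "".join(kept).casefold())
```

		<p>With these simplifications, the sample input <code>" - NA f. 1,000.00"</code> is transformed to <code>"-naf1,000.00"</code>, and the currency sign <code>"NA f."</code> to <code>"naf"</code>.</p>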
7194		<h3>
7195			<a name="Invalid_Patterns" href="#Invalid_Patterns">7.2 Handling
7196				Invalid Patterns</a>
7197		</h3>
7198		<p>Processes sometimes encounter invalid number or
7199			date patterns, such as a number pattern with “¤¤¤¤¤” (valid pattern
7200			character but invalid length in current CLDR), a date pattern with
7201			“nn” (invalid pattern character in current CLDR), or a date pattern
7202			with “MMMMMM” (invalid length in current CLDR). The recommended
7203			behavior for handling such an invalid pattern field is:</p>
7204		<ul>
7205			<li>For a field using a currently-invalid length for a valid
7206				pattern character:
7207				<ul>
7208					<li>In <strong>formatting, </strong>emit U+FFFD REPLACEMENT
7209						CHARACTER for the invalid field.
7210					</li>
7211					<li>In <strong>parsing, </strong>the field may be parsed as if
7212						it had a valid length.
7213					</li>
7214				</ul>
7215			</li>
7216			<li>For a pattern that contains a currently-invalid pattern
7217				character (applies only to date patterns, for which A-Za-z are
7218				reserved as pattern characters but not all defined as valid):
7219				<ul>
7220					<li>Produce an error (set an error code or throw an exception)
7221						when an attempt is made to create a formatter with such a pattern
7222						or to apply such a pattern to an existing formatter.</li>
7223				</ul>
7224			</li>
7225		</ul>
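		<p>A minimal Python sketch of the recommended behavior for date patterns follows. The table <code>SAMPLE_MAX_LENGTH</code> is a hypothetical, deliberately tiny subset of pattern characters and is not taken from CLDR data; a real implementation would cover every valid pattern character. Valid fields are shown as placeholders in braces rather than actually formatted, and quote handling is simplified.</p>

```python
import itertools
from typing import Dict, Optional

REPLACEMENT = "\uFFFD"

# Hypothetical subset: pattern character -> maximum valid length
# (None means no upper limit). Not a complete or authoritative table.
SAMPLE_MAX_LENGTH: Dict[str, Optional[int]] = {"y": None, "M": 5, "d": 2}


def format_skeleton(pattern: str,
                    max_length: Dict[str, Optional[int]] = SAMPLE_MAX_LENGTH) -> str:
    """Show what a formatter would emit for each field of a date pattern.

    Raises ValueError for an unquoted ASCII letter that is not a valid
    pattern character; a formatter would call this when it is created
    or when a new pattern is applied to it.
    """
    out = []
    quoted = False
    for ch, run in itertools.groupby(pattern):
        length = len(list(run))
        if ch == "'":
            # Each doubled quote is a literal apostrophe; an odd run toggles quoting.
            out.append("'" * (length // 2))
            if length % 2:
                quoted = not quoted
            continue
        if quoted or not (ch.isascii() and ch.isalpha()):
            out.append(ch * length)  # literal text is copied unchanged
            continue
        if ch not in max_length:
            raise ValueError(f"invalid pattern character {ch!r}")
        limit = max_length[ch]
        if limit is not None and length > limit:
            out.append(REPLACEMENT)  # valid character, invalid length
        else:
            out.append("{" + ch * length + "}")
    return "".join(out)
```

		<p>Under the sample table, <code>format_skeleton("d 'of' MMMMMM")</code> keeps the day field and the literal text but emits U+FFFD for the six-letter month field, while <code>format_skeleton("nn")</code> raises an error.</p>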
7226		<h2>
7227			<a name="Deprecated_Structure" href="#Deprecated_Structure">Annex A
7228				Deprecated Structure</a>
7229		</h2>
		<p>The deprecated elements, attributes, and values are listed in
			the supplementalMetadata.xml file, under &lt;deprecatedItems&gt;.
			While such structure is still valid LDML, its use is strongly
			discouraged, and it is no longer used in CLDR.</p>
7234		<p>The remainder of this section describes selected cases of
7235			deprecated structure that were present in previous versions of CLDR.
7236		</p>
7237		<h3>
7238			<a name="Fallback_Elements" href="#Fallback_Elements">A.1 Element
7239				fallback</a>
7240		</h3>
7241		<p class="dtd">&lt;!ELEMENT fallback (#PCDATA) &gt;</p>
7242		<p>
7243			The fallback element is deprecated. Implementations should use
7244			instead the information in <em><a href="#LanguageMatching">Section
7245					4.4 Language Matching</a></em> for doing language fallback.
7246		</p>
7247		<h3>
7248			<a name="BCP47_Keyword_Mapping" href="#BCP47_Keyword_Mapping">A.2
7249				BCP 47 Keyword Mapping</a>
7250		</h3>
7251
7252		<p>
7253			<b>Note:</b> <i>This structure is deprecated and replaced with <a
7254				href="#Unicode_Locale_Extension_Data_Files">Section 3.6.4 U
7255					Extension Data Files</a>.
7256			</i>
7257		</p>
7258
7259		<p class="dtd">
7260			&lt;!ELEMENT bcp47KeywordMappings ( mapKeys?, mapTypes* ) &gt;<br>
7261			&lt;!ELEMENT mapKeys ( keyMap* ) &gt;<br> &lt;!ELEMENT keyMap
7262			EMPTY &gt;<br> &lt;!ATTLIST keyMap type NMTOKEN #REQUIRED &gt;<br>
7263			&lt;!ATTLIST keyMap bcp47 NMTOKEN #REQUIRED &gt;<br>
7264			&lt;!ELEMENT mapTypes ( typeMap* ) &gt;<br> &lt;!ATTLIST
7265			mapTypes type NMTOKEN #REQUIRED &gt;<br> &lt;!ELEMENT typeMap
7266			EMPTY &gt;<br> &lt;!ATTLIST typeMap type CDATA #REQUIRED &gt;<br>
7267			&lt;!ATTLIST typeMap bcp47 NMTOKEN #REQUIRED &gt;<br>
7268		</p>
7269		<p>
7270			This section defines mappings between old Unicode locale identifier
7271			key/type values and their BCP 47 'u' extension subtag
7272			representations. The 'u' extension syntax described in <a
7273				href="#u_Extension">Section 3.6 Unicode BCP 47 U Extension</a>
7274			restricts a key to two ASCII alphanumerics and a type to three to
7275			eight ASCII alphanumerics. A key or a type which does not meet that
7276			syntax requirement is converted according to the mapping data defined
7277			by the mapKeys or mapTypes elements. For example, a keyword
7278			"collation=phonebook" is converted to BCP 47 'u' extension subtags
7279			"co-phonebk" by the mapping data below:
7280		</p>
7281		<pre>    &lt;mapKeys&gt;
7282        ...
7283        &lt;keyMap type="collation" bcp47="co"/&gt;
7284        ...
7285    &lt;/mapKeys&gt;
7286    &lt;mapTypes type="collation"&gt;
7287        ...
7288        &lt;typeMap type="phonebook" bcp47="phonebk"/&gt;
7289        ...
7290    &lt;/mapTypes&gt;
7291	</pre>
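		<p>The conversion can be sketched in Python as follows. The two dictionaries hold only the entries from the example above, standing in for the full mapKeys and mapTypes data; the function name <code>to_bcp47_subtags</code> is hypothetical. Keys and types without a mapping are passed through unchanged, on the assumption that they already meet the syntax requirement.</p>

```python
# Entries taken from the mapKeys / mapTypes example; real data has many more.
KEY_MAP = {"collation": "co"}
TYPE_MAP = {"collation": {"phonebook": "phonebk"}}


def to_bcp47_subtags(keyword: str) -> str:
    """Convert an old-style 'key=type' keyword to 'u' extension subtags."""
    key, _, value = keyword.partition("=")
    bcp47_key = KEY_MAP.get(key, key)
    # Type mappings are looked up under the old key, as in mapTypes type="...".
    bcp47_type = TYPE_MAP.get(key, {}).get(value, value)
    return f"{bcp47_key}-{bcp47_type}"


result = to_bcp47_subtags("collation=phonebook")
```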
7292		<h3>
7293			<a name="Choice_Patterns" href="#Choice_Patterns">A.3 Choice
7294				Patterns</a>
7295		</h3>
7296		<p>
7297			<b>Note:</b> <i>This structure is deprecated and replaced with
7298				count attributes.</i>
7299		</p>
7300		<p>A choice pattern is a string that chooses among a number of
7301			strings, based on numeric value. It has the following form:</p>
		<p>
			&lt;choice_pattern&gt; = &lt;choice&gt; ( &#39;|&#39; &lt;choice&gt; )*<br>
			&lt;choice&gt; = &lt;number&gt;&lt;relation&gt;&lt;string&gt;<br>
			&lt;number&gt; = (&#39;+&#39; | &#39;-&#39;)? (&#39;∞&#39; | [0-9]+ (&#39;.&#39; [0-9]+)?)<br>
			&lt;relation&gt; = &#39;&lt;&#39; | &#39;≤&#39;
		</p>
7311		<p>The interpretation of a choice pattern is that given a number
7312			N, the pattern is scanned from right to left, for each choice
7313			evaluating &lt;number&gt; &lt;relation&gt; N. The first choice that
7314			matches results in the corresponding string. If no match is found,
7315			then the first string is used. For example:</p>
7316		<table border="1" cellpadding="0" cellspacing="0">
7317			<tr>
7318				<td width="33%">Pattern</td>
7319				<td width="33%">N</td>
7320				<td width="34%">Result</td>
7321			</tr>
7322			<tr>
7323				<td width="33%" rowspan="4">0≤Rf|1≤Ru|1&lt;Re</td>
7324				<td width="33%">-<font size="3">∞, </font>-3, -1, -0.000001
7325				</td>
7326				<td width="34%">Rf (defaulted to first string)</td>
7327			</tr>
7328			<tr>
7329				<td width="33%">0, 0.01, 0.9999</td>
7330				<td width="34%">Rf</td>
7331			</tr>
7332			<tr>
7333				<td width="33%">1</td>
7334				<td width="34%">Ru</td>
7335			</tr>
7336			<tr>
7337				<td width="33%">1.00001, 5, 99, <font size="3">∞</font></td>
7338				<td width="34%">Re</td>
7339			</tr>
7340		</table>
7341		<p>Quoting is done using &#39; characters, as in date or number
7342			formats.</p>
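		<p>The right-to-left scan described above can be sketched in Python as follows. This is a simplified illustration: quoting with &#39; characters is not handled, so strings may not contain &#39;|&#39;, and the names <code>parse_choices</code> and <code>choose</code> are hypothetical.</p>

```python
import math
import re
from typing import List, Tuple

# number, relation, string; the number may be signed and may be the infinity sign.
CHOICE = re.compile(r"([+-]?(?:∞|[0-9]+(?:\.[0-9]+)?))([<≤])(.*)", re.S)


def parse_choices(pattern: str) -> List[Tuple[float, str, str]]:
    """Split a choice pattern into (number, relation, string) triples."""
    choices = []
    for part in pattern.split("|"):  # simplification: no quoting support
        match = CHOICE.fullmatch(part)
        if match is None:
            raise ValueError(f"malformed choice: {part!r}")
        number, relation, text = match.groups()
        sign = -1.0 if number.startswith("-") else 1.0
        digits = number.lstrip("+-")
        value = sign * (math.inf if digits == "∞" else float(digits))
        choices.append((value, relation, text))
    return choices


def choose(pattern: str, n: float) -> str:
    """Return the string selected by the choice pattern for the number n."""
    choices = parse_choices(pattern)
    for value, relation, text in reversed(choices):  # scan right to left
        matched = value < n if relation == "<" else value <= n
        if matched:
            return text
    return choices[0][2]  # no match: default to the first string
```

		<p>Applied to the pattern in the table above, <code>choose</code> returns Rf for -3 and for 0.01, Ru for 1, and Re for 5.</p>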
7343		<h3>
7344			<a name="Element_default" href="#Element_default">A.4 Element
7345				default</a>
7346		</h3>
7347		<p>
7348			<b>Note:</b> <i>This structure is deprecated. </i> Use replacement
7349			structure instead, for example:
7350		</p>
7351		<ul>
7352			<li>For &lt;collations&gt;, now use the &lt;defaultCollation&gt;
7353				element.</li>
7354			<li>For &lt;calendars&gt;, the default calendar type for a
7355				locale is now specified by <i><a
7356					href="tr35-dates.html#Calendar_Preference_Data">Calendar
7357						Preference Data</a></i>.
7358			</li>
7359		</ul>
7360		<p>In some cases, a number of elements are present. The default
7361			element can be used to indicate which of them is the default, in the
7362			absence of other information. The value of the choice attribute is to
7363			match the value of the type attribute for the selected item.</p>
7364		<pre>&lt;timeFormats&gt;
7365  &lt;default choice=&quot;<span style="color: red">medium</span>&quot; /&gt;
7366  &lt;timeFormatLength type=&quot;<span style="color: blue">full</span>&quot;&gt;
7367    &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
7368      &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
7369				style="color: blue">h:mm:ss a z</span>&lt;/pattern&gt;
7370    &lt;/timeFormat&gt;
7371  &lt;/timeFormatLength&gt;
7372  &lt;timeFormatLength type=&quot;<span style="color: blue">long</span>&quot;&gt;
7373    &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
7374      &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
7375				style="color: blue">h:mm:ss a z</span>&lt;/pattern&gt;
7376    &lt;/timeFormat&gt;
7377  &lt;/timeFormatLength&gt;
7378  &lt;timeFormatLength type=&quot;<span style="color: red">medium</span>&quot;&gt;
7379    &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
7380      &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
7381				style="color: blue">h:mm:ss a</span>&lt;/pattern&gt;
7382    &lt;/timeFormat&gt;
7383  &lt;/timeFormatLength&gt;
7384...</pre>
7385		<p>Like all other elements, the &lt;default&gt; element is
7386			inherited. Thus, it can also refer to inherited resources. For
7387			example, suppose that the above resources are present in fr, and that
7388			in fr_BE we have the following:</p>
7389		<pre>&lt;timeFormats&gt;
7390  &lt;default choice=&quot;<span style="color: red">long</span>&quot;/&gt;
7391&lt;/timeFormats&gt;</pre>
7392		<p>In that case, the default time format for fr_BE would be the
7393			inherited &quot;long&quot; resource from fr. Now suppose that we had
7394			in fr_CA:</p>
7395		<pre>  &lt;timeFormatLength type=&quot;<span style="color: red">medium</span>&quot;&gt;
7396    &lt;timeFormat type=&quot;<span style="color: blue">standard</span>&quot;&gt;
7397      &lt;pattern type=&quot;<span style="color: blue">standard</span>&quot;&gt;<span
7398				style="color: blue">...</span>&lt;/pattern&gt;
7399    &lt;/timeFormat&gt;
7400  &lt;/timeFormatLength&gt;
7401    </pre>
7402		<p>In this case, the &lt;default&gt; is inherited from fr, and has
7403			the value &quot;medium&quot;. It thus refers to this new
7404			&quot;medium&quot; pattern in this resource bundle.</p>
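		<p>The three cases can be modeled with a short Python sketch. The dictionary <code>RESOURCES</code> is a simplified stand-in for the fr, fr_BE, and fr_CA data in the examples (the elided fr_CA pattern is kept as "..."), and finding the parent by truncating at the last underscore is an assumption made for brevity, not the full inheritance algorithm.</p>

```python
from typing import Callable, Optional

# Each locale lists only what it defines itself.
RESOURCES = {
    "fr": {
        "default": "medium",
        "patterns": {
            "full": "h:mm:ss a z",
            "long": "h:mm:ss a z",
            "medium": "h:mm:ss a",
        },
    },
    "fr_BE": {"default": "long", "patterns": {}},
    "fr_CA": {"patterns": {"medium": "..."}},
}


def parent(locale: str) -> Optional[str]:
    """Assumption: the parent is obtained by dropping the last '_' subtag."""
    head, separator, _ = locale.rpartition("_")
    return head if separator else None


def inherited(locale: Optional[str], lookup: Callable[[dict], Optional[str]]) -> Optional[str]:
    """Return the first value found while walking up the parent chain."""
    while locale is not None:
        value = lookup(RESOURCES.get(locale, {}))
        if value is not None:
            return value
        locale = parent(locale)
    return None


def default_time_pattern(locale: str) -> Optional[str]:
    """Resolve the default choice, then the pattern of that type, by inheritance."""
    choice = inherited(locale, lambda res: res.get("default"))
    return inherited(locale, lambda res: res.get("patterns", {}).get(choice))
```

		<p>In this model, fr resolves to its own "medium" pattern, fr_BE to the "long" pattern inherited from fr, and fr_CA to its own "medium" pattern selected by the default inherited from fr.</p>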
7405		<h3>
7406			<a name="Deprecated_Common_Attributes"
7407				href="#Deprecated_Common_Attributes">A.5 Deprecated Common
7408				Attributes</a>
7409		</h3>
7410		<h4>
7411			<a name="Attribute_standard" href="#Attribute_standard">A.5.1 Attribute standard</a>
7412		</h4>
7413		<p class="element2">
7414			<b>Note: </b>This attribute is deprecated. Instead, use a reference
7415			element with the attribute standard=&quot;true&quot;.
7416		</p>
7417		<p>The value of this attribute is a list of strings representing
7418			standards: international, national, organization, or vendor
7419			standards. The presence of this attribute indicates that the data in
7420			this element is compliant with the indicated standards. Where
7421			possible, for uniqueness, the string should be a URL that represents
7422			that standard. The strings are separated by commas; leading or
7423			trailing spaces on each string are not significant. Examples:</p>
7424		<p>
7425			<code>
7426				&lt;collation standard=&quot;<span style="color: blue">MSA
7427					200:2002</span>&quot;&gt;<br> ...<br> &lt;dateFormatStyle
				standard=&quot;http://www.iso.ch/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=26780&amp;amp;ICS1=1&amp;amp;ICS2=140&amp;amp;ICS3=30&quot;&gt;
7429			</code>
7430		</p>
7431
7432		<h4>
7433			<a name="Attribute_draft_nonLeaf" href="#Attribute_draft_nonLeaf">A.5.2
7434				Attribute draft in non-leaf elements</a>
7435		</h4>
		<p>The draft attribute is deprecated except in
			leaf elements (elements that do not have any subelements).</p>
7438
7439		<h3>
7440			<a name="Element_base" href="#Element_base">A.6 Element base</a>
7441		</h3>
7442		<p>
7443			<b>Note:</b> <i>This element is deprecated.</i> Use the collation
7444			&lt;import&gt; element instead.
7445		</p>
7446		<p>
			The optional base element
			<code>
				&lt;base&gt;<span style="color: blue">...</span>&lt;/base&gt;
			</code>
			contains an alias element that points to another data source that
7452			defines a <i>base </i>collation. If present, it indicates that the
7453			settings and rules in the collation are modifications applied on <i>top
7454				of the</i> respective elements in the base collation. That is, any
7455			successive settings, where present, override what is in the base as
7456			described in <a href="tr35-collation.html#Setting_Options">Setting
7457				Options</a>. Any successive rules are concatenated to the end of the
			rules in the base. The result of multiple rules applying to the same
			characters is covered in <a href="tr35-collation.html#Orderings">Orderings</a>.
7460		</p>
7461
7462		<h3>
7463			<a name="Element_rules" href="#Element_rules">A.7 Element rules</a>
7464		</h3>
7465		<p>
7466			<b>Note:</b> <i>The XML collation syntax is deprecated; this
7467				includes the &lt;rules&gt; element and its subelements, except that
7468				the &lt;import&gt; element has been moved up to be a subelement of
7469				&lt;collation&gt;.</i> Use the basic collation syntax with the <a
7470				href="tr35-collation.html#Rules">&lt;cr&gt; element</a> instead.
7471		</p>
7472		<p class="dtd">&lt;!ELEMENT rules (alias | ( ( reset | import ), (
7473			reset | import | p | pc | s | sc | t | tc | i | ic | x)* )) &gt;</p>
7474
7475		<h3>
7476			<a name="Deprecated_subelements_of_dates"
7477				href="#Deprecated_subelements_of_dates">A.8 Deprecated
7478				subelements of &lt;dates&gt;</a>
7479		</h3>
7480		<ul>
7481			<li>&lt;localizedPatternChars&gt;</li>
7482			<li>&lt;dateRangePattern&gt;, replaced by
7483				&lt;intervalFormats&gt;.</li>
7484		</ul>
7485
7486		<h3>
7487			<a name="Deprecated_subelements_of_calendars"
7488				href="#Deprecated_subelements_of_calendars">A.9 Deprecated
7489				subelements of &lt;calendars&gt;</a>
7490		</h3>
7491		<ul>
7492			<li>&lt;monthNames&gt; and &lt;monthAbbr&gt;; month name forms
7493				are specified in the &lt;months&gt; element. The older monthNames,
7494				monthAbbr are equivalent to: using the months element with the
7495				context type=&quot;<span style="color: blue">format</span>&quot; and
7496				the width type=&quot;<span style="color: blue">wide</span>&quot;
7497				(for ...Names) and type=&quot;<span style="color: blue">narrow</span>&quot;
7498				(for ...Abbr), respectively.
7499			</li>
7500			<li>&lt;dayNames&gt; and &lt;dayAbbr&gt;; weekday name forms are
7501				specified in the &lt;days&gt; element. The older dayNames, dayAbbr
7502				are equivalent to: using the days element with the context
7503				type=&quot;<span style="color: blue">format</span>&quot; and the
7504				width type=&quot;<span style="color: blue">wide</span>&quot; (for
7505				...Names) and type=&quot;<span style="color: blue">narrow</span>&quot;
7506				(for ...Abbr), respectively.
7507			</li>
7508			<li><a name="week" href="#week">&lt;week&gt;</a> is deprecated
7509				in the main LDML files, because the data is more appropriately
7510				organized as connected to territories, not to linguistic data. Use
7511				the supplemental &lt;weekData&gt; element instead.</li>
7512			<li>&lt;am&gt; and &lt;pm&gt;; these are now included as part of
7513				the &lt;dayPeriods&gt; element</li>
			<li>&lt;fields&gt; is deprecated as a subelement of
				&lt;calendars&gt;; instead, a &lt;fields&gt; element should be
7516				located just under a &lt;dates&gt; element. See <a
7517				href="tr35-dates.html#Calendar_Fields">Calendar Fields</a>.
7518			</li>
7519		</ul>
7520
7521		<h3>
7522			<a name="Deprecated_subelements_of_timeZoneNames"
7523				href="#Deprecated_subelements_of_timeZoneNames">A.10 Deprecated
7524				subelements of &lt;timeZoneNames&gt;</a>
7525		</h3>
7526		<ul>
7527			<li>&lt;hoursFormat&gt; e.g. &quot;{0}/{1}&quot; for
7528				&quot;-0800/-0700&quot;</li>
7529			<li><a name="fallbackRegionFormat" href="#fallbackRegionFormat">&lt;fallbackRegionFormat&gt;</a>
7530				(deprecated), e.g. &quot;{0} Time ({1})&quot; for &quot;United
7531				States Time (New York)&quot;</li>
7532			<li>&lt;abbreviationFallback&gt;</li>
7533			<li>&lt;preferenceOrdering&gt;, a preference ordering among
7534				modern zones; use metazones instead.</li>
7535			<li>&lt;singleCountries&gt;, use <a
7536				href="tr35-dates.html#Primary_Zones">Primary Zones</a></li>
7537		</ul>
7538
7539		<h3>
7540			<a name="Deprecated_subelements_of_zone_metazone"
7541				href="#Deprecated_subelements_of_zone_metazone">A.11 Deprecated
7542				subelements of &lt;zone&gt; and &lt;metazone&gt;</a>
7543		</h3>
7544		<ul>
7545			<li>&lt;commonlyUsed&gt;, formerly used to indicate whether a
7546				zone was commonly used in the locale.</li>
7547		</ul>
7548
7549		<h3>
7550			<a name="Renamed_attribute_values_for_contextTransformUsage"
7551				href="#Renamed_attribute_values_for_contextTransformUsage">A.12
7552				Renamed attribute values for &lt;contextTransformUsage&gt; element</a>
7553		</h3>
7554		<p>
7555			The &lt;contextTransformUsage&gt; element was introduced in CLDR 21.
7556			The values for its <em>type</em> attribute are documented in <a
7557				href="tr35-general.html#contextTransformUsage_type_attribute_values">
7558				&lt;contextTransformUsage&gt; type attribute values</a>. In CLDR 25,
7559			some of these values were renamed from their previous values for
7560			improved clarity:
7561		</p>
7562		<ul>
7563			<li>"type" was renamed to "keyValue"</li>
7564			<li>"displayName" was renamed to "currencyName"</li>
7565			<li>"displayName-count" was renamed to "currencyName-count"</li>
7566			<li>"tense" was renamed to "relative"</li>
7567		</ul>
7568
7569		<h3>
7570			<a name="Deprecated_subelements_of_segmentations"
7571				href="#Deprecated_subelements_of_segmentations">A.13 Deprecated
7572				subelements of &lt;segmentations&gt;</a>
7573		</h3>
7574		<ul>
			<li>&lt;exceptions&gt; and &lt;exception&gt; were deprecated
				and replaced with &lt;suppressions&gt; and &lt;suppression&gt;.</li>
7577		</ul>
7578		<h3>
7579			<a name="Element_cp" href="#Element_cp">A.14 Element cp</a>
7580		</h3>
7581		<p>The cp element was used to escape characters that cannot be
7582			represented in XML, even with NCRs. These escapes were only allowed
7583			in certain elements, according to the DTD.</p>
7584		<p>However, this mechanism is very clumsy, and was replaced by
7585			specialized syntax.</p>
7586		<table>
7587			<tr>
7588				<th>Code Point</th>
7589				<th>XML Example</th>
7590			</tr>
7591			<tr>
7592				<td><code>U+0000</code></td>
7593				<td><code>&lt;cp hex=&quot;0&quot;&gt;</code></td>
7594			</tr>
7595		</table>
7596		<p>&nbsp;</p>
7597		<h3>
7598			<a name="validSubLocales" href="#validSubLocales">A.15 Attribute
7599				validSubLocales</a>
7600		</h3>
7601		<p>
7602			The attribute <i>validSubLocales</i> allowed sublocales in a given
7603			tree to be treated as though a file for them were present when there
7604			was not one. It only had an effect for locales that inherit from the
7605			current file where a file is missing.
7606		</p>
7607		<p>
			<b>Example 1. </b>Suppose that in a particular LDML tree, there are
			no region locales for German: there is a de.xml file, but no
			de_AT.xml, de_CH.xml, or de_DE.xml file. Then no elements are valid
			for any of those region locales. If we want to mark one of those
			locales as having valid elements, then we introduce an empty file,
			such as the following.
7614		</p>
7615		<p>
7616			<code>
7617				&lt;ldml version=&quot;1.1&quot;&gt;<br> &nbsp;&lt;identity&gt;<br>
7618				&nbsp; &lt;version number=&quot;1.1&quot; /&gt; <br> &nbsp; &lt;language type=&quot;de&quot; /&gt; <br> &nbsp;
7619				&lt;territory type=&quot;AT&quot; /&gt; <br>
7620				&nbsp;&lt;/identity&gt;<br> &lt;/ldml&gt;
7621			</code>
7622		</p>
7623		<p>
7624			With the <i>validSubLocales</i> attribute, instead of adding the
7625			empty files for de_AT.xml, de_CH.xml, and de_DE.xml, in the de file
7626			we could add to the parent locale a list of the child locales that
7627			should behave as if files were present.
7628		</p>
7629		<p>
7630			<code>
7631				&lt;ldml version=&quot;1.1&quot; validSubLocales=&quot;de_AT de_CH
7632				de_DE&quot;&gt;<br> &nbsp;&lt;identity&gt;<br> &nbsp;
7633				&lt;version number=&quot;1.1&quot; /&gt; <br> &nbsp;
7634				&lt;language type=&quot;de&quot; /&gt; <br>
7635				&nbsp;&lt;/identity&gt;<br> ...<br> &lt;/ldml&gt;
7636			</code>
7637		</p>
7638		<p>
7639			Now that the <i>validSubLocales</i> attribute has been deprecated, it
7640			is recommended to simply add empty files to specify which sublocales
7641			are valid. This convention is used throughout the CLDR.
7642		</p>
7643		<h3>
7644			<a name="postCodeElements" href="#postCodeElements">A.16 Elements
7645				postalCodeData, postCodeRegex</a>
7646		</h3>
7647		<p>The postal code validation data has been deprecated. Please see
7648			other services that are kept up to date, such as:</p>
7649		<ul>
7650			<li><a href="http://i18napis.appspot.com/address/data/US">http://i18napis.appspot.com/address/data/US</a></li>
7651			<li><a href="http://i18napis.appspot.com/address/data/CH">http://i18napis.appspot.com/address/data/CH</a></li>
7652			<li>...</li>
7653		</ul>
7654		<p>
7655			See <a href="tr35-info.html#Postal_Code_Validation">Postal Code
7656				Validation</a>
7657		</p>
7658
7659		<h3>
7660			<a name="telephoneCodeData" href="#telephoneCodeData">A.17 Element
7661				telephoneCodeData</a>
7662		</h3>
7663		<p>The element &lt;telephoneCodeData&gt; and its subelements have
7664			been deprecated and the data removed.</p>
7665
7666		<hr>
7667		<h2>
7668			<a name="Links_to_Other_Parts" href="#Links_to_Other_Parts">Annex B
7669				Links to Other Parts</a>
7670		</h2>
7671		<p>
7672			The LDML specification is split into several <a href="#Parts">parts</a>
7673			by topic, with one HTML document per part. The following tables
7674			provide redirects for links to specific topics. Please update your
7675			links and bookmarks.
7676		</p>
7677
7678		<p>Part 1 Links: Core (this document): No redirects needed.</p>
7679
7680		<table cellspacing="0" cellpadding="2" border="1" width="100%">
7681			<caption>
7682				<a href="#Part_2_Links" name="Part_2_Links">Part 2 Links</a>: <a
7683					href="tr35-general.html">General</a> (display names &amp;
7684				transforms, etc.)
7685			</caption>
7686			<tr>
7687				<th>Old section</th>
7688				<th>Section in new part</th>
7689			</tr>
7690			<tr>
7691				<td>5.4 <a name="Display_Name_Elements"
7692					href="#Display_Name_Elements">Display Name Elements</a></td>
7693				<td>1 <a href="tr35-general.html#Display_Name_Elements">Display
7694						Name Elements</a></td>
7695			</tr>
7696			<tr>
7697				<td>5.5 <a name="Layout_Elements" href="#Layout_Elements">Layout
7698						Elements</a></td>
7699				<td>2 <a href="tr35-general.html#Layout_Elements">Layout
7700						Elements</a></td>
7701			</tr>
7702			<tr>
7703				<td>5.6 <a name="Character_Elements" href="#Character_Elements">Character
7704						Elements</a></td>
7705				<td>3 <a href="tr35-general.html#Character_Elements">Character
7706						Elements</a></td>
7707			</tr>
7708			<tr>
7709				<td>5.6.1 <a name="ExemplarSyntax" href="#ExemplarSyntax">Exemplar
7710						Syntax</a></td>
7711				<td>3.1 <a href="tr35-general.html#ExemplarSyntax">Exemplar
7712						Syntax</a></td>
7713			</tr>
7714			<tr>
7715				<td>5.6.2 Restrictions</td>
7716				<td>3.1 <a href="tr35-general.html#ExemplarSyntax">Exemplar
7717						Syntax</a></td>
7718			</tr>
7719			<tr>
7720				<td>5.6.3 Mapping</td>
7721				<td>3.2 <a href="tr35-general.html#Character_Mapping">Mapping</a></td>
7722			</tr>
7723			<tr>
7724				<td>5.6.4 <a name="IndexLabels" href="#IndexLabels">Index
7725						Labels</a></td>
7726				<td>3.3 <a href="tr35-general.html#IndexLabels">Index
7727						Labels</a></td>
7728			</tr>
7729			<tr>
7730				<td>5.6.5 Ellipsis</td>
7731				<td>3.4 <a href="tr35-general.html#Ellipsis">Ellipsis</a></td>
7732			</tr>
7733			<tr>
7734				<td>5.6.6 More Information</td>
7735				<td>3.5 <a href="tr35-general.html#Character_More_Info">More
7736						Information</a></td>
7737			</tr>
7738			<tr>
7739				<td>5.7 <a name="Delimiter_Elements" href="#Delimiter_Elements">Delimiter
7740						Elements</a></td>
7741				<td>4 <a href="tr35-general.html#Delimiter_Elements">Delimiter
7742						Elements</a></td>
7743			</tr>
7744			<tr>
7745				<td>C.6 <a name="Measurement_System_Data"
7746					href="#Measurement_System_Data">Measurement System Data</a></td>
7747				<td>5 <a href="tr35-general.html#Measurement_System_Data">Measurement
7748						System Data</a></td>
7749			</tr>
7750			<tr>
7751				<td>5.8 <a name="Measurement_Elements"
7752					href="#Measurement_Elements">Measurement Elements (deprecated)</a></td>
7753				<td>5.1 <a href="tr35-general.html#Measurement_Elements">Measurement
7754						Elements (deprecated)</a></td>
7755			</tr>
7756			<tr>
7757				<td>5.11 <a name="Unit_Elements" href="#Unit_Elements">Unit
7758						Elements</a></td>
7759				<td>6 <a href="tr35-general.html#Unit_Elements">Unit
7760						Elements</a></td>
7761			</tr>
7762			<tr>
7763				<td>5.12 <a name="POSIX_Elements" href="#POSIX_Elements">POSIX
7764						Elements</a></td>
7765				<td>7 <a href="tr35-general.html#POSIX_Elements">POSIX
7766						Elements</a></td>
7767			</tr>
7768			<tr>
7769				<td>5.13 <a name="Reference_Elements"
7770					href="#Reference_Elements">Reference Element</a></td>
7771				<td>8 <a href="tr35-general.html#Reference_Elements">Reference
7772						Element</a></td>
7773			</tr>
7774			<tr>
7775				<td>5.15 <a name="Segmentations" href="#Segmentations">Segmentations</a></td>
7776				<td>9 <a href="tr35-general.html#Segmentations">Segmentations</a></td>
7777			</tr>
7778			<tr>
7779				<td>5.15.1 <a name="Segmentation_Inheritance"
7780					href="#Segmentation_Inheritance">Segmentation Inheritance</a></td>
7781				<td>9.1 <a href="tr35-general.html#Segmentation_Inheritance">Segmentation
7782						Inheritance</a></td>
7783			</tr>
7784			<tr>
7785				<td>5.16 <a name="Transforms" href="#Transforms">Transforms</a></td>
7786				<td>10 <a href="tr35-general.html#Transforms">Transforms</a></td>
7787			</tr>
7788			<tr>
7789				<td>N <a name="Transform_Rules" href="#Transform_Rules">Transform
7790						Rules</a></td>
7791				<td>10.3 <a href="tr35-general.html#Transform_Rules_Syntax">Transform
7792						Rules Syntax</a></td>
7793			</tr>
7794			<tr>
7795				<td>5.18 <a name="ListPatterns" href="#ListPatterns">List
7796						Patterns</a></td>
7797				<td>11 <a href="tr35-general.html#ListPatterns">List
7798						Patterns</a></td>
7799			</tr>
7800			<tr>
7801				<td>C.20 <a name="List_Gender" href="#List_Gender">Gender
7802						of Lists</a></td>
7803				<td>11.1 <a href="tr35-general.html#List_Gender">Gender of
7804						Lists</a></td>
7805			</tr>
7806			<tr>
7807				<td>5.19 <a name="Context_Transform_Elements"
7808					href="#Context_Transform_Elements">ContextTransform Elements</a></td>
7809				<td>12 <a href="tr35-general.html#Context_Transform_Elements">ContextTransform
7810						Elements</a></td>
7811			</tr>
7816		</table>
7817
7818
7819		<table cellspacing="0" cellpadding="2" border="1" width="100%">
7820			<caption>
7821				<a href="#Part_3_Links" name="Part_3_Links">Part 3 Links</a>: <a
7822					href="tr35-numbers.html">Numbers</a> (number &amp; currency
7823				formatting)
7824			</caption>
7825			<tr>
7826				<th>Old section</th>
7827				<th>Section in new part</th>
7828			</tr>
7829			<tr>
7830				<td>C.13 <a name="Numbering_Systems" href="#Numbering_Systems">Numbering
7831						Systems</a></td>
7832				<td>1 <a href="tr35-numbers.html#Numbering_Systems">Numbering
7833						Systems</a></td>
7834			</tr>
7835			<tr>
7836				<td>5.10 <a name="Number_Elements" href="#Number_Elements">Number
7837						Elements</a></td>
7838				<td>2 <a href="tr35-numbers.html#Number_Elements">Number
7839						Elements</a></td>
7840			</tr>
7841			<tr>
7842				<td>5.10.1 <a name="Number_Symbols" href="#Number_Symbols">Number
7843						Symbols</a></td>
7844				<td>2.3 <a href="tr35-numbers.html#Number_Symbols">Number
7845						Symbols</a></td>
7846			</tr>
7847			<tr>
7848				<td>G <a name="Number_Format_Patterns"
7849					href="#Number_Format_Patterns">Number Format Patterns</a></td>
7850				<td>3 <a href="tr35-numbers.html#Number_Format_Patterns">Number
7851						Format Patterns</a></td>
7852			</tr>
7853			<tr>
7854				<td>5.10.2 <a name="Currencies" href="#Currencies">Currencies</a></td>
7855				<td>4 <a href="tr35-numbers.html#Currencies">Currencies</a></td>
7856			</tr>
7857			<tr>
7858				<td>C.1 <a name="Supplemental_Currency_Data"
7859					href="#Supplemental_Currency_Data">Supplemental Currency Data</a></td>
7860				<td>4.1 <a href="tr35-numbers.html#Supplemental_Currency_Data">Supplemental
7861						Currency Data</a></td>
7862			</tr>
7863			<tr>
7864				<td>C.11 <a name="Language_Plural_Rules"
7865					href="#Language_Plural_Rules">Language Plural Rules</a></td>
7866				<td>5 <a href="tr35-numbers.html#Language_Plural_Rules">Language
7867						Plural Rules</a></td>
7868			</tr>
7869			<tr>
7870				<td>5.17 <a name="Rule-Based_Number_Formatting"
7871					href="#Rule-Based_Number_Formatting">Rule-Based Number
7872						Formatting</a></td>
7873				<td>6 <a href="tr35-numbers.html#Rule-Based_Number_Formatting">Rule-Based
7874						Number Formatting</a></td>
7875			</tr>
7876		</table>
7877
7878
7879		<table cellspacing="0" cellpadding="2" border="1" width="100%">
7880			<caption>
7881				<a href="#Part_4_Links" name="Part_4_Links">Part 4 Links</a>: <a
7882					href="tr35-dates.html">Dates</a> (date, time, time zone formatting)
7883			</caption>
7884			<tr>
7885				<th>Old section</th>
7886				<th>Section in new part</th>
7887			</tr>
7888			<tr>
7889				<td><a name="Date_Elements" href="#Date_Elements">5.9 Date
7890						Elements</a></td>
7891				<td>1 <a
7892					href="tr35-dates.html#Overview_Dates_Element_Supplemental">Overview:
7893						Dates Element, Supplemental Date and Calendar Information</a></td>
7894			</tr>
7895			<tr>
7896				<td><a name="Calendar_Elements" href="#Calendar_Elements">5.9.1
7897						Calendar Elements</a></td>
7898				<td>2 <a href="tr35-dates.html#Calendar_Elements">Calendar
7899						Elements</a></td>
7900			</tr>
7901			<tr>
7902				<td><a name="months_days_quarters_eras"
7903					href="#months_days_quarters_eras">Elements months, days,
7904						quarters, eras</a></td>
7905				<td>2.1 <a href="tr35-dates.html#months_days_quarters_eras">Elements
7906						months, days, quarters, eras</a></td>
7907			</tr>
7908			<tr>
7909				<td><a name="monthPatterns_cyclicNameSets"
7910					href="#monthPatterns_cyclicNameSets">Elements monthPatterns,
7911						cyclicNameSets</a></td>
7912				<td>2.2 <a href="tr35-dates.html#monthPatterns_cyclicNameSets">Elements
7913						monthPatterns, cyclicNameSets</a></td>
7914			</tr>
7915			<tr>
7916				<td><a name="dayPeriods" href="#dayPeriods">Element
7917						dayPeriods</a></td>
7918				<td>2.3 <a href="tr35-dates.html#dayPeriods">Element
7919						dayPeriods</a></td>
7920			</tr>
7921			<tr>
7922				<td><a name="dateFormats" href="#dateFormats">Element
7923						dateFormats</a></td>
7924				<td>2.4 <a href="tr35-dates.html#dateFormats">Element
7925						dateFormats</a></td>
7926			</tr>
7927			<tr>
7928				<td><a name="timeFormats" href="#timeFormats">Element
7929						timeFormats</a></td>
7930				<td>2.5 <a href="tr35-dates.html#timeFormats">Element
7931						timeFormats</a></td>
7932			</tr>
7933			<tr>
7934				<td><a name="dateTimeFormats" href="#dateTimeFormats">Element
7935						dateTimeFormats</a></td>
7936				<td>2.6 <a href="tr35-dates.html#dateTimeFormats">Element
7937						dateTimeFormats</a></td>
7938			</tr>
7939			<tr>
7940				<td><a name="Calendar_Fields" href="#Calendar_Fields">5.9.2
7941						Calendar Fields</a></td>
7942				<td>3 <a href="tr35-dates.html#Calendar_Fields">Calendar
7943						Fields</a></td>
7944			</tr>
7945			<tr>
7946				<td>5.9.3 <a name="Timezone_Names" href="#Timezone_Names">Time
7947						Zone Names</a></td>
7948				<td>5 <a href="tr35-dates.html#Time_Zone_Names">Time Zone
7949						Names</a></td>
7950			</tr>
7951			<tr>
7952				<td><a name="Supplemental_Calendar_Data"
7953					href="#Supplemental_Calendar_Data">C.5 Supplemental Calendar
7954						Data</a></td>
7955				<td>4 <a href="tr35-dates.html#Supplemental_Calendar_Data">Supplemental
7956						Calendar Data</a></td>
7957			</tr>
7958			<tr>
7959				<td><a name="Supplemental_Timezone_Data"
7960					href="#Supplemental_Timezone_Data">C.7 Supplemental Time Zone
7961						Data</a></td>
7962				<td>6 <a href="tr35-dates.html#Supplemental_Time_Zone_Data">Supplemental
7963						Time Zone Data</a></td>
7964			</tr>
7965			<tr>
7966				<td><a name="Calendar_Preference_Data"
7967					href="#Calendar_Preference_Data">C.15 Calendar Preference Data</a></td>
7968				<td>4.2 <a href="tr35-dates.html#Calendar_Preference_Data">Calendar
7969						Preference Data</a></td>
7970			</tr>
7971			<tr>
7972				<td><a name="DayPeriodRules" href="#DayPeriodRules">C.17
7973						DayPeriod Rules</a></td>
7974				<td>4.5 <a href="tr35-dates.html#Day_Period_Rules">Day
7975						Period Rules</a></td>
7976			</tr>
7977			<tr>
7978				<td><a name="Date_Format_Patterns" href="#Date_Format_Patterns">Appendix
7979						F: Date Format Patterns</a></td>
7980				<td>8 <a href="tr35-dates.html#Date_Format_Patterns">Date
7981						Format Patterns</a></td>
7982			</tr>
7983			<tr>
7984				<td><a name="Date_Field_Symbol_Table"
7985					href="#Date_Field_Symbol_Table">Date Field Symbol Table</a></td>
7986				<td><a href="tr35-dates.html#Date_Field_Symbol_Table">Date
7987						Field Symbol Table</a></td>
7988			</tr>
7989			<tr>
7990				<td><a name="Localized_Pattern_Characters"
7991					href="#Localized_Pattern_Characters">F.1 Localized Pattern
7992						Characters (deprecated)</a></td>
7993				<td>8.1 <a href="tr35-dates.html#Localized_Pattern_Characters">Localized
7994						Pattern Characters (deprecated)</a></td>
7995			</tr>
7996			<tr>
7997				<td><a name="Time_Zone_Fallback" href="#Time_Zone_Fallback">Appendix
7998						J: Time Zone Display Names</a></td>
7999				<td>7 <a href="tr35-dates.html#Using_Time_Zone_Names">Using
8000						Time Zone Names</a></td>
8001			</tr>
8002			<tr>
8003				<td><a name="fallbackFormat" href="#fallbackFormat"><b>fallbackFormat</b>:</a></td>
8004				<td><a href="tr35-dates.html#fallbackFormat"><b>fallbackFormat</b>:</a></td>
8005			</tr>
8006			<tr>
8007				<td>O.4 Parsing Dates and Times</td>
8008				<td>9 <a href="tr35-dates.html#Parsing_Dates_Times">Parsing
8009						Dates and Times</a></td>
8010			</tr>
8011		</table>
8012
8013
8014		<table cellspacing="0" cellpadding="2" border="1" width="100%">
8015			<caption>
8016				<a href="#Part_5_Links" name="Part_5_Links">Part 5 Links</a>: <a
8017					href="tr35-collation.html">Collation</a> (sorting, searching,
8018				grouping)
8019			</caption>
8020			<tr>
8021				<th>Old section</th>
8022				<th>Section in new part</th>
8023			</tr>
8024			<tr>
8025				<td>5.14 <a name="Collation_Elements"
8026					href="#Collation_Elements">Collation Elements</a></td>
8027				<td>3 <a href="tr35-collation.html#Collation_Tailorings">Collation
8028						Tailorings</a></td>
8029			</tr>
8030			<tr>
8031				<td>5.14.1 <a name="Collation_Version"
8032					href="#Collation_Version">Version</a></td>
8033				<td>3.1 <a href="tr35-collation.html#Collation_Version">Version</a></td>
8034			</tr>
8035			<tr>
8036				<td>5.14.2 <a name="Collation_Element"
8037					href="#Collation_Element">Collation Element</a></td>
8038				<td>3.2 <a href="tr35-collation.html#Collation_Element">Collation
8039						Element</a></td>
8040			</tr>
8041			<tr>
8042				<td>5.14.3 <a name="Setting_Options" href="#Setting_Options">Setting
8043						Options</a></td>
8044				<td>3.3 <a href="tr35-collation.html#Setting_Options">Setting
8045						Options</a></td>
8046			</tr>
8047			<tr>
8048				<td>Table <a name="Collation_Settings"
8049					href="#Collation_Settings">Collation Settings</a></td>
8050				<td>Table <a href="tr35-collation.html#Collation_Settings">Collation
8051						Settings</a></td>
8052			</tr>
8053			<tr>
8054				<td>5.14.4 <a name="Rules" href="#Rules">Collation Rule
8055						Syntax</a></td>
8056				<td>3.4 <a href="tr35-collation.html#Rules">Collation Rule
8057						Syntax</a></td>
8058			</tr>
8059			<tr>
8060				<td>5.14.5 <a name="Orderings" href="#Orderings">Orderings</a></td>
8061				<td>3.5 <a href="tr35-collation.html#Orderings">Orderings</a></td>
8062			</tr>
8063			<tr>
8064				<td>5.14.6 <a name="Contractions" href="#Contractions">Contractions</a></td>
8065				<td>3.6 <a href="tr35-collation.html#Contractions">Contractions</a></td>
8066			</tr>
8067			<tr>
8068				<td>5.14.7 <a name="Expansions" href="#Expansions">Expansions</a></td>
8069				<td>3.7 <a href="tr35-collation.html#Expansions">Expansions</a></td>
8070			</tr>
8071			<tr>
8072				<td>5.14.8 <a name="Context_Before" href="#Context_Before">Context
8073						Before</a></td>
8074				<td>3.8 <a href="tr35-collation.html#Context_Before">Context
8075						Before</a></td>
8076			</tr>
8077			<tr>
8078				<td>5.14.9 <a name="Placing_Characters_Before_Others"
8079					href="#Placing_Characters_Before_Others">Placing Characters
8080						Before Others</a></td>
8081				<td>3.9 <a
8082					href="tr35-collation.html#Placing_Characters_Before_Others">Placing
8083						Characters Before Others</a></td>
8084			</tr>
8085			<tr>
8086				<td>5.14.10 <a name="Logical_Reset_Positions"
8087					href="#Logical_Reset_Positions">Logical Reset Positions</a></td>
8088				<td>3.10 <a href="tr35-collation.html#Logical_Reset_Positions">Logical
8089						Reset Positions</a></td>
8090			</tr>
8091			<tr>
8092				<td>5.14.11 <a name="Special_Purpose_Commands"
8093					href="#Special_Purpose_Commands">Special-Purpose Commands</a></td>
8094				<td>3.11 <a href="tr35-collation.html#Special_Purpose_Commands">Special-Purpose
8095						Commands</a></td>
8096			</tr>
8097			<tr>
8098				<td>5.14.12 <a name="Script_Reordering"
8099					href="#Script_Reordering">Collation Reordering</a></td>
8100				<td>3.12 <a href="tr35-collation.html#Script_Reordering">Collation
8101						Reordering</a></td>
8102			</tr>
8103			<tr>
8104				<td>5.14.13 <a name="Case_Parameters" href="#Case_Parameters">Case
8105						Parameters</a></td>
8106				<td>3.13 <a href="tr35-collation.html#Case_Parameters">Case
8107						Parameters</a></td>
8108			</tr>
8109			<tr>
8110				<td>Definition: <a name="UncasedExceptions"
8111					href="#UncasedExceptions">UncasedExceptions</a></td>
8112				<td>removed: see 3.13 <a
8113					href="tr35-collation.html#Case_Parameters">Case Parameters</a></td>
8114			</tr>
8115			<tr>
8116				<td>Definition: <a name="LowerExceptions"
8117					href="#LowerExceptions">LowerExceptions</a></td>
8118				<td>removed: see 3.13 <a
8119					href="tr35-collation.html#Case_Parameters">Case Parameters</a></td>
8120			</tr>
8121			<tr>
8122				<td>Definition: <a name="UpperExceptions"
8123					href="#UpperExceptions">UpperExceptions</a></td>
8124				<td>removed: see 3.13 <a
8125					href="tr35-collation.html#Case_Parameters">Case Parameters</a></td>
8126			</tr>
8127			<tr>
8128				<td>5.14.14 <a name="Visibility" href="#Visibility">Visibility</a></td>
8129				<td>3.14 <a href="tr35-collation.html#Visibility">Visibility</a></td>
8130			</tr>
8131		</table>
8132
8133		<table cellspacing="0" cellpadding="2" border="1" width="100%">
8134			<caption>
8135				<a href="#Part_6_Links" name="Part_6_Links">Part 6 Links</a>: <a
8136					href="tr35-info.html">Supplemental</a> (supplemental data)
8137			</caption>
8138			<tr>
8139				<th>Old section</th>
8140				<th>Section in new part</th>
8141			</tr>
8142
8143			<tr>
8144				<td>C <a name="Supplemental_Data" href="#Supplemental_Data">Supplemental
8145						Data</a></td>
8146				<td>Introduction <a href="tr35-info.html#Supplemental_Data">Supplemental
8147						Data</a></td>
8148			</tr>
8149
8150			<tr>
8151				<td>C.2 <a name="Supplemental_Territory_Containment"
8152					href="#Supplemental_Territory_Containment">Supplemental
8153						Territory Containment</a></td>
8154				<td>1.1 <a
8155					href="tr35-info.html#Supplemental_Territory_Containment">Supplemental
8156						Territory Containment</a></td>
8157			</tr>
8158			<tr>
8159				<td>C.4 <a name="Supplemental_Territory_Information"
8160					href="#Supplemental_Territory_Information">Supplemental
8161						Territory Information</a></td>
8162				<td>1.2 <a
8163					href="tr35-info.html#Supplemental_Territory_Information">Supplemental
8164						Territory Information</a></td>
8165			</tr>
8166			<tr>
8167				<td>C.3 <a name="Supplemental_Language_Data"
8168					href="#Supplemental_Language_Data">Supplemental Language Data</a></td>
8169				<td>2 <a href="tr35-info.html#Supplemental_Language_Data">Supplemental
8170						Language Data</a></td>
8171			</tr>
8172			<tr>
8173				<td>C.9 <a name="Supplemental_Code_Mapping"
8174					href="#Supplemental_Code_Mapping">Supplemental Code Mapping</a></td>
8175				<td>4 <a href="tr35-info.html#Supplemental_Code_Mapping">Supplemental
8176						Code Mapping</a></td>
8177			</tr>
8178			<tr>
8179				<td>C.12 <a name="Telephone_Code_Data"
8180					href="#Telephone_Code_Data">Telephone Code Data</a></td>
8181				<td>5 <a href="tr35-info.html#Telephone_Code_Data">Telephone
8182						Code Data</a></td>
8183			</tr>
8184			<tr>
8185				<td>C.14 <a name="Postal_Code_Validation"
8186					href="#Postal_Code_Validation">Postal Code Validation</a></td>
8187				<td>6 <a href="tr35-info.html#Postal_Code_Validation">Postal
8188						Code Validation</a></td>
8189			</tr>
8190			<tr>
8191				<td>C.8 <a name="Supplemental_Character_Fallback_Data"
8192					href="#Supplemental_Character_Fallback_Data">Supplemental
8193						Character Fallback Data</a></td>
8194				<td>7 <a
8195					href="tr35-info.html#Supplemental_Character_Fallback_Data">Supplemental
8196						Character Fallback Data</a></td>
8197			</tr>
8198			<tr>
8199				<td>M <a name="Coverage_Levels" href="#Coverage_Levels">Coverage
8200						Levels</a></td>
8201				<td>8 <a href="tr35-info.html#Coverage_Levels">Coverage
8202						Levels</a></td>
8203			</tr>
8204			<tr>
8205				<td>5.20 <a name="Metadata_Elements"
8206					href="tr35-info.html#Metadata_Elements">Metadata Elements</a></td>
8207				<td>10 <a href="tr35-info.html#Metadata_Elements">Locale
8208						Metadata Element</a></td>
8209			</tr>
8210			<tr>
8211				<td>P <a name="Appendix_Supplemental_Metadata"
8212					href="tr35-info.html#Appendix_Supplemental_Metadata">Supplemental
8213						Metadata</a><br> P.1 <a name="Supplemental_Alias_Information"
8214					href="tr35-info.html#Supplemental_Alias_Information">Supplemental
8215						Alias Information</a><br> P.2 <a
8216					name="Supplemental_Deprecated_Information"
8217					href="tr35-info.html#Supplemental_Deprecated_Information">Supplemental
8218						Deprecated Information</a><br> P.3 <a name="Default_Content"
8219					href="tr35-info.html#Default_Content">Default Content</a>
8220				</td>
8221				<td>9 <a href="tr35-info.html#Appendix_Supplemental_Metadata">Supplemental
8222						Metadata</a> <br> 9.1 <a
8223					href="tr35-info.html#Supplemental_Alias_Information">Supplemental
8224						Alias Information</a><br> 9.2 <a
8225					href="tr35-info.html#Supplemental_Deprecated_Information">Supplemental
8226						Deprecated Information</a><br> 9.3 <a
8227					href="tr35-info.html#Default_Content">Default Content</a>
8228				</td>
8229			</tr>
8230		</table>
8231
8232		<table cellspacing="0" cellpadding="2" border="1" width="100%">
8233			<caption>
8234				<a href="#Part_7_Links" name="Part_7_Links">Part 7 Links</a>: <a
8235					href="tr35-keyboards.html">Keyboards</a> (keyboard mappings)
8236			</caption>
8237			<tr>
8238				<th>Old section</th>
8239				<th>Section in new part</th>
8240			</tr>
8241
8242			<tr>
8243				<td>S <a name="Keyboards" href="#Keyboards">Keyboards</a></td>
8244				<td>1 <a href="tr35-keyboards.html#Keyboards">Keyboards</a></td>
8245			</tr>
8246
8247			<tr>
8248				<td>S <a name="Goals_and_Nongoals" href="#Goals_and_Nongoals">Goals
8249						and Nongoals</a></td>
8250				<td><a href="tr35-keyboards.html#Goals_and_Nongoals">Goals
8251						and Nongoals</a></td>
8252			</tr>
8253
8254			<tr>
8255				<td>S <a name="File_and_Dir_Structure"
8256					href="#File_and_Dir_Structure">File and Directory Structure</a></td>
8257				<td><a href="tr35-keyboards.html#File_and_Dir_Structure">File
8258						and Directory Structure</a></td>
8259			</tr>
8260
8261			<tr>
8262				<td>S <a name="Element_Heirarchy_Layout_File"
8263					href="#Element_Heirarchy_Layout_File">Element Hierarchy -
8264						Layout File</a></td>
8265				<td><a href="tr35-keyboards.html#Element_Heirarchy_Layout_File">Element
8266						Hierarchy - Layout File</a></td>
8267			</tr>
8268
8269			<tr>
8270				<td>S <a name="Element_Heirarchy_Platform_File"
8271					href="#Element_Heirarchy_Platform_File">Element Hierarchy -
8272						Platform File</a></td>
8273				<td><a
8274					href="tr35-keyboards.html#Element_Heirarchy_Platform_File">Element
8275						Hierarchy - Platform File</a></td>
8276			</tr>
8277
8278			<tr>
8279				<td>S <a name="Invariants" href="#Invariants">Invariants</a></td>
8280				<td><a href="tr35-keyboards.html#Invariants">Invariants</a></td>
8281			</tr>
8282
8283			<tr>
8284				<td>S <a name="Data_Sources" href="#Data_Sources">Data
8285						Sources</a></td>
8286				<td><a href="tr35-keyboards.html#Data_Sources">Data Sources</a></td>
8287			</tr>
8288
8289			<tr>
8290				<td>S <a name="Keyboard_IDs" href="#Keyboard_IDs">Keyboard
8291						IDs</a></td>
8292				<td><a href="tr35-keyboards.html#Keyboard_IDs">Keyboard IDs</a></td>
8293			</tr>
8294
8295			<tr>
8296				<td>S <a name="Platform_Behaviors_in_Edge_Cases"
8297					href="#Platform_Behaviors_in_Edge_Cases">Platform Behaviors in
8298						Edge Cases</a></td>
8299				<td><a
8300					href="tr35-keyboards.html#Platform_Behaviors_in_Edge_Cases">Platform
8301						Behaviors in Edge Cases</a></td>
8302			</tr>
8303
8304			<tr>
8305				<td>S <a name="Element_Keyboard" href="#Element_Keyboard">Element:
8306						keyboard</a></td>
8307				<td><a href="tr35-keyboards.html#Element_Keyboard">Element:
8308						keyboard</a></td>
8309			</tr>
8310
8311			<tr>
8312				<td>S <a name="Element_version" href="#Element_version">Element:
8313						version</a></td>
8314				<td><a href="tr35-keyboards.html#Element_version">Element:
8315						version</a></td>
8316			</tr>
8317
8318			<tr>
8319				<td>S <a name="Element_generation" href="#Element_generation">Element:
8320						generation</a></td>
8321				<td><a href="tr35-keyboards.html#Element_generation">Element:
8322						generation</a></td>
8323			</tr>
8324
8325			<tr>
8326				<td>S <a name="Element_names" href="#Element_names">Element:
8327						names</a></td>
8328				<td><a href="tr35-keyboards.html#Element_names">Element:
8329						names</a></td>
8330			</tr>
8331
8332			<tr>
8333				<td>S <a name="Element_name" href="#Element_name">Element:
8334						name</a></td>
8335				<td><a href="tr35-keyboards.html#Element_name">Element:
8336						name</a></td>
8337			</tr>
8338
8339			<tr>
8340				<td>S <a name="Element_settings" href="#Element_settings">Element:
8341						settings</a></td>
8342				<td><a href="tr35-keyboards.html#Element_settings">Element:
8343						settings</a></td>
8344			</tr>
8345
8346			<tr>
8347				<td>S <a name="Element_keyMap" href="#Element_keyMap">Element:
8348						keyMap</a></td>
8349				<td><a href="tr35-keyboards.html#Element_keyMap">Element:
8350						keyMap</a></td>
8351			</tr>
8352
8353			<tr>
8354				<td>S <a name="Element_map" href="#Element_map">Element:
8355						map</a></td>
8356				<td><a href="tr35-keyboards.html#Element_map">Element: map</a></td>
8357			</tr>
8358
8359			<tr>
8360				<td>S <a name="Element_transforms" href="#Element_transforms">Element:
8361						transforms</a></td>
8362				<td><a href="tr35-keyboards.html#Element_transforms">Element:
8363						transforms</a></td>
8364			</tr>
8365
8366			<tr>
8367				<td>S <a name="Element_transform" href="#Element_transform">Element:
8368						transform</a></td>
8369				<td><a href="tr35-keyboards.html#Element_transform">Element:
8370						transform</a></td>
8371			</tr>
8372
8373			<tr>
8374				<td>S <a name="Element_platform" href="#Element_platform">Element:
8375						platform</a></td>
8376				<td><a href="tr35-keyboards.html#Element_platform">Element:
8377						platform</a></td>
8378			</tr>
8379
8380			<tr>
8381				<td>S <a name="Element_hardwareMap" href="#Element_hardwareMap">Element:
8382						hardwareMap</a></td>
8383				<td><a href="tr35-keyboards.html#Element_hardwareMap">Element:
8384						hardwareMap</a></td>
8385			</tr>
8386
8387			<tr>
8388				<td>S <a name="Principles_for_Keyboard_Ids"
8389					href="#Principles_for_Keyboard_Ids">Principles for Keyboard Ids</a></td>
8390				<td><a href="tr35-keyboards.html#Principles_for_Keyboard_Ids">Principles
8391						for Keyboard Ids</a></td>
8392			</tr>
8393
8394		</table>
8395		<hr>
8396		<h2>
8397			<a name="References" href="#References">References</a>
8398		</h2>
8399		<table cellpadding="4" cellspacing="0" class="noborder" border="0">
8400			<tr>
8401				<th class="noborder" width="148">Ancillary Information</th>
8402				<td class="noborder" width="730"><i>Properly localizing,
8403						parsing, and formatting data requires ancillary information that is
8404						not expressed in Locale Data Markup Language. Some of the formats
8405						for values used in Locale Data Markup Language are constructed
8406						according to external specifications. The sources for this data
8407						and these formats include the following:<br> &nbsp;
8408				</i></td>
8409			</tr>
8410			<tr>
8411				<td class="noborder" width="148">[<a name="Bugs" href="#Bugs">Bugs</a>]
8412				</td>
8413				<td class="noborder" width="730">CLDR Bug Reporting form<br>
8414					<a href="http://cldr.unicode.org/index/bug-reports">
8415						http://cldr.unicode.org/index/bug-reports</a></td>
8416			</tr>
8417			<tr>
8418				<td class="noborder" width="148">[<a name="Charts"
8419					href="#Charts">Charts</a>]
8420				</td>
8421				<td class="noborder" width="730">The online code charts can be
8422					found at <a href="http://unicode.org/charts/">http://unicode.org/charts/</a>.
8423					An index to character names with links to the corresponding chart
8424					is found at <a href="http://unicode.org/charts/charindex.html">http://unicode.org/charts/charindex.html</a>.
8425				</td>
8426			</tr>
8427			<tr>
8428				<td class="noborder" width="148">[<a name="DUCET" href="#DUCET">DUCET</a>]
8429				</td>
8430				<td class="noborder" width="730">The Default Unicode Collation
8431					Element Table (DUCET)<br> Defines the base-level collation, of
8432					which all the collation tables in this document are tailorings.<br>
8433					<a
8434					href="http://unicode.org/reports/tr10/#Default_Unicode_Collation_Element_Table">http://unicode.org/reports/tr10/#Default_Unicode_Collation_Element_Table</a>
8435				</td>
8436			</tr>
8437			<tr>
8438				<td class="noborder" width="148">[<a name="FAQ" href="#FAQ">FAQ</a>]
8439				</td>
8440				<td class="noborder" valign="top" width="730">Unicode
8441					Frequently Asked Questions<br> <a
8442					href="http://unicode.org/faq/">http://unicode.org/faq/</a><br>
8443				<i>For answers to common questions on technical issues.</i>
8444				</td>
8445			</tr>
8446			<tr>
8447				<td class="noborder" width="148">[<a name="FCD" href="#FCD">FCD</a>]
8448				</td>
8449				<td class="noborder" width="730">As defined in UTN #5, Canonical
8450					Equivalence in Applications<br> <a
8451					href="http://unicode.org/notes/tn5/">http://unicode.org/notes/tn5/</a>
8452				</td>
8453			</tr>
8454			<tr>
8455				<td class="noborder" width="148">[<a name="Glossary"
8456					href="#Glossary">Glossary</a>]
8457				</td>
8458				<td class="noborder" width="730">Unicode Glossary<br> <a
8459					href="http://unicode.org/glossary/">http://unicode.org/glossary/</a><br>
8460						<i>For explanations of
8461						terminology used in this and other documents.</i></td>
8462			</tr>
8463			<tr>
8464				<td class="noborder" width="148">[<a name="JavaChoice"
8465					href="#JavaChoice">JavaChoice</a>]
8466				</td>
8467				<td class="noborder" width="730">Java ChoiceFormat<br> <a
8468					href="http://docs.oracle.com/javase/7/docs/api/java/text/ChoiceFormat.html">
8469						http://docs.oracle.com/javase/7/docs/api/java/text/ChoiceFormat.html</a></td>
8470			</tr>
8471			<tr>
8472				<td class="noborder" width="148">[<a name="Olson" href="#Olson">Olson</a>]
8473				</td>
8474				<td class="noborder" width="730">The <i>TZ</i>ID Database (also
8475					known as the Olson time zone database)<br> Time zone and daylight saving time
8476					information.<br> <a href="http://www.iana.org/time-zones">http://www.iana.org/time-zones</a><br>
8477					For archived data, see <br> <a
8478					href="ftp://ftp.iana.org/tz/releases/">ftp://ftp.iana.org/tz/releases/</a></td>
8479			</tr>
8480			<tr>
8481				<td class="noborder" width="148">[<a name="Reports"
8482					href="#Reports">Reports</a>]
8483				</td>
8484				<td class="noborder" width="730">Unicode Technical Reports<br>
8485					<a href="http://unicode.org/reports/">http://unicode.org/reports/</a><br>
8486				<i>For information on the status and development process for
8487						technical reports, and for a list of technical reports.</i></td>
8488			</tr>
8489			<tr>
8490				<td class="noborder" width="148">[<a name="Unicode"
8491					href="#Unicode">Unicode</a>]
8492				</td>
8493				<td class="noborder" width="730">The Unicode Consortium. <em>The
8494						Unicode Standard, Version 7.0.0</em>,&nbsp;(Mountain View, CA: The
8495					Unicode Consortium, 2014. ISBN 978-1-936213-09-2)<br> <a
8496					href="http://www.unicode.org/versions/Unicode7.0.0/">
8497						http://www.unicode.org/versions/Unicode7.0.0/</a>
8498				</td>
8499			</tr>
8500			<tr>
8501				<td class="noborder" width="148">[<a name="Versions"
8502					href="#Versions">Versions</a>]
8503				</td>
8504				<td class="noborder" width="730">Versions of the Unicode
8505					Standard<br> <a href="http://www.unicode.org/versions/">
8506						http://www.unicode.org/versions/</a><br> <i>For information
8507						on version numbering, and citing and referencing the Unicode
8508						Standard, the Unicode Character Database, and Unicode Technical
8509						Reports.</i>
8510				</td>
8511			</tr>
8512			<tr>
8513				<td class="noborder" width="148">[<a name="XPath" href="#XPath">XPath</a>]
8514				</td>
8515				<td class="noborder" width="730">XML Path Language (XPath)<br> <a
8516					href="http://www.w3.org/TR/xpath/">http://www.w3.org/TR/xpath/</a></td>
8517			</tr>
8518			<tr>
8519				<th class="noborder" width="148">Other Standards</th>
8520				<td class="noborder" width="730"><i>Various standards
8521						define codes that are used as keys or values in Locale Data Markup
8522						Language. These include:</i></td>
8523			</tr>
8524			<tr>
8525				<td class="noborder">[<a name="BCP47" href="#BCP47">BCP47</a>]
8526				</td>
8527				<td class="noborder">Tags for Identifying Languages<br> <a
8528					href="http://www.rfc-editor.org/rfc/bcp/bcp47.txt">
8529						http://www.rfc-editor.org/rfc/bcp/bcp47.txt</a>
8530					<p>
8531						The Registry<br> <a
8532							href="http://www.iana.org/assignments/language-subtag-registry">http://www.iana.org/assignments/language-subtag-registry</a>
8533					</p></td>
8534			</tr>
8535			<tr>
8536				<td class="noborder" width="148">[<a name="ISO639"
8537					href="#ISO639">ISO639</a>]
8538				</td>
8539				<td class="noborder" width="730">ISO Language Codes<br> <a
8540					href="http://www.loc.gov/standards/iso639-2/">http://www.loc.gov/standards/iso639-2/</a><br>
8541					Actual List<br> <a
8542					href="http://www.loc.gov/standards/iso639-2/langcodes.html">http://www.loc.gov/standards/iso639-2/langcodes.html</a></td>
8543			</tr>
8544			<tr>
8545				<td class="noborder" width="148">[<a name="ISO1000"
8546					href="#ISO1000">ISO1000</a>]
8547				</td>
8548				<td class="noborder" width="730">ISO 1000: SI units and
8549					recommendations for the use of their multiples and of certain other
8550					units, International Organization for Standardization, 1992.<br>
8551					<a href="http://www.iso.org/iso/catalogue_detail?csnumber=5448">http://www.iso.org/iso/catalogue_detail?csnumber=5448</a>
8552				</td>
8553			</tr>
8554			<tr>
8555				<td class="noborder" width="148">[<a name="ISO3166"
8556					href="#ISO3166">ISO3166</a>]
8557				</td>
8558				<td class="noborder" width="730">ISO Region Codes<br> <a
8559					href="http://www.iso.org/iso/country_codes">http://www.iso.org/iso/country_codes</a><br>
8560					Actual List<br> <a
8561					href="http://www.iso.org/iso/country_names_and_code_elements">http://www.iso.org/iso/country_names_and_code_elements</a></td>
8562			</tr>
8563			<tr>
8564				<td class="noborder" width="148">[<a name="ISO4217"
8565					href="#ISO4217">ISO4217</a>]
8566				</td>
8567				<td class="noborder" width="730">ISO Currency Codes<br> <a
8568					href="http://www.iso.org/iso/home/standards/currency_codes.htm">http://www.iso.org/iso/home/standards/currency_codes.htm</a>
8569					<p>
8570						<i>(Note that as of this point, there are significant problems
8571							with this list. The supplemental data file contains the best
8572							compendium of currency information available.)</i>
8573					</p>
8574				</td>
8575			</tr>
8576			<tr>
8577				<td class="noborder" width="148">[<a name="ISO8601"
8578					href="#ISO8601">ISO8601</a>]
8579				</td>
8580				<td class="noborder" width="730">ISO Date and Time Format<br>
8581					<a href="http://www.iso.org/iso/iso8601">http://www.iso.org/iso/iso8601</a>
8582				</td>
8583			</tr>
8584			<tr>
8585				<td class="noborder" width="148">[<a name="ISO15924"
8586					href="#ISO15924">ISO15924</a>]
8587				</td>
8588				<td class="noborder" width="730">ISO Script Codes<br> <a
8589					href="http://www.unicode.org/iso15924/standard/index.html">http://www.unicode.org/iso15924/standard/index.html</a><br>
8590					Actual List<br> <a
8591					href="http://www.unicode.org/iso15924/codelists.html">http://www.unicode.org/iso15924/codelists.html</a></td>
8592			</tr>
8593			<tr>
8594				<td class="noborder" width="148">[<a name="LOCODE"
8595					href="#LOCODE">LOCODE</a>]
8596				</td>
				<td class="noborder" width="730">United Nations Code for Trade
					and Transport Locations, commonly known as "UN/LOCODE"<br> <a
					href="http://www.unece.org/cefact/locode/welcome.html">
						http://www.unece.org/cefact/locode/welcome.html</a><br> Download
					at: <a
					href="http://www.unece.org/cefact/codesfortrade/codes_index.htm">http://www.unece.org/cefact/codesfortrade/codes_index.htm</a>
				</td>
8604			</tr>
8605			<tr>
8606				<td class="noborder" width="148">[<a name="RFC6067"
8607					href="#RFC6067">RFC6067</a>]
8608				</td>
8609				<td class="noborder" width="730">BCP 47 Extension U<br> <a
8610					href="http://www.ietf.org/rfc/rfc6067.txt">http://www.ietf.org/rfc/rfc6067.txt</a></td>
8611			</tr>
8612			<tr>
8613				<td class="noborder" width="148">[<a name="RFC6497"
8614					href="#RFC6497">RFC6497</a>]
8615				</td>
8616				<td class="noborder" width="730">BCP 47 Extension T -
8617					Transformed Content<br> <a
8618					href="http://www.ietf.org/rfc/rfc6497.txt">http://www.ietf.org/rfc/rfc6497.txt</a>
8619				</td>
8620			</tr>
8621			<tr>
8622				<td class="noborder" width="148">[<a name="UNM49" href="#UNM49">UNM49</a>]
8623				</td>
8624				<td class="noborder" width="730">UN M.49: UN Statistics
8625					Division
8626					<p>
8627						Country or area &amp; region codes<br> <a
8628							href="http://unstats.un.org/unsd/methods/m49/m49.htm">http://unstats.un.org/unsd/methods/m49/m49.htm</a>
8629					</p>
8630					<p>
8631						Composition of macro geographical (continental) regions,
8632						geographical sub-regions, and selected economic and other
8633						groupings<br> <a
8634							href="http://unstats.un.org/unsd/methods/m49/m49regin.htm">http://unstats.un.org/unsd/methods/m49/m49regin.htm</a>
8635					</p>
8636				</td>
8637			</tr>
8638			<tr>
8639				<td class="noborder" width="148">[<a name="XMLSchema"
8640					href="#XMLSchema">XML Schema</a>]
8641				</td>
8642				<td class="noborder" width="730">W3C XML Schema<br> <a
8643					href="http://www.w3.org/XML/Schema">http://www.w3.org/XML/Schema</a></td>
8644			</tr>
8645			<tr>
8646				<th class="noborder" width="148">General</th>
8647				<td class="noborder" width="730"><i>The following are
8648						general references from the text:</i></td>
8649			</tr>
8650			<tr>
8651				<td class="noborder" width="148">[<a name="ByType"
8652					href="#ByType">ByType</a>]
8653				</td>
8654				<td class="noborder" width="730">CLDR Comparison Charts<br>
8655					<a href="http://www.unicode.org/cldr/comparison_charts.html">http://www.unicode.org/cldr/comparison_charts.html</a></td>
8656			</tr>
8657			<tr>
8658				<td class="noborder" width="148">[<a name="Calendars"
8659					href="#Calendars">Calendars</a>]
8660				</td>
8661				<td class="noborder" width="730">Calendrical Calculations: The
8662					Millennium Edition by Edward M. Reingold, Nachum Dershowitz;
8663					Cambridge University Press; Book and CD-ROM edition (July 1, 2001);
8664					ISBN: 0521777526. Note that the algorithms given in this book are
8665					copyrighted.</td>
8666			</tr>
8667			<tr>
8668				<td class="noborder" width="148">[<a name="Comparisons"
8669					href="#Comparisons">Comparisons</a>]
8670				</td>
8671				<td class="noborder" width="730">Comparisons between locale
8672					data from different sources<br> <a
8673					href="http://unicode.org/cldr/data/diff/">http://unicode.org/cldr/data/diff/</a>
8674				</td>
8675			</tr>
8676			<tr>
8677				<td class="noborder" width="148">[<a name="CurrencyInfo"
8678					href="#CurrencyInfo">CurrencyInfo</a>]
8679				</td>
8680				<td class="noborder" width="730">UNECE Currency Data<br> <a
8681					href="http://www.currency-iso.org/en/home/tables.html">http://www.currency-iso.org/en/home/tables.html</a></td>
8682			</tr>
8683			<tr>
8684				<td class="noborder" width="148">[<a name="DataFormats"
8685					href="#DataFormats">DataFormats</a>]
8686				</td>
8687				<td class="noborder" width="730">CLDR Translation Guidelines<br>
8688					<a href="http://cldr.unicode.org/translation">http://cldr.unicode.org/translation</a></td>
8689			</tr>
8690			<tr>
8691				<td class="noborder" width="148">[<a name="LDML" href="#LDML">Example</a>]
8692				</td>
8693				<td class="noborder" width="730">A sample in Locale Data Markup
8694					Language<br> <a
8695					href="http://unicode.org/cldr/dtd/1.1/ldml-example.xml">http://unicode.org/cldr/dtd/1.1/ldml-example.xml</a>
8696				</td>
8697			</tr>
8698			<tr>
8699				<td class="noborder" width="148">[<a name="ICUCollation"
8700					href="#ICUCollation">ICUCollation</a>]
8701				</td>
8702				<td class="noborder" width="730">ICU rule syntax<br> <a
8703					href="http://www.icu-project.org/userguide/Collate_Customization.html">http://www.icu-project.org/userguide/Collate_Customization.html</a></td>
8704			</tr>
8705			<tr>
8706				<td class="noborder" width="148">[<a name="ICUTransforms"
8707					href="#ICUTransforms">ICUTransforms</a>]
8708				</td>
8709				<td class="noborder" width="730">Transforms<br> <a
8710					href="http://www.icu-project.org/userguide/Transformations.html">http://www.icu-project.org/userguide/Transformations.html</a><br>
8711					Transforms Demo<br> <a
8712					href="http://demo.icu-project.org/icu-bin/translit/">http://demo.icu-project.org/icu-bin/translit/</a></td>
8713			</tr>
8714			<tr>
8715				<td class="noborder" width="148">[<a name="ICUUnicodeSet"
8716					href="#ICUUnicodeSet">ICUUnicodeSet</a>]
8717				</td>
				<td class="noborder" width="730">ICU UnicodeSet<br> <a
					href="http://www.icu-project.org/userguide/unicodeSet.html">http://www.icu-project.org/userguide/unicodeSet.html</a><br>
					API<br> <a
					href="http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html">http://www.icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html</a></td>
8722			</tr>
8723			<tr>
8724				<td class="noborder" width="148">[<a name="ITUE164"
8725					href="#ITUE164">ITUE164</a>]
8726				</td>
8727				<td class="noborder" width="730">International
8728					Telecommunication Union: List Of ITU Recommendation E.164 Assigned
8729					Country Codes<br> available at <a
8730					href="http://www.itu.int/opb/publications.aspx?parent=T-SP&amp;view=T-SP2">http://www.itu.int/opb/publications.aspx?parent=T-SP&amp;view=T-SP2</a>
8731				</td>
8732			</tr>
8733			<tr>
8734				<td class="noborder" width="148">[<a name="LocaleExplorer"
8735					href="#LocaleExplorer">LocaleExplorer</a>]
8736				</td>
8737				<td class="noborder" width="730">ICU Locale Explorer<br> <a
8738					href="http://demo.icu-project.org/icu-bin/locexp">http://demo.icu-project.org/icu-bin/locexp</a></td>
8739			</tr>
8740			<tr>
8741				<td class="noborder" width="148">[<a name="localeProject"
8742					href="#localeProject">LocaleProject</a>]
8743				</td>
8744				<td class="noborder" width="730">Common Locale Data Repository
8745					Project<br> <a href="http://unicode.org/cldr/">http://unicode.org/cldr/</a>
8746				</td>
8747			</tr>
8748			<tr>
8749				<td class="noborder" width="148">[<a name="NamingGuideline"
8750					href="#NamingGuideline">NamingGuideline</a>]
8751				</td>
8752				<td class="noborder" width="730">OpenI18N Locale Naming
8753					Guideline<br> formerly at
8754					http://www.openi18n.org/docs/text/LocNameGuide-V10.txt
8755				</td>
8756			</tr>
8757			<tr>
8758				<td class="noborder" width="148">[<a name="RBNF" href="#RBNF">RBNF</a>]
8759				</td>
8760				<td class="noborder" width="730">Rule-Based Number Format<br>
8761					<a
8762					href="http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html">http://www.icu-project.org/apiref/icu4c/classRuleBasedNumberFormat.html#_details</a></td>
8763			</tr>
8764			<tr>
8765				<td class="noborder" width="148">[<a name="RBBI" href="#RBBI">RBBI</a>]
8766				</td>
8767				<td class="noborder" width="730">Rule-Based Break Iterator<br>
8768					<a
8769					href="http://www.icu-project.org/userguide/boundaryAnalysis.html">http://www.icu-project.org/userguide/boundaryAnalysis.html</a></td>
8770			</tr>
8771			<tr>
8772				<td class="noborder" width="148">[<a name="RFC5234"
8773					href="#RFC5234">RFC5234</a>]
8774				</td>
8775				<td class="noborder" width="730">RFC5234 Augmented BNF for
8776					Syntax Specifications: ABNF<br> <a
8777					href="http://www.ietf.org/rfc/rfc5234.txt">http://www.ietf.org/rfc/rfc5234.txt</a>
8778				</td>
8779			</tr>
8780			<tr>
8781				<td class="noborder" width="148">[<a name="UCAChart"
8782					href="#UCAChart">UCAChart</a>]
8783				</td>
				<td class="noborder" width="730">Collation Chart<br> <a
					href="http://unicode.org/charts/collation/">http://unicode.org/charts/collation/</a></td>
8787			</tr>
8788			<tr>
8789				<td class="noborder" width="148">[<a name="UTCInfo"
8790					href="#UTCInfo">UTCInfo</a>]
8791				</td>
				<td class="noborder" width="730">NIST Time and Frequency
					Division Home Page<br> <a href="http://tf.nist.gov/">http://tf.nist.gov/</a><br>
					U.S. Naval Observatory: What is Universal Time?<br> <a
					href="http://aa.usno.navy.mil/faq/docs/UT.php">http://aa.usno.navy.mil/faq/docs/UT.php</a>
				</td>
8797			</tr>
8798			<tr>
8799				<td class="noborder" width="148">[<a name="WindowsCulture"
8800					href="#WindowsCulture">WindowsCulture</a>]
8801				</td>
				<td class="noborder" width="730">Windows Culture Info
					(with mappings from [<a href="#BCP47">BCP47</a>]-style codes
					to LCIDs)<br> <a
					href="http://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo(vs.71).aspx">http://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo(vs.71).aspx</a>
				</td>
8807			</tr>
8808		</table>
8809		<h2>
8810			<a name="Acknowledgments" href="#Acknowledgments">Acknowledgments</a>
8811		</h2>
8812		<p>Special thanks to the following people for their continuing
8813			overall contributions to the CLDR project, and for their specific
8814			contributions in the following areas. These descriptions only touch
8815			on the many contributions that they have made.</p>
8816		<ul>
8817			<li><a
8818				href="https://plus.google.com/114199149796022210033?rel=author">Mark
8819					Davis</a> for creating the initial version of LDML, and adding to and
8820				maintaining this specification, and for his work on the LDML code
8821				and tests, much of the supplemental data and overall structure, and
8822				transforms and keyboards.</li>
8823			<li>John Emmons for the POSIX conversion tool and metazones.</li>
8824			<li>Deborah Goldsmith for her contributions to LDML architecture
8825				and this specification.</li>
8826			<li>Chris Hansten for coordinating and managing data submissions
8827				and vetting.</li>
8828			<li>Erkki Kolehmainen and his team for their work on Finnish.</li>
8829			<li>Steven R. Loomis for development of the survey tool and
8830				database management.</li>
8831			<li>Peter Nugent for his contributions to the POSIX tool and
8832				from Open Office, and for coordinating and managing data submissions
8833				and vetting.</li>
8834			<li>George Rhoten for his work on currencies.</li>
8835			<li>Roozbeh Pournader (روزبه پورنادر) for his work on South
8836				Asian countries.</li>
8837			<li>Ram Viswanadha (రఘురామ్ విశ్వనాధ) for all of his work on
8838				LDML code and data integration, and for coordinating and managing
8839				data submissions and vetting.</li>
8840			<li>Vladimir Weinstein (Владимир Вајнштајн) for his work on
8841				collation.</li>
8842			<li>Yoshito Umaoka (馬岡 由人) for his work on the timezone
8843				architecture.</li>
8844			<li>Rick McGowan for his work gathering language, script and
8845				region data.</li>
8846			<li>Xiaomei Ji (吉晓梅) for her work on time intervals and plural
8847				formatting.</li>
8848			<li>David Bertoni for his contributions to the conversion tools.</li>
8849			<li>Mike Tardif for reviewing this specification and for
8850				coordinating and vetting data submissions.</li>
8851			<li>Peter Edberg for work on this specification, telephone code
8852				data, monthPatterns, cyclicNameSets and contextTransforms.</li>
8853			<li>Raymond Wainman and Cibu Johny for their work on keyboards.</li>
8854			<li>Jennifer Chye for her contributions to the conversion tools.</li>
8855			<li><a
8856				href="https://plus.google.com/117587389715494866571?rel=author">Markus
8857					Scherer</a> for a major rewrite of Part 5, Collation.</li>
8858		</ul>
8859		<p>
8860			Other contributors to CLDR are listed on the <a
8861				href="http://www.unicode.org/cldr/">CLDR Project Page</a>.
8862		</p>
8863
8864		<h2>
8865			<a name="Modifications" href="#Modifications">Modifications</a>
8866		</h2>
8867
8868<p><b>Revision 53</b></p>
8869<p><strong>Part 1: <a href="tr35.html#Contents">Core</a> (languages,
8870				locales, basic structure)
8871	</strong></p>
8872<ul>
8873  <li><strong>Section 3.2 <a
8874				href="#Unicode_locale_identifier">Unicode Locale Identifier</a></strong>
8875[<a href="http://unicode.org/cldr/trac/ticket/11435">#11435</a>]
8876[<a href="http://unicode.org/cldr/trac/ticket/11434">#11434</a>]
8877<ul>
8878  <li>Fixed cases of "-" in the syntax that should have been <em>sep</em>, and note that &quot;-&quot; is the canonical (preferred) form.</li>
8879  <li>Fixed &quot;u&quot; and &quot;t&quot; in the syntax to [uU] and [tT], resp., to reflect that case is ignored when parsing.</li>
8880  <li>Included specific syntax rather than just noting &quot;Although not shown in the syntax above, Unicode locale identifiers may also have [BCP47] extensions (other than &quot;u&quot; and &quot;t&quot;) and private use subtags.&quot;</li>
  <li>Reformatted and fleshed out the canonical form description; listed where CLDR uses non-canonical forms.</li>
8882  <li>Added missing details about how Unicode Locale Identifiers differ from BCP 47, and how to convert between them.</li>
8883  </ul>
8884  </li>
8885  <li><strong>Section 3.3 <a href="#BCP_47_Conformance">BCP
8886    47 Conformance</a> </strong>
8887<ul>
  <li>Reorganized for clarity and introduced the new terms <em>Unicode BCP 47 locale identifier</em> and <em>Unicode CLDR locale identifier</em>. [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</li>
8889  </ul>
8890  </li>
8891  <li><strong>Section 3.3.1 <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#BCP_47_Language_Tag_Conversion">BCP 47 Language Tag Conversion</a>
8892    [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</strong>
8893    <ul>
8894      <li>Now handles private-use extensions and grandfathered tags.</li>
8895      <li>Added more examples.</li>
8896      <li>Separated into three conversions.
8897        <ul>
8898          <li> <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Language_Tag_to_Locale_Identifier">BCP 47 Language Tag to Unicode BCP 47 Locale Identifier</a>          </li>
8899          <li> <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Unicode_Locale_Identifier_CLDR_to_BCP_47">Unicode Locale Identifier: CLDR to BCP 47</a>          </li>
8900          <li> <a  href="http://unicode.org/repos/cldr/trunk/specs/ldml/tr35.html#Unicode_Locale_Identifier_BCP_47_to_CLDR">Unicode Locale Identifier: BCP 47 to CLDR</a>          </li>
8901        </ul>
8902      </li>
8903      </ul>
8904  </li>
8905  <li><strong>Section 3.4
8906    <a href="#Field_Definitions">Language Identifier Field Definitions </a>
8907    </strong>
8908    <ul>
      <li>Added another macrolanguage example, ku (used for kmr), and a link to the Aliases chart.
      	[<a href="http://unicode.org/cldr/trac/ticket/11470">#11470</a>]</li>
8911      <li>Documented special language subtags mis, mul, zxx [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</li>
8912      <li>Added special script code Qaag [<a href="http://unicode.org/cldr/trac/ticket/11408">#11408</a>]</li>
8913      <li>Documented special region subtags XA and XB [<a href="http://unicode.org/cldr/trac/ticket/11451">#11451</a>]</li>
8914      </ul>
8915  </li>
8916  <li><strong>Section 3.5.3 <a href="#Private_Use">Private Use Codes</a></strong>
8917    <ul>
      <li>Adjusted the table to move Qaag, XA, and XB into <em>defined</em>. XA and XB were already correct in the identity file (a change in a previous release), but had not been added to that table. [<a href="http://unicode.org/cldr/trac/ticket/11408">#11408</a>]</li>
8919      </ul>
8920  </li>
8921  <li><strong>Section 3.6.4 <a href="#Unicode_Locale_Extension_Data_Files" >U Extension Data Files</a>
8922    </strong>
8923    <ul>
8924      <li>Qualified valueType, since a key's value may be empty (if &quot;true&quot;). [<a href="http://unicode.org/cldr/trac/ticket/11408">#11408</a>]</li>
8925  </ul>
8926  </li>
8927  <li><strong>Section 3.6.5.1 <a  href="#Validity">Validity</a></strong>
8928    <ul>
      <li>Softened the requirement that there be a region code matching the first 2 letters of the subdivision code. That was needlessly strict, and introduced a dependency on <em>likely subtags</em> that should not be there. [<a href="http://unicode.org/cldr/trac/ticket/11397">#11397</a>]</li>
8930      </ul>
8931  </li>
8932  <li><strong>Section 4.2.6 <a
8933				href="#Inheritance_vs_Related">Inheritance vs Related Information</a>
8934  </strong>
8935    <ul>
8936      <li>Added table to explain the relationship between Inheritance, DefaultContent, LikelySubtags, and LocaleMatching.</li>
8937  </ul>
8938  </li>
8939  <li><strong>Section 5.3.3
8940    <a href="#Unicode_Sets">Unicode Sets</a>
8941    </strong>
8942    <ul>
8943      <li>Clarified the relation between UnicodeSet and <a
8944				href="http://www.unicode.org/reports/tr41/#UTS18">UTS #18</a> [<a href="http://unicode.org/cldr/trac/ticket/11232">#11232</a>]</li>
8945      </ul>
8946  </li>
8947  </ul>
8948<p><strong>Part 2: <a href="tr35-general.html#Contents">General</a>
8949		(display names &amp; transforms, etc.)
8950	</strong></p>
8951<ul>
8952  <li><strong>Section 6 <a href="tr35-general.html#Unit_Elements">Unit Elements</a> </strong>
8953    <ul>
8954      <li>Added &lt;displayName&gt; element for &lt;coordinateUnit&gt;.
8955        [<a href="http://unicode.org/cldr/trac/ticket/9986">#9986</a>]</li>
8956      <li>Noted that unitPatterns can use explicit count values “0” and “1”.
8957      	[<a href="http://unicode.org/cldr/trac/ticket/10922">#10922</a>]</li>
8958      <li>Defined the syntax  of unit identifiers [<a href="http://unicode.org/cldr/trac/ticket/11271">#11271</a>]</li>
8959      <li>Added several new units: percent and permille, petabyte, and atmosphere.
8960        [<a href="http://unicode.org/cldr/trac/ticket/10632">#10632</a>]
8961        [<a href="http://unicode.org/cldr/trac/ticket/10410">#10410</a>]
8962        [<a href="http://unicode.org/cldr/trac/ticket/10600">#10600</a>]</li>
8963      </ul>
8964  </li>
8965  <li><strong>Section 10.1.1 <a href="tr35-general.html#Pivots">Pivots</a></strong>
8966    <ul>
8967      <li>Described the use of private use characters in Interindic. [<a href="http://unicode.org/cldr/trac/ticket/10962">#10962</a>]</li>
8968    </ul>
8969  </li>
8970  </ul>
8971<p><strong>Part 3: <a href="tr35-numbers.html#Contents">Numbers</a>
8972		(number &amp; currency formatting)
8973	</strong></p>
8974<ul>
8975  <li><strong>Section 2.5 <a href="tr35-numbers.html#Miscellaneous_Patterns">Miscellaneous Patterns</a></strong>
8976    <ul>
8977      <li>Documented <strong>approximately</strong> and <strong>atMost</strong>. [<a href="http://unicode.org/cldr/trac/ticket/11354">#11354</a>]</li>
8978      </ul>
8979  </li>
  <li><strong>Section 3.2 <a
				href="tr35-numbers.html#Special_Pattern_Characters">Special Pattern Characters</a></strong>
8983    <ul>
8984      <li>Documented edge cases for negative subpatterns (and whitespace)  [<a href="http://unicode.org/cldr/trac/ticket/10703">#10703</a>]</li>
8985      </ul>
8986  </li>
8987  <li><strong>Section 3.4 <a href="tr35-numbers.html#sci">Scientific Notation</a> </strong>
8988    <ul>
      <li>Specified the special formats used for the integer parts. [<a href="http://unicode.org/cldr/trac/ticket/10103">#10103</a>]</li>
8990    </ul>
8991  </li>
8992  <li><strong>Section 5 <a href="tr35-numbers.html#Language_Plural_Rules">Language Plural Rules</a></strong>
8993    <ul>
8994      <li>Added a new section <a href="tr35-numbers.html#Explicit_0_1_rules">Explicit 0 and
8995        1 rules</a> covering the language-independent explicit plural cases “0” and “1”.
8996        [<a href="http://unicode.org/cldr/trac/ticket/10922">#10922</a>]</li>
8997      </ul>
8998  </li>
8999  </ul>
9000
9001<p><strong>Part 4: <a href="tr35-dates.html#Contents">Dates</a> (date,
9002				time, time zone formatting)
9003	</strong></p>
9004<ul>
9005  <li><strong>Section 2.6.3 <a  href="tr35-dates.html#intervalFormats">Element intervalFormats</a></strong>
9006    <ul>
9007      <li>Described how to synthesize intervalFormatItems for skeletons that combine date and time fields.
9008      	[<a href="http://unicode.org/cldr/trac/ticket/10133">#10133</a>]  </li>
9009    </ul>
9010  </li>
9011  <li><strong>Section 4.4 <a  href="tr35-dates.html#Time_Data">Time Data</a></strong>
9012    <ul>
9013      <li>Documented the relation between @allowed and @preferred. [<a href="http://unicode.org/cldr/trac/ticket/9930">#9930</a>]</li>
9014    </ul>
9015  </li>
9016</ul>
9017<p><strong>Part 5: <a href="tr35-collation.html#Contents">Collation</a>
9018		(sorting, searching, grouping)
9019	</strong></p>
9020<ul>
9021  <li><em>no changes</em></li>
9022</ul>
9023<p><strong>Part 6: <a href="tr35-info.html#Contents">Supplemental</a>
9024		(supplemental data)
9025	</strong></p>
9026<ul>
9027  <li> <strong>Section 4 <a href="tr35-info.html#Supplemental_Code_Mapping">Supplemental
9028  		Code Mapping</a></strong>
9029    <ul>
9030      <li>For the element &lt;territoryCodes&gt;, deprecated the internet attribute.
9031      	[<a href="http://unicode.org/cldr/trac/ticket/11072">#11072</a>]</li>
9032    </ul>
9033  </li>
9034
9035  <li> <strong>Section 5 <a href="tr35-info.html#Telephone_Code_Data">Telephone
9036				Code Data</a></strong>
9037    <ul>
9038      <li>Now deprecated, and data removed. [<a href="http://unicode.org/cldr/trac/ticket/10383">#10383</a>]</li>
9039    </ul>
9040  </li>
9041
9042  <li> <strong>Section 9.3 <a href="tr35-info.html#Default_Content">Default
9043				Content</a></strong>
9044    <ul>
9045      <li>Added pointer to <strong>Section 4.2.6 <a
9046				href="#Inheritance_vs_Related">Inheritance vs Related Information</a> </strong></li>
9047  </ul>
9048  </li>
9049</ul>
9050<p><strong>Part 7: <a href="tr35-keyboards.html#Contents">Keyboards</a>
9051		(keyboard mappings)
9052	</strong>	  </p>
9053	<ul>
9054  <li><em>no changes</em></li>
9055</ul>
9056
9057
9058<p>&nbsp;</p>
9059
9060
9061	  <p>Modifications in previous versions are listed in those respective versions. Click on <strong>Previous Version</strong> in the header until you get to the desired version.</p>
9062
9063		<hr>
9064		<p class="copyright">
9065			Copyright © 2001–2018 Unicode, Inc. All
9066			Rights Reserved. The Unicode Consortium makes no expressed or implied
9067			warranty of any kind, and assumes no liability for errors or
9068			omissions. No liability is assumed for incidental and consequential
9069			damages in connection with or arising out of the use of the
9070			information or programs contained or accompanying this technical
9071			report. The Unicode <a href="http://unicode.org/copyright.html">Terms
9072				of Use</a> apply.
9073		</p>
9074		<p class="copyright">Unicode and the Unicode logo are trademarks
9075			of Unicode, Inc., and are registered in some jurisdictions.</p>
9076	</div>
9077
9078</body>
9079
9080</html>
9081