## Unicode Technical Standard #35

# Unicode Locale Data Markup Language (LDML)<br/>Part 3: Numbers

<!-- HTML: no th -->
<table><tbody>
<tr><td>Version</td><td>40</td></tr>
<tr><td>Editors</td><td>Shane F. Carr (<a href="mailto:shane@unicode.org">shane@unicode.org</a>) and <a href="tr35.html#Acknowledgments">other CLDR committee members</a></td></tr>
</tbody></table>

For the full header, summary, and status, see [Part 1: Core](tr35.md).

### _Summary_

This document describes parts of an XML format (_vocabulary_) for the exchange of structured locale data. This format is used in the [Unicode Common Locale Data Repository](https://unicode.org/cldr/).

This is a partial document, describing only those parts of the LDML that are relevant for number and currency formatting. For the other parts of the LDML see the [main LDML document](tr35.md) and the links above.

### _Status_

_This is a draft document which may be updated, replaced, or superseded by other documents at any time. Publication does not imply endorsement by the Unicode Consortium. This is not a stable document; it is inappropriate to cite this document as other than a work in progress._

> _**A Unicode Technical Standard (UTS)** is an independent specification. Conformance to the Unicode Standard does not imply conformance to any UTS._

_Please submit corrigenda and other comments with the CLDR bug reporting form [[Bugs](tr35.md#Bugs)]. Related information that is useful in understanding this document is found in the [References](tr35.md#References). For the latest version of the Unicode Standard see [[Unicode](tr35.md#Unicode)]. For a list of current Unicode Technical Reports see [[Reports](tr35.md#Reports)]. For more information about versions of the Unicode Standard, see [[Versions](tr35.md#Versions)]._

## <a name="Parts" href="#Parts">Parts</a>

The LDML specification is divided into the following parts:

*   Part 1: [Core](tr35.md#Contents) (languages, locales, basic structure)
*   Part 2: [General](tr35-general.md#Contents) (display names & transforms, etc.)
*   Part 3: [Numbers](tr35-numbers.md#Contents) (number & currency formatting)
*   Part 4: [Dates](tr35-dates.md#Contents) (date, time, time zone formatting)
*   Part 5: [Collation](tr35-collation.md#Contents) (sorting, searching, grouping)
*   Part 6: [Supplemental](tr35-info.md#Contents) (supplemental data)
*   Part 7: [Keyboards](tr35-keyboards.md#Contents) (keyboard mappings)

## <a name="Contents" href="#Contents">Contents of Part 3, Numbers</a>

*   1 [Numbering Systems](#Numbering_Systems)
*   2 [Number Elements](#Number_Elements)
    *   2.1 [Default Numbering System](#defaultNumberingSystem)
    *   2.2 [Other Numbering Systems](#otherNumberingSystems)
    *   2.3 [Number Symbols](#Number_Symbols)
    *   2.4 [Number Formats](#Number_Formats)
        *   2.4.1 [Compact Number Formats](#Compact_Number_Formats)
        *   2.4.2 [Currency Formats](#Currency_Formats)
    *   2.5 [Miscellaneous Patterns](#Miscellaneous_Patterns)
    *   2.6 [Minimal Pairs](#Minimal_Pairs)
*   3 [Number Format Patterns](#Number_Format_Patterns)
    *   3.1 [Number Patterns](#Number_Patterns)
        *   Table: [Number Pattern Examples](#Number_Pattern_Examples)
    *   3.2 [Special Pattern Characters](#Special_Pattern_Characters)
        *   Table: [Number Pattern Character Definitions](#Number_Pattern_Character_Definitions)
        *   Table: [Sample Patterns and Results](#Sample_Patterns_and_Results)
        *   Table: [Examples of minimumGroupingDigits](#Examples_of_minimumGroupingDigits)
        *   3.2.1 [Explicit Plus Signs](#Explicit_Plus)
    *   3.3 [Formatting](#Formatting)
    *   3.4 [Scientific Notation](#sci)
    *   3.5 [Significant Digits](#sigdig)
        *   Table: [Significant Digits Examples](#Significant_Digits_Examples)
    *   3.6 [Padding](#Padding)
    *   3.7 [Rounding](#Rounding)
    *   3.8 [Quoting Rules](#Quoting_Rules)
*   4 [Currencies](#Currencies)
    *   4.1 [Supplemental Currency Data](#Supplemental_Currency_Data)
*   5 [Language Plural Rules](#Language_Plural_Rules)
    *   5.1 [Plural rules syntax](#Plural_rules_syntax)
        *   5.1.1 [Operands](#Operands)
            *   Table: [Plural Operand Meanings](#Plural_Operand_Meanings)
            *   Table: [Plural Operand Examples](#Plural_Operand_Examples)
        *   5.1.2 [Relations](#Relations)
            *   Table: [Relations Examples](#Relations_Examples)
            *   Table: [Plural Rules Examples](#Plural_Rules_Examples)
        *   5.1.3 [Samples](#Samples)
            *   Table: [Plural Samples Examples](#Plural_Samples_Examples)
        *   5.1.4 [Using Cardinals](#Using_cardinals)
    *   5.2 [Plural Ranges](#Plural_Ranges)
*   6 [Rule-Based Number Formatting](#Rule-Based_Number_Formatting)
*   7 [Parsing Numbers](#Parsing_Numbers)

## 1 <a name="Numbering_Systems" href="#Numbering_Systems">Numbering Systems</a>

```xml
<!ELEMENT numberingSystems ( numberingSystem* ) >
<!ELEMENT numberingSystem EMPTY >
<!ATTLIST numberingSystem id NMTOKEN #REQUIRED >
<!ATTLIST numberingSystem type ( numeric | algorithmic ) #REQUIRED >
<!ATTLIST numberingSystem radix NMTOKEN #IMPLIED >
<!ATTLIST numberingSystem digits CDATA #IMPLIED >
<!ATTLIST numberingSystem rules CDATA #IMPLIED >
```

Numbering system information defines different representations of numeric values for an end user. CLDR defines two types of numbering system: algorithmic and numeric. A numeric system is a decimal-based system that uses a predefined set of digits to represent numbers. Examples are Western (ASCII) digits, Thai digits, and Devanagari digits. Algorithmic systems are more complex, since the proper formatting and presentation of a numeric quantity is based on an algorithm or set of rules. Examples are Chinese numerals, Hebrew numerals, and Roman numerals. In CLDR, the rules for presentation of numbers in an algorithmic system are defined using the RBNF syntax described in _[Section 6: Rule-Based Number Formatting](#Rule-Based_Number_Formatting)_.

Attributes for the `<numberingSystem>` element are as follows:

- `id` - Specifies the name of the numbering system that can be used to designate its use in formatting.
- `type` - Specifies whether the numbering system is algorithmic or numeric.
- `digits` - For numeric systems, specifies the digits used to represent numbers, in order, starting from zero.
- `rules` - Specifies the RBNF ruleset to be used for formatting numbers from this numbering system. The rules specifier can contain simply a ruleset name, in which case the ruleset is assumed to be found in the rule set grouping "NumberingSystemRules". Alternatively, the specifier can denote a specific locale, ruleset grouping, and ruleset name, separated by slashes.

Examples:

```xml
<!-- ASCII digits - A numeric system -->
<numberingSystem id="latn" type="numeric" digits="0123456789"/>

<!-- A numeric system using Thai digits -->
<numberingSystem id="thai" type="numeric" digits="๐๑๒๓๔๕๖๗๘๙"/>

<!-- An algorithmic system - Georgian numerals, rules found in NumberingSystemRules -->
<numberingSystem id="geor" type="algorithmic" rules="georgian"/>

<!-- An algorithmic system. Traditional Chinese Numerals -->
<numberingSystem id="hant" type="algorithmic" rules="zh_Hant/SpelloutRules/spellout-cardinal"/>
```

For general information about the numbering system data, including the BCP47 identifiers, see the main document _Section Q.1.1 [Numbering System Data](tr35.md#Numbering%20System%20Data)._

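As a minimal sketch of how a numeric numbering system can be applied, the following Python maps each ASCII digit of an integer to the character at the same index in the `digits` attribute. The function name `format_with_digits` is hypothetical and not part of CLDR; the sketch assumes non-negative integers and ignores symbols such as grouping separators.

```python
def format_with_digits(value: int, digits: str) -> str:
    """Render a non-negative integer with the ten digits of a numeric numbering system."""
    if len(digits) != 10:
        raise ValueError("a numeric numbering system needs exactly ten digits")
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    # str(value) yields ASCII digits; each one indexes into the `digits` attribute,
    # which lists the system's digits in order, starting from zero.
    return "".join(digits[int(ch)] for ch in str(value))


LATN = "0123456789"   # digits attribute of id="latn"
THAI = "๐๑๒๓๔๕๖๗๘๙"   # digits attribute of id="thai"

print(format_with_digits(2024, LATN))
print(format_with_digits(2024, THAI))
```

Algorithmic systems such as `geor` or `hant` cannot be handled this way, since they need the RBNF rules named by the `rules` attribute.
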
## 2 <a name="Number_Elements" href="#Number_Elements">Number Elements</a>

```xml
<!ELEMENT numbers ( alias | ( defaultNumberingSystem*, otherNumberingSystems*, minimumGroupingDigits*, symbols*, decimalFormats*, scientificFormats*, percentFormats*, currencyFormats*, currencies?, miscPatterns*, minimalPairs*, special* ) ) >
```

The numbers element supplies information for formatting and parsing numbers and currencies. It has the following sub-elements: `<defaultNumberingSystem>`, `<otherNumberingSystems>`, `<symbols>`, `<decimalFormats>`, `<scientificFormats>`, `<percentFormats>`, `<currencyFormats>`, and `<currencies>`. The currency IDs are from [[ISO4217](tr35.md#ISO4217)] (plus some additional common-use codes). For more information, including the pattern structure, see _[Section 3: Number Format Patterns](#Number_Format_Patterns)_.

### 2.1 <a name="defaultNumberingSystem" href="#defaultNumberingSystem">Default Numbering System</a>

```xml
<!ELEMENT defaultNumberingSystem ( #PCDATA )>
```

This element indicates which numbering system should be used for presentation of numeric quantities in the given locale.

### 2.2 <a name="otherNumberingSystems" href="#otherNumberingSystems">Other Numbering Systems</a>

```xml
<!ELEMENT otherNumberingSystems ( alias | ( native*, traditional*, finance*)) >
```

This element defines general categories of numbering systems that are sometimes used in the given locale for formatting numeric quantities. These additional numbering systems are often used in very specific contexts, such as in calendars or for financial purposes. There are currently three defined categories, as follows:

**native**

> Defines the numbering system used for the native digits, usually defined as a part of the script used to write the language. The native numbering system can only be a numeric positional decimal-digit numbering system, using digits with General_Category=Decimal_Number. Note: In locales where the native numbering system is the default, it is assumed that the numbering system "latn" (Western digits 0-9) is always acceptable, and can be selected using the -nu keyword as part of a Unicode locale identifier.

**traditional**

> Defines the traditional numerals for a locale. This numbering system may be numeric or algorithmic. If the traditional numbering system is not defined, applications should use the native numbering system as a fallback.

**finance**

> Defines the numbering system used for financial quantities. This numbering system may be numeric or algorithmic. This is often used for ideographic languages such as Chinese, where it would be easy to alter an amount represented in the default numbering system simply by adding additional strokes. If the financial numbering system is not specified, applications should use the default numbering system as a fallback.

The categories defined for other numbering systems can be used in a Unicode locale identifier to select the proper numbering system without having to know the specific numbering system by name. For example:

*   To select Hindi language using the native digits for numeric formatting, use locale ID: "hi-IN-u-nu-native".
*   To select Chinese language using the appropriate financial numerals, use locale ID: "zh-u-nu-finance".
*   To select Tamil language using the traditional Tamil numerals, use locale ID: "ta-u-nu-traditio".
*   To select Arabic language using western digits 0-9, use locale ID: "ar-u-nu-latn".

For more information on numbering systems and their definitions, see _[Section 1: Numbering Systems](#Numbering_Systems)_.

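The locale IDs listed in this section all carry the numbering system in the `-u-nu-` keyword. The following Python is a simplified sketch of extracting that keyword value. The function name `numbering_system_keyword` is hypothetical, and the parsing is deliberately naive: it assumes a well-formed identifier, stops at the next singleton subtag, and does not handle private-use sequences.

```python
def numbering_system_keyword(locale_id: str):
    """Return the value of the -u-nu- keyword in a Unicode locale ID, or None."""
    subtags = locale_id.lower().split("-")
    if "u" not in subtags:
        return None
    rest = subtags[subtags.index("u") + 1:]
    for i, subtag in enumerate(rest):
        if len(subtag) == 1:
            break  # the next singleton ends the -u- extension
        # Keys are two characters long; their values are 3 to 8 characters long.
        if subtag == "nu" and i + 1 < len(rest) and 3 <= len(rest[i + 1]) <= 8:
            return rest[i + 1]
    return None


for locale_id in ("hi-IN-u-nu-native", "zh-u-nu-finance", "ta-u-nu-traditio", "ar-u-nu-latn"):
    print(locale_id, "->", numbering_system_keyword(locale_id))
```

A returned value of `native`, `traditio`, or `finance` names a category that still has to be resolved through the locale's `<otherNumberingSystems>` data, whereas a value such as `latn` names a numbering system directly.
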
### 2.3 <a name="Number_Symbols" href="#Number_Symbols">Number Symbols</a>

```xml
<!ELEMENT symbols (alias | (decimal*, group*, list*, percentSign*, nativeZeroDigit*, patternDigit*, plusSign*, minusSign*, approximatelySign*, exponential*, superscriptingExponent*, perMille*, infinity*, nan*, currencyDecimal*, currencyGroup*, timeSeparator*, special*)) >
```

Number symbols define the localized symbols that are commonly used when formatting numbers in a given locale. These symbols can be referenced using a number formatting pattern as defined in _[Section 3: Number Format Patterns](#Number_Format_Patterns)_.

The available number symbols are as follows:

**decimal**

> separates the integer and fractional parts of the number.

**group**

> separates clusters of integer digits to make large numbers more legible; commonly used for thousands (grouping size 3, e.g. "100,000,000") or in some locales, ten-thousands (grouping size 4, e.g. "1,0000,0000"). There may be two different grouping sizes: The _primary grouping size_ used for the least significant integer group, and the _secondary grouping size_ used for more significant groups; these are not the same in all locales (e.g. "12,34,56,789"). If a pattern contains multiple grouping separators, the interval between the last one and the end of the integer defines the primary grouping size, and the interval between the last two defines the secondary grouping size. All others are ignored, so "#,##,###,####" == "###,###,####" == "##,#,###,####".

**list**

> symbol used to separate numbers in a list intended to represent structured data such as an array; must be different from the **decimal** value. This list separator is for “non-linguistic” usage as opposed to the listPatterns for “linguistic” lists (e.g. “Bob, Carol, and Ted”) described in Part 2, _Section 11 [List Patterns](tr35-general.md#ListPatterns)_.

**percentSign**

> symbol used to indicate a percentage (1/100th) amount. (If present, the value is also multiplied by 100 before formatting. That way 1.23 → 123%)

~~**nativeZeroDigit**~~

> Deprecated - do not use.

~~**patternDigit**~~

> Deprecated. This was formerly used to provide the localized pattern character corresponding to '#', but localization of the pattern characters themselves has been deprecated for some time (determining the locale-specific _replacements_ for pattern characters is of course not deprecated and is part of normal number formatting).

**minusSign**

> Symbol used to denote negative value.

**plusSign**

> Symbol used to denote positive value. It can be used to produce modified patterns, so that 3.12 is formatted as "+3.12", for example. The standard number patterns (except for type="accounting") will contain the minusSign, explicitly or implicitly. In the explicit pattern, the value of the plusSign can be substituted for the value of the minusSign to produce a pattern that has an explicit plus sign.

**approximatelySign**

> Symbol used to denote a value that is approximate but not exact. The symbol is substituted in place of the minusSign using the same semantics as plusSign substitution.

**exponential**

> Symbol separating the mantissa and exponent values.

**superscriptingExponent**

> (Programmers are used to the fallback exponent style “1.23E4”, but that should not be shown to end-users. Instead, the exponential notation superscriptingExponent should be used to show a format like “1.23 × 10⁴”.) The superscripting can use markup, such as `<sup>4</sup>` in HTML, or for the special case of Latin digits, use the superscript characters: U+207B ( ⁻ ), U+2070 ( ⁰ ), U+00B9 ( ¹ ), U+00B2 ( ² ), U+00B3 ( ³ ), U+2074 ( ⁴ ) .. U+2079 ( ⁹ ).

**perMille**

> symbol used to indicate a per-mille (1/1000th) amount. (If present, the value is also multiplied by 1000 before formatting. That way 1.23 → 1230‰)

**infinity**

> The infinity sign. Corresponds to the IEEE infinity bit pattern.

**nan - Not a number**

> The NaN sign. Corresponds to the IEEE NaN bit pattern.

**currencyDecimal**

> Optional. If specified, then for currency formatting/parsing this is used as the decimal separator instead of using the regular decimal separator; otherwise, the regular decimal separator is used.

**currencyGroup**

> Optional. If specified, then for currency formatting/parsing this is used as the group separator instead of using the regular group separator; otherwise, the regular group separator is used.

**timeSeparator**

> This replaces any use of the timeSeparator pattern character in a date-time format pattern (no timeSeparator pattern character is currently defined, see note below). This allows the same time format to be used for multiple number systems when the time separator depends on the number system. For example, the time format for Arabic should be COLON when using the Latin numbering system (0, 1, 2, …), but when the Arabic numbering system is used (٠‎ - ١‎ - ٢‎ …), the traditional time separator in older print styles was often ARABIC COMMA.
>
> **Note:** In CLDR 26 the timeSeparator pattern character was specified to be COLON. This was withdrawn in CLDR 28 due to backward compatibility issues, and no timeSeparator pattern character is currently defined. No CLDR locales are known to have a need to specify timeSeparator symbols that depend on number system; if this changes in the future a different timeSeparator pattern character will be defined. In the meantime, since CLDR data consumers can still request the timeSeparator symbol, it should match the symbol actually used in the [timeFormats](tr35-dates.md#timeFormats) and [availableFormats](tr35-dates.md#availableFormats_appendItems) items.

Example:

```xml
<symbols>
    <decimal>.</decimal>
    <group>,</group>
    <list>;</list>
    <percentSign>%</percentSign>
    <patternDigit>#</patternDigit>
    <plusSign>+</plusSign>
    <minusSign>-</minusSign>
    <approximatelySign>~</approximatelySign>
    <exponential>E</exponential>
    <superscriptingExponent>×</superscriptingExponent>
    <perMille>‰</perMille>
    <infinity>∞</infinity>
    <nan>☹</nan>
    <timeSeparator>:</timeSeparator>
</symbols>
```

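The rule given for the **group** symbol, where only the last two grouping separators of a pattern matter, can be sketched as follows. The function name `grouping_sizes` is hypothetical. The sketch assumes an unquoted pattern, looks only at the positive subpattern, and treats a pattern with a single separator as having equal primary and secondary sizes.

```python
def grouping_sizes(pattern: str):
    """Return (primary, secondary) grouping sizes for the integer part of a pattern."""
    positive = pattern.split(";")[0]        # ignore any negative subpattern
    integer_part = positive.split(".")[0]   # ignore the fraction part
    # Keep only digit placeholders and grouping separators.
    core = "".join(ch for ch in integer_part if ch in "#0,")
    groups = core.split(",")
    if len(groups) == 1:
        return (0, 0)                       # no grouping separator in the pattern
    primary = len(groups[-1])               # last separator to end of integer
    secondary = len(groups[-2]) if len(groups) > 2 else primary
    return (primary, secondary)


for p in ("#,##0.###", "#,##,##0", "#,##,###,####", "###,###,####", "##,#,###,####"):
    print(p, grouping_sizes(p))
```

The last three patterns yield the same pair of sizes, which matches the equivalence stated in the **group** description.
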
```xml
<!ATTLIST symbols numberSystem CDATA #IMPLIED >
```

The `numberSystem` attribute is used to specify that the given number symbols are to be used when the given numbering system is active. Number symbols can only be defined for numbering systems of the "numeric" type, since any special symbols required for an algorithmic numbering system should be specified by the RBNF formatting rules used for that numbering system. By default, number symbols without a specific `numberSystem` attribute are assumed to be used for the "latn" numbering system, which is western (ASCII) digits. Locales that specify a numbering system other than "latn" as the default should also specify number formatting symbols that are appropriate for use within the context of the given numbering system. For example, a locale that uses the Arabic-Indic digits as its default would likely use an Arabic comma for the grouping separator rather than the ASCII comma.

For more information on numbering systems and their definitions, see _[Section 1: Numbering Systems](#Numbering_Systems)_.

### 2.4 <a name="Number_Formats" href="#Number_Formats">Number Formats</a>

```xml
<!ELEMENT decimalFormats (alias | (default*, decimalFormatLength*, special*)) >
<!ELEMENT decimalFormatLength (alias | (default*, decimalFormat*, special*)) >
<!ATTLIST decimalFormatLength type ( full | long | medium | short ) #IMPLIED >
<!ELEMENT decimalFormat (alias | (pattern*, special*)) >
```

(scientificFormats and percentFormats have the same structure.)

Number formats are used to define the rules for formatting numeric quantities using the pattern syntax described in _[Section 3: Number Format Patterns](#Number_Format_Patterns)_.

Different formats are provided for different contexts, as follows:

**decimalFormats**

> The normal locale-specific way to write a base 10 number. Variations of the decimalFormat pattern are provided that allow compact number formatting.

**percentFormats**

> Pattern for use with percentage formatting.

**scientificFormats**

> Pattern for use with scientific (exponent) formatting.

Example:

```xml
<decimalFormats>
  <decimalFormatLength type="long">
    <decimalFormat>
      <pattern>#,##0.###</pattern>
    </decimalFormat>
  </decimalFormatLength>
</decimalFormats>

<scientificFormats>
  <default type="long"/>
  <scientificFormatLength type="long">
    <scientificFormat>
      <pattern>0.000###E+00</pattern>
    </scientificFormat>
  </scientificFormatLength>
  <scientificFormatLength type="medium">
    <scientificFormat>
      <pattern>0.00##E+00</pattern>
    </scientificFormat>
  </scientificFormatLength>
</scientificFormats>

<percentFormats>
  <percentFormatLength type="long">
    <percentFormat>
      <pattern>#,##0%</pattern>
    </percentFormat>
  </percentFormatLength>
</percentFormats>
```

```xml
<!ATTLIST decimalFormats numberSystem CDATA #IMPLIED >
```

The `numberSystem` attribute is used to specify that the given number formatting pattern(s) are to be used when the given numbering system is active. By default, number formatting patterns without a specific `numberSystem` attribute are assumed to be used for the "latn" numbering system, which is western (ASCII) digits. Locales that specify a numbering system other than "latn" as the default should also specify number formatting patterns that are appropriate for use within the context of the given numbering system.

For more information on numbering systems and their definitions, see _[Section 1: Numbering Systems](#Numbering_Systems)_.

#### 2.4.1 <a name="Compact_Number_Formats" href="#Compact_Number_Formats">Compact Number Formats</a>

A pattern `type` attribute is used for _compact number formats_, such as the following:

```xml
<decimalFormatLength type="long">
    <decimalFormat>
        <pattern type="1000" count="one">0 millier</pattern>
        <pattern type="1000" count="other">0 milliers</pattern>
        <pattern type="10000" count="one">00 mille</pattern>
        <pattern type="10000" count="other">00 mille</pattern>
        <pattern type="100000" count="one">000 mille</pattern>
        <pattern type="100000" count="other">000 mille</pattern>
        <pattern type="1000000" count="one">0 million</pattern>
        <pattern type="1000000" count="other">0 millions</pattern>
        …
    </decimalFormat>
</decimalFormatLength>
<decimalFormatLength type="short">
    <decimalFormat>
        <pattern type="1000" count="one">0 K</pattern>
        <pattern type="1000" count="other">0 K</pattern>
        <pattern type="10000" count="one">00 K</pattern>
        <pattern type="10000" count="other">00 K</pattern>
        <pattern type="100000" count="one">000 K</pattern>
        <pattern type="100000" count="other">000 K</pattern>
        <pattern type="1000000" count="one">0 M</pattern>
        <pattern type="1000000" count="other">0 M</pattern>
        …
    </decimalFormat>
</decimalFormatLength>
…
<currencyFormatLength type="short">
    <currencyFormat type="standard">
        <pattern type="1000" count="one">0 K ¤</pattern>
        <pattern type="1000" count="other">0 K ¤</pattern>
        <pattern type="10000" count="one">00 K ¤</pattern>
        <pattern type="10000" count="other">00 K ¤</pattern>
        <pattern type="100000" count="one">000 K ¤</pattern>
        <pattern type="100000" count="other">000 K ¤</pattern>
        <pattern type="1000000" count="one">0 M ¤</pattern>
        <pattern type="1000000" count="other">0 M ¤</pattern>
    </currencyFormat>
</currencyFormatLength>
```

Formats can be supplied for numbers (as above) or for currencies or other units. They can also be used with ranges of numbers, resulting in formatting strings like “$10K” or “$3–7M”.

To format a number N, the greatest type less than or equal to N is used, with the appropriate plural category. N is then divided by the type after z − 1 trailing zeros have been removed from the type, where z is the number of zeros in the pattern. APIs supporting this format should provide control over the number of significant or fraction digits.

The default pattern for any type that is not supplied is the special value “0”, as in the following. The value “0” must be used when a child locale overrides a parent locale to drop the compact pattern for that type and use the default pattern.

`<pattern type="1" count="one">0</pattern>`

If the value is precisely “0”, either explicit or defaulted, then the normal number format pattern for that sort of object is supplied (either `<decimalFormat>` or `<currencyFormat type="standard">`), with the normal formatting for the locale (such as the grouping separators). However, for the “0” case by default the significant digits are adjusted for consistency, typically to 2 or 3 digits, and the maximum fractional digits are set to 0 (for both currencies and plain decimal). Thus the output would be $12, not $12.01. APIs may, however, allow these default behaviors to be overridden.

With the data above, N=12345 matches `<pattern type="10000" count="other">00 K</pattern>`. N is divided by 1000 (obtained from 10000 after removing "00" and restoring one "0"). The result is formatted according to the normal decimal pattern. With no fractional digits, that yields "12 K".

Formatting 1200 in USD would result in “1.2 K $”, while 990 implicitly maps to the special value “0”, which maps to `<currencyFormat type="standard"><pattern>#,##0.00 ¤</pattern>`, and would result in simply “990 $”.

The short format is designed for UI environments where space is at a premium, and should ideally result in a formatted string no more than about 6 em wide (with no fractional digits).

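The selection and division steps described in this section can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a conformant implementation: the names `SHORT_PATTERNS` and `format_compact` are hypothetical, only the `count="other"` patterns of the short decimal data are used (plural selection is skipped), no fraction digits are produced, and a value that rounds up into the next magnitude is not re-matched against a larger type.

```python
# Short decimal patterns from the sample data, keyed by type (count="other" only).
SHORT_PATTERNS = {
    1000: "0 K",
    10000: "00 K",
    100000: "000 K",
    1000000: "0 M",
}


def format_compact(n: int, patterns: dict) -> str:
    """Format a non-negative integer with compact patterns and no fraction digits."""
    candidates = [t for t in patterns if t <= n]
    if not candidates:
        return str(n)                 # stands in for the default pattern "0"
    type_ = max(candidates)           # greatest type less than or equal to n
    pattern = patterns[type_]
    if pattern == "0":
        return str(n)
    zeros = pattern.count("0")        # assumes every "0" is a digit placeholder
    divisor = type_ // 10 ** (zeros - 1)
    scaled = round(n / divisor)
    return pattern.replace("0" * zeros, str(scaled), 1)


print(format_compact(12345, SHORT_PATTERNS))
print(format_compact(990, SHORT_PATTERNS))
```

For N=12345 the matched type is 10000 with pattern "00 K", so the divisor is 1000 and the result agrees with the worked example in the text. A real implementation would apply the locale's normal number pattern and symbols to the scaled value instead of `str()`.
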
#### 2.4.2 <a name="Currency_Formats" href="#Currency_Formats">Currency Formats</a>

Pattern for use with currency formatting. This format contains a few additional structural options that allow proper placement of the currency symbol relative to the numeric quantity. Refer to _[Section 4 - Currencies](#Currencies)_ for additional information on the use of these options.

```xml
<!ELEMENT currencyFormats (alias | (default*, currencySpacing*, currencyFormatLength*, unitPattern*, special*)) >
<!ELEMENT currencySpacing (alias | (beforeCurrency*, afterCurrency*, special*)) >
<!ELEMENT beforeCurrency (alias | (currencyMatch*, surroundingMatch*, insertBetween*)) >
<!ELEMENT afterCurrency (alias | (currencyMatch*, surroundingMatch*, insertBetween*)) >
<!ELEMENT currencyMatch ( #PCDATA ) >
<!ELEMENT surroundingMatch ( #PCDATA ) >
<!ELEMENT insertBetween ( #PCDATA ) >
<!ELEMENT currencyFormatLength (alias | (default*, currencyFormat*, special*)) >
<!ATTLIST currencyFormatLength type ( full | long | medium | short ) #IMPLIED >
<!ELEMENT currencyFormat (alias | (pattern*, special*)) >
```

In addition to a standard currency format, in which negative currency amounts might typically be displayed as something like “-$3.27”, locales may provide an "accounting" form, in which for "en_US" the same example would appear as “($3.27)”.

```xml
<currencyFormats>
    <currencyFormatLength>
        <currencyFormat type="standard">
            <pattern>¤#,##0.00</pattern>
        </currencyFormat>
        <currencyFormat type="accounting">
            <pattern>¤#,##0.00;(¤#,##0.00)</pattern>
        </currencyFormat>
    </currencyFormatLength>
</currencyFormats>
```

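To illustrate how the standard and accounting patterns above produce “-$3.27” and “($3.27)”, here is a rough Python sketch. The helper names `choose_subpattern` and `format_currency` are hypothetical. It assumes en_US symbols, relies on the rule for the `;` pattern character that a pattern without an explicit negative subpattern uses the minus sign prefixed to the positive subpattern, and substitutes the literal placeholder `#,##0.00` rather than parsing the pattern.

```python
def choose_subpattern(pattern: str, value: float) -> str:
    """Pick the positive or negative subpattern of a number pattern."""
    positive, _, negative = pattern.partition(";")
    if value >= 0:
        return positive
    # No explicit negative subpattern: prefix the minus sign to the positive one.
    return negative if negative else "-" + positive


def format_currency(value: float, pattern: str, symbol: str = "$") -> str:
    """Very rough en_US currency formatting for patterns built on '#,##0.00'."""
    subpattern = choose_subpattern(pattern, value)
    # Python's ",.2f" stands in for the '#,##0.00' placeholder with en_US symbols.
    number = format(abs(value), ",.2f")
    return subpattern.replace("#,##0.00", number).replace("¤", symbol)


STANDARD = "¤#,##0.00"
ACCOUNTING = "¤#,##0.00;(¤#,##0.00)"

print(format_currency(-3.27, STANDARD))
print(format_currency(-3.27, ACCOUNTING))
```

The sketch ignores quoting, currency spacing, and per-currency fraction digits, all of which a real formatter has to handle.
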
### 2.5 <a name="Miscellaneous_Patterns" href="#Miscellaneous_Patterns">Miscellaneous Patterns</a>

```xml
<!ELEMENT miscPatterns (alias | (default*, pattern*, special*)) >
<!ATTLIST miscPatterns numberSystem CDATA #IMPLIED >
```

The miscPatterns supply additional patterns for special purposes. The currently defined values are:

**approximately**

> indicates an approximate number, such as: “~99”

**atMost**

> indicates a number or lower, such as: “≤99” to indicate that there are 99 items or fewer.

**atLeast**

> indicates a number or higher, such as: “99+” to indicate that there are 99 items or more.

**range**

> indicates a range of numbers, such as: “99–103” to indicate that there are from 99 to 103 items.

_For example:_

```xml
<miscPatterns numberSystem="…">
  <pattern type="approximately">~{0}</pattern>
  <pattern type="atLeast">≥{0}</pattern>
  <pattern type="atMost">≤{0}</pattern>
  <pattern type="range">{0}–{1}</pattern>
</miscPatterns>
```

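Applying one of these patterns amounts to substituting already-formatted numbers for the `{0}` and `{1}` placeholders. A minimal Python sketch, using the sample patterns above, might look like this. The names `MISC_PATTERNS` and `apply_misc_pattern` are hypothetical, and the sketch assumes the patterns contain no quoted literals, so Python's `str.format` can stand in for the placeholder substitution.

```python
# Sample miscPatterns data, keyed by the pattern `type` attribute.
MISC_PATTERNS = {
    "approximately": "~{0}",
    "atLeast": "≥{0}",
    "atMost": "≤{0}",
    "range": "{0}–{1}",
}


def apply_misc_pattern(kind: str, *numbers: str) -> str:
    """Insert already-formatted numbers into the miscPattern of the given type."""
    return MISC_PATTERNS[kind].format(*numbers)


print(apply_misc_pattern("approximately", "99"))
print(apply_misc_pattern("range", "99", "103"))
```

The arguments are strings on purpose: each number is expected to have been formatted for the locale and numbering system before it is placed into the pattern.
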
### 2.6 <a name="Minimal_Pairs" href="#Minimal_Pairs">Minimal Pairs</a>

```xml
<!ELEMENT minimalPairs ( alias | ( pluralMinimalPairs*, ordinalMinimalPairs*, caseMinimalPairs*, genderMinimalPairs*, special* ) ) >
```

```xml
<!ELEMENT pluralMinimalPairs ( #PCDATA ) >
<!ATTLIST pluralMinimalPairs count NMTOKEN #IMPLIED >
```

```xml
<!ELEMENT ordinalMinimalPairs ( #PCDATA ) >
<!ATTLIST ordinalMinimalPairs ordinal NMTOKEN #IMPLIED >
```

```xml
<!ELEMENT caseMinimalPairs ( #PCDATA ) >
<!ATTLIST caseMinimalPairs case NMTOKEN #REQUIRED >
```

```xml
<!ELEMENT genderMinimalPairs ( #PCDATA ) >
<!ATTLIST genderMinimalPairs gender NMTOKEN #REQUIRED >
```

Minimal pairs provide examples that justify why multiple plural or ordinal categories exist, and provide contextual examples for verifying the consistency of translations. The allowable values for the `count`, `ordinal`, `case`, and `gender` attributes are found in the dtd file.

Examples:

```xml
<minimalPairs>
    <pluralMinimalPairs count="one">{0} Tag</pluralMinimalPairs>
    <pluralMinimalPairs count="other">{0} Tage</pluralMinimalPairs>

    <ordinalMinimalPairs ordinal="other">{0}. Abzweigung nach rechts nehmen</ordinalMinimalPairs>

    <caseMinimalPairs case="accusative">… für {0} …</caseMinimalPairs>
    <caseMinimalPairs case="dative">… mit {0} …</caseMinimalPairs>
    <caseMinimalPairs case="genitive">Anstatt {0} …</caseMinimalPairs>
    <caseMinimalPairs case="nominative">{0} kostet (kosten) € 3,50.</caseMinimalPairs>

    <genderMinimalPairs gender="feminine">Die {0} ist …</genderMinimalPairs>
    <genderMinimalPairs gender="masculine">Der {0} ist …</genderMinimalPairs>
    <genderMinimalPairs gender="neuter">Das {0} ist …</genderMinimalPairs>
</minimalPairs>
```

For more information, see [Plural Rules](http://cldr.unicode.org/index/cldr-spec/plural-rules) and [Grammatical Inflection](http://cldr.unicode.org/translation/grammatical-inflection).

## 3 <a name="Number_Format_Patterns" href="#Number_Format_Patterns">Number Format Patterns</a>

### 3.1 <a name="Number_Patterns" href="#Number_Patterns">Number Patterns</a>

Number patterns affect how numbers are interpreted in a localized context. Here are some examples, based on the French locale. The "." shows where the decimal point should go. The "," shows where the thousands separator should go. A "0" indicates zero-padding: if the number is too short, a zero (in the locale's numeric set) will go there. A "#" indicates no padding: if the number is too short, nothing goes there. A "¤" shows where the currency sign will go. The following illustrates the effects of different patterns for the French locale, with the number "1234.567". Notice how the pattern characters ',' and '.' are replaced by the characters appropriate for the locale.

##### <a name="Number_Pattern_Examples" href="#Number_Pattern_Examples">Number Pattern Examples</a>

| Pattern | Currency | Text |
| --- | --- | --- |
| #,##0.## | _n/a_ | 1 234,57 |
| #,##0.### | _n/a_ | 1 234,567 |
| ###0.##### | _n/a_ | 1234,567 |
| ###0.0000# | _n/a_ | 1234,5670 |
| 00000.0000 | _n/a_ | 01234,5670 |
| #,##0.00 ¤ | EUR | 1 234,57 € |
| #,##0.00 ¤ | JPY | 1 235 ¥JP |

The number of # placeholder characters before the decimal does not matter, since no limit is placed on the maximum number of digits. There should, however, be at least one zero someplace in the pattern. In currency formats, the number of digits after the decimal also does not matter, since the information in the supplemental data (see _[Supplemental Currency Data](#Supplemental_Currency_Data)_) is used to override the number of decimal places, and the rounding, according to the currency that is being formatted. That can be seen in the above chart, with the difference between Yen and Euro formatting.

538To ensure correct layout, especially in currency patterns in which a a variety of symbols may be used, number patterns may contain (invisible) bidirectional text format characters such as LRM, RLM, and ALM.
539
540_When parsing using a pattern, a lenient parse should be used; see [Lenient Parsing](tr35.md#Lenient_Parsing)._ As noted there, lenient parsing should ignore bidi format characters.

### 3.2 <a name="Special_Pattern_Characters" href="#Special_Pattern_Characters">Special Pattern Characters</a>

Many characters in a pattern are taken literally; they are matched during parsing and output unchanged during formatting. Special characters, on the other hand, stand for other characters, strings, or classes of characters. For example, the '#' character is replaced by a localized digit for the chosen numberSystem. Often the replacement character is the same as the pattern character; in the U.S. locale, the ',' grouping character is replaced by ','. However, the replacement is still happening, and if the symbols are modified, the grouping character changes. Some special characters affect the behavior of the formatter by their presence; for example, if the percent character is seen, then the value is multiplied by 100 before being displayed.

To insert a special character in a pattern as a literal, that is, without any special meaning, the character must be quoted. There are some exceptions to this which are noted below. The Localized Replacement column shows the replacement from _Section 2.3 [Number Symbols](#Number_Symbols)_ or the numberSystem's digits; _italic_ indicates a special function.

Invalid sequences of special characters (such as “¤¤¤¤¤¤” in current CLDR) should be handled for formatting and parsing as described in [Handling Invalid Patterns](tr35.md#Invalid_Patterns).

##### <a name="Number_Pattern_Character_Definitions" href="#Number_Pattern_Character_Definitions">Number Pattern Character Definitions</a>

| Symbol | Location | Localized Replacement | Meaning |
| :-- | :-- | :-- | :-- |
| 0 | Number | digit | Digit |
| 1-9 | Number | digit | '1' through '9' indicate rounding. |
| @ | Number | digit | Significant digit |
| # | Number | digit, _nothing_ | Digit, omitting leading/trailing zeros |
| . | Number | decimal, currencyDecimal | Decimal separator or monetary decimal separator |
| - | Number | minusSign, plusSign, approximatelySign | Minus sign. **Warning:** the pattern '-'0.0 is not the same as the pattern -0.0. In the former case, the minus sign is a literal. In the latter case, it is a special symbol, which is replaced by the minusSign, and can also be replaced by the plusSign for a format like +12% as in Section 3.2.1 [Explicit Plus Signs](#Explicit_Plus). |
| , | Number | group, currencyGroup | Grouping separator. May occur in both the integer part and the fractional part. The position determines the grouping. |
| E | Number | exponential, superscriptingExponent | Separates mantissa and exponent in scientific notation. _Need not be quoted in prefix or suffix._ |
| + | Exponent or Number (for explicit plus) | plusSign | Prefix positive exponents with localized plus sign. Used for explicit plus for numbers as well, as described in Section 3.2.1 [Explicit Plus Signs](#Explicit_Plus). _Need not be quoted in prefix or suffix._ |
| % | Prefix or suffix | percentSign | Multiply by 100 and show as percentage |
| ‰ (U+2030) | Prefix or suffix | perMille | Multiply by 1000 and show as per mille |
| ; | Subpattern boundary | _syntax_ | Separates positive and negative subpatterns. When there is no explicit negative subpattern, an implicit negative subpattern is formed from the positive pattern with a prefixed - (ASCII U+002D HYPHEN-MINUS). |
| ¤ (U+00A4) | Prefix or suffix | _currency symbol/name from currency specified in API_ | Any sequence is replaced by the localized currency symbol for the currency being formatted, as in the table below. If present in a pattern, the monetary decimal separator and grouping separators (if available) are used instead of the numeric ones. If data is unavailable for a given sequence in a given locale, the display may fall back to ¤ or ¤¤. See also the formatting for currency display names, steps 2 and 4 in [Currencies](#Currencies). <table><tr><th>No.</th><th>Replacement / Example</th></tr><tr><td rowspan="2">¤</td><td>Standard currency symbol</td></tr><tr><td>_C$12.00_</td></tr><tr><td rowspan="2">¤¤</td><td>ISO currency symbol (constant)</td></tr><tr><td>_CAD 12.00_</td></tr><tr><td rowspan="2">¤¤¤</td><td>Appropriate currency display name for the currency, based on the plural rules in effect for the locale</td></tr><tr><td>_5.00 Canadian dollars_</td></tr><tr><td rowspan="2" >¤¤¤¤¤</td><td>Narrow currency symbol. The same symbols may be used for multiple currencies. Thus the symbol may be ambiguous, and should only be used where the context is clear.</td></tr><tr><td>_$12.00_</td></tr><tr><td>_others_</td><td>_Invalid in current CLDR. Reserved for future specification_</td></tr></table> |
| * | Prefix or suffix boundary | _padding character specified in API_ | Pad escape, precedes pad character |
| ' | Prefix or suffix | _syntax-only_ | Used to quote special characters in a prefix or suffix, for example, `"'#'#"` formats 123 to `"#123"`. To create a single quote itself, use two in a row: `"# o''clock"`. |

A pattern contains a positive subpattern and may contain a negative subpattern, for example, "#,##0.00;(#,##0.00)". Each subpattern has a prefix, a numeric part, and a suffix. If there is no explicit negative subpattern, the implicit negative subpattern is the ASCII minus sign (-) prefixed to the positive subpattern. That is, "0.00" alone is equivalent to "0.00;-0.00". (The data in CLDR is normalized to remove an explicit negative subpattern where it would be identical to the implicit form.)

Note that an explicit negative subpattern is used as-is: a minus sign is _not_ added, e.g. "0.00;0.00" ≠ "0.00;-0.00". Trailing semicolons are ignored, e.g. "0.00;" = "0.00". Whitespace is not ignored, including whitespace around semicolons, so "0.00 ; -0.00" ≠ "0.00;-0.00".

If there is an explicit negative subpattern, it serves only to specify the negative prefix and suffix; the number of digits, minimal digits, and other characteristics are ignored in the negative subpattern. That means that "#,##0.0#;(#)" has precisely the same result as "#,##0.0#;(#,##0.0#)". However, in the CLDR data, the format is normalized so that the other characteristics are preserved, just for readability.

> **Note:** The thousands separator and decimal separator in patterns are always ASCII ',' and '.'. They are substituted by the code with the correct local values according to other fields in CLDR. The same is true of the - (ASCII minus sign) and other special characters listed above.

A currency decimal pattern normally contains a currency symbol placeholder (¤, ¤¤, ¤¤¤, or ¤¤¤¤¤). The currency symbol placeholder may occur before the first digit, after the last digit symbol, or where the decimal symbol would otherwise be placed (for formats such as "12€50", as in "12€50 pour une omelette").

Placement | Examples
-------|-------
Before|"¤#,##0.00" "¤ #,##0.00" "¤-#,##0.00" "¤ -#,##0.00" "-¤#,##0.00" "-¤ #,##0.00" …
After|"#,##0.00¤" "#,##0.00 ¤" "#,##0.00-¤" "#,##0.00- ¤" "#,##0.00¤-" "#,##0.00 ¤-" …
Decimal|"#,##0¤00"

Below is a sample of patterns, special characters, and results:

##### <a name="Sample_Patterns_and_Results" href="#Sample_Patterns_and_Results">Sample Patterns and Results</a>

<table><tbody>
<tr><th>explicit pattern:</th><td colspan="2">0.00;-0.00</td><td colspan="2">0.00;0.00-</td><td colspan="2">0.00+;0.00-</td></tr>
<tr><th>decimalSign:</th><td colspan="2">,</td><td colspan="2">,</td><td colspan="2">,</td></tr>
<tr><th>minusSign:</th><td colspan="2">∸</td><td colspan="2">∸</td><td colspan="2">∸</td></tr>
<tr><th>plusSign:</th><td colspan="2">∔</td><td colspan="2">∔</td><td colspan="2">∔</td></tr>
<tr><th>number:</th><td>3.1415</td><td>-3.1415</td><td>3.1415</td><td>-3.1415</td><td>3.1415</td><td>-3.1415</td></tr>
<tr><th>formatted:</th><td>3,14</td><td>∸3,14</td><td>3,14</td><td>3,14∸</td><td>3,14∔</td><td>3,14∸</td></tr>
</tbody></table>

_In the above table, ∸ = U+2238 DOT MINUS and ∔ = U+2214 DOT PLUS are used for illustration._

The prefixes, suffixes, and various symbols used for infinity, digits, thousands separators, decimal separators, and so on may be set to arbitrary values, and they will appear properly during formatting. _However, care must be taken that the symbols and strings do not conflict, or parsing will be unreliable._ For example, either the positive and negative prefixes or the suffixes must be distinct for any parser using this data to be able to distinguish positive from negative values. Another example is that the decimal separator and thousands separator should be distinct characters, or parsing will be impossible.

The _grouping separator_ is a character that separates clusters of integer digits to make large numbers more legible. It is commonly used for thousands, but in some locales it separates ten-thousands. The _grouping size_ is the number of digits between the grouping separators, such as 3 for "100,000,000" or 4 for "1 0000 0000". There are actually two different grouping sizes: One used for the least significant integer digits, the _primary grouping size_, and one used for all others, the _secondary grouping size_. In most locales these are the same, but sometimes they are different. For example, if the primary grouping interval is 3, and the secondary is 2, then this corresponds to the pattern "#,##,##0", and the number 123456789 is formatted as "12,34,56,789". If a pattern contains multiple grouping separators, the interval between the last one and the end of the integer defines the primary grouping size, and the interval between the last two defines the secondary grouping size. All others are ignored, so "#,##,###,####" == "###,###,####" == "##,#,###,####".

The grouping separator may also occur in the fractional part, such as in “#,##0.###,#”. This is most commonly done where the grouping separator character is a thin, non-breaking space (U+202F), such as “1.618 033 988 75”. See [physics.nist.gov/cuu/Units/checklist.html](https://physics.nist.gov/cuu/Units/checklist.html).
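
As a rough illustration of the primary and secondary grouping sizes described above, the following Python sketch derives both sizes from the integer part of a pattern and applies them to a string of digits. The function names `grouping_sizes` and `group_integer` are hypothetical, and the sketch assumes an unquoted pattern with a single subpattern; it is not taken from any CLDR implementation.

```python
def grouping_sizes(pattern: str) -> tuple[int, int]:
    """Return (primary, secondary) grouping sizes for the integer part.

    Assumption: the pattern has no quoted literals and no negative subpattern.
    Returns (0, 0) when the integer part has no grouping separator.
    """
    integer_part = pattern.split(".", 1)[0]
    # Keep only digit placeholders and grouping separators.
    core = "".join(ch for ch in integer_part if ch in "#0123456789@,")
    groups = core.split(",")
    if len(groups) == 1:
        return (0, 0)
    primary = len(groups[-1])
    # Only the last two separators matter; earlier ones are ignored.
    secondary = len(groups[-2]) if len(groups) > 2 else primary
    return (primary, secondary)


def group_integer(digits: str, primary: int, secondary: int, sep: str = ",") -> str:
    """Insert `sep` into a string of integer digits."""
    if primary <= 0 or len(digits) <= primary:
        return digits
    if secondary <= 0:
        secondary = primary
    head, parts = digits[:-primary], [digits[-primary:]]
    while len(head) > secondary:
        parts.insert(0, head[-secondary:])
        head = head[:-secondary]
    parts.insert(0, head)
    return sep.join(parts)


# The example from the text: primary 3, secondary 2.
print(group_integer("123456789", *grouping_sizes("#,##,##0")))  # 12,34,56,789
```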

For consistency in the CLDR data, the following conventions are observed:

* All number patterns should be minimal: there should be no leading # marks except to specify the position of the grouping separators (for example, avoid ##,##0.###).
* All formats should have one 0 before the decimal point (for example, avoid #,###.##).
* Decimal formats should have three hash marks in the fractional position (for example, #,##0.###).
* Currency formats should have two zeros in the fractional position (for example, ¤ #,##0.00).
    * The exact number of decimals is overridden with the decimal count in supplementary data or by API settings.
* The only time two thousands separators need to be used is when the number of digits varies, such as for Hindi: #,##,##0.
* The **minimumGroupingDigits** can be used to suppress groupings below a certain value. This is used for languages such as Polish, where one would only write the grouping separator for values above 9999. The minimumGroupingDigits contains the default for the locale.
    * The attribute value is added to the grouping size to obtain a threshold. If the input number has fewer integer digits than that threshold, the grouping separator is suppressed.
    * ##### <a name="Examples_of_minimumGroupingDigits" href="#Examples_of_minimumGroupingDigits">Examples of minimumGroupingDigits</a>

        | minimumGroupingDigits | Pattern Grouping | Input Number | Formatted |
        | ---: | ---: | ---: | ---: |
        | 1 | 3 | 1000 | 1,000 |
        | 1 | 3 | 10000 | 10,000 |
        | 2 | 3 | 1000 | 1000 |
        | 2 | 3 | 10000 | 10,000 |
        | 1 | 4 | 10000 | 1,0000 |
        | 2 | 4 | 10000 | 10000 |
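
The minimumGroupingDigits rule can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: a single grouping size, an ASCII comma as the separator, and a hypothetical function name `format_grouped`. It reproduces the rows of the table above.

```python
def format_grouped(n: int, grouping: int = 3, minimum_grouping_digits: int = 1,
                   sep: str = ",") -> str:
    """Group the integer digits of n unless it is below the threshold."""
    digits = str(abs(n))
    sign = "-" if n < 0 else ""
    # Grouping is suppressed when the number has fewer integer digits
    # than grouping size + minimumGroupingDigits.
    if len(digits) < grouping + minimum_grouping_digits:
        return sign + digits
    chunks = []
    while len(digits) > grouping:
        chunks.insert(0, digits[-grouping:])
        digits = digits[:-grouping]
    chunks.insert(0, digits)
    return sign + sep.join(chunks)


# Rows of the table: (minimumGroupingDigits, grouping, input)
for minimum, size, number in [(1, 3, 1000), (2, 3, 1000), (2, 3, 10000), (2, 4, 10000)]:
    print(minimum, size, number, format_grouped(number, size, minimum))
```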

#### 3.2.1 <a name="Explicit_Plus" href="#Explicit_Plus">Explicit Plus Signs</a>

An explicit "plus" format can be formed, so as to show a visible + sign when formatting a non-negative number. The displayed plus sign can be an ASCII plus or another character, such as ＋ U+FF0B FULLWIDTH PLUS SIGN or ➕ U+2795 HEAVY PLUS SIGN; it is taken from whatever is set for plusSign in _Section 2.3 [Number Symbols](#Number_Symbols)_. The format is formed as follows:

1. Get the negative subpattern (explicit or implicit).
2. Replace any unquoted ASCII minus sign by an ASCII plus sign.
3. If there are any replacements, use the result as the positive subpattern.

For an example, see [Sample Patterns and Results](#Sample_Patterns_and_Results).
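
The three steps above can be sketched in Python as follows. The helper names are hypothetical, and the quote handling is deliberately minimal (a single quote toggles the quoted state, which also copes with a doubled quote); treat it as an illustration rather than a reference implementation.

```python
def split_subpatterns(pattern: str) -> tuple[str, str]:
    """Split a pattern at the first unquoted ';' into (positive, negative)."""
    in_quote = False
    for i, ch in enumerate(pattern):
        if ch == "'":
            in_quote = not in_quote
        elif ch == ";" and not in_quote:
            return pattern[:i], pattern[i + 1:]
    return pattern, ""


def replace_unquoted_minus(subpattern: str) -> tuple[str, bool]:
    """Replace each unquoted ASCII '-' by '+'; report whether anything changed."""
    out, in_quote, replaced = [], False, False
    for ch in subpattern:
        if ch == "'":
            in_quote = not in_quote
        elif ch == "-" and not in_quote:
            ch, replaced = "+", True
        out.append(ch)
    return "".join(out), replaced


def explicit_plus_pattern(pattern: str) -> str:
    positive, negative = split_subpatterns(pattern)
    if not negative:
        negative = "-" + positive          # step 1: implicit negative subpattern
    plus_form, replaced = replace_unquoted_minus(negative)  # step 2
    if replaced:
        positive = plus_form               # step 3
    return f"{positive};{negative}"


print(explicit_plus_pattern("0.00;0.00-"))  # 0.00+;0.00-
```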

### 3.3 <a name="Formatting" href="#Formatting">Formatting</a>

Formatting is guided by several parameters, all of which can be specified either using a pattern or using an external API designed for number formatting. The following description applies to formats that do not use [scientific notation](#sci) or [significant digits](#sigdig).

* If the number of actual integer digits exceeds the _maximum integer digits_, then only the least significant digits are shown. For example, 1997 is formatted as "97" if the maximum integer digits is set to 2.
* If the number of actual integer digits is less than the _minimum integer digits_, then leading zeros are added. For example, 1997 is formatted as "01997" if the minimum integer digits is set to 5.
* If the number of actual fraction digits exceeds the _maximum fraction digits_, then half-even rounding is performed to the maximum fraction digits. For example, 0.125 is formatted as "0.12" if the maximum fraction digits is 2. This behavior can be changed by specifying a rounding increment and a rounding mode.
* If the number of actual fraction digits is less than the _minimum fraction digits_, then trailing zeros are added. For example, 0.125 is formatted as "0.1250" if the minimum fraction digits is set to 4.
* Trailing fractional zeros are not displayed if they occur _j_ positions after the decimal, where _j_ is less than the maximum fraction digits. For example, 0.10004 is formatted as "0.1" if the maximum fraction digits is four or less.
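
The digit-count rules above can be tried out with Python's `decimal` module. The function `format_fixed` and its parameter names are hypothetical; the sketch uses an ASCII "." and no grouping, and it takes the value as a string or integer to avoid binary floating-point surprises.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def format_fixed(value, min_int=1, max_int=None, min_frac=0, max_frac=3) -> str:
    """Apply minimum/maximum integer and fraction digit counts to a number."""
    number = Decimal(str(value))
    sign = "-" if number < 0 else ""
    # Half-even rounding to the maximum number of fraction digits.
    quantum = Decimal(1).scaleb(-max_frac)
    number = abs(number).quantize(quantum, rounding=ROUND_HALF_EVEN)
    integer, _, fraction = format(number, "f").partition(".")
    # Drop trailing zeros, then pad back up to the minimum fraction digits.
    fraction = fraction.rstrip("0").ljust(min_frac, "0")
    if max_int is not None:
        # Keep only the least significant integer digits.
        integer = integer[-max_int:] if max_int > 0 else ""
    integer = integer.zfill(min_int)
    return sign + integer + ("." + fraction if fraction else "")


print(format_fixed(1997, max_int=2))          # 97
print(format_fixed("0.125", max_frac=2))      # 0.12
print(format_fixed("0.10004", max_frac=4))    # 0.1
```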

**Special Values**

`NaN` is represented as a single character, typically `(U+FFFD)`. This character is determined by the localized number symbols. This is the only value for which the prefixes and suffixes are not used.

Infinity is represented as a single character, typically ∞ `(U+221E)`, with the positive or negative prefixes and suffixes applied. The infinity character is determined by the localized number symbols.

### 3.4 <a name="sci" href="#sci">Scientific Notation</a>

Numbers in scientific notation are expressed as the product of a mantissa and a power of ten, for example, 1234 can be expressed as 1.234 × 10³. The mantissa is typically in the half-open interval [1.0, 10.0) or sometimes [0.0, 1.0), but it need not be. In a pattern, the exponent character immediately followed by one or more digit characters indicates scientific notation. Example: "0.###E0" formats the number 1234 as "1.234E3".

* The number of digit characters after the exponent character gives the minimum exponent digit count. There is no maximum. Negative exponents are formatted using the localized minus sign, _not_ the prefix and suffix from the pattern. This allows patterns such as "0.###E0 m/s". To prefix positive exponents with a localized plus sign, specify '+' between the exponent and the digits: "0.###E+0" will produce formats "1E+1", "1E+0", "1E-1", and so on. (In localized patterns, use the localized plus sign rather than '+'.)
* The minimum number of integer digits is achieved by adjusting the exponent. Example: 0.00123 formatted with "00.###E0" yields "12.3E-4". This only happens if there is no maximum number of integer digits. If there is a maximum, then the minimum number of integer digits is fixed at one.
* The maximum number of integer digits, if present, specifies the exponent grouping. The most common use of this is to generate _engineering notation_, in which the exponent is a multiple of three, for example, "##0.###E0". The number 12345 is formatted using "##0.####E0" as "12.345E3".
* When using scientific notation, the formatter controls the digit counts using logic for significant digits. The maximum number of significant digits comes from the mantissa portion of the pattern: the string of #, 0, and period (".") characters immediately preceding the E. To get the maximum number of significant digits, use the following algorithm:

    1.  If the mantissa pattern contains a period:
        1.  If the mantissa pattern contains at least one 0:
            *   Return the number of 0s before the period added to the number of #s or 0s after the period
        2.  Else:
            *   Return 1 plus the number of #s after the period
    2.  Else:
        1.  If the mantissa pattern contains at least one 0:
            *   Return the number of 0s.
        2.  Else:
            *   Return positive infinity.

    Examples:

    *   0.##E0 means a max of 3 significant digits.
    *   #.##E0 also means a max of 3 significant digits.
    *   #.0#E0 means a max of 2 significant digits.
    *   0E0 means a max of 1 significant digit.
    *   #E0 means infinite precision.
    *   ###E0 means engineering notation with infinite precision.
*   Exponential patterns may not contain grouping separators.
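
The algorithm for the maximum number of significant digits in a scientific pattern translates almost line for line into Python. The function name `max_significant_digits` is hypothetical, and the sketch assumes the pattern has no prefix or suffix before the exponent character.

```python
import math


def max_significant_digits(pattern: str) -> float:
    """Maximum significant digits implied by the mantissa of a pattern like "0.##E0"."""
    mantissa = pattern.split("E", 1)[0]
    if "." in mantissa:
        before, after = mantissa.split(".", 1)
        if "0" in mantissa:
            # 0s before the period plus #s or 0s after the period
            return before.count("0") + after.count("#") + after.count("0")
        return 1 + after.count("#")
    if "0" in mantissa:
        return mantissa.count("0")
    return math.inf


for example in ["0.##E0", "#.##E0", "#.0#E0", "0E0", "#E0", "###E0"]:
    print(example, max_significant_digits(example))
```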

### 3.5 <a name="sigdig" href="#sigdig">Significant Digits</a>

There are two ways of controlling how many digits are shown: (a) significant digits counts, or (b) integer and fraction digit counts. Integer and fraction digit counts are described above. When a formatter is using significant digits counts, it uses however many integer and fraction digits are required to display the specified number of significant digits. It may ignore min/max integer/fraction digits, or it may use them to the extent possible.

##### <a name="Significant_Digits_Examples" href="#Significant_Digits_Examples">Significant Digits Examples</a>

| Pattern | Minimum significant digits | Maximum significant digits | Number | Output |
| :-- | :-- | :-- | :-- | :-- |
| `@@@` | 3 | 3 | 12345 | `12300` |
| `@@@` | 3 | 3 | 0.12345 | `0.123` |
| `@@##` | 2 | 4 | 3.14159 | `3.142` |
| `@@##` | 2 | 4 | 1.23004 | `1.23` |

* In order to enable significant digits formatting, use a pattern containing the `'@'` pattern character. In order to disable significant digits formatting, use a pattern that does not contain the `'@'` pattern character.
* Significant digit counts may be expressed using patterns that specify a minimum and maximum number of significant digits. These are indicated by the `'@'` and `'#'` characters. The minimum number of significant digits is the number of `'@'` characters. The maximum number of significant digits is the number of `'@'` characters plus the number of `'#'` characters following on the right. For example, the pattern `"@@@"` indicates exactly 3 significant digits. The pattern `"@##"` indicates from 1 to 3 significant digits. Trailing zero digits to the right of the decimal separator are suppressed after the minimum number of significant digits have been shown. For example, the pattern `"@##"` formats the number 0.1203 as `"0.12"`.
* Implementations may forbid the use of significant digits in combination with min/max integer/fraction digits. In such a case, if a pattern uses significant digits, it may not contain a decimal separator, nor the `'0'` pattern character. Patterns such as `"@00"` or `"@.###"` would be disallowed.
* Any number of `'#'` characters may be prepended to the left of the leftmost `'@'` character. These have no effect on the minimum and maximum significant digits counts, but may be used to position grouping separators. For example, `"#,#@#"` indicates a minimum of one significant digit, a maximum of two significant digits, and a grouping size of three.
* The number of significant digits has no effect on parsing.
* Significant digits may be used together with exponential notation. Such patterns are equivalent to a normal exponential pattern with a minimum and maximum integer digit count of one, a minimum fraction digit count of `Minimum Significant Digits - 1`, and a maximum fraction digit count of `Maximum Significant Digits - 1`. For example, the pattern `"@@###E0"` is equivalent to `"0.0###E0"`.
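
To make the minimum/maximum significant digit rules concrete, here is a small Python sketch built on the `decimal` module. The function names are hypothetical; the sketch ignores grouping, localized symbols, and the value zero, and it rounds half-even, which is an assumption carried over from the default described for fraction digits.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def significant_digit_bounds(pattern: str) -> tuple[int, int]:
    """Return (minimum, maximum) significant digits for a pattern such as "@@##"."""
    minimum = pattern.count("@")
    # Only '#' characters to the right of the last '@' raise the maximum.
    trailing = pattern[pattern.rindex("@") + 1:]
    return minimum, minimum + trailing.count("#")


def format_significant(value: str, minimum: int, maximum: int) -> str:
    number = Decimal(value)
    sign = "-" if number < 0 else ""
    number = abs(number)
    # Round so that at most `maximum` significant digits remain.
    exponent = number.adjusted() - maximum + 1
    rounded = number.quantize(Decimal(1).scaleb(exponent), rounding=ROUND_HALF_EVEN)
    if rounded.adjusted() > number.adjusted():
        # Rounding carried into a new leading digit (for example 9.99 -> 10.0).
        rounded = rounded.quantize(Decimal(1).scaleb(exponent + 1), rounding=ROUND_HALF_EVEN)
    integer, _, fraction = format(rounded, "f").partition(".")
    shown = len((integer + fraction).lstrip("0"))
    # Suppress trailing fraction zeros, keeping at least `minimum` significant digits.
    while fraction.endswith("0") and shown > minimum:
        fraction = fraction[:-1]
        shown -= 1
    return sign + integer + ("." + fraction if fraction else "")


print(format_significant("1.23004", *significant_digit_bounds("@@##")))  # 1.23
```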

### 3.6 <a name="Padding" href="#Padding">Padding</a>

Patterns support padding the result to a specific width. In a pattern the pad escape character, followed by a single pad character, causes padding to be parsed and formatted. The pad escape character is '*'. For example, `"$*x#,##0.00"` formats 123 to `"$xx123.00"`, and 1234 to `"$1,234.00"`.

* When padding is in effect, the width of the positive subpattern, including prefix and suffix, determines the format width. For example, in the pattern `"* #0 o''clock"`, the format width is 10.
* Some parameters which usually do not matter have meaning when padding is used, because the pattern width is significant with padding. In the pattern "* ##,##,#,##0.##", the format width is 14. The initial characters "##,##," do not affect the grouping size or maximum integer digits, but they do affect the format width.
* Padding may be inserted at one of four locations: before the prefix, after the prefix, before the suffix, or after the suffix. No padding can be specified in any other location. If there is no prefix, before the prefix and after the prefix are equivalent, likewise for the suffix.
* When specified in a pattern, the code point immediately following the pad escape is the pad character. This may be any character, including a special pattern character. That is, the pad escape _escapes_ the following character. If there is no character after the pad escape, then the pattern is illegal.
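
The effect of the four pad positions can be illustrated with a short Python sketch. It assumes the prefix, formatted number, and suffix have already been produced and that the format width is known; the function name `pad_formatted` and the position labels are hypothetical, and width is measured in code points as a simplification.

```python
def pad_formatted(prefix: str, number: str, suffix: str, width: int,
                  pad_char: str, position: str = "after_prefix") -> str:
    """Pad prefix + number + suffix with pad_char up to the format width."""
    missing = width - len(prefix) - len(number) - len(suffix)
    padding = pad_char * max(missing, 0)
    pieces = {
        "before_prefix": (padding, prefix, number, suffix),
        "after_prefix": (prefix, padding, number, suffix),
        "before_suffix": (prefix, number, padding, suffix),
        "after_suffix": (prefix, number, suffix, padding),
    }
    return "".join(pieces[position])


# "$*x#,##0.00": prefix "$", pad character "x" after the prefix, format width 9.
print(pad_formatted("$", "123.00", "", 9, "x"))    # $xx123.00
print(pad_formatted("$", "1,234.00", "", 9, "x"))  # $1,234.00
```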

### 3.7 <a name="Rounding" href="#Rounding">Rounding</a>

Patterns support rounding to a specific increment. For example, 1230 rounded to the nearest 50 is 1250. Mathematically, rounding to specific increments is performed by dividing by the increment, rounding to an integer, then multiplying by the increment. To take a more unusual example, 1.234 rounded to the nearest 0.65 is 1.3, as follows:

<table><tbody>
<tr><th>Original:</th><td>1.234</td></tr>
<tr><th>Divide by increment (0.65):</th><td>1.89846…</td></tr>
<tr><th>Round:</th><td>2</td></tr>
<tr><th>Multiply by increment (0.65):</th><td>1.3</td></tr>
</tbody></table>

To specify a rounding increment in a pattern, include the increment in the pattern itself. "#,#50" specifies a rounding increment of 50. "#,##0.05" specifies a rounding increment of 0.05.

* Rounding only affects the string produced by formatting. It does not affect parsing or change any numerical values.
* An implementation may allow the specification of a _rounding mode_ to determine how values are rounded. In the absence of such choices, the default is to round "half-even", as described in IEEE arithmetic. That is, it rounds towards the "nearest neighbor" unless both neighbors are equidistant, in which case it rounds towards the even neighbor. It behaves as for round "half-up" if the digit to the left of the discarded fraction is odd, and as for round "half-down" if it is even. Note that this is the rounding mode that minimizes cumulative error when applied repeatedly over a sequence of calculations.
* Some locales use rounding in their currency formats to reflect the smallest currency denomination.
* In a pattern, digits '1' through '9' specify rounding, but otherwise behave identically to digit '0'.
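
The divide, round, multiply procedure can be written directly with Python's `decimal` module. The function name `round_to_increment` is hypothetical, and the values are passed as strings so that the arithmetic is exact decimal arithmetic rather than binary floating point.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_to_increment(value: str, increment: str, rounding=ROUND_HALF_EVEN) -> Decimal:
    """Round value to the nearest multiple of increment (half-even by default)."""
    number, step = Decimal(value), Decimal(increment)
    # Divide by the increment, round to an integer, multiply back.
    steps = (number / step).to_integral_value(rounding=rounding)
    return steps * step


print(round_to_increment("1230", "50"))     # 1250
print(round_to_increment("1.234", "0.65"))  # 1.30
```

Another rounding mode from the `decimal` module, such as `ROUND_HALF_UP`, can be passed as the third argument to see how ties are resolved differently.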

### 3.8 <a name="Quoting_Rules" href="#Quoting_Rules">Quoting Rules</a>

Single quotes (**'**) enclose parts of the pattern that should be treated literally. Inside a quoted string, two single quotes ('') are replaced with a single one ('). For example: `'X '`#`' Q '` -> **X 1939 Q** (Literal strings are shown `shaded`.)
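
A minimal sketch of the quoting rule in Python: it walks a pattern and marks each character as literal or not. The function name `mark_literals` is hypothetical, and the sketch takes the simplifying assumption that any doubled quote, inside or outside a quoted run, stands for one literal quote character.

```python
def mark_literals(pattern: str) -> list[tuple[str, bool]]:
    """Return (character, is_literal) pairs for a pattern."""
    marked = []
    in_quote = False
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "'":
            if pattern[i + 1:i + 2] == "'":
                # Two single quotes in a row produce one literal quote.
                marked.append(("'", True))
                i += 2
                continue
            # A lone quote opens or closes a literal run and is not output.
            in_quote = not in_quote
        else:
            marked.append((ch, in_quote))
        i += 1
    return marked


# In "'#'#" the first '#' is a literal and the second is a digit placeholder.
print(mark_literals("'#'#"))  # [('#', True), ('#', False)]
```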

## 4 <a name="Currencies" href="#Currencies">Currencies</a>

```xml
<!ELEMENT currencies (alias | (default?, currency*, special*)) >
<!ELEMENT currency (alias | (((pattern+, displayName*, symbol*) | (displayName+, symbol*, pattern*) | (symbol+, pattern*))?, decimal*, group*, special*)) >
<!ELEMENT symbol ( #PCDATA ) >
<!ATTLIST symbol choice ( true | false ) #IMPLIED > <!-- deprecated -->
```

> **Note:** The term "pattern" appears twice in the above. The first is for consistency with all other cases of pattern + displayName; the second is for backwards compatibility.

```xml
<currencies>
    <currency type="USD">
        <displayName>Dollar</displayName>
        <symbol>$</symbol>
    </currency>
    <currency type="JPY">
        <displayName>Yen</displayName>
        <symbol>¥</symbol>
    </currency>
    <currency type="PTE">
        <displayName>Escudo</displayName>
        <symbol>$</symbol>
    </currency>
</currencies>
```

In formatting currencies, the currency number format is used with the appropriate symbol from `<currencies>`, according to the currency code. The `<currencies>` list can contain codes that are no longer in current use, such as PTE. The `choice` attribute has been deprecated.

The `count` attribute distinguishes the different plural forms, such as in the following:

```xml
<currencyFormats>
    <unitPattern count="other">{0} {1}</unitPattern>
<currencies>
```

```xml
<currency type="ZWD">
    <displayName>Zimbabwe Dollar</displayName>
    <displayName count="one">Zimbabwe dollar</displayName>
    <displayName count="other">Zimbabwe dollars</displayName>
    <symbol>Z$</symbol>
</currency>
```

Note on displayNames:
* In general the region portion of the displayName should match the territory name, see **Part 2** _Section 1.2 [Locale Display Name Fields](tr35-general.md#locale_display_name_fields)_.
* As a result, the English currency displayName in CLDR may not match the name in ISO 4217.

To format a particular currency value "ZWD" for a particular numeric value _n_ using the (long) display name:

1. If the numeric value is exactly 0 or 1, first see if there is a count with a matching explicit number (0 or 1). If so, use that string (see [Explicit 0 and 1 rules](#Explicit_0_1_rules)).
2. Otherwise, determine the `count` value that corresponds to _n_ using the rules in _[Section 5 - Language Plural Rules](#Language_Plural_Rules)_.
3. Next, get the currency unitPattern.
   1. Look for a `unitPattern` element that matches the `count` value, starting in the current locale and then following the locale fallback chain up to, but not including root.
   2. If no matching `unitPattern` element was found in the previous step, then look for a `unitPattern` element that matches `count="other"`, starting in the current locale and then following the locale fallback chain up to root (which has a `unitPattern` element with `count="other"` for every unit type).
   3. The resulting unitPattern element indicates the appropriate positioning of the numeric value and the currency display name.
4. Next, get the `displayName` element for the currency.
   1. Look for a `displayName` element that matches the `count` value, starting in the current locale and then following the locale fallback chain up to, but not including root.
   2. If no matching `displayName` element was found in the previous step, then look for a `displayName` element that matches `count="other"`, starting in the current locale and then following the locale fallback chain up to, but not including root.
   3. If no matching `displayName` element was found in the previous step, then look for a `displayName` element with no count, starting in the current locale and then following the locale fallback chain up to root.
   4. If there is no `displayName` element, use the currency code itself (for example, "ZWD").
5. Format the numeric value according to the locale. Use the locale’s `<decimalFormats …>` pattern, not the `<currencyFormats>` pattern that is used with the symbol (e.g., Z$). As when formatting symbol currency values, reset the number of decimals according to the supplemental `<currencyData>` and use the currencyDecimal symbol if different from the decimal symbol.
   1. The number of decimals should be overridable in an API, so that clients can choose between “2 US dollars” and “2.00 US dollars”.
6. Substitute the formatted numeric value for the {0} in the `unitPattern`, and the currency display name for the {1}.

While for English this may seem overly complex, for some other languages different plural forms are used for different unit types; the plural forms for certain unit types may not use all of the plural-form tags defined for the language.

For example, if the currency is ZWD and the number is 1234, then the latter maps to `count="other"` for English. The unit pattern for that is "{0} {1}", and the display name is "Zimbabwe dollars". The final formatted number is then "1,234 Zimbabwe dollars".
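
The lookup steps for the long display name can be sketched in Python. Everything here is illustrative: the `LOCALES` dictionary is hypothetical stand-in data, the plural rule is a simplified English rule for integers, Python's `,` format specifier stands in for the locale's decimal format, and the explicit 0 and 1 rule of step 1 is omitted.

```python
# Hypothetical, heavily simplified locale data; real CLDR data is much richer.
LOCALES = {
    "root": {"parent": None, "unitPattern": {"other": "{0} {1}"}, "displayName": {}},
    "en": {
        "parent": "root",
        "unitPattern": {"other": "{0} {1}"},
        "displayName": {
            "ZWD": {None: "Zimbabwe Dollar", "one": "Zimbabwe dollar",
                    "other": "Zimbabwe dollars"},
        },
    },
}


def fallback_chain(locale, include_root):
    """Yield locale data from `locale` up the parent chain."""
    while locale is not None and (include_root or locale != "root"):
        yield LOCALES[locale]
        locale = LOCALES[locale]["parent"]


def find_unit_pattern(locale, count):
    for data in fallback_chain(locale, include_root=False):   # step 3.1
        if count in data["unitPattern"]:
            return data["unitPattern"][count]
    for data in fallback_chain(locale, include_root=True):    # step 3.2
        if "other" in data["unitPattern"]:
            return data["unitPattern"]["other"]
    raise LookupError('root is expected to define count="other"')


def find_display_name(locale, code, count):
    # Steps 4.1 to 4.3: requested count, then "other", then the name with no count.
    for wanted, include_root in ((count, False), ("other", False), (None, True)):
        for data in fallback_chain(locale, include_root):
            names = data["displayName"].get(code, {})
            if wanted in names:
                return names[wanted]
    return code  # step 4.4: fall back to the currency code


def format_long_currency(n: int, code: str, locale: str = "en") -> str:
    count = "one" if n == 1 else "other"  # assumption: English rule for integers
    pattern = find_unit_pattern(locale, count)
    name = find_display_name(locale, code, count)
    number = f"{n:,}"                     # assumption: stands in for <decimalFormats>
    return pattern.replace("{0}", number).replace("{1}", name)


print(format_long_currency(1234, "ZWD"))  # 1,234 Zimbabwe dollars
```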

When the currency symbol is substituted into a pattern, there may be some further modifications, according to the following.

```xml
<currencySpacing>
  <beforeCurrency>
    <currencyMatch>[:^S:]</currencyMatch>
    <surroundingMatch>[:digit:]</surroundingMatch>
    <insertBetween> </insertBetween>
  </beforeCurrency>
  <afterCurrency>
    <currencyMatch>[:^S:]</currencyMatch>
    <surroundingMatch>[:digit:]</surroundingMatch>
    <insertBetween> </insertBetween>
  </afterCurrency>
</currencySpacing>
```

This element controls whether additional characters are inserted on the boundary between the symbol and the pattern. For example, with the above `currencySpacing`, inserting the symbol "US$" into the pattern "#,##0.00¤" would result in an extra _no-break space_ inserted before the symbol, for example, "#,##0.00 US$". The `beforeCurrency` element governs this case, since we are looking _before_ the "¤" symbol. The `currencyMatch` is positive, since the "U" in "US$" is at the start of the currency symbol being substituted. The `surroundingMatch` is positive, since the character just before the "¤" will be a digit. Because these two conditions are true, the insertion is made.

Conversely, look at the pattern "¤#,##0.00" with the symbol "US$". In this case, there is no insertion; the result is simply "US$#,##0.00". The `afterCurrency` element governs this case, since we are looking _after_ the "¤" symbol. The `surroundingMatch` is positive, since the character just after the "¤" will be a digit. However, the `currencyMatch` is **not** positive, since the "\$" in "US\$" is at the end of the currency symbol being substituted. So the insertion is not made.

For more information on the matching used in the `currencyMatch` and `surroundingMatch` elements, see the main document _[Appendix E: Unicode Sets](tr35.md#Unicode_Sets)_.
830
831Currencies can also contain optional grouping, decimal data, and pattern elements. This data is inherited from the `<symbols>` in the same locale data (if not present in the chain up to root), so only the _differing_ data will be present. See the main document _Section 4.1 [Multiple Inheritance](tr35.md#Multiple_Inheritance)_.
832
833> **Note:** _Currency values should **never** be interchanged without a known currency code. You never want the number 3.5 interpreted as $3.50 by one user and €3.50 by another._ Locale data contains localization information for currencies, not a currency value for a country. A currency amount logically consists of a numeric value, plus an accompanying currency code (or equivalent). The currency code may be implicit in a protocol, such as where USD is implicit. But if the raw numeric value is transmitted without any context, then it has no definitive interpretation.
834
Notice that the currency code is completely independent of the end-user's language or locale. For example, BGN is the code for Bulgarian Lev. A currency amount of <BGN, 1.23456×10³> would be localized for a Bulgarian user into "1 234,56 лв." (using Cyrillic letters). For an English user it would be localized into the string "BGN 1,234.56". The end-user's language is needed for doing this last localization step; but that language is completely orthogonal to the currency code needed in the data. After all, the same English user could be working with dozens of currencies. Notice also that the currency code is independent of whether currency values are inter-converted, which requires more interesting financial processing: the rate of conversion may depend on a variety of factors.
836
Thus, once a currency amount is entered into a system, it should be logically accompanied by a currency code in all processing. This currency code is independent of whatever the user's original locale was. Only in badly-designed software is the currency code (or equivalent) not present, so that the software has to "guess" at the currency code based on the user's locale.
838
839> **Note:** The number of decimal places **and** the rounding for each currency is not locale-specific data, and is not contained in the Locale Data Markup Language format. Those values override whatever is given in the currency numberFormat. For more information, see _[Supplemental Currency Data](#Supplemental_Currency_Data)_.
840
841For background information on currency names, see [[CurrencyInfo](tr35.md#CurrencyInfo)].
842
843### 4.1 <a name="Supplemental_Currency_Data" href="#Supplemental_Currency_Data">Supplemental Currency Data</a>
844
845```xml
846<!ELEMENT currencyData ( fractions*, region+ ) >
847<!ELEMENT fractions ( info+ ) >
848
849<!ELEMENT info EMPTY >
850<!ATTLIST info iso4217 NMTOKEN #REQUIRED >
851<!ATTLIST info digits NMTOKEN #IMPLIED >
852<!ATTLIST info rounding NMTOKEN #IMPLIED >
853<!ATTLIST info cashDigits NMTOKEN #IMPLIED >
854<!ATTLIST info cashRounding NMTOKEN #IMPLIED >
855
856<!ELEMENT region ( currency* ) >
857<!ATTLIST region iso3166 NMTOKEN #REQUIRED >
858
859<!ELEMENT currency ( alternate* ) >
860<!ATTLIST currency from NMTOKEN #IMPLIED >
861<!ATTLIST currency to NMTOKEN #IMPLIED >
862<!ATTLIST currency iso4217 NMTOKEN #REQUIRED >
863<!ATTLIST currency tender ( true | false ) #IMPLIED >
864```
865
866Each `currencyData` element contains one `fractions` element followed by one or more `region` elements. Here is an example for illustration.
867
```xml
<supplementalData>
    <currencyData>
        <fractions>
            <info iso4217="CHF" digits="2" rounding="5"/>
            <info iso4217="ITL" digits="0"/>
        </fractions>
        <region iso3166="IT">
            <currency iso4217="EUR" from="1999-01-01"/>
            <currency iso4217="ITL" from="1862-8-24" to="2002-02-28"/>
        </region>
        <region iso3166="CS">
            <currency iso4217="EUR" from="2003-02-04"/>
            <currency iso4217="CSD" from="2002-05-15"/>
            <currency iso4217="YUM" from="1994-01-24" to="2002-05-15"/>
        </region>
    </currencyData>
</supplementalData>
```
894
895The `fractions` element contains any number of `info` elements, with the following attributes:
896
897* **iso4217:** the ISO 4217 code for the currency in question. If a particular currency does not occur in the fractions list, then it is given the defaults listed for the next two attributes.
* **digits:** the minimum and maximum number of decimal digits normally formatted. The default is 2. For example, in the en_US locale with the default value of 2 digits, the value 1 USD would format as "$1.00", and the value 1.123 USD would format as "$1.12".
899* **rounding:** the rounding increment, in units of 10<sup>-digits</sup>. The default is 0, which means no rounding is to be done. Therefore, rounding=0 and rounding=1 have identical behavior. Thus with fraction digits of 2 and rounding increment of 5, numeric values are rounded to the nearest 0.05 units in formatting. With fraction digits of 0 and rounding increment of 50, numeric values are rounded to the nearest 50.
900* **cashDigits:** the number of decimal digits to be used when formatting quantities used in cash transactions (as opposed to a quantity that would appear in a more formal setting, such as on a bank statement). If absent, the value of "digits" should be used as a default.
* **cashRounding:** the cash rounding increment, in units of 10<sup>-cashDigits</sup>. The default is 0, which means no rounding is to be done; and as with rounding, this has the same effect as cashRounding="1". This is the rounding increment to be used when formatting quantities used in cash transactions (as opposed to a quantity that would appear in a more formal setting, such as on a bank statement). If absent, the value of "rounding" should be used as a default.
902
903For example, the following line
904
905```xml
906<info iso4217="CZK" digits="2" rounding="0"/>
907```
908
909should cause the value 2.006 to be displayed as “2.01”, not “2.00”.
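
The interaction of `digits` and `rounding` can be sketched with Python's `decimal` module. The function name `round_currency` is hypothetical, and the rounding mode (half-even) is an assumption made for this sketch; the text above does not prescribe one.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_currency(value: str, digits: int = 2, rounding: int = 0) -> Decimal:
    """Round to the nearest multiple of rounding * 10^-digits.

    rounding=0 behaves like rounding=1, as described above.
    The half-even rounding mode is an assumption of this sketch.
    """
    increment = Decimal(max(rounding, 1)).scaleb(-digits)
    amount = Decimal(value)
    # Number of whole increments, rounded to an integer.
    steps = (amount / increment).quantize(Decimal(1), rounding=ROUND_HALF_EVEN)
    # Re-scale and keep exactly `digits` fraction digits.
    return (steps * increment).quantize(Decimal(1).scaleb(-digits))


print(round_currency("2.006", digits=2, rounding=0))  # CZK example above
print(round_currency("1.23", digits=2, rounding=5))   # nearest 0.05
print(round_currency("1234", digits=0, rounding=50))  # nearest 50
```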
910
911Each `region` element contains one attribute:
912
* **iso3166:** the ISO 3166 code for the region in question. The special value _XXX_ can be used to indicate that the region has no valid currency or that the circumstances are unknown (usually used in conjunction with the `to` attribute, as described below).
914
Each `region` element can also contain any number of `currency` subelements, whose order is significant.
916
917```xml
918<region iso3166="IT"> <!-- Italy -->
919    <currency iso4217="EUR" from="2002-01-01"/>
920    <currency iso4217="ITL" to="2001-12-31"/>
921</region>
922```
923
* **iso4217:** the ISO 4217 code for the currency in question. Note that some additional codes that were in widespread usage are included; others, such as GHP, are not included because they were never used.
* **from:** the currency was valid from the datetime indicated by the value. See the main document _Section 5.2.1 [Dates and Date Ranges](tr35.md#Date_Ranges)_.
* **to:** the currency was valid up to the datetime indicated by the value. See the main document _Section 5.2.1 [Dates and Date Ranges](tr35.md#Date_Ranges)_.
927* **tender:** indicates whether or not the ISO currency code represents a currency that was or is legal tender in some country. The default is "true". Certain ISO codes represent things like financial instruments or precious metals, and do not represent normally interchanged currencies.
928
929
930That is, each `currency` element will list an interval in which it was valid. The _ordering_ of the elements in the list tells us which was the primary currency during any period in time. Here is an example of such an overlap:
931
932```xml
933<currency iso4217="CSD" to="2002-05-15"/>
934<currency iso4217="YUD" from="1994-01-24" to="2002-05-15"/>
935<currency iso4217="YUN" from="1994-01-01" to="1994-07-22"/>
936```
937
The `from` attribute is limited by the fact that ISO 4217 does not go very far back in time, so there may be no ISO code for the previous currency.
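
As an illustration of how the validity intervals and the ordering might be consumed, the following Python sketch looks up the currencies valid in a region on a given day. The list `ITALY` mirrors the Italian example above; the function name `currencies_on`, the dictionary layout, and the treatment of `to` as an inclusive end date are assumptions of this sketch. It also assumes zero-padded `YYYY-MM-DD` dates.

```python
from datetime import date

# Mirrors the <region iso3166="IT"> example; element order is preserved.
ITALY = [
    {"iso4217": "EUR", "from": "2002-01-01"},
    {"iso4217": "ITL", "to": "2001-12-31"},
]


def currencies_on(entries: list, day: date) -> list:
    """Return the codes valid on `day`, in element order (primary first)."""
    valid = []
    for entry in entries:
        # A missing bound is treated as open-ended.
        start = date.fromisoformat(entry["from"]) if "from" in entry else date.min
        end = date.fromisoformat(entry["to"]) if "to" in entry else date.max
        if start <= day <= end:  # `to` is assumed inclusive here
            valid.append(entry["iso4217"])
    return valid


print(currencies_on(ITALY, date(2001, 6, 1)))
print(currencies_on(ITALY, date(2010, 1, 1)))
```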
939
940Currencies change relatively frequently. There are different types of changes:
941
9421. YU=>CS (name change)
9432. CS=>RS+ME (split, different names)
9443. SD=>SD+SS (split, same name for one // South Sudan splits from Sudan)
9454. DE+DD=>DE (Union, reuses one name // East Germany unifies with Germany)
946
947The [UN Information](https://unstats.un.org/unsd/methodology/m49/) is used to determine dates due to country changes.
948
When a code is no longer in use, it is terminated (see #1, #2, #4).
950
951> Example:
952>
953> * ```<currency iso4217="EUR" from="2003-02-04" to="2006-06-03"/>```
954
955When codes split, each of the new codes inherits (see #2, #3) the previous data. However, some modifications can be made if it is clear that currencies were only in use in one of the parts.
956
957When codes merge, the data is copied from the most populous part.
958
959> Example. When CS split into RS and ME:
960>
961> * RS & ME copy the former CS, except that the line for EUR is dropped from RS
962> * CS now terminates on Jun 3, 2006 (following the UN info)
963
964## 5 <a name="Language_Plural_Rules" href="#Language_Plural_Rules">Language Plural Rules</a>
965
966```xml
967<!ELEMENT plurals (pluralRules*, pluralRanges*) >
968<!ATTLIST plurals type ( ordinal | cardinal ) #IMPLIED > <!-- default is cardinal -->
969
970<!ELEMENT pluralRules (pluralRule*) >
971<!ATTLIST pluralRules locales NMTOKENS #REQUIRED >
972
973<!ELEMENT pluralRule ( #PCDATA ) >
974<!ATTLIST pluralRule count (zero | one | two | few | many | other) #REQUIRED >
975```
976
977The plural categories are used to format messages with numeric placeholders, expressed as decimal numbers. The fundamental rule for determining plural categories is the existence of minimal pairs: whenever two different numbers may require different versions of the same message, then the numbers have different plural categories.
978
979This happens even if nouns are invariant; even if all English nouns were invariant (like “sheep”), English would still require 2 plural categories because of subject-verb agreement, and pronoun agreement. For example:
980
9811. 1 sheep **is** here. Do you want to buy **it**?
9822. 2 sheep **are** here. Do you want to buy **them**?
983
984For more information, see [Determining-Plural-Categories](http://cldr.unicode.org/index/cldr-spec/plural-rules#h.44ozdx564iez).
985
986English does not have a separate plural category for “zero”, because it does not require a different message for “0”. For example, the same message can be used below, with just the numeric placeholder changing.
987
9881. You have 3 friends online.
9892. You have 0 friends online.
990
However, across many languages it is commonly more natural to express "0" messages with a negative (“None of your friends are online.”) and "1" messages with an alternate form (“You have a friend online.”). Thus pluralized message APIs should also offer the ability to specify at least the 0 and 1 cases explicitly; developers can use that ability whenever these values might occur in a placeholder.
992
993The CLDR plural rules are not expected to cover all cases. For example, strictly speaking, there could be more plural and ordinal forms for English. Formally, we have a different plural form where a change in digits forces a change in the rest of the sentence. There is an edge case in English because of the behavior of "a/an".
994
995For example, in changing from 3 to 8:
996
997* "a 3rd of a loaf" should result in "an 8th of a loaf", not "a 8th of a loaf"
998* "a 3 foot stick" should result in "an 8 foot stick", not "a 8 foot stick"
999
1000So numbers of the following forms could have a special plural category and special ordinal category: 8(X), 11(X), 18(X), 8x(X), where x is 0..9 and the optional X is 00, 000, 00000, and so on.
1001
1002On the other hand, the above constructions are relatively rare in messages constructed using numeric placeholders, so the disruption for implementations currently using CLDR plural categories wouldn't be worth the small gain.
1003
1004This section defines the types of plural forms that exist in a language—namely, the cardinal and ordinal plural forms. Cardinal plural forms express units such as time, currency or distance, used in conjunction with a number expressed in decimal digits (i.e. "2", not "two", and not an indefinite number such as "some" or "many"). Ordinal plural forms denote the order of items in a set and are always integers. For example, English has two forms for cardinals:
1005
1006* form "one": 1 day
1007* form "other": 0 days, 2 days, 10 days, 0.3 days
1008
1009and four forms for ordinals:
1010
1011* form "one": 1st floor, 21st floor, 101st floor
1012* form "two": 2nd floor, 22nd floor, 102nd floor
1013* form "few": 3rd floor, 23rd floor, 103rd floor
1014* form "other": 4th floor, 11th floor, 96th floor
1015
1016Other languages may have additional forms or only one form for each type of plural. CLDR provides the following tags for designating the various plural forms of a language; for a given language, only the tags necessary for that language are defined, along with the specific numeric ranges covered by each tag (for example, the plural form "few" may be used for the numeric range 2–4 in one language and 3–9 in another):
1017
1018* zero (see also plural case “0”, described in [Explicit 0 and 1 rules](#Explicit_0_1_rules))
1019* one (see also plural case “1”, described in [Explicit 0 and 1 rules](#Explicit_0_1_rules))
1020* two
1021* few
1022* many
1023
1024In addition, an "other" tag is always implicitly defined to cover the forms not explicitly designated by the tags defined for a language. This "other" tag is also used for languages that only have a single form (in which case no plural-form tags are explicitly defined for the language). For a more complex example, consider the cardinal rules for Russian and certain other languages:
1025
```xml
<pluralRules locales="hr ru sr uk">
    <pluralRule count="one">n mod 10 is 1 and n mod 100 is not 11</pluralRule>
    <pluralRule count="few">n mod 10 in 2..4 and n mod 100 not in 12..14</pluralRule>
</pluralRules>
```
1032
1033These rules specify that Russian has a "one" form (for 1, 21, 31, 41, 51, …), a "few" form (for 2–4, 22–24, 32–34, …), and implicitly an "other" form (for everything else: 0, 5–20, 25–30, 35–40, …, decimals). Russian does not need additional separate forms for zero, two, or many, so these are not defined.
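
For integers, the two rules above can be transcribed directly. This is a minimal sketch: the function name `ru_cardinal` is hypothetical, and only integer input is handled (decimals, which fall under "other" per the text above, are out of scope).

```python
def ru_cardinal(n: int) -> str:
    """Plural category for an integer under the two rules shown above."""
    n = abs(n)  # the category is computed from the absolute value
    if n % 10 == 1 and n % 100 != 11:
        return "one"
    if 2 <= n % 10 <= 4 and not 12 <= n % 100 <= 14:
        return "few"
    return "other"  # implicit category for everything else


for value in (1, 2, 5, 11, 12, 21, 22, 25):
    print(value, ru_cardinal(value))
```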
1034
1035A source number represents the visual appearance of the digits of the result. In text, it can be represented by the EBNF for decimalValue. Note that the same double number can be represented by multiple source numbers. For example, "1.0" and "1.00" are different source numbers, but there is only one double number that they correspond to: 1.0d == 1.00d. As another example, 1e3d == 1000d, but the source numbers "1e3" and "1000" are different, and can have different plural categories. So the input to the plural rules carries more information than a computer double. The plural category for negative numbers is calculated according to the absolute value of the source number, and leading integer digits don't have any effect on the plural category calculation. (This may change in the future, if we find languages that have different behavior.)
1036
1037Plural categories may also differ according to the visible decimals. For example, here are some of the behaviors exhibited by different languages:
1038
1039| Behavior | Description | Example |
1040| --- | --- | --- |
1041| Base | The fractions are ignored; the category is the same as the category of the integer. | 1.13 has the same plural category as 1. |
1042| Separate | All fractions by value are in one category (typically ‘other’ = ‘plural’). | 1.01 gets the same class as 9; <br/> 1.00 gets the same category as 1. |
| Visible | All visible fractions are in one category (typically ‘other’ = ‘plural’). | 1.00, 1.01, 3.5 all get the same category. |
1044| Digits | The visible fraction determines the category. | 1.13 gets the same class as 13. |
1045
1046There are also variants of the above: for example, short fractions may have the Digits behavior, but longer fractions may just look at the final digit of the fraction.
1047
1048#### <a name="Explicit_0_1_rules" href="#Explicit_0_1_rules">Explicit 0 and 1 rules</a>
1049
1050Some types of CLDR data (such as [unitPatterns](tr35-general.md#Unit_Elements) and [currency displayNames](#Currencies)) allow specification of plural rules for explicit cases “0” and “1”, in addition to the language-specific plural cases specified above: “zero”, “one”, “two” ... “other”. For the language-specific plural rules:
1051
* The rules depend on language; for a given language, only a subset of the cases may be defined. For example, English only defines “one” and “other”; cases like “two” and “few” cannot be used in plurals for English CLDR items.
1053* Each plural case may cover multiple numeric values, and may depend on the formatting of those values. For example, in French the “one” case covers 0.0 through 1.99.
1054* The “one” case, if defined, includes at least some formatted forms of the numeric value 1; the “zero” case, if defined, includes at least some formatted forms of the numeric value 0.
1055
1056By contrast, for the explicit cases “0” and “1”:
1057
1058* The explicit “0” and “1” cases are not defined by language-specific rules, and are available in any language for the CLDR data items that accept them.
1059* The explicit “0” and “1” cases apply to the exact numeric values 0 and 1 respectively. These cases are typically used for plurals of items that do not have fractional value, like books or files.
1060* The explicit “0” and “1” cases have precedence over the “zero” and “one” cases. For example, if for a particular element CLDR data includes values for both the “1” and “one” cases, then the “1” value is used for numeric values of exactly 1, while the “one” value is used for any other formatted numeric values matching the “one” plural rule for the language.
1061
1062Usage example: In English (which only defines language-specific rules for “one” and “other”) this can be used to have special behavior for 0:
1063
1064* count=“0”: no books
1065* count=“one”: {0} book, e.g. “1 book”
1066* count=“other”: {0} books, e.g. “3 books”
1067
1068### 5.1 <a name="Plural_rules_syntax" href="#Plural_rules_syntax">Plural rules syntax</a>
1069
1070The xml value for each pluralRule is a _condition_ with a boolean result that specifies whether that rule (i.e. that plural form) applies to a given numeric value _n_, where n can be expressed as a decimal fraction or with compact decimal formatting, denoted by a special notation in the syntax, e.g., “1.2c6” for “1.2M”. Clients of CLDR may express all the rules for a locale using the following syntax:
1071
1072```
1073rules         = rule (';' rule)*
1074rule          = keyword ':' condition samples
1075              | 'other' ':' samples
1076keyword       = [a-z]+
1078```
1079
1080In CLDR, the keyword is the attribute value of 'count'. Those values in CLDR are currently limited to just what is in the DTD, but clients may support other values.
1081
1082The conditions themselves have the following syntax.
1083
1084```
1085condition       = and_condition ('or' and_condition)*
1086samples         = ('@integer' sampleList)?
1087                  ('@decimal' sampleList)?
1088and_condition   = relation ('and' relation)*
1089relation        = is_relation | in_relation | within_relation
1090is_relation     = expr 'is' ('not')? value
1091in_relation     = expr (('not')? 'in' | '=' | '!=') range_list
1092within_relation = expr ('not')? 'within' range_list
1093expr            = operand (('mod' | '%') value)?
1094operand         = 'n' | 'i' | 'f' | 't' | 'v' | 'w' | 'c' | 'e'
1095range_list      = (range | value) (',' range_list)*
1096range           = value'..'value
1097value           = digit+
1098sampleList      = sampleRange (',' sampleRange)* (',' ('…'|'...'))?
1099sampleRange     = sampleValue ('~' sampleValue)?
sampleValue     = value ('.' digit+)? ([ce] digitPos digit*)?
1101digit           = [0-9]
1102digitPos        = [1-9]
1103```
1104
1105* Whitespace (defined as Unicode [Pattern_White_Space](https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=%5Cp%7BPattern_White_Space%7D)) can occur between or around any of the above tokens, with the exception of the tokens in value, digit, and decimalValue.
1106* In the syntax, **and** binds more tightly than **or**. So **X or Y and Z** is interpreted as **(X or (Y and Z))**.
* Each plural rule must be written to be self-contained, and not depend on the ordering. Thus rules must be mutually exclusive; for a given numeric value, only one rule can apply (i.e., the condition can only be true for one of the pluralRule elements). Each keyword can have at most one condition. The 'other' keyword must have an empty condition: it is only present for samples.
1108* The samples should be included, since they are used by client software for samples and determining whether the keyword has finite values or not.
1109* The 'other' keyword must have no condition, and all other keywords must have a condition.
1110
1111#### 5.1.1 <a name="Operands" href="#Operands">Operands</a>
1112
1113The operands correspond to features of the source number, and have the following meanings.
1114
1115##### <a name="Plural_Operand_Meanings" href="#Plural_Operand_Meanings">Plural Operand Meanings</a>
1116
1117| Symbol | Value |
1118| --- | --- |
1119| n | absolute value of the source number. |
1120| i | integer digits of n. |
1121| v | number of visible fraction digits in n, _with_ trailing zeros.* |
1122| w | number of visible fraction digits in n, _without_ trailing zeros.* |
1123| f | visible fraction digits in n, _with_ trailing zeros.* |
1124| t | visible fraction digits in n, _without_ trailing zeros.* |
1125| c | compact decimal exponent value: exponent of the power of 10 used in compact decimal formatting. |
| e | currently a synonym for ‘c’; however, it may be redefined in the future. |
1127
\* If there is a compact decimal exponent value (‘c’), then the f, t, v, and w values are computed _after_ shifting the decimal point in the original by the ‘c’ value. So for 1.2c3, the f, t, v, and w values are the same as those of 1200: i=1200 and f=0. Similarly, 1.2005c3 has i=1200 and f=5 (corresponding to 1200.5).
1129
1130##### <a name="Plural_Operand_Examples" href="#Plural_Operand_Examples">Plural Operand Examples</a>
1131
1132| source | n | i | v | w | f | t | e |
1133| ---: | ---: | ---: | ---: | ---: | ---: | ---: | ---: |
1134| 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
1135| 1.0 | 1 | 1 | 1 | 0 | 0 | 0 | 0 |
1136| 1.00 | 1 | 1 | 2 | 0 | 0 | 0 | 0 |
1137| 1.3 | 1.3 | 1 | 1 | 1 | 3 | 3 | 0 |
1138| 1.30 | 1.3 | 1 | 2 | 1 | 30 | 3 | 0 |
1139| 1.03 | 1.03 | 1 | 2 | 2 | 3 | 3 | 0 |
1140| 1.230 | 1.23 | 1 | 3 | 2 | 230 | 23 | 0 |
1141| 1200000 | 1200000 | 1200000 | 0 | 0 | 0 | 0 | 0 |
1142| 1.2c6 | 1200000 | 1200000 | 0 | 0 | 0 | 0 | 6 |
1143| 123c6 | 123000000 | 123000000 | 0 | 0 | 0 | 0 | 6 |
1144| 123c5 | 12300000 | 12300000 | 0 | 0 | 0 | 0 | 5 |
1145| 1200.50 | 1200.5 | 1200 | 2 | 1 | 50 | 5 | 0 |
1146| 1.20050c3 | 1200.5 | 1200 | 2 | 1 | 50 | 5 | 3 |
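
The operand values in the table can be derived from the source string with a few string operations. The following Python sketch is one possible approach; the function name `plural_operands` is hypothetical, and the input is assumed to be a well-formed source number with an optional `c` or `e` exponent.

```python
from decimal import Decimal


def plural_operands(source: str) -> dict:
    """Compute n, i, v, w, f, t, c, e from a source number string."""
    text = source.lstrip("+-")  # operands use the absolute value
    mantissa, exp = text, 0
    for marker in ("c", "e"):
        if marker in text:
            mantissa, exp_text = text.split(marker)
            exp = int(exp_text)
            break
    int_part, _, frac = mantissa.partition(".")
    # Shift the decimal point right by the compact exponent.
    frac = frac.ljust(exp, "0")
    int_part, frac = int_part + frac[:exp], frac[exp:]
    int_part = int_part.lstrip("0") or "0"
    trimmed = frac.rstrip("0")  # fraction digits without trailing zeros
    return {
        "n": Decimal(f"{int_part}.{frac}") if frac else Decimal(int_part),
        "i": int(int_part),
        "v": len(frac),
        "w": len(trimmed),
        "f": int(frac or 0),
        "t": int(trimmed or 0),
        "c": exp,
        "e": exp,  # currently a synonym for c
    }


for sample in ("1.30", "1.2c6", "1.20050c3"):
    print(sample, plural_operands(sample))
```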
1147
1148
1149#### 5.1.2 <a name="Relations" href="#Relations">Relations</a>
1150
The positive relations are of the format **x = y** and **x mod z = y**. The **y** value can be a comma-separated list, such as **n = 3, 5, 7..15**, and is treated as if each relation were expanded into an OR statement. The range value **a..b** is equivalent to listing all the _**integers**_ between **a** and **b**, inclusive. When **!=** is used, it means the entire relation is negated.
1152
1153##### <a name="Relations_Examples" href="#Relations_Examples">Relations Examples</a>
1154
1155| Expression | Meaning |
1156| --- | --- |
1157| x = 2..4, 15 | x = 2 OR x = 3 OR x = 4 OR x = 15 |
1158| x != 2..4, 15 | NOT (x = 2 OR x = 3 OR x = 4 OR x = 15) |
1159
1160| Expression | Value |
1161| --- | --- |
1162| 3.5 = 2..4, 15 | false |
1163| 3.5 != 2..4, 15 | true |
1164| 3 = 2..4, 15 | true |
1165| 3 != 2..4, 15 | false |
1166
1167> The old keywords 'mod', 'in', 'is', and 'within' are present only for backwards compatibility. The preferred form is to use '%' for modulo, and '=' or '!=' for the relations, with the operand 'i' instead of within. (The difference between **in** and **within** is that **in** only includes integers in the specified range, while **within** includes all values.)
1168
1169The modulus (% or **mod**) is a remainder operation as defined in Java; for example, where **n** = 4.3 the result of **n mod 3** is 1.3.
1170
1171The values of relations are defined according to the operand as follows. Importantly, the results may depend on the visible decimals in the source, including trailing zeros, and the compact decimal exponent.
1172
1. Let the base value BV be computed from the absolute value of the original source number according to the operand.
2. Let R be false when the comparison contains ‘not’ or ‘!=’, and true otherwise.
3. Let R be !R if the comparison contains ‘within’ and the source number is not an integer.
4. If there is a modulus value MV, let BV be BV - MV × floor(BV/MV).
5. Let CR be the list of comparison ranges, normalized so that overlapping ranges are merged. Single values in the rule are represented by a range with identical \<start<sub>i</sub>, end<sub>i</sub>> values.
6. Iterate through CR:
   * if start<sub>i</sub> ≤ BV ≤ end<sub>i</sub> then return R.
7. Otherwise return !R.
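
For the preferred syntax (`=` and `!=`, optionally with a modulus), the evaluation of a single relation can be sketched as follows. The function name `relation_holds` and its parameters are hypothetical. The sketch covers only the integer-range semantics of `=` and `!=` and does not model the legacy `within` keyword.

```python
from decimal import Decimal


def relation_holds(value, ranges, modulus=None, negated=False) -> bool:
    """Evaluate `operand [% modulus] (= | !=) range_list` for one operand value.

    `ranges` is a list of (start, end) integer pairs; a single value v is
    written as (v, v). `negated=True` corresponds to '!='.
    """
    bv = abs(Decimal(str(value)))
    if modulus is not None:
        bv = bv % modulus  # remainder, e.g. 4.3 % 3 == 1.3
    # Ranges list integers only, so a non-integer base value never matches.
    is_integer = bv == bv.to_integral_value()
    matched = is_integer and any(lo <= bv <= hi for lo, hi in ranges)
    return matched != negated


RANGES = [(2, 4), (15, 15)]  # the range list "2..4, 15"
print(relation_holds("3.5", RANGES))
print(relation_holds("3.5", RANGES, negated=True))
print(relation_holds(3, RANGES))
print(relation_holds(3, RANGES, negated=True))
```

The four calls correspond to the four rows of the value table in the Relations Examples.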
1181
1182##### <a name="Plural_Rules_Examples" href="#Plural_Rules_Examples">Plural Rules Examples</a>
1183
1184| Rules | Comments |
1185| --- | --- |
1186| one: n = 1 <br/> few: n = 2..4 | This defines two rules, for 'one' and 'few'. The condition for 'one' is "n = 1" which means that the number must be equal to 1 for this condition to pass. The condition for 'few' is "n = 2..4" which means that the number must be between 2 and 4 inclusive for this condition to pass. All other numbers are assigned the keyword 'other' by the default rule. |
| zero: n = 0 or n != 1 and n mod 100 = 1..19 <br/> one: n = 1 | Each rule must not overlap with other rules. Also note that a modulus is applied to n in the 'zero' rule, thus its condition holds for 119, 219, 319… |
1188| one: n = 1 <br/> few: n mod 10 = 2..4 and n mod 100 != 12..14 | This illustrates conjunction and negation. The condition for 'few' has two parts, both of which must be met: "n mod 10 = 2..4" and "n mod 100 != 12..14". The first part applies a modulus to n before the test as in the previous example. The second part applies a different modulus and also uses negation, thus it matches all numbers _not_ in 12, 13, 14, 112, 113, 114, 212, 213, 214… |
1189
1190#### 5.1.3 <a name="Samples" href="#Samples">Samples</a>
1191
Samples are provided if a sample indicator (@integer or @decimal) is present on any rule. (CLDR always provides samples.)
1193
Where samples are provided, the absence of one of the sample indicators indicates that no numeric values can satisfy that rule. For example, the rule "i = 1 and v = 0" can only have integer samples, so @decimal must not occur. The @integer samples have no visible fraction digits, while @decimal samples have visible fraction digits; both can have compact decimal exponent values (if the 'e' operand occurs).
1195
The sampleRanges have a special notation: **start**~**end**. The **start** and **end** values must have the same number of decimal digits, and the same compact decimal exponent values (or neither have compact decimal exponent values). The range encompasses all and only those values **v** where **start ≤ v ≤ end**, and where **v** has the same number of decimal places as **start** and **end**, and the same compact decimal exponent values.
1197
Samples must indicate whether they are infinite or not. The '…' marker must be present if and only if infinitely many values (integer or decimal) can satisfy the rule. If a set is not infinite, it must list all the possible values.
1199
1200##### <a name="Plural_Samples_Examples" href="#Plural_Samples_Examples">Plural Samples Examples</a>
1201
1202| Rules | Comments |
1203| --- | --- |
1204| @integer 1, 3~5 | 1, 3, 4, 5. |
1205| @integer 3\~5, 103\~105, … | Infinite set: 3, 4, 5, 103, 104, 105, … |
1206| @decimal 1.3\~1.5, 1.03\~1.05, … | Infinite set: 1.3, 1.4, 1.5, 1.03, 1.04, 1.05, … |
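
The **start**~**end** notation can be expanded mechanically. The following Python sketch uses a hypothetical function name, `expand_sample_range`, and handles plain integer and decimal samples only; compact decimal exponents are left out for brevity.

```python
from decimal import Decimal


def expand_sample_range(sample: str) -> list:
    """Expand 'start~end' into every value with the same number of decimals."""
    start_text, _, end_text = sample.partition("~")
    if not end_text:
        return [start_text]  # a single sample value, not a range
    start, end = Decimal(start_text), Decimal(end_text)
    exponent = start.as_tuple().exponent
    if exponent != end.as_tuple().exponent:
        raise ValueError("start and end must have the same number of decimals")
    step = Decimal(1).scaleb(exponent)  # 1, 0.1, 0.01, ...
    values = []
    current = start
    while current <= end:
        values.append(str(current))
        current += step
    return values


print(expand_sample_range("3~5"))
print(expand_sample_range("1.3~1.5"))
print(expand_sample_range("1.03~1.05"))
```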
1207
1208In determining whether a set of samples is infinite, leading zero integer digits and trailing zero decimals are not significant. Thus "i = 1000 and f = 0" is satisfied by 01000, 1000, 1000.0, 1000.00, 1000.000, 01c3 etc. but is still considered finite.
1209
1210#### 5.1.4 <a name="Using_cardinals" href="#Using_cardinals">Using Cardinals</a>
1211
1212Elements such as `<currencyFormats>`, `<currency>` and `<unit>` provide selection among subelements designating various localized cardinal plural forms by tagging each of the relevant subelements with a different count value, or with no count value in some cases. Note that the plural forms for a specific currencyFormat, unit type, or currency type may not use all of the different plural-form tags defined for the language. To format a currency or unit type for a particular numeric value, determine the count value according to the plural rules for the language, then select the appropriate display form for the currency format, currency type or unit type using the rules in those sections:
1213
* Section 2.3 [Number Symbols](#Number_Symbols) (for `currencyFormat` elements)
1215* Section 4 [Currencies](#Currencies) (for `currency` elements)
1216* The main document section 5.11 [Unit Elements](tr35.md#Unit_Elements)
1217
1218### 5.2 <a name="Plural_Ranges" href="#Plural_Ranges">Plural Ranges</a>
1219
1220```xml
1221<!ELEMENT pluralRanges (pluralRange*) >
1222<!ATTLIST pluralRanges locales NMTOKENS #REQUIRED >
1223
1224<!ELEMENT pluralRange ( #PCDATA ) >
1225<!ATTLIST pluralRange start (zero|one|two|few|many|other) #IMPLIED >
1226<!ATTLIST pluralRange end (zero|one|two|few|many|other) #IMPLIED >
1227<!ATTLIST pluralRange result (zero|one|two|few|many|other) #REQUIRED >
1228```
1229
1230Often ranges of numbers are presented to users, such as in “Length: 3.2–4.5 centimeters”. This means any length from 3.2 cm to 4.5 cm, inclusive. However, different languages have different conventions for the pluralization given to a range: should it be “0–1 centimeter” or “0–1 centimeters”? This becomes much more complicated for languages that have many different plural forms, such as Russian or Arabic.
1231
1232The `pluralRanges` element provides information allowing an implementation to derive the plural category of a range from the plural categories of the `start` and `end` values. If there is no value for a _<`start`,`end`>_ pair, the default result is `end`. However, where that result has been verified for a given language, it is included in the CLDR data.
1233
1234The data has been gathered presuming that in any usage, the start value is strictly less than the end value, and that no values are negative. Results for any cases that do not meet these criteria are undefined.
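
The lookup with its default can be sketched in a few lines. The table `HYPOTHETICAL_RANGES` below is invented for illustration and does not reproduce CLDR data for any language; the function name `range_category` is likewise an assumption.

```python
# Invented (start, end) -> result entries; not actual CLDR data.
HYPOTHETICAL_RANGES = {
    ("one", "other"): "other",
    ("other", "one"): "other",
}


def range_category(ranges: dict, start: str, end: str) -> str:
    """Plural category of a range; defaults to the category of `end`."""
    return ranges.get((start, end), end)


print(range_category(HYPOTHETICAL_RANGES, "other", "one"))  # listed entry
print(range_category(HYPOTHETICAL_RANGES, "few", "many"))   # default: end
```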
1235
## 6 <a name="Rule-Based_Number_Formatting" href="#Rule-Based_Number_Formatting">Rule-Based Number Formatting</a>

```xml
<!ELEMENT rbnf ( alias | rulesetGrouping*) >

<!ELEMENT rulesetGrouping ( alias | ruleset*) >
<!ATTLIST rulesetGrouping type NMTOKEN #REQUIRED>

<!ELEMENT ruleset ( alias | rbnfrule*) >
<!ATTLIST ruleset type NMTOKEN #REQUIRED>
<!ATTLIST ruleset access ( public | private ) #IMPLIED >

<!ELEMENT rbnfrule ( #PCDATA ) >
<!ATTLIST rbnfrule value CDATA #REQUIRED >
<!ATTLIST rbnfrule radix CDATA #IMPLIED >
<!ATTLIST rbnfrule decexp CDATA #IMPLIED >
```

The rule-based number format (RBNF) encapsulates a set of rules for mapping binary numbers to and from a readable representation. These rules are typically used for spelling out numbers, but can also be used for other number systems such as Roman numerals or Chinese numerals, or for ordinal numbers (1st, 2nd, 3rd, …).

Where, however, the CLDR plurals or ordinals can be used, their usage is recommended in preference to the RBNF data. First, the RBNF data is not completely fleshed out over all languages that otherwise have modern coverage. Second, the alternate forms are neither complete nor useful without additional information. For example, for German there are spellout-cardinal-masculine and spellout-cardinal-feminine. But a complete solution would have all genders (masculine/feminine/neuter), all cases (nominative, accusative, dative, genitive), plus context (with strong or weak determiner or none). Moreover, even for the alternate forms that do exist, CLDR does not supply any data for when to use one vs another (e.g., when to use spellout-cardinal-masculine vs spellout-cardinal-feminine). So these data are inappropriate for general purpose software.

There are 4 common spellout rules. Some languages may provide more than these 4 types:

* **numbering:** This is the default used when there is no context for the number. For many languages, this may also be used for enumeration of objects, as when pronouncing "table number one" and "table number two". It can also be used for pronouncing a math equation, like "2 - 3 = -1".
* **numbering-year:** This is used for cases where years are pronounced or written a certain way. An example in English is the year 1999, which comes out as "nineteen ninety-nine" instead of the numbering value "one thousand nine hundred ninety-nine". The rules for this type have undefined behavior for non-integer numbers, and values less than 1.
* **cardinal:** This is used when providing a quantity of objects. For many languages, there may not be a default cardinal type. Many languages require the notion of the gender and other grammatical properties so that the number and the objects being referenced are in grammatical agreement. An example of its usage is "one e-mail", "two people" or "three kilometers". Some languages may not have dedicated words for 0 or negative numbers for cardinals. In those cases, the words from the numbering type can be reused.
* **ordinal:** This is used when providing the position of an object in an ordered sequence. For many languages, there may not be a default ordinal type. Many languages also require the notion of the gender for ordinals so that the ordinal number and the objects being referenced are in grammatical agreement. An example of its usage is "first place", "second e-mail" or "third house on the right". The rules for this type have undefined behavior for non-integer numbers, and values less than 1.

In addition to the spellout rules, there are also numbering system rules. Even though they may be derived from a specific culture, they are typically not translated and the rules are in **root**. An example is the Roman numerals, where the value 8 comes out as VIII.

With regard to the number range supported for all these number types, the rules try to support the largest possible number range, but some languages may not have words for large numbers. For example, the old Roman numbering system can't support the value 5000 and beyond. For those unsupported cases, the default number format from CLDR is used.

Any rules marked as **private** should never be referenced externally. Frequently they only support a subrange of numbers that are used in the public rules.

The syntax used in the CLDR representation of rules is intended to be simply a transcription of ICU based RBNF rules into an XML compatible syntax. The rules are fairly sophisticated; for details see _Rule-Based Number Formatter_ [[RBNF](tr35.md#RBNF)].

```xml
<rulesetGrouping>
```

Used to group rules into functional sets for use with ICU. Currently, the valid types of rule set groupings are "SpelloutRules", "OrdinalRules", and "NumberingSystemRules".

```xml
<ruleset>
```

This element denotes a specific rule set to the number formatter. The ruleset is assumed to be a public ruleset unless the attribute access="private" is specified.

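As a rough illustration of the public-by-default behavior, the following sketch lists the rulesets in a grouping that may be referenced externally. The XML fragment and its ruleset names (`demo-helper`, `demo-numbering`, `demo-cardinal`) are made up for this example, and `public_rulesets` is a hypothetical helper, not part of any CLDR or ICU API.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment for illustration; the ruleset names are invented.
SAMPLE = """
<rulesetGrouping type="SpelloutRules">
  <ruleset type="demo-helper" access="private"/>
  <ruleset type="demo-numbering"/>
  <ruleset type="demo-cardinal" access="public"/>
</rulesetGrouping>
"""

def public_rulesets(xml_text):
    """Return the types of rulesets that are public, treating a missing
    `access` attribute as public."""
    root = ET.fromstring(xml_text)
    return [
        rs.get("type")
        for rs in root.iter("ruleset")
        if rs.get("access", "public") == "public"
    ]

print(public_rulesets(SAMPLE))  # ['demo-numbering', 'demo-cardinal']
```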
```xml
<rbnfrule>
```

Contains the actual formatting rule for a particular number or sequence of numbers. The `value` attribute is used to indicate the starting number to which the rule applies. The actual text of the rule is identical to the ICU syntax, with the exception that the Unicode left and right arrow characters (← and →) are used to replace < and > in the rule text, since < and > are reserved characters in XML. The `radix` attribute is used to indicate an alternate radix to be used in calculating the prefix and postfix values for number formatting. Alternate radix values are typically used for formatting year numbers in formal documents, such as "nineteen hundred seventy-six" instead of "one thousand nine hundred seventy-six".

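The arrow substitution described above is mechanical, so converting rule text back to ICU-style syntax can be sketched briefly. The sample rules below are hypothetical (loosely modelled on English spellout) and are not copied from CLDR. Joining the `value` attribute and the rule text as `value: text` is an assumption about how a consumer might reassemble a rule line, and the `radix` and `decexp` attributes are deliberately ignored here.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for illustration only; not copied from CLDR.
SAMPLE = """
<ruleset type="spellout-demo">
  <rbnfrule value="0">zero;</rbnfrule>
  <rbnfrule value="20">twenty[-→→];</rbnfrule>
  <rbnfrule value="100">←← hundred[ →→];</rbnfrule>
</ruleset>
"""

# Map the XML-safe arrows back to the angle brackets used in ICU rule text.
ARROWS = str.maketrans({"\u2190": "<", "\u2192": ">"})

def to_icu_rules(xml_text):
    """Return one 'value: rule text' line per rbnfrule, with arrows restored."""
    root = ET.fromstring(xml_text)
    lines = []
    for rule in root.iter("rbnfrule"):
        body = (rule.text or "").translate(ARROWS)
        lines.append(f"{rule.get('value')}: {body}")
    return lines

for line in to_icu_rules(SAMPLE):
    print(line)
```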
## 7 <a name="Parsing_Numbers" href="#Parsing_Numbers">Parsing Numbers</a>

The following elements are relevant to determining the value of a parsed number:

* A possible prefix or suffix, indicating sign
* A possible currency symbol or code
* Decimal digits
* A possible decimal separator
* A possible exponent
* A possible percent or per mille character

Other characters should either be ignored, or indicate the end of input, depending on the application. The key point is to disambiguate the sets of characters that might serve in more than one position, based on context. For example, a period might be either the decimal separator, or part of a currency symbol (for example, "NA f."). Similarly, an "E" could be an exponent indicator, or a currency symbol (the Swaziland Lilangeni uses "E" in the "en" locale). An apostrophe might be the decimal separator, or might be the grouping separator.

Here is a set of heuristic rules that may be helpful:

* Any character with the decimal digit property is unambiguous and should be accepted.

  **Note:** In some environments, applications may independently wish to restrict the decimal digit set to prevent security problems. See [[UTR36](https://www.unicode.org/reports/tr41/#UTR36)].

* The exponent character can only be interpreted as such if it occurs after at least one digit, and if it is followed by at least one digit, with only an optional sign in between. A regular expression may be helpful here.
* For the sign, decimal separator, percent, and per mille, use a set of all possible characters that can serve those functions. For example, the decimal separator set could include all of [.,']. (The actual set of characters can be derived from the number symbols in the By-Type charts [[ByType](tr35.md#ByType)], which list all of the values in CLDR.) To disambiguate, the decimal separator for the locale must be removed from the "ignore" set, and the grouping separator for the locale must be removed from the decimal separator set. The same principle applies to all sets and symbols: any symbol must appear in at most one set.
* Since there are a wide variety of currency symbols and codes, this should be tried before the less ambiguous elements. It may be helpful to develop a set of characters that can appear in a symbol or code, based on the currency symbols in the locale.
* Otherwise, a character should be ignored unless it is in the "stop" set. This includes even characters that are meaningful for formatting, for example, the grouping separator.
* If more than one sign, currency symbol, exponent, or percent/per mille occurs in the input, the first found should be used.
* A currency symbol in the input should be interpreted as the longest match found in the set of possible currency symbols.
* Especially in cases of ambiguity, the user's input should be echoed back, properly formatted according to the locale, before it is actually used for anything.

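Two of the heuristics above, the exponent test and the longest-match rule for currency symbols, lend themselves to a short sketch. This is only one possible reading of those rules: the helper names `split_exponent` and `longest_currency_match` are hypothetical, only "E" and "e" with ASCII signs are treated as exponent syntax, and the symbol set in the usage lines is an arbitrary example rather than locale data.

```python
import re

# An exponent marker counts only if it follows a digit and is followed by an
# optional sign and at least one digit. In Python str patterns, \d matches any
# character with the Unicode decimal digit property.
EXPONENT = re.compile(r"(?<=\d)[eE]([+-]?\d+)")

def split_exponent(text):
    """Return (text before the first valid exponent, exponent value).

    Returns (text, None) when no valid exponent is present. Anything after
    the exponent digits is ignored in this sketch.
    """
    m = EXPONENT.search(text)
    if m is None:
        return text, None
    return text[:m.start()], int(m.group(1))

def longest_currency_match(text, pos, symbols):
    """Return the longest symbol in `symbols` that starts at text[pos], or None."""
    candidates = [s for s in symbols if text.startswith(s, pos)]
    return max(candidates, key=len) if candidates else None

print(split_exponent("1.5E3"))   # ('1.5', 3)
print(split_exponent("E100"))    # ('E100', None): "E" does not follow a digit
print(longest_currency_match("US$5", 0, {"$", "US$"}))  # US$
```

Because "E100" is not reported as an exponent, a later step is free to try "E" as a currency symbol, which matches the Lilangeni example given earlier in this section.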
* * *

Copyright © 2001–2021 Unicode, Inc. All Rights Reserved. The Unicode Consortium makes no expressed or implied warranty of any kind, and assumes no liability for errors or omissions. No liability is assumed for incidental and consequential damages in connection with or arising out of the use of the information or programs contained or accompanying this technical report. The Unicode [Terms of Use](https://unicode.org/copyright.html) apply.

Unicode and the Unicode logo are trademarks of Unicode, Inc., and are registered in some jurisdictions.