---
layout: default
title: Formatting
nav_order: 7
has_children: true
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Formatting and Parsing
{: .no_toc }

## Contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

Formatters translate between binary data and human-readable textual
representations of these values. For example, you cannot display the computer
representation of the number 103. You can only display the numeral 103 as a
textual representation (using three text characters). The result from a
formatter is a string that contains text that the user will recognize as
representing the internal value. A formatter can also parse a string by
converting a textual representation of some value back into its internal
representation. For example, it reads the characters 1, 0 and 3 followed by
something other than a digit, and produces the value 103 as an internal binary
representation.

These classes encapsulate information about the display of localized times,
days, numbers, currencies, and messages. Formatting classes do both formatting
and parsing and allow the separation of the data that the end-user sees from the
code. Separating the program code from the data allows a program to be more
easily localized. Formatting is converting a date, time, number, message or
other object from its internal representation into a string. Parsing is the
reverse operation. It is the process of converting a string to an internal
representation of the date, time, number, message or other object.

Using the formatting classes is an important step in internationalizing your
software because the `format()` and `parse()` methods in each of the classes make
your software language neutral, by replacing implicit conversions with explicit
formatting calls.
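
The round trip described above (the value 103 to the text "103" and back) can be sketched as follows. This is a minimal illustration rather than ICU sample code: it assumes the JDK's `java.text.NumberFormat`, whose `format()` and `parse()` methods mirror those of the ICU4J class of the same name in `com.ibm.icu.text`. The class and method names `RoundTripSketch`, `toText` and `toValue` are hypothetical.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

// Hypothetical helper class; assumes the JDK's java.text.NumberFormat
// as a stand-in for the ICU4J class of the same name.
final class RoundTripSketch {

    // Formatting: internal binary value -> text the user can read.
    static String toText(long value, Locale locale) {
        return NumberFormat.getIntegerInstance(locale).format(value);
    }

    // Parsing: text -> internal binary value. Parsing starts at the
    // beginning of the string and stops at the first character that
    // cannot be part of the number.
    static long toValue(String text, Locale locale) {
        try {
            return NumberFormat.getIntegerInstance(locale).parse(text).longValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        String text = toText(103, Locale.US);          // "103"
        long value = toValue("103 apples", Locale.US); // 103
        System.out.println(text + " -> " + value);
    }
}
```

Passing the locale explicitly, instead of relying on the default, keeps the conversion visible in the code and makes the result reproducible across machines.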

## Internationalization Formatting Tips

This section discusses some of the ways you can format and parse numbers,
currencies, dates, times and text messages in your program so that the data is
separate from the code and can be easily localized. This is the information your
users see on their computer screens, so it needs to be in a language and format
that conforms to their local conventions.

Some things you need to keep in mind while you are creating your code are the
following:

* Keep your code and your data separate

* Format the data in a locale-sensitive manner

* Keep your code locale-independent

* Avoid writing special routines to handle specific locales

* String objects formatted by `format()` are parseable by the `parse()` method\*

> :point_right: **Note**: Although parsing is supported in several legacy ICU APIs,
it is generally considered bad practice to parse localized strings.
For more information, read [Why You Should Not Parse
Localized Strings](https://blog.sffc.xyz/post/190943794505/why-you-should-not-parse-localized-strings).

### Numbers and Currencies

Programs store and operate on numbers using a locale-independent binary
representation. When a number is displayed or printed, it is converted to a
locale-specific string. For example, the number 12345.67 is "12,345.67" in the
US, "12 345,67" in France and "12.345,67" in Germany.

By invoking the methods provided by the `NumberFormat` class, you can format
numbers, currencies, and percentages according to the specified or default
locale. `NumberFormat` is locale-sensitive, so you need to create a new
`NumberFormat` for each locale. `NumberFormat` methods format primitive-type
numbers, such as `double`, and output the number as a locale-specific string.

For currencies you call `getCurrencyInstance` to create a formatter that returns a
string with the formatted number and the appropriate currency sign.
Of course,
the `NumberFormat` class is unaware of exchange rates, so the number output is the
same regardless of the specified currency. This means that the same number has
different monetary values depending on the currency locale. If the number is
9988776.65 the results will be:

* 9 988 776,65 € in France

* 9.988.776,65 € in Germany

* $9,988,776.65 in the United States

To format percentages, call the `getPercentInstance` method to create a
locale-specific formatter. With this formatter, a decimal fraction such as 0.75
is displayed as 75%.

#### Customizing Number Formats

If you need to customize a number format you can use the `DecimalFormat` and
the `DecimalFormatSymbols` classes in the [Formatting
Numbers](numbers/index#formatting-numbers) chapter. This is not usually necessary and
it makes your code much more complex, but it is available for those rare
instances where you need it. In general, you would do this by explicitly
specifying the number format pattern.

If you need to format or parse spelled-out numbers, you can use the
`RuleBasedNumberFormat` class (see the [Formatting Numbers](numbers/index#formatting-numbers) chapter).
You can instantiate a default formatter for a locale, or by using the
`RuleBasedNumberFormat` rule syntax, specify your own.

Using `NumberFormat` class methods (see the [Formatting Numbers](numbers/index#formatting-numbers) chapter)
with a predefined locale is the easiest and the most accurate way to format numbers and currencies.

> :point_right: **Note**: *See [Properties and ICU Rule Syntax](../strings/properties) for
information regarding syntax characters.*

### Date and Times

You display or print a Date by first converting it to a locale-specific string
that conforms to the conventions of the end user's Locale. For example, Germans
recognize 20.4.98 as a valid date, and Americans recognize 4/20/98.
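
The locale-sensitive calls discussed in this section (plain numbers, currencies, percentages and short dates) might be sketched together as follows. This is an illustration under stated assumptions, not ICU sample code: it uses the JDK's `java.text.NumberFormat` and `java.text.DateFormat`, which share their factory method names with the ICU4J classes in `com.ibm.icu.text`. The class `LocaleFormatSketch` and its helper methods are hypothetical, and the exact separator and spacing characters depend on the locale data that ships with the runtime.

```java
import java.text.DateFormat;
import java.text.NumberFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper class; assumes the JDK's java.text classes as
// stand-ins for the ICU4J classes of the same names.
final class LocaleFormatSketch {

    static String number(double value, Locale locale) {
        return NumberFormat.getInstance(locale).format(value);
    }

    // The numeric value is not converted between currencies; only the
    // symbol and the separators follow the locale.
    static String currency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    static String percent(double fraction, Locale locale) {
        return NumberFormat.getPercentInstance(locale).format(fraction);
    }

    static String shortDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.SHORT, locale).format(date);
    }

    public static void main(String[] args) {
        Locale[] locales = { Locale.US, Locale.FRANCE, Locale.GERMANY };
        Date date = new GregorianCalendar(1998, Calendar.APRIL, 20).getTime();

        // One formatter is created per locale; the input values never change.
        for (Locale locale : locales) {
            System.out.println(locale.getDisplayCountry(Locale.ENGLISH) + ":");
            System.out.println("  number:   " + number(12345.67, locale));
            System.out.println("  currency: " + currency(9988776.65, locale));
            System.out.println("  percent:  " + percent(0.75, locale));
            System.out.println("  date:     " + shortDate(date, locale));
        }
    }
}
```

Note that the program contains no locale-specific branches: supporting another locale means adding it to the array, which is the separation of code and data that this section recommends.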

> :point_right: **Note**: *The appropriate Calendar support is required for different locales. For
example, the Buddhist calendar is the official calendar in Thailand so the
typical assumption of Gregorian Calendar usage should not be used. ICU will pick
the appropriate Calendar based on the locale you supply when opening a `Calendar`
or `DateFormat`.*

### Messages

Message format helps make the order of display elements localizable. It helps
address problems of grammatical differences in languages. For example, consider
the sentence, "I go to work by car every day." In Japanese, the grammar
equivalent can be "Every day, I to work by car go." Another example is
plurals in text, for example, "no space for rent, one room for rent and many
rooms for rent," where "for rent" is the only constant text among the three.

## Formatting and Parsing Classes

ICU provides the following classes, grouped into four major areas, for
formatting numbers, dates and messages:

### General Formatting

* `Format`:

    The abstract superclass of all format classes. It provides the basic methods
    for formatting and parsing numbers, dates, strings and other objects.

* `FieldPosition`:

    A concrete class for holding the field constant and the begin and end
    indices for number and date fields.

* `ParsePosition`:

    A concrete class for holding the parse position in a string during parsing.

* `Formattable`:

    `Formattable` objects can be passed to the `Format` class or its subclasses for
    formatting. It encapsulates a polymorphic piece of data to be formatted and
    is used with `MessageFormat`. `Formattable` is used by some formatting
    operations to provide a single "type" that encompasses all formattable
    values (e.g., it can hold a number, a date, or a string, and so on).

* `UParseError`:

    `UParseError` is used to return detailed information about parsing errors.
    It is used by the ICU parsing engines that parse long rules, patterns, or
    programs. This is helpful when the text being parsed is long enough that
    more information than a `UErrorCode` is needed to locate the error.

**Formatting Numbers**

* [`NumberFormat`](numbers/legacy-numberformat#numberformat)

    The abstract superclass that provides the basic fields and methods for
    formatting `Number` objects and number primitives to localized strings and
    parsing localized strings to `Number` objects.

* [`DecimalFormat`](numbers/legacy-numberformat#decimalformat)

    A concrete class for formatting `Number` objects and number primitives to
    localized strings and parsing localized strings to `Number` objects, in base 10.

* [`RuleBasedNumberFormat`](numbers/rbnf)

    A concrete class for formatting `Number` objects and number primitives to
    localized text, especially spelled-out format such as found in check writing
    (e.g. "two hundred and thirty-four"), and parsing text into `Number` objects.

* [`DecimalFormatSymbols`](numbers/legacy-numberformat#decimalformatsymbols)

    A concrete class for accessing localized number strings, such as the
    grouping separators, decimal separator, and percent sign. Used by
    `DecimalFormat`.

**Formatting Dates and Times**

* [`DateFormat`](datetime/index#dateformat)

    The abstract superclass that provides the basic fields and methods for
    formatting `Date` objects to localized strings and parsing date and time
    strings to `Date` objects.

* [`SimpleDateFormat`](datetime/index#simpledateformat)

    A concrete class for formatting `Date` objects to localized strings and
    parsing date and time strings to `Date` objects, using a `GregorianCalendar`.

* [`DateFormatSymbols`](datetime/index#dateformatsymbols)

    A concrete class for accessing localized date-time formatting strings, such
    as names of the months, days of the week and the time zone.

**Formatting Messages**

* [`MessageFormat`](messages/index#messageformat)

    A concrete class for producing a language-specific user message that
    contains numbers, currency, percentages, date, time and string variables.

* [`ChoiceFormat`](messages/examples#choiceformat-class)

    A concrete class for mapping strings to ranges of numbers and for handling
    plurals and name series in user messages.
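
As a rough illustration of the two message classes listed above, the sketch below reorders the arguments of the "I go to work by car every day" sentence and selects between the "for rent" variants by number. It assumes the JDK's `java.text.MessageFormat` with its `choice` argument type as a stand-in for the ICU4J class of the same name. The class `MessageSketch`, its constants and the `render` helper are hypothetical, and in a real program the patterns would come from a resource bundle rather than from constants in the code.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper class; assumes the JDK's java.text.MessageFormat
// as a stand-in for the ICU4J class of the same name.
final class MessageSketch {

    // Two patterns for the same arguments. Only the pattern changes
    // when the word order of the target language differs.
    static final String COMMUTE = "I go to work by {0} {1}.";
    static final String COMMUTE_REORDERED = "{1}, I go to work by {0}.";

    // A choice argument maps numeric ranges to text: exactly 0, exactly 1,
    // and anything greater than 1. "for rent" stays constant.
    static final String ROOMS =
        "{0,choice,0#No rooms|1#One room|1<{0,number,integer} rooms} for rent";

    static String render(String pattern, Object... arguments) {
        return new MessageFormat(pattern, Locale.US).format(arguments);
    }

    public static void main(String[] args) {
        System.out.println(render(COMMUTE, "car", "every day"));
        System.out.println(render(COMMUTE_REORDERED, "car", "Every day"));
        for (int rooms : new int[] { 0, 1, 3 }) {
            System.out.println(render(ROOMS, rooms));
        }
    }
}
```

Because the arguments are referenced by index, a translator can move `{0}` and `{1}` freely within the pattern without any change to the calling code.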