No. | Date | Rel. Note | Data | Charts | Spec | Delta | SVN Tag | DTD Diffs |
---|---|---|---|---|---|---|---|---|
30 | 2016-10-05 | v30 | CLDR30 | Charts30 | LDML30 | Δ30 | release-30 | ΔDTD30 |
Unicode CLDR 30 provides an update to the key building blocks for software supporting the world's languages. This data is used by all major software systems for their software internationalization and localization, adapting software to the conventions of different languages for common software tasks. The following summarizes the main improvements in the release:
The new data fields added in this release are listed first, followed by the change to the annotation structure, which was made to simplify processing:
In locale data, new element <relativePeriod> and new attribute count for <dateFormatItem>; in supplemental data, new element <weekOfPreference>.
For generated periods like “the week of August 10”. Data examples:
<availableFormats>
<dateFormatItem id="MMMMW" count=...>'week' W 'of' MMM</dateFormatItem>
<dateFormatItem id="yw" count=...>'week' w 'of' y</dateFormatItem>
...
<field type="week">
<relativePeriod>the week of {0}</relativePeriod>
Note: the structure and intended usage for these items are still being refined; see Warnings and Errata.
New relative data items for weekdays, and for “this hour”, “this minute”. Examples:
<field type="sun">
<relativeTime type="future">
<relativeTimePattern count=...>in {0} Sundays</relativeTimePattern>
...
<field type="hour">
<relative type="0">this hour</relative>
New unit patterns for “per square kilometer”, “per square mile”.
In locale data, new elements <characterLabel>, <characterLabelPatterns>. To generate labels for groups of related characters in character pickers.
In annotation data, new attribute type for <annotation>, deprecated attribute tts for <annotation>. Restructured to make the difference clearer between short names (text-to-speech) and other keywords (for predictive typing, search, etc.). See detail below.
New data file ExtendedPictographic.txt. Specifies property data for “future-proofing” emoji segmentation.
OLD: <annotation cp='[😀]' tts='grinning face'>face; grin</annotation>
NEW: <annotation cp="😀">face | grin</annotation>
<annotation cp="😀" type="tts">grinning face</annotation>
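As a rough illustration of how the new structure might be consumed, the following Python sketch separates the keyword entry from the text-to-speech short name for each character. The helper name `read_annotations` and the `<annotations>` wrapper around the sample are assumptions made for this example; only the two `<annotation>` lines come from the release note.

```python
import xml.etree.ElementTree as ET

# Sample data in the NEW structure. The <annotations> wrapper is assumed here
# only so that the fragment parses as a single XML document.
SAMPLE = """<annotations>
  <annotation cp="😀">face | grin</annotation>
  <annotation cp="😀" type="tts">grinning face</annotation>
</annotations>"""


def read_annotations(xml_text):
    """Return {cp: {"keywords": [...], "tts": str or None}} (hypothetical helper)."""
    result = {}
    for node in ET.fromstring(xml_text).iter("annotation"):
        entry = result.setdefault(node.get("cp"), {"keywords": [], "tts": None})
        text = (node.text or "").strip()
        if node.get("type") == "tts":
            # type="tts" carries the short name used for text-to-speech.
            entry["tts"] = text
        else:
            # Entries without a type hold keywords separated by "|".
            entry["keywords"] = [k.strip() for k in text.split("|") if k.strip()]
    return result


if __name__ == "__main__":
    print(read_annotations(SAMPLE))
```

Keeping the short name and the keywords in separate fields mirrors the distinction the new structure is meant to make clearer.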
Other changes:
The measurement of the number of items reflects the different ways that the information is represented. A single data field (element or attribute value) may result in multiple data items. For example, plural rules may be shared by multiple languages, and a single data field contains all the languages to which those rules apply. Sometimes a changed item appears as a deletion+addition, and sequences of items (such as sort order) are not counted as different even if the order changes. For more details, see the Delta Data charts.
added items | deleted items* | changed items | total items |
---|---|---|---|
9.32% | 0.12% | 5.90% | 818,314 |
The JSON-format data and details about it are not yet available, but will be soon.
Users of the annotation data need to move to the new structure, described above.
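One possible way to migrate existing data is sketched below in Python. It is not an official tool: the function name `convert_annotation` is hypothetical, and the sketch assumes that the old `cp` value is a simple set such as `[😀]` and that old keywords are separated by semicolons, as in the OLD example above. More complex sets would need real set parsing.

```python
import xml.etree.ElementTree as ET


def convert_annotation(old_xml):
    """Convert one OLD-style <annotation> into a list of NEW-style elements.

    Assumption: cp is a simple set such as '[😀]'; more complex sets are
    not handled by this sketch.
    """
    old = ET.fromstring(old_xml)
    cp = old.get("cp", "")
    if cp.startswith("[") and cp.endswith("]"):
        cp = cp[1:-1]  # drop the set brackets around the character

    # Old keywords are separated by ";", new ones by " | ".
    keywords = [k.strip() for k in (old.text or "").split(";") if k.strip()]
    new_elements = []
    if keywords:
        kw = ET.Element("annotation", {"cp": cp})
        kw.text = " | ".join(keywords)
        new_elements.append(kw)

    # The deprecated tts attribute becomes a separate type="tts" element.
    tts = old.get("tts")
    if tts:
        name = ET.Element("annotation", {"cp": cp, "type": "tts"})
        name.text = tts
        new_elements.append(name)
    return [ET.tostring(e, encoding="unicode") for e in new_elements]


if __name__ == "__main__":
    old = "<annotation cp='[😀]' tts='grinning face'>face; grin</annotation>"
    for line in convert_annotation(old):
        print(line)
```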
There are changes to the bidirectional control characters in number symbols and number patterns for number systems 'arab' and 'arabext', and for number system 'latn' in some locales. These include use of the ALM (Arabic Letter Mark) character, which was new in Unicode 6.3.
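Because these characters are invisible, code that compares or parses formatted numbers may behave differently after the update. The Python sketch below shows one way to locate such characters in a string. The selection of marks (ALM, LRM, RLM), the helper names, and the sample string are illustrative assumptions, not values taken from the CLDR data.

```python
import unicodedata

# Bidirectional control characters checked by this sketch (an assumed,
# non-exhaustive selection): ALM, LRM, RLM.
BIDI_MARKS = {"\u061C", "\u200E", "\u200F"}


def find_bidi_marks(text):
    """Return (index, code point, name) for each bidi mark found in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(text)
        if ch in BIDI_MARKS
    ]


def strip_bidi_marks(text):
    """Remove the marks, e.g. before comparing against stored symbols."""
    return "".join(ch for ch in text if ch not in BIDI_MARKS)


if __name__ == "__main__":
    sample = "\u061C-123"  # hypothetical string, not taken from CLDR data
    print(find_bidi_marks(sample))
    print(strip_bidi_marks(sample))
```

Whether stripping the marks is appropriate depends on the application; for display purposes they should normally be left in place.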
The structure and intended usage for the “week x of y” patterns are still being refined and may change. This applies especially to dateFormatItems such as the following:
<dateFormatItem id="MMMMW" count=...>'week' W 'of' MMM</dateFormatItem>
<dateFormatItem id="yw" count=...>'week' w 'of' y</dateFormatItem>
Areas of discussion include the use of the count attribute and the use of ordinal vs. cardinal numbers.
The emoji-related data in CLDR 30 is based on a draft version of emoji 4.0 data, which may change before it is finalized.
The process described in LDML 30 for synthesizing short names of emoji sequences may be updated; if so, details will be provided here.