Name                  Date         Size      #Lines     LOC
..                    -            -
android/              05-Jul-2025  -
cast/                 04-Jul-2025  -         70         47
chromecast_video/     04-Jul-2025  -
chromeos/             05-Jul-2025  -
common/               05-Jul-2025  -
filters/              04-Jul-2025  -         11,323     11,290
flutter/              04-Jul-2025  -         69         53
flutter_desktop/      04-Jul-2025  -
fuzzers/              04-Jul-2025  -         623        486
ios/                  04-Jul-2025  -
patches/              04-Jul-2025  -         14,155     14,077
scripts/              04-Jul-2025  -         1,764      1,366
source/               04-Jul-2025  -         5,010,291  4,599,517
tzres/                04-Jul-2025  -
APIChangeReport.html  04-Jul-2025  66.8 KiB  1,017      1,013
Android.bp            04-Jul-2025  48 KiB    1,355      1,348
DIR_METADATA          04-Jul-2025  98        7          6
LICENSE               04-Jul-2025  24.6 KiB  513        452
METADATA              04-Jul-2025  290       13         12
MODULE_LICENSE_MIT    04-Jul-2025  0
README.chromium       04-Jul-2025  11.3 KiB  291        223
README.fuchsia        04-Jul-2025  2.5 KiB   50         38
codereview.settings   04-Jul-2025  277       8          7
icu.gyp               04-Jul-2025  24.4 KiB  727        724
icu.gypi              04-Jul-2025  16.6 KiB  465        463
icu.isolate           04-Jul-2025  734       25         24
icu4c.css             04-Jul-2025  7 KiB     512        412
icu_nacl.gyp          04-Jul-2025  2.7 KiB   111        109
license.html          04-Jul-2025  622       19         16
readme.html           04-Jul-2025  1.6 KiB   30         24
shim_headers.gypi     04-Jul-2025  1.9 KiB   61         58
version.json          04-Jul-2025  27        4          3

README.chromium

Name: icu
URL: https://github.com/unicode-org/icu
Version: 74-2
CPEPrefix: cpe:/a:icu-project:international_components_for_unicode:74.2
License: MIT
License File: LICENSE
Security Critical: yes
Shipped: yes

Description:
This directory contains the source code of ICU 74.2 for C/C++.

A. How to update ICU

1. Run "scripts/update.sh <version>" (e.g. 74-2).
   This will download ICU from the upstream git repository.
   It preserves Chrome-specific build files and
   converter files (see section C).

   source.gni and icu.gyp* files are automatically updated, too.

2. Review and apply patches/changes in "D. Local Modifications" if
   necessary/applicable. Update patch files in patches/.

3. Follow the instructions in section B on building ICU data files.

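As a rough illustration of step A.1, the sketch below only prints the update command after checking the version argument. The `<major>-<minor>` pattern is an assumption inferred from the "74-2" example, and `plan_update` is a hypothetical helper, not part of scripts/.

```shell
#!/bin/sh
# Dry-run sketch: validates the version argument and prints the command
# from step A.1 instead of running it.
# Assumption: versions look like <major>-<minor>, as in the example "74-2".
plan_update() {
  if ! printf '%s\n' "$1" | grep -Eq '^[0-9]+-[0-9]+$'; then
    echo "unexpected version format: $1" >&2
    return 1
  fi
  echo "scripts/update.sh $1"
  echo "# next: review patches/ (section D), then rebuild data (section B)"
}

plan_update 74-2
```

A dotted version such as 74.2 is rejected by this check, which mirrors the dash-separated form used in the example above.
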
B. How to build ICU data files

Pre-built data files are generated and checked in with the following steps:

1. ICU data files for Chrome OS, Linux, Mac and Windows

  a. Make an ICU data build directory outside the Chromium source tree
     and cd to that directory (say, $ICUBUILDIR).

  b. Run
       ${CHROME_ICU_TREE_TOP}/scripts/make_data_all.sh

     This script takes the following steps:

     i) Run
        ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout --disable-tests

     ii) Run make

     iii) (cd data && make clean)

     iv) scripts/config_data.sh common
       This configures the build with the filter for common.

     v) Run make

     vi) scripts/copy_data.sh common
       This copies the ICU data files for non-Android platforms
       (both Little and Big Endian) to the following locations:

       common/icudtl.dat
       common/icudtb.dat

     vii) Repeat steps iii) - vi) for chromeos to produce chromeos/icudtl.dat

     viii) cast/patch_locale.sh
       Modify the file for cast, android, ios and flutter.

     ix) Repeat steps iii) - vi) for cast, android and ios to produce
       cast/icudtl.dat
       android/icudtl.dat
       ios/icudtl.dat

     x) flutter/patch_brkitr.sh
       On top of cast/patch_locale.sh (step viii)), further patch
       the code for flutter.

     xi) Repeat steps iii) - vi) for flutter to produce
       flutter/icudtl.dat

     xii) scripts/clean_up_data_source.sh
       This reverts the results of cast/patch_locale.sh and flutter/patch_brkitr.sh
       to make the tree ready for committing updated ICU data files for
       non-Android and Android platforms.

  c. Whenever data is updated (e.g. a timezone update), repeat step b. Step a
     can be skipped as long as the ICU build directory made there is kept.

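The sequence in step b can be summarized as a loop over data targets. The following is a dry-run sketch that only prints the commands in the order listed above; it is not the content of make_data_all.sh. The default value of CHROME_ICU_TREE_TOP is a placeholder, and prefixing the helper scripts with that variable is an assumption (the steps above name them relative to the ICU tree).

```shell
#!/bin/sh
# Dry-run sketch: prints the commands from steps i) - xii) instead of running them.
# The default below is a placeholder path (assumption), not a real location.
CHROME_ICU_TREE_TOP="${CHROME_ICU_TREE_TOP:-/path/to/icu}"

# Steps iii) - vi) for one data target (common, chromeos, cast, ...).
plan_target() {
  echo "(cd data && make clean)"
  echo "$CHROME_ICU_TREE_TOP/scripts/config_data.sh $1"
  echo "make"
  echo "$CHROME_ICU_TREE_TOP/scripts/copy_data.sh $1"
}

plan_all() {
  # Steps i) and ii): configure and build once.
  echo "$CHROME_ICU_TREE_TOP/source/runConfigureICU Linux --disable-layout --disable-tests"
  echo "make"
  # Steps iii) - vii): unpatched sources.
  for t in common chromeos; do plan_target "$t"; done
  # Steps viii) and ix): after the cast locale patch.
  echo "$CHROME_ICU_TREE_TOP/cast/patch_locale.sh"
  for t in cast android ios; do plan_target "$t"; done
  # Steps x) and xi): after the additional flutter patch.
  echo "$CHROME_ICU_TREE_TOP/flutter/patch_brkitr.sh"
  plan_target flutter
  # Step xii): revert the source patches.
  echo "$CHROME_ICU_TREE_TOP/scripts/clean_up_data_source.sh"
}

plan_all
```

The ordering matters: the cast and flutter patches modify the data sources in place, so the unpatched targets are built first and the clean-up script runs last.
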
2. Note on the locale data customization

  - filters/chromeos.json
      a. Filter the locale data for ChromeOS's UI languages:
         locales, lang, region, currency, zone
      b. Filter the locale data for non-UI languages to the bare minimum:
         ExemplarCharacters, LocaleScript, layout, and the name of the
         language for a locale in its native language.
      c. Filter the legacy Chinese character set-based collations
         (big5han/gb2312han) that don't make any sense and that nobody uses.

  - filters/common.json
      Same as above in filters/chromeos.json, AND
      e. Filter exemplar cities in timezone data (data/zone).

  - filters/android.json and filters/ios.json
      a. Filter the locale data for Android / iOS UI languages:
         locales, lang, region, currency, zone
      b. Filter the locale data for non-UI languages to the bare minimum:
         ExemplarCharacters, LocaleScript, layout, and the name of the
         language for a locale in its native language.
      c. Filter the legacy Chinese character set-based collation
      d. Filter source/data/{region,lang} to exclude these data
         except the language and script names of zh_Hans and zh_Hant.
      e. Keep only the minimal calendar data in data/locales.
      f. Include currency display names for a smaller subset of currencies.
      g. Minimize the locale data for 9 locales to which Chrome on Android
         is not localized.

C. Chromium-specific data build files and converters

They're preserved in step A.1 above. In general, there's no need to touch
them when updating ICU.

1. source/data/mappings
  - convrtrs.txt : Lists encodings and aliases required by the WHATWG
    Encoding spec plus a few extras (see the file as to why).

  - ucmlocal.txt : Lists only the converters we need.

  - *html.ucm: Mapping files per WHATWG encoding standards for EUC-JP,
    Shift_JIS, Big5 (Big5+Big5HKSCS), EUC-KR and all the single byte encodings.
    They're generated with scripts/{eucjp,sjis,big5,euckr,single_byte}_gen.sh.

  - gb18030.ucm and windows-936.ucm
    gb_table.patch was applied for the following changes. No need
    to apply it again. The patch is kept for the record.
    a. Map \xA3\xA0 to U+3000 instead of U+E5E5 in gb18030 and windows-936 per
       the encoding spec (one-way mapping in toUnicode direction).
    b. Map \xA8\xBF to U+01F9 instead of U+E7C8. Add one-way map
       from U+1E3F to \xA8\xBC (windows-936/GBK).
       See https://www.w3.org/Bugs/Public/show_bug.cgi?id=28740#c3

2. source/data/brkitr
  - dictionaries/khmerdict.txt: Abridged Khmer dictionary. See
    https://unicode-org.atlassian.net/browse/ICU-9451
  - dictionaries/laodict.txt: Abridged Lao dictionary. We keep using the smaller
    old version from ICU69-1.
  - rules/word_ja.txt (used only on Android)
    Added for Japanese-specific word-breaking without the C+J dictionary.
  - rules/{root,zh,zh_Hant}.txt
    a. Use line_normal by default.
    b. Drop local patches we used to have for the following issues. They'll
       be dealt with in the upstream (Unicode/CLDR).
       http://unicode.org/cldr/trac/ticket/6557
       http://unicode.org/cldr/trac/ticket/4200 (http://crbug.com/39779)

3. Add {an,ku,tg,wa}.txt to source/data/{locales,lang}
   with the minimal locale data necessary for spellchecker and
   language menus.

D. Local Modifications

1. Applied locale data patches from Google obtained by diff'ing
   the upstream copy and Google's internal copy for source/data

  - patches/locale_google.patch:
    * Google's internal ICU locale changes
    * Simpler region names for Hong Kong and Macau in all locales
    * Currency signs in ru and uk locales (do not include 'tr' locale changes)
    * AM/PM, midnight, noon formatting for a few Indian locales
    * Timezone name changes in Korean and Chinese locales
    * Default digit for Arabic locale is European digits.

  - patches/locale1.patch: Minor fixes for Korean

  - patches/name_5_langs.patch: add the native names of 5 languages not currently
    supported by CLDR/ICU. When updating the ICU to a new version,
    source/data/lang/{ay,dv,ilo,lus,ts}.txt have to be checked and if they are
    present with their display names populated, this patch has to be adjusted
    or discarded as necessary.

2. Breakiterator patches
  - patches/wordbrk.patch for word.txt, word_POSIX.txt, and word_fi_sv.txt
    a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
       FQDN labels can be split at '.'
    b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
       See http://unicode.org/cldr/trac/ticket/6555
    c. Restore pre-ICU 72 behavior of breaking at '@'. The new upstream behavior
       of not breaking at '@' interacted badly with the local change to break at
       '.' (D.2.a above): although not breaking at '@' is intended to not break
       within e-mail addresses, this is not possible with Chromium's
       break-at-'.' behavior.

  - patches/khmer-dictbe.patch
    Adjust parameters to use a smaller Khmer dictionary (khmerdict.txt).
    https://unicode-org.atlassian.net/browse/ICU-9451

  - Add several common Chinese words that were dropped previously to
    source/data/cjdict/brkitr/cjdict.txt
    patch: patches/cjdict.patch
    upstream bug: https://unicode-org.atlassian.net/browse/ICU-10888

3. Timezone data update
  Run scripts/update_tz.sh to grab the latest version of the
  following timezone data files and put them in source/data/misc

     metaZones.txt
     timezoneTypes.txt
     windowsZones.txt
     zoneinfo64.txt

  As of Oct 2, 2024, the latest version is 2024b
  and the above files are available at the ICU github repos.

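After a timezone update, a quick sanity check is to confirm that all four files listed above are present in the target directory. The helper below is a minimal sketch; `check_tz_files` is a hypothetical name and is not part of scripts/update_tz.sh.

```shell
#!/bin/sh
# Sketch: report which of the four timezone data files are missing from
# a directory. Returns 0 when all are present, 1 otherwise.
check_tz_files() {
  dir="$1"
  missing=0
  for f in metaZones.txt timezoneTypes.txt windowsZones.txt zoneinfo64.txt; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  return "$missing"
}

# Example (run from the ICU tree top):
#   check_tz_files source/data/misc
```
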
4. Build-related changes

  - patches/configure.patch:
    * Remove a section of configure that will cause breakage while
      running runConfigureICU.

  - patches/data_symb.patch :
      Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use
      the icu data file or icudt.dll

5. ISO-2022-JP encoding (fromUnicode) change per WHATWG encoding spec.
  - patches/iso2022jp.patch
  - upstream bug:
    https://unicode-org.atlassian.net/browse/ICU-20251

6. Enable tracing of files but not resources, only for Chromium,
    to reduce performance impact/risk.
  - patches/restrace.patch

7. Patch Arabic date time pattern back to the ICU 67 value to avoid test
   breakage in
   third_party/blink/web_tests/fast/forms/datetimelocal/datetimelocal-appearance-l10n.html
  - patches/ardatepattern.patch
  - https://bugs.chromium.org/p/chromium/issues/detail?id=1139186

8.  Remove explicit std::atomic<NumberRangeFormatterImpl*> template
    instantiation
    patches/atomic_template_instantiation.patch
  - The explicit instantiation was added to silence MSVC C4251 warnings:
    https://unicode-org.atlassian.net/browse/ICU-20157
    Small test cases show that it is generally an error to instantiate
    std::atomic<T*> with an incomplete type T with MSVC, clang, and GCC, so this
    instantiation never should have worked:
    https://gcc.godbolt.org/z/34xx8h
    At this time, it's not clear if this particular instantiation with
    NumberRangeFormatterImpl* was ever necessary for MSVC. Further testing with
    MSVC is required to upstream this patch.
  - https://unicode-org.atlassian.net/browse/ICU-21482

9.  Patch source/common/uposixdefs.h so it compiles on Fuchsia on Macs.
    patches/fuchsia.patch
  - context bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1184527

10. Patch to fix Etc/Unknown being returned for
    Intl.DateTimeFormat().resolvedOptions().timeZone on macOS 14.
    patches/revert_realpath.patch
  - https://bugs.chromium.org/p/chromium/issues/detail?id=1473422
  - https://unicode-org.atlassian.net/browse/ICU-22541

11. Patch source/common/locid.cpp to resolve crashes in DecimalFormatting.
    patches/locid.patch
  - https://issues.chromium.org/issues/333453962

12. Patch source/common/messagepattern.cpp/.h to resolve stack overflow.
    patches/limit_message_pattern.patch
  - https://issues.chromium.org/343280855
  - https://unicode-org.atlassian.net/browse/ICU-22798

13. Patch source/common/ubidiwrt.cpp to resolve a bidi buffer overflow.
    patches/bidi_buffer_overflow.patch
  - https://issues.chromium.org/u/1/issues/345352978
  - https://unicode-org.atlassian.net/browse/ICU-22768

14. Patch source/tools/pkgdata/pkgdata.cpp to remove a format-overflow warning.
    patches/remove_format_overflow_warning.patch
  - https://unicode-org.atlassian.net/browse/ICU-21589
  - https://issues.chromium.org/u/1/issues/345352978

15. Patch source/i18n/japancal.cpp to fix an extended year int32 overflow.
    patches/ja_extended_year_overflow.patch
  - https://issues.chromium.org/u/1/issues/345352978
  - https://unicode-org.atlassian.net/browse/ICU-22730

16. Patch an icu:calendar integer overflow by propagating error.
    patches/propagate_error_avoid_overflow.patch
  - https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=68895
  - https://issues.chromium.org/u/1/issues/345352978
  - https://unicode-org.atlassian.net/browse/ICU-22730

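Since section D refers to many files under patches/, it can help to verify that every referenced patch actually exists when reviewing the list during an update. The following is a minimal sketch under the assumption that references are written as patches/<name>.patch; `list_patch_refs` and `check_patch_refs` are hypothetical helpers, not scripts shipped in this directory.

```shell
#!/bin/sh
# Sketch: extract every "patches/<name>.patch" reference from a README and
# report the ones that do not exist under the given tree root.

# Print the unique patch paths mentioned in the file "$1".
list_patch_refs() {
  grep -oE 'patches/[A-Za-z0-9_.-]+\.patch' "$1" | sort -u
}

# Return 0 if all referenced patches exist under "$2", 1 otherwise.
check_patch_refs() {
  readme="$1"
  root="$2"
  status=0
  for p in $(list_patch_refs "$readme"); do
    if [ ! -f "$root/$p" ]; then
      echo "missing: $p"
      status=1
    fi
  done
  return "$status"
}

# Example (run from the ICU tree top):
#   check_patch_refs README.chromium .
```
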

README.fuchsia

# Notes specific to building ICU in Fuchsia tree

This note is specific to compiling ICU for Fuchsia as a target, but only within
the Fuchsia git source code repository.

On Fuchsia we need to compile ICU at 3 separate but not always different ICU
commit IDs. I call these ICU "flavors", and Fuchsia has "default", "stable",
and "latest" flavors defined today.

This setup fights a bit with the way the default ICU build works, since all of
the above flavors end up wanting to place some of their outputs in
`$root_build_dir` under the same name (e.g. `icudtl.dat`). Such a setup causes
GN to complain about multiple build rules defining the same output, and fail
the build. So the vanilla setup does not work for us.

To have 3 possibly different ICUs coexist in peace in the Fuchsia tree, the
following changes are needed in the ICU library proper:

- We introduced a file `//build/icu.gni` which will live in the Fuchsia git
  tree, and contains the shared configuration to be used instead of
  `config.gni`. Without it, compilation will fail saying that multiple config
  files define the same args. Since providing this config file to all
  downstream deps would be tedious and error-prone, there is a
  Fuchsia-specific branch in `config.gni` which will look for `//build/icu.gni`
  only if the configuration indicates we are building Fuchsia, and building in
  the Fuchsia git source tree.

- Added some flags and conditionals to avoid
  putting key artifacts into `$root_build_dir`. Without them, compilation will
  fail because all ICU flavors we compile will want to put same-named artifacts
  into the same spot in `$root_build_dir`.

- The above conditionals caused some of the variables to become unused in some
  code paths, so I marked those as `not_needed`.  This should not adversely
  affect any builds, Fuchsia or otherwise.

- Started storing the major version of the library in a JSON file at the root
  directory. This allows tools other than GN to read the file, and use
  standardized tools for parsing the value.  Without this, we'd need to rely on
  possibly brittle parsing to achieve the same effect, which seemed unnecessary.

- I modified [scripts/update.sh] to update the above JSON file instead of the
  `.gni` file directly, so the ICU update process does not change for the
  human operator.

These settings should only take effect in the Fuchsia in-tree build of ICU.
While there might be some value in having `//build/icu.gni` defined in each
downstream build, this seemed like overkill at this time.

readme.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html lang="en-US" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US">
  <head>
    <title>ReadMe for ICU4C</title>
    <meta name="COPYRIGHT" content=
    "Copyright (C) 2016 and later: Unicode, Inc. and others. License &amp; terms of use: http://www.unicode.org/copyright.html"/>
    <!-- meta name="COPYRIGHT" content=
    "Copyright (c) 1997-2016 IBM Corporation and others. All Rights Reserved." / -->
    <meta name="KEYWORDS" content=
    "ICU; International Components for Unicode; ICU4C; what's new; readme; read me; introduction; downloads; downloading; building; installation;" />
    <meta name="DESCRIPTION" content=
    "The introduction to the International Components for Unicode with instructions on building, installation, usage and other information about ICU." />
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <link type="text/css" href="./icu4c.css" rel="stylesheet"/>
  </head>


  <body>
    <p>This readme has moved to the <a href="https://unicode-org.github.io/icu/userguide/icu4c/">ICU4C Readme</a>
      section in the <a href="https://unicode-org.github.io/icu/">ICU User Guide</a>.</p>
    <hr />
    <p> Copyright &copy; 2016 and later: Unicode, Inc. and others. License &amp; terms of use:
    <a href="http://www.unicode.org/copyright.html">http://www.unicode.org/copyright.html</a><br/>
    Copyright &copy; 1997-2016 International Business Machines Corporation and  others.
    All Rights Reserved.</p>
  </body>
</html>