Name                  Date         Size      #Lines     LOC
android/              03-May-2024  -
cast/                 03-May-2024  -         74         61
chromeos/             03-May-2024  -
common/               03-May-2024  -
filters/              03-May-2024  -         9,457      9,415
flutter/              03-May-2024  -         65         49
flutter_desktop/      03-May-2024  -
fuzzers/              03-May-2024  -         700        549
ios/                  03-May-2024  -
patches/              03-May-2024  -         10,571     10,508
scripts/              03-May-2024  -         1,943      1,434
source/               03-May-2024  -         5,606,775  5,223,164
tzres/                03-May-2024  -
APIChangeReport.html  03-May-2024  42.3 KiB  732        728
BUILD.gn              03-May-2024  14.7 KiB  519        453
DIR_METADATA          03-May-2024  52        4          3
LICENSE               03-May-2024  24.9 KiB  520        459
README.chromium       03-May-2024  9.9 KiB   259        195
README.fuchsia        03-May-2024  2.5 KiB   50         38
codereview.settings   03-May-2024  277       8          7
config.gni            03-May-2024  1.4 KiB   38         30
icu.gyp               03-May-2024  24.1 KiB  720        717
icu.gypi              03-May-2024  16.4 KiB  460        458
icu.isolate           03-May-2024  734       25         24
icu4c.css             03-May-2024  7 KiB     512        412
icu_nacl.gyp          03-May-2024  2.7 KiB   111        109
license.html          03-May-2024  622       19         16
readme.html           03-May-2024  1.6 KiB   30         24
shim_headers.gypi     03-May-2024  1.9 KiB   61         58
sources.gni           03-May-2024  30.9 KiB  952        947
version.gni           03-May-2024  572       15         12
version.json          03-May-2024  28        4          3

README.chromium

Name: icu
URL: https://github.com/unicode-org/icu
Version: 72-1
CPEPrefix: cpe:/a:icu-project:international_components_for_unicode:72.1
License: MIT
Security Critical: yes

Description:
This directory contains the source code of ICU 72.1 for C/C++.

A. How to update ICU

1. Run "scripts/update.sh <version>" (e.g. 72-1).
   This will download ICU from the upstream git repository.
   It preserves Chrome-specific build files and
   converter files (see section C).

   sources.gni and icu.gyp* files are automatically updated, too.

2. Review and apply patches/changes in "D. Local Modifications" if
   necessary/applicable. Update patch files in patches/.

3. Follow the instructions in section B on building ICU data files.

B. How to build ICU data files


Pre-built data files are generated and checked in with the following steps.

1. ICU data files for Chrome OS, Linux, Mac and Windows

  a. Make an ICU data build directory outside the Chromium source tree
     and cd to that directory (say, $ICUBUILDIR).

  b. Run
       ${CHROME_ICU_TREE_TOP}/scripts/make_data_all.sh

     This script takes the following steps:

     i) Run
        ${CHROME_ICU_TREE_TOP}/source/runConfigureICU Linux --disable-layout --disable-tests

     ii) Run make

     iii) (cd data && make clean)

     iv) scripts/config_data.sh common
       This configures the build with the filter for common.

     v) Run make

     vi) scripts/copy_data.sh common
       This copies the ICU data files for non-Android platforms
       (both Little and Big Endian) to the following locations:

       common/icudtl.dat
       common/icudtb.dat

     vii) Repeat steps iii) - vi) for chromeos to produce chromeos/icudtl.dat

     viii) cast/patch_locale.sh
       Modify the file for cast, android, ios and flutter.

     ix) Repeat steps iii) - vi) for cast, android and ios to produce
       cast/icudtl.dat
       android/icudtl.dat
       ios/icudtl.dat

     x) flutter/patch_brkitr.sh
       On top of cast/patch_locale.sh (step viii)), further patch
       the code for flutter.

     xi) Repeat steps iii) - vi) for flutter to produce
       flutter/icudtl.dat

     xii) scripts/clean_up_data_source.sh

     This reverts the result of cast/patch_locale.sh and flutter/patch_brkitr.sh
     to make the tree ready for committing updated ICU data files for
     non-Android and Android platforms.

  c. Whenever data is updated (e.g. a timezone update), repeat step b. This
     works as long as the ICU build directory made in step a. is kept.

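The order of steps i) - xii) can be sketched as a dry-run planner. This is
only an illustration of the sequence described above, not the contents of
scripts/make_data_all.sh; the names TARGETS, PATCH_BEFORE, per_target_steps
and build_plan are made up for this sketch, and all paths are assumed to be
relative to ${CHROME_ICU_TREE_TOP}.

```python
# Dry-run planner: it only lists the commands, it never executes them.
# Target order follows steps iv) - xi) of section B.1.b.
TARGETS = ["common", "chromeos", "cast", "android", "ios", "flutter"]

# Patch scripts that run before a given target is built (steps viii and x).
PATCH_BEFORE = {
    "cast": "cast/patch_locale.sh",
    "flutter": "flutter/patch_brkitr.sh",
}


def per_target_steps(target):
    """Return steps iii) - vi) for one data target."""
    return [
        "(cd data && make clean)",
        f"scripts/config_data.sh {target}",
        "make",
        f"scripts/copy_data.sh {target}",
    ]


def build_plan(targets=TARGETS):
    """Return the full command sequence as a list of strings."""
    plan = [
        "source/runConfigureICU Linux --disable-layout --disable-tests",  # step i
        "make",  # step ii
    ]
    for target in targets:
        if target in PATCH_BEFORE:
            plan.append(PATCH_BEFORE[target])
        plan.extend(per_target_steps(target))
    plan.append("scripts/clean_up_data_source.sh")  # step xii
    return plan


if __name__ == "__main__":
    for number, command in enumerate(build_plan(), start=1):
        print(f"{number:2d}. {command}")
```

Listing the plan this way makes it easy to see that the cast patch is
applied before the cast, android and ios data are built, and that the
flutter patch is layered on top of it.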
2. Note on the locale data customization

  - filters/chromeos.json
      a. Filter the locale data for ChromeOS's UI languages :
         locales, lang, region, currency, zone
      b. Filter the locale data for non-UI languages to the bare minimum :
         ExemplarCharacters, LocaleScript, layout, and the name of the
         language for a locale in its native language.
      c. Filter the legacy Chinese character set-based collation
         (big5han/gb2312han), which makes no sense and which nobody uses.

  - filters/common.json
      Same as above in filters/chromeos.json, AND
      e. Filter exemplar cities in timezone data (data/zone).

  - filters/android.json and filters/ios.json
      a. Filter the locale data for Android / iOS UI languages :
         locales, lang, region, currency, zone
      b. Filter the locale data for non-UI languages to the bare minimum :
         ExemplarCharacters, LocaleScript, layout, and the name of the
         language for a locale in its native language.
      c. Filter the legacy Chinese character set-based collation
      d. Filter source/data/{region,lang} to exclude these data
         except the language and script names of zh_Hans and zh_Hant.
      e. Keep only the minimal calendar data in data/locales.
      f. Include currency display names for a smaller subset of currencies.
      g. Minimize the locale data for 9 locales to which Chrome on Android
         is not localized.


C. Chromium-specific data build files and converters

They're preserved in step A.1 above. In general, there's no need to touch
them when updating ICU.

1. source/data/mappings
  - convrtrs.txt : Lists encodings and aliases required by the WHATWG
    Encoding spec plus a few extras (see the file as to why).

  - ucmlocal.txt : Lists only the converters we need.

  - *html.ucm: Mapping files per WHATWG encoding standards for EUC-JP,
    Shift_JIS, Big5 (Big5+Big5HKSCS), EUC-KR and all the single byte encodings.
    They're generated with scripts/{eucjp,sjis,big5,euckr,single_byte}_gen.sh.

  - gb18030.ucm and windows-936.ucm
    gb_table.patch was applied for the following changes. No need
    to apply it again. The patch is kept for the record.
    a. Map \xA3\xA0 to U+3000 instead of U+E5E5 in gb18030 and windows-936 per
       the encoding spec (one-way mapping in toUnicode direction).
    b. Map \xA8\xBF to U+01F9 instead of U+E7C8. Add one-way map
       from U+1E3F to \xA8\xBC (windows-936/GBK).
       See https://www.w3.org/Bugs/Public/show_bug.cgi?id=28740#c3

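The effect of the gb_table.patch changes listed above can be pictured with
two small lookup tables. This is a toy model of the mappings named in items
a. and b. only; the table and function names are invented for the sketch,
and it is not how ICU's .ucm converter tables are represented.

```python
# Decoder side (toUnicode): two-byte sequence -> code point.
UPSTREAM_TO_UNICODE = {
    b"\xA3\xA0": 0xE5E5,
    b"\xA8\xBF": 0xE7C8,
}
PATCHED_TO_UNICODE = {
    b"\xA3\xA0": 0x3000,  # item a: IDEOGRAPHIC SPACE instead of U+E5E5
    b"\xA8\xBF": 0x01F9,  # item b: U+01F9 instead of U+E7C8
}

# Encoder side (fromUnicode) one-way addition for windows-936/GBK (item b).
FROM_UNICODE_ONLY = {
    "\u1E3F": b"\xA8\xBC",
}


def decode_pair(pair, patched=True):
    """Look up one two-byte sequence in the toy decoder table."""
    table = PATCHED_TO_UNICODE if patched else UPSTREAM_TO_UNICODE
    return chr(table[pair])


if __name__ == "__main__":
    for pair in (b"\xA3\xA0", b"\xA8\xBF"):
        before = ord(decode_pair(pair, patched=False))
        after = ord(decode_pair(pair))
        print(f"{pair!r}: U+{before:04X} -> U+{after:04X}")
```

"One-way" here means the entry exists in only one direction: decoding
\xA3\xA0 yields U+3000, but the table says nothing about encoding U+3000
back to \xA3\xA0.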
2. source/data/brkitr
  - dictionaries/khmerdict.txt: Abridged Khmer dictionary. See
    https://unicode-org.atlassian.net/browse/ICU-9451
  - dictionaries/laodict.txt: Abridged Lao dictionary. We keep using the smaller
    old version from ICU69-1.
  - rules/word_ja.txt (used only on Android)
    Added for Japanese-specific word-breaking without the C+J dictionary.
  - rules/{root,zh,zh_Hant}.txt
    a. Use line_normal by default.
    b. Drop local patches we used to have for the following issues. They'll
       be dealt with in the upstream (Unicode/CLDR).
       http://unicode.org/cldr/trac/ticket/6557
       http://unicode.org/cldr/trac/ticket/4200 (http://crbug.com/39779)

3. Add {an,ku,tg,wa}.txt to source/data/{locale,lang}
   with the minimal locale data necessary for spellchecker
   and language menus.

D. Local Modifications

1. Applied locale data patches from Google obtained by diff'ing
   the upstream copy and Google's internal copy for source/data

  - patches/locale_google.patch:
    * Google's internal ICU locale changes
    * Simpler region names for Hong Kong and Macau in all locales
    * Currency signs in ru and uk locales (do not include 'tr' locale changes)
    * AM/PM, midnight, noon formatting for a few Indian locales
    * Timezone name changes in Korean and Chinese locales
    * Default digits for the Arabic locale are European digits.

  - patches/locale1.patch: Minor fixes for Korean


2. Breakiterator patches
  - patches/wordbrk.patch for word.txt, word_POSIX.txt, and word_fi_sv.txt
    a. Move full stops (U+002E, U+FF0E) from MidNumLet to MidNum so that
       FQDN labels can be split at '.'
    b. Move fullwidth digits (U+FF10 - U+FF19) from Ideographic to Numeric.
       See http://unicode.org/cldr/trac/ticket/6555
    c. Restore pre-ICU 72 behavior of breaking at '@'. The new upstream behavior
       of not breaking at '@' interacted badly with the local change to break at
       '.' (D.2.a above): although not breaking at '@' is intended to not break
       within e-mail addresses, this is not possible with Chromium's
       break-at-'.' behavior.

  - patches/khmer-dictbe.patch
    Adjust parameters to use a smaller Khmer dictionary (khmerdict.txt).
    https://unicode-org.atlassian.net/browse/ICU-9451

  - Add several common Chinese words that were dropped previously to
    source/data/cjdict/brkitr/cjdict.txt
    patch: patches/cjdict.patch
    upstream bug: https://unicode-org.atlassian.net/browse/ICU-10888

3. Timezone data update
  Run scripts/update_tz.sh to grab the latest version of the
  following timezone data files and put them in source/data/misc

     metaZones.txt
     timezoneTypes.txt
     windowsZones.txt
     zoneinfo64.txt

  As of Mar 31, 2023, the latest version is 2023c
  and the above files are available at the ICU GitHub repos.

4. Build-related changes

  - patches/configure.patch:
    * Remove a section of configure that will cause breakage while
      running runConfigureICU.

  - patches/wpo.patch (only needed when icudata dll is used).
    upstream bugs : https://unicode-org.atlassian.net/browse/ICU-8043
                    https://unicode-org.atlassian.net/browse/ICU-5701

  - patches/data_symb.patch :
      Put ICU_DATA_ENTRY_POINT(icudtXX_dat) in common when we use
      the icu data file or icudt.dll

5. ISO-2022-JP encoding (fromUnicode) change per WHATWG encoding spec.
  - patches/iso2022jp.patch
  - upstream bug:
    https://unicode-org.atlassian.net/browse/ICU-20251

6. Enable tracing of files but not resources, for Chromium only,
   to reduce performance impact/risk.
  - patches/restrace.patch

7. Patch the Arabic date time pattern back to the ICU 67 value to avoid test
   breakage in
   third_party/blink/web_tests/fast/forms/datetimelocal/datetimelocal-appearance-l10n.html
  - patches/ardatepattern.patch
  - https://bugs.chromium.org/p/chromium/issues/detail?id=1139186

8. Remove explicit std::atomic<NumberRangeFormatterImpl*> template
   instantiation
  - patches/atomic_template_instantiation.patch
  - The explicit instantiation was added to silence MSVC C4251 warnings:
    https://unicode-org.atlassian.net/browse/ICU-20157
    Small test cases show that it is generally an error to instantiate
    std::atomic<T*> with an incomplete type T with MSVC, clang, and GCC, so this
    instantiation never should have worked:
    https://gcc.godbolt.org/z/34xx8h
    At this time, it's not clear if this particular instantiation with
    NumberRangeFormatterImpl* was ever necessary for MSVC. Further testing with
    MSVC is required to upstream this patch.
  - https://unicode-org.atlassian.net/browse/ICU-21482

9. Patch source/common/uposixdefs.h so it compiles on Fuchsia on Macs.
  - patches/fuchsia.patch
  - context bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1184527

10. Patch ICU to fix mixed usage of UBool, which breaks win64_msvc
  - patches/fix-bool-mix-use.patch
  - https://github.com/unicode-org/icu/pull/2255

11. Patch ICU en-CA date pattern back to y-MM-dd format
  - patches/revert-en-ca-date.patch


README.fuchsia

# Notes specific to building ICU in Fuchsia tree

This note is specific to compiling ICU for Fuchsia as a target, but only within
the Fuchsia git source code repository.

On Fuchsia we need to compile ICU at 3 separate but not always different ICU
commit IDs. I call these ICU "flavors", and Fuchsia has "default", "stable",
and "latest" flavors defined today.

This setup fights a bit with the way the default ICU build works, since all of
the above flavors end up wanting to place some of their outputs in
`$root_build_dir` under the same name (e.g. `icudtl.dat`). Such a setup causes
GN to complain about multiple build rules defining the same output, and fail
the build. So the vanilla setup does not work for us.

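The output clash described above can be modeled in a few lines. This is a
sketch only: it is Python rather than GN, `output_paths` and
`find_collisions` are made-up helpers, and the per-flavor directory name
`icu_<flavor>` is a hypothetical layout chosen for the example. The note
itself only says that key artifacts are kept out of `$root_build_dir`.

```python
from collections import defaultdict

FLAVORS = ["default", "stable", "latest"]


def output_paths(flavors, artifact, per_flavor_dir):
    """Map each flavor to the path where it would write the artifact."""
    paths = {}
    for flavor in flavors:
        if per_flavor_dir:
            # Hypothetical layout: one subdirectory per flavor.
            paths[flavor] = f"$root_build_dir/icu_{flavor}/{artifact}"
        else:
            # Vanilla layout: every flavor writes to the same place.
            paths[flavor] = f"$root_build_dir/{artifact}"
    return paths


def find_collisions(paths):
    """Return {output path: [flavors]} for paths claimed more than once."""
    by_path = defaultdict(list)
    for flavor, path in paths.items():
        by_path[path].append(flavor)
    return {path: owners for path, owners in by_path.items() if len(owners) > 1}


if __name__ == "__main__":
    vanilla = output_paths(FLAVORS, "icudtl.dat", per_flavor_dir=False)
    separated = output_paths(FLAVORS, "icudtl.dat", per_flavor_dir=True)
    print("vanilla collisions:  ", find_collisions(vanilla))
    print("separated collisions:", find_collisions(separated))
```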
To have 3 possibly different ICUs coexist in peace in the Fuchsia tree, the
following changes are needed in the ICU library proper:

- We introduced a file `//build/icu.gni` which will live in the Fuchsia git
  tree, and contains the shared configuration to be used instead of
  `config.gni`. Without it, compilation will fail saying that multiple config
  files define the same args. Since providing this config file to all
  downstream deps would be tedious and error-prone, there is a
  Fuchsia-specific branch in `config.gni` which will look for `//build/icu.gni`
  only if the configuration indicates we are building Fuchsia, and building in
  the Fuchsia git source tree.

- Added some flags and conditionals to avoid
  putting key artifacts into `$root_build_dir`. Without them, compilation will
  fail because all ICU flavors we compile will want to put same-named artifacts
  into the same spot in `$root_build_dir`.

- The above conditionals caused some of the variables to become unused in some
  code paths, so I marked those as `not_needed`.  This should not adversely
  affect any builds, Fuchsia or otherwise.

- Started storing the major version of the library in a JSON file at the root
  directory. This allows tools other than GN to read the file, and use
  standardized tools for parsing the value.  Without this, we'd need to rely on
  possibly brittle parsing to achieve the same effect, which seemed unnecessary.

- I modified [scripts/update.sh] to update the above JSON file instead of the
  `.gni` file directly, so the ICU update process does not change for the
  human operator.

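As an illustration of why a JSON file is convenient for non-GN tools, the
sketch below reads a major version with Python's standard `json` module.
The key name `major_version` and the sample contents are assumptions made
for this example; check the actual `version.json` for the real field name.

```python
import json


def read_major_version(text, key="major_version"):
    """Parse JSON text and return the major version as an int.

    `key` is a hypothetical field name; adjust it to the real file.
    """
    data = json.loads(text)
    return int(data[key])


# Hypothetical file contents, not copied from the repository.
SAMPLE_VERSION_JSON = '{\n  "major_version": "72"\n}\n'

if __name__ == "__main__":
    print(read_major_version(SAMPLE_VERSION_JSON))
```

Any tool with a JSON parser can do the same, which avoids pattern-matching
on the `.gni` syntax.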
These settings should only take effect in the Fuchsia in-tree build of ICU.
While there might be some value in having `//build/icu.gni` defined in each
downstream build, this seemed like overkill at this time.


readme.html

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

<html lang="en-US" xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-US">
  <head>
    <title>ReadMe for ICU4C</title>
    <meta name="COPYRIGHT" content=
    "Copyright (C) 2016 and later: Unicode, Inc. and others. License &amp; terms of use: http://www.unicode.org/copyright.html"/>
    <!-- meta name="COPYRIGHT" content=
    "Copyright (c) 1997-2016 IBM Corporation and others. All Rights Reserved." / -->
    <meta name="KEYWORDS" content=
    "ICU; International Components for Unicode; ICU4C; what's new; readme; read me; introduction; downloads; downloading; building; installation;" />
    <meta name="DESCRIPTION" content=
    "The introduction to the International Components for Unicode with instructions on building, installation, usage and other information about ICU." />
    <meta http-equiv="content-type" content="text/html; charset=utf-8" />
    <link type="text/css" href="./icu4c.css" rel="stylesheet"/>
  </head>


  <body>
    <p>This readme has moved to the <a href="https://unicode-org.github.io/icu/userguide/icu4c/">ICU4C Readme</a>
      section in the <a href="https://unicode-org.github.io/icu/">ICU User Guide</a>.</p>
    <hr />
    <p> Copyright &copy; 2016 and later: Unicode, Inc. and others. License &amp; terms of use:
    <a href="http://www.unicode.org/copyright.html">http://www.unicode.org/copyright.html</a><br/>
    Copyright &copy; 1997-2016 International Business Machines Corporation and  others.
    All Rights Reserved.</p>
  </body>
</html>
