# Maintaining ICU in Node.js

## Background

International Components for Unicode ([ICU4C][ICU]) is used both by V8
and by Node.js directly to provide internationalization
functionality. To quote from icu-project.org:

> ICU is a mature, widely used set of C/C++ and Java libraries providing
> Unicode and Globalization support for software applications. ICU is
> widely portable and gives applications the same results on all platforms
> and between C/C++ and Java software.

If Node.js is configured to use its built-in ICU,
it uses a strict subset of ICU which is in
[deps/icu-small](https://github.com/nodejs/node/tree/HEAD/deps/icu-small).
A good description of the different ways Node.js can be built with ICU
support is in [api/intl.html](https://nodejs.org/api/intl.html).

## Data dependencies

ICU consumes and includes:

* Extracted locale data from [CLDR][]
* Extracted [Unicode][] data
* Time zone ([tz][]) data

The current versions of these items can be viewed for Node.js with
`node -p process.versions`:

```console
$ node -p process.versions

{
  …
  cldr: '35.1',
  icu: '64.2',
  tz: '2019a',
  unicode: '12.1'
}
```
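
The same values can also be read from a script. The following is a minimal sketch; `icuVersions` is a hypothetical helper name, and it assumes that a key missing from `process.versions` (for example, in a build without Intl support) should be reported as `null`.

```js
'use strict';

// Keys that ICU contributes to `process.versions`, as listed above.
const ICU_KEYS = ['icu', 'unicode', 'cldr', 'tz'];

// Hypothetical helper: pick the ICU-related entries, using null for any
// key that is absent from the given versions object.
function icuVersions(versions = process.versions) {
  return Object.fromEntries(
    ICU_KEYS.map((key) => [key, versions[key] ?? null]),
  );
}

console.log(icuVersions());
```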

### Time zone data

Time zone data files are updated independently of ICU CLDR data. ICU and its
main data files do not need to be upgraded in order to apply time zone data file
fixes.

The [IANA tzdata][tz] project releases new versions and announces them on the
[`tz-announce`](https://mm.icann.org/pipermail/tz-announce/) mailing list.

The Unicode project takes new releases and publishes
[updated time zone data files](https://github.com/unicode-org/icu-data/tree/HEAD/tzdata/icunew)
in the unicode-org/icu-data repository.

All modern versions of Node.js use the version 44 ABI of the time zone data
files.

#### Example: updating the ICU `.dat` file

* Decompress `deps/icu-small/source/data/in/icudt##l.dat.bz2`, where `##` is
  the ICU major version number.
* Clone the unicode-org/icu-data repository and copy the latest `tzdata` release `le`
  files into the `source/data/in` directory.
* Follow the upstream [ICU instructions](https://unicode-org.github.io/icu/userguide/datetime/timezone/)
  to patch the ICU `.dat` file:
  > `for i in zoneinfo64.res windowsZones.res timezoneTypes.res metaZones.res;
  > do icupkg -a $i icudt*l.dat; done`
* Optionally, verify that each of the above files is listed only once when using
  `icupkg -l`.
* Optionally, extract each file using `icupkg -x` and verify the `shasum`
  matches the desired value.
* Compress the `.dat` file with the same filename as in the first step.
* Build and test, verifying that `process.versions.tz` matches the desired version.
* Create a new minor version release.
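
Two of the steps above lend themselves to a small script: working out the archive name from the ICU version, and confirming the reported tz release after the rebuild. This is only a sketch; `datArchiveName` and `tzMatches` are hypothetical helper names, not part of the Node.js tooling.

```js
'use strict';

// Hypothetical helper: build the compressed data file name for an ICU
// version string such as '64.2' (the `##` in `icudt##l.dat.bz2`).
function datArchiveName(icuVersion) {
  const major = Number.parseInt(icuVersion, 10);
  if (Number.isNaN(major)) {
    throw new TypeError(`Not an ICU version: ${icuVersion}`);
  }
  return `icudt${major}l.dat.bz2`;
}

// Hypothetical helper: true when the given versions object reports the
// wanted tz release, for example '2019a'.
function tzMatches(expected, versions = process.versions) {
  return versions.tz === expected;
}

console.log(datArchiveName('64.2'));
console.log(tzMatches('2019a', { tz: '2019a' }));
```

Calling `tzMatches('2019a')` with no second argument checks the Node.js binary that runs the script.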

## Release schedule

ICU typically has more than one release a year, often coinciding with a major
release of [Unicode][]. The current release schedule is available on the [ICU][]
website on the left sidebar.

### V8 depends on ICU

V8 will aggressively upgrade to a new ICU version, due to requirements for
features/bugfixes needed for [Ecma402][] support. The minimum required version
of ICU is specified within the V8 source tree. If the ICU version is too old,
V8 will not compile.

```c
// deps/v8/src/objects/intl-objects.h
#define V8_MINIMUM_ICU_VERSION 65
```

V8 in Node.js depends on the ICU version supplied by Node.js.

The file `tools/icu/icu_versions.json` contains the current minimum
version of ICU that Node.js is known to work with. This should be
_at least_ the same version as V8, so that users will find out
earlier that their ICU is too old. A test case validates this when
Node.js is built.
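
The comparison itself is simple: take the major version of the ICU in use and compare it with the configured minimum. A minimal sketch follows. The helper names are hypothetical, and the `minimum_icu` field name is an assumption about the layout of `tools/icu/icu_versions.json`, so check it against your checkout.

```js
'use strict';

// Extract the major version from an ICU version string such as '65.1'.
function icuMajor(version) {
  const major = Number.parseInt(version, 10);
  if (Number.isNaN(major)) {
    throw new TypeError(`Not an ICU version: ${version}`);
  }
  return major;
}

// Hypothetical helper. `config` is the parsed content of
// tools/icu/icu_versions.json; the `minimum_icu` field name is an assumption.
function meetsMinimum(icuVersion, config) {
  return icuMajor(icuVersion) >= config.minimum_icu;
}

console.log(meetsMinimum('64.2', { minimum_icu: 65 })); // false
console.log(meetsMinimum('65.1', { minimum_icu: 65 })); // true
```

In a checkout, `config` could be loaded with `JSON.parse(fs.readFileSync('tools/icu/icu_versions.json', 'utf8'))`.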

## How to upgrade ICU

> The script `tools/dep_updaters/update-icu.sh` automates
> this process.

* Make sure your Node.js workspace is clean (`git status`
  should be sufficient).
* Configure Node.js with the specific [ICU version](http://site.icu-project.org/download)
  you want to upgrade to, for example:

```bash
./configure \
    --with-intl=full-icu \
    --with-icu-source=https://github.com/unicode-org/icu/releases/download/release-67-1/icu4c-67_1-src.tgz
make
```

> _Note_: in theory, the equivalent `vcbuild.bat` commands should also work,
> but the commands below are makefile-centric.

* If there are ICU version-specific changes needed, you may need to make changes
  in `tools/icu/icu-generic.gyp` or add patch files to `tools/icu/patches`.
  * Specifically, look at the `sources!` lists in `tools/icu/icu-generic.gyp` for
    files to exclude.

* Verify the Node.js build works:

```bash
make test-ci
```

Also check that running

<!-- eslint-disable strict -->

```js
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

returns `enero`, not `January`.

* Now, run the shrink tool to update `deps/icu-small` from `deps/icu`:

> :warning: Do not modify any source code in `deps/icu-small`!
> See the section below about floating patches to ICU.

```bash
python tools/icu/shrink-icu-src.py
```

* Now, do a clean rebuild of Node.js to test:

```bash
make -k distclean
./configure
make
```

* Test this newly default-generated Node.js:

<!-- eslint-disable strict -->

```js
process.versions.icu;
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

(This should print your updated ICU version number, and also `enero` again.)
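
The two manual checks can be combined into one script run with the freshly built binary. This is a sketch under stated assumptions: `checkIcuBuild` and `spanishMonth` are hypothetical names, the expected version `'67.1'` is only an example value, and the build is assumed to have Intl support (otherwise `Intl.DateTimeFormat` is unavailable and the script throws).

```js
'use strict';

// Format the month of a fixed January 1970 timestamp in Spanish, as in
// the manual check above.
function spanishMonth() {
  return new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
}

// Hypothetical helper: summarize what the manual check looks at.
function checkIcuBuild(expectedIcu, versions = process.versions) {
  return {
    icuMatches: versions.icu === expectedIcu,
    hasSpanishData: spanishMonth() === 'enero',
  };
}

console.log(checkIcuBuild('67.1'));
```

Both fields should be `true` for a build that picked up the new ICU version together with its Spanish locale data.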

You are ready to check in (`git add`) the updated `deps/icu-small`.

> :warning: Do not modify any source code in `deps/icu-small`!
> See the section below about floating patches to ICU.

* Now, rebuild the Node.js license.

```bash
# clean up - remove deps/icu
make clean
tools/license-builder.sh
```

* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`.
  It should match the ICU URL used in the first step. When this is done, the
  following should build with small ICU.

```bash
# clean up
rm -rf out deps/icu deps/icu4c*
./configure --with-intl=small-icu --download=all
make
make test-ci
```

* Commit the changes to the `deps/icu-small`, `tools/icu/current_ver.dep`
  and `LICENSE` files.

## Floating patches to ICU

Floating patches are applied at `configure` time. The "patch" files
are used instead of the original source files. The patch files are
complete `.cpp` files replacing the original contents.

Patches are tied to a specific ICU version. They won't apply to a
future ICU version. We assume that you filed a bug against [ICU][] and
upstreamed the fix, so the patch won't be needed in a later ICU
version.

### Example

For example, to patch `source/tools/toolutil/pkg_genc.cpp` for
ICU version 63:

```bash
# go to your Node.js source directory (replace with your checkout path)
cd /path/to/node

# create the floating patch directory
mkdir -p tools/icu/patches/63

# create the subdirectory for the file(s) to patch:
mkdir -p tools/icu/patches/63/source/tools/toolutil/

# copy the file to patch
cp deps/icu-small/source/tools/toolutil/pkg_genc.cpp \
tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# make any changes to this file in your editor of choice:
"${EDITOR:-vi}" tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# test
make clean && ./configure && make
```

You should see a message such as:

```console
INFO: Using floating patch "tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp" from "tools/icu"
```
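
The directory layout in this example follows a simple rule: the patch lives under `tools/icu/patches/<ICU major version>/` at the same relative path as the file it replaces. A small sketch of that mapping, with `floatingPatchPath` as a hypothetical helper name:

```js
'use strict';

const path = require('path');

// Hypothetical helper: where a floating patch for `relativeFile` (a path
// inside the ICU source tree) lives for a given ICU major version.
function floatingPatchPath(icuMajor, relativeFile) {
  return path.posix.join('tools/icu/patches', String(icuMajor), relativeFile);
}

console.log(floatingPatchPath(63, 'source/tools/toolutil/pkg_genc.cpp'));
// tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp
```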

### Clean up

Any patches older than the minimum version given in `tools/icu/icu_versions.json`
ought to be deleted, because they will never be used.
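
Finding such patches amounts to comparing each version directory name with the minimum. The sketch below assumes that patch directories are named by ICU major version, as in the example above; `stalePatchDirs` is a hypothetical helper name.

```js
'use strict';

// Hypothetical helper: given the directory names under tools/icu/patches
// and the minimum supported ICU major version, return the names that are
// older than the minimum. Non-numeric names are ignored.
function stalePatchDirs(dirNames, minimumIcu) {
  return dirNames.filter(
    (name) => /^\d+$/.test(name) && Number(name) < minimumIcu,
  );
}

console.log(stalePatchDirs(['63', '64', '67', 'README.md'], 65));
// [ '63', '64' ]
```

In a checkout, the names could come from `fs.readdirSync('tools/icu/patches')`.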

### Why not just modify the ICU source directly?

Especially given the V8 dependencies above, there may be times when a floating
patch to ICU is required. Though it seems expedient to simply change a file in
`deps/icu-small`, this is not the right approach for the following reasons:

1. **Repeatability.** Given the complexity of merging in a new ICU version,
   following the steps in the prior section of this document ought to be
   repeatable without concern for overwriting a patch.

2. **Verifiability.** Given the number of files modified in an ICU PR,
   a floating patch could easily be missed or dropped altogether next time
   something is landed.

3. **Compatibility.** There are a number of ways that ICU can be loaded into
   Node.js (see the top-level README.md). Only modifying `icu-small` would mean
   the patch is not applied when the user supplies the ICU source code
   another way.

[CLDR]: http://cldr.unicode.org/
[Ecma402]: https://github.com/tc39/ecma402
[ICU]: http://site.icu-project.org/
[Unicode]: https://home.unicode.org/
[tz]: https://www.iana.org/time-zones
