# Maintaining ICU in Node.js

## Background

International Components for Unicode ([ICU4C][ICU]) is used both by V8
and by Node.js directly to provide internationalization
functionality. To quote from icu-project.org:

> ICU is a mature, widely used set of C/C++ and Java libraries providing
> Unicode and Globalization support for software applications. ICU is
> widely portable and gives applications the same results on all platforms
> and between C/C++ and Java software.

If Node.js is configured to use its built-in ICU,
it uses a strict subset of ICU which is in
[deps/icu-small](https://github.com/nodejs/node/tree/HEAD/deps/icu-small).
A good description of the different ways Node.js can be built with ICU
support is in [api/intl.html](https://nodejs.org/api/intl.html).

## Data dependencies

ICU consumes and includes:

* Extracted locale data from [CLDR][]
* Extracted [Unicode][] data
* Time zone ([tz][]) data

The current versions of these items can be viewed for Node.js with
`node -p process.versions`:

```console
$ node -p process.versions

{
  …
  cldr: '35.1',
  icu: '64.2',
  tz: '2019a',
  unicode: '12.1'
}
```

### Time zone data

Time zone data files are updated independently of ICU CLDR data. ICU and its
main data files do not need to be upgraded in order to apply time zone data file
fixes.

The [IANA tzdata][tz] project releases new versions and announces them on the
[`tz-announce`](https://mm.icann.org/pipermail/tz-announce/) mailing list.

The Unicode project takes new releases and publishes
[updated time zone data files](https://github.com/unicode-org/icu-data/tree/HEAD/tzdata/icunew)
in the unicode-org/icu-data repository.

All modern versions of Node.js use the version 44 ABI of the time zone data
files.
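
When deciding whether a time zone data update is needed, the versions listed
under "Data dependencies" can also be read from a script. The following is a
minimal sketch, not part of the Node.js tooling: the helper name
`icuDataVersions` is hypothetical, and it assumes a build with ICU enabled
(otherwise the keys are simply `undefined`).

```js
'use strict';

// Hypothetical helper: pick the ICU-related entries out of a
// `process.versions`-style object. Keys that are absent (for example in a
// build configured without ICU) come back as `undefined`.
function icuDataVersions(versions = process.versions) {
  const { icu, cldr, tz, unicode } = versions;
  return { icu, cldr, tz, unicode };
}

console.log(icuDataVersions());
```

Comparing the reported `tz` value with the latest release announced on
`tz-announce` indicates whether the bundled time zone data is behind.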

#### Example: updating the ICU `.dat` file

* Decompress `deps/icu-small/source/data/in/icudt##l.dat.bz2`, where `##` is
  the ICU major version number.
* Clone the unicode-org/icu-data repository and copy the latest `tzdata`
  release `le` files into the `source/data/in` directory.
* Follow the upstream [ICU instructions](https://unicode-org.github.io/icu/userguide/datetime/timezone/)
  to patch the ICU `.dat` file:
  > `for i in zoneinfo64.res windowsZones.res timezoneTypes.res metaZones.res;
  > do icupkg -a $i icudt*l.dat; done`
* Optionally, verify that there is only one of the above files listed when using
  `icupkg -l`.
* Optionally, extract each file using `icupkg -x` and verify the `shasum`
  matches the desired value.
* Compress the `.dat` file with the same filename as in the first step.
* Build and test, verifying that `process.versions.tz` matches the desired
  version.
* Create a new minor version release.

## Release schedule

ICU typically has more than one release a year, particularly coinciding with a
major release of [Unicode][]. The current release schedule is available in the
left sidebar of the [ICU][] website.

### V8 depends on ICU

V8 will aggressively upgrade to a new ICU version, due to requirements for
features/bugfixes needed for [Ecma402][] support. The minimum required version
of ICU is specified within the V8 source tree. If the ICU version is too old,
V8 will not compile.

```c
// deps/v8/src/objects/intl-objects.h
#define V8_MINIMUM_ICU_VERSION 65
```

V8 in Node.js depends on the ICU version supplied by Node.js.

The file `tools/icu/icu_versions.json` contains the current minimum
version of ICU that Node.js is known to work with. This should be
_at least_ the same version as V8, so that users will find out
earlier that their ICU is too old. A test case validates this when
Node.js is built.
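
The minimum-version requirement amounts to a comparison of ICU major versions.
The following is a minimal sketch of that comparison applied to a running
binary. It is not the test case Node.js actually uses: the helper names
`icuMajor` and `meetsMinimumIcu` are hypothetical, and the value `65` is only
borrowed from the `V8_MINIMUM_ICU_VERSION` example above.

```js
'use strict';

// Hypothetical helper: return the major version from an ICU version string
// such as '64.2'.
function icuMajor(versionString) {
  const major = Number.parseInt(versionString, 10);
  if (Number.isNaN(major)) {
    throw new TypeError(`Not an ICU version string: ${versionString}`);
  }
  return major;
}

// Hypothetical helper: true if `versionString` satisfies `minimum`.
function meetsMinimumIcu(versionString, minimum) {
  return icuMajor(versionString) >= minimum;
}

// Assumption: 65 mirrors the V8_MINIMUM_ICU_VERSION example above.
const MINIMUM = 65;
if (process.versions.icu !== undefined) {
  console.log(meetsMinimumIcu(process.versions.icu, MINIMUM));
}
```

Passing the minimum as a parameter keeps the sketch independent of the exact
layout of `tools/icu/icu_versions.json`.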

## How to upgrade ICU

> The script `tools/dep_updaters/update-icu.sh` automates
> this process.

* Make sure your Node.js workspace is clean (`git status`
  should be sufficient).
* Configure Node.js with the specific [ICU version](http://site.icu-project.org/download)
  you want to upgrade to, for example:

```bash
./configure \
    --with-intl=full-icu \
    --with-icu-source=https://github.com/unicode-org/icu/releases/download/release-67-1/icu4c-67_1-src.tgz
make
```

> _Note_: in theory, the equivalent `vcbuild.bat` commands should also work,
> but the commands below are makefile-centric.

* If there are ICU version-specific changes needed, you may need to make changes
  in `tools/icu/icu-generic.gyp` or add patch files to `tools/icu/patches`.
  * Specifically, look for the lists in `sources!` in
    `tools/icu/icu-generic.gyp` for files to exclude.

* Verify the Node.js build works:

```bash
make test-ci
```

Also, running

<!-- eslint-disable strict -->

```js
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

…should return `enero`, not `January`.

* Now, run the shrink tool to update `deps/icu-small` from `deps/icu`:

> :warning: Do not modify any source code in `deps/icu-small`!
> See section below about floating patches to ICU.

```bash
python tools/icu/shrink-icu-src.py
```

* Now, do a clean rebuild of Node.js to test:

```bash
make -k distclean
./configure
make
```

* Test this newly default-generated Node.js:

<!-- eslint-disable strict -->

```js
process.versions.icu;
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

(This should print your updated ICU version number, and also `enero` again.)

You are ready to check in (`git add`) the updated `deps/icu-small`.
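
The two manual checks above can be bundled into one script that is easy to
rerun after each rebuild. This is a minimal sketch rather than an official
test; the function name `intlSmokeCheck` is hypothetical.

<!-- eslint-disable strict -->

```js
'use strict';

// Hypothetical helper bundling the two manual checks: the ICU version in use
// and the Spanish name of the month containing `new Date(9E8)`.
function intlSmokeCheck() {
  const month = new Intl.DateTimeFormat('es', { month: 'long' })
    .format(new Date(9E8));
  return { icu: process.versions.icu, month };
}

const result = intlSmokeCheck();
console.log(result);
// With Spanish locale data available, `result.month` should be 'enero'.
```

Run it with the freshly built binary (for example `./node`, assuming the
default build output location) so that the check exercises the new ICU rather
than a system-wide Node.js.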

> :warning: Do not modify any source code in `deps/icu-small`!
> See section below about floating patches to ICU.

* Now, rebuild the Node.js license.

```bash
# clean up - remove deps/icu
make clean
tools/license-builder.sh
```

* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`.
  It should match the ICU URL used in the first step. When this is done, the
  following should build with small ICU.

```bash
# clean up
rm -rf out deps/icu deps/icu4c*
./configure --with-intl=small-icu --download=all
make
make test-ci
```

* Commit the change to the `deps/icu-small`, `tools/icu/current_ver.dep`
  and `LICENSE` files.

## Floating patches to ICU

Floating patches are applied at `configure` time. The "patch" files
are used instead of the original source files. The patch files are
complete `.cpp` files replacing the original contents.

Patches are tied to a specific ICU version. They won't apply to a
future ICU version. We assume that you filed a bug against [ICU][] and
upstreamed the fix, so the patch won't be needed in a later ICU
version.
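
Because patches are keyed by ICU major version, the location of a patch file
follows from the version and the file's path within the ICU source tree. The
following is a minimal sketch of that mapping, assuming the
`tools/icu/patches/<version>/<path>` layout used in the example that follows
in this document. `floatingPatchPath` is a hypothetical helper, not part of
`configure`.

```js
'use strict';

// Hypothetical helper: compute where a floating patch for `sourcePath`
// (relative to the ICU root, e.g. 'source/tools/toolutil/pkg_genc.cpp')
// would live for a given ICU major version.
function floatingPatchPath(icuMajor, sourcePath) {
  return ['tools/icu/patches', String(icuMajor), sourcePath].join('/');
}

console.log(floatingPatchPath(63, 'source/tools/toolutil/pkg_genc.cpp'));
// prints tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp
```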

### Example

For example, to patch `source/tools/toolutil/pkg_genc.cpp` for
ICU version 63:

```bash
# go to your Node.js source directory
cd <node>

# create the floating patch directory
mkdir -p tools/icu/patches/63

# create the subdirectory for the file(s) to patch:
mkdir -p tools/icu/patches/63/source/tools/toolutil/

# copy the file to patch
cp deps/icu-small/source/tools/toolutil/pkg_genc.cpp \
   tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# make any changes to this file, using the editor of your choice:
"$EDITOR" tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# test
make clean && ./configure && make
```

You should see a message such as:

```console
INFO: Using floating patch "tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp" from "tools/icu"
```

### Clean up

Any patches older than the minimum version given in `tools/icu/icu_versions.json`
ought to be deleted, because they will never be used.

### Why not just modify the ICU source directly?

Especially given the V8 dependencies above, there may be times when a floating
patch to ICU is required. Though it seems expedient to simply change a file in
`deps/icu-small`, this is not the right approach for the following reasons:

1. **Repeatability.** Given the complexity of merging in a new ICU version,
   following the steps in the upgrade section of this document ought to be
   repeatable without concern for overwriting a patch.

2. **Verifiability.** Given the number of files modified in an ICU PR,
   a floating patch could easily be missed or dropped altogether next time
   something is landed.

3. **Compatibility.** There are a number of ways that ICU can be loaded into
   Node.js (see the top-level README.md). Modifying only `icu-small` would mean
   the patch is not applied when the user specifies the ICU source code
   another way.
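
As a companion to the clean-up rule earlier in this section, the following
minimal sketch shows how stale patch directories could be identified. It
assumes that directories under `tools/icu/patches` are named after ICU major
versions, as in the example above; `stalePatchDirs` is a hypothetical helper,
and the minimum version is passed in rather than read from
`tools/icu/icu_versions.json`.

```js
'use strict';

// Hypothetical helper: given the directory names found under
// tools/icu/patches and a minimum ICU major version, return the names
// that are older than the minimum and therefore candidates for deletion.
function stalePatchDirs(dirNames, minimumIcu) {
  return dirNames
    .filter((name) => /^\d+$/.test(name)) // only version-numbered dirs
    .filter((name) => Number(name) < minimumIcu);
}

console.log(stalePatchDirs(['63', '64', '67'], 65));
// prints [ '63', '64' ]
```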

[CLDR]: http://cldr.unicode.org/
[Ecma402]: https://github.com/tc39/ecma402
[ICU]: http://site.icu-project.org/
[Unicode]: https://home.unicode.org/
[tz]: https://www.iana.org/time-zones