# Maintaining ICU in Node.js

## Background

International Components for Unicode ([ICU4C][ICU]) is used both by V8
and by Node.js directly to provide internationalization
functionality. To quote from icu-project.org:

> ICU is a mature, widely used set of C/C++ and Java libraries providing
> Unicode and Globalization support for software applications. ICU is
> widely portable and gives applications the same results on all platforms
> and between C/C++ and Java software.

## Data dependencies

ICU consumes and includes:

* Extracted locale data from [CLDR][]
* Extracted [Unicode][] data
* Time zone ([tz][]) data

The current versions of these items can be viewed for node with `node -p process.versions`:

```console
$ node -p process.versions

{
  …
  cldr: '35.1',
  icu: '64.2',
  tz: '2019a',
  unicode: '12.1'
}
```

### Time zone data

Time zone data files are updated independently of ICU CLDR data. ICU and its
main data files do not need to be upgraded in order to apply time zone data file
fixes.

The [IANA tzdata][tz] project releases new versions and announces them on the
[`tz-announce`](https://mm.icann.org/pipermail/tz-announce/) mailing list.

The Unicode project takes new releases and publishes
[updated time zone data files](https://github.com/unicode-org/icu-data/tree/HEAD/tzdata/icunew)
in the unicode-org/icu-data repository.

All modern versions of Node.js use the version 44 ABI of the time zone data
files.

#### Example: updating the ICU `.dat` file

* Decompress `deps/icu/source/data/in/icudt##l.dat.bz2`, where `##` is
  the ICU major version number.
* Clone the unicode-org/icu-data repository and copy the latest `tzdata` release
  `le` (little-endian) files into the `source/data/in` directory.
* Follow the upstream [ICU instructions](https://unicode-org.github.io/icu/userguide/datetime/timezone/)
  to patch the ICU `.dat` file:
  > `for i in zoneinfo64.res windowsZones.res timezoneTypes.res metaZones.res;
  > do icupkg -a $i icudt*l.dat; done`
* Optionally, verify that each of the above files is listed only once when using
  `icupkg -l`.
* Optionally, extract each file using `icupkg -x` and verify the `shasum`
  matches the desired value.
* Compress the `.dat` file with the same filename as in the first step.
* Build and test, verifying that `process.versions.tz` matches the desired version.
* Create a new minor version release.

## Release schedule

ICU typically has more than one release a year, particularly coinciding with a
major release of [Unicode][]. The current release schedule is available on the
[ICU][] website on the left sidebar.

### V8 depends on ICU

V8 upgrades aggressively to new ICU versions because of features and bug fixes
needed for [Ecma402][] support. The minimum required version
of ICU is specified within the V8 source tree. If the ICU version is too old,
V8 will not compile.

```c
// deps/v8/src/objects/intl-objects.h
#define V8_MINIMUM_ICU_VERSION 65
```

V8 in Node.js depends on the ICU version supplied by Node.js.

The file `tools/icu/icu_versions.json` contains the current minimum
version of ICU that Node.js is known to work with. This should be
_at least_ the same version as V8, so that users will find out
earlier that their ICU is too old. A test case validates this when
Node.js is built.

## How to upgrade ICU

* Make sure your Node.js workspace is clean (`git status`
  should be sufficient).
* Configure Node.js with the specific [ICU version](http://site.icu-project.org/download)
  you want to upgrade to, for example:

```bash
./configure \
    --with-intl=full-icu \
    --with-icu-source=https://github.com/unicode-org/icu/releases/download/release-67-1/icu4c-67_1-src.tgz
make
```

> _Note_: In theory, the equivalent `vcbuild.bat` commands should also work,
> but the commands below are makefile-centric.

* If there are ICU version-specific changes needed, you may need to make changes
  in `tools/icu/icu-generic.gyp` or add patch files to `tools/icu/patches`.
  * Specifically, look for the lists in `sources!` in the `tools/icu/icu-generic.gyp` for
    files to exclude.

* Verify the Node.js build works:

```bash
make test-ci
```

Also, running

<!-- eslint-disable strict -->

```js
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

should return `enero`, not `January`.

* Now, run the shrink tool to update `deps/icu-small` from `deps/icu`:

> :warning: Do not modify any source code in `deps/icu-small`!
> See section below about floating patches to ICU.

```bash
python tools/icu/shrink-icu-src.py
```

* Now, do a clean rebuild of Node.js to test:

```bash
make -k distclean
./configure
make
```

* Test this newly default-generated Node.js:

<!-- eslint-disable strict -->

```js
process.versions.icu;
new Intl.DateTimeFormat('es', { month: 'long' }).format(new Date(9E8));
```

(This should print your updated ICU version number, and also `enero` again.)

You are ready to check in the updated `deps/icu-small`. This is a big commit,
so make this a separate commit from the smaller changes.

> :warning: Do not modify any source code in `deps/icu-small`!
> See section below about floating patches to ICU.

* Now, rebuild the Node.js license.

```bash
# clean up - remove deps/icu
make clean
tools/license-builder.sh
```

* Update the URL and hash for the full ICU file in `tools/icu/current_ver.dep`.
  It should match the ICU URL used in the first step. When this is done, the
  following should build with small ICU.

```bash
# clean up
rm -rf out deps/icu deps/icu4c*
./configure --with-intl=small-icu --download=all
make
make test-ci
```

* Commit the changes to the `tools/icu/current_ver.dep` and `LICENSE` files.

  * Note: To simplify review, I will often “pre-land” this patch, meaning that
    I run the patch through `curl -L https://github.com/nodejs/node/pull/xxx.patch
    | git am -3 --whitespace=fix` per the collaborator’s guide… and then push that
    patched branch into my PR's branch. This reduces the whitespace changes that
    show up in the PR, since the final land will eliminate those anyway.

## Floating patches to ICU

Floating patches are applied at `configure` time. The "patch" files
are used instead of the original source files. The patch files are
complete `.cpp` files replacing the original contents.

Patches are tied to a specific ICU version. They won’t apply to a
future ICU version. We assume that you filed a bug against [ICU][] and
upstreamed the fix, so the patch won’t be needed in a later ICU
version.
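
The selection rule described above can be pictured with a small sketch. This is
not the logic used by `configure`; `resolveSource`, its parameters, and the
`deps/icu-small` default are hypothetical names chosen for illustration, and the
set of patch files is passed in rather than read from disk. It only shows that a
patch is picked when a file with the same relative path exists under the
directory for the matching ICU major version, and that a patch for another
version is ignored.

```js
'use strict';

const path = require('path');

// Hypothetical helper for illustration only; the real work happens in
// `configure`. `patchFiles` is a Set of patch paths assumed to exist.
function resolveSource(relPath, icuMajor, patchFiles,
                       sourceRoot = 'deps/icu-small') {
  // Patches live under tools/icu/patches/<ICU major version>/<relative path>.
  const candidate =
    path.posix.join('tools/icu/patches', String(icuMajor), relPath);
  // A matching patch replaces the whole original file; otherwise the
  // original source file is used unchanged.
  return patchFiles.has(candidate) ?
    candidate :
    path.posix.join(sourceRoot, relPath);
}

const patchFiles = new Set([
  'tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp',
]);

// With ICU 63 the patch is selected; with ICU 64 it no longer applies.
console.log(resolveSource('source/tools/toolutil/pkg_genc.cpp', 63, patchFiles));
console.log(resolveSource('source/tools/toolutil/pkg_genc.cpp', 64, patchFiles));
```

The first call yields the path under `tools/icu/patches/63`, while the second
falls back to the unpatched file, which is why a patch has to be recreated (or
dropped) when the ICU major version changes.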

### Example

For example, to patch `source/tools/toolutil/pkg_genc.cpp` for
ICU version 63:

```bash
# go to your Node.js source directory
cd <node>

# create the floating patch directory
mkdir -p tools/icu/patches/63

# create the subdirectory for the file(s) to patch:
mkdir -p tools/icu/patches/63/source/tools/toolutil/

# copy the file to patch
cp deps/icu-small/source/tools/toolutil/pkg_genc.cpp \
   tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp

# Make any changes to this file:
# (edit tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp)

# test
make clean && ./configure && make
```

You should see a message such as:

```console
INFO: Using floating patch "tools/icu/patches/63/source/tools/toolutil/pkg_genc.cpp" from "tools/icu"
```

### Clean up

Any patches older than the minimum version given in `tools/icu/icu_versions.json`
ought to be deleted, because they will never be used.

### Why not just modify the ICU source directly?

Especially given the V8 dependencies above, there may be times when a floating
patch to ICU is required. Though it seems expedient to simply change a file in
`deps/icu-small`, this is not the right approach for the following reasons:

1. **Repeatability.** Given the complexity of merging in a new ICU version,
   following the steps in the prior section of this document ought to be
   repeatable without concern for overwriting a patch.

2. **Verifiability.** Given the number of files modified in an ICU PR,
   a floating patch could easily be missed or dropped altogether next time
   something is landed.

3. **Compatibility.** There are a number of ways that ICU can be loaded into
   Node.js (see the top level README.md). Modifying only `icu-small` would mean
   the patch is not applied when the user specifies the ICU source code
   another way.
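
Returning to the clean-up rule above (patches older than the minimum version in
`tools/icu/icu_versions.json` are never used), the check can be sketched as
follows. `stalePatchDirs` is a hypothetical helper, not an existing tool in the
tree, and it assumes that each directory under `tools/icu/patches` is named
after an ICU major version, as in the example with `63`. The directory names
and the minimum version are passed in as plain values so the sketch does not
depend on the layout of the JSON file.

```js
'use strict';

// Hypothetical helper: returns the version-named patch directories that are
// older than the minimum supported ICU version and so can never be used.
function stalePatchDirs(dirNames, minimumIcu) {
  return dirNames
    .filter((name) => /^\d+$/.test(name)) // skip anything not named by version
    .filter((name) => Number(name) < minimumIcu);
}

// Example values; in a checkout, the names would come from listing
// tools/icu/patches and the minimum from tools/icu/icu_versions.json.
console.log(stalePatchDirs(['63', '64', '67', 'README.md'], 65));
```

With a minimum version of 65, the directories `63` and `64` are reported as
candidates for deletion, while `67` and the non-version entry are left alone.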

[CLDR]: http://cldr.unicode.org/
[Ecma402]: https://github.com/tc39/ecma402
[ICU]: http://site.icu-project.org/
[Unicode]: https://home.unicode.org/
[tz]: https://www.iana.org/time-zones