# Basic instructions for running the LdmlConverter via Maven
> Note: While this document provides useful background information about the
LdmlConverter, the complete process for integrating CLDR data into ICU
is described in `../../../docs/processes/cldr-icu.md`, which is
best viewed as
[CLDR-ICU integration](https://unicode-org.github.io/icu/processes/cldr-icu.html).
## Requirements
* A CLDR release for supplying CLDR data and the CLDR API.
* The Maven build tool
* The Ant build tool (using JDK 11+)
## Important directories
| Directory | Description |
|-----------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `TOOLS_ROOT` | Path to root of ICU tools directory, below which are (e.g.) the `cldr/` and `unicodetools/` directories. |
| `CLDR_DIR` | Path to the root of the standard CLDR sources, below which are the `common/` and `tools/` directories. |
| `CLDR_DATA_DIR` | The top-level directory for the CLDR production data (typically the "production" directory in the staging repository). Usually generated locally or obtained from: https://github.com/unicode-org/cldr-staging/tree/main/production |
On POSIX systems, it's best to set these as exported shell variables; the
instructions that follow assume they have been set accordingly:
```
$ export TOOLS_ROOT=/path/to/icu/tools
$ export CLDR_DIR=/path/to/cldr
$ export CLDR_DATA_DIR=/path/to/cldr-staging/production
```
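Before continuing, it can help to confirm that the three variables are set and point at directories with the layout described in the table above. The following is a minimal sketch, not part of the ICU tooling: the function name `check_cldr_env` is hypothetical, and the checks cover only the subdirectories named in the table. Call `check_cldr_env` after exporting the variables; it returns a non-zero status and prints a message for each problem it finds.

```shell
# Hypothetical helper (not part of ICU): sanity-check the environment
# variables described in the table above.
check_cldr_env() {
  status=0
  for name in TOOLS_ROOT CLDR_DIR CLDR_DATA_DIR; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name does not point to a directory: $value" >&2
      status=1
    fi
  done
  # Layout checks, assuming the subdirectories listed in the table.
  [ -d "${TOOLS_ROOT:-}/cldr" ] || { echo "missing cldr/ below TOOLS_ROOT" >&2; status=1; }
  [ -d "${CLDR_DIR:-}/common" ] || { echo "missing common/ below CLDR_DIR" >&2; status=1; }
  [ -d "${CLDR_DIR:-}/tools" ] || { echo "missing tools/ below CLDR_DIR" >&2; status=1; }
  return "$status"
}
```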
Note that you should not attempt to use data from the CLDR project directory
(where the CLDR API code exists) for conversion into ICU data. The process now
relies on a pre-processing step, and the CLDR data must come from the separate
"staging" repository (i.e. https://github.com/unicode-org/cldr-staging) or be
pre-processed locally into a different directory.
## Initial Setup
This project relies on the Maven build tool for managing dependencies and uses
Ant for configuration purposes, so both need to be installed. On a Debian-based
system, this should be as simple as:
```
$ sudo apt-get install maven ant
```
You must also install an additional CLDR JAR file in the local Maven repository at
`$TOOLS_ROOT/cldr/lib` (see the `README.txt` in that directory for more
information).
```
$ cd "$TOOLS_ROOT/cldr/lib"
$ ./install-cldr-jars.sh "$CLDR_DIR"
```
## Generating all ICU data and source code
```
$ cd "$TOOLS_ROOT/cldr/cldr-to-icu"
$ ant -f build-icu-data.xml
```
## Other Examples
* Outputting a subset of the supplemental data into a specified directory:
```
$ ant -f build-icu-data.xml -DoutDir=/tmp/cldr -DoutputTypes=plurals,dayPeriods -DdontGenCode=true
```
Note: Output types can be listed with mixedCase, lower_underscore or UPPER_UNDERSCORE.
Pass `-DoutputTypes=help` to see the full list.
* Outputting only a subset of locale IDs (and all the supplemental data):
```
$ ant -f build-icu-data.xml -DoutDir=/tmp/cldr -DlocaleIdFilter='(zh|yue).*' -DdontGenCode=true
```
* Overriding the default CLDR version string (which normally matches the CLDR library code):
```
$ ant -f build-icu-data.xml -DcldrVersion="36.1"
```
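To preview which locale IDs a pattern such as `(zh|yue).*` would select before running a full conversion, the pattern can be tried against a list of candidate IDs. The sketch below is only an approximation: it assumes the filter must match the whole locale ID, and it uses POSIX extended regular expressions via `grep -E`, which may differ in details from the regex dialect the converter uses. The function name `filter_locale_ids` is hypothetical.

```shell
# Hypothetical helper: print the locale IDs from stdin that match the
# regular expression in $1 as a whole line (assumed full-match semantics).
filter_locale_ids() {
  grep -E -x -- "$1"
}

# Try the pattern from the example above against a few sample IDs.
printf '%s\n' en fr yue yue_Hans zh zh_Hant | filter_locale_ids '(zh|yue).*'
```

With these sample IDs, only the ones starting with `zh` or `yue` are printed. Note that the pattern is deliberately loose: any ID that merely begins with `zh` or `yue` would be selected as well.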
### Using `alt="ascii"` CLDR alternate values from the CLDR XML
CLDR provides alternate values in addition to the default values for locale data.
For example, some locales have time formats using U+202F NARROW NO-BREAK SPACE (NNBSP) between the hours/minutes/seconds and the day periods.
In order to provide the equivalent time formats that use the ASCII space
U+0020 SPACE,
the alternate values have the extra attribute `alt="ascii"`.
Follow these steps to generate ICU data using the ASCII versions of locale data:
1. First, edit the `build-icu-data.xml` file where it mentions `ALTERNATE VALUES`
with the correctly annotated source path, target path, and locales list
as follows:
```diff
@@ -384,6 +399,20 @@
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
```
1. Then run the generator:
```
$ ant -f build-icu-data.xml
```
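After regenerating, one way to spot-check the result is to test whether a given output file still contains U+202F. The sketch below is an illustration only and is not part of the converter; the function name `contains_nnbsp` is hypothetical, and it assumes the text being checked is UTF-8 encoded.

```shell
# U+202F NARROW NO-BREAK SPACE as UTF-8 bytes (E2 80 AF), written with
# octal escapes so that it works with a POSIX printf.
nnbsp=$(printf '\342\200\257')

# Hypothetical helper: succeeds if stdin contains at least one NNBSP.
contains_nnbsp() {
  grep -q -- "$nnbsp"
}
```

For example, `contains_nnbsp < some-output-file.txt && echo "NNBSP present"` reports whether the file (a placeholder name here) still uses the narrow no-break space.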
## Config syntax details
Note: some elements have implicit default attribute values associated with them, according to [`ldml.dtd`](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/dtd/cldr/common/dtd/ldml.dtd).
For example, for the `timeFormat` element,
the following excerpt of the DTD schema indicates that there is a default value `"standard"` for the `type` attribute:
```
<!ATTLIST timeFormat type NMTOKEN "standard" >
```
See `build-icu-data.xml` for documentation of all options and additional customization.
## Running unit tests
```
$ mvn test -DCLDR_DIR="$CLDR_DATA_DIR"
```
## Importing and running from an IDE
This project should be easy to import into an IDE which supports Maven development, such
as IntelliJ or Eclipse. It uses a local Maven repository directory for the unpublished
CLDR libraries (which are included in the project), but otherwise gets all dependencies
via Maven's public repositories.