---
layout: default
title: Resources
nav_order: 2
parent: Locales and Resources
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Resource Management
{: .no_toc }

## Contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

> :point_right: **Note**: This page describes the use of ICU4C Resource
> Management techniques and APIs. For an overview of the message localization
> process using ICU, see the related page [Localizing with ICU](localizing.md).

A software product that needs to be localized wins or loses depending on how
easy it is to change the data or "resources" that affect users. From the simplest
point of view, that data is the information presented to the user (such as a
translated message) as well as the region-specific ways of doing things (such as
sorting). The process of localization will eventually involve translators, and it
would be very convenient if the process of localizing could be done only by
translators and experts in the target culture. There are several points to keep
in mind when designing such a localizable software product.

### Keeping Data Separate

Obviously, one does not want to make translators wade through the source code
and make changes there. That would be a recipe for disaster. Instead, the
translatable data should be kept separately, in a format that allows translators
easy access. A separate resource management mechanism is therefore required.
Applications access data through API calls, which pick the appropriate entries
from the resources. Resources are kept in a human-readable and editable format,
with optional tools for content editing.

The data should contain all the elements to be localized, including, but not
limited to, GUI messages, icons, formatting patterns, and collation rules. A
convenient way of keeping binary data should also be provided, since icons often
need to differ between cultures.

### Keeping Data Small

It is quite likely that the data will be the same for several regions. Take, for
example, Spanish-speaking countries: the names of the days and months are the
same in both Mexico and Spain. It would be very beneficial if we could prevent
the duplication of data. This can be achieved by structuring resources in such a
way that an unsuccessful query into a more specific resource triggers the same
query in a more general resource. A convenient way to do this is to use a
tree-like structure.

Another way to reduce the data size is to allow linking of resources that are
the same for regions that are not in a general-to-specific relationship.

### Find the Best Available Data

Sometimes, the exact data for a region is still not available. However, if the
data is structured correctly, the user can be presented with similar data. For
example, a Spanish-speaking user in Mexico would probably be happier with
Spanish than with English captions, even if some of the details for Mexico are
not there.

If the data is grouped correctly, the program can automatically find the most
suitable data for the situation.

The previous points all lead to a mechanism that stores data separately from
the code. Software accesses the data through API calls. Data is organized in a
tree-like structure, with the most general region in the root (most commonly,
the root region is the native language of the development team). Branches lead
to more specialized regions, usually through languages, countries and country
regions. Data that is already the same on the more general level is not
repeated.

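The inheritance described above can be sketched in a few lines of C. This is a
minimal model and not ICU's implementation: the `Bundle` and `Entry` types, the
`lookup` function and the sample strings are hypothetical names chosen for
illustration. Each bundle stores only the entries that differ from its parent,
and a failed lookup is retried in the next more general bundle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One key/value pair stored in a bundle (hypothetical model type). */
typedef struct {
    const char *key;
    const char *value;
} Entry;

/* A node of the locale tree; it holds only what differs from its parent. */
typedef struct Bundle {
    const char *locale;
    const struct Bundle *parent; /* NULL for the root */
    const Entry *entries;
    size_t count;
} Bundle;

/* Search this bundle, then each more general ancestor, for the key. */
const char *lookup(const Bundle *b, const char *key) {
    assert(key != NULL);
    for (; b != NULL; b = b->parent) {
        for (size_t i = 0; i < b->count; i++) {
            if (strcmp(b->entries[i].key, key) == 0) {
                return b->entries[i].value;
            }
        }
    }
    return NULL; /* not present anywhere, not even in the root */
}

/* Sample data: the root is English, de overrides two entries, de_AT one. */
const Entry rootEntries[] = {
    {"ok", "OK"}, {"cancel", "Cancel"}, {"tomato", "Tomato"}
};
const Entry deEntries[] = { {"cancel", "Abbrechen"}, {"tomato", "Tomate"} };
const Entry deATEntries[] = { {"tomato", "Paradeiser"} };

const Bundle rootBundle = { "root", NULL, rootEntries, 3 };
const Bundle deBundle = { "de", &rootBundle, deEntries, 2 };
const Bundle deATBundle = { "de_AT", &deBundle, deATEntries, 1 };
const Bundle deDEBundle = { "de_DE", &deBundle, NULL, 0 }; /* empty bundle */
```

With this data, `lookup(&deATBundle, "tomato")` finds the Austrian override,
`lookup(&deATBundle, "cancel")` is answered by the `de` bundle, and
`lookup(&deATBundle, "ok")` is answered by the root. The empty `de_DE` bundle
stores nothing of its own and takes every value from `de` or the root.
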
> :point_right: **Note**: The path through languages, countries and country
> regions could be different. One may decide to go through countries and then
> through the languages spoken in the particular country. In either case, some
> data must be duplicated. If you go through languages, the currency data for
> the different language communities of the same country will be duplicated
> (consider the French and English languages in Canada). On the other hand, if
> you go through countries, you will need to duplicate day names and similar
> information.

Here is an example of such a resource tree structure:

```
             root                             Root
              |
  +-------+---+---+----+----+
  |       |       |    |    |
  en      de      ja  ru    zh                Language
  |       |       |    |    |
  +---+   +---+   |    |    +------+
  |   |   |   |   |    |    |      |
  |   |   |   |   |    |    Hans  Hant        Script
  |   |   |   |   |    |    |      |
  |   |   |   |   |    |    |      +----+
  |   |   |   |   |    |    |      |    |
  US  IE  DE  AT  JP   RU   CN     HK   TW    Country or Region
  |
  POSIX                                       Variant
```

Let us assume that the root resource contains data written by the original
implementors and that this data is in English and conforms to the conventions
used in the United States. Therefore, the resources for English and for English
in the United States would be empty and would take their data from the root
resource. If a version for Ireland is required, appropriate overriding changes
can be made to the data for English in Ireland. Special variant information
could be put into en_US_POSIX if specific legacy formatting were required, or
specific sub-region information were required. When making the version for the
German-speaking region, all the German data would be in that resource, with the
differences in the Germany and Austria resources.

It is important to note that some locales have the optional script tag. This is
important for multi-script locales, like Uzbek, Azerbaijani, Serbian or Chinese.
Even though Chinese uses Han characters, the characters are usually identified
as either traditional Chinese (Hant) characters, or simplified Chinese (Hans).

Even if all the data that would go to a certain resource comes from the more
general resources, it should be made clear that the particular region is
supported by the application. This can be done by having completely empty
resources.

## The ICU Model

ICU bases its resource management model on the ideas presented above. All the
resource APIs are concentrated in the resource bundle framework. This framework
is closely tied in its functioning to the ICU [Locale](index.md) naming scheme.

ICU provides and relies on a set of locale-specific data in the resource bundle
format. If we think that we have correct data for a requested locale, even if
all its data comes from more general locales, we will provide an empty
resource bundle. This is reflected in our return informational codes (see the
section on APIs). Many ICU frameworks (collation, formatting etc.) rely on
the data stored in resource bundles.

Resource bundles rely on the ICU data framework. For more information on the
functioning of ICU data, see the appropriate [section](../icudata.md).

Users of the ICU library can also use the resource bundle framework to store and
retrieve localizable data in their projects.

Resource bundles are collections of resources. Individual resources can contain
data or other resources.

> :point_right: **Note**: ICU4J relies on the resource bundle mechanism already
> provided by JDK for its functioning. Therefore, most of the discussion here
> pertains only to ICU4C.

### Fallback Mechanism

An essential part of ICU's resource management framework is the fallback
mechanism. It ensures that if the data for the requested locale is missing, an
effort will be made to obtain the most usable data. Fallback can happen in two
situations:

1.  When a resource bundle for a locale is requested. If it doesn't exist, a
    more general resource bundle will be used. If there are no such resource
    bundles, a resource bundle for the default locale will be used. If this
    fails, the root resource bundle will be used. When using ICU locale data,
    not finding the requested resource bundle means that we don't know what the
    data should be for that particular locale, so you might want to consider
    this situation an error. Custom packages of resource bundles may or may not
    adhere to this contract. Special care should be taken in remote server
    situations, when the data from the default locale might not mean anything to
    the remote user (imagine a situation where a server in Japan responds to a
    Spanish-speaking client by using default Japanese data).

2.  When a resource inside a resource bundle is requested. If the resource is
    not present, it will be sought in more general resources. If the initial
    opening of a resource bundle went through the default locale, the search
    for a resource will also go through it. For example, if a resource bundle
    for zh_Hans_CN is opened, a missing resource will be looked for in zh_Hans,
    zh and finally root. This is usually harmless, except when a resource is
    only located in the default locale or in the root resource bundle.

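The lookup order in the zh_Hans_CN example can be reproduced with plain string
handling. The sketch below is an illustration under simplifying assumptions,
not ICU code: `fallback_chain` is a hypothetical helper, it only strips
trailing subtags at the `_` separator, and it ignores the detour through the
default locale that can happen when a bundle is opened.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Writes the chain of locale IDs that would be consulted for `locale`,
 * most specific first, into `out` (for example "de_AT -> de -> root").
 * Returns the number of IDs in the chain, or -1 if `out` is too small.
 * An empty string or "root" is treated as the root locale.
 */
int fallback_chain(const char *locale, char *out, size_t outSize) {
    assert(locale != NULL && out != NULL);
    size_t len = (strcmp(locale, "root") == 0) ? 0 : strlen(locale);
    size_t used = 0;
    int count = 0;

    while (len > 0) {
        int n = snprintf(out + used, outSize - used, "%.*s -> ",
                         (int)len, locale);
        if (n < 0 || (size_t)n >= outSize - used) {
            return -1; /* output was truncated */
        }
        used += (size_t)n;
        count++;
        /* Drop the last subtag together with its '_' separator. */
        while (len > 0 && locale[len - 1] != '_') {
            len--;
        }
        if (len > 0) {
            len--;
        }
    }
    int n = snprintf(out + used, outSize - used, "root");
    if (n < 0 || (size_t)n >= outSize - used) {
        return -1;
    }
    return count + 1; /* the root is always the last element */
}
```

Calling `fallback_chain("zh_Hans_CN", buf, sizeof buf)` with a sufficiently
large buffer fills it with `zh_Hans_CN -> zh_Hans -> zh -> root` and returns 4.
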
### Data Packaging

ICU allows and requires that the application-specific data be stored apart from
the ICU internal data (locale, converter, transformation data etc.). Application
data should be stored in packages. ICU uses the default package (`NULL`) for its
data. All of ICU's build tools provide means to specify the package for your
data. More about how to package application data can be found below.

## Resource Bundle APIs

ICU4C provides both C and C++ APIs for using resource bundles. The core
implementation is in C, while the C++ APIs are only a thin wrapper around it.
Therefore, the code using C APIs will generally be faster.

Resource bundles use ICU's "open use close" paradigm. In C, all the resource
bundle operations are done using the `UResourceBundle*` handle. `UResourceBundle*`
allows access to both resource bundles and individual resources. In C++, class
`ResourceBundle` should be used for both resource bundles and individual
resources.

To use the resource bundle framework, you need to include the appropriate header
file, `unicode/ures.h` for C and `unicode/resbund.h` for C++.

### Error Checking

If an operation on a resource bundle fails, an error code will be set. It is
important to check the value of the error code. In C you should frequently
use the following construct:

```c
if (U_SUCCESS(status)) {
    /* everything is fine */
} else {
    /* there was an error */
}
```

### Opening of Resource Bundles

The most common C resource bundle opening API is:

```c
UResourceBundle* ures_open(const char* package, const char* locale, UErrorCode* status)
```

The first argument specifies the package name or `NULL` for the default ICU package.
The second argument is the locale for which you want the resource bundle.
Special values for the locale are `NULL` for the default locale and `""` (empty
string) for the root locale. The third argument should be set to `U_ZERO_ERROR`
before calling the function. On return it holds the status of the operation.
Apart from returning regular errors, it can return two informational/warning
codes: `U_USING_FALLBACK_WARNING` and `U_USING_DEFAULT_WARNING`. The first
informational code means that the requested resource bundle was not found and
that a more general bundle was returned. If you are opening ICU resource
bundles, do note that this means that we do not guarantee that the contents of
the opened resource bundle will be correct for the requested locale. The
situation might be different for application packages. The warning
`U_USING_DEFAULT_WARNING`, however, means that no more general resource bundles
were found and that you were returned either the resource bundle that is the
default for the system or the root resource bundle. This will almost certainly
contain wrong data.

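Because the two informational codes count as success, a caller that wants to
treat fallback as a problem has to compare the status explicitly. The sketch
below models that decision without depending on ICU: the `MY_` names are local
stand-ins, and their values are assumed to mirror the convention in
`unicode/utypes.h`, where warnings are negative, `U_ZERO_ERROR` is zero and
errors are positive. In real code, use the `U_` constants and macros from the
ICU headers instead.

```c
#include <assert.h>

/* Local stand-ins (assumed to mirror ICU's sign convention). */
typedef enum {
    MY_USING_FALLBACK_WARNING = -128,
    MY_USING_DEFAULT_WARNING = -127,
    MY_ZERO_ERROR = 0,
    MY_MISSING_RESOURCE_ERROR = 2
} MyErrorCode;

/* Warnings are <= zero, so they pass the success test. */
#define MY_SUCCESS(x) ((x) <= MY_ZERO_ERROR)
#define MY_FAILURE(x) ((x) > MY_ZERO_ERROR)

typedef enum {
    OPEN_EXACT,    /* the requested bundle itself was opened */
    OPEN_FALLBACK, /* a more general bundle was opened */
    OPEN_DEFAULT,  /* the default-locale or root bundle was opened */
    OPEN_FAILED    /* nothing usable was opened */
} OpenOutcome;

/* Classify the status left behind by an open call. */
OpenOutcome classify_open(MyErrorCode status) {
    if (MY_FAILURE(status)) {
        return OPEN_FAILED;
    }
    if (status == MY_USING_DEFAULT_WARNING) {
        return OPEN_DEFAULT;
    }
    if (status == MY_USING_FALLBACK_WARNING) {
        return OPEN_FALLBACK;
    }
    return OPEN_EXACT;
}
```

An application that serves remote users might accept `OPEN_EXACT` and
`OPEN_FALLBACK` but reject `OPEN_DEFAULT`, since that data is unlikely to match
the requested locale.
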
There are a couple of other opening APIs. `ures_openDirect` takes the same
arguments as `ures_open` but will fail if the requested locale is not found.
Also, if opening is successful, no fallback will be performed if an individual
resource is not found. The second one, `ures_openU`, takes a `UChar*` for the
package name instead of a `char*`.

In C++, opening is done through a constructor. There are several constructors.
The most notable difference from the C APIs is that the package should be given
as a `UnicodeString` and the locale is passed as a `Locale` object. There is
also a copy constructor and a constructor that takes a C `UResourceBundle*`
handle. The result is a `ResourceBundle` object. Remarks about informational
codes are also valid for the C++ APIs.

> :point_right: **Note**: All the data accessing examples in the following
> sections use ICU's
> [root](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/locales/root.txt)
> resource bundle.

```c
UErrorCode status = U_ZERO_ERROR;
UResourceBundle* icuRoot = ures_open(NULL, "root", &status);
if (U_SUCCESS(status)) {
    /* everything is fine */
    ...
    /* do some interesting stuff here - see below */
    ...
    /* and close the bundle afterwards */
    ures_close(icuRoot); /* discussed later */
} else {
    /* there was an error */
    /* report and exit */
}
```

In C++, opening would look like this:

```c++
UErrorCode status = U_ZERO_ERROR;
// we rely on automatic construction of Locale object from a char*
ResourceBundle myResource("myPackage", "de_AT", status);
if (U_SUCCESS(status)) {
    /* everything is fine */
    ...
    /* do some interesting stuff here */
    ...
    /* the bundle will be closed when going out of scope */
} else {
    /* there was an error */
    /* report and exit */
}
```

### Closing of Resource Bundles

After use, resource bundles need to be closed to prevent memory leaks. In C,
you should call the `void ures_close(UResourceBundle* resB)` API. In C++, if you
have just used `ResourceBundle` objects, going out of scope will close the
bundles. When using allocated objects, make sure that you call the appropriate
delete function.

As already mentioned, resource bundles and resources share the same type. You
can close bundles and resources in any order you like. You can invoke `ures_close`
on `NULL` resource bundles. Therefore, you can always call this API regardless
of the success of previous operations.

### Accessing Resources

Once you are in possession of a valid resource bundle, you can access the
resources and data that it holds. The result of accessing operations will be a
new resource bundle object. In C, `UResourceBundle*` handles can be reused by
using the fill-in parameter. That saves you from frequent closing and
reallocating of resource bundle structures, which can dramatically improve
performance. C++ APIs do not provide means for object reuse. All the C examples
in the following sections will use a fill-in parameter.

#### Types of Resources

Resource bundles can contain two main types of resources: complex and simple
resources. Complex resources store other resources and can have named or unnamed
elements. **Tables** store named elements, while **arrays** store unnamed ones.
Simple resources contain data which can be **string**, **binary**, **integer
array** or a single **integer**.

There are several ways of accessing data stored in the complex resources.
Tables can be accessed using keys, indexes and by iteration. Arrays can be
accessed using indexes and by iteration.

In order to be able to distinguish between resources, one needs to know the type
of the resource at hand. To find this out, use the
`UResType ures_getType(UResourceBundle* resourceBundle)` API, or the C++ analog
`UResType getType(void)`. The `UResType` is an enumeration defined in the
[unicode/ures.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/ures.h)
header file.

> :point_right: **Note**: Indexes of resources in tables do not necessarily
> correspond to the order of items in a table. Due to the way the binary
> structure is organized, items in a table are sorted according to the binary
> ordering of the keys. Therefore, the index of an item in a table will be the
> index of its key string in the binary order. Furthermore, the ordering of the
> keys is different on ASCII and EBCDIC platforms.
> <br>
> Starting with ICU 4.4, the order of table items is the ASCII string order on
> all platforms.
> <br>
> The iteration order of table items might change from release to release.

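The effect of this ordering on table indexes can be shown with a small
stand-alone model. It is not how ICU builds its tables; `index_after_sort` is a
hypothetical helper that sorts a list of keys by byte value with `strcmp` and
reports where a given key ends up. The keys are sample values.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort callback: compare two keys by byte value. */
static int compare_keys(const void *a, const void *b) {
    return strcmp(*(const char *const *)a, *(const char *const *)b);
}

/*
 * Sorts `keys` in place into byte order and returns the index at which
 * `key` ends up, or -1 if it is not in the list.
 */
int index_after_sort(const char **keys, size_t count, const char *key) {
    assert(keys != NULL && key != NULL);
    qsort(keys, count, sizeof keys[0], compare_keys);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(keys[i], key) == 0) {
            return (int)i;
        }
    }
    return -1;
}
```

For keys written in the order `menu`, `errors`, `splash`, `Version`, the sorted
order on an ASCII platform is `Version`, `errors`, `menu`, `splash`, because
uppercase letters sort before lowercase ones. The first key in the source,
`menu`, therefore ends up at index 2, which is why code should not assume that
table indexes follow the source order.
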
#### Accessing by Key

To access resources using a key, you can use the `UResourceBundle*
ures_getByKey(const UResourceBundle* resourceBundle, const char* key,
UResourceBundle* fillIn, UErrorCode* status)` API. The first argument is the
parent resource bundle, which can be either a resource bundle opened using
`ures_open` or similar APIs, or a table resource. The key is always specified
using invariant characters. The fill-in parameter can be either `NULL` or a
valid resource bundle handle. If it is `NULL`, a new resource bundle will be
constructed. If you pass an already existing resource bundle, it will be closed
and the memory will be reused for the new resource bundle. The status can be set
to `U_MISSING_RESOURCE_ERROR`, which indicates that no resource with that key
exists, or to one of the informational codes mentioned above
(`U_USING_FALLBACK_WARNING` and `U_USING_DEFAULT_WARNING`), which do not affect
the validity of data in the case of resource retrieval.

```c
...
/* we already got the icuRoot bundle from the opening example */
UResourceBundle *zones = ures_getByKey(icuRoot, "zoneStrings", NULL, &status);
if (U_SUCCESS(status)) {
    /* ... do interesting stuff - see below ... */
}
ures_close(zones);
/* clean up the rest */
...
```

In C++, the analogous API is `ResourceBundle get(const char* key, UErrorCode& status) const`.

Trying to retrieve resources by key on any type of resource other than a table
will produce a `U_RESOURCE_TYPE_MISMATCH` error.

#### Accessing by Index

Accessing by index requires you to supply the index of the resource that you
want to retrieve. The appropriate API is `UResourceBundle* ures_getByIndex(const
UResourceBundle* resourceBundle, int32_t indexR, UResourceBundle* fillIn,
UErrorCode* status)`. The arguments have the same semantics as for the
`ures_getByKey` API. The only difference is the second argument, which is the
index of the resource that you want to retrieve. Indexes start at zero. If an
index out of range is specified, `U_MISSING_RESOURCE_ERROR` is returned. To find
the size of a resource, you can use `int32_t ures_getSize(UResourceBundle*
resourceBundle)`. The maximum index is the result of this API minus 1.

```c
...
/* we already got zones resource from the accessing by key example */
UResourceBundle *currentZone = NULL;
int32_t index = 0;
for (index = 0; index < ures_getSize(zones); index++) {
    currentZone = ures_getByIndex(zones, index, currentZone, &status);
    /* ... do interesting stuff here ... */
}
ures_close(currentZone);
/* cleanup the rest */
...
```

Accessing a simple resource with index 0 returns the resource itself. This is
useful for iterating over all the resources regardless of type.

C++ overloads the get API with `ResourceBundle get(int32_t index, UErrorCode& status) const`.

#### Iterating Over Resources

If you don't care about the order of the resources and want simple code, you can
use the iteration mechanism. To iterate over a complex resource, you can simply
start calling the `UResourceBundle*
ures_getNextResource(UResourceBundle* resourceBundle, UResourceBundle* fillIn,
UErrorCode* status)` API. It is advisable, though, to reset the iterator for a
resource before starting, in order to ensure that the iteration will indeed
start from the beginning, just in case some other code has already iterated
over this resource. To reset the iterator, use the `void
ures_resetIterator(UResourceBundle* resourceBundle)` API. To check whether there
are more resources, call `UBool ures_hasNext(UResourceBundle* resourceBundle)`.
If you have iterated through the whole resource, `NULL` will be returned.

```c
...
/* we already got zones resource from the accessing by key example */
UResourceBundle *currentZone = NULL;
ures_resetIterator(zones);
while (ures_hasNext(zones)) {
    currentZone = ures_getNextResource(zones, currentZone, &status);
    /* ... do interesting stuff here ... */
}
ures_close(currentZone);
/* cleanup the rest */
...
```

C++ provides analogous APIs: `ResourceBundle getNext(UErrorCode& status)`,
`void resetIterator(void)` and `UBool hasNext(void)`.

#### Accessing Data in the Simple Resources

In order to get to the data in the simple resources, you need to use the
appropriate APIs according to the type of the simple resource. They are
summarized in the tables below. All the pointers returned should be considered
pointers to read-only data. Using an API on a resource of the wrong type will
result in an error.

Strings:

| Language | API                                                                                                    |
| -------- | ------------------------------------------------------------------------------------------------------ |
| C        | `const UChar* ures_getString(const UResourceBundle* resourceBundle, int32_t* len, UErrorCode* status)` |
| C++      | `UnicodeString getString(UErrorCode& status) const`                                                    |

Example:

```c
...
UResourceBundle* version = ures_getByKey(icuRoot, "Version", NULL, &status);
if (U_SUCCESS(status)) {
    int32_t versionStringLen = 0;
    const UChar* versionString = ures_getString(version, &versionStringLen, &status);
}
ures_close(version);
...
```

Binaries:

| Language | API                                                                                                      |
| -------- | -------------------------------------------------------------------------------------------------------- |
| C        | `const uint8_t* ures_getBinary(const UResourceBundle* resourceBundle, int32_t* len, UErrorCode* status)` |
| C++      | `const uint8_t* getBinary(int32_t& len, UErrorCode& status) const`                                       |

Integers, signed and unsigned:

| Language | API                                                                                                                                                                      |
| -------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| C        | `int32_t ures_getInt(const UResourceBundle* resourceBundle, UErrorCode* status)` <br> `uint32_t ures_getUInt(const UResourceBundle* resourceBundle, UErrorCode* status)` |
| C++      | `int32_t getInt(UErrorCode& status) const` <br> `uint32_t getUInt(UErrorCode& status) const`                                                                             |

Integer Arrays:

| Language | API                                                                                                         |
| -------- | ----------------------------------------------------------------------------------------------------------- |
| C        | `const int32_t* ures_getIntVector(const UResourceBundle* resourceBundle, int32_t* len, UErrorCode* status)` |
| C++      | `const int32_t* getIntVector(int32_t& len, UErrorCode& status) const`                                       |

#### Convenience APIs

Since the vast majority of data stored in resource bundles are strings, ICU's
resource bundle framework provides a number of convenience APIs that directly
access strings stored in resources. They are analogous to the APIs already
discussed, with the difference that they return `const UChar*` or `UnicodeString`
objects.

> :point_right: **Note**: The C APIs that return `UnicodeString` objects only
> work if used in a C++ file. Trying to use them in a C file will produce a
> compiler error.

APIs that allow retrieving strings by specifying a key:

| Language (Return Type) | API                                                                                                                |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------ |
| C (UChar*)             | `const UChar* ures_getStringByKey(const UResourceBundle* resB, const char* key, int32_t* len, UErrorCode* status)` |
| C (UnicodeString)      | `UnicodeString ures_getUnicodeStringByKey(const UResourceBundle* resB, const char* key, UErrorCode* status)`       |
| C++                    | `UnicodeString getStringEx(const char* key, UErrorCode& status) const`                                             |

APIs that allow retrieving strings by specifying an index:

| Language (Return Type) | API                                                                                                                 |
| ---------------------- | ------------------------------------------------------------------------------------------------------------------- |
| C (UChar*)             | `const UChar* ures_getStringByIndex(const UResourceBundle* resB, int32_t indexS, int32_t* len, UErrorCode* status)` |
| C (UnicodeString)      | `UnicodeString ures_getUnicodeStringByIndex(const UResourceBundle* resB, int32_t indexS, UErrorCode* status)`       |
| C++                    | `UnicodeString getStringEx(int32_t index, UErrorCode& status) const`                                                |

APIs for retrieving strings through iteration:

| Language (Return Type) | API                                                                                                                    |
| ---------------------- | ---------------------------------------------------------------------------------------------------------------------- |
| C (UChar*)             | `const UChar* ures_getNextString(UResourceBundle* resourceBundle, int32_t* len, const char** key, UErrorCode* status)` |
| C (UnicodeString)      | `UnicodeString ures_getNextUnicodeString(UResourceBundle* resB, const char** key, UErrorCode* status)`                 |
| C++                    | `UnicodeString getNextString(UErrorCode& status)`                                                                      |

#### Other APIs

The resource bundle framework provides a number of additional APIs that allow
you to get more information on the resources you are using. They are summarized
in the following tables.

| Language | API                                                     |
| -------- | ------------------------------------------------------- |
| C        | `int32_t ures_getSize(UResourceBundle* resourceBundle)` |
| C++      | `int32_t getSize(void) const`                           |

Gets the number of items in a resource. Simple resources always return size 1.

| Language | API                                                      |
| -------- | -------------------------------------------------------- |
| C        | `UResType ures_getType(UResourceBundle* resourceBundle)` |
| C++      | `UResType getType(void)`                                 |

Gets the type of the resource. For a list of resource types, see:
[unicode/ures.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/ures.h)

| Language | API                                              |
| -------- | ------------------------------------------------ |
| C        | `const char* ures_getKey(UResourceBundle* resB)` |
| C++      | `const char* getKey(void)`                       |

Gets the key of a named resource or `NULL` if this resource is a member of an
array.

| Language | API                                                                           |
| -------- | ----------------------------------------------------------------------------- |
| C        | `void ures_getVersion(const UResourceBundle* resB, UVersionInfo versionInfo)` |
| C++      | `void getVersion(UVersionInfo versionInfo) const`                             |

Fills out the version structure for this resource.

| Language | API                                                                                     |
| -------- | --------------------------------------------------------------------------------------- |
| C        | `const char* ures_getLocale(const UResourceBundle* resourceBundle, UErrorCode* status)` |
| C++      | `const Locale& getLocale(void) const`                                                   |

Returns the locale this resource is from. This API is going to change, so stay
tuned.

### Format of Resource Bundles

Resource bundles are written in their source format. Before they can be used,
they must be compiled to the binary format using the `genrb` utility. The
currently supported source format is a text file. The format is defined in a
[formal definition
file](https://github.com/unicode-org/icu-docs/blob/main/design/bnf_rb.txt).

This is an example of a resource bundle source file:

```
// Comments start with a '//' and extend to the end of the line
// first, a locale name for the bundle is defined. The whole bundle is a table
// every resource, including the whole bundle has its name.
// The name consists of invariant characters, digits and following symbols: -, _.
root {
    menu {
        id { "mainmenu" }
        items {
            {
                id { "file" }
                name { "&File" }
                items {
                    {
                        id { "open" }
                        name { "&Open" }
                    }
                    {
                        id { "save" }
                        name { "&Save" }
                    }
                    {
                        id { "exit" }
                        name { "&Exit" }
                    }
                }
            }

            {
                id { "edit" }
                name { "&Edit" }
                items {
                    {
                        id { "copy" }
                        name { "&Copy" }
                    }
                    {
                        id { "cut" }
                        name { "&Cut" }
                    }
                    {
                        id { "paste" }
                        name { "&Paste" }
                    }
                }
            }

            ...
        }
    }

    // This resource is an array, thus accessible only through iteration and indexes...
    errors {
        "Invalid Command",
        "Bad Value",

        // Add more strings here...

        "Read the Manual"
    }

    splash:import { "splash_root.gif" } // This is a binary imported file

    pgpkey:bin { a1b2c3d4e5f67890 } // a binary value

    versionInfo { // a table
        major:int { 1 } // of integers
        minor:int { 4 }
        patch:int { 7 }
    }

    buttonSize:intvector { 10, 20, 10, 20 } // an array of 32-bit integers

    // will pick up data from zoneStrings resource in en bundle in the ICU package
    simpleAlias:alias { "/ICUDATA/en/zoneStrings" }

    // will pick up data from CollationElements resource in en bundle
    // in the ICU package
    CollationElements:alias { "/ICUDATA/en" }
}
```

The binary format is described in the
[uresdata.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/uresdata.h)
header file.

### Resources Syntax

The syntax of the resources that can be stored in resource bundles is specified
in the following table:

| Data Type       | Format                                                                       | Description                                                                                                                                                                                                                                                                       |
| --------------- | ---------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Tables          | `[name][:table] { subname1 { subresource1 } ... subnameN { subresourceN } }` | Tables are a complex resource that holds named resources. If it is a part of an array, it does not have a name. At this point, a resource bundle is a table. Access is allowed by key, index, and iteration.                                                                      |
| Arrays          | `[name][:array] {subresource1, ... subresourceN }`                           | Arrays are a complex resource that holds unnamed resources. If it is a part of an array, it does not have a name. Arrays require less memory than tables (since they don't store the name of sub-resources) but the index and iteration access are as fast as with tables.        |
| Strings         | `[name][:string] { ["]UnicodeText["] }`                                      | Strings are simple resources that hold a chunk of Unicode encoded data. If it is a part of an array, it does not have a name.                                                                                                                                                     |
| Binaries        | `name:bin { binarydata }` or `name:import { "fileNameToImport" }`            | Binaries are used for storing binary information (processed data, images, etc.). Information is stored on a byte level.                                                                                                                                                           |
| Integers        | `name:int { integervalue }`                                                  | Integers are used for storing a 32-bit integer value.                                                                                                                                                                                                                             |
| Integer Vectors | `name:intvector { integervalue1, ... integervalueN }`                        | Integer vectors are used for storing 32-bit integer values.                                                                                                                                                                                                                       |
| Aliases         | `name:alias { locale and path to aliased resource }`                         | Aliases point to other resources. They are useful for preventing duplication of data in resources that are not on the same branch of the fallback chain. An alias can also have an empty path. In that case the position of the alias resource is used to find the aliased resource. |

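As a rough illustration of how an alias value such as `/ICUDATA/en/zoneStrings`
breaks down into a package, a locale and an optional path, here is a minimal C++
sketch. `AliasTarget` and `parseAlias` are hypothetical names that are not part
of ICU, and the sketch assumes only the `/package/locale/path` form used in the
example bundle above; any other alias form is simply reported as invalid. With
this helper, `parseAlias("/ICUDATA/en")` yields an empty path, which corresponds
to the case where the position of the alias itself is used to find the aliased
resource.

```cpp
#include <string>

// Hypothetical helper type, not part of ICU: the pieces of an alias value.
struct AliasTarget {
    std::string package;  // e.g. "ICUDATA"
    std::string locale;   // e.g. "en"
    std::string path;     // e.g. "zoneStrings"; empty when the alias has no path
    bool valid = false;   // false if the text is not of the form handled here
};

// Splits an alias of the form "/package/locale[/path]".
// Assumption: only this form is handled; anything else is reported as invalid.
AliasTarget parseAlias(const std::string& alias) {
    AliasTarget result;
    if (alias.size() < 2 || alias[0] != '/') {
        return result;
    }
    const std::string::size_type packageEnd = alias.find('/', 1);
    if (packageEnd == std::string::npos || packageEnd == 1) {
        return result;  // no locale component, or empty package name
    }
    const std::string::size_type localeEnd = alias.find('/', packageEnd + 1);
    result.package = alias.substr(1, packageEnd - 1);
    if (localeEnd == std::string::npos) {
        result.locale = alias.substr(packageEnd + 1);  // alias without a path
    } else {
        result.locale = alias.substr(packageEnd + 1, localeEnd - packageEnd - 1);
        result.path = alias.substr(localeEnd + 1);
    }
    result.valid = !result.locale.empty();
    return result;
}
```
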
Although the type can be omitted for some resources for backward compatibility
reasons, you are strongly encouraged to always specify the type of each
resource. As the structure gets more complicated, some combinations of untyped
resources might produce unexpected results.

### Escape Sequences

String values can contain C/Java-style escape sequences like `\t`, `\r`, `\n`,
`\xhh`, `\uhhhh` and `\U00hhhhhh`, consistent with the `u_unescape()` C API; see the
[ustring.h](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ustring_8h.html)
API documentation.

A literal backslash (\\) in a string value must be doubled (\\\\) or escaped
with `\x5C` or `\u005C`.

A literal ASCII double quote (") in a double-quoted string must be escaped with
\\" or `\x22` or `\u0022`.

You should also escape carriage return (`\r`) and line feed (`\n`) as well as
control codes, non-characters, unassigned code points and other default-invisible
characters (see the Unicode [UAX #44](https://www.unicode.org/reports/tr44/)
`Default_Ignorable_Code_Point` property).

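The rules above can be sketched as a small helper that prepares a string for a
resource bundle source file. `toBundleStringLiteral` is a hypothetical function,
not an ICU API. It assumes the input is UTF-16 and, as a deliberately
conservative choice, writes every code unit outside printable ASCII as `\uhhhh`,
which is stricter than the rules above require.

```cpp
#include <string>

// Hypothetical helper, not part of ICU: turns UTF-16 text into a double-quoted
// resource bundle string literal that uses only printable ASCII.
std::string toBundleStringLiteral(const std::u16string& text) {
    static const char kHex[] = "0123456789ABCDEF";
    std::string out = "\"";
    for (char16_t unit : text) {
        switch (unit) {
            case u'\\': out += "\\\\"; break;  // literal backslash is doubled
            case u'"':  out += "\\\""; break;  // double quote inside the quotes
            case u'\t': out += "\\t";  break;
            case u'\r': out += "\\r";  break;
            case u'\n': out += "\\n";  break;
            default:
                if (unit >= 0x20 && unit < 0x7F) {
                    out += static_cast<char>(unit);  // printable ASCII as-is
                } else {
                    // Everything else becomes \uhhhh, one escape per code unit.
                    out += "\\u";
                    out += kHex[(unit >> 12) & 0xF];
                    out += kHex[(unit >> 8) & 0xF];
                    out += kHex[(unit >> 4) & 0xF];
                    out += kHex[unit & 0xF];
                }
        }
    }
    out += '"';
    return out;
}
```
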
### Examples

To write a resource bundle, start with a table named after your locale. The
contents of a table go between curly brackets:

```
root:table {
}
```

Then you can start adding resources to your bundle. Resources on the first level
must be named, and we suggest that you specify the type:

```
root:table {
  usage:string { "Usage: genrb [Options] files" }
  version:int { 122 }
  errorcodes:array {
    :string { "Invalid argument" }
    :string { "File not found" }
  }
}
```

The resource bundle format ignores indentation and line breaks. You can continue
one string over many lines, as long as each line break falls outside the quoted
string:

```
aVeryLongString:string {
  "This string is quite long "
  "and therefore should be "
  "broken into several lines."
}
```

For more syntax examples, take a look at our resource files for
[locales](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/locales)
and
[test data](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata),
especially the
[testtypes resource bundle](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata/testtypes.txt).

### Making Your Own Resource Bundles

To make your own resource bundle package, you need to perform several steps:

1.  Create your root resource bundle. This bundle should contain all the data
    for your program. You are probably best off if you fill it with data in your
    native language.

2.  Create a chain of empty resource bundles for your native language and
    region. For example, if your region is sr_CS, create all the entries in root
    in Serbian and leave the bundles for the sr and sr_CS locales empty. This
    way, users of your package will know whether you support a certain locale
    or not.

3.  If you already have some data to localize, create more bundles with
    localized data.

4.  Decide on the name of your package. You will use the package name to access
    your resources.

5.  Compile the resource bundles using the `genrb` tool. The command line format
    is `genrb [options] list-of-input-files`. `genrb` expects source files to be
    either in an invariant encoding with `\uXXXX` escapes, or in UTF-8/UTF-16
    with a BOM. If you need to use a different encoding, specify it using the
    `--encoding` option. You also need to specify the destination directory name
    for your resources using the `--destdir` option. This destination name needs
    to be the same as the package name. The full list of options can be
    retrieved by invoking `genrb --help`.

    You can also output Java class files. You will need to specify the
    `--write-java` option, followed by an optional encoding for the resulting
    `.java` file. The default encoding is ASCII + `\uXXXX`. You will also have to
    specify the resource bundle name using the `--bundle-name` argument.

    After using `genrb`, you will end up with one `localename.res` file per
    source file in the destination directory. For example, if you had
    `root.txt`, `en.txt`, `en_US.txt`, `es.txt` and you invoked `genrb` using
    the following command line:
    `genrb -d myapplication root.txt en.txt en_US.txt es.txt`, you will end up
    with `myapplication/root.res`, `myapplication/en.res`, etc. On some
    platforms, such as Windows, the separator is a backslash instead of a
    forward slash. These files are now ready to use and you can open them using
    `ures_open("myapplication", "en_US", err);`.

6.  However, you might want to have only one file containing all the data. In
    that case you need to use the package data tool. It can produce either a
    memory mapped file or a dynamically linked library. For more information on
    how to use the package data tool, see the appropriate [section](../icudata.md).

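To make the relationship between a package, a locale and the compiled files
concrete, the following C++ sketch lists the `.res` files that a request for a
locale could draw on, from the most specific bundle down to root.
`bundleFilesFor` is a hypothetical helper, not an ICU function. It assumes
simple locale IDs such as `en_US` or `sr_CS`, always uses a forward slash, and
models only truncation at underscores; ICU's real lookup may involve more steps.
For example, `bundleFilesFor("myapplication", "en_US")` gives
`myapplication/en_US.res`, `myapplication/en.res` and `myapplication/root.res`.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of ICU: lists the bundle files that a lookup
// for `locale` could draw on, most specific first, ending with root.
// Assumption: locale IDs are simple "language_REGION" style names; any other
// "_"-separated parts are truncated in the same way.
std::vector<std::string> bundleFilesFor(const std::string& package,
                                        const std::string& locale) {
    std::vector<std::string> files;
    std::string current = locale;
    while (!current.empty() && current != "root") {
        files.push_back(package + "/" + current + ".res");
        const std::string::size_type cut = current.rfind('_');
        if (cut == std::string::npos) {
            break;  // no parent left except root
        }
        current.erase(cut);  // "sr_CS" -> "sr"
    }
    files.push_back(package + "/root.res");
    return files;
}
```
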
Rolling out your own data takes some practice, especially if you want to package
it all together. You might want to take a look at how we package data. Good
places to start (besides ICU's own
[data](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/)) are the
[source/test/testdata/](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata/)
and
[source/samples/ufortune/resources/](https://github.com/unicode-org/icu/blob/main/icu4c/source/samples/ufortune/resources/)
directories.

Here is a sample Windows batch file that compiles and packages several
resources:

```bat
genrb -d myapplication root.txt en.txt en_GB.txt fr.txt es.txt es_ES.txt
echo root.res en.res en_GB.res fr.res es.res es_ES.res > packagelist.txt
mkdir tmpdir
pkgdata -p myapplication -T tmpdir -m common packagelist.txt
```

It is also possible to use the `icupkg` tool instead of `pkgdata` to generate .dat
data archives. The `icupkg` tool became available in ICU4C 3.6. If you need the
data in a shared or static library, you still need to use the `pkgdata` tool. For
easier maintenance, packaging, installation and application patching, it's
recommended that you use .dat data archives.

### Using XLIFF for Localization

ICU provides tools for converting resource bundles to and from the XLIFF
format. Files in XLIFF format can contain translations of resources. In that
case, more than one resulting resource bundle will be constructed.

To produce an XLIFF file from a resource bundle, use the `-x` option of the
`genrb` tool from ICU4C. Assume that we want to convert a simple resource bundle
to the XLIFF format:

```
root {
   usage           {"usage: ufortune [-v] [-l locale]"}
   optionMessage   {"unrecognized command line option:"}
}
```

To get an XLIFF file, we need to call `genrb` like this: `genrb -x -l en root.txt`.
Option `-x` tells `genrb` to produce an XLIFF file; option `-l` specifies the
language of the resource. If the language is not specified, `genrb` will try to
deduce the language from the resource name (en, zh, sh). If the resource name is
not an ISO language code (root), the default language for the platform will be
used. The language becomes the `source-language` attribute of the file, which
applies to all its translation units. The XLIFF file produced from the resource
above will be named `root.xlf` and will look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xliff version = "1.1" xmlns = 'urn:oasis:names:tc:xliff:document:1.1'
xmlns:xsi = 'http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation='urn:oasis:names:tc:xliff:document:1.1
http://www.oasis-open.org/committees/xliff/documents/xliff-core-1.1.xsd'>
    <file xml:space = "preserve" source-language = "en"
         datatype = "x-icu-resource-bundle" original = "root.txt"
         date = "2007-08-17T21:17:08Z">
        <header>
            <tool tool-id = "genrb-3.3-icu-3.8" tool-name = "genrb"/>
        </header>
        <body>
            <group id = "root" restype = "x-icu-table">
                <trans-unit id = "optionMessage" resname = "optionMessage">
                    <source>unrecognized command line option:</source>
                </trans-unit>
                <trans-unit id = "usage" resname = "usage">
                    <source>usage: ufortune [-v] [-l locale]</source>
                </trans-unit>
            </group>
        </body>
    </file>
</xliff>
```

This file can be sent to translators. Using translation tools that support
XLIFF, translators will produce one or more translations for this resource.
The processed file might look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xliff version = "1.1" xmlns='urn:oasis:names:tc:xliff:document:1.1'
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
    xsi:schemaLocation='urn:oasis:names:tc:xliff:document:1.1
http://www.oasis-open.org/committees/xliff/documents/xliff-core-1.1.xsd'>
    <file xml:space = "preserve" source-language = "en" target-language = "sh"
          datatype = "x-icu-resource-bundle" original = "root.txt"
          date = "2007-08-17T21:17:08Z">
        <header>
            <tool tool-id = "genrb-3.3-icu-3.8" tool-name = "genrb"/>
        </header>
        <body>
            <group id = "root" restype = "x-icu-table">
                <trans-unit id = "optionMessage" resname = "optionMessage">
                    <source>unrecognized command line option:</source>
                    <target>nepoznata opcija na komandnoj liniji:</target>
                </trans-unit>
                <trans-unit id = "usage" resname = "usage">
                    <source>usage: ufortune [-v] [-l locale]</source>
                    <target>upotreba: ufortune [-v] [-l lokal]</target>
                </trans-unit>
            </group>
        </body>
    </file>
</xliff>
```

To convert this file to a set of resource bundle files, we need to use
ICU4J's `com.ibm.icu.dev.tool.localeconverter.XLIFF2ICUConverter` class.

> :point_right: **Note**: The XLIFF2ICUConverter class relies on an XML parser
> being available. JDK 1.4 and newer provide an XML parser out of the box. For
> earlier versions, you will need to install Xerces.

The command line for running XLIFF2ICUConverter should specify the file that
needs to be converted, `sh.xlf` in this case. Optionally, you can specify input
and output directories as well as the package name. After running this tool, two
files will be produced: `en.txt` and `sh.txt`. This is what they look like:

```
// ***************************************************************************
// *
// * Tool: com.ibm.icu.dev.tool.localeconverter.XLIFF2ICUConverter.java
// * Date & Time: 08/17/2007 11:33:54 AM HST
// * Source File: C:\trunk\icuhtml\userguide\xliff\sh.xlf
// *
// ***************************************************************************
en:table{
    optionMessage:string{"unrecognized command line option:"}
    usage:string{"usage: ufortune [-v] [-l locale]"}
}
```

and

```
// ***************************************************************************
// *
// * Tool: com.ibm.icu.dev.tool.localeconverter.XLIFF2ICUConverter.java
// * Date & Time: 08/17/2007 11:33:54 AM HST
// * Source File: C:\trunk\icuhtml\userguide\xliff\sh.xlf
// *
// ***************************************************************************
sh:table{
    optionMessage:string{"nepoznata opcija na komandnoj liniji:"}
    usage:string{"upotreba: ufortune [-v] [-l lokal]"}
}
```

These files can then be used like any other resource bundle files.

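The mapping from translation units to bundle entries can be sketched in a few
lines of C++. This is not how XLIFF2ICUConverter is implemented; `TransUnit` and
`toBundleSource` are hypothetical names, the XLIFF file is assumed to be already
parsed, and every unit is assumed to be a plain string that needs no escaping.
Passing `useTarget = false` produces the source-language bundle and
`useTarget = true` the translated one.

```cpp
#include <string>
#include <vector>

// Hypothetical data type: one <trans-unit> from an already parsed XLIFF file.
struct TransUnit {
    std::string resname;  // value of the resname attribute
    std::string source;   // text of <source>
    std::string target;   // text of <target>
};

// Builds resource bundle source text for one locale.
// Assumption: all units are plain strings that need no escaping.
std::string toBundleSource(const std::string& locale,
                           const std::vector<TransUnit>& units,
                           bool useTarget) {
    std::string out = locale + ":table{\n";
    for (const TransUnit& unit : units) {
        const std::string& text = useTarget ? unit.target : unit.source;
        out += "    " + unit.resname + ":string{\"" + text + "\"}\n";
    }
    out += "}\n";
    return out;
}
```
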