---
layout: default
title: Resources
nav_order: 2
parent: Locales and Resources
---
<!--
© 2020 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Resource Management
{: .no_toc }

## Contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

> :point_right: **Note**: This page describes the use of ICU4C Resource
> Management techniques and APIs. For an overview of the message localization
> process using ICU, see the related page [Localizing with ICU](localizing.md).

A software product that needs to be localized wins or loses depending on how
easy it is to change the data or "resources" which affect users. From the simplest
point of view, that data is the information presented to the user (such as a
translated message) as well as the region-specific ways of doing things (such as
sorting). The process of localization will eventually involve translators and it
would be very convenient if the process of localizing could be done only by
translators and experts in the target culture. There are several points to keep
in mind when designing such a localizable software product.

### Keeping Data Separate

Obviously, one does not want to make translators wade through the source code
and make changes there. That would be a recipe for disaster. Instead, the
translatable data should be kept separately, in a format that allows translators
easy access. A separate resource managing mechanism is hence required.
Applications access data through API calls, which pick the appropriate entries
from the resources. Resources are kept in a human readable/editable format with
optional tools for content editing.

The data should contain all the elements to be localized, including, but not
limited to, GUI messages, icons, formatting patterns, and collation rules.
A convenient way for keeping binary data should also be provided - often icons for
different cultures should be different.

### Keeping Data Small

It is not unlikely that the data will be the same for several regions - take for
example Spanish speaking countries - names of the days and months will be the
same in both Mexico and Spain. It would be very beneficial if we can prevent the
duplication of data. This can be achieved by structuring resources in such a way
that an unsuccessful query into a more specific resource triggers the same
query in a more general resource. A convenient way to do this is to use a
tree-like structure.

Another way to reduce the data size is to allow linking of resources that are
the same for regions that are not in a general-specific relation to each other.

### Find the Best Available Data

Sometimes, the exact data for a region is still not available. However, if the
data is structured correctly, the user can be presented with similar data. For
example, a Spanish speaking user in Mexico would probably be happier with
Spanish than with English captions, even if some of the details for Mexico are
not there.

If the data is grouped correctly, the program can automatically find the most
suitable data for the situation.

The previous points all lead to a separate mechanism that stores data separately
from the code. Software is able to access the data through the API calls. Data
is structured in a tree-like structure, with the most general region in the root
(most commonly, the root region is the native language of the development team).
Branches lead to more specialized regions, usually through languages, countries
and country regions. Data that is already the same on the more general level is
not repeated.

> :point_right: **Note**: The path through languages, countries and country
> region could be different.
> One may decide to go through countries and then
> through languages spoken in the particular country. In either case, some data
> must be duplicated - if you go through languages, the currency data for
> different speaking parts of the same country will be duplicated (consider
> French and English languages in Canada) - on the other side, when you go
> through countries, you will need to duplicate day names and similar
> information.

Here is an example of such a resource tree structure:

```
              root                              Root
                |
    +-------+---+---+----+----+
    |       |       |    |    |
    en      de      ja   ru   zh                Language
    |       |       |    |    |
+---+   +---+       |    |    +------+
|   |   |   |       |    |    |      |
|   |   |   |       |    |    Hans   Hant       Script
|   |   |   |       |    |    |      |
|   |   |   |       |    |    |      +----+
|   |   |   |       |    |    |      |    |
US  IE  DE  AT      JP   RU   CN     HK   TW    Country or Region
|
POSIX                                           Variant
```

Let us assume that the root resource contains data written by the original
implementors and that this data is in English and conforms to the conventions
used in the United States. Therefore, resources for English and English in
United States would be empty and would take their data from the root resource. If
a version for Ireland is required, appropriate overriding changes can be made to
the data for English in Ireland. Special variant information could be put into
en_US_POSIX if specific legacy formatting were required, or specific sub-region
information were required. When making the version for the German speaking
region, all the German data would be in that resource, with the differences in
the Germany and Austria resources.

It is important to note that some locales have the optional script tag. This is
important for multi-script locales, like Uzbek, Azerbaijani, Serbian or Chinese.
Even though Chinese uses Han characters, the characters are usually identified
as either traditional Chinese (Hant) characters, or simplified Chinese (Hans).
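
As an illustration of how such a tree can be walked from the most specific
region toward the root, here is a minimal C sketch. It assumes that locale
names are plain identifiers whose parts are separated by underscores and that
the parent is found by dropping the last part, which is a simplification. The
function names `parent_locale` and `fallback_chain` are hypothetical and exist
only for this example; they are not part of any library.

```c
#include <stdio.h>
#include <string.h>

/*
 * Rewrites `locale` in place to its parent in a truncation-based chain,
 * for example "zh_Hans_CN" -> "zh_Hans" -> "zh" -> "root".
 * Returns 1 if a parent was produced, 0 if `locale` is already the root.
 * Assumption: `locale` is a writable buffer with room for at least 5 bytes.
 */
int parent_locale(char *locale)
{
    char *sep;

    if (locale[0] == '\0' || strcmp(locale, "root") == 0) {
        return 0;
    }
    sep = strrchr(locale, '_');
    if (sep != NULL) {
        *sep = '\0';            /* drop the most specific part */
    } else {
        strcpy(locale, "root"); /* a bare language falls back to the root */
    }
    return 1;
}

/*
 * Writes the whole chain for `requested` into `out`, separated by " -> ".
 * Assumption: `outSize` is greater than zero; long chains are truncated.
 */
void fallback_chain(const char *requested, char *out, size_t outSize)
{
    char current[64];
    size_t used;

    strncpy(current, requested, sizeof(current) - 1);
    current[sizeof(current) - 1] = '\0';
    snprintf(out, outSize, "%s", current);
    while (parent_locale(current)) {
        used = strlen(out);
        snprintf(out + used, outSize - used, " -> %s", current);
    }
}
```

Calling `fallback_chain("en_US_POSIX", buf, sizeof buf)` with this sketch fills
`buf` with `en_US_POSIX -> en_US -> en -> root`, which matches the path from
the variant up to the root in the tree above.
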

Even if all the data that would go to a certain resource comes from the more
general resources, it should be made clear that the particular region is
supported by the application. This can be done by having completely empty resources.

## The ICU Model

ICU bases its resource management model on the ideas presented above. All the
resource APIs are concentrated in the resource bundle framework. This framework
is closely tied in its functioning to the ICU [Locale](index.md) naming scheme.

ICU provides and relies on a set of locale specific data in the resource bundle
format. If we think that we have correct data for a requested locale, even if
all its data comes from more general locales, we will provide an empty
resource bundle. This is reflected in our return informational codes (see the
section on APIs). A lot of ICU frameworks (collation, formatting etc.) rely on
the data stored in resource bundles.

Resource bundles rely on the ICU data framework. For more information on the
functioning of ICU data, see the appropriate [section](../icudata.md).

Users of the ICU library can also use the resource bundle framework to store and
retrieve localizable data in their projects.

Resource bundles are collections of resources. Individual resources can contain
data or other resources.

> :point_right: **Note**: ICU4J relies on the resource bundle mechanism already
> provided by JDK for its functioning. Therefore, most of the discussion here
> pertains only to ICU4C.

### Fallback Mechanism

An essential part of ICU's resource management framework is the fallback
mechanism. It ensures that if the data for the requested locale is missing, an
effort will be made to obtain the most usable data. Fallback can happen in two
situations:

1.  When a resource bundle for a locale is requested. If it doesn't exist, a
    more general resource bundle will be used.
    If there are no such resource
    bundles, a resource bundle for the default locale will be used. If this fails,
    the root resource bundle will be used. When using ICU locale data, not
    finding the requested resource bundle means that we don't know what the data
    should be for that particular locale, so you might want to consider this
    situation an error. Custom packages of resource bundles may or may not
    adhere to this contract. Special care should be taken in remote server
    situations, when the data from the default locale might not mean anything to
    the remote user (imagine a situation where a server in Japan responds to a
    Spanish speaking client by using default Japanese data).

2.  When a resource inside a resource bundle is requested. If the resource is
    not present, it will be looked for in more general resources. If at
    initial opening of a resource bundle we went through the default locale, the
    search for a resource will also go through it. For example, if a resource
    bundle for zh_Hans_CN is opened, a missing resource will be looked for in
    zh_Hans, zh and finally root. This is usually harmless, except when a
    resource is only located in the default locale or in the root resource
    bundle.

### Data Packaging

ICU allows and requires that the application specific data be stored apart from
the ICU internal data (locale, converter, transformation data etc.). Application
data should be stored in packages. ICU uses the default package (NULL) for its
data. All of ICU's build tools provide means to specify the package for your
data. More about how to package application data can be found below.

## Resource Bundle APIs

ICU4C provides both C and C++ APIs for using resource bundles. The core
implementation is in C, while the C++ APIs are only a thin wrapper around it.
Therefore, the code using C APIs will generally be faster.

Resource bundles use ICU's "open use close" paradigm. In C all the resource
bundle operations are done using the `UResourceBundle*` handle. `UResourceBundle*`
allows access to both resource bundles and individual resources. In C++, class
`ResourceBundle` should be used for both resource bundles and individual
resources.

To use the resource bundle framework, you need to include the appropriate header
file, `unicode/ures.h` for C and `unicode/resbund.h` for C++.

### Error Checking

If an operation on a resource bundle fails, an error code will be set. It is
important to check the value of the error code. In C you should frequently
use the following construct:

```c
if (U_SUCCESS(status)) {
    /* everything is fine */
} else {
    /* there was an error */
}
```

### Opening of Resource Bundles

The most common C resource bundle opening API is:

```c
UResourceBundle* ures_open(const char* package, const char* locale, UErrorCode* status)
```

The first argument specifies the package name or `NULL` for the default ICU package.
The second argument is the locale for which you want the resource bundle.
Special values for the locale are `NULL` for the default locale and `""` (empty
string) for the root locale. The third argument should be set to `U_ZERO_ERROR`
before calling the function. It will return the status of the operation. Apart from
returning regular errors, it can return two informational/warning codes:
`U_USING_FALLBACK_WARNING` and `U_USING_DEFAULT_WARNING`. The first informational
code means that the requested resource bundle was not found and that a more
general bundle was returned. If you are opening ICU resource bundles, do note
that this means that we do not guarantee that the contents of the opened resource
bundle will be correct for the requested locale. The situation might be
different for application packages.
However, the warning `U_USING_DEFAULT_WARNING`
means that there were no more general resource bundles found and that you were
returned either a resource bundle that is the default for the system, or the root
resource bundle. This will almost certainly contain wrong data.

There are a couple of other opening APIs: `ures_openDirect` takes the same
arguments as `ures_open` but will fail if the requested locale is not found.
Also, if opening is successful, no fallback will be performed if an individual
resource is not found. The second one, `ures_openU`, takes a `UChar*` for the package
name instead of `char*`.

In C++, opening is done through a constructor. There are several constructors.
The most notable difference from the C APIs is that the package should be given as a
`UnicodeString` and the locale is passed as a `Locale` object. There is also a copy
constructor and a constructor that takes a C `UResourceBundle*` handle. The
result is a `ResourceBundle` object. Remarks about informational codes are also
valid for the C++ APIs.

> :point_right: **Note**: All the data accessing examples in the following
> sections use ICU's
> [root](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/locales/root.txt)
> resource bundle.

```c
UErrorCode status = U_ZERO_ERROR;
UResourceBundle* icuRoot = ures_open(NULL, "root", &status);
if (U_SUCCESS(status)) {
    /* everything is fine */
    /* ... */
    /* do some interesting stuff here - see below */
    /* ... */
    /* and close the bundle afterwards */
    ures_close(icuRoot); /* discussed later */
} else {
    /* there was an error */
    /* report and exit */
}
```

In C++, opening would look like this:

```c++
UErrorCode status = U_ZERO_ERROR;
// we rely on automatic construction of Locale object from a char*
ResourceBundle myResource("myPackage", "de_AT", status);
if (U_SUCCESS(status)) {
    /* everything is fine */
    /* ... */
    /* do some interesting stuff here */
    /* ... */
    /* the bundle will be closed when going out of scope */
} else {
    /* there was an error */
    /* report and exit */
}
```

### Closing of Resource Bundles

After use, resource bundles need to be closed to prevent memory leaks. In C,
you should call the `void ures_close(UResourceBundle* resB)` API. In C++, if you
have just used the `ResourceBundle` objects, going out of scope will close the
bundles. When using allocated objects, make sure that you call the appropriate
delete function.

As already mentioned, resource bundles and resources share the same type. You
can close bundles and resources in any order you like. You can invoke `ures_close`
on `NULL` resource bundles. Therefore, you can always call this API regardless of the
success of previous operations.

### Accessing Resources

Once you are in the possession of a valid resource bundle, you can access the
resources and data that it holds. The result of accessing operations will be a
new resource bundle object. In C, `UResourceBundle*` handles can be reused by
using the fill-in parameter. That saves you from frequent closing and
reallocating of resource bundle structures, which can dramatically improve
performance. C++ APIs do not provide means for object reuse. All the C examples
in the following sections will use a fill-in parameter.

#### Types of Resources

Resource bundles can contain two main types of resources: complex and simple
resources. Complex resources store other resources and can have named or unnamed
elements. **Tables** store named elements, while **arrays** store unnamed ones.
Simple resources contain data which can be **string**, **binary**, **integer
array** or a single **integer**.

There are several ways of accessing data stored in the complex resources.
Tables can be accessed using keys, indexes and by iteration. Arrays can be
accessed using indexes and by iteration.

In order to be able to distinguish between resources, one needs to know the type
of the resource at hand. To find this out, use the
`UResType ures_getType(UResourceBundle* resourceBundle)` API, or the C++ analog
`UResType getType(void)`. The `UResType` is an enumeration defined in the
[unicode/ures.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/ures.h)
header file.

> :point_right: **Note**: Indexes of resources in tables do not necessarily
> correspond to the order of items in a table. Due to the way the binary structure is
> organized, items in a table are sorted according to the binary ordering of the
> keys, therefore, the index of an item in a table will be the index of its key
> string in the binary order. Furthermore, the ordering of the keys is different
> on ASCII and EBCDIC platforms.
> <br>
> Starting with ICU 4.4, the order of table items is the ASCII string order on
> all platforms.
> <br>
> The iteration order of table items might change from release to release.

#### Accessing by Key

To access resources using a key, you can use the `UResourceBundle*
ures_getByKey(const UResourceBundle* resourceBundle, const char* key,
UResourceBundle* fillIn, UErrorCode* status)` API.
The first argument is the parent
resource bundle, which can be either a resource bundle opened using `ures_open` or
similar APIs or a table resource. The key is always specified using invariant
characters. The fill-in parameter can be either `NULL` or a valid resource bundle
handle. If it is `NULL`, a new resource bundle will be constructed. If you pass an
already existing resource bundle, it will be closed and the memory will be
reused for the new resource bundle. The status indicator can return
`U_MISSING_RESOURCE_ERROR` which indicates that no resources with that key exist,
or one of the above mentioned informational codes (`U_USING_FALLBACK_WARNING` and
`U_USING_DEFAULT_WARNING`) which do not affect the validity of data in the case of
resource retrieval.

```c
/* ... */
/* we already got the icuRoot bundle from the opening example */
UResourceBundle *zones = ures_getByKey(icuRoot, "zoneStrings", NULL, &status);
if (U_SUCCESS(status)) {
    /* ... do interesting stuff - see below ... */
}
ures_close(zones);
/* clean up the rest */
/* ... */
```

In C++, the analogous API is `ResourceBundle get(const char* key, UErrorCode& status) const`.

Trying to retrieve resources by key on any other type of resource than tables
will produce a `U_RESOURCE_TYPE_MISMATCH` error.

#### Accessing by Index

Accessing by index requires you to supply an index of the resource that you want
to retrieve. The appropriate API is `UResourceBundle* ures_getByIndex(const
UResourceBundle* resourceBundle, int32_t indexR, UResourceBundle* fillIn,
UErrorCode* status)`. The arguments have the same semantics as for the
`ures_getByKey` API. The only difference is the second argument, which is the
index of the resource that you want to retrieve. Indexes start at zero. If an
index out of range is specified, `U_MISSING_RESOURCE_ERROR` is returned.
To find
the size of a resource, you can use `int32_t ures_getSize(UResourceBundle*
resourceBundle)`. The maximum index is the result of this API minus 1.

```c
/* ... */
/* we already got zones resource from the accessing by key example */
UResourceBundle *currentZone = NULL;
int32_t index = 0;
for (index = 0; index < ures_getSize(zones); index++) {
    currentZone = ures_getByIndex(zones, index, currentZone, &status);
    if (U_FAILURE(status)) {
        break; /* stop at the first error */
    }
    /* ... do interesting stuff here ... */
}
ures_close(currentZone);
/* cleanup the rest */
/* ... */
```

Accessing a simple resource with index 0 returns the resource itself. This is
useful for iterating over all the resources regardless of type.

C++ overloads the get API with `ResourceBundle get(int32_t index, UErrorCode& status) const`.

#### Iterating Over Resources

If you don't care about the order of the resources and want simple code, you can
use the iteration mechanism. To set up iteration over a complex resource, you
can simply start iterating using the `UResourceBundle*
ures_getNextResource(UResourceBundle* resourceBundle, UResourceBundle* fillIn,
UErrorCode* status)`. It is advisable though to reset the iterator for a
resource before starting, in order to ensure that the iteration will indeed
start from the beginning - just in case somebody else has already been playing
with this resource. To reset the iterator use the `void
ures_resetIterator(UResourceBundle* resourceBundle)` API. To check whether there
are more resources, call `UBool ures_hasNext(UResourceBundle* resourceBundle)`.
If you have iterated through the whole resource, `NULL` will be returned.

```c
/* ... */
/* we already got zones resource from the accessing by key example */
UResourceBundle *currentZone = NULL;
ures_resetIterator(zones);
while (ures_hasNext(zones)) {
    currentZone = ures_getNextResource(zones, currentZone, &status);
    if (U_FAILURE(status)) {
        break; /* stop at the first error */
    }
    /* ... do interesting stuff here ... */
}
ures_close(currentZone);
/* cleanup the rest */
/* ... */
```

C++ provides analogous APIs: `ResourceBundle getNext(UErrorCode& status)`,
`void resetIterator(void)` and `UBool hasNext(void)`.

#### Accessing Data in the Simple Resources

In order to get to the data in the simple resources, you need to use the appropriate
APIs according to the type of a simple resource. They are summarized in the
tables below. All the pointers returned should be considered pointers to
read-only data. Using an API on a resource of the wrong type will result in an error.

Strings:

| Language | API |
| -------- | --- |
| C | `const UChar* ures_getString(const UResourceBundle* resourceBundle, int32_t* len, UErrorCode* status)` |
| C++ | `UnicodeString getString(UErrorCode& status) const` |

Example:

```c
/* ... */
UResourceBundle* version = ures_getByKey(icuRoot, "Version", NULL, &status);
if (U_SUCCESS(status)) {
    int32_t versionStringLen = 0;
    const UChar* versionString = ures_getString(version, &versionStringLen, &status);
}
ures_close(version);
/* ... */
```

Binaries:

| Language | API |
| -------- | --- |
| C | `const uint8_t* ures_getBinary(const UResourceBundle* resourceBundle, int32_t* len, UErrorCode* status)` |
| C++ | `const uint8_t* getBinary(int32_t& len, UErrorCode& status) const` |

Integers, signed and unsigned:

| Language | API |
| -------- | --- |
| C | `int32_t ures_getInt(const UResourceBundle* resourceBundle, UErrorCode* status)` <br> `uint32_t ures_getUInt(const UResourceBundle* resourceBundle, UErrorCode* status)` |
| C++ | `int32_t getInt(UErrorCode& status) const` <br> `uint32_t getUInt(UErrorCode& status) const` |

Integer Arrays:

| Language | API |
| -------- | --- |
| C | `const int32_t* ures_getIntVector(const UResourceBundle* resourceBundle, int32_t* len, UErrorCode* status)` |
| C++ | `const int32_t* getIntVector(int32_t& len, UErrorCode& status) const` |

#### Convenience APIs

Since the vast majority of data stored in resource bundles are strings, ICU's
resource bundle framework provides a number of different convenience APIs that
directly access strings stored in resources. They are analogous to APIs already
discussed, with the difference that they return `const UChar*` or `UnicodeString`
objects.

> :point_right: **Note**: The C APIs that allow returning of `UnicodeString` objects only
> work if used in a C++ file. Trying to use them in a C file will produce a
> compiler error.

APIs that allow retrieving strings by specifying a key:

| Language (Return Type) | API |
| ---------------------- | --- |
| C (UChar*) | `const UChar* ures_getStringByKey(const UResourceBundle* resB, const char* key, int32_t* len, UErrorCode* status)` |
| C (UnicodeString) | `UnicodeString ures_getUnicodeStringByKey(const UResourceBundle* resB, const char* key, UErrorCode* status)` |
| C++ | `UnicodeString getStringEx(const char* key, UErrorCode& status) const` |

APIs that allow retrieving strings by specifying an index:

| Language (Return Type) | API |
| ---------------------- | --- |
| C (UChar*) | `const UChar* ures_getStringByIndex(const UResourceBundle* resB, int32_t indexS, int32_t* len, UErrorCode* status)` |
| C (UnicodeString) | `UnicodeString ures_getUnicodeStringByIndex(const UResourceBundle* resB, int32_t indexS, UErrorCode* status)` |
| C++ | `UnicodeString getStringEx(int32_t index, UErrorCode& status) const` |

APIs for retrieving strings through iteration:

| Language (Return Type) | API |
| ---------------------- | --- |
| C (UChar*) | `const UChar* ures_getNextString(UResourceBundle* resourceBundle, int32_t* len, const char** key, UErrorCode* status)` |
| C (UnicodeString) | `UnicodeString ures_getNextUnicodeString(UResourceBundle* resB, const char** key, UErrorCode* status)` |
| C++ | `UnicodeString getNextString(UErrorCode& status)` |

#### Other APIs

Resource bundle framework provides a number of additional APIs that allow you to
get more information on the resources you are using.
They are summarized in the
following tables.

| Language | API |
| -------- | --- |
| C | `int32_t ures_getSize(UResourceBundle* resourceBundle)` |
| C++ | `int32_t getSize(void) const` |

Gets the number of items in a resource. Simple resources always return size 1.

| Language | API |
| -------- | --- |
| C | `UResType ures_getType(UResourceBundle* resourceBundle)` |
| C++ | `UResType getType(void)` |

Gets the type of the resource. For a list of resource types, see:
[unicode/ures.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/ures.h)

| Language | API |
| -------- | --- |
| C | `const char* ures_getKey(UResourceBundle* resB)` |
| C++ | `const char* getKey(void)` |

Gets the key of a named resource or `NULL` if this resource is a member of an
array.

| Language | API |
| -------- | --- |
| C | `void ures_getVersion(const UResourceBundle* resB, UVersionInfo versionInfo)` |
| C++ | `void getVersion(UVersionInfo versionInfo) const` |

Fills out the version structure for this resource.

| Language | API |
| -------- | --- |
| C | `const char* ures_getLocale(const UResourceBundle* resourceBundle, UErrorCode* status)` |
| C++ | `const Locale& getLocale(void) const` |

Returns the locale this resource is from. This API is going to change, so stay
tuned.

### Format of Resource Bundles

Resource bundles are written in their source format. Before using them, they must
be compiled to the binary format using the `genrb` utility. The currently supported
source format is a text file.
The format is defined in a [formal definition
file](https://github.com/unicode-org/icu-docs/blob/main/design/bnf_rb.txt).

This is an example of a resource bundle source file:

```
// Comments start with a '//' and extend to the end of the line
// first, a locale name for the bundle is defined. The whole bundle is a table
// every resource, including the whole bundle has its name.
// The name consists of invariant characters, digits and following symbols: -, _.
root {
    menu {
        id { "mainmenu" }
        items {
            {
                id { "file" }
                name { "&File" }
                items {
                    {
                        id { "open" }
                        name { "&Open" }
                    }
                    {
                        id { "save" }
                        name { "&Save" }
                    }
                    {
                        id { "exit" }
                        name { "&Exit" }
                    }
                }
            }

            {
                id { "edit" }
                name { "&Edit" }
                items {
                    {
                        id { "copy" }
                        name { "&Copy" }
                    }
                    {
                        id { "cut" }
                        name { "&Cut" }
                    }
                    {
                        id { "paste" }
                        name { "&Paste" }
                    }
                }
            }

            ...
        }
    }

    // This resource is an array, thus accessible only through iteration and indexes...
    errors {
        "Invalid Command",
        "Bad Value",

        // Add more strings here...

        "Read the Manual"
    }

    splash:import { "splash_root.gif" } // This is a binary imported file

    pgpkey:bin { a1b2c3d4e5f67890 } // a binary value

    versionInfo { // a table
        major:int { 1 } // of integers
        minor:int { 4 }
        patch:int { 7 }
    }

    buttonSize:intvector { 10, 20, 10, 20 } // an array of 32-bit integers

    // will pick up data from zoneStrings resource in en bundle in the ICU package
    simpleAlias:alias { "/ICUDATA/en/zoneStrings" }

    // will pick up data from CollationElements resource in en bundle
    // in the ICU package
    CollationElements:alias { "/ICUDATA/en" }
}
```

Binary format is described in the
[uresdata.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/uresdata.h)
header file.

### Resources Syntax

Syntax of the resources that can be stored in resource bundles is specified in
the following table:

| Data Type | Format | Description |
| --------- | ------ | ----------- |
| Tables | `[name][:table] { subname1 { subresource1 } ... subnameN { subresourceN } }` | Tables are a complex resource that holds named resources. If it is a part of an array, it does not have a name. At this point, a resource bundle is a table. Access is allowed by key, index, and iteration. |
| Arrays | `[name][:array] {subresource1, ... subresourceN }` | Arrays are a complex resource that holds unnamed resources. If it is a part of an array, it does not have a name. Arrays require less memory than tables (since they don't store the name of sub-resources) but the index and iteration access are as fast as with tables. |
| Strings | `[name][:string] { ["]UnicodeText["] }` | Strings are simple resources that hold a chunk of Unicode encoded data. If it is a part of an array, it does not have a name. |
| Binaries | `name:bin { binarydata }` <br> `name:import{ "fileNameToImport" }` | Binaries are used for storing binary information (processed data, images etc). Information is stored on a byte level. |
| Integers | `name:int { integervalue }` | Integers are used for storing a 32 bit integer value. |
| Integer Vectors | `name:intvector { integervalue, ... integervalueN }` | Integer vectors are used for storing 32 bit integer values. |
| Aliases | `name:alias { locale and path to aliased resource }` | Aliases point to other resources. They are useful for preventing duplication of data in resources that are not on the same branch of the fallback chain. Alias can also have an empty path. In that case the position of the alias resource is used to find the aliased resource. |

Although specifying type for some resources can be omitted for backward
compatibility reasons, you are strongly encouraged to always specify the type of
the resources. As structure gets more complicated, some combinations of
resources that are not typed might produce unexpected results.

### Escape Sequences

String values can contain C/Java-style escape sequences like `\t`, `\r`, `\n`,
`\xhh`, `\uhhhh` and `\U00hhhhhh`, consistent with the `u_unescape()` C API, see the
[ustring.h](https://unicode-org.github.io/icu-docs/apidoc/released/icu4c/ustring_8h.html)
API documentation.

A literal backslash (\\) in a string value must be doubled (\\\\) or escaped
with `\x5C` or `\u005C`.

A literal ASCII double quote (") in a double-quoted string must be escaped with
\\" or `\x22` or `\u0022`.
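
When string values are generated by a program rather than typed by hand, the
escaping rules above can be applied mechanically. The following C sketch shows
one way to do this. The helper name `escape_rb_string` is hypothetical and is
not part of ICU or `genrb`. It is an assumption of this example that only the
backslash, the double quote, `\t`, `\r` and `\n` need handling; every other
byte, including UTF-8 sequences, is copied unchanged.

```c
#include <stddef.h>

/*
 * Escapes `src` so that it can be written between double quotes in a
 * resource bundle source file.
 * Returns the length of the escaped text (without the terminating NUL),
 * or -1 if `dst` is too small to hold the result.
 */
int escape_rb_string(const char *src, char *dst, size_t dstSize)
{
    size_t n = 0;

    for (; *src != '\0'; ++src) {
        const char *rep = NULL;

        switch (*src) {
        case '\\': rep = "\\\\"; break; /* backslash is doubled */
        case '"':  rep = "\\\""; break; /* double quote gets a backslash */
        case '\n': rep = "\\n";  break;
        case '\r': rep = "\\r";  break;
        case '\t': rep = "\\t";  break;
        default:   break;
        }
        if (rep != NULL) {
            if (n + 2 >= dstSize) {
                return -1;      /* no room for two bytes plus the NUL */
            }
            dst[n++] = rep[0];
            dst[n++] = rep[1];
        } else {
            if (n + 1 >= dstSize) {
                return -1;      /* no room for one byte plus the NUL */
            }
            dst[n++] = *src;
        }
    }
    if (n >= dstSize) {
        return -1;              /* only reachable when dstSize is 0 */
    }
    dst[n] = '\0';
    return (int)n;
}
```

For example, passing the seven-character Windows path `C:\temp` to this helper
yields the eight-character text `C:\\temp`, which can then be placed inside
the quotes of a `:string` resource.
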

You should also escape carriage return (`\r`) and line feed (`\n`) as well as
control codes, non-characters, unassigned code points and other default-invisible
characters (see the Unicode [UAX #44](https://www.unicode.org/reports/tr44/)
`Default_Ignorable_Code_Point` property).

### Examples

The way to write your resource is to start with a table that has your locale
name. The contents of a table are between the curly brackets:

```
root:table {
}
```

Then you can start adding resources to your bundle. Resources on the first level
must be named and we suggest that you specify the type:

```
root:table {
    usage:string { "Usage: genrb [Options] files" }
    version:int { 122 }
    errorcodes:array {
        :string { "Invalid argument" }
        :string { "File not found" }
    }
}
```

The resource bundle format doesn't care about indentation and line breaks. You
can continue one string over many lines, as long as the line breaks are
outside of the quoted string segments:

```
aVeryLongString:string {
    "This string is quite long "
    "and therefore should be "
    "broken into several lines."
}
```

For more examples of the syntax, take a look at our resource files for
[locales](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/locales)
and
[test data](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata),
especially at the
[testtypes resource bundle](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata/testtypes.txt).

### Making Your Own Resource Bundles

In order to make your own resource bundle package, you need to perform several
steps:

1.  Create your root resource bundle. This bundle should contain all the data
    for your program. You are probably best off if you fill it with data in your
    native language.

2.  Create a chain of empty resource bundles for your native language and
    region. For example, if your region is sr_CS, create all the entries in root
    in Serbian and leave the bundles for the sr and sr_CS locales empty. This way, users
    of your package will know whether you support a certain locale or not.

3.  If you already have some data to localize, create more bundles with
    localized data.

4.  Decide on the name of your package. You will use the package name to access
    your resources.

5.  Compile the resource bundles using the `genrb` tool. The command line format
    is `genrb [options] list-of-input-files`. `genrb` expects source files to be
    either in an invariant encoding with `\uXXXX` escapes, or in UTF-8/UTF-16 with a BOM.
    If you need to use a different encoding, specify it using the `--encoding`
    option. You also need to specify the destination directory name for your
    resources using the `--destdir` option. This destination name needs to be the
    same as the package name. The full list of options can be retrieved by invoking
    `genrb --help`.

    You can also output Java class files. You will need to specify the
    `--write-java` option, followed by an optional encoding for the resulting
    `.java` file. The default encoding is ASCII + `\uXXXX`. You will also have to
    specify the resource bundle name using the `--bundle-name` argument.

    After using `genrb`, you will end up with one `localename.res` file per
    source file in the destination directory. For example, if you had `root.txt`, `en.txt`,
    `en_US.txt`, `es.txt` and you invoked `genrb` using the following command line:
    `genrb -d myapplication root.txt en.txt en_US.txt es.txt`, you will end up
    with `myapplication/root.res`, `myapplication/en.res`, etc. The forward slash can
    be a backslash on some platforms, like Windows. These files are now ready
    to use and you can open them using `ures_open("myapplication", "en_US", err);`.

6.  However, you might want to have only one file containing all the data. In
    that case you need to use the package data tool. It can produce either a
    memory mapped file or a dynamically linked library. For more information on
    how to use the package data tool, see the appropriate [section](../icudata.md).

Rolling out your own data takes some practice, especially if you want to package
it all together. You might want to take a look at how we package data. Good
places to start (besides, of course, ICU's own
[data](https://github.com/unicode-org/icu/blob/main/icu4c/source/data/)) are the
[source/test/testdata/](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/testdata/)
and
[source/samples/ufortune/resources/](https://github.com/unicode-org/icu/blob/main/icu4c/source/samples/ufortune/resources/)
directories.

Also, here is a sample Windows batch file that compiles and packages
several resources:

```bat
genrb -d myapplication root.txt en.txt en_GB.txt fr.txt es.txt es_ES.txt
echo root.res en.res en_GB.res fr.res es.res es_ES.res > packagelist.txt
mkdir tmpdir
pkgdata -p myapplication -T tmpdir -m common packagelist.txt
```

It is also possible to use the `icupkg` tool instead of `pkgdata` to generate .dat
data archives. The `icupkg` tool became available in ICU4C 3.6. If you need the
data in a shared or static library, you still need to use the `pkgdata` tool. For
easier maintenance, packaging, installation and application patching, it's
recommended that you use .dat data archives.

### Using XLIFF for Localization

ICU provides tools for converting resource bundles to and from the XLIFF
format. Files in XLIFF format can contain translations of resources. In that
case, more than one resulting resource bundle will be constructed.

To produce an XLIFF file from a resource bundle, use the `-x` option of the `genrb` tool
from ICU4C.
Assume that we want to convert a simple resource bundle to the XLIFF
format:

```
root {
    usage {"usage: ufortune [-v] [-l locale]"}
    optionMessage {"unrecognized command line option:"}
}
```

To get an XLIFF file, we need to call `genrb` like this: `genrb -x -l en root.txt`.
Option `-x` tells `genrb` to produce an XLIFF file; option `-l` specifies the language of
the resource. If the language is not specified, `genrb` will try to deduce the
language from the resource name (en, zh, sh). If the resource name is not an ISO
language code (root), the default language for the platform will be used. This
language is recorded as the source language (the `source-language` attribute)
for all the translation units. The XLIFF file produced
from the resource above will be named `root.xlf` and will look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xliff version = "1.1" xmlns = 'urn:oasis:names:tc:xliff:document:1.1'
xmlns:xsi = 'http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation='urn:oasis:names:tc:xliff:document:1.1
http://www.oasis-open.org/committees/xliff/documents/xliff-core-1.1.xsd'>
    <file xml:space = "preserve" source-language = "en"
    datatype = "x-icu-resource-bundle" original = "root.txt"
    date = "2007-08-17T21:17:08Z">
        <header>
            <tool tool-id = "genrb-3.3-icu-3.8" tool-name = "genrb"/>
        </header>
        <body>
            <group id = "root" restype = "x-icu-table">
                <trans-unit id = "optionMessage" resname = "optionMessage">
                    <source>unrecognized command line option:</source>
                </trans-unit>
                <trans-unit id = "usage" resname = "usage">
                    <source>usage: ufortune [-v] [-l locale]</source>
                </trans-unit>
            </group>
        </body>
    </file>
</xliff>
```

This file can be sent to translators. Using translation tools that support
XLIFF, translators will produce one or more translations for this resource.
The processed file might look like this:

```xml
<?xml version="1.0" encoding="utf-8"?>
<xliff version = "1.1" xmlns='urn:oasis:names:tc:xliff:document:1.1'
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation='urn:oasis:names:tc:xliff:document:1.1
http://www.oasis-open.org/committees/xliff/documents/xliff-core-1.1.xsd'>
    <file xml:space = "preserve" source-language = "en" target-language = "sh"
    datatype = "x-icu-resource-bundle" original = "root.txt"
    date = "2007-08-17T21:17:08Z">
        <header>
            <tool tool-id = "genrb-3.3-icu-3.8" tool-name = "genrb"/>
        </header>
        <body>
            <group id = "root" restype = "x-icu-table">
                <trans-unit id = "optionMessage" resname = "optionMessage">
                    <source>unrecognized command line option:</source>
                    <target>nepoznata opcija na komandnoj liniji:</target>
                </trans-unit>
                <trans-unit id = "usage" resname = "usage">
                    <source>usage: ufortune [-v] [-l locale]</source>
                    <target>upotreba: ufortune [-v] [-l lokal]</target>
                </trans-unit>
            </group>
        </body>
    </file>
</xliff>
```

In order to convert this file to a set of resource bundle files, we need to use
ICU4J's `com.ibm.icu.dev.tool.localeconverter.XLIFF2ICUConverter` class.

> :point_right: **Note**: The XLIFF2ICUConverter class relies on an XML parser being
> available. JDK 1.4 and newer provide an XML parser out of the box. For earlier
> versions, you will need to install xerces.

The command line for running XLIFF2ICUConverter should specify the file that needs
to be converted, `sh.xlf` in this case. Optionally, you can specify input and
output directories as well as the package name. After running this tool, two
files will be produced: `en.txt` and `sh.txt`.
This is what they would look like:

```
// ***************************************************************************
// *
// * Tool: com.ibm.icu.dev.tool.localeconverter.XLIFF2ICUConverter.java
// * Date & Time: 08/17/2007 11:33:54 AM HST
// * Source File: C:\trunk\icuhtml\userguide\xliff\sh.xlf
// *
// ***************************************************************************
en:table{
    optionMessage:string{"unrecognized command line option:"}
    usage:string{"usage: ufortune [-v] [-l locale]"}
}
```

and

```
// ***************************************************************************
// *
// * Tool: com.ibm.icu.dev.tool.localeconverter.XLIFF2ICUConverter.java
// * Date & Time: 08/17/2007 11:33:54 AM HST
// * Source File: C:\trunk\icuhtml\userguide\xliff\sh.xlf
// *
// ***************************************************************************
sh:table{
    optionMessage:string{"nepoznata opcija na komandnoj liniji:"}
    usage:string{"upotreba: ufortune [-v] [-l lokal]"}
}
```

These files can then be used like any other resource bundle files.
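
The generated files are flat tables of typed string resources, so text of the
same shape is easy to produce from key/value pairs. The following C sketch
formats such a table into a buffer. The function name `rb_format_string_table`
is hypothetical; it is not part of ICU or of XLIFF2ICUConverter. The sketch
assumes that the values contain no characters that need escaping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not an ICU API): writes "locale:table{ ... }" with one
 * key:string{"value"} entry per pair into `out`.
 * Assumption: values need no escaping (no backslashes, quotes or control codes).
 * Returns the number of characters written, excluding the terminator,
 * or -1 if the buffer is too small. */
int rb_format_string_table(char *out, size_t cap, const char *locale,
                           const char *const *keys,
                           const char *const *values, size_t count)
{
    size_t used = 0;
    size_t i;
    int n;

    assert(out != NULL && locale != NULL && keys != NULL && values != NULL);

    /* Opening line: the bundle is a table named after the locale. */
    n = snprintf(out, cap, "%s:table{\n", locale);
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    used = (size_t)n;

    /* One explicitly typed string resource per key/value pair. */
    for (i = 0; i < count; ++i) {
        n = snprintf(out + used, cap - used, "    %s:string{\"%s\"}\n",
                     keys[i], values[i]);
        if (n < 0 || (size_t)n >= cap - used) {
            return -1;
        }
        used += (size_t)n;
    }

    /* Closing brace of the table. */
    n = snprintf(out + used, cap - used, "}\n");
    if (n < 0 || (size_t)n >= cap - used) {
        return -1;
    }
    used += (size_t)n;

    return (int)used;
}
```

Text produced this way still has to be saved as `localename.txt` and compiled
with `genrb` before it can be opened as a resource bundle.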