--- layout: default title: Resource and Data Tracing nav_order: 2 parent: ICU Data --- # Resource and Data Tracing {: .no_toc } ## Contents {: .no_toc .text-delta } 1. TOC {:toc} --- ## Overview When building an [ICU data filter specification](buildtool.md), it is useful to see what resources are being used by your application so that you can select those resources and discard the others. This guide describes how to use *utrace.h* to inspect resource access in real time in ICU4C. **Note:** This feature is only available in ICU4C at this time. If you are interested in ICU4J, please see [ICU-20656](https://unicode-org.atlassian.net/browse/ICU-20656). ## Quick Start First, you *must* have a copy of ICU4C configured with tracing enabled. $ ./runConfigureICU Linux --enable-tracing The following program prints resource and data usages to standard out: ```cpp #include "unicode/brkiter.h" #include "unicode/errorcode.h" #include "unicode/localpointer.h" #include "unicode/utrace.h" #include static void U_CALLCONV traceData( const void *context, int32_t fnNumber, int32_t level, const char *fmt, va_list args) { char buf[1000]; const char *fnName; fnName = utrace_functionName(fnNumber); utrace_vformat(buf, sizeof(buf), 0, fmt, args); std::cout << fnName << " " << buf << std::endl; } int main() { icu::ErrorCode status; const void* context = nullptr; utrace_setFunctions(context, nullptr, nullptr, traceData); utrace_setLevel(UTRACE_VERBOSE); // Create a new BreakIterator icu::LocalPointer brkitr( icu::BreakIterator::createWordInstance("zh-CN", status)); } ``` The following output is produced from this program: res-open icudt64l-brkitr/zh_CN.res res-open icudt64l-brkitr/zh.res res-open icudt64l-brkitr/root.res bundle-open icudt64l-brkitr/zh.res resc (get) icudt64l-brkitr/zh.res @ /boundaries resc (get) icudt64l-brkitr/root.res @ /boundaries/word resc (string) icudt64l-brkitr/root.res @ /boundaries/word file-open icudt64l-brkitr/word.brk What this means: 1. The BreakIterator constructor opened three resource files in the locale fallback chain for zh_CN. The actual bundle was opened for zh. 2. One string was read from that resource bundle: the one at the resource path "/boundaries/word" in brkitr/root.res. 3. In addition, the binary data file brkitr/word.brk was opened. Based on that information, you can make a more informed decision when writing resource filter rules for this simple program. ## Data Tracing API The `traceData` function shown above takes five arguments. The following two are most important for data tracing: - `fnNumber` indicates what type of data access this is. - `args` contains the details on which resources were accessed. **Important:** When reading from `args`, the strings are valid only within the scope of your `traceData` function. You should make copies of the strings if you intend to save them for further processing. ### UTRACE_UDATA_RESOURCE UTRACE_UDATA_RESOURCE is used to indicate that a value inside of a resource bundle was read by ICU code. When `fnNumber` is `UTRACE_UDATA_RESOURCE`, there are three C-style strings in `args`: 1. Data type; not usually relevant for the purpose of resource filtering. 2. The internal path of the resource file from which the value was read. 3. The path to the value within that resource file. To read each of these into different variables, you can write the code, ```cpp const char* dataType = va_arg(args, const char*); const char* filePath = va_arg(args, const char*); const char* resPath = va_arg(args, const char*); ``` As stated above, you should copy the strings if you intend to save them. The pointers will not be valid after the tracing function returns. ### UTRACE_UDATA_BUNDLE UTRACE_UDATA_BUNDLE is used to indicate that a resource bundle was opened by ICU code. For the purposes of making your ICU data filter, the specific resource paths provided by UTRACE_UDATA_RESOURCE are more precise and useful. ### UTRACE_UDATA_DATA_FILE UTRACE_UDATA_DATA_FILE is used to indicate that a non-resource-bundle binary data file was opened by ICU code. Such files are used for break iteration, conversion, confusables, and a handful of other ICU services. ### UTRACE_UDATA_RES_FILE UTRACE_UDATA_RES_FILE is used to indicate that a binary resource bundle file was opened by ICU code. This can be helpful to debug locale fallbacks. This differs from UTRACE_UDATA_BUNDLE because the resource *file* is typically opened only once per application runtime. For the purposes of making your ICU data filter, the specific resource paths provided by UTRACE_UDATA_RESOURCE are more precise and useful.