---
layout: default
title: Resource and Data Tracing
nav_order: 2
parent: ICU Data
---
<!--
© 2019 and later: Unicode, Inc. and others.
License & terms of use: http://www.unicode.org/copyright.html
-->

# Resource and Data Tracing
{: .no_toc }

## Contents
{: .no_toc .text-delta }

1. TOC
{:toc}

---

## Overview

When building an [ICU data filter specification](buildtool.md), it is useful to
see which resources your application actually uses so that you can select
those resources and discard the others. This guide describes how to use
*utrace.h* to inspect resource access in real time in ICU4C.

**Note:** This feature is only available in ICU4C at this time. If you are
interested in ICU4J, please see
[ICU-20656](https://unicode-org.atlassian.net/browse/ICU-20656).

## Quick Start

First, you *must* have a copy of ICU4C configured with tracing enabled.

    $ ./runConfigureICU Linux --enable-tracing

The following program prints resource and data usage to standard output:

```cpp
#include "unicode/brkiter.h"
#include "unicode/errorcode.h"
#include "unicode/localpointer.h"
#include "unicode/utrace.h"

#include <iostream>

static void U_CALLCONV traceData(
        const void *context,
        int32_t fnNumber,
        int32_t level,
        const char *fmt,
        va_list args) {
    char        buf[1000];
    const char *fnName;

    fnName = utrace_functionName(fnNumber);
    utrace_vformat(buf, sizeof(buf), 0, fmt, args);
    std::cout << fnName << " " << buf << std::endl;
}

int main() {
    icu::ErrorCode status;

    const void* context = nullptr;
    utrace_setFunctions(context, nullptr, nullptr, traceData);
    utrace_setLevel(UTRACE_VERBOSE);

    // Create a new BreakIterator
    icu::LocalPointer<icu::BreakIterator> brkitr(
        icu::BreakIterator::createWordInstance("zh-CN", status));
}
```

This program produces the following output:

    res-open icudt64l-brkitr/zh_CN.res
    res-open icudt64l-brkitr/zh.res
    res-open icudt64l-brkitr/root.res
    bundle-open icudt64l-brkitr/zh.res
    resc       (get) icudt64l-brkitr/zh.res @ /boundaries
    resc       (get) icudt64l-brkitr/root.res @ /boundaries/word
    resc    (string) icudt64l-brkitr/root.res @ /boundaries/word
    file-open icudt64l-brkitr/word.brk

What this means:

1. Creating the BreakIterator opened three resource files in the locale
   fallback chain for zh_CN. The actual bundle was opened for zh.
2. One string was read from that resource bundle: the one at the resource path
   "/boundaries/word" in brkitr/root.res.
3. In addition, the binary data file brkitr/word.brk was opened.

Based on that information, you can make a more informed decision when writing
resource filter rules for this simple program.

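For longer traces, reading the output by eye becomes tedious. As a rough illustration, the sketch below groups the `resc` lines by resource file so that the distinct resource paths can be reviewed when writing filter rules. `collectResourcePaths` is a hypothetical helper, not part of ICU. It assumes the line layout printed by the Quick Start program above, and that neither file names nor resource paths contain whitespace.

```cpp
#include <istream>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Hypothetical helper (not an ICU API): reads trace output line by line and
// groups the resource paths from "resc" lines by the file they were read from.
// Lines of any other shape (res-open, bundle-open, file-open) are skipped.
std::map<std::string, std::set<std::string>> collectResourcePaths(std::istream& log) {
    std::map<std::string, std::set<std::string>> pathsByFile;
    std::string line;
    while (std::getline(log, line)) {
        std::istringstream fields(line);
        std::string event, dataType, file, at, resPath;
        // Assumed shape: resc (<type>) <file> @ <path>
        if (fields >> event >> dataType >> file >> at >> resPath
                && event == "resc" && at == "@") {
            // std::set removes duplicates, e.g. a (get) and a (string)
            // access to the same path.
            pathsByFile[file].insert(resPath);
        }
    }
    return pathsByFile;
}
```

Feeding the sample output above through this helper would yield two files: `icudt64l-brkitr/zh.res` with `/boundaries`, and `icudt64l-brkitr/root.res` with `/boundaries/word`.
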
## Data Tracing API

The `traceData` function shown above takes five arguments. The following two
are the most important for data tracing:

- `fnNumber` indicates what type of data access this is.
- `args` contains the details of which resources were accessed.

**Important:** When reading from `args`, the strings are valid only within the
scope of your `traceData` function. You should make copies of the strings if
you intend to save them for further processing.

### UTRACE_UDATA_RESOURCE

UTRACE_UDATA_RESOURCE indicates that a value inside a resource bundle was
read by ICU code.

When `fnNumber` is `UTRACE_UDATA_RESOURCE`, there are three C-style strings in
`args`:

1. Data type; not usually relevant for the purpose of resource filtering.
2. The internal path of the resource file from which the value was read.
3. The path to the value within that resource file.

To read each of these into a separate variable, you can write:

```cpp
const char* dataType = va_arg(args, const char*);
const char* filePath = va_arg(args, const char*);
const char* resPath = va_arg(args, const char*);
```

As stated above, you should copy the strings if you intend to save them. The
pointers will not be valid after the tracing function returns.

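One way to make those copies is sketched below. The `ResourceAccess` struct and both functions are hypothetical names introduced for illustration, not ICU APIs. The helper works on a `va_copy` of the argument list, so the caller's `args` is left untouched and could still be handed to a formatting function afterwards. `simulateResourceTrace` is only a stand-in for ICU invoking a trace callback, so the sketch can be exercised without an ICU build.

```cpp
#include <cstdarg>
#include <string>

// Hypothetical record type: std::string members own copies of the text,
// so they stay valid after the trace callback returns.
struct ResourceAccess {
    std::string dataType;
    std::string filePath;
    std::string resPath;
};

// Guards against a null pointer, which std::string cannot be built from.
static std::string copyOrEmpty(const char* s) {
    return s != nullptr ? std::string(s) : std::string();
}

// Hypothetical helper intended for the UTRACE_UDATA_RESOURCE case.
// Reads three const char* arguments from a copy of `args`.
ResourceAccess copyResourceAccess(va_list args) {
    va_list copy;
    va_copy(copy, args);
    ResourceAccess access;
    access.dataType = copyOrEmpty(va_arg(copy, const char*));
    access.filePath = copyOrEmpty(va_arg(copy, const char*));
    access.resPath  = copyOrEmpty(va_arg(copy, const char*));
    va_end(copy);
    return access;
}

// Stand-in for ICU calling the trace callback; `fmt` is unused here and only
// mirrors the position of the format argument in a real callback.
ResourceAccess simulateResourceTrace(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    ResourceAccess access = copyResourceAccess(args);
    va_end(args);
    return access;
}
```

In a real callback, the returned record could be appended to a container reachable through the `context` pointer, assuming the application passed such a pointer to `utrace_setFunctions`.
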
### UTRACE_UDATA_BUNDLE

UTRACE_UDATA_BUNDLE indicates that a resource bundle was opened by ICU code.

For the purposes of making your ICU data filter, the specific resource paths
provided by UTRACE_UDATA_RESOURCE are more precise and useful.

### UTRACE_UDATA_DATA_FILE

UTRACE_UDATA_DATA_FILE indicates that a non-resource-bundle binary data file
was opened by ICU code. Such files are used for break iteration, conversion,
confusables, and a handful of other ICU services.

### UTRACE_UDATA_RES_FILE

UTRACE_UDATA_RES_FILE indicates that a binary resource bundle file was opened
by ICU code. This can be helpful for debugging locale fallbacks. It differs
from UTRACE_UDATA_BUNDLE because the resource *file* is typically opened only
once per application runtime.

For the purposes of making your ICU data filter, the specific resource paths
provided by UTRACE_UDATA_RESOURCE are more precise and useful.
