• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# Typical Traces
2
3## Introduction
4
5The DFX subsystem provides performance traces for the application framework and core modules of the system. Each trace records the task execution time, formatted data at runtime, and process or thread information. You can use [Smartperf-Host](../../device-dev/device-test/smartperf-host.md) to parse the traces, analyze the performance hotspots during application running based on the trace lanes, and work out a performance optimization solution.
6
7This topic describes the meanings and usage of typical traces, and demonstrates how to use them to identify potential performance problems. It also elaborates the working principles of traces to help you better collect and analyze performance data. An in-depth understanding of traces will provide considerable support for your application performance profiling.
8
9## Typical Traces and Their Meanings
10
11The following describes typical traces in the rendering process.
12
13### Rendering Process
14
15Like other operating systems, OpenHarmony uses the Vsync signal to determine the time to render a frame. The Vsync signal, short for vertical sync signal, instructs the display to start refreshing the next frame after the vertical blanking interval.
16
17The display sends the Vsync signal at a fixed frequency. For example, if the refresh rate is 60 Hz, the display sends the Vsync signal every 16.6 ms.
18
19Upon receiving the Vsync signal, the UI backend engine prepares for rendering of the next frame. Then the application submits a rendering command, which is used to describe graphics rendering, texture setting, and shader usage. The UI backend engine adds the command to the rendering queue and executes the command at a proper time. Generally, the rendering command is executed in the background thread to prevent the main thread from being blocked.
20
21After finishing the command execution, the UI backend engine sends it to Render Service of the graphics subsystem. Render Service performs graphics calculation and rendering operations, such as vertex transformation, illumination, and texture mapping. After Render Service finishes rendering, it writes the rendering result to the frame buffer. The frame buffer is a memory area that stores image data for display output. Once the frame buffer is updated, the UI backend engine waits for the next Vsync signal. This ensures that the rendering result is ready before the vertical blanking interval elapses. When the next Vsync signal arrives, the UI backend engine sends the content in the frame buffer to the display, and the display refreshes its pixels accordingly. Till now, the rendering process is complete. See Figure 1.
22
23**Figure 1 Rendering flowchart**
24
25![Rendering flowchart](figures/trace-process.jpg)
26
27From the perspective of traces, the rendering process for a frame is as follows:
28
29(1) A Vsync signal arrives.
30
31(2) The UI backend engine draws the first frame.
32
33(3) The UI backend engine sends a rendering command to Render Service and requests a frame.
34
35(4) Render Service composites graphics layers, calculates the region to refresh, and renders and draws the frame.
36
37(5) Render Service sends the drawn frame to the display.
38
39Figure 2 shows typical traces used by the UI backend engine in the rendering process of a frame.
40
41**Figure 2** Trace lanes for UI backend engine rendering
42
43![Trace lanes for UI backend engine rendering](figures/trace-ui-engine.png)
44
45| No. | **Trace**                               | **Parameter Description**        | **Process Description**                              |
46|:--- | --------------------------------------- | ---------------- | ------------------------------------ |
47| 1   | OnVsyncEvent now:%" PRIu64 "            | Current timestamp, in nanoseconds.      | Receives a Vsync signal. The rendering process starts.                    |
48| 2   | FlushVsync                              |                  | Refreshes the view synchronization events, including recording frame information, refreshing tasks, drawing and rendering the context, and processing user input.|
49| 3   | UITaskScheduler::FlushTask              |                  | Refreshes the UI, including the layout, rendering, and animation.                  |
50| 4   | FlushMessages                           |                  | Sends a message to instruct Render Service to perform rendering.                       |
51| 5   | FlushLayoutTask                         |                  | Executes the layout task.                              |
52| 6   | FlushRenderTask %zu                     | Number of nodes to be rendered on the current page.| Executes the rendering task.                             |
53| 7   | Layout                                  |                  | Lays out the nodes.                                |
54| 8   | FrameNode::RenderTask                   |                  | Executes a single rendering task.                            |
55| 9   | ListLayoutAlgorithm::MeasureListItem:%d | Index of the current list item.         | Calculates the layout size of the list item.                          |
56
57Render Service in the graphics subsystem is responsible for drawing UI content, processing rendering tasks submitted by applications, combining graphics layers of different applications, and presenting images on the display.
58
59When receiving a Vsync signal, the UI backend engine processes the command sent by the application, including adding, deleting, or modifying the application's render node, and then performs animation calculation and occlusion calculation. All these operations are used to update the render tree.
60
61The UI backend engine draws the render tree. Specifically, it preprocess every node, calculates the absolute position and dirty region, and draws the dirty region. It preferentially uses the hardware composer for drawing. If certain content cannot be composited, the GPU performs redrawing. All the drawing results are stored in the display buffer and finally presented on the display.
62
63Figure 3 shows the trace lanes when the Vsync signal is refreshed.
64
65**Figure 3** Trace lanes on Render Service
66
67![Trace lanes on Render Service](figures/trace-rs.png)
68
69| No. | Trace                        | Description          |
70|:--- | ---------------------------- | ------------ |
71| 1   | RSMainThread::DoComposition  | Composites the layers of nodes in the render tree. |
72| 2   | RSMainThread::ProcessCommand | Processes the command from the client. |
73| 3   | Animate                      | Processes the animation.        |
74| 4   | RSMainThread::CalcOcclusion  | Calculates the occlusion.        |
75| 5   | ProcessDisplayRenderNode[x]  | Performs drawing on a single display.|
76| 6   | ProcessSurfaceNode:x         | Performs composition on a single node.  |
77| 7   | Repaint                      | The hardware composer performs composition.   |
78| 8   | Redraw                       | Redraws the image if certain content cannot be composited.|
79| 9   | RenderFrame                  | The GPU performs drawing.     |
80| 10  | SwapBuffers                  | Refreshes the display buffer.     |
81| 11  | Commit                       | Presents the drawing result on the display.    |
82
83### Lazy Loading
84
85Lazy loading is implementing using **LazyForEach**, which iterates over provided data sources and creates corresponding components during each iteration. When **LazyForEach** is used in a scrolling container, the framework creates components as required within the visible area of the scrolling container. When a component is out of the visible area, the framework destroys the component to reduce memory usage.
86
87Figure 4 shows the traces for a frame during lazy loading.
88
89**Figure 4** Trace lanes for lazy loading
90
91![Trace lanes for lazy loading](figures/trace-lazyforeach.png)
92
93| **No.**| **Trace**                      | **Parameter Description**                              | **Process Description**                                          |
94|:------ | ------------------------------ | -------------------------------------- | ------------------------------------------------ |
95| 1      | OnIdle, targettime:%" PRId64 " | Timestamp, before which the task is completed.                      | Checks whether there are new events to process in the idle event loop, and if yes, adds the task scheduler to the UI thread and executes the predicted layout task.|
96| 2      | expiringItem_ count:[%zu]      | Number of lazy-loaded items.                            | Pre-builds items, including processing all lazy-loaded items.                                  |
97| 3      | List predict                   |                                        | Adds a predicted layout task.                                        |
98| 4      | Builder:BuildLazyItem [%d]     | Index of the project to be created.                              | Creates an item when needed and caches it.                                   |
99| 5      | Layout[%s][self:%d][parent:%d] | Tags, which indicate the index of the current node in the UI node tree and the index of the parent node in the UI node tree, respectively.| Lays out the current frame node.                                         |
100| 6      | Build[%s][self:%d][parent:%d]  | Tags, which indicate the index of the current node in the UI node tree and the index of the parent node in the UI node tree, respectively.| Builds the current frame node.                                         |
101| 7      | CustomNode:BuildRecycle %s     | JavaScript view name.                                | Triggers re-rendering.                                          |
102| 8      | ExecuteJS                      |                                        | Runs JavaScript code.                                          |
103
104### Page Loading
105
106When page loading is triggered, the system creates a new page instance and calls the lifecycle functions of the page accordingly. The page layout is loaded in the lifecycle functions, and the data is bound to the page components so that the page can display and update the data.
107
108Figure 5 shows the traces for a frame during page loading.
109
110Figure 5 Trace lanes for page loading
111
112![Trace lanes for page loading](figures/trace-load-frame-rate.png)
113
114| **No.**| **Trace**                                                     | **Parameter Description**                              | **Process Description**                      |
115|:------ | ------------------------------------------------------------- | -------------------------------------- | ---------------------------- |
116| 1      | PageRouterManager::RunPage                                    |                                        | Preprocesses page routes and loads the page.                |
117| 2      | PageRouterManager::LoadPage                                   |                                        | Loads the page and processes routes.                     |
118| 3      | JsiDeclarativeEngine::LoadPageSource                          |                                        | Loads a JavaScript file and parses it into ABC bytecode.|
119| 4      | JsiDeclarativeEngine::LoadJsWithModule Execute Page code : %s | Page URL.                               | Executes the page code.                      |
120| 5      | Build[%s][self:%d][parent:%d]                                 | Tags, which indicate the index of the current node in the UI node tree and the index of the parent node in the UI node tree, respectively.| Builds the current frame node.                     |
121| 6      | CustomNode:BuildItem %s                                       | JavaScript view name.                                | Renders the child node and mounts it to the parent node.            |
122| 7      | ViewChangeCallback(%d, %d)                                    | Width and height of the view.                               | Called when the view changes.                      |
123
124## Practice of Traces
125
126The following example traverses a list in **LazyForEach** mode and uses [Smartperf-Host](../../device-dev/device-test/smartperf-host.md) to trace the code execution process.
127
128In the [sample code](https://gitee.com/openharmony/applications_app_samples/tree/master/code/Performance/PerformanceLibrary/feature/trace), a **\<List>** container component is used to create 120 custom **\<IconView>** components in lazy loading mode. In each **\<IconView>** component, a **\<Flex>** container that holds the **\<Image>** and **\<Text>** components is used to present a list item with text and an image.
129
130```ts
131// src/main/ets/pages/LazyForEachPage.ets
132
133@Entry
134@Component
135struct LazyForEachPage {
136  private iconItemSourceList = new ListData();
137  aboutToAppear() {
138    // Add data of 120 icon items.
139     ......
140  }
141  build() {
142    Column() {
143      Text ('Example of lazy loading')
144        .fontSize(24)
145        .fontColor(Color.Black)
146        .fontWeight(FontWeight.Bold)
147        .textAlign(TextAlign.Start)
148        .width('90%')
149        .height(50)
150      List({ space: 20 }) {
151        LazyForEach(this.iconItemSourceList, (item: IconItemModel) => {
152          ListItem() {
153            IconItem({ image: item.image, text: item.text })
154          }
155        }, (item: IconItemModel, index) => index.toString())
156      }
157      .divider({ strokeWidth: 2, startMargin: 20, endMargin: 20 }) // Divider
158      .width('100%')
159      .height('100%')
160      .layoutWeight(1)
161    }
162    .width('100%')
163    .height('100%')
164    .alignItems(HorizontalAlign.Center)
165  }
166}
167```
168
169```ts
170// src/main/ets/view/IconView.ets
171
172@Component
173export struct IconItem {
174  image: string | Resource = '';
175  text: string | Resource = '';
176  build() {
177    Flex({ direction: FlexDirection.Row, justifyContent: FlexAlign.Center, alignContent: FlexAlign.Center }) {
178      Image(this.image)
179        .height(40)
180        .width(40)
181        .objectFit(ImageFit.Contain)
182        .margin({
183          left: 15
184        })
185      Text(this.text)
186        .fontSize(20)
187        .fontColor(Color.Black)
188        .width(100)
189        .height(50)
190        .textAlign(TextAlign.Center)
191    }
192    .width('100%')
193    .height(50)
194  }
195}
196```
197
198The following uses [Smartperf-Host](../../device-dev/device-test/smartperf-host.md) to capture the htrace file and generate a trace lane diagram, helping you better understand the loading process of the sample code. The lane diagram is divided into five parts. Each part is marked with numbers and tags.
199
200**Figure 6** Lane diagram of the list traversed in LazyForEach mode
201
202![Lane diagram of the list traversed in LazyForEach mode](figures/trace-lazyforeach-channel.png)
203
204The following describes the details of each part.
205
206**1. Loading the LazyForEach page**
207
208**Figure 7** Lane for loading the LazyForEach page
209
210![Lane for loading the LazyForEach page](figures/trace-load-lazyforeach.png)
211
212- **H:JsiDeclarativeEngine::LoadPageSource**: loads a JavaScript file and parses it into ABC bytecode.
213- **H: FlushPipelineWithoutAnimation**: flushes the rendering pipeline.
214- **H:CustomNode:OnAppear**: builds the current **OnAppear** lifecycle and executes the **aboutToAppear** lifecycle function.
215- **H:CustomNode:BuildItem LazyForEachPage**: renders the child node and mounts it to the LazyForEachPage page.
216
217**2. Laying out and rendering the current frame node Stage, and instructing Render Service to perform rendering**
218
219**Figure 8** Lane for Stage layout and rendering
220
221![Lane for Stage layout and rendering](figures/trace-stage-frame-rate.png)
222
223- **H:Layout[stage][self:1][parent:0]**: lays out the current frame node Stage, which functions as the framework and bears the Page node. Therefore, the tag presentation starts from Stage.
224  - **H:Measure[%s][self:17][parent:16]**: calculates the layout size of the components **\<Page>**, **\<Column>**, **\<Row>**, **\<Image>**, **\<Text>**, and more.
225  - **H:Builder:BuildLazyItem [0]** and **H:ListLayoutAlgorithm::MeasureListItem:0**: create a lazy item and calculate the layout size of the list item, respectively.
226  - **H:Layout[%s][self:38][parent:37]**: lays out the components **\<Page>**, **\<Column>**, **\<Row>**, **\<Image>**, **\<Text>**, and more.
227- **H:FrameNode::RenderTask**: executes the rendering task.
228- **H: RequestNextVSync**: requests the Vsync signal of the next frame.
229
230**3. Laying out and rendering the current frame node Flex, and instructing Render Service to perform rendering**
231
232**Figure 9** Lane for Flex layout and rendering
233
234![Lane for Flex layout and rendering](figures/trace-notify-rs.png)
235
236- **H:Layout[Flex][self:63][parent:62]**: lays out the current frame node Flex.
237  - **H:Measure[%s][self:17][parent:16]**: calculates the layout size of the components **\<Image>**, **\<Text>**, and more.
238- **H:FrameNode::RenderTask**: renders Flex;
239- **H: RequestNextVSync**: requests the Vsync signal of the next frame.
240
241**4. Preprocessing data and adding a predicted layout task**
242
243**Figure 10** Lane for preprocessing data and adding a predicted layout task
244
245![Lane for preprocessing data and adding a predicted layout task](figures/trace-per-build.png)
246
247- **H:Builder:BuildLazyItem [11]**: preprocesses 11 data records.
248- **H:Layout[ListItem][self:76][parent:-1]**: adds a predicted layout for Flex, **\<Image>**, and **\<Text>**.
249- **H:FlushMessages**: sends a message to instruct Render Service to perform rendering.
250
251**5. Compositing the layers for each node in the render tree**
252
253**Figure 11** Lane of compositing the layers for each node in the render tree
254
255![Lane for compositing the layer for each node in the render tree](figures/trace-node-tree.png)
256
257- **H:AcquireBuffer** and **H:ProcessSurfaceNode:EntryView XYWH[0 0 720 1280]**: obtain the display buffer and draw the EntryView, SystemUi_StatusBar, and SystemUi_NavigationBar.
258- **H:Repaint**: uses a hardware composer to composite and draw the current node tree.
259
260## Custom Traces
261
262You can use hiTraceMeter to customize traces based on service requirements. Currently, ArkTS and native APIs are provided for trace customization. For details, read the following topics:
263
264> [Development of Performance Tracing (ArkTS)](../dfx/hitracemeter-guidelines.md)
265>
266> [Performance Tracing Development Guide (Native)](../dfx/hitracemeter-native-guidelines.md)
267
268After custom traces are added, you can view them on [Smartperf-Host](../../device-dev/device-test/smartperf-host.md). They are displayed within an independent lane under the corresponding process.
269
270The figure below shows two lanes that record the duration from the calling of **startTrace** to the calling of **finishTrace**. Two tags (**CUSTOM_TRACE_TAG_1** and **CUSTOM_TRACE_TAG_2**) are used to record the time consumption.
271
272**Figure 12** Example of custom traces
273
274![Custom trace example](figures/trace-custom-example.png)
275
276The figure below shows two lanes that record the use of **TraceByValue**, which specifies the status value of a trace in the corresponding period. The meanings of the status values can be customized as required. You can move the cursor to the corresponding data block to view the status value. In the figure, the status value of the **CUSTOM_TRACE_TAG_2** tag is 2001 in the period marked in the red box.
277
278**Figure 13** Example of a custom status value
279
280![Example of a custom status value](figures/trace-custom-value.png)
281
282## Working Principles of Traces
283
284The generation of traces depends on the HiTrace component of the DFX subsystem. The hiTraceMeter module provides APIs to trace system performance. For details, read the following topics:
285
286> [HiTrace Component](https://gitee.com/openharmony/hiviewdfx_hitrace)
287>
288> [hiTraceMeter Module](../reference/apis-performance-analysis-kit/_hitrace.md)
289
290hiTraceMeter provides two sets of trace start and end APIs to collect statistics on the time consumed by logical behavior. Most time consumption statistics are collected per method. Therefore, hiTraceMeter also provides the macro definitions **HITRACE_METER**, **HITRACE_METER_NAME**, and **HITRACE_METER_FMT** to quickly record the time consumption of a single method. You only need to add them at the beginning of the method. These macro definitions depend on the lifecycle of local variables in the method. The principle is that a trace instance is created at the beginning of the method, and the trace start API is called in the instance constructor. Once the method finishes execution, the trace instance is destructed and the trace end API is called in the instance destructor.
291
292### Tracing Example in Applications
293
294The following example shows the use of hiTraceMeter in the ArkUI framework subsystem. The source code comes from the [ArkUI framework](https://gitee.com/openharmony/arkui_ace_engine).
295
296The following code is used to encapsulate the hiTraceMeter APIs. The principle is the same as that of **HITRACE_METER**. Quick tracing is implemented based on the lifecycle of local variables in the method.
297
298```cpp
299// frameworks/base/log/ace_trace.h
300
301#define ACE_SCOPED_TRACE(fmt, ...) AceScopedTrace aceScopedTrace(fmt, ##__VA_ARGS__)
302#define ACE_FUNCTION_TRACE() ACE_SCOPED_TRACE(__func__)
303
304class ACE_FORCE_EXPORT AceScopedTrace final {
305public:
306    explicit AceScopedTrace(const char* format, ...) __attribute__((__format__(printf, 2, 3)));
307    ~AceScopedTrace();
308
309    ACE_DISALLOW_COPY_AND_MOVE(AceScopedTrace);
310
311private:
312    bool traceEnabled_ { false };
313};
314```
315
316The following code is used to refresh the view synchronization events, including recording frame information, refreshing tasks, drawing and rendering the context, and processing user input. Invoke the macro definition **ACE_FUNCTION_TRACE** at the beginning of the method, record the function name **FlushVsync** as the trace name, and record the function start time. When the function ends, record the function end time to obtain the execution duration.
317
318```cpp
319// frameworks/core/pipeline/pipeline_context.cpp
320
321void PipelineContext::FlushVsync(uint64_t nanoTimestamp, uint32_t frameCount)
322{
323    ACE_FUNCTION_TRACE();
324
325    // Other service logic in the method is omitted here.
326    // ...
327}
328```
329
330### Tracing Example on Render Service
331
332The following example shows the use of hiTraceMeter in the graphics subsystem. The source code comes from the [graphics subsystem](https://gitee.com/openharmony/graphic_graphic_2d).
333
334The following code is used to encapsulate the hiTraceMeter APIs.
335
336```cpp
337// utils/log/rs_trace.h
338
339#include "hitrace_meter.h"
340#define ROSEN_TRACE_BEGIN(tag, name) StartTrace(tag, name)
341#define RS_TRACE_BEGIN(name) ROSEN_TRACE_BEGIN(HITRACE_TAG_GRAPHIC_AGP, name)
342#define ROSEN_TRACE_END(tag) FinishTrace(tag)
343#define RS_TRACE_END() ROSEN_TRACE_END(HITRACE_TAG_GRAPHIC_AGP)
344#define RS_TRACE_NAME(name) HITRACE_METER_NAME(HITRACE_TAG_GRAPHIC_AGP, name)
345#define RS_TRACE_NAME_FMT(fmt, ...) HITRACE_METER_FMT(HITRACE_TAG_GRAPHIC_AGP, fmt, ##__VA_ARGS__)
346#define RS_ASYNC_TRACE_BEGIN(name, value) StartAsyncTrace(HITRACE_TAG_GRAPHIC_AGP, name, value)
347#define RS_ASYNC_TRACE_END(name, value) FinishAsyncTrace(HITRACE_TAG_GRAPHIC_AGP, name, value)
348#define RS_TRACE_INT(name, value) CountTrace(HITRACE_TAG_GRAPHIC_AGP, name, value)
349#define RS_TRACE_FUNC() RS_TRACE_NAME(__func__)
350```
351
352The following code is used to draw images on the display. Invoke the macro definition **RS_TRACE_NAME** at the beginning of the method, combine the function name **ProcessDisplayRenderNode** with a display ID, and record the combination as the trace name. In essence, the macro definition **HITRACE_METER_NAME** for tracing a single method is used. Therefore, you only need to call the **ProcessDisplayRenderNode** function once to collect the function execution start time and end time and obtain the execution duration.
353
354```cpp
355// rosen/modules/render_service/core/pipeline/rs_surface_capture_task.cpp
356
357void RSSurfaceCaptureVisitor::ProcessDisplayRenderNode(RSDisplayRenderNode &node)
358{
359    RS_TRACE_NAME("RSSurfaceCaptureVisitor::ProcessDisplayRenderNode:" +
360        std::to_string(node.GetId()));
361
362    // Other service logic in the method is omitted here.
363    // ...
364}
365```
366