1# Typical Traces 2 3## Introduction 4 5The DFX subsystem provides performance traces for the application framework and core modules of the system. Each trace records the task execution time, formatted data at runtime, and process or thread information. You can use [Smartperf-Host](../../device-dev/device-test/smartperf-host.md) to parse the traces, analyze the performance hotspots during application running based on the trace lanes, and work out a performance optimization solution. 6 7This topic describes the meanings and usage of typical traces, and demonstrates how to use them to identify potential performance problems. It also elaborates the working principles of traces to help you better collect and analyze performance data. An in-depth understanding of traces will provide considerable support for your application performance profiling. 8 9## Typical Traces and Their Meanings 10 11The following describes typical traces in the rendering process. 12 13### Rendering Process 14 15Like other operating systems, OpenHarmony uses the Vsync signal to determine the time to render a frame. The Vsync signal, short for vertical sync signal, instructs the display to start refreshing the next frame after the vertical blanking interval. 16 17The display sends the Vsync signal at a fixed frequency. For example, if the refresh rate is 60 Hz, the display sends the Vsync signal every 16.6 ms. 18 19Upon receiving the Vsync signal, the UI backend engine prepares for rendering of the next frame. Then the application submits a rendering command, which is used to describe graphics rendering, texture setting, and shader usage. The UI backend engine adds the command to the rendering queue and executes the command at a proper time. Generally, the rendering command is executed in the background thread to prevent the main thread from being blocked. 20 21After finishing the command execution, the UI backend engine sends it to Render Service of the graphics subsystem. Render Service performs graphics calculation and rendering operations, such as vertex transformation, illumination, and texture mapping. After Render Service finishes rendering, it writes the rendering result to the frame buffer. The frame buffer is a memory area that stores image data for display output. Once the frame buffer is updated, the UI backend engine waits for the next Vsync signal. This ensures that the rendering result is ready before the vertical blanking interval elapses. When the next Vsync signal arrives, the UI backend engine sends the content in the frame buffer to the display, and the display refreshes its pixels accordingly. Till now, the rendering process is complete. See Figure 1. 22 23**Figure 1 Rendering flowchart** 24 25 26 27From the perspective of traces, the rendering process for a frame is as follows: 28 29(1) A Vsync signal arrives. 30 31(2) The UI backend engine draws the first frame. 32 33(3) The UI backend engine sends a rendering command to Render Service and requests a frame. 34 35(4) Render Service composites graphics layers, calculates the region to refresh, and renders and draws the frame. 36 37(5) Render Service sends the drawn frame to the display. 38 39Figure 2 shows typical traces used by the UI backend engine in the rendering process of a frame. 40 41**Figure 2** Trace lanes for UI backend engine rendering 42 43 44 45| No. | **Trace** | **Parameter Description** | **Process Description** | 46|:--- | --------------------------------------- | ---------------- | ------------------------------------ | 47| 1 | OnVsyncEvent now:%" PRIu64 " | Current timestamp, in nanoseconds. | Receives a Vsync signal. The rendering process starts. | 48| 2 | FlushVsync | | Refreshes the view synchronization events, including recording frame information, refreshing tasks, drawing and rendering the context, and processing user input.| 49| 3 | UITaskScheduler::FlushTask | | Refreshes the UI, including the layout, rendering, and animation. | 50| 4 | FlushMessages | | Sends a message to instruct Render Service to perform rendering. | 51| 5 | FlushLayoutTask | | Executes the layout task. | 52| 6 | FlushRenderTask %zu | Number of nodes to be rendered on the current page.| Executes the rendering task. | 53| 7 | Layout | | Lays out the nodes. | 54| 8 | FrameNode::RenderTask | | Executes a single rendering task. | 55| 9 | ListLayoutAlgorithm::MeasureListItem:%d | Index of the current list item. | Calculates the layout size of the list item. | 56 57Render Service in the graphics subsystem is responsible for drawing UI content, processing rendering tasks submitted by applications, combining graphics layers of different applications, and presenting images on the display. 58 59When receiving a Vsync signal, the UI backend engine processes the command sent by the application, including adding, deleting, or modifying the application's render node, and then performs animation calculation and occlusion calculation. All these operations are used to update the render tree. 60 61The UI backend engine draws the render tree. Specifically, it preprocess every node, calculates the absolute position and dirty region, and draws the dirty region. It preferentially uses the hardware composer for drawing. If certain content cannot be composited, the GPU performs redrawing. All the drawing results are stored in the display buffer and finally presented on the display. 62 63Figure 3 shows the trace lanes when the Vsync signal is refreshed. 64 65**Figure 3** Trace lanes on Render Service 66 67 68 69| No. | Trace | Description | 70|:--- | ---------------------------- | ------------ | 71| 1 | RSMainThread::DoComposition | Composites the layers of nodes in the render tree. | 72| 2 | RSMainThread::ProcessCommand | Processes the command from the client. | 73| 3 | Animate | Processes the animation. | 74| 4 | RSMainThread::CalcOcclusion | Calculates the occlusion. | 75| 5 | ProcessDisplayRenderNode[x] | Performs drawing on a single display.| 76| 6 | ProcessSurfaceNode:x | Performs composition on a single node. | 77| 7 | Repaint | The hardware composer performs composition. | 78| 8 | Redraw | Redraws the image if certain content cannot be composited.| 79| 9 | RenderFrame | The GPU performs drawing. | 80| 10 | SwapBuffers | Refreshes the display buffer. | 81| 11 | Commit | Presents the drawing result on the display. | 82 83### Lazy Loading 84 85Lazy loading is implementing using **LazyForEach**, which iterates over provided data sources and creates corresponding components during each iteration. When **LazyForEach** is used in a scrolling container, the framework creates components as required within the visible area of the scrolling container. When a component is out of the visible area, the framework destroys the component to reduce memory usage. 86 87Figure 4 shows the traces for a frame during lazy loading. 88 89**Figure 4** Trace lanes for lazy loading 90 91 92 93| **No.**| **Trace** | **Parameter Description** | **Process Description** | 94|:------ | ------------------------------ | -------------------------------------- | ------------------------------------------------ | 95| 1 | OnIdle, targettime:%" PRId64 " | Timestamp, before which the task is completed. | Checks whether there are new events to process in the idle event loop, and if yes, adds the task scheduler to the UI thread and executes the predicted layout task.| 96| 2 | expiringItem_ count:[%zu] | Number of lazy-loaded items. | Pre-builds items, including processing all lazy-loaded items. | 97| 3 | List predict | | Adds a predicted layout task. | 98| 4 | Builder:BuildLazyItem [%d] | Index of the project to be created. | Creates an item when needed and caches it. | 99| 5 | Layout[%s][self:%d][parent:%d] | Tags, which indicate the index of the current node in the UI node tree and the index of the parent node in the UI node tree, respectively.| Lays out the current frame node. | 100| 6 | Build[%s][self:%d][parent:%d] | Tags, which indicate the index of the current node in the UI node tree and the index of the parent node in the UI node tree, respectively.| Builds the current frame node. | 101| 7 | CustomNode:BuildRecycle %s | JavaScript view name. | Triggers re-rendering. | 102| 8 | ExecuteJS | | Runs JavaScript code. | 103 104### Page Loading 105 106When page loading is triggered, the system creates a new page instance and calls the lifecycle functions of the page accordingly. The page layout is loaded in the lifecycle functions, and the data is bound to the page components so that the page can display and update the data. 107 108Figure 5 shows the traces for a frame during page loading. 109 110Figure 5 Trace lanes for page loading 111 112 113 114| **No.**| **Trace** | **Parameter Description** | **Process Description** | 115|:------ | ------------------------------------------------------------- | -------------------------------------- | ---------------------------- | 116| 1 | PageRouterManager::RunPage | | Preprocesses page routes and loads the page. | 117| 2 | PageRouterManager::LoadPage | | Loads the page and processes routes. | 118| 3 | JsiDeclarativeEngine::LoadPageSource | | Loads a JavaScript file and parses it into ABC bytecode.| 119| 4 | JsiDeclarativeEngine::LoadJsWithModule Execute Page code : %s | Page URL. | Executes the page code. | 120| 5 | Build[%s][self:%d][parent:%d] | Tags, which indicate the index of the current node in the UI node tree and the index of the parent node in the UI node tree, respectively.| Builds the current frame node. | 121| 6 | CustomNode:BuildItem %s | JavaScript view name. | Renders the child node and mounts it to the parent node. | 122| 7 | ViewChangeCallback(%d, %d) | Width and height of the view. | Called when the view changes. | 123 124## Practice of Traces 125 126The following example traverses a list in **LazyForEach** mode and uses [Smartperf-Host](../../device-dev/device-test/smartperf-host.md) to trace the code execution process. 127 128In the [sample code](https://gitee.com/openharmony/applications_app_samples/tree/master/code/Performance/PerformanceLibrary/feature/trace), a **\<List>** container component is used to create 120 custom **\<IconView>** components in lazy loading mode. In each **\<IconView>** component, a **\<Flex>** container that holds the **\<Image>** and **\<Text>** components is used to present a list item with text and an image. 129 130```ts 131// src/main/ets/pages/LazyForEachPage.ets 132 133@Entry 134@Component 135struct LazyForEachPage { 136 private iconItemSourceList = new ListData(); 137 aboutToAppear() { 138 // Add data of 120 icon items. 139 ...... 140 } 141 build() { 142 Column() { 143 Text ('Example of lazy loading') 144 .fontSize(24) 145 .fontColor(Color.Black) 146 .fontWeight(FontWeight.Bold) 147 .textAlign(TextAlign.Start) 148 .width('90%') 149 .height(50) 150 List({ space: 20 }) { 151 LazyForEach(this.iconItemSourceList, (item: IconItemModel) => { 152 ListItem() { 153 IconItem({ image: item.image, text: item.text }) 154 } 155 }, (item: IconItemModel, index) => index.toString()) 156 } 157 .divider({ strokeWidth: 2, startMargin: 20, endMargin: 20 }) // Divider 158 .width('100%') 159 .height('100%') 160 .layoutWeight(1) 161 } 162 .width('100%') 163 .height('100%') 164 .alignItems(HorizontalAlign.Center) 165 } 166} 167``` 168 169```ts 170// src/main/ets/view/IconView.ets 171 172@Component 173export struct IconItem { 174 image: string | Resource = ''; 175 text: string | Resource = ''; 176 build() { 177 Flex({ direction: FlexDirection.Row, justifyContent: FlexAlign.Center, alignContent: FlexAlign.Center }) { 178 Image(this.image) 179 .height(40) 180 .width(40) 181 .objectFit(ImageFit.Contain) 182 .margin({ 183 left: 15 184 }) 185 Text(this.text) 186 .fontSize(20) 187 .fontColor(Color.Black) 188 .width(100) 189 .height(50) 190 .textAlign(TextAlign.Center) 191 } 192 .width('100%') 193 .height(50) 194 } 195} 196``` 197 198The following uses [Smartperf-Host](../../device-dev/device-test/smartperf-host.md) to capture the htrace file and generate a trace lane diagram, helping you better understand the loading process of the sample code. The lane diagram is divided into five parts. Each part is marked with numbers and tags. 199 200**Figure 6** Lane diagram of the list traversed in LazyForEach mode 201 202 203 204The following describes the details of each part. 205 206**1. Loading the LazyForEach page** 207 208**Figure 7** Lane for loading the LazyForEach page 209 210 211 212- **H:JsiDeclarativeEngine::LoadPageSource**: loads a JavaScript file and parses it into ABC bytecode. 213- **H: FlushPipelineWithoutAnimation**: flushes the rendering pipeline. 214- **H:CustomNode:OnAppear**: builds the current **OnAppear** lifecycle and executes the **aboutToAppear** lifecycle function. 215- **H:CustomNode:BuildItem LazyForEachPage**: renders the child node and mounts it to the LazyForEachPage page. 216 217**2. Laying out and rendering the current frame node Stage, and instructing Render Service to perform rendering** 218 219**Figure 8** Lane for Stage layout and rendering 220 221 222 223- **H:Layout[stage][self:1][parent:0]**: lays out the current frame node Stage, which functions as the framework and bears the Page node. Therefore, the tag presentation starts from Stage. 224 - **H:Measure[%s][self:17][parent:16]**: calculates the layout size of the components **\<Page>**, **\<Column>**, **\<Row>**, **\<Image>**, **\<Text>**, and more. 225 - **H:Builder:BuildLazyItem [0]** and **H:ListLayoutAlgorithm::MeasureListItem:0**: create a lazy item and calculate the layout size of the list item, respectively. 226 - **H:Layout[%s][self:38][parent:37]**: lays out the components **\<Page>**, **\<Column>**, **\<Row>**, **\<Image>**, **\<Text>**, and more. 227- **H:FrameNode::RenderTask**: executes the rendering task. 228- **H: RequestNextVSync**: requests the Vsync signal of the next frame. 229 230**3. Laying out and rendering the current frame node Flex, and instructing Render Service to perform rendering** 231 232**Figure 9** Lane for Flex layout and rendering 233 234 235 236- **H:Layout[Flex][self:63][parent:62]**: lays out the current frame node Flex. 237 - **H:Measure[%s][self:17][parent:16]**: calculates the layout size of the components **\<Image>**, **\<Text>**, and more. 238- **H:FrameNode::RenderTask**: renders Flex; 239- **H: RequestNextVSync**: requests the Vsync signal of the next frame. 240 241**4. Preprocessing data and adding a predicted layout task** 242 243**Figure 10** Lane for preprocessing data and adding a predicted layout task 244 245 246 247- **H:Builder:BuildLazyItem [11]**: preprocesses 11 data records. 248- **H:Layout[ListItem][self:76][parent:-1]**: adds a predicted layout for Flex, **\<Image>**, and **\<Text>**. 249- **H:FlushMessages**: sends a message to instruct Render Service to perform rendering. 250 251**5. Compositing the layers for each node in the render tree** 252 253**Figure 11** Lane of compositing the layers for each node in the render tree 254 255 256 257- **H:AcquireBuffer** and **H:ProcessSurfaceNode:EntryView XYWH[0 0 720 1280]**: obtain the display buffer and draw the EntryView, SystemUi_StatusBar, and SystemUi_NavigationBar. 258- **H:Repaint**: uses a hardware composer to composite and draw the current node tree. 259 260## Custom Traces 261 262You can use hiTraceMeter to customize traces based on service requirements. Currently, ArkTS and native APIs are provided for trace customization. For details, read the following topics: 263 264> [Development of Performance Tracing (ArkTS)](../dfx/hitracemeter-guidelines.md) 265> 266> [Performance Tracing Development Guide (Native)](../dfx/hitracemeter-native-guidelines.md) 267 268After custom traces are added, you can view them on [Smartperf-Host](../../device-dev/device-test/smartperf-host.md). They are displayed within an independent lane under the corresponding process. 269 270The figure below shows two lanes that record the duration from the calling of **startTrace** to the calling of **finishTrace**. Two tags (**CUSTOM_TRACE_TAG_1** and **CUSTOM_TRACE_TAG_2**) are used to record the time consumption. 271 272**Figure 12** Example of custom traces 273 274 275 276The figure below shows two lanes that record the use of **TraceByValue**, which specifies the status value of a trace in the corresponding period. The meanings of the status values can be customized as required. You can move the cursor to the corresponding data block to view the status value. In the figure, the status value of the **CUSTOM_TRACE_TAG_2** tag is 2001 in the period marked in the red box. 277 278**Figure 13** Example of a custom status value 279 280 281 282## Working Principles of Traces 283 284The generation of traces depends on the HiTrace component of the DFX subsystem. The hiTraceMeter module provides APIs to trace system performance. For details, read the following topics: 285 286> [HiTrace Component](https://gitee.com/openharmony/hiviewdfx_hitrace) 287> 288> [hiTraceMeter Module](../reference/apis-performance-analysis-kit/_hitrace.md) 289 290hiTraceMeter provides two sets of trace start and end APIs to collect statistics on the time consumed by logical behavior. Most time consumption statistics are collected per method. Therefore, hiTraceMeter also provides the macro definitions **HITRACE_METER**, **HITRACE_METER_NAME**, and **HITRACE_METER_FMT** to quickly record the time consumption of a single method. You only need to add them at the beginning of the method. These macro definitions depend on the lifecycle of local variables in the method. The principle is that a trace instance is created at the beginning of the method, and the trace start API is called in the instance constructor. Once the method finishes execution, the trace instance is destructed and the trace end API is called in the instance destructor. 291 292### Tracing Example in Applications 293 294The following example shows the use of hiTraceMeter in the ArkUI framework subsystem. The source code comes from the [ArkUI framework](https://gitee.com/openharmony/arkui_ace_engine). 295 296The following code is used to encapsulate the hiTraceMeter APIs. The principle is the same as that of **HITRACE_METER**. Quick tracing is implemented based on the lifecycle of local variables in the method. 297 298```cpp 299// frameworks/base/log/ace_trace.h 300 301#define ACE_SCOPED_TRACE(fmt, ...) AceScopedTrace aceScopedTrace(fmt, ##__VA_ARGS__) 302#define ACE_FUNCTION_TRACE() ACE_SCOPED_TRACE(__func__) 303 304class ACE_FORCE_EXPORT AceScopedTrace final { 305public: 306 explicit AceScopedTrace(const char* format, ...) __attribute__((__format__(printf, 2, 3))); 307 ~AceScopedTrace(); 308 309 ACE_DISALLOW_COPY_AND_MOVE(AceScopedTrace); 310 311private: 312 bool traceEnabled_ { false }; 313}; 314``` 315 316The following code is used to refresh the view synchronization events, including recording frame information, refreshing tasks, drawing and rendering the context, and processing user input. Invoke the macro definition **ACE_FUNCTION_TRACE** at the beginning of the method, record the function name **FlushVsync** as the trace name, and record the function start time. When the function ends, record the function end time to obtain the execution duration. 317 318```cpp 319// frameworks/core/pipeline/pipeline_context.cpp 320 321void PipelineContext::FlushVsync(uint64_t nanoTimestamp, uint32_t frameCount) 322{ 323 ACE_FUNCTION_TRACE(); 324 325 // Other service logic in the method is omitted here. 326 // ... 327} 328``` 329 330### Tracing Example on Render Service 331 332The following example shows the use of hiTraceMeter in the graphics subsystem. The source code comes from the [graphics subsystem](https://gitee.com/openharmony/graphic_graphic_2d). 333 334The following code is used to encapsulate the hiTraceMeter APIs. 335 336```cpp 337// utils/log/rs_trace.h 338 339#include "hitrace_meter.h" 340#define ROSEN_TRACE_BEGIN(tag, name) StartTrace(tag, name) 341#define RS_TRACE_BEGIN(name) ROSEN_TRACE_BEGIN(HITRACE_TAG_GRAPHIC_AGP, name) 342#define ROSEN_TRACE_END(tag) FinishTrace(tag) 343#define RS_TRACE_END() ROSEN_TRACE_END(HITRACE_TAG_GRAPHIC_AGP) 344#define RS_TRACE_NAME(name) HITRACE_METER_NAME(HITRACE_TAG_GRAPHIC_AGP, name) 345#define RS_TRACE_NAME_FMT(fmt, ...) HITRACE_METER_FMT(HITRACE_TAG_GRAPHIC_AGP, fmt, ##__VA_ARGS__) 346#define RS_ASYNC_TRACE_BEGIN(name, value) StartAsyncTrace(HITRACE_TAG_GRAPHIC_AGP, name, value) 347#define RS_ASYNC_TRACE_END(name, value) FinishAsyncTrace(HITRACE_TAG_GRAPHIC_AGP, name, value) 348#define RS_TRACE_INT(name, value) CountTrace(HITRACE_TAG_GRAPHIC_AGP, name, value) 349#define RS_TRACE_FUNC() RS_TRACE_NAME(__func__) 350``` 351 352The following code is used to draw images on the display. Invoke the macro definition **RS_TRACE_NAME** at the beginning of the method, combine the function name **ProcessDisplayRenderNode** with a display ID, and record the combination as the trace name. In essence, the macro definition **HITRACE_METER_NAME** for tracing a single method is used. Therefore, you only need to call the **ProcessDisplayRenderNode** function once to collect the function execution start time and end time and obtain the execution duration. 353 354```cpp 355// rosen/modules/render_service/core/pipeline/rs_surface_capture_task.cpp 356 357void RSSurfaceCaptureVisitor::ProcessDisplayRenderNode(RSDisplayRenderNode &node) 358{ 359 RS_TRACE_NAME("RSSurfaceCaptureVisitor::ProcessDisplayRenderNode:" + 360 std::to_string(node.GetId())); 361 362 // Other service logic in the method is omitted here. 363 // ... 364} 365``` 366