# Analyzing CPP Crash

A CPP crash is a process crash in a C/C++ application. The FaultLogger module provides process crash detection, log collection, log storage, and log reporting to help you locate faults more effectively.

This topic describes CPP crash detection, fault locating and analysis, and typical cases. To use this guide, you need basic knowledge of the stack and heap in C/C++.

## Cpp Crash Detection

Process crash detection is based on the POSIX signal mechanism. Currently, the following exception signals can be processed:

| Signo| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 4 | SIGILL | Invalid instruction| An invalid, incorrectly formatted, unknown, or privileged instruction is executed.|
| 5 | SIGTRAP | Breakpoint or trap| An exception occurs or a trap instruction is executed.|
| 6 | SIGABRT | Process abort| The process is aborted abnormally. Generally, this exception occurs when the process calls **abort()** in the standard library.|
| 7 | SIGBUS | Illegal memory access| The process accesses an unaligned or nonexistent physical address.|
| 8 | SIGFPE | Floating-point exception| An incorrect arithmetic operation is executed, for example, division by zero, floating-point overflow, or integer overflow.|
| 11 | SIGSEGV | Invalid memory access| The process accesses an invalid memory region.|
| 16 | SIGSTKFLT | Stack error| The processor performs an incorrect stack operation, such as a pop when the stack is empty or a push when the stack is full.|
| 31 | SIGSYS | Incorrect system call| An incorrect or invalid parameter is used in a system call.|

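The signal number and code described in this section reach a process through the POSIX `sigaction()` interface. The following is a minimal sketch of that mechanism, not the FaultLogger implementation: `RecordSignal`, `InstallRecorder`, and `DemoRaise` are hypothetical names, and the harmless `SIGUSR1` is used instead of a crash signal because returning from a handler for a real `SIGSEGV` would re-execute the faulting instruction. Calling `DemoRaise()` installs the handler, raises the signal in the calling thread, and prints the values the handler observed.

```cpp
#include <cassert>
#include <cstdio>
#include <signal.h>

// Last signal seen by the handler (hypothetical globals, for illustration only).
static volatile sig_atomic_t g_lastSigno = 0;
static volatile sig_atomic_t g_lastCode = 0;

// SA_SIGINFO handler: receives the signal number and a siginfo_t carrying si_code.
static void RecordSignal(int signo, siginfo_t* info, void* /* context */)
{
    g_lastSigno = signo;
    g_lastCode = (info != nullptr) ? info->si_code : 0;
}

// Installs RecordSignal for one signal. Returns true on success.
bool InstallRecorder(int signo)
{
    struct sigaction action = {};
    action.sa_sigaction = RecordSignal;
    action.sa_flags = SA_SIGINFO;
    sigemptyset(&action.sa_mask);
    return sigaction(signo, &action, nullptr) == 0;
}

// Raises SIGUSR1 in the calling thread and reports what the handler recorded.
bool DemoRaise()
{
    if (!InstallRecorder(SIGUSR1)) {
        return false;
    }
    raise(SIGUSR1);  // The handler runs before raise() returns.
    std::printf("signo=%d si_code=%d\n",
                static_cast<int>(g_lastSigno), static_cast<int>(g_lastCode));
    return g_lastSigno == SIGUSR1;
}
```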
Some of the preceding fault signals are further classified into codes based on specific scenarios.

**SIGILL** occurs in Unix and Unix-like operating systems and indicates an invalid instruction exception. The codes of the **SIGILL** signal are described as follows:

| Code| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 1 | ILL_ILLOPC | Illegal operation code.| A privileged instruction or an instruction that is unsupported by the CPU is executed.|
| 2 | ILL_ILLOPN | Illegal operand.| An incorrect operand or improper operand type is used.|
| 3 | ILL_ILLADR | Illegal addressing mode.| An instruction uses an invalid addressing mode.|
| 4 | ILL_ILLTRP | Illegal trap.| A program performs an illegal trap instruction or an undefined operation.|
| 5 | ILL_PRVOPC | Illegal privileged operation code.| A common user executes a privileged instruction.|
| 6 | ILL_PRVREG | Illegal privileged register.| A common user accesses a privileged register.|
| 7 | ILL_COPROC | Illegal coprocessor.| A program performs an undefined coprocessor instruction.|
| 8 | ILL_BADSTK | Illegal stack.| A program performs an operation at an invalid stack address, or the stack overflows.|

**SIGTRAP** usually occurs in debugging and tracing. The four codes of the **SIGTRAP** signal are described as follows:

| Code| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 1 | TRAP_BRKPT | Software breakpoint.| A software breakpoint is reached in a program. When debugging a program, you can set a software breakpoint at a key position to pause the program execution and check information such as variable values.|
| 2 | TRAP_TRACE | Single-step debugging.| A single instruction is executed in a program. Single-stepping can be used to check the execution result of each instruction.|
| 3 | TRAP_BRANCH | Branch tracing.| A branch instruction is executed in a program. Branch instructions control the execution flow of a program, for example, in **if** statements and loops.|
| 4 | TRAP_HWBKPT | Hardware breakpoint.| A hardware breakpoint is reached in a program. When debugging a program, you can set a hardware breakpoint at a key position to pause the program execution and check information such as variable values. Unlike a software breakpoint, a hardware breakpoint is implemented in CPU hardware. Therefore, whether a hardware breakpoint is triggered can be detected in real time during program execution.|

The **SIGBUS** signal is sent by the operating system to a process. It usually indicates a memory access error. The codes of the **SIGBUS** signal are described as follows:

| Code| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 1 | BUS_ADRALN | Invalid address alignment.| A program accesses an unaligned memory address, for example, reading a 4-byte integer from an address that is not a multiple of 4.|
| 2 | BUS_ADRERR | Nonexistent physical address.| A program accesses a physical address that does not exist, for example, a page of a memory-mapped file that lies beyond the end of the file.|
| 3 | BUS_OBJERR | Object-specific hardware error.| A hardware error specific to the accessed object occurs.|
| 4 | BUS_MCEERR_AR | Hardware memory error (action required).| A hardware memory error is consumed by the process during a machine check and must be handled immediately.|
| 5 | BUS_MCEERR_AO | Hardware memory error (action optional).| A hardware memory error is detected in the process but has not been consumed. Handling it is optional.|

The **SIGFPE** signal indicates a floating-point exception or an arithmetic exception. The codes of the **SIGFPE** signal are described as follows:

| Code| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 1 | FPE_INTDIV | Integer division by zero.| The divisor in an integer division is zero.  |
| 2 | FPE_INTOVF | Integer overflow.| The result of an integer operation exceeds the representable range.  |
| 3 | FPE_FLTDIV | Floating-point division by zero.| The divisor in a floating-point division is zero.  |
| 4 | FPE_FLTOVF | Floating-point overflow.| The result of a floating-point operation is too large to be represented.  |
| 5 | FPE_FLTUND | Floating-point underflow.| The result of a floating-point operation is too small to be represented.  |
| 6 | FPE_FLTRES | Inexact floating-point result.| The result of a floating-point operation cannot be represented exactly.  |
| 7 | FPE_FLTINV | Invalid floating-point operation.| An invalid floating-point operation is executed, for example, 0.0 divided by 0.0.  |
| 8 | FPE_FLTSUB | Subscript out of range.| A subscript is out of range.  |

The **SIGSEGV** signal occurs when a process accesses a non-existent memory address or an inaccessible address. The codes of the **SIGSEGV** signal are described as follows:

| Code| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 1 | SEGV_MAPERR | Non-existent memory address.| A process accesses a memory address that does not exist or that is not mapped to the process address space. This exception is usually caused by pointer errors, such as dereferencing a null, dangling, or uninitialized pointer.|
| 2 | SEGV_ACCERR | Inaccessible memory address.| A process accesses a memory address for which it lacks the required permission, such as writing to read-only memory or executing code in memory without execution permission. This exception is usually caused by buffer overflow or modifying read-only memory.|

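When a log shows a numeric code, it can help to map the signal number and code back to the names in the preceding tables. The following sketch covers only a few of the listed codes, using the constants from `<signal.h>`. `CrashCodeName` is a hypothetical helper, not part of FaultLogger; for example, `CrashCodeName(SIGSEGV, SEGV_MAPERR)` returns `"SEGV_MAPERR"`.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>

// Maps a (signal, si_code) pair to the code name used in the tables above.
// Only a subset of codes is handled; anything else yields "UNKNOWN".
const char* CrashCodeName(int signo, int code)
{
    switch (signo) {
        case SIGSEGV:
            if (code == SEGV_MAPERR) return "SEGV_MAPERR";
            if (code == SEGV_ACCERR) return "SEGV_ACCERR";
            break;
        case SIGBUS:
            if (code == BUS_ADRALN) return "BUS_ADRALN";
            if (code == BUS_ADRERR) return "BUS_ADRERR";
            if (code == BUS_OBJERR) return "BUS_OBJERR";
            break;
        case SIGFPE:
            if (code == FPE_INTDIV) return "FPE_INTDIV";
            if (code == FPE_INTOVF) return "FPE_INTOVF";
            if (code == FPE_FLTDIV) return "FPE_FLTDIV";
            break;
        case SIGILL:
            if (code == ILL_ILLOPC) return "ILL_ILLOPC";
            if (code == ILL_ILLOPN) return "ILL_ILLOPN";
            break;
        default:
            break;
    }
    return "UNKNOWN";
}
```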
Codes can be classified not only by **signo** but also by the cause of the signal. The preceding tables describe the codes that are specific to the **signo** of each signal, while the following table describes the codes that are classified by cause and apply to all signals:

| Code| Signal| Description| Trigger Cause|
| -------- | -------- | -------- | -------- |
| 0 | SI_USER | User space.|This signal is sent by a user-space process to another process, usually by calling **kill()**. For example, running the **kill** command in a shell sends a signal with this code to the target process.|
| 0x80 | SI_KERNEL | Kernel.|This signal is sent by the kernel to the process. It is usually sent when the kernel detects some errors or exceptions. For example, when a process accesses an invalid memory address or executes an invalid instruction, the kernel sends a **SIGSEGV** signal to the process.|
| -1 | SI_QUEUE | The **sigqueue()** function.|This signal is sent by **sigqueue()** and can carry an additional integer value or a pointer. It is usually used for advanced communication between processes, such as transferring data or notifying a process that an event occurs.|
| -2 | SI_TIMER | Timer.|This signal is sent by a timer and is usually used to execute a scheduled task or a periodic task. For example, when a timer expires, the kernel sends a **SIGALRM** signal to the process.|
| -3 | SI_MESGQ | Message queue.|This signal is sent when the state of a message queue changes and is usually used for communication across processes. For example, when a message arrives on an empty queue for which a process has registered a notification by using **mq_notify()**, the kernel sends the registered signal to that process.|
| -4 | SI_ASYNCIO | Asynchronous I/O.|This signal is sent when an asynchronous I/O operation is complete. For example, when an asynchronous read or write request on a file descriptor finishes, the signal specified in the request is sent to the process.|
| -5 | SI_SIGIO | Queued **SIGIO**.|This signal is sent by a queued **SIGIO** and is usually used for signal-driven I/O. For example, when I/O becomes possible on a file descriptor, the kernel sends a **SIGIO** signal to the process.|
| -6 | SI_TKILL | The **tkill()** function.|This signal is sent by **tkill()** or **tgkill()**, which are similar to **kill()** but let you specify the ID of the thread that receives the signal. It is usually used to send a signal to a specified thread in a multithreaded program.|

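The cause-based codes make it possible to tell whether a signal was raised by the kernel or sent by a process. The following sketch uses the `SI_*` constants from `<signal.h>` on Linux. `SignalOrigin` and `SentByProcess` are hypothetical helpers, and the `code <= 0` test relies on the Linux convention that non-positive `si_code` values mark signals sent from user space. For example, `SentByProcess(SI_TKILL)` is true, while `SentByProcess(SI_KERNEL)` is false.

```cpp
#include <cassert>
#include <cstring>
#include <signal.h>

// Returns a short description of who sent the signal, based on si_code.
const char* SignalOrigin(int code)
{
    switch (code) {
        case SI_USER:    return "kill()";
        case SI_KERNEL:  return "kernel";
        case SI_QUEUE:   return "sigqueue()";
        case SI_TIMER:   return "timer";
        case SI_MESGQ:   return "message queue";
        case SI_ASYNCIO: return "asynchronous I/O";
        case SI_SIGIO:   return "queued SIGIO";
        case SI_TKILL:   return "tkill()/tgkill()";
        default:
            // Positive values are signal-specific codes such as SEGV_MAPERR.
            return (code > 0) ? "signal-specific code" : "unknown";
    }
}

// Linux convention: si_code <= 0 means the signal was sent by a process.
bool SentByProcess(int code)
{
    return code <= 0;
}
```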
## Fault Analysis

### Crash Log Collection

The process crash log is managed together with the app freeze and JS crash logs by the FaultLogger module. You can obtain process crash logs using any of the following methods:

- Method 1: DevEco Studio

    DevEco Studio collects process crash logs from **/data/log/faultlog/faultlogger/** to FaultLog, where logs are displayed by process name, fault, and time. For details about how to obtain logs, see <!--RP1-->[DevEco Studio User Guide-FaultLog](https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/ide-fault-log-V5)<!--RP1End-->.

- Method 2: hiAppEvent APIs

    hiAppEvent provides APIs to subscribe to various fault logs. For details, see [Introduction to HiAppEvent](hiappevent-intro.md).

<!--Del-->
- Method 3: Shell

    - When a process crashes, you can find fault logs in **/data/log/faultlog/temp/** on the device. The log files are named in the format of **cppcrash-process PID-timestamp (millisecond)**. They contain information such as the process crash call stack, process crash register, stack memory, maps, and process file handle list.

        ![cppcrash-temp-log](figures/20230407111853.png)

        The fault logs obtained using Shell in **/data/log/faultlog/temp** are as follows:

        ```text
        Timestamp:2024-05-06 20:10:51.000  <- Timestamp when the fault occurs
        Pid:9623                           <- Process ID
        Uid:0                              <- User ID
        Process name:./crasher_cpp         <- Process name
        Process life time:1s               <- Process life time
        Reason:Signal:SIGSEGV(SEGV_MAPERR)@0x00000004 probably caused by NULL pointer dereference  <- Fault cause and null pointer prompt
        Fault thread info:
        Tid:9623, Name:crasher_cpp         <- Thread ID, thread name
        #00 pc 00008d22 /system/bin/crasher_cpp(TestNullPointerDereferenceCrash0()+22)(adfc673300571d2da1e47d1d12f48b44)  <- Call stack
        #01 pc 000064d1 /system/bin/crasher_cpp(DfxCrasher::ParseAndDoCrash(char const*) const+160)(adfc673300571d2da1e47d1d12f48b44)
        #02 pc 00006569 /system/bin/crasher_cpp(main+92)(adfc673300571d2da1e47d1d12f48b44)
        #03 pc 00072b98 /system/lib/ld-musl-arm.so.1(libc_start_main_stage2+56)(d820b1827e57855d4f9ed03ba5dfea83)
        #04 pc 00004e28 /system/bin/crasher_cpp(_start_c+84)(adfc673300571d2da1e47d1d12f48b44)
        #05 pc 00004dcc /system/bin/crasher_cpp(adfc673300571d2da1e47d1d12f48b44)
        Registers:   <- Fault registers
        r0:ffffafd2 r1:00000004 r2:00000001 r3:00000000
        r4:ffd27e39 r5:0096e000 r6:00000a40 r7:0096fdfc
        r8:f7ba58d5 r9:f7baea86 r10:f7cadd38
        fp:ffd27308 ip:f7cb2078 sp:ffd272a0 lr:f7c7ab98 pc:0096ad22
        Memory near registers:  <- Memory near fault registers
        r4([stack]):
            ffd27e30 72656873
            ffd27e34 7070635f
            ...
            ffd27eac 3d73746f
        r5(/system/bin/crasher_cpp):
            0096dff8 00000000
            0096dffc 0096717d
            ...
            0096e074 00000000
        r7(/system/lib/ld-musl-arm.so.1):
            f7cabb58 00000000
            f7cabb5c 0034ba00
            ...
            f7cabbd4 00000000
        r8(/system/lib/ld-musl-arm.so.1):
            f7ba58cc 63637573
            f7ba58d0 2e737365
            ...
            f7ba5948 70206269
        r9(/system/lib/ld-musl-arm.so.1):
            f7baea7c 20746f6e
            f7baea80 6e756f66
            ...
            f7baeaf8 25206e69
        r10([anon:ld-musl-arm.so.1.bss]):
            f7cadd30 00000000
            f7cadd34 00000000
            ...
            f7caddac 00000000
        r12([anon:ld-musl-arm.so.1.bss]):
            f7cb2070 56726562
            f7cb2074 65756c61
            ...
            f7cb20ec 00000000
        sp([stack]):
            ffd27328 00000000
            ffd2732c 00966dd0
            ...
            ffd273a4 00000004
        pc(/system/bin/crasher_cpp):
            00966dc8 e1a0d00c
            00966dcc eb000000
            ...
            00966e44 e5907008
        pc(/system/bin/crasher_cpp):
            00966dc8 e1a0d00c
            00966dcc eb000000
            ...
            00966e44 e5907008
        FaultStack:  <- Stack of the crashed thread
            ffd27260 00000000
            ffd27264 f7cac628
            ...
            ffd2729c 0096ad1f
        sp0:ffd272a0 0096fdfc <- #00Stack top
            ffd272a4 009684d3
        sp1:ffd272a8 00000001
            ffd272ac 73657408
            ffd272b0 f7590074
            ...
            ffd272dc 0096856d
        sp2:ffd272e0 ffd27334
            ffd272e4 ffd27334
            ffd272e8 00000002
            ....
            ffd272f4 f7bfbb9c
        sp3:ffd272f8 00000000
            ffd272fc ffd27334

        Maps:   <- Process maps files when the fault occurs
        962000-966000 r--p 00000000 /system/bin/crasher_cpp
        966000-96c000 r-xp 00003000 /system/bin/crasher_cpp
        96c000-96f000 r--p 00008000 /system/bin/crasher_cpp
        96f000-970000 rw-p 0000a000 /system/bin/crasher_cpp
        149f000-14a0000 ---p 00000000 [heap]
        14a0000-14a2000 rw-p 00000000 [heap]
        ...
        f7b89000-f7be1000 r--p 00000000 /system/lib/ld-musl-arm.so.1
        f7be1000-f7ca9000 r-xp 00057000 /system/lib/ld-musl-arm.so.1
        f7ca9000-f7cab000 r--p 0011e000 /system/lib/ld-musl-arm.so.1
        f7cab000-f7cad000 rw-p 0011f000 /system/lib/ld-musl-arm.so.1
        f7cad000-f7cbc000 rw-p 00000000 [anon:ld-musl-arm.so.1.bss]
        ffd07000-ffd28000 rw-p 00000000 [stack]
        ffff0000-ffff1000 r-xp 00000000 [vectors]
        OpenFiles:   <- FD information of the file opened by the process when the fault occurs
        0->/dev/pts/1 native object of unknown type 0
        1->/dev/pts/1 native object of unknown type 0
        2->/dev/pts/1 native object of unknown type 0
        3->socket:[67214] native object of unknown type 0
        ...
        11->pipe:[67219] native object of unknown type 0
        12->socket:[29074] native object of unknown type 0
        25->/dev/ptmx native object of unknown type 0
        26->/dev/ptmx native object of unknown type 0
        ```

    - You can find more comprehensive fault logs in **/data/log/faultlog/faultlogger/**, which include information such as the device name, system version, and process logs. The log files are named in the format of **cppcrash-process name-process UID-time (millisecond).log**.

        ![cppcrash-faultlogger-log](figures/cppcrash_image_023.png)
<!--DelEnd-->

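When collecting logs with a script, the file name of a temporary log can be split into its process ID and timestamp. The following sketch assumes that the **cppcrash-process PID-timestamp (millisecond)** format expands to names such as `cppcrash-9623-1715026251000`. That concrete name, `TempLogName`, and `ParseTempLogName` are hypothetical and only illustrate the naming rule described above. The function returns false for names that do not match the pattern.

```cpp
#include <cassert>
#include <cstdint>
#include <regex>
#include <string>

// Fields encoded in a temporary crash log file name (assumed layout).
struct TempLogName {
    int pid = 0;
    std::uint64_t timestampMs = 0;
};

// Parses "cppcrash-<pid>-<timestamp in ms>". Returns false if the name does not match.
bool ParseTempLogName(const std::string& name, TempLogName& out)
{
    // Digit counts are bounded so that the numeric conversions cannot overflow.
    static const std::regex pattern(R"(cppcrash-(\d{1,9})-(\d{1,18}))");
    std::smatch match;
    if (!std::regex_match(name, match, pattern)) {
        return false;
    }
    out.pid = std::stoi(match[1].str());
    out.timestampMs = std::stoull(match[2].str());
    return true;
}
```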
**Fault Logs of Null Pointer**

In this scenario, a message is printed in the log, indicating that the fault may be caused by a null pointer dereference. The following is an example process crash log archived by DevEco Studio in FaultLog:

```text
Generated by HiviewDFX@OpenHarmony
================================================================
Device info:OpenHarmony 3.2        <- Device information
Build info:OpenHarmony 5.0.0.23    <- Build information
Fingerprint:cdf52fd0cc328fc432459928f3ed8edfe8a72a92ee7316445143bed179138073 <- Fingerprint
Module name:crasher_cpp            <-Module name
Timestamp:2024-05-06 20:10:51.000  <- Timestamp when the fault occurs
Pid:9623   <- Process ID
Uid:0         <- User ID
Process name:./crasher_cpp         <- Process name
Process life time:1s               <- Process life time
Reason:Signal:SIGSEGV(SEGV_MAPERR)@0x00000004 probably caused by NULL pointer dereference  <- Fault cause and null pointer prompt
Fault thread info:
Tid:9623, Name:crasher_cpp         <- Thread ID, thread name
#00 pc 00008d22 /system/bin/crasher_cpp(TestNullPointerDereferenceCrash0()+22)(adfc673300571d2da1e47d1d12f48b44)  <- Call stack
#01 pc 000064d1 /system/bin/crasher_cpp(DfxCrasher::ParseAndDoCrash(char const*) const+160)(adfc673300571d2da1e47d1d12f48b44)
#02 pc 00006569 /system/bin/crasher_cpp(main+92)(adfc673300571d2da1e47d1d12f48b44)
#03 pc 00072b98 /system/lib/ld-musl-arm.so.1(libc_start_main_stage2+56)(d820b1827e57855d4f9ed03ba5dfea83)
#04 pc 00004e28 /system/bin/crasher_cpp(_start_c+84)(adfc673300571d2da1e47d1d12f48b44)
#05 pc 00004dcc /system/bin/crasher_cpp(adfc673300571d2da1e47d1d12f48b44)
Registers:   <- Fault registers
r0:ffffafd2 r1:00000004 r2:00000001 r3:00000000
r4:ffd27e39 r5:0096e000 r6:00000a40 r7:0096fdfc
r8:f7ba58d5 r9:f7baea86 r10:f7cadd38
fp:ffd27308 ip:f7cb2078 sp:ffd272a0 lr:f7c7ab98 pc:0096ad22
Memory near registers:  <- Memory near fault registers
r4([stack]):
    ffd27e30 72656873
    ffd27e34 7070635f
    ...
    ffd27eac 3d73746f
r5(/system/bin/crasher_cpp):
    0096dff8 00000000
    0096dffc 0096717d
    ...
    0096e074 00000000
r7(/system/lib/ld-musl-arm.so.1):
    f7cabb58 00000000
    f7cabb5c 0034ba00
    ...
    f7cabbd4 00000000
r8(/system/lib/ld-musl-arm.so.1):
    f7ba58cc 63637573
    f7ba58d0 2e737365
    ...
    f7ba5948 70206269
r9(/system/lib/ld-musl-arm.so.1):
    f7baea7c 20746f6e
    f7baea80 6e756f66
    ...
    f7baeaf8 25206e69
r10([anon:ld-musl-arm.so.1.bss]):
    f7cadd30 00000000
    f7cadd34 00000000
    ...
    f7caddac 00000000
r12([anon:ld-musl-arm.so.1.bss]):
    f7cb2070 56726562
    f7cb2074 65756c61
    ...
    f7cb20ec 00000000
sp([stack]):
    ffd27328 00000000
    ffd2732c 00966dd0
    ...
    ffd273a4 00000004
pc(/system/bin/crasher_cpp):
    00966dc8 e1a0d00c
    00966dcc eb000000
    ...
    00966e44 e5907008
pc(/system/bin/crasher_cpp):
    00966dc8 e1a0d00c
    00966dcc eb000000
    ...
    00966e44 e5907008
FaultStack:  <- Stack of the crashed thread
    ffd27260 00000000
    ffd27264 f7cac628
    ...
    ffd2729c 0096ad1f
sp0:ffd272a0 0096fdfc <- #00Stack top
    ffd272a4 009684d3
sp1:ffd272a8 00000001
    ffd272ac 73657408
    ffd272b0 f7590074
    ...
    ffd272dc 0096856d
sp2:ffd272e0 ffd27334
    ffd272e4 ffd27334
    ffd272e8 00000002
    ....
    ffd272f4 f7bfbb9c
sp3:ffd272f8 00000000
    ffd272fc ffd27334

Maps:   <- Process maps files when the fault occurs
962000-966000 r--p 00000000 /system/bin/crasher_cpp
966000-96c000 r-xp 00003000 /system/bin/crasher_cpp
96c000-96f000 r--p 00008000 /system/bin/crasher_cpp
96f000-970000 rw-p 0000a000 /system/bin/crasher_cpp
149f000-14a0000 ---p 00000000 [heap]
14a0000-14a2000 rw-p 00000000 [heap]
...
f7b89000-f7be1000 r--p 00000000 /system/lib/ld-musl-arm.so.1
f7be1000-f7ca9000 r-xp 00057000 /system/lib/ld-musl-arm.so.1
f7ca9000-f7cab000 r--p 0011e000 /system/lib/ld-musl-arm.so.1
f7cab000-f7cad000 rw-p 0011f000 /system/lib/ld-musl-arm.so.1
f7cad000-f7cbc000 rw-p 00000000 [anon:ld-musl-arm.so.1.bss]
ffd07000-ffd28000 rw-p 00000000 [stack]
ffff0000-ffff1000 r-xp 00000000 [vectors]
OpenFiles:   <- FD information of the file opened by the process when the fault occurs
0->/dev/pts/1 native object of unknown type 0
1->/dev/pts/1 native object of unknown type 0
2->/dev/pts/1 native object of unknown type 0
3->socket:[67214] native object of unknown type 0
...
11->pipe:[67219] native object of unknown type 0
12->socket:[29074] native object of unknown type 0
25->/dev/ptmx native object of unknown type 0
26->/dev/ptmx native object of unknown type 0

HiLog:   <- HiLog logs when the fault occurs
05-06 20:10:51.301  9623  9623 E C03f00/MUSL-SIGCHAIN: signal_chain_handler call 2 rd sigchain action for signal: 11
05-06 20:10:51.306  9623  9623 I C02d11/DfxSignalHandler: DFX_SigchainHandler :: sig(11), pid(9623), tid(9623).
05-06 20:10:51.307  9623  9623 I C02d11/DfxSignalHandler: DFX_SigchainHandler :: sig(11), pid(9623), processName(./crasher_cpp), threadName(crasher_cpp).
05-06 20:10:51.389  9623  9623 I C02d11/DfxSignalHandler: processdump have get all resgs

```

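The **Reason** line packs the signal name, the code, and the fault address into a single string, which is convenient to extract when triaging many logs. The following sketch parses that line with a regular expression. `CrashReason` and `ParseReason` are hypothetical names, and the pattern is inferred only from the sample logs in this topic, where the address appears both with and without the `0x` prefix. For the null pointer sample above, it yields `SIGSEGV`, `SEGV_MAPERR`, and the address `0x4`.

```cpp
#include <cassert>
#include <cstdint>
#include <regex>
#include <string>

// Fields extracted from a "Reason:" line (layout inferred from the sample logs).
struct CrashReason {
    std::string signal;
    std::string code;
    std::uint64_t address = 0;
};

// Extracts the signal name, code name, and the hexadecimal value after '@'.
bool ParseReason(const std::string& line, CrashReason& out)
{
    static const std::regex pattern(
        R"(Reason:Signal:(\w+)\((\w+)\)@(?:0x)?([0-9a-fA-F]{1,16}))");
    std::smatch match;
    if (!std::regex_search(line, match, pattern)) {
        return false;
    }
    out.signal = match[1].str();
    out.code = match[2].str();
    out.address = std::stoull(match[3].str(), nullptr, 16);
    return true;
}
```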
**Fault Logs of Stack Overflow**

In this scenario, the log contains a prompt indicating that the fault may be caused by stack overflow. The following is an example process crash log archived by DevEco Studio in FaultLog:

```text
Generated by HiviewDFX@OpenHarmony
================================================================
Device info:OpenHarmony 3.2            <- Device information
Build info:OpenHarmony 5.0.0.23        <- Build information
Fingerprint:8bc3343f50024204e258b8dce86f41f8fcc50c4d25d56b24e71fe26c0a23e321  <- Fingerprint
Module name:crasher_cpp                <- Module name
Timestamp:2024-05-06 20:18:24.000      <- Timestamp when the fault occurs
Pid:9838                               <- Process ID
Uid:0                                  <- User ID
Process name:./crasher_cpp             <- Process name
Process life time:2s                   <- Process life time
Reason:Signal:SIGSEGV(SEGV_ACCERR)@0xf76b7ffc current thread stack low address = 0xf76b8000, probably caused by stack-buffer-overflow <- Fault cause and stack overflow prompt
...
```

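In the sample above, the fault address `0xf76b7ffc` lies 4 bytes below the reported stack low address `0xf76b8000`, which is the typical signature of a thread running off the end of its stack. The following sketch expresses that comparison as a heuristic. `LooksLikeStackOverflow` and the 4096-byte window are assumptions for illustration, not the rule that FaultLogger applies.

```cpp
#include <cassert>
#include <cstdint>

// Heuristic: the fault address sits just below the lowest valid stack address.
// guardSize is an assumed window (one 4 KiB page) below the stack.
bool LooksLikeStackOverflow(std::uint64_t faultAddr, std::uint64_t stackLow,
                            std::uint64_t guardSize = 4096)
{
    return faultAddr < stackLow && (stackLow - faultAddr) <= guardSize;
}
```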
**Fault Logs of Stack Coverage**

In the stack coverage scenario, the stack memory has been illegally overwritten, so the stack frames cannot be unwound. The log contains a message indicating that stack unwinding failed and that the system attempted to parse the thread stack to obtain an unreliable call stack, which is provided for reference during problem analysis. The following is an example process crash log archived by DevEco Studio in FaultLog:

```text
Generated by HiviewDFX@OpenHarmony
================================================================
Device info:OpenHarmony 3.2               <- Device information
Build info:OpenHarmony 5.0.0.23           <- Build information
Fingerprint:79b6d47b87495edf27135a83dda8b1b4f9b13d37bda2560d43f2cf65358cd528    <- Fingerprint
Module name:crasher_cpp                   <- Module name
Timestamp:2024-05-06 20:27:23.2035266415  <- Timestamp when the fault occurs
Pid:10026                                 <- Process ID
Uid:0                                     <- User ID
Process name:./crasher_cpp                <- Process name
Process life time:1s                      <- Process life time
Reason:Signal:SIGSEGV(SEGV_MAPERR)@0000000000 probably caused by NULL pointer dereference  <- Fault cause
Fault thread info:
Tid:10026, Name:crasher_cpp               <- Thread ID, thread name
#00 pc 00000000 Not mapped
#01 pc 00008d22 /system/bin/crasher_cpp(TestNullPointerDereferenceCrash0()+22)(adfc673300571d2da1e47d1d12f48b44)  <- Call stack
#02 pc 000064d1 /system/bin/crasher_cpp(DfxCrasher::ParseAndDoCrash(char const*) const+160)(adfc673300571d2da1e47d1d12f48b44)
#03 pc 00006569 /system/bin/crasher_cpp(main+92)(adfc673300571d2da1e47d1d12f48b44)
#04 pc 00072b98 /system/lib/ld-musl-arm.so.1(libc_start_main_stage2+56)(d820b1827e57855d4f9ed03ba5dfea83)
Registers:   <- Fault registers
r0:ffffafd2 r1:00000004 r2:00000001 r3:00000000
r4:ffd27e39 r5:0096e000 r6:00000a40 r7:0096fdfc
r8:f7ba58d5 r9:f7baea86 r10:f7cadd38
fp:ffd27308 ip:f7cb2078 sp:ffd272a0 lr:f7c7ab98 pc:0096ad22
ExtraCrashInfo(Unwindstack):   <- Print the custom stack information about the system framework service.
Failed to unwind stack, try to get unreliable call stack from #02 by reparsing thread stack   <- Attempt to obtain an unreliable stack from the thread stack
...
```

**Fault Logs of Asynchronous Thread**

When an asynchronous thread crashes, the stack of the thread that submitted the asynchronous task is also printed to help locate the crash. Currently, this feature is supported on the ARM64 architecture for debuggable applications (**HAP_DEBUGGABLE**). The **SubmitterStacktrace** separator distinguishes the call stack of the crashed thread from that of the submitting thread. The following is an example process crash log archived by DevEco Studio in FaultLog:

```text
Generated by HiviewDFX@OpenHarmony
================================================================
Device info:OpenHarmony 3.2                 <- Device information
Build info:OpenHarmony 5.0.0.23             <- Build information
Fingerprint:8bc3343f50024204e258b8dce86f41f8fcc50c4d25d56b24e71fe26c0a23e321  <- Fingerprint
Module name:crasher_cpp                     <- Module name
Timestamp:2024-05-06 20:28:24.000           <- Timestamp when the fault occurs
Pid:9838                                    <- Process ID
Uid:0                                       <- User ID
Process name:./crasher_cpp                  <- Process name
Process life time:2s                        <- Process life time
Reason:Signal:SIGSEGV(SI_TKILL)@0x000000000004750 from:18256:0   <- Fault cause
Fault thread info:
Tid:18257, Name:crasher_cpp                 <- Thread ID, thread name
#00 pc 000054e6 /system/bin/ld-musl-aarch64.so.l(raise+228)(adfc673300571d2da1e47d1d12f48b44) <- Call stack
#01 pc 000054f9 /system/bin/crasher_cpp(CrashInSubThread(void*)+56)(adfc673300571d2da1e47d1d12f48b50)
#02 pc 000054f9 /system/bin/ld-musl-aarch64.so.l(start+236)(adfc673300571d2da1e47d1d12f48b44)
========SubmitterStacktrace========       <- Call stack of the submitting thread
#00 pc 000094dc /system/bin/crasher_cpp(DfxCrasher::AsyncStacktrace()+36)(adfc673300571d2da1e47d1d12f48b50)
#01 pc 00009a58 /system/bin/crasher_cpp(DfxCrasher::ParseAndDoCrash(char const*) const+232)(adfc673300571d2da1e47d1d12f48b50)
#02 pc 00009b40 /system/bin/crasher_cpp(main+140)(adfc673300571d2da1e47d1d12f48b50)
#03 pc 0000a4e1c /system/bin/ld-musl-aarch64.so.l(libc_start_main_stage2+68)(adfc673300571d2da1e47d1d12f48b44)
...
```

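When post-processing such a log, the frames of the crashed thread and those of the submitting thread can be separated at the **SubmitterStacktrace** line. The following sketch does this for log lines that have already been read into memory. `SplitStacks` and `SplitAsyncStack` are hypothetical names, and the sketch assumes that frame lines start with `#`, as in the sample above. Lines that are neither frames nor the separator are ignored.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Frames of the crashed thread and of the thread that submitted the task.
struct SplitStacks {
    std::vector<std::string> crashFrames;
    std::vector<std::string> submitterFrames;
};

// Splits stack frame lines at the "SubmitterStacktrace" separator line.
SplitStacks SplitAsyncStack(const std::vector<std::string>& lines)
{
    SplitStacks result;
    bool inSubmitter = false;
    for (const std::string& line : lines) {
        if (line.find("SubmitterStacktrace") != std::string::npos) {
            inSubmitter = true;  // Everything after this line belongs to the submitter.
            continue;
        }
        if (line.empty() || line[0] != '#') {
            continue;  // Skip headers, registers, and other non-frame lines.
        }
        if (inSubmitter) {
            result.submitterFrames.push_back(line);
        } else {
            result.crashFrames.push_back(line);
        }
    }
    return result;
}
```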
**Logs of Custom Information About System Framework Services**

When a process crashes, custom diagnostic information of the system framework service can be printed to help you locate faults. The information can be of the string, memory, callback, or stack type. Currently, the ARM64 architecture is supported. Since API 18, the **LastFatalMessage** field carries only the last fatal-level log printed by using HiLog or the last message set by using the **set_fatal_message** API of libc before the process crashes. The callback type information and stack type information are moved from the **LastFatalMessage** field to the **ExtraCrashInfo** (Callback) and **ExtraCrashInfo** (Unwindstack) fields, respectively. The following is the core content of the process crash logs archived by DevEco Studio in FaultLog, which covers the four types of custom information about system framework services.

- String information:

    ```text
    Generated by HiviewDFX@OpenHarmony
    ================================================================
    Device info:OpenHarmony 3.2        <- Device information
    Build info:OpenHarmony 5.0.0.23    <- Build information
    Fingerprint:cdf52fd0cc328fc432459928f3ed8edfe8a72a92ee7316445143bed179138073 <- Fingerprint
    Module name:crasher_cpp            <-Module name
    Timestamp:2024-05-06 20:10:51.000  <- Timestamp when the fault occurs
    Pid:9623   <- Process ID
    Uid:0         <- User ID
    Process name:./crasher_cpp         <- Process name
    Process life time:1s               <- Process life time
    Reason:Signal:SIGSEGV(SEGV_MAPERR)@0x00000004 probably caused by NULL pointer dereference  <- Fault cause and null pointer prompt
    Fault thread info:
    Tid:9623, Name:crasher_cpp         <- Thread ID, thread name
    #00 pc 00008d22 /system/bin/crasher_cpp(TestNullPointerDereferenceCrash0()+22)(adfc673300571d2da1e47d1d12f48b44)  <- Call stack
    #01 pc 000064d1 /system/bin/crasher_cpp(DfxCrasher::ParseAndDoCrash(char const*) const+160)(adfc673300571d2da1e47d1d12f48b44)
    #02 pc 00006569 /system/bin/crasher_cpp(main+92)(adfc673300571d2da1e47d1d12f48b44)
    #03 pc 00072b98 /system/lib/ld-musl-arm.so.1(libc_start_main_stage2+56)(d820b1827e57855d4f9ed03ba5dfea83)
    #04 pc 00004e28 /system/bin/crasher_cpp(_start_c+84)(adfc673300571d2da1e47d1d12f48b44)
    #05 pc 00004dcc /system/bin/crasher_cpp(adfc673300571d2da1e47d1d12f48b44)
    Registers:   <- Fault registers
    r0:ffffafd2 r1:00000004 r2:00000001 r3:00000000
    r4:ffd27e39 r5:0096e000 r6:00000a40 r7:0096fdfc
    r8:f7ba58d5 r9:f7baea86 r10:f7cadd38
    fp:ffd27308 ip:f7cb2078 sp:ffd272a0 lr:f7c7ab98 pc:0096ad22
    ExtraCrashInfo(String):   <- Print custom string information about the system framework service
    test get CrashObject.
    ...
    ```

- Memory information:

    ```text
    ...
    ExtraCrashInfo(Memory start address 0000xxxx):   <- Print custom memory information about the system framework service
    +0x000: xxxxx   xxxxx    xxxxx     xxxxx         <- Print the memory value from 0x000 to 0x018.
    +0x020: xxxxx   xxxxx    xxxxx     xxxxx         <- Print the memory value from 0x020 to 0x038.
    ...
    ```

- Callback information:

    From API 18, the callback information is moved from the **LastFatalMessage** field to the **ExtraCrashInfo(Callback)** field.

    ```text
    ...
    ExtraCrashInfo(Callback):   <- Print custom callback information about the system framework service.
    test get callback information.
    ...
    ```

- Stack information:

    From API 18, the stack information is moved from the **LastFatalMessage** field to the **ExtraCrashInfo(Unwindstack)** field.

    ```text
    ...
    ExtraCrashInfo(Unwindstack):   <- Print the custom stack information about the system framework service.
    Failed to unwind stack, try to get unreliable call stack from #02 by reparsing thread stack
    ...
    ```

> **NOTE**
>
> The omitted parts are similar to those in the string information example.

### Locating the Problematic Code Based on the Crash Stack

#### Method 1: DevEco Studio

In application development, you can locate the problematic code from the cppcrash stack of a dynamic library directly in DevEco Studio. Both native stack frames and JS stack frames are supported. For stack frames that DevEco Studio fails to parse and locate, refer to Method 2.

![cppcrash-addr2line1](figures/cppcrash_image_002.png)

#### Method 2: SDK llvm-addr2line

- Obtain the symbol list.
    Obtain the .so file with symbols for each library in the crash stack. It must come from the same build as the one used by the application or system.
    When a dynamic library is compiled and built in DevEco Studio, the .so file with symbols is generated by default in **/build/default/intermediates/libs**. You can run the Linux **file** command to check whether the BuildIDs of two .so files match. Generated at build time, BuildID is the unique identifier of a binary file. In the command output, "not stripped" indicates that a symbol table is included.

    ```text
    $ file libbabel.so
    libbabel.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, BuildID[sha1]=fdb1b5432b9ea4e2a3d29780c3abf30e2a22da9d, with debug_info, not stripped
    ```

    **Note**: The symbol table of the system dynamic library is archived with the version.

- Locate the line number using llvm-addr2line.
    You can find llvm-addr2line in **[SDK DIR PATH]\OpenHarmony\11\native\llvm\bin**. The path varies with the SDK version, so you may need to search for it.
    The sample stack is as follows (some frames are omitted):

    ```text
    Generated by HiviewDFX@OpenHarmony
    ================================================================
    Device info:OpenHarmony 3.2
    Build info:OpenHarmony 5.0.0.22
    Fingerprint:50577c0a1a1b5644ac030ba8f08c241cca0092026b59f29e7b142d5d4d5bb934
    Module name:com.samples.recovery
    Version:1.0.0
    VersionCode:1000000
    PreInstalled:No
    Foreground:No
    Timestamp:2017-08-05 17:03:40.000
    Pid:2396
    Uid:20010044
    Process name:com.samples.recovery
    Process life time:7s
    Reason:Signal:SIGSEGV(SEGV_MAPERR)@0000000000  probably caused by NULL pointer dereference
    Tid:2396, Name:amples.recovery
    #00 pc 00003510 /data/storage/el1/bundle/libs/arm/libentry.so(TriggerCrash(napi_env__*, napi_callback_info__*)+24)(446ff75d3f6a518172cc52e8f8055650b02b0e54)
    #01 pc 0002b0c5 /system/lib/platformsdk/libace_napi.z.so(panda::JSValueRef ArkNativeFunctionCallBack<true>(panda::JsiRuntimeCallInfo*)+448)(a84fbb767fd826946623779c608395bf)
    #02 pc 001e7597 /system/lib/platformsdk/libark_jsruntime.so(panda::ecmascript::EcmaInterpreter::RunInternal(panda::ecmascript::JSThread*, unsigned char const*, unsigned long long*)+14710)(106c552f6ce4420b9feac95e8b21b792)
    #03 pc 001e0439 /system/lib/platformsdk/libark_jsruntime.so(panda::ecmascript::EcmaInterpreter::Execute(panda::ecmascript::EcmaRuntimeCallInfo*)+984)(106c552f6ce4420b9feac95e8b21b792)
    ...
    #39 pc 00072998 /system/lib/ld-musl-arm.so.1(libc_start_main_stage2+56)(5b1e036c4f1369ecfdbb7a96aec31155)
    #40 pc 00005b48 /system/bin/appspawn(_start_c+84)(cb0631260fa74df0bc9b0323e30ca03d)
    #41 pc 00005aec /system/bin/appspawn(cb0631260fa74df0bc9b0323e30ca03d)
    Registers:
    r0:00000000 r1:ffc47af8 r2:00000001 r3:f6555c94
    r4:00000000 r5:f4d90f64 r6:bd8434f8 r7:00000000
    r8:00000000 r9:ffc48808 r10:ffc47b70
    fp:f7d8a5a0 ip:00000000 sp:ffc47aac lr:f4d6b0c7 pc:bd843510
    ```

    Parsed by SDK llvm-addr2line, the line number of the problematic code is as follows:

    ```text
    [SDK DIR PATH]\OpenHarmony\11\native\llvm\bin> .\llvm-addr2line.exe -Cfie libentry.so 3510
    TriggerCrash(napi_env__*, napi_callback_info__*)
    D:/code/apprecovery-demo/entry/src/main/cpp/hello.cpp:48
    ```

    You can use the **llvm-addr2line.exe -fCpie libutils.z.so offset** command to parse the stack frame by frame. If there are multiple offsets, you can parse them together using the **llvm-addr2line.exe -fCpie libxxx.so 0x1bc868 0x1be28c xxx** command. If the obtained line number does not seem correct, you can adjust the address (for example, subtract 1) or disable some compilation optimizations.

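The lookup described in Method 2 can also be scripted. The following C++ code is a minimal sketch, assuming each native frame has the form `#NN pc <offset> <path>(...)` as in the sample stack. The `Frame` struct and the `ParseFrame` and `BuildCommand` functions are hypothetical names introduced for illustration and are not part of the SDK. The sketch only builds the command string and does not run it.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper type: the parts of a native frame needed for symbolization.
struct Frame {
    std::string offset;   // hexadecimal pc offset inside the library, for example "00003510"
    std::string library;  // file name of the library, for example "libentry.so"
};

// Parses a native frame line. Returns false for anything else, such as JS frames.
bool ParseFrame(const std::string& line, Frame& out)
{
    std::istringstream in(line);
    std::string index;
    std::string tag;
    std::string offset;
    std::string rest;
    if (!(in >> index) || index[0] != '#') {
        return false;
    }
    if (index == "#") {
        in >> index;  // tolerate an index written as "# 00"
    }
    if (!(in >> tag >> offset >> rest) || tag != "pc") {
        return false;
    }
    // The path ends at the first '(' (start of the symbol name or build ID).
    const std::string path = rest.substr(0, rest.find('('));
    // Keep only the file name; the symbols come from a local unstripped copy.
    const std::string::size_type slash = path.find_last_of('/');
    out.offset = offset;
    out.library = (slash == std::string::npos) ? path : path.substr(slash + 1);
    return !out.library.empty();
}

// Builds the llvm-addr2line command line for one frame.
std::string BuildCommand(const Frame& frame)
{
    assert(!frame.library.empty() && !frame.offset.empty());
    return "llvm-addr2line -fCpie " + frame.library + " 0x" + frame.offset;
}
```

For frame #00 of the sample stack, `BuildCommand` produces `llvm-addr2line -fCpie libentry.so 0x00003510`. JS frames, which contain `at` instead of `pc`, are rejected by `ParseFrame` because they already carry source positions.
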
#### Method 3: DevEco Studio hstack

hstack is a tool provided by DevEco Studio for restoring the crash stack of an obfuscated release app to the source code stack. It runs on Windows, macOS, and Linux. For details, see [DevEco Studio hstack User Guide](https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/ide-command-line-hstack-V5).

### Reviewing Code Based on Services

After the line number of the stack top is obtained, review the surrounding code. As shown in the following figure, line 48 in the **hello.cpp** file dereferences a null pointer.

![cppcrash-demo1](figures/cppcrash_image_004.png)

This example is constructed. Actual scenarios are usually more complicated and need to be analyzed based on services.

### Disassembling (optional)

Generally, if the problem is clear, locating the code line is enough to find the cause. In a few cases, for example, when the method called in that line takes multiple parameters and the parameters involve structs, you need to use disassembly for further analysis.

Case

The header information of the CPPCRASH log is as follows:

```text
Process name:com.ohos.medialibrary.medialibrarydata
Process life time:13402s
Reason:SIGSEGV(SEGV_MAPERR)@0x0000005b3b46c000
Fault thread info:
Tid:48552, Name:UpradeTask
#00 pc 00000000000a87e4 /system/lib/ld-musl-aarch64.so.1(memcpy+356)(3c3e7fb27680dc2ee99aa08dd0f81e85)
...
```

Procedure:

1. Obtain the corresponding assembly instruction based on the PC register address and determine the current operation from the assembly instruction.

    Obtain the PC address at the top of the stack from the CPPCRASH log file and disassemble the corresponding ELF file (using the unstripped .so file and the **llvm-objdump -d -l xxx.so** command).

    For example, when a **data_abort** issue occurs during the execution of the instruction at address **00000000000a87e4**, disassemble the libc.so file corresponding to the build ID **3c3e7fb27680dc2ee99aa08dd0f81e85**.

    Check the disassembled code at offset **a87e4**:

    ```text
    xxx/../../third_party/optimized-routines/string/aarch64/memcpy.S:175
    a87e4:   a94371aa         ldp x10, x11, [x1, #48]
    ```

    Check line 175 of the **memcpy.S** source file:

    ```text
    L(loop64):
    line 170   stp A_l, A_h, [dst, 16]
    line 171   ldp A_l, A_h, [src, 16]
    line 172   stp B_l, B_h, [dst, 32]
    line 173   ldp B_l, B_h, [src, 32]
    line 174   stp C_l, C_h, [dst, 48]
    line 175   ldp C_l, C_h, [src, 48]      ---->  Instruction in the crash
    line 176   stp D_l, D_h, [dst, 64]
    line 177   ldp D_l, D_h, [src, 64]
    line 178   subs count, count, 64
    line 179   b.hi L(loop64)
    ```

2. Infer the code object of the current operation based on the register values and context.

    Generally, register **x0** holds the first parameter of the function, **x1** the second, **x2** the third, and so on. If the method is a class method, **x0** holds the address of the object, and the parameters start from **x1**. Note that on AArch64 only the first eight integer or pointer parameters are passed in registers (**x0** to **x7**). Any further parameters are passed on the stack.

    In **void* memcpy(void* restrict dest, const void* restrict src, size_t n)** at the stack top, **x0** indicates the destination address **dest**, **x1** indicates the source address **src**, and **x2** indicates the number of bytes to copy.

    Obtain the values of these three registers from the CPPCRASH log file. The faulting access address **0x0000005b3b46c000** is close to the value of **x1**, so the faulty parameter is **src**.

    ```text
    Register:
    x0:000005b50c3e3c4 x1:000005b3b46bfcc x2:0000000000007e88 x3:000005b50c42380
    ...
    ```

3. Determine the fault type of the code object.

    Check **Memory near registers** in the CPPCRASH log.

    ```text
    x1(/data/medialibrary/database/kvdb/3ddb6fb8b2fcb38d2f431e86bfb806dab771637860d6e86bb9430fa15df04248/single_ver/main/gen_natural_st):
        0000005b21bb1fb8 8067d0f2e727f00a
        0000005b21bb1fc0 1b10e1e9a1079f7a
        0000005b21bb1fc8 83906d9c18cdb9c1
        0000005b21bb1fd0 627dd75ab9335eb0
        0000005b21bb1fd8 aabe2bb1b00f2c03
        0000005b21bb1fe0 f981e4acb716cbc1
        0000005b21bb1fe8 806b3d5730d281ee
        0000005b21bb1ff0 3e99fedbc0a9b5e9
        0000005b21bb1ff8 a91ab9d327969682
        0000005b21bb2000 ffffffffffffffff       -----> Out-of-bounds read
        0000005b21bb2008 ffffffffffffffff
        0000005b21bb2010 ffffffffffffffff
        0000005b21bb2018 ffffffffffffffff
        0000005b21bb2020 ffffffffffffffff
        0000005b21bb2028 ffffffffffffffff
        0000005b21bb2030 ffffffffffffffff
    ```

    According to the log, an out-of-bounds read occurs. The faulty parameters are **buf** and **bufSize** passed to **memcpy**.

    In this case, you only need to analyze the logic of the parameters passed in when **memcpy** is called in the code.

4. Track the parameter source of the problematic object and locate the problem based on the code and logs.

    Method 1: Check whether the parameter object and range are valid. For example, check whether the actual size of **buf** matches the input **bufSize**.

    Method 2: Check whether the lifecycle of the parameter object is valid. For example, check whether **buf** has been released and whether memory corruption occurs due to multi-thread operations.

    Method 3: Follow the parameter object through the function context and check for improper operations on it. For example, trace the operations on **buf** and **bufSize**, add debugging information, and locate the improper operation logic.

    Code snippet:

    ```text
    static StatusInter xxxFunc(..., const uint8_t *buf, uint32_t bufSize)
    ...
    uint32_t srcSize = bufSize;
    uint32_t srcOffset = cache->appendOffset - bufSize;
    errno_t ret = memcpy_s(cache->buffer + srcOffset, srcSize, buf, bufSize);
    if (ret != EOK) {
        return MEMORY_OPERATE_FAILED_INTER;
    }
    ...
    ```

    By continuously tracing the sources of **buf** and **bufSize**, it is found that after continuous copying, **bufSize** becomes greater than the actual size of **buf**, causing an out-of-bounds read.

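The inference in step 2 can be checked with simple address arithmetic. The following C++ code is a minimal sketch, assuming the register values are read as printed in the log (**x1** = 0x5b3b46bfcc, **x0** = 0x5b50c3e3c4) and that the `ldp` instruction at line 175 loads 16 bytes from `[x1, #48]`. `MemAccess` and `Touches` are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one memory access:
// the bytes in [base + offset, base + offset + size).
struct MemAccess {
    std::uint64_t base;    // register value used as the base address
    std::uint64_t offset;  // immediate offset of the instruction
    std::uint64_t size;    // number of bytes read or written
};

// Returns true if the faulting address lies inside the accessed range.
bool Touches(const MemAccess& access, std::uint64_t faultAddr)
{
    assert(access.size > 0);
    const std::uint64_t start = access.base + access.offset;
    return faultAddr >= start && faultAddr - start < access.size;
}
```

With these values, the load through **x1** covers 0x5b3b46bffc to 0x5b3b46c00b, which contains the faulting address 0x0000005b3b46c000, whereas the store through **x0** at the same offset does not. This is consistent with the conclusion that **src** is the faulty parameter.
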
### Common CppCrash Faults and Causes

- Null pointer dereference.
    When a crash log is in the format **SIGSEGV(SEGV_MAPERR)@0x00000000**, or the values of input parameter registers such as **r0** and **r1** printed in **Registers** are **0**, check whether a null pointer is passed in when the method is invoked.
    When a crash log is in the format **SIGSEGV(SEGV_MAPERR)@0x0000000c**, or the value of an input parameter register such as **r1** printed in **Registers** is small, check whether a member of a struct is accessed through a null pointer.
- SIGABRT.
    Generally, this fault is triggered by the user, framework, or C library, and you can locate the problematic code in the first frame of the framework library. In this case, check whether resources such as threads and file descriptors are properly used, and whether the invoking sequence of APIs is correct.
- SIGSEGV.
  - Containers in the C++ standard library are not thread-safe. If elements are added to or deleted from a container on multiple threads at the same time, a **SIGSEGV** crash may occur. If the code located by **llvm-addr2line** involves operations on containers, this could be the reason for the crash.
  - If the lifecycle of a pointer does not match that of the object it points to, for example, when a raw pointer is used to store an object managed by **sptr** or **shared_ptr**, memory leaks and dangling pointers may occur. A raw pointer is a plain pointer to a memory address, without encapsulation or automatic memory management. The memory it points to is not protected or managed. A raw pointer can directly access that memory, but problems such as memory leaks and null pointer dereferences may also occur. Therefore, when using a raw pointer, pay attention to potential security problems. You are advised to use smart pointers to manage memory.
- Use after free.
    This fault occurs when a pointer or reference to released memory, such as a stack variable that has gone out of scope, is not invalidated and continues to be accessed.

    ```text
    #include <iostream>

    int& getStackReference() {
        int x = 5;
        return x; // Return the reference to x.
    }

    int main() {
        int& ref = getStackReference(); // Obtain the reference to x.
        // x is released when getStackReference() returns.
        // ref is now a dangling reference. If you continue to access it, undefined behavior occurs.
        std::cout << ref << std::endl; // Reading x through ref is undefined behavior.
        return 0;
    }
    ```

- Stack overflow occurs in recursive invocation, mutual invocation of destructors, and the use of large stack memory blocks in special stacks (signal stacks).

    ```text
    #include <iostream>

    class RecursiveClass {
    public:
        RecursiveClass() {
            std::cout << "Constructing RecursiveClass" << std::endl;
        }

        ~RecursiveClass() {
            std::cout << "Destructing RecursiveClass" << std::endl;
            // Recursive invocation of a destructor.
            RecursiveClass obj;
        }
    };

    int main() {
        RecursiveClass obj;
        return 0;
    }
    ```

    When a **RecursiveClass** object is created, its constructor is called. When the object is destroyed, its destructor is called. The destructor creates a new **RecursiveClass** object, whose destruction calls the destructor again. This recursion never ends, so the stack space is used up and the application crashes.
- Binary mismatch usually indicates a mismatch of the Application Binary Interface (ABI). For example, when a compiled binary interface or its data structure definition does not match the ABI, a random crash stack is generated.
- Memory corruption occurs when valid memory is overwritten with invalid values through a wild pointer, which results in out-of-bounds access and data overwrite. In this case, a random crash stack is generated.
- SIGBUS (alignment) occurs when a pointer is forcibly converted to another type and the resulting address is not aligned as that type requires.
- When the length of a function name exceeds 256 bytes, the stack frame does not contain the function name.
- If the ELF file does not contain **.note.gnu.build-id**, the stack frame does not contain the **build-id** information.

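The alignment fault in the list above typically comes from casting a byte pointer to a wider type and dereferencing it. The following C++ code is a minimal sketch of one common way to avoid this, by copying the bytes instead of casting. `ReadU32` is a hypothetical helper introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reads a 32-bit value starting at an arbitrary byte offset.
// Casting (buf + offset) to uint32_t* and dereferencing it is undefined behavior
// when the address is not suitably aligned and may raise SIGBUS on some targets.
// Copying the bytes into a properly aligned local variable avoids the problem.
std::uint32_t ReadU32(const unsigned char* buf, std::size_t offset)
{
    assert(buf != nullptr);
    std::uint32_t value = 0;
    std::memcpy(&value, buf + offset, sizeof(value));
    return value;
}
```

The caller remains responsible for ensuring that `offset + 4` does not exceed the size of the buffer. The value is read in the byte order of the host.
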
## Case Study

The following analyzes typical CppCrash cases based on signals, scenarios, and tools.
The analysis based on signals introduces common crash signals and provides a typical case for each type of signal.
The analysis based on scenarios summarizes common scenarios in which problems frequently occur, and provides a typical case for each scenario.
The analysis based on tools describes how to use various maintenance and debugging tools, and provides a typical case for each tool.

### Analyzing CppCrash Based on Signals

#### Type 1: SIGSEGV Crash

The **SIGSEGV** signal indicates a segmentation fault of the program. This fault occurs when a program accesses a memory area outside its bounds (for example, writes to memory that belongs to the operating system), or accesses a memory area without the correct permission (for example, writes to read-only memory). The details are as follows:

- **SIGSEGV** is a type of memory management fault.
- **SIGSEGV** is generated in a user-mode program.
- **SIGSEGV** occurs when a user-mode program accesses a memory area outside its bounds.
- **SIGSEGV** also occurs when a user-mode program accesses memory without the correct permission.

In most cases, **SIGSEGV** is caused by an out-of-bounds pointer. However, not every out-of-bounds pointer causes **SIGSEGV**. The crash is triggered only when the pointer is dereferenced, and even then it may not occur. The **SIGSEGV** crash involves the operating system, C library, compiler, and linker. The examples are as follows:

1. The memory area is read-only memory.
    The sample code is as follows:

    ```text
    static napi_value TriggerCrash(napi_env env, napi_callback_info info)
    {
        char *s = "hello world";
        s[1] = 'H';
        return 0;
    }
    ```

    This is one of the most common examples. In this case, "hello world" is a constant string that the compiler (for example, GCC) places in the **.rodata** section. When the target program is generated, the **.rodata** section is merged into the **text segment** and placed together with the **code segment**. Therefore, the memory area where the **.rodata** section is located is read-only. This is the **SIGSEGV(SEGV_ACCERR)** crash caused by writing to a read-only memory area.

    ![cppcrash-demo2](figures/cppcrash_image_005.png)

2. The memory area is out of the process address space.
    The sample code is as follows:

    ```text
    static napi_value TriggerCrash(napi_env env, napi_callback_info info)
    {
        uint64_t* p = (uint64_t*)0xffffffcfc42ae6f4;
        *p = 10;
        return 0;
    }
    ```

    In this example, the program accesses a memory address in the kernel address space. The **SIGSEGV(SEGV_MAPERR)@0xffffffcfc42ae6f4** crash is usually triggered by the program by accident. The key logs of this cpp crash are as follows:

    ```text
    Device info:xxxxxx xxxx xx xxx
    Build info:xxxxxxx
    Fingerprint:73a5dcdf3e509605563aa11ac8cb4f3d7f99b9946dc142212246b53b741c4129
    Module name:com.samples.recovery
    Version:1.0.0
    VersionCode:1000000
    PreInstalled:No
    Foreground:Yes
    Timestamp:2024-04-29 14:07:12.082
    Pid:21374
    Uid:20020144
    Process name:com.samples.recovery
    Process life time:8s
    Reason:Signal:SIGSEGV(SEGV_MAPERR)@0xffffffcfc42ae6f4
    Fault thread info:
    Tid:21374, Name:amples.recovery
    #00 pc 0000000000001ccc /data/storage/el1/bundle/libs/arm64/libentry.so(TriggerCrash(napi_env__*, napi_callback_info__*)+36)(4dd115fa8b8c1b3f37bdb5b7b67fc70f31f0dbac)
    #01 pc 0000000000033678 /system/lib64/platformsdk/libace_napi.z.so(ArkNativeFunctionCallBack(panda::JsiRuntimeCallInfo*)+372)(7d6f229764fdd4b72926465066bc475e)
    #02 pc 00000000001d7f38 /system/lib64/module/arkcompiler/stub.an(RTStub_PushCallArgsAndDispatchNative+40)
    #03 at doTriggerException entry (entry/src/main/ets/pages/FaultTriggerPage.ets:72:7)
    #04 at triggerNativeException entry (entry/src/main/ets/pages/FaultTriggerPage.ets:79:5)
    #05 at anonymous entry (entry/src/main/ets/pages/FaultTriggerPage.ets:353:19)
    #06 pc 000000000048e024 /system/lib64/platformsdk/libark_jsruntime.so(panda::FunctionRef::Call(panda::ecmascript::EcmaVM const*, panda::Local<panda::JSValueRef>, panda::Local<panda::JSValueRef> const*, int)+1040)(9fa942a1d42bd4ae607257975fbc1b77)
    ...
    #38 pc 00000000000324b0 /system/bin/appspawn(AppSpawnRun+172)(c992404f8d1cf03c84c067fbf3e1dff9)
    #39 pc 00000000000213a8 /system/bin/appspawn(main+956)(c992404f8d1cf03c84c067fbf3e1dff9)
    #40 pc 00000000000a4b98 /system/lib/ld-musl-aarch64.so.1(libc_start_main_stage2+64)(ff4c94d996663814715bedb2032b2bbc)
    ```

3. The memory does not exist.
    The sample code is as follows:

    ```text
    static napi_value TriggerCrash(napi_env env, napi_callback_info info)
    {
        int *a = NULL;
        *a = 1;
        return 0;
    }
    ```

    In practice, the most common case is null pointer dereference, in which the user-mode address that the null pointer points to does not exist. The inference information "Reason:Signal:SIGSEGV(SEGV_MAPERR)@000000000000000000 probably caused by NULL pointer dereference" is printed in the **Reason** field of CppCrash logs, as shown in the following figure.

    ![cppcrash-demo3](figures/cppcrash_image_006.png)

4. Double free.
    The sample code is as follows:

    ```text
    static napi_value TriggerCrash(napi_env env, napi_callback_info info)
    {
        void *pc = malloc(1024);
        free(pc);
        free(pc); // Double free
        printf("free ok!\n");
        return 0;
    }
    ```

    In the double-free scenario, the system throws a **SIGSEGV(SI_TKILL)** fault indicating an illegal memory operation, as shown in the following figure.

    ![cppcrash-demo3](figures/cppcrash_image_007.png)

    The preceding are common causes of **SIGSEGV** crashes. Other scenarios may also trigger **SIGSEGV** crashes, including stack overflow memory access, heap overflow memory access, global wild pointer access, execution on an invalid address, and invalid parameter invocation. The **SIGSEGV** crash is associated with the stack allocation and recovery of the operating system and the compiler.

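A common way to reduce the risk of the double free shown in example 4 is to clear the pointer immediately after releasing it. The following C++ code is a minimal sketch. `SafeFree` is a hypothetical helper introduced for illustration, not an existing API.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: releases the block and clears the caller's pointer, so that
// a repeated call becomes a harmless free(nullptr) instead of a double free.
void SafeFree(void*& ptr)
{
    std::free(ptr);  // free(nullptr) is defined to do nothing
    ptr = nullptr;
}
```

After `SafeFree(pc)`, a second `SafeFree(pc)` is harmless. This protects only the pointer variable that is passed in. Other copies of the same address remain dangling, which is one reason why smart pointers are generally preferable in C++ code.
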
#### Type 2: SIGABRT Crash

The **SIGABRT** signal is sent to abort the process. This signal can be raised by the process itself by calling **abort()** in the C standard library, or it can be sent to the process from outside like other signals.

- The sample code of executing the **abort()** function:

    ```text
    static napi_value TriggerCrash(napi_env env, napi_callback_info info)
    {
        OH_LOG_FATAL(LOG_APP, "test fatal log.");
        abort();
        return 0;
    }
    ```

    In this scenario, a basic library proactively calls the **abort()** function when its checks identify the process as unsafe. The last fatal log before the process exits is printed in the crash log, as shown in the following figure:

    ![cppcrash-demo4](figures/cppcrash_image_008.png)

- The sample code of executing the **assert()** function:

    ```text
    static napi_value TriggerCrash(napi_env env, napi_callback_info info)
    {
    #if 0 // If the value is 0, an error is reported. If the value is 1, it is normal.
        void *pc = malloc(1024);
    #else
        void *pc = nullptr;
    #endif
        assert(pc != nullptr);
        return 0;
    }
    ```

    In addition to the **abort()** function, other error handling mechanisms in C++ include the **assert()** function, **exit()** function, exception capture mechanism (**try-catch**), and **exception** class. The **assert()** function checks a condition during function execution. If the check fails, the process aborts. The corresponding fault scenario is shown below.

    ![cppcrash-demo5](figures/cppcrash_image_009.png)

### Analyzing CppCrash Based on Scenarios

#### Type 1: Memory Access Crash

**Background**

The crash address **0x7f82764b70** is in the readable and executable segment of **libace_napi_ark.z.so**. The crash occurs because the code attempts to write to this address, but the corresponding **maps** segment has only the read and execute permissions. In other words, when a process attempts to access a memory area in a way that is not permitted, the process crashes.

```text
7f82740000-7f8275c000 r--p 00000000 /system/lib64/libace_napi_ark.z.so
7f8275c000-7f8276e000 r-xp 0001b000 /system/lib64/libace_napi_ark.z.so <- The crash address is within this address range.
7f8276e000-7f82773000 r--p 0002c000 /system/lib64/libace_napi_ark.z.so
7f82773000-7f82774000 rw-p 00030000 /system/lib64/libace_napi_ark.z.so
```

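The lookup above (finding the **maps** segment that contains the crash address and reading its permissions) can be sketched in C++ as follows. `FindPermissions` is a hypothetical helper introduced for illustration. It assumes that each line starts with `<start>-<end> <perms>` in hexadecimal, as in the excerpt above. For that excerpt and the crash address 0x7f82764b70, it returns `r-xp`, which confirms that a write to this address is not permitted.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <sstream>
#include <string>

// Hypothetical helper: returns the permission field (for example "r-xp") of the
// maps line whose address range contains addr, or an empty string if none does.
std::string FindPermissions(const std::string& maps, std::uint64_t addr)
{
    std::istringstream lines(maps);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string range;
        std::string perms;
        if (!(fields >> range >> perms)) {
            continue;  // skip blank or malformed lines
        }
        const std::string::size_type dash = range.find('-');
        if (dash == std::string::npos) {
            continue;
        }
        // strtoull stops at the first non-hexadecimal character, that is, at '-'.
        const std::uint64_t start = std::strtoull(range.c_str(), nullptr, 16);
        const std::uint64_t end = std::strtoull(range.c_str() + dash + 1, nullptr, 16);
        if (addr >= start && addr < end) {
            return perms;  // the end address is exclusive
        }
    }
    return std::string();
}
```
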
The following figure shows the crash call stack.

![cppcrash-demo6](figures/cppcrash_image_010.png)

**Fault Analysis**

This address error occurs regularly, but it is abnormal that the node address falls within **libace_napi_ark.z.so**. This suggests a memory corruption error. You can use [ASan Check](https://developer.huawei.com/consumer/cn/doc/harmonyos-guides-V5/ide-asan-V5) to locate the memory corruption error. After the problem is reproduced through stress tests, ASan can also be used to capture the regularly occurring crash scenario. The fault detected by ASan is the same as that in the crash stack above. ASan reports **heap-use-after-free**, which is actually a double free of the same address. During the second free operation, the address is used to access a member of the freed object, resulting in a UAF fault.
The key logs of ASan are as follows:

```text
=================================================================
==appspawn==2029==ERROR: AddressSanitizer: heap-use-after-free on address 0x003a375eb724 at pc 0x002029ba8514 bp 0x007fd8175710 sp 0x007fd8175708
READ of size 1 at 0x003a375eb724 thread T0 (thread name)
    #0 0x2029ba8510  (/system/asan/lib64/platformsdk/libark_jsruntime.so+0xca8510) panda::ecmascript::Node::IsUsing() const at arkcompiler/ets_runtime/ecmascript/ecma_global_storage.h:82:16
(inlined by) panda::JSNApi::DisposeGlobalHandleAddr(panda::ecmascript::EcmaVM const*, unsigned long) at arkcompiler/ets_runtime/ecmascript/napi/jsnapi.cpp:749:67 BuildID[md5/uuid]=9a18e2ec0dc8a83216800b2f0dd7b76a
    #1 0x403ee94d30  (/system/asan/lib64/libace.z.so+0x6194d30) panda::CopyableGlobal<panda::ObjectRef>::Free() at arkcompiler/ets_runtime/ecmascript/napi/include/jsnapi.h:1520:9
(inlined by) panda::CopyableGlobal<panda::ObjectRef>::Reset() at arkcompiler/ets_runtime/ecmascript/napi/include/jsnapi.h:189:9
(inlined by) OHOS::Ace::Framework::JsiType<panda::ObjectRef>::Reset() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/engine/jsi/jsi_types.inl:112:13
(inlined by) OHOS::Ace::Framework::JsiWeak<OHOS::Ace::Framework::JsiObject>::~JsiWeak() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/engine/jsi/jsi_ref.h:167:16
(inlined by) OHOS::Ace::Framework::ViewFunctions::~ViewFunctions() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view_functions.h:44:5 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
    #2 0x403ee9296c  (/system/asan/lib64/libace.z.so+0x619296c) OHOS::Ace::Framework::ViewFunctions::~ViewFunctions() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view_functions.h:42:5
(inlined by) OHOS::Ace::Framework::ViewFunctions::~ViewFunctions() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view_functions.h:42:5 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
    #3 0x403ed9b130  (/system/asan/lib64/libace.z.so+0x609b130) OHOS::Ace::Referenced::DecRefCount() at foundation/arkui/ace_engine/frameworks/base/memory/referenced.h:76:13
(inlined by) OHOS::Ace::RefPtr<OHOS::Ace::Framework::ViewFunctions>::~RefPtr() at foundation/arkui/ace_engine/frameworks/base/memory/referenced.h:148:22 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
    #4 0x403ed9b838  (/system/asan/lib64/libace.z.so+0x609b838) OHOS::Ace::RefPtr<OHOS::Ace::Framework::ViewFunctions>::Reset() at foundation/arkui/ace_engine/frameworks/base/memory/referenced.h:163:9
(inlined by) OHOS::Ace::Framework::JSViewFullUpdate::~JSViewFullUpdate() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view.cpp:159:21 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
    #5 0x403ed9bf24  (/system/asan/lib64/libace.z.so+0x609bf24) OHOS::Ace::Framework::JSViewFullUpdate::~JSViewFullUpdate() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view.cpp:157:1
(inlined by) OHOS::Ace::Framework::JSViewFullUpdate::~JSViewFullUpdate() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view.cpp:157:1 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
...
freed by thread T0 (thread name) here:
    #0 0x2024ed3abc  (/system/asan/lib64/libclang_rt.asan.so+0xd3abc)
    #1 0x2029ba8424  (/system/asan/lib64/platformsdk/libark_jsruntime.so+0xca8424) std::__h::__function::__value_func<void (unsigned long)>::operator()[abi:v15004](unsigned long&&) const at prebuilts/clang/ohos/linux-x86_64/llvm/bin/../include/libcxx-ohos/include/c++/v1/__functional/function.h:512:16
(inlined by) std::__h::function<void (unsigned long)>::operator()(unsigned long) const at prebuilts/clang/ohos/linux-x86_64/llvm/bin/../include/libcxx-ohos/include/c++/v1/__functional/function.h:1197:12
(inlined by) panda::ecmascript::JSThread::DisposeGlobalHandle(unsigned long) at arkcompiler/ets_runtime/ecmascript/js_thread.h:604:9
(inlined by) panda::JSNApi::DisposeGlobalHandleAddr(panda::ecmascript::EcmaVM const*, unsigned long) at arkcompiler/ets_runtime/ecmascript/napi/jsnapi.cpp:752:24 BuildID[md5/uuid]=9a18e2ec0dc8a83216800b2f0dd7b76a
    #2 0x403ee94b68  (/system/asan/lib64/libace.z.so+0x6194b68) panda::CopyableGlobal<panda::FunctionRef>::Free() at arkcompiler/ets_runtime/ecmascript/napi/include/jsnapi.h:1520:9
(inlined by) panda::CopyableGlobal<panda::FunctionRef>::Reset() at arkcompiler/ets_runtime/ecmascript/napi/include/jsnapi.h:189:9
(inlined by) OHOS::Ace::Framework::JsiType<panda::FunctionRef>::Reset() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/engine/jsi/jsi_types.inl:112:13
(inlined by) OHOS::Ace::Framework::JsiWeak<OHOS::Ace::Framework::JsiFunction>::~JsiWeak() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/engine/jsi/jsi_ref.h:167:16
(inlined by) OHOS::Ace::Framework::ViewFunctions::~ViewFunctions() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view_functions.h:44:5 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
    #3 0x403ee9296c  (/system/asan/lib64/libace.z.so+0x619296c) OHOS::Ace::Framework::ViewFunctions::~ViewFunctions() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view_functions.h:42:5
(inlined by) OHOS::Ace::Framework::ViewFunctions::~ViewFunctions() at foundation/arkui/ace_engine/frameworks/bridge/declarative_frontend/jsview/js_view_functions.h:42:5 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
    #4 0x403ed9b130  (/system/asan/lib64/libace.z.so+0x609b130) OHOS::Ace::Referenced::DecRefCount() at foundation/arkui/ace_engine/frameworks/base/memory/referenced.h:76:13
(inlined by) OHOS::Ace::RefPtr<OHOS::Ace::Framework::ViewFunctions>::~RefPtr() at foundation/arkui/ace_engine/frameworks/base/memory/referenced.h:148:22 BuildID[md5/uuid]=1330f8b9be73bdb76ae18107c2a60ca1
...
previously allocated by thread T0 (thread name) here:
    #0 0x2024ed3be4  (/system/asan/lib64/libclang_rt.asan.so+0xd3be4)
    #1 0x2029ade778  (/system/asan/lib64/platformsdk/libark_jsruntime.so+0xbde778) panda::ecmascript::NativeAreaAllocator::AllocateBuffer(unsigned long) at arkcompiler/ets_runtime/ecmascript/mem/native_area_allocator.cpp:98:17 BuildID[md5/uuid]=9a18e2ec0dc8a83216800b2f0dd7b76a
    #2 0x2029a39064  (/system/asan/lib64/platformsdk/libark_jsruntime.so+0xb39064) std::__h::enable_if<!std::is_array_v<panda::ecmascript::NodeList<panda::ecmascript::WeakNode>>, panda::ecmascript::NodeList<panda::ecmascript::WeakNode>*>::type panda::ecmascript::NativeAreaAllocator::New<panda::ecmascript::NodeList<panda::ecmascript::WeakNode>>() at arkcompiler/ets_runtime/ecmascript/mem/native_area_allocator.h:61:19
(inlined by) unsigned long panda::ecmascript::EcmaGlobalStorage<panda::ecmascript::Node>::NewGlobalHandleImplement<panda::ecmascript::WeakNode>(panda::ecmascript::NodeList<panda::ecmascript::WeakNode>**, panda::ecmascript::NodeList<panda::ecmascript::WeakNode>**, unsigned long) at arkcompiler/ets_runtime/ecmascript/ecma_global_storage.h:565:34
(inlined by) panda::ecmascript::EcmaGlobalStorage<panda::ecmascript::Node>::SetWeak(unsigned long, void*, void (*)(void*), void (*)(void*)) at arkcompiler/ets_runtime/ecmascript/ecma_global_storage.h:455:26 BuildID[md5/uuid]=9a18e2ec0dc8a83216800b2f0dd7b76a
    #3 0x2029ba5620  (/system/asan/lib64/platformsdk/libark_jsruntime.so+0xca5620) std::__h::__function::__value_func<unsigned long (unsigned long, void*, void (*)(void*), void (*)(void*))>::operator()[abi:v15004](unsigned long&&, void*&&, void (*&&)(void*), void (*&&)(void*)) const at prebuilts/clang/ohos/linux-x86_64/llvm/bin/../include/libcxx-ohos/include/c++/v1/__functional/function.h:512:16
(inlined by) std::__h::function<unsigned long (unsigned long, void*, void (*)(void*), void (*)(void*))>::operator()(unsigned long, void*, void (*)(void*), void (*)(void*)) const at prebuilts/clang/ohos/linux-x86_64/llvm/bin/../include/libcxx-ohos/include/c++/v1/__functional/function.h:1197:12
1046(inlined by) panda::ecmascript::JSThread::SetWeak(unsigned long, void*, void (*)(void*), void (*)(void*)) at arkcompiler/ets_runtime/ecmascript/js_thread.h:610:16
1047(inlined by) panda::JSNApi::SetWeak(panda::ecmascript::EcmaVM const*, unsigned long) at arkcompiler/ets_runtime/ecmascript/napi/jsnapi.cpp:711:31 BuildID[md5/uuid]=9a18e2ec0dc8a83216800b2f0dd7b76a
1048...
1049```

When **JsiWeak** is destructed or reset, the **CopyableGlobal** held in **JsiType**, the parent class of its member (**JsiObject**/**JsiValue**/**JsiFunction**), is released, as shown in the following figure.

![cppcrash-demo5](figures/cppcrash_image_011.png)

During garbage collection (GC), **IterateWeakEcmaGlobalStorage** calls **DisposeGlobalHandle** on any **WeakNode** that has no callback and releases it, as shown in the following figure.

![cppcrash-demo6](figures/cppcrash_image_012.png)

Therefore, two code paths can release the same **WeakNode**. If **IterateWeakEcmaGlobalStorage** releases it first during GC, no callback notifies **JsiWeak** to clean up, so **JsiWeak** still holds a **CopyableGlobal** that refers to the released **WeakNode**. After the **NodeList** containing the **WeakNode** is released and returned to the operating system, the **CopyableGlobal** retained in **JsiWeak** is released again, which causes a double-free error.

![cppcrash-demo7](figures/cppcrash_image_013.png)

**Solutions**

Pass a callback when **JsiWeak** calls **SetWeakCallback**. When **IterateWeakEcmaGlobalStorage** releases the **WeakNode**, the callback notifies **JsiWeak** to reset its **CopyableGlobal**, which ensures that the same address is not freed twice.
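
The effect of the callback can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: `WeakStore` and `WeakHolder` are hypothetical stand-ins for the global storage and **JsiWeak**, not the real ArkCompiler or ArkUI classes, and an integer handle replaces the real node address.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for the storage that owns weak nodes.
class WeakStore {
public:
    using Handle = int;
    using Callback = std::function<void()>;

    // Allocates a node and optionally registers a release callback.
    Handle NewWeak(Callback onRelease = nullptr) {
        Handle h = next_++;
        nodes_[h] = std::move(onRelease);
        return h;
    }

    // Releases a node on behalf of its holder. Returns false if the node
    // was already released (a real allocator would see a double free).
    bool Dispose(Handle h) { return nodes_.erase(h) == 1; }

    // Simulates the GC sweep: releases every node, then notifies holders
    // that registered a callback.
    void CollectAll() {
        std::unordered_map<Handle, Callback> released;
        released.swap(nodes_);
        for (auto& entry : released) {
            if (entry.second) entry.second();
        }
    }

private:
    Handle next_ = 1;
    std::unordered_map<Handle, Callback> nodes_;
};

// Hypothetical stand-in for the object that keeps a copy of the handle.
class WeakHolder {
public:
    WeakHolder(WeakStore& store, bool withCallback) : store_(store) {
        handle_ = withCallback
            ? store_.NewWeak([this] { handle_ = 0; })  // GC clears our copy
            : store_.NewWeak();                        // GC leaves it stale
    }
    WeakHolder(const WeakHolder&) = delete;  // the callback captures this

    // Returns true if the release was safe, false on a double release.
    bool Reset() {
        if (handle_ == 0) return true;  // already cleared by the callback
        bool ok = store_.Dispose(handle_);
        handle_ = 0;
        return ok;
    }

private:
    WeakStore& store_;
    WeakStore::Handle handle_ = 0;
};
```

In this model, a holder created without a callback reports a double release when `Reset()` runs after `CollectAll()`, while a holder created with a callback finds its handle already cleared and does nothing.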

**Suggestions**

When managing memory, check that each allocation is freed exactly once, that is, neither freed twice nor left unfreed. In addition, when you locate memory access crashes (usually **SIGSEGV** crashes) and the crash stack gives no clue, run ASan to reproduce the fault.

#### Type 2: Multi-thread Crash

**Background**

**napi_env** is still used after being released.

**Symptom**

The **env** passed to a **napi** API is invalid, and the crash stack points to **NativeEngineInterface::ClearLastError()**. The logged **env** addresses show that the **env** is used after being released.

![cppcrash-demo9](figures/cppcrash_image_015.png)

The key crash stack is as follows.

![cppcrash-demo8](figures/cppcrash_image_014.png)

**Solutions**

Do not pass the **env** created by one thread to another thread.

**Suggestions**

Select the **Multi Thread Check** option to locate multi-thread faults. For details, see "Ark Runtime Multi Thread Check" in this guideline.

Note: The **env** in a **napi** API is the **arkNativeEngine** created along with the engine.

#### Type 3: Lifecycle Crash

**Background**

A native **napi_value** must be used together with **napi_handle_scope**, which manages the lifecycle of the **napi_value**. A **napi_value** can be used only within its **napi_handle_scope**. Outside the scope, the lifecycle of the **napi_value** and its JS object is no longer protected: if the reference count is 0, the **napi_value** is collected by GC. Using the **napi_value** at this point means accessing freed memory, which results in faults.

**Symptom**

**napi_value** is a raw pointer (a struct pointer). It holds a JS object and maintains the object's lifecycle so that the object is not collected by GC. **napi_handle_scope** manages **napi_value**. Once execution leaves the **napi_handle_scope**, the **napi_value** is collected by GC and no longer holds the JS object, so the lifecycle of the JS object is no longer protected.

**Fault Analysis**

Parse the crash stack to locate the upper-level caller of the faulty **napi** API, and find the problematic **napi_value** in it. Then check whether the **napi_value** is used outside its **napi_handle_scope**.

**Cases**

The **napi_value** is used outside the scope of the NAPI framework.

![cppcrash-demo9](figures/cppcrash_image_016.png)

On the JS side, data is added through **Add()**, and the native side saves the **napi_value** to a **vector**. The JS side then obtains the data through the **get** API, and the native side returns the saved **napi_value** objects as an array. When the JS side reads the properties of the data, the error message "Can not get Prototype on non ECMA Object" is displayed. The **native_value** is kept across **napi** calls without being saved through **napi_ref**. As a result, the **native_value** is invalid.

Note: The scope of the NAPI framework is **napi_handle_scope**, which manages the lifecycle of **napi_value**. The framework embeds the scope in the end-to-end process of a JS-to-native call: the scope is opened when the native method is entered and closed when the native method ends.
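
The lifetime rule in this case can be imitated with standard smart pointers. The following is a toy model only: `HandleScope`, `LocalHandle`, `PersistentRef`, and `NativeAdd` are hypothetical names that mimic the behavior described above and are not the real **napi_handle_scope**, **napi_value**, or **napi_ref** APIs.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy types: a string plays the role of a JS object.
using Object = std::string;
using LocalHandle = std::weak_ptr<Object>;      // valid only inside a scope
using PersistentRef = std::shared_ptr<Object>;  // keeps the object alive

class HandleScope {
public:
    // Creates an object owned by this scope and returns a local handle.
    LocalHandle Create(const std::string& value) {
        owned_.push_back(std::make_shared<Object>(value));
        return owned_.back();
    }

    // Promotes a local handle to a reference that outlives the scope.
    static PersistentRef MakeRef(const LocalHandle& handle) {
        return handle.lock();
    }

private:
    std::vector<std::shared_ptr<Object>> owned_;  // released on scope exit
};

struct Saved {
    LocalHandle raw;    // saved without a reference: invalid after the call
    PersistentRef ref;  // saved through a reference: stays valid
};

// Imitates a native Add() method: the scope lives only for this call.
inline Saved NativeAdd(const std::string& value) {
    HandleScope scope;
    LocalHandle first = scope.Create(value);
    LocalHandle second = scope.Create(value);
    return Saved{first, HandleScope::MakeRef(second)};
}
```

After `NativeAdd()` returns, `raw` no longer refers to a live object, which corresponds to the saved handle in the case above, whereas `ref` can still be read safely.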

#### Type 4: Pointer Crash

**Background**

Smart pointers are used without null checks, which causes null pointer dereference crashes at run time.

**Impact**

The process crashes and exits unexpectedly.

**Fault Analysis**

![cppcrash-demo10](figures/cppcrash_image_017.png)

A null pointer crash can be identified from the fault cause. Run the llvm-addr2line command to resolve the line number. In this case, the service code does not check whether the smart pointer is null before using it. As a result, it accesses a null address and crashes.

**Solution**

Add protective null checks for the pointer.

**Suggestions**

Null-check pointers before using them to prevent null pointer dereferences, which make the process crash and exit.
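
A minimal sketch of such a check is shown below. The `Service` type, the `SafeName` function, and the fallback string are hypothetical and serve only to illustrate the pattern.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical service type used only for illustration.
struct Service {
    std::string name;
};

// Returns the service name, or a fallback when the pointer is null.
inline std::string SafeName(const std::shared_ptr<Service>& service) {
    if (service == nullptr) {  // guard before any dereference
        return "<unavailable>";
    }
    return service->name;
}
```

Returning a fallback is one possible policy. Depending on the service logic, logging the error and returning early may be more appropriate.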

### Analyzing Cpp Crash Based on Tools

#### Tool 1: ASan

For details, see [ASan Check](https://developer.huawei.com/consumer/en/doc/harmonyos-guides-V5/ide-asan-V5).

#### Tool 2: Ark Runtime Multi Thread Check

**Fundamentals**

JS is single-threaded. Operations on JS objects can be performed only on the JS thread that owns them; otherwise, multi-thread security problems may occur. (JS objects created on the main thread can be operated only on the main thread, and JS objects created on a worker thread can be operated only on that worker thread.) Because napi APIs involve object operations, 95% of them can be used only on the JS thread. The multi-thread detection mechanism checks whether the ID of the calling thread is the same as the **JS thread ID** of the **VM/Env** in use. If they differ, the **VM/Env** is used across threads, which causes multi-thread security problems.

Common problems:

1. napi APIs are used in non-JS threads.
2. The **env** of another thread is used in napi APIs.
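
The idea behind the check can be sketched as follows. `ThreadBoundEnv` is a hypothetical class, not the Ark runtime implementation: it records the ID of the creating thread and compares it with the ID of the caller. The caller ID is a parameter with a default value so that the comparison can be exercised without spawning threads.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for a VM/Env that remembers its creating thread.
class ThreadBoundEnv {
public:
    ThreadBoundEnv() : owner_(std::this_thread::get_id()) {}

    // True only when the caller is the thread that created this env.
    bool CheckThread(
        std::thread::id caller = std::this_thread::get_id()) const {
        return caller == owner_;
    }

    // Aborts in debug builds when the env is used from a foreign thread.
    void RequireOwnerThread() const {
        assert(CheckThread() && "env used on a foreign thread");
    }

private:
    std::thread::id owner_;
};
```

Calling `RequireOwnerThread()` at the entry of every operation that touches the env would turn a cross-thread use into an immediate, clearly labeled abort instead of a later crash that is hard to analyze.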

**How to Use**

![cppcrash-demo13](figures/cppcrash_image_020.png)

Select **Multi Thread Check** in DevEco Studio to enable Ark multi-thread detection.

**Scenario**

If the stack in the crash log is difficult to analyze and the problem occurs with high probability, enable multi-thread detection. With multi-thread detection enabled, fatal information such as "Fatal: ecma_vm cannot run in multi-thread! thread:3096 currentThread:3550" in the **cpp_crash** log indicates a multi-thread security problem. In this example, the calling thread ID is **3550**, but the JS thread is created by thread **3096**, so the **vm** is used across threads.
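
When many logs need to be screened, the two thread IDs can be extracted from the fatal message programmatically. The helper below is a sketch that assumes the message format quoted above. `ThreadMismatch` and `ParseMultiThreadFatal` are hypothetical names and are not part of any OpenHarmony tool.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

struct ThreadMismatch {
    long jsThread = -1;       // value after "thread:"
    long callingThread = -1;  // value after "currentThread:"
};

// Extracts both thread IDs from a fatal line. Returns false if the line
// does not contain the multi-thread fatal message.
inline bool ParseMultiThreadFatal(const std::string& line,
                                  ThreadMismatch& out) {
    const std::string marker = "cannot run in multi-thread! thread:";
    const std::size_t pos = line.find(marker);
    if (pos == std::string::npos) {
        return false;
    }
    return std::sscanf(line.c_str() + pos + marker.size(),
                       "%ld currentThread:%ld",
                       &out.jsThread, &out.callingThread) == 2;
}
```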

**Cases**

Enable the multi-thread check and trigger the crash again. If the problem is caused by multiple threads, fatal information in the following format is displayed:

```text
Fatal: ecma_vm cannot run in multi-thread! thread:xxx currentThread:yyy
```

In the key crash log below, the calling thread ID is **17585**, but the JS thread is created by thread **17688**, so the **vm** is used across threads. The **vm** is the **napi_env__*** of the JS thread, that is, the environment in which the thread's code runs. One thread uses one **vm**.

The key crash log is as follows:

```text
Reason:Signal:SIGABRT(SI_TKILL)@0x01317b9f000044b1 from:17585: 20020127
LastFatalMessage: [default] CheckThread:177 Fatal: ecma_vm cannot run in multi-thread! thread:17688 currentThread:17585
Fault thread Info:
Tid:17585, Name:xxxxx
# 00 pc 00000000000f157c /system/lib/ld-musl-aarch64-asan.so.1(__restore_sigs+52)(38eb4ca904ae601d4b4dca502e948960)
# 01 pc 00000000000f1800 /system/lib/ld-musl-aarch64-asan.so.1(raise+112)(38eb4ca904ae601d4b4dca502e948960)
# 02 pc 00000000000adc74 /system/lib/ld-musl-aarch64-asan.so.1(abort+20)(38eb4ca904ae601d4b4dca502e948960)
# 03 pc 0000000000844fdc /system/asan/lib64/platformsdk/libark_jsruntime.so(panda::ecmascript::EcmaVM::CheckThread() const+2712)(1df055932338c14060b864435aec88ab)
# 04 pc 0000000000f3d930 /system/asan/lib64/platformsdk/libark_jsruntime.so(panda::ObjectRef::New(panda::ecmascript::EcmaVM const*)+908)(1df055932338c14060b864435aec88
# 05 pc 0000000000095048 /system/asan/lib64/platformsdk/libace_napi.z.so(napi_create_object+80)(efc1b3d1378f56b4b800489fb30dcded)
# 06 pc 00000000005d9770 /data/storage/el1/bundle/libs/arm64/xxxxx.so(c0f1735eada49fadc5197745f5af0c0a52246270)
```

To analyze the multi-thread problem, perform the following steps:

1. Check the first stack frame below **libace_napi.z.so**. In the preceding log, it is **xxxxx.so**. Check whether the **napi_env** of thread **17688** is passed to thread **17585**.
2. If that stack frame does not pass the **napi_env** as a parameter, check whether it is passed as a struct member variable.

#### Tool 3: objdump

**How to Use**

The objdump binary is a system tool. To use it, you need the OpenHarmony compilation environment, whose project code can be obtained from Gitee. The commands are as follows:

```text
repo init -u git@gitee.com:openharmony/manifest.git -b master --no-repo-verify --no-clone-bundle --depth=1
repo sync -c
./build/prebuilts_download.sh
```

The tool is located at `prebuilts/clang/ohos/linux-x86_64/llvm/bin/llvm-objdump` in the project. To disassemble a library, run the following command:

```text
prebuilts/clang/ohos/linux-x86_64/llvm/bin/llvm-objdump -d libark_jsruntime.so > dump.txt
```

**Scenario**

In some cases, addr2line can tell you only which line of code is faulty but not which variable is abnormal. In this case, use objdump to disassemble the code and combine the result with the register information in the cpp crash log to further determine the crash cause.

**Cases**

The log is as follows:

```text
Tid:6655, Name:GC_WorkerThread
# 00 pc 00000000004492d4 /system/lib64/platformsdk/libark_jsruntime.so(panda::ecmascript::NonMovableMarker::MarkObject(unsigned int, panda::ecmascript::TaggedObject*)+124)(21cf5411626d5986a4ba6383e959b3cc)
# 01 pc 000000000044b580 /system/lib64/platformsdk/libark_jsruntime.so(panda::ecmascript::NonMovableMarker::MarkValue(unsigned int, panda::ecmascript::ObjectSlot&, panda::ecmascript::Region*, bool)+72)(21cf5411626d5986a4ba6383e959b3cc)
# 02 pc 000000000044b4e8 /system/lib64/platformsdk/libark_jsruntime.so(std::__h::__function::__func<panda::ecmascript::NonMovableMarker::ProcessMarkStack(unsigned int)::$_2, std::__h::allocator<panda::ecmascript::NonMovableMarker::ProcessMarkStack(unsigned int)::$_2>, void (panda::ecmascript::TaggedObject*, panda::ecmascript::ObjectSlot, panda::ecmascript::ObjectSlot, panda::ecmascript::VisitObjectArea)>::operator()(panda::ecmascript::TaggedObject*&&, panda::ecmascript::ObjectSlot&&, panda::ecmascript::ObjectSlot&&, panda::ecmascript::VisitObjectArea&&)+256)(21cf5411626d5986a4ba6383e959b3cc)
# 03 pc 0000000000442ac0 /system/lib64/platformsdk/libark_jsruntime.so(void panda::ecmascript::ObjectXRay::VisitObjectBody<(panda::ecmascript::VisitType)1>(panda::ecmascript::TaggedObject*, panda::ecmascript::JSHClass*, std::__h::function<void (panda::ecmascript::TaggedObject*, panda::ecmascript::ObjectSlot, panda::ecmascript::ObjectSlot, panda::ecmascript::VisitObjectArea)> const&)+216)(21cf5411626d5986a4ba6383e959b3cc)
# 04 pc 0000000000447ccc /system/lib64/platformsdk/libark_jsruntime.so(panda::ecmascript::NonMovableMarker::ProcessMarkStack(unsigned int)+248)(21cf5411626d5986a4ba6383e959b3cc)
# 05 pc 0000000000438588 /system/lib64/platformsdk/libark_jsruntime.so(panda::ecmascript::Heap::ParallelGCTask::Run(unsigned int)+148)(21cf5411626d5986a4ba6383e959b3cc)
# 06 pc 00000000004e31c8 /system/lib64/platformsdk/libark_jsruntime.so(panda::ecmascript::Runner::Run(unsigned int)+144)(21cf5411626d5986a4ba6383e959b3cc)
# 07 pc 00000000004e3780 /system/lib64/platformsdk/libark_jsruntime.so(void* std::__h::__thread_proxy[abi:v15004]<std::__h::tuple<std::__h::unique_ptr<std::__h::__thread_struct, std::__h::default_delete<std::__h::__thread_struct>>, void (panda::ecmascript::Runner::*)(unsigned int), panda::ecmascript::Runner*, unsigned int>>(void*)+64)(21cf5411626d5986a4ba6383e959b3cc)
# 08 pc 000000000014d894 /system/lib/ld-musl-aarch64.so.1
# 09 pc 0000000000085d04 /system/lib/ld-musl-aarch64.so.1
```

Run the addr2line command to locate the error line.

![cppcrash-demo14](figures/cppcrash_image_021.png)

The preceding information indicates that the process crashes on a null pointer access when **InYoungSpace** is called. Therefore, the **Region** is suspected to be a null pointer.

Use objdump to disassemble the library and search for the error address **4492d4**. The command is as follows:

![cppcrash-demo15](figures/cppcrash_image_022.png)

Check the **x20** register. Its value is **0x0000000000000000**. The disassembly shows that **x20** is derived from **x2** through a bitwise operation (the last 18 bits are cleared, which is a typical **Region::ObjectAddressToRange** operation). The analysis shows that **x2** is the second parameter, **object**, of the **MarkObject** function, and **x20** is the variable **objectRegion**.

```text
Registers: x0:0000007f0fe31560 x1:0000000000000003 x2:0000000000000000 x3:0000005593100000
        x4:0000000000000000 x5:0000000000000000 x6:0000000000000000 x7:0000005596374fa0
        x8:0000000000000000 x9:0000000000000000 x10:0000000000000000 x11:0000007f9cb42bb8
        x12:000000000000005e x13:000000000061f59e x14:00000005d73d60fb x15:0000000000000000
        x16:0000007f9cc5f200 x17:0000007f9f201f68 x18:0000000000000000 x19:0000000000000000
        x20:0000000000000000 x21:0000000000000000 x22:0000000000000000 x23:000000559313f860
        x24:000000559313f868 x25:0000000000000003 x26:00000055a0e19960 x27:0000007f9cc57b38
        x28:0000007f9f21a1c0 x29:00000055a0e19700 lr:0000007f9cb4b584 sp:00000055a0e19700 pc:0000007f9cb492d4
```

**ldrb w8, [x20]** corresponds to reading **packedData_.flags_.spaceFlag_**, because **packedData_** is the first field of **region**, **flags_** is the first field of **packedData_**, and **spaceFlag_** is the first field of **flags_**. Therefore, the instruction reads the first byte at the **objectRegion** address. Because **x20** is 0, the read accesses address 0 and triggers the crash.

To read assembly code, you need to be familiar with common assembly instructions and parameter passing rules. For example, for a non-inline C++ member function, **r0** (**x0** on AArch64) stores the **this** pointer. In addition, because of compiler optimization, the mapping between source code and assembly code may not be obvious. Feature values (constants) in the code help you establish the mapping quickly.
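
The address arithmetic discussed in this case can be reproduced in a few lines. The sketch below assumes only what the analysis states, namely that the low 18 bits of the object address are cleared to obtain the region address. `ObjectAddressToRegion` and the sample address are hypothetical and do not come from the runtime source.

```cpp
#include <cassert>
#include <cstdint>

// Assumption taken from the analysis above: regions are aligned to
// 2^18 bytes, so clearing the low 18 bits of an object address yields
// the address of the region that contains the object.
constexpr std::uint64_t kRegionBits = 18;
constexpr std::uint64_t kRegionMask =
    (std::uint64_t{1} << kRegionBits) - 1;

constexpr std::uint64_t ObjectAddressToRegion(std::uint64_t objectAddr) {
    return objectAddr & ~kRegionMask;
}

// A null object maps to a null region, which matches x2 == 0 and
// x20 == 0 in the register dump.
static_assert(ObjectAddressToRegion(0) == 0, "null object, null region");
```

A constant such as the 18-bit mask is an example of a feature value that helps map the disassembly back to the source code.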