# Debugging memory usage on Android

## Prerequisites

* A host running macOS or Linux.
* [ADB](https://developer.android.com/studio/command-line/adb) installed and
  in PATH.
* A device running Android 11+.

If you are profiling your own app and are not running a userdebug build of
Android, your app needs to be marked as profileable or
debuggable in its manifest. See the [heapprofd documentation](
/docs/data-sources/native-heap-profiler.md#heapprofd-targets) for more
details on which applications can be targeted.

## dumpsys meminfo

A good place to start investigating the memory usage of a process is
`dumpsys meminfo`, which gives a high-level overview of how much of the various
types of memory a process is using.

```bash
$ adb shell dumpsys meminfo com.android.systemui

Applications Memory Usage (in Kilobytes):
Uptime: 2030149 Realtime: 2030149

** MEMINFO in pid 1974 [com.android.systemui] **
                   Pss  Private  Private  SwapPss      Rss     Heap     Heap     Heap
                 Total    Dirty    Clean    Dirty    Total     Size    Alloc     Free
                ------   ------   ------   ------   ------   ------   ------   ------
  Native Heap    16840    16804        0     6764    19428    34024    25037     5553
  Dalvik Heap     9110     9032        0      136    13164    36444     9111    27333

[more stuff...]
```

Looking at the "Private Dirty" column of Dalvik Heap (= Java Heap) and
Native Heap, we can see that SystemUI uses roughly 9 MB on the Java heap and
roughly 17 MB on the native heap.

## Linux memory management

But what do *clean*, *dirty*, *Rss*, *Pss* and *Swap* actually mean? To answer
this question, we need to delve into Linux memory management a bit.

From the kernel's point of view, memory is split into equally sized blocks
called *pages*. These are generally 4 KiB.

Pages are organized in virtually contiguous ranges called VMAs
(Virtual Memory Areas).

VMAs are created when a process requests a new pool of memory pages through
the [mmap() system call](https://man7.org/linux/man-pages/man2/mmap.2.html).
Applications rarely call mmap() directly. Those calls are typically mediated by
the allocator, `malloc()/operator new()` for native processes, or by the
Android Runtime for Java apps.

VMAs can be of two types: file-backed and anonymous.

**File-backed VMAs** are a view of a file in memory. They are obtained by
passing a file descriptor to `mmap()`. The kernel will serve page faults on the
VMA through the passed file, so reading through a pointer into the VMA becomes
the equivalent of a `read()` on the file.
File-backed VMAs are used, for instance, by the dynamic linker (`ld`) when
executing new processes or dynamically loading libraries, or by the Android
framework, when loading a new .dex library or accessing resources in the APK.

**Anonymous VMAs** are memory-only areas not backed by any file. This is the way
allocators request dynamic memory from the kernel. Anonymous VMAs are obtained
by calling `mmap(... MAP_ANONYMOUS ...)`.

Physical memory is only allocated, in page granularity, once the application
reads from or writes to a VMA. If you allocate 32 MiB worth of pages but only
touch one byte, your process's memory usage will only go up by 4 KiB. You will
have increased your process's *virtual memory* by 32 MiB, but its resident
*physical memory* by 4 KiB.

When optimizing memory use of programs, we are interested in reducing their
footprint in *physical memory*. High *virtual memory* use is generally not a
cause for concern on modern platforms (except if you run out of address space,
which is very hard on 64-bit systems).

We call the amount of a process's memory that is resident in *physical memory*
its **RSS** (Resident Set Size). Not all resident memory is equal though.
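
The gap between virtual and resident memory can be read from the kernel's
per-process `/proc/PID/status` file, which reports both as `VmSize` and
`VmRSS`. The following is a minimal sketch, assuming a POSIX shell with `awk`;
the helper name `status_kb` and the sample values are made up for illustration
and are not part of any tool described here.

```bash
# Hypothetical helper: print the value (in kB) of one field from
# /proc/PID/status text supplied on stdin.
status_kb() {
  awk -v key="$1:" '$1 == key { print $2 }'
}

# Made-up sample in the format of /proc/PID/status. Real files separate
# the label and the value with a tab; awk treats tabs and spaces alike.
sample='VmSize:  1048576 kB
VmRSS:     20480 kB'

virt_kb=$(printf '%s\n' "$sample" | status_kb VmSize)
rss_kb=$(printf '%s\n' "$sample" | status_kb VmRSS)
echo "virtual: ${virt_kb} kB, resident: ${rss_kb} kB"
```

On a device, the same helper could be fed real data with something like
`adb shell cat /proc/PID/status | status_kb VmRSS`, where `PID` is a
placeholder for the process ID of interest.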

From a memory-consumption viewpoint, individual pages within a VMA can have the
following states:

* **Resident**: the page is mapped to a physical memory page. Resident pages can
  be in two states:
    * **Clean** (only for file-backed pages): the contents of the page are the
      same as the contents on disk. The kernel can evict clean pages more easily
      in case of memory pressure. This is because if they should be needed
      again, the kernel knows it can re-create their contents by reading them
      from the underlying file.
    * **Dirty**: the contents of the page diverge from the disk, or (in most
      cases), the page has no disk backing (i.e. it's _anonymous_). Dirty pages
      cannot be evicted because doing so would cause data loss. However they can
      be swapped out to disk or ZRAM, if present.
* **Swapped**: a dirty page can be written to the swap file on disk (on most Linux
  desktop distributions) or compressed (on Android and CrOS through
  [ZRAM](https://source.android.com/devices/tech/perf/low-ram#zram)). The page
  will stay swapped until a new page fault on its virtual address happens, at
  which point the kernel will bring it back into main memory.
* **Not present**: no page fault ever happened on the page or the page was
  clean and was later evicted.

It is generally more important to reduce the amount of _dirty_ memory as that
cannot be reclaimed like _clean_ memory and, on Android, even if swapped to
ZRAM, will still eat part of the system memory budget.
This is why we looked at *Private Dirty* in the `dumpsys meminfo` example.

*Shared* memory can be mapped into more than one process. This means VMAs in
different processes refer to the same physical memory. This typically happens
with file-backed memory of commonly used libraries (e.g., libc.so,
framework.dex) or, more rarely, when a process `fork()`s and a child process
inherits dirty memory from its parent.
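
One way to see how a process's pages split across the states listed above is
to total the per-VMA counters that the kernel exposes in `/proc/PID/smaps`.
The following is a minimal sketch, assuming a POSIX shell with `awk`; the
helper name `smaps_totals` is hypothetical and the sample excerpt is made up,
keeping only three of the counters a real `smaps` entry contains.

```bash
# Hypothetical helper: total selected per-VMA counters (in kB) from
# /proc/PID/smaps text supplied on stdin.
smaps_totals() {
  awk '
    $1 == "Private_Clean:" { clean += $2 }
    $1 == "Private_Dirty:" { dirty += $2 }
    $1 == "Swap:"          { swap  += $2 }
    END {
      printf "private_clean=%d private_dirty=%d swap=%d\n", clean, dirty, swap
    }
  '
}

# Made-up excerpt covering two VMAs: a file-backed one whose pages are
# clean, and an anonymous one that is dirty and partly swapped out.
sample='Private_Clean:       120 kB
Private_Dirty:         0 kB
Swap:                  0 kB
Private_Clean:         0 kB
Private_Dirty:       512 kB
Swap:                 64 kB'

printf '%s\n' "$sample" | smaps_totals
```

For the sample above this prints `private_clean=120 private_dirty=512 swap=64`.
The exact match on `Swap:` is deliberate, so that the separate `SwapPss:`
counter is not added to the swap total.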

This introduces the concept of **PSS** (Proportional Set Size). In **PSS**,
memory that is resident in multiple processes is proportionally attributed to
each of them. If we map one 4 KiB page into four processes, each of their
**PSS** will increase by 1 KiB.

#### Recap

* Dynamically allocated memory, whether allocated through C's `malloc()`, C++'s
  `operator new()` or Java's `new X()`, always starts as _anonymous_ and
  _dirty_, unless it is never used.
* If this memory is not read/written for a while, or in case of memory pressure,
  it gets swapped out to ZRAM and becomes _swapped_.
* Anonymous memory, whether _resident_ (and hence _dirty_) or _swapped_, is
  always a resource hog and should be avoided if unnecessary.
* File-mapped memory comes from code (Java or native), libraries and resources
  and is almost always _clean_. Clean memory also erodes the system memory
  budget but typically application developers have less control over it.

## Memory over time

`dumpsys meminfo` is good to get a snapshot of the current memory usage, but
even very short memory spikes can lead to low-memory situations, which will
lead to [LMKs](#lmk). We have two tools to investigate situations like this:

* RSS High Watermark.
* Memory tracepoints.

### RSS High Watermark

We can get a lot of information from the `/proc/[pid]/status` file, including
memory information. `VmHWM` shows the maximum RSS usage the process has seen
since it was started. This value is kept updated by the kernel.

```bash
$ adb shell cat '/proc/$(pidof com.android.systemui)/status'
[...]
VmHWM:    256972 kB
VmRSS:    195272 kB
RssAnon:   30184 kB
RssFile:  164420 kB
RssShmem:    668 kB
VmSwap:    43960 kB
[...]
```

### Memory tracepoints

NOTE: For detailed instructions about the memory tracepoints, see the
      [Data sources > Memory > Counters and events](
      /docs/data-sources/memory-counters.md) page.

We can use Perfetto to get information about memory management events from the
kernel.

```bash
$ adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 8960
    fill_policy: DISCARD
}
buffers: {
    size_kb: 1280
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 1
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "mm_event/mm_event_record"
            ftrace_events: "kmem/rss_stat"
            ftrace_events: "kmem/ion_heap_grow"
            ftrace_events: "kmem/ion_heap_shrink"
        }
    }
}
duration_ms: 30000

EOF
```

While it is running, take a photo if you are following along.

Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/mem-trace`
and upload it to the [Perfetto UI](https://ui.perfetto.dev). This will show
overall stats about system [ION](#ion) usage, and per-process stats to
expand. Scroll down to (or Ctrl-F for) `com.google.android.GoogleCamera` and
expand it. This will show a timeline of various memory stats for the camera.

![Camera Memory Trace](/docs/images/trace-rss-camera.png)

We can see that around 2/3 into the trace, the memory spiked (in the
mem.rss.anon track). This is where I took a photo. This is a good way to see
how the memory usage of an application reacts to different triggers.

## Which tool to use

If you want to drill down into _anonymous_ memory allocated by Java code,
labeled by `dumpsys meminfo` as `Dalvik Heap`, see the
[Analyzing the Java Heap](#java-hprof) section.

If you want to drill down into _anonymous_ memory allocated by native code,
labeled by `dumpsys meminfo` as `Native Heap`, see the
[Analyzing the Native Heap](#heapprofd) section. Note that it is common to end
up with native memory even if your app doesn't have any C/C++ code. This is
because some framework APIs (e.g. Regex) are internally implemented in native
code.

If you want to drill down into file-mapped memory, the best option is to use
`adb shell showmap PID` (on Android) or inspect `/proc/PID/smaps`.


## {#lmk} Low-memory kills

When an Android device becomes low on memory, a daemon called `lmkd` will
start killing processes in order to free up memory. Devices' strategies differ,
but in general processes will be killed in order of descending `oom_score_adj`
score (i.e. background apps and processes first, foreground processes last).

Apps on Android are not killed when switching away from them. They instead
remain *cached* even after the user finishes using them. This is to make
subsequent starts of the app faster. Such apps will generally be killed
first (because they have a higher `oom_score_adj`).

We can collect information about LMKs and `oom_score_adj` using Perfetto.

```bash
$ adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 8960
    fill_policy: DISCARD
}
buffers: {
    size_kb: 1280
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 1
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "lowmemorykiller/lowmemory_kill"
            ftrace_events: "oom/oom_score_adj_update"
            ftrace_events: "ftrace/print"
            atrace_apps: "lmkd"
        }
    }
}
duration_ms: 60000

EOF
```

Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/oom-trace`
and upload it to the [Perfetto UI](https://ui.perfetto.dev).

![OOM Score](/docs/images/oom-score.png)

We can see that the OOM score of Camera gets reduced (making it less likely
to be killed) when it is opened, and gets increased again once it is closed.

## {#heapprofd} Analyzing the Native Heap

**Native Heap Profiles require Android 10.**

NOTE: For detailed instructions about the native heap profiler and
      troubleshooting, see the [Data sources > Native heap profiler](
      /docs/data-sources/native-heap-profiler.md) page.

Applications usually get memory through `malloc` or C++'s `new` rather than
directly getting it from the kernel. The allocator makes sure that your memory
is more efficiently handled (i.e. there are not many gaps) and that the
overhead from asking the kernel remains low.

We can log the native allocations and frees that a process does using
*heapprofd*. The resulting profile can be used to attribute memory usage
to particular function callstacks, supporting a mix of both native and Java
code. The profile *will only show allocations done while it was running*; any
allocations done before will not be shown.

### {#capture-profile-native} Capturing the profile

Use the `tools/heap_profile` script to profile a process. If you are having
trouble, make sure you are using the [latest version](
https://raw.githubusercontent.com/google/perfetto/master/tools/heap_profile).
See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```bash
$ tools/heap_profile -n system_server

Profiling active. Press Ctrl+C to terminate.
You may disconnect your device.

Wrote profiles to /tmp/profile-1283e247-2170-4f92-8181-683763e17445 (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

When you see *Profiling active*, play around with the phone a bit. When you
are done, press Ctrl-C to end the profile. For this tutorial, I opened a
couple of apps.

### Viewing the data

Then upload the `raw-trace` file from the output directory to the
[Perfetto UI](https://ui.perfetto.dev) and click on the diamond marker that
appears.

![Profile Diamond](/docs/images/profile-diamond.png)

The tabs that are available are:

* **space**: how many bytes were allocated but not freed at this callstack at
  the moment the dump was created.
* **alloc\_space**: how many bytes were allocated (including ones freed at the
  moment of the dump) at this callstack.
* **objects**: how many allocations without matching frees were sampled at this
  callstack.
* **alloc\_objects**: how many allocations (including ones with matching frees)
  were sampled at this callstack.

The default view will show you all allocations that were done while the
profile was running but that weren't freed (the **space** tab).

![Native Flamegraph](/docs/images/syssrv-apk-assets-two.png)

We can see that a lot of memory gets allocated in paths through
`ResourceManager.loadApkAssets`.
To get the total memory that was allocated
this way, we can enter "loadApkAssets" into the Focus textbox. This will only
show callstacks where some frame matches "loadApkAssets".

![Native Flamegraph with Focus](/docs/images/syssrv-apk-assets-focus.png)

From this we have a clear idea where in the code we have to look. From the
code we can see how that memory is being used and if we actually need all of
it. In this case the key is the `_CompressedAsset`, which has to be
decompressed into RAM rather than being (_cleanly_) memory-mapped. By not
compressing this data, we can save RAM.

## {#java-hprof} Analyzing the Java Heap

**Java Heap Dumps require Android 11.**

NOTE: For detailed instructions about the Java heap profiler and
      troubleshooting, see the [Data sources > Java heap profiler](
      /docs/data-sources/java-heap-profiler.md) page.

### {#capture-profile-java} Capturing the profile

We can get a snapshot of the graph of all the Java objects that constitute the
Java heap. We use the `tools/java_heap_dump` script. If you are having trouble,
make sure you are using the [latest version](
https://raw.githubusercontent.com/google/perfetto/master/tools/java_heap_dump).

```bash
$ tools/java_heap_dump -n com.android.systemui

Dumping Java Heap.
Wrote profile to /tmp/tmpup3QrQprofile
This can be viewed using https://ui.perfetto.dev.
```

### Viewing the Data

Upload the trace to the [Perfetto UI](https://ui.perfetto.dev) and click on
the diamond marker that appears.

![Profile Diamond](/docs/images/profile-diamond.png)

This will present a flamegraph of the memory attributed to the shortest path
to a garbage-collection root. In general an object is reachable by many paths;
we only show the shortest as that reduces the complexity of the data displayed
and is generally the highest-signal.
The rightmost `[merged]` stack is the
sum of all objects that are too small to be displayed.

![Java Flamegraph](/docs/images/java-flamegraph.png)

The tabs that are available are:

* **space**: how many bytes are retained via this path to the GC root.
* **objects**: how many objects are retained via this path to the GC root.

As with native heap profiles, if we want to focus on some specific aspect of the
graph, we can use the Focus feature, which here filters by the names of the
classes. If we wanted to see everything that could be caused by notifications,
we can put "notification" in the Focus box.

![Java Flamegraph with Focus](/docs/images/java-flamegraph-focus.png)

We aggregate the paths per class name, so if there are multiple objects of the
same type retained by a `java.lang.Object[]`, we will show one element as its
child, as you can see in the leftmost stack above.