# Debugging memory usage on Android

## Prerequisites

* A host running macOS or Linux.
* A device running Android 11+.

If you are profiling your own app and are not running a userdebug build of
Android, your app needs to be marked as profileable or
debuggable in its manifest. See the
[heapprofd documentation](/docs/data-sources/native-heap-profiler.md#heapprofd-targets)
for more details on which applications can be targeted.

## dumpsys meminfo

A good place to start investigating the memory usage of a process is
`dumpsys meminfo`, which gives a high-level overview of how much of the various
types of memory a process is using.

```bash
$ adb shell dumpsys meminfo com.android.systemui

Applications Memory Usage (in Kilobytes):
Uptime: 2030149 Realtime: 2030149

** MEMINFO in pid 1974 [com.android.systemui] **
                   Pss  Private  Private  SwapPss      Rss     Heap     Heap     Heap
                 Total    Dirty    Clean    Dirty    Total     Size    Alloc     Free
                ------   ------   ------   ------   ------   ------   ------   ------
  Native Heap    16840    16804        0     6764    19428    34024    25037     5553
  Dalvik Heap     9110     9032        0      136    13164    36444     9111    27333

[more stuff...]
```

Looking at the "Private Dirty" column of Dalvik Heap (= Java Heap) and
Native Heap, we can see that SystemUI's memory usage on the Java heap
is 9M and on the native heap 17M.

## Linux memory management

But what do *clean*, *dirty*, *Rss*, *Pss* and *Swap* actually mean? To answer
this question, we need to delve into Linux memory management a bit.

From the kernel's point of view, memory is split into equally sized blocks
called *pages*. These are generally 4KiB.

Pages are organized in virtually contiguous ranges called VMAs
(Virtual Memory Areas).

VMAs are created when a process requests a new pool of memory pages through
the [mmap() system call](https://man7.org/linux/man-pages/man2/mmap.2.html).
Applications rarely call mmap() directly.
Those calls are typically mediated by
the allocator, `malloc()/operator new()` for native processes or by the
Android RunTime for Java apps.

VMAs can be of two types: file-backed and anonymous.

**File-backed VMAs** are a view of a file in memory. They are obtained by
passing a file descriptor to `mmap()`. The kernel will serve page faults on the
VMA through the passed file, so reading a pointer to the VMA becomes the
equivalent of a `read()` on the file.
File-backed VMAs are used, for instance, by the dynamic linker (`ld`) when
executing new processes or dynamically loading libraries, or by the Android
framework, when loading a new .dex library or accessing resources in the APK.

**Anonymous VMAs** are memory-only areas not backed by any file. This is the way
allocators request dynamic memory from the kernel. Anonymous VMAs are obtained
by calling `mmap(... MAP_ANONYMOUS ...)`.

Physical memory is only allocated, in page granularity, once the application
tries to read from or write to a VMA. If you allocate 32 MiB worth of pages but
only touch one byte, your process' memory usage will only go up by 4KiB. You
will have increased your process' *virtual memory* by 32 MiB, but its resident
*physical memory* by 4 KiB.

When optimizing memory use of programs, we are interested in reducing their
footprint in *physical memory*. High *virtual memory* use is generally not a
cause for concern on modern platforms (except if you run out of address space,
which is very hard on 64-bit systems).

We call the amount of a process' memory that is resident in *physical memory*
its **RSS** (Resident Set Size). Not all resident memory is equal though.

From a memory-consumption viewpoint, individual pages within a VMA can have the
following states:

* **Resident**: the page is mapped to a physical memory page.
  Resident pages can be in two states:
    * **Clean** (only for file-backed pages): the contents of the page are the
      same as the contents on disk. The kernel can evict clean pages more easily
      in case of memory pressure. This is because if they should be needed
      again, the kernel knows it can re-create their contents by reading them
      from the underlying file.
    * **Dirty**: the contents of the page diverge from the disk, or (in most
      cases), the page has no disk backing (i.e. it's _anonymous_). Dirty pages
      cannot be evicted because doing so would cause data loss. However, they
      can be swapped out to disk or ZRAM, if present.
* **Swapped**: a dirty page can be written to the swap file on disk (on most Linux
  desktop distributions) or compressed (on Android and CrOS through
  [ZRAM](https://source.android.com/devices/tech/perf/low-ram#zram)). The page
  will stay swapped until a new page fault on its virtual address happens, at
  which point the kernel will bring it back into main memory.
* **Not present**: no page fault ever happened on the page, or the page was
  clean and was later evicted.

It is generally more important to reduce the amount of _dirty_ memory as that
cannot be reclaimed like _clean_ memory and, on Android, even if swapped in
ZRAM, will still eat part of the system memory budget.
This is why we looked at *Private Dirty* in the `dumpsys meminfo` example.

*Shared* memory can be mapped into more than one process. This means VMAs in
different processes refer to the same physical memory. This typically happens
with file-backed memory of commonly used libraries (e.g., libc.so,
framework.dex) or, more rarely, when a process `fork()`s and a child process
inherits dirty memory from its parent.

This introduces the concept of **PSS** (Proportional Set Size). In **PSS**,
memory that is resident in multiple processes is proportionally attributed to
each of them.
If we map one 4KiB page into four processes, each of their
**PSS** will increase by 1KiB.

#### Recap

* Dynamically allocated memory, whether allocated through C's `malloc()`, C++'s
  `operator new()` or Java's `new X()`, always starts as _anonymous_ and
  _dirty_, unless it is never used.
* If this memory is not read/written for a while, or in case of memory pressure,
  it gets swapped out to ZRAM and becomes _swapped_.
* Anonymous memory, whether _resident_ (and hence _dirty_) or _swapped_, is
  always a resource hog and should be avoided if unnecessary.
* File-mapped memory comes from code (Java or native), libraries and resources
  and is almost always _clean_. Clean memory also erodes the system memory
  budget, but typically application developers have less control over it.

## Memory over time

`dumpsys meminfo` is good to get a snapshot of the current memory usage, but
even very short memory spikes can lead to low-memory situations, which will
lead to [LMKs](#lmk). We have two tools to investigate situations like this:

* RSS High Watermark.
* Memory tracepoints.

### RSS High Watermark

We can get a lot of information from the `/proc/[pid]/status` file, including
memory information. `VmHWM` shows the maximum RSS usage the process has seen
since it was started. This value is kept updated by the kernel.

```bash
$ adb shell cat '/proc/$(pidof com.android.systemui)/status'
[...]
VmHWM:    256972 kB
VmRSS:    195272 kB
RssAnon:   30184 kB
RssFile:  164420 kB
RssShmem:    668 kB
VmSwap:    43960 kB
[...]
```

### Memory tracepoints

NOTE: For detailed instructions about the memory trace points see the
      [Data sources > Memory > Counters and events](/docs/data-sources/memory-counters.md)
      page.

We can use Perfetto to get information about memory management events from the
kernel.

```bash
$ adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 8960
    fill_policy: DISCARD
}
buffers: {
    size_kb: 1280
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 1
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "mm_event/mm_event_record"
            ftrace_events: "kmem/rss_stat"
            ftrace_events: "kmem/ion_heap_grow"
            ftrace_events: "kmem/ion_heap_shrink"
        }
    }
}
duration_ms: 30000

EOF
```

While it is running, take a photo if you are following along.

Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/mem-trace`
and upload to the [Perfetto UI](https://ui.perfetto.dev). This will show
overall stats about system [ION](#ion) usage, and per-process stats to
expand. Scroll down to (or Ctrl-F for) `com.google.android.GoogleCamera` and
expand. This will show a timeline for various memory stats for camera.

![Camera Memory Trace](/docs/images/trace-rss-camera.png)

We can see that around 2/3 into the trace, the memory spiked (in the
mem.rss.anon track). This is where I took a photo. This is a good way to see
how the memory usage of an application reacts to different triggers.

## Which tool to use

If you want to drill down into _anonymous_ memory allocated by Java code,
labeled by `dumpsys meminfo` as `Dalvik Heap`, see the
[Analyzing the Java Heap](#java-hprof) section.

If you want to drill down into _anonymous_ memory allocated by native code,
labeled by `dumpsys meminfo` as `Native Heap`, see the
[Analyzing the Native Heap](#heapprofd) section. Note that it's frequent to end
up with native memory even if your app doesn't have any C/C++ code.
This is
because some framework APIs (e.g. Regex) are implemented internally through
native code.

If you want to drill down into file-mapped memory, the best option is to use
`adb shell showmap PID` (on Android) or inspect `/proc/PID/smaps`.


## {#lmk} Low-memory kills

When an Android device becomes low on memory, a daemon called `lmkd` will
start killing processes in order to free up memory. Devices' strategies differ,
but in general processes will be killed in order of descending `oom_score_adj`
score (i.e. background apps and processes first, foreground processes last).

Apps on Android are not killed when switching away from them. They instead
remain *cached* even after the user finishes using them. This is to make
subsequent starts of the app faster. Such apps will generally be killed
first (because they have a higher `oom_score_adj`).

We can collect information about LMKs and `oom_score_adj` using Perfetto.

```bash
$ adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 8960
    fill_policy: DISCARD
}
buffers: {
    size_kb: 1280
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 1
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "lowmemorykiller/lowmemory_kill"
            ftrace_events: "oom/oom_score_adj_update"
            ftrace_events: "ftrace/print"
            atrace_apps: "lmkd"
        }
    }
}
duration_ms: 60000

EOF
```

Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/oom-trace`
and upload to the [Perfetto UI](https://ui.perfetto.dev).

![OOM Score](/docs/images/oom-score.png)

We can see that the OOM score of Camera gets reduced (making it less likely
to be killed) when it is opened, and gets increased again once it is closed.

## {#heapprofd} Analyzing the Native Heap

**Native Heap Profiles require Android 10.**

NOTE: For detailed instructions about the native heap profiler and
      troubleshooting see the
      [Data sources > Native heap profiler](/docs/data-sources/native-heap-profiler.md)
      page.

Applications usually get memory through `malloc` or C++'s `new` rather than
directly getting it from the kernel. The allocator makes sure that your memory
is more efficiently handled (i.e. there are not many gaps) and that the
overhead from asking the kernel remains low.

We can log the native allocations and frees that a process does using
*heapprofd*. The resulting profile can be used to attribute memory usage
to particular function callstacks, supporting a mix of both native and Java
code. The profile *will only show allocations done while it was running*; any
allocations done before will not be shown.

### Capturing the profile

Use the `tools/heap_profile` script to profile a process. If you are having
trouble, make sure you are using the
[latest version](https://raw.githubusercontent.com/google/perfetto/master/tools/heap_profile).
See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```bash
$ tools/heap_profile -n system_server

Profiling active. Press Ctrl+C to terminate.
You may disconnect your device.

Wrote profiles to /tmp/profile-1283e247-2170-4f92-8181-683763e17445 (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

When you see *Profiling active*, play around with the phone a bit.
When you
are done, press Ctrl-C to end the profile. For this tutorial, I opened a
couple of apps.

### Viewing the data

Then upload the `raw-trace` file from the output directory to the
[Perfetto UI](https://ui.perfetto.dev) and click on the diamond marker that
appears.

![Profile Diamond](/docs/images/profile-diamond.png)

The tabs that are available are:

* **space**: how many bytes were allocated but not freed at this callstack the
  moment the dump was created.
* **alloc\_space**: how many bytes were allocated (including ones freed at the
  moment of the dump) at this callstack.
* **objects**: how many allocations without matching frees were sampled at this
  callstack.
* **alloc\_objects**: how many allocations (including ones with matching frees)
  were sampled at this callstack.

The default view will show you all allocations that were done while the
profile was running but that weren't freed (the **space** tab).

![Native Flamegraph](/docs/images/syssrv-apk-assets-two.png)

We can see that a lot of memory gets allocated in paths through
`ResourceManager.loadApkAssets`. To get the total memory that was allocated
this way, we can enter "loadApkAssets" into the Focus textbox. This will only
show callstacks where some frame matches "loadApkAssets".

![Native Flamegraph with Focus](/docs/images/syssrv-apk-assets-focus.png)

From this we have a clear idea where in the code we have to look. From the
code we can see how that memory is being used and if we actually need all of
it. In this case the key is the `_CompressedAsset` that requires decompressing
into RAM rather than being able to (_cleanly_) memory-map. By not compressing
these data, we can save RAM.
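
To make the four tabs listed earlier in this section more concrete, here is a minimal Python sketch of how the counters relate to each other. It assumes a profile can be modelled as a list of sampled allocation and free events. The event layout, the example callstacks and the `summarize` helper are hypothetical and are not part of heapprofd or its trace format.

```python
from collections import defaultdict

# Hypothetical event model: (kind, allocation_id, callstack, size_in_bytes).
# This is an illustration only, not heapprofd's actual data format.
EVENTS = [
    ("alloc", 1, "main;loadApkAssets", 4096),
    ("alloc", 2, "main;loadApkAssets", 1024),
    ("alloc", 3, "main;parseXml", 512),
    ("free", 2, "main;loadApkAssets", 1024),
]


def summarize(events):
    """Return per-callstack counters mirroring the four flamegraph tabs."""
    stats = defaultdict(lambda: {"space": 0, "alloc_space": 0,
                                 "objects": 0, "alloc_objects": 0})
    live = {}  # allocation_id -> (callstack, size) for unfreed allocations
    for kind, alloc_id, stack, size in events:
        if kind == "alloc":
            live[alloc_id] = (stack, size)
            # alloc_* counters include allocations that are freed later.
            stats[stack]["alloc_space"] += size
            stats[stack]["alloc_objects"] += 1
        elif kind == "free":
            live.pop(alloc_id, None)
    # space/objects only count what is still live when the dump is taken.
    for stack, size in live.values():
        stats[stack]["space"] += size
        stats[stack]["objects"] += 1
    return dict(stats)


if __name__ == "__main__":
    for stack, counters in summarize(EVENTS).items():
        print(stack, counters)
```

In this toy model, the freed 1024-byte allocation still counts towards **alloc\_space** and **alloc\_objects** of its callstack, but not towards **space** or **objects**.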

## {#java-hprof} Analyzing the Java Heap

**Java Heap Dumps require Android 11.**

NOTE: For detailed instructions about the Java heap profiler and
      troubleshooting see the
      [Data sources > Java heap profiler](/docs/data-sources/java-heap-profiler.md)
      page.

### Capturing the profile

We can get a snapshot of the graph of all the Java objects that constitute the
Java heap. We use the `tools/java_heap_dump` script. If you are having trouble,
make sure you are using the
[latest version](https://raw.githubusercontent.com/google/perfetto/master/tools/java_heap_dump).

```bash
$ tools/java_heap_dump -n com.android.systemui

Dumping Java Heap.
Wrote profile to /tmp/tmpup3QrQprofile
This can be viewed using https://ui.perfetto.dev.
```

### Viewing the Data

Upload the trace to the [Perfetto UI](https://ui.perfetto.dev) and click on
the diamond marker that appears.

![Profile Diamond](/docs/images/profile-diamond.png)

This will present a flamegraph of the memory attributed to the shortest path
to a garbage-collection root. In general an object is reachable by many paths;
we only show the shortest as that reduces the complexity of the data displayed
and is generally the highest-signal. The rightmost `[merged]` stack is the
sum of all objects that are too small to be displayed.

![Java Flamegraph](/docs/images/java-flamegraph.png)

The tabs that are available are:

* **space**: how many bytes are retained via this path to the GC root.
* **objects**: how many objects are retained via this path to the GC root.

If we only want to see paths that have a frame containing some string,
we can use the Focus feature.

As with native heap profiles, if we want to focus on some specific aspect of the
graph, we can filter by the names of the classes. If we wanted to see everything
that could be caused by notifications, we can put "notification" in the Focus box.

![Java Flamegraph with Focus](/docs/images/java-flamegraph-focus.png)

We aggregate the paths per class name, so if there are multiple objects of the
same type retained by a `java.lang.Object[]`, we will show one element as its
child, as you can see in the leftmost stack above.
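
The attribution described in this section can be sketched as a breadth-first search from the GC roots: each object is charged to the first (and therefore shortest) path that reaches it, and paths are keyed by class name so that objects of the same class under the same parent collapse into one node. The Python sketch below works on a toy heap graph. The object ids, class names, sizes and the `shortest_path_flamegraph` helper are invented for illustration and do not reflect how Perfetto implements this.

```python
from collections import defaultdict, deque

# Toy heap graph (all values hypothetical):
# object id -> (class name, self size in bytes, referenced object ids).
HEAP = {
    1: ("java.lang.Object[]", 16, [2, 3]),
    2: ("Notification", 100, [4]),
    3: ("Notification", 100, [4]),
    4: ("Bitmap", 1000, []),
}
GC_ROOTS = [1]


def shortest_path_flamegraph(heap, roots):
    """Attribute each object's size to its shortest class-name path from a GC root."""
    sizes = defaultdict(int)  # tuple of class names -> self bytes on that path
    visited = set(roots)
    queue = deque((obj_id, (heap[obj_id][0],)) for obj_id in roots)
    while queue:
        obj_id, path = queue.popleft()
        _, size, refs = heap[obj_id]
        sizes[path] += size
        for ref in refs:
            # BFS reaches every object first via one of its shortest paths;
            # later (equal or longer) paths to the same object are ignored.
            if ref not in visited:
                visited.add(ref)
                queue.append((ref, path + (heap[ref][0],)))
    return dict(sizes)


if __name__ == "__main__":
    for path, size in shortest_path_flamegraph(HEAP, GC_ROOTS).items():
        print(" -> ".join(path), size)
```

In this toy graph the two `Notification` objects share one aggregated entry under `java.lang.Object[]`, and the `Bitmap` they both reference is counted only once. The sketch records self sizes per path; adding up a path and every path that extends it gives the cumulative size a flamegraph node would display.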