# Debugging memory usage on Android

## Prerequisites

* A host running macOS or Linux.
* [ADB](https://developer.android.com/studio/command-line/adb) installed and
  in PATH.
* A device running Android 11+.

If you are profiling your own app and are not running a userdebug build of
Android, your app needs to be marked as profileable or
debuggable in its manifest. See the [heapprofd documentation](
/docs/data-sources/native-heap-profiler.md#heapprofd-targets) for more
details on which applications can be targeted.

## dumpsys meminfo

A good place to start investigating the memory usage of a process is
`dumpsys meminfo`, which gives a high-level overview of how much of each
type of memory the process is using.

```bash
$ adb shell dumpsys meminfo com.android.systemui

Applications Memory Usage (in Kilobytes):
Uptime: 2030149 Realtime: 2030149

** MEMINFO in pid 1974 [com.android.systemui] **
                   Pss  Private  Private  SwapPss      Rss     Heap     Heap     Heap
                 Total    Dirty    Clean    Dirty    Total     Size    Alloc     Free
                ------   ------   ------   ------   ------   ------   ------   ------
  Native Heap    16840    16804        0     6764    19428    34024    25037     5553
  Dalvik Heap     9110     9032        0      136    13164    36444     9111    27333

[more stuff...]
```

Looking at the "Private Dirty" column of Dalvik Heap (= Java Heap) and
Native Heap, we can see that SystemUI's memory usage on the Java heap
is about 9 MB, and on the native heap about 17 MB.

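To track these two numbers without reading the table by eye, the rows can be
pulled out with a small filter. This is a minimal sketch: `private_dirty_kb` is
a hypothetical helper name, and it assumes the column layout shown above
(Private Dirty is the second numeric column), which may differ between Android
versions.

```bash
# private_dirty_kb <Native|Dalvik>
# Reads `dumpsys meminfo <package>` output on stdin and prints the
# "Private Dirty" value (in KB) from the "<name> Heap" row.
# Assumed layout: fields are name, "Heap", Pss Total, Private Dirty, ...
private_dirty_kb() {
  awk -v heap="$1" '$1 == heap && $2 == "Heap" { print $4; exit }'
}

# Usage (requires a connected device):
#   adb shell dumpsys meminfo com.android.systemui | private_dirty_kb Native
```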
## Linux memory management

But what do *clean*, *dirty*, *Rss*, *Pss* and *Swap* actually mean? To answer
this question, we need to delve into Linux memory management a bit.

From the kernel's point of view, memory is split into equally sized blocks
called *pages*. These are generally 4 KiB.

Pages are organized in virtually contiguous ranges called VMAs
(Virtual Memory Areas).

VMAs are created when a process requests a new pool of memory pages through
the [mmap() system call](https://man7.org/linux/man-pages/man2/mmap.2.html).
Applications rarely call mmap() directly. Those calls are typically mediated by
the allocator, `malloc()/operator new()` for native processes or by the
Android RunTime for Java apps.

VMAs can be of two types: file-backed and anonymous.

**File-backed VMAs** are a view of a file in memory. They are obtained by
passing a file descriptor to `mmap()`. The kernel will serve page faults on the
VMA through the passed file, so reading through a pointer into the VMA becomes
the equivalent of a `read()` on the file.
File-backed VMAs are used, for instance, by the dynamic linker (`ld`) when
executing new processes or dynamically loading libraries, or by the Android
framework, when loading a new .dex library or accessing resources in the APK.

**Anonymous VMAs** are memory-only areas not backed by any file. This is the way
allocators request dynamic memory from the kernel. Anonymous VMAs are obtained
by calling `mmap(... MAP_ANONYMOUS ...)`.

Physical memory is only allocated, in page granularity, once the application
tries to read from or write to a VMA. If you allocate 32 MiB worth of pages but
only touch one byte, your process' memory usage will only go up by 4 KiB. You
will have increased your process' *virtual memory* by 32 MiB, but its resident
*physical memory* by 4 KiB.

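The arithmetic in the 32 MiB example can be checked with shell integer math.
This sketch assumes a 4 KiB page size by default. `pages_for_bytes` is a
hypothetical helper, and the actual page size of a device can be queried with
`adb shell getconf PAGE_SIZE`.

```bash
# pages_for_bytes <bytes> [page_size_bytes]
# Prints how many pages are needed to back <bytes>, rounding up.
pages_for_bytes() {
  local bytes=$1 page_size=${2:-4096}
  echo $(( (bytes + page_size - 1) / page_size ))
}

# A 32 MiB mapping spans this many pages of virtual memory:
virtual_pages=$(pages_for_bytes $((32 * 1024 * 1024)))
# Touching a single byte faults in only the page that contains it:
resident_pages=$(pages_for_bytes 1)

echo "virtual:  $((virtual_pages * 4)) KiB"
echo "resident: $((resident_pages * 4)) KiB"
```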
When optimizing memory use of programs, we are interested in reducing their
footprint in *physical memory*. High *virtual memory* use is generally not a
cause for concern on modern platforms (except if you run out of address space,
which is very hard on 64-bit systems).

We call the amount of a process' memory that is resident in *physical memory*
its **RSS** (Resident Set Size). Not all resident memory is equal though.

From a memory-consumption viewpoint, individual pages within a VMA can have the
following states:

* **Resident**: the page is mapped to a physical memory page. Resident pages can
  be in two states:
    * **Clean** (only for file-backed pages): the contents of the page are the
      same as the contents on disk. The kernel can evict clean pages more easily
      in case of memory pressure. This is because if they should be needed
      again, the kernel knows it can re-create their contents by reading them
      from the underlying file.
    * **Dirty**: the contents of the page diverge from the disk, or (in most
      cases), the page has no disk backing (i.e. it's _anonymous_). Dirty pages
      cannot be evicted because doing so would cause data loss. However they can
      be swapped out to disk or ZRAM, if present.
* **Swapped**: a dirty page can be written to the swap file on disk (on most Linux
  desktop distributions) or compressed (on Android and CrOS through
  [ZRAM](https://source.android.com/devices/tech/perf/low-ram#zram)). The page
  will stay swapped until a new page fault on its virtual address happens, at
  which point the kernel will bring it back into main memory.
* **Not present**: no page fault ever happened on the page or the page was
  clean and later was evicted.

It is generally more important to reduce the amount of _dirty_ memory as that
cannot be reclaimed like _clean_ memory and, on Android, even if swapped in
ZRAM, will still eat part of the system memory budget.
This is why we looked at *Private Dirty* in the `dumpsys meminfo` example.

*Shared* memory can be mapped into more than one process. This means VMAs in
different processes refer to the same physical memory. This typically happens
with file-backed memory of commonly used libraries (e.g., libc.so,
framework.dex) or, more rarely, when a process `fork()`s and a child process
inherits dirty memory from its parent.

This introduces the concept of **PSS** (Proportional Set Size). In **PSS**,
memory that is resident in multiple processes is proportionally attributed to
each of them. If we map one 4 KiB page into four processes, each of their
**PSS** will increase by 1 KiB.

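As a worked example of this proportional attribution, the sketch below computes
a PSS value from a private amount plus a list of shared regions. The `pss_kb`
helper and its `size:sharers` argument format are made up for illustration; the
kernel does this accounting per page and reports the result in
`/proc/PID/smaps`.

```bash
# pss_kb <private_kb> [<shared_kb>:<num_processes> ...]
# Private memory counts in full; each shared region is divided by the
# number of processes that map it. Integer division, result in KiB.
pss_kb() {
  local total=$1 region size sharers
  shift
  for region in "$@"; do
    size=${region%%:*}
    sharers=${region##*:}
    total=$(( total + size / sharers ))
  done
  echo "$total"
}

# One 4 KiB page shared by four processes adds 1 KiB to each PSS.
pss_kb 0 4:4
# 100 KiB private, 400 KiB shared by 4 processes, 90 KiB shared by 3.
pss_kb 100 400:4 90:3
```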
#### Recap

* Dynamically allocated memory, whether allocated through C's `malloc()`, C++'s
  `operator new()` or Java's `new X()`, always starts as _anonymous_ and
  _dirty_, unless it is never used.
* If this memory is not read/written for a while, or in case of memory pressure,
  it gets swapped out to ZRAM and becomes _swapped_.
* Anonymous memory, whether _resident_ (and hence _dirty_) or _swapped_, is
  always a resource hog and should be avoided if unnecessary.
* File-mapped memory comes from code (Java or native), libraries and resources
  and is almost always _clean_. Clean memory also erodes the system memory
  budget but typically application developers have less control over it.

## Memory over time

`dumpsys meminfo` is good to get a snapshot of the current memory usage, but
even very short memory spikes can lead to low-memory situations, which will
lead to [LMKs](#lmk). We have two tools to investigate situations like this:

* RSS High Watermark.
* Memory tracepoints.

### RSS High Watermark

We can get a lot of information from the `/proc/[pid]/status` file, including
memory information. `VmHWM` shows the maximum RSS usage the process has seen
since it was started. This value is kept updated by the kernel.

```bash
$ adb shell cat '/proc/$(pidof com.android.systemui)/status'
[...]
VmHWM:    256972 kB
VmRSS:    195272 kB
RssAnon:  30184 kB
RssFile:  164420 kB
RssShmem: 668 kB
VmSwap:   43960 kB
[...]
```

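To see how far the current RSS sits below its peak, the two fields can be
subtracted. This is a small sketch under the assumption that the status output
has the `VmHWM` and `VmRSS` lines shown above; the helper names `status_kb` and
`rss_below_peak_kb` are hypothetical.

```bash
# status_kb <field>
# Reads /proc/<pid>/status content on stdin and prints the value (in kB)
# of the given field, e.g. VmHWM or VmRSS.
status_kb() {
  awk -v key="$1:" '$1 == key { print $2; exit }'
}

# rss_below_peak_kb
# Reads /proc/<pid>/status content on stdin and prints VmHWM - VmRSS in kB.
rss_below_peak_kb() {
  awk '$1 == "VmHWM:" { hwm = $2 }
       $1 == "VmRSS:" { rss = $2 }
       END { print hwm - rss }'
}

# Usage (requires a connected device):
#   adb shell cat '/proc/$(pidof com.android.systemui)/status' | rss_below_peak_kb
```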
### Memory tracepoints

NOTE: For detailed instructions about the memory trace points see the
      [Data sources > Memory > Counters and events](
      /docs/data-sources/memory-counters.md) page.

We can use Perfetto to get information about memory management events from the
kernel.

```bash
$ adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 8960
    fill_policy: DISCARD
}
buffers: {
    size_kb: 1280
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 1
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "mm_event/mm_event_record"
            ftrace_events: "kmem/rss_stat"
            ftrace_events: "kmem/ion_heap_grow"
            ftrace_events: "kmem/ion_heap_shrink"
        }
    }
}
duration_ms: 30000

EOF
```

While the trace is running, take a photo with the camera app if you are
following along.

Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/mem-trace`
and upload to the [Perfetto UI](https://ui.perfetto.dev). This will show
overall stats about system [ION](#ion) usage, and per-process stats to
expand. Scroll down to (or Ctrl-F for) `com.google.android.GoogleCamera` and
expand. This will show a timeline for various memory stats for camera.

![Camera Memory Trace](/docs/images/trace-rss-camera.png)

We can see that around 2/3 into the trace, the memory spiked (in the
mem.rss.anon track). This is where I took a photo. This is a good way to see
how the memory usage of an application reacts to different triggers.

## Which tool to use

If you want to drill down into _anonymous_ memory allocated by Java code,
labeled by `dumpsys meminfo` as `Dalvik Heap`, see the
[Analyzing the Java Heap](#java-hprof) section.

If you want to drill down into _anonymous_ memory allocated by native code,
labeled by `dumpsys meminfo` as `Native Heap`, see the
[Analyzing the Native Heap](#heapprofd) section. Note that it is common to end
up with native memory even if your app doesn't have any C/C++ code. This is
because some framework APIs (e.g. Regex) are implemented internally in native
code.

If you want to drill down into file-mapped memory the best option is to use
`adb shell showmap PID` (on Android) or inspect `/proc/PID/smaps`.

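When inspecting `/proc/PID/smaps` by hand, it helps to total the `Pss:` lines
per mapping name. The filter below is a sketch of that idea, not a replacement
for `showmap`: `pss_by_mapping` is a hypothetical helper, it assumes the usual
smaps layout (a header line per mapping followed by `Field: value kB` lines),
and reading another process' smaps typically requires elevated privileges.

```bash
# pss_by_mapping
# Reads /proc/<pid>/smaps content on stdin and prints "<Pss kB><TAB><name>"
# per mapping name, largest first. Unnamed mappings are grouped as "[anon]".
pss_by_mapping() {
  awk '
    # Header line of a mapping: "start-end perms offset dev inode [name]"
    /^[0-9a-f]+-[0-9a-f]+ / {
      name = (NF >= 6) ? $6 : "[anon]"
      for (i = 7; i <= NF; i++) name = name " " $i
      next
    }
    $1 == "Pss:" { pss[name] += $2 }
    END { for (n in pss) printf "%d\t%s\n", pss[n], n }
  ' | sort -rn
}

# Usage (requires a connected device and sufficient permissions):
#   adb shell cat /proc/PID/smaps | pss_by_mapping | head
```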
## {#lmk} Low-memory kills

When an Android device becomes low on memory, a daemon called `lmkd` will
start killing processes in order to free up memory. Devices' strategies differ,
but in general processes will be killed in order of descending `oom_score_adj`
score (i.e. background apps and processes first, foreground processes last).

Apps on Android are not killed when switching away from them. They instead
remain *cached* even after the user finishes using them. This is to make
subsequent starts of the app faster. Such apps will generally be killed
first (because they have a higher `oom_score_adj`).

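As a rough illustration of that ordering, the sketch below sorts a list of
processes by `oom_score_adj`, highest first. The `name score` input format and
the `kill_order` helper are assumptions made for this example, and real `lmkd`
behavior depends on device configuration. On a device, the score of a process
can be read from `/proc/PID/oom_score_adj`.

```bash
# kill_order
# Reads "<process name> <oom_score_adj>" lines on stdin and prints the names
# in descending score order, i.e. the most likely kill candidates first.
kill_order() {
  sort -k2,2nr | awk '{ print $1 }'
}

# Example with illustrative scores: a cached app, a foreground app and a
# system process.
printf '%s\n' \
  'com.example.foreground 0' \
  'system_server -900' \
  'com.example.cached 900' | kill_order
```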
We can collect information about LMKs and `oom_score_adj` using Perfetto.

```bash
$ adb shell perfetto \
  -c - --txt \
  -o /data/misc/perfetto-traces/trace \
<<EOF

buffers: {
    size_kb: 8960
    fill_policy: DISCARD
}
buffers: {
    size_kb: 1280
    fill_policy: DISCARD
}
data_sources: {
    config {
        name: "linux.process_stats"
        target_buffer: 1
        process_stats_config {
            scan_all_processes_on_start: true
        }
    }
}
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "lowmemorykiller/lowmemory_kill"
            ftrace_events: "oom/oom_score_adj_update"
            ftrace_events: "ftrace/print"
            atrace_apps: "lmkd"
        }
    }
}
duration_ms: 60000

EOF
```

Pull the file using `adb pull /data/misc/perfetto-traces/trace ~/oom-trace`
and upload to the [Perfetto UI](https://ui.perfetto.dev).

![OOM Score](/docs/images/oom-score.png)

We can see that the OOM score of Camera gets reduced (making it less likely
to be killed) when it is opened, and gets increased again once it is closed.

## {#heapprofd} Analyzing the Native Heap

**Native Heap Profiles require Android 10.**

NOTE: For detailed instructions about the native heap profiler and
      troubleshooting see the [Data sources > Native heap profiler](
      /docs/data-sources/native-heap-profiler.md) page.

Applications usually get memory through `malloc` or C++'s `new` rather than
directly getting it from the kernel. The allocator makes sure that your memory
is more efficiently handled (i.e. there are not many gaps) and that the
overhead from asking the kernel remains low.

We can log the native allocations and frees that a process does using
*heapprofd*. The resulting profile can be used to attribute memory usage
to particular function callstacks, supporting a mix of both native and Java
code. The profile *will only show allocations done while it was running*; any
allocations done before will not be shown.

### {#capture-profile-native} Capturing the profile

Use the `tools/heap_profile` script to profile a process. If you are having
trouble, make sure you are using the [latest version](
https://raw.githubusercontent.com/google/perfetto/master/tools/heap_profile).
See all the arguments using `tools/heap_profile -h`, or use the defaults
and just profile a process (e.g. `system_server`):

```bash
$ tools/heap_profile -n system_server

Profiling active. Press Ctrl+C to terminate.
You may disconnect your device.

Wrote profiles to /tmp/profile-1283e247-2170-4f92-8181-683763e17445 (symlink /tmp/heap_profile-latest)
These can be viewed using pprof. Googlers: head to pprof/ and upload them.
```

When you see *Profiling active*, play around with the phone a bit. When you
are done, press Ctrl-C to end the profile. For this tutorial, I opened a
couple of apps.

### Viewing the data

Then upload the `raw-trace` file from the output directory to the
[Perfetto UI](https://ui.perfetto.dev) and click on the diamond marker that
appears.

![Profile Diamond](/docs/images/profile-diamond.png)

The tabs that are available are:

* **space**: how many bytes were allocated but not freed at this callstack at
  the moment the dump was created.
* **alloc\_space**: how many bytes were allocated (including ones freed at the
  moment of the dump) at this callstack.
* **objects**: how many allocations without matching frees were sampled at this
  callstack.
* **alloc\_objects**: how many allocations (including ones with matching frees)
  were sampled at this callstack.

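The difference between the four tabs can be illustrated on a toy event log for
a single callstack. This is only a sketch of the bookkeeping: the
`alloc <id> <bytes>` / `free <id>` input format and the `heap_tabs` helper are
invented for the example, and it ignores the fact that heapprofd samples
allocations rather than recording every one.

```bash
# heap_tabs
# Reads "alloc <id> <bytes>" and "free <id>" events on stdin and prints the
# four values as they stand at the end of the input.
heap_tabs() {
  awk '
    $1 == "alloc" {
      size[$2] = $3
      alloc_space += $3; alloc_objects++
      space += $3; objects++
    }
    $1 == "free" && ($2 in size) {
      space -= size[$2]; objects--
      delete size[$2]
    }
    END {
      printf "space=%d alloc_space=%d objects=%d alloc_objects=%d\n",
             space, alloc_space, objects, alloc_objects
    }
  '
}

# Three allocations, one of which is freed before the dump.
printf '%s\n' 'alloc a 100' 'alloc b 50' 'free a' 'alloc c 30' | heap_tabs
```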
The default view will show you all allocations that were done while the
profile was running but that weren't freed (the **space** tab).

![Native Flamegraph](/docs/images/syssrv-apk-assets-two.png)

We can see that a lot of memory gets allocated in paths through
`ResourceManager.loadApkAssets`. To get the total memory that was allocated
this way, we can enter "loadApkAssets" into the Focus textbox. This will only
show callstacks where some frame matches "loadApkAssets".

![Native Flamegraph with Focus](/docs/images/syssrv-apk-assets-focus.png)

From this we have a clear idea where in the code we have to look. From the
code we can see how that memory is being used and if we actually need all of
it. In this case the key is the `_CompressedAsset`, which has to be decompressed
into RAM rather than being (_cleanly_) memory-mapped. By not compressing
this data, we can save RAM.

## {#java-hprof} Analyzing the Java Heap

**Java Heap Dumps require Android 11.**

NOTE: For detailed instructions about the Java heap profiler and
      troubleshooting see the [Data sources > Java heap profiler](
      /docs/data-sources/java-heap-profiler.md) page.

### {#capture-profile-java} Capturing the profile

We can get a snapshot of the graph of all the Java objects that constitute the
Java heap. We use the `tools/java_heap_dump` script. If you are having trouble,
make sure you are using the [latest version](
https://raw.githubusercontent.com/google/perfetto/master/tools/java_heap_dump).

```bash
$ tools/java_heap_dump -n com.android.systemui

Dumping Java Heap.
Wrote profile to /tmp/tmpup3QrQprofile
This can be viewed using https://ui.perfetto.dev.
```

### Viewing the Data

Upload the trace to the [Perfetto UI](https://ui.perfetto.dev) and click on
the diamond marker that appears.

![Profile Diamond](/docs/images/profile-diamond.png)

This will present a flamegraph of the memory attributed to the shortest path
to a garbage-collection root. In general an object is reachable by many paths;
we only show the shortest as that reduces the complexity of the data displayed
and is generally the highest-signal. The rightmost `[merged]` stack is the
sum of all objects that are too small to be displayed.

![Java Flamegraph](/docs/images/java-flamegraph.png)

The tabs that are available are:

* **space**: how many bytes are retained via this path to the GC root.
* **objects**: how many objects are retained via this path to the GC root.

As with native heap profiles, if we want to focus on some specific aspect of the
graph, we can filter by the names of the classes using the Focus feature. If we
wanted to see everything that could be caused by notifications, we can put
"notification" in the Focus box.

![Java Flamegraph with Focus](/docs/images/java-flamegraph-focus.png)

We aggregate the paths per class name, so if there are multiple objects of the
same type retained by a `java.lang.Object[]`, we will show one element as its
child, as you can see in the leftmost stack above.