Android Live-LocK Daemon
========================

Introduction
------------

Android Live-LocK Daemon (llkd) catches kernel deadlocks and mitigates them.

The code is structured to allow integration into another service, either as
part of the main loop or spun off as a thread should that be necessary. A
default standalone implementation is provided by the llkd component.

The 'C' interface from the libllkd component is thus:

    #include "llkd.h"
    bool llkInit(const char* threadname); /* return true if enabled */
    unsigned llkCheckMilliseconds(void);  /* ms to sleep for next check */

If a threadname is provided, a thread is spawned automatically; otherwise the
caller must call llkCheckMilliseconds in its main loop. The function returns
the period of time before the next expected call to this handler.
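To illustrate the main-loop integration, here is a minimal sketch of a host
service polling the checker from its own loop. So that it builds without
libllkd, it uses `stub_llkInit` and `stub_llkCheckMilliseconds`, hypothetical
stand-ins for the two functions declared in llkd.h; their bodies, along with
`run_service_loop` and its `wait_ms` callback, are assumptions made for this
example only. Passing NULL for `wait_ms` gives a dry run that does not block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for llkInit() and llkCheckMilliseconds() from
 * "llkd.h". The bodies are assumptions, not the daemon's logic. */
static bool stub_llkInit(const char* threadname) {
    (void)threadname; /* NULL: no thread requested, caller drives the checks */
    return true;      /* pretend the daemon is enabled */
}

static unsigned stub_llkCheckMilliseconds(void) {
    return 1000u; /* pretend the next check is due in one second */
}

/*
 * Runs `passes` iterations of a host service's main loop with the checker
 * integrated. `wait_ms` is whatever the host uses to block (for instance a
 * poll timeout); NULL skips the blocking step. Returns the total number of
 * milliseconds the loop was asked to wait, or 0 if the checker is disabled.
 */
unsigned run_service_loop(unsigned passes, void (*wait_ms)(unsigned)) {
    if (!stub_llkInit(NULL)) return 0; /* disabled: nothing to schedule */

    unsigned total = 0;
    for (unsigned i = 0; i < passes; ++i) {
        /* ... the host service's own work would go here ... */
        unsigned next = stub_llkCheckMilliseconds();
        assert(next > 0); /* a zero period would make this loop spin */
        if (wait_ms != NULL) wait_ms(next); /* return within `next` ms */
        total += next;
    }
    return total;
}
```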

Operations
----------

There are two detection scenarios: persistent D or Z state, and persistent
stack signature.

If a thread is in D or Z state with no forward progress for longer than
ro.llk.timeout_ms, or ro.llk.[D|Z].timeout_ms, llkd kills the process or the
parent process respectively. If another scan shows that the same process
continues to exist, the live-lock condition is confirmed and llkd needs to
panic. It panics the kernel in a manner that provides the greatest bugreporting
detail about the condition. An alarm self-watchdog is added in case llkd itself
ever gets locked up; it is set to double the expected time to flow through the
mainloop. Sampling occurs every ro.llk.check_ms.
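The first scenario can be sketched as follows. This is an illustration under
stated assumptions, not llkd's implementation: it assumes the thread state is
read from a /proc/<pid>/stat line, and it leaves the definition of "forward
progress" to the caller as a boolean. The names `stat_state`, `stuck_watch`
and `stuck_update` are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns the state character from one /proc/<pid>/stat line, or '\0' if the
 * line is malformed. The comm field may itself contain spaces and
 * parentheses, so the state is taken from just after the LAST ')'.
 */
char stat_state(const char* line) {
    assert(line != NULL);
    const char* close = strrchr(line, ')');
    if (close == NULL || close[1] != ' ' || close[2] == '\0') return '\0';
    return close[2];
}

/* Per-thread bookkeeping: how long it has sat in D or Z without progress. */
struct stuck_watch {
    unsigned stuck_ms;
};

/*
 * Feeds one sample into the watch. Returns true once the thread has been in
 * D or Z state without progress for longer than `timeout_ms`.
 */
bool stuck_update(struct stuck_watch* w, char state, bool progressed,
                  unsigned sample_ms, unsigned timeout_ms) {
    assert(w != NULL);
    if ((state != 'D' && state != 'Z') || progressed) {
        w->stuck_ms = 0; /* left the state or moved forward: start over */
        return false;
    }
    w->stuck_ms += sample_ms;
    return w->stuck_ms > timeout_ms;
}
```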

For userdebug releases only, persistent stack signature checking is enabled.
If a thread in any state but Z has a listed ro.llk.stack kernel symbol
persistently reported, even if there is forward scheduling progress, for
longer than ro.llk.timeout_ms, or ro.llk.stack.timeout_ms, llkd issues a kill
to the process. If another scan shows that the same process continues to
exist, the live-lock condition is confirmed and llkd needs to panic. There is
no ABA detection since forward scheduling progress is allowed, thus the
conditions for the symbols are:

- The check looks for " __symbol__+0x" or " __symbol__.cfi+0x" in
  /proc/__pid__/stack.
- The __symbol__ should be rare and short-lived enough that on a typical
  system the function is seen at most once in a sample over the timeout
  period of ro.llk.stack.timeout_ms; samples occur every ro.llk.check_ms. This
  can be the only way to prevent a false trigger as there is no ABA protection.
- The symbol should be persistently present while the live-lock condition
  exists.
- The symbol should be just below the function that is calling the lock that
  could contend, because if the lock is below or in the symbol function, the
  symbol will show in all affected processes, not just the one that
  caused the lockup.
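The match rule from the first bullet can be sketched as a plain substring
search over the text of the stack file. This is a minimal illustration, not
the daemon's code: `stack_has_symbol` is a name invented here, and the
128-byte needle buffer is an arbitrary assumption. The leading space in the
needle is what stops a listed symbol from matching the tail of a longer
function name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Returns true if `stack` (the contents of a /proc/<pid>/stack file)
 * contains " symbol+0x" or " symbol.cfi+0x".
 */
bool stack_has_symbol(const char* stack, const char* symbol) {
    assert(stack != NULL && symbol != NULL);
    static const char* const suffixes[] = {"+0x", ".cfi+0x"};
    char needle[128];

    for (size_t i = 0; i < sizeof(suffixes) / sizeof(suffixes[0]); ++i) {
        int n = snprintf(needle, sizeof(needle), " %s%s", symbol, suffixes[i]);
        if (n < 0 || (size_t)n >= sizeof(needle)) return false; /* too long */
        if (strstr(stack, needle) != NULL) return true;
    }
    return false;
}
```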

By default llkd does not monitor init, or [kthreadd] and all that [kthreadd]
spawns. This reduces the effectiveness of llkd by limiting its coverage. If
there is value in covering [kthreadd] spawned threads, the requirement is that
the drivers not remain in a persistent 'D' state, or that they have mechanisms
to recover the thread should it be killed externally (this is good driver
coding hygiene, and a common request to add such to publicly reviewed
kernel.org maintained drivers). For instance, use wait_event_interruptible()
instead of wait_event(). The blacklists can be adjusted accordingly if these
conditions are met to cover kernel components. For the stack symbol checking,
there is an additional process blacklist so that we do not incite sepolicy
violations on services that block ptrace operations.

An accompanying gTest set has been added. It sets up a persistent D or Z
process, with and without forward progress, but not in a live-lock state
because that would require a buggy kernel, or a module or kernel modification
to stimulate. The tests check that llkd mitigates first by killing
the appropriate process. D state is set up by vfork() waiting for exec() in
the child process. Z state is set up by fork() and an un-waited-for child
process. It should be noted that neither of these conditions should ever
happen on Android on purpose, and llkd effectively sweeps up processes that
create these conditions. If the test can, it will reconfigure llkd to expedite
the test duration by adjusting the ro.llk.* Android properties. The tests run
the D state with some scheduling progress to ensure that ABA checking prevents
false triggers. If ABA detection is 100% reliable on the platform, then
ro.llk.killtest can be set to false; however, this will cause some of the unit
tests to panic the kernel instead of dealing with the more graceful kill
operation.
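The Z state setup mentioned above (fork() followed by a child that is never
waited for) can be sketched as below. This is not taken from the gTest
sources; `spawn_zombie` and `reap_zombie` are names invented for the example,
and the reaping helper exists only so the sketch cleans up after itself,
whereas the test scenario deliberately leaves the child un-reaped.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Forks a child that exits at once. Because the parent does not wait for it,
 * the child stays in Z state until it is reaped or the parent goes away.
 * Returns the child's pid, or -1 if fork() failed.
 */
pid_t spawn_zombie(void) {
    pid_t pid = fork();
    if (pid == 0) _exit(0); /* child: terminate immediately */
    return pid;             /* parent: deliberately no wait() here */
}

/*
 * Reaps a child created by spawn_zombie(). Returns its exit status, or -1 if
 * it could not be reaped or did not exit normally.
 */
int reap_zombie(pid_t pid) {
    assert(pid > 0);
    int status = 0;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```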

Android Properties
------------------

The following are the Android Properties that llkd responds to.
*prop*_ms named properties are in milliseconds.
Properties that use a comma (*,*) separator for lists use a leading separator
to preserve the default, and add or subtract entries with (*optional*) plus
(*+*) and minus (*-*) prefixes respectively.
For these lists, the string "*false*" is synonymous with an *empty* list,
and *blank* or *missing* resorts to the specified *default* value.
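The list rules above can be read as a small resolution procedure. The sketch
below is one interpretation of those rules, not llkd's parser: the names
`name_list`, `list_contains` and `resolve_list`, and the fixed capacities,
are assumptions made for illustration. It treats a value without a leading
separator as replacing the default outright.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define LIST_MAX 32      /* arbitrary capacity for the sketch */
#define NAME_MAX_LEN 64  /* arbitrary longest entry, including the NUL */

struct name_list {
    char names[LIST_MAX][NAME_MAX_LEN];
    size_t count;
};

bool list_contains(const struct name_list* l, const char* name) {
    for (size_t i = 0; i < l->count; ++i)
        if (strcmp(l->names[i], name) == 0) return true;
    return false;
}

static void list_add(struct name_list* l, const char* name) {
    if (name[0] == '\0' || l->count == LIST_MAX || list_contains(l, name)) return;
    strncpy(l->names[l->count], name, NAME_MAX_LEN - 1);
    l->names[l->count][NAME_MAX_LEN - 1] = '\0';
    ++l->count;
}

static void list_remove(struct name_list* l, const char* name) {
    for (size_t i = 0; i < l->count; ++i) {
        if (strcmp(l->names[i], name) != 0) continue;
        size_t last = l->count - 1;
        /* Fill the hole with the last entry; order is not significant. */
        if (i != last) memcpy(l->names[i], l->names[last], NAME_MAX_LEN);
        --l->count;
        return;
    }
}

/* Applies each comma separated entry: "-x" removes x, "+x" or "x" adds x. */
static void list_apply(struct name_list* l, const char* csv) {
    while (*csv != '\0') {
        size_t len = strcspn(csv, ",");
        char item[NAME_MAX_LEN];
        if (len > 0 && len < sizeof(item)) {
            memcpy(item, csv, len);
            item[len] = '\0';
            if (item[0] == '-') list_remove(l, item + 1);
            else list_add(l, item[0] == '+' ? item + 1 : item);
        }
        csv += len;
        if (*csv == ',') ++csv;
    }
}

/* Resolves a property `value` (NULL if missing) against `defaults`. */
void resolve_list(struct name_list* out, const char* value, const char* defaults) {
    assert(out != NULL && defaults != NULL);
    out->count = 0;
    if (value == NULL || value[0] == '\0') { /* blank or missing: default */
        list_apply(out, defaults);
        return;
    }
    if (strcmp(value, "false") == 0) return;        /* "false": empty list */
    if (value[0] == ',') list_apply(out, defaults); /* leading ',' keeps default */
    list_apply(out, value);
}
```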

#### ro.config.low_ram
device is configured with limited memory.

#### ro.debuggable
device is configured for userdebug or eng build.

#### ro.llk.sysrq_t
default not ro.config.low_ram, or ro.debuggable if property is "eng".
if true do sysrq t (dump all threads).

#### ro.llk.enable
default false, allow live-lock daemon to be enabled.

#### llk.enable
default ro.llk.enable, and evaluated for eng.

#### ro.khungtask.enable
default false, allow [khungtask] daemon to be enabled.

#### khungtask.enable
default ro.khungtask.enable and evaluated for eng.

#### ro.llk.mlockall
default false, enable call to mlockall().

#### ro.khungtask.timeout
default value 12 minutes, [khungtask] maximum time limit.

#### ro.llk.timeout_ms
default 10 minutes, D or Z maximum time limit. Double this value sets
the alarm watchdog for llkd.

#### ro.llk.D.timeout_ms
default ro.llk.timeout_ms, D maximum time limit.

#### ro.llk.Z.timeout_ms
default ro.llk.timeout_ms, Z maximum time limit.

#### ro.llk.stack.timeout_ms
default ro.llk.timeout_ms,
maximum time limit when checking for persistent stack symbols.
Only active on userdebug or eng builds.

#### ro.llk.check_ms
default 2 minutes, sampling interval of threads for D or Z.

#### ro.llk.stack
default cma_alloc,__get_user_pages,bit_wait_io,wait_on_page_bit_killable
comma separated list of kernel symbols.
Look for kernel stack symbols that, if ever persistently present, can
indicate a subsystem is locked up.
Beware, the check on purpose does not do forward scheduling ABA detection
except by polling every ro.llk.check_ms over the period
ro.llk.stack.timeout_ms, so the stack symbol should be exceptionally rare and
fleeting.
One must be convinced that it is virtually *impossible* for the symbol to show
up persistently in all samples of the stack.
Again, looks for a match for either " **symbol**+0x" or " **symbol**.cfi+0x"
in stack expansion.
Only available on userdebug or eng builds; limited privileges due to security
concerns on user builds prevent this checking.

#### ro.llk.blacklist.process
default 0,1,2 (kernel, init and [kthreadd]) plus process names
init,[kthreadd],[khungtaskd],lmkd,llkd,watchdogd,
[watchdogd],[watchdogd/0],...,[watchdogd/***get_nprocs**-1*].
Do not watch these processes. A process can be comm, cmdline or pid reference.
NB: automated default here can be larger than the current maximum property
size of 92.
NB: false is a very very very unlikely process to want to blacklist.

#### ro.llk.blacklist.parent
default 0,2,adbd&[setsid] (kernel, [kthreadd] and adbd *only for zombie setsid*).
Do not watch processes that have this parent.
An ampersand (*&*) separator is used to specify that the parent is ignored
only in combination with the target child process.
Ampersand was selected because it is never part of a process name;
however, a setprop in the shell requires it to be escaped or quoted.
The init rc file where this is normally specified does not have this issue.
A parent or target process can be specified as comm, cmdline or pid reference.
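The ampersand rule can be sketched as a match on one list entry. This is an
illustration only: `parent_entry_matches` is a name invented here, and it
compares by exact string, whereas the property accepts comm, cmdline or pid
references. An entry without an ampersand matches on the parent alone; an
entry with one also requires the child to match.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Returns true if `entry` (one item of the parent list, such as "2" or
 * "adbd&[setsid]") says that a process `child` whose parent is `parent`
 * should not be watched.
 */
bool parent_entry_matches(const char* entry, const char* parent, const char* child) {
    assert(entry != NULL && parent != NULL && child != NULL);
    const char* amp = strchr(entry, '&');
    size_t parent_len = amp != NULL ? (size_t)(amp - entry) : strlen(entry);

    /* The part before '&' (or the whole entry) must equal the parent. */
    if (strlen(parent) != parent_len || strncmp(entry, parent, parent_len) != 0)
        return false;
    if (amp == NULL) return true;       /* parent-only entry */
    return strcmp(amp + 1, child) == 0; /* parent&child entry */
}
```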

#### ro.llk.blacklist.uid
default *empty* or false, comma separated list of uid numbers or names.
Do not watch processes that match this uid.

#### ro.llk.blacklist.process.stack
default process names init,lmkd.llkd,llkd,keystore,ueventd,apexd,logd.
This subset of processes is not monitored for live-lock stack signatures.
Also prevents the sepolicy violation associated with processes that block
ptrace, as these cannot be checked anyway.
Only active on userdebug and eng builds.

Architectural Concerns
----------------------

- The built-in [khungtask] daemon is too generic and trips on driver code that
  sits around in D state too much. Switching to S instead makes the task(s)
  killable, so the drivers should be able to resurrect them if needed.
- Properties are limited to 92 characters.
- Create a kernel module and associated gTest to actually test panic.
- Create a gTest to test out the blacklist (ro.llk.blacklist.*properties*
  generally should not be inputs). Could require more test-only interfaces to
  libllkd.
- Speed up gTest using something other than ro.llk.*properties*, which should
  not be inputs as they should be baked into the product.