1page.title=Avoiding Priority Inversion 2@jd:body 3 4<!-- 5 Copyright 2013 The Android Open Source Project 6 7 Licensed under the Apache License, Version 2.0 (the "License"); 8 you may not use this file except in compliance with the License. 9 You may obtain a copy of the License at 10 11 http://www.apache.org/licenses/LICENSE-2.0 12 13 Unless required by applicable law or agreed to in writing, software 14 distributed under the License is distributed on an "AS IS" BASIS, 15 WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 16 See the License for the specific language governing permissions and 17 limitations under the License. 18--> 19<div id="qv-wrapper"> 20 <div id="qv"> 21 <h2>In this document</h2> 22 <ol id="auto-toc"> 23 </ol> 24 </div> 25</div> 26 27<p> 28This article explains how the Android's audio system attempts to avoid 29priority inversion, 30and highlights techniques that you can use too. 31</p> 32 33<p> 34These techniques may be useful to developers of high-performance 35audio apps, OEMs, and SoC providers who are implementing an audio 36HAL. Please note implementing these techniques is not 37guaranteed to prevent glitches or other failures, particularly if 38used outside of the audio context. 39Your results may vary, and you should conduct your own 40evaluation and testing. 41</p> 42 43<h2 id="background">Background</h2> 44 45<p> 46The Android AudioFlinger audio server and AudioTrack/AudioRecord 47client implementation are being re-architected to reduce latency. 48This work started in Android 4.1, and continued with further improvements 49in 4.2, 4.3, 4.4, and 5.0. 50</p> 51 52<p> 53To achieve this lower latency, many changes were needed throughout the system. One 54important change is to assign CPU resources to time-critical 55threads with a more predictable scheduling policy. Reliable scheduling 56allows the audio buffer sizes and counts to be reduced while still 57avoiding underruns and overruns. 58</p> 59 60<h2 id="priorityInversion">Priority inversion</h2> 61 62<p> 63<a href="http://en.wikipedia.org/wiki/Priority_inversion">Priority inversion</a> 64is a classic failure mode of real-time systems, 65where a higher-priority task is blocked for an unbounded time waiting 66for a lower-priority task to release a resource such as (shared 67state protected by) a 68<a href="http://en.wikipedia.org/wiki/Mutual_exclusion">mutex</a>. 69</p> 70 71<p> 72In an audio system, priority inversion typically manifests as a 73<a href="http://en.wikipedia.org/wiki/Glitch">glitch</a> 74(click, pop, dropout), 75<a href="http://en.wikipedia.org/wiki/Max_Headroom_(character)">repeated audio</a> 76when circular buffers 77are used, or delay in responding to a command. 78</p> 79 80<p> 81A common workaround for priority inversion is to increase audio buffer sizes. 82However, this method increases latency and merely hides the problem 83instead of solving it. It is better to understand and prevent priority 84inversion, as seen below. 85</p> 86 87<p> 88In the Android audio implementation, priority inversion is most 89likely to occur in these places. And so you should focus your attention here: 90</p> 91 92<ul> 93 94<li> 95between normal mixer thread and fast mixer thread in AudioFlinger 96</li> 97 98<li> 99between application callback thread for a fast AudioTrack and 100fast mixer thread (they both have elevated priority, but slightly 101different priorities) 102</li> 103 104<li> 105between application callback thread for a fast AudioRecord and 106fast capture thread (similar to previous) 107</li> 108 109<li> 110within the audio Hardware Abstraction Layer (HAL) implementation, e.g. for telephony or echo cancellation 111</li> 112 113<li> 114within the audio driver in kernel 115</li> 116 117<li> 118between AudioTrack or AudioRecord callback thread and other app threads (this is out of our control) 119</li> 120 121</ul> 122 123<h2 id="commonSolutions">Common solutions</h2> 124 125<p> 126The typical solutions include: 127</p> 128 129<ul> 130 131<li> 132disabling interrupts 133</li> 134 135<li> 136priority inheritance mutexes 137</li> 138 139</ul> 140 141<p> 142Disabling interrupts is not feasible in Linux user space, and does 143not work for Symmetric Multi-Processors (SMP). 144</p> 145 146 147<p> 148Priority inheritance 149<a href="http://en.wikipedia.org/wiki/Futex">futexes</a> 150(fast user-space mutexes) are available 151in Linux kernel, but are not currently exposed by the Android C 152runtime library 153<a href="http://en.wikipedia.org/wiki/Bionic_(software)">Bionic</a>. 154They are not used in the audio system because they are relatively heavyweight, 155and because they rely on a trusted client. 156</p> 157 158<h2 id="androidTechniques">Techniques used by Android</h2> 159 160<p> 161Experiments started with "try lock" and lock with timeout. These are 162non-blocking and bounded blocking variants of the mutex lock 163operation. Try lock and lock with timeout worked fairly well but were 164susceptible to a couple of obscure failure modes: the 165server was not guaranteed to be able to access the shared state if 166the client happened to be busy, and the cumulative timeout could 167be too long if there was a long sequence of unrelated locks that 168all timed out. 169</p> 170 171 172<p> 173We also use 174<a href="http://en.wikipedia.org/wiki/Linearizability">atomic operations</a> 175such as: 176</p> 177 178<ul> 179<li>increment</li> 180<li>bitwise "or"</li> 181<li>bitwise "and"</li> 182</ul> 183 184<p> 185All of these return the previous value and include the necessary 186SMP barriers. The disadvantage is they can require unbounded retries. 187In practice, we've found that the retries are not a problem. 188</p> 189 190<p class="note"><strong>Note:</strong> Atomic operations and their interactions with memory barriers 191are notoriously badly misunderstood and used incorrectly. We include these methods 192here for completeness but recommend you also read the article 193<a href="https://developer.android.com/training/articles/smp.html"> 194SMP Primer for Android</a> 195for further information. 196</p> 197 198<p> 199We still have and use most of the above tools, and have recently 200added these techniques: 201</p> 202 203<ul> 204 205<li> 206Use non-blocking single-reader single-writer 207<a href="http://en.wikipedia.org/wiki/Circular_buffer">FIFO queues</a> 208for data. 209</li> 210 211<li> 212Try to 213<i>copy</i> 214state rather than 215<i>share</i> 216state between high- and 217low-priority modules. 218</li> 219 220<li> 221When state does need to be shared, limit the state to the 222maximum-size 223<a href="http://en.wikipedia.org/wiki/Word_(computer_architecture)">word</a> 224that can be accessed atomically in one-bus operation 225without retries. 226</li> 227 228<li> 229For complex multi-word state, use a state queue. A state queue 230is basically just a non-blocking single-reader single-writer FIFO 231queue used for state rather than data, except the writer collapses 232adjacent pushes into a single push. 233</li> 234 235<li> 236Pay attention to 237<a href="http://en.wikipedia.org/wiki/Memory_barrier">memory barriers</a> 238for SMP correctness. 239</li> 240 241<li> 242<a href="http://en.wikipedia.org/wiki/Trust,_but_verify">Trust, but verify</a>. 243When sharing 244<i>state</i> 245between processes, don't 246assume that the state is well-formed. For example, check that indices 247are within bounds. This verification isn't needed between threads 248in the same process, between mutual trusting processes (which 249typically have the same UID). It's also unnecessary for shared 250<i>data</i> 251such as PCM audio where a corruption is inconsequential. 252</li> 253 254</ul> 255 256<h2 id="nonBlockingAlgorithms">Non-blocking algorithms</h2> 257 258<p> 259<a href="http://en.wikipedia.org/wiki/Non-blocking_algorithm">Non-blocking algorithms</a> 260have been a subject of much recent study. 261But with the exception of single-reader single-writer FIFO queues, 262we've found them to be complex and error-prone. 263</p> 264 265<p> 266Starting in Android 4.2, you can find our non-blocking, 267single-reader/writer classes in these locations: 268</p> 269 270<ul> 271 272<li> 273frameworks/av/include/media/nbaio/ 274</li> 275 276<li> 277frameworks/av/media/libnbaio/ 278</li> 279 280<li> 281frameworks/av/services/audioflinger/StateQueue* 282</li> 283 284</ul> 285 286<p> 287These were designed specifically for AudioFlinger and are not 288general-purpose. Non-blocking algorithms are notorious for being 289difficult to debug. You can look at this code as a model. But be 290aware there may be bugs, and the classes are not guaranteed to be 291suitable for other purposes. 292</p> 293 294<p> 295For developers, some of the sample OpenSL ES application code should be updated to 296use non-blocking algorithms or reference a non-Android open source library. 297</p> 298 299<p> 300We have published an example non-blocking FIFO implementation that is specifically designed for 301application code. See these files located in the platform source directory 302<code>frameworks/av/audio_utils</code>: 303</p> 304<ul> 305 <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/include/audio_utils/fifo.h">include/audio_utils/fifo.h</a></li> 306 <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/fifo.c">fifo.c</a></li> 307 <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/include/audio_utils/roundup.h">include/audio_utils/roundup.h</a></li> 308 <li><a href="https://android.googlesource.com/platform/system/media/+/master/audio_utils/roundup.c">roundup.c</a></li> 309</ul> 310 311<h2 id="tools">Tools</h2> 312 313<p> 314To the best of our knowledge, there are no automatic tools for 315finding priority inversion, especially before it happens. Some 316research static code analysis tools are capable of finding priority 317inversions if able to access the entire codebase. Of course, if 318arbitrary user code is involved (as it is here for the application) 319or is a large codebase (as for the Linux kernel and device drivers), 320static analysis may be impractical. The most important thing is to 321read the code very carefully and get a good grasp on the entire 322system and the interactions. Tools such as 323<a href="http://developer.android.com/tools/help/systrace.html">systrace</a> 324and 325<code>ps -t -p</code> 326are useful for seeing priority inversion after it occurs, but do 327not tell you in advance. 328</p> 329 330<h2 id="aFinalWord">A final word</h2> 331 332<p> 333After all of this discussion, don't be afraid of mutexes. Mutexes 334are your friend for ordinary use, when used and implemented correctly 335in ordinary non-time-critical use cases. But between high- and 336low-priority tasks and in time-sensitive systems mutexes are more 337likely to cause trouble. 338</p> 339