1Generic Mutex Subsystem 2 3started by Ingo Molnar <mingo@redhat.com> 4 5 "Why on earth do we need a new mutex subsystem, and what's wrong 6 with semaphores?" 7 8firstly, there's nothing wrong with semaphores. But if the simpler 9mutex semantics are sufficient for your code, then there are a couple 10of advantages of mutexes: 11 12 - 'struct mutex' is smaller on most architectures: E.g. on x86, 13 'struct semaphore' is 20 bytes, 'struct mutex' is 16 bytes. 14 A smaller structure size means less RAM footprint, and better 15 CPU-cache utilization. 16 17 - tighter code. On x86 i get the following .text sizes when 18 switching all mutex-alike semaphores in the kernel to the mutex 19 subsystem: 20 21 text data bss dec hex filename 22 3280380 868188 396860 4545428 455b94 vmlinux-semaphore 23 3255329 865296 396732 4517357 44eded vmlinux-mutex 24 25 that's 25051 bytes of code saved, or a 0.76% win - off the hottest 26 codepaths of the kernel. (The .data savings are 2892 bytes, or 0.33%) 27 Smaller code means better icache footprint, which is one of the 28 major optimization goals in the Linux kernel currently. 29 30 - the mutex subsystem is slightly faster and has better scalability for 31 contended workloads. On an 8-way x86 system, running a mutex-based 32 kernel and testing creat+unlink+close (of separate, per-task files) 33 in /tmp with 16 parallel tasks, the average number of ops/sec is: 34 35 Semaphores: Mutexes: 36 37 $ ./test-mutex V 16 10 $ ./test-mutex V 16 10 38 8 CPUs, running 16 tasks. 8 CPUs, running 16 tasks. 39 checking VFS performance. checking VFS performance. 40 avg loops/sec: 34713 avg loops/sec: 84153 41 CPU utilization: 63% CPU utilization: 22% 42 43 i.e. in this workload, the mutex based kernel was 2.4 times faster 44 than the semaphore based kernel, _and_ it also had 2.8 times less CPU 45 utilization. (In terms of 'ops per CPU cycle', the semaphore kernel 46 performed 551 ops/sec per 1% of CPU time used, while the mutex kernel 47 performed 3825 ops/sec per 1% of CPU time used - it was 6.9 times 48 more efficient.) 49 50 the scalability difference is visible even on a 2-way P4 HT box: 51 52 Semaphores: Mutexes: 53 54 $ ./test-mutex V 16 10 $ ./test-mutex V 16 10 55 4 CPUs, running 16 tasks. 8 CPUs, running 16 tasks. 56 checking VFS performance. checking VFS performance. 57 avg loops/sec: 127659 avg loops/sec: 181082 58 CPU utilization: 100% CPU utilization: 34% 59 60 (the straight performance advantage of mutexes is 41%, the per-cycle 61 efficiency of mutexes is 4.1 times better.) 62 63 - there are no fastpath tradeoffs, the mutex fastpath is just as tight 64 as the semaphore fastpath. On x86, the locking fastpath is 2 65 instructions: 66 67 c0377ccb <mutex_lock>: 68 c0377ccb: f0 ff 08 lock decl (%eax) 69 c0377cce: 78 0e js c0377cde <.text..lock.mutex> 70 c0377cd0: c3 ret 71 72 the unlocking fastpath is equally tight: 73 74 c0377cd1 <mutex_unlock>: 75 c0377cd1: f0 ff 00 lock incl (%eax) 76 c0377cd4: 7e 0f jle c0377ce5 <.text..lock.mutex+0x7> 77 c0377cd6: c3 ret 78 79 - 'struct mutex' semantics are well-defined and are enforced if 80 CONFIG_DEBUG_MUTEXES is turned on. Semaphores on the other hand have 81 virtually no debugging code or instrumentation. The mutex subsystem 82 checks and enforces the following rules: 83 84 * - only one task can hold the mutex at a time 85 * - only the owner can unlock the mutex 86 * - multiple unlocks are not permitted 87 * - recursive locking is not permitted 88 * - a mutex object must be initialized via the API 89 * - a mutex object must not be initialized via memset or copying 90 * - task may not exit with mutex held 91 * - memory areas where held locks reside must not be freed 92 * - held mutexes must not be reinitialized 93 * - mutexes may not be used in hardware or software interrupt 94 * contexts such as tasklets and timers 95 96 furthermore, there are also convenience features in the debugging 97 code: 98 99 * - uses symbolic names of mutexes, whenever they are printed in debug output 100 * - point-of-acquire tracking, symbolic lookup of function names 101 * - list of all locks held in the system, printout of them 102 * - owner tracking 103 * - detects self-recursing locks and prints out all relevant info 104 * - detects multi-task circular deadlocks and prints out all affected 105 * locks and tasks (and only those tasks) 106 107Disadvantages 108------------- 109 110The stricter mutex API means you cannot use mutexes the same way you 111can use semaphores: e.g. they cannot be used from an interrupt context, 112nor can they be unlocked from a different context that which acquired 113it. [ I'm not aware of any other (e.g. performance) disadvantages from 114using mutexes at the moment, please let me know if you find any. ] 115 116Implementation of mutexes 117------------------------- 118 119'struct mutex' is the new mutex type, defined in include/linux/mutex.h 120and implemented in kernel/mutex.c. It is a counter-based mutex with a 121spinlock and a wait-list. The counter has 3 states: 1 for "unlocked", 1220 for "locked" and negative numbers (usually -1) for "locked, potential 123waiters queued". 124 125the APIs of 'struct mutex' have been streamlined: 126 127 DEFINE_MUTEX(name); 128 129 mutex_init(mutex); 130 131 void mutex_lock(struct mutex *lock); 132 int mutex_lock_interruptible(struct mutex *lock); 133 int mutex_trylock(struct mutex *lock); 134 void mutex_unlock(struct mutex *lock); 135 int mutex_is_locked(struct mutex *lock); 136 void mutex_lock_nested(struct mutex *lock, unsigned int subclass); 137 int mutex_lock_interruptible_nested(struct mutex *lock, 138 unsigned int subclass); 139 int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock); 140