          Proper Locking Under a Preemptible Kernel:
               Keeping Kernel Code Preempt-Safe
                 Robert Love <rml@tech9.net>
                  Last Updated: 28 Aug 2002


INTRODUCTION


A preemptible kernel creates new locking issues.  The issues are the same as
those under SMP: concurrency and reentrancy.  Thankfully, the Linux preemptible
kernel model leverages existing SMP locking mechanisms.  Thus, the kernel
requires explicit additional locking for very few additional situations.

This document is for all kernel hackers.  Developing code in the kernel
requires protecting these situations.


RULE #1: Per-CPU data structures need explicit protection


Two similar problems arise.  An example code snippet:

	struct this_needs_locking tux[NR_CPUS];
	tux[smp_processor_id()] = some_value;
	/* task is preempted here... */
	something = tux[smp_processor_id()];

First, since the data is per-CPU, it may not have explicit SMP locking, but
it still requires protection against preemption.  Second, when a preempted
task is finally rescheduled, it may be running on a different processor, so
the previous value of smp_processor_id() may not equal the current one.  You
must protect these situations by disabling preemption around them.

You can also use get_cpu() and put_cpu().  get_cpu() disables preemption and
returns the current processor number; put_cpu() re-enables preemption.


RULE #2: CPU state must be protected


Under preemption, the state of the CPU must be protected.  This is
arch-dependent, but includes CPU structures and state not preserved over a
context switch.  For example, on x86, entering and exiting FPU mode is now a
critical section that must occur while preemption is disabled.  Think what
would happen if the kernel is executing a floating-point instruction and is
then preempted.  Remember, the kernel does not save FPU state except for user
tasks.  Therefore, upon preemption, the FPU registers will be sold to the
lowest bidder.  Thus, preemption must be disabled around such regions.

Note, some FPU functions are already explicitly preempt safe.
For example, kernel_fpu_begin and kernel_fpu_end will disable and enable
preemption.  However, math_state_restore must be called with preemption
disabled.


RULE #3: Lock acquire and release must be performed by same task


A lock acquired in one task must be released by the same task.  This
means you can't do oddball things like acquire a lock and go off to
play while another task releases it.  If you want to do something
like this, acquire and release the lock in the same code path and
have the caller wait on an event by the other task.


SOLUTION


Data protection under preemption is achieved by disabling preemption for the
duration of the critical region.

preempt_enable()		decrement the preempt counter
preempt_disable()		increment the preempt counter
preempt_enable_no_resched()	decrement, but do not immediately preempt
preempt_check_resched()		if needed, reschedule
preempt_count()			return the preempt counter

The functions are nestable.  In other words, you can call preempt_disable
n times in a code path, and preemption will not be reenabled until the n-th
call to preempt_enable.  The preempt statements are defined to nothing if
preemption is not enabled.

Note that you do not need to explicitly prevent preemption if you are holding
any locks or interrupts are disabled, since preemption is implicitly disabled
in those cases.

But keep in mind that 'irqs disabled' is a fundamentally unsafe way of
disabling preemption - any spin_unlock() decreasing the preemption count
to 0 might trigger a reschedule.  A simple printk() might trigger a reschedule.
So use this implicit preemption-disabling property only if you know that the
affected codepath does not do any of this.  Best policy is to use this only for
small, atomic code that you wrote and which calls no complex functions.
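
The nesting rule described above can be illustrated with a small user-space
toy model.  This is only a sketch and not kernel code: every model_* name is
hypothetical, and the model mimics nothing but the counting rules from the
table above (increment, decrement, and reschedule only once the counter
reaches zero).

```c
/*
 * User-space toy model of the nestable preempt counter.
 * NOT kernel code: all model_* names are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>

static int model_count;		/* stands in for preempt_count() */
static bool model_need_resched;	/* a reschedule has been requested */
static int model_reschedules;	/* how often we "rescheduled" */

static void model_preempt_disable(void)
{
	model_count++;
}

static void model_preempt_enable_no_resched(void)
{
	model_count--;
}

/* Reschedule only if one is pending and the counter is back at zero. */
static void model_preempt_check_resched(void)
{
	if (model_count == 0 && model_need_resched) {
		model_need_resched = false;
		model_reschedules++;
	}
}

static void model_preempt_enable(void)
{
	model_preempt_enable_no_resched();
	model_preempt_check_resched();
}

/*
 * Nest `depth` (>= 1) disable calls, request a reschedule, then unwind.
 * Returns how many reschedules happened BEFORE the outermost enable.
 */
static int model_nested_section(int depth)
{
	int i, before_last;

	model_count = 0;
	model_reschedules = 0;
	for (i = 0; i < depth; i++)
		model_preempt_disable();
	model_need_resched = true;
	for (i = 0; i < depth - 1; i++)
		model_preempt_enable();
	before_last = model_reschedules;
	model_preempt_enable();		/* the n-th, outermost enable */
	return before_last;
}
```

In this model, calling model_nested_section(3) leaves the pending reschedule
untouched through the two inner enables and acts on it only at the third,
which is the behaviour the nesting rule describes.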

Example:

	cpucache_t *cc; /* this is per-CPU */
	preempt_disable();
	cc = cc_data(searchp);
	if (cc && cc->avail) {
		__free_block(searchp, cc_entry(cc), cc->avail);
		cc->avail = 0;
	}
	preempt_enable();
	return 0;

Notice how the preemption statements must encompass every reference of the
critical variables.  Another example:

	int buf[NR_CPUS];
	set_cpu_val(buf);
	if (buf[smp_processor_id()] == -1) printk(KERN_INFO "wee!\n");
	spin_lock(&buf_lock);
	/* ... */

This code is not preempt-safe, but see how easily we can fix it by simply
moving the spin_lock up two lines.


PREVENTING PREEMPTION USING INTERRUPT DISABLING


It is possible to prevent a preemption event using local_irq_disable and
local_irq_save.  Note, when doing so, you must be very careful to not cause
an event that would set need_resched and result in a preemption check.  When
in doubt, rely on locking or explicit preemption disabling.

Note in 2.5 interrupt disabling is now only per-CPU (i.e. local).

An additional concern is proper usage of local_irq_disable and local_irq_save.
These may be used to protect from preemption; however, on exit, if preemption
may be enabled, a test to see if preemption is required should be done.  If
these are called from the spin_lock and read/write lock macros, the right thing
is done.  They may also be called within a spin-lock protected region; however,
if they are ever called outside of this context, a test for preemption should
be made.  Do note that calls from interrupt context or bottom halves/tasklets
are also protected by preemption locks and so may use the versions which do
not check preemption.
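
The "test for preemption on exit" mentioned above can be sketched with another
user-space toy model.  Again, this is not kernel code and every toy_* name is
hypothetical.  The model assumes only what the text states: a reschedule
request can arrive while interrupts are off, and re-enabling interrupts by
itself does not act on that request.

```c
/*
 * User-space toy model of leaving an irq-disabled section.
 * NOT kernel code: all toy_* names are hypothetical.
 */
#include <assert.h>
#include <stdbool.h>

static bool toy_irqs_off;	/* local interrupts disabled? */
static bool toy_need_resched;	/* a reschedule has been requested */
static int toy_preempt_count;	/* explicit preempt-disable depth */
static int toy_reschedules;	/* how often we "rescheduled" */

static void toy_irq_disable(void)
{
	toy_irqs_off = true;
}

/* Re-enables interrupts but performs no preemption check. */
static void toy_irq_enable(void)
{
	toy_irqs_off = false;
}

/* An event, such as a wakeup, that marks a reschedule as pending. */
static void toy_set_need_resched(void)
{
	toy_need_resched = true;
}

/* The explicit test: reschedule only if nothing forbids it. */
static void toy_check_resched(void)
{
	if (toy_need_resched && !toy_irqs_off && toy_preempt_count == 0) {
		toy_need_resched = false;
		toy_reschedules++;
	}
}

/*
 * Run an irq-disabled section during which a reschedule is requested.
 * Returns the number of reschedules performed on the way out.
 */
static int toy_irq_section(bool check_on_exit)
{
	toy_reschedules = 0;
	toy_need_resched = false;
	toy_preempt_count = 0;

	toy_irq_disable();
	toy_set_need_resched();	/* request arrives while irqs are off */
	toy_irq_enable();
	if (check_on_exit)
		toy_check_resched();
	return toy_reschedules;
}
```

In this model, toy_irq_section(false) leaves the request pending until some
later code happens to check for it, while toy_irq_section(true) acts on it
right away.  That difference is why the text asks for an explicit test when
these calls are used outside the lock macros.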