
Lines Matching +full:we +full:- +full:on +full:- +full:ns

17 Linux Kernel, everyone hacking on the kernel needs to know the fundamentals of concurrency and locking for SMP.

74 Race Conditions and Critical Regions
75 ------------------------------------
77 This overlap, where the result depends on the relative timing of multiple tasks, is called a race condition.
80 Linux started running on SMP machines, they became one of the major issues in kernel design and implementation.
84 preempting one task during the critical region, we have exactly the same race condition.
97 If I could give you one piece of advice on locking: **keep it simple**.

101 Two Main Types of Kernel Locks: Spinlocks and Mutexes
102 -----------------------------------------------------
106 single-holder lock: if you can't get the spinlock, you keep trying (spinning) until you can.
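
As a minimal sketch of that API (the counter and its lock here are
hypothetical, and this form is only safe if the data is never touched from
interrupt or softirq context)::

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(counter_lock);   /* hypothetical lock */
    static unsigned long counter;           /* the data it protects */

    void counter_inc(void)
    {
            spin_lock(&counter_lock);       /* spins until the lock is free */
            counter++;                      /* the critical region */
            spin_unlock(&counter_lock);
    }
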

121 Locks and Uniprocessor Kernels
122 ------------------------------
126 design decision: when no-one else can run at the same time, there is no reason to have a lock.
131 prevent any races. For most purposes, we can think of preemption as equivalent to SMP, and not worry about it separately.
139 between user contexts, as we will see below.

141 Locking Only In User Context
142 ----------------------------
154 nf_register_sockopt(). Registration and de-registration
155 are only done on module load and unload (and boot time, where there is no concurrency).
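
A minimal sketch of this pattern, with a hypothetical registration list that
is only ever touched from process context (so sleeping while holding the
mutex is fine)::

    #include <linux/mutex.h>
    #include <linux/list.h>
    #include <linux/errno.h>

    static LIST_HEAD(reg_list);              /* hypothetical registration list */
    static DEFINE_MUTEX(reg_mutex);          /* only user context touches reg_list */

    int reg_add(struct list_head *new)       /* called from a syscall; may sleep */
    {
            if (mutex_lock_interruptible(&reg_mutex))
                    return -ERESTARTSYS;
            list_add(new, &reg_list);
            mutex_unlock(&reg_mutex);
            return 0;
    }
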

161 Locking Between User Context and Softirqs
162 -----------------------------------------
168 used. It disables softirqs on that CPU, then grabs the lock.
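
A minimal sketch, with a hypothetical queue shared between a syscall path and
a softirq: the user-context side blocks softirqs on this CPU before taking
the lock, while the softirq side can use a plain spin_lock()::

    #include <linux/spinlock.h>

    static DEFINE_SPINLOCK(queue_lock);      /* hypothetical shared lock */

    void user_side(void)                     /* process context (eg. a syscall) */
    {
            spin_lock_bh(&queue_lock);
            /* ... touch the shared queue ... */
            spin_unlock_bh(&queue_lock);
    }

    void softirq_side(void)                  /* runs in softirq context */
    {
            spin_lock(&queue_lock);          /* softirqs don't nest on one CPU */
            /* ... touch the shared queue ... */
            spin_unlock(&queue_lock);
    }
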

183 Locking Between User Context and Tasklets
184 -----------------------------------------

189 Locking Between User Context and Timers
190 ---------------------------------------

196 Locking Between Tasklets/Timers
197 -------------------------------
205 Since a tasklet is never run on two CPUs at once, you don't need to
206 worry about your tasklet being reentrant (running twice at once), even
207 on SMP.
216 on the same CPU.

218 Locking Between Softirqs
219 ------------------------
226 The same softirq can run on the other CPUs: you can use a per-CPU array
227 (see `Per-CPU Data`_) for better performance. If you're going so far as to use a softirq, you probably care about scalable performance enough to justify the extra complexity.
240 could be running on a different CPU.

249 Locking Between Hard IRQ and Softirqs/Tasklets
250 ----------------------------------------------
255 by a hardware interrupt on another CPU. This is where spin_lock_irq() is used.
257 interrupts on that cpu, then grab the lock.
272 variant which saves whether interrupts were on or off in a flags word, which is passed to spin_unlock_irqrestore().
278 Note that softirqs (and hence tasklets and timers) are run on return from hardware interrupts, so spin_lock_irq() also stops these.
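
A minimal sketch, with a hypothetical device lock shared between a hardware
interrupt handler and a softirq (or user context) consumer; the consumer must
be entered with interrupts enabled for spin_lock_irq() to be correct::

    #include <linux/spinlock.h>
    #include <linux/interrupt.h>

    static DEFINE_SPINLOCK(dev_lock);        /* hypothetical shared lock */

    void consumer(void)                      /* softirq or user context */
    {
            spin_lock_irq(&dev_lock);        /* keep the hard irq handler out */
            /* ... drain data queued by the irq handler ... */
            spin_unlock_irq(&dev_lock);
    }

    irqreturn_t dev_irq(int irq, void *data) /* the hard irq handler itself */
    {
            spin_lock(&dev_lock);            /* the consumer can't run here */
            /* ... queue data for the consumer ... */
            spin_unlock(&dev_lock);
            return IRQ_HANDLED;
    }
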

283 Locking Between Two Hard IRQ Handlers
284 -------------------------------------
288 architecture-specific whether all interrupts are disabled inside irq handlers themselves.
296 - If you are in a process context (any syscall) and want to lock other
  process out, use a mutex. You can take a mutex and sleep
  (copy_from_user() or kmalloc(x, GFP_KERNEL)).
300 - Otherwise (== data can be touched in an interrupt), use
  spin_lock_irqsave() and spin_unlock_irqrestore().
304 - Avoid holding spinlock for more than 5 lines of code and across any
  function call (except accessors like readb()).

307 Table of Minimum Requirements
308 -----------------------------
311 various contexts. In some cases, the same context can only be running on
312 one CPU at a time, so no locking is required for that context (eg. a
313 particular thread can only run on one CPU at a time, but if it needs
314 to share data with another thread, locking is required).
360 spin_trylock() does not spin but returns non-zero if it
361 acquires the spinlock on the first try or 0 if not. This function can be used in all contexts like spin_lock(): you must have disabled the contexts that might interrupt you and acquire the spin lock.
366 mutex_trylock() does not suspend your task but returns
367 non-zero if it could lock the mutex on the first try or 0 if not. This function cannot be safely used in hardware or software interrupt contexts despite not sleeping.
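
A minimal sketch of both flavours (the locks and error handling here are
hypothetical)::

    #include <linux/spinlock.h>
    #include <linux/mutex.h>
    #include <linux/errno.h>

    static DEFINE_SPINLOCK(stats_lock);      /* hypothetical */
    static DEFINE_MUTEX(config_mutex);       /* hypothetical */

    void maybe_bump_stats(void)
    {
            if (spin_trylock(&stats_lock)) { /* non-zero: we got it */
                    /* ... update the statistics ... */
                    spin_unlock(&stats_lock);
            }
            /* otherwise just skip this round instead of spinning */
    }

    int maybe_update_config(void)
    {
            if (!mutex_trylock(&config_mutex))
                    return -EBUSY;           /* 0: someone else holds it */
            /* ... update the configuration ... */
            mutex_unlock(&config_mutex);
            return 0;
    }
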

378 All In User Context
379 -------------------
381 For our first example, we assume that all operations are in user context
382 (ie. from system calls), so we can sleep. This means we can use a mutex to protect the cache and all the objects within it. Here's the code:
411 if (i->id == id) {
412 i->popularity++;
422 list_del(&obj->list);
424 cache_num--;
430 list_add(&obj->list, &cache);
434 if (!outcast || i->popularity < outcast->popularity)
446 return -ENOMEM;
448 strscpy(obj->name, name, sizeof(obj->name));
449 obj->id = id;
450 obj->popularity = 0;
468 int ret = -ENOENT;
474 strcpy(name, obj->name);
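
Only scattered lines of that example survive in the listing above; here is a
condensed sketch of the user-context version they belong to (eviction of the
least popular object is elided, and the field sizes are assumptions)::

    #include <linux/list.h>
    #include <linux/slab.h>
    #include <linux/string.h>
    #include <linux/mutex.h>
    #include <linux/errno.h>

    struct object {
            struct list_head list;
            int id;
            char name[32];
            int popularity;
    };

    /* Protects the cache, cache_num, and the contents of every object. */
    static DEFINE_MUTEX(cache_lock);
    static LIST_HEAD(cache);
    static unsigned int cache_num;

    /* Must be holding cache_lock */
    static struct object *__cache_find(int id)
    {
            struct object *i;

            list_for_each_entry(i, &cache, list)
                    if (i->id == id) {
                            i->popularity++;
                            return i;
                    }
            return NULL;
    }

    /* Must be holding cache_lock */
    static void __cache_delete(struct object *obj)
    {
            list_del(&obj->list);
            kfree(obj);
            cache_num--;
    }

    int cache_add(int id, const char *name)
    {
            struct object *obj = kmalloc(sizeof(*obj), GFP_KERNEL);

            if (!obj)
                    return -ENOMEM;

            strscpy(obj->name, name, sizeof(obj->name));
            obj->id = id;
            obj->popularity = 0;

            mutex_lock(&cache_lock);
            /* ... evict the least popular object if the cache is full ... */
            list_add(&obj->list, &cache);
            cache_num++;
            mutex_unlock(&cache_lock);
            return 0;
    }

    int cache_find(int id, char *name)
    {
            struct object *obj;
            int ret = -ENOENT;

            mutex_lock(&cache_lock);
            obj = __cache_find(id);
            if (obj) {
                    ret = 0;
                    strcpy(name, obj->name);
            }
            mutex_unlock(&cache_lock);
            return ret;
    }
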
480 Note that we always make sure we have the cache_lock when we add,
481 delete, or look up the cache: the cache infrastructure itself (and the
482 contents of the objects) are protected by the lock. In this case it's
483 easy, since we copy the data for the user, and never let them access the
484 objects directly.
486 There is a slight (and common) optimization here: in
487 cache_add() we set up the fields of the object before
488 grabbing the lock. This is safe, as no-one else can access it until we
489 put it in cache.

491 Accessing From Interrupt Context
492 --------------------------------
498 The change is shown below, in standard patch format: the ``-`` are lines taken away, and the ``+`` are lines added.
503 --- cache.c.usercontext 2003-12-09 13:58:54.000000000 +1100
504 +++ cache.c.interrupt 2003-12-09 14:07:49.000000000 +1100
505 @@ -12,7 +12,7 @@
509 -static DEFINE_MUTEX(cache_lock);
514 @@ -55,6 +55,7 @@
521 return -ENOMEM;
522 @@ -63,30 +64,33 @@
523 obj->id = id;
524 obj->popularity = 0;
526 - mutex_lock(&cache_lock);
529 - mutex_unlock(&cache_lock);
536 - mutex_lock(&cache_lock);
541 - mutex_unlock(&cache_lock);
548 int ret = -ENOENT;
551 - mutex_lock(&cache_lock);
556 strcpy(name, obj->name);
558 - mutex_unlock(&cache_lock);
563 Note that the spin_lock_irqsave() will turn off
564 interrupts if they are on, otherwise does nothing (if we are already in
565 an interrupt handler), hence these functions are safe to call from any
566 context.
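
Pulling the patch fragments together, the lookup function ends up looking
roughly like this (a condensed sketch, reusing ``struct object`` and
__cache_find() from the earlier sketch)::

    #include <linux/spinlock.h>
    #include <linux/string.h>
    #include <linux/errno.h>

    static DEFINE_SPINLOCK(cache_lock);      /* replaces the mutex */

    int cache_find(int id, char *name)
    {
            struct object *obj;
            unsigned long flags;
            int ret = -ENOENT;

            spin_lock_irqsave(&cache_lock, flags);   /* safe from any context */
            obj = __cache_find(id);
            if (obj) {
                    ret = 0;
                    strcpy(name, obj->name);
            }
            spin_unlock_irqrestore(&cache_lock, flags);
            return ret;
    }
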

574 Exposing Objects Outside This File
575 ----------------------------------
582 The first problem is that we use the ``cache_lock`` to protect objects:
583 we'd need to make this non-static so the rest of the code can use it.
590 worse, add another object, re-using the same address.
592 As there is only one lock, you can't hold it forever: no-one else would get any work done.
602 --- cache.c.interrupt 2003-12-09 14:25:43.000000000 +1100
603 +++ cache.c.refcnt 2003-12-09 14:33:05.000000000 +1100
604 @@ -7,6 +7,7 @@
612 @@ -17,6 +18,35 @@
618 + if (--obj->refcnt == 0)
624 + obj->refcnt++;
648 @@ -35,6 +65,7 @@
651 list_del(&obj->list);
653 cache_num--;
656 @@ -63,6 +94,7 @@
657 strscpy(obj->name, name, sizeof(obj->name));
658 obj->id = id;
659 obj->popularity = 0;
660 + obj->refcnt = 1; /* The cache holds a reference */
664 @@ -79,18 +111,15 @@
668 -int cache_find(int id, char *name)
672 - int ret = -ENOENT;
677 - if (obj) {
678 - ret = 0;
679 - strcpy(name, obj->name);
680 - }
684 - return ret;
688 We encapsulate the reference counting in the standard 'get' and 'put'
689 functions. Now we can return the object itself from cache_find(), which has the advantage that the caller can now sleep holding the object (eg. to copy_to_user() the name to userspace).
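
A condensed sketch of the get/put helpers the patch fragments above add
(``struct object`` gains an ``unsigned int refcnt`` field, set to 1 in
cache_add() for the cache's own reference; the count is still protected by
``cache_lock`` at this stage)::

    static void __object_put(struct object *obj)
    {
            if (--obj->refcnt == 0)
                    kfree(obj);
    }

    static void __object_get(struct object *obj)
    {
            obj->refcnt++;
    }

    void object_put(struct object *obj)
    {
            unsigned long flags;

            spin_lock_irqsave(&cache_lock, flags);
            __object_put(obj);
            spin_unlock_irqrestore(&cache_lock, flags);
    }

    void object_get(struct object *obj)
    {
            unsigned long flags;

            spin_lock_irqsave(&cache_lock, flags);
            __object_get(obj);
            spin_unlock_irqrestore(&cache_lock, flags);
    }
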
706 although for anything non-trivial using spinlocks is clearer. The atomic_inc() and atomic_dec_and_test() are used instead of the standard increment and decrement operators, and the lock is no longer used to protect the reference count itself.
713 --- cache.c.refcnt 2003-12-09 15:00:35.000000000 +1100
714 +++ cache.c.refcnt-atomic 2003-12-11 15:49:42.000000000 +1100
715 @@ -7,7 +7,7 @@
719 - unsigned int refcnt;
724 @@ -18,33 +18,15 @@
728 -static void __object_put(struct object *obj)
729 -{
730 - if (--obj->refcnt == 0)
731 - kfree(obj);
732 -}
733 -
734 -static void __object_get(struct object *obj)
735 -{
736 - obj->refcnt++;
737 -}
738 -
741 - unsigned long flags;
742 -
743 - spin_lock_irqsave(&cache_lock, flags);
744 - __object_put(obj);
745 - spin_unlock_irqrestore(&cache_lock, flags);
746 + if (atomic_dec_and_test(&obj->refcnt))
752 - unsigned long flags;
753 -
754 - spin_lock_irqsave(&cache_lock, flags);
755 - __object_get(obj);
756 - spin_unlock_irqrestore(&cache_lock, flags);
757 + atomic_inc(&obj->refcnt);
761 @@ -65,7 +47,7 @@
764 list_del(&obj->list);
765 - __object_put(obj);
767 cache_num--;
770 @@ -94,7 +76,7 @@
771 strscpy(obj->name, name, sizeof(obj->name));
772 obj->id = id;
773 obj->popularity = 0;
774 - obj->refcnt = 1; /* The cache holds a reference */
775 + atomic_set(&obj->refcnt, 1); /* The cache holds a reference */
779 @@ -119,7 +101,7 @@
783 - __object_get(obj);
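
After this patch the helpers shrink to plain atomic operations; roughly::

    #include <linux/atomic.h>

    /* struct object now has an atomic_t refcnt instead of unsigned int. */

    void object_put(struct object *obj)
    {
            if (atomic_dec_and_test(&obj->refcnt))  /* true when it hits zero */
                    kfree(obj);
    }

    void object_get(struct object *obj)
    {
            atomic_inc(&obj->refcnt);
    }

    /* ... and cache_add() initializes it with:
     *      atomic_set(&obj->refcnt, 1);    The cache holds a reference
     */
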

789 Protecting The Objects Themselves
790 ---------------------------------
792 In these examples, we assumed that the objects (except the reference
793 counts) never changed once they are created. If we wanted to allow the name to change, there are three possibilities:
796 - You can make ``cache_lock`` non-static, and tell people to grab that
  lock before changing the name in any object.
799 - You can provide a cache_obj_rename() which grabs this
  lock and changes the name for the caller, and tell everyone to use that
  function.
803 - You can make the ``cache_lock`` protect only the cache itself, and
  use another lock to protect the name.
806 Theoretically, you can make the locks as fine-grained as one lock for every field, for every object. In practice, the most common variants are:
810 - One lock which protects the infrastructure (the ``cache`` list in
811 this example) and all the objects. This is what we have done so far.
813 - One lock which protects the infrastructure (including the list pointers inside the objects), and one lock inside the object which protects the rest of that object.
817 - Multiple locks to protect the infrastructure (eg. one lock per hash
818 chain), possibly with a separate per-object lock.
820 Here is the "lock-per-object" implementation:
824 --- cache.c.refcnt-atomic 2003-12-11 15:50:54.000000000 +1100
825 +++ cache.c.perobjectlock 2003-12-11 17:15:03.000000000 +1100
826 @@ -6,11 +6,17 @@
841 - int popularity;
845 @@ -77,6 +84,7 @@
846 obj->id = id;
847 obj->popularity = 0;
848 atomic_set(&obj->refcnt, 1); /* The cache holds a reference */
849 + spin_lock_init(&obj->lock);
854 Note that I decide that the popularity count should be protected by the
855 ``cache_lock`` rather than the per-object lock: this is because it (like
856 the ``struct list_head`` inside the object) is logically part of the
857 infrastructure.
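
The resulting object layout is roughly the following (the field order and
name size are assumptions; the comments record which lock protects what, and
cache_add() initializes the new lock with spin_lock_init(&obj->lock), as the
fragment above shows)::

    struct object {
            /* These two protected by cache_lock. */
            struct list_head list;
            int popularity;

            atomic_t refcnt;

            /* Doesn't change once created. */
            int id;

            spinlock_t lock;        /* Protects the name */
            char name[32];
    };
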

874 Deadlock: Simple and Advanced
875 -----------------------------
881 stay-up-five-nights-talk-to-fluffy-code-bunnies kind of problem.
890 with a single CPU (although not on UP compiles, since spinlocks vanish
891 on kernel compiles with ``CONFIG_SMP``\ =n. You'll still get data corruption in the second example).
894 This complete lockup is easy to diagnose: on SMP boxes the watchdog timer or compiling with ``CONFIG_DEBUG_SPINLOCK`` set will show it up immediately when it happens.
899 A more complex problem is the so-called 'deadly embrace', involving two or more locks.
909 lock it twice. Secondly, if the same softirq on another CPU is trying to
913 +-----------------------+-----------------------+
914 | CPU 1                 | CPU 2                 |
915 +-----------------------+-----------------------+
916 | Grab lock A -> OK     | Grab lock B -> OK     |
917 +-----------------------+-----------------------+
918 | Grab lock B -> spin   | Grab lock A -> spin   |
919 +-----------------------+-----------------------+

926 Preventing Deadlock
927 -------------------
936 are never held around calls to non-trivial functions outside the same file.
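
As a minimal sketch of the textbook ordering rule, with two hypothetical
locks that every code path takes in the same order::

    static DEFINE_SPINLOCK(lock_a);          /* hypothetical */
    static DEFINE_SPINLOCK(lock_b);          /* hypothetical */

    void move_item(void)
    {
            /* Always A before B: taking them in the reverse order anywhere
             * else would allow the deadly embrace shown in the table above. */
            spin_lock(&lock_a);
            spin_lock(&lock_b);
            /* ... move data between the two structures ... */
            spin_unlock(&lock_b);
            spin_unlock(&lock_a);
    }
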

954 Racing Timers: A Kernel Pastime
955 -------------------------------
961 If you want to destroy the entire collection (say on module removal), you might do the following:
969 struct foo *next = list->next;
970 timer_delete(&list->timer);
978 Sooner or later, this will crash on SMP, because a timer can have just
979 gone off before the spin_lock_bh(), and it will only get
980 the lock after we spin_unlock_bh(), and then try to free
981 the element (which has already been freed!).
985 If 0, it means (in this case) that it is currently running, so we can do the following:
992 struct foo *next = list->next;
993 if (!timer_delete(&list->timer)) {
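
A condensed sketch of the loop those fragments belong to, assuming the same
``list_lock`` and ``struct foo`` as the original example::

    retry:
            spin_lock_bh(&list_lock);

            while (list) {
                    struct foo *next = list->next;

                    if (!timer_delete(&list->timer)) {
                            /* The timer is running right now: drop the lock
                             * so it can finish, then start over. */
                            spin_unlock_bh(&list_lock);
                            goto retry;
                    }
                    kfree(list);
                    list = next;
            }
            spin_unlock_bh(&list_lock);

timer_delete_sync() can be used instead of the retry loop: it waits for a
running handler to finish before returning, but must not be called while
holding a lock that the handler itself takes.
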
1025 Concurrency depends on how long the lock is usually held: you should
1026 hold the lock for as long as needed, but no longer. In the cache
1027 example, we always create the object without the lock held, and then
1028 grab the lock only when we are ready to insert it in the list.
1030 Acquisition times depend on how much damage the lock operations do to
1031 the pipeline (pipeline stalls) and how likely it is that this CPU was
1032 the last one to grab the lock (ie. is the lock cache-hot for this CPU):
1033 on a machine with more CPUs, this likelihood drops fast. Consider a
1034 700MHz Intel Pentium III: an instruction takes about 0.7ns, an atomic
1035 increment takes about 58ns, a lock which is cache-hot on this CPU takes
1036 160ns, and a cacheline transfer from another CPU takes an additional 170
1037 to 360ns. (These figures from Paul McKenney's `Linux Journal RCU article`_.)
1041 by splitting locks into parts (such as in our final per-object-lock example).

1049 Read/Write Lock Variants
1050 ------------------------

1064 Avoiding Locks: Read Copy Update
1065 --------------------------------
1068 Using RCU, the readers can avoid taking a lock altogether: as we expect our cache to be read more often than updated (otherwise the cache is a waste of time), it is a candidate for this optimization.
1072 How do we get rid of read locks? Getting rid of read locks means that
1073 writers may be changing the list underneath the readers. That is
1074 actually quite simple: we can read a linked list while an element is being added, as long as the writer adds it carefully.
1078 new->next = list->next;
1079 wmb();
1080 list->next = new;
1088 otherwise: we want a reader to either not see the new element at all, or see the new element with the ``next`` pointer correctly pointing at the rest of the list.
1096 Removing an element from the list is even simpler: we replace the pointer to the old element with a pointer to its successor, and readers will either see it, or skip over it.
1102 list->next = old->next;
1106 does this (the normal version poisons the old object, which we don't want).
1111 don't realize that the pre-fetched contents is wrong when the ``next`` pointer changes underneath them.
1118 Our final dilemma is this: when can we actually destroy the removed
1119 element? Remember, a reader might be stepping through this element in
1120 the list right now: if we free this element and the ``next`` pointer
1121 changes, the reader will jump off into garbage and crash. We need to
1122 wait until we know that all the readers who were traversing the list
1123 when we deleted the element are finished. We use
1124 call_rcu() to register a callback which will actually
1125 destroy the object once all pre-existing readers are finished.
1126 Alternatively, synchronize_rcu() may be used to block
1127 until all pre-existing readers are finished.
1136 readers cannot sleep, we know that any readers which were traversing the list when we deleted the element are finished, and the callback is triggered.
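
A minimal sketch of what the read side and the deferred free look like,
condensed from the patch that follows (it reuses ``cache`` and ``cache_lock``
from the earlier sketches; writers still serialize on ``cache_lock``)::

    #include <linux/rcupdate.h>
    #include <linux/rculist.h>

    struct object {
            struct list_head list;
            struct rcu_head rcu;    /* for the deferred kfree() */
            int id;
            char name[32];
    };

    /* Reader: no lock at all, just an RCU read-side critical section. */
    int cache_find(int id, char *name)
    {
            struct object *obj;
            int ret = -ENOENT;

            rcu_read_lock();
            list_for_each_entry_rcu(obj, &cache, list) {
                    if (obj->id == id) {
                            strcpy(name, obj->name);
                            ret = 0;
                            break;
                    }
            }
            rcu_read_unlock();
            return ret;
    }

    /* Final discard, run once no pre-existing reader can still see it. */
    static void cache_delete_rcu(struct rcu_head *rcu)
    {
            kfree(container_of(rcu, struct object, rcu));
    }

    /* Writer (called with cache_lock held); adding uses list_add_rcu(). */
    static void __cache_delete(struct object *obj)
    {
            list_del_rcu(&obj->list);
            call_rcu(&obj->rcu, cache_delete_rcu);
    }
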
1143 --- cache.c.perobjectlock 2003-12-11 17:15:03.000000000 +1100
1144 +++ cache.c.rcupdate 2003-12-11 17:55:14.000000000 +1100
1145 @@ -1,15 +1,18 @@
1155 - /* These two protected by cache_lock. */
1165 @@ -40,7 +43,7 @@
1169 - list_for_each_entry(i, &cache, list) {
1171 if (i->id == id) {
1172 i->popularity++;
1174 @@ -49,19 +52,25 @@
1178 +/* Final discard done once we know no readers are looking. */
1188 - list_del(&obj->list);
1189 - object_put(obj);
1190 + list_del_rcu(&obj->list);
1191 cache_num--;
1192 + call_rcu(&obj->rcu, cache_delete_rcu);
1198 - list_add(&obj->list, &cache);
1199 + list_add_rcu(&obj->list, &cache);
1203 @@ -104,12 +114,11 @@
1207 - unsigned long flags;
1209 - spin_lock_irqsave(&cache_lock, flags);
1214 - spin_unlock_irqrestore(&cache_lock, flags);
1221 solution would be to make it an ``atomic_t``, but for this usage, we don't really care about races: an approximate result is good enough, so I didn't change it.
1226 synchronization with any other functions, so is almost as fast on SMP as
1227 it would be on UP.
1238 need to actually get and put the reference count: we could expose
1239 __cache_find() by making it non-static, and such callers could simply call that.
1243 object is not altered in any way, which is much faster on SMP machines due to caching.
1246 Per-CPU Data
1247 ------------
1255 machine to test on and can show that it is), you could instead use a counter for each CPU, then none of them need an exclusive lock.
1260 Of particular use for simple per-cpu counters is the ``local_t`` type,
1262 more efficient than simple code on some architectures
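
As a sketch of the idea, using the generic per-CPU API rather than
``local_t`` (the counter is hypothetical)::

    #include <linux/percpu.h>
    #include <linux/cpumask.h>

    static DEFINE_PER_CPU(unsigned long, hit_count);

    void note_hit(void)
    {
            this_cpu_inc(hit_count);        /* each CPU touches only its copy */
    }

    unsigned long total_hits(void)          /* approximate sum, no locking */
    {
            unsigned long sum = 0;
            int cpu;

            for_each_possible_cpu(cpu)
                    sum += per_cpu(hit_count, cpu);
            return sum;
    }
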

1269 Data Which Is Mostly Used By An IRQ Handler
1270 -------------------------------------------
1274 will not run simultaneously on multiple CPUs.
1287 (and waits for it to finish if it's currently running on other CPUs).
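
A minimal sketch, assuming a hypothetical ``DEV_IRQ`` number owned by this
driver::

    #include <linux/interrupt.h>

    void rare_slow_path(void)               /* process context */
    {
            disable_irq(DEV_IRQ);           /* waits for a running handler */
            /* ... touch data normally owned by the irq handler ... */
            enable_irq(DEV_IRQ);
    }
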

1301 Some Functions Which Sleep
1302 --------------------------
1310 - Accesses to userspace:
1312 - copy_from_user()
1314 - copy_to_user()
1316 - get_user()
1318 - put_user()
1320 - kmalloc(GFP_KERNEL)
1322 - mutex_lock_interruptible() and mutex_lock()

1331 Some Functions Which Don't Sleep
1332 --------------------------------
1337 - printk()
1339 - kfree()
1341 - add_timer() and timer_delete()
1346 .. kernel-doc:: include/linux/mutex.h
1349 .. kernel-doc:: kernel/locking/mutex.c
1355 .. kernel-doc:: kernel/futex/core.c
1358 .. kernel-doc:: kernel/futex/futex.h
1361 .. kernel-doc:: kernel/futex/pi.c
1364 .. kernel-doc:: kernel/futex/requeue.c
1367 .. kernel-doc:: kernel/futex/waitwake.c
1373 - ``Documentation/locking/spinlocks.rst``: Linus Torvalds' spinlocking tutorial in the kernel sources.
1376 - Unix Systems for Modern Architectures: Symmetric Multiprocessing and Caching for Kernel Programmers, by Curt Schimmel: a very good introduction to kernel-level locking (not written for Linux, but nearly everything applies).
1394 Thanks to the cabal for having no influence on this document.
1405 preemption, even on UP.
1410 blocks any software interrupt on the current CPU. Bottom halves are deprecated, and will eventually be replaced by tasklets. Only one bottom half will be running at any time.
1423 Symmetric Multi-Processor: kernels compiled for multiple-CPU machines.
1432 interrupts which can run on multiple CPUs at once. Sometimes used to refer to tasklets as well (ie. all software interrupts).
1436 A dynamically-registrable software interrupt, which is guaranteed to
1437 only run on one CPU at a time.
1440 A dynamically-registrable software interrupt, which is run at (or close to) a given time. When running, it is just like a tasklet (in fact, they are called from the ``TIMER_SOFTIRQ``).
1445 Uni-Processor: Non-SMP. (``CONFIG_SMP=n``).
1448 The kernel executing on behalf of a particular process (ie. a system call or trap) or kernel thread.