1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
5 <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
20 Read-copy update (RCU) is a synchronization mechanism that is often
21 used as a replacement for reader-writer locking.
23 which means that RCU's read-side primitives can be exceedingly fast
33 of as an informal, high-level specification for RCU.
49 <li> <a href="#Fundamental Non-Requirements">Fundamental Non-Requirements</a>
52 <li> <a href="#Quality-of-Implementation Requirements">
53 Quality-of-Implementation Requirements</a>
56 <li> <a href="#Software-Engineering Requirements">
57 Software-Engineering Requirements</a>
77 <li> <a href="#Grace-Period Guarantee">
78 Grace-Period Guarantee</a>
79 <li> <a href="#Publish-Subscribe Guarantee">
80 Publish-Subscribe Guarantee</a>
81 <li> <a href="#Memory-Barrier Guarantees">
82 Memory-Barrier Guarantees</a>
85 <li> <a href="#Guaranteed Read-to-Write Upgrade">
86 Guaranteed Read-to-Write Upgrade</a>
89 <h3><a name="Grace-Period Guarantee">Grace-Period Guarantee</a></h3>
92 RCU's grace-period guarantee is unusual in being premeditated:
99 RCU's grace-period guarantee allows updaters to wait for the completion
100 of all pre-existing RCU read-side critical sections.
101 An RCU read-side critical section
105 big RCU read-side critical section.
106 Production-quality implementations of <tt>rcu_read_lock()</tt> and
138 all pre-existing readers, any instance of <tt>thread0()</tt> that
160 with readers, but pre-existing readers will block
170 update-side code does run concurrently with readers, whether
171 pre-existing or not.
219 The RCU read-side critical section in <tt>do_something_dlm()</tt>
242 an RCU read-side critical section must not contain calls to
244 Similarly, an RCU read-side critical section must not
249 Although RCU's grace-period guarantee is useful in and of itself, with
251 it would be good to be able to use RCU to coordinate read-side
253 For this, the grace-period guarantee is not sufficient, as can
257 and, if the value loaded is non-<tt>NULL</tt>, locklessly accessing the
258 <tt>->a</tt> and <tt>->b</tt> fields.
266 5 return -ENOMEM;
272 11 p->a = a;
273 12 p->b = b;
291 5 return -ENOMEM;
298 12 p->a = a;
299 13 p->b = b;</b>
309 it will see garbage in the <tt>->a</tt> and <tt>->b</tt>
314 reordering in this manner, which brings us to the publish-subscribe
317 <h3><a name="Publish-Subscribe Guarantee">Publish/Subscribe Guarantee</a></h3>
320 RCU's publish-subscribe guarantee allows data to be inserted
333 5 return -ENOMEM;
339 11 p->a = a;
340 12 p->b = b;
363 two assignments to <tt>p->a</tt> and <tt>p->b</tt>
374 to <tt>p->a</tt> and <tt>p->b</tt> cannot possibly
382 to control its accesses to the RCU-protected data,
392 6 do_something(p->a, p->b);
418 <b> 5 do_something(gp->a, gp->b);</b>
431 the fetches of <tt>gp->a</tt>
432 and <tt>gp->b</tt> might well come from two different structures,
444 6 do_something(p->a, p->b);
458 <a href="http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf">high-quality implementatio…
463 outermost RCU read-side critical section containing that
476 Of course, it is also necessary to remove elements from RCU-protected
481 <li> Wait for all pre-existing RCU read-side critical sections
482 to complete (because only pre-existing readers can possibly have
530 within an RCU read-side critical section or in a code
532 code protected by the corresponding update-side lock.
548 It could also fetch the pointer from <tt>gp</tt> in a byte-at-a-time
550 mash-up of two distinct pointer values.
551 It might even use value-speculation optimizations, where it makes
554 Too bad about any dereferences that returned pre-initialization garbage
571 In short, RCU's publish-subscribe guarantee is provided by the combination
573 This guarantee allows data elements to be safely added to RCU-protected
575 This guarantee can be used in combination with the grace-period
576 guarantee to also allow data elements to be removed from RCU-protected
587 late-1990s meeting with the DEC Alpha architects, back in the days when
588 DEC was still a free-standing company.
597 <h3><a name="Memory-Barrier Guarantees">Memory-Barrier Guarantees</a></h3>
600 The previous section's simple linked-data-structure scenario clearly
601 demonstrates the need for RCU's stringent memory-ordering guarantees on
605 <li> Each CPU that has an RCU read-side critical section that
608 that the RCU read-side critical section ends and the time that
610 Without this guarantee, a pre-existing RCU read-side critical section
614 <li> Each CPU that has an RCU read-side critical section that ends
618 read-side critical section begins.
619 Without this guarantee, a later RCU read-side critical section
646 Given that multiple CPUs can start RCU read-side critical sections
648 tell whether or not a given RCU read-side critical section starts
654 RCU read-side critical section starts before a
656 then it must assume that the RCU read-side critical section
659 can avoid waiting on a given RCU read-side critical section only
669 the relationship of the code within the enclosed RCU read-side
672 If we take this viewpoint, then a given RCU read-side critical
719 CPU 1: <tt>do_something_with(q->a);
735 end of the RCU read-side critical section and the end of the
759 CPU 1: <tt>do_something_with(q->a); /* Boom!!! */</tt>
767 grace period and the beginning of the RCU read-side critical section,
776 that you have adhered to the as-if rule than it is to actually
789 RCU read-side critical sections.
790 Given such rearrangement, if a given RCU read-side critical section
791 is done, how can you be sure that all prior RCU read-side critical
800 Because calls to <tt>schedule()</tt> had better prevent calling-code
802 <tt>schedule()</tt>, if RCU detects the end of a given RCU read-side
804 RCU read-side critical sections, no matter how aggressively the
811 into user-mode code, and so on.
819 Note that these memory-barrier requirements do not replace the fundamental
820 RCU requirement that a grace period wait for all pre-existing readers.
829 The common-case RCU primitives are unconditional.
838 After all, this guarantee was reverse-engineered, not premeditated.
846 <h3><a name="Guaranteed Read-to-Write Upgrade">Guaranteed Read-to-Write Upgrade</a></h3>
850 update within an RCU read-side critical section.
851 For example, that RCU read-side critical section might search for
852 a given data element, and then might acquire the update-side
854 in that RCU read-side critical section.
855 Of course, it is necessary to exit the RCU read-side critical section
865 But how does the upgrade-to-write operation exclude other readers?
876 This guarantee allows lookup code to be shared between read-side
877 and update-side code, and was premeditated, appearing in the earliest
880 <h2><a name="Fundamental Non-Requirements">Fundamental Non-Requirements</a></h2>
883 RCU provides extremely lightweight readers, and its read-side guarantees,
888 long, however, the following sections list a few non-guarantees that
890 Except where otherwise noted, these non-guarantees were premeditated.
899 <li> <a href="#Grace Periods Don't Partition Read-Side Critical Sections">
900 Grace Periods Don't Partition Read-Side Critical Sections</a>
901 <li> <a href="#Read-Side Critical Sections Don't Partition Grace Periods">
902 Read-Side Critical Sections Don't Partition Grace Periods</a>
908 Reader-side markers such as <tt>rcu_read_lock()</tt> and
910 except through their interaction with the grace-period APIs such as
955 these fast-path APIs.
1029 pre-existing readers.
1046 <h3><a name="Grace Periods Don't Partition Read-Side Critical Sections">
1047 Grace Periods Don't Partition Read-Side Critical Sections</a></h3>
1050 It is tempting to assume that if any part of one RCU read-side critical
1052 read-side critical section follows that same grace period, then all of
1053 the first RCU read-side critical section must precede all of the second.
1055 partition the set of RCU read-side critical sections.
1099 RCU knows that the thread cannot be in the midst of an RCU read-side
1105 If it is necessary to partition RCU read-side critical sections in this
1153 This means that the two RCU read-side critical sections cannot overlap,
1166 This non-requirement was also non-premeditated, but became apparent
1169 <h3><a name="Read-Side Critical Sections Don't Partition Grace Periods">
1170 Read-Side Critical Sections Don't Partition Grace Periods</a></h3>
1173 It is also tempting to assume that if an RCU read-side critical section
1235 Again, an RCU read-side critical section can overlap almost all of a
1238 As a result, an RCU read-side critical section cannot partition a pair
1246 read-side critical section, would be required to partition the RCU
1247 read-side critical sections at the beginning and end of the chain?
1263 These parallelism facts of life are by no means specific to RCU, but
1271 This is most obvious in preemptible user-level
1274 the underlying hypervisor), but can also happen in bare-metal
1281 wrap-around when incrementing a 64-bit counter.
1284 memory-barrier instructions to preserve ordering.
1287 Greater numbers of concurrent writes and more-frequent
1290 sufficient locality to avoid significant performance and
1296 <li> Counters are finite, especially on 32-bit systems.
1302 As an example of the latter, RCU's dyntick-idle nesting counter
1304 is 64 bits even on a 32-bit system).
1306 half-interrupts on a given CPU without that CPU ever going idle.
1307 If a half-interrupt happened every microsecond, it would take
1311 Linux kernel in a single shared-memory environment.
1312 RCU must therefore pay close attention to high-end scalability.
1322 <h2><a name="Quality-of-Implementation Requirements">Quality-of-Implementation Requirements</a></h2>
1325 These sections list quality-of-implementation requirements.
1328 make it inappropriate for industrial-strength production use.
1329 Classes of quality-of-implementation requirements are as follows:
1345 RCU is and always has been intended primarily for read-mostly situations,
1346 which means that RCU's read-side primitives are optimized, often at the
1347 expense of its update-side primitives.
1351 <li> Read-mostly data, where stale and inconsistent data is not
1353 <li> Read-mostly data, where data must be consistent:
1355 <li> Read-write data, where data must be consistent:
1358 <li> Write-mostly data, where data must be consistent:
1362 <li> Existence guarantees for update-friendly mechanisms.
1363 <li> Wait-free read-side primitives for real-time use.
1368 This focus on read-mostly situations means that RCU must interoperate
1374 synchronization primitives be legal within RCU read-side critical sections,
1386 These are forbidden within Linux-kernel RCU read-side critical
1388 (in this case, voluntary context switch) within an RCU read-side
1390 However, sleeping locks may be used within userspace RCU read-side
1391 critical sections, and also within Linux-kernel sleepable RCU
1393 read-side critical sections.
1394 In addition, the -rt patchset turns spinlocks into a
1398 -rt-Linux-kernel RCU read-side critical sections.
1402 Note that it <i>is</i> legal for a normal RCU read-side
1429 speed-of-light delays if nothing else.
1460 For example, the translation between a user-level SystemV semaphore
1461 ID to the corresponding in-kernel data structure is protected by RCU,
1465 spinlocks located in the in-kernel data structure from within
1466 the RCU read-side critical section, and this is indicated by the
1482 and Linux-kernel RCU implementations must therefore avoid unnecessarily
1487 of energy efficiency in battery-powered systems and on specific
1488 energy-efficiency shortcomings of the Linux-kernel RCU implementation.
1489 In my experience, the battery-powered embedded community will consider
1491 So much so that mere Linux-kernel-mailing-list posts are
1500 <a href="http://elinux.org/Linux_Tiny-FAQ">bloatwatch</a>
1501 efforts, memory footprint is critically important on single-CPU systems with
1502 non-preemptible (<tt>CONFIG_PREEMPT=n</tt>) kernels, and thus
1505 Josh Triplett has since taken over the small-memory banner with his
1514 For example, in keeping with RCU's read-side specialization,
1517 Similarly, in non-preemptible environments, <tt>rcu_read_lock()</tt> and
1521 In preemptible environments, in the case where the RCU read-side
1523 highest-priority real-time process), <tt>rcu_read_lock()</tt> and
1525 In particular, they should not contain atomic read-modify-write
1526 operations, memory-barrier instructions, preemption disabling,
1528 However, in the case where the RCU read-side critical section was preempted,
1530 This is why it is better to nest an RCU read-side critical section
1531 within a preempt-disable region than vice versa, at least in cases
1533 real-time latencies.
1536 The <tt>synchronize_rcu()</tt> grace-period-wait primitive is
1539 the duration of the longest RCU read-side critical section.
1542 so that they can be satisfied by a single underlying grace-period-wait
1545 grace-period-wait operation to serve more than
1546 …="https://www.usenix.org/conference/2004-usenix-annual-technical-conference/making-rcu-safe-deep-s…
1547 of <tt>synchronize_rcu()</tt>, thus amortizing the per-invocation
1549 However, the grace-period optimization is also required to avoid
1550 measurable degradation of real-time scheduling and interrupt latencies.
1553 In some cases, the multi-millisecond <tt>synchronize_rcu()</tt>
1556 instead, reducing the grace-period latency down to a few tens of
1557 microseconds on small systems, at least in cases where the RCU read-side
1567 is permitted to impose modest degradation of real-time latency
1568 on non-idle online CPUs.
1570 degradation as a scheduling-clock interrupt.
1574 <tt>synchronize_rcu_expedited()</tt>'s reduced grace-period
1605 25 call_rcu(&p->rh, remove_gp_cb);
1614 on lines 1-5.
1623 including within preempt-disable code, <tt>local_bh_disable()</tt> code,
1624 interrupt-disable code, and interrupt handlers.
1634 Long-running operations should be relegated to separate threads or
1648 Presumably the <tt>->gp_lock</tt> acquired on line 18 excludes
1652 <tt>->gp_lock</tt>
1702 <tt>synchronize_rcu()</tt> simply open-coded it.
1718 definition would say that updates in garbage-collected languages
1734 The polling-style <tt>get_state_synchronize_rcu()</tt> and
1779 In theory, delaying grace-period completion and callback invocation
1789 For one simple example, an infinite loop in an RCU read-side critical
1791 For a more involved example, consider a 64-CPU system built with
1792 <tt>CONFIG_RCU_NOCB_CPU=y</tt> and booted with <tt>rcu_nocbs=1-63</tt>,
1815 next scheduling-clock interrupt.
1817 can run indefinitely in the kernel without scheduling-clock
1824 task that has been preempted within an RCU read-side critical
1838 Note that these forward-progress measures are provided only for RCU,
1845 invocation of callbacks when any given non-<tt>rcu_nocbs</tt> CPU has
1857 <li> Lifts callback-execution batch limits, which speeds up callback
1864 Again, these forward-progress measures are provided only for RCU,
1868 Even for RCU, callback-invocation forward progress for <tt>rcu_nocbs</tt>
1869 CPUs is much less well-developed, in part because workloads benefiting
1873 <tt>call_rcu()</tt> invocation rates, then additional forward-progress
1880 due to the collision of multicore hardware with object-oriented techniques
1881 designed in single-threaded environments for single-threaded use.
1882 And in theory, RCU read-side critical sections may be composed, and in
1884 In practice, as with all real-world implementations of composable
1890 Linux-kernel RCU when <tt>CONFIG_PREEMPT=n</tt>, can be
1908 are limited by the nesting-depth counter.
1912 That said, a consecutive pair of RCU read-side critical sections
1914 cannot be enclosed in another RCU read-side critical section.
1916 an RCU read-side critical section: To do so would result either
1918 in RCU implicitly splitting the enclosing RCU read-side critical
1919 section, neither of which is conducive to a long-lived and prosperous
1924 For example, many transactional-memory implementations prohibit
1927 For another example, lock-based critical sections can be composed
1931 In short, although RCU read-side critical sections are highly composable,
1939 RCU read-side critical sections, perhaps even so intense that there
1941 RCU read-side critical section in flight.
1943 all the RCU read-side critical sections are finite, grace periods
1948 in RCU read-side critical sections being preempted for long durations,
1949 which has the effect of creating a long-duration RCU read-side
1952 real-time priorities are of course more vulnerable.
1966 Finally, high update rates should not delay RCU read-side critical
1967 sections, although some small read-side delays can occur when using
1973 1990s, a simple user-level test consisting of <tt>close(open(path))</tt>
1976 high-update-rate corner case.
1981 completion of grace-period processing.
1986 <h2><a name="Software-Engineering Requirements">
1987 Software-Engineering Requirements</a></h2>
1998 RCU read-side critical section.
1999 Update-side code can use <tt>rcu_dereference_protected()</tt>,
2017 an RCU read-side critical section.
2020 <li> A given function might wish to check for RCU-related preconditions
2028 To catch this sort of error, a given RCU-protected pointer may be
2030 will complain about simple-assignment accesses to that pointer.
2043 Similarly, statically allocated non-stack <tt>rcu_head</tt>
2049 <li> An infinite loop in an RCU read-side critical section will
2058 RCU read-side critical section.
2065 Furthermore, RCU CPU stall warnings are counter-productive
2080 of RCU read-side critical sections, there is currently no
2086 <li> In kernels built with <tt>CONFIG_RCU_TRACE=y</tt>, RCU-related
2088 <li> Open-coded use of <tt>rcu_assign_pointer()</tt> and
2090 data structures can be surprisingly error-prone.
2091 Therefore, RCU-protected
2093 and, more recently, RCU-protected
2096 Many other special-purpose RCU-protected data structures are
2111 This is not a hard-and-fast list: RCU's diagnostic capabilities will
2113 in real-world RCU usage.
2127 Interrupts and non-maskable interrupts (NMIs)</a>.
2135 <li> <a href="#Scheduling-Clock Interrupts and RCU">
2136 Scheduling-Clock Interrupts and RCU</a>.
2144 most notable Linux-kernel complications.
2165 <a href="https://lkml.kernel.org/g/CA+55aFy4wcCwaL4okTs8wXhGZ5h-ibecy_Meg9C4MNQrUnwMcg@mail.gmail.c…
2179 it would create too many per-CPU kthreads.
2199 boot CPU's per-CPU variables are set up.
2200 The read-side primitives (<tt>rcu_read_lock()</tt>,
2226 state and thus a grace period, so the early-boot implementation can
2227 be a no-op.
2234 The reason is that an RCU read-side critical section might be preempted,
2261 handled by the expedited grace-period mechanism.
2265 Because dead-zone execution takes place within task context,
2288 I learned of these boot-time requirements as a result of a series of
2294 The Linux kernel has interrupts, and RCU read-side critical sections are
2295 legal within interrupt handlers and within interrupt-disabled regions
2299 Some Linux-kernel architectures can enter an interrupt handler from
2300 non-idle process context, and then just never leave it, instead stealthily
2303 These “half-interrupts” mean that RCU has to be very careful
2306 of RCU's dyntick-idle code.
2309 The Linux kernel has non-maskable interrupts (NMIs), and
2310 RCU read-side critical sections are legal within NMI handlers.
2311 Thankfully, RCU update-side primitives, including
2315 The name notwithstanding, some Linux-kernel architectures
2318 <a href="https://lkml.kernel.org/r/CALCETrXLq1y7e_dKFPgou-FKHB6Pu-r8+t-6Ds+8=va7anBWDA@mail.gmail.c…
2342 The module-unload functions must therefore cancel any
2343 delayed calls to loadable-module functions, for example,
2353 to deal with in-flight RCU callbacks.
2358 which waits until all in-flight RCU callbacks have been invoked.
2362 In theory, the underlying module-unload code could invoke
2367 Nikita Danilov noted this requirement for an analogous filesystem-unmount
2388 and <tt>rcu_barrier()</tt> must wait for each pre-existing
2411 pre-existing callbacks, you will need to invoke both
2425 with the exception of <a href="#Sleepable RCU">SRCU</a> read-side
2428 on the other hand, the Linux kernel's CPU-hotplug implementation
2432 The Linux-kernel CPU-hotplug implementation has notifiers that
2434 to respond appropriately to a given CPU-hotplug operation.
2435 Most RCU operations may be invoked from CPU-hotplug notifiers,
2436 including even synchronous grace-period operations such as
2440 However, all-callback-wait operations such as
2442 fact that there are phases of CPU-hotplug operations where
2444 the CPU-hotplug operation ends, which could also result in deadlock.
2445 Furthermore, <tt>rcu_barrier()</tt> blocks CPU-hotplug operations
2447 when invoked from a CPU-hotplug notifier.
2454 The preemptible-RCU <tt>rcu_read_unlock()</tt>
2456 involving the scheduler's runqueue and priority-inheritance locks.
2465 This scheduler-RCU requirement came as a
2470 avoid excessive CPU-time accumulation by these kthreads.
2472 when running context-switch-heavy workloads when built with
2476 for context-switch-heavy <tt>CONFIG_NO_HZ_FULL=y</tt> workloads,
2480 It is forbidden to hold any of scheduler's runqueue or priority-inheritance
2482 disabled across the entire RCU read-side critical section, that is,
2486 There was hope that this restriction might be lifted when interrupt-disabled
2488 the resulting RCU-preempt quiescent state until the end of the corresponding
2489 interrupts-disabled region.
2493 In addition, real-time systems using RCU priority boosting
2495 quiescent-state reporting would also defer deboosting, which in turn
2496 would degrade real-time latencies.
2499 In theory, if a given RCU read-side critical section could be
2503 RCU read-side critical section, not interrupts.
2504 Unfortunately, given the possibility of vCPU preemption, long-running
2506 that a given RCU read-side critical section will complete in less than
2510 disabled across the entire RCU read-side critical section.
2530 The kernel needs to access user-space memory, for example, to access
2531 data referenced by system-call parameters.
2535 However, user-space memory might well be paged out, which means
2536 that <tt>get_user()</tt> might well page-fault and thus block while
2539 a <tt>get_user()</tt> invocation into an RCU read-side critical
2547 3 v = p->value;
2563 4 v = p->value;
2573 of an RCU read-side critical section.
2575 a use-after-free access, which could be bad for your kernel's
2585 <tt>p->value</tt> is not volatile, so the compiler would not have any
2589 Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
2592 <tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
2599 especially by people with battery-powered embedded systems.
2602 This is a large part of the energy-efficiency requirement,
2607 execute an RCU read-side critical section on an idle CPU.
2613 test whether or not it is currently legal to run RCU read-side
2617 idle-loop code.
2630 deep, even on 32-bit systems, this should not be a serious
2666 These energy-efficiency requirements have proven quite difficult to
2668 clean-sheet rewrites of RCU's energy-efficiency code, the last of
2673 Flaming me on the Linux-kernel mailing list was apparently not
2674 sufficient to fully vent their ire at RCU's energy-efficiency bugs!
2676 <h3><a name="Scheduling-Clock Interrupts and RCU">
2677 Scheduling-Clock Interrupts and RCU</a></h3>
2680 The kernel transitions between in-kernel non-idle execution, userspace
2686 <th>In-Kernel</th>
2690 <td>Can rely on scheduling-clock interrupt.</td>
2691 <td>Can rely on scheduling-clock interrupt and its
2693 <td>Can rely on RCU's dyntick-idle detection.</td></tr>
2695 <td>Can rely on scheduling-clock interrupt.</td>
2696 <td>Can rely on scheduling-clock interrupt and its
2698 <td>Can rely on RCU's dyntick-idle detection.</td></tr>
2700 <td>Can only sometimes rely on scheduling-clock interrupt.
2703 <td>Can rely on RCU's dyntick-idle detection.</td>
2704 <td>Can rely on RCU's dyntick-idle detection.</td></tr>
2711 Why can't <tt>NO_HZ_FULL</tt> in-kernel execution rely on the
2712 scheduling-clock interrupt, just like <tt>HZ_PERIODIC</tt>
2718 does not necessarily re-enable the scheduling-clock interrupt
2729 It also requires that the scheduling-clock interrupt be enabled when
2734 it is non-idle, the scheduling-clock tick had better be running.
2736 very long (11-second) grace periods, with a pointless IPI waking
2738 <li> If a CPU is in a portion of the kernel that executes RCU read-side
2745 positively no-joking guaranteed to never execute any RCU read-side
2748 for light-weight exception handlers, which can then avoid the
2756 was in fact joking about not doing RCU read-side critical sections.
2757 <li> If a CPU is executing in the kernel with the scheduling-clock
2758 interrupt disabled and RCU believes this CPU to be non-idle,
2770 scheduling-clock interrupt is enabled, of course no problem.
2789 But given that long-running interrupt handlers can cause
2799 in-kernel execution, usermode execution, and idle, and as long as the
2800 scheduling-clock interrupt is enabled when RCU needs it to be, you
2807 Although small-memory non-realtime systems can simply use Tiny RCU,
2812 it does appear in many RCU-protected data structures, including
2818 This need for memory efficiency is one reason that RCU uses hand-crafted
2824 Although this information might appear in debug-only kernel builds at some
2825 point, in the meantime, the <tt>->func</tt> field will often provide
2835 <a href="https://lkml.kernel.org/g/1439976106-137226-1-git-send-email-kirill.shutemov@linux.intel.c…
2836 the Linux kernel's memory-management subsystem needs a particular bit
2837 to remain zero during all phases of grace-period processing,
2839 <tt>rcu_head</tt> structure's <tt>->next</tt> field.
2844 energy-efficiency purposes.
2849 two-byte boundary, and passing a misaligned <tt>rcu_head</tt>
2854 Why not a four-byte or even eight-byte alignment requirement?
2855 Because the m68k architecture provides only two-byte alignment,
2862 Deferring invocation could potentially have energy-efficiency
2863 benefits, but only if the rate of non-lazy callbacks decreases
2874 RCU is used heavily by hot code paths in performance-critical
2878 read-side primitives.
2893 combination of RCU's runtime primitives with minimal per-operation
2896 per-operation overhead, witness the batching optimizations for
2903 The Linux kernel is used for real-time workloads, especially
2905 <a href="https://rt.wiki.kernel.org/index.php/Main_Page">-rt patchset</a>.
2906 The real-time-latency response requirements are such that the
2908 read-side critical sections is inappropriate.
2910 use an RCU implementation that allows RCU read-side critical
2914 <a href="https://lwn.net/Articles/107930/">real-time patch</a>
2917 encountered by a very early version of the -rt patchset.
2920 In addition, RCU must make do with a sub-100-microsecond real-time latency
2922 In fact, on smaller systems with the -rt patchset, the Linux kernel
2923 provides sub-20-microsecond real-time latencies for the whole kernel,
2927 To my surprise, the sub-100-microsecond real-time latency budget
2931 This real-time requirement motivated the grace-period kthread, which
2935 RCU must avoid degrading real-time response for CPU-bound threads, whether
2938 That said, CPU-bound loops in the kernel must execute
2947 practice also means that RCU must have an aggressive stress-test
2949 This stress-test suite is called <tt>rcutorture</tt>.
2958 smartphones, Linux-powered televisions, and servers.
2970 consider the fact that in most jurisdictions, a successful multi-year
2972 suffices for a number of types of safety-critical certifications.
2974 in production for safety-critical applications.
2985 this point has two different implementations, non-preemptible and
2991 <li> <a href="#Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a>
2997 <h3><a name="Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a></h3>
3000 The RCU-bh flavor of RCU has since been expressed in terms of
3003 The read-side API remains, and continues to disable softirq and to
3009 The softirq-disable (AKA “bottom-half”,
3011 flavor of RCU, or <i>RCU-bh</i>, was developed by
3013 network-based denial-of-service attacks researched by Robert
3020 The result was an out-of-memory condition and a system hang.
3023 The solution was the creation of RCU-bh, which does
3025 across its read-side critical sections, and which uses the transition
3028 This means that RCU-bh grace periods can complete even when some of
3030 based on RCU-bh to withstand network-based denial-of-service attacks.
3035 disable and re-enable softirq handlers, any attempt to start a softirq
3037 RCU-bh read-side critical section will be deferred.
3041 with the code following the RCU-bh read-side critical section rather
3045 For example, suppose that a three-millisecond-long RCU-bh read-side
3055 <a href="https://lwn.net/Articles/609973/#RCU Per-Flavor API Table">RCU-bh API</a>
3066 However, the update-side APIs are now simple wrappers for other RCU
3067 flavors, namely RCU-sched in CONFIG_PREEMPT=n kernels and RCU-preempt
3073 The RCU-sched flavor of RCU has since been expressed in terms of
3076 The read-side API remains, and continues to disable preemption and to
3083 side effect of also waiting for all pre-existing interrupt
3085 However, there are legitimate preemptible-RCU implementations that
3087 of an RCU read-side critical section can be a quiescent state.
3088 Therefore, <i>RCU-sched</i> was created, which follows “classic”
3089 RCU in that an RCU-sched grace period waits for pre-existing
3091 In kernels built with <tt>CONFIG_PREEMPT=n</tt>, the RCU and RCU-sched
3098 disable and re-enable preemption, respectively.
3100 RCU-sched read-side critical section, <tt>rcu_read_unlock_sched()</tt>
3104 However, the highest-priority task won't be preempted, so that task
3105 will enjoy low-overhead <tt>rcu_read_unlock_sched()</tt> invocations.
3109 <a href="https://lwn.net/Articles/609973/#RCU Per-Flavor API Table">RCU-sched API</a>
3122 However, anything that disables preemption also marks an RCU-sched
3123 read-side critical section, including
3132 an RCU read-side critical section” was a reliable indication
3134 After all, if you are always blocking in an RCU read-side critical
3135 section, you can probably afford to use a higher-overhead synchronization
3138 whose RCU read-side critical
3152 That said, one consequence of these domains is that read-side code
3167 As noted above, it is legal to block within SRCU read-side critical sections,
3169 If you block forever in one of a given domain's SRCU read-side critical
3172 happen if any operation in a given domain's SRCU read-side critical
3175 For example, this results in a self-deadlock:
3194 and if an <tt>ss1</tt>-domain SRCU read-side critical section
3195 acquired another mutex that was held across an <tt>ss</tt>-domain
3203 Unlike the other RCU flavors, SRCU read-side critical sections can
3214 be invoked from CPU-hotplug notifiers, due to the fact that SRCU grace
3218 will not fire until late in the CPU-hotplug process.
3224 from being invoked from CPU-hotplug notifiers.
3228 non-expedited grace periods are implemented by the same mechanism.
3241 As of v4.12, SRCU's callbacks are maintained per-CPU, eliminating
3255 <a href="https://lwn.net/Articles/609973/#RCU Per-Flavor API Table">SRCU API</a>
3280 anywhere in the code, it is not possible to use read-side markers
3293 read-side critical sections that are delimited by voluntary context
3298 tasks-RCU read-side critical sections.
3301 The tasks-RCU API is quite compact, consisting only of
3317 One of the tricks that RCU uses to attain update-side scalability is
3318 to increase grace-period latency with increasing numbers of CPUs.
3320 grace-period state machine so as to avoid the need for the additional
3326 If there is a strong reason to use <tt>rcu_barrier()</tt> in CPU-hotplug
3332 The tradeoff between grace-period latency on the one hand and interruptions
3333 of other CPUs on the other hand may need to be re-examined.
3334 The desire is of course for zero grace-period latency as well as zero
3342 groups CPUs so as to reduce lock contention and increase cache locality.
3347 because the hotpath read-side primitives do not access the combining
3360 realistic system-level workload.
3379 Additional work may be required to provide reasonable forward-progress