Requirements.html - OpenGrok cross reference for /Documentation/RCU/Design/Requirements/Requirements.html

1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
5         <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
20 Read-copy update (RCU) is a synchronization mechanism that is often
21 used as a replacement for reader-writer locking.
23 which means that RCU's read-side primitives can be exceedingly fast
33 of as an informal, high-level specification for RCU.
49 <li>	<a href="#Fundamental Non-Requirements">Fundamental Non-Requirements</a>
52 <li>	<a href="#Quality-of-Implementation Requirements">
53 	Quality-of-Implementation Requirements</a>
56 <li>	<a href="#Software-Engineering Requirements">
57 	Software-Engineering Requirements</a>
77 <li>	<a href="#Grace-Period Guarantee">
78 	Grace-Period Guarantee</a>
79 <li>	<a href="#Publish-Subscribe Guarantee">
80 	Publish-Subscribe Guarantee</a>
81 <li>	<a href="#Memory-Barrier Guarantees">
82 	Memory-Barrier Guarantees</a>
85 <li>	<a href="#Guaranteed Read-to-Write Upgrade">
86 	Guaranteed Read-to-Write Upgrade</a>
89 <h3><a name="Grace-Period Guarantee">Grace-Period Guarantee</a></h3>
92 RCU's grace-period guarantee is unusual in being premeditated:
99 RCU's grace-period guarantee allows updaters to wait for the completion
100 of all pre-existing RCU read-side critical sections.
101 An RCU read-side critical section
105 big RCU read-side critical section.
106 Production-quality implementations of <tt>rcu_read_lock()</tt> and
138 all pre-existing readers, any instance of <tt>thread0()</tt> that
160 	with readers, but pre-existing readers will block
170 	update-side code does run concurrently with readers, whether
171 	pre-existing or not.
180 a state suitable for handling recovery from node failure,
219 The RCU read-side critical section in <tt>do_something_dlm()</tt>
242 an RCU read-side critical section must not contain calls to
244 Similarly, an RCU read-side critical section must not
249 Although RCU's grace-period guarantee is useful in and of itself, with
251 it would be good to be able to use RCU to coordinate read-side
253 For this, the grace-period guarantee is not sufficient, as can
257 and, if the value loaded is non-<tt>NULL</tt>, locklessly accessing the
258 <tt>-&gt;a</tt> and <tt>-&gt;b</tt> fields.
266  5     return -ENOMEM;
272 11   p-&gt;a = a;
273 12   p-&gt;b = b;
291  5     return -ENOMEM;
298 12   p-&gt;a = a;
299 13   p-&gt;b = b;</b>
309 it will see garbage in the <tt>-&gt;a</tt> and <tt>-&gt;b</tt>
314 reordering in this manner, which brings us to the publish-subscribe
317 <h3><a name="Publish-Subscribe Guarantee">Publish-Subscribe Guarantee</a></h3>
320 RCU's publish-subscribe guarantee allows data to be inserted
333  5     return -ENOMEM;
339 11   p-&gt;a = a;
340 12   p-&gt;b = b;
363 	two assignments to <tt>p-&gt;a</tt> and <tt>p-&gt;b</tt>
374 	to <tt>p-&gt;a</tt> and <tt>p-&gt;b</tt> cannot possibly
382 to control its accesses to the RCU-protected data,
392  6     do_something(p-&gt;a, p-&gt;b);
418 <b> 5     do_something(gp-&gt;a, gp-&gt;b);</b>
431 the fetches of <tt>gp-&gt;a</tt>
432 and <tt>gp-&gt;b</tt> might well come from two different structures,
444  6     do_something(p-&gt;a, p-&gt;b);
458 <a href="http://www.rdrop.com/users/paulmck/RCU/consume.2015.07.13a.pdf">high-quality implementatio…
463 outermost RCU read-side critical section containing that
476 Of course, it is also necessary to remove elements from RCU-protected
481 <li>	Wait for all pre-existing RCU read-side critical sections
482 	to complete (because only pre-existing readers can possibly have
530 	within an RCU read-side critical section or in a code
532 	code protected by the corresponding update-side lock.
548 	It could also fetch the pointer from <tt>gp</tt> in a byte-at-a-time
550 	mash-up of two distinct pointer values.
551 	It might even use value-speculation optimizations, where it makes
554 	Too bad about any dereferences that returned pre-initialization garbage
571 In short, RCU's publish-subscribe guarantee is provided by the combination
573 This guarantee allows data elements to be safely added to RCU-protected
575 This guarantee can be used in combination with the grace-period
576 guarantee to also allow data elements to be removed from RCU-protected
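The publish-subscribe ordering described in the lines above can be approximated in userspace with C11 atomics. This is only a minimal sketch under stated assumptions: the release store stands in for what <tt>rcu_assign_pointer()</tt> provides and the acquire load stands in for the subscriber-side fetch. Acquire ordering is stronger than the dependency ordering the kernel primitives rely on. The names <tt>struct foo</tt>, <tt>add_gp()</tt>, and <tt>read_gp()</tt> are illustrative, and updaters are assumed to be serialized by some lock that is not shown.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Illustrative structure; not the kernel's definition. */
struct foo {
	int a;
	int b;
};

/* Global pointer shared between the publisher and subscribers. */
static _Atomic(struct foo *) gp;

/*
 * Publisher: fully initialize the structure, then publish it.
 * The release store keeps the stores to ->a and ->b ordered
 * before the store to gp.
 */
static int add_gp(int a, int b)
{
	struct foo *p = malloc(sizeof(*p));

	if (!p)
		return -1;
	p->a = a;
	p->b = b;
	atomic_store_explicit(&gp, p, memory_order_release);
	return 0;
}

/*
 * Subscriber: fetch gp exactly once, then use only the local copy,
 * so both fields are read from the same structure.
 * Returns 1 if a structure was published, 0 otherwise.
 */
static int read_gp(int *a, int *b)
{
	struct foo *p = atomic_load_explicit(&gp, memory_order_acquire);

	if (!p)
		return 0;
	*a = p->a;
	*b = p->b;
	return 1;
}
```

Fetching the pointer once into a local variable is what prevents the two field reads from coming from different structures. The sketch covers insertion only; safe removal additionally needs the grace-period guarantee.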
587 late-1990s meeting with the DEC Alpha architects, back in the days when
588 DEC was still a free-standing company.
597 <h3><a name="Memory-Barrier Guarantees">Memory-Barrier Guarantees</a></h3>
600 The previous section's simple linked-data-structure scenario clearly
601 demonstrates the need for RCU's stringent memory-ordering guarantees on
605 <li>	Each CPU that has an RCU read-side critical section that
608 	that the RCU read-side critical section ends and the time that
610 	Without this guarantee, a pre-existing RCU read-side critical section
614 <li>	Each CPU that has an RCU read-side critical section that ends
618 	read-side critical section begins.
619 	Without this guarantee, a later RCU read-side critical section
646 	Given that multiple CPUs can start RCU read-side critical sections
648 	tell whether or not a given RCU read-side critical section starts
654 	RCU read-side critical section starts before a
656 	then it must assume that the RCU read-side critical section
659 	can avoid waiting on a given RCU read-side critical section only
669 	the relationship of the code within the enclosed RCU read-side
672 	If we take this viewpoint, then a given RCU read-side critical
719 		CPU 1: <tt>do_something_with(q-&gt;a);
735 	end of the RCU read-side critical section and the end of the
759 		CPU 1: <tt>do_something_with(q-&gt;a); /* Boom!!! */</tt>
767 	grace period and the beginning of the RCU read-side critical section,
776 	that you have adhered to the as-if rule than it is to actually
789 	RCU read-side critical sections.
790 	Given such rearrangement, if a given RCU read-side critical section
791 	is done, how can you be sure that all prior RCU read-side critical
800 	Because calls to <tt>schedule()</tt> had better prevent calling-code
802 	<tt>schedule()</tt>, if RCU detects the end of a given RCU read-side
804 	RCU read-side critical sections, no matter how aggressively the
811 	into user-mode code, and so on.
819 Note that these memory-barrier requirements do not replace the fundamental
820 RCU requirement that a grace period wait for all pre-existing readers.
829 The common-case RCU primitives are unconditional.
838 After all, this guarantee was reverse-engineered, not premeditated.
846 <h3><a name="Guaranteed Read-to-Write Upgrade">Guaranteed Read-to-Write Upgrade</a></h3>
850 update within an RCU read-side critical section.
851 For example, that RCU read-side critical section might search for
852 a given data element, and then might acquire the update-side
854 in that RCU read-side critical section.
855 Of course, it is necessary to exit the RCU read-side critical section
865 	But how does the upgrade-to-write operation exclude other readers?
876 This guarantee allows lookup code to be shared between read-side
877 and update-side code, and was premeditated, appearing in the earliest
880 <h2><a name="Fundamental Non-Requirements">Fundamental Non-Requirements</a></h2>
883 RCU provides extremely lightweight readers, and its read-side guarantees,
888 long, however, the following sections list a few non-guarantees that
890 Except where otherwise noted, these non-guarantees were premeditated.
899 <li>	<a href="#Grace Periods Don't Partition Read-Side Critical Sections">
900 	Grace Periods Don't Partition Read-Side Critical Sections</a>
901 <li>	<a href="#Read-Side Critical Sections Don't Partition Grace Periods">
902 	Read-Side Critical Sections Don't Partition Grace Periods</a>
908 Reader-side markers such as <tt>rcu_read_lock()</tt> and
910 except through their interaction with the grace-period APIs such as
955 these fast-path APIs.
1029 	pre-existing readers.
1046 <h3><a name="Grace Periods Don't Partition Read-Side Critical Sections">
1047 Grace Periods Don't Partition Read-Side Critical Sections</a></h3>
1050 It is tempting to assume that if any part of one RCU read-side critical
1052 read-side critical section follows that same grace period, then all of
1053 the first RCU read-side critical section must precede all of the second.
1055 partition the set of RCU read-side critical sections.
1099 RCU knows that the thread cannot be in the midst of an RCU read-side
1105 If it is necessary to partition RCU read-side critical sections in this
1153 This means that the two RCU read-side critical sections cannot overlap,
1166 This non-requirement was also non-premeditated, but became apparent
1169 <h3><a name="Read-Side Critical Sections Don't Partition Grace Periods">
1170 Read-Side Critical Sections Don't Partition Grace Periods</a></h3>
1173 It is also tempting to assume that if an RCU read-side critical section
1235 Again, an RCU read-side critical section can overlap almost all of a
1238 As a result, an RCU read-side critical section cannot partition a pair
1246 	read-side critical section, would be required to partition the RCU
1247 	read-side critical sections at the beginning and end of the chain?
1271 	This is most obvious in preemptible user-level
1274 	the underlying hypervisor), but can also happen in bare-metal
1281 	wrap-around when incrementing a 64-bit counter.
1284 	memory-barrier instructions to preserve ordering.
1287 	Greater numbers of concurrent writes and more-frequent
1296 <li>	Counters are finite, especially on 32-bit systems.
1302 	As an example of the latter, RCU's dyntick-idle nesting counter
1304 	is 64 bits even on a 32-bit system).
1306 	half-interrupts on a given CPU without that CPU ever going idle.
1307 	If a half-interrupt happened every microsecond, it would take
1311 	Linux kernel in a single shared-memory environment.
1312 	RCU must therefore pay close attention to high-end scalability.
1322 <h2><a name="Quality-of-Implementation Requirements">Quality-of-Implementation Requirements</a></h2>
1325 These sections list quality-of-implementation requirements.
1328 make it inappropriate for industrial-strength production use.
1329 Classes of quality-of-implementation requirements are as follows:
1345 RCU is and always has been intended primarily for read-mostly situations,
1346 which means that RCU's read-side primitives are optimized, often at the
1347 expense of its update-side primitives.
1351 <li>	Read-mostly data, where stale and inconsistent data is not
1353 <li>	Read-mostly data, where data must be consistent:
1355 <li>	Read-write data, where data must be consistent:
1358 <li>	Write-mostly data, where data must be consistent:
1362 	<li>	Existence guarantees for update-friendly mechanisms.
1363 	<li>	Wait-free read-side primitives for real-time use.
1368 This focus on read-mostly situations means that RCU must interoperate
1374 synchronization primitives be legal within RCU read-side critical sections,
1386 	These are forbidden within Linux-kernel RCU read-side critical
1388 	(in this case, voluntary context switch) within an RCU read-side
1390 	However, sleeping locks may be used within userspace RCU read-side
1391 	critical sections, and also within Linux-kernel sleepable RCU
1393 	read-side critical sections.
1394 	In addition, the -rt patchset turns spinlocks into a
1398 	-rt-Linux-kernel RCU read-side critical sections.
1402 	Note that it <i>is</i> legal for a normal RCU read-side
1429 speed-of-light delays if nothing else.
1460 For example, the translation from a user-level SystemV semaphore
1461 ID to the corresponding in-kernel data structure is protected by RCU,
1465 spinlocks located in the in-kernel data structure from within
1466 the RCU read-side critical section, and this is indicated by the
1482 and Linux-kernel RCU implementations must therefore avoid unnecessarily
1487 of energy efficiency in battery-powered systems and on specific
1488 energy-efficiency shortcomings of the Linux-kernel RCU implementation.
1489 In my experience, the battery-powered embedded community will consider
1491 So much so that mere Linux-kernel-mailing-list posts are
1500 <a href="http://elinux.org/Linux_Tiny-FAQ">bloatwatch</a>
1501 efforts, memory footprint is critically important on single-CPU systems with
1502 non-preemptible (<tt>CONFIG_PREEMPT=n</tt>) kernels, and thus
1505 Josh Triplett has since taken over the small-memory banner with his
1514 For example, in keeping with RCU's read-side specialization,
1517 Similarly, in non-preemptible environments, <tt>rcu_read_lock()</tt> and
1521 In preemptible environments, in the case where the RCU read-side
1523 highest-priority real-time process), <tt>rcu_read_lock()</tt> and
1525 In particular, they should not contain atomic read-modify-write
1526 operations, memory-barrier instructions, preemption disabling,
1528 However, in the case where the RCU read-side critical section was preempted,
1530 This is why it is better to nest an RCU read-side critical section
1531 within a preempt-disable region than vice versa, at least in cases
1533 real-time latencies.
1536 The <tt>synchronize_rcu()</tt> grace-period-wait primitive is
1539 the duration of the longest RCU read-side critical section.
1542 so that they can be satisfied by a single underlying grace-period-wait
1545 grace-period-wait operation to serve more than
1546 …="https://www.usenix.org/conference/2004-usenix-annual-technical-conference/making-rcu-safe-deep-s…
1547 of <tt>synchronize_rcu()</tt>, thus amortizing the per-invocation
1549 However, the grace-period optimization is also required to avoid
1550 measurable degradation of real-time scheduling and interrupt latencies.
1553 In some cases, the multi-millisecond <tt>synchronize_rcu()</tt>
1556 instead, reducing the grace-period latency down to a few tens of
1557 microseconds on small systems, at least in cases where the RCU read-side
1567 is permitted to impose modest degradation of real-time latency
1568 on non-idle online CPUs.
1570 degradation as a scheduling-clock interrupt.
1574 <tt>synchronize_rcu_expedited()</tt>'s reduced grace-period
1605 25   call_rcu(&amp;p-&gt;rh, remove_gp_cb);
1614 on lines&nbsp;1-5.
1623 including within preempt-disable code, <tt>local_bh_disable()</tt> code,
1624 interrupt-disable code, and interrupt handlers.
1634 Long-running operations should be relegated to separate threads or
1648 	Presumably the <tt>-&gt;gp_lock</tt> acquired on line&nbsp;18 excludes
1652 	<tt>-&gt;gp_lock</tt>
1702 <tt>synchronize_rcu()</tt> simply open-coded it.
1718 	definition would say that updates in garbage-collected languages
1734 The polling-style <tt>get_state_synchronize_rcu()</tt> and
1779 In theory, delaying grace-period completion and callback invocation
1789 For one simple example, an infinite loop in an RCU read-side critical
1791 For a more involved example, consider a 64-CPU system built with
1792 <tt>CONFIG_RCU_NOCB_CPU=y</tt> and booted with <tt>rcu_nocbs=1-63</tt>,
1815 	next scheduling-clock.
1817 	can run indefinitely in the kernel without scheduling-clock
1824 	task that has been preempted within an RCU read-side critical
1838 Note that these forward-progress measures are provided only for RCU,
1845 invocation of callbacks when any given non-<tt>rcu_nocbs</tt> CPU has
1857 <li>	Lifts callback-execution batch limits, which speeds up callback
1864 Again, these forward-progress measures are provided only for RCU,
1868 Even for RCU, callback-invocation forward progress for <tt>rcu_nocbs</tt>
1869 CPUs is much less well-developed, in part because workloads benefiting
1873 <tt>call_rcu()</tt> invocation rates, then additional forward-progress
1880 due to the collision of multicore hardware with object-oriented techniques
1881 designed in single-threaded environments for single-threaded use.
1882 And in theory, RCU read-side critical sections may be composed, and in
1884 In practice, as with all real-world implementations of composable
1890 Linux-kernel RCU when <tt>CONFIG_PREEMPT=n</tt>, can be
1898 mutually recursive functions each in its own translation unit,
1908 are limited by the nesting-depth counter.
1912 That said, a consecutive pair of RCU read-side critical sections
1914 cannot be enclosed in another RCU read-side critical section.
1916 an RCU read-side critical section:  To do so would result either
1918 in RCU implicitly splitting the enclosing RCU read-side critical
1919 section, neither of which is conducive to a long-lived and prosperous
1924 For example, many transactional-memory implementations prohibit
1927 For another example, lock-based critical sections can be composed
1931 In short, although RCU read-side critical sections are highly composable,
1939 RCU read-side critical sections, perhaps even so intense that there
1941 RCU read-side critical section in flight.
1943 all the RCU read-side critical sections are finite, grace periods
1948 in RCU read-side critical sections being preempted for long durations,
1949 which has the effect of creating a long-duration RCU read-side
1952 real-time priorities are of course more vulnerable.
1966 Finally, high update rates should not delay RCU read-side critical
1967 sections, although some small read-side delays can occur when using
1973 1990s, a simple user-level test consisting of <tt>close(open(path))</tt>
1976 high-update-rate corner case.
1981 completion of grace-period processing.
1986 <h2><a name="Software-Engineering Requirements">
1987 Software-Engineering Requirements</a></h2>
1998 	RCU read-side critical section.
1999 	Update-side code can use <tt>rcu_dereference_protected()</tt>,
2017 	an RCU read-side critical section.
2020 <li>	A given function might wish to check for RCU-related preconditions
2028 	To catch this sort of error, a given RCU-protected pointer may be
2030 	will complain about simple-assignment accesses to that pointer.
2043 	Similarly, statically allocated non-stack <tt>rcu_head</tt>
2049 <li>	An infinite loop in an RCU read-side critical section will
2058 	RCU read-side critical section.
2065 	Furthermore, RCU CPU stall warnings are counter-productive
2080 	of RCU read-side critical sections, there is currently no
2086 <li>	In kernels built with <tt>CONFIG_RCU_TRACE=y</tt>, RCU-related
2088 <li>	Open-coded use of <tt>rcu_assign_pointer()</tt> and
2090 	data structures can be surprisingly error-prone.
2091 	Therefore, RCU-protected
2093 	and, more recently, RCU-protected
2096 	Many other special-purpose RCU-protected data structures are
2111 This is not a hard-and-fast list:  RCU's diagnostic capabilities will
2113 in real-world RCU usage.
2127 	Interrupts and non-maskable interrupts (NMIs)</a>.
2135 <li>	<a href="#Scheduling-Clock Interrupts and RCU">
2136 	Scheduling-Clock Interrupts and RCU</a>.
2144 most notable Linux-kernel complications.
2165 <a href="https://lkml.kernel.org/g/CA+55aFy4wcCwaL4okTs8wXhGZ5h-ibecy_Meg9C4MNQrUnwMcg@mail.gmail.c…
2173 Or the translation is accurate, but the original message is bogus.
2179 it would create too many per-CPU kthreads.
2199 boot CPU's per-CPU variables are set up.
2200 The read-side primitives (<tt>rcu_read_lock()</tt>,
2226 state and thus a grace period, so the early-boot implementation can
2227 be a no-op.
2234 The reason is that an RCU read-side critical section might be preempted,
2261 	handled by the expedited grace-period mechanism.
2265 	Because dead-zone execution takes place within task context,
2288 I learned of these boot-time requirements as a result of a series of
2294 The Linux kernel has interrupts, and RCU read-side critical sections are
2295 legal within interrupt handlers and within interrupt-disabled regions
2299 Some Linux-kernel architectures can enter an interrupt handler from
2300 non-idle process context, and then just never leave it, instead stealthily
2303 These &ldquo;half-interrupts&rdquo; mean that RCU has to be very careful
2306 of RCU's dyntick-idle code.
2309 The Linux kernel has non-maskable interrupts (NMIs), and
2310 RCU read-side critical sections are legal within NMI handlers.
2311 Thankfully, RCU update-side primitives, including
2315 The name notwithstanding, some Linux-kernel architectures
2318 <a href="https://lkml.kernel.org/r/CALCETrXLq1y7e_dKFPgou-FKHB6Pu-r8+t-6Ds+8=va7anBWDA@mail.gmail.c…
2342 The module-unload functions must therefore cancel any
2343 delayed calls to loadable-module functions, for example,
2353 to deal with in-flight RCU callbacks.
2358 which waits until all in-flight RCU callbacks have been invoked.
2362 In theory, the underlying module-unload code could invoke
2367 Nikita Danilov noted this requirement for an analogous filesystem-unmount
2388 	and <tt>rcu_barrier()</tt> must wait for each pre-existing
2411 	pre-existing callbacks, you will need to invoke both
2425 with the exception of <a href="#Sleepable RCU">SRCU</a> read-side
2428 on the other hand, the Linux kernel's CPU-hotplug implementation
2432 The Linux-kernel CPU-hotplug implementation has notifiers that
2434 to respond appropriately to a given CPU-hotplug operation.
2435 Most RCU operations may be invoked from CPU-hotplug notifiers,
2436 including even synchronous grace-period operations such as
2440 However, all-callback-wait operations such as
2442 fact that there are phases of CPU-hotplug operations where
2444 the CPU-hotplug operation ends, which could also result in deadlock.
2445 Furthermore, <tt>rcu_barrier()</tt> blocks CPU-hotplug operations
2447 when invoked from a CPU-hotplug notifier.
2454 The preemptible-RCU <tt>rcu_read_unlock()</tt>
2456 involving the scheduler's runqueue and priority-inheritance locks.
2465 This scheduler-RCU requirement came as a
2470 avoid excessive CPU-time accumulation by these kthreads.
2472 when running context-switch-heavy workloads when built with
2476 for context-switch-heavy <tt>CONFIG_NO_HZ_FULL=y</tt> workloads,
2480 It is forbidden to hold any of the scheduler's runqueue or priority-inheritance
2482 disabled across the entire RCU read-side critical section, that is,
2486 There was hope that this restriction might be lifted when interrupt-disabled
2488 the resulting RCU-preempt quiescent state until the end of the corresponding
2489 interrupts-disabled region.
2493 In addition, real-time systems using RCU priority boosting
2495 quiescent-state reporting would also defer deboosting, which in turn
2496 would degrade real-time latencies.
2499 In theory, if a given RCU read-side critical section could be
2503 RCU read-side critical section, not interrupts.
2504 Unfortunately, given the possibility of vCPU preemption, long-running
2506 that a given RCU read-side critical section will complete in less than
2510 disabled across the entire RCU read-side critical section.
2530 The kernel needs to access user-space memory, for example, to access
2531 data referenced by system-call parameters.
2535 However, user-space memory might well be paged out, which means
2536 that <tt>get_user()</tt> might well page-fault and thus block while
2539 a <tt>get_user()</tt> invocation into an RCU read-side critical
2547  3 v = p-&gt;value;
2563  4 v = p-&gt;value;
2573 of an RCU read-side critical section.
2575 a use-after-free access, which could be bad for your kernel's
2585 <tt>p-&gt;value</tt> is not volatile, so the compiler would not have any
2589 Therefore, the Linux-kernel definitions of <tt>rcu_read_lock()</tt>
2592 <tt>rcu_read_unlock()</tt> within a nested set of RCU read-side critical
2599 especially by people with battery-powered embedded systems.
2602 This is a large part of the energy-efficiency requirement,
2607 execute an RCU read-side critical section on an idle CPU.
2613 test whether or not it is currently legal to run RCU read-side
2617 idle-loop code.
2630 	deep, even on 32-bit systems, this should not be a serious
2666 These energy-efficiency requirements have proven quite difficult to
2668 clean-sheet rewrites of RCU's energy-efficiency code, the last of
2673 Flaming me on the Linux-kernel mailing list was apparently not
2674 sufficient to fully vent their ire at RCU's energy-efficiency bugs!
2676 <h3><a name="Scheduling-Clock Interrupts and RCU">
2677 Scheduling-Clock Interrupts and RCU</a></h3>
2680 The kernel transitions between in-kernel non-idle execution, userspace
2686 	<th>In-Kernel</th>
2690 	<td>Can rely on scheduling-clock interrupt.</td>
2691 		<td>Can rely on scheduling-clock interrupt and its
2693 			<td>Can rely on RCU's dyntick-idle detection.</td></tr>
2695 	<td>Can rely on scheduling-clock interrupt.</td>
2696 		<td>Can rely on scheduling-clock interrupt and its
2698 			<td>Can rely on RCU's dyntick-idle detection.</td></tr>
2700 	<td>Can only sometimes rely on scheduling-clock interrupt.
2703 		<td>Can rely on RCU's dyntick-idle detection.</td>
2704 			<td>Can rely on RCU's dyntick-idle detection.</td></tr>
2711 	Why can't <tt>NO_HZ_FULL</tt> in-kernel execution rely on the
2712 	scheduling-clock interrupt, just like <tt>HZ_PERIODIC</tt>
2718 	does not necessarily re-enable the scheduling-clock interrupt
2729 It also requires that the scheduling-clock interrupt be enabled when
2734 	it is non-idle, the scheduling-clock tick had better be running.
2736 	very long (11-second) grace periods, with a pointless IPI waking
2738 <li>	If a CPU is in a portion of the kernel that executes RCU read-side
2745 	positively no-joking guaranteed to never execute any RCU read-side
2748 	for light-weight exception handlers, which can then avoid the
2756 	was in fact joking about not doing RCU read-side critical sections.
2757 <li>	If a CPU is executing in the kernel with the scheduling-clock
2758 	interrupt disabled and RCU believes this CPU to be non-idle,
2770 	scheduling-clock interrupt is enabled, of course no problem.
2789 	But given that long-running interrupt handlers can cause
2799 in-kernel execution, usermode execution, and idle, and as long as the
2800 scheduling-clock interrupt is enabled when RCU needs it to be, you
2807 Although small-memory non-realtime systems can simply use Tiny RCU,
2812 it does appear in many RCU-protected data structures, including
2818 This need for memory efficiency is one reason that RCU uses hand-crafted
2824 Although this information might appear in debug-only kernel builds at some
2825 point, in the meantime, the <tt>-&gt;func</tt> field will often provide
2835 <a href="https://lkml.kernel.org/g/1439976106-137226-1-git-send-email-kirill.shutemov@linux.intel.c…
2836 the Linux kernel's memory-management subsystem needs a particular bit
2837 to remain zero during all phases of grace-period processing,
2839 <tt>rcu_head</tt> structure's <tt>-&gt;next</tt> field.
2844 energy-efficiency purposes.
2849 two-byte boundary, and passing a misaligned <tt>rcu_head</tt>
2854 Why not a four-byte or even eight-byte alignment requirement?
2855 Because the m68k architecture provides only two-byte alignment,
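A two-byte alignment requirement is equivalent to requiring that the low-order bit of the structure's address be zero. The check can be sketched as follows, assuming a hypothetical stand-in structure <tt>demo_head</tt> and helper <tt>demo_head_is_aligned()</tt>, neither of which is part of the kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for rcu_head; not the kernel's definition. */
struct demo_head {
	struct demo_head *next;
	void (*func)(struct demo_head *head);
};

/*
 * Return 1 if the address lies on a two-byte boundary,
 * that is, if bit 0 of the address is clear; 0 otherwise.
 */
static int demo_head_is_aligned(const void *p)
{
	return ((uintptr_t)p & 1) == 0;
}
```

A stricter four-byte check would mask with 3 instead of 1, which is the requirement that two-byte-aligned architectures could not satisfy.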
2862 Deferring invocation could potentially have energy-efficiency
2863 benefits, but only if the rate of non-lazy callbacks decreases
2874 RCU is used heavily by hot code paths in performance-critical
2878 read-side primitives.
2893 combination of RCU's runtime primitives with minimal per-operation
2896 per-operation overhead, witness the batching optimizations for
2903 The Linux kernel is used for real-time workloads, especially
2905 <a href="https://rt.wiki.kernel.org/index.php/Main_Page">-rt patchset</a>.
2906 The real-time-latency response requirements are such that the
2908 read-side critical sections is inappropriate.
2910 use an RCU implementation that allows RCU read-side critical
2914 <a href="https://lwn.net/Articles/107930/">real-time patch</a>
2917 encountered by a very early version of the -rt patchset.
2920 In addition, RCU must make do with a sub-100-microsecond real-time latency
2922 In fact, on smaller systems with the -rt patchset, the Linux kernel
2923 provides sub-20-microsecond real-time latencies for the whole kernel,
2927 To my surprise, the sub-100-microsecond real-time latency budget
2931 This real-time requirement motivated the grace-period kthread, which
2932 also simplified handling of a number of race conditions.
2935 RCU must avoid degrading real-time response for CPU-bound threads, whether
2938 That said, CPU-bound loops in the kernel must execute
2947 practice also means that RCU must have an aggressive stress-test
2949 This stress-test suite is called <tt>rcutorture</tt>.
2958 smartphones, Linux-powered televisions, and servers.
2970 consider the fact that in most jurisdictions, a successful multi-year
2972 suffices for a number of types of safety-critical certifications.
2974 in production for safety-critical applications.
2985 this point has two different implementations, non-preemptible and
2991 <li>	<a href="#Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a>
2997 <h3><a name="Bottom-Half Flavor">Bottom-Half Flavor (Historical)</a></h3>
3000 The RCU-bh flavor of RCU has since been expressed in terms of
3003 The read-side API remains, and continues to disable softirq and to
3009 The softirq-disable (AKA &ldquo;bottom-half&rdquo;,
3011 flavor of RCU, or <i>RCU-bh</i>, was developed by
3013 network-based denial-of-service attacks researched by Robert
3020 The result was an out-of-memory condition and a system hang.
3023 The solution was the creation of RCU-bh, which does
3025 across its read-side critical sections, and which uses the transition
3028 This means that RCU-bh grace periods can complete even when some of
3030 based on RCU-bh to withstand network-based denial-of-service attacks.
3035 disable and re-enable softirq handlers, any attempt to start a softirq
3037 RCU-bh read-side critical section will be deferred.
3041 with the code following the RCU-bh read-side critical section rather
3045 For example, suppose that a three-millisecond-long RCU-bh read-side
3055 <a href="https://lwn.net/Articles/609973/#RCU Per-Flavor API Table">RCU-bh API</a>
3066 However, the update-side APIs are now simple wrappers for other RCU
3067 flavors, namely RCU-sched in <tt>CONFIG_PREEMPT=n</tt> kernels and RCU-preempt
3073 The RCU-sched flavor of RCU has since been expressed in terms of
3076 The read-side API remains, and continues to disable preemption and to
3083 side effect of also waiting for all pre-existing interrupt
3085 However, there are legitimate preemptible-RCU implementations that
3087 of an RCU read-side critical section can be a quiescent state.
3088 Therefore, <i>RCU-sched</i> was created, which follows &ldquo;classic&rdquo;
3089 RCU in that an RCU-sched grace period waits for pre-existing
3091 In kernels built with <tt>CONFIG_PREEMPT=n</tt>, the RCU and RCU-sched
3098 disable and re-enable preemption, respectively.
3100 RCU-sched read-side critical section, <tt>rcu_read_unlock_sched()</tt>
3104 However, the highest-priority task won't be preempted, so that task
3105 will enjoy low-overhead <tt>rcu_read_unlock_sched()</tt> invocations.
3109 <a href="https://lwn.net/Articles/609973/#RCU Per-Flavor API Table">RCU-sched API</a>
3122 However, anything that disables preemption also marks an RCU-sched
3123 read-side critical section, including
3132 an RCU read-side critical section&rdquo; was a reliable indication
3134 After all, if you are always blocking in an RCU read-side critical
3135 section, you can probably afford to use a higher-overhead synchronization
3138 whose RCU read-side critical
3152 That said, one consequence of these domains is that read-side code
3167 As noted above, it is legal to block within SRCU read-side critical sections,
3169 If you block forever in one of a given domain's SRCU read-side critical
3172 happen if any operation in a given domain's SRCU read-side critical
3175 For example, this results in a self-deadlock:
3194 and if an <tt>ss1</tt>-domain SRCU read-side critical section
3195 acquired another mutex that was held across an <tt>ss</tt>-domain
3203 Unlike the other RCU flavors, SRCU read-side critical sections can
3214 be invoked from CPU-hotplug notifiers, due to the fact that SRCU grace
3218 will not fire until late in the CPU-hotplug process.
3224 from being invoked from CPU-hotplug notifiers.
3228 non-expedited grace periods are implemented by the same mechanism.
3241 As of v4.12, SRCU's callbacks are maintained per-CPU, eliminating
3255 <a href="https://lwn.net/Articles/609973/#RCU Per-Flavor API Table">SRCU API</a>
3280 anywhere in the code, it is not possible to use read-side markers
3293 read-side critical sections that are delimited by voluntary context
3298 tasks-RCU read-side critical sections.
3301 The tasks-RCU API is quite compact, consisting only of
3317 One of the tricks that RCU uses to attain update-side scalability is
3318 to increase grace-period latency with increasing numbers of CPUs.
3320 grace-period state machine so as to avoid the need for the additional
3326 If there is a strong reason to use <tt>rcu_barrier()</tt> in CPU-hotplug
3332 The tradeoff between grace-period latency on the one hand and interruptions
3333 of other CPUs on the other hand may need to be re-examined.
3334 The desire is of course for zero grace-period latency as well as zero
3347 because the hotpath read-side primitives do not access the combining
3360 realistic system-level workload.
3379 Additional work may be required to provide reasonable forward-progress