
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head><title>A Tour Through TREE_RCU's Grace-Period Memory Ordering</title>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">
<p>This document gives a rough visual overview of how Tree RCU's
grace-period memory ordering guarantee is provided.
<p>RCU grace periods provide extremely strong memory-ordering guarantees
for non-idle non-offline code.
Any code that happens after the end of a given RCU grace period is guaranteed
to see the effects of all accesses prior to the beginning of that grace
period that are within RCU read-side critical sections.
Similarly, any code that happens before the beginning of a given RCU grace
period is guaranteed to see the effects of all accesses following the end
of that grace period that are within RCU read-side critical sections.
<p>Note well that RCU-sched read-side critical sections include any region
of code for which preemption is disabled.
Given that each individual machine instruction can be thought of as
an extremely small region of preemption-disabled code, one can think of
<tt>synchronize_sched()</tt> as <tt>smp_mb()</tt> on steroids.
<p>RCU updaters use this guarantee by splitting their updates into
two phases, one of which is executed before the grace period and
the other of which is executed after the grace period.
In the most common use case, phase one removes an element from
a linked RCU-protected data structure, and phase two frees that element.
For this to work, any readers that have witnessed state prior to the
phase-one update (in the common case, removal) must not witness state
following the phase-two update (in the common case, freeing).
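<p>As a concrete illustration, here is a minimal sketch of this
two-phase pattern; the <tt>struct foo</tt>, <tt>foo_release()</tt>,
and <tt>foo_del()</tt> names are hypothetical, used only for
illustration, though <tt>list_del_rcu()</tt>, <tt>call_rcu()</tt>,
and <tt>container_of()</tt> are the usual kernel primitives:

<pre>
struct foo {
	struct list_head list;
	struct rcu_head rh;
};

/* Phase two: free the element only after a grace period has elapsed. */
static void foo_release(struct rcu_head *rhp)
{
	struct foo *fp = container_of(rhp, struct foo, rh);

	kfree(fp);
}

/* Phase one: unlink the element so that new readers cannot find it. */
static void foo_del(struct foo *fp)
{
	list_del_rcu(&amp;fp-&gt;list);		/* Removal. */
	call_rcu(&amp;fp-&gt;rh, foo_release);	/* Deferred free. */
}
</pre>

<p>Any reader that can still hold a reference to the element must have
started before phase one completed, and the grace period guarantees that
any such reader finishes before <tt>foo_release()</tt> runs.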
<p>The RCU implementation provides this guarantee using a judicious
combination of lock-based critical sections, memory barriers, and per-CPU
processing, as is described in the following sections.
<p>The workhorse for RCU's grace-period memory ordering is the
<tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>.
These locks are acquired via the <tt>raw_spin_lock_rcu_node()</tt>,
<tt>raw_spin_lock_irq_rcu_node()</tt>, and
<tt>raw_spin_lock_irqsave_rcu_node()</tt> wrappers
(plus a <tt>raw_spin_trylock_rcu_node()</tt> variant).
Their lock-release counterparts are
<tt>raw_spin_unlock_rcu_node()</tt>, <tt>raw_spin_unlock_irq_rcu_node()</tt>,
and <tt>raw_spin_unlock_irqrestore_rcu_node()</tt>, respectively.
The key point is that the lock-acquisition functions, including
<tt>raw_spin_trylock_rcu_node()</tt>, all invoke
<tt>smp_mb__after_unlock_lock()</tt> immediately after successful
acquisition of the lock.

<p>Therefore, for any given <tt>rcu_node</tt> structure, any access
happening before one of the above lock-release functions will be seen
by all CPUs as happening before any access happening after a later
one of the above lock-acquisition functions.
Furthermore, any access happening before one of the above
lock-release functions on any given CPU will be seen by all
CPUs as happening before any access happening after a later one
of the above lock-acquisition functions executing on that same CPU,
even if the lock-release and lock-acquisition functions are operating
on different <tt>rcu_node</tt> structures.
This is the strong ordering provided by these
lock-acquisition and lock-release functions:
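<p>As a simplified sketch (the kernel's actual wrappers also handle
lockdep and <tt>ACCESS_PRIVATE()</tt>, and come in irq-disabling
variants), the acquisition side can be thought of as follows:

<pre>
/*
 * Acquire the rcu_node structure's -&gt;lock, then upgrade the prior
 * release/acquisition pair to full ordering, even when the previous
 * critical section ran on some other CPU.
 */
#define raw_spin_lock_rcu_node(rnp)		\
do {						\
	raw_spin_lock(&amp;(rnp)-&gt;lock);		\
	smp_mb__after_unlock_lock();		\
} while (0)

/* Release is an ordinary unlock; the extra ordering is supplied at
 * acquisition time by smp_mb__after_unlock_lock(). */
#define raw_spin_unlock_rcu_node(rnp)	raw_spin_unlock(&amp;(rnp)-&gt;lock)
</pre>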
<p>This approach must be extended to include idle CPUs, which need
RCU's grace-period memory ordering guarantee to extend to any
RCU read-side critical sections preceding and following the current
idle sojourn.
This case is handled by calls to the strongly ordered
<tt>atomic_add_return()</tt> read-modify-write atomic operation that
is invoked within <tt>rcu_dynticks_eqs_enter()</tt> at idle-entry
time and within <tt>rcu_dynticks_eqs_exit()</tt> at idle-exit time.
The grace-period kthread invokes <tt>rcu_dynticks_snap()</tt> and
<tt>rcu_dynticks_in_eqs_since()</tt> (both of which invoke
an <tt>atomic_add_return()</tt> of zero) to detect idle CPUs.
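<p>The following is a deliberately simplified sketch of this scheme;
the names carry a <tt>_sketch</tt> suffix because the kernel's real
counter also packs in additional state.
The counter is odd while the CPU is non-idle and even while it is idle,
and every access to it is a fully ordered atomic operation:

<pre>
/* Hypothetical per-CPU idleness counter: starts odd (non-idle). */
static DEFINE_PER_CPU(atomic_t, dynticks_ctr) = ATOMIC_INIT(1);

static void eqs_enter_sketch(void)	/* Idle entry: counter becomes even. */
{
	atomic_add_return(1, this_cpu_ptr(&amp;dynticks_ctr));
}

static void eqs_exit_sketch(void)	/* Idle exit: counter becomes odd. */
{
	atomic_add_return(1, this_cpu_ptr(&amp;dynticks_ctr));
}

/* Fully ordered snapshot of some other CPU's counter. */
static int dynticks_snap_sketch(int cpu)
{
	return atomic_add_return(0, per_cpu_ptr(&amp;dynticks_ctr, cpu));
}

static bool in_eqs_sketch(int snap)	/* Even means idle. */
{
	return !(snap &amp; 0x1);
}

/* Has this CPU passed through (or remained in) idle since snap? */
static bool in_eqs_since_sketch(int cpu, int snap)
{
	return snap != dynticks_snap_sketch(cpu);
}
</pre>

<p>Because <tt>atomic_add_return()</tt> implies full memory barriers,
a grace-period kthread that sees an idle (or changed) value is fully
ordered against the sampled CPU's pre-idle and post-idle RCU read-side
critical sections.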
<p>Offline CPUs are excluded from the grace-period computation,
so the grace period won't expect quiescent states from them.
Races between grace-period start and CPU-hotplug operations
are mediated by the CPU's leaf <tt>rcu_node</tt> structure's
<tt>-&gt;lock</tt> as described above.
<p>The approach must be extended to handle one final case, that of
waking a task blocked in <tt>synchronize_rcu()</tt>.
This task might be affined to a CPU that is not yet aware that
the grace period has ended, and thus might not yet be subject to
the grace period's memory ordering.
<p>Tree RCU's grace-period memory-ordering guarantees rely most
heavily on the <tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>
field, so much so that it is necessary to abbreviate this pattern
in the diagrams in the next section.
For example, consider the <tt>rcu_prepare_for_idle()</tt> function
shown below, which is one of several functions that enforce ordering
of newly arrived RCU callbacks against future grace periods:
<pre>
14   if (tne != rdtp-&gt;tick_nohz_enabled_snap) {
15     if (rcu_cpu_has_callbacks(NULL))
16       invoke_rcu_core();
17     rdtp-&gt;tick_nohz_enabled_snap = tne;
18     return;
19   }

22   if (rdtp-&gt;all_lazy &amp;&amp;
23       rdtp-&gt;nonlazy_posted != rdtp-&gt;nonlazy_posted_snap) {
24     rdtp-&gt;all_lazy = false;
25     rdtp-&gt;nonlazy_posted_snap = rdtp-&gt;nonlazy_posted;

29   if (rdtp-&gt;last_accelerate == jiffies)
30     return;
31   rdtp-&gt;last_accelerate = jiffies;

33     rdp = this_cpu_ptr(rsp-&gt;rda);
34     if (!rcu_segcblist_pend_cbs(&amp;rdp-&gt;cblist))
35       continue;
36     rnp = rdp-&gt;mynode;
37     raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
38     needwake = rcu_accelerate_cbs(rsp, rnp, rdp);
39     raw_spin_unlock_rcu_node(rnp); /* irqs remain disabled. */
</pre>
252 </p><p><img src="rcu_node-lock.svg" alt="rcu_node-lock.svg">
254 <p>The box represents the <tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>
<p>Tree RCU's grace-period memory-ordering guarantee is provided by
the following RCU components, each of which is described in one of
the following sections:

<ol>
<li>	<a href="#Callback Registry">Callback Registry</a>
<li>	<a href="#Grace-Period Initialization">Grace-Period Initialization</a>
<li>	<a href="#Self-Reported Quiescent States">
	Self-Reported Quiescent States</a>
<li>	<a href="#Dynamic Tick Interface">Dynamic Tick Interface</a>
<li>	<a href="#CPU-Hotplug Interface">CPU-Hotplug Interface</a>
<li>	<a href="#Forcing Quiescent States">Forcing Quiescent States</a>
<li>	<a href="#Grace-Period Cleanup">Grace-Period Cleanup</a>
<li>	<a href="#Callback Invocation">Callback Invocation</a>
</ol>
<h4><a name="Callback Registry">Callback Registry</a></h4>

<p>If RCU's grace-period guarantee is to mean anything at all, any
access that happens before a given invocation of <tt>call_rcu()</tt>
must also happen before the corresponding grace period.
The implementation of this portion of RCU's grace-period guarantee
is shown in the following figure:

</p><p><img src="TreeRCU-callback-registry.svg" alt="TreeRCU-callback-registry.svg">

<p>Because <tt>call_rcu()</tt> normally acts only on CPU-local state,
it provides no ordering guarantees, either for itself or for
phase one of the update (which again will usually be removal of
an element from an RCU-protected data structure).
It simply enqueues the <tt>rcu_head</tt> structure on a per-CPU list,
which cannot become associated with a grace period until a later
call to <tt>rcu_accelerate_cbs()</tt>, as shown in the diagram above.

<p>One set of code paths shown on the left invokes
<tt>rcu_accelerate_cbs()</tt> via <tt>note_gp_changes()</tt>,
either directly from <tt>call_rcu()</tt> (if
the current CPU is inundated with queued <tt>rcu_head</tt> structures)
or more likely from an <tt>RCU_SOFTIRQ</tt> handler.
Another code path in the middle is taken only in kernels built with
<tt>CONFIG_RCU_FAST_NO_HZ=y</tt>, which invokes
<tt>rcu_accelerate_cbs()</tt> via <tt>rcu_prepare_for_idle()</tt>.
The final code path on the right is taken only in kernels built with
<tt>CONFIG_HOTPLUG_CPU=y</tt>, which invokes
<tt>rcu_accelerate_cbs()</tt> via <tt>rcutree_migrate_callbacks()</tt>,
which in turn is invoked on a surviving CPU after the outgoing
CPU has been completely offlined.
<p>There are a few other code paths within grace-period processing
that opportunistically invoke <tt>rcu_accelerate_cbs()</tt>.
However, either way, all of the CPU's recently queued <tt>rcu_head</tt>
structures are associated with a future grace-period number under
the protection of the CPU's leaf <tt>rcu_node</tt> structure's
<tt>-&gt;lock</tt>.
In all cases, there is full ordering against any prior critical section
for that same <tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>, and
also full ordering against any of the current task's or CPU's prior critical
sections for any <tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>.
344 <h4><a name="Grace-Period Initialization">Grace-Period Initialization</a></h4>
346 <p>Grace-period initialization is carried out by
347 the grace-period kernel thread, which makes several passes over the
350 grace-period computation will require duplicating this tree.
354 grace-period kernel thread's traversals are presented in multiple
356 grace-period initialization.
<p>The first ordering-related grace-period initialization action is to
advance the <tt>rcu_state</tt> structure's <tt>-&gt;gp_seq</tt>
grace-period-number counter, as shown below:

</p><p><img src="TreeRCU-gp-init-1.svg" alt="TreeRCU-gp-init-1.svg" width="75%">

<p>The actual increment is carried out using <tt>smp_store_release()</tt>,
which helps reject false-positive RCU CPU stall detection.
Note that only the root <tt>rcu_node</tt> structure is touched.
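<p>For reference, <tt>-&gt;gp_seq</tt> is not a simple counter:
its two low-order bits record the phase of the current grace period,
and the remaining bits count grace periods.
The following is a hedged sketch of the helpers (the <tt>_sketch</tt>
suffix marks them as illustrative simplifications of the pattern used
in <tt>kernel/rcu/rcu.h</tt>):

<pre>
#define RCU_SEQ_CTR_SHIFT	2
#define RCU_SEQ_STATE_MASK	((1 &lt;&lt; RCU_SEQ_CTR_SHIFT) - 1)

/* Number of completed grace periods encoded in a -&gt;gp_seq value. */
static unsigned long rcu_seq_ctr_sketch(unsigned long s)
{
	return s &gt;&gt; RCU_SEQ_CTR_SHIFT;
}

/* Nonzero phase bits indicate a grace period in progress. */
static unsigned long rcu_seq_state_sketch(unsigned long s)
{
	return s &amp; RCU_SEQ_STATE_MASK;
}

/* Start a grace period: release semantics order prior updates. */
static void rcu_seq_start_sketch(unsigned long *sp)
{
	smp_store_release(sp, *sp + 1);
}

/* End a grace period: advance the count and zero the phase bits. */
static void rcu_seq_end_sketch(unsigned long *sp)
{
	smp_mb();	/* Grace-period accesses before counter update. */
	WRITE_ONCE(*sp, (*sp | RCU_SEQ_STATE_MASK) + 1);
}
</pre>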
<p>The first pass through the <tt>rcu_node</tt> tree updates bitmasks
based on CPUs having come online or gone offline since the start of
the previous grace period.
In the common case where the number of online CPUs for a given
<tt>rcu_node</tt> structure has not transitioned to or from zero,
this pass will scan only the leaf <tt>rcu_node</tt> structures.
However, if the number of online CPUs has transitioned from zero,
<tt>rcu_init_new_rnp()</tt> will be invoked for the first incoming CPU.
Similarly, if the number of online CPUs has transitioned to zero,
<tt>rcu_cleanup_dead_rnp()</tt> will be invoked for the last outgoing CPU.
The diagram below shows the path of ordering if the leftmost
<tt>rcu_node</tt> structure onlines its first CPU and if the next
<tt>rcu_node</tt> structure has no online CPUs (or, alternatively, if
the leftmost <tt>rcu_node</tt> structure offlines
its last CPU and if the next <tt>rcu_node</tt> structure has no online CPUs).

</p><p><img src="TreeRCU-gp-init-2.svg" alt="TreeRCU-gp-init-2.svg" width="75%">
<p>The final <tt>rcu_gp_init()</tt> pass through the <tt>rcu_node</tt>
tree traverses breadth-first, setting each <tt>rcu_node</tt> structure's
<tt>-&gt;gp_seq</tt> field to the newly advanced value from the
<tt>rcu_state</tt> structure, as shown in the following diagram:

</p><p><img src="TreeRCU-gp-init-3.svg" alt="TreeRCU-gp-init-3.svg" width="75%">
<p>This change will also cause each CPU's next call to
<tt>__note_gp_changes()</tt> to notice that a new grace period has
started, as described in the next section.
But because the grace-period kthread started the grace period at the
root (with the advancing of the <tt>rcu_state</tt> structure's
<tt>-&gt;gp_seq</tt> field) before setting each leaf <tt>rcu_node</tt>
structure's <tt>-&gt;gp_seq</tt> field, each CPU's observation of
the start of the grace period will happen after the actual start
of the grace period.
<p><b>Quick Quiz:</b>
But what about the CPU that started the grace period?
Why wouldn't it see the start of the grace period right when it
started that grace period?

<p><b>Answer:</b>
In some deep philosophical and overly anthropomorphized
sense, yes, the CPU starting the grace period is immediately
aware of having done so.
However, if we instead assume that RCU is not self-aware,
then even the CPU starting the grace period does not really
become aware of the start of this grace period until its first
call to <tt>__note_gp_changes()</tt>.
On the other hand, this CPU potentially gets early notification
because it invokes <tt>__note_gp_changes()</tt> during its last
<tt>rcu_gp_init()</tt> pass through its leaf <tt>rcu_node</tt>
structure.
431 <h4><a name="Self-Reported Quiescent States">
432 Self-Reported Quiescent States</a></h4>
435 quiescent states (or as described in a later section, had quiescent
436 states reported on their behalf), the grace period can end.
437 Online non-idle CPUs report their own quiescent states, as shown
440 </p><p><img src="TreeRCU-qs.svg" alt="TreeRCU-qs.svg" width="75%">
442 <p>This is for the last CPU to report a quiescent state, which signals
444 Earlier quiescent states would push up the <tt>rcu_node</tt> tree
446 is waiting for additional quiescent states.
448 state will acquire that <tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>.
<p>Any number of events can lead up to a CPU invoking
<tt>note_gp_changes()</tt> (or alternatively, directly invoking
<tt>__note_gp_changes()</tt>), at which point that CPU will notice
the start of a new grace period while holding its leaf
<tt>rcu_node</tt> structure's <tt>-&gt;lock</tt>.
In addition, this CPU will consider any RCU read-side critical
section that started before its invocation of
<tt>__note_gp_changes()</tt> to have started before the grace period,
and the grace period will wait on any such critical section.

<p><b>Quick Quiz:</b>
But an RCU read-side critical section might have started
after the beginning of the grace period
(the advancing of <tt>-&gt;gp_seq</tt> from earlier), so why should
the grace period wait on such a critical section?

<p><b>Answer:</b>
The grace period is under no obligation to wait on such a critical
section, but it is permitted to do so, and this lazy approach is
far more scalable than a &ldquo;big bang&rdquo;
all-at-once grace-period start could possibly be.
<p>If the CPU does a context switch, a quiescent state will be
noted by <tt>rcu_note_context_switch()</tt> on the left.
On the other hand, if the CPU takes a scheduler-clock interrupt
while executing in usermode, a quiescent state will be noted by
<tt>rcu_sched_clock_irq()</tt> on the right.
Either way, the passage through a quiescent state will be noted
in a per-CPU variable.
The next time an <tt>RCU_SOFTIRQ</tt> handler executes on
this CPU (for example, after the next scheduler-clock
interrupt), <tt>rcu_core()</tt> will invoke
<tt>rcu_check_quiescent_state()</tt>, which will notice the
recorded quiescent state, and invoke
<tt>rcu_report_qs_rdp()</tt>.
If <tt>rcu_report_qs_rdp()</tt> verifies that the quiescent state
really does apply to the current grace period, it invokes
<tt>rcu_report_qs_rnp()</tt>, which traverses up the <tt>rcu_node</tt>
tree as shown at the bottom of the diagram, clearing bits from
each <tt>rcu_node</tt> structure's <tt>-&gt;qsmask</tt> field,
and propagating up the tree when the result is zero.
Note that traversal passes upwards out of a given <tt>rcu_node</tt>
structure only if the current CPU is reporting the last quiescent
state for the subtree headed by that <tt>rcu_node</tt> structure.
A key point is that if a CPU's traversal stops at a given <tt>rcu_node</tt>
structure, then there will be a later traversal by another CPU
(or perhaps the same one) that proceeds upwards
from that point, and the <tt>rcu_node</tt> <tt>-&gt;lock</tt>
guarantees that the first CPU's quiescent state happens before the
remainder of the second CPU's traversal.
Applying this line of thought repeatedly shows that all CPUs'
quiescent states happen before the last CPU traverses through
the root <tt>rcu_node</tt> structure, the &ldquo;last CPU&rdquo;
being the one that clears the last bit in the root <tt>rcu_node</tt>
structure's <tt>-&gt;qsmask</tt> field.
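<p>The shape of this upward traversal can be sketched as follows.
This is a simplification of the <tt>rcu_report_qs_rnp()</tt> logic
that ignores grace-period-number checks and preempted readers:

<pre>
/* Sketch: clear the reporting CPU's (or child's) bit, propagating
 * upward only when the last bit in a given rcu_node structure clears. */
static void report_qs_sketch(struct rcu_node *rnp, unsigned long mask)
{
	for (;;) {
		raw_spin_lock_irq_rcu_node(rnp); /* +smp_mb__after_unlock_lock(). */
		rnp-&gt;qsmask &amp;= ~mask;
		if (rnp-&gt;qsmask) {
			/* Others still owe quiescent states; a later
			 * report will continue upward from here. */
			raw_spin_unlock_irq_rcu_node(rnp);
			return;
		}
		if (!rnp-&gt;parent) {
			raw_spin_unlock_irq_rcu_node(rnp);
			break;	/* Cleared the root: grace period can end. */
		}
		mask = rnp-&gt;grpmask;	/* This node's bit in its parent. */
		raw_spin_unlock_irq_rcu_node(rnp);
		rnp = rnp-&gt;parent;
	}
	/* The real code would now awaken the grace-period kthread. */
}
</pre>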
<h4><a name="Dynamic Tick Interface">Dynamic Tick Interface</a></h4>

<p>Due to energy-efficiency considerations, RCU is forbidden from
disturbing idle CPUs.
CPUs are therefore required to notify RCU when entering or leaving idle
state, which they do via fully ordered value-returning atomic operations
on a per-CPU variable.
The ordering effects are as shown below:
528 </p><p><img src="TreeRCU-dyntick.svg" alt="TreeRCU-dyntick.svg" width="50%">
530 <p>The RCU grace-period kernel thread samples the per-CPU idleness
531 variable while holding the corresponding CPU's leaf <tt>rcu_node</tt>
532 structure's <tt>-&gt;lock</tt>.
533 This means that any RCU read-side critical sections that precede the
534 idle period (the oval near the top of the diagram above) will happen
537 any RCU read-side critical sections that follow the
538 idle period (the oval near the bottom of the diagram above).
540 <p>Plumbing this into the full grace-period execution is described
541 <a href="#Forcing Quiescent States">below</a>.
543 <h4><a name="CPU-Hotplug Interface">CPU-Hotplug Interface</a></h4>
548 as part of the corresponding CPU hotplug operations.
551 </p><p><img src="TreeRCU-hotplug.svg" alt="TreeRCU-hotplug.svg" width="50%">
553 <p>Because CPU hotplug operations are much less frequent than idle transitions,
554 they are heavier weight, and thus acquire the CPU's leaf <tt>rcu_node</tt>
555 structure's <tt>-&gt;lock</tt> and update this structure's
556 <tt>-&gt;qsmaskinitnext</tt>.
557 The RCU grace-period kernel thread samples this mask to detect CPUs
560 <p>Plumbing this into the full grace-period execution is described
561 <a href="#Forcing Quiescent States">below</a>.
563 <h4><a name="Forcing Quiescent States">Forcing Quiescent States</a></h4>
565 <p>As noted above, idle and offline CPUs cannot report their own
566 quiescent states, and therefore the grace-period kernel thread
568 This process is called &ldquo;forcing quiescent states&rdquo;, it is
571 </p><p><img src="TreeRCU-gp-fqs.svg" alt="TreeRCU-gp-fqs.svg" width="100%">
575 states due to recently idled and/or offlined CPUs, then only the
577 However, if there is a newly offlined CPU as illustrated on the left
578 or a newly idled CPU as illustrated on the right, the corresponding
580 As with self-reported quiescent states, the upwards driving stops
582 states outstanding from other CPUs.
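<p>A simplified sketch of one such forcing pass, patterned after
<tt>force_qs_rnp()</tt>: scan each leaf, and for any CPUs still
blocking the grace period, apply a checking function (for example,
one that samples the idleness counter) and report quiescent states
on behalf of the CPUs that pass the check:

<pre>
static void force_qs_sketch(struct rcu_state *rsp,
			    int (*f)(struct rcu_data *rdp))
{
	int cpu;
	unsigned long flags;
	unsigned long mask;
	struct rcu_data *rdp;
	struct rcu_node *rnp;

	rcu_for_each_leaf_node(rsp, rnp) {
		mask = 0;
		raw_spin_lock_irqsave_rcu_node(rnp, flags);
		for_each_leaf_node_possible_cpu(rnp, cpu) {
			rdp = per_cpu_ptr(rsp-&gt;rda, cpu);
			/* Still blocking the grace period, but idle
			 * or offline according to f()? */
			if ((rnp-&gt;qsmask &amp; leaf_node_cpu_bit(rnp, cpu)) &amp;&amp;
			    f(rdp))
				mask |= rdp-&gt;grpmask;
		}
		if (mask)
			/* Reports on their behalf; releases rnp-&gt;lock. */
			rcu_report_qs_rnp(mask, rsp, rnp, rnp-&gt;gp_seq, flags);
		else
			raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
	}
}
</pre>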
603 <a href="#Putting It All Together">stitched-together diagram</a>.
608 <h4><a name="Grace-Period Cleanup">Grace-Period Cleanup</a></h4>
610 <p>Grace-period cleanup first scans the <tt>rcu_node</tt> tree
611 breadth-first advancing all the <tt>-&gt;gp_seq</tt> fields, then it
612 advances the <tt>rcu_state</tt> structure's <tt>-&gt;gp_seq</tt> field.
615 </p><p><img src="TreeRCU-gp-cleanup.svg" alt="TreeRCU-gp-cleanup.svg" width="75%">
618 grace-period cleanup is complete, the next grace period can begin.
<p><b>Quick Quiz:</b>
But when precisely does the grace period end?

<p><b>Answer:</b>
There is no useful single point at which the grace period
can be said to end.
The earliest reasonable candidate is as soon as the last
CPU has reported its quiescent state, but it may be some
milliseconds before RCU becomes aware of this.
The latest reasonable candidate is once the <tt>rcu_state</tt>
structure's <tt>-&gt;gp_seq</tt> field has been updated,
but it is quite possible that some CPUs have already completed
phase two of their updates by that time.
In short, if you are going to work with RCU, you need to
learn to embrace uncertainty.
<h4><a name="Callback Invocation">Callback Invocation</a></h4>

<p>Once a given CPU's leaf <tt>rcu_node</tt> structure's
<tt>-&gt;gp_seq</tt> field has been updated, that CPU can begin
invoking its RCU callbacks that were waiting for this grace period
to end.
These callbacks are identified by <tt>rcu_advance_cbs()</tt>, which
is usually invoked by <tt>__note_gp_changes()</tt>.
As shown in the diagram below, this invocation can be triggered by
the scheduling-clock interrupt (<tt>rcu_sched_clock_irq()</tt> on
the left) or by idle entry (<tt>rcu_cleanup_after_idle()</tt> on
the right, but only for kernels built with
<tt>CONFIG_RCU_FAST_NO_HZ=y</tt>).
Either way, <tt>RCU_SOFTIRQ</tt> is raised, which results in
<tt>rcu_do_batch()</tt> invoking the callbacks, which in turn
allows those callbacks to carry out (either directly or indirectly
via wakeup) the needed phase-two processing for each update.
662 </p><p><img src="TreeRCU-callback-invocation.svg" alt="TreeRCU-callback-invocation.svg" width="60%">
<p>Please note that callback invocation can also be prompted by any
number of corner-case code paths, for example, when a CPU notes that
it has excessive numbers of callbacks queued.
In all cases, the CPU acquires its leaf <tt>rcu_node</tt> structure's
<tt>-&gt;lock</tt> before invoking callbacks, which preserves the
required ordering against the newly completed grace period.
<p>However, if the callback function communicates to other CPUs, for
example, doing a wakeup, then it is that function's responsibility
to maintain ordering.
For example, if the callback function wakes up a task that runs on
some other CPU, proper ordering must be in place in both the callback
function and the task being awakened.
To see why this is important, consider the top half of the
<a href="#Grace-Period Cleanup">grace-period cleanup</a> diagram.
The callback might be running on a CPU corresponding to the leftmost
leaf <tt>rcu_node</tt> structure, and awaken a task that is to run on
a CPU corresponding to the rightmost leaf <tt>rcu_node</tt> structure,
and the grace-period kernel thread might not yet have reached the
rightmost leaf.
In this case, the grace period's memory ordering might not yet have
reached that CPU, so again the callback function and the awakened
task must supply proper ordering.
<h3><a name="Putting It All Together">Putting It All Together</a></h3>

<p>A stitched-together diagram is
<a href="Tree-RCU-Diagram.html">here</a>.

</body></html>