explanation.txt - OpenGrok cross reference for /tools/memory-model/Documentation/explanation.txt

Lines Matching refs:it
49 rather, it explains in English what the code expresses symbolically.
78 for the loads, the model will predict whether it is possible for the
102 device, stores it in a buffer, and sets a flag to indicate the buffer
106 ready, and if it is, copies the data back to userspace.  The buffer
135 reads flag into the private variable r1, and if it is set, reads the
173 If this were to occur it would mean the driver contains a bug, because
174 incorrect data would get sent to the user: 0 instead of 1.  As it
197 	it, as loads can obtain values only from earlier stores.
202 	P1 must load 0 from buf before P0 stores 1 to it; otherwise r2
206 	P0 stores 1 to buf before storing 1 to flag, since it executes
220 pattern cannot occur, but in other respects it differs from Sequential
245 this outcome to occur, and in fact it does sometimes occur on x86 and
249 x86, Alpha, and other architectures.  However, it is different in
257 (i.e., the order in which certain events occur) but it doesn't have to
288 kernel source code.  Instead it considers the effects of those
307 write event is omitted for executions where it doesn't occur, such as
319 -- not the computation leading up to it.
323 of these relations to be orderings, that is, it requires them not to
331 can think of it as the order in which statements occur in the source
339 on different CPUs are never linked by po.  Also, it is by definition
340 an ordering so it cannot have any cycles.
349 code, and it retains their outlook to a considerable extent.  The
360 large number of compiler optimizations.  In particular, it is guaranteed
362 (unless it can prove the accesses will never be executed), it will not
364 by the C standard), and it will not introduce extraneous accesses.
383 not perfect; and under some circumstances it is possible for the
432 value it obtains must somehow affect what the second event does.
501 though it had been written there by an imaginary initial store that
508 and WRITE_ONCE() will prevent load-tearing; it's not possible to have:
560 memory contents.  Specifically, it requires that for each location in
566 To put it another way, for any variable x, the coherence order (co) of
601 because it means that the accesses to any single memory location obey
623 program order, it must also come later in x's coherence order and
638 program order, so it must not read from that store but rather from one
660 the r2 load in program order, so it must not read from a store that
664 READ_ONCE() in P1, on Itanium it sometimes could end up with r1 = 5
666 encoded in Itanium's Very-Long-Instruction-Word format, and it is yet
670 Just like the po relation, co is inherently an ordering -- it is not
704 The value loaded from x will be 0 (assuming cache coherence!), and it
715 relations; it is not independent.  Given a read event R and a write
736 When CPU C executes a store instruction, it tells the memory subsystem
740 time it is executed.)  The memory subsystem also determines where the
741 store falls in the location's coherence order.  In particular, it must
745 When a CPU executes a load instruction R, it first checks to see
747 same location, that come before R in program order.  If there are, it
757 propagating it to the CPU's local cache.  A local cache can take some
758 time to process the stores that it receives, and a store can't be used
759 to satisfy one of the CPU's loads until it has been processed.  On
769 final time it is forwarded or satisfied.
848 before W' does.  However, for different CPUs C and C', it does not
858 value before it knows what that value is, among other things.
898 each load between the store that it reads from and the following
908 relations not to have any cycles.  This means it must not be possible
917 Although it is not obvious, it can be shown that the converse is also
925 What does it mean to say that a read-modify-write (rmw) update, such
929 atomic_inc(&x) concurrently, it must be guaranteed that the final
940 In this example, CPU 0's increment effectively gets lost because it
941 occurs in between CPU 1's load and store.  To put it another way, the
998 store before it knows what value should be stored (in the case of a
999 data dependency), what location it should be stored into (in the case
1005 has no reason to respect a control dependency to a load, because it
1007 then ignore the result if it turns out that the second load shouldn't
1011 To be fair about it, all Linux-supported architectures do execute
1014 a particular location before it knows what that location is.  However,
1015 the split-cache design used by Alpha can cause it to behave in a way
1028 this situation we know it is possible for the CPU to execute R' before
1029 W, because it can forward the value that W will store to R'.  But it
1030 cannot execute R' before R, because it cannot forward the value before
1031 it knows what that value is, or that W and R' do access the same
1037 (In theory, a CPU might forward a store to a load when it runs across
1044 because it could tell that the store and the second load access the
1045 same location even before it knows what the location's address is.
1056 If it executed W first then the memory subsystem would respond to R's
1074 As mentioned above, the Alpha architecture is unique in that it does
1105 until it knows what location to load from, i.e., after executing its
1110 stores do reach P1's local cache in the proper order, it can happen
1116 program order, it appears that they aren't.
1128 the stores it has already received.  Thus, if the code was changed to:
1152 it has to wait for the local cache to process all the stores received
1181 order (i.e., W' overwrites W).  Nevertheless, it is possible for W' to
1185 R ->fre W means that W overwrites the value which R reads, but it
1190 The third relation included in hb is like ppo, in that it only links
1191 events that are on the same CPU.  However it is more difficult to
1192 explain, because it arises only indirectly from the requirement of
1193 cache coherence.  The relation is called prop, and it links two events
1289 would force the two loads to be executed in program order, and it
1295 outcome is impossible -- as it should be.
1341 in the order they execute means that it must not have cycles.  This
1345 instruction execution, but it is not included in hb.  It relies on the
1417 because it does not start and end on the same CPU.
1426 they execute means that it cannot have cycles.  This requirement is
1457 In particular, it is not possible for a critical section to both start
1481 The Grace Period Guarantee tells us that when this code runs, it will
1514 obscure, and we won't give it here.  It is closely related to the pb
1535 grace period which ends before Z begins.  (In fact it covers more than
1536 this, because it also includes cases where some store propagates to
1561 before any instruction po-after F can execute.  (However, it does not
1620 and F with E ->rcu-link F ->rcu-order E.  Or to put it a third way,
1625 Justifying the axiom isn't easy, but it is in fact a valid
1653 at the time that Z executes.  From this, it can be shown (see the
1744 The following instruction timing diagram shows how it might actually
1766 before it does, and the critical section in P2 both starts after P1's
1767 grace period does and ends after it does.
1791 This waits until s is equal to 0 and then atomically sets it to 1,
1802 which atomically sets s to 1 if it is currently equal to 0 and returns
1803 true if it succeeds (the read part of the cmpxchg operation acts as an
1820 CPUs, but the LKMM says it holds even when they are on the same CPU.
1873 and thus it could load y before x, obtaining r2 = 0 and r1 = 1.
1956 On the face of it, one would expect that when this code runs, the only
1973 And now it is obvious that this code runs the risk of dereferencing a
2008 which may execute concurrently; if it does then the LKMM says there is
2016 assuming that accesses may be concurrent unless it can prove they
2030 store, then even if X executes before Y it is still possible that X
2055 	empty sequence of xb links (again, if the sequence is empty it
2100 provides an xb link from Z to Y (i.e., it forces Z to execute before
2121 Fortunately, the compiler isn't completely free; it is subject to some
2122 limitations.  For one, it is not allowed to introduce a data race into
2124 race (if it could, memory models would be useless and no multithreaded
2125 code would be safe!).  For another, it cannot move a plain access past
2128 A compiler barrier is a kind of fence, but as the name implies, it
2129 only affects the compiler; it does not necessarily have any effect on
2133 it affects how the compiler generates the rest of the object code.
2149 The LKMM doesn't say much about the barrier() function, but it does
2150 require that all fences are also compiler barriers.  In addition, it
2187 	Consequently U's store to buf, no matter how it is carried out
2240 w-pre-bounded or w-post-bounded by a marked access, it also requires
2247 compiler if it ever did this.  Still, better safe than sorry.)
2271 until it knows what value p will hold.  Furthermore, without some
2303 is only equivalent to READ_ONCE().  While it is a marked access, it is
2334 Finally, it turns out there is a situation in which a plain write does
2335 not need to be w-post-bounded: when it is separated from the other
2364 to the READ_ONCE().  (Another way of putting it is that the fre link
2421 	If R and W are race candidates and it is possible to link R to
2423 	not allowed (i.e., a load cannot read from a store that it
2426 	If W and R are race candidates and it is possible to link W to
2429 	load must read from that store or one coherence-after it).
2431 	If W and W' are race candidates and it is possible to link W
2437 Perhaps it could say more (for example, plain accesses might
2438 contribute to the ppo relation), but at the moment it seems that this
2451 optional, and it doesn't require the events linked by the relation to
2455 hb anyway, and where the formal model adds prop into hb, it includes
2466 Although we didn't mention it above, the instruction execution
2475 it is not guaranteed that the load from y will execute after the
2490 relation, they do contribute to it indirectly.  Namely, when there is
2494 shaky, but essentially it says there is no way to generate object code
2496 pre-bounding by address dependencies, it is possible for the compiler
2557 Is it possible to end up with r0 = 36 at the end?  The LKMM will tell
2558 you it is not, but the model won't mention that this is because P1
2559 will self-deadlock in the executions where it stores 36 in y.