=========
MemorySSA
=========

.. contents::
   :local:

Introduction
============

``MemorySSA`` is an analysis that allows us to cheaply reason about the
interactions between various memory operations. Its goal is to replace
``MemoryDependenceAnalysis`` for most (if not all) use cases. This is because,
unless you're very careful, use of ``MemoryDependenceAnalysis`` can easily
result in quadratic-time algorithms in LLVM. Additionally, ``MemorySSA`` doesn't
have as many arbitrary limits as ``MemoryDependenceAnalysis``, so you should get
better results, too.

At a high level, one of the goals of ``MemorySSA`` is to provide an SSA-based
form for memory, complete with def-use and use-def chains, which
enables users to quickly find may-defs and may-uses of memory operations.
It can also be thought of as a way to cheaply give versions to the complete
state of heap memory, and to associate memory operations with those versions.

This document goes over how ``MemorySSA`` is structured, and some basic
intuition on how ``MemorySSA`` works.

A paper on MemorySSA (with notes about how it's implemented in GCC) `can be
found here <http://www.airs.com/dnovillo/Papers/mem-ssa.pdf>`_. Note, however,
that the paper is relatively out of date: it references multiple heap
partitions, but GCC eventually switched to using just one, as LLVM now does.
Like GCC's, LLVM's MemorySSA is intraprocedural.


MemorySSA Structure
===================

MemorySSA is a virtual IR. After it's built, ``MemorySSA`` will contain a
structure that maps ``Instruction``\ s to ``MemoryAccess``\ es, which are
``MemorySSA``'s parallel to LLVM ``Instruction``\ s.

Each ``MemoryAccess`` can be one of three types:

- ``MemoryPhi``
- ``MemoryUse``
- ``MemoryDef``

``MemoryPhi``\ s are ``PhiNode``\ s, but for memory operations. If at any
point we have two (or more) ``MemoryDef``\ s that could flow into a
``BasicBlock``, the block's top ``MemoryAccess`` will be a
``MemoryPhi``. As in LLVM IR, ``MemoryPhi``\ s don't correspond to any
concrete operation. As such, ``BasicBlock``\ s are mapped to ``MemoryPhi``\ s
inside ``MemorySSA``, whereas ``Instruction``\ s are mapped to ``MemoryUse``\ s
and ``MemoryDef``\ s.

Note also that in SSA, phi nodes merge must-reach definitions (that is,
definitions that *must* be new versions of variables). In MemorySSA, phi nodes
merge may-reach definitions (that is, until disambiguated, the versions that
reach a phi node may or may not clobber a given variable).

``MemoryUse``\ s are operations which use but don't modify memory. An example of
a ``MemoryUse`` is a ``load`` or a ``readonly`` function call.

``MemoryDef``\ s are operations which may either modify memory, or which
introduce some kind of ordering constraint. Examples of ``MemoryDef``\ s
include ``store``\ s, function calls, ``load``\ s with ``acquire`` (or higher)
ordering, volatile operations, memory fences, etc.

Every function has a special ``MemoryDef`` called ``liveOnEntry``. It dominates
every ``MemoryAccess`` in the function that ``MemorySSA`` is being run on, and
implies that we've hit the top of the function. It's the only ``MemoryDef``
that maps to no ``Instruction`` in LLVM IR. Use of ``liveOnEntry`` implies that
the memory being used is either undefined or defined before the function
begins.

An example of all of this overlaid on LLVM IR (obtained by running ``opt
-passes='print<memoryssa>' -disable-output`` on an ``.ll`` file) is below. When
viewing this example, it may be helpful to view it in terms of clobbers. The
operands of a given ``MemoryAccess`` are all (potential) clobbers of said
``MemoryAccess``, and the value produced by a ``MemoryAccess`` can act as a
clobber for other ``MemoryAccess``\ es. Another useful way of looking at it is
in terms of heap versions. In that view, the operands of a given
``MemoryAccess`` are the version of the heap before the operation, and
if the access produces a value, the value is the new version of the heap
after the operation.

.. code-block:: llvm

  define void @foo() {
  entry:
    %p1 = alloca i8
    %p2 = alloca i8
    %p3 = alloca i8
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p3
    br label %while.cond

  while.cond:
    ; 6 = MemoryPhi({entry,1},{if.end,4})
    br i1 undef, label %if.then, label %if.else

  if.then:
    ; 2 = MemoryDef(6)
    store i8 0, i8* %p1
    br label %if.end

  if.else:
    ; 3 = MemoryDef(6)
    store i8 1, i8* %p2
    br label %if.end

  if.end:
    ; 5 = MemoryPhi({if.then,2},{if.else,3})
    ; MemoryUse(5)
    %1 = load i8, i8* %p1
    ; 4 = MemoryDef(5)
    store i8 2, i8* %p2
    ; MemoryUse(1)
    %2 = load i8, i8* %p3
    br label %while.cond
  }

The ``MemorySSA`` IR is shown in comments that precede the instructions they map
to (if such an instruction exists). For example, ``1 = MemoryDef(liveOnEntry)``
is a ``MemoryAccess`` (specifically, a ``MemoryDef``), and it describes the LLVM
instruction ``store i8 0, i8* %p3``. Other places in ``MemorySSA`` refer to this
particular ``MemoryDef`` as ``1`` (much like how one can refer to ``load i8, i8*
%p1`` in LLVM with ``%1``). Again, ``MemoryPhi``\ s don't correspond to any LLVM
``Instruction``, so the line directly below a ``MemoryPhi`` isn't special.

Going from the top down:

- ``6 = MemoryPhi({entry,1},{if.end,4})`` notes that, when entering
  ``while.cond``, the reaching definition for it is either ``1`` or ``4``. This
  ``MemoryPhi`` is referred to in the textual IR by the number ``6``.
- ``2 = MemoryDef(6)`` notes that ``store i8 0, i8* %p1`` is a definition,
  and its reaching definition is ``6``, the ``MemoryPhi`` at the top of
  ``while.cond``. (See the `Build-time use optimization`_ and `Precision`_
  sections below for why this ``MemoryDef`` isn't linked to a separate,
  disambiguated ``MemoryPhi``.)
- ``3 = MemoryDef(6)`` notes that ``store i8 1, i8* %p2`` is a definition; its
  reaching definition is also ``6``.
- ``5 = MemoryPhi({if.then,2},{if.else,3})`` notes that the clobber before
  this block could either be ``2`` or ``3``.
- ``MemoryUse(5)`` notes that ``load i8, i8* %p1`` is a use of memory, and that
  it's clobbered by ``5``.
- ``4 = MemoryDef(5)`` notes that ``store i8 2, i8* %p2`` is a definition; its
  reaching definition is ``5``.
- ``MemoryUse(1)`` notes that ``load i8, i8* %p3`` is just a user of memory,
  and the last thing that could clobber this use is above ``while.cond`` (i.e.
  the store to ``%p3``). In heap versioning parlance, it really only depends on
  heap version 1, and is unaffected by the new heap versions generated since
  then.

As an aside, ``MemoryAccess`` is a ``Value`` mostly for convenience; it's not
meant to interact with LLVM IR.
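
The walkthrough above can be condensed into a tiny toy model. This is a plain
Python sketch, not LLVM's API: the ``Access`` class and its fields are invented
for illustration, and the loop's ``MemoryPhi``\ s are flattened into a
straight-line chain of heap versions.

```python
# Toy model of the def-use chains in the example above. NOT LLVM's API:
# the class and its fields are invented, and the MemoryPhis are
# flattened into a straight-line chain of heap versions.

class Access:
    def __init__(self, name, defining=None, may_clobber=()):
        self.name = name                     # e.g. "1", "5", "liveOnEntry"
        self.defining = defining             # operand: previous heap version
        self.may_clobber = set(may_clobber)  # locations this access may write

live_on_entry = Access("liveOnEntry")
d1 = Access("1", live_on_entry, {"%p3"})   # store i8 0, i8* %p3
phi5 = Access("5", d1, {"%p1", "%p2"})     # merges the defs of %p1 / %p2
d4 = Access("4", phi5, {"%p2"})            # store i8 2, i8* %p2

def clobber_for(start, loc):
    """Walk up heap versions until one may clobber `loc`."""
    acc = start
    while acc.defining is not None and loc not in acc.may_clobber:
        acc = acc.defining
    return acc

# The final `load i8, i8* %p3` walks past 4 and 5 and lands on 1,
# matching the `MemoryUse(1)` annotation; the load of %p1 stops at 5.
print(clobber_for(d4, "%p3").name)    # prints 1
print(clobber_for(phi5, "%p1").name)  # prints 5
```

Note how the walk reproduces both annotations from the example: the load of
``%p3`` depends only on heap version ``1``, while the load of ``%p1`` is
clobbered by the ``MemoryPhi`` ``5``.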

Design of MemorySSA
===================

``MemorySSA`` is an analysis that can be built for any arbitrary function. When
it's built, it does a pass over the function's IR in order to build up its
mapping of ``MemoryAccess``\ es. You can then query ``MemorySSA`` for things
like the dominance relation between ``MemoryAccess``\ es, and get the
``MemoryAccess`` for any given ``Instruction``.

When ``MemorySSA`` is done building, it also hands you a ``MemorySSAWalker``
that you can use (see below).


The walker
----------

A structure that helps ``MemorySSA`` do its job is the ``MemorySSAWalker``, or
the walker for short. The goal of the walker is to provide answers to clobber
queries beyond what's represented directly by ``MemoryAccess``\ es. For example,
given:

.. code-block:: llvm

  define void @foo() {
    %a = alloca i8
    %b = alloca i8

    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %a
    ; 2 = MemoryDef(1)
    store i8 0, i8* %b
  }

The store to ``%a`` is clearly not a clobber for the store to ``%b``. It's the
walker's goal to figure this out, and to return ``liveOnEntry`` when queried
for the clobber of ``MemoryAccess`` ``2``.

By default, ``MemorySSA`` provides a walker that can optimize ``MemoryDef``\ s
and ``MemoryUse``\ s by consulting whatever alias analysis stack you happen to
be using. Walkers were built to be flexible, though, so it's entirely reasonable
(and expected) to create more specialized walkers (e.g. one that specifically
queries ``GlobalsAA``, one that always stops at ``MemoryPhi`` nodes, etc.).


Locating clobbers yourself
^^^^^^^^^^^^^^^^^^^^^^^^^^

If you choose to make your own walker, you can find the clobber for a
``MemoryAccess`` by walking every ``MemoryDef`` that dominates said
``MemoryAccess``. The structure of ``MemoryDef``\ s makes this relatively
simple; they ultimately form a linked list of every clobber that dominates the
``MemoryAccess`` that you're trying to optimize. In other words, the
``definingAccess`` of a ``MemoryDef`` is always the nearest dominating
``MemoryDef`` or ``MemoryPhi`` of said ``MemoryDef``.
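
As a rough sketch of that traversal (plain Python, not LLVM's API; the class,
fields, and the stand-in alias query are all invented), using the two-store
``%a``/``%b`` example from the walker section above:

```python
# Rough sketch of the clobber walk (NOT LLVM's API; names invented).
# MemoryDefs form a linked list through their defining accesses, ending
# at liveOnEntry, so finding a clobber is a list traversal.

class MemoryDef:
    def __init__(self, name, defining=None, written=None):
        self.name = name
        self.defining = defining  # nearest dominating MemoryDef/MemoryPhi
        self.written = written    # the location this def writes, if known

live_on_entry = MemoryDef("liveOnEntry")
store_a = MemoryDef("1", live_on_entry, "%a")  # store i8 0, i8* %a
store_b = MemoryDef("2", store_a, "%b")        # store i8 0, i8* %b

def may_alias(loc1, loc2):
    # Stand-in for a real alias analysis query: distinct allocas don't alias.
    return loc1 == loc2

def find_clobber(d):
    """Walk dominating defs; return the first that may alias `d.written`."""
    cur = d.defining
    while cur.defining is not None and not may_alias(cur.written, d.written):
        cur = cur.defining
    return cur

# The store to %a doesn't clobber the store to %b, so the walk for
# MemoryDef 2 reaches liveOnEntry.
print(find_clobber(store_b).name)  # prints liveOnEntry
```

A real walker would stop at ``MemoryPhi``\ s (or recurse into their operands)
and would consult the configured alias analysis stack instead of comparing
locations for equality.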


Build-time use optimization
---------------------------

``MemorySSA`` will optimize some ``MemoryAccess``\ es at build time.
Specifically, we optimize the operand of every ``MemoryUse`` to point to the
actual clobber of said ``MemoryUse``. This can be seen in the above example; the
second ``MemoryUse`` in ``if.end`` has an operand of ``1``, which is a
``MemoryDef`` from the entry block. This is done to make walking, value
numbering, etc., faster and easier.

It is not possible to optimize ``MemoryDef``\ s in the same way, as we
restrict ``MemorySSA`` to one heap variable and, thus, one phi node
per block.


Invalidation and updating
-------------------------

Because ``MemorySSA`` keeps track of LLVM IR, it needs to be updated whenever
the IR is updated. "Update", in this case, includes the addition, deletion, and
motion of ``Instruction``\ s. The update API is being built on an as-needed
basis. If you'd like examples, ``GVNHoist`` is a user of ``MemorySSA``'s update
API.


Phi placement
^^^^^^^^^^^^^

``MemorySSA`` only places ``MemoryPhi``\ s where they're actually
needed. That is, it is a pruned SSA form, like LLVM's SSA form. For
example, consider:

.. code-block:: llvm

  define void @foo() {
  entry:
    %p1 = alloca i8
    %p2 = alloca i8
    %p3 = alloca i8
    ; 1 = MemoryDef(liveOnEntry)
    store i8 0, i8* %p3
    br label %while.cond

  while.cond:
    ; 3 = MemoryPhi({entry,1},{if.end,2})
    br i1 undef, label %if.then, label %if.else

  if.then:
    br label %if.end

  if.else:
    br label %if.end

  if.end:
    ; MemoryUse(1)
    %1 = load i8, i8* %p1
    ; 2 = MemoryDef(3)
    store i8 2, i8* %p2
    ; MemoryUse(1)
    %2 = load i8, i8* %p3
    br label %while.cond
  }

Because we removed the stores from ``if.then`` and ``if.else``, a ``MemoryPhi``
for ``if.end`` would be pointless, so we don't place one. So, if you need to
place a ``MemoryDef`` in ``if.then`` or ``if.else``, you'll also need to create
a ``MemoryPhi`` for ``if.end``.

If it turns out that this is a large burden, we can just place ``MemoryPhi``\ s
everywhere. Because we have walkers that are capable of optimizing above said
phis, doing so shouldn't prohibit optimizations.
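
For intuition, pruned phi placement can be sketched with the classic
iterated-dominance-frontier construction. The following is an illustrative
Python sketch, not LLVM's implementation (all helper names are invented); it
uses the CFG of the example above, where ``MemoryDef``\ s live in ``entry`` and
``if.end``:

```python
# Sketch of pruned phi placement via iterated dominance frontiers.
# Illustrative only -- not LLVM's implementation. The CFG is the one
# from the example above.

CFG = {
    "entry":      ["while.cond"],
    "while.cond": ["if.then", "if.else"],
    "if.then":    ["if.end"],
    "if.else":    ["if.end"],
    "if.end":     ["while.cond"],
}
ENTRY = "entry"
preds = {n: [p for p in CFG if n in CFG[p]] for n in CFG}

# Iterative dominator computation.
dom = {n: set(CFG) for n in CFG}
dom[ENTRY] = {ENTRY}
changed = True
while changed:
    changed = False
    for n in CFG:
        if n == ENTRY:
            continue
        new = {n} | set.intersection(*(dom[p] for p in preds[n]))
        if new != dom[n]:
            dom[n], changed = new, True

def idom(n):
    """Immediate dominator: the strict dominator all others dominate."""
    strict = dom[n] - {n}
    return next(d for d in strict if strict <= dom[d])

# Dominance frontiers: walk up from each predecessor of a join block.
df = {n: set() for n in CFG}
for n in CFG:
    if len(preds[n]) >= 2:
        for p in preds[n]:
            runner = p
            while runner != idom(n):
                df[runner].add(n)
                runner = idom(runner)

def phi_blocks(def_blocks):
    """Blocks needing a MemoryPhi: the iterated DF of the def blocks."""
    phis, work = set(), list(def_blocks)
    while work:
        for b in df[work.pop()]:
            if b not in phis:
                phis.add(b)
                work.append(b)
    return phis

# With defs only in entry and if.end, only while.cond needs a MemoryPhi.
print(phi_blocks({"entry", "if.end"}))
```

With the stores removed from ``if.then`` and ``if.else``, the iterated
dominance frontier of ``{entry, if.end}`` contains only ``while.cond``, which
is exactly where the example places its single ``MemoryPhi``. Adding a def
back into ``if.then`` would put ``if.end`` in the frontier, requiring a phi
there too.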


Non-Goals
---------

``MemorySSA`` is meant to reason about the relation between memory
operations, and to enable quicker querying.
It isn't meant to be the single source of truth for all potential memory-related
optimizations. Specifically, care must be taken when trying to use ``MemorySSA``
to reason about atomic or volatile operations, as in:

.. code-block:: llvm

  define i8 @foo(i8* %a) {
  entry:
    br i1 undef, label %if.then, label %if.end

  if.then:
    ; 1 = MemoryDef(liveOnEntry)
    %0 = load volatile i8, i8* %a
    br label %if.end

  if.end:
    %av = phi i8 [0, %entry], [%0, %if.then]
    ret i8 %av
  }

Going solely by ``MemorySSA``'s analysis, hoisting the ``load`` to ``entry`` may
seem legal. Because it's a volatile load, though, it's not.


Design tradeoffs
----------------

Precision
^^^^^^^^^

``MemorySSA`` in LLVM deliberately trades off precision for speed.
Let us think about memory variables as if they were disjoint partitions of the
heap (that is, if you have one variable, as above, it represents the entire
heap, and if you have multiple variables, each one represents some
disjoint portion of the heap).

First, because alias analysis results can conflict with each other, and
each result may be what some analysis wants (e.g.,
TBAA may say no-alias, and something else may say must-alias), it is
not possible to partition the heap the way every optimization wants.
Second, some alias analysis results are not transitive (i.e., A noalias B
and B noalias C does not imply A noalias C), so it is not possible to
come up with a precise partitioning in all cases without variables to
represent every pair of possible aliases. Thus, partitioning
precisely may require introducing at least N^2 new virtual variables,
phi nodes, etc.
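
The non-transitivity point can be made concrete with a toy sketch (plain
Python, not LLVM code; the alias facts are made up). Disjoint partitions force
aliasing to behave like an equivalence relation, so merging everything that
may alias puts ``A`` and ``C`` in the same partition even though the analysis
said they don't alias:

```python
# Toy demonstration: alias results need not be transitive, so disjoint
# heap partitions can't represent them faithfully. NOT LLVM code.

# Suppose alias analysis says: A aliases B, B aliases C, but A noalias C.
may_alias = {("A", "B"), ("B", "C")}

# Partitioning merges everything that may alias (simple union-find)...
parent = {v: v for v in "ABC"}

def find(v):
    while parent[v] != v:
        v = parent[v]
    return v

for x, y in may_alias:
    parent[find(x)] = find(y)

# ...which lands A and C in the same partition, losing "A noalias C".
print(find("A") == find("C"))  # prints True
```

Keeping the "A noalias C" fact would require a separate variable for that pair,
which is where the N^2 blow-up comes from.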

Each of these variables may be clobbered at multiple def sites.

To give an example, if you were to split up struct fields into
individual variables, all aliasing operations that may-def multiple struct
fields will may-def more than one of them. This is pretty common (calls,
copies, field stores, etc.).

Experience with SSA forms for memory in other compilers has shown that
it is simply not possible to do this precisely, and in fact, doing it
precisely is not worth it, because now all the optimizations have to
walk tons and tons of virtual variables and phi nodes.

So we partition. Once you decide to partition, experience has again
shown that there is no point in partitioning to more than one variable:
it simply generates more IR, and optimizations still have to query
something to disambiguate further anyway.

As a result, LLVM partitions to one variable.
Use Optimization
^^^^^^^^^^^^^^^^

Unlike other partitioned forms, LLVM's ``MemorySSA`` does make one
useful guarantee: all loads are optimized to point at the thing that
actually clobbers them. This gives some nice properties. For example,
for a given store, you can find all loads actually clobbered by that
store by walking the immediate uses of the store.
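
A toy model of that guarantee (plain Python, not LLVM's API; the classes and
names are invented): because each ``MemoryUse``'s operand is its actual
clobber, a def's user list enumerates exactly the loads it clobbers, with no
walking required.

```python
# Toy model of the use-optimization guarantee. NOT LLVM's API: the
# classes and names are invented for illustration.

class MemoryDef:
    def __init__(self, name):
        self.name = name
        self.users = []  # MemoryUses whose operand is this def

class MemoryUse:
    def __init__(self, name, clobbering_def):
        self.name = name
        self.defining = clobbering_def      # actual clobber, by guarantee
        clobbering_def.users.append(self)

store = MemoryDef("1")               # e.g. a store to some location
load_a = MemoryUse("load_a", store)  # loads actually clobbered by it
load_b = MemoryUse("load_b", store)

def clobbered_loads(d):
    """All loads clobbered by `d`: just its immediate users, no walking."""
    return [u.name for u in d.users]

print(clobbered_loads(store))  # prints ['load_a', 'load_b']
```

Without the guarantee, a use's operand could be any dominating def between it
and its real clobber, and answering the same query would require a walk per
use.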