# Memory management and object layout. Design.

## Overview

Panda Runtime should scale across different devices and OSes, so we need an abstraction layer over OS memory management.
For now, all targets assume interaction with the user, so the STW pause metric is constrained.
Memory resources are very limited on the IoT target, so we should focus on reducing memory overhead (fragmentation and object header size).

The main components of Panda memory management and object model:
* [Allocators](#allocators)
* [GC](#gc)
* [Object header](#object-header)

Panda runtime works with these memory types:
* internal memory for the runtime (ArenaAllocators for JIT, etc.)
* application memory (i.e., memory for objects created by the application)
* native memory via JNI/FFI
* memory for JITed code

![High-level design](./images/panda-mm-overview.png "Memory management high-level design")

There are several modes for memory management:
- base mode
  - allocators with some average metrics and profile-based configuration (if available)
  - some baseline GC with profile-based configuration (if available)
- performance mode
  - allocators with low allocation cost
  - low-pause/pauseless GC (for games) or GC with high throughput and acceptable STW pause (for non-game applications)
- power-saving mode
  - energy-efficient allocators (if possible)
  - special thresholds to improve power efficiency

The mode is chosen at startup (we'll use profile info from the cloud for that).

## Object header

For the rationale, see [here](memory-management-overview.md).

### Requirements

* Support all features required by the Runtime
* Similar design for two different platforms - high-end and low-end
* Compact Object Header for the low-end target

### Specification / Implementation

**Common ObjectHeader methods:**

* Get/Set Mark or Class word
* Get the size of the object header and of the object itself
* Get/Generate an object hash

**Methods specific to Class word:**

* Get different object fields
* Return object type
* Verify object
* Check whether it is a subclass, whether it is an array, etc.
* Get field address

**Methods specific to Mark word:**

* Object locked/unlocked
* Marked for GC or not
* Monitor functions (get monitor, notify, notify all, wait)
* Forwarded or not

The Mark word depends on the configuration and can have different sizes and layouts. Here are all possible configurations:

128 bits object header for high-end devices (64 bits pointers):
```
|--------------------------------------------------------------------------------------|--------------------|
|                                   Object Header (128 bits)                           |        State       |
|-----------------------------------------------------|--------------------------------|--------------------|
|                 Mark Word (64 bits)                 |      Class Word (64 bits)      |                    |
|-----------------------------------------------------|--------------------------------|--------------------|
|               nothing:61          | GC:1 | state:00 |     OOP to metadata object     |       Unlock       |
|-----------------------------------------------------|--------------------------------|--------------------|
|    tId:29    |      Lcount:32     | GC:1 | state:00 |     OOP to metadata object     |  Lightweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|               Monitor:61          | GC:1 | state:01 |     OOP to metadata object     |  Heavyweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|                Hash:61            | GC:1 | state:10 |     OOP to metadata object     |       Hashed       |
|-----------------------------------------------------|--------------------------------|--------------------|
|           Forwarding address:62          | state:11 |     OOP to metadata object     |         GC         |
|-----------------------------------------------------|--------------------------------|--------------------|
```
64 bits object header for high-end devices (32 bits pointers):
```
|--------------------------------------------------------------------------------------|--------------------|
|                                   Object Header (64 bits)                            |        State       |
|-----------------------------------------------------|--------------------------------|--------------------|
|                 Mark Word (32 bits)                 |      Class Word (32 bits)      |                    |
|-----------------------------------------------------|--------------------------------|--------------------|
|               nothing:29          | GC:1 | state:00 |     OOP to metadata object     |       Unlock       |
|-----------------------------------------------------|--------------------------------|--------------------|
|    tId:13    |      Lcount:16     | GC:1 | state:00 |     OOP to metadata object     |  Lightweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|               Monitor:29          | GC:1 | state:01 |     OOP to metadata object     |  Heavyweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|                Hash:29            | GC:1 | state:10 |     OOP to metadata object     |       Hashed       |
|-----------------------------------------------------|--------------------------------|--------------------|
|           Forwarding address:30          | state:11 |     OOP to metadata object     |         GC         |
|-----------------------------------------------------|--------------------------------|--------------------|
```

However, we can also support this version of the object header (the Hash is stored just after the object in memory if the object was relocated):
```
|--------------------------------------------------------------------------------------|--------------------|
|                                   Object Header (64 bits)                            |        State       |
|-----------------------------------------------------|--------------------------------|--------------------|
|                 Mark Word (32 bits)                 |      Class Word (32 bits)      |                    |
|-----------------------------------------------------|--------------------------------|--------------------|
|        nothing:28        | Hash:1 | GC:1 | state:00 |     OOP to metadata object     |       Unlock       |
|-----------------------------------------------------|--------------------------------|--------------------|
|  tId:13  |   LCount:15   | Hash:1 | GC:1 | state:00 |     OOP to metadata object     |  Lightweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|        Monitor:28        | Hash:1 | GC:1 | state:01 |     OOP to metadata object     |  Heavyweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|   Forwarding address:28  | Hash:1 | GC:1 | state:11 |     OOP to metadata object     |         GC         |
|-----------------------------------------------------|--------------------------------|--------------------|
```
This scenario decreases the size of a Monitor instance, and we don't need to save the Hash somewhere during Lightweight Lock either.
Unfortunately, it requires extra memory after GC has moved the object (where the original hash value will be stored) and also requires extra GC work.
However, this scenario will be useful if we have an allocator and GC that keep such situations to a minimum.

32 bits object header for low-end devices:
```
|--------------------------------------------------------------------------------------|--------------------|
|                                   Object Header (32 bits)                            |        State       |
|-----------------------------------------------------|--------------------------------|--------------------|
|                 Mark Word (16 bits)                 |      Class Word (16 bits)      |                    |
|-----------------------------------------------------|--------------------------------|--------------------|
|               nothing:13          | GC:1 | state:00 |     OOP to metadata object     |       Unlock       |
|-----------------------------------------------------|--------------------------------|--------------------|
|    thread Id:7    | Lock Count:6  | GC:1 | state:00 |     OOP to metadata object     |  Lightweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|               Monitor:13          | GC:1 | state:01 |     OOP to metadata object     |  Heavyweight Lock  |
|-----------------------------------------------------|--------------------------------|--------------------|
|                Hash:13            | GC:1 | state:10 |     OOP to metadata object     |       Hashed       |
|-----------------------------------------------------|--------------------------------|--------------------|
|         Forwarding address:14            | state:11 |     OOP to metadata object     |         GC         |
|-----------------------------------------------------|--------------------------------|--------------------|
```

States description:

Unlock - the object is not locked.

Lightweight Lock - the object is locked by one thread.

Heavyweight Lock - there is contention for this object (several threads try to lock it).

Hashed - the object has been hashed, and the hash has been stored inside the MarkWord.

GC - the object has been moved by GC.
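As an illustration, the 32-bit Mark Word layout (64 bits object header) could be packed and unpacked with a few bit helpers. This is only a sketch: the names `State`, `MakeLightLock`, `MakeHashed` and the getters are hypothetical and are not taken from the Panda sources. Note that Unlock and Lightweight Lock share `state:00` in the tables, so the sketch uses one enumerator for both.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for the 32-bit Mark Word:
// | tId:13 | Lcount:16 | GC:1 | state:2 |   (Lightweight Lock)
// | Hash:29            | GC:1 | state:2 |   (Hashed)
enum class State : uint32_t {
    UNLOCKED_OR_LIGHT = 0,  // state:00, shared by Unlock and Lightweight Lock
    HEAVY_LOCK = 1,         // state:01
    HASHED = 2,             // state:10
    FORWARDED = 3,          // state:11
};

constexpr uint32_t STATE_MASK = 0x3U;
constexpr uint32_t GC_SHIFT = 2;       // GC bit sits right above the state bits
constexpr uint32_t PAYLOAD_SHIFT = 3;  // 29-bit payload above GC bit and state
constexpr uint32_t LCOUNT_BITS = 16;
constexpr uint32_t TID_BITS = 13;
constexpr uint32_t HASH_BITS = 29;

constexpr State GetState(uint32_t mark) { return static_cast<State>(mark & STATE_MASK); }
constexpr bool IsMarkedForGC(uint32_t mark) { return ((mark >> GC_SHIFT) & 1U) != 0; }
constexpr uint32_t GetThreadId(uint32_t mark) { return mark >> (PAYLOAD_SHIFT + LCOUNT_BITS); }
constexpr uint32_t GetLockCount(uint32_t mark) { return (mark >> PAYLOAD_SHIFT) & ((1U << LCOUNT_BITS) - 1U); }
constexpr uint32_t GetHash(uint32_t mark) { return mark >> PAYLOAD_SHIFT; }

inline uint32_t MakeLightLock(uint32_t tid, uint32_t lcount, bool gc) {
    assert(tid < (1U << TID_BITS) && lcount < (1U << LCOUNT_BITS));
    return (tid << (PAYLOAD_SHIFT + LCOUNT_BITS)) | (lcount << PAYLOAD_SHIFT) |
           (static_cast<uint32_t>(gc) << GC_SHIFT) | static_cast<uint32_t>(State::UNLOCKED_OR_LIGHT);
}

inline uint32_t MakeHashed(uint32_t hash, bool gc) {
    assert(hash < (1U << HASH_BITS));
    return (hash << PAYLOAD_SHIFT) | (static_cast<uint32_t>(gc) << GC_SHIFT) |
           static_cast<uint32_t>(State::HASHED);
}
```

The other configurations differ only in field widths, so the same helpers could be parameterized by the shift and width constants.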

## String and array representation

Array:
```
+------------------------------------------------+
|             Object Header (64 bits)            |
|------------------------------------------------|
|                Length (32 bits)                |
|------------------------------------------------|
|                  Array payload                 |
+------------------------------------------------+
```
String:

If we don't use string compression, each string has this structure:
```
+------------------------------------------------+
|             Object Header (64 bits)            |
|------------------------------------------------|
|                Length (32 bits)                |
|------------------------------------------------|
|           String hash value (32 bits)          |
|------------------------------------------------|
|                  String payload                |
+------------------------------------------------+
```
If we use string compression, each string has this structure:
```
+------------------------------------------------+
|             Object Header (64 bits)            |
|------------------------------------------------|
|                Length (31 bits)                |
|------------------------------------------------|
|             Compressed bit (1 bit)             |
|------------------------------------------------|
|           String hash value (32 bits)          |
|------------------------------------------------|
|                  String payload                |
+------------------------------------------------+
```
If the compressed bit is 1, the string has a compressed payload - 8 bits for each element.

If the compressed bit is 0, the string has not been compressed - its payload consists of 16-bit elements.
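A minimal sketch of how the 31-bit length and the compressed bit could share one 32-bit word. Two things here are assumptions, not part of the design above: the compressed flag is placed in the lowest bit, and a string is treated as compressible when every 16-bit element fits in 8 bits. All function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Assumption: bit 0 holds the compressed flag, bits 1..31 hold the length.
inline uint32_t PackLength(uint32_t length, bool compressed) {
    assert(length < (1U << 31));
    return (length << 1) | (compressed ? 1U : 0U);
}
inline uint32_t UnpackLength(uint32_t word) { return word >> 1; }
inline bool IsCompressed(uint32_t word) { return (word & 1U) != 0; }

// Assumption: a string can use 8-bit elements only if every element fits in 8 bits.
inline bool CanCompress(const std::u16string &s) {
    for (char16_t c : s) {
        if (c > 0xFF) {
            return false;
        }
    }
    return true;
}

// Payload size in bytes: 1 byte per element if compressed, 2 bytes otherwise.
inline size_t PayloadBytes(uint32_t word) {
    return static_cast<size_t>(UnpackLength(word)) * (IsCompressed(word) ? 1U : 2U);
}
```

With this packing, a compressed string of 5 elements has a 5-byte payload, while the uncompressed form takes 10 bytes.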

One idea for string representation is to use the hash state inside the Mark Word as a container for the string hash value (of course, we should then save the object hash somewhere else if it is needed, or use the string hash value as the object hash value).

String:
```
+------------------------------------------------+
| String Hash | GC bit (1 bit) | Status (2 bits) |    <--- Mark Word (32 bits)
|------------------------------------------------|
|              Class Word (32 bits)              |
|------------------------------------------------|
|                Length (32 bits)                |
|------------------------------------------------|
|                  String payload                |
+------------------------------------------------+
```

See research [here](./memory-management-overview.md#possible-string-objects-size-reduction).
For JS strings and arrays, see [here](./memory-management-overview.md#js-strings-and-arrays).

## Allocators

Requirements:
- simple and effective allocator for JIT
  - no need to clean up memory manually
  - efficient all-at-once deallocation to improve performance
- reasonable fragmentation
- scalable
- support for pool extension and reduction (i.e., we can add another memory chunk to the allocator, and it can give it back to the global "pool" when it is empty)
- cache awareness
- (optional) power efficiency

All allocators should have these methods:
- a method which allocates ```X``` bytes
- a method which allocates ```X``` bytes with the specified alignment
- a method which frees the memory pointed to by a pointer (ArenaAllocator is an exception)

### Arena Allocator

It is a region-based allocator, i.e., all objects allocated in a region/arena can be efficiently deallocated all at once.
Deallocating a specific object has no effect in this allocator.
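To make the idea concrete, here is a minimal bump-pointer arena sketch. It is not the Panda `ArenaAllocator`: the class name `Arena`, its fixed capacity and its method names are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical fixed-capacity arena with bump-pointer allocation.
class Arena {
public:
    explicit Arena(size_t capacity) : buf_(new uint8_t[capacity]), capacity_(capacity) {}

    // Allocates `size` bytes aligned to `alignment` (a power of two).
    // Returns nullptr when the arena is exhausted.
    void *Alloc(size_t size, size_t alignment = alignof(std::max_align_t)) {
        assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
        uintptr_t base = reinterpret_cast<uintptr_t>(buf_.get());
        uintptr_t mask = static_cast<uintptr_t>(alignment) - 1;
        size_t offset = static_cast<size_t>(((base + used_ + mask) & ~mask) - base);
        if (offset > capacity_ || size > capacity_ - offset) {
            return nullptr;
        }
        used_ = offset + size;
        return buf_.get() + offset;
    }

    // Freeing a single object is a no-op in a region-based allocator.
    void Free(void * /* ptr */) {}

    // Deallocates everything at once by resetting the bump pointer.
    void Reset() { used_ = 0; }

    size_t Used() const { return used_; }

private:
    std::unique_ptr<uint8_t[]> buf_;
    size_t capacity_;
    size_t used_ = 0;
};
```

Allocation is a pointer bump and releasing the whole arena is a single assignment.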

JIT flow looks like this:
```
IR -> Optimizations -> Code
```

After code generation, all internal structures of JIT should be deleted.
So, if we can hold JIT memory usage at some reasonable level, Arena Allocator fits the JIT allocator requirements ideally.

### Code Allocator

Requirements:
- should allocate executable memory for JITed code

This allocator can be tuned to provide more performance.
For example, if we have some callgraph info, we can use it to allocate code for connected methods so that the potential cache-collision rate is minimized.

### Main allocator

Requirements:
- acceptable fragmentation
- acceptable allocation cost
- possibility to iterate over the heap
- scalable

Desired:
- flexible allocation size list (required to support profile-guided allocation to improve fragmentation and power efficiency)

#### Implementation details

Each allocator works over some pool.

Size classes (the numbers are just informational - they will be tuned after performance analysis):
- small (1b - 4Kb)
- large (4Kb - 4Mb)
- humongous (4Mb - Inf)

A size-segregated algorithm is used for the small size class to reduce fragmentation.
Small objects are joined in "runs" (not an individual element for each size, but some "container" with X elements of the same size in it).
```
+--------------------------------------+-----------------+-----------------+-----+-----------------+
| header for run of objects with size X| obj with size X | free mem size X | ... | obj with size X |
+--------------------------------------+-----------------+-----------------+-----+-----------------+
```

Large objects are not joined in "runs".

Humongous objects can be allocated just by proxying requests to the OS (but keeping a reference to them somewhere) or by using a special allocator.
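The size-class split above can be sketched as follows. The thresholds are the informational numbers from the list, with the boundaries assumed to be inclusive on the lower class; the power-of-two slot sizes for runs and the 8-byte minimum slot are also assumptions of this sketch, not a design decision. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

enum class SizeClass { SMALL, LARGE, HUMONGOUS };

// Informational thresholds from the list above (to be tuned later).
constexpr size_t SMALL_MAX = 4U * 1024U;           // 4Kb
constexpr size_t LARGE_MAX = 4U * 1024U * 1024U;   // 4Mb

inline SizeClass Classify(size_t size) {
    if (size <= SMALL_MAX) {
        return SizeClass::SMALL;
    }
    if (size <= LARGE_MAX) {
        return SizeClass::LARGE;
    }
    return SizeClass::HUMONGOUS;
}

// Assumption: runs exist for power-of-two slot sizes starting at 8 bytes.
// Returns the slot size of the run that would serve a small request.
inline size_t SlotSizeForSmall(size_t size) {
    assert(size != 0 && size <= SMALL_MAX);
    size_t slot = 8;
    while (slot < size) {
        slot <<= 1;
    }
    return slot;
}
```

For example, a 24-byte request would be served from the run of 32-byte slots, which wastes 8 bytes per object. This internal fragmentation is what profile-guided size tables try to reduce.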

_Note: the following applies to non-embedded targets_

Each thread maintains a cache for objects (at least for all objects of small size).
This should reduce synchronization overhead.

Locking policy:
- locks should protect localized/categorized resources (for example, one lock for each size in the small size class)
- avoid holding locks during memory-related system calls (mmap, etc.)

#### Profile-guided allocation

We can use profile information about allocation sizes to improve main allocator metrics.
If we see a very popular allocation size in the profile, we can add it as an explicit segregated size and reduce fragmentation.
To make it work, the allocator should support a dynamic size table (or should be able to choose from statically predefined ones).
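A dynamic size table could look roughly like the sketch below. The class `SizeTable` and its methods are hypothetical; a real allocator would also have to create a run for every newly added size.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical dynamic table of segregated slot sizes, kept sorted.
class SizeTable {
public:
    explicit SizeTable(std::vector<size_t> sizes) : sizes_(std::move(sizes)) {
        std::sort(sizes_.begin(), sizes_.end());
    }

    // Adds an explicit segregated size, e.g. one the profile reports as popular.
    void AddSize(size_t size) {
        auto it = std::lower_bound(sizes_.begin(), sizes_.end(), size);
        if (it == sizes_.end() || *it != size) {
            sizes_.insert(it, size);
        }
    }

    // Returns the smallest slot size that fits the request,
    // or 0 if the request is too large for this table.
    size_t SlotFor(size_t request) const {
        auto it = std::lower_bound(sizes_.begin(), sizes_.end(), request);
        return it == sizes_.end() ? 0 : *it;
    }

private:
    std::vector<size_t> sizes_;
};
```

With the table `{16, 32, 64}`, a 40-byte request lands in a 64-byte slot and wastes 24 bytes. After `AddSize(40)`, the same request fits exactly.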

### Energy efficiency in allocators

As shown in this [paper](https://www.cs.york.ac.uk/rts/docs/CODES-EMSOFT-CASES-2006/emsoft/p215.pdf), by changing
various settings of the allocator, it is possible to get a very energy-efficient allocator.
There is no universal approach in this paper, but we can try to mix the approach from this paper
with our profile-guided approach.

## Pools and OS interactions

All used memory is divided into chunks. The main allocator can extend its pool with these chunks.

For cases when we can run short of memory, we should have some preallocated buffer which allows the Runtime to continue working while GC is trying to free memory.

Note:
For IoT systems without an MMU, Pools should have a non-trivial implementation.

For some systems/languages, a context-scoped allocator will be implemented.
This allocator works over some arena, and after the program leaves the context, this arena is returned to the OS.

## Spaces

- MemMapSpace, shared between these:
  - Code space (executable)
  - Compiler Internal Space (linked list of arenas)
  - Internal memory space for non-compiler part of runtime (including GC internals)
  - Object space
     - BumpPointerSpace
     - Regular object space
     - Humongous objects space
     - TLAB space (optional)
     - RegionSpace (optional for some GCs)
     - Non-moving space
- MallocMemSpace
  - Humongous objects space (optional)

Logical GC spaces:
- young space (optional for some GCs)
- survivor space (optional)
- tenured space

## GC

Garbage collector (GC) automatically recycles memory that it can prove will never be used again.

GC development will be an iterative process.

Common requirements:
- precise GC (see [glossary](./glossary.md#memory-management-terms))
- GC should support various [modes](#overview) (performance, power-saving mode, normal mode)
- the GC chosen for each mode shouldn't violate the requirements for this mode (see [here](#overview))

Requirements for Runtime:
- support for precise/exact roots
- GC barriers support by Interpreter and JIT
- safepoints support by Interpreter and JIT

Panda should support multiple GCs, since different GCs have different advantages (memory usage, throughput) on different benchmarks/applications.
So we should be able to use the optimal GC for each application.

### Epsilon GC

Epsilon GC does absolutely nothing but gives the impression that the Runtime has a GC, i.e., it supports all required GC interfaces and can be integrated into the Runtime.

Epsilon GC should be used only for debugging and profiling purposes, i.e., we can disable GC and measure in a "What if we don't have GC" mode.
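In code, such a collector is just an implementation of the GC interface whose methods do nothing. The interface below is a deliberately tiny, hypothetical one (`GC`, `Trigger`, `ReclaimedBytes` are assumed names); the real Runtime interface is richer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal GC interface used by the Runtime.
class GC {
public:
    virtual ~GC() = default;
    virtual void Trigger() = 0;                 // request a collection
    virtual size_t ReclaimedBytes() const = 0;  // total bytes freed so far
};

// Epsilon GC: satisfies the interface but never reclaims anything.
class EpsilonGC final : public GC {
public:
    void Trigger() override { ++requests_; }  // counts the request, collects nothing
    size_t ReclaimedBytes() const override { return 0; }
    size_t Requests() const { return requests_; }

private:
    size_t requests_ = 0;
};
```

Counting the requests is optional, but it can be handy when profiling how often a real GC would have been triggered.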

### STW GC

Stop-The-World GC.

Non-generational non-moving GC; while it works, all mutator threads should be at a safepoint.

1. Root scan
1. Mark
1. Sweep
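The three phases can be sketched on a toy heap. `Obj`, `Heap` and `CollectSTW` are hypothetical names, and the sketch assumes that all mutators are already stopped at safepoints when `CollectSTW` is called.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj {
    bool marked = false;
    std::vector<Obj *> refs;  // outgoing references
};

struct Heap {
    std::vector<Obj *> objects;  // every allocated object (non-moving)
    std::vector<Obj *> roots;    // precise roots

    Obj *New() {
        objects.push_back(new Obj());
        return objects.back();
    }
    ~Heap() {
        for (Obj *o : objects) {
            delete o;
        }
    }
};

// Returns the number of reclaimed objects.
inline size_t CollectSTW(Heap &heap) {
    // 1. Root scan: seed the worklist with the roots.
    std::vector<Obj *> worklist(heap.roots.begin(), heap.roots.end());
    // 2. Mark: traverse everything reachable from the roots.
    while (!worklist.empty()) {
        Obj *o = worklist.back();
        worklist.pop_back();
        if (o == nullptr || o->marked) {
            continue;
        }
        o->marked = true;
        for (Obj *r : o->refs) {
            worklist.push_back(r);
        }
    }
    // 3. Sweep: free unmarked objects and clear marks on survivors.
    size_t freed = 0;
    std::vector<Obj *> live;
    for (Obj *o : heap.objects) {
        if (o->marked) {
            o->marked = false;
            live.push_back(o);
        } else {
            delete o;
            ++freed;
        }
    }
    heap.objects.swap(live);
    return freed;
}
```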

### Concurrent Mark Sweep GC

Requirements:
- concurrent
- generational
- low CPU usage (high throughput)
- acceptable STW pause
- (optional) compaction

We need to conduct more performance analysis experiments to choose the optimal scheme, but for now let's consider these options:
- generational moving (optionally compacting) GC
- (optional) generational non-moving (optionally compacting) GC

Spaces (for moving CMS):
```
+------------+------------+----------------------------+
| Eden/young |  Survivor  |        Tenured/old         |
|            | (optional) |                            |
+------------+------------+----------------------------+
```

Survivor space is optional and only for high-end targets.
Since one of the metrics for this GC is high throughput, most of the objects in the Eden will live long enough to die.
If we prioritize the energy-efficiency metric and heap sizes are, on average, not gigantic, it seems that we should avoid using survivor spaces.
So we can support it optionally for experiments. As an alternative, we can introduce some average-age metadata for a run of small objects.

Minor GC (STW):
1. Root scan for young gen, CardTable used for finding roots in old gen
1. Mark eden and move alive objects to the tenured (or survivor) space
1. Sweep eden

Note: we'll use adaptive thresholds for triggering Minor GC to minimize the STW pause

Note #2: we can tune minor GC by trying to make marking concurrent with a re-mark step, but it will require a copy of the card table.

Major GC
1. Concurrent scan of static roots
1. Initial Mark - root scan (STW #1)
1. Concurrent Marking + Reference processor
1. Remark objects missed during concurrent marking (STW #2)
1. Concurrent Sweep + Finalizers
1. Reset

Reference processor - prevents issues with wrong finalization order.

Note: If we don't have Survivor spaces, we can implement a non-moving generational GC.

### Region based GC (main)

Requirements:
- concurrent
- generational
- acceptable stable STW pause
- (optional) compaction

Since the typical heap size for mobile applications is small, this GC can be considered a good choice for production.

The whole heap consists of memory regions with fixed size (it can vary, i.e., the size of memory region #K+1 can differ from the size of memory region #K).
```
+------------------+------------------+-----+------------------+
| Memory region #1 | Memory region #2 | ... | Memory region #N |
| young            | tenured          | ... | tenured          |
+------------------+------------------+-----+------------------+
```

Region types:
- young regions
- tenured regions
- humongous regions (for humongous objects)
- empty regions

Incoming references for each region are tracked via remembered sets:
- old-to-young references
- old-to-old references

Minor GC (only for young regions - STW):
1. Root scan for young gen, remembered sets used for finding roots in old gen
1. Marking young gen + Reference processor + moving alive objects to the tenured space
1. Sweep + finalizers

The size of young space selected to satisfy

Mixed GC - minor GC + some tenured regions added to the young gen regions after the concurrent marking.

Concurrent marking (triggered when we reach some threshold for tenured generation size):
1. Root scan (STW #1)
1. Concurrent marking + Reference processor
1. Re-mark - finishes marking and updates liveness statistics (STW #2)
1. Cleanup - reclaims empty regions and determines whether we need mixed collections to reclaim tenured space. Tenured regions are selected using different thresholds.

Note: RSets can optionally be refined by special threads

### Low-pause GC (deferred)

Requirements:
- stable low STW pause/pauseless
- (optional) incremental
- with compaction

No explicit minor GC.

Major GC
1. Concurrent scan of static roots
1. Initial Mark - root scan (STW #1)
1. Concurrent Marking + Reference processor
1. Concurrent Sweep + Finalizers + Concurrent Copy & Compact
1. Reset

Note: a good choice for applications with a big heap or for applications where it is hard to provide a stable low pause with Region based GC.

Note: compaction is target- and mode-dependent, so for low-memory devices we can consider [semi-space compaction](./glossary.md#memory-management-terms).
For a straightforward approach, we can consider some support from the OS to minimize overlapping of semi-space compaction phases between applications.

### GC: interaction with Interpreter, JIT and AOT

#### Safepoints

Prerequisites:
* one HW register is reserved for the pointer to the ExecState (per-thread state), let's call it `RVState`
* the ExecState structure has a field with the address of some page used for safepoints, and we know the offset of this field, `SPaddrOffset`

In general, a safepoint will be represented as some implicit or explicit load from `[RVState, SPaddrOffset]`.
For example, it can be something like this: `LDR R12, [RVState, #SPaddrOffset]`

Note: On some architectures it makes sense to use a store instead of a load because it requires fewer registers.

Note: If no MMU is available, it is allowed to use an explicit condition for the safepoint, i.e., something like this (pseudocode):
```
if (SafepointFlag == true) {
    call Runtime::SafepointHandler
}
```
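The explicit-condition variant can be sketched in C++ as a flag poll. Everything here is an assumption made for illustration: the flag lives in a hypothetical `ExecState` struct, and the handler only counts the visit and clears the flag, whereas a real handler would publish stack roots and block until the GC finishes.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical per-thread state holding the safepoint flag.
struct ExecState {
    std::atomic<bool> safepoint_flag{false};
    int safepoints_taken = 0;
};

// Stand-in for Runtime::SafepointHandler. In this single-threaded sketch
// it clears the flag itself; in a real runtime the GC would do that.
inline void SafepointHandler(ExecState &state) {
    ++state.safepoints_taken;
    state.safepoint_flag.store(false, std::memory_order_release);
}

// The explicit safepoint check.
inline void SafepointPoll(ExecState &state) {
    if (state.safepoint_flag.load(std::memory_order_acquire)) {
        SafepointHandler(state);
    }
}

// Example "compiled method": polls at method entry and at the loop head.
inline long SumTo(ExecState &state, int n) {
    SafepointPoll(state);
    long sum = 0;
    for (int i = 0; i < n; ++i) {
        SafepointPoll(state);
        sum += i;
    }
    return sum;
}
```

The poll costs one load and one branch per check, which is the overhead that the page-protection scheme avoids on targets with an MMU.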

When GC wants to stop the world, it does so by stopping all threads at a safepoint.
It protects some predefined safepoint memory page, which leads to segmentation faults in all execution threads when they do the load from this address.

Safepoints should be inserted at the beginning of the method and at the head of each loop.

For each safepoint, we should have a method that can provide GC with information about objects on the stack.
The Interpreter already keeps such info in its frames.
But for JIT/compiler, it looks like we need some method generated by the JIT/compiler that can get all necessary data for the safepoint.
This method can actually be just some code without prologue and epilogue.
We'll jump to its beginning from the signal handler, and at the end we should jump back to the safepoint, so we should probably put it near the original code.

So the flow looks like this:

```
 ...
 | compiled/jitted code | ------>
 | safepoint #X in the code | ---seg fault--->
 | signal handler | ---change return pc explicitly--->
 | method that prepares data about objects on stack for the #X safepoint and waits until STW ends | ---jump via encoded relative branch to safepoint--->
 | safepoint #X in the code | ---normal execution--->
 | compiled/jitted code | ------>
 ...
```

**Opens**:
* should we generate a method for each safepoint, or for all safepoints at once?

#### GC Barriers

A GC barrier is a block of code executed when the application code writes to (write barrier) or reads from (read barrier) certain memory. GC barriers are used to ensure heap consistency and to optimize some GC flows.

##### GC Write Barriers

Heap inconsistency can happen when GC reclaims an alive/reachable object.
I.e., both of these conditions must hold for an alive/reachable object to be reclaimed:
1. We store a reference to a white object into a black object
1. There are no paths from any gray object to that white object

Besides addressing the heap inconsistency problem, a write barrier can be used for maintaining incoming references for the young generation or a region.

So we can solve these issues with a GC WRB (write barrier). A GC WRB can be _pre_ (inserted before the store) or _post_ (inserted after the store). These barriers are used **only** when we store a reference to an object into some field of an object.

A _pre_ barrier is usually used to solve the issue of losing an alive object during concurrent marking. Pseudocode (example):
```c++
if (UNLIKELY(concurrent_marking)) {
    auto pre_val = obj.field;
    if (pre_val != nullptr) {
         store_in_buff_to_mark(pre_val); // call function which stores reference to object stored in the field to process it later
    }
}
obj.field = new_val; // STORE for which barrier generated
```
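A compilable version of the same idea, under simplifying assumptions: one reference field per object, a single-threaded mutator, and a plain vector as the buffer of values to mark. `Object`, `GCState` and `StoreWithPreBarrier` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

struct Object {
    Object *field = nullptr;  // the only reference field, for simplicity
};

// Hypothetical GC-side state visible to the barrier.
struct GCState {
    bool concurrent_marking = false;
    std::vector<Object *> mark_buffer;  // overwritten values to be marked later
};

// Reference store with a pre barrier: the overwritten value is remembered
// so that concurrent marking cannot lose a still-reachable object.
inline void StoreWithPreBarrier(GCState &gc, Object &obj, Object *new_val) {
    if (gc.concurrent_marking) {
        Object *pre_val = obj.field;
        if (pre_val != nullptr) {
            gc.mark_buffer.push_back(pre_val);
        }
    }
    obj.field = new_val;  // the STORE for which the barrier is generated
}
```

Outside of concurrent marking the barrier costs only one predictable branch, which is why the condition is marked `UNLIKELY` in the pseudocode.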

A _post_ barrier can be used to track references from the tenured generation to the young generation (or inter-region references). In this case we always know the external roots for the young generation space (or for a region). Pseudocode (abstract example, not a real one):
```c++
obj.field = new_val; // STORE for which barrier generated
if ((AddressOf(obj.field) not in [YOUNG_GENERATION_ADDR_BEG, YOUNG_GENERATION_ADDR_END]) &&
    (AddressOf(new_val) in [YOUNG_GENERATION_ADDR_BEG, YOUNG_GENERATION_ADDR_END])) {
    update_card(AddressOf(obj.field)); // call function which marks some memory range as containing roots for young generation
}
```
Note: Sometimes we don't check whether the object and the stored reference are in different generations, because we get much less overhead this way.
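The `update_card` step can be illustrated with a small card table. This is a sketch under assumptions: the card size (512 bytes), the class `CardTable`, the struct `YoungRange` and the function `PostBarrier` below are hypothetical and do not describe the actual Panda card table. Addresses are passed as plain integers to keep the example self-contained.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical card table: one byte per 512-byte card of the heap.
class CardTable {
public:
    static constexpr size_t CARD_SHIFT = 9;  // assumed card size: 512 bytes

    CardTable(uintptr_t heap_begin, size_t heap_size)
        : begin_(heap_begin), cards_((heap_size >> CARD_SHIFT) + 1, 0) {}

    void MarkCard(uintptr_t addr) { cards_[(addr - begin_) >> CARD_SHIFT] = 1; }
    bool IsDirty(uintptr_t addr) const { return cards_[(addr - begin_) >> CARD_SHIFT] != 0; }

private:
    uintptr_t begin_;
    std::vector<uint8_t> cards_;
};

// Address range [begin, end) of the young generation.
struct YoungRange {
    uintptr_t begin;
    uintptr_t end;
    bool Contains(uintptr_t addr) const { return addr >= begin && addr < end; }
};

// Post barrier executed after `*field_addr = new_val`: records old-to-young references.
inline void PostBarrier(CardTable &cards, const YoungRange &young, uintptr_t field_addr, uintptr_t new_val_addr) {
    if (!young.Contains(field_addr) && young.Contains(new_val_addr)) {
        cards.MarkCard(field_addr);
    }
}
```

During Minor GC only the dirty cards of the old generation have to be scanned for roots. Dropping the range check, as the note above mentions, makes the barrier a single unconditional card store at the cost of more dirty cards.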

##### GC Read Barriers

Read barriers are used during concurrent compaction in some GCs.
For example, suppose we are concurrently moving an object from one place (`from-space`) to another (`to-space`).
At some moment we can have two instances of the same object.
So one of these conditions must hold if we want to keep the heap consistent:
1. All writes happen into `to-space` instance of the object, but reads can happen from both `from-space` and `to-space` instances
1. All writes and reads happen into/from `to-space`
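The second condition can be sketched with a forwarding pointer that every access goes through. The sketch is single-threaded and uses hypothetical names (`Obj`, `ReadBarrier`, `CopyToSpace`); a real concurrent collector would install the forwarding pointer atomically.

```cpp
#include <cassert>

struct Obj {
    Obj *forwarding = nullptr;  // set once the object has a to-space copy
    int value = 0;
};

// Read barrier: always resolve a reference to the to-space instance.
inline Obj *ReadBarrier(Obj *ref) {
    return (ref != nullptr && ref->forwarding != nullptr) ? ref->forwarding : ref;
}

// The GC copies the object and publishes the forwarding pointer.
inline Obj *CopyToSpace(Obj &from, Obj &to) {
    to.value = from.value;
    to.forwarding = nullptr;
    from.forwarding = &to;
    return &to;
}

// All reads and writes go through the barrier, so they hit to-space only.
inline int LoadValue(Obj *ref) { return ReadBarrier(ref)->value; }
inline void StoreValue(Obj *ref, int v) { ReadBarrier(ref)->value = v; }
```

Even when the mutator still holds a stale `from-space` reference, both loads and stores observe the single up-to-date copy.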

#### GC Barriers integration with Interpreter and compiler

From the Interpreter you can use the runtime interface methods:
```c++
static void PreBarrier(void *obj_field_addr, void *pre_val_addr);
static void PostBarrier(void *obj_field_addr, void *val_addr);
```
Note: for performance, we can put the address of the condition flag into ExecState for conditional barriers with a trivial condition (`if (*x) ...`).

It is critical that the compiler encodes barriers optimally. At least the fast path should be encoded efficiently.
There are several approaches for that:
 1. To describe the barrier, use some meta-language or IR which can be interpreted/encoded by all compilers compatible with the runtime (this is currently not applicable for the runtime)
 1. (a lot of open questions here, so consider this as an idea) One compiler knows how to encode the barrier using runtime interfaces (see next item) and could provide a more compiler-friendly interface to the other compilers to encode GC barriers.
 1. The compiler knows for each barrier type how it should be encoded (see pseudocode in `libpandabase/mem/gc_barrier.h`), and could use the runtime to get all operands required to do this.
Let's consider below the encoding of a PRE_ barrier:
   - get barrier type via RuntimeInterface: `BarrierType GetPreType() const`
   - for this barrier type get all needed operands provided by Runtime via
     `BarrierOperand GCBarrierSet::GetBarrierOperand(BarrierPosition barrier_position, std::string_view name);`
     (you should use operand/parameter names from the pseudocode provided in `enum BarrierType`)
   - encode barrier code using loaded operands and pseudocode from `enum BarrierType`

## Memory sanitizers support

Panda Runtime should support [ASAN](https://github.com/google/sanitizers/wiki/AddressSanitizer).

Optional: [MSAN](https://github.com/google/sanitizers/wiki/MemorySanitizer)
(Note: not possible to use without a custom-built toolchain)

Desirable, but not easy to support: [HWASAN](https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html)