| /arkcompiler/runtime_core/docs/ |
| D | 2022-08-18-isa-changelog.md | 13 1. We delete all original Java-specific opcodes and the Java-specific opcode prefix. 14 2. We remove the prefix of ECMAScript-specific opcodes, such that most of the bytecode opcode can b… 15 …We add the prefix "deprecated" and keep many old ISA instructions as "deprecated"-prefixed opcodes (for compati… 16 4. We add the prefix "throw" and make all throwing opcodes be prefixed by "throw". 17 5. We add the prefix "wide" to support opcodes which need larger immediate numbers. 18 6. We adjust the format of some opcodes (immediate numbers and accumulator usage), so that the bytec… 19 7. We change the semantics of some opcodes. 20 8. We add an 8-bit or 16-bit imm as an inline cache slot for some specific opcodes. 23 As we merge some "define-function" opcodes into one opcode, we add to the function one field which record… 25 We also add a header index to the function so that the runtime can access IndexHeader more efficiently. [all …]
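The "wide"-prefix scheme from items 5-6 can be illustrated with a small decoder sketch in C++. The opcode and prefix byte values below are invented for illustration only; the real ISA defines its own numbering and operand layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical encoding values; the actual ISA assigns its own.
constexpr uint8_t kPrefixWide = 0xFE;

struct DecodedImm {
    int32_t imm;   // sign-extended immediate
    size_t size;   // bytes consumed by the instruction
};

// Decode the immediate of a jump-like opcode: the plain form carries a
// signed 8-bit immediate, while the "wide"-prefixed form carries a signed
// 16-bit immediate (little-endian here for the sketch).
DecodedImm DecodeJmpImm(const uint8_t *pc) {
    if (pc[0] == kPrefixWide) {
        auto imm = static_cast<int16_t>(pc[2] | (pc[3] << 8));
        return {imm, 4};  // prefix byte + opcode + imm16
    }
    return {static_cast<int8_t>(pc[1]), 2};  // opcode + imm8
}
```

The same pattern generalizes to any opcode that needs a larger immediate: the interpreter dispatches on the prefix byte first, then on the secondary opcode.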
|
| D | cfi_directives.md | 24 In the prolog we save the `lr`, `fp` and `callee` regs on the stack. 25 So we should explicitly mark these stack slots with the help of `CFI` directives. 30 In the epilog we read the saved `callees` from the stack and also `fp`/`lr`. Here we annotate that saved regist… 41 In that case we "say" to the `debugger` that we are not going to return to the previous frame. So we direct…
|
| D | on-stack-replacement.md | 7 By OSR we mean the transition from interpreter code to optimized code. The opposite transition … 8 unoptimized - we call `Deoptimization`. 47 …d regular compilation use the same hotness counter. The first time the counter overflows, we look at 48 whether the method is already compiled or not. If not, we start compilation in regular mode. Otherwise,… 51 Once compilation is triggered and OSR-compiled code is already set, we begin the On-Stack Replacement p… 61 To ensure all loops in the compiled code can be entered from the interpreter, we need to avoid loop… 76 On each OSR entry, we need to restore the execution context. 77 To do this, we need to know all live virtual registers at this moment. 98 Since the Panda Interpreter is written in the C++ language, we have no access to its stack. Thus, we ca… 99 interpreter frame by cframe on the stack. When OSR occurs we call the OSR-compiled code, and once … [all …]
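The shared hotness-counter policy from lines 47-48 can be modeled roughly as below. The `Action` enum, field names, and threshold are illustrative assumptions, not the actual runtime API.

```cpp
#include <cassert>

enum class Action { kInterpret, kStartCompilation, kOsrEntry };

struct Method {
    int hotness_counter = 0;
    bool has_osr_code = false;        // OSR-compiled code is already set
    bool compilation_started = false;
};

// Sketch: the interpreter bumps the counter on each loop back-edge. On
// overflow, it either starts a regular compilation or, if OSR code is
// already installed, performs On-Stack Replacement.
Action OnBackEdge(Method *m, int threshold = 1000) {
    if (++m->hotness_counter < threshold) {
        return Action::kInterpret;
    }
    m->hotness_counter = 0;
    if (m->has_osr_code) {
        return Action::kOsrEntry;
    }
    if (!m->compilation_started) {
        m->compilation_started = true;
        return Action::kStartCompilation;
    }
    return Action::kInterpret;  // compilation in progress, keep interpreting
}
```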
|
| D | rationale-for-bytecode.md | 30 making interpretation slower than _native code execution_. In return, we get the ability to 76 At the same time, to execute a stack-based addition we need to run 3 instructions compared to 110 With this approach, we are no longer required to encode the destination register; it is "hardcoded" to 148 It is easy to see that to address virtual registers 4 and 5 we need just 3 bits, which allows us to encode 158 into 4 bits, we have to use a wider encoding: 165 How do we make sure that we benefit from the shorter encoding most of the time? An observation shows 171 Please note also that we don't need "full-range" versions of all instructions. In case some 172 instruction lacks a wide-range form, we can prepare operands for it with moves that have all 173 needed forms. Thus we save on opcode space without losing in encoding size (on average). 175 With such an approach, we can carefully introduce various "overloads" for an instruction when it could [all …]
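The observation about low register numbers can be sketched as follows: when both virtual registers fit into 4 bits, a two-operand instruction can pack them into a single operand byte, falling back to a wider form otherwise. The layout below is illustrative, not the actual Panda encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Pack two virtual register numbers into one byte, 4 bits each.
// Returns nullopt when either register needs the wider encoding
// (or a preparatory move into a low register).
std::optional<uint8_t> EncodeRegPair(uint8_t dst, uint8_t src) {
    if (dst > 0xF || src > 0xF) {
        return std::nullopt;
    }
    return static_cast<uint8_t>((dst << 4) | src);
}

uint8_t DecodeDst(uint8_t byte) { return byte >> 4; }
uint8_t DecodeSrc(uint8_t byte) { return byte & 0xF; }
```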
|
| D | memory-management.md | 5 The Panda Runtime should be scalable onto different devices/OSes, so we need some abstraction level for… 6 For now, all targets suppose interaction with the user, so we have some limitations for the STW pau… 7 We have very limited memory resources for the IoT target, so we should maximize efforts on reducing mem… 33 Modes are chosen at startup time (we'll use profile info from the cloud for that). 107 However, we can also support such a version of the object header (the Hash is stored just after the object… 123 This scenario decreases the size of a Monitor instance, and we don't need to save the Hash somewhere du… 125 But this scenario will only be useful if we have an allocator and GC which decrease such situations to a… 152 Heavyweight Lock - we have competition for this object (several threads try to lock this object). 172 If we don't use string compression, each string has this structure: 184 If we use string compression, each string has this structure: [all …]
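The string-compression idea from lines 172-184 can be sketched like this: if every character of a UTF-16 string fits into 8 bits, the payload is stored as one byte per character and the string is marked compressed. Names and layout are illustrative, not the runtime's actual `String` class.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// A hypothetical compressed-string payload: 1 byte per character when all
// code units are <= 0xFF, otherwise 2 bytes per character.
struct PandaStringSketch {
    bool compressed;
    std::vector<uint8_t> bytes;

    static PandaStringSketch Create(const std::u16string &src) {
        PandaStringSketch s;
        s.compressed = true;
        for (char16_t c : src) {
            if (c > 0xFF) {
                s.compressed = false;
                break;
            }
        }
        for (char16_t c : src) {
            s.bytes.push_back(static_cast<uint8_t>(c & 0xFF));
            if (!s.compressed) {
                s.bytes.push_back(static_cast<uint8_t>(c >> 8));
            }
        }
        return s;
    }
};
```

For ASCII-heavy workloads this halves the string data size, which is the point of the two alternative layouts shown in the document.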
|
| D | deoptimization.md | 22 The slow path encodes a call to the runtime function `Deoptimize`. We do the following for this: 32 The function `Deoptimize` calculates the bytecode pc where we should start executing code in the interp… 33 If deoptimization occurred in an inlined method, we restore all interpreter frames for all inlined… 78 …s (which we saved in the method) and call `InvokeInterpreter` to execute the method in the interpr… 84 …he interpreter. After returning from `InvokeInterpreter`, we restore callee-saved registers (which we … 97 If deoptimization occurred in an inlined method, we call the interpreter for all inlined methods from …
|
| D | ir_format.md | 14 …asses, add and delete passes (if 2 passes have a dependency, we must take this into account). We sho… 15 Second, we need to support the transfer of information between optimizations. 52 …ations are not obvious or do need profiling information to implement them. We will have them in mi… 62 We will try to make it possible to run optimizations in an arbitrary order. Some restrictions will… 78 Panda bytecode has more than 200 instructions. We need to convert all bytecode instructions into IR i… 95 …we do, the more overhead we get. We need to find a balance between performance and the overhead ne… 97 In Ahead-Of-Time (AOT) mode the overhead is less critical for us, so we can do more optimizations. 112 We decided to choose the CFG with SSA form for the following reasons: 132 …* List of predecessors: vector of pointers to the BasicBlocks from which we can get into the curre… 133 …* List of successors: vector of pointers to the BasicBlocks to which we can get from the current b… [all …]
|
| D | memory-management-SW-requirements.md | 37 - Concurrent generational GC (optional - we can disable generational mode) 51 We can use a profile to choose the MM configuration for an application (for example: we can choose a non-compa…
|
| /arkcompiler/runtime_core/compiler/optimizer/optimizations/ |
| D | object_type_check_elimination.cpp | 77 // If we can't resolve the klass at runtime we must throw an exception, so we check NullPtr after in TryEliminateIsInstance() 78 // But we can't change the IsInstance to Deoptimize, because we can resolve it after compilation in TryEliminateIsInstance() 80 // If we can't replace IsInstance, we should reset ObjectTypeInfo for the input in TryEliminateIsInstance() 99 // If we can't replace IsInstance, we should reset ObjectTypeInfo for the input in TryEliminateIsInstance() 112 // If we can't resolve the klass at runtime we must throw an exception, so we check NullPtr after in TryEliminateCheckCast() 113 // But we can't change the CheckCast to Deoptimize, because we can resolve it after compilation in TryEliminateCheckCast() 115 // If we can't replace CheckCast, we should reset ObjectTypeInfo for the input. in TryEliminateCheckCast() 136 // If we can't replace CheckCast, we should reset ObjectTypeInfo for the input. in TryEliminateCheckCast()
|
| /arkcompiler/runtime_core/compiler/docs/ |
| D | memory_barriers_doc.md | 5 We need to encode barriers after the instructions NewArray, NewObject, NewMultiArray so that if the… 6 We can remove the barrier if we prove that the created object cannot be passed to another thread be… 7 This can happen if we save the object to memory or pass it to another method. 21 We pass through all instructions in RPO order. If the instruction has the flag `MEM_BARRIER`, we add the… 22 If we visit an instruction that can pass an object to another thread (a Store instruction, Call instru… 23 If the instruction has an input from `barriers_insts_`, we call the function `MergeBarriers`. 25 So we will only set the barrier on the last instruction before potentially passing the created obje…
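The merging step can be sketched as follows: barrier-flagged instructions are collected while walking a block, and when an instruction that may publish the object is reached, only the last collected instruction keeps its barrier. The `Inst` struct and flags below are simplifications of the real IR, not the compiler's types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Inst {
    bool mem_barrier = false;   // instruction was created with MEM_BARRIER
    bool may_publish = false;   // store/call that can pass an object to another thread
};

// Keep the barrier only on the last instruction of the collected group.
void MergeBarriers(std::vector<Inst *> &barrier_insts) {
    for (size_t i = 0; i + 1 < barrier_insts.size(); ++i) {
        barrier_insts[i]->mem_barrier = false;
    }
    barrier_insts.clear();
}

void OptimizeMemoryBarriers(std::vector<Inst> &block) {
    std::vector<Inst *> barrier_insts;
    for (auto &inst : block) {
        if (inst.mem_barrier) {
            barrier_insts.push_back(&inst);
        }
        if (inst.may_publish) {
            MergeBarriers(barrier_insts);  // one barrier covers the whole group
        }
    }
    MergeBarriers(barrier_insts);  // conservatively merge the trailing group too
}
```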
|
| D | memory_coalescing_doc.md | 40 The case with a coalesced store is quite straightforward: having two consecutive stores, we replace … 46 …oduces a multiple assignment that is not a part of SSA form. For this reason, we need additional pseu… 54 …e placed near each other without intermediate instructions. For this reason we need to find a place… 56 During hoisting and sinking of memory operations we use rules for memory instruction scheduling: do… 58 …ented for array accesses. We process the instructions of a basic block in order. To find accesses of con… 63 To track indices we use a basic implementation of scalar evolution that allows us to track how variables… 65 Processing each instruction in a basic block, we do the following: 70 …2) If we can't determine anything about the index variable, we add this instruction as a candidate and… 72 …alesced with the instruction **or** both refer to different objects **or** we have no information … 76 Finally, we replace the collected pairs by coalesced instructions. [all …]
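The candidate-pairing step can be sketched as below: scanning a block's array accesses in order, an access is paired with a pending candidate on the same array whose index is provably adjacent. Real scalar evolution, aliasing checks, and insertion-point search are omitted; `Access` is an illustrative stand-in for the IR.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

struct Access {
    int array_id;   // which array object is accessed
    int64_t index;  // index known exactly in this simplified sketch
};

// Returns pairs of positions in `accesses` that could be coalesced into a
// single wide load/store because their indices are adjacent.
std::vector<std::pair<int, int>> FindCoalescablePairs(const std::vector<Access> &accesses) {
    std::vector<std::pair<int, int>> pairs;
    std::vector<int> candidates;  // positions of not-yet-paired accesses
    for (int i = 0; i < static_cast<int>(accesses.size()); ++i) {
        bool paired = false;
        for (auto it = candidates.begin(); it != candidates.end(); ++it) {
            const Access &c = accesses[*it];
            if (c.array_id == accesses[i].array_id &&
                (c.index + 1 == accesses[i].index || c.index - 1 == accesses[i].index)) {
                pairs.emplace_back(*it, i);
                candidates.erase(it);
                paired = true;
                break;
            }
        }
        if (!paired) {
            candidates.push_back(i);
        }
    }
    return pairs;
}
```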
|
| D | vn_doc.md | 6 In this case we move users from the second instruction to the first instruction (the first instruction is domina… 19 We pass through all instructions in RPO order. If the instruction has the attribute NO_Cse, we set the next… 20 For other instructions we save the information: opcode, type, `vn` of the instruction inputs, advanced prop… 21 Based on the collected information, we look for equivalent instructions in the hash map. 24 …a. If some equivalent instruction dominates the current instruction, we move users from the current instru… 25 …b. If no equivalent instruction dominates the current instruction, we insert the instruction i… 26 2. If equivalent instructions weren't found, we set the next `vn` to the current instruction field and …
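The hashing step described above can be sketched in a few lines: an instruction's key is its opcode, type, and the value numbers of its inputs, and equal keys mark candidates for elimination. The dominance check is omitted here; `VnTable` is an illustrative reduction of the pass, not the compiler's class.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>
#include <utility>
#include <vector>

using VnKey = std::tuple<int /*opcode*/, int /*type*/, std::vector<uint32_t> /*input vns*/>;

struct VnTable {
    uint32_t next_vn = 0;
    std::map<VnKey, uint32_t> table;

    // Returns the value number for this instruction and whether an
    // equivalent instruction was already seen (same opcode, type, inputs).
    std::pair<uint32_t, bool> Lookup(int opcode, int type, std::vector<uint32_t> input_vns) {
        VnKey key{opcode, type, std::move(input_vns)};
        auto it = table.find(key);
        if (it != table.end()) {
            return {it->second, true};  // candidate for user replacement
        }
        table.emplace(std::move(key), next_vn);
        return {next_vn++, false};  // fresh value number
    }
};
```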
|
| D | scheduler_doc.md | 26 For each basic block we first scan the instructions in reverse order, marking barriers and calculating t… 27 Together with dependencies we calculate priority as the longest (critical) path to leaf instructions … 29 Then we schedule each interval between barriers using the following algorithm. 33 …e is empty we look through the "soonest" instruction from the `waiting` queue, and if we need to skip some … 34 Next, we move all already available instructions (`ASAP` <= `cycle`) from the `waiting` queue into the `rea… 36 Finally, we extract the top instruction from the `ready` queue and add it into the new schedule. At this moment… 128 ... // Here we rearrange instructions in the basic block according to sched_ 162 // Skipping cycles where we can't schedule any instruction
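The `ready`/`waiting` loop from lines 29-36 can be sketched as a simplified list scheduler. Dependency analysis, barriers, and the reverse priority scan are replaced by precomputed inputs; names and structures are illustrative, not the compiler's.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

struct SchedInst {
    int id;
    int prio;                  // critical-path length; higher schedules first
    int asap = 0;              // earliest cycle this instruction may issue
    int unscheduled_deps = 0;  // producers not yet scheduled
};

struct Edge { int from, to, latency; };  // "to" depends on "from"

std::vector<int> Schedule(std::vector<SchedInst> insts, const std::vector<Edge> &deps) {
    for (const auto &e : deps) {
        insts[e.to].unscheduled_deps++;
    }
    auto cmp = [&](int a, int b) { return insts[a].prio < insts[b].prio; };
    std::priority_queue<int, std::vector<int>, decltype(cmp)> ready(cmp);
    std::vector<int> waiting;  // deps done, but ASAP is still in the future
    for (auto &inst : insts) {
        if (inst.unscheduled_deps == 0) waiting.push_back(inst.id);
    }
    std::vector<int> schedule;
    int cycle = 0;
    while (schedule.size() < insts.size()) {
        // Move already available instructions (ASAP <= cycle) into `ready`.
        for (auto it = waiting.begin(); it != waiting.end();) {
            if (insts[*it].asap <= cycle) { ready.push(*it); it = waiting.erase(it); }
            else { ++it; }
        }
        if (ready.empty()) {  // skip cycles where nothing can be scheduled
            cycle = insts[*std::min_element(waiting.begin(), waiting.end(),
                [&](int a, int b) { return insts[a].asap < insts[b].asap; })].asap;
            continue;
        }
        int cur = ready.top();
        ready.pop();
        schedule.push_back(cur);
        for (const auto &e : deps) {  // release consumers, propagate latency
            if (e.from == cur) {
                insts[e.to].asap = std::max(insts[e.to].asap, cycle + e.latency);
                if (--insts[e.to].unscheduled_deps == 0) waiting.push_back(e.to);
            }
        }
        ++cycle;
    }
    return schedule;
}
```

With a long-latency producer, the consumer waits until its `ASAP` cycle while independent work fills the gap.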
|
| /arkcompiler/runtime_core/runtime/mem/gc/ |
| D | gc_adaptive_stack.h | 39 * we will create a new task for a worker. 53 * This method should be used when we find a new object through a field of another object. 54 * @param from_object the object from which we found the object by reference, nullptr for roots 60 * This method should be used when we find a new object as a root 61 * @param root_type type of the root which we found 68 * If the source stack is empty, we will swap it with the destination stack 83 * Should be used if we decide to free it not in the destructor of this instance. 100 * If we set the limit for the stack, we will create a new task for
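The limit-triggered task creation can be sketched like this: objects are pushed locally until a configured limit, then the collected chunk is handed off as a new task for a GC worker. Names are illustrative, not the actual `GCAdaptiveStack` interface.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

template <typename Ref>
class AdaptiveStackSketch {
public:
    AdaptiveStackSketch(size_t limit, std::function<void(std::vector<Ref>)> create_task)
        : limit_(limit), create_task_(std::move(create_task)) {}

    void Push(Ref ref) {
        stack_.push_back(ref);
        if (limit_ != 0 && stack_.size() >= limit_) {
            create_task_(std::move(stack_));  // offload this chunk to a worker
            stack_.clear();                   // restore a valid empty stack
        }
    }

    size_t LocalSize() const { return stack_.size(); }

private:
    size_t limit_;  // 0 means "never offload"
    std::function<void(std::vector<Ref>)> create_task_;
    std::vector<Ref> stack_;
};
```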
|
| D | gc_settings.h | 106 * \brief Specify if we need to track the removal of objects 119 …* \brief Max stack size for marking in the main thread; if it is exceeded, we will send a new task to worke… 125 …* \brief Max stack size for marking in a gc worker; if it is exceeded, we will send a new task to worke… 141 * \brief true if we want to do the marking phase in multithreading mode. 148 * \brief true if we want to do the compacting phase in multithreading mode. 181 …/// Max stack size for marking in the main thread; if it is exceeded, we will send a new task to workers, 0… 183 …/// Max stack size for marking in a gc worker; if it is exceeded, we will send a new task to workers, 0… 185 …bool g1_track_freed_objects_ = false; /// if we need to track the removal of objects during the G1GC co… 188 …bool parallel_marking_enabled_ = false; /// true if we want to do the marking phase in multithrea… 189 …bool parallel_compacting_enabled_ = false; /// true if we want to do the compacting phase in multith…
|
| /arkcompiler/runtime_core/runtime/mem/ |
| D | alloc_config.h | 29 …* We want to record stats about allocation and free events. Allocators don't care about the type … 31 …* we can cast void* to an object and get the specific size of this object; otherwise we should believ… 32 …* can record only an approximate size. Because of this we force allocators to use a specific config for… 91 * we find the first object which crosses the border of this interval. 163 // We don't use the crossing map in this config. 166 // We don't use the crossing map in this config. 174 // We don't use the crossing map in this config. 177 // We can't call CrossingMap when we don't use it in FindFirstObjInCrossingMap() 182 // We don't use the crossing map in this config. 185 // We don't use the crossing map in this config. [all …]
|
| D | freelist_allocator-inl.h | 77 …LOG_FREELIST_ALLOCATOR(DEBUG) << "Raw memory is not aligned as we need. Create a special header for … in Alloc() 78 // The raw memory pointer is not aligned as we expected in Alloc() 79 // We need to create an extra header inside in Alloc() 108 // We must update some values in the current memory_block in Alloc() 122 …// It is not the object size itself, because we can't compute it from the MemoryBlockHeader structure … in Alloc() 160 … // It is not the object size itself, because we can't compute it from the MemoryBlockHeader structure. in FreeUnsafe() 187 // TODO(aemelenko): We can create a mutex for each pool to increase performance, 189 // (we must compute the memory pool header addr from a memory block addr stored inside it) 191 // During a Collect method call we iterate over the memory blocks in each pool. 220 // Therefore, we must unlock the allocator's alloc/free methods only [all …]
|
| D | heap_space.cpp | 96 // then we increase the space to allocate this pool in ComputeNewSize() 118 // We have enough memory for the allocation, no need to increase the heap in WillAlloc() 121 …// If we allocate a pool while GC works then we must allocate the new pool anyway, so we will try to inc… in WillAlloc() 123 …equested pool size is greater than the free bytes in the current heap space and non-occupied memory, then we can not in WillAlloc() 124 // allocate such a pool, so we need to trigger GC in WillAlloc() 128 // In this case we need to increase the space to allocate a new pool in WillAlloc() 131 // Otherwise we need to trigger GC in WillAlloc() 265 // then we increase the young space to allocate this pool in ComputeNewYoung() 282 // then we increase the tenured space to allocate this pool in ComputeNewTenured() 413 // For tenured we just free the pool in FreeTenuredPool() [all …]
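The decision logic visible in the `WillAlloc` comments can be reduced to a small sketch: allocate when the pool fits in free space, grow the heap when the request arrives while GC works, and otherwise trigger GC. The signature and enum are illustrative; the real `WillAlloc` in heap_space.cpp has different parameters and more cases.

```cpp
#include <cassert>
#include <cstddef>

enum class AllocDecision { kAllocate, kIncreaseHeapAndAllocate, kTriggerGc };

AllocDecision WillAlloc(size_t pool_size, size_t free_bytes, bool is_gc_in_progress) {
    if (pool_size <= free_bytes) {
        // Enough memory for the allocation, no need to increase the heap.
        return AllocDecision::kAllocate;
    }
    if (is_gc_in_progress) {
        // A pool requested while GC works must be allocated anyway,
        // so we try to increase the heap space instead of re-entering GC.
        return AllocDecision::kIncreaseHeapAndAllocate;
    }
    // Not enough free space on the mutator path: trigger GC. This also
    // covers the case where the pool cannot fit even into non-occupied memory.
    return AllocDecision::kTriggerGc;
}
```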
|
| /arkcompiler/runtime_core/cmake/ |
| D | PandaCmakeFunctions.cmake | 15 # We need to use linker scripts for section replacement above 4GB 17 # so we have different start addresses for asan and default builds 19 # When using rapidcheck we should use linker scripts with 32 # For cross-aarch64 with ASAN with a linker script we need to use an additional path-link 33 … # Here we use the default address space (without the ASAN flag). It is a nuance of cross-building. 40 # We need to use specific options for AMD64 builds with the Clang compiler
|
| D | ClangTidy.cmake | 21 # TODO: Retry once we upgrade the checker. 36 # Currently we pin a certain version of clang-tidy to avoid unstable linting, 53 # Hence we check for ERROR_VARIABLE instead of RESULT_VARIABLE. 66 # definition. We add it to targets on which we run clang-tidy just to 93 # * We use a permissive policy for checks, i.e. everything is enabled by default, 95 # * We maintain the list of global exceptions in this function (not in .clang-tidy) 147 "-readability-identifier-naming" # disabled because we will use little-hump-style 149 …"-fuchsia-trailing-return" # disabled because we have a lot of false positives and it is stylisti… 150 …"-fuchsia-default-arguments-calls" # disabled because we use functions with default arguments a lot 151 …"-fuchsia-default-arguments-declarations" # disabled because we use functions with default argumen… [all …]
|
| /arkcompiler/runtime_core/runtime/mem/gc/heap-space-misc/ |
| D | crossing_map.h | 29 // If enabled - we will manage elements which cross map borders. 30 // Since now we dirty the card by object header, so disable cross-border 33 // TODO(aemelenko): Now, we can't change the granularity parameter here 86 * Therefore, during removal we need to send the next and previous object parameters. 101 * we find the first object which crosses the border of this interval. 127 // How much memory we manage via one element of the static array. 140 // According to the Status bits, we can use the offset value in such a way: 145 // We can start our range iteration from this element. 151 // this Page, and we also know that there is an object which crossed the Page border. 160 // We have some object that starts inside this page, [all …]
|
| /arkcompiler/runtime_core/runtime/tests/ |
| D | freelist_allocator_test.cpp | 37 // We need to create a runtime instance to be able to use CrossingMap. in FreeListAllocatorTest() 84 …// We use the common PoolManager from Runtime. Therefore, we have the same pool allocation for both ca… in AddMemoryPoolToAllocatorProtected() 98 // We need to remove the corresponding Pools from the CrossingMap in ClearPoolManager() 279 // To cover all memory we need to consider the pool header size at the first bytes of pool memory. in TEST_F() 336 // We have an issue with QEMU during MT tests. Issue 2852 in TEST_F() 343 // Threads can concurrently add Pools to the allocator; therefore, we must take this into account in TEST_F() 344 // And we must also take fragmentation into account in TEST_F() 360 // We have an issue with QEMU during MT tests. Issue 2852 in TEST_F() 367 // Threads can concurrently add Pools to the allocator; therefore, we must take this into account in TEST_F() 368 // And we must also take fragmentation into account in TEST_F() [all …]
|
| /arkcompiler/runtime_core/libpandabase/utils/ |
| D | murmur3_hash.h | 31 // Firstly, we process each 32-bit block from the key; 32 // Secondly, we process the last 8-bit blocks which were not covered in the previous step. 90 // We start hashing from the seed in MurmurHash3() 102 // Do this because we don't want to dispatch on Big/Little endianness. in MurmurHash3() 140 // We start hashing from the seed in MurmurHash3String() 142 // We should still compute the length of the string; we will need it later in MurmurHash3String() 160 // We couldn't read a full block of four 8-bit values in MurmurHash3String() 163 // Do this because we don't want to dispatch on Big/Little endianness. in MurmurHash3String()
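The two steps named in lines 31-32 match the standard MurmurHash3 x86 32-bit layout: hash each 32-bit block of the key, then mix in the tail bytes that did not fill a whole block. Below is a minimal sketch of that public-domain algorithm (not a copy of the Panda header, which also handles seeds and strings differently).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static inline uint32_t Rotl32(uint32_t x, int r) { return (x << r) | (x >> (32 - r)); }

uint32_t MurmurHash3_32(const uint8_t *key, size_t len, uint32_t seed) {
    const uint32_t c1 = 0xCC9E2D51;
    const uint32_t c2 = 0x1B873593;
    uint32_t h = seed;  // we start hashing from the seed
    const size_t nblocks = len / 4;
    // Step 1: process each 32-bit block of the key.
    for (size_t i = 0; i < nblocks; ++i) {
        uint32_t k;
        std::memcpy(&k, key + i * 4, sizeof(k));  // assumes a little-endian host
        k *= c1; k = Rotl32(k, 15); k *= c2;
        h ^= k; h = Rotl32(h, 13); h = h * 5 + 0xE6546B64;
    }
    // Step 2: process the 1-3 tail bytes not covered by the blocks.
    const uint8_t *tail = key + nblocks * 4;
    uint32_t k1 = 0;
    switch (len & 3U) {
        case 3: k1 ^= static_cast<uint32_t>(tail[2]) << 16; [[fallthrough]];
        case 2: k1 ^= static_cast<uint32_t>(tail[1]) << 8; [[fallthrough]];
        case 1: k1 ^= tail[0];
                k1 *= c1; k1 = Rotl32(k1, 15); k1 *= c2; h ^= k1;
    }
    // Finalization: mix the length in and force the bits to avalanche.
    h ^= static_cast<uint32_t>(len);
    h ^= h >> 16; h *= 0x85EBCA6B; h ^= h >> 13; h *= 0xC2B2AE35; h ^= h >> 16;
    return h;
}
```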
|
| /arkcompiler/runtime_core/runtime/ |
| D | lock_order_graph.cpp | 71 // We can only wait for a single monitor here. in CheckForTerminationLoops() 95 // If this node belongs to some previously found loop, we ignore it. in CheckForTerminationLoops() 108 …// On each iteration of the loop we take the next unexplored node from the front and find all reachabl… in CheckForTerminationLoops() 109 …// it. If we find an already explored node then there is a loop, and we save it in nodes_in_deadlocks.… in CheckForTerminationLoops() 123 // the daemon thread sets SetEnteringMonitor, then we create an edge from a thread in CheckForTerminationLoops() 125 // So here we ignore this self-loop as a false loop. in CheckForTerminationLoops()
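The loop search described in these comments can be sketched as follows. Since each thread can wait for at most a single monitor, every node has at most one outgoing "waits-for" edge, so reaching an already-explored node implies a loop. The daemon-thread self-loop filtering mentioned in lines 123-125 is omitted; types and names are illustrative.

```cpp
#include <cassert>
#include <map>
#include <set>

using ThreadId = int;

// `waits_for` maps a thread to the single thread it waits for.
// Returns every node reachable into a waits-for loop (i.e. deadlocked).
std::set<ThreadId> FindDeadlockedNodes(const std::map<ThreadId, ThreadId> &waits_for) {
    std::set<ThreadId> nodes_in_deadlocks;
    for (const auto &[start, target] : waits_for) {
        (void)target;
        if (nodes_in_deadlocks.count(start) != 0) {
            continue;  // belongs to a previously found loop, ignore it
        }
        std::set<ThreadId> explored{start};
        ThreadId cur = start;
        bool loop_found = false;
        while (true) {
            auto it = waits_for.find(cur);
            if (it == waits_for.end()) {
                break;  // the chain ends: this path cannot loop
            }
            ThreadId next = it->second;
            if (explored.count(next) != 0) {
                loop_found = true;  // reached an already explored node
                break;
            }
            explored.insert(next);
            cur = next;
        }
        if (loop_found) {
            nodes_in_deadlocks.insert(explored.begin(), explored.end());
        }
    }
    return nodes_in_deadlocks;
}
```

Note that a chain leading *into* a loop (thread 3 below) is also reported: it waits forever even though it is not on the cycle itself.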
|
| /arkcompiler/runtime_core/runtime/mem/gc/reference-processor/ |
| D | reference_processor.h | 52 …* The predicate checks GC-specific conditions on this reference (i.e. whether we need to skip this referenc… 60 …* The predicate checks if we should add this reference to the queue (e.g. don't process too many refs o… 66 * Process all references which we discovered during GC. 67 …* The predicate checks if we should process all references at once (e.g. processing takes too much tim… 73 …* Collect all processed references. They were cleared in the previous phase - we only collect them.
|