The proposed implementation's objectives:
 * Avoid any new API at the assembler level
 * Consider the literal pool simply as a container of literals
 * Keep the literal pool at the macro-assembler level
 * The literal pool is public (can be used by the users)
 * The macro-assembler will have its own literal pool, and will manage
   it automatically
 * The macro-assembler is responsible for emitting its managed literal
   pool when needed
 * The assembler does not manage literals, but can place a literal in the
   code buffer

Contrary to AArch64, every single instruction handling labels has a
specific range which also depends on the ISA and the variant.
For example, ldrd's range is [-1020, 1020] for 32-bit T32 and [-255, 255]
for A32.
But for ldr the range is [-4095, 4095] for A32, [0, 1020] for 16-bit T32
and [-4095, 4095] for 32-bit T32.
So, the macro-assembler will have to wait for the instruction to be emitted
at the assembler level, as the assembler may choose the variant to emit,
before it can specify the maximum range within which the literal has to be
placed.

The checkpoint (the point by which the literal must have been emitted) also
depends on an offset, specific to the ISA, applied to the PC (r15). That
offset is 4 in the T32 case and 8 in the A32 case. The ARM ARM describes it
as the "Architecture State Offset".
In some cases, the PC is also automatically aligned, referenced as
Align(PC, alignment) in the ARM ARM.

When a literal is added to the literal pool, it's appended to the list of
literals and the pool's size is updated.
When an instruction using a literal is emitted, the literal's checkpoint
is set, and depends on the variant that was used to emit the instruction.
The macro-assembler then manages the checkpoint, which defines when the
macro-assembler's internal pool has to be emitted.
Broadly speaking, when that checkpoint is reached, the macro-assembler's
literal pool is emitted.
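
As a rough illustration of how the instruction range, the PC offset and the
PC alignment combine into a checkpoint, here is a minimal C++ sketch. All
names (Isa, PcReadOffset, AlignDown, LiteralCheckpoint) are hypothetical and
are not vixl identifiers. The sketch assumes that only the forward part of
the range matters, since the pool is emitted after the instruction using it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for illustration only; not part of vixl.
enum class Isa { kA32, kT32 };

// Offset added to the instruction address when the PC is read:
// 4 for T32 and 8 for A32, as stated in the text above.
constexpr int32_t PcReadOffset(Isa isa) { return isa == Isa::kT32 ? 4 : 8; }

// Align(PC, alignment): round down to a multiple of a power-of-two alignment.
constexpr int32_t AlignDown(int32_t value, int32_t alignment) {
  return value & ~(alignment - 1);
}

// Last code-buffer offset at which a literal may start and still be reachable
// from the instruction located at `instr_offset`. `max_forward_range` is the
// upper bound of the range of the variant actually emitted.
int32_t LiteralCheckpoint(Isa isa,
                          int32_t instr_offset,
                          int32_t max_forward_range,
                          bool pc_is_aligned) {
  assert(max_forward_range >= 0);
  int32_t base = instr_offset + PcReadOffset(isa);
  if (pc_is_aligned) base = AlignDown(base, 4);
  return base + max_forward_range;
}
```

For instance, a 16-bit T32 ldr at offset 2 reads the PC as 6, which is
aligned down to 4, so with a forward range of 1020 the literal has to start
at or before offset 1024.
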
The emission will include an optional branch to let the
code execute around the pool:
    b after_pool
    .int 0xabcdef12
    .float 1.0
    ...
  after_pool:
The branch itself can be omitted if the previous instruction is an
unconditional branch (like b, tbb/tbh, ...).

We will emit the literal pool when one of the following conditions is met:
 1. The offset in the code buffer has reached the checkpoint (including
    enough space for a branch). Every instruction has to perform the check,
    and only via the macro-assembler, because the assembler has no
    knowledge of the literal pool.
 2. A literal is added, and the instruction range does not allow its
    placement at the end of the pool.
 3. The assembler emits an unconditional branch, a return from function or
    possibly any instruction which modifies pc ('mov pc, lr' or 'pop pc',
    which is popular in the T32 world). This will need to be evaluated
    (considering the size of the pool, for example) to avoid trashing the
    I-cache by emitting too often, but some instructions have such a small
    range (ldrd for example) that this will be hard to avoid.

At the application level, one can use a literal pool, but such a pool will
not be managed by the macro-assembler, and it's the responsibility of the
application to emit it.
This means that there will be no automatic mechanism or callback to
tell the application when it needs to be emitted. The built-in
literal pool, by contrast, will be fully managed by the macro-assembler,
and automatically emitted when the macro-assembler detects that it
can/should.
We believe it's too complex for vixl to manage multiple literal pools
considering the sometimes small latitude given to emit the pools. But
that's negotiable. For example, we could have one literal pool for small
ranges like ldrd and another for bigger ranges.
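
The three emission conditions listed above could be checked along the
following lines. This is only a sketch under stated assumptions: the names
(Pool, CheckpointReached, NewLiteralDoesNotFit, WorthEmittingAfterBranch)
are hypothetical, the branch over the pool is assumed to take 4 bytes, and
the size threshold used for condition 3 is an arbitrary parameter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical sketch; none of these names are vixl's API.
constexpr int32_t kBranchSize = 4;  // Assumed room for the branch over the pool.

struct Pool {
  int32_t size = 0;  // Bytes of literals waiting to be emitted.
  // Latest offset at which the pool may start so every literal stays in range.
  int32_t checkpoint = std::numeric_limits<int32_t>::max();

  // Checkpoint the pool would have if a literal that must start at or before
  // `literal_limit` were appended at the end of the pool.
  int32_t CheckpointWith(int32_t literal_limit) const {
    return std::min(checkpoint, literal_limit - size);
  }

  void Add(int32_t literal_size, int32_t literal_limit) {
    assert(literal_size > 0);
    checkpoint = CheckpointWith(literal_limit);
    size += literal_size;
  }
};

// Condition 1: emitting the next instruction (plus a branch over the pool)
// would move the cursor past the checkpoint.
bool CheckpointReached(const Pool& pool, int32_t cursor,
                       int32_t next_instr_size) {
  return pool.size > 0 &&
         cursor + next_instr_size + kBranchSize > pool.checkpoint;
}

// Condition 2: the new literal cannot go at the end of the current pool, so
// the current pool has to be emitted first.
bool NewLiteralDoesNotFit(const Pool& pool, int32_t cursor,
                          int32_t instr_size, int32_t literal_limit) {
  return pool.size > 0 &&
         cursor + instr_size + kBranchSize > pool.CheckpointWith(literal_limit);
}

// Condition 3: after an unconditional branch no extra branch is needed, so
// emit opportunistically, but only if the pool is big enough to be worth it.
bool WorthEmittingAfterBranch(const Pool& pool, int32_t min_size) {
  return pool.size > 0 && pool.size >= min_size;
}
```

With a 4-byte literal limited to offset 4143 already pending, adding an
ldrd literal limited to offset 263 tightens the pool checkpoint to 259,
which shows how one short-range instruction can force an early emission.
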
Literals can also be created at the application level, and be placed in
the code at the assembler level, then used as labels for ldr* and all
the variants.

What still needs to be done with this version of the literal pools:
 * Have a notion of shared literal so that the literal can be reused
   even when the literal has been emitted

In the current implementation for AArch64 the literal pool is associated
with a macro-assembler, the literal may be associated with a literal pool,
and the assembler places the literal. If the literal is linked to a literal
pool, the assembler will have callbacks in the macro-assembler (as a
subclass of the assembler)... So, it's pretty hard to manage all this.
Here's a gdb trace:


  vixl::aarch64::MacroAssembler::Ldr (this, rt, imm=1311768467294899695)
  1488      literal = new Literal<uint64_t>(imm,
  1489                                      &literal_pool_,
  1490                                      RawLiteral::kDeletedOnPlacementByPool);

  1498    ldr(rt, literal);

  vixl::aarch64::Assembler::ldr (this, rt=..., literal)
  1693    ldr(rt, static_cast<int>(LinkAndGetWordOffsetTo(literal)));

  vixl::aarch64::Assembler::LinkAndGetWordOffsetTo (this=0x7fffffffd550, literal)
  637     literal->SetLastUse(GetCursorOffset());

  vixl::aarch64::RawLiteral::SetLastUse (this, offset=40) at src/vixl/aarch64/assembler-aarch64.h:1151
  1154      offset_ = -offset - 1;

  vixl::aarch64::Assembler::LinkAndGetWordOffsetTo (this, literal)
  640     literal->GetLiteralPool()->AddEntry(literal);

  vixl::aarch64::LiteralPool::AddEntry
  129     UpdateFirstUse(masm_->GetCursorOffset());

  vixl::aarch64::LiteralPool::UpdateFirstUse(this, use_position=40)
  140       SetNextRecommendedCheckpoint(NextRecommendedCheckpoint());

  vixl::aarch64::LiteralPool::NextRecommendedCheckpoint (this) at src/vixl/aarch64/macro-assembler-aarch64.h:155
  155       return first_use_ + kRecommendedLiteralPoolRange;

  vixl::aarch64::LiteralPool::SetNextRecommendedCheckpoint (this, offset=131112) at src/vixl/aarch64/macro-assembler-aarch64.h:3130
  3129    masm_->recommended_checkpoint_ =
  3130        std::min(masm_->recommended_checkpoint_, offset);
** Note that we've modified a *private* member of the MacroAssembler through a call to the assembler **
  3131    recommended_checkpoint_ = offset;

  vixl::aarch64::LiteralPool::AddEntry (this=0x7fffffffd598, literal=0x28d7f40)
  131     entries_.push_back(literal);
  132     size_ += literal->GetSize();
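
For comparison with the call chain above, the "literal pool as a simple
container" objective could look roughly like the sketch below, where the
pool never reaches back into the macro-assembler and the checkpoint is
updated only by its owner. Every name here (Literal, LiteralPool,
MacroAssemblerSketch, RecordLiteralUse) is hypothetical and only meant to
illustrate the ownership, not the proposed vixl classes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical types for illustration; not the vixl classes.
struct Literal {
  uint32_t value;
  size_t size;  // In bytes.
};

// The pool only stores literals and tracks its size. It holds no pointer to
// an assembler or macro-assembler.
class LiteralPool {
 public:
  void Add(const Literal& literal) {
    assert(literal.size > 0);
    entries_.push_back(literal);
    size_ += literal.size;
  }
  size_t GetSize() const { return size_; }
  const std::vector<Literal>& GetEntries() const { return entries_; }

 private:
  std::vector<Literal> entries_;
  size_t size_ = 0;
};

// The owner of the pool updates its own checkpoint; nothing else writes to it.
class MacroAssemblerSketch {
 public:
  // Called once the load has been emitted and the range of the chosen variant
  // is known. `literal_limit` is the last offset at which this literal may
  // start.
  void RecordLiteralUse(const Literal& literal, int32_t literal_limit) {
    // The literal goes at the end of the pool, so the pool itself has to
    // start earlier by the current pool size.
    int32_t pool_start_limit =
        literal_limit - static_cast<int32_t>(pool_.GetSize());
    checkpoint_ = std::min(checkpoint_, pool_start_limit);
    pool_.Add(literal);
  }
  int32_t GetCheckpoint() const { return checkpoint_; }
  const LiteralPool& GetPool() const { return pool_; }

 private:
  LiteralPool pool_;
  int32_t checkpoint_ = std::numeric_limits<int32_t>::max();
};
```

In this arrangement the only writer of the checkpoint is the class that
owns it, so no private state is changed through a call into the assembler.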