1# Code Sink 2## Overview 3 4The optimization moves instructions into successor blocks, when possible, so that they are not executed on paths where their results are not needed. 5 6## Rationality 7 8This optimization allows to avoid execution of statements that are not used on execution path. This should speedup the execution. 9 10Motivational example: 11``` 12BB 3 13 0.i32 Parameter arg 0 -> (v5, v8, v7) 14succs: [bb 0] 15 16BB 0 preds: [bb 3] 17 8.i32 AddI v0, 0x1 -> (v6) 18 5. IfImm GT i32 v0, 0x0 19succs: [bb 1, bb 2] 20 21BB 2 preds: [bb 0] 22 6.i32 Return v8 23 24BB 1 preds: [bb 0] 25 7.i32 Return v0 26``` 27 28In this example `v8` is used only in one branch however it is always executed. The code sinking optimization suggests to move `v8` into `BB 2`. 29 30## Dependence 31 32* AliasAnalysis 33* DominatorsTree 34* LoopAnalysis 35* Reverse Post Order (RPO) 36 37## Algorithm 38 39The iterative approach is used. On each iteration the optimization tries to sink each instruction to one of its immediately dominated blocks. It is possible if all users of the instruction is dominated by a block that the instruction is sunk into. Instructions in a basic block are iterated in reverse order to decrease the number of iterations. Iterating finishes when no instruction was sunk. 40 41Instructions that cannot sink: 42 43* Instructions allocating memory 44* Control flow instructions 45* Instructions that can throw an exception 46* Barrier instructions (calls, monitors, volatile, SafePoints, etc.) 47* Store instructions 48* Load instructions if they dominate in scope of current basic block: 49 * an aliased store instruction 50 * a Monitor instruction 51 * a volatile store instruction 52 53To determine which load instruction can be sunk we keep a list of store instructions that have been met so far (we are iterating in reverse order; therefore, when we meet load instruction, we have already collected all stores after this load and can easily check on aliases). 54 55Blocks that instruction cannot be sunk into: 56* Blocks that do not dominate all users of the instruction 57* Loads cannot be sunk into blocks with more than one predecessors (because other predecessors might have aliased stores) 58* Do not sunk instructions into loops 59 60## Pseudocode 61 62``` 63void CodeSink::RunImpl() { 64 // Iteratively sink instructions. On each iteration an instruction can be 65 // sunk to its basic block dominatee. Iterate sinking until no changes 66 // happens. 67 bool changed = true; 68 while(changed) { 69 changed = false; 70 for (auto block : GetGraph()->GetBlocksRPO()) { 71 bool barriered = false; 72 ArenaVector<Inst *> stores; 73 for (auto inst : block->InstsSafeReverse()) { 74 barriered |= inst->IsMonitor() || (inst->IsStore && inst->IsVolatile()); 75 candidate = SinkInstruction(inst, &stores, barriered); 76 if (candidate != nullptr) { 77 block->EraseInst(inst); 78 candidate->PrependInst(inst); 79 changed = true; 80 } 81 } 82 } 83 } 84} 85 86BasicBlock *CodeSink::SinkInstruction(Inst *inst, ArenaVector<Inst *> *stores, bool barriered) { 87 // Save stores to be sure we do not sink a load instruction that may be aliased 88 if (inst->IsStore()) { 89 stores->push_back(inst); 90 return nullptr; 91 } 92 // Check that instruction can be sunk 93 if (inst->IsAllocation() || inst->IsControlFlow() || inst->CanThrow() || inst->IsBarrier()) { 94 return nullptr; 95 } 96 if (inst->IsLoad()) { 97 // Do not sink over monitor 98 // Do not sink over volatile store 99 if (barriered) { 100 return nullptr; 101 } 102 for (auto store : *stores) { 103 if (GetGraph()->CheckInstAlias(inst, store) != AliasType::NO_ALIAS) { 104 return nullptr; 105 } 106 } 107 } 108 109 // Iterate over dominated blocks 110 for (auto cand : inst->GetBasicBlock()->GetDominatedBlocks()) { 111 if (IsAcceptableTarget(inst, cand)) { 112 return cand; 113 } 114 } 115 return nullptr; 116} 117 118bool CodeSink::IsAcceptableTarget(Inst *inst, BasicBlock *candidate) { 119 BasicBlock *block = inst->GetBasicBlock(); 120 Loop *loop = block->GetLoop(); 121 Loop *cand_loop = candidate->GetLoop(); 122 if (candidate->GetPredsBlocks().size() > 1) { 123 // Do not sink loads across a critical edge there may be stores in other code paths. 124 if (inst->IsLoad()) { 125 return false; 126 } 127 // Do not sink into loops 128 if (loop != cand_loop) { 129 return false; 130 } 131 } 132 133 // Check that all uses are dominated by the candidate 134 for (auto &user : inst->GetUsers()) { 135 Inst *uinst = user.GetInst(); 136 if (!candidate->IsDominate(uinst->GetBasicBlock())) { 137 return false; 138 } 139 } 140 return true; 141} 142``` 143 144## Examples 145### Regular sinking 146``` 147BB 2 preds: [bb 0] 148 5.i32 Add v3, v2 -> (v15) 149 8.i64 LoadObject 243 v0 -> (v9, v13) 150 9.b Compare NE i64 v8, v4 -> (v10) 151 10. IfImm NE b v9, 0x0 152succs: [bb 3, bb 4] 153 154BB 4 preds: [bb 2] 155 13.i64 StoreObject 243 v1, v8 156succs: [bb 3] 157 158BB 3 preds: [bb 2, bb 4] 159 15.i32 Return v5 160succs: [bb 1] 161``` 162Sink arithmetic operation `v5` but do not `v8` because `BB 3` has several predecessors. 163``` 164BB 2 preds: [bb 0] 165 8.i64 LoadObject 243 v0 -> (v9, v13) 166 9.b Compare NE i64 v8, v4 -> (v10) 167 10. IfImm NE b v9, 0x0 168succs: [bb 3, bb 4] 169 170BB 4 preds: [bb 2] 171 13.i64 StoreObject 243 v1, v8 172succs: [bb 3] 173 174BB 3 preds: [bb 2, bb 4] 175 5.i32 Add v3, v2 -> (v15) 176 15.i32 Return v5 177succs: [bb 1] 178``` 179### Loop Sinking 180``` 181BB 2 preds: [bb 0] 182 6.i64 Add v1, v5 -> (v21) 183succs: [bb 3] 184 185BB 3 preds: [bb 2, bb 3] 186prop: head, loop 1 187 10p.i64 Phi v4(bb2), v22(bb3) -> (v23, v22, v20) 188 20.i64 LoadArray v2, v10p -> (v21) 189 21.i64 Add v20, v6 -> (v26, v22) 190 22.i64 Add v21, v10p -> (v10p, v26) 191 23.i32 Add v10p, v3 -> (v24) 192 26.i64 Add v21, v22 -> (v27) 193 24.b Compare LT i32 v23, v0 -> (v25) 194 25. IfImm NE b v24, 0x0 195succs: [bb 3, bb 4] 196 197BB 4 preds: [bb 3] 198 27.i64 Return v26 199succs: [bb 1] 200``` 201Sinking `v6` into loop and `v26` out of loop 202``` 203BB 2 preds: [bb 0] 204 6.i64 Add v1, v5 -> (v21) 205succs: [bb 3] 206 207BB 3 preds: [bb 2, bb 3] 208prop: head, loop 1 209 10p.i64 Phi v4(bb2), v22(bb3) -> (v23, v20, v22) 210 20.i64 LoadArray v2, v10p -> (v21) 211 21.i64 Add v20, v6 -> (v22, v26) 212 22.i64 Add v21, v10p -> (v10p, v26) 213 23.i32 Add v10p, v3 -> (v24) 214 24.b Compare LT i32 v23, v0 -> (v25) 215 25. IfImm NE b v24, 0x0 216succs: [bb 3, bb 4] 217 218BB 4 preds: [bb 3] 219 26.i64 Add v21, v22 -> (v27) 220 27.i64 Return v26 221succs: [bb 1] 222 223``` 224## Links 225 226Source code: 227[code_sink.cpp](../optimizer/optimizations/code_sink.cpp) 228[code_sink.h](../optimizer/optimizations/code_sink.h) 229 230Tests: 231[code_sink_test.cpp](../tests/code_sink_test.cpp) 232