1# Loop Peeling 2## Overview 3 4`Loop peeling` optimization modifies the loops with exit-point at loop-header to the loops with exit-point at loop-backedge. 5 6## Rationality 7 8Simplify the loop and allow further loop optimizations. 9 10## Dependence 11 12* Loop Analysis 13* Dominators Tree 14* Reverse Post Order (RPO) 15 16## Algorithm 17 18`Loop peeling` modifies loops with the following requirements: 19 - loop is not irreducible; 20 - loop-header is not OSR-entry; 21 - there is only 1 back-edge; 22 - loop-header is a single loop-exit point; 23 - there are no inner loops; 24 25```cpp 26 [pre-header] 27 | 28 v 29 /---->[header]--------\ 30 | | | 31 | v v 32 \----[back-edge] [outer] 33``` 34 35 There are two stages of the algorithm: 36 37### 1. Insert pre-loop 38 ```cpp 39 [pre-header] 40 | 41 v 42 [pre-loop]--------\ 43 | | 44 v v 45 /---->[header]-------->| 46 | | | 47 | v v 48 \----[back-edge] [resolver] 49 | 50 v 51 [outer] 52 ``` 53Pre-loop basic block is a loop-header clone with all instructions, excluding `SafePoint`. 54 55### 2. Move exit-point form the loop-header to the loop-backedge block 56```cpp 57 [pre-header] 58 | 59 v 60 [pre-loop]--------\ 61 | | 62 v v 63 /---->[header] | 64 | | | 65 | v v 66 \----[back-edge]-->[resolver] 67 | 68 v 69 [outer] 70``` 71All instructions from loop-header are moving to the loop-backedge block. Also control-flow edge between loop-header and resolver-block is moving to the loop-backedge. 72 73## Pseudocode 74 75```cpp 76auto header = loop->GetHeader(); 77auto pre-loop = LoopPeeling->CreatePreLoop(); 78for (auto inst : header->GetInstructions()) { 79 auto clone_inst = Clone(inst); 80 pre-loop.AppendInst(clone_inst); 81} 82auto exit_block = LoopPeeling->CreateExitBlock(); 83for (auto inst : header->GetInstructionsReverse()) { 84 header->EraseInst(inst); 85 exit_block->PrependInst(inst); 86} 87``` 88 89## Examples 90 91 [0] 92 | 93 v 94 /---->[2]-----\ 95 | | | 96 | v v 97 \-----[3] [4] 98 | 99 [exit] 100 101 102 GRAPH(GetGraph()) { 103 PARAMETER(0, 0).u64(); 104 PARAMETER(1, 1).u64(); 105 PARAMETER(2, 2).u64(); 106 BASIC_BLOCK(2, 3, 4) { 107 INST(3, Opcode::Phi).u64().Inputs(1, 5); 108 INST(4, Opcode::Phi).u64().Inputs(2, 10); 109 INST(5, Opcode::Sub).u64().Inputs(3, 2); 110 INST(6, Opcode::SafePoint).Inputs(0, 3, 4).SrcVregs({0, 1, 2}); 111 INST(7, Opcode::Compare).CC(CC_EQ).b().Inputs(5, 0); 112 INST(8, Opcode::IfImm).SrcType(DataType::BOOL).CC(CC_NE).Imm(0).Inputs(7); 113 } 114 BASIC_BLOCK(3, 2) { 115 INST(9, Opcode::And).u64().Inputs(4, 5); 116 INST(10, Opcode::Add).u64().Inputs(9, 4); 117 } 118 BASIC_BLOCK(4, -1) { 119 INST(11, Opcode::Return).u64().Inputs(4); 120 } 121 } 122 123 `Loop peeling` transforms to: 124 125 [0] 126 | 127 v 128 [pre-loop]---------\ 129 | | 130 /---->[2] | 131 | | | 132 | v | 133 | [3] | 134 | | | 135 | v v 136 \--[loop-exit]--->[loop-outer] 137 | 138 v 139 [4] 140 | 141 v 142 [exit] 143 144 GRAPH(expected_graph) { 145 PARAMETER(0, 0).u64(); 146 PARAMETER(1, 1).u64(); 147 PARAMETER(2, 2).u64(); 148 BASIC_BLOCK(5, 2, 4) { 149 INST(12, Opcode::Sub).u64().Inputs(1, 2); 150 INST(13, Opcode::Compare).CC(CC_EQ).b().Inputs(12, 0); 151 INST(14, Opcode::IfImm).SrcType(DataType::BOOL).CC(CC_NE).Imm(0).Inputs(13); 152 } 153 BASIC_BLOCK(2, 2, 4) { 154 INST(3, Opcode::Phi).u64().Inputs({{5, 12}, {2, 5}}); 155 INST(4, Opcode::Phi).u64().Inputs({{5, 2}, {2, 10}}); 156 INST(15, Opcode::Phi).u64().Inputs({{5, 12}, {2, 5}}); 157 INST(9, Opcode::And).u64().Inputs(4, 15); 158 INST(10, Opcode::Add).u64().Inputs(9, 4); 159 INST(5, Opcode::Sub).u64().Inputs(3, 2); 160 INST(6, Opcode::SafePoint).Inputs(0, 5, 10).SrcVregs({0, 1, 2}); 161 INST(7, Opcode::Compare).CC(CC_EQ).b().Inputs(5, 0); 162 INST(8, Opcode::IfImm).SrcType(DataType::BOOL).CC(CC_NE).Imm(0).Inputs(7); 163 } 164 BASIC_BLOCK(4, -1) { 165 INST(16, Opcode::Phi).u64().Inputs({{5, 2}, {2, 10}}); 166 INST(11, Opcode::Return).u64().Inputs(16); 167 } 168 } 169 170## Links 171Source code: 172 173[loop_peeling.cpp](../optimizer/optimizations/loop_peeling.cpp) 174 175[loop_peeling.h](../optimizer/optimizations/loop_peeling.h) 176 177Tests: 178 179[loop_peeling_test.cpp](../tests/loop_peeling_test.cpp)