• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# Loop Peeling
2## Overview
3
4`Loop peeling` optimization modifies the loops with exit-point at loop-header to the loops with exit-point at loop-backedge.
5
6## Rationality
7
8Simplify the loop and allow further loop optimizations.
9
10## Dependence
11
12* Loop Analysis
13* Dominators Tree
14* Reverse Post Order (RPO)
15
16## Algorithm
17
18`Loop peeling` modifies loops with the following requirements:
19 - loop is not irreducible;
20 - loop-header is not OSR-entry;
21 - there is only 1 back-edge;
22 - loop-header is a single loop-exit point;
23 - there are no inner loops;
24
25```cpp
26           [pre-header]
27               |
28               v
29      /---->[header]--------\
30      |        |            |
31      |        v            v
32      \----[back-edge]   [outer]
33```
34
35  There are two stages of the algorithm:
36
37### 1. Insert pre-loop
38 ```cpp
39          [pre-header]
40               |
41               v
42           [pre-loop]--------\
43               |             |
44               v             v
45      /---->[header]-------->|
46      |        |             |
47      |        v             v
48      \----[back-edge]   [resolver]
49                             |
50                             v
51                          [outer]
52 ```
53Pre-loop basic block is a loop-header clone with all instructions, excluding `SafePoint`.
54
55### 2. Move exit-point form the loop-header to the loop-backedge block
56```cpp
57         [pre-header]
58               |
59               v
60           [pre-loop]--------\
61               |             |
62               v             v
63      /---->[header]         |
64      |        |             |
65      |        v             v
66      \----[back-edge]-->[resolver]
67                             |
68                             v
69                          [outer]
70```
71All instructions from loop-header are moving to the loop-backedge block. Also control-flow edge between loop-header and resolver-block is moving to the loop-backedge.
72
73## Pseudocode
74
75```cpp
76auto header = loop->GetHeader();
77auto pre-loop = LoopPeeling->CreatePreLoop();
78for (auto inst : header->GetInstructions()) {
79    auto clone_inst = Clone(inst);
80    pre-loop.AppendInst(clone_inst);
81}
82auto exit_block = LoopPeeling->CreateExitBlock();
83for (auto inst : header->GetInstructionsReverse()) {
84        header->EraseInst(inst);
85        exit_block->PrependInst(inst);
86}
87```
88
89## Examples
90
91               [0]
92                |
93                v
94         /---->[2]-----\
95         |      |      |
96         |      v      v
97         \-----[3]    [4]
98                       |
99                     [exit]
100
101
102    GRAPH(GetGraph()) {
103        PARAMETER(0, 0).u64();
104        PARAMETER(1, 1).u64();
105        PARAMETER(2, 2).u64();
106        BASIC_BLOCK(2, 3, 4) {
107            INST(3, Opcode::Phi).u64().Inputs(1, 5);
108            INST(4, Opcode::Phi).u64().Inputs(2, 10);
109            INST(5, Opcode::Sub).u64().Inputs(3, 2);
110            INST(6, Opcode::SafePoint).Inputs(0, 3, 4).SrcVregs({0, 1, 2});
111            INST(7, Opcode::Compare).CC(CC_EQ).b().Inputs(5, 0);
112            INST(8, Opcode::IfImm).SrcType(DataType::BOOL).CC(CC_NE).Imm(0).Inputs(7);
113        }
114        BASIC_BLOCK(3, 2) {
115            INST(9, Opcode::And).u64().Inputs(4, 5);
116            INST(10, Opcode::Add).u64().Inputs(9, 4);
117        }
118        BASIC_BLOCK(4, -1) {
119            INST(11, Opcode::Return).u64().Inputs(4);
120        }
121    }
122
123 `Loop peeling` transforms to:
124
125               [0]
126                |
127                v
128            [pre-loop]---------\
129                |              |
130         /---->[2]             |
131         |      |              |
132         |      v              |
133         |     [3]             |
134         |      |              |
135         |      v              v
136         \--[loop-exit]--->[loop-outer]
137                               |
138                               v
139                              [4]
140                               |
141                               v
142                             [exit]
143
144    GRAPH(expected_graph) {
145        PARAMETER(0, 0).u64();
146        PARAMETER(1, 1).u64();
147        PARAMETER(2, 2).u64();
148        BASIC_BLOCK(5, 2, 4) {
149            INST(12, Opcode::Sub).u64().Inputs(1, 2);
150            INST(13, Opcode::Compare).CC(CC_EQ).b().Inputs(12, 0);
151            INST(14, Opcode::IfImm).SrcType(DataType::BOOL).CC(CC_NE).Imm(0).Inputs(13);
152        }
153        BASIC_BLOCK(2, 2, 4) {
154            INST(3, Opcode::Phi).u64().Inputs({{5, 12}, {2, 5}});
155            INST(4, Opcode::Phi).u64().Inputs({{5, 2}, {2, 10}});
156            INST(15, Opcode::Phi).u64().Inputs({{5, 12}, {2, 5}});
157            INST(9, Opcode::And).u64().Inputs(4, 15);
158            INST(10, Opcode::Add).u64().Inputs(9, 4);
159            INST(5, Opcode::Sub).u64().Inputs(3, 2);
160            INST(6, Opcode::SafePoint).Inputs(0, 5, 10).SrcVregs({0, 1, 2});
161            INST(7, Opcode::Compare).CC(CC_EQ).b().Inputs(5, 0);
162            INST(8, Opcode::IfImm).SrcType(DataType::BOOL).CC(CC_NE).Imm(0).Inputs(7);
163        }
164        BASIC_BLOCK(4, -1) {
165            INST(16, Opcode::Phi).u64().Inputs({{5, 2}, {2, 10}});
166            INST(11, Opcode::Return).u64().Inputs(16);
167        }
168    }
169
170## Links
171Source code:
172
173[loop_peeling.cpp](../optimizer/optimizations/loop_peeling.cpp)
174
175[loop_peeling.h](../optimizer/optimizations/loop_peeling.h)
176
177Tests:
178
179[loop_peeling_test.cpp](../tests/loop_peeling_test.cpp)