• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# IfConversion
2## Overview
3
4`IfConversion` tries to remove branches in executable code by creating linear sections with predicate instructions.
5
6## Rationality
7
8Hardware executes the program speculatively. It tries to predict the branch conditional(true or false) and starts executing instructions after the branch speculatively before executing the branch itself. If the prediction is incorrect(named branch misprediction), the pipeline stops and the state is restored. As result, several dozen cycles are lost. `IfConversion` can add several instructions but remove branch misprediction.
9
10## Dependence
11
12*  Dead Code Elimination(DCE)
13*  Remove Empty Blocks
14*  Remove Linear blocks
15*  Reverse Post Order(RPO)
16
17## Algorithm
18
19Optimization makes a pass through the blocks by post order traversal.
20Two patterns are checked for each block: `Triangle` and `Diamond`.
21
22### Triangle
23
24The pattern:
25
26```
27      [BB]
28       |  \
29       |  [JBB]
30       |  /
31      [PBB]
32```
33`BB` -- basic block the recognition starts from
34`JBB`(Join BB) -- true or false successor of `BB`, which will be joined to BB
35`PBB`(Phi BB) -- false or true successor of `BB`, which contain PHI instruction for BB and JBB
36
37### Diamond
38
39```
40      [BB]
41     /    \
42   [JBB] [JBB 2]
43     \    /
44      [PBB]
45```
46
47`BB` -- basic block the recognition starts from
48`JBB`(Join BB), `JBB 2` -- true and false successors of BB, which will be joined to `BB `
49`PBB`(Phi BB) -- the successor of `JBB` and `JBB 2`, which contain PHI instruction for `JBB` and `JBB 2`
50
51
52### Conditions to check
53
541. `JBB`(and `JBB 2` for Diamond) must have only one predecessor and one successor
552. `PBB` must have 2 or more predecessors
563. `JBB`(and `JBB 2` for Diamond) is the predecessor of the PBB
574. `JBB`(and `JBB 2` for Diamond) doesn't contain instruction with `no_ifcvt` property(for example memory instruction, call instruction, instruction with a call to runtime)
585. The number of instructions in `JBB`(and `JBB 2` for Diamond) less than the limit(set by the option `--compiler-if-conversion-limit=N` with the default value 2)
596. The number of Phi instruction in `PBB`, which have different inputs from corresponding predecessor blocks, should also be less than the limit(each of them would be converted into Select)
607. `PBB` doesn't contain float Phi with different inputs for `JBB` and `BB`(`JBB 2` for Diamond)
61
62
63### Transformation
64
651. `If` instructions removed from `BB`(the necessary information, such as the CC, is saved)
662. Edges `BB` -> `JBB` and `JBB` -> `PBB` are removed
673. All instruction from `JBB` are copied to `BB`
684. Select instructions are constructed at the end of `BB`(`JBB 2` for Diamond)
695. All Phi instructions in `PBB` are edited:
70   a. If `PBB` has other predecessors, we check if inputs from `JBB` and `BB`(`JBB 2` for Diamond) are equal, then input from `JBB` is removed. Otherwise, it is also removed, but input from `BB`(`JBB 2` for Diamond) is changed to corresponding Select instruction.
71   b. If `PBB` doesn't have other predecessors, all Phi inputs are copied to Select instructions and Phi instruction is deleted.
726. For Diamond `BB` and `JBB 2` are merged
737. If `PBB` doesn't have other predecessors, `BB` and `PBB` are merged
748. Loop information is fixed
75
76## Pseudocode
77
78TODO
79
80## Examples
81
82**Triangle**:
83
84Before:
85
86```
87BB 2  preds: [bb 0]
88    3.b    Compare B u64              v0, v1 -> (v4)
89    4.     IfImm NE b                 v3, 0x0
90succs: [bb 3, bb 4]
91
92BB 3  preds: [bb 2]
93    5.u64  Mul                        v0, v2 -> (v6p)
94succs: [bb 4]
95
96BB 4  preds: [bb 2, bb 3]
97   6p.u64  Phi                        v0(bb2), v5(bb3) -> (v7)
98    7.u64  Return                     v6p
99succs: [bb 1]
100```
101After:
102
103```
104BB 2  preds: [bb 0]
105    3.b    Compare B u64              v0, v1 -> (v8)
106    5.u64  Mul                        v0, v2 -> (v8)
107    8.u64  SelectImm NE b             v5, v0, v3, 0x0 -> (v7)
108    7.u64  Return                     v8
109succs: [bb 1]
110```
111
112**Diamond**:
113
114Before:
115
116```
117BB 2  preds: [bb 0]
118    3.b    Compare EQ u32             v1, v2 -> (v4)
119    4.     IfImm NE b                 v3, 0x0
120succs: [bb 3, bb 4]
121
122BB 4  preds: [bb 2]
123    5.u32  Add                        v0, v1 -> (v8p)
124succs: [bb 5]
125
126BB 3  preds: [bb 2]
127    7.u32  Sub                        v0, v1 -> (v8p)
128succs: [bb 5]
129
130BB 5  preds: [bb 4, bb 3]
131   8p.u32  Phi                        v5(bb4), v7(bb3) -> (v9)
132    9.u32  Return                     v8p
133succs: [bb 1]
134```
135
136After:
137
138```
139BB 2  preds: [bb 0]
140    3.b    Compare EQ u32             v1, v2 -> (v10)
141    7.u32  Sub                        v0, v1 -> (v10)
142    5.u32  Add                        v0, v1 -> (v10)
143   10.u32  SelectImm NE b             v7, v5, v3, 0x0 -> (v9)
144    9.u32  Return                     v10
145succs: [bb 1]
146```
147
148## Links
149
150Source code:
151[if_conversion.cpp](../optimizer/optimizations/if_conversion.cpp)
152[if_conversion.h](../optimizer/optimizations/if_conversion.h)
153
154Tests:
155[if_conversion_test.cpp](../tests/if_conversion_test.cpp)
156