1# IfConversion 2## Overview 3 4`IfConversion` tries to remove branches in executable code by creating linear sections with predicate instructions. 5 6## Rationality 7 8Hardware executes the program speculatively. It tries to predict the branch conditional(true or false) and starts executing instructions after the branch speculatively before executing the branch itself. If the prediction is incorrect(named branch misprediction), the pipeline stops and the state is restored. As result, several dozen cycles are lost. `IfConversion` can add several instructions but remove branch misprediction. 9 10## Dependence 11 12* Dead Code Elimination(DCE) 13* Remove Empty Blocks 14* Remove Linear blocks 15* Reverse Post Order(RPO) 16 17## Algorithm 18 19Optimization makes a pass through the blocks by post order traversal. 20Two patterns are checked for each block: `Triangle` and `Diamond`. 21 22### Triangle 23 24The pattern: 25 26``` 27 [BB] 28 | \ 29 | [JBB] 30 | / 31 [PBB] 32``` 33`BB` -- basic block the recognition starts from 34`JBB`(Join BB) -- true or false successor of `BB`, which will be joined to BB 35`PBB`(Phi BB) -- false or true successor of `BB`, which contain PHI instruction for BB and JBB 36 37### Diamond 38 39``` 40 [BB] 41 / \ 42 [JBB] [JBB 2] 43 \ / 44 [PBB] 45``` 46 47`BB` -- basic block the recognition starts from 48`JBB`(Join BB), `JBB 2` -- true and false successors of BB, which will be joined to `BB ` 49`PBB`(Phi BB) -- the successor of `JBB` and `JBB 2`, which contain PHI instruction for `JBB` and `JBB 2` 50 51 52### Conditions to check 53 541. `JBB`(and `JBB 2` for Diamond) must have only one predecessor and one successor 552. `PBB` must have 2 or more predecessors 563. `JBB`(and `JBB 2` for Diamond) is the predecessor of the PBB 574. `JBB`(and `JBB 2` for Diamond) doesn't contain instruction with `no_ifcvt` property(for example memory instruction, call instruction, instruction with a call to runtime) 585. The number of instructions in `JBB`(and `JBB 2` for Diamond) less than the limit(set by the option `--compiler-if-conversion-limit=N` with the default value 2) 596. The number of Phi instruction in `PBB`, which have different inputs from corresponding predecessor blocks, should also be less than the limit(each of them would be converted into Select) 607. `PBB` doesn't contain float Phi with different inputs for `JBB` and `BB`(`JBB 2` for Diamond) 61 62 63### Transformation 64 651. `If` instructions removed from `BB`(the necessary information, such as the CC, is saved) 662. Edges `BB` -> `JBB` and `JBB` -> `PBB` are removed 673. All instruction from `JBB` are copied to `BB` 684. Select instructions are constructed at the end of `BB`(`JBB 2` for Diamond) 695. All Phi instructions in `PBB` are edited: 70 a. If `PBB` has other predecessors, we check if inputs from `JBB` and `BB`(`JBB 2` for Diamond) are equal, then input from `JBB` is removed. Otherwise, it is also removed, but input from `BB`(`JBB 2` for Diamond) is changed to corresponding Select instruction. 71 b. If `PBB` doesn't have other predecessors, all Phi inputs are copied to Select instructions and Phi instruction is deleted. 726. For Diamond `BB` and `JBB 2` are merged 737. If `PBB` doesn't have other predecessors, `BB` and `PBB` are merged 748. Loop information is fixed 75 76## Pseudocode 77 78TODO 79 80## Examples 81 82**Triangle**: 83 84Before: 85 86``` 87BB 2 preds: [bb 0] 88 3.b Compare B u64 v0, v1 -> (v4) 89 4. IfImm NE b v3, 0x0 90succs: [bb 3, bb 4] 91 92BB 3 preds: [bb 2] 93 5.u64 Mul v0, v2 -> (v6p) 94succs: [bb 4] 95 96BB 4 preds: [bb 2, bb 3] 97 6p.u64 Phi v0(bb2), v5(bb3) -> (v7) 98 7.u64 Return v6p 99succs: [bb 1] 100``` 101After: 102 103``` 104BB 2 preds: [bb 0] 105 3.b Compare B u64 v0, v1 -> (v8) 106 5.u64 Mul v0, v2 -> (v8) 107 8.u64 SelectImm NE b v5, v0, v3, 0x0 -> (v7) 108 7.u64 Return v8 109succs: [bb 1] 110``` 111 112**Diamond**: 113 114Before: 115 116``` 117BB 2 preds: [bb 0] 118 3.b Compare EQ u32 v1, v2 -> (v4) 119 4. IfImm NE b v3, 0x0 120succs: [bb 3, bb 4] 121 122BB 4 preds: [bb 2] 123 5.u32 Add v0, v1 -> (v8p) 124succs: [bb 5] 125 126BB 3 preds: [bb 2] 127 7.u32 Sub v0, v1 -> (v8p) 128succs: [bb 5] 129 130BB 5 preds: [bb 4, bb 3] 131 8p.u32 Phi v5(bb4), v7(bb3) -> (v9) 132 9.u32 Return v8p 133succs: [bb 1] 134``` 135 136After: 137 138``` 139BB 2 preds: [bb 0] 140 3.b Compare EQ u32 v1, v2 -> (v10) 141 7.u32 Sub v0, v1 -> (v10) 142 5.u32 Add v0, v1 -> (v10) 143 10.u32 SelectImm NE b v7, v5, v3, 0x0 -> (v9) 144 9.u32 Return v10 145succs: [bb 1] 146``` 147 148## Links 149 150Source code: 151[if_conversion.cpp](../optimizer/optimizations/if_conversion.cpp) 152[if_conversion.h](../optimizer/optimizations/if_conversion.h) 153 154Tests: 155[if_conversion_test.cpp](../tests/if_conversion_test.cpp) 156