Target-specific lowering in ICE
===============================

This document discusses several issues around generating target-specific ICE
instructions from high-level ICE instructions.

Meeting register address mode constraints
-----------------------------------------

Target-specific instructions often require specific operands to be in physical
registers. Sometimes one specific register is required, but usually any
register in a particular register class will suffice, and that register class is
defined by the instruction/operand type.

The challenge is that ``Variable`` represents an operand that is either a stack
location in the current frame, or a physical register. Register allocation
happens after target-specific lowering, so during lowering we generally don't
know whether a ``Variable`` operand will meet a target instruction's physical
register requirement.

To this end, ICE provides two directives:

    * ``Variable::setWeightInfinite()`` forces a ``Variable`` to get some
      physical register (without specifying which particular one) from a
      register class.

    * ``Variable::setRegNum()`` forces a ``Variable`` to be assigned a specific
      physical register.

These directives are described below in more detail. In most cases, though,
they don't need to be used explicitly, as the routines that create lowered
instructions have reasonable defaults and simple options that control these
directives.

The recommended ICE lowering strategy is to generate extra assignment
instructions involving extra ``Variable`` temporaries, using the directives to
force suitable register assignments for the temporaries, and then let the
register allocator clean things up.

Note: There is a spectrum of *implementation complexity* versus *translation
speed* versus *code quality*.
This recommended strategy picks a point on the
spectrum representing very low complexity ("splat-isel"), pretty good code
quality in terms of frame size and register shuffling/spilling, but perhaps not
the fastest translation speed since extra instructions and operands are created
up front and cleaned up at the end.

Ensuring a non-specific physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The x86 instruction::

    mov dst, src

needs at least one of its operands in a physical register (ignoring the case
where ``src`` is a constant). This can be done as follows::

    mov reg, src
    mov dst, reg

so long as ``reg`` is guaranteed to have a physical register assignment. The
low-level lowering code that accomplishes this looks something like::

    Variable *Reg;
    Reg = Func->makeVariable(Dst->getType());
    Reg->setWeightInfinite();
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Mov::create(Func, Dst, Reg);

``Cfg::makeVariable()`` generates a new temporary, and
``Variable::setWeightInfinite()`` gives it infinite weight for the purpose of
register allocation, thus guaranteeing it a physical register (though leaving
the particular physical register to be determined by the register allocator).

The ``_mov(Dest, Src)`` method in the ``TargetX8632`` class is sufficiently
powerful to handle these details in most situations. Its ``Dest`` argument is
an in/out parameter. If its input value is ``nullptr``, then a new temporary
variable is created, its type is set to the same type as the ``Src`` operand, it
is given infinite register weight, and the new ``Variable`` is returned through
the in/out parameter. (This is in addition to the new temporary being the dest
operand of the ``mov`` instruction.)
The simpler version of the above example
is::

    Variable *Reg = nullptr;
    _mov(Reg, Src);
    _mov(Dst, Reg);

Preferring another ``Variable``'s physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(An older version of ICE allowed the lowering code to provide a register
allocation hint: if a physical register is to be assigned to one ``Variable``,
then prefer a particular ``Variable``'s physical register if available. This
hint would be used to try to reduce the amount of register shuffling.
Currently, the register allocator does this automatically through the
``FindPreference`` logic.)

Ensuring a specific physical register
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Some instructions require operands in specific physical registers, or produce
results in specific physical registers. For example, the 32-bit ``ret``
instruction needs its operand in ``eax``. This can be done with
``Variable::setRegNum()``::

    Variable *Reg;
    Reg = Func->makeVariable(Src->getType());
    Reg->setWeightInfinite();
    Reg->setRegNum(Reg_eax);
    NewInst = InstX8632Mov::create(Func, Reg, Src);
    NewInst = InstX8632Ret::create(Func, Reg);

Precoloring with ``Variable::setRegNum()`` effectively gives it infinite weight
for register allocation, so the call to ``Variable::setWeightInfinite()`` is
technically unnecessary, but perhaps documents the intention a bit more
strongly.

The ``_mov(Dest, Src, RegNum)`` method in the ``TargetX8632`` class has an
optional ``RegNum`` argument to force a specific register assignment when the
input ``Dest`` is ``nullptr``. As described above, passing in ``Dest=nullptr``
causes a new temporary variable to be created with infinite register weight, and
in addition the specific register is chosen.
The simpler version of the above
example is::

    Variable *Reg = nullptr;
    _mov(Reg, Src, Reg_eax);
    _ret(Reg);

Disabling live-range interference
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

(An older version of ICE allowed an overly strong preference for another
``Variable``'s physical register even if their live ranges interfered. This was
risky, and currently the register allocator derives this automatically through
the ``AllowOverlap`` logic.)

Call instructions kill scratch registers
----------------------------------------

A ``call`` instruction kills the values in all scratch registers, so it's
important that the register allocator doesn't allocate a scratch register to a
``Variable`` whose live range spans the ``call`` instruction. ICE provides the
``InstFakeKill`` pseudo-instruction to compactly mark such register kills. For
each scratch register, a fake trivial live range is created that begins and ends
in that instruction. The ``InstFakeKill`` instruction is inserted after the
``call`` instruction. For example::

    CallInst = InstX8632Call::create(Func, ... );
    NewInst = InstFakeKill::create(Func, CallInst);

The last argument to the ``InstFakeKill`` constructor links it to the previous
call instruction, such that if its linked instruction is dead-code eliminated,
the ``InstFakeKill`` instruction is eliminated as well. The linked ``call``
instruction could be to a target known to be free of side effects, and therefore
safe to remove if its result is unused.

Instructions producing multiple values
--------------------------------------

ICE instructions allow at most one destination ``Variable``. Some machine
instructions produce more than one usable result. For example, the x86-32
``call`` ABI returns a 64-bit integer result in the ``edx:eax`` register pair.
Also, x86-32 has a version of the ``imul`` instruction that produces a 64-bit
result in the ``edx:eax`` register pair. The x86-32 ``idiv`` instruction
produces the quotient in ``eax`` and the remainder in ``edx``, though generally
only one or the other is needed in the lowering.

To support multi-dest instructions, ICE provides the ``InstFakeDef``
pseudo-instruction, whose destination can be precolored to the appropriate
physical register. For example, a ``call`` returning a 64-bit result in
``edx:eax``::

    CallInst = InstX8632Call::create(Func, RegLow, ... );
    NewInst = InstFakeKill::create(Func, CallInst);
    Variable *RegHigh = Func->makeVariable(IceType_i32);
    RegHigh->setRegNum(Reg_edx);
    NewInst = InstFakeDef::create(Func, RegHigh);

``RegHigh`` is then assigned into the desired ``Variable``. If that assignment
ends up being dead-code eliminated, the ``InstFakeDef`` instruction may be
eliminated as well.

Managing dead-code elimination
------------------------------

ICE instructions with a non-nullptr ``Dest`` are subject to dead-code
elimination. However, some instructions must not be eliminated in order to
preserve side effects. This applies to most function calls, volatile loads, and
loads and integer divisions where the underlying language and runtime are
relying on hardware exception handling.

ICE facilitates this with the ``InstFakeUse`` pseudo-instruction. This forces a
use of its source ``Variable`` to keep that variable's definition alive. Since
the ``InstFakeUse`` instruction has no ``Dest``, it will not be eliminated.

Here is the full example of the x86-32 ``call`` returning a 32-bit integer
result::

    Variable *Reg = Func->makeVariable(IceType_i32);
    Reg->setRegNum(Reg_eax);
    CallInst = InstX8632Call::create(Func, Reg, ... );
    NewInst = InstFakeKill::create(Func, CallInst);
    NewInst = InstFakeUse::create(Func, Reg);
    NewInst = InstX8632Mov::create(Func, Result, Reg);

Without the ``InstFakeUse``, the entire call sequence could be dead-code
eliminated if its result were unused.

One more note on this topic. These tools can be used to allow a multi-dest
instruction to be dead-code eliminated only when none of its results is live.
The key is to use the optional source parameter of the ``InstFakeDef``
instruction. Using pseudocode::

    t1:eax = call foo(arg1, ...)
    InstFakeKill  // eax, ecx, edx
    t2:edx = InstFakeDef(t1)
    v_result_low = t1
    v_result_high = t2

If ``v_result_high`` is live but ``v_result_low`` is dead, adding ``t1`` as an
argument to ``InstFakeDef`` suffices to keep the ``call`` instruction live.

Instructions modifying source operands
--------------------------------------

Some native instructions may modify one or more source operands. For example,
the x86 ``xadd`` and ``xchg`` instructions modify both source operands. Some
analysis needs to identify every place a ``Variable`` is modified, and it uses
the presence of a ``Dest`` variable for this analysis. Since ICE instructions
have at most one ``Dest``, the ``xadd`` and ``xchg`` instructions need special
treatment.

A ``Variable`` that is not the ``Dest`` can be marked as modified by adding an
``InstFakeDef``. However, this is not sufficient, as the ``Variable`` may have
no more live uses, which could result in the ``InstFakeDef`` being dead-code
eliminated. The solution is to add an ``InstFakeUse`` as well.

To summarize, for every source ``Variable`` that is not equal to the
instruction's ``Dest``, append an ``InstFakeDef`` and ``InstFakeUse``
instruction to provide the necessary analysis information.
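
The interaction between ``InstFakeDef``, ``InstFakeUse``, and dead-code
elimination can be illustrated with a small standalone model. This is only a
sketch under stated assumptions: ``Inst``, ``lowerXchg``, ``eliminateDeadCode``,
and ``hasFakeDefOf`` are hypothetical names invented for this illustration, not
ICE classes or functions, and the single-block backward liveness pass is a
simplified stand-in for whatever ICE actually does.

```cpp
#include <algorithm>
#include <set>
#include <string>
#include <vector>

// Toy model only: none of these names are real ICE classes or functions.
enum class Kind { Xchg, FakeDef, FakeUse };

struct Inst {
  Kind Op;
  std::string Dest;              // empty string models "no Dest"
  std::vector<std::string> Srcs; // variables read by the instruction
};

// Model the lowering of "xchg A, B": A is the Dest, but B is modified too.
// The extra write to B is recorded with a fake def, optionally followed by
// a fake use of B.
std::vector<Inst> lowerXchg(const std::string &A, const std::string &B,
                            bool AddFakeUse) {
  std::vector<Inst> Block;
  Block.push_back(Inst{Kind::Xchg, A, {A, B}});
  Block.push_back(Inst{Kind::FakeDef, B, {}});
  if (AddFakeUse)
    Block.push_back(Inst{Kind::FakeUse, "", {B}});
  return Block;
}

// Simplified dead-code elimination over one basic block. Walking backward,
// an instruction is kept if it has no Dest or if its Dest is live; the
// sources of every kept instruction become live.
std::vector<Inst> eliminateDeadCode(const std::vector<Inst> &Block,
                                    std::set<std::string> Live) {
  std::vector<Inst> Kept;
  for (auto It = Block.rbegin(); It != Block.rend(); ++It) {
    const bool HasDest = !It->Dest.empty();
    if (HasDest && Live.count(It->Dest) == 0)
      continue; // the result is never read, so drop the instruction
    if (HasDest)
      Live.erase(It->Dest);
    for (const std::string &Src : It->Srcs)
      Live.insert(Src);
    Kept.push_back(*It);
  }
  std::reverse(Kept.begin(), Kept.end()); // restore program order
  return Kept;
}

// Report whether a fake def of Var survived in Block.
bool hasFakeDefOf(const std::vector<Inst> &Block, const std::string &Var) {
  for (const Inst &I : Block)
    if (I.Op == Kind::FakeDef && I.Dest == Var)
      return true;
  return false;
}
```

In this model, with only ``a`` live on exit from the block,
``eliminateDeadCode(lowerXchg("a", "b", false), {"a"})`` drops the fake def of
``b`` because nothing reads ``b`` afterward, whereas passing ``true`` keeps all
three instructions, so the modification of ``b`` stays visible to later
analysis.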