# On-Stack Replacement

### Overview

On-Stack Replacement (OSR) is a technique for switching between different implementations of the same function.

By OSR we mean the transition from interpreter code to optimized code. The opposite transition, from optimized to
unoptimized code, is called `Deoptimization`.

OSR workflow:
```
                                    +-----------------------+
                                    |                       |
                                    |     Interpreter       |
                                    |                       |
                                    +-----------------------+
    Method::osr_code                            |
    +------------------------+                  |
    | Method Prologue        |                  V
    +------------------------+         +-----------------+
    | mov x10, 0             |         |OsrEntry         |
    | mov d4, 3.14           |         +-----------------+
    |                        |                  |
    |                        |                  +---------------------+
    |        . . .           |                  |                     V
    |                        |                  |            +-------------------+
    | osr_entry_1:           |                  |            |  PrepareOsrEntry  |
+-->|------------------------|                  |            |(fill CFrame from  |
|   |  Loop 2                |                  |            | OsrStateStamp)    |
|   |                        |                  |            +-------------------+
|   |                        |                  |   CFrame          |       ^
|   |------------------------|                  |<------------------+       |
|   |        . . .           |                  |                           |
|   |                        |                  |       OsrStateStamp       |
|   |------------------------|                  |      +-----------------------------------+
|   | Method epilogue        |                  |      |native_pc   : INVALID              |
|   |------------------------|                  |      |bytecode_pc : offsetof osr_entry_1 |
|   | OSR Stub 1:            |<-----------------+      |osr_entry   : osr_code+bytecode_pc |
|   | mov x10, 0             |                         |vregs[]     : vreg1=Slot(2)        |
|   | mov d4, 3.14           |                         |              vreg4=CpuReg(8)      |
+---| jump osr_entry_1       |                         +-----------------------------------+
    +------------------------+
```

### Triggering

OSR and regular compilation share the same hotness counter. When the counter overflows, we first check
whether the method is already compiled. If it is not, we start compilation in regular mode. Otherwise, we compile
the method in OSR mode.

If compilation is triggered when OSR-compiled code is already set, we begin the On-Stack Replacement procedure.

Triggering workflow:

![triggering_scheme](images/osr_trigger.png)
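
The decision described in this section can be sketched roughly as follows. This is only an illustration, not the runtime's actual code: the `Method` fields, the `HOTNESS_THRESHOLD` value, and the returned action names are all hypothetical.

```python
from dataclasses import dataclass

HOTNESS_THRESHOLD = 3  # hypothetical value, chosen only for this example


@dataclass
class Method:
    """Hypothetical stand-in for the runtime's method descriptor."""
    hotness_counter: int = 0
    has_compiled_code: bool = False
    has_osr_code: bool = False


def on_hotness_event(method: Method) -> str:
    """Return the action taken after one hotness-counter increment."""
    method.hotness_counter += 1
    if method.hotness_counter < HOTNESS_THRESHOLD:
        return "interpret"        # counter has not overflowed yet
    if method.has_osr_code:
        return "enter_osr"        # OSR code is already set: start replacement
    if not method.has_compiled_code:
        return "compile_regular"  # method was never compiled
    return "compile_osr"          # already compiled: compile in OSR mode
```

The OSR-code check comes first here so that a method whose OSR code is ready enters it immediately instead of triggering another compilation.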

### Compilation

The JIT compiles the whole OSR method the same way it compiles a hot method.

To ensure that every loop in the compiled code can be entered from the interpreter, we need to avoid loop optimizations.
In OSR methods, a special osr-entry flag is added to the loop-header basic blocks, and some optimizations have to skip
such loops.

There are no restrictions on inlining: methods can be inlined in the usual way and all loop optimizations are
applicable to them, because the loop headers of inlined methods are not marked as osr-entry.

A new pseudo-instruction, SaveStateOsr, is introduced. It must be the first instruction in each loop-header basic block
whose osr-entry flag is set.
This instruction contains information about all virtual registers that are live at the entry to the loop.
Codegen creates a special OsrStackMap for each SaveStateOsr instruction. It differs from a regular stackmap in that it has
an `osr entry bytecode offset` field.
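
As a rough illustration of the marking rules above, the sketch below flags loop headers of the outer method, prepends a `SaveStateOsr` placeholder to them, and reports which loops remain open to loop optimizations. The `BasicBlock` class, its fields, and both helper functions are hypothetical simplifications, not the compiler's real IR.

```python
from dataclasses import dataclass, field


@dataclass
class BasicBlock:
    """Hypothetical, heavily simplified basic block."""
    name: str
    is_loop_header: bool = False
    from_inlined_method: bool = False
    osr_entry: bool = False
    insts: list = field(default_factory=list)


def mark_osr_entries(blocks, live_vregs_at):
    """Flag loop headers of the outer method and prepend SaveStateOsr."""
    for block in blocks:
        if block.is_loop_header and not block.from_inlined_method:
            block.osr_entry = True
            # SaveStateOsr must be the first instruction of the block and
            # records the virtual registers live at the loop entry.
            live = tuple(live_vregs_at[block.name])
            block.insts.insert(0, ("SaveStateOsr", live))


def loops_open_to_optimization(blocks):
    """Names of loop headers that are not OSR entries."""
    return [b.name for b in blocks if b.is_loop_header and not b.osr_entry]
```

In this model, a loop that comes from an inlined method keeps `osr_entry == False`, so it stays eligible for every loop optimization.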

### Metainfo

On each OSR entry, we need to restore the execution context.
To do this, we need to know all virtual registers that are live at that point.
For this purpose, a new stackmap and a new opcode were introduced.

The new opcode (OsrSaveState) has the same properties as a regular SaveState, except that codegen handles it differently.
No code is generated in place of OsrSaveState; instead, a special OsrEntryStub entity is created,
which is needed to generate the OSR entry code.

OsrEntryStub does the following:
1. moves all constants to the CPU registers or frame slots by inserting move or store instructions
2. encodes a jump instruction to the head of the loop where the corresponding OsrSaveState is located

The first step is necessary because the Panda compiler can place some constants in CPU registers,
but the constants themselves are not virtual registers and are not stored in the metainfo.
Accordingly, they need to be restored to the CPU registers or frame slots.
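
The two steps of OsrEntryStub can be sketched as a function that emits textual pseudo-assembly. The function name, the `(location, value)` encoding of constants, and the `store slot[...]` notation are assumptions made for this example; real codegen emits machine instructions.

```python
def build_osr_entry_stub(constants, loop_label):
    """Emit pseudo-assembly for one OSR entry stub.

    constants:  list of ((kind, where), value) pairs, where kind is
                "reg" (CPU register name) or "slot" (frame slot index).
    loop_label: label of the loop head that holds the OsrSaveState.
    """
    code = []
    # Step 1: restore constants, which are not covered by the metainfo.
    for (kind, where), value in constants:
        if kind == "reg":
            code.append(f"mov {where}, {value}")
        else:
            code.append(f"store slot[{where}], {value}")
    # Step 2: jump to the head of the loop.
    code.append(f"jump {loop_label}")
    return code
```

With the constants from the workflow diagram, `build_osr_entry_stub([(("reg", "x10"), 0), (("reg", "d4"), 3.14)], "osr_entry_1")` yields the three lines shown there under `OSR Stub 1`.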

OSR stackmaps (OsrStateStamp) are needed to restore virtual registers.
Each OsrStateStamp is linked to a specific bytecode offset, which is the offset of the first instruction of the loop.
The stackmap contains all the information needed to convert an IFrame to a CFrame.
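
To make the IFrame-to-CFrame conversion concrete, here is a minimal model of a stamp lookup followed by the register transfer. The `Slot`, `CpuReg`, and `OsrStateStamp` classes and both functions are hypothetical; only the field names `bytecode_pc` and `vregs` are taken from the workflow diagram.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    index: int  # frame slot in the CFrame


@dataclass(frozen=True)
class CpuReg:
    index: int  # CPU register number


@dataclass
class OsrStateStamp:
    bytecode_pc: int  # offset of the first instruction of the loop
    vregs: dict       # vreg number -> Slot or CpuReg


def find_stamp(stamps, bytecode_offset):
    """Return the stamp linked to bytecode_offset, or None."""
    for stamp in stamps:
        if stamp.bytecode_pc == bytecode_offset:
            return stamp
    return None


def fill_cframe(stamp, iframe_vregs):
    """Copy live vreg values to the locations the compiled code expects."""
    slots, regs = {}, {}
    for vreg, location in stamp.vregs.items():
        value = iframe_vregs[vreg]
        if isinstance(location, Slot):
            slots[location.index] = value
        else:
            regs[location.index] = value
    return slots, regs
```

Only the virtual registers listed in the stamp are transferred. Constants are not part of the stamp and are restored separately by the entry stub.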

### Frame replacement

Since the Panda interpreter is written in C++, we have no access to its stack. Thus, we cannot simply replace
the interpreter frame with a cframe on the stack. When OSR occurs, we call the OSR-compiled code, and once it finishes
execution we return `true` to the interpreter. The interpreter, in turn, executes a fake `return` instruction to exit
from the execution procedure.

Pseudocode:
```python
def interpreter_work():
    while True:  # dispatch loop
        if current_inst.opcode == Return:
            return
        elif current_inst.opcode == Jump:
            # A backward jump is a loop back-edge: update the hotness counter
            if target < current_inst.offset:
                if update_hotness(method, current_inst.bytecode_offset):
                    # OSR code has already run to completion; exit via a fake Return
                    set_current_inst(Return)
        ...

def update_hotness(method: Method, bytecode_offset: int) -> bool:
    hotness_counter += 1
    if hotness_counter < threshold:
        return False

    if method.HasOsrCode():
        return osr_entry(method, bytecode_offset)

    ...  # run compilation, see Triggering for more information

    return False

def osr_entry(method: Method, bytecode_offset: int) -> bool:
    stamp = Metainfo.find_stamp(bytecode_offset)
    if not stamp:
        return False

    # Call assembly functions to do OSR magic

    return True
```

Most of the OSR entry is written in assembly language, because the CFrame resides on the native stack.

OSR entry can occur in three different contexts, depending on the kind of the previous frame:
1. **Previous frame is CFrame**

    Before: cframe->c2i->iframe

    After: cframe->cframe'

    The new cframe is created in place of the `c2i` frame, which is simply dropped.

2. **Previous frame is IFrame**

    Before: iframe->iframe

    After: iframe->i2c->cframe'

    The new cframe is created at the current stack position, but an i2c bridge has to be inserted before it.

3. **Previous frame is null (the current frame is the top frame)**

    Before: iframe

    After: cframe'

c2i - compiled to interpreter code bridge

i2c - interpreter to compiled code bridge

cframe' - new cframe, converted from iframe
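
The three contexts can be summarized as a transformation of the frame chain. The sketch below models the stack as a list of frame-kind strings, oldest first. This representation and the function name are assumptions made for illustration; the real transition is done in assembly.

```python
def frames_after_osr(frames):
    """Return the frame chain after OSR entry.

    frames: list of frame kinds from oldest to current, e.g.
            ["cframe", "c2i", "iframe"]. The last entry must be the
            interpreter frame that is being replaced.
    """
    if not frames or frames[-1] != "iframe":
        raise ValueError("OSR entry requires an interpreter frame on top")
    previous = frames[:-1]
    if not previous:
        # Case 3: no previous frame.
        return ["cframe'"]
    if previous[-1] == "c2i":
        # Case 1: previous frame is a CFrame; the c2i bridge is dropped.
        return previous[:-1] + ["cframe'"]
    if previous[-1] == "iframe":
        # Case 2: previous frame is an IFrame; an i2c bridge is inserted.
        return previous + ["i2c", "cframe'"]
    raise ValueError(f"unexpected previous frame kind: {previous[-1]}")
```

For example, `frames_after_osr(["cframe", "c2i", "iframe"])` gives `["cframe", "cframe'"]`, which matches the first case above.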