• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# A higher level definition of the bytecode interpreter
2
3## Abstract
4
5The CPython interpreter is defined in C, meaning that the semantics of the
6bytecode instructions, the dispatching mechanism, error handling, and
7tracing and instrumentation are all intermixed.
8
9This document proposes defining a custom C-like DSL for defining the
10instruction semantics and tools for generating the code deriving from
11the instruction definitions.
12
13These tools would be used to:
14* Generate the main interpreter (done)
15* Generate the tier 2 interpreter
16* Generate documentation for instructions
17* Generate metadata about instructions, such as stack use (done).
18* Generate the tier 2 optimizer's abstract interpreter.
19
20Having a single definition file ensures that there is a single source
21of truth for bytecode semantics.
22
23Other tools that operate on bytecodes, like `frame.setlineno`
24and the `dis` module, will be derived from the common semantic
25definition, reducing errors.
26
27## Motivation
28
29The bytecode interpreter of CPython has traditionally been defined as standard
30C code, but with a lot of macros.
31The presence of these macros and the nature of bytecode interpreters means
32that the interpreter is effectively defined in a domain specific language (DSL).
33
34Rather than using an ad-hoc DSL embedded in the C code for the interpreter,
35a custom DSL should be defined and the semantics of the bytecode instructions,
36and the instructions defined in that DSL.
37
38Generating the interpreter decouples low-level details of dispatching
39and error handling from the semantics of the instructions, resulting
40in more maintainable code and a potentially faster interpreter.
41
42It also provides the ability to create and check optimizers and optimization
43passes from the semantic definition, reducing errors.
44
45## Rationale
46
47As we improve the performance of CPython, we need to optimize larger regions
48of code, use more complex optimizations and, ultimately, translate to machine
49code.
50
51All of these steps introduce the possibility of more bugs, and require more code
52to be written. One way to mitigate this is through the use of code generators.
53Code generators decouple the debugging of the code (the generator) from checking
54the correctness (the DSL input).
55
56For example, we are likely to want a new interpreter for the tier 2 optimizer
57to be added in 3.12. That interpreter will have a different API, a different
58set of instructions and potentially different dispatching mechanism.
59But the instructions it will interpret will be built from the same building
60blocks as the instructions for the tier 1 (PEP 659) interpreter.
61
62Rewriting all the instructions is tedious and error-prone, and changing the
63instructions is a maintenance headache as both versions need to be kept in sync.
64
65By using a code generator and using a common source for the instructions, or
66parts of instructions, we can reduce the potential for errors considerably.
67
68
69## Specification
70
71This specification is a work in progress.
72We update it as the need arises.
73
74### Syntax
75
76Each op definition has a kind, a name, a stack and instruction stream effect,
77and a piece of C code describing its semantics::
78
79```
80  file:
81    (definition | family | pseudo)+
82
83  definition:
84    "inst" "(" NAME ["," stack_effect] ")" "{" C-code "}"
85    |
86    "op" "(" NAME "," stack_effect ")" "{" C-code "}"
87    |
88    "macro" "(" NAME ")" "=" uop ("+" uop)* ";"
89
90  stack_effect:
91    "(" [inputs] "--" [outputs] ")"
92
93  inputs:
94    input ("," input)*
95
96  outputs:
97    output ("," output)*
98
99  input:
100    object | stream | array
101
102  output:
103    object | array
104
105  uop:
106    NAME | stream
107
108  object:
109    NAME [":" type] [ "if" "(" C-expression ")" ]
110
111  type:
112    NAME ["*"]
113
114  stream:
115    NAME "/" size
116
117  size:
118    INTEGER
119
120  array:
121    object "[" C-expression "]"
122
123  family:
124    "family" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
125
126  pseudo:
127    "pseudo" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";"
128```
129
130The following definitions may occur:
131
132* `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`.
133* `op`: A part instruction from which macros can be constructed.
134* `macro`: A bytecode instruction constructed from ops and cache effects.
135
136`NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword.
137`foo_1` is legal. `$` is not legal, nor is `struct` or `class`.
138
139The optional `type` in an `object` is the C type. It defaults to `PyObject *`.
140The objects before the "--" are the objects on top of the stack at the start of
141the instruction. Those after the "--" are the objects on top of the stack at the
142end of the instruction.
143
144
145An `inst` without `stack_effect` is a transitional form to allow the original C code
146definitions to be copied. It lacks information to generate anything other than the
147interpreter, but is useful for initial porting of code.
148
149Stack effect names may be `unused`, indicating the space is to be reserved
150but no use of it will be made in the instruction definition.
151This is useful to ensure that all instructions in a family have the same
152stack effect.
153
154The number in a `stream` define how many codeunits are consumed from the
155instruction stream. It returns a 16, 32 or 64 bit value.
156If the name is `unused` the size can be any value and that many codeunits
157will be skipped in the instruction stream.
158
159By convention cache effects (`stream`) must precede the input effects.
160
161The name `oparg` is pre-defined as a 32 bit value fetched from the instruction stream.
162
163### Special instruction annotations
164
165Instruction headers may be prefixed by one or more annotations. The non-exhaustive
166list of annotations and their meanings are as follows:
167
168* `override`. For external use by other interpreter definitions to override the current
169   instruction definition.
170* `pure`. This instruction has no side effects.
171* 'tierN'. This instruction only used by tier N interpreter.
172
173### Special functions/macros
174
175The C code may include special functions that are understood by the tools as
176part of the DSL.
177
178Those functions include:
179
180* `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met.
181* `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true.
182* `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects.
183* `SYNC_SP()`. Synchronizes the physical stack pointer with the stack effects.
184
185Note that the use of `DECREF_INPUTS()` is optional -- manual calls
186to `Py_DECREF()` or other approaches are also acceptable
187(e.g. calling an API that "steals" a reference).
188
189Variables can either be defined in the input, output, or in the C code.
190Variables defined in the input may not be assigned in the C code.
191If an `ERROR_IF` occurs, all values will be removed from the stack;
192they must already be `DECREF`'ed by the code block.
193If a `DEOPT_IF` occurs, no values will be removed from the stack or
194the instruction stream; no values must have been `DECREF`'ed or created.
195
196These requirements result in the following constraints on the use of
197`DEOPT_IF` and `ERROR_IF` in any instruction's code block:
198
1991. Until the last `DEOPT_IF`, no objects may be allocated, `INCREF`ed,
200   or `DECREF`ed.
2012. Before the first `ERROR_IF`, all input values must be `DECREF`ed,
202   and no objects may be allocated or `INCREF`ed, with the exception
203   of attempting to create an object and checking for success using
204   `ERROR_IF(result == NULL, label)`. (TODO: Unclear what to do with
205   intermediate results.)
2063. No `DEOPT_IF` may follow an `ERROR_IF` in the same block.
207
208(There is some wiggle room: these rules apply to dynamic code paths,
209not to static occurrences in the source code.)
210
211If code detects an error condition before the first `DECREF` of an input,
212two idioms are valid:
213
214- Use `goto error`.
215- Use a block containing the appropriate `DECREF` calls ending in
216  `ERROR_IF(true, error)`.
217
218An example of the latter would be:
219```cc
220    res = PyObject_Add(left, right);
221    if (res == NULL) {
222        DECREF_INPUTS();
223        ERROR_IF(true, error);
224    }
225```
226
227### Semantics
228
229The underlying execution model is a stack machine.
230Operations pop values from the stack, and push values to the stack.
231They also can look at, and consume, values from the instruction stream.
232
233All members of a family
234(which represents a specializable instruction and its specializations)
235must have the same stack and instruction stream effect.
236
237The same is true for all members of a pseudo instruction
238(which is mapped by the bytecode compiler to one of its members).
239
240## Examples
241
242(Another source of examples can be found in the [tests](test_generator.py).)
243
244Some examples:
245
246### Output stack effect
247```C
248    inst ( LOAD_FAST, (-- value) ) {
249        value = frame->f_localsplus[oparg];
250        Py_INCREF(value);
251    }
252```
253This would generate:
254```C
255    TARGET(LOAD_FAST) {
256        PyObject *value;
257        value = frame->f_localsplus[oparg];
258        Py_INCREF(value);
259        PUSH(value);
260        DISPATCH();
261    }
262```
263
264### Input stack effect
265```C
266    inst ( STORE_FAST, (value --) ) {
267        SETLOCAL(oparg, value);
268    }
269```
270This would generate:
271```C
272    TARGET(STORE_FAST) {
273        PyObject *value = PEEK(1);
274        SETLOCAL(oparg, value);
275        STACK_SHRINK(1);
276        DISPATCH();
277    }
278```
279
280### Input stack effect and cache effect
281```C
282    op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) {
283        PyTypeObject *tp = Py_TYPE(owner);
284        assert(type_version != 0);
285        DEOPT_IF(tp->tp_version_tag != type_version);
286    }
287```
288This might become (if it was an instruction):
289```C
290    TARGET(CHECK_OBJECT_TYPE) {
291        PyObject *owner = PEEK(1);
292        uint32 type_version = read32(next_instr);
293        PyTypeObject *tp = Py_TYPE(owner);
294        assert(type_version != 0);
295        DEOPT_IF(tp->tp_version_tag != type_version);
296        next_instr += 2;
297        DISPATCH();
298    }
299```
300
301### More examples
302
303For explanations see "Generating the interpreter" below.)
304```C
305    op ( CHECK_HAS_INSTANCE_VALUES, (owner -- owner) ) {
306        PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner);
307        DEOPT_IF(!_PyDictOrValues_IsValues(dorv));
308    }
309```
310```C
311    op ( LOAD_INSTANCE_VALUE, (owner, index/1 -- null if (oparg & 1), res) ) {
312        res = _PyDictOrValues_GetValues(dorv)->values[index];
313        DEOPT_IF(res == NULL);
314        Py_INCREF(res);
315        null = NULL;
316        Py_DECREF(owner);
317    }
318```
319```C
320    macro ( LOAD_ATTR_INSTANCE_VALUE ) =
321        counter/1 + CHECK_OBJECT_TYPE + CHECK_HAS_INSTANCE_VALUES +
322        LOAD_INSTANCE_VALUE + unused/4 ;
323```
324```C
325    op ( LOAD_SLOT, (owner, index/1 -- null if (oparg & 1), res) ) {
326        char *addr = (char *)owner + index;
327        res = *(PyObject **)addr;
328        DEOPT_IF(res == NULL);
329        Py_INCREF(res);
330        null = NULL;
331        Py_DECREF(owner);
332    }
333```
334```C
335    macro ( LOAD_ATTR_SLOT ) = counter/1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused/4;
336```
337```C
338    inst ( BUILD_TUPLE, (items[oparg] -- tuple) ) {
339        tuple = _PyTuple_FromArraySteal(items, oparg);
340        ERROR_IF(tuple == NULL, error);
341    }
342```
343```C
344    inst ( PRINT_EXPR ) {
345        PyObject *value = POP();
346        PyObject *hook = _PySys_GetAttr(tstate, &_Py_ID(displayhook));
347        PyObject *res;
348        if (hook == NULL) {
349            _PyErr_SetString(tstate, PyExc_RuntimeError,
350                                "lost sys.displayhook");
351            Py_DECREF(value);
352            goto error;
353        }
354        res = PyObject_CallOneArg(hook, value);
355        Py_DECREF(value);
356        ERROR_IF(res == NULL);
357        Py_DECREF(res);
358    }
359```
360
361### Defining an instruction family
362
363A _family_ maps a specializable instruction to its specializations.
364
365Example: These opcodes all share the same instruction format):
366```C
367    family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT };
368```
369
370### Defining a pseudo instruction
371
372A _pseudo instruction_ is used by the bytecode compiler to represent a set of possible concrete instructions.
373
374Example: `JUMP` may expand to `JUMP_FORWARD` or `JUMP_BACKWARD`:
375```C
376    pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD };
377```
378
379
380## Generating the interpreter
381
382The generated C code for a single instruction includes a preamble and dispatch at the end
383which can be easily inserted. What is more complex is ensuring the correct stack effects
384and not generating excess pops and pushes.
385
386For example, in `CHECK_HAS_INSTANCE_VALUES`, `owner` occurs in the input, so it cannot be
387redefined. Thus it doesn't need to written and can be read without adjusting the stack pointer.
388The C code generated for `CHECK_HAS_INSTANCE_VALUES` would look something like:
389
390```C
391    {
392        PyObject *owner = stack_pointer[-1];
393        PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner);
394        DEOPT_IF(!_PyDictOrValues_IsValues(dorv));
395    }
396```
397
398When combining ops together to form instructions, temporary values should be used,
399rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something like:
400
401```C
402    case LOAD_ATTR_SLOT: {
403        PyObject *s1 = stack_pointer[-1];
404        /* CHECK_OBJECT_TYPE */
405        {
406            PyObject *owner = s1;
407            uint32_t type_version = read32(next_instr + 1);
408            PyTypeObject *tp = Py_TYPE(owner);
409            assert(type_version != 0);
410            if (tp->tp_version_tag != type_version) goto deopt;
411        }
412        /* LOAD_SLOT */
413        {
414            PyObject *owner = s1;
415            uint16_t index = *(next_instr + 1 + 2);
416            char *addr = (char *)owner + index;
417            PyObject *null;
418            PyObject *res = *(PyObject **)addr;
419            if (res == NULL) goto deopt;
420            Py_INCREF(res);
421            null = NULL;
422            Py_DECREF(owner);
423            if (oparg & 1) {
424                stack_pointer[0] = null;
425                stack_pointer += 1;
426            }
427            s1 = res;
428        }
429        next_instr += (1 + 1 + 2 + 1 + 4);
430        stack_pointer[-1] = s1;
431        DISPATCH();
432    }
433```
434
435## Other tools
436
437From the instruction definitions we can generate the stack marking code used in `frame.set_lineno()`,
438and the tables for use by disassemblers.
439