1# A higher level definition of the bytecode interpreter 2 3## Abstract 4 5The CPython interpreter is defined in C, meaning that the semantics of the 6bytecode instructions, the dispatching mechanism, error handling, and 7tracing and instrumentation are all intermixed. 8 9This document proposes defining a custom C-like DSL for defining the 10instruction semantics and tools for generating the code deriving from 11the instruction definitions. 12 13These tools would be used to: 14* Generate the main interpreter (done) 15* Generate the tier 2 interpreter 16* Generate documentation for instructions 17* Generate metadata about instructions, such as stack use (done). 18* Generate the tier 2 optimizer's abstract interpreter. 19 20Having a single definition file ensures that there is a single source 21of truth for bytecode semantics. 22 23Other tools that operate on bytecodes, like `frame.setlineno` 24and the `dis` module, will be derived from the common semantic 25definition, reducing errors. 26 27## Motivation 28 29The bytecode interpreter of CPython has traditionally been defined as standard 30C code, but with a lot of macros. 31The presence of these macros and the nature of bytecode interpreters means 32that the interpreter is effectively defined in a domain specific language (DSL). 33 34Rather than using an ad-hoc DSL embedded in the C code for the interpreter, 35a custom DSL should be defined and the semantics of the bytecode instructions, 36and the instructions defined in that DSL. 37 38Generating the interpreter decouples low-level details of dispatching 39and error handling from the semantics of the instructions, resulting 40in more maintainable code and a potentially faster interpreter. 41 42It also provides the ability to create and check optimizers and optimization 43passes from the semantic definition, reducing errors. 44 45## Rationale 46 47As we improve the performance of CPython, we need to optimize larger regions 48of code, use more complex optimizations and, ultimately, translate to machine 49code. 50 51All of these steps introduce the possibility of more bugs, and require more code 52to be written. One way to mitigate this is through the use of code generators. 53Code generators decouple the debugging of the code (the generator) from checking 54the correctness (the DSL input). 55 56For example, we are likely to want a new interpreter for the tier 2 optimizer 57to be added in 3.12. That interpreter will have a different API, a different 58set of instructions and potentially different dispatching mechanism. 59But the instructions it will interpret will be built from the same building 60blocks as the instructions for the tier 1 (PEP 659) interpreter. 61 62Rewriting all the instructions is tedious and error-prone, and changing the 63instructions is a maintenance headache as both versions need to be kept in sync. 64 65By using a code generator and using a common source for the instructions, or 66parts of instructions, we can reduce the potential for errors considerably. 67 68 69## Specification 70 71This specification is a work in progress. 72We update it as the need arises. 73 74### Syntax 75 76Each op definition has a kind, a name, a stack and instruction stream effect, 77and a piece of C code describing its semantics:: 78 79``` 80 file: 81 (definition | family | pseudo)+ 82 83 definition: 84 "inst" "(" NAME ["," stack_effect] ")" "{" C-code "}" 85 | 86 "op" "(" NAME "," stack_effect ")" "{" C-code "}" 87 | 88 "macro" "(" NAME ")" "=" uop ("+" uop)* ";" 89 90 stack_effect: 91 "(" [inputs] "--" [outputs] ")" 92 93 inputs: 94 input ("," input)* 95 96 outputs: 97 output ("," output)* 98 99 input: 100 object | stream | array 101 102 output: 103 object | array 104 105 uop: 106 NAME | stream 107 108 object: 109 NAME [":" type] [ "if" "(" C-expression ")" ] 110 111 type: 112 NAME ["*"] 113 114 stream: 115 NAME "/" size 116 117 size: 118 INTEGER 119 120 array: 121 object "[" C-expression "]" 122 123 family: 124 "family" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";" 125 126 pseudo: 127 "pseudo" "(" NAME ")" = "{" NAME ("," NAME)+ [","] "}" ";" 128``` 129 130The following definitions may occur: 131 132* `inst`: A normal instruction, as previously defined by `TARGET(NAME)` in `ceval.c`. 133* `op`: A part instruction from which macros can be constructed. 134* `macro`: A bytecode instruction constructed from ops and cache effects. 135 136`NAME` can be any ASCII identifier that is a C identifier and not a C or Python keyword. 137`foo_1` is legal. `$` is not legal, nor is `struct` or `class`. 138 139The optional `type` in an `object` is the C type. It defaults to `PyObject *`. 140The objects before the "--" are the objects on top of the stack at the start of 141the instruction. Those after the "--" are the objects on top of the stack at the 142end of the instruction. 143 144 145An `inst` without `stack_effect` is a transitional form to allow the original C code 146definitions to be copied. It lacks information to generate anything other than the 147interpreter, but is useful for initial porting of code. 148 149Stack effect names may be `unused`, indicating the space is to be reserved 150but no use of it will be made in the instruction definition. 151This is useful to ensure that all instructions in a family have the same 152stack effect. 153 154The number in a `stream` define how many codeunits are consumed from the 155instruction stream. It returns a 16, 32 or 64 bit value. 156If the name is `unused` the size can be any value and that many codeunits 157will be skipped in the instruction stream. 158 159By convention cache effects (`stream`) must precede the input effects. 160 161The name `oparg` is pre-defined as a 32 bit value fetched from the instruction stream. 162 163### Special instruction annotations 164 165Instruction headers may be prefixed by one or more annotations. The non-exhaustive 166list of annotations and their meanings are as follows: 167 168* `override`. For external use by other interpreter definitions to override the current 169 instruction definition. 170* `pure`. This instruction has no side effects. 171* 'tierN'. This instruction only used by tier N interpreter. 172 173### Special functions/macros 174 175The C code may include special functions that are understood by the tools as 176part of the DSL. 177 178Those functions include: 179 180* `DEOPT_IF(cond, instruction)`. Deoptimize if `cond` is met. 181* `ERROR_IF(cond, label)`. Jump to error handler at `label` if `cond` is true. 182* `DECREF_INPUTS()`. Generate `Py_DECREF()` calls for the input stack effects. 183* `SYNC_SP()`. Synchronizes the physical stack pointer with the stack effects. 184 185Note that the use of `DECREF_INPUTS()` is optional -- manual calls 186to `Py_DECREF()` or other approaches are also acceptable 187(e.g. calling an API that "steals" a reference). 188 189Variables can either be defined in the input, output, or in the C code. 190Variables defined in the input may not be assigned in the C code. 191If an `ERROR_IF` occurs, all values will be removed from the stack; 192they must already be `DECREF`'ed by the code block. 193If a `DEOPT_IF` occurs, no values will be removed from the stack or 194the instruction stream; no values must have been `DECREF`'ed or created. 195 196These requirements result in the following constraints on the use of 197`DEOPT_IF` and `ERROR_IF` in any instruction's code block: 198 1991. Until the last `DEOPT_IF`, no objects may be allocated, `INCREF`ed, 200 or `DECREF`ed. 2012. Before the first `ERROR_IF`, all input values must be `DECREF`ed, 202 and no objects may be allocated or `INCREF`ed, with the exception 203 of attempting to create an object and checking for success using 204 `ERROR_IF(result == NULL, label)`. (TODO: Unclear what to do with 205 intermediate results.) 2063. No `DEOPT_IF` may follow an `ERROR_IF` in the same block. 207 208(There is some wiggle room: these rules apply to dynamic code paths, 209not to static occurrences in the source code.) 210 211If code detects an error condition before the first `DECREF` of an input, 212two idioms are valid: 213 214- Use `goto error`. 215- Use a block containing the appropriate `DECREF` calls ending in 216 `ERROR_IF(true, error)`. 217 218An example of the latter would be: 219```cc 220 res = PyObject_Add(left, right); 221 if (res == NULL) { 222 DECREF_INPUTS(); 223 ERROR_IF(true, error); 224 } 225``` 226 227### Semantics 228 229The underlying execution model is a stack machine. 230Operations pop values from the stack, and push values to the stack. 231They also can look at, and consume, values from the instruction stream. 232 233All members of a family 234(which represents a specializable instruction and its specializations) 235must have the same stack and instruction stream effect. 236 237The same is true for all members of a pseudo instruction 238(which is mapped by the bytecode compiler to one of its members). 239 240## Examples 241 242(Another source of examples can be found in the [tests](test_generator.py).) 243 244Some examples: 245 246### Output stack effect 247```C 248 inst ( LOAD_FAST, (-- value) ) { 249 value = frame->f_localsplus[oparg]; 250 Py_INCREF(value); 251 } 252``` 253This would generate: 254```C 255 TARGET(LOAD_FAST) { 256 PyObject *value; 257 value = frame->f_localsplus[oparg]; 258 Py_INCREF(value); 259 PUSH(value); 260 DISPATCH(); 261 } 262``` 263 264### Input stack effect 265```C 266 inst ( STORE_FAST, (value --) ) { 267 SETLOCAL(oparg, value); 268 } 269``` 270This would generate: 271```C 272 TARGET(STORE_FAST) { 273 PyObject *value = PEEK(1); 274 SETLOCAL(oparg, value); 275 STACK_SHRINK(1); 276 DISPATCH(); 277 } 278``` 279 280### Input stack effect and cache effect 281```C 282 op ( CHECK_OBJECT_TYPE, (owner, type_version/2 -- owner) ) { 283 PyTypeObject *tp = Py_TYPE(owner); 284 assert(type_version != 0); 285 DEOPT_IF(tp->tp_version_tag != type_version); 286 } 287``` 288This might become (if it was an instruction): 289```C 290 TARGET(CHECK_OBJECT_TYPE) { 291 PyObject *owner = PEEK(1); 292 uint32 type_version = read32(next_instr); 293 PyTypeObject *tp = Py_TYPE(owner); 294 assert(type_version != 0); 295 DEOPT_IF(tp->tp_version_tag != type_version); 296 next_instr += 2; 297 DISPATCH(); 298 } 299``` 300 301### More examples 302 303For explanations see "Generating the interpreter" below.) 304```C 305 op ( CHECK_HAS_INSTANCE_VALUES, (owner -- owner) ) { 306 PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner); 307 DEOPT_IF(!_PyDictOrValues_IsValues(dorv)); 308 } 309``` 310```C 311 op ( LOAD_INSTANCE_VALUE, (owner, index/1 -- null if (oparg & 1), res) ) { 312 res = _PyDictOrValues_GetValues(dorv)->values[index]; 313 DEOPT_IF(res == NULL); 314 Py_INCREF(res); 315 null = NULL; 316 Py_DECREF(owner); 317 } 318``` 319```C 320 macro ( LOAD_ATTR_INSTANCE_VALUE ) = 321 counter/1 + CHECK_OBJECT_TYPE + CHECK_HAS_INSTANCE_VALUES + 322 LOAD_INSTANCE_VALUE + unused/4 ; 323``` 324```C 325 op ( LOAD_SLOT, (owner, index/1 -- null if (oparg & 1), res) ) { 326 char *addr = (char *)owner + index; 327 res = *(PyObject **)addr; 328 DEOPT_IF(res == NULL); 329 Py_INCREF(res); 330 null = NULL; 331 Py_DECREF(owner); 332 } 333``` 334```C 335 macro ( LOAD_ATTR_SLOT ) = counter/1 + CHECK_OBJECT_TYPE + LOAD_SLOT + unused/4; 336``` 337```C 338 inst ( BUILD_TUPLE, (items[oparg] -- tuple) ) { 339 tuple = _PyTuple_FromArraySteal(items, oparg); 340 ERROR_IF(tuple == NULL, error); 341 } 342``` 343```C 344 inst ( PRINT_EXPR ) { 345 PyObject *value = POP(); 346 PyObject *hook = _PySys_GetAttr(tstate, &_Py_ID(displayhook)); 347 PyObject *res; 348 if (hook == NULL) { 349 _PyErr_SetString(tstate, PyExc_RuntimeError, 350 "lost sys.displayhook"); 351 Py_DECREF(value); 352 goto error; 353 } 354 res = PyObject_CallOneArg(hook, value); 355 Py_DECREF(value); 356 ERROR_IF(res == NULL); 357 Py_DECREF(res); 358 } 359``` 360 361### Defining an instruction family 362 363A _family_ maps a specializable instruction to its specializations. 364 365Example: These opcodes all share the same instruction format): 366```C 367 family(load_attr) = { LOAD_ATTR, LOAD_ATTR_INSTANCE_VALUE, LOAD_SLOT }; 368``` 369 370### Defining a pseudo instruction 371 372A _pseudo instruction_ is used by the bytecode compiler to represent a set of possible concrete instructions. 373 374Example: `JUMP` may expand to `JUMP_FORWARD` or `JUMP_BACKWARD`: 375```C 376 pseudo(JUMP) = { JUMP_FORWARD, JUMP_BACKWARD }; 377``` 378 379 380## Generating the interpreter 381 382The generated C code for a single instruction includes a preamble and dispatch at the end 383which can be easily inserted. What is more complex is ensuring the correct stack effects 384and not generating excess pops and pushes. 385 386For example, in `CHECK_HAS_INSTANCE_VALUES`, `owner` occurs in the input, so it cannot be 387redefined. Thus it doesn't need to written and can be read without adjusting the stack pointer. 388The C code generated for `CHECK_HAS_INSTANCE_VALUES` would look something like: 389 390```C 391 { 392 PyObject *owner = stack_pointer[-1]; 393 PyDictOrValues dorv = *_PyObject_DictOrValuesPointer(owner); 394 DEOPT_IF(!_PyDictOrValues_IsValues(dorv)); 395 } 396``` 397 398When combining ops together to form instructions, temporary values should be used, 399rather than popping and pushing, such that `LOAD_ATTR_SLOT` would look something like: 400 401```C 402 case LOAD_ATTR_SLOT: { 403 PyObject *s1 = stack_pointer[-1]; 404 /* CHECK_OBJECT_TYPE */ 405 { 406 PyObject *owner = s1; 407 uint32_t type_version = read32(next_instr + 1); 408 PyTypeObject *tp = Py_TYPE(owner); 409 assert(type_version != 0); 410 if (tp->tp_version_tag != type_version) goto deopt; 411 } 412 /* LOAD_SLOT */ 413 { 414 PyObject *owner = s1; 415 uint16_t index = *(next_instr + 1 + 2); 416 char *addr = (char *)owner + index; 417 PyObject *null; 418 PyObject *res = *(PyObject **)addr; 419 if (res == NULL) goto deopt; 420 Py_INCREF(res); 421 null = NULL; 422 Py_DECREF(owner); 423 if (oparg & 1) { 424 stack_pointer[0] = null; 425 stack_pointer += 1; 426 } 427 s1 = res; 428 } 429 next_instr += (1 + 1 + 2 + 1 + 4); 430 stack_pointer[-1] = s1; 431 DISPATCH(); 432 } 433``` 434 435## Other tools 436 437From the instruction definitions we can generate the stack marking code used in `frame.set_lineno()`, 438and the tables for use by disassemblers. 439