| Name | Date | Size | #Lines | LOC |
|---|---|---|---|---|
| README.md | 04-Jul-2025 | 2.7 KiB | 58 | 49 |
| _typing_backports.py | 04-Jul-2025 | 469 B | 16 | 10 |
| analyzer.py | 04-Jul-2025 | 26.6 KiB | 891 | 747 |
| cwriter.py | 04-Jul-2025 | 4.3 KiB | 147 | 128 |
| generators_common.py | 04-Jul-2025 | 5.9 KiB | 243 | 213 |
| interpreter_definition.md | 04-Jul-2025 | 13.7 KiB | 439 | 349 |
| lexer.py | 04-Jul-2025 | 8 KiB | 376 | 313 |
| mypy.ini | 04-Jul-2025 | 381 B | 16 | 13 |
| opcode_id_generator.py | 04-Jul-2025 | 1.7 KiB | 66 | 51 |
| opcode_metadata_generator.py | 04-Jul-2025 | 13.6 KiB | 392 | 340 |
| optimizer_generator.py | 04-Jul-2025 | 7.4 KiB | 237 | 204 |
| parser.py | 04-Jul-2025 | 1.8 KiB | 67 | 51 |
| parsing.py | 04-Jul-2025 | 15 KiB | 481 | 395 |
| plexer.py | 04-Jul-2025 | 3.3 KiB | 111 | 81 |
| py_metadata_generator.py | 04-Jul-2025 | 2.8 KiB | 98 | 77 |
| stack.py | 04-Jul-2025 | 7.3 KiB | 228 | 195 |
| target_generator.py | 04-Jul-2025 | 1.4 KiB | 55 | 44 |
| tier1_generator.py | 04-Jul-2025 | 6.5 KiB | 206 | 179 |
| tier2_generator.py | 04-Jul-2025 | 7.1 KiB | 255 | 222 |
| uop_id_generator.py | 04-Jul-2025 | 2.3 KiB | 83 | 67 |
| uop_metadata_generator.py | 04-Jul-2025 | 3.2 KiB | 96 | 81 |
README.md
# Tooling to generate interpreters

Documentation for the instruction definitions in `Python/bytecodes.c`
("the DSL") is [here](interpreter_definition.md).

What's currently here:

- `analyzer.py`: code for converting the `AST` generated by `Parser`
  into a higher-level structure that is easier to work with
- `lexer.py`: lexer for C, originally written by Mark Shannon
- `plexer.py`: OO interface on top of `lexer.py`; main class: `PLexer`
- `parsing.py`: parser for the instruction definition DSL; main class: `Parser`
- `parser.py`: helper for interactions with `parsing.py`
- `tierN_generator.py`: a couple of driver scripts that read `Python/bytecodes.c` and
  write `Python/generated_cases.c.h` (and several other files)
- `optimizer_generator.py`: reads `Python/bytecodes.c` and
  `Python/optimizer_bytecodes.c` and writes
  `Python/optimizer_cases.c.h`
- `stack.py`: code to handle generalized stack effects
- `cwriter.py`: code that understands tokens and how to format C code;
  main class: `CWriter`
- `generators_common.py`: helpers for generators
- `opcode_id_generator.py`: generates a list of opcodes and writes them to
  `Include/opcode_ids.h`
- `opcode_metadata_generator.py`: reads the instruction definitions and
  writes the metadata to `Include/internal/pycore_opcode_metadata.h`
- `py_metadata_generator.py`: reads the instruction definitions and
  writes the metadata to `Lib/_opcode_metadata.py`
- `target_generator.py`: generates targets for computed goto dispatch and
  writes them to `Python/opcode_targets.h`
- `uop_id_generator.py`: generates a list of uop IDs and writes them to
  `Include/internal/pycore_uop_ids.h`
- `uop_metadata_generator.py`: reads the instruction definitions and
  writes the metadata to `Include/internal/pycore_uop_metadata.h`

Note that there is some dummy C code at the top and bottom of
`Python/bytecodes.c`
to fool text editors like VS Code into believing this is valid C code.


## A bit about the parser

The parser class uses a pretty standard recursive descent scheme,
but with unlimited backtracking.
The `PLexer` class tokenizes the entire input before parsing starts.
We do not run the C preprocessor.
Each parsing method returns either an AST node (a `Node` instance)
or `None`, or raises `SyntaxError` (showing the error in the C source).

Most parsing methods are decorated with `@contextual`, which automatically
resets the tokenizer input position when `None` is returned.
A `SyntaxError` raised by a parsing method is irrecoverable.
When a parsing method returns `None`, however, it is possible that after
backtracking a different parsing method returns a valid AST.

Neither the lexer nor the parser is complete or fully correct.
Most known issues are tersely indicated by `# TODO:` comments.
We plan to fix issues as they become relevant.
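
The backtracking scheme described above can be illustrated with a small, self-contained sketch. This is not the code in `parsing.py` or `plexer.py`: `MiniLexer`, `MiniParser`, the toy grammar, and the way `contextual` is written here are all assumptions made for illustration. Only the ideas come from the text: tokenize everything up front, return `None` to signal "no match, try something else", reset the position on `None`, and raise `SyntaxError` when recovery is impossible.

```python
import re
from dataclasses import dataclass
from functools import wraps


@dataclass
class Node:
    """Toy AST node (hypothetical; the real `Node` class differs)."""
    kind: str
    value: object


class MiniLexer:
    """Hypothetical stand-in for a lexer that tokenizes all input up front."""

    def __init__(self, src: str) -> None:
        # Numbers, identifiers, or any single non-space character.
        self.tokens = re.findall(r"\d+|[A-Za-z_]\w*|\S", src)
        self.pos = 0

    def next(self):
        """Return the next token and advance, or None at end of input."""
        if self.pos >= len(self.tokens):
            return None
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok


def contextual(func):
    """Sketch of a decorator that rewinds the token position on None."""

    @wraps(func)
    def wrapper(self, *args, **kwargs):
        start = self.pos
        result = func(self, *args, **kwargs)
        if result is None:
            self.pos = start  # backtrack: un-read everything func consumed
        return result

    return wrapper


class MiniParser(MiniLexer):
    """Toy grammar:  expr: call | name ;  call: NAME '(' NUMBER ')'"""

    @contextual
    def call(self):
        name = self.next()
        if name is None or not name.isidentifier():
            return None
        if self.next() != "(":
            return None  # not a call; another rule may still match
        arg = self.next()
        if arg is None or not arg.isdigit():
            # Past '(' we are committed to this rule, so fail hard.
            raise SyntaxError(f"expected a number, got {arg!r}")
        if self.next() != ")":
            raise SyntaxError("expected ')'")
        return Node("call", (name, int(arg)))

    @contextual
    def name(self):
        tok = self.next()
        if tok is not None and tok.isidentifier():
            return Node("name", tok)
        return None

    def expr(self):
        # Alternatives are tried in order; each starts at the same position
        # because a failed attempt has already been rewound.
        return self.call() or self.name()
```

With this sketch, `MiniParser("foo + 1").expr()` first tries `call`, which consumes `foo` and `+` before returning `None`; the decorator rewinds, and `name` then matches `foo`. `MiniParser("foo(bar)").expr()` raises `SyntaxError` instead, because no alternative is attempted once a rule has committed.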