# Implementation description

Important note: AbcKit currently supports JS, ArkTS1 and ArkTS2, but **ArkTS2 support is experimental**.
Compiled JS and ArkTS1 are stored in the "dynamic" `abc` file format, and ArkTS2 in the "static" `abc` file format.
AbcKit works with these file formats using the "dynamic" and "static" runtimes.

Please take a look at the [cookbook](mini_cookbook.md) beforehand to see what the API looks like from the user's point of view.

1. [Two types of `abc` files](#two-types-of-abc-files)
2. [C API and C++ implementation](#c-api-and-c-implementation)
3. [Control flow by components](#control-flow-by-components)
4. [Control flow by source files](#control-flow-by-source-files)
5. [Dispatch between dynamic and static file formats](#dispatch-between-dynamic-and-static-file-formats)
6. [Data structures (context) and opaque pointers](#data-structures-context-and-opaque-pointers)
7. [Data structures (context) implementation](#data-structures-context-implementation)
8. [Includes naming conflict](#includes-naming-conflict)
9. [Bytecode<-->IR](#bytecode--ir)

## Two types of abc files

**AbcKit supports two types of `abc` files**: dynamic and static.
Different components are used depending on the `abc` file type, so there are two kinds of each:

1. `panda::panda_file` and `ark::panda_file`
2. `panda::abc2program` and `ark::abc2program`
3. `panda::pandasm` and `ark::pandasm`
4. IR builder: `libabckit::IrBuilderDynamic` and `ark::compiler::IrBuilder`
5. Codegen: `libabckit::CodeGenDynamic` and `libabckit::CodeGenStatic`

NOTE: `panda::` is the dynamic runtime namespace, `ark::` is the static runtime namespace.

However, only one graph type, the static runtime compiler's `ark::compiler::Graph`, is used for the AbcKit graph representation.

## C API and C++ implementation

The entire public API is stored in the `./include/c` folder, which has the following structure:

```
include/c/
├── abckit.h                    // Entry point API
├── metadata_core.h             // API for language-independent metadata inspection/transformation
├── extensions
│   ├── arkts
│   │   └── metadata_arkts.h    // API for language-specific (ArkTS1 and ArkTS2) metadata inspection/transformation
│   └── js
│       └── metadata_js.h       // API for language-specific (JS) metadata inspection/transformation
├── ir_core.h                   // API for language-independent graph inspection/transformation
├── isa
│   ├── isa_dynamic.h           // API for language-specific (JS and ArkTS1) graph inspection/transformation
│   └── isa_static.h            // API for language-specific (ArkTS2) graph inspection/transformation (This header is now hidden)
└── statuses.h                  // List of error codes
```

AbcKit APIs are pure C functions; all implementations are stored in the `./src/` folder and written in C++.

## Control flow by components

1. When `openAbc` is called:
   1. The `abc` file is read into a `panda_file`
   2. The `panda_file` is converted into a `pandasm` program using `abc2program`
2. The AbcKit metadata API inspects/transforms the `pandasm` program
3. When `createGraphFromFunction` is called:
   1. The `pandasm` program is converted back into a `panda_file` (because the current IR builders support only `panda_file` input)
   2. The IR builder builds an `ark::compiler::Graph`
4. The AbcKit graph API inspects/transforms the `ark::compiler::Graph`
5. When `functionSetGraph` is called, `codegen` generates `pandasm` bytecode from the `ark::compiler::Graph` and replaces the original code of the function
6. When `writeAbc` is called, the transformed `pandasm` program is emitted into an `abc` file

```
                                                                          ─────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                          |                                                                                                      /\
                                                                          \/                                                                                                      |
x.abc────>(ark/panda)::panda_file───>(ark/panda)::abc2program────>(ark/panda)::pandasm────>(ark/panda)::panda_file───>(ark/panda)::ir_builder────>ark::compiler::Graph──>(ark/panda::)codegen
                                                                        |    /\                       |                                                     /\    |
                                                                        |    |                        |                                                     |     |
                                                                        |    |                        |                                                     |     |
                                                                 (abckit metadata API)                |                                                     |     \/
                                                                                                      ──────────>(ark/panda)::RuntimeIface───────────────>(abckit IR API)
```

## Control flow by source files

1. C API declarations
2. C++ implementations of the above APIs; in most cases the implementation is just a dispatch between the dynamic and static runtimes
3. Runtime-specific implementations:
   1. Dynamic runtime implementation
   2. Static runtime implementation
4. The APIs consume and produce pointers to opaque `AbckitXXX` structures

```
                                                |──────────────────────────────────────────────|
                                                |  4. Data structures (context) (./src)        |
                                                |    metadata_inspect_impl.h                   |
                                                |    ir_impl.h                                 |
                                                |──────────────────────────────────────────────|
                                                                       /\
                                                                       |
                                                                       \/
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                       /\
                                                                       |
                                                                       \/
|──────────────────────────────────────|        |──────────────────────────────────────────────|        |─────────────────────────────────────────────────────────────────|
|  1. API declarations (./include/c)   |        |  2. API implementation (./src/)              |        |  3.1 Dynamic runtime implementation (./src/adapter_dynamic/)    |
|    abckit.h                          |        |    abckit_impl.cpp                           |        |    abckit_dynamic.cpp                                           |
|    metadata_core.h                   |        |    metadata_(inspect|modify)_impl.cpp        |        |    metadata_(inspect|modify)_dynamic.cpp                        |
|    ir_core.h                         |        |    ir_impl.cpp                               |        |                                                                 |
|    extensions/arkts/metadata_arkts.h |───────>|    metadata_arkts_(inspect|modify)_impl.cpp  |───────>|                                                                 |
|    extensions/js/metadata_js.h       |        |    metadata_js_(inspect|modify)_impl.cpp     |        |                                                                 |
|    isa/isa_dynamic.h                 |        |    isa_dynamic_impl.cpp                      |        |                                                                 |
|    isa/isa_static.h                  |        |    isa_static_impl.cpp                       |        |                                                                 |
|──────────────────────────────────────|        |──────────────────────────────────────────────|        |─────────────────────────────────────────────────────────────────|
                                                                       |
                                                                       \/
                                                |─────────────────────────────────────────────────────────────────|
                                                |  3.2 Static runtime implementation (./src/adapter_static/)      |
                                                |    abckit_static.cpp                                            |
                                                |    metadata_(inspect|modify)_static.cpp                         |
                                                |    ir_static.cpp                                                |
                                                |─────────────────────────────────────────────────────────────────|
```

## Dispatch between dynamic and static file formats

Many APIs can work with both dynamic and static file formats, depending on the source language of the `abc` file.
Runtime-specific implementations are stored in `./src/adapter_dynamic/` and `./src/adapter_static/`.

For example, here is the `functionGetName()` API implementation (`./src/metadata_inspect_impl.cpp`):

```cpp
    switch (function->module->target) {
        // JS and ArkTS1 are stored in the dynamic file format
        case ABCKIT_TARGET_JS:
        case ABCKIT_TARGET_ARKTS_V1:
            return FunctionGetNameDynamic(function);
        // ArkTS2 is stored in the static file format
        case ABCKIT_TARGET_ARKTS_V2:
            return FunctionGetNameStatic(function);
    }
```

So, depending on the function's source language, one of two functions is called:

- `FunctionGetNameDynamic()` from `./src/adapter_dynamic/metadata_inspect_dynamic.cpp`, the implementation specific to the dynamic file format
- `FunctionGetNameStatic()` from `./src/adapter_static/metadata_inspect_static.cpp`, the implementation specific to the static file format

## Data structures (context) and opaque pointers

The AbcKit C API consumes and returns pointers to opaque `AbckitXXX` structures.
The user sees only forward declarations of these structures (the implementation is hidden from the user and stored in `./src/metadata_inspect_impl.h`),
so the user can only receive a pointer from one API and pass it to another API, and cannot modify the structure manually.

For example, here is a forward type declaration from `./include/c/metadata_core.h`:

```
typedef struct AbckitLiteral AbckitLiteral;
```

And here is the implementation from `./src/metadata_inspect_impl.h`:

```
struct AbckitLiteral {
    AbckitFile *file;
    libabckit::pandasm_Literal* val;
};
```

## Data structures (context) implementation

### Metadata

On an `openAbc` API call, AbcKit performs the following steps:

1. Opens the `abc` file with `panda_file`
2. Converts the opened panda file into a `pandasm` program with `abc2program`
3. **Greedily** traverses all `pandasm` structures and creates the related `AbckitXXX` structures

The implementation of all `AbckitXXX` data structures is stored in `metadata_inspect_impl.h` and `ir_impl.h`.
The top-level data structure is `AbckitFile`; the user receives an `AbckitFile*` pointer after the `openAbc` call.

`AbckitXXX` metadata structures form a tree that matches the source program structure, for example:

1. `AbckitFile` owns a vector of `unique_ptr<AbckitCoreModule>` (each module usually corresponds to one source file)
2. `AbckitCoreModule` owns vectors of `unique_ptr<AbckitCoreNamespace>` (top-level namespaces),
   `unique_ptr<AbckitCoreClass>` (top-level classes), `unique_ptr<AbckitCoreFunction>` (top-level functions)
3. `AbckitCoreNamespace` owns vectors of `unique_ptr<AbckitCoreNamespace>` (namespaces nested in the namespace),
   `unique_ptr<AbckitCoreClass>` (classes nested in the namespace), `unique_ptr<AbckitCoreFunction>` (top-level namespace functions)
4. `AbckitCoreClass` owns a vector of `unique_ptr<AbckitCoreFunction>` (class methods)
5. `AbckitCoreFunction` owns a vector of `unique_ptr<AbckitCoreFunction>` (functions nested in other functions)

### Graph

On a `createGraphFromFunction` API call, AbcKit performs the following steps:

1. Emits a `panda_file` from the `pandasm` function (this is needed because `ir_builder` currently supports only `panda_file` input)
2. Builds an `ark::compiler::Graph` using `IrBuilder`
3. **Greedily traverses the `ark::compiler::Graph` and creates the related `AbckitXXX` structures**

The `AbckitXXX` graph structures are: `AbckitGraph`, `AbckitBasicBlock`, `AbckitInst` and `AbckitIrInterface`.
For each `ark::compiler::Graph` basic block and instruction, an `AbckitBasicBlock` and an `AbckitInst` are created.
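
The greedy wrapping step can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `ImplInst`, `ImplBlock` and `ImplGraph` are hypothetical stand-ins for the `ark::compiler` types, and `SketchInst`, `SketchBlock`, `SketchGraph`, `WrapGraph` and `FindWrapper` are invented names, not AbcKit's real structures or functions. Only the map names `implToBB` and `implToInst` are borrowed from `AbckitGraph`.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for ark::compiler::Inst / BasicBlock / Graph.
struct ImplInst {
    int id;
};
struct ImplBlock {
    std::vector<ImplInst> insts;
};
struct ImplGraph {
    std::vector<ImplBlock> blocks;
};

// Hypothetical stand-ins for AbckitInst / AbckitBasicBlock / AbckitGraph.
struct SketchGraph;
struct SketchInst {
    SketchGraph *graph;
    ImplInst *impl;
};
struct SketchBlock {
    SketchGraph *graph;
    ImplBlock *impl;
};
struct SketchGraph {
    ImplGraph *impl = nullptr;
    // Owning storage: heap allocation keeps wrapper addresses stable.
    std::vector<std::unique_ptr<SketchBlock>> blocks;
    std::vector<std::unique_ptr<SketchInst>> insts;
    // Lookup from an internal object to its wrapper.
    std::unordered_map<ImplBlock *, SketchBlock *> implToBB;
    std::unordered_map<ImplInst *, SketchInst *> implToInst;
};

// Eagerly creates one wrapper per basic block and per instruction.
// The internal graph must not be resized while the wrappers are alive,
// because the wrappers store raw pointers into it.
std::unique_ptr<SketchGraph> WrapGraph(ImplGraph &impl)
{
    auto graph = std::make_unique<SketchGraph>();
    graph->impl = &impl;
    for (ImplBlock &bb : impl.blocks) {
        graph->blocks.push_back(std::make_unique<SketchBlock>(SketchBlock{graph.get(), &bb}));
        graph->implToBB.emplace(&bb, graph->blocks.back().get());
        for (ImplInst &inst : bb.insts) {
            graph->insts.push_back(std::make_unique<SketchInst>(SketchInst{graph.get(), &inst}));
            graph->implToInst.emplace(&inst, graph->insts.back().get());
        }
    }
    return graph;
}

// Returns the wrapper that was created for an internal instruction.
SketchInst *FindWrapper(const SketchGraph &graph, ImplInst *inst)
{
    auto it = graph.implToInst.find(inst);
    assert(it != graph.implToInst.end() && "every instruction is wrapped eagerly");
    return it->second;
}
```

Because every wrapper is created up front, a later lookup never needs to allocate: any internal block or instruction reached while walking the graph already has a wrapper that can be handed back to the user as an opaque pointer.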
`AbckitGraph` contains the following maps:

```cpp
std::unordered_map<ark::compiler::BasicBlock *, AbckitBasicBlock *> implToBB;
std::unordered_map<ark::compiler::Inst *, AbckitInst *> implToInst;
```

These maps make it possible to get the related `AbckitXXX` structure from an internal graph implementation object.

## Includes naming conflict

**Important implementation restriction:** libabckit includes headers from both the dynamic and static runtimes,
so during the build clang must be given include paths (`-I`) to both runtimes' folders.
However, many files and folders have the same names in the two runtimes, which causes naming conflicts for `#include`s.
That is why no file in AbcKit includes headers from both runtimes:

- the `./src/adapter_dynamic/` folder holds the files that include dynamic runtime headers
- the `./src/adapter_static/` folder holds the files that include static runtime headers

### Wrappers

In some cases, however, we need to work with both runtimes in a single file, for example:

- When we generate `ark::compiler::Graph` from `panda::panda_file::Function`
- Or when we generate `panda::pandasm::Function` from `ark::compiler::Graph`

For such cases we use **wrappers** (they are stored in `./src/wrappers/`).
The following picture (arrows show the include direction) shows how:

- `metadata_inspect_dynamic.cpp` includes dynamic runtime headers and also uses the static graph (via `graph_wrapper.h`)
- `graph_wrapper.cpp` includes static runtime headers and also uses the dynamic `panda_file` (via `abcfile_wrapper.h`)

```
  metadata_inspect_dynamic.cpp ──>graph_wrapper.h<─────|
       |                                               |
       \/                                              |
DynamicRuntime                                         |<───graph_wrapper.cpp──>StaticRuntime
       /\                                              |
       |                                               |
  abcfile_wrapper.cpp────────────>abcfile_wrapper.h<───|
```

If you follow the arrows in the picture above, you can see that no file includes both static and dynamic runtime headers.

## Bytecode <──> IR

AbcKit uses `ark::compiler::Graph` for its internal graph representation,
so both dynamic and static `pandasm` bytecodes are converted into a single `IR`.

The Bytecode <──> IR transformation approach is the same as in the bytecode optimizer:
the IR builder converts bytecode into IR, and the codegen converts IR into bytecode.

- There are two IR builders: `libabckit::IrBuilderDynamic` and `ark::compiler::IrBuilder`
- Two codegens: `libabckit::CodeGenDynamic` and `libabckit::CodeGenStatic`
- Two runtime interfaces: `libabckit::AbckitRuntimeAdapterDynamic` and `libabckit::AbckitRuntimeAdapterStatic`

The runtime interface is part of the static compiler; it is needed to abstract the IR builder from its source.
Each user of the IR builder must provide its own runtime interface (by design of the compiler).

The idea of the IR interface (`AbckitIrInterface`) is also taken from the bytecode optimizer; it is needed to store the relation between `panda_file` entity names and offsets.
An instance of the IR interface is:

1. Created when a function's bytecode is converted into IR
2. Stored inside the `AbckitGraph` instance
3. Used inside AbcKit API implementations to obtain a `panda_file` entity from an instruction's immediate offsets

### IR builder

For static bytecode, AbcKit simply reuses the IrBuilder from the static runtime.

For dynamic bytecode, AbcKit uses a fork of the IrBuilder from the dynamic runtime with various changes.
For dynamic bytecode, almost all instructions are simply converted into intrinsic calls (for example `Intrinsic.callthis0`).

### Codegen

For static bytecode, AbcKit uses a fork of the bytecode optimizer codegen from the static runtime with various changes.
For dynamic bytecode, AbcKit uses a fork of the bytecode optimizer codegen from the dynamic runtime with various changes.

### Compiler passes

Additional graph clean-up and optimization passes are applied during graph creation and bytecode generation,
so if you perform the round trip `graph1`->`bytecode`->`graph2` without making any changes of your own,
`graph1` **may be different** from `graph2`!