# Implementation description

Important note: Currently AbcKit supports JS, ArkTS and static ArkTS, but **static ArkTS support is experimental**.
Compiled JS and ArkTS are stored in the "dynamic" `abc` file format, and static ArkTS in the "static" `abc` file format.
AbcKit works with these file formats using the "dynamic" and "static" runtimes.

Please take a look at the [cookbook](mini_cookbook.md) beforehand to see what the API looks like from the user's point of view.

1. [Two types of `abc` files](#two-types-of-abc-files)
2. [C and C++ API](#c-and-c-api)
3. [Control flow by components](#control-flow-by-components)
4. [Control flow by source files](#control-flow-by-source-files)
5. [Dispatch between dynamic and static file formats](#dispatch-between-dynamic-and-static-file-formats)
6. [Data structures (context) and opaque pointers](#data-structures-context-and-opaque-pointers)
7. [Data structures (context) implementation](#data-structures-context-implementation)
8. [Includes naming conflict](#includes-naming-conflict)
9. [Bytecode<-->IR](#bytecode--ir)

## Two types of abc files

**AbcKit supports two types of `abc` files**: dynamic and static.
Different components are used depending on the `abc` file type, so the following components exist in two variants:

1. `panda::panda_file` and `ark::panda_file`
2. `panda::abc2program` and `ark::abc2program`
3. `panda::pandasm` and `ark::pandasm`
4. IR builder: `libabckit::IrBuilderDynamic` and `ark::compiler::IrBuilder`
5. Codegen: `libabckit::CodeGenDynamic` and `libabckit::CodeGenStatic`

NOTE: `panda::` is the dynamic runtime namespace, `ark::` is the static runtime namespace.

However, only one compiler, the static runtime one, is used for the AbcKit graph representation: `ark::compiler::Graph`.

## C and C++ API

AbcKit provides two kinds of API: a C API and a C++ API.

### C API

All public API headers are stored in the `./include/c` folder, which has the following structure:

```
include/c/
├── abckit.h                      // Entry point API
├── metadata_core.h               // API for language-independent metadata inspection/transformation
├── extensions
│   ├── arkts
│   │   └── metadata_arkts.h      // API for language-specific (ArkTS and static ArkTS) metadata inspection/transformation
│   └── js
│       └── metadata_js.h         // API for language-specific (JS) metadata inspection/transformation
├── ir_core.h                     // API for language-independent graph inspection/transformation
├── isa
│   ├── isa_dynamic.h             // API for language-specific (JS and ArkTS) graph inspection/transformation
│   └── isa_static.h              // API for language-specific (static ArkTS) graph inspection/transformation (This header is now hidden)
├── statuses.h                    // List of error codes
```

AbcKit APIs are pure C functions; all implementations are stored in the `./src/` folder and written in C++.

### C++ API

The C++ API is stored in the `./include/cpp` folder, which has the following structure:

```
include/cpp/
├── abckit_cpp.h              // C++ API entry point
├── headers/
│   ├── file.h                // File operations
│   ├── graph.h               // Graph operations
│   ├── basic_block.h         // Basic block operations
│   ├── instruction.h         // Instruction operations
│   ├── literal.h             // Literal operations
│   ├── value.h               // Value operations
│   ├── type.h                // Type operations
│   ├── dynamic_isa.h         // Dynamic ISA operations
│   ├── config.h              // Configuration
│   ├── utils.h               // Utility functions
│   ├── base_classes.h        // Base classes
│   ├── base_concepts.h       // Base concepts
│   ├── core/                 // Language-independent APIs
│   ├── arkts/                // ArkTS-specific APIs
│   └── js/                   // JS-specific APIs
```

The C++ API provides a higher-level, object-oriented interface.

## Control flow by components

1. When `openAbc` is called:
    1. The `abc` file is read into `panda_file`
    2. `panda_file` is converted into `pandasm` using `abc2program`
2. The AbcKit metadata API inspects/transforms the `pandasm` program
3. When `createGraphFromFunction` is called:
    1. The `pandasm` program is converted back into `panda_file` (because the current IR builders support only `panda_file` input)
    2. The IR builder builds `ark::compiler::Graph`
4. The AbcKit graph API inspects/transforms `ark::compiler::Graph`
5. When `functionSetGraph` is called, `codegen` generates `pandasm` bytecode from `ark::compiler::Graph` and replaces the original code of the function
6. When `writeAbc` is called, the transformed `pandasm` program is emitted into an `abc` file

```
                                                                        ───────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                        |                                                                                                        /\
                                                                        \/                                                                                                       |
x.abc────>(ark/panda)::panda_file───>(ark/panda)::abc2program────>(ark/panda)::pandasm────>(ark/panda)::panda_file───>(ark/panda)::ir_builder────>ark::compiler::Graph──>(ark/panda::)codegen
                                                                      |         /\                  |                            /\                        |
                                                                      |         |                   |                            |                         |
                                                                      |         |                   |                            |                         |
                                                                 (abckit metadata API)              |                            |                         \/
                                                                                                    ──────────>(ark/panda)::RuntimeIface───────────────>(abckit IR API)
```

## Control flow by source files

1. C API declarations
2. C++ implementations of the above APIs; in most cases the implementation is just a dispatch between the dynamic and static runtimes
3. Runtime-specific implementation:
    1. Dynamic runtime implementation
    2. Static runtime implementation
4. APIs consume and produce pointers to opaque `AbckitXXX` structures

```

                                                |───────────────────────────────────────────|
                                                |  4. Data structures (context) (./src)     |
                                                |  metadata_inspect_impl.h                  |
                                                |  ir_impl.h                                |
                                                |───────────────────────────────────────────|
                                                                     /\
                                                                     |
                                                                     \/
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
                                                                     /\
                                                                     |
                                                                     \/
|──────────────────────────────────────|        |───────────────────────────────────────────|        |─────────────────────────────────────────────────────────────────|
|  1. API declarations (./include/c)   |        |  2. API implementation (./src/)           |        |  3.1 Dynamic runtime implementation (./src/adapter_dynamic/)    |
|  abckit.h                            |        |  abckit_impl.cpp                          |        |  abckit_dynamic.cpp                                             |
|  metadata_core.h                     |        |  metadata_(inspect|modify)_impl.cpp       |        |  metadata_(inspect|modify)_dynamic.cpp                          |
|  ir_core.h                           |        |  ir_impl.cpp                              |        |                                                                 |
|  extensions/arkts/metadata_arkts.h   |───────>|  metadata_arkts_(inspect|modify)_impl.cpp |───────>|                                                                 |
|  extensions/js/metadata_js.h         |        |  metadata_js_(inspect|modify)_impl.cpp    |        |                                                                 |
|  isa/isa_dynamic.h                   |        |  isa_dynamic_impl.cpp                     |        |                                                                 |
|  isa/isa_static.h                    |        |  isa_static_impl.cpp                      |        |                                                                 |
|──────────────────────────────────────|        |───────────────────────────────────────────|        |─────────────────────────────────────────────────────────────────|
                                                                     |
                                                                     \/
                                     |───────────────────────────────────────────────────────────────|
                                     |  3.2 Static runtime implementation (./src/adapter_static/)    |
                                     |  abckit_static.cpp                                            |
                                     |  metadata_(inspect|modify)_static.cpp                         |
                                     |  ir_static.cpp                                                |
                                     |───────────────────────────────────────────────────────────────|

```

## Dispatch between dynamic and static file formats

Many APIs can work with both dynamic and static file formats, depending on the source language of the `abc` file.
Runtime-specific implementations are stored in `./src/adapter_dynamic/` and `./src/adapter_static/`.

For example, here is the `functionGetName()` API implementation (`./src/metadata_inspect_impl.cpp`):

```cpp
    switch (function->module->target) {
        // JS and ArkTS are stored in the dynamic file format
        case ABCKIT_TARGET_JS:
        case ABCKIT_TARGET_ARK_TS_V1:
            return FunctionGetNameDynamic(function);
        // static ArkTS is stored in the static file format
        case ABCKIT_TARGET_ARK_TS_V2:
            return FunctionGetNameStatic(function);
    }
```

So, depending on the function's source language, one of two functions is called:

- `FunctionGetNameDynamic()` from `./src/adapter_dynamic/metadata_inspect_dynamic.cpp`: the implementation specific to the dynamic file format
- `FunctionGetNameStatic()` from `./src/adapter_static/metadata_inspect_static.cpp`: the implementation specific to the static file format

## Data structures (context) and opaque pointers

The AbcKit C API consumes and returns pointers to opaque `AbckitXXX` structures.
The user only sees forward declarations of these structures (the implementation is hidden from the user and stored in `./src/metadata_inspect_impl.h`),
so the user can only receive a pointer from one API and pass it to another API, and cannot modify the structure manually.

For example, here is a type forward declaration from `./include/c/metadata_core.h`:

```
typedef struct AbckitLiteral AbckitLiteral;
```

And here is the implementation from `./src/metadata_inspect_impl.h`:

```
struct AbckitLiteral {
    AbckitFile *file;
    libabckit::pandasm_Literal* val;
};
```

## Data structures (context) implementation

### Metadata

On an `openAbc` API call, AbcKit performs the following steps:

1. Open the `abc` file with `panda_file`
2. Convert the opened panda file into a `pandasm` program with `abc2program`
3. **Greedily** traverse all `pandasm` structures and create the related `AbckitXXX` structures

The implementation of all `AbckitXXX` data structures is stored in `metadata_inspect_impl.h` and `ir_impl.h`.
The top-level data structure is `AbckitFile`; the user receives an `AbckitFile*` pointer after the `openAbc` call.

`AbckitXXX` metadata structures form a tree that matches the source program structure, for example:

1. `AbckitFile` owns a vector of `unique_ptr<AbckitCoreModule>` (each module usually corresponds to one source file)
2. `AbckitCoreModule` owns vectors of `unique_ptr<AbckitCoreNamespace>` (top level namespaces),
   `unique_ptr<AbckitCoreClass>` (top level classes), `unique_ptr<AbckitCoreFunction>` (top level functions)
3. `AbckitCoreNamespace` owns vectors of `unique_ptr<AbckitCoreNamespace>` (namespaces nested in namespace),
   `unique_ptr<AbckitCoreClass>` (classes nested in namespace), `unique_ptr<AbckitCoreFunction>` (top level namespace functions)
4. `AbckitCoreClass` owns a vector of `unique_ptr<AbckitCoreFunction>` (class methods)
5. `AbckitCoreFunction` owns a vector of `unique_ptr<AbckitCoreFunction>` (for functions nested in other functions)

### Graph

On a `createGraphFromFunction` API call, AbcKit performs the following steps:

1. Emit `panda_file` from the `pandasm` function (this is needed because currently `ir_builder` supports only `panda_file` input)
2. Build `ark::compiler::Graph` using `IrBuilder`
3. **Greedily traverse `ark::compiler::Graph` and create the related `AbckitXXX` structures**

The `AbckitXXX` graph structures are: `AbckitGraph`, `AbckitBasicBlock`, `AbckitInst` and `AbckitIrInterface`.
For each basic block and instruction of `ark::compiler::Graph`, an `AbckitBasicBlock` and an `AbckitInst` are created.
`AbckitGraph` contains the following maps:

```cpp
std::unordered_map<ark::compiler::BasicBlock *, AbckitBasicBlock *> implToBB;
std::unordered_map<ark::compiler::Inst *, AbckitInst *> implToInst;
```

These maps make it possible to get the related `AbckitXXX` structure from the internal graph implementation.

## Includes naming conflict

**Important implementation restriction:** libabckit includes headers from both dynamic and static runtimes,
so during the build clang must be provided with include paths (`-I`) to both runtimes' folders.
But many files and folders have the same names in the two runtimes, which causes naming conflicts for `#include`s.
That's why no file in AbcKit includes headers from both runtimes:

- the `./src/adapter_dynamic/` folder is for files that include dynamic runtime headers
- the `./src/adapter_static/` folder is for files that include static runtime headers

### Wrappers

But in some cases we need to work with both runtimes in a single file, for example:

- When we generate `ark::compiler::Graph` from `panda::panda_file::Function`
- Or when we generate `panda::pandasm::Function` from `ark::compiler::Graph`

For such cases we use **wrappers** (they are stored in `./src/wrappers/`).
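
As a rough illustration of the wrapper idea, the single-file sketch below shows a header-style interface that mentions only an opaque type, so a file on the dynamic side can use it without seeing any static runtime header. All names in it (`sketch::OpaqueGraph`, `BuildGraphFromBytes`, `CountInstructions`, `DestroyGraph`, `static_runtime_stub`, `CountInstructionsViaWrapper`) are hypothetical and are not taken from `./src/wrappers/`; the stub namespace only stands in for a real runtime, and the three parts would live in separate files in a real build.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// ---- Part 1: what a hypothetical wrapper header could contain ----
// No runtime headers are needed here: only a forward declaration and functions.
namespace sketch {
struct OpaqueGraph;  // the complete type is visible only in the wrapper's .cpp

OpaqueGraph *BuildGraphFromBytes(const std::vector<unsigned char> &code);
std::size_t CountInstructions(const OpaqueGraph *graph);
void DestroyGraph(OpaqueGraph *graph);
}  // namespace sketch

// ---- Part 2: what the hypothetical wrapper .cpp could contain ----
// In a real build this would be the only file including static runtime headers.
namespace static_runtime_stub {  // stand-in for the static runtime, NOT a real API
struct Graph {
    std::vector<std::string> insts;
};
}  // namespace static_runtime_stub

namespace sketch {
struct OpaqueGraph {
    static_runtime_stub::Graph impl;
};

OpaqueGraph *BuildGraphFromBytes(const std::vector<unsigned char> &code)
{
    auto graph = std::make_unique<OpaqueGraph>();
    for (unsigned char byte : code) {
        // Toy "IR building": one fake instruction per input byte.
        graph->impl.insts.push_back("op_" + std::to_string(byte));
    }
    return graph.release();
}

std::size_t CountInstructions(const OpaqueGraph *graph)
{
    assert(graph != nullptr);
    return graph->impl.insts.size();
}

void DestroyGraph(OpaqueGraph *graph)
{
    delete graph;
}
}  // namespace sketch

// ---- Part 3: a hypothetical caller on the dynamic side ----
// It relies only on the declarations from Part 1.
std::size_t CountInstructionsViaWrapper(const std::vector<unsigned char> &code)
{
    sketch::OpaqueGraph *graph = sketch::BuildGraphFromBytes(code);
    std::size_t count = sketch::CountInstructions(graph);
    sketch::DestroyGraph(graph);
    return count;
}
```

The point of the design is that Part 3 never needs the definition of `OpaqueGraph`, so the conflicting include paths of the two runtimes never meet in one translation unit.
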
The following picture (arrows show the direction of includes) shows how:

- `metadata_inspect_dynamic.cpp` includes dynamic runtime headers and also uses the static graph (via `graph_wrapper.h`)
- `graph_wrapper.cpp` includes static runtime headers and also uses the dynamic `panda_file` (via `abcfile_wrapper.h`)

```
    metadata_inspect_dynamic.cpp ──>graph_wrapper.h<────|
        |                                               |
        \/                                              |
DynamicRuntime                                          |<───graph_wrapper.cpp──>StaticRuntime
        /\                                              |
        |                                               |
    abcfile_wrapper.cpp───────────>abcfile_wrapper.h<───|
```

If you follow the arrows in the picture above, you can see that no file includes both static and dynamic runtimes' headers.

## Bytecode <──> IR

AbcKit uses `ark::compiler::Graph` for its internal graph representation,
so both dynamic and static `pandasm` bytecode is converted into a single IR.

The Bytecode <──> IR transformation approach is the same as in the bytecode optimizer:
the IR builder converts bytecode into IR, and codegen converts IR into bytecode.

- There are two IR builders: `libabckit::IrBuilderDynamic` and `ark::compiler::IrBuilder`
- Two codegens: `libabckit::CodeGenDynamic` and `libabckit::CodeGenStatic`
- Two runtime interfaces: `libabckit::AbckitRuntimeAdapterDynamic` and `libabckit::AbckitRuntimeAdapterStatic`

The runtime interface is part of the static compiler; it is needed to abstract the IR builder from its source.
Each user of the IR builder should provide its own runtime interface (by design of the compiler).

The idea of the IR interface (`AbckitIrInterface`) is also taken from the bytecode optimizer; it is needed to store the relation between `panda_file` entity names and offsets.
An instance of the IR interface is:

1. Created when a function's bytecode is converted into IR
2. Stored inside the `AbckitGraph` instance
3. Used inside AbcKit API implementations to obtain a `panda_file` entity from an instruction's immediate offsets

### IR builder

For static bytecode, AbcKit just reuses `IrBuilder` from the static runtime.

For dynamic bytecode, AbcKit uses a fork of `IrBuilder` from the dynamic runtime with various changes.
For dynamic bytecode, almost all instructions are just converted into intrinsic calls (for example `Intrinsic.callthis0`).

### Codegen

For static bytecode, AbcKit uses a fork of the bytecode optimizer codegen from the static runtime with various changes.
For dynamic bytecode, AbcKit uses a fork of the bytecode optimizer codegen from the dynamic runtime with various changes.

### Compiler passes

Additional graph cleanup and optimization passes are applied during graph creation and bytecode generation,
so if you do `graph1`->`bytecode`->`graph2` without any additional changes,
`graph1` **may be different** from `graph2`!
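
To make the round-trip caveat above concrete, here is a toy model of it. It does not use any AbcKit or runtime API: `ToyGraph`, `ToyBytecode`, `RemoveNops`, `Codegen`, `BuildGraph` and `RoundTripPreservesGraph` are hypothetical names, the instruction strings are arbitrary, and the single "cleanup pass" is only an assumed stand-in for whatever passes actually run.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {
// Toy stand-ins: one string per instruction. These are not AbcKit types.
using ToyGraph = std::vector<std::string>;
using ToyBytecode = std::vector<std::string>;

// Hypothetical cleanup pass: drops instructions that have no effect.
ToyGraph RemoveNops(const ToyGraph &graph)
{
    ToyGraph cleaned;
    for (const std::string &inst : graph) {
        if (inst != "nop") {
            cleaned.push_back(inst);
        }
    }
    return cleaned;
}

// Toy codegen: runs the cleanup pass before emitting bytecode.
ToyBytecode Codegen(const ToyGraph &graph)
{
    return RemoveNops(graph);
}

// Toy IR builder: one graph instruction per bytecode instruction.
ToyGraph BuildGraph(const ToyBytecode &bytecode)
{
    return bytecode;
}

// Returns true only if graph1 -> bytecode -> graph2 yields an equal graph.
bool RoundTripPreservesGraph(const ToyGraph &graph1)
{
    ToyGraph graph2 = BuildGraph(Codegen(graph1));
    // In this toy model the cleanup pass can only remove instructions.
    assert(graph2.size() <= graph1.size());
    return graph1 == graph2;
}
}  // namespace sketch
```

In this model a graph that contains a `nop` does not survive the round trip unchanged, while a graph without one does. The practical consequence is the same as in the text: code that re-creates a graph from generated bytecode should inspect the new graph rather than assume it matches the old one.
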