# Design documentation for compilation units, packages, and modules

## 1. Introduction

Programs are structured as sequences of elements ready for compilation, i.e., compilation units. Each compilation unit creates its own scope.
The compilation unit's variables, functions, classes, interfaces, and other declarations are accessible only within that scope unless they are explicitly exported.

A variable, function, class, interface, or other declaration exported from a different compilation unit must be imported before it can be used.

There are three kinds of compilation units:
 - Separate modules
 - Declaration modules
 - Packages

## 2. Objectives

The primary objectives of this design are:
- Give a general picture of how compilation unit handling works within the compiler.
- Refactor the current code structure so that it follows the changes in the standard amendments, which allow other implementation approaches. In addition, several hot fixes have been merged into the code base recently, which made the code quite kludgy and unclear.
- Cover features that are not implemented yet, such as the internal access modifier, the single export directive, and the requirement that a package module can directly access all top-level entities declared in all modules that constitute the package.

## 3. Module Handling

A separate module is a module without a package header. A separate module can optionally consist of the following
four parts:
1. Import directives that enable referring to imported declarations in a module
2. Top-level declarations
3. Top-level statements
4. Re-export directives

Every module implicitly imports all exported entities from essential kernel packages of the standard library.
All entities from these packages are accessible as simple names, like the console variable.

```
// Hello, world! module
function main() {
  console.log("Hello, world!")
}
```

This is currently ensured by the *ETSParser::ParseDefaultSources* method, which parses an internally created ets file named "<default_import>.ets".

### 3.1. Package level scope

A name declared on the package level should be accessible throughout the entire package. The name can be accessed in other packages or modules if it is exported.
Currently there is no package level scope. Packages are handled almost exactly the same way as separate modules, so a name declared on the module level is accessible throughout the entire module only. It can be accessed in other modules/compilation units only if it is exported, and this includes other package modules belonging to the same package (compilation unit).

[#17665]

## 4. How variables are stored

All variables declared in source files are inserted into scopes during parsing of the source code. Each scope (local scope, param scope, global scope, ...) has a field named **bindings_** that maps each identifier to a pointer to the created Variable object.

 * `map<string, Variable*> bindings_`

Each scope has a pointer to its parent scope. When a variable is referred to in source code, the current scope is checked first. If the variable is not found there, the parent scope is searched, and so on. The search continues until the global scope is reached, which is the end of the scope chain.

### 4.1. Storage of imported variables

Both import and export directives are allowed only at the top level. This means that all variables marked for export are placed into the global scope of the current program. Similarly, all imported variables are stored in the global scope of the current program.

For this reason, the global scope has an extra **foreignBindings_** member that tells which variables are defined locally and which are imported from external sources.
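The lookup rule from section 4 and the extra bookkeeping of the global scope can be modelled in a few lines of TypeScript. This is only a sketch for illustration: the compiler itself is written in C++, and `VariableSketch`, `ScopeSketch`, `GlobalScopeSketch` and their methods are hypothetical names that do not exist in the code base.

```typescript
// Hypothetical model of the scope chain; the compiler's real classes are C++.
interface VariableSketch {
  name: string;
}

class ScopeSketch {
  // Counterpart of bindings_: identifier -> variable object.
  protected bindings = new Map<string, VariableSketch>();
  readonly parent: ScopeSketch | null;

  constructor(parent: ScopeSketch | null = null) {
    this.parent = parent;
  }

  declare(name: string): VariableSketch {
    const variable: VariableSketch = { name };
    this.bindings.set(name, variable);
    return variable;
  }

  // Check the current scope first, then walk up the parent chain.
  resolve(name: string): VariableSketch | undefined {
    let scope: ScopeSketch | null = this;
    while (scope !== null) {
      const found = scope.bindings.get(name);
      if (found !== undefined) {
        return found;
      }
      scope = scope.parent;
    }
    return undefined; // the end of the chain was reached without a match
  }
}

class GlobalScopeSketch extends ScopeSketch {
  // Counterpart of foreignBindings_: identifier -> true if imported.
  private foreignBindings = new Map<string, boolean>();

  declareLocal(name: string): VariableSketch {
    this.foreignBindings.set(name, false);
    return this.declare(name);
  }

  addImported(variable: VariableSketch): void {
    this.bindings.set(variable.name, variable);
    this.foreignBindings.set(variable.name, true);
  }

  isForeign(name: string): boolean {
    return this.foreignBindings.get(name) === true;
  }
}

// A block scope nested in a function scope nested in the global scope.
const globalScope = new GlobalScopeSketch();
globalScope.declareLocal("msg");
const blockScope = new ScopeSketch(new ScopeSketch(globalScope));
console.log(blockScope.resolve("msg") !== undefined); // resolved via the chain
```

The two maps of `GlobalScopeSketch` mirror the following members of the global scope: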

 * `map<string, Variable*> bindings_`
 * `map<string, boolean> foreignBindings_`

So, all variables in the global scope are inserted into **bindings_** and are also recorded in **foreignBindings_**.

### 4.2. How to import variables

When importing variables from an external source, only the global scope of the external source is checked for the variables. A variable can be imported only if it is not foreign (the name of the variable is in `foreignBindings_` with a `false` value) and is marked for export.
The **foreignBindings_** member helps to eliminate the "export waterfall" that is demonstrated by the following example:


```
// C.ets
export let a = 2
```

```
// B.ets
import { a } from "./C.ets" // 'a' marked as foreign
```

```
// A.ets
import { a } from "./B.ets"
// 'a' is not visible here, since 'a' is foreign in B.ets
```

### 4.3. Improvement possibilities/suggestions

The foreign marker can be improved. Since **Variable** objects are stored by pointer, removing the 'export' flag when they are imported is not an option, because that would change the flag for the exporting module as well. Deep copying the **Variable** objects would make it possible to remove the 'export' flag on import, but it could significantly increase memory consumption.

Currently, all variables, whether locally defined or imported, are stored in `foreignBindings_`. It could be enough to place only the imported ones into that structure.


## 5. Import directives

Import directives make entities exported from other compilation units available for use in the current compilation unit by using different binding forms.

An import declaration has the following two parts:
* Import path that determines a compilation unit to import from;
* Import binding that defines what entities, and in what form (qualified or unqualified), can be used by the current compilation unit.

An import directive uses the following form:
```'import' allBinding|selectiveBindings|defaultBinding|typeBinding 'from' importPath;```

### 5.1. Resolving importPath

The *importPath* is a string literal that can point to a module (separate module | package module) or a folder (a package folder, or a folder which contains an index.ts/index.ets file). It can be specified as a relative or an absolute path, with or without a file extension, and the paths entered in arktsconfig must be handled as well.
Resolving these paths within the compiler is the responsibility of the *importPathManager*. While an import path is parsed, the string literal is passed to the importPathManager, which resolves it to an absolute path and adds it to its own list, called *parseList_*.
This list lets the compiler know what still needs to be parsed (and lets it handle and avoid duplicates); it is requested and traversed during the *ParseSources* call. The importPathManager also handles errors that can be caught before parsing, for example non-existent or incorrectly specified import paths, but not errors that can only be found after parsing (for example, that a package folder should contain only package files that use the same package directive).

The resolved path is stored in an ImportSource instance together with two additional pieces of information: the language, and whether the imported element has a declaration or not. Both can be set under the dynamicPaths tag in arktsconfig.json; otherwise they are assigned default values (the lang member is derived from the extension, and the hasDecl member is true). This instance is passed as a parameter during the allocation of the *ETSImportDeclaration* AST node, together with the specifiers list resolved from the binding forms explained in the next section and the import kind (type or value).
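
The resolution steps described above can be sketched roughly as follows. This is a simplified TypeScript model, not the compiler's C++ code: `PathManagerSketch`, `ImportSourceSketch` and `normalizePath` are hypothetical names, the file system is replaced by a set of known absolute paths, the order in which extensions and index files are tried is an assumption, and arktsconfig paths and package folders are not modelled.

```typescript
// Hypothetical stand-in for the data kept per import (see ImportSource above).
interface ImportSourceSketch {
  resolvedPath: string;
  lang: string; // default: derived from the file extension
  hasDecl: boolean; // default: true
}

// Collapse "." and ".." segments of a POSIX-style absolute path.
function normalizePath(path: string): string {
  const parts: string[] = [];
  for (const part of path.split("/")) {
    if (part === "" || part === ".") continue;
    if (part === "..") parts.pop();
    else parts.push(part);
  }
  return "/" + parts.join("/");
}

class PathManagerSketch {
  // Counterpart of parseList_: absolute paths that still have to be parsed.
  readonly parseList: string[] = [];
  private readonly knownFiles: Set<string>;

  constructor(knownFiles: Set<string>) {
    this.knownFiles = knownFiles;
  }

  resolve(importingFile: string, importPath: string): ImportSourceSketch {
    const baseDir = importingFile.slice(0, importingFile.lastIndexOf("/"));
    const joined = importPath.charAt(0) === "/" ? importPath : baseDir + "/" + importPath;
    const base = normalizePath(joined);

    // Assumed lookup order: exact path, added extension, index file in a folder.
    const candidates = [base, base + ".ets", base + ".ts", base + "/index.ets", base + "/index.ts"];
    let match: string | undefined;
    for (const candidate of candidates) {
      if (this.knownFiles.has(candidate)) {
        match = candidate;
        break;
      }
    }
    if (match === undefined) {
      // An error that can be reported before any parsing happens.
      throw new Error("Cannot resolve import path: " + importPath);
    }

    // Each file is queued only once, however often it is imported.
    if (this.parseList.indexOf(match) === -1) {
      this.parseList.push(match);
    }
    const lang = match.slice(-4) === ".ets" ? "ets" : "ts";
    return { resolvedPath: match, lang, hasDecl: true };
  }
}
```

In this model, importing `"./util"` and `"../src/util.ets"` from the same file resolves to the same absolute path, so the file appears in the parse list only once.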

### 5.2. Handle binding forms (allBinding|selectiveBindings|defaultBinding|typeBinding)

The import specifier list is filled as long as import directives are found. A directive may contain the following binding forms:

#### 5.2.1. allBinding: '*' importAlias

```import * as N from "./test"```

Adding an importAlias is mandatory, but there is a temporary exception due to stdlib sources, which will have to be handled later and eliminated from the current implementation.
The name of a compilation unit is introduced as a result of `import * as N`, where N is an identifier. This form is parsed by *ParseNameSpaceSpecifier* (an outdated/deprecated name, left over from an old version of the standard). It allocates the ImportNamespaceSpecifier AST node with the imported token and adds it to the specifier list.

#### 5.2.2. selectiveBindings: '{' importBinding (',' importBinding)* '}'

The same bound entities can use several import bindings. The same bound entities can use one import directive, or several import directives with the same import path.
```import {sin as Sine, PI} from "..."```

The *ParseNamedSpecifiers* method creates the *ImportSpecifier* AST node with a local name, which in the case of an import alias is a different identifier than the imported token. This specifier is added to the specifier list.

#### 5.2.3. defaultBinding: Identifier | ( '{' 'default' 'as' Identifier '}' );

Default import binding allows importing a declaration exported from some module as default export. Knowing the actual name of the declaration is not required, as the new name is given at import. A compile-time error occurs if another form of import is used to import an entity initially exported as default. For now, the compiler supports only the following default import syntax:
```import ident from "..."```
but not
```import { default as ident} from "..."```
The latter was recently added to the standard (#17739).

The *ParseImportDefaultSpecifier* method creates an *ImportDefaultSpecifier* AST node with the imported identifier member and adds it to the specifiers list.

#### 5.2.4. typeBinding: 'type' selectiveBindings;

```import type {A} from "..."```

The difference between import and import type is that the first form imports all top-level declarations which were exported, and the second imports only exported types. There are two possible import kinds that can be set on the *ETSImportDeclaration* AST node:
* ir::ImportKinds::ALL;
* ir::ImportKinds::TYPES

The *ParseImportDeclarations* method itself sets the importKind member when it meets a type keyword token while parsing the import directive.

### 5.3. Build ETSImportDeclaration nodes

The *ETSImportDeclaration* nodes are added to the statements list.

The binder component builds/validates all the scope bindings. It is not a standalone analysis; each binder validation is triggered during the parsing process. Currently, the triggers include:
* Creating a new scope
* Adding a declaration to the current scope. If the current scope cannot accept the binding due to the scoping rules, a SyntaxError `es2panda::Error` is raised.

So the binder's BuildImportDeclaration method is called for every *ETSImportDeclaration* node; it imports the foreign bindings specified in the specifiers and inserts them into the global scope's variable map, called *bindings_*.

## 6. Exported declarations and export directives

Top-level declarations can use export modifiers that make the declarations accessible in other compilation units by using import.
The declarations not marked as exported can be used only inside the compilation unit they are declared in.

In addition, only one top-level declaration can be exported by using the default export scheme.
It allows specifying no declared name when importing.
A compile-time error occurs if more than one top-level declaration is marked as default.

The export directive allows the following:
* Specifying a selective list of exported declarations with optional renaming; or
* Specifying a name of one declaration; or
* Re-exporting declarations from other compilation units; or
* Exporting types

One important difference that stands out when looking at the parser sections is that, unlike import declarations, export declarations are not included in the dumped AST as separate nodes, except for re-export declarations. This is a shortcoming that would be useful to address, but it requires a major overhaul, which could be part of the rearchitecting phase.

There are compile-time checks for the following:
* Exporting one program element more than once (for example, as a type and also as a default export)
* Clashing names, because two program elements cannot be exported under the same name (this could occur when aliasing)
* Trying to type export something which is not a type

When handling some variants of export (like selective and default export), the compiler makes extensive use of an object called Varbinder:
* It is a persistent object during the entire process of parsing, the lowering phases, and checking
* The binder is shared among all parsed program files, making it suitable for checks such as those used in default exports.

### 6.1. Exported declarations

Top-level declarations can use export modifiers:

```
export class TestClass {}

export function foo(): void {}

export let msg = "hello"
```

In a nutshell, when a top-level declaration is exported, the following happens:
* The definition is stored in the global scope of the given file
* It gets a specific *ModifierFlag* after the export is parsed; the flag depends on the type of the export
* After this, it can be imported by another file using the specific import syntax
* The file doing the import stores its own definitions and the ones it imports separately

### 6.2. Default export

Default import binding allows importing a declaration exported from some module as default export.

```
//export.ets
export default class TestClass {}

//import.ets
import ImportedClass from "./export"
```

The basic rules of default export and import are the following:
* Only one default export can exist per module, because it has a dedicated syntax
* When importing a default exported program element, any name can be used, as long as it does not clash with other names
* The original name of the default export can also be used when importing, but it is not necessary

A brief description of what happens when the compiler is running:
* The *default* modifier itself is parsed in the *ETSParser::ParseMemberModifiers* method
* The exported declaration itself is parsed in the *ETSParser::ParseTopLevelDeclStatement* method
* It gets the *ModifierFlags::DEFAULT_EXPORT* flag
* As mentioned, only one default export is allowed per module, and since every parsed program uses the same binder, it is stored there as a member
* This member of the binder is set in the *InitScopesPhaseETS::ParseGlobalClass* method, where an error is thrown if more than one default export exists

### 6.3. Single export

The single export directive allows specifying the declaration which will be exported from the current compilation unit by using its name.
```
export v
let v = 1
```

### 6.4. Selective export

Each top-level declaration can be marked as exported by explicitly listing the names of exported declarations. An export list directive uses the same syntax as an import directive with selective bindings:

```
function foo(): void {}

export {foo}
```

Renaming is optional:

```
function foo(): void {}

export {foo as test_func}
```

The selective export directive is parsed in the *ETSParser::ParseExport* method, which calls the *ETSParser::ParseNamedSpecifiers* method to parse the export specifiers, i.e. the selective bindings after the *export* keyword:
```
export {a, b, c}
```

At this point these are only identifiers, stored in an *ir::Identifier* node for each name.

Most of the selective export handling happens in the lowering phase of top-level statements via the *ImportExportDecls* AST visitor.

A brief description of how it works currently:
* In the first step, a map named *fieldMap_* is populated with every definition in the parsed file via visitor methods like *ImportExportDecls::VisitFunctionDeclaration* and *ImportExportDecls::VisitVariableDeclaration*
* Another visitor method called *ImportExportDecls::VisitExportNamedDeclaration* stores the selectively exported names in another map called *exportNameMap_*
* The logic is handled in *ImportExportDecls::HandleGlobalStmts*, which contains checks and throws errors if necessary

  * This method loops through the *exportNameMap_* map and searches for the given names in the *fieldMap_*

  * If something is found, it gets the *ModifierFlags::EXPORT* flag (it only handles the original names and ignores aliases)
  * Currently, this solution does not support aliasing and only
    works with function and variable declarations

As mentioned above, renaming inside selective export is still under development. The current state of the work and the expected behaviour are as follows:
* The logic itself is still handled in the same AST visitor, called *ImportExportDecls*
* A multimap is introduced to handle aliasing, which is stored in the binder for every parsed file
* It is populated in the *ImportExportDecls::VisitExportNamedDeclaration* method next to the other map named *exportNameMap_*

* As for the functionalities:

  * It will support renaming, as expected
  * It will check for clashing names when the given alias is a name which is also exported, like:

    ```
    export function foo(): void {}
    let tmp_var = "hello"
    export {tmp_var as foo}
    ```

  * It also checks for clashing names between the different types of exports, like export type, default export and export declaration.
  * A compile-time error will be thrown if something is being imported using its original name, but it was exported with an alias
  * Similar behaviour will occur if something is being referred to by its original name after being imported with a namespace import:
    ```
    //export.ets
    function test_func(): void {}

    export {test_func as foo}

    //import.ets
    import * as all from "./export"
    all.test_func() //A compile-time error will be thrown at this point, since 'test_func' has an alias 'foo'
    ```
  * Support for exporting an imported program element will also be added at some point:
    ```
    //export.ets
    export function foo(): void {}
    //export_imported.ets
    import {foo} from "./export"

    export {foo}
    ```
  * This would also support aliasing:
    ```
    import {foo} from "./export"

    export {foo as bar}
    ```

### 6.5. Re-export

In addition to exporting what is declared in the module, it is possible to re-export declarations that are part of another module's exports. Only limited re-export possibilities are currently supported. It is possible to re-export a particular declaration or all declarations from a module.

Syntax:
```
reExportDirective:
'export' ('*' | selectiveBindings) 'from' importPath;
```

```
//export.ets
export function foo(): void {}

//re-export.ets
export {foo} from "./export"
```

When re-exporting, new names can be given. This action is similar to importing, but in the opposite direction.

```
export {foo as f} from "./export"
```


A brief description of how it works:

 * The specifiers are parsed in the *ETSParser::ParseExport* method, which calls the *ETSParser::ParseNamedSpecifiers* method, just like in the case of selective export directives, or the *ETSParser::ParseNamespaceSpecifiers* method.
 * It resolves the importPath in the same way as in the case of import declarations.
 * Instead of storing these specifiers in an *ExportNamedDeclaration* AST node, it creates an *ETSImportDeclaration* node, passing the specifiers and the resolved path.
 * An *ETSReExportDeclaration* AST node is created, which contains this *ETSImportDeclaration* node and the path of the current program, i.e. the one that contains the re-export directive.
 * So the binder's BuildImportDeclaration method is called for every ETSImportDeclaration node; it imports the foreign bindings specified in the specifiers and inserts them into the appropriate global scope's variable map, retrieving the *ETSReExportDeclaration* nodes and making sure that the specifiers are not available in the place where the re-export directive is.
 * In case of '*', ```import * as A from '...'```, the compiler creates an **ETSObjectType** for each re-export.
   If a member is accessed within *A* and cannot be found, the search is extended by the *SearchReExportsType* function, which finds the right 'ETSObjectType' if it exists.

**Improvement possibilities**

Currently, re-export is implemented as a workaround. See the following example, which is a typical usage of re-export:

```
// C.ets
export let a = 3

// B.ets
export {a} from "./C.ets"

// A.ets
import {a} from "./B.ets"
```

On a very high level, the engine copies the path of C.ets (which is defined in B.ets for re-export) and creates a direct import call to C.ets from A.ets. The following code illustrates it:

```
// C.ets
export let a = 3

// B.ets
export {a} from "./C.ets"

// A.ets
import {a} from "./C.ets"  // <---- here, a direct import call to C.ets is executed; it works, but it is not correct
```

Instead, B.ets should import all variables from C.ets, and all re-exported variables should be stored in a separate re-exported-variables map in B. These variables are not visible in B.ets (for usage), since they are kept for re-export only. After that, A.ets can import all exported variables from B.ets, including the re-exported ones.


### 6.6. Type exports

In addition to export that is attached to some declaration, a programmer can use the export type directive in order to do
the following:

* Export as a type a particular class or interface already declared; or
* Export an already declared type under a different name.

```
class TestClass {}

export type {TestClass}
```


The export type directive supports renaming.

```
export type {TestClass as tc}
```


Since the export type directive only supports exporting types, a compile-time error will be thrown if anything else is exported in such a way:

```
let msg = "hello"
function foo(): void {}

export type {msg} //a CTE will be thrown here, since msg is not a type but a variable
export type {foo} //a CTE will be thrown here as well, since foo is not a type but a function
```

A brief description of how it works currently:

* The keyword *type* is parsed in the *ETSParser::ParseMemberModifiers* method
* The export type directive uses the selective export syntax, which means it is parsed in the *ETSParser::ParseExport* method, which later calls the *ETSParser::ParseNamedSpecifiers* method to parse the specifiers
* Just like the selective export, it uses the *ImportExportDecls* visitor to handle its logic:

  * The checking of the exported types is handled in the *ImportExportDecls::VerifyTypeExports* and *ImportExportDecls::HandleSimpleType* methods
  * Compile-time errors are thrown if the program element to be exported is not a type
  * Similarly, an error is thrown if something is being exported twice, for example once as a type and once as an export declaration
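
The two checks listed above can be illustrated with a small TypeScript model. It is a sketch under assumptions, not the logic of *VerifyTypeExports* itself: `checkTypeExportsSketch`, `DeclKindSketch` and `ExportSpecifierSketch` are hypothetical names, declarations are reduced to a name-to-kind map, and treating classes, interfaces and type aliases as the exportable type kinds is an assumption made for this example.

```typescript
// Hypothetical classification of top-level declarations.
type DeclKindSketch = "class" | "interface" | "typeAlias" | "function" | "variable";

interface ExportSpecifierSketch {
  local: string; // name of the declaration in the file
  exported: string; // name after optional renaming
}

// Returns the compile-time errors an `export type {...}` list would produce.
function checkTypeExportsSketch(
  declarations: Map<string, DeclKindSketch>,
  typeExports: ExportSpecifierSketch[],
  alreadyExported: Set<string>
): string[] {
  const errors: string[] = [];
  const typeKinds: DeclKindSketch[] = ["class", "interface", "typeAlias"];

  for (const spec of typeExports) {
    const kind = declarations.get(spec.local);
    if (kind === undefined) {
      errors.push("'" + spec.local + "' is not declared");
      continue;
    }
    // Check 1: only types may appear in an export type directive.
    if (typeKinds.indexOf(kind) === -1) {
      errors.push("'" + spec.local + "' is not a type, it is a " + kind);
    }
    // Check 2: the same element must not be exported twice.
    if (alreadyExported.has(spec.local)) {
      errors.push("'" + spec.local + "' is exported more than once");
    }
  }
  return errors;
}

// Mirrors the example above: a class, a variable and a function.
const declarationsExample = new Map<string, DeclKindSketch>([
  ["TestClass", "class"],
  ["msg", "variable"],
  ["foo", "function"],
]);
const errorsExample = checkTypeExportsSketch(
  declarationsExample,
  [
    { local: "TestClass", exported: "tc" },
    { local: "msg", exported: "msg" },
    { local: "foo", exported: "foo" },
  ],
  new Set<string>()
);
console.log(errorsExample.length); // 2: one error for msg, one for foo
```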