# Design documentation for compilation units, packages, and modules

## 1. Introduction

Programs are structured as sequences of elements ready for compilation, i.e., compilation units. Each compilation unit creates its own scope.
The variables, functions, classes, interfaces, and other declarations of a compilation unit are accessible only within that scope unless they are explicitly exported.

A variable, function, class, interface, or other declaration exported from a different compilation unit must be imported before it can be used.

There are three kinds of compilation units:
   - Separate modules,
   - Declaration modules,
   - Packages

## 2. Objectives

The primary objectives of this design are:
- Give a general picture of how compilation unit handling works within the compiler.
- Refactor the current code structure. It has to follow the amendments to the standard, which allow other implementation approaches. In addition, several hot fixes have recently been merged into the code base, which has left the code kludgy and unclear.
- Implement the missing features, such as the internal access modifier, the single export directive, and the requirement that a package module can directly access all top-level entities declared in all modules that constitute the package.

## 3. Module Handling

A separate module is a module without a package header. A separate module can optionally consist of the following
four parts:
1. Import directives that enable referring to imported declarations in a module
2. Top-level declarations
3. Top-level statements
4. Re-export directives

Every module implicitly imports all exported entities from the essential kernel packages of the standard library.
All entities from these packages are accessible by their simple names, like the console variable.

```
// Hello, world! module
function main() {
  console.log("Hello, world!")
}
```

This is currently ensured by the *ETSParser::ParseDefaultSources* method, which parses an internally created ets file named "<default_import>.ets".

### 3.1. Package level scope

A name declared on the package level should be accessible throughout the entire package. The name can be accessed in other packages or modules if it is exported.
Currently there is no package level scope: packages are handled almost exactly the same as separate modules. A name declared on the module level is therefore accessible
throughout that module only, and it can be accessed in other modules/compilation units only if it is exported. This includes other package modules belonging to the same package (compilation unit).

[#17665]

## 4. How variables are stored

All variables declared in source files are inserted into scopes while the source code is parsed. Each scope (local scope, param scope, global scope, ...) has a field named **bindings_** that stores each identifier together with a pointer to the created Variable object.

 * map<string, Variable*> bindings_

Each scope has a pointer to its parent scope. When a variable is referred to in source code, the current scope is checked first. If the variable is not found there, the parent scope is searched, and so on. The search continues until the global scope is reached, which is the end of the scope chain.
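
The lookup described above can be sketched as follows. This is a minimal model written in TypeScript, not the compiler's C++ code; `LexicalScope`, `addBinding`, and `find` are hypothetical names chosen for illustration.

```typescript
// Hypothetical model of a scope chain; the names do not come from the compiler sources.
interface Variable {
  name: string;
}

class LexicalScope {
  // Counterpart of the bindings_ field: identifier -> Variable.
  private readonly bindings = new Map<string, Variable>();

  constructor(readonly parent: LexicalScope | null = null) {}

  addBinding(name: string): Variable {
    const variable: Variable = { name };
    this.bindings.set(name, variable);
    return variable;
  }

  // Check this scope first, then walk the parent chain up to the global scope.
  find(name: string): Variable | undefined {
    for (let scope: LexicalScope | null = this; scope !== null; scope = scope.parent) {
      const found = scope.bindings.get(name);
      if (found !== undefined) {
        return found;
      }
    }
    return undefined;
  }
}

const globalScope = new LexicalScope();
const functionScope = new LexicalScope(globalScope);
globalScope.addBinding("msg");
functionScope.addBinding("tmp");
```

Here `functionScope.find("msg")` succeeds through the parent link, while `globalScope.find("tmp")` does not, because the search never descends into child scopes.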

### 4.1. Storage of imported variables

Both import and export directives are allowed only at the top level. This means that all variables marked for export are placed into the global scope of the current program. Similarly, all imported variables are stored in the global scope of the current program.

For this reason, the global scope has an extra **foreignBindings_** member that tells which variables are defined locally and which are imported from external sources.

  * map<string, Variable*> bindings_
  * map<string, boolean> foreignBindings_

So all variables in the global scope are inserted into **bindings_**, and are also recorded in **foreignBindings_**.

### 4.2. How to import variables

When importing variables from an external source, only the global scope of the external source is checked for the variables. Only variables that are not foreign (the name of the variable is in `foreignBindings_` with a `false` value) and that are marked for export can be imported.
The **foreignBindings_** member helps to eliminate the "export waterfall" that is demonstrated by the following example:


```
// C.ets
export let a = 2
```

```
// B.ets
import { a } from "./C.ets"  // 'a' marked as foreign
```

```
// A.ets
import { a } from "./B.ets"
// 'a' is not visible here, since 'a' is foreign in B.ets
```
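
The filtering rule behind this example can be modeled as below. It is a sketch in TypeScript under the assumption that a global scope is just the two maps named in the text; `GlobalScope`, `define`, and `importFrom` are hypothetical helpers, not compiler APIs.

```typescript
// Hypothetical model: the real compiler keeps Variable pointers and a bool map in C++.
interface Variable {
  name: string;
  exported: boolean;
}

class GlobalScope {
  readonly bindings = new Map<string, Variable>();
  // true = imported from another unit, false = declared locally.
  readonly foreignBindings = new Map<string, boolean>();

  define(name: string, exported: boolean): void {
    this.bindings.set(name, { name, exported });
    this.foreignBindings.set(name, false);
  }

  // Import `name` from `source`; only local, exported variables qualify.
  importFrom(source: GlobalScope, name: string): boolean {
    const variable = source.bindings.get(name);
    const isForeign = source.foreignBindings.get(name) ?? true;
    if (variable === undefined || !variable.exported || isForeign) {
      return false;
    }
    this.bindings.set(name, variable); // the same object is shared, as with pointers
    this.foreignBindings.set(name, true);
    return true;
  }
}

const moduleC = new GlobalScope();
moduleC.define("a", true); // C.ets: export let a = 2
const moduleB = new GlobalScope();
const moduleA = new GlobalScope();
const bImportsA = moduleB.importFrom(moduleC, "a"); // B.ets imports from C.ets
const aImportsA = moduleA.importFrom(moduleB, "a"); // A.ets imports from B.ets
```

In this model `bImportsA` is `true` and `aImportsA` is `false`: the shared object still carries its export flag inside B, and only the foreign marker stops it from leaking further.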

### 4.3. Improvement possibilities/suggestions

The foreign marker can be improved. Since **Variable** objects are stored by pointer, removing the 'export' flag when they are imported is not a solution, because the change would also affect the exporting program. A deep copy of the **Variable** objects would allow the 'export' flag to be removed on import, but it could significantly increase memory consumption.

Currently, all variables, whether locally defined or imported, are stored in `foreignBindings_`. It could be enough to place only the imported ones into that structure.


## 5. Import directives

Import directives make entities exported from other compilation units available for use in the current compilation unit by using different binding forms.

An import declaration has the following two parts:
* Import path that determines a compilation unit to import from;
* Import binding that defines what entities, and in what form (qualified or unqualified), can be used by the current compilation unit.

An import directive uses the following form:
```'import' allBinding|selectiveBindings|defaultBinding|typeBinding 'from' importPath;```

### 5.1. Resolving importPath

The *importPath* is a string literal that can point to a module (separate module | package module) or a folder (a package folder, or a folder that contains an index.ts/index.ets file). It can be specified as a relative or an absolute path, with or without a file extension, and the resolution must also handle the paths entered in arktsconfig.
Resolving these paths within the compiler is the responsibility of the *importPathManager*. While an import path is parsed, the string literal is passed to the importPathManager, which resolves it to an absolute path and adds it to its own list, called *parseList_*.
This list tells the compiler what still needs to be parsed (and lets it handle and avoid duplicates); it is requested and traversed during the *ParseSources* call. The importPathManager also handles errors that can be caught before parsing, for example non-existent or incorrectly specified import paths, but not errors that can only be found after parsing (for example, that a package folder should contain only package files that use the same package directive).
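
The resolve-then-queue behaviour can be illustrated with a small model. This is a sketch in TypeScript, not the actual *importPathManager*: the class shape, the `knownFiles` set standing in for the file system, the POSIX-style paths, and the `.ets` default extension are all assumptions, and arktsconfig and folder imports are left out.

```typescript
// Hypothetical stand-in for importPathManager; the real class is C++ and also consults arktsconfig.
class ImportPathManager {
  // Counterpart of parseList_: resolved paths waiting to be parsed, without duplicates.
  private readonly parseList: string[] = [];

  constructor(private readonly knownFiles: Set<string>) {}

  // Resolve `importPath` against the importing file's directory and queue it once.
  resolve(importerDir: string, importPath: string): string {
    const joined = importPath.startsWith("/") ? importPath : `${importerDir}/${importPath}`;
    const parts: string[] = [];
    for (const part of joined.split("/")) {
      if (part === "" || part === ".") continue;
      if (part === "..") parts.pop();
      else parts.push(part);
    }
    let resolved = "/" + parts.join("/");
    if (!resolved.endsWith(".ets")) resolved += ".ets"; // assumed default extension
    if (!this.knownFiles.has(resolved)) {
      throw new Error(`Cannot find import: ${importPath}`); // error caught before parsing
    }
    if (this.parseList.indexOf(resolved) === -1) this.parseList.push(resolved);
    return resolved;
  }

  pending(): readonly string[] {
    return this.parseList;
  }
}

const manager = new ImportPathManager(new Set(["/src/math.ets", "/src/util/str.ets"]));
manager.resolve("/src", "./math");
manager.resolve("/src/util", "../math.ets"); // same file, so it is not queued twice
manager.resolve("/src", "./util/str.ets");
```

Normalizing to an absolute path before the duplicate check is what makes the two spellings of `math.ets` collapse into a single queue entry.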

Together with the resolved path, the importPath is stored in an ImportSource instance along with two additional pieces of information: the language, and whether the imported element has a declaration. Both can be set under the dynamicPaths tag in arktsconfig.json; otherwise they are assigned default values (the lang member is derived from the file extension, and the hasDecl member is true). This instance is passed as a parameter when the *ETSImportDeclaration* AST node is allocated, together with the specifier list resolved from the binding forms explained in the next section and the import kind (type or value).

### 5.2. Handle binding forms (allBinding|selectiveBindings|defaultBinding|typeBinding)

The import specifier list is filled for as long as import directives are found. A directive may contain the following binding forms:

#### 5.2.1. allBinding: '*' importAlias

```import * as N from "./test"```

Adding an importAlias is mandatory, but there is a temporary exception for stdlib sources, which will have to be handled later and eliminated from the current implementation.
The name of a compilation unit is introduced as a result of import * as N, where N is an identifier. This form is parsed by *ParseNameSpaceSpecifier* (an outdated/deprecated name left over from an old version of the standard). The ImportNamespaceSpecifier AST node is allocated here with the imported token and added to the specifier list.

#### 5.2.2. selectiveBindings: '{' importBinding (',' importBinding)* '}'

The same bound entities can use several import bindings. The same bound entities can use one import directive, or several import directives with the same import path.
```import {sin as Sine, PI} from "..."```

The *ParseNamedSpecifiers* method creates the *ImportSpecifier* AST node with a local name, which in the case of an import alias is a different identifier than the imported token. This specifier is added to the specifier list.

#### 5.2.3. defaultBinding: Identifier | ( '{' 'default' 'as' Identifier '}' );

A default import binding allows importing a declaration exported from some module as the default export. Knowing the actual name of the declaration is not required, as the new name is given at import. A compile-time error occurs if another form of import is used to import an entity initially exported as default. For now the compiler supports only the following default import syntax:
```import ident from "..."```
but not the following one, which was added to the standard recently (#17739):
```import { default as ident } from "..."```

The *ParseImportDefaultSpecifier* method creates an *ImportDefaultSpecifier* AST node with the imported identifier member and adds it to the specifier list.

#### 5.2.4. typeBinding: 'type' selectiveBindings;

```import type {A} from "..."```

The difference between import and import type is that the first form imports all exported top-level declarations, while the second imports only exported types. There are two possible import kinds that can be set on the *ETSImportDeclaration* AST node:
* ir::ImportKinds::ALL;
* ir::ImportKinds::TYPES

The *ParseImportDeclarations* method itself sets the importKind member if it meets a type keyword token while parsing the import directive.

### 5.3. Build ETSImportDeclaration nodes

The *ETSImportDeclaration* nodes are added to the statements list.

The binder component builds and validates all the scope bindings. It is not a standalone analysis; each binder validation is triggered during the parsing process. Currently, the triggers include:
* Creating a new scope
* Adding a declaration to the current scope. If the current scope cannot accept the binding due to the scoping rules, a SyntaxError `es2panda::Error` is raised.

The binder's BuildImportDeclaration method is then called for every *ETSImportDeclaration* node. It imports the foreign bindings specified in the specifiers and inserts them into the global scope's variable map, called *bindings_*.

## 6. Exported declarations and export directives

Top-level declarations can use export modifiers that make the declarations accessible in other compilation units by using import.
Declarations not marked as exported can be used only inside the compilation unit they are declared in.

In addition, only one top-level declaration can be exported by using the default export scheme. With it, the importing side does not need to specify the declared name.
A compile-time error occurs if more than one top-level declaration is marked as default.

The export directive allows the following:
* Specifying a selective list of exported declarations with optional renaming; or
* Specifying a name of one declaration; or
* Re-exporting declarations from other compilation units; or
* Exporting types

One important difference that stands out when looking at the parser sections is that, unlike import declarations, export declarations are not included in the dumped AST as separate nodes, except for re-export declarations. This is a shortcoming that would be useful to address, but it requires a major overhaul, which could be part of the rearchitecting phase.

There are compile-time checks for the following:
* Exporting one program element more than once (for example, as a type and also as a default export)
* Clashing names, because two program elements cannot be exported under the same name (this can occur when aliasing)
* Trying to type export something that is not a type

When handling some variants of export (like selective and default export), the compiler makes extensive use of an object called Varbinder:
* It is a persistent object during the entire process of parsing, the lowering phases, and checking
* The binder is shared among all parsed program files, making it suitable for checks such as those used in default exports.

### 6.1. Exported declarations

Top-level declarations can use export modifiers:

```
export class TestClass {}

export function foo(): void {}

export let msg = "hello"
```

In a nutshell, when a top-level declaration is exported, the following happens:
* The definition is stored in the global scope of the given file
* It gets a specific *ModifierFlag* after the export is parsed; the flag depends on the type of the export
* After this, it can be imported by another file using the specific import syntax
* The file doing the import stores its own definitions and the ones it imports separately

### 6.2. Default export

A default import binding allows importing a declaration exported from some module as the default export.

```
//export.ets
export default class TestClass {}

//import.ets
import ImportedClass from "./export"
```

The basic rules of default export and import are the following:
* Only one default export can exist per module, because it has a dedicated syntax
* When importing a default exported program element, any name can be used, as long as it does not clash with other names
* The original name of the default export can also be used when importing, but it is not necessary

A brief description of what happens when the compiler is running:
* The *default* modifier itself is parsed in the *ETSParser::ParseMemberModifiers* method
* The exported declaration itself is parsed in the *ETSParser::ParseTopLevelDeclStatement* method
* It gets the *ModifierFlags::DEFAULT_EXPORT* flag
* As mentioned, only one default export is allowed per module, and since every parsed program uses the same binder, the default export is stored there as a member
* This member of the binder is set in the *InitScopesPhaseETS::ParseGlobalClass* method, where an error is thrown if more than one default export exists
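
The "one default export per module" check can be pictured with the following sketch. It is written in TypeScript and only models the idea of a shared binder remembering the default export; the `Binder` class, its per-file map, and the error text are assumptions, not the layout used by the compiler.

```typescript
// Hypothetical model of the default-export bookkeeping; not the compiler's data layout.
interface Declaration {
  name: string;
  isDefaultExport: boolean; // stands in for ModifierFlags::DEFAULT_EXPORT
}

class Binder {
  // The binder is shared by all programs, so this sketch keys the slot by file name.
  private readonly defaultExports = new Map<string, Declaration>();

  registerProgram(file: string, declarations: Declaration[]): void {
    for (const decl of declarations) {
      if (!decl.isDefaultExport) continue;
      if (this.defaultExports.has(file)) {
        throw new Error(`${file}: only one default export is allowed per module`);
      }
      this.defaultExports.set(file, decl);
    }
  }

  defaultExportOf(file: string): Declaration | undefined {
    return this.defaultExports.get(file);
  }
}

const binder = new Binder();
binder.registerProgram("export.ets", [
  { name: "TestClass", isDefaultExport: true },
  { name: "helper", isDefaultExport: false },
]);
```

Registering a second program whose declaration list contains two default exports would throw in this model, which mirrors the error described for *InitScopesPhaseETS::ParseGlobalClass*.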

### 6.3. Single export

The single export directive allows specifying, by name, the declaration that will be exported from the current compilation unit.
```
export v
let v = 1
```

### 6.4. Selective export

Each top-level declaration can be marked as exported by explicitly listing the names of exported declarations. An export list directive uses the same syntax as an import directive with selective bindings:

```
function foo(): void {}

export {foo}
```

Renaming is optional:

```
function foo(): void {}

export {foo as test_func}
```

The selective export directive is parsed in the *ETSParser::ParseExport* method, which calls the *ETSParser::ParseNamedSpecifiers* method to parse the export specifiers, that is, the selective bindings after the *export* keyword:
```
export {a, b, c}
```

At this point these are only identifiers, stored in an *ir::Identifier* node for each name.

Most of the selective export handling happens in the lowering phase of top-level statements, via the *ImportExportDecls* AST visitor.

A brief description of how it works currently:
* In the first step, a map named *fieldMap_* is populated with every definition in the parsed file via visitor methods like *ImportExportDecls::VisitFunctionDeclaration* and *ImportExportDecls::VisitVariableDeclaration*
* Another visitor method called *ImportExportDecls::VisitExportNamedDeclaration* stores the selectively exported names in another map called *exportNameMap_*
* The logic is handled in the *ImportExportDecls::HandleGlobalStmts* method, which contains checks and throws errors if necessary

    * This method loops through the *exportNameMap_* map and searches for the given names in the *fieldMap_*

        * If something is found, it gets the *ModifierFlags::EXPORT* flag (it only handles the original names and ignores aliases)
        * Currently, this solution does not support aliasing and only works with function and variable declarations
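
The two-map flow above can be sketched roughly as follows. The sketch is TypeScript, not the visitor's C++ code; `markSelectiveExports` is a hypothetical helper, the `exported` boolean stands in for *ModifierFlags::EXPORT*, and the error raised for an unknown name is an assumption about what the "checks" include.

```typescript
// Hypothetical model of the ImportExportDecls bookkeeping described above.
interface Declaration {
  name: string;
  exported: boolean; // stands in for ModifierFlags::EXPORT
}

function markSelectiveExports(declarations: Declaration[], exportedNames: string[]): void {
  // fieldMap_: every top-level definition of the file, by name.
  const fieldMap = new Map<string, Declaration>();
  for (const decl of declarations) {
    fieldMap.set(decl.name, decl);
  }

  // exportNameMap_ is reduced to a plain list of original names in this sketch.
  for (const name of exportedNames) {
    const decl = fieldMap.get(name);
    if (decl === undefined) {
      throw new Error(`Cannot find '${name}' to export`); // assumed check
    }
    decl.exported = true;
  }
}

const decls: Declaration[] = [
  { name: "foo", exported: false },
  { name: "bar", exported: false },
];
markSelectiveExports(decls, ["foo"]); // export {foo}
```

After the call only `foo` carries the export marker. Aliases play no role here, which matches the current limitation noted in the list.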

As mentioned above, renaming inside selective export is still under development. The current process and the expected behaviour are as follows:
* The logic itself is still handled in the same AST visitor, called *ImportExportDecls*
* A multimap is introduced to handle aliasing, which is stored in the binder for every parsed file
* It is populated in the *ImportExportDecls::VisitExportNamedDeclaration* method next to the other map named *exportNameMap_*

* As for the functionalities:

    * It will support renaming, as expected
    * It will check for clashing names, that is, cases where the given alias is a name that is also exported, like:

        ```
        export function foo(): void {}
        let tmp_var = "hello"
        export {tmp_var as foo}
        ```

    * It also checks for clashing names between the different types of exports, like export type, default export and export declaration.
    * A compile-time error will be thrown if something is imported using its original name, but it was exported with an alias
    * Similar behaviour will occur if something is referred to by its original name after being imported with a namespace import:
        ```
        //export.ets
        function test_func(): void {}

        export {test_func as foo}

        //import.ets
        import * as all from "./export"
        all.test_func() //A compile time error will be thrown at this point, since 'test_func' has an alias 'foo'
        ```
    * Support for exporting an imported program element will also be added at some point:
        ```
        //export.ets
        export function foo(): void {}
        //export_imported.ets
        import {foo} from "./export"

        export {foo}
        ```
    * This would also support aliasing:
        ```
        import {foo} from "./export"

        export {foo as bar}
        ```
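
The expected alias behaviour can be summarized in a small model. It is a TypeScript sketch under the assumption that a file's exports are kept as a table from exported name to local name; `buildExportTable` and `resolveImport` are hypothetical helpers and do not reflect the multimap stored in the binder.

```typescript
interface ExportSpecifier {
  local: string;      // name of the declaration inside the module
  exportedAs: string; // name visible to importers (equals `local` when there is no alias)
}

// Builds a table from exported name to local name and rejects clashing exported names.
function buildExportTable(specifiers: ExportSpecifier[]): Map<string, string> {
  const table = new Map<string, string>();
  for (const spec of specifiers) {
    if (table.has(spec.exportedAs)) {
      throw new Error(`'${spec.exportedAs}' is exported more than once`);
    }
    table.set(spec.exportedAs, spec.local);
  }
  return table;
}

// An importer only sees exported names, so the original name of an aliased export is not found.
function resolveImport(table: Map<string, string>, requested: string): string {
  const local = table.get(requested);
  if (local === undefined) {
    throw new Error(`'${requested}' is not exported`);
  }
  return local;
}

// export {test_func as foo}
const table = buildExportTable([{ local: "test_func", exportedAs: "foo" }]);
```

With this table, `resolveImport(table, "foo")` yields `test_func`, while asking for `test_func` fails. Feeding `buildExportTable` both `foo` and `tmp_var as foo` reproduces the clash from the first example.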

### 6.5. Re-export

In addition to exporting what is declared in the module, it is possible to re-export declarations that are part of another module's exports. Only limited re-export possibilities are currently supported. It is possible to re-export a particular declaration or all declarations from a module.

Syntax:
```
reExportDirective:
'export' ('*' | selectiveBindings) 'from' importPath;
```

```
//export.ets
export function foo(): void {}

//re-export.ets
export {foo} from "./export"
```

When re-exporting, new names can be given. This action is similar to importing, but in the opposite direction.

```
export {foo as f} from "./export"
```


A brief description of how it works:

 * The specifiers are parsed in the *ETSParser::ParseExport* method, which calls either the *ETSParser::ParseNamedSpecifiers* method, just like in the case of selective export directives, or the *ETSParser::ParseNamespaceSpecifiers* method.
 * The importPath is resolved in the same way as for import declarations.
 * Instead of storing these specifiers in an *ExportNamedDeclaration* AST node, the parser creates an *ETSImportDeclaration* node, passing it the specifiers and the resolved path.
 * An *ETSReExportDeclaration* AST node is created, which contains this *ETSImportDeclaration* node and the path of the current program, which contains the re-export directive.
 * The binder's BuildImportDeclaration method is then called for every ETSImportDeclaration node. It imports the foreign bindings specified in the specifiers and inserts them into the appropriate global scope's variable map, retrieving the *ETSReExportDeclaration* nodes and making sure that the specifiers are not available in the place where the re-export directive is.
 * In the case of '*' (```import * as A from '...'```), the compiler creates an **ETSObjectType** for each re-export. If a member of *A* is accessed and cannot be found, the search is extended with the *SearchReExportsType* function, which finds the right 'ETSObjectType' if it exists.

**Improvement possibilities**

Currently, re-export is implemented with a workaround. See the following example, which is a typical usage of re-export:

```
// C.ets
export let a = 3

// B.ets
export {a} from "./C.ets"

// A.ets
import {a} from "./B.ets"
```

At a very high level, the engine copies the path of C.ets (which is defined in B.ets for re-export) and creates a direct import call to C.ets from A.ets. The following code illustrates this:

```
// C.ets
export let a = 3

// B.ets
export {a} from "./C.ets"

// A.ets
import {a} from "./C.ets"  // <---- here, a direct import call to C.ets is executed, which works but is not correct.
```

Instead, B.ets should import all variables from C.ets, and all re-exported variables should be stored in a separate map of re-exported variables in B. These variables are not visible in B.ets (for usage), since they are kept for re-export only. After that, A.ets can import all exported variables from B.ets, including the re-exported ones.
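
One way to picture the proposed design is the sketch below. It is a TypeScript model of the suggestion only, not existing compiler code; `ProgramScope` and its three maps are hypothetical, and plain numbers stand in for Variable objects.

```typescript
// Hypothetical model of the proposed re-export storage; numbers stand in for Variable objects.
class ProgramScope {
  readonly usable = new Map<string, number>();     // names visible inside the file
  readonly exported = new Map<string, number>();   // locally declared and exported
  readonly reExported = new Map<string, number>(); // forwarded only, never usable locally

  exportLocal(name: string, value: number): void {
    this.usable.set(name, value);
    this.exported.set(name, value);
  }

  // What an importer of this file may see: own exports plus re-exports.
  lookupExport(name: string): number | undefined {
    return this.exported.get(name) ?? this.reExported.get(name);
  }

  reExportFrom(source: ProgramScope, name: string): void {
    const value = source.lookupExport(name);
    if (value === undefined) throw new Error(`'${name}' is not exported by the source`);
    this.reExported.set(name, value);
  }

  importFrom(source: ProgramScope, name: string): void {
    const value = source.lookupExport(name);
    if (value === undefined) throw new Error(`'${name}' is not exported by the source`);
    this.usable.set(name, value);
  }
}

const fileC = new ProgramScope();
fileC.exportLocal("a", 3);      // C.ets: export let a = 3
const fileB = new ProgramScope();
fileB.reExportFrom(fileC, "a"); // B.ets: export {a} from "./C.ets"
const fileA = new ProgramScope();
fileA.importFrom(fileB, "a");   // A.ets: import {a} from "./B.ets"
```

In this model A obtains `a` through B without ever touching C directly, and `a` never appears among the names usable inside B.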


### 6.6. Type exports

In addition to export that is attached to some declaration, a programmer can use the export type directive in order to do
the following:

* Export as a type a particular class or interface already declared; or
* Export an already declared type under a different name.

```
class TestClass {}

export type {TestClass}
```


The export type directive supports renaming.

```
export type {TestClass as tc}
```


Since the export type directive only supports exporting types, a compile-time error will be thrown if anything else is exported in this way:

```
let msg = "hello"
function foo(): void {}

export type {msg} //a CTE will be thrown here, since msg is not a type but a variable
export type {foo} //a CTE will be thrown here as well, since foo is a function
```

A brief description of how it works currently:

* The keyword *type* is parsed in the *ETSParser::ParseMemberModifiers* method
* The export type directive uses the selective export syntax, which means it is parsed in the *ETSParser::ParseExport* method, which later calls the *ETSParser::ParseNamedSpecifiers* method to parse the specifiers
* Just like the selective export, it uses the *ImportExportDecls* visitor to handle its logic:

    * The checking of the exported types is handled in the *ImportExportDecls::VerifyTypeExports* and *ImportExportDecls::HandleSimpleType* methods
    * Compile-time errors are thrown if the program element to be exported is not a type
    * Similarly, an error is thrown if something is being exported twice, for example once as a type and once as an export declaration
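
The two checks can be sketched as follows. This is a TypeScript model, not the code of *ImportExportDecls::VerifyTypeExports*; the `DeclKind` values, the treatment of type aliases as types, and the error messages are assumptions made for illustration.

```typescript
type DeclKind = "class" | "interface" | "typeAlias" | "function" | "variable";

interface Declaration {
  name: string;
  kind: DeclKind;
  exported: boolean; // already exported by a modifier, a selective export, or an earlier type export
}

// Returns the diagnostics that an `export type {...}` list would produce in this model.
function verifyTypeExports(declarations: Declaration[], typeExports: string[]): string[] {
  const byName = new Map<string, Declaration>();
  for (const decl of declarations) {
    byName.set(decl.name, decl);
  }

  const errors: string[] = [];
  for (const name of typeExports) {
    const decl = byName.get(name);
    if (decl === undefined) {
      errors.push(`Cannot find '${name}'`);
    } else if (decl.kind === "function" || decl.kind === "variable") {
      errors.push(`'${name}' is not a type`);
    } else if (decl.exported) {
      errors.push(`'${name}' is exported more than once`);
    } else {
      decl.exported = true;
    }
  }
  return errors;
}

const errors = verifyTypeExports(
  [
    { name: "TestClass", kind: "class", exported: false },
    { name: "msg", kind: "variable", exported: false },
    { name: "foo", kind: "function", exported: false },
  ],
  ["TestClass", "msg", "foo", "TestClass"],
);
```

For this input the model reports three diagnostics: `msg` and `foo` are not types, and the second mention of `TestClass` is a duplicate export.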