# CJS Module Lexer

[![Build Status][travis-image]][travis-url]

A [very fast](#benchmarks) JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.

Outputs the list of named exports (`exports.name = ...`) and possible module reexports (`module.exports = require('...')`), including the common transpiler variations of these cases.

Forked from https://github.com/guybedford/es-module-lexer.

_Comprehensively handles the JS language grammar while remaining small and fast: ~90ms per MB of JS cold and ~15ms per MB of JS warm. [See benchmarks](#benchmarks) for more info._

### Usage

```
npm install cjs-module-lexer
```

For use in CommonJS:

```js
const { parse } = require('cjs-module-lexer');

// `init` returns a promise for parity with the ESM API, but you do not have to call it

const { exports, reexports } = parse(`
  // named exports detection
  module.exports.a = 'a';
  (function () {
    exports.b = 'b';
  })();
  Object.defineProperty(exports, 'c', { value: 'c' });
  /* exports.d = 'not detected'; */

  // reexports detection
  if (maybe) module.exports = require('./dep1.js');
  if (another) module.exports = require('./dep2.js');

  // literal exports assignments
  module.exports = { a, b: c, d, 'e': f }

  // __esModule detection
  Object.defineProperty(module.exports, '__esModule', { value: true })
`);

// exports === ['a', 'b', 'c', '__esModule']
// reexports === ['./dep1.js', './dep2.js']
```

When using the ESM version, Wasm is supported instead:

```js
import { parse, init } from 'cjs-module-lexer';
// init needs to be called and waited upon
await init();
const { exports, reexports } = parse(source);
```

The Wasm build is around 1.5x faster and has no cold start.

### Grammar

CommonJS exports matches are run against the source token stream.

The token grammar is:

```
IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
            "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

MODULE_EXPORTS: `module` `.` `exports`

EXPORTS_IDENTIFIER: MODULE_EXPORTS | `exports`

EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` STRING_LITERAL `]` `=`

EXPORTS_LITERAL_PROP: (IDENTIFIER  (`:` IDENTIFIER)?) | (STRING_LITERAL `:` IDENTIFIER)

EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

EXPORTS_DEFINE: `Object` `.` `defineProperty` `(` EXPORTS_IDENTIFIER `,` STRING_LITERAL

EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
  (`enumerable: true,`)?
  (
    `value:` |
    `get` (`: function` IDENTIFIER? )?  `() {` `return` IDENTIFIER (`.` IDENTIFIER | `[` STRING_LITERAL `]`)? `;`? `}` `,`?
  )
  `})`

EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` ((EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

REQUIRE: `require` `(` STRING_LITERAL `)`

EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` (`_interopRequireWildcard (`)? REQUIRE

MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
  (
    (
      `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( `'__esModule'` | `"__esModule"` ) `) return` `;`?
      (
        (`if (Object` `.prototype`? `.hasOwnProperty.call(`  IDENTIFIER `, ` IDENTIFIER$2 `)) return` `;`?)?
        (`if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`)?
      )?
    ) |
    `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) (`&& !` (`Object` `.prototype`? `.hasOwnProperty.call(`  IDENTIFIER `, ` IDENTIFIER$2 `)` | IDENTIFIER `.hasOwnProperty(` IDENTIFIER$2 `)`))? `)`
  )
  (
    EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
    `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get` (`: function` IDENTIFIER? )?  `() { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `}` `,`? `})` `;`?
  )
  `})`
```

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.

* The returned export names are taken to be the combination of:
  1. All `IDENTIFIER` and `STRING_LITERAL` slots for `EXPORTS_MEMBER` and `EXPORTS_LITERAL` matches.
  2. The first `STRING_LITERAL` slot for all `EXPORTS_DEFINE_VALUE` matches, excluding any string that also appears in an `EXPORTS_DEFINE` match that is not an `EXPORTS_DEFINE_VALUE` match.
* The reexport specifiers are taken to be the combination of:
  1. The `REQUIRE` matches of the last match of either `MODULE_EXPORTS_ASSIGN` or `EXPORTS_LITERAL`.
  2. All _top-level_ `EXPORT_STAR` `REQUIRE` matches and `EXPORTS_ASSIGN` matches whose `IDENTIFIER` also matches the first `IDENTIFIER` in `EXPORT_STAR_LIB`.
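
The export-name combination rule above can be sketched as follows. This is a minimal illustration of the rule as written, not the lexer's implementation: `combineExportNames` and the `{ kind, name }` record shape are hypothetical, and the matches are assumed to have been classified already.

```js
// Hypothetical match records, assumed to be produced by an earlier matching step:
//   kind 'member'      -> EXPORTS_MEMBER match
//   kind 'literal'     -> property name from an EXPORTS_LITERAL match
//   kind 'defineValue' -> EXPORTS_DEFINE that is also an EXPORTS_DEFINE_VALUE match
//   kind 'defineOther' -> EXPORTS_DEFINE that is not an EXPORTS_DEFINE_VALUE match
function combineExportNames(matches) {
  const names = new Set();
  const defined = new Set();
  const rejected = new Set();
  for (const { kind, name } of matches) {
    if (kind === 'member' || kind === 'literal') names.add(name);
    else if (kind === 'defineValue') defined.add(name);
    else if (kind === 'defineOther') rejected.add(name);
  }
  // A defineProperty name counts only if no unsupported define used the same string.
  for (const name of defined) {
    if (!rejected.has(name)) names.add(name);
  }
  return [...names];
}

const names = combineExportNames([
  { kind: 'member', name: 'a' },
  { kind: 'defineValue', name: 'b' },
  { kind: 'defineValue', name: 'c' },
  { kind: 'defineOther', name: 'c' },
]);
console.log(names); // 'c' is dropped because of the unsupported define
```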

### Parsing Examples

#### Named Exports Parsing

The basic matching rules for named exports are `exports.name`, `exports['name']` or `Object.defineProperty(exports, 'name', ...)`. This matching is done without scope analysis and regardless of the expression position:

```js
// DETECTS EXPORTS: a, b
(function (exports) {
  exports.a = 'a';
  exports['b'] = 'b';
})(exports);
```

Because there is no scope analysis, the above detection may overclassify:

```js
// DETECTS EXPORTS: a, b, c
(function (exports, Object) {
  exports.a = 'a';
  exports['b'] = 'b';
  if (false)
    exports.c = 'c';
})(NOT_EXPORTS, NOT_OBJECT);
```

It will in turn underclassify in cases where the identifiers are renamed:

```js
// DETECTS: NO EXPORTS
(function (e) {
  e.a = 'a';
  e['b'] = 'b';
})(exports);
```
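
To make the effect of skipping scope analysis concrete, here is a deliberately naive sketch that matches the `exports.name =` and `exports['name'] =` forms with regular expressions. It is not how the lexer works (the real lexer tokenizes the source and skips comments and strings); `detectNamedExports` is a hypothetical helper for illustration only.

```js
// Naive, scope-unaware matching: any textual `exports.x =` or `exports['x'] =`
// counts, no matter what `exports` is bound to at that point in the program.
function detectNamedExports(source) {
  const names = new Set();
  const dotAssign = /\b(?:module\.)?exports\.([A-Za-z_$][\w$]*)\s*=(?!=)/g;
  const computedAssign = /\b(?:module\.)?exports\[\s*(['"])([^'"\\]+)\1\s*\]\s*=(?!=)/g;
  for (const match of source.matchAll(dotAssign)) names.add(match[1]);
  for (const match of source.matchAll(computedAssign)) names.add(match[2]);
  return [...names];
}

// Overclassifies: `exports` is shadowed and `c` is unreachable, yet both names are reported.
const shadowed = `(function (exports) { exports.a = 'a'; if (false) exports.c = 'c'; })(NOT_EXPORTS);`;
console.log(detectNamedExports(shadowed));

// Underclassifies: the real exports object is written through the alias `e`.
const renamed = `(function (e) { e.a = 'a'; e['b'] = 'b'; })(exports);`;
console.log(detectNamedExports(renamed)); // prints []
```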

#### Getter Exports Parsing

`Object.defineProperty` is detected specifically for value and getter forms returning an identifier or member expression:

```js
// DETECTS: a, b, c, d, __esModule
Object.defineProperty(exports, 'a', {
  enumerable: true,
  get: function () {
    return q.p;
  }
});
Object.defineProperty(exports, 'b', {
  enumerable: true,
  get: function () {
    return q['p'];
  }
});
Object.defineProperty(exports, 'c', {
  enumerable: true,
  get () {
    return b;
  }
});
Object.defineProperty(exports, 'd', { value: 'd' });
Object.defineProperty(exports, '__esModule', { value: true });
```

Value properties are also detected specifically:

```js
Object.defineProperty(exports, 'a', {
  value: 'no problem'
});
```

To avoid matching getters that have side effects, any getter for an export name that does not match the forms above opts out of getter matching:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  get () {
    return 'nope';
  }
});

if (false) {
  Object.defineProperty(module.exports, 'a', {
    get () {
      return dynamic();
    }
  })
}
```

Alternative object definition structures or getter function bodies are not detected:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  enumerable: false,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'b', {
  configurable: true,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'c', {
  get: () => p
});
Object.defineProperty(exports, 'd', {
  enumerable: true,
  get: function () {
    return dynamic();
  }
});
Object.defineProperty(exports, 'e', {
  enumerable: true,
  get () {
    return 'str';
  }
});
```

`Object.defineProperties` is also not supported.

#### Exports Object Assignment

A best effort is made to detect `module.exports` object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.

Simple object definitions are supported:

```js
// DETECTS EXPORTS: a, b, c
module.exports = {
  a,
  'b': b,
  c: c,
  ...d
};
```

Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:

```js
// DETECTS EXPORTS: a, b
module.exports = {
  a,
  ...d,
  b: require('c'),
  c: "not detected since require('c') above bails the object detection"
}
```
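
The bail-out behaviour can be illustrated with a rough sketch. It assumes the text between the braces has already been extracted and that no property value contains a comma, which the real lexer does not need to assume; `detectLiteralExports` is a hypothetical helper, not part of the library.

```js
const IDENTIFIER = /^[A-Za-z_$][\w$]*$/;
// Key is an identifier or a quoted string, optionally followed by `: value`.
const PROPERTY = /^(?:([A-Za-z_$][\w$]*)|(['"])([^'"\\]*)\2)\s*(:\s*(.*))?$/s;

// `body` is the text between `{` and `}` of a `module.exports = { ... }` literal.
function detectLiteralExports(body) {
  const names = [];
  for (const raw of body.split(',')) {
    const prop = raw.trim();
    if (prop === '') continue;
    if (prop.startsWith('...')) continue; // spreads contribute no named exports
    const match = PROPERTY.exec(prop);
    if (!match) break; // unsupported key: stop scanning the literal
    names.push(match[1] ?? match[3]);
    // The name is recorded first; a non-identifier value then ends the detection.
    if (match[4] !== undefined && !IDENTIFIER.test(match[5])) break;
  }
  return names;
}

console.log(detectLiteralExports(`a, ...d, b: require('c'), c: "skipped"`)); // a and b only
```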

`Object.defineProperties` is not currently supported either.

#### module.exports reexport assignment

Any `module.exports = require('mod')` assignment is detected as a reexport, but only the last one is returned:

```js
// DETECTS REEXPORTS: c
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');
```

This is to avoid over-classification in Webpack bundles with externals which include `module.exports = require('external')` in their source for every external dependency.
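
The last-match-wins behaviour can be approximated with a short sketch. It uses a regular expression instead of the lexer's token stream, so it is only an illustration; `lastModuleExportsReexport` is a hypothetical name.

```js
// Keeps only the final `module.exports = require('...')` specifier found in the source.
function lastModuleExportsReexport(source) {
  const pattern = /module\.exports\s*=\s*require\(\s*(['"])([^'"\\]+)\1\s*\)/g;
  let last = null;
  for (const match of source.matchAll(pattern)) last = match[2];
  return last === null ? [] : [last];
}

const source = [
  "module.exports = require('a');",
  "(module => module.exports = require('b'))(NOT_MODULE);",
  "if (false) module.exports = require('c');",
].join('\n');
console.log(lastModuleExportsReexport(source)); // only 'c' survives
```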

In exports object assignment, any spreads of `require()` are detected as multiple separate reexports:

```js
// DETECTS REEXPORTS: a, b
module.exports = require('ignored');
module.exports = {
  ...require('a'),
  ...require('b')
};
```

#### Transpiler Re-exports

For named exports, transpiler output works well with the rules described above.

But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS.
These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.

For example, `export * from 'external'` is output by Babel as:

```js
"use strict";

exports.__esModule = true;

var _external = require("external");

Object.keys(_external).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  exports[key] = _external[key];
});
```

Here both the `var _external = require("external")` statement and the `Object.keys(_external)` statement are specifically detected, down to the exact
form of the entire expression, including minor variations of the output. The `_external` and `key` identifiers are carefully matched in this
detection.
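
The pairing of the two statements can be sketched roughly as below. This is a simplified illustration under stated assumptions: "top-level" is approximated by requiring each statement to start at the beginning of a line, and the body of the `forEach` callback is not checked, whereas the lexer matches it exactly. `detectBabelStarReexports` is a hypothetical helper.

```js
// A `var x = require('...')` is reported only when the same identifier `x`
// is later iterated with `Object.keys(x).forEach(function (key) {`.
function detectBabelStarReexports(source) {
  const requirePattern =
    /^var\s+([A-Za-z_$][\w$]*)\s*=\s*require\(\s*(['"])([^'"\\]+)\2\s*\)/gm;
  const forEachPattern =
    /^Object\.keys\(\s*([A-Za-z_$][\w$]*)\s*\)\.forEach\(function\s*\(\s*[A-Za-z_$][\w$]*\s*\)\s*\{/gm;

  const iterated = new Set();
  for (const match of source.matchAll(forEachPattern)) iterated.add(match[1]);

  const reexports = [];
  for (const match of source.matchAll(requirePattern)) {
    if (iterated.has(match[1])) reexports.push(match[3]);
  }
  return reexports;
}

const babelOutput = [
  'var _external = require("external");',
  'var _other = require("other");',
  'Object.keys(_external).forEach(function (key) {',
  '  if (key === "default" || key === "__esModule") return;',
  '  exports[key] = _external[key];',
  '});',
].join('\n');
console.log(detectBabelStarReexports(babelOutput)); // "other" is not paired, so it is skipped
```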

Similarly for TypeScript, `export * from 'external'` is output as:

```js
"use strict";
function __export(m) {
    for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));
```

Here the `__export(require("external"))` statement is explicitly detected as a reexport, including the variations `tslib.__export` and `__exportStar`.

### Environment Support

Node.js 10+, and [all browsers with WebAssembly support](https://caniuse.com/#feat=wasm).

### JS Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results:

JS Build:

```
Module load time
> 4ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 299ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 13.96ms
test/samples/angular.min.js (303 KiB)
> 4.72ms
test/samples/d3.js (553 KiB)
> 6.76ms
test/samples/d3.min.js (250 KiB)
> 4ms
test/samples/magic-string.js (34 KiB)
> 0.64ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 8.48ms
test/samples/rollup.min.js (367 KiB)
> 5.36ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 40.28ms
```

Wasm Build:

```
Module load time
> 10ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 43ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 9.32ms
test/samples/angular.min.js (303 KiB)
> 3.16ms
test/samples/d3.js (553 KiB)
> 5ms
test/samples/d3.min.js (250 KiB)
> 2.32ms
test/samples/magic-string.js (34 KiB)
> 0.16ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 6.28ms
test/samples/rollup.min.js (367 KiB)
> 3.6ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 27.76ms
```

### Wasm Build Steps

To build, download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.

The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via `make lib/lexer.wasm`, which can also be triggered via `npm run build-wasm` to create `dist/lexer.js`.

On Windows it may be preferable to use the Linux subsystem.

After the WebAssembly build, the CJS build can be triggered via `npm run build`.

Optimization passes are run with [Binaryen](https://github.com/WebAssembly/binaryen) prior to publish to reduce the WebAssembly footprint.

### License

MIT

[travis-url]: https://travis-ci.org/guybedford/es-module-lexer
[travis-image]: https://travis-ci.org/guybedford/es-module-lexer.svg?branch=master
