# CJS Module Lexer

[![Build Status][travis-image]][travis-url]

A [very fast](#benchmarks) JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.

Outputs the list of named exports (`exports.name = ...`) and possible module reexports (`module.exports = require('...')`), including the common transpiler variations of these cases.

Forked from https://github.com/guybedford/es-module-lexer.

_Comprehensively handles the JS language grammar while remaining small and fast: ~90ms per MB of JS cold and ~15ms per MB of JS warm. [See benchmarks](#benchmarks) for more info._

### Usage

```
npm install cjs-module-lexer
```

For use in CommonJS:

```js
const { parse } = require('cjs-module-lexer');

// `init` returns a promise for parity with the ESM API, but you do not have to call it

const { exports, reexports } = parse(`
  // named exports detection
  module.exports.a = 'a';
  (function () {
    exports.b = 'b';
  })();
  Object.defineProperty(exports, 'c', { value: 'c' });
  /* exports.d = 'not detected'; */

  // reexports detection
  if (maybe) module.exports = require('./dep1.js');
  if (another) module.exports = require('./dep2.js');

  // literal exports assignments
  module.exports = { a, b: c, d, 'e': f }

  // __esModule detection
  Object.defineProperty(module.exports, '__esModule', { value: true })
`);

// exports === ['a', 'b', 'c', '__esModule']
// reexports === ['./dep1.js', './dep2.js']
```

When using the ESM version, Wasm is supported instead:

```js
import { parse, init } from 'cjs-module-lexer';
// init must be called and awaited before the first parse
await init();
const { exports, reexports } = parse(source);
```

The Wasm build is around 1.5x faster and has no cold start.

### Grammar

CommonJS exports matches are run against the source token stream.

The token grammar is:

```
IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
            "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

IDENTIFIER_STRING: ( `"` IDENTIFIER `"` | `'` IDENTIFIER `'` )

MODULE_EXPORTS: `module` `.` `exports`

EXPORTS_IDENTIFIER: MODULE_EXPORTS | `exports`

EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` IDENTIFIER_STRING `]` `=`

EXPORTS_LITERAL_PROP: (IDENTIFIER (`:` IDENTIFIER)?) | (IDENTIFIER_STRING `:` IDENTIFIER)

EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

EXPORTS_DEFINE: `Object` `.` `defineProperty` `(` EXPORTS_IDENTIFIER `,` IDENTIFIER_STRING

EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
  (`enumerable: true,`)?
  (
    `value:` |
    `get` (`: function` IDENTIFIER? )?  `()` `{` `return` IDENTIFIER (`.` IDENTIFIER | `[` IDENTIFIER_STRING `]`)? `;`? `}`
  )
  `})`

EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` ((EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

REQUIRE: `require` `(` STRING_LITERAL `)`

EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` REQUIRE

MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
  (
    `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( `'__esModule'` | `"__esModule"` ) `) return` `;`? |
    `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) `)`
  )
  (
    `if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`?
  )?
  (
    EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
    `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get: function () { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `} })` `;`?
  )
  `})`
```

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.
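
As a rough illustration of how the spacing rule composes with a rule such as `EXPORTS_DOT_ASSIGN`, the sketch below approximates that single rule with a regular expression. It is not how the lexer is implemented: the names `SPACING`, `IDENTIFIER`, `EXPORTS_IDENTIFIER` and `findDotAssignExports` are hypothetical, identifiers are simplified to ASCII, the reserved word filter is omitted, and a regular expression will also match inside strings and comments, which a token stream would not.

```js
// Hypothetical regex approximation of EXPORTS_DOT_ASSIGN (not the lexer's actual code).
// Spacing: any run of whitespace, block comments or line comments.
const SPACING = String.raw`(?:\s|/\*[\s\S]*?\*/|//[^\n]*)*`;
// Assumption: ASCII-only identifiers, no reserved word filtering.
const IDENTIFIER = String.raw`[A-Za-z_$][A-Za-z0-9_$]*`;
// `module.exports` or `exports`, not preceded by an identifier character or a dot.
const EXPORTS_IDENTIFIER =
  String.raw`(?<![\w$.])(?:module${SPACING}\.${SPACING}exports|exports)`;

function findDotAssignExports(source) {
  // `=(?!=)` accepts an assignment but rejects `==` and `===` comparisons.
  const pattern = new RegExp(
    `${EXPORTS_IDENTIFIER}${SPACING}\\.${SPACING}(${IDENTIFIER})${SPACING}=(?!=)`,
    'g'
  );
  const names = new Set();
  for (const match of source.matchAll(pattern)) names.add(match[1]);
  return [...names];
}

const sample = `
  module.exports.a = 1;
  exports /* spacing may be a comment */ . b = 2;
  e.c = 3;
  if (exports.d === 4) {}
`;
console.log(findDotAssignExports(sample)); // [ 'a', 'b' ]
```

The renamed `e.c` assignment and the `exports.d === 4` comparison are both skipped, which mirrors the shape of the rule rather than any scope analysis.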

* The returned export names are taken to be the combination of:
  1. All `IDENTIFIER` and `IDENTIFIER_STRING` slots for `EXPORTS_MEMBER` and `EXPORTS_LITERAL` matches.
  2. The first `IDENTIFIER_STRING` slot for all `EXPORTS_DEFINE_VALUE` matches, provided that the same string does not also appear in an `EXPORTS_DEFINE` match that is not an `EXPORTS_DEFINE_VALUE` match.
* The reexport specifiers are taken to be the combination of:
  1. The `REQUIRE` matches of the last match of either `MODULE_EXPORTS_ASSIGN` or `EXPORTS_LITERAL`.
  2. All _top-level_ `EXPORT_STAR` `REQUIRE` matches and `EXPORTS_ASSIGN` matches whose `IDENTIFIER` also matches the first `IDENTIFIER` in `EXPORT_STAR_LIB`.
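
The export name rules can be read as simple set logic. The following is a minimal sketch that assumes the matches have already been collected into plain records; `combineExportNames`, `memberNames` and the `{ name, isDefineValue }` record shape are hypothetical and do not reflect the lexer's internal data structures.

```js
// Hypothetical helper following the two export name rules as worded in this section.
// memberNames:   names from EXPORTS_MEMBER and EXPORTS_LITERAL matches (rule 1)
// defineMatches: one record per EXPORTS_DEFINE match; isDefineValue is true when
//                the match is also an EXPORTS_DEFINE_VALUE match (rule 2)
function combineExportNames(memberNames, defineMatches) {
  // Names with at least one define that is outside the supported value/getter forms.
  const optedOut = new Set(
    defineMatches.filter(m => !m.isDefineValue).map(m => m.name)
  );
  const names = new Set(memberNames);
  for (const m of defineMatches) {
    if (m.isDefineValue && !optedOut.has(m.name)) names.add(m.name);
  }
  return [...names];
}

console.log(combineExportNames(['x'], [
  { name: 'a', isDefineValue: true },  // e.g. { value: ... }
  { name: 'a', isDefineValue: false }, // e.g. a getter calling a function
  { name: 'b', isDefineValue: true }
])); // [ 'x', 'b' ]
```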

### Parsing Examples

#### Named Exports Parsing

The basic matching rules for named exports are `exports.name`, `exports['name']` or `Object.defineProperty(exports, 'name', ...)`. This matching is done without scope analysis and regardless of the expression position:

```js
// DETECTS EXPORTS: a, b
(function (exports) {
  exports.a = 'a';
  exports['b'] = 'b';
})(exports);
```

Because there is no scope analysis, the above detection may overclassify:

```js
// DETECTS EXPORTS: a, b, c
(function (exports, Object) {
  exports.a = 'a';
  exports['b'] = 'b';
  if (false)
    exports.c = 'c';
})(NOT_EXPORTS, NOT_OBJECT);
```

It will in turn underclassify in cases where the identifiers are renamed:

```js
// DETECTS: NO EXPORTS
(function (e) {
  e.a = 'a';
  e['b'] = 'b';
})(exports);
```

#### Getter Exports Parsing

`Object.defineProperty` is detected specifically for value forms and for getter forms that return an identifier or member expression:

```js
// DETECTS: a, b, c, d, __esModule
Object.defineProperty(exports, 'a', {
  enumerable: true,
  get: function () {
    return q.p;
  }
});
Object.defineProperty(exports, 'b', {
  enumerable: true,
  get: function () {
    return q['p'];
  }
});
Object.defineProperty(exports, 'c', {
  enumerable: true,
  get () {
    return b;
  }
});
Object.defineProperty(exports, 'd', { value: 'd' });
Object.defineProperty(exports, '__esModule', { value: true });
```

To avoid matching getters that have side effects, any getter for an export name that does not match the forms above will opt that name out of the getter matching:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  value: 'no problem'
});

if (false) {
  Object.defineProperty(module.exports, 'a', {
    get () {
      return dynamic();
    }
  })
}
```

Alternative object definition structures or getter function bodies are not detected:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  enumerable: false,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'b', {
  configurable: true,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'c', {
  get: () => p
});
Object.defineProperty(exports, 'd', {
  enumerable: true,
  get: function () {
    return dynamic();
  }
});
Object.defineProperty(exports, 'e', {
  enumerable: true,
  get () {
    return 'str';
  }
});
```

`Object.defineProperties` is also not supported.

#### Exports Object Assignment

A best effort is made to detect `module.exports` object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.

Simple object definitions are supported:

```js
// DETECTS EXPORTS: a, b, c
module.exports = {
  a,
  'b': b,
  c: c,
  ...d
};
```

Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:

```js
// DETECTS EXPORTS: a, b
module.exports = {
  a,
  ...d,
  b: require('c'),
  c: "not detected since require('c') above bails the object detection"
}
```

`Object.defineProperties` is not currently supported either.

#### module.exports reexport assignment

Any `module.exports = require('mod')` assignment is detected as a reexport, but only the last one is returned:

```js
// DETECTS REEXPORTS: c
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');
```

This is to avoid over-classification in Webpack bundles with externals which include `module.exports = require('external')` in their source for every external dependency.

In exports object assignment, any spreads of `require()` are detected as multiple separate reexports:

```js
// DETECTS REEXPORTS: a, b
module.exports = require('ignored');
module.exports = {
  ...require('a'),
  ...require('b')
};
```

#### Transpiler Re-exports

For named exports, transpiler output works well with the rules described above.

But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS.
These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.

For example, `export * from 'external'` is output by Babel as:

```js
"use strict";

exports.__esModule = true;

var _external = require("external");

Object.keys(_external).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  exports[key] = _external[key];
});
```

Here the `var _external = require("external")` statement is specifically detected, as is the `Object.keys(_external)` statement, down to the exact form of that entire expression, including minor variations of the output. The `_external` and `key` identifiers are carefully matched in this detection.

Similarly for TypeScript, `export * from 'external'` is output as:

```js
"use strict";
function __export(m) {
    for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));
```

Here the `__export(require("external"))` statement is explicitly detected as a reexport, including the variations `tslib.__export` and `__exportStar`.
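
Because `reexports` contains unresolved specifiers rather than names, a consumer that wants the complete list of names has to follow them itself. The following is a minimal sketch, assuming a hypothetical `collectExportNames` helper that receives a `parseFn` behaving like this library's `parse` and a hypothetical `loadSource` callback returning the source text for a specifier. Real module resolution relative to the importing file is deliberately left out.

```js
// Hypothetical helper: recursively merges the names found through reexports.
function collectExportNames(specifier, parseFn, loadSource, seen = new Set()) {
  if (seen.has(specifier)) return new Set(); // guard against reexport cycles
  seen.add(specifier);
  const { exports, reexports } = parseFn(loadSource(specifier));
  const names = new Set(exports);
  for (const dep of reexports) {
    for (const name of collectExportNames(dep, parseFn, loadSource, seen)) {
      names.add(name);
    }
  }
  return names;
}

// Stand-in data so the sketch runs without the file system: each "source" is
// just its specifier, and the stub parser returns a canned result for it.
const cannedResults = {
  './main.js': { exports: ['a'], reexports: ['./dep.js'] },
  './dep.js': { exports: ['b'], reexports: ['./main.js'] } // cycle back to main
};
const stubParse = source => cannedResults[source];
const names = collectExportNames('./main.js', stubParse, specifier => specifier);
console.log([...names]); // [ 'a', 'b' ]
```

In real use, `parseFn` would be the `parse` function shown in the usage section and `loadSource` would resolve and read the dependency.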

### Environment Support

Node.js 10+, and [all browsers with WebAssembly support](https://caniuse.com/#feat=wasm).

### JS Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.
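
To make the division / regex ambiguity concrete, the sketch below shows a much simplified look-behind heuristic. It is an assumption-laden illustration, not the lexer's algorithm: `slashStartsRegex` and `REGEX_PRECEDING_KEYWORDS` are hypothetical names, comments before the slash are not skipped, postfix `++` / `--` are not handled, and the closing brace and paren cases that the lexer resolves by backtracking are simply treated as division here.

```js
// Keywords after which an expression, and therefore a regex literal, may begin.
const REGEX_PRECEDING_KEYWORDS = new Set([
  'return', 'typeof', 'instanceof', 'in', 'of', 'new', 'delete',
  'void', 'throw', 'case', 'do', 'else', 'yield', 'await'
]);

// Hypothetical heuristic: does the `/` at slashIndex start a regex literal?
function slashStartsRegex(source, slashIndex) {
  let i = slashIndex - 1;
  while (i >= 0 && /\s/.test(source[i])) i--; // skip whitespace backwards
  if (i < 0) return true; // start of input: nothing to divide
  const prev = source[i];
  // Simplification: a closing bracket always ends an operand here.
  if (prev === ')' || prev === ']' || prev === '}') return false;
  if (/[\w$]/.test(prev)) {
    // Read the whole preceding word: keywords allow a regex, operands do not.
    let start = i;
    while (start > 0 && /[\w$]/.test(source[start - 1])) start--;
    return REGEX_PRECEDING_KEYWORDS.has(source.slice(start, i + 1));
  }
  return true; // any other punctuator, e.g. `=`, `(`, `,`
}

for (const code of ['x = /ab/g', 'a / b', 'return /x/', '(a + b) / 2']) {
  console.log(code, '->', slashStartsRegex(code, code.indexOf('/')));
}
```

A case such as `if (x) /re/.test(y)` is misclassified by this sketch, which is the kind of input that needs the paren backtracking mentioned above.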

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results:

JS Build:

```
Module load time
> 4ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 299ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 13.96ms
test/samples/angular.min.js (303 KiB)
> 4.72ms
test/samples/d3.js (553 KiB)
> 6.76ms
test/samples/d3.min.js (250 KiB)
> 4ms
test/samples/magic-string.js (34 KiB)
> 0.64ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 8.48ms
test/samples/rollup.min.js (367 KiB)
> 5.36ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 40.28ms
```

Wasm Build:

```
Module load time
> 10ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 43ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 9.32ms
test/samples/angular.min.js (303 KiB)
> 3.16ms
test/samples/d3.js (553 KiB)
> 5ms
test/samples/d3.min.js (250 KiB)
> 2.32ms
test/samples/magic-string.js (34 KiB)
> 0.16ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 6.28ms
test/samples/rollup.min.js (367 KiB)
> 3.6ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 27.76ms
```
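
The headline figures can be cross-checked against the "All Samples" totals. The sketch below only redoes that arithmetic: the numbers are copied from the two result listings, and the variable names are hypothetical.

```js
// Totals copied from the benchmark listings (all samples, 3635 KiB).
const totalMiB = 3635 / 1024;
const js = { coldMs: 299, warmMs: 40.28 };
const wasm = { coldMs: 43, warmMs: 27.76 };

// Time per MiB of source for a given total time.
const perMiB = ms => ms / totalMiB;

console.log(`JS cold:   ${perMiB(js.coldMs).toFixed(1)} ms/MiB`);
console.log(`JS warm:   ${perMiB(js.warmMs).toFixed(1)} ms/MiB`);
console.log(`Wasm cold: ${perMiB(wasm.coldMs).toFixed(1)} ms/MiB`);
console.log(`Wasm warm: ${perMiB(wasm.warmMs).toFixed(1)} ms/MiB`);

// Ratio of warm run times, JS build over Wasm build.
const warmSpeedup = js.warmMs / wasm.warmMs;
console.log(`Wasm warm speedup over JS: ${warmSpeedup.toFixed(2)}x`);
```

The warm ratio comes out at roughly 1.45, in line with the "around 1.5x faster" figure quoted in the usage section.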

### Wasm Build Steps

To build, download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.

The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via `make lib/lexer.wasm`, which can also be triggered via `npm run build-wasm` to create `dist/lexer.js`.

On Windows it may be preferable to use the Linux subsystem.

After the WebAssembly build, the CJS build can be triggered via `npm run build`.

Optimization passes are run with [Binaryen](https://github.com/WebAssembly/binaryen) prior to publish to reduce the WebAssembly footprint.

### License

MIT

[travis-url]: https://travis-ci.org/guybedford/es-module-lexer
[travis-image]: https://travis-ci.org/guybedford/es-module-lexer.svg?branch=master