# CJS Module Lexer

[![Build Status][travis-image]][travis-url]

A [very fast](#benchmarks) JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.

Outputs the list of named exports (`exports.name = ...`) and possible module reexports (`module.exports = require('...')`), including the common transpiler variations of these cases.

Forked from https://github.com/guybedford/es-module-lexer.

_Comprehensively handles the JS language grammar while remaining small and fast: ~90ms per MB of JS cold and ~15ms per MB of JS warm, [see benchmarks](#benchmarks) for more info._

### Usage

```
npm install cjs-module-lexer
```

For use in CommonJS:

```js
const { parse } = require('cjs-module-lexer');

// `init` returns a promise for parity with the ESM API, but you do not have to call it

const { exports, reexports } = parse(`
  // named exports detection
  module.exports.a = 'a';
  (function () {
    exports.b = 'b';
  })();
  Object.defineProperty(exports, 'c', { value: 'c' });
  /* exports.d = 'not detected'; */

  // reexports detection
  if (maybe) module.exports = require('./dep1.js');
  if (another) module.exports = require('./dep2.js');

  // literal exports assignments
  module.exports = { a, b: c, d, 'e': f }

  // __esModule detection
  Object.defineProperty(module.exports, '__esModule', { value: true })
`);

// exports === ['a', 'b', 'c', '__esModule']
// reexports === ['./dep1.js', './dep2.js']
```

When using the ESM version, Wasm is supported instead:

```js
import { parse, init } from 'cjs-module-lexer';
// init must be called and awaited before parsing
await init();
const { exports, reexports } = parse(source);
```

The Wasm build is around 1.5x faster and without a cold start.

### Grammar

CommonJS exports matches are run against the source token stream.


The token grammar is:

```
IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
            "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

MODULE_EXPORTS: `module` `.` `exports`

EXPORTS_IDENTIFIER: MODULE_EXPORTS | `exports`

EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` STRING_LITERAL `]` `=`

EXPORTS_LITERAL_PROP: (IDENTIFIER (`:` IDENTIFIER)?) | (STRING_LITERAL `:` IDENTIFIER)

EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

EXPORTS_DEFINE: `Object` `.` `defineProperty` `(` EXPORTS_IDENTIFIER `,` STRING_LITERAL

EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
  (`enumerable: true,`)?
    (
      `value:` |
      `get` (`: function` IDENTIFIER? )? `() { return` IDENTIFIER (`.` IDENTIFIER | `[` STRING_LITERAL `]`)? `;`? `}` `,`?
    )
  `})`

EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` ((EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

REQUIRE: `require` `(` STRING_LITERAL `)`

EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` (`_interopRequireWildcard (`)? REQUIRE

MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
  (
    (
      `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( `'__esModule'` | `"__esModule"` ) `) return` `;`?
      (
        (`if (Object` `.prototype`? `.hasOwnProperty.call(` IDENTIFIER `, ` IDENTIFIER$2 `)) return` `;`?)?
        (`if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`)?
      )?
    ) |
    `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) (`&& !` (`Object` `.prototype`? `.hasOwnProperty.call(` IDENTIFIER `, ` IDENTIFIER$2 `)` | IDENTIFIER `.hasOwnProperty(` IDENTIFIER$2 `)`))? `)`
  )
  (
    EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
    `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get` (`: function` IDENTIFIER? )? `() { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `}` `,`? `})` `;`?
  )
  `})`
```

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.

* The returned export names are taken to be the combination of:
  1. All `IDENTIFIER` and `STRING_LITERAL` slots for `EXPORTS_MEMBER` and `EXPORTS_LITERAL` matches.
  2. The first `STRING_LITERAL` slot for all `EXPORTS_DEFINE_VALUE` matches where that same string is not an `EXPORTS_DEFINE` match that is not also an `EXPORTS_DEFINE_VALUE` match.
* The reexport specifiers are taken to be the combination of:
  1. The `REQUIRE` matches of the last match of either `MODULE_EXPORTS_ASSIGN` or `EXPORTS_LITERAL`.
  2. All _top-level_ `EXPORT_STAR` `REQUIRE` matches and `EXPORTS_ASSIGN` matches whose `IDENTIFIER` also matches the first `IDENTIFIER` in `EXPORT_STAR_LIB`.

### Parsing Examples

#### Named Exports Parsing

The basic matching rules for named exports are `exports.name`, `exports['name']` or `Object.defineProperty(exports, 'name', ...)`.
This matching is done without scope analysis and regardless of the expression position:

```js
// DETECTS EXPORTS: a, b
(function (exports) {
  exports.a = 'a';
  exports['b'] = 'b';
})(exports);
```

Because there is no scope analysis, the above detection may overclassify:

```js
// DETECTS EXPORTS: a, b, c
(function (exports, Object) {
  exports.a = 'a';
  exports['b'] = 'b';
  if (false)
    exports.c = 'c';
})(NOT_EXPORTS, NOT_OBJECT);
```

It will in turn underclassify in cases where the identifiers are renamed:

```js
// DETECTS: NO EXPORTS
(function (e) {
  e.a = 'a';
  e['b'] = 'b';
})(exports);
```

#### Getter Exports Parsing

`Object.defineProperty` is detected specifically for value and getter forms returning an identifier or member expression:

```js
// DETECTS: a, b, c, d, __esModule
Object.defineProperty(exports, 'a', {
  enumerable: true,
  get: function () {
    return q.p;
  }
});
Object.defineProperty(exports, 'b', {
  enumerable: true,
  get: function () {
    return q['p'];
  }
});
Object.defineProperty(exports, 'c', {
  enumerable: true,
  get () {
    return b;
  }
});
Object.defineProperty(exports, 'd', { value: 'd' });
Object.defineProperty(exports, '__esModule', { value: true });
```

Value properties are also detected specifically:

```js
Object.defineProperty(exports, 'a', {
  value: 'no problem'
});
```

To avoid matching getters that have side effects, any getter for an export name that does not match the forms above will opt that name out of getter matching:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  get () {
    return 'nope';
  }
});

if (false) {
  Object.defineProperty(module.exports, 'a', {
    get () {
      return dynamic();
    }
  })
}
```

Alternative object definition structures or getter function bodies are not detected:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  enumerable: false,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'b', {
  configurable: true,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'c', {
  get: () => p
});
Object.defineProperty(exports, 'd', {
  enumerable: true,
  get: function () {
    return dynamic();
  }
});
Object.defineProperty(exports, 'e', {
  enumerable: true,
  get () {
    return 'str';
  }
});
```

`Object.defineProperties` is also not supported.

#### Exports Object Assignment

A best effort is made to detect `module.exports` object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.

Simple object definitions are supported:

```js
// DETECTS EXPORTS: a, b, c
module.exports = {
  a,
  'b': b,
  c: c,
  ...d
};
```

Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:

```js
// DETECTS EXPORTS: a, b
module.exports = {
  a,
  ...d,
  b: require('c'),
  c: "not detected since require('c') above bails the object detection"
}
```

`Object.defineProperties` is not currently supported either.

#### module.exports reexport assignment

Any `module.exports = require('mod')` assignment is detected as a reexport, but only the last one is returned:

```js
// DETECTS REEXPORTS: c
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');
```

This is to avoid over-classification in Webpack bundles with externals which include `module.exports = require('external')` in their source for every external dependency.

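
The "last one wins" rule can be illustrated with a toy sketch. This is not how the lexer is implemented: `lastReexport` is a hypothetical helper that uses a plain regular expression as a stand-in for the `MODULE_EXPORTS_ASSIGN` match, so unlike the real token-based lexer it would also match inside comments and strings.

```js
// Toy illustration only: a regex stand-in for the MODULE_EXPORTS_ASSIGN match.
// `lastReexport` is a hypothetical helper, not part of cjs-module-lexer.
function lastReexport(source) {
  const pattern = /module\s*\.\s*exports\s*=\s*require\s*\(\s*(['"])([^'"]+)\1\s*\)/g;
  let last;
  for (const match of source.matchAll(pattern)) {
    last = match[2]; // each later match overwrites the earlier one
  }
  // Mirror the shape of `reexports`: an array holding at most one specifier
  return last === undefined ? [] : [last];
}

const source = `
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');
`;

console.log(lastReexport(source)); // [ 'c' ]
```

Keeping only the final match means a bundle that contains one such assignment per external dependency reports a single reexport rather than one per external.
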

In exports object assignment, any spreads of `require()` are detected as multiple separate reexports:

```js
// DETECTS REEXPORTS: a, b
module.exports = require('ignored');
module.exports = {
  ...require('a'),
  ...require('b')
};
```

#### Transpiler Re-exports

For named exports, transpiler output works well with the rules described above.

But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS.
These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.

For example, `export * from 'external'` is output by Babel as:

```js
"use strict";

exports.__esModule = true;

var _external = require("external");

Object.keys(_external).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  exports[key] = _external[key];
});
```

Here the `var _external = require("external")` statement is specifically detected, as is the `Object.keys(_external)` statement, down to the exact form of that entire expression including minor variations of the output. The `_external` and `key` identifiers are carefully matched in this detection.

Similarly for TypeScript, `export * from 'external'` is output as:

```js
"use strict";
function __export(m) {
  for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));
```

Here the `__export(require("external"))` statement is explicitly detected as a reexport, including the variations `tslib.__export` and `__exportStar`.

### Environment Support

Node.js 10+, and [all browsers with Web Assembly support](https://caniuse.com/#feat=wasm).


### JS Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results:

JS Build:

```
Module load time
> 4ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 299ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 13.96ms
test/samples/angular.min.js (303 KiB)
> 4.72ms
test/samples/d3.js (553 KiB)
> 6.76ms
test/samples/d3.min.js (250 KiB)
> 4ms
test/samples/magic-string.js (34 KiB)
> 0.64ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 8.48ms
test/samples/rollup.min.js (367 KiB)
> 5.36ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 40.28ms
```

Wasm Build:

```
Module load time
> 10ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 43ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 9.32ms
test/samples/angular.min.js (303 KiB)
> 3.16ms
test/samples/d3.js (553 KiB)
> 5ms
test/samples/d3.min.js (250 KiB)
> 2.32ms
test/samples/magic-string.js (34 KiB)
> 0.16ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 6.28ms
test/samples/rollup.min.js (367 KiB)
> 3.6ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 27.76ms
```

### Wasm Build Steps

To build, download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.


The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via `make lib/lexer.wasm`, which can also be triggered via `npm run build-wasm` to create `dist/lexer.js`.

On Windows it may be preferable to use the Linux subsystem.

After the Web Assembly build, the CJS build can be triggered via `npm run build`.

Optimization passes are run with [Binaryen](https://github.com/WebAssembly/binaryen) prior to publish to reduce the Web Assembly footprint.

### License

MIT

[travis-url]: https://travis-ci.org/guybedford/es-module-lexer
[travis-image]: https://travis-ci.org/guybedford/es-module-lexer.svg?branch=master