# CJS Module Lexer

[![Build Status][travis-image]][travis-url]

A [very fast](#benchmarks) JS CommonJS module syntax lexer used to detect the most likely list of named exports of a CommonJS module.

Outputs the list of named exports (`exports.name = ...`) and possible module reexports (`module.exports = require('...')`), including the common transpiler variations of these cases.

Forked from https://github.com/guybedford/es-module-lexer.

_Comprehensively handles the JS language grammar while remaining small and fast: ~90ms per MB of JS cold and ~15ms per MB of JS warm; [see benchmarks](#benchmarks) for more info._

### Usage

```
npm install cjs-module-lexer
```

For use in CommonJS:

```js
const { parse } = require('cjs-module-lexer');

// `init` returns a promise for parity with the ESM API, but you do not have to call it

const { exports, reexports } = parse(`
  // named exports detection
  module.exports.a = 'a';
  (function () {
    exports.b = 'b';
  })();
  Object.defineProperty(exports, 'c', { value: 'c' });
  /* exports.d = 'not detected'; */

  // reexports detection
  if (maybe) module.exports = require('./dep1.js');
  if (another) module.exports = require('./dep2.js');

  // literal exports assignments
  module.exports = { a, b: c, d, 'e': f }

  // __esModule detection
  Object.defineProperty(module.exports, '__esModule', { value: true })
`);

// exports === ['a', 'b', 'c', '__esModule']
// reexports === ['./dep1.js', './dep2.js']
```

When using the ESM version, Wasm is supported instead:

```js
import { parse, init } from 'cjs-module-lexer';
// init needs to be called and waited upon
await init();
const { exports, reexports } = parse(source);
```

The Wasm build is around 1.5x faster and has no cold start.

### Grammar

CommonJS exports matches are run against the source token stream.

The token grammar is:

```
IDENTIFIER: As defined by ECMA-262, without support for identifier `\` escapes, filtered to remove strict reserved words:
            "implements", "interface", "let", "package", "private", "protected", "public", "static", "yield", "enum"

STRING_LITERAL: A `"` or `'` bounded ECMA-262 string literal.

IDENTIFIER_STRING: ( `"` IDENTIFIER `"` | `'` IDENTIFIER `'` )

MODULE_EXPORTS: `module` `.` `exports`

EXPORTS_IDENTIFIER: MODULE_EXPORTS | `exports`

EXPORTS_DOT_ASSIGN: EXPORTS_IDENTIFIER `.` IDENTIFIER `=`

EXPORTS_LITERAL_COMPUTED_ASSIGN: EXPORTS_IDENTIFIER `[` IDENTIFIER_STRING `]` `=`

EXPORTS_LITERAL_PROP: (IDENTIFIER (`:` IDENTIFIER)?) | (IDENTIFIER_STRING `:` IDENTIFIER)

EXPORTS_SPREAD: `...` (IDENTIFIER | REQUIRE)

EXPORTS_MEMBER: EXPORTS_DOT_ASSIGN | EXPORTS_LITERAL_COMPUTED_ASSIGN

EXPORTS_DEFINE: `Object` `.` `defineProperty` `(` EXPORTS_IDENTIFIER `,` IDENTIFIER_STRING

EXPORTS_DEFINE_VALUE: EXPORTS_DEFINE `, {`
  (`enumerable: true,`)?
    (
      `value:` |
      `get` (`: function` IDENTIFIER? )? `() {` `return` IDENTIFIER (`.` IDENTIFIER | `[` IDENTIFIER_STRING `]`)? `;`? `}`
    )
  `})`

EXPORTS_LITERAL: MODULE_EXPORTS `=` `{` ((EXPORTS_LITERAL_PROP | EXPORTS_SPREAD) `,`)+ `}`

REQUIRE: `require` `(` STRING_LITERAL `)`

EXPORTS_ASSIGN: (`var` | `const` | `let`) IDENTIFIER `=` REQUIRE

MODULE_EXPORTS_ASSIGN: MODULE_EXPORTS `=` REQUIRE

EXPORT_STAR: (`__export` | `__exportStar`) `(` REQUIRE

EXPORT_STAR_LIB: `Object.keys(` IDENTIFIER$1 `).forEach(function (` IDENTIFIER$2 `) {`
  (
    `if (` IDENTIFIER$2 `===` ( `'default'` | `"default"` ) `||` IDENTIFIER$2 `===` ( `'__esModule'` | `"__esModule"` ) `) return` `;`? |
    `if (` IDENTIFIER$2 `!==` ( `'default'` | `"default"` ) `)`
  )
  (
    `if (` IDENTIFIER$2 `in` EXPORTS_IDENTIFIER `&&` EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] ===` IDENTIFIER$1 `[` IDENTIFIER$2 `]) return` `;`?
  )?
  (
    EXPORTS_IDENTIFIER `[` IDENTIFIER$2 `] =` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? |
    `Object.defineProperty(` EXPORTS_IDENTIFIER `, ` IDENTIFIER$2 `, { enumerable: true, get: function () { return ` IDENTIFIER$1 `[` IDENTIFIER$2 `]` `;`? `} })` `;`?
  )
  `})`
```

Spacing between tokens is taken to be any ECMA-262 whitespace, ECMA-262 block comment or ECMA-262 line comment.

* The returned export names are taken to be the combination of:
  1. All `IDENTIFIER` and `IDENTIFIER_STRING` slots for `EXPORTS_MEMBER` and `EXPORTS_LITERAL` matches.
  2. The first `IDENTIFIER_STRING` slot for all `EXPORTS_DEFINE_VALUE` matches where that same string is not an `EXPORTS_DEFINE` match that is not also an `EXPORTS_DEFINE_VALUE` match.
* The reexport specifiers are taken to be the combination of:
  1. The `REQUIRE` matches of the last match of either `MODULE_EXPORTS_ASSIGN` or `EXPORTS_LITERAL`.
  2. All _top-level_ `EXPORT_STAR` `REQUIRE` matches and `EXPORTS_ASSIGN` matches whose `IDENTIFIER` also matches the first `IDENTIFIER` in `EXPORT_STAR_LIB`.

### Parsing Examples

#### Named Exports Parsing

The basic matching rules for named exports are `exports.name`, `exports['name']` or `Object.defineProperty(exports, 'name', ...)`.
This matching is done without scope analysis and regardless of the expression position:

```js
// DETECTS EXPORTS: a, b
(function (exports) {
  exports.a = 'a';
  exports['b'] = 'b';
})(exports);
```

Because there is no scope analysis, the above detection may overclassify:

```js
// DETECTS EXPORTS: a, b, c
(function (exports, Object) {
  exports.a = 'a';
  exports['b'] = 'b';
  if (false)
    exports.c = 'c';
})(NOT_EXPORTS, NOT_OBJECT);
```

It will in turn underclassify in cases where the identifiers are renamed:

```js
// DETECTS: NO EXPORTS
(function (e) {
  e.a = 'a';
  e['b'] = 'b';
})(exports);
```

#### Getter Exports Parsing

`Object.defineProperty` is detected specifically for value forms and for getter forms returning an identifier or member expression:

```js
// DETECTS: a, b, c, d, __esModule
Object.defineProperty(exports, 'a', {
  enumerable: true,
  get: function () {
    return q.p;
  }
});
Object.defineProperty(exports, 'b', {
  enumerable: true,
  get: function () {
    return q['p'];
  }
});
Object.defineProperty(exports, 'c', {
  enumerable: true,
  get () {
    return b;
  }
});
Object.defineProperty(exports, 'd', { value: 'd' });
Object.defineProperty(exports, '__esModule', { value: true });
```

To avoid matching getters that have side effects, any getter for an export name that does not match the forms above will opt that name out of getter matching:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  value: 'no problem'
});

if (false) {
  Object.defineProperty(module.exports, 'a', {
    get () {
      return dynamic();
    }
  })
}
```

Alternative object definition structures or getter function bodies are not detected:

```js
// DETECTS: NO EXPORTS
Object.defineProperty(exports, 'a', {
  enumerable: false,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'b', {
  configurable: true,
  get () {
    return p;
  }
});
Object.defineProperty(exports, 'c', {
  get: () => p
});
Object.defineProperty(exports, 'd', {
  enumerable: true,
  get: function () {
    return dynamic();
  }
});
Object.defineProperty(exports, 'e', {
  enumerable: true,
  get () {
    return 'str';
  }
});
```

`Object.defineProperties` is also not supported.

#### Exports Object Assignment

A best effort is made to detect `module.exports` object assignments, but because this is not a full parser, arbitrary expressions are not handled in the object parsing process.

Simple object definitions are supported:

```js
// DETECTS EXPORTS: a, b, c
module.exports = {
  a,
  'b': b,
  c: c,
  ...d
};
```

Object properties that are not identifiers or string expressions will bail out of the object detection, while spreads are ignored:

```js
// DETECTS EXPORTS: a, b
module.exports = {
  a,
  ...d,
  b: require('c'),
  c: "not detected since require('c') above bails the object detection"
}
```

`Object.defineProperties` is not currently supported either.

#### module.exports reexport assignment

Any `module.exports = require('mod')` assignment is detected as a reexport, but only the last one is returned:

```js
// DETECTS REEXPORTS: c
module.exports = require('a');
(module => module.exports = require('b'))(NOT_MODULE);
if (false) module.exports = require('c');
```

This is to avoid over-classification in Webpack bundles with externals, which include `module.exports = require('external')` in their source for every external dependency.
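
The "last assignment wins" rule can be illustrated with a deliberately simplified sketch. This is not the lexer's implementation: the hypothetical `lastReexport` helper below scans with a regular expression instead of a token stream, so unlike the real lexer it would also match inside comments and strings. It only shows why a source containing several `module.exports = require(...)` assignments yields a single reexport.

```js
// Toy approximation only: a regex scan, not the token-based lexer.
// `lastReexport` is a hypothetical helper name used for illustration.
function lastReexport(source) {
  const pattern = /module\s*\.\s*exports\s*=\s*require\s*\(\s*(['"])([^'"\\\n]+)\1\s*\)/g;
  let last = null;
  let match;
  while ((match = pattern.exec(source)) !== null) {
    last = match[2]; // each later assignment overwrites the earlier one
  }
  return last === null ? [] : [last];
}

const bundle = `
module.exports = require('a');
if (false) module.exports = require('c');
`;
console.log(lastReexport(bundle)); // prints [ 'c' ]
```

Keeping only the final match means a bundle that repeats this assignment once per external dependency reports one reexport rather than one per dependency.
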

In exports object assignment, any spreads of `require()` are detected as multiple separate reexports:

```js
// DETECTS REEXPORTS: a, b
module.exports = require('ignored');
module.exports = {
  ...require('a'),
  ...require('b')
};
```

#### Transpiler Re-exports

For named exports, transpiler output works well with the rules described above.

But for star re-exports, special care is taken to support common patterns of transpiler outputs from Babel and TypeScript as well as bundlers like RollupJS. These reexport and star reexport patterns are restricted to only be detected at the top-level as provided by the direct output of these tools.

For example, `export * from 'external'` is output by Babel as:

```js
"use strict";

exports.__esModule = true;

var _external = require("external");

Object.keys(_external).forEach(function (key) {
  if (key === "default" || key === "__esModule") return;
  exports[key] = _external[key];
});
```

Where the `var _external = require("external")` is specifically detected as well as the `Object.keys(_external)` statement, down to the exact form of that entire expression including minor variations of the output. The `_external` and `key` identifiers are carefully matched in this detection.

Similarly for TypeScript, `export * from 'external'` is output as:

```js
"use strict";
function __export(m) {
    for (var p in m) if (!exports.hasOwnProperty(p)) exports[p] = m[p];
}
Object.defineProperty(exports, "__esModule", { value: true });
__export(require("external"));
```

Where the `__export(require("external"))` statement is explicitly detected as a reexport, including variations `tslib.__export` and `__exportStar`.

### Environment Support

Node.js 10+, and [all browsers with WebAssembly support](https://caniuse.com/#feat=wasm).

### JS Grammar Support

* Token state parses all line comments, block comments, strings, template strings, blocks, parens and punctuators.
* Division operator / regex token ambiguity is handled via backtracking checks against punctuator prefixes, including closing brace or paren backtracking.
* Always correctly parses valid JS source, but may parse invalid JS source without errors.

### Benchmarks

Benchmarks can be run with `npm run bench`.

Current results:

JS Build:

```
Module load time
> 4ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 299ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 13.96ms
test/samples/angular.min.js (303 KiB)
> 4.72ms
test/samples/d3.js (553 KiB)
> 6.76ms
test/samples/d3.min.js (250 KiB)
> 4ms
test/samples/magic-string.js (34 KiB)
> 0.64ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 8.48ms
test/samples/rollup.min.js (367 KiB)
> 5.36ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 40.28ms
```

Wasm Build:

```
Module load time
> 10ms
Cold Run, All Samples
test/samples/*.js (3635 KiB)
> 43ms

Warm Runs (average of 25 runs)
test/samples/angular.js (1410 KiB)
> 9.32ms
test/samples/angular.min.js (303 KiB)
> 3.16ms
test/samples/d3.js (553 KiB)
> 5ms
test/samples/d3.min.js (250 KiB)
> 2.32ms
test/samples/magic-string.js (34 KiB)
> 0.16ms
test/samples/magic-string.min.js (20 KiB)
> 0ms
test/samples/rollup.js (698 KiB)
> 6.28ms
test/samples/rollup.min.js (367 KiB)
> 3.6ms

Warm Runs, All Samples (average of 25 runs)
test/samples/*.js (3635 KiB)
> 27.76ms
```

### Wasm Build Steps

To build, download the WASI SDK from https://github.com/WebAssembly/wasi-sdk/releases.

The Makefile assumes the existence of "wasi-sdk-11.0" and "wabt" (optional) as sibling folders to this project.

The build through the Makefile is then run via `make lib/lexer.wasm`, which can also be triggered via `npm run build-wasm` to create `dist/lexer.js`.

On Windows it may be preferable to use the Linux subsystem.

After the WebAssembly build, the CJS build can be triggered via `npm run build`.

Optimization passes are run with [Binaryen](https://github.com/WebAssembly/binaryen) prior to publish to reduce the WebAssembly footprint.

### License

MIT

[travis-url]: https://travis-ci.org/guybedford/es-module-lexer
[travis-image]: https://travis-ci.org/guybedford/es-module-lexer.svg?branch=master