# Node.js API Documentation Tooling

The Node.js API documentation is generated by in-house tooling that resides
within the [tools/doc](https://github.com/nodejs/node/tree/main/tools/doc)
directory.

The build process (using `make doc`) uses this tooling to parse the Markdown
files in [doc/api](https://github.com/nodejs/node/tree/main/doc/api) and
generate the following:

1. Human-readable HTML in `out/doc/api/*.html`
2. A JSON representation in `out/doc/api/*.json`

These artifacts are published to nodejs.org for multiple versions of
Node.js. For example, the latest version of the human-readable HTML
is published to [nodejs.org/en/docs](https://nodejs.org/en/docs/),
and the latest version of the JSON documentation is published to
[nodejs.org/api/all.json](https://nodejs.org/api/all.json).

The artifacts are built as part of release builds by running the
[doc-upload](https://github.com/nodejs/node/blob/1a83ad6a693f851199608ae957ac5d4f76871485/Makefile#L1218-L1224)
Makefile target as part of the release-sources part of the
iojs+release job.
This target runs the `doc` target to build the docs and then uses
`scp` to copy them onto the staging/www server into a directory of the form
`/home/staging/nodejs/<type>/<full_version>/docs`, where `<type>` is e.g.
release, nightly, etc. The promotion step (either automatic for
nightlies or manual for releases) then moves the docs to
`/home/dist/nodejs/docs/<full_version>`, where they are served by nodejs.org.

**The key things to know about the tooling include:**

1. The entry-point is `tools/doc/generate.mjs`.
2. The tooling supports the CLI arguments listed in the table below.
3. The tooling processes one file at a time.
4. The tooling uses a set of dependencies as described in the dependencies
   section.
5. The tooling parses the input files and applies several transformations to
   the AST (Abstract Syntax Tree).
6. The tooling generates a JSON output that contains the metadata and content of
   the Markdown file.
7. The tooling generates an HTML output that contains a human-readable,
   ready-to-view version of the file.

This documentation explains the existing tooling processes, to allow easier
maintenance and evolution of the tooling. It is not meant to be a guide on how
to write documentation for Node.js.

#### Vocabulary & Good to Know

* AST means "Abstract Syntax Tree" and it is a data structure that represents
  the structure of a certain data format. In our case, the AST is a "graph"
  representation of the contents of the Markdown file.
* MDN means [Mozilla Developer Network](https://developer.mozilla.org/en-US/)
  and it is a website that contains documentation for web technologies. We use
  it as a reference for the structure of the documentation.
* The
  [Stability Index](https://nodejs.org/dist/latest/docs/api/documentation.html#stability-index)
  is used to communicate the stability of a given Node.js module. The stability
  levels include:
  * Stability 0: Deprecated. (This module is Deprecated)
  * Stability 1: Experimental. (This module is Experimental)
  * Stability 2: Stable. (This module is Stable)
  * Stability 3: Legacy. (This module is Legacy)
* Within Remark, YAML snippets (`<!-- something -->`) are considered HTML
  nodes. That is because YAML is not valid Markdown content (it does not abide
  by the Markdown spec).
* "New Tooling" refers to the (written from scratch) API build tooling
  introduced in `nodejs/nodejs.dev` that might replace the current one from
  `nodejs/node`.

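To make the AST and HTML-node points above concrete, the sketch below hand-writes a small tree in the shape that Remark's Markdown AST (mdast) uses for a heading followed by a YAML comment. It is an illustration rather than output captured from the tooling: positional data is omitted, and the `countTypes` helper is a hypothetical name introduced only for this example.

```javascript
// Hand-written tree in the mdast shape (position data omitted for brevity).
// It corresponds to a Markdown file with one heading and one YAML comment.
const tree = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [{ type: 'text', value: 'File system' }],
    },
    // The YAML snippet is an HTML comment, so it shows up as an `html` node.
    { type: 'html', value: '<!-- YAML\nadded: v0.1.90\n-->' },
  ],
};

// Hypothetical helper: count how many nodes of each type the tree contains.
function countTypes(node, counts = {}) {
  counts[node.type] = (counts[node.type] || 0) + 1;
  for (const child of node.children || []) countTypes(child, counts);
  return counts;
}

console.log(countTypes(tree));
```

Walking the tree recursively like this is the basic operation that the transformations described later in this document build on.
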
## CLI Arguments

The tooling requires a `filename` argument and supports extra arguments (some
also required) as shown below:

| Argument              | Description                                                                                                                            | Required | Example                            |
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------- | -------- | ---------------------------------- |
| `--node-version=`     | The version of Node.js that is being documented. It defaults to `process.version` which is supplied by Node.js itself                  | No       | v19.0.0                            |
| `--output-directory=` | The directory where the output files will be generated.                                                                                | Yes      | `./out/api/`                       |
| `--apilinks=`         | This file is used as an index to specify the source file for each module                                                               | No       | `./out/doc/api/apilinks.json`      |
| `--versions-file=`    | This file is used to specify an index of all previous versions of Node.js. It is used for the Version Navigation on the API docs page. | No       | `./out/previous-doc-versions.json` |

**Note:** The files passed to both `apilinks` and `versions-file` are generated
by the Node.js build process (Makefile). Each file contains a JSON object.

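As a rough illustration of the argument handling described above, the sketch below separates the `filename` from the `--key=value` flags and applies the documented default and required checks. It is not the code used by the tooling; `parseDocArgs` and `knownFlags` are hypothetical names, and the error messages are made up for this example.

```javascript
// Flags taken from the table above.
const knownFlags = ['node-version', 'output-directory', 'apilinks', 'versions-file'];

// Hypothetical helper: split argv into the input filename and known flags.
function parseDocArgs(argv) {
  // `--node-version` defaults to the running Node.js version.
  const result = { filename: null, 'node-version': process.version };
  for (const arg of argv) {
    if (!arg.startsWith('--')) {
      result.filename = arg;
      continue;
    }
    // Split on the first `=` only, so values may contain `=` themselves.
    const [flag, ...rest] = arg.slice(2).split('=');
    if (knownFlags.includes(flag)) result[flag] = rest.join('=');
  }
  if (!result.filename) throw new Error('No input file specified');
  if (!result['output-directory']) throw new Error('No output directory specified');
  return result;
}

console.log(parseDocArgs([
  'doc/api/fs.md',
  '--output-directory=./out/api/',
  '--node-version=v19.0.0',
]));
```

Unknown flags are silently ignored in this sketch; whether the real entry-point does the same is not covered here.
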
### Basic Usage

```bash
# cd tools/doc
npm run node-doc-generator ${filename}
```

**OR**

```bash
# nodejs/node root directory
make doc
```

## Dependencies and how the Tooling works internally

The API tooling uses an AST-oriented library called
[unified](https://github.com/unifiedjs/unified) to process the input file as
a graph whose nodes can easily be modified and updated.

In addition to `unified`, we also use
[Remark](https://github.com/remarkjs/remark) for manipulating the Markdown part,
and [Rehype](https://github.com/rehypejs/rehype) to help convert to and from
Markdown.

### What are the steps of the internal tooling?

The tooling uses the pipe-like engine of `unified` to chain each part of the
process. (The description below is a simplified version.)

* It starts by reading the frontmatter section of the Markdown file with
  [remark-frontmatter](https://www.npmjs.com/package/remark-frontmatter).
* It then parses the Markdown using `remark-parse` and adds support for
  [GitHub Flavored Markdown](https://github.github.com/gfm/).
* It proceeds by parsing some of the Markdown nodes and transforming them
  into HTML.
* It then generates the JSON output of the file.
* Finally, it performs its last node transformations and generates
  stringified HTML.
* It then stores the output in a JSON file, adds extra styling to the HTML,
  and stores the HTML file.

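The chaining idea behind these steps can be sketched without any dependencies. The snippet below is a minimal stand-in for a `.use()`-style pipeline, not the `unified` API itself; `createPipeline` and the three step functions are hypothetical and far simpler than the real parsing and transformation plugins.

```javascript
// Minimal stand-in for a `.use()`-style pipeline. This is NOT the unified API;
// it only mimics the chaining idea with plain functions.
function createPipeline() {
  const steps = [];
  return {
    use(step) {
      steps.push(step);
      return this; // returning the pipeline allows chaining
    },
    run(input) {
      // Feed the output of each step into the next one.
      return steps.reduce((value, step) => step(value), input);
    },
  };
}

// Hypothetical steps standing in for parsing, transforming and stringifying.
const parse = (markdown) => ({ type: 'root', lines: markdown.split('\n') });
const headingsToHtml = (tree) => ({
  ...tree,
  lines: tree.lines.map((line) =>
    (line.startsWith('# ') ? `<h1>${line.slice(2)}</h1>` : line)),
});
const stringify = (tree) => tree.lines.join('\n');

const html = createPipeline()
  .use(parse)
  .use(headingsToHtml)
  .use(stringify)
  .run('# File system\nSome text');

console.log(html);
```

Each step receives the result of the previous one, which is why the order of the steps listed above matters.
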
### What is each file responsible for?

The files listed below are the ones referenced and actually used during the
build process of the API docs as we see on <https://nodejs.org/api>. The
remaining files in the directory might be used by other steps of the Node.js
Makefile, or might be deprecated remnants of old processes that need to be
revisited or removed.

* **`html.mjs`**: Responsible for transforming nodes by decorating them with
  visual artifacts for the HTML pages;
  * For example, transforming man or JS doc references into links that
    correctly refer to the respective external documentation.
* **`json.mjs`**: Responsible for generating the JSON output of the file;
  * It is mostly responsible for going through the whole Markdown file and
    generating a JSON object that represents the metadata of a specific module.
  * For example, for the FS module, it will generate an object with all its
    methods, events and classes, using several regular expressions to extract
    the information needed.
* **`generate.mjs`**: Main entry-point of doc generation for a specific file. It
  does end-to-end processing of a documentation file;
* **`allhtml.mjs`**: A script executed after all files are generated to create a
  single "all" page containing all the HTML documentation;
* **`alljson.mjs`**: A script executed after all files are generated to create a
  single "all" page containing all the JSON entries;
* **`markdown.mjs`**: Contains a utility to replace Markdown links so that they
  work with the <https://nodejs.org/api/> website.
* **`common.mjs`**: Contains a few utility functions that are used by the other
  files.
* **`type-parser.mjs`**: Used to replace "type references" (e.g. "String", or
  "Buffer") with links to the correct internal/external documentation pages
  (i.e. MDN or other Node.js documentation pages).

**Note:** Other files not mentioned here might be used during the process but
are not relevant to the generation of the API docs themselves. You will notice
that a lot of the logic within the build process is **specific** to the current
<https://nodejs.org/api/> infrastructure. Examples include adding JavaScript
snippets and styles, transforming certain Markdown elements into HTML, and
adding certain HTML classes.

**Note:** Regarding the previous **Note**, we are currently working on an API
tooling that is generic and independent of the current nodejs.org
infrastructure.
[A functional version of the new tooling is available in the nodejs.dev repository](https://github.com/nodejs/nodejs.dev/blob/main/scripts/syncApiDocs.js);
it uses plain regular expressions (no AST) and [MDX](https://mdxjs.com/).

## The Build Process

The build process that happens in `generate.mjs` follows the steps below:

* Links within the Markdown are replaced directly within the source Markdown
  (AST) (`markdown.replaceLinks`)
  * This happens within `markdown.mjs`, which adds suffixes to or otherwise
    modifies link references within the Markdown
  * This is necessary for the `https://nodejs.org` infrastructure as all pages
    are suffixed with `.html`
* Text (and some YAML) nodes are transformed/modified through
  `html.preprocessText`
* JSON output is generated through `json.jsonAPI`
* The title of the page is inferred through `html.firstHeader`
* Nodes are transformed into HTML elements through `html.preprocessElements`
* The HTML Table of Contents (ToC) is generated through `html.buildToc`

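The link replacement step can be pictured with the sketch below. It assumes that the rewrite amounts to turning relative `*.md` targets into `*.html` while keeping any fragment, and it works on a raw Markdown string rather than on the AST that `markdown.replaceLinks` operates on. `replaceMdLinks` is a hypothetical name, and the exact rules of the real implementation may differ.

```javascript
// Hypothetical helper: rewrite relative `*.md` link targets to `*.html`,
// keeping any `#fragment`. Absolute http(s) URLs are left untouched.
function replaceMdLinks(markdown) {
  return markdown.replace(
    /\]\((?!https?:\/\/)([^)#\s]+)\.md(#[^)\s]*)?\)/g,
    (match, page, fragment = '') => `](${page}.html${fragment})`,
  );
}

const input =
  'See [fs](fs.md#fsreadfilepath) and ' +
  '[repo](https://github.com/nodejs/node/blob/main/README.md).';

console.log(replaceMdLinks(input));
```

Only the relative link changes in this example; the absolute GitHub link keeps its `.md` suffix because it does not point at a generated API page.
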
### `html.mjs`

This file is responsible for AST node transformations that either update
Markdown nodes to decorate them with more data or transform them into HTML nodes
that serve a specific visual purpose; for example, to generate the "Added
in" label, the Source Links, the Stability Index, or the History table.

**Note:** Methods not listed below are either not relevant or are utility
methods for string/array/object manipulation (i.e., they are used by the other
methods mentioned below).

#### `preprocessText`

**New Tooling:** Most of the features within this method are available within
the new tooling.

This method does two things:

* Replaces the Source Link YAML entry `<!-- source_link= -->` with a "Source
  Link" HTML anchor element.
* Replaces type references within the Markdown text (e.g. "String", "Buffer")
  with the HTML anchor element that links to the correct documentation
  page.
  * The original node then gets mutated from text to HTML.
  * It also updates references to Linux "MAN" pages to Web versions of them.

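A minimal sketch of the type-reference replacement could look like the following. It assumes that type references are written in curly braces (such as `{string}`) in the Markdown source, and it uses a tiny lookup table whose URLs were picked for this example. `typeLinks` and `linkTypes` are hypothetical names, not the functions exported by `type-parser.mjs`.

```javascript
// Illustrative lookup table; the entries and URLs are assumptions for this sketch.
const typeLinks = new Map([
  ['string', 'https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String'],
  ['Buffer', 'buffer.html#class-buffer'],
]);

// Hypothetical helper: turn `{type}` references into HTML anchor elements.
// Unknown types are left exactly as they were written.
function linkTypes(text) {
  return text.replace(/\{([^{}]+)\}/g, (match, name) => {
    const url = typeLinks.get(name);
    return url ? `<a href="${url}">&lt;${name}&gt;</a>` : match;
  });
}

console.log(linkTypes('Returns: {Buffer}'));
console.log(linkTypes('Returns: {Unknown}'));
```

Because the result contains markup, the node holding this text can no longer be a plain text node, which matches the text-to-HTML mutation mentioned above.
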
#### `firstHeader`

**New Tooling:** All features within this method are available within the new
tooling.

This method attempts to extract the first heading of the page (recursively) to
define the "title" of the page.

**Note:** As all API Markdown files start with a heading, this could possibly
be simplified.

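The recursive lookup can be sketched as a depth-first search over an mdast-shaped tree. The helpers `findFirstHeading` and `textOf` below are hypothetical names for illustration; they are not the actual `firstHeader` implementation.

```javascript
// Hypothetical helper: depth-first search for the first `heading` node.
function findFirstHeading(node) {
  if (node.type === 'heading') return node;
  for (const child of node.children || []) {
    const found = findFirstHeading(child);
    if (found) return found;
  }
  return null;
}

// Collect the plain text of a node by concatenating its text-like descendants.
function textOf(node) {
  if (node.type === 'text' || node.type === 'inlineCode') return node.value;
  return (node.children || []).map(textOf).join('');
}

// Hand-written example tree: a comment followed by the page heading.
const page = {
  type: 'root',
  children: [
    { type: 'html', value: '<!-- introduced_in=v0.10.0 -->' },
    {
      type: 'heading',
      depth: 1,
      children: [{ type: 'text', value: 'File system' }],
    },
  ],
};

const heading = findFirstHeading(page);
console.log(heading ? textOf(heading) : 'no title found');
```

If the heading were always the first child of the root, the search could be reduced to a direct lookup, which is the simplification the note above hints at.
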
#### `preprocessElements`

**New Tooling:** All features within this method are available within the new
tooling.

This method is responsible for multiple transformations of the AST nodes,
mostly transforming source nodes into their respective HTML elements
with diverse responsibilities, such as:

* Updating Markdown `code` blocks by adding language highlighting
  * It also adds the "CJS"/"MJS" switch to nodes that are followed by their
    CJS/ESM equivalents.
* Increasing the heading level of each heading
* Parsing YAML blocks and transforming them into HTML elements (see more at the
  `parseYAML` method)
* Updating blockquotes that are prefixed by the word "Stability" into a
  Stability Index HTML element.

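As one example of these transformations, the sketch below detects a blockquote text of the form `Stability: N - Label` and wraps it in an element. The regular expression, the `renderStability` name and the CSS class names are all assumptions made for this illustration and are not taken from `html.mjs`.

```javascript
// Hypothetical helper: detect a "Stability: N - Label" line and wrap it in a
// styled element. The class names are made up for this example.
function renderStability(text) {
  const match = /^Stability: (\d+)\s*-\s*(.+)$/.exec(text.trim());
  if (!match) return null; // not a stability blockquote
  const [, level, label] = match;
  return `<div class="stability stability-${level}">Stability: ${level} - ${label}</div>`;
}

console.log(renderStability('Stability: 2 - Stable'));
console.log(renderStability('Just a regular quote'));
```

Returning `null` for non-matching text lets a caller leave ordinary blockquotes untouched.
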
#### `parseYAML`

**New Tooling:** Most of the features within this method are available within
the new tooling.

This method is responsible for parsing the `<!-- YAML snippets -->` and
transforming them into HTML elements.

It follows a "schema" that consists of the following options:

| YAML Key      | Description                                                                                                                     | Example                                               | Example Result              | Available on new tooling |
| ------------- | ------------------------------------------------------------------------------------------------------------------------------- | ----------------------------------------------------- | --------------------------- | ------------------------ |
| `added`       | It's used to reference when a certain "module", "class" or "method" was added on Node.js                                        | `added: v0.1.90`                                      | `Added in: v0.1.90`         | Yes                      |
| `deprecated`  | It's used to reference when a certain "module", "class" or "method" was deprecated on Node.js                                   | `deprecated: v0.1.90`                                 | `Deprecated since: v0.1.90` | Yes                      |
| `removed`     | It's used to reference when a certain "module", "class" or "method" was removed on Node.js                                      | `removed: v0.1.90`                                    | `Removed in: v0.1.90`       | No                       |
| `changes`     | It's used to describe all the changes (historical ones) that happened within a certain "module", "class" or "method" in Node.js | `[{ version: v0.1.90, pr-url: '', description: '' }]` | --                          | Yes                      |
| `napiVersion` | It's used to describe in which version of the N-API this "module", "class" or "method" is available within Node.js              | `napiVersion: 1`                                      | `N-API version: 1`          | Yes                      |

**Note:** The `changes` field gets prepended with the `added`, `deprecated` and
`removed` fields if they exist. The table only gets generated if a `changes`
field exists. In the new tooling only "added" is prepended for now.

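The mapping from YAML keys to labels in the table above can be sketched as follows. This example handles only flat `key: value` lines and skips `changes` lists entirely, so it is not a YAML parser and not the actual `parseYAML` logic. `renderMeta` is a hypothetical name and the `<span>` wrapper is an assumption about the markup.

```javascript
// Labels taken from the "Example Result" column of the table above.
const labels = new Map([
  ['added', 'Added in'],
  ['deprecated', 'Deprecated since'],
  ['removed', 'Removed in'],
  ['napiVersion', 'N-API version'],
]);

// Hypothetical helper: handles flat `key: value` lines only. A real YAML
// parser would be needed for nested fields such as `changes`.
function renderMeta(comment) {
  // Strip the `<!-- YAML` opener and the `-->` closer of the HTML comment.
  const body = comment.replace(/^<!--\s*YAML\s*/, '').replace(/\s*-->$/, '');
  const out = [];
  for (const line of body.split('\n')) {
    const match = /^(\w+):\s*(.+)$/.exec(line.trim());
    if (match && labels.has(match[1])) {
      out.push(`<span>${labels.get(match[1])}: ${match[2]}</span>`);
    }
  }
  return out.join(' ');
}

console.log(renderMeta('<!-- YAML\nadded: v0.1.90\ndeprecated: v0.1.90\n-->'));
```

Keys that are not in the lookup table are ignored here, which keeps the sketch short but would hide typos in a real document.
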
#### `buildToc`

**New Tooling:** This feature is natively available within the new tooling
through MDX.

This method generates the Table of Contents based on all the headings of the
Markdown file.

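A simplified picture of the idea: given the headings of a file with their depth, emit a nested list of links. The `buildTocSketch` name, the input shape and the slug scheme used for the anchors are assumptions for this illustration and do not necessarily match the IDs the real tooling generates.

```javascript
// Hypothetical helper: build a nested Markdown-style list from headings.
// The slug scheme (lowercase, non-alphanumerics collapsed to `-`) is assumed.
function buildTocSketch(headings) {
  return headings
    .map(({ depth, text }) => {
      const id = text
        .toLowerCase()
        .replace(/[^a-z0-9]+/g, '-')
        .replace(/^-|-$/g, '');
      // Indent two spaces per heading level below the first.
      return `${'  '.repeat(depth - 1)}* [${text}](#${id})`;
    })
    .join('\n');
}

console.log(buildTocSketch([
  { depth: 1, text: 'File system' },
  { depth: 2, text: 'Class: fs.Stats' },
]));
```

The indentation mirrors the heading depth, so deeper headings end up as sub-items of the heading above them.
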
#### `altDocs`

**New Tooling:** All features within this method are available within the new
tooling.

This method generates a version picker for the current page to be shown in older
versions of the API docs.

### `json.mjs`

This file is responsible for generating a JSON object that (supposedly) is used
for IDE IntelliSense or for indexing all the "methods", "classes", "modules",
"events", "constants" and "globals" available within a certain Markdown file.

It attempts a best-effort extraction of the data by using several regular
expression patterns.

**Note:** JSON output generation is currently not supported by the new tooling,
but it is in the pipeline for development.

#### `jsonAPI`

This method traverses all the AST nodes by iterating through each one of them
and infers the kind of information each node contains through regular
expressions. Then it mutates the data and appends it to the final JSON object.

For more in-depth information, refer to the `json.mjs` file, which contains
many comments.

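To give a feel for the regular-expression-based inference, the sketch below classifies heading text into a few kinds of entries. The patterns, the `classifyHeading` name and the shape of the returned objects are assumptions for this example; the patterns in `json.mjs` are more numerous and more careful.

```javascript
// Hypothetical classifier: infer the kind of entry from a heading's text.
function classifyHeading(text) {
  let match;
  if ((match = /^Class: `?([\w.]+)`?$/.exec(text))) {
    return { type: 'class', name: match[1] };
  }
  if ((match = /^Event: `?'([^']+)'`?$/.exec(text))) {
    return { type: 'event', name: match[1] };
  }
  if ((match = /^`?([\w.]+)\(([^)]*)\)`?$/.exec(text))) {
    return { type: 'method', name: match[1], signature: match[2] };
  }
  // Anything that matches no pattern is kept as a generic entry.
  return { type: 'misc', name: text };
}

const headings = [
  'Class: `fs.Stats`',
  "Event: `'close'`",
  '`fs.readFile(path[, options], callback)`',
  'Synopsis',
];

console.log(JSON.stringify(headings.map(classifyHeading), null, 2));
```

Headings that match no pattern fall back to a generic entry, which reflects the best-effort nature of the extraction described above.
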