.. _blog-08-bazel-docgen:

===============================================
Pigweed Blog #8: Migrating pigweed.dev to Bazel
===============================================
*Published on 2025-02-03 by Kayce Basques*

``pigweed.dev`` is now built with Bazel! This blog post covers:

* :ref:`blog-08-bazel-docgen-why`
* :ref:`blog-08-bazel-docgen-compare`
* :ref:`blog-08-bazel-docgen-good`
* :ref:`blog-08-bazel-docgen-challenges`
* :ref:`blog-08-bazel-docgen-next`

.. important::

   Pigweed still supports GN and will continue to do so for a long time. This
   post merely discusses how we migrated the build for our own docs site
   (``pigweed.dev``) from GN to Bazel.

.. _blog-08-bazel-docgen-why:

---------------------------------
Why pigweed.dev switched to Bazel
---------------------------------
.. _hermetic: https://bazel.build/basics/hermeticity

Back in October 2023, Pigweed adopted Bazel as its :ref:`primary build system
<seed-0111>`. For the first 12 months or so, the team focused on critical
embedded development features like automated toolchain setup and `hermetic`_
cross-platform builds.  Most of that work is done now so we shifted our focus
to the last major part of Pigweed not using Bazel: our docs.

.. note::

   "Hermeticity" essentially means that the build system runs in an isolated,
   reproducible environment to guarantee that the build always produces the
   exact same outputs for all teammates. It's one of Bazel's key value
   propositions.

.. _blog-08-bazel-docgen-compare:

-------------------------------------------------
Comparing the new Bazel build to the old GN build
-------------------------------------------------
We eventually ended up with this architecture for the new Bazel-based docs
build system:

.. mermaid::

   flowchart LR

     Doxygen --> Breathe
     Breathe --> reST
     reST --> Sphinx
     Rust --> Sphinx
     Python --> Sphinx

The GN build had roughly the same architecture, but the architecture is
much more explicit and well-defined now.

Here's an overview of what each component does and how they connect together:

* C/C++ API references. We auto-generate our C/C++ API references with
  **Doxygen** and then use **Breathe** to insert the reference content
  into our main docs files, which are authored in **reST** (reStructuredText).
  Example: :ref:`pw_unit_test C++ API reference <module-pw_unit_test-cpp>`

* Rust API references. We auto-generate our Rust API references with
  ``rustdoc`` and upload these docs to their own subsite.
  Example: `Crate pw_bytes <https://pigweed.dev/rustdoc/pw_bytes/>`_

* reST and Sphinx. We author most of our docs in **reST** (reStructuredText)
  and convert them into HTML with **Sphinx**. This is the heart of our
  docs system. Example: :ref:`Tour of Pigweed <showcase-sense-tutorial-intro>`

* Python API references. We auto-generate our Python API references with
  Sphinx's ``autodoc`` feature. Example: :ref:`pw_rpc Python client API <module-pw_rpc-py>`

The next few sections provide more detail about how we used to run some of
these components in GN and how we now run them in Bazel. (The Python API
references component is skipped.)

.. note::

   I personally did most of the migration. I'm a technical writer. This
   migration was my first time working with Bazel in-depth and was the largest
   software engineering project I've ever done. My Pigweed teammates Alexei
   Frolov, Ted Pudlik, Dave Roth, and Rob Mohr provided a lot of help and
   guidance. Check out :bug:`318892911` for a granular breakdown of all the
   work that was done.

.. _blog-08-bazel-docgen-compare-doxygen:

Generating C/C++ API references with Doxygen
============================================
In the GN build we needed a custom script to run Doxygen.
The script manually cleaned output directories, calculated the
absolute paths to all the headers that Doxygen should parse, and
then ran Doxygen non-hermetically. I.e. Doxygen had access to
all files in the Pigweed repository rather than only the ones it
actually needed.

In the Bazel build all of our Doxygen logic resides within the ``MODULE.bazel``
file at the root of our repo and the ``BUILD.bazel`` files distributed
throughout the codebase. We use `rules_doxygen
<https://github.com/TendTo/rules_doxygen>`_ to hermetically run Doxygen.
We just provide ``rules_doxygen`` with a Doxygen executable, tell it
what headers to process, and it handles the rest.

We chose ``rules_doxygen`` because it's actively maintained and supports `Bazel
modules <https://bazel.build/external/module>`_ (the future of external
dependency management in Bazel). Initially the repo was missing support for
hermetic builds and macOS (Apple Silicon). I worked with the repo owner,
`Ernesto Casablanca <https://github.com/TendTo>`_, to get these features
implemented. It was one of my first proper engineering collaborations on an
open source project and it was a really rewarding experience. Thank you,
Ernesto!

.. _blog-08-bazel-docgen-compare-rust:

Generating Rust API references with rustdoc
===========================================
In the GN build there is no equivalent to this step. We have always generated
our Rust API references through Bazel.  We use `rules_rust
<https://github.com/bazelbuild/rules_rust>`_ to run ``rustdoc`` from within
Bazel. Previously our docs builder would generate the Rust API references with
Bazel, then use GN to build the rest of the docs, then upload the two
disconnected outputs to production.  Now, the docs builder just runs a single
Bazel command and everything is generated together. Long-term, this will
probably make it easier to integrate the Rust docs more thoroughly with the
rest of the site.

.. _blog-08-bazel-docgen-compare-sphinx:

Building the reStructuredText docs
==================================
.. inclusive-language: disable
.. _Sphinx: https://www.sphinx-doc.org/en/master/
.. inclusive-language: enable

This is the heart of our docs system. We author our docs in `reStructuredText
<https://docutils.sourceforge.io/rst.html>`_ (reST) and transform them
into HTML with `Sphinx`_. We currently have around 440 reST files
distributed throughout the Pigweed codebase.

In the GN build, we basically had to implement all core docs workflows
with our own custom scripts. E.g. we had a custom script for building
the docs with Sphinx, another for locally previewing the docs, etc.
We also had a lot of custom code for gathering up the reStructuredText
files distributed throughout the codebase and reorganizing them into a
structure that's easy for Sphinx to process.

In the Bazel build, we no longer need any of this custom code.
`rules_python <https://rules-python.readthedocs.io/en/latest/>`_
provides almost all of our core docs workflows now.
See :ref:`blog-08-bazel-docgen-good-sources` and
:ref:`blog-08-bazel-docgen-good-rules_python` for more details.

.. _blog-08-bazel-docgen-compare-verify:

Verifying the outputs
=====================
Our goal was to switch from GN to Bazel without ``pigweed.dev`` readers
noticing any change. With over 440 pages of documentation, it was infeasible to
manually verify that the Bazel build was producing the same outputs as the GN
build. I ended up automating the verification workflow like this:

#. Build the docs with the old GN-based system.

#. Build them again with the new Bazel-based system.

#. Traverse the output that GN produced and check that Bazel has produced
   the exact same set of HTML files.

#. Read each HTML file produced by GN as a string, then read the equivalent
   HTML file produced by Bazel as a string, then compare the strings to verify
   that their contents match exactly.

#. When they're not equal, use ``diff`` to manually pinpoint mismatches.
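
Steps 3 and 4 can be sketched in Python roughly as follows. This is a minimal
illustration, not the script that was actually used during the migration. The
function names and the ``gn_root`` and ``bazel_root`` parameters are
hypothetical.

.. code-block:: py

   from pathlib import Path


   def html_files(root: Path) -> set[str]:
       """Returns the relative paths of all HTML files under ``root``."""
       return {
           p.relative_to(root).as_posix()
           for p in root.rglob("*.html")
           if p.is_file()
       }


   def compare_outputs(gn_root: Path, bazel_root: Path) -> dict[str, list[str]]:
       """Compares two docs output trees and reports every difference."""
       gn_files = html_files(gn_root)
       bazel_files = html_files(bazel_root)
       mismatched = []
       # Only files that exist in both trees can be compared as strings.
       for rel in sorted(gn_files & bazel_files):
           gn_html = (gn_root / rel).read_text(encoding="utf-8")
           bazel_html = (bazel_root / rel).read_text(encoding="utf-8")
           if gn_html != bazel_html:
               mismatched.append(rel)
       return {
           "missing_from_bazel": sorted(gn_files - bazel_files),
           "extra_in_bazel": sorted(bazel_files - gn_files),
           "mismatched": mismatched,
       }

A report with three empty lists would mean that both builds produced the same
set of HTML files with identical contents. Any path listed under
``mismatched`` is a candidate for a manual ``diff``.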

For final verification, I set up a visual diffing workflow:

#. Use `Playwright <https://playwright.dev/python/>`_ to take screenshots of
   each GN-generated HTML file and its Bazel-built equivalent.

#. Visually diff the screenshots with `pixelmatch-py
   <https://github.com/whtsky/pixelmatch-py>`_.

.. _blog-08-bazel-docgen-good:

--------------
What went well
--------------
We kicked off the migration project in mid-September 2024 and started using
Bazel in production in mid-January 2025. If we were in a rush, we probably
could have finished in 2 months. When you add up the work I did as well as the
help I got from others, it was about 120 hours of work, i.e. one full-time
employee working 15 eight-hour days. We expected this project to drag on for
much longer.

.. _blog-08-bazel-docgen-good-sources:

Built-in support for reorganizing sources
=========================================
Our docs are stored alongside the rest of Pigweed's code in a single
repository. To make it easier to keep the docs in-sync with code changes, each
doc lives close to its related code, like this:

.. code-block:: text

   .
   ├── a
   │   ├── a.cpp
   │   └── a.rst
   ├── b
   │   ├── b.cpp
   │   └── b.rst
   └── docs
       ├── conf.py
       └── index.rst

Sphinx, however, is easiest to work with when you have a structure
like this:

.. code-block:: text

   .
   ├── a
   │   └── a.cpp
   ├── b
   │   └── b.cpp
   └── docs
       ├── a
       │   └── a.rst
       ├── b
       │   └── b.rst
       ├── conf.py
       └── index.rst

By default, Sphinx considers the directory containing ``conf.py`` to
be the root docs directory. All ``*.rst`` (reST) files should be at or
below the root docs directory.

In the old GN-based system we had to hack together this reorganization
logic ourselves. Bazel has built-in support for source reorganization via
its ``prefix`` and ``strip_prefix`` features.
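
To make the idea concrete, here is a small Python model of the path mapping.
It is only a conceptual sketch: ``relocate`` is a hypothetical helper, not part
of Bazel or ``rules_python``, and it assumes that the prefix to strip is
removed before the new prefix is prepended. The real features may behave
differently in edge cases.

.. code-block:: py

   from pathlib import PurePosixPath


   def relocate(src: str, strip_prefix: str = "", prefix: str = "") -> str:
       """Models where a source file lands in the tree that Sphinx sees.

       Hypothetical helper: removes ``strip_prefix`` from the front of ``src``
       (if present), then prepends ``prefix``.
       """
       path = src
       if strip_prefix and path.startswith(strip_prefix):
           path = path[len(strip_prefix):]
       return (PurePosixPath(prefix) / path.lstrip("/")).as_posix()


   # Docs that live next to the code, as in the first tree above.
   sources = ["a/a.rst", "b/b.rst", "docs/index.rst"]

   # Move the per-module docs under docs/ and leave docs/ itself untouched.
   relocated = [
       src if src.startswith("docs/") else relocate(src, prefix="docs")
       for src in sources
   ]
   print(relocated)

In this model the sources declare where they belong, and the build assembles
the Sphinx-friendly tree without physically moving any files in the
repository.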

.. _blog-08-bazel-docgen-good-rules_python:

rules_python did the heavy lifting
==================================
We now get almost all of :ref:`our core docs workflows <contrib-docs-build>`
for free from ``rules_python``, thanks to the great work that `Richard
Levasseur <https://github.com/rickeylev>`_ has been doing. In this regard the
switch to Bazel has significantly reduced complexity in the Pigweed codebase
because our docs system now needs much less custom code.

.. _blog-08-bazel-docgen-good-speed:

Faster cold start builds
========================
Currently, building the docs from scratch in Bazel is about 27% faster than
building them from scratch in GN. However, there's still one major docs feature
being migrated over to Bazel so it's not an apples-to-apples comparison yet.

.. _blog-08-bazel-docgen-challenges:

----------
Challenges
----------
Overall the migration was a success, but I did get some scars!

.. _blog-08-bazel-docgen-challenges-incremental:

Incremental builds (or lack thereof)
====================================
Incremental builds aren't working. You change one line in one reStructuredText
file, and it takes 30-60 seconds to regenerate the docs. Unacceptable! Bazel
and Sphinx both separately support incremental builds, so we're hopeful that
we can find a path forward without opening a huge can of worms.

.. _blog-08-bazel-docgen-challenges-skylib:

Core utilities were hard to find
================================
At one point I needed to copy a directory containing generated outputs.  I
searched the Bazel docs, but couldn't find a built-in mechanism for this basic
task, so I created a `genrule
<https://bazel.build/reference/be/general#genrule>`_.  During code review, I
learned that there is indeed a core utility for this: `copy_directory
<https://github.com/bazelbuild/bazel-skylib/blob/main/rules/copy_directory.bzl>`_.
I was quite surprised that ``copy_directory`` is not mentioned in the official
Bazel docs.

.. _blog-08-bazel-docgen-challenges-deps:

Dependency hell
===============
Pigweed's CI/CD testing is rigorous. Before new code is allowed to merge into
Pigweed, all of Pigweed is built and tested in 10-100 different environments
(the exact number depends on what code you've touched). There's a check that
builds Pigweed with Bazel on macOS (Apple Silicon), another one that builds
Pigweed with GN on Windows (x86), and so on. We also have a bunch of
integration tests to ensure that changes to Pigweed don't break our customers'
builds or unit tests.

The :ref:`rules_python <blog-08-bazel-docgen-good-rules_python>` features that
we rely on were introduced in a fairly new version of the module, v0.36.  When
I upgraded Pigweed to v0.36, I saw the dreaded red wall of integration test
results.  In other words, upgrading to ``rules_python`` v0.36 would break the
builds for many Pigweed customers. The only path forward was to independently
upgrade each customer's codebase to support v0.36. My Pigweed teammate, Dave
Roth, saved the day by doing exactly that. Thank you, Dave, for helping me
escape `dependency hell <https://en.wikipedia.org/wiki/Dependency_hell>`_!

.. _blog-08-bazel-docgen-challenges-graphs:

Explicit build graphs were time-consuming
=========================================
Following the convention used in the rest of Pigweed's codebase, I opted to
explicitly list all sources and dependencies in the docs build rules, like
this:

.. code-block:: py

   sphinx_docs_library(
       name = "docs",
       srcs = [
           "api.rst",
           "code_size.rst",
           "design.rst",
           "docs.rst",
           "guide.rst",
       ],
       # …
   )

For the initial prototyping, using globs would have been much
faster:

.. code-block:: py

   sphinx_docs_library(
       name = "docs",
       srcs = glob([
           "*.rst",
       ]),
       # …
   )
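
One possible middle ground is to keep the explicit lists but generate them
with a small script. This post doesn't describe such a tool, so the sketch
below is purely illustrative: ``format_srcs`` is a hypothetical helper, not a
Pigweed or Bazel utility.

.. code-block:: py

   from pathlib import Path


   def format_srcs(directory: Path, pattern: str = "*.rst") -> str:
       """Formats the files matching ``pattern`` as an explicit srcs list."""
       # Sort the names so that the generated list is stable across runs.
       names = sorted(p.name for p in directory.glob(pattern) if p.is_file())
       lines = ["srcs = ["]
       lines += [f'    "{name}",' for name in names]
       lines.append("],")
       return "\n".join(lines)

Calling ``print(format_srcs(Path("some_module")))`` (where ``some_module`` is
a placeholder directory) prints text that can be pasted into a build rule and
reviewed like any hand-written list.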

.. _blog-08-bazel-docgen-challenges-starlark:

Uncanny valley experiences with Starlark
========================================
.. _Starlark: https://github.com/bazelbuild/starlark?tab=readme-ov-file#starlark
.. _dialect: https://en.wikipedia.org/wiki/Programming_language#Dialects,_flavors_and_implementations

`Starlark`_ naturally looks and feels a lot like Python, since it's a `dialect`_
of Python. During the migration I had a few `uncanny valley
<https://en.wikipedia.org/wiki/Uncanny_valley>`_ experiences where I expected
some Python idiom to work, and then eventually figured out that Starlark
doesn't allow it. For example, to build out a dict in Python, I sometimes
use code like this:

.. code-block:: py

   output_group_info = {}
   for out in ctx.attr.outs:
       output_group_info[out] = ctx.actions.declare_directory(out)

But this didn't work for me in Starlark because the dict was treated as
immutable. It was OK, however, to rebind the entire variable, like this:

.. code-block:: py

   output_group_info = {}
   for out in ctx.attr.outs:
       output_group_info |= {out: ctx.actions.declare_directory(out)}

.. _blog-08-bazel-docgen-next:

-----------
What's next
-----------
Our top priorities are figuring out incremental builds and turning
down the old GN-based build.

Thank you for reading! If you'd like to discuss any of this with me, you can
find me in the ``#docs`` channel of `Pigweed's Discord
<https://discord.com/channels/691686718377558037/691686718377558040>`_.