.. _blog-08-bazel-docgen:

===============================================
Pigweed Blog #8: Migrating pigweed.dev to Bazel
===============================================
*Published on 2025-02-03 by Kayce Basques*

``pigweed.dev`` is now built with Bazel! This blog post covers:

* :ref:`blog-08-bazel-docgen-why`
* :ref:`blog-08-bazel-docgen-compare`
* :ref:`blog-08-bazel-docgen-good`
* :ref:`blog-08-bazel-docgen-challenges`
* :ref:`blog-08-bazel-docgen-next`

.. important::

   Pigweed still supports GN and will continue to do so for a long time. This
   post merely discusses how we migrated the build for our own docs site
   (``pigweed.dev``) from GN to Bazel.

.. _blog-08-bazel-docgen-why:

---------------------------------
Why pigweed.dev switched to Bazel
---------------------------------
.. _hermetic: https://bazel.build/basics/hermeticity

Back in October 2023, Pigweed adopted Bazel as its :ref:`primary build system
<seed-0111>`. For the first 12 months or so, the team focused on critical
embedded development features like automated toolchain setup and `hermetic`_
cross-platform builds. Most of that work is done now, so we shifted our focus
to the last major part of Pigweed not using Bazel: our docs.

.. note::

   "Hermeticity" essentially means that the build system runs in an isolated,
   reproducible environment to guarantee that the build always produces the
   exact same outputs for all teammates. It's one of Bazel's key value
   propositions.

.. _blog-08-bazel-docgen-compare:

-------------------------------------------------
Comparing the new Bazel build to the old GN build
-------------------------------------------------
We eventually ended up with this architecture for the new Bazel-based docs
build system:

.. mermaid::

   flowchart LR

     Doxygen --> Breathe
     Breathe --> reST
     reST --> Sphinx
     Rust --> Sphinx
     Python --> Sphinx

The GN build had roughly the same architecture, but the architecture is
much more explicit and well-defined now.

Here's an overview of what each component does and how they connect together:

* C/C++ API references. We auto-generate our C/C++ API references with
  **Doxygen** and then use **Breathe** to insert the reference content
  into our main docs files, which are authored in **reST** (reStructuredText).
  Example: :ref:`pw_unit_test C++ API reference <module-pw_unit_test-cpp>`

* Rust API references. We auto-generate our Rust API references with
  ``rustdoc`` and upload these docs to their own subsite.
  Example: `Crate pw_bytes <https://pigweed.dev/rustdoc/pw_bytes/>`_

* reST and Sphinx. We author most of our docs in **reST** (reStructuredText)
  and convert them into HTML with **Sphinx**. This is the heart of our
  docs system. Example: :ref:`Tour of Pigweed <showcase-sense-tutorial-intro>`

* Python API references. We auto-generate our Python API references with
  Sphinx's ``autodoc`` feature.
  Example: :ref:`pw_rpc Python client API <module-pw_rpc-py>`

The next few sections provide more detail about how we used to run some of
these components in GN and how we now run them in Bazel. (The Python API
references component is skipped.)

.. note::

   I personally did most of the migration. I'm a technical writer. This
   migration was my first time working with Bazel in-depth and was the largest
   software engineering project I've ever done. My Pigweed teammates Alexei
   Frolov, Ted Pudlik, Dave Roth, and Rob Mohr provided a lot of help and
   guidance. Check out :bug:`318892911` for a granular breakdown of all the
   work that was done.

.. _blog-08-bazel-docgen-compare-doxygen:

Generating C/C++ API references with Doxygen
============================================
In the GN build we needed a custom script to run Doxygen.
The script manually cleaned output directories, calculated the
absolute paths to all the headers that Doxygen should parse, and
then ran Doxygen non-hermetically. I.e. Doxygen had access to
all files in the Pigweed repository rather than only the ones it
actually needed.

In the Bazel build all of our Doxygen logic resides within the ``MODULE.bazel``
file at the root of our repo and the ``BUILD.bazel`` files distributed
throughout the codebase. We use `rules_doxygen
<https://github.com/TendTo/rules_doxygen>`_ to hermetically run Doxygen.
We just provide ``rules_doxygen`` with a Doxygen executable, tell it
what headers to process, and it handles the rest.

We chose ``rules_doxygen`` because it's actively maintained and supports `Bazel
modules <https://bazel.build/external/module>`_ (the future of external
dependency management in Bazel). Initially the repo was missing support for
hermetic builds and macOS (Apple Silicon). I worked with the repo owner,
`Ernesto Casablanca <https://github.com/TendTo>`_, to get these features
implemented. It was one of my first proper engineering collaborations on an
open source project and it was a really rewarding experience. Thank you,
Ernesto!

.. _blog-08-bazel-docgen-compare-rust:

Generating Rust API references with rustdoc
===========================================
In the GN build there is no equivalent to this step. We have always generated
our Rust API references through Bazel. We use `rules_rust
<https://github.com/bazelbuild/rules_rust>`_ to run ``rustdoc`` from within
Bazel. Previously our docs builder would generate the Rust API references with
Bazel, then use GN to build the rest of the docs, then upload the two
disconnected outputs to production. Now, the docs builder just runs a single
Bazel command and everything is generated together. Long-term, this will
probably make it easier to integrate the Rust docs more thoroughly with the
rest of the site.

.. _blog-08-bazel-docgen-compare-sphinx:

Building the reStructuredText docs
==================================
.. inclusive-language: disable
.. _Sphinx: https://www.sphinx-doc.org/en/master/
.. inclusive-language: enable

This is the heart of our docs system. We author our docs in `reStructuredText
<https://docutils.sourceforge.io/rst.html>`_ (reST) and transform them
into HTML with `Sphinx`_. We currently have around 440 reST files
distributed throughout the Pigweed codebase.

In the GN build, we basically had to implement all core docs workflows
with our own custom scripts. E.g. we had a custom script for building
the docs with Sphinx, another for locally previewing the docs, etc.
We also had a lot of custom code for gathering up the reStructuredText
files distributed throughout the codebase and reorganizing them into a
structure that's easy for Sphinx to process.

In the Bazel build, we no longer need any of this custom code.
`rules_python <https://rules-python.readthedocs.io/en/latest/>`_
provides almost all of our core docs workflows now.
See :ref:`blog-08-bazel-docgen-good-sources` and
:ref:`blog-08-bazel-docgen-good-rules_python` for more details.

.. _blog-08-bazel-docgen-compare-verify:

Verifying the outputs
=====================
Our goal was to switch from GN to Bazel without ``pigweed.dev`` readers
noticing any change. With over 440 pages of documentation, it was infeasible to
manually verify that the Bazel build was producing the same outputs as the GN
build. I ended up automating the verification workflow like this:

#. Build the docs with the old GN-based system.

#. Build them again with the new Bazel-based system.

#. Traverse the output that GN produced and check that Bazel has produced
   the exact same set of HTML files.

#. Read each HTML file produced by GN as a string, then read the equivalent
   HTML file produced by Bazel as a string, then compare the strings to verify
   that their contents match exactly.

#. When they're not equal, use ``diff`` to manually pinpoint mismatches.

For final verification, I set up a visual diffing workflow:

#. Use `Playwright <https://playwright.dev/python/>`_ to take screenshots of
   each GN-generated HTML file and its Bazel-built equivalent.

#. Visually diff the screenshots with `pixelmatch-py
   <https://github.com/whtsky/pixelmatch-py>`_.

.. _blog-08-bazel-docgen-good:

--------------
What went well
--------------
We kicked off the migration project in mid-September 2024 and started using
Bazel in production in mid-January 2025. If we had been in a rush, we probably
could have finished in 2 months. When you add up the work I did as well as the
help I got from others, it was about 120 hours of work. I.e. one full-time
employee working 15 full days. We expected this project to drag on for much
longer.

.. _blog-08-bazel-docgen-good-sources:

Built-in support for reorganizing sources
=========================================
Our docs are stored alongside the rest of Pigweed's code in a single
repository. To make it easier to keep the docs in sync with code changes, each
doc lives close to its related code, like this:

.. code-block:: text

   .
   ├── a
   │   ├── a.cpp
   │   └── a.rst
   ├── b
   │   ├── b.cpp
   │   └── b.rst
   └── docs
       ├── conf.py
       └── index.rst

Sphinx, however, is easiest to work with when you have a structure
like this:

.. code-block:: text

   .
   ├── a
   │   └── a.cpp
   ├── b
   │   └── b.cpp
   └── docs
       ├── a
       │   └── a.rst
       ├── b
       │   └── b.rst
       ├── conf.py
       └── index.rst

By default, Sphinx considers the directory containing ``conf.py`` to
be the root docs directory. All ``*.rst`` (reST) files should be at or
below the root docs directory.

In the old GN-based system we had to hack together this reorganization
logic ourselves. Bazel has built-in support for source reorganization via
its ``prefix`` and ``strip_prefix`` features.

.. _blog-08-bazel-docgen-good-rules_python:

rules_python did the heavy lifting
==================================
We now get almost all of :ref:`our core docs workflows <contrib-docs-build>`
for free from ``rules_python``, thanks to the great work that `Richard
Levasseur <https://github.com/rickeylev>`_ has been doing. In this regard the
switch to Bazel has significantly reduced complexity in the Pigweed codebase
because our docs system now needs much less custom code.

.. _blog-08-bazel-docgen-good-speed:

Faster cold start builds
========================
Currently, building the docs from scratch in Bazel is about 27% faster than
building them from scratch in GN. However, there's still one major docs feature
being migrated over to Bazel so it's not an apples-to-apples comparison yet.

.. _blog-08-bazel-docgen-challenges:

----------
Challenges
----------
Overall the migration was a success, but I did get some scars!

.. _blog-08-bazel-docgen-challenges-incremental:

Incremental builds (or lack thereof)
====================================
Incremental builds aren't working. You change one line in one reStructuredText
file, and it takes 30-60 seconds to regenerate the docs. Unacceptable! Bazel
and Sphinx both separately support incremental builds, so we're hopeful that
we can find a path forward without opening a huge can of worms.

.. _blog-08-bazel-docgen-challenges-skylib:

Core utilities were hard to find
================================
At one point I needed to copy a directory containing generated outputs. I
searched the Bazel docs, but couldn't find a built-in mechanism for this basic
task, so I created a `genrule
<https://bazel.build/reference/be/general#genrule>`_. During code review, I
learned that there is indeed a core utility for this: `copy_directory
<https://github.com/bazelbuild/bazel-skylib/blob/main/rules/copy_directory.bzl>`_.
I was quite surprised that ``copy_directory`` is not mentioned in the official
Bazel docs.

.. _blog-08-bazel-docgen-challenges-deps:

Dependency hell
===============
Pigweed's CI/CD testing is rigorous. Before new code is allowed to merge into
Pigweed, all of Pigweed is built and tested in 10-100 different environments
(the exact number depends on what code you've touched). There's a check that
builds Pigweed with Bazel on macOS (Apple Silicon), another one that builds
Pigweed with GN on Windows (x86), and so on. We also have a bunch of
integration tests to ensure that changes to Pigweed don't break our customers'
builds or unit tests.

The :ref:`rules_python <blog-08-bazel-docgen-good-rules_python>` features that
we rely on were introduced in a fairly new version of the module, v0.36. When
I upgraded Pigweed to v0.36, I saw the dreaded red wall of integration test
results. In other words, upgrading to ``rules_python`` v0.36 would break the
builds for many Pigweed customers. The only path forward was to independently
upgrade each customer's codebase to support v0.36. My Pigweed teammate, Dave
Roth, saved the day by doing exactly that. Thank you, Dave, for helping me
escape `dependency hell <https://en.wikipedia.org/wiki/Dependency_hell>`_!

.. _blog-08-bazel-docgen-challenges-graphs:

Explicit build graphs were time-consuming
=========================================
As in the rest of Pigweed's codebase, I opted to explicitly list all
sources and dependencies in the docs build rules, like this:

.. code-block:: py

   sphinx_docs_library(
       name = "docs",
       srcs = [
           "api.rst",
           "code_size.rst",
           "design.rst",
           "docs.rst",
           "guide.rst",
       ],
       # …
   )

For the initial prototyping, using globs would have been much
faster:

.. code-block:: py

   sphinx_docs_library(
       name = "docs",
       srcs = glob([
           "*.rst",
       ]),
       # …
   )

.. _blog-08-bazel-docgen-challenges-starlark:

Uncanny valley experiences with Starlark
========================================
.. _Starlark: https://github.com/bazelbuild/starlark?tab=readme-ov-file#starlark
.. _dialect: https://en.wikipedia.org/wiki/Programming_language#Dialects,_flavors_and_implementations

`Starlark`_ naturally looks and feels a lot like Python, since it's a `dialect`_
of Python. During the migration I had a few `uncanny valley
<https://en.wikipedia.org/wiki/Uncanny_valley>`_ experiences where I expected
some Python idiom to work, and then eventually figured out that Starlark
doesn't allow it. For example, to build out a dict in Python, I sometimes
use code like this:

.. code-block:: py

   output_group_info = {}
   for out in ctx.attr.outs:
       output_group_info[out] = ctx.actions.declare_directory(out)

But Starlark rejected this in my case. Starlark freezes values such as dicts
in certain contexts, which makes them immutable.
It is OK, however, to rebind the entire variable, like this:

.. code-block:: py

   output_group_info = {}
   for out in ctx.attr.outs:
       output_group_info |= {out: ctx.actions.declare_directory(out)}

.. _blog-08-bazel-docgen-next:

-----------
What's next
-----------
Our top priorities are figuring out incremental builds and turning
down the old GN-based build.

Thank you for reading! If you'd like to discuss any of this with me, you can
find me in the ``#docs`` channel of `Pigweed's Discord
<https://discord.com/channels/691686718377558037/691686718377558040>`_.
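
P.S. The file-set check and string comparison from the "Verifying the outputs"
section can be sketched in a few lines of Python. This is a minimal
illustration under stated assumptions, not the script used during the
migration: the ``html_files`` and ``compare_html_trees`` names and the two
directory arguments are hypothetical.

.. code-block:: py

   import sys
   from pathlib import Path


   def html_files(root):
       """Return the relative POSIX paths of all HTML files under root."""
       root = Path(root)
       return {p.relative_to(root).as_posix() for p in root.rglob("*.html")}


   def compare_html_trees(gn_dir, bazel_dir):
       """Compare two docs output trees (hypothetical helper).

       Returns three sorted lists of relative paths: files only in the GN
       output, files only in the Bazel output, and files present in both
       whose contents differ.
       """
       gn, bazel = html_files(gn_dir), html_files(bazel_dir)
       mismatched = sorted(
           rel
           for rel in gn & bazel
           if Path(gn_dir, rel).read_text(encoding="utf-8")
           != Path(bazel_dir, rel).read_text(encoding="utf-8")
       )
       return sorted(gn - bazel), sorted(bazel - gn), mismatched


   if __name__ == "__main__" and len(sys.argv) == 3:
       # Usage: python compare.py <gn_output_dir> <bazel_output_dir>
       only_gn, only_bazel, mismatched = compare_html_trees(*sys.argv[1:])
       print("Only in GN output:", only_gn)
       print("Only in Bazel output:", only_bazel)
       print("Contents differ:", mismatched)

Any path reported in the third list is a candidate for a closer look with
``diff``, as described in the last step of that workflow.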