.. _module-pw_presubmit:

============
pw_presubmit
============
The presubmit module provides Python tools for running presubmit checks and
checking and fixing code format. It also includes the presubmit check script for
the Pigweed repository, ``pigweed_presubmit.py``.

Presubmit checks are essential tools, but they take work to set up, and
projects don’t always get around to it. The ``pw_presubmit`` module provides
tools for setting up high quality presubmit checks for any project. We use this
framework to run Pigweed’s presubmit on our workstations and in our automated
building tools.

The ``pw_presubmit`` module also includes ``pw format``, a tool that provides a
unified interface for automatically formatting code in a variety of languages.
With ``pw format``, you can format Bazel, C, C++, Python, GN, and Go code
according to configurations defined by your project. ``pw format`` leverages
existing tools like ``clang-format``, and it’s simple to add support for new
languages. (Note: Bazel formatting requires ``buildifier`` to be present on your
system. If it is not, Bazel formatting passes without checking.)

.. image:: docs/pw_presubmit_demo.gif
   :alt: ``pw format`` demo
   :align: left

The ``pw_presubmit`` package includes presubmit checks that can be used with any
project. These checks include:

* Check code format of several languages including C, C++, and Python
* Initialize a Python environment
* Run all Python tests
* Run pylint
* Run mypy
* Ensure source files are included in the GN and Bazel builds
* Build and run all tests with GN
* Build and run all tests with Bazel
* Ensure all header files contain ``#pragma once``

-------------
Compatibility
-------------
Python 3

-------------------------------------------
Creating a presubmit check for your project
-------------------------------------------
Creating a presubmit check for a project using ``pw_presubmit`` is simple, but
requires some customization. Projects must define their own presubmit check
Python script that uses the ``pw_presubmit`` package.

A project's presubmit script can be registered as a
:ref:`pw_cli <module-pw_cli>` plugin, so that it can be run as ``pw
presubmit``.

Setting up the command-line interface
=====================================
The ``pw_presubmit.cli`` module sets up the command-line interface for a
presubmit script. This defines a standard set of arguments for invoking
presubmit checks. Its use is optional, but recommended.

pw_presubmit.cli
----------------
.. automodule:: pw_presubmit.cli
   :members: add_arguments, run

Presubmit output directory
--------------------------
The ``pw_presubmit`` command line interface includes an ``--output-directory``
option that specifies the working directory to use for presubmits. The default
path is ``out/presubmit``.  A subdirectory is created for each presubmit step.
This directory persists between presubmit runs and can be cleaned by deleting it
or running ``pw presubmit --clean``.

Presubmit checks
================
A presubmit check is defined as a function or other callable. The function must
accept one argument: a ``PresubmitContext``, which provides the paths on which
to run. Presubmit checks communicate failure by raising an exception.

Presubmit checks may use the ``filter_paths`` decorator to automatically filter
the paths list for file types they care about.

Either of these functions could be used as presubmit checks:

.. code-block:: python

  import subprocess

  import pw_presubmit
  from pw_presubmit import PresubmitContext, PresubmitFailure

  # Only Python files are passed to this check.
  @pw_presubmit.filter_paths(endswith='.py')
  def file_contains_ni(ctx: PresubmitContext):
      for path in ctx.paths:
          with open(path) as file:
              contents = file.read()
              if 'ni' not in contents and 'nee' not in contents:
                  raise PresubmitFailure('Files must say "ni"!', path=path)

  # check=True raises an exception if the build fails.
  def run_the_build(_):
      subprocess.run(['make', 'release'], check=True)

Presubmit check functions are grouped into "programs" -- a named series of
checks. Projects may find it helpful to have programs for different purposes,
such as a quick program for local use and a full program for automated use. The
:ref:`example script <example-script>` uses ``pw_presubmit.Programs`` to define
``quick`` and ``full`` programs.
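
As a rough illustration of how nested groups of checks can collapse into one
ordered program, the sketch below flattens nested tuples and keeps only the
first occurrence of each check. It is a standard-library stand-in, not the
``pw_presubmit.Programs`` implementation; ``flatten_checks`` and the three
placeholder checks are hypothetical names.

.. code-block:: python

  """Illustrative only: flatten nested groups of checks into one program."""
  from typing import Callable, Iterable, List


  def flatten_checks(items: Iterable) -> List[Callable]:
      """Expands nested groups, keeping the first occurrence of each check."""
      flat: List[Callable] = []
      for item in items:
          # A callable is a single check; anything else is a nested group.
          group = [item] if callable(item) else flatten_checks(item)
          for check in group:
              if check not in flat:
                  flat.append(check)
      return flat


  # Placeholder checks (hypothetical); real checks take a PresubmitContext.
  def check_format(ctx): ...
  def host_tests(ctx): ...
  def release_build(ctx): ...


  QUICK = (check_format, host_tests)
  # FULL nests QUICK and repeats host_tests, which should only appear once.
  FULL = (QUICK, release_build, host_tests)

  program_names = [check.__name__ for check in flatten_checks(FULL)]
  print(program_names)

Under these assumptions the ``full`` program runs everything in ``quick``
first and then ``release_build``, with the repeated ``host_tests`` entry
dropped.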

``PresubmitContext`` has the following members:

* ``root``: Source checkout root directory
* ``repos``: Repositories (top-level and submodules) processed by
  ``pw presubmit``
* ``output_dir``: Output directory for this specific presubmit step
* ``failure_summary_log``: File path where steps should write a brief summary
  of any failures
* ``paths``: Modified files for the presubmit step to check (often used in
  formatting steps but ignored in compile steps)
* ``all_paths``: All files in the repository tree.
* ``package_root``: Root directory for ``pw package`` installations
* ``override_gn_args``: Additional GN args processed by ``build.gn_gen()``
* ``luci``: Information about the LUCI build or None if not running in LUCI
* ``num_jobs``: Number of jobs to run in parallel
* ``continue_after_build_error``: For steps that compile, don't exit on the
  first compilation error

The ``luci`` member is of type ``LuciContext`` and has the following members:

* ``buildbucket_id``: The globally-unique buildbucket id of the build
* ``build_number``: The builder-specific incrementing build number, if
  configured for this builder
* ``project``: The LUCI project under which this build is running (often
  ``pigweed`` or ``pigweed-internal``)
* ``bucket``: The LUCI bucket under which this build is running (often ends
  with ``ci`` or ``try``)
* ``builder``: The builder being run
* ``swarming_server``: The swarming server on which this build is running
* ``swarming_task_id``: The swarming task id of this build
* ``cas_instance``: The CAS instance accessible from this build
* ``pipeline``: Information about the build pipeline, if applicable.
* ``triggers``: Information about triggering commits, if applicable.

The ``pipeline`` member, if present, is of type ``LuciPipeline`` and has the
following members:

* ``round``: The zero-indexed round number.
* ``builds_from_previous_iteration``: A list of the buildbucket ids from the
  previous round, if any, encoded as strs.

The ``triggers`` member is a sequence of ``LuciTrigger`` objects, which have the
following members:

* ``number``: The number of the change in Gerrit.
* ``patchset``: The number of the patchset of the change.
* ``remote``: The full URL of the remote.
* ``branch``: The name of the branch on which this change is being/was
  submitted.
* ``ref``: The ``refs/changes/..`` path that can be used to reference the
  patch for unsubmitted changes and the hash for submitted changes.
* ``gerrit_name``: The name of the googlesource.com Gerrit host.
* ``submitted``: Whether the change has been submitted or is still pending.

Additional members can be added by subclassing ``PresubmitContext`` and
``Presubmit``. Then override ``Presubmit._create_presubmit_context()`` to
return the subclass of ``PresubmitContext``. Finally, add
``presubmit_class=PresubmitSubClass`` when calling ``cli.run()``.

Substeps
--------
Presubmit steps can define substeps that can run independently in other tooling.
These steps should subclass ``SubStepCheck`` and must define a ``substeps()``
method that yields ``SubStep`` objects. ``SubStep`` objects have the following
members:

* ``name``: Name of the substep
* ``_func``: Substep code
* ``args``: Positional arguments for ``_func``
* ``kwargs``: Keyword arguments for ``_func``

``SubStep`` objects must have unique names. For a detailed example of a
``SubStepCheck`` subclass see ``GnGenNinja`` in ``build.py``.
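
To make the shape of a substep concrete, here is a minimal standard-library
sketch. The ``SubStep`` class below is a stand-in that only mirrors the four
members listed above; it is not the ``pw_presubmit`` class. ``BuildAndTest``,
``run_tool``, and ``run_substeps`` are hypothetical, and the sketch assumes the
substep function receives the context as its first argument.

.. code-block:: python

  """Illustrative only: a stand-in for substeps with unique names."""
  from dataclasses import dataclass, field
  from typing import Any, Callable, Dict, Iterator, List, Sequence


  @dataclass
  class SubStep:
      """Stand-in with the four documented members (not the real class)."""
      name: str
      _func: Callable[..., Any]
      args: Sequence[Any] = ()
      kwargs: Dict[str, Any] = field(default_factory=dict)

      def __call__(self, ctx: Any) -> Any:
          # Assumption: the context is passed ahead of args and kwargs.
          return self._func(ctx, *self.args, **self.kwargs)


  def run_tool(ctx: List, tool: str, jobs: int = 1) -> None:
      """Hypothetical substep body; records what would have been run."""
      ctx.append((tool, jobs))


  class BuildAndTest:
      """Hypothetical step; a real one would subclass ``SubStepCheck``."""

      def substeps(self) -> Iterator[SubStep]:
          yield SubStep('configure', run_tool, args=('configure',))
          yield SubStep('compile', run_tool, args=('compile',),
                        kwargs={'jobs': 4})
          yield SubStep('test', run_tool, args=('test',))


  def run_substeps(step: BuildAndTest, ctx: List) -> List[str]:
      """Runs each substep in order, rejecting duplicate names."""
      seen: List[str] = []
      for substep in step.substeps():
          if substep.name in seen:
              raise ValueError(f'duplicate substep name: {substep.name}')
          seen.append(substep.name)
          substep(ctx)
      return seen


  log: List = []
  names = run_substeps(BuildAndTest(), log)

In this sketch a plain list stands in for the context so the recorded calls
can be inspected after the run.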

Existing Presubmit Checks
-------------------------
A small number of presubmit checks are made available through ``pw_presubmit``
modules.

Code Formatting
^^^^^^^^^^^^^^^
Formatting checks for a variety of languages are available from
``pw_presubmit.format_code``. These include C/C++, Java, Go, Python, GN, and
others. All of these checks can be included by adding
``pw_presubmit.format_code.presubmit_checks()`` to a presubmit program. These
all use language-specific formatters like clang-format or black.

These will suggest fixes using ``pw format --fix``.

Options for code formatting can be specified in the ``pigweed.json`` file
(see also :ref:`SEED-0101 <seed-0101>`). These apply to both ``pw presubmit``
steps that check code formatting and ``pw format`` commands that either check
or fix code formatting.

* ``python_formatter``: Choice of Python formatter. Options are ``black`` (used
  by Pigweed itself) and ``yapf`` (the default).
* ``black_path``: If ``python_formatter`` is ``black``, use this as the
  executable instead of ``black``.

.. TODO(b/264578594) Add exclude to pigweed.json file.
.. * ``exclude``: List of path regular expressions to ignore.

Example section from a ``pigweed.json`` file:

.. code-block::

  {
    "pw": {
      "pw_presubmit": {
        "format": {
          "python_formatter": "black",
          "black_path": "black"
        }
      }
    }
  }
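
For readers who want to inspect these options from their own tooling, the
following sketch reads the ``format`` section with the standard ``json``
module and applies the defaults described above. ``load_format_options`` is a
hypothetical helper written for this illustration; it is not part of
``pw_presubmit``.

.. code-block:: python

  """Illustrative only: read the format options from pigweed.json text."""
  import json
  from typing import Dict

  EXAMPLE = """
  {
    "pw": {
      "pw_presubmit": {
        "format": {
          "python_formatter": "black",
          "black_path": "black"
        }
      }
    }
  }
  """


  def load_format_options(text: str) -> Dict[str, str]:
      """Returns the formatter options, filling in the documented defaults."""
      config = json.loads(text)
      options = config.get('pw', {}).get('pw_presubmit', {}).get('format', {})
      # yapf is the documented default when no formatter is named.
      formatter = options.get('python_formatter', 'yapf')
      result = {'python_formatter': formatter}
      if formatter == 'black':
          # black_path only matters when black is the chosen formatter.
          result['black_path'] = options.get('black_path', 'black')
      return result


  print(load_format_options(EXAMPLE))
  print(load_format_options('{}'))

The second call shows the fallback for a file with no ``pw_presubmit``
section at all.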

Sorted Blocks
^^^^^^^^^^^^^
Blocks of code can be required to be kept in sorted order using comments like
the following:

.. code-block::

  # keep-sorted: start
  bar
  baz
  foo
  # keep-sorted: end

This can be included by adding ``pw_presubmit.keep_sorted.presubmit_check`` to a
presubmit program. Adding ``ignore-case`` to the start line will use
case-insensitive sorting.

By default, duplicates will be removed. Lines that are identical except in case
are preserved, even with ``ignore-case``. To allow duplicates, add
``allow-dupes`` to the start line.
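
The interaction between ``ignore-case`` and ``allow-dupes`` can be sketched
with a few lines of plain Python. This is a simplified model of the rules just
described, not the ``pw_presubmit.keep_sorted`` implementation; ``sort_block``
is a hypothetical name, and breaking case-insensitive ties by the original
spelling is an assumption made here to keep the output deterministic.

.. code-block:: python

  """Illustrative only: a simplified model of the keep-sorted rules."""
  from typing import Iterable, List


  def sort_block(lines: Iterable[str], ignore_case: bool = False,
                 allow_dupes: bool = False) -> List[str]:
      """Sorts the lines of one block, dropping exact duplicates by default."""
      items = list(lines)
      if not allow_dupes:
          # Only exact duplicates are removed; 'Foo' and 'foo' both survive.
          items = list(dict.fromkeys(items))
      if ignore_case:
          # Assumption: ties between case variants fall back to the raw text.
          return sorted(items, key=lambda line: (line.lower(), line))
      return sorted(items)


  default_order = sort_block(['foo', 'bar', 'foo', 'Baz'])
  caseless_order = sort_block(['foo', 'bar', 'foo', 'Baz'], ignore_case=True)
  with_dupes = sort_block(['foo', 'bar', 'foo'], allow_dupes=True)

With the default settings uppercase letters sort before lowercase ones, so
``Baz`` comes first; with ``ignore_case=True`` it lands between ``bar`` and
``foo``.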

Prefixes can be ignored by adding ``ignore-prefix=`` followed by a
comma-separated list of prefixes. The list below will be kept in this order.
Neither commas nor whitespace are supported in prefixes.

.. code-block::

  # keep-sorted: start ignore-prefix=',"
  'bar',
  "baz",
  'foo',
  # keep-sorted: end

Inline comments are assumed to be associated with the following line. For
example, the following is already sorted. This can be disabled with
``sticky-comments=no``.

.. todo-check: disable

.. code-block::

  # keep-sorted: start
  # TODO(b/1234) Fix this.
  bar,
  # TODO(b/5678) Also fix this.
  foo,
  # keep-sorted: end

.. todo-check: enable

By default, the prefix of the keep-sorted line is assumed to be the comment
marker used by any inline comments. This can be overridden by adding lines like
``sticky-comments=%,#`` to the start line.

Lines indented more than the preceding line are assumed to be continuations.
Thus, the following block is already sorted. keep-sorted blocks cannot be
nested, so there's no ability to add a keep-sorted block for the sub-items.

.. code-block::

  # keep-sorted: start
  * abc
    * xyz
    * uvw
  * def
  # keep-sorted: end

The presubmit check will suggest fixes using ``pw keep-sorted --fix``.

Future versions may support additional multiline list items.

.gitmodules
^^^^^^^^^^^
Various rules can be applied to .gitmodules files. This check can be included
by adding ``pw_presubmit.gitmodules.create()`` to a presubmit program. This
function takes an optional argument of type ``pw_presubmit.gitmodules.Config``.
``Config`` objects have several properties.

* ``allow_non_googlesource_hosts: bool = False``: If false, all submodule URLs
  must be on a Google-managed Gerrit server.
* ``allowed_googlesource_hosts: Sequence[str] = ()``: If set, any
  Google-managed Gerrit URLs for submodules must be in this list. Entries
  should be like ``pigweed`` for ``pigweed-review.googlesource.com``.
* ``require_relative_urls: bool = False``: If true, all submodules must be
  relative to the superproject remote.
* ``allow_sso: bool = True``: If false, ``sso://`` and ``rpc://`` submodule
  URLs are prohibited.
* ``allow_git_corp_google_com: bool = True``: If false, ``git.corp.google.com``
  submodule URLs are prohibited.
* ``require_branch: bool = False``: If true, all submodules must reference a
  branch.
* ``validator: Callable[[PresubmitContext, Path, str, Dict[str, str]], None] = None``:
  A function that can be used for arbitrary submodule validation. It's called
  with the ``PresubmitContext``, the path to the ``.gitmodules`` file, the name
  of the current submodule, and the properties of the current submodule.

#pragma once
^^^^^^^^^^^^
There's a ``pragma_once`` check that confirms the first non-comment line of
C/C++ headers is ``#pragma once``. This is enabled by adding
``pw_presubmit.cpp_checks.pragma_once`` to a presubmit program.
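
The rule itself is easy to model. The sketch below is a simplified
standard-library approximation, not the ``cpp_checks.pragma_once``
implementation: it skips blank lines, ``//`` comments, and ``/* ... */``
blocks, and assumes comments never share a line with code.
``starts_with_pragma_once`` is a hypothetical name.

.. code-block:: python

  """Illustrative only: is the first non-comment line '#pragma once'?"""
  from typing import Iterable


  def starts_with_pragma_once(lines: Iterable[str]) -> bool:
      """Returns True if the first non-comment line is '#pragma once'."""
      in_block_comment = False
      for raw_line in lines:
          line = raw_line.strip()
          if in_block_comment:
              # Stay in the block comment until its closing marker appears.
              if '*/' in line:
                  in_block_comment = False
              continue
          if not line or line.startswith('//'):
              continue
          if line.startswith('/*'):
              in_block_comment = '*/' not in line
              continue
          # First line that is neither blank nor a comment decides the result.
          return line == '#pragma once'
      return False


  good_header = ['// Copyright notice', '', '#pragma once', '#include <cstdint>']
  bad_header = ['#include <cstdint>', '#pragma once']
  print(starts_with_pragma_once(good_header))
  print(starts_with_pragma_once(bad_header))

A header that places an ``#include`` ahead of the pragma fails, which matches
the "first non-comment line" wording above.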

.. todo-check: disable

TODO(b/###) Formatting
^^^^^^^^^^^^^^^^^^^^^^^^^
There's a check that confirms ``TODO`` lines match a given format. Upstream
Pigweed expects these to look like ``TODO(b/###): Explanation``, but makes it
easy for projects to define their own pattern instead.

To use this check add ``todo_check.create(todo_check.BUGS_OR_USERNAMES)`` to a
presubmit program.
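
As an illustration of what "match a given format" means, the sketch below
flags ``TODO`` lines that do not follow the ``TODO(b/###): Explanation`` style.
The regular expression is written for this example and is not the pattern
``todo_check`` ships with; ``TODO_PATTERN`` and ``find_bad_todos`` are
hypothetical names.

.. code-block:: python

  """Illustrative only: flag TODO lines that miss the expected format."""
  import re
  from typing import Iterable, List, Tuple

  # Hypothetical pattern: TODO, a bug number in parentheses, a colon, text.
  TODO_PATTERN = re.compile(r'\bTODO\(b/\d+\): \S')


  def find_bad_todos(lines: Iterable[str]) -> List[Tuple[int, str]]:
      """Returns (line number, text) for each TODO that misses the format."""
      bad = []
      for number, line in enumerate(lines, start=1):
          if 'TODO' in line and not TODO_PATTERN.search(line):
              bad.append((number, line))
      return bad


  sample = [
      '# TODO(b/1234): Fix this.',
      '# TODO: fix this later',
      '# TODO(b/5678) Missing colon.',
      'value = 1',
  ]
  problems = find_bad_todos(sample)

A project with a different convention would swap in its own pattern while
keeping the same line-by-line scan.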

.. todo-check: enable

Python Checks
^^^^^^^^^^^^^
There are two checks in the ``pw_presubmit.python_checks`` module, ``gn_pylint``
and ``gn_python_check``. They assume there's a top-level ``python`` GN target.
``gn_pylint`` runs Pylint and Mypy checks and ``gn_python_check`` runs Pylint,
Mypy, and all Python tests.

Inclusive Language
^^^^^^^^^^^^^^^^^^
.. inclusive-language: disable

The inclusive language check looks for words that are typical of non-inclusive
code, like using "master" and "slave" in place of "primary" and "secondary" or
"sanity check" in place of "consistency check".

.. inclusive-language: enable

These checks can be disabled for individual lines with
"inclusive-language: ignore" on the line in question or the line above it, or
for entire blocks by using "inclusive-language: disable" before the block and
"inclusive-language: enable" after the block.

.. In case things get moved around in the previous paragraphs the enable line
.. is repeated here: inclusive-language: enable.

OWNERS
^^^^^^
There's a check that requires folders matching specific patterns contain
``OWNERS`` files. It can be included by adding
``module_owners.presubmit_check()`` to a presubmit program. This function takes
a callable as an argument that indicates, for a given file, where a controlling
``OWNERS`` file should be, or returns None if no ``OWNERS`` file is necessary.
Formatting of ``OWNERS`` files is handled similarly to formatting of other
source files and is discussed in `Code Formatting`.
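
A callable of that kind might look like the sketch below. The directory layout
(``modules/<name>/...``), the name ``owners_for``, and the use of
repository-relative paths are all assumptions made for illustration; the exact
signature expected by ``module_owners.presubmit_check()`` should be confirmed
against the module itself.

.. code-block:: python

  """Illustrative only: map a file to the OWNERS file that should cover it."""
  from pathlib import Path
  from typing import Optional


  def owners_for(path: Path) -> Optional[Path]:
      """Files under modules/<name>/ need modules/<name>/OWNERS."""
      parts = path.parts
      # Assumption: paths are relative to the repository root.
      if len(parts) >= 3 and parts[0] == 'modules':
          return Path(parts[0], parts[1], 'OWNERS')
      # Anything outside a module directory needs no OWNERS file.
      return None


  print(owners_for(Path('modules/foo/src/a.cc')))
  print(owners_for(Path('README.md')))

Returning ``None`` for files outside ``modules/`` is how this sketch expresses
"no ``OWNERS`` file is necessary".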

Source in Build
^^^^^^^^^^^^^^^
Pigweed provides checks that source files are configured as part of the build
for GN, Bazel, and CMake. These can be included by adding
``source_in_build.gn(filter)`` and similar functions to a presubmit program. The
CMake check additionally requires a callable that invokes CMake with appropriate
options.

pw_presubmit
------------
.. automodule:: pw_presubmit
   :members: filter_paths, FileFilter, call, PresubmitFailure, Programs

Git hook
--------
You can run a presubmit program or step as a `git hook
<https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks>`_ using
``pw_presubmit.install_hook``.  This can be used to run certain presubmit
checks before a change is pushed to a remote.

We strongly recommend that you only run fast (< 15 seconds) and trivial checks
as push hooks, and perform slower or more complex ones in CI. This is because:

* Running slow checks in the push hook will force you to wait longer for
  ``git push`` to complete, and
* If your change fails one of the checks at this stage, it will not yet be
  uploaded to the remote, so you'll have a harder time debugging any failures
  (sharing the change with your colleagues, linking to it from an issue
  tracker, etc.).

.. _example-script:

Example
=======
A simple example presubmit check script follows. This can be copied and pasted
to serve as a starting point for a project's presubmit check script.

See ``pigweed_presubmit.py`` for a more complex presubmit check script example.

.. code-block:: python

  """Example presubmit check script."""

  import argparse
  import logging
  import os
  from pathlib import Path
  import re
  import sys
  from typing import List, Optional, Pattern

  try:
      import pw_cli.log
  except ImportError:
      print('ERROR: Activate the environment before running presubmits!',
            file=sys.stderr)
      sys.exit(2)

  import pw_presubmit
  from pw_presubmit import (
      build,
      cli,
      cpp_checks,
      environment,
      format_code,
      git_repo,
      inclusive_language,
      filter_paths,
      python_checks,
      PresubmitContext,
  )
  from pw_presubmit.install_hook import install_git_hook

  # Set up variables for key project paths.
  PROJECT_ROOT = Path(os.environ['MY_PROJECT_ROOT'])
  PIGWEED_ROOT = PROJECT_ROOT / 'pigweed'

  # Rerun the build if files with these extensions change.
  _BUILD_EXTENSIONS = frozenset(
      ['.rst', '.gn', '.gni', *format_code.C_FORMAT.extensions])


  #
  # Presubmit checks
  #
  def release_build(ctx: PresubmitContext):
      build.gn_gen(ctx, build_type='release')
      build.ninja(ctx)


  def host_tests(ctx: PresubmitContext):
      build.gn_gen(ctx, run_host_tests='true')
      build.ninja(ctx)


  # Avoid running some checks on certain paths.
  PATH_EXCLUSIONS = (
      re.compile(r'^external/'),
      re.compile(r'^vendor/'),
  )


  # Use the upstream pragma_once check, but apply a different set of path
  # filters with @filter_paths.
  @filter_paths(endswith='.h', exclude=PATH_EXCLUSIONS)
  def pragma_once(ctx: PresubmitContext):
      cpp_checks.pragma_once(ctx)


  #
  # Presubmit check programs
  #
  OTHER = (
      # Checks not run by default but that should be available. These might
      # include tests that are expensive to run or that don't yet pass.
      build.gn_quick_check,
  )

  QUICK = (
      # List some presubmit checks to run
      pragma_once,
      host_tests,
      # Use the upstream formatting checks, with custom path filters applied.
      format_code.presubmit_checks(exclude=PATH_EXCLUSIONS),
      # Include the upstream inclusive language check.
      inclusive_language.presubmit_check,
      # Include just the lint-related Python checks.
      python_checks.gn_pylint.with_filter(exclude=PATH_EXCLUSIONS),
  )

  FULL = (
      QUICK,  # Add all checks from the 'quick' program
      release_build,
      # Use the upstream Python checks, with custom path filters applied.
      # Checks listed multiple times are only run once.
      python_checks.gn_python_check.with_filter(exclude=PATH_EXCLUSIONS),
  )

  PROGRAMS = pw_presubmit.Programs(other=OTHER, quick=QUICK, full=FULL)


  #
  # Allowlist of remote refs for presubmit. If the remote ref being pushed to
  # matches any of these values (with regex matching), then the presubmit
  # checks will be run before pushing.
  #
  PRE_PUSH_REMOTE_REF_ALLOWLIST = (
      'refs/for/main',
  )


  def run(install: bool, remote_ref: Optional[str], **presubmit_args) -> int:
      """Process the --install argument then invoke pw_presubmit."""

      # Install the presubmit Git pre-push hook, if requested.
      if install:
          # '$remote_ref' will be replaced by the actual value of the remote ref
          # at runtime.
          install_git_hook('pre-push', [
              'python', '-m', 'tools.presubmit_check', '--base', 'HEAD~',
              '--remote-ref', '$remote_ref'
          ])
          return 0

      # Run the checks if either no remote_ref was passed, or if the remote ref
      # matches anything in the allowlist.
      if remote_ref is None or any(
              re.search(pattern, remote_ref)
              for pattern in PRE_PUSH_REMOTE_REF_ALLOWLIST):
          return cli.run(root=PROJECT_ROOT, **presubmit_args)

      # The remote ref is not in the allowlist, so there is nothing to check.
      return 0


  def main() -> int:
      """Run the presubmit checks for this repository."""
      parser = argparse.ArgumentParser(description=__doc__)
      cli.add_arguments(parser, PROGRAMS, 'quick')

      # Define an option for installing a Git pre-push hook for this script.
      parser.add_argument(
          '--install',
          action='store_true',
          help='Install the presubmit as a Git pre-push hook and exit.')

      # Define an optional flag to pass the remote ref into this script, if it
      # is run as a pre-push hook. The destination variable in the parsed args
      # will be `remote_ref`, as dashes are replaced with underscores to make
      # valid variable names.
      parser.add_argument(
          '--remote-ref',
          default=None,
          nargs='?',  # Make optional.
          help='Remote ref of the push command, for use by the pre-push hook.')

      return run(**vars(parser.parse_args()))

  if __name__ == '__main__':
      pw_cli.log.install(logging.INFO)
      sys.exit(main())

---------------------
Code formatting tools
---------------------
The ``pw_presubmit.format_code`` module formats supported source files using
external code format tools. The file ``format_code.py`` can be invoked directly
from the command line or from ``pw`` as ``pw format``.

Example
=======
A simple example of adding support for a custom format follows. This code wraps
the built-in formatter to add a new format. It could also be used to replace
a formatter or remove/disable a Pigweed-supplied one.

.. code-block:: python

  #!/usr/bin/env python
  """Formats files in repository."""

  import logging
  import sys

  import pw_cli.log
  from pw_presubmit import FileFilter, format_code
  from your_project import presubmit_checks
  from your_project import your_check

  # Describe the new format: which files it covers and how to check/fix them.
  YOUR_CODE_FORMAT = format_code.CodeFormat(
      'YourFormat',
      filter=FileFilter(suffix=('.your',)),
      check=your_check.check,
      fix=your_check.fix,
  )

  # Extend the built-in formats with the project-specific one.
  CODE_FORMATS = (*format_code.CODE_FORMATS, YOUR_CODE_FORMAT)

  def _run(exclude, **kwargs) -> int:
      """Check and fix formatting for source files in the repo."""
      return format_code.format_paths_in_repo(exclude=exclude,
                                              code_formats=CODE_FORMATS,
                                              **kwargs)


  def main():
      return _run(**vars(format_code.arguments(git_paths=True).parse_args()))


  if __name__ == '__main__':
      pw_cli.log.install(logging.INFO)
      sys.exit(main())