.. _module-pw_bloat:

========
pw_bloat
========
.. pigweed-module::
   :name: pw_bloat

``pw_bloat`` provides tools and helpers around using
`Bloaty McBloatface <https://github.com/google/bloaty>`_ including generating
size report cards for output binaries through :ref:`Pigweed's GN build
system <module-pw_build-gn>`.

Bloat report cards allow tracking the memory usage of a system over time as code
changes are made and provide a breakdown of which parts of the code have the
largest size impact.

------------------------
``pw bloat`` CLI command
------------------------
``pw_bloat`` includes a plugin for the Pigweed command line capable of running
size reports on ELF binaries.

.. note::

   The bloat CLI plugin is still experimental and only supports a small subset
   of ``pw_bloat``'s capabilities. Notably, it currently only runs on binaries
   which define memory region symbols; refer to the
   :ref:`memoryregions documentation <module-pw_bloat-memoryregions>`
   for details.

Basic usage
===========

Running a size report on a single executable
--------------------------------------------

.. code-block:: sh

   $ pw bloat out/docs/obj/pw_result/size_report/bin/ladder_and_then.elf

    ▒█████▄   █▓  ▄███▒  ▒█    ▒█ ░▓████▒ ░▓████▒ ▒▓████▄
     ▒█░  █░ ░█▒ ██▒ ▀█▒ ▒█░ █ ▒█  ▒█   ▀  ▒█   ▀  ▒█  ▀█▌
     ▒█▄▄▄█░ ░█▒ █▓░ ▄▄░ ▒█░ █ ▒█  ▒███    ▒███    ░█   █▌
     ▒█▀     ░█░ ▓█   █▓ ░█░ █ ▒█  ▒█   ▄  ▒█   ▄  ░█  ▄█▌
     ▒█      ░█░ ░▓███▀   ▒█▓▀▓█░ ░▓████▒ ░▓████▒ ▒▓████▀

   +----------------------+---------+
   |    memoryregions     |  sizes  |
   +======================+=========+
   |FLASH                 |1,048,064|
   |RAM                   |  196,608|
   |VECTOR_TABLE          |      512|
   +======================+=========+
   |Total                 |1,245,184|
   +----------------------+---------+

Running a size report diff
--------------------------

.. code-block:: sh

   $ pw bloat out/docs/obj/pw_metric/size_report/bin/one_metric.elf \
       --diff out/docs/obj/pw_metric/size_report/bin/base.elf \
       -d symbols

    ▒█████▄   █▓  ▄███▒  ▒█    ▒█ ░▓████▒ ░▓████▒ ▒▓████▄
     ▒█░  █░ ░█▒ ██▒ ▀█▒ ▒█░ █ ▒█  ▒█   ▀  ▒█   ▀  ▒█  ▀█▌
     ▒█▄▄▄█░ ░█▒ █▓░ ▄▄░ ▒█░ █ ▒█  ▒███    ▒███    ░█   █▌
     ▒█▀     ░█░ ▓█   █▓ ░█░ █ ▒█  ▒█   ▄  ▒█   ▄  ░█  ▄█▌
     ▒█      ░█░ ░▓███▀   ▒█▓▀▓█░ ░▓████▒ ░▓████▒ ▒▓████▀

   +-----------------------------------------------------------------------------------+
   |                                                                                   |
   +-----------------------------------------------------------------------------------+
   | diff|    memoryregions     |                    symbols                    | sizes|
   +=====+======================+===============================================+======+
   |     |FLASH                 |                                               |    -4|
   |     |                      |[section .FLASH.unused_space]                  |  -408|
   |     |                      |main                                           |   +60|
   |     |                      |__sf_fake_stdout                               |    +4|
   |     |                      |pw_boot_PreStaticMemoryInit                    |    -2|
   |     |                      |_isatty                                        |    -2|
   |  NEW|                      |_GLOBAL__sub_I_group_foo                       |   +84|
   |  NEW|                      |pw::metric::Group::~Group()                    |   +34|
   |  NEW|                      |pw::intrusive_list_impl::List::insert_after()  |   +32|
   |  NEW|                      |pw::metric::Metric::Increment()                |   +32|
   |  NEW|                      |__cxa_atexit                                   |   +28|
   |  NEW|                      |pw::metric::Metric::Metric()                   |   +28|
   |  NEW|                      |pw::metric::Metric::as_int()                   |   +28|
   |  NEW|                      |pw::intrusive_list_impl::List::Item::unlist()  |   +20|
   |  NEW|                      |pw::metric::Group::Group()                     |   +18|
   |  NEW|                      |pw::intrusive_list_impl::List::Item::previous()|   +14|
   |  NEW|                      |pw::metric::TypedMetric<>::~TypedMetric()      |   +14|
   |  NEW|                      |__aeabi_atexit                                 |   +12|
   +-----+----------------------+-----------------------------------------------+------+
   |     |RAM                   |                                               |     0|
   |     |                      |[section .stack]                               |   -32|
   |  NEW|                      |group_foo                                      |   +16|
   |  NEW|                      |metric_x                                       |   +12|
   |  NEW|                      |[section .static_init_ram]                     |    +4|
   +=====+======================+===============================================+======+
   |Total|                      |                                               |    -4|
   +-----+----------------------+-----------------------------------------------+------+


.. _bloat-howto:

---------------------------
Defining size reports in GN
---------------------------

Diff size reports
=================
Size reports can be defined using the GN template ``pw_size_diff``. The template
requires at least two executable targets on which to perform a size diff. The
base for the size diff can be specified either globally through the top-level
``base`` argument, or individually per-binary within the ``binaries`` list.

Arguments
---------

* ``base``: Optional default base target for all listed binaries.
* ``source_filter``: Optional global regex to filter labels in the diff output.
* ``data_sources``: Optional global list of data sources from the Bloaty config
  file.
* ``binaries``: List of binaries to size diff. Each binary specifies a target,
  a label for the diff, and optionally a base target, source filter, and data
  sources that override the global ones (if specified).


.. code-block::

   import("$dir_pw_bloat/bloat.gni")

   executable("empty_base") {
     sources = [ "empty_main.cc" ]
   }

   executable("hello_world_printf") {
     sources = [ "hello_printf.cc" ]
   }

   executable("hello_world_iostream") {
     sources = [ "hello_iostream.cc" ]
   }

   pw_size_diff("my_size_report") {
     base = ":empty_base"
     data_sources = "symbols,segments"
     binaries = [
       {
         target = ":hello_world_printf"
         label = "Hello world using printf"
       },
       {
         target = ":hello_world_iostream"
         label = "Hello world using iostream"
         data_sources = "symbols"
       },
     ]
   }

A sample ``pw_size_diff`` reStructuredText size report table can be found
within module docs. For example, see the :ref:`pw_checksum-size-report`
section of the ``pw_checksum`` module for more detail.
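
The precedence between the global arguments and the per-binary overrides
described above can be sketched in Python. This is only an illustration of the
rule, not the GN template's implementation: ``BinaryEntry`` and
``resolve_binary`` are hypothetical names, and raising an error when neither a
global nor a per-binary base is given is an assumption.

.. code-block:: python

   from dataclasses import dataclass
   from typing import Optional


   @dataclass
   class BinaryEntry:
       """Hypothetical mirror of one scope in the ``binaries`` list."""

       target: str
       label: str
       base: Optional[str] = None
       data_sources: Optional[str] = None
       source_filter: Optional[str] = None


   def resolve_binary(entry, base=None, data_sources=None, source_filter=None):
       """Returns the effective settings; per-binary values win over globals."""
       effective_base = entry.base or base
       if effective_base is None:
           # Assumption: a diff without any base target is treated as an error.
           raise ValueError(f"no base target for {entry.target}")
       return {
           "target": entry.target,
           "label": entry.label,
           "base": effective_base,
           "data_sources": entry.data_sources or data_sources,
           "source_filter": entry.source_filter or source_filter,
       }


   # Mirrors the second binary of "my_size_report" in the GN example.
   iostream = BinaryEntry(
       target=":hello_world_iostream",
       label="Hello world using iostream",
       data_sources="symbols",
   )
   settings = resolve_binary(
       iostream, base=":empty_base", data_sources="symbols,segments"
   )

Here ``settings["base"]`` falls back to the global ``":empty_base"``, while
``settings["data_sources"]`` is the per-binary ``"symbols"``.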

Single binary size reports
==========================
Size reports can also be defined using ``pw_size_report``, which provides
a size report for a single binary. The template requires a target binary.

Arguments
---------
* ``target``: Binary target to run size report on.
* ``data_sources``: Optional list of data sources to organize outputs.
* ``source_filter``: Optional regex to filter labels in the output.
* ``json_key_prefix``: Optional prefix for key names in the JSON size report.
* ``full_json_summary``: Optional boolean to print the JSON size report by
  label level hierarchy. Defaults to only using the top-level label in the size
  report.
* ``ignore_unused_labels``: Optional boolean to remove labels that have a size
  of zero in the JSON size report.

.. code-block::

   import("$dir_pw_bloat/bloat.gni")

   executable("hello_world_iostream") {
     sources = [ "hello_iostream.cc" ]
   }

   pw_size_report("hello_world_iostream_size_report") {
     target = ":hello_world_iostream"
     data_sources = "segments,symbols"
     source_filter = "pw::hello"
     json_key_prefix = "hello_world_iostream"
     full_json_summary = true
     ignore_unused_labels = true
   }

Example of the generated ASCII table for a single binary:

.. code-block::

   ┌─────────────┬──────────────────────────────────────────────────┬──────┐
   │segment_names│                     symbols                      │ sizes│
   ├═════════════┼══════════════════════════════════════════════════┼══════┤
   │FLASH        │                                                  │12,072│
   │             │pw::kvs::KeyValueStore::InitializeMetadata()      │   684│
   │             │pw::kvs::KeyValueStore::Init()                    │   456│
   │             │pw::kvs::internal::EntryCache::Find()             │   444│
   │             │pw::kvs::FakeFlashMemory::Write()                 │   240│
   │             │pw::kvs::internal::Entry::VerifyChecksumInFlash() │   228│
   │             │pw::kvs::KeyValueStore::GarbageCollectSector()    │   220│
   │             │pw::kvs::KeyValueStore::RemoveDeletedKeyEntries() │   220│
   │             │pw::kvs::KeyValueStore::AppendEntry()             │   204│
   │             │pw::kvs::KeyValueStore::Get()                     │   194│
   │             │pw::kvs::internal::Entry::Read()                  │   188│
   │             │pw::kvs::ChecksumAlgorithm::Finish()              │    26│
   │             │pw::kvs::internal::Entry::ReadKey()               │    26│
   │             │pw::kvs::internal::Sectors::BaseAddress()         │    24│
   │             │pw::kvs::ChecksumAlgorithm::Update()              │    20│
   │             │pw::kvs::FlashTestPartition()                     │     8│
   │             │pw::kvs::FakeFlashMemory::Disable()               │     6│
   │             │pw::kvs::FakeFlashMemory::Enable()                │     6│
   │             │pw::kvs::FlashMemory::SelfTest()                  │     6│
   │             │pw::kvs::FlashPartition::Init()                   │     6│
   │             │pw::kvs::FlashPartition::sector_size_bytes()      │     6│
   │             │pw::kvs::FakeFlashMemory::IsEnabled()             │     4│
   ├─────────────┼──────────────────────────────────────────────────┼──────┤
   │RAM          │                                                  │ 1,424│
   │             │test_kvs                                          │   992│
   │             │pw::kvs::(anonymous namespace)::test_flash        │   384│
   │             │pw::kvs::(anonymous namespace)::test_partition    │    24│
   │             │pw::kvs::FakeFlashMemory::no_errors_              │    12│
   │             │borrowable_kvs                                    │     8│
   │             │kvs_entry_count                                   │     4│
   ├═════════════┼══════════════════════════════════════════════════┼══════┤
   │Total        │                                                  │13,496│
   └─────────────┴──────────────────────────────────────────────────┴──────┘


Size reports are typically included in reStructuredText, as described in
`Documentation integration`_.
Size reports may also be printed in the build
output if desired. To enable this in the GN build
(``pigweed/pw_bloat/bloat.gni``), set the ``pw_bloat_SHOW_SIZE_REPORTS``
build arg to ``true``.

Collecting size report data
===========================
Each ``pw_size_report`` target outputs a JSON file containing the sizes of all
top-level labels in the binary. (By default, this represents "segments", i.e.
ELF program headers.) If ``full_json_summary`` is set to true, sizes for all
label levels are reported (i.e. default labels would show the size of each
symbol per segment). If a build produces multiple images, it may be useful to
collect all of their sizes into a single file to provide a snapshot of sizes at
some point in time, for example to display per-commit size deltas through CI.

The ``pw_size_report_aggregation`` template is provided to collect multiple size
reports' data into a single JSON file.

Arguments
---------
* ``deps``: List of ``pw_size_report`` targets whose data to collect.
* ``output``: Path to the output JSON file.

.. code-block::

   import("$dir_pw_bloat/bloat.gni")

   pw_size_report_aggregation("image_sizes") {
     deps = [
       ":app_image_size_report",
       ":bootloader_image_size_report",
     ]
     output = "$root_gen_dir/artifacts/image_sizes.json"
   }

.. _module-pw_bloat-docs:

-------------------------
Documentation integration
-------------------------
Bloat reports are easy to add to documentation files. All ``pw_size_diff``
and ``pw_size_report`` targets output a file containing a tabular report card.
This file can be imported directly into a reStructuredText file using the
``include`` directive.

For example, the ``simple_bloat_loop`` and ``simple_bloat_function`` size
reports under ``//pw_bloat/examples`` are imported into this file as follows:

.. code-block:: rst

   Simple bloat loop example
   ^^^^^^^^^^^^^^^^^^^^^^^^^
   .. include:: examples/simple_bloat_loop

   Simple bloat function example
   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   .. include:: examples/simple_bloat_function

Resulting in this output:

Simple bloat loop example
=========================
.. include:: examples/simple_bloat_loop

Simple bloat function example
=============================
.. include:: examples/simple_bloat_function

.. _module-pw_bloat-sources:

------------------------------
Additional Bloaty data sources
------------------------------
`Bloaty McBloatface <https://github.com/google/bloaty>`_ by itself cannot
answer some questions which embedded developers frequently face, such as
how much space is left. To address this, Pigweed provides Python
tooling (``pw_bloat.bloaty_config``) to generate bloaty configuration files
based on the final ELF files through small tweaks in the linker scripts to
expose extra information.

See the sections below on how to enable the additional data sections through
modifications in your linker script(s).

As an example, to generate the helper configuration which enables the
additional data sources for ``example.elf`` (assuming you've updated your
linker script(s) accordingly), run
``python -m pw_bloat.bloaty_config example.elf > example.bloaty``. The
resulting ``example.bloaty`` can then be passed to bloaty with the ``-c`` flag,
for example
``bloaty -c example.bloaty example.elf --domain vm -d memoryregions,utilization``,
which may return something like:

.. code-block::

   84.2%  1023Ki    FLASH
     94.2%   963Ki    Free space
      5.8%  59.6Ki    Used space
   15.8%   192Ki    RAM
    100.0%   192Ki    Used space
    0.0%     512    VECTOR_TABLE
     96.9%     496    Free space
      3.1%      16    Used space
    0.0%       0    Not resident in memory
      NAN%       0    Used space

.. _module-pw_bloat-utilization:

``utilization`` data source
===========================
A common question embedded developers face when using ``bloaty`` is how much
space is used and how much space is left. To answer this correctly, section
sizes must be used in order to account for section alignment requirements.

The generated ``utilization`` data source will work with any ELF file, where
``Used Space`` reports the sum of the virtual memory sizes of all sections.
``Padding`` captures the amount of memory that is used to enforce alignment
requirements. Tracking ``Padding`` size can help monitor application growth
for changes that are too small to force realignment.

In order for ``Free Space`` to be reported, your linker scripts must include
properly aligned sections which span the unused remaining space for the relevant
memory region with the ``unused_space`` string anywhere in their name. This
typically means creating a trailing section which is pinned to span to the end
of the memory region.

For example, imagine this partial GNU LD linker script:

.. code-block::

   MEMORY
   {
     FLASH(rx) : \
       ORIGIN = PW_BOOT_FLASH_BEGIN, \
       LENGTH = PW_BOOT_FLASH_SIZE
     RAM(rwx) : \
       ORIGIN = PW_BOOT_RAM_BEGIN, \
       LENGTH = PW_BOOT_RAM_SIZE
   }

   SECTIONS
   {
     /* Main executable code. */
     .code : ALIGN(4)
     {
       /* Application code. */
       *(.text)
       *(.text*)
       KEEP(*(.init))
       KEEP(*(.fini))

       . = ALIGN(4);
       /* Constants.*/
       *(.rodata)
       *(.rodata*)
     } >FLASH

     /* Explicitly initialized global and static data. (.data)*/
     .static_init_ram : ALIGN(4)
     {
       *(.data)
       *(.data*)
       . = ALIGN(4);
     } >RAM AT> FLASH

     /* Zero initialized global/static data. (.bss) */
     .zero_init_ram (NOLOAD) : ALIGN(4)
     {
       *(.bss)
       *(.bss*)
       *(COMMON)
       . = ALIGN(4);
     } >RAM
   }

It could be modified as follows to enable ``Free Space`` reporting:

.. code-block::

   MEMORY
   {
     FLASH(rx) : ORIGIN = PW_BOOT_FLASH_BEGIN, LENGTH = PW_BOOT_FLASH_SIZE
     RAM(rwx) : ORIGIN = PW_BOOT_RAM_BEGIN, LENGTH = PW_BOOT_RAM_SIZE

     /* Each memory region above has an associated .*.unused_space section that
      * overlays the unused space at the end of the memory segment. These
      * segments are used by pw_bloat.bloaty_config to create the utilization
      * data source for bloaty size reports.
      *
      * These sections MUST be located immediately after the last section that is
      * placed in the respective memory region or lld will issue a warning like:
      *
      *   warning: ignoring memory region assignment for non-allocatable section
      *      '.VECTOR_TABLE.unused_space'
      *
      * If this warning occurs, it's also likely that LLD will have created quite
      * large padded regions in the ELF file due to bad cursor operations. This
      * can cause ELF files to balloon from hundreds of kilobytes to hundreds of
      * megabytes.
      *
      * Attempting to add sections to the memory region AFTER the unused_space
      * section will cause the region to overflow.
      */
   }

   SECTIONS
   {
     /* Main executable code. */
     .code : ALIGN(4)
     {
       /* Application code. */
       *(.text)
       *(.text*)
       KEEP(*(.init))
       KEEP(*(.fini))

       . = ALIGN(4);
       /* Constants.*/
       *(.rodata)
       *(.rodata*)
     } >FLASH

     /* Explicitly initialized global and static data. (.data)*/
     .static_init_ram : ALIGN(4)
     {
       *(.data)
       *(.data*)
       . = ALIGN(4);
     } >RAM AT> FLASH

     /* Defines a section representing the unused space in the FLASH segment.
      * This MUST be the last section assigned to the FLASH region.
      */
     PW_BLOAT_UNUSED_SPACE(FLASH)

     /* Zero initialized global/static data. (.bss). */
     .zero_init_ram (NOLOAD) : ALIGN(4)
     {
       *(.bss)
       *(.bss*)
       *(COMMON)
       . = ALIGN(4);
     } >RAM

     /* Defines a section representing the unused space in the RAM segment. This
      * MUST be the last section assigned to the RAM region.
      */
     PW_BLOAT_UNUSED_SPACE(RAM)
   }

The preprocessor macro ``PW_BLOAT_UNUSED_SPACE`` is defined in
``pw_bloat/bloat_macros.ld``. To use this macro, include this file in your
``pw_linker_script`` as follows:

.. code-block::

   pw_linker_script("my_linker_script") {
     includes = [ "$dir_pw_bloat/bloat_macros.ld" ]
     linker_script = "my_project_linker_script.ld"
   }

Note that because linker scripts are not natively supported by GN and can't be
provided through ``deps``, ``bloat_macros.ld`` must be passed in the
``includes`` list.

.. _module-pw_bloat-memoryregions:

``memoryregions`` data source
=============================
Understanding how symbols, sections, and other data sources can be attributed
back to the memory regions defined in your linker script is another common
problem area. Unfortunately the ELF format does not include the original memory
regions, meaning ``bloaty`` cannot do this today by itself. In addition, it's
relatively common that there are multiple memory regions which alias to the same
memory but through different buses, which could make attribution difficult.

Instead of taking the less portable and brittle approach of parsing ``*.map``
files, ``pw_bloat.bloaty_config`` consumes symbols which are defined in the
linker script with a special format to extract this information from the ELF
file: ``pw_bloat_config_memory_region_NAME_{start,end}{_N,}``.

These symbols are defined by the preprocessor macros ``PW_BLOAT_MEMORY_REGION``
and ``PW_BLOAT_MEMORY_REGION_MAP`` with the right address and size for the
regions.
To use these macros, include ``pw_bloat/bloat_macros.ld`` in your
``pw_linker_script`` as follows:

.. code-block::

   pw_linker_script("my_linker_script") {
     includes = [ "$dir_pw_bloat/bloat_macros.ld" ]
     linker_script = "my_project_linker_script.ld"
   }

These symbols are then used to determine how to map segments to these memory
regions. Note that segments must be used in order to account for inter-section
padding, which is not attributed to any section.

As an example, if you have a single view of a single memory region named
``FLASH``, then you should include the following macro in your linker script to
generate the symbols needed for that region:

.. code-block::

   PW_BLOAT_MEMORY_REGION(FLASH)

As another example, if you have two aliased memory regions (``DTCM`` and
``ITCM``) onto the same effective memory that you'd like to call ``RAM``, then
you should include the following macros in your linker script to produce the
four symbols needed (a start and an end symbol for each alias):

.. code-block::

   PW_BLOAT_MEMORY_REGION_MAP(RAM, ITCM)
   PW_BLOAT_MEMORY_REGION_MAP(RAM, DTCM)
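
To make the symbol naming convention concrete, the following Python sketch
groups symbols of the form
``pw_bloat_config_memory_region_NAME_{start,end}{_N,}`` into address ranges per
region. It is an illustration only and not the ``pw_bloat.bloaty_config``
implementation: ``group_memory_regions`` is a hypothetical helper, the
addresses are made up, and the ``_0``/``_1`` suffixes used for the two ``RAM``
aliases are an assumption about how the ``_N`` suffix is numbered.

.. code-block:: python

   import re
   from collections import defaultdict

   # NAME may itself contain underscores (e.g. VECTOR_TABLE); the optional
   # trailing _N distinguishes aliases of the same region.
   _REGION_SYMBOL = re.compile(
       r"^pw_bloat_config_memory_region_"
       r"(?P<name>.+)_(?P<edge>start|end)(?:_(?P<index>\d+))?$"
   )


   def group_memory_regions(symbols):
       """Maps region name -> {alias index -> (start, end)}.

       ``symbols`` is a dict of symbol name -> address. Symbols that do not
       follow the naming convention are ignored. An alias index of None means
       the symbol had no ``_N`` suffix.
       """
       edges = defaultdict(dict)
       for symbol, address in symbols.items():
           match = _REGION_SYMBOL.match(symbol)
           if match is None:
               continue
           index = match["index"]
           key = (match["name"], None if index is None else int(index))
           edges[key][match["edge"]] = address

       regions = defaultdict(dict)
       for (name, index), bounds in edges.items():
           # Only report ranges for which both edges were found.
           if "start" in bounds and "end" in bounds:
               regions[name][index] = (bounds["start"], bounds["end"])
       return {name: dict(ranges) for name, ranges in regions.items()}


   # Made-up addresses, for illustration only.
   example_symbols = {
       "pw_bloat_config_memory_region_FLASH_start": 0x08000200,
       "pw_bloat_config_memory_region_FLASH_end": 0x08100000,
       "pw_bloat_config_memory_region_RAM_start_0": 0x00000000,
       "pw_bloat_config_memory_region_RAM_end_0": 0x00010000,
       "pw_bloat_config_memory_region_RAM_start_1": 0x20000000,
       "pw_bloat_config_memory_region_RAM_end_1": 0x20010000,
       "main": 0x08000400,
   }
   example_regions = group_memory_regions(example_symbols)

With this input, ``example_regions`` contains one range for ``FLASH`` and two
ranges for ``RAM``, one per alias, while the unrelated ``main`` symbol is
skipped.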