.. _module-pw_metric:

=========
pw_metric
=========

.. attention::
   This module is **not yet production ready**; ask us if you are interested in
   trying it out or have ideas about how to improve it.

--------
Overview
--------
Pigweed's metric module is a **lightweight manual instrumentation system** for
tracking system health metrics like counts or set values. For example,
``pw_metric`` could help with tracking the number of I2C bus writes, or the
number of times a buffer was filled before it could drain in time, or safely
incrementing counters from ISRs.

Key features of ``pw_metric``:

- **Tokenized names** - Names are tokenized using ``pw_tokenizer``, enabling
  long metric names that don't bloat your binary.

- **Tree structure** - Metrics can form a tree, enabling grouping of related
  metrics for clearer organization.

- **Per object collection** - Metrics and groups can live on object instances
  and be flexibly combined with metrics from other instances.

- **Global registration** - For legacy code bases or just because it's easier,
  ``pw_metric`` supports automatic aggregation of metrics. This is optional but
  convenient in many cases.

- **Simple design** - There are only two core data structures: ``Metric`` and
  ``Group``, which are both simple to understand and use. The only metric types
  supported are ``uint32_t`` and ``float``. This module does not support
  complicated aggregations like running average or min/max.

Example: Instrumenting a single object
--------------------------------------
The example below illustrates what instrumenting a class with a metric group
and metrics might look like. In this case, the object's
``MySubsystem::metrics()`` member is not globally registered; the user is on
their own for combining this subsystem's metrics with others.

.. code-block::

   #include "pw_metric/metric.h"

   class MySubsystem {
    public:
     void DoSomething() {
       attempts_.Increment();
       if (ActionSucceeds()) {
         successes_.Increment();
       }
     }
     pw::metric::Group& metrics() { return metrics_; }

    private:
     PW_METRIC_GROUP(metrics_, "my_subsystem");
     PW_METRIC(metrics_, attempts_, "attempts", 0u);
     PW_METRIC(metrics_, successes_, "successes", 0u);
   };

The metrics subsystem has no canonical output format at this time, but a JSON
dump might look something like this:

.. code-block:: none

   {
     "my_subsystem" : {
       "successes" : 1000,
       "attempts" : 1200
     }
   }

In this case, every instance of ``MySubsystem`` will have unique counters.

Example: Instrumenting a legacy codebase
----------------------------------------
A common situation in embedded development is **debugging legacy code** or code
which is hard to change, where it is perhaps impossible to plumb metrics
objects around with dependency injection. The alternative to plumbing metrics
is to register the metrics through a global mechanism. ``pw_metric`` supports
this use case. For example:

**Before instrumenting:**

.. code-block::

   // This code was passed down from generations of developers before; no one
   // knows what it does or how it works. But it needs to be fixed!
   void OldCodeThatDoesntWorkButWeDontKnowWhy() {
     if (some_variable) {
       DoSomething();
     } else {
       DoSomethingElse();
     }
   }

**After instrumenting:**

.. code-block::

   #include "pw_metric/global.h"
   #include "pw_metric/metric.h"

   // PW_METRIC_GLOBAL takes an identifier, a name, and an initial value.
   PW_METRIC_GLOBAL(legacy_do_something, "legacy_do_something", 0u);
   PW_METRIC_GLOBAL(legacy_do_something_else, "legacy_do_something_else", 0u);

   // This code was passed down from generations of developers before; no one
   // knows what it does or how it works. But it needs to be fixed!
   void OldCodeThatDoesntWorkButWeDontKnowWhy() {
     if (some_variable) {
       legacy_do_something.Increment();
       DoSomething();
     } else {
       legacy_do_something_else.Increment();
       DoSomethingElse();
     }
   }

In this case, the developer merely had to add the metrics header, define some
metrics, and then start incrementing them. These metrics will be available
globally through the ``pw::metric::global_metrics`` object defined in
``pw_metric/global.h``.

Why not just use simple counter variables?
------------------------------------------
One might wonder what the point of leveraging a metric library is when it is
trivial to make some global variables and print them out. There are a few
reasons:

- **Metrics offload** - To make it easy to get metrics off-device by sharing
  the infrastructure for offloading.

- **Consistent format** - To get the metrics in a consistent format (e.g.
  protobuf or JSON) for analysis.

- **Uncoordinated collection** - To provide a simple and reliable way for
  developers on a team to all collect metrics for their subsystems, without
  having to coordinate to offload. This could extend to code in libraries
  written by other teams.

- **Pre-boot or interrupt visibility** - Some of the most challenging bugs come
  from early system boot when not all system facilities are up (e.g. logging or
  UART). In those cases, metrics provide a low-overhead approach to understand
  what is happening. During early boot, metrics can be incremented; after boot,
  dumping the metrics provides insights into what happened. While basic counter
  variables can work in these contexts too, one still has to deal with the
  offloading problem, which the library handles.
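
To make the "uncoordinated collection" point more concrete, here is a minimal,
self-contained sketch of how self-registering counters can work in principle.
It is **not** how ``pw_metric`` is implemented: ``SketchCounter``,
``SketchTotal``, ``sketch_reads`` and ``sketch_writes`` are hypothetical names
invented for this illustration, and the sketch uses a hand-rolled singly linked
list and plain string names where ``pw_metric`` uses Pigweed's intrusive list
and tokens.

.. code-block:: cpp

   #include <atomic>
   #include <cstdint>

   // Hypothetical stand-in for a self-registering counter; not the pw_metric
   // API.
   class SketchCounter {
    public:
     explicit SketchCounter(const char* name) : name_(name), next_(head()) {
       head() = this;  // Link into the global list at construction time.
     }

     void Increment(uint32_t amount = 1) {
       value_.fetch_add(amount, std::memory_order_relaxed);
     }
     uint32_t value() const { return value_.load(std::memory_order_relaxed); }
     const char* name() const { return name_; }
     const SketchCounter* next() const { return next_; }

     // Function-local static: the list head is initialized on first use, so it
     // is valid no matter which static constructor runs first.
     static SketchCounter*& head() {
       static SketchCounter* list_head = nullptr;
       return list_head;
     }

    private:
     const char* name_;
     std::atomic<uint32_t> value_{0};
     SketchCounter* next_;
   };

   // Two counters defined independently; neither knows about the other.
   SketchCounter sketch_reads("reads");
   SketchCounter sketch_writes("writes");

   // Offloading code walks the list without knowing which counters exist.
   uint32_t SketchTotal() {
     uint32_t total = 0;
     for (const SketchCounter* c = SketchCounter::head(); c != nullptr;
          c = c->next()) {
       total += c->value();
     }
     return total;
   }

With hand-written global variables, the equivalent of ``SketchTotal()`` would
have to name every counter explicitly, which is the coordination cost that a
shared registration mechanism removes.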

---------------------
Metrics API reference
---------------------

The metrics API consists of just a few components:

- The core data structures ``pw::metric::Metric`` and ``pw::metric::Group``
- The macros for scoped metrics and groups ``PW_METRIC`` and
  ``PW_METRIC_GROUP``
- The macros for globally registered metrics and groups
  ``PW_METRIC_GLOBAL`` and ``PW_METRIC_GROUP_GLOBAL``
- The global groups and metrics list: ``pw::metric::global_groups`` and
  ``pw::metric::global_metrics``.

Metric
------
The ``pw::metric::Metric`` provides:

- A 31-bit tokenized name
- A 1-bit discriminator for int or float
- A 32-bit payload (int or float)
- A 32-bit next pointer (intrusive list)

The metric object is 12 bytes on 32-bit platforms.

.. cpp:class:: pw::metric::Metric

   .. cpp:function:: Increment(uint32_t amount = 1)

      Increment the metric by the given amount. Results in undefined behaviour
      if the metric is not of type int.

   .. cpp:function:: Set(uint32_t value)

      Set the metric to the given value. Results in undefined behaviour if the
      metric is not of type int.

   .. cpp:function:: Set(float value)

      Set the metric to the given value. Results in undefined behaviour if the
      metric is not of type float.

.. _module-pw_metric-group:

Group
-----
The ``pw::metric::Group`` object is simply:

- A name for the group
- A list of children groups
- A list of leaf metrics
- A 32-bit next pointer (intrusive list)

The group object is 16 bytes on 32-bit platforms.

.. cpp:class:: pw::metric::Group

   .. cpp:function:: Dump(int indent_level = 0)

      Recursively dump a metrics group to ``pw_log``. Produces output like:

      .. code-block:: none

         "$6doqFw==": {
           "$05OCZw==": {
             "$VpPfzg==": 1,
             "$LGPMBQ==": 1.000000,
             "$+iJvUg==": 5,
           }
           "$9hPNxw==": 65,
           "$oK7HmA==": 13,
           "$FCM4qQ==": 0,
         }

      Note that the metric names are tokenized and Base64-encoded. Decoding
      requires using the Pigweed detokenizer. With a detokenizing-enabled
      logger, you could get something like:

      .. code-block:: none

         "i2c_1": {
           "gyro": {
             "num_samples": 1,
             "init_time_us": 1.000000,
             "initialized": 5,
           }
           "bus_errors": 65,
           "transactions": 13,
           "bytes_sent": 0,
         }

Macros
------
The **macros are the primary mechanism for creating metrics**, and should be
used instead of directly constructing metrics or groups. The macros handle
tokenizing the metric and group names.

.. cpp:function:: PW_METRIC(identifier, name, value)
.. cpp:function:: PW_METRIC(group, identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(identifier, name, value)
.. cpp:function:: PW_METRIC_STATIC(group, identifier, name, value)

   Declare a metric, optionally adding it to a group.

   - **identifier** - An identifier name for the created variable or member.
     For example: ``i2c_transactions`` might be used as a local or global
     metric; inside a class, it could be named according to member conventions
     (``i2c_transactions_`` for Google's C++ style).
   - **name** - The string name for the metric. This will be tokenized. There
     are no restrictions on the contents of the name; however, consider
     restricting these to be valid C++ identifiers to ease integration with
     other systems.
   - **value** - The initial value for the metric. Must be either a floating
     point value (e.g. ``3.2f``) or unsigned int (e.g. ``21u``).
   - **group** - A ``pw::metric::Group`` instance. If provided, the metric is
     added to the given group.

   The macro declares a variable or member named by **identifier** with type
   ``pw::metric::Metric``, and works in three contexts: global, local, and
   member.

   If the ``_STATIC`` variant is used, the macro declares a variable with static
   storage. These can be used in function scopes, but not in classes.

   1. At global scope:

      .. code-block::

         PW_METRIC(foo, "foo", 0u);

         void MyFunc() {
           foo.Increment();
         }

   2. At local function or member function scope:

      .. code-block::

         void MyFunc() {
           PW_METRIC(foo, "foo", 0u);
           foo.Increment();
           // foo goes out of scope here; be careful!
         }

   3. At member level inside a class or struct:

      .. code-block::

         struct MyStructy {
           void DoSomething() {
             somethings.Increment();
           }
           // Every instance of MyStructy will have a separate somethings counter.
           PW_METRIC(somethings, "somethings", 0u);
         };

   You can also put a metric into a group with the macro. A metric can belong
   to at most one group; adding it to a second group triggers an assertion
   failure. Example:

   .. code-block::

      PW_METRIC_GROUP(my_group, "my_group");
      PW_METRIC(my_group, foo, "foo", 0.2f);
      PW_METRIC(my_group, bar, "bar", 44000u);
      PW_METRIC(my_group, zap, "zap", 3.14f);

   .. tip::
      If you want a globally registered metric, see ``pw_metric/global.h``; in
      that context, metrics are globally registered without the need to
      centrally register in a single place.

.. cpp:function:: PW_METRIC_GROUP(identifier, name)
.. cpp:function:: PW_METRIC_GROUP(parent_group, identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(identifier, name)
.. cpp:function:: PW_METRIC_GROUP_STATIC(parent_group, identifier, name)

   Declares a ``pw::metric::Group`` with name name; the name is tokenized.
   Works similarly to ``PW_METRIC`` and can be used in the same contexts
   (global, local, and member). Optionally, the group can be added to a parent
   group.

   If the ``_STATIC`` variant is used, the macro declares a variable with static
   storage. These can be used in function scopes, but not in classes.

   Example:

   .. code-block::

      PW_METRIC_GROUP(my_group, "my_group");
      PW_METRIC(my_group, foo, "foo", 0.2f);
      PW_METRIC(my_group, bar, "bar", 44000u);
      PW_METRIC(my_group, zap, "zap", 3.14f);

.. cpp:function:: PW_METRIC_GLOBAL(identifier, name, value)

   Declare a ``pw::metric::Metric`` with name name, and register it in the
   global metrics list ``pw::metric::global_metrics``.

   Example:

   .. code-block::

      #include "pw_metric/metric.h"
      #include "pw_metric/global.h"

      // No need to coordinate collection of foo and bar; they're autoregistered.
      PW_METRIC_GLOBAL(foo, "foo", 0.2f);
      PW_METRIC_GLOBAL(bar, "bar", 44000u);

   Note that metrics defined with ``PW_METRIC_GLOBAL`` should never be added to
   groups defined with ``PW_METRIC_GROUP_GLOBAL``. Each metric can only belong
   to one group, and metrics defined with ``PW_METRIC_GLOBAL`` are
   pre-registered with the global metrics list.

   .. attention::
      Do not create ``PW_METRIC_GLOBAL`` instances anywhere other than global
      scope. Putting these on an instance (member context) would lead to
      dangling pointers and misery. Metrics are never deleted or unregistered!

.. cpp:function:: PW_METRIC_GROUP_GLOBAL(identifier, name)

   Declare a ``pw::metric::Group`` with name name, and register it in the
   global metric groups list ``pw::metric::global_groups``.

   Note that metrics created with ``PW_METRIC_GLOBAL`` should never be added to
   groups! Instead, just create a freestanding metric and register it into the
   global group (like in the example below).

   Example:

   .. code-block::

      #include "pw_metric/metric.h"
      #include "pw_metric/global.h"

      // No need to coordinate collection of this group; it's globally registered.
      PW_METRIC_GROUP_GLOBAL(legacy_system, "legacy_system");
      PW_METRIC(legacy_system, foo, "foo", 0.2f);
      PW_METRIC(legacy_system, bar, "bar", 44000u);

   .. attention::
      Do not create ``PW_METRIC_GROUP_GLOBAL`` instances anywhere other than
      global scope. Putting these on an instance (member context) would lead to
      dangling pointers and misery. Metrics are never deleted or unregistered!

----------------------
Usage & Best Practices
----------------------
This library makes several tradeoffs to enable low memory use per-metric, and
one of those tradeoffs results in requiring care in constructing the metric
trees.

Use the Init() pattern for static objects with metrics
------------------------------------------------------
A common pattern in embedded systems is to allocate many objects globally, and
reduce reliance on dynamic allocation (or eschew malloc entirely). This leads
to a pattern where rich/large objects are statically constructed at global
scope, then interacted with via tasks or threads. For example, consider a
hypothetical global ``Uart`` object:

.. code-block::

   class Uart {
    public:
     Uart(pw::span<std::byte> rx_buffer, pw::span<std::byte> tx_buffer)
         : rx_buffer_(rx_buffer), tx_buffer_(tx_buffer) {}

     // Send/receive here...

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;
   };

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer, uart_tx_buffer);

Through the course of building a product, the team may want to add metrics to
the UART, for example to gain insight into which operations are triggering lots
of data transfer. When adding metrics to the above imaginary UART object, one
might consider the following approach:

.. code-block::

   class Uart {
    public:
     Uart(pw::span<std::byte> rx_buffer,
          pw::span<std::byte> tx_buffer,
          pw::metric::Group& parent_metrics)
         : rx_buffer_(rx_buffer),
           tx_buffer_(tx_buffer) {
       // PROBLEM! parent_metrics may not be constructed if it's a reference
       // to a static global.
       parent_metrics.Add(tx_bytes_);
       parent_metrics.Add(rx_bytes_);
     }

     // Send/receive here which increment tx/rx_bytes.

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;

     PW_METRIC(tx_bytes_, "tx_bytes", 0u);
     PW_METRIC(rx_bytes_, "rx_bytes", 0u);
   };

   PW_METRIC_GROUP(global_metrics, "/");
   PW_METRIC_GROUP(global_metrics, uart1_metrics, "uart1");

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer,
              uart_tx_buffer,
              uart1_metrics);

However, this **is incorrect**, since the ``parent_metrics`` (pointing to
``uart1_metrics`` in this case) may not be constructed at the point of
``uart1`` getting constructed. Thankfully in the case of ``pw_metric`` this
will result in an assertion failure (or it will work correctly if the
constructors are called in a favorable order), so the problem will not go
unnoticed. Instead, consider using the ``Init()`` pattern for static objects,
where references to dependencies may only be stored during construction, but no
methods on the dependencies are called.

The ``Init()`` approach separates global object construction into two phases:
the constructor, where references are stored, and an ``Init()`` function, which
is called after all static constructors have run. This approach works
correctly, even when the objects are allocated globally:

.. code-block::

   class Uart {
    public:
     // Note that metrics is not passed in here at all.
     Uart(pw::span<std::byte> rx_buffer,
          pw::span<std::byte> tx_buffer)
         : rx_buffer_(rx_buffer),
           tx_buffer_(tx_buffer) {}

     // Precondition: parent_metrics is already constructed.
     void Init(pw::metric::Group& parent_metrics) {
       parent_metrics.Add(tx_bytes_);
       parent_metrics.Add(rx_bytes_);
     }

     // Send/receive here which increment tx/rx_bytes.

    private:
     pw::span<std::byte> rx_buffer_;
     pw::span<std::byte> tx_buffer_;

     PW_METRIC(tx_bytes_, "tx_bytes", 0u);
     PW_METRIC(rx_bytes_, "rx_bytes", 0u);
   };

   PW_METRIC_GROUP(root_metrics, "/");
   PW_METRIC_GROUP(root_metrics, uart1_metrics, "uart1");

   std::array<std::byte, 512> uart_rx_buffer;
   std::array<std::byte, 512> uart_tx_buffer;
   Uart uart1(uart_rx_buffer,
              uart_tx_buffer);

   int main() {
     // uart1_metrics is guaranteed to be initialized by this point, so it is
     // safe to pass it to Init().
     uart1.Init(uart1_metrics);
   }

.. attention::
   Be extra careful about **static global metric registration**. Consider using
   the ``Init()`` pattern.

Metric member order matters in objects
--------------------------------------
The order of declaring in-class groups and metrics matters if the metrics are
within a group declared inside the class. For example, the following class will
work fine:

.. code-block::

   #include "pw_metric/metric.h"

   class PowerSubsystem {
    public:
     pw::metric::Group& metrics() { return metrics_; }
     const pw::metric::Group& metrics() const { return metrics_; }

    private:
     PW_METRIC_GROUP(metrics_, "power");  // Note metrics_ declared first.
     PW_METRIC(metrics_, foo, "foo", 0.2f);
     PW_METRIC(metrics_, bar, "bar", 44000u);
   };

but the following one will not since the group is constructed after the metrics
(and will result in a compile error):

.. code-block::

   #include "pw_metric/metric.h"

   class PowerSubsystem {
    public:
     pw::metric::Group& metrics() { return metrics_; }
     const pw::metric::Group& metrics() const { return metrics_; }

    private:
     PW_METRIC(metrics_, foo, "foo", 0.2f);
     PW_METRIC(metrics_, bar, "bar", 44000u);
     PW_METRIC_GROUP(metrics_, "power");  // Error: metrics_ must be first.
   };

.. attention::

   Put **groups before metrics** when declaring metrics members inside classes.

Thread safety
-------------
``pw_metric`` has **no built-in synchronization for manipulating the tree**
structure. Users are expected to either rely on a shared global mutex when
constructing the metric tree, or do the metric construction in a single thread
(e.g. a boot/init thread). The same applies for destruction, though we do not
advise destructing metrics or groups.

Individual metrics have atomic ``Increment()`` and ``Set()`` operations, as
well as the value accessors ``as_float()`` and ``as_int()``. These don't
require separate synchronization and can be used from ISRs.

.. attention::

   **You must synchronize access to metrics**. ``pw_metric`` does not
   internally synchronize changes to the tree structure during construction.
   Metric ``Set()`` and ``Increment()`` calls are safe.

Lifecycle
---------
Metric objects are not designed to be destructed, and are expected to live for
the lifetime of the program or application. If you need dynamic
creation/destruction of metrics, ``pw_metric`` does not attempt to cover that
use case. Instead, ``pw_metric`` covers the case of products with two execution
phases:

1. A boot phase where the metric tree is created.
2. A run phase where metrics are collected. The tree structure is fixed.

Technically, it is possible to destruct metrics provided care is taken to
remove the given metric (or group) from the list it's contained in. However,
there are no helper functions for this, so be careful.
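
As a rough illustration of what "removing a metric from the list it's contained
in" involves, the following self-contained sketch unlinks a node from a singly
linked intrusive list. ``SketchNode`` and ``SketchList`` are hypothetical types
invented for this example; they are not ``pw_metric`` types, and ``pw_metric``
itself offers no such helper.

.. code-block:: cpp

   #include <cstdint>

   // Hypothetical node type standing in for a metric; not the pw_metric API.
   struct SketchNode {
     uint32_t value = 0;
     SketchNode* next = nullptr;
   };

   // Hypothetical list type standing in for a group's list of metrics.
   struct SketchList {
     SketchNode* head = nullptr;

     void PushFront(SketchNode& node) {
       node.next = head;
       head = &node;
     }

     // Unlinks `node` if it is in the list; returns true if it was removed.
     // This must run before `node` is destroyed, otherwise the list is left
     // pointing at a dead object.
     bool Remove(SketchNode& node) {
       for (SketchNode** link = &head; *link != nullptr;
            link = &(*link)->next) {
         if (*link == &node) {
           *link = node.next;
           node.next = nullptr;
           return true;
         }
       }
       return false;
     }

     int Size() const {
       int count = 0;
       for (const SketchNode* n = head; n != nullptr; n = n->next) {
         ++count;
       }
       return count;
     }
   };

Removal from a singly linked list is a linear search, and the sketch has no
synchronization, so any concurrent reader of the list would need to be excluded
while ``Remove()`` runs.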

Below is an example that **is incorrect**. Don't do what follows!

.. code-block::

   #include "pw_metric/metric.h"

   int main() {
     PW_METRIC_GROUP(root, "/");
     {
       // BAD! The metrics have a different lifetime than the group.
       PW_METRIC(root, temperature, "temperature_f", 72.3f);
       PW_METRIC(root, humidity, "humidity_relative_percent", 33.2f);
     }
     // OOPS! root now has a linked list that points to the destructed
     // "humidity" object.
   }

.. attention::
   **Don't destruct metrics**. Metrics are designed to be registered /
   structured upfront, then manipulated during a device's active phase. They do
   not support destruction.

.. _module-pw_metric-exporting:

-----------------
Exporting metrics
-----------------
Collecting metrics on a device is not useful without a mechanism to export
those metrics for analysis and debugging. ``pw_metric`` offers optional RPC
service libraries (``:metric_service_nanopb`` based on nanopb, and
``:metric_service_pwpb`` based on pw_protobuf) that enable exporting a
user-supplied set of on-device metrics via RPC. This facility is intended to
function from the early stages of device bringup through production in the
field.

The metrics are fetched by calling the ``MetricService.Get`` RPC method, which
streams all registered metrics to the caller in batches (server streaming RPC).
Batching the returned metrics avoids requiring a large buffer or large RPC MTU.

The returned metric objects have flattened paths to the root. For example, the
returned metrics (post detokenization and jsonified) might look something like:

.. code-block:: none

   {
     "/i2c1/failed_txns": 17,
     "/i2c1/total_txns": 2013,
     "/i2c1/gyro/resets": 24,
     "/i2c1/gyro/hangs": 1,
     "/spi1/thermocouple/reads": 242,
     "/spi1/thermocouple/temp_celsius": 34.52
   }

Note that there is no nesting of the groups; the nesting is implied from the
path.

RPC service setup
-----------------
To expose a ``MetricService`` in your application, do the following:

1. Define metrics around the system, and put them in a group or list of
   metrics. Easy choices include the ``global_groups`` and ``global_metrics``
   variables; or create your own.
2. Create an instance of ``pw::metric::MetricService``.
3. Register the service with your RPC server.

For example:

.. code-block::

   #include "pw_rpc/server.h"
   #include "pw_metric/metric.h"
   #include "pw_metric/global.h"
   #include "pw_metric/metric_service_nanopb.h"

   // Note: You must customize the RPC server setup; see pw_rpc.
   Channel channels[] = {
     Channel::Create<1>(&uart_output),
   };
   Server server(channels);

   // Metric service instance, pointing to the global metric objects.
   // This could also point to custom per-product or application objects.
   pw::metric::MetricService metric_service(
       pw::metric::global_metrics,
       pw::metric::global_groups);

   void RegisterServices() {
     server.RegisterService(metric_service);
     // Register other services here.
   }

   int main() {
     // ... system initialization ...

     RegisterServices();

     // ... start your application ...
   }

.. attention::
   Take care when exporting metrics. Ensure **appropriate access control** is in
   place. In some cases it may make sense to entirely disable metrics export for
   production builds. Although reading metrics via RPC won't influence the
   device, in some cases the metrics could expose sensitive information if
   product owners are not careful.

.. attention::
   **MetricService::Get is a synchronous RPC method**

   Calls to ``MetricService::Get`` are blocking and will send all metrics
   immediately, even though it is a server-streaming RPC. This will work fine if
   the device doesn't have too many metrics, or doesn't have concurrent RPCs
   like logging, but could be a problem in some cases.

   We plan to offer an async version where the application is responsible for
   pumping the metrics into the streaming response. This gives flow control to
   the application.

-----------
Size report
-----------
The size report below shows the cost in code and memory for a few examples of
metrics. This does not include the RPC service.

.. include:: metric_size_report

.. attention::
   At time of writing, **the above sizes show an unexpectedly large flash
   impact**. We are investigating why GCC is inserting large global static
   constructors per group, when all the logic should be reused across objects.

-------------
Metric Parser
-------------
The ``metric_parser`` Python module requests the system metrics via RPC, then
parses the response while detokenizing the group and metric names, and returns
the metrics in a dictionary organized by group and value.

----------------
Design tradeoffs
----------------
There are many possible approaches to metrics collection and aggregation. We've
chosen some points on the tradeoff curve:

- **Atomic-sized metrics** - Using simple metric objects with just uint32/float
  enables atomic operations. While it might be nice to support larger types, it
  is more useful to have safe metrics increment from interrupt service
  routines.

- **No aggregate metrics (yet)** - Aggregate metrics (e.g. average, max, min,
  histograms) are not supported, and must be built on top of the simple base
  metrics. By taking this route, we can considerably simplify the core metrics
  system and have aggregation logic in separate modules. Those modules can then
  feed into the metrics system - for example by creating multiple metrics for a
  single underlying metric. For example: "foo", "foo_max", "foo_min" and so on.

  The other problem with automatic aggregation is that the period over which
  the aggregation happens is often important, and it can be hard to design
  this cleanly into the API. Instead, this responsibility is pushed to the user
  who must take more care.

  Note that we will add helpers for aggregated metrics.

- **No virtual metrics** - An alternate approach to the concrete Metric class
  in the current module is to have a virtual interface for metrics, and then
  allow those metrics to have their own storage. This is attractive but can
  lead to many vtables and excess memory use in simple one-metric use cases.

- **Linked list registration** - Using linked lists for registration is a
  tradeoff, accepting some memory overhead in exchange for flexibility. Other
  alternatives include a global table of metrics, which has the disadvantage of
  requiring centralizing the metrics -- an impossibility for middleware like
  Pigweed.

- **Synchronization** - The only synchronization guarantee provided by
  ``pw_metric`` is that increment and set are atomic. Other than that, users
  are on their own to synchronize metric collection and updating.

- **No fast metric lookup** - The current design does not make it fast to
  look up a metric at runtime; instead, one must run a linear search of the
  tree to find the matching metric. In most non-dynamic use cases, this is fine
  in practice, and saves having a more involved hash table. Metric updates will
  be through direct member or variable accesses.

- **Relying on C++ static initialization** - In short, the convenience
  outweighs the cost and risk. Without static initializers, it would be
  impossible to automatically collect the metrics without post-processing the
  C++ code to find the metrics; a huge and debatably worthwhile approach. We
  have carefully analyzed the static initializer behaviour of Pigweed's
  IntrusiveList and are confident it is correct.

- **Both local & global support** - Potentially just one approach (the local or
  global one) could be offered, making the module less complex. However, we
  feel the additional complexity is worthwhile since there are legitimate use
  cases for both e.g. ``PW_METRIC`` and ``PW_METRIC_GLOBAL``. We'd prefer to
  have a well-tested upstream solution for these use cases rather than have
  customers re-implement one of these.

----------------
Roadmap & Status
----------------
- **String metric names** - ``pw_metric`` stores metric names as tokens. On one
  hand, this is great for production where having a compact binary is often a
  requirement to fit the application in the given part. However, in early
  development before flash is a constraint, string names are more convenient to
  work with since there is no need for host-side detokenization. We plan to add
  optional support for string names.

- **Aggregate metrics** - We plan to add support for aggregate metrics on top
  of the simple metric mechanism, either as another module or as additional
  functionality inside this one. Likely examples include min/max.

- **Selectively enable or disable metrics** - Currently the metrics are always
  enabled once included. In practice this is not ideal since many times only a
  few metrics are wanted in production, but having to strip all the metrics
  code is error prone. Instead, we will add support for controlling what
  metrics are enabled or disabled at compile time. This may rely on C++20's
  support for zero-sized members to fully remove the cost.

- **Async RPC** - The current RPC service exports the metrics by streaming
  them to the client in batches. However, the current solution streams all the
  metrics to completion; this may block the RPC thread. In the future we will
  have an async solution where the user is in control of flow priority.

- **Timer integration** - We would like to add a stopwatch type mechanism to
  time multiple in-flight events.

- **C support** - In practice it's often useful or necessary to instrument
  C-only code. While it will be impossible to support the global registration
  system that the C++ version supports, we will figure out a solution to make
  instrumenting C code relatively smooth.

- **Global counter** - We may add a global metric counter to help detect cases
  where post-initialization metrics manipulations are done.

- **Proto structure** - It may be possible to directly map metrics to a custom
  proto structure, where instead of a name or token field, a tag field is
  provided. This could result in elegant export to an easily machine parsable
  and compact representation on the host. We may investigate this in the
  future.

- **Safer data structures** - At a cost of 4B per metric and 4B per group, it
  may be possible to make metric structure instantiation safe even in static
  constructors, and also make it safe to remove metrics dynamically. We will
  consider whether this tradeoff is the right one, since a 4B cost per metric
  is substantial on projects with many metrics.
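
Until aggregate helpers exist, the "foo", "foo_min", "foo_max" approach
mentioned under design tradeoffs can be hand-rolled. The following is a
minimal, self-contained sketch of that idea; ``SketchMinMax`` is a hypothetical
helper invented for this illustration and is not part of ``pw_metric``. Each
field stands in for one plain ``uint32_t`` metric.

.. code-block:: cpp

   #include <algorithm>
   #include <cstdint>
   #include <limits>

   // Hypothetical helper; not part of pw_metric. In real code, each field
   // would be a separate uint32_t metric ("foo", "foo_min", "foo_max").
   struct SketchMinMax {
     uint32_t last = 0;
     uint32_t min = std::numeric_limits<uint32_t>::max();
     uint32_t max = 0;

     // Records one sample and updates the running minimum and maximum.
     void Record(uint32_t sample) {
       last = sample;
       min = std::min(min, sample);
       max = std::max(max, sample);
     }
   };

The three fields are updated one after another, so unlike a single metric
``Set()`` the update as a whole is not atomic, and the caller would need to
decide both how to synchronize ``Record()`` and over what period the minimum
and maximum are reset.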