Reproducers
===========

As unbelievable as it may sound, the debugger has bugs. These bugs might
manifest themselves as errors, missing results or even a crash. Quite often
these bugs don't reproduce in simple, isolated scenarios. The debugger deals
with a lot of moving parts and subtle differences can easily add up.

Reproducers in LLDB improve the experience for both the users encountering bugs
and the developers working on resolving them. The general idea consists of
*capturing* all the information necessary to later *replay* a debug session
while debugging the debugger.

.. contents::
   :local:

Usage
-----

Reproducers are a generic concept in LLDB and are not inherently coupled with
the command line driver. The functionality can be used for anything that uses
the SB API and the driver is just one example. However, because it's probably
the most common way users interact with lldb, that's the workflow described in
this section.

Capture
```````

Until reproducer capture is enabled by default, you need to launch LLDB in
capture mode. For the command line driver, this means passing ``--capture``.
You cannot enable reproducer capture from within LLDB, as this would be too
late to capture initialization of the debugger.

.. code-block:: bash

   > lldb --capture

In capture mode, LLDB will keep track of all the information it needs to replay
the current debug session. Most data is captured lazily to limit the impact on
performance. To create the reproducer, use the ``reproducer generate``
sub-command. It's always possible to check the status of the reproducers with
the ``reproducer status`` sub-command. Note that generating the reproducer
terminates the debug session.

.. code-block:: none

   (lldb) reproducer status
   Reproducer is in capture mode.
   (lldb) reproducer generate
   Reproducer written to '/path/to/reproducer'
   Please have a look at the directory to assess if you're willing to share the contained information.


The resulting reproducer is a directory. It was a conscious decision to not
compress and archive it automatically. The reproducer can contain potentially
sensitive information like object and symbol files, their paths on disk, debug
information, memory excerpts of the inferior process, etc.

Replay
``````

It is strongly recommended to replay the reproducer locally to ensure it
actually reproduces the expected behavior. If the reproducer doesn't behave
correctly locally, it means there's a bug in the reproducer implementation that
should be addressed.

To replay a reproducer, simply pass its path to LLDB through the ``--replay``
flag. It is unnecessary to pass any other command line flags. The flags that
were passed to LLDB during capture are already part of the reproducer.

.. code-block:: bash

   > lldb --replay /path/to/reproducer


During replay LLDB will behave similarly to batch mode. The session should be
identical to the recorded debug session, with two expected differences. First,
the binary being debugged doesn't actually run during replay. That means that
you won't see any of its side effects, like things being printed to the
terminal. Second, the ``reproducer generate`` command becomes a no-op during
replay.

Augmenting a Bug Report with a Reproducer
`````````````````````````````````````````

A reproducer can significantly improve a bug report, but on its own it is not
sufficient. Always describe the expected and unexpected behavior. Just like the
debugger can have bugs, the reproducer can have bugs too.


Design
------


Replay
``````

Reproducers support two replay modes. The main and most common mode is active
replay. It's called active because it's LLDB that is driving replay by calling
the captured SB API functions one after another. The second mode is passive
replay. In this mode, LLDB sits idle until an SB API function is called, for
example from Python, and then replays just this individual call.

Active Replay
^^^^^^^^^^^^^

No matter how a reproducer was captured, it can always be replayed with the
command line driver. When a reproducer is passed with the ``--replay`` flag, the
driver short-circuits and passes off control to the reproducer infrastructure,
effectively bypassing its normal operation. This works because the driver is
implemented using the SB API and is therefore nothing more than a sequence of
SB API calls.

Replay is driven by ``Registry::Replay``. As long as there's data in the
buffer holding the API data, the next SB API function call is deserialized.
Once the function is known, the registry can retrieve its signature, and use
that to deserialize its arguments. The function can then be invoked, most
commonly through the synthesized default replayer, or potentially using a
custom defined replay function. This process continues until no more data is
available or a replay error is encountered.

During replay only a function's side effects matter. The result returned by the
replayed function is ignored because it cannot be observed beyond the driver.
This is sound, because anything that is passed into a subsequent API call will
have been serialized as an input argument. This also works for SB API objects
because the reproducers know about every object that has crossed the API
boundary, which is true by definition for object return values.


Passive Replay
^^^^^^^^^^^^^^

Passive replay exists to support running the API test suite against a
reproducer. The API test suite is written in Python and tests the debugger by
calling into its API from Python. To make this work, the API must transparently
replay itself when called. This is what makes passive replay different from
active (driver) replay, where it is lldb itself that's driving replay. For
passive replay, the driving factor is external.

In order to replay API calls, the reproducers need a way to intercept them.
Every API call is already instrumented with an ``LLDB_RECORD_*`` macro that
captures its input arguments. Furthermore, it also contains the necessary logic
to detect which calls cross the API boundary and should be intercepted. We were
able to reuse all of this to implement passive replay.

When passive replay is enabled, nothing happens until an SB API is called.
Inside that API function, the macro detects whether this call should be
replayed (i.e. whether it crossed the API boundary). If the answer is yes, the
next function is deserialized from the SB API data and compared to the current
function. If the signature matches, we deserialize its input arguments and
reinvoke the current function with the deserialized arguments. We don't need to
do anything special to prevent us from recursively calling the replayed version
again, as the API boundary crossing logic knows that we're still behind the API
boundary when we reinvoke the current function.

Another big difference with active replay is the return value. While this
doesn't matter for active replay, it's key for passive replay, because that's
what gets checked by the test suite. Luckily, the ``LLDB_RECORD_*`` macros
contain sufficient type information to derive the result type.

Testing
-------

Reproducers are tested in the following ways:

 - Unit tests to cover the reproducer infrastructure. There are tests for the
   provider, loader and for the reproducer instrumentation.
 - Feature-specific end-to-end test cases in the ``test/Shell/Reproducer``
   directory. These tests serve as integration and regression tests for the
   reproducer infrastructure, as well as doing some sanity checking for basic
   debugger functionality.
 - The API and shell tests can be run against a replayed reproducer. The
   ``check-lldb-reproducers`` target will run the API and shell test suite
   twice: first running the tests normally while capturing a reproducer and
   then a second time using the replayed session as the test input. For the
   shell tests this uses a small shim (``lldb-repro``) that uses the arguments
   and current working directory to transparently generate or replay a
   reproducer. For the API tests an extra argument with the reproducer path is
   passed to ``dotest.py``, which initializes the debugger in the appropriate
   mode. Certain tests do not fit this paradigm (for example tests that check
   the output of the binary being debugged) and are skipped by marking them as
   unsupported, either by adding ``UNSUPPORTED: lldb-repro`` to the top of the
   shell test or by adding the ``skipIfReproducer`` decorator for the API
   tests.

Additional testing is possible:

 - It's possible to unconditionally capture reproducers while running the
   entire test suite by setting the ``LLDB_CAPTURE_REPRODUCER`` environment
   variable. Assuming no bugs in reproducers, this can also help to reproduce
   and investigate test failures.

Known Issues
------------

The reproducers are still a work in progress. Here's a non-exhaustive list of
outstanding work, limitations and known issues.

 - The VFS cannot deal with more than one current working directory. Changing
   the current working directory during the debug session will break relative
   paths.
 - Not all SB APIs are properly instrumented. We need custom serialization
   for APIs that take buffers and lengths.
 - We leak memory during replay because the reproducer doesn't capture the end
   of an object's lifetime. We need to add instrumentation to the destructor
   of SB API objects.
 - The reproducer includes every file opened by LLDB. This is overkill. For
   example we do not need to capture source files for code listings. There's
   currently no way to say that some file shouldn't be included in the
   reproducer.
 - We do not yet automatically generate a reproducer on a crash. The reason is
   that generating the reproducer is too expensive to do in a signal handler.
   We should re-invoke lldb after a crash and do the heavy lifting.