Projects
========

The following is a mostly unordered set of ideas for improvements to the
LLDB debugger. Some are fairly deep, some would require less effort.

.. contents::
   :local:

Speed up type realization in lldb
---------------------------------

The type of problem I'm addressing here is the situation where you are
debugging a large program (lldb built with debug clang/swift will do) and you
go to print a simple expression, and lldb goes away for 30 seconds. When you
sample it, it is always busily churning through all the CU's in the world
looking for something. The problem isn't that looking for something in
particular is slow, but rather that we somehow turned a bounded search (maybe
a subtype of "std::string") into an unbounded search (all things with the name
of that subtype). Or didn't stop when we got a reasonable answer proximate to
the context of the search, but let the search leak out globally. And quite
likely there are other issues that I haven't guessed yet. But if you end up
churning through 3 or 4 Gig of debug info, that's going to be slow no matter how
well written your debug reader is...

My guess is the work will be more in the general symbol lookup than in the
DWARF parser in particular, but it may be a combination of both.

As a user debugging a largish program, this is the most obvious lameness of
lldb.

Symbol name completion in the expression parser
-----------------------------------------------

This is the other obvious lameness of lldb. You can do:

::

  (lldb) frame var foo.b

and we will tell you it is "foo.bar". But you can't do that in the expression
parser. This will require collaboration with the clang/swift folks to get the
right extension points in the compiler. And whatever they are, lldb will need
to use them to tell the compiler about what names are available.
It will be
important to avoid the pitfalls of the first project above, where we wander
into the entire DWARF world.

Make a high speed asynchronous communication channel
----------------------------------------------------

All lldb debugging nowadays is done by talking to a debug agent. We used the
gdb-remote protocol because that is universal, and good enough, and you have to
support it anyway since so many little devices & JTAG's and VM's etc support
it. But it is really old, not terribly high performance, and can't really
handle sending or receiving messages while the process is supposedly running.
A replacement should have compression built in, remove the hand-built checksums
and rely on the robust communication protocols we always have nowadays, allow
for out-of-order requests/replies, allow for reconnecting to a temporarily
disconnected debug session, and regularize all of the packet formatting into
JSON or BSON or whatever while including a way to do large binary transfers. It
must be possible to come up with something faster, and better tunable for the
many communications pathways we end up supporting.

Fix local variable lookup in the lldb expression parser
-------------------------------------------------------

The injection of local variables into the clang expression parser is
currently done incorrectly - it happens too late in the lookup. This results
in namespace variables & functions, same named types and ivars shadowing
locals when it should be the other way around. An attempt was made to fix
this by manually inserting all the visible local variables into a wrapper
function in the expression text. This mostly gets the job done but that
method means you have to realize all the types and locations of all local
variables for even the simplest of expressions, and when run on large
programs (e.g. lldb) it would cause unacceptable delays.
And it was very
fragile since an error in realizing any of the locals would cause all
expressions run in that context to fail. We need to fix this by adjusting
the points where name lookup calls out to lldb in clang.

Support calling SB & commands everywhere and support non-stop debugging
-----------------------------------------------------------------------

There is a fairly ad-hoc system to handle when it is safe to run SB API's and
command line commands. This is actually a bit of a tricky problem, since we
allow access to the command line and SB API from some funky places in lldb. The
Operating System plugins are the most obvious instance, since they get run
right after lldb is told by debugserver that the process has stopped, but
before it has finished collating the information from the stop for presentation
to the higher levels. But breakpoint callbacks have some of the same problems,
and other things like the scripted stepping operations and any fancier
extension points we want to add to the debugger are going to be hard to
implement robustly till we work on a finer-grained and more explicit control
over who gets to control the process state.

We also won't have any chance of supporting non-stop debugging - which is a
useful mode for programs that have a lot of high-priority or real-time worker
threads - until we get this sorted out.

Finish the language abstraction and remove all the unnecessary API's
--------------------------------------------------------------------

An important part of making lldb a more useful "debugger toolkit" as opposed to
a C/C++/ObjC/Swift debugger is to have a clean abstraction for language
support. We did most, but not all, of the physical separation. We need to
finish that. And then by force of necessity the API's really look like the
interface to a C++ type system with a few swift bits added on.
How you would
go about adding a new language is unclear and much more trouble than it is
worth at present. But if we made this nice, we could add a lot of value to
other language projects.

Add some syntax to generate data formatters from type definitions
-----------------------------------------------------------------

Uses of the data formatters fall into two types. There are data formatters for
types where the structure elements pretty much tell you how to present the
data, you just need a little expression language to express how to turn them
into what the user expects to see. Then there are the ones (like pretty much
all our Foundation/AppKit/UIKit formatters) that use deep magic to figure out
how the type is actually laid out. The latter are pretty much always going to
have to be done by hand.

But for the ones where the information is expressed in the fields, it would be
great to have a way to express the instructions to produce summaries and
children in some form you could embed next to the types and have the compiler
produce a byte code form of the instructions and then make that available to
lldb along with the library. This isn't as simple as having clang run over the
headers and produce something from the types directly. After all, clang has no
way of knowing that the interesting thing about a std::vector is the elements
that you get by calling size (for the summary) and [] for the elements. But it
shouldn't be hard to come up with a generic markup to express this.

Allow the expression parser to access dynamic type/data formatter information
-----------------------------------------------------------------------------

This seems like a smaller one. The symptom is your object is a Foo, a child of
Bar, and in the Locals view you see all the fields of Foo, but because the
static type of the object is Bar, you can't see any of the fields of Foo from
the expression parser.
But if you could get this working, you could hijack the mechanism to make
the results of the value object summaries/synthetic children available to
expressions. And if you can do that, you could add other properties to an
object externally (through Python or some other extension point) and then
have these also available in the expression parser. You could use this to
express invariants for data structures, or other more advanced uses of types
in the debugger.

Another version of this is to allow access to synthetic children in the
expression parser. Otherwise you end up in situations like:

::

  (lldb) print return_a_foo()
  (SomeVectorLikeType) $1 = {
    [0] = 0
    [1] = 1
    [2] = 2
    [3] = 3
    [4] = 4
  }

That's good but:

::

  (lldb) print return_a_foo()[2]

fails because the expression parser doesn't know anything about the
array-like nature of SomeVectorLikeType that it gets from the synthetic
children.

Recover thread information lazily
---------------------------------

LLDB stores all the user intentions for a thread in the ThreadPlans stored in
the Thread class. That allows us to reliably implement a very natural model for
users moving through a debug session. For example, if step-over stops at a
breakpoint in a function in a younger region of the stack, continue will
complete the step-over rather than having to manually step out. But that means
that it is important that the Thread objects live as long as the Threads they
represent. For programs with many threads, but only one that you are debugging,
that makes stepping less efficient, since now you have to fetch the thread list
on every step or stepping doesn't work correctly. This is especially an issue
when the threads are provided by an Operating System plugin, where it may take
non-trivial work to reconstruct the thread list.
It would be better to fetch
threads lazily but keep "unseen" threads in a holding area, and only retire
them when we know we've fetched the whole thread list and ensured they are no
longer alive.

Make Python-backed commands first class citizens
------------------------------------------------

As it stands, Python commands have no way to advertise their options. They are
required to parse their arguments by hand. That leads to inconsistency, and
more importantly means they can't take advantage of auto-generated help and
command completion. This leaves Python-backed commands feeling worse than
built-in ones.

As part of this job, it would also be great to automatically hook the
"type" of an option value or argument (e.g. eArgTypeShlibName) to sensible
default completers. You need to be able to override this in more complicated
scenarios (like in "break set" where the presence of a "-s" option limits the
search for completion of a "-n" option). But in common cases it is unnecessary
busy-work to have to supply the completer AND the type. If this worked, then it
would be easier for Python commands to also get correct completers.

Reimplement the command interpreter commands using the SB API
-------------------------------------------------------------

Currently, all the CommandObject::DoExecute methods are implemented using the
lldb_private API's. That generally means that there's code that gets duplicated
between the CommandObject and the SB API that does roughly the same thing. We
would reduce this code duplication, present a single coherent face to the users
of lldb, and keep ourselves more honest about what we need in the SB API's if
we implemented the CommandObject::DoExecute methods using the SB API's.

BTW, it is only this way because it was much easier to develop lldb if it had a
functioning command-line early on.
So we did that first, and developed the SB
API's when lldb was more mature. There's no good technical reason to have the
commands use the lldb_private API's.

Documentation and better examples
---------------------------------

We need to put the lldb syntax docs in the tutorial somewhere that is more
easily accessible. One suggestion is to add non-command based help to the help
system, and then have a "help lldb" or "help syntax" type command with this
info. It would be nice if the non-command based help could be hierarchical so
you could make topics.

There's a fair bit of docs about the SB API's, but it is spotty. Some classes
are well documented in the Python "help (lldb.SBWhatever)" and some are not.

We need more conceptual docs. And we need more examples. And we could provide a
clean pluggable example for using LLDB standalone from Python. The
process_events.py is a start of this, but it just handles process events, and
it is really a quick sketch, not a polished expandable proto-tool.

Make a more accessible plugin architecture for lldb
---------------------------------------------------

Right now, you can only use the Python or SB API's to extend an extant lldb.
You can't implement any of the actual lldb Plugins as plugins. That means
anybody that wants to add new Object file/Process/Language etc support has to
build and distribute their own lldb. This is tricky because the API's the
plugins use are currently not stable (and recently have been changing quite a
lot). We would have to define a subset of lldb_private that you could use, and
some way of telling whether the plugins were compatible with the lldb. But
long-term, making this sort of extension possible will make lldb more appealing
for research and 3rd party uses.
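
One way the "are these plugins compatible with this lldb" question could be
answered is with a versioned plugin API. The sketch below is only an
illustration of that idea: ``PluginAPIVersion``, ``is_compatible``, and the
major/minor rule are all made up for this example, and nothing like them exists
in lldb today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginAPIVersion:
    """Hypothetical version stamp for the stable subset of lldb_private."""
    major: int  # bumped on incompatible changes
    minor: int  # bumped when API is added compatibly


def is_compatible(host: PluginAPIVersion, plugin: PluginAPIVersion) -> bool:
    """Assumed rule: same major version, and the plugin must not have been
    built against a newer minor version than the host provides."""
    return plugin.major == host.major and plugin.minor <= host.minor


host = PluginAPIVersion(major=2, minor=3)
print(is_compatible(host, PluginAPIVersion(2, 1)))  # plugin built against older minor
print(is_compatible(host, PluginAPIVersion(3, 0)))  # plugin built against another major
```

Whatever the real rule turns out to be, the point is that lldb could refuse to
load a mismatched plugin up front rather than crash later.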

Use instruction emulation to reduce the overhead for breakpoints
----------------------------------------------------------------

At present, breakpoints are implemented by inserting a trap instruction, then
when the trap is hit, replacing the trap with the actual instruction and single
stepping. Then swap back and continue. This causes problems for read only text,
and also means that non-stop debugging must either stop all threads briefly to
handle this two-step or risk missing some breakpoint hits. If you emulated the
instruction and wrote back the results, you wouldn't have these problems, and
it would also save a stop per breakpoint hit. Since we use breakpoints to
implement stepping, this savings could be significant on slow connections.

Use the JIT to speed up conditional breakpoint evaluation
---------------------------------------------------------

We already JIT and cache the conditional expressions for breakpoints for the C
family of languages, so we aren't re-compiling every time you hit the
breakpoint. And if we couldn't IR interpret the expression, we leave the JIT'ed
code in place for reuse. But it would be even better if we could also insert
the "stop or not" decision into the code at the breakpoint, so you would only
actually stop the process when the condition was true. Greg's idea was that if
you had a conditional breakpoint set when you started the debug session, Xcode
could rebuild and insert enough no-ops that we could instrument the breakpoint
site and call the conditional expression, and only trap if the conditional was
true.

Broaden the idea in "target stop-hook" to cover more events in the debugger
---------------------------------------------------------------------------

Shared library loads, command execution, and user directed memory/register
reads and writes are all places where you would reasonably want to hook into
the debugger.
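
To make the shape of this concrete, here is a minimal sketch of a hook registry
keyed by event kind. It is not an lldb API: ``HookRegistry``, the event names,
and the ``info`` dictionary are all hypothetical, and the real design would
have to decide what information each event carries.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Hook = Callable[[Dict[str, object]], None]


class HookRegistry:
    """Hypothetical generalization of stop-hooks to other debugger events."""

    def __init__(self) -> None:
        self._hooks: DefaultDict[str, List[Hook]] = defaultdict(list)

    def add(self, event: str, hook: Hook) -> None:
        """Register a hook to run whenever `event` fires."""
        self._hooks[event].append(hook)

    def fire(self, event: str, **info: object) -> int:
        """Run every hook registered for `event`; return how many ran."""
        hooks = self._hooks.get(event, [])
        for hook in hooks:
            hook(info)
        return len(hooks)


registry = HookRegistry()
# The event names below are invented for the example.
registry.add("module-loaded", lambda info: print("loaded", info["path"]))
registry.fire("module-loaded", path="/usr/lib/libexample.dylib")
registry.fire("memory-write", address=0x1000)  # no hooks registered, nothing runs
```

The existing stop-hooks would then just be the hooks registered for the stop
event.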

Mock classes for testing
------------------------

We need "ProcessMock" and "ObjectFileMock" and the like. These would be real
plugin implementations for their underlying lldb classes, with the addition
that you can prime them from some sort of text based input files. For classes
that manage changes over time (like process) you would need to program the
state at StopPoint 0, StopPoint 1, etc. These could then be used for testing
reactions to complex threading problems & the like, and also for simulating
hard-to-test environments (like bare board debugging).

Expression parser needs syntax for "{symbol,type} A in CU B.cpp"
----------------------------------------------------------------

Sometimes you need to specify non-visible or ambiguous types to the expression
parser. We were planning to do $b_dot_cpp$A or something like that. You might
want to specify a static in a function, in a source file, or in a shared
library. So the syntax should support all these.

Add a "testButDontAbort" style test to the UnitTest framework
-------------------------------------------------------------

The way we use unittest now (maybe this is the only way it can work, I don't
know) you can't report a real failure and continue with the test. That is
appropriate in some cases: if I'm supposed to hit breakpoint A before I
evaluate an expression, and don't hit breakpoint A, the test should fail. But
it means that if I want to test five different expressions, I can either do it
in one test, which is good because it means I only have to fire up one process,
attach to it, and get it to a certain point. But it also means if the first
test fails, the other four don't even get run. So though at first we wrote a
bunch of tests like this, as time went on we switched more to writing "one at a
time" tests because they were more robust against a single failure.
That makes
the test suite run much more slowly. It would be great to add a
"test_but_dont_abort" variant of the tests, then we could gang tests that all
drive to the same place and do similar things. As an added benefit, it would
allow us to be more thorough in writing tests, since each test would have lower
costs.

Convert the dotest style tests to use lldbutil.run_to_source_breakpoint
-----------------------------------------------------------------------

run_to_source_breakpoint & run_to_name_breakpoint provide a compact API that
does in one line what the first 10 or 20 lines of most of the old tests now do
by hand. Using these functions makes tests much more readable, and by
centralizing common functionality will make maintaining the testsuites easier
in the future. This is more of a finger exercise, and perhaps best implemented
by a rule like: "If you touch a test case, and it isn't using
run_to_source_breakpoint, please make it do so".

Unify Watchpoint's & Breakpoints
--------------------------------

Option handling isn't shared, and more importantly the PerformAction's have a
lot of duplicated common code, most of which works less well on the Watchpoint
side.

Reverse debugging
-----------------

This is kind of a holy grail, it's hard to support for complex apps (many
threads, shared memory, etc.) But it would be SO nice to have...

Non-stop debugging
------------------

By this I mean allowing some threads in the target program to run while
stopping other threads. This is supported in name in lldb at present, but lldb
makes the assumption "If I get a stop, I won't get another stop unless I
actually run the program." in a bunch of places so getting it to work reliably
will be a good bit of work. And figuring out how to present this in the UI
will also be tricky.
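
Fix and continue
----------------

We did this in gdb without a real JIT. The implementation shouldn't be that
hard, especially if you can build the executable for fix and continue. The
tricky part is how to verify that the user can only do the kinds of fixes that
are safe to do. No changing object sizes is easy to detect, but there were many
more subtle changes (function you are fixing is on the stack...) that take more
work to prevent. And then you have to explain these conditions to the user in
some helpful way.

Unified IR interpreter
----------------------

Currently IRInterpreter implements a portion of the LLVM IR, but it doesn't
handle vector data types and there are plenty of instructions it also doesn't
support. Conversely, lli supports most of LLVM's IR but it doesn't handle
remote memory and its function calling support is very rudimentary. It would be
useful to unify these and make the IR interpreter -- both for LLVM and LLDB --
better. An alternate strategy would be simply to JIT into the current process
but have callbacks for non-stack memory access.

Teach lldb to predict exception propagation at the throw site
-------------------------------------------------------------

There are a bunch of places in lldb where we need to know, at the point where an
exception is thrown, what frame will catch the exception.

For instance, if an expression throws an exception, we need to know whether the
exception will be caught in the course of the expression evaluation. If so it
would be safe to let the expression continue. But since we would destroy the
state of the thread if we let the exception escape the expression, we currently
stop the expression evaluation if we see a throw. If we knew where it would be
caught we could distinguish these two cases.

Similarly, when you step over a call that throws, you want to stop at the throw
point if you know the exception will unwind past the frame you were stepping in,
but it would be annoying to have the step abort every time an exception was
thrown. If we could predict the catching frame, we could do this right.

And of course, this would be a useful piece of information to display when stopped
at a throw point.

Add predicates to the nodes of settings
---------------------------------------

It would be very useful to be able to give values to settings that are dependent
on the triple, or executable name, for targets, or on whether a process is local
or remote, or on the name of a thread, etc. The original intent (and there is
a sketch of this in the settings parsing code) was to be able to say:

::

  (lldb) settings set target{arch=x86_64}.process.thread{name=foo}...

The exact details are still to be worked out, however.

Resurrect Type Validators
-------------------------

This half-implemented feature was removed in
https://reviews.llvm.org/D71310 but the general idea might still be
useful: Type Validators look at a ValueObject, and make sure that
there is nothing semantically wrong with the object's contents to
easily catch corrupted data.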
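
As a rough illustration of what a validator does, here is a small sketch in
plain Python. It does not reproduce the removed implementation or use the lldb
module: a value is modeled as a dictionary of field values standing in for a
ValueObject, and ``register_validator``, ``validate``, and the size/capacity
invariant are assumptions made up for the example.

```python
from typing import Callable, Dict, List, Optional

# A validator inspects a value's fields and returns an error message,
# or None if it finds nothing wrong.
Fields = Dict[str, int]
Validator = Callable[[Fields], Optional[str]]

_VALIDATORS: Dict[str, List[Validator]] = {}


def register_validator(type_name: str, validator: Validator) -> None:
    """Attach a validator to every value of the named type (hypothetical API)."""
    _VALIDATORS.setdefault(type_name, []).append(validator)


def validate(type_name: str, fields: Fields) -> List[str]:
    """Run all validators for the type and collect their complaints."""
    errors = []
    for validator in _VALIDATORS.get(type_name, []):
        message = validator(fields)
        if message is not None:
            errors.append(message)
    return errors


def size_within_capacity(fields: Fields) -> Optional[str]:
    """Assumed invariant for a vector-like type: size never exceeds capacity."""
    if fields["size"] > fields["capacity"]:
        return f"size {fields['size']} exceeds capacity {fields['capacity']}"
    return None


register_validator("SomeVectorLikeType", size_within_capacity)
print(validate("SomeVectorLikeType", {"size": 3, "capacity": 8}))
print(validate("SomeVectorLikeType", {"size": 9, "capacity": 8}))
```

A debugger could show such a message next to the variable instead of
formatting garbage as though it were valid data.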
389 390Similarly, when you step over a call that throws, you want to stop at the throw 391point if you know the exception will unwind past the frame you were stepping in, 392but it would annoying to have the step abort every time an exception was thrown. 393If we could predict the catching frame, we could do this right. 394 395And of course, this would be a useful piece of information to display when stopped 396at a throw point. 397 398Add predicates to the nodes of settings 399--------------------------------------- 400 401It would be very useful to be able to give values to settings that are dependent 402on the triple, or executable name, for targets, or on whether a process is local 403or remote, or on the name of a thread, etc. The original intent (and there is 404a sketch of this in the settings parsing code) was to be able to say: 405 406:: 407 408 (lldb) settings set target{arch=x86_64}.process.thread{name=foo}... 409 410The exact details are still to be worked out, however. 411 412Resurrect Type Validators 413------------------------- 414 415This half-implemented feature was removed in 416https://reviews.llvm.org/D71310 but the general idea might still be 417useful: Type Validators look at a ValueObject, and make sure that 418there is nothing semantically wrong with the object's contents to 419easily catch corrupted data. 420