=======================================================
Building a JIT: Starting out with KaleidoscopeJIT
=======================================================

.. contents::
   :local:

Chapter 1 Introduction
======================

**Warning: This text is currently out of date due to ORC API updates.**

**The example code has been updated and can be used. The text will be updated
once the API churn dies down.**

Welcome to Chapter 1 of the "Building an ORC-based JIT in LLVM" tutorial. This
tutorial runs through the implementation of a JIT compiler using LLVM's
On-Request-Compilation (ORC) APIs. It begins with a simplified version of the
KaleidoscopeJIT class used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials and then
introduces new features like optimization, lazy compilation and remote
execution.

The goal of this tutorial is to introduce you to LLVM's ORC JIT APIs, show how
these APIs interact with other parts of LLVM, and teach you how to recombine
them to build a custom JIT that is suited to your use-case.

The structure of the tutorial is:

- Chapter #1: Investigate the simple KaleidoscopeJIT class. This will
  introduce some of the basic concepts of the ORC JIT APIs, including the
  idea of an ORC *Layer*.

- `Chapter #2 <BuildingAJIT2.html>`_: Extend the basic KaleidoscopeJIT by adding
  a new layer that will optimize IR and generated code.

- `Chapter #3 <BuildingAJIT3.html>`_: Further extend the JIT by adding a
  Compile-On-Demand layer to lazily compile IR.

- `Chapter #4 <BuildingAJIT4.html>`_: Improve the laziness of our JIT by
  replacing the Compile-On-Demand layer with a custom layer that uses the ORC
  Compile Callbacks API directly to defer IR-generation until functions are
  called.

- `Chapter #5 <BuildingAJIT5.html>`_: Add process isolation by JITing code into
  a remote process with reduced privileges using the JIT Remote APIs.

To provide input for our JIT we will use the Kaleidoscope REPL from
`Chapter 7 <LangImpl07.html>`_ of the "Implementing a language with LLVM"
tutorial, with one minor modification: We will remove the FunctionPassManager
from the code for that chapter and replace it with optimization support in our
JIT class in Chapter #2.

Finally, a word on API generations: ORC is the 3rd generation of LLVM JIT API.
It was preceded by MCJIT, and before that by the (now deleted) legacy JIT.
These tutorials don't assume any experience with these earlier APIs, but
readers acquainted with them will see many familiar elements. Where appropriate
we will make this connection with the earlier APIs explicit to help people who
are transitioning from them to ORC.

JIT API Basics
==============

The purpose of a JIT compiler is to compile code "on-the-fly" as it is needed,
rather than compiling whole programs to disk ahead of time as a traditional
compiler does. To support that aim our initial, bare-bones JIT API will be:

1. Handle addModule(Module &M) -- Make the given IR module available for
   execution.
2. JITSymbol findSymbol(const std::string &Name) -- Search for pointers to
   symbols (functions or variables) that have been added to the JIT.
3. void removeModule(Handle H) -- Remove a module from the JIT, releasing any
   memory that had been used for the compiled code.

A basic use-case for this API, executing the 'main' function from a module,
will look like:

.. code-block:: c++

  std::unique_ptr<Module> M = buildModule();
  JIT J;
  Handle H = J.addModule(*M);
  int (*Main)(int, char*[]) = (int(*)(int, char*[]))J.getSymbolAddress("main");
  // argc and argv are the arguments of the host program's own main.
  int Result = Main(argc, argv);
  J.removeModule(H);

The APIs that we build in these tutorials will all be variations on this simple
theme. Behind the API we will refine the implementation of the JIT to add
support for optimization and lazy compilation.
Eventually we will extend the
API itself to allow higher-level program representations (e.g. ASTs) to be
added to the JIT.

KaleidoscopeJIT
===============

In the previous section we described our API. Now we examine a simple
implementation of it: The KaleidoscopeJIT class [1]_ that was used in the
`Implementing a language with LLVM <LangImpl01.html>`_ tutorials. We will use
the REPL code from `Chapter 7 <LangImpl07.html>`_ of that tutorial to supply the
input for our JIT: Each time the user enters an expression the REPL will add a
new IR module containing the code for that expression to the JIT. If the
expression is a top-level expression like '1+1' or 'sin(x)', the REPL will also
use the findSymbol method of our JIT class to find and execute the code for the
expression, and then use the removeModule method to remove the code again
(since there's no way to re-invoke an anonymous expression). In later chapters
of this tutorial we'll modify the REPL to enable new interactions with our JIT
class, but for now we will take this setup for granted and focus our attention on
the implementation of our JIT itself.

Our KaleidoscopeJIT class is defined in the KaleidoscopeJIT.h header. After the
usual include guards and #includes [2]_, we get to the definition of our class:

.. code-block:: c++

  #ifndef LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H
  #define LLVM_EXECUTIONENGINE_ORC_KALEIDOSCOPEJIT_H

  #include "llvm/ADT/STLExtras.h"
  #include "llvm/ExecutionEngine/ExecutionEngine.h"
  #include "llvm/ExecutionEngine/JITSymbol.h"
  #include "llvm/ExecutionEngine/RTDyldMemoryManager.h"
  #include "llvm/ExecutionEngine/SectionMemoryManager.h"
  #include "llvm/ExecutionEngine/Orc/CompileUtils.h"
  #include "llvm/ExecutionEngine/Orc/IRCompileLayer.h"
  #include "llvm/ExecutionEngine/Orc/LambdaResolver.h"
  #include "llvm/ExecutionEngine/Orc/RTDyldObjectLinkingLayer.h"
  #include "llvm/IR/DataLayout.h"
  #include "llvm/IR/Mangler.h"
  #include "llvm/Support/DynamicLibrary.h"
  #include "llvm/Support/raw_ostream.h"
  #include "llvm/Target/TargetMachine.h"
  #include <algorithm>
  #include <memory>
  #include <string>
  #include <vector>

  namespace llvm {
  namespace orc {

  class KaleidoscopeJIT {
  private:
    std::unique_ptr<TargetMachine> TM;
    const DataLayout DL;
    RTDyldObjectLinkingLayer ObjectLayer;
    IRCompileLayer<decltype(ObjectLayer), SimpleCompiler> CompileLayer;

  public:
    using ModuleHandle = decltype(CompileLayer)::ModuleHandleT;

Our class begins with four members: a TargetMachine, TM, which will be used to
build our LLVM compiler instance; a DataLayout, DL, which will be used for
symbol mangling (more on that later); and two ORC *layers*: an
RTDyldObjectLinkingLayer and a CompileLayer. We'll be talking more about layers
in the next chapter, but for now you can think of them as analogous to LLVM
Passes: they wrap up useful JIT utilities behind an easy-to-compose interface.
The first layer, ObjectLayer, is the foundation of our JIT: it takes in-memory
object files produced by a compiler and links them on the fly to make them
executable.
This JIT-on-top-of-a-linker design was introduced in MCJIT; however,
the linker was hidden inside the MCJIT class. In ORC we expose the linker so
that clients can access and configure it directly if they need to. In this
tutorial our ObjectLayer will just be used to support the next layer in our
stack: the CompileLayer, which will be responsible for taking LLVM IR, compiling
it, and passing the resulting in-memory object files down to the object linking
layer below.

That's it for member variables. After that we have a single typedef:
ModuleHandle. This is the handle type that will be returned from our JIT's
addModule method, and can be passed to the removeModule method to remove a
module. The IRCompileLayer class already provides a convenient handle type
(IRCompileLayer::ModuleHandleT), so we just alias our ModuleHandle to this.

.. code-block:: c++

  KaleidoscopeJIT()
      : TM(EngineBuilder().selectTarget()), DL(TM->createDataLayout()),
        ObjectLayer([]() { return std::make_shared<SectionMemoryManager>(); }),
        CompileLayer(ObjectLayer, SimpleCompiler(*TM)) {
    llvm::sys::DynamicLibrary::LoadLibraryPermanently(nullptr);
  }

  TargetMachine &getTargetMachine() { return *TM; }

Next up we have our class constructor. We begin by initializing TM using the
EngineBuilder::selectTarget helper method, which constructs a TargetMachine for
the current process. Then we use our newly created TargetMachine to initialize
DL, our DataLayout. After that we need to initialize our ObjectLayer. The
ObjectLayer requires a function object that will build a JIT memory manager for
each module that is added (a JIT memory manager manages memory allocations,
memory permissions, and registration of exception handlers for JIT'd code).
For this we use a lambda that returns a SectionMemoryManager, an off-the-shelf
utility that provides all the basic memory management functionality required for
this chapter. Next we initialize our CompileLayer. The CompileLayer needs two
things: (1) a reference to our object layer, and (2) a compiler instance to use
to perform the actual compilation from IR to object files. We use the
off-the-shelf SimpleCompiler instance for now. Finally, in the body of the
constructor, we call the DynamicLibrary::LoadLibraryPermanently method with a
nullptr argument. Normally the LoadLibraryPermanently method is called with the
path of a dynamic library to load, but when passed a null pointer it will 'load'
the host process itself, making its exported symbols available for execution.

.. code-block:: c++

  ModuleHandle addModule(std::unique_ptr<Module> M) {
    // Build our symbol resolver:
    // Lambda 1: Look back into the JIT itself to find symbols that are part of
    //           the same "logical dylib".
    // Lambda 2: Search for external symbols in the host process.
    auto Resolver = createLambdaResolver(
        [&](const std::string &Name) {
          if (auto Sym = CompileLayer.findSymbol(Name, false))
            return Sym;
          return JITSymbol(nullptr);
        },
        [](const std::string &Name) {
          if (auto SymAddr =
                RTDyldMemoryManager::getSymbolAddressInProcess(Name))
            return JITSymbol(SymAddr, JITSymbolFlags::Exported);
          return JITSymbol(nullptr);
        });

    // Add the module to the JIT with the resolver we created above. The
    // ObjectLayer creates a new SectionMemoryManager for it.
    return cantFail(CompileLayer.addModule(std::move(M),
                                           std::move(Resolver)));
  }

Now we come to the first of our JIT API methods: addModule. This method is
responsible for adding IR to the JIT and making it available for execution.
In
this initial implementation of our JIT we will make our modules "available for
execution" by adding them straight to the CompileLayer, which will immediately
compile them. In later chapters we will teach our JIT to defer compilation
of individual functions until they're actually called.

To add our module to the CompileLayer we need to supply both the module and a
symbol resolver. The symbol resolver is responsible for supplying the JIT with
an address for each *external symbol* in the module we are adding. External
symbols are any symbol not defined within the module itself, including calls to
functions outside the JIT and calls to functions defined in other modules that
have already been added to the JIT. (It may seem as though modules added to the
JIT should know about one another by default, but since we would still have to
supply a symbol resolver for references to code outside the JIT it turns out to
be easier to re-use this one mechanism for all symbol resolution.) This has the
added benefit that the user has full control over the symbol resolution
process. Should we search for definitions within the JIT first, then fall back
on external definitions? Or should we prefer external definitions where
available and only JIT code if we don't already have an available
implementation? By using a single symbol resolution scheme we are free to choose
whatever makes the most sense for any given use case.

Building a symbol resolver is made especially easy by the *createLambdaResolver*
function. This function takes two lambdas [3]_ and returns a JITSymbolResolver
instance. The first lambda is used as the implementation of the resolver's
findSymbolInLogicalDylib method, which searches for symbol definitions that
should be thought of as being part of the same "logical" dynamic library as this
Module.
If you are familiar with static linking: this means that
findSymbolInLogicalDylib should expose symbols with common linkage and hidden
visibility. If all this sounds foreign you can ignore the details and just
remember that this is the first method that the linker will use to try to find a
symbol definition. If the findSymbolInLogicalDylib method returns a null result
then the linker will call the second symbol resolver method, called findSymbol,
which searches for symbols that should be thought of as external to (but
visible from) the module and its logical dylib. In this tutorial we will adopt
the following simple scheme: All modules added to the JIT will behave as if they
were linked into a single, ever-growing logical dylib. To implement this our
first lambda (the one defining findSymbolInLogicalDylib) will just search for
JIT'd code by calling the CompileLayer's findSymbol method. If we don't find a
symbol in the JIT itself we'll fall back to our second lambda, which implements
findSymbol. This will use the RTDyldMemoryManager::getSymbolAddressInProcess
method to search for the symbol within the program itself. If we can't find a
symbol definition via either of these paths, the JIT will refuse to accept our
module, returning a "symbol not found" error.

Now that we've built our symbol resolver, we're ready to add our module to the
JIT. We do this by calling the CompileLayer's addModule method. The addModule
method returns an ``Expected<CompileLayer::ModuleHandle>``, since in more
advanced JIT configurations it could fail. In our basic configuration we know
that it will always succeed, so we use the cantFail utility to assert that no
error occurred, and extract the handle value. Since we have already typedef'd
our ModuleHandle type to be the same as the CompileLayer's handle type, we can
return the unwrapped handle directly.

.. code-block:: c++

  JITSymbol findSymbol(const std::string &Name) {
    std::string MangledName;
    raw_string_ostream MangledNameStream(MangledName);
    Mangler::getNameWithPrefix(MangledNameStream, Name, DL);
    return CompileLayer.findSymbol(MangledNameStream.str(), true);
  }

  JITTargetAddress getSymbolAddress(const std::string &Name) {
    return cantFail(findSymbol(Name).getAddress());
  }

  void removeModule(ModuleHandle H) {
    cantFail(CompileLayer.removeModule(H));
  }

Now that we can add code to our JIT, we need a way to find the symbols we've
added to it. To do that we call the findSymbol method on our CompileLayer, but
with a twist: We have to *mangle* the name of the symbol we're searching for
first. The ORC JIT components use mangled symbols internally the same way a
static compiler and linker would, rather than using plain IR symbol names. This
allows JIT'd code to interoperate easily with precompiled code in the
application or shared libraries. The kind of mangling will depend on the
DataLayout, which in turn depends on the target platform. To allow us to remain
portable and search based on the un-mangled name, we just reproduce this
mangling ourselves.

Next we have a convenience function, getSymbolAddress, which returns the address
of a given symbol. Like CompileLayer's addModule function, JITSymbol's getAddress
function is allowed to fail [4]_; however, we know that it will not in our simple
example, so we wrap it in a call to cantFail.

We now come to the last method in our JIT API: removeModule. This method is
responsible for destructing the MemoryManager and SymbolResolver that were
added with a given module, freeing any resources they were using in the
process.
In our Kaleidoscope demo we rely on this method to remove the module
representing the most recent top-level expression, preventing it from being
treated as a duplicate definition when the next top-level expression is
entered. It is generally good to free any module that you know you won't need
to call further, just to free up the resources dedicated to it. However, you
don't strictly need to do this: All resources will be cleaned up when your
JIT class is destructed, if they haven't been freed before then. Like
``CompileLayer::addModule`` and ``JITSymbol::getAddress``, removeModule may
fail in general but will never fail in our example, so we wrap it in a call to
cantFail.

This brings us to the end of Chapter 1 of Building a JIT. You now have a basic
but fully functioning JIT stack that you can use to take LLVM IR and make it
executable within the context of your JIT process. In the next chapter we'll
look at how to extend this JIT to produce better quality code, and in the
process take a deeper look at the ORC layer concept.

`Next: Extending the KaleidoscopeJIT <BuildingAJIT2.html>`_

Full Code Listing
=================

Here is the complete code listing for our running example. To build this
example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core orcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../examples/Kaleidoscope/BuildingAJIT/Chapter1/KaleidoscopeJIT.h
   :language: c++

.. [1] Actually we use a cut-down version of KaleidoscopeJIT that makes a
       simplifying assumption: symbols cannot be re-defined. This will make it
       impossible to re-define symbols in the REPL, but will make our symbol
       lookup logic simpler. Re-introducing support for symbol redefinition is
       left as an exercise for the reader. (The KaleidoscopeJIT.h used in the
       original tutorials will be a helpful reference).

.. [2] +-----------------------------+-----------------------------------------------+
       | File                        | Reason for inclusion                          |
       +=============================+===============================================+
       | STLExtras.h                 | LLVM utilities that are useful when working   |
       |                             | with the STL.                                 |
       +-----------------------------+-----------------------------------------------+
       | ExecutionEngine.h           | Access to the EngineBuilder::selectTarget     |
       |                             | method.                                       |
       +-----------------------------+-----------------------------------------------+
       | RTDyldMemoryManager.h       | Access to the                                 |
       |                             | RTDyldMemoryManager::getSymbolAddressInProcess|
       |                             | method.                                       |
       +-----------------------------+-----------------------------------------------+
       | CompileUtils.h              | Provides the SimpleCompiler class.            |
       +-----------------------------+-----------------------------------------------+
       | IRCompileLayer.h            | Provides the IRCompileLayer class.            |
       +-----------------------------+-----------------------------------------------+
       | LambdaResolver.h            | Access to the createLambdaResolver function,  |
       |                             | which provides easy construction of symbol    |
       |                             | resolvers.                                    |
       +-----------------------------+-----------------------------------------------+
       | RTDyldObjectLinkingLayer.h  | Provides the RTDyldObjectLinkingLayer class.  |
       +-----------------------------+-----------------------------------------------+
       | Mangler.h                   | Provides the Mangler class for platform       |
       |                             | specific name-mangling.                       |
       +-----------------------------+-----------------------------------------------+
       | DynamicLibrary.h            | Provides the DynamicLibrary class, which      |
       |                             | makes symbols in the host process searchable. |
       +-----------------------------+-----------------------------------------------+
       | raw_ostream.h               | A fast output stream class. We use the        |
       |                             | raw_string_ostream subclass for symbol        |
       |                             | mangling.                                     |
       +-----------------------------+-----------------------------------------------+
       | TargetMachine.h             | LLVM target machine description class.        |
       +-----------------------------+-----------------------------------------------+

.. [3] Actually they don't have to be lambdas: any object with a call operator
       will do, including plain old functions or std::functions.

.. [4] ``JITSymbol::getAddress`` will force the JIT to compile the definition of
       the symbol if it hasn't already been compiled, and since the compilation
       process could fail, getAddress must be able to return this failure.
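
The two-stage symbol lookup that this chapter builds with createLambdaResolver
(search the JIT's own logical dylib first, then fall back to the host process)
can be illustrated without LLVM. The following is only a minimal sketch in plain
C++14, not the ORC implementation: ``ToySymbol``, ``ToyResolver`` and
``lookupIn`` are hypothetical names invented for this illustration, and each
symbol table is modelled as a simple map from names to made-up addresses.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>
  #include <functional>
  #include <map>
  #include <string>
  #include <utility>

  // Toy stand-in for a JIT symbol (hypothetical, not llvm::JITSymbol).
  // An address of 0 plays the role of a "null" symbol, i.e. "not found".
  struct ToySymbol {
    std::uint64_t Address = 0;
    explicit operator bool() const { return Address != 0; }
  };

  // Hypothetical two-stage resolver, loosely mirroring the pair of lambdas
  // that the chapter passes to createLambdaResolver.
  class ToyResolver {
  public:
    using LookupFn = std::function<ToySymbol(const std::string &)>;

    ToyResolver(LookupFn First, LookupFn Second)
        : InLogicalDylib(std::move(First)), External(std::move(Second)) {}

    // Stage 1: look inside the "logical dylib" (the JIT'd code).
    // Stage 2: only if stage 1 returns a null symbol, try external symbols.
    ToySymbol resolve(const std::string &Name) const {
      if (ToySymbol Sym = InLogicalDylib(Name))
        return Sym;
      return External(Name);
    }

  private:
    LookupFn InLogicalDylib;
    LookupFn External;
  };

  // Builds a lookup function backed by a fixed name-to-address table.
  ToyResolver::LookupFn lookupIn(std::map<std::string, std::uint64_t> Table) {
    return [Table = std::move(Table)](const std::string &Name) {
      auto It = Table.find(Name);
      return It == Table.end() ? ToySymbol{} : ToySymbol{It->second};
    };
  }

With a resolver whose first stage is ``lookupIn({{"foo", 0x1000}})`` and whose
second stage is ``lookupIn({{"foo", 0x2000}, {"sin", 0x3000}})``, resolving
``foo`` yields the first-stage address 0x1000, resolving ``sin`` falls through
to the second stage and yields 0x3000, and an unknown name yields a null
symbol. Swapping the two stages would implement the opposite policy discussed
in this chapter, preferring external definitions over JIT'd ones.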