1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 2 "http://www.w3.org/TR/html4/strict.dtd"> 3 4<html> 5<head> 6 <title>Kaleidoscope: Extending the Language: User-defined Operators</title> 7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> 8 <meta name="author" content="Chris Lattner"> 9 <link rel="stylesheet" href="../llvm.css" type="text/css"> 10</head> 11 12<body> 13 14<h1>Kaleidoscope: Extending the Language: User-defined Operators</h1> 15 16<ul> 17<li><a href="index.html">Up to Tutorial Index</a></li> 18<li>Chapter 6 19 <ol> 20 <li><a href="#intro">Chapter 6 Introduction</a></li> 21 <li><a href="#idea">User-defined Operators: the Idea</a></li> 22 <li><a href="#binary">User-defined Binary Operators</a></li> 23 <li><a href="#unary">User-defined Unary Operators</a></li> 24 <li><a href="#example">Kicking the Tires</a></li> 25 <li><a href="#code">Full Code Listing</a></li> 26 </ol> 27</li> 28<li><a href="LangImpl7.html">Chapter 7</a>: Extending the Language: Mutable 29Variables / SSA Construction</li> 30</ul> 31 32<div class="doc_author"> 33 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> 34</div> 35 36<!-- *********************************************************************** --> 37<h2><a name="intro">Chapter 6 Introduction</a></h2> 38<!-- *********************************************************************** --> 39 40<div> 41 42<p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language 43with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully 44functional language that is fairly minimal, but also useful. There 45is still one big problem with it, however. Our language doesn't have many 46useful operators (like division, logical negation, or even any comparisons 47besides less-than).</p> 48 49<p>This chapter of the tutorial takes a wild digression into adding user-defined 50operators to the simple and beautiful Kaleidoscope language. This digression now gives 51us a simple and ugly language in some ways, but also a powerful one at the same time. 52One of the great things about creating your own language is that you get to 53decide what is good or bad. In this tutorial we'll assume that it is okay to 54use this as a way to show some interesting parsing techniques.</p> 55 56<p>At the end of this tutorial, we'll run through an example Kaleidoscope 57application that <a href="#example">renders the Mandelbrot set</a>. This gives 58an example of what you can build with Kaleidoscope and its feature set.</p> 59 60</div> 61 62<!-- *********************************************************************** --> 63<h2><a name="idea">User-defined Operators: the Idea</a></h2> 64<!-- *********************************************************************** --> 65 66<div> 67 68<p> 69The "operator overloading" that we will add to Kaleidoscope is more general than 70languages like C++. In C++, you are only allowed to redefine existing 71operators: you can't programatically change the grammar, introduce new 72operators, change precedence levels, etc. In this chapter, we will add this 73capability to Kaleidoscope, which will let the user round out the set of 74operators that are supported.</p> 75 76<p>The point of going into user-defined operators in a tutorial like this is to 77show the power and flexibility of using a hand-written parser. Thus far, the parser 78we have been implementing uses recursive descent for most parts of the grammar and 79operator precedence parsing for the expressions. See <a 80href="LangImpl2.html">Chapter 2</a> for details. Without using operator 81precedence parsing, it would be very difficult to allow the programmer to 82introduce new operators into the grammar: the grammar is dynamically extensible 83as the JIT runs.</p> 84 85<p>The two specific features we'll add are programmable unary operators (right 86now, Kaleidoscope has no unary operators at all) as well as binary operators. 87An example of this is:</p> 88 89<div class="doc_code"> 90<pre> 91# Logical unary not. 92def unary!(v) 93 if v then 94 0 95 else 96 1; 97 98# Define > with the same precedence as <. 99def binary> 10 (LHS RHS) 100 RHS < LHS; 101 102# Binary "logical or", (note that it does not "short circuit") 103def binary| 5 (LHS RHS) 104 if LHS then 105 1 106 else if RHS then 107 1 108 else 109 0; 110 111# Define = with slightly lower precedence than relationals. 112def binary= 9 (LHS RHS) 113 !(LHS < RHS | LHS > RHS); 114</pre> 115</div> 116 117<p>Many languages aspire to being able to implement their standard runtime 118library in the language itself. In Kaleidoscope, we can implement significant 119parts of the language in the library!</p> 120 121<p>We will break down implementation of these features into two parts: 122implementing support for user-defined binary operators and adding unary 123operators.</p> 124 125</div> 126 127<!-- *********************************************************************** --> 128<h2><a name="binary">User-defined Binary Operators</a></h2> 129<!-- *********************************************************************** --> 130 131<div> 132 133<p>Adding support for user-defined binary operators is pretty simple with our 134current framework. We'll first add support for the unary/binary keywords:</p> 135 136<div class="doc_code"> 137<pre> 138enum Token { 139 ... 140 <b>// operators 141 tok_binary = -11, tok_unary = -12</b> 142}; 143... 144static int gettok() { 145... 146 if (IdentifierStr == "for") return tok_for; 147 if (IdentifierStr == "in") return tok_in; 148 <b>if (IdentifierStr == "binary") return tok_binary; 149 if (IdentifierStr == "unary") return tok_unary;</b> 150 return tok_identifier; 151</pre> 152</div> 153 154<p>This just adds lexer support for the unary and binary keywords, like we 155did in <a href="LangImpl5.html#iflexer">previous chapters</a>. One nice thing 156about our current AST, is that we represent binary operators with full generalisation 157by using their ASCII code as the opcode. For our extended operators, we'll use this 158same representation, so we don't need any new AST or parser support.</p> 159 160<p>On the other hand, we have to be able to represent the definitions of these 161new operators, in the "def binary| 5" part of the function definition. In our 162grammar so far, the "name" for the function definition is parsed as the 163"prototype" production and into the <tt>PrototypeAST</tt> AST node. To 164represent our new user-defined operators as prototypes, we have to extend 165the <tt>PrototypeAST</tt> AST node like this:</p> 166 167<div class="doc_code"> 168<pre> 169/// PrototypeAST - This class represents the "prototype" for a function, 170/// which captures its argument names as well as if it is an operator. 171class PrototypeAST { 172 std::string Name; 173 std::vector<std::string> Args; 174 <b>bool isOperator; 175 unsigned Precedence; // Precedence if a binary op.</b> 176public: 177 PrototypeAST(const std::string &name, const std::vector<std::string> &args, 178 <b>bool isoperator = false, unsigned prec = 0</b>) 179 : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {} 180 181 <b>bool isUnaryOp() const { return isOperator && Args.size() == 1; } 182 bool isBinaryOp() const { return isOperator && Args.size() == 2; } 183 184 char getOperatorName() const { 185 assert(isUnaryOp() || isBinaryOp()); 186 return Name[Name.size()-1]; 187 } 188 189 unsigned getBinaryPrecedence() const { return Precedence; }</b> 190 191 Function *Codegen(); 192}; 193</pre> 194</div> 195 196<p>Basically, in addition to knowing a name for the prototype, we now keep track 197of whether it was an operator, and if it was, what precedence level the operator 198is at. The precedence is only used for binary operators (as you'll see below, 199it just doesn't apply for unary operators). Now that we have a way to represent 200the prototype for a user-defined operator, we need to parse it:</p> 201 202<div class="doc_code"> 203<pre> 204/// prototype 205/// ::= id '(' id* ')' 206<b>/// ::= binary LETTER number? (id, id)</b> 207static PrototypeAST *ParsePrototype() { 208 std::string FnName; 209 210 <b>unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary. 211 unsigned BinaryPrecedence = 30;</b> 212 213 switch (CurTok) { 214 default: 215 return ErrorP("Expected function name in prototype"); 216 case tok_identifier: 217 FnName = IdentifierStr; 218 Kind = 0; 219 getNextToken(); 220 break; 221 <b>case tok_binary: 222 getNextToken(); 223 if (!isascii(CurTok)) 224 return ErrorP("Expected binary operator"); 225 FnName = "binary"; 226 FnName += (char)CurTok; 227 Kind = 2; 228 getNextToken(); 229 230 // Read the precedence if present. 231 if (CurTok == tok_number) { 232 if (NumVal < 1 || NumVal > 100) 233 return ErrorP("Invalid precedecnce: must be 1..100"); 234 BinaryPrecedence = (unsigned)NumVal; 235 getNextToken(); 236 } 237 break;</b> 238 } 239 240 if (CurTok != '(') 241 return ErrorP("Expected '(' in prototype"); 242 243 std::vector<std::string> ArgNames; 244 while (getNextToken() == tok_identifier) 245 ArgNames.push_back(IdentifierStr); 246 if (CurTok != ')') 247 return ErrorP("Expected ')' in prototype"); 248 249 // success. 250 getNextToken(); // eat ')'. 251 252 <b>// Verify right number of names for operator. 253 if (Kind && ArgNames.size() != Kind) 254 return ErrorP("Invalid number of operands for operator"); 255 256 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b> 257} 258</pre> 259</div> 260 261<p>This is all fairly straightforward parsing code, and we have already seen 262a lot of similar code in the past. One interesting part about the code above is 263the couple lines that set up <tt>FnName</tt> for binary operators. This builds names 264like "binary@" for a newly defined "@" operator. This then takes advantage of the 265fact that symbol names in the LLVM symbol table are allowed to have any character in 266them, including embedded nul characters.</p> 267 268<p>The next interesting thing to add, is codegen support for these binary operators. 269Given our current structure, this is a simple addition of a default case for our 270existing binary operator node:</p> 271 272<div class="doc_code"> 273<pre> 274Value *BinaryExprAST::Codegen() { 275 Value *L = LHS->Codegen(); 276 Value *R = RHS->Codegen(); 277 if (L == 0 || R == 0) return 0; 278 279 switch (Op) { 280 case '+': return Builder.CreateFAdd(L, R, "addtmp"); 281 case '-': return Builder.CreateFSub(L, R, "subtmp"); 282 case '*': return Builder.CreateFMul(L, R, "multmp"); 283 case '<': 284 L = Builder.CreateFCmpULT(L, R, "cmptmp"); 285 // Convert bool 0/1 to double 0.0 or 1.0 286 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), 287 "booltmp"); 288 <b>default: break;</b> 289 } 290 291 <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit 292 // a call to it. 293 Function *F = TheModule->getFunction(std::string("binary")+Op); 294 assert(F && "binary operator not found!"); 295 296 Value *Ops[] = { L, R }; 297 return Builder.CreateCall(F, Ops, Ops+2, "binop");</b> 298} 299 300</pre> 301</div> 302 303<p>As you can see above, the new code is actually really simple. It just does 304a lookup for the appropriate operator in the symbol table and generates a 305function call to it. Since user-defined operators are just built as normal 306functions (because the "prototype" boils down to a function with the right 307name) everything falls into place.</p> 308 309<p>The final piece of code we are missing, is a bit of top-level magic:</p> 310 311<div class="doc_code"> 312<pre> 313Function *FunctionAST::Codegen() { 314 NamedValues.clear(); 315 316 Function *TheFunction = Proto->Codegen(); 317 if (TheFunction == 0) 318 return 0; 319 320 <b>// If this is an operator, install it. 321 if (Proto->isBinaryOp()) 322 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b> 323 324 // Create a new basic block to start insertion into. 325 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction); 326 Builder.SetInsertPoint(BB); 327 328 if (Value *RetVal = Body->Codegen()) { 329 ... 330</pre> 331</div> 332 333<p>Basically, before codegening a function, if it is a user-defined operator, we 334register it in the precedence table. This allows the binary operator parsing 335logic we already have in place to handle it. Since we are working on a fully-general operator precedence parser, this is all we need to do to "extend the grammar".</p> 336 337<p>Now we have useful user-defined binary operators. This builds a lot 338on the previous framework we built for other operators. Adding unary operators 339is a bit more challenging, because we don't have any framework for it yet - lets 340see what it takes.</p> 341 342</div> 343 344<!-- *********************************************************************** --> 345<h2><a name="unary">User-defined Unary Operators</a></h2> 346<!-- *********************************************************************** --> 347 348<div> 349 350<p>Since we don't currently support unary operators in the Kaleidoscope 351language, we'll need to add everything to support them. Above, we added simple 352support for the 'unary' keyword to the lexer. In addition to that, we need an 353AST node:</p> 354 355<div class="doc_code"> 356<pre> 357/// UnaryExprAST - Expression class for a unary operator. 358class UnaryExprAST : public ExprAST { 359 char Opcode; 360 ExprAST *Operand; 361public: 362 UnaryExprAST(char opcode, ExprAST *operand) 363 : Opcode(opcode), Operand(operand) {} 364 virtual Value *Codegen(); 365}; 366</pre> 367</div> 368 369<p>This AST node is very simple and obvious by now. It directly mirrors the 370binary operator AST node, except that it only has one child. With this, we 371need to add the parsing logic. Parsing a unary operator is pretty simple: we'll 372add a new function to do it:</p> 373 374<div class="doc_code"> 375<pre> 376/// unary 377/// ::= primary 378/// ::= '!' unary 379static ExprAST *ParseUnary() { 380 // If the current token is not an operator, it must be a primary expr. 381 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',') 382 return ParsePrimary(); 383 384 // If this is a unary operator, read it. 385 int Opc = CurTok; 386 getNextToken(); 387 if (ExprAST *Operand = ParseUnary()) 388 return new UnaryExprAST(Opc, Operand); 389 return 0; 390} 391</pre> 392</div> 393 394<p>The grammar we add is pretty straightforward here. If we see a unary 395operator when parsing a primary operator, we eat the operator as a prefix and 396parse the remaining piece as another unary operator. This allows us to handle 397multiple unary operators (e.g. "!!x"). Note that unary operators can't have 398ambiguous parses like binary operators can, so there is no need for precedence 399information.</p> 400 401<p>The problem with this function, is that we need to call ParseUnary from somewhere. 402To do this, we change previous callers of ParsePrimary to call ParseUnary 403instead:</p> 404 405<div class="doc_code"> 406<pre> 407/// binoprhs 408/// ::= ('+' unary)* 409static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) { 410 ... 411 <b>// Parse the unary expression after the binary operator. 412 ExprAST *RHS = ParseUnary(); 413 if (!RHS) return 0;</b> 414 ... 415} 416/// expression 417/// ::= unary binoprhs 418/// 419static ExprAST *ParseExpression() { 420 <b>ExprAST *LHS = ParseUnary();</b> 421 if (!LHS) return 0; 422 423 return ParseBinOpRHS(0, LHS); 424} 425</pre> 426</div> 427 428<p>With these two simple changes, we are now able to parse unary operators and build the 429AST for them. Next up, we need to add parser support for prototypes, to parse 430the unary operator prototype. We extend the binary operator code above 431with:</p> 432 433<div class="doc_code"> 434<pre> 435/// prototype 436/// ::= id '(' id* ')' 437/// ::= binary LETTER number? (id, id) 438<b>/// ::= unary LETTER (id)</b> 439static PrototypeAST *ParsePrototype() { 440 std::string FnName; 441 442 unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary. 443 unsigned BinaryPrecedence = 30; 444 445 switch (CurTok) { 446 default: 447 return ErrorP("Expected function name in prototype"); 448 case tok_identifier: 449 FnName = IdentifierStr; 450 Kind = 0; 451 getNextToken(); 452 break; 453 <b>case tok_unary: 454 getNextToken(); 455 if (!isascii(CurTok)) 456 return ErrorP("Expected unary operator"); 457 FnName = "unary"; 458 FnName += (char)CurTok; 459 Kind = 1; 460 getNextToken(); 461 break;</b> 462 case tok_binary: 463 ... 464</pre> 465</div> 466 467<p>As with binary operators, we name unary operators with a name that includes 468the operator character. This assists us at code generation time. Speaking of, 469the final piece we need to add is codegen support for unary operators. It looks 470like this:</p> 471 472<div class="doc_code"> 473<pre> 474Value *UnaryExprAST::Codegen() { 475 Value *OperandV = Operand->Codegen(); 476 if (OperandV == 0) return 0; 477 478 Function *F = TheModule->getFunction(std::string("unary")+Opcode); 479 if (F == 0) 480 return ErrorV("Unknown unary operator"); 481 482 return Builder.CreateCall(F, OperandV, "unop"); 483} 484</pre> 485</div> 486 487<p>This code is similar to, but simpler than, the code for binary operators. It 488is simpler primarily because it doesn't need to handle any predefined operators. 489</p> 490 491</div> 492 493<!-- *********************************************************************** --> 494<h2><a name="example">Kicking the Tires</a></h2> 495<!-- *********************************************************************** --> 496 497<div> 498 499<p>It is somewhat hard to believe, but with a few simple extensions we've 500covered in the last chapters, we have grown a real-ish language. With this, we 501can do a lot of interesting things, including I/O, math, and a bunch of other 502things. For example, we can now add a nice sequencing operator (printd is 503defined to print out the specified value and a newline):</p> 504 505<div class="doc_code"> 506<pre> 507ready> <b>extern printd(x);</b> 508Read extern: declare double @printd(double) 509ready> <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b> 510.. 511ready> <b>printd(123) : printd(456) : printd(789);</b> 512123.000000 513456.000000 514789.000000 515Evaluated to 0.000000 516</pre> 517</div> 518 519<p>We can also define a bunch of other "primitive" operations, such as:</p> 520 521<div class="doc_code"> 522<pre> 523# Logical unary not. 524def unary!(v) 525 if v then 526 0 527 else 528 1; 529 530# Unary negate. 531def unary-(v) 532 0-v; 533 534# Define > with the same precedence as <. 535def binary> 10 (LHS RHS) 536 RHS < LHS; 537 538# Binary logical or, which does not short circuit. 539def binary| 5 (LHS RHS) 540 if LHS then 541 1 542 else if RHS then 543 1 544 else 545 0; 546 547# Binary logical and, which does not short circuit. 548def binary& 6 (LHS RHS) 549 if !LHS then 550 0 551 else 552 !!RHS; 553 554# Define = with slightly lower precedence than relationals. 555def binary = 9 (LHS RHS) 556 !(LHS < RHS | LHS > RHS); 557 558</pre> 559</div> 560 561 562<p>Given the previous if/then/else support, we can also define interesting 563functions for I/O. For example, the following prints out a character whose 564"density" reflects the value passed in: the lower the value, the denser the 565character:</p> 566 567<div class="doc_code"> 568<pre> 569ready> 570<b> 571extern putchard(char) 572def printdensity(d) 573 if d > 8 then 574 putchard(32) # ' ' 575 else if d > 4 then 576 putchard(46) # '.' 577 else if d > 2 then 578 putchard(43) # '+' 579 else 580 putchard(42); # '*'</b> 581... 582ready> <b>printdensity(1): printdensity(2): printdensity(3) : 583 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b> 584*++.. 585Evaluated to 0.000000 586</pre> 587</div> 588 589<p>Based on these simple primitive operations, we can start to define more 590interesting things. For example, here's a little function that solves for the 591number of iterations it takes a function in the complex plane to 592converge:</p> 593 594<div class="doc_code"> 595<pre> 596# determine whether the specific location diverges. 597# Solve for z = z^2 + c in the complex plane. 598def mandleconverger(real imag iters creal cimag) 599 if iters > 255 | (real*real + imag*imag > 4) then 600 iters 601 else 602 mandleconverger(real*real - imag*imag + creal, 603 2*real*imag + cimag, 604 iters+1, creal, cimag); 605 606# return the number of iterations required for the iteration to escape 607def mandleconverge(real imag) 608 mandleconverger(real, imag, 0, real, imag); 609</pre> 610</div> 611 612<p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis 613for computation of the <a 614href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our 615<tt>mandelconverge</tt> function returns the number of iterations that it takes 616for a complex orbit to escape, saturating to 255. This is not a very useful 617function by itself, but if you plot its value over a two-dimensional plane, 618you can see the Mandelbrot set. Given that we are limited to using putchard 619here, our amazing graphical output is limited, but we can whip together 620something using the density plotter above:</p> 621 622<div class="doc_code"> 623<pre> 624# compute and plot the mandlebrot set with the specified 2 dimensional range 625# info. 626def mandelhelp(xmin xmax xstep ymin ymax ystep) 627 for y = ymin, y < ymax, ystep in ( 628 (for x = xmin, x < xmax, xstep in 629 printdensity(mandleconverge(x,y))) 630 : putchard(10) 631 ) 632 633# mandel - This is a convenient helper function for ploting the mandelbrot set 634# from the specified position with the specified Magnification. 635def mandel(realstart imagstart realmag imagmag) 636 mandelhelp(realstart, realstart+realmag*78, realmag, 637 imagstart, imagstart+imagmag*40, imagmag); 638</pre> 639</div> 640 641<p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p> 642 643<div class="doc_code"> 644<pre> 645ready> <b>mandel(-2.3, -1.3, 0.05, 0.07);</b> 646*******************************+++++++++++************************************* 647*************************+++++++++++++++++++++++******************************* 648**********************+++++++++++++++++++++++++++++**************************** 649*******************+++++++++++++++++++++.. ...++++++++************************* 650*****************++++++++++++++++++++++.... ...+++++++++*********************** 651***************+++++++++++++++++++++++..... ...+++++++++********************* 652**************+++++++++++++++++++++++.... ....+++++++++******************** 653*************++++++++++++++++++++++...... .....++++++++******************* 654************+++++++++++++++++++++....... .......+++++++****************** 655***********+++++++++++++++++++.... ... .+++++++***************** 656**********+++++++++++++++++....... .+++++++**************** 657*********++++++++++++++........... ...+++++++*************** 658********++++++++++++............ ...++++++++************** 659********++++++++++... .......... .++++++++************** 660*******+++++++++..... .+++++++++************* 661*******++++++++...... ..+++++++++************* 662*******++++++....... ..+++++++++************* 663*******+++++...... ..+++++++++************* 664*******.... .... ...+++++++++************* 665*******.... . ...+++++++++************* 666*******+++++...... ...+++++++++************* 667*******++++++....... ..+++++++++************* 668*******++++++++...... .+++++++++************* 669*******+++++++++..... ..+++++++++************* 670********++++++++++... .......... .++++++++************** 671********++++++++++++............ ...++++++++************** 672*********++++++++++++++.......... ...+++++++*************** 673**********++++++++++++++++........ .+++++++**************** 674**********++++++++++++++++++++.... ... ..+++++++**************** 675***********++++++++++++++++++++++....... .......++++++++***************** 676************+++++++++++++++++++++++...... ......++++++++****************** 677**************+++++++++++++++++++++++.... ....++++++++******************** 678***************+++++++++++++++++++++++..... ...+++++++++********************* 679*****************++++++++++++++++++++++.... ...++++++++*********************** 680*******************+++++++++++++++++++++......++++++++************************* 681*********************++++++++++++++++++++++.++++++++*************************** 682*************************+++++++++++++++++++++++******************************* 683******************************+++++++++++++************************************ 684******************************************************************************* 685******************************************************************************* 686******************************************************************************* 687Evaluated to 0.000000 688ready> <b>mandel(-2, -1, 0.02, 0.04);</b> 689**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++ 690***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 691*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++. 692*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++... 693*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++..... 694***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........ 695**************++++++++++++++++++++++++++++++++++++++++++++++++++++++........... 696************+++++++++++++++++++++++++++++++++++++++++++++++++++++.............. 697***********++++++++++++++++++++++++++++++++++++++++++++++++++........ . 698**********++++++++++++++++++++++++++++++++++++++++++++++............. 699********+++++++++++++++++++++++++++++++++++++++++++.................. 700*******+++++++++++++++++++++++++++++++++++++++....................... 701******+++++++++++++++++++++++++++++++++++........................... 702*****++++++++++++++++++++++++++++++++............................ 703*****++++++++++++++++++++++++++++............................... 704****++++++++++++++++++++++++++...... ......................... 705***++++++++++++++++++++++++......... ...... ........... 706***++++++++++++++++++++++............ 707**+++++++++++++++++++++.............. 708**+++++++++++++++++++................ 709*++++++++++++++++++................. 710*++++++++++++++++............ ... 711*++++++++++++++.............. 712*+++....++++................ 713*.......... ........... 714* 715*.......... ........... 716*+++....++++................ 717*++++++++++++++.............. 718*++++++++++++++++............ ... 719*++++++++++++++++++................. 720**+++++++++++++++++++................ 721**+++++++++++++++++++++.............. 722***++++++++++++++++++++++............ 723***++++++++++++++++++++++++......... ...... ........... 724****++++++++++++++++++++++++++...... ......................... 725*****++++++++++++++++++++++++++++............................... 726*****++++++++++++++++++++++++++++++++............................ 727******+++++++++++++++++++++++++++++++++++........................... 728*******+++++++++++++++++++++++++++++++++++++++....................... 729********+++++++++++++++++++++++++++++++++++++++++++.................. 730Evaluated to 0.000000 731ready> <b>mandel(-0.9, -1.4, 0.02, 0.03);</b> 732******************************************************************************* 733******************************************************************************* 734******************************************************************************* 735**********+++++++++++++++++++++************************************************ 736*+++++++++++++++++++++++++++++++++++++++*************************************** 737+++++++++++++++++++++++++++++++++++++++++++++********************************** 738++++++++++++++++++++++++++++++++++++++++++++++++++***************************** 739++++++++++++++++++++++++++++++++++++++++++++++++++++++************************* 740+++++++++++++++++++++++++++++++++++++++++++++++++++++++++********************** 741+++++++++++++++++++++++++++++++++.........++++++++++++++++++******************* 742+++++++++++++++++++++++++++++++.... ......+++++++++++++++++++**************** 743+++++++++++++++++++++++++++++....... ........+++++++++++++++++++************** 744++++++++++++++++++++++++++++........ ........++++++++++++++++++++************ 745+++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++********** 746++++++++++++++++++++++++++........... ....++++++++++++++++++++++******** 747++++++++++++++++++++++++............. .......++++++++++++++++++++++****** 748+++++++++++++++++++++++............. ........+++++++++++++++++++++++**** 749++++++++++++++++++++++........... ..........++++++++++++++++++++++*** 750++++++++++++++++++++........... .........++++++++++++++++++++++* 751++++++++++++++++++............ ...........++++++++++++++++++++ 752++++++++++++++++............... .............++++++++++++++++++ 753++++++++++++++................. ...............++++++++++++++++ 754++++++++++++.................. .................++++++++++++++ 755+++++++++.................. .................+++++++++++++ 756++++++........ . ......... ..++++++++++++ 757++............ ...... ....++++++++++ 758.............. ...++++++++++ 759.............. ....+++++++++ 760.............. .....++++++++ 761............. ......++++++++ 762........... .......++++++++ 763......... ........+++++++ 764......... ........+++++++ 765......... ....+++++++ 766........ ...+++++++ 767....... ...+++++++ 768 ....+++++++ 769 .....+++++++ 770 ....+++++++ 771 ....+++++++ 772 ....+++++++ 773Evaluated to 0.000000 774ready> <b>^D</b> 775</pre> 776</div> 777 778<p>At this point, you may be starting to realize that Kaleidoscope is a real 779and powerful language. It may not be self-similar :), but it can be used to 780plot things that are!</p> 781 782<p>With this, we conclude the "adding user-defined operators" chapter of the 783tutorial. We have successfully augmented our language, adding the ability to extend the 784language in the library, and we have shown how this can be used to build a simple but 785interesting end-user application in Kaleidoscope. At this point, Kaleidoscope 786can build a variety of applications that are functional and can call functions 787with side-effects, but it can't actually define and mutate a variable itself. 788</p> 789 790<p>Strikingly, variable mutation is an important feature of some 791languages, and it is not at all obvious how to <a href="LangImpl7.html">add 792support for mutable variables</a> without having to add an "SSA construction" 793phase to your front-end. In the next chapter, we will describe how you can 794add variable mutation without building SSA in your front-end.</p> 795 796</div> 797 798<!-- *********************************************************************** --> 799<h2><a name="code">Full Code Listing</a></h2> 800<!-- *********************************************************************** --> 801 802<div> 803 804<p> 805Here is the complete code listing for our running example, enhanced with the 806if/then/else and for expressions.. To build this example, use: 807</p> 808 809<div class="doc_code"> 810<pre> 811 # Compile 812 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy 813 # Run 814 ./toy 815</pre> 816</div> 817 818<p>Here is the code:</p> 819 820<div class="doc_code"> 821<pre> 822#include "llvm/DerivedTypes.h" 823#include "llvm/ExecutionEngine/ExecutionEngine.h" 824#include "llvm/ExecutionEngine/JIT.h" 825#include "llvm/LLVMContext.h" 826#include "llvm/Module.h" 827#include "llvm/PassManager.h" 828#include "llvm/Analysis/Verifier.h" 829#include "llvm/Analysis/Passes.h" 830#include "llvm/Target/TargetData.h" 831#include "llvm/Target/TargetSelect.h" 832#include "llvm/Transforms/Scalar.h" 833#include "llvm/Support/IRBuilder.h" 834#include <cstdio> 835#include <string> 836#include <map> 837#include <vector> 838using namespace llvm; 839 840//===----------------------------------------------------------------------===// 841// Lexer 842//===----------------------------------------------------------------------===// 843 844// The lexer returns tokens [0-255] if it is an unknown character, otherwise one 845// of these for known things. 846enum Token { 847 tok_eof = -1, 848 849 // commands 850 tok_def = -2, tok_extern = -3, 851 852 // primary 853 tok_identifier = -4, tok_number = -5, 854 855 // control 856 tok_if = -6, tok_then = -7, tok_else = -8, 857 tok_for = -9, tok_in = -10, 858 859 // operators 860 tok_binary = -11, tok_unary = -12 861}; 862 863static std::string IdentifierStr; // Filled in if tok_identifier 864static double NumVal; // Filled in if tok_number 865 866/// gettok - Return the next token from standard input. 867static int gettok() { 868 static int LastChar = ' '; 869 870 // Skip any whitespace. 871 while (isspace(LastChar)) 872 LastChar = getchar(); 873 874 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]* 875 IdentifierStr = LastChar; 876 while (isalnum((LastChar = getchar()))) 877 IdentifierStr += LastChar; 878 879 if (IdentifierStr == "def") return tok_def; 880 if (IdentifierStr == "extern") return tok_extern; 881 if (IdentifierStr == "if") return tok_if; 882 if (IdentifierStr == "then") return tok_then; 883 if (IdentifierStr == "else") return tok_else; 884 if (IdentifierStr == "for") return tok_for; 885 if (IdentifierStr == "in") return tok_in; 886 if (IdentifierStr == "binary") return tok_binary; 887 if (IdentifierStr == "unary") return tok_unary; 888 return tok_identifier; 889 } 890 891 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+ 892 std::string NumStr; 893 do { 894 NumStr += LastChar; 895 LastChar = getchar(); 896 } while (isdigit(LastChar) || LastChar == '.'); 897 898 NumVal = strtod(NumStr.c_str(), 0); 899 return tok_number; 900 } 901 902 if (LastChar == '#') { 903 // Comment until end of line. 904 do LastChar = getchar(); 905 while (LastChar != EOF && LastChar != '\n' && LastChar != '\r'); 906 907 if (LastChar != EOF) 908 return gettok(); 909 } 910 911 // Check for end of file. Don't eat the EOF. 912 if (LastChar == EOF) 913 return tok_eof; 914 915 // Otherwise, just return the character as its ascii value. 916 int ThisChar = LastChar; 917 LastChar = getchar(); 918 return ThisChar; 919} 920 921//===----------------------------------------------------------------------===// 922// Abstract Syntax Tree (aka Parse Tree) 923//===----------------------------------------------------------------------===// 924 925/// ExprAST - Base class for all expression nodes. 926class ExprAST { 927public: 928 virtual ~ExprAST() {} 929 virtual Value *Codegen() = 0; 930}; 931 932/// NumberExprAST - Expression class for numeric literals like "1.0". 933class NumberExprAST : public ExprAST { 934 double Val; 935public: 936 NumberExprAST(double val) : Val(val) {} 937 virtual Value *Codegen(); 938}; 939 940/// VariableExprAST - Expression class for referencing a variable, like "a". 941class VariableExprAST : public ExprAST { 942 std::string Name; 943public: 944 VariableExprAST(const std::string &name) : Name(name) {} 945 virtual Value *Codegen(); 946}; 947 948/// UnaryExprAST - Expression class for a unary operator. 949class UnaryExprAST : public ExprAST { 950 char Opcode; 951 ExprAST *Operand; 952public: 953 UnaryExprAST(char opcode, ExprAST *operand) 954 : Opcode(opcode), Operand(operand) {} 955 virtual Value *Codegen(); 956}; 957 958/// BinaryExprAST - Expression class for a binary operator. 959class BinaryExprAST : public ExprAST { 960 char Op; 961 ExprAST *LHS, *RHS; 962public: 963 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs) 964 : Op(op), LHS(lhs), RHS(rhs) {} 965 virtual Value *Codegen(); 966}; 967 968/// CallExprAST - Expression class for function calls. 969class CallExprAST : public ExprAST { 970 std::string Callee; 971 std::vector<ExprAST*> Args; 972public: 973 CallExprAST(const std::string &callee, std::vector<ExprAST*> &args) 974 : Callee(callee), Args(args) {} 975 virtual Value *Codegen(); 976}; 977 978/// IfExprAST - Expression class for if/then/else. 979class IfExprAST : public ExprAST { 980 ExprAST *Cond, *Then, *Else; 981public: 982 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else) 983 : Cond(cond), Then(then), Else(_else) {} 984 virtual Value *Codegen(); 985}; 986 987/// ForExprAST - Expression class for for/in. 988class ForExprAST : public ExprAST { 989 std::string VarName; 990 ExprAST *Start, *End, *Step, *Body; 991public: 992 ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end, 993 ExprAST *step, ExprAST *body) 994 : VarName(varname), Start(start), End(end), Step(step), Body(body) {} 995 virtual Value *Codegen(); 996}; 997 998/// PrototypeAST - This class represents the "prototype" for a function, 999/// which captures its name, and its argument names (thus implicitly the number 1000/// of arguments the function takes), as well as if it is an operator. 1001class PrototypeAST { 1002 std::string Name; 1003 std::vector<std::string> Args; 1004 bool isOperator; 1005 unsigned Precedence; // Precedence if a binary op. 1006public: 1007 PrototypeAST(const std::string &name, const std::vector<std::string> &args, 1008 bool isoperator = false, unsigned prec = 0) 1009 : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {} 1010 1011 bool isUnaryOp() const { return isOperator && Args.size() == 1; } 1012 bool isBinaryOp() const { return isOperator && Args.size() == 2; } 1013 1014 char getOperatorName() const { 1015 assert(isUnaryOp() || isBinaryOp()); 1016 return Name[Name.size()-1]; 1017 } 1018 1019 unsigned getBinaryPrecedence() const { return Precedence; } 1020 1021 Function *Codegen(); 1022}; 1023 1024/// FunctionAST - This class represents a function definition itself. 1025class FunctionAST { 1026 PrototypeAST *Proto; 1027 ExprAST *Body; 1028public: 1029 FunctionAST(PrototypeAST *proto, ExprAST *body) 1030 : Proto(proto), Body(body) {} 1031 1032 Function *Codegen(); 1033}; 1034 1035//===----------------------------------------------------------------------===// 1036// Parser 1037//===----------------------------------------------------------------------===// 1038 1039/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current 1040/// token the parser is looking at. getNextToken reads another token from the 1041/// lexer and updates CurTok with its results. 1042static int CurTok; 1043static int getNextToken() { 1044 return CurTok = gettok(); 1045} 1046 1047/// BinopPrecedence - This holds the precedence for each binary operator that is 1048/// defined. 1049static std::map<char, int> BinopPrecedence; 1050 1051/// GetTokPrecedence - Get the precedence of the pending binary operator token. 1052static int GetTokPrecedence() { 1053 if (!isascii(CurTok)) 1054 return -1; 1055 1056 // Make sure it's a declared binop. 1057 int TokPrec = BinopPrecedence[CurTok]; 1058 if (TokPrec <= 0) return -1; 1059 return TokPrec; 1060} 1061 1062/// Error* - These are little helper functions for error handling. 1063ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;} 1064PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; } 1065FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; } 1066 1067static ExprAST *ParseExpression(); 1068 1069/// identifierexpr 1070/// ::= identifier 1071/// ::= identifier '(' expression* ')' 1072static ExprAST *ParseIdentifierExpr() { 1073 std::string IdName = IdentifierStr; 1074 1075 getNextToken(); // eat identifier. 1076 1077 if (CurTok != '(') // Simple variable ref. 1078 return new VariableExprAST(IdName); 1079 1080 // Call. 1081 getNextToken(); // eat ( 1082 std::vector<ExprAST*> Args; 1083 if (CurTok != ')') { 1084 while (1) { 1085 ExprAST *Arg = ParseExpression(); 1086 if (!Arg) return 0; 1087 Args.push_back(Arg); 1088 1089 if (CurTok == ')') break; 1090 1091 if (CurTok != ',') 1092 return Error("Expected ')' or ',' in argument list"); 1093 getNextToken(); 1094 } 1095 } 1096 1097 // Eat the ')'. 1098 getNextToken(); 1099 1100 return new CallExprAST(IdName, Args); 1101} 1102 1103/// numberexpr ::= number 1104static ExprAST *ParseNumberExpr() { 1105 ExprAST *Result = new NumberExprAST(NumVal); 1106 getNextToken(); // consume the number 1107 return Result; 1108} 1109 1110/// parenexpr ::= '(' expression ')' 1111static ExprAST *ParseParenExpr() { 1112 getNextToken(); // eat (. 1113 ExprAST *V = ParseExpression(); 1114 if (!V) return 0; 1115 1116 if (CurTok != ')') 1117 return Error("expected ')'"); 1118 getNextToken(); // eat ). 1119 return V; 1120} 1121 1122/// ifexpr ::= 'if' expression 'then' expression 'else' expression 1123static ExprAST *ParseIfExpr() { 1124 getNextToken(); // eat the if. 1125 1126 // condition. 1127 ExprAST *Cond = ParseExpression(); 1128 if (!Cond) return 0; 1129 1130 if (CurTok != tok_then) 1131 return Error("expected then"); 1132 getNextToken(); // eat the then 1133 1134 ExprAST *Then = ParseExpression(); 1135 if (Then == 0) return 0; 1136 1137 if (CurTok != tok_else) 1138 return Error("expected else"); 1139 1140 getNextToken(); 1141 1142 ExprAST *Else = ParseExpression(); 1143 if (!Else) return 0; 1144 1145 return new IfExprAST(Cond, Then, Else); 1146} 1147 1148/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression 1149static ExprAST *ParseForExpr() { 1150 getNextToken(); // eat the for. 1151 1152 if (CurTok != tok_identifier) 1153 return Error("expected identifier after for"); 1154 1155 std::string IdName = IdentifierStr; 1156 getNextToken(); // eat identifier. 1157 1158 if (CurTok != '=') 1159 return Error("expected '=' after for"); 1160 getNextToken(); // eat '='. 1161 1162 1163 ExprAST *Start = ParseExpression(); 1164 if (Start == 0) return 0; 1165 if (CurTok != ',') 1166 return Error("expected ',' after for start value"); 1167 getNextToken(); 1168 1169 ExprAST *End = ParseExpression(); 1170 if (End == 0) return 0; 1171 1172 // The step value is optional. 1173 ExprAST *Step = 0; 1174 if (CurTok == ',') { 1175 getNextToken(); 1176 Step = ParseExpression(); 1177 if (Step == 0) return 0; 1178 } 1179 1180 if (CurTok != tok_in) 1181 return Error("expected 'in' after for"); 1182 getNextToken(); // eat 'in'. 1183 1184 ExprAST *Body = ParseExpression(); 1185 if (Body == 0) return 0; 1186 1187 return new ForExprAST(IdName, Start, End, Step, Body); 1188} 1189 1190/// primary 1191/// ::= identifierexpr 1192/// ::= numberexpr 1193/// ::= parenexpr 1194/// ::= ifexpr 1195/// ::= forexpr 1196static ExprAST *ParsePrimary() { 1197 switch (CurTok) { 1198 default: return Error("unknown token when expecting an expression"); 1199 case tok_identifier: return ParseIdentifierExpr(); 1200 case tok_number: return ParseNumberExpr(); 1201 case '(': return ParseParenExpr(); 1202 case tok_if: return ParseIfExpr(); 1203 case tok_for: return ParseForExpr(); 1204 } 1205} 1206 1207/// unary 1208/// ::= primary 1209/// ::= '!' unary 1210static ExprAST *ParseUnary() { 1211 // If the current token is not an operator, it must be a primary expr. 1212 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',') 1213 return ParsePrimary(); 1214 1215 // If this is a unary operator, read it. 1216 int Opc = CurTok; 1217 getNextToken(); 1218 if (ExprAST *Operand = ParseUnary()) 1219 return new UnaryExprAST(Opc, Operand); 1220 return 0; 1221} 1222 1223/// binoprhs 1224/// ::= ('+' unary)* 1225static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) { 1226 // If this is a binop, find its precedence. 1227 while (1) { 1228 int TokPrec = GetTokPrecedence(); 1229 1230 // If this is a binop that binds at least as tightly as the current binop, 1231 // consume it, otherwise we are done. 1232 if (TokPrec < ExprPrec) 1233 return LHS; 1234 1235 // Okay, we know this is a binop. 1236 int BinOp = CurTok; 1237 getNextToken(); // eat binop 1238 1239 // Parse the unary expression after the binary operator. 1240 ExprAST *RHS = ParseUnary(); 1241 if (!RHS) return 0; 1242 1243 // If BinOp binds less tightly with RHS than the operator after RHS, let 1244 // the pending operator take RHS as its LHS. 1245 int NextPrec = GetTokPrecedence(); 1246 if (TokPrec < NextPrec) { 1247 RHS = ParseBinOpRHS(TokPrec+1, RHS); 1248 if (RHS == 0) return 0; 1249 } 1250 1251 // Merge LHS/RHS. 1252 LHS = new BinaryExprAST(BinOp, LHS, RHS); 1253 } 1254} 1255 1256/// expression 1257/// ::= unary binoprhs 1258/// 1259static ExprAST *ParseExpression() { 1260 ExprAST *LHS = ParseUnary(); 1261 if (!LHS) return 0; 1262 1263 return ParseBinOpRHS(0, LHS); 1264} 1265 1266/// prototype 1267/// ::= id '(' id* ')' 1268/// ::= binary LETTER number? (id, id) 1269/// ::= unary LETTER (id) 1270static PrototypeAST *ParsePrototype() { 1271 std::string FnName; 1272 1273 unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary. 1274 unsigned BinaryPrecedence = 30; 1275 1276 switch (CurTok) { 1277 default: 1278 return ErrorP("Expected function name in prototype"); 1279 case tok_identifier: 1280 FnName = IdentifierStr; 1281 Kind = 0; 1282 getNextToken(); 1283 break; 1284 case tok_unary: 1285 getNextToken(); 1286 if (!isascii(CurTok)) 1287 return ErrorP("Expected unary operator"); 1288 FnName = "unary"; 1289 FnName += (char)CurTok; 1290 Kind = 1; 1291 getNextToken(); 1292 break; 1293 case tok_binary: 1294 getNextToken(); 1295 if (!isascii(CurTok)) 1296 return ErrorP("Expected binary operator"); 1297 FnName = "binary"; 1298 FnName += (char)CurTok; 1299 Kind = 2; 1300 getNextToken(); 1301 1302 // Read the precedence if present. 1303 if (CurTok == tok_number) { 1304 if (NumVal < 1 || NumVal > 100) 1305 return ErrorP("Invalid precedecnce: must be 1..100"); 1306 BinaryPrecedence = (unsigned)NumVal; 1307 getNextToken(); 1308 } 1309 break; 1310 } 1311 1312 if (CurTok != '(') 1313 return ErrorP("Expected '(' in prototype"); 1314 1315 std::vector<std::string> ArgNames; 1316 while (getNextToken() == tok_identifier) 1317 ArgNames.push_back(IdentifierStr); 1318 if (CurTok != ')') 1319 return ErrorP("Expected ')' in prototype"); 1320 1321 // success. 1322 getNextToken(); // eat ')'. 1323 1324 // Verify right number of names for operator. 1325 if (Kind && ArgNames.size() != Kind) 1326 return ErrorP("Invalid number of operands for operator"); 1327 1328 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence); 1329} 1330 1331/// definition ::= 'def' prototype expression 1332static FunctionAST *ParseDefinition() { 1333 getNextToken(); // eat def. 1334 PrototypeAST *Proto = ParsePrototype(); 1335 if (Proto == 0) return 0; 1336 1337 if (ExprAST *E = ParseExpression()) 1338 return new FunctionAST(Proto, E); 1339 return 0; 1340} 1341 1342/// toplevelexpr ::= expression 1343static FunctionAST *ParseTopLevelExpr() { 1344 if (ExprAST *E = ParseExpression()) { 1345 // Make an anonymous proto. 1346 PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>()); 1347 return new FunctionAST(Proto, E); 1348 } 1349 return 0; 1350} 1351 1352/// external ::= 'extern' prototype 1353static PrototypeAST *ParseExtern() { 1354 getNextToken(); // eat extern. 1355 return ParsePrototype(); 1356} 1357 1358//===----------------------------------------------------------------------===// 1359// Code Generation 1360//===----------------------------------------------------------------------===// 1361 1362static Module *TheModule; 1363static IRBuilder<> Builder(getGlobalContext()); 1364static std::map<std::string, Value*> NamedValues; 1365static FunctionPassManager *TheFPM; 1366 1367Value *ErrorV(const char *Str) { Error(Str); return 0; } 1368 1369Value *NumberExprAST::Codegen() { 1370 return ConstantFP::get(getGlobalContext(), APFloat(Val)); 1371} 1372 1373Value *VariableExprAST::Codegen() { 1374 // Look this variable up in the function. 1375 Value *V = NamedValues[Name]; 1376 return V ? V : ErrorV("Unknown variable name"); 1377} 1378 1379Value *UnaryExprAST::Codegen() { 1380 Value *OperandV = Operand->Codegen(); 1381 if (OperandV == 0) return 0; 1382 1383 Function *F = TheModule->getFunction(std::string("unary")+Opcode); 1384 if (F == 0) 1385 return ErrorV("Unknown unary operator"); 1386 1387 return Builder.CreateCall(F, OperandV, "unop"); 1388} 1389 1390Value *BinaryExprAST::Codegen() { 1391 Value *L = LHS->Codegen(); 1392 Value *R = RHS->Codegen(); 1393 if (L == 0 || R == 0) return 0; 1394 1395 switch (Op) { 1396 case '+': return Builder.CreateFAdd(L, R, "addtmp"); 1397 case '-': return Builder.CreateFSub(L, R, "subtmp"); 1398 case '*': return Builder.CreateFMul(L, R, "multmp"); 1399 case '<': 1400 L = Builder.CreateFCmpULT(L, R, "cmptmp"); 1401 // Convert bool 0/1 to double 0.0 or 1.0 1402 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()), 1403 "booltmp"); 1404 default: break; 1405 } 1406 1407 // If it wasn't a builtin binary operator, it must be a user defined one. Emit 1408 // a call to it. 1409 Function *F = TheModule->getFunction(std::string("binary")+Op); 1410 assert(F && "binary operator not found!"); 1411 1412 Value *Ops[] = { L, R }; 1413 return Builder.CreateCall(F, Ops, Ops+2, "binop"); 1414} 1415 1416Value *CallExprAST::Codegen() { 1417 // Look up the name in the global module table. 1418 Function *CalleeF = TheModule->getFunction(Callee); 1419 if (CalleeF == 0) 1420 return ErrorV("Unknown function referenced"); 1421 1422 // If argument mismatch error. 1423 if (CalleeF->arg_size() != Args.size()) 1424 return ErrorV("Incorrect # arguments passed"); 1425 1426 std::vector<Value*> ArgsV; 1427 for (unsigned i = 0, e = Args.size(); i != e; ++i) { 1428 ArgsV.push_back(Args[i]->Codegen()); 1429 if (ArgsV.back() == 0) return 0; 1430 } 1431 1432 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp"); 1433} 1434 1435Value *IfExprAST::Codegen() { 1436 Value *CondV = Cond->Codegen(); 1437 if (CondV == 0) return 0; 1438 1439 // Convert condition to a bool by comparing equal to 0.0. 1440 CondV = Builder.CreateFCmpONE(CondV, 1441 ConstantFP::get(getGlobalContext(), APFloat(0.0)), 1442 "ifcond"); 1443 1444 Function *TheFunction = Builder.GetInsertBlock()->getParent(); 1445 1446 // Create blocks for the then and else cases. Insert the 'then' block at the 1447 // end of the function. 1448 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction); 1449 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else"); 1450 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont"); 1451 1452 Builder.CreateCondBr(CondV, ThenBB, ElseBB); 1453 1454 // Emit then value. 1455 Builder.SetInsertPoint(ThenBB); 1456 1457 Value *ThenV = Then->Codegen(); 1458 if (ThenV == 0) return 0; 1459 1460 Builder.CreateBr(MergeBB); 1461 // Codegen of 'Then' can change the current block, update ThenBB for the PHI. 1462 ThenBB = Builder.GetInsertBlock(); 1463 1464 // Emit else block. 1465 TheFunction->getBasicBlockList().push_back(ElseBB); 1466 Builder.SetInsertPoint(ElseBB); 1467 1468 Value *ElseV = Else->Codegen(); 1469 if (ElseV == 0) return 0; 1470 1471 Builder.CreateBr(MergeBB); 1472 // Codegen of 'Else' can change the current block, update ElseBB for the PHI. 1473 ElseBB = Builder.GetInsertBlock(); 1474 1475 // Emit merge block. 1476 TheFunction->getBasicBlockList().push_back(MergeBB); 1477 Builder.SetInsertPoint(MergeBB); 1478 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, 1479 "iftmp"); 1480 1481 PN->addIncoming(ThenV, ThenBB); 1482 PN->addIncoming(ElseV, ElseBB); 1483 return PN; 1484} 1485 1486Value *ForExprAST::Codegen() { 1487 // Output this as: 1488 // ... 1489 // start = startexpr 1490 // goto loop 1491 // loop: 1492 // variable = phi [start, loopheader], [nextvariable, loopend] 1493 // ... 1494 // bodyexpr 1495 // ... 1496 // loopend: 1497 // step = stepexpr 1498 // nextvariable = variable + step 1499 // endcond = endexpr 1500 // br endcond, loop, endloop 1501 // outloop: 1502 1503 // Emit the start code first, without 'variable' in scope. 1504 Value *StartVal = Start->Codegen(); 1505 if (StartVal == 0) return 0; 1506 1507 // Make the new basic block for the loop header, inserting after current 1508 // block. 1509 Function *TheFunction = Builder.GetInsertBlock()->getParent(); 1510 BasicBlock *PreheaderBB = Builder.GetInsertBlock(); 1511 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction); 1512 1513 // Insert an explicit fall through from the current block to the LoopBB. 1514 Builder.CreateBr(LoopBB); 1515 1516 // Start insertion in LoopBB. 1517 Builder.SetInsertPoint(LoopBB); 1518 1519 // Start the PHI node with an entry for Start. 1520 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str()); 1521 Variable->addIncoming(StartVal, PreheaderBB); 1522 1523 // Within the loop, the variable is defined equal to the PHI node. If it 1524 // shadows an existing variable, we have to restore it, so save it now. 1525 Value *OldVal = NamedValues[VarName]; 1526 NamedValues[VarName] = Variable; 1527 1528 // Emit the body of the loop. This, like any other expr, can change the 1529 // current BB. Note that we ignore the value computed by the body, but don't 1530 // allow an error. 1531 if (Body->Codegen() == 0) 1532 return 0; 1533 1534 // Emit the step value. 1535 Value *StepVal; 1536 if (Step) { 1537 StepVal = Step->Codegen(); 1538 if (StepVal == 0) return 0; 1539 } else { 1540 // If not specified, use 1.0. 1541 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0)); 1542 } 1543 1544 Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar"); 1545 1546 // Compute the end condition. 1547 Value *EndCond = End->Codegen(); 1548 if (EndCond == 0) return EndCond; 1549 1550 // Convert condition to a bool by comparing equal to 0.0. 1551 EndCond = Builder.CreateFCmpONE(EndCond, 1552 ConstantFP::get(getGlobalContext(), APFloat(0.0)), 1553 "loopcond"); 1554 1555 // Create the "after loop" block and insert it. 1556 BasicBlock *LoopEndBB = Builder.GetInsertBlock(); 1557 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction); 1558 1559 // Insert the conditional branch into the end of LoopEndBB. 1560 Builder.CreateCondBr(EndCond, LoopBB, AfterBB); 1561 1562 // Any new code will be inserted in AfterBB. 1563 Builder.SetInsertPoint(AfterBB); 1564 1565 // Add a new entry to the PHI node for the backedge. 1566 Variable->addIncoming(NextVar, LoopEndBB); 1567 1568 // Restore the unshadowed variable. 1569 if (OldVal) 1570 NamedValues[VarName] = OldVal; 1571 else 1572 NamedValues.erase(VarName); 1573 1574 1575 // for expr always returns 0.0. 1576 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext())); 1577} 1578 1579Function *PrototypeAST::Codegen() { 1580 // Make the function type: double(double,double) etc. 1581 std::vector<const Type*> Doubles(Args.size(), 1582 Type::getDoubleTy(getGlobalContext())); 1583 FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()), 1584 Doubles, false); 1585 1586 Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule); 1587 1588 // If F conflicted, there was already something named 'Name'. If it has a 1589 // body, don't allow redefinition or reextern. 1590 if (F->getName() != Name) { 1591 // Delete the one we just made and get the existing one. 1592 F->eraseFromParent(); 1593 F = TheModule->getFunction(Name); 1594 1595 // If F already has a body, reject this. 1596 if (!F->empty()) { 1597 ErrorF("redefinition of function"); 1598 return 0; 1599 } 1600 1601 // If F took a different number of args, reject. 1602 if (F->arg_size() != Args.size()) { 1603 ErrorF("redefinition of function with different # args"); 1604 return 0; 1605 } 1606 } 1607 1608 // Set names for all arguments. 1609 unsigned Idx = 0; 1610 for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size(); 1611 ++AI, ++Idx) { 1612 AI->setName(Args[Idx]); 1613 1614 // Add arguments to variable symbol table. 1615 NamedValues[Args[Idx]] = AI; 1616 } 1617 1618 return F; 1619} 1620 1621Function *FunctionAST::Codegen() { 1622 NamedValues.clear(); 1623 1624 Function *TheFunction = Proto->Codegen(); 1625 if (TheFunction == 0) 1626 return 0; 1627 1628 // If this is an operator, install it. 1629 if (Proto->isBinaryOp()) 1630 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence(); 1631 1632 // Create a new basic block to start insertion into. 1633 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction); 1634 Builder.SetInsertPoint(BB); 1635 1636 if (Value *RetVal = Body->Codegen()) { 1637 // Finish off the function. 1638 Builder.CreateRet(RetVal); 1639 1640 // Validate the generated code, checking for consistency. 1641 verifyFunction(*TheFunction); 1642 1643 // Optimize the function. 1644 TheFPM->run(*TheFunction); 1645 1646 return TheFunction; 1647 } 1648 1649 // Error reading body, remove function. 1650 TheFunction->eraseFromParent(); 1651 1652 if (Proto->isBinaryOp()) 1653 BinopPrecedence.erase(Proto->getOperatorName()); 1654 return 0; 1655} 1656 1657//===----------------------------------------------------------------------===// 1658// Top-Level parsing and JIT Driver 1659//===----------------------------------------------------------------------===// 1660 1661static ExecutionEngine *TheExecutionEngine; 1662 1663static void HandleDefinition() { 1664 if (FunctionAST *F = ParseDefinition()) { 1665 if (Function *LF = F->Codegen()) { 1666 fprintf(stderr, "Read function definition:"); 1667 LF->dump(); 1668 } 1669 } else { 1670 // Skip token for error recovery. 1671 getNextToken(); 1672 } 1673} 1674 1675static void HandleExtern() { 1676 if (PrototypeAST *P = ParseExtern()) { 1677 if (Function *F = P->Codegen()) { 1678 fprintf(stderr, "Read extern: "); 1679 F->dump(); 1680 } 1681 } else { 1682 // Skip token for error recovery. 1683 getNextToken(); 1684 } 1685} 1686 1687static void HandleTopLevelExpression() { 1688 // Evaluate a top-level expression into an anonymous function. 1689 if (FunctionAST *F = ParseTopLevelExpr()) { 1690 if (Function *LF = F->Codegen()) { 1691 // JIT the function, returning a function pointer. 1692 void *FPtr = TheExecutionEngine->getPointerToFunction(LF); 1693 1694 // Cast it to the right type (takes no arguments, returns a double) so we 1695 // can call it as a native function. 1696 double (*FP)() = (double (*)())(intptr_t)FPtr; 1697 fprintf(stderr, "Evaluated to %f\n", FP()); 1698 } 1699 } else { 1700 // Skip token for error recovery. 1701 getNextToken(); 1702 } 1703} 1704 1705/// top ::= definition | external | expression | ';' 1706static void MainLoop() { 1707 while (1) { 1708 fprintf(stderr, "ready> "); 1709 switch (CurTok) { 1710 case tok_eof: return; 1711 case ';': getNextToken(); break; // ignore top-level semicolons. 1712 case tok_def: HandleDefinition(); break; 1713 case tok_extern: HandleExtern(); break; 1714 default: HandleTopLevelExpression(); break; 1715 } 1716 } 1717} 1718 1719//===----------------------------------------------------------------------===// 1720// "Library" functions that can be "extern'd" from user code. 1721//===----------------------------------------------------------------------===// 1722 1723/// putchard - putchar that takes a double and returns 0. 1724extern "C" 1725double putchard(double X) { 1726 putchar((char)X); 1727 return 0; 1728} 1729 1730/// printd - printf that takes a double prints it as "%f\n", returning 0. 1731extern "C" 1732double printd(double X) { 1733 printf("%f\n", X); 1734 return 0; 1735} 1736 1737//===----------------------------------------------------------------------===// 1738// Main driver code. 1739//===----------------------------------------------------------------------===// 1740 1741int main() { 1742 InitializeNativeTarget(); 1743 LLVMContext &Context = getGlobalContext(); 1744 1745 // Install standard binary operators. 1746 // 1 is lowest precedence. 1747 BinopPrecedence['<'] = 10; 1748 BinopPrecedence['+'] = 20; 1749 BinopPrecedence['-'] = 20; 1750 BinopPrecedence['*'] = 40; // highest. 1751 1752 // Prime the first token. 1753 fprintf(stderr, "ready> "); 1754 getNextToken(); 1755 1756 // Make the module, which holds all the code. 1757 TheModule = new Module("my cool jit", Context); 1758 1759 // Create the JIT. This takes ownership of the module. 1760 std::string ErrStr; 1761 TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create(); 1762 if (!TheExecutionEngine) { 1763 fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str()); 1764 exit(1); 1765 } 1766 1767 FunctionPassManager OurFPM(TheModule); 1768 1769 // Set up the optimizer pipeline. Start with registering info about how the 1770 // target lays out data structures. 1771 OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData())); 1772 // Provide basic AliasAnalysis support for GVN. 1773 OurFPM.add(createBasicAliasAnalysisPass()); 1774 // Do simple "peephole" optimizations and bit-twiddling optzns. 1775 OurFPM.add(createInstructionCombiningPass()); 1776 // Reassociate expressions. 1777 OurFPM.add(createReassociatePass()); 1778 // Eliminate Common SubExpressions. 1779 OurFPM.add(createGVNPass()); 1780 // Simplify the control flow graph (deleting unreachable blocks, etc). 1781 OurFPM.add(createCFGSimplificationPass()); 1782 1783 OurFPM.doInitialization(); 1784 1785 // Set the global so the code gen can use this. 1786 TheFPM = &OurFPM; 1787 1788 // Run the main "interpreter loop" now. 1789 MainLoop(); 1790 1791 TheFPM = 0; 1792 1793 // Print out all of the generated code. 1794 TheModule->dump(); 1795 1796 return 0; 1797} 1798</pre> 1799</div> 1800 1801<a href="LangImpl7.html">Next: Extending the language: mutable variables / SSA construction</a> 1802</div> 1803 1804<!-- *********************************************************************** --> 1805<hr> 1806<address> 1807 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img 1808 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a> 1809 <a href="http://validator.w3.org/check/referer"><img 1810 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a> 1811 1812 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> 1813 <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br> 1814 Last modified: $Date$ 1815</address> 1816</body> 1817</html> 1818