• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2                      "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6  <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
7  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8  <meta name="author" content="Chris Lattner">
9  <link rel="stylesheet" href="../_static/llvm.css" type="text/css">
10</head>
11
12<body>
13
14<h1>Kaleidoscope: Extending the Language: User-defined Operators</h1>
15
16<ul>
17<li><a href="index.html">Up to Tutorial Index</a></li>
18<li>Chapter 6
19  <ol>
20    <li><a href="#intro">Chapter 6 Introduction</a></li>
21    <li><a href="#idea">User-defined Operators: the Idea</a></li>
22    <li><a href="#binary">User-defined Binary Operators</a></li>
23    <li><a href="#unary">User-defined Unary Operators</a></li>
24    <li><a href="#example">Kicking the Tires</a></li>
25    <li><a href="#code">Full Code Listing</a></li>
26  </ol>
27</li>
28<li><a href="LangImpl7.html">Chapter 7</a>: Extending the Language: Mutable
29Variables / SSA Construction</li>
30</ul>
31
32<div class="doc_author">
33  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
34</div>
35
36<!-- *********************************************************************** -->
37<h2><a name="intro">Chapter 6 Introduction</a></h2>
38<!-- *********************************************************************** -->
39
40<div>
41
42<p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
43with LLVM</a>" tutorial.  At this point in our tutorial, we now have a fully
44functional language that is fairly minimal, but also useful.  There
45is still one big problem with it, however. Our language doesn't have many
46useful operators (like division, logical negation, or even any comparisons
47besides less-than).</p>
48
49<p>This chapter of the tutorial takes a wild digression into adding user-defined
50operators to the simple and beautiful Kaleidoscope language. This digression now gives
51us a simple and ugly language in some ways, but also a powerful one at the same time.
52One of the great things about creating your own language is that you get to
53decide what is good or bad.  In this tutorial we'll assume that it is okay to
54use this as a way to show some interesting parsing techniques.</p>
55
56<p>At the end of this tutorial, we'll run through an example Kaleidoscope
57application that <a href="#example">renders the Mandelbrot set</a>.  This gives
58an example of what you can build with Kaleidoscope and its feature set.</p>
59
60</div>
61
62<!-- *********************************************************************** -->
63<h2><a name="idea">User-defined Operators: the Idea</a></h2>
64<!-- *********************************************************************** -->
65
66<div>
67
68<p>
69The "operator overloading" that we will add to Kaleidoscope is more general than
70languages like C++.  In C++, you are only allowed to redefine existing
71operators: you can't programatically change the grammar, introduce new
72operators, change precedence levels, etc.  In this chapter, we will add this
73capability to Kaleidoscope, which will let the user round out the set of
74operators that are supported.</p>
75
76<p>The point of going into user-defined operators in a tutorial like this is to
77show the power and flexibility of using a hand-written parser.  Thus far, the parser
78we have been implementing uses recursive descent for most parts of the grammar and
79operator precedence parsing for the expressions.  See <a
80href="LangImpl2.html">Chapter 2</a> for details.  Without using operator
81precedence parsing, it would be very difficult to allow the programmer to
82introduce new operators into the grammar: the grammar is dynamically extensible
83as the JIT runs.</p>
84
85<p>The two specific features we'll add are programmable unary operators (right
86now, Kaleidoscope has no unary operators at all) as well as binary operators.
87An example of this is:</p>
88
89<div class="doc_code">
90<pre>
91# Logical unary not.
92def unary!(v)
93  if v then
94    0
95  else
96    1;
97
98# Define &gt; with the same precedence as &lt;.
99def binary&gt; 10 (LHS RHS)
100  RHS &lt; LHS;
101
102# Binary "logical or", (note that it does not "short circuit")
103def binary| 5 (LHS RHS)
104  if LHS then
105    1
106  else if RHS then
107    1
108  else
109    0;
110
111# Define = with slightly lower precedence than relationals.
112def binary= 9 (LHS RHS)
113  !(LHS &lt; RHS | LHS &gt; RHS);
114</pre>
115</div>
116
117<p>Many languages aspire to being able to implement their standard runtime
118library in the language itself.  In Kaleidoscope, we can implement significant
119parts of the language in the library!</p>
120
121<p>We will break down implementation of these features into two parts:
122implementing support for user-defined binary operators and adding unary
123operators.</p>
124
125</div>
126
127<!-- *********************************************************************** -->
128<h2><a name="binary">User-defined Binary Operators</a></h2>
129<!-- *********************************************************************** -->
130
131<div>
132
133<p>Adding support for user-defined binary operators is pretty simple with our
134current framework.  We'll first add support for the unary/binary keywords:</p>
135
136<div class="doc_code">
137<pre>
138enum Token {
139  ...
140  <b>// operators
141  tok_binary = -11, tok_unary = -12</b>
142};
143...
144static int gettok() {
145...
146    if (IdentifierStr == "for") return tok_for;
147    if (IdentifierStr == "in") return tok_in;
148    <b>if (IdentifierStr == "binary") return tok_binary;
149    if (IdentifierStr == "unary") return tok_unary;</b>
150    return tok_identifier;
151</pre>
152</div>
153
154<p>This just adds lexer support for the unary and binary keywords, like we
155did in <a href="LangImpl5.html#iflexer">previous chapters</a>.  One nice thing
156about our current AST, is that we represent binary operators with full generalisation
157by using their ASCII code as the opcode.  For our extended operators, we'll use this
158same representation, so we don't need any new AST or parser support.</p>
159
160<p>On the other hand, we have to be able to represent the definitions of these
161new operators, in the "def binary| 5" part of the function definition.  In our
162grammar so far, the "name" for the function definition is parsed as the
163"prototype" production and into the <tt>PrototypeAST</tt> AST node.  To
164represent our new user-defined operators as prototypes, we have to extend
165the  <tt>PrototypeAST</tt> AST node like this:</p>
166
167<div class="doc_code">
168<pre>
169/// PrototypeAST - This class represents the "prototype" for a function,
170/// which captures its argument names as well as if it is an operator.
171class PrototypeAST {
172  std::string Name;
173  std::vector&lt;std::string&gt; Args;
174  <b>bool isOperator;
175  unsigned Precedence;  // Precedence if a binary op.</b>
176public:
177  PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
178               <b>bool isoperator = false, unsigned prec = 0</b>)
179  : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {}
180
181  <b>bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
182  bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
183
184  char getOperatorName() const {
185    assert(isUnaryOp() || isBinaryOp());
186    return Name[Name.size()-1];
187  }
188
189  unsigned getBinaryPrecedence() const { return Precedence; }</b>
190
191  Function *Codegen();
192};
193</pre>
194</div>
195
196<p>Basically, in addition to knowing a name for the prototype, we now keep track
197of whether it was an operator, and if it was, what precedence level the operator
198is at.  The precedence is only used for binary operators (as you'll see below,
199it just doesn't apply for unary operators).  Now that we have a way to represent
200the prototype for a user-defined operator, we need to parse it:</p>
201
202<div class="doc_code">
203<pre>
204/// prototype
205///   ::= id '(' id* ')'
206<b>///   ::= binary LETTER number? (id, id)</b>
207static PrototypeAST *ParsePrototype() {
208  std::string FnName;
209
210  <b>unsigned Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
211  unsigned BinaryPrecedence = 30;</b>
212
213  switch (CurTok) {
214  default:
215    return ErrorP("Expected function name in prototype");
216  case tok_identifier:
217    FnName = IdentifierStr;
218    Kind = 0;
219    getNextToken();
220    break;
221  <b>case tok_binary:
222    getNextToken();
223    if (!isascii(CurTok))
224      return ErrorP("Expected binary operator");
225    FnName = "binary";
226    FnName += (char)CurTok;
227    Kind = 2;
228    getNextToken();
229
230    // Read the precedence if present.
231    if (CurTok == tok_number) {
232      if (NumVal &lt; 1 || NumVal &gt; 100)
233        return ErrorP("Invalid precedecnce: must be 1..100");
234      BinaryPrecedence = (unsigned)NumVal;
235      getNextToken();
236    }
237    break;</b>
238  }
239
240  if (CurTok != '(')
241    return ErrorP("Expected '(' in prototype");
242
243  std::vector&lt;std::string&gt; ArgNames;
244  while (getNextToken() == tok_identifier)
245    ArgNames.push_back(IdentifierStr);
246  if (CurTok != ')')
247    return ErrorP("Expected ')' in prototype");
248
249  // success.
250  getNextToken();  // eat ')'.
251
252  <b>// Verify right number of names for operator.
253  if (Kind &amp;&amp; ArgNames.size() != Kind)
254    return ErrorP("Invalid number of operands for operator");
255
256  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b>
257}
258</pre>
259</div>
260
261<p>This is all fairly straightforward parsing code, and we have already seen
262a lot of similar code in the past.  One interesting part about the code above is
263the couple lines that set up <tt>FnName</tt> for binary operators.  This builds names
264like "binary@" for a newly defined "@" operator.  This then takes advantage of the
265fact that symbol names in the LLVM symbol table are allowed to have any character in
266them, including embedded nul characters.</p>
267
268<p>The next interesting thing to add, is codegen support for these binary operators.
269Given our current structure, this is a simple addition of a default case for our
270existing binary operator node:</p>
271
272<div class="doc_code">
273<pre>
274Value *BinaryExprAST::Codegen() {
275  Value *L = LHS-&gt;Codegen();
276  Value *R = RHS-&gt;Codegen();
277  if (L == 0 || R == 0) return 0;
278
279  switch (Op) {
280  case '+': return Builder.CreateFAdd(L, R, "addtmp");
281  case '-': return Builder.CreateFSub(L, R, "subtmp");
282  case '*': return Builder.CreateFMul(L, R, "multmp");
283  case '&lt;':
284    L = Builder.CreateFCmpULT(L, R, "cmptmp");
285    // Convert bool 0/1 to double 0.0 or 1.0
286    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
287                                "booltmp");
288  <b>default: break;</b>
289  }
290
291  <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit
292  // a call to it.
293  Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
294  assert(F &amp;&amp; "binary operator not found!");
295
296  Value *Ops[2] = { L, R };
297  return Builder.CreateCall(F, Ops, "binop");</b>
298}
299
300</pre>
301</div>
302
303<p>As you can see above, the new code is actually really simple.  It just does
304a lookup for the appropriate operator in the symbol table and generates a
305function call to it.  Since user-defined operators are just built as normal
306functions (because the "prototype" boils down to a function with the right
307name) everything falls into place.</p>
308
309<p>The final piece of code we are missing, is a bit of top-level magic:</p>
310
311<div class="doc_code">
312<pre>
313Function *FunctionAST::Codegen() {
314  NamedValues.clear();
315
316  Function *TheFunction = Proto->Codegen();
317  if (TheFunction == 0)
318    return 0;
319
320  <b>// If this is an operator, install it.
321  if (Proto-&gt;isBinaryOp())
322    BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b>
323
324  // Create a new basic block to start insertion into.
325  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
326  Builder.SetInsertPoint(BB);
327
328  if (Value *RetVal = Body-&gt;Codegen()) {
329    ...
330</pre>
331</div>
332
333<p>Basically, before codegening a function, if it is a user-defined operator, we
334register it in the precedence table.  This allows the binary operator parsing
335logic we already have in place to handle it.  Since we are working on a fully-general operator precedence parser, this is all we need to do to "extend the grammar".</p>
336
337<p>Now we have useful user-defined binary operators.  This builds a lot
338on the previous framework we built for other operators.  Adding unary operators
339is a bit more challenging, because we don't have any framework for it yet - lets
340see what it takes.</p>
341
342</div>
343
344<!-- *********************************************************************** -->
345<h2><a name="unary">User-defined Unary Operators</a></h2>
346<!-- *********************************************************************** -->
347
348<div>
349
350<p>Since we don't currently support unary operators in the Kaleidoscope
351language, we'll need to add everything to support them.  Above, we added simple
352support for the 'unary' keyword to the lexer.  In addition to that, we need an
353AST node:</p>
354
355<div class="doc_code">
356<pre>
357/// UnaryExprAST - Expression class for a unary operator.
358class UnaryExprAST : public ExprAST {
359  char Opcode;
360  ExprAST *Operand;
361public:
362  UnaryExprAST(char opcode, ExprAST *operand)
363    : Opcode(opcode), Operand(operand) {}
364  virtual Value *Codegen();
365};
366</pre>
367</div>
368
369<p>This AST node is very simple and obvious by now.  It directly mirrors the
370binary operator AST node, except that it only has one child.  With this, we
371need to add the parsing logic.  Parsing a unary operator is pretty simple: we'll
372add a new function to do it:</p>
373
374<div class="doc_code">
375<pre>
376/// unary
377///   ::= primary
378///   ::= '!' unary
379static ExprAST *ParseUnary() {
380  // If the current token is not an operator, it must be a primary expr.
381  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
382    return ParsePrimary();
383
384  // If this is a unary operator, read it.
385  int Opc = CurTok;
386  getNextToken();
387  if (ExprAST *Operand = ParseUnary())
388    return new UnaryExprAST(Opc, Operand);
389  return 0;
390}
391</pre>
392</div>
393
394<p>The grammar we add is pretty straightforward here.  If we see a unary
395operator when parsing a primary operator, we eat the operator as a prefix and
396parse the remaining piece as another unary operator.  This allows us to handle
397multiple unary operators (e.g. "!!x").  Note that unary operators can't have
398ambiguous parses like binary operators can, so there is no need for precedence
399information.</p>
400
401<p>The problem with this function, is that we need to call ParseUnary from somewhere.
402To do this, we change previous callers of ParsePrimary to call ParseUnary
403instead:</p>
404
405<div class="doc_code">
406<pre>
407/// binoprhs
408///   ::= ('+' unary)*
409static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
410  ...
411    <b>// Parse the unary expression after the binary operator.
412    ExprAST *RHS = ParseUnary();
413    if (!RHS) return 0;</b>
414  ...
415}
416/// expression
417///   ::= unary binoprhs
418///
419static ExprAST *ParseExpression() {
420  <b>ExprAST *LHS = ParseUnary();</b>
421  if (!LHS) return 0;
422
423  return ParseBinOpRHS(0, LHS);
424}
425</pre>
426</div>
427
428<p>With these two simple changes, we are now able to parse unary operators and build the
429AST for them.  Next up, we need to add parser support for prototypes, to parse
430the unary operator prototype.  We extend the binary operator code above
431with:</p>
432
433<div class="doc_code">
434<pre>
435/// prototype
436///   ::= id '(' id* ')'
437///   ::= binary LETTER number? (id, id)
438<b>///   ::= unary LETTER (id)</b>
439static PrototypeAST *ParsePrototype() {
440  std::string FnName;
441
442  unsigned Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
443  unsigned BinaryPrecedence = 30;
444
445  switch (CurTok) {
446  default:
447    return ErrorP("Expected function name in prototype");
448  case tok_identifier:
449    FnName = IdentifierStr;
450    Kind = 0;
451    getNextToken();
452    break;
453  <b>case tok_unary:
454    getNextToken();
455    if (!isascii(CurTok))
456      return ErrorP("Expected unary operator");
457    FnName = "unary";
458    FnName += (char)CurTok;
459    Kind = 1;
460    getNextToken();
461    break;</b>
462  case tok_binary:
463    ...
464</pre>
465</div>
466
467<p>As with binary operators, we name unary operators with a name that includes
468the operator character.  This assists us at code generation time.  Speaking of,
469the final piece we need to add is codegen support for unary operators.  It looks
470like this:</p>
471
472<div class="doc_code">
473<pre>
474Value *UnaryExprAST::Codegen() {
475  Value *OperandV = Operand->Codegen();
476  if (OperandV == 0) return 0;
477
478  Function *F = TheModule->getFunction(std::string("unary")+Opcode);
479  if (F == 0)
480    return ErrorV("Unknown unary operator");
481
482  return Builder.CreateCall(F, OperandV, "unop");
483}
484</pre>
485</div>
486
487<p>This code is similar to, but simpler than, the code for binary operators.  It
488is simpler primarily because it doesn't need to handle any predefined operators.
489</p>
490
491</div>
492
493<!-- *********************************************************************** -->
494<h2><a name="example">Kicking the Tires</a></h2>
495<!-- *********************************************************************** -->
496
497<div>
498
499<p>It is somewhat hard to believe, but with a few simple extensions we've
500covered in the last chapters, we have grown a real-ish language.  With this, we
501can do a lot of interesting things, including I/O, math, and a bunch of other
502things.  For example, we can now add a nice sequencing operator (printd is
503defined to print out the specified value and a newline):</p>
504
505<div class="doc_code">
506<pre>
507ready&gt; <b>extern printd(x);</b>
508Read extern:
509declare double @printd(double)
510
511ready&gt; <b>def binary : 1 (x y) 0;  # Low-precedence operator that ignores operands.</b>
512..
513ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
514123.000000
515456.000000
516789.000000
517Evaluated to 0.000000
518</pre>
519</div>
520
521<p>We can also define a bunch of other "primitive" operations, such as:</p>
522
523<div class="doc_code">
524<pre>
525# Logical unary not.
526def unary!(v)
527  if v then
528    0
529  else
530    1;
531
532# Unary negate.
533def unary-(v)
534  0-v;
535
536# Define &gt; with the same precedence as &lt;.
537def binary&gt; 10 (LHS RHS)
538  RHS &lt; LHS;
539
540# Binary logical or, which does not short circuit.
541def binary| 5 (LHS RHS)
542  if LHS then
543    1
544  else if RHS then
545    1
546  else
547    0;
548
549# Binary logical and, which does not short circuit.
550def binary&amp; 6 (LHS RHS)
551  if !LHS then
552    0
553  else
554    !!RHS;
555
556# Define = with slightly lower precedence than relationals.
557def binary = 9 (LHS RHS)
558  !(LHS &lt; RHS | LHS &gt; RHS);
559
560# Define ':' for sequencing: as a low-precedence operator that ignores operands
561# and just returns the RHS.
562def binary : 1 (x y) y;
563</pre>
564</div>
565
566
567<p>Given the previous if/then/else support, we can also define interesting
568functions for I/O.  For example, the following prints out a character whose
569"density" reflects the value passed in: the lower the value, the denser the
570character:</p>
571
572<div class="doc_code">
573<pre>
574ready&gt;
575<b>
576extern putchard(char)
577def printdensity(d)
578  if d &gt; 8 then
579    putchard(32)  # ' '
580  else if d &gt; 4 then
581    putchard(46)  # '.'
582  else if d &gt; 2 then
583    putchard(43)  # '+'
584  else
585    putchard(42); # '*'</b>
586...
587ready&gt; <b>printdensity(1): printdensity(2): printdensity(3):
588       printdensity(4): printdensity(5): printdensity(9):
589       putchard(10);</b>
590**++.
591Evaluated to 0.000000
592</pre>
593</div>
594
595<p>Based on these simple primitive operations, we can start to define more
596interesting things.  For example, here's a little function that solves for the
597number of iterations it takes a function in the complex plane to
598converge:</p>
599
600<div class="doc_code">
601<pre>
602# Determine whether the specific location diverges.
603# Solve for z = z^2 + c in the complex plane.
604def mandleconverger(real imag iters creal cimag)
605  if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
606    iters
607  else
608    mandleconverger(real*real - imag*imag + creal,
609                    2*real*imag + cimag,
610                    iters+1, creal, cimag);
611
612# Return the number of iterations required for the iteration to escape
613def mandleconverge(real imag)
614  mandleconverger(real, imag, 0, real, imag);
615</pre>
616</div>
617
618<p>This "<code>z = z<sup>2</sup> + c</code>" function is a beautiful little
619creature that is the basis for computation of
620the <a href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>.
621Our <tt>mandelconverge</tt> function returns the number of iterations that it
622takes for a complex orbit to escape, saturating to 255.  This is not a very
623useful function by itself, but if you plot its value over a two-dimensional
624plane, you can see the Mandelbrot set.  Given that we are limited to using
625putchard here, our amazing graphical output is limited, but we can whip together
626something using the density plotter above:</p>
627
628<div class="doc_code">
629<pre>
630# Compute and plot the mandlebrot set with the specified 2 dimensional range
631# info.
632def mandelhelp(xmin xmax xstep   ymin ymax ystep)
633  for y = ymin, y &lt; ymax, ystep in (
634    (for x = xmin, x &lt; xmax, xstep in
635       printdensity(mandleconverge(x,y)))
636    : putchard(10)
637  )
638
639# mandel - This is a convenient helper function for plotting the mandelbrot set
640# from the specified position with the specified Magnification.
641def mandel(realstart imagstart realmag imagmag)
642  mandelhelp(realstart, realstart+realmag*78, realmag,
643             imagstart, imagstart+imagmag*40, imagmag);
644</pre>
645</div>
646
647<p>Given this, we can try plotting out the mandlebrot set!  Lets try it out:</p>
648
649<div class="doc_code">
650<pre>
651ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
652*******************************+++++++++++*************************************
653*************************+++++++++++++++++++++++*******************************
654**********************+++++++++++++++++++++++++++++****************************
655*******************+++++++++++++++++++++.. ...++++++++*************************
656*****************++++++++++++++++++++++.... ...+++++++++***********************
657***************+++++++++++++++++++++++.....   ...+++++++++*********************
658**************+++++++++++++++++++++++....     ....+++++++++********************
659*************++++++++++++++++++++++......      .....++++++++*******************
660************+++++++++++++++++++++.......       .......+++++++******************
661***********+++++++++++++++++++....                ... .+++++++*****************
662**********+++++++++++++++++.......                     .+++++++****************
663*********++++++++++++++...........                    ...+++++++***************
664********++++++++++++............                      ...++++++++**************
665********++++++++++... ..........                        .++++++++**************
666*******+++++++++.....                                   .+++++++++*************
667*******++++++++......                                  ..+++++++++*************
668*******++++++.......                                   ..+++++++++*************
669*******+++++......                                     ..+++++++++*************
670*******.... ....                                      ...+++++++++*************
671*******.... .                                         ...+++++++++*************
672*******+++++......                                    ...+++++++++*************
673*******++++++.......                                   ..+++++++++*************
674*******++++++++......                                   .+++++++++*************
675*******+++++++++.....                                  ..+++++++++*************
676********++++++++++... ..........                        .++++++++**************
677********++++++++++++............                      ...++++++++**************
678*********++++++++++++++..........                     ...+++++++***************
679**********++++++++++++++++........                     .+++++++****************
680**********++++++++++++++++++++....                ... ..+++++++****************
681***********++++++++++++++++++++++.......       .......++++++++*****************
682************+++++++++++++++++++++++......      ......++++++++******************
683**************+++++++++++++++++++++++....      ....++++++++********************
684***************+++++++++++++++++++++++.....   ...+++++++++*********************
685*****************++++++++++++++++++++++....  ...++++++++***********************
686*******************+++++++++++++++++++++......++++++++*************************
687*********************++++++++++++++++++++++.++++++++***************************
688*************************+++++++++++++++++++++++*******************************
689******************************+++++++++++++************************************
690*******************************************************************************
691*******************************************************************************
692*******************************************************************************
693Evaluated to 0.000000
694ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
695**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
696***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
697*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
698*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
699*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
700***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
701**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
702************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
703***********++++++++++++++++++++++++++++++++++++++++++++++++++........        .
704**********++++++++++++++++++++++++++++++++++++++++++++++.............
705********+++++++++++++++++++++++++++++++++++++++++++..................
706*******+++++++++++++++++++++++++++++++++++++++.......................
707******+++++++++++++++++++++++++++++++++++...........................
708*****++++++++++++++++++++++++++++++++............................
709*****++++++++++++++++++++++++++++...............................
710****++++++++++++++++++++++++++......   .........................
711***++++++++++++++++++++++++.........     ......    ...........
712***++++++++++++++++++++++............
713**+++++++++++++++++++++..............
714**+++++++++++++++++++................
715*++++++++++++++++++.................
716*++++++++++++++++............ ...
717*++++++++++++++..............
718*+++....++++................
719*..........  ...........
720*
721*..........  ...........
722*+++....++++................
723*++++++++++++++..............
724*++++++++++++++++............ ...
725*++++++++++++++++++.................
726**+++++++++++++++++++................
727**+++++++++++++++++++++..............
728***++++++++++++++++++++++............
729***++++++++++++++++++++++++.........     ......    ...........
730****++++++++++++++++++++++++++......   .........................
731*****++++++++++++++++++++++++++++...............................
732*****++++++++++++++++++++++++++++++++............................
733******+++++++++++++++++++++++++++++++++++...........................
734*******+++++++++++++++++++++++++++++++++++++++.......................
735********+++++++++++++++++++++++++++++++++++++++++++..................
736Evaluated to 0.000000
737ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
738*******************************************************************************
739*******************************************************************************
740*******************************************************************************
741**********+++++++++++++++++++++************************************************
742*+++++++++++++++++++++++++++++++++++++++***************************************
743+++++++++++++++++++++++++++++++++++++++++++++**********************************
744++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
745++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
746+++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
747+++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
748+++++++++++++++++++++++++++++++....   ......+++++++++++++++++++****************
749+++++++++++++++++++++++++++++.......  ........+++++++++++++++++++**************
750++++++++++++++++++++++++++++........   ........++++++++++++++++++++************
751+++++++++++++++++++++++++++.........     ..  ...+++++++++++++++++++++**********
752++++++++++++++++++++++++++...........        ....++++++++++++++++++++++********
753++++++++++++++++++++++++.............       .......++++++++++++++++++++++******
754+++++++++++++++++++++++.............        ........+++++++++++++++++++++++****
755++++++++++++++++++++++...........           ..........++++++++++++++++++++++***
756++++++++++++++++++++...........                .........++++++++++++++++++++++*
757++++++++++++++++++............                  ...........++++++++++++++++++++
758++++++++++++++++...............                 .............++++++++++++++++++
759++++++++++++++.................                 ...............++++++++++++++++
760++++++++++++..................                  .................++++++++++++++
761+++++++++..................                      .................+++++++++++++
762++++++........        .                               .........  ..++++++++++++
763++............                                         ......    ....++++++++++
764..............                                                    ...++++++++++
765..............                                                    ....+++++++++
766..............                                                    .....++++++++
767.............                                                    ......++++++++
768...........                                                     .......++++++++
769.........                                                       ........+++++++
770.........                                                       ........+++++++
771.........                                                           ....+++++++
772........                                                             ...+++++++
773.......                                                              ...+++++++
774                                                                    ....+++++++
775                                                                   .....+++++++
776                                                                    ....+++++++
777                                                                    ....+++++++
778                                                                    ....+++++++
779Evaluated to 0.000000
780ready&gt; <b>^D</b>
781</pre>
782</div>
783
784<p>At this point, you may be starting to realize that Kaleidoscope is a real
785and powerful language.  It may not be self-similar :), but it can be used to
786plot things that are!</p>
787
788<p>With this, we conclude the "adding user-defined operators" chapter of the
789tutorial.  We have successfully augmented our language, adding the ability to extend the
790language in the library, and we have shown how this can be used to build a simple but
791interesting end-user application in Kaleidoscope.  At this point, Kaleidoscope
792can build a variety of applications that are functional and can call functions
793with side-effects, but it can't actually define and mutate a variable itself.
794</p>
795
796<p>Strikingly, variable mutation is an important feature of some
797languages, and it is not at all obvious how to <a href="LangImpl7.html">add
798support for mutable variables</a> without having to add an "SSA construction"
799phase to your front-end.  In the next chapter, we will describe how you can
800add variable mutation without building SSA in your front-end.</p>
801
802</div>
803
804<!-- *********************************************************************** -->
805<h2><a name="code">Full Code Listing</a></h2>
806<!-- *********************************************************************** -->
807
808<div>
809
810<p>
811Here is the complete code listing for our running example, enhanced with the
812if/then/else and for expressions..  To build this example, use:
813</p>
814
815<div class="doc_code">
816<pre>
817# Compile
818clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
819# Run
820./toy
821</pre>
822</div>
823
824<p>On some platforms, you will need to specify -rdynamic or -Wl,--export-dynamic
825when linking.  This ensures that symbols defined in the main executable are
826exported to the dynamic linker and so are available for symbol resolution at
827run time.  This is not needed if you compile your support code into a shared
828library, although doing that will cause problems on Windows.</p>
829
830<p>Here is the code:</p>
831
832<div class="doc_code">
833<pre>
834#include "llvm/DerivedTypes.h"
835#include "llvm/ExecutionEngine/ExecutionEngine.h"
836#include "llvm/ExecutionEngine/JIT.h"
837#include "llvm/IRBuilder.h"
838#include "llvm/LLVMContext.h"
839#include "llvm/Module.h"
840#include "llvm/PassManager.h"
841#include "llvm/Analysis/Verifier.h"
842#include "llvm/Analysis/Passes.h"
843#include "llvm/Target/TargetData.h"
844#include "llvm/Transforms/Scalar.h"
845#include "llvm/Support/TargetSelect.h"
846#include &lt;cstdio&gt;
847#include &lt;string&gt;
848#include &lt;map&gt;
849#include &lt;vector&gt;
850using namespace llvm;
851
852//===----------------------------------------------------------------------===//
853// Lexer
854//===----------------------------------------------------------------------===//
855
856// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
857// of these for known things.
858enum Token {
859  tok_eof = -1,
860
861  // commands
862  tok_def = -2, tok_extern = -3,
863
864  // primary
865  tok_identifier = -4, tok_number = -5,
866
867  // control
868  tok_if = -6, tok_then = -7, tok_else = -8,
869  tok_for = -9, tok_in = -10,
870
871  // operators
872  tok_binary = -11, tok_unary = -12
873};
874
875static std::string IdentifierStr;  // Filled in if tok_identifier
876static double NumVal;              // Filled in if tok_number
877
878/// gettok - Return the next token from standard input.
879static int gettok() {
880  static int LastChar = ' ';
881
882  // Skip any whitespace.
883  while (isspace(LastChar))
884    LastChar = getchar();
885
886  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
887    IdentifierStr = LastChar;
888    while (isalnum((LastChar = getchar())))
889      IdentifierStr += LastChar;
890
891    if (IdentifierStr == "def") return tok_def;
892    if (IdentifierStr == "extern") return tok_extern;
893    if (IdentifierStr == "if") return tok_if;
894    if (IdentifierStr == "then") return tok_then;
895    if (IdentifierStr == "else") return tok_else;
896    if (IdentifierStr == "for") return tok_for;
897    if (IdentifierStr == "in") return tok_in;
898    if (IdentifierStr == "binary") return tok_binary;
899    if (IdentifierStr == "unary") return tok_unary;
900    return tok_identifier;
901  }
902
903  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
904    std::string NumStr;
905    do {
906      NumStr += LastChar;
907      LastChar = getchar();
908    } while (isdigit(LastChar) || LastChar == '.');
909
910    NumVal = strtod(NumStr.c_str(), 0);
911    return tok_number;
912  }
913
914  if (LastChar == '#') {
915    // Comment until end of line.
916    do LastChar = getchar();
917    while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
918
919    if (LastChar != EOF)
920      return gettok();
921  }
922
923  // Check for end of file.  Don't eat the EOF.
924  if (LastChar == EOF)
925    return tok_eof;
926
927  // Otherwise, just return the character as its ascii value.
928  int ThisChar = LastChar;
929  LastChar = getchar();
930  return ThisChar;
931}
932
933//===----------------------------------------------------------------------===//
934// Abstract Syntax Tree (aka Parse Tree)
935//===----------------------------------------------------------------------===//
936
937/// ExprAST - Base class for all expression nodes.
938class ExprAST {
939public:
940  virtual ~ExprAST() {}
941  virtual Value *Codegen() = 0;
942};
943
944/// NumberExprAST - Expression class for numeric literals like "1.0".
945class NumberExprAST : public ExprAST {
946  double Val;
947public:
948  NumberExprAST(double val) : Val(val) {}
949  virtual Value *Codegen();
950};
951
952/// VariableExprAST - Expression class for referencing a variable, like "a".
953class VariableExprAST : public ExprAST {
954  std::string Name;
955public:
956  VariableExprAST(const std::string &amp;name) : Name(name) {}
957  virtual Value *Codegen();
958};
959
960/// UnaryExprAST - Expression class for a unary operator.
961class UnaryExprAST : public ExprAST {
962  char Opcode;
963  ExprAST *Operand;
964public:
965  UnaryExprAST(char opcode, ExprAST *operand)
966    : Opcode(opcode), Operand(operand) {}
967  virtual Value *Codegen();
968};
969
970/// BinaryExprAST - Expression class for a binary operator.
971class BinaryExprAST : public ExprAST {
972  char Op;
973  ExprAST *LHS, *RHS;
974public:
975  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
976    : Op(op), LHS(lhs), RHS(rhs) {}
977  virtual Value *Codegen();
978};
979
980/// CallExprAST - Expression class for function calls.
981class CallExprAST : public ExprAST {
982  std::string Callee;
983  std::vector&lt;ExprAST*&gt; Args;
984public:
985  CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
986    : Callee(callee), Args(args) {}
987  virtual Value *Codegen();
988};
989
990/// IfExprAST - Expression class for if/then/else.
991class IfExprAST : public ExprAST {
992  ExprAST *Cond, *Then, *Else;
993public:
994  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
995  : Cond(cond), Then(then), Else(_else) {}
996  virtual Value *Codegen();
997};
998
999/// ForExprAST - Expression class for for/in.
1000class ForExprAST : public ExprAST {
1001  std::string VarName;
1002  ExprAST *Start, *End, *Step, *Body;
1003public:
1004  ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1005             ExprAST *step, ExprAST *body)
1006    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1007  virtual Value *Codegen();
1008};
1009
1010/// PrototypeAST - This class represents the "prototype" for a function,
1011/// which captures its name, and its argument names (thus implicitly the number
1012/// of arguments the function takes), as well as if it is an operator.
1013class PrototypeAST {
1014  std::string Name;
1015  std::vector&lt;std::string&gt; Args;
1016  bool isOperator;
1017  unsigned Precedence;  // Precedence if a binary op.
1018public:
1019  PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1020               bool isoperator = false, unsigned prec = 0)
1021  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1022
1023  bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1024  bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1025
1026  char getOperatorName() const {
1027    assert(isUnaryOp() || isBinaryOp());
1028    return Name[Name.size()-1];
1029  }
1030
1031  unsigned getBinaryPrecedence() const { return Precedence; }
1032
1033  Function *Codegen();
1034};
1035
1036/// FunctionAST - This class represents a function definition itself.
1037class FunctionAST {
1038  PrototypeAST *Proto;
1039  ExprAST *Body;
1040public:
1041  FunctionAST(PrototypeAST *proto, ExprAST *body)
1042    : Proto(proto), Body(body) {}
1043
1044  Function *Codegen();
1045};
1046
1047//===----------------------------------------------------------------------===//
1048// Parser
1049//===----------------------------------------------------------------------===//
1050
1051/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
1052/// token the parser is looking at.  getNextToken reads another token from the
1053/// lexer and updates CurTok with its results.
1054static int CurTok;
1055static int getNextToken() {
1056  return CurTok = gettok();
1057}
1058
1059/// BinopPrecedence - This holds the precedence for each binary operator that is
1060/// defined.
1061static std::map&lt;char, int&gt; BinopPrecedence;
1062
1063/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1064static int GetTokPrecedence() {
1065  if (!isascii(CurTok))
1066    return -1;
1067
1068  // Make sure it's a declared binop.
1069  int TokPrec = BinopPrecedence[CurTok];
1070  if (TokPrec &lt;= 0) return -1;
1071  return TokPrec;
1072}
1073
1074/// Error* - These are little helper functions for error handling.
1075ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1076PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1077FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1078
1079static ExprAST *ParseExpression();
1080
1081/// identifierexpr
1082///   ::= identifier
1083///   ::= identifier '(' expression* ')'
1084static ExprAST *ParseIdentifierExpr() {
1085  std::string IdName = IdentifierStr;
1086
1087  getNextToken();  // eat identifier.
1088
1089  if (CurTok != '(') // Simple variable ref.
1090    return new VariableExprAST(IdName);
1091
1092  // Call.
1093  getNextToken();  // eat (
1094  std::vector&lt;ExprAST*&gt; Args;
1095  if (CurTok != ')') {
1096    while (1) {
1097      ExprAST *Arg = ParseExpression();
1098      if (!Arg) return 0;
1099      Args.push_back(Arg);
1100
1101      if (CurTok == ')') break;
1102
1103      if (CurTok != ',')
1104        return Error("Expected ')' or ',' in argument list");
1105      getNextToken();
1106    }
1107  }
1108
1109  // Eat the ')'.
1110  getNextToken();
1111
1112  return new CallExprAST(IdName, Args);
1113}
1114
1115/// numberexpr ::= number
1116static ExprAST *ParseNumberExpr() {
1117  ExprAST *Result = new NumberExprAST(NumVal);
1118  getNextToken(); // consume the number
1119  return Result;
1120}
1121
1122/// parenexpr ::= '(' expression ')'
1123static ExprAST *ParseParenExpr() {
1124  getNextToken();  // eat (.
1125  ExprAST *V = ParseExpression();
1126  if (!V) return 0;
1127
1128  if (CurTok != ')')
1129    return Error("expected ')'");
1130  getNextToken();  // eat ).
1131  return V;
1132}
1133
1134/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1135static ExprAST *ParseIfExpr() {
1136  getNextToken();  // eat the if.
1137
1138  // condition.
1139  ExprAST *Cond = ParseExpression();
1140  if (!Cond) return 0;
1141
1142  if (CurTok != tok_then)
1143    return Error("expected then");
1144  getNextToken();  // eat the then
1145
1146  ExprAST *Then = ParseExpression();
1147  if (Then == 0) return 0;
1148
1149  if (CurTok != tok_else)
1150    return Error("expected else");
1151
1152  getNextToken();
1153
1154  ExprAST *Else = ParseExpression();
1155  if (!Else) return 0;
1156
1157  return new IfExprAST(Cond, Then, Else);
1158}
1159
1160/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1161static ExprAST *ParseForExpr() {
1162  getNextToken();  // eat the for.
1163
1164  if (CurTok != tok_identifier)
1165    return Error("expected identifier after for");
1166
1167  std::string IdName = IdentifierStr;
1168  getNextToken();  // eat identifier.
1169
1170  if (CurTok != '=')
1171    return Error("expected '=' after for");
1172  getNextToken();  // eat '='.
1173
1174
1175  ExprAST *Start = ParseExpression();
1176  if (Start == 0) return 0;
1177  if (CurTok != ',')
1178    return Error("expected ',' after for start value");
1179  getNextToken();
1180
1181  ExprAST *End = ParseExpression();
1182  if (End == 0) return 0;
1183
1184  // The step value is optional.
1185  ExprAST *Step = 0;
1186  if (CurTok == ',') {
1187    getNextToken();
1188    Step = ParseExpression();
1189    if (Step == 0) return 0;
1190  }
1191
1192  if (CurTok != tok_in)
1193    return Error("expected 'in' after for");
1194  getNextToken();  // eat 'in'.
1195
1196  ExprAST *Body = ParseExpression();
1197  if (Body == 0) return 0;
1198
1199  return new ForExprAST(IdName, Start, End, Step, Body);
1200}
1201
1202/// primary
1203///   ::= identifierexpr
1204///   ::= numberexpr
1205///   ::= parenexpr
1206///   ::= ifexpr
1207///   ::= forexpr
1208static ExprAST *ParsePrimary() {
1209  switch (CurTok) {
1210  default: return Error("unknown token when expecting an expression");
1211  case tok_identifier: return ParseIdentifierExpr();
1212  case tok_number:     return ParseNumberExpr();
1213  case '(':            return ParseParenExpr();
1214  case tok_if:         return ParseIfExpr();
1215  case tok_for:        return ParseForExpr();
1216  }
1217}
1218
1219/// unary
1220///   ::= primary
1221///   ::= '!' unary
1222static ExprAST *ParseUnary() {
1223  // If the current token is not an operator, it must be a primary expr.
1224  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1225    return ParsePrimary();
1226
1227  // If this is a unary operator, read it.
1228  int Opc = CurTok;
1229  getNextToken();
1230  if (ExprAST *Operand = ParseUnary())
1231    return new UnaryExprAST(Opc, Operand);
1232  return 0;
1233}
1234
1235/// binoprhs
1236///   ::= ('+' unary)*
1237static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1238  // If this is a binop, find its precedence.
1239  while (1) {
1240    int TokPrec = GetTokPrecedence();
1241
1242    // If this is a binop that binds at least as tightly as the current binop,
1243    // consume it, otherwise we are done.
1244    if (TokPrec &lt; ExprPrec)
1245      return LHS;
1246
1247    // Okay, we know this is a binop.
1248    int BinOp = CurTok;
1249    getNextToken();  // eat binop
1250
1251    // Parse the unary expression after the binary operator.
1252    ExprAST *RHS = ParseUnary();
1253    if (!RHS) return 0;
1254
1255    // If BinOp binds less tightly with RHS than the operator after RHS, let
1256    // the pending operator take RHS as its LHS.
1257    int NextPrec = GetTokPrecedence();
1258    if (TokPrec &lt; NextPrec) {
1259      RHS = ParseBinOpRHS(TokPrec+1, RHS);
1260      if (RHS == 0) return 0;
1261    }
1262
1263    // Merge LHS/RHS.
1264    LHS = new BinaryExprAST(BinOp, LHS, RHS);
1265  }
1266}
1267
1268/// expression
1269///   ::= unary binoprhs
1270///
1271static ExprAST *ParseExpression() {
1272  ExprAST *LHS = ParseUnary();
1273  if (!LHS) return 0;
1274
1275  return ParseBinOpRHS(0, LHS);
1276}
1277
1278/// prototype
1279///   ::= id '(' id* ')'
1280///   ::= binary LETTER number? (id, id)
1281///   ::= unary LETTER (id)
1282static PrototypeAST *ParsePrototype() {
1283  std::string FnName;
1284
1285  unsigned Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1286  unsigned BinaryPrecedence = 30;
1287
1288  switch (CurTok) {
1289  default:
1290    return ErrorP("Expected function name in prototype");
1291  case tok_identifier:
1292    FnName = IdentifierStr;
1293    Kind = 0;
1294    getNextToken();
1295    break;
1296  case tok_unary:
1297    getNextToken();
1298    if (!isascii(CurTok))
1299      return ErrorP("Expected unary operator");
1300    FnName = "unary";
1301    FnName += (char)CurTok;
1302    Kind = 1;
1303    getNextToken();
1304    break;
1305  case tok_binary:
1306    getNextToken();
1307    if (!isascii(CurTok))
1308      return ErrorP("Expected binary operator");
1309    FnName = "binary";
1310    FnName += (char)CurTok;
1311    Kind = 2;
1312    getNextToken();
1313
1314    // Read the precedence if present.
1315    if (CurTok == tok_number) {
1316      if (NumVal &lt; 1 || NumVal &gt; 100)
1317        return ErrorP("Invalid precedecnce: must be 1..100");
1318      BinaryPrecedence = (unsigned)NumVal;
1319      getNextToken();
1320    }
1321    break;
1322  }
1323
1324  if (CurTok != '(')
1325    return ErrorP("Expected '(' in prototype");
1326
1327  std::vector&lt;std::string&gt; ArgNames;
1328  while (getNextToken() == tok_identifier)
1329    ArgNames.push_back(IdentifierStr);
1330  if (CurTok != ')')
1331    return ErrorP("Expected ')' in prototype");
1332
1333  // success.
1334  getNextToken();  // eat ')'.
1335
1336  // Verify right number of names for operator.
1337  if (Kind &amp;&amp; ArgNames.size() != Kind)
1338    return ErrorP("Invalid number of operands for operator");
1339
1340  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1341}
1342
1343/// definition ::= 'def' prototype expression
1344static FunctionAST *ParseDefinition() {
1345  getNextToken();  // eat def.
1346  PrototypeAST *Proto = ParsePrototype();
1347  if (Proto == 0) return 0;
1348
1349  if (ExprAST *E = ParseExpression())
1350    return new FunctionAST(Proto, E);
1351  return 0;
1352}
1353
1354/// toplevelexpr ::= expression
1355static FunctionAST *ParseTopLevelExpr() {
1356  if (ExprAST *E = ParseExpression()) {
1357    // Make an anonymous proto.
1358    PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1359    return new FunctionAST(Proto, E);
1360  }
1361  return 0;
1362}
1363
1364/// external ::= 'extern' prototype
1365static PrototypeAST *ParseExtern() {
1366  getNextToken();  // eat extern.
1367  return ParsePrototype();
1368}
1369
1370//===----------------------------------------------------------------------===//
1371// Code Generation
1372//===----------------------------------------------------------------------===//
1373
1374static Module *TheModule;
1375static IRBuilder&lt;&gt; Builder(getGlobalContext());
1376static std::map&lt;std::string, Value*&gt; NamedValues;
1377static FunctionPassManager *TheFPM;
1378
1379Value *ErrorV(const char *Str) { Error(Str); return 0; }
1380
1381Value *NumberExprAST::Codegen() {
1382  return ConstantFP::get(getGlobalContext(), APFloat(Val));
1383}
1384
1385Value *VariableExprAST::Codegen() {
1386  // Look this variable up in the function.
1387  Value *V = NamedValues[Name];
1388  return V ? V : ErrorV("Unknown variable name");
1389}
1390
1391Value *UnaryExprAST::Codegen() {
1392  Value *OperandV = Operand-&gt;Codegen();
1393  if (OperandV == 0) return 0;
1394
1395  Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1396  if (F == 0)
1397    return ErrorV("Unknown unary operator");
1398
1399  return Builder.CreateCall(F, OperandV, "unop");
1400}
1401
1402Value *BinaryExprAST::Codegen() {
1403  Value *L = LHS-&gt;Codegen();
1404  Value *R = RHS-&gt;Codegen();
1405  if (L == 0 || R == 0) return 0;
1406
1407  switch (Op) {
1408  case '+': return Builder.CreateFAdd(L, R, "addtmp");
1409  case '-': return Builder.CreateFSub(L, R, "subtmp");
1410  case '*': return Builder.CreateFMul(L, R, "multmp");
1411  case '&lt;':
1412    L = Builder.CreateFCmpULT(L, R, "cmptmp");
1413    // Convert bool 0/1 to double 0.0 or 1.0
1414    return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
1415                                "booltmp");
1416  default: break;
1417  }
1418
1419  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1420  // a call to it.
1421  Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1422  assert(F &amp;&amp; "binary operator not found!");
1423
1424  Value *Ops[2] = { L, R };
1425  return Builder.CreateCall(F, Ops, "binop");
1426}
1427
1428Value *CallExprAST::Codegen() {
1429  // Look up the name in the global module table.
1430  Function *CalleeF = TheModule-&gt;getFunction(Callee);
1431  if (CalleeF == 0)
1432    return ErrorV("Unknown function referenced");
1433
1434  // If argument mismatch error.
1435  if (CalleeF-&gt;arg_size() != Args.size())
1436    return ErrorV("Incorrect # arguments passed");
1437
1438  std::vector&lt;Value*&gt; ArgsV;
1439  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1440    ArgsV.push_back(Args[i]-&gt;Codegen());
1441    if (ArgsV.back() == 0) return 0;
1442  }
1443
1444  return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
1445}
1446
1447Value *IfExprAST::Codegen() {
1448  Value *CondV = Cond-&gt;Codegen();
1449  if (CondV == 0) return 0;
1450
1451  // Convert condition to a bool by comparing equal to 0.0.
1452  CondV = Builder.CreateFCmpONE(CondV,
1453                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1454                                "ifcond");
1455
1456  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1457
1458  // Create blocks for the then and else cases.  Insert the 'then' block at the
1459  // end of the function.
1460  BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1461  BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1462  BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
1463
1464  Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1465
1466  // Emit then value.
1467  Builder.SetInsertPoint(ThenBB);
1468
1469  Value *ThenV = Then-&gt;Codegen();
1470  if (ThenV == 0) return 0;
1471
1472  Builder.CreateBr(MergeBB);
1473  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1474  ThenBB = Builder.GetInsertBlock();
1475
1476  // Emit else block.
1477  TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1478  Builder.SetInsertPoint(ElseBB);
1479
1480  Value *ElseV = Else-&gt;Codegen();
1481  if (ElseV == 0) return 0;
1482
1483  Builder.CreateBr(MergeBB);
1484  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1485  ElseBB = Builder.GetInsertBlock();
1486
1487  // Emit merge block.
1488  TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1489  Builder.SetInsertPoint(MergeBB);
1490  PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
1491                                  "iftmp");
1492
1493  PN-&gt;addIncoming(ThenV, ThenBB);
1494  PN-&gt;addIncoming(ElseV, ElseBB);
1495  return PN;
1496}
1497
1498Value *ForExprAST::Codegen() {
1499  // Output this as:
1500  //   ...
1501  //   start = startexpr
1502  //   goto loop
1503  // loop:
1504  //   variable = phi [start, loopheader], [nextvariable, loopend]
1505  //   ...
1506  //   bodyexpr
1507  //   ...
1508  // loopend:
1509  //   step = stepexpr
1510  //   nextvariable = variable + step
1511  //   endcond = endexpr
1512  //   br endcond, loop, endloop
1513  // outloop:
1514
1515  // Emit the start code first, without 'variable' in scope.
1516  Value *StartVal = Start-&gt;Codegen();
1517  if (StartVal == 0) return 0;
1518
1519  // Make the new basic block for the loop header, inserting after current
1520  // block.
1521  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1522  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1523  BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
1524
1525  // Insert an explicit fall through from the current block to the LoopBB.
1526  Builder.CreateBr(LoopBB);
1527
1528  // Start insertion in LoopBB.
1529  Builder.SetInsertPoint(LoopBB);
1530
1531  // Start the PHI node with an entry for Start.
1532  PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
1533  Variable-&gt;addIncoming(StartVal, PreheaderBB);
1534
1535  // Within the loop, the variable is defined equal to the PHI node.  If it
1536  // shadows an existing variable, we have to restore it, so save it now.
1537  Value *OldVal = NamedValues[VarName];
1538  NamedValues[VarName] = Variable;
1539
1540  // Emit the body of the loop.  This, like any other expr, can change the
1541  // current BB.  Note that we ignore the value computed by the body, but don't
1542  // allow an error.
1543  if (Body-&gt;Codegen() == 0)
1544    return 0;
1545
1546  // Emit the step value.
1547  Value *StepVal;
1548  if (Step) {
1549    StepVal = Step-&gt;Codegen();
1550    if (StepVal == 0) return 0;
1551  } else {
1552    // If not specified, use 1.0.
1553    StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
1554  }
1555
1556  Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
1557
1558  // Compute the end condition.
1559  Value *EndCond = End-&gt;Codegen();
1560  if (EndCond == 0) return EndCond;
1561
1562  // Convert condition to a bool by comparing equal to 0.0.
1563  EndCond = Builder.CreateFCmpONE(EndCond,
1564                              ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1565                                  "loopcond");
1566
1567  // Create the "after loop" block and insert it.
1568  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1569  BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
1570
1571  // Insert the conditional branch into the end of LoopEndBB.
1572  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1573
1574  // Any new code will be inserted in AfterBB.
1575  Builder.SetInsertPoint(AfterBB);
1576
1577  // Add a new entry to the PHI node for the backedge.
1578  Variable-&gt;addIncoming(NextVar, LoopEndBB);
1579
1580  // Restore the unshadowed variable.
1581  if (OldVal)
1582    NamedValues[VarName] = OldVal;
1583  else
1584    NamedValues.erase(VarName);
1585
1586
1587  // for expr always returns 0.0.
1588  return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
1589}
1590
1591Function *PrototypeAST::Codegen() {
1592  // Make the function type:  double(double,double) etc.
1593  std::vector&lt;Type*&gt; Doubles(Args.size(),
1594                             Type::getDoubleTy(getGlobalContext()));
1595  FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
1596                                       Doubles, false);
1597
1598  Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
1599
1600  // If F conflicted, there was already something named 'Name'.  If it has a
1601  // body, don't allow redefinition or reextern.
1602  if (F-&gt;getName() != Name) {
1603    // Delete the one we just made and get the existing one.
1604    F-&gt;eraseFromParent();
1605    F = TheModule-&gt;getFunction(Name);
1606
1607    // If F already has a body, reject this.
1608    if (!F-&gt;empty()) {
1609      ErrorF("redefinition of function");
1610      return 0;
1611    }
1612
1613    // If F took a different number of args, reject.
1614    if (F-&gt;arg_size() != Args.size()) {
1615      ErrorF("redefinition of function with different # args");
1616      return 0;
1617    }
1618  }
1619
1620  // Set names for all arguments.
1621  unsigned Idx = 0;
1622  for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1623       ++AI, ++Idx) {
1624    AI-&gt;setName(Args[Idx]);
1625
1626    // Add arguments to variable symbol table.
1627    NamedValues[Args[Idx]] = AI;
1628  }
1629
1630  return F;
1631}
1632
1633Function *FunctionAST::Codegen() {
1634  NamedValues.clear();
1635
1636  Function *TheFunction = Proto-&gt;Codegen();
1637  if (TheFunction == 0)
1638    return 0;
1639
1640  // If this is an operator, install it.
1641  if (Proto-&gt;isBinaryOp())
1642    BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1643
1644  // Create a new basic block to start insertion into.
1645  BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
1646  Builder.SetInsertPoint(BB);
1647
1648  if (Value *RetVal = Body-&gt;Codegen()) {
1649    // Finish off the function.
1650    Builder.CreateRet(RetVal);
1651
1652    // Validate the generated code, checking for consistency.
1653    verifyFunction(*TheFunction);
1654
1655    // Optimize the function.
1656    TheFPM-&gt;run(*TheFunction);
1657
1658    return TheFunction;
1659  }
1660
1661  // Error reading body, remove function.
1662  TheFunction-&gt;eraseFromParent();
1663
1664  if (Proto-&gt;isBinaryOp())
1665    BinopPrecedence.erase(Proto-&gt;getOperatorName());
1666  return 0;
1667}
1668
1669//===----------------------------------------------------------------------===//
1670// Top-Level parsing and JIT Driver
1671//===----------------------------------------------------------------------===//
1672
1673static ExecutionEngine *TheExecutionEngine;
1674
1675static void HandleDefinition() {
1676  if (FunctionAST *F = ParseDefinition()) {
1677    if (Function *LF = F-&gt;Codegen()) {
1678      fprintf(stderr, "Read function definition:");
1679      LF-&gt;dump();
1680    }
1681  } else {
1682    // Skip token for error recovery.
1683    getNextToken();
1684  }
1685}
1686
1687static void HandleExtern() {
1688  if (PrototypeAST *P = ParseExtern()) {
1689    if (Function *F = P-&gt;Codegen()) {
1690      fprintf(stderr, "Read extern: ");
1691      F-&gt;dump();
1692    }
1693  } else {
1694    // Skip token for error recovery.
1695    getNextToken();
1696  }
1697}
1698
1699static void HandleTopLevelExpression() {
1700  // Evaluate a top-level expression into an anonymous function.
1701  if (FunctionAST *F = ParseTopLevelExpr()) {
1702    if (Function *LF = F-&gt;Codegen()) {
1703      // JIT the function, returning a function pointer.
1704      void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1705
1706      // Cast it to the right type (takes no arguments, returns a double) so we
1707      // can call it as a native function.
1708      double (*FP)() = (double (*)())(intptr_t)FPtr;
1709      fprintf(stderr, "Evaluated to %f\n", FP());
1710    }
1711  } else {
1712    // Skip token for error recovery.
1713    getNextToken();
1714  }
1715}
1716
1717/// top ::= definition | external | expression | ';'
1718static void MainLoop() {
1719  while (1) {
1720    fprintf(stderr, "ready&gt; ");
1721    switch (CurTok) {
1722    case tok_eof:    return;
1723    case ';':        getNextToken(); break;  // ignore top-level semicolons.
1724    case tok_def:    HandleDefinition(); break;
1725    case tok_extern: HandleExtern(); break;
1726    default:         HandleTopLevelExpression(); break;
1727    }
1728  }
1729}
1730
1731//===----------------------------------------------------------------------===//
1732// "Library" functions that can be "extern'd" from user code.
1733//===----------------------------------------------------------------------===//
1734
1735/// putchard - putchar that takes a double and returns 0.
1736extern "C"
1737double putchard(double X) {
1738  putchar((char)X);
1739  return 0;
1740}
1741
1742/// printd - printf that takes a double prints it as "%f\n", returning 0.
1743extern "C"
1744double printd(double X) {
1745  printf("%f\n", X);
1746  return 0;
1747}
1748
1749//===----------------------------------------------------------------------===//
1750// Main driver code.
1751//===----------------------------------------------------------------------===//
1752
1753int main() {
1754  InitializeNativeTarget();
1755  LLVMContext &amp;Context = getGlobalContext();
1756
1757  // Install standard binary operators.
1758  // 1 is lowest precedence.
1759  BinopPrecedence['&lt;'] = 10;
1760  BinopPrecedence['+'] = 20;
1761  BinopPrecedence['-'] = 20;
1762  BinopPrecedence['*'] = 40;  // highest.
1763
1764  // Prime the first token.
1765  fprintf(stderr, "ready&gt; ");
1766  getNextToken();
1767
1768  // Make the module, which holds all the code.
1769  TheModule = new Module("my cool jit", Context);
1770
1771  // Create the JIT.  This takes ownership of the module.
1772  std::string ErrStr;
1773  TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&amp;ErrStr).create();
1774  if (!TheExecutionEngine) {
1775    fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
1776    exit(1);
1777  }
1778
1779  FunctionPassManager OurFPM(TheModule);
1780
1781  // Set up the optimizer pipeline.  Start with registering info about how the
1782  // target lays out data structures.
1783  OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1784  // Provide basic AliasAnalysis support for GVN.
1785  OurFPM.add(createBasicAliasAnalysisPass());
1786  // Do simple "peephole" optimizations and bit-twiddling optzns.
1787  OurFPM.add(createInstructionCombiningPass());
1788  // Reassociate expressions.
1789  OurFPM.add(createReassociatePass());
1790  // Eliminate Common SubExpressions.
1791  OurFPM.add(createGVNPass());
1792  // Simplify the control flow graph (deleting unreachable blocks, etc).
1793  OurFPM.add(createCFGSimplificationPass());
1794
1795  OurFPM.doInitialization();
1796
1797  // Set the global so the code gen can use this.
1798  TheFPM = &amp;OurFPM;
1799
1800  // Run the main "interpreter loop" now.
1801  MainLoop();
1802
1803  TheFPM = 0;
1804
1805  // Print out all of the generated code.
1806  TheModule-&gt;dump();
1807
1808  return 0;
1809}
1810</pre>
1811</div>
1812
1813<a href="LangImpl7.html">Next: Extending the language: mutable variables / SSA construction</a>
1814</div>
1815
1816<!-- *********************************************************************** -->
1817<hr>
1818<address>
1819  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1820  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1821  <a href="http://validator.w3.org/check/referer"><img
1822  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1823
1824  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1825  <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
1826  Last modified: $Date$
1827</address>
1828</body>
1829</html>
1830