
Lines Matching +full:helper +full:- +full:string +full:- +full:parser

2 Kaleidoscope: Implementing a Parser and AST
14 `parser <http://en.wikipedia.org/wiki/Parsing>`_ for our Kaleidoscope
15 language. Once we have a parser, we'll define and build an `Abstract
18 The parser we will build uses a combination of `Recursive Descent
20 `Operator-Precedence
21 Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_ to
24 talk about the output of the parser: the Abstract Syntax Tree.
36 .. code-block:: c++
38 /// ExprAST - Base class for all expression nodes.
44 /// NumberExprAST - Expression class for numeric literals like "1.0".
64 .. code-block:: c++
66 /// VariableExprAST - Expression class for referencing a variable, like "a".
68 std::string Name;
71 VariableExprAST(const std::string &Name) : Name(Name) {}
74 /// BinaryExprAST - Expression class for a binary operator.
85 /// CallExprAST - Expression class for function calls.
87 std::string Callee;
91 CallExprAST(const std::string &Callee,
96 This is all (intentionally) rather straight-forward: variables capture
106 Turing-complete; we'll fix that in a later installment. The two things
110 .. code-block:: c++
112 /// PrototypeAST - This class represents the "prototype" for a function,
116 std::string Name;
117 std::vector<std::string> Args;
120 PrototypeAST(const std::string &name, std::vector<std::string> Args)
124 /// FunctionAST - This class represents a function definition itself.
144 Parser Basics
147 Now that we have an AST to build, we need to define the parser code to
152 .. code-block:: c++
159 In order to do this, we'll start by defining some basic helper routines:
161 .. code-block:: c++
163 /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
164 /// token the parser is looking at. getNextToken reads another token from the
173 in our parser will assume that CurTok is the current token that needs to
176 .. code-block:: c++
179 /// LogError* - These are little helper functions for error handling.
189 The ``LogError`` routines are simple helper routines that our parser will
190 use to handle errors. The error recovery in our parser will not be the
191 best and is not particularly user-friendly, but it will be enough for our
195 With these basic helper functions, we can implement the first piece of
205 .. code-block:: c++
226 .. code-block:: c++
242 parser:
247 if the user types in "(4 x" instead of "(4)", the parser should emit an
248 error. Because errors can occur, the parser needs a way to indicate that
249 they happened: in our parser, we return null on an error.
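The "return null on an error" convention can be sketched as follows. This is a minimal illustration, not the tutorial's full listing: the empty ``ExprAST`` struct is a stand-in for the real base class, and the exact message format is an assumption.

```cpp
#include <cstdio>
#include <memory>

// Stand-in for the tutorial's ExprAST base class (assumed to be empty here).
struct ExprAST {
  virtual ~ExprAST() = default;
};

// Report the problem on stderr and hand back a null pointer, so every
// caller can detect and propagate the failure with a plain null check.
std::unique_ptr<ExprAST> LogError(const char *Str) {
  std::fprintf(stderr, "Error: %s\n", Str);
  return nullptr;
}
```

A parsing routine that hits an unexpected token can then simply ``return LogError("expected ')'");`` and its caller only has to test the result for null.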
257 important role of parentheses is to guide the parser and provide
258 grouping. Once the parser constructs the AST, parentheses are not
264 .. code-block:: c++
270 std::string IdName = IdentifierStr;
305 that it uses *look-ahead* to determine if the current identifier is a
311 Now that we have all of our simple expression-parsing logic in place, we
312 can define a helper function to wrap it together into one entry point.
315 tutorial <LangImpl6.html#user-defined-unary-operators>`_. In order to parse an arbitrary
318 .. code-block:: c++
339 look-ahead to determine which sort of expression is being inspected, and
349 often ambiguous. For example, when given the string "x+y\*z", the parser
355 to use `Operator-Precedence
356 Parsing <http://en.wikipedia.org/wiki/Operator-precedence_parser>`_.
360 .. code-block:: c++
362 /// BinopPrecedence - This holds the precedence for each binary operator that is
366 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
369 return -1;
373 if (TokPrec <= 0) return -1;
382 BinopPrecedence['-'] = 20;
390 the current token, or -1 if the token is not a binary operator. Having a
394 ``GetTokPrecedence`` function. (Or just use a fixed-size array).
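A minimal sketch of such a precedence lookup is shown below. It differs from the tutorial's ``GetTokPrecedence`` in that the token is passed as a parameter instead of being read from ``CurTok``. The name ``PrecedenceOf`` is hypothetical, and every table entry except ``'-'`` at 20 is an assumed value chosen for illustration.

```cpp
#include <map>

// Precedence table: a higher number binds tighter. Only the '-' entry is
// taken from the excerpt above; the other values are assumptions.
std::map<char, int> BinopPrecedence = {
    {'<', 10}, {'+', 20}, {'-', 20}, {'*', 40}};

// Hypothetical helper: returns the precedence of Tok, or -1 if Tok is not
// a known binary operator (including non-ASCII and negative token codes).
int PrecedenceOf(int Tok) {
  if (Tok < 0 || Tok > 127)
    return -1;
  auto It = BinopPrecedence.find(static_cast<char>(Tok));
  if (It == BinopPrecedence.end() || It->second <= 0)
    return -1;
  return It->second;
}
```

Because the table is an ordinary map, new operators can be added at run time by inserting further entries, without touching the lookup function.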
396 With the helper above defined, we can now start parsing binary
404 primary expressions, the binary expression parser doesn't need to worry
410 .. code-block:: c++
438 .. code-block:: c++
455 -1, this check implicitly knows that the pair-stream ends when the token
460 .. code-block:: c++
475 Now that we parsed the left-hand side of an expression and one pair of
482 .. code-block:: c++
496 .. code-block:: c++
521 .. code-block:: c++
549 non-trivial lines), we correctly handle fully general binary expression
555 parser at an arbitrary token stream and build an expression from it,
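To make the effect of operator-precedence parsing concrete, here is a small self-contained sketch that turns a string such as "x+y*z" into a fully parenthesized form instead of building AST nodes. It is a simplification under stated assumptions: every token is a single character, the input is well formed, and the ``Cursor`` type, ``Prec``, and ``Parenthesize`` are hypothetical names that do not appear in the tutorial. The precedence values are assumed as well.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical one-character "lexer": a cursor over the input text.
struct Cursor {
  std::string Text;
  std::size_t Pos = 0;
  int peek() const { return Pos < Text.size() ? Text[Pos] : -1; }
  int next() { return Pos < Text.size() ? Text[Pos++] : -1; }
};

// Assumed precedences; -1 marks "not a binary operator" (or end of input).
int Prec(int Tok) {
  switch (Tok) {
  case '<': return 10;
  case '+': case '-': return 20;
  case '*': return 40;
  default: return -1;
  }
}

// A primary expression is a single character here (for example a variable).
std::string ParsePrimary(Cursor &C) {
  return std::string(1, static_cast<char>(C.next()));
}

// Consume (operator, primary) pairs while the operator binds at least as
// tightly as ExprPrec. If the following operator binds tighter than the
// current one, let it take the current RHS as its LHS first.
std::string ParseBinOpRHS(Cursor &C, int ExprPrec, std::string LHS) {
  while (true) {
    int TokPrec = Prec(C.peek());
    if (TokPrec < ExprPrec)
      return LHS;
    char Op = static_cast<char>(C.next());
    std::string RHS = ParsePrimary(C);
    if (TokPrec < Prec(C.peek()))
      RHS = ParseBinOpRHS(C, TokPrec + 1, std::move(RHS));
    LHS = "(" + LHS + Op + RHS + ")";
  }
}

// Entry point: parse one primary, then fold in any trailing operators.
std::string Parenthesize(const std::string &Text) {
  Cursor C;
  C.Text = Text;
  std::string LHS = ParsePrimary(C);
  return ParseBinOpRHS(C, 0, std::move(LHS));
}
```

With these assumed precedences, ``Parenthesize("x+y*z")`` groups the multiplication first, while operators of equal precedence, as in "a-b-c", group to the left.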
565 straight-forward and not very interesting (once you've survived
568 .. code-block:: c++
576 std::string FnName = IdentifierStr;
583 std::vector<std::string> ArgNames;
598 .. code-block:: c++
615 .. code-block:: c++
623 Finally, we'll also let the user type in arbitrary top-level expressions
627 .. code-block:: c++
633 auto Proto = llvm::make_unique<PrototypeAST>("", std::vector<std::string>());
646 top-level dispatch loop. There isn't much interesting here, so I'll just
647 include the top-level loop. See `below <#full-code-listing>`_ for full code in the
648 "Top-Level Parsing" section.
650 .. code-block:: c++
659 case ';': // ignore top-level semicolons.
675 The most interesting part of this is that we ignore top-level
677 "4 + 5" at the command line, the parser doesn't know whether that is the
679 could type "def foo..." in which case 4+5 is the end of a top-level
681 the expression. Having top-level semicolons allows you to type "4+5;",
682 and the parser will know you are done.
687 With just under 400 lines of commented code (240 lines of non-comment,
688 non-blank code), we fully defined our minimal language, including a
689 lexer, parser, and AST builder. With this done, the executable will
693 .. code-block:: bash
700 Parsed a top-level expr
718 Note that it is fully self-contained: you don't need LLVM or any
722 .. code-block:: bash
725 clang++ -g -O3 toy.cpp