:mod:`parser` --- Access Python parse trees
===========================================

.. module:: parser
   :synopsis: Access parse trees for Python source code.

.. moduleauthor:: Fred L. Drake, Jr. <fdrake@acm.org>
.. sectionauthor:: Fred L. Drake, Jr. <fdrake@acm.org>

.. Copyright 1995 Virginia Polytechnic Institute and State University and Fred
   L. Drake, Jr. This copyright notice must be distributed on all copies, but
   this document otherwise may be distributed as part of the Python
   distribution. No fee may be charged for this document in any representation,
   either on paper or electronically. This restriction does not affect other
   elements in a distributed package in any way.

.. index:: single: parsing; Python source code

--------------

The :mod:`parser` module provides an interface to Python's internal parser and
byte-code compiler. The primary purpose for this interface is to allow Python
code to edit the parse tree of a Python expression and create executable code
from this. This is better than trying to parse and modify an arbitrary Python
code fragment as a string because parsing is performed in a manner identical to
the code forming the application. It is also faster.

.. warning::

   The parser module is deprecated and will be removed in future versions of
   Python. For the majority of use cases you can leverage the Abstract Syntax
   Tree (AST) generation and compilation stage, using the :mod:`ast` module.

There are a few things to note about this module which are important to making
use of the data structures created. This is not a tutorial on editing the parse
trees for Python code, but some examples of using the :mod:`parser` module are
presented.

Most importantly, a good understanding of the Python grammar processed by the
internal parser is required. For full information on the language syntax, refer
to :ref:`reference-index`.
The parser itself is created from a grammar specification defined in the file
:file:`Grammar/Grammar` in the standard Python distribution. The parse trees
stored in the ST objects created by this module are the actual output from the
internal parser when created by the :func:`expr` or :func:`suite` functions,
described below. The ST objects created by :func:`sequence2st` faithfully
simulate those structures. Be aware that the values of the sequences which are
considered "correct" will vary from one version of Python to another as the
formal grammar for the language is revised. However, transporting code from one
Python version to another as source text will always allow correct parse trees
to be created in the target version, with the only restriction being that
migrating to an older version of the interpreter will not support more recent
language constructs. The parse trees are not typically compatible from one
version to another, though source code has usually been forward-compatible within
a major release series.

Each element of the sequences returned by :func:`st2list` or :func:`st2tuple`
has a simple form. Sequences representing non-terminal elements in the grammar
always have a length greater than one. The first element is an integer which
identifies a production in the grammar. These integers are given symbolic names
in the C header file :file:`Include/graminit.h` and the Python module
:mod:`symbol`. Each additional element of the sequence represents a component
of the production as recognized in the input string: these are always sequences
which have the same form as the parent. An important aspect of this structure
which should be noted is that keywords used to identify the parent node type,
such as the keyword :keyword:`if` in an :const:`if_stmt`, are included in the
node tree without any special treatment.
For example, the :keyword:`!if` keyword
is represented by the tuple ``(1, 'if')``, where ``1`` is the numeric value
associated with all :const:`NAME` tokens, including variable and function names
defined by the user. In an alternate form returned when line number information
is requested, the same token might be represented as ``(1, 'if', 12)``, where
the ``12`` represents the line number at which the terminal symbol was found.

Terminal elements are represented in much the same way, but without any child
elements and the addition of the source text which was identified. The example
of the :keyword:`if` keyword above is representative. The various types of
terminal symbols are defined in the C header file :file:`Include/token.h` and
the Python module :mod:`token`.

The ST objects are not required to support the functionality of this module,
but are provided for three purposes: to allow an application to amortize the
cost of processing complex parse trees, to provide a parse tree representation
which conserves memory space when compared to the Python list or tuple
representation, and to ease the creation of additional modules in C which
manipulate parse trees. A simple "wrapper" class may be created in Python to
hide the use of ST objects.

The :mod:`parser` module defines functions for a few distinct purposes. The
most important purposes are to create ST objects and to convert ST objects to
other representations such as parse trees and compiled code objects, but there
are also functions which serve to query the type of parse tree represented by an
ST object.


.. seealso::

   Module :mod:`symbol`
      Useful constants representing internal nodes of the parse tree.

   Module :mod:`token`
      Useful constants representing leaf nodes of the parse tree and functions for
      testing node values.


.. _creating-sts:

Creating ST Objects
-------------------

ST objects may be created from source code or from a parse tree. When creating
an ST object from source, different functions are used to create the ``'eval'``
and ``'exec'`` forms.


.. function:: expr(source)

   The :func:`expr` function parses the parameter *source* as if it were an input
   to ``compile(source, 'file.py', 'eval')``. If the parse succeeds, an ST object
   is created to hold the internal parse tree representation, otherwise an
   appropriate exception is raised.


.. function:: suite(source)

   The :func:`suite` function parses the parameter *source* as if it were an input
   to ``compile(source, 'file.py', 'exec')``. If the parse succeeds, an ST object
   is created to hold the internal parse tree representation, otherwise an
   appropriate exception is raised.


.. function:: sequence2st(sequence)

   This function accepts a parse tree represented as a sequence and builds an
   internal representation if possible. If it can validate that the tree conforms
   to the Python grammar and all nodes are valid node types in the host version of
   Python, an ST object is created from the internal representation and returned
   to the caller. If there is a problem creating the internal representation, or
   if the tree cannot be validated, a :exc:`ParserError` exception is raised. An
   ST object created this way should not be assumed to compile correctly; normal
   exceptions raised by compilation may still be initiated when the ST object is
   passed to :func:`compilest`. This may indicate problems not related to syntax
   (such as a :exc:`MemoryError` exception), but may also be due to constructs such
   as the result of parsing ``del f(0)``, which escapes the Python parser but is
   checked by the bytecode compiler.

   Sequences representing terminal tokens may be represented as either two-element
   lists of the form ``(1, 'name')`` or as three-element lists of the form ``(1,
   'name', 56)``. If the third element is present, it is assumed to be a valid
   line number. The line number may be specified for any subset of the terminal
   symbols in the input tree.


.. function:: tuple2st(sequence)

   This is the same function as :func:`sequence2st`. This entry point is
   maintained for backward compatibility.


.. _converting-sts:

Converting ST Objects
---------------------

ST objects, regardless of the input used to create them, may be converted to
parse trees represented as list- or tuple- trees, or may be compiled into
executable code objects. Parse trees may be extracted with or without line
numbering information.


.. function:: st2list(st, line_info=False, col_info=False)

   This function accepts an ST object from the caller in *st* and returns a
   Python list representing the equivalent parse tree. The resulting list
   representation can be used for inspection or the creation of a new parse tree in
   list form. This function does not fail so long as memory is available to build
   the list representation. If the parse tree will only be used for inspection,
   :func:`st2tuple` should be used instead to reduce memory consumption and
   fragmentation. When the list representation is required, this function is
   significantly faster than retrieving a tuple representation and converting that
   to nested lists.

   If *line_info* is true, line number information will be included for all
   terminal tokens as a third element of the list representing the token. Note
   that the line number provided specifies the line on which the token *ends*.
   This information is omitted if the flag is false or omitted.


.. function:: st2tuple(st, line_info=False, col_info=False)

   This function accepts an ST object from the caller in *st* and returns a
   Python tuple representing the equivalent parse tree. Other than returning a
   tuple instead of a list, this function is identical to :func:`st2list`.

   If *line_info* is true, line number information will be included for all
   terminal tokens as a third element of the tuple representing the token. This
   information is omitted if the flag is false or omitted.


.. function:: compilest(st, filename='<syntax-tree>')

   .. index::
      builtin: exec
      builtin: eval

   The Python byte compiler can be invoked on an ST object to produce code objects
   which can be used as part of a call to the built-in :func:`exec` or :func:`eval`
   functions. This function provides the interface to the compiler, passing the
   internal parse tree from *st* to the parser, using the source file name
   specified by the *filename* parameter. The default value supplied for *filename*
   indicates that the source was an ST object.

   Compiling an ST object may result in exceptions related to compilation; an
   example would be a :exc:`SyntaxError` caused by the parse tree for ``del f(0)``:
   this statement is considered legal within the formal grammar for Python but is
   not a legal language construct. The :exc:`SyntaxError` raised for this
   condition is actually generated by the Python byte-compiler normally, which is
   why it can be raised at this point by the :mod:`parser` module. Most causes of
   compilation failure can be diagnosed programmatically by inspection of the parse
   tree.


.. _querying-sts:

Queries on ST Objects
---------------------

Two functions are provided which allow an application to determine if an ST was
created as an expression or a suite.
Neither of these functions can be used to
determine if an ST was created from source code via :func:`expr` or
:func:`suite` or from a parse tree via :func:`sequence2st`.


.. function:: isexpr(st)

   .. index:: builtin: compile

   When *st* represents an ``'eval'`` form, this function returns ``True``, otherwise
   it returns ``False``. This is useful, since code objects normally cannot be queried
   for this information using existing built-in functions. Note that the code
   objects created by :func:`compilest` cannot be queried like this either, and
   are identical to those created by the built-in :func:`compile` function.


.. function:: issuite(st)

   This function mirrors :func:`isexpr` in that it reports whether an ST object
   represents an ``'exec'`` form, commonly known as a "suite." It is not safe to
   assume that this function is equivalent to ``not isexpr(st)``, as additional
   syntactic fragments may be supported in the future.


.. _st-errors:

Exceptions and Error Handling
-----------------------------

The parser module defines a single exception, but may also pass other built-in
exceptions from other portions of the Python runtime environment. See each
function for information about the exceptions it can raise.


.. exception:: ParserError

   Exception raised when a failure occurs within the parser module. This is
   generally produced for validation failures rather than the built-in
   :exc:`SyntaxError` raised during normal parsing. The exception argument is
   either a string describing the reason for the failure or a tuple containing a
   sequence causing the failure from a parse tree passed to :func:`sequence2st`
   and an explanatory string. Calls to :func:`sequence2st` need to be able to
   handle either type of exception, while calls to other functions in the module
   will only need to be aware of the simple string values.
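
:exc:`ParserError` is reserved for validation failures, while ordinary
compilation problems surface as the built-in :exc:`SyntaxError`. The following
is a minimal sketch of handling the latter. It assumes only the built-in
:func:`compile` function, not the :mod:`parser` module, so it does not exercise
:exc:`ParserError` itself. The helper name ``try_compile`` and the filename
``'<example>'`` are hypothetical and chosen for illustration.

```python
# Illustrative only: uses the built-in compile() instead of the parser
# module, so ParserError is never raised here.

def try_compile(source, mode="exec"):
    """Return (code, None) on success or (None, message) on SyntaxError.

    ``try_compile`` is a hypothetical helper, not part of any module.
    """
    try:
        return compile(source, "<example>", mode), None
    except SyntaxError as exc:
        return None, exc.msg


# ``del f(0)`` is the construct this document cites as being rejected
# when code is compiled.
code, error = try_compile("del f(0)")

# A well-formed expression compiles and can be evaluated as usual.
ok_code, ok_error = try_compile("a + 5", mode="eval")
value = eval(ok_code, {"a": 5})
```

Returning the message instead of re-raising keeps the caller free to decide
whether a rejected construct is fatal. Code that also calls
:func:`sequence2st` would need a separate ``except`` clause for
:exc:`ParserError`.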

Note that the functions :func:`compilest`, :func:`expr`, and :func:`suite` may
raise exceptions which are normally raised by the parsing and compilation
process. These include the built-in exceptions :exc:`MemoryError`,
:exc:`OverflowError`, :exc:`SyntaxError`, and :exc:`SystemError`. In these
cases, these exceptions carry all the meaning normally associated with them.
Refer to the descriptions of each function for detailed information.


.. _st-objects:

ST Objects
----------

Ordered and equality comparisons are supported between ST objects. Pickling of
ST objects (using the :mod:`pickle` module) is also supported.


.. data:: STType

   The type of the objects returned by :func:`expr`, :func:`suite` and
   :func:`sequence2st`.

ST objects have the following methods:


.. method:: ST.compile(filename='<syntax-tree>')

   Same as ``compilest(st, filename)``.


.. method:: ST.isexpr()

   Same as ``isexpr(st)``.


.. method:: ST.issuite()

   Same as ``issuite(st)``.


.. method:: ST.tolist(line_info=False, col_info=False)

   Same as ``st2list(st, line_info, col_info)``.


.. method:: ST.totuple(line_info=False, col_info=False)

   Same as ``st2tuple(st, line_info, col_info)``.


Example: Emulation of :func:`compile`
-------------------------------------

While many useful operations may take place between parsing and bytecode
generation, the simplest operation is to do nothing.
For this purpose, using
the :mod:`parser` module to produce an intermediate data structure is equivalent
to the code ::

   >>> code = compile('a + 5', 'file.py', 'eval')
   >>> a = 5
   >>> eval(code)
   10

The equivalent operation using the :mod:`parser` module is somewhat longer, and
allows the intermediate internal parse tree to be retained as an ST object::

   >>> import parser
   >>> st = parser.expr('a + 5')
   >>> code = st.compile('file.py')
   >>> a = 5
   >>> eval(code)
   10

An application which needs both ST and code objects can package this code into
readily available functions::

   import parser

   def load_suite(source_string):
       st = parser.suite(source_string)
       return st, st.compile()

   def load_expression(source_string):
       st = parser.expr(source_string)
       return st, st.compile()
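
Since the warning at the top of this document recommends the :mod:`ast` module
for most use cases, the same packaging can be sketched with :func:`ast.parse`
and the built-in :func:`compile`. This is an illustrative counterpart, not a
drop-in replacement: it returns AST nodes, not ST objects, and the names
``load_suite_ast``, ``load_expression_ast`` and the default filename
``'<ast>'`` are hypothetical.

```python
import ast


def load_suite_ast(source_string, filename="<ast>"):
    """Hypothetical ast-based counterpart of load_suite()."""
    tree = ast.parse(source_string, filename=filename, mode="exec")
    return tree, compile(tree, filename, "exec")


def load_expression_ast(source_string, filename="<ast>"):
    """Hypothetical ast-based counterpart of load_expression()."""
    tree = ast.parse(source_string, filename=filename, mode="eval")
    return tree, compile(tree, filename, "eval")


# Evaluate the expression from the example above in an explicit namespace.
tree, code = load_expression_ast("a + 5")
result = eval(code, {"a": 5})

# Execute a one-statement suite and read the name it binds.
suite_tree, suite_code = load_suite_ast("b = a * 2")
namespace = {"a": 5}
exec(suite_code, namespace)
```

As with the :mod:`parser` versions, the intermediate tree is kept alongside
the code object, so it can be inspected or transformed before a second call to
:func:`compile`.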