===========================
TableGen Language Reference
===========================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read the :doc:`introduction to TableGen <index>` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.
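
As a rough illustration of the literal forms above, the following sketch
defines a hypothetical record ``IntegerForms``. The record and field names
are assumptions made for this example only and do not come from any real
target description::

   def IntegerForms {
     int Dec = 42;           // DecimalInteger
     int Neg = -7;           // "-7" is lexed as a single DecimalInteger token
     int Hex = 0x2A;         // HexInteger
     bits<4> Bin = 0b1010;   // BinInteger
   }

Because the sign belongs to the token, ``-7`` here is one literal rather
than a unary minus applied to ``7``.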

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

:token:`TokCodeFragment` is essentially a multiline string literal
delimited by ``[{`` and ``}]``.

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !listconcat   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.
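
A minimal sketch of a parametrized class may help here. The class
``Register``, its template arguments ``n`` and ``enc``, and the records
``R0`` and ``R1`` are hypothetical names chosen for illustration::

   // Two template arguments; the second has a default value.
   class Register<string n, int enc = 0> {
     string AsmName = n;    // template argument used in the class body
     int Encoding = enc;
   }

   def R0 : Register<"r0">;      // enc falls back to its default
   def R1 : Register<"r1", 1>;

Each ``def`` inherits the fields ``AsmName`` and ``Encoding`` from the
class, with the template arguments substituted.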

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying "list slice" in some
contexts.

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
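
The following sketch shows one way a top-level ``let`` might be used. The
class ``Inst``, its fields, and the records ``JMP`` and ``BEQ`` are
hypothetical and exist only for this example::

   class Inst {
     string Namespace = "";
     bit isBranch = 0;
   }

   // Both bindings apply to every record defined inside the braces,
   // after each record's base classes have been parsed.
   let Namespace = "Toy", isBranch = 1 in {
     def JMP : Inst;
     def BEQ : Inst;
   }

Writing the same overrides with a ``let`` inside the body of each ``def``
would have the same effect on these two records, at the cost of repeating
the bindings.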

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
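
As a hedged sketch of how a ``multiclass`` and a ``defm`` fit together,
consider the following. The names ``Inst``, ``RegImm``, ``mnemonic``, and
``ADD`` are assumptions made for illustration::

   class Inst<string asm> {
     string AsmString = asm;
   }

   // Each def inside the multiclass is a template for one record.
   multiclass RegImm<string mnemonic> {
     def rr : Inst<!strconcat(mnemonic, " $dst, $a, $b")>;
     def ri : Inst<!strconcat(mnemonic, " $dst, $a, $imm")>;
   }

   // Instantiates both templates at once.
   defm ADD : RegImm<"add">;

With this input, the ``defm`` is expected to produce two records, ``ADDrr``
and ``ADDri``, whose names combine the ``defm`` name with the name of each
inner ``def``.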