===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva <silvas@purdue.edu>

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.


:token:`SimpleValue` has a number of forms:


.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

    def Bar : SomeClass {
      int X = 5;
    }

    def Foo {
      SomeClass Baz = Bar;
    }

* value local to a ``def``, such as the use of ``Bar`` in::

    def Foo {
      int Bar = 5;
      int Baz = Bar;
    }

* a template arg of a ``class``, such as the use of ``Bar`` in::

    class Foo<int Bar> {
      int Baz = Bar;
    }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

    multiclass Foo {
      int Bar = 5;
      int Baz = Bar;
    }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

    multiclass Foo<int Bar> {
      int Baz = Bar;
    }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`]

The initial :token:`DagArg` is called the "operator" of the dag.

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.
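
As an informal sketch (not part of the grammar), the following shows a few
of the value forms above used inside object bodies, together with a base
class list carrying a template argument. All record and field names here
(``Base``, ``Leaf``, ``User`` and their fields) are hypothetical::

  class Base<int W> {
    int Width = W;                   // template arg used as a value
    bits<4> Opcode = { 0, 1, 0, 1 }; // bit sequence initializer
    list<int> Sizes = [8, 16, 32];   // list initializer; element type deduced
    string Asm = ?;                  // unset initializer
  }

  // BaseClassList with a single SubClassRef and one template argument.
  def Leaf : Base<32>;

  def User {
    Base Anon = Base<16>;            // anonymous record definition
  }
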

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
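
As a rough illustration of how ``multiclass`` and ``defm`` fit together, the
sketch below uses hypothetical names (``Inst``, ``RegImm``, ``ADD``) that do
not come from any real target. It assumes the usual behaviour in which the
name given to ``defm`` is prepended to the name of each ``def`` inside the
multiclass, so it should yield two records named ``ADDrr`` and ``ADDri``::

  class Inst<string Name> {
    string Mnemonic = Name;
  }

  multiclass RegImm<string Base> {
    // One def per operand form; both inherit from the ordinary class Inst.
    def rr : Inst<!strconcat(Base, "_rr")>;
    def ri : Inst<!strconcat(Base, "_ri")>;
  }

  // Instantiates every def in RegImm with Base bound to "add".
  defm ADD : RegImm<"add">;

Wrapping the ``defm`` in a top-level ``let ... in`` would then be one way to
override a field on both generated records at once.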