===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva <silvas@purdue.edu>

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; .  = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.
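
As a small illustration of these forms, the following sketch (the record and
field names are hypothetical) initializes fields with each kind of literal::

   def Literals {
     int Dec = 42;        // DecimalInteger
     int Neg = -7;        // the "-" is part of the token, not an operator
     int Hex = 0x2A;      // HexInteger
     int Bin = 0b101010;  // BinInteger
   }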

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.
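
For example, assuming the lexer behaves as described above, a record name may
start with digits (the name ``16BitMode`` is purely illustrative)::

   def 16BitMode;   // TokIdentifier: digits followed by letters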

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n
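
A minimal sketch of both literal kinds (the record and field names are
hypothetical)::

   def Messages {
     string Greeting = "hello\tworld\n";   // TokString with C-like escapes
     code Body = [{ return X + 1; }];      // TokCodeFragment, ends at "}]"
   }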

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.
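
As an illustrative sketch (``Register``, ``R0``, and ``D0`` are made-up
names), a parametrized class and two records inheriting from it might look
like::

   class Register<string n, int size = 32> {
     string Name = n;     // template argument used in the body
     int Size = size;
   }

   def R0 : Register<"r0">;       // size falls back to its default value
   def D0 : Register<"d0", 64>;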

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.
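
A sketch of the forward-declaration pattern, with hypothetical class names::

   class Operand;                    // declaration only; defines nothing

   class Instruction {
     list<Operand> Operands = [];    // Operand is already usable as a type
   }

   class Operand {                   // the single definition of Operand
     string Name = "";
   }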

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.
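
The following sketch declares one field of most kinds (``dag`` is omitted);
all names are hypothetical::

   class Kind;          // declared so that it can be used as a type below
   def KindA : Kind;

   def TypeSampler {
     bit Flag = 1;
     int Count = 3;
     string Name = "sampler";
     code Snippet = [{ return 0; }];
     bits<4> Mask = 0b1010;
     list<int> Sizes = [8, 16, 32];
     Kind K = KindA;    // a field whose type is a class
   }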

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.
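
The three suffix forms can be sketched as follows. All names are
hypothetical, and the comments describe the intended reading rather than
verified tool output::

   def Source {
     int Width = 8;
   }

   def Slices {
     bits<8> Byte = 0b10110010;
     bits<4> Low = Byte{3-0};       // bit range; lexed as "3" then "-0"
     bits<4> High = Byte{7-4};
     list<int> All = [10, 20, 30, 40];
     list<int> Mid = All[1-2];      // list slice: elements 1 through 2
     int W = Source.Width;          // field access on another record
   }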

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.
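
For instance, a class can leave a field unset and let each record fill it in
(hypothetical names)::

   class Insn {
     string Mnemonic = ?;    // no value yet
   }

   def NOP : Insn {
     let Mnemonic = "nop";   // the record supplies the value
   }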

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).
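
A minimal sketch, using hypothetical names::

   def Encoding {
     bits<4> Opcode = { 1, 0, 1, 1 };   // one value per bit
   }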

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.
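
As a sketch with hypothetical names, the field ``P`` below refers to an
anonymous record created from ``Pair``::

   class Pair<int a, int b> {
     int First = a;
     int Second = b;
   }

   def Holder {
     Pair P = Pair<1, 2>;   // anonymous record derived from Pair<1, 2>
   }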

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.
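
A short sketch of both variants (hypothetical names)::

   def Lists {
     list<int> Ints = [1, 2, 3];          // element type deduced
     list<string> Names = ["a", "b"];
     list<int> Empty = []<int>;           // element type given explicitly
   }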

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.
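
A sketch of a dag value, in which every record name is hypothetical::

   def plus;      // a record used only as the dag operator
   def R0;
   def R1;

   def Pattern {
     // "plus" is the operator; each argument carries a $-name.
     dag Tree = (plus R0:$lhs, R1:$rhs);
   }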

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
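
Since this production carries no prose, here is a hedged sketch of a few
bang operators applied to template arguments. The names are hypothetical,
and the optional ``<Type>`` part is not shown::

   class Named<string prefix, int n> {
     string Name = !strconcat(prefix, "_reg");   // string concatenation
     int Doubled = !add(n, n);
     int Shifted = !shl(n, 2);
     int NonZero = !if(!eq(n, 0), 1, n);         // 1 when n is 0, else n
   }

   def X : Named<"x", 3>;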

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.
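
A sketch of both kinds of :token:`BodyItem`, with hypothetical names::

   class Insn {
     string Mnemonic = "";
     bit HasSideEffects = 0;
   }

   def STORE : Insn {
     let Mnemonic = "store";     // override an inherited field
     let HasSideEffects = 1;
     int Latency = 2;            // a Declaration adds a new field
   }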

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being
   IDParseMode == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.
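
A hedged sketch with hypothetical names. It assumes, without confirmation
from this document, that the implementation accepts the loop variable as a
bare identifier without a type, and it relies on the ``#`` paste in the
record name::

   class Register<int n> {
     int Num = n;
   }

   // Intended to produce R0, R1, R2, and R3.
   foreach i = [0, 1, 2, 3] in
     def R#i : Register<i>;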

Top-Level ``let``
-----------------

.. productionlist::
   Let:  "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
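
For example, the following sketch (hypothetical names) sets two fields on
every record in the braces::

   class Insn {
     bit IsBranch = 0;
     bit IsTerminator = 0;
   }

   let IsBranch = 1, IsTerminator = 1 in {
     def JMP : Insn;   // both fields become 1
     def JEQ : Insn;
   }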

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
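
To tie ``multiclass`` and ``defm`` together, here is a hedged sketch with
hypothetical names. Each ``def`` inside the ``multiclass`` acts as a
template, and a ``defm`` instantiates all of them at once. The sketch
assumes that the ``defm`` name is prepended to each inner ``def`` name::

   class Insn<string asm> {
     string Asm = asm;
   }

   class Commutable {
     bit IsCommutable = 1;
   }

   multiclass RegImm<string base> {
     def rr : Insn<!strconcat(base, " rr")>;
     def ri : Insn<!strconcat(base, " ri")>;
   }

   // Expected to define ADDrr and ADDri. The multiclass is listed before
   // the plain class, as required in the base class list of a defm.
   defm ADD : RegImm<"add">, Commutable;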