===========================
TableGen Language Reference
===========================

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure whether this document is really what you are looking for,
please read the :doc:`introduction to TableGen <index>` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

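As a small sketch of both comment styles (the record name ``AfterComments``
is purely illustrative), the inner ``/* ... */`` pair below does not end
the outer comment::

   // A BCPL-style comment runs to the end of the line.

   /* A C-style comment can span several lines
      /* and can contain a nested comment */
      without ending early. */
   def AfterComments;
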
The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.

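The following sketch, using a hypothetical record ``Literals``, shows each
literal form. The leading ``-`` in ``-42`` is part of the token itself::

   def Literals {
     int Dec = 42;        // DecimalInteger
     int Neg = -42;       // DecimalInteger, sign included in the token
     int Hex = 0x2A;      // HexInteger
     int Bin = 0b101010;  // BinInteger
   }
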
TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a digit. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

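For illustration, assuming hypothetical names, the first two record names
below would be identifiers even though they start with digits, while the
``16`` in the last record is a plain integer::

   def 16BitOps;      // TokIdentifier: digits followed by a letter
   def 3Operand;      // TokIdentifier: digits followed by a letter

   def Width {
     int Bits = 16;   // TokInteger
   }
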
TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

:token:`TokCodeFragment` is essentially a multiline string literal
delimited by ``[{`` and ``}]``.

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n

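A brief sketch with a hypothetical record ``Snippet`` shows both literal
kinds. The code fragment keeps its line breaks and needs no escaping of
``"``::

   def Snippet {
     string Asm = "add\t$dst, $src";
     code Body = [{
       return "lhs" + rhs;
     }];
   }
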
TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !listconcat   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

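For example, assuming a hypothetical file ``Common.td`` that the tool can
locate, the directive below behaves as if the contents of that file
appeared in its place::

   include "Common.td"
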
TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

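A minimal sketch, with hypothetical names, of a parametrized class and two
records that inherit from it. The second template argument has a default
value, so ``R0`` may omit it::

   class Reg<string n, int enc = 0> {
     string Name = n;
     int Encoding = enc;
   }

   def R0 : Reg<"r0">;
   def R1 : Reg<"r1", 1>;
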
A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

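As an illustrative sketch (all names are hypothetical), the first line
below only declares ``Operand``, which lets ``Instr`` use it as a field
type before the definition appears::

   class Operand;        // declaration: no template args, empty body

   class Instr {
     Operand Dst;        // refers to the forward-declared class
   }

   class Operand {       // definition: the body is not empty
     string Kind = "reg";
   }
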
.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It declares the identifier with the given type and, if a value is present,
assigns that value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.

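The sketch below, using hypothetical names, declares one field for each
alternative of :token:`Type`::

   class Flag {
     bit Set = 0;
   }
   def Carry : Flag;

   def TypesDemo {
     string    S = "text";
     code      C = [{ return 0; }];
     bit       B = 1;
     int       I = 7;
     dag       D = (Carry 1, 2);
     bits<4>   Nibble = 0b1010;
     list<int> L = [1, 2, 3];
     Flag      F = Carry;      // a ClassID used as a type
   }
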
Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.

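As a hedged sketch with a hypothetical record ``Enc``, the suffixes can
select bits from a ``bits<n>`` value or elements from a list. Note that
``7-4`` reaches the parser as the two integers ``7`` and ``-4``::

   def Enc {
     bits<8> Opcode = 0x5A;
     bits<4> High = Opcode{7-4};     // bits 7 down to 4
     bits<2> Pair = Opcode{1, 0};    // a RangeList with two pieces
     list<int> All = [10, 20, 30, 40];
     list<int> Mid = All[1-2];       // list slice: elements 1 and 2
   }
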
:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated as in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

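A common use, sketched here with hypothetical names, is to leave a field
unset in a class and give it a value in each record::

   class Inst {
     string AsmString = ?;
     int Opcode = ?;
   }

   def ADD : Inst {
     let AsmString = "add";
     let Opcode = 1;
   }
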
.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

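For instance, in this sketch with a hypothetical record ``Fields``, each
element inside the braces supplies one bit, and an element may itself
refer to another value::

   def Fields {
     bit Hi = 1;
     bits<4> Nibble = { Hi, 0, 1, 0 };
   }
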
.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.

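A short sketch, assuming a hypothetical class ``Pair``, of an anonymous
record used directly as a field value::

   class Pair<int a, int b> {
     int First = a;
     int Second = b;
   }

   def UsesPair {
     Pair P = Pair<1, 2>;   // an anonymous record derived from Pair
   }
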
.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type; otherwise the element type will be deduced from the
given values.

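Both forms are sketched below with a hypothetical record ``Lists``. The
explicit form names the element type, not the list type::

   def Lists {
     list<int> Deduced = [1, 2, 3];
     list<string> Explicit = ["a", "b"]<string>;
   }
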
.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`] | `TokVarName`

The initial :token:`DagArg` is called the "operator" of the dag.

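In the following sketch (the records ``ops`` and ``GPR`` are hypothetical),
``ops`` is the operator and each remaining argument is a value tagged with
a :token:`TokVarName`::

   def ops;
   def GPR;

   def Pattern {
     dag Operands = (ops GPR:$dst, GPR:$src);
   }
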
.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"

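This reference does not define the individual operators, so the sketch
below relies on their commonly understood behavior, which is an assumption
here. The class ``Named`` and the record ``N`` are hypothetical::

   class Named<string prefix, int n> {
     string Name = !strconcat(prefix, "_suffix");  // string concatenation
     int Doubled = !add(n, n);                     // integer addition
     int Shifted = !shl(n, 2);                     // shift left by 2
     string Kind = !if(!eq(n, 0), "zero", "nonzero");
   }

   def N : Named<"x", 3>;
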
Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

The "let stack" is applied after the base class list has been parsed.

.. productionlist::
   Body: ";" | "{" `BodyList` "}"
   BodyList: `BodyItem`*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

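A minimal sketch, with hypothetical names, of a body that mixes both kinds
of :token:`BodyItem`. The bit range in the second ``let`` is written in
braces, which is the form the implementation is understood to accept::

   class Inst {
     bits<8> Opcode = 0;
     string Asm = "";
   }

   def SUB : Inst {
     let Asm = "sub";      // replaces the inherited value
     let Opcode{0} = 1;    // sets only bit 0 of the inherited field
     int Latency = 2;      // a new field declared in the body
   }
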
``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being
   IDParseMode == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

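For example, in this sketch with hypothetical names, the record ``R0`` has
the field ``Size`` from its base class and the field ``Name`` from its own
body::

   class Reg {
     int Size = 32;
   }

   def R0 : Reg {
     string Name = "r0";
   }
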
``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``\es must
precede any ``class``\es that appear.

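The ordering rule can be sketched as follows, with all names hypothetical.
The ``defm`` below would be expected to produce the records ``ADDrr`` and
``ADDri``, each of which also inherits from ``Commutable``::

   class Inst<string asm> {
     string Asm = asm;
   }

   class Commutable {
     bit isCommutable = 1;
   }

   multiclass RR_RI<string asm> {
     def rr : Inst<asm>;
     def ri : Inst<asm>;
   }

   // The multiclass comes first, then the plain class.
   defm ADD : RR_RI<"add">, Commutable;
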
``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

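The sketch below follows the untyped form ``foreach i = ...`` commonly
seen in LLVM's own TableGen files, which is an assumption relative to the
production above. The ``#`` paste in the record name is likewise not
documented here. Under those assumptions it would define ``R0`` through
``R3``::

   foreach i = [0, 1, 2, 3] in
     def R#i;
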
Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` ["<" `RangeList` ">"] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.

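Both forms are sketched below with hypothetical names. Every record inside
the braces receives the two bindings, while the second ``let`` applies to
the single record that follows it::

   class Inst {
     bit isBranch = 0;
     bit isTerminator = 0;
   }

   let isBranch = 1, isTerminator = 1 in {
     def JMP : Inst;
     def BEQ : Inst;
   }

   let isTerminator = 1 in
   def RET : Inst;
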
``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`

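As a closing sketch with hypothetical names, ``Extended`` lists ``Basic``
in its :token:`BaseMultiClassList`, so the ``defm`` would be expected to
produce both ``MUL_rr`` and ``MUL_ri``::

   class Inst<string asm> {
     string Asm = asm;
   }

   multiclass Basic {
     def _rr : Inst<"rr">;
   }

   multiclass Extended : Basic {
     def _ri : Inst<"ri">;
   }

   defm MUL : Extended;
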