• Home
  • Line#
  • Scopes#
  • Navigate#
  • Raw
  • Download
1# C Type Names
2
3STG does not contain full type names for every type node in the graph. If full
4type names are needed then we need to generate them ourselves.
5
6## Implementation
7
8The STG type `Name` is the basic entity representing the human-friendly name of
9a graph node. Values of this type are computed recursively by `Describe` and are
10memoised in a `NameCache`.
11
12`Name` copes with the inside-out C type syntax in a fairly uniform fashion.
13There is some slightly special code for `Qualifier` decoration.
14
15Infinite recursion prevention in `Describe` is done in the most straightforward
16fashion possible. One consequence of this is that if there is a naming cycle,
17the memoised names for nodes in the cycle will depend on where it was entered.
18
19The rest of this document covers the concepts and design principles.
20
21## Background
22
23In sensible operator grammars, composition can be done using precedence levels.
24
25Example with binary operators (there are minor adjustments needed if operators
26have left or right associativity):
27
28| op  | precedence |
29| --- | ---------- |
30| +   | 0          |
31| *   | 1          |
32| num | 2          |
33
34```haskell
35show x = show_prec 0 x
36
37show_paren p x = if p then "(" ++ x ++ ")" else x
38
39show_prec _ (Number n) = to_string n
40show_prec prec (Mult e1 e2) = show_paren (prec > 1) (show_prec 2 e1 ++ "*" ++ show_prec 2 e2)
41show_prec prec (Add e1 e2) = show_paren (prec > 2) (show_prec 3 e1 ++ "+" ++ show_prec 3 e2)
42```
43
44The central idea is that expressions are rendered in the context of a precedence
45level. Parentheses are needed if the context precedence is higher than the
46expression's own precedence. Atomic values can be viewed as having maximal
47precedence. The default precedence context for printing an expression is the
48minimal one; no parentheses will be emitted.
49
50## The more-than-slightly-bonkers C declaration syntax
51
52C's type syntax is closely related to the inside-out declaration syntax it uses
53and has the same precedence rules. A simplified, partial precedence table for
54types might look like this.
55
56thing      | precedence
57---------- | ----------
58int        | 0
59refer *    | 1
60elt[N]     | 2
61ret(args)  | 2
62identifier | 3
63
64The basic (lowest precedence) elements are:
65
66*   primitive types, possibly CV-qualified
67*   typedefs, possibly CVR-qualified
68*   struct, union and enum types, possibly CV-qualified
69
70The "operators" in increasing precedence level order are:
71
72* pointer-to, possibly CVR-qualified
73* function (return type) and array (element type)
74
75The atomic (highest precedence) elements are:
76
77* variable names
78* function names
79
80### CVR-qualifiers
81
82The qualifiers `const`, `volatile` and `restrict` appear to the right of the
83pointer-to operator `*` and are idiomatically placed to the left of the basic
84elements. They can be considered as transparent to precedence.
85
86### User-defined types
87
88Struct, union and enum types can be named or anonymous. The normal case is a
89named type. Anonymous types are given structural descriptions.
90
91### Pointer, array and function types
92
93To declare `x` as a pointer to type `t`, we declare the dereferencing of `x` as
94`t`.
95
96```c++
97t * x;
98```
99
100To declare `x` as an array of type `t` and size `n`, we declare the `n`th
101element of `x` as `t`.
102
103```c++
104t x[n];
105```
106
107To declare `x` as a function returning type `t` and taking args `n`, we declare
108the result of applying `x` to `n` as `t`.
109
110```c++
111t x(n);
112```
113
114The context precedence level for rendering `x`, which may be a complex thing in
115its own right, is 2 in the case of arrays and functions and 1 in the case of
116pointers.
117
118We need to do things inside-out now, because the outer type is a leaf of the
119type expression tree. Instead we say `x` has a precedence and the leaf type
120wraps around it, using parentheses if `x`'s precedence is less than the leaf
121type (which typically happens if `x` is a pointer type).
122
123In each of these cases `x` has been replaced by another type that mentions `y`.
124
125```c++
126t * * y;     // y is a pointer to pointer to t
127t * y[m];    // y is an array of pointer to t
128t * y(m);    // y is a function returning pointer to t
129t (* y)[n];  // y is a pointer to an array of t
130t y[m][n];   // y is an array of arrays of t
131t y(m)[n];   // if a function could return an array, this is what it would look like
132t (* y)(n);  // y is a pointer to a function returning t
133t y[m](n);   // if an array could contain functions, this is what it would look like
134t y(m)(n);   // if a function could return a function, this is what it would look like
135```
136
137## Concise and Efficient Pretty Printer (sketch)
138
139This builds a type recursively, so that outside bits will be rendered first. The
140recursion needs to keep track of a left piece, a right piece and the precedence
141level of the hole in the middle.
142
143```haskell
144render (Basic type) = (type, 0, "")
145render (Ptr ref) = add Left 1 "*" (render ref)
146render (Function ret args) = add Right 2 ("(" ++ render_args args ++ ")") (render ret)
147render (Array elt size) = add Right 2 ("[" ++ render_size size ++ "]") (render elt)
148render (Decl name type) = add Left 3 name (render type)
149
150add side prec text (l, p, r) =
151  case side of
152    Left => (ll ++ text, prec, rr)
153    Right => (ll, prec, text ++ rr)
154  where
155    paren = prec < p
156    ll = if paren then l ++ "(" else if side == LEFT then l ++ " " else l
157    rr = if paren then ")" ++ r else r
158```
159
160To finally print something, just print the left and right pieces in sequence.
161
162The cunning bit about structuring the pretty printer this way is that `render`
163can be memoised. With the expectation that any given type will appear many times
164in typical output, this is a big win.
165
166NOTE: C type qualifiers are a significant extra complication and are omitted
167from this sketch. They must appear to the left or right of (the left part of) a
168type name. Which side can be determined by the current precendence.
169
170NOTE: Whitespace can be emitted sparingly as specific side / precedence contexts
171can imply the impossibility of inadvertently joining two words.
172