# MLIR Language Reference

MLIR (Multi-Level IR) is a compiler intermediate representation with
similarities to traditional three-address SSA representations (like
[LLVM IR](http://llvm.org/docs/LangRef.html) or
[SIL](https://github.com/apple/swift/blob/master/docs/SIL.rst)), but which
introduces notions from polyhedral loop optimization as first-class concepts.
This hybrid design is optimized to represent, analyze, and transform high level
dataflow graphs as well as target-specific code generated for high performance
data parallel systems. Beyond its representational capabilities, its single
continuous design provides a framework to lower from dataflow graphs to
high-performance target-specific code.

This document defines and describes the key concepts in MLIR, and is intended
to be a dry reference document - the [rationale
documentation](Rationale/Rationale.md),
[glossary](../getting_started/Glossary.md), and other content are hosted
elsewhere.

MLIR is designed to be used in three different forms: a human-readable textual
form suitable for debugging, an in-memory form suitable for programmatic
transformations and analysis, and a compact serialized form suitable for
storage and transport. The different forms all describe the same semantic
content. This document describes the human-readable textual form.

[TOC]

## High-Level Structure

MLIR is fundamentally based on a graph-like data structure of nodes, called
*Operations*, and edges, called *Values*. Each Value is the result of exactly
one Operation or Block Argument, and has a *Value Type* defined by the [type
system](#type-system). [Operations](#operations) are contained in
[Blocks](#blocks) and Blocks are contained in [Regions](#regions).
Operations
are also ordered within their containing block and Blocks are ordered in their
containing region, although this order may or may not be semantically
meaningful in a given [kind of region](Interfaces.md#regionkindinterfaces).
Operations may also contain regions, enabling hierarchical structures to be
represented.

Operations can represent many different concepts, from higher-level concepts
like function definitions, function calls, buffer allocations, views or slices
of buffers, and process creation, to lower-level concepts like
target-independent arithmetic, target-specific instructions, configuration
registers, and logic gates. These different concepts are represented by
different operations in MLIR and the set of operations usable in MLIR can be
arbitrarily extended.

MLIR also provides an extensible framework for transformations on operations,
using familiar concepts of compiler [Passes](Passes.md). Enabling an arbitrary
set of passes on an arbitrary set of operations results in a significant
scaling challenge, since each transformation must potentially take into
account the semantics of any operation. MLIR addresses this complexity by
allowing operation semantics to be described abstractly using
[Traits](Traits.md) and [Interfaces](Interfaces.md), enabling transformations
to operate on operations more generically. Traits often describe verification
constraints on valid IR, enabling complex invariants to be captured and
checked. (See [Op vs
Operation](docs/Tutorials/Toy/Ch-2/#op-vs-operation-using-mlir-operations).)

One obvious application of MLIR is to represent an
[SSA-based](https://en.wikipedia.org/wiki/Static_single_assignment_form) IR,
like the LLVM core IR, with appropriate choice of Operation Types to define
[Modules](#module), [Functions](#functions), Branches, Allocations, and
verification constraints to ensure the SSA Dominance property.
MLIR includes a
'standard' dialect which defines just such structures. However, MLIR is
intended to be general enough to represent other compiler-like data
structures, such as Abstract Syntax Trees in a language frontend, generated
instructions in a target-specific backend, or circuits in a High-Level
Synthesis tool.

Here's an example of an MLIR module:

```mlir
// Compute A*B using an implementation of multiply kernel and print the
// result using a TensorFlow op. The dimensions of A and B are partially
// known. The shapes are assumed to match.
func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  // Compute the inner dimension of %A using the dim operation.
  %n = dim %A, 1 : tensor<100x?xf32>

  // Allocate addressable "buffers" and copy tensors %A and %B into them.
  %A_m = alloc(%n) : memref<100x?xf32>
  tensor_store %A to %A_m : memref<100x?xf32>

  %B_m = alloc(%n) : memref<?x50xf32>
  tensor_store %B to %B_m : memref<?x50xf32>

  // Call function @multiply passing memrefs as arguments,
  // and getting returned the result of the multiplication.
  %C_m = call @multiply(%A_m, %B_m)
    : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)

  dealloc %A_m : memref<100x?xf32>
  dealloc %B_m : memref<?x50xf32>

  // Load the buffer data into a higher level "tensor" value.
  %C = tensor_load %C_m : memref<100x50xf32>
  dealloc %C_m : memref<100x50xf32>

  // Call TensorFlow built-in function to print the result tensor.
  "tf.Print"(%C){message = "mul result"}
    : (tensor<100x50xf32>) -> (tensor<100x50xf32>)

  return %C : tensor<100x50xf32>
}

// A function that multiplies two memrefs and returns the result.
func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
    -> (memref<100x50xf32>) {
  // Compute the inner dimension of %A.
  %n = dim %A, 1 : memref<100x?xf32>

  // Allocate memory for the multiplication result.
  %C = alloc() : memref<100x50xf32>

  // Multiplication loop nest.
  affine.for %i = 0 to 100 {
    affine.for %j = 0 to 50 {
      store 0 to %C[%i, %j] : memref<100x50xf32>
      affine.for %k = 0 to %n {
        %a_v = load %A[%i, %k] : memref<100x?xf32>
        %b_v = load %B[%k, %j] : memref<?x50xf32>
        %prod = mulf %a_v, %b_v : f32
        %c_v = load %C[%i, %j] : memref<100x50xf32>
        %sum = addf %c_v, %prod : f32
        store %sum, %C[%i, %j] : memref<100x50xf32>
      }
    }
  }
  return %C : memref<100x50xf32>
}
```

## Notation

MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip
through a textual form. This is important for development of the compiler -
e.g. for understanding the state of code as it is being transformed and
writing test cases.

This document describes the grammar using
[Extended Backus-Naur Form (EBNF)](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form).

This is the EBNF grammar used in this document, presented in yellow boxes.

```
alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
repetition0 ::= expr*                  // 0 or more occurrences.
repetition1 ::= expr+                  // 1 or more occurrences.
optionality ::= expr?                  // 0 or 1 occurrence.
grouping    ::= (expr)                 // Everything inside parens is grouped together.
literal     ::= `abcd`                 // Matches the literal `abcd`.
```

Code examples are presented in blue boxes.

```mlir
// This is an example use of the grammar above:
// This matches things like: ba, bana, boma, banana, banoma, bomana...
example ::= `b` (`an` | `om`)* `a`
```

### Common syntax

The following core grammar productions are used in this document:

```
// TODO: Clarify the split between lexing (tokens) and parsing (grammar).
digit     ::= [0-9]
hex_digit ::= [0-9a-fA-F]
letter    ::= [a-zA-Z]
id-punct  ::= [$._-]

integer-literal     ::= decimal-literal | hexadecimal-literal
decimal-literal     ::= digit+
hexadecimal-literal ::= `0x` hex_digit+
float-literal       ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
string-literal      ::= `"` [^"\n\f\v\r]* `"`   // TODO: define escaping rules
```

Not listed here, but MLIR does support comments. They use standard BCPL syntax,
starting with a `//` and going until the end of the line.

### Identifiers and keywords

Syntax:

```
// Identifiers
bare-id ::= (letter|[_]) (letter|digit|[_$.])*
bare-id-list ::= bare-id (`,` bare-id)*
value-id ::= `%` suffix-id
suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))

symbol-ref-id ::= `@` (suffix-id | string-literal)
value-id-list ::= value-id (`,` value-id)*

// Uses of value, e.g. in an operand list to an operation.
value-use ::= value-id
value-use-list ::= value-use (`,` value-use)*
```

Identifiers name entities such as values, types and functions, and are
chosen by the writer of MLIR code. Identifiers may be descriptive (e.g.
`%batch_size`, `@matmul`), or may be non-descriptive when they are
auto-generated (e.g. `%23`, `@func42`). Identifier names for values may be
used in an MLIR text file but are not persisted as part of the IR - the printer
will give them anonymous names like `%42`.

MLIR guarantees identifiers never collide with keywords by prefixing identifiers
with a sigil (e.g. `%`, `#`, `@`, `^`, `!`). In certain unambiguous contexts
(e.g. affine expressions), identifiers are not prefixed, for brevity. New
keywords may be added to future versions of MLIR without danger of collision
with existing identifiers.
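
The identifier productions above translate almost directly into regular
expressions. The following is a minimal Python sketch of that translation, not
MLIR's actual lexer: the helper names (`is_bare_id`, `is_value_id`,
`is_symbol_ref_id`) are hypothetical, and string escaping is ignored because
the grammar leaves it undefined.

```python
import re

# Character classes transcribed from the "Common syntax" productions.
LETTER = r"[a-zA-Z]"
DIGIT = r"[0-9]"
ID_PUNCT = r"[$._-]"

# bare-id ::= (letter|[_]) (letter|digit|[_$.])*
BARE_ID = rf"(?:{LETTER}|_)(?:{LETTER}|{DIGIT}|[_$.])*"

# suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))
SUFFIX_ID = rf"(?:{DIGIT}+|(?:{LETTER}|{ID_PUNCT})(?:{LETTER}|{ID_PUNCT}|{DIGIT})*)"

# string-literal ::= `"` [^"\n\f\v\r]* `"`  (escaping rules are not modeled)
STRING_LITERAL = r'"[^"\n\f\v\r]*"'


def is_bare_id(text: str) -> bool:
    """Check text against the bare-id production."""
    return re.fullmatch(BARE_ID, text) is not None


def is_value_id(text: str) -> bool:
    """Check text against value-id ::= `%` suffix-id."""
    return re.fullmatch(rf"%{SUFFIX_ID}", text) is not None


def is_symbol_ref_id(text: str) -> bool:
    """Check text against symbol-ref-id ::= `@` (suffix-id | string-literal)."""
    return re.fullmatch(rf"@(?:{SUFFIX_ID}|{STRING_LITERAL})", text) is not None
```

Under this sketch `%batch_size` and `%23` are value identifiers, while `%2abc`
is not: a suffix that starts with a digit must consist of digits only.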

Value identifiers are only [in scope](#value-scoping) for the (nested)
region in which they are defined and cannot be accessed or referenced
outside of that region. Argument identifiers in mapping functions are
in scope for the mapping body. Particular operations may further limit
which identifiers are in scope in their regions. For instance, the
scope of values in a region with [SSA control flow
semantics](#control-flow-and-ssacfg-regions) is constrained according
to the standard definition of [SSA
dominance](https://en.wikipedia.org/wiki/Dominator_\(graph_theory\)). Another
example is the [IsolatedFromAbove trait](Traits.md#isolatedfromabove),
which restricts directly accessing values defined in containing
regions.

Function identifiers and mapping identifiers are associated with
[Symbols](SymbolsAndSymbolTables.md) and have scoping rules dependent on
symbol attributes.

## Dialects

Dialects are the mechanism by which to engage with and extend the MLIR
ecosystem. They allow for defining new [operations](#operations), as well as
[attributes](#attributes) and [types](#type-system). Each dialect is given a
unique `namespace` that is prefixed to each defined attribute/operation/type.
For example, the [Affine dialect](Dialects/Affine.md) defines the namespace:
`affine`.

MLIR allows for multiple dialects, even those outside of the main tree, to
co-exist together within one module. Dialects are produced and consumed by
certain passes. MLIR provides a [framework](DialectConversion.md) to convert
between, and within, different dialects.
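
As a rough illustration of the namespace prefix, the Python sketch below splits
a qualified name at its first `.`. The helper `split_dialect_name` is
hypothetical and is not an MLIR API. It assumes that everything before the
first dot is the dialect namespace, and it reports no namespace for unprefixed
names such as `dim`.

```python
from typing import Optional, Tuple


def split_dialect_name(name: str) -> Tuple[Optional[str], str]:
    """Split a dialect-qualified name into (namespace, remainder).

    Assumption: the namespace is the text before the first '.'.
    Names without a '.' are returned with a namespace of None.
    """
    namespace, dot, rest = name.partition(".")
    if not dot:
        return None, name
    return namespace, rest
```

For example, `split_dialect_name("affine.for")` yields `("affine", "for")`, and
a name with several dots such as `llvm.sadd.with.overflow.i16` keeps everything
after the first dot as the remainder.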

A few of the dialects supported by MLIR:

*   [Affine dialect](Dialects/Affine.md)
*   [GPU dialect](Dialects/GPU.md)
*   [LLVM dialect](Dialects/LLVM.md)
*   [SPIR-V dialect](Dialects/SPIR-V.md)
*   [Standard dialect](Dialects/Standard.md)
*   [Vector dialect](Dialects/Vector.md)

### Target specific operations

Dialects provide a modular way in which targets can expose target-specific
operations directly through to MLIR. As an example, some targets go through
LLVM. LLVM has a rich set of intrinsics for certain target-independent
operations (e.g. addition with overflow check) as well as providing access to
target-specific operations for the targets it supports (e.g. vector
permutation operations). LLVM intrinsics in MLIR are represented via
operations that start with an "llvm." name.

Example:

```mlir
// LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
%x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
```

These operations only work when targeting LLVM as a backend (e.g. for CPUs and
GPUs), and are required to align with the LLVM definition of these intrinsics.

## Operations

Syntax:

```
operation         ::= op-result-list? (generic-operation | custom-operation)
                      trailing-location?
generic-operation ::= string-literal `(` value-use-list? `)` successor-list?
                      (`(` region-list `)`)? attribute-dict? `:` function-type
custom-operation  ::= bare-id custom-operation-format
op-result-list    ::= op-result (`,` op-result)* `=`
op-result         ::= value-id (`:` integer-literal)?
successor-list    ::= successor (`,` successor)*
successor         ::= caret-id (`:` bb-arg-list)?
region-list       ::= region (`,` region)*
trailing-location ::= (`loc` `(` location `)`)?
```

MLIR introduces a uniform concept called _operations_ to enable describing
many different levels of abstractions and computations.
Operations in MLIR are
fully extensible (there is no fixed list of operations) and have
application-specific semantics. For example, MLIR supports [target-independent
operations](Dialects/Standard.md#memory-operations), [affine
operations](Dialects/Affine.md), and [target-specific machine
operations](#target-specific-operations).

The internal representation of an operation is simple: an operation is
identified by a unique string (e.g. `dim`, `tf.Conv2d`, `x86.repmovsb`,
`ppc.eieio`, etc.), can return zero or more results, take zero or more
operands, have zero or more attributes, have zero or more successors,
and enclose zero or more [regions](#regions). The generic printing form
includes all these elements literally, with a function type to indicate the
types of the results and operands.

Example:

```mlir
// An operation that produces two results.
// The results of %result can be accessed via the <name> `#` <opNo> syntax.
%result:2 = "foo_div"() : () -> (f32, i32)

// Pretty form that defines a unique name for each result.
%foo, %bar = "foo_div"() : () -> (f32, i32)

// Invoke a TensorFlow function called tf.scramble with two inputs
// and an attribute "fruit".
%2 = "tf.scramble"(%result#0, %bar) {fruit = "banana"} : (f32, i32) -> f32
```

In addition to the basic syntax above, dialects may register known operations.
This allows those dialects to support _custom assembly form_ for parsing and
printing operations. In the operation sets listed below, we show both forms.

### Terminator Operations

These are a special category of operations that *must* terminate a block, e.g.
[branches](Dialects/Standard.md#terminator-operations). These operations may
also have a list of successors ([blocks](#blocks) and their arguments).

Example:

```mlir
// Branch to ^bb1 or ^bb2 depending on the condition %cond.
// Pass value %v to ^bb2, but not to ^bb1.
"cond_br"(%cond)[^bb1, ^bb2(%v : index)] : (i1) -> ()
```

### Module

```
module ::= `module` symbol-ref-id? (`attributes` attribute-dict)? region
```

An MLIR Module represents a top-level container operation. It contains a single
[SSACFG region](#control-flow-and-ssacfg-regions) containing a single block
which can contain any operations. Operations within this region cannot
implicitly capture values defined outside the module, i.e. Modules are
[IsolatedFromAbove](Traits.md#isolatedfromabove). Modules have an optional
[symbol name](SymbolsAndSymbolTables.md) which can be used to refer to them in
operations.

### Functions

An MLIR Function is an operation with a name containing a single [SSACFG
region](#control-flow-and-ssacfg-regions). Operations within this region
cannot implicitly capture values defined outside of the function,
i.e. Functions are [IsolatedFromAbove](Traits.md#isolatedfromabove). All
external references must use function arguments or attributes that establish a
symbolic connection (e.g. symbols referenced by name via a string attribute
like [SymbolRefAttr](#symbol-reference-attribute)):

```
function ::= `func` function-signature function-attributes? function-body?

function-signature ::= symbol-ref-id `(` argument-list `)`
                       (`->` function-result-list)?

argument-list ::= (named-argument (`,` named-argument)*) | /*empty*/
argument-list ::= (type attribute-dict? (`,` type attribute-dict?)*) | /*empty*/
named-argument ::= value-id `:` type attribute-dict?

function-result-list ::= function-result-list-parens
                       | non-function-type
function-result-list-parens ::= `(` `)`
                              | `(` function-result-list-no-parens `)`
function-result-list-no-parens ::= function-result (`,` function-result)*
function-result ::= type attribute-dict?

function-attributes ::= `attributes` attribute-dict
function-body ::= region
```

An external function declaration (used when referring to a function declared
in some other module) has no body. While the MLIR textual form provides a nice
inline syntax for function arguments, they are internally represented as
"block arguments" to the first block in the region.

Only dialect attribute names may be specified in the attribute dictionaries
for function arguments, results, or the function itself.

Examples:

```mlir
// External function declarations.
func @abort()
func @scribble(i32, i64, memref<? x 128 x f32, #layout_map0>) -> f64

// A function that returns its argument twice:
func @count(%x: i64) -> (i64, i64)
  attributes {fruit = "banana"} {
  return %x, %x: i64, i64
}

// A function with an argument attribute
func @example_fn_arg(%x: i32 {swift.self = unit})

// A function with a result attribute
func @example_fn_result() -> (f64 {dialectName.attrName = 0 : i64})

// A function with an attribute
func @example_fn_attr() attributes {dialectName.attrName = false}
```

## Blocks

Syntax:

```
block           ::= block-label operation+
block-label     ::= block-id block-arg-list? `:`
block-id        ::= caret-id
caret-id        ::= `^` suffix-id
value-id-and-type ::= value-id `:` type

// Non-empty list of names and types.
value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*

block-arg-list ::= `(` value-id-and-type-list? `)`
```

A *Block* is an ordered list of operations, concluding with a single
[terminator operation](#terminator-operations).
In [SSACFG
regions](#control-flow-and-ssacfg-regions), each block represents a compiler
[basic block](https://en.wikipedia.org/wiki/Basic_block) where instructions
inside the block are executed in order and terminator operations implement
control flow branches between basic blocks.

Blocks in MLIR take a list of block arguments, notated in a function-like
way. Block arguments are bound to values specified by the semantics of
individual operations. Block arguments of the entry block of a region are also
arguments to the region and the values bound to these arguments are determined
by the semantics of the containing operation. Block arguments of other blocks
are determined by the semantics of terminator operations, e.g. Branches, which
have the block as a successor. In regions with [control
flow](#control-flow-and-ssacfg-regions), MLIR leverages this structure to
implicitly represent the passage of control-flow dependent values without the
complex nuances of PHI nodes in traditional SSA representations. Note that
values which are not control-flow dependent can be referenced directly and do
not need to be passed through block arguments.

Here is a simple example function showing branches, returns, and block
arguments:

```mlir
func @simple(i64, i1) -> i64 {
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cond_br %cond, ^bb1, ^bb2

^bb1:
  br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  %b = addi %a, %a : i64
  br ^bb3(%b: i64)    // Branch passes %b as the argument

// ^bb3 receives an argument, named %c, from predecessors
// and passes it on to ^bb4 along with %a. %a is referenced
// directly from its defining operation and is not passed through
// an argument of ^bb3.
^bb3(%c: i64):
  br ^bb4(%c, %a : i64, i64)

^bb4(%d : i64, %e : i64):
  %0 = addi %d, %e : i64
  return %0 : i64   // Return is also a terminator.
}
```

**Context:** The "block argument" representation eliminates a number
of special cases from the IR compared to traditional "PHI nodes are
operations" SSA IRs (like LLVM). For example, the [parallel copy
semantics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.524.5461&rep=rep1&type=pdf)
of SSA is immediately apparent, and function arguments are no longer a
special case: they become arguments to the entry block [[more
rationale](Rationale/Rationale.md#block-arguments-vs-phi-nodes)]. Blocks
are also a fundamental concept that cannot be represented by
operations because values defined in an operation cannot be accessed
outside the operation.

## Regions

### Definition

A region is an ordered list of MLIR [Blocks](#blocks). The semantics within a
region is not imposed by the IR. Instead, the containing operation defines the
semantics of the regions it contains. MLIR currently defines two kinds of
regions: [SSACFG regions](#control-flow-and-ssacfg-regions), which describe
control flow between blocks, and [Graph regions](#graph-regions), which do not
require control flow between blocks. The kinds of regions within an operation
are described using the
[RegionKindInterface](Interfaces.md#regionkindinterfaces).

Regions do not have a name or an address, only the blocks contained in a
region do. Regions must be contained within operations and have no type or
attributes. The first block in the region is a special block called the 'entry
block'. The arguments to the entry block are also the arguments of the region
itself. The entry block cannot be listed as a successor of any other
block. The syntax for a region is as follows:

```
region ::= `{` block* `}`
```

A function body is an example of a region: it consists of a CFG of blocks and
has additional semantic restrictions that other types of regions may not have.
For example, in a function body, block terminators must either branch to a
different block, or return from a function where the types of the `return`
arguments must match the result types of the function signature. Similarly,
the function arguments must match the types and count of the region arguments.
In general, operations with regions can define these correspondences
arbitrarily.

### Value Scoping

Regions provide hierarchical encapsulation of programs: it is impossible to
reference, i.e. branch to, a block which is not in the same region as the
source of the reference, i.e. a terminator operation. Similarly, regions
provide a natural scoping for value visibility: values defined in a region
don't escape to the enclosing region, if any. By default, operations inside a
region can reference values defined outside of the region whenever it would
have been legal for operands of the enclosing operation to reference those
values, but this can be restricted using traits, such as
[OpTrait::IsolatedFromAbove](Traits.md#isolatedfromabove), or a custom
verifier.

Example:

```mlir
"any_op"(%a) ({ // if %a is in-scope in the containing region...
  // then %a is in-scope here too.
  %new_value = "another_op"(%a) : (i64) -> (i64)
}) : (i64) -> (i64)
```

MLIR defines a generalized 'hierarchical dominance' concept that operates
across hierarchy and defines whether a value is 'in scope' and can be used by
a particular operation. Whether a value can be used by another operation in
the same region is defined by the kind of region. A value defined in a region
can be used by an operation which has a parent in the same region, if and only
if the parent could use the value. A value defined by an argument to a region
can always be used by any operation deeply contained in the region. A value
defined in a region can never be used outside of the region.
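
The cross-region part of these rules can be sketched as a walk up the region
tree. The Python model below is a deliberate simplification and not MLIR's
verifier: the `Region` class and `may_reference` function are hypothetical, a
use within the defining region itself is always accepted (the real answer
depends on the kind of region, e.g. SSA dominance), and the only restriction
modeled is an `isolated_from_above` flag standing in for the IsolatedFromAbove
trait.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Region:
    """A region, linked to the region that holds its parent operation."""

    name: str
    parent: Optional["Region"] = None
    # Stand-in for the IsolatedFromAbove trait on the parent operation.
    isolated_from_above: bool = False


def may_reference(use_region: Region, def_region: Region) -> bool:
    """Return True if an op in use_region may see a value defined in def_region.

    Walks outward from the using region. Values never escape inward-to-outward,
    and an isolated region blocks access to anything defined outside it.
    """
    region: Optional[Region] = use_region
    while region is not None:
        if region is def_region:
            return True
        if region.isolated_from_above:
            return False
        region = region.parent
    return False
```

With a function body `f`, a nested region `n` whose parent is `f`, and an
isolated region `i` whose parent is also `f`, this model lets operations in `n`
use values from `f`, but rejects uses of `n`'s values in `f` and uses of `f`'s
values in `i`.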

### Control Flow and SSACFG Regions

In MLIR, control flow semantics of a region is indicated by
[RegionKind::SSACFG](Interfaces.md#regionkindinterfaces). Informally, these
regions support semantics where operations in a region 'execute
sequentially'. Before an operation executes, its operands have well-defined
values. After an operation executes, the operands have the same values and
results also have well-defined values. After an operation executes, the next
operation in the block executes until the operation is the terminator operation
at the end of a block, in which case some other operation will execute. The
determination of the next instruction to execute is the 'passing of control
flow'.

In general, when control flow is passed to an operation, MLIR does not
restrict when control flow enters or exits the regions contained in that
operation. However, when control flow enters a region, it always begins in the
first block of the region, called the *entry* block. Terminator operations
ending each block represent control flow by explicitly specifying the
successor blocks of the block. Control flow can only pass to one of the
specified successor blocks as in a `branch` operation, or back to the
containing operation as in a `return` operation. Terminator operations without
successors can only pass control back to the containing operation. Within
these restrictions, the particular semantics of terminator operations is
determined by the specific dialect operations involved. Blocks (other than the
entry block) that are not listed as a successor of a terminator operation are
defined to be unreachable and can be removed without affecting the semantics
of the containing operation.

Although control flow always enters a region through the entry block, control
flow may exit a region through any block with an appropriate terminator.
The
standard dialect leverages this capability to define operations with
Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different
blocks in the region and exiting through any block with a `return`
operation. This behavior is similar to that of a function body in most
programming languages. In addition, control flow may also not reach the end of
a block or region, for example if a function call does not return.

Example:

```mlir
func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cond_br %cond, ^bb1, ^bb2

^bb1:
  // This def for %value does not dominate ^bb2
  %value = "op.convert"(%a) : (i64) -> i64
  br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  accelerator.launch() { // An SSACFG region
    ^bb0:
      // Region of code nested under "accelerator.launch", it can reference %a but
      // not %value.
      %new_value = "accelerator.do_something"(%a) : (i64) -> (i64)
  }
  // %new_value cannot be referenced outside of the region

^bb3:
  ...
}
```

#### Operations with Multiple Regions

An operation containing multiple regions also completely determines the
semantics of those regions. In particular, when control flow is passed to an
operation, it may transfer control flow to any contained region. When control
flow exits a region and is returned to the containing operation, the
containing operation may pass control flow to any region in the same
operation. An operation may also pass control flow to multiple contained
regions concurrently. An operation may also pass control flow into regions
that were specified in other operations, in particular those that defined the
values or symbols the given operation uses as in a call operation.
This
passage of control is generally independent of passage of control flow through
the basic blocks of the containing region.

#### Closure

Regions allow defining an operation that creates a closure, for example by
"boxing" the body of the region into a value they produce. It remains up to the
operation to define its semantics. Note that if an operation triggers
asynchronous execution of the region, it is the responsibility of the
operation caller to wait for the region to be executed, guaranteeing that any
directly used values remain live.

### Graph Regions

In MLIR, graph-like semantics in a region is indicated by
[RegionKind::Graph](Interfaces.md#regionkindinterfaces). Graph regions are
appropriate for concurrent semantics without control flow, or for modeling
generic directed graph data structures. Graph regions are appropriate for
representing cyclic relationships between coupled values where there is no
fundamental order to the relationships. For instance, operations in a graph
region may represent independent threads of control with values representing
streams of data. As usual in MLIR, the particular semantics of a region is
completely determined by its containing operation. Graph regions may only
contain a single basic block (the entry block).

**Rationale:** Currently graph regions are arbitrarily limited to a single
basic block, although there is no particular semantic reason for this
limitation. This limitation has been added to make it easier to stabilize the
pass infrastructure and commonly used passes for processing graph regions to
properly handle feedback loops. Multi-block regions may be allowed in the
future if use cases that require it arise.

In graph regions, MLIR operations naturally represent nodes, while each MLIR
value represents a multi-edge connecting a single source node and multiple
destination nodes.
All values defined in the region as results of operations
are in scope within the region and can be accessed by any other operation in
the region. In graph regions, the order of operations within a block and the
order of blocks in a region is not semantically meaningful and non-terminator
operations may be freely reordered, for instance, by canonicalization. Other
kinds of graphs, such as graphs with multiple source nodes and multiple
destination nodes, can also be represented by representing graph edges as MLIR
operations.

Note that cycles can occur within a single block in a graph region, or between
basic blocks.

```mlir
"test.graph_region"() ({ // A Graph region
  %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
  %2 = "test.ssacfg_region"() ({
    %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region
  }) : () -> (i32)
  %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
  %4 = "op3"(%1) : (i32) -> (i32)
}) : () -> ()
```

### Arguments and Results

The arguments of the first block of a region are treated as arguments of the
region. The source of these arguments is defined by the semantics of the parent
operation. They may correspond to some of the values the operation itself uses.

Regions produce a (possibly empty) list of values. The operation semantics
defines the relation between the region results and the operation results.

## Type System

Each value in MLIR has a type defined by the type system below. There are a
number of primitive types (like integers) and also aggregate types for tensors
and memory buffers. MLIR [builtin types](#builtin-types) do not include
structures, arrays, or dictionaries.

MLIR has an open type system (i.e. there is no fixed list of types), and types
may have application-specific semantics.
For example, MLIR supports a set of
[dialect types](#dialect-types).

```
type ::= type-alias | dialect-type | builtin-type

type-list-no-parens ::=  type (`,` type)*
type-list-parens ::= `(` `)`
                   | `(` type-list-no-parens `)`

// This is a common way to refer to a value with a specified type.
ssa-use-and-type ::= ssa-use `:` type

// Non-empty list of names and types.
ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*
```

### Type Aliases

```
type-alias-def ::= '!' alias-name '=' 'type' type
type-alias ::= '!' alias-name
```

MLIR supports defining named aliases for types. A type alias is an identifier
that can be used in the place of the type that it defines. These aliases *must*
be defined before their uses. Alias names may not contain a '.', since those
names are reserved for [dialect types](#dialect-types).

Example:

```mlir
!avx_m128 = type vector<4 x f32>

// Using the original type.
"foo"(%x) : vector<4 x f32> -> ()

// Using the type alias.
"foo"(%x) : !avx_m128 -> ()
```

### Dialect Types

Similarly to operations, dialects may define custom extensions to the type
system.

```
dialect-namespace ::= bare-id

opaque-dialect-item ::= dialect-namespace '<' string-literal '>'

pretty-dialect-item ::= dialect-namespace '.' pretty-dialect-item-lead-ident
                                              pretty-dialect-item-body?

pretty-dialect-item-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*'
pretty-dialect-item-body ::= '<' pretty-dialect-item-contents+ '>'
pretty-dialect-item-contents ::= pretty-dialect-item-body
                              | '(' pretty-dialect-item-contents+ ')'
                              | '[' pretty-dialect-item-contents+ ']'
                              | '{' pretty-dialect-item-contents+ '}'
                              | '[^[<({>\])}\0]+'

dialect-type ::= '!' opaque-dialect-item
dialect-type ::= '!' pretty-dialect-item
```

Dialect types can be specified in a verbose form, e.g.
like this:

```mlir
// LLVM type that wraps around LLVM IR types.
!llvm<"i32*">

// TensorFlow string type.
!tf.string

// Complex type
!foo<"something<abcd>">

// Even more complex type
!foo<"something<a%%123^^^>>>">
```

Dialect types that are simple enough can use the pretty format, which is a
lighter weight syntax that is equivalent to the above forms:

```mlir
// TensorFlow string type.
!tf.string

// Complex type
!foo.something<abcd>
```

Sufficiently complex dialect types are required to use the verbose form for
generality. For example, the more complex type shown above wouldn't be valid in
the lighter syntax: `!foo.something<a%%123^^^>>>` because it contains characters
that are not allowed in the lighter syntax, as well as unbalanced `<>`
characters.

See [here](Tutorials/DefiningAttributesAndTypes.md) to learn how to define dialect types.

### Builtin Types

Builtin types are a core set of [dialect types](#dialect-types) that are defined
in a builtin dialect and thus available to all users of MLIR.

```
builtin-type ::= complex-type
               | float-type
               | function-type
               | index-type
               | integer-type
               | memref-type
               | none-type
               | tensor-type
               | tuple-type
               | vector-type
```

#### Complex Type

Syntax:

```
complex-type ::= `complex` `<` type `>`
```

The value of `complex` type represents a complex number with a parameterized
element type, which is composed of a real and imaginary value of that element
type. The element must be a floating point or integer scalar type.

Examples:

```mlir
complex<f32>
complex<i32>
```

#### Floating Point Types

Syntax:

```
// Floating point.
float-type ::= `f16` | `bf16` | `f32` | `f64`
```

MLIR supports float types of certain widths that are widely used as indicated
above.

#### Function Type

Syntax:

```
// MLIR functions can return multiple values.
function-result-type ::= type-list-parens
                       | non-function-type

function-type ::= type-list-parens `->` function-result-type
```

MLIR supports first-class functions: for example, the
[`constant` operation](Dialects/Standard.md#stdconstant-constantop) produces the
address of a function as a value. This value may be passed to and
returned from functions, merged across control flow boundaries with
[block arguments](#blocks), and called with the
[`call_indirect` operation](Dialects/Standard.md#call-indirect-operation).

Function types are also used to indicate the arguments and results of
[operations](#operations).

#### Index Type

Syntax:

```
// Target word-sized integer.
index-type ::= `index`
```

The `index` type is a signless integer whose size is equal to the natural
machine word of the target
([rationale](Rationale/Rationale.md#integer-signedness-semantics)) and is used
by the affine constructs in MLIR. Unlike fixed-size integers, it cannot be used
as an element of vector
([rationale](Rationale/Rationale.md#index-type-disallowed-in-vector-types)).

**Rationale:** integers of platform-specific bit widths are practical to express
sizes, dimensionalities and subscripts.

#### Integer Type

Syntax:

```
// Sized integers like i1, i4, i8, i16, i32.
signed-integer-type ::= `si` [1-9][0-9]*
unsigned-integer-type ::= `ui` [1-9][0-9]*
signless-integer-type ::= `i` [1-9][0-9]*
integer-type ::= signed-integer-type |
                 unsigned-integer-type |
                 signless-integer-type
```

MLIR supports arbitrary precision integer types. Integer types have a designated
width and may have signedness semantics.
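
As a rough illustration of the integer type grammar above, the following Python
sketch classifies a type spelling by signedness and width. The function name
`parse_integer_type` and its return value are hypothetical and not part of
MLIR; the regular expression merely mirrors the three productions.

```python
import re

# Mirrors: `si` [1-9][0-9]* | `ui` [1-9][0-9]* | `i` [1-9][0-9]*
_INTEGER_TYPE = re.compile(r"(si|ui|i)([1-9][0-9]*)")


def parse_integer_type(text):
    """Return (signedness, width), or None if `text` is not an integer type.

    Hypothetical helper for illustration only.
    """
    match = _INTEGER_TYPE.fullmatch(text)
    if match is None:
        return None
    prefix, width = match.groups()
    signedness = {"si": "signed", "ui": "unsigned", "i": "signless"}[prefix]
    return signedness, int(width)
```

Under this sketch `i32` is signless with width 32 and `si13` is signed with
width 13, while `i0`, `i08` and `index` are rejected because the width must be
a decimal number without a leading zero.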

**Rationale:** low precision integers (like `i2`, `i4` etc) are useful for
low-precision inference chips, and arbitrary precision integers are useful for
hardware synthesis (where a 13 bit multiplier is a lot cheaper/smaller than a 16
bit one).

TODO: Need to decide on a representation for quantized integers
([initial thoughts](Rationale/Rationale.md#quantized-integer-operations)).

#### Memref Type

Syntax:

```
memref-type ::= ranked-memref-type | unranked-memref-type

ranked-memref-type ::= `memref` `<` dimension-list-ranked tensor-memref-element-type
                      (`,` layout-specification)? (`,` memory-space)? `>`

unranked-memref-type ::= `memref` `<*x` tensor-memref-element-type
                         (`,` memory-space)? `>`

stride-list ::= `[` (dimension (`,` dimension)*)? `]`
strided-layout ::= `offset:` dimension `,` `strides: ` stride-list
layout-specification ::= semi-affine-map | strided-layout
memory-space ::= integer-literal /* | TODO: address-space-id */
```

A `memref` type is a reference to a region of memory (similar to a buffer
pointer, but more powerful). The buffer pointed to by a memref can be allocated,
aliased and deallocated. A memref can be used to read and write data from/to the
memory region which it references. Memref types use the same shape specifier as
tensor types. Note that `memref<f32>`, `memref<0 x f32>`, `memref<1 x 0 x f32>`,
and `memref<0 x 1 x f32>` are all different types.

A `memref` is allowed to have an unknown rank (e.g. `memref<*xf32>`). The
purpose of unranked memrefs is to allow external library functions to receive
memref arguments of any rank without versioning the functions based on the rank.
Other uses of this type are disallowed or will have undefined behavior.

##### Codegen of Unranked Memref

Using unranked memref in codegen besides the case mentioned above is highly
discouraged.
Codegen is concerned with generating loop nests and specialized
instructions for high performance, whereas an unranked memref is concerned with
hiding the rank and thus the number of enclosing loops required to iterate over
the data. However, if there is a need to code-gen unranked memref, one possible
path is to cast into a static ranked type based on the dynamic rank. Another
possible path is to emit a single while loop conditioned on a linear index and
perform delinearization of the linear index to a dynamic array containing the
(unranked) indices. While this is possible, it is not expected to be a good idea
to perform this during codegen, as the cost of the translations is expected to
be prohibitive and optimizations at this level are not expected to be
worthwhile. If expressiveness is the main concern, irrespective of performance,
passing unranked memrefs to an external C++ library and implementing
rank-agnostic logic there is expected to be significantly simpler.

Unranked memrefs may provide expressiveness gains in the future and help bridge
the gap with unranked tensors. Unranked memrefs are not expected to be
exposed to codegen, but one may query the rank of an unranked memref (a special
op will be needed for this purpose) and perform a switch and cast to a ranked
memref as a prerequisite to codegen.
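
The delinearization path described above can be sketched in Python to show the
arithmetic involved. This is only an illustration, assuming a row-major
(identity layout) shape that is known only at run time; the names `delinearize`
and `visit_all` are hypothetical and do not correspond to any MLIR operation.

```python
def delinearize(linear_index, shape):
    """Convert a row-major linear index into per-dimension indices.

    `shape` is a run-time list, so the rank is not known statically.
    """
    indices = [0] * len(shape)
    # Peel off the fastest-varying (last) dimension first.
    for dim in reversed(range(len(shape))):
        indices[dim] = linear_index % shape[dim]
        linear_index //= shape[dim]
    return indices


def visit_all(shape):
    """Visit every element with a single while loop over a linear index."""
    total = 1
    for size in shape:
        total *= size
    visited = []
    linear = 0
    while linear < total:
        visited.append(delinearize(linear, shape))
        linear += 1
    return visited
```

Each visited element costs one division and one remainder per dimension, which
gives a sense of why this translation is expected to be expensive when
performed during codegen.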

Example:

```mlir
// With static ranks, we need a function for each possible argument type
%A = alloc() : memref<16x32xf32>
%B = alloc() : memref<16x32x64xf32>
call @helper_2D(%A) : (memref<16x32xf32>)->()
call @helper_3D(%B) : (memref<16x32x64xf32>)->()

// With unknown rank, the functions can be unified under one unranked type
%A = alloc() : memref<16x32xf32>
%B = alloc() : memref<16x32x64xf32>
// Remove rank info
%A_u = memref_cast %A : memref<16x32xf32> -> memref<*xf32>
%B_u = memref_cast %B : memref<16x32x64xf32> -> memref<*xf32>
// call same function with dynamic ranks
call @helper(%A_u) : (memref<*xf32>)->()
call @helper(%B_u) : (memref<*xf32>)->()
```

The core syntax and representation of a layout specification is a
[semi-affine map](Dialects/Affine.md#semi-affine-maps). Additionally, syntactic
sugar is supported to make certain layout specifications more intuitive to read.
For the moment, a `memref` supports parsing a strided form which is converted to
a semi-affine map automatically.

The memory space of a memref is specified by a target-specific integer index. If
no memory space is specified, then the default memory space (0) is used. The
default space is target specific but always at index 0.

TODO: MLIR will eventually have target-dialects which allow symbolic use of
memory hierarchy names (e.g. L3, L2, L1, ...) but we have not spec'd the details
of that mechanism yet. Until then, this document pretends that it is valid to
refer to these memories by `bare-id`.

The notionally dynamic value of a memref value includes the address of the
buffer allocated, as well as the symbols referred to by the shape, layout map,
and index maps.

Examples of memref static type

```mlir
// Identity index/layout map
#identity = affine_map<(d0, d1) -> (d0, d1)>

// Column major layout.
#col_major = affine_map<(d0, d1, d2) -> (d2, d1, d0)>

// A 2-d tiled layout with tiles of size 128 x 256.
#tiled_2d_128x256 = affine_map<(d0, d1) -> (d0 floordiv 128, d1 floordiv 256, d0 mod 128, d1 mod 256)>

// A tiled data layout with non-constant tile sizes.
#tiled_dynamic = affine_map<(d0, d1)[s0, s1] -> (d0 floordiv s0, d1 floordiv s1,
                             d0 mod s0, d1 mod s1)>

// A layout that yields a padding of two at either end of the minor dimension.
#padded = affine_map<(d0, d1) -> (d0, (d1 + 2) floordiv 2, (d1 + 2) mod 2)>


// The dimension list "16x32" defines the following 2D index space:
//
//   { (i, j) : 0 <= i < 16, 0 <= j < 32 }
//
memref<16x32xf32, #identity>

// The dimension list "16x4x?" defines the following 3D index space:
//
//   { (i, j, k) : 0 <= i < 16, 0 <= j < 4, 0 <= k < N }
//
// where N is a symbol which represents the runtime value of the size of
// the third dimension.
//
// %N here binds to the size of the third dimension.
%A = alloc(%N) : memref<16x4x?xf32, #col_major>

// A 2-d dynamic shaped memref that also has a dynamically sized tiled layout.
// The memref index space is of size %M x %N, while %B1 and %B2 bind to the
// symbols s0, s1 respectively of the layout map #tiled_dynamic. Data tiles of
// size %B1 x %B2 in the logical space will be stored contiguously in memory.
// The allocation size will be (%M ceildiv %B1) * %B1 * (%N ceildiv %B2) * %B2
// f32 elements.
%T = alloc(%M, %N) [%B1, %B2] : memref<?x?xf32, #tiled_dynamic>

// A memref that has a two-element padding at either end. The allocation size
// will fit 16 * 64 float elements of data.
%P = alloc() : memref<16x64xf32, #padded>

// Affine map with symbol 's0' used as offset for the first dimension.
#imapS = affine_map<(d0, d1) [s0] -> (d0 + s0, d1)>
// Allocate memref and bind the following symbols:
// '%n' is bound to the dynamic second dimension of the memref type.
// '%o' is bound to the symbol 's0' in the affine map of the memref type.
%n = ...
%o = ...
%A = alloc (%n)[%o] : memref<16x?xf32, #imapS>
```

##### Index Space

A memref dimension list defines an index space within which the memref can be
indexed to access data.

##### Index

Data is accessed through a memref type using a multidimensional index into the
multidimensional index space defined by the memref's dimension list.

Examples

```mlir
// Allocates a memref with 2D index space:
//   { (i, j) : 0 <= i < 16, 0 <= j < 32 }
%A = alloc() : memref<16x32xf32, #imapA>

// Loads data from memref '%A' using a 2D index: (%i, %j)
%v = load %A[%i, %j] : memref<16x32xf32, #imapA>
```

##### Index Map

An index map is a one-to-one
[semi-affine map](Dialects/Affine.md#semi-affine-maps) that transforms a
multidimensional index from one index space to another. For example, the
following figure shows an index map which maps a 2-dimensional index from a 2x2
index space to a 3x3 index space, using symbols `S0` and `S1` as offsets.

![Index Map Example](/includes/img/index-map.svg)

The number of domain dimensions and range dimensions of an index map can be
different, but must match the number of dimensions of the input and output index
spaces on which the map operates. The index space is always non-negative and
integral. In addition, an index map must specify the size of each of its range
dimensions onto which it maps. Index map symbols must be listed in order with
symbols for dynamic dimension sizes first, followed by other required symbols.

##### Layout Map

A layout map is a [semi-affine map](Dialects/Affine.md#semi-affine-maps) which
encodes logical to physical index space mapping, by mapping input dimensions to
their ordering from most-major (slowest varying) to most-minor (fastest
varying). Therefore, an identity layout map corresponds to a row-major layout.
Identity layout maps do not contribute to the MemRef type identification and are
discarded on construction. That is, a type with an explicit identity map,
`memref<?x?xf32, (i,j)->(i,j)>`, is strictly the same as the one without layout
maps, `memref<?x?xf32>`.

Layout map examples:

```mlir
// MxN matrix stored in row major layout in memory:
#layout_map_row_major = (i, j) -> (i, j)

// MxN matrix stored in column major layout in memory:
#layout_map_col_major = (i, j) -> (j, i)

// MxN matrix stored in a 2-d blocked/tiled layout with 64x64 tiles.
#layout_tiled = (i, j) -> (i floordiv 64, j floordiv 64, i mod 64, j mod 64)
```

##### Affine Map Composition

A memref specifies a semi-affine map composition as part of its type. A
semi-affine map composition is a composition of semi-affine maps beginning with
zero or more index maps, and ending with a layout map. The composition must be
conformant: the number of dimensions of the range of one map must match the
number of dimensions of the domain of the next map in the composition.

The semi-affine map composition specified in the memref type maps from accesses
used to index the memref in load/store operations to other index spaces (i.e.
logical to physical index mapping). Each of the
[semi-affine maps](Dialects/Affine.md) and thus its composition is required to
be one-to-one.
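
To make the layout map examples and the one-to-one requirement more concrete,
the following Python sketch evaluates the three layout maps from the Layout Map
section and checks by enumeration that no two logical indices map to the same
result. All function names are hypothetical. Python's `//` and `%` are used in
place of `floordiv` and `mod`, which is adequate here because the divisor is
positive.

```python
from itertools import product


def row_major(i, j):
    return (i, j)


def col_major(i, j):
    return (j, i)


def tiled_64x64(i, j):
    # (i floordiv 64, j floordiv 64, i mod 64, j mod 64)
    return (i // 64, j // 64, i % 64, j % 64)


def is_one_to_one(layout, rows, cols):
    """Check that `layout` is injective over the rows x cols index space."""
    images = {layout(i, j) for i, j in product(range(rows), range(cols))}
    return len(images) == rows * cols
```

For instance, `tiled_64x64(70, 130)` yields `(1, 2, 6, 2)`: the element lives in
tile `(1, 2)` at position `(6, 2)` within that tile. A map such as
`(i, j) -> (i + j)` would fail the check, since distinct indices like `(0, 1)`
and `(1, 0)` collide.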

The semi-affine map composition can be used in dependence analysis, memory
access pattern analysis, and for performance optimizations like vectorization,
copy elision and in-place updates. If an affine map composition is not specified
for the memref, the identity affine map is assumed.

##### Strided MemRef

A memref may specify strides as part of its type. A stride specification is a
list of integer values that are either static or `?` (dynamic case). Strides
encode the distance, in number of elements, in (linear) memory between
successive entries along a particular dimension. A stride specification is
syntactic sugar for an equivalent strided memref representation using
semi-affine maps. For example, `memref<42x16xf32, offset: 33, strides: [1, 64]>`
specifies a non-contiguous memory region of `42` by `16` `f32` elements such
that:

1.  the minimal size of the enclosing memory region must be `33 + 42 * 1 + 16 *
    64 = 1099` elements;
2.  the address calculation for accessing element `(i, j)` computes `33 + i +
    64 * j`
3.  the distance between two consecutive elements along the outer dimension is
    `1` element and the distance between two consecutive elements along the
    inner dimension is `64` elements.

This corresponds to a column major view of the memory region and is internally
represented as the type `memref<42x16xf32, (i, j) -> (33 + i + 64 * j)>`.

The specification of strides must not alias: given an n-D strided memref,
indices `(i1, ..., in)` and `(j1, ..., jn)` may not refer to the same memory
address unless `i1 == j1, ..., in == jn`.

Strided memrefs represent a view abstraction over preallocated data. They are
constructed with special ops, yet to be introduced. Strided memrefs are a
special subclass of memrefs with generic semi-affine map and correspond to a
normalized memref descriptor when lowering to LLVM.
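
The address calculation and the no-aliasing rule for strided memrefs can be
checked with a few lines of Python. This is a sketch of the arithmetic only,
applied to the `memref<42x16xf32, offset: 33, strides: [1, 64]>` example; the
helper names are hypothetical and the enumeration is only practical for small
static shapes.

```python
from itertools import product


def strided_address(indices, offset, strides):
    """Linear element address: offset + sum(index * stride)."""
    return offset + sum(i * s for i, s in zip(indices, strides))


def all_addresses(shape, offset, strides):
    """Addresses of every element of a statically shaped strided memref."""
    return [strided_address(idx, offset, strides)
            for idx in product(*(range(n) for n in shape))]


def has_aliasing(shape, offset, strides):
    """True if two distinct indices map to the same address."""
    addresses = all_addresses(shape, offset, strides)
    return len(set(addresses)) != len(addresses)


# memref<42x16xf32, offset: 33, strides: [1, 64]>
addresses = all_addresses((42, 16), 33, (1, 64))
```

Here the smallest address is `33` (element `(0, 0)`) and the largest is `1034`
(element `(41, 15)`), so the size given in item 1 is a conservative bound. The
layout does not alias because the outer index never reaches the inner stride of
`64`; with strides `[1, 16]` instead, elements `(16, 0)` and `(0, 1)` would share
an address.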

#### None Type

Syntax:

```
none-type ::= `none`
```

The `none` type is a unit type, i.e. a type with exactly one possible value,
where its value does not have a defined dynamic representation.

#### Tensor Type

Syntax:

```
tensor-type ::= `tensor` `<` dimension-list tensor-memref-element-type `>`
tensor-memref-element-type ::= vector-element-type | vector-type | complex-type

// memref requires a known rank, but tensor does not.
dimension-list ::= dimension-list-ranked | (`*` `x`)
dimension-list-ranked ::= (dimension `x`)*
dimension ::= `?` | decimal-literal
```

Values with tensor type represent aggregate N-dimensional data values, and
have a known element type. A tensor type may have an unknown rank (indicated by
`*`) or may have a fixed rank with a list of dimensions. Each dimension may be a
static non-negative decimal constant or be dynamically determined (indicated by
`?`).

The runtime representation of the MLIR tensor type is intentionally abstracted -
you cannot control layout or get a pointer to the data. For low level buffer
access, MLIR has a [`memref` type](#memref-type). This abstracted runtime
representation holds both the tensor data values as well as information about
the (potentially dynamic) shape of the tensor. The
[`dim` operation](Dialects/Standard.md#dim-operation) returns the size of a
dimension from a value of tensor type.

Note: hexadecimal integer literals are not allowed in tensor type declarations
to avoid confusion between `0xf32` and `0 x f32`. Zero sizes are allowed in
tensors and treated as other sizes, e.g., `tensor<0 x 1 x i32>` and `tensor<1 x
0 x i32>` are different types. Since zero sizes are not allowed in some other
types, such tensors should be optimized away before lowering tensors to vectors.
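
The effect of disallowing hexadecimal literals can be illustrated with a
simplified Python sketch that splits the text inside `tensor<...>` into a shape
and an element type. The function `parse_tensor_shape` is hypothetical and
handles only the dimension-list grammar above, not the full type syntax; in
particular it strips all spaces before scanning.

```python
import re


def parse_tensor_shape(body):
    """Split the text inside `tensor<...>` into (shape, element_type).

    The shape is None for unranked tensors; otherwise it is a list whose
    entries are ints, or None for dynamic `?` dimensions.
    """
    text = body.replace(" ", "")
    if text.startswith("*x"):
        return None, text[2:]
    shape = []
    while True:
        # A dimension is `?` or a decimal literal, followed by `x`.
        match = re.match(r"(\?|[0-9]+)x", text)
        if match is None:
            break
        dim = match.group(1)
        shape.append(None if dim == "?" else int(dim))
        text = text[match.end():]
    return shape, text
```

Because only decimal literals are recognized, `0xf32` is read as a single
dimension of size `0` followed by the element type `f32`, never as a hexadecimal
number.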

Examples:

```mlir
// Tensor with unknown rank.
tensor<* x f32>

// Known rank but unknown dimensions.
tensor<? x ? x ? x ? x f32>

// Partially known dimensions.
tensor<? x ? x 13 x ? x f32>

// Full static shape.
tensor<17 x 4 x 13 x 4 x f32>

// Tensor with rank zero. Represents a scalar.
tensor<f32>

// Zero-element dimensions are allowed.
tensor<0 x 42 x f32>

// Zero-element tensor of f32 type (hexadecimal literals not allowed here).
tensor<0xf32>
```

#### Tuple Type

Syntax:

```
tuple-type ::= `tuple` `<` (type ( `,` type)*)? `>`
```

The value of `tuple` type represents a fixed-size collection of elements, where
each element may be of a different type.

**Rationale:** Though this type is first class in the type system, MLIR provides
no standard operations for operating on `tuple` types
([rationale](Rationale/Rationale.md#tuple-types)).

Examples:

```mlir
// Empty tuple.
tuple<>

// Single element
tuple<f32>

// Many elements.
tuple<i32, f32, tensor<i1>, i5>
```

#### Vector Type

Syntax:

```
vector-type ::= `vector` `<` static-dimension-list vector-element-type `>`
vector-element-type ::= float-type | integer-type

static-dimension-list ::= (decimal-literal `x`)+
```

The vector type represents a SIMD style vector, used by target-specific
operation sets like AVX. While the most common use is for 1D vectors (e.g.
`vector<16 x f32>`) we also support multidimensional registers on targets that
support them (like TPUs).

Vector shapes must be positive decimal integers.

Note: hexadecimal integer literals are not allowed in vector type declarations.
`vector<0x42xi32>` is invalid because it is interpreted as a 2D vector with
shape `(0, 42)` and zero shapes are not allowed.

## Attributes

Syntax:

```
attribute-dict ::= `{` `}`
                 | `{` attribute-entry (`,` attribute-entry)* `}`
attribute-entry ::= dialect-attribute-entry | dependent-attribute-entry
dialect-attribute-entry ::= dialect-namespace `.` bare-id `=` attribute-value
dependent-attribute-entry ::= dependent-attribute-name `=` attribute-value
dependent-attribute-name ::= ((letter|[_]) (letter|digit|[_$])*)
                           | string-literal
```

Attributes are the mechanism for specifying constant data on operations in
places where a variable is never allowed - e.g. the index of a
[`dim` operation](Dialects/Standard.md#stddim-dimop), or the stride of a
convolution. They consist of a name and a concrete attribute value. The set of
expected attributes, their structure, and their interpretation are all
contextually dependent on what they are attached to.

There are two main classes of attributes: dependent and dialect. Dependent
attributes derive their structure and meaning from what they are attached to;
e.g., the meaning of the `index` attribute on a `dim` operation is defined by
the `dim` operation. Dialect attributes, on the other hand, derive their context
and meaning from a specific dialect. An example of a dialect attribute may be a
`swift.self` function argument attribute that indicates an argument is the
self/context parameter. The context of this attribute is defined by the `swift`
dialect and not the function argument.

Attribute values are represented by the following forms:

```
attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
```

### Attribute Value Aliases

```
attribute-alias-def ::= '#' alias-name '=' attribute-value
attribute-alias ::= '#' alias-name
```

MLIR supports defining named aliases for attribute values. An attribute alias is
an identifier that can be used in the place of the attribute that it defines.
These aliases *must* be defined before their uses. Alias names may not contain a
'.', since those names are reserved for
[dialect attributes](#dialect-attribute-values).

Example:

```mlir
#map = affine_map<(d0) -> (d0 + 10)>

// Using the original attribute.
%b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)

// Using the attribute alias.
%b = affine.apply #map(%a)
```

### Dialect Attribute Values

Similarly to operations, dialects may define custom attribute values. The
syntactic structure of these values is identical to custom dialect type values,
except that dialect attribute values are distinguished with a leading '#', while
dialect types are distinguished with a leading '!'.

```
dialect-attribute-value ::= '#' opaque-dialect-item
dialect-attribute-value ::= '#' pretty-dialect-item
```

Dialect attribute values can be specified in a verbose form, e.g. like this:

```mlir
// Complex attribute value.
#foo<"something<abcd>">

// Even more complex attribute value.
#foo<"something<a%%123^^^>>>">
```

Dialect attribute values that are simple enough can use the pretty format, which
is a lighter weight syntax that is equivalent to the above forms:

```mlir
// Complex attribute
#foo.something<abcd>
```

Sufficiently complex dialect attribute values are required to use the verbose
form for generality. For example, the more complex attribute value shown above
would not be valid in the lighter syntax: `#foo.something<a%%123^^^>>>` because
it contains characters that are not allowed in the lighter syntax, as well as
unbalanced `<>` characters.

See [here](Tutorials/DefiningAttributesAndTypes.md) on how to define dialect
attribute values.

### Builtin Attribute Values

Builtin attributes are a core set of
[dialect attributes](#dialect-attribute-values) that are defined in a builtin
dialect and thus available to all users of MLIR.

```
builtin-attribute ::= affine-map-attribute
                    | array-attribute
                    | bool-attribute
                    | dictionary-attribute
                    | elements-attribute
                    | float-attribute
                    | integer-attribute
                    | integer-set-attribute
                    | string-attribute
                    | symbol-ref-attribute
                    | type-attribute
                    | unit-attribute
```

#### AffineMap Attribute

Syntax:

```
affine-map-attribute ::= `affine_map` `<` affine-map `>`
```

An affine-map attribute is an attribute that represents an affine-map object.

#### Array Attribute

Syntax:

```
array-attribute ::= `[` (attribute-value (`,` attribute-value)*)? `]`
```

An array attribute is an attribute that represents a collection of attribute
values.

#### Boolean Attribute

Syntax:

```
bool-attribute ::= bool-literal
```

A boolean attribute is a literal attribute that represents a one-bit boolean
value, true or false.

#### Dictionary Attribute

Syntax:

```
dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
```

A dictionary attribute is an attribute that represents a sorted collection of
named attribute values. The elements are sorted by name, and each name must be
unique within the collection.

#### Elements Attributes

Syntax:

```
elements-attribute ::= dense-elements-attribute
                     | opaque-elements-attribute
                     | sparse-elements-attribute
```

An elements attribute is a literal attribute that represents a constant
[vector](#vector-type) or [tensor](#tensor-type) value.

##### Dense Elements Attribute

Syntax:

```
dense-elements-attribute ::= `dense` `<` attribute-value `>` `:`
                             ( tensor-type | vector-type )
```

A dense elements attribute is an elements attribute where the storage for the
constant vector or tensor value has been densely packed. The attribute supports
storing integer or floating point elements, with integer/index/floating element
types. It also supports storing string elements with a custom dialect string
element type.

##### Opaque Elements Attribute

Syntax:

```
opaque-elements-attribute ::= `opaque` `<` dialect-namespace  `,`
                              hex-string-literal `>` `:`
                              ( tensor-type | vector-type )
```

An opaque elements attribute is an elements attribute where the content of the
value is opaque. The representation of the constant stored by this elements
attribute is only understood, and thus decodable, by the dialect that created
it.

Note: The parsed string literal must be in hexadecimal form.

##### Sparse Elements Attribute

Syntax:

```
sparse-elements-attribute ::= `sparse` `<` attribute-value `,` attribute-value
                              `>` `:` ( tensor-type | vector-type )
```

A sparse elements attribute is an elements attribute that represents a sparse
vector or tensor object, i.e. one in which very few of the elements are
non-zero.

The attribute uses COO (coordinate list) encoding to represent the sparse
elements of the elements attribute. The indices are stored via a 2-D tensor of
64-bit integer elements with shape [N, ndims], which specifies the indices of
the elements in the sparse tensor that contain non-zero values. The element
values are stored via a 1-D tensor with shape [N], that supplies the
corresponding values for the indices.

Example:

```mlir
  sparse<[[0, 0], [1, 2]], [1, 5]> : tensor<3x4xi32>

// This represents the following tensor:
//  [[1, 0, 0, 0],
//   [0, 0, 5, 0],
//   [0, 0, 0, 0]]
```

#### Float Attribute

Syntax:

```
float-attribute ::= (float-literal (`:` float-type)?)
                  | (hexadecimal-literal `:` float-type)
```

A float attribute is a literal attribute that represents a floating point value
of the specified [float type](#floating-point-types). It can be represented in
the hexadecimal form where the hexadecimal value is interpreted as bits of the
underlying binary representation. This form is useful for representing infinity
and NaN floating point values. To avoid confusion with integer attributes,
hexadecimal literals _must_ be followed by a float type to define a float
attribute.
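
The interpretation of a hexadecimal literal as the bits of the underlying
representation can be reproduced with Python's `struct` module. This sketch
assumes that `f16` and `f32` use the IEEE 754 binary16 and binary32 encodings,
which this document does not state explicitly; the helper names are
hypothetical.

```python
import math
import struct


def f16_from_bits(bits):
    """Reinterpret a 16-bit pattern as an IEEE 754 half-precision float."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def f32_from_bits(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]
```

Under that assumption, `f16_from_bits(0x7C00)` is positive infinity and
`f16_from_bits(0x7CFF)` is a NaN, since both have an all-ones exponent and only
the second has a non-zero mantissa.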

Examples:

```
42.0         // float attribute defaults to f64 type
42.0 : f32   // float attribute of f32 type
0x7C00 : f16 // positive infinity
0x7CFF : f16 // NaN (one of possible values)
42 : f32     // Error: expected integer type
```

#### Integer Attribute

Syntax:

```
integer-attribute ::= integer-literal ( `:` (index-type | integer-type) )?
```

An integer attribute is a literal attribute that represents an integral value of
the specified integer or index type. The default type for this attribute, if one
is not specified, is a 64-bit integer.

#### Integer Set Attribute

Syntax:

```
integer-set-attribute ::= `affine_set` `<` integer-set `>`
```

An integer-set attribute is an attribute that represents an integer-set object.

#### String Attribute

Syntax:

```
string-attribute ::= string-literal (`:` type)?
```

A string attribute is an attribute that represents a string literal value.

#### Symbol Reference Attribute

Syntax:

```
symbol-ref-attribute ::= symbol-ref-id (`::` symbol-ref-id)*
```

A symbol reference attribute is a literal attribute that represents a named
reference to an operation that is nested within an operation with the
`OpTrait::SymbolTable` trait. As such, this reference is given meaning by the
nearest parent operation containing the `OpTrait::SymbolTable` trait. It may
optionally contain a set of nested references that further resolve to a symbol
nested within a different symbol table.

This attribute can only be held internally by
[array attributes](#array-attribute) and
[dictionary attributes](#dictionary-attribute) (including the top-level
operation attribute dictionary), i.e. no other attribute kinds such as Locations
or extended attribute kinds.

**Rationale:** Identifying accesses to global data is critical to
enabling efficient multi-threaded compilation. Restricting global
data access to occur through symbols and limiting the places that can
legally hold a symbol reference simplifies reasoning about these data
accesses.

See [`Symbols And SymbolTables`](SymbolsAndSymbolTables.md) for more
information.

#### Type Attribute

Syntax:

```
type-attribute ::= type
```

A type attribute is an attribute that represents a [type object](#type-system).

#### Unit Attribute

```
unit-attribute ::= `unit`
```

A unit attribute is an attribute that represents a value of `unit` type. The
`unit` type allows only one value forming a singleton set. This attribute value
is used to represent attributes that only have meaning from their existence.

One example of such an attribute could be the `swift.self` attribute. This
attribute indicates that a function parameter is the self/context parameter. It
could be represented as a [boolean attribute](#boolean-attribute) (true or
false), but a value of false doesn't really bring any value. The parameter
either is the self/context or it isn't.

```mlir
// A unit attribute defined with the `unit` value specifier.
func @verbose_form(i1) attributes {dialectName.unitAttr = unit}

// A unit attribute can also be defined without the value specifier.
func @simple_form(i1) attributes {dialectName.unitAttr}
```