# Chapter 2: Emitting Basic MLIR

[TOC]

Now that we're familiar with our language and the AST, let's see how MLIR can
help to compile Toy.

## Introduction: Multi-Level Intermediate Representation

Other compilers, like LLVM (see the
[Kaleidoscope tutorial](https://llvm.org/docs/tutorial/MyFirstLanguageFrontend/index.html)),
offer a fixed set of predefined types and (usually *low-level* / RISC-like)
instructions. It is up to the frontend for a given language to perform any
language-specific type-checking, analysis, or transformation before emitting
LLVM IR. For example, Clang uses its AST to perform not only static analysis
but also transformations, such as C++ template instantiation through AST
cloning and rewriting. Finally, languages whose constructs sit at a higher
level than C/C++ may require non-trivial lowering from their AST to generate
LLVM IR.

As a consequence, multiple frontends end up reimplementing significant pieces
of infrastructure to support the need for these analyses and transformations.
MLIR addresses this issue by being designed for extensibility. As such, there
are few pre-defined instructions (*operations* in MLIR terminology) or types.

## Interfacing with MLIR

[Language reference](../../LangRef.md)

MLIR is designed to be a completely extensible infrastructure; there is no
closed set of attributes (think: constant metadata), operations, or types. MLIR
supports this extensibility with the concept of
[Dialects](../../LangRef.md#dialects). Dialects provide a grouping mechanism
for abstraction under a unique `namespace`.

In MLIR, [`Operations`](../../LangRef.md#operations) are the core unit of
abstraction and computation, similar in many ways to LLVM instructions.
Operations can have application-specific semantics and can be used to represent
all of the core IR structures in LLVM: instructions, globals (like functions),
modules, etc.
Here is the MLIR assembly for the Toy `transpose` operation:

```mlir
%t_tensor = "toy.transpose"(%tensor) {inplace = true} : (tensor<2x3xf64>) -> tensor<3x2xf64> loc("example/file/path":12:1)
```

Let's break down the anatomy of this MLIR operation:

-   `%t_tensor`

    *   The name given to the result defined by this operation (which includes
        [a prefixed sigil to avoid collisions](../../LangRef.md#identifiers-and-keywords)).
        An operation may define zero or more results (in the context of Toy, we
        will limit ourselves to single-result operations), which are SSA
        values. The name is used during parsing but is not persistent (e.g., it
        is not tracked in the in-memory representation of the SSA value).

-   `"toy.transpose"`

    *   The name of the operation. It is expected to be a unique string, with
        the namespace of the dialect prefixed before the "`.`". This can be
        read as the `transpose` operation in the `toy` dialect.

-   `(%tensor)`

    *   A list of zero or more input operands (or arguments), which are SSA
        values defined by other operations or referring to block arguments.

-   `{ inplace = true }`

    *   A dictionary of zero or more attributes, which are special operands
        that are always constant. Here we define a boolean attribute named
        `inplace` that has a constant value of `true`.

-   `(tensor<2x3xf64>) -> tensor<3x2xf64>`

    *   This refers to the type of the operation in a functional form, spelling
        the types of the arguments in parentheses and the types of the return
        values afterward.

-   `loc("example/file/path":12:1)`

    *   This is the location in the source code from which this operation
        originated.

Shown here is the general form of an operation. As described above, the set of
operations in MLIR is extensible. Operations are modeled using a small set of
concepts, enabling operations to be reasoned about and manipulated generically.
These concepts are:

-   A name for the operation.
-   A list of SSA operand values.
-   A list of [attributes](../../LangRef.md#attributes).
-   A list of [types](../../LangRef.md#type-system) for result values.
-   A [source location](../../Diagnostics.md#source-locations) for debugging
    purposes.
-   A list of successor [blocks](../../LangRef.md#blocks) (for branches,
    mostly).
-   A list of [regions](../../LangRef.md#regions) (for structural operations
    like functions).

In MLIR, every operation has a mandatory source location associated with it.
Contrary to LLVM, where debug info locations are metadata and can be dropped,
in MLIR, the location is a core requirement, and APIs depend on and manipulate
it. Dropping a location is thus an explicit choice which cannot happen by
mistake.

To provide an illustration: if a transformation replaces an operation with
another, that new operation must still have a location attached. This makes it
possible to track where that operation came from.

It's worth noting that the `mlir-opt` tool, a tool for testing compiler passes,
does not include locations in the output by default. The
`-mlir-print-debuginfo` flag tells it to include locations. (Run
`mlir-opt --help` for more options.)

### Opaque API

MLIR is designed to allow most IR elements, such as attributes, operations, and
types, to be customized. At the same time, IR elements can always be reduced to
the above fundamental concepts. This allows MLIR to parse, represent, and
[round-trip](../../../getting_started/Glossary.md#round-trip) IR for *any*
operation.
For example, we could place our Toy operation from above into an `.mlir` file
and round-trip through *mlir-opt* without registering any dialect:

```mlir
func @toy_func(%tensor: tensor<2x3xf64>) -> tensor<3x2xf64> {
  %t_tensor = "toy.transpose"(%tensor) { inplace = true } : (tensor<2x3xf64>) -> tensor<3x2xf64>
  return %t_tensor : tensor<3x2xf64>
}
```

In the case of unregistered attributes, operations, and types, MLIR will
enforce some structural constraints (SSA, block termination, etc.), but
otherwise they are completely opaque. For instance, MLIR has little information
about whether an unregistered operation can operate on particular data types,
how many operands it can take, or how many results it produces. This
flexibility can be useful for bootstrapping purposes, but it is generally
advised against in mature systems. Unregistered operations must be treated
conservatively by transformations and analyses, and they are much harder to
construct and manipulate.

This handling can be observed by crafting what should be invalid IR for Toy and
seeing it round-trip without tripping the verifier:

```mlir
func @main() {
  %0 = "toy.print"() : () -> tensor<2x3xf64>
}
```

There are multiple problems here: the `toy.print` operation is not a
terminator; it should take an operand; and it shouldn't return any values. In
the next section, we will register our dialect and operations with MLIR, plug
into the verifier, and add nicer APIs to manipulate our operations.

## Defining a Toy Dialect

To effectively interface with MLIR, we will define a new Toy dialect. This
dialect will model the structure of the Toy language, as well as provide an
easy avenue for high-level analysis and transformation.

```c++
/// This is the definition of the Toy dialect. A dialect inherits from
/// mlir::Dialect and registers custom attributes, operations, and types (in
/// its constructor). It can also override virtual methods to change some
/// general behavior, which will be demonstrated in later chapters of the
/// tutorial.
class ToyDialect : public mlir::Dialect {
public:
  explicit ToyDialect(mlir::MLIRContext *ctx);

  /// Provide a utility accessor to the dialect namespace. This is used by
  /// several utilities.
  static llvm::StringRef getDialectNamespace() { return "toy"; }
};
```

The dialect can now be registered in the global registry:

```c++
  mlir::registerDialect<ToyDialect>();
```

Any new `MLIRContext` created from now on will contain an instance of the Toy
dialect and invoke specific hooks for things like parsing attributes and types.

## Defining Toy Operations

Now that we have a `Toy` dialect, we can start registering operations. This
will allow for providing semantic information that the rest of the system can
hook into. Let's walk through the creation of the `toy.constant` operation:

```mlir
  %4 = "toy.constant"() {value = dense<1.0> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
```

This operation takes zero operands, a
[dense elements](../../LangRef.md#dense-elements-attribute) attribute named
`value`, and returns a single result of
[TensorType](../../LangRef.md#tensor-type). An operation inherits from the
[CRTP](https://en.wikipedia.org/wiki/Curiously_recurring_template_pattern)
`mlir::Op` class which also takes some optional [*traits*](../../Traits.md) to
customize its behavior. These traits may provide additional accessors,
verification, etc.

```c++
class ConstantOp : public mlir::Op<ConstantOp,
                                   /// The ConstantOp takes no inputs.
                                   mlir::OpTrait::ZeroOperands,
                                   /// The ConstantOp returns a single result.
                                   mlir::OpTrait::OneResult> {
public:
  /// Inherit the constructors from the base Op class.
  using Op::Op;

  /// Provide the unique name for this operation. MLIR will use this to
  /// register the operation and uniquely identify it throughout the system.
  static llvm::StringRef getOperationName() { return "toy.constant"; }

  /// Return the value of the constant by fetching it from the attribute.
  mlir::DenseElementsAttr getValue();

  /// Operations can provide additional verification beyond the traits they
  /// define. Here we will ensure that the specific invariants of the constant
  /// operation are upheld, for example the result type must be of TensorType.
  LogicalResult verify();

  /// Provide an interface to build this operation from a set of input values.
  /// This interface is used by the builder to allow for easily generating
  /// instances of this operation:
  ///   mlir::OpBuilder::create<ConstantOp>(...)
  /// This method populates the given `state` that MLIR uses to create
  /// operations. This state is a collection of all of the discrete elements
  /// that an operation may contain.
  /// Build a constant with the given return type and `value` attribute.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    mlir::Type result, mlir::DenseElementsAttr value);
  /// Build a constant and reuse the type from the given 'value'.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    mlir::DenseElementsAttr value);
  /// Build a constant by broadcasting the given 'value'.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    double value);
};
```

and we register this operation in the `ToyDialect` constructor:

```c++
ToyDialect::ToyDialect(mlir::MLIRContext *ctx)
    : mlir::Dialect(getDialectNamespace(), ctx) {
  addOperations<ConstantOp>();
}
```

### Op vs Operation: Using MLIR Operations

Now that we have defined an operation, we will want to access and transform it.
In MLIR, there are two main classes related to operations: `Operation` and
`Op`. The `Operation` class is used to generically model all operations. It is
'opaque', in the sense that it does not describe the properties of particular
operations or types of operations. Instead, the `Operation` class provides a
general API into an operation instance. On the other hand, each specific type
of operation is represented by an `Op` derived class. For instance,
`ConstantOp` represents an operation with zero inputs and one output, which is
always set to the same value. `Op` derived classes act as smart-pointer
wrappers around an `Operation*`, providing operation-specific accessor methods
and type-safe properties of operations. This means that when we define our Toy
operations, we are simply defining a clean, semantically useful interface for
building and interfacing with the `Operation` class. This is why our
`ConstantOp` defines no class fields; all of the data is stored in the
referenced `Operation`. A side effect of this design is that we always pass
around `Op` derived classes by value, instead of by reference or pointer
(*passing by value* is a common idiom and applies similarly to attributes,
types, etc.).
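The by-value wrapper idiom can be sketched in isolation. The following is a toy illustration only, using hypothetical stand-in classes rather than MLIR's real `Operation` and `Op` types; it shows why copying an `Op` derived class is cheap and why such classes need no fields of their own:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for mlir::Operation: the generic container that
// actually owns all of an operation's data.
struct Operation {
  std::string name;
  double attr;
};

// Hypothetical stand-in for an Op derived class: a value-semantic wrapper
// whose only member is a pointer to the underlying Operation. It adds a
// typed, operation-specific API but stores no data itself.
class ConstantOp {
public:
  explicit ConstantOp(Operation *op) : op(op) {}

  // Operation-specific accessor; the data lives in the wrapped Operation.
  double getValue() const { return op->attr; }

  // Recover the generic operation, mirroring the getOperation() idiom.
  Operation *getOperation() const { return op; }

private:
  Operation *op; // the only field, so passing by value copies one pointer
};
```

Because the wrapper holds only a pointer, copying it aliases the same underlying operation rather than duplicating any data.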
Given a generic `Operation*` instance, we can always get a specific `Op`
instance using LLVM's casting infrastructure:

```c++
void processConstantOp(mlir::Operation *operation) {
  ConstantOp op = llvm::dyn_cast<ConstantOp>(operation);

  // This operation is not an instance of `ConstantOp`.
  if (!op)
    return;

  // Get the internal operation instance wrapped by the smart pointer.
  mlir::Operation *internalOperation = op.getOperation();
  assert(internalOperation == operation &&
         "these operation instances are the same");
}
```

### Using the Operation Definition Specification (ODS) Framework

In addition to specializing the `mlir::Op` C++ template, MLIR also supports
defining operations in a declarative manner. This is achieved via the
[Operation Definition Specification](../../OpDefinitions.md) framework. Facts
regarding an operation are specified concisely in a TableGen record, which will
be expanded into an equivalent `mlir::Op` C++ template specialization at
compile time. Using the ODS framework is the preferred way to define operations
in MLIR, given its simplicity, conciseness, and general stability in the face
of C++ API changes.

Let's see how to define the ODS equivalent of our `ConstantOp`:

The first thing to do is to define a link to the Toy dialect that we defined in
C++. This is used to link all of the operations that we will define to our
dialect:

```tablegen
// Provide a definition of the 'toy' dialect in the ODS framework so that we
// can define our operations.
def Toy_Dialect : Dialect {
  // The namespace of our dialect; this corresponds 1-1 with the string we
  // provided in `ToyDialect::getDialectNamespace`.
  let name = "toy";

  // The C++ namespace that the dialect class definition resides in.
  let cppNamespace = "toy";
}
```

Now that we have defined a link to the Toy dialect, we can start defining
operations. Operations in ODS are defined by inheriting from the `Op` class. To
simplify our operation definitions, we will define a base class for operations
in the Toy dialect.

```tablegen
// Base class for toy dialect operations. This operation inherits from the base
// `Op` class in OpBase.td, and provides:
//   * The parent dialect of the operation.
//   * The mnemonic for the operation, or the name without the dialect prefix.
//   * A list of traits for the operation.
class Toy_Op<string mnemonic, list<OpTrait> traits = []> :
    Op<Toy_Dialect, mnemonic, traits>;
```

With all of the preliminary pieces defined, we can begin to define the constant
operation.

We define a toy operation by inheriting from our base `Toy_Op` class above.
Here we provide the mnemonic and a list of traits for the operation. The
[mnemonic](../../OpDefinitions.md#operation-name) here matches the one given in
`ConstantOp::getOperationName` without the dialect prefix, `toy.`. Missing here
from our C++ definition are the `ZeroOperands` and `OneResult` traits; these
will be automatically inferred based upon the `arguments` and `results` fields
we define later.

```tablegen
def ConstantOp : Toy_Op<"constant"> {
}
```

At this point you might want to know what the C++ code generated by TableGen
looks like. Simply run the `mlir-tblgen` command with the `gen-op-decls` or the
`gen-op-defs` action, like so:

```shell
${build_root}/bin/mlir-tblgen -gen-op-defs ${mlir_src_root}/examples/toy/Ch2/include/toy/Ops.td -I ${mlir_src_root}/include/
```

Depending on the selected action, this will print either the `ConstantOp` class
declaration or its implementation.
Comparing this output to the hand-crafted implementation is incredibly useful
when getting started with TableGen.

#### Defining Arguments and Results

With the shell of the operation defined, we can now provide the
[inputs](../../OpDefinitions.md#operation-arguments) and
[outputs](../../OpDefinitions.md#operation-results) to our operation. The
inputs, or arguments, to an operation may be attributes or types for SSA
operand values. The results correspond to a set of types for the values
produced by the operation:

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  // F64Tensor corresponds to a 64-bit floating-point TensorType.
  let results = (outs F64Tensor);
}
```

By providing a name to the arguments or results, e.g. `$value`, ODS will
automatically generate a matching accessor: `DenseElementsAttr
ConstantOp::value()`.

#### Adding Documentation

The next step after defining the operation is to document it. Operations may
provide
[`summary` and `description`](../../OpDefinitions.md#operation-documentation)
fields to describe the semantics of the operation. This information is useful
for users of the dialect and can even be used to auto-generate Markdown
documents.

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  // Provide a summary and description for this operation. This can be used to
  // auto-generate documentation of the operations within our dialect.
  let summary = "constant operation";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute.
    For example:

      %0 = "toy.constant"()
        { value = dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]> : tensor<2x3xf64> }
        : () -> tensor<2x3xf64>
  }];

  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  // F64Tensor corresponds to a 64-bit floating-point TensorType.
  let results = (outs F64Tensor);
}
```

#### Verifying Operation Semantics

At this point we've already covered a majority of the original C++ operation
definition. The next piece to define is the verifier. Luckily, much like the
named accessors, the ODS framework will automatically generate a lot of the
necessary verification logic based upon the constraints we have given. This
means that we don't need to verify the structure of the return type, or even
the input attribute `value`. In many cases, additional verification is not even
necessary for ODS operations. To add additional verification logic, an
operation can override the
[`verifier`](../../OpDefinitions.md#custom-verifier-code) field. The `verifier`
field allows for defining a C++ code blob that will be run as part of
`ConstantOp::verify`. This blob can assume that all of the other invariants of
the operation have already been verified:

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  // Provide a summary and description for this operation. This can be used to
  // auto-generate documentation of the operations within our dialect.
  let summary = "constant operation";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute.
    For example:

      %0 = "toy.constant"()
        { value = dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]> : tensor<2x3xf64> }
        : () -> tensor<2x3xf64>
  }];

  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);

  // The constant operation returns a single value of TensorType.
  // F64Tensor corresponds to a 64-bit floating-point TensorType.
  let results = (outs F64Tensor);

  // Add additional verification logic to the constant operation. Here we
  // invoke a static `verify` method in a C++ source file. This codeblock is
  // executed inside of ConstantOp::verify, so we can use `this` to refer to
  // the current operation instance.
  let verifier = [{ return ::verify(*this); }];
}
```

#### Attaching `build` Methods

The final missing components from our original C++ example are the `build`
methods. ODS can generate some simple build methods automatically, and in this
case it will generate our first build method for us. For the rest, we define
the [`builders`](../../OpDefinitions.md#custom-builder-methods) field. This
field takes a list of `OpBuilder` objects that take a string corresponding to a
list of C++ parameters, as well as an optional code block that can be used to
specify the implementation inline.

```tablegen
def ConstantOp : Toy_Op<"constant"> {
  ...

  // Add custom build methods for the constant operation. These methods
  // populate the `state` that MLIR uses to create operations, i.e. these are
  // used when using `builder.create<ConstantOp>(...)`.
  let builders = [
    // Build a constant with a given constant tensor value.
    OpBuilderDAG<(ins "DenseElementsAttr":$value), [{
      // Call into an autogenerated `build` method.
      build(builder, result, value.getType(), value);
    }]>,

    // Build a constant with a given constant floating-point value. This
    // builder creates a declaration for `ConstantOp::build` with the given
    // parameters.
    OpBuilderDAG<(ins "double":$value)>
  ];
}
```

#### Specifying a Custom Assembly Format

At this point we can generate our "Toy IR". For example, the following:

```toy
# User defined generic function that operates on unknown shaped arguments.
def multiply_transpose(a, b) {
  return transpose(a) * transpose(b);
}

def main() {
  var a<2, 3> = [[1, 2, 3], [4, 5, 6]];
  var b<2, 3> = [1, 2, 3, 4, 5, 6];
  var c = multiply_transpose(a, b);
  var d = multiply_transpose(b, a);
  print(d);
}
```

results in the following IR:

```mlir
module {
  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
    %0 = "toy.transpose"(%arg0) : (tensor<*xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:10)
    %1 = "toy.transpose"(%arg1) : (tensor<*xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    %2 = "toy.mul"(%0, %1) : (tensor<*xf64>, tensor<*xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    "toy.return"(%2) : (tensor<*xf64>) -> () loc("test/Examples/Toy/Ch2/codegen.toy":5:3)
  } loc("test/Examples/Toy/Ch2/codegen.toy":4:1)
  func @main() {
    %0 = "toy.constant"() {value = dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64>} : () -> tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:17)
    %1 = "toy.reshape"(%0) : (tensor<2x3xf64>) -> tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:3)
    %2 = "toy.constant"() {value = dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64>} : () -> tensor<6xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:17)
    %3 = "toy.reshape"(%2) : (tensor<6xf64>) -> tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:3)
    %4 = "toy.generic_call"(%1, %3) {callee = @multiply_transpose} : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":11:11)
    %5 = "toy.generic_call"(%3, %1) {callee = @multiply_transpose} : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":12:11)
    "toy.print"(%5) : (tensor<*xf64>) -> () loc("test/Examples/Toy/Ch2/codegen.toy":13:3)
    "toy.return"() : () -> () loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
  } loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
} loc(unknown)
```

One thing to notice here is that all of our Toy operations are printed using
the generic assembly format. This format is the one shown when breaking down
`toy.transpose` at the beginning of this chapter. MLIR allows for operations to
define their own custom assembly format, either
[declaratively](../../OpDefinitions.md#declarative-assembly-format) or
imperatively via C++. Defining a custom assembly format allows for tailoring
the generated IR into something a bit more readable by removing a lot of the
fluff that is required by the generic format. Let's walk through an example of
an operation format that we would like to simplify.

##### `toy.print`

The current form of `toy.print` is a little verbose. There are a lot of
additional characters that we would like to strip away. Let's begin by thinking
of what a good format for `toy.print` would be, and see how we can implement
it. Looking at the basics of `toy.print`, we get:

```mlir
toy.print %5 : tensor<*xf64> loc(...)
```

Here we have stripped much of the format down to the bare essentials, and it
has become much more readable.
To provide a custom assembly format, an operation can either override the
`parser` and `printer` fields for a C++ format, or the `assemblyFormat` field
for the declarative format. Let's look at the C++ variant first, as this is
what the declarative format maps to internally.

```tablegen
/// Consider a stripped definition of `toy.print` here.
def PrintOp : Toy_Op<"print"> {
  let arguments = (ins F64Tensor:$input);

  // Divert the printer and parser to static functions in our .cpp file that
  // correspond to 'print' and 'parsePrintOp'. 'printer' and 'parser' here
  // correspond to an instance of an 'OpAsmPrinter' and 'OpAsmParser'. More
  // details on these classes are shown below.
  let printer = [{ return ::print(printer, *this); }];
  let parser = [{ return ::parse$cppClass(parser, result); }];
}
```

A C++ implementation for the printer and parser is shown below:

```c++
/// The 'OpAsmPrinter' class is a stream that allows for formatting
/// strings, attributes, operands, types, etc.
static void print(mlir::OpAsmPrinter &printer, PrintOp op) {
  printer << "toy.print " << op.input();
  printer.printOptionalAttrDict(op.getAttrs());
  printer << " : " << op.input().getType();
}

/// The 'OpAsmParser' class provides a collection of methods for parsing
/// various punctuation, as well as attributes, operands, types, etc. Each of
/// these methods returns a `ParseResult`. This class is a wrapper around
/// `LogicalResult` that can be converted to a boolean `true` value on failure,
/// or `false` on success. This allows for easily chaining together a set of
/// parser rules. These rules are used to populate an `mlir::OperationState`
/// similarly to the `build` methods described above.
static mlir::ParseResult parsePrintOp(mlir::OpAsmParser &parser,
                                      mlir::OperationState &result) {
  // Parse the input operand, the attribute dictionary, and the type of the
  // input.
  mlir::OpAsmParser::OperandType inputOperand;
  mlir::Type inputType;
  if (parser.parseOperand(inputOperand) ||
      parser.parseOptionalAttrDict(result.attributes) || parser.parseColon() ||
      parser.parseType(inputType))
    return mlir::failure();

  // Resolve the input operand to the type we parsed in.
  if (parser.resolveOperand(inputOperand, inputType, result.operands))
    return mlir::failure();

  return mlir::success();
}
```

With the C++ implementation defined, let's see how this can be mapped to the
[declarative format](../../OpDefinitions.md#declarative-assembly-format). The
declarative format is largely composed of three different components:

*   Directives
    -   A type of builtin function, with an optional set of arguments.
*   Literals
    -   A keyword or punctuation surrounded by \`\`.
*   Variables
    -   An entity that has been registered on the operation itself, i.e. an
        argument (attribute or operand), result, successor, etc. In the
        `PrintOp` example above, a variable would be `$input`.

A direct mapping of our C++ format looks something like:

```tablegen
/// Consider a stripped definition of `toy.print` here.
def PrintOp : Toy_Op<"print"> {
  let arguments = (ins F64Tensor:$input);

  // In the following format we have two directives, `attr-dict` and `type`.
  // These correspond to the attribute dictionary and the type of a given
  // variable, respectively.
  let assemblyFormat = "$input attr-dict `:` type($input)";
}
```

The [declarative format](../../OpDefinitions.md#declarative-assembly-format)
has many more interesting features, so be sure to check it out before
implementing a custom format in C++.
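As a second sketch (hedged: the exact definition in the Toy sources may differ), a parenthesized transpose form such as `toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64>` could be expressed by mixing literals, variables, and directives:

```tablegen
def TransposeOp : Toy_Op<"transpose"> {
  let arguments = (ins F64Tensor:$input);
  let results = (outs F64Tensor);

  // `(`, `)`, and `to` are literals; $input is a variable; attr-dict and
  // type(...) are directives. type(results) prints the result type without
  // naming a specific result.
  let assemblyFormat = [{
    `(` $input `:` type($input) `)` attr-dict `to` type(results)
  }];
}
```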
After beautifying the format of a few of our operations, the IR is now much
more readable:

```mlir
module {
  func @multiply_transpose(%arg0: tensor<*xf64>, %arg1: tensor<*xf64>) -> tensor<*xf64> {
    %0 = toy.transpose(%arg0 : tensor<*xf64>) to tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:10)
    %1 = toy.transpose(%arg1 : tensor<*xf64>) to tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    %2 = toy.mul %0, %1 : tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:25)
    toy.return %2 : tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":5:3)
  } loc("test/Examples/Toy/Ch2/codegen.toy":4:1)
  func @main() {
    %0 = toy.constant dense<[[1.000000e+00, 2.000000e+00, 3.000000e+00], [4.000000e+00, 5.000000e+00, 6.000000e+00]]> : tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:17)
    %1 = toy.reshape(%0 : tensor<2x3xf64>) to tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":9:3)
    %2 = toy.constant dense<[1.000000e+00, 2.000000e+00, 3.000000e+00, 4.000000e+00, 5.000000e+00, 6.000000e+00]> : tensor<6xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:17)
    %3 = toy.reshape(%2 : tensor<6xf64>) to tensor<2x3xf64> loc("test/Examples/Toy/Ch2/codegen.toy":10:3)
    %4 = toy.generic_call @multiply_transpose(%1, %3) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":11:11)
    %5 = toy.generic_call @multiply_transpose(%3, %1) : (tensor<2x3xf64>, tensor<2x3xf64>) -> tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":12:11)
    toy.print %5 : tensor<*xf64> loc("test/Examples/Toy/Ch2/codegen.toy":13:3)
    toy.return loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
  } loc("test/Examples/Toy/Ch2/codegen.toy":8:1)
} loc(unknown)
```

Above we introduced several of the concepts for defining operations in the ODS
framework, but there are many more that we haven't had a chance to cover:
regions, variadic operands, etc.
Check out the [full specification](../../OpDefinitions.md) for more details.

## Complete Toy Example

We can now generate our "Toy IR". You can build `toyc-ch2` and try it yourself
on the above example: `toyc-ch2 test/Examples/Toy/Ch2/codegen.toy -emit=mlir
-mlir-print-debuginfo`. We can also check our round-trip: `toyc-ch2
test/Examples/Toy/Ch2/codegen.toy -emit=mlir -mlir-print-debuginfo 2>
codegen.mlir` followed by `toyc-ch2 codegen.mlir -emit=mlir`. You should also
use `mlir-tblgen` on the final definition file and study the generated C++
code.

At this point, MLIR knows about our Toy dialect and operations. In the
[next chapter](Ch-3.md), we will leverage our new dialect to implement some
high-level language-specific analyses and transformations for the Toy language.