MLIR — Defining a New Dialect

In the last article in the series, we migrated the passes we had written to use the tablegen code generation framework. That was a preface to using tablegen to define dialects.

In this article we’ll define a dialect that represents arithmetic on single-variable polynomials, with coefficients in $\mathbb{Z} / 2^{32} \mathbb{Z}$ (32-bit unsigned integers).

The code for this article is in this pull request, and as usual the commits are organized to be read in order.

Sketching out a design

The basic dialect will define a new polynomial type, and provide operations to define polynomials by specifying their coefficients from standard MLIR types, extract data about a polynomial to store the results in standard MLIR types, and to do arithmetic operations on polynomials.

There is quite a large surface area of design options when writing a dialect. This talk by Jeff Niu and Mehdi Amini gives some indication of how one might start to think about dialect design. But in brief, a polynomial dialect would fit into the “computation” bucket (a dialect to represent number crunching).

A slide from “MLIR Dialect Design and Composition for Front-End Compilers” (timestamped link), describing a taxonomy of dialects.

Another idea relevant to starting a dialect design is to ask how a dialect will enable easy optimizations. That is the Optimizer dialect class above. But since this tutorial series is focusing on the low-level, day to day, nitty gritty details of working with MLIR, we’re going to focus on getting some custom dialect defined, and then come back to what makes a good dialect design later.

An empty dialect

We’ll start by defining an empty dialect and just look at the tablegen-generated code, which is done in this commit. The tablegen file looks like

include "mlir/IR/DialectBase.td"

def Poly_Dialect : Dialect {
  let name = "poly";
  let summary = "A dialect for polynomial math";
  let description = [{
    The poly dialect defines types and operations for single-variable
    polynomials over integers.
  }];

  let cppNamespace = "::mlir::tutorial::poly";
}

It is almost indistinguishable from the tablegen for a Pass, except for the Dialect base class. The build rule has different flags too.

gentbl_cc_library(
    name = "dialect_inc_gen",
    tbl_outs = [
        (["-gen-dialect-decls"], "PolyDialect.h.inc"),
        (["-gen-dialect-defs"], "PolyDialect.cpp.inc"),
    ],
    tblgen = "@llvm-project//mlir:mlir-tblgen",
    td_file = "PolyDialect.td",
    deps = [
        "@llvm-project//mlir:OpBaseTdFiles",
    ],
)

Unlike the Pass codegen, here the dialect has us specify a codegen’ed header and implementation file, which are short enough to include directly in the article:

// bazel-bin/lib/Dialect/Poly/PolyDialect.h.inc
namespace mlir {
namespace tutorial {

class PolyDialect : public ::mlir::Dialect {
  explicit PolyDialect(::mlir::MLIRContext *context);

  void initialize();
  friend class ::mlir::MLIRContext;
public:
  ~PolyDialect() override;
  static constexpr ::llvm::StringLiteral getDialectNamespace() {
    return ::llvm::StringLiteral("poly");
  }
};
} // namespace tutorial
} // namespace mlir
MLIR_DECLARE_EXPLICIT_TYPE_ID(::mlir::tutorial::PolyDialect)

And the cpp:

// bazel-bin/lib/Dialect/Poly/PolyDialect.cpp.inc
MLIR_DEFINE_EXPLICIT_TYPE_ID(::mlir::tutorial::PolyDialect)
namespace mlir {
namespace tutorial {

PolyDialect::PolyDialect(::mlir::MLIRContext *context)
    : ::mlir::Dialect(getDialectNamespace(), context, ::mlir::TypeID::get<PolyDialect>()) {
  initialize();
}

PolyDialect::~PolyDialect() = default;

} // namespace tutorial
} // namespace mlir

Basically, they are empty containers that will hold the types and ops that we define next.

In this commit we register the dialect with the tutorial-opt main program, which is handled by a single API call. Now the dialect shows up in the help text of the tutorial-opt binary, though nothing else is there because we have no passes associated with it.

$ bazel run tools:tutorial-opt -- --help
Available Dialects: ..., pdl_interp, poly, quant, ...
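
For reference, the registration is roughly the following sketch (assuming the MlirOptMain-based entry point from the earlier articles; the namespace follows the cppNamespace we set in the tablegen):

// tools/tutorial-opt.cpp (sketch)
#include "lib/Dialect/Poly/PolyDialect.h"
#include "mlir/InitAllDialects.h"
#include "mlir/Tools/mlir-opt/MlirOptMain.h"

int main(int argc, char **argv) {
  mlir::DialectRegistry registry;
  mlir::registerAllDialects(registry);  // upstream dialects
  registry.insert<mlir::tutorial::poly::PolyDialect>();  // the single API call
  return mlir::asMainReturnCode(
      mlir::MlirOptMain(argc, argv, "Tutorial opt main", registry));
}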

Adding a trivial type

Next we’ll define a poly.poly type with no semantics, and in the next section we’ll focus on the semantics.

This commit defines a simple test that ensures we can parse and print our new type.

// RUN: tutorial-opt %s

module {
  func.func @main(%arg0: !poly.poly) -> !poly.poly {
    return %arg0 : !poly.poly
  }
}

Note that the exclamation mark sigil ! is a required prefix for all non-builtin MLIR types, like ours.

Next, in this commit we add the tablegen for the poly type. The tablegen looks like

include "PolyDialect.td"
include "mlir/IR/AttrTypeBase.td"

// A base class for all types in this dialect
class Poly_Type<string name, string typeMnemonic> : TypeDef<Poly_Dialect, name> {
  let mnemonic = typeMnemonic;
}

def Poly : Poly_Type<"Polynomial", "poly"> {
  let summary = "A polynomial with i32 coefficients";

  let description = [{
    A type for polynomials with integer coefficients in a single-variable polynomial ring.
  }];
}

This is effectively a boilerplate shell since nothing here is specific to polynomial arithmetic. But it shows a few new things about tablegen worth mentioning.

First and most trivially, tablegen has include statements so that you can split definitions across files. I like to put each conceptual type of thing in its own tablegen file (types, ops, attributes, etc.), but conventions differ across projects.

Second, a common pattern when defining types (and ops) is to have a base class for each dialect that all concrete type definitions inherit from. This shows first of all that there is a difference between class and def in tablegen. def defines actual types, where by “actual” I mean it generates C++ code, whereas class is only an inheritance base and disappears after tablegen is done. Think of class as allowing us to refactor out common shared code among type definitions in one file. I make a stink of this because I have mixed up class and def many times and been confused by the error messages and generated code. For example, if you change the class to a def above, you’ll see the following unhelpful error message.

lib/Dialect/Poly/PolyTypes.td:8:5: error: Expected a class name, got 'Poly_Type'
def Poly_Type<string name, string typeMnemonic> : TypeDef<Poly_Dialect, name> {

Moreover, TypeDef itself is a class that takes as template arguments the dialect the type belongs to and a name field (related to the type’s eventual C++ class name), and it results in tablegen placing the generated C++ classes in the namespace we told the dialect to use, among other things.

Third, the new field is the mnemonic declaration. This determines the name of the type in the textual representation of the IR.

The generated code again has separate .h.inc and .cpp.inc files:

// PolyTypes.h.inc
#ifdef GET_TYPEDEF_CLASSES
#undef GET_TYPEDEF_CLASSES

namespace mlir {
class AsmParser;
class AsmPrinter;
} // namespace mlir
namespace mlir {
namespace tutorial {
namespace poly {
class PolynomialType;
class PolynomialType : public ::mlir::Type::TypeBase<PolynomialType, ::mlir::Type, ::mlir::TypeStorage> {
public:
  using Base::Base;
  static constexpr ::llvm::StringLiteral getMnemonic() {
    return {"poly"};
  }

};
} // namespace poly
} // namespace tutorial
} // namespace mlir
MLIR_DECLARE_EXPLICIT_TYPE_ID(::mlir::tutorial::poly::PolynomialType)

#endif  // GET_TYPEDEF_CLASSES
// PolyTypes.cpp.inc
#ifdef GET_TYPEDEF_LIST
#undef GET_TYPEDEF_LIST

::mlir::tutorial::poly::PolynomialType

#endif  // GET_TYPEDEF_LIST

#ifdef GET_TYPEDEF_CLASSES
#undef GET_TYPEDEF_CLASSES

static ::mlir::OptionalParseResult generatedTypeParser(::mlir::AsmParser &parser, ::llvm::StringRef *mnemonic, ::mlir::Type &value) {
  return ::mlir::AsmParser::KeywordSwitch<::mlir::OptionalParseResult>(parser)
    .Case(::mlir::tutorial::poly::PolynomialType::getMnemonic(), [&](llvm::StringRef, llvm::SMLoc) {
      value = ::mlir::tutorial::poly::PolynomialType::get(parser.getContext());
      return ::mlir::success(!!value);
    })
    .Default([&](llvm::StringRef keyword, llvm::SMLoc) {
      *mnemonic = keyword;
      return std::nullopt;
    });
}

static ::mlir::LogicalResult generatedTypePrinter(::mlir::Type def, ::mlir::AsmPrinter &printer) {
  return ::llvm::TypeSwitch<::mlir::Type, ::mlir::LogicalResult>(def)
   .Case<::mlir::tutorial::poly::PolynomialType>([&](auto t) {
      printer << ::mlir::tutorial::poly::PolynomialType::getMnemonic();
      return ::mlir::success();
    })
    .Default([](auto) { return ::mlir::failure(); });
}

namespace mlir {
namespace tutorial {
namespace poly {
} // namespace poly
} // namespace tutorial
} // namespace mlir
MLIR_DEFINE_EXPLICIT_TYPE_ID(::mlir::tutorial::poly::PolynomialType)

#endif  // GET_TYPEDEF_CLASSES

The name PolynomialType is generated by adding Type to the "Polynomial" template argument we passed in the tablegen file. The name of the def itself is used to refer to the class elsewhere in tablegen files, and the two can be different.

One thing to pay attention to here is that tablegen is attempting to generate a type parser and printer for us. But it’s not usable yet—we’ll come back to this. If you build at this commit you’ll see a compiler warning like generatedTypePrinter defined but not used and a hard failure if you try running the test.

A second thing to notice is that it uses the header-guards-as-function-arguments style again, here to separate the cpp file into two include-guarded sections: one that just lists the types defined, and one that has the function implementations. The first section, GET_TYPEDEF_LIST, is curious because it contains only a comma-separated list of class names. This is because the PolyDialect.cpp from this commit is responsible for registering the types with the dialect, and that happens by using this include to supply the C++ class names of the types as template arguments in the dialect’s initialization function.

// PolyDialect.cpp
#include "lib/Dialect/Poly/PolyDialect.h"

#include "lib/Dialect/Poly/PolyTypes.h"
#include "mlir/include/mlir/IR/Builders.h"
#include "llvm/include/llvm/ADT/TypeSwitch.h"

#include "lib/Dialect/Poly/PolyDialect.cpp.inc"
#define GET_TYPEDEF_CLASSES
#include "lib/Dialect/Poly/PolyTypes.cpp.inc"

namespace mlir {
namespace tutorial {
namespace poly {

void PolyDialect::initialize() {
  addTypes<
#define GET_TYPEDEF_LIST
#include "lib/Dialect/Poly/PolyTypes.cpp.inc"
      >();
}

} // namespace poly
} // namespace tutorial
} // namespace mlir

We’ll do the same registration dance for ops, attributes, etc., later.

The expected way to set up the C++ interface to the tablegen files is:

  • Create a header file PolyTypes.h, which is the only file allowed to include PolyTypes.h.inc (a minimal sketch follows this list).
  • Include PolyTypes.cpp.inc inside PolyDialect.cpp, along with any additional #includes needed by the auto-generated implementations in PolyTypes.cpp.inc. In our case, the default parser/printer uses the type switch function from llvm/include/llvm/ADT/TypeSwitch.h.
  • If needed, add a PolyTypes.cpp with any additional implementations for functions declared by tablegen that can’t be automatically generated.
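
Here is a minimal sketch of that PolyTypes.h, assuming this include-guard name and extra header (both my guesses, not from the commit):

// lib/Dialect/Poly/PolyTypes.h (sketch)
#ifndef LIB_DIALECT_POLY_POLYTYPES_H_
#define LIB_DIALECT_POLY_POLYTYPES_H_

// The generated header needs mlir::Type and the parser/printer types in scope.
#include "mlir/IR/DialectImplementation.h"

#define GET_TYPEDEF_CLASSES
#include "lib/Dialect/Poly/PolyTypes.h.inc"

#endif  // LIB_DIALECT_POLY_POLYTYPES_H_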

I don’t think I’ve found the best arrangement of these files and build targets yet. What I really want is to have each header with a corresponding .h.inc, a corresponding .cpp file that includes the relevant .cpp.inc, and to connect them all with PolyDialect.cpp. However, PolyDialect::initialize does some introspection on the classes declared in PolyTypes.h.inc to ensure they’re valid (specifically looking for details of Storage types like is_trivially_destructible, which are generated for us in the next section), and that requires the implementations to exist in the same compilation unit (I believe). From what I’ve read of other projects, people just tend to cram things into the same cpp files, leading to multi-thousand line implementation files, which I dislike.

As a final note, the build file uses a new rule called td_files to group all the tablegen files into one build target.

The last ingredient required to get this to compile and run is to make the type parser and printer usable. Thankfully it’s one line: adding let useDefaultTypePrinterParser = 1; to the dialect tablegen (see this commit). This adds the following declarations to PolyDialect.h.inc, and modifies the generated implementations to be member functions on PolyDialect.

  /// Parse a type registered to this dialect.
  ::mlir::Type parseType(::mlir::DialectAsmParser &parser) const override;

  /// Print a type registered to this dialect.
  void printType(::mlir::Type type,
                 ::mlir::DialectAsmPrinter &os) const override;
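
The implementations it generates are roughly the following (simplified); they delegate to the static generatedTypeParser and generatedTypePrinter functions we saw earlier:

// Simplified form of the generated member functions.
::mlir::Type PolyDialect::parseType(::mlir::DialectAsmParser &parser) const {
  ::llvm::SMLoc typeLoc = parser.getCurrentLocation();
  ::llvm::StringRef mnemonic;
  ::mlir::Type genType;
  // Dispatch on the mnemonic to each registered type's parser.
  auto parseResult = generatedTypeParser(parser, &mnemonic, genType);
  if (parseResult.has_value())
    return genType;
  parser.emitError(typeLoc) << "unknown type `" << mnemonic
                            << "` in dialect `poly`";
  return {};
}

void PolyDialect::printType(::mlir::Type type,
                            ::mlir::DialectAsmPrinter &printer) const {
  // generatedTypePrinter fails only for types not from this dialect.
  if (::mlir::succeeded(generatedTypePrinter(type, printer)))
    return;
}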

Adding a poly type parameter

When designing the poly type, we have to think about what we want to express, and how the type will be lowered to types in lower level dialects. For the sake of this tutorial we’ll restrict to single-variable polynomials, so the indeterminate can be arbitrary and implicit (we don’t need to keep track of the variable name x or y or t in the IR).

The next question is the polynomial analogue of integer bit width: degree. The difficulty here is that tracking the exact degree of a polynomial is impossible in general. The degree of a sum of two polynomials depends on the coefficient values (the highest degree terms may cancel), which may not be known statically because they are read from an external source at runtime. So at best we could track an upper bound on the degree.

The simplest approach is to require a polynomial type to declare its “degree upper bound” statically, and then define “overflow” semantics just like integers have. A natural option is to implicitly treat polynomials as members of the quotient ring $R[x] / (x^D - 1)$, where $D$ is the degree upper bound and $R$ is the ring of 32-bit unsigned integers. In less mathematical terms, this means that whenever a polynomial would have a term $x^n$ with $n \geq D$, we replace $x^n$ with $x^{n \bmod D}$, which is the same thing as “declaring” $x^D = 1$ and reducing polynomials via that substitution until all terms have degree less than $D$.
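
As a concrete example, with $D = 3$ we declare $x^3 = 1$, so a product that would exceed the bound wraps around:

$$(x^2 + 1) \cdot x^2 = x^4 + x^2 = x \cdot x^3 + x^2 = x^2 + x.$$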

This simple approach has the benefit of making the dialect design easy: all binary polynomial ops have the same input and output type, and lowering a poly type requires replacing poly.poly<D> with a tensor of D coefficients. Since the shapes are static and most polynomials would have the same bound, it’s probably the most performant option. The downsides are that now the programmer has to worry about overflow semantics, and if you want polynomials of larger degree you have to insert bookkeeping operations to extend the degree bound.

On the other hand, we could allow a poly to have a growing degree. So that, e.g., a poly.mul operation with two poly.poly<7> input polynomials would have a poly.poly<14> as output. This would require more work in the dialect definition to ensure the IR is valid, and the lowering would be more complex in that you’d have to manage tensors of different sizes. It would probably also have an impact on performance, since there would be more memory allocations and copying involved (but maybe the compiler could be smart enough to avoid that).

I would like to do both so as to show how these differences materialize in lowerings and optimization passes. But for now I will start with the one that I think is easier: wrapping overflow semantics.

To make that work, we need to add an attribute to our polynomial type representing its degree upper bound. The official docs on how to do this are here.

This is where tablegen starts to be quite useful, because the changes only require two lines added to the type definition’s tablegen.

  let parameters = (ins "int":$degreeBound);
  let assemblyFormat = "`<` $degreeBound `>`";

The first line associates each instance of the type with an integer parameter (the "int" can be any string containing a literal C++ type), and a name $degreeBound that is used both for the generated C++ and to refer to it elsewhere in the tablegen file.

The second line is required because now that a poly type has this associated data, we need to be able to print it to and parse it from the textual IR representation. The simplest option is to let tablegen auto-generate the parser and printer for us, but we could instead have used the line let hasCustomAssemblyFormat = 1;, which generates declarations that it expects us to fill in with implementations. In the syntax of the assemblyFormat line, tokens in backticks are printed and parsed literally, so our type renders as, e.g., !poly.poly<10>.

After adding these two lines in this commit, the generated code gets quite a bit more complicated. Here’s the entirety of both files. Some things to note:

  • PolynomialType gets a new int getDegreeBound() method, as well as a static get(MLIRContext *, int) factory method (see the usage sketch after this list).
  • The parser and printer are upgraded to the new format.
  • There is a new class called PolynomialTypeStorage that holds the int parameter and is hidden in an inner detail namespace.
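
A sketch of how client code might use the generated API (the accessor names derive from the $degreeBound parameter name; this snippet is illustrative, not from the commit):

#include <cassert>

#include "lib/Dialect/Poly/PolyDialect.h"
#include "lib/Dialect/Poly/PolyTypes.h"
#include "mlir/IR/MLIRContext.h"

int main() {
  mlir::MLIRContext context;
  context.loadDialect<mlir::tutorial::poly::PolyDialect>();

  // Types are value objects, uniqued in the context; `get` is the factory.
  auto ty = mlir::tutorial::poly::PolynomialType::get(&context, 10);
  assert(ty.getDegreeBound() == 10);
  return 0;
}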

The storage class is autogenerated for us now because integers have simple construction/destruction semantics. If we had a more complicated argument like an array that needed allocation, we’d have to implement special classes to define those semantics. And at the most extreme end, if we had a fully custom type parameter, we’d have to implement a storage class manually, implement things like hash_code, and register it with the dialect. For the curious, I had to do this in heir/pull/74 to implement a custom Polynomial type parameter.
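
For reference, the autogenerated storage class is roughly of this shape (simplified; details vary by MLIR version, and the hashing hook is omitted):

#include <tuple>

#include "mlir/IR/TypeSupport.h"

namespace detail {
// Holds the type's parameters; MLIR uniques instances by KeyTy.
struct PolynomialTypeStorage : public ::mlir::TypeStorage {
  using KeyTy = std::tuple<int>;

  PolynomialTypeStorage(int degreeBound) : degreeBound(degreeBound) {}

  bool operator==(const KeyTy &key) const { return key == KeyTy(degreeBound); }

  static PolynomialTypeStorage *
  construct(::mlir::TypeStorageAllocator &allocator, const KeyTy &key) {
    return new (allocator.allocate<PolynomialTypeStorage>())
        PolynomialTypeStorage(std::get<0>(key));
  }

  int degreeBound;
};
} // namespace detail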

The same commit updates the syntax test to include the type parameter.

Adding some simple operations

Moving a bit faster now, we want to add some polynomial operations. This commit adds a polynomial addition op. The tablegen:

include "PolyDialect.td"
include "PolyTypes.td"

def Poly_AddOp : Op<Poly_Dialect, "add"> {
  let summary = "Addition operation between polynomials.";
  let arguments = (ins Polynomial:$lhs, Polynomial:$rhs);
  let results = (outs Polynomial:$output);
  let assemblyFormat = "$lhs `,` $rhs attr-dict `:` `(` type($lhs) `,` type($rhs) `)` `->` type($output)";
}

It looks very similar to a type definition, but the base class is Op, the arguments correspond to the operation’s inputs, and the assembly format is more complicated. We’ll improve the format in a future article; it is verbose because, without inserting some special hooks, tablegen can’t generate a parser that infers the input and output types from the textual representation. [Aside: I feel like it should be able to for the simple example above, but for whatever reason the MLIR devs appear to have made auto-generated type inference opt-in via the trait infrastructure; see the next article.]

Still, we can add a test and it passes

  // CHECK-LABEL: test_add_syntax
  func.func @test_add_syntax(%arg0: !poly.poly<10>, %arg1: !poly.poly<10>) -> !poly.poly<10> {
    // CHECK: poly.add
    %0 = poly.add %arg0, %arg1 : (!poly.poly<10>, !poly.poly<10>) -> !poly.poly<10>
    return %0 : !poly.poly<10>
  }

The generated C++ has quite a bit more going on. Here’s the full generated code. Mainly, the generated header defines AddOp, which has getters for the arguments and results, “mutable” getters for modifying the op in place, parse/print methods, and generated builder methods. It also defines an AddOpAdaptor class, which is used during lowering. The cpp file contains mostly rote implementations of these, converting from a generic internal representation to the specific types and named objects that client code would use.
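
To make that concrete, here’s a hedged sketch of client code building and inspecting an add op (the getter names follow the $lhs/$rhs/$output names in the tablegen; the builder overload shown is one of several generated ones, and the PolyOps.h header is assumed, analogous to PolyTypes.h):

#include <cassert>

#include "lib/Dialect/Poly/PolyOps.h"
#include "mlir/IR/Builders.h"

// Builds `poly.add %p, %p` and reads it back through the named getters.
mlir::Value doublePoly(mlir::OpBuilder &builder, mlir::Location loc,
                       mlir::Value p) {
  auto add = builder.create<mlir::tutorial::poly::AddOp>(
      loc, p.getType(), /*lhs=*/p, /*rhs=*/p);
  assert(add.getLhs() == p && add.getRhs() == p);
  return add.getOutput();
}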

Next, in this commit I added a sub and mul operation, with a slight refactoring to make a base tablegen class for a binary operation.

Next, in this commit I added a from_tensor operation that converts a list of coefficients to a polynomial, and an eval op that evaluates a polynomial at a given input value.

In the next few articles we’ll expand on this dialect’s capabilities. First, we’ll study what other “batteries” we can include in the dialect itself, and how we can run optimizations on poly programs. Then we’ll study the nuances of lowering poly to existing MLIR dialects, and eventually through to LLVM and then machine code.

MLIR — Using Tablegen for Passes

In the last article in this series, we defined some custom lowering passes that modified an MLIR program. Notably, we accomplished that by implementing the required interfaces of the MLIR API directly.

This is not the way that most MLIR developers work. Instead, they use a code generation tool called tablegen to generate boilerplate for them, and then only add the implementation methods that are custom to their work. In this article, we’ll convert the custom passes we wrote previously to use this infrastructure, and going forward we will use tablegen for defining new dialects as well.

The code relevant to this article is contained in this pull request, and as usual the commits are organized so that they can be read in order.

How to think about tablegen

The basic idea of tablegen is that you can define a pass—or a dialect, as we’ll do next time—in the tablegen DSL (as specialized for MLIR) and then run the mlir-tblgen binary with appropriate flags, and it will generate headers for the functions you need to implement your pass, along with default implementations of some common functions, documentation, and hooks to register the pass/dialect with the mlir-opt-like entry point tool.

It sounds nice, but I have a love-hate relationship with tablegen. I personally find it unpleasant to use, primarily because it provides poor diagnostic information when you do things wrong. Today, though, I realize that part of my frustration came from having the wrong initial mindset around tablegen. I thought, incorrectly, that tablegen was an abstraction layer. That is, I thought I could write my tablegen files, build them, and only think about the parts of the generated code that I needed to implement.

Instead, I’ve found that I still need to understand the details of the generated code, enough that I may as well read the generated code closely until it becomes familiar. This materializes in a few ways:

  • mlir-tblgen doesn’t clearly tell you what functions are left unimplemented or explain the contract of the functions you have to write. For that you have to read the docs (which are often incomplete), or often the source code of the base classes that the generated boilerplate subclasses. Or, sadly, sometimes the Discourse threads or Phabricator commit messages are the only places to find the answers to some questions.
  • The main way to determine what is missing is to try to build the generated code with some code that uses it, and then sift through hundreds of lines of C++ compiler errors, which in turn requires understanding the various template gymnastics in the generated code.
  • The generated code will make use of symbols that you have to know to import or forward-declare in the right places, and it expects you to manage the namespaces in which the generated code lives.

Once you approach tablegen as a totally see-through code generator and just sit down and learn the MLIR APIs it emits against, the result is not so bad. Indeed, that was the motivation for building the passes in the last article from “scratch” (without tablegen): it makes the transition to using tablegen more transparent. Pass generation is also a good starting point for tablegen because, by comparison, using tablegen for dialects involves many more moving parts and configurable options.

Tablegen files and the mlir-tblgen binary

We’ll start by migrating the AffineFullUnroll pass from last time. The starting point is to write some tablegen code. This commit implements it, which I’ll abbreviate below. If you want the official docs on how to do this, see Declarative Pass Specification.

include "mlir/Pass/PassBase.td"

def AffineFullUnroll : Pass<"affine-full-unroll"> {
  let summary = "Fully unroll all affine loops";
  let description = [{
    Fully unroll all affine loops. (could add more docs here like code examples)
  }];
  let dependentDialects = ["mlir::affine::AffineDialect"];
}

[Aside: In the actual commit, there are two definitions, one for the AffineFullUnroll that walks the IR, and the other for AffineFullUnrollPatternRewrite which uses the pattern rewrite engine. In the article I’ll just show the generated code for the first.]

As you can see, tablegen has concepts like classes and class inheritance (the : Pass<...> is subclassing the Pass base class defined in PassBase.td, an upstream MLIR file). But the def keyword here specifically implies we’re instantiating this thing in a way that the codegen tool should see and generate real code for (as opposed to the class keyword, which is just for tablegen templating and code reuse).

So tablegen lets you let-define string variables and lists. One thing that won’t be apparent in this article, but will be in the next, is that tablegen also lets you define variables and use them across definitions, as well as define snippets of C++ code that should be put into the generated classes (and which can use the defined variables). In the present context this is visible in PassBase.td, which defines a code constructor variable: if your pass needs a special constructor, you can write the C++ code for it there.

Next we define a build rule using gentbl_cc_library to run the mlir-tblgen binary (in the same commit) with the right options. This gentbl_cc_library bazel rule is provided by the MLIR devs, and it basically just assembles the CLI flags to mlir-tblgen and ensures the code-genned files end up in the right places on the filesystem and are compatible with the bazel dependency system. The build rule invocation looks like

gentbl_cc_library(
    name = "pass_inc_gen",
    tbl_outs = [
        (
            [
                "-gen-pass-decls",
                "-name=Affine",
            ],
            "Passes.h.inc",
        ),
    ],
    tblgen = "@llvm-project//mlir:mlir-tblgen",
    td_file = "Passes.td",
    deps = [
        "@llvm-project//mlir:OpBaseTdFiles",
        "@llvm-project//mlir:PassBaseTdFiles",
    ],
)

The important part here is that td_file specifies our input file, and tbl_outs defines the generated file, Passes.h.inc, which ends up at $GIT_ROOT/bazel-bin/lib/Transform/Affine/Passes.h.inc.

The main quirk with gentbl_cc_library is that the name of the bazel rule is not the target that actually generates the code. That is, if you run bazel build pass_inc_gen (or from the git root, bazel build lib/Transform/Affine:pass_inc_gen), it won’t create the files but the build will be successful. Instead, under the hood gentbl_cc_library is a bazel macro that generates the rule pass_inc_gen_filegroup, which is what you have to bazel build to see the actual files.

I’ve pasted the generated code (with both versions of the AffineFullUnroll) into a gist and will highlight the important parts here. The first quirky thing the generated code does is use #ifdef as a sort of function interface for what code is produced. For example, you will see:

#ifdef GEN_PASS_DECL_AFFINEFULLUNROLL
std::unique_ptr<::mlir::Pass> createAffineFullUnroll();
#undef GEN_PASS_DECL_AFFINEFULLUNROLL
#endif // GEN_PASS_DECL_AFFINEFULLUNROLL

#ifdef GEN_PASS_DEF_AFFINEFULLUNROLL
... <lots of C++ code> ...
#undef GEN_PASS_DEF_AFFINEFULLUNROLL
#endif // GEN_PASS_DEF_AFFINEFULLUNROLL

This means that to use this file, we will need to define the appropriate symbol in a #define macro before including this header. You can see it happening in this commit, but in brief it will look like this

// in file AffineFullUnroll.h

#define GEN_PASS_DECL_AFFINEFULLUNROLL
#include "lib/Transform/Affine/Passes.h.inc"

// in file AffineFullUnroll.cpp

#define GEN_PASS_DEF_AFFINEFULLUNROLL
#include "lib/Transform/Affine/Passes.h.inc"

... <implement the missing functions from the generated code> ...

I’m no C++ expert, and this was the first time I’d seen this pattern of using #include as a function with #define as the argument. It was a little unsettling to me, until I landed on the mindset that it’s meant to be a white-box code generator, not an abstraction. So read the generated code. Inside the GEN_PASS_DECL_... guard, it declares a single function, std::unique_ptr<::mlir::Pass> createAffineFullUnroll();, which is a deliberately limited sole entry point for code that wants to use the pass. We don’t need to implement it unless our pass has a custom constructor. Then in the GEN_PASS_DEF_... guard it defines a base class, whose functions I’ll summarize; you should recognize many of them, because we implemented them by hand last time.

template <typename DerivedT>
class AffineFullUnrollBase : public ::mlir::OperationPass<> {
  AffineFullUnrollBase() : ::mlir::OperationPass<>(::mlir::TypeID::get<DerivedT>()) {}
  AffineFullUnrollBase(const AffineFullUnrollBase &other) : ::mlir::OperationPass<>(other) {}

  static ::llvm::StringLiteral getArgumentName() {...}
  static ::llvm::StringRef getArgument() { ... }
  static ::llvm::StringRef getDescription() { ... }
  static ::llvm::StringLiteral getPassName() { ... }
  static ::llvm::StringRef getName() { ... }

  /// Support isa/dyn_cast functionality for the derived pass class.
  static bool classof(const ::mlir::Pass *pass) { ... }

  /// A clone method to create a copy of this pass.
  std::unique_ptr<::mlir::Pass> clonePass() const override { ... }

  /// Return the dialect that must be loaded in the context before this pass.
  void getDependentDialects(::mlir::DialectRegistry &registry) const override {
    registry.insert<mlir::affine::AffineDialect>();
  }

  ... <type_id stuff> ...
}

Notably, this doesn’t tell us what functions are left for us to implement. For that we have to either build it and read the compiler error messages, or compare it to the base class (OperationPass) and its base class (Pass) to see that the only function left to implement is runOnOperation(). Or, since we did this last time with the raw API, we can observe that the boilerplate functions we implemented before, like getArgument, are here, but runOnOperation is not.

Another notable aspect of the generated code is that it uses the curiously recurring template pattern (CRTP), so that the base class can know the eventual name of its subclass, and use that name to hook the concrete subclass into the rest of the framework.
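
If the pattern is unfamiliar, here it is in miniature (a generic illustration, not MLIR code):

#include <iostream>
#include <string>

// The base class is templated on its own subclass, so it can name the derived
// type statically (MLIR uses this for things like TypeID::get<DerivedT>() and
// clonePass) without virtual dispatch.
template <typename DerivedT>
struct PassBase {
  static std::string getName() { return DerivedT::passName(); }
};

struct MyPass : PassBase<MyPass> {
  static std::string passName() { return "my-pass"; }
};

int main() {
  std::cout << MyPass::getName() << "\n";  // prints "my-pass"
  return 0;
}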

Lower in the generated file you’ll see another #define-guarded block for GEN_PASS_REGISTRATION, which implements hooks for tutorial-opt to register the passes without having to depend on each internal Pass class directly.

#ifdef GEN_PASS_REGISTRATION

inline void registerAffineFullUnroll() {
  ::mlir::registerPass([]() -> std::unique_ptr<::mlir::Pass> {
    return createAffineFullUnroll();
  });
}

inline void registerAffineFullUnrollPatternRewrite() {
  ::mlir::registerPass([]() -> std::unique_ptr<::mlir::Pass> {
    return createAffineFullUnrollPatternRewrite();
  });
}

inline void registerAffinePasses() {
  registerAffineFullUnroll();
  registerAffineFullUnrollPatternRewrite();
}
#undef GEN_PASS_REGISTRATION
#endif // GEN_PASS_REGISTRATION

This implies that, once we link everything properly, the changes to tutorial-opt (in this commit) simplify to calling registerAffinePasses. This registration macro is intended to go into a Passes.h file that includes all the individual pass header files, as done in this commit. And we can use that header file as an anchor for a bazel build target that includes all the passes defined in lib/Transform/Affine at once.

#include "lib/Transform/Affine/AffineFullUnroll.h"
#include "lib/Transform/Affine/AffineFullUnrollPatternRewrite.h"

namespace mlir {
namespace tutorial {

#define GEN_PASS_REGISTRATION
#include "lib/Transform/Affine/Passes.h.inc"

}  // namespace tutorial
}  // namespace mlir
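
With that header in place, the tutorial-opt entry point reduces to roughly the following (a sketch based on the linked commits; the tool name string is illustrative):

#include "lib/Transform/Affine/Passes.h"
#include "mlir/InitAllDialects.h"
#include "mlir/Tools/mlir-opt/MlirOptMain.h"

int main(int argc, char **argv) {
  mlir::DialectRegistry registry;
  mlir::registerAllDialects(registry);
  mlir::tutorial::registerAffinePasses();  // from GEN_PASS_REGISTRATION
  return mlir::asMainReturnCode(
      mlir::MlirOptMain(argc, argv, "Tutorial opt main", registry));
}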

Finally, after all this (abbreviated from this commit), the actual content of the pass reduces to the following subclass (with CRTP) and implementation of runOnOperation, the body of which is identical to the last article except for a change from reference to pointer for the return value of getOperation.

#define GEN_PASS_DEF_AFFINEFULLUNROLL
#include "lib/Transform/Affine/Passes.h.inc"

struct AffineFullUnroll : impl::AffineFullUnrollBase<AffineFullUnroll> {
  using AffineFullUnrollBase::AffineFullUnrollBase;

  void runOnOperation() {
    getOperation()->walk([&](AffineForOp op) {
      if (failed(loopUnrollFull(op))) {
        op.emitError("unrolling failed");
        signalPassFailure();
      }
    });
  }
};

I split the AffineFullUnroll migration into multiple commits to highlight the tablegen vs C++ code changes. For MulToAdd, I did it all in one commit. The tests are unchanged, because the entry point is still the tutorial-opt binary with the appropriate CLI flags, and those names are unchanged in the tablegen’ed code.

Bonus: mlir-tblgen also has an option, -gen-pass-doc (you’ll see it in the commits), that generates a markdown file of auto-generated documentation for the pass. A CI workflow can copy these to a website directory, as we do in HEIR, and you get free docs. See this example from HEIR.

Addendum: hermetic Python

When I first set up this tutorial project, I didn’t realize that bazel’s Python rules use the system Python by default. Some early readers hit an error where Python couldn’t find the lit module when running tests. While pip install lit in your system Python would work, I also migrated in this commit to a hermetic Python runtime and an explicit dependency on lit. It should all be handled automatically by bazel now.