In this article I’ll show how to use PDLL, a tool for defining MLIR patterns, which itself is built with MLIR. PDLL is intended to be a replacement for defining patterns in tablegen, though there are few public examples of its use. In fact, the main impetus for PDLL is that tablegen makes it difficult to express things like:
- Operations that return multiple results
- Operations with regions
- Operations with variadic operands
- Arithmetic on static values
While not all these features are fully supported in PDLL yet, they are within scope of the language and tooling.
The code for this article is in this pull request, and as usual the commits are organized to be read in order.
Interlude: updating the LLVM commit hash
Before getting into it, there were some recent features added to PDLL since the last article in this tutorial. So this commit updates the LLVM commit hash, and this commit makes minor changes to fix build failures and warnings.
The only surprising change is that when a tablegen-defined operation implements an interface, rather than just adding DeclareOpInterfaceMethods, you now have to explicitly list the subset of interface methods that will be implemented, or their headers are never generated. This is surprising because the way I had done this before was to add the DeclareOpInterfaceMethods trait and let the linker complain, which told me which methods to add and their expected signatures.
PDL and PDLL
PDL stands for Pattern Descriptor Language. This is not to be confused with PDLL, which humorously stands for Pattern Descriptor Language Language; it is the syntactically-rich frontend users interact with, while PDL is just an MLIR dialect.
As a dialect, pdl models core MLIR abstractions like operations, operands, and results. pdl also has operations that represent transformations like replace and erase. A user of PDLL may ignore pdl and work with PDLL syntax directly. The mlir-pdll tool parses PDLL files and generates pdl IR, which can then be lowered to C++ code that can be compiled into a pass. This is the path we’ll take in this article. The last section of the article will show the internal process in a bit more detail.
Since PDLL defines patterns, it replaces the implementation of a C++ RewritePattern. In code, you define structure-like syntax forms via the language keyword Pattern, within which there is a “match” section and a “rewrite” section.
Pattern {
// match section
let root = ...
// ... apply constraints ...
rewrite root with {
//rewrite section
}
}
Within the match section you can express IR matches similarly to tablegen/DRR-style patterns (see MLIR — Canonicalizers and Declarative Rewrite Patterns), doing things like writing an operation name as a function, and providing named arguments that implicitly capture references to the operands, with nesting.
For a concrete example, whereas DRR expressed these captures like
// From lib/Dialect/Poly/PolyPatterns.td
def DifferenceOfSquares : Pattern<
(Poly_SubOp (Poly_MulOp:$lhs $x, $x), (Poly_MulOp:$rhs $y, $y)),
...
>;
In PDLL you might write this as
Pattern DifferenceOfSquares {
let root = op<poly.sub>(
lhs: op<poly.mul>(x: Value, x: Value),
rhs: op<poly.mul>(y: Value, y: Value));
...
}
PDLL also supports defining constraints. For example, to enforce that an SSA value is defined as an arith.constant of a specific value and type, you can write:
let ten: attr<"10 : i32">;
let constant = op<arith.constant> {value = ten};
The argument to attr<...> is a string that is parsed as an attribute by MLIR, so it can be any valid MLIR attribute’s textual representation. The curly braces after op<arith.constant> show how one captures static operation attributes (as opposed to operation operands, which are SSA values), and the name value in this case is the tablegen-given name of the attribute in the upstream ArithOps.td definition file. Also note that PDLL supports importing tablegen definitions, without which the above would not compile.
#include "mlir/Dialect/Arith/IR/ArithOps.td"
Pattern {
let ten: attr<"10 : i32">;
let constant = op<arith.constant> {value = ten};
...
}
The constraints section of the PDLL docs has more details about what can be constrained.
Finally, the rewrite section is where you define what to do with the matched IR. Here you use the same syntax as the match section, but the interpretation is that the ops are being created instead of constrained. Below, we replace an arith.constant 10 : i32 with an arith.constant 11 : i32, for no discernible reason.
Pattern ReplaceTenWithEleven {
let ten: attr<"10 : i32">;
let constant = op<arith.constant> {value = ten};
rewrite constant with {
let newConst = op<arith.constant> {value = attr<"11 : i32">};
replace constant with newConst;
};
}
The allowed rewrite statements are erase and replace. The expression after with in a replace may be another operation, or some value or tuple of values (ValueRange) matched on previously.
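As a quick illustration of replacing with a previously matched value, here is a hypothetical pattern (not from the pull request) that rewrites a multiplication by the constant 1 to just its other operand:
#include "mlir/Dialect/Arith/IR/ArithOps.td"

Pattern ReplaceMulByOne {
  let one: attr<"1 : i32">;
  let root = op<arith.muli>(x: Value, op<arith.constant> {value = one});

  rewrite root with {
    // Replace the multiplication with the already-matched value x.
    replace root with x;
  };
}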
Finally, one can define constraints in their own separate functions to facilitate reuse:
Constraint MyConstraint(value: Value) {
...
}
The main use I have for this is actually the main thing that makes PDLL usable while it’s still being built: the ability to escape back to C++ for anything it doesn’t support.
// Expects the user to register a function via the C++ API.
Constraint MyConstraint(value: Value);
For some simple such functions, you can inject the C++ directly in the PDLL file with the same [{ }] syntax as tablegen uses to define a multiline string.
Constraint HasOneUse(value: Value) [{
return success(value.hasOneUse());
}];
For this and the previous (register) example, there is an explicit mapping from PDLL types (like Value) to C++ types (like ::mlir::Value). So hasOneUse is just a member function of ::mlir::Value, and success is the LogicalResult function used throughout this tutorial.
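The same mapping applies to the other PDLL types; for example, a constraint over an Op receives an ::mlir::Operation * in the generated C++. A hypothetical example (not from the pull request):
Constraint HasNoResults(op: Op) [{
  // In the generated C++, `op` has type ::mlir::Operation *.
  return success(op->getNumResults() == 0);
}];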
Porting MulToAdd to PDLL
With that fuzzy outline of what PDLL can do, let’s move on to the main event: porting the MulToAdd patterns (from MLIR — Writing Our First Pass). Recall the pass converts arithmetic multiplication by constants to repeated addition, using two patterns:
- PowerOfTwoExpand: splits a multiplication $x \cdot 2^n$ into $x \cdot 2^{n-1} + x \cdot 2^{n-1}$.
- PeelFromMul: splits other multiplications $x \cdot C$ into $x + x \cdot (C-1)$.
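Concretely, PowerOfTwoExpand turns $x \cdot 8$ into $x \cdot 4 + x \cdot 4$, then each $x \cdot 4$ into $x \cdot 2 + x \cdot 2$, and so on, while PeelFromMul turns $x \cdot 7$ into $x + x \cdot 6$ and keeps peeling until the remaining constant is a power of two.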
We start by cloning the test file from mul_to_add.mlir to use a new pass --mul-to-add-pdll in this commit. The test is otherwise identical. The next commit adds the pass boilerplate and the PDLL for PowerOfTwoExpand.
First, the parts of the boilerplate that are specific to PDLL.
We define a new pass MulToAddPdll, and we must add the PDL and PDLInterp dialects as dependent dialects. We will explain why these are required when discussing the internals of PDL later.
include "mlir/Dialect/PDL/IR/PDLDialect.td"
include "mlir/Dialect/PDLInterp/IR/PDLInterpOps.td"
def MulToAddPdll : Pass<"mul-to-add-pdll"> {
let summary = "Convert multiplications to repeated additions using pdll";
let description = [{
Convert multiplications to repeated additions (using pdll).
}];
let dependentDialects = [
"mlir::pdl::PDLDialect",
"mlir::pdl_interp::PDLInterpDialect",
];
}
The bazel build rule used to process the pdll code is a bit curious: it is gentbl_cc_library, the same bazel rule as for processing tablegen files!
gentbl_cc_library(
name = "MulToAddPdllIncGen",
tbl_outs = [
(
["-x=cpp"],
"MulToAddPdll.h.inc",
),
],
tblgen = "@llvm-project//mlir:mlir-pdll",
td_file = "MulToAdd.pdll",
deps = [
"@llvm-project//mlir:ArithDialect",
"@llvm-project//mlir:FuncDialect",
"@llvm-project//mlir:ArithOpsTdFiles",
],
)
What changes here is that the “tblgen” argument, which specifies the binary to run, is now mlir-pdll instead of the default mlir-tblgen. The td_file argument doesn’t actually force anything to be tablegen, so it is given a pdll file. And the tbl_outs argument specifies which flags to pass to the “tblgen” binary; in this case we need to pass -x=cpp to generate C++ code. We’ll see what this C++ code looks like in the last section of this article.
Finally, we have the actual PDLL and the C++ harness to load the patterns.
#include "mlir/Dialect/Arith/IR/ArithOps.td"
Constraint IsPowerOfTwo(attr: Attr) [{
int64_t value = cast<::mlir::IntegerAttr>(attr).getValue().getSExtValue();
return success((value & (value - 1)) == 0);
}];
// Currently, constraints that return values must be defined in C++
Constraint Halve(attr: Attr) -> Attr;
Pattern PowerOfTwoExpand with benefit(2) {
let root = op<arith.muli>(op<arith.constant> {value = const: Attr}, rhs: Value);
IsPowerOfTwo(const);
let halved: Attr = Halve(const);
rewrite root with {
let newConst = op<arith.constant> {value = halved};
let newMul = op<arith.muli>(newConst, rhs);
let newAdd = op<arith.addi>(newMul, newMul);
replace root with newAdd;
};
}
Before discussing it, here is the C++ version of this pattern for comparison:
struct PowerOfTwoExpand : public OpRewritePattern<MulIOp> {
PowerOfTwoExpand(mlir::MLIRContext *context)
: OpRewritePattern<MulIOp>(context, /*benefit=*/2) {}
LogicalResult matchAndRewrite(MulIOp op,
PatternRewriter &rewriter) const override {
Value lhs = op.getOperand(0);
// canonicalization patterns ensure the constant is on the right, if there
// is a constant. See
// https://mlir.llvm.org/docs/Canonicalization/#globally-applied-rules
Value rhs = op.getOperand(1);
auto rhsDefiningOp = rhs.getDefiningOp<arith::ConstantIntOp>();
if (!rhsDefiningOp) {
return failure();
}
int64_t value = rhsDefiningOp.value();
bool is_power_of_two = (value & (value - 1)) == 0;
if (!is_power_of_two) {
return failure();
}
ConstantOp newConstant = rewriter.create<ConstantOp>(
rhsDefiningOp.getLoc(),
rewriter.getIntegerAttr(rhs.getType(), value / 2));
MulIOp newMul = rewriter.create<MulIOp>(op.getLoc(), lhs, newConstant);
AddIOp newAdd = rewriter.create<AddIOp>(op.getLoc(), newMul, newMul);
rewriter.replaceOp(op, newAdd);
return success();
}
};
The PDLL is relatively straightforward, except for this bit:
// Currently, constraints that return values must be defined in C++
Constraint Halve(attr: Attr) -> Attr;
Pattern PowerOfTwoExpand with benefit(2) {
...
let halved: Attr = Halve(const);
...
}
This is not only a constraint that is defined in C++, but a constraint that actually returns a result that can be used in the IR. This is a recently added feature, but it’s the only way I am aware of (without shelling out the entire rewrite section to C++) to do arithmetic on a statically-known numeric value in PDLL.
The C++ implementation of the pass otherwise has the following usual boilerplate:
struct MulToAddPdll : impl::MulToAddPdllBase<MulToAddPdll> {
using MulToAddPdllBase::MulToAddPdllBase;
void runOnOperation() {
mlir::RewritePatternSet patterns(&getContext());
populateGeneratedPDLLPatterns(patterns); // Auto-generated by mlir-pdll
registerNativeConstraints(patterns); // Not boilerplate
(void)applyPatternsAndFoldGreedily(getOperation(), std::move(patterns));
}
};
But we must also implement registerNativeConstraints, which maps PDLL-defined constraints (by name) to appropriate C++ functions. For Halve, that is:
LogicalResult halveImpl(PatternRewriter &rewriter, PDLResultList &results,
ArrayRef<PDLValue> args) {
Attribute attr = args[0].cast<Attribute>();
IntegerAttr cAttr = cast<IntegerAttr>(attr);
int64_t value = cAttr.getValue().getSExtValue();
results.push_back(rewriter.getIntegerAttr(cAttr.getType(), value / 2));
return success();
}
void registerNativeConstraints(RewritePatternSet &patterns) {
patterns.getPDLPatterns().registerConstraintFunction("Halve", halveImpl);
}
registerConstraintFunction performs the mapping, and halveImpl is required to have a very specific signature. Note, the required signature of this C++ function is not clearly documented. I had to ask the guy who implemented it, who pointed me to upstream MLIR unit tests and this code comment.
For external constraints that don’t return values, the expected signature is a bit more obvious: just LogicalResult myConstraintImpl(PatternRewriter &rewriter, ...), where the ellipsis stands for the PDLL arguments of the constraint with the PDLL-to-MLIR type conversions applied.
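For example, if IsPowerOfTwo (defined inline above) were instead declared in the PDLL file without a body, registering it by hand would look something like this sketch, which follows the same convention:
// Hypothetical: only needed if the PDLL declared
// `Constraint IsPowerOfTwo(attr: Attr);` with no inline body.
LogicalResult isPowerOfTwoImpl(PatternRewriter &rewriter, Attribute attr) {
  int64_t value = cast<IntegerAttr>(attr).getValue().getSExtValue();
  return success((value & (value - 1)) == 0);
}

void registerNativeConstraints(RewritePatternSet &patterns) {
  patterns.getPDLPatterns().registerConstraintFunction("Halve", halveImpl);
  patterns.getPDLPatterns().registerConstraintFunction("IsPowerOfTwo",
                                                       isPowerOfTwoImpl);
}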
Once that is done, everything compiles and runs, but the tests don’t quite pass yet. PowerOfTwoExpand, as currently defined, requires the constant to be the first argument, whereas the test has it in the second. Canonicalization patterns ensure the constant is on the right, but because I forgot that when originally writing this tutorial, I wrote a second version of the pattern with the constant swapped to the other argument. That is in this commit.
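The swapped version differs only in the match; roughly, the root becomes something like the following (the exact code is in the linked commit):
let root = op<arith.muli>(lhs: Value, op<arith.constant> {value = const: Attr});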
Finally, the next commit implements PeelFromMul, along with a new native constraint MinusOne. The patterns also use the with benefit(...) syntax to ensure that PowerOfTwoExpand is applied first when both patterns could succeed.
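Based on the description above and the C++ version from the earlier article, PeelFromMul in PDLL looks roughly like the following sketch (the exact code is in the linked commit):
// Like Halve, MinusOne returns a value and so must be implemented in C++.
Constraint MinusOne(attr: Attr) -> Attr;

Pattern PeelFromMul with benefit(1) {
  let root = op<arith.muli>(op<arith.constant> {value = const: Attr}, rhs: Value);
  let minusOne: Attr = MinusOne(const);

  rewrite root with {
    // x * C  ->  x * (C-1) + x
    let newConst = op<arith.constant> {value = minusOne};
    let newMul = op<arith.muli>(newConst, rhs);
    let newAdd = op<arith.addi>(newMul, rhs);
    replace root with newAdd;
  };
}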
After this, the tests pass, and the last commit adds the CMake build.
A peek inside: mlir-pdll
We’ll start investigating the internals of PDL and PDLL by taking a closer look at the mlir-pdll tool. In the bazel build, you can run it as bazel run @llvm-project//mlir:mlir-pdll, but passing the arguments correctly takes some care and bazel knowhow. The actual command looks like this:
bazel run @llvm-project//mlir:mlir-pdll -- \
-x=mlir \
-I $PWD/external/llvm-project/mlir/include \
$PWD/lib/Transform/Arith/MulToAdd.pdll
Note the -- separates flags to pass to bazel run from flags to pass to mlir-pdll, and the paths must all be absolute because bazel run runs the binary in a hermetic filepath, not the current directory. Because MulToAdd.pdll includes the upstream ArithOps.td, we have to add an -I include flag, and the location external/llvm-project is a sym-link to where bazel stores the LLVM source code.
And finally, the -x flag tells mlir-pdll what code to generate. The options are mlir, cpp, and ast. The ast option parses the PDLL and shows the abstract syntax tree (AST), which is not useful for us. The mlir option generates pdl IR, and the cpp option generates the C++ code to include in the pass.
Here is an abbreviated version of the pdl IR generated for PowerOfTwoExpandRhs when passing -x=mlir:
pdl.pattern @PowerOfTwoExpandRhs : benefit(2) {
%0 = operands
%1 = attribute
%2 = types
%3 = operation "arith.constant"(%0 : !pdl.range<value>) {"value" = %1} -> (%2 : !pdl.range<type>)
%4 = result 0 of %3
%5 = operand
%6 = types
%7 = operation "arith.muli"(%4, %5 : !pdl.value, !pdl.value) -> (%6 : !pdl.range<type>)
apply_native_constraint "IsPowerOfTwo"(%1 : !pdl.attribute)
%8 = apply_native_constraint "Halve"(%1 : !pdl.attribute) : !pdl.attribute
rewrite %7 {
%9 = operation "arith.constant" {"value" = %8}
%10 = result 0 of %9
%11 = operation "arith.muli"(%10, %5 : !pdl.value, !pdl.value)
%12 = result 0 of %11
%13 = result 0 of %11
%14 = operation "arith.addi"(%12, %13 : !pdl.value, !pdl.value)
replace %7 with %14
}
}
And here is the corresponding C++:
static ::llvm::LogicalResult IsPowerOfTwoPDLFn(::mlir::PatternRewriter &rewriter, ::mlir::Attribute attr) {
int64_t value = cast<::mlir::IntegerAttr>(attr).getValue().getSExtValue();
return success((value & (value - 1)) == 0);
}
namespace {
struct PowerOfTwoExpandRhs : ::mlir::PDLPatternModule {
template <typename... ConfigsT>
PowerOfTwoExpandRhs(::mlir::MLIRContext *context, ConfigsT &&...configs)
: ::mlir::PDLPatternModule(::mlir::parseSourceString<::mlir::ModuleOp>(
R"mlir(pdl.pattern @PowerOfTwoExpandRhs : benefit(2) {
%0 = operands
%1 = attribute
%2 = types
%3 = operation "arith.constant"(%0 : !pdl.range<value>) {"value" = %1} -> (%2 : !pdl.range<type>)
%4 = result 0 of %3
%5 = operand
%6 = types
%7 = operation "arith.muli"(%4, %5 : !pdl.value, !pdl.value) -> (%6 : !pdl.range<type>)
apply_native_constraint "IsPowerOfTwo"(%1 : !pdl.attribute)
%8 = apply_native_constraint "Halve"(%1 : !pdl.attribute) : !pdl.attribute
rewrite %7 {
%9 = operation "arith.constant" {"value" = %8}
%10 = result 0 of %9
%11 = operation "arith.muli"(%10, %5 : !pdl.value, !pdl.value)
%12 = result 0 of %11
%13 = result 0 of %11
%14 = operation "arith.addi"(%12, %13 : !pdl.value, !pdl.value)
replace %7 with %14
}
}
)mlir", context), std::forward<ConfigsT>(configs)...) {
registerConstraintFunction("IsPowerOfTwo", IsPowerOfTwoPDLFn);
}
};
} // end namespace
template <typename... ConfigsT>
static void LLVM_ATTRIBUTE_UNUSED populateGeneratedPDLLPatterns(::mlir::RewritePatternSet &patterns, ConfigsT &&...configs) {
patterns.add<PowerOfTwoExpandRhs>(patterns.getContext(), configs...);
}
As you can see, the PDL MLIR code is inlined as a string and parsed, while the native constraint is filled in as a C++ function and registered. This also explains why pdl is required as a dependent dialect of the pass; the pass will create new pdl IR from this string, so the dialect needs to be loaded.
The above shows a design choice of the PDL/PDLL system: instead of generating C++ code directly from pdl IR, they defer to a runtime interpreter to run the patterns and apply constraints. Only the native constraints are implemented in C++. In other words, this is where the scope of mlir-pdll ends and the scope of the PDL interpreter begins.
A peek inside: pdl-interp
The above code adds an ::mlir::PDLPatternModule to a RewritePatternSet, and indeed RewritePatternSet has a special hook to handle these pattern module classes. When the RewritePatternSet is converted to a FrozenRewritePatternSet (which happens implicitly on calling applyPatternsAndFoldGreedily), MLIR will convert the pdl to pdl_interp, a lower-level dialect that represents the individual operations required to execute a pattern match and rewrite. Then this pdl_interp IR is converted to PDLByteCode and stored on the pattern set. This all happens here.
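If you want to see the pdl_interp form for yourself, you can save the -x=mlir output from earlier to a file and run the conversion pass manually; assuming the upstream pass flag --convert-pdl-to-pdl-interp and a saved file patterns.mlir, the bazel invocation would look like:
bazel run @llvm-project//mlir:mlir-opt -- \
--convert-pdl-to-pdl-interp \
$PWD/patterns.mlir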
The PDL bytecode class doubles as the bytecode interpreter. It exposes match and rewrite hooks used in the pattern applicator. And its implementation is itself quite a beast, starting from this ByteCodeExecutor::execute function. I won’t go into detail here, mainly because I haven’t studied much of this code, but I will note that this is all wrapped up in a Pattern and handled just like any other MLIR pattern.
Why to use PDLL and what’s next for it
I asked the MLIR maintainers about the justifications for PDL and PDLL. They gave me a few reasons.
The first reason they gave is extensibility. By interpreting patterns, you can let the user pass their own patterns to the compiler without having to rebuild it.
I don’t believe mlir-opt has a built-in option for this, but given the pipeline we saw above, it’s clear how one would support it: parse PDLL, convert to PDL and then pdl_interp, then run the interpreter on the bytecode. Given how slow the MLIR build process is, this seems like a serious benefit.
The second reason they gave is the compiler’s binary size bloat. The more patterns you compile, the larger the compiler’s binary gets. My group’s MLIR-based HEIR project has a nightly build that is already 160 MiB, and most of that is upstream MLIR stuff. According to the maintainers, the PDL bytecode is roughly 10x smaller than the corresponding compiled C++ object file.
The third reason is eliminating duplicate work by combining common work across match phases in different patterns. For example, attribute lookups are slow in MLIR because they involve a linear scan across all of an op’s attributes and use string comparisons to find the attribute’s name. If lots of patterns match on the same attribute, that work can be performed jointly. Even accessing operands and results can be slow, if you are doing it across hundreds of patterns for the same op. This “joint computation” is achieved by merging the pattern match clauses into a single finite state machine (described in some more detail in a 2019 talk). In that talk, the author claims that the time to do all this pattern merging work is roughly 15x faster than the time to compile the analogous C++ code.
And finally, they suggested that using the interpreter for pattern matching is faster than the same patterns written in C++. This seems to ignore the cost of constructing the finite state machine mentioned above, but the 2019 talk was before the code was upstreamed and I think it could use some additional confirmation.
PDLL is impressive, but not yet feature complete. Here are some of the features I have noticed are not yet in the language.
- Support for arithmetic, boolean logic, and comparisons (has an RFC).
- Support for regions (no RFC yet).
- Support for use in dialect conversion (no RFC yet).
For more information about PDLL, see the official documentation, the 2021 MLIR open meeting recording on PDLL, the 2021 meeting on PDL, and the 2019 meeting on the interpreter.
Thanks to Matthias Gehre for answering my questions about PDL, and the folks in this thread for providing links to the talks mentioned above.