IR Representation
Mooncake.jl works by transforming Julia's SSA-form (static single assignment) Intermediate Representation (IR), so a good understanding of Julia's IR is needed to understand Mooncake. Furthermore, Mooncake holds Julia's IR in a different data structure than the one usually used when producing code for reverse-mode AD. We discuss both data structures below, and provide examples of the kinds of transformations which must be applied to Julia's IR in order to implement AD, contrasting the two different data structures.
Please note that Julia's SSA-form IR typically changes representation slightly between minor versions of Julia, as it's not part of the public interface of the language. The information below is accurate on version 1.11.4, but you might well find that things are slightly different on different versions.
Julia's SSA-form IR
Straight-Line Code
You can find the IR associated to a given signature using Base.code_ircode_by_type
:
julia> function foo(x)
y = sin(x)
z = cos(y)
return z
end
foo (generic function with 1 method)
julia> signature = Tuple{typeof(foo), Float64}
Tuple{typeof(foo), Float64}
julia> Base.code_ircode_by_type(signature)[1][1]
2 1 ─ %1 = invoke sin(_2::Float64)::Float64
3 │ %2 = invoke cos(%1::Float64)::Float64
4 └── return %2
What you can see here is that the calls to sin
and cos
in the original function are associated to a number, denoted %1
and %2
. We refers to these as the "ssa"s associated to each statement. Each statement is associated to a single ssa, and this association is determined by where it appears in the list of statements – the first statement is associated to %1
, the second to %2
, and so on. You will also notice that the argument x
has been replaced with a _2
in the first statement – in general, all uses of the n
th argument are indicated by _n
(the first argument is the function itself). The final statement requires no explanation.
Note that this IR is obtained after both type inference and various Julia-level optimisation passes. This means that the type information is available for each statement. For example, the ::Float64
at the end of the first and second statements indicates that the type of %1
and %2
is always Float64
. The types are also displayed at uses – the call to sin
involves _2::Float64
, not just _2
.
Additionally notice that the statements are invoke
statements, rather than just call statements. In Julia's IR, an invoke
statement represents static dispatch to a particular MethodInstance
– i.e. running type inference + optimisation passes has determined enough about the argument types to make it possible to know exactly which MethodInstance
of sin
and cos
to call. This is a very common occurrence in type-stable code.
Control Flow
The above is straight-line code – it does not involve any control flow. Julia has several statements which are involved in handling control flow. For example
julia> function bar(x)
if x > 0
return x
else
return 5x
end
end
bar (generic function with 1 method)
julia> Base.code_ircode_by_type(Tuple{typeof(bar),Float64})[1][1]
2 1 ─ %1 = Base.lt_float(0.0, _2)::Bool
│ %2 = Base.or_int(%1, false)::Bool
└── goto #3 if not %2
3 2 ─ return _2
5 3 ─ %5 = Base.mul_float(5.0, _2)::Float64
└── return %5
In this example we see the statement goto #3 if not %2
. This should be read as "jump to basic block 3 if %2 is false
". The second half of that statement should be clear, but to understand the first half requires knowing what a basic block is:
1 ─
│
└──
2 ─
3 ─
└──
Here, everything is removed from the above example except for information about the basic block structure. To first approximation, each basic block is a sequence of statements which must always execute one after the other. Once all statements in a basic block have run, we typically either jump to another basic block, or hit a return
statement. In this example, we have three basic blocks – you can see this from the numbers 1
, 2
, and 3
. The first basic block comprises three statements, the second only one statement, and the third two statements. Another way to investigate this structure is to look at the control-flow graph associated to the IR:
julia> Base.code_ircode_by_type(Tuple{typeof(bar),Float64})[1][1].cfg
CFG with 3 blocks:
bb 1 (stmts 1:3) → bb 3, 2
bb 2 (stmt 4)
bb 3 (stmts 5:6)
For example, the above states that "bb" (basic block) 1 comprises statements 1 to 3, and has successor blocks 2 and 3 (ie. once the statements in basic block 1 have executed, we know for certain that either those in block 2 or block 3 will run next). Blocks 2 and 3 have no successors, because they both end in a return
statement. The predecessors of each basic block (the blocks which could possibly have run immediately prior to a given block) are also stored in the blocks of the CFG
, even though this is not printed – you should have a play around with this data structure to see what is in there.
Additionally, note that Base.lt_float
(used to check if one floating point number is less than another) and Base.or_int
do not appear as invoke
statements – this is because they are not generic Julia functions. Rather, they are Julia intrinsics:
julia> Base.lt_float
lt_float (intrinsic function #31)
These intrinsics have special handling in the compiler. Either way, the overall point is to be aware that these kinds of low-level intrinsics exist, and appear regularly in Julia IR.
Simple Loops and Phi-Nodes
Finally, we shall consider a simple loop:
julia> function my_factorial(x::Int)
n = 0
s = 1
while n < x
n += 1
s *= n
end
return s
end
my_factorial (generic function with 1 method)
julia> ir = Base.code_ircode_by_type(Tuple{typeof(my_factorial), Int})[1][1]
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %7)::Int64
│ %3 = φ (#1 => 0, #3 => %6)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ %6 = Base.add_int(%3, 1)::Int64
6 │ %7 = Base.mul_int(%2, %6)::Int64
7 └── goto #2
8 4 ─ return %2
There are a few new intrinsics that we have not seen previously (Base.slt_int
(used to check whether one int is strictly less than another), Base.add_int
, and Base.mul_int
). Additionally, there is the node goto #2
, which simply states that control flow should jump to basic block 2 whenever it is hit.
The most interesting additional nodes, however, are the two φ
(phi) nodes. These are a defining feature of SSA-form IR. Consider the first φ
node:
%2 = φ (#1 => 1, #3 => %7)
means ssa %2
takes value 1
if the previous basic block was #1
, and whatever value is currently associated to ssa %7
if the previous basic block was #3
. It is helpful to step through this code in your head: upon calling my_factorial
we enter basic block #1
, and proceed directly to basic block #2
. Therefore, on the first iteration, %2
takes value 1
. We never return to basic block #1
, so all subsequent visits to this φ
node will result in %2
taking the value associated to %7
. You should convince yourself that %2
corresponds to the value of s
at each iteration, and %3
corresponds to the value of n
at each iteration.
Summary
Julia's SSA-form IR comprises a sequence of statements, which can be broken down into a collection of basic blocks. Each basic block begins with a (potentially empty) collection of phi nodes, followed by a sequence of statements, and potentially finished by a terminator (goto, goto-if-not, return). Control flow is dictated by the terminators at the end of basic blocks – if there is no terminator then we "fall through" to the next basic block.
Julia Compiler's IR Datastructure
The Julia compiler represents the IR associated to a signature via a struct
called Core.Compiler.IRCode
. The statements are given by the stmts
field, which is a Core.Compiler.InstructionStream
. An InstructionStream
is a collection of 5 Vector
s, each of which have the same length. The properties of the n
th statement in the IR
are given by the n
th element of each of these vectors. For example, the stmt
field contains the statement itself, the type
field contains the inferred type associated to the statement. We'll skip the rest for now. For example, the statements associated to the my_factorial
function above can be retrieved as follows:
julia> ir.stmts.stmt
9-element Vector{Any}:
nothing
:(φ (%1 => 1, %3 => %7))
:(φ (%1 => 0, %3 => %6))
:(Base.slt_int(%3, _2))
:(goto %4 if not %4)
:(Base.add_int(%3, 1))
:(Base.mul_int(%2, %6))
:(goto %2)
:(return %2)
The types can be accessed in a similar way:
julia> ir.stmts.type
9-element Vector{Any}:
Nothing
Int64
Int64
Bool
Any
Int64
Int64
Any
Any
As seen in Control Flow, the control flow graph (CFG) is represented as a separate data structure, stored in the cfg
field of the IRCode
. The argument types associated to the signature are stored in the argtypes
field of the IRCode
.
An Alternative IR Datastructure
IRCode
is a perfectly good way to represent Julia's IR the vast majority of the time. For example, it suffices for the code transformations required for forwards-mode AD. However, IR transformations involving multiple changes to the control flow structure of a programme are needed in reverse-mode, and are prohibitively awkward to undertake using IRCode
. Mooncake's implementation of reverse-mode AD instead makes use of a custom representation of Julia's IR, called BBCode
. We emphasise that BBCode
represents the same thing under the hood, it is just represented in memory in a slightly different way, such that certain kinds of transformations are straightforward to implement.
You can construct a BBCode
from an IRCode
, and vice versa:
julia> using Mooncake: BBCode
julia> bb_ir = BBCode(ir);
julia> bb_ir isa BBCode
true
julia> Core.Compiler.IRCode(bb_ir)
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %7)::Int64
│ %3 = φ (#1 => 0, #3 => %6)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ %6 = Base.add_int(%3, 1)::Int64
6 │ %7 = Base.mul_int(%2, %6)::Int64
7 └── goto #2
8 4 ─ return %2
At present, BBCode
does not display itself nicely, so to look at it we must either inspect its fields, or convert it back to an IRCode
(which does print nicely).
Instead of storing all of the statements in a single vector (and the types in their own vector, etc), BBCode
stores all statements associated to a particular basic block in a Mooncake.BBlock
, and stores these in a Vector{Mooncake.BBlock}
.
julia> typeof(bb_ir.blocks)
Vector{BBlock} (alias for Array{Mooncake.BasicBlockCode.BBlock, 1})
Each BBlock
has a field insts
, containing the statements associated to that basic block. This is stored as a Vector{Core.Compiler.NewInstruction}
, because Core.Compiler.NewInstruction
contains the 5 fields that define an instruction in IRCode
(you should compare the fields of a Core.Compiler.NewInstruction
with those of Core.Compiler.InstructionStream
to see the correspondence). For example, consider
julia> using Mooncake.BasicBlockCode: ID # to improve printing
julia> bb_ir.blocks[3].insts[1]
Core.Compiler.NewInstruction(:(Base.add_int(ID(97), 1)), Int64, Core.Compiler.NoCallInfo(), 9, 0x000012e0)
This is the first instruction of the third basic block. The first field is a call to Base.add_int
, the second field is Int64
(we promise that the other fields are just copies of the corresponding data from the Core.Compiler.InstructionStream
in the original IRCode
representation of this IR).
The other structural difference is that BBCode
has no field containing the control-flow graph. Instead, the control-flow graph is represented implicitly as part of the blocks
field. The upside of this is that any transformations of blocks
which modify the CFG are automatically reflected in the blocks
– there is no need to perform any book-keeping to ensure that the CFG is kept in sync with the instructions. This saves both time and memory when inserting new basic blocks – when basic block structure changes, a scan of the entire IRCode
is required to modify any statements which refer to a given block, and yields code simplifications. The downside is that the CFG must be computed whenever we need to know about it. As a resut, neither IRCode
nor BBCode
's representation of the CFG is strictly better than the other. To extract CFG-related information from a BBCode
, see Mooncake.BasicBlockCode.compute_all_successors
, Mooncake.BasicBlockCode.compute_all_predecessors
, and Mooncake.BasicBlockCode.control_flow_graph
.
The final major difference between IRCode
and BBCode
is that all ssa values in an IRCode
(%1
, %2
, %n
, etc) are replaced with unique ID
s. The ID
associated to a statement is stored separately from the statement in the inst_ids
field of a BBlock
:
julia> bb_ir.blocks[3].inst_ids
3-element Vector{ID}:
ID(100)
ID(101)
ID(102)
There is exactly one ID
per instruction, and it is an error to have the same ID
associated to multiple instructions. Similarly, while the number associated to a basic block in IRCode
is a function of the number of basic blocks which preceed it, the ID
of a basic block in BBCode
is stored in its id
field:
julia> bb_ir.blocks[3].id
ID(106)
As a result of this, all references to ssa values and basic block numbers in IRCode
are replaced with ID
s in BBCode
. The purpose of this is to guarantee that the "name" of a basic block and an instruction does not change when you insert new basic blocks and new instructions. We shall see how this is useful in the examples below.
Code Transformations
In what follows, we look at a few transformations of Julia's IR, and see how these can be undertaken using both IRCode
and BBCode
. The purpose is two-fold:
- to enable readers to understand the code used to implement Mooncake, and
- to highlight the relative merits of
IRCode
vsBBCode
.
Replacing Instructions
This is a very simple code transformation. It is used in both forwards-mode and reverse-mode in Mooncake to replace calls of the form
f(x, y, z)
with calls of the form
frule!!(f, x, y, z)
This kind of transformation is performed in basically the same way for both IRCode
and BBCode
. For example, the mul_int
statement associated to ssa %7
can be replaced with an add_int
statement as follows:
julia> using Core: SSAValue
julia> const CC = Core.Compiler;
julia> new_ir = Core.Compiler.copy(ir);
julia> old_stmt = new_ir.stmts.stmt[7]
:(Base.mul_int(%2, %6))
julia> new_stmt = Expr(:call, Base.add_int, old_stmt.args[2:end]...)
:((Core.Intrinsics.add_int)(%2, %6))
julia> # new_ir[SSAValue(7)][:stmt] = new_stmt
CC.setindex!(CC.getindex(new_ir, SSAValue(7)), new_stmt, :stmt);
julia> new_ir
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %7)::Int64
│ %3 = φ (#1 => 0, #3 => %6)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ %6 = Base.add_int(%3, 1)::Int64
6 │ %7 = (Core.Intrinsics.add_int)(%2, %6)::Int64
7 └── goto #2
8 4 ─ return %2
Observe that ssa 7
has been replaced with the new :call
to add_int
. Unfortunately, in order to avoid committing type-piracy against Core.Compiler
, we cannot currently write new_ir[SSAValue(7)][:stmt]
. (CC.getindex
is a different function from Base.getindex
– the same is true for CC.setindex!
vs Base.setindex!
). In general, I would recommend defining helper functions to improve the DRYness of your code.
The same transformation can be performed on BBCode
:
julia> bb_ir_copy = copy(bb_ir);
julia> old_inst = bb_ir_copy.blocks[3].insts[2]
Core.Compiler.NewInstruction(:(Base.mul_int(ID(96), ID(100))), Int64, Core.Compiler.NoCallInfo(), 10, 0x000012e0)
julia> new_stmt = Expr(:call, Base.add_int, old_inst.stmt.args[2:end]...)
:((Core.Intrinsics.add_int)(ID(96), ID(100)))
julia> bb_ir_copy.blocks[3].insts[2] = CC.NewInstruction(old_inst; stmt=new_stmt);
julia> CC.IRCode(bb_ir_copy)
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %7)::Int64
│ %3 = φ (#1 => 0, #3 => %6)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ %6 = Base.add_int(%3, 1)::Int64
6 │ %7 = (Core.Intrinsics.add_int)(%2, %6)::Int64
7 └── goto #2
8 4 ─ return %2
As you can see, in both cases we wind up with the same IRCode
at the end.
Inserting New Instructions
Inserting entirely new instructions into the IR requires a little more thought, but is ultimately very straightforward using either IRCode
or BBCode
.
First, IRCode
. Suppose that we wish to insert another instruction immediately before the first add_int
instruction which multiplies %3
by 2 before adding 1
to it in #3
. In IRCode
, this kind of modification requires some care, because naively inserting an instruction between the 5th and 6th line changes the name of all instructions from the 6th onwards. Consequently, we need to replace all existing uses of e.g. %6
with uses of %7
, etc. Happily, IRCode
has a mechanism to achieve just this.
julia> ni = CC.NewInstruction(Expr(:call, Base.mul_int, SSAValue(3), 2), Int)
Core.Compiler.NewInstruction(:((Core.Intrinsics.mul_int)(%3, 2)), Int64, Core.Compiler.NoCallInfo(), nothing, nothing)
julia> new_ssa = CC.insert_node!(new_ir, SSAValue(6), ni)
:(%10)
julia> new_ir
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %7)::Int64
│ %3 = φ (#1 => 0, #3 => %6)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ (Core.Intrinsics.mul_int)(%3, 2)::Int64
│ %6 = Base.add_int(%3, 1)::Int64
6 │ %7 = (Core.Intrinsics.add_int)(%2, %6)::Int64
7 └── goto #2
8 4 ─ return %2
CC.insert_node!(ir, ssa, new_inst)
inserts new_inst
into ir
immediately before ssa
, and attaches it to the same basic block as ssa
resides. It returns an SSAValue
, which is the "name" associated to the inserted instruction in the IR. Here, we see it has inserted the instruction to multiply %3
by 2
immediately before %6
. However, observe that the IRCode
has not changed the name associated to the subsequent add_int
instruction – it still assigns to %6
, despite not being the 6th statement in the IR anymore. This is achieved via IRCode
's new_nodes
field – upon calling CC.insert_node!
, rather than inserting the instruction directly into the InstructionStream
, this list is appended to. We can do this as many times as we like, and then call CC.compact!
at the end to handle all of the book-keeping involved in inserting all of the statements, updating all ssa uses where required, and updating the cfg
field of the IR.
Also observe that the inserted statement is printed without a %10 =
at the start of it – this is because there are not (yet) any uses of %10
, so IRCode
does not print it out (presumably in order to reduce visual noise).
To conclude this transformation, we replace the first argument of the add_int
instruction with the new ssa returned by insert_node!
, and then call CC.compact!
to process all of the nodes currently in the new_nodes
list, and produce a valid IRCode
:
julia> stmt = CC.getindex(CC.getindex(new_ir, SSAValue(6)), :stmt)
:(Base.add_int(%3, 1))
julia> stmt.args[2] = new_ssa;
julia> new_ir
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %7)::Int64
│ %3 = φ (#1 => 0, #3 => %6)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ %10 = (Core.Intrinsics.mul_int)(%3, 2)::Int64
│ %6 = Base.add_int(%10, 1)::Int64
6 │ %7 = (Core.Intrinsics.add_int)(%2, %6)::Int64
7 └── goto #2
8 4 ─ return %2
julia> new_ir = CC.compact!(new_ir)
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %8)::Int64
│ %3 = φ (#1 => 0, #3 => %7)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
5 3 ─ %6 = (Core.Intrinsics.mul_int)(%3, 2)::Int64
│ %7 = Base.add_int(%6, 1)::Int64
6 │ %8 = (Core.Intrinsics.add_int)(%2, %7)::Int64
7 └── goto #2
8 4 ─ return %2
Observe that, before compact!
-ing, the first instruction in basic block #3
is still labelled as being %10
. After compact!
-ing, we have standard sequentially-labelled IR again. Note that the above is exactly the kind of thing that we do in our implementation of forwards-mode AD – all insertions of nodes are performed in a single pass over the IRCode
, and CC.compact!
is called once at the end.
Performing this transformation using BBCode
is similarly straightforward. Since the name associated to instructions does not change when you insert another instruction, you really just need to insert an instruction + its ID
, update the next instruction (as before), and you're done:
julia> using Mooncake.BasicBlockCode: ID, new_inst
julia> new_id = ID() # this produces a new unique `ID`.
ID(108)
julia> target_id = bb_ir_copy.blocks[3].insts[1].stmt.args[2] # find `ID` of argument to add_int.
ID(97)
julia> ni = new_inst(Expr(:call, Base.mul_int, target_id, 2), Int);
julia> insert!(bb_ir_copy.blocks[3], 1, new_id, ni)
julia> bb_ir_copy.blocks[3].insts[2].stmt.args[2] = new_id
ID(108)
julia> CC.IRCode(bb_ir_copy)
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %8)::Int64
│ %3 = φ (#1 => 0, #3 => %7)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #4 if not %4
2 3 ─ %6 = (Core.Intrinsics.mul_int)(%3, 2)::Int64
5 │ %7 = Base.add_int(%6, 1)::Int64
6 │ %8 = (Core.Intrinsics.add_int)(%2, %7)::Int64
7 └── goto #2
8 4 ─ return %2
We see here that IRCode
and BBCode
involve similar levels of complexity to insert an instruction.
Inserting New Basic Blocks
This is the situation in which the design of BBCode
shines vs IRCode
. IRCode
does not, at present, really have much to say about transformations which change control flow. It is, however, straightforward using BBCode
. Suppose that we wish to modify the above to display the value of %2
if it is even on any given iteration. Since this involves control flow, it necessarily requires at least one additional basic block.
We do this in two steps. We first insert an additional basic block between blocks #3
and #4
which always prints out the value of %2
, and then goes to block #2
:
julia> using Mooncake.BasicBlockCode: BBlock, new_inst, IDGotoNode, IDGotoIfNot
julia> block_2_id = bb_ir_copy.blocks[2].id;
julia> new_bb_id = ID();
julia> new_bb = BBlock(
new_bb_id,
ID[ID(), ID()],
CC.NewInstruction[
new_inst(Expr(:call, println, CC.SSAValue(2))),
new_inst(IDGotoNode(block_2_id)),
],
);
julia> insert!(bb_ir_copy.blocks, 4, new_bb);
julia> CC.IRCode(bb_ir_copy)
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %8)::Int64
│ %3 = φ (#1 => 0, #3 => %7)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #5 if not %4
2 3 ─ %6 = (Core.Intrinsics.mul_int)(%3, 2)::Int64
5 │ %7 = Base.add_int(%6, 1)::Int64
6 │ %8 = (Core.Intrinsics.add_int)(%2, %7)::Int64
7 └── goto #2
2 4 ─ (println)(%2)::Any
└── goto #2
8 5 ─ return %2
Observe that, in this case, rather than creating new_bb
and then inserting instructions into it, we simply create the block with the instructions. This programming style is often more convenient. Additionally note that we create an ID
for each statement in the new basic block. These ID
s are never actually used anywhere, but BBCode
requires that each instruction be associated to an ID
, so we must create them.
Additionally, note the usage of an Mooncake.BasicBlockCode.IDGotoNode
. This is exactly the same thing as a Core.Compiler.GotoNode
, except it contains an ID
stating which basic block to jump to, rather than an Int
. Similarly, the Mooncake.BasicBlockCode.IDGotoIfNot
is a direct translation of Core.Compiler.GotoIfNot
, with the dest
field being an ID
rather than an Int
.
Furthermore, note that the goto if not
instruction at the end of basic block #2
now (correctly) jumps to basic block #5
, whereas before it jumped to block #4
. That is, by virtue of the fact that the ID
associated to each basic block remains unchanged in BBCode
, all pre-existing control flow relationships have remained the same. Moreover, we did not have to write any book-keeping code to ensure that this update happened correctly.
Now that we've created the new basic block, we modify block #3
to fall-through to the new block if %2
is even, and to jump straight back to block #2
if not:
julia> bb = bb_ir_copy.blocks[3];
julia> cond_id = ID();
julia> target_id = bb_ir_copy.blocks[2].inst_ids[1];
julia> insert!(bb, 4, cond_id, new_inst(Expr(:call, iseven, target_id)));
julia> bb.insts[end] = new_inst(IDGotoIfNot(cond_id, block_2_id));
julia> new_ir = CC.IRCode(bb_ir_copy)
1 ─ nothing::Nothing
4 2 ┄ %2 = φ (#1 => 1, #3 => %8)::Int64
│ %3 = φ (#1 => 0, #3 => %7)::Int64
│ %4 = Base.slt_int(%3, _2)::Bool
└── goto #5 if not %4
2 3 ─ %6 = (Core.Intrinsics.mul_int)(%3, 2)::Int64
5 │ %7 = Base.add_int(%6, 1)::Int64
6 │ %8 = (Core.Intrinsics.add_int)(%2, %7)::Int64
2 │ %9 = (iseven)(%2)::Any
└── goto #2 if not %9
4 ─ (println)(%2)::Any
└── goto #2
8 5 ─ return %2
Observe that in order to tie the conditional to the goto-if-not, we simply ensure that the ID
associated to the instruction which computes the conditional appears in the IDGotoIfNot
instruction.
Run the new code
As ever, we can construct a Core.OpaqueClosure
using IRCode
in order to produce something runnable:
julia> oc = Core.OpaqueClosure(new_ir; do_compile=true)
(::Int64)::Int64->◌
julia> oc(1000)
2
12
58
248
1014
2037
Exactly what oc
is computing is neither here nor there. The point is that we've successfully inserted a new basic block into Julia's IR, and produced a callable from it.
Summary
We have reviewed the two representations of Julia IR used in Mooncake. Where possible, we always use IRCode
– as discussed, forwards-mode AD exclusively uses IRCode
. BBCode
is basically only needed when undertaking transformations which involve changes to basic block structure – the insertion of new basic blocks, and the modification of terminators in a way which changes the predecessors / successors of a given block being the primary sources of these kinds of changes. Reverse-mode AD makes extensive use of such transformations, so BBCode
is currently important there.
There are efforts such as this PR to augment IRCode
with the capability to manipulate the CFG structure in a convenient manner. Ideally these efforts will succeed, then we can do away with BBCode
.
Docstrings
Mooncake.BasicBlockCode
— Modulemodule BasicBlockCode
See the docstring for the BBCode
struct
for info on this file.
Mooncake.BasicBlockCode.Terminator
— TypeTerminator = Union{Switch, IDGotoIfNot, IDGotoNode, ReturnNode}
A Union of the possible types of a terminator node.
Core.Compiler.IRCode
— MethodIRCode(bb_code::BBCode)
Produce an IRCode
instance which is equivalent to bb_code
. The resulting IRCode
shares no memory with bb_code
, so can be safely mutated without modifying bb_code
.
All IDPhiNode
s, IDGotoIfNot
s, and IDGotoNode
s are converted into PhiNode
s, GotoIfNot
s, and GotoNode
s respectively.
In the resulting bb_code
, any Switch
nodes are lowered into a semantically-equivalent collection of GotoIfNot
nodes.
Mooncake.BasicBlockCode.BBCode
— TypeBBCode(
blocks::Vector{BBlock}
argtypes::Vector{Any}
sptypes::Vector{CC.VarState}
linetable::Vector{Core.LineInfoNode}
meta::Vector{Expr}
)
A BBCode
is a data structure which is similar to IRCode
, but adds additional structure.
In particular, a BBCode
comprises a sequence of basic blocks (BBlock
s), each of which comprise a sequence of statements. Moreover, each BBlock
has its own unique ID
, as does each statment.
The consequence of this is that new basic blocks can be inserted into a BBCode
. This is distinct from IRCode
, in which to create a new basic block, one must insert additional statments which you know will create a new basic block – this is generally quite an unreliable process, while inserting a new BBlock
into BBCode
is entirely predictable. Furthermore, inserting a new BBlock
does not change the ID
associated to the other blocks, meaning that you can safely assume that references from existing basic block terminators / phi nodes to other blocks will not be modified by inserting a new basic block.
Additionally, since each statment in each basic block has its own unique ID
, new statments can be inserted without changing references between other blocks. IRCode
also has some support for this via its new_nodes
field, but eventually all statements will be renamed upon compact!
ing the IRCode
, meaning that the name of any given statement will eventually change.
Finally, note that the basic blocks in a BBCode
support the custom Switch
statement. This statement is not valid in IRCode
, and is therefore lowered into a collection of GotoIfNot
s and GotoNode
s when a BBCode
is converted back into an IRCode
.
Mooncake.BasicBlockCode.BBCode
— MethodBBCode(ir::IRCode)
Convert an ir
into a BBCode
. Creates a completely independent data structure, so mutating the BBCode
returned will not mutate ir
.
All PhiNode
s, GotoIfNot
s, and GotoNode
s will be replaced with the IDPhiNode
s, IDGotoIfNot
s, and IDGotoNode
s respectively.
See IRCode
for conversion back to IRCode
.
Note that IRCode(BBCode(ir))
should be equal to the identity function.
Mooncake.BasicBlockCode.BBCode
— MethodBBCode(ir::Union{IRCode, BBCode}, new_blocks::Vector{Block})
Make a new BBCode
whose blocks
is given by new_blocks
, and fresh copies are made of all other fields from ir
.
Mooncake.BasicBlockCode.BBlock
— TypeBBlock(id::ID, stmt_ids::Vector{ID}, stmts::InstVector)
A basic block data structure (not called BasicBlock
to avoid accidental confusion with CC.BasicBlock
). Forms a single basic block.
Each BBlock
has an ID
(a unique name). This makes it possible to refer to blocks in a way that does not change when additional BBlocks
are inserted into a BBCode
. This differs from the positional block numbering found in IRCode
, in which the number associated to a basic block changes when new blocks are inserted.
The n
th line of code in a BBlock
is associated to ID
stmt_ids[n]
, and the n
th instruction from stmts
.
Note that PhiNode
s, GotoIfNot
s, and GotoNode
s should not appear in a BBlock
– instead an IDPhiNode
, IDGotoIfNot
, or IDGotoNode
should be used.
Mooncake.BasicBlockCode.BBlock
— MethodBBlock(id::ID, inst_pairs::Vector{IDInstPair})
Convenience constructor – splits inst_pairs
into a Vector{ID}
and InstVector
in order to build a BBlock
.
Mooncake.BasicBlockCode.ID
— TypeID()
An ID
(read: unique name) is just a wrapper around an Int32
. Uniqueness is ensured via a global counter, which is incremented each time that an ID
is created.
This counter can be reset using seed_id!
if you need to ensure deterministic ID
s are produced, in the same way that seed for random number generators can be set.
Mooncake.BasicBlockCode.IDGotoIfNot
— TypeIDGotoIfNot(cond::Any, dest::ID)
Like a GotoIfNot
, but dest
is an ID
rather than an Int64
.
Mooncake.BasicBlockCode.IDGotoNode
— TypeIDGotoNode(label::ID)
Like a GotoNode
, but label
is an ID
rather than an Int64
.
Mooncake.BasicBlockCode.IDInstPair
— Typeconst IDInstPair = Tuple{ID, NewInstruction}
Mooncake.BasicBlockCode.IDPhiNode
— TypeIDPhiNode(edges::Vector{ID}, values::Vector{Any})
Like a PhiNode
, but edges
are ID
s rather than Int32
s.
Mooncake.BasicBlockCode.InstVector
— Typeconst InstVector = Vector{NewInstruction}
Note: the CC.NewInstruction
type is used to represent instructions because it has the correct fields. While it is only used to represent new instrucdtions in Core.Compiler
, it is used to represent all instructions in BBCode
.
Mooncake.BasicBlockCode.Switch
— TypeSwitch(conds::Vector{Any}, dests::Vector{ID}, fallthrough_dest::ID)
A switch-statement node. These can be inserted in the BBCode
representation of Julia IR. Switch
has the following semantics:
goto dests[1] if not conds[1]
goto dests[2] if not conds[2]
...
goto dests[N] if not conds[N]
goto fallthrough_dest
where the value associated to each element of conds
is a Bool
, and dests
indicate which block to jump to. If none of the conditions are met, then we go to whichever block is specified by fallthrough_dest
.
Switch
statements are lowered into the above sequence of GotoIfNot
s and GotoNode
s when converting BBCode
back into IRCode
, because Switch
statements are not valid nodes in regular Julia IR.
Base.insert!
— MethodBase.insert!(bb::BBlock, n::Int, id::ID, stmt::CC.NewInstruction)::Nothing
Inserts stmt
and id
into bb
immediately before the n
th instruction.
Mooncake.BasicBlockCode.__line_numbers_to_block_numbers!
— Method__line_numbers_to_block_numbers!(insts::Vector{Any}, cfg::CC.CFG)
Converts any edges in GotoNode
s, GotoIfNot
s, PhiNode
s, and :enter
expressions which refer to line numbers into references to block numbers. The cfg
provides the information required to perform this conversion.
For context, CodeInfo
objects have references to line numbers, while IRCode
uses block numbers.
This code is copied over directly from the body of Core.Compiler.inflate_ir!
.
Mooncake.BasicBlockCode._block_nums_to_ids
— Method_block_nums_to_ids(insts::InstVector, cfg::CC.CFG)::Tuple{Vector{ID}, InstVector}
Assign to each basic block in cfg
an ID
. Replace all integers referencing block numbers in insts
with the corresponding ID
. Return the ID
s and the updated instructions.
Mooncake.BasicBlockCode._build_graph_of_cfg
— Method_build_graph_of_cfg(blks::Vector{BBlock})::Tuple{SimpleDiGraph, Dict{ID, Int}}
Builds a SimpleDiGraph
, g
, representing of the CFG associated to blks
, where blks
comprises the collection of basic blocks associated to a BBCode
. This is a type from Graphs.jl, so constructing g
makes it straightforward to analyse the control flow structure of ir
using algorithms from Graphs.jl.
Returns a 2-tuple, whose first element is g
, and whose second element is a map from the ID
associated to each basic block in ir
, to the Int
corresponding to its node index in g
.
Mooncake.BasicBlockCode._compute_all_predecessors
— Method_compute_all_predecessors(blks::Vector{BBlock})::Dict{ID, Vector{ID}}
Internal method implementing compute_all_predecessors
. This method is easier to construct test cases for because it only requires the collection of BBlocks
, not all of the other stuff that goes into a BBCode
.
Mooncake.BasicBlockCode._compute_all_successors
— Method_compute_all_successors(blks::Vector{BBlock})::Dict{ID, Vector{ID}}
Internal method implementing compute_all_successors
. This method is easier to construct test cases for because it only requires the collection of BBlocks
, not all of the other stuff that goes into a BBCode
.
Mooncake.BasicBlockCode._control_flow_graph
— Method_control_flow_graph(blks::Vector{BBlock})::Core.Compiler.CFG
Internal function, used to implement control_flow_graph
. Easier to write test cases for because there is no need to construct an ensure BBCode object, just the BBlock
s.
Mooncake.BasicBlockCode._distance_to_entry
— Method_distance_to_entry(blks::Vector{BBlock})::Vector{Int}
For each basic block in blks
, compute the distance from it to the entry point (the first block. The distance is typemax(Int)
if no path from the entry point to a given node.
Mooncake.BasicBlockCode._find_id_uses!
— Method_find_id_uses!(d::Dict{ID, Bool}, x)
Helper function used in characterise_used_ids
. For all uses of ID
s in x
, set the corresponding value of d
to true
.
For example, if x = ReturnNode(ID(5))
, then this function sets d[ID(5)] = true
.
Mooncake.BasicBlockCode._ids_to_line_numbers
— Method_ids_to_line_numbers(bb_code::BBCode)::InstVector
For each statement in bb_code
, returns a NewInstruction
in which every ID
is replaced by either an SSAValue
, or an Int64
/ Int32
which refers to an SSAValue
.
Mooncake.BasicBlockCode._is_reachable
— Method_is_reachable(blks::Vector{BBlock})::Vector{Bool}
Computes a Vector
whose length is length(blks)
. The n
th element is true
iff it is possible for control flow to reach the n
th block.
Mooncake.BasicBlockCode._lines_to_blocks
— Method_instructions_to_blocks(insts::InstVector, cfg::CC.CFG)::InstVector
Pulls out the instructions from insts
, and calls __line_numbers_to_block_numbers!
.
Mooncake.BasicBlockCode._lower_switch_statements
— Method_lower_switch_statements(bb_code::BBCode)
Converts all Switch
s into a semantically-equivalent collection of GotoIfNot
s. See the Switch
docstring for an explanation of what is going on here.
Mooncake.BasicBlockCode._remove_double_edges
— Method_remove_double_edges(ir::BBCode)::BBCode
If the dest
field of an IDGotoIfNot
node in block n
of ir
points towards the n+1
th block then we have two edges from block n
to block n+1
. This transformation replaces all such IDGotoIfNot
nodes with unconditional IDGotoNode
s pointing towards the n+1
th block in ir
.
Mooncake.BasicBlockCode._ssa_to_ids
— Method_ssa_to_ids(d::SSAToIdDict, inst::NewInstruction)
Produce a new instance of inst
in which all instances of SSAValue
s are replaced with the ID
s prescribed by d
, all basic block numbers are replaced with the ID
s prescribed by d
, and GotoIfNot
, GotoNode
, and PhiNode
instances are replaced with the corresponding ID
versions.
Mooncake.BasicBlockCode._ssas_to_ids
— Method_ssas_to_ids(insts::InstVector)::Tuple{Vector{ID}, InstVector}
Assigns an ID to each line in stmts
, and replaces each instance of an SSAValue
in each line with the corresponding ID
. For example, a call statement of the form Expr(:call, :f, %4)
is be replaced with Expr(:call, :f, id_assigned_to_%4)
.
Mooncake.BasicBlockCode._to_ssas
— Method_to_ssas(d::Dict, inst::NewInstruction)
Like _ssas_to_ids
, but in reverse. Converts IDs to SSAValues / (integers corresponding to ssas).
Mooncake.BasicBlockCode.characterise_unique_predecessor_blocks
— Methodcharacterise_unique_predecessor_blocks(blks::Vector{BBlock}) ->
Tuple{Dict{ID, Bool}, Dict{ID, Bool}}
We call a block b
a unique predecessor in the control flow graph associated to blks
if it is the only predecessor to all of its successors. Put differently we call b
a unique predecessor if, whenever control flow arrives in any of the successors of b
, we know for certain that the previous block must have been b
.
Returns two Dict
s. A value in the first Dict
is true
if the block associated to its key is a unique precessor, and is false
if not. A value in the second Dict
is true
if it has a single predecessor, and that predecessor is a unique predecessor.
Context:
This information is important for optimising AD because knowing that b
is a unique predecessor means that
- on the forwards-pass, there is no need to push the ID of
b
to the block stack when passing through it, and - on the reverse-pass, there is no need to pop the block stack when passing through one of the successors to
b
.
Utilising this reduces the overhead associated to doing AD. It is quite important when working with cheap loops – loops where the operations performed at each iteration are inexpensive – for which minimising memory pressure is critical to performance. It is also important for single-block functions, because it can be used to entirely avoid using a block stack at all.
Mooncake.BasicBlockCode.characterise_used_ids
— Methodcharacterise_used_ids(stmts::Vector{IDInstPair})::Dict{ID, Bool}
For each line in stmts
, determine whether it is referenced anywhere else in the code. Returns a dictionary containing the results. An element is false
if the corresponding ID
is unused, and true
if is used.
Mooncake.BasicBlockCode.collect_stmts
— Methodcollect_stmts(ir::BBCode)::Vector{IDInstPair}
Produce a Vector
containing all of the statements in ir
. These are returned in order, so it is safe to assume that element n
refers to the nth
element of the IRCode
associated to ir
.
Mooncake.BasicBlockCode.collect_stmts
— Methodcollect_stmts(bb::BBlock)::Vector{IDInstPair}
Returns a Vector
containing the ID
s and instructions associated to each line in bb
. These should be assumed to be ordered.
Mooncake.BasicBlockCode.compute_all_predecessors
— Methodcompute_all_predecessors(ir::BBCode)::Dict{ID, Vector{ID}}
Compute a map from the ID
of each BBlock
in ir
to its possible predecessors.
Mooncake.BasicBlockCode.compute_all_successors
— Methodcompute_all_successors(ir::BBCode)::Dict{ID, Vector{ID}}
Compute a map from the ID
of each BBlock
in ir
to its possible successors.
Mooncake.BasicBlockCode.control_flow_graph
— Methodcontrol_flow_graph(bb_code::BBCode)::Core.Compiler.CFG
Computes the Core.Compiler.CFG
object associated to this bb_code
.
Mooncake.BasicBlockCode.id_to_line_map
— Methodid_to_line_map(ir::BBCode)
Produces a Dict
mapping from each ID
associated with a line in ir
to its line number. This is isomorphic to mapping to its SSAValue
in IRCode
. Terminators do not have ID
s associated to them, so not every line in the original IRCode
is mapped to.
Mooncake.BasicBlockCode.insert_before_terminator!
— Methodinsert_before_terminator!(bb::BBlock, id::ID, inst::NewInstruction)::Nothing
If the final instruction in bb
is a Terminator
, insert inst
immediately before it. Otherwise, insert inst
at the end of the block.
Mooncake.BasicBlockCode.is_reachable_return_node
— Methodis_reachable_return_node(x::ReturnNode)
Determine whether x
is a ReturnNode
, and if it is, if it is also reachable. This is purely a function of whether or not its val
field is defined or not.
Mooncake.BasicBlockCode.new_inst
— Functionnew_inst(stmt, type=Any, flag=CC.IR_FLAG_REFINED)::NewInstruction
Create a NewInstruction
with fields:
stmt
=stmt
type
=type
info
=CC.NoCallInfo()
line
=Int32(1)
flag
=flag
Mooncake.BasicBlockCode.new_inst_vec
— Methodnew_inst_vec(x::CC.InstructionStream)
Convert an Compiler.InstructionStream
into a list of Compiler.NewInstruction
s.
Mooncake.BasicBlockCode.phi_nodes
— Methodphi_nodes(bb::BBlock)::Tuple{Vector{ID}, Vector{IDPhiNode}}
Returns all of the IDPhiNode
s at the start of bb
, along with their ID
s. If there are no IDPhiNode
s at the start of bb
, then both vectors will be empty.
Mooncake.BasicBlockCode.remove_unreachable_blocks!
— Methodremove_unreachable_blocks!(ir::BBCode)::BBCode
If a basic block in ir
cannot possibly be reached during execution, then it can be safely removed from ir
without changing its functionality. A block is unreachable if either:
- it has no predecessors and it is not the first block, or
- all of its predecessors are themselves unreachable.
For example, consider the following IR:
julia> ir = Mooncake.ircode(
Any[Core.ReturnNode(nothing), Expr(:call, sin, 5), Core.ReturnNode(Core.SSAValue(2))],
Any[Any, Any, Any],
);
There is no possible way to reach the second basic block (lines 2 and 3). Applying this function will therefore remove it, yielding the following:
julia> Mooncake.IRCode(Mooncake.remove_unreachable_blocks!(Mooncake.BBCode(ir)))
1 1 ─ return nothing
In the blocks which have not been removed, there may be references to blocks which have been removed. For example, the edge
s in a PhiNode
may contain a reference to a removed block. These references are removed in-place from these remaining blocks, so this function will (in general) modify ir
.
Mooncake.BasicBlockCode.seed_id!
— Methodseed_id!()
Set the global counter used to ensure ID uniqueness to 0. This is useful when you want to ensure determinism between two runs of the same function which makes use of ID
s.
This is akin to setting the random seed associated to a random number generator globally.
Mooncake.BasicBlockCode.sort_blocks!
— Methodsort_blocks!(ir::BBCode)::BBCode
Ensure that blocks appear in order of distance-from-entry-point, where distance the distance from block b to the entry point is defined to be the minimum number of basic blocks that must be passed through in order to reach b.
For reasons unknown (to me, Will), the compiler / optimiser needs this for inference to succeed. Since we do quite a lot of re-ordering on the reverse-pass of AD, this is a problem there.
WARNING: use with care. Only use if you are confident that arbitrary re-ordering of basic blocks in ir
is valid. Notably, this does not hold if you have any IDGotoIfNot
nodes in ir
.
Mooncake.BasicBlockCode.terminator
— Methodterminator(bb::BBlock)
Returns the terminator associated to bb
. If the last instruction in bb
isa Terminator
then that is returned, otherwise nothing
is returned.