The MIR is a desugared, simplified, and flattened intermediate representation (=IR) of NimSkull code that is intended for analysis, transformations, and lowerings.
In the compilation pipeline, the stage using the MIR is located after semantic analysis and high-level transformation (transf) but before code generation.
NAME = <Temp> # a temporary introduced during the MIR phase | <Alias> # a temporary view introduced during the MIR phase | <Local> # a user-defined local or temporary introduce prior to the # MIR phase | <Global> # a user-defined global | <Const> # a user-defined constant LVALUE = NAME | PathNamed LVALUE <Field> # named field access (objects only) | PathPos LVALUE <Position> # positional field access (tuples only) | PathVariant LVALUE <Field> # access a variant within an object | PathConv LVALUE <Type> # access a sub- or super-type of | PathArray LVALUE (NAME | <Literal>) # access an array at given # position | Deref NAME # dereference a `ptr` or `ref` | DerefView NAME # dereference a `var` or `lent` VALUE = <Literal> | <ProcVal> | <Type> | LVALUE TARGET = <Label> EX_TARGET = <Label> | Resume UNARY_OP = NegI VALUE BINARY_OP = AddI VALUE, VALUE | SubI VALUE, VALUE | MulI VALUE, VALUE | DivI VALUE, VALUE | ModI VALUE, VALUE CALL_ARG = Arg VALUE # pass-by-value argument | Arg <none> # argument that's going to be omitted # later | Name <Effect> LVALUE # pass-by-name argument. The effect # specifies the effect the call has # on the value/location | Consume VALUE # pass-by-value argument, but # the value is consumed (i.e., moved) CONSTR_ARG = Arg VALUE | Consume OPERAND CALL_EXPR = Call <Imm> <Proc> CALL_ARG ... # a static call of the provided # procedure with the given # arguments | Call <Imm> LVALUE CALL_ARG ... # indirect call | Call <Imm> <Magic> CALL_ARG ... # a call of a magic procedure # (i.e., a procedure that is # either going to be lowered into # something else, or one for # which the behaviour cannot # be represented in the MIR) # checked calls have the same shape as normal calls. The difference # is that the call has an exceptional exit (i.e., it might raise an # exception) CHECKED_CALL_EXPR = CheckedCall <Imm> <Proc> CALL_ARG ... EX_TARGET | CheckedCall <Imm> LVALUE CALL_ARG ... EX_TARGET | CheckedCall <Imm> <Magic> CALL_ARG ... EX_TARGET SET_CONSTR_ARG = VALUE | Range VALUE VALUE # range construction BINDING = Binding <Field> CONSTR_ARG RVALUE = UNARY_OP | BINARY_OP | CALL_EXPR | CHECKED_CALL_EXPR | SetConstr SET_CONSTR_ARG ... | ArrayConstr CONSTR_ARG... | SeqConstr CONSTR_ARG... | TupleConstr CONSTR_ARG... | ClosureConstr CONSTR_ARG... | ObjConstr BINDING ... # construct an `object` | RefConstr BINDING ... # construct a `ref object` | StdConv VALUE # number conversion or conversion # between cstring and string | Conv VALUE # same as `StdConv`. Only duplicate # for legacy code generator reasons | Cast VALUE # reinterpret the value as a different # type | Addr LVALUE # create a pointer from the lvalue | View LVALUE # create a view (`var`/`lent`) of the # lvalue | MutView LVALUE | ToSlice VALUE # create an `openArray` slice of # the full sequence | MutToSlice LVALUE | ToSlice VALUE, VALUE, VALUE # create an `openArray` slice from the # first operand, starting at the lower # bound (second parameter) and ending # at the upper bound (inclusive, third # parameter) | MutToSlice LVALUE, VALUE, VALUE ASGN_SRC = RVALUE | VALUE | Copy VALUE | Move LVALUE | Sink LVALUE SHALLOW_SRC = RVALUE | VALUE STATEMENT = Scope # starts a new scope, which # delimits the lifetime of all # definitions within | EndScope # close the current scope | Def NAME none # definition | Def NAME ASGN_SRC # definition + initial value assignment | DefCursor NAME # definition of non-owning location | DefCursor NAME SHALLOW_SRC# definition of non-owning location + # initial (shallow copy) assignment | Bind <Alias> LVALUE # bind the lvalue to the given alias. # May be used for mutation, but must # not be used as an assignment's # destination or `Tag`'s operand | BindMut <Alias> LVALUE # bind the lvalue to the given alias. # The alias may be used as an # assignment's destination or as a # `Tag`'s operand | Void LVALUE # evaluates the lvalue for side-effects # and acts as a usage of the lvalue # during data-flow analysis | Void CALL_EXPR # represents a void call. The called # procedure or magic *must* have a # `void`` return type | Asgn LVALUE ASGN_SRC # normal assignment of the right value # to the left location | Init LVALUE ASGN_SRC # initial assignment (the destination # is empty) | Switch LVALUE ASGN_SRC # changes the active branch of a # variant. Unclear semantics. | If VALUE <Label> # fall through if the value evaluates # to true, otherwise jump to the if's # corresponding end | Case VALUE BRANCH_LIST # dispatch to one of the branches based # on the value, where value must be # either of integer, float, or string # type | Goto TARGET | Loop <Label> # unconditional jump back to the start # of a loop | Destroy LVALUE | Raise EX_TARGET | Join <Label> # join point for non-exceptional # control-flow (e.g., goto) | LoopJoin <Label> # join point for `Loop` | Except <Type> ... EX_TARGET | Except <Local> EX_TARGET | Except # catch-all handler | Finally <Label> | Continue EX_TARGET | End <Label> # marks the end of an if, repeat, or # except | Emit VALUE ... | Asm VALUE ... BRANCH_LABEL = <Literal> | <Const> | Range <Literal> <Literal> BRANCH_LIST = (Branch BRANCH_LABEL ... TARGET) ... # a list of branches
Only allowing calls, conversions, casts, etc. as the source operand (i.e., on the right) of an assignment makes sure that they always have named receivers, which:
If a call raises an exception, whether the assignment destination is written to is currently undefined.
PType is currently re-used for representing types at the MIR stage. In the future, the usage of PType will be replaced with simplified and more data- oriented version that betters supports lowering and traversal.
Lvalues passed to by-name parameters may optionally be tagged with an effect. The effect describes what may happen to the underlying location during execution of the procedure. If no effects are specified, the location may only be read from during execution of the procedure.
This information is intended for use by data-flow analysis and code generators.
Terminology:
A basic block is started by Finally and Except. Terminators are: Case, Goto, Raise, Continue, and Loop. If, Join, and LoopJoin act as both the start and end of a basic block. The nature of End depends on the associated construct:
Except for Loop, all terminators only allow forward control-flow.
Each If and Except must be paired with exactly one End, each LoopJoin with a Loop, and each Finally with a Continue. These are the structured constructs, and they must not overlap each other, meaning that:
if x (L1) if y (L2) end L1 end L2
is not allowed. However, much like in the high-level language, structured constructs can be nested.
Resume is a special jump target that may only appear as the target of Raise, Continue, CheckedCall, and Except. It specifies that unwinding/exception-handling resumes in the caller procedure.
Except represents an exception handler. If types or a local is specified, the section is only entered if the run-time type of the active exception matches the section's filters. If there's a match, execution continues with the statement following the Except, otherwise it continues at the target specified by the Except.
A Finally section is similar to an Except section, but with a few important differences:
Unlike a NimSkull finally, an MIR Finally section is not meant for intercepting non-exceptional control-flow -- it must only be used as the target for exceptional control-flow.
At the end of Finally section must be a Continue statement, which specifies where raising the exception continues.
The MIR uses a tree-based representation similar to the AST. For easier processing and faster access, the whole code for a procedure is stored in a single sequence of nodes, with the nodes forming a tree.
MIR constant expression are stored separately from MIR trees representing routine bodies. Constant expressions describe a value not depending on any dynamic/run-time information. They use a variation/sub-set of the MIR that is better suited for statement-less trees.
The syntax is similar to that of the normal MIR, with the biggest difference being that the representation is flat (i.e., a single tree rather than multiple ones).
VALUE = <ProcVal> | <Literal> | COMPLEX ARG = Arg VALUE SET_CONSTR_ARG = <Literal> | Range <Literal> <Literal> BINDING = Binding <Field> ARG COMPLEX = SetConstr SET_CONSTR_ARG... | ArrayConstr ARG... | SeqConstr ARG... | TupleConstr ARG... | ClosureConstr ARG... | ObjConstr BINDING... | RefConstr BINDING...