compiler/sem/mirexec

  Source   Edit

Implements a data-flow graph (=DFG) representation plus algorithms for traversing the graph in both forward or backward direction. In the abstract, this means visiting the nodes in topological-order (forward) or post-order (backward), with special handling for loops.

In the compiler, this is mainly used for control- and/or data-flow analysis, meant to propagate properties through the graph or to answer questions such as: "is point A connected to point B?", "is X initialized on all paths?", etc.

A DFG is built via the computeDfg routine. Instead of a pointer-based graph structure, the graph is represented as a sequence of instructions encoding all data- and control-flow relevant properties for a piece of code.

Types

DataFlowGraph = object
  instructions: seq[Instr]   ## sorted in ascending order by attached-to node position
  map: seq[InstrPos]         ## join ID -> instruction
  
Encodes the data-flow graph of a local program as a sequence of control- and data-flow instructions.   Source   Edit
InstrPos = int32
  Source   Edit
Opcode = enum
  opNone,                   ## no-op
  opFork,                   ## branching control-flow that cannot introduce a cycle
  opGoto,                   ## unconditional jump that cannot introduce a cycle
  opLoop,                   ## unconditional jump to the start of a loop. The start of a cycle
  opJoin,                   ## a join point for control-flow
  opUse,                    ## use of a value
  opDef,                    ## definition of a value
  opKill,                   ## end-of-life for a value
  opInvalidate,             ## all information gathered about a value becomes invalid
  opMutate, ## mutation of a value. Can be viewed as a combined 'use' +
             ## 'def'
  opConsume,                ## a value is consumed. This is effectively a 'use' + 'kill'
  opDestroy,                ## a location's value is destroyed
  opMutateGlobal             ## an unspecified global is mutated
The opcode of a data-/control-flow instruction, representing edges and nodes in the graph.   Source   Edit
Subgraph = Slice[InstrPos]
Represents a sub-graph within the data-flow graph. Internally, this is a span of instructions.   Source   Edit
TraverseState = object
  exit*: bool ## used to communicate to ``traverse`` that the active path
              ## should be "killed" (not followed further). Reset to `false`
              ## whenever control is passed back to the iterator.
              ## When traversal is finished, `exit` is set to 'true'
              ## if traversal reached the end of the given span, 'false'
              ## otherwise
  escapes*: bool             ## whether control-flow escapes the given span
  
  Source   Edit

Procs

func `$`(c: DataFlowGraph): string {....raises: [], tags: [].}
Renders the instructions of c as a human-readable text representation   Source   Edit
func change(dfg: var DataFlowGraph; instrs: openArray[InstrPos]; to: Opcode) {.
    ...raises: [], tags: [].}
Changes all data-flow instructions identified by instrs to use the to opcode.   Source   Edit
func computeDfg(tree: MirTree): DataFlowGraph {....raises: [KeyError], tags: [].}
Computes the data-flow graph for the given tree. This is a moderately expensive operation. The cost is due to having to materialize all data- flow operation for a given tree.   Source   Edit
func find(dfg: DataFlowGraph; n: NodePosition): InstrPos {....raises: [], tags: [].}
Returns the first data-/control-flow operation associated with n. If none are associated with n, the closest following (in terms of attached-to node position) operation is returned.   Source   Edit
func subgraphFor(dfg: DataFlowGraph; span: Slice[NodePosition]): Subgraph {.
    ...raises: [], tags: [].}
Computes a reference to the sub-graph encompassing the span of MIR instructions.   Source   Edit

Iterators

iterator instructions(dfg: DataFlowGraph): (InstrPos, Opcode, OpValue) {.
    ...raises: [], tags: [].}
Returns all data-flow operations in order of appearance together with their position.   Source   Edit
iterator traverse(c: DataFlowGraph; span: Subgraph; start: InstrPos;
                  state: var TraverseState): (DataFlowOpcode, OpValue) {.
    ...raises: [], tags: [].}

Starts at the data-flow operation closest to start and traverses/yields all data-flow operations inside span in control-flow order. Outside of loops, this means that an operation is visited before operations that have a control-flow dependency on it. If start is not part of span, nothing is returned and state.exit is set to 'false'.

The same basic block may be yielded multiple times. This is not a general limitation, but rather because of a shortcut taken by the implementation.

state is used for bi-directional communication -- see the documentation of TraverseState for more information.

  Source   Edit
iterator traverseFromExits(c: DataFlowGraph; span: Subgraph; exit: var bool): (
    DataFlowOpcode, OpValue) {....raises: [], tags: [].}

Similar to traverseReverse, but starts traversal at each unstructured exit of span. Here, unstructured exit means that the control-flow leaves span via a 'goto' or 'fork'.

For the algorithm to work correctly, it is important that span does not cross a loop. That is, both start and the end of loop need to be present in span.

exit works the same way as it does for traverseReverse

  Source   Edit
iterator traverseReverse(c: DataFlowGraph; span: Subgraph; start: InstrPos;
                         exit: var bool): (DataFlowOpcode, OpValue) {.
    ...raises: [], tags: [].}

Starts at start - 1 and visits and returns all data-flow operations inside span in post-order.

span being empty is supported: nothing is returned in that case.

Similar to traverse, exit is used, with the same meaning, for bi-directional communication.

  Source   Edit