Skip to content

[Rule] FEEDBACK VERTEX SET to MinimumCodeGenerationUnlimitedRegisters #908

@isPANN

Description

@isPANN

Source: FEEDBACK VERTEX SET (FeedbackVertexSet)
Target: CODE GENERATION WITH UNLIMITED REGISTERS (MinimumCodeGenerationUnlimitedRegisters)

Motivation: The previous body was only a placeholder. This issue gives the actual chain gadget behind the Aho-Johnson-Ullman reduction.

Reference: A. V. Aho, R. Sethi, "How Hard Is Compiler Code Generation?" (1977), summarizing the Aho-Johnson-Ullman construction for code generation with common subexpressions and its unlimited-register variant.

Target Machine Model

Use the standard two-address unlimited-register model:

  • leaves are values already stored in named registers

  • every internal node is a binary operation

  • if internal node u has left child a and right child b, then one instruction

    r <- r op s

    computes u, where r currently holds the value of a and s currently holds the value of b

  • the left operand register is overwritten

  • copy instructions

    r' <- r

    are allowed and cost one instruction

Because registers are unlimited, the only optimization issue is how many copy instructions are needed.

Reduction Algorithm

Input:

  • a directed graph G = (V, E)
  • an integer k

Assume
V = {x_1, ..., x_n}.

For each vertex x in V, let its outgoing edges be ordered arbitrarily as
(x, y_1), ..., (x, y_d),
where d = outdeg(x).

Construct a DAG D = (N, A) with left/right arc partition (L, R) as follows.

Leaves

For each source vertex x in V, create one leaf
x^0
whose value is initially stored in a distinct register
R_x.

Internal chain nodes

For each source vertex x with outgoing neighbors y_1, ..., y_d, create internal nodes
x^1, x^2, ..., x^d.

For each i = 1..d, add:

  • left arc (x^i, x^(i-1)) to L
  • right arc (x^i, y_i^0) to R

So each x^i has exactly two outgoing arcs.

Roots

For each vertex x with outdeg(x) > 0, the root corresponding to x is x^d.

Vertices with outdeg(x) = 0 create only the leaf x^0; no internal node is needed for them.

Budget

Let
m = |E|.

Set
K = m + k.

Output the instance
(D, L, R, K).

Interpretation of the Gadget

  • evaluating the chain for vertex x destroys the original register R_x, because the first instruction for x^1 uses R_x as a left operand
  • every incoming edge (u, x) of the source graph creates an occurrence of leaf x^0 as a right child in the chain for u
  • therefore, if x is evaluated before some predecessor u, then u can still use x^0 only if R_x was copied first

Thus copies correspond to source vertices whose values are preserved across cyclic dependencies.

Forward Witness Map

Suppose F subseteq V is a feedback vertex set of G with |F| <= k.

Define
E_F = { (u, v) in E : v notin F }.

Because F intersects every directed cycle, the digraph
(V, E_F)
is acyclic.

Choose any topological ordering
<_F
of (V, E_F).

Now construct a target program as follows.

Step 1: copy the feedback vertices

For each x in F, emit one copy instruction

S_x <- R_x

where S_x is a fresh register.

Step 2: evaluate the chains in topological order

Process the source vertices in the order <_F.

For a fixed vertex x with outgoing neighbors y_1, ..., y_d:

  • if d = 0, do nothing

  • otherwise emit the d instructions

    R_x <- R_x op T_1
    R_x <- R_x op T_2
    ...
    R_x <- R_x op T_d

where

  • T_i = S_(y_i) if y_i in F
  • T_i = R_(y_i) if y_i notin F

This computes the chain
x^1, x^2, ..., x^d
using exactly d operation instructions.

Why this is valid

If y_i notin F, then edge (x, y_i) belongs to E_F, so x <_F y_i.
Hence x is processed before y_i, and the original register R_(y_i) has not yet been destroyed when x needs it as a right operand.

If y_i in F, then x uses the preserved copy S_(y_i) instead.

Instruction count

There are exactly m internal nodes in D, so exactly m operation instructions are emitted.
There are exactly |F| copy instructions.

Hence the program length is
m + |F| <= m + k = K.

Extraction Pass

Suppose there is a target program of length at most K = m + k that computes all roots of D.

Normalization lemma

Every internal node of D has in-degree 1.
Hence no internal value is ever needed by two different parents.

Therefore, in a minimum-length program:

  • every internal node is computed exactly once
  • copying an internal value is useless
  • every extra instruction beyond the mandatory m operation instructions can be assumed to be a copy of a leaf register R_x

Let
F = { x in V : some copy of R_x is made }.

Since the total program length is at most m + k, we get
|F| <= k.

Why F is a feedback vertex set

Assume for contradiction that F is not a feedback vertex set.
Then there is a directed cycle

x_1 -> x_2 -> ... -> x_t -> x_1

in G with no vertex in F.

Fix one edge (x_i, x_(i+1)) on the cycle.
Because x_(i+1) notin F, the leaf value R_(x_(i+1)) is never copied.
The first instruction of the chain for x_(i+1) destroys R_(x_(i+1)).
So the chain for x_i, which needs x_(i+1)^0 as a right child, must be executed before the chain for x_(i+1) begins.

Thus for every i,
x_i must be evaluated before x_(i+1).

Applying this around the whole cycle yields

x_1 < x_2 < ... < x_t < x_1,

impossible.

So F intersects every directed cycle.
Therefore F is a feedback vertex set of G with |F| <= k.

Correctness

G has a feedback vertex set of size at most k
iff
D can be evaluated on an unlimited-register two-address machine using at most
|E| + k
instructions.

Size Overhead

Let

  • n = |V|
  • m = |E|

Then:

  • number of leaves = n
  • number of internal nodes = m
  • total nodes = n + m
  • total arcs = 2m
  • every internal node has out-degree 2
  • the only shared nodes are leaves

So the construction is polynomial and matches the restricted form stated in the hardness result.

Example

Source instance:

  • vertices {a, b, c}
  • edges {(a,b), (b,c), (c,a)}
  • k = 1

This is a directed 3-cycle, so a feedback vertex set of size 1 exists, e.g. {b}.

Constructed DAG:

  • leaves a^0, b^0, c^0 labeled by registers R_a, R_b, R_c
  • chain nodes
    • a^1 with left child a^0 and right child b^0
    • b^1 with left child b^0 and right child c^0
    • c^1 with left child c^0 and right child a^0

Budget:
K = |E| + k = 3 + 1 = 4.

A valid target program is:

  1. S_b <- R_b
  2. R_b <- R_b op R_c
  3. R_c <- R_c op R_a
  4. R_a <- R_a op S_b

This computes all three roots using exactly one copy instruction.

Implementation Notes

  1. The current placeholder K = |V| + k is wrong for this gadget; the correct bound is K = |E| + k.
  2. The gadget is a left-chain construction, not a one-node-per-vertex toy graph.
  3. The source graph is directed.
  4. Only leaves are shared; this is essential for the converse proof.

Metadata

Metadata

Assignees

No one assigned

    Labels

    ruleA new reduction rule to be added.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status

    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions