Skip to content

[Model] EnsembleComputation #215

@isPANN

Description

@isPANN

Motivation

ENSEMBLE COMPUTATION (P301) from Garey & Johnson, A11 PO9. A classical NP-complete problem useful for reductions. It asks whether a given collection of target subsets can all be computed by a short sequence of disjoint union operations, starting from singletons or previously computed intermediate sets.

Planned reduction: MinimumVertexCover → EnsembleComputation (per Garey & Johnson, Section 3.2.2 — transformation from VERTEX COVER). See also rule issue #204.

Definition

Name: EnsembleComputation
Reference: Garey & Johnson, Computers and Intractability, A11 PO9

Mathematical definition:

INSTANCE: Collection C of subsets of a finite set A, positive integer J.
QUESTION: Is there a sequence S = (z₁ ← x₁ ∪ y₁, z₂ ← x₂ ∪ y₂, ..., zⱼ ← xⱼ ∪ yⱼ) of j ≤ J union operations, where each xᵢ and yᵢ is either {a} for some a ∈ A or z_k for some k < i, such that xᵢ and yᵢ are disjoint, 1 ≤ i ≤ j, and such that for every subset c ∈ C there exists some zᵢ, 1 ≤ i ≤ j, that is identical to c?

This is a satisfaction (decision) problem. The decision framing with budget J is the canonical formulation in the literature (Garey & Johnson, Järvisalo et al. 2012). The budget J determines the number of variables, analogous to k in KColoring.

Variables

  • Count: 2 * J variables, encoding J union operations. Each operation i ∈ {1, ..., J} has two operand variables: operand1_i and operand2_i.
  • Per-variable domain: Each operand variable ranges over |A| + J values (indices 0..|A| represent singletons {a} for a ∈ A; indices |A|..|A|+J represent previously computed sets z₁..zⱼ). This gives dims() = vec![|A| + J; 2 * J].
  • Meaning: A configuration [op1_x, op1_y, op2_x, op2_y, ..., opJ_x, opJ_y] specifies the two operands for each of the J union operations.
  • evaluate(): Returns true if and only if:
    1. For each operation i, both operands refer to valid sources (singletons or z_k with k < i — i.e., operand indices ≥ |A| must satisfy the index < |A| + i),
    2. For each operation i, the sets x_i and y_i are disjoint,
    3. Every subset c ∈ C appears as some z_i in the computed sequence.

Schema (data type)

Type name: EnsembleComputation
Variants: No generic parameters needed (elements and subsets represented by integer indices); a single concrete variant.

Field Type Description
universe_size usize Number of elements in A (elements are represented as 0..universe_size)
subsets Vec<Vec<usize>> The collection C: each inner Vec lists elements of one required subset
budget usize Maximum number of union operations J allowed

Complexity

  • Decision complexity: NP-complete (Garey & Johnson, Section 3.2.2; transformation from VERTEX COVER). Remains NP-complete even if each c ∈ C satisfies |c| ≤ 3.
  • Best known exact algorithm: No specialized exact algorithm is known; brute-force enumeration of the search space is the baseline approach. Each of the 2J operand variables ranges over |A| + J values, giving a brute-force complexity of O((|A| + J)^(2J)).
  • declare_variants! complexity string: "(universe_size + budget)^(2 * budget)"
  • References:
    • [Garey & Johnson, 1979] M. R. Garey, D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, W. H. Freeman, 1979. Problem PO9, Appendix A11.
    • [Järvisalo et al., 2012] M. Järvisalo, P. Kaski, M. Koivisto, J. Korhonen, "Finding Efficient Circuits for Ensemble Computation", Proc. SAT 2012, LNCS 7317, pp. 369–382. https://doi.org/10.1007/978-3-642-31612-8_28

Extra Remark

Full book text:

INSTANCE: Collection C of subsets of a finite set A, positive integer J.
QUESTION: Is there a sequence S = (z1 ← x1 ∪ y1, z2 ← x2 ∪ y2,...,zj ← xj ∪ yj) of j ≤ J union operations, where each xi and yi is either {a} for some a ∈ A or zk for some k < i, such that xi and yi are disjoint, 1 ≤ i ≤ j, and such that for every subset c ∈ C there exists some zi, 1 ≤ i ≤ j, that is identical to c?
Reference: [Garey and Johnson, ——]. Transformation from VERTEX COVER (see Section 3.2.2).
Comment: Remains NP-complete even if each c ∈ C satisfies |c| ≤ 3. The analogous problem in which xi and yi need not be disjoint for 1 ≤ i ≤ j is also NP-complete under the same restriction.

How to solve

  • It can be solved by (existing) bruteforce: enumerate all valid sequences of ≤ J disjoint-union operations over elements of A (and previously built sets), check if every c ∈ C is produced.
  • It can be solved by reducing to integer programming.
  • Other: N/A — no other known tractable special case applies to the general problem

Example Instance

Small satisfiable instance:

  • Universe: A = {0, 1, 2, 3, 4} (5 elements)
  • Required subsets: C = {{0,1,2}, {0,1,3}, {2,3,4}}
  • Budget: J = 5

Brute-force complexity: (5 + 5)^(2 × 5) = 10^10 = 10 billion configurations (illustrates the rapid growth of the search space).

Witness sequence (j = 4 ≤ J = 5):

  • z₁ = {0} ∪ {1} = {0,1} (disjoint ✓)
  • z₂ = z₁ ∪ {2} = {0,1,2} = C[0] ✓
  • z₃ = z₁ ∪ {3} = {0,1,3} = C[1] ✓
  • z₄ = {2} ∪ {3} = {2,3} (disjoint ✓; intermediate set)
  • z₅ = z₄ ∪ {4} = {2,3,4} = C[2] ✓

All 3 subsets in C appear (j = 5 ≤ J = 5 ✓). Note z₁ = {0,1} is reused by both z₂ and z₃, demonstrating sharing.

Greedy trap: Building each subset independently requires (|c|−1) operations per subset: 2 + 2 + 2 = 6 operations. Sharing the intermediate z₁ = {0,1} between C[0] and C[1] saves one operation, reducing the total to 5.

Why budget J = 4 is infeasible: With only 4 operations, we cannot produce all 3 target subsets. Building C[0] = {0,1,2} requires at least 2 operations (e.g., z₁ = {0}∪{1}, z₂ = z₁∪{2}). Building C[2] = {2,3,4} requires at least 2 operations (e.g., z₃ = {2}∪{3}, z₄ = z₃∪{4}). That's 4 operations consumed, but C[1] = {0,1,3} still needs z₁∪{3} — requiring a 5th operation. So J = 4 is a NO instance.

Suboptimal witness: Using J = 6 (budget exceeds need): build each subset independently without sharing — z₁={0}∪{1}, z₂=z₁∪{2}=C[0], z₃={0}∪{1}, z₄=z₃∪{3}=C[1], z₅={2}∪{3}, z₆=z₅∪{4}=C[2]. Valid but wasteful (j=6 > optimal j=5).

Metadata

Metadata

Assignees

No one assigned

    Labels

    GoodAn issue passed all checks.modelA model problem to be implemented.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status
    Done

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions