|
5 | 5 |
|
6 | 6 | ## Goal |
7 | 7 |
|
8 | | -Detect primitive reduction rules whose overhead is dominated (equal or worse) by a composite path through other rules. Prevents adding redundant rules and documents existing redundancies. |
| 8 | +Detect primitive reduction rules whose symbolic overhead is asymptotically no better than a composite path through other rules. |
| 9 | + |
| 10 | +This analysis must be **sound but incomplete**: |
| 11 | +- Report `dominated` only when the symbolic comparison is trustworthy |
| 12 | +- Report `unknown` when the metadata is too weak to compare safely |
| 13 | +- Never claim domination from incomplete metadata |
| 14 | + |
| 15 | +## Scope |
| 16 | + |
| 17 | +- Analyze exact variant-level reductions, not just base problem names |
| 18 | +- Use current symbolic `ReductionOverhead` metadata where it is trustworthy |
| 19 | +- Exclude or mark `unknown` any path that relies on incomplete symbolic metadata |
| 20 | + |
| 21 | +## Non-Goal |
| 22 | + |
| 23 | +This utility does **not** try to prove dominance for every path in the graph. |
| 24 | + |
| 25 | +In particular, `ILP -> QUBO` is a valid reduction but its symbolic overhead is currently incomplete: the implementation adds slack bits based on constraint coefficients and right-hand sides, while the exposed symbolic size fields only include `num_vars` and `num_constraints`. That edge must therefore be treated as `unknown` for shortcut detection. |
9 | 26 |
|
10 | 27 | ## Location |
11 | 28 |
|
12 | | -`src/rules/analysis.rs` — new module alongside `graph.rs`, `registry.rs`, `cost.rs`. |
| 29 | +Add `src/rules/analysis.rs` and export it from `src/rules/mod.rs`. |
| 30 | + |
| 31 | +As with other rule modules, wire tests via: |
| 32 | + |
| 33 | +```rust |
| 34 | +#[cfg(test)] |
| 35 | +#[path = "../unit_tests/rules/analysis.rs"] |
| 36 | +mod tests; |
| 37 | +``` |
13 | 38 |
|
14 | | -## Types |
| 39 | +## Core Types |
15 | 40 |
|
16 | 41 | ```rust |
17 | | -/// Asymptotic growth rate classification. |
18 | | -/// Ordered: Constant < Logarithmic < Polynomial(k) < Exponential(c) < SuperExponential. |
19 | | -#[derive(Debug, Clone, PartialEq, PartialOrd)] |
20 | | -enum GrowthRate { |
21 | | - Constant, |
22 | | - Logarithmic, |
23 | | - Polynomial(f64), // degree k |
24 | | - Exponential(f64), // base c |
25 | | - SuperExponential, |
| 42 | +/// Exact identity of a primitive reduction rule. |
| 43 | +#[derive(Debug, Clone, PartialEq, Eq, Hash)] |
| 44 | +struct RuleKey { |
| 45 | + source_name: &'static str, |
| 46 | + source_variant: BTreeMap<String, String>, |
| 47 | + target_name: &'static str, |
| 48 | + target_variant: BTreeMap<String, String>, |
| 49 | + module_path: &'static str, |
26 | 50 | } |
27 | 51 |
|
28 | | -/// A primitive rule dominated by a composite path. |
| 52 | +/// Result of comparing one primitive rule against one composite path. |
| 53 | +#[derive(Debug, Clone, PartialEq, Eq)] |
| 54 | +enum ComparisonStatus { |
| 55 | + Dominated, |
| 56 | + NotDominated, |
| 57 | + Unknown, |
| 58 | +} |
| 59 | + |
| 60 | +/// A primitive rule proven dominated by a composite path. |
| 61 | +#[derive(Debug, Clone)] |
29 | 62 | struct DominatedRule { |
30 | | - source: String, |
31 | | - target: String, |
| 63 | + primitive: RuleKey, |
32 | 64 | primitive_overhead: ReductionOverhead, |
33 | 65 | dominating_path: ReductionPath, |
34 | 66 | composed_overhead: ReductionOverhead, |
35 | | - comparison: Vec<(String, Ordering)>, // per-field ordering |
| 67 | + comparable_fields: Vec<String>, |
| 68 | +} |
| 69 | + |
| 70 | +/// A candidate comparison that could not be decided soundly. |
| 71 | +#[derive(Debug, Clone)] |
| 72 | +struct UnknownComparison { |
| 73 | + primitive: RuleKey, |
| 74 | + candidate_path: ReductionPath, |
| 75 | + reason: String, |
36 | 76 | } |
37 | 77 | ``` |
38 | 78 |
|
| 79 | +## Trust Model |
| 80 | + |
| 81 | +### 1. Primitive candidates come from `ReductionEntry`, not graph edges |
| 82 | + |
| 83 | +The utility should iterate `inventory::iter::<ReductionEntry>` so each primitive rule keeps its own identity (`module_path`, exact source variant, exact target variant). |
| 84 | + |
| 85 | +This avoids conflating: |
| 86 | +- `KSatisfiability<K2> -> Satisfiability` |
| 87 | +- `KSatisfiability<K3> -> Satisfiability` |
| 88 | + |
| 89 | +which share the same base names but are distinct variant-level rules. |
| 90 | + |
| 91 | +### 2. The reduction graph is used only for variant-level path enumeration |
| 92 | + |
| 93 | +For a primitive rule `(source_variant, target_variant)`, use: |
| 94 | + |
| 95 | +```rust |
| 96 | +graph.find_all_paths(source_name, &source_variant, target_name, &target_variant) |
| 97 | +``` |
| 98 | + |
| 99 | +and discard direct paths (`len() == 1`). |
| 100 | + |
| 101 | +### 3. Comparisons are allowed only on trustworthy symbolic paths |
| 102 | + |
| 103 | +Add: |
| 104 | + |
| 105 | +```rust |
| 106 | +fn path_is_symbolically_trustworthy(path: &ReductionPath) -> Result<(), String>; |
| 107 | +``` |
| 108 | + |
| 109 | +This validates that every edge on the path has symbolic overhead suitable for comparison. |
| 110 | + |
| 111 | +Initial explicit exclusion: |
| 112 | +- `ILP -> QUBO` |
| 113 | + |
| 114 | +Reason: |
| 115 | +- the implementation adds slack bits per inequality constraint |
| 116 | +- slack growth depends on coefficients and rhs values |
| 117 | +- current symbolic metadata does not encode enough information to reconstruct that growth |
| 118 | + |
| 119 | +If a candidate path contains this edge, the comparison result is `Unknown`, not `Dominated` and not `NotDominated`. |
| 120 | + |
| 121 | +## Expression Comparison |
| 122 | + |
| 123 | +### Restricted objective |
| 124 | + |
| 125 | +The detector only needs to compare **reduction overhead expressions**, not solver complexity expressions. |
| 126 | + |
| 127 | +Current reduction overhead formulas are intended to be simple symbolic size maps, so the comparison logic should be intentionally narrower and more conservative than a full asymptotic algebra engine. |
| 128 | + |
| 129 | +### Supported expression family |
| 130 | + |
| 131 | +Implement comparison only for expressions that can be interpreted as sums of polynomial-like terms: |
| 132 | +- constants |
| 133 | +- variables |
| 134 | +- addition |
| 135 | +- multiplication |
| 136 | +- powers with non-negative constant exponents |
| 137 | +- parser-lowered negation (`-1 * expr`) |
| 138 | +- constant factors multiplying a term |
| 139 | + |
| 140 | +Unsupported forms become `Unknown`: |
| 141 | +- `exp` |
| 142 | +- `log` |
| 143 | +- `sqrt` |
| 144 | +- division / negative exponents |
| 145 | +- any expression that cannot be normalized into the supported family |
| 146 | + |
| 147 | +This matters because the parser lowers: |
| 148 | +- subtraction to `a + (-1) * b` |
| 149 | +- division to `a * b^-1` |
| 150 | + |
| 151 | +So the comparison code must explicitly treat constant multipliers and sign changes as harmless constant factors, while still rejecting true rational or transcendental growth. |
| 152 | + |
| 153 | +### Comparison rule |
| 154 | + |
| 155 | +For each output field present in both overheads: |
| 156 | +1. Normalize each expression into a supported multivariate polynomial-like form |
| 157 | +2. If either expression cannot be normalized, return `Unknown` |
| 158 | +3. Compare asymptotic growth on that field using the normalized multivariate form |
| 159 | +4. Require `composite <= primitive` on every comparable field |
| 160 | +5. Require at least one field to be strictly smaller, or all equal |
| 161 | + |
| 162 | +If any required field is `Unknown`, the whole primitive-vs-path comparison is `Unknown`. |
| 163 | + |
| 164 | +Do **not** reduce a multivariate expression to the maximum single-variable growth rate. That loses information for terms such as: |
| 165 | +- `num_vertices * num_edges` |
| 166 | +- `num_vertices + num_edges` |
| 167 | + |
| 168 | +which are not asymptotically equivalent. |
| 169 | + |
39 | 170 | ## Key Functions |
40 | 171 |
|
41 | | -### `Expr::growth_rate(var: &str) -> Result<GrowthRate, String>` |
42 | | - |
43 | | -Recursively classify an expression's growth w.r.t. a single variable. |
44 | | - |
45 | | -Rules (recursive — classify children first, then combine): |
46 | | - |
47 | | -| Expression | Child classifications | Result | |
48 | | -|---|---|---| |
49 | | -| `Const(_)` | — | `Constant` | |
50 | | -| `Var(v)` where `v == var` | — | `Polynomial(1.0)` | |
51 | | -| `Var(v)` where `v != var` | — | `Constant` | |
52 | | -| `Add(a, b)` | any | `max(a, b)` | |
53 | | -| `Mul(a, b)` | both `Poly(k1, k2)` | `Polynomial(k1 + k2)` | |
54 | | -| `Mul(a, b)` | one `Exp`, one `Poly` | `Exponential(same base)` | |
55 | | -| `Mul(a, b)` | otherwise | `Error` | |
56 | | -| `Pow(base, exp)` | `Poly(k)`, `Const(c)` | `Polynomial(k * c)` | |
57 | | -| `Pow(base, exp)` | `Const(c)`, `Poly(k)` | `Exponential(c)` | |
58 | | -| `Pow(base, exp)` | `Const(c)`, `Logarithmic` | `Polynomial(c.ln())` | |
59 | | -| `Pow(base, exp)` | `Poly`, `Poly` | `SuperExponential` | |
60 | | -| `Pow(base, exp)` | otherwise | `Error` | |
61 | | -| `Exp(Log(inner))` | — | `growth_rate(inner)` (cancellation) | |
62 | | -| `Exp(inner)` | `Constant` | `Constant` | |
63 | | -| `Exp(inner)` | `Poly(k)`, k >= 1 | `Exponential(e)` | |
64 | | -| `Exp(inner)` | `Exponential` | `SuperExponential` | |
65 | | -| `Exp(inner)` | otherwise | `Error` | |
66 | | -| `Log(inner)` | `Constant` | `Constant` | |
67 | | -| `Log(inner)` | `Polynomial(_)` | `Logarithmic` | |
68 | | -| `Log(inner)` | `Exponential(c)` | `Polynomial(1.0)` | |
69 | | -| `Log(inner)` | otherwise | `Error` | |
70 | | -| `Sqrt(inner)` | `Polynomial(k)` | `Polynomial(k / 2.0)` | |
71 | | -| `Sqrt(inner)` | otherwise | `Error` | |
72 | | - |
73 | | -Returns `Err` for unclassifiable expressions rather than guessing. |
74 | | - |
75 | | -### `compare_overhead(primitive: &ReductionOverhead, composite: &ReductionOverhead) -> Option<Ordering>` |
76 | | - |
77 | | -For each common field: |
78 | | -1. Collect all variables referenced by both expressions |
79 | | -2. For each variable, compute `growth_rate` of both expressions |
80 | | -3. Take the max growth rate across all variables for each expression |
81 | | -4. Compare: composite <= primitive on all fields → `Some(Equal | Less)`; otherwise `None` |
82 | | - |
83 | | -### `find_dominated_rules(graph: &ReductionGraph) -> Vec<DominatedRule>` |
84 | | - |
85 | | -1. Iterate all edges in the variant-level graph |
86 | | -2. For each edge (u, v), call `find_all_paths(u, v)` |
87 | | -3. Filter to composite paths (len > 1) |
88 | | -4. Compose overhead via `ReductionOverhead::compose` |
89 | | -5. Compare with `compare_overhead` |
90 | | -6. Collect dominated results |
91 | | - |
92 | | -## Integration Test |
93 | | - |
94 | | -In `src/unit_tests/rules/analysis.rs`: |
95 | | -- Call `find_dominated_rules` on the global `ReductionGraph` |
96 | | -- Maintain an allow-list of known dominated rules (currently 9: 6 genuine + 3 cast-composed) |
97 | | -- **Fail if a new dominated rule appears** that is not in the allow-list |
98 | | -- **Fail if an allow-listed rule is no longer dominated** (stale allow-list) |
99 | | - |
100 | | -## Existing Infrastructure Used |
101 | | - |
102 | | -- `ReductionGraph::find_all_paths()` — path enumeration |
103 | | -- `ReductionOverhead::compose()` — chains overheads via `Expr::substitute` |
104 | | -- `Expr::is_polynomial()` — partial classification (extended by `growth_rate`) |
105 | | -- `Expr::substitute()` — variable substitution for composition |
106 | | - |
107 | | -## Known Dominated Rules (Allow-List) |
108 | | - |
109 | | -| Primitive | Best Composite | Category | |
110 | | -|---|---|---| |
111 | | -| KColoring → QUBO | KColoring → ILP → QUBO | genuine | |
112 | | -| MIS → ILP | MIS → MVC → ILP | genuine | |
113 | | -| MIS → QUBO | MIS → MVC → QUBO | genuine | |
114 | | -| MaxSetPacking → ILP | SetPacking → MIS → ILP | genuine | |
115 | | -| MVC → ILP | MVC → MinSetCovering → ILP | genuine | |
116 | | -| MVC → QUBO | MVC → MIS → QUBO | genuine | |
117 | | -| KSAT/K2 → SAT | K2 → KN → SAT | cast-composed | |
118 | | -| KSAT/K3 → SAT | K3 → KN → SAT | cast-composed | |
119 | | -| MIS/One → MIS/Kings/i32 | One→i32 → Kings/i32 | cast-composed | |
| 172 | +```rust |
| 173 | +fn rule_key(entry: &ReductionEntry) -> RuleKey; |
| 174 | + |
| 175 | +fn compare_overhead( |
| 176 | + primitive: &ReductionOverhead, |
| 177 | + composite: &ReductionOverhead, |
| 178 | +) -> ComparisonStatus; |
| 179 | + |
| 180 | +fn find_dominated_rules( |
| 181 | + graph: &ReductionGraph, |
| 182 | +) -> (Vec<DominatedRule>, Vec<UnknownComparison>); |
| 183 | +``` |
| 184 | + |
| 185 | +### `find_dominated_rules` |
| 186 | + |
| 187 | +Algorithm: |
| 188 | +1. Iterate primitive `ReductionEntry`s from inventory |
| 189 | +2. Build the primitive `RuleKey` |
| 190 | +3. Enumerate all composite variant-level paths with the same exact source and target variant |
| 191 | +4. Skip direct paths |
| 192 | +5. For each composite path: |
| 193 | + - validate `path_is_symbolically_trustworthy` |
| 194 | + - compose symbolic overhead via `graph.compose_path_overhead(path)` |
| 195 | + - compare with `compare_overhead` |
| 196 | +6. Record: |
| 197 | + - the best proven dominating path in `Vec<DominatedRule>` |
| 198 | + - undecidable candidates in `Vec<UnknownComparison>` |
| 199 | + |
| 200 | +Path ranking: |
| 201 | +- prefer fewer edges |
| 202 | +- then prefer lexicographically smaller `ReductionPath::steps` |
| 203 | + |
| 204 | +This keeps results deterministic across runs. |
| 205 | + |
| 206 | +## Practical Invariant |
| 207 | + |
| 208 | +`ReductionGraph` currently coalesces exact variant pairs into a single graph edge. The design therefore assumes: |
| 209 | + |
| 210 | +- there is at most one registered primitive reduction per exact `(source variant, target variant)` pair |
| 211 | + |
| 212 | +Add a dedicated test for that invariant. If the invariant stops holding, shortcut analysis must move from graph-level path enumeration to a registry-level multigraph. |
| 213 | + |
| 214 | +## Test Plan |
| 215 | + |
| 216 | +Add `src/unit_tests/rules/analysis.rs`. |
| 217 | + |
| 218 | +Tests: |
| 219 | + |
| 220 | +1. `test_redundant_rule_detection_is_sound` |
| 221 | +- Run `find_dominated_rules(&ReductionGraph::new())` |
| 222 | +- Compare the dominated set against an allow-list keyed by `RuleKey` |
| 223 | +- The allow-list must be exact-variant and feature-aware |
| 224 | + |
| 225 | +2. `test_redundant_rule_detection_reports_unknown_for_ilp_qubo_paths` |
| 226 | +- Build at least one candidate path that includes `ILP -> QUBO` |
| 227 | +- Assert the comparison is reported as `Unknown` |
| 228 | +- Assert it is not promoted to `Dominated` |
| 229 | + |
| 230 | +3. `test_no_duplicate_primitive_rules_per_exact_variant_pair` |
| 231 | +- Iterate `ReductionEntry` inventory |
| 232 | +- Assert uniqueness of `(source_name, source_variant, target_name, target_variant)` |
| 233 | + |
| 234 | +4. `test_allow_list_is_feature_aware` |
| 235 | +- The expected dominated set must depend on whether `ilp-solver` is enabled |
| 236 | +- Do not hard-code a single unconditional count in the design |
| 237 | + |
| 238 | +## Initial Allow-List Policy |
| 239 | + |
| 240 | +The design should not hard-code a flat statement like "currently 9 dominated rules". |
| 241 | + |
| 242 | +Instead: |
| 243 | +- store exact variant-level keys |
| 244 | +- gate ILP-dependent expectations behind `#[cfg(feature = "ilp-solver")]` |
| 245 | +- exclude any candidate whose best proof path passes through `ILP -> QUBO` |
| 246 | + |
| 247 | +That means the previous `KColoring -> QUBO via KColoring -> ILP -> QUBO` shortcut is **not** part of the trusted allow-list. |
| 248 | + |
| 249 | +## Summary |
| 250 | + |
| 251 | +The redundant-rule detector should be: |
| 252 | +- variant-aware |
| 253 | +- inventory-driven |
| 254 | +- sound-but-incomplete |
| 255 | +- explicit about `Unknown` |
| 256 | + |
| 257 | +Most importantly, `ILP -> QUBO` remains a valid reduction but is **not** eligible for symbolic shortcut detection until ILP exposes richer size metadata for slack growth. |
0 commit comments