Compiled DAG propagation: ~3x speedup over persistent DAG #226

Open
dpsanders wants to merge 9 commits into master from dag-compiled
@dpsanders

Summary

Builds on the persistent shared DAG (#225) by compiling the SharedDAG into a flat CompiledDAG instruction schedule, combining three optimizations:

  1. Symbol-keyed operations: op::Symbol (:add, :mul, etc.) instead of op::Function, enabling fast === comparison without dynamic dispatch
  2. Pre-allocated workspace: flat Vector{Interval} indexed by slot number; no per-node allocation during traversal
  3. Flat instruction array: Vector{Instruction} in topological order; the forward pass iterates forward, the backward pass iterates in reverse
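
A minimal sketch of how these three ideas fit together in a forward pass. This is not the PR's code: `Iv` is a toy interval type without outward rounding, and `Instr` and `forward!` are hypothetical names chosen for illustration (the field names follow the description above).

```julia
# Toy closed interval [lo, hi]; a stand-in for a real interval type (no outward rounding).
struct Iv
    lo::Float64
    hi::Float64
end

Base.:+(a::Iv, b::Iv) = Iv(a.lo + b.lo, a.hi + b.hi)
function Base.:*(a::Iv, b::Iv)
    p = (a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi)
    return Iv(minimum(p), maximum(p))
end

# One instruction: ws[out] = op(ws[arg1], ws[arg2]).
struct Instr
    op::Symbol
    out::Int
    arg1::Int
    arg2::Int
end

# Forward pass: walk the schedule in topological order and write results
# into the pre-allocated workspace. Ops are selected by === on Symbols.
function forward!(ws::Vector{Iv}, prog::Vector{Instr})
    for ins in prog
        a, b = ws[ins.arg1], ws[ins.arg2]
        if ins.op === :add
            ws[ins.out] = a + b
        elseif ins.op === :mul
            ws[ins.out] = a * b
        else
            error("unsupported op: $(ins.op)")
        end
    end
    return ws
end

# x*x + y*y with x in slot 1 and y in slot 2; slots 3 to 5 hold intermediates.
prog = [Instr(:mul, 3, 1, 1), Instr(:mul, 4, 2, 2), Instr(:add, 5, 3, 4)]
ws = fill(Iv(0.0, 0.0), 5)      # allocated once, reused on every call
ws[1] = Iv(1.0, 2.0)
ws[2] = Iv(0.0, 3.0)
forward!(ws, prog)
```

Because `Iv` and `Instr` are immutable plain-data structs, both vectors are stored contiguously and the loop body does no heap allocation in this toy version.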

Benchmarks

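| Problem      | ϵ           | Codegen | Persistent DAG | Compiled DAG | Improvement |
|--------------|-------------|---------|----------------|--------------|-------------|
| Unit disk 2D | single call | 1.2 μs  | 6.8 μs         | 4.7 μs       | 1.4x faster |
| Unit disk 2D | 0.1         | 294 μs  | 2.5 ms         | 1.0 ms       | 2.5x faster |
| Unit disk 2D | 0.01        | 2.5 ms  | 20 ms          | 8.8 ms       | 2.3x faster |
| Annulus 2D   | 0.1         | 1.6 ms  | 10 ms          | 4.4 ms       | 2.3x faster |
| 3D torus     | 1.0         | 12 ms   | 97 ms          | 33 ms        | 2.9x faster |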

The compiled DAG is now within ~3x of the code-generation approach (down from ~8x), while retaining full DAG inspectability, multi-constraint support, and no eval().

How it works

compile(dag::SharedDAG) walks the DAG and:

  • Assigns each node a slot index in a pre-allocated Vector{Interval} workspace
  • Converts each OperationNode into an Instruction(op::Symbol, out, arg1, arg2)
  • Shared nodes (CSE) map to the same slot, so backward contraction from multiple parents accumulates correctly via intersection
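
The slot assignment can be sketched roughly as follows. The node types `VarNode` and `OpNode` and the function `compile_sketch` are hypothetical stand-ins (the PR's own node definitions are not shown here), and only binary operations are handled.

```julia
# Hypothetical node types for illustration only.
abstract type Node end
struct VarNode <: Node
    name::Symbol
end
struct OpNode <: Node
    op::Symbol
    args::Vector{Node}
end

struct Instr
    op::Symbol
    out::Int
    arg1::Int
    arg2::Int
end

# Post-order walk: children receive slots before their parent, so the
# emitted instructions are already in topological order.
function compile_sketch(roots::Vector{<:Node})
    slots = IdDict{Node,Int}()           # node identity -> workspace slot
    prog = Instr[]
    function visit(n::Node)
        haskey(slots, n) && return slots[n]   # shared node: reuse its slot
        if n isa OpNode
            a = visit(n.args[1])
            b = visit(n.args[2])
            slot = length(slots) + 1
            slots[n] = slot
            push!(prog, Instr(n.op, slot, a, b))
            return slot
        else
            slot = length(slots) + 1
            slots[n] = slot
            return slot
        end
    end
    outs = [visit(r) for r in roots]
    return prog, slots, outs
end

x, y = VarNode(:x), VarNode(:y)
sq = OpNode(:mul, [x, x])        # x*x, shared by both constraints below
c1 = OpNode(:add, [sq, y])       # x*x + y
c2 = OpNode(:mul, [sq, y])       # x*x * y
prog, slots, outs = compile_sketch([c1, c2])
```

Here `sq` is compiled once: both `c1` and `c2` read slot 2, so any narrowing written to that slot during a backward pass is visible to both parents.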

DAGContractor and DAGSeparator now store a CompiledDAG and use it for propagation automatically.

Test plan

  • All 13 DAG test sets pass
  • All 4 original test sets pass
  • Benchmarks confirm identical box counts and improved performance

🤖 Generated with Claude Code

David Sanders and others added 9 commits April 2, 2026 03:35
…ompatibility

Update compat bounds: IntervalArithmetic 1, IntervalBoxes 0.3,
IntervalContractors 0.6, ReversePropagation 0.4, Symbolics 7.

IntervalArithmetic v1.0 follows IEEE 1788 and deliberately does not define
Base.isequal/Base.hash for Interval. This broke @register_symbolic x ∈ y::Interval
since SymbolicUtils needs isequal/hash for hash-consing. Instead of type-pirating
those methods, decompose x ∈ interval(a,b) into (x >= a) & (x <= b) at the
symbolic level.
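
As an illustration of the rewrite only, here is a sketch that performs the same decomposition on plain Julia `Expr` objects. The commit does this at the Symbolics level; `decompose_membership` is a hypothetical helper and does not use the Symbolics API.

```julia
# Rewrite  x ∈ interval(a, b)  into  (x >= a) & (x <= b)  on a plain Expr.
function decompose_membership(ex::Expr)
    if ex.head === :call && length(ex.args) == 3 && ex.args[1] === :∈
        x, rhs = ex.args[2], ex.args[3]
        if rhs isa Expr && rhs.head === :call &&
           length(rhs.args) == 3 && rhs.args[1] === :interval
            a, b = rhs.args[2], rhs.args[3]
            return :(($x >= $a) & ($x <= $b))
        end
    end
    return ex   # anything else is returned unchanged
end

ex = decompose_membership(:(x^2 + y^2 ∈ interval(0, 1)))
```

The result contains only comparisons and `&`, so nothing downstream needs `isequal` or `hash` on an `Interval` value.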

Also fix pre-existing bug in separator() where & and | used Base.intersect/union
instead of ⊓/⊔ (defined for AbstractSeparator in set_operations.jl).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-pick infrastructure changes from PR #220:
- Update GitHub Actions versions (checkout v6, setup-julia v2, cache v3, codecov v6)
- Test on Julia 1.11 instead of 1.10
- Set julia compat to 1.11
- Remove obsolete REQUIRE file (Pkg.jl era)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Switch from the old Documenter HTML backend to DocumenterVitepress for a
modern VitePress-based documentation site. Add new pages explaining
contractors/separators and the internal architecture, update index.md to
the current API, add GitHub Actions workflow for doc deployment, and
remove stale mkdocs.yml and Manifest.toml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements an explicit DAG (directed acyclic graph) for forward-backward
interval constraint propagation (HC4Revise), inspired by Schichl & Neumaier.
This provides an inspectable, iterable alternative to the existing
ReversePropagation code-generation approach.

New types: DAGContractor, DAGSeparator, ConstraintDAG
New files: src/dag/{nodes,build,propagate,contractor}.jl
Tests: test/test_dag.jl (27 tests)
Benchmarks: benchmark/bench_dag_vs_codegen.jl

Both approaches produce identical paving results. The DAG approach is
currently ~10x slower due to DAG reconstruction per call and dynamic
dispatch, with clear optimization paths documented in CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Redesign the DAG engine so the graph is built once and reused:

- SharedDAG: persistent DAG that accumulates constraints via
  add_expression!. Variable nodes and common subexpressions (CSE)
  are shared across all expressions added to the same DAG.

- Multi-constraint propagation: forward-backward passes contract
  all constraints jointly, so narrowing from one constraint
  immediately benefits others through shared variable nodes.

- DAGContractor/DAGSeparator accept an existing SharedDAG,
  allowing multiple separators to share the same graph.
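
A toy sketch of how sharing across expressions can arise. `ToyDAG` and `intern!` are hypothetical; `add_expression!` is defined here on the toy type only and is not the package's implementation. Each node is keyed by its structure, so adding an already-known subexpression returns the existing node id.

```julia
# Hypothetical toy DAG: nodes are stored once and looked up by structural key.
struct ToyDAG
    nodes::Vector{Any}        # node id -> Symbol, constant, or (op, arg ids...) tuple
    index::Dict{Any,Int}      # structural key -> node id
end
ToyDAG() = ToyDAG(Any[], Dict{Any,Int}())

# Return the id for `key`, creating the node only if it is new.
function intern!(dag::ToyDAG, key)
    get!(dag.index, key) do
        push!(dag.nodes, key)
        length(dag.nodes)
    end
end

add_expression!(dag::ToyDAG, s::Symbol) = intern!(dag, s)
add_expression!(dag::ToyDAG, c::Number) = intern!(dag, c)
function add_expression!(dag::ToyDAG, ex::Expr)
    ex.head === :call || error("only call expressions are supported")
    ids = [add_expression!(dag, a) for a in ex.args[2:end]]
    return intern!(dag, (ex.args[1], ids...))
end

dag = ToyDAG()
r1 = add_expression!(dag, :(x^2 + y^2))   # creates x, 2, x^2, y, y^2, and the sum
r2 = add_expression!(dag, :(x^2 + z))     # reuses x, 2 and x^2; adds z and the sum
```

After both calls the toy graph holds 8 nodes instead of the 11 that two separate graphs would need, because `x`, `2` and `x^2` are stored once.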

Benchmarks show ~30% speedup vs the previous rebuild-per-call
approach, and shared DAGs use 33% fewer nodes than separate ones.

Resolves Project.toml merge conflict from cherry-pick.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Compile the SharedDAG into a flat instruction array (CompiledDAG)
that operates on a pre-allocated workspace of intervals, combining
three optimizations:

1. Symbol-keyed operations instead of Function dispatch
2. Pre-allocated flat workspace indexed by slot number
3. Flat instruction array — forward iterates forward, backward reverses
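
A rough sketch of the reverse traversal, restricted to `:add` and using a toy interval type without outward rounding. `Iv`, `Instr`, `meet`, `forward!` and `backward!` are hypothetical names, not the package's API. Slot 1 has two parents, so it is narrowed by intersection from both.

```julia
# Toy closed interval [lo, hi]; a stand-in for a real interval type (no outward rounding).
struct Iv
    lo::Float64
    hi::Float64
end
Base.:+(a::Iv, b::Iv) = Iv(a.lo + b.lo, a.hi + b.hi)
Base.:-(a::Iv, b::Iv) = Iv(a.lo - b.hi, a.hi - b.lo)
# Intersection; a result with lo > hi would signal an empty (infeasible) box.
meet(a::Iv, b::Iv) = Iv(max(a.lo, b.lo), min(a.hi, b.hi))

struct Instr
    op::Symbol
    out::Int
    arg1::Int
    arg2::Int
end

function forward!(ws::Vector{Iv}, prog::Vector{Instr})
    for ins in prog
        ins.op === :add || error("this sketch only handles :add")
        ws[ins.out] = ws[ins.arg1] + ws[ins.arg2]
    end
    return ws
end

# Backward pass: same array, reverse order. For z = a + b the arguments are
# narrowed to a ∩ (z - b) and b ∩ (z - a).
function backward!(ws::Vector{Iv}, prog::Vector{Instr})
    for ins in Iterators.reverse(prog)
        ins.op === :add || error("this sketch only handles :add")
        z = ws[ins.out]
        ws[ins.arg1] = meet(ws[ins.arg1], z - ws[ins.arg2])
        ws[ins.arg2] = meet(ws[ins.arg2], z - ws[ins.arg1])
    end
    return ws
end

# Slots: 1 = x, 2 = y, 3 = w, 4 = x + y, 5 = x + w.
prog = [Instr(:add, 4, 1, 2), Instr(:add, 5, 1, 3)]
ws = [Iv(0, 10), Iv(0, 10), Iv(5, 6), Iv(0, 0), Iv(0, 0)]
forward!(ws, prog)
ws[4] = meet(ws[4], Iv(0, 2))   # impose x + y ∈ [0, 2]
ws[5] = meet(ws[5], Iv(7, 8))   # impose x + w ∈ [7, 8]
backward!(ws, prog)
```

In this run the second constraint narrows `x` to [1, 3], the first then narrows it further to [1, 2], and `y` ends at [0, 1].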

Benchmarks show ~3x speedup over the uncompiled persistent DAG,
bringing the gap with code-generation down from ~8x to ~3x while
retaining full inspectability and multi-constraint support.

| Problem         | Codegen | Compiled DAG | Ratio |
|-----------------|---------|-------------|-------|
| Unit disk ϵ=0.1 | 294 μs  | 1.0 ms      | 3.5x  |
| Annulus ϵ=0.1   | 1.6 ms  | 4.4 ms      | 2.8x  |
| 3D torus ϵ=1.0  | 12 ms   | 33 ms       | 2.8x  |

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>