fixes #414#590
Conversation
disteph
left a comment
There was a problem hiding this comment.
I moved the trick of replacing uninterpreted symbols of unit types by the unique value of the type, out of the term-construction function and into the CDCLT internalization process. This is necessary over not doing the trick at all, because of function types that are unit (the CDCLT e-graph doesn't seem to know that you can't have two distinct terms in that type). However, MCSAT no longer benefits from the trick. But MCSAT has a larger problem of not being aware of cardinality constraints on finite function types. It happily says SAT on 5 pairwise distinct functions of type Bool -> Bool. So I made MCSAT refuse to assert formulas that contain equalities on finite functional types.
MCSAT Function-Disequality FindingsThis note records the experiments run with the PR590 unsupported-theory guard temporarily removed. The tests were run on The same four tests were rerun on Finite Function Sort:
|
There was a problem hiding this comment.
Review provided by Codex/GPT5.5 High.
I concur.
Findings
No blocking findings.
I did not find a correctness issue in the branch diff against origin/master at 485acafd. The MCSAT guard now matches the observed unsafe cases: finite-domain function sorts, unit-codomain function sorts, and downward closure through function domains/ranges, tuples, and type instances. The regression harness change for solver-specific .mcsat.gold / .dpllt.gold files is scoped to tests/regress/both and preserves shared .gold behavior by default.
Open Questions
The main residual risk is intentional, not a defect: PR590 remains an over-approximating temporary guard. It rejects some formulas that could be simplified or solved soundly, for example benign equalities over guarded function sorts. That matches the stated goal of refusing inputs that require missing MCSAT function-disequality monitoring.
Review Summary
The unit-type change in src/context/context.c compensates for making unit-typed uninterpreted terms fresh again in src/terms/term_manager.c, preserving CDCL(T) singleton semantics during internalization.
The MCSAT guard in src/mcsat/preprocessor.c correctly applies to both EQ_TERM and DISTINCT_TERM, and the new SMT2 regressions cover the exact master wrong-sat reproducers under both solver modes.
Tests Run
git diff --check origin/master...HEAD
tests/regress/check.sh tests/regress/both build/arm-apple-darwin23.6.0-debug/bin
Result: Pass: 11, Fail: 0
build/arm-apple-darwin23.6.0-debug/bin/issue_414_mcsat
Result: pass
build/arm-apple-darwin23.6.0-debug/bin/issue_414
Result: pass
|
Note: "I made MCSAT refuse to assert formulas that contain equalities on finite functional types" -> I generalized the guard since (i) the search was also incorrect when having equalities or disequalities on function types with finite domains and infinite co-domains (see example Bool -> Int above); equalities/disequalities on D->C implicitly generate equalities/disequalities on C and D (so there a downward closure of the guard I added too). |
Address four findings from the PR 590 code review that concern the MCSAT uninterpreted-function model-reconstruction path. H2: uf_plugin_get_function_id_from_trail relied on q_get32 succeeding and called `assert(ok); (void) ok;`. In release builds a q_get32 failure would leave `id` uninitialized, silently corrupting the function_term, function_value, and app_term_first hashmap lookups keyed on it. Convert both uf_plugin_get_function_id_from_trail and its wrapper uf_plugin_get_function_id to bool returns with an int32_t* out-param so every caller can fail closed. Update the five call sites to take sound fallbacks (sort fallback in compare_app_terms_by_function; early return, continue, or break in the model-builder loops; null_value propagation out of uf_model_builder_get_function_value). M1: The `make_fresh_function == null_value` branch in uf_model_builder_get_function_value previously fell back to vtbl_mk_function with a default codomain value. The old comment acknowledged this collapsed function values and that the MCSAT preprocessor guard was supposed to prevent the case from arising. Replace the silent fallback with assert(false) in debug builds and a fail-closed null_value return in release builds (arguments is already initialized at that point, so delete_ivector is safe). The top-level uf_plugin_build_model now guards model_map_term on null_value, so an incomplete model is emitted instead of a wrong-SAT one if the guard ever has a hole. H3: Upgrade the cycle-handling comment on the early `find->val = f_value` write in uf_model_builder_get_function_value. The previous one-liner only said "avoid recursing forever". The new block documents why such cycles occur, notes that the cached value at cycle time is the pre-update base (so the emitted model may not be extensional on self-referential pairs), and flags that a full fix would need a visited-set plus a theory-conflict path. L4: get_function_application_type is called with NULL_TERM for the div/mod arms but requires a real function term for APP_TERM and UPDATE_TERM. Add a header comment documenting this invariant and assert(f != NULL_TERM) in the APP_TERM/UPDATE_TERM arm so the coupling is no longer silent. No behavior change on guard-covered inputs (the only path the preprocessor currently allows to reach these sites). The fail-closed branches are defensive only. Refs: #414, PR #590.
PR 590 introduced canonicalization in internalize_to_eterm so that any term whose type is unit gets remapped to a single representative term. At the two mapping sites the existing code only mapped `root` if it was still free. This was sound when reached fresh, but if a recursive internalization had already mapped `root` to a different egraph occurrence, the inconsistency would silently propagate. Add a defensive assert in the !is_free branch at both sites: when `root != r` and `root` is already mapped, the existing mapping must agree with the egraph occurrence we are about to return. Cheap insurance against future changes that route additional kinds of terms through this path. No behavior change in correctly canonicalized inputs. Refs: PR #590 review finding M2.
PR 590 added an identical static `copy_update_maps` helper to both src/io/concrete_value_printer.c and src/frontend/smt2/smt2_printer.c. The helper expands an update via vtbl_expand_update and copies the resulting map ids out of the shared scratch buffer table->hset1 into a private array, so that recursive printing of value subtrees cannot clobber the iteration mid-loop. Promote the helper to vtbl_copy_update_maps in the value-table API (src/model/concrete_values.h, src/model/concrete_values.c) and replace both static duplicates with calls into it. The new function lives next to vtbl_expand_update and carries a header comment that documents the hset1 reentrancy hazard and the caller's safe_free obligation. Net: -27 lines in smt2_printer.c, -33 lines in concrete_value_printer.c, +44 lines in the model/concrete_values pair. No behavior change. All five call sites still safe_free the returned array. Refs: PR #590 review finding M3.
PR 590 disabled four MCSAT regressions whose inputs are now rejected by
the new finite-function-equality preprocessor guard. Add a README at
tests/regress/mcsat/README.md that:
- explains the .smt2.disabled file-suffix convention used in this
subtree (inputs are preserved rather than deleted so they can be
re-enabled when the underlying capability is restored),
- lists every test disabled by commit 48f2e4a with its rejected
type/shape, the guard branch that rejects it, and the work item
that should re-enable it (the finite-function extensionality plans
at the repository root), and
- notes that older .smt2.disabled files in this subtree predate
PR 590 and were disabled for unrelated reasons.
Going forward, anyone disabling a test in this directory is expected
to add a row with a one-line reason and a re-enable condition; anyone
re-enabling should remove the corresponding row.
Refs: PR #590 review finding M4, issue #414.
The four review-driven commits earlier in this PR added defensive
fail-closed branches that are unreachable on supported inputs:
- src/mcsat/uf/uf_plugin.c
* uf_plugin_get_function_id_from_trail's q_get32 failure path
* uf_plugin_build_app_model_compare's both-ids-failed fallback
* uf_model_builder_remember_function_term's early return
* uf_model_builder_get_function_value's early return on id lookup
* uf_model_builder_get_function_value's make_fresh_function ==
null_value branch (rejected by the preprocessor guard)
* uf_model_builder_prepare's continue on id lookup
- src/context/context.c
* two internalize_to_eterm consistency assert()s that fire only
when root was already mapped during recursive internalization
All of these are never taken in the gcov regression run (either
guarded out by term_needs_function_diseq_guard in preprocessor.c, or
structurally impossible because function ids are allocated as small
non-negative integers by uf_plugin_decide), so they show up as
uncovered lines and regress PR 590's coveralls score.
Wrap each with LCOV_EXCL_LINE (single-line branches) or
LCOV_EXCL_START / LCOV_EXCL_STOP (multi-line blocks). The hardening
itself is preserved; the markers only tell lcov not to count these
deliberately unreachable lines against the coverage metric.
No behavior change.
Refs: PR #590, coveralls build 79245430.
|
Code review by Windsurf/Opus4.7 high Code review: PR 590 (master..fix-iss-414, 13 commits, 38 files, +717/-103)ScopeFixes issues #414 and #613 along five axes:
The vtbl_copy_update_maps extraction into FindingsMediumM1 — Self-referential cycles may emit non-extensional models. M2 — Unbounded recursion in type_needs_function_diseq_guard. LowL1 — DISTINCT guard re-checks every argument. L2 — Unit-codomain function equality is trivially decidable but rejected. L3 — Consistency asserts in internalize_to_eterm are debug-only. L4 — H3 cycle path has no regression. The known-limitation branch in InfoI1 — Track degraded MCSAT answers alongside disabled tests. I2 — API tests are thorough. VerificationPre-existing Note on the disabled testsI initially flagged the guard's rejection of RecommendationApprove. The four review-driven hardening commits ( |
|
Code review by Codex/GPT5.5 Findings No findings. I reviewed the current Summary The PR now looks mergeable from my review pass. The unit-type uninterpreted-term change is covered by CDCL(T) internalization canonicalization and API regressions. The MCSAT guard now covers both equality and Residual risk is intentional scope: four MCSAT regressions are disabled behind the temporary finite-function guard, but they are documented in Tests Run
|
fixes #414