Suggestions engine PR α: Bundle model + partial GIN + hybrid templates (#122) #162
Open
Eleven tasks: Bundle dataclasses → filter-field extractor → shape classifier → btree-builder → GIN-builder → signature migration (list[Bundle]) → probe → partial WHERE → hybrid dispatch → GIN coverage → CHANGES.md. Refs #122. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Foundational types for PR α's multi-index suggestion output. Frozen dataclasses so the engine output is safe to pass around and serialize. No callers yet — introduced as a separate commit to keep the refactor surface reviewable. Refs #122.
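A minimal sketch of what frozen bundle types could look like. The attribute names (`ddl`, `fields`, `status`, `rationale`) and the example DDL are assumptions for illustration, not the PR's actual definitions; only the `Bundle` and `BundleMember` names come from the commit messages.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BundleMember:
    # One concrete index suggestion inside a bundle (attribute names are hypothetical).
    ddl: str
    fields: tuple
    status: str = "new"


@dataclass(frozen=True)
class Bundle:
    # A group of indexes meant to be used together (hypothetical layout).
    name: str
    members: tuple
    rationale: str = ""


member = BundleMember(
    # Hypothetical table and index names, for illustration only.
    ddl="CREATE INDEX idx_example ON example_table (portal_type, review_state)",
    fields=("portal_type", "review_state"),
)
bundle = Bundle(name="btree-composite", members=(member,))

# asdict() recurses into nested dataclasses and tuples, so the result is JSON-ready.
serialized = json.dumps(asdict(bundle))
```

Because the dataclasses are frozen, assigning to an attribute raises `dataclasses.FrozenInstanceError`, which is what makes the output safe to share between callers.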
Pure helper that turns (query_keys, params, registry) into a structured list of (name, IndexType, operator, value) tuples. Handles virtual field expansion (effectiveRange -> effective), equality/range/equality_multi operator inference, and all the drop rules (pagination / sort / dedicated / skip / unknown). Not yet wired into suggest_indexes — subsequent tasks will use it via the new shape classifier. Refs #122.
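The extraction step could be sketched roughly as follows. This is an illustration under assumptions: the function name, the `IndexType` members, the set of dropped keys, and the rule that a virtual key always maps to a `range` operator are all hypothetical, and the dedicated/skip drop rules are left out.

```python
from enum import Enum


class IndexType(Enum):
    # Hypothetical subset of index types.
    FIELD = "FIELD"
    KEYWORD = "KEYWORD"
    DATE = "DATE"


# Virtual query keys expand to a real field name (example taken from the commit message).
VIRTUAL_FIELDS = {"effectiveRange": "effective"}
# Assumed pagination and sort keys that never become filter fields.
DROPPED_KEYS = {"sort_on", "sort_order", "b_start", "b_size"}


def extract_filter_fields(query_keys, params, registry):
    """Return (name, IndexType, operator, value) tuples for usable filters."""
    fields = []
    for key in query_keys:
        if key in DROPPED_KEYS:
            continue
        name = VIRTUAL_FIELDS.get(key, key)
        index_type = registry.get(name)
        if index_type is None:
            continue  # unknown field: drop it
        value = params.get(key)
        if key in VIRTUAL_FIELDS or (isinstance(value, dict) and "range" in value):
            operator = "range"  # assumption: virtual keys behave like ranges
        elif isinstance(value, (list, tuple)) and len(value) > 1:
            operator = "equality_multi"
        else:
            operator = "equality"
        fields.append((name, index_type, operator, value))
    return fields


registry = {
    "portal_type": IndexType.FIELD,
    "Subject": IndexType.KEYWORD,
    "effective": IndexType.DATE,
}
result = extract_filter_fields(
    ["portal_type", "Subject", "effectiveRange", "sort_on", "bogus"],
    {
        "portal_type": "Event",
        "Subject": ["AT26", "AT27"],
        "effectiveRange": "2026-04-15",
        "sort_on": "effective",
        "bogus": 1,
    },
    registry,
)
```

In this example `sort_on` and `bogus` are dropped, and `effectiveRange` comes back under the name `effective`.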
Pure helper that maps a filter-field list to one of five shape classifications (BTREE_ONLY, KEYWORD_ONLY, MIXED, TEXT_ONLY, UNKNOWN) — the routing input for the bundle dispatcher. Refs #122.
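One way the classification could work is sketched below. The five shape names come from the commit message; the type names, the set of btree-eligible types, and the rule that TEXT combined with anything else falls back to UNKNOWN are assumptions made for this example.

```python
from enum import Enum


class Shape(Enum):
    BTREE_ONLY = "BTREE_ONLY"
    KEYWORD_ONLY = "KEYWORD_ONLY"
    MIXED = "MIXED"
    TEXT_ONLY = "TEXT_ONLY"
    UNKNOWN = "UNKNOWN"


# Hypothetical index type names; the real registry may differ.
BTREE_TYPES = {"FIELD", "DATE", "BOOLEAN"}
KNOWN_TYPES = BTREE_TYPES | {"KEYWORD", "TEXT"}


def classify_shape(filter_fields):
    """Map (name, index_type, operator, value) tuples to a Shape."""
    types = {index_type for _name, index_type, _op, _value in filter_fields}
    if not types or types - KNOWN_TYPES:
        return Shape.UNKNOWN
    if "TEXT" in types:
        # Assumption: text mixed with other filters is not routed anywhere.
        return Shape.TEXT_ONLY if types == {"TEXT"} else Shape.UNKNOWN
    has_btree = bool(types & BTREE_TYPES)
    has_keyword = "KEYWORD" in types
    if has_btree and has_keyword:
        return Shape.MIXED
    return Shape.KEYWORD_ONLY if has_keyword else Shape.BTREE_ONLY


at26_fields = [
    ("portal_type", "FIELD", "equality", "Event"),
    ("review_state", "FIELD", "equality", "published"),
    ("Subject", "KEYWORD", "equality", "AT26"),
    ("effective", "DATE", "range", None),
]
at26_shape = classify_shape(at26_fields)
```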
Wraps PR 2's btree-composite + sort-covering logic into a Bundle output shape. Not yet consumed by suggest_indexes — that migration happens in a later commit. Refs #122.
Handles both T3 (plain GIN) when partial_where_terms is empty and T4 (partial GIN) when provided. Multiple KEYWORD filters produce multiple members in one bundle so the planner can BitmapAnd them. Refs #122.
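The T3/T4 split might look roughly like this. It is a sketch only: the function name, the dict layout, the index naming scheme, and the `example_table` target are hypothetical, and the real indexed expression is not shown in the PR text.

```python
def build_keyword_gin_bundle(keyword_fields, partial_where_terms=()):
    """Build one GIN member per KEYWORD field; add WHERE only when terms are given."""
    where = " AND ".join(partial_where_terms)
    members = []
    for name in keyword_fields:
        # Hypothetical index name, table, and column expression.
        ddl = f"CREATE INDEX idx_gin_{name.lower()} ON example_table USING gin ({name})"
        if where:
            ddl += f" WHERE {where}"
        members.append(
            {
                "field": name,
                "template": "T4" if where else "T3",  # T4 = partial GIN, T3 = plain GIN
                "ddl": ddl,
            }
        )
    return {"name": "keyword-gin", "members": members}


plain = build_keyword_gin_bundle(["Subject"])
partial = build_keyword_gin_bundle(
    ["Subject", "Creator"],
    ["portal_type='Event'", "review_state='published'"],
)
```

Keeping all KEYWORD members in a single bundle mirrors the commit's intent that the planner can combine them with a BitmapAnd.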
Signature gains conn=None. Return type migrates from list[dict] to list[Bundle] with BundleMember. catalog.py flattens bundles for back-compat DTML rendering and additionally exposes the full bundle structure under a new 'suggestions_bundles' key for PR β's JSON/JS consumer. Legacy _add_btree_suggestions and _add_standalone_suggestion helpers removed — their logic lives in _build_btree_bundle and _build_keyword_gin_bundle. Refs #122. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
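The back-compat flattening could be sketched as below. Plain dicts stand in for `Bundle` and `BundleMember`, and the function names and the `suggestions` key are assumptions; only the `suggestions_bundles` key is named in the commit message.

```python
def flatten_bundles(bundles):
    """Turn a list of bundles into a flat list of member dicts for legacy rendering."""
    return [
        {"bundle": bundle["name"], **member}
        for bundle in bundles
        for member in bundle["members"]
    ]


def build_payload(bundles):
    # 'suggestions' (hypothetical key) keeps the old flat shape;
    # 'suggestions_bundles' carries the full structure for the newer consumer.
    return {
        "suggestions": flatten_bundles(bundles),
        "suggestions_bundles": bundles,
    }


bundles = [
    {
        "name": "hybrid",
        "members": [
            {"field": "portal_type", "kind": "btree"},
            {"field": "Subject", "kind": "gin"},
        ],
    },
    {"name": "btree", "members": [{"field": "effective", "kind": "btree"}]},
]
payload = build_payload(bundles)
```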
Two-stage selectivity probe for partial-predicate scoping. pg_stats.most_common_vals gives zero-round-trip answers for common values; unknown values fall back to live COUNT. All results cached in a module-level dict reset per ZMI page load. Refs #122.
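The two stages can be illustrated with injected callables instead of a live connection. In PostgreSQL, `pg_stats` exposes `most_common_vals` alongside `most_common_freqs`; here `mcv_freqs` is a hypothetical callable returning those as a value-to-frequency dict, and `live_fraction` is a hypothetical stand-in for the COUNT fallback. The function names are assumptions.

```python
# Module-level cache, keyed by (field, value); assumed to be reset per page load.
_probe_cache = {}


def reset_probe_cache():
    _probe_cache.clear()


def probe_selectivity(field, value, mcv_freqs, live_fraction):
    """Return the fraction of rows matching field=value, using the cheapest source."""
    key = (field, value)
    if key in _probe_cache:
        return _probe_cache[key]
    common = mcv_freqs(field)  # stage 1: planner statistics, no extra query per value
    if value in common:
        fraction = common[value]
    else:
        fraction = live_fraction(field, value)  # stage 2: live COUNT fallback
    _probe_cache[key] = fraction
    return fraction
```

Passing the two lookups in as arguments keeps the helper testable without a database, which fits the commit's note that unit tests seed the cache independently.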
Pure helper that builds the list of AND-joined WHERE predicates for a partial index, applying the 10%-selectivity threshold (configurable via PGCATALOG_PARTIAL_SELECTIVITY_THRESHOLD env var). Excludes DATE and KEYWORD values, multi-value equality, and range operators. SQL-escapes single quotes. Refs #122.
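A rough sketch of the predicate builder follows. Two points are assumptions because the commit message does not state them: the environment variable is read as a fraction (`0.10`), and a value qualifies when it matches at most that fraction of rows. The function name and the `selectivity_of` callable are also hypothetical.

```python
import os


def build_partial_where_terms(filter_fields, selectivity_of, threshold=None):
    """Return SQL predicates suitable for AND-joining into a partial-index WHERE."""
    if threshold is None:
        # Assumption: the env var holds a fraction such as "0.10".
        threshold = float(
            os.environ.get("PGCATALOG_PARTIAL_SELECTIVITY_THRESHOLD", "0.10")
        )
    terms = []
    for name, index_type, operator, value in filter_fields:
        if index_type in ("DATE", "KEYWORD"):
            continue  # excluded index types
        if operator != "equality":
            continue  # excludes range and multi-value equality
        if selectivity_of(name, value) > threshold:
            continue  # assumed direction: keep only sufficiently selective values
        escaped = str(value).replace("'", "''")  # SQL-escape single quotes
        terms.append(f"{name}='{escaped}'")
    return terms


fields = [
    ("portal_type", "FIELD", "equality", "Event"),
    ("Title", "FIELD", "equality", "O'Brien"),
    ("Subject", "KEYWORD", "equality", "AT26"),
    ("effective", "DATE", "range", None),
    ("Language", "FIELD", "equality", "en"),
]
fractions = {"portal_type": 0.04, "Title": 0.01, "Language": 0.9}
terms = build_partial_where_terms(fields, lambda n, v: fractions[n], threshold=0.10)
where_clause = " AND ".join(terms)
```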
MIXED shape (btree + KEYWORD filters together) now produces a single Bundle containing a btree composite member plus one partial-GIN member per KEYWORD, all scoped by the same partial WHERE predicates (the planner BitmapAnds them). Closes the issue-#122 AT26 gap: portal_type=Event + review_state=published + Subject=AT26 + effective+expires range + sort_on=effective now yields btree (portal_type, review_state, effective) AND partial GIN on Subject WHERE portal_type='Event' AND review_state='published'. Dedicated KEYWORD fields (e.g. Subject) are kept in filter_fields so the MIXED shape can be detected alongside btree-eligible fields. The KEYWORD_ONLY dispatch path strips dedicated fields before calling _build_keyword_gin_bundle (avoiding duplicate suggestions), while the MIXED hybrid path passes them through, which enables partial-GIN suggestions that are genuinely new, more selective indexes. Probe cache is reset in manage_get_slow_query_stats so unit tests can seed it independently. Refs #122. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Simplifies the dedicated-keyword pass-through logic in _extract_filter_fields. No behavior change. Refs #122.
Existing plain-GIN indexes on a KEYWORD field now match as 'already_covered' for plain-GIN suggestions on the same field. Partial-GIN suggestions with STRICTER WHERE than the existing index remain 'new' — they are genuinely distinct indexes with different plan usability. _check_covered now compares WHERE clauses as predicate sets: suggested ⊆ existing = covered; else new. The plan doc had the subset direction inverted — the test is ground truth and the implementation follows it. Refs #122. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
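The predicate-set comparison described here can be sketched as follows, implementing the stated rule (suggested ⊆ existing means covered) literally. The function signature and the representation of existing indexes as `(field, where_terms)` pairs are assumptions for illustration.

```python
def check_covered(suggested_field, suggested_where, existing_indexes):
    """Return 'already_covered' or 'new' by comparing WHERE clauses as sets."""
    suggested = frozenset(suggested_where)
    for field, where_terms in existing_indexes:
        # Rule from the commit message: suggested is a subset of existing => covered.
        if field == suggested_field and suggested <= frozenset(where_terms):
            return "already_covered"
    return "new"


# An existing plain GIN index on Subject has an empty WHERE set.
existing = [("Subject", [])]

plain_status = check_covered("Subject", [], existing)
partial_status = check_covered(
    "Subject",
    ["portal_type='Event'", "review_state='published'"],
    existing,
)
```

With only a plain GIN index present, the plain suggestion is reported as covered, while the stricter partial suggestion stays new because its predicates are not contained in the empty set.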
Per final review: module-level _probe_cache/_pg_stats_cache assumes Zope's default single-thread-per-worker model. Documented the caveat and the ContextVar mitigation path if workers ever go multi-threaded. Refs #122.
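The ContextVar mitigation mentioned above might look like this. It is a sketch of the general technique using the standard library's `contextvars` module, not the PR's code; the function and variable names are hypothetical.

```python
from contextvars import ContextVar
from typing import Optional

# Each thread (or context) sees its own value, unlike a shared module-level dict.
_probe_cache_var: ContextVar[Optional[dict]] = ContextVar("probe_cache", default=None)


def get_probe_cache():
    """Return the cache for the current context, creating it on first use."""
    cache = _probe_cache_var.get()
    if cache is None:
        cache = {}
        _probe_cache_var.set(cache)
    return cache


def reset_probe_cache():
    # Replaces the cache for the current context only.
    _probe_cache_var.set({})
```

The default is `None` rather than a dict so that contexts never share one mutable default object.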
Summary
Closes the canonical slow-query suggestion gap from the #122 gap-analysis comment — the AT26-style query (`portal_type=Event + review_state=published + Subject=AT26 + effective range + sort_on=effective`) previously produced no useful suggestion because the engine thought in single btree composites and had no partial-GIN or hybrid-bundle output shape.
No DDL changes, no schema migration. Safe to roll out mid-fleet. PR β (EXPLAIN-driven grading + DTML → JSON+JS UI + opt-in ANALYZE) gets its own brainstorm and design spec when this lands — see `docs/superpowers/specs/2026-04-15-suggestions-engine-pr-beta-notes.md` for the carryover decisions.
Spec: `docs/superpowers/specs/2026-04-15-suggestions-engine-pr-alpha-design.md`
Plan: `docs/plans/2026-04-15-suggestions-engine-pr-alpha-plan.md`
Test plan
Known pre-existing flaky tests, unrelated to this PR: three `test_clear_find_and_rebuild_*` tests in `tests/test_pg_integration.py::TestMaintenanceOps` pass in isolation but fail when run consecutively as a class (state contamination between tests). Filed as a separate issue for follow-up.
Refs #122.
🤖 Generated with Claude Code