Enabling index scans for OID cast patterns in system views#4829
Enabling index scans for OID cast patterns in system views#4829RuchaSK1 wants to merge 21 commits into
Conversation
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
| OpExpr *clause = (OpExpr *) rinfo->clause; | ||
|
|
||
| if (list_length(clause->args) == 2 && | ||
| clause->opno == Int4EqualOperator) |
There was a problem hiding this comment.
Do we need similar handling for other operators ?
There was a problem hiding this comment.
I haven't encountered a need for it currently. All the problematic queries use equality for point lookups. I think we can extend in future if needed.
| * Enables index usage for Babelfish equality patterns like (oid)::integer = value | ||
| * that prevent index matching because the cast changes the operator's type |
There was a problem hiding this comment.
what about the other way around case, value = (oid)::integer, does this cases also exists ? if yes then do we need similar handing there as well ?
There was a problem hiding this comment.
Yes, this case exists in sys.all_sql_modules join conditions (ao.object_id = CAST(p.oid AS INT) and ao.object_id = CAST(tr.oid AS INT)). Added handling for both operand orders.
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
| } | ||
|
|
||
| /* Try right operand as the cast */ | ||
| if (!bms_is_member(index->rel->relid, rinfo->left_relids) && |
There was a problem hiding this comment.
Entire if block is replicated for both left and right op can we avoid duplication here ?
There was a problem hiding this comment.
Created helper function match_oid_cast_to_indexcol which can be called for both left and right op orders
| if (!bms_is_member(index->rel->relid, rinfo->right_relids) && | ||
| !contain_volatile_functions(rightop) && |
There was a problem hiding this comment.
!bms_is_member(index->rel->relid, rinfo->right_relids) && !contain_volatile_functions(rightop) These conditions are already validated in Engine side, why do we want to test these conditions again ?
There was a problem hiding this comment.
Had added checks for safety. Removed them while refactoring to avoid redundancy.
| Oid orig_type = exprType(inner_arg); | ||
|
|
||
| if (orig_type == OIDOID && relabel->resulttype == INT4OID && | ||
| match_index_to_operand(inner_arg, indexcol, index)) |
There was a problem hiding this comment.
match_index_to_operand internally strips the relabel type and uses the inner_arg, Also in PG side there is already this check match_index_to_operand(leftop, indexcol, index) do we still need this check here as well ?
There was a problem hiding this comment.
This check is needed. The hook tries both operand orders but the engine only validates one side per call path. For join conditions like (pg_namespace.oid)::int4 = (pg_class.relnamespace)::int4, without this check we would incorrectly match the wrong table's column to the index. Removing it caused DDL test failures.
| idx_op = get_opfamily_member(opfamily, orig_type, orig_type, | ||
| BTEqualStrategyNumber); | ||
| if (OidIsValid(idx_op) && | ||
| IsBinaryCoercible(relabel->resulttype, orig_type)) |
There was a problem hiding this comment.
in the parent if condition we have already verified orig_type == OIDOID && relabel->resulttype == INT4OID and since we know oid and int Is Binary Coercible, can we skip this check or do we want to have it like sanity/defensive check?
There was a problem hiding this comment.
It's safe to skip.
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
|
/code-review |
1 similar comment
|
/code-review |
| SELECT name FROM sys.objects | ||
| WHERE object_id = (SELECT object_id FROM sys.objects WHERE name = 'babel_6814_t1'); | ||
| GO | ||
|
|
There was a problem hiding this comment.
No explain tests for this? How are we verifying index usage?
There was a problem hiding this comment.
Those tests were for correctness only. Added explain tests
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
Verdict: REQUEST_CHANGES
Based on the provided code changes, here are my feedback and suggestions:
Issue 1 - In the hooks.c — match_oid_cast_to_indexcol: missing volatile function guard:
new_val = makeRelabelType((Expr *) copyObject(val_arg),
orig_type, -1, InvalidOid,
COERCE_IMPLICIT_CAST);
new_clause = makeNode(OpExpr);The helper builds an indexqual from val_arg without checking whether it contains volatile functions. The upstream match_opclause_to_indexcol() in PostgreSQL's indxpath.c explicitly gates index pushdown on !contain_volatile_functions(rightop). Without this check, a volatile expression (e.g., random(), nextval(), a side-effecting PL/pgSQL function) on the value side can be pushed into an index scan where it may be evaluated with different cardinality than under a sequential scan — producing wrong or non-deterministic results.
Suggested fix:
if (orig_type == OIDOID && relabel->resulttype == INT4OID &&
match_index_to_operand(inner_arg, indexcol, index))
{
/* Reject volatile value-side operand — mirrors upstream indxpath.c */
if (contain_volatile_functions(val_arg))
return NULL;
Oid idx_op;
idx_op = get_opfamily_member(opfamily, orig_type, orig_type,
BTEqualStrategyNumber);Issue 2 - In the hooks.c — bbf_match_opclause_to_indexcol: missing self-relation check on the value operand:
/* Try left operand as the cast: (oid)::int4 = value */
iclause = match_oid_cast_to_indexcol(root, rinfo, indexcol,
index, leftop, rightop,
opfamily);The upstream code requires that the non-indexed operand does not reference the same relation as the index (!bms_is_member(index->rel->relid, rinfo->right_relids) for the left-matches-index case and the symmetric check for the other side). Without this guard, a self-comparison like o.object_id = o.parent_object_id where OID casts exist on both sides could produce a spurious indexqual that references the same relation, leading to a broken plan.
Suggested fix: pass index->rel->relid into the helper and add a relid membership check before accepting the rewrite:
static IndexClause *
match_oid_cast_to_indexcol(PlannerInfo *root, RestrictInfo *rinfo,
int indexcol, IndexOptInfo *index,
Node *cast_arg, Node *val_arg, Oid opfamily,
bool cast_is_left)
{
...
if (orig_type == OIDOID && relabel->resulttype == INT4OID &&
match_index_to_operand(inner_arg, indexcol, index))
{
/* Ensure value side does not reference the indexed relation */
Relids val_relids = cast_is_left ? rinfo->right_relids
: rinfo->left_relids;
if (bms_is_member(index->rel->relid, val_relids))
return NULL;
...Issue 3 - In the hooks.c at the new RestrictInfo construction: security_level dropped:
new_rinfo = make_simple_restrictinfo(root,
(Expr *) new_clause);make_simple_restrictinfo() hardcodes security_level = 0. If the original rinfo came from a security-barrier view or RLS policy with a higher security level, the derived indexqual is demoted. While this is unlikely to be exploitable for the OID equality use case today, it diverges from how the upstream constructs derived indexquals (preserving security_level) and could cause subtle issues if these catalog views ever gain row-level security.
Suggested fix:
new_rinfo = make_simple_restrictinfo(root, (Expr *) new_clause);
new_rinfo->security_level = rinfo->security_level;Issue 4 - In the hooks.c: inner_arg shared between original clause and new clause without deep copy:
new_clause->args = list_make2(inner_arg, new_val);inner_arg is the same pointer used inside the original rinfo->clause (via the RelabelType->arg chain). While val_arg is correctly deep-copied, inner_arg is not. Subtree sharing is generally safe in the planner today, but it is unusual for synthesized restrictinfos and creates fragile aliasing if any future pass does in-place mutation. This is advisory — not blocking.
Suggested fix:
new_clause->args = list_make2(copyObject(inner_arg), new_val);Issue 5 - In BABEL-6814-vu-verify.sql: no EXPLAIN-based tests verifying index scan usage:
The tests verify correctness (query results) but do not verify that the optimization actually produces an index scan. Since this PR is a planner-only performance improvement, at least one EXPLAIN test confirming Index Scan / Index Cond on an OID-indexed catalog lookup would provide confidence that the rewrite is taking effect and guard against future regressions that silently fall back to sequential scans.
Suggested fix: Add an EXPLAIN test, for example:
-- Verify index scan is used for OID-cast pattern
SET babelfish_showplan_all ON;
GO
SELECT name FROM sys.objects
WHERE object_id = (SELECT object_id FROM sys.objects WHERE name = 'babel_6814_t1');
GO
SET babelfish_showplan_all OFF;
GOThen verify the expected output contains an Index Scan with an appropriate Index Cond. (This was also raised by reviewer KushaalShroff.)
Issue 6 - Test file naming: BABEL-6814 uses JIRA ticket ID:
The test files are named BABEL-6814-vu-*.sql. Best practice for test naming is to use a feature/domain-based name (e.g., oid_cast_index_scan-vu-*.sql) rather than a JIRA ticket ID, which becomes meaningless over time. This is advisory — not blocking for this PR given the pattern already exists widely.
Missing Tests
- EXPLAIN-based tests confirming index scan usage (blocking — see Issue 5)
Notes
- The PR depends on a companion engine PR (#762 in postgresql_modified_for_babelfish) that adds the
match_opclause_to_indexcol_hookdeclaration and call site. Reviewers should verify in that PR that the hook is called from insidematch_clause_to_indexcol()with a validatedindexcol < index->nkeycolumnscheck (release-build safe, not just an Assert). - The PR description states "Since this is a planner only improvement, we have not added any tests" but test files are actually present — the description appears stale and should be updated to document the test scenarios.
- Reviewer rohit01010 approved and reviewer KushaalShroff requested changes (asking for EXPLAIN tests) — both are reasonable positions.
- Hook installation/uninstall and chaining via
prev_match_opclause_to_indexcol_hookis correctly implemented. - The OID ↔ INT4 rewrite is semantically safe: they are binary-compatible types with identical 32-bit representation.
- Issues 1 and 2 are blocking (correctness/robustness gaps mirroring upstream guards). Issues 3–4 are advisory. Issue 5 is blocking (test coverage for a planner optimization must demonstrate the optimization works).
Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
|
/code-review |
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
PR: #4829 — Enabling index scans for OID cast patterns in system views
Author: RuchaSK1
Base: BABEL_6_X_DEV
Head SHA: e3e285f
Verdict: APPROVE
Summary
This PR implements match_opclause_to_indexcol_hook to rewrite expressions like (oid)::integer = value to oid = value::oid, enabling index usage in system catalog views. The performance gains are significant (up to 25× improvement for SSMS operations on large catalogs).
The implementation is clean, well-scoped, and handles both operand orders. Prior review feedback has been addressed (volatile function guard, code deduplication via helper function, relid membership checks).
PR Description Checklist
- Description explains what and why — clear performance motivation with benchmarks
- BABEL-6814 JIRA ticket referenced (visible in test filenames and branch)
- Test scenarios covered — 22 numbered tests across multiple patterns
- Target branch correct (BABEL_6_X_DEV)
- No breaking changes (hook only activates in T-SQL dialect)
Coding Conventions
No violations found. The code follows existing patterns in hooks.c:
- Proper hook chaining (save previous, restore on uninstall)
sql_dialect == SQL_DIALECT_TSQLguard at entry point- Memory-safe: uses
copyObject()for node copies,makeNode()for allocations - Block comment style matches the file's convention
- Static function with clear, descriptive name
- Forward declarations grouped with other hook declarations
Code Review Findings
Advisory
1. The lossy = false choice is correct but deserves a brief inline comment.
The rewrite from (oid)::int4 = value to oid = value::oid is semantically lossless because OID and INT4 are binary-coercible same-width types and the equality semantics are preserved. A short comment at the iclause->lossy = false line would help future readers understand why no recheck is needed.
2. Consider adding a GUC to disable the optimization.
While not strictly necessary, a GUC like babelfishpg_tsql.enable_oid_cast_index_rewrite would allow operators to disable the rewrite if unexpected plan regressions are observed in production. This is a low-priority suggestion for future work.
Test Coverage
Test coverage is thorough:
| Category | Tests |
|---|---|
| Left-side cast pattern | Tests 1, 4, 5, 6 |
| Right-side cast pattern | Test 2 |
| Both patterns equivalence | Test 3 |
| Correlated subqueries | Tests 8, 9, 15 |
| Join patterns | Tests 10, 11 |
| Non-existent OID | Test 7 |
| EXPLAIN plan verification | Tests 16–22 |
| Multiple object types | Tests 12, 13, 14 |
| Upgrade schedules | All 13_x and 14_x versions |
| Parallel query variant | Included |
The EXPLAIN tests confirm Index Scan with Index Cond: (oid = (12345)::oid) — verifying the rewrite produces correct index conditions.
The .mix file format correctly uses -- psql sections for EXPLAIN tests (which require direct PostgreSQL access with sql_dialect set).
Security Review
No security issues identified. The change is post-authentication, operates in the planner (no data-plane side effects), and only performs expression rewriting with proper type validation. Full security analysis confirmed 0 findings.
Prior Review Issues
Prior AI review raised:
- Missing volatile function guard — Resolved.
contain_volatile_functions(val_arg)check is present. - Code duplication for left/right operand handling — Resolved. Refactored into
match_oid_cast_to_indexcolhelper. - Redundant defensive checks — Addressed per reviewer discussion. The
bms_is_membercheck was retained (causes DDL test failures without it), while the binary-coercibility check was removed as redundant. - Missing EXPLAIN tests (KushaalShroff) — Resolved. Tests 16–22 use EXPLAIN to verify index scan usage.
Notes
- Two human reviewers have approved (rohit01010, kuntalghosh)
- The implementation is narrowly scoped to OID→INT4 casts with equality operators only, which is the pattern found in Babelfish system catalog views
- The hook correctly chains with any previously-installed hook via
prev_match_opclause_to_indexcol_hook
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
…d local Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
|
/code-review |
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
PR: #4829 — Enabling index scans for OID cast patterns in system views
Author: RuchaSK1
Base: BABEL_6_X_DEV
Head SHA: 7f5b6db
Verdict: APPROVE
Summary
This PR implements match_opclause_to_indexcol_hook to rewrite equality predicates of the form (oid)::integer = value to oid = value::oid, enabling the planner to use existing OID indexes on system catalog views. The hook handles both operand orders and is gated on sql_dialect == SQL_DIALECT_TSQL.
The performance improvements are significant (up to 25× for SSMS operations on large catalogs). The implementation is well-scoped, touching only a single C source file with proper hook chaining, volatility checks, relids validation, and security_level propagation.
PR Description Checklist
- ✅ Description explains what changed and why (with benchmark table)
- ✅ BABEL JIRA reference present (BABEL-6814, visible in test file naming)
- ✅ Test scenarios listed (correctness + EXPLAIN plan verification)
- ✅ Target branch correct (BABEL_6_X_DEV)
- ✅ DCO sign-off checkbox present
Coding Conventions
- ✅ C-style
/* */comments used throughout (no//comments) - ✅ Function header comment present on
bbf_match_opclause_to_indexcolexplaining the transformation - ✅ Proper hook save/restore pattern (
prev_match_opclause_to_indexcol_hook) - ✅ Code formatting consistent with surrounding hooks.c style (tabs, brace placement)
- ✅ No restricted keywords in code or comments
- ✅ Forward declarations grouped with other static hook declarations
- ✅ Includes added in alphabetical order within their section
Test Coverage
- ✅ Correctness tests (1–15): Point lookups via
sys.objects,sys.indexes,sys.columns,sys.schemas; both operand orders; correlated subqueries; JOIN patterns; multi-object queries; non-existent object (0 rows); index-filtered queries - ✅ EXPLAIN plan tests (16–22): Verify Index Scan usage on
pg_class.oid,pg_attribute.attrelid,pg_index.indrelid; both operand orders; join and correlated subquery patterns - ✅ Test infrastructure: Creates 100 extra tables + 500 schemas to ensure consistent planner behavior (avoids plan instability from low row counts)
- ✅ Parallel query variant: Separate expected output for parallel query mode
- ✅ Upgrade schedules: BABEL-6814 added to all upgrade schedule files (13_4 through latest)
Security Review
The dedicated security review found 0 findings (Critical: 0, High: 0, Medium: 0, Low: 0). Key points validated:
- Post-auth only; gated on
sql_dialect == SQL_DIALECT_TSQL - No SQL text construction, no SPI calls, no buffer operations
- Proper
copyObject()usage prevents aliasing issues contain_volatile_functions()guard prevents unsafe index qualsbms_is_member()relids checks prevent self-join confusionsecurity_levelpropagated correctly to new RestrictInfo- Hook chaining properly delegates to previous hook
Advisory Findings
Advisory Issue 1 — match_oid_cast_to_indexcol helper lacks a function header comment:
The main hook function bbf_match_opclause_to_indexcol has a proper multi-line comment explaining its purpose. The static helper match_oid_cast_to_indexcol that performs the actual node rewrite does not have a header comment. Consider adding a brief comment describing its role (e.g., "Rewrites a RelabelType(oid→int4) = val pattern into oid = RelabelType(val→oid) and returns an IndexClause, or NULL if the pattern doesn't match").
Notes
- Prior review issues addressed: The earlier AI review (at SHA e3e285f) raised concerns about a missing volatile function guard — this is now present (
contain_volatile_functions(val_arg)check at the top of the rewrite logic). KushaalShroff's request for EXPLAIN tests has also been addressed with Tests 16–22. - Reviewer discussion resolved: rohit01010's suggestion to avoid code duplication between left/right operand handling was addressed by extracting the
match_oid_cast_to_indexcolhelper. The relids checks questioned by rohit01010 were justified by RuchaSK1 (needed to prevent incorrect index matching in join conditions — removing them caused DDL test failures). - Design note: The hook correctly restricts to
Int4EqualOperatoronly (equality). If other comparison operators are needed in the future (e.g.,<,>), the approach can be extended by handling additional strategy numbers. The current scope is appropriate for the point-lookup patterns used by SSMS catalog queries.
Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
|
/code-review |
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
PR: #4829 — Enabling index scans for OID cast patterns in system views
Author: RuchaSK1
Base: BABEL_6_X_DEV
Head SHA: a3cf56d
Verdict: APPROVE
Summary
This PR implements a planner hook (match_opclause_to_indexcol_hook) that rewrites equality predicates of the form (oid_col)::integer = value to oid_col = value::oid, enabling the PostgreSQL planner to match existing OID indexes on system catalog views. The performance impact is significant — up to 25× improvement for SSMS operations on large catalogs (36k objects).
The implementation:
- Adds
bbf_match_opclause_to_indexcolas the hook entry point, gated onsql_dialect == SQL_DIALECT_TSQL - Extracts a helper
match_oid_cast_to_indexcolthat handles the actual AST rewrite - Handles both operand orders (
(oid)::int = valandval = (oid)::int) - Guards against non-equality operators via
clause->opno == Int4EqualOperator - Guards against volatile functions in the value operand
- Validates OID-to-int4 RelabelType pattern before rewriting
- Correctly chains to
prev_match_opclause_to_indexcol_hook
PR Description Checklist
- ✅ Description explains what and why with performance metrics table
- ✅ BABEL-6814 JIRA reference (inferred from test file naming and branch name
jira-babel-ssms-perf) - ✅ Test scenarios covered (22 tests across functional and EXPLAIN plan verification)
- ✅ Target branch correct (BABEL_6_X_DEV)
Coding Conventions
- ✅ Hook install/uninstall follows the established pattern (save prev, assign, restore)
- ✅ Function naming follows Babelfish convention (
bbf_prefix for hook entry) - ✅ Header comment on
bbf_match_opclause_to_indexcoldocuments purpose and patterns handled - ✅ Static function scope is correct — no unnecessary exports
- ✅ Includes are grouped logically with optimizer-related headers
- ✅
copyObject()used correctly to avoid aliasing planner AST nodes
Test Coverage
- ✅ 15 functional tests covering: point lookups, reversed operands, joins, correlated subqueries, multiple object types, NULL/non-existent values
- ✅ 7 EXPLAIN tests (Tests 16–22) verifying index scan usage on
pg_class,pg_index,pg_attribute,pg_namespace - ✅ Parallel query expected output provided (
parallel_query/BABEL-6814-vu-verify.out) - ✅ Upgrade schedule entries added across all 62 version upgrade directories
- ✅ Tests use
.mixformat for psql-mode EXPLAIN alongside tsql-mode functional queries
Security Review
The security sub-agent reviewed hooks.c and found 0 critical, 0 high, 0 medium findings. One low-severity correctness concern was identified regarding operator-type checking, but the current code at HEAD SHA a3cf56d6 includes clause->opno == Int4EqualOperator as a guard (line 128 of the new code), which resolves this concern entirely. The hook only fires for equality operators against OID-typed index columns.
The hook operates purely at plan time on already-parsed expression trees. It does not touch authentication, catalog writes, SPI, GUC settings, or shared memory. No security issues.
Prior Review Issues Addressed
The following issues from prior reviews have been resolved in the current revision:
- Code duplication (rohit01010) → Resolved: Helper function
match_oid_cast_to_indexcolextracted to handle both operand orders - Redundant safety checks (rohit01010) → Resolved: Removed redundant
bms_is_member/contain_volatile_functionschecks that the engine already validates; retainedbms_is_memberinbbf_matchfor correctness (confirmed needed for join conditions) - Missing EXPLAIN tests (KushaalShroff) → Resolved: Tests 16–22 verify index scan plans
- Missing operator guard (prior AI review) → Resolved:
clause->opno == Int4EqualOperatorcheck added - Volatile function guard (prior AI review) → Present:
contain_volatile_functions(val_arg)check inmatch_oid_cast_to_indexcol
Advisory Findings
Advisory Issue 1 — Consider using get_op_btree_interpretation for broader operator validation:
The current Int4EqualOperator check is a hard-coded OID comparison. While correct and efficient for the known pattern, using get_op_btree_interpretation(clause->opno) to confirm BTEqualStrategyNumber would be more robust if additional cast patterns (e.g., int8→int4) are added in the future. This is purely a forward-looking suggestion — the current approach is correct for the OID→int4 case since Int4EqualOperator is the only equality operator that would appear in this cast pattern.
Advisory Issue 2 — iclause->rinfo = rinfo uses original RestrictInfo:
The IndexClause sets iclause->rinfo = rinfo (the original restriction info with the cast pattern) while iclause->indexquals contains the rewritten clause. This is the correct pattern matching how PostgreSQL's match_clause_to_indexcol works — the original rinfo is preserved for selectivity estimation while the rewritten qual is used for actual index matching. No action needed; noting for reviewer awareness.
Notes
- The hook is gated on
sql_dialect == SQL_DIALECT_TSQL, so it has zero impact on PostgreSQL-dialect queries. - The
opcintype[indexcol] == OIDOIDcheck ensures the rewrite only applies to actual OID indexes, preventing false matches on integer indexes. - Cross-file dependency: This hook requires the matching
match_opclause_to_indexcol_hookhook point in the PostgreSQL fork'sindxpath.c. Confirm the fork-side patch is in place or landing concurrently. - The 62 upgrade schedule additions confirm this feature is safe across all supported upgrade paths.
Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
PR: #4829 — Enabling index scans for OID cast patterns in system views
Author: RuchaSK1
Base: BABEL_6_X_DEV
Head SHA: a3cf56d
Verdict: APPROVE
Summary
This PR implements match_opclause_to_indexcol_hook to rewrite equality predicates of the form (oid_col)::integer = value into oid_col = value::oid, enabling the PostgreSQL planner to use existing OID indexes on system catalog tables. The optimization targets Babelfish system views (e.g., sys.objects, sys.indexes, sys.columns) which internally cast OID columns to INT4 — a pattern that defeats index matching in the default planner. Performance improvements are significant (up to 25× for SSMS operations on large databases).
PR Description
| Check | Status |
|---|---|
| Describes what and why | ✅ |
| JIRA ticket (BABEL-6814) | ✅ |
| Test scenarios listed | ✅ |
| Target branch correct | ✅ |
| Breaking changes documented | N/A |
Findings
Advisory
-
Comment block for
match_oid_cast_to_indexcolhelper — Thebbf_match_opclause_to_indexcolfunction has a clear doc comment explaining the transformation and both operand orders. The helper functionmatch_oid_cast_to_indexcollacks a corresponding comment block. A brief note explaining that it validates the cast pattern and constructs the rewritten IndexClause would improve readability for future maintainers.File:
contrib/babelfishpg_tsql/src/hooks.c(line ~9320) -
bms_is_memberguards are essential but non-obvious — The checks!bms_is_member(index->rel->relid, rinfo->right_relids)and the symmetric left-side check are critical for correctness in join conditions (as noted in the review discussion — removing them caused DDL test failures). Consider adding a brief inline comment explaining the purpose: these prevent the hook from incorrectly matching a column from the wrong table to the current index in cross-table equality predicates.File:
contrib/babelfishpg_tsql/src/hooks.c(lines 9407, 9417)
Test Coverage
| Criterion | Status |
|---|---|
| Happy path (left operand cast) | ✅ Tests 1, 4, 5, 6 |
| Reversed operand order | ✅ Tests 2, 9 |
| Symmetry verification | ✅ Test 3 |
| Join patterns | ✅ Tests 10, 11 |
| Correlated subqueries | ✅ Tests 8, 9 |
| Non-existent values (0 rows) | ✅ Test 7 |
| EXPLAIN plan verification | ✅ Tests 16–22 (psql) |
| Multiple object types | ✅ Tests 12, 13, 14, 15 |
| Upgrade schedules updated | ✅ All 13_x and 14_x schedules |
| Parallel query expected output | ✅ |
The test suite is comprehensive and covers both correctness and plan verification.
Security
No security findings. The hook operates within the planner's type-safe API, uses copyObject to avoid aliasing, validates types explicitly (OIDOID, INT4OID), and guards against volatile functions. No buffer manipulation, no dynamic SQL, no privilege changes.
Prior Review Issues Addressed
- Volatile function guard: Added (
contain_volatile_functions(val_arg)check) - Code duplication: Refactored into
match_oid_cast_to_indexcolhelper - Both operand orders handled: Left and right patterns with
bms_is_memberguards - EXPLAIN tests: Added (Tests 16–22 in psql section)
Notes
- This is an incremental review. All prior blocking issues from human reviewers (rohit01010, KushaalShroff) have been resolved.
- The hook correctly chains via
prev_match_opclause_to_indexcol_hookfor composability. - Only activates for T-SQL dialect and
Int4EqualOperator— no impact on vanilla PostgreSQL queries.
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Description
Implementing match_opclause_to_indexcol_hook to rewrite expressions like (oid)::integer = value to oid = value::oid, enabling index usage in system catalog views.
Following improvement is observed in SSMS Queries (36k objects):
Issues Resolved
Task: BABEL-6814
Authored-by: Rucha Kulkarni ruchask@amazon.com
Signed-off-by: Rucha Kulkarni ruchask@amazon.com
Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#762
Test Scenarios Covered
Since this is a planner only improvements, we have not added any tests. There are existing tests which check correctness.
Use case based -
Boundary conditions -
Arbitrary inputs -
Negative test cases -
Minor version upgrade tests -
Major version upgrade tests -
Performance tests -
Tooling impact -
Client tests -
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.