Enabling index scans for OID cast patterns in system views#4868
Enabling index scans for OID cast patterns in system views#4868RuchaSK1 wants to merge 6 commits into
Conversation
|
/code-review |
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
Verdict: COMMENT
Based on the provided code changes, here are my feedback and suggestions:
Issue 1 - In contrib/babelfishpg_tsql/src/hooks.c at line 59: #include ordering:
#include "optimizer/paths.h"
#include "optimizer/planner.h"
#include "optimizer/restrictinfo.h"
#include "access/stratnum.h"
#include "parser/analyze.h"The #include "access/stratnum.h" is placed after the optimizer/ includes, breaking alphabetical ordering by directory prefix. The project convention groups includes alphabetically by path.
Suggested fix:
#include "access/stratnum.h"
...
#include "optimizer/clauses.h"
#include "optimizer/optimizer.h"
#include "optimizer/paths.h"
#include "optimizer/planner.h"
#include "optimizer/restrictinfo.h"Issue 2 - In contrib/babelfishpg_tsql/src/pltsql.h at lines 2523–2541: Unrelated InsertExecContext declarations bundled with an index-scan hook PR:
typedef struct InsertExecContext
{
Oid temp_table_oid;
Oid target_rel_oid;
bool is_target_relation_modified;
} InsertExecContext;
extern InsertExecContext insert_exec_ctx;
extern Oid create_insert_exec_temp_table(...);
extern void flush_insert_exec_temp_table(...);
extern DestReceiver *CreateInsertExecDestReceiver(void);These declarations expose an INSERT...EXEC API surface that has no corresponding implementation in this PR. No .c file defines these symbols, so they are dead code at this point. Including them here creates a confusing scope for review — the PR title and description focus entirely on the OID cast index-scan optimization (BABEL-6814), yet these declarations belong to a different feature (INSERT...EXEC support).
This is not a blocking issue since the declarations are inert (no linker impact when unreferenced), but it would be cleaner to:
- Land them in the PR that introduces the implementation, or
- Guard them behind a comment noting the follow-up PR that will use them.
The security review noted that the implementation PR must be carefully reviewed for state-reset on error paths, TOCTOU on target_rel_oid, and authorization checks in CreateInsertExecDestReceiver.
Issue 3 - In test/JDBC/input/BABEL-6814-vu-prepare.sql: Test setup creates 100 tables and 500 schemas for plan stability:
WHILE @i <= 100
BEGIN
SET @sql = 'CREATE TABLE babel_6814_extra_' + CONVERT(VARCHAR, @i) + ' (id INT)';
...
END
...
WHILE @j <= 500
BEGIN
SET @sql2 = 'CREATE SCHEMA babel_6814_schema_' + CONVERT(VARCHAR, @j);
...
ENDCreating 100 extra tables + 100 indexes + 500 schemas to force planner cost estimates toward index scans is a heavy-weight test fixture. This significantly increases test runtime (especially in upgrade tests where it runs against every version). Consider whether a smaller cardinality (e.g., 20 tables, 50 schemas) would be sufficient to tip the planner, or whether SET enable_seqscan = off in the psql EXPLAIN section would produce deterministic plans without the bulk object creation.
Issue 4 - Advisory: EXPLAIN plan tests may be fragile across PostgreSQL major versions:
The EXPLAIN (COSTS OFF) tests (Tests 16–22) assert specific plan shapes (e.g., "Index Scan using pg_class_oid_index"). While the hook guarantees the index-scannable rewrite is available, the planner's final plan choice depends on table statistics, catalog size, and cost parameters. The -- parallel_query_expected variant already shows different plan wrappers (Gather nodes). If PostgreSQL's cost model changes across major versions, these tests could produce spurious failures. The existing mitigation (bulk object creation) helps but is not guaranteed for all future PG versions.
Consider adding a comment in the test file noting that these EXPLAIN tests may need updating when PostgreSQL's cost model or parallel query thresholds change.
Missing Tests
No critical gaps. The test coverage is thorough:
- Both operand orderings (left/right cast patterns)
- Correlated subqueries, joins, IN-list subqueries
- Multiple catalog tables (pg_class, pg_index, pg_attribute, pg_namespace)
- Non-existent OID (zero-row case)
- Parallel query variant
- EXPLAIN plan verification
One minor gap: no test for the contain_volatile_functions guard — i.e., a query like WHERE (oid)::int4 = nextval('seq') that should NOT be rewritten to use an index scan. This is a correctness edge case rather than a functional gap since a volatile expression would still produce correct results (just without the optimization).
Notes
- The PR description references BABEL-6435 (OBJECTPROPERTYEX C rewrite) but this PR only implements BABEL-6814 (OID cast index scans). The description content about OBJECTPROPERTYEX appears to be carried over from the cherry-pick source PR #4829 and is somewhat misleading for this standalone change. The actual code change here is focused and well-scoped.
- Security review (0 actionable findings): The planner hook is semantically correct —
int4andoidshare 32-bit representation,RelabelTypeis binary-identity,oideqis leakproof, andsecurity_levelis properly propagated. The hook correctly chains withprev_match_opclause_to_indexcol_hookand restores it on uninstall. - The
InsertExecContextdeclarations inpltsql.hshould receive a dedicated security review when their implementation lands. - Upgrade schedule files correctly add
BABEL-6814to all version schedules, ensuring the test runs during minor version upgrades.
Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
9a738bc to
10be049
Compare
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
|
/code-review |
Description
Implementing match_opclause_to_indexcol_hook to rewrite expressions like (oid)::integer = value to oid = value::oid, enabling index usage in system catalog views.
Following improvement is observed in SSMS Queries (36k objects):
Cherry-picked from: #4829
Issues Resolved
Task: BABEL-6814
Authored-by: Rucha Kulkarni ruchask@amazon.com
Signed-off-by: Rucha Kulkarni ruchask@amazon.com
Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#762
Test Scenarios Covered
Since this is a planner only improvements, we have not added any tests. There are existing tests which check correctness.
Use case based -
Boundary conditions -
Arbitrary inputs -
Negative test cases -
Minor version upgrade tests -
Major version upgrade tests -
Performance tests -
Tooling impact -
Client tests -
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.