Rewriting objectpropertyex function in C#4864
Conversation
Signed-off-by: Rucha Kulkarni <ruchask@amazon.com>
|
/code-review |
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
Verdict: COMMENT
Based on the provided code changes, here are my feedback and suggestions:
Issue 1 - In contrib/babelfishpg_tsql/runtime/functions.c at line 5095: Missing PG_ARGISNULL defense-in-depth in objectproperty_internal:
Datum
objectproperty_internal(PG_FUNCTION_ARGS)
{
Oid object_id;
char *raw;
...
object_id = (Oid) PG_GETARG_INT32(0);
raw = text_to_cstring(PG_GETARG_TEXT_P(1));The prior implementation had explicit PG_ARGISNULL(0) || PG_ARGISNULL(1) guards. The new code relies solely on the SQL-level STRICT modifier to prevent NULL arguments from reaching the C function. While STRICT is correct today, a future upgrade script that replaces the function declaration without STRICT (or an ALTER FUNCTION ... CALLED ON NULL INPUT) would cause text_to_cstring(NULL) to dereference a NULL pointer, crashing the backend. Adding the check costs two branch instructions and provides defense-in-depth.
Suggested fix:
Datum
objectproperty_internal(PG_FUNCTION_ARGS)
{
Oid object_id;
char *raw;
char *property;
bool is_null;
int result;
if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
PG_RETURN_NULL();
object_id = (Oid) PG_GETARG_INT32(0);
raw = text_to_cstring(PG_GETARG_TEXT_P(1));Apply the same pattern to objectpropertyex_internal.
Issue 2 - In contrib/babelfishpg_tsql/src/pltsql_utils.c at line 1782: Broadened ACL in shared helper affects object_schema_name asymmetrically:
if (pg_class_aclcheck(con->conrelid, user_id, ACL_SELECT | ACL_INSERT | ACL_UPDATE | ACL_DELETE | ACL_REFERENCES) == ACLCHECK_OK)
namespace_oid = con->connamespace;tsql_get_constraint_nsp_oid is called from two sites: (1) the new get_object_from_pg_constraint helper and (2) object_schema_name (line ~4031). In object_schema_name, the table path still requires ACL_SELECT, and the trigger path still requires ACL_SELECT. After this PR, the constraint path is more permissive — a user with only INSERT on a table can learn the schema name via OBJECT_SCHEMA_NAME(constraint_oid) but not via OBJECT_SCHEMA_NAME(table_oid). This asymmetry is likely unintentional.
Suggested fix:
Either keep the broadened check only inline in get_object_from_pg_constraint (revert tsql_get_constraint_nsp_oid to ACL_SELECT), or also broaden the table/trigger paths in object_schema_name for consistency. Please confirm which behavior is intended.
Issue 3 - In contrib/babelfishpg_tsql/runtime/functions.c at line 4773-4781: Database scoping uses is_shared_schema() which is wider than prior behavior:
if (!(nspname && is_shared_schema(nspname)) &&
OidIsValid(get_cur_db_id()) && !is_schema_from_db(schema_id, get_cur_db_id()))The old code explicitly blocked objects in pg_catalog, pg_toast, and public. The new code permits objects in any "shared schema" (including public, pg_catalog, pg_toast, aws_commons, aws_s3, aws_lambda, pglogical). This means objects in public that were previously hidden by OBJECTPROPERTY/OBJECTPROPERTYEX are now visible to any T-SQL user with appropriate ACL. If this is intentional (and it does seem correct for multi-database scoping), it should be documented in the PR description as a behavior change. If it's unintentional for public, a narrower predicate should be used.
Suggested fix:
Add a brief note to the PR description acknowledging the metadata visibility expansion for objects in public/pg_catalog/pg_toast.
Issue 4 - In contrib/babelfishpg_tsql/runtime/functions.c at line 4415: Sequence detection fix — good catch but verify test coverage:
// Old (broken):
else if (pg_class->relkind == 's')
type = OBJECT_TYPE_SEQUENCE_OBJECT;
// New (correct):
else if (pg_class->relkind == 'S')
temp_type = OBJECT_TYPE_SEQUENCE_OBJECT;The prior code used lowercase 's' which never matches (PostgreSQL uses uppercase 'S' for sequences in relkind). This means OBJECTPROPERTY was silently returning wrong results for sequences. The fix is correct and the new test for OBJECTPROPERTYEX(OBJECT_ID('objectpropertyex_test_seq'), 'BaseType') exercises it. However, there is no corresponding test added for OBJECTPROPERTY(..., 'IsTable') on a sequence to verify it returns 0 (ensuring it doesn't fall through to the table classification path). Consider adding one.
Issue 5 - In contrib/babelfishpg_tsql/runtime/functions.c at line 5158: Minor memory leak in objectpropertyex_internal basetype path:
VarChar *vch = (*common_utility_plugin_ptr->tsql_varchar_input)(type_code, strlen(type_code), -1);
PG_RETURN_BYTEA_P((*common_utility_plugin_ptr->convertVarcharToSQLVariantByteA)(vch, PG_GET_COLLATION()));vch is allocated by tsql_varchar_input and leaked because PG_RETURN_BYTEA_P returns immediately. In queries like SELECT OBJECTPROPERTYEX(object_id, 'BaseType') FROM sys.all_objects, each row leaks one VarChar allocation. This is bounded (memory context cleanup happens at statement end) and the payload is only ~2 bytes, so it's low severity. Consider adding pfree(vch) if convertVarcharToSQLVariantByteA does not retain the pointer.
Issue 6 - In contrib/babelfishpg_tsql/runtime/functions.c at line 4856: isusertable evaluated before the main else block:
if (pg_strcasecmp(property, "ismsshipped") == 0)
{ ... }
else if (pg_strcasecmp(property, "isusertable") == 0)
{ ... }
else
{
if (object_name)
pfree(object_name);
/* OwnerId */
if (pg_strcasecmp(property, "ownerid") == 0)
...The isusertable property is handled in a branch that still has object_name available and calls is_ms_shipped(object_name, ...). However, the main else block starts by calling pfree(object_name) before evaluating remaining properties. This works correctly because the else block properties (ownerid, isdefaultcnst, etc.) don't use object_name. Just noting that the structure is correct but the inconsistent placement of isusertable outside the main else block is intentional since it needs object_name — this is fine.
Missing Tests
- No
OBJECTPROPERTYtest for sequence objects (e.g.,SELECT OBJECTPROPERTY(OBJECT_ID('seq'), 'IsTable')should return 0 — this exercises the'S'fix from the other direction). - The "before-16_5-or-15_9" variant of tests expects
<NULL>forPKandFKconstraints' BaseType (comments say "older databases"). This is fine for the older-version test file, but consider verifying in the main test variant that these returnPK/Fon the current version (unlesssys.key_constraints/sys.foreign_keysare intentionally empty on BABEL_4_X_DEV). - No minor version upgrade test is listed in the PR checklist. Given that
objectpropertyis being replaced viaCREATE OR REPLACE FUNCTIONin the4.10.0--4.11.0upgrade script, a minor version upgrade test verifying the function works post-upgrade would be valuable.
Notes
- The PR is cherry-picked from #4691 (presumably targeting a different branch). The performance improvements are substantial (225x–2355x for single-call scenarios) and the refactoring into per-catalog helper functions is clean and well-structured.
- The
IsTriggerfix (previously checkingOBJECT_TYPE_ASSEMBLY_DML_TRIGGERwhich the resolver never produces, now correctly checkingOBJECT_TYPE_TSQL_DML_TRIGGER) is a real correctness bug fix — the old code always returned NULL/0 for triggers. The test change from<NULL>to1inobjectproperty-vu-verify.outconfirms this. - The trigger lookup via
pg_triggeris a new catalog path that didn't exist before — triggers are now properly resolved even when passed directly as OIDs rather than throughOBJECT_ID('name', 'TR'). - The unique constraint support (
'UQ'BaseType) is a new addition that aligns with T-SQL behavior. - Test coverage is comprehensive: ACL tests with per-permission grants/revokes, cross-database scoping, NULL inputs, trigger resolution, constraint types, IsSchemaBound on views, and permission edge cases are all well-covered.
- Security review (incorporated from separate analysis): 0 critical/high findings. 1 medium finding (Issue 2 above — ACL asymmetry in
object_schema_name). Remaining findings are low/informational defense-in-depth suggestions.
Description
The PL/pgSQL implementation of OBJECTPROPERTYEX relies on sys.all_objects, an expensive view that joins across multiple catalogs with multiple UNION branches, causing significant overhead which gets carried forward to any function using OBJECTPROPERTYEX.
Hence we are replacing OBJECTPROPERTYEX with a C implementation that uses direct syscache lookups via shared catalog helpers, avoiding view materialization entirely. For non-BaseType properties, it delegates to the existing objectproperty_internal.
The performance improvement observed by this is as follows on 17.10 instance with total object counts being approx 36K:
EXPLAIN ANALYZE SELECT OBJECTPROPERTYEX(OBJECT_ID('dbo.tbl_1'), 'BaseType');EXPLAIN ANALYZE SELECT OBJECTPROPERTYEX(OBJECT_ID('dbo.tbl_1'), 'IsSchemaBound');EXPLAIN ANALYZE SELECT OBJECTPROPERTYEX(999999999, 'BaseType');EXPLAIN ANALYZE SELECT OBJECTPROPERTYEX(999999999, 'IsSchemaBound');EXPLAIN ANALYZE SELECT OBJECTPROPERTYEX(OBJECT_ID('dbo.vw_1'), 'BaseType');EXPLAIN ANALYZE SELECT OBJECTPROPERTYEX(OBJECT_ID('dbo.sp_1'), 'BaseType');EXPLAIN ANALYZE select count(*) from sys.all_objects;Performance for Joins with Predicated OBJECTPROPERTYEX:
oid_testis a plain user table containing all object IDs (isolates function cost from sys view overhead).SELECT TOP 100 t.name, OBJECTPROPERTYEX(t.object_id, 'BaseType') FROM sys.objects t JOIN sys.schemas s ON t.schema_id = s.schema_id;SELECT TOP 200 t.name, OBJECTPROPERTYEX(t.object_id, 'BaseType') FROM sys.objects t JOIN sys.schemas s ON t.schema_id = s.schema_id;SELECT TOP 20 name, object_id FROM sys.objects WHERE OBJECTPROPERTYEX(object_id, 'BaseType') = CAST('U ' AS sql_variant);SELECT object_id, OBJECTPROPERTYEX(object_id, 'BaseType') FROM oid_test;(36K rows)SELECT TOP 200 o.object_id, c.relname FROM oid_test o JOIN pg_class c ON c.oid = o.object_id JOIN pg_namespace n ON n.oid = c.relnamespace WHERE OBJECTPROPERTYEX(o.object_id, 'IsTable') = CAST(1 AS sql_variant);SELECT TOP 200 o.object_id, OBJECTPROPERTYEX(o.object_id, 'BaseType') AS type, OBJECTPROPERTYEX(o.object_id, 'IsMSShipped') AS shipped FROM oid_test o JOIN pg_class c ON c.oid = o.object_id JOIN pg_namespace n ON n.oid = c.relnamespace WHERE OBJECTPROPERTYEX(o.object_id, 'IsTable') = CAST(1 AS sql_variant) OR OBJECTPROPERTYEX(o.object_id, 'IsView') = CAST(1 AS sql_variant);SELECT TOP 200 object_id, OBJECTPROPERTYEX(object_id, 'BaseType'), OBJECTPROPERTYEX(object_id, 'IsTable'), OBJECTPROPERTYEX(object_id, 'IsMSShipped') FROM oid_test;SELECT TOP 200 o.object_id, c.relname FROM oid_test o JOIN pg_class c ON c.oid = o.object_id WHERE OBJECTPROPERTYEX(o.object_id, 'BaseType') = CAST('U ' AS sql_variant) ORDER BY c.relname;SELECT TOP 200 a.object_id AS table_id, b.object_id AS view_id FROM oid_test a JOIN oid_test b ON a.object_id <> b.object_id WHERE OBJECTPROPERTYEX(a.object_id, 'BaseType') = CAST('U ' AS sql_variant) AND OBJECTPROPERTYEX(b.object_id, 'BaseType') = CAST('V ' AS sql_variant);Additional Improvements:
Cherry-picked from: #4691
Issues Resolved
Task: BABEL-6435
Authored-by: Rucha Kulkarni ruchask@amazon.com
Signed-off-by: Rucha Kulkarni ruchask@amazon.com
Test Scenarios Covered
There were already existing tests since the function was already present and we have just rewritten it for better performance, but still some additional test cases have been added.
Use case based - Yes
Boundary conditions - Yes
Arbitrary inputs - Yes
Negative test cases - Yes
Minor version upgrade tests -
Major version upgrade tests -
Performance tests - Yes (Benchmarked above with 36K objects having 2500 views)
Tooling impact -
Client tests -
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.