Fix: ENR temp table creation fails for long table names by anju15bharti · Pull Request #4846 · babelfish-for-postgresql/babelfish_extensions

anju15bharti · 2026-06-02T08:25:09Z

Description

Store original untruncated name for long temp table identifiers

Previously, the tsql_ttname scanner rule did not truncate #temp table
names. The full untruncated name was stored in the ENR name field via
DefineRelation. However, the syscache (pg_class Name field) is limited
to NAMEDATALEN-1 bytes, so it stored a simple-clipped version.
This mismatch between ENR (full name) and syscache (clipped name)
caused temp table creation to fail for names >= NAMEDATALEN, as
subsequent lookups could not consistently resolve the table.

New approach: The scanner truncates #temp table names like all other
identifiers (hash-based truncation via truncate_identifier). The
truncated name is stored consistently in both ENR and syscache.
The original untruncated name is preserved separately in pg_class
reloptions (bbf_original_rel_name) using the existing infrastructure.
This works for both ENR and non-ENR temp tables since ENR synthetic
cattups are accessible via syscache. No PG engine changes are needed.

For indexes on temp tables, the parser stores the index name location
in IndexStmt options. At execution time, the full original name is
extracted from the query string and passed to exec_add_original_index_name.

Display functions (OBJECT_NAME, babelfish_get_enr_list) read the
original name from reloptions via a shared helper get_orig_temp_table_name().

Issues Resolved

BABEL-6433

Engine PR: babelfish-for-postgresql/postgresql_modified_for_babelfish#769

Test Scenarios Covered

Use case based -
Boundary conditions -
Arbitrary inputs -
Negative test cases -
Minor version upgrade tests -
Major version upgrade tests -
Performance tests -
Tooling impact -
Client tests -

Check List

Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: anju15bharti <abanju@amazon.com>

anju15bharti

/code-review

ayushdsh · 2026-06-03T09:05:07Z

@anju15bharti , why are we simply not truncating temp table identifiers (names) as well? I understand that the view with babelfish_get_enr_list might break with this, but its way simpler to simply truncate like we do for other identifier rather than supporting full name.

let me know what you think?

ayushdsh · 2026-06-03T09:05:20Z

/code-review

ayushdsh · 2026-06-03T09:06:18Z

@anju15bharti , why are we simply not truncating temp table identifiers (names) as well? I understand that the view with babelfish_get_enr_list might break with this, but its way simpler to simply truncate like we do for other identifier rather than supporting full name.

let me know what you think?

This probably is going to be a 1 line change in our parser (no changes in engine)

<tsql>{tsql_ttname}	{
					char *ident;
					SET_YYLLOC();
					ident = downcase_truncate_identifier(yytext, yyleng, true);
					yylval->str = ident;
					return IDENT;
				}

kuntalghosh

AI Code Review

Verdict: COMMENT

Based on the provided code changes, here are my feedback and suggestions:

Issue 1 - In the pl_handler.c at line 3082: ENR fallback missing currentQueryEnv NULL guard:

	if (!OidIsValid(relid) && relname[0] == '#')
	{
		EphemeralNamedRelation enr = get_ENR(currentQueryEnv, relname, true);
		if (enr)
			relid = enr->md.reliddesc;
	}

While get_ENR handles NULL queryEnv gracefully (returns NULL), other call sites in this file (line 8335) do explicitly check !currentQueryEnv before accessing it. Adding a guard here would be a defensive improvement for clarity and consistency with the rest of the file, though the behavior is correct without it.

Suggested fix:

	if (!OidIsValid(relid) && relname[0] == '#' && currentQueryEnv)
	{
		EphemeralNamedRelation enr = get_ENR(currentQueryEnv, relname, true);
		if (enr)
			relid = enr->md.reliddesc;
	}

Issue 2 - In the temp_table_long_name_enr.sql at line 37: Test 2 expected output contradicts its own comment:

-- Type 'V' (view) should return NULL for a table
SELECT CASE WHEN OBJECT_ID('#long_temp_for_typed_object_id_test_exceeding_namedatalen_limit_x', 'V') IS NULL THEN 'PASS' ELSE 'FAIL' END AS test2_type_V_null
GO

The comment says type 'V' "should return NULL for a table," but the expected output is FAIL — meaning OBJECT_ID with type filter 'V' still returns a non-NULL OID for temp tables. If this is a known limitation of ENR-based lookup (type filtering not applied), the comment should document this as a known gap rather than state the opposite of what happens. Otherwise, future readers will be confused about whether the test is passing or failing.

Suggested fix:

-- Type 'V' (view) — known limitation: OBJECT_ID type filter is not enforced for
-- ENR-based temp table lookups, so this returns non-NULL even for type 'V'.
SELECT CASE WHEN OBJECT_ID('#long_temp_for_typed_object_id_test_exceeding_namedatalen_limit_x', 'V') IS NULL THEN 'PASS' ELSE 'FAIL' END AS test2_type_V_null
GO

Issue 3 - In the temp_table_long_name_enr-vu-verify.sql at line 47: Duplicate scenario numbering:

-- Scenario 3: Basic CREATE, INSERT, SELECT, DROP with ENR validation
EXEC temp_long_name_enr_basic_proc
GO

-- Scenario 2: ALTER TABLE on long-name temp table
EXEC temp_long_name_enr_alter_proc
GO

The scenario numbers in the procedure-based section restart at "Scenario 2" after "Scenario 3". The numbering should continue sequentially (Scenario 4, 5, 6...) or restart clearly from 1 for the procedure-based section.

Suggested fix:

-- Scenario 3: Basic CREATE, INSERT, SELECT, DROP with ENR validation
EXEC temp_long_name_enr_basic_proc
GO

-- Scenario 4: ALTER TABLE on long-name temp table
EXEC temp_long_name_enr_alter_proc
GO

Issue 4 - In the pltsql_utils.c at line 2978: Comment style — opening /* should be on its own line for multi-line block comments:

	/* Do not truncate temp table names (starting with '#') since ENR stores the
	 * full name. The {tsql_ttname} scanner rule preserves temp table names without
	 * truncation, so the ENR entry uses the original untruncated name. Applying
	 * hash-based truncation here would produce a name that doesn't match the ENR. */

Per project conventions, multi-line comment blocks should have /* and */ on their own lines.

Suggested fix:

	/*
	 * Do not truncate temp table names (starting with '#') since ENR stores the
	 * full name. The {tsql_ttname} scanner rule preserves temp table names without
	 * truncation, so the ENR entry uses the original untruncated name. Applying
	 * hash-based truncation here would produce a name that doesn't match the ENR.
	 */

Issue 5 - PR Description: Test Scenarios section is empty:

The PR description template has all test scenario subsections (use case, boundary conditions, arbitrary inputs, negative test cases, upgrade tests, etc.) left blank. While the actual test files are comprehensive, the PR description should document what is covered so reviewers can assess coverage without reading all test files.

Notes

The core approach (skip truncation for #-prefixed names in the bracket-delimiter scanner rule and OBJECT_ID path, plus an ENR fallback in bbf_table_var_lookup) is sound and follows established patterns already used in catalog.c and procedures.c.
The change correctly does NOT affect the {tsql_ttname} rule which already preserves the full name without truncation via pstrdup(yytext).
Security review of C source changes: 0 findings. No buffer overflows, NULL dereferences, injection risks, or use-after-free conditions introduced.
The reviewer comment from @ayushdsh suggests an alternative simpler approach (just truncate temp table names like other identifiers). This is an open design discussion — the current approach preserves full fidelity with T-SQL behavior where temp table names can be up to 116 characters, but the alternative would be simpler at the cost of breaking the babelfish_get_enr_list view and potentially OBJECT_ID lookups.
Upgrade test schedules are properly updated across all relevant minor versions.
The enrtype check (verifying enr->md.enrtype == ENR_TSQL_TEMP) is not performed in the new fallback. Other sibling call sites do check this. While non-TSQL-TEMP ENRs cannot have #-prefixed names through user-reachable syntax, adding the check would be a defensive consistency improvement.

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

Reverts code changes from commits f313cde and 798c7dd. Test files are kept. Replaced by new approach using reloptions.

anju15bharti · 2026-06-11T13:04:39Z

/code-review

kuntalghosh

AI Code Review

Verdict: COMMENT

Based on the provided code changes, here are my feedback and suggestions:

Issue 1 - In the functions.c at line 2836: get_orig_temp_table_name returns pointer into partially-freed buffer:

static char *
get_orig_temp_table_name(Oid relid)
{
	HeapTuple	tuple = SearchSysCache1(RELOID, ObjectIdGetDatum(relid));
	char	   *orig = NULL;

	if (HeapTupleIsValid(tuple))
	{
		Datum	datum;
		bool	isnull;

		datum = SysCacheGetAttr(RELOID, tuple, Anum_pg_class_reloptions, &isnull);
		if (!isnull)
		{
			ArrayType *reloptions = DatumGetArrayTypeP(datum);
			orig = get_value_by_name_from_array(reloptions, ATTOPTION_BBF_ORIGINAL_TABLE_NAME);
		}
		ReleaseSysCache(tuple);
	}
	return orig;
}

get_value_by_name_from_array returns a pointer into the middle of a TextDatumGetCString-allocated buffer (it splits on = and returns the value portion). For non-TOAST reloptions, DatumGetArrayTypeP returns a pointer directly into the syscache tuple. When ReleaseSysCache is called, the tuple may be freed, invalidating the reloptions datum. However, because TextDatumGetCString inside get_value_by_name_from_array copies the text datum to a palloc'd buffer via array_ref, the returned string should remain valid after ReleaseSysCache.

This is safe but subtle. Consider adding a brief comment noting that orig remains valid after ReleaseSysCache because get_value_by_name_from_array copies the value into CurrentMemoryContext, or alternatively pstrdup the result before releasing the syscache for defensive clarity.

Issue 2 - In the functions.c at line 2904: result_text uninitialized before conditional assignment:

		if (pg_class->relpersistence == RELPERSISTENCE_TEMP &&
			NameStr(pg_class->relname)[0] == '#')
		{
			char *orig = get_orig_temp_table_name(object_id);
			if (orig)
				result_text = cstring_to_text(orig);
		}

		if (!result_text)
			result_text = cstring_to_text(NameStr(pg_class->relname));

The code relies on result_text being NULL at this point. Looking at the broader context of object_name(), result_text is declared earlier as text *result_text = NULL, and the ENR path above returns early via PG_RETURN_VARCHAR_P. So when we reach this code, result_text is indeed NULL. This is correct but the logic flow is non-obvious. Consider adding a comment like /* result_text is still NULL here since ENR path returns early */ for clarity.

Issue 3 - In the pl_handler.c at line 5407: No bounds validation on loc before accessing queryString:

int loc = intVal(opt->arg);
char *full_name = extract_identifier(queryString + loc, NULL);

The loc value comes from @7 in the grammar (the parser's position token for the index name), which is guaranteed to be within the query string bounds since the parser produced it from parsing that exact string. This follows the same pattern as existing name_location usage elsewhere in the file (lines 3900, 4087, 4142). No change needed, but a brief comment noting that loc is a parser-generated position would improve readability.

Issue 4 - In the hooks.c at line 2502: Comment is now stale after condition expansion:

	/*
	 * Only store original_name if there's a difference, and if the difference
	 * is only in capitalization
	 */
	if ((strncmp(relname, original_name, strlen(relname)) != 0 && strncasecmp(relname, original_name, strlen(relname)) == 0) ||
		(relation->relpersistence == RELPERSISTENCE_TEMP && original_name[0] == '#' && strlen(original_name) >= NAMEDATALEN))

The comment above this condition says "Only store original_name if there's a difference, and if the difference is only in capitalization." This no longer describes the full condition — the second clause stores the original name for truncated temp tables regardless of case differences. The comment should be updated to describe both conditions.

Suggested fix:

	/*
	 * Store original_name in reloptions when:
	 * 1. There is a case-only difference between relname and original_name, OR
	 * 2. The identifier is a temp table name that was truncated (>= NAMEDATALEN)
	 */

Issue 5 - In the gram-tsql-rule.y at line 3560: name_location added unconditionally for all indexes:

				if (n->idxname)
					n->options = lappend(n->options,
						makeDefElem("name_location",
							(Node *) makeInteger(@7), @7));

The name_location option is appended for every index statement that has a name, not just temp table indexes. While the consuming code in pl_handler.c only uses it when the table is temp (relpersistence == RELPERSISTENCE_TEMP && relname[0] == '#'), and the option is removed from stmt->options before execution proceeds, the option node is still allocated and added to the list for all non-temp indexes as well. Since the consumer removes it via foreach_delete_current, this is functionally correct but adds minor unnecessary overhead for all regular index creation statements. This is acceptable for simplicity, but worth noting.

Missing Tests

The test coverage is thorough for the core functionality. A few edge cases that could strengthen it:

Concurrent sessions: Two sessions creating temp tables with the same long name. This would verify the truncation hash produces distinct internal names. (Could be an isolation test.)
SELECT INTO with long temp table name: SELECT * INTO #very_long_name... FROM ... — this goes through a different DDL path than explicit CREATE TABLE.

Notes

Prior AI review issues have been addressed: the currentQueryEnv NULL guard issue is no longer relevant (the approach was completely redesigned to use reloptions instead of an ENR lookup fallback), and the misleading test comment about OBJECT_ID type filtering was clarified.
The .out file changes show that babelfish_get_enr_list() now returns the original user-facing index names (e.g., #idx1_intermediate) instead of the internal composite names (e.g., #idx1_intermediate#t_intermediac7f5...). This is a user-visible behavior change for the babelfish_get_enr_list() system function that improves usability.
The approach is well-designed: truncating at scanner level ensures consistent names in ENR and syscache, while preserving the original name in reloptions uses existing infrastructure without engine changes.
Target branch BABEL_6_X_DEV is correct for this feature work.
The PR description test scenarios section is empty (only headers, no specific scenarios listed). While the test files themselves are comprehensive, the PR template checklist items for test scenarios should ideally be filled in for reviewer context.

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

ayushdsh

the approach looks fine to me .. i just have some questions -

why have we explicitly added temp table handling? wouldn't the same be required for other identifiers as well?
w.r.t the function get_enr_list(), we are not returning the original name .. but with this .. we are returning the orig_name of indexes as well however, indexes with same name can exists on multiple tables. do you think this can cause confusion? I am okay with it, but just something to consider.

ayushdsh · 2026-06-15T13:24:44Z

+			/* Use original untruncated name from reloptions if available */
+			if (md->enrtype == ENR_TSQL_TEMP)
+			{
+				char *orig = get_orig_temp_table_name(md->reliddesc);


Nit: can we instead make the utility itself handle all of this - get_orig_temp_table_name(Enr) returns fullname or truncated name.

ayushdsh · 2026-06-15T13:26:21Z

+			if (pg_class->relpersistence == RELPERSISTENCE_TEMP &&
+				NameStr(pg_class->relname)[0] == '#')
+			{


wouldn't we be handling this for all identifiers? are we doing this for now for temp tables only?

ayushdsh · 2026-06-15T14:30:23Z

 					stmt->excludeOpNames = NIL;
+
+					/*
+					 * For indexes on temp tables, extract the full original


why just temp tables? why aren't we doing this for all indexes

ayushdsh · 2026-06-15T14:40:43Z

+			pfree(original_name);
+		}
+		else if (original_name)
+			pfree(original_name);


Nit: unnecessary. pfree handles NULLs

ayushdsh · 2026-06-15T15:28:23Z

 static List *
-transformSelectIntoStmt(CreateTableAsStmt *stmt)
+transformSelectIntoStmt
+(CreateTableAsStmt *stmt, const char *queryString)


are we doing the same for other SELECT INTOs as well? passing the queryString here and will use it to extract the actual name? why can't we do the same we do for CREATE TABLE ... storing the orig_name in reloptions?

ayushdsh · 2026-06-15T15:34:08Z

+					if (n->idxname)
+						n->options = lappend(n->options,
+							makeDefElem("name_location",
+								(Node *) makeInteger(@7), @7));
+


this is unconditional - we are populating the name_location field no matter what? should we add a NAMEDATALEN check? and btw this will also happen for indexes on permanent tables as well .. this is expected right?

ayushdsh · 2026-06-15T15:36:08Z

+			if (pg_class->relpersistence == RELPERSISTENCE_TEMP &&
+				NameStr(pg_class->relname)[0] == '#')


Nit: lets create a utility for this?

bool IsTsqlTempTable(Relation)...

fix: ENR temp table creation fails for long table names

798c7dd

Signed-off-by: anju15bharti <abanju@amazon.com>

anju15bharti changed the title ~~fix: ENR temp table creation fails for long table names~~ Fix: ENR temp table creation fails for long table names Jun 2, 2026

anju15bharti closed this Jun 2, 2026

anju15bharti reopened this Jun 2, 2026

anju15bharti mentioned this pull request Jun 2, 2026

Fix: ENR temp table creation fails for long table names babelfish-for-postgresql/postgresql_modified_for_babelfish#769

Open

1 task

anju15bharti requested a review from ayushdsh June 2, 2026 10:10

anju15bharti commented Jun 2, 2026

View reviewed changes

kuntalghosh reviewed Jun 3, 2026

View reviewed changes

ayushdsh reviewed Jun 9, 2026

View reviewed changes

Comment thread contrib/babelfishpg_tsql/src/pl_handler.c Outdated

ayushdsh reviewed Jun 9, 2026

View reviewed changes

Comment thread contrib/babelfishpg_tsql/src/pltsql_utils.c Outdated

Add IS_TDS_CLIENT() check and minor updated in testfiles

f313cde

tanscorpio7 requested changes Jun 9, 2026

View reviewed changes

Comment thread contrib/babelfishpg_tsql/src/backend_parser/scan-tsql-rule.l Outdated

Revert: ENR temp table creation for long table names

8872ab4

Reverts code changes from commits f313cde and 798c7dd. Test files are kept. Replaced by new approach using reloptions.

anju15bharti force-pushed the BABEL-6433 branch 2 times, most recently from 5e82847 to a509bd4 Compare June 11, 2026 12:47

kuntalghosh reviewed Jun 11, 2026

View reviewed changes

anju15bharti added 2 commits June 12, 2026 12:06

Store original name in reloption for temp table and index on it

70c3a56

Merge branch 'BABEL_6_X_DEV' into BABEL-6433

5380950

anju15bharti force-pushed the BABEL-6433 branch 3 times, most recently from 89cde96 to d339237 Compare June 12, 2026 13:28

Fix failures and Add testcases

31cbe64

anju15bharti force-pushed the BABEL-6433 branch from d339237 to 31cbe64 Compare June 15, 2026 04:55

ayushdsh reviewed Jun 15, 2026

View reviewed changes

		if (pg_class->relpersistence == RELPERSISTENCE_TEMP &&
		NameStr(pg_class->relname)[0] == '#')

Conversation

anju15bharti commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Issues Resolved

Test Scenarios Covered

Check List

Uh oh!

anju15bharti left a comment

Choose a reason for hiding this comment

Uh oh!

ayushdsh commented Jun 3, 2026

Uh oh!

ayushdsh commented Jun 3, 2026

Uh oh!

ayushdsh commented Jun 3, 2026

Uh oh!

kuntalghosh left a comment

Choose a reason for hiding this comment

AI Code Review

Notes

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

anju15bharti commented Jun 11, 2026

Uh oh!

kuntalghosh left a comment

Choose a reason for hiding this comment

AI Code Review

Missing Tests

Notes

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

Uh oh!

ayushdsh left a comment

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ayushdsh Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

anju15bharti commented Jun 2, 2026 •

edited

Loading

ayushdsh Jun 15, 2026 •

edited

Loading