[BABEL] INSERT EXECUTE Redesign PR4 complete redesign#4847
Conversation
9eb4571 to
6808d3d
Compare
| /* | ||
| * For INSERT EXEC, any rewrite recorded for the INSERT | ||
| * target (e.g. the db/schema qualifier in "otherdb..t_target") sits before | ||
| * the EXEC start and would map to a negative offset. Those entries belong | ||
| * to the INSERT destination, not the executed statement, so drop them | ||
| * before running the mutator. | ||
| */ |
There was a problem hiding this comment.
Please share an argument why its okay to remove those cleanups.
There was a problem hiding this comment.
the entries we drop are target-side rewrites that belong to a different part of the statement, not to the EXEC statement this mutator operates on. So removing them loses nothing, and the target still gets resolved through its own separate path.
|
/code-review |
1 similar comment
|
/code-review |
kuntalghosh
left a comment
There was a problem hiding this comment.
AI Code Review
Verdict: COMMENT
Based on the provided code changes, here are my feedback and suggestions:
Issue 1 - In pl_insert_exec.c at line 299: PG_CATCH around LockRelationOid swallows all errors:
PG_TRY();
{
LockRelationOid(relid, RowExclusiveLock);
}
PG_CATCH();
{
FlushErrorState();
return;
}
PG_END_TRY();This catch-all pattern silently discards every possible error including query cancellation (SIGINT), deadlocks (ERRCODE_DEADLOCK_DETECTED), and out-of-memory. When a lock attempt fails for a reason other than "table dropped concurrently", the function returns without setting target_rel_oid, silently disabling the DDL-detection hook that guards against schema changes. At minimum, re-throw fatal/panic-level errors and cancellation signals. Consider only catching ERRCODE_LOCK_NOT_AVAILABLE or testing OidIsValid post-lock with missing_ok=true.
The same pattern repeats around SPI_prepare (line ~358) and table_open (line ~393) in pltsql_insert_exec_validate_column_count_from_query. While the column-count function is a best-effort optimization, swallowing cancellation there can delay user-initiated abort.
Suggested fix:
PG_TRY();
{
LockRelationOid(relid, RowExclusiveLock);
}
PG_CATCH();
{
/* Only suppress benign errors (table dropped etc.) - re-throw the rest */
ErrorData *edata = CopyErrorData();
FlushErrorState();
if (edata->sqlerrcode == ERRCODE_QUERY_CANCELED ||
edata->sqlerrcode == ERRCODE_ADMIN_SHUTDOWN ||
edata->elevel >= FATAL)
{
ReThrowError(edata);
}
FreeErrorData(edata);
return;
}
PG_END_TRY();Issue 2 - In pl_insert_exec.c at line 793: table_open/table_close per tuple in insertexec_receive:
/* Open temp table fresh for each tuple - avoids stale handles across subtransactions */
temp_rel = table_open(insert_exec_ctx.temp_table_oid, RowExclusiveLock);
...
table_tuple_insert(temp_rel, insert_slot, myState->cid, 0, NULL);
table_close(temp_rel, RowExclusiveLock);Opening and closing the relation for every single row is costly for large result sets. The comment mentions avoiding stale handles across subtransaction boundaries, but insertexec_receive runs in the same command — subtransaction boundaries only appear between statements, not between rows of a single result. Consider opening the relation once in insertexec_startup and storing it in DR_insertexec, closing it in insertexec_shutdown. If there is a specific scenario where the handle goes stale mid-receive (please document it), batch the open/close across N rows instead of per-row.
Issue 3 - In pl_insert_exec.c at line 188: target_table allocated in per-statement memory context:
insert_exec_ctx.target_table = target_table ? pstrdup(target_table) : NULL;pltsql_set_insert_exec_context_info is called from insert_exec_setup (which runs in the per-statement memory context of exec_stmt_exec). Safety-net cleanups in pltsql_xact_cb and pltsql_estate_cleanup call pltsql_insert_exec_reset_all() which does pfree(insert_exec_ctx.target_table). If the per-statement context has already been reset by the time these safety nets fire, pfree operates on a dangling pointer. Allocate in TopMemoryContext (since lifetime must span the entire INSERT EXEC) or TopTransactionContext.
Suggested fix:
insert_exec_ctx.target_table = target_table
? MemoryContextStrdup(TopMemoryContext, target_table) : NULL;Issue 4 - In pl_insert_exec.c: Missing trailing newline at end of file:
The file ends without a final newline character (} is the last byte). Most editors and compilers will produce a warning, and POSIX requires text files to end with a newline.
Issue 5 - In pl_insert_exec.c at lines 228-230 and 848-850: Function parameter continuation uses spaces instead of tabs:
void
pltsql_insert_exec_open_target_table(const char *target_table,
const char *schema_name_in,
const char *db_name_in)The project convention is tab-based indentation. Multi-line function parameters should use tabs for continuation indentation (as seen in flush_insert_exec_temp_table at line 564). This applies to insert_exec_setup as well.
Issue 6 - In scan-warnings.sh: Leak count drops from 416 to 29 without explanation:
-if [[ "$LEAK_COUNT" -ne 416 ]]; then
- echo "Error: Expected 416 leak warnings, but found $LEAK_COUNT"
+if [[ "$LEAK_COUNT" -ne 29 ]]; then
+ echo "Error: Expected 29 leak warnings, but found $LEAK_COUNT"A reduction from 416 to 29 leak warnings is a massive improvement (387 fewer leaks). The PR description does not explain this. If the redesign genuinely eliminates these leaks (by moving from per-estate tuple stores to the temp-table approach), this is excellent — but it deserves a brief note in the PR description or commit message explaining why the leak count dropped so dramatically, to help future maintainers understand this isn't an error in the count.
Issue 7 - In pl_exec-2.c at line 1381: insert_exec_success_cleanup called outside PG_FINALLY:
exec_succeeded = true;
}
PG_FINALLY();
{
if (!exec_succeeded && ...)
pltsql_insert_exec_reset_all();
...
}
...
/* Flush temp table to target table and cleanup after procedure completes */
if (insert_exec_setup_done)
insert_exec_success_cleanup(estate, stmt->insert_exec);
return PLTSQL_RC_OK;If insert_exec_success_cleanup itself throws (e.g., permission denied during flush INSERT, type coercion failure), the INSERT EXEC context is cleaned up inside its own PG_CATCH, which is good. However, the implicit transaction commit at the end of insert_exec_success_cleanup happens after pltsql_insert_exec_reset_all() has zeroed the context. If the commit fails, the context is already gone. This seems intentional (commit failure means transaction aborts and the xact callback cleans up), but warrants a comment explaining the ordering rationale.
Issue 8 - In pl_insert_exec.c at line 632: Memory leak in flush_insert_exec_temp_table:
appendStringInfo(&flush_query,
"INSERT INTO %s%s SELECT * FROM %s",
qualified_target,
column_list_str ? psprintf(" (%s)", column_list_str) : "",
temp_name);
pfree(qualified_target);The psprintf(" (%s)", column_list_str) call allocates memory that is never freed. Since this runs once per INSERT EXEC, the leak is small, but it's good practice to assign it to a local and free it afterward.
Issue 9 - In guc.c: GUC default changed from false to true — enabling the new code path by default:
-bool pltsql_enable_new_insert_exec = false;
+bool pltsql_enable_new_insert_exec = true;This means the new INSERT EXEC redesign is now the default path. This is a significant behavioral change. Ensure this is intentional for the target branch (BABEL_6_X_DEV). The PR description says the feature is "gated behind" the GUC, but now it's enabled by default — the description should be updated to reflect that this PR also flips the default.
Notes
- Prior reviewer (tanscorpio7) raised several points. The author has responded to most. The comments about moving code from
hooks.c(line 3625) and usingignore_catch_block_for_insert_exec(lines 1374, 1621 in iterative_exec.c) appear addressed in this revision. - The test file is named
BABEL-INSERT-EXEC.sqlwhich uses theBABEL-prefix (feature-based, not JIRA-based). This is consistent with the existingBABEL-INSERT-EXECUTE.sqlin the same directory. Acceptable naming. - Test coverage is comprehensive: 14 categories (A through N) covering basic scenarios, column mapping, dynamic SQL, nested procedures, identity columns, temp tables, advanced SQL, transactions, TRY/CATCH, errors, large result sets, OUTPUT clause, @@TRANCOUNT visibility, and cross-database targets. This satisfies the test requirements for a new feature of this scope.
- The
.outfile (1316 lines) matches the.sqlinput file (938 lines) and is auto-generated — no review needed. - The GUC visibility level is
PGC_SUSETwhich is appropriate (superuser-only setting).
Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.
|
Issue 1 : Fixed, only suppressing benign errors and re-throwing the rest. Issue 2 : Invalid fix, it will lead to server crashes. The table_open and table_close must happen under same resource owner, whereas the DestReceiver's Issue 3 : Fixed, now allocating target table in Issue 4 : Nit, added a blank line at eof. |
4719e15 to
f2fb902
Compare
OOM, FATAL, and admin shutdown so they cannot be silently swallowed. - Allocate target_table in TopTransactionContext so the safety-net xact cleanup never dereferences a freed per-statement allocation.
…arget schema via the user's default schema, remove the unnecessary LockRelationOid try/catch, clean up the column-count probe error handling, and guard table-variable cleanup against an aborted transaction (fixes an IsTransactionState() crash on the INSERT EXEC error path in a function). Adds same-session DDL detection tests.
9eb3036 to
3908d79
Compare
…rams instead of SPI_prepare and stop suppressing errors Prepare the source statement through SPI_prepare_params with the PL/tsql parser setup so @variables resolve and no errors are suppressed, and cache the plan in expr->plan for reuse.
3908d79 to
c4ed99a
Compare
| /* | ||
| * Resolve the logical T-SQL schema name for an INSERT EXEC target when the | ||
| * statement did not qualify it. | ||
| */ | ||
| static char * | ||
| resolve_insert_exec_schema_name(const char *schema_name_in, const char *db_name_in) | ||
| { | ||
| const char *db; | ||
| char *user; | ||
| char *default_schema; | ||
| char *result; | ||
|
|
||
| if (schema_name_in != NULL) | ||
| return pstrdup(schema_name_in); | ||
|
|
||
| db = (db_name_in != NULL) ? db_name_in : get_cur_db_name(); | ||
| user = get_user_for_database(db); | ||
| default_schema = user ? get_authid_user_ext_schema_name(db, user) : NULL; | ||
|
|
||
| result = pstrdup(default_schema ? default_schema : "dbo"); | ||
|
|
||
| if (default_schema) | ||
| pfree(default_schema); | ||
| if (user) | ||
| pfree(user); | ||
|
|
||
| return result; |
There was a problem hiding this comment.
The previous revision used exec_batch which handled schema resolution implicitly. This PR switches to SPI_Execute for the flush, which requires manual schema resolution adding maintenance burden. Can you explain what issue with exec_batch motivated this change ?
There was a problem hiding this comment.
It crashes the backend with INSERT-EXEC-inside-a-table-valued-function case.
Ex query :
CREATE FUNCTION dbo.fn_q1() RETURNS @t TABLE (a INT, b VARCHAR(10)) AS
BEGIN
INSERT INTO @t EXEC dbo.p_q1;
RETURN;
END;
SELECT * FROM dbo.fn_q1();
what I come to know as the reason :
execute_batch builds a non-atomic InlineCodeBlock and runs it through pltsql_inline_handler as a top-level batch. Doing that while nested inside the TVF's active, atomic executor frame performs transaction-boundary management in an atomic scope, leaving no active transaction when the flush's @t catalog lookup runs (Assert(IsTransactionState()))
| Assert(insert_exec_ctx == NULL); | ||
| insert_exec_ctx = MemoryContextAllocZero(TopTransactionContext, | ||
| sizeof(InsertExecContext)); |
There was a problem hiding this comment.
InsertExecContext is a static struct. Why allocate it dynamically on the heap each time instead of just zeroing the static instance?
There was a problem hiding this comment.
The NULL pointer doubles as the active flag, and TopTransactionContext gives automatic cleanup on aborted statements so no stale state. It's one small alloc per INSERT EXEC, so I kept it over a static struct and separate flag.
| /* | ||
| * Block COMMIT during INSERT EXEC if NestedTranCount <= 1. | ||
| * | ||
| * INSERT EXEC implicitly makes @@TRANCOUNT = 1. COMMIT is only | ||
| * blocked if it would make @@TRANCOUNT go from 1 to 0. If the | ||
| * procedure did BEGIN TRAN first (@@TRANCOUNT = 2), then COMMIT | ||
| * is allowed (@@TRANCOUNT goes from 2 to 1). | ||
| * | ||
| * INSERT EXEC detection differs by path: the new path uses the | ||
| * global context (pltsql_insert_exec_active), the legacy path | ||
| * the per-estate flag (estate->insert_exec). | ||
| */ | ||
| if (((pltsql_insert_exec_active()) || | ||
| (exec_state_call_stack && |
There was a problem hiding this comment.
What happens if INSERT EXEC is called inside a nested transaction?
BEGIN TRAN
BEGIN TRAN
INSERT INTO t EXEC p1 -- how many COMMITs can the proc issue?At this point @@TRANCOUNT = 3. Can the proc COMMIT down to 2 or 1 ?
There was a problem hiding this comment.
yes, proc can commit down to 1, but not to 0, the commit which does 1->0 is blocked, as expected.
[jfi : I think @@TRANCOUNT would be 2 here, implicit transaction doesn't add to trancount, if there is already an explicit open transaction]
column-count validation to prepare_stmt_execsql, extract the COMMIT/ROLLBACK INSERT-EXEC guard into a helper, and drop redundant temp-table/cleanup special-casing
6dda378 to
79285c4
Compare
Description
Background: INSERT...EXEC Redesign
The current INSERT...EXEC implementation has limitations with dynamic tsql, transaction handling, error recovery, and TRY-CATCH compatibility. We're redesigning it using a temp table buffering approach using the already implemented error and transaction handling in Babelfish :
PR Series Overview:
This PR:
This PR fully implements the redesign approach but with a
pltsql_enable_new_insert_execguc. This guc is default true to run the tests against the new redesign approach in this PR. We can also toggle between the new and old approach using the guc to compare the tests and to make sure we do not end up cleaning anything which breaks up our code in the later PR.This redesign approach also reduces leak warnings from 416 to just 29. Whatever leak warnings were coming from old tuple store approach has now been eliminated with this appraoch.
What this enables:
@tablevar) and temp tables (#temp) as targetsFiles Changed:
exec_stmt_exec,exec_stmt_exec_sp,exec_stmt_exec_batchIssues Resolved
PR6 of Task: BABEL-5522
Fixed Issues:
@tablevar) not recognized correctlyTest Scenarios Covered
Check List
By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.