Skip to content

[BABEL] INSERT EXECUTE Redesign PR4 complete redesign#4847

Merged
tanscorpio7 merged 11 commits into
babelfish-for-postgresql:BABEL_6_X_DEVfrom
monk0707:insert-exec-integration-via-6-v2
Jun 24, 2026
Merged

[BABEL] INSERT EXECUTE Redesign PR4 complete redesign#4847
tanscorpio7 merged 11 commits into
babelfish-for-postgresql:BABEL_6_X_DEVfrom
monk0707:insert-exec-integration-via-6-v2

Conversation

@monk0707

@monk0707 monk0707 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Description

Background: INSERT...EXEC Redesign

The current INSERT...EXEC implementation has limitations with dynamic tsql, transaction handling, error recovery, and TRY-CATCH compatibility. We're redesigning it using a temp table buffering approach using the already implemented error and transaction handling in Babelfish :

  1. Procedure output is captured into a temporary buffer table via a custom DestReceiver
  2. After procedure execution completes, buffered rows are flushed to the target table
  3. This enables proper TRY-CATCH error handling and transaction semantics matching SQL Server

PR Series Overview:

PR Focus Description
PR0 GUC Feature flag to gate the new code path
PR1 Parser Parse INSERT...EXEC and extract target table info
PR2 Temp Table Functions for creating and flush temp table
PR3 DestReceiver Capture procedure output to temp table
PR4 Integration Wire everything together with tests

This PR:

This PR fully implements the redesign approach but with a pltsql_enable_new_insert_exec guc. This guc is default true to run the tests against the new redesign approach in this PR. We can also toggle between the new and old approach using the guc to compare the tests and to make sure we do not end up cleaning anything which breaks up our code in the later PR.
This redesign approach also reduces leak warnings from 416 to just 29. Whatever leak warnings were coming from old tuple store approach has now been eliminated with this appraoch.

What this enables:

  • correct handling of dynamic sql and output clause.
  • Correct transaction semantics matching SQL Server behavior
  • Proper column mapping and type coercion
  • TRY-CATCH error handling
  • Ownership chaining (procedures calling other procedures owned by the same user)
  • Table variables (@tablevar) and temp tables (#temp) as targets
  • INSTEAD OF triggers on target tables
  • All syntax variations (procedures, dynamic SQL, sp_executesql)

Files Changed:

  • pl_exec-2.c: Integrate INSERT EXEC setup/cleanup into exec_stmt_exec, exec_stmt_exec_sp, exec_stmt_exec_batch
  • pl_exec.c / iterative_exec.c: Add INSERT EXEC context checks.
  • tdsresponse.c: Route procedure output to INSERT EXEC DestReceiver when active
  • hooks.c: Add DROP TABLE protection for active INSERT EXEC target tables (SQL Server error 556)
  • session.c: Add INSERT EXEC state to session cleanup
  • Tests: Comprehensive JDBC tests (BABEL-INSERT-EXEC.sql - 855 lines input, 1205 lines expected output)

Issues Resolved

PR6 of Task: BABEL-5522

Fixed Issues:

  • Transaction behavior inconsistencies with @@TRANCOUNT and implicit transactions
  • Wrong column insertion when procedure output order differs from target
  • Server crashes with nested procedures and complex error scenarios
  • Memory leaks and improper cleanup
  • OUTPUT clause not working with INSERT EXEC
  • Various syntax parsing gaps for valid T-SQL
  • Ownership chaining broken for INSERT EXEC
  • Table variables (@tablevar) not recognized correctly
  • Type coercion not handling typmod differences (e.g., varchar(max) to varchar(10))

Test Scenarios Covered

  • Use case based - Basic INSERT EXEC with procedures, dynamic SQL, sp_executesql; Multiple result sets; Ownership chaining scenarios; Column list specifications; Table variables and temp tables as targets; Nested procedure calls; INSTEAD OF triggers on target tables
  • Cross Database - Cross DB and cross schema tests.
  • Boundary conditions - Empty result sets; Single row vs multiple rows; Maximum column counts; Type coercion edge cases (varchar(max) to varchar(10), int to datetime); Large result sets
  • Arbitrary inputs - Various data types; NULL handling; Unicode data
  • Negative test cases - Nested INSERT EXEC (error 8164); Column count mismatch (error 213); Type conversion failures; DROP TABLE on active target (error 556); Permission denied scenarios; Transaction errors
  • Minor version upgrade tests - N/A (feature redesign)
  • Major version upgrade tests - N/A (feature redesign)
  • Performance tests - Basic performance validation with large result sets
  • Tooling impact - None
  • Client tests - JDBC tests: BABEL-INSERT-EXEC, BABEL-INSERT-EXECUTE, proc_ownership_chaining, BABEL-2999, temp_table_rollback, BABEL-1944, datareader_datawriter

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is under the terms of the Apache 2.0 and PostgreSQL licenses, and grant any person obtaining a copy of the contribution permission to relicense all or a portion of my contribution to the PostgreSQL License solely to contribute all or a portion of my contribution to the PostgreSQL open source project.

@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch 10 times, most recently from 9eb4571 to 6808d3d Compare June 5, 2026 10:24
Comment thread contrib/babelfishpg_tds/src/backend/tds/tdsresponse.c Outdated
Comment thread contrib/babelfishpg_tsql/src/hooks.c Outdated
Comment thread contrib/babelfishpg_tsql/src/iterative_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/iterative_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec-2.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pltsql_utils.c Outdated
Comment thread contrib/babelfishpg_tsql/src/tsqlIface.cpp
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c Outdated
@KushaalShroff

Copy link
Copy Markdown
Contributor

/code-review

1 similar comment
@KushaalShroff

Copy link
Copy Markdown
Contributor

/code-review

@kuntalghosh kuntalghosh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI Code Review

Verdict: COMMENT

Based on the provided code changes, here are my feedback and suggestions:

Issue 1 - In pl_insert_exec.c at line 299: PG_CATCH around LockRelationOid swallows all errors:

PG_TRY();
{
    LockRelationOid(relid, RowExclusiveLock);
}
PG_CATCH();
{
    FlushErrorState();
    return;
}
PG_END_TRY();

This catch-all pattern silently discards every possible error including query cancellation (SIGINT), deadlocks (ERRCODE_DEADLOCK_DETECTED), and out-of-memory. When a lock attempt fails for a reason other than "table dropped concurrently", the function returns without setting target_rel_oid, silently disabling the DDL-detection hook that guards against schema changes. At minimum, re-throw fatal/panic-level errors and cancellation signals. Consider only catching ERRCODE_LOCK_NOT_AVAILABLE or testing OidIsValid post-lock with missing_ok=true.

The same pattern repeats around SPI_prepare (line ~358) and table_open (line ~393) in pltsql_insert_exec_validate_column_count_from_query. While the column-count function is a best-effort optimization, swallowing cancellation there can delay user-initiated abort.

Suggested fix:

PG_TRY();
{
    LockRelationOid(relid, RowExclusiveLock);
}
PG_CATCH();
{
    /* Only suppress benign errors (table dropped etc.) - re-throw the rest */
    ErrorData *edata = CopyErrorData();
    FlushErrorState();
    if (edata->sqlerrcode == ERRCODE_QUERY_CANCELED ||
        edata->sqlerrcode == ERRCODE_ADMIN_SHUTDOWN ||
        edata->elevel >= FATAL)
    {
        ReThrowError(edata);
    }
    FreeErrorData(edata);
    return;
}
PG_END_TRY();

Issue 2 - In pl_insert_exec.c at line 793: table_open/table_close per tuple in insertexec_receive:

/* Open temp table fresh for each tuple - avoids stale handles across subtransactions */
temp_rel = table_open(insert_exec_ctx.temp_table_oid, RowExclusiveLock);
...
table_tuple_insert(temp_rel, insert_slot, myState->cid, 0, NULL);
table_close(temp_rel, RowExclusiveLock);

Opening and closing the relation for every single row is costly for large result sets. The comment mentions avoiding stale handles across subtransaction boundaries, but insertexec_receive runs in the same command — subtransaction boundaries only appear between statements, not between rows of a single result. Consider opening the relation once in insertexec_startup and storing it in DR_insertexec, closing it in insertexec_shutdown. If there is a specific scenario where the handle goes stale mid-receive (please document it), batch the open/close across N rows instead of per-row.

Issue 3 - In pl_insert_exec.c at line 188: target_table allocated in per-statement memory context:

insert_exec_ctx.target_table = target_table ? pstrdup(target_table) : NULL;

pltsql_set_insert_exec_context_info is called from insert_exec_setup (which runs in the per-statement memory context of exec_stmt_exec). Safety-net cleanups in pltsql_xact_cb and pltsql_estate_cleanup call pltsql_insert_exec_reset_all() which does pfree(insert_exec_ctx.target_table). If the per-statement context has already been reset by the time these safety nets fire, pfree operates on a dangling pointer. Allocate in TopMemoryContext (since lifetime must span the entire INSERT EXEC) or TopTransactionContext.

Suggested fix:

insert_exec_ctx.target_table = target_table
    ? MemoryContextStrdup(TopMemoryContext, target_table) : NULL;

Issue 4 - In pl_insert_exec.c: Missing trailing newline at end of file:

The file ends without a final newline character (} is the last byte). Most editors and compilers will produce a warning, and POSIX requires text files to end with a newline.

Issue 5 - In pl_insert_exec.c at lines 228-230 and 848-850: Function parameter continuation uses spaces instead of tabs:

void
pltsql_insert_exec_open_target_table(const char *target_table,
                                     const char *schema_name_in,
                                     const char *db_name_in)

The project convention is tab-based indentation. Multi-line function parameters should use tabs for continuation indentation (as seen in flush_insert_exec_temp_table at line 564). This applies to insert_exec_setup as well.

Issue 6 - In scan-warnings.sh: Leak count drops from 416 to 29 without explanation:

-if [[ "$LEAK_COUNT" -ne 416 ]]; then
-    echo "Error: Expected 416 leak warnings, but found $LEAK_COUNT"
+if [[ "$LEAK_COUNT" -ne 29 ]]; then
+    echo "Error: Expected 29 leak warnings, but found $LEAK_COUNT"

A reduction from 416 to 29 leak warnings is a massive improvement (387 fewer leaks). The PR description does not explain this. If the redesign genuinely eliminates these leaks (by moving from per-estate tuple stores to the temp-table approach), this is excellent — but it deserves a brief note in the PR description or commit message explaining why the leak count dropped so dramatically, to help future maintainers understand this isn't an error in the count.

Issue 7 - In pl_exec-2.c at line 1381: insert_exec_success_cleanup called outside PG_FINALLY:

    exec_succeeded = true;
}
PG_FINALLY();
{
    if (!exec_succeeded && ...)
        pltsql_insert_exec_reset_all();
    ...
}
...
/* Flush temp table to target table and cleanup after procedure completes */
if (insert_exec_setup_done)
    insert_exec_success_cleanup(estate, stmt->insert_exec);

return PLTSQL_RC_OK;

If insert_exec_success_cleanup itself throws (e.g., permission denied during flush INSERT, type coercion failure), the INSERT EXEC context is cleaned up inside its own PG_CATCH, which is good. However, the implicit transaction commit at the end of insert_exec_success_cleanup happens after pltsql_insert_exec_reset_all() has zeroed the context. If the commit fails, the context is already gone. This seems intentional (commit failure means transaction aborts and the xact callback cleans up), but warrants a comment explaining the ordering rationale.

Issue 8 - In pl_insert_exec.c at line 632: Memory leak in flush_insert_exec_temp_table:

appendStringInfo(&flush_query,
    "INSERT INTO %s%s SELECT * FROM %s",
    qualified_target,
    column_list_str ? psprintf(" (%s)", column_list_str) : "",
    temp_name);

pfree(qualified_target);

The psprintf(" (%s)", column_list_str) call allocates memory that is never freed. Since this runs once per INSERT EXEC, the leak is small, but it's good practice to assign it to a local and free it afterward.

Issue 9 - In guc.c: GUC default changed from false to true — enabling the new code path by default:

-bool		pltsql_enable_new_insert_exec = false;
+bool		pltsql_enable_new_insert_exec = true;

This means the new INSERT EXEC redesign is now the default path. This is a significant behavioral change. Ensure this is intentional for the target branch (BABEL_6_X_DEV). The PR description says the feature is "gated behind" the GUC, but now it's enabled by default — the description should be updated to reflect that this PR also flips the default.

Notes

  • Prior reviewer (tanscorpio7) raised several points. The author has responded to most. The comments about moving code from hooks.c (line 3625) and using ignore_catch_block_for_insert_exec (lines 1374, 1621 in iterative_exec.c) appear addressed in this revision.
  • The test file is named BABEL-INSERT-EXEC.sql which uses the BABEL- prefix (feature-based, not JIRA-based). This is consistent with the existing BABEL-INSERT-EXECUTE.sql in the same directory. Acceptable naming.
  • Test coverage is comprehensive: 14 categories (A through N) covering basic scenarios, column mapping, dynamic SQL, nested procedures, identity columns, temp tables, advanced SQL, transactions, TRY/CATCH, errors, large result sets, OUTPUT clause, @@TRANCOUNT visibility, and cross-database targets. This satisfies the test requirements for a new feature of this scope.
  • The .out file (1316 lines) matches the .sql input file (938 lines) and is auto-generated — no review needed.
  • The GUC visibility level is PGC_SUSET which is appropriate (superuser-only setting).

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

@monk0707

monk0707 commented Jun 12, 2026

Copy link
Copy Markdown
Contributor Author

Issue 1 : Fixed, only suppressing benign errors and re-throwing the rest.

Issue 2 : Invalid fix, it will lead to server crashes. The table_open and table_close must happen under same resource owner, whereas the DestReceiver's rStartup, receive calls, and rShutdown/rDestroy execute under different ResourceOwners. Having table_open and table_close in different resource owners crashes the server.

Issue 3 : Fixed, now allocating target table in TopTransactionContext, as it remains to the end of insert-exec.

Issue 4 : Nit, added a blank line at eof.
Issue 5 : Nit, added tabs in place of spaces.
Issue 6 : Nit, added explanation in pr description.
Issue 7 : Nit, added a comment explaining it.
Issue 8 : Nit, we are ultimately releasing all memory at end during reset all or transaction end. I don't think we need to do it explicitly.
Issue 9 : Nit, added explanation in PR description.

Comment thread contrib/babelfishpg_tds/src/backend/tds/tdsresponse.c
Comment thread contrib/babelfishpg_tsql/src/iterative_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec-2.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec-2.c
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch 9 times, most recently from 3908d79 to c4ed99a Compare June 18, 2026 18:52
Comment thread contrib/babelfishpg_tsql/src/iterative_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec-2.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c
Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pltsql_utils.c Outdated
Comment thread contrib/babelfishpg_tsql/src/pltsql_utils.c Outdated
@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch 2 times, most recently from 6bd1571 to 3824cf2 Compare June 22, 2026 19:33
Comment thread contrib/babelfishpg_tsql/src/pl_exec.c
Comment on lines +619 to +624
/*
* Report rows-affected from the DestReceiver's captured-row count
*/
estate->eval_processed = insert_exec_ctx->rows_processed;
exec_set_rowcount(insert_exec_ctx->rows_processed);
exec_set_found(estate, insert_exec_ctx->rows_processed != 0);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these still required ? Wouldn't execute_batch->execsql set them ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the row_count set and eval_processed, they happen in the context which execute_batch initiates, hence they do not propogate back into the current context, so we need to call them separately

Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
Comment on lines +770 to +772
/* Open temp table fresh for each tuple - avoids stale handles across subtransactions */
temp_rel = table_open(insert_exec_ctx->temp_table_oid, RowExclusiveLock);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to open relation for every recieve, this will have performance penanlty ? relation open should stay open until top transaction commits.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it has perf penalty, but it is intentional. keeping the relation open for the top transaction commit, makes the resource owners inconsistence leading to "relcache reference is not owned by resource owner Portal" crash.
Saving the ResourceOwner during rStartup doesn't help because that owner may already be released by the time rShutdown (or a later rReceive) runs.

Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
estate->func->fn_is_trigger == PLTSQL_NOT_TRIGGER);

if (!pltsql_disable_batch_auto_commit &&
pltsql_support_tsql_transactions() &&

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add some test cases for PostgreSQL endpoint. Create a tsql procedure from Babelfish endpoint that does INSERT EXECUTE and call it from PostgreSQL endpoint.

Comment thread contrib/babelfishpg_tsql/src/pl_insert_exec.c Outdated
@tanscorpio7

Copy link
Copy Markdown
Contributor

/code-review

@kuntalghosh kuntalghosh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI Code Review

Verdict: REQUEST_CHANGES

Based on the provided code changes, here are my feedback and suggestions:

Issue 1 - In pl_insert_exec.c at line 771: table_open/table_close per tuple in insertexec_receive is a performance concern:

/* Open temp table fresh for each tuple - avoids stale handles across subtransactions */
temp_rel = table_open(insert_exec_ctx->temp_table_oid, RowExclusiveLock);
...
table_tuple_insert(temp_rel, insert_slot, myState->cid, 0, NULL);
...
table_close(temp_rel, RowExclusiveLock);

Opening and closing the relation for every single row is costly for large result sets. The insertexec_receive callback runs within a single command — subtransaction boundaries do not occur between rows of one result set. The comment says "avoids stale handles across subtransactions" but that scenario cannot happen within a single PortalRun. This was also raised by reviewer tanscorpio7 (June 22). Consider opening the relation once in insertexec_startup and storing it in DR_insertexec, then closing it in insertexec_shutdown. If there is a specific edge case (e.g., multiple result sets from nested procedure calls triggering rShutdown/rStartup cycles), document it in a code comment.

Issue 2 - In iterative_exec.c at line 1131: ignore_catch_block_for_insert_exec dereferences estate->cur_error without NULL check:

return (estate->cur_error->error != NULL &&
        estate->cur_error->error->sqlerrcode == ERRCODE_DATATYPE_MISMATCH);

cur_error is a pointer that can be NULL (the same file checks estate->cur_error == NULL elsewhere, e.g., around line 177). The current call site happens to run after cur_error is known non-NULL, but this helper could be called from other contexts in future refactors, causing a backend crash.

Suggested fix:

if (estate->cur_error == NULL || estate->cur_error->error == NULL)
    return false;
return estate->cur_error->error->sqlerrcode == ERRCODE_DATATYPE_MISMATCH;

Issue 3 - In pl_handler.c at line 3093: estate->func dereferenced without NULL check after insert_exec_flush_estate substitution:

estate = insert_exec_flush_estate ? insert_exec_flush_estate
                                  : get_current_tsql_estate();
...
if (!estate || strncmp(relname, "@", 1) != 0)
    return relid;

foreach(lc, estate->func->table_varnos)   /* no NULL check on ->func */

The guard on line 3087 checks !estate but does not check estate->func. If insert_exec_flush_estate is ever set with a partially initialized estate (e.g., during early construction or by a future caller), this dereferences NULL. Today the callers guarantee func != NULL, but the invariant is fragile and undocumented.

Suggested fix:

if (!estate || !estate->func || strncmp(relname, "@", 1) != 0)
    return relid;

Issue 4 - In pl_insert_exec.c at line 631: psprintf result inside appendStringInfo is leaked:

appendStringInfo(&flush_query,
    "INSERT INTO %s%s SELECT * FROM %s",
    qualified_target,
    column_list_str ? psprintf(" (%s)", column_list_str) : "",
    temp_name);

The intermediate psprintf allocation is never freed. While bounded by the memory context lifetime, this is a recurring pattern across INSERT EXEC operations that accumulates per-statement. Build the fragment up front and free it:

char *cols_frag = column_list_str ? psprintf(" (%s)", column_list_str) : NULL;
appendStringInfo(&flush_query, "INSERT INTO %s%s SELECT * FROM %s",
                 qualified_target, cols_frag ? cols_frag : "", temp_name);
if (cols_frag)
    pfree(cols_frag);

Issue 5 - In pl_insert_exec.c at line 550: pltsql_insert_exec_validate_column_count leaks SPI plan if table_open throws:

plan = SPI_prepare_params(expr->query, ...);
...
query_natts = result_desc->natts;

temp_rel = table_open(temp_table_oid, AccessShareLock);   /* may throw */
temp_tupdesc = RelationGetDescr(temp_rel);
temp_natts = temp_tupdesc->natts;
table_close(temp_rel, AccessShareLock);

SPI_freeplan(plan);

If table_open throws (e.g., temp table concurrently dropped within the same session by a sub-batch), the SPI plan is leaked until SPI/transaction cleanup. While bounded, this should use PG_TRY/PG_FINALLY to free the plan on both paths, or re-order so table_open runs before SPI_prepare_params returns successfully.

Suggested fix:

plan = SPI_prepare_params(...);
if (plan == NULL)
    return;
...
PG_TRY();
{
    temp_rel = table_open(temp_table_oid, AccessShareLock);
    temp_tupdesc = RelationGetDescr(temp_rel);
    temp_natts = temp_tupdesc->natts;
    table_close(temp_rel, AccessShareLock);
}
PG_FINALLY();
{
    SPI_freeplan(plan);
}
PG_END_TRY();

Issue 6 - In tsqlIface.cpp at line 897: apply_exec_expression_rewriting erases all rewrites before EXEC start without bounds checking:

size_t exec_start = baseCtx->getStart()->getStartIndex();
for (auto it = rewritten_query_fragment.begin(); it != rewritten_query_fragment.end(); )
{
    if (it->first < exec_start)
        it = rewritten_query_fragment.erase(it);
    else
        ++it;
}

This unconditionally erases every entry in the global rewritten_query_fragment map with an offset before the EXEC keyword. If there is any non-INSERT-EXEC statement whose rewrite entries share the map (e.g., a batch with multiple statements), this could delete rewrites belonging to other statements in the same batch. The function should only run this cleanup when the statement actually has INSERT EXEC info. Add a guard:

InsertExecInfo *info = get_insert_exec_info_from_stmt(stmt);
if (info != NULL)
{
    // ... erase loop
}

Or confirm that rewritten_query_fragment is per-statement and document that assumption.

Issue 7 - In tsqlIface.cpp at line 2168: Function-scope INSERT EXEC check changed to always reject, but runtime allows it:

/* INSERT EXEC is not allowed in a function */
if (is_compiling_create_function())
{
    auto ddl_object = ctx->insert_statement()->ddl_object();
    if (ddl_object)
        throw PGErrorWrapperException(ERROR, ERRCODE_INVALID_FUNCTION_DEFINITION,
            "'INSERT EXEC' cannot be used within a function", getLineAndPos(ddl_object));
}

The compile-time check now rejects ALL INSERT EXEC in functions unconditionally (the !ddl_object->local_id() check for table variables was removed). However, exec_stmt_exec at runtime still has && stmt->insert_exec == NULL to allow INSERT EXEC into table variables inside functions. This inconsistency means the compile-time error will fire for table-variable targets that the runtime explicitly permits. Either the compile-time check needs to preserve the table-variable exception, or the runtime exception in exec_stmt_exec should also be removed. Please clarify the intended behavior and make both paths consistent.

Issue 8 - Missing defensive check: insert_exec_flush_estate in insert_exec_flush_and_cleanup should assert non-reentrant use:

flush_estate_saved = insert_exec_flush_estate;
insert_exec_flush_estate = estate;

Reviewer tanscorpio7 (June 22) noted that saving the previous value implies reentrant usage is possible, but INSERT EXEC cannot nest. If the saved value is non-NULL, something is already wrong. Convert this to an error rather than silently saving:

if (insert_exec_flush_estate != NULL)
    elog(ERROR, "INSERT EXEC flush already in progress");
insert_exec_flush_estate = estate;

Notes

  • Prior reviews addressed. Issues raised in the June 11 AI review (Issue 1: error suppression pattern, Issue 3: memory context for target_table, Issues 4-5: formatting) appear addressed in this revision. Reviewers tanscorpio7 and KushaalShroff raised many points; the author has responded to most. Several open items from tanscorpio7's June 22 review (per-row table_open, removing xact callback cleanup, insert_exec_flush_estate non-reentrant check) remain unresolved in this revision.
  • Security review summary. No critical or high-severity security issues found. One medium-severity finding: the bbf_table_var_lookup change (Issue 3 above) dereferences estate->func without validating it after the insert_exec_flush_estate substitution — currently unreachable but fragile. Multiple low-severity bounded memory leaks (Issues 4) and an informational note about the GUC default flip expanding blast radius. No SQL injection risk — all dynamic SQL composition uses quote_identifier() consistently.
  • GUC default flip. The PR flips pltsql_enable_new_insert_exec from false to true. This means every existing session exercises the new code path without opt-in. The PR description acknowledges the GUC exists but should explicitly note it is now enabled by default, not just "gated behind" it.
  • Test coverage. Comprehensive: 14+ categories (A through N + P, Q, R) covering basic scenarios, column mapping, dynamic SQL, nested procedures, identity columns, temp tables, advanced SQL, transactions, TRY/CATCH, errors, large result sets, OUTPUT clause, @@TRANCOUNT, cross-database, DDL protection, function scope, and variable-source scenarios. Test requirements for a new feature of this scope are satisfied.
  • Leak count reduction. Dropping from 416 to 29 leak warnings is a substantial improvement validating the redesign approach (temp table buffering eliminates per-estate tuple store leaks).

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch from a8759b6 to 1288124 Compare June 23, 2026 10:26
monk0707 added 9 commits June 23, 2026 11:28
  OOM, FATAL, and admin shutdown so they cannot be silently swallowed.
- Allocate target_table in TopTransactionContext so the safety-net xact
  cleanup never dereferences a freed per-statement allocation.
…arget

schema via the user's default schema, remove the unnecessary LockRelationOid

try/catch, clean up the column-count probe error handling, and guard

table-variable cleanup against an aborted transaction (fixes an

IsTransactionState() crash on the INSERT EXEC error path in a function).

Adds same-session DDL detection tests.
…rams instead of SPI_prepare and stop suppressing errors

Prepare the source statement through SPI_prepare_params with the PL/tsql parser setup so @variables resolve and no errors are suppressed, and cache the plan in expr->plan for reuse.
column-count validation to prepare_stmt_execsql, extract the
COMMIT/ROLLBACK INSERT-EXEC guard into a helper, and drop redundant
temp-table/cleanup special-casing
…or sp_executesql. Flush the buffered INSERT EXEC temp table into the target via execute_batch. Since the flush re-enters through the inline handler (which pushes its own estate), redirect table-variable resolution, the per-statement implicit-transaction decision, and ownership chaining back to the caller's estate via insert_exec_flush_estate, and report rows-affected from the DestReceiver's captured-row count.
@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch from 1288124 to 5618379 Compare June 23, 2026 11:28
@harshdubey166

Copy link
Copy Markdown
Contributor

/code-review

@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch from 5618379 to 8fef935 Compare June 23, 2026 12:44

@kuntalghosh kuntalghosh left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AI Code Review

Verdict: REQUEST_CHANGES

Based on the provided code changes, here are my feedback and suggestions:

Issue 1 - In pl_insert_exec.c at line 780: table_open/table_close per tuple in insertexec_receive is a performance concern:

/* Open temp table fresh for each tuple - avoids stale handles across subtransactions */
temp_rel = table_open(insert_exec_ctx->temp_table_oid, RowExclusiveLock);
...
table_tuple_insert(temp_rel, insert_slot, myState->cid, 0, NULL);
...
table_close(temp_rel, RowExclusiveLock);

Opening and closing the relation (plus acquiring/releasing RowExclusiveLock) for every single row is expensive for large result sets. The insertexec_receive callback runs within a single PortalRun — subtransaction boundaries do not occur between rows of one result set. The comment claims it avoids stale handles across subtransactions, but that scenario cannot happen within a single PortalRun. If subtransaction boundaries between result sets are the concern (e.g., multiple SELECT statements from a procedure triggering rShutdown/rStartup cycles), document that specific scenario. Otherwise, open the relation once in insertexec_startup, store it in DR_insertexec, and close it in insertexec_shutdown. This was also raised by reviewer tanscorpio7 (June 22) and remains unresolved.

Issue 2 - In iterative_exec.c at line 1131: ignore_catch_block_for_insert_exec dereferences estate->cur_error without NULL check:

return (estate->cur_error->error != NULL &&
        estate->cur_error->error->sqlerrcode == ERRCODE_DATATYPE_MISMATCH);

cur_error is a pointer that can be NULL — the same file checks estate->cur_error == NULL elsewhere. While the current call site runs after the error is populated, the helper function has no such guarantee and could crash if called from a different context.

Suggested fix:

if (estate->cur_error == NULL || estate->cur_error->error == NULL)
    return false;
return estate->cur_error->error->sqlerrcode == ERRCODE_DATATYPE_MISMATCH;

Issue 3 - In pl_handler.c at line 3070: Dead initial assignment and missing estate->func NULL check in bbf_table_var_lookup:

PLtsql_execstate *estate = get_current_tsql_estate();   /* (1) overwritten */
...
estate = insert_exec_flush_estate ? insert_exec_flush_estate
                                  : get_current_tsql_estate();   /* (2) */
...
foreach(lc, estate->func->table_varnos)   /* no NULL check on ->func */

The first get_current_tsql_estate() call is immediately overwritten. Beyond the wasted call, if insert_exec_flush_estate is ever set with a partially initialized estate, estate->func could be NULL. The guard checks !estate but not !estate->func.

Suggested fix:

PLtsql_execstate *estate = insert_exec_flush_estate
                           ? insert_exec_flush_estate
                           : get_current_tsql_estate();
...
if (!estate || !estate->func || strncmp(relname, "@", 1) != 0)
    return relid;

Issue 4 - In pl_insert_exec.c at line 600: insert_exec_flush_estate non-reentrant check uses save/restore pattern but nesting is impossible:

flush_estate_saved = insert_exec_flush_estate;
insert_exec_flush_estate = estate;

Reviewer tanscorpio7 (June 22) noted that saving the previous value implies reentrant usage, but INSERT EXEC cannot nest. The code now has the elog(ERROR, ...) check as requested, but the ordering is inconsistent — the check fires after some setup. Ensure the check is the first operation.

Issue 5 - In tsqlIface.cpp at line 2168: Compile-time function check now rejects table-variable targets, but runtime exec_stmt_exec still has an exception for them:

/* Compile time - now blocks ALL INSERT EXEC in functions */
if (ddl_object)
    throw PGErrorWrapperException(..., "'INSERT EXEC' cannot be used within a function", ...);
/* Runtime - still allows EXEC in functions when insert_exec is set */
&& stmt->insert_exec == NULL) /* INSERT EXEC into table variable is allowed in functions */

The compile-time check removed the !ddl_object->local_id() exception, blocking all INSERT EXEC in functions including table-variable targets. But the comment in exec_stmt_exec still says table-variable INSERT EXEC is allowed. The runtime exception is now dead code because the compile-time check fires first. Either remove the dead runtime exception and its misleading comment, or restore the compile-time local_id() exception if table-variable INSERT EXEC should actually be supported.

Issue 6 - In pl_insert_exec.c at line 603: psprintf result inside appendStringInfo is leaked:

appendStringInfo(&flush_query,
    "INSERT INTO %s%s SELECT * FROM %s",
    qualified_target,
    column_list_str ? psprintf(" (%s)", column_list_str) : "",
    temp_name);

The intermediate psprintf allocation is never freed. Assign to a local and free it after use.

Suggested fix:

char *cols_frag = column_list_str ? psprintf(" (%s)", column_list_str) : NULL;
appendStringInfo(&flush_query, "INSERT INTO %s%s SELECT * FROM %s",
                 qualified_target, cols_frag ? cols_frag : "", temp_name);
if (cols_frag)
    pfree(cols_frag);

Issue 7 - In pl_insert_exec.c at line 580: Flush schema resolution hardcodes "dbo" fallback, diverging from setup-time resolution:

if (target_db != NULL)
    qualified_target = psprintf("%s.%s.%s",
                                quote_identifier(target_db),
                                quote_identifier(target_schema ? target_schema : "dbo"),
                                quote_identifier(target_table));

When the user writes INSERT INTO db..target EXEC p (cross-DB, no schema), setup resolves via resolve_insert_exec_schema_name (which uses the user's default schema in that DB), but the flush substitutes "dbo" when target_schema is NULL. If the user's default schema differs from dbo, rows may land in the wrong table. Resolve the schema once at parse/setup time and store it in InsertExecInfo, so both paths use the same resolution.

Issue 8 - In pl_insert_exec.c at line 550: SPI plan leaked if table_open throws:

plan = SPI_prepare_params(expr->query, ...);
...
temp_rel = table_open(temp_table_oid, AccessShareLock);   /* may throw */
...
SPI_freeplan(plan);

If table_open throws (e.g., temp table concurrently dropped by a sub-batch), the SPI plan is leaked until transaction cleanup. Re-order operations or use PG_TRY/PG_FINALLY to free the plan on both paths.

Suggested fix:

plan = SPI_prepare_params(...);
if (plan == NULL)
    return;
...
PG_TRY();
{
    temp_rel = table_open(temp_table_oid, AccessShareLock);
    temp_tupdesc = RelationGetDescr(temp_rel);
    temp_natts = temp_tupdesc->natts;
    table_close(temp_rel, AccessShareLock);
}
PG_FINALLY();
{
    SPI_freeplan(plan);
}
PG_END_TRY();

Notes

  • Prior reviews partially addressed. Reviewers tanscorpio7 and KushaalShroff raised many points across three rounds. The author has responded to most. Key open items from tanscorpio7's June 22 review (per-row table_open Issue 1, insert_exec_flush_estate reentrant check Issue 4) remain unresolved. The earlier AI review's Issue 1 (error suppression) and Issue 3 (memory context) appear addressed in this revision.
  • Security review summary. No critical or high-severity security issues found. Seven low-severity findings dominated by minor logic inconsistencies between setup-time and flush-time schema/relation resolution (Issues 7 above), defensive-coding gaps (Issues 2, 3), and bounded memory leaks (Issue 6). No SQL injection risk — all dynamic SQL composition uses quote_identifier() consistently. The ERRCODE_DATATYPE_MISMATCH unconditionally bypassing CATCH during INSERT EXEC is broad — any datatype mismatch that would normally be catchable inside the procedure now also bypasses user-defined CATCH blocks (a behavioral change, not a vulnerability).
  • GUC default flip. The PR flips pltsql_enable_new_insert_exec from false to true. This means every existing session exercises the new code path without opt-in. The PR description should explicitly note this is now enabled by default, not just "gated behind" a GUC.
  • Leak count reduction. Dropping from 416 to 29 leak warnings validates the redesign approach.
  • Test coverage. Comprehensive: 14+ categories covering basic scenarios, column mapping, dynamic SQL, nested procedures, identity columns, temp tables, transactions, TRY/CATCH, errors, large result sets, OUTPUT clause, @@TRANCOUNT, cross-database, DDL protection, and PostgreSQL endpoint. Test requirements for a feature redesign of this scope are satisfied.
  • insert_exec_flush_estate not cleared in safety-net paths. pltsql_xact_cb and terminate_batch both clear insert_exec_ctx but neither clears insert_exec_flush_estate. Normal flow always restores it, but for defense in depth, the safety nets should also reset it to NULL.
  • Column-count validation re-parses every statement. pltsql_insert_exec_validate_column_count calls SPI_prepare_params on every non-is_tsql_select_assign_stmt statement in the procedure body. This effectively plans each statement twice — measurable cost for procedures with many statements. Consider caching the target column count in insert_exec_ctx and only running the lightweight natts comparison.

Note: This AI review is meant to augment, not replace, human review. Use the suggestions with discretion.

monk0707 added 2 commits June 23, 2026 14:19
…ut logical-DB, create_insert_exec_temp_table and pltsql_insert_exec_open_target_table

now resolve the physical schema only when a DB context exists; with none
and no explicit schema, the target is referenced by its bare name and
resolved via search_path, matching the flush path and a plain INSERT.
The TDS path is unchanged.
…ert-exec doesn't give ordered results explicitly
@monk0707 monk0707 force-pushed the insert-exec-integration-via-6-v2 branch from 8fef935 to 53d07cc Compare June 23, 2026 14:47
@tanscorpio7 tanscorpio7 dismissed kuntalghosh’s stale review June 24, 2026 06:50

Addressed comments from latest AI review feedback. Some minor remainig items will be addressed seperately.

@tanscorpio7 tanscorpio7 merged commit 2fa7f2e into babelfish-for-postgresql:BABEL_6_X_DEV Jun 24, 2026
48 of 49 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants