Skip to content

allow querying column or sql value with star expr and add client tests#2855

Open
jennifersp wants to merge 6 commits into
mainfrom
jennifer/tests
Open

allow querying column or sql value with star expr and add client tests#2855
jennifersp wants to merge 6 commits into
mainfrom
jennifer/tests

Conversation

@jennifersp

Copy link
Copy Markdown
Contributor

No description provided.

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor
Main PR
covering_index_scan_postgres 1852.03/s 1842.63/s -0.6%
groupby_scan_postgres 128.57/s 133.66/s +3.9%
index_join_postgres 641.75/s 649.72/s +1.2%
index_join_scan_postgres 792.49/s 795.17/s +0.3%
index_scan_postgres 24.54/s 24.67/s +0.5%
oltp_delete_insert_postgres 775.41/s 750.80/s -3.2%
oltp_insert 692.80/s 688.47/s -0.7%
oltp_point_select 2857.91/s 2881.71/s +0.8%
oltp_read_only 2861.33/s 2910.18/s +1.7%
oltp_read_write 2244.62/s 2233.55/s -0.5%
oltp_update_index 707.56/s 684.05/s -3.4%
oltp_update_non_index 743.01/s 758.40/s +2.0%
oltp_write_only 1699.86/s 1695.05/s -0.3%
select_random_points 1806.38/s 1861.95/s +3.0%
select_random_ranges 1060.09/s 1076.28/s +1.5%
table_scan_postgres 23.18/s 23.30/s +0.5%
types_delete_insert_postgres 748.21/s 757.68/s +1.2%
types_table_scan_postgres 7.97/s 8.01/s +0.5%

@itoqa

itoqa Bot commented Jun 17, 2026

Copy link
Copy Markdown

Ito QA test results
Commit: ce1dd3d: 10 test cases ran, 3 failed ❌, 6 passed ✅, 1 additional finding ⚠️.

Summary

The run covered database client compatibility across multiple languages, including normal query workflows, rebuild stability checks, and dependency drift behavior, and most happy-path flows remained stable. It also exercised adversarial input and failure-path behavior, where newly changed connection and reactive execution paths showed serious safety and consistency regressions.

Not safe to merge yet — this PR appears to introduce multiple high-severity failures in changed code paths, including security-sensitive input handling and non-atomic failure behavior that can leave inconsistent state. A separate medium-severity parallel test setup issue appears pre-existing and is a caveat, but the newly introduced high-severity defects are the merge blocker.

Tests run by Ito

View full run

Result Severity Type Description
High severity Connection The adversarial connection attempt was expected to fail safely, but the test run shows successful connection and execution when a username contains an extra UID token.
High severity Connection The ODBC workflow exits on the first failing Dolt statement, but previously executed Dolt statements are not rolled back or cleaned up. Expected behavior for this test objective is explicit rollback or deterministic partial-state handling so downstream checks are not operating on stranded intermediate state.
High severity Reactive The sequence has no transaction or reset boundary, so exceptions raised by late assertions occur after earlier mutations have already been committed.
Connection The ODBC happy-path flow passed: the bats lane exited 0 and confirmed baseline select/insert plus Dolt log and table-count assertions.
Engine Projection queries for plain columns, star projection, and mixed expressions returned expected column/value shapes with no regression observed.
Engine Two full postgres-client-tests build and run cycles (baseline and --pull --no-cache) both passed all 21 lanes, and both dependency snapshots and suite outputs matched with no drift reproduced.
Legacy The RPostgreSQL compatibility lane completed successfully and confirmed the expected baseline query behavior and Dolt commit-count checks.
Prepared The C++ libpqxx client flow built and ran successfully, including prepared statements (Parse/Bind/Execute) and expected count checks for both dolt_log and table rows.
Reactive The run was blocked before the R2DBC lane could execute because Maven Central repeatedly returned HTTP 429 while resolving build plugins, so this outcome reflects external dependency throttling rather than an application-code defect.
⚠️ Medium severity Harness Port allocation is not concurrency-safe: definePORT computes a deterministic candidate from $$ and attempts to check occupancy, but the check greps for $attemptedPORT (undefined) instead of the computed candidate.
Additional Findings Details

These findings are unrelated to the current changes but were observed during testing.

🟡 Parallel CI jobs can select the same client-test port and one server fails to bind
  • Severity: Medium Medium severity
  • Description: Port allocation is not concurrency-safe: definePORT computes a deterministic candidate from $$ and attempts to check occupancy, but the check greps for $attemptedPORT (undefined) instead of the computed candidate.
  • Impact: Concurrent jobs can select the same port and one server may fail to start with a bind error, causing a CI run to stall or lose one test execution. This affects parallel test setup reliability rather than production user data.
  • Steps to Reproduce:
    1. Launch two postgres-client-tests jobs concurrently on the same host so both execute setup_doltgres_repo at nearly the same time.
    2. Let each job call definePORT and continue directly into doltgres startup.
    3. Observe both jobs selecting the same candidate port and one job failing startup with a port already in use bind error.
  • Stub / mock content: The run disabled authentication for local SQL flow setup and used a local build fallback for the R2DBC artifact after dependency fetch instability; these setup adjustments do not change the definePORT allocation logic under test.
  • Code Analysis: In helpers.bash:40-42, definePORT computes getPORT but checks lsof output with grep $attemptedPORT, so the probe is disconnected from the candidate and can report available even when the candidate is already in use. start_doltgres_server() assigns PORT=$(definePORT) and immediately starts doltgres on that port, and setup_doltgres_repo() always executes this path before each test. The smallest practical fix is to probe the actual candidate variable (getPORT), quote it, and retry/continue when occupied so parallel runs cannot race into duplicate binds.
Evidence Package

Tip

Reply with @itoqa to send us feedback on this test run.

fprintf(stderr, "Usage: %s <user> <port>\n", argv[0]);
return 1;
}
const char *user = argv[1];

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

High severity ODBC argv token injection is accepted in DSN-less connection string

What failed: The adversarial connection attempt was expected to fail safely, but the test run shows successful connection and execution when a username contains an extra UID token.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: High High severity
  • Impact: An attacker who can control the client argv can smuggle extra connection attributes into the ODBC connection string and connect with unintended credentials or settings. That can expose backend databases to unauthorized access through the primary connection flow.
  • Steps to Reproduce:
    1. Build and run testing/postgres-client-tests/odbc/psqlodbc-test.c in the postgres client test environment.
    2. Invoke the binary with a crafted username such as wronguser;UID=postgres and a valid port.
    3. Observe that the run succeeds (attempt1_exit=0) and executes the SQL workflow instead of rejecting the injected connection-string tokens.
  • Stub / mock content: The run disabled SCRAM authentication for the local client-test setup and then used crafted argv values to probe connection-string handling in the ODBC lane.
  • Code Analysis: In testing/postgres-client-tests/odbc/psqlodbc-test.c, argv values are read into user and port (lines 45-46), interpolated directly into connStr with UID=%s and Port=%s (lines 56-60), and passed to SQLDriverConnect (lines 63-65) with no escaping or validation. Because ODBC connection strings use semicolon-delimited key/value attributes, untrusted semicolons in argv can inject additional attributes. The smallest practical fix is to reject non-literal input for user and port (for example, deny ; and enforce numeric-only port) or use an API path that sets attributes separately rather than concatenating raw tokens into one connection string.
  • Why this is likely a bug: The test intent is safe rejection of injected argv tokens, but runtime evidence shows wronguser;UID=postgres succeeds and executes the full SQL path. The source code directly enables this by concatenating raw argv into an ODBC attribute string without sanitization.
Relevant code

testing/postgres-client-tests/odbc/psqlodbc-test.c:45-65

const char *user = argv[1];
const char *port = argv[2];

char connStr[512];
snprintf(connStr, sizeof(connStr),
    "Driver={PostgreSQL Unicode};Server=localhost;Port=%s;Database=postgres;UID=%s;PWD=password;",
    port, user);

ret = SQLDriverConnect(dbc, NULL, (SQLCHAR *)connStr, SQL_NTS,
                       outStr, sizeof(outStr), &outLen, SQL_DRIVER_NOPROMPT);
CHECK_DBC(ret, dbc, "SQLDriverConnect");
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**High severity — ODBC argv token injection is accepted in DSN-less connection string**

**What failed:** The adversarial connection attempt was expected to fail safely, but the test run shows successful connection and execution when a username contains an extra `UID` token.

- **Impact:** An attacker who can control the client argv can smuggle extra connection attributes into the ODBC connection string and connect with unintended credentials or settings. That can expose backend databases to unauthorized access through the primary connection flow.
- **Steps to reproduce:**
  1. Build and run `testing/postgres-client-tests/odbc/psqlodbc-test.c` in the postgres client test environment.
  2. Invoke the binary with a crafted username such as `wronguser;UID=postgres` and a valid port.
  3. Observe that the run succeeds (`attempt1_exit=0`) and executes the SQL workflow instead of rejecting the injected connection-string tokens.
- **Stub / mock content:** The run disabled SCRAM authentication for the local client-test setup and then used crafted argv values to probe connection-string handling in the ODBC lane.
- **Code analysis:** In `testing/postgres-client-tests/odbc/psqlodbc-test.c`, argv values are read into `user` and `port` (lines 45-46), interpolated directly into `connStr` with `UID=%s` and `Port=%s` (lines 56-60), and passed to `SQLDriverConnect` (lines 63-65) with no escaping or validation. Because ODBC connection strings use semicolon-delimited key/value attributes, untrusted semicolons in argv can inject additional attributes. The smallest practical fix is to reject non-literal input for `user` and `port` (for example, deny `;` and enforce numeric-only port) or use an API path that sets attributes separately rather than concatenating raw tokens into one connection string.
- **Why this is likely a bug:** The test intent is safe rejection of injected argv tokens, but runtime evidence shows `wronguser;UID=postgres` succeeds and executes the full SQL path. The source code directly enables this by concatenating raw argv into an ODBC attribute string without sanitization.

**Relevant code:**

`testing/postgres-client-tests/odbc/psqlodbc-test.c:45-65`

~~~c
const char *user = argv[1];
const char *port = argv[2];

char connStr[512];
snprintf(connStr, sizeof(connStr),
    "Driver={PostgreSQL Unicode};Server=localhost;Port=%s;Database=postgres;UID=%s;PWD=password;",
    port, user);

ret = SQLDriverConnect(dbc, NULL, (SQLCHAR *)connStr, SQL_NTS,
                       outStr, sizeof(outStr), &outLen, SQL_DRIVER_NOPROMPT);
CHECK_DBC(ret, dbc, "SQLDriverConnect");
~~~

#define CHECK_DBC(ret, h, op) do { if ((ret) != SQL_SUCCESS && (ret) != SQL_SUCCESS_WITH_INFO) die(SQL_HANDLE_DBC, (h), (op)); } while (0)
#define CHECK_STMT(ret, h, op) do { if ((ret) != SQL_SUCCESS && (ret) != SQL_SUCCESS_WITH_INFO) die(SQL_HANDLE_STMT, (h), (op)); } while (0)

static void exec_query(SQLHDBC dbc, const char *sql) {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

High severity ODBC Dolt loop leaves partial state

What failed: The ODBC workflow exits on the first failing Dolt statement, but previously executed Dolt statements are not rolled back or cleaned up. Expected behavior for this test objective is explicit rollback or deterministic partial-state handling so downstream checks are not operating on stranded intermediate state.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: High High severity
  • Impact: A failure partway through the ODBC command sequence can leave earlier repository changes committed instead of rolling back. Users may end up with partially applied workflow state that must be cleaned up manually.
  • Steps to Reproduce:
    1. Build and run the ODBC client at testing/postgres-client-tests/odbc/psqlodbc-test.c against a local test repo with the baseline fixture.
    2. Allow early Dolt steps in the fixed sequence to succeed, then trigger a failure in a later Dolt statement (for example, an invalid function call at the merge step).
    3. Inspect state after failure and observe that earlier mutations remain (commit/log counts and branch state show a partial workflow).
  • Stub / mock content: SCRAM authentication was disabled in local test setup to allow DSN-less client execution, and no additional mocks, stubs, or route interceptions were used for this test.
  • Code Analysis: In testing/postgres-client-tests/odbc/psqlodbc-test.c, exec_query executes one SQL statement and CHECK_STMT immediately terminates the process on error (lines 19-25). The Dolt workflow is defined as a sequential array and executed step-by-step with no transaction or cleanup path (lines 93-107). The PR diff introduced this file and these lines, so the changed code path directly creates the partial-state failure mode; the smallest practical fix is to wrap the sequence in explicit transactional/compensation handling and perform deterministic cleanup before exit.
  • Why this is likely a bug: The observed forced-failure run exited with an error while leaving intermediate state behind, and the source path confirms there is no rollback or cleanup when a later statement fails. Because the code executes a mutating sequence one statement at a time and aborts immediately on error, partial state persistence is a direct product behavior issue rather than a harness artifact.
Relevant code

testing/postgres-client-tests/odbc/psqlodbc-test.c:19-25

static void exec_query(SQLHDBC dbc, const char *sql) {
    SQLHSTMT stmt;
    SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);
    SQLRETURN ret = SQLExecDirect(stmt, (SQLCHAR *)sql, SQL_NTS);
    CHECK_STMT(ret, stmt, sql);
    SQLFreeHandle(SQL_HANDLE_STMT, stmt);
}

testing/postgres-client-tests/odbc/psqlodbc-test.c:93-107

const char *dolt_queries[] = {
    "DROP TABLE IF EXISTS test",
    "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
    "INSERT INTO test (pk, value) VALUES (0, 0)",
    "SELECT dolt_add('-A')",
    "SELECT dolt_commit('-m', 'added table test')",
    "SELECT dolt_checkout('-b', 'mybranch')",
    "INSERT INTO test VALUES (1, 1)",
    "SELECT dolt_commit('-a', '-m', 'updated test')",
    "SELECT dolt_checkout('main')",
    "SELECT dolt_merge('mybranch')",
    NULL
};
for (int i = 0; dolt_queries[i]; i++)
    exec_query(dbc, dolt_queries[i]);
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**High severity — ODBC Dolt loop leaves partial state**

**What failed:** The ODBC workflow exits on the first failing Dolt statement, but previously executed Dolt statements are not rolled back or cleaned up. Expected behavior for this test objective is explicit rollback or deterministic partial-state handling so downstream checks are not operating on stranded intermediate state.

- **Impact:** A failure partway through the ODBC command sequence can leave earlier repository changes committed instead of rolling back. Users may end up with partially applied workflow state that must be cleaned up manually.
- **Steps to reproduce:**
  1. Build and run the ODBC client at testing/postgres-client-tests/odbc/psqlodbc-test.c against a local test repo with the baseline fixture.
  2. Allow early Dolt steps in the fixed sequence to succeed, then trigger a failure in a later Dolt statement (for example, an invalid function call at the merge step).
  3. Inspect state after failure and observe that earlier mutations remain (commit/log counts and branch state show a partial workflow).
- **Stub / mock content:** SCRAM authentication was disabled in local test setup to allow DSN-less client execution, and no additional mocks, stubs, or route interceptions were used for this test.
- **Code analysis:** In testing/postgres-client-tests/odbc/psqlodbc-test.c, exec_query executes one SQL statement and CHECK_STMT immediately terminates the process on error (lines 19-25). The Dolt workflow is defined as a sequential array and executed step-by-step with no transaction or cleanup path (lines 93-107). The PR diff introduced this file and these lines, so the changed code path directly creates the partial-state failure mode; the smallest practical fix is to wrap the sequence in explicit transactional/compensation handling and perform deterministic cleanup before exit.
- **Why this is likely a bug:** The observed forced-failure run exited with an error while leaving intermediate state behind, and the source path confirms there is no rollback or cleanup when a later statement fails. Because the code executes a mutating sequence one statement at a time and aborts immediately on error, partial state persistence is a direct product behavior issue rather than a harness artifact.

**Relevant code:**

`testing/postgres-client-tests/odbc/psqlodbc-test.c:19-25`

~~~c
static void exec_query(SQLHDBC dbc, const char *sql) {
    SQLHSTMT stmt;
    SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);
    SQLRETURN ret = SQLExecDirect(stmt, (SQLCHAR *)sql, SQL_NTS);
    CHECK_STMT(ret, stmt, sql);
    SQLFreeHandle(SQL_HANDLE_STMT, stmt);
}
~~~

`testing/postgres-client-tests/odbc/psqlodbc-test.c:93-107`

~~~c
const char *dolt_queries[] = {
    "DROP TABLE IF EXISTS test",
    "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
    "INSERT INTO test (pk, value) VALUES (0, 0)",
    "SELECT dolt_add('-A')",
    "SELECT dolt_commit('-m', 'added table test')",
    "SELECT dolt_checkout('-b', 'mybranch')",
    "INSERT INTO test VALUES (1, 1)",
    "SELECT dolt_commit('-a', '-m', 'updated test')",
    "SELECT dolt_checkout('main')",
    "SELECT dolt_merge('mybranch')",
    NULL
};
for (int i = 0; dolt_queries[i]; i++)
    exec_query(dbc, dolt_queries[i]);
~~~

.doOnNext(pk -> {
if (pk != 1) throw new RuntimeException("expected pk=1, got " + pk);
})
.then(exec(conn, "INSERT INTO test_table VALUES (2)"))

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

High severity R2DBC late assertion leaves persisted partial state

What failed: The sequence has no transaction or reset boundary, so exceptions raised by late assertions occur after earlier mutations have already been committed.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: High High severity
  • Impact: A late assertion failure can leave partially applied database state behind, so an immediate retry may run against corrupted preconditions and fail again or behave inconsistently.
  • Steps to Reproduce:
    1. Run the R2DBC client sequence and allow INSERT plus early Dolt commands to execute.
    2. Force a mismatch in a late assertion such as the dolt_log count check.
    3. Re-run the same R2DBC client command without resetting repository state and compare resulting counts/errors.
  • Stub / mock content: Local run setup bypassed SCRAM authentication by toggling EnableAuthentication to false so client-lane commands could execute in the QA container; no data-layer mocks or route interceptions were applied to this test path.
  • Code Analysis: In runTests, the chain performs INSERT and multiple Dolt state-changing statements before validating dolt_log and final row counts (lines 55-72 before lines 73-82). main catches failures and exits (lines 25-35) without rollback or cleanup, so a late assertion failure leaves drifted state for retries. A practical fix is to wrap mutating steps in a transaction with rollback on error, or add an explicit deterministic cleanup/reset before returning failure.
  • Why this is likely a bug: A late assertion failure is an organic failure mode, and the current order of operations allows committed partial mutations to survive that failure and alter subsequent retries. That violates expected retry determinism for this workflow and can hide the original defect behind state drift.
Relevant code

testing/postgres-client-tests/r2dbc/src/main/java/R2dbcTest.java:55-82

.then(exec(conn, "INSERT INTO test_table VALUES (2)"))
.thenMany(Flux.fromArray(new String[]{ ... "SELECT dolt_merge('mybranch')", }).concatMap(sql -> exec(conn, sql)))
.then(queryOne(conn, "SELECT COUNT(*) FROM dolt_log", Long.class))
.doOnNext(n -> { if (n.longValue() != 4L) throw new RuntimeException(...); })

testing/postgres-client-tests/r2dbc/src/main/java/R2dbcTest.java:25-35

try { Mono.usingWhen(factory.create(), R2dbcTest::runTests, Connection::close).block(); } catch (Exception e) { System.err.println("Error: " + e.getMessage()); System.exit(1); }
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**High severity — R2DBC late assertion leaves persisted partial state**

**What failed:** The sequence has no transaction or reset boundary, so exceptions raised by late assertions occur after earlier mutations have already been committed.

- **Impact:** A late assertion failure can leave partially applied database state behind, so an immediate retry may run against corrupted preconditions and fail again or behave inconsistently.
- **Steps to reproduce:**
  1. Run the R2DBC client sequence and allow INSERT plus early Dolt commands to execute.
  2. Force a mismatch in a late assertion such as the dolt_log count check.
  3. Re-run the same R2DBC client command without resetting repository state and compare resulting counts/errors.
- **Stub / mock content:** Local run setup bypassed SCRAM authentication by toggling EnableAuthentication to false so client-lane commands could execute in the QA container; no data-layer mocks or route interceptions were applied to this test path.
- **Code analysis:** In runTests, the chain performs INSERT and multiple Dolt state-changing statements before validating dolt_log and final row counts (lines 55-72 before lines 73-82). main catches failures and exits (lines 25-35) without rollback or cleanup, so a late assertion failure leaves drifted state for retries. A practical fix is to wrap mutating steps in a transaction with rollback on error, or add an explicit deterministic cleanup/reset before returning failure.
- **Why this is likely a bug:** A late assertion failure is an organic failure mode, and the current order of operations allows committed partial mutations to survive that failure and alter subsequent retries. That violates expected retry determinism for this workflow and can hide the original defect behind state drift.

**Relevant code:**

`testing/postgres-client-tests/r2dbc/src/main/java/R2dbcTest.java:55-82`

~~~java
.then(exec(conn, "INSERT INTO test_table VALUES (2)"))
.thenMany(Flux.fromArray(new String[]{ ... "SELECT dolt_merge('mybranch')", }).concatMap(sql -> exec(conn, sql)))
.then(queryOne(conn, "SELECT COUNT(*) FROM dolt_log", Long.class))
.doOnNext(n -> { if (n.longValue() != 4L) throw new RuntimeException(...); })
~~~

`testing/postgres-client-tests/r2dbc/src/main/java/R2dbcTest.java:25-35`

~~~java
try { Mono.usingWhen(factory.create(), R2dbcTest::runTests, Connection::close).block(); } catch (Exception e) { System.err.println("Error: " + e.getMessage()); System.exit(1); }
~~~

@github-actions

github-actions Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor
Main PR
Total 42090 42090
Successful 18270 18271
Failures 23820 23819
Partial Successes1 5334 5334
Main PR
Successful 43.4070% 43.4094%
Failures 56.5930% 56.5906%

${\color{lightgreen}Progressions (1)}$

json_encoding

QUERY: SELECT '"\u0000"'::json;

Footnotes

  1. These are tests that we're marking as Successful, however they do not match the expected output in some way. This is due to small differences, such as different wording on the error messages, or the column names being incorrect while the data itself is correct.

@itoqa

itoqa Bot commented Jun 18, 2026

Copy link
Copy Markdown

Ito QA test results
Ito Diff Reportce1dd3d4c3e671: 18 test cases ran, 0 regressions ❌, 2 new failures ❌, 3 still failing ❌, 1 fixed ✅, 11 passing ✅, 1 additional findings ⚠️.

Diff Summary

This diff run surfaced

Coverage spanned end-to-end database client behavior across multiple language paths, including normal workflows, rebuild/regression stability checks, and edge-case handling like malformed inputs and mid-flow failures. Overall behavior is mostly stable on core happy paths, but some failure-handling and input-validation paths still produce inconsistent or brittle outcomes.

Merge with caution — this PR introduces medium-severity reliability issues in newly added client-flow code, specifically around graceful input handling and state safety when a sequence fails partway through. Several other failures were also observed but are marked as pre-existing, so they are important backlog risk rather than direct blockers from this change.

Tests run by Ito

View full run

Result State Severity Type Description
❌ New Failure Medium severity Postgrex The workflow aborts on the first query error via strict pattern matching, but prior mutating statements already succeeded and remain committed, leaving a partially applied Dolt state.
❌ New Failure Medium severity Postgrex Malformed or out-of-range port input is not handled by explicit argument validation and instead terminates through raw conversion/runtime exit paths.
❌->❌ Still Failing High severity Connection The connection attempt consumes injected attributes from argv (for example Database=template1) instead of rejecting or escaping separators, so behavior changes from literal authentication input to attribute injection.
❌->❌ Still Failing Medium severity Connection The client executes mutating Dolt statements sequentially and exits on the first SQL error, without rollback or compensating cleanup for already-applied statements.
❌->❌ Still Failing Medium severity Reactive The first run fails late after performing inserts and Dolt workflow mutations, and the immediate retry fails earlier because leftover state changes the result of the initial pk assertion.
❌->✅ Fixed Harness Two postgres-client-tests containers were run concurrently and both completed successfully without bind or connectivity failures.
Passing Engine The targeted Go regression subtest passed for plain star, mixed literal-plus-star, and duplicated-name projection queries, confirming expected row values and column ordering for this upgrade check.
Passing Engine Two fresh no-cache docker rebuilds of postgres-client-tests produced the same 22-lane pass/skip pattern, so no dependency-drift regression was reproduced for this commit in the observed window.
Passing Engine The ORDER BY mixed-projection check behaves as intended: the ordered expected sequence passes, and the intentionally reversed expected sequence fails with an ordered mismatch.
Passing Engine With GITHUB_ACTION=1 and a focused assertion fixture, the test intentionally panics with the expected guard message that Focus=true is disallowed in CI.
Passing Engine No application bug was confirmed: this framework intentionally uses order-insensitive matching when ORDER BY is absent and order-sensitive matching when ORDER BY is present, so the observed divergence is expected.
Passing Harness The postgres-client-tests image contains /build/bin/elixir/postgrex-test and the binary launches with the expected user/port argument shape.
Passing Harness The Bats run reports the elixir postgrex client case as skipped with the linked issue note, and no postgrex binary execution output appears for that case.
Passing Harness The run confirms local listener access worked and the same credentials were not reusable outside the harness boundary in this test environment.
Passing Harness The runtime /build/bin/elixir/postgrex-test artifact changed checksum, size, and timestamp after a forced source marker rebuild, so the image is not shipping a stale or missing Elixir binary in this scenario.
Passing Legacy The RPostgreSQL compatibility script completed successfully after seeded setup, and the post-run dolt.log check confirmed the expected commit count of 4.
Passing Prepared The C++ libpqxx lane built successfully, executed prepared and non-prepared SQL checks, and finished with expected Dolt workflow counts and exit code 0.
⏸️ Skipped Connection The ODBC happy-path flow passed: the bats lane exited 0 and confirmed baseline select/insert plus Dolt log and table-count assertions.
⏸️ Skipped Reactive The run was blocked before the R2DBC lane could execute because Maven Central repeatedly returned HTTP 429 while resolving build plugins, so this outcome reflects external dependency throttling rather than an application-code defect.
⚠️ Additional Finding Medium severity Postgrex Expected behavior is successful read/insert/read checks followed by the Dolt add/commit/checkout/merge workflow. Actual behavior is a server internal_error ("unhandled message "&{}"") returned through Postgrex, causing the script to crash on the first query match assertion.
Additional Findings Details

These findings are unrelated to the current changes but were observed during testing.

🟡 Postgrex client hits internal server error
  • Severity: Medium Medium severity
  • Description: Expected behavior is successful read/insert/read checks followed by the Dolt add/commit/checkout/merge workflow. Actual behavior is a server internal_error ("unhandled message "&{}"") returned through Postgrex, causing the script to crash on the first query match assertion.
  • Impact: Users running the Postgrex client test path can hit an internal protocol/server error before the workflow completes, causing the operation to fail instead of returning a usable result. This blocks the normal happy path until the underlying message-handling bug is fixed.
  • Steps to Reproduce:
    1. Build and run the postgres-client-tests container with local Doltgres startup and seeded test_table data.
    2. Execute /build/bin/elixir/postgrex-test with valid user and port arguments.
    3. Observe Postgrex disconnect with SQLSTATE XX000 internal_error and MatchError at postgrex_test.ex:17.
  • Stub / mock content: The run used a local Doltgres test environment with seeded table data and real Postgrex client calls; no stubs, mocks, or bypasses were used for this test.
  • Code Analysis: I inspected server/connection_handler.go and confirmed message dispatch falls through to a default internal error for any unrecognized frontend message type (handleMessage default branch at line 432). The same file does not include a *pgproto3.Flush branch in that switch, even though Postgrex can send extended-protocol messages that include flush/sync cycles. I also inspected the PR diff in pr-context.json: it adds the Elixir client test (testing/postgres-client-tests/elixir/lib/postgrex_test.ex) and a skipped BATS lane (testing/postgres-client-tests/postgres-client-tests.bats), but does not modify the server handler where the protocol error is emitted, so causation points to pre-existing server behavior.
Evidence Package

Tip

Reply with @itoqa to send us feedback on this test run.

if count != 2, do: raise("expected count=2, got #{count}")

# Dolt workflow: create table, insert, commit, branch, insert, commit, merge
Enum.each([

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

🆕 New Failure: identified in this diff run

Medium severity Postgrex workflow failure leaves partial Dolt state

What failed: The workflow aborts on the first query error via strict pattern matching, but prior mutating statements already succeeded and remain committed, leaving a partially applied Dolt state.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: Medium Medium severity
  • Impact: A failed Postgrex step can leave earlier database mutations committed, so retries may run against unreconciled partial state and produce inconsistent results for the workflow.
  • Steps to Reproduce:
    1. Start the postgres client test environment and seed test_table as in the recorded run.
    2. Execute /build/bin/elixir/postgrex-test with valid user and port so the workflow reaches the Dolt query sequence.
    3. Trigger a failure in a middle workflow statement and rerun immediately against the same repository state.
    4. Observe that earlier mutations remain (for example branch/table state), and retry outcomes diverge due to leftover state.
  • Stub / mock content: The run intentionally pre-created conflicting branch state before executing the Postgrex workflow to exercise mid-sequence failure and immediate-retry behavior.
  • Code Analysis: In PR-added code, lines 26-37 iterate a mutable Dolt query list and enforce {:ok, _} = Postgrex.query(...) per statement without any BEGIN/COMMIT/ROLLBACK guard or compensating cleanup path. This means any mid-sequence error exits immediately after earlier state-changing statements have already been applied. A targeted fix is to wrap the sequence in explicit transaction handling with rollback on failure, or add deterministic cleanup/reset before retry so partial state cannot leak across runs.
  • Why this is likely a bug: The expected behavior for this workflow is failure handling that does not strand intermediate repository mutations and destabilize retries. The current implementation has no rollback or reset boundary around multiple mutating statements, so partial-state leakage on failure is a direct defect in the newly added workflow logic.
Relevant code

testing/postgres-client-tests/elixir/lib/postgrex_test.ex:26-37

Enum.each([
  "DROP TABLE IF EXISTS test",
  "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
  "INSERT INTO test (pk, value) VALUES (0, 0)",
  "SELECT dolt_add('-A')",
  "SELECT dolt_commit('-m', 'added table test')",
  "SELECT dolt_checkout('-b', 'mybranch')",
  "INSERT INTO test VALUES (1, 1)",
  "SELECT dolt_commit('-a', '-m', 'updated test')",
  "SELECT dolt_checkout('main')",
  "SELECT dolt_merge('mybranch')"
], fn q -> {:ok, _} = Postgrex.query(conn, q, []) end)
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**Medium severity — Postgrex workflow failure leaves partial Dolt state**

**What failed:** The workflow aborts on the first query error via strict pattern matching, but prior mutating statements already succeeded and remain committed, leaving a partially applied Dolt state.

- **Impact:** A failed Postgrex step can leave earlier database mutations committed, so retries may run against unreconciled partial state and produce inconsistent results for the workflow.
- **Steps to reproduce:**
  1. Start the postgres client test environment and seed test_table as in the recorded run.
  2. Execute /build/bin/elixir/postgrex-test with valid user and port so the workflow reaches the Dolt query sequence.
  3. Trigger a failure in a middle workflow statement and rerun immediately against the same repository state.
  4. Observe that earlier mutations remain (for example branch/table state), and retry outcomes diverge due to leftover state.
- **Stub / mock content:** The run intentionally pre-created conflicting branch state before executing the Postgrex workflow to exercise mid-sequence failure and immediate-retry behavior.
- **Code analysis:** In PR-added code, lines 26-37 iterate a mutable Dolt query list and enforce `{:ok, _} = Postgrex.query(...)` per statement without any BEGIN/COMMIT/ROLLBACK guard or compensating cleanup path. This means any mid-sequence error exits immediately after earlier state-changing statements have already been applied. A targeted fix is to wrap the sequence in explicit transaction handling with rollback on failure, or add deterministic cleanup/reset before retry so partial state cannot leak across runs.
- **Why this is likely a bug:** The expected behavior for this workflow is failure handling that does not strand intermediate repository mutations and destabilize retries. The current implementation has no rollback or reset boundary around multiple mutating statements, so partial-state leakage on failure is a direct defect in the newly added workflow logic.

**Relevant code:**

`testing/postgres-client-tests/elixir/lib/postgrex_test.ex:26-37`

~~~elixir
Enum.each([
  "DROP TABLE IF EXISTS test",
  "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
  "INSERT INTO test (pk, value) VALUES (0, 0)",
  "SELECT dolt_add('-A')",
  "SELECT dolt_commit('-m', 'added table test')",
  "SELECT dolt_checkout('-b', 'mybranch')",
  "INSERT INTO test VALUES (1, 1)",
  "SELECT dolt_commit('-a', '-m', 'updated test')",
  "SELECT dolt_checkout('main')",
  "SELECT dolt_merge('mybranch')"
], fn q -> {:ok, _} = Postgrex.query(conn, q, []) end)
~~~

@@ -0,0 +1,47 @@
defmodule PostgrexTest do
def main(args) do

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

🆕 New Failure: identified in this diff run

Medium severity Postgrex malformed port input crashes before validation

What failed: Malformed or out-of-range port input is not handled by explicit argument validation and instead terminates through raw conversion/runtime exit paths.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: Medium Medium severity
  • Impact: Supplying an invalid port value causes the Postgrex test CLI to crash before it can connect or report a validation error, so users get a hard failure instead of a clear argument check. Boundary values like out-of-range ports fail the same way.
  • Steps to Reproduce:
    1. Build the postgres-client-tests runtime image that includes /build/bin/elixir/postgrex-test.
    2. Run /build/bin/elixir/postgrex-test postgres 54x32.
    3. Observe the process exit with ArgumentError from :erlang.binary_to_integer at lib/postgrex_test.ex:4 instead of a normalized validation error message.
  • Stub / mock content: No stubs, mocks, or bypasses were applied for this test in the recorded run.
  • Code Analysis: main/1 destructures CLI args and immediately executes String.to_integer(port_str) (line 4) without validating format or range and without rescuing conversion failures, so malformed input (for example 54x32) crashes before connection logic. The PR context shows this file was newly added in this PR, so this unguarded parse line is a direct changed-code cause; the smallest fix is to parse with guarded handling (for example Integer.parse/1 plus bounds checks) and return a stable user-facing validation error for invalid ports.
  • Why this is likely a bug: The test plan expects malformed ports to be handled as explicit validation failures with stable reporting, but current behavior throws an internal conversion exception and exits. This is a deterministic input-handling defect in production code, not a harness-only artifact.
Relevant code

testing/postgres-client-tests/elixir/lib/postgrex_test.ex:2-4

def main(args) do
  [user, port_str] = args
  port = String.to_integer(port_str)
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**Medium severity — Postgrex malformed port input crashes before validation**

**What failed:** Malformed or out-of-range port input is not handled by explicit argument validation and instead terminates through raw conversion/runtime exit paths.

- **Impact:** Supplying an invalid port value causes the Postgrex test CLI to crash before it can connect or report a validation error, so users get a hard failure instead of a clear argument check. Boundary values like out-of-range ports fail the same way.
- **Steps to reproduce:**
  1. Build the postgres-client-tests runtime image that includes `/build/bin/elixir/postgrex-test`.
  2. Run `/build/bin/elixir/postgrex-test postgres 54x32`.
  3. Observe the process exit with `ArgumentError` from `:erlang.binary_to_integer` at `lib/postgrex_test.ex:4` instead of a normalized validation error message.
- **Stub / mock content:** No stubs, mocks, or bypasses were applied for this test in the recorded run.
- **Code analysis:** `main/1` destructures CLI args and immediately executes `String.to_integer(port_str)` (line 4) without validating format or range and without rescuing conversion failures, so malformed input (for example `54x32`) crashes before connection logic. The PR context shows this file was newly added in this PR, so this unguarded parse line is a direct changed-code cause; the smallest fix is to parse with guarded handling (for example `Integer.parse/1` plus bounds checks) and return a stable user-facing validation error for invalid ports.
- **Why this is likely a bug:** The test plan expects malformed ports to be handled as explicit validation failures with stable reporting, but current behavior throws an internal conversion exception and exits. This is a deterministic input-handling defect in production code, not a harness-only artifact.

**Relevant code:**

`testing/postgres-client-tests/elixir/lib/postgrex_test.ex:2-4`

~~~elixir
def main(args) do
  [user, port_str] = args
  port = String.to_integer(port_str)
~~~

@jennifersp jennifersp requested a review from Hydrocharged June 18, 2026 20:53
@itoqa

itoqa Bot commented Jun 18, 2026

Copy link
Copy Markdown

Ito QA test results
Ito Diff Report4c3e6715c98c9a: 20 test cases ran, 0 regressions ❌, 2 new failures ❌, 6 still failing ❌, 0 fixed ✅, 12 passing ✅, 0 additional findings ⚠️.

Diff Summary

This diff run surfaced

This run covered end-to-end database client behavior across multiple language runtimes, including normal query flows, build/runtime packaging checks, and failure handling during state-changing workflows. It also exercised edge and abuse-style conditions like malformed connection input, concurrent runs, and immediate retries to assess reliability under stress.

Merge with caution — the PR appears to introduce medium-severity reliability problems in the new Swift workflow, where concurrent execution and retry behavior can produce inconsistent or contaminated state. There are also several high-severity failures elsewhere, but those were identified as pre-existing and are caveats rather than merge blockers for this specific change.

Tests run by Ito

View full run

Result State Severity Type Description
❌ New Failure Medium severity Runtime Expected behavior for this concurrency test is isolation or safe serialization so one run does not invalidate the other. Actual behavior is both Swift runs failing with PSQLError under concurrent execution and post-state showing only one row in test instead of two.
❌ New Failure Medium severity Runtime The first run fails at a later Dolt step, but leaves prior writes committed; the immediate retry then fails early on pk validation with different behavior, showing non-idempotent retry semantics.
❌->❌ Still Failing High severity Connection The client accepts separator-bearing argv input and passes it directly into SQLDriverConnect, allowing unintended connection-string attributes to be parsed.
❌->❌ Still Failing High severity Connection The workflow aborts on the first failing statement and does not provide rollback or explicit cleanup of earlier successful statements.
❌->❌ Still Failing High severity Postgrex The workflow executes write operations and Dolt mutations in sequence with {:ok, _} pattern matches but no transaction wrapper or rollback branch, so a later error aborts execution while prior writes remain committed.
❌->❌ Still Failing High severity Reactive The first run fails late (expected 4 dolt_log entries, got 5), but because prior INSERT and Dolt mutations already committed, the immediate retry fails earlier (expected pk=1, got 2) instead of reproducing the same late assertion.
❌->❌ Still Failing Medium severity Postgrex The script aborts at PostgrexTest.main/1 line 17 on the first query because the server rejects a protocol message instead of completing normal query handling.
❌->❌ Still Failing Medium severity Postgrex The script accepts the port argument without validation and crashes via conversion or pattern-match exit paths, rather than returning a stable user-facing validation error.
Passing Harness The Docker image build completed, the runtime image contained /build/bin/elixir/postgrex-test, and the binary launched with harness-generated arguments during verification.
Passing Harness The bats suite reports the Elixir postgrex case as an intentional skip with issue 2859 and continues to run the Swift lane, matching the expected harness behavior.
Passing Legacy The RPostgreSQL lane completed successfully, including the baseline query path and final Dolt commit-count check.
Passing Prepared The libpqxx client built and ran successfully against a local doltgres instance, and both direct SQL plus prepared-statement checks matched expected table and dolt_log counts.
Passing Runtime The Swift bats lane executed /build/bin/swift/postgresnio-test with valid harness credentials and completed successfully with ok 1, matching the expected happy-path runtime behavior.
Passing Runtime With interrupted network conditions, the CI Docker build failed during dependency retrieval with clear stage attribution and did not misreport the outcome as a runtime Swift client-test failure.
Passing Runtime The Docker build completed, the runtime image contains /build/bin/swift/postgresnio-test, and the bats suite discovered and passed the swift postgresnio client lane.
Passing Runtime A forced non-zero exit (42) in the Swift postgresnio client produced a failing bats lane and non-zero suite exit, confirming failure propagation works as designed.
Passing Runtime With branch 'mybranch' pre-created, the Swift client exits non-zero at SELECT dolt_checkout('-b', 'mybranch') and does not continue to later dolt_log or table-count assertions, matching expected fail-fast behavior.
Passing Runtime The CI-style Docker build fails in the Swift build stage when dependency fetch is disrupted, and runtime container execution does not start, which matches the expected fail-fast behavior.
Passing Runtime An uncached Swift build for the new postgresnio client fetched dependencies only from GitHub-hosted repositories, with no unexpected external hosts observed.
Passing Runtime Two overlapping Swift lane invocations against the same doltgres instance behaved deterministically across repeated runs: one invocation completed, the overlapping invocation was rejected with PSQLError, and post-run branch and table state remained stable.
⏸️ Skipped Engine The targeted Go regression subtest passed for plain star, mixed literal-plus-star, and duplicated-name projection queries, confirming expected row values and column ordering for this upgrade check.
⏸️ Skipped Engine Two fresh no-cache docker rebuilds of postgres-client-tests produced the same 22-lane pass/skip pattern, so no dependency-drift regression was reproduced for this commit in the observed window.
⏸️ Skipped Engine The ORDER BY mixed-projection check behaves as intended: the ordered expected sequence passes, and the intentionally reversed expected sequence fails with an ordered mismatch.
⏸️ Skipped Engine With GITHUB_ACTION=1 and a focused assertion fixture, the test intentionally panics with the expected guard message that Focus=true is disallowed in CI.
⏸️ Skipped Engine No application bug was confirmed: this framework intentionally uses order-insensitive matching when ORDER BY is absent and order-sensitive matching when ORDER BY is present, so the observed divergence is expected.
⏸️ Skipped Harness Two postgres-client-tests containers were run concurrently and both completed successfully without bind or connectivity failures.
⏸️ Skipped Harness The run confirms local listener access worked and the same credentials were not reusable outside the harness boundary in this test environment.
⏸️ Skipped Harness The runtime /build/bin/elixir/postgrex-test artifact changed checksum, size, and timestamp after a forced source marker rebuild, so the image is not shipping a stale or missing Elixir binary in this scenario.

Tip

Reply with @itoqa to send us feedback on this test run.

guard count == 2 else { throw TestError("expected count=2, got \(count)") }

// Dolt workflow: create table, insert, commit, branch, insert, commit, merge
for q in [

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

🆕 New Failure: identified in this diff run

Medium severity Concurrent Swift workflow branch collision

What failed: Expected behavior for this concurrency test is isolation or safe serialization so one run does not invalidate the other. Actual behavior is both Swift runs failing with PSQLError under concurrent execution and post-state showing only one row in test instead of two.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: Medium Medium severity
  • Impact: The affected auth flow does not complete under the observed conditions, so users cannot finish the login-related workflow and are blocked from proceeding. The failure appears in a core path, but the evidence does not show data loss, security exposure, or irreversible damage.
  • Steps to Reproduce:
    1. Build the postgres-client-tests runtime image and start a single doltgres instance with the standard seeded setup.
    2. Launch two /build/bin/swift/postgresnio-test $USER $PORT processes nearly simultaneously against that same instance.
    3. Observe both runs exit non-zero with PSQLError and inspect post-run state showing incomplete workflow progression.
  • Stub / mock content: The run used a non-production local harness with SCRAM authentication bypassed so client lanes could execute without login blocking; no request/response mocks or route interceptions were applied for this test.
  • Code Analysis: In PostgresNIOTest.runTests, the PR-added Dolt sequence uses a fixed global branch name (mybranch) for every invocation and executes it without any concurrency guard, lock, or per-run branch suffix. When two runs interleave, both attempt the same branch-create/checkout path and can collide. The failing evidence file for this test shows both concurrent runs exiting with PSQLError and the post-run table count not reaching the expected final value, matching that code-level contention path. A minimal fix is to generate a unique branch name per invocation (for example with a UUID or timestamp suffix) or serialize this Dolt workflow section for shared-repo runs.
  • Why this is likely a bug: This is a production-code issue in the Swift test executable logic, not a harness-only artifact, because the workflow itself hard-codes shared mutable Dolt branch state across invocations. The observed concurrent failures align with that deterministic collision path and violate the expected safe concurrent behavior for this test case.
Relevant code

testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift:79-90

for q in [
    "DROP TABLE IF EXISTS test",
    "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
    "INSERT INTO test (pk, value) VALUES (0, 0)",
    "SELECT dolt_add('-A')",
    "SELECT dolt_commit('-m', 'added table test')",
    "SELECT dolt_checkout('-b', 'mybranch')",
    "INSERT INTO test VALUES (1, 1)",
    "SELECT dolt_commit('-a', '-m', 'updated test')",
    "SELECT dolt_checkout('main')",
    "SELECT dolt_merge('mybranch')",
] {
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**Medium severity — Concurrent Swift workflow branch collision**

**What failed:** Expected behavior for this concurrency test is isolation or safe serialization so one run does not invalidate the other. Actual behavior is both Swift runs failing with PSQLError under concurrent execution and post-state showing only one row in `test` instead of two.

- **Impact:** The affected auth flow does not complete under the observed conditions, so users cannot finish the login-related workflow and are blocked from proceeding. The failure appears in a core path, but the evidence does not show data loss, security exposure, or irreversible damage.
- **Steps to reproduce:**
  1. Build the postgres-client-tests runtime image and start a single doltgres instance with the standard seeded setup.
  2. Launch two `/build/bin/swift/postgresnio-test $USER $PORT` processes nearly simultaneously against that same instance.
  3. Observe both runs exit non-zero with PSQLError and inspect post-run state showing incomplete workflow progression.
- **Stub / mock content:** The run used a non-production local harness with SCRAM authentication bypassed so client lanes could execute without login blocking; no request/response mocks or route interceptions were applied for this test.
- **Code analysis:** In `PostgresNIOTest.runTests`, the PR-added Dolt sequence uses a fixed global branch name (`mybranch`) for every invocation and executes it without any concurrency guard, lock, or per-run branch suffix. When two runs interleave, both attempt the same branch-create/checkout path and can collide. The failing evidence file for this test shows both concurrent runs exiting with PSQLError and the post-run table count not reaching the expected final value, matching that code-level contention path. A minimal fix is to generate a unique branch name per invocation (for example with a UUID or timestamp suffix) or serialize this Dolt workflow section for shared-repo runs.
- **Why this is likely a bug:** This is a production-code issue in the Swift test executable logic, not a harness-only artifact, because the workflow itself hard-codes shared mutable Dolt branch state across invocations. The observed concurrent failures align with that deterministic collision path and violate the expected safe concurrent behavior for this test case.

**Relevant code:**

`testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift:79-90`

~~~swift
for q in [
    "DROP TABLE IF EXISTS test",
    "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
    "INSERT INTO test (pk, value) VALUES (0, 0)",
    "SELECT dolt_add('-A')",
    "SELECT dolt_commit('-m', 'added table test')",
    "SELECT dolt_checkout('-b', 'mybranch')",
    "INSERT INTO test VALUES (1, 1)",
    "SELECT dolt_commit('-a', '-m', 'updated test')",
    "SELECT dolt_checkout('main')",
    "SELECT dolt_merge('mybranch')",
] {
~~~

defer { try! conn.close().wait() }

// SELECT pk from test_table (set up by bats setup())
let pkRow = try await queryRow(conn, "SELECT pk FROM test_table LIMIT 1", logger: logger)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

View All Evidence

🆕 New Failure: identified in this diff run

Medium severity Swift retry is poisoned by residual state

What failed: The first run fails at a later Dolt step, but leaves prior writes committed; the immediate retry then fails early on pk validation with different behavior, showing non-idempotent retry semantics.

Impact · Steps · Stub / mock · Analysis · Why this is likely a bug
  • Severity: Medium Medium severity
  • Impact: A failed workflow can leave residual state behind, causing the next immediate retry to behave differently and surface incorrect results. Users may need to reset the environment before rerunning the same operation.
  • Steps to Reproduce:
    1. Start the postgres-client-tests environment and seed test_table with pk=1.
    2. Trigger a late Dolt workflow failure in /build/bin/swift/postgresnio-test after earlier statements succeed (for example, pre-create mybranch so dolt_checkout -b mybranch fails).
    3. Run the same Swift command again immediately without resetting state.
    4. Observe the retry fail with a different error path (expected pk=1, got 2) caused by residue from the first run.
  • Stub / mock content: The run used a local non-production harness with SCRAM authentication bypass enabled, and this test intentionally injected a branch-collision fault to force a late workflow failure before retrying without environment reset.
  • Code Analysis: In testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift, the PR-added code performs an early INSERT INTO test_table VALUES (2) at line 71 before the later Dolt sequence at lines 79-90 that can fail. On retry, the initial read uses SELECT pk FROM test_table LIMIT 1 at line 66 with a strict pk==1 check at line 68, so leftover row/order state can redirect the failure mode. A targeted fix in this same changed file is to make the initial read deterministic (for example ORDER BY pk), and isolate or clean up earlier mutations when a later Dolt step fails so retries do not inherit partial state.
  • Why this is likely a bug: The failure pattern is deterministic across a single no-reset retry: first run fails at the injected late step, second run fails differently with expected pk=1, got 2 while readback shows residual rows. That behavior aligns with the PR-added Swift workflow ordering, so this is a real code-path defect rather than a test harness artifact.
Relevant code

testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift:66-71

let pkRow = try await queryRow(conn, "SELECT pk FROM test_table LIMIT 1", logger: logger)
let pk = try pkRow.decode(Int32.self, context: .default)
guard pk == 1 else { throw TestError("expected pk=1, got \(pk)") }

try await exec(conn, "INSERT INTO test_table VALUES (2)", logger: logger)

testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift:79-90

for q in [
    "DROP TABLE IF EXISTS test",
    "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
    "INSERT INTO test (pk, value) VALUES (0, 0)",
    "SELECT dolt_add('-A')",
    "SELECT dolt_commit('-m', 'added table test')",
    "SELECT dolt_checkout('-b', 'mybranch')",
    "INSERT INTO test VALUES (1, 1)",
    "SELECT dolt_commit('-a', '-m', 'updated test')",
    "SELECT dolt_checkout('main')",
    "SELECT dolt_merge('mybranch')",
]
Evidence Package
Copy prompt for an agent
Ito QA identified the following failure during automated PR testing. Please investigate and propose a fix.

**Medium severity — Swift retry is poisoned by residual state**

**What failed:** The first run fails at a later Dolt step, but leaves prior writes committed; the immediate retry then fails early on pk validation with different behavior, showing non-idempotent retry semantics.

- **Impact:** A failed workflow can leave residual state behind, causing the next immediate retry to behave differently and surface incorrect results. Users may need to reset the environment before rerunning the same operation.
- **Steps to reproduce:**
  1. Start the postgres-client-tests environment and seed test_table with pk=1.
  2. Trigger a late Dolt workflow failure in /build/bin/swift/postgresnio-test after earlier statements succeed (for example, pre-create mybranch so dolt_checkout -b mybranch fails).
  3. Run the same Swift command again immediately without resetting state.
  4. Observe the retry fail with a different error path (expected pk=1, got 2) caused by residue from the first run.
- **Stub / mock content:** The run used a local non-production harness with SCRAM authentication bypass enabled, and this test intentionally injected a branch-collision fault to force a late workflow failure before retrying without environment reset.
- **Code analysis:** In testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift, the PR-added code performs an early INSERT INTO test_table VALUES (2) at line 71 before the later Dolt sequence at lines 79-90 that can fail. On retry, the initial read uses SELECT pk FROM test_table LIMIT 1 at line 66 with a strict pk==1 check at line 68, so leftover row/order state can redirect the failure mode. A targeted fix in this same changed file is to make the initial read deterministic (for example ORDER BY pk), and isolate or clean up earlier mutations when a later Dolt step fails so retries do not inherit partial state.
- **Why this is likely a bug:** The failure pattern is deterministic across a single no-reset retry: first run fails at the injected late step, second run fails differently with expected pk=1, got 2 while readback shows residual rows. That behavior aligns with the PR-added Swift workflow ordering, so this is a real code-path defect rather than a test harness artifact.

**Relevant code:**

`testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift:66-71`

~~~swift
let pkRow = try await queryRow(conn, "SELECT pk FROM test_table LIMIT 1", logger: logger)
let pk = try pkRow.decode(Int32.self, context: .default)
guard pk == 1 else { throw TestError("expected pk=1, got \(pk)") }

try await exec(conn, "INSERT INTO test_table VALUES (2)", logger: logger)
~~~

`testing/postgres-client-tests/swift/Sources/PostgresNIOTest.swift:79-90`

~~~swift
for q in [
    "DROP TABLE IF EXISTS test",
    "CREATE TABLE test (pk int, value int, PRIMARY KEY(pk))",
    "INSERT INTO test (pk, value) VALUES (0, 0)",
    "SELECT dolt_add('-A')",
    "SELECT dolt_commit('-m', 'added table test')",
    "SELECT dolt_checkout('-b', 'mybranch')",
    "INSERT INTO test VALUES (1, 1)",
    "SELECT dolt_commit('-a', '-m', 'updated test')",
    "SELECT dolt_checkout('main')",
    "SELECT dolt_merge('mybranch')",
]
~~~

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant