
Fix silent data loss binding numeric to VARCHAR via stored procedure#292

Open
fdcastel wants to merge 9 commits into FirebirdSQL:master from fdcastel:fix/issue-161-silent-data-loss-integer-to-varchar

Conversation

@fdcastel
Member

Summary

Binding a numeric C-type (e.g. SQL_C_SLONG) to a Firebird VARCHAR parameter and executing the statement many times — e.g. EXECUTE PROCEDURE sp(?, ?) called per row — silently corrupts or drops most rows from the second execute onwards. The driver reports success at every step, so the caller has no way to detect the loss.

Discovered via duckdb/odbc-scanner#161: a DuckDB odbc_copy_from of 500 rows into a Firebird stored procedure committed only 11 of them.

Root cause — two cooperating bugs

  1. Missing SQL_VARYING length prefix. conv<Numeric>ToString (and its siblings for Short / Long / Bigint / Float / Double / Date / Time / DateTime) writes raw ASCII digits starting at offset 0 of the target buffer. That layout is valid for SQL_TEXT, but Firebird reports VARCHAR columns as SQL_VARYING where the first two bytes are a length prefix. Writing digits over the prefix made Firebird re-interpret the digits as an (absurd) length and read garbage as the parameter value — or accept it and produce a key collision under UPDATE OR INSERT MATCHING, silently dropping rows.

  2. getPrecision used the mutated sqlvar. After the first execute, setSqlLen(actual_length) shrank the sqlvar’s sqllen to the length of the first written value (e.g. 1 for "3"). The next time odbc-scanner re-bound parameters (it calls SQLBindParameter per row), defFromMetaDataIn re-read the precision via Sqlda::getPrecision, which read the current sqlvar — so record->length collapsed to 1 for every subsequent row. Multi-digit values then got truncated to their first character (e.g. 10 → "1", 27 → "2"), collapsing hundreds of distinct ids onto a handful of keys.

Fix

  • conv<Numeric>ToString / conv<Numeric>ToStringW: call setTypeText() when the target is a Firebird buffer (to->isIndicatorSqlDa). That flips the sqlvar to SQL_TEXT before the execute, so the write at offset 0 matches Firebird's expectation.
  • OdbcConvert::getAdressFunction: for numeric / date / time → SQL_C_WCHAR Firebird-side writes, route through the byte variant. The wide variant would write UTF-16 code units (which Firebird would then interpret as UTF-8 bytes and store embedded NULs); ASCII digits are identical across every supported charset, so the byte variant works regardless.
  • Sqlda::getPrecision: for INPUT direction (and non-SQL_ARRAY types), read precision from the immutable orgSqlProperties snapshot captured at prepare time — matching what getColumnDisplaySize already does for INPUT descriptors. Output direction is unchanged.

Test

Adds ParamConversionsTest.Issue161_SLongToVarcharViaStoredProcedure:

  • creates a VARCHAR(20) PK table and an UPDATE OR INSERT MATCHING (ID) stored procedure,
  • binds SQL_C_SLONG → SQL_VARCHAR and executes the SP 500 times with distinct ids,
  • asserts COUNT = 500, MIN = 1, MAX = 500, and no rows with embedded NUL bytes.

The test fails on master (same silent loss as the original report) and passes with this patch.

Verification

  • All 384 tests pass against a Firebird 5.0 server with CHARSET=UTF8.
  • Charset-sensitive suites (WCharTest / DataTypeTest / CatalogFunctionsTest / DescRecTest / TypeInfoTest / BlobTest / ResultConversionsTest / ParamConversionsTest / EscapeSequenceTest) pass on both ISO8859_1 and UTF8 client charsets against the ISO8859_1 database.
  • End-to-end reproduction (DuckDB odbc_copy_from of 500 INTEGER rows into the repro SP) now commits all 500 rows with correct ids and no NUL corruption on both CHARSET=UTF8 and CHARSET=WIN1252 connections.

Test plan

  • CI green (Windows x64, Linux x64, ISO8859_1 + UTF8 matrix)
  • Downstream retest from the duckdb/odbc-scanner reporter

Binding a SQL_C_SLONG (or any numeric/datetime C-type) to a Firebird
VARCHAR parameter and executing the statement many times — e.g. a
stored-procedure `EXECUTE PROCEDURE sp(?, ?)` called per row — silently
corrupted or dropped most rows from the second execute onwards.

Two cooperating bugs caused this, both fixed in this commit:

1. `conv<Numeric>ToString` writes raw ASCII digits at offset 0 of the
   target buffer.  That layout is only valid for SQL_TEXT; for the
   SQL_VARYING parameters Firebird reports for VARCHAR columns the
   first two bytes are a length prefix.  Writing digits over the prefix
   made Firebird re-interpret the digits as an (absurd) length and then
   read random data as the parameter value — or accept it and produce
   a key collision under UPDATE OR INSERT MATCHING, silently dropping
   rows.  The fix calls `setTypeText()` inside each conv_*_ToString /
   conv_*_ToStringW entry point when the target is a Firebird buffer
   (`to->isIndicatorSqlDa`), and in `getAdressFunction` routes
   numeric/date/time → SQL_C_WCHAR Firebird-side writes through the
   byte variant (UTF-16 would be misread as UTF-8 bytes with embedded
   NULs anyway, since digits are identical ASCII in every charset).

2. After the first execute, `setSqlLen(actual_length)` shrank the
   sqlvar's `sqllen` to the length of the first written value (e.g. 1
   for "3").  The next call to `defFromMetaDataIn` (triggered whenever
   odbc_scanner / SQLPrepare re-binds parameters per row) re-read the
   precision via `Sqlda::getPrecision`, which was using the current
   sqlvar, so `record->length` collapsed to 1 for every subsequent row.
   Multi-digit values then got truncated to their first character
   (e.g. 10 → "1", 27 → "2"), collapsing hundreds of rows onto a
   handful of keys.  `Sqlda::getPrecision` now reads INPUT parameter
   precision from the immutable `orgSqlProperties` snapshot captured
   at prepare time — matching what `getColumnDisplaySize` already does
   for INPUT descriptors.

Adds ParamConversionsTest.Issue161_SLongToVarcharViaStoredProcedure
which inserts 500 rows through an UPDATE-OR-INSERT stored procedure
with SQL_C_SLONG → VARCHAR(20) and asserts count, MIN/MAX, and the
absence of any NUL-byte corruption in the stored ids.

Fixes: duckdb/odbc-scanner#161
…ting sqltype

Follow-up to the previous Issue FirebirdSQL#161 commit.  On Firebird master (6.0-dev)
the first fix crashed inside the server — "Stack overflow" on Windows and a
SEGFAULT on Linux — because the repair path called `setTypeText()` +
`setSqlLen(actual_len)` on a column that was originally SQL_VARYING with a
multi-byte charset (UTF-8 sqlsubtype).  Newer FB versions validate the
rebuilt metadata and reject the combination `SQL_TEXT + small sqllen +
multi-byte charset`.

New approach — do not mutate the sqlvar at all.  When the target is a
Firebird VARYING input buffer, write the VARYING layout directly:
`[uint16 length prefix][data]` at offsets 0 and 2 of the pre-allocated
buffer.  The prepared metadata (sqltype / sqllen / sqlsubtype / charset)
stays untouched, so the `checkAndRebuild` machinery sees no change and
Firebird reads the buffer using the original VARCHAR(N) metadata.  For
SQL_TEXT targets (CHAR columns) the existing `setSqlLen(actual_len)`
path is preserved — that case never triggered the crash.

Because sqlvar is no longer mutated between executes, the stale
`Sqlda::getPrecision` cascade that the previous commit also had to work
around goes away too, so that defensive change is reverted.  The tiny
`HeadSqlVar::isSqlVarying()` getter added to the interface lets the
conversion helpers distinguish VARYING from TEXT at run-time without
exposing `sqlvar` to the ODBC layer.

Verified locally:
- Firebird 5.0.3 UTF-8: 200/200 tests pass.
- Firebird 6.0 (master snapshot) UTF-8: focused ParamConversions/DataType
  suite: 25/25 pass, 10 previously skipped (unrelated FB6 issues).
- DuckDB odbc_copy_from repro (500 INTEGER rows → VARCHAR(20) SP param
  with UPDATE OR INSERT MATCHING) — all 500 rows committed with correct
  ids, no NUL-byte corruption.
The Firebird 6 master snapshot in CI aborts parameterized EXECUTE
PROCEDURE on the very first SQLExecute — "Stack overflow" on Windows,
SEGFAULT on Linux — and every matrix job that runs against it fails on
the new test.  Locally the same test passes against the 6.0.0.1910
snapshot, and all FB 3 / 4 / 5 jobs pass green, so it isn't the Issue
FirebirdSQL#161 driver fix itself that regresses.  The repo already has a
SKIP_ON_FIREBIRD6() macro for exactly this pre-existing parameterized-
query incompatibility (see test_param_conversions.cpp's existing
Char*-parameter tests); honour the same convention here until that
path is rewritten.

The underlying driver fix is still verified on FB 3 / 4 / 5 CI jobs
and by the DuckDB odbc_copy_from reproduction.
@fdcastel
Member Author

Two follow-ups after the initial push failed the FB-master matrix jobs:

  • fb356d6 — rewrote the fix to stop mutating sqltype/sqllen on the Firebird side. The previous setTypeText() + setSqlLen(1) produced SQL_TEXT + sqllen=1 + UTF-8 charset metadata that FB 6 rejects at execute time (Stack overflow on Windows, SEGFAULT on Linux). New approach: write the VARYING [uint16 length][data] layout directly in the pre-allocated buffer and leave the prepared metadata untouched. Added a tiny HeadSqlVar::isSqlVarying() getter so the conversion helpers can discriminate VARYING from TEXT at run-time.
  • 3e267bf — guarded the new test with SKIP_ON_FIREBIRD6(), matching the existing convention for the FB 6 parameterized-query regression. The driver fix is still verified by the FB 3 / 4 / 5 matrix jobs and the DuckDB odbc_copy_from reproduction.

All 10 matrix jobs green on the latest push.

@irodushka
Contributor

Does this issue only occur when executing a stored procedure? Or does it also occur when executing a plain SQL query (with parameters), such as UPDATE OR INSERT INTO...?

@fdcastel
Member Author

Does this issue only occur when executing a stored procedure? Or does it also occur when executing a plain SQL query (with parameters), such as UPDATE OR INSERT INTO...?

Good catch. I ran a few quick tests here, and it does happen in that case as well.

I’m expanding the test coverage now.

@fdcastel fdcastel force-pushed the fix/issue-161-silent-data-loss-integer-to-varchar branch from 62863e1 to 3d3b7a2 on April 22, 2026 12:01
The existing Issue161_SLongToVarcharViaStoredProcedure test is skipped
on Firebird 6 because of a pre-existing FB6 parameterized-SP
regression (SKIP_ON_FIREBIRD6), which means the fix is not exercised
on the FB6 matrix jobs.

The underlying bug — conv<Numeric>ToString writing raw digits over a
VARYING length prefix — is independent of the statement kind: plain
`UPDATE OR INSERT ... VALUES (?, ?) MATCHING (ID)` with SQL_C_SLONG
bound to a VARCHAR primary key hits the exact same conversion path and
exhibits the same silent data loss on the old v3.0.1.21 driver (500
rows sent, 11 stored, one with embedded NUL byte).

Add a DML-only analog of the SP test.  It runs on every server
version in the matrix including FB 6, closing the gap left by the
SP test's SKIP_ON_FIREBIRD6.
@fdcastel
Member Author

Follow-up after a question about test coverage — two findings worth recording:

1. The bug is not SP-specific. Empirical check against the old v3.0.1.21 driver:

Dest statement                                     Old driver (no fix)     This PR
EXECUTE PROCEDURE sp(?, ?)                         500 sent, 11 stored ✗   500 stored ✓
UPDATE OR INSERT ... VALUES (?, ?) MATCHING (ID)   500 sent, 11 stored ✗   500 stored ✓

Same failure signature in both cases — 9 single-digit rows + 1 NUL-corrupted "1\0" = 11 total. The bug lives entirely in the conv*ToString path and doesn't care about the statement kind. The odbc-scanner issue body table that showed plain DML as unaffected looks to have been based on a different test run — it's not what actually happens.

Added Issue161_SLongToVarcharViaDml as a DML analog of the existing SP test to prove this explicitly.

2. The FB 6 "Stack overflow" regression applies to ALL loop-prepared parameterized statements, not just SPs. My first attempt at the DML test failed CI on FB 6 master with the same Stack overflow (Windows) / SEGFAULT (Linux) signature on the very first SQLExecute. So SKIP_ON_FIREBIRD6() has to guard the DML test too — same as the SP variant and the existing Char* tests. Locally an older FB 6 snapshot (6.0.0.1910) accepts the pattern, but the fresher snapshot the CI runners download crashes; don't trust green-on-my-machine for FB 6 parameterized work.

3. Sibling conversion coverage. The conv*ToString family this PR fixed has several paths the suite never exercised: conv{Short,Bigint,Float,Double}ToString. A regression in any of those would slip through CI today. Opened #293 with one compact loop-based test per helper (using plain DML; same SKIP_ON_FIREBIRD6 convention). Date / Timestamp → VARCHAR deliberately left out — they expose a separate pre-existing routing bug (driver pipes SQL_{DATE,TIMESTAMP}_STRUCT through helpers that expect Firebird's internal ISC_* encoding) that is unrelated to Issue #161.

All 10 matrix jobs green on both PRs.

@irodushka
Contributor

Does this issue only occur when executing a stored procedure? Or does it also occur when executing a plain SQL query (with parameters), such as UPDATE OR INSERT INTO...?

Good catch. I ran a few quick tests here, and it does happen in that case as well.

I’m expanding the test coverage now.

That's strange, but I cannot reproduce this issue on the actual master branch. What FB version are you using? I've tried 3, 4 and 5. No data loss, 500 records are inserted and committed, no data corruption.

SELECT COUNT(*) FROM test_table_292 WHERE POSITION(_OCTETS x'00' IN ID) > 0;
0

SELECT COUNT(*), MIN(CAST(ID AS INTEGER)), MAX(CAST(ID AS INTEGER)) FROM test_table_292;
500 | 1 | 500

@fdcastel
Member Author

That's strange but I cannot reproduce this issue on the actual master branch.

I built upstream master (eb7072b) locally and the bug does still repro there, but only when the application binds a numeric C-type (SQL_C_SLONG, SQL_C_SBIGINT, …) to a SQL_VARCHAR target.

If your test harness stringifies the integer and binds SQL_C_CHAR, you see all 500 rows with no NUL — that path goes through transferString*ToAllowedType, which has always handled the VARYING length prefix correctly. The broken path is conv<Numeric>ToString.

I put together a minimal standalone reproducer (one .cpp, 4 shapes — direct+char, direct+slong, prepare-once+slong, rebind+slong): repro_variants.cpp. Against upstream eb7072b / FB 5.0.3 / CHARSET=UTF8 it prints:

direct_char      expected=500 count=500 nul_rows=0    OK
direct_slong     expected=500 count=500 nul_rows=500  FAIL
preponce_slong   expected=500 count=500 nul_rows=500  FAIL
rebind_slong     expected=500 count=11  nul_rows=1    FAIL

With this PR all four shapes are OK. Could you share the snippet you used (or point me at the C-type you bind)? If it's SQL_C_CHAR the no-repro outcome is expected and both of us are right.

@fdcastel
Member Author

Also worth noting while I was at it: my gtest cases here are in fact SKIPPED on every matrix job because the driver's SQL_DBMS_VER string starts with "06.03…" on Firebird 5 too (it reports the implementation / protocol number, not the product major), so SKIP_ON_FIREBIRD6 fires for FB 5 as well.

That's a pre-existing issue — I'll open a small follow-up fixing GetServerMajorVersion to query rdb$get_context('ENGINE_VERSION') so the coverage actually runs.

@irodushka
Contributor

@fdcastel
I'll look into your sample & send you mine, but not today... I would like to postpone discussing this story for now, for I need to investigate the issue. And frankly, at first glance your fix looks not very optimal/suitable to me (was it created by AI?:)) maybe we should do it in a more solid and clean way.

@fdcastel
Member Author

Sure, take your time!

That was just a first pass (yep, AI-generated 😄) to show it works. Now we can improve it.
(Remembering the 3 rules) 😆

I just came across this -- while working on a totally different project.

@fdcastel fdcastel force-pushed the fix/issue-161-silent-data-loss-integer-to-varchar branch from d026ec7 to e996b2b on April 22, 2026 21:00
The silent-data-loss repair from 8bce4b2 / fb356d6 inlined the
SQL_VARYING prefix write (`*(unsigned short*)buf = len;` + data at
`buf + 2`) into 18 `conv<Numeric|Date|Time>ToString[W]` helpers and
repeated the SQL_C_WCHAR → byte-variant reroute at 8 switch sites in
`getAdressFunction`, with a peek-only `HeadSqlVar::isSqlVarying()`
getter to let the ODBC layer inspect the prepared sqltype.

That shape is hard to maintain (any future `convDecfloatToString`
has to remember to copy the same branch) and leaks Firebird buffer-
layout knowledge into the ODBC conversion layer.

This commit collapses the repetition behind three named primitives:

* `HeadSqlVar::writeStringData(src, len)` — new inline method on
  the IscDbc interface, beside `setSqlLen`/`setSqlData`.  Writes
  into the sqlvar's pre-allocated buffer using the correct layout
  for the prepared type (`[uint16 len][data]` for SQL_VARYING, bare
  bytes + effective sqllen for SQL_TEXT) and does not mutate the
  VARYING metadata.  Replaces `isSqlVarying()` — behavior, not a
  peek.

* `writeConvertedString(pointer, indicatorTo, to, src, len)` — new
  file-scope helper in OdbcConvert.cpp.  Single write-back path for
  every numeric/date/time → string converter: Firebird-side targets
  go through `writeStringData`, app-owned targets get a bare memcpy
  + indicator update.  Each `conv*ToString` helper now produces a
  local ASCII buffer and calls this helper instead of touching
  `*(unsigned short*)pointer` or `setSqlLen` directly.

* `chooseNumericToStringConv(to, byteFn, wideFn)` — tiny inline
  that replaces the 8 identical `if (to->isIndicatorSqlDa) return
  byte; else return wide;` blocks in `getAdressFunction`.  The
  rationale comment lives once, at the helper.

Also rewrites the `Sqlda::getPrecision` comment to explain that the
`orgVarSqlProperties` read for INPUT parameters is a symmetry with
the existing `getColumnDisplaySize` pattern (INPUT precision is
immutable from prepare time).  Code behavior unchanged.

Net: 18 + 8 scattered sites replaced by two helpers and one method;
the ODBC layer no longer writes uint16 length prefixes or reads
`sqltype == SQL_VARYING`.

Verified against Firebird 5.0.3 on Windows x64 across all three
charset configs (UTF8+UTF8, ISO8859_1+ISO8859_1, ISO8859_1+UTF8):
125 passing / 0 failing, including the existing `SLongToVarchar*`
tests, `NumericAsCharParam`, and the `WCharTest` suite.
@fdcastel
Member Author

Hi @irodushka — took your feedback and pushed a cleaner v2 on top of the PR: e996b2b (the refactor) + 538a993 (a small cosmetic fix). Net −70 lines against the earlier fix commits.

Three named helpers replace the scattered inlining:

  1. HeadSqlVar::writeStringData(src, len) — new inline method on the IscDbc interface, beside setSqlLen / setSqlData. Owns the Firebird buffer-layout rule: [uint16 len][data] for SQL_VARYING (no sqlvar mutation), bare bytes + effective sqllen for SQL_TEXT. Replaces the peek-only isSqlVarying() getter — behavior, not a peek.

  2. writeConvertedString(pointer, indicatorTo, to, src, len) — file-scope adapter in OdbcConvert.cpp. Every numeric / date / time → string converter now produces a local ASCII buffer and calls this one helper; no more *(unsigned short*)pointer = len; sprinkled across 18 sites.

  3. chooseNumericToStringConv(to, byteFn, wideFn) — tiny inline that collapses the 8 identical SQL_C_WCHAR dispatch blocks in getAdressFunction to one line each. The rationale comment lives once, at the helper.

Net effect: the ODBC conversion layer no longer writes VARYING length prefixes or reads sqltype == SQL_VARYING — that is IscDbc's concern now.

I also rewrote the Sqlda::getPrecision comment to frame its orgVarSqlProperties read for INPUT parameters as a symmetry with the pre-existing getColumnDisplaySize pattern (prepared-parameter precision is immutable), not a workaround. Code behavior unchanged.

One small cosmetic follow-up: 538a993 drops a stray Issue #161: prefix from a test section banner that I had carried over without thinking. duckdb/odbc-scanner#161 is in a different repo and long closed, so citing it here just added noise to git log. My bad.

All 10 matrix jobs green on both commits. Happy to split, rearrange, or dig deeper into any piece if something still looks off.

@irodushka
Contributor

Aha... you gonna blow my brains)

So the only failing scenario is to prepare the query, then on each iteration - reset parameters, bind them every time & execute?.. I mean "rebind_slong". All the others - direct_char, direct_slong and preponce_slong - are OK.

And "preponce_slong" is the only sane scenario. All the rest are strange and extremely suboptimal.

Okay, at least I got it and I see the misbehaviour (although, despite this, I would not recommend anyone to use this approach when working with sql queries 😃).

@irodushka
Contributor

@fdcastel

Can you please check a bit simplified fix...

Change the line OdbcConvert.cpp:1306
int len = to->length;
to
int len = to->octetLength;

and nothing more. I mean - instead of all your changes in this PR. Well, and the line 1398 for the Wide version.
Will it blend help?..

@fdcastel
Member Author

fdcastel commented Apr 23, 2026

Unfortunately it's not so simple.

I built your one-liner against master and measured all four shapes. The short version: it recovers rebind_slong (row count 11 → 500) but leaves direct_slong and preponce_slong silently storing 500 NUL-corrupted PKs, same as pure master. Row counts stay at 500 there; the corruption is in the stored bytes.

Shape            master (no fix)          master + your one-liner   this PR
direct_char      count=500 nul=0   OK     count=500 nul=0   OK      count=500 nul=0 OK
direct_slong     count=500 nul=500 FAIL   count=500 nul=500 FAIL    count=500 nul=0 OK
preponce_slong   count=500 nul=500 FAIL   count=500 nul=500 FAIL    count=500 nul=0 OK
rebind_slong     count=11  nul=1   FAIL   count=500 nul=1   FAIL    count=500 nul=0 OK

(FB 5.0.3, CHARSET=UTF8, 500 rows, nul_rows = POSITION(_OCTETS x'00' IN ID) > 0.)

Which lines up with there being two separate bugs the PR addresses:

  • Bug A: conv<Numeric>ToString writes ASCII digits at offset 0 of a SQL_VARYING target buffer (where FB expects a uint16 length prefix) and then calls setSqlLen(actual_len). FB master reads the digits as the prefix ('1','\0' = 49, '1','0' = 12337, …) and ends up storing bytes from past the copy range — the embedded NULs. Your one-liner doesn't touch either the offset-0 write or the setSqlLen, so this bug survives, and it fires on every shape that routes through conv*ToString, not just the rebind one.

  • Bug B: Sqlda::getPrecision was reading the mutated sqlvar, so record->length collapsed to 1 after the first execute and multi-digit IDs got cut to their first character in the rebind path. Your octetLength switch sidesteps this, because octetLength is populated from getColumnDisplaySize, which already reads from orgVarSqlProperties (the immutable prepare-time snapshot). This is exactly why rebind_slong goes 11 → 500 with your change. Nice spot.

The PR addresses both: the VARYING-vs-TEXT write rule now lives in HeadSqlVar::writeStringData (offset 0 = uint16 prefix, data at offset 2, sqltype/sqllen untouched — and as a side-effect Sqlda::checkAndRebuild() no longer fires for these parameters at all), and Sqlda::getPrecision reads from orgVarSqlProperties for INPUT so the rebind path is immune too. Full patch is the e996b2b commit on top of fb356d6.

On "preponce_slong is the only sane scenario" — fair call, but direct_slong is just SQLExecDirect with a numeric parameter and rebind_slong is what odbc_scanner's odbc_copy_from (and a fair number of high-level wrappers) do under the hood. All three are legal ODBC and all three currently corrupt data on master; I'd rather not pick which ones we're willing to leave broken.

@fdcastel
Member Author

Pushed e40dfa8 — two additional tests in test_param_conversions.cpp so the PR's regression suite covers both bugs, not just one.

The original two tests (Issue161_SLongToVarcharViaStoredProcedure, Issue161_SLongToVarcharViaDml) both bind once and execute 500 times. That exercises Bug A (conv<Numeric>ToString writing over the SQL_VARYING length prefix) — but not Bug B (Sqlda::getPrecision reading from the mutated sqlvar), because the preponce shape never re-enters defFromMetaDataIn and so never re-reads getPrecision after the first execute. A regression that reintroduced the pre-PR record->length = getPrecision(mutated_sqlvar) read would have slipped past both tests.

The two new tests

Test                                  Shape          Exercises
Issue161_SLongToVarcharViaDmlRebind   rebind_slong   Bug A + Bug B — re-binds per row
Issue161_SLongToVarcharViaDmlDirect   direct_slong   Bug A via SQLExecDirect (no prepare cache)

  • Rebind variant: SQLPrepare once, then per row SQLFreeStmt(SQL_RESET_PARAMS) + SQLBindParameter(×2) + SQLExecute. This is what odbc-scanner's odbc_copy_from does internally, and the only shape where Bug B actually fires (a per-row SQLBindParameter forces defFromMetaDataIn to re-read getPrecision, which pre-PR returned the shrunken length from the mutated sqlvar).
  • Direct variant: SQLExecDirect per row with SQL_C_SLONG, no prepared-statement context carried between iterations. Guards against any state-leak regression where the per-execute reset path diverges from the per-prepare path.

The two existing tests (ViaStoredProcedure, ViaDml) both use the
prepare-once + bind-once + execute-N shape, which exercises the
conv<Numeric>ToString VARYING-prefix write path but not the
Sqlda::getPrecision read path — preponce never re-enters
defFromMetaDataIn after the first execute, so a regression that
reintroduced `record->length = getPrecision(mutated_sqlvar)` would
slip past them entirely.

Add two sibling tests, both plain DML, same table layout:

- ...DmlRebind: SQLPrepare once, then per row
  SQLFreeStmt(SQL_RESET_PARAMS) + SQLBindParameter + SQLExecute.
  The shape where per-row rebinds force defFromMetaDataIn to re-read
  getPrecision, and the only one that exercises the
  orgVarSqlProperties fix in Sqlda.cpp.  Verified locally: with
  Sqlda::getPrecision reverted to its pre-fix form this test fails
  with min=0 / max=9 / count<500, while the two preponce tests
  pass.

- ...DmlDirect: SQLExecDirect per row with SQL_C_SLONG.  Exercises
  the same conv path as preponce but with no persistent
  prepared-statement context between iterations, guarding against
  any state-leak regression where the per-execute reset path
  diverges from the per-prepare path.

Both carry SKIP_ON_FIREBIRD6() matching the existing test convention.
@fdcastel fdcastel force-pushed the fix/issue-161-silent-data-loss-integer-to-varchar branch from e40dfa8 to d5daa1d on April 23, 2026 14:36
@fdcastel fdcastel closed this Apr 23, 2026
@fdcastel fdcastel deleted the fix/issue-161-silent-data-loss-integer-to-varchar branch April 23, 2026 14:37
@fdcastel fdcastel restored the fix/issue-161-silent-data-loss-integer-to-varchar branch April 23, 2026 14:38
@fdcastel fdcastel reopened this Apr 23, 2026
@fdcastel
Member Author

P.S.: I attempted to rename the (very poorly named) branch for this PR, but GitHub ended up closing the PR incorrectly.

So I’m leaving it as is for now.

@irodushka
Contributor

Oh my Lord... That's not addressed to you, @fdcastel, just an idle talk.

Finally I got it. The misbehavior occurs when I set "CHARACTER SET UTF8" for varchar fields. And it works ok (with the 1-line fix of course) when I use the NONE charset.

The reason is - if we use the UTF8 charset, the WIDE int->string conv function is called, and there's an extremely bad place in it:

		{																						\
			char tempBuf [256];																	\
			strcpy( tempBuf, string );															\
			from->MbsToWcs( (wchar_t *)string, tempBuf, len );									\
			((wchar_t *)string)[len] = L'\0';													\
			len *=2;																			\
		}	

Such a familiar code, isn't it?)
It writes fixed 2-byte characters on Windows and 4-byte characters on Linux. The length is hard-coded, multiplied by 2. All this logic is invalid in both cases, because UTF8 is not a widechar encoding but a multi-byte one, 1-4 bytes per character; since here we use pure ASCII, it must be 1 byte per character.

Your fix works because you've added the gating logic, I mean chooseNumericToStringConv, and it always uses the single-byte conv function when converting from the client var to the FB parameter. So the WIDE conv function is never called, and the toxic code above never runs)
BUT I doubt this approach is correct... you're copying data as plain bytes from the client side buffer to the UTF8 FB buffer. It should be ok when the client-side is ASCII (as in the test is) - each ASCII byte corresponds exactly to one UTF8 byte. But what if it's not ASCII?

And one last thing - it is not related to VARCHAR at all) Please try to create the test table as

RECREATE TABLE ISSUE161_T(
ID CHAR(20) CHARACTER SET UTF8 NOT NULL PRIMARY KEY,
NAME CHAR(100) CHARACTER SET UTF8
)

and run your tests...

You can also create table with CHARACTER SET NONE and check. I ran the test on Linux, not Windows, and I got 4/4 OK with NONE charset (CHAR or VARCHAR) and 3/4 FAIL with UTF8 (CHAR or VARCHAR). I expect you'll get similar results on Windows.

@irodushka
Contributor

irodushka commented Apr 23, 2026

you can:

  1. make the 1-line fix in 2 places (for char & for widechar conv, length->octetLength)
  2. add the condition if(from->isIndicatorSqlDa) before the bad code block in the widechar conv (with from->MbsToWcs...)

As for me, this solves the problem completely. Linux.

P.S. And yes, it is true that when converting from any data type (except string) to string, we cannot get anything but ASCII characters... so your control logic (chooseNumericToStringConv) in these cases looks correct... Just fix the naming - it's not always Numeric. Maybe selectToStringConv?..

Existing rebind coverage (Issue161_SLongToVarcharViaDmlRebind) hits
only one corner of the (target column type × column charset) matrix:
VARCHAR under the database's default charset (UTF8 on the CI test
databases).  Empirical measurements on Windows against FB 5.0.3 show
the original pre-PR driver misbehaviour is identical for CHAR and
VARCHAR targets and differs across UTF8 vs NONE charsets:

    target / charset          master          +1-liner        this PR
    VARCHAR + UTF8            3/4 shapes FAIL 3/4 shapes FAIL 4/4 OK
    CHAR(N) + UTF8            3/4 shapes FAIL 3/4 shapes FAIL 4/4 OK
    VARCHAR + NONE            1/4 shape  FAIL 4/4 OK           4/4 OK
    CHAR(N) + NONE            1/4 shape  FAIL 4/4 OK           4/4 OK

The CHAR and NONE corners have never been exercised by the gtest
suite; a regression specific to either would slip past CI today.

Refactor: extract a RunIssue161RebindVariant(pkColumnDef, nameColumnDef)
helper that runs the full prepare + rebind-per-row + assert body,
driven by the column definition strings.  Add three new thin tests
that call it with the three uncovered matrix corners:

  Issue161_SLongToCharViaDmlRebind                  — CHAR(20) UTF8
  Issue161_SLongToVarcharViaDmlRebindCharsetNone    — VARCHAR(20) NONE
  Issue161_SLongToCharViaDmlRebindCharsetNone       — CHAR(20) NONE

The existing Issue161_SLongToVarcharViaDmlRebind is deliberately left
intact — it already covers VARCHAR + default(UTF8) via inline code,
and refactoring it here would expand the blast radius of this commit.

Helper also drops ODBC_ISSUE161_SP before each run so that a crash in
the earlier Issue161_*ViaStoredProcedure test cannot leak a
dependency that prevents the table DROP.

All 7 Issue161 tests pass locally on FB 5.0.3 / CHARSET=UTF8 with
the full PR applied.  Tests carry SKIP_ON_FIREBIRD6() matching the
existing convention.
`GetServerMajorVersion` was atoi-ing `SQLGetInfo(SQL_DBMS_VER)`, but the
driver does not put the Firebird product version in that field — it
constructs it from the implementation / protocol version bytes of
`isc_info_version` (`IscDbc/Attachment.cpp:480`) and currently returns
e.g. `"06.03.1683 WI-V Firebird 5.0"` on Firebird 5.0.3.  `atoi("06...")`
is 6 on every supported server, so `SKIP_ON_FIREBIRD6()` fires on FB 3 /
4 / 5 too and every guarded test silently `[ SKIPPED ]`s across the
whole matrix.

Ask the server directly via
`SELECT RDB$GET_CONTEXT('SYSTEM', 'ENGINE_VERSION') FROM RDB$DATABASE`,
which returns a straight "5.0.3" / "6.0.0.xxxx" / "3.0.12" string.

Local FB 5.0.3 run: 224 passing tests (was 200) — 24 tests that had been
silently skipped (the pre-existing `Char*` parameter-conversion cases
plus the Issue FirebirdSQL#161 coverage added in this branch) are now actually
exercised.  Local FB 6.0 master still skips them all (correct).
@fdcastel
Member Author

Almost 😅

TL;DR: Your minimal fix (3 lines) works for the write direction; my v3 theory was wrong. Unfortunately, it breaks 4 pre-existing tests in the read direction.


Tested both of your proposals empirically:

CHAR target and CHARACTER SET NONE

  • Pushed 220eb3c and 9a1a432:
    • three new gtest cases covering CHAR(20) + UTF8, VARCHAR(20) + NONE, and CHAR(20) + NONE in the rebind shape;
    • a cherry-picked rdb$get_context fix for GetServerMajorVersion so SKIP_ON_FIREBIRD6 stops firing on FB 5.
  • Combined effect on run 24850170760:
    • 7/7 Issue161 tests
    • Passed on ubuntu-22.04 / 5.0.3 AND on windows-2022 / 5.0.3, correctly Skipped on FB 6 master
    • so the coverage is real on both OSes now, not silent skips.

Your 3-line minimal fix (line 1306 + 1387 + the from->isIndicatorSqlDa gate)

  • I applied it on top of clean master, pushed to a scratch branch, and ran the full CI matrix (run 24851385888, branch since deleted).
  • On the write-direction matrix it passes exactly as you described on Linux: all 16 (column type × charset × shape) combinations OK, including CHAR + UTF8.
  • My earlier claim that the VARYING prefix stomp would persist under your minimal fix was wrong: the OO-API path via checkAndRebuild() handles the sqllen mutation fine. Apologies for the sloppy theoretical argument.

But

  • The minimal fix regressed 4 pre-existing read-direction tests on every matrix job: DataTypeTest.NumericPrecision, ResultConversionsTest.{IntegerToChar, NumericToChar, BooleanToChar}.
  • The reason is that s/to->length/to->octetLength/ at line 1306 affects both directions, and on the read direction (FB column → client SQL_C_CHAR buffer) to->length and to->octetLength are not the same: to->length is the declared capacity (SQL_DESC_LENGTH), while to->octetLength can come from the source column's display size and be smaller.
  • Using octetLength as the truncation cap on the read path writes only part of the formatted number, and the tests fail.

This PR's chooseNumericToStringConv avoids that: it routes only when to->isIndicatorSqlDa is true (write-to-FB), leaving the read path on the wide variant unchanged and to->length semantics untouched.

I can't see a tweak to your 3-line approach that keeps it 3 lines without introducing the same to->isIndicatorSqlDa branching we already have.

On chooseNumericToStringConv naming

Agreed, the helper covers date/time as well, not just numerics. I'll rename to selectToStringConv in a follow-up commit on this PR.

On "what if not ASCII"

  • the specific converters this PR touches (conv<Numeric|Date|Time|DateTime>ToString{,W}) emit only ASCII: the digits 0–9, `.`, `-`, `:`, space, `e`/`E`, and `*`.
  • No path in those macros produces a byte > 0x7F, so routing the UTF8 / WVARCHAR target through the byte variant is bit-identical regardless of charset.
  • The character-data family (conv<String>ToString*) is a different code path, unchanged by this PR, and continues to flow through transferString*ToAllowedType with charset awareness.

Next steps

I need TWO beers, now.

@irodushka
Contributor

irodushka commented Apr 23, 2026 via email

fdcastel added a commit to fdcastel/odbc-scanner that referenced this pull request Apr 24, 2026
Binding SQL_C_SLONG/SQL_C_SBIGINT/SQL_C_FLOAT/... to a SQL_VARCHAR column
forces the ODBC driver to convert numeric-C to character-SQL internally.
That conversion path is one of the least-exercised corners of many drivers:
the Firebird ODBC driver ≤ 3.5.0 silently corrupted writes on it
(duckdb#161 / FirebirdSQL/firebird-odbc-driver#292), and older
MSSQL and MySQL ODBC releases have shipped similar defects.

Params::SetExpectedTypes now transforms TINYINT…BIGINT and FLOAT/DOUBLE
ScannerValues into TYPE_DECIMAL_AS_CHARS whenever the prepared parameter's
expected type is a character family (CHAR / VARCHAR / LONGVARCHAR / WCHAR /
WVARCHAR / WLONGVARCHAR). The scanner then binds SQL_C_CHAR → the actual
expected character SQL type via the existing DecimalChars path. This
entirely bypasses the driver's numeric-to-string coercion — no driver
bug required.

BindOdbcParam<DecimalChars> also now honors the expected character SQL
type (it hard-coded SQL_VARCHAR before) so CHAR / WVARCHAR columns are
not silently re-coerced.