fix(generic_files): flint_sprintf on 32-bit glibc (closes #2646) #2648
`flint_vsprintf` previously routed through `flint_vsnprintf` with a size of
`INT_MAX`, but on 32-bit glibc `vsnprintf(dst, n, ...)` silently drops output
past the first character once `n` exceeds about 16 MB. The result on i386 and
armhf was that `flint_sprintf("x%wd", 1)` produced `"x"` instead of `"x1"`.
This broke `fmpz_mpoly_set_str_pretty` (which builds variable names via
`flint_sprintf("x%wd", i + 1)`): every variable became the literal string `"x"`,
the parser then failed prefix matching, returned `-1`, and left the polynomial
malformed. Tests that exercise the parser then saw a zero/garbage polynomial
where they expected a deterministic input — the symptom reported in flintlib#2646:
- mpoly_test_irreducible FAIL: check 8 variable example
- fmpz_mpoly_compose_fmpz_mpoly Check non-example 1
- nmod_mpoly_compose_nmod_mpoly Check non-example 1
Fix: give `flint_sprintf` its own sink in a new `io_vsprintf.c` that calls the
system `vsprintf` directly. `sprintf` semantics already require the caller to
provide a sufficiently large buffer, so no length bound is needed and the
glibc edge case is avoided.
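The idea behind the fix can be sketched in plain C. `demo_sprintf` below is a hypothetical stand-in, not the contents of `io_vsprintf.c`: it forwards to the system `vsprintf` with no size argument, so the large-`n` path of `vsnprintf` is never entered. The standard `%ld` conversion takes the place of FLINT's `%wd`, which plain libc does not understand.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for an unbounded sprintf-style wrapper.
   As with the standard sprintf, the caller must supply a buffer
   that is large enough for the formatted output. */
int demo_sprintf(char *dst, const char *fmt, ...)
{
    va_list ap;
    int written;

    assert(dst != NULL && fmt != NULL);

    va_start(ap, fmt);
    /* No size bound is passed, so vsnprintf is not involved at all. */
    written = vsprintf(dst, fmt, ap);
    va_end(ap);

    return written;
}
```

Calling `demo_sprintf(buf, "x%ld", 1L)` on a sufficiently large `buf` is the plain-libc analogue of the `flint_sprintf("x%wd", 1)` call that the regression test checks.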
Also add a regression test that catches the precise failure mode
(`flint_sprintf("x%wd", n)` round-trip plus a few `WORD_MIN/MAX` cases and a
mixed-`%w` format) and is registered in `src/test/main.c`. Verified by
reverting only the `io_vsnprintf.c`/`io_vsprintf.c` changes: the new test
reports `flint_sprintf("x%wd", 1) gave "x" expected "x1"` on i386, then passes
once the fix is restored. Full `make check` passes on i386, amd64, and armhf
(under qemu-arm-static).
Closes flintlib#2646.
Caught by the MinGW64 (LLP64) CI run on the previous commit (141266a): on Windows `slong` is `long long` (64-bit) but `long` is 32-bit, so the expected-value computation `snprintf(expected, ..., "%ld", (long) values[ix])` truncated `WORD_MIN`/`WORD_MAX` to 32 bits (`WORD_MIN` became 0) and the test reported a false failure: `flint_sprintf("[%wd]", -9223372036854775808) gave ... expected "[0]"`. Cast to `long long` and use `%lld` instead, which fits `slong` on every supported platform (`slong` is `long` on LP64 and `long long` on LLP64).
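A minimal sketch of the portable expected-value computation, assuming only standard C99. `format_bracketed` is a hypothetical helper name, not code from the test file; in the real test the argument would be an `slong` cast to `long long`.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: formats a signed value as "[value]".
   long long is at least 64 bits wide in C99, so a 64-bit value
   survives the conversion on both LP64 and LLP64 platforms. */
int format_bracketed(char *dst, size_t size, long long value)
{
    assert(dst != NULL && size > 0);
    /* %lld matches long long; %ld would expect a (possibly 32-bit) long. */
    return snprintf(dst, size, "[%lld]", value);
}
```

With `long` in place of `long long`, the same call would lose the upper 32 bits on LLP64, which is the false failure described above.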
|
Oh nice, and nice that you added a test file as well! I can confirm that changing |
|
This fixed the tests from #2646 on the Debian armhf porterbox! However, I'm getting a new test failure I didn't see earlier. Could this be related to the changes? |
|
@d-torrance Thank you! I will investigate. Could you specify in what kind of machine you are observing this? The pre-existing failure was likely just hidden earlier because something further up the test list aborted first. |
|
This is on amdahl.debian.org, one of Debian's ARM porterboxes. The machine itself is 64-bit, but the tests were run in a 32-bit environment using |
|
|
I did get compilation warnings about truncation of integer literals when
compiling on a 32-bit machine, which I believe came from new-ish code. I am
not sure whether this could cause issues.
On Tue, Apr 28, 2026, 03:58 Edgar Costa wrote:
Here is my investigation:
The gr_poly_log_series failure is a pre-existing ARM-specific bug:
- On *i386 native* (-m32), with my fix: gr_poly_log_series passes
deterministically.
- On *armhf under qemu-arm-static*, with my fix: gr_poly_log_series
FAILs (SIGABRT, exit 134).
- On *armhf under qemu-arm-static*, with my fix reverted to current
upstream/main: gr_poly_log_series FAILs *byte-identically* (diff -q of
the two outputs reports no difference). So this PR is not the cause.
- `grep -rn "flint_sprintf\|flint_snprintf\|flint_vsprintf\|flint_vsnprintf" src/gr_poly/ src/gr_mat/ src/gr/`
returns nothing, so none of those modules go through the code I touched.
It was likely just hidden earlier because something further up the test
list aborted first.
I did some quick bisection. With FLINT_BITS=32 (same RNG seed on both
32-bit archs), iters 0-54 produce identical RNG state on i386 and armhf. At
iter 55 both pick GR_CTX_NF (number field) via gr_ctx_init_random. After
that single call, the RNG state has diverged: i386 and armhf consumed
different numbers of n_randint calls inside the same code path. The
number-field path goes through fmpz_poly_randtest_irreducible
(src/gr/init_random.c:157), which loops on irreducibility tests, so the
divergence is most likely upstream of gr_poly_log_series entirely, in
fmpz_poly_factor or fmpz_mod_poly_randtest_irreducible on 32-bit ARM. The
matrix-ring failure at iter 259 is just the random ring that gets picked
after the divergence accumulates.
We should certainly open another issue. I will keep investigating
regardless.
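
A first-divergence search of the kind described above could be sketched as follows. This assumes that a 64-bit fingerprint of the RNG state has been recorded at the start of every test iteration on each architecture; `first_divergence` and the trace arrays are hypothetical names, not part of FLINT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first iteration at which two recorded
   RNG-state traces differ, or n if the first n entries all match. */
size_t first_divergence(const uint64_t *trace_a, const uint64_t *trace_b, size_t n)
{
    size_t i;

    assert(trace_a != NULL && trace_b != NULL);

    for (i = 0; i < n; i++)
    {
        if (trace_a[i] != trace_b[i])
            return i;
    }

    return n;
}
```

The returned index is the first iteration whose starting state differs, so the call that caused the divergence ran during the iteration just before it.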
|
|
Looks good to me. Thanks! |
Attempt to fix #2646.

On 32-bit glibc (i386, armhf), `vsnprintf(dst, n, fmt, …)` with `n` ≳ 16 MB silently drops everything after the first character. `flint_vsprintf` was routing through `flint_vsnprintf(s, INT_MAX, …)`, so every `flint_sprintf("x%wd", k)` produced `"x"` instead of `"x<k>"`.

`fmpz_mpoly_set_str_pretty` builds variable names with this exact call, so every variable collapsed to `"x"`. The parser then failed prefix matching, returned `-1`, and left the polynomial malformed, which is why the three deterministic tests in #2646 fail on 32-bit:

- `mpoly_test_irreducible`: `FAIL: check 8 variable example`
- `fmpz_mpoly_compose_fmpz_mpoly`: `Check non-example 1`
- `nmod_mpoly_compose_nmod_mpoly`: `Check non-example 1`

Fix

Give `flint_sprintf`/`flint_vsprintf` their own sink in a new `src/generic_files/io_vsprintf.c` that calls system `vsprintf` directly. `sprintf` semantics already require the caller to provide a sufficiently large buffer, so no length bound is needed and the glibc edge case is avoided. `flint_snprintf` is unchanged.

Potential alternative fix:

A smaller change that also fixes the bug is a 6-line diff in `flint_vsnprintf_vprintf` itself.

Pros: no new file, no duplicated sink boilerplate (~120 fewer lines).

Cons: a subtle behavior change for `flint_snprintf` callers passing `n > 64 KB`: they no longer get truncation at `n-1`. In practice this path was already broken on 32-bit glibc (the very bug we're fixing), and anyone using `snprintf` with `n > 64 KB` is effectively using it as `sprintf`. But it does cross the architectural line between bounded and unbounded writes. Also, the `1 << 16` threshold is a hardcoded constant tuned to the observed glibc behavior, which is not great: it's not principled, and a future glibc change could shift the threshold without us noticing.

I went with the separate-sink version because it preserves `snprintf`'s contract exactly and avoids the hardcoded threshold. Happy to switch on request.

Test plan

- `flint_sprintf` regression test (`src/test/t-io.c`) covering `%wd` round-trip, `WORD_MIN`/`WORD_MAX`, and mixed `%wd %wu %wx`. With the fix reverted, it reports the exact failure: `flint_sprintf("x%wd", 1) gave "x" expected "x1"` on i386 (`-m32`).
- `make check` passes on i386 (`-m32`) and amd64.
- `flint_sprintf`, `mpoly_test_irreducible`, `fmpz_mpoly_compose_fmpz_mpoly`, `nmod_mpoly_compose_nmod_mpoly` pass on armhf under `qemu-arm-static`. (qemu-user did not reproduce the original failure, which only shows up with real 32-bit glibc, but the run confirms there is no regression.)
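
For comparison, the threshold-based alternative discussed above can be sketched in a few lines. This is not the 6-line diff itself: `demo_vsnprintf`, `demo_snprintf`, and `DEMO_UNBOUNDED_THRESHOLD` are hypothetical names, and the sketch only illustrates the idea of treating a very large `n` as "unbounded".

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical constant mirroring the 1 << 16 threshold mentioned above. */
#define DEMO_UNBOUNDED_THRESHOLD ((size_t) 1 << 16)

int demo_vsnprintf(char *dst, size_t n, const char *fmt, va_list ap)
{
    assert(dst != NULL && fmt != NULL);

    /* A huge n is treated as "no bound": fall back to vsprintf.
       This is the behavior change listed under "Cons". */
    if (n > DEMO_UNBOUNDED_THRESHOLD)
        return vsprintf(dst, fmt, ap);

    /* Ordinary sizes keep the usual truncation at n - 1 characters. */
    return vsnprintf(dst, n, fmt, ap);
}

int demo_snprintf(char *dst, size_t n, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = demo_vsnprintf(dst, n, fmt, ap);
    va_end(ap);

    return written;
}
```

Small sizes still truncate as usual, while a call such as `demo_snprintf(buf, (size_t) 1 << 20, "x%ld", 1L)` takes the unbounded branch; the hardcoded constant is exactly the part the description objects to.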