fix(flatfiles): emit strikes in dollars and bound the wire connection #814
Merged
The CSV and JSONL writers emitted the raw scaled wire strike (tenths of a cent) while the Arrow and typed-row paths emitted dollars, so the same request produced two different strike units depending on output format. Strikes are dollars on every client-facing surface; the scaled integer wire form must never reach a caller. A single shared conversion now feeds CSV, JSONL, Arrow, and the typed row so all four agree on the exact value, and a cross-surface test pins the invariant.

The flat-file wire path also had no connect or read timeout, so a host that accepted the socket but never finished the TLS handshake, or a server that stalled mid-stream, would block a download forever. The session now bounds the combined connect plus auth handshake per host and bounds the wait for each response frame, classifying both as transient so the existing retry ladder reconnects. The bounds default to the same connect budget as the historical channel, with a generous inter-frame ceiling that never cuts off a slow-but-progressing transfer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Bind FlatFilesConfig.connect_timeout_secs and read_timeout_secs as u64-second config knobs across the C ABI, Python, TypeScript, and C++ surfaces, mirroring the existing flatfile backoff setters so the per-host connect/auth bound and the per-frame read bound are tunable from every language. Each field carries a setter, a getter, a documented production default, a parity row, and a round-trip binding test.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
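As a rough illustration of the two bounds, here is a minimal standard-library sketch. Only the names FlatFilesConfig, connect_timeout_secs, and read_timeout_secs and the defaults 10 and 60 come from this PR; the struct layout and the helpers connect_bounded, read_frame, and is_transient are hypothetical. The sketch bounds a plain TCP connect only, whereas the PR bounds the combined connect plus auth handshake.

```rust
use std::io::{self, ErrorKind, Read};
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Illustrative config: the field names and defaults are taken from the PR,
/// the struct layout itself is an assumption.
#[derive(Debug, Clone, PartialEq)]
struct FlatFilesConfig {
    connect_timeout_secs: u64,
    read_timeout_secs: u64,
}

impl Default for FlatFilesConfig {
    fn default() -> Self {
        Self {
            connect_timeout_secs: 10,
            read_timeout_secs: 60,
        }
    }
}

/// Hypothetical classifier: treat an expired bound as transient so a retry
/// ladder can reconnect. A socket read timeout surfaces as `WouldBlock` on
/// Unix and as `TimedOut` on Windows, so both kinds are accepted.
fn is_transient(err: &io::Error) -> bool {
    matches!(err.kind(), ErrorKind::TimedOut | ErrorKind::WouldBlock)
}

/// Hypothetical helper: connect with a bounded wait, then arm the read bound
/// that applies to every later read on the stream. Both timeouts must be
/// non-zero, otherwise the standard library returns an error.
fn connect_bounded(addr: &SocketAddr, cfg: &FlatFilesConfig) -> io::Result<TcpStream> {
    let stream = TcpStream::connect_timeout(addr, Duration::from_secs(cfg.connect_timeout_secs))?;
    stream.set_read_timeout(Some(Duration::from_secs(cfg.read_timeout_secs)))?;
    Ok(stream)
}

/// Hypothetical helper: one bounded read. If no bytes arrive within the read
/// bound, the call returns an error that `is_transient` accepts.
fn read_frame(stream: &mut TcpStream, buf: &mut [u8]) -> io::Result<usize> {
    stream.read(buf)
}
```

Because the read bound applies to each read rather than to the whole transfer, a slow stream that keeps delivering bytes never trips it; only a full stall for the entire window does.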
The decode-only golden fixed the strike at the raw wire integer; the CSV writer now emits dollars like every other output surface, so the golden expectation moves to 580 to match.
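To make the unit change concrete, the following is a minimal sketch of a shared conversion feeding two text writers. It assumes the wire strike is an integer count of tenths of a cent, so 1,000 wire units equal one dollar; the constant STRIKE_WIRE_SCALE, the function bodies, and the csv_strike and jsonl_strike helpers are illustrative stand-ins, not the crate's actual code.

```rust
/// Assumed wire scale: strikes travel as integer tenths of a cent,
/// so 1_000 wire units make one dollar.
const STRIKE_WIRE_SCALE: f64 = 1_000.0;

/// Illustrative stand-in for the shared conversion: every output surface
/// calls this one function, so they cannot disagree on the unit.
fn strike_dollars(wire_strike: i64) -> f64 {
    wire_strike as f64 / STRIKE_WIRE_SCALE
}

/// Hypothetical CSV cell: `f64`'s `Display` prints the shortest form,
/// so a whole-dollar strike carries no trailing ".0".
fn csv_strike(wire_strike: i64) -> String {
    format!("{}", strike_dollars(wire_strike))
}

/// Hypothetical JSONL field built from the same conversion.
fn jsonl_strike(wire_strike: i64) -> String {
    format!("{{\"strike\":{}}}", strike_dollars(wire_strike))
}
```

Under these assumptions a wire value of 580000 renders as 580 in the CSV cell, which is the value the golden moves to, and 580500 renders as 580.5 on both text surfaces.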
Strikes in dollars across every output format

Flat-file CSV and JSONL emitted the raw scaled wire strike (tenths of a cent, e.g. 580000) while the Arrow and typed-row paths emitted dollars (e.g. 580.0). The same request therefore produced two different strike units depending on output format. Strikes are dollars on every client-facing surface; the scaled integer wire form must never reach a caller.

A single shared conversion (strike_dollars, alongside the wire-scale constant where the strike is decoded) now feeds CSV, JSONL, Arrow, and the typed FlatFileRow, so all four agree on the exact value for the same input. Sub-dollar strikes round-trip without trailing-zero noise via f64 Display. A cross-surface test asserts CSV, JSONL, and Arrow emit an identical dollar strike for whole-dollar and sub-dollar inputs.

Connect and read timeouts on the flat-file wire path
The flat-file path established its TLS connection and streamed the response with no connect or read timeout, unlike the historical channel, which already has connect_timeout_secs. A host that accepted the socket but never completed the TLS handshake, or a server that stalled mid-stream, would block a download forever.

The session now bounds the combined connect plus auth handshake per host, and bounds the wait for each response frame. Both expiries are classified as transient so the existing retry ladder reconnects on a fresh session rather than hanging. The bounds wire to new FlatFilesConfig fields: connect_timeout_secs defaults to 10 (matching the historical channel), and read_timeout_secs defaults to 60, far beyond any healthy inter-chunk gap, so a slow-but-progressing bulk transfer is never cut off mid-chunk. A test points connect_and_login at a non-routable TEST-NET host and confirms the connect bound fires instead of hanging.

Verification
cargo fmt --all -- --check: clean
cargo test -p thetadatadx --features "arrow,polars,frames,config-file" --lib flatfiles: 68 passed
cargo clippy -p thetadatadx --all-targets --features "arrow,polars,frames,config-file,__internal,__test-helpers" -- -D warnings: clean

🤖 Generated with Claude Code