Skip to content

fix: add upload retry#399

Merged
not-matthias merged 1 commit into
mainfrom
cod-2839-retry-upload-on-connection-errorstimeouts
Jun 15, 2026
Merged

fix: add upload retry#399
not-matthias merged 1 commit into
mainfrom
cod-2839-retry-upload-on-connection-errorstimeouts

Conversation

@not-matthias

@not-matthias not-matthias commented Jun 11, 2026

Copy link
Copy Markdown
Member

No description provided.

@codspeed-hq

codspeed-hq Bot commented Jun 11, 2026

Copy link
Copy Markdown

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 7 untouched benchmarks


Comparing cod-2839-retry-upload-on-connection-errorstimeouts (32229fd) with main (ba5799e)

Open in CodSpeed

@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from df83123 to bee5a31 Compare June 11, 2026 14:41
@not-matthias not-matthias marked this pull request as ready for review June 11, 2026 16:48
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown

Greptile Summary

This PR adds upload retry support for streaming (on-disk) profile archives, which previously had no retry logic because the middleware cannot replay a consumed body stream. The fix introduces send_streamed_with_retry, a manual retry loop that re-opens the file on each attempt and mirrors the DefaultRetryableStrategy used by the existing REQUEST_CLIENT middleware.

  • A new shared upload_backoff() factory replaces the inline ExponentialBackoff construction, ensuring both the middleware and the manual loop use identical policy parameters; a #[cfg(test)] branch shrinks the intervals to 1–5 ms so retry tests stay fast.
  • send_streamed_with_retry classifies outcomes via DefaultRetryableStrategy, retrying on Transient (e.g. 503) and falling through to the caller for fatal or non-retryable responses — consistent with how the middleware handles the in-memory path.
  • Two new integration tests (streamed_upload_is_retried, in_memory_upload_is_retried) spin up a raw-TCP mock that returns 503 exactly EXPECTED_ATTEMPTS times and assert both that the error surfaces and that the hit count equals 1 + UPLOAD_RETRY_COUNT.

Confidence Score: 5/5

Safe to merge — the streaming retry loop correctly mirrors the middleware classifier and policy, and both new tests validate the expected attempt counts end-to-end.

The core logic in send_streamed_with_retry is well-structured: DefaultRetryableStrategy and upload_backoff() are the same building blocks used by REQUEST_CLIENT's middleware, so both code paths are consistent. The test harness covers both the streamed and in-memory paths with real 503 responses and hit-count assertions. No production-facing regressions were identified.

No files require special attention.

Important Files Changed

Filename Overview
src/request_client.rs Extracts upload_backoff() as a shared factory (with #[cfg(test)] fast bounds) and exposes UPLOAD_RETRY_COUNT; replaces the inline ExponentialBackoff in REQUEST_CLIENT.
src/upload/uploader.rs Adds send_streamed_with_retry manual loop for disk-backed archives and replaces the old single-shot streaming upload; adds two integration tests verifying retry counts for both code paths.

Sequence Diagram

sequenceDiagram
    participant U as upload_profile_archive
    participant R as send_streamed_with_retry
    participant F as File (disk)
    participant S as STREAMING_CLIENT
    participant C as Cloud Storage

    U->>R: call (path, upload_data, ...)
    loop up to 1 + UPLOAD_RETRY_COUNT attempts
        R->>F: File::open(path)
        F-->>R: stream
        R->>S: PUT upload_url (streaming body)
        S->>C: HTTP PUT
        C-->>S: 503 Service Unavailable
        S-->>R: "Ok(Response{503})"
        R->>R: "DefaultRetryableStrategy -> Transient"
        R->>R: policy.should_retry(start, n_past_retries)
        alt retries remaining
            R->>R: sleep(backoff)
            R->>R: "n_past_retries += 1"
        else retries exhausted
            R-->>U: "Ok(Response{503})"
        end
    end
    U->>U: "!status.is_success() -> bail!"
Loading

Reviews (3): Last reviewed commit: "fix(upload): retry streamed uploads on t..." | Re-trigger Greptile

Comment thread src/upload/uploader.rs Outdated
Comment thread src/upload/uploader.rs
@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from bee5a31 to 2298ed2 Compare June 11, 2026 17:22

@GuillaumeLagrange GuillaumeLagrange left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Streamed (on-disk) uploads went through STREAMING_CLIENT, which has no
retry middleware because a consumed stream body can't be replayed. Add a
manual retry loop that rebuilds the stream from disk on each attempt and
reuses reqwest_retry's backoff policy and transient classifier, matching
the in-memory path's behavior.

Extract the shared backoff into request_client::upload_backoff() so both
the retry middleware and the streamed loop use one policy; under cfg(test)
its intervals shrink to milliseconds to keep the retry tests fast. Add
tests asserting both the streamed and in-memory paths retry transient
503s 1 + UPLOAD_RETRY_COUNT times.
@not-matthias not-matthias force-pushed the cod-2839-retry-upload-on-connection-errorstimeouts branch from 2298ed2 to 32229fd Compare June 15, 2026 08:51
@not-matthias not-matthias merged commit 32229fd into main Jun 15, 2026
21 checks passed
@not-matthias not-matthias deleted the cod-2839-retry-upload-on-connection-errorstimeouts branch June 15, 2026 08:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants