
fix: increase API read timeout to prevent flaky E2E failures#2057

Merged
KRRT7 merged 1 commit into main from fix/api-read-timeout on Apr 10, 2026



@KRRT7 commented Apr 10, 2026

Summary

  • Increase API read timeout from 90s to 300s for all app.codeflash.ai endpoints
  • Split into (connect=10s, read=300s) tuple so connections fail fast but LLM inference gets adequate time
  • Fixes flaky async-optimization E2E test caused by ReadTimeoutError on /testgen endpoint under load

Root cause

AiServiceClient used a flat timeout=90 in prod and passed it to every requests.post() call. LLM-powered endpoints (/testgen, /optimize, /refinement, etc.) can legitimately take longer than 90s when the backend is under load. When the timeout is hit, the request raises ReadTimeoutError, test generation returns None, and the E2E test fails.
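
For readers unfamiliar with the tuple form, the following is a minimal sketch of the change described above, not the actual AiServiceClient code. The constant names, the helper name post_with_split_timeout, and the optional session parameter are hypothetical and exist only for illustration. The 10s and 300s values come from the summary. requests interprets a two-element timeout as (connect, read).

```python
from typing import Any, Optional

import requests

# Values taken from the PR summary; the names are illustrative assumptions.
CONNECT_TIMEOUT_S = 10   # fail fast if the host is unreachable
READ_TIMEOUT_S = 300     # leave room for slow LLM inference
API_TIMEOUT = (CONNECT_TIMEOUT_S, READ_TIMEOUT_S)


def post_with_split_timeout(
    url: str,
    payload: dict,
    session: Optional[Any] = None,
) -> Optional[requests.Response]:
    """POST JSON and return the response, or None if the request times out."""
    # `session` is a hypothetical hook: any object with a requests-style
    # .post() method. It defaults to the module-level requests API.
    http = session if session is not None else requests
    try:
        return http.post(url, json=payload, timeout=API_TIMEOUT)
    except requests.exceptions.Timeout:
        # Base class of both ConnectTimeout and ReadTimeout in requests.
        return None
```

Returning None on timeout mirrors the failure path described in the root cause. With a flat 90s value, a slow /testgen response would take this branch, whereas the split tuple only tightens the connect phase and relaxes the read phase.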

Test plan

  • CI passes (unit tests, type-check)
  • async-optimization E2E test no longer flakes on timeout

…failures

The flat 90s timeout was too aggressive for LLM-powered endpoints
(/testgen, /optimize, /refinement) under load, causing ReadTimeoutError
and failing the async-optimization E2E test. Split into (10s connect,
300s read) tuple so connections fail fast but LLM inference gets adequate time.
@KRRT7 merged commit 5ee642e into main Apr 10, 2026
26 of 27 checks passed
@KRRT7 deleted the fix/api-read-timeout branch April 10, 2026 07:45