- Bump uv from 0.9.11 → 0.10.12 in all CI workflows to match the local version
- Add index-strategy = "unsafe-best-match" to [tool.uv] so uv finds the correct torch/flash-attn variants across multiple indexes
- Regenerate uv.lock with uv 0.10.12 for a consistent cross-platform lock
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
nginx answers the TCP probe in <1s, moving the container into
"serving" mode where Cloud Run throttles CPU between requests.
The background Python import process (torch/transformers) then
starves for CPU between 5s health check intervals, pushing
startup from ~30s to 100s+.
Add --no-cpu-throttling so the background process always gets
CPU during cold start.
Also fix a pre-existing health check bug: curl -sf writes the
HTTP code ("502") to stdout and then exits non-zero on 4xx/5xx,
causing || echo "000" to append "000" — HTTP becomes "502000"
which never matches "200". Switch to curl -s (no fail-on-error)
so the status code is captured cleanly. Bump retries 10→20 for
a 100s window as belt-and-suspenders.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
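The health check fix above can be illustrated with a small sketch. It is not the deploy script from this PR (that one is shell and uses curl). It only models the two points the message makes: a status string with a fallback appended never equals "200", and 20 attempts at 5s intervals give the roughly 100s window. The names `wait_until_healthy`, `probe`, and `sleep` are hypothetical.

```python
import time
from typing import Callable


def wait_until_healthy(
    probe: Callable[[], str],
    retries: int = 20,          # bumped from 10, per the commit message
    interval_s: float = 5.0,    # matches the 5s health check interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns exactly "200" or attempts run out.

    `probe` is a hypothetical callable that returns the HTTP status as a
    string, standing in for the curl call in the real script.
    """
    for attempt in range(1, retries + 1):
        if probe() == "200":
            return True
        if attempt < retries:
            sleep(interval_s)   # wait before the next attempt
    return False


# Model of the old bug: curl prints "502", exits non-zero, and the
# `|| echo "000"` fallback appends a second code to the captured output.
buggy_status = "502" + "000"    # "502000", which can never equal "200"
```

Passing `sleep` in as a parameter is only a convenience so the loop can be exercised without real delays.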
…rror
When the user submits text immediately after a cold start, nginx is up
but uvicorn hasn't finished loading yet (~30s). Previously this showed
a confusing browser alert("Analysis failed: Server error (502)").
Now:
- 502/503 with no tokens received → show "Server is starting up —
retrying in 5s..." with a live countdown, auto-retry up to 10×
(50s window covers uvicorn's ~30s startup)
- Other errors → show a styled inline error banner (magenta, fades
after 8s) instead of a blocking alert() popup
- Tag the http status on thrown errors so detection doesn't rely on
parsing the message string
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
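The retry rule described above can be sketched as follows. The actual change lives in browser-side code, so this Python version only models the decision logic. `HttpError`, `should_retry`, `MAX_RETRIES`, and `RETRY_DELAY_S` are hypothetical names chosen for illustration.

```python
MAX_RETRIES = 10     # auto-retry up to 10 times, per the commit message
RETRY_DELAY_S = 5    # 5s between retries, so about a 50s window


class HttpError(Exception):
    """Error that carries the HTTP status as an attribute (hypothetical)."""

    def __init__(self, status: int, message: str = "") -> None:
        super().__init__(message or f"Server error ({status})")
        self.status = status


def should_retry(err: Exception, tokens_received: int, attempt: int) -> bool:
    """Return True only for a cold-start style failure.

    The status is read from the tagged attribute, not parsed out of the
    message string. `attempt` counts retries already made.
    """
    status = getattr(err, "status", None)
    return (
        status in (502, 503)
        and tokens_received == 0     # nothing streamed yet, safe to restart
        and attempt < MAX_RETRIES
    )
```

Anything for which `should_retry` returns False would go to the inline error banner path instead.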
- Add slowapi>=0.1.9 dependency
- Add RATE_LIMIT env var (default 10/minute, configurable at deploy time)
- Add _get_client_ip helper respecting Cloud Run X-Forwarded-For header
- Apply @limiter.limit to /analyze and /analyze/stream endpoints
- Fix torch/torchvision/torchaudio uv sources to use platform markers so macOS falls back to PyPI CPU wheels instead of failing on CUDA index
Summary
Clickup Ticket(s): Link(s) if applicable.
Type of Change
Changes Made
Testing
- uv run pytest tests/
- uv run mypy <src_dir>
- uv run ruff check src_dir/
Manual testing details:
Screenshots/Recordings
Related Issues
Deployment Notes
Checklist