
Deploy #23 (Draft)

amrit110 wants to merge 4 commits into `main` from `deploy`

Conversation

@amrit110 (Member)

Summary

Clickup Ticket(s): Link(s) if applicable.

Type of Change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📝 Documentation update
  • 🔧 Refactoring (no functional changes)
  • ⚡ Performance improvement
  • 🧪 Test improvements
  • 🔒 Security fix

Changes Made

Testing

  • Tests pass locally (uv run pytest tests/)
  • Type checking passes (uv run mypy <src_dir>)
  • Linting passes (uv run ruff check <src_dir>)
  • Manual testing performed (describe below)

Manual testing details:

Screenshots/Recordings

Related Issues

Deployment Notes

Checklist

  • Code follows the project's style guidelines
  • Self-review of code completed
  • Documentation updated (if applicable)
  • No sensitive information (API keys, credentials) exposed

- Bump uv from 0.9.11 → 0.10.12 in all CI workflows to match local version
- Add index-strategy = "unsafe-best-match" to [tool.uv] so uv finds the
  correct torch/flash-attn variants across multiple indexes
- Regenerate uv.lock with uv 0.10.12 for a consistent cross-platform lock
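The second bullet corresponds to a `pyproject.toml` fragment along these lines (a sketch of the one setting named above; any surrounding index definitions are omitted):

```toml
# pyproject.toml
[tool.uv]
# Let uv pick the best-matching wheel across all configured indexes,
# instead of pinning each package to the first index that lists it.
# Needed so torch / flash-attn variants resolve across multiple indexes.
index-strategy = "unsafe-best-match"
```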

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
amrit110 and others added 3 commits March 20, 2026 19:50
nginx answers the TCP probe in <1s, moving the container into
"serving" mode where Cloud Run throttles CPU between requests.
The background Python import process (torch/transformers) then
starves for CPU between 5s health check intervals, pushing
startup from ~30s to 100s+.

Add --no-cpu-throttling so the background process always gets
CPU during cold start.
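As a sketch, the flag is passed at deploy time roughly like this (service name and region are placeholders, not values from this PR):

```shell
# Hypothetical deploy command; SERVICE and REGION are placeholders.
gcloud run deploy SERVICE \
  --region REGION \
  --no-cpu-throttling  # keep CPU allocated between requests, so the
                       # background torch/transformers import isn't starved
```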

Also fix a pre-existing health check bug: curl -sf writes the
HTTP code ("502") to stdout and then exits non-zero on 4xx/5xx,
causing || echo "000" to append "000" — HTTP becomes "502000"
which never matches "200". Switch to curl -s (no fail-on-error)
so the status code is captured cleanly. Bump retries 10→20 for
a 100s window as belt-and-suspenders.
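The concatenation bug can be illustrated with a small simulation (the helper here is illustrative; with `-f`, curl exits non-zero on 4xx/5xx while `-w '%{http_code}'` still prints the code):

```python
def simulate_curl(http_code: int, fail_on_error: bool) -> tuple[str, int]:
    """Model `curl -w '%{http_code}'`: the code is printed either way;
    with -f (fail_on_error) curl also exits non-zero on 4xx/5xx."""
    exit_code = 22 if fail_on_error and http_code >= 400 else 0
    return str(http_code), exit_code

# Buggy: HTTP=$(curl -sf ... || echo "000") — output and fallback concatenate.
out, rc = simulate_curl(502, fail_on_error=True)
http = out + ("000" if rc != 0 else "")
assert http == "502000"   # never equals "200", so the health check loops forever

# Fixed: curl -s (no -f) exits 0 on a 502, so the fallback never fires.
out, rc = simulate_curl(502, fail_on_error=False)
http = out + ("000" if rc != 0 else "")
assert http == "502"
```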

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rror

When the user submits text immediately after a cold start, nginx is up
but uvicorn hasn't finished loading yet (~30s). Previously this showed
a confusing browser alert("Analysis failed: Server error (502)").

Now:
- 502/503 with no tokens received → show "Server is starting up —
  retrying in 5s..." with a live countdown, auto-retry up to 10×
  (50s window covers uvicorn's ~30s startup)
- Other errors → show a styled inline error banner (magenta, fades
  after 8s) instead of a blocking alert() popup
- Tag the http status on thrown errors so detection doesn't rely on
  parsing the message string
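The detection rule in the first bullet can be sketched as follows (the function and constant names are illustrative, not the actual frontend code, which is JavaScript):

```python
RETRYABLE = {502, 503}
MAX_RETRIES = 10       # 10 retries x 5s delay = 50s window, covering ~30s startup
RETRY_DELAY_S = 5

def should_retry(status: int, tokens_received: bool, attempt: int) -> bool:
    # Only a cold-start gateway error with nothing streamed yet is retried;
    # a response that already produced tokens is treated as a real failure
    # and shown in the inline error banner instead.
    return status in RETRYABLE and not tokens_received and attempt < MAX_RETRIES

assert should_retry(502, tokens_received=False, attempt=0)
assert not should_retry(502, tokens_received=True, attempt=0)
assert not should_retry(500, tokens_received=False, attempt=0)
```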

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add slowapi>=0.1.9 dependency
- Add RATE_LIMIT env var (default 10/minute, configurable at deploy time)
- Add _get_client_ip helper respecting Cloud Run X-Forwarded-For header
- Apply @limiter.limit to /analyze and /analyze/stream endpoints
- Fix torch/torchvision/torchaudio uv sources to use platform markers
  so macOS falls back to PyPI CPU wheels instead of failing on CUDA index