feat(settings): make LLM timeout configurable (#776) #783

Open
Frankli9986 wants to merge 1 commit into srbhr:main from Frankli9986:fix-776-configurable-timeout
Conversation

@Frankli9986 (Contributor) commented May 4, 2026

Closes #776

Summary

Users running local LLMs (e.g. Mistral via Ollama) frequently hit the hard-coded 120s/180s timeout. This PR makes the timeout user-configurable via the Settings UI.

Changes

Backend

  • LLMConfigRequest / LLMConfigResponse: add optional timeout_seconds field (validated 30–600)
  • routers/config.py: read and persist timeout_seconds from/to config.json
  • llm.py: _calculate_timeout() accepts base_timeout_override; completion and json operations use it; health_check keeps its 30s default
  • 8 new unit tests covering override behavior
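
A minimal sketch of the range check and persistence described in the Backend list, using only the standard library. The real schemas presumably rely on the project's own validation layer, which is not shown in this thread. The names `validate_timeout_seconds`, `save_timeout`, and `load_timeout`, and the choice to drop the key when the value is unset, are hypothetical and for illustration only.

```python
import json
from pathlib import Path
from typing import Optional

MIN_TIMEOUT_SECONDS = 30   # lower bound stated in the PR description
MAX_TIMEOUT_SECONDS = 600  # upper bound stated in the PR description


def validate_timeout_seconds(value: Optional[int]) -> Optional[int]:
    """Return the value if it is None or within 30-600, otherwise raise."""
    if value is None:
        return None  # unset: callers fall back to the built-in defaults
    if isinstance(value, bool) or not isinstance(value, int):
        raise ValueError("timeout_seconds must be an integer")
    if not MIN_TIMEOUT_SECONDS <= value <= MAX_TIMEOUT_SECONDS:
        raise ValueError(
            f"timeout_seconds must be between {MIN_TIMEOUT_SECONDS} "
            f"and {MAX_TIMEOUT_SECONDS}"
        )
    return value


def save_timeout(config_path: Path, value: Optional[int]) -> None:
    """Persist a validated timeout, leaving other keys in the file untouched."""
    validated = validate_timeout_seconds(value)
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    if validated is None:
        config.pop("timeout_seconds", None)  # assumption: unset means key absent
    else:
        config["timeout_seconds"] = validated
    config_path.write_text(json.dumps(config, indent=2))


def load_timeout(config_path: Path) -> Optional[int]:
    """Read the stored timeout, or None when the file or key is missing."""
    if not config_path.exists():
        return None
    return json.loads(config_path.read_text()).get("timeout_seconds")
```

Validating before the file is read means a rejected value never touches config.json, so a bad request cannot leave a half-written setting behind.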

Frontend

  • Settings page: numeric input for timeout (30–600s, step 30) with placeholder "Default (120s)"
  • lib/api/config.ts: timeout_seconds added to LLMConfig and LLMConfigUpdate
  • i18n: translations added for en, es, ja, pt-BR, zh

Behavior

  • When unset: uses existing defaults (120s completion / 180s JSON)
  • When set: overrides the base timeout; token_factor and provider_factor still apply
  • Health checks are unaffected (always 30s)
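
The behavior above could be sketched roughly as follows. This is not the PR's `_calculate_timeout()`: how `token_factor` and `provider_factor` are derived is not shown in this thread, so the sketch takes them as plain arguments, and the `operation` strings and the `calculate_timeout` name are assumptions.

```python
from typing import Optional

# Base timeouts in seconds, as quoted in the PR description.
DEFAULT_BASE_TIMEOUTS = {"completion": 120, "json": 180}
HEALTH_CHECK_TIMEOUT = 30


def calculate_timeout(
    operation: str,
    token_factor: float = 1.0,
    provider_factor: float = 1.0,
    base_timeout_override: Optional[int] = None,
) -> int:
    """Pick a base timeout, then scale it by the two factors."""
    if operation == "health_check":
        # Per the PR, health checks always use 30s and ignore the setting.
        return HEALTH_CHECK_TIMEOUT
    base = DEFAULT_BASE_TIMEOUTS[operation]
    if base_timeout_override is not None:
        base = base_timeout_override  # user setting replaces only the base
    # Assumption: the factors are simple multipliers on the base.
    return int(base * token_factor * provider_factor)
```

Under these assumptions, a 300s setting with a token factor of 1.5 and a provider factor of 2.0 yields an effective timeout of 900s, which is why the effective value can exceed the 600s input limit.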

Summary by cubic

Make LLM request timeouts configurable in Settings to support slow or local models; applies to completions and JSON operations, while health checks stay at 30s. Closes #776.

  • New Features
    • Backend: optional timeout_seconds (30–600) in LLM config; persisted via config API; _calculate_timeout() accepts override; used in complete and complete_json; defaults (120/180s) when unset.
    • Frontend: numeric timeout input in Settings (30–600, step 30, placeholder “Default (120s)”); added to LLMConfig and LLMConfigUpdate.
    • Behavior: override still scales with token and provider factors; health checks unchanged at 30s.
    • i18n: added strings for en, es, ja, pt-BR, zh.
    • Extras: added groq to supported providers and API key management.
    • Tests: 8 unit tests cover override logic.

Written for commit 0e4d4f3. Summary will update on new commits.

- Add timeout_seconds to LLMConfigRequest/Response schemas
- Persist timeout in config router
- Modify _calculate_timeout() to accept base_timeout_override
- Update complete() and complete_json() call sites
- Add timeout input to frontend Settings page
- Add i18n translations for all 5 supported languages
- Add 8 unit tests for timeout override behavior

@cubic-dev-ai (Bot, Contributor) left a comment

No issues found across 11 files

@rami-shalhoub

Hey @Frankli9986, I tested your PR, great job.
But I found some issues; you can find my fixes for them here.

In summary:

  • backend/app/llm.py: I put back get_safe_max_tokens, as it is a safeguard for cloud-based LLMs, and I made it so you can change the fallback max tokens when you call the function.

  • frontend/next.config.ts: Raised proxyTimeout to 3_600_000 to match the backend's safety net.

    You fixed the backend timeouts, but Next.js proxies /api/* to the backend with its own proxyTimeout: 240_000 (240s). When the backend took longer, Next.js killed the proxied request and returned a plain 500 before the backend could respond.

  • backend/app/routers/resumes.py: Derive the wait_for timeout from the Settings value: min(timeout_seconds × 5, 3600). When it fires, asyncio.wait_for cancels the inner task (httpx → Ollama), so CPU usage stops shortly after the timeout.

  • apps/frontend/lib/api/resume.ts: Added a module-level cache, _improveTimeoutMs, initialized from fetchLlmConfig().timeout_seconds when the tailor page mounts. Default raised to 600_000.
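
The wait_for item in the list above could look roughly like this. It is a sketch rather than the code in resumes.py: `outer_wait_timeout`, `run_with_deadline`, the fallback to 120s when the setting is unset, and returning None on timeout are all assumptions, and `asyncio.sleep` stands in for the httpx call to Ollama.

```python
import asyncio
from typing import Optional

DEFAULT_TIMEOUT_SECONDS = 120  # assumption: fall back to the completion default
MAX_WAIT_SECONDS = 3600        # cap quoted in the comment above


def outer_wait_timeout(timeout_seconds: Optional[int]) -> int:
    """Compute min(timeout_seconds * 5, 3600) for the outer wait_for."""
    base = DEFAULT_TIMEOUT_SECONDS if timeout_seconds is None else timeout_seconds
    return min(base * 5, MAX_WAIT_SECONDS)


async def run_with_deadline(coro, deadline_seconds: float):
    """Await coro; wait_for cancels it if the deadline passes first."""
    try:
        return await asyncio.wait_for(coro, timeout=deadline_seconds)
    except asyncio.TimeoutError:
        return None  # hypothetical handling; a real route would report an error


async def _demo():
    cancelled = False

    async def slow_llm_call() -> str:
        nonlocal cancelled
        try:
            await asyncio.sleep(10)  # stands in for the request to the model
            return "done"
        except asyncio.CancelledError:
            cancelled = True  # wait_for cancelled the inner task
            raise

    result = await run_with_deadline(slow_llm_call(), deadline_seconds=0.05)
    return result, cancelled


demo_result, demo_cancelled = asyncio.run(_demo())
```

With the 30–600 range from the PR, timeout_seconds × 5 tops out at 3000, so the 3600 cap would only come into play if the allowed range were widened.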

I tested these changes on Ollama running the qwen3:8b model and they worked fine for me.

Development

Successfully merging this pull request may close these issues.

[Feature]: 240 seconds isn't enough time for my local LLM

2 participants