fix(llm): thread per-scope LLM timeout into constructed providers (#2452) by r266-tech · Pull Request #2471 · vectorize-io/hindsight

r266-tech · 2026-06-30T08:57:04Z

Summary

The per-scope retain/reflect/consolidation LLM timeouts are resolved into config but never threaded into the constructed LLM provider, so they are silently ignored: a retain call always falls back to the global HINDSIGHT_API_LLM_TIMEOUT / 120s default regardless of HINDSIGHT_API_RETAIN_LLM_TIMEOUT. Fixes #2452.

Root cause

config.retain_llm_timeout (and the reflect/consolidation variants) are parsed in config.py, but the per-op LLMConfig builds in memory_engine.py never passed timeout= to the provider. LLMProvider / create_llm_provider had no timeout parameter, so LiteLLMLLM.__init__ fell back to os.getenv(ENV_LLM_TIMEOUT, DEFAULT_LLM_TIMEOUT) — the global value — no matter the per-scope setting. HINDSIGHT_API_RETAIN_LLM_TIMEOUT=300 was therefore a no-op; a long retain call still aborted at 120s, producing the LiteLLM call exceeded timeout=120s symptom from the issue.

Fix

Thread timeout through the same plumbing the service-tier / default-headers config already uses:

create_llm_provider and LLMProvider.__init__ gain a timeout: float | None param, forwarded into the litellm and bedrock LiteLLM constructions (LiteLLMLLM.__init__ already accepts timeout).
The per-op LLMConfig builds in memory_engine.py pass timeout=config.<op>_llm_timeout if not None else config.llm_timeout for retain/reflect/consolidation; the default/member builds and from_env pass the global llm_timeout. The is not None check keeps an explicit 0 honored and never resolves to None (which would disable the existing asyncio.wait_for hard-timeout backstop).

Non-LiteLLM providers ignore timeout (the same way they ignore service_tier); the reported symptom and the existing hard-timeout backstop are both LiteLLM-side.

Tests

tests/test_llm_timeout_threading.py mirrors test_bedrock_service_tier.py: it asserts the timeout reaches LiteLLMLLM via both LLMConfig and create_llm_provider (litellm + bedrock alias) and that an unset timeout still resolves to the finite default (never None). The new tests + test_bedrock_service_tier.py + test_litellm_timeout.py pass; broader test_llm_provider / test_config_validation / test_llm_per_op_concurrency are green; ruff check and ruff format --check clean. (Like the service-tier wiring tests, coverage is at the LLMProvider -> provider_impl bridge; the memory_engine.py per-op timeout= lines sit alongside the existing service_tier= lines.)

The retain/reflect/consolidation LLM timeouts were resolved into config but never passed to the provider, so they were silently ignored and every call fell back to the global HINDSIGHT_API_LLM_TIMEOUT / 120s default. Thread timeout through create_llm_provider and LLMProvider into the LiteLLM (and bedrock-alias) constructions, and pass the resolved per-op value from the MemoryEngine builds (global llm_timeout fallback). Add threading tests. Fixes vectorize-io#2452

r266-tech force-pushed the fix/llm-per-scope-timeout-2452 branch from d36d219 to 146819b Compare June 30, 2026 09:18

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(llm): thread per-scope LLM timeout into constructed providers (#2452)#2471

fix(llm): thread per-scope LLM timeout into constructed providers (#2452)#2471
r266-tech wants to merge 1 commit into
vectorize-io:mainfrom
r266-tech:fix/llm-per-scope-timeout-2452

r266-tech commented Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

r266-tech commented Jun 30, 2026

Summary

Root cause

Fix

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant