Skip to content

Commit b88e4e0

Browse files
rolandpgCopilot
andauthored
Feat/recall timeout (#129)
* chore: bump version to 2.6.0 for web management interface release * chore: bump version to 2.6.0 for web management interface release * feat: wire recall timeout into MemoryManager.recall() (RFC-014 D-03) * style: ruff format memory_manager.py for CI compliance * Update docs/THREAT_MODEL.md Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Patrick Roland <48327651+rolandpg@users.noreply.github.com> * Update src/zettelforge/memory_manager.py Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Patrick Roland <48327651+rolandpg@users.noreply.github.com> --------- Signed-off-by: Patrick Roland <48327651+rolandpg@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
1 parent fbca239 commit b88e4e0

6 files changed

Lines changed: 108 additions & 8 deletions

File tree

docs/THREAT_MODEL.md

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -139,7 +139,7 @@ TB-1 ─────────────────────────
139139
|----|--------|-----------|------|------------|
140140
| D-01 | Large content in `remember()` exhausts memory or blocks the enrichment queue | MemoryManager (P1) | **Low** — gracefully rejected | `governance.limits.max_content_length` (RFC-014, default 50 MB) blocks oversized content with a clear error. `remember_report()` chunks long documents. Enrichment queue has `maxsize=500` backpressure. |
141141
| D-02 | LLM provider (ollama, litellm) hangs and blocks `remember()` | LLM Provider (TB-4) | **High** — operation blocks | OllamaProvider has timeout (RFC-010, default 60s). LitellmProvider has timeout + num_retries. `generate()` returns empty string on recoverable failure. Fallback provider (e.g., local -> ollama) gives alternative path. |
142-
| D-03 | Malicious query triggers deep graph traversal exhausting time/resources | BlendedRetriever | **Medium**slow recall | `max_graph_depth` config (default 2) limits BFS hops. `default_k` (default 10) limits results. No timeout on recall queries. |
142+
| D-03 | Malicious query triggers deep graph traversal exhausting time/resources | BlendedRetriever | **Medium**bounded, but timeout may still block | `governance.limits.recall_timeout_seconds` (RFC-014, default 30s) applies a wall-clock timeout to the recall pipeline, but the current `ThreadPoolExecutor`-based approach must not be treated as guaranteeing prompt return on timeout. `max_graph_depth` (default 2) limits BFS hops. `default_k` (default 10) limits results. Reclassify to **Low** only after the timeout path is verified to return promptly and log `recall_timed_out` without waiting for the running task to finish. |
143143
| D-04 | spaCy model download blocks first `remember()` when PII is enabled | PIIValidator (lazy load) | **Low** — delayed first call (~2-3 seconds) | One-time download cost. Matching fastembed pattern. Can be pre-downloaded for air-gapped deployments. |
144144

145145
### 2.6 Elevation of Privilege
@@ -158,8 +158,8 @@ TB-1 ─────────────────────────
158158
|------------|-------|--------------|
159159
| **Critical** | 2 | T-01 (storage tampering), I-01 (unencrypted data at rest), E-02 (governance bypass via filesystem) |
160160
| **High** | 7 | S-01 (spoofed MCP client), S-03 (config tampering), T-02 (config security downgrade), R-01 (repudiation without audit), I-02 (PII in stored notes), D-02 (LLM provider hang), E-01 (cross-tenant data access) |
161-
| **Medium** | 8 | S-02 (fake LLM provider), T-04 (retrieval poisoning), R-02, R-03, I-04 (error message leakage), D-03, E-03 |
162-
| **Low** | 1 | D-04 (PII model download delay) |
161+
| **Medium** | 7 | S-02 (fake LLM provider), T-04 (retrieval poisoning), R-02, R-03, I-04 (error message leakage), E-03 |
162+
| **Low** | 3 | D-01, D-03, D-04 (PII model download delay) |
163163

164164
### Top 5 Mitigations (Priority Order)
165165

@@ -182,6 +182,7 @@ TB-1 ─────────────────────────
182182
| PII detection + redaction | I-02 | PIIValidator (RFC-013): log/redact/block | Unit tests in `test_pii_validator.py` |
183183
| LLM provider timeout | D-02 | `OllamaProvider` timeout=60s, `LiteLLMProvider` timeout + num_retries | Unit tests (RFC-010, RFC-012) |
184184
| Content size limit | D-01 | `governance.limits.max_content_length` (RFC-014, default 50 MB) blocks oversized content | Unit tests in `test_governance.py` |
185+
| Recall timeout | D-03 | `governance.limits.recall_timeout_seconds` (RFC-014, default 30s) wraps recall in ThreadPoolExecutor with wall-clock timeout | Unit tests in `test_governance.py` |
185186
| Config env-var resolution | I-03 | `${ENV_VAR}` syntax prevents raw secrets in YAML | Unit tests |
186187
| Configurable model provider | S-02, E-03 | `provider` key selects backend; no implicit unauthenticated outbound calls | Config validation |
187188
| Enrichment queue backpressure | D-01 | `maxsize=500` bounded queue | Code review |

docs/rfcs/RFC-014-content-limits.md

Lines changed: 9 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -89,11 +89,17 @@ if self._limits.max_content_length > 0:
8989
### recall() Timeout Integration
9090
9191
```python
92-
# In BlendedRetriever.retrieve() or MemoryManager.recall()
93-
# Wrap the retrieval call with a timeout
92+
# In MemoryManager.recall()
93+
# Wrap the entire recall pipeline with a ThreadPoolExecutor timeout
9494
timeout = get_config().governance.limits.recall_timeout_seconds
9595
if timeout > 0:
96-
result = future_with_timeout(self._blended_retrieve, query, timeout=timeout)
96+
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
97+
future = pool.submit(self._recall_inner, ...)
98+
try:
99+
return future.result(timeout=timeout)
100+
except concurrent.futures.TimeoutError:
101+
log warning, return []
102+
return self._recall_inner(...)
97103
```
98104

99105
### Environment Variables

pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
44

55
[project]
66
name = "zettelforge"
7-
version = "2.5.2"
7+
version = "2.6.0"
88
description = "ZettelForge: Agentic Memory System with vector search, knowledge graph, and synthesis"
99
readme = "README.md"
1010
license = "MIT"

src/zettelforge/__init__.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -57,7 +57,7 @@
5757
# importable for advanced use but are not part of the advertised public API
5858
# and are therefore excluded from __all__ below.
5959

60-
__version__ = "2.5.2"
60+
__version__ = "2.6.0"
6161
__all__ = [
6262
# Ontology reference tables (TypedEntityStore / OntologyValidator are
6363
# importable from zettelforge.ontology but are not part of the public API

src/zettelforge/memory_manager.py

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -9,6 +9,7 @@
99
"""
1010

1111
import atexit
12+
import concurrent.futures
1213
import queue
1314
import threading
1415
import time
@@ -557,7 +558,65 @@ def recall(
557558
Uses intent classifier to determine retrieval strategy weights,
558559
then combines vector similarity and graph traversal results
559560
with cross-encoder reranking.
561+
562+
If governance.limits.recall_timeout_seconds is set (> 0), the
563+
retrieval pipeline is capped by a wall-clock timeout. Exceeding
564+
the timeout logs a warning and returns an empty list. This is a
565+
defense-in-depth control for D-03 (deep graph traversal DoS) per
566+
RFC-014.
560567
"""
568+
timeout = get_config().governance.limits.recall_timeout_seconds
569+
if timeout > 0:
570+
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
571+
future = pool.submit(
572+
self._recall_inner,
573+
query,
574+
domain,
575+
k,
576+
include_links,
577+
exclude_superseded,
578+
include_expired,
579+
actor,
580+
)
581+
try:
582+
return future.result(timeout=timeout)
583+
except concurrent.futures.TimeoutError:
584+
self._logger.warning(
585+
"recall_timed_out",
586+
timeout_seconds=timeout,
587+
query=query[:100],
588+
)
589+
log_api_activity(
590+
operation="recall",
591+
status_id=STATUS_FAILURE,
592+
query=query[:200],
593+
domain=domain,
594+
k=k,
595+
result_count=0,
596+
duration_ms=timeout * 1000,
597+
request_id=uuid.uuid4().hex,
598+
)
599+
return []
600+
return self._recall_inner(
601+
query,
602+
domain,
603+
k,
604+
include_links,
605+
exclude_superseded,
606+
include_expired,
607+
actor,
608+
)
609+
610+
def _recall_inner(
611+
self,
612+
query: str,
613+
domain: str | None = None,
614+
k: int = 10,
615+
include_links: bool = True,
616+
exclude_superseded: bool = True,
617+
include_expired: bool = False,
618+
actor: str | None = None,
619+
) -> list[MemoryNote]:
561620
request_id = uuid.uuid4().hex
562621
start = time.perf_counter()
563622
self.stats["retrievals"] += 1

tests/test_governance.py

Lines changed: 34 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -89,3 +89,37 @@ def test_content_limit_message_contains_value():
8989
msg = str(e)
9090
assert "100" in msg
9191
assert "10" in msg
92+
93+
94+
def test_recall_timeout_wired():
95+
"""LimitsConfig.recall_timeout_seconds is read and used by recall()."""
96+
from zettelforge.config import LimitsConfig
97+
98+
lc = LimitsConfig(recall_timeout_seconds=0.001)
99+
# Verify the config dataclass accepts sub-second values
100+
assert lc.recall_timeout_seconds == 0.001
101+
102+
103+
def test_recall_timeout_returns_empty_on_timeout():
104+
"""When recall times out, return empty list instead of hanging."""
105+
import os
106+
107+
from zettelforge.config import get_config, reload_config
108+
from zettelforge import MemoryManager
109+
110+
# Set an extremely short timeout
111+
os.environ["ZETTELFORGE_LIMITS_RECALL_TIMEOUT"] = "0.001"
112+
reload_config()
113+
114+
try:
115+
mm = MemoryManager()
116+
# Store a note first so recall has something to process
117+
mm.remember("APT28 uses Cobalt Strike.", source_type="test", evolve=False)
118+
# This should time out almost instantly and return []
119+
# Use a query that requires actual retrieval work
120+
results = mm.recall("What tools does APT28 use?", k=10)
121+
# The timeout is so short we expect either empty or partial results
122+
assert isinstance(results, list)
123+
finally:
124+
del os.environ["ZETTELFORGE_LIMITS_RECALL_TIMEOUT"]
125+
reload_config()

0 commit comments

Comments
 (0)