You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The process-global LRU used the raw textSearch string as a cache key without any length bound, allowing attacker-controlled large strings to be retained in worker memory and enabling a remote memory-DoS when a default embedder is configured.
The cache also stored failed (None) embeddings, which could pin attacker inputs in the LRU until eviction.
Description
Add a new constant DISCOVER_TEXT_SEARCH_MAX_LENGTH = 512 to bound accepted Discover textSearch input. (opencontractserver/constants/search.py)
Introduce _normalise_text_search() to strip and reject empty or overlong searches, and route all five Discover resolvers to call it before performing text or semantic work. (config/graphql/discover_queries.py)
Change the query-vector cache wrapper to raise _UncacheableQueryVector when embedding fails instead of returning/caching None, and update _semantic_ids() to catch that exception and fall back to text-only results. (config/graphql/discover_queries.py)
Add regression tests asserting that an overlong query returns an empty result before any embedding call and that failed embeddings do not populate the LRU cache. (opencontractserver/tests/test_discover_search_graphql.py)
git diff --check produced no whitespace/patch errors.
pytest opencontractserver/tests/test_discover_search_graphql.py could not be executed in this environment due to a missing django dependency (ModuleNotFoundError: No module named 'django'), so full test-suite execution was not performed here.
This PR correctly addresses two real security issues: unbounded user-controlled strings being retained as LRU cache keys, and failed embedding calls being cached as None (pinning the query string in worker memory until LRU eviction). The approach — input normalisation at the resolver layer plus converting the cache wrapper to raise-on-failure — is sound. A few issues worth addressing before merge:
1. Stale return-type annotation on _cached_query_vector — maintenance hazard
config/graphql/discover_queries.py, line 149
The function signature still reads -> Optional[list], but after this PR the function never returns None — it either returns a list or raises _UncacheableQueryVector. Any future caller that copies the old call pattern:
vector=_cached_query_vector(query_text, embedder_path)
ifnotvector: # dead branch — None can never reach herereturn []
would silently swallow the uncaught exception as a 500 error to GraphQL instead of the intended text-only graceful fallback. Update the annotation to -> list so mypy and future readers see the real contract.
2. Missing changelog fragment — CLAUDE.md requirement
changelog.d/
CLAUDE.md mandates a changelog.d/<slug>.security.md (or .fixed.md) fragment for any security fix. No 2091* file appears in changelog.d/. The closest existing fragments (1919-discover-followup.changed.md, 1919-discover-followup.fixed.md) cover the prior caching refactor, not this DoS guard.
Add something like changelog.d/2091-discover-cache-dos.security.md with a bullet describing the vulnerability and fix.
3. Protection lives only at call sites, not at the resource boundary — altitude concern
config/graphql/discover_queries.py, line 119 (_normalise_text_search) and line 149 (_cached_query_vector)
The 512-char cap is enforced inside each of the five resolver methods. _cached_query_vector itself accepts an unbounded query_text. If a sixth resolver is added later (or any internal caller reuses _semantic_ids directly), the protection is silently absent — the long string becomes an LRU cache key with no warning.
The minimum fix is a # PRECONDITION: query_text must have passed _normalise_text_search docstring on both _semantic_ids and _cached_query_vector. A stronger fix is a defensive assert len(query_text) <= DISCOVER_TEXT_SEARCH_MAX_LENGTH inside _cached_query_vector that makes a violation loud rather than silent.
4. test_overlong_query_returns_empty_before_searching verifies the semantic arm only, not the text arm
opencontractserver/tests/test_discover_search_graphql.py, line 307
The test patches _query_vector and asserts it was never called, which proves the semantic arm is bypassed. It does not assert that the text arm (_text_ids) was also skipped. If a future refactor moves the length check to after the _text_ids call (guarding only the LRU), the test would still pass while the unbounded DB text-predicate DoS vector is silently reintroduced. Adding patch("config.graphql.discover_queries._text_ids") alongside the existing mock and asserting that too was not called would close this gap.
Minor: test_overlong_query_returns_empty_before_searching assertion is outside the with block
opencontractserver/tests/test_discover_search_graphql.py, line 310
withpatch("config.graphql.discover_queries._query_vector") asquery_vector:
result=self.graphene_client.execute(...)
self.assertIsNone(result.get("errors"), ...)
self.assertEqual(...)
query_vector.assert_not_called() # <-- outside the with block
query_vector.assert_not_called() runs after the patch has been deactivated. This works correctly in practice (mock objects retain their call history), but if the resolver ever spawns an async or deferred post-execute callback that calls _query_vector, those calls would not be captured by the inner mock after the with exits. Move lines 308–310 inside the with block to make the assertion scope explicit.
Overall the security intent is correct and the test coverage for the new behaviour is a good start. Addressing the stale type annotation and the missing changelog fragment are the two must-haves before merge; the other points are improvements worth discussing.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Motivation
textSearchstring as a cache key without any length bound, allowing attacker-controlled large strings to be retained in worker memory and enabling a remote memory-DoS when a default embedder is configured.None) embeddings, which could pin attacker inputs in the LRU until eviction.Description
DISCOVER_TEXT_SEARCH_MAX_LENGTH = 512to bound accepted DiscovertextSearchinput. (opencontractserver/constants/search.py)_normalise_text_search()to strip and reject empty or overlong searches, and route all five Discover resolvers to call it before performing text or semantic work. (config/graphql/discover_queries.py)_UncacheableQueryVectorwhen embedding fails instead of returning/cachingNone, and update_semantic_ids()to catch that exception and fall back to text-only results. (config/graphql/discover_queries.py)opencontractserver/tests/test_discover_search_graphql.py)Testing
python -m py_compile config/graphql/discover_queries.py opencontractserver/constants/search.py opencontractserver/tests/test_discover_search_graphql.pysucceeded (syntax check passed).git diff --checkproduced no whitespace/patch errors.pytest opencontractserver/tests/test_discover_search_graphql.pycould not be executed in this environment due to a missingdjangodependency (ModuleNotFoundError: No module named 'django'), so full test-suite execution was not performed here.Codex Task