Commit b07af9b

groksrc and claude committed
docs(core): record entity-boost benchmark findings; keep default off
Benchmarked the #951 entity-aware ranking boost against the LoCoMo retrieval suite (hybrid mode) and a hand-built adversarial corpus.

LoCoMo is insensitive to the boost: sweeping the weight across 0.15/0.3/0.5/1.0/2.0 produced identical recall@5, recall@10, MRR, and content-hit at every point (no query reordered, no score changed). LoCoMo docs are keyed by session id and expose speaker names only in body text, never as entity titles or relation names, so the title/relation-matching boost never fires there.

An adversarial check found a real regression mode: Title-Case queries inject spurious entity terms. 'What Is The Plan For Q3' extracts 'Q3' and, even at weight 0.15, promotes a literal-'Q3' document over the more relevant 'third quarter' document. Clean proper nouns (Katze) work; lowercase-leading identifiers (getUserById) are correctly ignored.

Decision: keep search_entity_boost_enabled default off and the weight at 0.15. LoCoMo provides no signal to raise the weight, and the adversarial check is not clean. Document the findings and guidance; no code/default changes.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Drew Cain <groksrc@gmail.com>
1 parent dcc4a7a commit b07af9b

2 files changed

Lines changed: 41 additions & 5 deletions

docs/semantic-search.md

Lines changed: 35 additions & 3 deletions
@@ -107,7 +107,7 @@ All settings are fields on `BasicMemoryConfig` and can be set via environment va
 | `semantic_embedding_document_input_type` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_DOCUMENT_INPUT_TYPE` | Auto for known LiteLLM models | Optional LiteLLM `input_type` for indexed document/passages. |
 | `semantic_embedding_query_input_type` | `BASIC_MEMORY_SEMANTIC_EMBEDDING_QUERY_INPUT_TYPE` | Auto for known LiteLLM models | Optional LiteLLM `input_type` for search queries. |
 | `semantic_vector_k` | `BASIC_MEMORY_SEMANTIC_VECTOR_K` | `100` | Candidate count for vector nearest-neighbour retrieval. Higher values improve recall at the cost of latency. |
-| `search_entity_boost_enabled` | `BASIC_MEMORY_SEARCH_ENTITY_BOOST_ENABLED` | `false` | Enable the entity-aware ranking boost in hybrid search (see below). Default off pending benchmark validation. |
+| `search_entity_boost_enabled` | `BASIC_MEMORY_SEARCH_ENTITY_BOOST_ENABLED` | `false` | Enable the entity-aware ranking boost in hybrid search (see below). Default off: benchmark-validated as inert on LoCoMo and prone to Title-Case false positives. |
 | `search_entity_boost_weight` | `BASIC_MEMORY_SEARCH_ENTITY_BOOST_WEIGHT` | `0.15` | Per-matched-term multiplier strength for the entity boost. A candidate matching N query entity terms is scaled by `1 + weight * min(N, max_terms)`. |
 | `search_entity_boost_max_terms` | `BASIC_MEMORY_SEARCH_ENTITY_BOOST_MAX_TERMS` | `3` | Maximum number of distinct matched entity terms that contribute to the boost, bounding the multiplier. |

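The `search_entity_boost_weight` row above gives the multiplier as `1 + weight * min(N, max_terms)`. A minimal sketch of that arithmetic follows, assuming a standalone helper named `entity_boost_multiplier`. The name is hypothetical, since this diff does not show the function basic-memory actually uses. The defaults mirror the documented settings.

```python
def entity_boost_multiplier(
    matched_terms: int, weight: float = 0.15, max_terms: int = 3
) -> float:
    """Return 1 + weight * min(N, max_terms) for N matched entity terms.

    Hypothetical helper for illustration only; not basic-memory's own code.
    """
    if matched_terms < 0:
        raise ValueError("matched_terms must be non-negative")
    return 1.0 + weight * min(matched_terms, max_terms)


if __name__ == "__main__":
    # Tabulate the multiplier for N = 0..4 at each weight from the benchmark sweep.
    for weight in (0.15, 0.3, 0.5, 1.0, 2.0):
        row = [round(entity_boost_multiplier(n, weight), 2) for n in range(5)]
        print(f"weight={weight}: {row}")
```

With zero matched terms the multiplier is 1.0, so the fused score is unchanged. With the default settings the multiplier is capped at `1 + 0.15 * 3`, however many terms match.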
@@ -143,8 +143,40 @@ export BASIC_MEMORY_SEARCH_ENTITY_BOOST_WEIGHT=0.15
 export BASIC_MEMORY_SEARCH_ENTITY_BOOST_MAX_TERMS=3
 ```
 
-> **Default off.** This setting is disabled by default pending LoCoMo benchmark
-> validation. Enable it to experiment with entity-heavy corpora.
+> **Default off.** This setting is disabled by default. See the benchmark
+> findings below for why the default stays off and where the boost helps.
+
+### Benchmark findings
+
+The boost was benchmarked against LoCoMo (the
+[basic-memory-benchmarks](https://github.com/basicmachines-co/basic-memory-benchmarks)
+retrieval suite, hybrid mode) and a hand-built adversarial corpus. Two results
+drove the decision to keep the default **off** and leave the weight at `0.15`:
+
+1. **LoCoMo is insensitive to the boost.** Sweeping the weight across
+`0.15, 0.3, 0.5, 1.0, 2.0` produced *identical* recall@5, recall@10, MRR, and
+content-hit at every point — no query reordered, no score changed. LoCoMo's
+documents are titled by conversation/session id and expose speaker names only
+in body text, never as entity titles or relation names. Because the boost
+matches query proper nouns against a candidate's **title or linked relation
+names**, it never fires on this corpus. LoCoMo therefore provides no signal to
+raise the weight, and the boost neither helps nor harms it.
+
+2. **A capitalization-only heuristic has false positives.** On a corpus where
+entity terms appear in titles, the boost correctly promotes the right document
+for clean proper nouns (e.g. `Katze`) and is correctly inert on
+lowercase-leading identifiers (e.g. `getUserById`, ignored). But **Title-Case
+queries can regress**: a query like `What Is The Plan For Q3` extracts `Q3` as
+an entity term, and even at weight `0.15` it promotes a document that
+*literally* contains "Q3" above the more relevant document that says "third
+quarter". Since entity detection is lexical (capitalization, no NER), any
+capitalized non-entity token in a query is a potential false positive.
+
+**Guidance.** Enable the boost only on entity-heavy corpora where your queries
+name entities that are themselves note titles or linked relations (the #951
+"Joanna" case). Prefer natural-case queries (`What are Joanna's hobbies?`) over
+Title-Cased phrasing, which can inject spurious entity terms. Leave it off for
+conversational / body-text-keyed corpora like LoCoMo, where it cannot help.
 
 ## Embedding Providers

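Both benchmark findings in the hunk above can be reproduced with a toy re-ranker. This is a sketch under stated assumptions, not the project's fusion code: `Candidate`, `rerank`, the titles, and the fused scores are all invented for illustration. Matching is simplified to case-insensitive whole-word hits in the title, relation-name matching is omitted, and the entity terms are passed in directly rather than extracted from a query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    title: str
    fused_score: float  # made-up hybrid score before any boost


def rerank(candidates, entity_terms, weight=0.15, max_terms=3):
    """Scale each fused score by 1 + weight * min(N, max_terms), then sort.

    N is the number of distinct entity terms found as whole words in the
    title (case-insensitive). Returns titles in descending boosted order.
    """
    terms = {term.lower() for term in entity_terms}

    def boosted(candidate):
        words = set(candidate.title.lower().split())
        matched = len(terms & words)
        return candidate.fused_score * (1.0 + weight * min(matched, max_terms))

    return [c.title for c in sorted(candidates, key=boosted, reverse=True)]


# Finding 1: titles are session ids, so a name in the query never matches.
session_keyed = [Candidate("session-12", 0.61), Candidate("session-07", 0.58)]

# Finding 2: the spurious term "Q3" literally matches one title.
adversarial = [Candidate("Third quarter plan", 0.55), Candidate("Q3 notes", 0.50)]

if __name__ == "__main__":
    for w in (0.15, 0.3, 0.5, 1.0, 2.0):
        print(w, rerank(session_keyed, ["Joanna"], weight=w))
    print(rerank(adversarial, [], weight=0.15))
    print(rerank(adversarial, ["Q3"], weight=0.15))
```

In the session-keyed case no title ever matches, so every multiplier is 1.0 and the order is the same at every weight. In the adversarial case a single match at weight `0.15` lifts the literal "Q3" title from 0.50 to 0.575, which is enough to pass the 0.55 candidate.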
src/basic_memory/config.py

Lines changed: 6 additions & 2 deletions
@@ -354,13 +354,17 @@ def __init__(self, **data: Any) -> None: ...
     # document about a different entity.
     # Outcome: when enabled, hybrid fusion multiplies a candidate's fused score by a small
     # bonus for each distinct query entity term it matches lexically (no model inference).
-    # Default OFF pending LoCoMo benchmark validation by the maintainer.
+    # Default OFF: LoCoMo benchmarking showed the boost is inert there (its docs are keyed
+    # by session id, not entity titles) and an adversarial check found Title-Case queries
+    # can inject spurious entity terms (e.g. "Q3") that regress ranking. See
+    # docs/semantic-search.md "Benchmark findings".
     search_entity_boost_enabled: bool = Field(
         default=False,
         description="Enable entity-aware ranking boost in hybrid search. When enabled, "
         "hybrid candidates whose title or linked relation names contain a proper-noun "
         "term from the query are boosted in the final ranking. Lexical-only; adds no "
-        "model inference. Default off pending benchmark validation.",
+        "model inference. Default off: benchmark-validated as inert on LoCoMo and prone "
+        "to Title-Case false positives (see docs/semantic-search.md).",
     )
     search_entity_boost_weight: float = Field(
         default=0.15,

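The three boost settings and their environment variables can be read with the standard library alone. The sketch below is a hypothetical stand-in for `BasicMemoryConfig`'s own loading, which this diff does not show. The function name `load_entity_boost_settings` and the set of truthy strings it accepts are assumptions; the variable names and defaults come from the settings table in `docs/semantic-search.md`.

```python
import os


def load_entity_boost_settings(environ=None):
    """Read the entity-boost settings, falling back to the documented defaults.

    Hypothetical helper; the real config class may parse values differently.
    """
    env = os.environ if environ is None else environ
    enabled_raw = env.get("BASIC_MEMORY_SEARCH_ENTITY_BOOST_ENABLED", "false")
    return {
        # Assumed truthy spellings; anything else counts as disabled.
        "search_entity_boost_enabled": enabled_raw.strip().lower()
        in {"1", "true", "yes", "on"},
        "search_entity_boost_weight": float(
            env.get("BASIC_MEMORY_SEARCH_ENTITY_BOOST_WEIGHT", "0.15")
        ),
        "search_entity_boost_max_terms": int(
            env.get("BASIC_MEMORY_SEARCH_ENTITY_BOOST_MAX_TERMS", "3")
        ),
    }


if __name__ == "__main__":
    # With no overrides, the boost stays off at weight 0.15 and max_terms 3.
    print(load_entity_boost_settings({}))
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise without touching the process environment.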