Serendipitous knowledge retrieval via graph traversal

## Observation

Proposition retrieval in DICE is direct:

- **`MemoryRetriever.recall(query)`** — vector similarity to query
- **`MemoryRetriever.recallAbout(entityId)`** — propositions mentioning an entity
- **`MemoryRetriever.recallByType(type)`** — propositions by entity type
- **`PropositionRepository.findSimilar()`** — embedding-based similarity search

All of these find propositions *directly related* to the query. But some of the most valuable knowledge connections are indirect — a proposition about entity A might be critical context for entity B, connected via a shared relationship 2-3 hops away in the knowledge graph.

DICE's `text2graph` pipeline builds exactly the kind of relational structure that could support this. `GraphProjectionService` projects propositions into `ProjectedRelationship` edges. But retrieval only uses direct lookups — the graph structure goes unused for discovery.

## The idea

A retrieval mode inspired by spreading activation: starting from directly relevant propositions, follow `ProjectedRelationship` edges in the knowledge graph to discover indirectly related propositions that normal retrieval would miss.

```
Query → direct matches → follow graph edges (2-3 hops) → score by activation strength → return surprising-but-relevant propositions
```

### Parameters

- **Minimum hop distance** — don't return direct matches (those come from normal retrieval)
- **Maximum hop distance** — don't go too far (relevance drops sharply beyond 3-4 hops)
- **Activation decay per hop** — strength halves with each hop
- **Separate token budget** — serendipitous results are supplementary, first to drop under pressure
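
A minimal sketch of how these four parameters could interact, assuming the graph is available as a plain adjacency map keyed by proposition id. This is not DICE's API: `spread_activation`, `take_within_budget`, and the map shape are hypothetical names for illustration, and a real implementation would walk `ProjectedRelationship` edges instead.

```python
def spread_activation(graph, seeds, min_hops=2, max_hops=3, decay=0.5):
    """Level-by-level spreading activation (illustrative only).

    graph: dict of node id -> list of neighbour ids (hypothetical shape)
    seeds: dict of directly matched node id -> initial activation
    Returns (node, activation, hops) tuples, strongest first, skipping
    anything closer than min_hops to the seeds.
    """
    visited = set(seeds)
    frontier = dict(seeds)
    results = []
    for hop in range(1, max_hops + 1):
        next_frontier = {}
        for node, activation in frontier.items():
            for neighbour in graph.get(node, ()):
                if neighbour in visited:
                    continue  # already reached at a shorter hop distance
                # Keep the strongest activation arriving at this hop.
                candidate = activation * decay
                if candidate > next_frontier.get(neighbour, 0.0):
                    next_frontier[neighbour] = candidate
        visited.update(next_frontier)
        if hop >= min_hops:
            results.extend((n, a, hop) for n, a in next_frontier.items())
        frontier = next_frontier
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


def take_within_budget(ranked, token_cost, budget):
    """Keep top-ranked results until the separate token budget is spent."""
    chosen, used = [], 0
    for node, _activation, _hops in ranked:
        cost = token_cost(node)
        if used + cost > budget:
            break
        chosen.append(node)
        used += cost
    return chosen


# Toy chain p1 - p2 - p3 - p4 - p5, with p1 as the only direct match.
graph = {
    "p1": ["p2"],
    "p2": ["p1", "p3"],
    "p3": ["p2", "p4"],
    "p4": ["p3", "p5"],
    "p5": ["p4"],
}
ranked = spread_activation(graph, {"p1": 1.0})
selected = take_within_budget(ranked, token_cost=lambda _n: 10, budget=15)
```

In this toy run `p2` is skipped because it is only one hop away, `p5` is never reached because it lies beyond the maximum, and the budget cuts the list after the strongest remaining candidate.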

### Safety guard

Serendipitously retrieved propositions must not contradict actively injected ones. If conflict detection (#12) is available, run it on each candidate before including it. This prevents graph traversal from introducing contradictions.
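
As a rough sketch, the guard could be a filter applied after traversal. The `contradicts` predicate stands in for whatever #12 ends up providing; its signature, the `filter_contradictions` helper, and the tuple representation of a proposition are assumptions made for this example.

```python
def filter_contradictions(candidates, injected, contradicts=None):
    """Drop candidates that conflict with any actively injected proposition.

    contradicts: hypothetical predicate (candidate, injected) -> bool.
    When no detector is available the candidates pass through unchanged.
    """
    if contradicts is None:
        return list(candidates)
    return [
        c for c in candidates
        if not any(contradicts(c, active) for active in injected)
    ]


def toy_contradicts(a, b):
    # Stand-in detector: same subject and predicate, different value.
    return a[0] == b[0] and a[1] == b[1] and a[2] != b[2]


injected = [("alice", "lives_in", "Paris")]
candidates = [
    ("alice", "lives_in", "Berlin"),  # conflicts with the injected fact
    ("alice", "works_at", "Acme"),    # unrelated, so it is kept
]
safe = filter_contradictions(candidates, injected, toy_contradicts)
```

Running the filter before the token budget is applied means a rejected candidate does not take a slot that a safe one could have used.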

## Open questions

- **Is the graph dense enough?** If `text2graph` produces sparse relationships, there may not be enough edges for multi-hop traversal to find anything useful.
- **How do you evaluate "serendipitous but useful" vs. "random noise"?** Without a metric, it's hard to know if the feature is helping or just consuming tokens.
- **Should this be deterministic or probabilistic?** Probabilistic traversal (random walk with restart) keeps results varied across invocations. Deterministic traversal (for example, shortest path) is reproducible but less surprising.
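
To make the last question more concrete, here is a small random walk with restart over the same kind of hypothetical adjacency map. The function name, the `restart` probability, and the step count are illustrative choices, not values from DICE.

```python
import random
from collections import Counter


def random_walk_with_restart(graph, seeds, steps=1000, restart=0.3, rng=None):
    """Count node visits for a walk that keeps jumping back to the seeds.

    graph: dict of node id -> list of neighbour ids (hypothetical shape)
    seeds: directly matched node ids the walk restarts from
    rng: optional random.Random; pass a seeded one for repeatable runs
    """
    rng = rng or random.Random()
    seeds = list(seeds)
    visits = Counter()
    node = rng.choice(seeds)
    for _ in range(steps):
        neighbours = list(graph.get(node, ()))
        if not neighbours or rng.random() < restart:
            node = rng.choice(seeds)  # restart, also used at dead ends
        else:
            node = rng.choice(neighbours)
        visits[node] += 1
    return visits


graph = {
    "p1": ["p2"],
    "p2": ["p1", "p3"],
    "p3": ["p2", "p4"],
    "p4": ["p3"],
}
run_a = random_walk_with_restart(graph, ["p1"], rng=random.Random(42))
run_b = random_walk_with_restart(graph, ["p1"], rng=random.Random(42))
# Rank non-seed nodes by how often the walk landed on them.
ranking = [n for n, _count in run_a.most_common() if n != "p1"]
```

A fixed seed makes the probabilistic mode repeatable for tests and debugging, while an unseeded generator keeps results varied in production, so the two options are less exclusive than the question suggests.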

