Search relevance: add per-file diversity caps and source/test path weighting #36

@willwashburn

Problem

relaywash__Search currently ranks mostly by match count, mtime, or path depth. A single large file can dominate results, and test or fixture files can crowd out source files even when the task is implementation-focused. This increases follow-up calls and makes the agent inspect less useful snippets.

Goal

Improve default Search result quality with lightweight ranking rules that do not require a persistent index.

Scope

Add ranking and trimming behavior for:

  • Per-file result caps.
  • Test file penalties.
  • Fixture and generated-file penalties.
  • Source file boosts.
  • Spillover behavior so capped files can still fill empty slots if there are not enough diverse results.
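The cap and spillover items above could be sketched roughly as follows. This is a minimal illustration, not the existing implementation: the `Hit` struct and the `cap_with_spillover` function are hypothetical names, and the input is assumed to be already sorted best-first.

```rust
use std::collections::HashMap;

/// Hypothetical hit type; the real struct in the codebase may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct Hit {
    pub path: String,
    pub line: usize,
    pub score: f64,
}

/// First pass: keep at most `max_per_file` hits per file.
/// Second pass: if slots remain, fill them with the capped-out hits
/// in their original rank order (the "spillover").
/// `ranked` is assumed to be sorted best-first by the caller.
pub fn cap_with_spillover(ranked: Vec<Hit>, max_results: usize, max_per_file: usize) -> Vec<Hit> {
    let mut per_file: HashMap<String, usize> = HashMap::new();
    let mut kept = Vec::new();
    let mut overflow = Vec::new();

    for hit in ranked {
        let count = per_file.entry(hit.path.clone()).or_insert(0);
        if *count < max_per_file {
            *count += 1;
            kept.push(hit);
        } else {
            overflow.push(hit);
        }
    }

    // Diverse hits take priority; spillover only uses leftover slots.
    kept.truncate(max_results);
    let free = max_results - kept.len();
    kept.extend(overflow.into_iter().take(free));
    kept
}
```

In this sketch spillover hits are appended after the diverse hits rather than merged back by score. Whether they should be re-sorted is a design choice left open here.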

Proposed defaults

  • maxPerFile: default to maxResults / 3, minimum 3.
  • Penalize paths containing common fixture markers such as fixtures, testdata, __fixtures__.
  • Penalize tests by default when non-test source hits exist.
  • Do not hide tests completely, because test references are useful for behavior discovery.
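A rough sketch of these defaults, under several assumptions: integer division for `maxResults / 3`, matching fixture markers against whole path components rather than substrings, and a guessed set of test markers. The names `default_max_per_file`, `classify_path`, `path_weight`, and the weight values are hypothetical and only illustrate the shape of the rule.

```rust
/// Default cap: maxResults / 3 with a floor of 3 (integer division assumed).
pub fn default_max_per_file(max_results: usize) -> usize {
    (max_results / 3).max(3)
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PathKind {
    Source,
    Test,
    Fixture,
}

// Fixture markers named in the proposal.
const FIXTURE_MARKERS: [&str; 3] = ["fixtures", "testdata", "__fixtures__"];
// Assumed test directory names; the proposal does not list these.
const TEST_DIRS: [&str; 3] = ["tests", "test", "__tests__"];

/// Classify a path by looking at its components and file name.
pub fn classify_path(path: &str) -> PathKind {
    let parts: Vec<&str> = path.split(|c: char| c == '/' || c == '\\').collect();
    if parts.iter().any(|p| FIXTURE_MARKERS.contains(p)) {
        return PathKind::Fixture;
    }
    let file = parts.last().copied().unwrap_or("");
    // Assumed file-name conventions for tests.
    let is_test_file =
        file.starts_with("test_") || file.contains("_test.") || file.contains(".test.");
    if is_test_file || parts.iter().any(|p| TEST_DIRS.contains(p)) {
        return PathKind::Test;
    }
    PathKind::Source
}

/// Score multiplier. Tests are only penalized when non-test source hits
/// exist, and are never dropped. The numeric weights are placeholders.
pub fn path_weight(kind: PathKind, has_source_hits: bool) -> f64 {
    match kind {
        PathKind::Source => 1.0,
        PathKind::Test if has_source_hits => 0.5,
        PathKind::Test => 1.0,
        PathKind::Fixture => 0.25,
    }
}
```

Under these assumptions, `default_max_per_file(30)` gives 10 and `default_max_per_file(6)` gives 3, while `crates/wash/tests/search.rs` classifies as a test and `pkg/__fixtures__/big.json` as a fixture.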

API shape

Add optional Search args:

{
  "maxPerFile": 5,
  "includeTests": "auto"
}

includeTests can be auto, always, or never.
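One way the three `includeTests` values might be modeled on the Rust side is a small enum with strict parsing. The `IncludeTests` type and its error message are hypothetical, and the sketch uses only the standard library; the real tool may deserialize its args differently.

```rust
use std::str::FromStr;

/// Hypothetical typed form of the `includeTests` argument.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IncludeTests {
    Auto,
    Always,
    Never,
}

impl FromStr for IncludeTests {
    type Err = String;

    /// Accept exactly the three documented values; reject anything else
    /// so typos surface as errors instead of silently falling back.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "auto" => Ok(IncludeTests::Auto),
            "always" => Ok(IncludeTests::Always),
            "never" => Ok(IncludeTests::Never),
            other => Err(format!(
                "invalid includeTests value: {other:?} (expected auto, always, or never)"
            )),
        }
    }
}
```

A closed enum keeps the schema stable: adding a fourth mode later would be an explicit, reviewable change rather than a free-form string.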

Acceptance criteria

  • Default Search results include more than one file when matches exist across multiple files.
  • Test and fixture files rank below source files for default implementation searches.
  • Explicit path filters still work as expected.
  • Add unit tests with a large matching file plus several smaller relevant files.
  • Add observation fields so the learning layer can see when diversity caps were applied.

Implementation notes

Likely files:

  • crates/wash/src/tools/search.rs
  • crates/wash/src/search.rs
  • crates/wash/src/profile.rs
  • crates/wash/src/hooks/post_tool_observe.rs

Keep schema additions stable and avoid dynamic descriptions.
