fix(CoSTEER): rebind RAG trace cursor to evolving_trace identity (#1398) by voidborne-d · Pull Request #1399 · microsoft/RD-Agent

voidborne-d · 2026-04-28T06:54:30Z

Closes #1398.

Bug

`CoSTEERRAGStrategyV2.current_generated_trace_count` is an instance attribute on the strategy, but `generate_knowledge()` interprets it as an index into whichever `evolving_trace` list is passed in.

`CoSTEER.develop()` constructs a strategy once (`rdagent/components/coder/CoSTEER/__init__.py:53`) and reuses it across every `develop()` call. Each call builds a fresh `RAGEvoAgent` whose `evolving_trace` is a brand-new list. When that fresh trace happens to reach the same length the cursor was advanced to in a prior call, the early return at line 366 short-circuits:

```python
if len(evolving_trace) == self.current_generated_trace_count:
    return None
```

…dropping repair feedback that the loop body after the early return would have written into `success_task_to_knowledge_dict`. The next repair round then does not know which tasks already passed and may re-implement the whole group instead of repairing only the failed task, wasting LLM budget and risking regression of previously successful code (issue #1398).
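The failure mode can be reproduced in isolation. The sketch below is a minimal stand-in, not the real strategy: `StaleCursorStrategy`, its `ingested` list, and the string trace entries are hypothetical, and the loop that walks the trace from the cursor onward is an assumption about how the real loop body behaves.

```python
class StaleCursorStrategy:
    """Hypothetical stand-in for the pre-fix cursor behaviour; not the real class."""

    def __init__(self) -> None:
        self.current_generated_trace_count = 0
        self.ingested: list[str] = []  # stands in for the knowledge dict

    def generate_knowledge(self, evolving_trace: list[str]) -> None:
        # Pre-fix early return: compares the cursor with whichever list is passed in.
        if len(evolving_trace) == self.current_generated_trace_count:
            return None
        # Assumed loop shape: process only the steps past the cursor.
        for step in evolving_trace[self.current_generated_trace_count:]:
            self.ingested.append(step)
        self.current_generated_trace_count = len(evolving_trace)
        return None


strategy = StaleCursorStrategy()

first_trace = ["attempt-1 feedback"]
strategy.generate_knowledge(first_trace)   # cursor advances to 1

second_trace = ["repair feedback"]         # fresh list, as a later develop() call would build
strategy.generate_knowledge(second_trace)  # len == stale cursor, so nothing is ingested

print(strategy.ingested)  # ['attempt-1 feedback']
```

The second call returns early even though `second_trace` has never been seen, which is the dropped-feedback case described above.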

Fix

Bind the cursor to `id(evolving_trace)` and reset it when (a) the trace identity changes (new `develop()` call) or (b) the cursor is greater than the current trace length (defensive against truncation / fresh-trace reuse).

```diff
 class CoSTEERRAGStrategyV2(CoSTEERRAGStrategy):
     def __init__(self, settings: CoSTEERSettings, *args, **kwargs) -> None:
         super().__init__(*args, **kwargs)
         self.current_generated_trace_count = 0
+        self._generated_trace_identity: int | None = None
         self.settings = settings

     def generate_knowledge(...):
+        trace_identity = id(evolving_trace)
+        if (
+            getattr(self, "_generated_trace_identity", None) != trace_identity
+            or self.current_generated_trace_count > len(evolving_trace)
+        ):
+            self._generated_trace_identity = trace_identity
+            self.current_generated_trace_count = 0
         if len(evolving_trace) == self.current_generated_trace_count:
             return None
```

`getattr(... None)` handles instances rebuilt without going through `__init__` (resume / pickle paths).
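As an illustration of both reset conditions and the `getattr` fallback, here is a minimal sketch that mirrors the cursor logic from the diff on a hypothetical stand-in class. `IdentityBoundStrategy`, its `ingested` list, and the slice-based ingestion are assumptions made for the example; the `__new__` call only simulates a resume path that skips `__init__`.

```python
class IdentityBoundStrategy:
    """Hypothetical stand-in mirroring the cursor reset from the diff."""

    def __init__(self) -> None:
        self.current_generated_trace_count = 0
        self._generated_trace_identity: int | None = None
        self.ingested: list[str] = []  # stands in for the knowledge dict

    def generate_knowledge(self, evolving_trace: list[str]) -> None:
        trace_identity = id(evolving_trace)
        if (
            getattr(self, "_generated_trace_identity", None) != trace_identity
            or self.current_generated_trace_count > len(evolving_trace)
        ):
            # New trace object, or the cursor overshoots the trace: start over.
            self._generated_trace_identity = trace_identity
            self.current_generated_trace_count = 0
        if len(evolving_trace) == self.current_generated_trace_count:
            return None
        self.ingested.extend(evolving_trace[self.current_generated_trace_count:])
        self.current_generated_trace_count = len(evolving_trace)
        return None


strategy = IdentityBoundStrategy()
first_trace = ["attempt-1 feedback"]
second_trace = ["repair feedback"]  # distinct live object, so id() differs
strategy.generate_knowledge(first_trace)
strategy.generate_knowledge(second_trace)
print(strategy.ingested)  # ['attempt-1 feedback', 'repair feedback']

# Simulate an instance restored without __init__ and without the new attribute.
restored = IdentityBoundStrategy.__new__(IdentityBoundStrategy)
restored.current_generated_trace_count = 1
restored.ingested = []
restored.generate_knowledge(second_trace)  # getattr default avoids AttributeError
print(restored.ingested)  # ['repair feedback']
```

Both traces are kept alive for the whole example so that their `id()` values are guaranteed to differ.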

V1 is unchanged — it raises `NotImplementedError` immediately and is documented as deprecated.

Tests

`test/utils/coder/test_CoSTEER_RAG_cursor.py` adds four offline regressions:

- `test_fresh_trace_with_same_length_is_ingested`: exact reproducer of #1398 ("CoSTEER RAG trace cursor can skip fresh repair feedback"). Fails on `main` (early-return drops the new trace), passes here.
- `test_extending_same_trace_only_processes_new_steps`: guards against re-ingesting `trace[0]` when the trace is extended in place.
- `test_truncated_trace_resets_cursor`: covers the `cursor > len(trace)` reset path. Fails on `main`, passes here.
- `test_idempotent_call_on_same_trace_returns_none`: keeps the fast-path no-op intact.

Tests are marked `@pytest.mark.offline` and avoid LLM/knowledge-base setup by passing trace steps with empty `sub_tasks`/`sub_workspace_list` so the inner loop is a no-op; visit counters on the steps prove the outer loop ran.
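The visit-counter idea can be sketched as follows. This is not the contents of the new test file: `CountingStep` and `CursorUnderTest` are hypothetical stand-ins, counting a visit when `sub_tasks` is read is one assumed way to implement the counter, and the `@pytest.mark.offline` marker is left out so the block runs without pytest.

```python
class CountingStep:
    """Hypothetical trace step with no sub-tasks that counts outer-loop visits."""

    def __init__(self) -> None:
        self.visits = 0

    @property
    def sub_tasks(self) -> list:
        self.visits += 1  # reading sub_tasks marks one outer-loop visit
        return []         # empty, so the inner loop body never runs


class CursorUnderTest:
    """Hypothetical stand-in for the strategy; only the cursor logic is modelled."""

    def __init__(self) -> None:
        self.cursor = 0
        self.trace_identity = None

    def generate_knowledge(self, trace: list) -> None:
        if self.trace_identity != id(trace) or self.cursor > len(trace):
            self.trace_identity = id(trace)
            self.cursor = 0
        if len(trace) == self.cursor:
            return None
        for step in trace[self.cursor:]:
            for _task in step.sub_tasks:  # no-op: sub_tasks is empty
                pass
        self.cursor = len(trace)
        return None


def test_extending_same_trace_only_processes_new_steps() -> None:
    strategy = CursorUnderTest()
    trace = [CountingStep()]
    strategy.generate_knowledge(trace)
    trace.append(CountingStep())  # extended in place, same list object
    strategy.generate_knowledge(trace)
    assert [step.visits for step in trace] == [1, 1]


def test_truncated_trace_resets_cursor() -> None:
    strategy = CursorUnderTest()
    trace = [CountingStep(), CountingStep()]
    strategy.generate_knowledge(trace)
    del trace[1:]  # truncated in place: cursor (2) > len (1)
    strategy.generate_knowledge(trace)
    assert trace[0].visits == 2  # step 0 was re-visited after the reset
```

Because the steps carry no sub-tasks, nothing touches an LLM or knowledge base, and the counters alone show which steps the outer loop reached.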

Local gates

```text
$ pytest test/utils/coder/test_CoSTEER_RAG_cursor.py -v
4 passed, 2 warnings in 1.86s

$ pytest (without the fix on main): 2 failed, 2 passed (confirms the regression test bites)

$ ruff check --no-fix rdagent/components/coder/CoSTEER/knowledge_management.py
no new errors introduced (110 pre-existing → 110)

$ black --check --target-version py311 -l 120 test/utils/coder/test_CoSTEER_RAG_cursor.py
All done! ✨ 🍰 ✨

$ isort --check-only test/utils/coder/test_CoSTEER_RAG_cursor.py rdagent/components/coder/CoSTEER/knowledge_management.py
clean
```

Diff is +136 / -0 (13 in source, 123 in the new test file).

🤖 Worked on by an AI agent. The cursor logic is small but adversarial-looking; happy to fold in any review notes.

📚 Documentation preview 📚: https://RDAgent--1399.org.readthedocs.build/en/1399/

`CoSTEERRAGStrategyV2.current_generated_trace_count` is an instance-level cursor, but `generate_knowledge()` interprets it as an index into whichever `evolving_trace` list is passed in. `CoSTEER.develop()` reuses a single strategy instance across calls (`rdagent/components/coder/CoSTEER/__init__.py` line 53), so a fresh trace whose length happens to match the stale cursor was silently short-circuited at the `len() == cursor` early return, dropping repair feedback that should have been ingested.

Bind the cursor to `id(evolving_trace)` and reset it when (a) the trace identity changes, or (b) the cursor is greater than the current trace length (defensive against truncation/resume edge cases).

Adds `test/utils/coder/test_CoSTEER_RAG_cursor.py` with four offline regressions covering: fresh-trace-of-stale-length re-ingest, extension only processing new steps, truncation reset, and idempotent same-trace recall.

Closes microsoft#1398

voidborne-d wants to merge 1 commit into microsoft:main from voidborne-d:fix/1398-costeer-rag-trace-cursor
