rewind_session_items can delete unrelated tail session items during retry cleanup #3089

@Aphroq

Please read this first

  • Have you read the docs? Agents SDK docs
  • Have you searched for related issues? Others may have faced similar issues.

Describe the bug

rewind_session_items() in src/agents/run_internal/session_persistence.py can delete session records that do not belong to the retry it is trying to rewind.

There are two related problems in the current implementation:

  1. It calls pop_item() before confirming that the session tail actually matches the target suffix being rewound.
  2. In the server_tracker cleanup path, it keeps popping items until it has popped a known server item, so it removes every unrelated item above that known server item and then the known server item itself.

That means a session tail like:

[known_server_item, unrelated_new_item, target_item]

can become:

[]

after rewinding target_item, even though unrelated_new_item and known_server_item did not belong to the retry being rolled back.

This is especially dangerous for shared or externally-updated sessions, because a retry path can silently erase history added by another run or another actor.
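The second problem can be illustrated with a simplified model of a pop-before-check loop. This is a sketch only: `pop_until_known_server_item` is a hypothetical function operating on a plain list, not the code in session_persistence.py, and it assumes known server items are identified by their "id" field.

```python
from typing import Any


def pop_until_known_server_item(
    tail: list[dict[str, Any]], known_ids: set[str]
) -> list[dict[str, Any]]:
    """Hypothetical model of a pop-before-check cleanup loop (not SDK code)."""
    items = list(tail)
    while items:
        # The item is removed before it is inspected...
        item = items.pop()
        # ...so by the time a known server item is recognized, it is already gone.
        if item.get("id") in known_ids:
            break
    return items


if __name__ == "__main__":
    known_server_item = {"id": "msg_server_1", "type": "message", "role": "assistant"}
    unrelated_new_item = {"type": "message", "role": "user", "content": "unrelated"}
    # Tail as it looks after target_item has already been popped.
    print(pop_until_known_server_item([known_server_item, unrelated_new_item], {"msg_server_1"}))
```

In this model, the unrelated item is dropped on the way down and the known server item is dropped when the loop terminates, which matches the empty result described above.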

Debug information

  • Agents SDK version: main at f2fb9ffb / latest release boundary v0.15.1
  • Python version: Python 3.12

Repro steps

Minimal reproducer:

import asyncio
from typing import Any

from agents.items import TResponseInputItem
from agents.run_internal.oai_conversation import OpenAIServerConversationTracker
from agents.run_internal.session_persistence import rewind_session_items


class InMemorySession:
    def __init__(self, history: list[dict[str, Any]]) -> None:
        self._items = list(history)

    async def get_items(self, limit: int | None = None) -> list[TResponseInputItem]:
        if limit is None:
            return list(self._items)
        # Guard limit <= 0: a bare [-0:] slice would return the whole list.
        return self._items[-limit:] if limit > 0 else []

    async def add_items(self, items: list[TResponseInputItem]) -> None:
        self._items.extend(items)

    async def pop_item(self) -> TResponseInputItem | None:
        if not self._items:
            return None
        return self._items.pop()


async def main() -> None:
    known_server_item = {
        "id": "msg_server_1",
        "type": "message",
        "role": "assistant",
        "content": "server item",
    }
    unrelated_new_item = {
        "type": "message",
        "role": "user",
        "content": "unrelated tail item",
    }
    target_item = {
        "type": "message",
        "role": "user",
        "content": "retry target",
    }

    session = InMemorySession([known_server_item, unrelated_new_item, target_item])
    tracker = OpenAIServerConversationTracker()
    tracker.server_item_ids.add("msg_server_1")

    await rewind_session_items(session, [target_item], tracker)
    print(await session.get_items())


asyncio.run(main())

Current behavior:

[]

The target_item is removed, but the helper also strips unrelated_new_item and the sentinel known_server_item.

Expected behavior

rewind_session_items() should only remove the exact persisted suffix that belongs to the retry being rolled back.

If the session tail does not exactly match the target suffix, it should abort the rewind and log a warning instead of deleting unrelated records. In the server_tracker cleanup path, it should never remove an unrelated tail item or the known server item that terminates cleanup.
