Skip to content

feat(offloader): return explicit paths in preview and auto-enable retrieval#2222

Merged
lizradway merged 4 commits intostrands-agents:mainfrom
lizradway:offload-data
Apr 29, 2026
Merged

feat(offloader): return explicit paths in preview and auto-enable retrieval#2222
lizradway merged 4 commits intostrands-agents:mainfrom
lizradway:offload-data

Conversation

@lizradway
Copy link
Copy Markdown
Member

@lizradway lizradway commented Apr 28, 2026

Description

FileStorage.store() and S3Storage.store() return internal keys (e.g., 1234567890_1_tooluse_abc.txt or artifacts/1234567890_1_tooluse_abc) that appear in the agent's offloaded preview. While the agent could infer usable paths from these, doing so required the LLM to reason about the storage backend and construct the correct path — this PR makes the references explicit.

Storage changes

  • FileStoragestore() now returns the full file path (e.g., ./artifacts/1234_1_key.txt), preserving whatever relative/absolute form the user configured for artifact_dir. retrieve() accepts both full paths and bare filenames for backward compatibility. Metadata lookup uses the filename portion of the resolved path.
Example preview (absolute dir)
[Offloaded: 1 blocks, ~563 tokens]
Tool result was offloaded to external storage due to size.
Use the preview below to answer if possible.
Use your available tools to selectively access the data you need.
Only use retrieve_offloaded_content as a fallback if the data cannot be accessed using your existing tools.

The quick brown fox jumps over the lazy dog.

[Stored references:]
  /home/user/project/artifacts/1714300000000_1_tooluse_abc123_0.txt (text, 2,250 chars)
Example preview (relative dir)
[Offloaded: 1 blocks, ~563 tokens]
Tool result was offloaded to external storage due to size.
Use the preview below to answer if possible.
Use your available tools to selectively access the data you need.
Only use retrieve_offloaded_content as a fallback if the data cannot be accessed using your existing tools.

The quick brown fox jumps over the lazy dog.

[Stored references:]
  artifacts/1714300000000_1_tooluse_abc123_0.txt (text, 2,250 chars)
  • S3Storagestore() now returns an s3:// URI (e.g., s3://my-bucket/prefix/1234_1_key). retrieve() strips the s3://bucket/ prefix when present, also accepts raw keys. Rejects URIs for the wrong bucket early with KeyError.
Example preview
[Offloaded: 1 blocks, ~563 tokens]
Tool result was offloaded to external storage due to size.
Use the preview below to answer if possible.
Use your available tools to selectively access the data you need.
Only use retrieve_offloaded_content as a fallback if the data cannot be accessed using your existing tools.

The quick brown fox jumps over the lazy dog.

[Stored references:]
  s3://my-bucket/tool-results/1714300000000_1_tooluse_abc123_0 (text, 2,250 chars)
  • InMemoryStorage — reference format unchanged. No external access path exists.
Example preview
[Offloaded: 1 blocks, ~125 tokens]
Tool result was offloaded to external storage due to size.
Use the preview below to answer if possible.
Use your available tools to selectively access the data you need.
Only use retrieve_offloaded_content as a fallback if the data cannot be accessed using your existing tools.

xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

[Stored references:]
  mem_1_tooluse_abc123_0 (text, 500 chars)

Retrieval tool default

include_retrieval_tool now defaults to True (previously False). The inline guidance steers agents to use existing tools first (e.g., file read, bash, AWS CLI) and only fall back to retrieve_offloaded_content when direct access is not possible. This ensures InMemoryStorage users can always access offloaded content without needing to explicitly opt in.

Related Issues

#1296

Documentation PR

N/A

Type of Change

New feature

Testing

How have you tested the change?

  • 76 tests passing across test_storage.py and test_plugin.py

  • New storage tests: reference format, relative/absolute artifact dir, bare filename retrieval, raw S3 key retrieval, wrong bucket rejection, path traversal on filename portion

  • New plugin integration tests (TestActionableReferences): FileStorage path in preview text, FileStorage path in image placeholder, InMemoryStorage opaque reference in preview

  • New default-behavior tests: retrieval tool registered by default, not registered when explicitly disabled

  • End-to-end test script against real S3 (tool-externalization-demo bucket)

  • I ran hatch run prepare

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • I have updated the documentation accordingly
  • I have added an appropriate example to the documentation to outline the feature, or no new docs are needed
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Comment thread tests/strands/vended_plugins/context_offloader/test_storage.py Outdated
@github-actions

This comment was marked as outdated.

Comment thread tests/strands/vended_plugins/context_offloader/test_storage.py Outdated
@github-actions
Copy link
Copy Markdown

Assessment: Request Changes

The approach of making storage references actionable (file paths, S3 URIs) is solid and the backward compatibility design in retrieve() is well thought out. However, there's a round-trip bug that needs fixing before merge.

Review Categories
  • Critical Bug: FileStorage.retrieve() is broken for relative artifact_dir paths — store() returns "artifacts/file.txt" but retrieve() resolves it to "./artifacts/artifacts/file.txt". All existing round-trip tests pass only because they use absolute tmp_path. A round-trip test with a relative dir is needed and will expose this.
  • API Surface: The return value of store() is part of the public Storage protocol and has changed semantically. The PR description should acknowledge this rather than saying "N/A" for API changes. The protocol docstring should also be updated to reflect that implementations may return actionable references.

Solid overall design — once the relative-path bug is fixed and tested, this is ready to go.

Comment thread src/strands/vended_plugins/context_offloader/storage.py
@github-actions
Copy link
Copy Markdown

Assessment: Comment

The critical relative-path bug from round 1 is fixed and properly tested — nice work. Two remaining suggestions, neither blocking.

Review Categories
  • Readability: The retrieve() path resolution logic (line 181–185) is correct for all tested cases but the dense ternary plus doubled is_relative_to checks could be clearer with straightforward branching and a cached artifact_dir.resolve().
  • Documentation: The Storage protocol docstring for store() still describes the return as a generic "reference string" — updating it to acknowledge that implementations may return actionable references (paths, URIs) would align the protocol docs with the new behavior. (Carried from round 1.)

The core logic and backward compatibility design are solid.

@lizradway lizradway marked this pull request as ready for review April 28, 2026 21:08
@lizradway lizradway temporarily deployed to manual-approval April 28, 2026 21:08 — with GitHub Actions Inactive
@github-actions
Copy link
Copy Markdown

Assessment: Approve

All feedback from rounds 1 and 2 has been addressed — the critical relative-path bug is fixed, round-trip tests are in place, and the PR description now includes clear example previews for each storage type. No new issues found.

Remaining non-blocking suggestions from prior rounds
  • Readability: The retrieve() path resolution (lines 180–185) could be slightly more readable with explicit branching and a cached artifact_dir.resolve() local — but the current logic is correct for all tested edge cases (absolute paths, relative paths, bare filenames, path traversal).
  • Protocol docstring: The Storage.store() return value docstring still says "A reference string" — a small update acknowledging that implementations may return actionable references (paths, URIs) would align the protocol docs with the new behavior.

Nice work on the backward compatibility design — accepting both old and new reference formats in retrieve() is clean.

@lizradway lizradway changed the title feat(offloader): return explicit paths in preview and auto-enable retrieval in-memory feat(offloader): return explicit paths in preview and auto-enable retrieval Apr 29, 2026
@github-actions
Copy link
Copy Markdown

Assessment: Approve

Clean iteration. The conditional isinstance logic is gone in favor of a simple bool = True default — simpler API, consistent across all storage backends, and aligned with the team consensus in the discussion threads. The updated guidance text ("Only use retrieve_offloaded_content as a fallback…") properly steers agent behavior without removing access.

No new issues found. All prior feedback from rounds 1–4 is addressed.

Comment thread src/strands/vended_plugins/context_offloader/plugin.py Outdated
@github-actions
Copy link
Copy Markdown

Assessment: Approve

The tool spec docstring update (line 158–159) is a good addition — it ensures the LLM sees the "fallback only" guidance both in the offloaded preview text and in the tool description itself. No new issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants