feat: dataset management — full feature with tests #1347
Open
jariy17 wants to merge 3 commits into
Implements complete dataset management lifecycle for AgentCore:

Schema & Primitives:
- DatasetPrimitive (add/remove/preview) with PREDEFINED_V1 and SIMULATED_V1 schema types
- Dataset schema validation (name, schemaType, config.managed.location, kmsKeyArn)
- CDK AgentCoreDataset construct (AWS::BedrockAgentCore::Dataset)

CLI Commands:
- `agentcore add dataset` / `agentcore remove dataset`
- `agentcore dataset download`: pull from service to local file
- `agentcore dataset publish-version`: publish DRAFT as immutable version
- `agentcore dataset remove-version`: delete published version

Operations:
- Incremental push engine (diff by exampleId, batched API calls with retry)
- Post-deploy sync (hash-based change detection, automatic on deploy)
- Dataset-driven evaluation (load scenarios, invoke agent, collect spans)
- resolve-dataset, wait, status, pull utilities

TUI:
- Add dataset wizard (name → schema type → description → confirm)
- Dataset hub (download/publish/remove-version management)
- Dataset picker in batch eval flow with version selection

Bug Fixes:
- Deploy preflight now recognizes datasets as deployable resources
- Removed stale x-amz-expected-bucket-owner header workaround

Tests:
- 68 unit tests (DatasetPrimitive, AWS client, push engine, pull, publish, resolve, wait, post-deploy sync, session provider, preflight)
- 17 integration tests (undeployed commands, validation, scaffolding)
- 19 E2E tests (lifecycle, large batch 1000 examples, eval integration)
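For readers unfamiliar with the "diff by exampleId, batched API calls" idea, here is a minimal sketch of how such a step could look. It is not the PR's implementation: the `Example` shape, `diffByExampleId`, and `toBatches` are hypothetical names chosen for illustration, and the retry logic mentioned above is left out.

```typescript
// Hypothetical example shape; the real dataset example type is not shown in this PR page.
interface Example {
  exampleId: string;
  payload: string;
}

interface Diff {
  toUpsert: Example[]; // new or changed locally
  toDelete: string[]; // exampleIds present remotely but not locally
}

// Compare local examples against remote ones, keyed by exampleId.
function diffByExampleId(local: Example[], remote: Example[]): Diff {
  const remoteById = new Map<string, Example>();
  for (const r of remote) remoteById.set(r.exampleId, r);
  const localIds = new Set(local.map((e) => e.exampleId));

  const toUpsert = local.filter((e) => {
    const existing = remoteById.get(e.exampleId);
    return existing === undefined || existing.payload !== e.payload;
  });
  const toDelete = remote
    .filter((r) => !localIds.has(r.exampleId))
    .map((r) => r.exampleId);
  return { toUpsert, toDelete };
}

// Split items into fixed-size batches so each batch can go out as one API call.
function toBatches<T>(items: T[], size: number): T[][] {
  if (size <= 0) throw new Error("batch size must be positive");
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}
```

Only the examples in `toUpsert` and `toDelete` would need to be sent, which is what makes the push incremental; unchanged examples never leave the machine.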
Contributor
Claude Security Review: no high-confidence findings. (run)
- Replace non-existent `cliCommandRun` import with `runCliCommand`
- Replace non-existent `RemovalResult` type with `Result` from lib
- Fix error type mismatches (string vs Error object) across dataset primitives, TUI flows, and batch evaluation
- Add `result.targetName` guard in status command to prevent undefined index
- Remove unused `DEFAULT_ENDPOINT_NAME` import in get-trace.ts
- Fix prettier formatting in invoke/action.ts and eval/run-eval.ts
- Update asset snapshots and test assertions to match Result<T,E> type
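To illustrate the "string vs Error object" mismatch, the sketch below shows one common way a `Result<T, E>` type can be modeled so that the error side is always an `Error`. The field names (`success`, `value`, `error`) and the helpers `toError` and `parseDatasetName` are assumptions for illustration; the `Result` type exported by the CLI's lib may be shaped differently.

```typescript
// Hypothetical discriminated union; not necessarily the shape used in the CLI's lib.
type Result<T, E = Error> =
  | { success: true; value: T }
  | { success: false; error: E };

// Normalize anything thrown or returned as an error into an Error object,
// so callers never have to handle a bare string.
function toError(e: unknown): Error {
  return e instanceof Error ? e : new Error(String(e));
}

// Illustrative validator that reports failure through the Result type.
function parseDatasetName(raw: string): Result<string> {
  const name = raw.trim();
  if (name.length === 0) {
    return { success: false, error: new Error("dataset name must not be empty") };
  }
  return { success: true, value: name };
}
```

With a union like this, a caller that checks `success` first gets compile-time narrowing, so reading `error.message` on a string by mistake becomes a type error rather than a runtime surprise.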
Contributor
Package Tarball
How to install
gh release download pr-1347-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.14.2.tgz
Contributor
Claude Security Review: the review run failed before completing. See the run for details.
Contributor
Coverage Report
tejaskash
reviewed
May 21, 2026
Contributor
tejaskash
left a comment
Nice work overall — clean layering, good test coverage. A few inline notes; only the post-deploy hash one feels like it should block.
- Recompute content hash after pushDataset to avoid stale hash causing redundant network calls on every deploy
- Replace Promise.all with Promise.allSettled in span-collector so one failed session doesn't abort the entire eval run
- Move getDataset to static import in publish.ts (no cycle to dodge)
- Add AbortSignal.timeout(30s) to fetch in agentcore-datasets.ts so hung connections don't block indefinitely
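The `Promise.allSettled` change in the second item can be sketched as follows. This is an illustration under assumptions, not the code in span-collector.ts: `SpanResult`, `CollectionSummary`, and `collectAllSessions` are hypothetical names, and the per-session collector is passed in as a parameter.

```typescript
// Hypothetical result shapes for illustration only.
interface SpanResult {
  sessionId: string;
  spans: string[];
}

interface CollectionSummary {
  collected: SpanResult[];
  failures: string[];
}

// Collect spans for every session. Unlike Promise.all, Promise.allSettled
// waits for all promises and never rejects, so one failing session is
// recorded as a failure instead of aborting the whole run.
async function collectAllSessions(
  sessionIds: string[],
  collect: (id: string) => Promise<SpanResult>,
): Promise<CollectionSummary> {
  const settled = await Promise.allSettled(sessionIds.map((id) => collect(id)));
  const summary: CollectionSummary = { collected: [], failures: [] };
  settled.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      summary.collected.push(outcome.value);
    } else {
      summary.failures.push(`${sessionIds[i]}: ${String(outcome.reason)}`);
    }
  });
  return summary;
}
```

Keeping the failures alongside the successes lets the caller decide whether a partial result is acceptable, rather than losing every session because one of them timed out.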
Contributor
Claude Security Review: the review run failed before completing. See the run for details.
tejaskash
approved these changes
May 21, 2026
Contributor
tejaskash
left a comment
Verified the four blocking/sub-blocking comments are addressed:
- post-deploy-datasets.ts: now re-reads the file after `pushDataset` and hashes the post-push content ✓
- span-collector.ts: switched to `Promise.allSettled` so one failure doesn't abort the rest ✓
- publish.ts: dynamic `getDataset` import promoted to top-level ✓
- agentcore-datasets.ts: `fetch()` now uses `AbortSignal.timeout(30_000)` ✓
LGTM.
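For context on the post-deploy hash item, the sketch below shows why hashing the post-push content matters. It assumes, as the review comment implies, that the push step can rewrite the local file; `SyncState`, `contentHash`, and `syncDataset` are hypothetical names, and file access is abstracted behind callbacks rather than taken from post-deploy-datasets.ts.

```typescript
import { createHash } from "node:crypto";

// Hypothetical persisted state holding the hash of the last synced content.
interface SyncState {
  lastSyncedHash?: string;
}

function contentHash(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

// Push only when the content changed since the last sync. Returns true if a push happened.
// Assumption: push() may modify the underlying file, so the stored hash must be
// computed from a fresh read taken after the push, not from the pre-push content.
function syncDataset(
  state: SyncState,
  readContent: () => string,
  push: (content: string) => void,
): boolean {
  const before = readContent();
  if (state.lastSyncedHash === contentHash(before)) return false;
  push(before);
  state.lastSyncedHash = contentHash(readContent());
  return true;
}
```

If the pre-push hash were stored instead, the next deploy would see a file whose hash no longer matches and push again, which is the redundant network call the fix avoids.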
Summary
Implements complete dataset management lifecycle for AgentCore CLI:
`add dataset`, `remove dataset`, `dataset download`, `dataset publish-version`, `dataset remove-version`
Test plan