
feat: dataset management — full feature with tests#1347

Open
jariy17 wants to merge 3 commits into main from feature/dataset-management

Conversation


@jariy17 jariy17 commented May 21, 2026

Summary

Implements the complete dataset management lifecycle for the AgentCore CLI:

  • Schema & Primitives: DatasetPrimitive (add/remove/preview) with PREDEFINED_V1 and SIMULATED_V1 schema types, name validation, kmsKeyArn support
  • CLI Commands: add dataset, remove dataset, dataset download, dataset publish-version, dataset remove-version
  • Operations: Incremental push engine (diff by exampleId, batched API calls with retry), post-deploy sync (hash-based change detection), dataset-driven evaluation (load scenarios, invoke agent, collect spans)
  • TUI: Add dataset wizard, dataset hub (download/publish/remove-version), dataset picker in batch eval flow
  • Bug Fixes: Deploy preflight now recognizes datasets as deployable resources; removed stale x-amz-expected-bucket-owner header workaround

Test plan

  • 68 unit tests (DatasetPrimitive, AWS client, push engine, pull, publish, resolve, wait, post-deploy sync, session provider, preflight)
  • 17 integration tests (undeployed commands, validation edge cases, scaffolding verification, file preservation)
  • 19 E2E tests (full lifecycle, 1000-example large batch, dataset-driven eval with Builtin.Faithfulness)
  • All E2E tests verified in ap-southeast-2 against account 619071331382

Implements the complete dataset management lifecycle for AgentCore:

Schema & Primitives:
- DatasetPrimitive (add/remove/preview) with PREDEFINED_V1 and SIMULATED_V1 schema types
- Dataset schema validation (name, schemaType, config.managed.location, kmsKeyArn)
- CDK AgentCoreDataset construct (AWS::BedrockAgentCore::Dataset)

CLI Commands:
- `agentcore add dataset` / `agentcore remove dataset`
- `agentcore dataset download` — pull from service to local file
- `agentcore dataset publish-version` — publish DRAFT as immutable version
- `agentcore dataset remove-version` — delete published version

Operations:
- Incremental push engine (diff by exampleId, batched API calls with retry)
- Post-deploy sync (hash-based change detection, automatic on deploy)
- Dataset-driven evaluation (load scenarios, invoke agent, collect spans)
- resolve-dataset, wait, status, pull utilities
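
The incremental push described above (diff by exampleId, batched API calls with retry) could look roughly like the following. This is a minimal sketch, not the code in this PR: the `Example` shape, `planPush`, `chunk`, `withRetry`, `pushUpserts`, the `sendBatch` callback, the batch size, and the backoff values are all hypothetical names and numbers chosen for illustration.

```typescript
// Hypothetical example shape; the real dataset schema is not shown in this PR page.
interface Example {
  exampleId: string;
  [field: string]: unknown;
}

interface PushPlan {
  toUpsert: Example[]; // examples that are new or changed locally
  toDelete: string[]; // exampleIds that exist remotely but not locally
}

// Compare local and remote examples keyed by exampleId.
// JSON.stringify equality assumes stable key order; a real engine might hash canonical JSON.
function planPush(local: Example[], remote: Example[]): PushPlan {
  const remoteById = new Map<string, string>();
  for (const e of remote) remoteById.set(e.exampleId, JSON.stringify(e));
  const localIds = new Set(local.map((e) => e.exampleId));
  return {
    toUpsert: local.filter((e) => remoteById.get(e.exampleId) !== JSON.stringify(e)),
    toDelete: remote.filter((e) => !localIds.has(e.exampleId)).map((e) => e.exampleId),
  };
}

// Split a list into batches of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) batches.push(items.slice(i, i + size));
  return batches;
}

// Retry an async call with exponential backoff; rethrows the last error when attempts run out.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3, baseDelayMs = 200): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts - 1) {
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}

// `sendBatch` stands in for the real batched API call (an assumption, not the PR's client).
async function pushUpserts(
  plan: PushPlan,
  sendBatch: (batch: Example[]) => Promise<void>,
  batchSize = 100,
): Promise<number> {
  let sent = 0;
  for (const batch of chunk(plan.toUpsert, batchSize)) {
    await withRetry(() => sendBatch(batch));
    sent += batch.length;
  }
  return sent;
}
```

Only changed or new examples are sent, so an unchanged dataset produces an empty plan and no API calls.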

TUI:
- Add dataset wizard (name → schema type → description → confirm)
- Dataset hub (download/publish/remove-version management)
- Dataset picker in batch eval flow with version selection

Bug Fixes:
- Deploy preflight now recognizes datasets as deployable resources
- Removed stale x-amz-expected-bucket-owner header workaround

Tests:
- 68 unit tests (DatasetPrimitive, AWS client, push engine, pull, publish,
  resolve, wait, post-deploy sync, session provider, preflight)
- 17 integration tests (undeployed commands, validation, scaffolding)
- 19 E2E tests (lifecycle, large batch 1000 examples, eval integration)
@jariy17 jariy17 requested a review from a team May 21, 2026 16:43
@github-actions github-actions Bot added the size/xl PR size: XL label May 21, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026
@github-actions github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label May 21, 2026
Comment thread src/cli/commands/dataset/command.tsx Fixed
Comment thread src/cli/commands/dataset/command.tsx Fixed
Comment thread src/cli/commands/dataset/command.tsx Fixed
Comment thread src/cli/primitives/DatasetPrimitive.ts Fixed
@agentcore-devx-automation

Claude Security Review: no high-confidence findings.

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026
@github-actions github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label May 21, 2026
- Replace non-existent `cliCommandRun` import with `runCliCommand`
- Replace non-existent `RemovalResult` type with `Result` from lib
- Fix error type mismatches (string vs Error object) across dataset
  primitives, TUI flows, and batch evaluation
- Add `result.targetName` guard in status command to prevent undefined index
- Remove unused `DEFAULT_ENDPOINT_NAME` import in get-trace.ts
- Fix prettier formatting in invoke/action.ts and eval/run-eval.ts
- Update asset snapshots and test assertions to match Result<T,E> type
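
To illustrate the string-versus-Error mismatch this commit fixes, here is a minimal sketch of a `Result<T, E>` pattern in which failures always carry an `Error`. The definition of `Result` below, and the helpers `toError` and `attempt`, are assumptions made for illustration; the actual `Result` type exported from the repository's lib is not shown in this thread and may differ.

```typescript
// Hypothetical definition; the real Result type in the repo's lib may have a different shape.
type Result<T, E = Error> =
  | { success: true; value: T }
  | { success: false; error: E };

// Normalize anything that was thrown into an Error so callers never receive a bare string.
function toError(thrown: unknown): Error {
  return thrown instanceof Error ? thrown : new Error(String(thrown));
}

// Run a function and capture its outcome as a Result instead of letting it throw.
function attempt<T>(fn: () => T): Result<T> {
  try {
    return { success: true, value: fn() };
  } catch (thrown) {
    return { success: false, error: toError(thrown) };
  }
}
```

With a single error type, callers can read `result.error.message` without first checking whether they were handed a string or an object.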
@github-actions github-actions Bot removed the size/xl PR size: XL label May 21, 2026
@github-actions github-actions Bot added the size/xl PR size: XL label May 21, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026
@github-actions

Package Tarball

aws-agentcore-0.14.2.tgz

How to install

gh release download pr-1347-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.14.2.tgz

@agentcore-devx-automation

Claude Security Review: the review run failed before completing. See the run for details.

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026

github-actions Bot commented May 21, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
|---|---|---|---|
| 🔵 | Lines | 44.57% | 9918 / 22249 |
| 🔵 | Statements | 43.82% | 10535 / 24040 |
| 🔵 | Functions | 41.17% | 1724 / 4187 |
| 🔵 | Branches | 40.89% | 6346 / 15516 |
Generated in workflow #3229 for commit a24b351 by the Vitest Coverage Report Action


@tejaskash tejaskash left a comment


Nice work overall — clean layering, good test coverage. A few inline notes; only the post-deploy hash one feels like it should block.

Comment thread src/cli/operations/deploy/post-deploy-datasets.ts Outdated
Comment thread src/cli/operations/eval/shared/span-collector.ts Outdated
Comment thread src/cli/operations/dataset/publish.ts Outdated
Comment thread src/cli/aws/agentcore-datasets.ts
Comment thread src/cli/aws/agentcore-datasets.ts
Comment thread src/cli/primitives/DatasetPrimitive.ts
- Recompute content hash after pushDataset to avoid stale hash causing
  redundant network calls on every deploy
- Replace Promise.all with Promise.allSettled in span-collector so one
  failed session doesn't abort the entire eval run
- Move getDataset to static import in publish.ts (no cycle to dodge)
- Add AbortSignal.timeout(30_000), a 30-second limit, to fetch in
  agentcore-datasets.ts so hung connections don't block indefinitely
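
The first fix in this commit can be sketched as follows. The sketch assumes that pushing may rewrite the local dataset file (for example, to record identifiers assigned by the service); that reason is a guess, since the thread only says the stored hash went stale. `SyncDeps`, `syncIfChanged`, and the injected callbacks are hypothetical names, and SHA-256 is an assumed choice of hash.

```typescript
import { createHash } from "node:crypto";

const sha256 = (content: string): string => createHash("sha256").update(content).digest("hex");

// Hypothetical dependencies, injected so the sketch stays independent of the real modules.
interface SyncDeps {
  readFile: () => string; // reads the local dataset file
  push: () => Promise<void>; // stands in for pushDataset; may rewrite the file
  loadStoredHash: () => string | undefined; // hash recorded by the previous deploy
  saveStoredHash: (hash: string) => void;
}

async function syncIfChanged(deps: SyncDeps): Promise<"skipped" | "pushed"> {
  const currentHash = sha256(deps.readFile());
  if (currentHash === deps.loadStoredHash()) return "skipped";

  await deps.push();

  // Hash the post-push content, not the pre-push content. Storing the pre-push hash
  // would never match the rewritten file, so every later deploy would push again.
  deps.saveStoredHash(sha256(deps.readFile()));
  return "pushed";
}
```

Under these assumptions, a second deploy with no local edits compares equal hashes and makes no network calls.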
@github-actions github-actions Bot added size/xl PR size: XL and removed size/xl PR size: XL labels May 21, 2026
@agentcore-devx-automation agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026
@agentcore-devx-automation

Claude Security Review: the review run failed before completing. See the run for details.

@agentcore-devx-automation agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label May 21, 2026

@tejaskash tejaskash left a comment


Verified the four blocking/sub-blocking comments are addressed:

  • post-deploy-datasets.ts: now re-reads the file after pushDataset and hashes the post-push content ✓
  • span-collector.ts: switched to Promise.allSettled so one failure doesn't abort the rest ✓
  • publish.ts: dynamic getDataset import promoted to top-level ✓
  • agentcore-datasets.ts: fetch() now uses AbortSignal.timeout(30_000)
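
Two of the fixes verified above can be sketched in isolation. The first function gathers per-session results with `Promise.allSettled` so that one rejected session is recorded rather than aborting the run; the second applies `AbortSignal.timeout` to a `fetch` call. The two changes live in different files in the PR and are shown together only for brevity. `Span`, `CollectionResult`, `collectAll`, the `collectSpans` callback, and `fetchTextWithTimeout` are hypothetical names, not the PR's actual API.

```typescript
type Span = { sessionId: string; name: string };

interface CollectionResult {
  spans: Span[];
  failures: { sessionId: string; reason: string }[];
}

// `collectSpans` is a hypothetical per-session collector supplied by the caller.
async function collectAll(
  sessionIds: string[],
  collectSpans: (sessionId: string) => Promise<Span[]>,
): Promise<CollectionResult> {
  // allSettled waits for every promise and never rejects, unlike Promise.all,
  // which rejects as soon as any single promise rejects.
  const outcomes = await Promise.allSettled(sessionIds.map((id) => collectSpans(id)));
  const result: CollectionResult = { spans: [], failures: [] };
  outcomes.forEach((outcome, i) => {
    if (outcome.status === "fulfilled") {
      result.spans.push(...outcome.value);
    } else {
      const reason = outcome.reason instanceof Error ? outcome.reason.message : String(outcome.reason);
      result.failures.push({ sessionId: sessionIds[i] ?? "unknown", reason });
    }
  });
  return result;
}

// Fetch with a hard deadline: once the timeout elapses the signal aborts
// and the pending request rejects instead of hanging.
async function fetchTextWithTimeout(url: string, timeoutMs = 30_000): Promise<string> {
  const response = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
  if (!response.ok) throw new Error(`Request failed with status ${response.status}`);
  return response.text();
}
```

Returning failures alongside spans lets the caller decide whether a partial result is acceptable for the evaluation.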

LGTM.


Labels

size/xl PR size: XL


2 participants