Update release-3.7 #2428

Merged
vbotbuildovich merged 92 commits into release-3.7 from master
Apr 29, 2026

c-julin commented Apr 29, 2026

No description provided.

github-actions Bot and others added 30 commits April 8, 2026 10:58
docs: update CHANGELOG.md for v3.7.1
Flatten nested overrides that bun does not support, disable MF DTS
generation that fails on pre-existing type errors, and add ESM module
type to eliminate Node.js reparsing warning.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat: attempt 1

* fix: bug bash major issues addressed

* fix: PR comments and more bug bash fixes

* fix: removed plans

* fix: fixed react doctor and integration test failures

* fix: adversarial review points addressed

* fix: reverted yaml sync fix that was irrelevant

* fix: copilot comments

* fix: reverted topics fix out of scope

* fix: claude comments
* fix(frontend): switch React Compiler to opt-in annotation mode

The compiler was running in default "infer" (opt-out) mode, memoizing
every component/hook globally. This caused stale closures and missed
re-renders in pagination, data table filters, and search components —
58 files already needed 'use no memo' workarounds.

Switch to compilationMode: 'annotation' so the compiler only processes
files explicitly marked with 'use memo'. Remove all 71 'use no memo'
directives (now dead code). No files are opted in yet — this is a
stabilization change. Files can be incrementally opted in after testing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(frontend): address code review feedback on compiler config

- Add comment above sources callback explaining it gates opt-in
  eligibility even in annotation mode
- Add panicThreshold: 'critical_errors' so the build fails loudly
  if the compiler can't handle an explicitly opted-in file

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
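
For reference, the compiler settings described in the two commits above could look roughly like the options object below. This is a sketch, not the repository's actual config: the `sources` predicate and the way the object is passed to the build are assumptions, while `compilationMode` and `panicThreshold` are the option names quoted in the commit messages.

```typescript
// Hypothetical options object for babel-plugin-react-compiler.
// How it is wired into the bundler is not shown in the commit messages.
const reactCompilerOptions = {
  // Only compile components/hooks that opt in with a 'use memo' directive.
  compilationMode: 'annotation',
  // Fail the build if the compiler hits a critical error.
  panicThreshold: 'critical_errors',
  // Assumed predicate: only files under src/ are eligible to opt in.
  sources: (filename: string): boolean => filename.includes('/src/'),
};
```

With this mode, a file without the `'use memo'` directive is left untouched by the compiler even if `sources` returns true for it.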
Replace the monolithic principal type selector in SharedConfiguration
with a composable renderPrincipal render prop. Role pages provide a
simple "Role name" input without the type selector, while ACL pages
retain the full User/Group/RedpandaRole dropdown.

- Add PrincipalFieldProps type to acl.model.tsx
- Add renderPrincipal prop to CreateACL and SharedConfiguration
- Extract LockedPrincipalField shared component for read-only usage
- Role create page: editable role name input without type selector
- Role update page: disabled role name with error handling
- Add try-catch to updateRoleAclMutation with toast.error
Group principals were hidden behind the gbac enterprise feature flag.
Now they always appear when ACLs exist for them.

- Remove gbac filter gate from ACLs tab display
- Add Group principals to Permissions List with badge and correct links
- Rename UsersEntry to PrincipalEntry with principalType and isScramUser
- Hide "Delete User" options for Group principals in dropdown
- Type-narrow deleteACLsForPrincipal to 'User' | 'Group'
- Fix deduplication to check both name and principalType
- Add data-testid attributes to dropdown menu items
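
The deduplication fix in the list above can be illustrated with a composite key. This is a minimal sketch: `PrincipalEntry`, `principalType` and `isScramUser` are named in the commit, but the exact field types and the `dedupePrincipals` helper are assumptions made for illustration.

```typescript
type PrincipalType = 'User' | 'Group' | 'RedpandaRole';

// Assumed shape; only the field names come from the commit message.
interface PrincipalEntry {
  name: string;
  principalType: PrincipalType;
  isScramUser: boolean;
}

// Hypothetical helper: keeps the first entry for each (principalType, name) pair.
function dedupePrincipals(entries: PrincipalEntry[]): PrincipalEntry[] {
  const seen = new Set<string>();
  const result: PrincipalEntry[] = [];
  for (const entry of entries) {
    const key = `${entry.principalType}:${entry.name}`;
    if (seen.has(key)) continue;
    seen.add(key);
    result.push(entry);
  }
  return result;
}
```

Keying on the name alone would merge a user and a group that happen to share a name, which is the case the fix addresses.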
When navigating from user creation to ACL creation, the principal type
and name are now locked via ?lockPrincipal=true query parameter.

- Add lockPrincipal to route search schema as z.literal('true')
- Render LockedPrincipalField when lockPrincipal is active
- Update user-create.tsx link to include lockPrincipal=true
…incipal

- Role CRUD: create/delete role, verify no principal type selector
- Permissions List: delete dropdown with full user+ACL creation flow
- lockPrincipal: verify locked vs editable principal on ACL create
- Tests skip gracefully when features are unavailable in OSS env
Add explicit rule to always use bun run scripts from package.json,
never bunx or npx directly.
Wrap formatToastErrorMessageGRPC return value with toast.error() in
useCreateUserMutation, useUpdateUserMutationWithToast, useCreateRoleMutation,
useDeleteRoleMutation, and useUpdateRoleMembershipMutation. The function
returns a string but onError callbacks discarded the return value, causing
errors to be silently swallowed with no user feedback.
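
The bug described above has the following shape. The sketch uses stand-ins so it runs on its own: `toast` here is a local stub and `formatToastErrorMessage` is a hypothetical formatter, not the project's `formatToastErrorMessageGRPC`, whose signature is not shown in the commit.

```typescript
// Local stub standing in for the real toast library.
const shown: string[] = [];
const toast = {
  error: (message: string): void => {
    shown.push(message);
  },
};

// Hypothetical formatter that, like the real helper, only returns a string.
function formatToastErrorMessage(action: string, entity: string, error: Error): string {
  return `Failed to ${action} ${entity}: ${error.message}`;
}

// Before: the formatted string is computed and then discarded.
const onErrorBefore = (error: Error): void => {
  formatToastErrorMessage('create', 'user', error);
};

// After: the string is handed to toast.error so the user sees it.
const onErrorAfter = (error: Error): void => {
  toast.error(formatToastErrorMessage('create', 'user', error));
};
```
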
Replace the one-liner formatToastErrorMessageGRPC with a richer implementation
that uses ConnectError.findDetails() to extract well-known protobuf error
detail types: BadRequest, ErrorInfo, PreconditionFailure, QuotaFailure,
ResourceInfo, and LocalizedMessage. Uses rawMessage instead of message to
avoid redundant code prefix.
…egistry

Replace legacy REST api.createServiceAccount with useCreateUserMutation and
rolesApi.updateRoleMembership with useUpdateRoleMembershipMutation. Migrate
all @redpanda-data/ui Chakra components to Redpanda UI Registry equivalents
(Field, Input, Button, Select, Checkbox, Alert, Tooltip, SimpleMultiSelect,
CopyButton). Use useNavigate for Create ACLs navigation with search params.
…ncipal

Extract search param to sharedConfig resolution into a pure testable function
in principal-utils.ts. Remove the lockPrincipal query param from the route
schema and ACL create page — the principal field is now always editable when
pre-populated from user creation.
Add useEffect to sync sharedConfig state from propSharedConfig when it
changes after initial render. Fixes the ACL create form not being
pre-populated when route search params arrive after the first render cycle.
Add E2E test verifying the full user creation → Create ACLs navigation flow
including URL params and form pre-population. Migrate all E2E selectors to
use data-testid attributes where possible for reliability.
…route

Delete the monolithic acl-list.tsx (1114 lines) and its associated test
files, confirmation modals, and permission assignments component. These
are replaced by the new file-based security route structure with dedicated
pages for users, roles, ACLs, and permissions list.
Migrate security section from single $tab dynamic route to dedicated file-based
routes under /security/{users,roles,acls,permissions-list}. Update routeTree.gen.ts
with new route registrations and redirect /security/ to /security/acls.
Update breadcrumbs, navigation, and imports in ACL detail/update pages,
role pages, and user details to use the new file-based security route
structure with useSecurityBreadcrumbs hook.
Update imports and remove unused code in add-user-step.tsx and
create-user-confirmation-modal.tsx.
Add index routes for each security tab: users, roles, ACLs, and
permissions list. Add security.tsx layout route with tab navigation,
breadcrumbs, and feature-gated tab visibility.
Add security section components:
- Tab pages: users-tab, roles-tab, acls-tab, permissions-list-tab
- Hooks: useSecurityBreadcrumbs, usePrincipalList
- Shared: delete confirmation modals, user role tags, filter-by-name
- Tests for hooks, shared utilities, and tab components
Move generatePassword from user-create.tsx to a dedicated utility file.
Update imports in user-create.tsx, user-edit-modals.tsx, add-user-step.tsx,
and their test mocks.
Remove legacy files that are no longer imported after the security page
restructuring: models.ts, operation.tsx, principal-group-editor.tsx,
role-create.tsx, role-edit-page.tsx, role-form.tsx, and
create-user-confirmation-modal.tsx. Also remove dead
AclPrincipalGroupPermissionsTable from user-details.tsx.
Move all files from acls/new-acl/ up to acls/ and update all import paths.
Eliminates the unnecessary nesting now that the legacy acl-list.tsx is removed.
…factor

`new-acl/` subdirectory was removed in 296f51960; update the four test
files that still imported from the old path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…boarding components

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…alias

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
alenkacz and others added 28 commits April 20, 2026 16:02
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fix(agents): update edit screen to use aigw provider selector
* fix(secrets): paginate ListSecrets to return all entries

useListSecretsQuery sent a single request with pageSize=25 and dropped
everything past the first page, silently hiding secrets (e.g. PGDB_DSN
in accounts with >25 secrets). Loop nextPageToken via callUnaryMethod
until exhausted and aggregate into a single result, at the server-side
max of 50 per page.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(secrets): guard ListSecrets pagination against runaway loops

Add a hard cap of 200 iterations and a same-token guard so a misbehaving
server cannot trap the client in an infinite pagination loop. Also honor
react-query's AbortSignal so in-flight pages cancel on unmount or refetch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style(secrets): apply biome import ordering

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(secrets): disable react-query retries on pagination errors

queryFn throws intentionally on guard violations (non-advancing token,
max-pages exceeded). React Query's default 3-retry policy would
otherwise multiply the already-bounded runaway-pagination traffic by 4×
before surfacing the failure.

Addresses PR review feedback on #2394.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
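
The pagination loop and its guards, as described in the commits above, can be sketched as follows. This is an illustration under assumptions: `FetchPage` and `listAll` are hypothetical names, an empty `nextPageToken` is assumed to mean "no more pages", and the real hook goes through `callUnaryMethod` rather than a plain callback.

```typescript
interface Page<T> {
  items: T[];
  nextPageToken: string;
}

// Hypothetical page fetcher; the real code issues a ListSecrets request.
type FetchPage<T> = (pageToken: string, signal?: AbortSignal) => Promise<Page<T>>;

const MAX_PAGES = 200; // hard cap taken from the commit message

async function listAll<T>(fetchPage: FetchPage<T>, signal?: AbortSignal): Promise<T[]> {
  const all: T[] = [];
  let token = '';
  for (let page = 0; page < MAX_PAGES; page++) {
    // Stop early if the caller (e.g. the query layer) cancelled the request.
    if (signal?.aborted) throw new Error('aborted');
    const res = await fetchPage(token, signal);
    all.push(...res.items);
    if (res.nextPageToken === '') return all;
    // Same-token guard: a server that repeats the token would loop forever.
    if (res.nextPageToken === token) throw new Error('pagination token did not advance');
    token = res.nextPageToken;
  }
  throw new Error(`pagination exceeded ${MAX_PAGES} pages`);
}
```

Because both guards throw, a retrying caller would repeat the whole bounded loop on each retry, which is why the last commit in this group disables retries for this query.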
…nts (#2389)

* chore(frontend): bump A2A SDK, MCP SDK, AI SDK v6, streamdown v2

Align Redpanda Console with the current stable versions of the AI
toolchain. All four packages are dependencies of the AI Agents chat
feature:

- @a2a-js/sdk       0.3.10 → 0.3.13 (patch: stream/task fixes, no API change)
- @modelcontextprotocol/sdk 1.26.0 → 1.29.0 (minor additive)
- ai               5.0.101 → 6.0.168 (new major)
- streamdown        1.4.0 → 2.5.0  (new major)

ai v6 ships a new @ai-sdk/provider v3 package that introduces
LanguageModelV3, but keeps the v2 types as a backwards-compat export and
accepts either in the public LanguageModel alias, so our custom
A2aChatLanguageModel (which implements LanguageModelV2 from
@ai-sdk/provider) continues to work without modification.

streamdown v2 only changes internals (remark-cjk/katex plumbing) and is
source-compatible with how response.tsx consumes it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): adapt useContextUsage to ai v6 LanguageModelUsage shape

ai v6 reshaped LanguageModelUsage: inputTokenDetails and
outputTokenDetails are now required sub-objects, and the pre-v6
top-level reasoningTokens/cachedInputTokens fields are deprecated but
still present. Our useContextUsage hook returned only the legacy fields,
which tsgo flagged as missing required members once the alias pulled in
the v6 shape.

Populate both the new sub-objects and the legacy fields so the Context
component (which still reads from the deprecated top-level fields)
continues to work today and any future upstream re-sync that switches
to the new fields sees consistent values.

Add a unit test covering the mapping, useMemo stability, and recompute
behaviour.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(frontend): adopt MCP approval states and dynamic tools in ai-elements Tool

Re-sync the Tool component with vercel/ai-elements upstream to pick up
support for the ai v6 tool-approval flow. ai v6 introduced three new
ToolUIPart states used by the MCP approval loop:

- approval-requested  → "Awaiting Approval" (shield/alert)
- approval-responded  → "Responded"         (shield/check)
- output-denied       → "Denied"            (X)

It also added DynamicToolUIPart for provider-resolved (e.g. MCP) tools,
where the tool name isn't encoded in the part type and must be passed
via a toolName prop. ToolHeader now accepts a discriminated union:
either a static `tool-*` type with no toolName, or `dynamic-tool` with
toolName.

Redpanda-specific customisations are preserved: our own Badge variants
(success-inverted / info-inverted / destructive-inverted /
neutral-inverted / warning-inverted), the durationMs display, the
toolCallId copy button, and the deepParseJson-aware ToolInput /
ToolOutput behaviour.

Covered by new tests in tool.test.tsx for all seven states plus the
dynamic-tool naming, title override, duration formatting, and
toolCallId rendering.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
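
The discriminated union for ToolHeader described above might be typed along these lines. It is a sketch, not the vendored component: `ToolHeaderPropsSketch` and `deriveToolName` are hypothetical names, and stripping the `tool-` prefix to get a display name for static tools is an assumption.

```typescript
// Either a static `tool-*` part with no toolName, or a dynamic tool with one.
type ToolHeaderPropsSketch =
  | { type: `tool-${string}`; toolName?: never; title?: string }
  | { type: 'dynamic-tool'; toolName: string; title?: string };

// Hypothetical helper: an explicit title wins, otherwise a name is derived.
function deriveToolName(props: ToolHeaderPropsSketch): string {
  if (props.type === 'dynamic-tool') {
    return props.title ?? props.toolName;
  }
  // Assumed derivation for static tools: drop the "tool-" prefix.
  return props.title ?? props.type.slice('tool-'.length);
}
```

The `toolName?: never` member makes the compiler reject a static `tool-*` header that also passes `toolName`, and requires `toolName` whenever `type` is `'dynamic-tool'`.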

* feat(frontend): adopt ConversationDownload and messagesToMarkdown

Pull the upstream ConversationDownload component and messagesToMarkdown
helper from vercel/ai-elements into our vendored conversation.tsx. The
AI Agents chat feature will use these to give users a one-click export
of an agent conversation as Markdown.

Kept all Redpanda customisations: import path to
components/redpanda-ui/components/button, overflow-y-auto (intentional
over upstream's overflow-y-hidden so the content scrolls), simpler
ConversationContent classes, no dark:bg-background override on the
scroll button (our Button variant handles theming), and the
Conversation.displayName.

Covered by a unit test of messagesToMarkdown across role capitalisation,
multi-message joining, multi-part text concatenation, non-text parts,
custom formatters, and the empty case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
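
A rough idea of what a messages-to-Markdown helper does, based only on the behaviours the unit test above lists. This is not the upstream `messagesToMarkdown`: the message shape, the `**Role:**` output format and the `messagesToMarkdownSketch` name are all assumptions.

```typescript
// Assumed minimal message shape for illustration.
interface Part {
  type: string;
  text?: string;
}
interface ChatMessage {
  role: string;
  parts: Part[];
}
type Formatter = (message: ChatMessage, index: number) => string;

const capitalise = (s: string): string => s.charAt(0).toUpperCase() + s.slice(1);

// Default formatter: concatenates text parts and skips non-text parts.
const defaultFormatter: Formatter = (message) => {
  const text = message.parts
    .filter((part) => part.type === 'text')
    .map((part) => part.text ?? '')
    .join('');
  return `**${capitalise(message.role)}:** ${text}`;
};

// Joins formatted messages with a blank line; an empty input yields ''.
function messagesToMarkdownSketch(messages: ChatMessage[], format: Formatter = defaultFormatter): string {
  return messages.map(format).join('\n\n');
}
```
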

* refactor(frontend): re-sync low-risk ai-elements with upstream

Pick up three small upstream improvements from vercel/ai-elements while
preserving every Redpanda customisation (import paths, Redpanda Button
variants, intentional class overrides):

- shimmer.tsx: cache motion components in a module-level Map so that
  `motion.create(element)` is never called during render. Stops churning
  component identity across every message tick.
- context.tsx: memoise the React context value so consumers only
  re-render when a usage field actually changes, and guard
  ContextInputUsage / ContextOutputUsage against a zero-token state so
  the HoverCard never shows "Input —" lines before a response has
  arrived.
- image.tsx: prefix the unused destructured `uint8Array` arg with an
  underscore, matching the upstream biome convention.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
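
The shimmer.tsx change above is a module-level cache around a factory call. The pattern can be sketched without the animation library; `createComponent` below is a stand-in for a call such as `motion.create(element)`, and all names are hypothetical.

```typescript
interface ComponentSketch {
  tag: string;
}

let createCalls = 0;

// Stand-in for an expensive factory that returns a new component identity
// every time it is called.
function createComponent(tag: string): ComponentSketch {
  createCalls += 1;
  return { tag };
}

// Module-level cache: lives outside any render function.
const componentCache = new Map<string, ComponentSketch>();

function getCachedComponent(tag: string): ComponentSketch {
  let component = componentCache.get(tag);
  if (!component) {
    component = createComponent(tag);
    componentCache.set(tag, component);
  }
  return component;
}
```

Returning the same object for the same tag keeps the component identity stable across renders, so React does not remount it on every update.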

* chore(frontend): revert MCP SDK bump to keep it in separate PR

@modelcontextprotocol/sdk upgrade will ship in its own PR so this one
stays scoped to the AI toolchain (A2A + ai v6 + streamdown v2 +
ai-elements re-sync).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): guard Context components against divide-by-zero

When a conversation has not yet reported any usage events, `maxTokens` is 0
and the token/percent computation renders "NaN%" in the trigger, icon, and
hover-card header. Guard each call site so the components fall back to 0%
until real capacity is known. Adds regression tests for the zero-token
edge case plus provider-value stability guards.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
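
A guard of the kind described above could look like this. It is a sketch: `usedPercent` is a hypothetical helper, and it also covers the NaN, Infinity and negative inputs that a later commit in this PR mentions hardening against.

```typescript
// Returns the share of the context window in use, or 0 when it is unknowable.
function usedPercent(usedTokens: number, maxTokens: number): number {
  const inputsValid =
    Number.isFinite(usedTokens) &&
    Number.isFinite(maxTokens) &&
    usedTokens >= 0 &&
    maxTokens > 0; // also rules out the divide-by-zero case
  if (!inputsValid) return 0;
  return (usedTokens / maxTokens) * 100;
}
```

Without the guard, `0 / 0` evaluates to `NaN`, which is what rendered as "NaN%" before any usage event had arrived.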

* test(frontend): extend Tool approval-flow and output-rendering coverage

Adds cases for input-streaming, output-error + duration, mid-flight
duration suppression, approval-requested → output-denied transition,
unknown-state graceful fallback, and ToolInput/ToolOutput rendering
for empty-object, JSON, and errorText inputs. Complements the 9 cases
landed in the ai-elements resync commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(frontend): cover Shimmer motion-cache and Response streamdown v2

Shimmer regression guards ensure the module-level motion-component cache
stays correct across re-renders and between multiple instances, and that
switching the `as` prop swaps the rendered DOM tag. Response tests guard
the streamdown v2 pipeline against regressions in basic markdown, fenced
code blocks, and the `memo` bail-out on unchanged children.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(frontend): cover A2A adapter mapFinishReason and getResponseMetadata

Locks in the A2A task-state → AI SDK v6 finish-reason mapping and the
response-metadata shape for Task / Message / status-update / artifact-update
events. Guards against silent regressions when either SDK bumps minor
versions, and protects the timestamp-undefined edge case from regressing
to an invalid Date.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(frontend): add showcase browser tests + screenshots for PR #2389

Captures 13 PNG screenshots of the adopted AI-Elements variants —
Tool card states (incl. the three newly adopted MCP approval flow
states and DynamicToolUIPart), ConversationDownload, Context
hover-card guards, and the Shimmer loading frame — rendered via
vitest browser mode and committed under docs/pr-screenshots/ for
reference in the PR body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style(frontend): apply biome formatter to a2a adapter test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(frontend): drop stale @ts-ignore and dead activeTextIds in A2A adapter

- @ts-ignore on config was stale; config.jwt is actually read.
- settings constructor arg is intentionally unused; prefix with underscore
  instead of suppressing unused-check.
- activeTextIds Set was never populated but still iterated in flush —
  pure dead code since v6 switched text through raw/artifact events.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(frontend): remove no-op text-delta handler

A2A carries text through status-update.message.parts and artifact-update
events. The text-delta handler was a deliberate no-op with its own
dispatch branch; consolidating the v6 stream-part story by removing both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): copy button title uses displayName not raw toolName

Static tools passed `type="tool-foo-bar"` with no `toolName` prop, so the
copy button title rendered "Copy: undefined". Use `displayName` instead,
which resolves to `title ?? derivedName` — already the visible label.

Also fix the lint regressions introduced by the previous refactor:
formatter drift in event-handlers.ts and a dangling `continue` left
after removing the text-delta branch in use-message-streaming.ts.

Caught by @claude PR review on #2389.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(frontend): regenerate PR screenshots with real theme styling

Previous screenshots rendered unstyled (no tailwind / no theme tokens)
because the vitest browser setup deliberately skipped globals.css.
Regenerated under the wired-up pipeline so the PR body screenshots show
the actual Redpanda look (badge variants, monospace JSON, dashed
borders, typography).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(frontend): extract pure a2aEventToV2StreamParts from doStream

Move the per-event A2A → LanguageModelV2 translation out of the inline
TransformStream and into a pure reducer colocated with the adapter. This
unblocks unit-level coverage of the per-TaskState branches without
standing up a full streaming harness. Zero behavior change; the
TransformStream now just plumbs parts and next-state through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(frontend): cover a2aEventToV2StreamParts per-state branches

Table-driven coverage of every TaskState → LanguageModelV2FinishReason
branch plus the response-metadata, raw-chunk, and non-status-event
paths. Also locks in finalizeStream's unconditional undefined-usage
contract so regressions on the AI SDK v6 finish-part shape are caught
at the unit layer rather than at the streaming integration layer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(frontend): extract parseA2AError into its own util

Move the JSON-RPC error parser and its five regex constants out of
use-message-streaming.ts into chat/utils/parse-a2a-error.ts. The hook
imports from the new module; the existing describe('parseA2AError', ...)
suite in use-message-streaming.test.ts updates its import path only.
Function signature and behavior are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(frontend): table-driven coverage for parseA2AError

Dedicated test.each suite targeting the extracted util directly. Covers
the SDK's two serialized JSON-RPC error shapes (SSE-wrapped and
streaming), invalid Data: JSON, empty/missing code fallbacks, non-Error
input types, and prefix-stripping behavior. The existing hook-level
describe('parseA2AError') in use-message-streaming.test.ts stays put
since it exercises the same util through the full streamMessage path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(frontend): host PR screenshots on orphan branch instead of committing

The 13 showcase PNGs used in PR #2389's description now live on the
orphan branch `pr-screenshots-2389` and are referenced via raw URLs, so
the main merge history stays free of binary assets. The browser tests
still regenerate PNGs locally under `frontend/docs/pr-screenshots/`;
that directory is gitignored so re-runs do not re-commit them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(frontend): follow master to MCP SDK 1.29.0

* chore(repo): drop unneeded frontend/docs/pr-screenshots gitignore entry

* chore(frontend): bump copyright year to 2026 across PR-added files

* fix(frontend): harden Context divide-by-zero guard against NaN, Infinity, negatives

* test(frontend): align ai-elements browser tests with ADP UI utility convention

* docs(frontend): clarify why Image drops uint8Array and renders from base64

* feat(frontend): reject unknown A2A stream events with UnsupportedFunctionalityError

* feat(frontend): extend parseA2AError with title and actionable hint for MCP codes

* style(frontend): satisfy biome formatter and top-level-regex rule on a2a tests

* chore(frontend): remove committed mcp inspector screenshot baseline

The 32k PNG baseline under __screenshots__/ doesn't need to live in the
repo — our convention is to regenerate locally or host showcase
screenshots on the pr-screenshots orphan branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): stop double-dispatching prompts in A2A doStream

A non-streaming agent previously received the message twice — once via
the sendMessage branch (whose result populated simulatedStream) and
again unconditionally via sendMessageStream. The FIXME error branch
also silently fell through instead of surfacing the error.

Move sendMessageStream into an else branch so the two transport paths
are mutually exclusive, and throw on sendMessage error so failures
propagate instead of being shadowed by the subsequent stream.

Flagged by @claude PR reviewer on #2389.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): distinguish ToolOutput error container from success

The ternary on the result container had bg-muted/50 in both branches,
so errors looked identical to successful results. Switch the error
branch to bg-destructive/10 and add a regression test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): capture nested Data payloads in parseA2AError regex

The [^}]* fragment stopped at the first closing brace so nested data
got truncated and JSON.parse silently threw, losing the structured
payload. Anchor to end-of-string with the /s flag and add a
nested-object test case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
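
The truncation described above is easy to reproduce. The patterns and the sample message below are illustrative assumptions, not the regexes from parse-a2a-error.ts; only the `[^}]*` fragment, the end-of-string anchor and the `/s` flag come from the commit message.

```typescript
// Hypothetical error text with a nested JSON payload.
const sampleMessage = 'RPC error Code: -32000 Data: {"detail":{"reason":"quota"}}';

// Stops at the first closing brace, so nested objects are cut short.
const TRUNCATING = /Data: (\{[^}]*\})/;
// Greedy match anchored to end-of-string; /s lets "." span newlines.
const ANCHORED = /Data: (\{.*\})\s*$/s;

function extractData(message: string, pattern: RegExp): unknown {
  const raw = pattern.exec(message)?.[1];
  if (raw === undefined) return undefined;
  try {
    return JSON.parse(raw);
  } catch {
    // Unbalanced braces end up here and the payload is lost.
    return undefined;
  }
}
```

With the sample message, the first pattern captures `{"detail":{"reason":"quota"}` (one brace short), which fails to parse, while the anchored pattern captures the whole object.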

* refactor(frontend): extract chooseA2ASourceStream for testable transport selection

Pull the streaming-vs-blocking decision out of doStream into a pure
function that takes a structural A2ATransport. Lets us cover the
double-dispatch fix with deterministic unit tests (4 cases: streaming
path, blocking success, blocking error, undefined capability).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
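
The mutually exclusive transport selection might be sketched as below. This is not the extracted `chooseA2ASourceStream`: the `TransportSketch` shape and the response envelope are assumptions, and treating an undefined streaming capability as "streaming" is a guess that the commit message does not confirm.

```typescript
// Assumed structural transport; the real A2ATransport type is not shown.
interface TransportSketch<E> {
  streaming?: boolean;
  sendMessage(): Promise<{ result?: E; error?: { message: string } }>;
  sendMessageStream(): AsyncIterable<E>;
}

async function chooseSourceStream<E>(transport: TransportSketch<E>): Promise<AsyncIterable<E>> {
  if (transport.streaming === false) {
    // Blocking path: send once and surface errors instead of falling through.
    const response = await transport.sendMessage();
    if (response.error) throw new Error(response.error.message);
    const result = response.result;
    // Wrap the single result so callers can consume it like a stream.
    return (async function* () {
      if (result !== undefined) yield result;
    })();
  }
  // Streaming path (also used here when the capability is undefined).
  return transport.sendMessageStream();
}
```

Because the streaming call sits after an early return, a non-streaming agent receives the prompt exactly once.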

* style(frontend): satisfy biome rules in a2a tests (ci-lint fix)

- hoist regex literal to module scope for useTopLevelRegex
- rename inline generator to reuse across assertions for useAwait
- apply numeric separators via biome --write

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(frontend): wrap MCP streaming hook state-updates in act()

The renderHook-based tests in remote-mcp.test.tsx and the streaming
progress UI tests in remote-mcp-inspector-tab.test.tsx triggered
component state updates outside of act() when mutateAsync settled or
when onprogress callbacks were fired directly, producing React warnings
in test stderr. Wrap the triggering awaits in act(async () => ...) to
flush them deterministically. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style(frontend): reorder imports per biome autoformat

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The cloud BSR module (`buf.build/redpandadata/cloud`) was added as an
unfiltered input in buf.gen.frontend.yaml, pulling in all cloud protos
(controlplane, billing, IAM, UI, etc.) when only `adp/v1alpha1` is needed.

Add `paths: [redpanda/api/adp]` filter to limit generation to adp protos.
Regenerate protogen files and add new `LLM_PROVIDER_TYPE_OPENAI_COMPATIBLE`
enum value to provider type maps.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The existing `git diff --exit-code` only catches modified tracked files.
New untracked files (like the 142 cloud proto files from the unscoped
BSR input) slip through. Add an explicit check for untracked files in
the protogen output directories.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pin buf.build/redpandadata/cloud to commit 4d513424 in
buf.gen.frontend.yaml to prevent surprise proto changes from
upstream cloud module updates.

Remove cloud from buf.yaml deps since it's unused there (only
used as a buf.gen.frontend.yaml input).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Advisory: code injection in protobufjs <=7.5.4 / <=8.0.0 via unsanitized
schema-derived identifiers in generated functions. Console has no direct
protobufjs dependency, but a transitive 7.5.4 was present via the lockfile.
Pin it to ^7.5.5 in both overrides and resolutions to kill the finding.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(proto): scope cloud BSR module to adp protos only
Add AWS Bedrock as a provider option for AI agents. The Bedrock message
includes region (required), and optional access_key_id_secret_ref and
secret_access_key_secret_ref fields that accept secret references. When
credentials are omitted, the agent falls back to the AWS default
credential chain.

Frontend changes:
- Add 'bedrock' to all provider type unions and switch statements
- Add Bedrock entry to PROVIDER_INFO and MODEL_OPTIONS_BY_PROVIDER
- Enable Bedrock in LLM_PROVIDER_TYPE_TO_FORM_ID mappings
- Add LLM_PROVIDER_TYPE_OPENAI_COMPATIBLE to enum maps (BSR drift)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat(agents): add Bedrock provider to AIAgent proto
* chore(frontend): remove vestigial AI Gateway v1 references

The old AI Gateway v1 UI was removed from Console in #2155 (Jan 2026)
but a dev proxy and two buf-generated npm packages remained. Drop them.

Removed:
- Dev proxy block for `/.redpanda/api/redpanda.api.aigateway.v1` in rsbuild.config.ts
- `AI_GATEWAY_URL` env var export in start-cloud.sh (and cluster-id regex helper it depended on)
- `@buf/redpandadata_ai-gateway.bufbuild_es` / `@buf/redpandadata_ai-gateway.connectrpc_query-es` deps

Preserved:
- AI Gateway v2 (`aigw`) proxy, hooks, query layer, agent pages
- Scope.AI_GATEWAY secret scope (semantically v2 now)
- Deprecated `virtualGatewayId` field in `AIAgent.GatewayConfig` proto — backend contract stays

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(frontend): regenerate lockfiles after dropping aigw v1 packages

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(frontend): drop AI Gateway v2 proxy; Console will consume AIGW via ADP UI

The /.aigw/api dev proxy is removed. Console does not call AI Gateway
(v1 or v2) directly anymore; ADP UI (hosted inside Cloud UI) owns all
AIGW interaction going forward.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert "chore(frontend): drop AI Gateway v2 proxy; Console will consume AIGW via ADP UI"

Per reviewer feedback on #2403: the aigw v2 (`/.aigw/api`) dev proxy
stays. It powers Console's own aigw v2 API calls (LLMProviderService,
ModelService) used by the agent pages. Only the v1 proxy is removed
in this PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#2405)

Remote MCP is GA but we need to temporarily hide the sidebar entry for
most customers while a small allow-list (Poolside AI, LiveRamp ADP,
IT-Novum) continues using it. Adds `enableRemoteMcpInConsole` to the
forwarded feature-flags map and gates the `/mcp-servers` sidebar entry
on it alongside the existing `isEmbedded()` check. Route access is not
changed — only sidebar visibility.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(a2a): resubscribe on graceful SSE close with in-flight task

Load balancers with idle timeouts (commonly ~5 minutes) close idle TCP
connections gracefully with a FIN rather than an error. The A2A streaming
iterator surfaces this as end-of-stream, so the catch-block reconnect
never runs and the message is finalized with a non-terminal taskState,
orphaning the task server-side.

Route clean closes through the same resubscribeLoop used in the error
path when the task is still resubscribable. The loop already handles
backoff, progress detection, and give-up; it just needed a second entry
point.

Covered by three regression tests: clean-close with in-flight task,
clean-close with terminal task (no resubscribe), and clean-close where
resubscribe exhausts retries (finalizes with gave-up status).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(a2a): guard clean-close reconnect from second round on finalize failure

After the clean-close path enters resubscribeLoop and exhausts its 5
attempts without reaching terminal state, the captured task state is
still 'working' and isResubscribable() remains true. If finalizeMessage
then throws (e.g., the DB write fails), the outer catch would re-enter
resubscribeLoop for another full 31s backoff round.

Track entry with a resubscribeAttempted flag and skip the catch-path
reconnect when the clean-close path already ran one. Mirrors the
defensive pattern in the existing catch block around finalizeMessage.

Add a regression test asserting exactly one round of 5 retries even when
both resubscribe and finalize fail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
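
The "5 attempts" and "31s backoff round" figures above are consistent with a doubling schedule that starts at one second. The commit does not state the actual delays, so the schedule below is an assumption used only to show the arithmetic; `backoffDelaysMs` is a hypothetical helper.

```typescript
// Assumed exponential backoff: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelaysMs(attempts: number, baseMs = 1000): number[] {
  return Array.from({ length: attempts }, (_, i) => baseMs * 2 ** i);
}

const delays = backoffDelaysMs(5); // 1000, 2000, 4000, 8000, 16000
const totalMs = delays.reduce((sum, delay) => sum + delay, 0); // 31000
```

Under that assumption, a second pass through the loop after a failed finalize would add another 31 seconds of waiting, which is the cost the `resubscribeAttempted` flag avoids.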

* refactor(a2a): address review comments on clean-close reconnect fix

Production code (use-message-streaming.ts):
- Remove redundant closeActiveTextBlock call in the clean-close branch;
  the block at the top of that stretch already closed it and nothing
  between could have opened a new one.
- Report success=false when the clean-close path's resubscribe gives up,
  mirroring the error-path's gave-up semantics. An orphaned task is a
  failure regardless of whether the original disconnect was graceful.

Tests (use-message-streaming.test.ts):
- Renumber scenarios into a clean 1-25 sequence (was 1-13, 13b, 13c,
  14-17 with an added 16b/c/d/e block mid-file).
- Rename the TypeError test: the code breaks out of the retry loop
  rather than rethrowing, so "stops retrying immediately on TypeError"
  is accurate.
- Update gave-up clean-close test to assert success=false.
- Add scenario 23: clean-close resubscribe succeeds then finalizeMessage
  throws — guards the success-path arm of resubscribeAttempted against
  future regressions that change terminal-state tracking.
- Add scenario 24: clean-close with taskId captured only from response
  metadata fallback — confirms the metadata block runs before the
  clean-close isResubscribable check and does not trigger spurious
  reconnects.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(a2a): log finalize failures on the clean-close reconnect path

Mirror the catch-block recovery branch's inner try/catch so a DB write
failure after a clean-close reconnect is observable in production
telemetry instead of silently producing an a2a-error block.

Closes the last review observation from the final Claude pass: the
clean-close path previously let finalizeMessage errors propagate to the
outer catch, which runs parseA2AError on the DB error but emits no log.
Scenarios 22 and 23 now also assert the log fires exactly once from the
clean-close branch, making the previously-inaccurate console.error spy
comments accurate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
)

Add/update overrides + resolutions in frontend/package.json to remediate
Snyk findings in production dependency tree.

Runtime fixes:
- dompurify >=3.4.0 (via @milkdown/kit chain) - operator precedence
- hono >=4.12.14 & @hono/node-server >=1.19.13 (via @modelcontextprotocol/sdk)
  - directory traversal, HTTP response splitting, XSS, improper input validation

Dev-time fixes picked up through hoisted tree:
- picomatch >=2.3.2 (ReDoS + prototype pollution via chokidar/micromatch)
- webpack >=5.104.1 (SSRF in HMR runtime)
- @tootallnate/once >=3.0.1 (control flow scoping via jsdom/http-proxy-agent)
- sirv >=3.0.2 (directory traversal via webpack-bundle-analyzer)

Verified:
- snyk test --file=package.json -> 0 vulnerable paths (production)
- bun run type:check -> passes
- bun run lint:check -> passes
- bun run test:unit -> 749/749 pass
- bun run build -> succeeds

Remaining deferred findings:
- elliptic@6.6.1 (Medium) - no upstream fix; already on latest 6.x
- Dev-only tooling (testcontainers, rspack, vite transitives) not in prod bundle

Resolves CVE-2026-29181 (DoS via multi-value baggage header
amplification in otel baggage/propagation) by bumping the otel
module graph from v1.39.0 to v1.43.0. Fix was introduced in 1.41.0.

GHSA-mh2q-q3fh-2475

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
backend: bump OpenTelemetry to v1.43.0 to address Snyk findings
…2409)

AgentRegistryService (redpanda/api/adp/v1alpha1/agent.proto) is a
server-side registry API that the console UI never calls. PR #2401
unintentionally pulled it in when scoping the cloud BSR module input
to redpanda/api/adp. Add exclude_paths to drop agent.proto so the
generated client stays out of this public-facing repo.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

34be39e added Bedrock to the provider TS unions and the aigw provider
type mapping but missed the Zod enum in the create/edit schema.
Selecting a Bedrock provider in the aigw dropdown set provider='bedrock',
which failed validation with "Invalid enum value. Expected 'openai' |
'anthropic' | 'google' | 'openaiCompatible', received 'bedrock'".

Widen the enum to include 'bedrock'. Bedrock-specific field validation
(region, credentials) is not required here because in aigw mode that
config lives on the gateway provider, not on the agent.

PR #2402 added Bedrock to the dataplane proto and the provider
dropdown, but the create-page onSubmit switch was never updated. When
the user picked a Bedrock-typed LLM gateway provider, the form's
`provider` field was set to 'bedrock' via LLM_PROVIDER_TYPE_TO_FORM_ID,
fell through to `default // openai`, and shipped
`provider:{openai:{}}` with a Bedrock model ID. The aiagent then
blew up with "unsupported OpenAI model: anthropic.claude-opus-4-7".

Add a bedrock branch that pulls the region off the selected
LLMProvider's bedrockConfig (the dataplane proto marks
AIAgent.Provider.Bedrock.region as required) and builds the right
oneof variant.

Gateway-only for now. Direct (non-gateway) Bedrock would need its
own region field in the form.
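
A rough sketch of the shape of that fix, assuming a simplified payload type. The function name, the SelectedLlmProvider interface, and the payload types are hypothetical stand-ins for the generated proto types; only the provider IDs, bedrockConfig, and the `provider:{openai:{}}` shape come from the messages above. The `never` check in the default arm is one way to keep a future provider from silently falling through to OpenAI again.

```typescript
// Hypothetical, simplified types; the real code uses generated proto messages.
type FormProvider = "openai" | "anthropic" | "google" | "openaiCompatible" | "bedrock";

type Empty = Record<string, never>;

type ProviderPayload =
  | { openai: Empty }
  | { anthropic: Empty }
  | { google: Empty }
  | { openaiCompatible: Empty }
  | { bedrock: { region: string } };

interface SelectedLlmProvider {
  bedrockConfig?: { region?: string };
}

function buildProviderPayload(
  provider: FormProvider,
  selected?: SelectedLlmProvider,
): ProviderPayload {
  switch (provider) {
    case "openai":
      return { openai: {} };
    case "anthropic":
      return { anthropic: {} };
    case "google":
      return { google: {} };
    case "openaiCompatible":
      return { openaiCompatible: {} };
    case "bedrock": {
      // Region is required, so take it from the selected gateway provider.
      const region = selected?.bedrockConfig?.region;
      if (!region) {
        throw new Error("bedrock provider requires a region");
      }
      return { bedrock: { region } };
    }
    default: {
      // Compile-time exhaustiveness check: adding a provider to FormProvider
      // without a case here becomes a type error instead of a silent fallback.
      const unhandled: never = provider;
      throw new Error(`unhandled provider: ${String(unhandled)}`);
    }
  }
}
```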
…ECT width (UX-1235) (#2418)

* fix(topics): hide Infinite button for capped config keys (UX-1222)

The topic config editor showed an "Infinite" button for every config whose
frontendFormat was BYTE_SIZE or DURATION. For configs that the broker hard-caps
(e.g. max.message.bytes, segment.bytes), clicking it submitted -1, which the
broker reinterpreted as 4294967295 and rejected with INVALID_CONFIG.

Add an opt-out field `noInfiniteValue` to the config extension schema.
Configs that do not accept an infinite sentinel declare it in JSON, and the
frontend hides the Infinite button for them. BYTE_SIZE and DURATION configs
continue to show the button by default.

Apply the opt-out to max.message.bytes and segment.bytes.
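
As a minimal sketch of the opt-out, assuming a simplified extension type: the function name and the interface are illustrative, and only frontendFormat, noInfiniteValue, and the BYTE_SIZE / DURATION formats come from the description above.

```typescript
// Illustrative types; the real config extension schema has more fields.
type FrontendFormat = "BYTE_SIZE" | "DURATION" | "SELECT" | "STRING";

interface ConfigExtension {
  frontendFormat: FrontendFormat;
  noInfiniteValue?: boolean; // opt-out declared in the extension JSON
}

function showsInfiniteButton(ext: ConfigExtension): boolean {
  const formatSupportsInfinite =
    ext.frontendFormat === "BYTE_SIZE" || ext.frontendFormat === "DURATION";
  // Default stays "show"; only an explicit opt-out hides the button.
  return formatSupportsInfinite && ext.noInfiniteValue !== true;
}

// Why -1 is rejected for capped keys: read as an unsigned 32-bit integer,
// -1 becomes 4294967295.
const minusOneAsUint32 = -1 >>> 0;
```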

* fix(topics): force min width on Custom SELECT dropdown (UX-1235)

chakra-react-select's container has an intrinsic maxWidth: fit-content,
so even with a minW on the wrapping Box the SingleSelect collapses to
the width of its current value. For message.timestamp.type with no
initial value the whole control rendered at ~69px and option labels
were truncated to "Creat" / "LogA".

Override via chakraStyles on the SELECT case to set container
minWidth: 240px directly. Keeps the Custom Box layout untouched for
BYTE_SIZE / DURATION controls.

* fix(frontend): drop "successfully" from toast/status messages

Use past-tense verb only per style guide ("Topic created" not
"Topic created successfully"). Also removes the exclamation point
from "Topic created!" in the create-topic modal.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): update test assertions for "User created" string change

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): remove exclamation points from UI strings

Per style guide: no ! in product UI text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): replace Yes/No button labels with action verbs

"Yes" → "Stop reassignment", "No" → "Keep running" in the
cancel-reassignment confirmation popover per style guide.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): capitalize Schema Registry product name consistently

Fix lowercase "schema registry" in UI-facing strings and aria-labels.
Also fix "schema registry rule" → "Schema Registry rule" in validation
messages and ACL resource display names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): remove "Please" from UI strings, use direct imperative

Per style guide: drop "Please <verb>" at start of strings — use
direct imperative. Also updates test assertions that matched changed strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(frontend): sweep misc UX copy (via, e.g. in UI strings)

Replace "via" with "through"/"using" and "e.g." with "for example"
in user-visible strings per style guide. Placeholder text with e.g.
left as-is (conventional and concise in that context).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

extractProviderInfo did not handle the 'bedrock' case, so region was
lost when editing an agent. Also update region from the AIGW LLM
provider config when switching llmProvider to a bedrock provider.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses review comment: use optional chaining (?.) for type safety.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…on-edit

fix(agents): preserve bedrock region on AI agent edit

Adds a workflow_run-triggered workflow that fires after
"PR verification (forks)" succeeds on fork pull_request runs.
It dispatches the push event to console-enterprise and marks
Enterprise CI as pending, matching the non-fork flow.

Fork-controlled strings are passed via env vars (not template
expansion) to neutralize script injection, and the client
payload is built with JSON.stringify.

Payload now includes head_repository and is_fork so the
enterprise side can clone from the fork repo (the OSS SHA is
not reachable from redpanda-data/console for fork PRs).
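
A sketch of the payload-building idea, under stated assumptions: the env var names (HEAD_SHA, HEAD_REF, HEAD_REPOSITORY, BASE_REPOSITORY), the sha and ref fields, the function name, and the event_type/client_payload request body are hypothetical; only head_repository, is_fork, and the use of JSON.stringify come from the description above. Because fork-controlled strings are read as data and serialized, quotes or shell metacharacters in them cannot alter the surrounding script or JSON.

```typescript
// Hypothetical field and env var names, for illustration only.
interface ClientPayload {
  sha: string;
  ref: string;
  head_repository: string;
  is_fork: boolean;
}

function buildDispatchBody(env: Record<string, string | undefined>): string {
  const required = (name: string): string => {
    const value = env[name];
    if (value === undefined || value === "") {
      throw new Error(`missing env var ${name}`);
    }
    return value;
  };

  const headRepository = required("HEAD_REPOSITORY");
  const payload: ClientPayload = {
    sha: required("HEAD_SHA"),
    ref: required("HEAD_REF"),
    head_repository: headRepository,
    is_fork: headRepository !== required("BASE_REPOSITORY"),
  };

  // JSON.stringify escapes quotes and control characters, so untrusted
  // values stay inside their string fields.
  return JSON.stringify({ event_type: "push", client_payload: payload });
}
```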
* fix(topics): clamp default replication factor to broker count

CreateTopic defaulted replicationFactor to 3 in both the topic-create
modal and the MCP inspector's "create new topic" flow. On single-broker
clusters (local-byoc dev environments, etc.) the broker rejected the
request with "not enough replicas". Clamp the default to
min(3, brokersOnline) and fall back to 3 while KafkaInfo is loading.
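
The clamp itself is small; a minimal sketch follows. The function and constant names are hypothetical, and treating a broker count of zero the same as "still loading" is an assumption rather than something the commit message spells out.

```typescript
const DEFAULT_REPLICATION_FACTOR = 3;

// brokersOnline is undefined while the KafkaInfo query is still loading.
function defaultReplicationFactor(brokersOnline: number | undefined): number {
  // Assumption: with no usable broker count (loading, or zero brokers
  // reported), keep the historical default of 3.
  if (brokersOnline === undefined || brokersOnline < 1) {
    return DEFAULT_REPLICATION_FACTOR;
  }
  return Math.min(DEFAULT_REPLICATION_FACTOR, brokersOnline);
}
```

With one broker online this returns 1, with five it stays at 3, and before KafkaInfo resolves it returns 3.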

* fix(rp-connect): clamp onboarding topic RF to broker count

The RPCN onboarding Add Topic step spread TOPIC_FORM_DEFAULTS
(replicationFactor: 3) into the form and rendered the RF field as
readOnly in AdvancedTopicSettings. On single-broker clusters
(local-byoc dev environments) that meant the user saw RF=3 with no
way to edit, and CreateTopic failed with "not enough replicas".

Override replicationFactor at form init with min(default, brokersOnline)
via useGetKafkaInfoQuery so the readOnly value matches what the broker
can actually satisfy.

* revert(mcp): drop RF clamp from MCP inspector

Scope back to the user-visible topic-create paths only. The MCP
inspector's "create new topic" flow is out of scope for this PR.

* fix(rp-connect): react to late-arriving KafkaInfo via rhf `values`

useForm captures defaultValues at mount; when the KafkaInfo query
resolves after mount the RF field stays at its initial 3. Pass
defaultValues through both `defaultValues` and `values` with
`resetOptions: { keepDirtyValues: true }` so the form reactively
picks up the clamped replicationFactor once brokersOnline is known,
without clobbering fields the user has already edited.

* fix(topics): isolate keepDirtyValues to RPCN form watch only

Address PR review feedback on #2420. The form-level
`resetOptions: { keepDirtyValues: true }` (added so dirty fields
survive the late `kafkaInfo` re-init) was leaking into the
existing-topic-selected reset, where dirty fields should be
overwritten by the selected topic's actual config.

Also tighten the RF clamp comment in `create-topic-modal.tsx` to
cover the all-brokers-offline case, not just the loading case.

Refs UX-1208.

The 128-char cap on conversation_id, turn_id, tool_call_id, model,
and user_id has no upstream basis. OTel doesn't constrain
gen_ai.conversation.id; chat platforms (Microsoft Teams personal-DM
threads, Slack thread IDs, Discord channel IDs) routinely produce
IDs that exceed 128 characters. When buf-validate rejects the
TranscriptSummary response, the per-agent Transcripts tab is empty
even though the spans are visible in the global tracing view.

Bump to 256 across the board for consistency with the existing
title and TranscriptToolCall.name fields.
@github-actions

The latest Buf updates on your PR. Results from workflow Buf CI / validate (pull_request).

| Build | Format | Lint | Breaking | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ passed | ✅ passed | ✅ passed | ✅ passed | Apr 29, 2026, 12:08 PM |

@vbotbuildovich vbotbuildovich merged commit 761fc02 into release-3.7 Apr 29, 2026
39 checks passed