feat: kilo-chat — plugin, backend, event service, and web UI#2361
Merged
Contributor
Code Review Summary
Status: 1 Issue Found | Recommendation: Address before merge
Overview
Issue Details: WARNING
Other Observations (not in diff): None.
Files Reviewed (17 files)
Reviewed by gpt-5.4-20260305 · 3,344,263 tokens |
Contributor
Code Review Summary
Status: 16 Issues Found | Recommendation: Address before merge
Overview
Issue Details: WARNING
Other Observations (not in diff): None.
Files Reviewed (4 files)
Reviewed by gpt-5.5-20260423 · 10,734,987 tokens |
5 tasks
73e9669 to 81556be
…versation loading
The backend's listMessages returns content as ContentBlock[] (already parsed from the DB JSON string). The client's parseMessageRow then tried JSON.parse() on the already-parsed array, crashing silently. React Query swallowed the error, showing an empty message list. New messages were unaffected because they arrive via WebSocket events and bypass parseMessageRow entirely.
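A minimal sketch of a parser that tolerates both shapes, so a second parse is never attempted on data that is already an array. The `ContentBlock` shape and the `parseContent` name are assumptions for illustration, not the PR's actual code; the real fix may simply drop the client-side parse.

```typescript
// Hypothetical block shape; the real type lives in the kilo-chat package.
type ContentBlock = { type: string; text?: string };

// Accept either the raw DB JSON string or an already-parsed array.
function parseContent(content: string | ContentBlock[]): ContentBlock[] {
  if (Array.isArray(content)) return content; // already parsed by the backend
  const parsed: unknown = JSON.parse(content);
  if (!Array.isArray(parsed)) {
    throw new TypeError("content must decode to an array of blocks");
  }
  return parsed as ContentBlock[];
}
```

Passing an array straight to `JSON.parse()` coerces it to a string first, which is why the original call threw instead of returning the array.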
…lock stream
onPartialReply emits the full cumulative text of the current assistant message, while onBlockReply emits a chunk (a slice of that cumulative text at chunker boundaries). When a chunker boundary fired mid-message, the next partial still carried the cumulative text containing the delivered chunk as a prefix; the accumulator concatenated both with \n\n and produced a duplicated prefix in outbound PATCHes. Treat the partial as the authoritative cumulative view of the current message, detect new-message boundaries when a partial doesn't extend the previous one, and fold in deliver chunks only when not already represented. Adds a regression test that reproduces the exact 'prefix\n\nprefix+more' duplicate observed in production.
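The three rules above can be sketched as a small accumulator. The class and method names here are hypothetical and the plugin's real accumulator may track more state; this only shows the dedup logic under those assumptions.

```typescript
// Hypothetical accumulator: partials are cumulative, blocks are slices.
class ReplyAccumulator {
  private finished: string[] = []; // completed earlier messages
  private current = "";            // cumulative text of the current message

  // A partial is the authoritative cumulative view of the current message.
  onPartial(cumulative: string): void {
    if (!cumulative.startsWith(this.current) && this.current.length > 0) {
      // The partial does not extend the previous one: a new message began.
      this.finished.push(this.current);
    }
    this.current = cumulative;
  }

  // Fold in a delivered chunk only when it is not already represented.
  onBlock(chunk: string): void {
    if (this.current.includes(chunk)) return;
    this.current = this.current.length > 0 ? `${this.current}\n\n${chunk}` : chunk;
  }

  text(): string {
    return [...this.finished, this.current].filter(s => s.length > 0).join("\n\n");
  }
}
```

Replaying the production sequence (partial `prefix`, block `prefix`, partial `prefix+more`) leaves `prefix+more` rather than the duplicated form.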
Gates the message input and Send button on the existing BotStatus presence state (already covers explicit bot-offline, instance-not-running, and WS-stale-for-90s cases). Prevents the silent-hanging-POST UX where offline or unreachable sends look identical to delivered messages.
Adds `formatKiloChatError(err, fallback)` to translate KiloChatApiError bodies (zod issues, `body.error`) into human toasts, and routes every mutation onError through it so over-limit / empty / bad-input errors no longer degrade to a generic 'Failed to X'. Lifts the length caps to named constants in the kilo-chat package so the schemas, the error phraser, and a new MessageInput char counter (shown at >=80% of the 8k text cap) all share one source of truth.
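A rough sketch of what such a formatter could look like. `KiloChatApiError(status, body)` and `body.error` come from the PR text; the `issues` array with a `message` field is an assumed shape for the zod issue list, and the class below is a stand-in rather than the SDK's real one.

```typescript
// Stand-in for the SDK error class (status + parsed response body).
class KiloChatApiError extends Error {
  readonly status: number;
  readonly body: unknown;
  constructor(status: number, body: unknown) {
    super(`kilo-chat request failed with status ${status}`);
    this.status = status;
    this.body = body;
  }
}

function isRecord(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null;
}

// Prefer the first validation issue, then body.error, then the fallback.
function formatKiloChatError(err: unknown, fallback: string): string {
  if (!(err instanceof KiloChatApiError)) return fallback;
  const body = err.body;
  if (!isRecord(body)) return fallback;
  const issues = body.issues; // assumed location of the zod issue list
  if (Array.isArray(issues) && issues.length > 0) {
    const first: unknown = issues[0];
    if (isRecord(first)) {
      const message = first.message;
      if (typeof message === "string") return message;
    }
  }
  const error = body.error;
  if (typeof error === "string" && error.length > 0) return error;
  return fallback;
}
```

A mutation's `onError` would then call `formatKiloChatError(err, 'Failed to send message')` and pass the result to the toast.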
A malformed conversationId in the URL path is the only way this GET can return 400, so surface it as the same 'Conversation not found' toast + redirect that 403/404 already get instead of a generic 'Failed to load conversation'.
Restructures ConversationItem to the 'card-link' pattern: the row is a <div>, a <Link> overlay covers the row for navigation, and the kebab button + dropdown menu + leave-confirm buttons sit above the overlay with their own pointer events. This removes the invalid <a><button> nesting (which also caused stray navigations from menu clicks), adds aria-labels + aria-haspopup/aria-expanded on the kebab, role=menu on the dropdown, and aria-labels on the other icon-only buttons (scroll-to-latest, new-conversation).
ConversationDO already assigns ULIDs in commit order, but the fan-out in messages.ts spawned each deliverToBot call in its own worker-level `ctx.waitUntil`, letting rapid sends race the fetch to the Fly machine and arrive at the bot out of order. Move webhook delivery into the DO on a `webhookChain` promise so deliveries for a given conversation run sequentially; action.executed webhooks share the same chain so an approval can never overtake an earlier message to the same bot. Updates the miniflare kiloclaw-stub to buffer webhook calls in module scope and switches the reply-context tests (which previously relied on a route-level env override that no longer reaches the DO) to poll the buffer. Adds a new test asserting message ULIDs arrive at the bot in commit order when 5 sends are issued back-to-back.
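The promise-chain idea can be sketched in isolation. `DeliveryQueue`, `enqueue`, and the `Deliver` signature are hypothetical names; only `webhookChain` and `deliverToBot` come from the commit message, and the real Durable Object does more than this.

```typescript
type Deliver = (messageId: string) => Promise<void>;

// One queue per conversation: each delivery starts only after the previous
// one settled, so arrival order at the bot matches enqueue order.
class DeliveryQueue {
  private webhookChain: Promise<void> = Promise.resolve();
  private readonly deliverToBot: Deliver;

  constructor(deliverToBot: Deliver) {
    this.deliverToBot = deliverToBot;
  }

  enqueue(messageId: string): Promise<void> {
    this.webhookChain = this.webhookChain
      .catch(() => {}) // a prior failure must not block later deliveries
      .then(() => this.deliverToBot(messageId));
    return this.webhookChain;
  }
}
```

Callers still see a rejection for their own delivery, while the next `enqueue` recovers from it before running.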
Both the inline reply-preview card on each MessageBubble and the compose-area ReplyPreview read the parent message from the live cache, so when the parent is soft-deleted the placeholder bubble swaps to '[deleted message]' but the quoted text stuck around. Checks `replyToMessage.deleted` and renders a muted 'original message deleted' instead.
…ad events by member
The message-create fan-out used to skip the sender when emitting
conversation.{read,activity}, so their own sidebar row's lastActivityAt
and lastReadAt stayed stale across all their tabs until a refetch.
Server: always emit conversation.activity to every human member, and
separately emit conversation.read to anyone who has 'read' the message
— the sender (authored it) or a recipient whose WS subscribed to the
conversation context and delivered message.created. Both events now
fire for the sender, so every tab (including ones viewing a different
conversation) gets the sidebar bump via the instance-level context.
Client: filter onConversationRead by e.memberId === currentUserId so
Alice's read marker never leaks into Bob's sidebar row — a latent bug
that became more visible now that .read fires for the sender too.
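A small sketch of the client-side filter, assuming hypothetical event and row shapes (only `memberId` and `currentUserId` are named in the commit message):

```typescript
// Hypothetical shapes for illustration.
type ConversationReadEvent = { conversationId: string; memberId: string; lastReadAt: number };
type SidebarRow = { conversationId: string; lastReadAt: number };

// Apply a conversation.read event only when it belongs to the current user.
function applyConversationRead(
  rows: SidebarRow[],
  e: ConversationReadEvent,
  currentUserId: string,
): SidebarRow[] {
  if (e.memberId !== currentUserId) return rows; // someone else's read marker
  return rows.map(row =>
    row.conversationId === e.conversationId ? { ...row, lastReadAt: e.lastReadAt } : row,
  );
}
```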
Both the header rename field and the sidebar rename field now enforce the server's 200-char cap client-side via the shared CONVERSATION_TITLE_MAX_CHARS constant, so the server 400 can't happen from a user typing too much in the header (which previously had no maxLength).
… them float
The typing sender used `void client.sendTyping(...)`, which discarded
the promise — a failed ping (offline, 5xx) then bubbled to the window
unhandledrejection handler and showed up as a console error. Typing is
best-effort; swallow silently with `.catch(() => {})`.
The card-link pattern in ConversationItem left the content row covering the absolutely-positioned Link: the inner flex container had 'relative' without 'pointer-events-none', so clicks on the title landed on the flex div (which has no handler) instead of the Link underneath. Mark the flex container pointer-events-none while the overlay is showing and opt the controls column back in with pointer-events-auto so the kebab still works. Also pin the kebab visible and the time span hidden while the menu is open, so the row doesn't reflow back to its unhovered layout underneath the still-open dropdown when the mouse leaves.
The server previously accepted `" "` as a valid conversation title (create/rename/bot-create) and as valid message text (POST/PATCH), persisting the whitespace verbatim. Titles rendered as empty-looking sidebar rows; messages rendered as empty bubbles. Introduce a shared `trimmedNonEmptyString` schema helper that trims and requires at least one non-whitespace char, and apply it to all title fields and to the text content block (reused by create + edit message). Control characters remain untouched by design; the scope is empty input only.
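The helper in the PR is a zod schema; the dependency-free stand-in below only illustrates the same rule (trim, then require at least one character, then enforce a cap). The `ParseResult` type and the `maxChars` parameter are assumptions for this sketch.

```typescript
type ParseResult = { ok: true; value: string } | { ok: false; error: string };

// Plain-function stand-in for the zod helper: trim, then validate.
function trimmedNonEmptyString(input: unknown, maxChars: number): ParseResult {
  if (typeof input !== "string") return { ok: false, error: "expected a string" };
  const value = input.trim();
  if (value.length === 0) {
    return { ok: false, error: "must contain at least one non-whitespace character" };
  }
  if (value.length > maxChars) {
    return { ok: false, error: `must be at most ${maxChars} characters` };
  }
  return { ok: true, value };
}
```

With the 200-character title cap mentioned earlier, `trimmedNonEmptyString(" ", 200)` is rejected and `trimmedNonEmptyString(" Plans ", 200)` yields `"Plans"`.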
…ersation
Leaving the currently-open conversation left its row visible in the sidebar until a full page reload. The mutation's onSuccess invalidation fires concurrently with the router.push away from the now-inaccessible conversation, and the cached list never got repopulated without the conversation. Patch the react-query conversations cache optimistically up-front, mirroring the onConversationLeft WebSocket handler used for other members, so the row disappears before the navigation starts. Snapshot and restore on mutation error so the user can retry.
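The snapshot-and-restore pattern, sketched without react-query. The `cache` object, `Conversation` shape, and `leaveRequest` callback are hypothetical stand-ins for the query cache and the leave mutation.

```typescript
type Conversation = { id: string; title: string };

// Remove the row up-front; put the snapshot back if the request fails.
async function leaveConversation(
  cache: { conversations: Conversation[] },
  id: string,
  leaveRequest: (id: string) => Promise<void>,
): Promise<void> {
  const snapshot = cache.conversations;
  cache.conversations = snapshot.filter(c => c.id !== id); // optimistic patch
  try {
    await leaveRequest(id);
  } catch (err) {
    cache.conversations = snapshot; // restore so the user can retry
    throw err;
  }
}
```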
Pressing Enter on an edit without typing anything still fired the edit mutation, bumping updatedAt and showing the '(edited)' label on a message whose content had not changed. Short-circuit the save when the trimmed new text equals the trimmed current text so no-op edits leave the message untouched and avoid unnecessary server traffic.
'Aria is typing…' occasionally lingered for 1-3s above a newly-arrived message because the UI only cleared typing state on an explicit typing.stopped event or a 5s timeout, and the stopped event sometimes arrives late (or is lost) relative to message.created. For human senders, a fresh message.created event is itself a deterministic end-of-typing signal, so hook the cache updater to clear typing state for the sender when their message lands. Bots are deliberately excluded because their streaming uses message.created for every token chunk and relies on typing.stopped to signal stream completion; clearing on bot messages would hide the indicator mid-stream.
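A minimal sketch of that rule, assuming a hypothetical event shape with a `kind` discriminator on the sender and a plain `Set` of member ids currently shown as typing:

```typescript
// Hypothetical shapes for illustration.
type Sender = { id: string; kind: "human" | "bot" };
type MessageCreated = { conversationId: string; sender: Sender };

// Clear typing state when a human's message lands; leave bots alone because
// their streams emit message.created repeatedly before typing.stopped.
function onMessageCreated(typing: Set<string>, e: MessageCreated): void {
  if (e.sender.kind === "human") {
    typing.delete(e.sender.id);
  }
}
```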
…aders
The small blue dot next to a conversation with unread messages had no accessible name, so screen-reader users could not perceive the unread state. Mark the dot as decorative (aria-hidden) and add an sr-only sibling with the text 'Unread' so assistive technology announces it while sighted users keep the compact visual affordance.
Commit 422005e made GASTOWN_URL, KILO_CHAT_URL, and EVENT_SERVICE_URL throw at import time when the corresponding NEXT_PUBLIC_* variables are missing. CI tests load constants.ts transitively and crashed at module load. Set mock .test.invalid URLs so imports succeed; the invalid TLD ensures any accidental network call fails loudly.
- Refactor KiloChatLayout to take sandboxId+basePath+noInstanceRedirect directly; drop InstanceSwitcher and the instances[] array prop.
- Thread isInstanceLoading through context so the index page no longer redirects to "new" while the status query is still in flight.
- Conversation and index pages now read basePath/noInstanceRedirect from KiloChatContext, making them URL-tree-agnostic.
- Mirror the kilo-chat route tree under /organizations/[id]/claw/kilo-chat via thin layout wiring useOrgKiloClawStatus, plus one-line re-exports for the index and conversation pages.
- Add the Kilo Chat item to OrganizationAppSidebar, gated on the kilo-chat-feature flag (or development) like the personal sidebar.
…StatusDO
Splits the single bot-status route into separate bot-status and
conversation-status routes, backed by a new SandboxStatusDO (Drizzle over
Durable SQLite) that persists last-write-wins snapshots keyed by sandbox
(bot) and conversation.
Worker:
- Drizzle schema + migration for bot_status (singleton) and conversation_status
- SandboxStatusDO with monotonic upserts (setWhere at < excluded.at), destroy
via row DELETE (not deleteAll, to preserve schema)
- Split POST routes: /bot/v1/sandboxes/:id/bot-status and
/bot/v1/sandboxes/:id/conversations/:cid/conversation-status
- GET routes for persisted reads with owner/membership checks
- DO wipe on sandbox destroy
- pushBotStatus / pushConversationStatus helpers resolve owner, persist,
and emit events via event-service
Plugin (kiloclaw):
- Post-turn payload switched to sendConversationStatus
- bot-status request schema slimmed to { online, at }
Web:
- useBotStatus and useConversationStatus hooks with persisted seed +
live WS updates, monotonic client-side guard
- Status components wired to the hooks; in-memory presence/context maps
removed
- kilo-chat supported under org URL
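The monotonic last-write-wins rule used on both the server (`at < excluded.at`) and the client guard can be sketched over an in-memory map. The `Map` store and the function name are assumptions; the worker does this in SQL through Drizzle.

```typescript
// Snapshot shape follows the slimmed bot-status schema: { online, at }.
type StatusSnapshot = { online: boolean; at: number };

// Accept a snapshot only if it is strictly newer than the stored one.
// Returns true when the store was updated.
function upsertMonotonic(
  store: Map<string, StatusSnapshot>,
  key: string,
  incoming: StatusSnapshot,
): boolean {
  const existing = store.get(key);
  if (existing !== undefined && existing.at >= incoming.at) {
    return false; // stale or duplicate write: keep what we have
  }
  store.set(key, incoming);
  return true;
}
```

Out-of-order status posts therefore cannot roll a sandbox back to an older state.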
The controller had a relay for /_kilo/kilo-chat/bot-status but no route for /_kilo/kilo-chat/conversations/:conversationId/conversation-status, so the plugin's sendConversationStatus POST fell through to the proxy's catch-all and returned 401 controller_route_unavailable. ContextUsageRing never received post-turn updates. Register the missing relay.
The pushEvent loop set delivered=true before calling ws.send(), so a hibernation-stale handle whose send() throws still counted as delivered. Move the assignment into the success branch so the return value reflects what actually reached a live socket.
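A stripped-down sketch of the corrected loop. `SocketLike` is a hypothetical stand-in for the WebSocket handles held by the Durable Object, and the signature is simplified.

```typescript
type SocketLike = { send(data: string): void };

// Returns true only if at least one send() completed without throwing.
function pushEvent(sockets: SocketLike[], payload: string): boolean {
  let delivered = false;
  for (const ws of sockets) {
    try {
      ws.send(payload);
      delivered = true; // set after send() succeeded, not before
    } catch {
      // Stale handle (for example after hibernation): skip it.
    }
  }
  return delivered;
}
```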
The dual-callback then(fulfilled, rejected) form, where both branches
called the same deliverToBot, obscured the intent ("recover from prior
failures, then deliver this message in arrival order"). Replace with
.catch(() => {}).then(() => deliverToBot(...)). Same semantics, reads at
a glance.
WebSocket upgrades don't preflight, so the CORS list only governs the 426 fallback for non-upgrade GETs. Local dev hits its own wrangler dev event-service on :8809 — there's no scenario where a localhost browser should be talking to prod events.kiloapps.io.
ConversationItem hardcoded href to /claw/kilo-chat/<id>, so clicking a row in the org sidebar (basePath /organizations/<id>/claw/kilo-chat) redirected to the personal URL. Read basePath from KiloChatContext, the same source the new-conversation and leave-redirect flows already use.
…at-plugin
# Conflicts:
# pnpm-lock.yaml
Move KiloChatContext and useKiloChatContext into kiloChatContext.ts so ConversationItem no longer imports from KiloChatLayout (which imports ConversationList → ConversationItem).
    }),
  ];
  return () => offs.forEach(off => off());
}, [kiloChatClient, queryClient]);
Contributor
WARNING: Read events can close over an empty user id
This effect uses currentUserId inside onConversationRead() but does not include it in the dependency list. The parent passes user?.id ?? '', so the first render can register handlers with an empty id before useUser() resolves; after that, .read events for the real current user are ignored and sidebar unread state can stay stale until a reconnect/refetch.
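In React terms, listing `currentUserId` in the dependency array makes the effect tear down and re-register its handlers when the id changes. A framework-free sketch of that re-subscription, where the `Emitter` class and `subscribeReads` are hypothetical stand-ins for the event client and the effect body:

```typescript
type ReadEvent = { memberId: string };
type Off = () => void;

// Tiny stand-in for the event client's on()/off contract.
class Emitter {
  private handlers = new Set<(e: ReadEvent) => void>();
  on(handler: (e: ReadEvent) => void): Off {
    this.handlers.add(handler);
    return () => { this.handlers.delete(handler); };
  }
  emit(e: ReadEvent): void {
    this.handlers.forEach(h => h(e));
  }
}

// The handler closes over currentUserId, so it must be re-created
// whenever that value changes.
function subscribeReads(emitter: Emitter, currentUserId: string, onOwnRead: () => void): Off {
  return emitter.on(e => {
    if (e.memberId === currentUserId) onOwnRead();
  });
}
```

A handler registered with `''` ignores every real read event; calling the returned `off` and subscribing again with the resolved id is what the dependency-array fix achieves.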
Summary
Adds a full-stack real-time chat system that lets users talk to their KiloClaw bot instances through a web UI. The system spans 5 new services/packages and touches the kiloclaw controller, CF worker, and Dockerfile.
New services & packages:
- services/kilo-chat: Cloudflare Worker chat backend (Hono + Durable Objects)
- services/event-service: Cloudflare Worker real-time event relay (WebSocket + Durable Objects)
- packages/kilo-chat: TypeScript client SDK for the chat service
- packages/event-service: TypeScript client SDK for the event service
- services/kiloclaw/plugins/kilo-chat: OpenClaw plugin giving the bot chat capabilities
Modified:
- services/kiloclaw/controller: New kilo-chat proxy routes + bootstrap config
- services/kiloclaw/src: Instance config propagation, gateway env injection
- services/kiloclaw/Dockerfile: Plugin bundling
- apps/web: Chat UI (Next.js app router pages + hooks)
Architecture Overview
End-to-End HTTP Flow
User sends a message → bot receives it
Bot responds → user sees it in real-time
Approval flow (tool-use gating)
Key Design Decisions
- ConversationDO holds shared message/member state (one per conversation); MembershipDO denormalizes per-user conversation metadata (activity timestamps, read position) so listing conversations doesn't require reading every ConversationDO.
- …deriveGatewayToken(sandboxId, secret)): no per-instance secret management needed.
1. Kilo-Chat Worker (services/kilo-chat)
Durable Objects
ConversationDO (one per conversation, SQLite-backed):
- conversation, members, messages, reactions
MembershipDO (one per user/bot):
- conversations table indexed by sandboxId
- removeConversationsBySandbox() for sandbox destruction
Auth Model
- /v1/* (user): kiloUserId from token
- /bot/v1/sandboxes/:sandboxId/* (bot): bot:kiloclaw:{sandboxId}
Sandbox ownership verified against Hyperdrive (PostgreSQL) at conversation creation.
API Endpoints
User-facing (/v1/):
- /v1/conversations
- /v1/conversations
- /v1/conversations/:id
- /v1/conversations/:id
- /v1/conversations/:id/leave
- /v1/conversations/:id/mark-read
- /v1/messages
- /v1/conversations/:id/messages
- /v1/messages/:id
- /v1/messages/:id
- /v1/conversations/:cid/messages/:mid/execute-action
- /v1/messages/:id/reactions
- /v1/messages/:id/reactions
Bot-facing (/bot/v1/sandboxes/:sandboxId/):
- /messages
- /messages/:id
- /messages/:id
- /messages/:id/reactions
- /conversations/:id/typing
- /conversations/:id/typing/stop
- /conversations/:id/messages
- /conversations/:id/members
- /conversations/:id
- /conversations
- /conversations
Event Fan-Out
After mutations, the service pushes events via event-service binding:
Conversation-scoped (/kiloclaw/{sandboxId}/{conversationId}): message.created, message.updated, message.delivery_failed, typing.set, typing.stop
Instance-scoped (/kiloclaw/{sandboxId}): conversation.created, conversation.renamed, conversation.activity, conversation.read, conversation.left
Users currently in a conversation are auto-marked read; others receive conversation.activity for unread indication.
2. Event Service (services/event-service + packages/event-service)
Architecture
CF Worker + two Durable Objects:
Connection Protocol
Client SDK (packages/event-service)
EventServiceClient provides:
- connect() / disconnect() with exponential backoff reconnection (up to 30s)
- subscribe(contexts) / unsubscribe(contexts) with queued replay on reconnect
- on(event, handler) returning unsubscribe function
- onReconnect(handler) for cache invalidation on reconnect
3. KiloClaw Changes (services/kiloclaw)
Controller: Kilo-Chat Proxy Routes
New controller/src/routes/kilo-chat.ts: 11 REST routes proxying plugin requests to kilo-chat Worker:
- {kiloChatBaseUrl}/bot/v1/sandboxes/{sandboxId}/* with same bearer token
Bootstrap & Config Writer
- bootstrap.ts: Passes KILOCHAT_BASE_URL to container env; registers routes when both KILOCLAW_SANDBOX_ID and KILOCHAT_BASE_URL are present
- config-writer.ts: Always enables kilo-chat channel (_configured = true); adds plugin path /usr/local/lib/node_modules/@kiloclaw/kilo-chat to plugins.load.paths
- exec-approvals.json defaults (security policy, ask mode, askFallback) on first boot
CF Worker DO Changes
- InstanceConfig schema: adds vectorMemoryEnabled, vectorMemoryModel, dreamingEnabled
- buildEnvVars(): injects KILOCHAT_BASE_URL as plaintext binding; gateway token derived per-sandbox (no stored secrets)
Dockerfile
- kilo-chat-builder stage: copies plugin source, runs npm install + npm pack, produces tarball
- npm install -g the tarball to /usr/local/lib/node_modules/@kiloclaw/kilo-chat
- plugins/kilo-chat/ directory for content-addressed image tagging
4. OpenClaw Plugin (services/kiloclaw/plugins/kilo-chat)
Plugin System Integration
- openclaw.plugin.json: channel-kind plugin, ID kilo-chat, markdown support
- createChatChannelPlugin with actions, webhooks, approval capability, and preview streaming
Actions (Bot Capabilities)
- read: limit?, before?
- member-info
- react: emoji, remove?
- edit: message
- delete
- renameGroup: name
- channel-list: limit?, offset?
- channel-create: name?, additionalMembers?
Webhook Handler
Single route at /plugins/kilo-chat/webhook (auth: 'plugin'):
- message.created: Parses inbound message → resolves session/agent route → dispatches to agent
- action.executed: Parses approval button click → calls resolveApprovalOverGateway
Preview Streaming
PreviewStream class (500ms throttle window):
Approval Delivery
createKiloChatApprovalCapability():
- agent:{agentId}:direct:{conversationId}
- action.executed webhook → resolveApprovalOverGateway
5. Web UI (apps/web)
Route Structure
- /claw/kilo-chat: Layout + conversation list
- /claw/kilo-chat/[conversationId]: Conversation detail (dynamic route)
- kilo-chat-feature in sidebar
Real-Time Architecture
- KiloChatLayout creates singleton EventServiceClient + KiloChatClient
- useMessageCacheUpdater()
- clientId correlation
Token Flow
Key UI Features
6. Shared Packages & Infrastructure
kilo-chat Client SDK (packages/kilo-chat)
KiloChatClient wraps all HTTP calls with Bearer JWT auth:
- KiloChatApiError(status, body) for non-2xx responses
- ulidToTimestamp(), contentBlocksToText()
DB Migrations
- bot_request_cloud_agent_sessions table (session tracking for bot requests)
- blocked_at + blocked_by_kilo_user_id columns on kilocode_users
Local Dev
- kilo-chat added to kiloclaw service group in dev/local/services.ts
- ['kiloclaw', 'event-service']
- NEXT_PUBLIC_KILO_CHAT_URL env var for web app → chat service connection
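For reference, the `ulidToTimestamp()` helper listed under the client SDK most likely relies on the standard ULID layout, where the first 10 Crockford base32 characters encode the creation time in milliseconds. The decoder below is a sketch written from the ULID spec, not the package's actual implementation.

```typescript
// Crockford base32 alphabet used by ULID (no I, L, O, U).
const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Decode the millisecond timestamp from the first 10 characters of a ULID.
function ulidToTimestamp(ulid: string): number {
  if (ulid.length !== 26) throw new RangeError("a ULID has 26 characters");
  let ms = 0;
  for (const ch of ulid.slice(0, 10).toUpperCase()) {
    const digit = CROCKFORD.indexOf(ch);
    if (digit === -1) throw new RangeError(`invalid ULID character: ${ch}`);
    ms = ms * 32 + digit; // at most 50 bits, safely below 2^53
  }
  return ms;
}
```

Because the timestamp is the most significant part of the id, ULIDs assigned in commit order also sort in commit order, which is what the ordering test for webhook delivery depends on.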