Ask-and-suspend: correlated human-in-the-loop questions over integrations (ask_human) #87

@kjgbot

Motivation

The integration pipeline is currently a one-way event firehose: external event → relayfile adapter → mount sync → IntegrationEventBridge → broker.sendMessage(mode: 'steer'). Agents get notified about Slack activity, but cannot delegate a decision to a human and reliably receive the answer.

Today, if an agent posts a question to Slack (via the writeback/draft path) and a human replies:

  • the reply arrives as a generic integration event, potentially fanned out to all project agents;
  • nothing correlates it to the question — the agent must infer "this is the answer to what I asked 40 minutes ago" amid unrelated channel traffic;
  • there is no first-class "waiting on a human" state, so agents either proceed on assumptions or strand a question in a channel hoping to be re-prompted.

Prior art: claude-slack-bridge's ask_on_slack MCP tool — agent posts a question, blocks until a reply arrives in the thread (matched by thread_ts), then resumes. The idea worth adopting is the correlation contract, not the blocking mechanics (their model is a throwaway claude -p process per session, so blocking is free; for us a tool call held open for hours pins the agent's turn).

Proposal: ask-and-suspend with thread correlation

The agent asks, ends its turn, and the correlated reply steers it back awake — reusing the existing steer-injection mechanism. Both halves of the loop already exist; what's missing is thin:

  1. a correlation envelope (question ID ↔ provider thread),
  2. a routing rule: reply-in-tracked-thread → deliver only to the asking agent, tagged as the answer,
  3. an agent-facing primitive (ask_human) plus prompt convention.

Flow

agent calls ask_human(question, target)            e.g. target = slack channel C123
  ↓
bridge writes question via existing writeback path (create.json draft)
  with embedded correlation key (questionId)
  ↓ adapter posts to Slack; posted message lands back in the mount
bridge captures thread root ts ↔ questionId; registers pending question
  ↓ agent ends its turn ("waiting on human" state, surfaced in UI)
human replies in the Slack thread (any time later)
  ↓ reply event enters the normal inbound path
bridge matches event's thread_ts against pending questions
  ↓ match → bypass normal scope fan-out
broker.sendMessage(projectId, {
  to: askingAgent, from: 'integration', mode: 'steer',
  text: formatAnswerMessage(...),
  data: { kind: 'integration-answer', answerTo: questionId, ... }
})
  ↓
agent resumes with the answer in context
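
To make the "ask, then end the turn" half of the flow concrete, here is a minimal sketch of what the ask_human handler could look like. Everything in it is an assumption for illustration: the AskDeps dependency names, the AskResult shape, and the 'waiting-on-human' status string do not exist in the codebase. The point it shows is that the call returns immediately with a questionId instead of holding the tool call open.

```typescript
// Hypothetical sketch: dependency names and the result shape are assumptions.
interface AskDeps {
  newId: () => string
  writeDraft: (channel: string, text: string, questionId: string) => void
  registerPending: (questionId: string, agentName: string, channel: string, question: string) => void
}

interface AskResult {
  questionId: string
  status: 'waiting-on-human'
}

function askHuman(deps: AskDeps, agentName: string, channel: string, question: string): AskResult {
  const questionId = deps.newId()
  // Register before writing the draft so the posted message cannot be
  // observed in the mount before the registry knows about the question.
  deps.registerPending(questionId, agentName, channel, question)
  deps.writeDraft(channel, question, questionId)
  // Return right away; the answer arrives later as a steer message.
  return { questionId, status: 'waiting-on-human' }
}
```

Registering before writing the draft is a design choice in this sketch, not a requirement stated above; it avoids a race where the adapter echo lands first.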

Components

1. Pending-question registry (new, owned by IntegrationEventBridge or a sibling module)

interface PendingQuestion {
  questionId: string
  projectId: string
  agentName: string           // asking agent — sole delivery target for the answer
  provider: string            // 'slack'
  channel: string             // e.g. C123
  threadTs?: string           // resolved once the posted question is observed in the mount
  question: string
  askedAt: number
  expiresAt?: number
  status: 'posting' | 'pending' | 'answered' | 'expired' | 'cancelled'
}

Persisted (disk or relayfile) so questions survive pear restarts — answers may arrive hours later.
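
A minimal sketch of the registry, assuming an in-memory map with a JSON snapshot that the caller writes to disk or relayfile. The class name PendingQuestionRegistry and its methods (register, resolveThread, findByThread, snapshot, restore) are hypothetical; only the PendingQuestion fields come from the interface above.

```typescript
// Hypothetical sketch: class and method names are assumptions.
type QuestionStatus = 'posting' | 'pending' | 'answered' | 'expired' | 'cancelled'

interface PendingQuestion {
  questionId: string
  projectId: string
  agentName: string
  provider: string
  channel: string
  threadTs?: string
  question: string
  askedAt: number
  expiresAt?: number
  status: QuestionStatus
}

class PendingQuestionRegistry {
  private byId = new Map<string, PendingQuestion>()

  // Record the question when the draft is written; threadTs is not known yet.
  register(q: Omit<PendingQuestion, 'status' | 'threadTs'>): PendingQuestion {
    const entry: PendingQuestion = { ...q, status: 'posting' }
    this.byId.set(entry.questionId, entry)
    return entry
  }

  // Called once the posted question is observed in the mount.
  resolveThread(questionId: string, threadTs: string): boolean {
    const entry = this.byId.get(questionId)
    if (!entry || entry.status !== 'posting') return false
    entry.threadTs = threadTs
    entry.status = 'pending'
    return true
  }

  // Look up a tracked question by provider thread.
  findByThread(provider: string, channel: string, threadTs: string): PendingQuestion | undefined {
    return Array.from(this.byId.values()).find(
      q => q.provider === provider && q.channel === channel && q.threadTs === threadTs
    )
  }

  // Serialize all entries; where the string is stored is up to the caller.
  snapshot(): string {
    return JSON.stringify(Array.from(this.byId.values()))
  }

  // Rebuild a registry from a snapshot, e.g. after a restart.
  static restore(json: string): PendingQuestionRegistry {
    const registry = new PendingQuestionRegistry()
    for (const q of JSON.parse(json) as PendingQuestion[]) {
      registry.byId.set(q.questionId, q)
    }
    return registry
  }
}
```

A linear scan in findByThread is assumed to be acceptable because the number of open questions per project should stay small; an index keyed by channel and thread would be the obvious next step.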

2. Outbound: ask — reuse the writeback/draft convention (src/main/integration-event-bridge.ts:660 already exempts draft@*/create.json files from self-notification). The draft embeds questionId as a correlation key. Open question / adapter requirement: the relayfile Slack adapter must echo the correlation key in the created-message record (or the bridge falls back to matching the posted message body) so the bridge can learn the resulting thread_ts.
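
The two ways of learning thread_ts can be sketched as follows. The shapes here are assumptions: the real create.json schema and the adapter's created-message record are not specified in this issue, so QuestionDraft, PostedMessageRecord, and the correlationKey field are hypothetical stand-ins.

```typescript
// Hypothetical shapes: not the actual create.json or adapter record schema.
interface QuestionDraft {
  channel: string
  text: string
  correlationKey: string   // assumed field carrying the questionId
}

interface PostedMessageRecord {
  channel: string
  ts: string
  text: string
  correlationKey?: string  // present only if the adapter echoes it
}

function buildQuestionDraft(questionId: string, channel: string, question: string): QuestionDraft {
  return { channel, text: question, correlationKey: questionId }
}

// Returns the thread root ts for the draft, or undefined if the record is unrelated.
function resolveThreadTs(draft: QuestionDraft, record: PostedMessageRecord): string | undefined {
  if (record.channel !== draft.channel) return undefined
  // Preferred path: the adapter echoed the correlation key.
  if (record.correlationKey !== undefined) {
    return record.correlationKey === draft.correlationKey ? record.ts : undefined
  }
  // Fallback: match on the message body. This can collide if two agents
  // post identical text to the same channel.
  return record.text.trim() === draft.text.trim() ? record.ts : undefined
}
```

The body-match fallback is ambiguous when identical questions are in flight, which is one reason the key echo is the preferable adapter behaviour.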

3. Inbound: answer routing — in the event-routing path (integration-event-bridge.ts target resolution, ~:882-921), before normal scope fan-out:

  • parse the event's channel + thread_ts;
  • if it matches a pending question → deliver only to agentName with data.kind = 'integration-answer', data.answerTo = questionId; mark answered; skip normal fan-out for this event;
  • reuse the stale-target filtering from 082a418 — if the asking agent is gone, fall back to project-channel delivery with a note that the original asker is offline.
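
The three routing rules above, sketched as a pure function that decides where an event goes before any delivery happens. The types are simplified stand-ins rather than the bridge's own, and routeInbound, Route, and the isAgentAlive callback are hypothetical names.

```typescript
// Hypothetical sketch; types are simplified stand-ins for the bridge's own.
interface TrackedQuestion {
  questionId: string
  agentName: string
  channel: string
  threadTs: string
  status: 'pending' | 'answered'
}

interface InboundEvent {
  channel: string
  ts: string
  threadTs?: string
  text: string
}

type Route =
  | { kind: 'answer'; to: string; answerTo: string }
  | { kind: 'answer-fallback'; answerTo: string; note: string }
  | { kind: 'fan-out' }

function routeInbound(
  event: InboundEvent,
  questions: TrackedQuestion[],
  isAgentAlive: (agentName: string) => boolean
): Route {
  // Only threaded events can match a tracked question.
  const match = event.threadTs === undefined
    ? undefined
    : questions.find(q => q.channel === event.channel && q.threadTs === event.threadTs)

  // No tracked thread: hand the event to normal scope fan-out.
  if (!match) return { kind: 'fan-out' }

  match.status = 'answered'

  // Asking agent is gone: deliver to the project channel with a note.
  if (!isAgentAlive(match.agentName)) {
    return {
      kind: 'answer-fallback',
      answerTo: match.questionId,
      note: `original asker ${match.agentName} is offline`
    }
  }

  // Normal case: deliver only to the asking agent.
  return { kind: 'answer', to: match.agentName, answerTo: match.questionId }
}
```

Returning a decision instead of calling the broker directly keeps the matching logic testable on its own; the caller would translate each Route into the corresponding broker.sendMessage call.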

4. Agent-facing primitive: an ask_human tool (relaycast MCP or harness tool) with the convention: call it, then end your turn; you will be steered when the answer arrives. Message text for the answer should explicitly frame it: "Answer to your question ('<question>'): …".
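
One possible shape for the answer message, as a sketch. The issue only names formatAnswerMessage(...) without a signature, so the parameters below are assumptions, as is quoting a truncated excerpt of the original question inside the framing text. AnswerSteer and buildAnswerSteer are hypothetical; their fields mirror the broker.sendMessage payload in the Flow section.

```typescript
// Hypothetical signature: the issue only names formatAnswerMessage(...).
function formatAnswerMessage(question: string, answer: string, maxQuestionChars = 80): string {
  // Quote a bounded excerpt so a long question does not bury the answer.
  const excerpt = question.length > maxQuestionChars
    ? question.slice(0, maxQuestionChars - 1) + '…'
    : question
  return `Answer to your question ('${excerpt}'): ${answer}`
}

// Mirrors the payload fields shown in the Flow section.
interface AnswerSteer {
  to: string
  from: 'integration'
  mode: 'steer'
  text: string
  data: { kind: 'integration-answer'; answerTo: string }
}

function buildAnswerSteer(agentName: string, questionId: string, question: string, answer: string): AnswerSteer {
  return {
    to: agentName,
    from: 'integration',
    mode: 'steer',
    text: formatAnswerMessage(question, answer),
    data: { kind: 'integration-answer', answerTo: questionId }
  }
}
```

Carrying answerTo in data as well as framing the text gives the agent both a machine-readable link and a human-readable reminder of what it asked.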

5. Lifecycle & edge cases

  • Multiple replies in the thread: deliver each as a follow-up steer (humans clarify in multiple messages); only the first flips status to answered.
  • Timeout: optional expiresAt; on expiry, steer the agent with kind: 'integration-answer-timeout' so it can proceed with a stated assumption instead of waiting forever.
  • Duplicate delivery: inbound events can surface via both the remote stream and the local FSWatcher — answer delivery must dedupe by (questionId, reply ts), consistent with existing event dedup.
  • Cancellation: agent (or user via UI) can cancel a pending question; bridge optionally posts a "no longer needed" reply to the thread.
  • Thread noise: replies from the bot/agent itself in the tracked thread are ignored (same self-loop guard as drafts).
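
The reply-handling rules above (self-loop guard, dedupe by (questionId, reply ts), first reply flips status, later replies are follow-ups) and the timeout sweep can be sketched together. All names are hypothetical. One assumption goes beyond the text: replies to cancelled or expired questions are ignored here, which the issue does not specify.

```typescript
// Hypothetical sketch of the lifecycle rules; names are illustrative.
interface LifecycleQuestion {
  questionId: string
  status: 'pending' | 'answered' | 'expired' | 'cancelled'
  expiresAt?: number
}

class AnswerDeduper {
  private seen = new Set<string>()

  // Returns true only the first time a (questionId, reply ts) pair is observed.
  firstSighting(questionId: string, replyTs: string): boolean {
    const key = `${questionId}:${replyTs}`
    if (this.seen.has(key)) return false
    this.seen.add(key)
    return true
  }
}

type Delivery = 'answer' | 'follow-up' | 'ignore'

function classifyReply(
  q: LifecycleQuestion,
  replyTs: string,
  fromSelf: boolean,
  dedupe: AnswerDeduper
): Delivery {
  // Thread noise: the bot's own replies never count as answers.
  if (fromSelf) return 'ignore'
  // Assumption: closed questions no longer accept replies.
  if (q.status === 'cancelled' || q.status === 'expired') return 'ignore'
  // The same reply can arrive via the remote stream and the local watcher.
  if (!dedupe.firstSighting(q.questionId, replyTs)) return 'ignore'
  if (q.status === 'pending') {
    q.status = 'answered'
    return 'answer'
  }
  // Already answered: deliver as a clarifying follow-up steer.
  return 'follow-up'
}

// Marks overdue pending questions as expired and returns them so the caller
// can steer each asking agent with kind: 'integration-answer-timeout'.
function expireDue(questions: LifecycleQuestion[], now: number): LifecycleQuestion[] {
  const due = questions.filter(
    q => q.status === 'pending' && q.expiresAt !== undefined && q.expiresAt <= now
  )
  due.forEach(q => { q.status = 'expired' })
  return due
}
```

The dedupe set in this sketch grows without bound; a real implementation would presumably drop entries once a question is closed, or reuse whatever bounded structure the existing event dedup uses.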

UI

  • Surface "waiting on human" agent state in the project view, with the question text and a deep link to the Slack thread.
  • List pending questions per project (answer/cancel from pear directly as a fallback when nobody answers in Slack).

Out of scope (for this issue)

  • True blocking tool calls (held-open turns) — rejected; ask-and-suspend fits the steer architecture.
  • Providers without threading (extend later; correlation key can fall back to a reply-prefix convention).
  • Structured answers (buttons/forms in Slack via Block Kit) — natural follow-up once plain-text answers work.

Rough implementation order

  1. Pending-question registry + persistence.
  2. Adapter correlation-key echo (or body-match fallback) to resolve thread_ts after posting.
  3. Inbound thread-match routing in IntegrationEventBridge (before scope fan-out) + dedupe.
  4. ask_human tool + agent prompt convention.
  5. Timeout/cancel lifecycle + UI surfacing.

Key files

  • src/main/integration-event-bridge.ts — routing, dedup, self-loop guards, delivery via broker
  • src/main/integration-mounts.ts / src/main/relayfile-mount-launcher.ts — mount/writeback path
  • src/main/integrations.ts — orchestration, scopes
  • relayfile Slack adapter (cloud) — correlation-key echo
