@tangle-network/agent-runtime

Production runtime substrate for domain agents. Owns the task lifecycle (knowledge readiness, control loop, session resume, sanitized telemetry, canonical RuntimeRunRow persistence + cost ledger) so domain repos stop inventing their own.

pnpm add @tangle-network/agent-runtime @tangle-network/agent-eval

What you get

| Entry point | When to reach for it |
| --- | --- |
| runAgentTask | Single-shot adapter-driven task with eval/verification |
| runAgentTaskStream | Streaming product loop with session resume + backends |
| startRuntimeRun | Canonical production-run row + cost ledger (NEW in 0.7.0) |
| createTraceBridge | Map RuntimeStreamEvent to agent-eval TraceEvent (NEW in 0.7.0) |
| decideKnowledgeReadiness | ready / blocked / caveat branch for routes / UI |
| createOpenAICompatibleBackend | OpenAI-compatible streaming backend (TCloud / cli-bridge) |
| createSandboxPromptBackend | Sandbox / sidecar streamPrompt clients |
| createRuntimeStreamEventCollector | Default-redacted sanitized telemetry over a stream |
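
As a rough illustration of the ready / blocked / caveat branch that decideKnowledgeReadiness feeds into routes and UI, the sketch below switches over a three-state result. The Readiness type, its field names, and describeReadiness are hypothetical stand-ins, not the package's actual types.

```typescript
// Hypothetical shape: the real return type of decideKnowledgeReadiness may differ.
type Readiness =
  | { status: 'ready' }
  | { status: 'blocked'; reason: string }
  | { status: 'caveat'; caveats: string[] }

// Map a readiness decision to what a route or UI should do next.
function describeReadiness(readiness: Readiness): string {
  switch (readiness.status) {
    case 'ready':
      return 'run the task'
    case 'blocked':
      return `refuse: ${readiness.reason}`
    case 'caveat':
      return `run with warnings: ${readiness.caveats.join('; ')}`
    default: {
      // Exhaustiveness check: this stops compiling if a new status is added.
      const unreachable: never = readiness
      return unreachable
    }
  }
}

console.log(describeReadiness({ status: 'blocked', reason: 'missing evidence' }))
```

Handling all three states in one switch keeps the route and the UI from drifting apart on what "caveat" means.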

Every public export is annotated @stable or @experimental. @stable exports do not change shape within a minor version; @experimental exports may change within a minor version and require a deliberate consumer bump.

Quickstart

import { runAgentTask } from '@tangle-network/agent-runtime'

const result = await runAgentTask({
  task: {
    id: 'review-2026-return',
    intent: 'Review the return for missing evidence',
    domain: 'tax',
  },
  adapter: {
    async observe() { return { /* domain state */ } },
    async validate({ state }) { return [/* eval results */] },
    async decide({ state }) {
      return { type: 'stop', pass: true, score: 1, reason: 'review complete' }
    },
    async act() { return undefined },
  },
})

console.log(result.status, result.runRecords)

Canonical production-run lifecycle (NEW in 0.7.0)

startRuntimeRun is the ONE abstraction for "the agent did a thing on behalf of a customer; record what it did, what it cost, how it ended." It replaces bespoke agentRuns-row helpers; legal-agent's completeProductionAgentRun + persistRuntimeRun pair is the canonical example of what it subsumes.

import {
  createRuntimeStreamEventCollector,
  runAgentTaskStream,
  startRuntimeRun,
} from '@tangle-network/agent-runtime'

// threadId, taskSpec, backend, input, db, and agentRuns come from the host app.
const telemetry = createRuntimeStreamEventCollector()

const run = startRuntimeRun({
  workspaceId: 'ws-1',
  sessionId: threadId,
  agentId: 'legal-chat-runtime',
  taskSpec,
  scenarioId: `legal-chat:${threadId}`,
  adapter: { upsert: (row) => db.insert(agentRuns).values(row) },
})

for await (const event of runAgentTaskStream({ task: taskSpec, backend, input })) {
  telemetry.onEvent(event) // collect sanitized events for persist() below
  run.observe(event) // llm_call events update the cost ledger
  if (event.type === 'final') {
    run.complete({
      status: event.status === 'completed' ? 'completed' : 'failed',
      resultSummary: event.text ?? '',
      error: event.status === 'failed' ? event.reason : undefined,
    })
  }
}

await run.persist({ runtimeEvents: telemetry.events })
console.log(run.cost()) // { tokensIn, tokensOut, costUsd, wallMs, llmCalls }

Full runnable: examples/runtime-run/.
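
To make the cost ledger concrete, here is a minimal self-contained sketch of how llm_call events could be folded into the { tokensIn, tokensOut, costUsd, wallMs, llmCalls } snapshot that run.cost() returns. The StreamEvent fields and createCostLedger are assumptions for illustration; the runtime's own event types and internals may differ.

```typescript
// Hypothetical event shape: field names are assumptions, not the runtime's types.
type StreamEvent =
  | { type: 'llm_call'; tokensIn: number; tokensOut: number; costUsd: number }
  | { type: 'text'; text: string }
  | { type: 'final'; status: 'completed' | 'failed' }

interface CostSnapshot {
  tokensIn: number
  tokensOut: number
  costUsd: number
  wallMs: number
  llmCalls: number
}

// Accumulates usage from llm_call events and ignores every other event type.
function createCostLedger(now: () => number = Date.now) {
  const startedAt = now()
  const totals = { tokensIn: 0, tokensOut: 0, costUsd: 0, llmCalls: 0 }
  return {
    observe(event: StreamEvent): void {
      if (event.type !== 'llm_call') return
      totals.tokensIn += event.tokensIn
      totals.tokensOut += event.tokensOut
      totals.costUsd += event.costUsd
      totals.llmCalls += 1
    },
    cost(): CostSnapshot {
      // Wall time is measured from ledger creation to the moment of the call.
      return { ...totals, wallMs: now() - startedAt }
    },
  }
}

// Injected clock so the example is deterministic.
let fakeNow = 1_000
const ledger = createCostLedger(() => fakeNow)
ledger.observe({ type: 'llm_call', tokensIn: 120, tokensOut: 40, costUsd: 0.25 })
ledger.observe({ type: 'text', text: 'hello' })
ledger.observe({ type: 'llm_call', tokensIn: 80, tokensOut: 10, costUsd: 0.5 })
fakeNow += 250
console.log(ledger.cost())
```

Passing the clock in as a parameter is only a testing convenience for this sketch; it says nothing about how startRuntimeRun measures wall time.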

agent-eval trace bridge (NEW in 0.7.0)

If you persist traces in agent-eval's TraceStore, map runtime stream events to TraceEvent once and stop hand-rolling the adapter in every domain repo:

import { createTraceBridge, runAgentTaskStream } from '@tangle-network/agent-runtime'

// runId, spanId, task, backend, input, and traceStore come from the host app.
const bridge = createTraceBridge({ runId, spanId })
for await (const event of runAgentTaskStream({ task, backend, input })) {
  const trace = bridge.toTraceEvent(event)
  if (trace) await traceStore.appendEvent(trace)
}
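
The loop above skips events for which toTraceEvent returns nothing. The sketch below shows one way such a mapping could behave, under assumed shapes: StreamEvent, TraceEvent, the kind names, and createBridgeSketch are all hypothetical and do not reflect the real RuntimeStreamEvent or agent-eval TraceEvent definitions.

```typescript
// Hypothetical shapes for illustration only.
type StreamEvent =
  | { type: 'text_delta'; text: string }
  | { type: 'llm_call'; model: string; tokensIn: number; tokensOut: number }
  | { type: 'final'; status: 'completed' | 'failed' }

interface TraceEvent {
  runId: string
  spanId: string
  sequence: number
  kind: 'llm_call' | 'run_end'
  payload: Record<string, unknown>
}

function createBridgeSketch(ids: { runId: string; spanId: string }) {
  let sequence = 0
  // Stamp every emitted trace event with the run/span ids and a running sequence number.
  const make = (kind: TraceEvent['kind'], payload: Record<string, unknown>): TraceEvent => ({
    runId: ids.runId,
    spanId: ids.spanId,
    sequence: sequence++,
    kind,
    payload,
  })
  return {
    toTraceEvent(event: StreamEvent): TraceEvent | null {
      switch (event.type) {
        case 'llm_call':
          return make('llm_call', {
            model: event.model,
            tokensIn: event.tokensIn,
            tokensOut: event.tokensOut,
          })
        case 'final':
          return make('run_end', { status: event.status })
        default:
          // Events with no trace counterpart (here, text deltas) are dropped.
          return null
      }
    },
  }
}

const bridge = createBridgeSketch({ runId: 'run-1', spanId: 'span-1' })
const events: StreamEvent[] = [
  { type: 'text_delta', text: 'Hi' },
  { type: 'llm_call', model: 'example-model', tokensIn: 10, tokensOut: 5 },
  { type: 'final', status: 'completed' },
]
const traces = events
  .map((event) => bridge.toTraceEvent(event))
  .filter((trace): trace is TraceEvent => trace !== null)
console.log(traces.map((trace) => `${trace.sequence}:${trace.kind}`))
```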

Error taxonomy

Every public function throws one of:

| Error | When |
| --- | --- |
| ValidationError | Caller passed invalid arguments |
| ConfigError | Required env / config missing |
| NotFoundError | A named resource does not exist |
| BackendTransportError | Backend HTTP / IPC call returned non-success |
| SessionMismatchError | Resume requested against a different backend |
| RuntimeRunStateError | RuntimeRunHandle lifecycle methods called out of order |

All extend AgentEvalError (re-exported from @tangle-network/agent-eval) and carry a stable code so cross-package handlers can pattern-match without importing the runtime.
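
Matching on code rather than on the class is what lets a handler avoid importing the runtime. A minimal sketch of such a handler follows; the code strings ('VALIDATION_ERROR' and so on), the HTTP mapping, and SketchError are hypothetical, so check the package's exported errors for the real code values.

```typescript
// Narrow an unknown thrown value to something carrying a string `code`.
function hasCode(err: unknown): err is { code: string } {
  return (
    typeof err === 'object' &&
    err !== null &&
    typeof (err as { code?: unknown }).code === 'string'
  )
}

// Hypothetical code strings: the real values are defined by the packages, not here.
function toHttpStatus(err: unknown): number {
  if (!hasCode(err)) return 500
  switch (err.code) {
    case 'VALIDATION_ERROR':
      return 400
    case 'NOT_FOUND':
      return 404
    case 'BACKEND_TRANSPORT_ERROR':
      return 502
    default:
      return 500
  }
}

// Stand-in for a runtime error, used only to exercise the handler.
class SketchError extends Error {
  code: string
  constructor(code: string, message: string) {
    super(message)
    this.code = code
  }
}

console.log(toHttpStatus(new SketchError('NOT_FOUND', 'no such session')))
console.log(toHttpStatus(new Error('plain error without a code')))
```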

Sanitized telemetry

task.intent passes through sanitized telemetry on every event, so never set it to user input. Use a fixed string describing the operation kind (e.g. "Run a chat turn", "Score a tax return"), and route user-visible content through task.inputs (redacted by default).

import { createRuntimeStreamEventCollector, runAgentTaskStream } from '@tangle-network/agent-runtime'

const telemetry = createRuntimeStreamEventCollector()
for await (const event of runAgentTaskStream({ task, backend })) {
  telemetry.onEvent(event)
}
console.log(telemetry.events, telemetry.summary())

By default the collector redacts task inputs, user answers, credential questions, control payloads, evidence IDs, task metadata, and eval details. Private diagnostics are opt-in via RuntimeTelemetryOptions.
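
The split between a fixed intent and redacted inputs can be pictured with a small stand-alone sketch. TaskEvent, sanitize, and the includeInputs flag are hypothetical names chosen for this example; they are not fields of RuntimeTelemetryOptions or the collector's actual implementation.

```typescript
// Hypothetical event shape for illustration only.
interface TaskEvent {
  type: string
  intent: string
  inputs: Record<string, unknown>
}

const REDACTED = '[redacted]'

// Redact every input value unless the caller explicitly opts in.
// The intent string is passed through untouched, which is why it must never hold user input.
function sanitize(event: TaskEvent, options: { includeInputs?: boolean } = {}): TaskEvent {
  if (options.includeInputs) return event
  const inputs: Record<string, unknown> = {}
  for (const key of Object.keys(event.inputs)) {
    inputs[key] = REDACTED
  }
  return { ...event, inputs }
}

const rawEvent: TaskEvent = {
  type: 'task_started',
  intent: 'Run a chat turn', // fixed operation kind, safe to log
  inputs: { message: 'My account number is 1234' }, // user content, redacted by default
}

const sanitized = sanitize(rawEvent)
console.log(sanitized.intent, sanitized.inputs)
```

Keeping the input keys while replacing the values is a choice made for this sketch so the event remains useful for debugging.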

Package boundaries

| Package | Owns |
| --- | --- |
| agent-runtime | Lifecycle, adapters, backends, RuntimeRunHandle, trace bridge |
| agent-eval | Control loops, readiness scoring, traces, evals, failure classes, release evidence |
| agent-knowledge | Evidence, claims, wiki pages, retrieval, knowledge bundle builders |
| Domain packages | Domain tools, policies, credentials, UI text, rubrics |

The API uses runAgentTask, not runVerticalAgentTask. domain is metadata on the task because the runtime is reusable across many kinds of agents without baking taxonomy into type names.

Examples

Runnable examples live in examples/, including examples/runtime-run/.
