Commit 20baf5d

Skobeltsyn and claude committed
docs(#1906): comparison page — Agents.KT vs LangChain / SK / AutoGen / raw MCP
`docs/comparison.md`. Substantial side-by-side organized by:

- TL;DR table covering language, typing model, composition surface, tool surface, runtime, local-first, deployment, license, maturity.
- "Where Agents.KT wins": compile-time type contracts, pure JVM, MCP as first-class native shape, single-source three-mode deployment, race-safe-by-construction.
- "Where Agents.KT loses": ecosystem (LangChain 700+ integrations), Python ML interop, multi-agent research surfaces (AutoGen's strength), maturity, vector stores / retrievers / embedders.
- Per-dimension drilldowns: typing, composition, tool surface, deployment, local-first, MCP support depth, observability, budget controls.
- "Choosing": 8 quick decision shortcuts pointing at one framework over the others.
- "What this comparison is NOT": explicitly not a benchmark, not a correctness audit, not an endorsement.
- Status notes (dated 2026-05) so the comparison is honest about which framework versions it's measuring.

Honest tone throughout: no strawmen, names concrete losses, points people at the right tool when it's not us.

Closes #1906.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 72d3cba commit 20baf5d

1 file changed

Lines changed: 148 additions & 0 deletions

docs/comparison.md

@@ -0,0 +1,148 @@
1+
# Agents.KT vs Other Agent Frameworks
2+
3+
A side-by-side for teams choosing a framework. Written with the constraint of being honest about what each ecosystem is good at — these tools largely complement each other; the right pick depends on your stack, constraints, and what you're optimizing for.
4+
5+
## TL;DR
6+
7+
| | Agents.KT | LangChain | Semantic Kernel | AutoGen | Raw MCP |
8+
|---|---|---|---|---|---|
9+
| **Language** | Kotlin (JVM) | Python (+ JS port) | C# (+ Python, Java) | Python | Any |
10+
| **Typing** | Compile-time `Agent<IN, OUT>` boundaries | Runtime `Any` | Runtime + nominal interfaces | Runtime | Wire-level JSON |
11+
| **Composition** | DSL operators (`then`, `/`, `*`, `wrap`, `branch`, `loop`), checked at compile time | LCEL `\|` operator (runtime types) | "Planners" + "Plugins" | Multi-agent conversation graph | None (you build it) |
12+
| **Tool surface** | Typed `Tool<IN, OUT>` + MCP client/server first-class | LangChain Tools, MCP via adapter | Plugins (semantic + native functions) | Functions + tool-use messages | MCP-native |
13+
| **Runtime model** | In-JVM library + MCP server + standalone JAR | In-process Python | In-process .NET / Python | In-process Python | Whatever transport you pick |
14+
| **Local-first** | Yes — Ollama default, no API key required | Yes (via Ollama integration) | Yes (via various local connectors) | Yes (via various) | Yes — transport-agnostic |
15+
| **Deployment shape** | Library / hosted MCP server / autonomous JAR — one DSL, three modes | Library | Library | Library | Wire protocol |
16+
| **License** | MIT | MIT | MIT | MIT (code), CC-BY-4.0 (docs) | MIT |
17+
| **Maturity (2026-05)** | 0.5.0: production-usable for narrow scopes; APIs still moving | 0.3.x: mature, large ecosystem | 1.x: stable, large enterprise adoption | 0.4.x: research project graduating | Spec 2025-03-26, multiple SDKs |
18+
19+
## Where Agents.KT wins
20+
21+
**Compile-time type contracts.** Every composition boundary is compiler-checked. `parseAgent then planAgent then solveAgent` fails at compile time if the output types don't chain. LangChain's `prompt | model | parser` chains the same way at the API level, but Python's duck typing means mismatches surface only at runtime. AutoGen and SK do not check type contracts between agents.
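To make the chaining claim concrete, here is a minimal sketch of how an infix `then` can force adjacent types to line up. The `Agent` interface, the `then` function, and the `Parsed` / `Plan` types below are hypothetical stand-ins written for this page, not the actual Agents.KT definitions.

```kotlin
// Hypothetical stand-in for the framework's agent type; not the real Agents.KT API.
fun interface Agent<IN, OUT> {
    fun run(input: IN): OUT
}

// Sequential composition: the middle type B must match on both sides,
// so a mismatched chain is rejected by the compiler rather than at runtime.
infix fun <A, B, C> Agent<A, B>.then(next: Agent<B, C>): Agent<A, C> {
    val first = this
    return Agent<A, C> { input -> next.run(first.run(input)) }
}

// Illustrative payload types (assumptions for this sketch).
data class Parsed(val tokens: List<String>)
data class Plan(val steps: Int)

val parseAgent = Agent<String, Parsed> { text -> Parsed(text.split(" ")) }
val planAgent = Agent<Parsed, Plan> { parsed -> Plan(parsed.tokens.size) }
val solveAgent = Agent<Plan, String> { plan -> "solved in ${plan.steps} steps" }

val pipeline: Agent<String, String> = parseAgent then planAgent then solveAgent

// val broken = parseAgent then solveAgent  // does not compile: Parsed is not Plan

fun main() {
    println(pipeline.run("a b c")) // prints "solved in 3 steps"
}
```

Uncommenting the `broken` line produces a type-mismatch error at build time, which is the property this section describes.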
22+
23+
**Pure JVM.** No Python sidecar, no `subprocess.run("python")`, no bundling a Python wheel inside your Spring app. Kotlin idioms, Gradle build, single-deploy JAR. Matters when your org's existing AI work is in Python ML pipelines but the agent layer needs to live inside the existing Kotlin/Java service.
24+
25+
**MCP as a first-class native shape.** `mcp.toolSkills()` / `promptSkills()` / `resourceSkills()` turn every MCP capability into a `Skill` consumable in the agent DSL. `McpServer.from(agent) { expose(...) }` exposes an agent as an MCP server in one line. The InternalsAgent (see `docs/internals-agent.md`) is the dogfooding example: the framework documents itself via its own MCP server.
26+
27+
**Single source for three deployment modes.** Same agent definition runs as (a) an in-process library function, (b) a hosted MCP HTTP server, (c) an autonomous JAR with picocli-shaped `--port`/`--expose` flags. The progression matches how agents earn independence; the only thing changing is the wiring around the agent, not the agent itself.
28+
29+
**Race-safe by construction.** The single-placement rule (an agent instance may participate in at most one structure) is enforced at construction time. Freeze-after-construction prevents drift via held references. `wrap`'s `effectivePrompt` parameter avoids the race of mutating shared agent state in concurrent pipelines.
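One way a single-placement check can work is sketched below, assuming an atomic "claimed" flag per agent instance. `PlaceableAgent`, `Pipeline`, and `claim` are hypothetical names for illustration; the framework's real mechanism may differ.

```kotlin
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical model of the single-placement rule; names are illustrative only.
class PlaceableAgent(val name: String) {
    private val placed = AtomicBoolean(false)

    // Atomically claims the agent for one structure; a second claim fails fast.
    fun claim(structure: String) {
        check(placed.compareAndSet(false, true)) {
            "Agent '$name' is already placed; cannot also join '$structure'"
        }
    }
}

class Pipeline(name: String, agents: List<PlaceableAgent>) {
    // Defensive copy: later changes to the caller's list do not affect the pipeline.
    val agents: List<PlaceableAgent> = agents.toList()

    init {
        this.agents.forEach { it.claim(name) }
    }
}

fun main() {
    val parser = PlaceableAgent("parser")
    Pipeline("ingest", listOf(parser))
    // Reusing the same instance in a second structure is rejected at construction.
    val second = runCatching { Pipeline("audit", listOf(parser)) }
    println(second.isFailure) // prints "true"
}
```

Because the claim is a compare-and-set, two threads constructing structures around the same instance cannot both succeed.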
30+
31+
## Where Agents.KT loses
32+
33+
**Ecosystem.** LangChain has 700+ integrations (vector stores, retrievers, embedders, agents, callbacks). Agents.KT has 3 LLM providers (Ollama, Anthropic, OpenAI) and you write the rest. If your job is "wire up 12 SaaS APIs into a prompt pipeline by Friday," LangChain is the right tool, not this one.
34+
35+
**Python AI/ML interop.** If your team already has Python notebooks for embedding generation, fine-tuning, and eval harnesses, running an Agents.KT layer next to them is a context switch. SK's Python flavor and LangChain both stay in the same language.
36+
37+
**Multi-agent research surfaces.** AutoGen's strength is the conversation graph between agents — `GroupChat`, `ConversableAgent` with custom turn-taking, complex role-play patterns. Agents.KT's `Forum` operator is the equivalent shape but with fewer pre-built conversation patterns. If you're doing research-style multi-agent debate with 5+ heterogeneous agents and need fine-grained turn control, AutoGen has more out-of-the-box.
38+
39+
**Maturity.** v0.5.0 is the streaming-runtime release; v0.6.0 (per-file IDE-skills) is in flight. APIs are stable enough to build on (we don't break things gratuitously, and any breakage gets a CHANGELOG entry and a migration note), but pre-1.0 reservations are real. LangChain has lived through more breaking-change cycles and has the scar tissue from them.
40+
41+
**Vector stores / retrievers / embedders.** Not first-class today. Implement via the `Tool<IN, OUT>` interface or wrap a Java client library (Qdrant, Pinecone, pgvector). LangChain has these as native types with retry / chunking / metadata baked in.
42+
43+
## By Dimension
44+
45+
### Typing
46+
47+
| Framework | What "typed" means |
48+
|---|---|
49+
| **Agents.KT** | `Agent<IN, OUT>` is a generic interface. `agentA then agentB` requires `agentA.OUT == agentB.IN` and the compiler enforces it. `@Generable` data classes generate JSON Schema for LLM structured-output. Internals use `Any?` for the wire (tool args are `Map<String, Any?>`); the typed shell sits over an untyped core. |
50+
| **LangChain** | Pydantic models for structured outputs and tool schemas, validated at runtime. Chains compose with the Python `\|` operator over duck-typed args; type mismatches surface as runtime errors. |
51+
| **Semantic Kernel** | Nominal types (`KernelPlugin`, `KernelFunction`). Plugin functions have typed parameters via attributes. Composition is mediated by a planner; types between plugins are inferred / coerced. |
52+
| **AutoGen** | Untyped. Agents pass messages (strings + structured payloads) via the conversation API. |
53+
| **Raw MCP** | Wire-typed via JSON Schema in `tools/list` results. Your language's type system either reads those schemas or doesn't. |
54+
55+
**Pick Agents.KT if:** "the compiler told me before I shipped" matters more than "the framework integrates X out-of-the-box."
56+
57+
### Composition
58+
59+
| Framework | How agents compose |
60+
|---|---|
61+
| **Agents.KT** | Six operators: `then` (sequential), `/` (parallel fan-out), `*` (forum), `wrap` (prompt override), `.branch {}` (typed routing), `.loop {}` (feedback). Single-placement rule = each agent in at most one structure. |
62+
| **LangChain** | LCEL: `prompt \| model \| parser` is the canonical chain. `RunnableLambda`, `RunnableMap`, `RunnableParallel` for forks. Composition is functional but types are runtime-resolved. |
63+
| **Semantic Kernel** | Planners pick a sequence of plugins to invoke. Manual orchestration via `kernel.InvokeAsync`. Less of a DSL, more an SDK. |
64+
| **AutoGen** | Conversation graph between agents. `GroupChat` manages turn-taking; you write the agent personas + rules. |
65+
| **Raw MCP** | Not applicable — MCP is a tool-call wire protocol, not a composition framework. Your runtime decides how to use the tool catalog. |
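As an illustration of how an operator such as `/` can carry types through a fan-out, the sketch below overloads `div` on a hypothetical `Agent` interface. The names and the `Pair` result shape are assumptions, not the framework's actual signatures, and the two branches run sequentially here to keep the example dependency-free.

```kotlin
// Hypothetical stand-ins; not the real Agents.KT types.
fun interface Agent<IN, OUT> {
    fun run(input: IN): OUT
}

// Fan-out: both agents receive the same input; results come back as a typed pair.
// Runs sequentially for simplicity; a real runtime would likely run them concurrently.
operator fun <IN, A, B> Agent<IN, A>.div(other: Agent<IN, B>): Agent<IN, Pair<A, B>> {
    val left = this
    return Agent<IN, Pair<A, B>> { input -> left.run(input) to other.run(input) }
}

val wordCount = Agent<String, Int> { text -> text.split(" ").size }
val shout = Agent<String, String> { text -> text.uppercase() }

// The result type is inferred from both branches and checked by the compiler.
val both: Agent<String, Pair<Int, String>> = wordCount / shout

fun main() {
    println(both.run("hello world")) // prints "(2, HELLO WORLD)"
}
```

Both operands must accept the same `IN` type, so fanning out to agents with incompatible inputs is a compile error in this sketch.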
66+
67+
### Tool surface
68+
69+
| Framework | What tools look like |
70+
|---|---|
71+
| **Agents.KT** | `tool<Args, Result>("name") { args -> ... }` returns a `Tool<Args, Result>` handle. `skill.tools(addTool, multiplyTool)` is compile-time-checked (typed `Tool` refs, not strings). MCP servers are reachable via `mcp { server("foo") { url = ... } }`. |
72+
| **LangChain** | `@tool` decorator on a Python function, or subclass `BaseTool`. Args via Pydantic. |
73+
| **Semantic Kernel** | `[KernelFunction]` attribute on a method. Parameters are described via `[Description]`. |
74+
| **AutoGen** | Function registered with `register_function(...)` in the 0.2-style API, or passed via `tools=[...]` to an `AssistantAgent` in 0.4. OpenAI function-call shape. |
75+
| **Raw MCP** | `tools/list` returns descriptors with JSON-Schema input. Your client wraps them. |
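The "typed handle over an untyped wire" idea from the Agents.KT rows can be sketched as follows. `TypedTool`, `AddArgs`, and `invokeFromWire` are hypothetical names chosen for this example; the decoding strategy shown is an assumption, not the framework's implementation.

```kotlin
// Hypothetical sketch: a typed tool handle whose wire-level arguments
// are an untyped map, decoded once at the boundary.
data class AddArgs(val a: Int, val b: Int)

class TypedTool<ARGS, RESULT>(
    val name: String,
    private val decode: (Map<String, Any?>) -> ARGS,
    private val body: (ARGS) -> RESULT,
) {
    // Wire entry point: everything past decode() is statically typed.
    fun invokeFromWire(raw: Map<String, Any?>): RESULT = body(decode(raw))
}

val addTool = TypedTool<AddArgs, Int>(
    name = "add",
    decode = { raw ->
        AddArgs(
            a = (raw["a"] as? Number)?.toInt() ?: error("missing or non-numeric 'a'"),
            b = (raw["b"] as? Number)?.toInt() ?: error("missing or non-numeric 'b'"),
        )
    },
    body = { args -> args.a + args.b },
)

fun main() {
    println(addTool.invokeFromWire(mapOf("a" to 2, "b" to 3))) // prints "5"
}
```

Code that references `addTool` holds a `TypedTool<AddArgs, Int>` value, not a string name, so a renamed or removed tool breaks the build instead of failing at call time.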
76+
77+
### Deployment
78+
79+
| Framework | Deploy shapes |
80+
|---|---|
81+
| **Agents.KT** | Library import / hosted via `McpServer.from(agent)` / autonomous via `McpRunner.serve(agent, args)`. Future: GraalVM native binary (Phase 2), jlink runtime bundle. |
82+
| **LangChain** | Library import. Servers via LangServe (FastAPI-shaped) or your own glue. |
83+
| **Semantic Kernel** | Library import. Hosting via standard ASP.NET (C#) or whatever your Python web framework is. |
84+
| **AutoGen** | Library import. Hosting via your own glue. |
85+
| **Raw MCP** | Whatever transport you pick: stdio and Streamable HTTP are defined by the 2025-03-26 spec, and custom transports are allowed. |
86+
87+
### Local-first
88+
89+
LangChain, Semantic Kernel, and AutoGen all support local LLMs (Ollama, llama.cpp, vLLM) via adapter modules. Agents.KT's default is Ollama with no API key required: you can run `./gradlew runInternalsAgent` and have a functioning MCP server in one command. LangChain, SK, and AutoGen default to cloud providers but work with local models once configured.
90+
91+
### MCP support
92+
93+
| Framework | MCP integration depth |
94+
|---|---|
95+
| **Agents.KT** | First-class. Client (`mcp { server() }`), server (`McpServer.from()`), capability-as-skill shortcuts (`toolSkills()` / `promptSkills()` / `resourceSkills()`), standalone runner (`McpRunner`). 2025-03-26 spec conformance. |
96+
| **LangChain** | Adapter modules (community-maintained). Not a first-class concept; bolted-on. |
97+
| **Semantic Kernel** | MCP plugin in preview. |
98+
| **AutoGen** | MCP client support via community modules. |
99+
| **Raw MCP** | This IS MCP. Use it as the lingua franca; pick the framework on the consumer side based on the language / typing preference. |
100+
101+
### Observability
102+
103+
| Framework | Hooks |
104+
|---|---|
105+
| **Agents.KT** | `onSkillChosen`, `onToolUse`, `onKnowledgeUsed`, `onError`, `onBudgetThreshold`, plus the unified `Agent.observe { event -> }` sealed-event view. Streaming session events via `agent.session(input).events: Flow<AgentEvent<OUT>>`. OpenTelemetry adapter planned (#1908). |
106+
| **LangChain** | `Callbacks` interface, LangSmith integration as the canonical observability story. |
107+
| **Semantic Kernel** | Built-in OpenTelemetry, custom kernel hooks. |
108+
| **AutoGen** | Conversation history is the observation surface. Custom callbacks via the agent API. |
109+
| **Raw MCP** | None at the protocol level. |
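A sealed event hierarchy is what makes a unified `observe { event -> }` view checkable by the compiler. The sketch below shows the general pattern; the event names, fields, and the `ObservableAgent` class are hypothetical and only loosely mirror the hooks listed in the table.

```kotlin
// Hypothetical event model; the real event names and fields may differ.
sealed interface AgentEvent {
    data class SkillChosen(val skill: String) : AgentEvent
    data class ToolUsed(val tool: String, val millis: Long) : AgentEvent
    data class Failed(val cause: Throwable) : AgentEvent
}

class ObservableAgent {
    private val observers = mutableListOf<(AgentEvent) -> Unit>()

    fun observe(observer: (AgentEvent) -> Unit) {
        observers += observer
    }

    // Stand-in for a real run: emits a fixed sequence of events.
    fun run() {
        emit(AgentEvent.SkillChosen("summarize"))
        emit(AgentEvent.ToolUsed("fetchUrl", millis = 120))
    }

    private fun emit(event: AgentEvent) = observers.forEach { it(event) }
}

// Exhaustive `when`: adding a new event subtype makes this fail to compile
// until the new case is handled.
fun describe(event: AgentEvent): String = when (event) {
    is AgentEvent.SkillChosen -> "skill=${event.skill}"
    is AgentEvent.ToolUsed -> "tool=${event.tool} (${event.millis} ms)"
    is AgentEvent.Failed -> "error=${event.cause.message}"
}

fun main() {
    val agent = ObservableAgent()
    val log = mutableListOf<String>()
    agent.observe { log += describe(it) }
    agent.run()
    println(log) // prints "[skill=summarize, tool=fetchUrl (120 ms)]"
}
```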
110+
111+
### Budget controls
112+
113+
| Framework | What you can cap |
114+
|---|---|
115+
| **Agents.KT** | `maxTurns`, `maxToolCalls`, `maxDuration`, `perToolTimeout`, `maxTokens`, `maxConsecutiveSameTool` — pre-cap warnings via `onBudgetThreshold`. All caps surface as `BudgetExceededException` with a `BudgetReason`. |
116+
| **LangChain** | `max_iterations` on agent executors. Per-tool timeouts via tool implementation. |
117+
| **Semantic Kernel** | Planner step limits; per-function timeout via underlying invocation. |
118+
| **AutoGen** | `max_consecutive_auto_reply` on agents. |
119+
| **Raw MCP** | None — that's the runtime's job. |
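The cap-plus-warning behavior in the Agents.KT row can be approximated with a small counter, shown below as a sketch. `Budget`, `recordTurn`, `recordToolCall`, and the 80% warning threshold are assumptions for illustration; only the names `BudgetExceededException` and `BudgetReason` come from the table, and their real shapes may differ.

```kotlin
// Hypothetical sketch of turn/tool-call caps; the real classes may differ in shape.
enum class BudgetReason { MAX_TURNS, MAX_TOOL_CALLS }

class BudgetExceededException(val reason: BudgetReason) :
    RuntimeException("Budget exceeded: $reason")

class Budget(
    private val maxTurns: Int,
    private val maxToolCalls: Int,
    private val warnAtPercent: Int = 80, // assumed threshold, not a documented default
    private val onThreshold: (BudgetReason, Int, Int) -> Unit = { _, _, _ -> },
) {
    private var turns = 0
    private var toolCalls = 0

    fun recordTurn() { turns = bump(turns, maxTurns, BudgetReason.MAX_TURNS) }
    fun recordToolCall() { toolCalls = bump(toolCalls, maxToolCalls, BudgetReason.MAX_TOOL_CALLS) }

    private fun bump(used: Int, cap: Int, reason: BudgetReason): Int {
        val next = used + 1
        if (next > cap) throw BudgetExceededException(reason)
        // Pre-cap warning: fires on every step at or past the threshold.
        if (next * 100 >= cap * warnAtPercent) onThreshold(reason, next, cap)
        return next
    }
}

fun main() {
    val warnings = mutableListOf<String>()
    val budget = Budget(maxTurns = 5, maxToolCalls = 2) { reason, used, cap ->
        warnings += "$reason $used/$cap"
    }
    repeat(5) { budget.recordTurn() }
    val result = runCatching { budget.recordTurn() }
    println(warnings) // prints "[MAX_TURNS 4/5, MAX_TURNS 5/5]"
    println((result.exceptionOrNull() as? BudgetExceededException)?.reason) // prints "MAX_TURNS"
}
```

Carrying the reason on the exception lets the caller distinguish a runaway tool loop from a long conversation without parsing the message.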
120+
121+
## Choosing
122+
123+
A few shortcuts that point at one framework over the others:
124+
125+
- **"Our backend is Spring Boot / Ktor / Quarkus."** → Agents.KT. Single-language stack matters at deploy time.
126+
- **"We need 50 vector-store / retriever / embedder integrations next quarter."** → LangChain. Ecosystem wins.
127+
- **"We're a .NET shop with Azure OpenAI."** → Semantic Kernel. Stays in-stack.
128+
- **"We're researching multi-agent conversation dynamics."** → AutoGen. Built for the question.
129+
- **"We just need to expose tools to Claude / Cursor / ChatGPT."** → Raw MCP. Lowest layer; pick the framework that consumes it.
130+
- **"We want the compiler to catch boundary mistakes before they hit prod."** → Agents.KT.
131+
- **"We want to ship one JAR to k8s with no Python."** → Agents.KT.
132+
- **"We want a curated PromptTemplate library and battle-tested chains for the common patterns."** → LangChain.
133+
134+
## What this comparison is NOT
135+
136+
- A benchmark. Performance comparisons across these tools mostly measure how much overhead the framework adds over a direct provider call. That overhead is usually small relative to LLM call latency. Pick on ergonomics, not throughput.
137+
- A correctness audit. None of these frameworks "checks" your prompts. They give you primitives for building agents; the agents are still as good (or as bad) as the prompts and tool design behind them.
138+
- An endorsement. We use Agents.KT because we built it for our own constraints. If yours are different, pick differently. The frameworks listed here are all good at what they do — none of them is a bad choice for the use case they were designed for.
139+
140+
## Status notes (2026-05)
141+
142+
- **Agents.KT 0.5.0** — streaming runtime + MCP-as-skills shipped. 0.6.0 (per-file IDE-skills via InternalsAgent) in flight.
143+
- **LangChain 0.3.x** — stable, ecosystem mature. LCEL is the recommended composition surface.
144+
- **Semantic Kernel 1.x** — stable, MCP integration in preview.
145+
- **AutoGen 0.4.x** — major architectural rewrite landed; the new core/agentchat split is recent.
146+
- **MCP spec 2025-03-26** — covered by both this framework and the official Python / TypeScript SDKs.
147+
148+
If anything here ages out, file an issue or PR — the comparison should track reality, not historical reality.
