feat(middlewares): add permission middleware for tool call gating by shentongmartin · Pull Request #966 · cloudwego/eino

shentongmartin · 2026-04-14T13:32:44Z

Summary

Problem	Solution
No built-in way to gate tool execution behind permission checks in the ADK middleware layer	Implement a `permission.Middleware` that intercepts tool calls with a user-defined `BeforeToolCall` evaluator, supporting Allow/Deny/Ask decisions with interrupt/resume for interactive approval

Key Insight: Framework Skeleton vs. Strategy Pattern

Inspired by Claude Code's permission system (7 modes, 8 rule layers, 5 parallel racers), but deliberately scoped as a framework skeleton rather than an opinionated strategy engine. The middleware provides the interrupt/resume plumbing; users supply the evaluation logic. This matches Eino's position as a general-purpose framework that must serve diverse product forms.

The critical design choice is the three-outcome model:

Allow: Execute the tool (optionally with modified input via UpdatedInput)
Deny: Return a formatted deny message as the tool result—crucially, this does NOT interrupt the agent loop. The LLM sees "Permission denied" and self-adapts, matching CC's battle-tested pattern.
Ask: Trigger StatefulInterrupt to pause the agent loop, checkpoint state, and await external approval via ResumeWithParams.

Design Decisions

Why fail-closed for unknown decisions?

Unknown Decision values (e.g., a typo like "alow") produce a deny result rather than an error. This prevents fail-open security vulnerabilities—a permission system that accidentally allows unknown states is worse than one that blocks them.

Why `error` only for infrastructure failures?

BeforeToolCall returning error is reserved for infrastructure failures (rule store unreachable, network timeout). This aborts the agent loop. Permission denials use Decision: Deny instead, which keeps the agent loop running. This separation prevents conflating policy decisions with system failures.

Why re-interrupt on non-target resume?

When the runtime resumes a different tool but this tool was previously interrupted, it must re-interrupt to preserve its pending state. This supports parallel tool call scenarios where only one tool is approved at a time.

What Changed

File	Change
`adk/middlewares/permission/permission.go`	New package: `Middleware` struct, `permissionGate` logic, `WrapInvokableToolCall`/`WrapStreamableToolCall`, all supporting types (`Decision`, `ToolCallDecision`, `AskInfo`, `AskState`, `ResumeResponse`)
`adk/middlewares/permission/permission_test.go`	22 unit tests covering Allow/Deny/Ask/Unknown/Nil/Error paths for both invokable and streamable tool calls

How to Verify

go test -race -count=1 ./adk/middlewares/permission/...
golangci-lint run ./adk/middlewares/permission/...

摘要

问题	解决方案
ADK 中间件层缺少内置的工具调用权限控制机制	实现 `permission.Middleware`，通过用户定义的 `BeforeToolCall` 评估函数拦截工具调用，支持 Allow/Deny/Ask 三种决策，Ask 决策通过中断/恢复机制实现交互式审批

核心洞察：框架骨架 vs. 策略引擎

受 Claude Code 权限系统（7 种模式、8 层规则、5 路并行竞争）启发，但刻意定位为框架骨架而非固化的策略引擎。中间件提供中断/恢复的管道设施，用户自行实现评估逻辑。这符合 Eino 作为通用框架需服务多样产品形态的定位。

三态结果模型的关键设计：

Allow：执行工具（可选通过 UpdatedInput 修改输入）
Deny：返回格式化的拒绝消息作为工具结果——不中断 agent 循环。LLM 看到 "Permission denied" 后自行适应，沿用 CC 经过实战验证的模式。
Ask：触发 StatefulInterrupt 暂停 agent 循环、持久化状态、等待外部通过 ResumeWithParams 回注审批结果。

设计决策

为何对未知决策采用 fail-closed？

未知的 Decision 值产生 deny 结果而非 error，防止 fail-open 安全漏洞。

为何 error 仅用于基础设施故障？

BeforeToolCall 返回 error 仅表示基础设施故障（如规则存储不可达），会中断 agent 循环。权限拒绝使用 Decision: Deny，保持 agent 循环正常运行，避免将策略决策与系统故障混淆。

为何非目标恢复时重新中断？

当运行时恢复其他工具而当前工具之前已中断时，必须重新中断以保留挂起状态，支持并行工具调用场景下的逐个审批。

codecov · 2026-04-14T13:40:47Z

Codecov Report

❌ Patch coverage is 95.00000% with 5 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (alpha/09@d7161f7). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
adk/middlewares/permission/permission.go	95.00%	4 Missing and 1 partial ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             alpha/09     #966   +/-   ##
===========================================
  Coverage            ?   82.23%           
===========================================
  Files               ?      163           
  Lines               ?    20405           
  Branches            ?        0           
===========================================
  Hits                ?    16781           
  Misses              ?     2447           
  Partials            ?     1177

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

feat(agentic_model): - format print - support agentic chat template - support to compose agentic odel&agentic tools node - support agentic tool node - support agentic message concat

…#826)

Change-Id: If20fa78dba82a1c177c8ec47090050ea8c1354ed

* fix(adk): skip saving checkpoint when TurnLoop is idle When Stop() is called on an idle TurnLoop (no active agent run, no unhandled items, no canceled items), the resulting checkpoint contains no meaningful state. Skip saving such checkpoints to avoid unnecessary store writes. - Add isIdle check in cleanup() before checkpoint save decision - Add TestTurnLoop_StopWhileIdle_SkipsCheckpoint test Change-Id: I6aeaff5ed5833a971cb95298193fdb96d904baf8 * fix(internal): merge id2State in PopulateInterruptState instead of replacing PopulateInterruptState merged id2Addr entries one by one but replaced id2State wholesale. In a parallel workflow resume, two goroutines share the same globalResumeInfo. If one goroutine's compose graph called PopulateInterruptState (replacing id2State with compose-only entries) before the other goroutine looked up its outer-level entry, the lookup returned a zero-value InterruptState with State=nil, triggering the 'has no state' panic in ChatModelAgent.Resume. Change id2State handling to merge entry by entry, consistent with id2Addr. Change-Id: Ia21f65289bff7beb2bc383fb033926ad9c92d7e7 * fix(adk): keep watching for cancel escalation after stopSig.done When watchStopSignal entered the stopSig.done branch, it processed the initial cancel and then blocked on <-done (turn completion), never looping back to check notify. This meant a subsequent Stop() call with a higher cancel mode (e.g. CancelImmediate) was never forwarded to the agent, causing TestTurnLoop_Stop_EscalatesCancelMode to time out. Replace the blocking <-done with an inner loop that selects on both done and notify, so escalation signals are always delivered. Also apply the generation-based dedup check consistent with the notify branch. Change-Id: Ia6a04d00a2b44625ffbcb625ff0e559c12ed145f

…nt position (#926)

…r agent cancellation (#929) * fix(adk): prevent panic when orphaned tool goroutine sends event after agent cancellation When CancelAfterChatModel times out and escalates to CancelImmediate, GraphInterrupt fires with timeout=0. The compose graph returns immediately, orphaning parallel tool goroutines. When an orphaned tool completes, eventSenderToolWrapper tries to send an event via the AsyncGenerator which is already closed, causing 'send on closed channel' panic. - Add isImmediateCancelled() to cancelContext for checking immediateChan - Make chatModelAgentExecCtx.send cancel-aware: skip send when immediate cancel is active - Use trySend as safety net for the TOCTOU race window - Route SendEvent() through execCtx.send() instead of direct generator.Send() Change-Id: Ic7e0194c860e2692a3cddc559911ab379024f650 * test(adk): add test for orphaned tool goroutine panic after CancelImmediate - unit_send_after_close: directly reproduces the panic by sending to a closed generator with isImmediateCancelled=true - unit_send_after_close_without_cancel_ctx: verifies trySend safety net prevents panic even without cancelCtx - integration_cancel_escalation_orphans_tool: end-to-end test with slow tool, CancelAfterChatModel timeout escalation, and orphaned goroutine Change-Id: Ia82fa957b102ccc2ac42094d18d4b15db2a1701c * test(adk): improve coverage for orphaned tool goroutine fix Add test cases for: - nil execCtx and nil generator defensive guards - nil cancelContext in isImmediateCancelled - TOCTOU race window (isImmediateCancelled=false but generator closed) - SendEvent public API with closed generator - SendEvent without exec context Change-Id: I197c36f34675f5376cbe5f830b15db6ca873cd1f

…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause

)

* fix: rebase error Change-Id: If20fa78dba82a1c177c8ec47090050ea8c1354ed * feat(adk): add failover support for ChatModel Change-Id: Ice1b513b4b509e7b540316da9119ff3d529c9bae * feat(adk): add failover support for ChatModel Change-Id: Ice1b513b4b509e7b540316da9119ff3d529c9bae * feat(adk): add failover support for ChatModel Change-Id: Id5483447b74322f6dd495bdd3b994c001094569d * feat(adk): make Name and Description optional in ChatModelAgentConfig * feat(adk): add callback lifecycle management to failoverProxyModel - Extract prepareCallbacks method to reuse callback setup logic between Generate and Stream methods - Add callbacks.ReuseHandlers with proper RunInfo (model type + component) before each failover model invocation so handlers receive correct identity - Add explicit OnStart/OnEnd/OnError callback invocations in Generate and Stream since failoverProxyModel declares IsCallbacksEnabled() = true and the outer layer skips automatic callback injection Change-Id: I0150529024125251828cf6f77c8247aa464b1f84 * fix(adk): preserve partial result in failoverProxyModel.Generate on error Return result instead of nil when target.Generate fails, so that the outer failoverModelWrapper can pass the partial output message to ShouldFailover for inspection. Change-Id: I32d86151a6e133f1a58d5e988bccf42d831a646c * refactor(adk): use EnsureRunInfo in failoverProxyModel and separate ctx for callbacks - Replace manual RunInfo construction + ReuseHandlers with callbacks.EnsureRunInfo for cleaner RunInfo setup - Use nCtx (from EnsureRunInfo) for target model invocation and original ctx for OnStart/OnEnd/OnError callback lifecycle Change-Id: I1d5982d0e1ceeaf8f6648b9c40c229b6a2b07ab8 --------- Co-authored-by: shentong.martin <shentong.martin@bytedance.com>

feat: tool search definition

…945) - Add ToolAliases to prepareExecContext when building ToolsNodeConfig - Add UnknownToolsHandler, ExecuteSequentially, ToolArgumentsHandler, and ToolAliases to applyBeforeAgent when rebuilding after BeforeAgent handlers modify tools - Add tests covering argument alias remapping, name alias dispatch, alias preservation after handler rebuild, and handler-only tool registration with pre-configured aliases

…options, add UntilIdleFor (#942)

Change-Id: I04e50d2736406cd2bed030715ec02959cb33dc68

Change-Id: I8b733829302674ff05b1d775f264a5f6f0f9657a

…leware - Rename BeforeToolCall type to Checker; pass *adk.ToolContext instead of bare toolName (Thread #1) - Rename NewMiddleware to New to match codebase convention (Thread #4) - Improve error messages with tool name, call ID, and args for LLM consumption (Thread #5) - Replace schema.Pipe with StreamReaderFromArray for stream deny path (Thread #6) - Add 4 E2E ask→resume tests: approved, denied, re-interrupt non-target, resume with updated input (Thread #7) Change-Id: If43ac495fa3a5fc1d71a27db9614cb996d4547b5

…olArgument Forward-compatibility: when ToolArgument gains additional fields (e.g. multimodal content), existing Checker implementations will receive them without signature changes. The middleware constructs the ToolArgument from the raw argumentsInJSON before invoking the checker, and extracts .Text back for the endpoint. UpdatedInput stays string since the endpoint signature remains string-based. Change-Id: I19b8c808415e01969330466c034becbd74e25994

…sages Cover EnhancedInvokableToolCall and EnhancedStreamableToolCall endpoints in the permission gate to prevent bypass. Add denyToolResult helper for structured ToolResult deny responses. Use internal.SelectPrompt for English/Chinese deny message templates. Change-Id: I7c95ee55526bb2109566e5cd6a60cb5242baaf8b

shentongmartin force-pushed the feat/permission branch from 3be5146 to 95029a1 Compare April 14, 2026 13:35

shentongmartin changed the base branch from main to alpha/09 April 15, 2026 01:25

shentongmartin force-pushed the feat/permission branch from 272b82b to 7e89784 Compare April 15, 2026 01:26

shentongmartin commented Apr 15, 2026

View reviewed changes

shentongmartin force-pushed the feat/permission branch from b2246d4 to c47e4ae Compare April 15, 2026 03:39

mrh997 and others added 24 commits April 15, 2026 14:14

feat: define AgenticModel component interface

0f09d14

feat: change the index in StreamMeta to a non-pointer (#573)

cc213db

feat: improve AgenticResponseMeta definition (#575)

52ad46f

feat: improve AssistantGenText definition (#577)

f74acac

feat: improve extension type name (#578)

0186a33

feat: modify package name (#579)

2bc0966

feat: remove TokenUsage definition in CallbackOutput (#580)

428bdcf

feat: add helper functions for AgenticMessage (#582)

1e7d37d

feat: improve MCPToolCallError definition (#592)

1f4004c

feat: improve Options definition (#593)

bec3266

feat: add CallbackInput definition for CallbackInput (#594)

9c9b8be

feat: define 'omitempty' flag in json tag (#595)

df5eea0

fix: MCPToolApprovalRequest definition (#600)

edafdbc

feat: define StreamResponseError for openai (#601)

ed90509

feat: support agentic message concat (#576)

6a01707

feat(agentic_model): - format print - support agentic chat template - support to compose agentic odel&agentic tools node - support agentic tool node - support agentic message concat

fix: concat agentic messages (#604)

2eaf7af

fix: concat agentic messages (#604)

7a5087f

fix(schema): agentic concat support extra (#670)

19bd5a3

feat(schema): optimize agent message format (#671)

9b5035f

fix: openai ConcatResponseMetaExtensions (#678)

86d23bc

feat: improve comment (#679)

8ed1bfe

feat: add agentic callbacks template (#681)

238a7a7

feat: improve AgenticToolChoice (#684)

29b518e

feat: define AgenticCallbackInput/Output (#689)

94799ab

meguminnnnnnnnn and others added 18 commits April 15, 2026 14:14

chore

8282cd6

feat(adk): modify on agent events (#795)

1353c26

feat(adk): turn loop support front and exit loop (#796)

8b5c020

feat(adk): implement cancel mechanism for ChatModelAgent (#797)

64bc098

feat(adk): modify cancel func (#800)

c2c571d

fix(adk): select cancel after front without preemptive (#802)

4c55952

fix: implement IsCallbacksEnabled and GetType for cancelableChatModel (…

2387224

…#826)

fix: rebase error

87eb74f

Change-Id: If20fa78dba82a1c177c8ec47090050ea8c1354ed

refactor(adk): replace TurnLoop with push-based API (#835)

989db77

feat(adk): export NewEventSenderToolWrapper for customizable tool eve…

0842de7

…nt position (#926)

feat(compose): support tool name and argument aliases in ToolsNode (#931

d0e1318

)

feat: tool search (#884)

42cecff

feat: tool search definition

refactor(adk): improve cancel propagation, encapsulate TurnLoop stop …

d7161f7

…options, add UntilIdleFor (#942)

shentongmartin force-pushed the alpha/09 branch from 9661935 to d7161f7 Compare April 15, 2026 06:15

shentongmartin added 5 commits April 15, 2026 14:15

feat(middlewares): add permission middleware for tool call gating

f3adfba

Change-Id: I04e50d2736406cd2bed030715ec02959cb33dc68

test(middlewares): add attack tests for permission middleware

1f7a29e

Change-Id: I8b733829302674ff05b1d775f264a5f6f0f9657a

shentongmartin force-pushed the feat/permission branch from c47e4ae to 1d49154 Compare April 15, 2026 06:16

shentongmartin added C-enhancement Category: This is a PR that adds a new feature or fixes a bug. D-adk Domain: this is an issue related to the adk package labels Apr 20, 2026

shentongmartin force-pushed the alpha/09 branch 2 times, most recently from d4a9e3b to 531c324 Compare April 28, 2026 01:07

shentongmartin force-pushed the alpha/09 branch from 0220502 to 1df3092 Compare May 6, 2026 01:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(middlewares): add permission middleware for tool call gating#966

feat(middlewares): add permission middleware for tool call gating#966
shentongmartin wants to merge 59 commits into
alpha/09from
feat/permission

shentongmartin commented Apr 14, 2026

Uh oh!

codecov Bot commented Apr 14, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

7 participants

Conversation

shentongmartin commented Apr 14, 2026

Summary

Key Insight: Framework Skeleton vs. Strategy Pattern

Design Decisions

Why fail-closed for unknown decisions?

Why error only for infrastructure failures?

Why re-interrupt on non-target resume?

What Changed

How to Verify

摘要

核心洞察：框架骨架 vs. 策略引擎

设计决策

为何对未知决策采用 fail-closed？

为何 error 仅用于基础设施故障？

为何非目标恢复时重新中断？

Uh oh!

codecov Bot commented Apr 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Milestone

Development

Uh oh!

7 participants

Why `error` only for infrastructure failures?

codecov Bot commented Apr 14, 2026 •

edited

Loading