feat(middlewares): add permission middleware for tool call gating#966
Open
shentongmartin wants to merge 59 commits into
Open
feat(middlewares): add permission middleware for tool call gating#966shentongmartin wants to merge 59 commits into
shentongmartin wants to merge 59 commits into
Conversation
3be5146 to
95029a1
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## alpha/09 #966 +/- ##
===========================================
Coverage ? 82.23%
===========================================
Files ? 163
Lines ? 20405
Branches ? 0
===========================================
Hits ? 16781
Misses ? 2447
Partials ? 1177 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
272b82b to
7e89784
Compare
shentongmartin
commented
Apr 15, 2026
b2246d4 to
c47e4ae
Compare
feat(agentic_model): - format print - support agentic chat template - support to compose agentic odel&agentic tools node - support agentic tool node - support agentic message concat
Change-Id: If20fa78dba82a1c177c8ec47090050ea8c1354ed
* fix(adk): skip saving checkpoint when TurnLoop is idle When Stop() is called on an idle TurnLoop (no active agent run, no unhandled items, no canceled items), the resulting checkpoint contains no meaningful state. Skip saving such checkpoints to avoid unnecessary store writes. - Add isIdle check in cleanup() before checkpoint save decision - Add TestTurnLoop_StopWhileIdle_SkipsCheckpoint test Change-Id: I6aeaff5ed5833a971cb95298193fdb96d904baf8 * fix(internal): merge id2State in PopulateInterruptState instead of replacing PopulateInterruptState merged id2Addr entries one by one but replaced id2State wholesale. In a parallel workflow resume, two goroutines share the same globalResumeInfo. If one goroutine's compose graph called PopulateInterruptState (replacing id2State with compose-only entries) before the other goroutine looked up its outer-level entry, the lookup returned a zero-value InterruptState with State=nil, triggering the 'has no state' panic in ChatModelAgent.Resume. Change id2State handling to merge entry by entry, consistent with id2Addr. Change-Id: Ia21f65289bff7beb2bc383fb033926ad9c92d7e7 * fix(adk): keep watching for cancel escalation after stopSig.done When watchStopSignal entered the stopSig.done branch, it processed the initial cancel and then blocked on <-done (turn completion), never looping back to check notify. This meant a subsequent Stop() call with a higher cancel mode (e.g. CancelImmediate) was never forwarded to the agent, causing TestTurnLoop_Stop_EscalatesCancelMode to time out. Replace the blocking <-done with an inner loop that selects on both done and notify, so escalation signals are always delivered. Also apply the generation-based dedup check consistent with the notify branch. Change-Id: Ia6a04d00a2b44625ffbcb625ff0e559c12ed145f
…r agent cancellation (#929) * fix(adk): prevent panic when orphaned tool goroutine sends event after agent cancellation When CancelAfterChatModel times out and escalates to CancelImmediate, GraphInterrupt fires with timeout=0. The compose graph returns immediately, orphaning parallel tool goroutines. When an orphaned tool completes, eventSenderToolWrapper tries to send an event via the AsyncGenerator which is already closed, causing 'send on closed channel' panic. - Add isImmediateCancelled() to cancelContext for checking immediateChan - Make chatModelAgentExecCtx.send cancel-aware: skip send when immediate cancel is active - Use trySend as safety net for the TOCTOU race window - Route SendEvent() through execCtx.send() instead of direct generator.Send() Change-Id: Ic7e0194c860e2692a3cddc559911ab379024f650 * test(adk): add test for orphaned tool goroutine panic after CancelImmediate - unit_send_after_close: directly reproduces the panic by sending to a closed generator with isImmediateCancelled=true - unit_send_after_close_without_cancel_ctx: verifies trySend safety net prevents panic even without cancelCtx - integration_cancel_escalation_orphans_tool: end-to-end test with slow tool, CancelAfterChatModel timeout escalation, and orphaned goroutine Change-Id: Ia82fa957b102ccc2ac42094d18d4b15db2a1701c * test(adk): improve coverage for orphaned tool goroutine fix Add test cases for: - nil execCtx and nil generator defensive guards - nil cancelContext in isImmediateCancelled - TOCTOU race window (isImmediateCancelled=false but generator closed) - SendEvent public API with closed generator - SendEvent without exec context Change-Id: I197c36f34675f5376cbe5f830b15db6ca873cd1f
…925) * fix(adk): keep late turn loop items Change-Id: Iabee0c25a83d5a25585d3592a41ca6a5fba35c2b * docs(adk): clarify cancel wait semantics Change-Id: Ia0a396b9cc2e43f15e85056d966f20b010dcd2b6 * feat(adk): add WithSkipCheckpoint and WithStopCause StopOptions Add two new StopOption variants for TurnLoop.Stop(): - WithSkipCheckpoint: prevents checkpoint persistence on stop, for cases where the caller does not intend to resume in the future. The flag is sticky across escalation calls. - WithStopCause: attaches a business-supplied reason string. Surfaced in TurnLoopExitState.StopCause and, after the Stopped channel closes, via TurnContext.StopCause(). Uses first-non-empty-wins semantics across multiple Stop() calls. Thread both fields through stopSignal with proper mutex protection. Update cleanup() to skip checkpoint save when skipCheckpoint is set. Change-Id: Ifeat-stop-options-skip-checkpoint-stop-cause
* fix: rebase error Change-Id: If20fa78dba82a1c177c8ec47090050ea8c1354ed * feat(adk): add failover support for ChatModel Change-Id: Ice1b513b4b509e7b540316da9119ff3d529c9bae * feat(adk): add failover support for ChatModel Change-Id: Ice1b513b4b509e7b540316da9119ff3d529c9bae * feat(adk): add failover support for ChatModel Change-Id: Id5483447b74322f6dd495bdd3b994c001094569d * feat(adk): make Name and Description optional in ChatModelAgentConfig * feat(adk): add callback lifecycle management to failoverProxyModel - Extract prepareCallbacks method to reuse callback setup logic between Generate and Stream methods - Add callbacks.ReuseHandlers with proper RunInfo (model type + component) before each failover model invocation so handlers receive correct identity - Add explicit OnStart/OnEnd/OnError callback invocations in Generate and Stream since failoverProxyModel declares IsCallbacksEnabled() = true and the outer layer skips automatic callback injection Change-Id: I0150529024125251828cf6f77c8247aa464b1f84 * fix(adk): preserve partial result in failoverProxyModel.Generate on error Return result instead of nil when target.Generate fails, so that the outer failoverModelWrapper can pass the partial output message to ShouldFailover for inspection. Change-Id: I32d86151a6e133f1a58d5e988bccf42d831a646c * refactor(adk): use EnsureRunInfo in failoverProxyModel and separate ctx for callbacks - Replace manual RunInfo construction + ReuseHandlers with callbacks.EnsureRunInfo for cleaner RunInfo setup - Use nCtx (from EnsureRunInfo) for target model invocation and original ctx for OnStart/OnEnd/OnError callback lifecycle Change-Id: I1d5982d0e1ceeaf8f6648b9c40c229b6a2b07ab8 --------- Co-authored-by: shentong.martin <shentong.martin@bytedance.com>
feat: tool search definition
…945) - Add ToolAliases to prepareExecContext when building ToolsNodeConfig - Add UnknownToolsHandler, ExecuteSequentially, ToolArgumentsHandler, and ToolAliases to applyBeforeAgent when rebuilding after BeforeAgent handlers modify tools - Add tests covering argument alias remapping, name alias dispatch, alias preservation after handler rebuild, and handler-only tool registration with pre-configured aliases
…options, add UntilIdleFor (#942)
9661935 to
d7161f7
Compare
Change-Id: I04e50d2736406cd2bed030715ec02959cb33dc68
Change-Id: I8b733829302674ff05b1d775f264a5f6f0f9657a
…leware - Rename BeforeToolCall type to Checker; pass *adk.ToolContext instead of bare toolName (Thread #1) - Rename NewMiddleware to New to match codebase convention (Thread #4) - Improve error messages with tool name, call ID, and args for LLM consumption (Thread #5) - Replace schema.Pipe with StreamReaderFromArray for stream deny path (Thread #6) - Add 4 E2E ask→resume tests: approved, denied, re-interrupt non-target, resume with updated input (Thread #7) Change-Id: If43ac495fa3a5fc1d71a27db9614cb996d4547b5
…olArgument Forward-compatibility: when ToolArgument gains additional fields (e.g. multimodal content), existing Checker implementations will receive them without signature changes. The middleware constructs the ToolArgument from the raw argumentsInJSON before invoking the checker, and extracts .Text back for the endpoint. UpdatedInput stays string since the endpoint signature remains string-based. Change-Id: I19b8c808415e01969330466c034becbd74e25994
…sages Cover EnhancedInvokableToolCall and EnhancedStreamableToolCall endpoints in the permission gate to prevent bypass. Add denyToolResult helper for structured ToolResult deny responses. Use internal.SelectPrompt for English/Chinese deny message templates. Change-Id: I7c95ee55526bb2109566e5cd6a60cb5242baaf8b
c47e4ae to
1d49154
Compare
d4a9e3b to
531c324
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
permission.Middlewarethat intercepts tool calls with a user-definedBeforeToolCallevaluator, supporting Allow/Deny/Ask decisions with interrupt/resume for interactive approvalKey Insight: Framework Skeleton vs. Strategy Pattern
Inspired by Claude Code's permission system (7 modes, 8 rule layers, 5 parallel racers), but deliberately scoped as a framework skeleton rather than an opinionated strategy engine. The middleware provides the interrupt/resume plumbing; users supply the evaluation logic. This matches Eino's position as a general-purpose framework that must serve diverse product forms.
The critical design choice is the three-outcome model:
UpdatedInput)StatefulInterruptto pause the agent loop, checkpoint state, and await external approval viaResumeWithParams.Design Decisions
Why fail-closed for unknown decisions?
Unknown
Decisionvalues (e.g., a typo like"alow") produce a deny result rather than an error. This prevents fail-open security vulnerabilities—a permission system that accidentally allows unknown states is worse than one that blocks them.Why
erroronly for infrastructure failures?BeforeToolCallreturningerroris reserved for infrastructure failures (rule store unreachable, network timeout). This aborts the agent loop. Permission denials useDecision: Denyinstead, which keeps the agent loop running. This separation prevents conflating policy decisions with system failures.Why re-interrupt on non-target resume?
When the runtime resumes a different tool but this tool was previously interrupted, it must re-interrupt to preserve its pending state. This supports parallel tool call scenarios where only one tool is approved at a time.
What Changed
adk/middlewares/permission/permission.goMiddlewarestruct,permissionGatelogic,WrapInvokableToolCall/WrapStreamableToolCall, all supporting types (Decision,ToolCallDecision,AskInfo,AskState,ResumeResponse)adk/middlewares/permission/permission_test.goHow to Verify
go test -race -count=1 ./adk/middlewares/permission/... golangci-lint run ./adk/middlewares/permission/...摘要
permission.Middleware,通过用户定义的BeforeToolCall评估函数拦截工具调用,支持 Allow/Deny/Ask 三种决策,Ask 决策通过中断/恢复机制实现交互式审批核心洞察:框架骨架 vs. 策略引擎
受 Claude Code 权限系统(7 种模式、8 层规则、5 路并行竞争)启发,但刻意定位为框架骨架而非固化的策略引擎。中间件提供中断/恢复的管道设施,用户自行实现评估逻辑。这符合 Eino 作为通用框架需服务多样产品形态的定位。
三态结果模型的关键设计:
UpdatedInput修改输入)StatefulInterrupt暂停 agent 循环、持久化状态、等待外部通过ResumeWithParams回注审批结果。设计决策
为何对未知决策采用 fail-closed?
未知的
Decision值产生 deny 结果而非 error,防止 fail-open 安全漏洞。为何 error 仅用于基础设施故障?
BeforeToolCall返回 error 仅表示基础设施故障(如规则存储不可达),会中断 agent 循环。权限拒绝使用Decision: Deny,保持 agent 循环正常运行,避免将策略决策与系统故障混淆。为何非目标恢复时重新中断?
当运行时恢复其他工具而当前工具之前已中断时,必须重新中断以保留挂起状态,支持并行工具调用场景下的逐个审批。