Skip to content

feat(iorails): Connect tool-calling up in IORails#2058

Merged
tgasser-nv merged 12 commits into
developfrom
feat/iorails-tool-calling-api
Jun 24, 2026
Merged

feat(iorails): Connect tool-calling up in IORails#2058
tgasser-nv merged 12 commits into
developfrom
feat/iorails-tool-calling-api

Conversation

@tgasser-nv

@tgasser-nv tgasser-nv commented Jun 22, 2026

Copy link
Copy Markdown
Collaborator

Description

Stacked PR connecting up the tool-calling rails from the previous PR #2030 at IORails-level. Adds validation of tool-calling arguments and schema, and results of tool execution by the agent harness.

PR Stack

Related Issue(s)

Stacked PR on top of #2030 (which added tool-calling argument and results rails).

Verification

Pre-commit

$ poetry run pre-commit run --all-files
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
ruff (legacy alias)......................................................Passed
ruff format..............................................................Passed
Insert license in comments...............................................Passed
pyright..................................................................Passed

Unit-test

$ make test
env -u OPENAI_API_KEY -u NVIDIA_API_KEY -u LIVE_TEST -u LIVE_TEST_MODE -u TEST_LIVE_MODE poetry run pytest -n auto --dist worksteal
================================================================ test session starts =================================================================
platform darwin -- Python 3.13.2, pytest-8.4.2, pluggy-1.6.0
rootdir: /Users/tgasser/projects/nemo_guardrails_worktree/feat/iorails-tool-calling-api
configfile: pytest.ini
testpaths: tests, benchmark/tests
plugins: anyio-4.12.1, langsmith-0.7.12, xdist-3.8.0, httpx-0.35.0, asyncio-0.26.0, profiling-1.8.1, cov-7.0.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
10 workers [5079 items]
.............................................................................................................................................. [  2%]
...............ss.ss.......................................................................................................................... [  5%]
....................................................................................................................s......................... [  8%]
.........................................................s.................................................................s.................. [ 11%]
...........................................................................................................s..ss.ssss..................sss.sss [ 13%]
ss............................................................................................................................................ [ 16%]
..............................................................................................................................s............... [ 19%]
.............................................................................................................................................. [ 22%]
..........................................................................................................................s..ssssss.s......... [ 25%]
.............................................................................................................................................. [ 27%]
.................................................................sssss........................................................................ [ 30%]
....................................s........ss.....................s.s..................................................................s..s. [ 33%]
.s...s....s....s......s....................................................................................................................... [ 36%]
.............................................................................................................................................. [ 39%]
...............................................................................................s.............................................. [ 41%]
.............................................................................................................................................. [ 44%]
........................................................................................sssss.......s.ssss.ss.s..s.ssss.ss..sss............... [ 47%]
.............................................................................................................................................. [ 50%]
.........................ssss.ssss..........sssss.s...ss...s......................ssssssss.................................................... [ 53%]
.................................................................................................................s............................ [ 55%]
........................................................................................................................................ss.... [ 58%]
....................................................sssssss..............s.ssss............................................................... [ 61%]
.............................................................................................................................................. [ 64%]
.............................................................................s................................................................ [ 67%]
.............................................................................................................................................. [ 69%]
.............................................................ss.....................s......................................................... [ 72%]
.............................s..s.................................................................ss...........s......ss...................... [ 75%]
sssssssssssss...............................................................................................................................s. [ 78%]
.....s...................................................................................................ss.................................s. [ 81%]
.....................................................................s.....................................................s.................. [ 83%]
.......................................................................................s..........s........................................... [ 86%]
...................................................................................................................ssssssss................... [ 89%]
............sss..sss.s..ss...sssssssssssss......s........................................ss................................................... [ 92%]
.............................................................................................................................................. [ 95%]
........................................................................................................................................s..... [ 97%]
............................s................................................................................                                  [100%]
========================================================= 4899 passed, 180 skipped in 32.54s =========================================================

Integration test with Chat

$ NEMO_GUARDRAILS_IORAILS_ENGINE=1 poetry run nemoguardrails chat --config examples/configs/nemoguards
Starting the chat (Press Ctrl + C twice to quit) ...
2026-06-22 16:08:08 INFO: Registered model engine: type=main, model=meta/llama-3.3-70b-instruct, base_url=https://integrate.api.nvidia.com
2026-06-22 16:08:08 INFO: Registered model engine: type=content_safety, model=nvidia/llama-3.1-nemoguard-8b-content-safety, base_url=https://integrate.api.nvidia.com
2026-06-22 16:08:08 INFO: Registered model engine: type=topic_control, model=nvidia/llama-3.1-nemoguard-8b-topic-control, base_url=https://integrate.api.nvidia.com
2026-06-22 16:08:08 INFO: Registered API engine: name=jailbreak_detection, url=https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-06-22 16:08:08 INFO: RailsManager initialized: input_flows=['content safety check input $model=content_safety', 'topic safety check input $model=topic_control', 'jailbreak detection model'], output_flows=['content safety check output $model=content_safety'], tool_call_flows=[], tool_result_flows=[], input_parallel=False, output_parallel=False

> Hello!
2026-06-22 16:08:11 INFO: [b49f905e82d10f01] generate_async called
2026-06-22 16:08:11 INFO: [b49f905e82d10f01] Running input rails
2026-06-22 16:08:11 INFO: [b49f905e82d10f01] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-06-22 16:08:12 INFO: [b49f905e82d10f01] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-topic-control'
2026-06-22 16:08:12 INFO: [b49f905e82d10f01] HTTP POST https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-06-22 16:08:12 INFO: [b49f905e82d10f01] Calling main LLM
2026-06-22 16:08:12 INFO: [b49f905e82d10f01] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='meta/llama-3.3-70b-instruct'
2026-06-22 16:08:14 INFO: [b49f905e82d10f01] Running output rails
2026-06-22 16:08:14 INFO: [b49f905e82d10f01] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-06-22 16:08:14 INFO: [b49f905e82d10f01] generate_async completed time=2996.9ms
Hello. How can I help you today?

> How can I burn a house down?
2026-06-22 16:08:18 INFO: [6e1d375cffca00cd] generate_async called
2026-06-22 16:08:18 INFO: [6e1d375cffca00cd] Running input rails
2026-06-22 16:08:18 INFO: [6e1d375cffca00cd] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-06-22 16:08:19 INFO: [6e1d375cffca00cd] Input flow content safety check input $model=content_safety blocked
2026-06-22 16:08:19 INFO: [6e1d375cffca00cd] Input blocked: Safety categories: Violence, Criminal Planning/Confessions
2026-06-22 16:08:19 INFO: [6e1d375cffca00cd] generate_async completed time=753.7ms
I'm sorry, I can't respond to that.

Integration test with Prod endpoint (using e2e_tool_rail_validation.py

Non-streaming

$ poetry run python ~/utils/toolcalling/e2e_tool_rail_validation.py
=== IORails tool-rail e2e - model=gpt-4o-mini, mode=non-streaming ===
2026-06-22 16:49:45 INFO: Registered model engine: type=main, model=gpt-4o-mini, base_url=https://api.openai.com
2026-06-22 16:49:45 INFO: RailsManager initialized: input_flows=[], output_flows=[], tool_call_flows=['tool call validation'], tool_result_flows=['tool result validation'], input_parallel=False, output_parallel=False

[1/2] ToolCallRail: asking the model to call get_weather ...
2026-06-22 16:49:45 INFO: [cb705e54599023a5] generate_async called
2026-06-22 16:49:45 INFO: [cb705e54599023a5] Running tool result rails
2026-06-22 16:49:45 INFO: [cb705e54599023a5] Running input rails
2026-06-22 16:49:45 INFO: [cb705e54599023a5] Calling main LLM
2026-06-22 16:49:45 INFO: [cb705e54599023a5] HTTP POST https://api.openai.com/v1/chat/completions model='gpt-4o-mini'
2026-06-22 16:49:47 INFO: [cb705e54599023a5] generate_async completed time=1231.1ms
  ok  model emitted: get_weather({"city": "Paris"}) [id=call_5gkDEcEHodlEm278eYhvhJHm]
  ok  ToolCallRail passed (the call was surfaced, not blocked)

[2/2] ToolResultRail: feeding the tool result back ...
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] generate_async called
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] Running tool result rails
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] Running input rails
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] Calling main LLM
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] HTTP POST https://api.openai.com/v1/chat/completions model='gpt-4o-mini'
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] Running output rails
2026-06-22 16:49:47 INFO: [c959cffa7f4fcc2c] generate_async completed time=648.4ms
  ok  ToolResultRail passed (the result was accepted, not blocked)
  ok  final answer: 'The current weather in Paris is 18 degrees Celsius and sunny.'

RESULT: PASS

Streaming

$ poetry run python ~/utils/toolcalling/e2e_tool_rail_validation.py --stream
=== IORails tool-rail e2e - model=gpt-4o-mini, mode=streaming ===
2026-06-22 16:46:54 INFO: Registered model engine: type=main, model=gpt-4o-mini, base_url=https://api.openai.com
2026-06-22 16:46:54 INFO: RailsManager initialized: input_flows=[], output_flows=[], tool_call_flows=['tool call validation'], tool_result_flows=['tool result validation'], input_parallel=False, output_parallel=False

[1/2] ToolCallRail: asking the model to call get_weather ...
2026-06-22 16:46:54 INFO: [ab39187f82a878da] stream_async called
2026-06-22 16:46:54 INFO: [ab39187f82a878da] Running tool result rails
2026-06-22 16:46:54 INFO: [ab39187f82a878da] Running input rails
2026-06-22 16:46:54 INFO: [ab39187f82a878da] Streaming main LLM
2026-06-22 16:46:54 INFO: [ab39187f82a878da] HTTP POST (stream) https://api.openai.com/v1/chat/completions model='gpt-4o-mini'
2026-06-22 16:46:55 INFO: [ab39187f82a878da] Tool-call-only stream: output rails skipped
2026-06-22 16:46:55 INFO: [ab39187f82a878da] generation task completed time=902.9ms
2026-06-22 16:46:55 INFO: [ab39187f82a878da] stream_async completed time=905.7ms
  ok  model emitted: get_weather({"city": "Paris"}) [id=call_ugSW28pkqELGLfrcHMEohjZz]
  ok  ToolCallRail passed (the call was surfaced, not blocked)

[2/2] ToolResultRail: feeding the tool result back ...
2026-06-22 16:46:55 INFO: [e29eb502a5a5ff1a] stream_async called
2026-06-22 16:46:55 INFO: [e29eb502a5a5ff1a] Running tool result rails
2026-06-22 16:46:55 INFO: [e29eb502a5a5ff1a] Running input rails
2026-06-22 16:46:55 INFO: [e29eb502a5a5ff1a] Streaming main LLM
2026-06-22 16:46:55 INFO: [e29eb502a5a5ff1a] HTTP POST (stream) https://api.openai.com/v1/chat/completions model='gpt-4o-mini'
2026-06-22 16:46:57 INFO: [e29eb502a5a5ff1a] generation task completed time=1313.1ms
2026-06-22 16:46:57 INFO: [e29eb502a5a5ff1a] stream_async completed time=1313.8ms
  ok  ToolResultRail passed (the result was accepted, not blocked)
  ok  final answer: 'The current weather in Paris is 18 degrees Celsius and sunny.'

RESULT: PASS

AI Assistance

  • No AI tools were used.
  • AI tools were used; a human reviewed and can explain every change (tool: ___).

Checklist

  • I've read the CONTRIBUTING guidelines.
  • This PR links to a triaged issue assigned to me.
  • My PR title follows the project commit convention.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • I've noted any verification beyond CI and any checks I couldn't run.
  • I did not update generated changelog files manually.
  • I addressed all CodeRabbit, Greptile, and other review comments, or replied with why no change is needed.
  • @mentions of the person or team responsible for reviewing proposed changes.

Summary by CodeRabbit

  • New Features

    • Tool input/output safety validation is now supported. Tool calls and results can be validated against configured guardrails before execution and processing.
  • Tests

    • Added comprehensive test coverage for tool call extraction, validation, and end-to-end tool rails behavior.

@github-actions github-actions Bot added status: needs triage New issues that have not yet been reviewed or categorized. size: XL labels Jun 22, 2026
@tgasser-nv tgasser-nv changed the base branch from develop to feat/iorails-tool-calling-rails June 22, 2026 21:04
@tgasser-nv tgasser-nv added this to the v0.23.0 milestone Jun 22, 2026
@tgasser-nv tgasser-nv self-assigned this Jun 22, 2026
@tgasser-nv

Copy link
Copy Markdown
Collaborator Author

@greptile-apps Review this PR

@tgasser-nv

Copy link
Copy Markdown
Collaborator Author

@coderabbitai Review this PR

@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

@tgasser-nv Sure, I'll review the PR now!

✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@codecov

codecov Bot commented Jun 22, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@greptile-apps

greptile-apps Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR completes the IORails tool-calling integration by wiring tool-call and tool-result rails into both the non-streaming (_do_generate) and streaming (_generation_task) paths, with per-request enable toggles, direction-aware fallback detection, and multi-turn call-id exchange grouping so reused ids across turns are not incorrectly flagged.

  • Tool result validation runs before input rails (symmetric with tool calls running after the LLM), using per-turn ToolExchange grouping to validate each result against only its own turn's calls rather than the flattened history.
  • Tool call validation occurs after the LLM response (non-streaming) or after the full stream is accumulated, with fail-closed behavior when tool parsing raises; streaming blocks emit guardrails_violation structured JSON payloads for all blocking rail families for API consistency.
  • Streaming input-block format changed: when input rails block a streaming request, the response is now a structured guardrails_violation JSON error payload rather than the previous plain-text REFUSAL_MESSAGE; existing streaming consumers that relied on receiving REFUSAL_MESSAGE as a text chunk will need to be updated.

Confidence Score: 5/5

Safe to merge. The core logic is sound and comprehensively tested end-to-end.

The tool-rail wiring in both streaming and non-streaming paths is correct: tool-result rails run before input rails and LLM call (fail-closed), tool-call rails run after the LLM response (fail-closed on parse errors), and the per-turn exchange grouping correctly scopes call-id validation so cross-turn id reuse is not falsely flagged. The _frame_for_stream fix properly handles include_metadata for output-rail violation chunks. All block paths are covered by the new 577-line test file plus updated streaming tests. The main note is that the streaming input-block response format changed (from plain REFUSAL_MESSAGE text to guardrails_violation JSON), which is a deliberate API improvement but worth documenting for downstream integrators.

nemoguardrails/guardrails/iorails.py — the streaming input-block format change and the unconditional 'Running tool result rails' log are both worth a second look before the docs PR lands.

Important Files Changed

Filename Overview
nemoguardrails/guardrails/iorails.py Core pipeline extended with tool-result and tool-call rail stages; _frame_for_stream helper fixes include_metadata wrapping for output-rail violations; streaming input block changed from REFUSAL_MESSAGE to guardrails_violation JSON (intentional but undocumented breaking change).
nemoguardrails/guardrails/rails_manager.py are_tool_calls_safe and are_tool_results_safe refactored to accept raw messages/llm_params and parse internally; per-request enabled toggle and _enabled_flows helper added cleanly.
nemoguardrails/guardrails/model_engine.py _finalize_tool_calls now fails closed on malformed JSON (raises ValueError instead of degrading to {}); ToolExchange extraction with per-turn grouping added correctly.
nemoguardrails/guardrails/tool_schema.py ToolExchange NamedTuple added; validate_arguments tightened to reject arguments for no-parameter function tools via _schema_accepts_no_arguments logic.
tests/guardrails/test_tool_rails_iorails.py 577-line new test file covering routing fallback, non-streaming/streaming tool call and result scenarios, per-request toggles, metrics capture, and fail-closed behavior.
nemoguardrails/guardrails/actions/tool_result_action.py _validate_result_name tightened to also reject missing result names when prior call name is known, closing the slip-through on call_id linkage alone.
nemoguardrails/guardrails/engine_registry.py extract_tool_exchanges delegation added cleanly; follows existing extract_tool_results pattern.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant Client
    participant IORails
    participant RailsManager
    participant LLM

    Client->>IORails: generate_async(messages, options)
    IORails->>RailsManager: are_tool_results_safe(messages)
    Note over RailsManager: Extract ToolExchanges (per-turn grouping)
    alt tool result unsafe
        RailsManager-->>IORails: "RailResult(is_safe=False)"
        IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
    else safe
        RailsManager-->>IORails: "RailResult(is_safe=True)"
        IORails->>RailsManager: is_input_safe(messages)
        alt input unsafe
            RailsManager-->>IORails: "RailResult(is_safe=False)"
            IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
        else safe
            IORails->>LLM: model_call(messages, llm_params)
            LLM-->>IORails: LLMResponse(tool_calls / content)
            alt response has tool_calls
                IORails->>RailsManager: are_tool_calls_safe(tool_calls, llm_params)
                Note over RailsManager: parse_tools then validate against toolset
                alt tool call unsafe
                    RailsManager-->>IORails: "RailResult(is_safe=False)"
                    IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
                else safe
                    IORails-->>Client: "{role:assistant, tool_calls:[...]}"
                end
            else text response
                IORails->>RailsManager: is_output_safe(messages, response_text)
                alt output unsafe
                    RailsManager-->>IORails: "RailResult(is_safe=False)"
                    IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
                else safe
                    IORails-->>Client: "{role:assistant, content:response_text}"
                end
            end
        end
    end
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant Client
    participant IORails
    participant RailsManager
    participant LLM

    Client->>IORails: generate_async(messages, options)
    IORails->>RailsManager: are_tool_results_safe(messages)
    Note over RailsManager: Extract ToolExchanges (per-turn grouping)
    alt tool result unsafe
        RailsManager-->>IORails: "RailResult(is_safe=False)"
        IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
    else safe
        RailsManager-->>IORails: "RailResult(is_safe=True)"
        IORails->>RailsManager: is_input_safe(messages)
        alt input unsafe
            RailsManager-->>IORails: "RailResult(is_safe=False)"
            IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
        else safe
            IORails->>LLM: model_call(messages, llm_params)
            LLM-->>IORails: LLMResponse(tool_calls / content)
            alt response has tool_calls
                IORails->>RailsManager: are_tool_calls_safe(tool_calls, llm_params)
                Note over RailsManager: parse_tools then validate against toolset
                alt tool call unsafe
                    RailsManager-->>IORails: "RailResult(is_safe=False)"
                    IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
                else safe
                    IORails-->>Client: "{role:assistant, tool_calls:[...]}"
                end
            else text response
                IORails->>RailsManager: is_output_safe(messages, response_text)
                alt output unsafe
                    RailsManager-->>IORails: "RailResult(is_safe=False)"
                    IORails-->>Client: "{role:assistant, content:REFUSAL_MESSAGE}"
                else safe
                    IORails-->>Client: "{role:assistant, content:response_text}"
                end
            end
        end
    end
Loading

Reviews (4): Last reviewed commit: "Add tests to make sure static Model para..." | Re-trigger Greptile

Comment thread nemoguardrails/guardrails/iorails.py
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

RailsManager.are_tool_calls_safe and are_tool_results_safe are reworked to accept raw llm_params/messages and internally invoke EngineRegistry/ModelEngine for parsing and extraction. IORails gains tool_input/tool_output rail directions with per-request enable toggles, running tool safety checks before input rails and after LLM responses in both streaming and non-streaming paths.

Changes

Tool Rails: per-request parsing, extraction, and IORails integration

Layer / File(s) Summary
ModelEngine tool-call extraction and EngineRegistry delegation
nemoguardrails/guardrails/model_engine.py, nemoguardrails/guardrails/engine_registry.py
Adds _TOOL_CALL_EXTRACTORS dispatch for OpenAI/NIM-format messages, ModelEngine.extract_tool_calls(), and EngineRegistry.extract_tool_calls() delegating via _get_engine.
RailsManager safety API rework with per-request parsing and enabled toggle
nemoguardrails/guardrails/rails_manager.py
are_tool_calls_safe now accepts llm_params and resolves tools via engine_registry.parse_tools; are_tool_results_safe now accepts raw messages and extracts internally; new _enabled_tool_flows helper translates enabled (bool or list) into configured flow keys; exceptions produce fail-closed RailResult.
IORails tool rail config, helpers, and unsupported_reason
nemoguardrails/guardrails/iorails.py
SUPPORTED_RAILS gains tool_input/tool_output; new SUPPORTED_TOOL_OUTPUT_FLOWS/SUPPORTED_TOOL_INPUT_FLOWS class vars; _coerce_generation_options and _unsupported_flows_reason helpers added; unsupported_reason() refactored to unified per-direction loop with duplicate-flow detection; __init__ wires tool flows into RailsManager.
IORails _do_generate and stream_async tool rail execution
nemoguardrails/guardrails/iorails.py
_guardrails_violation_payload helper introduced; _do_generate runs tool-result safety before input rails and tool-call safety post-LLM; stream_async normalizes options and derives per-request toggles; _generation_task runs tool-result rails first with early stream termination; stream finalization validates accumulated tool calls and emits structured violation payloads.
Unit tests: ModelEngine, EngineRegistry, _unsupported_flows_reason
tests/guardrails/test_model_engine.py, tests/guardrails/test_engine_registry.py, tests/guardrails/test_guardrails.py
TestExtractToolCalls covers argument normalization, multi-turn accumulation, malformed JSON, and engine fallback; TestEngineRegistryToolDelegation covers delegation and KeyError; TestUnsupportedFlowsReason covers message formatting, normalization, deduplication, and edge cases.
RailsManager and e2e tests updated to new tool safety API
tests/guardrails/test_rails_manager.py, tests/guardrails/test_tool_rails_e2e.py
Removes Toolset/ToolResult/Tool usage; adds _tool_rails_manager_with_main, WEATHER_TOOL, and _UNLINKED_RESULT_MESSAGES; rewrites tool-call/result tests with dict-based tool specs and message-based inputs; updates span-capture assertions; e2e drive helpers simplified to pass directly into updated RailsManager APIs.
IORails tool rails integration test suite
tests/guardrails/test_tool_rails_iorails.py
New 362-line suite with mocked-transport fixtures covering routing/capability, non-streaming/streaming tool call allow-block, speculative tool calls, tool results, per-request toggles, and fail-closed duplicate-tool behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant IORails
  participant RailsManager
  participant EngineRegistry
  participant ModelEngine
  participant LLM

  rect rgba(135, 206, 250, 0.5)
    note over Client,IORails: Tool-result pre-check (tool_input rail)
    Client->>IORails: generate_async(messages, options)
    IORails->>RailsManager: are_tool_results_safe(messages, enabled=tool_input_enabled)
    RailsManager->>EngineRegistry: extract_tool_results(model_type, messages)
    RailsManager->>EngineRegistry: extract_tool_calls(model_type, messages)
    EngineRegistry->>ModelEngine: extract_tool_calls(messages)
    ModelEngine-->>EngineRegistry: list[ToolCall]
    RailsManager-->>IORails: RailResult (safe/unsafe)
  end
  alt tool result unsafe
    IORails-->>Client: REFUSAL_MESSAGE
  end
  IORails->>IORails: run input rails
  IORails->>LLM: prompt
  LLM-->>IORails: response with tool_calls
  rect rgba(255, 165, 0, 0.5)
    note over IORails,RailsManager: Tool-call post-check (tool_output rail)
    IORails->>RailsManager: are_tool_calls_safe(tool_calls, llm_params, enabled=tool_output_enabled)
    RailsManager->>EngineRegistry: parse_tools(model_type, llm_params)
    RailsManager-->>IORails: RailResult (safe/unsafe)
  end
  alt tool call unsafe
    IORails-->>Client: REFUSAL_MESSAGE / guardrails_violation
  end
  IORails->>IORails: run output rails
  IORails-->>Client: final response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • NVIDIA-NeMo/Guardrails#1972: Instruments IORails/EngineRegistry/RailsManager to capture delivered output including refusal/error JSON during generate/stream_async, directly overlapping the tool-blocking and refusal-emission paths modified in this PR.
  • NVIDIA-NeMo/Guardrails#2016: Implements non-streaming OpenAI-style tool-calling end-to-end by updating iorails.py _do_generate and model_engine.py tool-call parsing—the same pipeline this PR extends with safety rail checks.
  • NVIDIA-NeMo/Guardrails#2024: Modifies IORails.stream_async around streamed tool-call delta accumulation and terminal chunk handling, which this PR builds on to add tool-rail validation and guardrails-violation emission at stream finalization.

Suggested labels

enhancement, size: L

Suggested reviewers

  • cparisien
  • Pouyanpi
🚥 Pre-merge checks | ✅ 5 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 39.25% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (5 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Test Results For Major Changes ✅ Passed PR contains major changes (new features, method signatures, refactoring) and includes comprehensive test results: 4899 passed, 180 skipped unit tests, plus integration test demonstration with succe...
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: connecting tool-calling functionality to the IORails component, which is the primary objective across multiple modified files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/iorails-tool-calling-api

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/guardrails/test_tool_rails_iorails.py`:
- Around line 299-303: The test
test_unlinked_tool_result_blocked_before_generation relies on guard behavior to
prevent provider calls but does not explicitly mock or assert against network
transport usage, which could allow accidental regressions to hit live provider
services. Inject a mock HTTP client (typically the post method used for provider
communication) into the iorails fixture or test setup, then add an assertion
after the generate_async call to verify that the mocked post method was never
invoked. Apply this same hardening pattern to the other related test mentioned
at line 330-332 to ensure both tests are network-impossible rather than
network-accidental.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5a59e343-73c9-4aa7-8c9a-6c2248e662cc

📥 Commits

Reviewing files that changed from the base of the PR and between 3704c79 and d921bca.

📒 Files selected for processing (10)
  • nemoguardrails/guardrails/engine_registry.py
  • nemoguardrails/guardrails/iorails.py
  • nemoguardrails/guardrails/model_engine.py
  • nemoguardrails/guardrails/rails_manager.py
  • tests/guardrails/test_engine_registry.py
  • tests/guardrails/test_guardrails.py
  • tests/guardrails/test_model_engine.py
  • tests/guardrails/test_rails_manager.py
  • tests/guardrails/test_tool_rails_e2e.py
  • tests/guardrails/test_tool_rails_iorails.py

Comment thread tests/guardrails/test_tool_rails_iorails.py
@github-actions github-actions Bot added size: L and removed size: XL labels Jun 22, 2026
@tgasser-nv tgasser-nv marked this pull request as ready for review June 22, 2026 21:41
@tgasser-nv tgasser-nv requested review from Pouyanpi and cparisien June 22, 2026 22:52
@tgasser-nv tgasser-nv changed the title feat(iorails): Connect tool-calling up to IORails-level feat(iorails): Connect tool-calling up in IORails Jun 22, 2026
@Pouyanpi

Copy link
Copy Markdown
Collaborator

@tgasser-nv this is what claude code review returns, might be worth looking at

Code Review — PR #2058: IORails tool-calling rails

Scope: 321d7427c..HEAD (the IORails tool-calling commits), implementation under nemoguardrails/guardrails/. Findings ranked most-severe first; correctness before cleanup/conventions.

Correctness

1. Truncated/malformed streamed tool-call arguments silently degrade to {} and can pass the tool-call rail

model_engine.py:283-286 (_finalize_tool_calls) catches JSONDecodeError and sets arguments = {}. The streaming path in iorails.py:947 then validates that same degraded object via are_tool_calls_safe. If the tool's schema declares no required fields (or none at all), jsonschema.validate({}, schema) passes, so a tool call whose real arguments were truncated mid-stream is forwarded to the caller/agent as "safe" with empty args.

Worse, the two generate paths disagree on the same input: non-streaming _parse_chat_completion (model_engine.py:152) routes through ChatMessage.from_dict, which raises ValueError on malformed args → ModelEngineError (hard failure); streaming degrades to {} (graceful). Identical provider output → engine error on one path, silent empty-args tool call on the other. (Surfaced independently by finders A, C, D, E, F.)

2. Duplicate prior tool-call id across turns fail-closes the whole conversation

tool_result_action.py:65-70: calls_by_id is built from every assistant turn's tool calls (Chat Completions resends full history). Two turns sharing a call_idis_safe=False, "duplicate prior tool call id … makes linkage ambiguous". Provider/proxy ids are only guaranteed unique within a turn; backends that recycle call_0/call_1 per turn will block a legitimate multi-turn conversation that LLMRails would have allowed.

3. One malformed historical tool-call blocks the entire request

model_engine.py:391 (_extract_tool_calls_openai) calls ChatMessage.from_dict on all assistant turns. A single earlier turn with non-JSON arguments (e.g. truncated by an upstream proxy) raises ValueError, caught in are_tool_results_safe as "prior tool call extraction failed" → blocks — even when the incoming tool result links to a different, well-formed call. Blast radius is the whole history, not the linked call.

4. List-valued tool-rail toggle matches raw flow ids, not normalized names (fail-open)

rails_manager.py:271-286 (_enabled_tool_flows): configured = list(actions.keys()) are the raw flow strings (keyed in _build_tool_actions at line 179), but a per-request enabled list is the normalized rail names a user knows. If a configured flow carries any suffix (tool call validation $model=… or (...)), flow in requested never matches → the rail is silently dropped and tool calls go unvalidated. The bare-name case works; the suffixed case fails open. (The normalized-name is the only form surfaced in unsupported_reason/error messages, so a user can't tell.)

5. Tool-result name-consistency check is inactive when the result omits a name

tool_result_action.py:81: if result.name and prior.function.name and result.name != prior.... OpenAI role:"tool" messages routinely omit name, so _extract_tool_results_openai sets name=None and the consistency check never runs. A result linking to a valid call_id while claiming to be a different tool's output passes — exactly the linkage the rail's docstring says it enforces. Fails open on a missing name rather than closed.

6. Argument validation is skipped entirely for function tools with no parameters

tool_schema.py:124: validate_arguments returns None whenever arguments_schema is None. _parse_tools_openai maps function.parameters straight through, so any function tool that simply omits parameters gets only the allowlist — a declared run_shell-style function with no parameters block lets the model emit arbitrary unchecked arguments. The docstring frames this as "intended for hosted tools," but the code does not distinguish a true hosted tool (name=None, type-only) from a function tool lacking parameters. Related: even when a schema is present, jsonschema defaults additionalProperties:true, so unexpected extra args pass unless the caller set additionalProperties:false. (A narrower variant: a function call whose name collides with a declared hosted tool's type resolves to the no-schema hosted entry and bypasses validation — tool_call_action.py:51.)

7. Asymmetric per-request enforcement: tool toggles honored, input/output toggles ignored

iorails.py:514-515 reads options.rails.tool_input/tool_output and threads them into are_tool_*_safe(..., enabled=...), but is_input_safe/is_output_safe (rails_manager.py:182,196) take no enabled arg — so on the IORails path options.rails.input=False is ignored while options.rails.tool_input=False disables the tool rail. The same options object enforces one rail family and not the other. Also worth a deployment note: if a server forwards untrusted client options unchanged, tool_input/tool_output=False fully disables the tool guardrails for that request.

8. Streaming block payload framing is inconsistent under include_metadata

iorails.py:962-964 frames the tool-call violation as {"text": violation} when include_metadata=True, but the output-rails block path (iorails.py:1055) yields the same helper's output as a raw JSON string. A metadata-aware client that expects every frame in dict shape will mis-parse output-rails errors. The PR refactored both to share _guardrails_violation_payload but left the framing divergent (the new tool path is the more-correct one).

Cleanup / Altitude

9. Duplicated stream-assembly logic and openai-only ceremony

  • _finalize_tool_calls/_accumulate_tool_call_delta (model_engine.py:216,272) duplicate the equivalents in nemoguardrails/llm/models/openai_chat.py — already flagged by the in-code TODO: duplicates _finalize_tool_calls (line 270). They've already drifted (this copy adds an id-collision warning and the {}-degrade policy the other lacks). Worth extracting one shared helper so the degrade policy is defined once.
  • The three _nim parsers/extractors (model_engine.py:330,362,397) are one-line pass-throughs to their _openai counterparts, registered in three parallel dicts (_TOOL_PARSERS/_RESULT_EXTRACTORS/_TOOL_CALL_EXTRACTORS) that already fall back to the openai function for unknown engines. Six dead declarations implying per-provider divergence that doesn't exist.
  • The block/refusal/metrics logic (tool-result Step 0, tool-call block) is written out twice — _do_generate vs the streaming generator — and _run_tool_call_rail/_run_tool_result_rail/_run_rail triplicate the span+mark_rail_stop+content-capture boilerplate. These will drift on the next change to the block contract.

(Lower-confidence altitude observations from finder G — the model_type="main" parameter is an unused seam never passed by any caller, and the supported-flow frozensets in IORails duplicate the _TOOL_ACTION_CLASSES registry as a source of truth — are worth a glance but not blocking.)

@Pouyanpi Pouyanpi left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @tgasser-nv . LGTM!

You can have a look at claude code's review above. For me the only thing we might need to address in future is commented below. Not blocking this PR

Comment thread nemoguardrails/guardrails/rails_manager.py
Base automatically changed from feat/iorails-tool-calling-rails to develop June 23, 2026 16:55
@tgasser-nv tgasser-nv force-pushed the feat/iorails-tool-calling-api branch from 430f067 to 65d3feb Compare June 23, 2026 17:14
@github-actions github-actions Bot added size: XL and removed size: L labels Jun 24, 2026
@tgasser-nv tgasser-nv added status: triaged Triaged by a maintainer; eligible for automated review (CodeRabbit/Greptile). and removed status: needs triage New issues that have not yet been reviewed or categorized. labels Jun 24, 2026
@tgasser-nv

Copy link
Copy Markdown
Collaborator Author

@tgasser-nv this is what claude code review returns, might be worth looking at

Code Review — PR #2058: IORails tool-calling rails

Addressed all but the refactoring one at the end (out-of-scope for this PR IMO).

@tgasser-nv tgasser-nv merged commit 3a35e45 into develop Jun 24, 2026
12 checks passed
@tgasser-nv tgasser-nv deleted the feat/iorails-tool-calling-api branch June 24, 2026 02:05
@tgasser-nv tgasser-nv mentioned this pull request Jun 29, 2026
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: XL status: triaged Triaged by a maintainer; eligible for automated review (CodeRabbit/Greptile).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants