fix: remove system role filtering, TrustyAI now supports system messages by liavweiss · Pull Request #232 · opendatahub-io/ai-gateway-payload-processing

liavweiss · 2026-05-04T09:53:33Z

Summary

Remove system message filtering from nemo-request-guard. Previously, messages with role: "system" were filtered out before sending to NeMo because the TrustyAI /v1/guardrail/checks endpoint rejected them with a hard error (Unsupported message role: 'system').
TrustyAI has since merged feat(api): Support input guardrail checks when role: 'system' (#47).
All messages (including system) are now forwarded to NeMo for full conversation context evaluation.

Changes

request_guard.go: Removed the if role == "system" { continue } filter in extractOpenAIMessages. Updated comments.
request_guard_test.go: Updated tests to expect system messages in the forwarded payload.

E2E verification

Verified end-to-end by building the TrustyAI NeMo image locally from the develop branch (Dockerfile.server) which includes PR #47, and deploying it on a Kind cluster with detect sensitive data on input rails:

Test	Messages	Result
Safe system + user	`system: "You are a helpful assistant"` + `user: "Hello"`	success (both checked)
PII in system msg	`system: "card 4111-1111-1111-1111"` + `user: "Hello"`	blocked (system detected)
System-only with PII	`system: "Contact John Smith at john@example.com"`	blocked (system detected)

cc @nirrozenbaum @christinaexyou

Summary by CodeRabbit

Bug Fixes
- System messages are now included when forwarding conversations to NeMo, ensuring complete message history is transmitted.
Tests
- Unit tests updated to expect and validate all message roles (including system) and adjusted cases for single-system vs. system-only conversations.

coderabbitai · 2026-05-04T09:53:46Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: b014172d-6e62-4c78-b15c-502da912af4b

📥 Commits

Reviewing files that changed from the base of the PR and between c704557 and 32c73fc.

📒 Files selected for processing (2)

pkg/plugins/nemo/request_guard.go
pkg/plugins/nemo/request_guard_test.go

✅ Files skipped from review due to trivial changes (1)

pkg/plugins/nemo/request_guard.go

🚧 Files skipped from review as they are similar to previous changes (1)

pkg/plugins/nemo/request_guard_test.go

📝 Walkthrough

Walkthrough

The change removes the special-case filtering of OpenAI system messages in extractOpenAIMessages; the function now forwards the full messages slice (all roles: system, user, assistant, tool) to NeMo and returns nil only for an empty slice. Inline documentation was updated to reflect forwarding all conversation messages. Tests were modified to expect preserved ordering and inclusion of system messages and to add a single-system-message case.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Security Assessment

CWE-20 (Improper Input Validation): The previous behavior allowed role == "system" messages to bypass the guard. This change correctly forwards system messages for validation; verify NeMo actually validates them and does not silently ignore or drop roles.
CWE-94 (Code Injection / Injection via tool responses): Forwarding tool role content to NeMo reduces the risk of unvalidated tool-response prompt injection. Confirm NeMo treats tool content as untrusted input.
Remaining risk — false confidence: If NeMo’s contract filters roles or only validates the last user message, forwarding here may create a false sense of protection. Action: add a runtime assertion or integration test that NeMo acknowledges/returns validation results per-role or per-message count.
Actionable checks:
1. Confirm NeMo API/schema accepts and validates all roles (system, user, assistant, tool) and returns per-message validation results.
2. Add an integration test that sends mixed-role messages and asserts NeMo processed the same number/order of messages.
3. Ensure logging/metrics capture when any message is dropped or rejected by NeMo so failures are detectable.
4. Limit and/or validate message sizes and attachment types before forwarding to avoid resource exhaustion or payload-based bypasses (rate/size checks).

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately describes the main change: removing system role filtering from NeMo request guard to enable forwarding of system messages now that TrustyAI supports them.
Linked Issues check	✅ Passed	Code changes directly address issue `#160` by forwarding all message roles (system, user, assistant, tool) to NeMo instead of filtering out system messages, enabling comprehensive guardrail validation across full conversation context.
Out of Scope Changes check	✅ Passed	All changes are scoped to removing system message filtering and updating corresponding tests; no unrelated modifications to other functionality or components detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/plugins/nemo/request_guard.go (1)

160-167: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Block non-string content instead of silently turning it into "".

Severity: critical. Exploit: send a chat message with OpenAI array-form content parts, e.g. a system or user message whose content is [{type:"text",text:"..."}]. This code forwards {"content": ""} to NeMo, while the original request still carries the real instruction to the downstream model, creating a guardrail bypass on the exact path this PR is trying to harden. CWE-20.

Suggested fix

 func extractOpenAIMessages(raw any) ([]map[string]string, error) {
 	slice, ok := raw.([]any)
 	if !ok {
 		return nil, fmt.Errorf("messages is not an array")
 	}
-	var messages []map[string]string
+	messages := make([]map[string]string, 0, len(slice))
 
-	for _, m := range slice {
+	for i, m := range slice {
 		msg, ok := m.(map[string]any)
 		if !ok {
-			continue
+			return nil, fmt.Errorf("message at index %d is not an object", i)
 		}
 		role, _ := msg["role"].(string)
-		content, _ := msg["content"].(string)
+		if role == "" {
+			return nil, fmt.Errorf("message at index %d has invalid role", i)
+		}
+		content, err := extractOpenAIContent(msg["content"])
+		if err != nil {
+			return nil, fmt.Errorf("message at index %d: %w", i, err)
+		}
 		messages = append(messages, map[string]string{"role": role, "content": content})
 	}
 	return messages, nil
 }
+
+func extractOpenAIContent(raw any) (string, error) {
+	switch v := raw.(type) {
+	case string:
+		return v, nil
+	case []any:
+		parts := make([]string, 0, len(v))
+		for i, p := range v {
+			part, ok := p.(map[string]any)
+			if !ok {
+				return "", fmt.Errorf("content part %d is not an object", i)
+			}
+			if part["type"] != "text" {
+				continue
+			}
+			text, ok := part["text"].(string)
+			if !ok {
+				return "", fmt.Errorf("content part %d has invalid text", i)
+			}
+			parts = append(parts, text)
+		}
+		if len(parts) == 0 {
+			return "", fmt.Errorf("unsupported content format")
+		}
+		return strings.Join(parts, "\n"), nil
+	default:
+		return "", fmt.Errorf("unsupported content format")
+	}
+}

As per coding guidelines, **: REVIEW PRIORITIES: 1. Security vulnerabilities (provide severity, exploit scenario, and remediation code).

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@pkg/plugins/nemo/request_guard.go` around lines 160 - 167, The loop currently
coerces non-string message content into an empty string (content, _ :=
msg["content"].(string)) which allows array-form or structured contents to be
forwarded; instead, detect when msg["content"] is not a string and reject the
request/path. Change the logic in the loop that iterates over slice (the
variables m, msg, role, content, messages) to validate content using a strict
type assertion (e.g., v, ok := msg["content"]; s, isString := v.(string)) and if
not isString return or mark the request invalid (do not append a map with ""),
logging or surfacing an error so the guard blocks any messages whose content is
not a plain string. Ensure the same strict behavior for role if applicable.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@pkg/plugins/nemo/request_guard.go`:
- Around line 160-167: The loop currently coerces non-string message content
into an empty string (content, _ := msg["content"].(string)) which allows
array-form or structured contents to be forwarded; instead, detect when
msg["content"] is not a string and reject the request/path. Change the logic in
the loop that iterates over slice (the variables m, msg, role, content,
messages) to validate content using a strict type assertion (e.g., v, ok :=
msg["content"]; s, isString := v.(string)) and if not isString return or mark
the request invalid (do not append a map with ""), logging or surfacing an error
so the guard blocks any messages whose content is not a plain string. Ensure the
same strict behavior for role if applicable.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: e66b650b-7947-45f5-9725-7564e910c72a

📥 Commits

Reviewing files that changed from the base of the PR and between 74db5d3 and c704557.

📒 Files selected for processing (2)

pkg/plugins/nemo/request_guard.go
pkg/plugins/nemo/request_guard_test.go

nirrozenbaum · 2026-05-04T12:26:32Z

@liavweiss making sure - was this change tested with latest trusty version?

nirrozenbaum · 2026-05-04T14:28:58Z

 			continue
 		}
 		role, _ := msg["role"].(string)
-		if role == "system" {


would it be correct to just return “slice” without copying?
in current code we copy the whole messages array on every request.
so if we have a long multi turn chat, memory allocation for each turn is expensive.

Thanks, I added a pre-allocation and an early return for empty slices to avoid unnecessary allocations.

my question was actually about the messages themselves, not only about the case of empty slice.
my intention was - if we have something along these lines:

func extractOpenAIMessages(raw any) ([]map[string]string, error) { slice, ok := raw.([]any) if !ok { return nil, fmt.Errorf("messages is not an array") } if len(slice) == 0 { return nil, nil } return slice

would it work?

Should work with some extra changes, but returning the slice directly means forwarding all original message fields to NeMo (e.g., tool_calls, function_call) instead of only role and content. NeMo ignores extra fields, but it sends more data over the wire for no reason. It also requires changing the return type from []map[string]string to []any and updating all callers and tests accordingly. If we want to go in that direction, it would be better as a follow-up PR.

sure. I'm just thinking about the tradeoffs.
do we prefer allocating memory for every request and copying the whole chat? (that could be very large) or do we prefer sending the same request that we received over the wire to yet another http hop? (but at least we don't do both - allocate memory + sending http).

there are pros/cons for both sides.
my gut feeling says sending the original request as is would probably be "cheaper".
we can handle in a follow up, for sure.

Signed-off-by: Liav Weiss <lweiss@lweiss-thinkpadx1carbongen11.raanaii.csb>

liavweiss · 2026-05-05T06:14:52Z

@liavweiss making sure - was this change tested with latest trusty version?

Yes, I built the TrustyAI NeMo image locally that includes the new code #47

nirrozenbaum

/lgtm
/approve

openshift-ci · 2026-05-05T20:42:47Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liavweiss, nirrozenbaum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [nirrozenbaum]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot requested review from noyitz and yossiovadia May 4, 2026 09:53

coderabbitai Bot reviewed May 4, 2026

View reviewed changes

nirrozenbaum reviewed May 4, 2026

View reviewed changes

fix: remove system role filtering, TrustyAI now supports system messages

32c73fc

Signed-off-by: Liav Weiss <lweiss@lweiss-thinkpadx1carbongen11.raanaii.csb>

liavweiss force-pushed the fix/remove-system-role-filtering branch from c704557 to 32c73fc Compare May 5, 2026 06:14

nirrozenbaum approved these changes May 5, 2026

View reviewed changes

openshift-ci Bot assigned nirrozenbaum May 5, 2026

openshift-ci Bot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 5, 2026

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 5, 2026

openshift-merge-bot Bot merged commit 1d7b369 into opendatahub-io:main May 5, 2026
10 checks passed

liavweiss mentioned this pull request May 14, 2026

nemo-guardrails: Avoid per-request message copy in nemo-request-guard #272

Closed

Conversation

liavweiss commented May 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

E2E verification

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Security Assessment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

nirrozenbaum commented May 4, 2026

Uh oh!

nirrozenbaum May 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

liavweiss May 5, 2026

Choose a reason for hiding this comment

Uh oh!

nirrozenbaum May 5, 2026

Choose a reason for hiding this comment

Uh oh!

liavweiss May 5, 2026

Choose a reason for hiding this comment

Uh oh!

nirrozenbaum May 5, 2026

Choose a reason for hiding this comment

Uh oh!

liavweiss commented May 5, 2026

Uh oh!

nirrozenbaum left a comment

Choose a reason for hiding this comment

Uh oh!

openshift-ci Bot commented May 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

liavweiss commented May 4, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 4, 2026 •

edited

Loading

nirrozenbaum May 4, 2026 •

edited

Loading