Fix xAI Responses input normalization for replayed histories#3734
Fix xAI Responses input normalization for replayed histories#3734GGBond6 wants to merge 2 commits into
Conversation
|
This pull request targeted The base branch has been automatically changed to |
|
This pull request targeted The base branch has been automatically changed to |
There was a problem hiding this comment.
Code Review
This pull request introduces input normalization and filtering for the xAI executor, including converting custom tool calls to standard function calls, dropping web search calls, and removing encrypted content from reasoning items. The reviewer noted that dropping 'web_search_call' items without also dropping their corresponding 'web_search_call_output' items will lead to validation errors, and suggested updating both the implementation and the tests to handle this case.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| for _, item := range input.Array() { | ||
| if item.Get("type").String() == "web_search_call" { | ||
| changed = true | ||
| continue | ||
| } | ||
| items = append(items, json.RawMessage(item.Raw)) | ||
| } |
There was a problem hiding this comment.
When dropping web_search_call items from the replayed history, any corresponding web_search_call_output items must also be dropped. Leaving orphaned tool outputs in the history will cause xAI (and other OpenAI-compatible APIs) to reject the request with a validation error (e.g., unmatched tool response).
for _, item := range input.Array() {
itemType := item.Get(| func TestDropXAIInputWebSearchCalls(t *testing.T) { | ||
| body := []byte(`{"input":[{"type":"message","role":"user","content":"hi"},{"type":"web_search_call","status":"completed","action":{"type":"search","query":"test","queries":["test"]}},{"type":"function_call","call_id":"call_1","name":"lookup","arguments":"{}"},{"type":"function_call_output","call_id":"call_1","output":"ok"}]}`) | ||
| out := dropXAIInputWebSearchCalls(body) | ||
|
|
||
| if got := len(gjson.GetBytes(out, "input").Array()); got != 3 { | ||
| t.Fatalf("input length = %d, want 3: %s", got, string(out)) | ||
| } | ||
| if got := gjson.GetBytes(out, "input.0.type").String(); got != "message" { | ||
| t.Fatalf("input.0.type = %q, want message: %s", got, string(out)) | ||
| } | ||
| if got := gjson.GetBytes(out, "input.1.type").String(); got != "function_call" { | ||
| t.Fatalf("input.1.type = %q, want function_call: %s", got, string(out)) | ||
| } | ||
| if got := gjson.GetBytes(out, "input.2.type").String(); got != "function_call_output" { | ||
| t.Fatalf("input.2.type = %q, want function_call_output: %s", got, string(out)) | ||
| } | ||
| for _, item := range gjson.GetBytes(out, "input").Array() { | ||
| if item.Get("type").String() == "web_search_call" { | ||
| t.Fatalf("web_search_call should be dropped before sending to xAI: %s", string(out)) | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Update the test to include a web_search_call_output item and verify that both the call and its output are correctly dropped.
func TestDropXAIInputWebSearchCalls(t *testing.T) {
body := []byte(`{"input":[{"type":"message","role":"user","content":"hi"},{"type":"web_search_call","status":"completed","action":{"type":"search","query":"test","queries":["test"]}},{"type":"web_search_call_output","output":"results"},{"type":"function_call","call_id":"call_1","name":"lookup","arguments":"{}"},{"type":"function_call_output","call_id":"call_1","output":"ok"}]}`)
out := dropXAIInputWebSearchCalls(body)
if got := len(gjson.GetBytes(out, "input").Array()); got != 3 {
t.Fatalf("input length = %d, want 3: %s", got, string(out))
}
if got := gjson.GetBytes(out, "input.0.type").String(); got != "message" {
t.Fatalf("input.0.type = %q, want message: %s", got, string(out))
}
if got := gjson.GetBytes(out, "input.1.type").String(); got != "function_call" {
t.Fatalf("input.1.type = %q, want function_call: %s", got, string(out))
}
if got := gjson.GetBytes(out, "input.2.type").String(); got != "function_call_output" {
t.Fatalf("input.2.type = %q, want function_call_output: %s", got, string(out))
}
for _, item := range gjson.GetBytes(out, "input").Array() {
itemType := item.Get("type").String()
if itemType == "web_search_call" || itemType == "web_search_call_output" {
t.Fatalf("%s should be dropped before sending to xAI: %s", itemType, string(out))
}
}
}There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 63a3bf3b69
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| body = dropXAIInputWebSearchCalls(body) | ||
| body = normalizeXAIInputReasoningItems(body) | ||
| body = normalizeCodexInstructions(body) | ||
| body = appendXAICodexProgressInstructions(body, opts) |
There was a problem hiding this comment.
Gate Codex progress instructions to Codex requests
This call runs for every xAI Responses request prepared by Execute, ExecuteStream, and CountTokens, including ordinary OpenAI-compatible Grok chats where opts.SourceFormat is not Codex-specific. In that scenario a user’s normal request gets extra Codex coding progress/update requirements injected into instructions, which can change visible model behavior and token accounting; this should be gated on the Codex workflow/request metadata rather than applied unconditionally.
Useful? React with 👍 / 👎.
63a3bf3 to
e17fe95
Compare
|
I verified the replayed-history xAI cases in this PR against the Environment:
Results on I also checked the same No token or auth values are included here. |
|
I built and tested this together with PR #3726 on top of the Build under test:
Focused tests passed:
Live xAI/Grok Build verification against the combined build: This confirms #3734 fixes the deterministic xAI 422 failures I saw on both No token or auth values are included here. |
Summary
Test