Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 12 additions & 6 deletions integration/test/tools_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -172,6 +172,10 @@ func (a *Agent) ChatWithResponse(ctx context.Context, t *testing.T, userMessage
tools := a.convertMCPToolsToOpenAI()
maxIterations := 10

var errFound error
failedTool := ""
argsUsed := make(map[string]any)
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

argsUsed := make(map[string]any) is immediately overwritten via argsUsed = args; the initial allocation is unused. You can declare it as var argsUsed map[string]any (or assign when the failure occurs) to avoid the extra allocation.

Suggested change
argsUsed := make(map[string]any)
var argsUsed map[string]any

Copilot uses AI. Check for mistakes.

for range maxIterations {
completion, err := a.openaiClient.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{
Model: a.model,
Expand All @@ -185,7 +189,6 @@ func (a *Agent) ChatWithResponse(ctx context.Context, t *testing.T, userMessage
}

assistantMessage := completion.Choices[0].Message

if len(assistantMessage.ToolCalls) == 0 {
return assistantMessage.Content, nil
}
Expand All @@ -205,20 +208,23 @@ func (a *Agent) ChatWithResponse(ctx context.Context, t *testing.T, userMessage
result, err := a.callMCPTool(ctx, toolName, args)
if err != nil {
if expectedOntapErrorStr != "" && strings.Contains(err.Error(), expectedOntapErrorStr) {
slog.Debug("Expected tool error", slog.String("tool", toolName), slog.Any("error", err))
} else {
t.Errorf("Tool %q returned error LLM will retry: %v", toolName, err)
return "", nil // test passed, expected error was observed
}
failedTool = toolName
argsUsed = args
errFound = err
slog.Warn("LLM will retry", slog.String("tool", toolName), slog.Any("args", args), slog.Any("error", err))

Comment on lines +213 to +217
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description says retry-on-error for ontap_get should be logged as debug, but this change logs retries with slog.Warn(...). Warn will typically appear in default test output and can still create noisy logs.

If the intent is to reduce noise for self-correcting retries, switch this to slog.Debug (and optionally gate it to the specific ontap_get validation error case).

Copilot uses AI. Check for mistakes.
result = "Error: " + err.Error()
Comment on lines 209 to 218
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning early when expectedOntapErrorStr matches (return "", nil) changes the semantics of ChatWithResponse: the tool error is never appended to messages, so the LLM never gets the error feedback to produce a final assistant message (and callers get an empty response).

If the goal is only to avoid failing the test on expected tool errors, consider keeping the loop behavior (append the tool error message and let the assistant respond) while suppressing t.Errorf/high-severity logging for that case, or return a non-empty response indicating the expected error was observed.

Copilot uses AI. Check for mistakes.
}

slog.Debug("", slog.Any("Tool result", result))

messages = append(messages, openai.ToolMessage(result, toolCall.ID))
}
}

return "", fmt.Errorf("max iterations (%d) reached without final assistant message", maxIterations)
t.Errorf("Tool %q args %v returned error %v", failedTool, argsUsed, errFound)
return "", fmt.Errorf("max iterations (%d) reached; last tool %q args %v error: %w", maxIterations, failedTool, argsUsed, errFound)
Comment on lines +226 to +227
Copy link

Copilot AI Apr 10, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fmt.Errorf(... %w, errFound) can return a nil error when errFound is nil (Go treats wrapping nil as producing nil). If maxIterations is reached without any tool error, this function would (1) call t.Errorf(...) with empty tool/args and (2) potentially return a nil error to callers, allowing tests to continue incorrectly.

Consider returning a non-wrapped error when errFound == nil (use %v or a separate errors.New/fmt.Errorf without %w), and only include tool/args context when you actually have a recorded tool failure.

Suggested change
t.Errorf("Tool %q args %v returned error %v", failedTool, argsUsed, errFound)
return "", fmt.Errorf("max iterations (%d) reached; last tool %q args %v error: %w", maxIterations, failedTool, argsUsed, errFound)
if errFound != nil {
t.Errorf("Tool %q args %v returned error %v", failedTool, argsUsed, errFound)
return "", fmt.Errorf("max iterations (%d) reached; last tool %q args %v error: %w", maxIterations, failedTool, argsUsed, errFound)
}
t.Errorf("max iterations (%d) reached without completing response", maxIterations)
return "", fmt.Errorf("max iterations (%d) reached without completing response", maxIterations)

Copilot uses AI. Check for mistakes.
}

func (a *Agent) Close() {
Expand Down
Loading