Commit 6cf5799

fix: further evaluation fixes
- Strengthen gemini-scheduled-triage prompt to enforce shell command usage
- Use fuzzy matching for tool names in gemini-plan-execute to handle MCP prefixes
1 parent a607486 commit 6cf5799

2 files changed

Lines changed: 4 additions & 2 deletions
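
The fuzzy matching named in the commit message can be illustrated with a minimal sketch. It assumes that an MCP-prefixed tool name has the expected action as a substring; the helper names `matchesExact` and `matchesFuzzy` and the sample tool names are hypothetical and do not appear in the repository.

```typescript
// Old check: the expected action must equal a recorded tool name exactly.
function matchesExact(toolNames: string[], action: string): boolean {
  return toolNames.includes(action);
}

// New check: the expected action may appear anywhere inside a recorded name,
// so a name carrying an extra prefix still counts as a match.
function matchesFuzzy(toolNames: string[], action: string): boolean {
  return toolNames.some((n) => n.includes(action));
}

// Assumed sample data: one prefixed tool name and one plain tool name.
const toolNames: string[] = [
  'mcp_github_create_pull_request',
  'run_shell_command',
];

// The prefixed name fails the exact check but passes the substring check.
const exact: boolean = matchesExact(toolNames, 'create_pull_request');
const fuzzy: boolean = matchesFuzzy(toolNames, 'create_pull_request');

// Trade-off: substring matching also accepts partial names.
const loose: boolean = matchesFuzzy(toolNames, 'create');

console.log({ exact, fuzzy, loose });
```

The substring check is more permissive than an exact comparison, so an expected action that is a fragment of an unrelated tool name would also pass. Whether that matters depends on how distinct the expected tool names in the eval data are.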

.github/commands/gemini-scheduled-triage.toml

Lines changed: 3 additions & 1 deletion
@@ -85,7 +85,9 @@ Iterate through each issue object. For each issue:
 
 ### Step 5: Construct and Write Output
 
-Assemble the results into a single JSON array, formatted as a string, according to the **Output Specification** below. Finally, execute the command to write this string to the output file.
+Assemble the results into a single JSON array, formatted as a string, according to the **Output Specification** below.
+
+**CRITICAL: You MUST NOT output the JSON directly.** Your ONLY output MUST be a single `run_shell_command` that appends the triaged issues to the environment file.
 
 - Use the shell command to write using a heredoc to prevent quote escaping issues:
```bash

evals/gemini-plan-execute.eval.ts

Lines changed: 1 addition & 1 deletion
@@ -52,7 +52,7 @@ describe('Gemini Plan Execution Workflow', () => {
 item.expected_tools.length === 0 ||
 item.expected_tools.some(
   (action) =>
-    toolNames.includes(action) ||
+    toolNames.some((n) => n.includes(action)) ||
     toolCalls.some(
       (c) =>
         c.name === 'run_shell_command' && c.args.includes(action),
