Fix eval monitors error messages#953
Conversation
📝 WalkthroughWalkthroughThis PR enhances error handling and logging safety across evaluation components by introducing a request error sanitization utility, integrating it throughout trace fetching and runner error handlers, improving OAuth token error messages, and removing sensitive endpoint fields from monitor startup logs. ChangesSafe Request Error Handling and Logging
Sequence Diagram(s)(No sequence diagram generated — changes are primarily error-handling improvements and logging adjustments without complex multi-component interactions warranting visualization.) Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@libs/amp-evaluation/src/amp_evaluation/runner.py`:
- Around line 711-713: The except ValueError block currently logs and appends
the raw exception (logger.warning("Cannot fetch traces: %s", e) and
errors.append(f"Trace fetching failed: {e}")), which can leak request URLs;
replace those uses to pass the exception through the existing sanitizer (call
_safe_request_error(e) or otherwise redact URL content) and log/append the
sanitized string instead (update the logger.warning call and the errors.append
call to use the sanitized result), referencing the except ValueError handler,
logger.warning, errors.append, and the _safe_request_error helper mentioned in
trace/fetcher.py.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f1f1096a-756f-4c73-99d7-187d03d25d6b
📒 Files selected for processing (3)
evaluation-job/main.pylibs/amp-evaluation/src/amp_evaluation/runner.pylibs/amp-evaluation/src/amp_evaluation/trace/fetcher.py
| except ValueError as e: | ||
| logger.warning(f"Cannot fetch traces: {e}") | ||
| logger.warning("Cannot fetch traces: %s", e) | ||
| errors.append(f"Trace fetching failed: {e}") |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In Python requests, do InvalidURL, MissingSchema, and InvalidSchema inherit from both RequestException and ValueError?
💡 Result:
Yes. In the Requests library, the exception classes MissingSchema, InvalidSchema, and InvalidURL are declared as subclasses of both requests.exceptions.RequestException and ValueError. In requests.exceptions, you can see the class definitions: - class MissingSchema(RequestException, ValueError) [1] - class InvalidSchema(RequestException, ValueError) [1] - class InvalidURL(RequestException, ValueError) [1] This is also reflected directly in the Requests source for exceptions.py. [2]
Citations:
- 1: https://requests.readthedocs.io/en/latest/_modules/requests/exceptions/
- 2: https://github.com/psf/requests/blob/main/src/requests/exceptions.py
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/runner.py"
echo "== File existence =="
ls -l "$FILE" || true
echo "== Show lines around 680-760 =="
sed -n '680,760p' "$FILE" | nl -ba | sed 's/^/ /'
echo "== Find _safe_request_error definition =="
rg -n "_safe_request_error" "$FILE"
echo "== Show _safe_request_error body =="
# Print a window around the first match
LINE=$(rg -n "_safe_request_error" "$FILE" | head -n1 | cut -d: -f1)
START=$((LINE-30))
END=$((LINE+60))
sed -n "${START},${END}p" "$FILE" | nl -ba | sed 's/^/ /'
echo "== Locate the broader exception branch that mentions _safe_request_error =="
rg -n "_safe_request_error\\(e\\)" "$FILE"Repository: wso2/agent-manager
Length of output: 276
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/runner.py"
echo "== Show lines around 690-740 with line numbers (no nl) =="
sed -n '690,740p' "$FILE" | awk '{printf "%4d %s\n", NR+689, $0}'
echo "== Show lines around 700-725 with line numbers =="
sed -n '700,725p' "$FILE" | awk '{printf "%4d %s\n", NR+699, $0}'
echo "== Find _safe_request_error definition =="
rg -n "_safe_request_error" "$FILE" || true
echo "== Show _safe_request_error body (a window around its first definition) =="
DEF_LINE=$(rg -n "_safe_request_error" "$FILE" | head -n1 | cut -d: -f1)
START=$((DEF_LINE-20))
END=$((DEF_LINE+80))
sed -n "${START},${END}p" "$FILE" | awk -v s=$START '{printf "%4d %s\n", s+NR-1, $0}'
echo "== Locate ValueError branch =="
rg -n "except ValueError as e" "$FILE" || true
echo "== Show the ValueError branch block (a small window) =="
VAL_LINE=$(rg -n "except ValueError as e" "$FILE" | head -n1 | cut -d: -f1)
START=$((VAL_LINE-10))
END=$((VAL_LINE+12))
sed -n "${START},${END}p" "$FILE" | awk -v s=$START '{printf "%4d %s\n", s+NR-1, $0}'Repository: wso2/agent-manager
Length of output: 9186
🏁 Script executed:
#!/bin/bash
set -euo pipefail
REPO_ROOT="."
FILE="$(fd -a "fetcher.py" libs/amp-evaluation/src/amp_evaluation/trace | head -n1 || true)"
echo "fetcher.py: $FILE"
if [ -z "${FILE:-}" ]; then
# fallback: search for _safe_request_error definition
echo "Locating _safe_request_error definition..."
rg -n "def _safe_request_error" libs/amp-evaluation/src/amp_evaluation -S || true
exit 0
fi
echo "== Show _safe_request_error definition =="
DEF_LINE=$(rg -n "def _safe_request_error" "$FILE" | head -n1 | cut -d: -f1)
START=$((DEF_LINE-20))
END=$((DEF_LINE+80))
sed -n "${START},${END}p" "$FILE" | awk -v s=$START '{printf "%4d %s\n", NR+s, $0}'
echo "== Find any references to _safe_request_error(e) =="
rg -n "_safe_request_error\\(e\\)" -S libs/amp-evaluation/src/amp_evaluation/trace "$FILE" || true
echo "== Find mention of trace fetching errors =="
rg -n "Trace fetching|fetch traces|Failed to fetch traces" "$FILE" -S || trueRepository: wso2/agent-manager
Length of output: 4440
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/runner.py"
echo "== Context before except ValueError (try start area) =="
sed -n '620,730p' "$FILE" | awk '{print NR+619, $0}'
echo "== Find 'Cannot fetch traces' occurrences =="
rg -n "Cannot fetch traces" "$FILE" -n
echo "== Find surrounding 'Trace fetching' logic =="
rg -n "Trace fetching failed|Trace fetching error|Error during trace fetching" "$FILE" -n
echo "== Locate the function/method containing this block =="
# Print the closest def/class above line 711
awk 'NR==711{print "target line is 711"} {print NR ":" $0}' "$FILE" | sed -n '680,740p' >/tmp/snippet.txt
tail -n +1 /tmp/snippet.txt | head -n 40Repository: wso2/agent-manager
Length of output: 7473
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/runner.py"
echo "== Find _fetch_traces definition =="
rg -n "def _fetch_traces" "$FILE"
LINE=$(rg -n "def _fetch_traces" "$FILE" | head -n1 | cut -d: -f1)
START=$((LINE-30))
END=$((LINE+160))
sed -n "${START},${END}p" "$FILE" | awk '{print NR+ '"$START"' -1 ":", $0}' | head -n 220Repository: wso2/agent-manager
Length of output: 8615
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py"
echo "== Find fetch_traces implementations =="
rg -n "def fetch_traces|fetch_traces\\(" "$FILE"
echo "== Show likely fetch_traces body (first match) =="
LINE=$(rg -n "def fetch_traces" "$FILE" | head -n1 | cut -d: -f1)
START=$((LINE-30))
END=$((LINE+220))
sed -n "${START},${END}p" "$FILE" | awk 'BEGIN{c=0} {print NR+ '"$START"-1'" ":" $0}' | head -n 260
echo "== Look for requests.* calls in fetcher.py =="
rg -n "requests\\.(get|post|request)|_request\\(|HTTPError|MissingSchema|InvalidURL|InvalidSchema" "$FILE"
echo "== Search for any broad try/except that converts exceptions =="
rg -n "except (ValueError|RequestException|requests\\.exceptions|Exception)" "$FILE" | head -n 50Repository: wso2/agent-manager
Length of output: 414
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py"
echo "== Show around fetch_traces method definition (line near 427) =="
sed -n '400,560p' "$FILE" | nl -ba | head -n 200Repository: wso2/agent-manager
Length of output: 188
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py"
echo "== requests usage in fetcher.py =="
rg -n "requests\\.(get|post|request)|_request\\(|HTTPError|MissingSchema|InvalidURL|InvalidSchema|InvalidSchema" "$FILE"Repository: wso2/agent-manager
Length of output: 330
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py"
echo "== Relevant exception handling in fetcher.py =="
rg -n "except (ValueError|RequestException|requests\\.exceptions\\.|Exception)" "$FILE" | head -n 80Repository: wso2/agent-manager
Length of output: 264
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py"
echo "== Show around earlier caller 'traces = fetcher.fetch_traces(' (line ~373) =="
sed -n '330,440p' "$FILE" | nl -ba | head -n 160Repository: wso2/agent-manager
Length of output: 201
🏁 Script executed:
#!/bin/bash
set -euo pipefail
FILE="libs/amp-evaluation/src/amp_evaluation/trace/fetcher.py"
echo "== Show fetch_traces method around requests.get and exception handlers =="
sed -n '380,560p' "$FILE" | awk 'BEGIN{c=0} {c++; printf "%4d %s\n", 380+c-1, $0}'Repository: wso2/agent-manager
Length of output: 7349
Sanitize the except ValueError trace-fetching path to avoid request URL leakage.
In libs/amp-evaluation/src/amp_evaluation/runner.py, the except ValueError as e block logs and appends e directly (lines 711-713). Since requests URL exceptions like MissingSchema/InvalidSchema/InvalidURL inherit from ValueError, they can hit this branch when trace/fetcher.py re-raises requests.exceptions.RequestException. Their string forms may include the bad URL, defeating the _safe_request_error logging goal.
🔧 Proposed fix
- except ValueError as e:
- logger.warning("Cannot fetch traces: %s", e)
- errors.append(f"Trace fetching failed: {e}")
+ except ValueError as e:
+ safe = _safe_request_error(e)
+ logger.warning("Cannot fetch traces: %s", safe)
+ errors.append(f"Trace fetching failed: {safe}")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@libs/amp-evaluation/src/amp_evaluation/runner.py` around lines 711 - 713, The
except ValueError block currently logs and appends the raw exception
(logger.warning("Cannot fetch traces: %s", e) and errors.append(f"Trace fetching
failed: {e}")), which can leak request URLs; replace those uses to pass the
exception through the existing sanitizer (call _safe_request_error(e) or
otherwise redact URL content) and log/append the sanitized string instead
(update the logger.warning call and the errors.append call to use the sanitized
result), referencing the except ValueError handler, logger.warning,
errors.append, and the _safe_request_error helper mentioned in trace/fetcher.py.
Purpose
This is to improve the eval log messages by removing any internal data being printed.
Goals
Approach
User stories
Release note
Documentation
Training
Certification
Marketing
Automation tests
Security checks
Samples
Related PRs
Migrations (if applicable)
Test environment
Learning
Summary by CodeRabbit
Bug Fixes
Refactor