Commit 93e2bfc

sjarmak and claude committed
fix: catch FileNotFoundError in F1 JSON scorer verifiers
When the agent failed to produce the expected output file (chain.json, callers.json, etc.), the Python verifier in test.sh crashed with an unhandled FileNotFoundError. This caused the entire bash script to exit (due to set -e) without writing the reward file, leading Harbor to throw RewardFileNotFoundError instead of scoring 0.0.

Added FileNotFoundError and OSError to the except clause in all 16 affected test.sh files (5 design, 8 code-review, 2 build, 1 template).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 0a27f1a commit 93e2bfc
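
The failure mode described in the commit message can be reproduced outside the benchmark harness. The sketch below is a minimal illustration, not the code in test.sh: the helper name `load_reported` and the temporary paths are hypothetical, and only the except tuple and the fallback to an empty list are taken from the diff.

```python
import json
import os
import tempfile


def load_reported(path):
    """Return the agent's JSON array, or [] if the file is missing, unreadable, or malformed.

    Hypothetical helper for illustration; the real verifiers inline this logic.
    """
    try:
        with open(path) as f:
            reported = json.load(f)
        if not isinstance(reported, list):
            print("Agent output is not a JSON array; scoring as empty.")
            reported = []
    except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
        # Same tuple as the commit: a missing file no longer propagates.
        print(f"Could not read/parse agent output: {e}")
        reported = []
    return reported


with tempfile.TemporaryDirectory() as tmp:
    # Case 1: the agent never wrote the file.
    missing = os.path.join(tmp, "chain.json")
    print(load_reported(missing))

    # Case 2: the agent wrote something that is not valid JSON.
    malformed = os.path.join(tmp, "callers.json")
    with open(malformed, "w") as f:
        f.write("{not valid json")
    print(load_reported(malformed))
```

In both cases the function falls back to an empty list rather than raising, so a surrounding script running under set -e would continue to the step that writes the reward file.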

File tree

16 files changed, +24 -24 lines changed
  • benchmarks
    • ccb_build
      • envoy-grpc-server-impl-001/tests
      • k8s-runtime-object-impl-001/tests
    • ccb_design
      • envoy-routeconfig-dep-chain-001/tests
      • envoy-stream-aggregated-sym-001/tests
      • k8s-sharedinformer-sym-001/tests
      • k8s-typemeta-dep-chain-001/tests
      • terraform-provider-iface-sym-001/tests
    • ccb_test
      • aspnetcore-code-review-001/tests
      • calcom-code-review-001/tests
      • curl-security-review-001/tests
      • envoy-code-review-001/tests
      • ghost-code-review-001/tests
      • kafka-security-review-001/tests
      • terraform-code-review-001/tests
      • vscode-code-review-001/tests
    • templates


benchmarks/ccb_build/envoy-grpc-server-impl-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -119,8 +119,8 @@ try:
     if not isinstance(reported, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported = []
 
 num_reported = len(reported)

benchmarks/ccb_build/k8s-runtime-object-impl-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -119,8 +119,8 @@ try:
     if not isinstance(reported, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported = []
 
 num_reported = len(reported)

benchmarks/ccb_design/envoy-routeconfig-dep-chain-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -95,8 +95,8 @@ try:
     if not isinstance(reported_steps, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported_steps = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported_steps = []
 
 num_reported = len(reported_steps)

benchmarks/ccb_design/envoy-stream-aggregated-sym-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -84,8 +84,8 @@ try:
     if not isinstance(reported, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported = []
 
 num_reported = len(reported)

benchmarks/ccb_design/k8s-sharedinformer-sym-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -85,8 +85,8 @@ try:
     if not isinstance(reported, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported = []
 
 num_reported = len(reported)

benchmarks/ccb_design/k8s-typemeta-dep-chain-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -95,8 +95,8 @@ try:
     if not isinstance(reported_steps, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported_steps = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported_steps = []
 
 num_reported = len(reported_steps)

benchmarks/ccb_design/terraform-provider-iface-sym-001/tests/test.sh

Lines changed: 2 additions & 2 deletions
@@ -85,8 +85,8 @@ try:
     if not isinstance(reported, list):
         print("Agent output is not a JSON array — scoring as empty.")
         reported = []
-except (json.JSONDecodeError, ValueError) as e:
-    print(f"Malformed JSON in agent output: {e}")
+except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError) as e:
+    print(f"Could not read/parse agent output: {e}")
     reported = []
 
 num_reported = len(reported)

benchmarks/ccb_test/aspnetcore-code-review-001/tests/test.sh

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ if os.path.isfile(review_path):
         reported = json.loads(raw)
         if not isinstance(reported, list):
             reported = []
-    except (json.JSONDecodeError, ValueError):
+    except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError):
         reported = []
 
 # Build set of expected file paths for matching

benchmarks/ccb_test/calcom-code-review-001/tests/test.sh

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ if os.path.isfile(review_path):
         reported = json.loads(raw)
         if not isinstance(reported, list):
             reported = []
-    except (json.JSONDecodeError, ValueError):
+    except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError):
         reported = []
 
 # Build set of expected file paths for matching

benchmarks/ccb_test/curl-security-review-001/tests/test.sh

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ if os.path.isfile(review_path):
         reported = json.loads(raw)
         if not isinstance(reported, list):
             reported = []
-    except (json.JSONDecodeError, ValueError):
+    except (json.JSONDecodeError, ValueError, FileNotFoundError, OSError):
         reported = []
 
 # Build set of expected file paths for matching
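
A note on the except tuple used in these hunks: in Python 3, FileNotFoundError is a subclass of OSError and json.JSONDecodeError is a subclass of ValueError, so the shorter tuple (ValueError, OSError) would catch the same exceptions. Listing all four is harmless and presumably chosen for readability. The check below is a small sketch confirming the hierarchy; the helper names and the nonexistent path are hypothetical and not part of the commit.

```python
import json

# Class hierarchy facts from the Python 3 standard library.
assert issubclass(FileNotFoundError, OSError)
assert issubclass(json.JSONDecodeError, ValueError)


def is_caught(action, handled):
    """Run action() and report whether the exception it raises matches `handled`."""
    try:
        action()
    except handled:
        return True
    except Exception:
        return False
    return False  # action() raised nothing


def open_missing():
    # Hypothetical path that does not exist; open() raises FileNotFoundError.
    open("/nonexistent-dir-for-illustration/chain.json")


def parse_bad():
    # Invalid JSON; json.loads raises json.JSONDecodeError.
    json.loads("{not json")


old_tuple = (json.JSONDecodeError, ValueError)  # before the commit
short_tuple = (ValueError, OSError)             # equivalent to the new four-item tuple

print(is_caught(open_missing, old_tuple))    # the missing-file case the old clause let through
print(is_caught(open_missing, short_tuple))
print(is_caught(parse_bad, short_tuple))
```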
