
Commit 5aefbfd

sjarmak and claude committed
feat: US-016 - Starter tasks: Category G cross-org discovery (2 tasks)
Added 2 tasks for ccb_mcp_crossorg suite using multi-org-go fixture:

CCX-crossorg-061 (G61 — cross-org interface impl):
- Find all explicit storage.Interface implementations via `var _ storage.Interface`
- Oracle: store struct (kubernetes/kubernetes) + Storage struct (grafana/grafana)
- Exercises cross-org discovery: baseline finds kubernetes org only, MCP-Full needed to discover grafana org implementation

CCX-crossorg-066 (G66 — authoritative repo identification):
- Find authoritative source for go.etcd.io/etcd/client/v3 Go module
- Oracle: keyword_presence (module declaration) + provenance (etcd-io/etcd, client/v3/go.mod)
- Natural decoy: kubernetes/kubernetes vendors it locally, agent must distinguish

Both tasks: 8 files each, VALID gate (gold=1.0, empty=0.0). Selection file updated to 10 tasks total.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
1 parent 034ff9d commit 5aefbfd

21 files changed: +1624 -1 lines changed

Lines changed: 27 additions & 0 deletions
@@ -0,0 +1,27 @@
+FROM ubuntu:22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+# Base tools
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git \
+    ca-certificates \
+    curl \
+    python3 \
+    golang-go \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /workspace
+
+# Clone local checkout repos (baseline config: agent has local access to these)
+RUN git clone --depth 1 --branch v1.32.0 https://github.com/kubernetes/kubernetes /workspace/kubernetes
+
+# Initialize git identity for agent commits
+RUN git config --global user.email "agent@example.com" && \
+    git config --global user.name "Agent" && \
+    git config --global safe.directory '*'
+
+# Create log directories
+RUN mkdir -p /logs/agent /logs/verifier
+
+ENTRYPOINT []
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
+# CCX-crossorg-061 — sg_only variant
+# No local repo clone — agent uses Sourcegraph MCP exclusively for code access.
+# The verifier restores the full repo from /repo_full/ before scoring.
+
+FROM ubuntu:22.04
+
+ENV DEBIAN_FRONTEND=noninteractive
+
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    git \
+    ca-certificates \
+    python3 \
+    curl \
+    && rm -rf /var/lib/apt/lists/*
+
+WORKDIR /workspace
+
+# Empty workspace — agent discovers code via MCP tools only
+RUN git init && \
+    git config user.email "agent@example.com" && \
+    git config user.name "Agent" && \
+    git config --global safe.directory '*'
+
+# Create log directories
+RUN mkdir -p /logs/agent /logs/verifier
+
+# Mark sg_only mode — verifiers and eval scripts check this flag
+RUN touch /tmp/.sg_only_mode
+
+ENTRYPOINT []
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+# Cross-Org Interface Implementation Discovery
+
+## Your Task
+
+Your platform team is conducting a cross-organization audit to find all implementations
+of a core Kubernetes storage abstraction. The `k8s.io/apiserver/pkg/storage.Interface`
+is the standard backend abstraction used by the Kubernetes API server — any project that
+embeds a Kubernetes-compatible API layer must implement it.
+
+**Specific question**: Find all Go source files across the repos in this ecosystem that
+contain an explicit interface compliance check for `storage.Interface` using the
+Go pattern `var _ storage.Interface = (*StructName)(nil)`. For each match, report
+the repo, file path, and the struct name that implements the interface.
+
+## Context
+
+This pattern (`var _ InterfaceName = (*TypeName)(nil)`) is used in Go to verify at
+compile time that a type implements an interface. Finding all such declarations across
+repos from different organizations reveals who has independently implemented the same
+storage abstraction — a key signal for platform compatibility audits.
+
+The search should be **exhaustive across all repos in the ecosystem**, not just the
+local repo. The interface is defined in the Kubernetes ecosystem but can be implemented
+by projects from entirely different organizations.
+
+## Available Resources
+
+The local `/workspace/` directory contains: kubernetes/kubernetes.
+
+**Note:** Additional repositories are accessible via Sourcegraph MCP tools:
+- `etcd-io/etcd` (distributed-kv-store)
+- `grafana/grafana` (observability-platform)
+
+## Output Format
+
+Create a file at `/workspace/answer.json` with your findings:
+
+```json
+{
+  "symbols": [
+    {
+      "repo": "org/repo-name",
+      "path": "relative/path/to/file.go",
+      "symbol": "StructName"
+    }
+  ],
+  "text": "Narrative explanation citing which repos and orgs implement storage.Interface and where."
+}
+```
+
+## Evaluation
+
+Your answer is evaluated on:
+- **Symbol recall and precision**: Did you find all structs that explicitly implement `storage.Interface` via the `var _` pattern?
+- The oracle expects implementations from at least 2 different GitHub organizations.
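
Review note: the compliance-check pattern named in the task text can be located with a plain text match. The following is a minimal sketch, assuming a regular expression is adequate for the single-line form of the declaration. `COMPLIANCE_CHECK`, `find_implementations`, and the sample Go source are hypothetical names for illustration and are not part of this commit. The grouped `var ( ... )` form and import aliases other than `storage` would not be matched.

```python
import re
from typing import List

# Hypothetical helper, not part of the commit.
# Matches `var _ storage.Interface = (*StructName)(nil)` at the start of a line
# and captures StructName. This is a text heuristic, not a Go parser.
COMPLIANCE_CHECK = re.compile(
    r"^\s*var\s+_\s+storage\.Interface\s*=\s*\(\s*\*\s*(\w+)\s*\)\s*\(\s*nil\s*\)",
    re.MULTILINE,
)


def find_implementations(go_source: str) -> List[str]:
    """Return struct names asserted to implement storage.Interface in one Go file."""
    return COMPLIANCE_CHECK.findall(go_source)


# Illustrative input only: one real check, one commented-out check,
# and one check against an unrelated interface.
sample = """package etcd3

var _ storage.Interface = (*store)(nil)
// var _ storage.Interface = (*commented)(nil)
var _ other.Interface = (*ignored)(nil)
"""

print(find_implementations(sample))
```

Anchoring the match at the start of a line keeps commented-out declarations from being reported, which matters for the precision half of the evaluation.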
Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
+version = "1.0"
+
+[metadata]
+name = "CCX-crossorg-061"
+description = "Cross-org interface implementation discovery: k8s storage.Interface"
+license = "Apache-2.0"
+
+[task]
+id = "CCX-crossorg-061"
+repo = "kubernetes/kubernetes"
+category = "cross-org-discovery"
+language = "go"
+difficulty = "hard"
+time_limit_sec = 900
+mcp_suite = "ccb_mcp_crossorg"
+use_case_id = 61
+repo_set_id = "multi-org-go"
+mcp_unique = true
+
+[verification]
+type = "eval"
+command = "bash /tests/eval.sh"
+
+reward_type = "score"
+description = "Cross-org interface implementation discovery"
+
+[environment]
+build_timeout_sec = 600.0
Binary file not shown.
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
+#!/bin/bash
+# eval.sh — MCP-unique benchmark evaluator for CCX-crossorg-061
+# Exit-code-first (SWE-Factory pattern):
+# exit 0 — agent produced useful output (composite score > 0)
+# exit 1 — total failure (composite score == 0 or missing answer)
+#
+# Writes /logs/verifier/reward.txt with the composite score [0.0, 1.0]
+
+set -euo pipefail
+
+TASK_ID="CCX-crossorg-061"
+ANSWER_PATH="/workspace/answer.json"
+TASK_SPEC_PATH="/tests/task_spec.json"
+ORACLE_CHECKS="/tests/oracle_checks.py"
+REWARD_PATH="/logs/verifier/reward.txt"
+
+mkdir -p /logs/verifier
+
+echo "=== CCX-crossorg-061 evaluator ==="
+echo "Task spec: $TASK_SPEC_PATH"
+echo "Answer: $ANSWER_PATH"
+echo ""
+
+# sg_only mode guard: restore full repo if verifier wrapper exists
+if [ -f /tmp/.sg_only_mode ] && [ -f /tests/sgonly_verifier_wrapper.sh ]; then
+    echo "sg_only mode: sourcing verifier wrapper..."
+    source /tests/sgonly_verifier_wrapper.sh
+fi
+
+# Verify answer file exists
+if [ ! -f "$ANSWER_PATH" ]; then
+    echo "ERROR: answer.json not found at $ANSWER_PATH"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+# Validate answer is valid JSON
+if ! python3 -c "import json; json.load(open('$ANSWER_PATH'))" 2>/dev/null; then
+    echo "ERROR: answer.json is not valid JSON"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+echo "answer.json found and valid JSON"
+
+# Run oracle checks
+if [ ! -f "$ORACLE_CHECKS" ]; then
+    echo "ERROR: oracle_checks.py not found at $ORACLE_CHECKS"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+echo "Running oracle checks..."
+SCORE=$(python3 "$ORACLE_CHECKS" --answer "$ANSWER_PATH" --spec "$TASK_SPEC_PATH" --verbose 2>&1 | tee /dev/stderr | tail -1)
+
+# Validate score is a number
+if ! echo "$SCORE" | python3 -c "import sys; float(sys.stdin.read().strip())" 2>/dev/null; then
+    echo "ERROR: oracle_checks.py did not return a valid score: $SCORE"
+    echo "0.0" > "$REWARD_PATH"
+    exit 1
+fi
+
+echo ""
+echo "Composite score: $SCORE"
+echo "$SCORE" > "$REWARD_PATH"
+
+# Exit based on score (SWE-Factory exit-code-first pattern)
+python3 -c "import sys; sys.exit(0 if float('$SCORE') > 0 else 1)"
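
Review note: the score handling in this script depends on the oracle script printing the composite score as the last line of its output. The following is a rough Python sketch of that contract, assuming nothing about `oracle_checks.py` beyond what the script above shows. `parse_score` and `exit_code` are hypothetical names. Unlike `tail -1`, the sketch skips trailing blank lines.

```python
from typing import Optional


def parse_score(oracle_output: str) -> Optional[float]:
    """Return the last non-empty output line as a float, or None if it is not numeric."""
    lines = [line.strip() for line in oracle_output.splitlines() if line.strip()]
    if not lines:
        return None
    try:
        return float(lines[-1])
    except ValueError:
        return None


def exit_code(score: Optional[float]) -> int:
    """Exit-code-first rule: 0 only when a valid score above zero was produced."""
    return 0 if score is not None and score > 0 else 1


# Illustrative verbose output followed by the score on the final line.
output = "checking symbols...\n0.5\n"
score = parse_score(output)
print(score, exit_code(score))
```

Keeping the parse step separate from the exit-code rule mirrors the two stages in the script: an unparseable score and a score of zero both end in exit code 1, but only the first is reported as an error.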
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+{
+  "symbols": [
+    {
+      "repo": "kubernetes/kubernetes",
+      "path": "staging/src/k8s.io/apiserver/pkg/storage/etcd3/store.go",
+      "symbol": "store"
+    },
+    {
+      "repo": "grafana/grafana",
+      "path": "pkg/storage/unified/apistore/store.go",
+      "symbol": "Storage"
+    }
+  ],
+  "text": "The k8s.io/apiserver/pkg/storage.Interface is explicitly implemented in two repos from different GitHub organizations using the `var _ storage.Interface = (...)` compile-time check pattern. First, in kubernetes/kubernetes (kubernetes org) at staging/src/k8s.io/apiserver/pkg/storage/etcd3/store.go, the `store` struct implements storage.Interface via `var _ storage.Interface = (*store)(nil)` at line 100. Second, in grafana/grafana (grafana org) at pkg/storage/unified/apistore/store.go, the `Storage` struct implements storage.Interface via `var _ storage.Interface = (*Storage)(nil)` at line 52. The grafana implementation is part of their unified storage layer that embeds a Kubernetes-compatible API server. This cross-org pattern shows that both the kubernetes org and grafana org independently implement the same storage abstraction.",
+  "_metadata": {
+    "oracle_type": "symbol_resolution",
+    "discovery_method": "sourcegraph_keyword_search",
+    "queries": [
+      "repo:^github.com/kubernetes/kubernetes$ \"var _ storage.Interface\" file:staging",
+      "repo:^github.com/grafana/grafana$ \"var _ storage.Interface\""
+    ],
+    "unique_orgs": ["kubernetes", "grafana"]
+  }
+}
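
Review note: the commit message states a VALID gate of gold=1.0 and empty=0.0 for this oracle answer. The sketch below shows one way symbol recall and precision could combine to satisfy that gate. The actual logic in `oracle_checks.py` is not shown in this diff, so the F1 combination and the `score_symbols` helper are assumptions for illustration only.

```python
from typing import Dict, List, Set, Tuple

Symbol = Dict[str, str]


def symbol_keys(symbols: List[Symbol]) -> Set[Tuple[str, str, str]]:
    """Reduce each symbol entry to a comparable (repo, path, symbol) triple."""
    return {(s["repo"], s["path"], s["symbol"]) for s in symbols}


def score_symbols(answer: dict, gold: dict) -> float:
    """Assumed scoring rule: F1 of precision and recall over symbol triples."""
    found = symbol_keys(answer.get("symbols", []))
    expected = symbol_keys(gold["symbols"])
    hits = len(found & expected)
    if hits == 0:
        return 0.0
    precision = hits / len(found)
    recall = hits / len(expected)
    return 2 * precision * recall / (precision + recall)


# The two gold symbols from the oracle answer in this commit.
gold = {
    "symbols": [
        {
            "repo": "kubernetes/kubernetes",
            "path": "staging/src/k8s.io/apiserver/pkg/storage/etcd3/store.go",
            "symbol": "store",
        },
        {
            "repo": "grafana/grafana",
            "path": "pkg/storage/unified/apistore/store.go",
            "symbol": "Storage",
        },
    ]
}

# A baseline-style answer that only found the local kubernetes implementation.
baseline = {"symbols": gold["symbols"][:1]}

print(score_symbols(gold, gold))
print(score_symbols({"symbols": []}, gold))
print(score_symbols(baseline, gold))
```

Under this assumed rule, an answer limited to the local kubernetes checkout has full precision but half recall, so it scores below the gold answer, which is the gap the cross-org task is meant to expose.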
