Commit cca10d6

sjarmak and claude committed
feat: US-009 - Repo-set fixtures for starter pack
Creates 5 repo-set fixtures in fixtures/repo_sets/ for the MCP-unique benchmark starter pack:

- kubernetes-ecosystem: k8s core + client-go + api mirrors + etcd (4 repos, cross-org Go)
- nodejs-web-stack: Node.js + express/lodash/prisma mirrors (4 repos, cross-org JS/TS)
- python-ml-stack: scikit-learn + numpy/pandas/scipy natively indexed (4 repos, cross-org Python)
- grafana-observability: grafana + loki/mimir mirrors (3 repos, single-org Go)
- multi-org-go: k8s + etcd + grafana all natively indexed (3 repos, cross-org Go)

All fixtures validated against schemas/repo_set_fixture.schema.json.
Mirror SHAs from sg_mirror_revisions.json; native repos pinned to stable tags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

1 parent 7c931cb commit cca10d6

7 files changed

Lines changed: 280 additions & 1 deletion

fixtures/repo_sets/grafana-observability.json

Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+{
+  "repo_set_id": "grafana-observability",
+  "description": "Grafana observability stack repos within the grafana org. Covers the dashboarding platform, log aggregation system (Loki), and metrics backend (Mimir). Tests within-org cross-repo dependency tracing and API call chain analysis.",
+  "repos": [
+    {
+      "host": "github.com",
+      "org": "grafana",
+      "repo_name": "grafana",
+      "full_name": "grafana/grafana",
+      "revision": "v11.4.0",
+      "logical_name": "dashboarding-platform",
+      "access_mode": "local_checkout",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "grafana-loki",
+      "full_name": "sg-benchmarks/grafana-loki",
+      "revision": "a3af38d4da899032d3ee46a30932d072d37e1b9c",
+      "logical_name": "log-aggregation",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/grafana-loki"
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "grafana-mimir",
+      "full_name": "sg-benchmarks/grafana-mimir",
+      "revision": "cfaa8c9a705a2417822a5c6224a9fc3128c416c2",
+      "logical_name": "metrics-backend",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/grafana-mimir"
+    }
+  ],
+  "local_checkout_repos": ["grafana/grafana"],
+  "mcp_only_repos": [
+    "sg-benchmarks/grafana-loki",
+    "sg-benchmarks/grafana-mimir"
+  ],
+  "cross_org": false,
+  "language_mix": ["Go", "TypeScript"],
+  "primary_language": "Go"
+}

fixtures/repo_sets/kubernetes-ecosystem.json

Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+{
+  "repo_set_id": "kubernetes-ecosystem",
+  "description": "Kubernetes ecosystem repos spanning kubernetes, etcd-io, and sg-benchmarks orgs. Covers the core orchestrator, client library, API types, and distributed key-value store. Tests cross-org dependency tracing and blast-radius analysis.",
+  "repos": [
+    {
+      "host": "github.com",
+      "org": "kubernetes",
+      "repo_name": "kubernetes",
+      "full_name": "kubernetes/kubernetes",
+      "revision": "v1.32.0",
+      "logical_name": "core-orchestrator",
+      "access_mode": "local_checkout",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "kubernetes-client-go",
+      "full_name": "sg-benchmarks/kubernetes-client-go",
+      "revision": "8020fc4fcf89965904a5f43689f169d6e01d1e80",
+      "logical_name": "go-client-library",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/kubernetes-client-go"
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "kubernetes-api",
+      "full_name": "sg-benchmarks/kubernetes-api",
+      "revision": "fa23dd302759dbb681c1a41f09d24190a38c1d58",
+      "logical_name": "api-type-definitions",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/kubernetes-api"
+    },
+    {
+      "host": "github.com",
+      "org": "etcd-io",
+      "repo_name": "etcd",
+      "full_name": "etcd-io/etcd",
+      "revision": "v3.5.17",
+      "logical_name": "distributed-kv-store",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true
+    }
+  ],
+  "local_checkout_repos": ["kubernetes/kubernetes"],
+  "mcp_only_repos": [
+    "sg-benchmarks/kubernetes-client-go",
+    "sg-benchmarks/kubernetes-api",
+    "etcd-io/etcd"
+  ],
+  "cross_org": true,
+  "language_mix": ["Go"],
+  "primary_language": "Go"
+}
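
The fixture above mixes two pinning styles: the sg-benchmarks mirrors carry full commit SHAs, while natively indexed repos carry release tags. A minimal sketch of telling the two apart might look like the following. The `revision_kind` helper and the 40-hex-character rule are assumptions for illustration, not part of this commit or of the fixture schema.

```python
import re

# Assumption: a pinned mirror revision is a full 40-character lowercase hex
# Git object ID. Anything else (tag, branch name) is reported as "tag".
FULL_SHA = re.compile(r"[0-9a-f]{40}")


def revision_kind(revision: str) -> str:
    """Return "sha" for a full commit SHA, otherwise "tag" (hypothetical helper)."""
    return "sha" if FULL_SHA.fullmatch(revision) else "tag"


# Trimmed copy of the repos[] entries from kubernetes-ecosystem.
repos = [
    {"full_name": "kubernetes/kubernetes", "revision": "v1.32.0"},
    {"full_name": "sg-benchmarks/kubernetes-client-go",
     "revision": "8020fc4fcf89965904a5f43689f169d6e01d1e80"},
    {"full_name": "sg-benchmarks/kubernetes-api",
     "revision": "fa23dd302759dbb681c1a41f09d24190a38c1d58"},
    {"full_name": "etcd-io/etcd", "revision": "v3.5.17"},
]

kinds = {repo["full_name"]: revision_kind(repo["revision"]) for repo in repos}
for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

A check like this could flag a mirror entry that was accidentally pinned to a tag, since only the entries with a `sourcegraph_mirror` field are expected to use SHAs here.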

fixtures/repo_sets/multi-org-go.json

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+{
+  "repo_set_id": "multi-org-go",
+  "description": "Multi-org Go ecosystem repos from three different GitHub organizations: kubernetes, etcd-io, and grafana. All natively indexed in Sourcegraph. Tests cross-org interface discovery, authoritative repo identification, and dependency resolution across unrelated orgs.",
+  "repos": [
+    {
+      "host": "github.com",
+      "org": "kubernetes",
+      "repo_name": "kubernetes",
+      "full_name": "kubernetes/kubernetes",
+      "revision": "v1.32.0",
+      "logical_name": "container-orchestration",
+      "access_mode": "local_checkout",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "etcd-io",
+      "repo_name": "etcd",
+      "full_name": "etcd-io/etcd",
+      "revision": "v3.5.17",
+      "logical_name": "distributed-kv-store",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "grafana",
+      "repo_name": "grafana",
+      "full_name": "grafana/grafana",
+      "revision": "v11.4.0",
+      "logical_name": "observability-platform",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true
+    }
+  ],
+  "local_checkout_repos": ["kubernetes/kubernetes"],
+  "mcp_only_repos": [
+    "etcd-io/etcd",
+    "grafana/grafana"
+  ],
+  "cross_org": true,
+  "language_mix": ["Go"],
+  "primary_language": "Go"
+}
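
The description of multi-org-go says its repos come from three different GitHub organizations. A quick way to confirm that from the `org` fields is sketched below; `distinct_orgs` is a hypothetical helper, not something this commit adds. Note that this counts hosting orgs only. It is not a derivation of `cross_org`: in grafana-observability the mirrors are hosted under sg-benchmarks, yet that fixture sets `cross_org` to false.

```python
def distinct_orgs(repos: list) -> list:
    """Return the sorted, de-duplicated hosting orgs of a repos[] list."""
    return sorted({repo["org"] for repo in repos})


# Trimmed copy of the repos[] entries from multi-org-go.
repos = [
    {"org": "kubernetes", "full_name": "kubernetes/kubernetes"},
    {"org": "etcd-io", "full_name": "etcd-io/etcd"},
    {"org": "grafana", "full_name": "grafana/grafana"},
]

orgs = distinct_orgs(repos)
print(len(orgs), orgs)
```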

fixtures/repo_sets/nodejs-web-stack.json

Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+{
+  "repo_set_id": "nodejs-web-stack",
+  "description": "Node.js web stack repos spanning nodejs, expressjs, lodash, and prisma orgs. Covers the runtime, web framework, utility library, and database ORM. Tests cross-org dependency and CVE remediation tasks.",
+  "repos": [
+    {
+      "host": "github.com",
+      "org": "nodejs",
+      "repo_name": "node",
+      "full_name": "nodejs/node",
+      "revision": "v22.13.0",
+      "logical_name": "javascript-runtime",
+      "access_mode": "local_checkout",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "expressjs-express",
+      "full_name": "sg-benchmarks/expressjs-express",
+      "revision": "9de5890d0dc6128b7f7eb15469d76aa60dacc48f",
+      "logical_name": "web-framework",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/expressjs-express"
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "lodash",
+      "full_name": "sg-benchmarks/lodash",
+      "revision": "1dd1ecfd7875372efa8dde5dede50f6d2d323703",
+      "logical_name": "utility-library",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/lodash"
+    },
+    {
+      "host": "github.com",
+      "org": "sg-benchmarks",
+      "repo_name": "prisma-prisma",
+      "full_name": "sg-benchmarks/prisma-prisma",
+      "revision": "20117d718fb0db9c8a586276e2052f5b130f994b",
+      "logical_name": "database-orm",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true,
+      "sourcegraph_mirror": "sg-benchmarks/prisma-prisma"
+    }
+  ],
+  "local_checkout_repos": ["nodejs/node"],
+  "mcp_only_repos": [
+    "sg-benchmarks/expressjs-express",
+    "sg-benchmarks/lodash",
+    "sg-benchmarks/prisma-prisma"
+  ],
+  "cross_org": true,
+  "language_mix": ["JavaScript", "TypeScript"],
+  "primary_language": "JavaScript"
+}

fixtures/repo_sets/python-ml-stack.json

Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
+{
+  "repo_set_id": "python-ml-stack",
+  "description": "Python scientific/ML stack repos spanning scikit-learn, numpy, pandas-dev, and scipy orgs. All repos are natively indexed in Sourcegraph — no mirrors needed. Tests cross-org API consumption, data flow tracing, and onboarding comprehension tasks.",
+  "repos": [
+    {
+      "host": "github.com",
+      "org": "scikit-learn",
+      "repo_name": "scikit-learn",
+      "full_name": "scikit-learn/scikit-learn",
+      "revision": "1.6.1",
+      "logical_name": "ml-algorithms",
+      "access_mode": "local_checkout",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "numpy",
+      "repo_name": "numpy",
+      "full_name": "numpy/numpy",
+      "revision": "v2.2.2",
+      "logical_name": "array-computing",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "pandas-dev",
+      "repo_name": "pandas",
+      "full_name": "pandas-dev/pandas",
+      "revision": "v2.2.3",
+      "logical_name": "dataframe-library",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true
+    },
+    {
+      "host": "github.com",
+      "org": "scipy",
+      "repo_name": "scipy",
+      "full_name": "scipy/scipy",
+      "revision": "v1.15.1",
+      "logical_name": "scientific-computing",
+      "access_mode": "mcp_only",
+      "sourcegraph_indexed": true
+    }
+  ],
+  "local_checkout_repos": ["scikit-learn/scikit-learn"],
+  "mcp_only_repos": [
+    "numpy/numpy",
+    "pandas-dev/pandas",
+    "scipy/scipy"
+  ],
+  "cross_org": true,
+  "language_mix": ["Python", "C", "Cython"],
+  "primary_language": "Python"
+}
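
Each fixture states access modes twice: once per entry in `repos[]` and again in the top-level `local_checkout_repos` and `mcp_only_repos` lists. A stdlib-only sketch of a consistency check over that redundancy could look like this. The `check_fixture` function is hypothetical and covers only the fields visible in these fixtures; it is not the validation against schemas/repo_set_fixture.schema.json that the commit message refers to.

```python
import copy


def check_fixture(fixture: dict) -> list:
    """Return a list of problems found; an empty list means consistent."""
    problems = []
    by_mode = {"local_checkout": [], "mcp_only": []}
    for repo in fixture["repos"]:
        # full_name is expected to be "<org>/<repo_name>" in every entry.
        expected = f'{repo["org"]}/{repo["repo_name"]}'
        if repo["full_name"] != expected:
            problems.append(f'{repo["full_name"]}: expected full_name {expected}')
        by_mode.setdefault(repo["access_mode"], []).append(repo["full_name"])
    # The top-level lists should name exactly the repos with that access_mode.
    for mode, key in (("local_checkout", "local_checkout_repos"),
                      ("mcp_only", "mcp_only_repos")):
        if sorted(by_mode[mode]) != sorted(fixture.get(key, [])):
            problems.append(f"{key} does not match repos[] with access_mode={mode}")
    return problems


# Trimmed copy of python-ml-stack: only the fields the check reads.
fixture = {
    "repos": [
        {"org": "scikit-learn", "repo_name": "scikit-learn",
         "full_name": "scikit-learn/scikit-learn", "access_mode": "local_checkout"},
        {"org": "numpy", "repo_name": "numpy",
         "full_name": "numpy/numpy", "access_mode": "mcp_only"},
        {"org": "pandas-dev", "repo_name": "pandas",
         "full_name": "pandas-dev/pandas", "access_mode": "mcp_only"},
        {"org": "scipy", "repo_name": "scipy",
         "full_name": "scipy/scipy", "access_mode": "mcp_only"},
    ],
    "local_checkout_repos": ["scikit-learn/scikit-learn"],
    "mcp_only_repos": ["numpy/numpy", "pandas-dev/pandas", "scipy/scipy"],
}

print(check_fixture(fixture))

# Simulate a stale list by dropping one repo from mcp_only_repos.
broken = copy.deepcopy(fixture)
broken["mcp_only_repos"].remove("scipy/scipy")
print(check_fixture(broken))
```

The lists are compared after sorting so that ordering differences between `repos[]` and the top-level lists are not reported as problems.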

ralph-mcp-unique/prd.json

Lines changed: 1 addition & 1 deletion
@@ -218,7 +218,7 @@
         "Each fixture validates against schemas/repo_set_fixture.schema.json"
       ],
       "priority": 9,
-      "passes": false,
+      "passes": true,
       "notes": "Depends on US-003 (mirrors created). The python-ml-stack is the easiest fixture since all 4 repos are natively indexed. Use it for early testing. The multi-org-go fixture replaces the old cross-host fixture — it exercises cross-org discovery (k8s org + etcd org + grafana org) without needing a separate code host."
     },
     {

ralph-mcp-unique/progress.txt

Lines changed: 19 additions & 0 deletions
@@ -160,6 +160,25 @@
 [2026-02-20 20:21:26 UTC] Iteration 1 complete
 [2026-02-20 20:21:28 UTC] Iteration 2 started
 
+## 2026-02-20 - US-009: Repo-set fixtures for starter pack
+- Created 5 repo-set fixtures in `fixtures/repo_sets/`:
+  1. `kubernetes-ecosystem.json`: kubernetes/kubernetes (local) + sg-benchmarks/kubernetes-client-go + sg-benchmarks/kubernetes-api + etcd-io/etcd (all mcp_only). cross_org=true, 4 repos, Go.
+  2. `nodejs-web-stack.json`: nodejs/node (local) + sg-benchmarks/expressjs-express + sg-benchmarks/lodash + sg-benchmarks/prisma-prisma (all mcp_only). cross_org=true, 4 repos, JavaScript/TypeScript.
+  3. `python-ml-stack.json`: scikit-learn/scikit-learn (local) + numpy/numpy + pandas-dev/pandas + scipy/scipy (all mcp_only, all natively indexed, no mirrors). cross_org=true, 4 repos, Python.
+  4. `grafana-observability.json`: grafana/grafana (local) + sg-benchmarks/grafana-loki + sg-benchmarks/grafana-mimir (mcp_only). cross_org=false (same org), 3 repos, Go/TypeScript.
+  5. `multi-org-go.json`: kubernetes/kubernetes (local) + etcd-io/etcd + grafana/grafana (mcp_only, all natively indexed). cross_org=true, 3 repos, Go.
+- All fixtures use mirror_sha from sg_mirror_revisions.json for sg-benchmarks mirrors
+- All native repos pinned to recent stable tags (k8s v1.32.0, etcd v3.5.17, grafana v11.4.0, etc.)
+- All verified against schemas/repo_set_fixture.schema.json: 5/5 OK
+- Files changed: 5 new files in `fixtures/repo_sets/`, `ralph-mcp-unique/prd.json`, `ralph-mcp-unique/progress.txt`
+- **Learnings for future iterations:**
+  - grafana-observability cross_org=false (grafana/grafana + grafana-loki + grafana-mimir = same org)
+  - multi-org-go uses only natively indexed repos — no mirrors needed
+  - python-ml-stack is simplest fixture: all 4 repos natively indexed
+  - sg-benchmarks mirror full_name uses "sg-benchmarks/repo-name" format (not the original org/repo)
+  - Fixture validation: check full_name consistency between repos[] and local/mcp_only lists
+---
+
 ## 2026-02-20 - US-008: Agent-based oracle curation tool
 - Created `scripts/curate_oracle.py` (stdlib-only: urllib for SG API)
 - CLI: `--task-dir DIR`, `--task-spec PATH`, `--verify`, `--verbose`, `--dry-run`, `--max-results`
