Commit 2e310e8

some helpful claude sweeps (#1161)
1 parent 0914a9c commit 2e310e8

3 files changed: +214 -0 lines changed

.claude/accuracy-sweep-state.json

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
{
  "inspections": {
    "zonal": { "last_inspected": "2026-03-30T12:00:00Z", "issue": 1090 },
    "focal": { "last_inspected": "2026-03-30T13:00:00Z", "issue": 1092 },
    "multispectral": { "last_inspected": "2026-03-30T14:00:00Z", "issue": 1094 },
    "proximity": { "last_inspected": "2026-03-30T15:00:00Z", "issue": null, "notes": "Direction >= boundary fragile but works due to truncated constant. Float32 truncation is design choice. No wrong-results bugs found." },
    "curvature": { "last_inspected": "2026-03-30T15:00:00Z", "issue": null, "notes": "Formula matches ArcGIS reference. Backends consistent. No issues found." }
  }
}

.claude/commands/accuracy-sweep.md

Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
# Accuracy Sweep: Generate a Ralph Loop targeting under-inspected modules

Analyze xrspatial modules by recency and inspection history, then print a
ready-to-run `/ralph-loop` command that targets the highest-priority modules.

Optional arguments: $ARGUMENTS
(e.g. `--top 5`, `--exclude slope,aspect`, `--only-terrain`, `--reset-state`)

---

## Step 1 -- Gather module metadata via git

For every `.py` file directly under `xrspatial/` (skip `__init__.py`,
`_version.py`, `__main__.py`, `utils.py`, `accessor.py`, `preview.py`,
`dataset_support.py`, `diagnostics.py`, `analytics.py`), collect:

| Field | How |
|-------|-----|
| **last_modified** | `git log -1 --format=%aI -- xrspatial/<module>.py` |
| **first_commit** | `git log --diff-filter=A --format=%aI -- xrspatial/<module>.py` |
| **total_commits** | `git log --oneline -- xrspatial/<module>.py \| wc -l` |
| **recent_accuracy_commits** | `git log --oneline --grep='accuracy\|precision\|numerical\|geodesic' -- xrspatial/<module>.py` |

Store results in a temporary variable -- do NOT write intermediate files.
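
The collection described above could be sketched in Python as follows. This is a minimal sketch, not part of the command file: the helper names `git_commands`, `summarize`, and `gather` are hypothetical, and the single `--grep` alternation from the table is expressed as several `--grep` flags, which git treats as an OR. Everything stays in memory, as the step requires.

```python
import subprocess
from pathlib import Path

# Files that Step 1 says to skip.
SKIP = {
    "__init__.py", "_version.py", "__main__.py", "utils.py", "accessor.py",
    "preview.py", "dataset_support.py", "diagnostics.py", "analytics.py",
}

ACCURACY_TERMS = ("accuracy", "precision", "numerical", "geodesic")


def git_commands(path):
    """Return the git argv for each Step 1 field (hypothetical helper)."""
    # Several --grep flags match commits containing ANY of the terms.
    greps = [f"--grep={term}" for term in ACCURACY_TERMS]
    return {
        "last_modified": ["git", "log", "-1", "--format=%aI", "--", path],
        "first_commit": ["git", "log", "--diff-filter=A", "--format=%aI", "--", path],
        "total_commits": ["git", "log", "--oneline", "--", path],
        "recent_accuracy_commits": ["git", "log", "--oneline", *greps, "--", path],
    }


def summarize(raw):
    """Reduce raw git stdout (one string per field) to the Step 1 values."""
    lines = {field: text.splitlines() for field, text in raw.items()}
    return {
        "last_modified": lines["last_modified"][0] if lines["last_modified"] else None,
        # git log prints newest first, so the original add is the last line.
        "first_commit": lines["first_commit"][-1] if lines["first_commit"] else None,
        "total_commits": len(lines["total_commits"]),  # stands in for `| wc -l`
        "recent_accuracy_commits": lines["recent_accuracy_commits"],
    }


def gather(repo_root="."):
    """Collect metadata for every eligible module without writing any files."""
    results = {}
    for py in sorted(Path(repo_root, "xrspatial").glob("*.py")):
        if py.name in SKIP:
            continue
        rel = f"xrspatial/{py.name}"
        raw = {
            field: subprocess.run(argv, cwd=repo_root, capture_output=True,
                                  text=True, check=True).stdout
            for field, argv in git_commands(rel).items()
        }
        results[py.stem] = summarize(raw)
    return results
```

Calling `gather()` from the repository root would return a dict keyed by module name; `summarize` is kept separate so the parsing can be checked without a git checkout.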

## Step 2 -- Load inspection state

Read the state file at `.claude/accuracy-sweep-state.json`.

If it does not exist, treat every module as never-inspected.

If `$ARGUMENTS` contains `--reset-state`, delete the file and treat
everything as never-inspected.

The state file schema:

```json
{
  "inspections": {
    "slope": { "last_inspected": "2026-03-28T14:00:00Z", "issue": 1042 },
    "aspect": { "last_inspected": "2026-03-28T15:30:00Z", "issue": 1043 }
  }
}
```
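
A minimal sketch of the loading rules above, assuming the schema shown. The function name `load_inspections` and its `reset` parameter are hypothetical; an empty result means every module counts as never-inspected.

```python
import json
from pathlib import Path

STATE_PATH = Path(".claude/accuracy-sweep-state.json")


def load_inspections(path=STATE_PATH, reset=False):
    """Return {module: {"last_inspected": ..., "issue": ...}} from the state file."""
    path = Path(path)
    if reset and path.exists():
        path.unlink()  # --reset-state: delete the file
    if not path.exists():
        return {}  # no file: every module is never-inspected
    with path.open(encoding="utf-8") as fh:
        return json.load(fh).get("inspections", {})
```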

## Step 3 -- Score each module

Compute a priority score for each module. Higher = more urgent.

```
days_since_inspected = (today - last_inspected).days   # 9999 if never inspected
days_since_modified  = (today - last_modified).days
total_commits        = from Step 1
has_recent_accuracy_work = 1 if recent_accuracy_commits is non-empty, else 0

score = (days_since_inspected * 3)
      + (total_commits * 0.5)
      - (days_since_modified * 0.2)
      - (has_recent_accuracy_work * 500)
```

Rationale:
- Modules never inspected dominate (9999 * 3)
- More commits = more complex = more likely to have bugs
- Recently modified modules slightly deprioritized (someone just touched them)
- Modules with existing accuracy work heavily deprioritized
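
The scoring pseudocode above translates fairly directly into Python. The sketch below is an illustration under stated assumptions: `priority_score` and `days_since` are hypothetical names, timestamps are ISO 8601 strings as produced by `%aI` or stored in the state file, and `last_inspected=None` stands for "never inspected". It implements the formula exactly as written, in which a larger `days_since_modified` subtracts more from the score.

```python
from datetime import date, datetime

NEVER_INSPECTED_DAYS = 9999


def days_since(iso_timestamp, today):
    """Whole days between an ISO 8601 timestamp and `today` (a date)."""
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    stamp = datetime.fromisoformat(iso_timestamp.replace("Z", "+00:00"))
    return (today - stamp.date()).days


def priority_score(last_inspected, last_modified, total_commits,
                   has_recent_accuracy_work, today):
    """Apply the Step 3 formula; higher means more urgent."""
    inspected_days = (NEVER_INSPECTED_DAYS if last_inspected is None
                      else days_since(last_inspected, today))
    modified_days = days_since(last_modified, today)
    return (inspected_days * 3
            + total_commits * 0.5
            - modified_days * 0.2
            - (500 if has_recent_accuracy_work else 0))
```

For a never-inspected module with 23 commits, last modified 45 days ago, and no accuracy commits, the terms are 29997 + 11.5 - 9 - 0.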

## Step 4 -- Apply filters from $ARGUMENTS

- `--top N` -- only include the top N modules (default: 5)
- `--exclude mod1,mod2` -- remove named modules from the list
- `--only-terrain` -- restrict to slope, aspect, curvature, terrain,
  terrain_metrics, hillshade, sky_view_factor
- `--only-focal` -- restrict to focal, convolution, morphology, bilateral,
  edge_detection, glcm
- `--only-hydro` -- restrict to flood, cost_distance, geodesic,
  surface_distance, viewshed, erosion, diffusion
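
One way to parse and apply the flags listed above is sketched below with `argparse`. The names `parse_arguments` and `apply_filters` are hypothetical. Two behaviors are assumptions, since the text does not specify them: exclusions and category filters are applied before `--top` truncates the list, and if more than one `--only-*` flag is given the last one wins.

```python
import argparse
import shlex

# Category membership as listed in Step 4.
CATEGORIES = {
    "terrain": ["slope", "aspect", "curvature", "terrain",
                "terrain_metrics", "hillshade", "sky_view_factor"],
    "focal": ["focal", "convolution", "morphology", "bilateral",
              "edge_detection", "glcm"],
    "hydro": ["flood", "cost_distance", "geodesic", "surface_distance",
              "viewshed", "erosion", "diffusion"],
}


def parse_arguments(arguments):
    """Parse the raw $ARGUMENTS string; an empty string yields the defaults."""
    parser = argparse.ArgumentParser(prog="/accuracy-sweep")
    parser.add_argument("--top", type=int, default=5)
    parser.add_argument("--exclude", default="")
    parser.add_argument("--reset-state", action="store_true")
    for name in CATEGORIES:
        # All --only-* flags write to the same attribute (assumption: last wins).
        parser.add_argument(f"--only-{name}", dest="only",
                            action="store_const", const=name)
    return parser.parse_args(shlex.split(arguments))


def apply_filters(ranked, args):
    """Filter `ranked` (module names, highest score first) and keep the top N."""
    excluded = {m.strip() for m in args.exclude.split(",") if m.strip()}
    allowed = set(CATEGORIES[args.only]) if args.only else None
    selected = [m for m in ranked
                if m not in excluded and (allowed is None or m in allowed)]
    return selected[:args.top]
```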

## Step 5 -- Print the results

### 5a. Print the ranked table

Print a markdown table showing ALL scored modules (not just the selected ones),
sorted by score descending:

```
| Rank | Module          | Score  | Last Inspected | Last Modified | Commits |
|------|-----------------|--------|----------------|---------------|---------|
| 1    | viewshed        | 30012  | never          | 45 days ago   | 23      |
| 2    | flood           | 29998  | never          | 120 days ago  | 18      |
| ...  | ...             | ...    | ...            | ...           | ...     |
```
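
Rendering the ranked table could look like the sketch below. `format_ranked_table` is a hypothetical helper; it assumes each row is a dict whose display strings (such as `never` or `45 days ago`) are already prepared, and it does not pad columns, which markdown renderers do not require.

```python
def format_ranked_table(rows):
    """Render all scored modules as a markdown table, highest score first."""
    header = "| Rank | Module | Score | Last Inspected | Last Modified | Commits |"
    divider = "|------|--------|-------|----------------|---------------|---------|"
    ordered = sorted(rows, key=lambda r: r["score"], reverse=True)
    body = [
        f"| {rank} | {r['module']} | {r['score']:.0f} | {r['last_inspected']} "
        f"| {r['last_modified']} | {r['commits']} |"
        for rank, r in enumerate(ordered, start=1)
    ]
    return "\n".join([header, divider, *body])
```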

### 5b. Print the generated ralph-loop command

Using the top N modules from the ranked list, generate and print a command
like this (adapt the module list to actual results):

````
/ralph-loop "Survey xarray-spatial modules for numerical accuracy issues.

**Target these modules in priority order:**
1. viewshed (xrspatial/viewshed.py) -- never inspected, 23 commits
2. flood (xrspatial/flood.py) -- never inspected, 18 commits
3. focal (xrspatial/focal.py) -- never inspected, 31 commits
4. erosion (xrspatial/erosion.py) -- never inspected, 12 commits
5. classify (xrspatial/classify.py) -- never inspected, 9 commits

**For each module, in order:**
1. Read the source and identify potential accuracy issues:
   - Floating point precision loss
   - Incorrect NaN propagation
   - Off-by-one errors in neighborhood operations
   - Missing or wrong Earth curvature corrections
   - Backend inconsistencies (numpy vs cupy vs dask results differ)
2. Run /rockout to fix the issue end-to-end (issue, worktree, fix, tests, docs)
3. After completing rockout for ONE module, output <promise>ITERATION DONE</promise>

If you find no accuracy issues in the current target module, skip it and move
to the next one.

If all target modules have been addressed or have no issues, output
<promise>ALL ACCURACY ISSUES FIXED</promise>." --max-iterations {N} --completion-promise "ALL ACCURACY ISSUES FIXED"
````

Set `--max-iterations` to the number of target modules + 2 (buffer for retries).
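
Assembling the command from the selected modules might be sketched as below. `target_lines` and `build_command` are hypothetical helpers, and `instructions` stands for the fixed per-module instruction text from the template above, passed in unchanged.

```python
COMPLETION_PROMISE = "ALL ACCURACY ISSUES FIXED"


def target_lines(targets):
    """targets: (module, inspection_label, commits) tuples in priority order."""
    return "\n".join(
        f"{i}. {module} (xrspatial/{module}.py) -- {label}, {commits} commits"
        for i, (module, label, commits) in enumerate(targets, start=1)
    )


def build_command(targets, instructions):
    """Return the /ralph-loop command string for the given targets."""
    max_iterations = len(targets) + 2  # two extra iterations as retry buffer
    prompt = (
        "Survey xarray-spatial modules for numerical accuracy issues.\n\n"
        "**Target these modules in priority order:**\n"
        f"{target_lines(targets)}\n\n"
        f"{instructions}"
    )
    return (f'/ralph-loop "{prompt}" --max-iterations {max_iterations} '
            f'--completion-promise "{COMPLETION_PROMISE}"')
```

With the five example targets, `--max-iterations` comes out as 7.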

### 5c. Print a reminder

```
To run this sweep: copy the command above and paste it.
To update state after a manual rockout: edit .claude/accuracy-sweep-state.json
To reset all tracking: /accuracy-sweep --reset-state
```

## Step 6 -- Update state (ONLY when called from inside a ralph-loop)

This step is informational. The accuracy-sweep command itself does NOT update
the state file. State is updated when `/rockout` completes -- the rockout
workflow should append to `.claude/accuracy-sweep-state.json` after creating
the issue.

To enable this, print a note reminding the user that after each rockout
iteration completes, they can manually record the inspection:

```json
// Add to .claude/accuracy-sweep-state.json after each rockout:
{ "module_name": { "last_inspected": "ISO-DATE", "issue": ISSUE_NUMBER } }
```
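
Recording an inspection by script rather than by hand could look like the following sketch. `record_inspection` is a hypothetical helper, not something `/rockout` is known to provide; it assumes the entry belongs under the `"inspections"` key, as in the Step 2 schema, and writes timestamps in the same UTC `Z` form used by the state file.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def record_inspection(module, issue,
                      path=".claude/accuracy-sweep-state.json", now=None):
    """Add or overwrite one module's entry and rewrite the state file."""
    path = Path(path)
    state = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    stamp = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    state.setdefault("inspections", {})[module] = {
        "last_inspected": stamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "issue": issue,  # None is written as null when no issue was filed
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(state, indent=2) + "\n", encoding="utf-8")
    return state
```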

---

## General Rules

- Do NOT modify any source files. This command is read-only analysis.
- Do NOT create GitHub issues. This command only generates the ralph-loop command.
- Keep the output concise -- the table and command are the deliverables.
- If $ARGUMENTS is empty, use defaults: top 5, no category filter, no exclusions.
Lines changed: 47 additions & 0 deletions
@@ -0,0 +1,47 @@
{
  "last_triage": "2026-03-31T18:00:00Z",
  "modules": {
    "reproject": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "RISKY", "bottleneck": "compute-bound", "high_count": 1, "issue": null },
    "geotiff": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "IO-bound", "high_count": 0, "issue": null, "notes": "False positive. open_geotiff(chunks=N) returns lazy dask array. to_geotiff auto-routes dask inputs to write_streaming. Eager paths are by design for numpy/cupy." },
    "zonal": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 4, "issue": 1110, "notes": "Memory guards improved, iterrows replaced with isin. da.unique().compute() confirmed safe (small result). regions() is inherently global - documented limitation." },
    "viewshed": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "memory-bound", "high_count": 1, "issue": null },
    "rasterize": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "graph-bound", "high_count": 1, "issue": null },
    "bump": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 0, "issue": null },
    "normalize": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": 1124, "notes": "Boolean indexing replaced with lazy nanmin/nanmax/nanmean/nanstd." },
    "mahalanobis": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 1, "issue": null },
    "bilateral": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "diffusion": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 2, "issue": 1116, "notes": "Scalar diffusivity now passed as float to chunks. DataArray diffusivity passed as dask array via map_overlap." },
    "cost_distance": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 2, "issue": 1118, "notes": "Memory guard added + da.block assembly. Finite max_cost path (map_overlap) was already safe." },
    "sky_view_factor": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "worley": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "flood": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "aspect": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": 1122, "notes": "northness/eastness now use da.cos/sin on dask arrays." },
    "terrain": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "RISKY", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "terrain_metrics": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "memory-bound", "high_count": 0, "issue": null },
    "slope": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "perlin": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 0, "issue": null },
    "curvature": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "hillshade": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "contour": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "pathfinding": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 1, "issue": null },
    "erosion": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 2, "issue": 1120, "notes": "Memory guard added. Algorithm inherently global." },
    "geodesic": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "N/A", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "balanced_allocation": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 3, "issue": 1114, "notes": "Lazy source extraction + memory guard. Algorithm is inherently O(N*size) - documented limitation." },
    "corridor": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "polygonize": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "edge_detection": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "multispectral": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "fire": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "proximity": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "WILL OOM", "bottleneck": "memory-bound", "high_count": 3, "issue": 1111, "notes": "Memory guard added to line-sweep path. KDTree path (EUCLIDEAN/MANHATTAN + scipy) already had guards. GREAT_CIRCLE unbounded path already guarded." },
    "emerging_hotspots": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "dasymetric": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "memory-bound", "high_count": 0, "issue": 1126, "notes": "Memory guard added to validate_disaggregation. Core disaggregate uses map_blocks." },
    "classify": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "convolution": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "morphology": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "focal": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null },
    "glcm": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 1, "issue": null },
    "surface_distance": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "memory-bound", "high_count": 0, "issue": 1128, "notes": "Memory guard added to dd_grid allocation." },
    "mahalanobis": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null, "notes": "False positive. Numpy path materializes by design. Dask path uses lazy reductions + map_blocks." },
    "glcm": { "last_inspected": "2026-03-31T18:00:00Z", "oom_verdict": "SAFE", "bottleneck": "compute-bound", "high_count": 0, "issue": null, "notes": "Downgraded to MEDIUM. da.stack without rechunk is scheduling overhead, not OOM risk." }
  }
}
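
The state file above lists `mahalanobis` and `glcm` twice under `"modules"`. Python's `json` module keeps only the last occurrence of a repeated key, so the earlier entries are dropped silently on load. A hedged sketch of a stricter loader follows; `load_rejecting_duplicates` is a hypothetical helper, not something the commit provides.

```python
import json


def load_rejecting_duplicates(text):
    """Parse JSON text, raising ValueError if any object repeats a key."""
    def check(pairs):
        seen = {}
        for key, value in pairs:
            if key in seen:
                raise ValueError(f"duplicate key: {key!r}")
            seen[key] = value
        return seen

    # object_pairs_hook receives every object's key/value pairs in file order.
    return json.loads(text, object_pairs_hook=check)
```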
