Skip to content

Commit c032628

Browse files
timsaucerclaude
andcommitted
docs(ffi-skill): require dual-grep audit to prevent false-positive gap claims
Past audits filed gap-tracking issues by enumerating missing methods from memory rather than current source. Issue #22335 claimed `size` missing on `FFI_GroupsAccumulator` when it had been plumbed since PR #14775. Skill now: - Treats open `ffi`-label issues as claims requiring re-verification, not facts. Cites #22335 as cautionary example. - Replaces "How to enumerate defaults" with "How to audit a wrapper's coverage" — mandates dual greps (trait defaults + existing FFI fn-pointer fields) and `file:line` citation for both sides in any gap claim. - Drops `size` on groups accumulators from severity-class examples. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1 parent 1c3db56 commit c032628

1 file changed

Lines changed: 18 additions & 5 deletions

File tree

.ai/skills/datafusion-ffi/SKILL.md

Lines changed: 18 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -266,24 +266,37 @@ Common severity classes seen on the label today:
266266

267267
- **DML / optimizer-relevant defaults silently lost** — e.g. `delete_from`/`update`/`truncate` on table providers, distribution / ordering / pushdown on execution plans, `value_from_stats` on aggregates. These demote producer capability on the foreign side.
268268
- **SQL surface area silently absent** — naming hooks (`display_name`, `schema_name`, `documentation`), null-handling and within-group clauses on UDAFs, etc.
269-
- **Performance regressions, not correctness** — e.g. `memoize` on partition evaluators, `size` on groups accumulators.
269+
- **Performance regressions, not correctness** — e.g. `memoize` on partition evaluators.
270270
- **Open design questions** — e.g. whether `FFI_RecordBatchStream` needs `library_marker_id` at all.
271271

272-
When in doubt, open the label query; do not assume the list above is exhaustive. Wrappers without an open issue are **not** certified complete — re-enumerate the trait surface (see "How to enumerate defaults" below) whenever you audit one; upstream trait drift can introduce new defaulted methods at any time and silently re-open the silent-override-loss bug class.
272+
When in doubt, open the label query; do not assume the list above is exhaustive. Wrappers without an open issue are **not** certified complete — re-enumerate the trait surface (see "How to audit a wrapper's coverage" below) whenever you audit one; upstream trait drift can introduce new defaulted methods at any time and silently re-open the silent-override-loss bug class.
273+
274+
Conversely, an open issue under the `ffi` label is a **claim of a gap, not proof of one.** Past audits have filed false positives by enumerating gaps from memory rather than from current source (e.g. #22335 claimed `size` missing on `FFI_GroupsAccumulator` when it had been plumbed since PR #14775). Before acting on a listed gap — opening a fix PR, re-citing it in a new audit, or extending the list — run the dual-grep audit below and confirm the field is actually absent. If the issue is a false positive, close it as `not planned`, link the `file:line` of the existing plumbing, and remove the corresponding bullet from this skill.
273275

274276
When *closing* a gap, add the fn pointer unconditionally — the wrapper body calls the trait method on the inner `Arc<dyn Trait>` and Rust's dynamic dispatch picks the producer's override or falls back to the default. Use `Option<fn>` only when the new method also gains a corresponding capability flag on the producer's `new()`. Either way the layout changes, so the PR is an ABI break: mark `api change`, do not back-port to `branch-<major>`, and the workspace major bump in the next release makes the `version()` extern surface the change to consumers at load time.
275277

276-
### How to enumerate defaults
278+
### How to audit a wrapper's coverage
279+
280+
Every audit — opening an issue, filing a fix PR, or re-confirming a listed gap — must compare two sides drawn from current source. Never enumerate either side from memory or from a prior audit; trait surface and FFI struct both drift.
277281

278-
For trait `X`:
282+
**Side A — trait defaults (what could go missing):**
279283

280284
```bash
281285
# Find the trait definition
282286
grep -rn "pub trait X" datafusion/ --include='*.rs'
283287

284-
# Inspect for `fn method(...) { default_body }` (the body marks it as a default)
288+
# Inspect for `fn method(...) { default_body }` the body marks it as a default
285289
```
286290

291+
**Side B — FFI wrapper coverage (what is already plumbed):**
292+
293+
```bash
294+
# List every fn-pointer field on the FFI struct
295+
grep -nE 'pub [a-z_]+: (unsafe )?extern "C" fn' datafusion/ffi/src/.../X.rs
296+
```
297+
298+
Diff Side A against Side B. Any claim of a gap — in an issue body, audit summary, or PR description — must cite `file:line` for **both** the trait default and the FFI struct line where the field is (or is not). An issue body with only one side cited is incomplete and likely a false positive; reject it pending a re-grep.
299+
287300
If a method's body is non-trivial, the consumer-side default is non-trivial too. Decide explicitly whether the FFI should let the consumer recompute the same default, or whether the producer's override is what should travel.
288301

289302
## Type-bridging conventions

0 commit comments

Comments
 (0)