You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
- Rule 3: split function-pointer field sig (`unsafe extern "C"`) from bare
function defs in `lib.rs` (plain `extern "C"`); `version` field is unsafe,
`library_marker_id` field is plain.
- Rule 4: replace absolute "Send + Sync on every FFI_X" with bounds derived
from the wrapped trait. Streams + mutable traits (`Accumulator`,
`GroupsAccumulator`, `PartitionEvaluator`) are `Send`-only.
- Rule 7: cite `docs/source/contributor-guide/api-health.md` for the
release-note convention behind the `api-change` label.
- Rule 8 + checklist 15: match release branches with strict `^branch-\d+$`
regex; explicitly reject globbing `branch-*` (collides with
`branch-53-cherry-pick-1`).
- Template (`### 1. The FFI_X struct`): compress the `Option<fn>` block to
a one-line cross-ref; add a parallel `Box<dyn X>` code block + canonical
example paths under `### 2. PrivateData shape`.
- Checklist item 1 + 6: align with new method-coverage and Send/Sync rules.
- Integration-test table: replace "pure-stabby-string getters can skip"
with a defined allow-list (primitives, FFI enums, bare `SString`) plus
an explicit non-skippable list.
- Known gaps: drop static issue enumeration; point to the live `label:ffi`
GitHub query as source of truth, with a `gh` one-liner and severity
classes so the section still teaches without rotting.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy file name to clipboardExpand all lines: .ai/skills/datafusion-ffi/SKILL.md
+43-29Lines changed: 43 additions & 29 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -23,12 +23,12 @@ Trigger this skill any time the work touches `datafusion/ffi/`:
23
23
24
24
1.**No `datafusion` dependency.**`datafusion-ffi` must not depend on the umbrella `datafusion` crate. Use the leaf crates (`datafusion-common`, `datafusion-expr`, `datafusion-catalog`, `datafusion-physical-plan`, etc.). `datafusion` is fine in `[dev-dependencies]`.
25
25
2.**`#[repr(C)]` on every `FFI_X` struct**, not `#[stabby::stabby]`. Stabby is used for `SString`/`SVec` only. Reasons documented in the README (build time, Arrow types lack `IStable`).
26
-
3.**`unsafe extern "C"` on every functionpointer.** Plain `extern "C"`is reserved for `version` and `library_marker_id`.
27
-
4.**`unsafe impl Send` + `unsafe impl Sync`**for every `FFI_X`and every `ForeignX` — the raw pointer makes the struct `!Send`/`!Sync` by default.
26
+
3.**`unsafe extern "C"` on every function-pointer field — including `version`.**The one exception is the `library_marker_id` field, which is plain `extern "C" fn() -> usize`. Plain (safe) `extern "C"`also applies to the standalone function defs in `src/lib.rs` — `pub extern "C" fn version()` and `pub extern "C" fn get_library_marker_id()` — which coerce into the `unsafe extern "C"``version` field slot at construction.
27
+
4.**Match `Send`/`Sync` to the wrapped trait.**Raw `*mut c_void` makes every `FFI_X``!Send + !Sync` by default — `unsafe impl` whichever bounds the consumer-facing trait requires. Most DataFusion traits (`TableProvider`, `ExecutionPlan`, all UDFs, codecs) need both. `Send`-only: `RecordBatchStream`, `Accumulator`, `GroupsAccumulator`, `PartitionEvaluator` (mutable / stream APIs). The matching `ForeignX` always carries the same bounds — pick consistently.
28
28
5.**`#![deny(clippy::clone_on_ref_ptr)]`** is on at the crate root. Use `Arc::clone(&x)`, never `x.clone()` on `Arc`.
29
29
6.**Run before pushing:**`cargo fmt --all`, `cargo clippy -p datafusion-ffi --all-targets --all-features -- -D warnings`, `cargo test -p datafusion-ffi`.
30
-
7.**`api-change` label required.** Any PR that modifies an `FFI_X` struct layout (adds/removes/reorders fields, changes a function-pointer signature, adds a variant to an FFI enum, or changes the `version` extern) must carry the `api-change` GitHub label. Layout changes break ABI for already-compiled consumer libraries. The label drives release-note flagging so downstream users (e.g. `datafusion-python`, plugin authors) know they must recompile against the new DataFusion major. The `version()` extern in `src/lib.rs` returns the major of workspace `CARGO_PKG_VERSION`; consumers compare it at load time and can refuse mismatched producers. Apply via `gh pr edit <num> --add-label api-change`. When reviewing such a PR, block merge until label present.
31
-
8.**No FFI struct changes in patch releases.** Patch releases ship from branches named `branch-<major>` (e.g. `branch-53`, `branch-52`). FFI struct layout changes (anything that would earn rule 7's `api-change` label) **must not** target a `branch-<major>` branch and must not be back-ported. Patch releases are ABI-stable by contract — a consumer compiled against `53.1.0` must keep working against `53.1.1`. Before reviewing/approving an FFI PR, check the PR's base branch: if it matches `branch-*` or the PR description / labels indicate patch / back-port, reject the FFI struct change and ask the author to retarget `main`. Bugfixes that do not alter struct layout (e.g. fixing a function-pointer body) are fine to back-port. Quick check: `gh pr view <num> --json baseRefName,labels`.
30
+
7.**`api-change` label required.** Any PR that modifies an `FFI_X` struct layout (adds/removes/reorders fields, changes a function-pointer signature, adds a variant to an FFI enum, or changes the `version` extern) must carry the `api-change` GitHub label. Layout changes break ABI for already-compiled consumer libraries. The label is the project-wide convention for highlighting breaking public-API changes in release notes — see `docs/source/contributor-guide/api-health.md` §"What to do when making breaking API changes?" (step 1 names the label explicitly). Downstream users (e.g. `datafusion-python`, plugin authors) read the labelled notes to know they must recompile against the new DataFusion major. The `version()` extern in `src/lib.rs` returns the major of workspace `CARGO_PKG_VERSION`; consumers compare it at load time and can refuse mismatched producers. Apply via `gh pr edit <num> --add-label api-change`. When reviewing such a PR, block merge until label present.
31
+
8. **No FFI struct changes in patch releases.** Patch releases ship from branches matching `^branch-\d+$` (e.g. `branch-53`, `branch-52`). FFI struct layout changes (anything that would earn rule 7's `api-change` label) **must not** target a release branch and must not be back-ported. Patch releases are ABI-stable by contract — a consumer compiled against `53.1.0` must keep working against `53.1.1`. Before reviewing/approving an FFI PR, check the PR's base branch: if it matches the regex above, or the PR description / labels indicate patch / back-port, reject the FFI struct change and ask the author to retarget `main`. Bugfixes that do not alter struct layout (e.g. fixing a function-pointer body) are fine to back-port. Quick check: `gh pr view <num> --json baseRefName,labels --jq '.baseRefName'` then match against `^branch-\d+$`. Do **not** glob-match `branch-*` — back-port / cherry-pick working branches (e.g. `branch-53-cherry-pick-1`) also share that prefix but are not release branches; only the strict `branch-<digits>` form is the freeze target.
32
32
33
33
## The standard wrapper shape
34
34
@@ -46,14 +46,8 @@ pub struct FFI_X {
46
46
// gets the right answer either way without the consumer needing to know.
// Codec only if the trait moves Exprs/Plans across the boundary.
@@ -67,6 +61,8 @@ pub struct FFI_X {
67
61
publibrary_marker_id:extern"C"fn() ->usize,
68
62
}
69
63
64
+
// Pick bounds to match the wrapped trait (see rule 4).
65
+
// Most wrappers want both; `Send`-only for streams / mutable traits.
70
66
unsafeimplSendforFFI_X {}
71
67
unsafeimplSyncforFFI_X {}
72
68
```
@@ -88,7 +84,16 @@ struct XPrivateData {
88
84
}
89
85
```
90
86
91
-
For traits that require `&mut self` (e.g. `Accumulator`), use `Box<dyn X>`. A `Box`-backed `FFI_X`**cannot implement `Clone`**; document this and skip the `clone` function pointer or hand-write a release path that distinguishes producer vs consumer side (see `FFI_Accumulator` in `src/udaf/accumulator.rs`).
87
+
For traits that require `&mut self` (e.g. `Accumulator`, `GroupsAccumulator`, `PartitionEvaluator`), use `Box<dyn X>`:
88
+
89
+
```rust
90
+
structXPrivateData {
91
+
inner:Box<dynX>,
92
+
runtime:Option<Handle>, // include when any async method exists
93
+
}
94
+
```
95
+
96
+
A `Box`-backed `FFI_X`**cannot implement `Clone`**; document this and skip the `clone` function pointer, or hand-write a release path that distinguishes producer vs consumer side. Canonical example: `FFI_Accumulator` in `src/udaf/accumulator.rs`. See also `FFI_GroupsAccumulator` (`src/udaf/groups_accumulator.rs`) and `FFI_PartitionEvaluator` (`src/udwf/partition_evaluator.rs`).
92
97
93
98
### 3. Function-pointer wrappers
94
99
@@ -221,11 +226,18 @@ What integration tests catch that unit tests cannot:
| New method on existing `FFI_X`| Yes | Yes if the method takes/returns a non-trivial FFI type (anything other than primitives + enums). Pure-stabby-string getters can skip. |
229
+
| New method on existing `FFI_X`| Yes | Yes if the method takes/returns a non-trivial FFI type. See note below. |
225
230
| Bugfix to a wrapper body, no signature change | Yes | Only if reproducing the bug requires cross-library symbol lookup or `dlopen` semantics |
226
231
| Layout change (`#[repr(C)]` field add/remove/reorder, fn-ptr sig) | Yes |**Mandatory** — this is exactly the bug class integration tests exist for |
227
232
| New `From<X> for FFI_X` or codec change | Yes | Yes if the codec is exercised by the cross-library round-trip |
228
233
234
+
**"Non-trivial FFI type" for the table above** — anything other than:
- A `stabby::string::String` (`SString`) returned by value, with no other args or returns.
238
+
239
+
Concrete skippable example: `fn name(&self) -> SString` reading a field already validated by another method. Concrete *non*-skippable examples: anything returning `SVec<T>`, `FFI_Option<T>`, `FFI_Result<T>`, `WrappedSchema`, `WrappedArray`, an `FfiFuture`, an `FFI_*` sub-struct, or any `*mut`/`*const` pointer — those exercise alignment / padding / niche-opt across the ABI boundary and need the two-build coverage. When unsure, write the integration test; the cost is one constructor + ~20 lines.
240
+
229
241
If you skip the integration test for a layout change, you have effectively shipped untested ABI.
230
242
231
243
## Method coverage — the silent-default gap
@@ -249,20 +261,22 @@ When adding or auditing an `FFI_X`, **enumerate every method on the wrapped trai
249
261
250
262
### Known gaps to close
251
263
252
-
A re-audit (2026-05) opened tracking issues for each wrapper with real gaps. Treat new PRs in these areas as opportunities to fix them; treat new wrappers as required to avoid creating more. Each linked issue lists the specific missing methods and severity.
264
+
Tracked gaps live on GitHub under the [`ffi`](https://github.com/apache/datafusion/issues?q=is%3Aissue+is%3Aopen+label%3Affi) label — that query is the source of truth and stays current as issues are filed or closed. Treat new PRs in those areas as opportunities to fix the listed methods; treat new wrappers as required to avoid creating more. Each open issue names the specific wrapper, the missing methods, and the severity.
265
+
266
+
Quick CLI list:
267
+
268
+
```bash
269
+
gh issue list --repo apache/datafusion --label ffi --state open --limit 50
270
+
```
271
+
272
+
Common severity classes seen on the label today:
253
273
254
-
-**`FFI_TableProvider`** — [#22328](https://github.com/apache/datafusion/issues/22328). Missing: `constraints`, `get_table_definition`, `get_logical_plan`, `get_column_default`, `scan_with_args`, `statistics`, `delete_from`, `update`, `truncate`. DML omissions are the worst — they silently demote a writable table to read-only on the foreign side.
255
-
-**`FFI_ExecutionPlan`** — [#22329](https://github.com/apache/datafusion/issues/22329). Largest single gap by method count: 18 defaults (distribution / ordering, filter pushdown, projection swapping, limit pushdown, cardinality, partition stats, state / fetch handling). Silently breaks optimizer decisions on the consumer side.
-**`FFI_WindowUDF`** — [#22332](https://github.com/apache/datafusion/issues/22332). Missing: `simplify`, `expressions`, `reverse_expr`, `documentation`. `expressions()` selects which physical args reach the evaluator — silent fallback can change semantics.
259
-
-**`FFI_SchemaProvider`** — [#22333](https://github.com/apache/datafusion/issues/22333). Missing: `table_type` async cheap-path. Forces full `table()` materialization for `information_schema.tables` probes. (`owner_name` is exposed as a static `FFI_Option<SString>` snapshot field — note for future review, not a gap today.)
260
-
-**`FFI_PhysicalOptimizerRule`** — [#22334](https://github.com/apache/datafusion/issues/22334). Missing: `optimize_with_context`. Rules whose output depends on per-invocation statistics / options are silently downgraded to the context-free path.
261
-
-**`FFI_GroupsAccumulator`** — [#22335](https://github.com/apache/datafusion/issues/22335). Missing: `size`. Consumer-side memory accounting reports trait default rather than producer's real footprint.
-**`FFI_RecordBatchStream`** — [#22337](https://github.com/apache/datafusion/issues/22337). Open question: no `library_marker_id` field, unlike most other `FFI_X`. Confirm intentional vs gap; if intentional, document the rationale and update this skill.
274
+
-**DML / optimizer-relevant defaults silently lost** — e.g. `delete_from`/`update`/`truncate` on table providers, distribution / ordering / pushdown on execution plans, `value_from_stats` on aggregates. These demote producer capability on the foreign side.
275
+
-**SQL surface area silently absent** — naming hooks (`display_name`, `schema_name`, `documentation`), null-handling and within-group clauses on UDAFs, etc.
276
+
-**Performance regressions, not correctness** — e.g. `memoize` on partition evaluators, `size` on groups accumulators.
277
+
-**Open design questions** — e.g. whether `FFI_RecordBatchStream` needs `library_marker_id` at all.
264
278
265
-
Wrappers not listed above are not certified complete — they are merely not currently tracked. Re-enumerate the trait surface (see "How to enumerate defaults" below) whenever you audit one; upstream trait drift can introduce new defaulted methods at any time and silently re-open the silent-override-loss bug class.
279
+
When in doubt, open the label query; do not assume the list above is exhaustive. Wrappers without an open issue are **not** certified complete — re-enumerate the trait surface (see "How to enumerate defaults" below) whenever you audit one; upstream trait drift can introduce new defaulted methods at any time and silently re-open the silent-override-loss bug class.
266
280
267
281
When *closing* a gap, add the fn pointer unconditionally — the wrapper body calls the trait method on the inner `Arc<dyn Trait>` and Rust's dynamic dispatch picks the producer's override or falls back to the default. Use `Option<fn>` only when the new method also gains a corresponding capability flag on the producer's `new()`. Either way the layout changes, so the PR is an ABI break: mark `api-change`, do not back-port to `branch-<major>`, and the workspace major bump in the next release makes the `version()` extern surface the change to consumers at load time.
268
282
@@ -308,12 +322,12 @@ If a method's body is non-trivial, the consumer-side default is non-trivial too.
308
322
309
323
When reviewing a PR that touches `datafusion/ffi/`:
310
324
311
-
1.**Trait coverage.** Pull up the underlying trait. List its methods. Confirm every non-defaulted method has a function pointer. For each *defaulted* method, ask whether a real-world producer would override it — if yes, the PR must either plumb it through (preferably as `Option<fn>`) or explicitly justify the omission in a comment.
325
+
1.**Trait coverage.** Pull up the underlying trait. List its methods. Confirm every non-defaulted method has a function pointer. For each *defaulted* method, ask whether a real-world producer would override it — if yes, the PR must either plumb it through as a plain `unsafe extern "C" fn` (let dynamic dispatch on `Arc<dyn Trait>` pick override-or-default) or explicitly justify the omission in a comment. Reserve `Option<fn>` for the capability-flag pattern described in §"Method coverage" — do not use it just because the trait has a default.
312
326
2.**Layout fields.**`clone`, `release`, `version`, `private_data`, `library_marker_id` all present?
313
327
3.**Marker-id bypass** in `From<&FFI_X>`?
314
328
4.**Round-trip downcast** in the constructor?
315
329
5.**`Drop` calling `release`**, and `release` nulling the pointer?
316
-
6.**`Send` + `Sync` unsafe impls**on both `FFI_X`and`ForeignX`?
330
+
6.**`Send`/`Sync` unsafe impls**match the wrapped trait's bounds (rule 4), and `FFI_X`+`ForeignX` agree?
317
331
7.**Stabby types** for strings/vecs; crate-local `FFI_Option`/`FFI_Result` for optional/fallible payloads?
318
332
8.**Async** uses `FfiFuture` + `.into_ffi()`, never blocking?
319
333
9.**Codec** present on any method that ships an `Expr` / plan?
@@ -322,7 +336,7 @@ When reviewing a PR that touches `datafusion/ffi/`:
322
336
12.**`Arc::clone(&x)`** everywhere — no implicit `x.clone()` on `Arc` (the lint will reject it but worth pre-flagging).
323
337
13.**`cargo clippy --all-targets --all-features -- -D warnings`** clean on the crate?
324
338
14.**`api-change` label** on the PR if any `FFI_X` struct layout / fn-ptr signature / FFI enum / `version` extern changed? Block merge until applied.
325
-
15.**Base branch check.** If FFI struct layout changed, base branch must be `main`, never `branch-<major>`. Reject back-ports of ABI-breaking changes to patch-release branches. `gh pr view <num> --json baseRefName,labels` to verify.
339
+
15.**Base branch check.** If FFI struct layout changed, base branch must be `main`, never a release branch matching `^branch-\d+$` (e.g. `branch-53`). Reject back-ports of ABI-breaking changes to patch-release branches. Verify with `gh pr view <num> --json baseRefName,labels --jq '.baseRefName'`; match strictly against `^branch-\d+$` — cherry-pick working branches like `branch-53-cherry-pick-1` also start with `branch-` and must not false-positive.
0 commit comments