Commit 70dcae1

lmeyerov and claude committed
docs(gfql): review fixes — crossover consistency, forward ref, public phrasing
- performance.rst said 'below ~1M edges pandas often wins', contradicting engines.rst's measured ~10K polars crossover one click away — aligned to the measured guidance.
- engines.rst referenced :doc:`benchmark_graphframes`, a page that only lands in the stacked benchmarks PR (#1668) — Sphinx unknown-doc warning if this PR ships alone. Reworded; #1668 restores the live link.
- 'NO-CHEATING' is internal methodology jargon — public page now says 'No silent fallback — parity-verified' (same guarantee, reader-facing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
1 parent c933213 commit 70dcae1

2 files changed

Lines changed: 5 additions & 4 deletions

docs/source/gfql/engines.rst

Lines changed: 3 additions & 3 deletions
@@ -436,8 +436,8 @@ change:
 ``polars.LazyFrame`` is collected immediately), so the source graph must still
 fit in memory. True out-of-core-from-disk — building GFQL directly on a lazy
 ``pl.scan_parquet`` source so a graph larger than RAM never fully materializes —
-is **work in progress**; see the Friendster (~1.8B edges) discussion in
-:doc:`benchmark_graphframes`.
+is **work in progress**; see the Friendster (~1.8B edges) discussion in the
+GraphFrames benchmark page.
 
 When **not** to use Polars
 --------------------------
@@ -466,7 +466,7 @@ Parity and honesty
 - **Identical results across engines.** Differential parity — every engine's output must match
   the pandas oracle — is a release gate, exercised across forward/reverse/undirected, 1-3 hop,
   filters, and aggregations.
-- **No silent fallback (NO-CHEATING).** The Polars engine runs natively or raises
+- **No silent fallback — parity-verified.** The Polars engine runs natively or raises
   ``NotImplementedError`` — it never quietly converts to pandas. ``polars-gpu`` is
   **GPU-or-error**: if any step of the plan cannot run on the GPU it raises (pointing at
   ``engine='polars'``) rather than silently running on CPU and labelling it a GPU result.

docs/source/gfql/performance.rst

Lines changed: 2 additions & 1 deletion
@@ -40,7 +40,8 @@ Warm-median latency, same query, identical result rows (**Orkut**, 117M edges, S
 - 314 ms
 - **167 ms**
 
-There is **no universal winner**: below ~1M edges ``pandas`` often wins, and the right GPU
+There is **no universal winner**: ``polars`` typically takes over from ~10K edges up
+(``pandas`` still wins trivial sub-millisecond operations), and the right GPU
 engine depends on the workload. See :doc:`engines` for the full decision matrix, the honest
 "when *not* to use Polars", the cuDF-vs-Polars-GPU comparison, and the methodology + reproducer
 scripts behind these numbers. The end-to-end CPU/GPU-vs-Neo4j benchmark is in
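
The revised crossover guidance can be read as a rough starting heuristic for choosing a CPU engine. The sketch below is an assumption-laden illustration, not part of the library: ``suggest_cpu_engine`` and ``POLARS_CROSSOVER_EDGES`` are hypothetical names, and the threshold simply encodes the approximate ~10K-edge figure quoted above. Because the docs say "typically" and note that ``pandas`` still wins trivial operations, treat the result as a default to benchmark against, not a rule.

```python
# Approximate crossover taken from the text above (~10K edges); hypothetical constant.
POLARS_CROSSOVER_EDGES = 10_000


def suggest_cpu_engine(n_edges: int) -> str:
    """Return a starting-point CPU engine name for a graph with n_edges edges.

    This is only a heuristic: GPU engines are deliberately out of scope,
    since the right GPU engine depends on the workload.
    """
    if n_edges < 0:
        raise ValueError("n_edges must be non-negative")
    return "polars" if n_edges >= POLARS_CROSSOVER_EDGES else "pandas"


small = suggest_cpu_engine(500)          # tiny graph
orkut = suggest_cpu_engine(117_000_000)  # Orkut-scale edge count from the table header
```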
