docs(gfql): review fixes — crossover consistency, forward ref, public phrasing

lmeyerov · claude · lmeyerov · commit 70dcae16172f · 2026-07-02T16:38:41.000-07:00
- performance.rst said 'below ~1M edges pandas often wins', contradicting
  engines.rst's measured ~10K polars crossover one click away; aligned to the
  measured guidance.
- engines.rst referenced :doc:`benchmark_graphframes`, a page that only lands
  in the stacked benchmarks PR (#1668): Sphinx unknown-doc warning if this PR
  ships alone. Reworded; #1668 restores the live link.
- 'NO-CHEATING' is internal methodology jargon; the public page now says
  'No silent fallback — parity-verified' (same guarantee, reader-facing).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
diff --git a/docs/source/gfql/engines.rst b/docs/source/gfql/engines.rst
@@ -436,8 +436,8 @@ change:
    ``polars.LazyFrame`` is collected immediately), so the source graph must still
    fit in memory. True out-of-core-from-disk — building GFQL directly on a lazy
    ``pl.scan_parquet`` source so a graph larger than RAM never fully materializes —
-   is **work in progress**; see the Friendster (~1.8B edges) discussion in
-   :doc:`benchmark_graphframes`.
+   is **work in progress**; see the Friendster (~1.8B edges) discussion in the
+   GraphFrames benchmark page.
 
 When **not** to use Polars
 --------------------------
@@ -466,7 +466,7 @@ Parity and honesty
 - **Identical results across engines.** Differential parity — every engine's output must match
   the pandas oracle — is a release gate, exercised across forward/reverse/undirected, 1-3 hop,
   filters, and aggregations.
-- **No silent fallback (NO-CHEATING).** The Polars engine runs natively or raises
+- **No silent fallback — parity-verified.** The Polars engine runs natively or raises
   ``NotImplementedError`` — it never quietly converts to pandas. ``polars-gpu`` is
   **GPU-or-error**: if any step of the plan cannot run on the GPU it raises (pointing at
   ``engine='polars'``) rather than silently running on CPU and labelling it a GPU result.
diff --git a/docs/source/gfql/performance.rst b/docs/source/gfql/performance.rst
@@ -40,7 +40,8 @@ Warm-median latency, same query, identical result rows (**Orkut**, 117M edges, S
      - 314 ms
      - **167 ms**
 
-There is **no universal winner**: below ~1M edges ``pandas`` often wins, and the right GPU
+There is **no universal winner**: ``polars`` typically takes over from ~10K edges up
+(``pandas`` still wins trivial sub-millisecond operations), and the right GPU
 engine depends on the workload. See :doc:`engines` for the full decision matrix, the honest
 "when *not* to use Polars", the cuDF-vs-Polars-GPU comparison, and the methodology + reproducer
 scripts behind these numbers. The end-to-end CPU/GPU-vs-Neo4j benchmark is in