Skip to content

[VL] Hoist per-partition constants out of ColumnarCachedBatchSerializer.serialize hot path#12166

Open
yaooqinn wants to merge 1 commit into
apache:mainfrom
yaooqinn:users/kentyao/gluten-ccbs-iterator-hoist
Open

[VL] Hoist per-partition constants out of ColumnarCachedBatchSerializer.serialize hot path#12166
yaooqinn wants to merge 1 commit into
apache:mainfrom
yaooqinn:users/kentyao/gluten-ccbs-iterator-hoist

Conversation

@yaooqinn
Copy link
Copy Markdown
Member

What changes were proposed in this pull request?

In ColumnarCachedBatchSerializer.convertInternalRowToCachedBatch,
three values that are constant for the lifetime of the per-partition
Iterator[CachedBatch] were being re-evaluated on every next() call:

  1. BackendsApiManager.getBackendName -- invoked twice per batch
    (once for the JNI runtime context, once for getNativeHandle)
  2. GlutenConfig.get.getConf(COLUMNAR_TABLE_CACHE_PARTITION_STATS_ENABLED)
    -- GlutenConfig.get allocates a fresh GlutenConfig(SQLConf.get)
    on every call (GlutenConfig.scala L584-L586) plus a SparkConf
    hashmap lookup
  3. ColumnarBatchSerializerJniWrapper.create(Runtimes.contextInstance(...))
    -- JNI wrapper construction

This PR hoists all three out of next() into the mapPartitions body,
alongside the structSchema value that the same block already hoists
for the identical many-small-batch GC-pressure reason. Only the
per-batch handle = ColumnarBatches.getNativeHandle(backendName, batch)
remains inside next(), since it genuinely depends on the batch.

Why are the changes needed?

The block immediately preceding new Iterator[CachedBatch] already
hoists structSchema with a comment explicitly citing many-small-batch
GC pressure. The three values listed above belong in the same hoist --
they are all partition-iterator-scoped constants whose per-batch
re-evaluation produces zero behavioral signal but consumes allocator /
JNI traffic on every batch.

GlutenConfig.get is especially worth hoisting because it eagerly
constructs a new GlutenConfig(SQLConf.get) on each invocation rather
than returning a cached instance:

def get: GlutenConfig = new GlutenConfig(SQLConf.get)

Does this PR introduce any user-facing change?

No. Wire format is byte-identical; partitionStatsEnabled is now
captured once at partition start instead of re-read per batch, which
matches the existing semantics of SQLConf.get in Spark task
execution (already TaskContext-thread-snapshot) -- no observable
difference from the per-batch re-read.

How was this patch tested?

The change is a pure refactor with byte-identical wire output. The
behavior is fully covered by the existing serializer test suites:

All 7 tests green locally on -Pspark-4.1 -Pscala-2.13.

No new test file is added. Cross-batch reuse of the hoisted JNI
wrapper is safe because (a) ColumnarBatchSerializerJniWrapper holds
only a final Runtime runtime reference with no mutable per-batch
state, and (b) the native side (cpp/core/jni/JniWrapper.cc L1280-L1325)
constructs a fresh ctx->createColumnarBatchSerializer(nullptr) on
every JNI call -- the wrapper is a stateless route to the runtime.


Was this patch authored or co-authored using generative AI tooling?
No

…er.serialize hot path

In convertInternalRowToCachedBatch, three values that are constant for
the lifetime of the per-partition Iterator[CachedBatch] were being
re-evaluated on every next() call:

1. BackendsApiManager.getBackendName (twice per batch)
2. GlutenConfig.get.getConf(COLUMNAR_TABLE_CACHE_PARTITION_STATS_ENABLED)
   -- GlutenConfig.get allocates a fresh GlutenConfig(SQLConf.get) on
   every call (GlutenConfig.scala L584-L586)
3. ColumnarBatchSerializerJniWrapper.create(Runtimes.contextInstance(...))

Hoist all three out of next() into the mapPartitions body, alongside
the structSchema value that the same block already hoists for the same
many-small-batch GC-pressure reason. Only the per-batch handle remains
inside next() since it depends on the batch.

Wire format is byte-identical. Pure refactor with no new test file;
behavior fully covered by ColumnarCachedBatchKryoSuite and
ColumnarCachedBatchKryoBoundaryProbeBugSuite (7 tests, all green
locally on -Pspark-4.1 -Pscala-2.13).

refs: todos/features/gluten-ccbs-iterator-hoist/docs/0002-decision.md
refs: todos/features/gluten-ccbs-iterator-hoist/docs/0003-implementation-plan.md
@github-actions github-actions Bot added the VELOX label May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants