Skip to content

[Pipe] Optimize memory usage#17770

Open
Caideyipi wants to merge 7 commits into
apache:masterfrom
Caideyipi:pipe-opti
Open

[Pipe] Optimize memory usage#17770
Caideyipi wants to merge 7 commits into
apache:masterfrom
Caideyipi:pipe-opti

Conversation

@Caideyipi
Copy link
Copy Markdown
Collaborator

Description

  • Compact empty Pipe tablet bitmaps and avoid carrying bitmap arrays when no null values are marked.
  • Reuse local string intern pools while deserializing receiver-side raw/batch tablet requests, reducing duplicate device/table/measurement/database strings.
  • Reuse the same local string pool while splitting TsFile events into tablets, especially for repeated devices and measurement names.
  • Add coverage for bitmap compaction and repeated measurement-name deduplication.

Tests

  • git diff --check
  • mvn -pl iotdb-core/datanode -DskipTests spotless:apply
  • mvn -pl iotdb-core/datanode -DskipTests -Dcheckstyle.skip=true compile (fails locally on unrelated generated/parser sources, e.g. CountDatabasesStatementContext and generated fill/mode classes)

luoluoyuyu

This comment was marked as outdated.

luoluoyuyu

This comment was marked as outdated.

Copy link
Copy Markdown
Member

@luoluoyuyu luoluoyuyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate compactBitMaps

PipeTabletUtils.compactBitMaps (iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/event/common/tablet/PipeTabletUtils.java L143-161) and InsertTabletNode.compactBitMaps (InsertTabletNode.java L466-482) are the same loop. Consider reusing one implementation so split (getSplit L385) and pipe parsing stay in sync.

Lazy bitmap contract

initEmptyBitMaps (L132-134) only allocates a BitMap[] of nulls; nulls are created in markNullValue (L168-175). Call sites that still assume getBitMaps()[i] is non-null need to use tablet.isNull() — e.g. OpcUaNameSpace.java was fixed; please double-check any remaining direct index access in this PR’s diff.

Deserialize: bitMaps == null

In TabletStatementConverter.deserializeStatementFromTabletFormat (TabletStatementConverter.java L180-188), when the bitmap section is absent, bitMaps is now null instead of new BitMap[schemaSize]. readBitMapsFromBufferWithMemory (L336-368) also drops all-unmarked columns. This matches InsertTabletNode.initBitmapsForSplit (L448-464) but is a wire/WAL contract change — worth a quick audit of non-pipe readers of InsertTabletNode.getBitMaps().

String intern scope

PipeTransferTabletRawReq.fromTPipeTransferReq (PipeTransferTabletRawReq.java L160-162) creates a new TabletStringInternPool per request. For batch payloads, PipeTransferTabletBatchReq (L133) already holds one pool per batch — consider the same pattern for raw req if multiple tablets share device/measurement names.

TabletInsertionEventParser InsertRow path

In parse(InsertRowNode) (TabletInsertionEventParser.java L171-173), filterValueColumnsByRowIndexList(..., /*origin*/ null, bitMap) replaces passing bitMap as both origin and output. InsertRow has no source bitmaps, so this should be equivalent; a small unit test on this constructor path would lock the behavior.

Copy link
Copy Markdown
Member

@luoluoyuyu luoluoyuyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional code review (follow-up):

1. PipeRowCollector not migrated — likely NPE / inconsistent with lazy bitmaps

PipeRowCollector.collectRow still calls tablet.initBitMaps() (L80) and tablet.getBitMaps()[i].mark(rowIndex) (L97). Other parsers in this PR use PipeTabletUtils.initEmptyBitMaps + markNullValue. If PipeRow carries compacted/null bitMaps and this path is ever switched to lazy init without updating L97, it will NPE. Recommend aligning this class with the new utils (and compactBitMaps before emit).

2. PipeRow.isNull — missing bounds check

PipeRow.java L154-157 guards bitMaps and bitMaps[columnIndex] but not columnIndex < bitMaps.length. With sparse/null-padded bitMaps arrays from TabletInsertionEventParser, out-of-range column access can still throw.

3. compactBitMaps duplicated in 4 places

Same loop now exists in PipeTabletUtils (L143-161), InsertTabletNode (L466-482), and InsertTabletStatement (L903-920, also L393 on split). Drift risk is higher than noted earlier — worth centralizing.

4. PipeTreeModelTsFileBuilder aggregation may re-allocate empty bitmap slots

tryBestToAggregateTablets uses PipeTabletUtils.copyBitMapsOrCreateEmpty(tablet) (L234). When tablet.getBitMaps() is null, this allocates a BitMap[columnCount] of nulls per aggregated tablet. The merged Tablet constructor (L255-260) does not call compactBitMaps, so aggregation can retain O(columns × tablets) null bitmap shells and partially undo the memory win on sink TsFile build paths. Consider compactBitMaps on the aggregated tablet (V1/V2 builders share this).

5. TabletStringInternPool.intern(MeasurementSchema) mutates schema props

PipeTabletUtils.java L112-120 replaces schema.setProps(intern(...)) with a new HashMap each time. Safe for parser-owned schemas, but if a shared MeasurementSchema instance is reused across events, props could be unexpectedly replaced. Worth a quick comment or copy-on-intern only when needed.

6. TabletInsertionEventTableParserTabletIterator.internMeasurementName

L478-483 calls tabletStringInternPool.intern((MeasurementSchema) schema) then intern(schema.getMeasurementName()) — redundant for MeasurementSchema; minor, but the double call is confusing.

7. Tests gap: PipeRowCollector / TsFile builder aggregation

New tests cover parsers, split, and batch intern (PipeDataNodeThriftRequestTest), but not PipeRowCollector or multi-tablet TsFile builder aggregation with all-null columns. A small UT would catch (1) and (4).

Copy link
Copy Markdown
Member

@luoluoyuyu luoluoyuyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential bug (regression): DefaultOperationQuota vs bitMaps == null

This PR sets InsertTabletStatement.bitMaps to null when there are no nulls (TabletStatementConverter.java L180-188). That is fine for memtable / InsertTabletNode (they already null-check), but DefaultOperationQuota does not:

// DefaultOperationQuota.java L120-121
for (BitMap bitMap : insertTabletStatement.getBitMaps()) {
  avgSize += bitMap.getSize();
}
  • If getBitMaps() is nullNPE on the enhanced-for.
  • If getBitMaps() is a sparse array with null slots (after readBitMapsFromBufferWithMemory / compactBitMaps) → NPE on bitMap.getSize().

Same pattern at L155-157 for InsertMultiTabletsStatement.

Repro path: Pipe sink receives / deserializes a tablet batch with no null columns → statement keeps bitMaps == null → any insert that goes through quota estimation.

Suggested fix: null-check before iterating; skip null elements (same as AbstractMemTable.computeTabletNullPointsNumber).


Other areas reviewed (PipeRow, parsers, split, TsFile builders, PipeRowCollector) look OK under current code — no additional definite bugs found beyond this quota path.

@Caideyipi
Copy link
Copy Markdown
Collaborator Author

Handled the review items in c16fa23 and 4879c3a:

  • Fixed DefaultOperationQuota for bitMaps == null and sparse bitmap arrays, including InsertMultiTabletsStatement. Added DefaultOperationQuotaTest for both cases.
  • Centralized compactBitMaps into BitMapUtils and made InsertTabletNode, RelationalInsertTabletNode, InsertTabletStatement, and PipeTabletUtils delegate to the same implementation.
  • Removed redundant empty bitmap initialization from TsFile parser tablet builders. markNullValue now creates the bitmap array and per-column bitmap lazily only when a null is recorded.
  • Migrated PipeRowCollector to the same lazy bitmap path and compact before emitting the collected tablet.
  • Added a bounds guard in PipeRow.isNull for sparse bitmap arrays.

For the string-intern scope comment: raw req carries a single tablet, so a per-raw-request pool is enough there; batch req already reuses one pool across all tablets in the batch. For rolling compatibility: null bitMaps are still represented by the existing hasBitMaps=false branch, and the quota path now skips both null arrays and null slots.

Verified:

  • git diff --check
  • mvn -pl iotdb-core/datanode -DskipTests spotless:apply
  • mvn -pl iotdb-core/datanode -Dtest=DefaultOperationQuotaTest test
  • mvn -pl iotdb-core/datanode -Dtest=WritePlanNodeSplitTest test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants