[feature](load) introduce adaptive random bucket load routing #62661
sollhui wants to merge 2 commits
Findings
- The per-BE sink mutation in `assignRandomBucketPerBe` is not actually isolated. `PlanFragment.toThrift()` reuses `OlapTableSink.toThrift()`, and that method returns the shared `tDataSink` object, so replacing `olap_table_sink` on one worker overwrites the sink seen by the other workers.
- Only the Nereids `ThriftPlansBuilder` path populates `local_bucket_seqs`, but BE now switches every random-distribution sink to `FIND_TABLET_RANDOM_BUCKET`. The classic `Coordinator` path plus the one-shot stream-load / group-commit path still send the old sink shape, so those loads stop doing the old per-batch round-robin and fall back to a single bucket.
- The new bucket metadata is only parsed on the initial `generate_partition_from()` path. Auto-partition / replace-partition refreshes still go through `add_partitions()` and `replace_partitions()`, which do not preserve `load_tablet_idx` or `local_bucket_seqs`. In `FIND_TABLET_RANDOM_BUCKET` mode that can leave `load_tablet_idx == -1` and index `tablets[-1]`.
- The rotation accounting in `vtablet_finder.cpp` is incorrect: each touched partition is charged with the full block size, and once a tablet crosses 200 MB its counter is never reset when that bucket becomes active again. Large multi-partition loads therefore rotate much earlier than intended and then degenerate into per-batch ping-pong.
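The accounting that the last finding asks for can be sketched roughly as follows. This is a hypothetical model, not the code in `vtablet_finder.cpp`: `BucketRotator`, `kRotateBytes`, and `on_rows_written` are illustrative names, and the 200 MB threshold is taken from the review text. The sketch charges a partition only for the bytes routed to it and restarts the counter whenever a bucket becomes active.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical per-partition rotation state; all names are illustrative only.
class BucketRotator {
public:
    // 200 MB threshold, as described in the review.
    static constexpr int64_t kRotateBytes = 200LL * 1024 * 1024;

    explicit BucketRotator(std::vector<int> local_buckets)
            : _buckets(std::move(local_buckets)) {
        assert(!_buckets.empty());
    }

    int active_bucket() const { return _buckets[_pos]; }

    // Charge only the bytes that were actually routed to this partition,
    // not the size of the whole block.
    void on_rows_written(int64_t bytes_for_this_partition) {
        _bytes_since_active += bytes_for_this_partition;
        if (_bytes_since_active >= kRotateBytes) {
            _pos = (_pos + 1) % _buckets.size();
            // The counter restarts for the newly active bucket, so a bucket
            // that is revisited later gets a full budget again.
            _bytes_since_active = 0;
        }
    }

private:
    std::vector<int> _buckets;
    std::size_t _pos = 0;
    int64_t _bytes_since_active = 0;
};
```

With two local buckets, writing 150 MB and then 60 MB rotates once; a further 200 MB rotates back to the first bucket, and a following small batch stays there rather than rotating immediately.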
Critical Checkpoints
- Goal of the task: Partially achieved. The PR intends to route random-distribution loads to BE-local buckets and rotate every 200 MB, but the current implementation does not preserve per-BE isolation, misses equivalent planning paths, and breaks dynamic partition refreshes / rotation accounting.
- Is the modification small, clear, and focused: The diff is small, but it is not end-to-end. Functionally parallel FE and runtime-refresh paths were left behind.
- Concurrency: No new lock-order issue was found in BE. The main concurrency/lifecycle risk is FE-side shared mutable thrift state: the per-worker mutation is performed on a shared `TDataSink` instance.
- Lifecycle / static initialization: No static-init issue found. The relevant lifecycle issue is object ownership of the shared thrift sink.
- Configuration changes: None.
- Compatibility / storage-format changes: The new thrift fields are optional, so wire compatibility is fine, but FE-to-BE propagation is incomplete across planner and partition-refresh paths.
- Parallel code paths: Not all equivalent paths were updated. The classic `Coordinator`, `NereidsStreamLoadPlanner`, and group-commit path still bypass the new FE assignment, and runtime partition refreshes still bypass the new field propagation.
- Special conditions: `local_bucket_seqs.empty()` and `load_tablet_idx == -1` are not safely handled in `FIND_TABLET_RANDOM_BUCKET` mode.
- Test coverage: Insufficient. I did not find FE/BE tests for per-BE bucket assignment, non-Nereids planning paths, auto-partition / replace-partition with random distribution, or large-load rotation behavior.
- Test result changes: None.
- Observability: Existing logs are probably enough to debug this area, but there is no new coverage or instrumentation that would mitigate the correctness gaps.
- Transaction / persistence: Not directly touched.
- Data-write safety: At risk. Affected paths can either keep writing to one bucket for the entire load or dereference an invalid bucket index during dynamic partition updates.
- FE-BE variable passing: Incomplete. The new fields are not passed in all required FE paths.
- Performance: The accounting bug over-rotates multi-partition loads and undermines the intended memory / throughput improvement.
- Other issues: No additional blocking issue beyond the items above.
Requesting changes because these issues affect correctness of the new routing logic and leave multiple load paths inconsistent with the new BE behavior.
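One way to handle the `load_tablet_idx == -1` condition listed under "Special conditions" is a bounds check before indexing. The sketch below is an assumption-laden illustration: `PartitionInfo` and `pick_tablet` are hypothetical names that model only the fields named in this review, not the actual BE partition descriptor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the BE partition descriptor; only the fields
// named in the review are modelled.
struct PartitionInfo {
    std::vector<int64_t> tablets;   // tablet ids, one per bucket
    int64_t load_tablet_idx = -1;   // -1 means "not assigned by FE"
};

// Returns the tablet id to write to, or std::nullopt when the index is
// missing or out of range, so the caller can fail the load or fall back
// instead of reading tablets[-1].
std::optional<int64_t> pick_tablet(const PartitionInfo& part) {
    if (part.load_tablet_idx < 0 ||
        part.load_tablet_idx >= static_cast<int64_t>(part.tablets.size())) {
        return std::nullopt;
    }
    return part.tablets[static_cast<std::size_t>(part.load_tablet_idx)];
}
```

Whether the caller should return an error or fall back to another routing mode when `pick_tablet` yields no value is a design decision for the PR author; the sketch only shows that the invalid index is detected before use.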
I found blocking correctness issues in the new adaptive random-bucket load path.
- `assignRandomBucketPerBe()` does not actually isolate per-BE sink state. `PlanFragment.toThrift()` reuses `OlapTableSink.tDataSink`, so mutating `outputSink.setOlapTableSink(sinkCopy)` overwrites previous workers' assignments.
- `assignRandomBucketPerBe()` also runs when `load_to_single_tablet=true`, which changes "single tablet" loads into one-tablet-per-BE loads because BE stays in `FIND_TABLET_EVERY_SINK`.
- BE writers now unconditionally switch random-distribution sinks to `FIND_TABLET_RANDOM_BUCKET`, but only `ThriftPlansBuilder` fills `local_bucket_seqs`. Classic `Coordinator`, `NereidsStreamLoadPlanner` (stream load / routine load / group commit), and mixed FE/BE versions therefore lose the old per-batch round-robin and pin one bucket for the whole sink.
- `OlapTabletFinder` attributes the full batch bytes to every touched partition, so multi-partition batches hit the 200 MB threshold too early.
- The per-tablet byte counter is never reset or reduced after a rotation, so revisiting a bucket later causes immediate re-rotation after the next small batch.
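The missing fallback for unannotated sinks could look roughly like the sketch below. It is hypothetical: the enum is a local mirror of the mode names mentioned in this review, `FIND_TABLET_EVERY_BATCH` is assumed to be the name of the old per-batch round-robin mode, and `SinkHints` / `choose_mode` are invented for illustration.

```cpp
#include <cassert>
#include <vector>

// Local mirror of the modes named in the review. FIND_TABLET_EVERY_BATCH is
// assumed here to be the "old per-batch round-robin" mode.
enum class FindTabletMode {
    FIND_TABLET_EVERY_BATCH,
    FIND_TABLET_EVERY_SINK,
    FIND_TABLET_RANDOM_BUCKET,
};

// Hypothetical summary of what the sink received from FE.
struct SinkHints {
    bool load_to_single_tablet = false;
    std::vector<int> local_bucket_seqs;  // empty when FE did not annotate the sink
};

FindTabletMode choose_mode(const SinkHints& hints) {
    if (hints.load_to_single_tablet) {
        // Keep the existing single-tablet contract.
        return FindTabletMode::FIND_TABLET_EVERY_SINK;
    }
    if (hints.local_bucket_seqs.empty()) {
        // Legacy FE, mixed versions, or a planner path that was not updated.
        return FindTabletMode::FIND_TABLET_EVERY_BATCH;
    }
    return FindTabletMode::FIND_TABLET_RANDOM_BUCKET;
}
```

The point of ordering the checks this way is that the new mode is opted into only when FE has actually supplied the bucket list, so sinks built by other planner paths keep their previous behavior.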
Critical checkpoints:
- Goal/task: Not accomplished yet. The intended per-BE adaptive rotation is not reliably produced in the affected scenarios above, and no test demonstrates those cases.
- Scope/focus: Not sufficiently focused. The BE behavior change was applied globally, but the required FE annotation was only wired into one planner path.
- Concurrency: No new locking or thread-safety issue identified in this patch.
- Lifecycle/static initialization: No special lifecycle issue identified.
- Configuration: No new config items.
- Compatibility/incompatible change: Not preserved. The new BE behavior depends on new FE fields but lacks a safe fallback for legacy FE, mixed-version, or non-Nereids load paths.
- Parallel code paths: Incomplete. Classic `Coordinator` and `NereidsStreamLoadPlanner` paths were not updated.
- Special conditions: Missing guard for `load_to_single_tablet`.
- Test coverage: Insufficient. I do not see regression or unit coverage for per-BE isolation, `load_to_single_tablet`, mixed planner paths, multi-partition threshold accounting, or repeated bucket rotation.
- Test result updates: N/A, no test outputs were added or updated.
- Observability: No additional observability added; not the main blocker.
- Transaction/persistence: No transaction or persistence concerns in this patch.
- Data writes/modifications: The write-routing change is incorrect in the scenarios above, so this is currently a data-placement regression.
- FE/BE variable propagation: Incomplete. `local_bucket_seqs` and `bucket_be_id` are only propagated through one FE execution path.
- Performance: Correctness blockers dominate; no separate performance-only blocker identified beyond the incorrect threshold accounting.
Summary
I found blocking correctness and compatibility issues in the adaptive random-bucket implementation.
- Per-BE sink mutation is not actually isolated because worker params still share the same `TDataSink`.
- Runtime create and replace-partition refresh does not propagate the new adaptive fields, so random-bucket mode can hit `load_tablet_idx == -1` on new partitions.
- Bucket rotation tracks lifetime bytes per tablet, so after one full cycle revisiting a bucket rotates again on the next batch instead of after another ~200 MB.
- The new default config silently changes the existing `load_to_single_tablet=true` contract on Nereids-distributed broker-load paths.
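The aliasing problem in the first item can be illustrated with a small analogue. The FE code in question is Java; the sketch below uses C++ with `std::shared_ptr` only to show the same hazard, and `DataSink`, `WorkerParams`, `build_aliased`, and `build_isolated` are hypothetical names rather than Doris types.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the thrift sink and the per-worker params.
struct DataSink {
    std::vector<int> local_bucket_seqs;
};
struct WorkerParams {
    std::shared_ptr<DataSink> sink;
};

// Aliased: every worker points at the same sink, so the last write wins.
std::vector<WorkerParams> build_aliased(const std::shared_ptr<DataSink>& shared,
                                        const std::vector<std::vector<int>>& per_be) {
    std::vector<WorkerParams> out;
    for (const auto& seqs : per_be) {
        shared->local_bucket_seqs = seqs;
        out.push_back({shared});
    }
    return out;
}

// Isolated: each worker gets its own copy before the per-BE mutation.
std::vector<WorkerParams> build_isolated(const DataSink& tmpl,
                                         const std::vector<std::vector<int>>& per_be) {
    std::vector<WorkerParams> out;
    for (const auto& seqs : per_be) {
        auto copy = std::make_shared<DataSink>(tmpl);
        copy->local_bucket_seqs = seqs;
        out.push_back({std::move(copy)});
    }
    return out;
}
```

In the aliased version the first worker ends up seeing the second worker's bucket list, which matches the overwrite described above; copying the sink per worker before mutating it avoids that.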
Critical Checkpoints
- Goal of the task: Not achieved yet. The PR intends to provide correct per-BE adaptive routing and lower load memory pressure, but worker-local sink isolation is broken and runtime partition-refresh paths are unsafe. No regression test proves the new behavior.
- Small, clear, focused change: Partially. The diff touches FE planning, BE routing, runtime partition metadata, and config, but propagation is still incomplete across all affected paths.
- Concurrency: There is a real shared-mutable-object issue. `TDataSink` is aliased across workers, so the per-BE mutation is not isolated.
- Lifecycle: No static-init problem found, but runtime-created and replaced partitions are part of the feature lifecycle and currently miss the new fields.
- Configuration: `enable_adaptive_random_bucket_load` is observed at FE planning time, but defaulting it to `true` introduces an unguarded behavior change for `load_to_single_tablet`.
- Compatibility: Not preserved for explicit `load_to_single_tablet=true` loads on the Nereids distributed broker-load path.
- Parallel code paths: Not all equivalent paths are updated. Runtime create and replace-partition payloads miss the new fields, and direct `NereidsStreamLoadPlanner.plan()` paths still bypass adaptive assignment entirely.
- Special conditions: The 200 MB threshold logic is keyed by lifetime tablet bytes, not bytes since the bucket became active.
- FE-BE variable passing: The new `local_bucket_seqs` and `bucket_be_id` fields are not propagated on runtime partition-refresh paths.
- Data-write correctness: Risky as-is because bucket selection can be overwritten across workers and runtime-created partitions can route through an invalid bucket index.
- Performance: The advertised "only opens local buckets" memory reduction is still not true on the `VTabletWriter` path, which eagerly opens every tablet in `_init()`. That benefit currently only exists on the V2 path.
- Test coverage: Insufficient. No new regression or unit coverage was added for multi-BE routing, threshold wrap-around, auto and replace partition, or `load_to_single_tablet` compatibility.
Please address these issues before merge.
### What problem does this PR solve?

Issue Number: None

Related PR: apache#62661

Problem Summary: Remove the LockTime profiling logic added for DeltaWriter, DeltaWriterV2, and MemTableWriter, including the lock wait stopwatch accounting and profile counters.

### Release note

None

### Check List (For Author)

- Test: Manual test
    - BE build (target system Darwin, target arch arm64, build type Release): attempted, but failed during OpenBLAS CMake configuration because OpenMP_C was not found before compiling the modified files
    - Code formatting: attempted, but failed because llvm@16, which is required to format the code, is not installed
- Behavior changed: Yes, removes load writer LockTime profile counters
- Does this need documentation: No
### What problem does this PR solve?

Issue Number: None

Related PR: apache#62661

Problem Summary: Restore MemTableWriter active memtable reset lifecycle so the active memtable is created during init and recreated immediately after flush or insert failure, while keeping the new memtable_flushed reporting behavior.

### Release note

None

### Check List (For Author)

- Test: Manual test
    - Code formatting: attempted, but failed because llvm@16, which is required to format the code, is not installed
    - BE build was not rerun for this small follow-up because the previous
Agent;Common;Core;Exec;Exprs;Format;IO;Storage;Runtime;Service;Udf;Util;DorisGen;Load;InformationSchema;Cloud;CommonCPP;jvm;Boost::date_time;Boost::container;gflags;glog;backtrace;re2;hyperscan;odbc;pprof;protobuf;gtest;gtest_main;benchmark;gmock;snappy;curl;lz4;thrift;thriftnb;crc32c;libevent_core;libevent_openssl;libevent;libevent_pthreads;libbz2;libz;crypto;openssl;leveldb;grpc++_reflection;grpc;grpc++;grpc++_unsecure;gpr;upb;cares;address_sorting;z;brotlicommon;brotlidec;brotlienc;zstd;arrow;arrow_flight;arrow_flight_sql;arrow_dataset;arrow_acero;parquet;brpc;rocksdb;cyrus-sasl;rdkafka_cpp;rdkafka;s2;bitshuffle;roaring;fmt;cctz;base64;aws-cpp-sdk-core;aws-cpp-sdk-s3;aws-cpp-sdk-transfer;aws-cpp-sdk-s3-crt;aws-crt-cpp;aws-c-cal;aws-c-auth;aws-c-compression;aws-c-common;aws-c-event-stream;aws-c-io;aws-c-http;aws-c-mqtt;aws-checksums;aws-c-s3;aws-c-sdkutils;aws-cpp-sdk-identity-management;aws-cpp-sdk-sts;aws-cpp-sdk-kinesis;minizip;simdjson;idn;xml2;lzma;gsasl;krb5support;krb5;com_err;k5crypto;gssapi_krb5;streamvbyte;bfd;iberty;intl;icuuc;icui18n;icudata;pugixml;paimon;paimon_orc_file_format;paimon_blob_file_format;paimon_local_file_system;paimon_file_index;paimon_global_index;roaring_bitmap_paimon;xxhash_paimon;fmt_paimon;tbb_paimon;hdfs3;absl::cord;absl::cord_internal;absl::cordz_functions;absl::cordz_info;absl::cordz_update_scope;absl::cordz_update_tracker;absl::crc_cord_state;absl::flags;absl::random_random;absl::spinlock_wait;absl::status;absl::statusor;absl::strings;bfd;iberty;intl;krb5support;krb5;com_err;gssapi_krb5;k5crypto;orc;ic;clucene-core-static;clucene-shared-static;clucene-contribs-lib;-Wl,-force_load,$<TARGET_FILE:paimon_parquet_file_format>;roaring_bitmap_paimon;xxhash_paimon;fmt_paimon;tbb_paimon;-lapple_nghttp2;-lresolv;-liconv;tcmalloc;-framework CoreFoundation;-framework CoreGraphics;-framework CoreText;-framework Foundation;-framework SystemConfiguration;-framework Security -- GEMM multithread threshold set to 4. 
-- Disabling Advanced Vector Extensions 512 (AVX512). -- Multi-threading enabled with 8 threads. -- Running getarch -- GETARCH results: CORE=VORTEX LIBCORE=vortex NUM_CORES=8 MAKEFLAGS += -j 8 -- Compiling a 64-bit binary. -- Configuring incomplete, errors occurred! attempt failed during OpenBLAS CMake configuration due to missing OpenMP_C before compiling modified files - Behavior changed: No - Does this need documentation: No
### What problem does this PR solve?

Issue Number: None

Related PR: apache#62661

Problem Summary: Restore pre-existing load writer lock wait accounting that was removed too broadly, and finish restoring the MemTableWriter reset lifecycle by removing lazy active memtable null/empty handling left from the PR change.

### Release note

None

### Check List (For Author)

- Test: Manual test
    - - - Error: Please install llvm@16 firt due to we use it to format code. attempted, but failed because llvm@16 is not installed
    - BE build not rerun because the previous Target system: Darwin; Target arch: arm64 Python 3.11.3 Check JAVA_HOME version Apache Maven 3.9.2 (c9616018c7a021c1c39be70fb2843d6f5f9b8a1c) Maven home: /opt/homebrew/Cellar/maven/3.9.2/libexec Java version: 17.0.13, vendor: Oracle Corporation, runtime: /Users/laihui/lib/jdk-17.0.13.jdk/Contents/Home Default locale: zh_CN_#Hans, platform encoding: UTF-8 OS name: "mac os x", version: "15.7.3", arch: "x86_64", family: "mac" cmake version 3.26.3 CMake suite maintained and supported by Kitware (kitware.com/cmake).
ninja 1.11.1 ccache version 4.8.1 WARNNING: Skip building with BE Java extensions due to the architecture which the library libjvm.dylib is built for does not match. Get params: BUILD_FE -- 0 BUILD_BE -- 1 BUILD_CLOUD -- 0 BUILD_BROKER -- 0 BUILD_META_TOOL -- OFF BUILD_FILE_CACHE_MICROBENCH_TOOL -- OFF BUILD_INDEX_TOOL -- OFF BUILD_BENCHMARK -- OFF BUILD_TASK_EXECUTOR_SIMULATOR -- OFF BUILD_BE_JAVA_EXTENSIONS -- 0 BUILD_BE_CDC_CLIENT -- 1 BUILD_HIVE_UDF -- 0 BUILD_JUICEFS -- ON BUILD_JINDOFS -- OFF PARALLEL -- 3 CLEAN -- 0 GLIBC_COMPATIBILITY -- OFF USE_AVX2 -- ON USE_LIBCPP -- ON USE_UNWIND -- OFF STRIP_DEBUG_INFO -- OFF USE_JEMALLOC -- OFF USE_BTHREAD_SCANNER -- OFF ENABLE_INJECTION_POINT -- OFF DENABLE_CLANG_COVERAGE -- OFF DISPLAY_BUILD_TIME -- OFF ENABLE_PCH -- ON WITH_TDE_DIR -- Feature List: -TDE,-HDFS_STORAGE_VAULT,+UI,-AZURE_BLOB,-AZURE_STORAGE_VAULT,-HIVE_UDF,-BE_JAVA_EXTENSIONS Target system: Darwin; Target arch: arm64 Python 3.11.3 Check JAVA_HOME version Apache Maven 3.9.2 (c9616018c7a021c1c39be70fb2843d6f5f9b8a1c) Maven home: /opt/homebrew/Cellar/maven/3.9.2/libexec Java version: 17.0.13, vendor: Oracle Corporation, runtime: /Users/laihui/lib/jdk-17.0.13.jdk/Contents/Home Default locale: zh_CN_#Hans, platform encoding: UTF-8 OS name: "mac os x", version: "15.7.3", arch: "x86_64", family: "mac" cmake version 3.26.3 CMake suite maintained and supported by Kitware (kitware.com/cmake). ninja 1.11.1 ccache version 4.8.1 Build generated code /Library/Developer/CommandLineTools/usr/bin/make -C script /Library/Developer/CommandLineTools/usr/bin/make -C proto /Users/laihui/work/doris/gensrc/script/gen_build_version.sh /Library/Developer/CommandLineTools/usr/bin/make -C thrift make[1]: Nothing to be done for `all'. make[1]: Nothing to be done for `all'. Done Update apache-orc submodule ...
/Users/laihui/work/doris Current commit ID of apache-orc submodule: be0f1b73a7aeb78824a03e0dcb692c50a176d513, expected is be0f1b73a7aeb78824a03e0dcb692c50a176d513 Update clucene submodule ... /Users/laihui/work/doris Current commit ID of clucene submodule: c51b5cc9adc63817ad8322f617c75737ece7288d, expected is c51b5cc9adc63817ad8322f617c75737ece7288d Update openblas submodule ... /Users/laihui/work/doris Current commit ID of openblas submodule: 77986e49425532bf8f651db74cbe1579bcb4a5bf, expected is 77986e49425532bf8f651db74cbe1579bcb4a5bf Update faiss submodule ... /Users/laihui/work/doris Current commit ID of faiss submodule: 032afe95f671cd50b82d52d901345600776d7855, expected is 032afe95f671cd50b82d52d901345600776d7855 Build Backend: Release -- Make program: /opt/homebrew/opt/ninja/bin/ninja -- Use ccache: -DCMAKE_CXX_COMPILER_LAUNCHER=ccache and -DCMAKE_C_COMPILER_LAUNCHER=ccache -- Extra cxx flags: -- Build fs benchmark tool: OFF -- Build task executor simulator: OFF -- Build file cache lru tool: OFF -- GLIBC_COMPATIBILITY is OFF -- USE_LIBCPP is ON -- USE_JEMALLOC is OFF -- USE_UNWIND is OFF -- ENABLE_PCH is ON -- USE_AVX2 is ON -- Build type is RELEASE -- Build target arch is arm64 -- DORIS_HOME is /Users/laihui/work/doris -- THIRDPARTY_DIR is /Users/laihui/work/doris/thirdparty/installed -- make test: OFF -- make benchmark: OFF -- build fs benchmark tool: OFF -- build task executor simulator: OFF -- build file cache lru tool: OFF -- build gensrc if necessary /Library/Developer/CommandLineTools/usr/bin/make -C script /Users/laihui/work/doris/gensrc/script/gen_build_version.sh get java cmd: /Users/laihui/lib/jdk-17.0.13.jdk/Contents/Home/bin/java get java version: java full version \"17.0.13+10-LTS-268\" /Library/Developer/CommandLineTools/usr/bin/make -C proto make[1]: Nothing to be done for `all'. /Library/Developer/CommandLineTools/usr/bin/make -C thrift make[1]: Nothing to be done for `all'. 
-- Disabling Advanced Vector Extensions 512 (AVX512). -- Multi-threading enabled with 8 threads. -- Running getarch -- GETARCH results: CORE=VORTEX LIBCORE=vortex NUM_CORES=8 MAKEFLAGS += -j 8 -- Compiling a 64-bit binary. -- Configuring incomplete, errors occurred! attempt failed during OpenBLAS CMake configuration due to missing OpenMP_C before compiling modified files - Behavior changed: No - Does this need documentation: No
9b1f6ee to 3219248
run buildall
/review
Requesting changes for one remaining correctness issue in the receiver-side adaptive random bucket path.
Critical checkpoint conclusions:
- Goal/test: The PR aims to reduce cloud random-distribution load memory by routing each sink to selected local buckets and rotating after flush pressure. The current receiver-side path does not fully accomplish that because the FE-selected bucket sequence is not honored by BE. I did not find a test that would catch a BE owning extra non-selected replicas and rotating into them.
- Scope/focus: The change is focused on adaptive random bucket load and memtable pressure, but the new FE/BE protocol needs one more propagation step for selected bucket order/start.
- Concurrency: The reviewed receiver-side routing uses per-partition locks and shared channel state; I did not find a new lock-order blocker beyond the submitted routing issue.
- Lifecycle/static init: No new static initialization-order issue found.
- Configuration: New configs are mutable; the FE sink flag captures the setting at plan time, and BE rotation/backpressure configs are read dynamically.
- Compatibility: The adaptive mode is gated through new optional thrift/proto fields; no additional mixed-version issue beyond the already-known existing review threads was raised here.
- Parallel paths: V1 receiver-side routing, runtime partition creation/replacement, and legacy/Nereids planning paths were checked. V2 remains on the existing tablet-id routing mode.
- Conditional checks: The adaptive mode checks are understandable, but the selected sequence is lost when building the open request.
- Test coverage: Existing coverage appears insufficient for multi-replica/multi-bucket BE ownership where the chosen BE has replicas for buckets outside `local_bucket_seqs`.
- Observability: Logs exist for planning and routing, but they would currently show the wrong receiver tablet sequence rather than preventing it.
- Transaction/persistence/data writes: The issue can route committed rows to buckets outside the FE-selected assignment, changing distribution for a load; no metadata persistence issue found.
- FE-BE variables: `local_bucket_seqs` is sent in thrift partition metadata but not consumed when the receiver-side ordered tablet list is constructed.
- Performance: Ignoring the selected bucket subset can reopen/rotate across more tablets than intended, reducing the intended memory benefit.
- User focus: No additional user-provided review focus was supplied.
Existing review context was checked; this comment is distinct from the earlier byte-accounting, counter-reset, propagation, shared-sink, load_to_single_tablet, and documentation-scope comments.
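The propagation gap described in this review can be illustrated with a small C++ sketch. It is not the PR's code: `build_ordered_tablets`, the map of owned tablets, and the reading of `load_tablet_idx` as a bucket sequence number are all assumptions made for illustration. The point is only that the receiver's rotation order is derived from the FE-selected sequence, so replicas the BE owns outside `local_bucket_seqs` never enter the rotation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the PR): build the receiver-side tablet order
// from the FE-selected bucket sequence rather than from every owned replica.
//   owned_tablets     : bucket seq -> tablet id for replicas hosted on this BE
//   local_bucket_seqs : FE-selected bucket order for this sink
//   load_tablet_idx   : FE-selected starting bucket (assumed to be a bucket seq)
std::vector<int64_t> build_ordered_tablets(
        const std::unordered_map<int32_t, int64_t>& owned_tablets,
        const std::vector<int32_t>& local_bucket_seqs, int32_t load_tablet_idx) {
    std::vector<int64_t> ordered;
    const std::size_t n = local_bucket_seqs.size();

    // Locate the starting bucket inside the selected sequence; fall back to
    // the first entry if the start bucket is not part of the sequence.
    std::size_t start = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (local_bucket_seqs[i] == load_tablet_idx) {
            start = i;
            break;
        }
    }

    // Walk the selected sequence once, beginning at the start bucket.
    // Buckets owned by this BE but absent from the sequence are never visited.
    for (std::size_t k = 0; k < n; ++k) {
        const int32_t bucket = local_bucket_seqs[(start + k) % n];
        auto it = owned_tablets.find(bucket);
        if (it != owned_tablets.end()) {
            ordered.push_back(it->second);
        }
    }
    return ordered;
}
```

With owned buckets `{0, 1, 2}` and a selected sequence `[0, 2]` starting at bucket `2`, the sketch yields the tablets for buckets `2` then `0` and leaves bucket `1` out.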
FE UT Coverage Report
Increment line coverage
3219248 to 49f07be
run buildall
/review
Reviewed the full PR diff and existing review threads. I found one additional blocking issue in the cloud receiver-side random-bucket path.
Critical checkpoint conclusions:
- Goal/test: the PR aims to reduce cloud random-distribution load memory pressure by assigning a bounded local bucket set and rotating on receiver-side flush. Current code mostly implements the flow, but the new cloud add-batch path has a concurrency race that can break auto-partition/incremental-open loads. I did not find test evidence covering this concurrent cloud receiver-side path.
- Scope/clarity: the change is broad but focused on adaptive random-bucket routing and load memory pressure.
- Concurrency: blocking issue found. `CloudTabletsChannel::add_batch()` can run concurrently with `incremental_open()`, and the new receiver-side path reads `_tablet_writers` through `_init_writers_by_partition_ids()` without `_tablet_writers_lock`.
- Lifecycle/static initialization: no additional lifecycle or SIOF issue found.
- Config: new FE config is mutable/master-only and BE mode is sink-flag gated in cloud mode; no additional config issue beyond already-known review context.
- Compatibility/protocol: new thrift/proto fields are optional/defaulted; the current head added sink-level gating and classic/Nereids assignment paths, so I did not add duplicate compatibility comments.
- Parallel paths: cloud/non-cloud and create/replace partition paths were checked; the new cloud receiver-side add path is the remaining distinct issue.
- Conditional checks: no additional condition issue found.
- Tests/results: I did not see dedicated coverage for concurrent cloud receiver-side random-bucket loads with auto-created or incrementally opened partitions.
- Observability: no blocking observability issue found.
- Transaction/persistence/data writes: the race can cause incorrect initialization/failure during cloud writes; no separate edit-log or storage-format issue found.
- FE/BE variable passing: current head propagates the adaptive fields through initial and runtime partition paths; no duplicate comment submitted.
- Performance: no additional blocking performance issue found beyond already-known review context.
User focus: no additional user-provided review focus was supplied.
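A minimal C++ sketch of the locking discipline this review asks for, under stated assumptions: `ChannelSketch`, `Writer`, `add_writer`, and `find_writer` are hypothetical stand-ins, not the Doris classes. The open path takes the lock exclusively, and the write path copies the `shared_ptr` out while holding a shared lock, so no map iterator is used after the lock is released.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Stand-in for a tablet writer; only the id matters for this sketch.
struct Writer {
    int64_t tablet_id;
};

class ChannelSketch {
public:
    // Models the incremental-open path: mutates the map under an exclusive lock.
    void add_writer(int64_t tablet_id) {
        std::unique_lock<std::shared_mutex> lock(_writers_lock);
        _writers.emplace(tablet_id, std::make_shared<Writer>(Writer{tablet_id}));
    }

    // Models the add-batch path: looks up under a shared lock and returns a
    // copy of the shared_ptr, so the caller never holds an iterator that a
    // concurrent insert (and rehash) could invalidate.
    std::shared_ptr<Writer> find_writer(int64_t tablet_id) const {
        std::shared_lock<std::shared_mutex> lock(_writers_lock);
        auto it = _writers.find(tablet_id);
        if (it == _writers.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    mutable std::shared_mutex _writers_lock;
    std::unordered_map<int64_t, std::shared_ptr<Writer>> _writers;
};
```

Returning the pointer by value keeps the writer alive for the caller even if the map is modified immediately afterwards.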
49f07be to 8d2f063
/review
Blocking issue found in the adaptive random bucket open path. The review also considered the existing inline threads and did not duplicate the already-raised concerns about shared sink mutation, load_to_single_tablet compatibility, runtime partition propagation, rotation accounting, selected bucket ordering, and cloud writer locking.
Critical checkpoint conclusions:
- Goal/test: the PR aims to reduce cloud random-distribution load memory by receiver-side adaptive bucket routing, but the current code can fail for sparse/single-replica placements. I did not find test coverage for that case.
- Scope/focus: the change is broadly spread across FE planning, runtime partition refresh, BE routing, channel open/add, and memtable backpressure; the core implementation is focused on the feature, but the inline issue is a correctness blocker.
- Concurrency: receiver-side route state uses per-partition locking; existing review threads already cover the relevant `_tablet_writers` concurrency issue. I did not add a duplicate concurrency comment.
- Lifecycle/static initialization: no new cross-TU static initialization hazard found.
- Config/compatibility: new configs/flags are present; compatibility concerns already raised in existing threads were treated as known context.
- Parallel paths: legacy and Nereids planning paths are both touched; V2 does not appear to use the new receiver-side mode.
- Data correctness/transactions/persistence: no new transaction or persistence issue found beyond the inline routing/open failure and existing known comments.
- Observability/performance: logging exists for the new adaptive plan/route paths; no additional blocking observability issue found.
- User focus: no additional user-provided review focus was supplied.
FE UT Coverage Report
Increment line coverage
8d2f063 to 2b7ab5f
/review
I found two blocking issues in the new receiver-side adaptive random bucket path.
Critical checkpoint conclusions:
- Goal/test coverage: the PR aims to reduce cloud random-distribution load memory by routing through receiver-side adaptive buckets, but the new multi-sender receiver state and concurrent writer lookup paths are not proven safe by tests. No additional user focus points were provided.
- Scope/clarity: the change is broad across FE planning, BE routing, cloud writer initialization, and memtable backpressure; several earlier issues were addressed, but the receiver-side state model still has correctness gaps.
- Concurrency: blocking issue found. `add_batch()` can race with `incremental_open()` while using an invalidatable `_tablet_writers` iterator after releasing the lock.
- Lifecycle/state: blocking issue found. A `TabletsChannel` is shared by load/index across senders, but adaptive partition state is initialized only once and later sender-specific bucket orders are ignored.
- Compatibility/parallel paths: existing prior threads already cover remaining one-shot stream-load/classic compatibility concerns, so I did not duplicate them inline.
- Data correctness: routing can use the wrong tablet order/start bucket for later senders and can hit undefined behavior during concurrent incremental opens.
- Observability/configuration/persistence: no additional issue found beyond the comments and previously known review context.
2b7ab5f to 6720234
/review
I found one additional issue in the new receiver-side adaptive random bucket write path. Existing review threads already cover several other blocking correctness/concurrency topics, including missing or inconsistent adaptive assignment propagation on some paths, receiver-side state sharing between senders, and load_to_single_tablet compatibility, so I did not duplicate those inline.
Critical checkpoint conclusions:
- Goal/test: The goal is adaptive random-bucket load routing for cloud random-distribution tables. The implementation is not yet fully safe because cancellation can hang in the new backpressure wait, and existing threads cover additional routing/compatibility blockers. Tests for cancellation/backpressure interaction are missing.
- Scope/focus: The change is broad across FE planning, BE routing, receiver write channels, thrift/protobuf, and memtable backpressure; it is not a small local change.
- Concurrency: The new global table memtable backpressure wait is concurrent with cancellation and flush completion; it currently does not observe cancellation. Existing review threads cover other receiver-side concurrent open/write concerns.
- Lifecycle: No static initialization issue found. The added wait path can outlive load cancellation until pending flush counts drop.
- Configuration: New FE/BE adaptive configs are mutable/default-on. Existing review threads cover default-on compatibility concerns.
- Compatibility: Existing review threads cover load_to_single_tablet and planner-path compatibility risks.
- Parallel paths: Coordinator and ThriftPlansBuilder were updated; one-shot planner concerns are already covered by existing threads.
- Tests: I did not see coverage for cancellation while table-level memtable backpressure is active.
- Observability/performance: The adaptive path adds several INFO logs in hot routing/planning paths; keep an eye on log volume, but the blocking issue is cancellation safety.
- Transactions/persistence/data writes: No new persistence format issue found in this pass, but write-path cancellation safety needs fixing.
- User focus: No additional user-provided review focus was supplied.
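The cancellation concern can be sketched in C++ as a wait loop that rechecks a cancel flag on every wakeup. This is only an illustration under assumptions: `BackpressureSketch`, its members, and the 10 ms poll interval are hypothetical and do not reflect the actual `MemTableMemoryLimiter` interface.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical model of a table-level backpressure gate.
struct BackpressureSketch {
    std::mutex mu;
    std::condition_variable cv;
    int pending_flushes = 0;             // guarded by mu
    std::atomic<bool> cancelled{false};  // set when the load is cancelled

    // Blocks while too many flushes are pending. Returns true once there is
    // capacity, or false as soon as cancellation is observed.
    bool wait_for_capacity(int max_pending) {
        std::unique_lock<std::mutex> lock(mu);
        while (pending_flushes >= max_pending) {
            if (cancelled.load()) {
                return false;
            }
            // Timed wait: even if a notification is missed, the cancel flag
            // is rechecked after at most one poll interval.
            cv.wait_for(lock, std::chrono::milliseconds(10));
        }
        return !cancelled.load();
    }

    // Called on load cancellation; wakes any waiter so it can bail out.
    void cancel() {
        cancelled.store(true);
        cv.notify_all();
    }

    // Called when a flush completes; frees capacity and wakes waiters.
    void flush_done() {
        {
            std::lock_guard<std::mutex> guard(mu);
            --pending_flushes;
        }
        cv.notify_all();
    }
};
```

The design choice here is that the wait has two exit conditions, capacity and cancellation, so a cancelled load does not have to outlive its pending flush count.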
6720234 to c56db1e
/review
I found one additional correctness issue in the receiver-side adaptive random bucket close path. The earlier review threads still cover several other blocking items, so I did not duplicate them here.
Critical checkpoint conclusions:
- Goal/test: the PR aims to reduce cloud random-bucket load fanout and backpressure table-level memtable flushes. The new adaptive routing is not yet correct for close/version handling; existing tests do not demonstrate the receiver-side EOS case with partitions whose rows were routed to only a subset of BEs.
- Scope/clarity: the change is broad across FE planning, BE routing, receiver channels, and memtable limiting; several interactions are non-local.
- Concurrency: existing comments already cover receiver/channel concurrency hazards. The new comment below is not primarily a data race, but it interacts with close lifecycle and quorum completion.
- Lifecycle: receiver-side partition state is initialized/opened per channel and then closed via EOS; the EOS partition propagation is incomplete.
- Configuration: new configs exist and are cloud-gated in the BE mode selection, but behavior changes are still correctness-sensitive.
- Compatibility: the BE now relies on new thrift/proto fields for adaptive routing; previously reported compatibility/path coverage concerns remain known review context.
- Parallel paths: the normal random path sends touched partition ids on EOS; the adaptive receiver-side path skips that parallel behavior, causing the issue below.
- Tests: no sufficient end-to-end test was found for receiver-side adaptive random bucket loads where a touched partition has tablets on BEs that did not receive row batches.
- Observability/performance: added logs are useful for tracing routing, but correctness must be fixed first.
- Transaction/persistence/data correctness: the new close path can omit empty rowsets for touched-partition tablets and can break version continuity.
- User focus: no additional user-provided review focus was supplied.
    -    for (auto pid : _parent->_tablet_finder->partition_ids()) {
    -        request->add_partition_ids(pid);
    +    if (!request->is_receiver_side_random_bucket()) {
    +        for (auto pid : _parent->_tablet_finder->partition_ids()) {
Skipping partition_ids on EOS for receiver-side random bucket changes the close semantics for touched partitions. In the normal random path every node channel receives the touched partition ids, so TabletsChannel::close() closes all writers in those partitions and produces the empty rowsets needed for tablets that did not receive rows. With the new receiver-side path, only receivers that happened to receive row batches record _partition_ids during add_batch(); all other receivers see an empty partition set on close and cancel their writers. For a random partition whose selected buckets live on one BE while other buckets live on other BEs, those other tablets get no rowset for this committed transaction, leaving tablet versions behind the partition visible version. Please still propagate the touched partition ids on EOS (or otherwise explicitly close the required empty writers) for receiver-side random bucket loads.
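The close semantics in this comment can be modeled with a deliberately simplified C++ sketch. It assumes, as the comment states, that a receiver commits a rowset (possibly empty) only for tablets whose partition id it sees at close time and cancels the rest; `TabletSketch` and `tablets_committed_on_close` are hypothetical names and not the `TabletsChannel` API.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical description of a tablet hosted on one receiver.
struct TabletSketch {
    int64_t tablet_id;
    int64_t partition_id;
};

// Returns the tablets for which this receiver would produce a rowset on
// close. Tablets in partitions not listed at EOS are treated as cancelled.
std::vector<int64_t> tablets_committed_on_close(
        const std::vector<TabletSketch>& local_tablets,
        const std::set<int64_t>& eos_partition_ids) {
    std::vector<int64_t> committed;
    for (const TabletSketch& t : local_tablets) {
        if (eos_partition_ids.count(t.partition_id) > 0) {
            committed.push_back(t.tablet_id);
        }
    }
    return committed;
}
```

For a receiver that hosts a tablet of a touched partition but received no row batches, an empty EOS partition set yields no rowset for that tablet, while passing the touched partition id yields the empty rowset that keeps tablet versions continuous.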
Summary
This PR introduces adaptive routing for random bucket load. For random distributed tables in cloud mode, the FE assigns a small set of candidate buckets per partition to each sink BE, and the BE receiver routes rows to local tablets based on the partition id. This avoids sender-side routing to all random buckets and reduces unnecessary tablet writer fanout during load.
Goals:
- TabletLoadIndexRecorderMgr global counter still increments per load task, so consecutive loads naturally start at different buckets
Changes
- Add adaptive random bucket assignment on FE
  - bucketBeId, loadTabletIdx, and ordered local bucket sequences through thrift/proto plan metadata.
- Add receiver-side random bucket routing on BE
  - FIND_TABLET_RANDOM_BUCKET mode so sender side only needs to find partitions.
- Add threshold-based bucket rotation
  - enable_adaptive_random_bucket_load_bucket_rotation.
- Add table-level memtable backpressure for receiver-side random bucket load
  - MemTableMemoryLimiter.
- Keep compatibility with normal load paths
Examples
Example setup
Table with 4 random buckets and 2 sink BEs, single replica:
So the local bucket sequences are:
- [0, 2]
- [1, 3]
Assume planFragmentNum = 2, so FE selects 2 active buckets for each load.
Example 1: bucket assignment for one load
Load 1, baseTabletIndex = 0:
- [10, 20].
- baseTabletIndex % sinkBackendNum, so the order is still [10, 20].
- 0 from [0, 2]
- 1 from [1, 3]
- bucketBeId = 10, loadTabletIdx = 0, localBucketSeqs = [0, 2]
- bucketBeId = 20, loadTabletIdx = 1, localBucketSeqs = [1, 3]
The sender only routes rows by partition. The receiver-side tablets channel chooses the current tablet from its localBucketSeqs.
For small loads, if no memtable flush is triggered, each receiver keeps writing to its starting bucket.
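The assignment in Example 1 can be sketched as follows. This is an illustrative model, not the FE implementation. `BucketAssignment`, `rotate_left`, and `assign_buckets` are hypothetical names. The sketch assumes one active bucket per sink BE (planFragmentNum equal to the number of sink BEs) and that each BE's local sequence is shifted by baseTabletIndex modulo its length. The source only confirms the BE-order rotation by baseTabletIndex % sinkBackendNum, so the per-sequence shift is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical container mirroring the three values listed per BE in the example.
struct BucketAssignment {
    int64_t bucket_be_id;
    int load_tablet_idx;
    std::vector<int> local_bucket_seqs;
};

// Returns a copy of v rotated left by k positions (k is taken modulo v.size()).
template <typename T>
std::vector<T> rotate_left(std::vector<T> v, std::size_t k) {
    assert(!v.empty());
    std::rotate(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(k % v.size()), v.end());
    return v;
}

std::vector<BucketAssignment> assign_buckets(
        const std::vector<int64_t>& sink_be_ids,
        const std::map<int64_t, std::vector<int>>& buckets_on_be,
        std::size_t base_tablet_index) {
    std::vector<BucketAssignment> result;
    // Confirmed by the text: the BE order is rotated by baseTabletIndex % sinkBackendNum.
    for (int64_t be_id : rotate_left(sink_be_ids, base_tablet_index)) {
        // ASSUMPTION: the local sequence is shifted by the same index, modulo its length.
        std::vector<int> seq = rotate_left(buckets_on_be.at(be_id), base_tablet_index);
        // The starting bucket is the head of the (possibly shifted) local sequence.
        result.push_back({be_id, seq.front(), seq});
    }
    return result;
}
```

With sink BEs [10, 20], local buckets [0, 2] on BE 10 and [1, 3] on BE 20, and baseTabletIndex = 0, this yields the two assignments listed in Example 1.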
Example 2: within-load bucket rotation
For Load 1 on BE10, the receiver starts from bucket 0, with local sequence [0, 2].
Assume the memtable flush threshold is reached during the load:
- 0
- 0 triggers a memtable flush, rotate to bucket 2
- 2 triggers another memtable flush, rotate back to bucket 0
The rotation is driven by memtable_flushed, not by a hard-coded data size.
Example 3: cross-load bucket rotation
baseTabletIndex is advanced by TabletLoadIndexRecorderMgr across loads.
With the setup above:
- [10, 20]
  - 0, local seq [0, 2]
  - 1, local seq [1, 3]
- [20, 10]
  - 2, local seq [2, 0]
  - 3, local seq [3, 1]
- [10, 20]
  - 0, local seq [0, 2]
  - 1, local seq [1, 3]
So small loads alternate between bucket pairs [0, 1] and [2, 3], while large loads can rotate within each BE's local bucket sequence after memtable flush.
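The within-load rotation after a memtable flush can be sketched as a small state machine. This is a minimal illustration, not the BE code. `LocalBucketRotator` and `on_write` are hypothetical names, and the `memtable_flushed` flag is assumed to be reported once per write.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: tracks the current bucket within one BE's local sequence.
class LocalBucketRotator {
public:
    explicit LocalBucketRotator(std::vector<int> local_bucket_seqs)
            : _seq(std::move(local_bucket_seqs)) {
        assert(!_seq.empty());
    }

    int current_bucket() const { return _seq[_pos]; }

    // Called after a write to the current bucket. Advances to the next bucket
    // (wrapping around) only when that write triggered a memtable flush.
    // Returns the bucket the next write should go to.
    int on_write(bool memtable_flushed) {
        if (memtable_flushed) {
            _pos = (_pos + 1) % _seq.size();
        }
        return current_bucket();
    }

private:
    std::vector<int> _seq;
    std::size_t _pos = 0;
};
```

For the local sequence [0, 2], writes without a flush stay on bucket 0, the first flush moves to bucket 2, and the next flush wraps back to bucket 0, matching Example 2.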