@@ -23,3 +23,44 @@ This crate provides the shuffle writer and reader implementation for Apache Data
of the [Apache DataFusion Comet] subproject.

[Apache DataFusion Comet]: https://github.com/apache/datafusion-comet/

## Shuffle Benchmark Tool

A standalone benchmark binary (`shuffle_bench`) is included for profiling shuffle write
performance outside of Spark. It streams input data directly from Parquet files.

### Basic usage

```sh
cargo run --release --features shuffle-bench --bin shuffle_bench -- \
  --input /data/tpch-sf100/lineitem/ \
  --partitions 200 \
  --codec lz4 \
  --hash-columns 0,3
```

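To compare codecs on the same input, the invocation above can be wrapped in a loop. The following is a minimal sketch, not part of the tool: the `run` and `run_sweep` helpers and the `DRY_RUN` and `INPUT` variables are hypothetical names introduced here, and `--warmup 1 --iterations 3` are illustrative values. Only the flags themselves come from the Options table.

```sh
#!/bin/sh
# Hypothetical codec sweep; only the command-line flags come from the Options table.
INPUT="${INPUT:-/data/tpch-sf100/lineitem/}"
# Assumption: print the commands by default; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Run the benchmark once per codec with otherwise identical settings.
run_sweep() {
  for codec in none lz4 zstd snappy; do
    run cargo run --release --features shuffle-bench --bin shuffle_bench -- \
      --input "$INPUT" --partitions 200 --codec "$codec" \
      --warmup 1 --iterations 3
  done
}

run_sweep
```

Holding every flag except `--codec` fixed keeps the runs comparable. Running with `DRY_RUN=0` executes the four benchmarks in sequence.
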
### Options

| Option                | Default                    | Description                                            |
| --------------------- | -------------------------- | ------------------------------------------------------ |
| `--input`             | _(required)_               | Path to a Parquet file or directory of Parquet files   |
| `--partitions`        | `200`                      | Number of output shuffle partitions                    |
| `--partitioning`      | `hash`                     | Partitioning scheme: `hash`, `single`, `round-robin`   |
| `--hash-columns`      | `0`                        | Comma-separated column indices to hash on (e.g. `0,3`) |
| `--codec`             | `lz4`                      | Compression codec: `none`, `lz4`, `zstd`, `snappy`     |
| `--zstd-level`        | `1`                        | Zstd compression level (1–22)                          |
| `--batch-size`        | `8192`                     | Batch size for reading Parquet data                    |
| `--memory-limit`      | _(none)_                   | Memory limit in bytes; triggers spilling when exceeded |
| `--write-buffer-size` | `1048576`                  | Write buffer size in bytes                             |
| `--limit`             | `0`                        | Limit rows processed per iteration (0 = no limit)      |
| `--iterations`        | `1`                        | Number of timed iterations                             |
| `--warmup`            | `0`                        | Number of warmup iterations before timing              |
| `--output-dir`        | `/tmp/comet_shuffle_bench` | Directory for temporary shuffle output files           |

### Profiling with flamegraph

```sh
cargo flamegraph --release --features shuffle-bench --bin shuffle_bench -- \
  --input /data/tpch-sf100/lineitem/ \
  --partitions 200 --codec lz4
```