Commit f625865

docs: prettier-format configs.md
1 parent 149747e commit f625865

1 file changed

Lines changed: 1 addition & 1 deletion

docs/source/user-guide/configs.md

@@ -120,7 +120,7 @@ The following configuration settings are available:
| datafusion.execution.sort_in_place_threshold_bytes | 1048576 | When sorting, below what size should data be concatenated and sorted in a single RecordBatch rather than sorted in batches and merged. |
| datafusion.execution.sort_pushdown_buffer_capacity | 1073741824 | Maximum buffer capacity (in bytes) per partition for BufferExec inserted during sort pushdown optimization. When PushdownSort eliminates a SortExec under SortPreservingMergeExec, a BufferExec is inserted to replace SortExec's buffering role. This prevents I/O stalls by allowing the scan to run ahead of the merge. This uses strictly less memory than the SortExec it replaces (which buffers the entire partition). The buffer respects the global memory pool limit. Setting this to a large value is safe — actual memory usage is bounded by partition size and global memory limits. |
| datafusion.execution.max_spill_file_size_bytes | 134217728 | Maximum size in bytes for individual spill files before rotating to a new file. When operators spill data to disk (e.g., RepartitionExec), they write multiple batches to the same file until this size limit is reached, then rotate to a new file. This reduces syscall overhead compared to one-file-per-batch while preventing files from growing too large. A larger value reduces file creation overhead but may hold more disk space. A smaller value creates more files but allows finer-grained space reclamation as files can be deleted once fully consumed. Now only `RepartitionExec` supports this spill file rotation feature, other spilling operators may create spill files larger than the limit. Default: 128 MB |
-| datafusion.execution.repartition_buffer_size_bytes | 104857600 | Maximum total in-memory bytes buffered in `RepartitionExec` distribution channels per gate group, before producers are throttled. The gate also closes when every output channel has at least one buffered item, so the byte budget primarily acts as a cap for skewed fan-out workloads where one channel would otherwise grow unbounded. Acts as a soft cap: if a single batch exceeds the budget and the channel is empty, the channel allows the batch through to avoid head-of-line blocking. Default: 100 MB |
+| datafusion.execution.repartition_buffer_size_bytes | 104857600 | Maximum total in-memory bytes buffered in `RepartitionExec` distribution channels per gate group, before producers are throttled. The gate also closes when every output channel has at least one buffered item, so the byte budget primarily acts as a cap for skewed fan-out workloads where one channel would otherwise grow unbounded. Acts as a soft cap: if a single batch exceeds the budget and the channel is empty, the channel allows the batch through to avoid head-of-line blocking. Default: 100 MB |
| datafusion.execution.meta_fetch_concurrency | 32 | Number of files to read in parallel when inferring schema and statistics |
| datafusion.execution.minimum_parallel_output_files | 4 | Guarantees a minimum level of output files running in parallel. RecordBatches will be distributed in round robin fashion to each parallel writer. Each writer is closed and a new file opened once soft_max_rows_per_output_file is reached. |
| datafusion.execution.soft_max_rows_per_output_file | 50000000 | Target number of rows in output files when writing multiple. This is a soft max, so it can be exceeded slightly. There also will be one file smaller than the limit if the total number of rows written is not roughly divisible by the soft max |
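
The byte-valued defaults in these rows are hard to read at a glance, and the descriptions quote them as "128 MB" and "100 MB". The sketch below converts the four byte-valued defaults shown in this hunk into binary units as a sanity check. `human_size` is a hypothetical helper written for this illustration and is not part of DataFusion. The numbers are copied from the table rows above, and they turn out to be exact multiples of 1024 * 1024 bytes.

```python
# Byte-valued defaults copied from the table rows in this hunk.
DEFAULTS = {
    "datafusion.execution.sort_in_place_threshold_bytes": 1048576,
    "datafusion.execution.sort_pushdown_buffer_capacity": 1073741824,
    "datafusion.execution.max_spill_file_size_bytes": 134217728,
    "datafusion.execution.repartition_buffer_size_bytes": 104857600,
}


def human_size(n_bytes: int) -> str:
    """Render a byte count in binary units (1 MiB = 1024 * 1024 bytes).

    Hypothetical helper for illustration only; not a DataFusion API.
    """
    units = ["B", "KiB", "MiB", "GiB"]
    value = float(n_bytes)
    for unit in units:
        # Stop at the first unit where the value fits, or at the largest unit.
        if value < 1024 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1024


for key, value in DEFAULTS.items():
    print(f"{key} = {value} ({human_size(value)})")
```

Under this reading, the "Default: 128 MB" and "Default: 100 MB" notes in the descriptions correspond to 128 MiB and 100 MiB respectively.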
