Commit 97ee57b

hamersaw and claude authored
chore: exposing max_source_fragments on OPTIMIZE command (#348)
This PR adds support for `max_source_fragments` in the OPTIMIZE command. This parameter was added to restrict the number of fragments included in a compaction. Its main use is incrementally compacting extremely fragmented datasets.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent f57b512 commit 97ee57b
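
To make the "incremental compaction" use case concrete, here is a toy model in Java. It is not Lance's compaction planner. It assumes, purely for illustration, that each OPTIMIZE pass merges at most `maxSourceFragments` small fragments into one compacted fragment that is never selected again. The class and method names are hypothetical.

```java
public class IncrementalCompactionSketch {

  /**
   * Counts how many capped passes the toy compactor needs.
   * Assumption (not taken from Lance): every pass merges up to
   * maxSourceFragments small fragments into a single compacted fragment,
   * and compacted fragments are never picked up by a later pass.
   */
  static int passesNeeded(int smallFragments, int maxSourceFragments) {
    if (maxSourceFragments < 2) {
      throw new IllegalArgumentException("a merge needs at least two source fragments");
    }
    int passes = 0;
    int remaining = smallFragments;
    // Keep going while at least two small fragments are left to merge.
    while (remaining >= 2) {
      int taken = Math.min(remaining, maxSourceFragments);
      remaining -= taken;
      passes++;
    }
    return passes;
  }

  public static void main(String[] args) {
    // 1000 small fragments, capped at 128 source fragments per pass.
    System.out.println(passesNeeded(1000, 128));
  }
}
```

Under these assumptions a smaller cap means more passes, each of which touches a bounded amount of data. That is the trade-off the parameter is meant to expose for extremely fragmented datasets.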

3 files changed

Lines changed: 5 additions & 1 deletion

docs/src/operations/ddl/optimize.md

Lines changed: 1 addition & 0 deletions
@@ -30,6 +30,7 @@ The `OPTIMIZE` command supports several options to control compaction behavior:
 | `num_threads` | Long | Number of threads for compaction |
 | `batch_size` | Long | Batch size for processing |
 | `defer_index_remap` | Boolean | Whether to defer index remapping |
+| `max_source_fragments` | Long | Maximum number of source fragments to compact in a single task |

 ### Examples

lance-spark-base_2.12/src/main/scala/org/apache/spark/sql/execution/datasources/v2/OptimizeExec.scala

Lines changed: 2 additions & 0 deletions
@@ -49,6 +49,8 @@ case class OptimizeExec(
     argsMap.get("batch_size").map(t => builder.withBatchSize(t.value.asInstanceOf[Long]))
     argsMap.get("defer_index_remap").map(t =>
       builder.withDeferIndexRemap(t.value.asInstanceOf[Boolean]))
+    argsMap.get("max_source_fragments").map(t =>
+      builder.withMaxSourceFragments(t.value.asInstanceOf[Long]))

     builder.build()
   }
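
The hunk above follows a simple pattern: look up an optional named argument and call the matching builder method only when the argument is present. Here is a minimal, self-contained Java sketch of that pattern. `Options`, `Builder`, and `fromArgs` are hypothetical stand-ins written for this example, not the lance-spark or Lance API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the compaction options builder; not the lance-spark API. */
public class OptimizeArgsSketch {

  /** Holds only the two options this sketch cares about. */
  static final class Options {
    final Long batchSize;          // null means "not set"
    final Long maxSourceFragments; // null means "no cap requested"

    Options(Long batchSize, Long maxSourceFragments) {
      this.batchSize = batchSize;
      this.maxSourceFragments = maxSourceFragments;
    }
  }

  static final class Builder {
    private Long batchSize;
    private Long maxSourceFragments;

    Builder withBatchSize(long value) { this.batchSize = value; return this; }
    Builder withMaxSourceFragments(long value) { this.maxSourceFragments = value; return this; }
    Options build() { return new Options(batchSize, maxSourceFragments); }
  }

  /** Applies each named argument to the builder only when it is present. */
  static Options fromArgs(Map<String, Object> args) {
    Builder builder = new Builder();
    Object batch = args.get("batch_size");
    if (batch != null) {
      builder.withBatchSize((Long) batch);
    }
    Object maxFragments = args.get("max_source_fragments");
    if (maxFragments != null) {
      builder.withMaxSourceFragments((Long) maxFragments);
    }
    return builder.build();
  }

  public static void main(String[] args) {
    Map<String, Object> parsed = new HashMap<>();
    parsed.put("max_source_fragments", 128L);
    Options options = fromArgs(parsed);
    System.out.println("batch_size=" + options.batchSize
        + ", max_source_fragments=" + options.maxSourceFragments);
  }
}
```

Arguments that are absent leave the builder untouched, so existing OPTIMIZE statements that do not pass `max_source_fragments` behave as before.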

lance-spark-base_2.12/src/test/java/org/lance/spark/update/BaseOptimizeTest.java

Lines changed: 2 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -121,7 +121,8 @@ public void testWithFullArgs() {
                 + "materialize_deletions_threshold=0.2f,"
                 + "num_threads=2,"
                 + "batch_size=2000,"
-                + "defer_index_remap=true"
+                + "defer_index_remap=true,"
+                + "max_source_fragments=128"
                 + ")",
             fullTable));
