Commit 7ced14e

test: consolidate write tests into single [[test]] binary (delta-io#2472)
## What changes are proposed in this pull request?

Collapses the 15 `kernel/tests/write_*.rs` files (split out in delta-io#2460) into a single integration test binary declared via `[[test]]` in `kernel/Cargo.toml`. Topic files move under `kernel/tests/write/` as modules of `tests/write/main.rs`.

Test discovery and filtering are unchanged from a user POV: `cargo nextest run --test write` runs the whole binary, `cargo nextest run write::stats::` filters by topic. `cargo test --test write_stats`-style compile pruning of one topic is lost.

## How was this change tested?

Measured locally, after `touch kernel/src/lib.rs`:

| | Binaries | `cargo build --tests -p delta_kernel --all-features` |
|---|---|---|
| before | 32 | 45.82s |
| after | 18 | 35.04s |
1 parent 6486bd2 commit 7ced14e

20 files changed

Lines changed: 53 additions & 40 deletions

kernel/Cargo.toml

Lines changed: 9 additions & 0 deletions
@@ -161,3 +161,12 @@ harness = false
 [[bench]]
 name = "expression_bench"
 harness = false
+
+# Consolidated write-path integration tests. Linking each integration test as a
+# separate binary is expensive (every binary re-links Arrow, Parquet, and the
+# default engine), so all write tests share a single binary whose entry point
+# is `tests/write/main.rs` and whose topics live as modules under that
+# directory.
+[[test]]
+name = "write"
+path = "tests/write/main.rs"
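
For orientation, the shape of such an entry point might look roughly like the sketch below. The contents of `tests/write/main.rs` are not part of the hunks shown here, so everything in it is an assumption: the module names are borrowed from topic names that appear in this commit, the helper body and its return type are invented placeholders, and inline module bodies stand in for the real file modules so the sketch compiles on its own.

```rust
// Hypothetical stand-in for `tests/write/main.rs`, collapsed into one file.
// In a real layout each topic would be a file module (for example `mod stats;`
// resolving to `tests/write/stats.rs`); inline bodies keep this sketch standalone.

// Shared helpers are declared once, at the root of the test binary.
mod common {
    pub mod write_utils {
        // Placeholder only: the real helper's signature is not shown in this
        // diff, and the field name and type below are invented.
        pub fn get_simple_int_schema() -> Vec<(&'static str, &'static str)> {
            vec![("number", "integer")]
        }
    }
}

// Topic modules do not declare `mod common;` themselves. They import the
// shared helpers through the crate root instead.
mod stats {
    use crate::common::write_utils::get_simple_int_schema;

    pub fn schema_field_count() -> usize {
        get_simple_int_schema().len()
    }
}

mod remove_dv {
    use crate::common::write_utils::get_simple_int_schema;

    pub fn first_field_name() -> &'static str {
        get_simple_int_schema()[0].0
    }
}
```

When every `write_*.rs` file was its own integration test binary, each file was a crate root and had to declare `mod common;` itself. As submodules of one binary, the topics share a single declaration and reach it through `crate::common`, which matches the import change in the Rust hunks of this commit.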

kernel/src/transaction/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -2121,7 +2121,7 @@ mod tests {
 
     // Note: Additional test coverage for partial file matching (where some files in a scan
     // have DV updates but others don't) is provided by the end-to-end integration test
-    // kernel/tests/dv.rs and kernel/tests/write_remove_dv.rs, which exercise
+    // kernel/tests/dv.rs and kernel/tests/write/remove_dv.rs, which exercise
     // the full deletion vector write workflow including the DvMatchVisitor logic.
 
     #[test]

kernel/tests/README.md

Lines changed: 5 additions & 5 deletions
@@ -9,7 +9,7 @@ Test tables organized by feature area. Tables live in two locations:
 
 | Table | Location | Schema | Protocol (R/W) | Features | Description | Tests |
 |-------|----------|--------|----------|----------|-------------|-------|
-| `table-with-dv-small` | data/ | `value: int` | v3/v7 | r:`deletionVectors` w:`deletionVectors` | 10 rows, 2 soft-deleted by DV, 8 visible. Most heavily referenced test table. | `dv.rs::test_table_scan(with_dv)`, `write_remove_dv.rs::test_remove_files_adds_expected_entries`, `write_remove_dv.rs::test_update_deletion_vectors_adds_expected_entries`, `read.rs::with_predicate_and_removes`, `path.rs::test_to_uri/test_child/test_child_escapes`, `snapshot.rs::test_snapshot_read_metadata/test_new_snapshot/test_snapshot_new_from/test_read_table_with_missing_last_checkpoint/test_log_compaction_writer`, `deletion_vector.rs` tests, `transaction/mod.rs::setup_dv_enabled_table/test_add_files_schema/test_new_deletion_vector_path`, `default/parquet.rs` read test, `default/json.rs` read test, `log_compaction/tests.rs::create_mock_snapshot`, `resolve_dvs.rs` tests |
+| `table-with-dv-small` | data/ | `value: int` | v3/v7 | r:`deletionVectors` w:`deletionVectors` | 10 rows, 2 soft-deleted by DV, 8 visible. Most heavily referenced test table. | `dv.rs::test_table_scan(with_dv)`, `write::remove_dv::test_remove_files_adds_expected_entries`, `write::remove_dv::test_update_deletion_vectors_adds_expected_entries`, `read.rs::with_predicate_and_removes`, `path.rs::test_to_uri/test_child/test_child_escapes`, `snapshot.rs::test_snapshot_read_metadata/test_new_snapshot/test_snapshot_new_from/test_read_table_with_missing_last_checkpoint/test_log_compaction_writer`, `deletion_vector.rs` tests, `transaction/mod.rs::setup_dv_enabled_table/test_add_files_schema/test_new_deletion_vector_path`, `default/parquet.rs` read test, `default/json.rs` read test, `log_compaction/tests.rs::create_mock_snapshot`, `resolve_dvs.rs` tests |
 | `table-without-dv-small` | data/ | `value: long` | v1/v2 | | 10 rows, all visible. Companion to table-with-dv-small. | `dv.rs::test_table_scan(without_dv)`, `transaction/mod.rs::setup_non_dv_table/create_existing_table_txn/test_commit_io_error_returns_retryable_transaction`, `sequential_phase.rs::test_sequential_v2_with_commits_only/test_sequential_finish_before_exhaustion_error`, `parallel_phase.rs` tests, `scan/tests.rs::test_scan_metadata_paths/test_scan_metadata/test_scan_metadata_from_same_version` |
 | `with-short-dv` | data/ | `id: long, value: string, timestamp: timestamp, rand: double` | v3/v7 | r:`deletionVectors` w:`deletionVectors` | 2 files x 5 rows. First file has inline DV (`storageType="u"`) deleting 3 rows. | `read.rs::short_dv` |
 | `dv-partitioned-with-checkpoint` | golden_data/ | `value: int, part: int` partitioned by `part` | v3/v7 | r:`deletionVectors` w:`deletionVectors` | DVs on a partitioned table with a checkpoint | `golden_tables.rs::golden_test!` |
@@ -41,9 +41,9 @@ Test tables organized by feature area. Tables live in two locations:
 
 | Table | Location | Schema | Protocol (R/W) | Features | Description | Tests |
 |-------|----------|--------|----------|----------|-------------|-------|
-| `partition_cm/none` | data/ | `value: int, category: string` partitioned by `category` | v1/v1 | `columnMapping.mode=none` | Partitioned write with CM disabled | `write_column_mapping.rs::test_column_mapping_partitioned_write(cm_none)` |
-| `partition_cm/id` | data/ | `value: int, category: string` partitioned by `category` | v3/v7 | r:`columnMapping` w:`columnMapping`, `columnMapping.mode=id` | Partitioned write with CM id mode | `write_column_mapping.rs::test_column_mapping_partitioned_write(cm_id)` |
-| `partition_cm/name` | data/ | `value: int, category: string` partitioned by `category` | v3/v7 | r:`columnMapping` w:`columnMapping`, `columnMapping.mode=name` | Partitioned write with CM name mode | `write_column_mapping.rs::test_column_mapping_partitioned_write(cm_name)` |
+| `partition_cm/none` | data/ | `value: int, category: string` partitioned by `category` | v1/v1 | `columnMapping.mode=none` | Partitioned write with CM disabled | `write::column_mapping::test_column_mapping_partitioned_write(cm_none)` |
+| `partition_cm/id` | data/ | `value: int, category: string` partitioned by `category` | v3/v7 | r:`columnMapping` w:`columnMapping`, `columnMapping.mode=id` | Partitioned write with CM id mode | `write::column_mapping::test_column_mapping_partitioned_write(cm_id)` |
+| `partition_cm/name` | data/ | `value: int, category: string` partitioned by `category` | v3/v7 | r:`columnMapping` w:`columnMapping`, `columnMapping.mode=name` | Partitioned write with CM name mode | `write::column_mapping::test_column_mapping_partitioned_write(cm_name)` |
 | `table-with-columnmapping-mode-name` | golden_data/ | `ByteType: byte, ShortType: short, IntegerType: int, LongType: long, FloatType: float, DoubleType: double, decimal: decimal(10,2), BooleanType: boolean, StringType: string, BinaryType: binary, DateType: date, TimestampType: timestamp, nested_struct: struct{aa: string, ac: struct{aca: int}}, array_of_prims: array<int>, array_of_arrays: array<array<int>>, array_of_structs: array<struct{ab: long}>, map_of_prims: map<int,long>, map_of_rows: map<int,struct{ab: long}>, map_of_arrays: map<long,array<int>>` | v2/v5 | `columnMapping.mode=name` | Column mapping name mode | `golden_tables.rs::golden_test!` |
 | `table-with-columnmapping-mode-id` | golden_data/ | `ByteType: byte, ShortType: short, IntegerType: int, LongType: long, FloatType: float, DoubleType: double, decimal: decimal(10,2), BooleanType: boolean, StringType: string, BinaryType: binary, DateType: date, TimestampType: timestamp, nested_struct: struct{aa: string, ac: struct{aca: int}}, array_of_prims: array<int>, array_of_arrays: array<array<int>>, array_of_structs: array<struct{ab: long}>, map_of_prims: map<int,long>, map_of_rows: map<int,struct{ab: long}>, map_of_arrays: map<long,array<int>>` | v2/v5 | `columnMapping.mode=id` | Column mapping id mode | `golden_tables.rs::golden_test!` |
 
@@ -52,7 +52,7 @@ Test tables organized by feature area. Tables live in two locations:
 | Table | Location | Schema | Protocol (R/W) | Features | Description | Tests |
 |-------|----------|--------|----------|----------|-------------|-------|
 | `with_checkpoint_no_last_checkpoint` | data/ | `letter: string, int: long, date: date` | v1/v2 | `checkpointInterval=2` | Checkpoint at v2 but missing `_last_checkpoint` hint file | `snapshot.rs::test_read_table_with_checkpoint`, `scan/tests.rs::test_scan_with_checkpoint`, `sequential_phase.rs::test_sequential_checkpoint_no_commits`, `checkpoint_manifest.rs` tests, `sync/parquet.rs` test, `default/parquet.rs` test |
-| `external-table-different-nullability` | data/ | `i: int` | v1/v2 | `checkpointInterval=2` | Parquet files have different nullability than Delta schema; includes checkpoint | `write_clustered.rs::test_checkpoint_non_kernel_written_table` |
+| `external-table-different-nullability` | data/ | `i: int` | v1/v2 | `checkpointInterval=2` | Parquet files have different nullability than Delta schema; includes checkpoint | `write::clustered::test_checkpoint_non_kernel_written_table` |
 | `checkpoint` | golden_data/ | `intCol: int` | v1/v2 | | Basic checkpoint read | `golden_tables.rs::golden_test!(checkpoint_test)` |
 | `corrupted-last-checkpoint-kernel` | golden_data/ | `id: long` | v1/v2 | | Corrupted `_last_checkpoint` file | `golden_tables.rs::golden_test!` |
 | `multi-part-checkpoint` | golden_data/ | `id: long` | v1/v2 | `checkpointInterval=1` | Multi-part checkpoint files | `golden_tables.rs::golden_test!` |
Lines changed: 1 addition & 3 deletions
@@ -19,9 +19,7 @@ use itertools::Itertools;
 use serde_json::{json, Deserializer};
 use test_utils::{create_table, engine_store_setup, set_json_value, setup_test_tables, test_read};
 
-mod common;
-
-use common::write_utils::{
+use crate::common::write_utils::{
     check_action_timestamps, get_and_check_all_parquet_sizes, get_simple_int_schema,
     validate_txn_id, write_data_and_check_result_and_stats, ZERO_UUID,
 };
Lines changed: 1 addition & 3 deletions
@@ -17,9 +17,7 @@ use tempfile::{tempdir, TempDir};
 use test_utils::{assert_result_error_with_message, create_table, engine_store_setup};
 use url::Url;
 
-mod common;
-
-use common::write_utils::get_simple_int_schema;
+use crate::common::write_utils::get_simple_int_schema;
 
 // Helper function to create a table with CDF enabled
 async fn create_cdf_table(
Lines changed: 1 addition & 3 deletions
@@ -22,9 +22,7 @@ use test_utils::{
 };
 use url::Url;
 
-mod common;
-
-use common::write_utils::{
+use crate::common::write_utils::{
     assert_column_mapping_mode, assert_min_max_stats, resolve_json_path, resolve_struct_field,
     set_table_properties,
 };
Lines changed: 1 addition & 3 deletions
@@ -25,9 +25,7 @@ use test_utils::{
 };
 use url::Url;
 
-mod common;
-
-use common::write_utils::{
+use crate::common::write_utils::{
     assert_min_max_stats, get_parquet_field_id, resolve_struct_field, set_table_properties,
 };
 
Lines changed: 3 additions & 3 deletions
@@ -14,9 +14,9 @@ use itertools::Itertools;
 use serde_json::{json, Deserializer};
 use test_utils::{set_json_value, setup_test_tables};
 
-mod common;
-
-use common::write_utils::{get_simple_int_schema, validate_timestamp, validate_txn_id, ZERO_UUID};
+use crate::common::write_utils::{
+    get_simple_int_schema, validate_timestamp, validate_txn_id, ZERO_UUID,
+};
 
 #[tokio::test]
 async fn test_commit_info() -> Result<(), Box<dyn std::error::Error>> {
Lines changed: 1 addition & 3 deletions
@@ -8,9 +8,7 @@ use itertools::Itertools;
 use serde_json::Deserializer;
 use test_utils::{assert_result_error_with_message, create_table, engine_store_setup};
 
-mod common;
-
-use common::write_utils::get_simple_int_schema;
+use crate::common::write_utils::get_simple_int_schema;
 
 #[tokio::test]
 async fn test_set_domain_metadata_basic() -> Result<(), Box<dyn std::error::Error>> {
Lines changed: 1 addition & 3 deletions
@@ -18,9 +18,7 @@ use tempfile::TempDir;
 use test_utils::engine_store_setup;
 use url::Url;
 
-mod common;
-
-use common::write_utils::get_simple_int_schema;
+use crate::common::write_utils::get_simple_int_schema;
 
 async fn get_ict_at_version(
     store: Arc<DynObjectStore>,
