Commit 4e9f24e

Support pandas 2.2.0
Fixes a unit test to support pandas 2.2.0+. That release fixes a sorting bug (pandas-dev/pandas#54611), so this commit updates the expected results accordingly.

Also fixes a merge dtype mismatch introduced by upstream #1709: the codelist metadata side was cast to StringDtype, but the evaluation dataset side was not. With pandas 2.2.0, empty columns are inferred as float64, and merging a float64 column with a string column is rejected. Casting both sides to string before the merge resolves this.
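
The dtype alignment described above can be illustrated with a minimal sketch. The data frames, the column values, and the helper name `cast_merge_keys_to_string` are hypothetical and only stand in for the evaluation dataset and codelist metadata; the commit itself builds the cast mapping inline.

```python
import pandas as pd


def cast_merge_keys_to_string(df: pd.DataFrame, keys: list[str]) -> pd.DataFrame:
    """Cast the merge keys present in df to the nullable string dtype.

    Hypothetical helper: keys that are missing from df are skipped,
    mirroring the column-presence check in the commit.
    """
    present = {key: "string" for key in keys if key in df.columns}
    return df.astype(present)


# Hypothetical evaluation data: the codelist column holds only missing
# values, so pandas stores it as float64.
evaluation = pd.DataFrame(
    {
        "ct_version": ["2020-03-27", "2020-06-26"],
        "codelist_code": [float("nan"), float("nan")],
    }
)

# Hypothetical codelist metadata, already cast to string on the key columns.
metadata = pd.DataFrame(
    {
        "ct_version": ["2020-03-27"],
        "codelist_code": ["C66742"],
        "extensible": [False],
    }
).astype({"ct_version": "string", "codelist_code": "string"})

keys = ["ct_version", "codelist_code"]

# Both sides now share the string dtype on the keys, so the merge is accepted.
merged = cast_merge_keys_to_string(evaluation, keys).merge(
    metadata, on=keys, how="left"
)
print(merged.dtypes)
```

Casting only the key columns leaves the remaining columns of the evaluation data untouched, which keeps the change local to the merge.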
1 parent c95bcfb commit 4e9f24e

4 files changed

Lines changed: 10 additions & 2 deletions

cdisc_rules_engine/operations/codelist_extensible.py

Lines changed: 4 additions & 0 deletions
@@ -32,6 +32,10 @@ def _handle_multiple_versions(self) -> pd.Series:
                 "codelist_code": "string",
             }
         )
+        cast_cols = {self.params.ct_version: "string"}
+        if self.params.codelist_code in self.evaluation_dataset.columns:
+            cast_cols[self.params.codelist_code] = "string"
+        self.evaluation_dataset = self.evaluation_dataset.astype(cast_cols)
         if self.params.codelist_code in self.evaluation_dataset.columns:
             is_extensible = self.evaluation_dataset.merge(
                 ct_df.data,

cdisc_rules_engine/operations/codelist_terms.py

Lines changed: 4 additions & 0 deletions
@@ -64,6 +64,10 @@ def _handle_multiple_versions(self) -> pd.Series:
                 "codelist_code": "string",
             }
         )
+        cast_cols = {self.params.ct_version: "string"}
+        if self.params.codelist_code in self.evaluation_dataset.columns:
+            cast_cols[self.params.codelist_code] = "string"
+        self.evaluation_dataset = self.evaluation_dataset.astype(cast_cols)
         if self.params.codelist_code in self.evaluation_dataset.columns:
             result = self.evaluation_dataset.merge(
                 ct_df.data,

pyproject.toml

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ dependencies = [
     "numpy >=1.26.0",
     "odmlib >=0.1.4",
     "openpyxl >=3.1.5",
-    "pandas >=2.1.4, <2.2.0",
+    "pandas >=2.1.4, <3.0.0",
     "psutil >=6.1.1",
     "pyinstaller >=6.11.0",
     "pympler >=1.1",

tests/unit/test_dataset_builders/test_dataset_metadata_define_dataset_builder.py

Lines changed: 1 addition & 1 deletion
@@ -149,7 +149,7 @@ def test_dataset_metadata_define_dataset_builder(dataset_path):
             expected_results["dm.xpt"],
             expected_results["ae.xpt"],
         ]
-    ).astype(object)
+    ).astype(object).sort_values("dataset_location").reset_index(drop=True)
 
     result_df = result.data[expected_df.columns].reset_index(drop=True)
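
The test change above sorts the expected frame by `dataset_location` and resets its index so that row order matches the result. A minimal sketch of the same idea follows, assuming hypothetical data and a hypothetical `normalize` helper. Unlike the commit, which sorts only the expected frame, this variation sorts both sides so the comparison does not depend on either ordering.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def normalize(df: pd.DataFrame, key: str) -> pd.DataFrame:
    """Sort by key and rebuild the index so row order is deterministic."""
    return df.sort_values(key).reset_index(drop=True)


# Hypothetical frames holding the same rows in a different order.
# The "dataset_name" column is made up for illustration.
expected = pd.DataFrame(
    {"dataset_location": ["dm.xpt", "ae.xpt"], "dataset_name": ["DM", "AE"]}
)
result = pd.DataFrame(
    {"dataset_location": ["ae.xpt", "dm.xpt"], "dataset_name": ["AE", "DM"]}
)

# Without normalization the frames differ by row order only;
# after normalization they compare equal.
assert_frame_equal(
    normalize(result, "dataset_location"),
    normalize(expected, "dataset_location"),
)
```

Resetting the index matters here because `assert_frame_equal` compares index labels as well as values, and `sort_values` alone keeps the original labels.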
