
Commit f647a5c

thodson-usgs and claude committed
Fix Coordinates.to_index performance regression
The codes passed to pd.MultiIndex were being converted from cache-friendly ndarrays into Python lists to silence a mypy arg-type error introduced in pydata#10694. The extra per-element conversion dominates runtime for large indexes (~13s on a 100x2000x300 array).

Pass the ndarrays directly and suppress the type error the same way as for `levels` just above.

Fixes pydata#11305

Co-authored-by: Claude <noreply@anthropic.com>
1 parent f7e47a1 commit f647a5c

2 files changed

Lines changed: 5 additions & 1 deletion


doc/whats-new.rst

Lines changed: 4 additions & 0 deletions
@@ -26,6 +26,10 @@ Deprecations
 Bug Fixes
 ~~~~~~~~~
 
+- Fix a major performance regression in :py:meth:`Coordinates.to_index` (and
+  consequently :py:meth:`Dataset.to_dataframe`) caused by converting the cached
+  code ndarrays into Python lists (:issue:`11305`).
+
 
 Documentation
 ~~~~~~~~~~~~~

xarray/core/coordinates.py

Lines changed: 1 addition & 1 deletion
@@ -194,7 +194,7 @@ def to_index(self, ordered_dims: Sequence[Hashable] | None = None) -> pd.Index:
 
             return pd.MultiIndex(
                 levels=level_list,  # type: ignore[arg-type,unused-ignore]
-                codes=[list(c) for c in code_list],
+                codes=code_list,  # type: ignore[arg-type,unused-ignore]
                 names=names,
             )
 
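
The effect of this one-line change can be illustrated outside xarray. The following is a minimal sketch, not the project's benchmark: the shape, the level values, the dimension names, and the helper functions `build_with_lists`, `build_with_ndarrays`, and `timed` are all hypothetical, and the shape is deliberately much smaller than the 100x2000x300 case cited in the commit message. It builds the same product index both ways, checks that the results are equal, and prints the two build times, which will vary by machine.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical small shape; the commit message cites 100x2000x300.
shape = (10, 200, 30)
names = ["x", "y", "z"]  # assumed dimension names, for illustration only
level_list = [pd.Index(np.arange(n)) for n in shape]

# One integer code array per level, each as long as the full product index.
grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
code_list = [g.ravel() for g in grids]


def build_with_lists():
    # Pre-fix behaviour: each code ndarray is converted to a Python list.
    return pd.MultiIndex(
        levels=level_list,
        codes=[list(c) for c in code_list],
        names=names,
    )


def build_with_ndarrays():
    # Post-fix behaviour: the ndarrays are passed through unchanged.
    return pd.MultiIndex(levels=level_list, codes=code_list, names=names)


def timed(fn):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


idx_lists, t_lists = timed(build_with_lists)
idx_arrays, t_arrays = timed(build_with_ndarrays)

# Both constructions describe the same index; only the build cost differs.
assert idx_lists.equals(idx_arrays)
print(f"lists: {t_lists:.4f}s, ndarrays: {t_arrays:.4f}s")
```

Because the two indexes compare equal, the change should be behaviour-preserving for callers; the list conversion only adds per-element work before pandas converts the codes back into arrays.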
