Fix lazy open bug #858

Merged
tomwhite merged 6 commits into cubed-dev:main from TomNicholas:fix-lazy-open-bug on Jan 26, 2026

Conversation

@TomNicholas
Member

@TomNicholas TomNicholas commented Jan 12, 2026

Fixes #855 by using ._data as suggested.

However I get another weird error with the diagnostics code:

Details
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
File /opt/coiled/env/lib/python3.12/site-packages/IPython/core/formatters.py:406, in BaseFormatter.__call__(self, obj)
    404     method = get_real_method(obj, self.print_method)
    405     if method is not None:
--> 406         return method()
    407     return None
    408 else:

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/dataset.py:2404, in Dataset._repr_html_(self)
   2402 if OPTIONS["display_style"] == "text":
   2403     return f"<pre>{escape(repr(self))}</pre>"
-> 2404 return formatting_html.dataset_repr(self)

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/formatting_html.py:374, in dataset_repr(ds)
    371 if ds.coords:
    372     sections.append(coord_section(ds.coords))
--> 374 sections.append(datavar_section(ds.data_vars))
    376 display_default_indexes = _get_boolean_with_default(
    377     "display_default_indexes", False
    378 )
    379 xindexes = filter_nondefault_indexes(
    380     _get_indexes_dict(ds.xindexes), not display_default_indexes
    381 )

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/formatting_html.py:224, in _mapping_section(mapping, name, details_func, max_items_collapse, expand_option_name, enabled, max_option_name)
    218     if n_items > max_items:
    219         inline_details = f"({max_items}/{n_items})"
    221 return collapsible_section(
    222     name,
    223     inline_details=inline_details,
--> 224     details=details_func(mapping),
    225     n_items=n_items,
    226     enabled=enabled,
    227     collapsed=collapsed,
    228 )

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/formatting_html.py:134, in summarize_vars(variables)
    133 def summarize_vars(variables) -> str:
--> 134     vars_li = "".join(
    135         f"<li class='xr-var-item'>{summarize_variable(k, v)}</li>"
    136         for k, v in variables.items()
    137     )
    139     return f"<ul class='xr-var-list'>{vars_li}</ul>"

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/formatting_html.py:135, in <genexpr>(.0)
    133 def summarize_vars(variables) -> str:
    134     vars_li = "".join(
--> 135         f"<li class='xr-var-item'>{summarize_variable(k, v)}</li>"
    136         for k, v in variables.items()
    137     )
    139     return f"<ul class='xr-var-list'>{vars_li}</ul>"

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/formatting_html.py:96, in summarize_variable(name, var, is_index, dtype)
     94 preview = escape(inline_variable_array_repr(variable, 35))
     95 attrs_ul = summarize_attrs(var.attrs)
---> 96 data_repr = short_data_repr_html(variable)
     98 attrs_icon = _icon("icon-file-text2")
     99 data_icon = _icon("icon-database")

File /opt/coiled/env/lib/python3.12/site-packages/xarray/core/formatting_html.py:43, in short_data_repr_html(array)
     41 internal_data = getattr(array, "variable", array)._data
     42 if hasattr(internal_data, "_repr_html_"):
---> 43     return internal_data._repr_html_()
     44 text = escape(short_data_repr(array))
     45 return f"<pre>{text}</pre>"

File /opt/coiled/env/lib/python3.12/site-packages/cubed/array_api/array_object.py:50, in Array._repr_html_(self)
     49 def _repr_html_(self):
---> 50     from cubed.diagnostics.widgets import get_template
     52     try:
     53         grid = self.to_svg(size=ARRAY_SVG_SIZE)

File /opt/coiled/env/lib/python3.12/site-packages/cubed/diagnostics/widgets/__init__.py:9
      1 try:
      2     from cubed.diagnostics.widgets.core import (
      3         FILTERS,
      4         TEMPLATE_PATHS,
      5         get_environment,
      6         get_template,
      7     )
----> 9     from .memory import LiveMemoryViewer, MemoryWidget
     10     from .plan import LivePlanViewer, PlanWidget
     11     from .timeline import LiveTimelineViewer, TimelineWidget

File /opt/coiled/env/lib/python3.12/site-packages/cubed/diagnostics/widgets/memory.py:13
      9 from cubed.runtime.pipeline import visit_nodes
     10 from cubed.runtime.types import Callback
---> 13 class MemoryWidget(anywidget.AnyWidget):
     14     _esm = pathlib.Path(__file__).parent / "static" / "memory_widget.js"
     16     width = traitlets.Unicode("100%").tag(sync=True)

File /opt/coiled/env/lib/python3.12/site-packages/traitlets/traitlets.py:963, in MetaHasDescriptors.__new__(mcls, name, bases, classdict, **kwds)
    960         classdict[k] = v()
    961     # ----------------------------------------------------------------
--> 963 return super().__new__(mcls, name, bases, classdict, **kwds)

File /opt/coiled/env/lib/python3.12/site-packages/anywidget/widget.py:71, in AnyWidget.__init_subclass__(cls, **kwargs)
     67 super().__init_subclass__(**kwargs)
     68 for key in (_ESM_KEY, _CSS_KEY) & cls.__dict__.keys():
     69     # TODO(manzt): Upgrade to := when we drop Python 3.7
     70     # https://github.com/manzt/anywidget/pull/167
---> 71     file_contents = try_file_contents(getattr(cls, key))
     72     if file_contents:
     73         setattr(cls, key, file_contents)

File /opt/coiled/env/lib/python3.12/site-packages/anywidget/_util.py:280, in try_file_contents(x)
    278 if not path.is_file():
    279     msg = f"File not found: {path}"
--> 280     raise FileNotFoundError(msg)
    281 return FileContents(
    282     path=path,
    283     start_thread=_should_start_thread(path),
    284 )

FileNotFoundError: File not found: /opt/coiled/env/lib/python3.12/site-packages/cubed/diagnostics/widgets/static/memory_widget.js

(I'm running this on Azure via a Coiled notebook)

This PR also doesn't have a test yet.

Comment thread cubed/utils.py
da = obj[var]
if isinstance(da.data, Array):
    array_names_to_variable_names[da.data.name] = f"{name}.{var}"
for var_name, var in obj.variables.items():
Member Author

I deliberately changed .data_vars to .variables here because it's totally possible to have large lazy coordinate variables.
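
To illustrate the reasoning, here is a minimal sketch that uses stand-in classes rather than real xarray or cubed objects. `LazyBackendArray`, `ChunkedArray`, `Variable`, and `chunked_array_names` are all hypothetical names. The assumption is that a public `.data` accessor may materialise a lazily opened array, whereas the private `._data` attribute hands back the wrapped object untouched, so scanning every variable (coordinates included) through `._data` should not trigger any loading.

```python
class LazyBackendArray:
    """Hypothetical stand-in for a lazily opened on-disk array."""

    def __init__(self, size):
        self.size = size
        self.loads = 0  # counts how often the full array was materialised

    def load(self):
        # Pretend to read every element from storage.
        self.loads += 1
        return [0.0] * self.size


class ChunkedArray:
    """Hypothetical stand-in for a named chunked array."""

    def __init__(self, name):
        self.name = name


class Variable:
    """Hypothetical stand-in with a public/private data split."""

    def __init__(self, data):
        self._data = data

    @property
    def data(self):
        # Public accessor: materialises lazy backend arrays (assumed behaviour).
        if isinstance(self._data, LazyBackendArray):
            return self._data.load()
        return self._data


def chunked_array_names(variables):
    """Map chunked-array names to variable names without loading anything."""
    names = {}
    for var_name, var in variables.items():
        wrapped = var._data  # private attribute: returns the wrapper as-is
        if isinstance(wrapped, ChunkedArray):
            names[wrapped.name] = var_name
    return names


# "time" plays the role of a large lazy coordinate variable.
lazy = LazyBackendArray(size=1_000)
variables = {
    "temperature": Variable(ChunkedArray("array-001")),
    "time": Variable(lazy),
}

names = chunked_array_names(variables)
print(names)       # chunked array name mapped to its variable name
print(lazy.loads)  # number of loads triggered by the scan
```

In this sketch the scan never touches `.data`, so the lazy coordinate is only loaded if something later asks for it explicitly.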

@TomNicholas
Member Author

It seems the .load() step with cubed was actually slower than with dask, surprisingly. It somehow took 3.5 seconds to load just 43 MB from object storage. 😕 Dask took 2.25 s, and not using dask or cubed took 733 ms (which is still slow).

This goes against my mental model here.

@tomwhite
Member

> Fixes #855 by using ._data as suggested.

Great! Thanks for fixing.

> However I get another weird error with the diagnostics code:

I wonder if this is like #449 - need to include new css and js files for the widgets? I can fix this separately.
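
One way to confirm this kind of packaging gap in an installed environment is to look for the asset files directly. The following is a minimal sketch, assuming the widgets keep their assets in a `static` directory next to the module, as the traceback suggests. The helper name `missing_static_assets` and the file name `plan_widget.js` are hypothetical; only `memory_widget.js` appears in the error above.

```python
import pathlib
import tempfile


def missing_static_assets(package_dir, expected):
    """Return the expected asset names that are absent from package_dir/static."""
    static_dir = pathlib.Path(package_dir) / "static"
    return sorted(name for name in expected if not (static_dir / name).is_file())


# Simulate an installed widgets package whose static directory lacks one asset.
with tempfile.TemporaryDirectory() as tmp:
    static = pathlib.Path(tmp) / "static"
    static.mkdir()
    # Hypothetical asset that did get shipped.
    (static / "plan_widget.js").write_text("export default {};")
    missing = missing_static_assets(tmp, ["memory_widget.js", "plan_widget.js"])

print(missing)  # names of assets that were expected but not found
```

Pointing the helper at the real `cubed/diagnostics/widgets` directory in site-packages would show whether the installed distribution actually shipped the files.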

@tomwhite
Member

> This goes against my mental model here.

Yes, mine too. Do you mind creating a separate issue for this with the code snippet that causes it?

@tomwhite
Member

> I wonder if this is like #449 - need to include new css and js files for the widgets? I can fix this separately.

See #860

@tomwhite
Member

@TomNicholas I think it's worth merging this if you're happy with it as I'd like to do a release soon. Worth opening an issue or PR for the test in https://github.com/cubed-dev/cubed-xarray

@TomNicholas
Member Author

Sorry I won't really have much time to be more helpful here until ~2 weeks from now.

@tomwhite
Member

I pushed a fix (9d4a089) for a problem that was causing this to fail for DataArray.

I haven't been able to reproduce the original failure in #855 though. I think there may be something special about the dataset that triggers this (indexes?). Perhaps we can merge this since it fixes #855 for @TomNicholas, and we have cubed-dev/cubed-xarray#41 to track adding a test (which should be easier now we have #864).

@tomwhite tomwhite merged commit 123e7a3 into cubed-dev:main Jan 26, 2026
21 checks passed

Development

Successfully merging this pull request may close these issues.

Attempts to allocate memory for full-size array when opening zarr store with xarray

2 participants