diff --git a/CHANGELOG.md b/CHANGELOG.md
index 10088f94d..025c907ee 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,12 +2,11 @@
 
 ## PyNWB 4.0.0 (Upcoming)
 
-### Documentation and tutorial enhancements
-- Added `app.readthedocs.org/projects/pynwb/*` to `linkcheck_ignore` to stop the Sphinx linkcheck CI job from intermittently failing when GitHub Actions runners get throttled by readthedocs. @h-mayorquin [#2191](https://github.com/NeurodataWithoutBorders/pynwb/pull/2191)
-- Added documentation for `ExternalImage` to the images tutorial. @h-mayorquin [#2159](https://github.com/NeurodataWithoutBorders/pynwb/pull/2159)
-- Fixed broken and redirecting links in documentation. @bendichter [#2165](https://github.com/NeurodataWithoutBorders/pynwb/pull/2165)
-- Added `EventsTable` examples to the NWB file basics and behavior tutorials. @rly [#2156](https://github.com/NeurodataWithoutBorders/pynwb/pull/2156)
-- Added example of setting `Units.resolution` in the ecephys tutorial. @h-mayorquin [#2174](https://github.com/NeurodataWithoutBorders/pynwb/pull/2174)
+## Changed
+- Deprecated `NWBGroupSpec.add_group` and `NWBGroupSpec.add_dataset`. Use `NWBGroupSpec.set_group`, `NWBGroupSpec.set_dataset`, or pass the group or dataset to the `NWBGroupSpec` constructor. @rly [#2138](https://github.com/NeurodataWithoutBorders/pynwb/issues/2138)
+- Updated HDMF dependency to >=6.0.1, <7. @rly [#2171](https://github.com/NeurodataWithoutBorders/pynwb/issues/2171)
+- Deprecated Python 3.9 support. (EOL was Oct 31, 2025) @bendichter [#2141](https://github.com/NeurodataWithoutBorders/pynwb/pull/2141)
+- Deprecated `BehavioralEvents` and `AnnotationSeries` in favor of using an `EventsTable` in `NWBFile.events`. Creating a new instance of either type now emits a `UserWarning`; reading existing files containing these types continues to work without warnings. @rly [#2156](https://github.com/NeurodataWithoutBorders/pynwb/pull/2156)
 
 ### Added
 - Added optional `source_description` attribute to `EventsTable` for a short free-text label of where events originated (e.g., `"Acquisition system"`, `"Manual video review"`). Added `NWBFile.merge_events_tables()` to merge a list of `EventsTable` objects into a single DataFrame sorted by timestamp with a `source_events_table` column. Added `NWBFile.get_all_events()` to merge all tables in `NWBFile.events`. @rly [#2192](https://github.com/NeurodataWithoutBorders/pynwb/pull/2192)
@@ -16,19 +15,22 @@
 - Added `get_starting_time()` and `get_duration()` methods to `TimeSeries` to get the starting time and duration of the time series. @h-mayorquin [#2146](https://github.com/NeurodataWithoutBorders/pynwb/pull/2146)
 - Added `get_starting_time()` and `get_duration()` methods to `TimeIntervals` to get the earliest start time and total duration (span from earliest start to latest stop) of all intervals. @h-mayorquin [#2146](https://github.com/NeurodataWithoutBorders/pynwb/pull/2146)
 - Added `get_starting_time()` and `get_duration()` methods to `Units` to get the earliest spike time and total duration (span from earliest to latest spike) across all units. @h-mayorquin [#2164](https://github.com/NeurodataWithoutBorders/pynwb/pull/2164)
+- Added Python 3.14 support. @bendichter, @rly [#2168](https://github.com/NeurodataWithoutBorders/pynwb/pull/2168)
+
+### Documentation and tutorial enhancements
+- Added `app.readthedocs.org/projects/pynwb/*` to `linkcheck_ignore` to stop the Sphinx linkcheck CI job from intermittently failing when GitHub Actions runners get throttled by readthedocs. @h-mayorquin [#2191](https://github.com/NeurodataWithoutBorders/pynwb/pull/2191)
+- Added documentation for `ExternalImage` to the images tutorial. @h-mayorquin [#2159](https://github.com/NeurodataWithoutBorders/pynwb/pull/2159)
+- Fixed broken and redirecting links in documentation. @bendichter [#2165](https://github.com/NeurodataWithoutBorders/pynwb/pull/2165)
+- Added `EventsTable` examples to the NWB file basics and behavior tutorials. @rly [#2156](https://github.com/NeurodataWithoutBorders/pynwb/pull/2156)
+- Added example of setting `Units.resolution` in the ecephys tutorial. @h-mayorquin [#2174](https://github.com/NeurodataWithoutBorders/pynwb/pull/2174)
 
 ### Fixed
 - Fixed reading legacy files where `Device.model` is a string containing `/` or `:` (e.g., `"MFC_200/250-0.66_40mm"`), which previously raised a `ValueError`. The string is now remapped to a read-only `DeviceModel` that preserves the original name, with a warning explaining that the file cannot be written or exported until a `DeviceModel` with a valid name is created. Writing or exporting such a `DeviceModel` raises a clear error instead of silently corrupting the file. @rly [#2186](https://github.com/NeurodataWithoutBorders/pynwb/pull/2186)
 - Fixed invalid CSS properties in documentation assistant toggle that prevented proper positioning on displays ≥1400px wide. @rly [#2151](https://github.com/NeurodataWithoutBorders/pynwb/pull/2151)
-
-## Changed
-- Deprecated `NWBGroupSpec.add_group` and `NWBGroupSpec.add_dataset`. Use `NWBGroupSpec.set_group`, `NWBGroupSpec.set_dataset`, or pass the group or dataset to the `NWBGroupSpec` constructor. @rly [#2138](https://github.com/NeurodataWithoutBorders/pynwb/issues/2138)
+- Fixed `pynwb.validate(path=...)` raising `TypeError` on Zarr-backed NWB files because HDF5-only kwargs (`driver`, `aws_region`, `load_namespaces`) were forwarded into `NWBZarrIO`. Backend dispatch is now centralized in a single opener helper, and namespace loading uses `HDMFIO.load_namespaces_io` on the open IO, which also retires the `io._file` access in the validator. Added `storage_options` to the `validate()` API for the Zarr backend. @h-mayorquin [#2187](https://github.com/NeurodataWithoutBorders/pynwb/pull/2187)
 - Fixed `TimeSeries.get_timestamps()` to handle numpy array timestamps when they are set. @pauladkisson [#2181](https://github.com/NeurodataWithoutBorders/pynwb/pull/2181)
 - Fixed `Units.waveform_rate` and `Units.waveform_unit` to also map to the `sampling_rate` and `unit` attributes of the `waveforms` column on write and read, so waveform sampling metadata round-trips for `Units` tables that contain only `waveforms` (without `waveform_mean` or `waveform_sd`). @ehennestad [#2183](https://github.com/NeurodataWithoutBorders/pynwb/pull/2183)
-- Added Python 3.14 support. @bendichter, @rly [#2168](https://github.com/NeurodataWithoutBorders/pynwb/pull/2168)
-- Updated HDMF dependency to >=6.0.1, <7. @rly [#2171](https://github.com/NeurodataWithoutBorders/pynwb/issues/2171)
-- Deprecated Python 3.9 support. (EOL was Oct 31, 2025) @bendichter [#2141](https://github.com/NeurodataWithoutBorders/pynwb/pull/2141)
-- Deprecated `BehavioralEvents` and `AnnotationSeries` in favor of using an `EventsTable` in `NWBFile.events`. Creating a new instance of either type now emits a `UserWarning`; reading existing files containing these types continues to work without warnings. @rly [#2156](https://github.com/NeurodataWithoutBorders/pynwb/pull/2156)
+
 
 ## PyNWB 3.1.3 (December 9, 2025)
 
diff --git a/src/pynwb/validation.py b/src/pynwb/validation.py
index 4a25c0fc5..606b0820a 100644
--- a/src/pynwb/validation.py
+++ b/src/pynwb/validation.py
@@ -24,10 +24,36 @@ def _validate_helper(io: HDMFIO, namespace: str = CORE_NAMESPACE) -> list:
     return validator.validate(builder)
 
 
-def get_cached_namespaces_to_validate(path: Optional[str] = None,
-                                      driver: Optional[str] = None,
+HDF5_OPEN_KEYS = frozenset({"driver", "aws_region", "load_namespaces"})
+ZARR_OPEN_KEYS = frozenset({"storage_options"})
+
+
+def _open_backend_io(path: str,
+                     *,
+                     backend_kwargs: Optional[dict] = None,
+                     manager: Optional[BuildManager] = None) -> HDMFIO:
+    # Open an HDMFIO for `path`. `backend_kwargs` may contain a union of
+    # HDF5 (Hierarchical Data Format 5) and Zarr open options; this helper
+    # resolves the backend via _get_backend and keeps only the keys that apply.
+    # Keys whose value is None are dropped, so callers can include all keys
+    # unconditionally.
+    from pynwb import _get_backend, NWBHDF5IO
+    backend_kwargs = backend_kwargs or {}
+    backend_io_cls = _get_backend(path, method=backend_kwargs.get("driver"))
+    valid_keys = HDF5_OPEN_KEYS if backend_io_cls is NWBHDF5IO else ZARR_OPEN_KEYS
+    io_kwargs = {"path": path, "mode": "r"}
+    if manager is not None:
+        io_kwargs["manager"] = manager
+    io_kwargs.update({k: v for k, v in backend_kwargs.items()
+                      if k in valid_keys and v is not None})
+    return backend_io_cls(**io_kwargs)
+
+
+def get_cached_namespaces_to_validate(path: Optional[str] = None,
+                                      driver: Optional[str] = None,
                                       aws_region: Optional[str] = None,
-                                      io: Optional[HDMFIO] = None
+                                      storage_options: Optional[dict] = None,
+                                      io: Optional[HDMFIO] = None,
                                       ) -> Tuple[List[str], BuildManager, Dict[str, str]]:
     """
     Determine the most specific namespace(s) that are cached in the given NWBFile that can be used for validation.
@@ -60,17 +86,15 @@ def get_cached_namespaces_to_validate(path: Optional[str] = None,
     )
 
     if io is not None:
-        # TODO update HDF5IO to have .file property to make consistent with ZarrIO
-        # then update input arguments here
-        namespace_dependencies = io.load_namespaces(namespace_catalog=catalog,
-                                                    file=io._file)
+        namespace_dependencies = io.load_namespaces_io(namespace_catalog=catalog)
     else:
-        from pynwb import _get_backend
-        backend_io = _get_backend(path, method=driver)
-        namespace_dependencies = backend_io.load_namespaces(namespace_catalog=catalog,
-                                                            path=path,
-                                                            driver=driver,
-                                                            aws_region=aws_region)
+        opened_io = _open_backend_io(path, backend_kwargs={
+            "driver": driver,
+            "aws_region": aws_region,
+            "storage_options": storage_options,
+        })
+        namespace_dependencies = opened_io.load_namespaces_io(namespace_catalog=catalog)
+        opened_io.close()
 
     # Determine which namespaces are the most specific (i.e. extensions) and validate against those
     candidate_namespaces = set(namespace_dependencies.keys())
@@ -132,7 +156,13 @@ def get_cached_namespaces_to_validate(path: Optional[str] = None,
         "type": str,
         "doc": "Driver for h5py to use when opening the HDF5 file.",
         "default": None,
-    },
+    },
+    {
+        "name": "storage_options",
+        "type": dict,
+        "doc": "Zarr storage options for remote stores (used by the Zarr backend).",
+        "default": None,
+    },
     returns="Validation errors in the file.",
     rtype=list,
     is_method=False,
@@ -169,21 +199,20 @@ def validate(**kwargs):
 
 
 def _validate_single_file(**kwargs):
-    io, path, use_cached_namespaces, namespace, verbose, driver = getargs(
-        "io", "path", "use_cached_namespaces", "namespace", "verbose", "driver", kwargs
+    io, path, use_cached_namespaces, namespace, verbose, driver, storage_options = getargs(
+        "io", "path", "use_cached_namespaces", "namespace", "verbose", "driver", "storage_options", kwargs
     )
     assert io != path, "Both 'io' and 'path' were specified! Please choose only one."
     path = str(path) if isinstance(path, Path) else path
 
     # get namespaces to validate
     namespace_message = "PyNWB namespace information"
-    io_kwargs = dict(path=path, mode="r", driver=driver)
-
+    manager = None
+
     if use_cached_namespaces:
-        cached_namespaces, manager, namespace_dependencies = get_cached_namespaces_to_validate(path=path,
-                                                                                               driver=driver,
-                                                                                               io=io)
-        io_kwargs.update(manager=manager)
+        cached_namespaces, manager, namespace_dependencies = get_cached_namespaces_to_validate(
+            path=path, driver=driver, storage_options=storage_options, io=io,
+        )
 
         if any(cached_namespaces):
             namespaces_to_validate = cached_namespaces
@@ -194,14 +223,15 @@ def _validate_single_file(**kwargs):
             warn(f"The file {f'{path} ' if path is not None else ''}has no cached namespace information. "
                  f"Falling back to {namespace_message}.", UserWarning)
     else:
-        io_kwargs.update(load_namespaces=False)
         namespaces_to_validate = [CORE_NAMESPACE]
 
     # get io object if not provided
     if path is not None:
-        from pynwb import _get_backend
-        backend_io = _get_backend(path, method=driver)
-        io = backend_io(**io_kwargs)
+        io = _open_backend_io(path, backend_kwargs={
+            "driver": driver,
+            "storage_options": storage_options,
+            "load_namespaces": False if not use_cached_namespaces else None,
+        }, manager=manager)
 
     # check namespaces are accurate
     if namespace is not None:
diff --git a/tests/validation/test_validate.py b/tests/validation/test_validate.py
index 6a5b00c83..006bc3f4b 100644
--- a/tests/validation/test_validate.py
+++ b/tests/validation/test_validate.py
@@ -2,12 +2,22 @@
 import re
 import os
 import sys
+import tempfile
+import unittest
+from pathlib import Path
 from unittest.mock import patch
 from io import StringIO
 
 from pynwb.testing import TestCase
+from pynwb.testing.mock.file import mock_NWBFile
 from pynwb import validate, NWBHDF5IO
 
+try:
+    from hdmf_zarr import NWBZarrIO  # noqa: F401
+    HAVE_NWBZarrIO = True
+except ImportError:
+    HAVE_NWBZarrIO = False
+
 
 # NOTE we use "coverage run -m pynwb.validate" instead of "python -m pynwb.validate"
 # so that we can both test pynwb.validate and compute code coverage from that test.
@@ -344,3 +354,38 @@ def test_validate_paths_deprecation(self):
         with self.assertRaisesWith(ValueError, expected_error):
             validate(paths=['tests/back_compat/1.0.2_nwbfile.nwb'],
                      path='tests/back_compat/1.0.2_nwbfile.nwb')
+
+
+@unittest.skipIf(not HAVE_NWBZarrIO, "hdmf-zarr is not installed")
+class TestValidateZarr(TestCase):
+    # Regression tests for https://github.com/NeurodataWithoutBorders/pynwb/issues/2131:
+    # validate(path=...) on a Zarr-backed NWB file used to raise TypeError because
+    # HDF5-only kwargs leaked into NWBZarrIO. The validator now opens via a
+    # backend-aware factory and uses load_namespaces_io on the open IO.
+
+    def _write_zarr_nwbfile(self, path):
+        nwbfile = mock_NWBFile()
+        with NWBZarrIO(str(path), 'w') as io:
+            io.write(nwbfile)
+
+    def test_validate_zarr_path_cached_namespaces(self):
+        with tempfile.TemporaryDirectory() as temp_dir:
+            path = Path(temp_dir) / "test.nwb.zarr"
+            self._write_zarr_nwbfile(path)
+            errors = validate(path=str(path))
+            self.assertEqual(errors, [])
+
+    def test_validate_zarr_path_no_cached_namespaces(self):
+        with tempfile.TemporaryDirectory() as temp_dir:
+            path = Path(temp_dir) / "test.nwb.zarr"
+            self._write_zarr_nwbfile(path)
+            errors = validate(path=str(path), use_cached_namespaces=False)
+            self.assertEqual(errors, [])
+
+    def test_validate_zarr_io(self):
+        with tempfile.TemporaryDirectory() as temp_dir:
+            path = Path(temp_dir) / "test.nwb.zarr"
+            self._write_zarr_nwbfile(path)
+            with NWBZarrIO(str(path), 'r') as io:
+                errors = validate(io=io)
+            self.assertEqual(errors, [])
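
Review note: the key filtering that `_open_backend_io` performs can be tried out without h5py or hdmf-zarr installed. The sketch below is a minimal, self-contained illustration and not the PR's code. `FakeHDF5IO`, `FakeZarrIO`, and `pick_backend` are hypothetical stand-ins (assumptions) for `NWBHDF5IO`, `NWBZarrIO`, and `_get_backend`, and dispatching on the path suffix is a simplification of whatever the real backend lookup does. Only the two key sets and the filtering expression are taken from the diff.

```python
# Same key sets as in the diff.
HDF5_OPEN_KEYS = frozenset({"driver", "aws_region", "load_namespaces"})
ZARR_OPEN_KEYS = frozenset({"storage_options"})


class FakeHDF5IO:
    """Hypothetical stand-in for NWBHDF5IO: accepts HDF5-only open options."""

    def __init__(self, path, mode, manager=None, driver=None,
                 aws_region=None, load_namespaces=True):
        self.opened_with = dict(path=path, mode=mode, manager=manager,
                                driver=driver, aws_region=aws_region,
                                load_namespaces=load_namespaces)


class FakeZarrIO:
    """Hypothetical stand-in for NWBZarrIO: rejects HDF5-only options."""

    def __init__(self, path, mode, manager=None, storage_options=None):
        self.opened_with = dict(path=path, mode=mode, manager=manager,
                                storage_options=storage_options)


def pick_backend(path):
    # Assumption: suffix-based dispatch, only for the sake of the example.
    return FakeZarrIO if path.endswith(".zarr") else FakeHDF5IO


def open_backend_io(path, *, backend_kwargs=None, manager=None):
    # Mirrors the filtering in the diff: keep keys valid for the resolved
    # backend, and drop keys whose value is None (but keep False).
    backend_kwargs = backend_kwargs or {}
    io_cls = pick_backend(path)
    valid_keys = HDF5_OPEN_KEYS if io_cls is FakeHDF5IO else ZARR_OPEN_KEYS
    io_kwargs = {"path": path, "mode": "r"}
    if manager is not None:
        io_kwargs["manager"] = manager
    io_kwargs.update({k: v for k, v in backend_kwargs.items()
                      if k in valid_keys and v is not None})
    return io_cls(**io_kwargs)


# The failure mode the PR describes: an HDF5-only kwarg reaching the Zarr IO.
leaked_error = None
try:
    FakeZarrIO(path="session.nwb.zarr", mode="r", driver=None)
except TypeError as exc:
    leaked_error = exc

# One union of options works for both backends once it is filtered.
options = {
    "driver": "ros3",
    "aws_region": None,
    "storage_options": {"anon": True},
    "load_namespaces": False,
}
zarr_io = open_backend_io("session.nwb.zarr", backend_kwargs=options)
hdf5_io = open_backend_io("session.nwb", backend_kwargs=options)
print(zarr_io.opened_with)
print(hdf5_io.opened_with)
```

Filtering on `v is not None` rather than on truthiness is what lets `load_namespaces=False` survive: callers can pass every key unconditionally, and an explicit `False` still reaches the HDF5 backend.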