
Add tutorial notebook for accessing VBN raw data #2765

Open
bjhardcastle wants to merge 16 commits into AllenInstitute:rc/2.16.3 from bjhardcastle:vbn_raw_data_access

Conversation

@bjhardcastle

Member

Overview:

Add a notebook that demonstrates how to access raw data for the Visual Behavior Neuropixels project.
It covers three methods:

  1. browsing the bucket through a website
  2. using the AWS CLI to download files
  3. using boto3 to stream files
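
As a rough illustration of the third method, the sketch below streams an S3 object in chunks rather than downloading it whole. It is not taken from the notebook: the helper names are hypothetical, the bucket and key are left to the caller, and anonymous (unsigned) access is an assumption that only holds if the bucket is public.

```python
from typing import Iterator


def make_anonymous_s3_client():
    """Build an S3 client that sends unsigned requests (public buckets only)."""
    # Imported lazily so the streaming helper below does not require boto3
    # just to be imported or tested with a stand-in client.
    import boto3
    from botocore import UNSIGNED
    from botocore.config import Config

    return boto3.client("s3", config=Config(signature_version=UNSIGNED))


def stream_s3_object(client, bucket: str, key: str,
                     chunk_size: int = 1024 * 1024) -> Iterator[bytes]:
    """Yield the object's bytes chunk by chunk instead of loading it all at once."""
    body = client.get_object(Bucket=bucket, Key=key)["Body"]
    try:
        yield from body.iter_chunks(chunk_size=chunk_size)
    finally:
        body.close()


def count_bytes(client, bucket: str, key: str) -> int:
    """Example consumer: total number of bytes seen while streaming."""
    return sum(len(chunk) for chunk in stream_s3_object(client, bucket, key))
```

With a real client this would be called as `count_bytes(make_anonymous_s3_client(), bucket, key)`, where `bucket` and `key` are the names shown in the notebook or the bucket browser.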

Type of Fix:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing
    functionality to not work as expected)
  • Documentation Change

Changes:

  • adds one .ipynb with embedded images

Validation:

Manually tested code with

  • aws-cli/1.18.69
  • Python/3.8.10
  • Linux/6.1.155-176.282.amzn2023.x86_64
  • botocore/1.16.19

Checklist

  • My code follows
    Allen Institute Contribution Guidelines
  • My code is unit tested and does not decrease test coverage
  • I have performed a self review of my own code
  • My code is well-documented, and the docstrings conform to
    Numpy Standards
  • I have updated the documentation of the repository where
    appropriate
  • The header on my commit includes the issue number
  • My Pull Request has the latest AllenSDK release candidate branch
    rc/x.y.z as its merge target
  • My code passes all AllenSDK tests

Notes:

@review-notebook-app


Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@CLAassistant

CLAassistant commented Nov 25, 2025


CLA assistant check
All committers have signed the CLA.

bjhardcastle and others added 12 commits February 9, 2026 12:08
Three pinned dependencies blocked installation on Python 3.12+:
- numpy<1.24: no cp312 wheels, distutils removal kills source build
- pandas==1.5.3: no cp312 wheels, same distutils issue
- scipy<1.11: requires_python="<3.12", resolver rejects outright

Dependency changes:
- numpy: uncap (numpy 2.x supported after source fixes below)
- pandas: relax to >=1.5.3,<3 (2.x supported; 3.0 needs further work)
- scipy: uncap (1.14+ supported after interp2d removal)
- pynwb: cap <3 to avoid breaking API changes
- xarray: unpin <2023.2.0 (pure Python, no breakage found)
- pytz: add as explicit dep (was transitive via pandas 1.x)
- setuptools: cap <81 (pkg_resources removed in 81+)

Source fixes for removed/changed APIs:
- scipy.interpolate.interp2d → RectBivariateSpline (removed in 1.14)
- ConfigParser.readfp → read_string (removed in Python 3.12)
- DataFrame.iteritems() → items() (removed in pandas 2.0)
- DataFrame.append() → pd.concat() (removed in pandas 2.0)
- pd.to_datetime: add format="ISO8601" for mixed-precision timestamps
  (pandas 2.x rejects mixed microsecond precision without explicit format)
- pd.to_datetime: fix utc="True" (string) → utc=True (bool) bug in
  behavior_project_cloud_api.py
- np.int() → int() (removed in numpy 1.24)
- np.NaN/np.NAN → np.nan (removed in numpy 2.0)
- np.Inf → np.inf, np.string_ → np.bytes_ (removed in numpy 2.0)
- np.product → np.prod, np.in1d → np.isin (removed in numpy 2.0)
- np.ediff1d to_begin dtype must match array dtype in numpy 2.0
- np.VisibleDeprecationWarning → try/except import (removed in 2.0)
- nwbfile.modules → .processing (pynwb 2.x deprecation)
- IndexSeries unit='None' → 'N/A' (pynwb 2.5+ validation)
- np.linalg.norm([array, scalar]) → np.vstack (numpy 1.24+ rejects)
- aiohttp.ClientSession → lazy property (aiohttp 3.9+ event loop)
- groupby().apply() → select column first (pandas 2.2+ behavior)
- demixer: add np.isfinite guard after linalg.solve (some LAPACK
  implementations return inf instead of raising LinAlgError for
  singular matrices)
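
To make one of the source fixes above concrete, here is a minimal sketch of the `ConfigParser.readfp` to `read_string` change. It is an illustration only, not the AllenSDK code: the `parse_ini` helper and the `[cache]` section are invented for the example.

```python
from configparser import ConfigParser


def parse_ini(text: str) -> ConfigParser:
    """Parse INI-formatted text without the removed ConfigParser.readfp()."""
    parser = ConfigParser()
    # Before (fails on Python 3.12+): parser.readfp(io.StringIO(text))
    parser.read_string(text)
    return parser


# Hypothetical config content, used only to exercise the helper.
config = parse_ini("[cache]\nmanifest = manifest.json\n")
manifest_name = config["cache"]["manifest"]
```

When the input is an open file object rather than a string, `parser.read_file(fh)` is the corresponding replacement.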

Notebook fixes:
- IPython.core.display → IPython.display (removed in IPython 8.x)
  in visual_behavior_compare_across_trial_types and
  visual_behavior_mouse_history notebooks

Test fixes:
- pytest.warns(None) → warnings.catch_warnings (pytest 8)
- mock.called_once_with → assert_called_once() (proper assertion)
- Add res.x to MagicMock for scipy.optimize result
- rng.choice(inhomogeneous) → rng.integers + index (numpy 1.24+)
- Widen curve-fit tolerances for platform variance
- Cast dtypes for pynwb 2.x roundtrip and pandas 2.x index changes
- subprocess 'python' → sys.executable for venv correctness
- Replace flaky test_demix_raises_warning_for_singular_matrix with
  a result-based assertion in test_demix_point
- from mock import → from unittest.mock import (69 test files)
- Uncap/modernize test dependency bounds, remove dead weight
- Add pandas 2.x datetime regression tests (test_pandas_compat.py)
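
The first two test fixes can be sketched with the standard library alone. This is a hedged illustration, not the actual test code: `call_without_warnings`, `noisy`, `quiet`, and `notify` are hypothetical names.

```python
import warnings
from unittest.mock import Mock


def call_without_warnings(func, *args, **kwargs):
    """Run func, turning any warning it emits into an exception.

    Stands in for the old `with pytest.warns(None):` idiom, which pytest 8
    no longer accepts.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error")
        return func(*args, **kwargs)


def noisy():
    warnings.warn("deprecated", DeprecationWarning)
    return 1


def quiet():
    return 2


def notify(callback):
    callback("done")


callback = Mock()
notify(callback)
# A real assertion: raises AssertionError unless called exactly once.
callback.assert_called_once()
callback.assert_called_once_with("done")
```

Here `call_without_warnings(quiet)` returns normally, while `call_without_warnings(noisy)` raises `DeprecationWarning`, so a test fails if unexpected warnings appear.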

CI:
- Update to actions/checkout@v4, actions/setup-python@v5
- Slim matrix to min/max Python (3.10, 3.13) on ubuntu, 3.13 on
  macOS/windows
- Add pip caching, coverage on single matrix element only
- Modernize notebook_runner.yml to Python 3.13
- Update nightly.yml to checkout@v4
- Switch notebook runner to ubuntu-latest-8x (32GB RAM) for
  ecephys notebooks that peak at ~21GB RSS
- Change notebook workflow trigger to push on master (expensive job
  should not run on every pull request)

Fixes AllenInstitute#2747, AllenInstitute#2744, AllenInstitute#2746, AllenInstitute#2754

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…GH-2744/bugfix/install-on-python-312

GH AllenInstitute#2744: Fix installation on Python 3.12, 3.13
setup.py imports allensdk at build time to read __version__, breaking
clean installs. Dependencies live in five separate requirements files
instead of package metadata. 18 files use pkg_resources at runtime,
forcing a setuptools runtime dependency.

- Replace setup.py/setup.cfg/MANIFEST.in with pyproject.toml (hatchling)
- Consolidate requirements files into optional dependency groups
- Replace pkg_resources.resource_filename with importlib.resources.files
- Replace pkg_resources.parse_version with packaging.version.Version
- Read __version__ from importlib.metadata instead of hardcoding
- Convert bps bash wrapper to a [project.scripts] console entry point
- Update CI workflows, docs, and notebook cells for new install path
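
The `pkg_resources` replacements listed above can be approximated as follows. This is a sketch under assumptions rather than the code in the commit: the helper names and the `"unknown"` fallback are invented, and the standard-library `json` package stands in for a real data-carrying package.

```python
from importlib.metadata import PackageNotFoundError, version
from importlib.resources import files


def package_version(dist_name: str, default: str = "unknown") -> str:
    """Read the installed version from package metadata instead of hardcoding it."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed (for example, running from a bare source checkout).
        return default


def resource_path(package: str, name: str):
    """Locate a file shipped inside a package.

    Replaces pkg_resources.resource_filename(package, name).
    """
    return files(package).joinpath(name)


# Stand-in usage with a stdlib package; a real caller would pass its own package.
init_file = resource_path("json", "__init__.py")
```

Neither call needs setuptools at runtime, which is the point of the change described above.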

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…oject.toml

Migrate from setup.py to pyproject.toml (hatchling)
Remove all Python 2 compatibility shims (six and future packages)
and replace with stdlib equivalents:

- six.iteritems(d) → d.items()
- six.itervalues(d) → d.values()
- six.string_types → str
- six.text_type → str
- six.moves.xrange → range
- six.moves.reduce → functools.reduce
- six.moves.builtins → builtins
- six.moves.cPickle → pickle
- six.raise_from(X, e) → raise X from e
- six.ensure_str(s) → s
- past.builtins.xrange → range
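
A few of these substitutions, shown together in a small sketch. The functions are hypothetical and exist only to exercise the stdlib equivalents; they are not from the AllenSDK source.

```python
from functools import reduce  # replaces six.moves.reduce


def merge_counts(dicts):
    """Sum per-key counts across several dicts."""
    def add(acc, d):
        for key, value in d.items():  # replaces six.iteritems(d)
            acc[key] = acc.get(key, 0) + value
        return acc

    return reduce(add, dicts, {})


def parse_port(raw):
    """Convert a string to an int, chaining the original error on failure."""
    if not isinstance(raw, str):  # replaces isinstance(raw, six.string_types)
        raise TypeError("port must be a string")
    try:
        return int(raw)
    except ValueError as err:
        # replaces six.raise_from(RuntimeError(...), err)
        raise RuntimeError(f"invalid port: {raw!r}") from err
```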

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make it clear that this package is in maintenance mode.

Co-authored-by: David Feng <dyf@users.noreply.github.com>
* fix: apply safe ruff auto-fixes

* fix: E711, E731 (none-comparison and lambda assignment)

* fix: E712 (boolean comparison) with idiomatic pandas

* fix: F841 (unused variables)

Removing the unused variable bindings left some right-hand sides behind as bare expressions. Those without side effects were deleted (entire line removed):

File: brain_observatory/circle_plots.py
Deleted expression: r * np.cos(angle), r * np.sin(angle)
Why: Pure arithmetic
────────────────────────────────────────
File: brain_observatory/ecephys/.../view_blocks.py
Deleted expression: recorded_blocks[0] → pass
Why: Bare index in else clause
────────────────────────────────────────
File: brain_observatory/eye_tracking/.../DLC_Ellipse_Fitting.py
Deleted expression: 1-frac_completed
Why: Pure arithmetic
────────────────────────────────────────
File: internal/brain_observatory/demix_report.py
Deleted expression: np.zeros(mask.shape)
Why: Array allocated and discarded
────────────────────────────────────────
File: internal/brain_observatory/frame_stream.py
Deleted expression: np.prod(self.frame_shape)
Why: Pure computation
────────────────────────────────────────
File: internal/brain_observatory/roi_filter.py
Deleted expression: create_feature_array(...) → pass
Why: Returns value, no mutation
────────────────────────────────────────
File: internal/ephys/plot_qc_figures.py
Deleted expression: cell_features["long_squares"]["sweeps"]
Why: Dict lookup
────────────────────────────────────────
File: internal/ephys/plot_qc_figures3.py
Deleted expression: cell_features["long_squares"]["sweeps"]
Why: Dict lookup
────────────────────────────────────────
File: internal/model/GLM.py
Deleted expression: kbasprs['b']
Why: Dict lookup
────────────────────────────────────────
File: internal/model/biophysical/fit_stage_1.py
Deleted expression: neuronal_model_data['specimen_id']
Why: Dict lookup
────────────────────────────────────────
File: internal/model/biophysical/make_deap_fit_json.py
Deleted expression: os.path.realpath(os.curdir)
Why: Pure path op
────────────────────────────────────────
File: internal/model/biophysical/passive_fitting/preprocess.py
Deleted expression: down_idxs[1] - down_idxs[0]
Why: Pure arithmetic
────────────────────────────────────────
File: internal/model/glif/error_functions.py
Deleted expression: input_data['subthreshold_long_square_voltage_variance'], np.arange(...)*experiment.neuron.dt, [e.data['interpolated_ISI']] (dead code after raise)
Why: Dict lookup, pure computation, unreachable
────────────────────────────────────────
File: internal/model/glif/preprocess_neuron.py
Deleted expression: sweep_index[...][RESTING_POTENTIAL]*1e-3, long_square_config['all'], long_square_config['subthreshold'], np.mean(El_test_list)
Why: Dict lookups, pure computation
────────────────────────────────────────
File: internal/model/glif/spike_cutting.py
Deleted expression: np.var(xdata)
Why: Pure computation
────────────────────────────────────────
File: internal/pipeline_modules/IVSCC/ephys_nwb/qc.py
Deleted expression: sweep_data['response'], sweep_data['sampling_rate']
Why: Dict lookups
────────────────────────────────────────
File: internal/pipeline_modules/run_neuropil_correction.py
Deleted expression: np.array([...]).mean()
Why: Pure computation
────────────────────────────────────────
File: internal/pipeline_modules/run_observatory_container_thumbnails.py
Deleted expression: input_file['output_json']
Why: Dict lookup
────────────────────────────────────────
File: model/glif/glif_neuron_methods.py
Deleted expression: tcs['voltage'][-1]
Why: Dict/index lookup
────────────────────────────────────────
File: brain_observatory/behavior/swdb/summary_figures.py
Deleted expression: int(session.metadata['ophys_frame_rate'])
Why: Pure type conversion

Bare expressions kept (have side effects):

┌────────────────────────────────────────────────────────┬──────────────────────────────────┐
│                       Expression                       │             Why kept             │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ hero_sweep.sweep_feature("adapt"/"latency"/"mean_isi") │ Forces lazy feature caching      │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ swp.sweep_feature("v_baseline")                        │ Same caching pattern             │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ i_vec.as_numpy()                                       │ NEURON vector conversion         │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ subprocess.check_output([...]) (3 calls)               │ Runs external fitting processes  │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ p.map(func, types)                                     │ Multiprocessing execution        │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ description.manifest.get_path(...)                     │ Manifest path resolution         │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ plt.subplot2grid(...)                                  │ Creates subplot in figure layout │
├────────────────────────────────────────────────────────┼──────────────────────────────────┤
│ All allensdk/test/ expressions                         │ Smoke tests exercising APIs      │
└────────────────────────────────────────────────────────┴──────────────────────────────────┘
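
Why a bare expression such as `swp.sweep_feature("v_baseline")` has to stay can be shown with a toy example. `LazySweep` below is a hypothetical stand-in for an object that computes features on demand; it is not the AllenSDK class.

```python
class LazySweep:
    """Hypothetical object that computes and caches features on first access."""

    def __init__(self, voltages):
        self._voltages = list(voltages)
        self._features = {}

    def sweep_feature(self, name):
        # First call computes and stores the value; later calls reuse it.
        if name not in self._features:
            if name != "v_baseline":
                raise KeyError(name)
            self._features[name] = sum(self._voltages) / len(self._voltages)
        return self._features[name]

    def as_dict(self):
        # Only reports features that were already computed.
        return dict(self._features)


swp = LazySweep([-70.0, -72.0])
swp.sweep_feature("v_baseline")  # bare expression, kept: it fills the cache
features = swp.as_dict()
```

Deleting the bare call would leave `features` empty, even though the call's return value is never used.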

* fix: E722, E701, F401, F507 lint errors and configure suppressions

- E722: Replace 55 bare `except:` with `except Exception:` (32 files)
- E701: Split 29 single-line compound statements onto separate lines
- F401: Add `__all__` to 5 `__init__.py` re-export files, noqa 2 keras
  availability checks
- F507: Fix format string placeholder mismatch in glif_optimizer_neuron
- Configure ruff and flake8 to suppress E402, E741, F403, F405
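
The E722 change is more than cosmetic: a bare `except:` also catches `SystemExit` and `KeyboardInterrupt`, while `except Exception:` lets them propagate. A minimal sketch, using a hypothetical helper:

```python
def call_or_default(func, default=None):
    """Call func, returning default if it raises an ordinary exception."""
    try:
        return func()
    except Exception:
        # SystemExit and KeyboardInterrupt derive from BaseException,
        # so they are not swallowed here.
        return default


fallback = call_or_default(lambda: 1 / 0, default=-1)
```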

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: David Feng <dyf@users.noreply.github.com>
Trivial change to re-trigger CI

5 participants