
feat: Adding support for Annbatch #3620

Open
ori-kron-wis wants to merge 56 commits into main from Ori-annbatch

Conversation

@ori-kron-wis
Collaborator

No description provided.

@ori-kron-wis ori-kron-wis self-assigned this Nov 25, 2025
@ori-kron-wis ori-kron-wis added the on-merge: backport to 1.4.x and custom_dataloader PR 2932 labels Nov 25, 2025
@codecov

codecov Bot commented Nov 25, 2025

Codecov Report

❌ Patch coverage is 84.19118% with 129 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.08%. Comparing base (612157b) to head (724c9c6).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/scvi/model/_peakvi.py | 75.59% | 31 Missing ⚠️ |
| src/scvi/model/base/_rnamixin.py | 73.87% | 29 Missing ⚠️ |
| src/scvi/model/base/_base_model.py | 85.38% | 25 Missing ⚠️ |
| src/scvi/dataloaders/_custom_dataloaders.py | 90.39% | 22 Missing ⚠️ |
| src/scvi/external/scar/_model.py | 69.23% | 12 Missing ⚠️ |
| src/scvi/external/mrvi_torch/_model.py | 87.75% | 6 Missing ⚠️ |
| src/scvi/external/sysvi/_model.py | 90.90% | 2 Missing ⚠️ |
| src/scvi/model/base/_vaemixin.py | 90.90% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3620      +/-   ##
==========================================
- Coverage   88.23%   88.08%   -0.15%     
==========================================
  Files         229      229              
  Lines       22646    23373     +727     
==========================================
+ Hits        19981    20589     +608     
- Misses       2665     2784     +119     
| Flag | Coverage Δ |
|---|---|
| custom_dataloader | 39.60% <78.06%> (+4.08%) ⬆️ |
| integration | 69.67% <20.58%> (-1.81%) ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| src/scvi/dataloaders/__init__.py | 100.00% <100.00%> (ø) |
| src/scvi/model/_autozi.py | 98.83% <100.00%> (+0.04%) ⬆️ |
| src/scvi/model/_linear_scvi.py | 97.91% <100.00%> (+0.13%) ⬆️ |
| src/scvi/model/base/_save_load.py | 87.37% <100.00%> (-0.75%) ⬇️ |
| src/scvi/model/base/_training_mixin.py | 92.20% <100.00%> (+0.65%) ⬆️ |
| src/scvi/train/_trainingplans.py | 83.78% <100.00%> (-0.09%) ⬇️ |
| src/scvi/external/sysvi/_model.py | 97.14% <90.90%> (-2.86%) ⬇️ |
| src/scvi/model/base/_vaemixin.py | 96.02% <90.90%> (-0.88%) ⬇️ |
| src/scvi/external/mrvi_torch/_model.py | 88.98% <87.75%> (+0.67%) ⬆️ |
| src/scvi/external/scar/_model.py | 84.61% <69.23%> (-6.01%) ⬇️ |
| ... and 4 more | |

... and 1 file with indirect coverage changes


@ori-kron-wis ori-kron-wis marked this pull request as ready for review February 17, 2026 11:21

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6ee2098c75


Comment thread src/scvi/dataloaders/_custom_dataloaders.py Outdated
Comment thread src/scvi/dataloaders/_custom_dataloaders.py Outdated
@ori-kron-wis
Collaborator Author

ori-kron-wis commented Apr 28, 2026

The ResolVI annbatch implementation still loads data into memory and can't be based on annbatch, despite what was committed here. There is no real gain from it right now; we need to think about a graph dataloader for it, as @canergen suggested.
Can it then be optimized with the annbatch backend? This is what we would expect to be in scVIVA (scvi-tools) as a spatial dataloader, @ilan-gold (until we think about spatialdata loader support).

I will remove those commits from here and put them on a side branch for later use: Ori-annbatch-resolvi

ori-kron-wis and others added 14 commits April 28, 2026 12:32
Added sample_key parameter to BaseModelClass.setup_annbatch to enable
sample-level metadata tracking in annbatch datamodules. This is required
for models like MrVI that operate on sample-level information.

Also fixed import issues with anndata by updating to use experimental
module for CSCDataset, CSRDataset, and read_elem.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Modified CellAssign.__init__ to accept registry= parameter
- Added setup_annbatch method to compute streaming statistics (col_means, basis_means)
- Modified train() to accept datamodule parameter
- Added test_annbatch_setup_cellassign

Note: Test fails due to missing size_factor field in annbatch datamodule
This requires custom datamodule similar to ContrastiveVI/VELOVI

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
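The streaming-statistics setup pass described above can be sketched as follows. This is a minimal illustration of computing per-column means over minibatches without materializing the full matrix; the function name and shape are assumptions, not the actual scvi-tools CellAssign code.

```python
import numpy as np

def streaming_col_means(batches):
    """Accumulate per-gene (column) means over an iterator of minibatches.

    Mirrors the idea of a one-pass setup statistic (e.g. col_means) computed
    from an annbatch-style loader, keeping only running sums in memory.
    """
    total = None
    n_obs = 0
    for x in batches:
        x = np.asarray(x, dtype=np.float64)
        batch_sum = x.sum(axis=0)
        total = batch_sum if total is None else total + batch_sum
        n_obs += x.shape[0]
    return total / n_obs

# The result matches the mean over the concatenated matrix:
batches = [np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[5.0, 6.0]])]
means = streaming_col_means(batches)  # equals np.vstack(batches).mean(axis=0)
```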
… into memory

ad.experimental.read_lazy fails on h5ad files with zero-shape datasets in /uns
(e.g. Decipher tutorial data with /uns/decipher/config/layers_z_to_x). Replace
with direct h5py read of only the var group — reads index column via _index
attr, with fallback to common names. Keeps data off memory per annbatch design.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
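A minimal sketch of the fallback described in that commit: read only the var group's index column with h5py, so the data matrix never enters memory. The function name and the list of fallback column names are assumptions for illustration, not the actual implementation.

```python
import h5py

def read_var_index(path):
    """Read the var index of an h5ad file via h5py, without read_lazy.

    Prefers the column named by the group's `_index` attribute, falling
    back to common index column names; only the var group is touched.
    """
    with h5py.File(path, "r") as f:
        var = f["var"]
        index_col = var.attrs.get("_index")
        if index_col is None:
            # Fallback to common names when the attribute is missing.
            for candidate in ("_index", "index", "var_names"):
                if candidate in var:
                    index_col = candidate
                    break
        names = var[index_col][:]
        return [n.decode() if isinstance(n, bytes) else n for n in names]
```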
Comment thread src/scvi/dataloaders/_custom_dataloaders.py Outdated
@ori-kron-wis ori-kron-wis added custom_dataloader PR 2932 and removed custom_dataloader PR 2932 labels Apr 30, 2026
ori-kron-wis and others added 12 commits April 30, 2026 16:26
- _save_load.py: always override serialized datamodule string with actual
  datamodule arg so SCANVI and other models with datamodule param don't
  receive a string object at load time
- _base_model.py: skip module recreation in setup_datamodule load path
  when no datamodule is provided (_initialize_model already builds correct
  module from registry); preserve library_log_means/vars buffers when
  recreating with datamodule
- _training_mixin.py: handle adata=None + datamodule=None load path in
  _set_indices_and_labels by using categorical_mapping (without unlabeled
  category) so n_labels matches training-time computation
- revert docs/tutorials/notebooks submodule to origin/main pointer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
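The _save_load.py fix above can be sketched as a small load-time guard: if the serialized init params carry the datamodule as a string (its repr from save time), it is replaced with the actual datamodule argument. Names here are hypothetical, not scvi-tools internals.

```python
def resolve_datamodule(saved_init_params, datamodule):
    """Return init params with any string-serialized datamodule replaced.

    At save time a datamodule may be serialized as a string; at load time
    that string must be overridden by the real datamodule object so models
    with a datamodule parameter (e.g. SCANVI) don't receive a str.
    """
    params = dict(saved_init_params)
    if isinstance(params.get("datamodule"), str):
        params["datamodule"] = datamodule
    return params
```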

Labels

custom_dataloader PR 2932, on-merge: backport to 1.4.x

2 participants