refactor: use Array API-compatible tabulate math #5366

wanghan-iapcm merged 11 commits into deepmodeling:master from
Conversation
Port the shared tabulation math helpers to Array API-compatible operators, allow PT/PT_EXPT tabulation to pick a torch backend for the math path, and add regression tests that verify the helpers stay backend-agnostic.

Tested:
- `. .venv/bin/activate && PYTHONPATH=. pytest -q source/tests/common/test_tabulate_math_array_api.py`
- `python3 -m py_compile deepmd/utils/tabulate_math.py deepmd/pt/utils/tabulate.py deepmd/pt_expt/utils/tabulate.py source/tests/common/test_tabulate_math_array_api.py`

Refs deepmodeling#5352

Authored by OpenClaw (model: gpt-5.4)
> Note: Reviews paused. This branch appears to be under active development. To avoid overwhelming you with review comments during an influx of new commits, CodeRabbit has automatically paused this review. This behavior can be configured in the settings.
📝 Walkthrough

Reworked shared tabulation math to use the Array API, added device-aware backend sampling and backend caching for PyTorch in the PT and PT_EXPT tabulate overrides, and added Array API parity tests validating derivatives and stable-sigmoid behavior.

Sequence diagram:

```mermaid
sequenceDiagram
    participant PT as PT DPTabulate
    participant PTX as PT_EXPT DPTabulate
    participant TabMath as Shared tabulate_math
    participant XP as array_api_compat (xp)
    participant Torch as torch/device
    participant NumPy as to_numpy_array
    PT->>TabMath: invoke tabulation (uses _get_math_backend_sample)
    PTX->>TabMath: invoke tabulation (uses _get_math_backend_sample)
    TabMath->>XP: resolve array namespace from sample
    XP->>Torch: allocate/operate on device when backend is torch
    TabMath->>XP: perform Array API ops (matmul, reshape, exp, where, concat, etc.)
    TabMath->>NumPy: convert results via to_numpy_array()
    NumPy-->>PT: return NumPy arrays to caller
```
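The walkthrough above mentions parity tests for stable-sigmoid behavior. A minimal sketch of the usual numerically stable formulation, written against an explicitly passed Array API namespace; the function name and the explicit `xp` parameter are illustrative, not the actual deepmd helpers:

```python
import numpy as np


def stable_sigmoid(xp, x):
    # Mask each branch so exp() only ever sees non-positive arguments,
    # avoiding overflow for large-magnitude inputs on any backend.
    pos = xp.where(x >= 0, x, xp.zeros_like(x))
    neg = xp.where(x < 0, x, xp.zeros_like(x))
    return xp.where(
        x >= 0,
        1.0 / (1.0 + xp.exp(-pos)),         # x >= 0: classic form, exp(-x) <= 1
        xp.exp(neg) / (1.0 + xp.exp(neg)),  # x <  0: rewritten form, exp(x) <= 1
    )


x = np.array([-800.0, 0.0, 800.0])
y = stable_sigmoid(np, x)  # no overflow, even at +/-800
```

Only `where`, `exp`, and `zeros_like` are used, all of which are part of the Array API standard, so the same code path works for NumPy, array_api_strict, or torch namespaces.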
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks: ✅ 4 passed | ❌ 1 failed (1 warning)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@source/tests/common/test_tabulate_math_array_api.py`:
- Around line 15-66: The test helper currently injects fake modules (deepmd_pkg,
dpmodel_pkg, dpmodel_common, dpmodel_utils_pkg, dpmodel_network, utils_pkg,
utils_tabulate) directly into sys.modules and never restores the original
entries, leaking the shim into other tests; after creating and registering those
module objects and after spec.loader.exec_module(module) completes (and before
returning module), record the original sys.modules values for the keys you
touched (e.g. "deepmd", "deepmd.dpmodel", "deepmd.dpmodel.common",
"deepmd.dpmodel.utils", "deepmd.dpmodel.utils.network", "deepmd.utils",
"deepmd.utils.tabulate") and then restore them (reassign originals if present,
or delete newly added keys) so the fake modules do not persist; alternatively
replace these sys.modules assignments with monkeypatch.setitem calls or use a
context-manager that cleans up on exit, ensuring unique symbols deepmd_pkg,
dpmodel_pkg, dpmodel_common, dpmodel_utils_pkg, dpmodel_network, utils_pkg,
utils_tabulate are removed/restored before returning.
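The context-manager cleanup suggested above can be sketched as follows; `patched_module` and the demo module name are illustrative, not part of the test under review:

```python
import contextlib
import sys
import types


@contextlib.contextmanager
def patched_module(name, module):
    """Install a stub in sys.modules and restore the original entry on exit."""
    sentinel = object()
    original = sys.modules.get(name, sentinel)
    sys.modules[name] = module
    try:
        yield module
    finally:
        if original is sentinel:
            # key was newly added by us: remove it entirely
            sys.modules.pop(name, None)
        else:
            # key pre-existed: put the real module back
            sys.modules[name] = original


stub = types.ModuleType("fake_pkg_for_demo")
with patched_module("fake_pkg_for_demo", stub):
    in_context = sys.modules["fake_pkg_for_demo"] is stub
after_context = "fake_pkg_for_demo" in sys.modules
```

Inside a pytest test, `monkeypatch.setitem(sys.modules, name, stub)` achieves the same cleanup automatically.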
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: bbadced2-1c74-4f11-b5e2-d9d76fc454d9
📒 Files selected for processing (4)
- deepmd/pt/utils/tabulate.py
- deepmd/pt_expt/utils/tabulate.py
- deepmd/utils/tabulate_math.py
- source/tests/common/test_tabulate_math_array_api.py
The regression test for tabulate_math loaded stub deepmd modules by writing directly into sys.modules at import time and never restored them. During pytest collection this polluted later imports, causing widespread ImportError failures from deepmd.dpmodel.common/utils.

Restore any pre-existing modules and remove temporary stubs after the module-under-test is loaded so the test remains self-contained. Also wrap one long assertion line for style.

Tested:
- `python3 -m py_compile source/tests/common/test_tabulate_math_array_api.py`

Fixes the CI failures on PR deepmodeling#5366.

Authored by OpenClaw (model: gpt-5.4)
Codecov Report

❌ Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##           master    #5366      +/-   ##
==========================================
+ Coverage   82.32%   82.38%    +0.05%
==========================================
  Files         811      812        +1
  Lines       83252    83611      +359
  Branches     4067     4090       +23
==========================================
+ Hits        68537    68881      +344
- Misses      13497    13508       +11
- Partials     1218     1222        +4
```

☔ View full report in Codecov by Sentry.
Replace the hand-rolled stub-based tabulate_math regression test with an array_api_strict-backed test, matching the repository's existing Array API test strategy. This removes the need for synthetic sys.modules injection while still verifying that the tabulate math helpers execute correctly on strict Array API arrays and stay numerically consistent with the NumPy reference path.

Tested:
- `python3 -m py_compile source/tests/common/test_tabulate_math_array_api.py`

Authored by OpenClaw (model: gpt-5.4)
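The parity-test strategy described in this commit can be sketched like this; `grad_tanh` is an illustrative helper rather than the real tabulate math, and the array_api_strict leg is guarded because that package may not be installed everywhere:

```python
import numpy as np


def grad_tanh(xp, x):
    # d/dx tanh(x) = 1 - tanh(x)^2, written with Array API operators only
    t = xp.tanh(x)
    return 1.0 - t * t


# NumPy reference path
x_np = np.linspace(-2.0, 2.0, 9)
ref = grad_tanh(np, x_np)

# Strict Array API path: run the identical helper on strict arrays and
# compare element-wise against the NumPy reference.
try:
    import array_api_strict as xps
except ImportError:  # keep the sketch runnable without the package
    xps = None

if xps is not None:
    got = grad_tanh(xps, xps.asarray(x_np))
    for i in range(x_np.shape[0]):
        assert abs(float(got[i]) - float(ref[i])) < 1e-12
```

Because array_api_strict deliberately rejects any operation outside the standard, a helper that silently falls back to NumPy-only behavior fails loudly on the strict leg.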
Run repository formatting hooks after switching the tabulate math regression test to array_api_strict-backed coverage. Authored by OpenClaw (model: gpt-5.4)
Pull request overview
This PR refactors deepmd/utils/tabulate_math.py to use Array API–compatible operators (via array_api_compat) so the shared tabulation math can execute on non-NumPy backends, and updates PyTorch wrapper classes to select a torch-backed execution sample/device. It also adds a regression test to ensure the helper math uses the provided Array API namespace (rather than silently falling back to NumPy).
Changes:
- Port activation-derivative and chain-rule helper functions to Array API-compatible ops (`array_api_compat.array_namespace`, `xp.matmul`, `xp.where`, `xp.concat`, etc.).
- Add backend/device selection hooks so PT / pt_expt wrappers can run the shared math on the active torch device.
- Add `array_api_strict`-based tests validating correctness vs NumPy and guarding against NumPy fallback.
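The backend-resolution pattern described above can be sketched as follows; `chain_rule_step` is an illustrative stand-in, not the actual tabulate_math API, and the import falls back to plain NumPy so the sketch runs even without array_api_compat installed:

```python
import numpy as np

try:
    from array_api_compat import array_namespace
except ImportError:  # assumption: fall back to NumPy when the shim is absent
    def array_namespace(*arrays):
        return np


def chain_rule_step(dy_dx, w):
    # Resolve the Array API namespace from the inputs, then express the math
    # with portable operators; the same helper then serves NumPy or torch
    # callers without any backend-specific branches.
    xp = array_namespace(dy_dx, w)
    # propagate a derivative through a linear layer: d(y @ w)/dx = dy_dx @ w
    return xp.matmul(dy_dx, w)


dy = np.ones((2, 3))
w = np.eye(3)
out = chain_rule_step(dy, w)
```

A torch caller would pass torch tensors on its active device; `array_namespace` then resolves the torch namespace and the matmul runs on that device, which is the role of the `_get_math_backend_sample` hooks in this PR.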
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| deepmd/utils/tabulate_math.py | Refactors tabulation math to Array API ops and adds backend/device selection + backend conversions for matrices/biases. |
| deepmd/pt/utils/tabulate.py | Overrides backend-sample selection so shared math runs on the current torch device. |
| deepmd/pt_expt/utils/tabulate.py | Adds backend-sample selection for the pt_expt torch device. |
| source/tests/common/test_tabulate_math_array_api.py | New regression tests comparing Array API Strict results to NumPy for key helpers. |
- remove the unused array_api_strict package import from the regression test
- release the original NumPy matrix/bias caches after materializing backend copies to avoid retaining duplicate weight storage during tabulation

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)
Actionable comments posted: 2
🧹 Nitpick comments (1)
deepmd/utils/tabulate_math.py (1)
356-358: Reuse `xbar`/`ybar` instead of doing the same `matmul` twice.

`_make_data()` already computes the preactivation for derivatives, then `_layer_0()`/`_layer_1()` immediately rebuild `matmul(x, w) + b` again. In this hot path that doubles the expensive backend op per layer and blunts the acceleration benefit of the Array API refactor.

⚡ Refactor sketch

```diff
- yy = self._layer_0(xx, matrix, bias) + xx
+ yy = self._activation_fn(xbar) + xx
  ...
- tt, yy = self._layer_1(xx, matrix, bias)
+ tt = xp.concat((xx, xx), axis=1)
+ yy = self._activation_fn(xbar) + tt
  ...
- zz = self._layer_0(yy, matrix, bias) + yy
+ zz = self._activation_fn(ybar) + yy
```

Apply the same pattern to the remaining `_layer_0`/`_layer_1` branches, or change those helpers to accept a precomputed preactivation.

Also applies to: 375-375, 392-392, 407-409, 426-426, 443-443, 467-474
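A minimal sketch of the second option the comment suggests, a helper that accepts a precomputed preactivation; `layer_0` here is an illustrative stand-in for the real `_layer_0`:

```python
import numpy as np


def layer_0(x, w, b, activation, preact=None):
    # Reuse the caller's preactivation when provided; only fall back to
    # recomputing matmul(x, w) + b when the caller did not supply it.
    if preact is None:
        preact = np.matmul(x, w) + b
    return activation(preact)


x = np.ones((4, 2))
w = np.full((2, 3), 0.5)
b = np.zeros(3)

xbar = np.matmul(x, w) + b                    # computed once by the caller
yy = layer_0(x, w, b, np.tanh, preact=xbar)   # no second matmul here
```

The optional-parameter shape keeps every existing call site working while letting the hot path in `_make_data()` skip the duplicate backend op.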
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/utils/tabulate_math.py` around lines 356 - 358, The preactivation (xbar / ybar) computed in _make_data is being recomputed inside _layer_0/_layer_1; change those call sites to reuse the already computed preactivation (xbar / ybar) instead of calling matmul again: update the branches that currently do xbar = xp.matmul(xx, matrix) + bias followed by yy = self._layer_0(xx, matrix, bias) (and similar for _layer_1) to pass the precomputed xbar/ybar into the helper (either by adding a new parameter like preact to _layer_0/_layer_1 or by replacing the helper call with logic that consumes xbar/ybar directly), and apply the same change to all listed occurrences so the expensive matmul is performed only once per layer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/utils/tabulate_math.py`:
- Around line 271-285: The constructor change broke callers that still pass
activation_fn_name; restore backward compatibility by accepting
activation_fn_name: str | None in the __init__ signature alongside activation_fn
(default None), map activation_fn_name -> activation_fn if the latter is not
provided, and then proceed to compute self.functype using ACTIVATION_TO_FUNCTYPE
and self._activation_fn using get_activation_fn; ensure the ValueError branch
still triggers for unknown names and keep self.activation_fn set to the resolved
name so existing code paths (e.g., deepmd/pt/utils/tabulate.py calling
super().__init__(..., activation_fn_name=...)) continue to work.
- Around line 269-270: The parameter exclude_types currently uses a mutable
default list ([]); change its annotation to Optional[list[list[int]]] (import
Optional if needed) and replace the default with None in the function signature,
then at the top of the function initialize it with something like "if
exclude_types is None: exclude_types = []" so the function (the one declaring
parameter exclude_types) no longer uses a mutable default.
---
Nitpick comments:
In `@deepmd/utils/tabulate_math.py`:
- Around line 356-358: The preactivation (xbar / ybar) computed in _make_data is
being recomputed inside _layer_0/_layer_1; change those call sites to reuse the
already computed preactivation (xbar / ybar) instead of calling matmul again:
update the branches that currently do xbar = xp.matmul(xx, matrix) + bias
followed by yy = self._layer_0(xx, matrix, bias) (and similar for _layer_1) to
pass the precomputed xbar/ybar into the helper (either by adding a new parameter
like preact to _layer_0/_layer_1 or by replacing the helper call with logic that
consumes xbar/ybar directly), and apply the same change to all listed
occurrences so the expensive matmul is performed only once per layer.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: cee20baa-5f05-448b-ab3f-a4a35fc2e7e4
📒 Files selected for processing (2)
- deepmd/utils/tabulate_math.py
- source/tests/common/test_tabulate_math_array_api.py
✅ Files skipped from review due to trivial changes (1)
- source/tests/common/test_tabulate_math_array_api.py
Accept the legacy activation_fn_name keyword in the shared DPTabulate constructor so existing pt / pt_expt callers continue to work while the Array API refactor keeps activation_fn as the canonical parameter.

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)
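The compatibility shim this commit describes can be sketched as follows; the class name, activation table, and functype values are illustrative stand-ins for the real deepmd code:

```python
# Hypothetical subset of the real ACTIVATION_TO_FUNCTYPE mapping
ACTIVATION_TO_FUNCTYPE = {"tanh": 1, "gelu": 2}


class DPTabulateSketch:
    def __init__(self, activation_fn=None, activation_fn_name=None):
        if activation_fn is None:
            # legacy keyword still accepted: map it onto the canonical one
            activation_fn = activation_fn_name
        if activation_fn not in ACTIVATION_TO_FUNCTYPE:
            raise ValueError(f"unknown activation function: {activation_fn}")
        self.activation_fn = activation_fn
        self.functype = ACTIVATION_TO_FUNCTYPE[activation_fn]


# both spellings resolve to the same state
legacy = DPTabulateSketch(activation_fn_name="tanh")
canonical = DPTabulateSketch(activation_fn="tanh")
```

Mapping the legacy keyword before validation keeps the ValueError branch intact for unknown names while callers such as `deepmd/pt/utils/tabulate.py` continue passing `activation_fn_name=...` unchanged.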
Replace the mutable exclude_types default with None in the shared DPTabulate constructor and normalize it inside __init__.

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)
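The fix follows the standard Python pattern for mutable default arguments; `build_tables` below is a hypothetical function, not the real constructor:

```python
from typing import Optional


def build_tables(exclude_types: Optional[list[list[int]]] = None) -> list[list[int]]:
    if exclude_types is None:
        exclude_types = []  # fresh list per call, never shared between calls
    exclude_types.append([0, 0])
    return exclude_types
```

With a literal `[]` default, the same list object is reused across calls, so mutations accumulate; normalizing `None` inside the body gives each call its own list.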
Restore helper methods and original descriptor indexing logic that were accidentally dropped while refactoring the shared tabulate math module. This brings back _n_all_excluded, the numpy no-op tensor conversion hook, and the original T/R network-variable indexing used by compression paths.

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)
NetworkCollection flattens 2D embedding indices as ti + ntypes * ii. The compressed se_a/se_e2_a paths were decoding embedding_idx in the opposite order, which made them look up missing keys such as filter_0_net_1 during compression setup.

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)
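The flattening convention this commit describes can be illustrated in isolation; `encode`/`decode` are hypothetical helpers showing why the decode order matters:

```python
ntypes = 3  # illustrative type count


def encode(ti: int, ii: int) -> int:
    # NetworkCollection's stated convention: ti varies fastest
    return ti + ntypes * ii


def decode(embedding_idx: int) -> tuple[int, int]:
    # inverse of encode: peel ti off with modulo, ii with integer division
    return embedding_idx % ntypes, embedding_idx // ntypes


def decode_swapped(embedding_idx: int) -> tuple[int, int]:
    # the opposite-order bug: treats ii as the fast index
    return embedding_idx // ntypes, embedding_idx % ntypes


# e.g. (ti=0, ii=1) encodes to 3; the swapped decoder reads that back
# as (1, 0), producing a lookup for a non-existent network key
idx = encode(0, 1)
```

Only the matching decoder round-trips every index, which is why the swapped variant asked for keys like filter_0_net_1 that were never stored.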
Revert the se_a / se_e2_a consumer-side decoding change introduced in b87834b. The original implementations were working before the shared Array API refactor, so further fixes should stay in the shared tabulate layer instead of mutating these consumer paths.

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)

Do not coerce exclude_types entries to tuples in the shared DPTabulate constructor. The existing compression/build paths compare against the serialized list-based exclude_types representation, and normalizing to tuples changes which A/R tables are skipped during build.

Tested:
- `uvx prek run -a`

Authored by OpenClaw (model: gpt-5.4)
## Summary

- port `deepmd/utils/tabulate_math.py` helpers from NumPy-only operators to Array API-compatible operators
- let the PT / pt_expt tabulation wrappers choose a torch sample backend so the shared math path can execute on torch devices
- add a regression test that verifies the helper functions use the provided Array API namespace rather than silently falling back to NumPy

## Testing

- `. .venv/bin/activate && PYTHONPATH=. pytest -q source/tests/common/test_tabulate_math_array_api.py`
- `python3 -m py_compile deepmd/utils/tabulate_math.py deepmd/pt/utils/tabulate.py deepmd/pt_expt/utils/tabulate.py source/tests/common/test_tabulate_math_array_api.py`

Closes deepmodeling#5352

Authored by OpenClaw (model: gpt-5.4)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Device-aware sampling so tabulation math runs on the active compute device.
* **Refactor**
  * Tabulation math migrated to Array API-compatible implementation with unified backend selection, device-aware tensor handling, and updated derivative propagation.
* **Tests**
  * Added tests validating gradient/chain-rule computations and numerical stability versus NumPy under the Array API backend.
* **Documentation**
  * Module docs updated to reflect Array API compatibility.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->