
refactor: use Array API-compatible tabulate math #5366

Merged
wanghan-iapcm merged 11 commits into deepmodeling:master from
njzjz-bothub:fix/issue-5352-array-api-tabulate-math
Apr 5, 2026

Conversation

@njzjz-bot
Contributor

@njzjz-bot njzjz-bot commented Apr 2, 2026

Summary

  • port deepmd/utils/tabulate_math.py helpers from NumPy-only operators to Array API-compatible operators
  • let the PT / pt_expt tabulation wrappers choose a torch sample backend so the shared math path can execute on torch devices
  • add a regression test that verifies the helper functions use the provided Array API namespace rather than silently falling back to NumPy
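The porting pattern in the first bullet is the standard Array API idiom: resolve the namespace from the input array and route every operator through it. A minimal sketch under that assumption (helper names here are illustrative, not taken from the PR; a plain NumPy return stands in for `array_api_compat.array_namespace`):

```python
import numpy as np

def _array_namespace(x):
    # Stand-in for array_api_compat.array_namespace: the real helpers
    # resolve the namespace (xp) from the input array, so the same math
    # runs unchanged on NumPy, torch, or array_api_strict arrays.
    return np

def preactivation(x, w, b):
    # Backend-agnostic math: every operation goes through xp, never np.
    xp = _array_namespace(x)
    return xp.matmul(x, w) + b

x = np.array([[1.0, -2.0]])
print(preactivation(x, np.eye(2), np.zeros(2)))  # [[ 1. -2.]]
```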

Testing

  • . .venv/bin/activate && PYTHONPATH=. pytest -q source/tests/common/test_tabulate_math_array_api.py
  • python3 -m py_compile deepmd/utils/tabulate_math.py deepmd/pt/utils/tabulate.py deepmd/pt_expt/utils/tabulate.py source/tests/common/test_tabulate_math_array_api.py

Closes #5352

Authored by OpenClaw (model: gpt-5.4)

Summary by CodeRabbit

  • New Features

    • Device-aware sampling so tabulation math runs on the active compute device.
  • Refactor

    • Tabulation math migrated to Array API–compatible implementation with unified backend selection, device-aware tensor handling, and updated derivative propagation.
  • Tests

    • Added tests validating gradient/chain-rule computations and numerical stability versus NumPy under the Array API backend.
  • Documentation

    • Module docs updated to reflect Array API compatibility.

Port the shared tabulation math helpers to Array API compatible operators,
allow PT/PT_EXPT tabulation to pick a torch backend for the math path,
and add regression tests that verify the helpers stay backend-agnostic.

Tested:
- . .venv/bin/activate && PYTHONPATH=. pytest -q source/tests/common/test_tabulate_math_array_api.py
- python3 -m py_compile deepmd/utils/tabulate_math.py deepmd/pt/utils/tabulate.py deepmd/pt_expt/utils/tabulate.py source/tests/common/test_tabulate_math_array_api.py

Refs deepmodeling#5352

Authored by OpenClaw (model: gpt-5.4)
@github-actions github-actions bot added the Python label Apr 2, 2026
@dosubot dosubot bot added the enhancement label Apr 2, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough


Reworked shared tabulation math to use the Array API, added device-aware backend sampling and backend caching for PyTorch in PT and PT_EXPT tabulate overrides, and added Array API parity tests validating derivatives and stable-sigmoid behavior.

Changes

PT tabulate overrides (deepmd/pt/utils/tabulate.py, deepmd/pt_expt/utils/tabulate.py)
Added a _get_math_backend_sample() override that returns a device-allocated float64 torch scalar so the shared Array API tabulation math can detect the torch backend and device; also updated the module docstring in pt_expt.

Shared tabulation math, core (deepmd/utils/tabulate_math.py)
Refactored the NumPy-centric math to the Array API: function signatures accept Any, operations go through the array_api_compat namespace (xp.*), and _repeat_flattened_weight_prefix() plus backend helpers with caching (_get_math_backend_sample, _math_backend_sample, _math_backend_device, _backend_asarray, _matrix_backend, _bias_backend) were added. _make_data, _layer_0, and _layer_1 were adapted to the Array API, and final outputs are converted back to NumPy. Removed _convert_numpy_to_tensor and _n_all_excluded.

Tests, Array API parity (source/tests/common/test_tabulate_math_array_api.py)
New unit tests that run with array_api_strict to assert numerical parity between the strict Array API backend and NumPy for the derivative-propagation helpers and the stable sigmoid on edge inputs.
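The stable-sigmoid edge-case behavior those parity tests exercise can be illustrated with a hypothetical NumPy version (the actual helper in tabulate_math.py may be structured differently):

```python
import numpy as np

def stable_sigmoid(x):
    # Numerically stable sigmoid: branch on sign so exp() never receives
    # a large positive argument (which would overflow to inf).
    pos = x >= 0
    z = np.exp(np.where(pos, -x, x))
    return np.where(pos, 1.0 / (1.0 + z), z / (1.0 + z))

edges = np.array([-1e4, -50.0, 0.0, 50.0, 1e4])
print(stable_sigmoid(edges))  # values stay in [0, 1], no overflow to inf
```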

Sequence Diagram(s)

sequenceDiagram
    participant PT as PT DPTabulate
    participant PTX as PT_EXPT DPTabulate
    participant TabMath as Shared tabulate_math
    participant XP as array_api_compat (xp)
    participant Torch as torch/device
    participant NumPy as to_numpy_array

    PT->>TabMath: invoke tabulation (uses _get_math_backend_sample)
    PTX->>TabMath: invoke tabulation (uses _get_math_backend_sample)
    TabMath->>XP: resolve array namespace from sample
    XP->>Torch: allocate/operate on device when backend is torch
    TabMath->>XP: perform Array API ops (matmul, reshape, exp, where, concat, etc.)
    TabMath->>NumPy: convert results via to_numpy_array()
    NumPy-->>PT: return NumPy arrays to caller

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 53.57%, below the required threshold of 80.00%. Write docstrings for the functions that are missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.
Title Check: ✅ Passed. The title 'refactor: use Array API-compatible tabulate math' directly matches the main change: porting tabulate_math.py from NumPy-only to Array API-compatible operators.
Linked Issues Check: ✅ Passed. The PR addresses all objectives from #5352: it audits functions for Array API compatibility, refactors operators to Array API equivalents, enables execution on torch devices, and adds regression tests that verify behavior across backends.
Out of Scope Changes Check: ✅ Passed. All changes align with the linked issue #5352: refactoring tabulate_math.py to the Array API, updating the PT/pt_expt wrappers for device selection, and adding the corresponding tests. No unrelated changes detected.



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@source/tests/common/test_tabulate_math_array_api.py`:
- Around line 15-66: The test helper currently injects fake modules (deepmd_pkg,
dpmodel_pkg, dpmodel_common, dpmodel_utils_pkg, dpmodel_network, utils_pkg,
utils_tabulate) directly into sys.modules and never restores the original
entries, leaking the shim into other tests; after creating and registering those
module objects and after spec.loader.exec_module(module) completes (and before
returning module), record the original sys.modules values for the keys you
touched (e.g. "deepmd", "deepmd.dpmodel", "deepmd.dpmodel.common",
"deepmd.dpmodel.utils", "deepmd.dpmodel.utils.network", "deepmd.utils",
"deepmd.utils.tabulate") and then restore them (reassign originals if present,
or delete newly added keys) so the fake modules do not persist; alternatively
replace these sys.modules assignments with monkeypatch.setitem calls or use a
context-manager that cleans up on exit, ensuring unique symbols deepmd_pkg,
dpmodel_pkg, dpmodel_common, dpmodel_utils_pkg, dpmodel_network, utils_pkg,
utils_tabulate are removed/restored before returning.
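The restore-on-exit behavior this comment asks for can be sketched as a small context manager; pytest's monkeypatch.setitem provides the same cleanup automatically, and all names below are illustrative:

```python
import contextlib
import sys
import types

@contextlib.contextmanager
def patched_modules(fakes):
    # Register fake modules in sys.modules, then restore pre-existing
    # entries (or delete newly added keys) on exit, so stubs never leak
    # into tests collected later in the same pytest session.
    sentinel = object()
    saved = {name: sys.modules.get(name, sentinel) for name in fakes}
    sys.modules.update(fakes)
    try:
        yield
    finally:
        for name, orig in saved.items():
            if orig is sentinel:
                sys.modules.pop(name, None)
            else:
                sys.modules[name] = orig

stub = types.ModuleType("fake_pkg.sub")
with patched_modules({"fake_pkg.sub": stub}):
    assert sys.modules["fake_pkg.sub"] is stub
assert "fake_pkg.sub" not in sys.modules  # stub removed after the block
```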

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bbadced2-1c74-4f11-b5e2-d9d76fc454d9

📥 Commits

Reviewing files that changed from the base of the PR and between dd8e492 and 5f09f4e.

📒 Files selected for processing (4)
  • deepmd/pt/utils/tabulate.py
  • deepmd/pt_expt/utils/tabulate.py
  • deepmd/utils/tabulate_math.py
  • source/tests/common/test_tabulate_math_array_api.py

Comment thread source/tests/common/test_tabulate_math_array_api.py Outdated
The regression test for tabulate_math loaded stub deepmd modules by
writing directly into sys.modules at import time and never restored
them. During pytest collection this polluted later imports, causing
widespread ImportError failures from deepmd.dpmodel.common/utils.

Restore any pre-existing modules and remove temporary stubs after the
module under test is loaded so the test remains self-contained. Also
wrap one long assertion line for style.

Tested:
- python3 -m py_compile source/tests/common/test_tabulate_math_array_api.py

Fixes the CI failures on PR deepmodeling#5366.

Authored by OpenClaw (model: gpt-5.4)
@codecov

codecov bot commented Apr 5, 2026

Codecov Report

❌ Patch coverage is 97.02970% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.38%. Comparing base (dd8e492) to head (d6eb037).
⚠️ Report is 4 commits behind head on master.

Files with missing lines: deepmd/utils/tabulate_math.py (patch coverage 96.84%, 3 lines missing ⚠️)
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5366      +/-   ##
==========================================
+ Coverage   82.32%   82.38%   +0.05%     
==========================================
  Files         811      812       +1     
  Lines       83252    83611     +359     
  Branches     4067     4090      +23     
==========================================
+ Hits        68537    68881     +344     
- Misses      13497    13508      +11     
- Partials     1218     1222       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


Replace the hand-rolled stub-based tabulate_math regression test with
an array_api_strict-backed test, matching the repository's existing
Array API test strategy.

This removes the need for synthetic sys.modules injection while still
verifying that the tabulate math helpers execute correctly on strict
Array API arrays and stay numerically consistent with the NumPy
reference path.

Tested:
- python3 -m py_compile source/tests/common/test_tabulate_math_array_api.py

Authored by OpenClaw (model: gpt-5.4)
Comment thread source/tests/common/test_tabulate_math_array_api.py Fixed
Run repository formatting hooks after switching the tabulate math
regression test to array_api_strict-backed coverage.

Authored by OpenClaw (model: gpt-5.4)
Contributor

Copilot AI left a comment


Pull request overview

This PR refactors deepmd/utils/tabulate_math.py to use Array API–compatible operators (via array_api_compat) so the shared tabulation math can execute on non-NumPy backends, and updates PyTorch wrapper classes to select a torch-backed execution sample/device. It also adds a regression test to ensure the helper math uses the provided Array API namespace (rather than silently falling back to NumPy).

Changes:

  • Port activation-derivative and chain-rule helper functions to Array API–compatible ops (array_api_compat.array_namespace, xp.matmul, xp.where, xp.concat, etc.).
  • Add backend/device selection hooks so PT / pt_expt wrappers can run the shared math on the active torch device.
  • Add array_api_strict-based tests validating correctness vs NumPy and guarding against NumPy fallback.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
deepmd/utils/tabulate_math.py Refactors tabulation math to Array API ops and adds backend/device selection + backend conversions for matrices/biases.
deepmd/pt/utils/tabulate.py Overrides backend-sample selection so shared math runs on the current torch device.
deepmd/pt_expt/utils/tabulate.py Adds backend-sample selection for the pt_expt torch device.
source/tests/common/test_tabulate_math_array_api.py New regression tests comparing Array API Strict results to NumPy for key helpers.


Comment thread deepmd/utils/tabulate_math.py
- remove the unused array_api_strict package import from the regression test
- release the original NumPy matrix/bias caches after materializing backend copies to avoid retaining duplicate weight storage during tabulation

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
deepmd/utils/tabulate_math.py (1)

356-358: Reuse xbar / ybar instead of doing the same matmul twice.

_make_data() already computes the preactivation for derivatives, then _layer_0() / _layer_1() immediately rebuild matmul(x, w) + b again. In this hot path that doubles the expensive backend op per layer and blunts the acceleration benefit of the Array API refactor.

⚡ Refactor sketch
-                    yy = self._layer_0(xx, matrix, bias) + xx
+                    yy = self._activation_fn(xbar) + xx
...
-                    tt, yy = self._layer_1(xx, matrix, bias)
+                    tt = xp.concat((xx, xx), axis=1)
+                    yy = self._activation_fn(xbar) + tt
...
-                    zz = self._layer_0(yy, matrix, bias) + yy
+                    zz = self._activation_fn(ybar) + yy

Apply the same pattern to the remaining _layer_0 / _layer_1 branches, or change those helpers to accept a precomputed preactivation.

Also applies to: 375-375, 392-392, 407-409, 426-426, 443-443, 467-474
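The same idea in plain Python: let the layer helper accept an optional precomputed preactivation so the matmul runs once per layer. A simplified NumPy sketch with tanh standing in for the configured activation:

```python
import numpy as np

def layer_0(x, w, b, preact=None):
    # Accept an optional precomputed preactivation so callers that already
    # built xbar for the derivative path do not repeat the matmul here.
    if preact is None:
        preact = np.matmul(x, w) + b
    return np.tanh(preact)

x = np.ones((2, 3))
w = np.full((3, 4), 0.1)
b = np.zeros(4)
xbar = np.matmul(x, w) + b          # computed once for the derivatives...
yy = layer_0(x, w, b, preact=xbar)  # ...and reused for the forward value
assert np.allclose(yy, np.tanh(xbar))
```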

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/utils/tabulate_math.py` around lines 356 - 358, The preactivation
(xbar / ybar) computed in _make_data is being recomputed inside
_layer_0/_layer_1; change those call sites to reuse the already computed
preactivation (xbar / ybar) instead of calling matmul again: update the branches
that currently do xbar = xp.matmul(xx, matrix) + bias followed by yy =
self._layer_0(xx, matrix, bias) (and similar for _layer_1) to pass the
precomputed xbar/ybar into the helper (either by adding a new parameter like
preact to _layer_0/_layer_1 or by replacing the helper call with logic that
consumes xbar/ybar directly), and apply the same change to all listed
occurrences so the expensive matmul is performed only once per layer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/utils/tabulate_math.py`:
- Around line 271-285: The constructor change broke callers that still pass
activation_fn_name; restore backward compatibility by accepting
activation_fn_name: str | None in the __init__ signature alongside activation_fn
(default None), map activation_fn_name -> activation_fn if the latter is not
provided, and then proceed to compute self.functype using ACTIVATION_TO_FUNCTYPE
and self._activation_fn using get_activation_fn; ensure the ValueError branch
still triggers for unknown names and keep self.activation_fn set to the resolved
name so existing code paths (e.g., deepmd/pt/utils/tabulate.py calling
super().__init__(..., activation_fn_name=...)) continue to work.
- Around line 269-270: The parameter exclude_types currently uses a mutable
default list ([]); change its annotation to Optional[list[list[int]]] (import
Optional if needed) and replace the default with None in the function signature,
then at the top of the function initialize it with something like "if
exclude_types is None: exclude_types = []" so the function (the one declaring
parameter exclude_types) no longer uses a mutable default.

---

Nitpick comments:
In `@deepmd/utils/tabulate_math.py`:
- Around line 356-358: The preactivation (xbar / ybar) computed in _make_data is
being recomputed inside _layer_0/_layer_1; change those call sites to reuse the
already computed preactivation (xbar / ybar) instead of calling matmul again:
update the branches that currently do xbar = xp.matmul(xx, matrix) + bias
followed by yy = self._layer_0(xx, matrix, bias) (and similar for _layer_1) to
pass the precomputed xbar/ybar into the helper (either by adding a new parameter
like preact to _layer_0/_layer_1 or by replacing the helper call with logic that
consumes xbar/ybar directly), and apply the same change to all listed
occurrences so the expensive matmul is performed only once per layer.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cee20baa-5f05-448b-ab3f-a4a35fc2e7e4

📥 Commits

Reviewing files that changed from the base of the PR and between 4272ca2 and 092ec70.

📒 Files selected for processing (2)
  • deepmd/utils/tabulate_math.py
  • source/tests/common/test_tabulate_math_array_api.py
✅ Files skipped from review due to trivial changes (1)
  • source/tests/common/test_tabulate_math_array_api.py

Comment thread deepmd/utils/tabulate_math.py Outdated
Comment thread deepmd/utils/tabulate_math.py
Accept the legacy activation_fn_name keyword in the shared DPTabulate
constructor so existing pt / pt_expt callers continue to work while the
Array API refactor keeps activation_fn as the canonical parameter.

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
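The compatibility shim this commit describes can be sketched as follows (the name-to-functype mapping values are illustrative, not the real ACTIVATION_TO_FUNCTYPE contents):

```python
ACTIVATION_TO_FUNCTYPE = {"tanh": 1, "gelu": 2}  # illustrative values only

class DPTabulateSketch:
    def __init__(self, activation_fn=None, activation_fn_name=None):
        # Accept the legacy keyword and map it onto the canonical parameter
        # so existing pt / pt_expt callers keep working unchanged.
        if activation_fn is None:
            activation_fn = activation_fn_name
        if activation_fn not in ACTIVATION_TO_FUNCTYPE:
            raise ValueError(f"unknown activation function: {activation_fn}")
        self.activation_fn = activation_fn
        self.functype = ACTIVATION_TO_FUNCTYPE[activation_fn]

assert DPTabulateSketch(activation_fn_name="tanh").functype == 1  # legacy path
assert DPTabulateSketch(activation_fn="gelu").functype == 2       # canonical path
```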
Replace the mutable exclude_types default with None in the shared
DPTabulate constructor and normalize it inside __init__.

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
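The change follows the standard Python fix for mutable default arguments; a minimal illustration with hypothetical function names:

```python
from typing import Optional

def build_broken(exclude_types: list = []):
    # Bug: the default list is created once at definition time and shared
    # across every call that omits the argument.
    exclude_types.append(0)
    return exclude_types

def build_fixed(exclude_types: Optional[list] = None):
    # Fix: default to None and normalize inside the function body, so each
    # call gets a fresh list.
    if exclude_types is None:
        exclude_types = []
    exclude_types.append(0)
    return exclude_types

assert list(build_broken()) == [0]
assert list(build_broken()) == [0, 0]  # state leaked between calls
assert build_fixed() == [0]
assert build_fixed() == [0]            # fresh list every call
```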
Restore helper methods and original descriptor indexing logic that were
accidentally dropped while refactoring the shared tabulate math module.
This brings back _n_all_excluded, the numpy no-op tensor conversion hook,
and the original T/R network-variable indexing used by compression paths.

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
NetworkCollection flattens 2D embedding indices as ti + ntypes * ii.
The compressed se_a/se_e2_a paths were decoding embedding_idx in the
opposite order, which made them look up missing keys such as
filter_0_net_1 during compression setup.

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
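The flattening convention and its correct inverse can be illustrated in a few lines (the ntypes value is chosen arbitrarily):

```python
ntypes = 3

def flatten(ti, ii):
    # NetworkCollection's 2D -> 1D flattening: ti varies fastest.
    return ti + ntypes * ii

def decode(embedding_idx):
    # Correct inverse of the flattening above.
    return embedding_idx % ntypes, embedding_idx // ntypes

# Decoding in the opposite order (ti = idx // ntypes) would produce keys
# such as filter_0_net_1 that were never created during serialization.
idx = flatten(ti=2, ii=1)
assert idx == 5
assert decode(idx) == (2, 1)
```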
Revert the se_a / se_e2_a consumer-side decoding change introduced in
b87834b. The original implementations were working before the shared
Array API refactor, so further fixes should stay in the shared tabulate
layer instead of mutating these consumer paths.

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
Do not coerce exclude_types entries to tuples in the shared DPTabulate
constructor. The existing compression/build paths compare against the
serialized list-based exclude_types representation, and normalizing to
tuples changes which A/R tables are skipped during build.

Tested:
- uvx prek run -a

Authored by OpenClaw (model: gpt-5.4)
@njzjz njzjz requested a review from wanghan-iapcm April 5, 2026 13:48
@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Apr 5, 2026
Merged via the queue into deepmodeling:master with commit efc27cf Apr 5, 2026
70 checks passed
Johnsonjz pushed a commit to Johnsonjz/deepmd-kit that referenced this pull request Apr 6, 2026
YuzhiLiu-ai pushed a commit to Johnsonjz/deepmd-kit that referenced this pull request Apr 7, 2026


Development

Successfully merging this pull request may close these issues.

[Feature Request] Consider Array API acceleration for deepmd/utils/tabulate_math.py

4 participants