Add DeePMD property tools#5466

Closed
zhaiwenxi wants to merge 2 commits into deepmodeling:master from zhaiwenxi:add-property-tools

@zhaiwenxi zhaiwenxi commented May 27, 2026

Summary

  • Add deepmd_property_tools for molecular property training and prediction.
  • Add pyproject.toml packaging, CLI entry point, and documentation.
  • Add pytest-based unit tests and GitHub Actions workflow.

Tests

  • python -m pytest tests -v

Summary by CodeRabbit

Release Notes

  • New Features
    • Added deepmd-property-tools package for molecular property prediction and training
    • New CLI interface with train and predict subcommands for property workflows
    • Support for CSV and MOL file data formats
    • Configuration-based training with automatic checkpoint management and model evaluation
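A minimal sketch of what an argparse CLI with train and predict subcommands can look like. The program name, the `--data`, `--config`, and `--model` flags, and the handler bodies are assumptions made for illustration; they are not taken from the PR's cli.py.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical program name and flags, chosen only for this sketch.
    parser = argparse.ArgumentParser(prog="deepmd-property-tools")
    subparsers = parser.add_subparsers(dest="command", required=True)

    train = subparsers.add_parser("train", help="train a property model")
    train.add_argument("--data", required=True, help="path to training data")
    train.add_argument("--config", default=None, help="optional JSON config")

    predict = subparsers.add_parser("predict", help="predict properties")
    predict.add_argument("--data", required=True, help="path to input data")
    predict.add_argument("--model", required=True, help="path to a trained model")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    """Dispatch to a subcommand handler and return an integer exit code."""
    args = build_parser().parse_args(argv)
    if args.command == "train":
        # A real handler would hand the arguments to the training wrapper.
        print(f"train: data={args.data} config={args.config}")
        return 0
    if args.command == "predict":
        # A real handler would hand the arguments to the prediction wrapper.
        print(f"predict: data={args.data} model={args.model}")
        return 0
    return 2  # unreachable while the subcommand is required


if __name__ == "__main__":
    raise SystemExit(main())
```

Returning an integer from `main` rather than calling `sys.exit` inside the handlers keeps the dispatch logic easy to call from unit tests.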

coderabbitai Bot commented May 27, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 4dcc958a-cae8-4bc0-a56c-f8e5268f98ec

📥 Commits

Reviewing files that changed from the base of the PR and between 4e64f8b and e9fe00f.

⛔ Files ignored due to path filters (1)
  • deepmd/deepmd_property_tools/DATA/dataset_demo.csv is excluded by !**/*.csv
📒 Files selected for processing (75)
  • .github/workflows/property_tools_tests.yml
  • deepmd/deepmd_property_tools/DATA/mol_convert/id0.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id1.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id10.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id11.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id12.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id13.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id14.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id15.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id16.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id17.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id18.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id19.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id2.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id20.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id21.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id22.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id23.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id24.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id25.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id26.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id27.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id28.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id29.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id3.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id30.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id31.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id32.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id33.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id34.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id35.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id36.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id37.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id38.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id39.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id4.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id5.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id6.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id7.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id8.mol
  • deepmd/deepmd_property_tools/DATA/mol_convert/id9.mol
  • deepmd/deepmd_property_tools/DPA3_finetune_hyperparameters.md
  • deepmd/deepmd_property_tools/MANIFEST.in
  • deepmd/deepmd_property_tools/README.md
  • deepmd/deepmd_property_tools/deepmd_property_tools/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/cli.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/config/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/config/config_handler.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/config/default.json
  • deepmd/deepmd_property_tools/deepmd_property_tools/data/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/data/converter.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/data/datahub.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/data/mol.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/models/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/models/property_model.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/predict.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/predictor.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/tasks/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/tasks/trainer.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/train.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/utils/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/utils/base_logger.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/utils/metrics.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/utils/util.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/weights/__init__.py
  • deepmd/deepmd_property_tools/deepmd_property_tools/weights/weighthub.py
  • deepmd/deepmd_property_tools/predict_property_20.py
  • deepmd/deepmd_property_tools/pyproject.toml
  • deepmd/deepmd_property_tools/tests/test_cli.py
  • deepmd/deepmd_property_tools/tests/test_config.py
  • deepmd/deepmd_property_tools/tests/test_mol.py
  • deepmd/deepmd_property_tools/tests/test_predict.py
  • deepmd/deepmd_property_tools/tests/test_train.py
  • deepmd/deepmd_property_tools/tests/test_trainer.py
  • deepmd/deepmd_property_tools/train_property_20.py

📝 Walkthrough

A new deepmd_property_tools package is introduced to DeePMD-kit, providing a Uni-Mol-tools-style interface for molecular property prediction and model training. The package includes configuration management, a data pipeline for MOL/CSV ingestion, training/inference orchestration via DataHub, PyTorch model wrappers, a CLI entry point, and tests with example workflows.
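To make the MOL ingestion step concrete, here is a rough sketch of reading element symbols and 3D coordinates from the atom block of a V2000 MOL file, which uses fixed-width columns. The function name `parse_v2000_atoms` and the water example are hypothetical; the PR's own parser in data/mol.py may work differently.

```python
from typing import List, Tuple

Coord = Tuple[float, float, float]


def parse_v2000_atoms(text: str) -> Tuple[List[str], List[Coord]]:
    """Return (symbols, coordinates) from the atom block of a V2000 MOL string."""
    lines = text.splitlines()
    # Lines 1-3 are the header; line 4 is the counts line (atom count in columns 1-3).
    n_atoms = int(lines[3][0:3])
    symbols: List[str] = []
    coords: List[Coord] = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z occupy three 10-character fields; the symbol follows one space.
        x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
        symbols.append(line[31:34].strip())
        coords.append((x, y, z))
    return symbols, coords


def atom_line(x: float, y: float, z: float, symbol: str) -> str:
    # Helper that builds a correctly aligned atom line for the example below.
    return f"{x:10.4f}{y:10.4f}{z:10.4f} {symbol:<3} 0" + "  0" * 11


# Illustrative water molecule, not one of the PR's data files.
example = "\n".join([
    "water",
    "  sketch",
    "",
    f"{3:3d}{2:3d}  0  0  0  0  0  0  0  0999 V2000",
    atom_line(0.0, 0.0, 0.0, "O"),
    atom_line(0.9572, 0.0, 0.0, "H"),
    atom_line(-0.24, 0.9266, 0.0, "H"),
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
])

symbols, coords = parse_v2000_atoms(example)
print(symbols, coords)
```

Slicing by column rather than splitting on whitespace matters because adjacent coordinate fields can touch when a value is large or negative.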

Changes

DeePMD Property Tools Package

| Layer | File(s) | Summary |
|---|---|---|
| Package metadata and CI configuration | pyproject.toml, .github/workflows/property_tools_tests.yml, MANIFEST.in | Build system configuration, GitHub Actions CI for unit tests, and manifest directive to include JSON config files in distribution. |
| Configuration system with defaults | deepmd_property_tools/config/__init__.py, config_handler.py, default.json | JSON-based config management via ConfigHandler with recursive deep-merge support; default DPA3 model, loss, learning-rate schedule, and training runtime parameters. |
| Data ingestion and preparation pipeline | deepmd_property_tools/data/__init__.py, converter.py, mol.py, datahub.py | MOL file parsing with 3D coordinate and element extraction, CSV/MOL dataset loading with atom-overlap filtering, type map derivation, DeepMD mixed-npy preparation with train/valid splitting, and DataHub orchestration for training/inference data flows. |
| Model loading and prediction system | deepmd_property_tools/models/property_model.py, predict.py, predictor.py | PropertyModel wrapper for DeepProperty inference, Predictor for coordinate standardization and padded batch prediction with CSV output, and PropertyPredict high-level wrapper with checkpoint/frozen-model selection logic. |
| Training orchestration and weight resolution | deepmd_property_tools/weights/weighthub.py, tasks/trainer.py, train.py | WeightHub for local path or registry-based pretrained model resolution; Trainer supporting single-process training or distributed torchrun with optional post-training checkpoint freezing; PropertyTrain high-level interface coordinating data preparation, training execution, and config persistence. |
| Command-line interface | deepmd_property_tools/cli.py | argparse-based CLI with train and predict subcommands; handlers wiring parsed arguments to PropertyTrain/PropertyPredict, returning integer exit codes. |
| Utility modules | deepmd_property_tools/utils/base_logger.py, metrics.py, util.py | Shared INFO-level logger with stream handler, regression metrics computation (MAE/RMSE), and directory creation utility. |
| Documentation | README.md, DPA3_finetune_hyperparameters.md | Installation and usage guide with Python and CLI examples; detailed DPA3 finetuning guidance separating must-match model-structure fields from user-customizable training parameters, with example PropertyTrain configuration. |
| Test suite for configuration and data utilities | tests/test_config.py, test_mol.py | ConfigHandler merge behavior with nested dict updates; property parsing, atom-overlap detection, element type ordering, and MOL file reading. |
| Test suite for inference and training workflows | tests/test_predict.py, test_train.py, test_trainer.py, test_cli.py | PropertyPredict checkpoint selection (latest numbered vs frozen model), Trainer torchrun command building, PropertyTrain argument validation and epoch-to-step conversion, and CLI train/predict command dispatching. |
| Test data molecules and example workflows | DATA/mol_convert/id0-39.mol, train_property_20.py, predict_property_20.py | 40 RDKit V2000-format MOL structures with 3D coordinates, bonds, and formal charges; end-to-end example scripts demonstrating training with DPA3.2-5M pretraining and subsequent inference. |
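The recursive deep-merge that the summary attributes to ConfigHandler can be illustrated with a short sketch. The `deep_merge` function and the example keys are assumptions for illustration only; they are not copied from config_handler.py or default.json.

```python
import copy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where nested dicts in `override` update `base` recursively."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing wholesale.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative defaults and user overrides (hypothetical values).
defaults = {
    "learning_rate": {"start_lr": 1e-3, "decay_steps": 1000},
    "training": {"numb_steps": 10000},
}
user = {"learning_rate": {"start_lr": 1e-4}}

merged = deep_merge(defaults, user)
print(merged)
```

With a shallow `dict.update`, the user's `learning_rate` entry would replace the whole nested section and drop `decay_steps`; the recursive version keeps every default the user did not override and leaves `defaults` untouched.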

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

Docs, Examples

Suggested reviewers

  • njzjz
  • iProzd
  • wanghan-iapcm

@zhaiwenxi zhaiwenxi closed this May 27, 2026