
Add Phase of Matter quantum dataset generator #1044

Open
rogue-infinity wants to merge 5 commits into qiskit-community:main from rogue-infinity:feature/phase-of-matter-dataset


@rogue-infinity
Contributor

rogue-infinity commented May 13, 2026

Summary

  • Adds phase_of_matter_data() to qiskit_machine_learning/datasets/ as a new sub-package phase_of_matter/
  • Implements four spin-chain Hamiltonians from Bermejo-Vega et al., arXiv:2408.12739:
    • "heisenberg" - Bond-alternating XXX Heisenberg: phases trivial / topological
    • "haldane" - Haldane chain: phases antiferromagnetic / paramagnetic / spt
    • "annni" - Axial Next-Nearest-Neighbor Ising: phases ferromagnetic / paramagnetic / floating / antiphase
    • "cluster" - Cluster Hamiltonian (periodic BC): phases haldane / ferromagnetic / antiferromagnetic / trivial
  • Ground states are computed with scipy.sparse.linalg.eigsh (exact diagonalization) by default. An optional VQE pathway is available through the backend= argument for hardware-experiment workflows
  • Follows the existing dataset API convention (training_size, test_size, one_hot, formatting, class_labels, include_sample_total, seed, backend)
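
To illustrate the exact-diagonalization route described above, here is a minimal sketch for a bond-alternating Heisenberg chain. It is not the code from this PR: the PR builds its Hamiltonians with SparsePauliOp, whereas this sketch swaps in plain scipy.sparse Kronecker products so that it only needs NumPy and SciPy. The function names (two_site_term, heisenberg_hamiltonian, ground_state), the open boundary conditions, and the coupling parametrization (j_even, j_odd) are all assumptions made for illustration.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Single-qubit identity and Pauli matrices in sparse form.
I2 = sp.identity(2, dtype=complex, format="csr")
PAULIS = {
    "X": sp.csr_matrix(np.array([[0, 1], [1, 0]], dtype=complex)),
    "Y": sp.csr_matrix(np.array([[0, -1j], [1j, 0]], dtype=complex)),
    "Z": sp.csr_matrix(np.array([[1, 0], [0, -1]], dtype=complex)),
}


def two_site_term(n, i, j, label):
    """n-qubit operator with Pauli `label` on sites i and j, identity elsewhere."""
    factors = [PAULIS[label] if k in (i, j) else I2 for k in range(n)]
    out = factors[0]
    for factor in factors[1:]:
        out = sp.kron(out, factor, format="csr")
    return out


def heisenberg_hamiltonian(n, j_even, j_odd):
    """Open-chain XXX Heisenberg model with alternating bond strengths (assumed form)."""
    ham = sp.csr_matrix((2**n, 2**n), dtype=complex)
    for i in range(n - 1):
        coupling = j_even if i % 2 == 0 else j_odd
        for label in "XYZ":
            ham = ham + coupling * two_site_term(n, i, i + 1, label)
    return ham


def ground_state(ham):
    """Lowest eigenpair via eigsh; 'SA' requests the smallest algebraic eigenvalue."""
    vals, vecs = eigsh(ham, k=1, which="SA")
    return vals[0], vecs[:, 0]


ham = heisenberg_hamiltonian(4, j_even=1.0, j_odd=0.5)
energy, psi = ground_state(ham)

# Eigenstate residual ||H psi - E psi||, the kind of quantity the test plan checks.
residual = np.linalg.norm(ham @ psi - energy * psi)
```

For n=4 the state vector has 2**4 = 16 amplitudes, which matches the feature dimension shown in the usage example.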

Files changed

| File | Description |
| --- | --- |
| qiskit_machine_learning/datasets/phase_of_matter/__init__.py | Package entry point |
| qiskit_machine_learning/datasets/phase_of_matter/phase_of_matter.py | Public phase_of_matter_data() API |
| qiskit_machine_learning/datasets/phase_of_matter/_base.py | Shared utilities: pauli_term, exact diag, VQE |
| qiskit_machine_learning/datasets/phase_of_matter/_heisenberg.py | Heisenberg Hamiltonian + phase sampler |
| qiskit_machine_learning/datasets/phase_of_matter/_haldane.py | Haldane Hamiltonian + phase sampler |
| qiskit_machine_learning/datasets/phase_of_matter/_annni.py | ANNNI Hamiltonian + phase sampler |
| qiskit_machine_learning/datasets/phase_of_matter/_cluster.py | Cluster Hamiltonian + phase sampler |
| qiskit_machine_learning/datasets/__init__.py | Export phase_of_matter_data |
| test/datasets/test_phase_of_matter.py | 55-test suite |

Test plan

  • pytest test/datasets/test_phase_of_matter.py: 55 tests covering Hermiticity of all Hamiltonians, ground-state normalization, eigenstate residuals, phase label coverage for all models, API shape contracts, seed reproducibility, and error handling
  • Black / pylint / tox lint checks
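
As a rough picture of what the Hermiticity, normalization, and eigenstate-residual checks could look like, here is a small pytest-style sketch on a two-site Heisenberg coupling. The helper is_hermitian and the test names are hypothetical and are not taken from the PR's test suite; the two-site model is chosen only because its ground-state energy is known exactly (the singlet, at -3 in these units).

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)


def is_hermitian(mat, tol=1e-10):
    """True if mat equals its conjugate transpose within tol (hypothetical helper)."""
    return bool(np.allclose(mat, mat.conj().T, atol=tol))


def two_site_heisenberg():
    """XX + YY + ZZ on two qubits."""
    return np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)


def test_hamiltonian_is_hermitian():
    """The Hamiltonian must equal its own conjugate transpose."""
    assert is_hermitian(two_site_heisenberg())


def test_ground_state_is_normalized_eigenstate():
    """The lowest eigenvector has unit norm and a vanishing residual."""
    ham = two_site_heisenberg()
    vals, vecs = np.linalg.eigh(ham)  # eigenvalues come back in ascending order
    energy, psi = vals[0], vecs[:, 0]
    assert np.isclose(np.linalg.norm(psi), 1.0)
    assert np.linalg.norm(ham @ psi - energy * psi) < 1e-10
    assert np.isclose(energy, -3.0)  # singlet energy of sigma_1 . sigma_2
```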

Usage example

from qiskit_machine_learning.datasets import phase_of_matter_data

x_train, y_train, x_test, y_test = phase_of_matter_data(
    training_size=10, test_size=5, n=4, model="heisenberg", seed=0
)
# x_train.shape → (10, 16), y_train.shape → (10, 2)
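
The shapes in the comment above follow from the arguments: n=4 qubits give 2**4 = 16 state-vector amplitudes per sample, and the two Heisenberg phases give a two-column one-hot label matrix. The sketch below reproduces only that shape arithmetic and seed reproducibility with NumPy. It draws labels uniformly at random as a stand-in, which is an assumption for illustration; the actual generator assigns labels from sampled Hamiltonian parameters.

```python
import numpy as np

class_labels = ["trivial", "topological"]  # Heisenberg phases listed in the summary
n = 4
training_size = 10


def stand_in_labels(seed):
    """Draw placeholder class indices; NOT how the real generator assigns phases."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, len(class_labels), size=training_size)


label_idx = stand_in_labels(seed=0)

# One-hot encoding: row i has a single 1 in the column of sample i's class.
y_one_hot = np.eye(len(class_labels))[label_idx]

feature_dim = 2**n  # number of amplitudes in an n-qubit state vector
```

Calling stand_in_labels twice with the same seed returns identical arrays, which is the behavior the seed argument is meant to guarantee.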

Implements phase_of_matter_data() as a new dataset in
qiskit_machine_learning/datasets/phase_of_matter/:

- Four spin-chain Hamiltonians built via SparsePauliOp (no extra deps):
  heisenberg (trivial/topological), haldane (afm/paramagnetic/spt),
  annni (ferromagnetic/paramagnetic/floating/antiphase),
  cluster (haldane/ferromagnetic/antiferromagnetic/trivial)
- Exact ground states via scipy.sparse.linalg.eigsh (default)
- Optional VQE pathway for hardware-experiment workflows
- Follows existing dataset API: training_size, test_size, one_hot,
  formatting, class_labels, include_sample_total, seed, backend
- 55-test suite covering Hermiticity, normalization, eigenstate
  residuals, phase label coverage, shape contracts, reproducibility

Reference: Bermejo-Vega et al., arXiv:2408.12739 (2024)
- Add STFC copyright header to all 8 new files
- Rename variable H -> ham throughout (C0103 invalid-name)
- Add docstrings to all test methods (C0116 missing-function-docstring)
- Move 'from test import' before third-party imports (C0411 wrong-import-order)
- Add :type backend: object to fix W9016 missing-type-doc in two functions
- Replace British spellings with American: normalised, optimisation, mislabelled
- Replace Greek chars in inline comments with ASCII equivalents
- Add Callable type annotation to _fixed_hamiltonian to fix mypy operator error
- Add domain-specific words to .pylintdict for spell check
- Add rng, geq, ddt, idata, atol, eigh, zxz, simulatable to .pylintdict
- Fix neighbour/favour -> neighbor/favor in _haldane.py
- Fix parameterised/honoured -> parameterized/honored in test file
- Reword 'RNG' -> 'random number generator' in phase_of_matter.py docstring
- Fix Simulable -> Simulatable in paper reference
- Fix W9016: use 'backend (object):' Google-style in both _base.py and
  phase_of_matter.py; remove incorrect ':type backend:' RST lines
- Add '# doctest: +SKIP' to docstring examples (consistent with existing
  datasets which have no running doctests)
- Replace 'atol' param name with 'tol' in _is_hermitian helper
- Replace 'ZXZ' in comment with 'Z-X-Z'
- Reword 'namespace' and 'eigh' references in test docstrings
The sphinx LowercaseFilter downcases checked words before dict lookup,
so dictionary entries must be lowercase. Bermejo and Simulatable were
capitalised, causing sphinx-build -M spelling to fail despite pylint
spell check passing.
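
A simplified model of the failure described above, assuming only what the commit message states (words are lowercased before the dictionary lookup). This is not the spell checker's actual code; is_known is a hypothetical stand-in.

```python
def is_known(word, dictionary):
    """Mimic a checker that lowercases each word before looking it up."""
    return word.lower() in dictionary


capitalized_dict = {"Bermejo", "Simulatable"}
lowercase_dict = {"bermejo", "simulatable"}

# "Bermejo" is lowercased to "bermejo", which a capitalized entry never matches.
found_with_capitalized = is_known("Bermejo", capitalized_dict)
found_with_lowercase = is_known("Bermejo", lowercase_dict)
```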
@rogue-infinity
Contributor Author

@OkuyanBoga @adekusar-drl @oscar-wallis
Hi guys, could you please look at the failing ML Unit Tests job on Windows (Python 3.10)? As far as I can see from the logs, the failure is in a VQC test that reaches an accuracy of 0.2 on a dataset it creates on its own, without referencing this new generator anywhere. Every other platform (Ubuntu × 4 Python versions, macOS × 2 Python versions) passed.

Awaiting your response on this!

@coveralls

Coverage Report for CI Build 25775720082

Coverage increased (+0.1%) to 89.802%

Details

  • Coverage increased (+0.1%) from the base build.
  • Patch coverage: 16 uncovered changes across 2 files (225 of 241 lines covered, 93.36%).
  • No coverage regressions found.

Uncovered Changes

| File | Changed | Covered | % |
| --- | --- | --- | --- |
| qiskit_machine_learning/datasets/phase_of_matter/_base.py | 35 | 20 | 57.14% |
| qiskit_machine_learning/datasets/phase_of_matter/phase_of_matter.py | 84 | 83 | 98.81% |

Coverage Regressions

No coverage regressions found.


Coverage Stats

Coverage Status
Relevant Lines: 6011
Covered Lines: 5398
Line Coverage: 89.8%
Coverage Strength: 0.9 hits per line
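
The percentages in this report can be cross-checked from the line counts it quotes. The short sketch below only redoes that arithmetic; the variable names are illustrative.

```python
# (changed lines, covered lines) per file, as listed under "Uncovered Changes".
files = {
    "_base.py": (35, 20),
    "phase_of_matter.py": (84, 83),
}

per_file_pct = {
    name: round(100 * covered / changed, 2)
    for name, (changed, covered) in files.items()
}
uncovered_in_listed_files = sum(changed - covered for changed, covered in files.values())

# Patch-level and repository-level figures quoted in the report.
patch_pct = round(100 * 225 / 241, 2)
overall_pct = round(100 * 5398 / 6011, 3)
```

The two listed files account for all 16 uncovered lines (15 in _base.py and 1 in phase_of_matter.py), consistent with 241 - 225 = 16.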

💛 - Coveralls

@rogue-infinity
Contributor Author

@OkuyanBoga, all the tests have now passed. Could you please go through the PR and let me know if any modifications, suggestions, or additions are needed?

@OkuyanBoga added the labels "Changelog: New Feature" (Include in the Added section of the changelog) and "Community PR 🌐" (PRs from contributors that are not 'members' of the Qiskit organization) on May 14, 2026