
Update decomposition handling #364

Merged
lkdvos merged 24 commits into master from lb/purge_lapack
Apr 29, 2026
Conversation

@leburgel (Member) commented Apr 21, 2026

Updates the handling of decompositions following the MatrixAlgebraKit.jl v0.6.5 updates. The main purpose was to remove all explicit LAPACK_ specifications from the decomposition algorithms, now that the decomposition algorithms themselves have been decoupled from the storage type in MatrixAlgebraKit.jl. In the process, I also tried to streamline the way we use decompositions a bit, attempting to bring them more in line with MatrixAlgebraKit.jl while still allowing users to hook into the implementation and switch between different reverse rule algorithms.

Decomposition algorithms & selection

The main change is an update in how decomposition algorithms are selected.

  • All of the MatrixAlgebraKit.jl decompositions are now selected through symbols that exactly match their algorithm name, e.g. alg = :DivideAndConquer.

    • This seems the most appropriate, since it means we don't need to introduce any new symbols ourselves like we do now, and we can easily refer to the MatrixAlgebraKit.jl docs.
    • The downside is that we still explicitly list all of the available algorithms in a symbol list, and we'll run into trouble if the names in MatrixAlgebraKit.jl change.
    • By default, we now just use :DefaultAlgorithm whenever we can, which seems like the most suitable option.
  • The specification of the iterative algorithms is kept the same as it was before.

  • I updated the docstrings of the decomposition algorithm structs SVDAdjoint, EighAdjoint, QRAdjoint to reflect all of these changes, and also link directly to the relevant docstrings so users can easily find what keyword algorithms they can provide.

  • I removed the rrule_alg specification from the user-facing interface in leading_boundary and fixedpoint.

    • In this update the decomposition_alg kwarg of leading_boundary or CTMRGAlgorithm is interpreted as a forward algorithm. Specifying a particular rrule algorithm as well can now only be done by passing a full decomposition struct, e.g. an instantiated SVDAdjoint, and is considered expert usage.
    • The default rrule algorithm is now selected based on the forward algorithm, using the default that is the most logical (i.e. full rrule when we have the full decomposition, truncated one for the sparse decompositions).
    • This is in anticipation of the fact that we'll move to directly using the MatrixAlgebraKit.jl differentiation without any manual intervention, so I wanted to remove this layer already from the interface that most users interact with. Advanced usage for prototyping is still possible, so this seems like a good solution for now.
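The algorithm selection described above can be illustrated with a short sketch. This is hypothetical usage based on the PR description only: the decomposition_alg keyword name is taken from the text above, but the exact call signatures and field names may differ from the final implementation.

```julia
# Hypothetical sketch based on the PR description; exact signatures may differ.
using PEPSKit

# Typical usage: a symbol matching the MatrixAlgebraKit.jl algorithm name selects
# the forward algorithm, and a matching reverse rule is selected automatically
# (full rrule for full decompositions, truncated one for sparse decompositions).
env, info = leading_boundary(env₀, peps; decomposition_alg = :DivideAndConquer)

# Expert usage: pass a fully instantiated decomposition struct to also control
# the reverse rule explicitly (the keyword names here are illustrative).
svd_alg = SVDAdjoint(; fwd_alg = :DivideAndConquer, rrule_alg = :DivideAndConquer)
env, info = leading_boundary(env₀, peps; decomposition_alg = svd_alg)
```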

Decomposition implementation

I updated the decomposition implementation to remove the intermediate _svd_trunc!, _eigh_trunc! and _left_orth! methods, and only use the MatrixAlgebraKit.jl methods directly.

  • The call signature for svd_trunc!, eigh_trunc! and left_orth! using SVDAdjoint, EighAdjoint and QRAdjoint algorithms respectively now just matches the MatrixAlgebraKit.jl signature.

    • This means they no longer return a generic info struct for the truncated decompositions, but only return the truncation error. Additional info such as condition numbers is just computed manually afterwards and added directly to the CTMRG info. In particular, the full decomposition is no longer returned explicitly (see also below).
    • On the other hand, this means I had to add the trunc specification to the actual decomposition structs, since MatrixAlgebraKit doesn't allow mixing instantiated algorithm structs with keyword arguments. In the forward pass, the MatrixAlgebraKit implementation is then called directly using a TruncatedAlgorithm(alg.fwd_alg, alg.trunc), which seemed like the cleanest way to do things.
  • The reverse rules are still handled by manual interception at the moment, even though they just directly call the MatrixAlgebraKit.jl implementations.

    • The main reason is that this is currently the only way to use the keyword arguments of the pullbacks to control things like the verbosity.
    • Once MatrixAlgebraKit.jl#207 ("Keyword arguments of pullbacks are unreachable in AD") has been addressed, we can probably stop handling things this way and remove the SVDAdjoint, EighAdjoint and QRAdjoint structs altogether. Then we only have to handle reverse rules for the decompositions defined directly in PEPSKit.jl, i.e. the sparse iterative ones (until they are properly implemented elsewhere of course).
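As a rough illustration of the forward-pass wrapping described above (a sketch only; `_forward_svd` is an illustrative name, and the actual method signatures in PEPSKit.jl and MatrixAlgebraKit.jl may differ):

```julia
# Sketch: the truncation spec stored on the adjoint struct is bundled with the
# forward algorithm into a TruncatedAlgorithm before calling MatrixAlgebraKit.
using MatrixAlgebraKit: svd_trunc!, TruncatedAlgorithm

function _forward_svd(t, alg::SVDAdjoint)
    # MatrixAlgebraKit doesn't allow mixing an instantiated algorithm struct
    # with truncation keyword arguments, hence the explicit wrapper; the
    # truncated call returns only the truncation error as info, so anything
    # extra (e.g. condition numbers) is computed afterwards.
    return svd_trunc!(t, TruncatedAlgorithm(alg.fwd_alg, alg.trunc))
end
```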

Gauge fixing and fixed-point differentiation

I updated the implementation of the fixed iteration scheme to not use fixed precomputed decompositions, but rather fix relative and absolute phases after the core iteration using signs precomputed in the gauge fixing of the environment itself.

  • I managed to remove FixedSVD, FixedEigh and FixedQR completely, which in turn means we never have to return, store or gauge-fix full untruncated decompositions anymore.

    • In particular, this would address part of #354 (info_ctmrg fields in leading_boundary), since the full decompositions are only ever stored when using differentiation, completely removing them from the regular forward pass.
  • To make this work properly, I had to ensure that running the same decomposition on the same input twice actually gave the same result, otherwise we end up gauge fixing with the wrong signs.

    • For the full decompositions, this worked fine right out of the box.
    • For the sparse iterative decompositions, I had to:
      • Add a MatrixAlgebraKit.gaugefix!(...) call when computing the decomposition in each block.
      • Switch to using deterministic start vectors for the Krylov procedure, using ones(...) rather than randn to generate the start vector.
  • I updated the settings in the tests and examples to no longer use iterscheme = :diffgauge, and also reduced its usage in the gradient tests directly. Things are quite a bit faster now.

    • Since iterscheme = :diffgauge is so much slower in practice, maybe we should just consider removing it altogether. In practice, it's sometimes just unusable, while all of the initial stability issues with iterscheme = :fixed seem to be resolved now.
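The determinism requirement for the sparse solvers can be sketched as follows. This is an illustrative, self-contained example using KrylovKit.jl, not the actual PEPSKit.jl code:

```julia
using KrylovKit, LinearAlgebra

A = let B = randn(20, 20)        # some Hermitian test matrix
    (B + B') / 2
end

# Deterministic start vector: ones(...) instead of randn(...), so that running
# the same decomposition twice on the same input gives identical results, which
# the sign-based gauge fixing relies on.
x₀ = ones(eltype(A), size(A, 1))
vals₁, vecs₁, = eigsolve(A, x₀, 3, :LM)
vals₂, vecs₂, = eigsolve(A, x₀, 3, :LM)
# With a randn start vector, the eigenvector phases could differ between the
# two runs, and the gauge fixing would then use the wrong signs.
```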

Further questions

Things to address before we can merge (and hopefully tag a release):

  • The most important thing now is to fix the interface to something we can stick with for a while. At this point this basically comes down to settling on a set of symbols that can be used to configure algorithms, and then hopefully not change them.
    • Right now, there's a mix of capitalized and lower-case symbols, e.g. :DivideAndConquer for specifying an SVD algorithm versus :gmres for specifying a gradient linear solver. I would suggest the following:
    • Just use the capitalization of the struct names whenever we reference an algorithm that is not implemented in PEPSKit.jl itself. MatrixAlgebraKit.jl is the main motivation for this, since alg = :DivideAndConquer works directly in the MatrixAlgebraKit.jl interface itself, so it makes sense to just use the same names. However, this means changing some current values (e.g. :gmres -> :GMRES, :lbfgs -> :LBFGS). Very breaking, but this seems like a good time to break.

Any other comments and questions are very welcome.

@leburgel leburgel added the documentation Improvements or additions to documentation label Apr 21, 2026
@github-actions (Contributor) commented

After the build completes, the updated documentation will be available here

@codecov bot commented Apr 21, 2026

Codecov Report

❌ Patch coverage is 88.46154% with 15 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/utility/eigh.jl | 73.91% | 6 Missing ⚠️ |
| src/utility/svd.jl | 77.77% | 6 Missing ⚠️ |
| src/algorithms/ctmrg/gaugefix.jl | 94.11% | 2 Missing ⚠️ |
| src/algorithms/ctmrg/c4v.jl | 85.71% | 1 Missing ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| src/Defaults.jl | 85.71% <ø> (ø) |
| src/PEPSKit.jl | 100.00% <ø> (ø) |
| src/algorithms/ctmrg/ctmrg.jl | 90.90% <100.00%> (+0.63%) ⬆️ |
| src/algorithms/ctmrg/projectors.jl | 85.48% <100.00%> (-0.67%) ⬇️ |
| src/algorithms/ctmrg/sequential.jl | 100.00% <100.00%> (ø) |
| src/algorithms/ctmrg/simultaneous.jl | 100.00% <100.00%> (ø) |
| ...rithms/optimization/fixed_point_differentiation.jl | 92.85% <100.00%> (-1.80%) ⬇️ |
| src/algorithms/time_evolution/apply_gate.jl | 93.33% <100.00%> (ø) |
| src/utility/qr.jl | 86.66% <100.00%> (-7.28%) ⬇️ |
| src/algorithms/ctmrg/c4v.jl | 89.62% <85.71%> (-0.67%) ⬇️ |
| ... and 3 more | |

... and 1 file with indirect coverage changes


@leburgel leburgel marked this pull request as ready for review April 22, 2026 09:59
@leburgel leburgel requested review from lkdvos and pbrehmer April 22, 2026 10:00
@pbrehmer (Collaborator) commented

Wow, this is really nice, thanks Lander! I can give some opinions right off the bat:

  • I love that the FixedSVD, FixedEigh and FixedQR structs are gone, this was a hassle
  • I would be in favor of removing the :diffgauge scheme. It was there mostly for experimental purposes and dates back to the very early phases of PEPSKit, when PEPS people hadn't really figured out fixed-point differentiation that well yet.
  • Regarding capitalization: I like your proposal but wouldn't it be even more consistent if we also capitalize our own algorithms? For example :halfinfinite -> :HalfInfinite or :simultaneous -> :Simultaneous would make sense to me. I don't really mind that this is very breaking since the release will anyway be really breaking.

I will take a proper look tomorrow and add some more comments, but this definitely is a massive improvement.

Comment thread docs/src/examples/j1j2_su/index.md
@Yue-Zhengyuan (Member) commented

  • I would be in favor of removing the :diffgauge scheme

@pbrehmer I don't know the details yet, but I remember that :diffgauge is required if one wants to restrict the state to be real?

@pbrehmer (Collaborator) commented

  • I would be in favor of removing the :diffgauge scheme

@pbrehmer I don't know the details yet, but I remember that :diffgauge is required if one wants to restrict the state to be real?

This was related to the fact that the gauge fixing routine in :fixed mode (which would previously use FixedSVD etc.) would generally produce complex gauge signs, such that real tensors would turn into complex ones within the first step of optimization. I haven't looked at the updated gauge fixing procedure, but I would assume that it still produces complex gauges?

I would still be in favor of removing :diffgauge but maybe just add a gauge fixing method which will leave the tensors real if the current procedure is not compatible with real tensors.

@leburgel (Member, Author) commented

I would still be in favor of removing :diffgauge but maybe just add a gauge fixing method which will leave the tensors real if the current procedure is not compatible with real tensors.

If we can make sure the sign matrices are real at the end of the gauge fixing procedure for real environments, we should be able to use :fixed with real tensors. So that sounds like the way to go. I'm not going to remove it here anyway; I'm just asking for opinions and will address it in a follow-up.

Comment thread src/algorithms/ctmrg/projectors.jl Outdated
Comment thread src/algorithms/ctmrg/gaugefix.jl Outdated
@pbrehmer (Collaborator) left a comment

Again I'll have to say that I really like this! This is a big step forward in cleaning up the codebase and making things more consistent. I only have some minor comments. One thing I'm still unsure about is how we should best handle the capitalization of the algorithms since I do feel a bit torn about mixing capitalized MAK-style algorithms with lowercase ones.

Comment thread src/utility/eigh.jl Outdated
Comment thread src/utility/eigh.jl Outdated
Comment thread src/utility/svd.jl Outdated
Comment thread src/Defaults.jl
Comment thread test/utility/eigh_wrapper.jl
Comment thread src/algorithms/ctmrg/gaugefix.jl Outdated
Comment thread src/utility/eigh.jl Outdated
Comment thread src/algorithms/ctmrg/projectors.jl Outdated
@leburgel (Member, Author) commented

It seems the updated svd_trunc pullback in MatrixAlgebraKit v0.6.6 is giving some trouble: https://github.com/QuantumKitHub/PEPSKit.jl/actions/runs/24983503557/job/73151035377?pr=364

As far as I can tell this was a perfectly working test before, so the new pullback somehow gives NaNs in some weird edge cases. Maybe it would be good to still have a slower fallback in case the new implementation fails?

@pbrehmer (Collaborator) commented

It seems the updated svd_trunc pullback in MatrixAlgebraKit v0.6.6 is giving some trouble: https://github.com/QuantumKitHub/PEPSKit.jl/actions/runs/24983503557/job/73151035377?pr=364

As far as I can tell this was a perfectly working test before, so the new pullback somehow gives NaNs in some weird edge cases. Maybe it would be good to still have a slower fallback in case the new implementation fails?

This is the case where we artificially create a degenerate spectrum so it does make sense that the Sylvester solver is struggling but it even seems to struggle in the presence of Lorentzian broadening (the second error), so that is odd to me. I'm not sure why the previous Sylvester solver would produce a converging result and the current one doesn't.

@leburgel (Member, Author) commented

It seems like the new svd_trunc pullback uses the degeneracy_atol as a convergence measure in its Sylvester solver loop:
https://github.com/QuantumKitHub/MatrixAlgebraKit.jl/blob/5f134300121729272a86b8250aff1a66dea226c5/src/pullbacks/svd.jl#L251

This probably means that the way we disable the broadening and cutoff in the SVD wrapper test

no_broadening_no_cutoff_alg = @set alg.rrule_alg.degeneracy_atol = 1.0e-30

now has some unintended consequences in the actual pullback implementation.
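For context, @set comes from Accessors.jl and rebuilds an immutable struct with one (possibly nested) field replaced. A minimal self-contained sketch, with made-up struct names standing in for the actual algorithm types:

```julia
using Accessors  # provides @set

# Illustrative stand-ins for the real algorithm structs.
struct RRuleAlg
    degeneracy_atol::Float64
end
struct TestAlg
    rrule_alg::RRuleAlg
end

alg = TestAlg(RRuleAlg(1.0e-12))
alg′ = @set alg.rrule_alg.degeneracy_atol = 1.0e-30
# alg is untouched; alg′ carries the new nested value. In the wrapper test this
# was meant to disable broadening and cutoff, but as noted above the same field
# now also feeds into the pullback's Sylvester solver convergence check.
```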

@lkdvos (Member) commented Apr 27, 2026

Hahahah yes, we should pick a reasonable value there, which probably fixes the issues. I'm assuming that value was there because we previously had issues when it was chosen too small, but I think this might be better now?

@leburgel (Member, Author) commented

It turns out that the degeneracy_atol was actually not the problem, the issue was that the broadening test for symmetric tensors used a truncation space that cut through degeneracies, in which case the truncated pullback will just not work at all. Thanks to @Jutho for pointing this out. Switching on the verbosity, this shows that the loss function has a very large gauge-dependence even with the old implementation. So in this case all of the gradients were basically nonsense, but we just never noticed.

I updated the choice of degeneracies to be nicely commensurate with the truncation space, which gives sensible results for me locally.

@leburgel leburgel changed the title [WIP] Update decomposition handling Update decomposition handling Apr 28, 2026
@leburgel (Member, Author) commented

I opened some issues on discussion points that aren't directly related to decomposition handling. I think this is ready for another round of review.

@Jutho (Member) commented Apr 28, 2026

Just a small note that the actual gauge dependence warning, when using svd_trunc_pullback, is about the components of the adjoint variable in the subspaces of the kept degenerate singular values, which are subsequently put to zero by the rule, i.e. it is about the part that is being ignored/thrown away.

The components of the cotangents that mix the kept versus discarded singular values of a degenerate pair cannot be assessed by svd_trunc_pullback, because it doesn't know about the discarded singular values and thus about this degeneracy. That part is therefore not thrown away, and thus causes things to blow up.

That is different for svd_pullback which requires all singular values and will also warn about these gauge components of the adjoints and subsequently throw them away, thereby avoiding divergences.

(Just a clarification as I myself miscommunicated this to Lander earlier today).

@pbrehmer (Collaborator) left a comment

This is looking good to me now if all the tests run through! I only have a minor comment (you can call @set on nested structs) but that's just cosmetics really.

Comment thread src/algorithms/ctmrg/projectors.jl
@leburgel leburgel enabled auto-merge (squash) April 29, 2026 12:58
@lkdvos (Member) commented Apr 29, 2026

Will merge this as-is and deal with the actions later, to keep things moving.

@lkdvos lkdvos disabled auto-merge April 29, 2026 17:25
@lkdvos lkdvos merged commit 2677eab into master Apr 29, 2026
63 checks passed
@lkdvos lkdvos deleted the lb/purge_lapack branch April 29, 2026 17:25