reformat and revise subspace drift by yuiyuiui · Pull Request #146 · Jutho/KrylovKit.jl

yuiyuiui · 2025-12-24T12:54:26Z

I have fixed the issue of subspace drift described in (#143). The cause of the problem, my solution, and the results can be found in (#143 (comment)). Specifically, I used DGKS reorthogonalization to improve the accuracy of block_qr!, ensuring that the vectors within the Q produced by block_qr! are strictly orthogonal (“Reorthogonalization and Stable Algorithms for Updating the Gram–Schmidt QR Factorization”, "Numerical Methods for Least Squares Problems"). Then, by projection, I removed components along V from the Q obtained in block_qr! to guarantee that Q is orthogonal to V. Afterwards, I applied block_qr! again to Q to ensure orthogonality among its vectors. It can be proven that this second block_qr! preserves the subspace at machine precision, resulting in a block that lies strictly in the orthogonal complement of V and whose vectors are strictly mutually orthogonal.

codecov · 2025-12-24T13:20:12Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.59%. Comparing base (f2d9ba5) to head (de45584).

Additional details and impacted files

@@            Coverage Diff             @@
##           master     #146      +/-   ##
==========================================
- Coverage   88.37%   86.59%   -1.78%     
==========================================
  Files          36       36              
  Lines        3956     3917      -39     
==========================================
- Hits         3496     3392     -104     
- Misses        460      525      +65

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Jutho · 2026-01-26T19:48:30Z

I had forgotten about this PR. Feel free to send me reminders occasionally. I won't have time this week but will try to make time next week.

yuiyuiui · 2026-02-04T03:31:52Z

@Jutho No problem! I've added tests for issue #143 and corrected the wording from 'draft' back to 'drift'. Looking forward to your feedback.

yuiyuiui · 2026-02-08T19:27:13Z

Hi @Jutho , just a friendly reminder regarding my PR when you have a moment. Please let me know if there’s anything I should update or improve. Thanks!

yuiyuiui · 2026-02-25T20:22:50Z

Thank you for your comments. I would like to clarify the rationale behind my implementation and other code changes in the following three points.

First, regarding the decision to rescale block[j] prior to the second orthogonalization: I believe this approach offers better numerical precision compared to the typical reorthogonalization workflow. While the two methods are theoretically equivalent, a potential issue arises when block[j] has an extremely small norm. In the standard approach, any tiny rounding errors in the floating-point mantissa are amplified by a massive factor during the final rescaling step. By performing an intermediate scaling before the second orthogonalization, we ensure that block[j] remains $O(1)$ during the final normalization, which significantly mitigates the amplification of these errors. Although I haven't yet conducted a formal rigorous study on this, my preliminary mathematical estimates suggest that the improvement in error is proportional to the magnitude of the largest element in the output vector. As a numerical demonstration, in the provided example, the ratio of the errors between the two methods is approximately $O(10^{-d/2})$ with the vector length being $10^d$, which aligns with the observed results. Therefore, I believe incorporating this intermediate scaling step can enhance the robustness of the reorthogonalization process.

using Random, LinearAlgebra, Plots

err_rate = Float64[]

maxd = 8
Random.seed!(123)

for d in 1:maxd

    n = 10^d

    Q = qr(randn(n, 2)).Q

    v = (Q[:, 1] + Q[:, 2]) * 1e-12

    u = Q[:, 2]

    v1 = v - dot(v, u) * u
    normalize!(v1)
    norm(v1)

    v2 = normalize(v1)
    v2 = v2 - dot(v2, u) * u
    normalize!(v2)
    norm(v2)

    err1 = abs(dot(v1, u))

    err2 = abs(dot(v2, u))

    push!(err_rate, err2 / err1)
end

plot(1:maxd, err_rate, xlabel="log10(n)", label="err ratio", yscale=:log10)
plot!(1:maxd, 10 .^ ( - (1:maxd) / 2), label="10^(-n/2)")

yuiyuiui · 2026-02-25T20:54:55Z

Second, I would like to explain why I did not use the relative change in the vector norm as the criterion for triggering reorthogonalization. There are two main considerations:

Consistency with the global tolerance: We already have a unified scale, qr_tol, to determine whether a vector is numerically zero during the QR process. If we were to use a relative threshold for subspace drift, we might encounter cases where the drift-detection threshold is actually smaller than qr_tol. Conceptually, this creates ambiguity. More importantly, the primary cause of subspace drift is often when $\beta$ is only slightly larger than qr_tol. If the drift threshold is set lower than qr_tol, these critical cases might be bypassed. I have not found any theoretical guarantee that severe subspace drift would not occur under such circumstances.
Computational efficiency in high-dimensional spaces: Classical reorthogonalization theory suggests that if $\beta < |v|/\sqrt{2}$ after orthogonalization, a second pass is required. However, in high-dimensional spaces, for a given low-dimensional subspace $X$, the measure of unit vectors with an angle to $X$ of $\pi/4$ or less is nearly zero relative to the total area of the unit sphere. Applying such a stringent "numerical safety" standard might be overly conservative, leading to unnecessary reorthogonalization steps and wasted computational resources. I believe a more relaxed criterion is sufficient for maintaining stability without sacrificing performance.

Admittedly, I do not yet have a definitive conclusion on what the mathematically optimal criterion should be. In the current implementation, I have tentatively adopted the condition tol < β < 100 * tol as the trigger for "potential subspace drift requiring additional orthogonalization."

I should clarify that this threshold is based on my practical experience and numerical intuition encountered during testing, rather than a rigorous theoretical derivation. My goal was to provide a robust safeguard for these edge cases while maintaining computational efficiency.

yuiyuiui · 2026-02-25T21:08:07Z

Third, regarding the is_drift flag and the associated @warn messages: In my latest commit d291ddf, I have removed the sections that trigger a warning when is_drift is true.

The rationale is that our updated logic can now robustly handle cases where subspace drift occurs. I believe it is unnecessary to alert the user or prompt them to adjust their input parameters. The goal is to provide a more seamless experience where the solver remains stable even when encountering these edge cases.

yuiyuiui · 2026-03-04T18:25:12Z

First, regarding the decision to rescale block[j] prior to the second orthogonalization: I believe this approach offers better numerical precision compared to the typical reorthogonalization workflow. While the two methods are theoretically equivalent, a potential issue arises when block[j] has an extremely small norm. In the standard approach, any tiny rounding errors in the floating-point mantissa are amplified by a massive factor during the final rescaling step. By performing an intermediate scaling before the second orthogonalization, we ensure that block[j] remains $O(1)$ during the final normalization, which significantly mitigates the amplification of these errors. Although I haven't yet conducted a formal rigorous study on this, my preliminary mathematical estimates suggest that the improvement in error is proportional to the magnitude of the largest element in the output vector. As a numerical demonstration, in the provided example, the ratio of the errors between the two methods is approximately $O(10^{-d/2})$ with the vector length being $10^d$, which aligns with the observed results. Therefore, I believe incorporating this intermediate scaling step can enhance the robustness of the reorthogonalization process.

Upon further reflection, I must apologize as my previous test case for reorthogonalization was not sufficiently representative.

After conducting more extensive numerical experiments, I found that while an additional scale!! step can slightly reduce errors in certain specific edge cases, the precision remains essentially identical to the standard approach in most scenarios. Consequently, I have concluded that the extra scale!! step is not strictly necessary. I have reverted this logic and implemented the standard reorthogonalization procedure in my latest commit.

yuiyuiui · 2026-03-10T06:53:23Z

Hi @Jutho , just a friendly reminder regarding my PR when you have a moment. Please let me know if there’s anything I should update or improve. Thanks!

yuiyuiui · 2026-04-17T14:19:21Z

Hi @Jutho, just a friendly reminder regarding my PR when you have a moment. Please let me know if there’s anything I should update or improve. Thanks!

Co-authored-by: Jutho <Jutho@users.noreply.github.com>

…1726376

Jutho · 2026-05-22T22:47:27Z

Thanks for this and especially for the long patience. I think this looks very good now, so I'll merge.

yuiyuiui changed the title ~~reformat and revise subspace draft~~ reformat and revise subspace drift Feb 3, 2026

Jutho reviewed Feb 19, 2026

View reviewed changes