Skip to content

TESTING: fix inconsistent entry count in csd.in#1232

Merged
langou merged 1 commit intoReference-LAPACK:masterfrom
nakatamaho:fix/testing-csd-input-count
Apr 14, 2026
Merged

TESTING: fix inconsistent entry count in csd.in#1232
langou merged 1 commit intoReference-LAPACK:masterfrom
nakatamaho:fix/testing-csd-input-count

Conversation

@nakatamaho
Copy link
Copy Markdown
Contributor

@nakatamaho nakatamaho commented Apr 11, 2026

Summary

TESTING/csd.in declares that the number of values of M, P, and Q is 10, but each of the three corresponding data lines actually contains 11 entries.

This PR updates that count from 10 to 11 so the header matches the test input data.

Details

In TESTING/csd.in:

  • the M line has 11 values
  • the P line has 11 values
  • the Q line has 11 values

but the file currently says:

10                                   Number of values of M, P, Q

This change makes the metadata consistent with the actual contents of the file.

Impact

This is a test-input consistency fix only.

  • no algorithmic changes
  • no changes to the CSD routines themselves
  • only the test data header is corrected

TESTING/csd.in declares that the number of values of M, P, and Q
is 10, but each of the three data lines actually contains 11
entries.

Update the count to 11 so the header matches the test input data.
This is a test-data consistency fix only.
langou
langou previously approved these changes Apr 11, 2026
@nakatamaho nakatamaho dismissed langou’s stale review April 11, 2026 15:18

The merge-base changed after approval.

langou
langou previously approved these changes Apr 11, 2026
@nakatamaho nakatamaho dismissed langou’s stale review April 11, 2026 15:23

The merge-base changed after approval.

@nakatamaho
Copy link
Copy Markdown
Contributor Author

nakatamaho commented Apr 12, 2026

Thank you for the thoughtful summary, @langou, and thanks also to @martin-frbg for the context in #1231. I would like to argue in favor of option (1), i.e. actually changing 10 to 11, for the following reasons.

  1. More fundamentally, I believe it is simply wrong for the test suite to contain an input value that is known to be confusing or ambiguous to readers. If the choice between 10 and 11 is non-obvious enough that it requires an explanatory comment, that itself is a sign that the value should be the principled one rather than the historically convenient one.

  2. Adding the last test case (which is what the 11 enables) is actually quite important from a correctness standpoint: it exposes a bug that is present in the released LAPACK 3.12.1. Interestingly, this bug is not present in 3.9.1, and it is also not present in the current master checkout — so it appears that the issue was introduced somewhere after 3.9.1 and then fixed somewhere between 3.12.1 and the current tree. Having the additional test in place would have caught this regression at the time, and will help guard against similar regressions in the future.
    docker@95ce403630a3:~/mplapack/external/lapack/work_3.12.1/internal/lapack-3.12.1/TESTING/EIG$ ./xeigtstz < ~/csd.in

Tests of the CS Decomposition routines

LAPACK VERSION 3.*.0

The following parameter values will be used:
M: 0 10 10 10 10 21 24 30 22 32
55
P: 0 4 4 0 10 9 10 20 12 12
40
N: 0 0 10 4 4 15 12 8 20 8
20

Relative machine underflow is taken to be 0.222507-307
Relative machine overflow is taken to be 0.179769+309
Relative machine precision is taken to be 0.111022D-15

Routines pass computational tests if test ratio is less than 30.00

CSD routines passed the tests of the error exits ( 8 tests done)

CSD: CS Decomposition
Matrix types:
1: Random orthogonal matrix (Haar measure)
2: Nearly orthogonal matrix with uniformly distributed angles atan2( S, C ) in CS decomposition
3: Random orthogonal matrix with clustered angles atan2( S, C ) in CS decomposition
Test ratios:
2-by-2 CSD
1: norm( U1' * X11 * V1 - C ) / ( max( P, Q) * max(norm(I-X'*X),EPS) )
2: norm( U1' * X12 * V2-(-S)) / ( max( P,M-Q) * max(norm(I-X'*X),EPS) )
3: norm( U2' * X21 * V1 - S ) / ( max(M-P, Q) * max(norm(I-X'*X),EPS) )
4: norm( U2' * X22 * V2 - C ) / ( max(M-P,M-Q) * max(norm(I-X'*X),EPS) )
5: norm( I - U1'*U1 ) / ( P * EPS )
6: norm( I - U2'*U2 ) / ( (M-P) * EPS )
7: norm( I - V1'*V1 ) / ( Q * EPS )
8: norm( I - V2'*V2 ) / ( (M-Q) * EPS )
9: principal angle ordering ( 0 or ULP )
2-by-1 CSD
10: norm( U1' * X11 * V1 - C ) / ( max( P, Q) * max(norm(I-X'*X),EPS) )
11: norm( U2' * X21 * V1 - S ) / ( max( M-P,Q) * max(norm(I-X'*X),EPS) )
12: norm( I - U1'*U1 ) / ( P * EPS )
13: norm( I - U2'*U2 ) / ( (M-P) * EPS )
14: norm( I - V1'*V1 ) / ( Q * EPS )
15: principal angle ordering ( 0 or ULP )
M= 55 P= 40, Q= 20, type 1, test 10, ratio= 0.143747E+15
M= 55 P= 40, Q= 20, type 1, test 11, ratio= 0.490979E+15
M= 55 P= 40, Q= 20, type 2, test 10, ratio= 0.322510E+12
M= 55 P= 40, Q= 20, type 2, test 11, ratio= 0.477205E+12
M= 55 P= 40, Q= 20, type 3, test 10, ratio= 0.250525E+15
M= 55 P= 40, Q= 20, type 3, test 11, ratio= 0.429782E+15
CSD: 6 out of 660 tests failed to pass the threshold

End of tests
Total time used = 0.01 seconds

Note that xeigtstz only exposes the error and the xeigtstc xeigtstd xeigtsts tests passed without errors for 3.12.1.

  1. Regarding the concern about the random number generator seed: since the change only adds one extra test at the very end, it does not affect any of the preceding tests. The RNG state consumed by the earlier tests is unchanged, so there is no risk of perturbing existing test outcomes downstream. Moreover, the csd.in test is a standalone test, which do not excesize any other test after that. That means extending the tests won't harm anything. If you add more tests, then situation would become more complicated, and we cannot easiliy add this bug exposing path anymore.

  2. As for the runtime cost: the additional time introduced by appending this one test is essentially negligible on any modern machine. I do not think the energy or wall-clock argument outweighs the benefit of catching real bugs like the one described above.
    (the test result above is run on Ryzen 3970X using 1 core)

For these reasons I would gently lean toward merging this PR as-is (option 1) rather than just adding a comment (option 2). Of course, I am happy to also add an explanatory comment alongside the change if that would make the intent clearer to future readers.

@nakatamaho
Copy link
Copy Markdown
Contributor Author

Let me share some background on how I came to care about this particular line in csd.in, since I think the history is relevant to the decision.
I am the developer of MPLAPACK (https://github.com/nakatamaho/mplapack/), a multiple-precision port of LAPACK. MPLAPACK 2.1.1 is based on LAPACK 3.9.1. During its development, I noticed that csd.in did not contain a path with M = P = Q = 1, and I recalled that there had been an MPLAPACK-side bug specifically along that code path. I therefore wanted to make sure the 1 1 1 case was actually exercised. However, when I counted the entries against the header on the first line of csd.in, the header said 10 while there were clearly more entries listed — so I believed at the time that this was simply a bug in the header. I changed it to 12, and with that change csd.in passed cleanly. As a result, the issue never surfaced during the MPLAPACK 2.1.1 cycle.
I am now working on rebasing MPLAPACK onto LAPACK 3.12.1 (https://github.com/nakatamaho/mplapack/tree/topic/lapack-3.12.1). This time, the additional path 54 40 20 — one of the cases that only gets exercised when the count is raised — exposed a real failure under 3.12.1. I initially assumed it was an MPLAPACK-side regression and spent a non-trivial amount of time debugging it from that angle, without success. I then went back and ran reference LAPACK 3.12.1 itself with a csd.in containing all 12 entries, and observed exactly the same failure. At that point I was confident the bug was in 3.12.1 itself, not in MPLAPACK. Out of curiosity I then checked out current master and reran the same test — and the failure was gone. In other words, the bug was present in the released 3.12.1 and was silently fixed somewhere between 3.12.1 and the current tree (a careful reading of the issues and PRs in that window would presumably identify exactly when).
I think this is a rather unfortunate quality-assurance episode, and it is precisely the kind of situation that motivates my preference for option (1) in my earlier comment. The purpose of the test suite is to prevent regressions from being shipped. In this case:

the input file csd.in was confusing in a way that made an honest reader (me) believe the header was simply wrong,
a real bug actually existed along one of the omitted code paths,
the test suite as configured did not detect it, and
3.12.1 was nevertheless released in that state.

Raising the count so that all listed cases are actually exercised would have caught this, and would protect against analogous slips in the future. For that reason, I would still gently advocate for changing 10 to 11 (and, in the spirit of point 0 in my earlier comment, I would also be happy to add an explanatory comment alongside the change so that future readers understand the intent).

@nakatamaho
Copy link
Copy Markdown
Contributor Author

Summary: +1 for changing 10 → 11.

@langou
Copy link
Copy Markdown
Contributor

langou commented Apr 13, 2026

Summary: +1 for changing 10 → 11.

This is a useful summary! 👍

langou
langou previously approved these changes Apr 13, 2026
@nakatamaho nakatamaho dismissed langou’s stale review April 13, 2026 13:26

The merge-base changed after approval.

@langou
Copy link
Copy Markdown
Contributor

langou commented Apr 13, 2026

For some reasons, my approving review is not enough to enable the merge.
Can someone else than me consider reviewing and approving?
The error message is "Review Required: At least 1 approving review is required by reviewers with write access."
And just below: "@langou approved these changes."
No idea what's going on, but, if someone can approve so that we merge this PR, this would be great.

@langou langou self-requested a review April 13, 2026 13:30
langou
langou previously approved these changes Apr 13, 2026
@nakatamaho nakatamaho dismissed langou’s stale review April 13, 2026 13:31

The merge-base changed after approval.

martin-frbg
martin-frbg previously approved these changes Apr 13, 2026
@nakatamaho nakatamaho dismissed martin-frbg’s stale review April 13, 2026 14:23

The merge-base changed after approval.

@martin-frbg
Copy link
Copy Markdown
Collaborator

Seems every approval is immediately dismissed with an automated message (credited to nakatamaho) that the merge base changed after approval. I don't think I've encountered this before, but perhaps updating the PR by merging the unrelated changes that were merged into master in the meantime will help ?

@martin-frbg
Copy link
Copy Markdown
Collaborator

https://github.com/orgs/community/discussions/58535 seems related, either there's an associated "branch protection rule" activated somewhere in the project settings (didn't see it, but may be visible to project owner only ?) or it's a github hickup. One suggestion in that thread was to close and reopen the PR...

martin-frbg
martin-frbg previously approved these changes Apr 13, 2026
@nakatamaho nakatamaho dismissed martin-frbg’s stale review April 13, 2026 14:35

The merge-base changed after approval.

@langou langou closed this Apr 13, 2026
@langou langou reopened this Apr 13, 2026
@martin-frbg martin-frbg reopened this Apr 13, 2026
@langou
Copy link
Copy Markdown
Contributor

langou commented Apr 13, 2026

https://github.com/orgs/community/discussions/58535 seems related, either there's an associated "branch protection rule" activated somewhere in the project settings (didn't see it, but may be visible to project owner only ?) or it's a github hickup. One suggestion in that thread was to close and reopen the PR...

Thanks a lot @martin-frbg. This seems to have helped a lot. Thanks for the research work.

@martin-frbg
Copy link
Copy Markdown
Collaborator

Sorry, didn't see in time that you also did the close/reopen/review... think you can merge it now if you like, I promise not to frog up the process anymore ?

@langou
Copy link
Copy Markdown
Contributor

langou commented Apr 13, 2026

Sorry, didn't see in time that you also did the close/reopen/review... think you can merge it now if you like, I promise not to frog up the process anymore ?

(*) Feel welcome to frog up 🐸 processes as you just did as much as you want! What you just did was quite helpful. Good frogging up!

(*) I am waiting for the test suite to complete and see the green checkmark. ✅ and then I will merge 🔀.

@nakatamaho
Copy link
Copy Markdown
Contributor Author

I appreciate it! Thank you very much for your kindness!

@nakatamaho nakatamaho closed this Apr 13, 2026
nakatamaho added a commit to nakatamaho/mplapack that referenced this pull request Apr 14, 2026
Add the upstream ZUNBDB3 fix that conjugates the X21 row before
generating and applying the right Householder reflector. The row is
already conjugated again afterward, so this makes the reflector operate
with the expected complex vector convention while preserving the stored
data layout.

Register the LAPACK 3.12.1 patch, include it in EXTRA_DIST, and mirror
the fix in the generated Cunbdb3 reference implementation.

Enable the LAPACK 3.12.1 CSD input patch in the fable test patch flow,
which adds small M/P/Q edge cases to csd.in.

see also: Reference-LAPACK/lapack#1232
          Reference-LAPACK/lapack#1196
@nakatamaho nakatamaho reopened this Apr 14, 2026
@nakatamaho
Copy link
Copy Markdown
Contributor Author

reopen as this is not yet merged

@nakatamaho
Copy link
Copy Markdown
Contributor Author

For record #1196

This time, the additional path 54 40 20 — one of the cases that only gets exercised when the count is raised — exposed a real failure under 3.12.1. I initially assumed it was an MPLAPACK-side regression and spent a non-trivial amount of time debugging it from that angle, without success. I then went back and ran reference LAPACK 3.12.1 itself with a csd.in containing all 12 entries, and observed exactly the same failure. At that point I was confident the bug was in 3.12.1 itself, not in MPLAPACK. Out of curiosity I then checked out current master and reran the same test — and the failure was gone. In other words, the bug was present in the released 3.12.1 and was silently fixed somewhere between 3.12.1 and the current tree (a careful reading of the issues and PRs in that window would presumably identify exactly when).

This is #1196

@langou langou merged commit f6fb111 into Reference-LAPACK:master Apr 14, 2026
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants