Skip to content

Linearize fits in Nuth & Kääb, 1D/2D polynimials#940

Open
fnands wants to merge 11 commits into
GlacioHack:mainfrom
fnands:linearize_fit
Open

Linearize fits in Nuth & Kääb, 1D/2D polynimials#940
fnands wants to merge 11 commits into
GlacioHack:mainfrom
fnands:linearize_fit

Conversation

@fnands
Copy link
Copy Markdown
Contributor

@fnands fnands commented Mar 28, 2026

Hi, I was investigating the performance of the Nuth & Kääb implementation here a while back and realized that one reason the performance can be a bit slow for larger rasters is the fact that a non-linear solver (scipy.optimize.curve_fit) is being used, even though the problem can be linearized. I also now saw that the polynomial cases (1D and 2D) call non-linear solvers despite being fundamentally linear.

I originally did Nuth & Kääb in a somewhat hacky way, but used Claude now to make the implementation a bit cleaner and do the 1D and 2D poly cases.

Changes

  1. Nuth & Kääb coregistration (xdem/coreg/affine.py)

The Nuth & Kääb fit function is:

$$y(x) = a \cos(b - x) + c$$

where $y = \Delta h / \tan(\text{slope})$ and $x = \text{aspect}$.

Applying the cosine subtraction identity:

$$\cos(b - x) = \cos(b)\cos(x) + \sin(b)\sin(x)$$

the model becomes linear in the substituted parameters $A = a\cos(b)$, $B = a\sin(b)$:

$$y(x) = A\cos(x) + B\sin(x) + c$$

Solved directly with np.linalg.lstsq on the design matrix $[\cos(x),\ \sin(x),\ \mathbf{1}]$. The easting and northing offsets simplify further:

$$\text{easting} = a\sin(b) = B \qquad \text{northing} = a\cos(b) = A$$

so no back-conversion through $a$ and $b$ is needed. The binning step (72 aspect bins, median per bin) is preserved for robustness.

  1. 1D polynomial fitting (xdem/fit.py)

Polynomials are linear in their coefficients by construction. The Linear + scipy branch of robust_norder_polynomial_fit was routing through curve_fit. Replaced with np.polyfit (direct QR-based OLS).
Note: np.polyfit returns coefficients in descending degree order while polynomial_1d uses numpy.polynomial.polyval (ascending order), so the result is reversed.

This also fixes a pre-existing test failure where loss and f_scale kwargs were not forwarded correctly by _wrapper_scipy_leastsquares.

  1. 2D polynomial fitting / Deramp (xdem/fit.py, xdem/coreg/base.py, xdem/coreg/biascorr.py)

Deramp was fitting polynomial_2d with curve_fit. Added a design_matrix_func key to InFitOrBinDict and a _ols_fit helper to _bin_or_and_fit_nd: when design_matrix_func is set, the function routes to lstsq instead of fit_optimizer.

polynomial_2d is now a callable class design_matrix_polynomial_2d(poly_order) that normalizes coordinates to $[-1, 1]$ before building the design matrix (critical for numerical stability at higher polynomial orders, where raw pixel monomials like $x^4 y^4$ span $\sim 10^{22}$). The unnormalize_coeffs method converts coefficients back to the original coordinate space, so they remain compatible with polynomial_2d during the apply step.

Benchmark results

Method curve_fit (nonlinear) OLS Speedup
NuthKaab 0.35 ms/call 0.024 ms/call 15×
Polynomial 1D 3.1 ms/call 0.80 ms/call
Polynomial 2D 456 ms/call 51 ms/call

The OLS solution for polynomial_2d also recovers the true coefficients more accurately than curve_fit with its default p0=ones initial guess, which can converge to a local minimum in ill-conditioned cases.

Tests

All the tests that passed before pass, as well as the ones I added here (not everything passes, but that was like that before).

@fnands
Copy link
Copy Markdown
Contributor Author

fnands commented Mar 28, 2026

Updated the docs to reflect the changes made.

There was also one test that failed that wasn't related to any change I made (seemed to struggle to converge). The one in spatialstats, it seems to be something that struggles in newer versions of scipy

@rhugonnet
Copy link
Copy Markdown
Member

@fnands This is amazing! Thanks a lot for looking into this, and both checking performance & adding tests 🙂

Does the new optimization definition also improves the accuracy of Nuth and Kääb? From the old tests passing, it looks like you get 100% the same results for the test_coreg_translation__example, so no change there?

Some background here: I discovered fairly recently that Nuth and Kääb is actually the equivalent of LZD for a translation-only, simply written in polar coordinates (same method; re-invented 23 years later). Yet Nuth and Kääb doesn't perform as well in terms of horizontal alignment as LZD does somehow... It seems to suffer either from the polar coordinate conversion (LZD uses the cartesian gradient directly), or the choice of normalizing by the slope tangent, or I thought maybe the optimization.

On the PR: Just so you know, there will be some conflicts with #759. I can deal with those there later, but better to coordinate with you here first on how to merge!
In that PR, I change fit_optimizer into fit_minimizer everywhere to be more generic, and moved the default to scipy.optimize.least_squares (instead of curve_fit).
How would we combine this with your change? I guess it doesn't change the underlying issue: least_squares is still always non-linear, so not optimal speed-wise?
So (correct me if I'm wrong):

  • The new design matrix for Nuth and Kääb remains entirely relevant.
  • For the linearized optimizer that you added, maybe I could then also add it consistently to all other linearized problems? (most of them are consolidated only in Add support for more parametrization in coregistration #759, so we can't account for them in this PR)

This would include ICP(method='point-to-plane', linearized=True), LZD(linearized=True) (once DhMinimize is merged into it in #759, as LZD is actually just the linearized version of a generic DhMinimize with rotations), and NuthKaab (always). The linearized is a new argument in #759.

What do you think? 😄

@fnands
Copy link
Copy Markdown
Contributor Author

fnands commented Mar 30, 2026

Hey!

Does the new optimization definition also improves the accuracy of Nuth and Kääb? From the old tests passing, it looks like you get 100% the same results for the test_coreg_translation__example, so no change there?

No, same results for Nuth and Kääb as well as 1D Poly case, at least within numerical precision. The 2D case actually does result in better results (as tested by applying and reconstructing synthetic shifts), presumably due to numerical issues there with the large numbers one can see when using raw pixel coordinates.

Ah, I didn't see the connection, but I found this paper that points out the similarity.

there will be some conflicts with #759

Ah, ok. I can also rebase my change onto that branch, if it will make things easier?
Then you can do one big release.

How would we combine this with your change? I guess it doesn't change the underlying issue: least_squares is still always non-linear, so not optimal speed-wise?

Yup, for at least one of the cases here I was actually replacing the non-linear least squares with the linear one.

For the linearized optimizer that you added, maybe I could then also add it consistently to all other linearized problems? (most of them are consolidated only in #759, so we can't account for them in this PR)

Oh yeah, I can have a look at that. The same should hold in the linearized cases.

I might have some time later this week or over the weekend to see how much work it would be, but I think a lot of it would just be plumbing.

@rhugonnet
Copy link
Copy Markdown
Member

rhugonnet commented Mar 30, 2026

Thanks! I agree, linking the support to other linearized methods should be mostly connecting to what you've already done. 😉
But #759 is pretty big, so if merging looks like too much trouble, don't worry about it 😅. I'm familiar with the changes there so it might be faster for me.

In terms of public API, I have one thought:

If I understand correctly, with the current setup, any user-input fit_optimizer will be used for non-linear cases, but all linear cases will override the user-input with lstsq? (because design_matrix_func is always defined there).
A limiting effect of this is the following use case: A user wants to run NuthKaab with a custom (e.g. RANSAC?) optimizer to get outlier robustness, but that optimizer won't be able to go through...
Maybe, to support all user inputs, we could move fit_optimizer defaults to be equal to None everywhere. Then, if a given method is linear, we would default to lstsq. If it is not, we would default to least_squares. And any user input would naturally override those (so we'd only need to adjust the logic triggering the OLS through design_matrix).
Also related to this: With this setup, I don't think we would need design_matrix_func to appear in the .meta of the Coreg class, because fit_optimizer would already show the OLS optimizer function? (Or is there another objective in showing the design_matrix_func?)

fnands and others added 4 commits March 30, 2026 23:48
- fit_optimizer defaults to None in all __init__ signatures instead of
  scipy.optimize.curve_fit
- _bin_or_and_fit_nd only takes the OLS path when fit_optimizer is None
  AND a design_matrix_func is available; any user-provided optimizer
  (e.g. RANSAC, Huber) overrides the linear default
- When fit_optimizer is None and no design_matrix_func, falls back to
  scipy.optimize.curve_fit as the nonlinear default
- Guard callable(fit_optimizer) checks to accept None
- design_matrix_func kept in dict_key_to_str as internal key

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verifies that passing an explicit fit_optimizer (e.g. curve_fit,
standing in for RANSAC or Huber) bypasses the design_matrix_func OLS
path and uses the given optimizer instead, while both solvers produce
numerically close results on the same polynomial problem.

Also fixes the callable guard error message assertion to match the
updated "or None" wording, and updates the test_deramp assertion
to reflect that fit_optimizer now defaults to None.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@fnands
Copy link
Copy Markdown
Contributor Author

fnands commented Mar 31, 2026

If I understand correctly, with the current setup, any user-input fit_optimizer will be used for non-linear cases, but all linear cases will override the user-input with lstsq? (because design_matrix_func is always defined there).
A limiting effect of this is the following use case: A user wants to run NuthKaab with a custom (e.g. RANSAC?) optimizer to get outlier robustness, but that optimizer won't be able to go through...

A damn, I completely missed that use case.

But was it ever possible to call RANSAC? At least for the sklearn version of RANSAC takes the same design matrix as the OLS version (or any linear solver e.g. Huber)

I made some changes and now it will default to OLS, but can still fallback to curve_fit if the user wants it.
But that's kinda pointless as it would just be worse.

What we could do is we could limit fit_optimizer to linear solvers, in which case the user can pass any linear solver that expects (X, y) as input, i.e. anything from sklearn.linear_model.

We could basically make it use sklearn.linear_model.LinearRegression as default, but then the user can pass any linear solver from sklearn, or any method that respects the sklearn API. This will break for users who have fit_optimizer hard-coded to curve_fit, but I doubt that is an issue?

EDIT: we probably don't want to make the sklearn one default as that would require an extra depenency. But we could still make it compatible with any sklearn linear model. There is some complexity related to extracting return values, i.e. RANSAC returns estimator.estimator_.coef_ whereas the others to estimator.coef_...

Ugh, we could have a special case for it... But now it's gets a little complicated.

How do you envision users passing RANSAC here?

@rhugonnet
Copy link
Copy Markdown
Member

Thanks! Using None as default solves most of it I think.

But was it ever possible to call RANSAC?

Not easily but yes. The idea of the current structure was (even if not clearly documented yet), that users could define and pass any optimizer function under a fixed format:

def my_optimizer(func, *args, **kwargs):
    # Run RANSAC through sklearn, a GLS through statsmodel or GPyTorch, or whatever

     # Mirror result object of SciPy, or any format we decide on
     class MyResults:
         x = None
         cov = None
         
     return MyResults
     
nk = xdem.coreg.NuthKaab(fit_optimizer=my_optimizer)

The expected input format by the optimizer would simply match an existing one, like scipy.optimize.least_squares. I knew the format would change with #759 (going from curve_fit to least_squares), which is why we didn't document that advanced feature too early.

We could have some helpers to directly understand passing a sklearn optimizer even if the signature doesn't match that of least_squares (as we already have it as optional dependency), but I think that's another piece of work.

@fnands
Copy link
Copy Markdown
Contributor Author

fnands commented Apr 1, 2026

Ah, yeah Ok, I can see how that can work!

I think I will leave it here then?

I had a quick look at rebasing to #759, but there are a lot of moving pieces there that I don't really feel I understand, and I'm afraid I'll break something...

@rhugonnet
Copy link
Copy Markdown
Member

I agree! I'll do a quick line-by-line review, then we can merge 😉

Comment thread xdem/spatialstats.py
[v * 0.5 for v in p0],
[v * 2.0 for v in p0],
[(lo + hi) / 2 for (lo, hi) in zip(final_bounds[0], final_bounds[1])],
]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice extra addition, I know a lot of users had trouble with the fit and I had to recommend them to pass p0/bounds manually. Good to improve it directly here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I basically had to do this as one of the examples kept failing as it couldn't converge. I wouldn't have touched it otherwise.

Comment thread xdem/coreg/base.py
if fit_optimizer is None and design_matrix_func is not None:
results = _ols_fit(design_matrix_func, xdata, ydata, sigma)
else:
effective_optimizer = fit_optimizer if fit_optimizer is not None else scipy.optimize.curve_fit
Copy link
Copy Markdown
Member

@rhugonnet rhugonnet Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... This works internally, but means that Coreg.meta never stores the final optimizer, so Coreg.info() won't be printing the optimizer function correctly.

Maybe we should instead:
1/ Define fit_optimizer=scipy.optimize.lstsq/curve_fit is fit_optimizer is None directly in NuthKaab.__init__ (and other __init__, same for polynomials). Or we could use Literal["ols"] to clarify that lstsq is not used directly with its signature.
2/ Then, if fit_optimizer="ols" or lstsq in this internal bin_or_and_fit function, we trigger the right mechanism with the design function (otherwise raise an error).

To facilitate 1/ (not copy/paste the same logic in every function instantiation), we could define a _fit_linear attribute for all Coreg classes (set to True in NuthKaab, set to the linearized argument in ICP, etc).
Then we consolidate a single logic in Coreg.__init__: If fit_optimizer is None and "fit" in fit_or_bin, we pick lstsq for a fit_linear method, otherwise curve_fit.

Copy link
Copy Markdown
Member

@rhugonnet rhugonnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All good for me after the above points! 🙂

One last point: I think it is the first time we have an AI commit (Claude), and I'm not sure how we deal with this in terms of copyright. We had some internal discussions on this recently. I think in this case, as the changes link back to well-documented packages like SciPy (and are not a new feature with completely new logic that may be copying an underlying source), there is little risk.
@sdinot: Any advice here?
FYI @belletva @adehecq

@rhugonnet
Copy link
Copy Markdown
Member

Thanks again @fnands! And I'll take of the merge with #759 😉

@fnands
Copy link
Copy Markdown
Contributor Author

fnands commented Apr 2, 2026

One last point: I think it is the first time we have an AI commit (Claude), and I'm not sure how we deal with this in terms of copyright

Fair enough. Let me know what you decide. Worst case I can trim it down to the work I did myself earlier, i.e. linearization of Nuth & Kääb.

@sdinot
Copy link
Copy Markdown

sdinot commented Apr 2, 2026

@sdinot: Any advice here?

Two questions must be distinguished:

  1. Is the code generated by AI truly original, or does it contain significant traces of pre-existing code—whether published under an open source license (if so, which one?) or a proprietary license—that could constitute plagiarism or a copyright infringement?

  2. Is the generated code subject to a third party’s copyright—which could be the company that owns the platform, the person who gave specific instructions to the AI, or the person distributing the code?

The first question is still a matter of debate. On the one hand, when the Chardet maintainer asked Claude Opus to generate the code for version 7.0, he demonstrated that version 7.0 bore no significant resemblance to previous versions (whereas the previous versions were very similar to one another). It was therefore undeniably original code.

On the other hand, I have heard of cases where identifiable code from pre-existing software was found in the generated code, with the AI even going so far as to reproduce the comments.

To my knowledge, this issue has never been the subject of a court ruling, and I do not see how a court could resolve it in a general way. A court can only rule on the case before it and decide whether, in that specific case, plagiarism has occurred. Such a ruling has no general applicability; it does not settle the debate. To do so, a statement of principle would need to be established at the national or international level, followed by the signing of a treaty or the passing of a law. Given the colossal economic, political, and strategic stakes, no country has yet legislated on this matter, for fear of penalizing its local industry and research.

So I can only say one thing: form your own opinion on the proposed code, keeping in mind that today, “everyone” is making unrestrained use of generative AI. I believe this practice will gradually become accepted and will soon no longer be a subject of debate. Morality is defined by the values shared by the majority. When lawmakers observe that a practice is widespread and commonplace, they amend the law to bring it in line with that practice; they do not attempt to correct society as a whole.

In contrast, the second question has already been addressed in several decisions handed down by local and higher courts in Europe and the United States. Almost all of these decisions reach the same conclusion: copyright cannot apply to a work generated entirely by AI, where there is no evidence of substantial human intervention (other than instructions given to the AI). For instance, cf. Once again, no copyright protection for AI-generated output.

@rhugonnet
Copy link
Copy Markdown
Member

Thanks a lot @sdinot! It is really informative to be able to link back to precedents, and grasp the different facets of this topic.

So it is mostly about point number 1 here: As I stated above, because those are well-documented SciPy-related changes, I have no issue on the originality/underlying copyright.
Thoughts @belletva @adehecq?

@rhugonnet
Copy link
Copy Markdown
Member

Hi @fnands: Sorry for the delay, we discussed this topic at our dev meeting last night! 🙂

We're good to go with your PR, as per my message above, with a small adjustment: We'd prefer if you could remove "Claude" from commit tags, so it does not show up an author: You supervised/co-wrote the code and we reviewed it to ensure it was original (original conception; no copyright issues), so we don't really want to consider it a "contributor".

We'll be adding a note in our contributor code about how we deal with this moving forward.

Lastly, for my comment above on Coreg.meta to register the fit_optimizer during initialization: All good on your side? What do you think?

I'll also try to push for #759 to be merged relatively soon (around end of the month, as I have deadlines in the coming weeks), so that we can do a release which includes your improvement across all linear methods!

Thanks again! 😉

@fnands
Copy link
Copy Markdown
Contributor Author

fnands commented Apr 9, 2026

Hey @rhugonnet ,

No worries, I was travelling for Easter a bit myself.

All good, your suggestions make sense.

Lastly, for my comment above on Coreg.meta to register the fit_optimizer during initialization: All good on your side? What do you think?

I started this, got a bit stuck into keeping mypy happy, but mostly have a solution. I should have some time this weekend to clean it up.

-F

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants