Linearize fits in Nuth & Kääb, 1D/2D polynimials#940
Conversation
|
Updated the docs to reflect the changes made. There was also one test that failed that wasn't related to any change I made (seemed to struggle to converge). The one in spatialstats, it seems to be something that struggles in newer versions of scipy |
|
@fnands This is amazing! Thanks a lot for looking into this, and both checking performance & adding tests 🙂 Does the new optimization definition also improves the accuracy of Nuth and Kääb? From the old tests passing, it looks like you get 100% the same results for the Some background here: I discovered fairly recently that Nuth and Kääb is actually the equivalent of LZD for a translation-only, simply written in polar coordinates (same method; re-invented 23 years later). Yet Nuth and Kääb doesn't perform as well in terms of horizontal alignment as LZD does somehow... It seems to suffer either from the polar coordinate conversion (LZD uses the cartesian gradient directly), or the choice of normalizing by the slope tangent, or I thought maybe the optimization. On the PR: Just so you know, there will be some conflicts with #759. I can deal with those there later, but better to coordinate with you here first on how to merge!
This would include What do you think? 😄 |
|
Hey!
No, same results for Nuth and Kääb as well as 1D Poly case, at least within numerical precision. The 2D case actually does result in better results (as tested by applying and reconstructing synthetic shifts), presumably due to numerical issues there with the large numbers one can see when using raw pixel coordinates. Ah, I didn't see the connection, but I found this paper that points out the similarity.
Ah, ok. I can also rebase my change onto that branch, if it will make things easier?
Yup, for at least one of the cases here I was actually replacing the non-linear least squares with the linear one.
Oh yeah, I can have a look at that. The same should hold in the linearized cases. I might have some time later this week or over the weekend to see how much work it would be, but I think a lot of it would just be plumbing. |
|
Thanks! I agree, linking the support to other linearized methods should be mostly connecting to what you've already done. 😉 In terms of public API, I have one thought: If I understand correctly, with the current setup, any user-input |
- fit_optimizer defaults to None in all __init__ signatures instead of scipy.optimize.curve_fit - _bin_or_and_fit_nd only takes the OLS path when fit_optimizer is None AND a design_matrix_func is available; any user-provided optimizer (e.g. RANSAC, Huber) overrides the linear default - When fit_optimizer is None and no design_matrix_func, falls back to scipy.optimize.curve_fit as the nonlinear default - Guard callable(fit_optimizer) checks to accept None - design_matrix_func kept in dict_key_to_str as internal key Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Verifies that passing an explicit fit_optimizer (e.g. curve_fit, standing in for RANSAC or Huber) bypasses the design_matrix_func OLS path and uses the given optimizer instead, while both solvers produce numerically close results on the same polynomial problem. Also fixes the callable guard error message assertion to match the updated "or None" wording, and updates the test_deramp assertion to reflect that fit_optimizer now defaults to None. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
A damn, I completely missed that use case. But was it ever possible to call RANSAC? At least for the sklearn version of RANSAC takes the same design matrix as the OLS version (or any linear solver e.g. Huber) I made some changes and now it will default to OLS, but can still fallback to What we could do is we could limit
EDIT: we probably don't want to make the sklearn one default as that would require an extra depenency. But we could still make it compatible with any sklearn linear model. There is some complexity related to extracting return values, i.e. RANSAC returns Ugh, we could have a special case for it... But now it's gets a little complicated. How do you envision users passing RANSAC here? |
|
Thanks! Using
Not easily but yes. The idea of the current structure was (even if not clearly documented yet), that users could define and pass any optimizer function under a fixed format: def my_optimizer(func, *args, **kwargs):
# Run RANSAC through sklearn, a GLS through statsmodel or GPyTorch, or whatever
# Mirror result object of SciPy, or any format we decide on
class MyResults:
x = None
cov = None
return MyResults
nk = xdem.coreg.NuthKaab(fit_optimizer=my_optimizer)The expected input format by the optimizer would simply match an existing one, like We could have some helpers to directly understand passing a |
|
Ah, yeah Ok, I can see how that can work! I think I will leave it here then? I had a quick look at rebasing to #759, but there are a lot of moving pieces there that I don't really feel I understand, and I'm afraid I'll break something... |
|
I agree! I'll do a quick line-by-line review, then we can merge 😉 |
| [v * 0.5 for v in p0], | ||
| [v * 2.0 for v in p0], | ||
| [(lo + hi) / 2 for (lo, hi) in zip(final_bounds[0], final_bounds[1])], | ||
| ] |
There was a problem hiding this comment.
Nice extra addition, I know a lot of users had trouble with the fit and I had to recommend them to pass p0/bounds manually. Good to improve it directly here.
There was a problem hiding this comment.
Yeah, I basically had to do this as one of the examples kept failing as it couldn't converge. I wouldn't have touched it otherwise.
| if fit_optimizer is None and design_matrix_func is not None: | ||
| results = _ols_fit(design_matrix_func, xdata, ydata, sigma) | ||
| else: | ||
| effective_optimizer = fit_optimizer if fit_optimizer is not None else scipy.optimize.curve_fit |
There was a problem hiding this comment.
Hmmm... This works internally, but means that Coreg.meta never stores the final optimizer, so Coreg.info() won't be printing the optimizer function correctly.
Maybe we should instead:
1/ Define fit_optimizer=scipy.optimize.lstsq/curve_fit is fit_optimizer is None directly in NuthKaab.__init__ (and other __init__, same for polynomials). Or we could use Literal["ols"] to clarify that lstsq is not used directly with its signature.
2/ Then, if fit_optimizer="ols" or lstsq in this internal bin_or_and_fit function, we trigger the right mechanism with the design function (otherwise raise an error).
To facilitate 1/ (not copy/paste the same logic in every function instantiation), we could define a _fit_linear attribute for all Coreg classes (set to True in NuthKaab, set to the linearized argument in ICP, etc).
Then we consolidate a single logic in Coreg.__init__: If fit_optimizer is None and "fit" in fit_or_bin, we pick lstsq for a fit_linear method, otherwise curve_fit.
rhugonnet
left a comment
There was a problem hiding this comment.
All good for me after the above points! 🙂
One last point: I think it is the first time we have an AI commit (Claude), and I'm not sure how we deal with this in terms of copyright. We had some internal discussions on this recently. I think in this case, as the changes link back to well-documented packages like SciPy (and are not a new feature with completely new logic that may be copying an underlying source), there is little risk.
@sdinot: Any advice here?
FYI @belletva @adehecq
Fair enough. Let me know what you decide. Worst case I can trim it down to the work I did myself earlier, i.e. linearization of Nuth & Kääb. |
Two questions must be distinguished:
The first question is still a matter of debate. On the one hand, when the Chardet maintainer asked Claude Opus to generate the code for version 7.0, he demonstrated that version 7.0 bore no significant resemblance to previous versions (whereas the previous versions were very similar to one another). It was therefore undeniably original code. On the other hand, I have heard of cases where identifiable code from pre-existing software was found in the generated code, with the AI even going so far as to reproduce the comments. To my knowledge, this issue has never been the subject of a court ruling, and I do not see how a court could resolve it in a general way. A court can only rule on the case before it and decide whether, in that specific case, plagiarism has occurred. Such a ruling has no general applicability; it does not settle the debate. To do so, a statement of principle would need to be established at the national or international level, followed by the signing of a treaty or the passing of a law. Given the colossal economic, political, and strategic stakes, no country has yet legislated on this matter, for fear of penalizing its local industry and research. So I can only say one thing: form your own opinion on the proposed code, keeping in mind that today, “everyone” is making unrestrained use of generative AI. I believe this practice will gradually become accepted and will soon no longer be a subject of debate. Morality is defined by the values shared by the majority. When lawmakers observe that a practice is widespread and commonplace, they amend the law to bring it in line with that practice; they do not attempt to correct society as a whole. In contrast, the second question has already been addressed in several decisions handed down by local and higher courts in Europe and the United States. Almost all of these decisions reach the same conclusion: copyright cannot apply to a work generated entirely by AI, where there is no evidence of substantial human intervention (other than instructions given to the AI). For instance, cf. Once again, no copyright protection for AI-generated output. |
|
Thanks a lot @sdinot! It is really informative to be able to link back to precedents, and grasp the different facets of this topic. So it is mostly about point number 1 here: As I stated above, because those are well-documented SciPy-related changes, I have no issue on the originality/underlying copyright. |
|
Hi @fnands: Sorry for the delay, we discussed this topic at our dev meeting last night! 🙂 We're good to go with your PR, as per my message above, with a small adjustment: We'd prefer if you could remove "Claude" from commit tags, so it does not show up an author: You supervised/co-wrote the code and we reviewed it to ensure it was original (original conception; no copyright issues), so we don't really want to consider it a "contributor". We'll be adding a note in our contributor code about how we deal with this moving forward. Lastly, for my comment above on I'll also try to push for #759 to be merged relatively soon (around end of the month, as I have deadlines in the coming weeks), so that we can do a release which includes your improvement across all linear methods! Thanks again! 😉 |
|
Hey @rhugonnet , No worries, I was travelling for Easter a bit myself. All good, your suggestions make sense.
I started this, got a bit stuck into keeping -F |
Hi, I was investigating the performance of the Nuth & Kääb implementation here a while back and realized that one reason the performance can be a bit slow for larger rasters is the fact that a non-linear solver (
scipy.optimize.curve_fit) is being used, even though the problem can be linearized. I also now saw that the polynomial cases (1D and 2D) call non-linear solvers despite being fundamentally linear.I originally did Nuth & Kääb in a somewhat hacky way, but used Claude now to make the implementation a bit cleaner and do the 1D and 2D poly cases.
Changes
xdem/coreg/affine.py)The Nuth & Kääb fit function is:
where$y = \Delta h / \tan(\text{slope})$ and $x = \text{aspect}$ .
Applying the cosine subtraction identity:
the model becomes linear in the substituted parameters$A = a\cos(b)$ , $B = a\sin(b)$ :
Solved directly with$[\cos(x),\ \sin(x),\ \mathbf{1}]$ . The easting and northing offsets simplify further:
np.linalg.lstsqon the design matrixso no back-conversion through$a$ and $b$ is needed. The binning step (72 aspect bins, median per bin) is preserved for robustness.
Polynomials are linear in their coefficients by construction. The Linear + scipy branch of robust_norder_polynomial_fit was routing through curve_fit. Replaced with
np.polyfit(direct QR-based OLS).Note:
np.polyfitreturns coefficients in descending degree order whilepolynomial_1dusesnumpy.polynomial.polyval(ascending order), so the result is reversed.This also fixes a pre-existing test failure where loss and f_scale kwargs were not forwarded correctly by
_wrapper_scipy_leastsquares.xdem/fit.py,xdem/coreg/base.py,xdem/coreg/biascorr.py)Deramp was fitting
polynomial_2dwithcurve_fit. Added adesign_matrix_funckey toInFitOrBinDictand a_ols_fithelper to_bin_or_and_fit_nd: whendesign_matrix_funcis set, the function routes to lstsq instead of fit_optimizer.polynomial_2dis now a callable classdesign_matrix_polynomial_2d(poly_order)that normalizes coordinates tounnormalize_coeffsmethod converts coefficients back to the original coordinate space, so they remain compatible withpolynomial_2dduring the apply step.Benchmark results
curve_fit(nonlinear)The OLS solution for
polynomial_2dalso recovers the true coefficients more accurately thancurve_fitwith its defaultp0=onesinitial guess, which can converge to a local minimum in ill-conditioned cases.Tests
All the tests that passed before pass, as well as the ones I added here (not everything passes, but that was like that before).