|
I've pushed my changes from this week. The key contribution is the Note that we need an upstream fix to UFL to go in first. An important lesson that I learned is that for Taylor tests to pass it is very important to non-dimensionalise things. I have no idea why but it was the only thing I tried that worked. |
|
Blast. The UFL fix is insufficient (once I've cleaned it up to work more generally). I will have to investigate more next week. |
|
Thanks so much for your work on this @JHopeCollins and @connorjward !! Shame the fix is not quite working - I have every confidence it will when you get a chance to look at it next week... |
|
I've decided that the UFL issue is fairly existential and as such have made it someone else's problem: FEniCS/ufl#477. I've tweaked things so we don't compute the Hessian for that test. The derivative is fine. |
That's excellent! Very useful to have it parameterised on the timestepper too.
It'd obviously be very good to get that fixed but at least the adjoint and tlm are working so we can use it for 4DVar! @jshipton could you have a look at the rescaling and let us know if you think it's still solving something with an appreciably relevant solution? |
I can have a look at this later this week and see if we can speed it up a bit. |
Co-authored-by: Dr Jemma Shipton <j.shipton@exeter.ac.uk>
|
@jemma @connorjward I've written a bit of a summary of what we've done. Can you have a read over it and let me know if I've missed / misrepresented / misexplained anything! |
|
Thank you so much for such a clear and useful explanation @JHopeCollins and for all your work on this @JHopeCollins and @connorjward !! |
|
Very nice work Josh. Thanks! |
Jhat = ReducedFunctional(J, Control(m), tape=tape)

# Perturbation directions for taylor test
# pyadjoint will multiply h by 1e-2, 1e-4 etc so we pre-multiply by 10
Pyadjoint re-evaluates the ReducedFunctional at Jhat(m + 1e-2*h), at Jhat(m + 1e-4*h) etc. and then works out the convergence rate of Jhat.derivative() - (Jhat(m + eps*h) - Jhat(m))/(eps*||h||).
So you never actually re-evaluate at J(m+h).
… JHopeCollins/adjoint-hessian-tests
jshipton
left a comment
Great to have these working - thanks again @connorjward and @JHopeCollins !!
This PR updates the adjoint tests to verify the tangent linear derivative and hessian models as well as the adjoint derivative.
The shallow water test now also covers a wider range of timesteppers (RK4, BE, and SIQN).
There hasn't been anything in gusto that needed changing. The hard work here was making sure that we are setting up the Taylor tests well. The main things we uncovered are described below.
Taping the timestepper
Previously the time integration wasn't actually being taped. To see why, consider the code on the left hand side below:
m and v here are just handles to u; they are all the same object. So we could have equivalently written:
where the in-place multiplication by 2 has clearly been dropped.
This is exactly what was happening previously in the adjoint tests. The equivalent gusto commands are shown on the right hand side in the example above.
stepper.fields('u') always returns the same Function object, and stepper.run updates stepper.fields('u') in-place. So the Taylor tests were just rerunning the calculation of J rather than the entire time integration.
The fix is to have a distinct object for the control which is never modified during the taped operations.
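The aliasing problem and the fix can be sketched with plain NumPy arrays standing in for Function objects. This is only an illustration under stated assumptions: run, functional, broken_pattern and fixed_pattern are hypothetical names, and nothing here uses gusto or pyadjoint.

```python
import numpy as np


def run(state):
    """Hypothetical stand-in for stepper.run: updates the state in place."""
    state *= 2.0


def functional(field):
    """Hypothetical stand-in for the functional J."""
    return float(np.dot(field, field))


def broken_pattern(initial):
    """The control is just another handle to the state that gets overwritten."""
    u = np.array(initial, dtype=float)
    m = u              # alias, not a copy
    run(u)             # m is modified too
    return m, functional(u)


def fixed_pattern(initial):
    """The control is a distinct object that is never modified."""
    m = np.array(initial, dtype=float)
    u = m.copy()       # the state starts from the control's values
    run(u)             # m is untouched
    return m, functional(u)


m_broken, J_broken = broken_pattern([1.0, 2.0])
m_fixed, J_fixed = fixed_pattern([1.0, 2.0])

# Broken: applying only the functional to m already reproduces J,
# so the time integration is invisible between m and J.
print(J_broken == functional(m_broken))
# Fixed: m still holds the initial values, so getting from m to J
# has to go through run as well.
print(J_fixed == functional(m_fixed))
```

In the broken pattern the control has already been overwritten by the time J is computed, which is the toy analogue of the control and stepper.fields('u') being the same object.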
Now to reach J from m you have to pass through all operations.
Scaling is hard
The Taylor tests are very delicate: even small breaks in the assumptions can lead to an order drop in the convergence.
Inexact solves: The tape sees each solve as just a SolveBlock and knows nothing about the solver tolerance, so essentially assumes it is exact. If the solver tolerance is larger than the residuals of the Taylor test then it can overwhelm the errors and drop the convergence rate.
The fix is to solve everything quite tightly.
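A scalar toy problem shows how a loose solve can spoil the convergence rate. This is a sketch under stated assumptions, not pyadjoint's implementation: the functional is exp(m), the derivative is supplied exactly (as the tape would assume), and the loose solve is modelled crudely by rounding the functional to a multiple of a hypothetical tolerance tol. The names taylor_rates, exact_J and inexact_J are made up for the illustration.

```python
import math


def taylor_rates(J, dJ, m, h, epsilons):
    """Convergence rates of the first-order Taylor remainder
    |J(m + eps*h) - J(m) - eps*dJ*h| between successive eps values."""
    J0 = J(m)
    residuals = [abs(J(m + eps * h) - J0 - eps * dJ * h) for eps in epsilons]
    return [
        math.log(residuals[i] / residuals[i + 1])
        / math.log(epsilons[i] / epsilons[i + 1])
        for i in range(len(residuals) - 1)
    ]


def exact_J(m):
    """Functional evaluated to full floating point accuracy."""
    return math.exp(m)


def inexact_J(m, tol=1e-4):
    """Crude model of a loose solve: the result is only good to about tol."""
    return round(math.exp(m) / tol) * tol


m, h = 1.0, 1.0
dJ = math.exp(m)  # exact derivative, which is what the tape assumes
epsilons = [1e-2 / 2**i for i in range(4)]

tight_rates = taylor_rates(exact_J, dJ, m, h, epsilons)
loose_rates = taylor_rates(inexact_J, dJ, m, h, epsilons)
print("tight:", tight_rates)
print("loose:", loose_rates)
```

With the exact functional the rates sit close to 2. With the rounded functional the remainder is dominated by the tolerance-sized error, so the rates become erratic and fall well below 2 even though the derivative itself is correct.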
Small perturbations: The Taylor test relies on the perturbation being small enough that the Taylor expansion assumption is reasonable. For example, if the perturbation massively changes the state of a nonlinear problem then the Taylor convergence rate may appear wrong even though the derivative calculation is correct.
The fix is to scale the perturbations relative to something physically meaningful (e.g. reference velocity or initial depth variation).
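One way to build such a perturbation is to rescale a random direction so that its largest entry is a chosen fraction of a reference value. The helper below is a hypothetical sketch on a NumPy array, not code from the gusto tests; the reference value of 20 and the 10% fraction are assumptions for illustration only.

```python
import numpy as np


def scaled_perturbation(shape, reference, fraction=0.1, seed=0):
    """Random direction rescaled so its largest entry is fraction*reference."""
    rng = np.random.default_rng(seed)
    direction = rng.standard_normal(shape)
    return fraction * reference * direction / np.max(np.abs(direction))


# Hypothetical reference velocity of 20, in whatever units the model uses
h_u = scaled_perturbation((8,), reference=20.0)
print(np.max(np.abs(h_u)))
```

The same helper could be applied to the depth with a reference based on the initial depth variation, so that each field is perturbed by a comparable relative amount.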
Poor scaling: On the other hand, you also need the perturbation of the control to lead to a noticeable change in the functional. If different fields have very different scales then significant relative changes in one field can look very small compared to changes in the other field (e.g. a 10% change in the w5 velocity value is equivalent to a 0.03% change in the depth). This means that even if the derivative calculation would be correct in exact arithmetic, you end up running into machine precision and the Taylor test fails because it can't "see" the exact change in the functional.
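The loss of precision can be reproduced with two scalars. In this sketch the values u = 20 and D = 5960 are assumptions, chosen only so that a 10% change in u is roughly 0.03% of D; the function names are hypothetical. The Taylor remainder is evaluated once for a functional that includes the large depth value and once, for comparison, for a functional built from the velocity alone.

```python
def taylor_residual(J, dJ, u, h, eps):
    """First-order Taylor remainder |J(u + eps*h) - J(u) - eps*dJ*h|."""
    return abs(J(u + eps * h) - J(u) - eps * dJ * h)


# Hypothetical scalar stand-ins for the velocity and depth fields
u, D = 20.0, 5960.0
h = 2.0          # a 10% perturbation of u
eps = 5e-6
dJ = 2.0 * u     # exact derivative of both functionals with respect to u


def J_mixed(u):
    """Functional that also includes the much larger depth value."""
    return u**2 + D**2


def J_control_only(u):
    """Functional built from the control alone."""
    return u**2


exact = (eps * h) ** 2   # the remainder is exactly quadratic in this toy
print("exact:       ", exact)
print("control only:", taylor_residual(J_control_only, dJ, u, h, eps))
print("mixed:       ", taylor_residual(J_mixed, dJ, u, h, eps))
```

The derivative is exact in both cases, but adding D**2 makes the functional so large that its floating point spacing exceeds the remainder being measured, so the mixed residual no longer tracks the exact one.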
The fix is to calculate the functional only from the control field, i.e. if u is the control then don't include D in the functional. Working with the non-dimensional equations would also work but that would need much broader changes in gusto.
UFL bug
The moist shallow water adjoint test currently doesn't check the Hessian model because of this bug: FEniCS/ufl#477
The adjoint and TLM models are both passing the tests though, which is what you need for the Gauss-Newton method (i.e. "incremental" 4DVar).