Summary
I'd like to contribute an Optax implementation of the Lion optimizer: a gradient transformation plus a convenience Lion(...) wrapper in contrib that composes it with decoupled weight decay and learning-rate scaling. Lion tracks a single momentum and uses sign(...) of an interpolation between that momentum and the current gradient for its updates, as described in the paper https://arxiv.org/abs/2302.06675
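To make the update rule concrete, here is a minimal sketch in plain JAX of the step described above. It is not the proposed Optax code: the names lion_init and lion_step are hypothetical, the function signature is my own choice, and the defaults b1=0.9 and b2=0.99 are taken from the paper rather than from any existing Optax API.

```python
import jax
import jax.numpy as jnp


def lion_init(params):
    # Hypothetical helper: one momentum buffer per parameter leaf.
    return jax.tree_util.tree_map(jnp.zeros_like, params)


def lion_step(params, grads, mu, learning_rate=1e-4, b1=0.9, b2=0.99,
              weight_decay=0.0):
    # Hypothetical helper: one Lion step over a pytree of parameters.
    def new_param(p, m, g):
        # Update direction: sign of the b1-interpolation of momentum and gradient.
        direction = jnp.sign(b1 * m + (1.0 - b1) * g)
        # Decoupled weight decay is added to the direction before lr scaling.
        return p - learning_rate * (direction + weight_decay * p)

    def new_momentum(m, g):
        # The stored momentum uses a separate, slower coefficient b2.
        return b2 * m + (1.0 - b2) * g

    new_params = jax.tree_util.tree_map(new_param, params, mu, grads)
    new_mu = jax.tree_util.tree_map(new_momentum, mu, grads)
    return new_params, new_mu


# Usage on a toy parameter tree.
params = {"w": jnp.array([1.0, -2.0, 0.5])}
grads = {"w": jnp.array([0.3, -0.1, 0.0])}
mu = lion_init(params)
params, mu = lion_step(params, grads, mu, learning_rate=0.1)
```

In the actual contribution this logic would be split into a gradient transformation and a wrapper, as outlined above; the sketch keeps it in one function only for readability.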
What I will include:
- Implementation file in Optax/contrib/_lion.py
- Test file in Optax/contrib/_lion_test.py
- A short note about fp16 behaviour and recommended dtype handling
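As an illustration of the kind of fp16 issue such a note could cover, the sketch below shows a small increment being rounded away in float16 and one possible mitigation: keeping the momentum accumulator in float32. This is an assumption about what the note might recommend, not settled guidance, and ema_float32 is a hypothetical helper name.

```python
import jax.numpy as jnp


def ema_float32(mu, grad, b2=0.99):
    # Hypothetical helper: upcast both inputs so the small (1 - b2) * grad
    # term is accumulated in float32 instead of float16.
    mu32 = mu.astype(jnp.float32)
    grad32 = grad.astype(jnp.float32)
    return b2 * mu32 + (1.0 - b2) * grad32


# float16 resolves steps of about 1e-3 near 1.0, so a 1e-4 increment is lost.
lost = jnp.float16(1.0) + jnp.float16(1e-4)
# The same increment survives in float32.
kept = jnp.float32(1.0) + jnp.float32(1e-4)

mu = ema_float32(jnp.ones(2, dtype=jnp.float16), jnp.zeros(2, dtype=jnp.float16))
```

Since the sign(...) update discards magnitude, the parameter step itself is less exposed to this rounding than the momentum accumulation is, which is why the sketch focuses on the accumulator.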
Request
- Guidance on whether maintainers would be open to this style of contribution placed under Optax/contrib
- Any specific tests, coding style, or helper utils you would like me to use
- I can open a PR with the implementation and tests
Thanks, I'm happy to iterate quickly based on feedback.