Add PolarExpress Variant to Muon #1621

@MarcMachaczek

Description

This was already suggested in #1602.

PolarExpress (Amsel et al., 2025) is a method for computing the polar decomposition of a matrix, which its authors present as optimal. They report consistent improvements over other methods when used with Muon, and they address "finite-precision issues, making it practical to use in bfloat16".

In terms of implementation, it takes a similar form to Newton-Schulz but with iteration-dependent polynomial coefficients. Hence, it is a lightweight addition to Muon and fits nicely into the existing interface.
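
To illustrate what "iteration-dependent coefficients" could look like, here is a minimal pure-Python sketch, not the proposed implementation. It assumes the odd quintic update `X <- a*X + b*(X X^T) X + c*(X X^T)^2 X` and takes one `(a, b, c)` triple per step. The names `polar_factor` and `EXAMPLE_SCHEDULE` are hypothetical, and the coefficients are placeholders (the classical fifth-order and cubic Newton-Schulz steps), not the tuned values from Amsel et al.

```python
import math

# Placeholder (a, b, c) triples, NOT the tuned PolarExpress coefficients.
QUINTIC = (15 / 8, -10 / 8, 3 / 8)  # classical fifth-order Newton-Schulz step
CUBIC = (3 / 2, -1 / 2, 0.0)        # classical cubic Newton-Schulz step
EXAMPLE_SCHEDULE = [QUINTIC] * 6 + [CUBIC] * 2  # one triple per iteration


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def polar_factor(G, schedule=EXAMPLE_SCHEDULE):
    """Approximate the orthogonal polar factor of G, one (a, b, c) per step."""
    # Scale by the Frobenius norm so that all singular values are at most 1.
    norm = math.sqrt(sum(x * x for row in G for x in row)) + 1e-7
    X = [[x / norm for x in row] for row in G]
    # Use the wide orientation so that X X^T is the smaller Gram matrix.
    transposed = len(X) > len(X[0])
    if transposed:
        X = transpose(X)
    for a, b, c in schedule:
        A = matmul(X, transpose(X))  # X X^T
        AX = matmul(A, X)            # (X X^T) X
        AAX = matmul(A, AX)          # (X X^T)^2 X
        X = [[a * x + b * y + c * z for x, y, z in zip(rx, ry, rz)]
             for rx, ry, rz in zip(X, AX, AAX)]
    return transpose(X) if transposed else X
```

Under this assumption, a fixed-coefficient iteration is just the special case where every entry of the schedule is the same triple, which is why a per-step schedule would slot into an interface that already accepts a single triple.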

(I promise next time I will create the issue before submitting the PR)
