
Gradient handling in Workflows #1964

@danieltudosiu

Description

Is your feature request related to a problem? Please describe.
In some cases, gradient clipping or normalization is needed to stabilize the training of networks.

Describe the solution you'd like
Allow gradient clipping or normalization to be enabled via an argument at construction time of the Workflows; see the sketch below.
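A minimal sketch of what this could look like, assuming a hypothetical `grad_clip_norm` constructor argument (not an existing Workflow parameter); the clipping is applied between `backward()` and `step()`:

```python
import torch
from torch.nn.utils import clip_grad_norm_

# Hypothetical sketch: a trainer that accepts an optional gradient-clipping
# threshold at construction and applies it in every iteration.
class Trainer:
    def __init__(self, network, optimizer, loss_fn, grad_clip_norm=None):
        self.network = network
        self.optimizer = optimizer
        self.loss_fn = loss_fn
        self.grad_clip_norm = grad_clip_norm  # None disables clipping

    def _iteration(self, inputs, targets):
        self.optimizer.zero_grad()
        loss = self.loss_fn(self.network(inputs), targets)
        loss.backward()
        if self.grad_clip_norm is not None:
            # Rescales all gradients in place so their total norm
            # does not exceed the configured threshold.
            clip_grad_norm_(self.network.parameters(), self.grad_clip_norm)
        self.optimizer.step()
        return loss.item()
```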

Describe alternatives you've considered
Registering a hook on each model parameter to handle the gradient clipping; however, this is messier and is not the main way PyTorch handles it. It would also rule out gradient normalization, since the PyTorch implementation is an in-place transformation and the non-in-place gradient clipping will be deprecated. Furthermore, under AMP the gradients need to be unscaled before clipping or normalizing them, as per https://pytorch.org/docs/stable/notes/amp_examples.html#gradient-clipping.
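For reference, the pattern from the linked AMP example: the gradients must be unscaled in place before clipping so the threshold is compared against true gradient magnitudes rather than loss-scaled ones. A minimal sketch (the toy model and data below are placeholders):

```python
import torch

device = "cuda"
model = torch.nn.Linear(16, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
scaler = torch.cuda.amp.GradScaler()

for _ in range(10):
    inputs = torch.randn(8, 16, device=device)
    targets = torch.randn(8, 1, device=device)
    optimizer.zero_grad()
    with torch.cuda.amp.autocast():
        loss = torch.nn.functional.mse_loss(model(inputs), targets)
    scaler.scale(loss).backward()
    # Unscale the gradients in place before clipping; otherwise the clip
    # threshold would be applied to loss-scaled gradients.
    scaler.unscale_(optimizer)
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    # scaler.step() detects that the gradients were already unscaled and
    # skips the optimizer step if they contain infs/NaNs.
    scaler.step(optimizer)
    scaler.update()
```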

Labels: WG: Research (for the research working group), question (further information is requested)
