Is your feature request related to a problem? Please describe.
In some cases, gradient clipping or normalization is needed to stabilize the training of networks.
Describe the solution you'd like
Allow gradient clipping or gradient normalization to be enabled via an argument passed at construction of the Workflows.
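A minimal sketch of what the Workflow would presumably run internally when such an option is enabled; the toy model, data, and hyperparameter values are illustrative assumptions, only the two torch.nn.utils calls are the actual PyTorch API:

```python
import torch
import torch.nn as nn

# Illustrative toy setup (not part of the proposal).
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
inputs, targets = torch.randn(8, 10), torch.randn(8, 1)

optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()

# Gradient norm clipping ("normalization"): rescales all gradients in place
# so that their global norm does not exceed max_norm.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
# Alternatively, value clipping: clamps each gradient element in place.
# torch.nn.utils.clip_grad_value_(model.parameters(), clip_value=1.0)

optimizer.step()
```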
Describe alternatives you've considered
Registering a hook on each model parameter to handle the gradient clipping. This is messier and is not the main way PyTorch handles it. It would also rule out gradient normalization, since the PyTorch implementation is an in-place transformation over all parameters at once, and the non-in-place gradient clipping will be deprecated. Furthermore, with AMP the gradients need to be unscaled before normalizing them, as per https://pytorch.org/docs/stable/notes/amp_examples.html#gradient-clipping.
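For reference, a sketch of the AMP interaction following the pattern in the linked PyTorch docs; the toy model and values are illustrative assumptions (requires a CUDA device):

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()
scaler = torch.cuda.amp.GradScaler()
inputs = torch.randn(8, 10, device="cuda")
targets = torch.randn(8, 1, device="cuda")

optimizer.zero_grad()
with torch.cuda.amp.autocast():
    loss = loss_fn(model(inputs), targets)
scaler.scale(loss).backward()

# Gradients must be unscaled before clipping, otherwise max_norm would be
# compared against the scaled gradients.
scaler.unscale_(optimizer)
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)

scaler.step(optimizer)
scaler.update()

# The rejected per-parameter hook alternative only covers value clipping:
# for p in model.parameters():
#     p.register_hook(lambda grad: grad.clamp(-1.0, 1.0))
```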