This repository was archived by the owner on Aug 5, 2025. It is now read-only.

[Bug?] Gradient Synchronization for DDP #1133

@jianweif

According to the `no_sync` docstring in https://github.com/pytorch/pytorch/blob/main/torch/nn/parallel/distributed.py#L1424:

.. warning::
    The forward pass should be included inside the context manager, or
    else gradients will still be synchronized.

The current code separates the forward and backward passes, so the forward pass does not run inside the `no_sync` context. As a result, gradient synchronization is still triggered.
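For reference, here is a minimal sketch of the pattern the docstring describes, with both the forward and backward pass inside `no_sync` on the accumulation steps. This is not the repository's code: `train_with_accumulation`, `run_demo`, `accum_steps`, the toy `Linear` model, and the single-process `gloo` setup are all assumptions made only so the example runs on its own.

```python
import contextlib
import os
import tempfile

import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP


def train_with_accumulation(model, batches, optimizer, accum_steps):
    """Hypothetical helper: accumulate gradients locally and let DDP
    synchronize only on every accum_steps-th micro-batch.
    Returns the number of optimizer steps taken."""
    optimizer_steps = 0
    optimizer.zero_grad()
    for step, (x, y) in enumerate(batches, start=1):
        is_sync_step = step % accum_steps == 0
        # On intermediate micro-batches, enter no_sync(); on the last one,
        # use a no-op context so DDP synchronizes as usual.
        ctx = contextlib.nullcontext() if is_sync_step else model.no_sync()
        with ctx:
            # Forward AND backward both run inside the context manager.
            loss = torch.nn.functional.mse_loss(model(x), y)
            (loss / accum_steps).backward()
        if is_sync_step:
            optimizer.step()
            optimizer.zero_grad()
            optimizer_steps += 1
    return optimizer_steps


def run_demo():
    """Assumed setup: one CPU process with the gloo backend, for illustration only."""
    store_file = os.path.join(tempfile.mkdtemp(), "ddp_store")
    dist.init_process_group(
        "gloo", init_method=f"file://{store_file}", rank=0, world_size=1
    )
    try:
        model = DDP(torch.nn.Linear(4, 1))
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(4)]
        return train_with_accumulation(model, batches, optimizer, accum_steps=2)
    finally:
        dist.destroy_process_group()
```

With four micro-batches and `accum_steps=2`, the sketch takes two optimizer steps. Micro-batches left over when the batch count is not a multiple of `accum_steps` are not stepped here; a real training loop would need to handle that case.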
