
Step Loss Trend Issue in Training #23

Description

@lucasxu777

Hey, is anyone else seeing this issue too? After plotting the step loss, I don't think any effective learning is going on here; however, the win rate does increase when evaluating on 500 prompt examples from the pickscorev2 dataset, matching what they report in their paper. I am having the same issue with DSPO, a newer paper that builds on this one.

Image
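
For anyone who wants to check whether the raw step loss is hiding a slow trend under step-to-step noise, here is a minimal sketch of smoothing the logged values before plotting. The function name `ema_smooth`, the `beta` value, and the sample `step_losses` list are all assumptions for illustration; none of this comes from the repo's training script.

```python
def ema_smooth(values, beta=0.98):
    """Return a bias-corrected exponential moving average of `values`.

    `beta` (hypothetical default) controls the smoothing strength:
    values closer to 1.0 average over more steps.
    """
    smoothed = []
    avg = 0.0
    for step, value in enumerate(values, start=1):
        # Standard EMA update.
        avg = beta * avg + (1.0 - beta) * value
        # Bias correction so early steps are not pulled toward zero.
        smoothed.append(avg / (1.0 - beta ** step))
    return smoothed


if __name__ == "__main__":
    # Hypothetical per-step losses; replace with the values you logged.
    step_losses = [0.71, 0.68, 0.74, 0.66, 0.70, 0.69, 0.65, 0.72]
    for step, loss in enumerate(ema_smooth(step_losses, beta=0.9), start=1):
        print(step, round(loss, 4))
```

If the smoothed curve is still flat over many steps, the noise alone probably does not explain the plot.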
