The KL value is abnormal #24

@kaiyinzhou

When I train with KTO, the KL value quickly drops to 0. Is this normal?

{'loss': 0.4173, 'grad_norm': 1.4672807732482507, 'learning_rate': 4.765488274413721e-06, 'rewards/chosen': 1.194046974182129, 'logps/chosen': -18.560531616210938, 'rewards/rejected': 0.43546485900878906, 'logps/rejected': -29.158364868164064, 'rewards/margins': 0.7585821151733398, 'kl': 0.10797347873449326, 'logits/chosen': -159737504.0, 'logits/rejected': -125256448.0, 'epoch': 0.08}
{'loss': 0.4038, 'grad_norm': 15.43012523262249, 'learning_rate': 4.7611130556527825e-06, 'rewards/chosen': 1.2586393356323242, 'logps/chosen': -25.3940673828125, 'rewards/rejected': 0.3017548084259033, 'logps/rejected': -41.6545654296875, 'rewards/margins': 0.9568845272064209, 'kl': 0.0654844269156456, 'logits/chosen': -185916384.0, 'logits/rejected': -143640992.0, 'epoch': 0.08}
{'loss': 0.4329, 'grad_norm': 3.9429698141756444, 'learning_rate': 4.7567378368918445e-06, 'rewards/chosen': 1.1291874647140503, 'logps/chosen': -30.11488151550293, 'rewards/rejected': 0.19891568024953207, 'logps/rejected': -38.57758585611979, 'rewards/margins': 0.9302717844645182, 'kl': 0.0, 'logits/chosen': -149832224.0, 'logits/rejected': -177144672.0, 'epoch': 0.08}
{'loss': 0.347, 'grad_norm': 2.2398680774090054, 'learning_rate': 4.7523626181309066e-06, 'rewards/chosen': 1.2491761666757089, 'logps/chosen': -24.273625126591437, 'rewards/rejected': 0.33067967341496396, 'logps/rejected': -24.57554978590745, 'rewards/margins': 0.9184964932607449, 'kl': 0.0, 'logits/chosen': -204615376.0, 'logits/rejected': -111233384.0, 'epoch': 0.08}
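
To make the trend easier to see, here is a minimal sketch that pulls the `kl` field out of log lines like the ones above and reports the first step at which it reaches exactly 0. The helper names `kl_values` and `first_zero_step` are hypothetical and not part of any training library, and the sample lines are abbreviated to the `loss`, `kl`, and `epoch` fields from the four log entries in this report. It assumes each log line is a Python dict literal.

```python
import ast


def kl_values(log_lines):
    """Parse dict-style log lines and return the 'kl' value from each one."""
    values = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between log entries
        record = ast.literal_eval(line)  # safe parsing of a dict literal
        if "kl" in record:
            values.append(record["kl"])
    return values


def first_zero_step(values):
    """Return the index of the first entry equal to 0.0, or None if there is none."""
    for index, value in enumerate(values):
        if value == 0.0:
            return index
    return None


# Abbreviated copies of the four log entries in this report (hypothetical subset of keys).
logs = [
    "{'loss': 0.4173, 'kl': 0.10797347873449326, 'epoch': 0.08}",
    "{'loss': 0.4038, 'kl': 0.0654844269156456, 'epoch': 0.08}",
    "{'loss': 0.4329, 'kl': 0.0, 'epoch': 0.08}",
    "{'loss': 0.347, 'kl': 0.0, 'epoch': 0.08}",
]

kls = kl_values(logs)
print("kl per logged step:", kls)
print("first step with kl == 0.0:", first_zero_step(kls))
```

On these four entries the KL goes from about 0.108 to about 0.065 and then to exactly 0.0 on the third and fourth logged steps, all within epoch 0.08.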
