This repository was archived by the owner on Jan 16, 2025. It is now read-only.

Perhaps there are some issues in the model forward part. #5

@zhangshan-zs94

First, thank you for your work.
It appears that past_key_values is reset here before the call to super().forward(), so the parent forward pass receives no cached context to work with. I believe this reset should be removed.

# FIXME: cannot reuse past_key_values from generating thoughts

Another point supporting this view: infer_forward handles past_key_values differently from the current train_forward, so the two paths are not aligned.
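
To make the concern concrete, here is a minimal toy sketch in plain Python. It is not the repository's code: `base_forward`, `forward_with_reset`, and `forward_with_reuse` are hypothetical names, and a list of token ids stands in for the real key/value tensors. It only illustrates why discarding the cache before delegating leaves the parent forward with nothing but the new tokens.

```python
from typing import List, Optional, Tuple

# Assumption: a plain list of token ids stands in for real key/value tensors.
Cache = List[int]


def base_forward(
    input_ids: List[int], past_key_values: Optional[Cache] = None
) -> Tuple[int, Cache]:
    """Toy stand-in for super().forward().

    Returns the number of positions the last token can attend to,
    together with the updated cache.
    """
    cache = list(past_key_values) if past_key_values is not None else []
    cache.extend(input_ids)
    return len(cache), cache


def forward_with_reset(
    input_ids: List[int], past_key_values: Optional[Cache] = None
) -> Tuple[int, Cache]:
    # Hypothetical reproduction of the reported pattern: the cache is
    # dropped before delegating, so only the new tokens are visible.
    past_key_values = None
    return base_forward(input_ids, past_key_values)


def forward_with_reuse(
    input_ids: List[int], past_key_values: Optional[Cache] = None
) -> Tuple[int, Cache]:
    # Behaviour proposed in this issue: pass the cache through unchanged.
    return base_forward(input_ids, past_key_values)


prompt = [11, 12, 13, 14]
_, cache = base_forward(prompt)  # prefill: the cache now holds 4 positions

visible_reset, _ = forward_with_reset([15], cache)
visible_reuse, _ = forward_with_reuse([15], cache)

print("visible positions with reset:", visible_reset)
print("visible positions with reuse:", visible_reuse)
```

In this toy setup the reset variant sees only the single new token, while the reuse variant sees the prompt plus the new token. Whether the cache produced while generating thoughts can actually be reused in the real model is exactly what the FIXME above leaves open.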
