
Some unclear points when reproducing the results #1

@xbzjsj

  • The paper describes two steps, a maximization step and a minimization step, with two loss functions. Does that mean we have to run distillation.py twice?

  • What does the Quick start in the README do?

  • Could you please provide the pseudo-training samples?

  • When I run it, I get an error: "--config: command not found". The option is already defined in the code, so why does this error occur?
    It also prints the following message, and I don't know whether I need to do anything about it:
    "Some weights of the model checkpoint at bert-base-uncased were not used when initializing Bert_For_Att_output_MLM: ['cls.seq_relationship.weight', 'cls.seq_relationship.bias']
    This IS expected if you are initializing Bert_For_Att_output_MLM from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
    This IS NOT expected if you are initializing Bert_For_Att_output_MLM from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model)."
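
A note on the "--config: command not found" message: this wording comes from the shell rather than from Python, and it appears when the shell tries to run `--config` as a command of its own. One common cause, offered only as a guess because the exact command is not shown here, is a multi-line command in which the trailing backslash was lost, so the second line is executed separately. The sketch below reproduces that behaviour; `run_training` and `my_config.json` are hypothetical stand-ins, not names from the repository.

```shell
#!/bin/sh
# run_training is a hypothetical stand-in for the real command
# (for example, a python invocation of the training script).
run_training() {
    echo "received $# argument(s): $*"
}

# Broken: there is no trailing backslash after the first line, so the shell
# runs "--config my_config.json" as a separate command and cannot find it.
broken_status=0
broken_output=$(
    run_training
    --config my_config.json 2>&1
) || broken_status=$?

# Fixed: the trailing backslash joins both lines into a single command,
# so --config reaches run_training as an argument.
fixed_status=0
fixed_output=$(
    run_training \
        --config my_config.json
) || fixed_status=$?

echo "broken: exit status $broken_status"
echo "fixed:  exit status $fixed_status, output: $fixed_output"
```

An exit status of 127 is the shell's conventional "command not found" code. If the command was copied from a README, checking that every continued line still ends in a backslash (with no trailing spaces after it) is a quick first test.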
