Commit c9e7d14

fix issue with mwe
1 parent 5f36097 commit c9e7d14

1 file changed

Lines changed: 2 additions & 2 deletions

model2vec/distill/distillation.py

@@ -323,8 +323,8 @@ def _clean_vocabulary(tokenizer: Tokenizer, vocabulary: list[str], added_tokens:
 
     pre_tokenizer = tokenizer.pre_tokenizer
     if pre_tokenizer is not None:
-        pretokenized_tokens = pre_tokenizer.pre_tokenize_str(token)
-        new_token = " ".join(pretokenized_tokens[1])
+        pretokenized_tokens, _ = zip(*pre_tokenizer.pre_tokenize_str(token))
+        new_token = " ".join(pretokenized_tokens)
     else:
         new_token = token
 
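
The change above relies on `pre_tokenize_str` returning a list of `(piece, (start, end))` pairs rather than a list of strings. The following is a minimal sketch of why the old indexing fails on a multiword expression and what the `zip(*...)` unpacking does instead. The `pretokenized` list is a hand-written stand-in for the pre-tokenizer output, and `join_old` / `join_new` are hypothetical helper names used only for illustration; they are not part of model2vec.

```python
# Hand-written stand-in for pre_tokenizer.pre_tokenize_str("New York"):
# a list of (piece, (start_offset, end_offset)) pairs.
pretokenized = [("New", (0, 3)), ("York", (4, 8))]


def join_old(pairs):
    # Old behaviour: index 1 selects the second (piece, offsets) pair,
    # so join() receives a string and a tuple of ints.
    return " ".join(pairs[1])


def join_new(pairs):
    # New behaviour: zip(*pairs) transposes the list into a tuple of
    # pieces and a tuple of offsets; only the pieces are joined.
    # Note: unpacking raises ValueError if pairs is empty.
    pieces, _ = zip(*pairs)
    return " ".join(pieces)


try:
    join_old(pretokenized)
    old_error = None
except TypeError as exc:
    # join() rejects the offsets tuple because it is not a string.
    old_error = exc

new_token = join_new(pretokenized)
print(repr(old_error))
print(new_token)
```

With a single-piece input the old code would instead raise `IndexError`, since there is no element at index 1.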
