Commit fb6d967

David Berenstein and cursoragent authored and committed
fix(vendor): honor llm2vec length and numpy flags
Patch two upstream llm2vec behavior bugs found in review so that downstream VLM metrics use the caller-provided doc_max_length and can return numpy arrays when requested. Document Pruna's vendor deviations in NOTICE for traceability.

Co-authored-by: Cursor <cursoragent@cursor.com>

1 parent 8a0faab commit fb6d967

2 files changed: 8 additions & 1 deletion

src/pruna/evaluation/metrics/vendor/NOTICE.oneig_llm2vec

Lines changed: 5 additions & 0 deletions

```diff
@@ -10,3 +10,8 @@ See the project repository for full license text.
 ``oneig_llm2vec/modeling_llama_encoder.py`` is derived from
 McGill-NLP/LLM2Vec-Meta-Llama-3-8B-Instruct-mntp (Hugging Face Hub);
 Pruna relaxes the upstream flash-attention-only constraint for CPU use.
+
+Pruna also includes two minimal compatibility fixes in
+``oneig_llm2vec/llm2vec.py``:
+- Preserve constructor-provided ``doc_max_length`` instead of hardcoding 512.
+- Honor ``convert_to_numpy=True`` in ``encode()`` by returning ``numpy.ndarray``.
```
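The first fix listed above can be illustrated with a toy constructor. The class name `ToyLLM2Vec` and the default values are illustrative only, not the real vendored `LLM2Vec` class; the sketch just shows the patched behavior of storing the caller's `doc_max_length` instead of a hardcoded 512:

```python
class ToyLLM2Vec:
    """Simplified sketch of the patched __init__ (names illustrative)."""

    def __init__(self, max_length: int = 512, doc_max_length: int = 400):
        self.max_length = max_length
        # Before the patch this line was effectively `self.doc_max_length = 512`,
        # silently discarding whatever the caller passed in.
        self.doc_max_length = doc_max_length


model = ToyLLM2Vec(doc_max_length=1024)
print(model.doc_max_length)  # 1024, not the old hardcoded 512
```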

src/pruna/evaluation/metrics/vendor/oneig_llm2vec/llm2vec.py

Lines changed: 3 additions & 1 deletion

```diff
@@ -85,7 +85,7 @@ def __init__(
         self.pooling_mode = pooling_mode
         self.skip_instruction = skip_instruction
         self.max_length = max_length
-        self.doc_max_length = 512
+        self.doc_max_length = doc_max_length
         self.config = model.config

     @classmethod
@@ -448,6 +448,8 @@ def encode(
         all_embeddings = torch.cat(all_embeddings, dim=0)
         all_embeddings = all_embeddings[np.argsort(length_sorted_idx)]
         all_embeddings = all_embeddings.to(torch.float32)
+        if convert_to_numpy:
+            return all_embeddings.cpu().numpy()
         return all_embeddings

     def save(self, output_path, merge_before_save=False, save_config=True):
```
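The second fix changes only the tail of `encode()`. A minimal standalone sketch of that return path, with `encode_tail` as a hypothetical helper name (the real method does much more before this point):

```python
import numpy as np
import torch


def encode_tail(all_embeddings: torch.Tensor, convert_to_numpy: bool = False):
    """Mirror of the patched return path in the vendored llm2vec.py."""
    # Cast to float32, matching the upstream code's final dtype.
    all_embeddings = all_embeddings.to(torch.float32)
    # The patched branch: hand back a numpy array when the caller asked for one,
    # instead of always returning a torch.Tensor.
    if convert_to_numpy:
        return all_embeddings.cpu().numpy()
    return all_embeddings


emb = encode_tail(torch.ones(2, 4, dtype=torch.float16), convert_to_numpy=True)
print(type(emb).__name__, emb.dtype)  # ndarray float32
```

Callers that previously worked around the ignored flag by calling `.cpu().numpy()` themselves keep working, since the conversion is idempotent from their point of view: they simply receive the array directly now.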
