13 changes: 9 additions & 4 deletions libs/infinity_emb/infinity_emb/inference/batch_handler.py
@@ -244,9 +244,14 @@ async def classify(
items = [PredictSingle(sentence=s) for s in sentences]
classifications, usage = await self._schedule(items)

- if raw_scores:
-     # perform softmax on scores
-     pass
+ if not raw_scores:

high

The default value of raw_scores in the classify method signature (line 224) is currently True.

Previously, because the pipeline applied softmax internally and the if raw_scores: block was a no-op, calling classify with default arguments returned softmax probabilities. With this change, calling classify with default arguments will now return raw logits, which is a breaking change for the default behavior of the API.

To preserve the original default behavior of returning softmax probabilities, please change the default value of raw_scores in the signature of classify (line 224) to False (matching the behavior of rerank).
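
To make the behavior change concrete, here is a minimal sketch. `classify_sketch` is a hypothetical stand-in, not the real `classify` method: it only mirrors the `if not raw_scores:` branch from this diff, and the logits are made-up values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a small list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_sketch(logits, raw_scores=False):
    """Hypothetical stand-in for classify(): logits in, scores out.

    With raw_scores=False (the default proposed in the comment above),
    callers get probabilities; with raw_scores=True they get logits.
    """
    return list(logits) if raw_scores else softmax(logits)


default_out = classify_sketch([2.0, 0.0])              # probabilities
raw_out = classify_sketch([2.0, 0.0], raw_scores=True)  # untouched logits
```

If the signature default stayed `True`, a caller using `classify_sketch([2.0, 0.0])` without arguments would receive the second form, which is the breaking change the comment describes.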

+     # the model returns raw logits; convert them to probabilities
+     for prediction in classifications:
+         logits = np.array([label["score"] for label in prediction])
+         exp = np.exp(logits - logits.max())
+         probs = exp / exp.sum()
Comment on lines +250 to +252

P1 Preserve sigmoid probabilities for multi-label classifiers

For multi-label sequence classifiers such as the documented GoEmotions model, labels are independent and the Transformers pipeline previously normalized raw logits with sigmoid rather than across-label softmax. This new default raw_scores=false path always divides by the sum over every label, forcing each prediction's scores to sum to 1 and suppressing valid co-occurring labels whenever more than one class applies, so /classify now returns non-HF probabilities for those models.
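
A small sketch of the effect described above, using made-up logits for three labels where the first two both apply. The 0.5 cut-off is an illustrative assumption, not something taken from this PR.

```python
import math


def softmax(scores):
    """Softmax across labels: outputs always sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Per-label sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


logits = [3.0, 2.5, -4.0]  # hypothetical: labels 0 and 1 both apply

soft = softmax(logits)
sig = [sigmoid(x) for x in logits]

# Which labels clear an (assumed) 0.5 threshold under each normalization?
soft_hits = [p >= 0.5 for p in soft]
sig_hits = [p >= 0.5 for p in sig]
```

Under softmax the second label is pushed below the threshold because it competes with the first; under sigmoid both labels keep high, independent scores.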

+         for label, prob in zip(prediction, probs):
+             label["score"] = float(prob)
Comment on lines +248 to +254

medium

Building a NumPy array for every prediction, inside a loop that runs on the main event loop, adds per-call overhead that is out of proportion to the work: classification tasks typically have a very small number of classes (e.g., 2 to 10). Pure Python with math.exp is usually faster for lists this small and avoids the array allocation.

We should also guard against an empty prediction, since logits.max() raises a ValueError on an empty array.

            import math
            # the model returns raw logits; convert them to probabilities
            for prediction in classifications:
                scores = [label["score"] for label in prediction]
                if not scores:
                    continue
                max_score = max(scores)
                exps = [math.exp(s - max_score) for s in scores]
                sum_exps = sum(exps)
                for label, exp_val in zip(prediction, exps):
                    label["score"] = exp_val / sum_exps if sum_exps > 0 else 0.0
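
As a side note on why the suggestion subtracts `max_score` before exponentiating, here is a hedged sketch comparing a naive softmax with the shifted one. The function names and the extreme logits are illustrative only.

```python
import math


def naive_softmax(scores):
    """Softmax without the max shift; overflows for large logits."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stable_softmax(scores):
    """Softmax with the max shift, plus the empty-input guard."""
    if not scores:
        return []
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


big = [1000.0, 999.0]  # deliberately extreme logits

try:
    naive_softmax(big)
    naive_failed = False
except OverflowError:
    # math.exp(1000.0) exceeds the range of a float
    naive_failed = True

stable = stable_softmax(big)
```

After the shift, the largest term is `exp(0) == 1`, so for finite inputs the sum is at least 1. That suggests the `sum_exps > 0` check in the suggestion is defensive rather than strictly required.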

Comment on lines +247 to +254
Contributor

P1 Softmax incorrectly applied to multi-label classifiers

The HF text-classification pipeline applies sigmoid (not softmax) for models with problem_type == "multi_label_classification" or num_labels == 1. Because both classifier backends now pass function_to_apply="none" and this code then applies softmax unconditionally, a multi-label classifier will produce wrong probabilities: scores are forced to sum to 1, as if the labels were mutually exclusive, instead of being scored independently per class. The correct fix is to check the model's config and apply sigmoid or softmax accordingly, mirroring the pipeline's own default logic.
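
One possible shape for that fix, as a sketch only. `normalize_scores` is a hypothetical helper, and its `problem_type` and `num_labels` parameters are assumed to be read from the model config by the caller; the selection rule is the one quoted in the comment above, not a copy of the pipeline's code.

```python
import math


def sigmoid(x):
    """Per-label sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def softmax(scores):
    """Numerically stable softmax across labels."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_scores(logits, problem_type=None, num_labels=None):
    """Hypothetical helper: pick sigmoid or softmax from config-like inputs."""
    if not logits:
        return []
    # Fall back to the number of logits if the caller did not pass num_labels.
    n = len(logits) if num_labels is None else num_labels
    if problem_type == "multi_label_classification" or n == 1:
        return [sigmoid(x) for x in logits]
    return softmax(logits)


multi = normalize_scores([3.0, 2.5, -4.0], problem_type="multi_label_classification")
single = normalize_scores([3.0, 2.5, -4.0], problem_type="single_label_classification")
one_logit = normalize_scores([0.0], num_labels=1)
```

In `batch_handler.py` the two config values would have to be made available to `classify`, which this sketch does not address.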


return classifications, usage

@@ -621,4 +626,4 @@ def _postprocess_batch(self):
self._postprocess_queue.task_done()
except Exception as ex:
logger.exception(ex)
- raise ValueError("Postprocessor crashed")
+ raise ValueError("Postprocessor crashed")
@@ -70,7 +70,7 @@ def encode_pre(self, sentences: list[str]):
return sentences

def encode_core(self, sentences: list[str]) -> dict:
- outputs = self._pipe(sentences)
+ outputs = self._pipe(sentences, function_to_apply="none")
return outputs

def encode_post(self, classes) -> dict[str, float]:
@@ -86,4 +86,4 @@ def tokenize_lengths(self, sentences: list[str]) -> list[int]:
return_attention_mask=False,
return_length=False,
).encodings
- return [len(t.tokens) for t in tks]
+ return [len(t.tokens) for t in tks]
10 changes: 8 additions & 2 deletions libs/infinity_emb/infinity_emb/transformer/classifier/torch.py
@@ -73,7 +73,13 @@ def encode_pre(self, sentences: list[str]):

def encode_core(self, features):
"""runs plain inference, on cpu/gpu"""
- return self._pipe(features, batch_size=256, truncation=True, padding=True)
+ return self._pipe(
+     features,
+     batch_size=256,
+     truncation=True,
+     padding=True,
+     function_to_apply="none",
+ )

def encode_post(self, classes) -> dict[str, float]:
"""runs post encoding such as normalization"""
@@ -88,4 +94,4 @@ def tokenize_lengths(self, sentences: list[str]) -> list[int]:
return_attention_mask=False,
return_length=False,
).encodings
- return [len(t.tokens) for t in tks]
+ return [len(t.tokens) for t in tks]
Contributor

P2 Missing newline at end of file. Both changed classifier files (torch.py and optimum.py) lost their trailing newline, which will cause diff noise in future patches and may break some POSIX-compliant tooling.

Suggested change
return [len(t.tokens) for t in tks]
return [len(t.tokens) for t in tks]
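
A quick way to check for this locally, as a sketch. Both helper names are hypothetical, and treating an empty file as acceptable is an assumption.

```python
from pathlib import Path


def ends_with_newline(data: bytes) -> bool:
    """True if the content is empty or terminated by a newline byte."""
    return not data or data.endswith(b"\n")


def files_missing_final_newline(paths):
    """Return the paths whose content does not end with a newline."""
    return [str(p) for p in paths if not ends_with_newline(Path(p).read_bytes())]
```

Passing the two changed classifier files to `files_missing_final_newline` would list whichever of them still lacks the trailing newline.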
