add DepthScore #3318

Open
Sohaib-Ahmed21 wants to merge 17 commits into Lightning-AI:master from Sohaib-Ahmed21:feature/854_depth_score_metric

Conversation

@Sohaib-Ahmed21 Sohaib-Ahmed21 commented Jan 25, 2026

What does this PR do?

Adds the DepthScore metric for evaluating text generation. The implementation follows the BERTScore architecture and logic closely, as both metrics:

  • Extract contextual embeddings from transformer models
  • Compare sentence representations using token-level embeddings
  • Support custom models, multiple references, and various configuration options

The key difference is that DepthScore measures the distance between embedding distributions using depth-based statistical methods (integrated rank-weighted depth, Wasserstein distance, MMD, etc.) instead of token-level cosine similarity.
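
To make the contrast concrete, here is a minimal sketch (not the PR's implementation) that computes both views on made-up embeddings: a BERTScore-style matrix of token-to-token cosine similarities, and a single distribution-level discrepancy between the two embedding clouds. MMD with an RBF kernel stands in for the distribution-level measure because it is short to write; the helper name `rbf_mmd2`, the bandwidth `sigma`, and the random data are assumptions for illustration only.

```python
import torch


def rbf_mmd2(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased estimate of squared MMD between two sets of row vectors (RBF kernel)."""

    def kernel(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Pairwise squared Euclidean distances, shape (len(a), len(b)).
        d2 = torch.cdist(a, b).pow(2)
        return torch.exp(-d2 / (2 * sigma**2))

    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()


torch.manual_seed(0)
cand = torch.randn(7, 16)       # 7 candidate tokens, 16-dim embeddings (made-up data)
ref = torch.randn(9, 16) + 0.5  # 9 reference tokens from a shifted distribution

# Token-level view (BERTScore-style): one cosine similarity per token pair.
cand_n = torch.nn.functional.normalize(cand, dim=-1)
ref_n = torch.nn.functional.normalize(ref, dim=-1)
token_sim = cand_n @ ref_n.T    # shape (7, 9)

# Distribution-level view: one scalar discrepancy between the two clouds.
dist = rbf_mmd2(cand, ref, sigma=4.0)
```

The token-level view yields a matrix that still has to be matched and aggregated, while the distribution-level view treats each sentence as a point cloud and returns one number directly.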

Fixes #854

Before submitting
  • Was this discussed/agreed via a GitHub issue? (no need for typos and docs improvements)
  • [x] Did you read the contributor guideline, Pull Request section?
  • Did you make sure to update the docs?
  • Did you write any new necessary tests?
PR review

Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues there's a high chance it will not be merged.

Did you have fun?

Make sure you had fun coding 🙃


📚 Documentation preview 📚: https://torchmetrics--3318.org.readthedocs.build/en/3318/

github-actions bot added the "documentation" (Improvements or additions to documentation) and "topic: Text" labels on Jan 25, 2026
@Sohaib-Ahmed21
Author

@bhimrazy @justusschock this PR is ready for review. Kindly approve the workflows and review it, thanks!

codecov Bot commented Jan 28, 2026

Codecov Report

❌ Patch coverage is 19.35484% with 300 lines in your changes missing coverage. Please review.
✅ Project coverage is 36%. Comparing base (bfcc276) to head (d5d520a).

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #3318    +/-   ##
=======================================
- Coverage      37%     36%    -1%     
=======================================
  Files         364     351    -13     
  Lines       20098   20273   +175     
=======================================
- Hits         7520    7396   -124     
- Misses      12578   12877   +299     

Author

Sohaib-Ahmed21 commented Jan 30, 2026

@bhimrazy @rittik9 Please approve the workflows.

Collaborator

rittik9 commented Jan 31, 2026

Thank you @Sohaib-Ahmed21 for this PR. Will take a look...

Author

Sohaib-Ahmed21 commented Feb 3, 2026

@rittik9 @bhimrazy all required checks are passing. Please review the PR, thanks!

Member

@justusschock justusschock left a comment

why are we doing all the operations here in numpy rather than pytorch? it should be easier if we don't need to do conversions, no?

return torch.stack(out, dim=0)


def cov_matrix(x: np.ndarray, robust: bool = False) -> np.ndarray:
Member

what's the issue with torch.cov?

sklearn is a heavy requirement just for this.
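
For reference, a minimal sketch of what a PyTorch-only version of the non-robust path could look like. The name `cov_matrix_torch` is hypothetical, and the input is assumed to have shape `(n_samples, n_features)`. The body of the PR's `robust=True` branch is not shown in this thread, so it is not covered here; `torch.cov` computes the plain sample covariance only.

```python
import torch


def cov_matrix_torch(x: torch.Tensor) -> torch.Tensor:
    """Sample covariance of x with shape (n_samples, n_features).

    Hypothetical helper; covers only the non-robust case.
    """
    # torch.cov treats rows as variables and columns as observations,
    # so transpose to obtain an (n_features, n_features) matrix.
    return torch.cov(x.T)


torch.manual_seed(0)
emb = torch.randn(12, 4)      # 12 token embeddings of dimension 4 (made-up data)
cov = cov_matrix_torch(emb)   # shape (4, 4)
```

Staying in PyTorch keeps the tensor on its original device and avoids the round trip through NumPy that the first review comment asks about.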

containing `"input_ids"` and `"attention_mask"`.
target: Reference sentence(s) as `str`, `Sequence[str]`, multi-reference
`Sequence[Sequence[str]]`, or tokenized dict containing `"input_ids"` and `"attention_mask"`.
model_name_or_path: Hugging Face model name/path used when `model` is not provided.
Member

can we unify this with model? if model is a string, we use that as name or path and if it's a callable we'll use it straight away?
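
One way the suggested unification could be sketched is below. `resolve_model` and its `loader` argument are hypothetical names, not part of the PR; `loader` stands in for whatever turns a name or path into a model (for example, `transformers.AutoModel.from_pretrained`), and it is injected here so the sketch runs without downloading anything.

```python
from typing import Any, Callable, Union


def resolve_model(model: Union[str, Callable[..., Any]], loader: Callable[[str], Any]) -> Any:
    """Return a usable model from either a name/path string or a callable model."""
    if isinstance(model, str):
        # A string is interpreted as a model name or local path.
        return loader(model)
    if callable(model):
        # A callable (e.g. an nn.Module instance) is used as-is.
        return model
    raise TypeError(f"Expected a str or a callable model, got {type(model).__name__}")


def fake_loader(name: str) -> str:
    # Stand-in loader for the example; a real one would build a model.
    return f"loaded:{name}"


def my_model(batch: Any) -> Any:
    # Stand-in for a user-supplied model.
    return batch


from_name = resolve_model("some-model-name", fake_loader)
from_callable = resolve_model(my_model, fake_loader)
```

With this shape, a single `model` argument would replace the `model` / `model_name_or_path` pair, at the cost of a type check at construction time.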

@Sohaib-Ahmed21
Author

Thanks for the review @justusschock. Will address this as soon as possible.

Labels

documentation (Improvements or additions to documentation), topic: Text

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add DepthScore

3 participants