Skip to content

[LEADS-356] Bug fix: Allow the use of local Huggingface embedding models#239

Merged
asamal4 merged 1 commit into
lightspeed-core:mainfrom
xmican10:LEADS-356-huggingface-embedding-fix
May 19, 2026
Merged

[LEADS-356] Bug fix: Allow the use of local Huggingface embedding models#239
asamal4 merged 1 commit into
lightspeed-core:mainfrom
xmican10:LEADS-356-huggingface-embedding-fix

Conversation

@xmican10
Copy link
Copy Markdown
Collaborator

@xmican10 xmican10 commented May 15, 2026

Description

  • For huggingface the provider is set to huggingface so litellm uses the local Huggingface models, instead of models in the cloud
  • Added unit test coverage
  • Manually tested all embedding providers: openai, gemini and huggingface

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Unit tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Claude (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Refactor

    • Refactored embedding provider routing to improve provider integration and initialization logic. Now handles cloud providers and local providers through optimized pathways.
  • Tests

    • Added comprehensive unit tests for embedding provider routing functionality, with coverage for various provider configurations and scenarios.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 15, 2026

📝 Walkthrough

Walkthrough

RagasEmbeddingManager provider routing is refactored to map cloud providers through litellm while preserving native huggingface routing with a use_api=False default. Unit tests verify provider-specific behavior for both cloud and local embedding providers.

Changes

RagasEmbeddingManager Provider Routing

Layer / File(s) Summary
Provider routing implementation
src/lightspeed_evaluation/core/embedding/ragas.py
Provider configuration is normalized, then openai and gemini are mapped to litellm provider, huggingface is mapped to native provider with use_api=False default, unknown providers raise ConfigurationError, and embedding_factory is invoked with the routed provider, model, and merged kwargs.
Provider routing test infrastructure and coverage
tests/unit/core/embedding/conftest.py, tests/unit/core/embedding/test_ragas.py, tests/unit/core/embedding/__init__.py
Fixture mock_embedding_factory patches the embedding_factory, and parametrized tests verify huggingface routing uses native provider with use_api=False while cloud providers (openai, gemini) are routed through litellm.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: fixing a bug to enable local Huggingface embedding models by routing the huggingface provider correctly instead of through litellm.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/unit/core/embedding/test_ragas.py (1)

11-67: ⚡ Quick win

Consider adding test coverage for unknown provider error.

The current tests validate the happy paths for supported providers (HuggingFace, OpenAI, Gemini). Adding a test case for an unknown provider would validate the error handling logic at lines 50-53 in ragas.py.

📝 Optional test case for unknown provider
def test_unknown_provider_raises_configuration_error(
    self, mocker: MockerFixture, mock_embedding_factory: MockType
) -> None:
    """Verify unknown providers raise ConfigurationError."""
    from lightspeed_evaluation.core.system.exceptions import ConfigurationError
    
    mocker.patch(
        "lightspeed_evaluation.core.embedding.manager.validate_provider_env"
    )
    
    config = EmbeddingConfig(provider="unknown_provider", model="some-model")
    embedding_manager = EmbeddingManager(config)
    
    with pytest.raises(ConfigurationError, match="Unknown embedding provider"):
        RagasEmbeddingManager(embedding_manager)
    
    mock_embedding_factory.assert_not_called()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unit/core/embedding/test_ragas.py` around lines 11 - 67, Add a unit
test to cover the unknown provider error path: create an
EmbeddingConfig(provider="unknown_provider", model="some-model"), build
EmbeddingManager(config), patch validate_provider_env like the other tests, then
assert that constructing RagasEmbeddingManager(embedding_manager) raises the
ConfigurationError from lightspeed_evaluation.core.system.exceptions with a
message matching "Unknown embedding provider" (or the actual message used in
ragas.py) and verify mock_embedding_factory.assert_not_called(); this ensures
RagasEmbeddingManager, EmbeddingConfig, and EmbeddingManager handling for
unknown providers is tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/unit/core/embedding/test_ragas.py`:
- Around line 11-67: Add a unit test to cover the unknown provider error path:
create an EmbeddingConfig(provider="unknown_provider", model="some-model"),
build EmbeddingManager(config), patch validate_provider_env like the other
tests, then assert that constructing RagasEmbeddingManager(embedding_manager)
raises the ConfigurationError from lightspeed_evaluation.core.system.exceptions
with a message matching "Unknown embedding provider" (or the actual message used
in ragas.py) and verify mock_embedding_factory.assert_not_called(); this ensures
RagasEmbeddingManager, EmbeddingConfig, and EmbeddingManager handling for
unknown providers is tested.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 9a1a8bc4-9118-43d8-b4f6-89705b667f5c

📥 Commits

Reviewing files that changed from the base of the PR and between 8687ca1 and 411e6ce.

📒 Files selected for processing (4)
  • src/lightspeed_evaluation/core/embedding/ragas.py
  • tests/unit/core/embedding/__init__.py
  • tests/unit/core/embedding/conftest.py
  • tests/unit/core/embedding/test_ragas.py

@xmican10 xmican10 force-pushed the LEADS-356-huggingface-embedding-fix branch from 411e6ce to 7c24841 Compare May 15, 2026 14:26
Copy link
Copy Markdown
Member

@VladimirKadlec VladimirKadlec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

Copy link
Copy Markdown
Collaborator

@asamal4 asamal4 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you !! LGTM

@asamal4 asamal4 merged commit f09e6d0 into lightspeed-core:main May 19, 2026
16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants