Abstract VLM providers into an extensible Strategy Pattern by Nancy-3012 · Pull Request #607 · param20h/PDF-Assistant-RAG

Nancy-3012 · 2026-06-15T03:40:11Z

Closes #592

This PR refactors the vision captioning pipeline by introducing a modular provider-based architecture for Vision-Language Models (VLMs).

Previously, rag/vision.py contained a hardcoded OpenAI implementation, making it difficult to support additional providers without adding provider-specific logic directly into the file.

Changes made:

Created a new backend/app/vision/ package with:
base.py – abstract base class defining a common caption(image_bytes) -> str interface.
registry.py – provider registration and retrieval system.
providers/ – provider implementations for OpenAI (refactored from existing code), Anthropic, Gemini, and Ollama.
Refactored rag/vision.py:
Removed the hardcoded _openai_caption() function.
Updated caption_image() to dynamically load providers using the registry.
Preserved the existing fallback flow (Vision Model → OCR → Placeholder).
Added new configuration settings:
ANTHROPIC_API_KEY
GOOGLE_API_KEY
OLLAMA_BASE_URL
Updated .env.example with configuration examples for all supported providers.

Benefits:

Easier addition of future vision providers.
Improved maintainability and separation of concerns.
Preserves existing functionality while enabling OpenAI, Anthropic, Gemini, and Ollama support.

🗂️ Type of Change

🧪 How was this tested?

Ran the backend locally (uvicorn app.main:app --reload)
Ran the frontend locally (npm run dev inside frontend/)
Tested the affected API endpoints manually
Added / updated tests

param20h · 2026-06-15T10:16:47Z

The test collection is failing due to an ImportError in backend/tests/test_vision_providers.py. The error occurs when pytest tries to import the test module.

Root Cause

The import statement on line 5 is incorrect:

from app.vision.base import BaseVisionProvider

This import path doesn't exist or the module structure is different. The test file is located in backend/tests/, so relative imports or the correct absolute path needs to be used.

Solution

Update the import statements in backend/tests/test_vision_providers.py to use the correct module path. Based on the file location, change:

# Line 5-6: Update these imports
from app.vision.base import BaseVisionProvider
from app.vision.registry import _REGISTRY, get_vision_provider, register_provider

To:

from backend.app.vision.base import BaseVisionProvider
from backend.app.vision.registry import _REGISTRY, get_vision_provider, register_provider

OR if app is properly configured as a package root in your Python path, ensure your pytest.ini or pyproject.toml has the correct pythonpath configuration:

[pytest]
pythonpath = backend

Why This Matters

Once the import error is fixed, pytest will successfully collect all 204 tests and run them properly, generating the coverage reports needed for the codecov upload step.

Nancy-3012 · 2026-06-15T12:10:56Z

@param20h please merge this branch

Abstract VLM providers into an extensible Strategy Pattern

ffc80e2

Nancy-3012 requested a review from param20h as a code owner June 15, 2026 03:40

param20h approved these changes Jun 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Abstract VLM providers into an extensible Strategy Pattern#607

Abstract VLM providers into an extensible Strategy Pattern#607
Nancy-3012 wants to merge 1 commit into
param20h:devfrom
Nancy-3012:abstract-vlm-providers-strategy-pattern

Nancy-3012 commented Jun 15, 2026 •

edited

Loading

Uh oh!

param20h commented Jun 15, 2026

Uh oh!

Nancy-3012 commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Nancy-3012 commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🗂️ Type of Change

🧪 How was this tested?

Uh oh!

param20h commented Jun 15, 2026

Root Cause

Solution

Why This Matters

Uh oh!

Nancy-3012 commented Jun 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Nancy-3012 commented Jun 15, 2026 •

edited

Loading