Naming And SEO Strategy

Generated for the public repository now renamed to RossDmello2/visualdocqa-kit.

Current Identity Audit

| Field | Current state | Assessment |
| --- | --- | --- |
| Repo slug | visualdocqa-kit | Approved rename from visorag; clearer for beginners and less likely to be confused with the established OpenBMB/VisRAG project and paper. |
| README title | VisoRAG | Clear enough for continuity; needs subtitle context to avoid confusion with VisRAG. |
| GitHub description | Vision-first document RAG for PDF/image QA/extraction with ColQwen2, Qwen2.5-VL, Qdrant, and FastAPI. | Strong, but should include DOCX because the source supports DOCX. |
| Topics | 20 topics, including vision-rag, multimodal-rag, document-ai, qdrant, colqwen2 | Good coverage; swap lower-value fastapi and pdf-processing for document-retrieval and vision-language-model. |
| First paragraph | Source-backed technical summary | Good for engineers; now strengthened with audience and status context. |
| Visuals | Real Swagger screenshot plus generated conceptual assets | Honest when labeled. Real API evidence should appear before conceptual artwork. |

Source-Backed Project Identity

VisoRAG is a notebook-originated visual document QA and field-extraction baseline. It supports PDF, DOCX, PNG, JPG, and JPEG inputs through the package validation path in src/visorag/config.py:9 and renders documents to page images in src/visorag/features/document_ingestion.py:92.
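As an illustration of the validation step described above, here is a minimal sketch of an extension check. The names `SUPPORTED_EXTENSIONS` and `is_supported_document` are hypothetical and are not taken from src/visorag/config.py; only the list of formats comes from this document.

```python
from pathlib import Path

# Assumption: this set mirrors the formats listed in the text. The real check
# in src/visorag/config.py may use different names and extra rules.
SUPPORTED_EXTENSIONS = frozenset({".pdf", ".docx", ".png", ".jpg", ".jpeg"})


def is_supported_document(path: str) -> bool:
    """Return True when the final file extension is a supported input type."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    for name in ("invoice.PDF", "report.docx", "scan.jpeg", "notes.txt"):
        print(name, is_supported_document(name))
```

The comparison is case-insensitive and looks only at the last suffix, so a name like `archive.pdf.zip` is rejected.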

The runtime retrieves visual pages with ColQwen2 and an in-memory Qdrant collection in src/visorag/features/visual_retrieval.py:26 and src/visorag/features/visual_retrieval.py:67. It generates answers with local Qwen2.5-VL in src/visorag/features/answer_generation.py:60.

Public surfaces are FastAPI routes in src/visorag/api/app.py:89, src/visorag/api/app.py:99, and src/visorag/api/app.py:109, plus CLI commands in src/visorag/cli.py:55.

Boundaries:

  • Real inference needs CUDA and GPU dependencies; CPU generation is not supported.
  • DOCX conversion needs LibreOffice or soffice.
  • Qdrant state is per-request and in memory.
  • No custom browser frontend ships in this release.
  • Production deployment needs auth, CORS, logging, gateway, and GPU hardening beyond this repository.
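The DOCX boundary above can be checked before a request is accepted. The following is a sketch only: `find_docx_converter` is a hypothetical helper, and the binary names `soffice` and `libreoffice` are assumed to be how the converter appears on PATH.

```python
import shutil
from typing import Optional, Sequence


def find_docx_converter(
    candidates: Sequence[str] = ("soffice", "libreoffice"),
) -> Optional[str]:
    """Return the path of the first converter binary found on PATH, else None."""
    for name in candidates:
        resolved = shutil.which(name)
        if resolved is not None:
            return resolved
    return None


if __name__ == "__main__":
    converter = find_docx_converter()
    if converter is None:
        print("No LibreOffice binary found; DOCX inputs cannot be converted.")
    else:
        print(f"DOCX conversion available via {converter}")
```

A check like this lets a service fail early with a clear message instead of failing partway through ingestion.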

Search Intent Matrix

| Persona | Likely search query | Keywords | What they need to see | Topic candidates |
| --- | --- | --- | --- | --- |
| Beginner AI developer | "pdf rag fastapi qwen" | pdf-qa, fastapi, qwen2-vl | Quickstart, install-mode matrix, API screenshot | pdf-qa, qwen2-vl, fastapi |
| Multimodal RAG builder | "vision rag document qa" | vision-rag, multimodal-rag, document-ai | Architecture diagram and model stack | vision-rag, multimodal-rag, document-ai |
| Retrieval researcher | "colqwen2 qdrant maxsim" | colqwen2, colpali, qdrant, multivector-search | Retrieval details and limitations | colqwen2, colpali, multivector-search |
| API integrator | "document qa api fastapi" | document-question-answering, fastapi | Status codes, auth, curl examples | document-question-answering, fastapi |
| Self-hosting evaluator | "local visual rag gpu" | local-ai, vision-language-model | Deployment constraints and security posture | local-ai, vision-language-model |
| Non-technical evaluator | "ask questions over PDFs with AI" | pdf-qa, document-understanding | Plain-English value, screenshots, limitations | pdf-qa, document-understanding |

Similar Repository Pattern Scan

Observed GitHub search patterns on 2026-06-01:

  • colpali-rag and colpali-rag-app style names are common and precise but easy to blend into other demos.
  • vision-rag style names are descriptive but crowded.
  • Branded acronyms such as VisRAG, VDocRAG, and MMGraphRAG are memorable but can collide with existing papers or organizations.
  • High-signal descriptions name the problem and stack in one sentence.
  • GitHub default repository search uses repository name, description, and topics; README content matters mainly when users search with in:readme.
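To compare default search against README-scoped search, the queries from the matrix can be turned into search URLs. This is a sketch: `repo_search_url` is a hypothetical helper, and the URL shape (`q` plus `type=repositories`) is assumed from GitHub's web search rather than from this repository.

```python
from urllib.parse import urlencode


def repo_search_url(terms: str, scope: str = "") -> str:
    """Build a GitHub repository search URL, optionally scoped with an in: qualifier."""
    query = f"{terms} in:{scope}" if scope else terms
    return "https://github.com/search?" + urlencode(
        {"q": query, "type": "repositories"}
    )


if __name__ == "__main__":
    # Default search: matches on name, description, and topics.
    print(repo_search_url("vision rag document qa"))
    # README-scoped search: only useful when the user adds in:readme.
    print(repo_search_url("colqwen2 qdrant", scope="readme"))
```

Running both forms for each persona query shows whether the repository surfaces from its name, description, and topics alone.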

Important conflict: OpenBMB/VisRAG, the VisRAG paper, and related Hugging Face assets already occupy the adjacent "visual/document RAG" space. That does not require an immediate rename, but it makes a future rename to anything closer to visrag a bad choice.

Sources inspected:

Candidate Names

Each candidate is scored on eight criteria, listed in the order C/M/S/H/D/B/P/U: clarity, memorability, searchability, honesty, domain fit, beginner appeal, professional credibility, and uniqueness. Total is the sum of the eight scores.

| Candidate | Repo slug | Tagline | Scores | Total | Notes |
| --- | --- | --- | --- | --- | --- |
| PageQwen RAG | pageqwen-rag | Qwen2.5-VL document QA over retrieved page images. | 9/8/8/9/9/7/8/8 | 66 | Strong model cue, but tied to Qwen. |
| VisualDocQA Kit | visualdocqa-kit | Notebook-proven visual document QA packaged as FastAPI and CLI. | 9/6/9/9/9/8/8/7 | 65 | Best plain-English display name. |
| RenderPage RAG | render-page-rag | Render documents to pages, retrieve visually, answer locally. | 9/6/8/10/9/8/8/7 | 65 | Very honest pipeline name. |
| VisionPage RAG | vision-page-rag | Vision-first RAG for page images, PDFs, and scanned docs. | 9/7/9/9/9/8/8/6 | 65 | Clear, slightly generic. |
| DocVLM RAG | doc-vlm-rag | Document RAG baseline using visual retrieval and VLM generation. | 9/7/9/9/9/7/9/6 | 65 | Professional, less beginner-friendly. |
| PageVector QA | pagevector-qa | Ask PDFs and images using page embeddings plus local VLM answers. | 8/7/8/9/9/8/8/8 | 65 | Good uniqueness and concept fit. |
| RasterRAG | raster-rag | RAG over rendered document pages, not OCR-first text. | 7/9/8/9/8/6/8/9 | 64 | Memorable, needs explanation. |
| PageRetrieve QA | page-retrieve-qa | Visual page retrieval plus local answer generation. | 8/6/8/9/9/8/8/8 | 64 | Honest but less polished. |
| DocImage RAG | doc-image-rag | RAG that indexes documents as images before answering. | 9/6/9/9/9/8/8/6 | 64 | Searchable but generic. |
| DocRaster QA | docraster-qa | Document question answering over rasterized pages. | 8/7/8/9/8/6/8/9 | 63 | Unique, jargon-heavy. |
| LocalVisionRAG | local-vision-rag | Local GPU visual RAG for PDFs, DOCX, and images. | 9/7/9/8/9/7/8/6 | 63 | Accurate but crowded. |
| VisualField QA | visualfield-qa | Image-based field extraction and document QA. | 8/7/8/8/8/8/8/8 | 63 | Good for extraction, less RAG-specific. |
| PageLens RAG | pagelens-rag | Vision-first document QA and extraction over rendered pages. | 9/8/7/9/9/8/8/5 | 63 | Good display name, weaker uniqueness. |
| Notebook2DocRAG | notebook2docrag | Notebook-origin visual document RAG converted into a package and API. | 8/7/7/10/8/7/7/8 | 62 | Honest but awkward. |
| ColQwen DocRAG | colqwen-docrag | ColQwen2 retrieval and Qwen2.5-VL answers for documents. | 9/6/8/9/9/6/8/7 | 62 | Precise, dependency-bound. |
| DocSight RAG | docsight-rag | Visual document QA and extraction for PDFs and images. | 9/8/7/9/9/8/8/4 | 62 | Attractive but less unique. |
| VDocLite | vdoclite | Small source-readable visual document QA baseline. | 7/8/7/8/8/8/8/8 | 62 | Memorable, less self-explanatory. |
| FormLens RAG | formlens-rag | Visual field extraction and QA for forms, invoices, and PDFs. | 8/8/7/7/7/8/8/7 | 60 | Too form-specific for the current scope. |
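The totals in the table match a plain, unweighted sum of the eight criterion scores. A short sketch for re-checking them follows; `total_score` is a hypothetical helper, and the three rows are copied from the table.

```python
def total_score(scores: str) -> int:
    """Sum a slash-separated score string such as '9/6/9/9/9/8/8/7'."""
    parts = [int(p) for p in scores.split("/")]
    if len(parts) != 8:
        raise ValueError(f"expected 8 scores, got {len(parts)}")
    return sum(parts)


# Rows copied from the candidate table (C/M/S/H/D/B/P/U order).
candidates = {
    "PageQwen RAG": "9/8/8/9/9/7/8/8",
    "VisualDocQA Kit": "9/6/9/9/9/8/8/7",
    "RasterRAG": "7/9/8/9/8/6/8/9",
}

if __name__ == "__main__":
    ranked = sorted(candidates.items(), key=lambda kv: -total_score(kv[1]))
    for name, scores in ranked:
        print(f"{name}: {total_score(scores)}")
```

Because the sum is unweighted, a one-point lead such as 66 versus 65 is not decisive on its own. Changing how much one criterion counts (for example beginner appeal) can reorder the top rows.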

Rejected directions:

  • VisRAG, VisionRAG, or near-spellings: too close to existing projects and papers.
  • Document AI Pro, Enterprise RAG, or Production Vision RAG: overclaims maturity.
  • FastAPI RAG: hides the visual retrieval and model differentiators.
  • Qdrant PDF Chat: undersells DOCX/images and Qwen2.5-VL.

Top 3 Recommendations

1. Best display name: VisualDocQA Kit

  • Approved repo slug: visualdocqa-kit
  • Tagline: Notebook-proven visual document QA packaged as FastAPI and CLI.
  • GitHub description: Vision-first document RAG for PDF, DOCX, and image QA/extraction with ColQwen2, Qwen2.5-VL, Qdrant, and FastAPI.
  • Risk: less memorable than a coined name, but clearest for first-time visitors.

2. Best unique brand direction: RasterRAG

  • Alternative repo slug if a future rename is ever considered: raster-rag
  • Tagline: RAG over rendered document pages, not OCR-first text.
  • Risk: "raster" is precise but less beginner-friendly.

3. Best model-transparent name: PageQwen RAG

  • Alternative repo slug if a future rename is ever considered: pageqwen-rag
  • Tagline: Qwen2.5-VL document QA over retrieved page images.
  • Risk: tightly coupled to Qwen; weaker if the model backend changes.

Recommended Topics

GitHub allows up to 20 topics. Recommended current set:

multimodal-rag
vision-rag
visual-retrieval
document-ai
document-understanding
document-question-answering
document-retrieval
pdf-qa
pdf-extraction
colpali
colqwen2
qwen2-5-vl
qwen2-vl
qdrant
multivector-search
vector-search
vision-language-model
retrieval-augmented-generation
rag
local-ai

This intentionally drops fastapi and pdf-processing from the topic set because framework and generic PDF-processing discovery are lower-value than visual retrieval and VLM discovery for this repo.
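Before applying the set, it can be checked against the 20-topic limit. The sketch below also assumes GitHub's topic format rules (lowercase letters, numbers, and hyphens, at most 50 characters, starting with a letter or number); `topic_problems` is a hypothetical helper.

```python
import re
from typing import List

MAX_TOPICS = 20
# Assumption: format rule as described in the lead-in, not taken from this repo.
TOPIC_PATTERN = re.compile(r"[a-z0-9][a-z0-9-]{0,49}")

RECOMMENDED = [
    "multimodal-rag", "vision-rag", "visual-retrieval", "document-ai",
    "document-understanding", "document-question-answering",
    "document-retrieval", "pdf-qa", "pdf-extraction", "colpali", "colqwen2",
    "qwen2-5-vl", "qwen2-vl", "qdrant", "multivector-search", "vector-search",
    "vision-language-model", "retrieval-augmented-generation", "rag",
    "local-ai",
]


def topic_problems(topics: List[str]) -> List[str]:
    """Return human-readable problems; an empty list means the set looks valid."""
    problems = []
    if len(topics) > MAX_TOPICS:
        problems.append(f"too many topics: {len(topics)} > {MAX_TOPICS}")
    seen = set()
    for topic in topics:
        if topic in seen:
            problems.append(f"duplicate topic: {topic}")
        seen.add(topic)
        if TOPIC_PATTERN.fullmatch(topic) is None:
            problems.append(f"invalid topic: {topic}")
    return problems


if __name__ == "__main__":
    print(topic_problems(RECOMMENDED) or "topic set looks valid")
```

The recommended set uses all 20 slots, so re-adding fastapi or pdf-processing would require dropping another topic.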

Final Recommendation

The owner explicitly approved the rename with RENAME_ALLOWED=true, and the GitHub repository has been renamed from RossDmello2/visorag to RossDmello2/visualdocqa-kit.

Use "VisualDocQA Kit" as the display subtitle and positioning phrase in README/docs. Keep the Python package/import name visorag unless a separate package-level rename is explicitly approved.

Do not run additional rename commands without explicit approval.