A lightweight RAG (Retrieval-Augmented Generation) backend service built with FastAPI and pgvector. Provides knowledge base management, document ingestion, and a hybrid search pipeline via a simple REST API.
- Multi-format document ingestion — PDF, DOCX, PPTX, XLSX, CSV, Markdown, plain text, images, and URLs
- Async document processing — persistent PG-backed task queue with status tracking, auto-retry, and separate worker process (no Redis needed)
- Image extraction & vision — automatically extracts images from documents, stores them in S3-compatible object storage, and optionally describes them via vision LLM
- Hybrid search — combines vector similarity search with BM25 keyword search via Reciprocal Rank Fusion (sketched below the feature list)
- Modular RAG pipeline — Query Rewrite → Retrieve → Fusion → Rerank → Context Expand → Generate
- OpenAI-compatible — works with any OpenAI-compatible LLM/embedding endpoint (OpenAI, DeepSeek, Ollama, etc.)
- Pluggable parsers — native Office (mammoth/python-pptx/openpyxl), pypdfium2 PDF fallback, rapidocr image OCR, MinerU remote API for layout-heavy PDFs, markitdown as last resort; heavy parsers run isolated in subprocesses
- Multiple vector stores — pgvector (default), ChromaDB (optional)
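For context, Reciprocal Rank Fusion merges the vector and BM25 rankings by scoring each document 1/(k + rank) in each list and summing across lists. A minimal illustrative sketch, not the service's actual implementation (k=60 is the conventional constant):

```python
def reciprocal_rank_fusion(result_lists, k=60):
    """Fuse ranked result lists: each document scores 1 / (k + rank)
    per list, summed across lists; a higher fused score ranks higher."""
    scores = {}
    for results in result_lists:
        for rank, doc_id in enumerate(results, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. fuse a vector-similarity ranking with a BM25 ranking
fused = reciprocal_rank_fusion([
    ["doc3", "doc1", "doc7"],  # vector similarity order
    ["doc1", "doc9", "doc3"],  # BM25 order
])
```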
- Python 3.11+
- PostgreSQL with pgvector extension
- S3-compatible object storage (MinIO, Volcengine TOS, AWS S3, etc.)
- An OpenAI-compatible API key
compose.yaml ships only the infrastructure — postgres is always started, MinIO only when using the local profile. The app runs on the host via uv for fast reload.
```bash
cp .env.example .env   # fill in your API key
./scripts/dev.sh       # starts db/minio, migrates, runs uvicorn --reload
```

API at http://localhost:8000, docs at /docs, MinIO console at http://localhost:9001 (minioadmin/minioadmin).
To start infra manually without the convenience script:
```bash
docker compose --profile local up -d --wait   # postgres + minio
uv sync
uv run alembic upgrade head
uv run uvicorn main:app --reload --host 0.0.0.0 --port 8000
```

Two-step workflow: build locally, pull on prod. The prod host never builds (no source, no docker build memory spikes).
Once per release (local machine):
```bash
docker login              # first time only
./scripts/build-push.sh   # builds and pushes voidkey/vectoria:{sha,latest}
```

On the production host:
```bash
cp .env.example .env.prod   # first time only — fill in production values
./scripts/deploy.sh         # git pull + docker pull + migrate + up -d
```

Uses `compose.yaml` + `compose.prod.yaml` with two containers: app (API server, 1.5 GB limit) and worker (background task processing, 4 GB limit). The image defaults to `voidkey/vectoria:latest` but can be pinned: `VECTORIA_IMAGE=voidkey/vectoria:abc1234 ./scripts/deploy.sh`. Logs: `docker compose -f compose.yaml -f compose.prod.yaml logs -f app worker`.
If you prefer running the app directly on the host via uv (e.g. shared server with multiple services):
```bash
./scripts/deploy-host.sh   # pulls, syncs deps, migrates, runs uvicorn in background
```

Logs go to `logs/uvicorn-<timestamp>.log` (one file per deploy, never overwritten). Override the port via `PORT=8002 ./scripts/deploy-host.sh`.
```
POST /v1/analyze/file   # upload a file (multipart/form-data)
POST /v1/analyze/url    # provide a URL (JSON body)
```

Parse a file or URL into Markdown without storing it. Returns the parsed Markdown along with any extracted images.
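A usage sketch with Python and `requests`; the multipart field name `file`, the JSON field `url`, and the `X-API-Key` header are assumptions, so check the generated /docs for the exact schema:

```python
import requests

BASE = "http://localhost:8000"
HEADERS = {"X-API-Key": "your-api-key"}  # only needed if API_KEY is set

# Parse a local PDF into Markdown without storing it
# (the multipart field name "file" is an assumption)
with open("report.pdf", "rb") as f:
    resp = requests.post(f"{BASE}/v1/analyze/file",
                         files={"file": f}, headers=HEADERS)
resp.raise_for_status()
result = resp.json()  # parsed Markdown plus extracted images

# Parse a URL instead (the JSON field name "url" is an assumption)
resp = requests.post(f"{BASE}/v1/analyze/url",
                     json={"url": "https://example.com/page"},
                     headers=HEADERS)
```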
```
POST   /v1/knowledgebases                              # create
GET    /v1/knowledgebases                              # list
DELETE /v1/knowledgebases/{kb_id}                      # delete
POST   /v1/knowledgebases/{kb_id}/documents            # ingest file or URL
GET    /v1/knowledgebases/{kb_id}/documents            # list
GET    /v1/knowledgebases/{kb_id}/documents/{doc_id}   # get status
DELETE /v1/knowledgebases/{kb_id}/documents/{doc_id}   # delete
```
Document ingestion is asynchronous — the API returns immediately with status: "processing". Poll the single-document endpoint to check progress (completed or failed).
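A minimal polling sketch in Python; the multipart field name and the `id`/`status` response fields are assumptions about the response shape, while the status values come from the note above:

```python
import time
import requests

BASE = "http://localhost:8000"
KB_ID = "kb_123"  # hypothetical knowledge base id

# Ingest a file (multipart field name "file" is an assumption)
with open("handbook.docx", "rb") as f:
    doc = requests.post(f"{BASE}/v1/knowledgebases/{KB_ID}/documents",
                        files={"file": f}).json()

# Poll the single-document endpoint until processing finishes
while True:
    status = requests.get(
        f"{BASE}/v1/knowledgebases/{KB_ID}/documents/{doc['id']}"
    ).json()["status"]  # "id"/"status" field names are assumptions
    if status in ("completed", "failed"):
        break
    time.sleep(2)
print(f"document {doc['id']}: {status}")
```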
```
GET /v1/knowledgebases/{kb_id}/documents/{doc_id}/images                          # list images
GET /v1/knowledgebases/{kb_id}/documents/{doc_id}/images/{img_id}/presigned-url   # get presigned URL
```
```
POST /v1/knowledgebases/{kb_id}/query
```

Request body:

```json
{
  "query": "What is the refund policy?",
  "top_k": 5,
  "query_rewrite": true,
  "rerank": false
}
```
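A sketch of calling the query endpoint from Python with `requests`; the knowledge base id is a placeholder and the response shape is not specified here:

```python
import requests

resp = requests.post(
    "http://localhost:8000/v1/knowledgebases/kb_123/query",  # kb_123 is a placeholder
    json={
        "query": "What is the refund policy?",
        "top_k": 5,
        "query_rewrite": True,
        "rerank": False,
    },
)
resp.raise_for_status()
print(resp.json())  # response shape depends on your pipeline settings
```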
All settings are configured via environment variables (see `.env.example`).

| Variable | Default | Description |
|---|---|---|
| `OPENAI_BASE_URL` | `https://api.openai.com/v1` | LLM API base URL |
| `OPENAI_API_KEY` | — | API key |
| `LLM_MODEL` | `gpt-4o` | Model for generation and query rewrite |
| `EMBEDDING_BASE_URL` | (falls back to `OPENAI_BASE_URL`) | Embedding API base URL |
| `EMBEDDING_API_KEY` | (falls back to `OPENAI_API_KEY`) | Embedding API key |
| `EMBEDDING_MODEL` | `text-embedding-3-small` | Embedding model |
| `EMBEDDING_DIMENSIONS` | `1536` | Embedding vector dimensions |
| `DATABASE_URL` | `postgresql+asyncpg://...` | PostgreSQL connection string |
| `STORAGE_TYPE` | `s3` | Object storage backend type |
| `S3_ENDPOINT` | `http://localhost:9000` | S3-compatible endpoint URL |
| `S3_REGION` | — | Region (required for TOS, e.g. `cn-beijing`) |
| `S3_ACCESS_KEY` | `minioadmin` | Access key |
| `S3_SECRET_KEY` | `minioadmin` | Secret key |
| `S3_BUCKET` | `vectoria` | Bucket name |
| `S3_ADDRESSING_STYLE` | `auto` | `auto`, `virtual`, or `path` |
| `S3_PRESIGN_EXPIRES` | `3600` | Presigned URL expiry (seconds) |
| `DEFAULT_PARSE_ENGINE` | `auto` | Parser engine (`auto`, `docx-native`, `pptx-native`, `xlsx-native`, `pdfium`, `ocr-native`, `markitdown`, `mineru`, `url`) |
| `ENABLE_QUERY_REWRITE` | `true` | Rewrite queries with LLM before retrieval |
| `ENABLE_RERANKER` | `false` | Enable cross-encoder reranking |
| `RERANKER_BASE_URL` | — | Reranker API URL |
| `VISION_BASE_URL` | — | Vision LLM API URL (optional, for image description) |
| `VISION_API_KEY` | — | Vision LLM API key |
| `VISION_MODEL` | `gpt-4o` | Vision model |
| `MINERU_API_URL` | — | MinerU remote API URL (optional, for GPU-based PDF OCR) |
| `MINERU_BACKEND` | `pipeline` | MinerU backend mode |
| `MINERU_LANGUAGE` | `ch` | MinerU OCR language |
| `API_KEY` | (blank = public) | API key for client authentication (`X-API-Key` header) |
| `CORS_ORIGINS` | `["*"]` | Allowed CORS origins |
For local OCR support:
```bash
uv sync --extra ocr
```

Inspired by the architecture and design ideas from the WeKnora project.
MIT