# rag-optimization

Here are 8 public repositories matching this topic...

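VecRecall is an improved long-term memory system for AI. It was rebuilt from an in-depth analysis of the original MemPalace, and its core design principle is to fully decouple "information retrieval" from "information organization". With a pure vector retrieval path and an independent SQLite UI layer, VecRecall keeps organization flexible while raising recall (R@5) from the original's 84% to 96.6%+, giving AI agents more precise and efficient contextual memory.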

  • Updated Apr 25, 2026
  • Python

This repo contains the full pipeline for my Master's thesis at Yerevan State University (YSU), developed as part of the Data Science for Business master's program. The goal of this project is to build an end-to-end Retrieval-Augmented Generation (RAG) system using semantic search, LLMs, and fine-tuned embeddings on Armenian banks’ financial PDFs.

  • Updated Apr 28, 2025
  • Python

CPU-optimized RAG pipeline that reduces latency 2.7× (247 ms → 92 ms). Implements caching, filtering, and quantization for production. Comes complete with FastAPI, Docker, benchmarks, and investor materials. The engineering showcase that sells itself.

  • Updated Mar 31, 2026
  • Python
