github-agentic-rag

Quick PoC for indexing and embedding your codebase into PostgreSQL, then querying it with LangGraph.

You'll need to add your own summary.txt file so the query decomposer can identify what the codebase is actually about!

Test with:

python -Wignore embed.py --git_url "https://github.com/sooperset/mcp-atlassian.git" --output_dir ./output --to_embedding --to_postgres --load_on_startup --batch_size 5
python retrieve.py --query "Does the Jira Tool have a tool to get project issues within a board"  --codebase_name "sooperset/mcp-atlassian" --codebase_summary_path "./summary.txt"

NOTE: Because we plan to use query decomposition, we will likely grab almost 30 chunks for a single hop. Using MMR (Maximal Marginal Relevance), we truncate this down to 5. Please keep the chunks small! Small chunks help ensure diversity in the questions in the first place. I've set the chunk size to 2048 tokens rather than something smaller, purely because of the I/O slowdown when committing too many chunks into Postgres.
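
To illustrate the MMR truncation step mentioned in the note, here is a minimal, dependency-free sketch of Maximal Marginal Relevance. It is not the code used in retrieve.py; the function names `cosine` and `mmr_select`, the `lambda_mult` trade-off parameter, and the toy 2-D vectors are all assumptions made for illustration. The idea is to pick chunks one at a time, rewarding similarity to the query and penalising similarity to chunks already picked.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(query_vec, chunk_vecs, k=5, lambda_mult=0.5):
    """Return the indices of up to k chunks chosen by Maximal Marginal Relevance.

    lambda_mult (hypothetical name) balances relevance against diversity:
    1.0 ranks purely by relevance, 0.0 purely by dissimilarity to earlier picks.
    """
    relevance = [cosine(query_vec, vec) for vec in chunk_vecs]
    selected = []
    remaining = list(range(len(chunk_vecs)))

    while remaining and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to any chunk already selected.
            redundancy = max(
                (cosine(chunk_vecs[i], chunk_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance[i] - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)

    return selected


if __name__ == "__main__":
    # Toy data: chunks 0 and 1 are exact duplicates, chunk 2 points elsewhere.
    query = [1.0, 0.0]
    chunks = [[1.0, 0.0], [1.0, 0.0], [0.6, 0.8]]
    print(mmr_select(query, chunks, k=2, lambda_mult=0.3))
```

In the toy data the second pick skips the duplicate chunk in favour of the less similar one, which is the behaviour the note relies on when cutting roughly 30 candidates down to 5. In a real pipeline the vectors would be the stored embeddings rather than hand-written lists.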