Scraper + local RAG (Retrieval-Augmented Generation) pipeline for the official EPLAN Engineering Configuration Pro 2026 documentation.
This sub-project produces the local ChromaDB index that the cloudflare-rag-eecpro/ Worker uploads to Cloudflare Vectorize for remote serving.
- `scrape_eplan.py`: Documentation scraper (HTML → Markdown).
- `rag_index_llama.py`: LlamaIndex-powered indexer, stores natively in ChromaDB.
- `rag_query_llama.py`: Hybrid search + generation (Agentic RAG) with LlamaIndex; can connect to local LLMs (e.g. Ollama).
- `rag_db_llama_chroma/`: Local ChromaDB vector database (gitignored). Generated by `rag_index_llama.py`.
- `docs_md/`: Scraped documentation pages converted to Markdown, organized into 36 categories (gitignored).
CRITICAL RULE FOR AI AGENTS (CLAUDE CODE): Before writing any EPLAN macros, formulas, or scripts — or before resolving configuration issues — you MUST query the RAG to retrieve the exact documentation. ALWAYS run:
`python rag_query_llama.py "your technical question" --json` and read the `content` fields before answering.
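For agents or scripts that call the query tool programmatically, the rule above could be wrapped as in the sketch below. It rests on assumptions not confirmed by this README: that `--json` prints only JSON to stdout, and that the payload is either a list of result objects or an object with a `results` list, each result carrying a `content` string. The helper names `extract_contents` and `query_rag` are hypothetical.

```python
import json
import subprocess
import sys


def extract_contents(raw_json: str) -> list[str]:
    """Return the `content` field of every result in the tool's JSON output.

    Assumption: the payload is a JSON list of result objects, or an object
    with a "results" list. Adjust if the real schema differs.
    """
    data = json.loads(raw_json)
    results = data.get("results", []) if isinstance(data, dict) else data
    return [r["content"] for r in results if isinstance(r, dict) and "content" in r]


def query_rag(question: str) -> list[str]:
    """Run rag_query_llama.py with --json and return the retrieved passages."""
    proc = subprocess.run(
        [sys.executable, "rag_query_llama.py", question, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return extract_contents(proc.stdout)
```

Calling `query_rag("terminal addressing")` from the project directory would then return the passages to read before answering. If the script also writes log lines to stdout, `json.loads` will fail and the output needs filtering first.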

    # Install dependencies
    pip install chromadb llama-index llama-index-vector-stores-chroma llama-index-llms-ollama

    # Re-index (only if docs are modified)
    python rag_index_llama.py

    # Search documentation (pure retrieval)
    python rag_query_llama.py "how to configure Job Server REST API"

    # Search and generate a response with a local LLM
    python rag_query_llama.py "What are the rules for PLC addressing?" --chat --model llama3.2

    # Options
    python rag_query_llama.py "terminal addressing" --no-rerank   # disable cross-encoder (faster)
    python rag_query_llama.py "macro template" --no-parent        # disable parent-page context
    python rag_query_llama.py "csv export" --json                 # JSON output

| Aspect | Implementation |
|---|---|
| Framework | LlamaIndex + ChromaDB |
| Chunking | Hierarchical LlamaIndex (TextNode) |
| Chunks | ~6,760 TextNodes + relational metadata |
| Embedding | BAAI/bge-base-en-v1.5 (768 dims) |
| Search | QueryFusionRetriever (BM25 + vector) |
| Reranking | Cross-encoder ms-marco-MiniLM |
| LLM generation | Yes — native --chat support via Ollama |
| Context | Chunk + injectable parent pages |
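
The Search row combines a BM25 (keyword) ranking with a vector-similarity ranking. One common way to merge two ranked lists is reciprocal rank fusion (RRF), sketched below in plain Python. It illustrates the idea only: this README does not state which fusion mode the project's `QueryFusionRetriever` is configured with, and the chunk IDs and the constant `k=60` are illustrative assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk IDs into one fused ranking.

    Each chunk scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical retrieval results from the two retrievers.
bm25_hits = ["chunk_a", "chunk_b", "chunk_c"]
vector_hits = ["chunk_b", "chunk_d", "chunk_a"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
```

Chunks found by both retrievers accumulate two contributions and rise above chunks found by only one, which is why hybrid search tends to help with exact identifiers that embeddings alone can miss.
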
| Category | Pages | Description |
|---|---|---|
| `refformulas` | 317 | Formula language |
| `admin` | 304 | Installation, configuration, VMArgs |
| `eecbase` | 119 | Base functionality |
| `refformui` | 84 | Form-UI reference |
| `refcommands` | 79 | Commands |
| `refscripting` | 58 | Scripting reference |
| `concept` | 58 | Concepts |
| `eececad` | 47 | ECAD module |
| `eecplc` | 47 | PLC module |
| + 27 more | ~500 | Tutorials, SAP, Office, Graph2D, etc. |
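
To check these page counts against a fresh scrape, the Markdown files can be tallied per category. The sketch below assumes that `docs_md/` holds one subdirectory per category with the pages as `.md` files beneath it; that layout is an assumption, not something this README confirms, and `count_pages_by_category` is a hypothetical helper.

```python
from collections import Counter
from pathlib import Path


def count_pages_by_category(root: str = "docs_md") -> Counter:
    """Count Markdown pages per top-level subdirectory of `root`.

    Assumption: each category is a direct subdirectory of `root`.
    Files lying directly in `root` are ignored.
    """
    counts: Counter = Counter()
    root_path = Path(root)
    for md_file in root_path.rglob("*.md"):
        parts = md_file.relative_to(root_path).parts
        if len(parts) > 1:  # skip files that are not inside a category folder
            counts[parts[0]] += 1
    return counts
```

Printing `count_pages_by_category().most_common()` lists the categories from largest to smallest, in the same order as the table above.
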
Official documentation: https://www.eplan.help/en-us/infoportal/content/eecpro/2026/Content/htm/main_k_home.htm