EPLAN EEC Pro 2026 — Local Documentation RAG Builder

Scraper + local RAG (Retrieval-Augmented Generation) pipeline for the official EPLAN Engineering Configuration Pro 2026 documentation.

This sub-project produces the local ChromaDB index that the cloudflare-rag-eecpro/ Worker uploads to Cloudflare Vectorize for remote serving.

Contents

  • scrape_eplan.py — Documentation scraper (HTML → Markdown).
  • rag_index_llama.py — LlamaIndex-powered indexer, stores natively in ChromaDB.
  • rag_query_llama.py — Hybrid search + generation (Agentic RAG) with LlamaIndex; can connect to local LLMs (e.g. Ollama).
  • rag_db_llama_chroma/ — Local ChromaDB vector database (gitignored). Generated by rag_index_llama.py.
  • docs_md/ — Scraped documentation pages converted to Markdown, organized into 36 categories (gitignored).
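
The HTML → Markdown step performed by the scraper can be illustrated with a minimal, standard-library-only sketch. This is not the code in scrape_eplan.py; the class name MarkdownSketch and the function html_to_markdown are hypothetical, and only a small subset of tags (h1 to h3, p, li) is handled.

```python
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Convert a small subset of HTML (h1-h3, p, li) to Markdown lines."""

    # Markdown prefix emitted for each supported block-level tag.
    PREFIXES = {"h1": "# ", "h2": "## ", "h3": "### ", "li": "- ", "p": ""}

    def __init__(self):
        super().__init__()
        self.lines = []    # finished Markdown lines
        self._tag = None   # block tag currently being collected, if any
        self._buf = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag in self.PREFIXES:
            self._tag, self._buf = tag, []

    def handle_data(self, data):
        # Inline tags such as <b> are ignored, but their text is kept.
        if self._tag is not None:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag == self._tag:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buf).split())
            if text:
                self.lines.append(self.PREFIXES[tag] + text)
            self._tag = None


def html_to_markdown(html: str) -> str:
    """Return the supported blocks of `html` as Markdown, one per paragraph."""
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return "\n\n".join(parser.lines)
```

A real scraper would also need to handle links, tables, code samples and nested lists, which this sketch deliberately leaves out.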

Usage for AI Assistants / Claude Code (Agentic RAG)

CRITICAL RULE FOR AI AGENTS (CLAUDE CODE): Before writing any EPLAN macros, formulas, or scripts, and before resolving any configuration issue, you MUST query the RAG index to retrieve the relevant documentation. ALWAYS run: python rag_query_llama.py "your technical question" --json and read the content fields of the JSON output before answering.
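
For agents that call the script programmatically, the rule above might be wrapped as in the following sketch. It assumes that the script prints only JSON to stdout when --json is given. The exact JSON schema is not documented here, so the helper walks the whole structure and collects every string stored under a "content" key. The function names collect_content and query_docs are hypothetical.

```python
import json
import subprocess
import sys


def collect_content(node):
    """Recursively gather every string stored under a "content" key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "content" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_content(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_content(item))
    return found


def query_docs(question, script="rag_query_llama.py"):
    """Run the query script with --json and return the content strings.

    Assumption: stdout contains nothing but the JSON document.
    """
    result = subprocess.run(
        [sys.executable, script, question, "--json"],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with a non-zero status
    )
    return collect_content(json.loads(result.stdout))
```

Calling query_docs("how to configure Job Server REST API") from the project directory would then return the retrieved passages as a list of strings, provided the index has been built.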

Quick Start

# Install dependencies
pip install chromadb llama-index llama-index-vector-stores-chroma llama-index-llms-ollama

# Re-index (only if docs are modified)
python rag_index_llama.py

# Search documentation (pure retrieval)
python rag_query_llama.py "how to configure Job Server REST API"

# Search and generate a response with a local LLM
python rag_query_llama.py "What are the rules for PLC addressing?" --chat --model llama3.2

# Options
python rag_query_llama.py "terminal addressing" --no-rerank    # disable cross-encoder (faster)
python rag_query_llama.py "macro template" --no-parent         # disable parent-page context
python rag_query_llama.py "csv export" --json                  # JSON output

Architecture (LlamaIndex SOTA stack)

| Aspect | Implementation |
| --- | --- |
| Framework | LlamaIndex + ChromaDB |
| Chunking | Hierarchical LlamaIndex (TextNode) |
| Chunks | ~6,760 TextNodes + relational metadata |
| Embedding | BAAI/bge-base-en-v1.5 (768 dims) |
| Search | QueryFusionRetriever (BM25 + vector) |
| Reranking | Cross-encoder ms-marco-MiniLM |
| LLM generation | Yes, native --chat support via Ollama |
| Context | Chunk + injectable parent pages |
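
To show how a BM25 ranking and a vector ranking can be merged into a single result list, here is a pure-Python sketch of reciprocal rank fusion, one common fusion technique. Whether rag_query_llama.py configures QueryFusionRetriever this way is an assumption; the function name, the constant k=60 and the document IDs are illustrative only.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by several retrievers rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from the two retrievers for the same query.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)  # ['b', 'c', 'a', 'd']
```

Documents b and c appear in both lists and therefore outrank a and d, which each appear in only one. In a pipeline like the one in the table, a cross-encoder would then re-score the fused candidates.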

Main Categories

| Category | Pages | Description |
| --- | --- | --- |
| refformulas | 317 | Formula language |
| admin | 304 | Installation, configuration, VMArgs |
| eecbase | 119 | Base functionality |
| refformui | 84 | Form-UI reference |
| refcommands | 79 | Commands |
| refscripting | 58 | Scripting reference |
| concept | 58 | Concepts |
| eececad | 47 | ECAD module |
| eecplc | 47 | PLC module |
| + 27 more | ~500 | Tutorials, SAP, Office, Graph2D, etc. |

Source

Official documentation: https://www.eplan.help/en-us/infoportal/content/eecpro/2026/Content/htm/main_k_home.htm