This project implements a full digital thread intelligence system combining:
- Neo4j knowledge graph (assemblies → parts → specs → embeddings)
- Hybrid RAG retrieval engine
- Graph-aware LLM reasoning
- Compatibility scoring model for mechanical components
- FastAPI backend
- React frontend
- YAML ingestion + document storage
It supports natural-language engineering queries, part replacement suggestions, compatibility evaluation, and intelligent search across manufacturing assemblies.
This is essentially a mini industrial PLM + RAG + graph system.
Each product is a digital entity in Neo4j.
Stores:
- Name, SKU, description
- Embedding (for product-level semantic retrieval)
Example: Industrial Lathe Machine
Assemblies are generated automatically from part categories using an ASSEMBLY_MAP, e.g.:
- Spindle Assembly
- Z Axis Assembly
- X Axis Assembly
- Tailstock Assembly
- Mold System Assembly
- Electronics Assembly
Assemblies are shared across parts and products.
Graph structure:
(Product) —[:HAS_ASSEMBLY]→ (Assembly) —[:HAS_PART]→ (Part)
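A minimal sketch of how a category-to-assembly lookup like ASSEMBLY_MAP might work. Only the "Spindle" category and the assembly names come from this README; the other categories, the `DEFAULT_ASSEMBLY` fallback, and both helper functions are assumptions for illustration, not the project's actual code.

```python
# Hypothetical category -> assembly mapping; the project's real ASSEMBLY_MAP may differ.
ASSEMBLY_MAP = {
    "Spindle": "Spindle Assembly",
    "Tailstock": "Tailstock Assembly",
    "Electronics": "Electronics Assembly",
}
DEFAULT_ASSEMBLY = "General Assembly"  # assumed fallback, not taken from the project


def assembly_for(category: str) -> str:
    """Return the assembly name for a part category."""
    return ASSEMBLY_MAP.get(category, DEFAULT_ASSEMBLY)


def group_by_assembly(parts: list) -> dict:
    """Group part_ids under the assembly derived from each part's category."""
    grouped = {}
    for part in parts:
        grouped.setdefault(assembly_for(part["category"]), []).append(part["part_id"])
    return grouped


parts = [{"part_id": "SPINDLE-MT5-38HOLE", "category": "Spindle"}]
print(group_by_assembly(parts))
```

Each key of the grouped result would correspond to one (Assembly) node, and each listed part_id to one HAS_PART relationship.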
Each part stores:
- part_id, name, category, description
- Embedding (384-D) using gte-small
- Specs (thread size, diameter, pitch, torque, etc.)
- Children for hierarchical structure (subcomponents)
Graph:
(Part) —[:HAS_CHILD]→ (Part)
(Part) —[:HAS_SPEC]→ (Spec)
Specs are stored as:
(key, value, unit)
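Uniqueness of each (key, value, unit) combination is enforced by a constraint (the shipped schema uses IS UNIQUE rather than the Enterprise-only NODE KEY).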
This allows powerful spec-level queries like:
MATCH (p:Part)-[:HAS_SPEC]->(s:Spec {key:"pitch", value:5})
RETURN p
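From Python, the same lookup is safer as a parameterized query than as a string with inlined values. The sketch below only builds the query text and parameters; the helper name `spec_query` and the returned columns are assumptions, not part of the project.

```python
def spec_query(key: str, value):
    """Build a parameterized Cypher query for parts that have a given spec."""
    cypher = (
        "MATCH (p:Part)-[:HAS_SPEC]->(s:Spec {key: $key, value: $value}) "
        "RETURN p.part_id AS part_id, p.name AS name"
    )
    return cypher, {"key": key, "value": value}


query, params = spec_query("pitch", 5)
print(query)
print(params)
```

With the official Neo4j Python driver, the pair could then be passed to `session.run(query, params)`.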
User uploads a YAML file:
product:
  name: Industrial Lathe Machine
  description: ...
parts:
  - part_id: SPINDLE-MT5-38HOLE
    category: Spindle
    specs:
      - key: "thread"
        value: "M45"
The ingestor:
- Creates/updates product
- Creates assemblies
- Creates part nodes
- Embeds each part's name + description
- Stores specs (forcing non-null units)
- Builds parent-child relationships recursively
This yields a consistent, hierarchical digital twin.
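The recursive part of the ingestion steps above can be sketched as a tree walk over the parsed YAML. This is an illustration under assumptions: the `children` key name, the empty-string default for missing units, and the child part id `CHILD-1` are hypothetical, and the real ingestor also writes to Neo4j and computes embeddings.

```python
def flatten_parts(parts, parent_id=None):
    """Walk a nested parts list; return (part_rows, child_edges, spec_rows)."""
    rows, edges, specs = [], [], []
    for part in parts:
        pid = part["part_id"]
        rows.append({"part_id": pid, "category": part.get("category")})
        if parent_id is not None:
            # Becomes (parent)-[:HAS_CHILD]->(child) in the graph.
            edges.append((parent_id, pid))
        for spec in part.get("specs", []):
            # Force a non-null unit so (key, value, unit) is always complete.
            unit = spec.get("unit") or ""
            specs.append((pid, spec["key"], spec["value"], unit))
        # Recurse into subcomponents, passing this part as their parent.
        child_rows, child_edges, child_specs = flatten_parts(
            part.get("children", []), parent_id=pid
        )
        rows += child_rows
        edges += child_edges
        specs += child_specs
    return rows, edges, specs


tree = [{
    "part_id": "SPINDLE-MT5-38HOLE",
    "category": "Spindle",
    "specs": [{"key": "thread", "value": "M45"}],
    "children": [{"part_id": "CHILD-1", "category": "Spindle"}],  # hypothetical child
}]
rows, edges, specs = flatten_parts(tree)
print(edges)
print(specs)
```

Flattening first keeps the database writes simple: one batch for part nodes, one for HAS_CHILD edges, one for specs.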
This is one of the most advanced parts of the system.
The user query is embedded using the same 384-D embedding model as parts.
Two searches run in parallel:
CALL db.index.vector.queryNodes(
  'part_embedding_index',
  $k,
  $embedding
) YIELD node, score
Retrieves semantically relevant parts even if keywords are missing.
CALL db.index.fulltext.queryNodes(
  'part_fulltext_idx',
  $query
) YIELD node, score
Retrieves keyword matches with fuzzy ranking.
We enforce digital-thread scoping:
- Only show parts that belong to the selected product
- Or selected assembly
We combine vector + keyword results, keeping only the highest score per part_id.
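A minimal sketch of that merge step, assuming each search returns `(part_id, score)` pairs and that both score scales have already been normalized to be comparable (raw full-text scores are not bounded the way cosine similarities are). The function name `merge_hits` is hypothetical.

```python
def merge_hits(vector_hits, keyword_hits):
    """Union two hit lists, keeping only the highest score per part_id."""
    best = {}
    for part_id, score in list(vector_hits) + list(keyword_hits):
        if part_id not in best or score > best[part_id]:
            best[part_id] = score
    # Highest-scoring parts first.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


vector_hits = [("A", 0.8), ("B", 0.4)]
keyword_hits = [("B", 0.9), ("C", 0.1)]
print(merge_hits(vector_hits, keyword_hits))
```

Part B appears in both lists and keeps its keyword score of 0.9, so it ranks ahead of A.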
We fetch:
- Specs
- Product associations
- Assembly placement
If retrieved parts are known to be compatible, we display:
A ↔ B: score 0.73 — pitch matches; same assembly; compatible torque range
The LLM receives:
- Top retrieved parts
- Graph structure
- Specs
- Compatibility edges
It generates a structured, human-like engineering answer.
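One way the retrieved context could be serialized before it is handed to the LLM is shown below. The dictionary layout, the `build_context` name, and the text format are all assumptions for illustration; the project's actual prompt is not shown in this README.

```python
def build_context(parts, compat_edges):
    """Render retrieved parts and compatibility edges as plain-text context."""
    lines = ["Retrieved parts:"]
    for part in parts:
        specs = ", ".join(f"{key}={value}" for key, value in part.get("specs", {}).items())
        lines.append(f"- {part['part_id']} ({part['assembly']}): {specs}")
    if compat_edges:
        lines.append("Known compatibility:")
        for a, b, score, reasons in compat_edges:
            lines.append(f"- {a} <-> {b}: score {score:.2f}; " + "; ".join(reasons))
    return "\n".join(lines)


context = build_context(
    [{"part_id": "A", "assembly": "Spindle Assembly", "specs": {"pitch": 5}}],
    [("A", "B", 0.73, ["pitch matches", "same assembly"])],
)
print(context)
```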
The compatibility model compares two parts (existing vs existing, or new vs existing) along four dimensions, each computed independently: mechanical, functional, semantic, and hierarchy.
Mechanical checks:
- diameters
- lengths
- pitches
- threads
- torque ratings
- fits and tolerances
Scoring formula:
mechanical_score = weighted match of overlapping mechanical specs
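A minimal sketch of one reading of "weighted match of overlapping specs": the weighted fraction of shared spec keys whose values agree. It assumes exact equality and a default weight of 1.0 per spec; a real scorer would more likely use tolerances for numeric values, and the actual weights are not given in this README.

```python
def mechanical_score(specs_a, specs_b, weights=None):
    """Weighted fraction of shared spec keys whose values are equal."""
    weights = weights or {}
    shared = set(specs_a) & set(specs_b)
    total = sum(weights.get(key, 1.0) for key in shared)
    if total == 0:
        return 0.0  # no overlapping specs, or all shared weights are zero
    matched = sum(
        weights.get(key, 1.0) for key in shared if specs_a[key] == specs_b[key]
    )
    return matched / total


a = {"pitch": 5, "thread": "M45", "diameter": 38}
b = {"pitch": 5, "thread": "M40", "length": 100}
print(mechanical_score(a, b))                   # pitch matches, thread differs
print(mechanical_score(a, b, {"thread": 3.0}))  # thread mismatch weighs more
```

Specs present on only one part (diameter, length) are ignored rather than penalized, which is why the semantic dimension matters when specs are sparse.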
Functional checks:
- part category
- assembly role
- operational purpose
- motion profile
- intended load path
For example, two ballscrews are functionally similar even if their dimensions differ.
Semantic: embedding distance between part descriptions.
This is extremely useful when specs are missing.
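Embedding distance is commonly measured with cosine similarity. The sketch below assumes that choice and clamps negative similarities to zero so the result fits a 0 to 1 score; the README does not state which metric or normalization the project uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # undefined for zero vectors; treat as no similarity
    return dot / (norm_a * norm_b)


def semantic_score(embedding_a, embedding_b):
    """Map cosine similarity into [0, 1] by clamping negatives (an assumption)."""
    return max(0.0, cosine_similarity(embedding_a, embedding_b))


print(semantic_score([1.0, 0.0], [1.0, 0.0]))
print(semantic_score([1.0, 0.0], [0.0, 1.0]))
```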
Hierarchy checks:
- Are both parts in the same assembly?
- Same subassembly?
- Do they share a parent/child?
Example:
Spindle Nut ↔ Spindle Shaft → HIGH
Ballscrew ↔ Tailstock → LOW
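The hierarchy checks can be sketched as a tiered score. The numeric tiers (1.0, 0.8, 0.6, 0.0), the dictionary fields, and the part ids are assumptions chosen only to reproduce the HIGH and LOW outcomes in the example above.

```python
def hierarchy_score(a, b):
    """Score structural closeness of two parts in the assembly tree."""
    if a.get("parent") == b["part_id"] or b.get("parent") == a["part_id"]:
        return 1.0  # direct parent/child
    if a.get("parent") is not None and a.get("parent") == b.get("parent"):
        return 0.8  # siblings under the same parent (same subassembly)
    if a["assembly"] == b["assembly"]:
        return 0.6  # same assembly, no closer relation
    return 0.0


nut = {"part_id": "SPINDLE-NUT", "assembly": "Spindle Assembly", "parent": "SPINDLE-SHAFT"}
shaft = {"part_id": "SPINDLE-SHAFT", "assembly": "Spindle Assembly"}
ballscrew = {"part_id": "BALLSCREW", "assembly": "Z Axis Assembly"}
tailstock = {"part_id": "TAILSTOCK", "assembly": "Tailstock Assembly"}

print(hierarchy_score(nut, shaft))            # HIGH: parent/child
print(hierarchy_score(ballscrew, tailstock))  # LOW: unrelated assemblies
```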
final_score = (0.35 * mechanical)
+ (0.25 * functional)
+ (0.25 * semantic)
+ (0.15 * hierarchy)
We also store:
explanations: ["same pitch", "same spindle assembly", ...]
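The weighted sum above translates directly into code. The weights are the ones stated in the formula; the function name and the choice to treat a missing component as 0.0 are assumptions.

```python
WEIGHTS = {
    "mechanical": 0.35,
    "functional": 0.25,
    "semantic": 0.25,
    "hierarchy": 0.15,
}


def final_score(components):
    """Combine per-dimension scores (each in [0, 1]) into one weighted score."""
    return sum(weight * components.get(name, 0.0) for name, weight in WEIGHTS.items())


components = {"mechanical": 1.0, "functional": 0.5, "semantic": 0.8, "hierarchy": 1.0}
# 0.35*1.0 + 0.25*0.5 + 0.25*0.8 + 0.15*1.0
print(round(final_score(components), 3))
```

Because the weights sum to 1.0, the final score stays in the same 0 to 1 range as its inputs.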
When checking a new, never-before-seen part, the system:
- Uses the LLM to extract structured specs from text
- Computes an embedding for semantic comparison
- Filters candidates by product assemblies
- Computes a compatibility score against all known parts
- Returns ranked results + explanations
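The filter, score, and rank steps can be sketched as follows. The scoring function is injected so the sketch stays independent of how compatibility is computed; `rank_candidates`, `toy_score`, the part ids, and the dictionary fields are hypothetical.

```python
def rank_candidates(new_part, known_parts, score_fn, allowed_assemblies=None, top_k=5):
    """Score a new part against known parts and return the best matches."""
    results = []
    for part in known_parts:
        # Scope to the selected product's assemblies when a filter is given.
        if allowed_assemblies is not None and part["assembly"] not in allowed_assemblies:
            continue
        score, explanations = score_fn(new_part, part)
        results.append(
            {"part_id": part["part_id"], "score": score, "explanations": explanations}
        )
    results.sort(key=lambda result: result["score"], reverse=True)
    return results[:top_k]


def toy_score(a, b):
    """Stand-in scorer: 1.0 when pitch matches, else 0.0."""
    if a["specs"].get("pitch") == b["specs"].get("pitch"):
        return 1.0, ["same pitch"]
    return 0.0, []


known = [
    {"part_id": "P1", "assembly": "Z Axis Assembly", "specs": {"pitch": 4}},
    {"part_id": "P2", "assembly": "Z Axis Assembly", "specs": {"pitch": 5}},
    {"part_id": "P3", "assembly": "Tailstock Assembly", "specs": {"pitch": 5}},
]
new_part = {"specs": {"pitch": 5}}
print(rank_candidates(new_part, known, toy_score, allowed_assemblies={"Z Axis Assembly"}))
```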
This is a digital twin–aware engineering recommender system.
To our knowledge, no mainstream PLM does this today.
API routes include:
- /api/query → RAG reasoning
- /api/compat/product/{name} → existing compatibility
- /api/compat/new-part → new part scoring
- /api/upload/doc → store BOMs/PDFs/images
- /api/upload/yaml → ingest new products
- /api/stt → Groq Whisper speech-to-text
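Product names such as "Industrial Lathe Machine" contain spaces, so the `{name}` path segment needs URL encoding when the compatibility route is called from a client. A small sketch, assuming the backend runs at the local address used later in this README; the helper name is hypothetical and no request is sent.

```python
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:8000"  # local backend address used in the setup steps


def compat_product_url(product_name, base_url=BASE_URL):
    """Build the URL for the existing-compatibility route of one product."""
    # safe="" also encodes "/" so the name stays a single path segment.
    return f"{base_url}/api/compat/product/{quote(product_name, safe='')}"


print(compat_product_url("Industrial Lathe Machine"))
```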
Simple, clean, industry-ready.
Key features:
- Dark text + white background “ERP style” UI
- RAG Query Mode
- Existing Compatibility Mode
- New Part Compatibility Mode
- Upload Documents
- YAML ingestion
- Download Markdown report
Manufacturing data is usually siloed in:
- PLM
- MES
- ERP
- Excel BOMs
- Vendor PDFs
This system unifies them into a searchable digital thread.
Engineers frequently ask:
- “What part should I replace this with?”
- “Are these two components interchangeable?”
- “What does this assembly contain?”
No existing search engine can answer these without manual lookup.
This system can.
Most RAG systems are text-only. This system uses:
✔ Specs ✔ Assemblies ✔ Vector embeddings ✔ Hierarchical similarity ✔ Multi-factor compatibility
This yields far more accurate engineering answers.
To our knowledge, no major manufacturing platform (PTC Windchill, Siemens Teamcenter, Dassault 3DEXPERIENCE) currently offers:
- ML-driven compatibility
- Assembly-aware matching
- LLM-based explanation of engineering alternatives
This system does.
Adding new machines/products is as easy as uploading a YAML file.
This makes it straightforward to extend across:
- entire factories
- robotics systems
- CNC fleets
- automotive component trees
This project also provides an AR extension to view the exploded view of a machine, select the exact component, and review dependencies.
This extends its use-case into:
- internal dependency mapping
- predictive maintenance
- component vendor mapping
- intelligent product view
Follow these step-by-step instructions to get the full digital thread application running locally on your system.
Before you begin, ensure you have the following installed:
- Python 3.10+: For backend APIs and ingestion.
- Node.js (v18+) & npm: For the Vite/React frontend dashboard.
- Docker Desktop: To spin up the Neo4j Graph Database.
- Groq API Key: Optional, but highly recommended for Graph-RAG synthesis and speech-to-text features.
- Clone your repository and navigate to the project directory:
cd asset-intelligence-graph-rag
- Create your environment configuration file by copying the template or editing .env directly in your root directory:
# Neo4j Settings
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=adwyteneo
# Embedding Configuration (Uses 384-D gte-small by default)
EMBEDDING_MODEL=thenlper/gte-small
EMBEDDING_DIM=384
# Groq LLM Settings (Required for Chat synthesis & Speech-To-Text)
GROQ_API_KEY=your_groq_api_key_here
GROQ_CHAT_MODEL=llama-3.3-70b-versatile
(Note: The system has been upgraded to use the active llama-3.3-70b-versatile model instead of the decommissioned 3.1 model.)
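A minimal sketch of how the backend might read these variables, using only the standard library. The `Settings` class, `load_settings`, and the fallback defaults are assumptions; the project may load its configuration differently (for example through a dotenv library).

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    neo4j_uri: str
    neo4j_user: str
    neo4j_password: str
    embedding_model: str
    embedding_dim: int
    groq_api_key: Optional[str]
    groq_chat_model: str


def load_settings(env=None):
    """Read configuration from a mapping (defaults to the process environment)."""
    env = os.environ if env is None else env
    return Settings(
        neo4j_uri=env.get("NEO4J_URI", "bolt://localhost:7687"),
        neo4j_user=env.get("NEO4J_USER", "neo4j"),
        neo4j_password=env.get("NEO4J_PASSWORD", ""),
        embedding_model=env.get("EMBEDDING_MODEL", "thenlper/gte-small"),
        embedding_dim=int(env.get("EMBEDDING_DIM", "384")),
        # The Groq key is optional; without it, chat synthesis and STT are unavailable.
        groq_api_key=env.get("GROQ_API_KEY") or None,
        groq_chat_model=env.get("GROQ_CHAT_MODEL", "llama-3.3-70b-versatile"),
    )


settings = load_settings({"NEO4J_PASSWORD": "adwyteneo"})
print(settings.neo4j_uri, settings.embedding_dim, settings.groq_api_key)
```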
We use Docker to run Neo4j with built-in APOC (Awesome Procedures on Cypher) plugins:
- Start the container in detached mode:
docker-compose up -d
- Open your browser and navigate to the Neo4j Browser Dashboard at http://localhost:7474.
- Connect using:
  - Bolt URL: bolt://localhost:7687
  - Username: neo4j
  - Password: adwyteneo
- Copy the Cypher queries from the pre-configured schema.cypher file.
- Run them in the Neo4j Browser query editor to create the necessary uniqueness constraints and the full-text and vector indexes.
[!NOTE] The schema uses IS UNIQUE instead of the Enterprise-only IS NODE KEY constraint, so it is compatible with both Neo4j Community Edition and Enterprise Edition out of the box.
- In the root directory, initialize and activate your Python virtual environment:
# Windows PowerShell
python -m venv .venv
.venv\Scripts\Activate.ps1
- Install all required dependencies:
pip install -r requirements.txt
- Run the ingestion scripts to load the sample products, component specifications, and embeddings into the Neo4j database:
# A. Ingest Meta Quest 3 BOM
python scripts/ingest.py --file examples/parts.yaml
# B. Ingest 3D-Lathe BOM
python scripts/ingest.py --file examples/3d-lathe.yaml
# C. Ingest Industrial Lathe Machine BOM
python scripts/ingest.py --file examples/modulathe.yaml
- Precompute the component compatibility scores for each product:
python scripts/compat.py --product "Meta Quest 3"
python scripts/compat.py --product "3D-Lathe"
python scripts/compat.py --product "Industrial Lathe Machine"
- (Optional) Ingest additional documentation and generate version mappings for the Modulathe product:
python examples/modulathe/ingest_modulathe_docs.py
python examples/modulathe/compat_modulathe.py
From your project root (with .venv activated), launch the backend using uvicorn:
uvicorn main:app --host 127.0.0.1 --port 8000 --reload
The backend API is now running, with interactive documentation at http://127.0.0.1:8000/docs.
- Open a new terminal and navigate to the frontend folder:
cd frontend
- Install the web packages and start the Vite dev server:
npm install
npm run dev
- Open http://localhost:5173/ (or http://localhost:5174/ if 5173 is occupied) to explore the Asset Intelligence dashboard.
If you prefer to run a single-process python dashboard instead of the full React app, you can launch the Streamlit variant:
streamlit run app/streamlit_app.py
If you prefer to run the entire system (Neo4j Database + FastAPI Backend + React/Nginx Frontend) in containers on a single network, you can use Docker Compose.
In the root directory of your project, run:
docker compose up --build -d
The --build flag ensures that your custom backend/Dockerfile and frontend/Dockerfile are built into lightweight local images.
Once the services are running, they map to the following ports on your localhost:
- 🌐 React Frontend (Served by Nginx): http://localhost:5173/
- ⚙️ FastAPI Backend (API Docs & Swagger): http://localhost:8000/docs
- 📊 Neo4j Graph Dashboard: http://localhost:7474/
Because the backend runs inside a container, trigger the initial database ingestion and compatibility precomputation by running docker exec commands against the active container:
# A. Ingest Product BOMs
docker exec -it graph-rag-backend python scripts/ingest.py --file examples/parts.yaml
docker exec -it graph-rag-backend python scripts/ingest.py --file examples/3d-lathe.yaml
docker exec -it graph-rag-backend python scripts/ingest.py --file examples/modulathe.yaml
# B. Precompute Component Compatibility
docker exec -it graph-rag-backend python scripts/compat.py --product "Meta Quest 3"
docker exec -it graph-rag-backend python scripts/compat.py --product "3D-Lathe"
docker exec -it graph-rag-backend python scripts/compat.py --product "Industrial Lathe Machine"
# C. Ingest Supplementary Docs & Mappings
docker exec -it graph-rag-backend python examples/modulathe/ingest_modulathe_docs.py
docker exec -it graph-rag-backend python examples/modulathe/compat_modulathe.py
To stop all services and tear down the container stack cleanly, run:
docker compose down