| title | Knowledge Graph |
|---|---|
EdgeQuake's knowledge graph stores entities as nodes and relationships as edges, enabling traversal-based retrieval across documents.
A knowledge graph is a structured representation of knowledge using:
- Nodes: Entities (people, concepts, organizations)
- Edges: Relationships between entities
- Properties: Attributes on nodes and edges (descriptions, weights)
┌─────────────────────────────────────────────────────────────────┐
│                         KNOWLEDGE GRAPH                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│      ┌─────────┐                         ┌─────────┐            │
│      │  NODE   │────── EDGE ────────────▶│  NODE   │            │
│      │ (Entity)│     (Relationship)      │ (Entity)│            │
│      └─────────┘                         └─────────┘            │
│           │                                   │                 │
│           │                                   │                 │
│           v                                   v                 │
│     ┌───────────┐                       ┌───────────┐           │
│     │ Properties│                       │ Properties│           │
│     │ - name    │                       │ - name    │           │
│     │ - type    │                       │ - type    │           │
│     │ - desc    │                       │ - desc    │           │
│     │ - embed   │                       │ - embed   │           │
│     └───────────┘                       └───────────┘           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Each entity extracted from documents becomes a node:
// Conceptual structure (from edgequake-storage)
struct Entity {
    name: String,               // "SARAH_CHEN"
    entity_type: String,        // "PERSON"
    description: String,        // "Lead researcher at Quantum Lab..."
    embedding: Vec<f32>,        // [0.1, 0.2, ...] for vector search
    source_chunks: Vec<String>, // Chunk IDs for citations
}

Relationships connect entities with typed edges:
struct Relationship {
    source: String,        // "SARAH_CHEN"
    target: String,        // "QUANTUM_LAB"
    relation_type: String, // "WORKS_AT"
    description: String,   // "Sarah works as lead researcher..."
    weight: f32,           // 0.8 (strength/confidence)
    keywords: Vec<String>, // ["employment", "research"]
}

EdgeQuake uses a hybrid storage architecture:
┌─────────────────────────────────────────────────────────────────┐
│                      STORAGE ARCHITECTURE                       │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌───────────────────────────────────────────────────────────┐  │
│  │                    PostgreSQL Database                    │  │
│  │ ┌───────────────────────────────────────────────────────┐ │  │
│  │ │                      Apache AGE                       │ │  │
│  │ │  • Graph storage (Cypher queries)                     │ │  │
│  │ │  • Entity nodes with properties                       │ │  │
│  │ │  • Relationship edges with properties                 │ │  │
│  │ └───────────────────────────────────────────────────────┘ │  │
│  │ ┌───────────────────────────────────────────────────────┐ │  │
│  │ │                       pgvector                        │ │  │
│  │ │  • Vector embeddings (1536 dims)                      │ │  │
│  │ │  • Similarity search (cosine, L2)                     │ │  │
│  │ │  • HNSW index for fast retrieval                      │ │  │
│  │ └───────────────────────────────────────────────────────┘ │  │
│  │ ┌───────────────────────────────────────────────────────┐ │  │
│  │ │                    Standard Tables                    │ │  │
│  │ │  • Documents metadata                                 │ │  │
│  │ │  • Chunks with text content                           │ │  │
│  │ │  • Multi-tenant isolation                             │ │  │
│  │ └───────────────────────────────────────────────────────┘ │  │
│  └───────────────────────────────────────────────────────────┘  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
Note: In-memory storage was removed in v0.4.0. PostgreSQL is required for all deployments, including development. See the installation guide for setup instructions.
EdgeQuake's retrieval combines three storage types, each with its own query method:
| Storage Type | Purpose | Query Method |
|---|---|---|
| Graph (AGE) | Relationships | Cypher traversal |
| Vector (pgvector) | Semantic similarity | Cosine similarity |
| Relational (SQL) | Metadata, filtering | SQL WHERE clauses |
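
To make the "Cosine similarity" query method in the table concrete, here is a minimal Rust sketch of the measure itself. It assumes plain `f32` slices like the `embedding` field shown earlier; the function name `cosine_similarity` is hypothetical and not part of EdgeQuake's API. In an actual deployment the comparison runs inside pgvector, not in application code.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns None for empty input, mismatched lengths, or a zero-magnitude vector.
/// Hypothetical helper for illustration only.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    // Dot product of the two vectors.
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Euclidean norms of each vector.
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

Vectors pointing in the same direction score close to 1.0 and orthogonal vectors score 0.0. Note that pgvector's `<=>` operator returns cosine distance, which is 1 minus this value, so smaller results there mean closer matches.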
User: "How does Sarah's research relate to Bob's work?"

1. Vector Search: Find entities matching "Sarah" and "Bob"
   → SARAH_CHEN (score: 0.92)
   → BOB_SMITH (score: 0.89)
2. Graph Traversal: Find paths between them
   → SARAH_CHEN --[works_at]--> QUANTUM_LAB
   → BOB_SMITH --[works_at]--> QUANTUM_LAB
   → SARAH_CHEN --[collaborates_with]--> BOB_SMITH
3. Context Fusion: Combine vector + graph results
   → "Sarah and Bob both work at Quantum Lab and collaborate..."
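
The three steps above can be sketched in memory. This is an illustration under stated assumptions, not EdgeQuake's implementation: the real system runs step 1 in pgvector and step 2 as a Cypher traversal in Apache AGE. The names `Edge`, `best_match`, `shortest_path`, and `fuse_context` are hypothetical, scores are assumed to arrive precomputed from a vector search, and a breadth-first search stands in for the graph query.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Directed, typed edge as (source, relation, target).
type Edge<'a> = (&'a str, &'a str, &'a str);

/// Step 1 (stand-in): keep the highest-scoring candidate that clears `min_score`.
fn best_match<'a>(candidates: &[(&'a str, f32)], min_score: f32) -> Option<&'a str> {
    candidates
        .iter()
        .filter(|(_, score)| *score >= min_score)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| *name)
}

/// Step 2: breadth-first search for the shortest chain of edges linking
/// `start` to `goal`, ignoring edge direction, using at most `max_hops` edges.
fn shortest_path<'a>(
    edges: &[Edge<'a>],
    start: &str,
    goal: &str,
    max_hops: usize,
) -> Option<Vec<Edge<'a>>> {
    // Undirected adjacency: node name -> every edge touching that node.
    let mut adjacent: HashMap<&str, Vec<Edge<'a>>> = HashMap::new();
    for &edge in edges {
        adjacent.entry(edge.0).or_default().push(edge);
        adjacent.entry(edge.2).or_default().push(edge);
    }

    let mut visited: HashSet<&str> = HashSet::from([start]);
    let mut queue: VecDeque<(&str, Vec<Edge<'a>>)> = VecDeque::from([(start, Vec::new())]);

    while let Some((node, path)) = queue.pop_front() {
        if node == goal {
            return Some(path);
        }
        if path.len() == max_hops {
            continue; // hop budget spent on this branch
        }
        for &edge in adjacent.get(node).into_iter().flatten() {
            // Step to whichever endpoint is not the current node.
            let next = if edge.0 == node { edge.2 } else { edge.0 };
            if visited.insert(next) {
                let mut extended = path.clone();
                extended.push(edge);
                queue.push_back((next, extended));
            }
        }
    }
    None
}

/// Step 3: render the path as plain-text context for a prompt.
fn fuse_context(path: &[Edge<'_>]) -> String {
    path.iter()
        .map(|(source, relation, target)| format!("{source} --[{relation}]--> {target}"))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Given the three edges from the walkthrough, a search from `SARAH_CHEN` to `BOB_SMITH` returns the single direct collaboration edge, because breadth-first search reaches it before the two-hop route through `QUANTUM_LAB`. Bounding the search with `max_hops` keeps traversal cost predictable on dense graphs.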
EdgeQuake supports data isolation across tenants:
┌─────────────────────────────────────────────────────────────────┐
│                     MULTI-TENANT ISOLATION                      │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│   Tenant: "acme-corp"        Tenant: "globex"                   │
│   ┌─────────────────┐        ┌─────────────────┐                │
│   │ Workspace A     │        │ Workspace X     │                │
│   │ ├─ Documents    │        │ ├─ Documents    │                │
│   │ ├─ Entities     │        │ ├─ Entities     │                │
│   │ └─ Graph        │        │ └─ Graph        │                │
│   ├─────────────────┤        └─────────────────┘                │
│   │ Workspace B     │                                           │
│   │ ├─ Documents    │        Each tenant has isolated           │
│   │ └─ ...          │        data with no cross-access          │
│   └─────────────────┘                                           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
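
One way to picture the isolation in the diagram is a store partitioned by a (tenant, workspace) key, so a lookup can only ever see rows filed under its own scope. The sketch below is a toy in-memory model under that assumption; `Scope` and `ScopedStore` are hypothetical names, and how EdgeQuake enforces isolation inside PostgreSQL is not shown here.

```rust
use std::collections::HashMap;

/// Hypothetical scope key: every stored entity is filed under one tenant and workspace.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Scope {
    tenant_id: String,
    workspace_id: String,
}

impl Scope {
    fn new(tenant_id: &str, workspace_id: &str) -> Self {
        Self {
            tenant_id: tenant_id.to_string(),
            workspace_id: workspace_id.to_string(),
        }
    }
}

/// Toy entity store partitioned by scope (illustration only).
#[derive(Default)]
struct ScopedStore {
    entities: HashMap<Scope, Vec<String>>,
}

impl ScopedStore {
    /// Files an entity name under the given scope.
    fn insert(&mut self, scope: &Scope, entity_name: &str) {
        self.entities
            .entry(scope.clone())
            .or_default()
            .push(entity_name.to_string());
    }

    /// Returns only the entities filed under exactly this tenant and workspace.
    fn entities_in(&self, scope: &Scope) -> &[String] {
        self.entities.get(scope).map(Vec::as_slice).unwrap_or(&[])
    }
}
```

Because the scope is part of the lookup key and not a filter applied afterwards, a query for one tenant has no code path that touches another tenant's partition, and a second workspace under the same tenant starts out empty.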
Queries are automatically scoped:
QueryRequest::new("What is our strategy?")
    .with_tenant_id("acme-corp")
    .with_workspace_id("strategy-team")

When documents are ingested, entities are upserted (insert or update):
-- Apache AGE Cypher
MERGE (e:Entity {name: 'SARAH_CHEN'})
SET e.type = 'PERSON',
    e.description = 'Lead researcher...',
    e.updated_at = now()

-- Upsert a relationship between two existing entities
MATCH (source:Entity {name: 'SARAH_CHEN'})
MATCH (target:Entity {name: 'QUANTUM_LAB'})
MERGE (source)-[r:WORKS_AT]->(target)
SET r.description = 'Sarah works at Quantum Lab',
    r.weight = 0.8

-- Find neighbors within 2 hops of an entity
MATCH (start:Entity {name: 'SARAH_CHEN'})-[*1..2]-(neighbor)
RETURN neighbor.name, neighbor.description
LIMIT 10

- How entities are extracted: Entity Extraction
- How queries combine vector + graph: Hybrid Retrieval
- Underlying algorithm: LightRAG Algorithm