raystack
diff --git a/README.md b/README.md
Lines changed: 17 additions & 14 deletions
diff --git a/docs/docs/concepts/architecture.md b/docs/docs/concepts/architecture.md
Lines changed: 31 additions & 10 deletions
diff --git a/docs/docs/concepts/internals.md b/docs/docs/concepts/internals.md
Lines changed: 43 additions & 131 deletions
diff --git a/docs/docs/concepts/overview.mdx b/docs/docs/concepts/overview.mdx
Lines changed: 2 additions & 2 deletions
@@ -6,20 +6,21 @@
 [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg?logo=apache)](LICENSE)
 [![Version](https://img.shields.io/github/v/release/raystack/compass?logo=semantic-release)](Version)
 
-Compass is a search and discovery engine built for querying application deployments, datasets and meta resources. It can also optionally track data flow relationships between these resources and allow the user to view a representation of the data flow graph.
+Compass is a context engine that builds a knowledge graph of your organization's metadata, capturing entities, relationships, and lineage across systems and time, making it discoverable and queryable for both humans and AI agents.
+
+Critical organizational knowledge lives scattered across dozens of systems: services, datasets, applications, teams, configurations, decisions, and the relationships between them. Compass resolves observations from these sources into unified entities, constructs a temporal graph of their relationships, and indexes everything for both keyword and semantic search. The result is a context graph that stitches together what exists, who owns it, how it connects, and what changed over time, so both humans and AI agents can discover, traverse, and reason over the full picture.
 
 <p align="center"><img src="./docs/static/assets/overview.svg" /></p>
 
 ## Key Features
 
-Discover why users choose Compass as their main data discovery and lineage service
-
-- **Full text search** Faster and better search results powered by ElasticSearch full text search capability.
-- **Search Tuning** Narrow down your search results by adding filters, getting your crisp results.
-- **Data Lineage** Understand the relationship between metadata with data lineage interface.
-- **Scale:** Compass scales in an instant, both vertically and horizontally for high performance.
-- **Extensibility:** Add your own metadata types and resources to support wide variety of metadata.
-- **Runtime:** Compass can run inside VMs or containers in a fully managed runtime environment like kubernetes.
+- **Entity Resolution:** Resolve and deduplicate metadata observations from multiple sources into unified entities with stable identity.
+- **Knowledge Graph:** Store typed, directed relationships between entities including lineage, ownership, documentation, and custom edge types.
+- **Hybrid Search:** Combine keyword precision with semantic similarity using Postgres-native full-text search and pgvector embeddings.
+- **Graph Traversal:** Multi-hop traversal queries across the entity graph for impact analysis, dependency tracking, and path discovery.
+- **Context Composition:** Assemble schema, lineage, ownership, and quality signals into context documents ready for LLM consumption.
+- **AI Serving:** Expose the full graph as an MCP server so AI agents can discover, traverse, and reason over organizational knowledge.
+- **Extensibility:** Open type system for entities and relationships to support any kind of metadata across your infrastructure.
 
 ## Documentation
 
@@ -95,13 +96,11 @@ alias compass="docker run -e HOME=/tmp -v $HOME/.config/raystack:/tmp/.config/ra
 
 ## Usage
 
-Compass is purely API-driven. It is very easy to get started with Compass. It provides CLI and HTTP APIs for simpler developer experience.
+Compass provides a CLI, a Connect RPC API (HTTP + gRPC), and an MCP server for AI agents.
 
 #### CLI
 
-Compass CLI is fully featured and simple to use, even for those who have very limited experience working from the command line. Run `compass --help` to see list of all available commands and instructions to use.
-
-List of commands
+Compass CLI is fully featured and simple to use. Run `compass --help` to see all available commands.
 
 ```
 compass --help
@@ -115,7 +114,11 @@ compass reference
 
 #### API
 
-Compass provides a fully-featured HTTP API to interact with Compass server. The API is built with [Connect RPC](https://connectrpc.com/) and supports both Connect and gRPC protocols. Please refer to [proton](https://github.com/raystack/proton/tree/main/raystack/compass/v1beta1) for API definitions.
+Compass provides a Connect RPC API that supports both Connect (HTTP) and gRPC protocols. Please refer to [proton](https://github.com/raystack/proton/tree/main/raystack/compass/v1beta1) for API definitions.
+
+#### MCP Server
+
+Compass exposes an MCP server at `/mcp` for AI agent integration. MCP-compatible systems can connect and use tools like `search_entities`, `get_context`, and `impact`.
 
 ## Contribute
 
 
@@ -1,26 +1,47 @@
 # Architecture
 
-Compass' architecture is pretty simple. It has a client-server architecture backed by PostgreSQL as a main storage and Elasticsearch as a secondary storage and provides HTTP & gRPC interface to interact with.
+Compass is built as a context engine with a layered architecture. Raw metadata observations flow in, are resolved into unified entities, stored in a temporal knowledge graph, indexed for hybrid search, and served to both human interfaces and AI agents.
 
 ![Compass Architecture](/assets/architecture.png)
 
 ## System Design
 
 ### Components
 
-#### gRPC Server
+#### Entity Resolver
 
-- gRPC server is the main interface to interact with Compass.
-- The protobuf file to define the interface is centralized in [raystack/proton](https://github.com/raystack/proton/tree/main/raystack/compass/v1beta1)
+Incoming metadata observations from collection systems like Meteor are resolved against the existing graph. The resolver deduplicates, merges facets from multiple sources, and maintains stable entity identity. The same logical entity appearing across different systems is recognized and unified.
 
-#### gRPC-gateway Server
+#### Graph Store (PostgreSQL)
 
-- gRPC-gateway server transcodes HTTP call to gRPC call and allows client to interact with Compass using RESTful HTTP request.
+PostgreSQL is the primary store for the knowledge graph. Entities, typed directed edges, and temporal metadata (`valid_from` / `valid_to`) are stored relationally. Recursive CTE queries power multi-hop graph traversal for impact analysis and dependency tracking. Row Level Security enforces multi-tenant isolation at the database level.
 
-#### PostgreSQL
+#### Vector Index (pgvector)
 
-- Compass uses PostgreSQL as it is main storage for storing all of its metadata.
+Semantic search is powered by pgvector embeddings stored alongside entities. When an entity is created or updated, Compass generates vector embeddings of its semantic content. This enables similarity-based discovery where keyword search falls short.
 
-#### Elasticsearch
+#### Search Engine
 
-- Compass uses Elasticsearch as it is secondary storage to power search of metadata.
+Compass supports hybrid search combining multiple strategies:
+
+- **Keyword search:** Postgres tsvector full-text search with weighted fields (URN and name weighted highest, descriptions next, source metadata lowest).
+- **Fuzzy matching:** pg_trgm trigram indexes for typo-tolerant and partial matching.
+- **Semantic search:** pgvector cosine similarity for conceptual matching.
+- **Hybrid ranking:** Reciprocal Rank Fusion combines results from keyword and semantic search into a single ranked list.
+
+All search is Postgres-native. There are no external search engine dependencies.
+
+#### Query Engine
+
+The query engine orchestrates graph traversal, hybrid search, and context composition. It handles:
+
+- Multi-hop lineage and dependency traversal
+- Impact analysis (what breaks if this changes)
+- Context assembly (composing schema, lineage, ownership, and quality signals into a single response)
+
+#### Serving Layer
+
+Compass exposes its capabilities through multiple interfaces:
+
+- **Connect RPC:** The primary API interface, supporting both Connect (HTTP) and gRPC protocols. API definitions are maintained in [raystack/proton](https://github.com/raystack/proton/tree/main/raystack/compass/v1beta1).
+- **MCP Server:** Model Context Protocol interface for AI agents. Any MCP-compatible system can connect and use tools like search, lineage traversal, and context assembly.
@@ -1,152 +1,64 @@
 # Internals
 
-This document details information about how Compass interfaces with elasticsearch. It is meant to give an overview of how some concepts work internally, to help streamline understanding of how things work under the hood.
-
-## Index Setup
-
-There is a migration command in compass to setup all storages. The indices are configured with a camel case tokenizer, to support proper lexing of some resources that use camel case in their nomenclature \(protobuf names for instance\). Given below is a sample of the index settings that are used:
-
-```javascript
-// PUT http://${ES_HOST}/{index}
-{
-        "mappings": {},         // used for boost
-        "aliases": {            // all indices are aliased to the "universe" index
-            "universe": {}
-        },
-        "settings": {           // configuration for handling camel case text
-            "analysis": {
-                "analyzer": {
-                    "default": {
-                        "type": "pattern",
-                        "pattern": "([^\\p{L}\\d]+)|(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)|(?<=[\\p{L}&&[^\\p{Lu}]])(?=\\p{Lu})|(?<=\\p{Lu})(?=\\p{Lu}[\\p{L}&&[^\\p{Lu}]])"
-                    }
-                }
-            }
-        }
-    }
-```
+This document details how Compass works under the hood. It covers the search architecture, storage internals, and multi-tenancy model.
 
-One shared index is created for all services and tenants but each request(read/write) is routed to a unique shard for each tenant. Compass categorize tenants into two tires, `shared` and `dedicated`. For shared tenants, all the requests will be routed by namespace id over a single shard in an index. For dedicated tenants, each tenant will have its own index. Note, a single index will have N number of `types` same as the number of `Services` supported in Compass. This design will ensure, all the document insert/query requests are only confined to a single shard(in case of shared) or a single index(in case of dedicated).
-Details on why we did this is available at [issue #208](https://github.com/raystack/compass/issues/208).
+## Search Architecture
 
-## Postgres
+All search in Compass is Postgres-native, combining keyword, fuzzy, and semantic strategies with no external search engine dependencies.
 
-To enforce multi-tenant restrictions at the database level, [Row Level Security](https://www.postgresql.org/docs/current/ddl-rowsecurity.html) is used. RLS requires Postgres users used for application database connection not to be a table owner or a superuser else all RLS are bypassed by default. That means a Postgres user that is migrating the application and a user that is used to serve the app should both be different.
+### Postgres-Native Search
 
-To create a postgres user
+#### Full-Text Search (tsvector)
 
-```sql
-CREATE USER "compass_user" WITH PASSWORD 'compass';
-GRANT CONNECT ON DATABASE "compass" TO "compass_user";
-GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "compass_user";
-GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "compass_user";
-GRANT ALL ON ALL FUNCTIONS IN SCHEMA public TO "compass_user";
+Entities are indexed using PostgreSQL's built-in full-text search. A `search_vector` generated column is maintained on the entities table with weighted fields:
 
-ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES
-ON TABLES TO "compass_user";
-ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT USAGE ON SEQUENCES TO "compass_user";
-ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT EXECUTE ON FUNCTIONS TO "compass_user";
-```
+- **Weight A:** URN and name (highest relevance)
+- **Weight B:** Description
+- **Weight C:** Source and service metadata
 
-A middleware looks for `x-namespace` header to extract tenant id if not found falls back to `default` namespace.
-Same could be passed in a `jwt token` of Authentication Bearer with `namespace_id` as a claim.
+GIN indexes on the search vector enable fast full-text queries.
 
-## Search
+#### Fuzzy Matching (pg_trgm)
 
-We use elasticsearch's `multi_match` search for running our queries. Depending on whether there are additional filter's specified during search, we augment the query with a custom script query that filter's the result set.
+Trigram indexes powered by the `pg_trgm` extension support typo-tolerant and partial matching. This handles cases where users misspell entity names or search with partial terms.
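+
+To make the idea concrete, here is a minimal Python sketch of trigram similarity. It is not Compass code: it only approximates how `pg_trgm` compares strings (lowercase, pad each word with two leading spaces and one trailing space, then compare the sets of three-character substrings). The function names and the ASCII-only word split are assumptions made for illustration.
+
+```python
+import re
+
+
+def trigrams(text):
+    """Return the set of trigrams in text, padding each word pg_trgm-style."""
+    grams = set()
+    # Assumption: words are runs of ASCII letters and digits; pg_trgm's
+    # notion of a word character depends on the database locale.
+    for word in re.findall(r"[a-z0-9]+", text.lower()):
+        padded = "  " + word + " "  # two spaces before, one after
+        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
+    return grams
+
+
+def similarity(a, b):
+    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
+    ta, tb = trigrams(a), trigrams(b)
+    if not ta or not tb:
+        return 0.0
+    return len(ta & tb) / len(ta | tb)
+```
+
+With this sketch, a transposition typo such as `ordres` still shares three of eleven distinct trigrams with `orders`, so the two strings score above zero even though they are not equal.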
 
-The script filter is designed to match a document if:
+#### Semantic Search (pgvector)
 
-- the document contains the filter key and it's value matches the filter value OR
-- the document doesn't contain the filter key at all
+Vector embeddings are stored in a chunks table and indexed for cosine similarity search using pgvector. When an entity is created or updated, its semantic content (description, properties, labels) is embedded and stored. Semantic search finds conceptually related entities even when the exact terms don't overlap.
 
-To demonstrate, the following API call:
+#### Hybrid Ranking
 
-```text
-$ curl http://localhost:8080/v1beta1/search?text=log&filter[landscape]=id
-```
+Results from keyword and semantic search are combined using Reciprocal Rank Fusion (RRF). This produces a single ranked list that balances keyword precision with semantic recall.
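+
+The fusion step can be sketched in a few lines of Python. This is an illustration of the general RRF formula, not the Compass implementation: the function name, the tie-breaking rule, and the constant `k = 60` (a value commonly used with RRF) are assumptions.
+
+```python
+from collections import defaultdict
+
+
+def reciprocal_rank_fusion(rankings, k=60):
+    """Fuse several ranked lists of ids into one list, best first.
+
+    Each item scores 1 / (k + rank) per list it appears in (rank starts
+    at 1), and the scores are summed across lists.
+    """
+    scores = defaultdict(float)
+    for ranking in rankings:
+        for rank, item in enumerate(ranking, start=1):
+            scores[item] += 1.0 / (k + rank)
+    # Assumption: ties are broken by id so the output is deterministic.
+    return sorted(scores, key=lambda item: (-scores[item], item))
+
+
+# Hypothetical result lists from the keyword and semantic searches.
+keyword_hits = ["a", "b", "c"]
+semantic_hits = ["c", "a", "d"]
+fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
+```
+
+Items that appear in both lists (`a` and `c` here) accumulate two contributions and therefore rise above items found by only one strategy.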
 
-is internally translated to the following elasticsearch query
-
-```javascript
-{
-    "query": {
-        "bool": {
-            "must": {
-                "multi_match": {
-                    "query": "log"
-                }
-            },
-            "filter": [{
-                "script": {
-                    "script": {
-                        "source": "doc.containsKey(\"landscape.keyword\") == false || doc[\"landscape.keyword\"].value == \"id\""
-                    }
-                }
-            }]
-        }
-    }
-}
-```
+## Entity Storage
 
-Compass also supports filter with fuzzy match with `query` query params. The script query is designed to match a document if:
+### Temporal Model
 
-- the document contains the filter key and it's value is fuzzily matches the `query` value
+Entities in Compass are temporal. Each entity version carries `valid_from` and `valid_to` timestamps, allowing Compass to track how entities and their properties evolve over time. This supports queries like "what did this entity look like last week" and "what changed in the last 24 hours."
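+
+A point-in-time lookup over such versions might look like the Python sketch below. The `EntityVersion` class, its fields other than `valid_from` and `valid_to`, and the half-open interval convention (`valid_from` inclusive, `valid_to` exclusive, `None` meaning "still current") are assumptions for illustration, not the actual Compass schema.
+
+```python
+from dataclasses import dataclass
+from datetime import datetime
+from typing import Optional
+
+
+@dataclass(frozen=True)
+class EntityVersion:
+    # Hypothetical shape of one entity version.
+    urn: str
+    description: str
+    valid_from: datetime
+    valid_to: Optional[datetime]  # None: this version is still current
+
+
+def version_as_of(versions, at):
+    """Return the version whose validity interval contains `at`, if any."""
+    for version in versions:
+        started = version.valid_from <= at
+        not_ended = version.valid_to is None or at < version.valid_to
+        if started and not_ended:
+            return version
+    return None
+
+
+history = [
+    EntityVersion("urn:example:orders", "old description",
+                  datetime(2024, 1, 1), datetime(2024, 3, 1)),
+    EntityVersion("urn:example:orders", "new description",
+                  datetime(2024, 3, 1), None),
+]
+last_february = version_as_of(history, datetime(2024, 2, 15))
+```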
 
-```text
-$ curl http://localhost:8080/v1beta1/search?text=log&filter[landscape]=id
-```
+### Graph Edges
+
+Relationships between entities are stored as typed, directed edges. Each edge has a type (lineage, ownership, documentation, etc.) and optional properties. Edges are also temporal, capturing when relationships were established and when they ended.
 
-is internally translated to the following elasticsearch query
-
-```javascript
-{
-   "query":{
-      "bool":{
-         "filter":{
-            "match":{
-               "description":{
-                  "fuzziness":"AUTO",
-                  "query":"test"
-               }
-            }
-         },
-         "should":{
-            "bool":{
-               "should":[
-                  {
-                     "multi_match":{
-                        "fields":[
-                           "urn^10",
-                           "name^5"
-                        ],
-                        "query":"log"
-                     }
-                  },
-                  {
-                     "multi_match":{
-                        "fields":[
-                           "urn^10",
-                           "name^5"
-                        ],
-                        "fuzziness":"AUTO",
-                        "query":"log"
-                     }
-                  },
-                  {
-                     "multi_match":{
-                        "fields":[
-
-                        ],
-                        "fuzziness":"AUTO",
-                        "query":"log"
-                     }
-                  }
-               ]
-            }
-         }
-      }
-   },
-   "min_score":0.01
-}
+Graph traversal uses recursive Common Table Expressions (CTEs) in PostgreSQL, enabling multi-hop queries without external graph database dependencies.
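+
+The traversal logic that a recursive CTE expresses in SQL can be illustrated in memory with a breadth-first walk. The Python sketch below is not how Compass runs the query (that happens inside PostgreSQL); the function name, the `(source, target)` edge tuples, and the `max_depth` parameter are hypothetical.
+
+```python
+from collections import deque
+
+
+def downstream(edges, start, max_depth=3):
+    """Return {entity: hop count} for entities reachable from `start`.
+
+    `edges` is an iterable of (source, target) pairs for directed edges.
+    Traversal stops after `max_depth` hops and skips entities already
+    seen, so cycles in the graph do not cause endless expansion.
+    """
+    adjacency = {}
+    for source, target in edges:
+        adjacency.setdefault(source, []).append(target)
+
+    depths = {}
+    visited = {start}
+    queue = deque([(start, 0)])
+    while queue:
+        node, depth = queue.popleft()
+        if depth == max_depth:
+            continue  # do not expand beyond the hop limit
+        for neighbor in adjacency.get(node, []):
+            if neighbor not in visited:
+                visited.add(neighbor)
+                depths[neighbor] = depth + 1
+                queue.append((neighbor, depth + 1))
+    return depths
+
+
+# Hypothetical lineage edges, including a cycle from c back to a.
+lineage = [("a", "b"), ("b", "c"), ("c", "d"), ("c", "a")]
+impacted = downstream(lineage, "a", max_depth=2)
+```
+
+Bounding the depth and tracking visited entities are the same safeguards a recursive query typically needs, since lineage graphs can contain cycles.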
+
+## PostgreSQL Multi-Tenancy
+
+To enforce multi-tenant restrictions at the database level, Compass uses [Row Level Security](https://www.postgresql.org/docs/current/ddl-rowsecurity.html) (RLS). RLS only takes effect when the Postgres user that the application connects with is neither a table owner nor a superuser; otherwise all RLS policies are bypassed by default. The Postgres user that runs migrations and the user that serves the app should therefore be different.
+
+To create a Postgres user:
+
+```sql
+CREATE USER "compass_user" WITH PASSWORD 'compass';
+GRANT CONNECT ON DATABASE "compass" TO "compass_user";
+GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO "compass_user";
+GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "compass_user";
+GRANT ALL ON ALL FUNCTIONS IN SCHEMA public TO "compass_user";
+
+ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT SELECT, INSERT, UPDATE, DELETE, REFERENCES
+ON TABLES TO "compass_user";
+ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT USAGE ON SEQUENCES TO "compass_user";
+ALTER DEFAULT PRIVILEGES IN SCHEMA "public" GRANT EXECUTE ON FUNCTIONS TO "compass_user";
 ```
+
+A middleware reads the `x-namespace` header to extract the tenant ID. If the header is not found, it falls back to the `default` namespace. The namespace can also be passed as a `namespace_id` claim in a JWT sent as an `Authorization: Bearer` token.
@@ -1,6 +1,6 @@
 # Overview
 
-Compass is an organizational context engine that builds a temporal entity graph of your systems and serves it to AI agents via MCP.
+Compass is a context engine that builds a knowledge graph of your organization's metadata, capturing entities, relationships, and lineage across systems and time, making it discoverable and queryable for both humans and AI agents.
 
 ## Core Concepts
 
@@ -21,7 +21,7 @@ Compass is an organizational context engine that builds a temporal entity graph
 
 ## Architecture
 
-All search is Postgres-native — no Elasticsearch dependency:
+All search is Postgres-native:
 
 | Mode | Engine | Purpose |
 |---|---|---|