A code intelligence engine that transforms repositories into queryable knowledge graphs
Parse once. Query forever. Know exactly what breaks before it does.
Licensed under AGPL-3.0 • Free for personal/internal use • Contact for commercial licensing
What is N3MO • Architecture • Installation • Usage • Benchmarks • Roadmap
N3MO addresses a fundamental challenge in software engineering: understanding large codebases. Unlike simple code search tools that rely on text matching, N3MO models code structure first, capturing symbols, their relationships, and their call chains.
Traditional grep/search: "Where does 'login' appear?"
N3MO: "What will break if I change the login function?"
Critical questions N3MO answers:
- What functions and classes exist in this repository?
- Where is this symbol being used, directly and transitively?
- What is the blast radius of changing this function?
- How do these components actually connect?
N3MO builds a symbol-centric knowledge graph stored in PostgreSQL:
graph TB
subgraph repo["Repository Analysis"]
A["Source Code"] -->|Tree-sitter| B["AST Parser"]
B --> C["Symbol Extractor"]
end
subgraph kg["Knowledge Graph"]
D[("PostgreSQL")]
E["Projects"]
F["Symbols"]
G["Relationships"]
D --- E
D --- F
D --- G
end
subgraph query["Query Engine"]
H["Dependency Graph"]
I["Call Graph"]
J["Impact Analysis"]
end
C --> D
D --> H
D --> I
D --> J
H --> K["Visualization"]
I --> K
J --> K
style repo fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
style kg fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
style query fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
style A fill:#e2e8f0,stroke:#4a5568,color:#1a202c
style B fill:#cbd5e0,stroke:#4a5568,color:#1a202c
style C fill:#cbd5e0,stroke:#4a5568,color:#1a202c
style D fill:#fc8181,stroke:#c53030,color:#1a202c,stroke-width:3px
style E fill:#a0aec0,stroke:#4a5568,color:#1a202c
style F fill:#a0aec0,stroke:#4a5568,color:#1a202c
style G fill:#a0aec0,stroke:#4a5568,color:#1a202c
style H fill:#90cdf4,stroke:#2c5282,color:#1a202c
style I fill:#90cdf4,stroke:#2c5282,color:#1a202c
style J fill:#90cdf4,stroke:#2c5282,color:#1a202c
style K fill:#9ae6b4,stroke:#2f855a,color:#1a202c
sequenceDiagram
participant User
participant CLI
participant Docker
participant Parser
participant DB as PostgreSQL
participant Viz as Visualizer
User->>CLI: n3mo index
CLI->>Docker: Start containers
Docker->>Parser: Mount repository
Parser->>Parser: Walk file tree
Parser->>Parser: Parse AST (Tree-sitter)
Parser->>DB: Store symbols & relations
DB-->>Parser: Confirm storage
User->>CLI: n3mo impact "function_name"
CLI->>DB: Query call graph
DB->>DB: Recursive CTE traversal
DB-->>Viz: Return dependency tree
Viz-->>User: Display graph (HTML/JS)
erDiagram
PROJECT ||--o{ SYMBOL : contains
SYMBOL ||--o{ SYMBOL : "calls/inherits"
SYMBOL {
uuid id PK
string kind "function|class|variable"
string name
string file_path
int line_number
uuid parent_id FK
uuid project_id FK
}
PROJECT {
uuid id PK
string name
string root_path
timestamp indexed_at
}
- ✅ AST-based parsing: Tree-sitter integration for error-tolerant Python analysis
- ✅ Symbol extraction: functions, classes, methods with full file + line context
- ✅ Hierarchical modeling: Module → Class → Method parent-child relationships
- ✅ Call graph construction: who calls whom, captured at ingestion time
- ✅ Blast radius analysis: recursive CTE traversal to arbitrary depth
- ✅ Idempotent ingestion: re-indexing updates existing data without duplication
- ✅ Interactive visualizer: vis.js graph with click-to-inspect nodes and sidebar
- ✅ Docker-first: single-command infrastructure setup
- 🚧 Connection pooling: eliminate per-symbol DB round trips
- 🚧 Batch inserts: 1 transaction per file, not per row
- 🚧 Incremental re-index: SHA-256 file hashing, skip unchanged files
- 🚧 Multiprocessing: parallel AST parsing via ProcessPoolExecutor
- 🚧 Scope-aware call resolution: use imports table, eliminate false positives
- 🚧 Test suite: pytest with real Postgres integration tests
- 🚧 GitHub Actions CI: lint, typecheck, test on every PR
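As a rough illustration of the symbol extraction and hierarchical modeling features listed above, here is a minimal sketch. It uses Python's standard-library `ast` module as a stand-in for Tree-sitter (so it is not error-tolerant the way Tree-sitter is), and the names `Symbol` and `extract_symbols` are hypothetical, not N3MO's actual API.

```python
import ast
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Symbol:
    kind: str                # "class", "function", or "method"
    name: str
    file_path: str
    line_number: int
    parent: Optional[str]    # enclosing symbol name, None at module level


def extract_symbols(source: str, file_path: str) -> List[Symbol]:
    """Walk the AST and record each class/function with its enclosing symbol."""
    symbols: List[Symbol] = []

    def visit(node: ast.AST, parent: Optional[Symbol]) -> None:
        for child in ast.iter_child_nodes(node):
            parent_name = parent.name if parent else None
            if isinstance(child, ast.ClassDef):
                sym = Symbol("class", child.name, file_path, child.lineno, parent_name)
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # A function defined directly inside a class is a method.
                kind = "method" if parent and parent.kind == "class" else "function"
                sym = Symbol(kind, child.name, file_path, child.lineno, parent_name)
            else:
                # Not a symbol itself; keep descending with the same parent.
                visit(child, parent)
                continue
            symbols.append(sym)
            visit(child, sym)

    visit(ast.parse(source), None)
    return symbols
```

Each `Symbol` carries its file path, line number, and parent name, which is roughly the information the SYMBOL entity in the schema diagram stores via `file_path`, `line_number`, and `parent_id`.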
# 1. Clone the repository
git clone https://github.com/RajX-dev/N3MO.git
cd N3MO
# 2. Configure environment
cp .env.example .env
# 3. Start infrastructure
docker-compose up -d
# 4. Install the CLI
pip install -e .
# 5. Verify
n3mo --help
# Navigate to any Python repository
cd /path/to/your/project
# Run the indexer
n3mo index
What gets indexed:
- ✅ Python files (.py)
- ❌ Virtual environments (venv/, .venv/)
- ❌ Dependencies (node_modules/, site-packages/)
- ❌ Build artifacts (.git/, __pycache__/, dist/)
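The inclusion and exclusion rules above could be implemented along these lines. This is a sketch only: `EXCLUDED_DIRS` and `iter_python_files` are hypothetical names, and the real indexer may select files differently.

```python
import os
from typing import Iterator

# Hypothetical exclusion set mirroring the directories listed above.
EXCLUDED_DIRS = {
    "venv", ".venv", "node_modules", "site-packages",
    ".git", "__pycache__", "dist",
}


def iter_python_files(root: str) -> Iterator[str]:
    """Yield paths of .py files under root, skipping excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = sorted(d for d in dirnames if d not in EXCLUDED_DIRS)
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)
```

Pruning `dirnames` in place avoids walking large trees such as `node_modules/` at all, rather than filtering their files afterwards.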
# Find everything affected by changing a function
n3mo impact "authenticate_user"
# Open an interactive visual graph in your browser
n3mo impact "authenticate_user" --graph
Example terminal output:
IMPACT ANALYSIS
────────────────────────────────────────────────────────────────
Target: authenticate_user
────────────────────────────────────────────────────────────────
Direct Callers (3 symbols)
  ▸ login_endpoint      api/auth.py:12
  ▸ refresh_token       api/token.py:23
  ▸ validate_session    middleware/auth.py:89
Ripple Effects (5 symbols)
  ╰─▸ POST /login       routes.py:67
  ╰─▸ admin_login       admin/views.py:34
  ╰─▸ require_auth      decorators.py:12
  ╰─▸ dashboard_view    views/dashboard.py:8
  ╰─▸ settings_view     views/settings.py:22
────────────────────────────────────────────────────────────────
Total impacted: 8 references, depth ≤ 3
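To make the idea behind an impact query concrete, here is a minimal sketch of a depth-limited recursive CTE. It runs against an in-memory SQLite database so that it is self-contained; N3MO itself uses PostgreSQL, and the `calls` table layout, column names, and `blast_radius` function here are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calls (caller TEXT NOT NULL, callee TEXT NOT NULL);
INSERT INTO calls VALUES
  ('login_endpoint', 'authenticate_user'),
  ('refresh_token',  'authenticate_user'),
  ('admin_login',    'login_endpoint');
""")

# Start from direct callers of the target, then repeatedly add callers of
# anything already found, stopping once max_depth is reached.
IMPACT_QUERY = """
WITH RECURSIVE impacted(name, depth) AS (
    SELECT caller, 1 FROM calls WHERE callee = :target
    UNION
    SELECT c.caller, i.depth + 1
    FROM calls AS c JOIN impacted AS i ON c.callee = i.name
    WHERE i.depth < :max_depth
)
SELECT name, MIN(depth) AS depth
FROM impacted
GROUP BY name
ORDER BY depth, name
"""


def blast_radius(target, max_depth=3):
    """Return (symbol, depth) pairs for everything that transitively calls target."""
    params = {"target": target, "max_depth": max_depth}
    return conn.execute(IMPACT_QUERY, params).fetchall()
```

In this sketch, depth 1 corresponds to the direct callers and deeper levels to the ripple effects. The depth bound also guarantees termination if the call graph contains a cycle.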
graph LR
A[main.py] --> B[auth.py::login]
A --> C[db.py::connect]
B --> D[utils.py::hash_password]
B --> E[models.py::User]
C --> F[config.py::DB_URI]
style A fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
style B fill:#4ecdc4,stroke:#0ca89e,stroke-width:2px,color:#000
style C fill:#45b7d1,stroke:#1098ad,stroke-width:2px,color:#000
style D fill:#96ceb4,stroke:#63b598,stroke-width:2px,color:#000
style E fill:#ffd93d,stroke:#f5c200,stroke-width:2px,color:#000
style F fill:#e0e0e0,stroke:#a0a0a0,stroke-width:2px,color:#000
Tested on ScanCode Toolkit, a real-world open source Python project with ~600k lines of code.
| Metric | v0.3 (current) |
|---|---|
| Repository | nexB/scancode-toolkit |
| Lines of code | ~600,000 |
| Index time | ~3 minutes |
| Processing mode | Single-threaded |
| Hardware | Intel i5-13450HX, 24GB RAM, NVMe SSD |
Real measured result on a real public repo: clone it and try it yourself.
Multiprocessing (v0.4) will produce a proper before/after comparison once implemented. No projections until the code exists.
| Phase | Component | Status |
|---|---|---|
| Phase 1: Foundations | Docker setup | ✅ Complete |
| | Database schema | ✅ Complete |
| | Tree-sitter integration | ✅ Complete |
| | Symbol + call extraction | ✅ Complete |
| | Blast radius (recursive CTE) | ✅ Complete |
| | Interactive visualizer | ✅ Complete |
| Phase 2: Performance | Connection pooling | 🔵 Next |
| | Batch DB operations | 🔵 Next |
| | Incremental re-index (file hashing) | 🔵 Next |
| | Multiprocessing (AST parsing) | 🔵 Next |
| Phase 3: Correctness | Scope-aware call resolution | ⏳ Planned |
| | CTE cycle guard | ⏳ Planned |
| | Full type annotations + mypy | ⏳ Planned |
| | pytest suite + CI | ⏳ Planned |
| Phase 4: Distribution | MCP server (Cursor / Claude Code) | ⏳ Planned |
| | FastAPI REST layer | ⏳ Planned |
| | JavaScript / TypeScript support | ⏳ Planned |
| | Real-time git-hook indexing | ⏳ Planned |
| | pgvector semantic search | ⏳ Planned |
Legend: ✅ Complete | 🔵 In Progress | ⏳ Planned
Phase 1: Foundations ✅ Complete
- Docker environment (PostgreSQL)
- Database schema: Projects, Symbols, Calls, Imports tables
- Tree-sitter parser integration
- Symbol extractor with full AST traversal
- Idempotent upsert logic
- Blast radius via recursive CTE
- Interactive vis.js visualizer
Phase 2: Performance 🔵 In Progress
- psycopg2.pool.ThreadedConnectionPool: replace per-call connections
- execute_values() batch inserts: 1 transaction per file
- SHA-256 file hashing for incremental re-index
- ProcessPoolExecutor for parallel AST parsing
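The incremental re-index item can be sketched as follows: hash each file, compare against the digest recorded on the previous run, and re-parse only files whose digest changed. The function names and the in-memory `known` dictionary are hypothetical; a real implementation would presumably persist the digests in the database.

```python
import hashlib
from typing import Dict, Iterable, List


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of a file, read in 64 KiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(paths: Iterable[str], known: Dict[str, str]) -> List[str]:
    """Return paths whose digest differs from `known`, updating `known` in place."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if known.get(path) != digest:
            known[path] = digest
            changed.append(path)
    return changed
```

Only the paths returned by `changed_files` would need to be parsed again; everything else can be skipped.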
Phase 3: Correctness + Quality ⏳ Planned
- Scope-aware call resolution using imports table
- CTE cycle guard (visited node tracking)
- Full type annotations, mypy --strict clean
- pytest unit + integration test suite
- GitHub Actions CI pipeline
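The visited-node idea behind the planned cycle guard can be shown in plain Python. This sketch walks an in-memory caller map rather than a SQL CTE, and `transitive_callers` is a hypothetical name; it only illustrates why tracking visited nodes makes traversal terminate on recursive or mutually recursive code.

```python
from collections import deque
from typing import Dict, List, Set


def transitive_callers(callers_of: Dict[str, List[str]], target: str) -> Dict[str, int]:
    """Breadth-first walk up the call graph; returns each caller's minimum depth."""
    depth_of: Dict[str, int] = {}
    visited: Set[str] = {target}
    queue = deque([(target, 0)])
    while queue:
        name, depth = queue.popleft()
        for caller in callers_of.get(name, []):
            if caller in visited:
                # Cycle guard: this symbol was already reached, do not revisit it.
                continue
            visited.add(caller)
            depth_of[caller] = depth + 1
            queue.append((caller, depth + 1))
    return depth_of
```

Because each symbol enters the queue at most once, the walk finishes even when `a` calls `b`, `b` calls `c`, and `c` calls `a`.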
Phase 4: Distribution ⏳ Planned
- MCP server: N3MO as a tool for Cursor, Claude Code, Windsurf
- FastAPI REST layer: GET /impact/{symbol}, POST /index
- JavaScript / TypeScript support
- Real-time incremental indexing via git hooks
- pgvector semantic search: "find functions that do X"
1. Structure before semantics. Map the code skeleton (AST) before adding AI analysis. A correct graph is worth more than a smart but wrong one.
2. Database as source of truth. All state lives in PostgreSQL, eliminating in-memory complexity and enabling graph queries that application-level traversal cannot match.
3. Correctness over speed. The parser must handle syntax errors gracefully without corrupting the graph. A fast indexer that silently drops symbols is worse than a slow one that gets everything right.
4. Idempotent operations. Re-running ingestion produces identical results, enabling safe incremental updates and CI/CD integration.
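The fourth principle, idempotent ingestion, is commonly achieved with an upsert keyed on a uniqueness constraint. The sketch below uses in-memory SQLite so it runs as-is; PostgreSQL accepts the same `ON CONFLICT ... DO UPDATE` form. The table layout and the choice of `(project, file_path, name)` as the key are simplifying assumptions, not N3MO's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE symbols (
    project     TEXT NOT NULL,
    file_path   TEXT NOT NULL,
    name        TEXT NOT NULL,
    kind        TEXT NOT NULL,
    line_number INTEGER NOT NULL,
    UNIQUE (project, file_path, name)
)
""")

# Insert a new row, or update the existing one if the key already exists.
UPSERT = """
INSERT INTO symbols (project, file_path, name, kind, line_number)
VALUES (?, ?, ?, ?, ?)
ON CONFLICT (project, file_path, name)
DO UPDATE SET kind = excluded.kind, line_number = excluded.line_number
"""


def ingest(rows):
    """Upsert a batch of symbol rows inside a single transaction."""
    with conn:
        conn.executemany(UPSERT, rows)
```

Running `ingest` twice with the same rows leaves one row per symbol, and running it after a symbol has moved updates the stored line number instead of adding a duplicate.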
Contributions are welcome. Please follow these steps:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
# Install with dev dependencies
pip install -e ".[dev]"
# Lint
ruff check src/
# Type check
mypy src/
# Tests
pytest tests/
Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
- ✅ Free for personal projects and internal tools
- ✅ Open source: view, modify, and distribute freely
- ⚠️ Copyleft: derivative works must also be AGPL-3.0
- ⚠️ Network use: modified versions run as a web service must share changes
For commercial deployments or proprietary modifications, contact the author for licensing options.
See LICENSE for full legal details.
Raj Shekhar, Delhi Technological University
- Tree-sitter: for robust, incremental, error-tolerant parsing
- PostgreSQL: for making recursive graph queries possible without a graph database
- Docker: for reproducible, single-command environments
- vis.js: for the interactive graph visualization