🔍 N3MO


A code intelligence engine that transforms repositories into queryable knowledge graphs

Parse once. Query forever. Know exactly what breaks before it does.

📜 Licensed under AGPL-3.0: free for personal/internal use • Contact the author for commercial licensing

What is N3MO • Architecture • Installation • Usage • Benchmarks • Roadmap


🎯 What is N3MO?

N3MO addresses a fundamental challenge in software engineering: understanding large codebases. Unlike code search tools that rely on text matching, N3MO models code structure first, capturing symbols, their relationships, and their call chains.

The Problem It Solves

❌ Traditional grep/search:  "Where does 'login' appear?"
✅ N3MO:                     "What will break if I change the login function?"

Critical questions N3MO answers:

  • 🔎 What functions and classes exist in this repository?
  • 🎯 Where is this symbol used, directly and transitively?
  • 💥 What is the blast radius of changing this function?
  • 🕸️ How do these components actually connect?

🏗️ Architecture

Knowledge graph model

N3MO builds a symbol-centric knowledge graph stored in PostgreSQL:

graph TB
    subgraph repo["Repository Analysis"]
        A["📄 Source Code"] -->|Tree-sitter| B["🌳 AST Parser"]
        B --> C["🔍 Symbol Extractor"]
    end

    subgraph kg["Knowledge Graph"]
        D[("🗄️ PostgreSQL")]
        E["📦 Projects"]
        F["🔤 Symbols"]
        G["🔗 Relationships"]
        D --- E
        D --- F
        D --- G
    end

    subgraph query["Query Engine"]
        H["📊 Dependency Graph"]
        I["📞 Call Graph"]
        J["💥 Impact Analysis"]
    end

    C --> D
    D --> H
    D --> I
    D --> J
    H --> K["🎨 Visualization"]
    I --> K
    J --> K

    style repo fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
    style kg fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
    style query fill:#2d3748,stroke:#4a5568,stroke-width:2px,color:#fff
    style A fill:#e2e8f0,stroke:#4a5568,color:#1a202c
    style B fill:#cbd5e0,stroke:#4a5568,color:#1a202c
    style C fill:#cbd5e0,stroke:#4a5568,color:#1a202c
    style D fill:#fc8181,stroke:#c53030,color:#1a202c,stroke-width:3px
    style E fill:#a0aec0,stroke:#4a5568,color:#1a202c
    style F fill:#a0aec0,stroke:#4a5568,color:#1a202c
    style G fill:#a0aec0,stroke:#4a5568,color:#1a202c
    style H fill:#90cdf4,stroke:#2c5282,color:#1a202c
    style I fill:#90cdf4,stroke:#2c5282,color:#1a202c
    style J fill:#90cdf4,stroke:#2c5282,color:#1a202c
    style K fill:#9ae6b4,stroke:#2f855a,color:#1a202c

System flow

sequenceDiagram
    participant User
    participant CLI
    participant Docker
    participant Parser
    participant DB as PostgreSQL
    participant Viz as Visualizer

    User->>CLI: n3mo index
    CLI->>Docker: Start containers
    Docker->>Parser: Mount repository
    Parser->>Parser: Walk file tree
    Parser->>Parser: Parse AST (Tree-sitter)
    Parser->>DB: Store symbols & relations
    DB-->>Parser: Confirm storage

    User->>CLI: n3mo impact "function_name"
    CLI->>DB: Query call graph
    DB->>DB: Recursive CTE traversal
    DB-->>Viz: Return dependency tree
    Viz-->>User: Display graph (HTML/JS)

Data model

erDiagram
    PROJECT ||--o{ SYMBOL : contains
    SYMBOL ||--o{ SYMBOL : "calls/inherits"
    SYMBOL {
        uuid id PK
        string kind "function|class|variable"
        string name
        string file_path
        int line_number
        uuid parent_id FK
        uuid project_id FK
    }
    PROJECT {
        uuid id PK
        string name
        string root_path
        timestamp indexed_at
    }
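The data model above can be sketched as two tables. This is a minimal illustration only: the table and column names simply mirror the ER diagram, SQLite stands in for PostgreSQL so the sketch runs without a server, and the helper names `create_schema` and `add_symbol` are hypothetical rather than part of N3MO.

```python
import sqlite3
import uuid

# Assumption: names mirror the ER diagram; the real PostgreSQL schema may differ.
SCHEMA = """
CREATE TABLE project (
    id         TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    root_path  TEXT NOT NULL,
    indexed_at TEXT
);
CREATE TABLE symbol (
    id          TEXT PRIMARY KEY,
    kind        TEXT NOT NULL CHECK (kind IN ('function', 'class', 'variable')),
    name        TEXT NOT NULL,
    file_path   TEXT NOT NULL,
    line_number INTEGER NOT NULL,
    parent_id   TEXT REFERENCES symbol(id),
    project_id  TEXT NOT NULL REFERENCES project(id)
);
"""


def create_schema() -> sqlite3.Connection:
    """Create an in-memory database holding the two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_symbol(conn, project_id, kind, name, file_path, line_number, parent_id=None):
    """Insert one symbol row and return its generated UUID string."""
    symbol_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO symbol VALUES (?, ?, ?, ?, ?, ?, ?)",
        (symbol_id, kind, name, file_path, line_number, parent_id, project_id),
    )
    return symbol_id
```

The self-referencing `parent_id` column is what lets a method row point at its enclosing class row.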

✨ Features

Current capabilities (v0.3)

  • ✅ AST-based parsing: Tree-sitter integration for error-tolerant Python analysis
  • ✅ Symbol extraction: functions, classes, methods with full file + line context
  • ✅ Hierarchical modeling: Module → Class → Method parent-child relationships
  • ✅ Call graph construction: who calls whom, captured at ingestion time
  • ✅ Blast radius analysis: recursive CTE traversal to arbitrary depth
  • ✅ Idempotent ingestion: re-indexing updates existing data without duplication
  • ✅ Interactive visualizer: vis.js graph with click-to-inspect nodes and sidebar
  • ✅ Docker-first: single-command infrastructure setup
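To make symbol extraction and hierarchical modeling concrete, here is a rough sketch that uses Python's standard-library `ast` module as a stand-in for Tree-sitter. It is not N3MO's extractor: `ast.parse` raises on syntax errors, whereas Tree-sitter is error-tolerant, and the function name `extract_symbols` and the dictionary layout are assumptions made for illustration.

```python
import ast
from typing import Dict, List, Optional


def extract_symbols(source: str, file_path: str) -> List[Dict[str, object]]:
    """Return functions, classes, and methods with their parent and line number."""
    symbols: List[Dict[str, object]] = []

    def visit(node: ast.AST, parent: Optional[str]) -> None:
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "function"
                symbols.append(
                    {
                        "kind": kind,
                        "name": child.name,
                        "parent": parent,  # enclosing symbol, None at module level
                        "file_path": file_path,
                        "line_number": child.lineno,
                    }
                )
                # Nested definitions record this symbol as their parent.
                visit(child, child.name)
            else:
                visit(child, parent)

    visit(ast.parse(source), None)
    return symbols
```

A method therefore appears as a `function` whose `parent` is the class name, which is one simple way to represent the Module → Class → Method hierarchy.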

In development (v0.4)

  • 🚧 Connection pooling: eliminate per-symbol DB round trips
  • 🚧 Batch inserts: 1 transaction per file, not per row
  • 🚧 Incremental re-index: SHA-256 file hashing, skip unchanged files
  • 🚧 Multiprocessing: parallel AST parsing via ProcessPoolExecutor
  • 🚧 Scope-aware call resolution: use imports table, eliminate false positives
  • 🚧 Test suite: pytest with real Postgres integration tests
  • 🚧 GitHub Actions CI: lint, typecheck, test on every PR
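The incremental re-index item could work roughly as follows. This is a sketch under assumptions, not the planned implementation: the previous digests are kept in a plain dictionary here (N3MO would presumably persist them in PostgreSQL), and `file_digest` and `changed_files` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def changed_files(paths, known_hashes: dict) -> list:
    """Return the paths whose content changed; update known_hashes in place."""
    changed = []
    for path in paths:
        digest = file_digest(path)
        if known_hashes.get(str(path)) != digest:
            known_hashes[str(path)] = digest
            changed.append(path)
    return changed
```

Only the files returned by `changed_files` would need to be parsed again; everything else keeps its existing rows.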

🚀 Installation

Prerequisites

Docker, Python, and Git

Quick start

# 1. Clone the repository
git clone https://github.com/RajX-dev/N3MO.git
cd N3MO

# 2. Configure environment
cp .env.example .env

# 3. Start infrastructure
docker-compose up -d

# 4. Install the CLI
pip install -e .

# 5. Verify
n3mo --help

💻 Usage

Index a repository

# Navigate to any Python repository
cd /path/to/your/project

# Run the indexer
n3mo index

What gets indexed:

  • ✅ Python files (.py)
  • ❌ Virtual environments (venv/, .venv/)
  • ❌ Dependencies (node_modules/, site-packages/)
  • ❌ Build artifacts (.git/, __pycache__/, dist/)
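The include/exclude rules above can be expressed as a small directory walk. This is an illustrative sketch: the directory names come from the list above, while `EXCLUDED_DIRS` and `iter_python_files` are hypothetical names, and the real indexer may apply further rules.

```python
import os
from pathlib import Path

# Directory names taken from the exclusion list above.
EXCLUDED_DIRS = {
    "venv", ".venv", "node_modules", "site-packages",
    ".git", "__pycache__", "dist",
}


def iter_python_files(root):
    """Yield every .py file under root, skipping excluded directories."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Pruning dirnames in place stops os.walk from descending into them.
        dirnames[:] = [d for d in dirnames if d not in EXCLUDED_DIRS]
        for name in sorted(filenames):
            if name.endswith(".py"):
                yield Path(dirpath) / name
```

Pruning during the walk avoids reading large dependency trees at all, rather than filtering their files afterwards.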

Blast radius analysis

# Find everything affected by changing a function
n3mo impact "authenticate_user"

# Open an interactive visual graph in your browser
n3mo impact "authenticate_user" --graph

Example terminal output:

  ◈ IMPACT ANALYSIS
  ──────────────────────────────────────────────────────────────────
  Target:  authenticate_user
  ──────────────────────────────────────────────────────────────────

  ◉ Direct Callers  (3 symbols)

  ▸ login_endpoint             api/auth.py:12
  ▸ refresh_token              api/token.py:23
  ▸ validate_session           middleware/auth.py:89

  ◎ Ripple Effects  (5 symbols)

    ╰─▸ POST /login              routes.py:67
    ╰─▸ admin_login              admin/views.py:34
    ╰─▸ require_auth             decorators.py:12
    ╰─▸ dashboard_view           views/dashboard.py:8
    ╰─▸ settings_view            views/settings.py:22

  ──────────────────────────────────────────────────────────────────
  Total impacted: 8 references  │  depth ≤ 3
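A depth-limited blast radius like the one reported above can be computed with a recursive CTE. The sketch below is an assumption-laden illustration, not N3MO's actual query: it uses SQLite in place of PostgreSQL, a hypothetical `calls(caller, callee)` table keyed by symbol name rather than UUID, and a hypothetical `blast_radius` helper.

```python
import sqlite3

# Walk the call graph upwards: start from direct callers, then callers of callers.
IMPACT_QUERY = """
WITH RECURSIVE impact(name, depth) AS (
    SELECT caller, 1 FROM calls WHERE callee = :target
    UNION
    SELECT c.caller, i.depth + 1
    FROM calls AS c
    JOIN impact AS i ON c.callee = i.name
    WHERE i.depth < :max_depth
)
SELECT name, MIN(depth) AS depth
FROM impact
GROUP BY name
ORDER BY depth, name
"""


def blast_radius(conn, target, max_depth=3):
    """Return (symbol, depth) pairs for every transitive caller of target."""
    params = {"target": target, "max_depth": max_depth}
    return conn.execute(IMPACT_QUERY, params).fetchall()
```

Rows at depth 1 correspond to direct callers and deeper rows to ripple effects. Because the depth is bounded and `UNION` discards duplicate rows, the query also terminates on cyclic call graphs.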

Dependency graph visualization

graph LR
    A[main.py] --> B[auth.py::login]
    A --> C[db.py::connect]
    B --> D[utils.py::hash_password]
    B --> E[models.py::User]
    C --> F[config.py::DB_URI]

    style A fill:#ff6b6b,stroke:#c92a2a,stroke-width:2px,color:#fff
    style B fill:#4ecdc4,stroke:#0ca89e,stroke-width:2px,color:#000
    style C fill:#45b7d1,stroke:#1098ad,stroke-width:2px,color:#000
    style D fill:#96ceb4,stroke:#63b598,stroke-width:2px,color:#000
    style E fill:#ffd93d,stroke:#f5c200,stroke-width:2px,color:#000
    style F fill:#e0e0e0,stroke:#a0a0a0,stroke-width:2px,color:#000

🛠️ Technology stack

| Component | Technology | Purpose |
| --- | --- | --- |
| Parser | Tree-sitter | Error-tolerant syntax analysis |
| Database | PostgreSQL | Relational graph storage + recursive CTE queries |
| Runtime | Python | Core logic |
| Infrastructure | Docker | Containerization |
| Visualization | JavaScript | Interactive impact graph |

📊 Benchmarks

Tested on ScanCode Toolkit, a real-world open source Python project with ~600k lines of code.

| Metric | v0.3 (current) |
| --- | --- |
| Repository | nexB/scancode-toolkit |
| Lines of code | ~600,000 |
| Index time | ~3 minutes |
| Processing mode | Single-threaded |
| Hardware | Intel i5-13450HX, 24GB RAM, NVMe SSD |

✅ Measured on a public repository: clone it and try it yourself.

Multiprocessing (v0.4) will produce a proper before/after comparison once implemented. No projections until the code exists.


🗺️ Roadmap

Development timeline

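| Phase | Component | Status |
| --- | --- | --- |
| Phase 1: Foundations | Docker setup | ✅ Complete |
| Phase 1: Foundations | Database schema | ✅ Complete |
| Phase 1: Foundations | Tree-sitter integration | ✅ Complete |
| Phase 1: Foundations | Symbol + call extraction | ✅ Complete |
| Phase 1: Foundations | Blast radius (recursive CTE) | ✅ Complete |
| Phase 1: Foundations | Interactive visualizer | ✅ Complete |
| Phase 2: Performance | Connection pooling | 🔵 Next |
| Phase 2: Performance | Batch DB operations | 🔵 Next |
| Phase 2: Performance | Incremental re-index (file hashing) | 🔵 Next |
| Phase 2: Performance | Multiprocessing (AST parsing) | 🔵 Next |
| Phase 3: Correctness | Scope-aware call resolution | ⏳ Planned |
| Phase 3: Correctness | CTE cycle guard | ⏳ Planned |
| Phase 3: Correctness | Full type annotations + mypy | ⏳ Planned |
| Phase 3: Correctness | pytest suite + CI | ⏳ Planned |
| Phase 4: Distribution | MCP server (Cursor / Claude Code) | ⏳ Planned |
| Phase 4: Distribution | FastAPI REST layer | ⏳ Planned |
| Phase 4: Distribution | JavaScript / TypeScript support | ⏳ Planned |
| Phase 4: Distribution | Real-time git-hook indexing | ⏳ Planned |
| Phase 4: Distribution | pgvector semantic search | ⏳ Planned |

Legend: ✅ Complete | 🔵 In Progress | ⏳ Planned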

Phase 1: Foundations ✅ Complete
  • Docker environment (PostgreSQL)
  • Database schema: Projects, Symbols, Calls, Imports tables
  • Tree-sitter parser integration
  • Symbol extractor with full AST traversal
  • Idempotent upsert logic
  • Blast radius via recursive CTE
  • Interactive vis.js visualizer
Phase 2: Performance 🔵 In Progress
  • psycopg2.pool.ThreadedConnectionPool: replace per-call connections
  • execute_values() batch inserts: 1 transaction per file
  • SHA-256 file hashing for incremental re-index
  • ProcessPoolExecutor for parallel AST parsing
Phase 3: Correctness + Quality ⏳ Planned
  • Scope-aware call resolution using imports table
  • CTE cycle guard (visited node tracking)
  • Full type annotations, mypy --strict clean
  • pytest unit + integration test suite
  • GitHub Actions CI pipeline
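The planned CTE cycle guard (visited node tracking) might look like the sketch below. It is only one possible approach and not the author's design: the visited set is carried as a `/`-delimited path string so the example runs on SQLite, the `calls(caller, callee)` table and the function name are hypothetical, and symbol names are assumed not to contain `/`. In PostgreSQL the same idea is commonly written with an array column and `= ANY(path)`.

```python
import sqlite3

# Each row carries the path walked so far; a caller already on the path is skipped.
CYCLE_SAFE_QUERY = """
WITH RECURSIVE impact(name, path) AS (
    SELECT caller, '/' || callee || '/' || caller || '/'
    FROM calls
    WHERE callee = :target
    UNION ALL
    SELECT c.caller, i.path || c.caller || '/'
    FROM calls AS c
    JOIN impact AS i ON c.callee = i.name
    WHERE instr(i.path, '/' || c.caller || '/') = 0
)
SELECT DISTINCT name FROM impact ORDER BY name
"""


def callers_without_cycles(conn, target):
    """Return all transitive callers of target, terminating on cyclic graphs."""
    rows = conn.execute(CYCLE_SAFE_QUERY, {"target": target}).fetchall()
    return [name for (name,) in rows]
```

Unlike a fixed depth limit, this guard lets the traversal go as deep as the graph requires while still stopping when a path would revisit a symbol.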
Phase 4: Distribution ⏳ Planned
  • MCP server: N3MO as a tool for Cursor, Claude Code, Windsurf
  • FastAPI REST layer: GET /impact/{symbol}, POST /index
  • JavaScript / TypeScript support
  • Real-time incremental indexing via git hooks
  • pgvector semantic search: "find functions that do X"

πŸ“ Design principles

1. Structure before semantics. Map the code skeleton (AST) before adding AI analysis. A correct graph is worth more than a smart but wrong one.

2. Database as source of truth. All state lives in PostgreSQL, eliminating in-memory complexity and enabling graph queries that application-level traversal cannot match.

3. Correctness over speed. The parser must handle syntax errors gracefully without corrupting the graph. A fast indexer that silently drops symbols is worse than a slow one that gets everything right.

4. Idempotent operations. Re-running ingestion produces identical results, enabling safe incremental updates and CI/CD integration.
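Idempotent ingestion is typically built on an upsert. The following is a minimal sketch, assuming a simplified natural key of `(project_id, file_path, name)`; a real schema would need a richer key, since two classes in one file can define methods with the same name. SQLite stands in for PostgreSQL (both accept `INSERT ... ON CONFLICT ... DO UPDATE`), and `make_db` and `upsert_symbol` are hypothetical names.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory table with a uniqueness constraint to upsert against."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE symbol (
            project_id  TEXT NOT NULL,
            file_path   TEXT NOT NULL,
            name        TEXT NOT NULL,
            kind        TEXT NOT NULL,
            line_number INTEGER NOT NULL,
            UNIQUE (project_id, file_path, name)
        )
        """
    )
    return conn


def upsert_symbol(conn, project_id, file_path, name, kind, line_number):
    """Insert a symbol, or update the existing row if the key already exists."""
    conn.execute(
        """
        INSERT INTO symbol (project_id, file_path, name, kind, line_number)
        VALUES (?, ?, ?, ?, ?)
        ON CONFLICT (project_id, file_path, name)
        DO UPDATE SET kind = excluded.kind, line_number = excluded.line_number
        """,
        (project_id, file_path, name, kind, line_number),
    )
```

Running the same ingestion twice leaves one row per symbol, and a symbol that moved to a different line is updated in place rather than duplicated.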


🤝 Contributing

Contributions are welcome. Please follow these steps:

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

Development setup

# Install with dev dependencies
pip install -e ".[dev]"

# Lint
ruff check src/

# Type check
mypy src/

# Tests
pytest tests/

📜 License

Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).

  • ✅ Free for personal projects and internal tools
  • ✅ Open source: view, modify, and distribute freely
  • ⚠️ Copyleft: derivative works must also be AGPL-3.0
  • ⚠️ Network use: modified versions run as a web service must share changes

For commercial deployments or proprietary modifications, contact for licensing options.

See LICENSE for full legal details.


👨‍💻 Author

Raj Shekhar, Delhi Technological University



🙏 Acknowledgments

  • Tree-sitter: for robust, incremental, error-tolerant parsing
  • PostgreSQL: for making recursive graph queries possible without a graph database
  • Docker: for reproducible, single-command environments
  • vis.js: for the interactive graph visualization

⭐ Star this repo if you find it useful!

Building tools for understanding code at scale.


About

A high-performance code intelligence engine that transforms Python repositories into queryable symbol-centric knowledge graphs. Features deep impact analysis (blast radius detection) with a professional interactive UI and VS Code deep-linking
