[Feature] Feature Request: Full SQL Language Support (AST parsing) #116

@kibarik

Problem

SQL database projects are not meaningfully indexed by repowise.

Current behavior: SQL files are included in the file tree but produce 0 symbols during indexing.

Example output from a pure SQL database project:

4,530 files · 0 symbols
sql 100%

This means:

  • No table extraction
  • No stored procedure/function parsing
  • No view definitions
  • No dependency graph between SQL objects
  • No intelligent queries like "find all tables that reference X"

Impact: Teams working with SQL Server, PostgreSQL, or MySQL databases cannot use repowise's graph intelligence, dead code detection, or semantic search on their database schemas.

Proposed Solution

Promote SQL from the "Config / data" tier to the "Full" tier.

Based on the README, adding a new language requires:

  1. One .scm tree-sitter query file (packages/core/src/repowise/core/ingestion/queries/sql.scm)
  2. One config entry for SQL language registration
  3. One tree-sitter-sql entry in the pyproject.toml dependencies

What the SQL query file should extract:

  • Tables (CREATE TABLE, @symbol.name = table name)
  • Views (CREATE VIEW, @symbol.name = view name)
  • Stored procedures (CREATE PROCEDURE, @symbol.name = procedure name, @symbol.params = parameters)
  • Functions (CREATE FUNCTION, @symbol.name = function name, @symbol.params = parameters)
  • Triggers (CREATE TRIGGER, @symbol.name = trigger name)
  • References/dependencies (foreign keys, plus cross-references such as JOINs and subqueries)
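
To make the target output of the list above concrete, here is a minimal Python sketch of the extraction. It is only an illustration: it uses a regular expression as a stand-in for the tree-sitter query, and the names `Symbol` and `extract_symbols` are hypothetical, not part of repowise. A real sql.scm query would work on the AST and would not have the limits of this pattern (for example, it ignores comments and string literals).

```python
import re
from typing import NamedTuple


class Symbol(NamedTuple):
    kind: str  # "table", "view", "procedure", "function" or "trigger"
    name: str  # schema-qualified name with T-SQL brackets removed


# CREATE [OR REPLACE | OR ALTER] <kind> <name>, where <name> may be
# bracketed ([dbo].[Users]) and dot-qualified (dbo.Users).
_CREATE_RE = re.compile(
    r"\bCREATE\s+(?:OR\s+(?:REPLACE|ALTER)\s+)?"
    r"(TABLE|VIEW|PROCEDURE|FUNCTION|TRIGGER)\s+"
    r"((?:\[[^\]]+\]|\w+)(?:\.(?:\[[^\]]+\]|\w+))*)",
    re.IGNORECASE,
)


def extract_symbols(sql: str) -> list[Symbol]:
    """Return one Symbol per CREATE statement found in the SQL text."""
    symbols = []
    for kind, raw_name in _CREATE_RE.findall(sql):
        name = raw_name.replace("[", "").replace("]", "")
        symbols.append(Symbol(kind.lower(), name))
    return symbols


SAMPLE = """
CREATE TABLE [dbo].[Users]([UserId] INT);
CREATE VIEW [dbo].[ActiveUsers] AS SELECT UserId FROM dbo.Users;
CREATE PROCEDURE [dbo].[GetUserByEmail] @Email NVARCHAR(256) AS
SELECT * FROM dbo.Users WHERE Email = @Email;
"""

for symbol in extract_symbols(SAMPLE):
    print(symbol.kind, symbol.name)
```

Parameter capture (@symbol.params) is left out here because it differs a lot between dialects and is better handled by the grammar than by a pattern.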

Dependency graph:

  • Table → other tables (foreign keys)
  • View → base tables/views
  • Procedure/Function → tables/views referenced in body
  • Trigger → table it's attached to
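
The edges listed above could be collected roughly as follows. This is a sketch under stated assumptions, not how repowise builds its graph: `build_dependency_graph` is a hypothetical helper, it scans text with regular expressions instead of walking a tree-sitter AST, and it treats everything between one CREATE statement and the next as the body of the first.

```python
import re

# A possibly bracketed, possibly schema-qualified object name.
_NAME = r"((?:\[[^\]]+\]|\w+)(?:\.(?:\[[^\]]+\]|\w+))*)"
_CREATE_RE = re.compile(
    r"\bCREATE\s+(TABLE|VIEW|PROCEDURE|FUNCTION|TRIGGER)\s+" + _NAME,
    re.IGNORECASE,
)
# Keywords that introduce a referenced table or view.
_REF_RE = re.compile(r"\b(?:FROM|JOIN|REFERENCES)\s+" + _NAME, re.IGNORECASE)
# A trigger names its table after ON; only the first ON in the body is used.
_ON_RE = re.compile(r"\bON\s+" + _NAME, re.IGNORECASE)


def _clean(name: str) -> str:
    return name.replace("[", "").replace("]", "")


def build_dependency_graph(sql: str) -> dict[str, set[str]]:
    """Map each created object to the set of objects its body refers to."""
    graph: dict[str, set[str]] = {}
    matches = list(_CREATE_RE.finditer(sql))
    for i, match in enumerate(matches):
        owner = _clean(match.group(2))
        end = matches[i + 1].start() if i + 1 < len(matches) else len(sql)
        body = sql[match.end():end]
        deps = {_clean(ref) for ref in _REF_RE.findall(body)}
        if match.group(1).upper() == "TRIGGER":
            attached = _ON_RE.search(body)
            if attached:
                deps.add(_clean(attached.group(1)))
        deps.discard(owner)  # ignore self-references
        graph[owner] = deps
    return graph


SAMPLE = """
CREATE TABLE [dbo].[Users]([UserId] INT PRIMARY KEY);
CREATE TABLE [dbo].[Orders]([OrderId] INT, [UserId] INT REFERENCES [dbo].[Users]([UserId]));
CREATE VIEW [dbo].[ActiveUsers] AS SELECT UserId FROM dbo.Users;
CREATE TRIGGER [dbo].[trg_Users] ON [dbo].[Users] AFTER INSERT AS SELECT 1;
"""

for owner, deps in build_dependency_graph(SAMPLE).items():
    print(owner, "->", sorted(deps))
```

Names are compared as written, so `Users` and `dbo.Users` would count as different objects; resolving default schemas is one of the things a proper implementation would need to decide.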

Alternatives Considered

  1. Use Serena instead: Serena already has SQL symbol support via LSP
    • Drawback: Loses repowise's git intelligence, dead code detection, decision tracking, and MCP tool integration
  2. Custom local fork: Fork repowise and add SQL support locally
    • Drawback: Maintenance burden, misses upstream updates, doesn't help the community
  3. Use a different tool: Find another codebase intelligence tool with SQL support
    • Drawback: Loses repowise's unique features (4 intelligence layers, workspace support, 7 MCP tools, proactive hooks)

Additional Context

tree-sitter-sql exists:

Example SQL file structure:

CREATE TABLE [dbo].[Users](
    [UserId] INT IDENTITY(1,1) PRIMARY KEY,
    [Email] NVARCHAR(256) NOT NULL,
    [Created] DATETIME DEFAULT GETDATE()
);

CREATE VIEW [dbo].[ActiveUsers] AS
SELECT UserId, Email FROM dbo.Users WHERE Created > DATEADD(day, -30, GETDATE());

CREATE PROCEDURE [dbo].[GetUserByEmail]
    @Email NVARCHAR(256)
AS
SELECT * FROM dbo.Users WHERE Email = @Email;

Expected symbols after indexing:

  • symbol:table dbo.Users
  • symbol:view dbo.ActiveUsers (depends on: dbo.Users)
  • symbol:procedure dbo.GetUserByEmail (params: @Email, depends on: dbo.Users)

Use cases this enables:

  • "Find all tables referenced by procedure X"
  • "Show dependency graph for table Y"
  • "Detect unused views (dead code)"
  • "Semantic search: 'where do we store user emails?'"
  • "Impact analysis before dropping a column"
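
Once symbols and edges exist, queries like the ones above reduce to simple graph lookups. The sketch below assumes a plain dictionary of edges and a dictionary of symbol kinds, written by hand to match the example file; `referenced_by` and `unused_views` are hypothetical helpers, not repowise APIs.

```python
# Edges: object -> objects it depends on (hand-written for the example file).
GRAPH = {
    "dbo.Users": set(),
    "dbo.ActiveUsers": {"dbo.Users"},
    "dbo.GetUserByEmail": {"dbo.Users"},
}
# Symbol kinds for the same objects.
KINDS = {
    "dbo.Users": "table",
    "dbo.ActiveUsers": "view",
    "dbo.GetUserByEmail": "procedure",
}


def referenced_by(graph: dict[str, set[str]], target: str) -> list[str]:
    """Objects that depend on `target` (impact analysis before a change)."""
    return sorted(owner for owner, deps in graph.items() if target in deps)


def unused_views(graph: dict[str, set[str]], kinds: dict[str, str]) -> list[str]:
    """Views that no other indexed object refers to (dead code candidates)."""
    referenced = set().union(*graph.values())
    return sorted(
        name for name, kind in kinds.items()
        if kind == "view" and name not in referenced
    )


print(referenced_by(GRAPH, "dbo.Users"))
print(unused_views(GRAPH, KINDS))
```

A view reported here is only a candidate: application code outside the indexed SQL files may still query it.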

References:


Would be happy to contribute this feature if guidance is provided on the exact .scm query format and config entry location.

Labels: enhancement
