FalkorDB
diff --git a/CI_OPTIMIZATION.md b/CI_OPTIMIZATION.md
Lines changed: 160 additions & 0 deletions
diff --git a/api/analyzers/analyzer.py b/api/analyzers/analyzer.py
Lines changed: 29 additions & 0 deletions
diff --git a/api/analyzers/java/analyzer.py b/api/analyzers/java/analyzer.py
Lines changed: 16 additions & 0 deletions
diff --git a/api/analyzers/python/analyzer.py b/api/analyzers/python/analyzer.py
Lines changed: 92 additions & 0 deletions
diff --git a/api/analyzers/source_analyzer.py b/api/analyzers/source_analyzer.py
Lines changed: 13 additions & 0 deletions
diff --git a/api/entities/file.py b/api/entities/file.py
Lines changed: 20 additions & 0 deletions
diff --git a/test-project/a.c b/test-project/a.c
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,160 @@
+# CI Pipeline Optimization Analysis (Staging Branch)
+
+## Current Workflows on Staging
+
+The staging branch has 3 workflow files (identical to main):
+
+| Workflow | File | Trigger | ~Duration |
+|---|---|---|---|
+| **Build** | `nextjs.yml` | All PRs + push to main | **~1 min** |
+| **Playwright Tests** | `playwright.yml` | PRs + push to main/staging | **~10 min** (x2 shards) |
+| **Release image** | `release-image.yml` | Tags + main push | release-only |
+
+Additionally, **CodeQL** runs on staging pushes.
+
+## Playwright Tests — The Bottleneck
+
+This is the critical path. It runs 2 shards in parallel, each taking ~10 min. Measured from recent staging runs:
+
+| Step | Shard 1 | Shard 2 | % of total |
+|---|---|---|---|
+| **Seed test data into FalkorDB** | **223s** | **220s** | **37%** |
+| **Run Playwright tests** | 264s | 262s | 44% |
+| **Install Playwright browsers** | 48s | 51s | 8% |
+| Install backend deps (`pip install`) | 28s | 31s | 5% |
+| Build frontend | 12s | 12s | 2% |
+| Install frontend deps (`npm ci`) | 8s | 8s | 1% |
+| Container init + setup | ~15s | ~15s | 3% |
+
+**Total per shard: ~600s (10 min). Total billable: ~20 min.**
+
+## Build Workflow — Wasted Work
+
+The Build workflow (~64s total) installs backend dependencies but does nothing with them:
+
+| Step | Duration |
+|---|---|
+| Install frontend deps | 7s |
+| Build frontend | 14s |
+| Lint frontend | <1s |
+| **Install backend deps (`pip install`)** | **35s** |
+
+The backend install accounts for **55% of the Build workflow** and serves no purpose.
+
+---
+
+## Optimization Recommendations
+
+### 1. Cache or pre-seed FalkorDB test data (saves **~3.5 min/shard = ~7 min total**)
+
+`seed_test_data.py` clones 2 GitHub repos (GraphRAG-SDK, Flask) and runs full source analysis every run. This is the single biggest time sink at **37% of Playwright runtime**.
+
+**Options:**
+- **Best**: Export the seeded graph as an RDB dump (e.g. with `redis-cli --rdb`), commit it as a test fixture, and have FalkorDB load it at startup instead of re-seeding. Eliminates nearly all of the ~220s step.
+- **Good**: Cache the cloned repos + analysis output with `actions/cache` keyed on the seed script hash + repo commit SHAs.
+- **Minimum**: Cache just the git clones to skip network time.
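
A minimal sketch of the cache key behind the "Good" option, assuming the key should change whenever the seed script or either cloned repo's commit changes. `seed_cache_key`, the `falkordb-seed` prefix, and the SHA values are hypothetical and not part of this PR. In a workflow, the resulting string would be passed to `actions/cache` as `key`.

```python
import hashlib


def seed_cache_key(seed_script: bytes, repo_shas: dict[str, str],
                   prefix: str = "falkordb-seed") -> str:
    """Derive a cache key from the seed script contents and repo commit SHAs."""
    digest = hashlib.sha256()
    digest.update(seed_script)
    # Sort by repo name so the key does not depend on dict insertion order.
    for name in sorted(repo_shas):
        digest.update(b"\0" + name.encode() + b"=" + repo_shas[name].encode())
    return f"{prefix}-{digest.hexdigest()[:16]}"


if __name__ == "__main__":
    # Placeholder script contents and SHAs, for illustration only.
    key = seed_cache_key(
        b"print('seed')",
        {"GraphRAG-SDK": "aaaaaaa", "flask": "bbbbbbb"},
    )
    print(key)
```

Any change to the script bytes or to a SHA yields a different key, so a stale seeded graph would not be restored after the inputs change.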
+
+### 2. Cache Playwright browsers (saves **~50s/shard = ~1.5 min total**)
+
+Browsers are installed from scratch every run (`npx playwright install --with-deps`). Add:
+
+```yaml
+- name: Cache Playwright browsers
+  id: playwright-cache
+  uses: actions/cache@v4
+  with:
+    path: ~/.cache/ms-playwright
+    key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
+
+- name: Install Playwright Browsers
+  if: steps.playwright-cache.outputs.cache-hit != 'true'
+  run: npx playwright install --with-deps chromium
+
+- name: Install Playwright system deps
+  if: steps.playwright-cache.outputs.cache-hit == 'true'
+  run: npx playwright install-deps chromium
+```
+
+### 3. Switch `pip install` to `uv` (saves **~15-20s/shard**)
+
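+Both workflows install backend dependencies with `pip install`, which is slow. `uv sync` is an estimated 3-5x faster (it assumes dependencies are declared in `pyproject.toml`; a `requirements.txt` setup would need `uv pip install -r requirements.txt` instead):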
+
+```yaml
+- name: Install uv
+  uses: astral-sh/setup-uv@v5
+  with:
+    version: "latest"
+
+- name: Install dependencies
+  run: uv sync
+```
+
+### 4. Remove unused backend install from Build workflow (saves **~35s**)
+
+`nextjs.yml` installs backend deps but runs no backend tests or lint. Either:
+- **Remove** the `Setup Python` and `Install backend dependencies` steps entirely
+- **Or** add backend unit tests / pylint to justify the install
+
+### 5. Add concurrency groups (saves **queued minutes**)
+
+The Build workflow has no concurrency group. Rapid pushes queue redundant runs:
+
+```yaml
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+```
+
+The Playwright workflow also lacks a concurrency group.
+
+### 6. Add npm cache (saves **~3-5s/shard**)
+
+Neither workflow caches npm. Add to `setup-node`:
+
+```yaml
+- uses: actions/setup-node@v4
+  with:
+    node-version: 24
+    cache: 'npm'
+    cache-dependency-path: |
+      package-lock.json
+      app/package-lock.json
+```
+
+### 7. Docker build caching for releases (saves **~2-5 min** on releases)
+
+No layer caching on the Docker build. Add:
+
+```yaml
+- uses: docker/build-push-action@v5
+  with:
+    context: .
+    file: ./Dockerfile
+    push: true
+    tags: ${{ env.TAGS }}
+    cache-from: type=gha
+    cache-to: type=gha,mode=max
+```
+
+### 8. Deduplicate npm installs in Playwright workflow
+
+The Playwright workflow runs `npm ci` twice — once for frontend (`./app`) and once for root (Playwright). These could be consolidated or at least cached.
+
+---
+
+## Summary
+
+| # | Optimization | Time saved | Effort |
+|---|---|---|---|
+| 1 | Cache/pre-seed FalkorDB data | **~7 min** | Medium |
+| 2 | Cache Playwright browsers | **~1.5 min** | Low |
+| 3 | Switch to `uv` from `pip` | **~40s** | Low |
+| 4 | Remove unused backend install from Build | **~35s** | Trivial |
+| 5 | Add concurrency groups | Variable | Trivial |
+| 6 | Add npm cache | ~10s | Trivial |
+| 7 | Docker layer caching | ~2-5 min (releases) | Low |
+| 8 | Deduplicate npm installs | ~5s | Low |
+
+**Total potential savings: ~9-10 billable min per CI run** (summed across both Playwright shards and the Build workflow), bringing Playwright from ~10 min/shard down to ~4-5 min/shard (dominated by the actual test execution).
+
+The single biggest win is **pre-seeding FalkorDB data** — it alone accounts for 37% of the Playwright workflow runtime.
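
The per-shard estimate can be checked with a little arithmetic. The sketch below reuses the Shard 1 timings from the table above and the savings quoted in recommendations 1, 2, 3 and 6. The savings figures are the document's estimates (midpoints where a range is given), not measurements, and the names are illustrative.

```python
# Shard 1 step timings in seconds, copied from the Playwright table.
SHARD_STEPS = {
    "seed_test_data": 223,
    "run_playwright_tests": 264,
    "install_playwright_browsers": 48,
    "install_backend_deps": 28,
    "build_frontend": 12,
    "install_frontend_deps": 8,
    "container_init": 15,
}

# Assumed per-shard savings in seconds (estimates, not measurements).
ASSUMED_SAVINGS = {
    "seed_test_data": 220,              # restore a pre-seeded dump
    "install_playwright_browsers": 48,  # browser cache hit
    "install_backend_deps": 17,         # midpoint of the 15-20s uv estimate
    "install_frontend_deps": 4,         # midpoint of the 3-5s npm cache estimate
}


def shard_total(steps: dict[str, int]) -> int:
    """Total shard duration in seconds."""
    return sum(steps.values())


def optimized_total(steps: dict[str, int], savings: dict[str, int]) -> int:
    """Shard duration after subtracting savings, never below zero per step."""
    return sum(max(secs - savings.get(name, 0), 0) for name, secs in steps.items())


if __name__ == "__main__":
    before = shard_total(SHARD_STEPS)
    after = optimized_total(SHARD_STEPS, ASSUMED_SAVINGS)
    print(f"before: {before}s, after: {after}s ({after / 60:.1f} min)")
```

Under these assumptions a shard drops from 598s to 309s, about 5 min, which sits at the upper end of the ~4-5 min/shard estimate.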
@@ -149,3 +149,32 @@ def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_
 
         pass
 
+    @abstractmethod
+    def add_file_imports(self, file: File) -> None:
+        """
+        Add import statements to the file.
+
+        Args:
+            file (File): The file to add imports to.
+        """
+
+        pass
+
+    @abstractmethod
+    def resolve_import(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, import_node: Node) -> list[Entity]:
+        """
+        Resolve an import statement to entities.
+
+        Args:
+            files (dict[Path, File]): All files in the project.
+            lsp (SyncLanguageServer): The language server.
+            file_path (Path): The path to the file containing the import.
+            path (Path): The path to the project root.
+            import_node (Node): The import statement node.
+
+        Returns:
+            list[Entity]: List of resolved entities.
+        """
+
+        pass
+
@@ -127,3 +127,19 @@ def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_
             return self.resolve_method(files, lsp, file_path, path, symbol)
         else:
             raise ValueError(f"Unknown key {key}")
+
+    def add_file_imports(self, file: File) -> None:
+        """
+        Extract and add import statements from the file.
+        Java imports are not yet implemented.
+        """
+        # TODO: Implement Java import tracking
+        pass
+
+    def resolve_import(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, import_node: Node) -> list[Entity]:
+        """
+        Resolve an import statement to the entities it imports.
+        Java imports are not yet implemented.
+        """
+        # TODO: Implement Java import resolution
+        return []
@@ -122,3 +122,95 @@ def resolve_symbol(self, files: dict[Path, File], lsp: SyncLanguageServer, file_
             return self.resolve_method(files, lsp, file_path, path, symbol)
         else:
             raise ValueError(f"Unknown key {key}")
+
+    def add_file_imports(self, file: File) -> None:
+        """
+        Extract and add import statements from the file.
+        
+        Supports:
+        - import module
+        - import module as alias
+        - from module import name
+        - from module import name1, name2
+        - from module import name as alias
+        """
+        try:
+            import warnings
+            with warnings.catch_warnings():
+                warnings.simplefilter("ignore")
+                # Query for both import types
+                import_query = self.language.query("""
+                    (import_statement) @import
+                    (import_from_statement) @import_from
+                """)
+            
+            captures = import_query.captures(file.tree.root_node)
+            
+            # Add all import statement nodes to the file
+            if 'import' in captures:
+                for import_node in captures['import']:
+                    file.add_import(import_node)
+            
+            if 'import_from' in captures:
+                for import_node in captures['import_from']:
+                    file.add_import(import_node)
+        except Exception as e:
+            logger.debug(f"Failed to extract imports from {file.path}: {e}")
+
+    def resolve_import(self, files: dict[Path, File], lsp: SyncLanguageServer, file_path: Path, path: Path, import_node: Node) -> list[Entity]:
+        """
+        Resolve an import statement to the entities it imports.
+        """
+        res = []
+        
+        try:
+            if import_node.type == 'import_statement':
+                # Handle "import module" or "import module as alias"
+                # Find all dotted_name and aliased_import nodes
+                for child in import_node.children:
+                    if child.type == 'dotted_name':
+                        # Try to resolve the module/name
+                        identifier = child.children[0] if child.child_count > 0 else child
+                        resolved = self.resolve_type(files, lsp, file_path, path, identifier)
+                        res.extend(resolved)
+                    elif child.type == 'aliased_import':
+                        # Get the actual name from aliased import (before 'as')
+                        if child.child_count > 0:
+                            actual_name = child.children[0]
+                            if actual_name.type == 'dotted_name' and actual_name.child_count > 0:
+                                identifier = actual_name.children[0]
+                            else:
+                                identifier = actual_name
+                            resolved = self.resolve_type(files, lsp, file_path, path, identifier)
+                            res.extend(resolved)
+            
+            elif import_node.type == 'import_from_statement':
+                # Handle "from module import name1, name2"
+                # Find the 'import' keyword to know where imported names start
+                import_keyword_found = False
+                for child in import_node.children:
+                    if child.type == 'import':
+                        import_keyword_found = True
+                        continue
+                    
+                    # After 'import' keyword, dotted_name nodes are the imported names
+                    if import_keyword_found and child.type == 'dotted_name':
+                        # Try to resolve the imported name
+                        identifier = child.children[0] if child.child_count > 0 else child
+                        resolved = self.resolve_type(files, lsp, file_path, path, identifier)
+                        res.extend(resolved)
+                    elif import_keyword_found and child.type == 'aliased_import':
+                        # Handle "from module import name as alias"
+                        if child.child_count > 0:
+                            actual_name = child.children[0]
+                            if actual_name.type == 'dotted_name' and actual_name.child_count > 0:
+                                identifier = actual_name.children[0]
+                            else:
+                                identifier = actual_name
+                            resolved = self.resolve_type(files, lsp, file_path, path, identifier)
+                            res.extend(resolved)
+        
+        except Exception as e:
+            logger.debug(f"Failed to resolve import: {e}")
+        
+        return res
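
The import forms listed in the `add_file_imports` docstring can be illustrated without tree-sitter. The sketch below swaps in Python's standard `ast` module, which is not what this PR uses, to show which name each form would hand to resolution. `imported_names` is a hypothetical helper. As in the diff, only the first component of a dotted module is kept for `import a.b`, and wildcard imports are skipped.

```python
import ast
from typing import Optional


def imported_names(source: str) -> list[tuple[str, Optional[str]]]:
    """Return (name, alias) pairs for each import found in the source."""
    found: list[tuple[str, Optional[str]]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" / "import a.b as x": keep the first dotted component.
            for alias in node.names:
                found.append((alias.name.split(".")[0], alias.asname))
        elif isinstance(node, ast.ImportFrom):
            # "from m import n" / "from m import n as x": keep the imported name.
            for alias in node.names:
                if alias.name != "*":
                    found.append((alias.name, alias.asname))
    return found


if __name__ == "__main__":
    sample = "import os.path\nimport numpy as np\nfrom a.b import c, d as e\n"
    print(imported_names(sample))
```

For the sample above this yields `os`, `numpy` (alias `np`), `c`, and `d` (alias `e`), matching the five forms the docstring lists.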
@@ -114,6 +114,10 @@ def first_pass(self, path: Path, files: list[Path], ignore: list[str], graph: Gr
             # Walk thought the AST
             graph.add_file(file)
             self.create_hierarchy(file, analyzer, graph)
+            
+            # Extract import statements
+            if not analyzer.is_dependency(str(file_path)):
+                analyzer.add_file_imports(file)
 
     def second_pass(self, graph: Graph, files: list[Path], path: Path) -> None:
         """
@@ -148,6 +152,8 @@ def second_pass(self, graph: Graph, files: list[Path], path: Path) -> None:
             for i, file_path in enumerate(files):
                 file = self.files[file_path]
                 logging.info(f'Processing file ({i + 1}/{files_len}): {file_path}')
+                
+                # Resolve entity symbols
                 for _, entity in file.entities.items():
                     entity.resolved_symbol(lambda key, symbol, fp=file_path: analyzers[fp.suffix].resolve_symbol(self.files, lsps[fp.suffix], fp, path, key, symbol))
                     for key, symbols in entity.symbols.items():
@@ -167,6 +173,13 @@ def second_pass(self, graph: Graph, files: list[Path], path: Path) -> None:
                                 graph.connect_entities("RETURNS", entity.id, resolved_symbol.id)
                             elif key == "parameters":
                                 graph.connect_entities("PARAMETERS", entity.id, resolved_symbol.id)
+                
+                # Resolve file imports
+                for import_node in file.imports:
+                    resolved_entities = analyzers[file_path.suffix].resolve_import(self.files, lsps[file_path.suffix], file_path, path, import_node)
+                    for resolved_entity in resolved_entities:
+                        file.add_resolved_import(resolved_entity)
+                        graph.connect_entities("IMPORTS", file.id, resolved_entity.id)
 
     def analyze_files(self, files: list[Path], path: Path, graph: Graph) -> None:
         self.first_pass(path, files, [], graph)
 
@@ -21,10 +21,30 @@ def __init__(self, path: Path, tree: Tree) -> None:
         self.path = path
         self.tree = tree
         self.entities: dict[Node, Entity] = {}
+        self.imports: list[Node] = []
+        self.resolved_imports: set[Entity] = set()
 
     def add_entity(self, entity: Entity):
         entity.parent = self
         self.entities[entity.node] = entity
+    
+    def add_import(self, import_node: Node):
+        """
+        Add an import statement node to track.
+        
+        Args:
+            import_node (Node): The import statement node.
+        """
+        self.imports.append(import_node)
+    
+    def add_resolved_import(self, resolved_entity: Entity):
+        """
+        Add a resolved import entity.
+        
+        Args:
+            resolved_entity (Entity): The resolved entity that is imported.
+        """
+        self.resolved_imports.add(resolved_entity)
 
     def __str__(self) -> str:
         return f"path: {self.path}"
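
A simplified sketch of how the new `File` fields interact with the import loop added to `second_pass`. `Entity`, `File`, the string "import nodes", and the `edges` list are stand-ins for the real classes and for `graph.connect_entities`. They are assumptions made to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class Entity:
    """Stand-in for the project's Entity; frozen so it is hashable."""
    id: int
    name: str


@dataclass
class File:
    """Stand-in for api/entities/file.py with only the import-related fields."""
    id: int
    path: str
    imports: list[str] = field(default_factory=list)
    resolved_imports: set[Entity] = field(default_factory=set)

    def add_import(self, import_node: str) -> None:
        self.imports.append(import_node)

    def add_resolved_import(self, resolved_entity: Entity) -> None:
        self.resolved_imports.add(resolved_entity)


def connect_imports(file: File,
                    resolve: Callable[[str], list[Entity]],
                    edges: list[tuple[str, int, int]]) -> None:
    """Resolve each tracked import and record an IMPORTS edge per result."""
    for import_node in file.imports:
        for entity in resolve(import_node):
            file.add_resolved_import(entity)
            edges.append(("IMPORTS", file.id, entity.id))


if __name__ == "__main__":
    f = File(id=1, path="app.py")
    f.add_import("from m import helper")
    f.add_import("from m import helper")  # same name imported twice
    edges: list[tuple[str, int, int]] = []
    connect_imports(f, lambda _node: [Entity(7, "helper")], edges)
    print(len(f.resolved_imports), len(edges))
```

In this sketch the set deduplicates repeated imports while an edge is recorded once per resolution. Whether the real graph layer deduplicates repeated `IMPORTS` edges is not visible in this diff.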
 
@@ -0,0 +1,11 @@
+#include <stdio.h>
+#include "/src/ff.h"
+
+
+/* Create an empty intset. */
+intset* intsetNew(void) {
+    intset *is = zmalloc(sizeof(intset));
+    is->encoding = intrev32ifbe(INTSET_ENC_INT16);
+    is->length = 0;
+    return is;
+}